霍夫變換 (Hough Transform)

最近做 project 要用到 Hough Transform,發覺呢個 technique 幾有趣,所以寫低分享下。簡單嚟講,Hough Transform 係電腦視覺入面用嚟搵幾何形狀嘅方法——就算條線斷咗、有雜訊都搵到,真係幾勁。

點解要用極座標?

一開始我都唔明點解唔用 $y = mx + c$ 嚟表示直線咁簡單。原來係因為垂直線會令斜率 $m$ 變成無限大,搞到個 algorithm 會出事。所以 Hough Transform 用咗極座標表示法:

\rho = x \cos \theta + y \sin \theta

$\rho$ :原點到直線嘅垂直距離
$\theta$ :垂線同 x 軸嘅夾角

呢個寫法最正係咩線都 handle 到,包括垂直線。

$極座標表示法:\rho 係原點到直線嘅距離,\theta 係角度$

點樣運作?投票機制

核心概念係將圖像空間嘅點轉換到參數空間 $(\rho, \theta)$ 。圖像入面一個點 $(x, y)$ 會對應參數空間入面一條 sinusoid curve。如果有幾個點喺同一條線上面,佢哋喺參數空間嘅 curves 就會相交於同一點——呢個交點就代表嗰條直線嘅參數。

步驟:

先用 edge detection(例如 Canny)搵出邊緣點
每個邊緣點都會喺 $(\rho, \theta)$ 空間投票
搵出票數最高嘅位置,就係我哋要搵嘅直線

實際數學例子

假設圖像上有三個點共線: $(2, 3)$ 、 $(4, 5)$ 、 $(6, 7)$

例子 1:驗證呢三點喺同一條線

先用傳統方法搵條線嘅方程式。計斜率:

m = \frac{5-3}{4-2} = \frac{2}{2} = 1

所以 $y = x + 1$ 。用 Hough Transform 嘅極座標表示法:

For 點 $(2, 3)$ :

\rho = 2\cos\theta + 3\sin\theta

For 點 $(4, 5)$ :

\rho = 4\cos\theta + 5\sin\theta

For 點 $(6, 7)$ :

\rho = 6\cos\theta + 7\sin\theta

如果呢三點真係共線,佢哋會喺某個 $(\rho, \theta)$ 相交。試下 $theta = 45° = frac{pi}{4}$ :

點 $(2, 3)$ : $\rho = 2 \times \frac{\sqrt{2}}{2} + 3 \times \frac{\sqrt{2}}{2} = \frac{5\sqrt{2}}{2} \approx 3.54$
點 $(4, 5)$ : $\rho = 4 \times \frac{\sqrt{2}}{2} + 5 \times \frac{\sqrt{2}}{2} = \frac{9\sqrt{2}}{2} \approx 6.36$
點 $(6, 7)$ : $\rho = 6 \times \frac{\sqrt{2}}{2} + 7 \times \frac{\sqrt{2}}{2} = \frac{13\sqrt{2}}{2} \approx 9.19$

唔同 $\rho$ 值?咁係咪代表佢哋唔共線?錯喇!記住每個點會產生一條 curve,我哋要搵嘅係 curves 嘅交點。

試返正確嘅 $\theta$ 值。對於直線 $y = x + 1$ ,轉換成極座標:

\rho = x\cos\theta + y\sin\theta = x\cos\theta + (x+1)\sin\theta

\rho = x(\cos\theta + \sin\theta) + \sin\theta

當 $theta approx 135° = frac{3pi}{4}$ :

$cos(135°) = -frac{sqrt{2}}{2}$ , $sin(135°) = frac{sqrt{2}}{2}$

For 點 $(2, 3)$ : $\rho = 2 \times (-\frac{\sqrt{2}}{2}) + 3 \times \frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2} \approx 0.707$

For 點 $(4, 5)$ : $\rho = 4 \times (-\frac{\sqrt{2}}{2}) + 5 \times \frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2} \approx 0.707$

For 點 $(6, 7)$ : $\rho = 6 \times (-\frac{\sqrt{2}}{2}) + 7 \times \frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2} \approx 0.707$

Bingo!三個點喺 $(\rho, \theta) = (0.707, 135°)$ 相交,證明佢哋的確共線。

例子 2:Accumulator 投票

喺實際運算中,我哋會建立一個 accumulator array $A[\rho][\theta]$ 。對於每個邊緣點 $(x, y)$ :

將 $\theta$ 由 $0°$ 掃到 $180°$ (通常用 1° increment)
對每個 $\theta$ 計算對應嘅 $\rho = x\cos\theta + y\sin\theta$
$A[\rho][\theta]$ 加一票

例如點 $(3, 4)$ :

$\theta$ (deg)	$\rho$ 計算	$\rho$ 值
0°	$3 \times 1 + 4 \times 0$	3.0
30°	$3 \times 0.866 + 4 \times 0.5$	4.6
45°	$3 \times 0.707 + 4 \times 0.707$	4.95
60°	$3 \times 0.5 + 4 \times 0.866$	4.96
90°	$3 \times 0 + 4 \times 1$	4.0

每個 $(\rho, \theta)$ pair 都會喺 accumulator 入面投一票。當多個點投票到同一格時,嗰格嘅數值就會升高,最後我哋搵出 peak 就知道邊條線最明顯。

實際應用

呢個方法真係好實用:

自動駕駛:搵車道線,就算係虛線都 detect 到

醫學影像:搵血管、虹膜(用 circular Hough Transform)

文檔掃描:校正傾斜嘅文件角度

我自己試過用嚟做車道檢測,效果真係唔錯,特別係喺光線唔好或者有陰影嘅情況下。

Circle Detection (圓形檢測)

Hough Transform 除咗可以搵直線,仲可以搵圓形!原理其實差不多,只不過參數空間變咗 3D。

圓形方程式

一個圓形可以用三個參數嚟表示:

(x - a)^2 + (y - b)^2 = r^2

$(a, b)$ :圓心座標
$r$ :半徑

所以參數空間係 3D 嘅 $(a, b, r)$ ,而唔係直線檢測嘅 2D $(\rho, \theta)$ 。

點樣運作?

同樣係投票機制,但係複雜咗:

已知半徑:如果你已經知道要搵嘅圓形半徑 $r$ ,咁問題就簡化成 2D。對於圖像上嘅每個邊緣點 $(x, y)$ ,佢可能係屬於任何一個半徑為 $r$ 嘅圓。呢啲可能嘅圓心會形成一個圓圈,圓心就係 $(x, y)$ ,半徑就係 $r$ 。
未知半徑:如果唔知半徑,就要 scan 晒所有可能嘅 $(a, b, r)$ 組合。計算量會大好多,所以通常會限制 $r$ 嘅範圍。

數學例子

假設我哋想檢測一個圓心在 $(5, 5)$ 、半徑 $r = 3$ 嘅圓。圓上面有幾個點:

$(8, 5)$
$(5, 8)$
$(2, 5)$
$(5, 2)$

驗證呢啲點係咪真係喺個圓上面:

For 點 $(8, 5)$ :

(8-5)^2 + (5-5)^2 = 3^2 + 0^2 = 9 = r^2 \quad ✓

For 點 $(5, 8)$ :

(5-5)^2 + (8-5)^2 = 0^2 + 3^2 = 9 = r^2 \quad ✓

喺 Hough Transform 入面,每個點會對所有可能嘅圓心 $(a, b)$ 投票。如果我哋已知 $r = 3$ :

點 $(8, 5)$ 會對所有距離佢 3 個單位嘅位置投票
點 $(5, 8)$ 會對所有距離佢 3 個單位嘅位置投票
點 $(2, 5)$ 會對所有距離佢 3 個單位嘅位置投票
點 $(5, 2)$ 會對所有距離佢 3 個單位嘅位置投票

最後,位置 $(5, 5)$ 會獲得最多票,因為佢係真正嘅圓心!

詳細計算表格:Occurrence Table (出現次數表)

假設我哋有三個邊緣點需要檢測圓心,已知半徑 $r = 3$ 。我哋會為每個點建立投票表:

邊緣點 1: $(x_1, y_1) = (8, 5)$

對於每個角度 $theta$ ,計算可能的圓心 $(a, b)$ :

$r$	$\theta$	$a = x_1 - r\cos(\theta)$	$b = y_1 - r\sin(\theta)$	可能圓心 $(a, b)$
3	0°	$8 - 3 \times 1 = 5$	$5 - 3 \times 0 = 5$	$(5, 5)$ ✓
3	90°	$8 - 3 \times 0 = 8$	$5 - 3 \times 1 = 2$	$(8, 2)$
3	180°	$8 - 3 \times (-1) = 11$	$5 - 3 \times 0 = 5$	$(11, 5)$
3	270°	$8 - 3 \times 0 = 8$	$5 - 3 \times (-1) = 8$	$(8, 8)$

邊緣點 2: $(x_2, y_2) = (5, 8)$

$r$	$\theta$	$a = x_2 - r\cos(\theta)$	$b = y_2 - r\sin(\theta)$	可能圓心 $(a, b)$
3	0°	$5 - 3 \times 1 = 2$	$8 - 3 \times 0 = 8$	$(2, 8)$
3	90°	$5 - 3 \times 0 = 5$	$8 - 3 \times 1 = 5$	$(5, 5)$ ✓
3	180°	$5 - 3 \times (-1) = 8$	$8 - 3 \times 0 = 8$	$(8, 8)$
3	270°	$5 - 3 \times 0 = 5$	$8 - 3 \times (-1) = 11$	$(5, 11)$

邊緣點 3: $(x_3, y_3) = (2, 5)$

$r$	$\theta$	$a = x_3 - r\cos(\theta)$	$b = y_3 - r\sin(\theta)$	可能圓心 $(a, b)$
3	0°	$2 - 3 \times 1 = -1$	$5 - 3 \times 0 = 5$	$(-1, 5)$
3	90°	$2 - 3 \times 0 = 2$	$5 - 3 \times 1 = 2$	$(2, 2)$
3	180°	$2 - 3 \times (-1) = 5$	$5 - 3 \times 0 = 5$	$(5, 5)$ ✓
3	270°	$2 - 3 \times 0 = 2$	$5 - 3 \times (-1) = 8$	$(2, 8)$

Occurrence Table (出現次數統計)

收集所有可能嘅圓心,計算每個位置被投票嘅次數:

可能圓心 $(a, b)$	出現次數 (Occurrence)	來源邊緣點
$(5, 5)$ 🎯	3 (最高!)	點1( $theta=0°$ ), 點2( $theta=90°$ ), 點3( $theta=180°$ )
$(8, 2)$	1	點1( $theta=90°$ )
$(11, 5)$	1	點1( $theta=180°$ )
$(8, 8)$	2	點1( $theta=270°$ ), 點2( $theta=180°$ )
$(2, 8)$	2	點2( $theta=0°$ ), 點3( $theta=270°$ )
$(5, 11)$	1	點2( $theta=270°$ )
$(-1, 5)$	1	點3( $theta=0°$ )
$(2, 2)$	1	點3( $theta=90°$ )

結論: 圓心 $(5, 5)$ 獲得 3 票(最高出現次數),所以佢就係我哋要搵嘅圓心!

呢個就係 Hough Transform 嘅核心——投票機制。每個邊緣點都會為好多可能嘅圓心投票,但係只有真正嘅圓心會從所有邊緣點度獲得票數,形成 peak!

直線 vs 圓形檢測

	直線檢測	圓形檢測
參數數量	2 個 $(\rho, \theta)$	3 個 $(a, b, r)$
參數空間	2D	3D
計算複雜度	較低	較高
記憶體需求	較少	較多
應用	車道線、文檔校正	虹膜識別、硬幣檢測

實際應用例子

虹膜識別:喺眼睛影像入面搵虹膜(圓形)

硬幣/圓形物件檢測:檢測圖像中嘅硬幣或者其他圓形物體

交通標誌檢測:好多交通標誌都係圓形嘅

細胞檢測:醫學影像入面搵細胞(通常接近圓形)

我之前試過用圓形 Hough Transform 檢測硬幣,效果幾好,但係要小心設定好 radius 嘅範圍,如果範圍太大會好慢。

💡 重點總結
Hough Circle Detection 嘅步驟:

對每個邊緣點,掃描所有可能嘅角度 $\theta$

用公式 $a = x - rcos(theta)$ , $b = y - r\sin(\theta)$ 計算可能圓心

喺 Accumulator 入面為每個可能圓心投票(occurrence +1)

搵出票數最高(occurrence 最大)嘅位置 = 真正圓心

變動半徑 (Variable Radius) 詳解

當半徑未知時,問題就複雜好多。我哋需要喺 3D 參數空間 $(a, b, r)$ 入面搵 peak。

固定半徑 vs 變動半徑:

情況	參數空間	Accumulator 維度	複雜度
固定半徑 ( $r$ 已知)	2D: $(a, b)$	2D array	$O(N \times W \times H)$
變動半徑 ( $r$ 未知)	3D: $(a, b, r)$	3D array	$O(N \times W \times H \times R)$

其中 $N$ 係邊緣點數量, $W$ 同 $H$ 係圖像寬高, $R$ 係可能嘅半徑數量。

數學例子:變動半徑檢測

假設圖像大小係 $500 \times 500$ pixels,我哋想檢測半徑介乎 $r \in [10, 60]$ pixels 嘅圓形。

步驟:

建立 3D accumulator: $A[a][b][r]$ ,初始值全部係 0
- $a, b \in [0, 500)$
- $r in [10, 60]$ ,假設每 1 pixel 一個間隔,即 51 個可能值

對每個邊緣點 $(x_i, y_i)$ 投票:

javascriptfor r = 10 to 60:
    for θ = 0° to 360° (每 1° 一個間隔):
        a = x_i - r·cos(θ)
        b = y_i - r·sin(θ)
        A[a][b][r] += 1

搵 accumulator 入面嘅 peak

具體計算例子

假設有個邊緣點喺 $(100, 100)$ 。如果真實圓心喺 $(70, 100)$ 、半徑 $r = 30$ :

試 $r = 30$ , $theta = 0°$ :

a = 100 - 30 \times \cos(0°) = 100 - 30 = 70 \quad ✓

b = 100 - 30 \times \sin(0°) = 100 - 0 = 100 \quad ✓

所以呢個點會投一票俾 $(a, b, r) = (70, 100, 30)$ 。

試其他半徑:

$r = 20$ : 會投票俾 $(80, 100, 20)$ ← 錯嘅圓心
$r = 40$ : 會投票俾 $(60, 100, 40)$ ← 錯嘅圓心

關鍵點: 只有當多個邊緣點都投票俾同一個 $(a, b, r)$ 時,嗰個先係真正嘅圓!

記憶體需求比較

固定半徑 ( $r = 30$ ):

Accumulator size: $500 \times 500 = 250,000$ cells
假設每個 cell 用 4 bytes (int32): ~1 MB

變動半徑 ( $r in [10, 60]$ , 51 個值):

Accumulator size: $500 \times 500 \times 51 = 12,750,000$ cells
記憶體: ~51 MB (係固定半徑嘅 50 倍!)

優化技巧

限制半徑範圍: 根據應用場景設定合理嘅 $r_{\text{min}}$ 同 $r_{\text{max}}$
用 edge gradient: 如果你有 edge 嘅 gradient 方向,可以只 check 嗰個方向,唔使 scan 成個圓周:
```
javascripta = x_i - r·cos(gradient_angle)
b = y_i - r·sin(gradient_angle)
只投一票,唔係 360 票!
```
Coarse-to-fine: 先用大嘅 step size(例如每 5 pixels)搵大概位置,再喺附近用細 step size 搵精確值
Random sampling: 唔係每個 $\theta$ 都試,隨機 sample 部分角度

呢啲優化可以將計算時間減少 10-100 倍,但係要小心 trade-off,太激進嘅優化會 miss 掉啲圓。

Hough Transform vs Linear Regression

好多人會問:點解唔用 linear regression 嚟搵直線?其實呢兩樣嘢雖然都係搵線,但係用途同原理完全唔同。

Linear Regression(最小平方法)

目標:搵一條最 fit 所有點嘅線(minimize 所有點到線嘅距離平方和)
假設:所有點都應該喺條線上面,只係有 noise
方法:用公式 $\min \sum_{i=1}^{n} (y_i - (mx_i + c))^2$ 直接計出嚟
輸出:一條線
弱點:好怕 outliers——一個離譜嘅點就可以令成條線偏晒

例子:如果你有點 $(1,2), (2,4), (3,6)$ ,linear regression 會 fit 出 $y = 2x$ 。但係加多個 outlier $(10, 100)$ ,條線就會被拉歪咗。

Hough Transform

目標:detect 圖像入面存在嘅線(唔係 fit 所有點)
假設:只有部分點屬於直線,可以有 noise、outliers、多條線
方法:投票機制——每個點投票俾所有經過佢嘅可能直線
輸出:可以同時 detect 多條線
優勢:唔怕 outliers——outlier 只會亂投票,唔會形成 peak

例子:同樣嘅點 $(1,2), (2,4), (3,6)$ ,Hough Transform 會 detect 出 $y = 2x$ 。就算加咗 outlier $(10, 100)$ ,都仍然會搵到 $y = 2x$ ,因為有 3 個點投票俾佢 vs 1 個點亂投。

主要分別

	Linear Regression	Hough Transform
Outliers	好易受影響 ❌	Robust ✅
多條線	只搵到一條	可以 detect 好多條
斷線	處理唔到	Handle 到
速度	快(有公式直接計)	慢啲(要 iterate 投票)
用途	統計分析、預測	電腦視覺、影像處理

幾時用邊個?

用 Linear Regression:當你相信所有 data points 都應該喺條線上面,而你想 model 呢個 relationship(例如用面積預測樓價)

用 Hough Transform:當你想喺有雜訊嘅圖像入面 detect 幾何形狀(例如喺相片入面搵車道線,但係相片仲有樹、車、陰影等等)

簡單嚟講:Linear regression 係 fitting, Hough Transform 係 detection。

參考資料