Professional Documents
Culture Documents
Václav Hlaváč
Examples in 2D.
Initial idea, filtering in frequency domain
2/65
Image processing ≡ filtration of 2D signals.
input spatial output
image filter image
Filtration in the spatial domain. We would say in time domain for 1D signals. It is a linear
combination of the input image with coefficients of (often local) filter. The basic operation is
called convolution.
Filtration in the frequency domain. Conversion to the ‘frequency domain’, filtration there, and the
conversion back.
We consider Fourier transform, but there are other linear integral transforms serving a similar
purpose, e.g., cosine, wavelets.
1D Fourier transform, introduction
3/65
f(t)
Even f (t) = f (−t) t
f(t)
t
Odd f (t) = −f (−t)
Conjugage f ( 5) = 2 + 3i
f (ξ) = f ∗(−ξ)
symmetric f (−5) = 2 − 3i
f ∗ denotes a complex conjugate function.
i is a complex unit.
Any function can be decomposed
as a sum of the even and odd part 5/65
t = t + t
Fourier Tx definition: continuous cased
6/65
F{f (t)} = F (ξ), where ξ [Hz=s−1] is a frequency and 2πξ [s−1] is the angular frequency.
What is the meaning of the inverse Fourier Tx? Express it as a Riemann sum:
. 2πiξ0 t 2πiξ1 t
f (t) = . . . + F (ξ0) e + F (ξ1) e + . . . ∆ξ ,
⇒ Any 1D function can be expressed as a the weighted sum (integral) of many different complex
exponentials (because of Euler’s formula eiξ = cos ξ + i sin ξ, also of cosinusoids and sinusoids).
Existence conditions of Fourier Tx
7/65
R∞
1. | f (t) | dt < ∞, i.e. f (t) has to grow slower than an exponential curve.
−∞
2. f (t) can have only a finite number of discontinuities and maxima, minima in any finite
rectangle.
3. f (t) need not have discontinuities with the infinite amplitude.
Fourier transformation exists always for digital images as they are limited and have finite number
of discontinuities.
Fourier Tx, symmetries
8/65
Symmetry with regards to the complex conjugate part, i.e., F (−iξ) = F ∗(iξ).
a third function (f ∗ h), often used to create a modification of one of the input functions.
Convolution is an integral ‘mixing’ values of two functions, i.e., of the function h(t), which is
Z ∞ Z ∞
(f ∗ h)(t) = (h ∗ f )(t) ≡ f (τ ) h(t − τ ) dτ = f (t − τ ) h(τ ) dτ .
−∞ −∞
The limits can be constraint to the interval [0, t], because we assume zero values of functions
X X
(f ∗ h)(i) = (h ∗ f )(i) ≡ h(i − m) f (m) = h(i) f (i − m) ,
m∈O m∈O
where O is a local neighborhood of a ‘current position’ i and h is the convolution kernel (also
convolution mask).
Fourier Tx, properties (1)
12/65
1
Time scaling f (a t) |a| F (ξ/a)
Fourier Tx, properties (2)
13/65
R∞
Area in time F (0) = f (t)dt Area function f (t).
−∞
R∞
Area in freq. f (0) = F (ξ)dξ Area under F (ξ)
−∞
R∞ 2
R∞
Parseval’s th. |f (t)| dt = |F (ξ)|2dξ f energy = F energy
−∞ −∞
Basic Fourier Tx pairs (1)
14/65
d(t) 1 f(t)
f(t)
0 t 0 t -2T -T 0 T 2T t
Re F(x) d(x)
F(x)
1
0 x 1 1 1 x
x 0 0 1
-2T -T T 2T
Dirac constant ∞ sequence of Diracs
Basic Fourier Tx pairs (2)
15/65
f(t)
t t t
F(x) -x0
-x0 0 x0 x 0 x0 x -2x0 -x0 0 x0 2x0 x
F(x)
0 x 0 x0 x 0 x
The signal of short duration must have wide Fourier spectrum and vice versa.
1
(signal duration) (frequency bandwith) ≥
2
Observation: Gaussian e−t modulated by a sinusoid (Gabor function) has the smallest
duration-bandwidth product.
The principle is related to Heisenberg Uncertainty Principle from quantum mechanics (Werner
Heisenberg, published 1927, Nobel Prize 1932). This principle constraints the precision with
which the position and the momentum of a particle can be known.
W. Heisenberg 1927: “It is impossible to determine accurately both the position and the
Fourier transform assumes a periodic signal. What if a non-periodic signal has to be processed?
There are two common approaches.
1. To process the signal in small chunks (windows) and assume that the signal is periodic
outside the windows.
The approach was introduced by Dennis Gabor in 1946 and it is named Short time Fourier transform.
Dennis Gabor, 1900-1979, inventor of holography, Nobel price for physics in 1971, studied in Budapest,
PhD in Berlin in 1927, fled Nazi persecution to Britain in 1933.
Mere cutting of the signal to rectangular windows is not good because discontinuities at windows limits
function ensuring the zero signal value at the limits of the window and beyond it.
2. Use of more complex basis function, e.g., wavelets in the wavelet transform.
Discrete Fourier transform
19/65
Let f (n) be an input signal (a sequence), n = 0, . . . , N − 1.
Let F (k) be a Frequency spectrum (the result of the discrete Fourier transformation) of a
signal f (n).
Discrete Fourier transformation
N −1
X 2πi kn
F (k) ≡ f (n) e− N
n=0
N −1
1 X 2πi kn
f (n) ≡ F (k) e N
N
k=0
Computational complexity, a reminder
20/65
While considering complexity, it is abstracted from a specific computer. Only an asymptotic
An asymptotic upper and lower bounds for the magnitude of a function g(n) (i.e., its growth)
to ∞.
O(g(n)) denotes the set of functions f (n), where f (n) bounds g(n) asymptotically from
below. Formally, there exist a positive constant c and a number n0 such that
0 ≤ f (n) ≤ c g(n) for all n ≥ n0.
Ω(g(n)) denotes the set of functions f (n), where bounds g(n) asymptotically from above.
Formally, there exist a positive constant c and a number n0 such that 0 ≤ c g(n) ≤ f (n) for
all n ≥ n0.
Computational complexity, the notation
21/65
‘Big O()’ notation; for example, O(n2) means that the number of algorithm steps will be
roughly proportional to the square of the number of samples in the worst case.
Additional terms and multiplicative constants are not taken into account because a qualitative
comparison is sought.
The quadratic complexity O(n2) is worse than say O(n) (linear) or O(1) (constant,
N −1 N −1
− 2πi
X X
Discrete Fourier Transform (DFT) F (k) ≡ f (n) e N kn = W nk f (n)
n=0 n=0
The vector f (n) is multiplied by the matrix whose element (n, k) is the complex constant W
additions.
Calculation all N DFT coefficients requires N 2 complex multiplications and N (N − 1)
complex additions.
The overall computational complexity O(N 2).
Fast Fourier transform
23/65
A Fast Fourier transform (FFT) is an efficient algorithm to compute the discrete Fourier
The DFT of length N can be expressed as sum of two DFTs of length N/2, the first one
consisting of odd and the second of even samples.
(Danielson, Lanczos in 1942; further developed by Cooley, Tukey in 1965)
There are two approaches how to split the signal called
Note 2: The input sequence can be divided to more than two parts in general.
Decimation in time (DIT)
25/65
The input sequence f (n), n = 1, . . . , N − 1 is divided into even f e(n0) and odd f o(n0)
parts, n0 = 0, 1, . . . , N/2 − 1.
The Fourier transform of corresponding parts denoted F e, F o can be calculated recursively
0 0 s 0 0 n0 N
f (n ) + f (n ∗ N/2), f = (f (n ) − f (n + N/2)) W .
Their Fourier transform fulfills: F r (k 0) = F (2k 0) and F s(k 0) = 2k 0 + 1 for any
k 0 = 0, 1, . . . , N/2 − 1.
r s 0
Sequences f and f can be processed recursively with inverse equations f (n )=
1 r 0 s 0 N −n0 0 N 1 r 0 s 0 N −n0
2 f (n ) + f (n ) W , f (n + 2 ) = 2 f (n ) − f (n ) W .
N −1
X −2πikn
F (k) = e N f (n)
n=0
(N/2)−1 (N/2)−1
X −2πik(2n) X −2πik(2n+1)
= e N f (2n) + e N f (2n + 1)
n=0 n=0
(N/2)−1 (N/2)−1
X −2πikn X −2πikn
k
= e N/2 f (2n) + W e N/2 f (2n + 1)
n=0 n=0
= F e(k) + W k F o(k) , k = 1, . . . , N
For every pattern of log2 N e’s and o’s, there is a one-point transform that is just one of
f0 f1 f2 f3 f4 f5 f6 f7
Iteration
1
2
3
F0 F1 F2 F3 F4 F5 F6 F7
2D Fourier transform
30/65
The idea. The image function f (x, y) is decomposed to a linear combination of harmonic (sines
and cosines, more generally orthogonal) functions.
Z∞ Z∞
F (u, v) = f (x, y) e−2πi(xu+yv) dx dy
−∞ −∞
Inverse Fourier tranform
31/65
Z∞ Z∞
f (x, y) = F (u, v) e2πi(xu+yv) du dv
−∞ −∞
combination.
Spatial frequencies spectrum
32/65
p
Amplitude spectrum 2 (u, v) + F 2 (u, v)
|F (u, v)| = FRe Im
h i
FIm (u,v)
Phase angle spectrum φ(u, v) = tan−1 FRe (u,v)
level).
The analogy to corrugated iron comes from a topographic displaying of a 2D sinusoid (or
cosinusoid).
2D sinusoid, illustration (2)
34/65
v
(u,v)
w
w sin Q
√
u2 + v 2, u = ω cos Θ, v = ω sin Θ, Θ = tan−1 v
ω= u .
Q
u
w cos Q
Illustration of 2D FT bases vectors
35/65
sin(3x + 2y)+ cos(x + 4y) different display only analogy: carton egg tray
2D discrete Fourier transform
37/65
Direct transform
M −1 N −1
1 X X h mu nv i
F (u, v) = f (m, n) exp −2πi + ,
M N m=0 n=0 M N
u = 0, 1, . . . , M − 1 , v = 0, 1, . . . , N − 1 ,
Inverse transform
M −1 N −1 mu
X X h nv i
f (m, n) = F (u, v) exp 2πi + ,
u=0 v=0
M N
m = 0, 1, . . . , M − 1 , n = 0, 1, . . . , N − 1 .
2D Fourier Tx as twice 1D Fourier Tx
38/65
−1 −1
M
" N #
1 X 1 X −2πinv −2πimu
F (u, v) = exp f (m, n) exp ,
M m=0
N n=0
N M
u = 0, 1, . . . , M − 1 , v = 0, 1, . . . , N − 1 .
The term in square brackets corresponds to the one-dimensional Fourier transform of the mth
line and can be computed using the standard fast Fourier transform (FFT).
Each line is substituted with its Fourier transform, and the one-dimensional discrete Fourier
Gaussian is selected for illustration because it has a smooth spectrum, cf. uncertainty principle.
Input intensity image, coordinate system
40/65
250
50
100 200
150
200 150
250
300
100
350
400
50
450
500
0
100 200 300 400 500
Real part of the spectrum, image and mesh
41/65
Problem with the image related coordinate system related to the image: interesting information is
in corners, moreover divided into quarters. Due to spectrum periodicity it can be arbitrarily shifted.
Real part of the spectrum
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
image mesh
Periodic image
44/65
Periodic spectrum
45/65
Centered spectra
46/65
A D
of the coordinate system (0, 0) in the middle of the
spectrum.
B C
Assume the original spectrum is divided into four
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
image mesh
Prague Castle example, input image 265×256
50/65
200
50
150
100
100
150
200 50
250
0
50 100 150 200 250
Real part of the centered spectrum, image and mesh
51/65
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
image mesh
Rice example, input image 265×256
54/65
200
180
50
160
100 140
120
150
100
80
200
60
250
40
50 100 150 200 250
Real part of the centered spectrum, image and mesh
55/65
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
image mesh
Horizontal line example, input image 265×256
58/65
Horizontal line example, real part of the spectrum
59/65
Horizontal line example,
imaginary part of the spectrum 60/65
Horizontal line example, power spectrum
61/65
Rectangle example, input image 512×512
62/65
200
50 180
100 160
150 140
200 120
250 100
300 80
350 60
400 40
450 20
500
0
100 200 300 400 500
Real part of the centered spectrum, image and mesh
63/65
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
−200
−150
−100
Spatial frequency v
−50
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
image mesh
input spatial output
image filter image
t = t + t
d(t) 1 f(t)
f(t)
0 t 0 t -2T -T 0 T 2T t
Re F(x) d(x)
F(x)
1
0 x 1 1 1 x
x 0 0 1
-2T -T T 2T
f(t)=cos(2px0 t) f(t)=sin(2px0 t) f(t)=cos(2px0 t) + cos(4px0 t)
f(t)
t t t
F(x) -x0
-x0 0 x0 x 0 x0 x -2x0 -x0 0 x0 2x0 x
f(t) f(t)=(sin 2px0t)/pt f(t)=exp(-t2)
1
f(t)
-T 0 T t 0 t 0 t
2 2
Re F(x)=(sin 2pxT)/px Re F(x) Re F(x)= p exp(-p x )
1
F(x)
0 x 0 x0 x 0 x
f0 f1 f2 f3 f4 f5 f6 f7
Iteration
1
2
3
F0 F1 F2 F3 F4 F5 F6 F7
v
(u,v)
w
w sin Q
Q
u
w cos Q
250
50
100 200
150
200 150
250
300
100
350
400
50
450
500
0
100 200 300 400 500
Real part of the spectrum
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
Imaginary part of the spectrum
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
log power spectrum
50
100
150
Spatial frequency v
200
250
300
350
400
450
500
100 200 300 400 500
Spatial frequency u
A D
B C
C B
D A
Real part of the spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
Imaginary part of the spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
log power spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
200
50
150
100
100
150
200 50
250
0
50 100 150 200 250
Real part of the spectrum, centered
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
180
50
160
100 140
120
150
100
80
200
60
250
40
50 100 150 200 250
Real part of the spectrum, centered
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
−100
−50
Spatial frequency v
50
100
50 180
100 160
150 140
200 120
250 100
300 80
350 60
400 40
450 20
500
0
100 200 300 400 500
Real part of the spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
Imaginary part of the spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u
log power spectrum, centered
−250
−200
−150
−100
50
100
150
200
250
−200 −100 0 100 200
Spatial frequency u