
Computer Vision - Mosig

Filters, contours, segmentation

Sergi Pujades

Original slides by: Jean-Sébastien Franco


Characterization of an image

• Which information is relevant / interesting in an image?


Characterization of an image

• Salient points (key points): areas of strong contrast (0D)


Characterization of an image

• Lines, salient contours: contrast boundaries (1D)


Characterization of an image

• Regions: groups of pixels with similar properties (2D)


Contours and regions

• Contours: boundary between homogeneous regions of the image


• Regions: homogeneous regions between the contours
• Duality contours/regions
Challenges

• Which criteria to aggregate pixels?


• What is a salient point / contour? (fuzzy)
• How to distinguish between information and noise?
Challenges
Outline

• Image basics (you (should) already know this)


▪ representations, noise, contrast
▪ correlation, linear filters, smoothing (mean, Gaussian, …)

• Detection of contrast changes / contours


▪ gradients, finite differences, Sobel, Laplacian

• Detection of straight lines


▪ Hough transform

• Aggregation / segmentation
▪ Graph cut
Filters
Problem: the noise

• How to model the noise in the images?


• Simplest and most commonly adopted model: Gaussian noise (thermal)
▪ additive noise, independently sampled in each pixel from a Gaussian
distribution (same distribution for the full image)

(example images with Gaussian noise, σ = 1 and σ = 16)
Noise in 1D (and consequences)

(plots of the signal f(x) and of its derivative f'(x))
Noise in 2D (and consequences)

• Contours (gradients) drowned in the noise


• Need to reduce the noise to extract good contrast information
How to reduce the noise?

• Problem:
▪ noise lives in the high frequencies
▪ derivative computations become problematic

• Solution:
▪ remove the high frequencies

• Convolution with a low-pass filter


Convolution

(f * g)(t) = \int_\tau f(\tau)\, g(t - \tau)\, d\tau        (continuous)

(f * g)(m) = \sum_n f(n)\, g(m - n)                          (discrete)
Solution: first smooth
(low-pass Gaussian filter)

(figure: the smoothed signal g * f and its derivative \frac{\partial}{\partial x}(g * f))
Theorem of the derivatives of a convolution

\frac{\partial}{\partial x}(g * f) = \frac{\partial g}{\partial x} * f
Noise in 2D

Smoothing kernel
Benefits of the smoothing
2D convolutions

• Continuous case:

(m * f)(x) = \int_u m(u)\, f(x - u)\, du

(m * f)(x, y) = \int_u \int_v m(u, v)\, f(x - u, y - v)\, du\, dv

• Discrete case:

(m * f)(x) = \sum_{i=-w}^{w} m(i)\, f(x - i)

(m * f)(x, y) = \sum_{i=-w}^{w} \sum_{j=-h}^{h} m(i, j)\, f(x - i, y - j)
Discrete convolutions in 2D

3×3 neighborhood of the image f around pixel (i, j):

f(·) f(·) f(·)
f(·) f(·) f(·)
f(·) f(·) f(·)

3×3 kernel coefficients:

c11 c12 c13
c21 c22 c23
c31 c32 c33

o (i,j) = c11 f(i-1,j-1) + c12 f(i-1,j) + c13 f(i-1,j+1) +


c21 f(i,j-1) + c22 f(i,j) + c23 f(i,j+1) +
c31 f(i+1,j-1) + c32 f(i+1,j) + c33 f(i+1,j+1)
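As a sketch, the sum above can be implemented directly with nested loops; the function below follows the slide's indexing (coefficient c[k,l] multiplies f(i-1+k, j-1+l)), leaves the border at zero, and its name is illustrative:

```python
import numpy as np

def convolve_3x3(f, c):
    """Apply the 3x3 sum of the slide: o(i,j) = sum_{k,l} c[k,l] * f(i-1+k, j-1+l).
    (As written this is a correlation; for a true convolution, flip the kernel.)
    Border pixels are left at zero for simplicity."""
    h, w = f.shape
    o = np.zeros((h, w), dtype=float)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            o[i, j] = np.sum(c * f[i - 1:i + 2, j - 1:j + 2])
    return o
```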
Smoothing with a mean filter

F[x, y]: 10×10 test image

0  0  0  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90  0 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0  0  0  0  0  0  0  0
0  0 90  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0  0  0

H[u, v]: 3×3 mean kernel

      1 1 1
1/9 · 1 1 1
      1 1 1
Mean filter
Smoothing with a 2D Gaussian filter

F[x, y]: the same 10×10 test image as above.

H[u, v]: 3×3 Gaussian kernel

       1 2 1
1/16 · 2 4 2
       1 2 1

Kernel with an approximation of the Gaussian function:

h(u, v) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{u^2 + v^2}{2\sigma^2}}
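A small sketch of how such a kernel can be generated by sampling the (reconstructed) formula above and normalizing the weights to sum to 1; the function name and the default support radius are assumptions:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sample h(u,v) ~ exp(-(u^2+v^2)/(2*sigma^2)) on a (2*radius+1)^2 grid
    and normalize so the weights sum to 1 (the 1/(2*pi*sigma^2) factor cancels)."""
    if radius is None:
        radius = int(np.ceil(2.5 * sigma))   # assumed rule of thumb for the support
    u = np.arange(-radius, radius + 1)
    uu, vv = np.meshgrid(u, u, indexing="ij")
    h = np.exp(-(uu ** 2 + vv ** 2) / (2.0 * sigma ** 2))
    return h / h.sum()
```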
Gaussian filter
Edge detection
Edge detection
Why do we care about edges?

• Fundamental to human perception


• Compact and compressed version of the visual information
Types of edges

Surface normal discontinuity

Depth discontinuity

Reflectance / color discontinuity

Illumination discontinuity
Actual examples
Ideal edge

(profiles of an ideal step edge: the signal f(x), its first derivative f'(x), and its second derivative f''(x))
Real edges
Real edges

Very noisy signal: edge detection is not trivial
Edges’ properties
Edge descriptors
• Normal: unit vector in the direction of maximal (intensity) change

• Direction: unit vector perpendicular to the normal vector

• Position

• Intensity (contrast): amount of change across the edge

• Compact description:
▪ normal with the intensity as vector length (2D)
▪ position (2D)
Image as a 3D surface
Gradient image

• Differential operator: 2D vector pointing in the direction of the largest variation of the intensity

\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]

▪ Direction: \theta = \tan^{-1}\!\left( \frac{\partial f}{\partial y} / \frac{\partial f}{\partial x} \right)    Intensity: \|\nabla f\| = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 }

▪ Special cases: \nabla f = \left[ \frac{\partial f}{\partial x}, 0 \right], \quad \nabla f = \left[ 0, \frac{\partial f}{\partial y} \right], \quad \nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]
Geometric interpretation

• An image f(x,y) is not a continuous function:


▪ we approximate it with a continuous function (first order, a plane)
▪ we compute the derivatives of this function
Methods based on the gradient

Edge:
"high variation in the signal”

Look for:
“high value in the first derivative”
Finite differences (1D)

\frac{df}{dx} \approx \frac{f(x + dx) - f(x)}{dx} \approx \frac{f(x + dx) - f(x - dx)}{2\, dx}

\frac{d^2 f}{dx^2} \approx \frac{f(x + dx) - 2 f(x) + f(x - dx)}{dx^2}
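These finite-difference formulas translate directly into array operations; below is a minimal NumPy sketch (interior samples only, illustrative function name):

```python
import numpy as np

def derivatives_1d(f, dx=1.0):
    """Centered finite differences on the interior samples of a 1D signal."""
    f = np.asarray(f, dtype=float)
    df = (f[2:] - f[:-2]) / (2.0 * dx)                 # (f(x+dx) - f(x-dx)) / (2 dx)
    d2f = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx ** 2   # second derivative
    return df, d2f
```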
Finite differences
Finite differences as a convolution:

«enforce symmetry»

[-1, 1]           → \frac{\partial f}{\partial x} \approx \frac{f(x + dx, y) - f(x, y)}{dx}

[-0.5, 0, 0.5]    → \frac{\partial f}{\partial x} \approx \frac{f(x + dx, y) - f(x - dx, y)}{2\, dx}

[-1, 1]^T         → \frac{\partial f}{\partial y} \approx \frac{f(x, y + dy) - f(x, y)}{dy}

[-0.5, 0, 0.5]^T  → \frac{\partial f}{\partial y} \approx \frac{f(x, y + dy) - f(x, y - dy)}{2\, dy}
First smooth in 2D
• Naive solution:
▪ Gaussian smoothing of the image (1 convolution)
▪ then finite differences (1 convolution)
• Trick: apply the theorem of the derivatives of the convolution
▪ compose Gaussian smoothing and differentiation
▪ use the filter with the Gaussian derivatives - 1 convolution (only)
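A sketch of the two options using SciPy: the naive version smooths and then differentiates (two passes), while the one-pass version convolves directly with the x-derivative of the Gaussian via the order argument of gaussian_filter. Function names are illustrative, and the two responses agree only up to the discretization of the derivative:

```python
import numpy as np
from scipy import ndimage

def grad_x_two_passes(img, sigma):
    """Naive: Gaussian smoothing, then a centered finite difference along x."""
    smoothed = ndimage.gaussian_filter(img.astype(float), sigma)
    return ndimage.correlate1d(smoothed, np.array([-0.5, 0.0, 0.5]), axis=1)

def grad_x_one_pass(img, sigma):
    """Trick: a single convolution with the x-derivative of the Gaussian."""
    return ndimage.gaussian_filter(img.astype(float), sigma, order=(0, 1))
```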
Example of gradient operator: Sobel

• Well known operators for edge detection


Sx[u,v]:            Sy[u,v]:
-1  0  1            -1 -2 -1
-2  0  2             0  0  0
-1  0  1             1  2  1

• Particular case of the composition of a 3×3 Gaussian filter with finite differences:

Sx[u,v] ∝ [1 2 1; 2 4 2; 1 2 1] * [-1 0 1]

Sy[u,v] ∝ [1 2 1; 2 4 2; 1 2 1] * [-1 0 1]^T
• Can be generalized for arbitrary size smoothing kernels
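As an illustration, the Sobel kernels above can be applied with a generic 2D convolution to obtain the gradient magnitude and direction; this is a sketch using scipy.ndimage, with an illustrative function name:

```python
import numpy as np
from scipy import ndimage

Sx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
Sy = Sx.T  # [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_gradient(img):
    """Filter with the Sobel kernels and return gradient magnitude and direction."""
    gx = ndimage.convolve(img.astype(float), Sx)
    gy = ndimage.convolve(img.astype(float), Sy)
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```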
Influence of the scale of the filtering
kernel

(results with smoothing kernels of 1, 3 and 7 pixels)

It affects the derivative computation and the scale (and semantics) of the detected contours.
Laplace operator (or Laplacian)

• In 2D:   \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
▪ scalar magnitude
▪ invariant to rotation

• Discrete approximation:
0  1  0
1 -4  1
0  1  0

• Find the maximum gradient value:
▪ find where the Laplace operator is = 0

(figure: image, its gradient, and the Laplace operator response)


Laplacian of Gaussian (LoG)

• Same approach in 2D: LoG =


▪ 1. Gaussian smoothing
▪ 2. Laplacian

LoG * I = \nabla^2 * (G * I) = (\nabla^2 G) * I

LoG(x, y) = -\frac{1}{\pi \sigma^4} \left[ 1 - \frac{x^2 + y^2}{2\sigma^2} \right] e^{-\frac{x^2 + y^2}{2\sigma^2}}

• Cheaper (in computation) than the intensity of the gradient!


Laplacian of Gaussian (LoG)

0 1 1 2 2 2 1 1 0
1 2 4 5 5 5 4 2 1
1 4 5 3 0 3 5 4 1
2 5 3 −12 −24 −12 3 5 2
2 5 0 −24 −40 −24 0 5 2
2 5 3 −12 −24 −12 3 5 2
1 4 5 3 0 3 5 4 1
1 2 4 5 5 5 4 2 1
0 1 1 2 2 2 1 1 0

Discrete approximation (σ = 1.4)

LoG shape: an upside-down Mexican hat.
A common heuristic is to choose the filter size as about 5σ.
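A possible way to compute the LoG response and read off its zero-crossings with SciPy; gaussian_laplace applies the ∇²G filter directly, and the zero-crossing test here (a 3×3 neighborhood containing both signs) is a deliberately coarse sketch with illustrative function names:

```python
import numpy as np
from scipy import ndimage

def log_response(img, sigma=1.4):
    """Laplacian-of-Gaussian response (the LoG filter at scale sigma)."""
    return ndimage.gaussian_laplace(img.astype(float), sigma)

def zero_crossings(log):
    """Coarse zero-crossing mask: pixels whose 3x3 neighborhood changes sign."""
    mn = ndimage.minimum_filter(log, size=3)
    mx = ndimage.maximum_filter(log, size=3)
    return (mn < 0) & (mx > 0)
```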
LoG at different scale values

Yellow = negative, Green = positive


Edges at different scales
LoG Zero-Crossings

(σ = 2 and σ = 4)
Edge detection

3 problems:
1. The intensity of the gradient is scale dependent: which scale should we use?
2. The intensity of the gradient is high over a "big stripe" area: which are the significant edge points?
3. How to connect the significant edge points to obtain a curved edge?

Contour extraction (Canny)

∇f

Approach:

1. Only keep the maximum intensity within a "slice" across the edge (along the direction of the gradient ∇f)
2. Search for further edge points in the direction of the tangent (normal to the gradient ∇f)
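For reference, OpenCV implements this pipeline in cv2.Canny, which takes the two hysteresis thresholds; the file name, blur size, and threshold values below are illustrative choices, not prescribed ones:

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)        # placeholder file name
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=1.4)         # choose the scale here
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)   # hysteresis thresholds
```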
Fine scale, high threshold
Large scale, high threshold
Large scale, low threshold
Detecting lines

Hough transform
Line detection
Line detection

• Problems with the noise: continuity of the filter outputs is not guaranteed
How to merge points into a line?

• Hough transform (Duda & Hart, 1973)

• Use polar coordinates to parametrize lines:


▪ one line = a pair (r,θ)

y = -\frac{\cos\theta}{\sin\theta}\, x + \frac{r}{\sin\theta}

r = x \cos\theta + y \sin\theta
How to merge points into a line?

• Compute the histogram h(r,θ): each observed edge point adds a vote for every line passing through it
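A minimal sketch of this voting scheme: every edge pixel increments the accumulator bins h(r, θ) of all lines r = x·cosθ + y·sinθ passing through it. The bin counts and the function name are illustrative:

```python
import numpy as np

def hough_accumulator(edge_mask, n_theta=180, n_r=200):
    """Vote in the histogram h(r, theta) for every edge pixel of a binary mask."""
    ys, xs = np.nonzero(edge_mask)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = np.hypot(*edge_mask.shape)
    r_bins = np.linspace(-r_max, r_max, n_r)
    acc = np.zeros((n_r, n_theta), dtype=int)
    for x, y in zip(xs, ys):
        r = x * np.cos(thetas) + y * np.sin(thetas)       # one r per theta
        acc[np.digitize(r, r_bins) - 1, np.arange(n_theta)] += 1
    return acc, r_bins, thetas

# Peaks of acc correspond to detected lines (r, theta).
```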
Hough transform

Approach generalizes to other shapes: “Generalizing the Hough transform to detect arbitrary shapes”, Ballard 1981
Segmentation
Segmentation

• Regions: groups of pixels with similar properties


What belongs together?

Gestalt Theory (~1920)

source : studyblue.com
Principle of characterization

• Find a similarity measure between the elements in the image 



(pixels, super-pixels, small regions)

• Examples of similar characteristics:


▪ pixel intensity, color
▪ position, speed
▪ texture
▪ size, orientation in the case of small regions

• Example of a similarity function:


▪ distance to the mean of the region
▪ probability according to a distribution of the characteristics
• Mahalanobis distance (distance weighted by the variance)
• histogram
Example: color as characteristic
Example: color as characteristic

• Element = pixel
• Each pixel corresponds to a point in the RGB color space
Similarity in the color space

• Naive approach: Euclidean distance

In RGB: d(p, q) = \sqrt{(p_R - q_R)^2 + (p_G - q_G)^2 + (p_B - q_B)^2}

In HSV: d(p, q) = \sqrt{(p_H - q_H)^2 + (p_S - q_S)^2 + (p_V - q_V)^2}
Segmentation in the color space

• Find clusters of similar pixels using the color distance


Example: 16 groups vs. 2 groups
Segmentation without prior information

(“unsupervised”)
• Clustering by aggregation
▪ an element is added to a group if it is “similar enough” to the other
elements in the group
▪ iterate

• Clustering by subdivision
▪ divide the groups in sub-groups according to the best boundary
▪ boundary defined with a similarity criteria
▪ iterate
Algorithm K-Means
• Initialization
▪ Given K, the number of target groups, and N the number of elements in the characteristic space (e.g. color, but it could be different)
▪ Randomly select K elements among the possible N
▪ These elements are now the centers m1, ..., mK of the K groups

• Iteration
▪ Assign each of the N points xi to the closest group center mi
▪ Re-compute the mean mi of each group
▪ If no change, stop

• Effectively minimizing the loss (or objective function): J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \| x_i - m_j \|^2, the sum of squared distances of the points to their assigned group centers
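A compact NumPy sketch of the algorithm above (random initialization among the N elements, assignment to the closest center, mean update, stop when the centers no longer change); the function name and defaults are illustrative:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means on an (N, d) array X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # K random elements
    for _ in range(n_iter):
        # Assignment step: closest center for every point.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute the mean of each group (keep empty groups as is).
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):   # no change: stop
            break
        centers = new_centers
    return labels, centers
```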


Example with 3 groups
Pros / cons

Pros:
• Simple
• Fast
• Convergence towards a local minimum

Cons:
• How to choose K?
• Sensitive to initialization
• Spherical groups only
• (very) sensitive to outliers
Example

K-Means

Better clusters?
Generalization: GMM

Gaussian Mixture Models

• Instead of a "hard" association of each point to a cluster, we define the probability that a point belongs to each cluster

• Each blob b ∈ [1, K] is represented by a Gaussian distribution:

P(x \mid \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d \,\lvert V_b \rvert}} \, e^{-\frac{1}{2}(x - \mu_b)^T V_b^{-1} (x - \mu_b)}

d: data point dimension (d = 2 here)    \mu_b, V_b: mean and covariance of blob b

• The probability of observing a point is a mixture of Gaussians:

P(x \mid \Theta) = \sum_{b=1}^{K} \alpha_b \, P(x \mid \Theta_b)

\alpha_b: mixing coefficients
Solving GMM: Expectation Maximization (EM)

• Goal: find the parameters Θ of the blobs that maximize the probability:


P(X | Θ) = ∏x P(x | Θ)

Algorithm:
1. Expectation step: given the current estimates of the blobs, compute the probability of each point belonging to each blob
2. Maximization step: given these probabilities, recompute the blob parameters so as to maximize the total probability
3. Iterate until convergence
EM details
• E-step:
▪ probability of a point belonging to blob b (under the current estimates)

P(b \mid x, \mu_b, V_b) = \frac{\alpha_b \, P(x \mid \mu_b, V_b)}{\sum_{i=1}^{K} \alpha_i \, P(x \mid \mu_i, V_i)}

• M-step
▪ Update of the mixing weights:

\alpha_b^{new} = \frac{1}{N} \sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)

▪ Update of the means and covariances:

\mu_b^{new} = \frac{\sum_{i=1}^{N} x_i \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}

V_b^{new} = \frac{\sum_{i=1}^{N} (x_i - \mu_b^{new})(x_i - \mu_b^{new})^T \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}
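One EM iteration following these update equations, as a sketch in NumPy/SciPy; the responsibilities computed in the E-step are reused directly in the M-step, the function name is illustrative, and no safeguards against empty or degenerate blobs are included:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, alphas, means, covs):
    """One EM iteration for a GMM on data X of shape (N, d)."""
    N = len(X)
    K = len(alphas)
    # E-step: responsibilities P(b | x_i) for every point and blob.
    resp = np.column_stack([alphas[b] * multivariate_normal.pdf(X, means[b], covs[b])
                            for b in range(K)])
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update mixing weights, means and covariances.
    Nb = resp.sum(axis=0)                       # effective number of points per blob
    new_alphas = Nb / N
    new_means = (resp.T @ X) / Nb[:, None]
    new_covs = [((resp[:, b, None] * (X - new_means[b])).T @ (X - new_means[b])) / Nb[b]
                for b in range(K)]
    return new_alphas, new_means, new_covs
```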
Generalization: EM

• Replace the naive point assignment by a probabilistic (weighted) assignment: each point has a distinct probability of belonging to every cluster
GMMs and Expectation Maximization

Pros:
• Relatively simple
• Fast
• Convergence towards a local minimum
• Non-spherical clusters (more generic)
• Can better handle outliers: one can add an «outlier» cluster
• Can be applied to many vision problems: segmentation, but also model parameter fitting

Cons:
• K still needs to be chosen beforehand
• Sensitive to the initialization
• The generative model (shape of the blobs) needs to be chosen
Superpixels motivation & principle


• Previous aggregation is solely based on color distance

• How to integrate spatial proximity?

• Superpixels: a simple algorithm grouping similar pixels in the same region of the image

• Simple algorithm, SLIC: K-means in 5D, color + 2D coordinates


• Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk, SLIC
Superpixels Compared to State-of-the-art Superpixel Methods, IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 34, num. 11, p. 2274 – 2282, May 2012.
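For experimentation, SLIC is available in scikit-image as segmentation.slic; the file name and the n_segments / compactness values below are illustrative:

```python
from skimage import io, segmentation

img = io.imread("image.png")                                      # placeholder RGB image
labels = segmentation.slic(img, n_segments=200, compactness=10)   # 5D K-means (color + x, y)
overlay = segmentation.mark_boundaries(img, labels)               # draw superpixel borders
io.imsave("superpixels.png", (overlay * 255).astype("uint8"))
```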
Superpixels : example SLIC

Segmentation using a graph
partitioning strategy: principle


• Represent elements and their relations in a graph

• Cut the graph in sub-graphs

• Internal bindings are strong, inter-graph bindings are weak


Graphs of images

• One pixel -> one vertex


• One edge between each pair of pixels
• Each edge has a weight describing the

similarity (or affinity) between the pixels
Examples of similarities (affinities)

• Distance between positions

aff(x, y) = e^{-\frac{1}{2\sigma_d^2} \| x - y \|^2}

• Distance in intensity space

aff(x, y) = e^{-\frac{1}{2\sigma_i^2} \left( I(x) - I(y) \right)^2}

• Distance in color space (which one?)

aff(x, y) = e^{-\frac{1}{2\sigma_t^2} \| c(x) - c(y) \|^2}
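As an illustration, the color affinity above reduces to a few lines of code; the scale σ is a user-chosen parameter and the example values are arbitrary:

```python
import numpy as np

def color_affinity(c_x, c_y, sigma_c=0.1):
    """Affinity between two pixels from their color vectors (e.g. RGB in [0, 1])."""
    diff = np.asarray(c_x, dtype=float) - np.asarray(c_y, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma_c ** 2))

# Two similar reddish pixels get an affinity close to 1:
print(color_affinity([0.90, 0.10, 0.10], [0.85, 0.12, 0.10]))
```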
Segmentation using a graph
partitioning strategy

• Sub-graphs:
▪ remove the edges of low similarity

• Sub-graphs represent different groups / regions


How?

Graph cut [Boykov 2001, Shi & Malik]

• Cut: set of edges allowing a graph partition when removed


• Cost of a cut: sum of the edge weights being cut
• Efficient algorithms exist
Pros / cons

• Generic method, can be applied in many problems

• High memory and computation resources required


• Bias in the cuts
▪ regions with similar sizes are preferred
▪ "shortcuts" exist (local minima)
If the region characteristics are
known…
• Example of the ball

• Compute the statistics of the ball appearance: mean color, distribution, histogram
• Find the target pixels using the similarity
Source : Wasik & Saffiotti. Robust Color Segmentation for the RoboCup Domain. 16th International
Conference on Pattern Recognition (ICPR'02), Vol. 2, p. 20651, 2002.
Grabcut - interactive method

• User gives a coarse segmentation


• Similarity: color distribution in a region
• Iterative process:
▪ update the distribution
▪ cut the graph
▪ stop criterion: when the update / cut no longer change

Rother, Carsten, Vladimir Kolmogorov, and Andrew Blake. "GrabCut: Interactive foreground extraction using iterated graph cuts." ACM Transactions on Graphics (TOG) 23, no. 3 (2004): 309-314.
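OpenCV exposes this method as cv2.grabCut; below is a sketch of the rectangle-initialized variant, where the file name and rectangle coordinates are placeholders for the user's coarse segmentation:

```python
import cv2
import numpy as np

img = cv2.imread("image.png")                    # placeholder BGR image
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)        # internal GMM state (background)
fgd_model = np.zeros((1, 65), np.float64)        # internal GMM state (foreground)
rect = (50, 50, 300, 200)                        # coarse user rectangle (x, y, w, h)

cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)
fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype("uint8")
segmented = img * fg[:, :, None]                 # keep only the foreground pixels
```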
References

• Richard Szeliski teaching unit


▪ https://courses.cs.washington.edu/courses/cse576/05sp/lectures/segment.pdf

• OpenCV – Hough transform


▪ http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html
