
Computer Vision - Mosig

Filters, contours, segmentation

Sergi Pujades

Original slides by: Jean-Sébastien Franco


Characterization of an image

• Which information is relevant / interesting in an image?


Characterization of an image

• Salient points (key points): areas of strong contrast (0D)


Characterization of an image

• Lines, salient contours: contrast boundaries (1D)


Characterization of an image

• Regions: groups of pixels with similar properties (2D)


Contours and regions

• Contours: boundary between homogeneous regions of the image


• Regions: homogeneous regions between the contours
• Duality contours/regions
Challenges

• Which criteria to aggregate pixels?


• What is a salient point / contour? (fuzzy)
• How to distinguish between information and noise?
Challenges
Outline

• Image basics (you (should) already know this)


▪ representations, noise, contrast
▪ correlation, linear filters, smoothing (mean, Gaussian, …)

• Detection of contrast changes / contours


▪ gradients, finite differences, Sobel, Laplacian

• Detection of straight lines


▪ Hough transform

• Aggregation / segmentation
▪ Graph cut
Filters
Problem: the noise

• How to model the noise in the images?


• Simplest and most commonly adopted model: Gaussian noise (thermal)
▪ additive noise, independently sampled in each pixel from a Gaussian
distribution (same distribution for the full image)

(example images with Gaussian noise, σ = 1 and σ = 16)
Noise in 1D (and consequences)

(plots of the signal f(x) and of its derivative f'(x))
Noise in 2D (and consequences)

• Contours (gradients) drowned in the noise


• Need to reduce the noise to extract good contrast information
How to reduce the noise?

• Problem:
▪ noise lives in the high frequencies
▪ derivative computations become problematic

• Solution:
▪ remove the high frequencies

• Convolution with a low-pass filter


Convolution

(f * g)(t) = \int_\tau f(\tau)\, g(t - \tau)\, d\tau        (continuous)

(f * g)(m) = \sum_n f(n)\, g(m - n)                          (discrete)
Solution: first smooth
(low-pass Gaussian filter)

(figure: the smoothed signal g * f and its derivative \frac{\partial}{\partial x}(g * f))
Theorem of the derivatives of a convolution

\frac{\partial}{\partial x}(g * f) = \frac{\partial g}{\partial x} * f
Noise in 2D

Smoothing kernel
Benefits of the smoothing
2D convolutions

• Continuous case:

(m * f)(x) = \int_u m(u)\, f(x - u)\, du

(m * f)(x, y) = \int_u \int_v m(u, v)\, f(x - u, y - v)\, du\, dv

• Discrete case:

(m * f)(x) = \sum_{i=-w}^{w} m(i)\, f(x - i)

(m * f)(x, y) = \sum_{i=-w}^{w} \sum_{j=-h}^{h} m(i, j)\, f(x - i, y - j)
Discrete convolutions in 2D

3×3 neighborhood of the image f around pixel (i, j):

f(·) f(·) f(·)
f(·) f(·) f(·)
f(·) f(·) f(·)

3×3 kernel coefficients:

c11 c12 c13
c21 c22 c23
c31 c32 c33

o (i,j) = c11 f(i-1,j-1) + c12 f(i-1,j) + c13 f(i-1,j+1) +


c21 f(i,j-1) + c22 f(i,j) + c23 f(i,j+1) +
c31 f(i+1,j-1) + c32 f(i+1,j) + c33 f(i+1,j+1)
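As a sketch, the sum above can be implemented directly with nested loops; the function below follows the slide's indexing (coefficient c[k,l] multiplies f(i-1+k, j-1+l)), leaves the border at zero, and its name is illustrative:

```python
import numpy as np

def convolve_3x3(f, c):
    """Apply the 3x3 sum of the slide: o(i,j) = sum_{k,l} c[k,l] * f(i-1+k, j-1+l).
    (As written this is a correlation; for a true convolution, flip the kernel.)
    Border pixels are left at zero for simplicity."""
    h, w = f.shape
    o = np.zeros((h, w), dtype=float)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            o[i, j] = np.sum(c * f[i - 1:i + 2, j - 1:j + 2])
    return o
```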
Smoothing with a mean filter

F[x, y]: 10×10 test image

0  0  0  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0 90  0 90 90 90  0  0
0  0  0 90 90 90 90 90  0  0
0  0  0  0  0  0  0  0  0  0
0  0 90  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0  0  0

H[u, v]: 3×3 mean kernel

      1 1 1
1/9 · 1 1 1
      1 1 1
Mean filter
Smoothing with a 2D Gaussian filter

F[x, y]: the same 10×10 test image as above.

H[u, v]: 3×3 Gaussian kernel

       1 2 1
1/16 · 2 4 2
       1 2 1

Kernel with an approximation of the Gaussian function:

h(u, v) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{u^2 + v^2}{2\sigma^2}}
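A small sketch of how such a kernel can be generated by sampling the (reconstructed) formula above and normalizing the weights to sum to 1; the function name and the default support radius are assumptions:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sample h(u,v) ~ exp(-(u^2+v^2)/(2*sigma^2)) on a (2*radius+1)^2 grid
    and normalize so the weights sum to 1 (the 1/(2*pi*sigma^2) factor cancels)."""
    if radius is None:
        radius = int(np.ceil(2.5 * sigma))   # assumed rule of thumb for the support
    u = np.arange(-radius, radius + 1)
    uu, vv = np.meshgrid(u, u, indexing="ij")
    h = np.exp(-(uu ** 2 + vv ** 2) / (2.0 * sigma ** 2))
    return h / h.sum()
```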
Gaussian filter
Edge detection
Edge detection
Why do we care about edges?

• Fundamental to human perception


• Compact and compressed version of the visual information
Types of edges

Surface normal discontinuity

Depth discontinuity

Reflectance / color discontinuity

Illumination discontinuity
Actual examples
Ideal edge

(profiles of an ideal step edge: the signal f(x), its first derivative f'(x), and its second derivative f''(x))
Real edges
Real edges

Very noisy signal: edge detection is not trivial
Edges’ properties
Edge descriptors
• Normal: unit vector in the direction of maximal (intensity) change

• Direction: unit vector perpendicular to the normal vector

• Position

• Intensity (contrast): amount of change across the edge

• Compact description:
▪ normal with the intensity as vector length (2D)
▪ position (2D)
Image as a 3D surface
Gradient image

• Differential operator: 2D vector pointing in the direction of the largest variation of the intensity

\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]

▪ Direction: \theta = \tan^{-1}\!\left( \frac{\partial f}{\partial y} / \frac{\partial f}{\partial x} \right)    Intensity: \|\nabla f\| = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 }

▪ Special cases: \nabla f = \left[ \frac{\partial f}{\partial x}, 0 \right], \quad \nabla f = \left[ 0, \frac{\partial f}{\partial y} \right], \quad \nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]
Geometric interpretation

• An image f(x,y) is not a continuous function:


▪ we approximate it with a continuous function (first order, a plane)
▪ we compute the derivatives of this function
Methods based on the gradient

Edge:
"high variation in the signal”

Look for:
“high value in the first derivative”
Finite differences (1D)

\frac{df}{dx} \approx \frac{f(x + dx) - f(x)}{dx} \approx \frac{f(x + dx) - f(x - dx)}{2\, dx}

\frac{d^2 f}{dx^2} \approx \frac{f(x + dx) - 2 f(x) + f(x - dx)}{dx^2}
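These finite-difference formulas translate directly into array operations; below is a minimal NumPy sketch (interior samples only, illustrative function name):

```python
import numpy as np

def derivatives_1d(f, dx=1.0):
    """Centered finite differences on the interior samples of a 1D signal."""
    f = np.asarray(f, dtype=float)
    df = (f[2:] - f[:-2]) / (2.0 * dx)                 # (f(x+dx) - f(x-dx)) / (2 dx)
    d2f = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx ** 2   # second derivative
    return df, d2f
```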
Finite differences
Finite differences as a convolution:

«enforce symmetry»

[-1, 1]           → \frac{\partial f}{\partial x} \approx \frac{f(x + dx, y) - f(x, y)}{dx}

[-0.5, 0, 0.5]    → \frac{\partial f}{\partial x} \approx \frac{f(x + dx, y) - f(x - dx, y)}{2\, dx}

[-1, 1]^T         → \frac{\partial f}{\partial y} \approx \frac{f(x, y + dy) - f(x, y)}{dy}

[-0.5, 0, 0.5]^T  → \frac{\partial f}{\partial y} \approx \frac{f(x, y + dy) - f(x, y - dy)}{2\, dy}
First smooth in 2D
• Naive solution:
▪ Gaussian smoothing of the image (1 convolution)
▪ then finite differences (1 convolution)
• Trick: apply the theorem of the derivatives of the convolution
▪ compose Gaussian smoothing and differentiation
▪ use the filter with the Gaussian derivatives - 1 convolution (only)
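A sketch of the two options using SciPy: the naive version smooths and then differentiates (two passes), while the one-pass version convolves directly with the x-derivative of the Gaussian via the order argument of gaussian_filter. Function names are illustrative, and the two responses agree only up to the discretization of the derivative:

```python
import numpy as np
from scipy import ndimage

def grad_x_two_passes(img, sigma):
    """Naive: Gaussian smoothing, then a centered finite difference along x."""
    smoothed = ndimage.gaussian_filter(img.astype(float), sigma)
    return ndimage.correlate1d(smoothed, np.array([-0.5, 0.0, 0.5]), axis=1)

def grad_x_one_pass(img, sigma):
    """Trick: a single convolution with the x-derivative of the Gaussian."""
    return ndimage.gaussian_filter(img.astype(float), sigma, order=(0, 1))
```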
Example of gradient operator: Sobel

• Well known operators for edge detection


Sx[u,v]:            Sy[u,v]:
-1  0  1            -1 -2 -1
-2  0  2             0  0  0
-1  0  1             1  2  1

• Particular case of the composition of a 3×3 Gaussian filter with finite differences:

Sx[u,v] ∝ [1 2 1; 2 4 2; 1 2 1] * [-1 0 1]

Sy[u,v] ∝ [1 2 1; 2 4 2; 1 2 1] * [-1 0 1]^T
• Can be generalized for arbitrary size smoothing kernels
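As an illustration, the Sobel kernels above can be applied with a generic 2D convolution to obtain the gradient magnitude and direction; this is a sketch using scipy.ndimage, with an illustrative function name:

```python
import numpy as np
from scipy import ndimage

Sx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
Sy = Sx.T  # [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_gradient(img):
    """Filter with the Sobel kernels and return gradient magnitude and direction."""
    gx = ndimage.convolve(img.astype(float), Sx)
    gy = ndimage.convolve(img.astype(float), Sy)
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```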
Influence of the scale of the filtering
kernel

(results with smoothing kernels of 1, 3 and 7 pixels)

It affects the derivative computation and the scale (and semantics) of the detected contours.
Laplace operator (or Laplacian)

• In 2D:   \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
▪ scalar magnitude
▪ invariant to rotation

• Discrete approximation:
0  1  0
1 -4  1
0  1  0

• Find the maximum gradient value:
▪ find where the Laplace operator is = 0

(figure: image, its gradient, and the Laplace operator response)


Laplacian of Gaussian (LoG)

• Same approach in 2D: LoG =


▪ 1. Gaussian smoothing
▪ 2. Laplacian

LoG * I = \nabla^2 * (G * I) = (\nabla^2 G) * I

LoG(x, y) = -\frac{1}{\pi \sigma^4} \left[ 1 - \frac{x^2 + y^2}{2\sigma^2} \right] e^{-\frac{x^2 + y^2}{2\sigma^2}}

• Cheaper (in computation) than the intensity of the gradient!


Laplacian of Gaussian (LoG)

0 1 1 2 2 2 1 1 0
1 2 4 5 5 5 4 2 1
1 4 5 3 0 3 5 4 1
2 5 3 −12 −24 −12 3 5 2
2 5 0 −24 −40 −24 0 5 2
2 5 3 −12 −24 −12 3 5 2
1 4 5 3 0 3 5 4 1
1 2 4 5 5 5 4 2 1
0 1 1 2 2 2 1 1 0

Discrete approximation (σ = 1.4)

LoG shape: an upside-down Mexican hat.
A common heuristic is to choose the filter size as about 5σ.
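A possible way to compute the LoG response and read off its zero-crossings with SciPy; gaussian_laplace applies the ∇²G filter directly, and the zero-crossing test here (a 3×3 neighborhood containing both signs) is a deliberately coarse sketch with illustrative function names:

```python
import numpy as np
from scipy import ndimage

def log_response(img, sigma=1.4):
    """Laplacian-of-Gaussian response (the LoG filter at scale sigma)."""
    return ndimage.gaussian_laplace(img.astype(float), sigma)

def zero_crossings(log):
    """Coarse zero-crossing mask: pixels whose 3x3 neighborhood changes sign."""
    mn = ndimage.minimum_filter(log, size=3)
    mx = ndimage.maximum_filter(log, size=3)
    return (mn < 0) & (mx > 0)
```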
LoG at different scale values

Yellow = negative, Green = positive


Edges at different scales
LoG Zero-Crossings

(σ = 2 and σ = 4)
Edge detection

3 problems:
1. The intensity of the gradient is scale dependent: which scale should we use?
2. The intensity of the gradient is high over a "big stripe" area: which are the significant edge points?
3. How to connect the significant edge points to obtain a curved edge?

Contour extraction (Canny)

∇f

Approach:

1. Only keep the maximum intensity within a "slice" across the edge (along the direction of the gradient ∇f)
2. Search for further edge points in the direction of the tangent (normal to the gradient ∇f)
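For reference, OpenCV implements this pipeline in cv2.Canny, which takes the two hysteresis thresholds; the file name, blur size, and threshold values below are illustrative choices, not prescribed ones:

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)        # placeholder file name
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=1.4)         # choose the scale here
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)   # hysteresis thresholds
```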
Fine scale, high threshold
Large scale, high threshold
Large scale, low threshold
Detecting lines

Hough transform
Line detection
Line detection

• Problems with the noise: continuity of the filter outputs is not guaranteed
How to merge points into a line?

• Hough transform (Duda & Hart, 1973)

• Use polar coordinates to parametrize lines:


▪ one line = a pair (r,θ)

y = -\frac{\cos\theta}{\sin\theta}\, x + \frac{r}{\sin\theta}

r = x \cos\theta + y \sin\theta
How to merge points into a line?

• Compute the histogram h(r,θ): each observed edge point adds a vote for every line passing through it
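A minimal sketch of this voting scheme: every edge pixel increments the accumulator bins h(r, θ) of all lines r = x·cosθ + y·sinθ passing through it. The bin counts and the function name are illustrative:

```python
import numpy as np

def hough_accumulator(edge_mask, n_theta=180, n_r=200):
    """Vote in the histogram h(r, theta) for every edge pixel of a binary mask."""
    ys, xs = np.nonzero(edge_mask)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = np.hypot(*edge_mask.shape)
    r_bins = np.linspace(-r_max, r_max, n_r)
    acc = np.zeros((n_r, n_theta), dtype=int)
    for x, y in zip(xs, ys):
        r = x * np.cos(thetas) + y * np.sin(thetas)       # one r per theta
        acc[np.digitize(r, r_bins) - 1, np.arange(n_theta)] += 1
    return acc, r_bins, thetas

# Peaks of acc correspond to detected lines (r, theta).
```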
Hough transform

Approach generalizes to other shapes: “Generalizing the Hough transform to detect arbitrary shapes”, Ballard 1981
Segmentation
Segmentation

• Regions: groups of pixels with similar properties


What belongs together?

Gestalt Theory (~1920)

source : studyblue.com
Principle of characterization

• Find a similarity measure between the elements in the image 



(pixels, super-pixels, small regions)

• Examples of similar characteristics:


▪ pixel intensity, color
▪ position, speed
▪ texture
▪ size, orientation in the case of small regions

• Example of a similarity function:


▪ distance to the mean of the region
▪ probability according to a distribution of the characteristics
• Mahalanobis distance (distance weighted by the variance)
• histogram
Example: color as characteristic
Example: color as characteristic

• Element = pixel
• Each pixel corresponds to a point in the RGB color space
Similarity in the color space

• Naive approach: Euclidean distance

In RGB: d(p, q) = \sqrt{(p_R - q_R)^2 + (p_G - q_G)^2 + (p_B - q_B)^2}

In HSV: d(p, q) = \sqrt{(p_H - q_H)^2 + (p_S - q_S)^2 + (p_V - q_V)^2}
Segmentation in the color space

• Find clusters of similar pixels using the color distance


Example: 16 groups vs. 2 groups
Segmentation without prior information

(“unsupervised”)
• Clustering by aggregation
▪ an element is added to a group if it is “similar enough” to the other
elements in the group
▪ iterate

• Clustering by subdivision
▪ divide the groups in sub-groups according to the best boundary
▪ boundary defined with a similarity criteria
▪ iterate
Algorithm K-Means
• Initialization
▪ Given K, the number of target groups, and N the number of elements in the characteristic space (e.g. color, but it could be different)
▪ Randomly select K elements among the possible N
▪ These elements are now the centers m1, ..., mK of the K groups

• Iteration
▪ Assign each of the N points xi to the closest group center mi
▪ Re-compute the mean mi of each group
▪ If no change, stop

• Effectively minimizing the loss (or objective function): J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \| x_i - m_j \|^2, the sum of squared distances of the points to their assigned group centers
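A compact NumPy sketch of the algorithm above (random initialization among the N elements, assignment to the closest center, mean update, stop when the centers no longer change); the function name and defaults are illustrative:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means on an (N, d) array X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # K random elements
    for _ in range(n_iter):
        # Assignment step: closest center for every point.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute the mean of each group (keep empty groups as is).
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):   # no change: stop
            break
        centers = new_centers
    return labels, centers
```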


Example with 3 groups
Pros / cons

Pros:
• Simple
• Fast
• Convergence towards a local minimum

Cons:
• How to choose K?
• Sensitive to initialization
• Spherical groups only
• (very) sensitive to outliers
Example

K-Means

Better clusters?
Generalization: GMM

Gaussian Mixture Models

• Instead of a "hard" association of each point to a cluster, we define the probability that a point belongs to each cluster

• Each blob b ∈ [1, K] is represented by a Gaussian distribution:

P(x \mid \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d \,\lvert V_b \rvert}} \, e^{-\frac{1}{2}(x - \mu_b)^T V_b^{-1} (x - \mu_b)}

d: data point dimension (d = 2 here)    \mu_b, V_b: mean and covariance of blob b

• The probability of observing a point is a mixture of Gaussians:

P(x \mid \Theta) = \sum_{b=1}^{K} \alpha_b \, P(x \mid \Theta_b)

\alpha_b: mixing coefficients
Solving GMM: Expectation Maximization (EM)

• Goal: find the parameters Θ of the blobs that maximize the probability:


P(X | Θ) = ∏x P(x | Θ)

Algorithm:
1. Expectation step: given the current estimates of the blobs, compute the probability of each point belonging to each blob
2. Maximization step: given these probabilities, recompute the blob parameters so as to maximize the total probability
3. Iterate until convergence
EM details
• E-step:
▪ probability of a point belonging to blob b (under the current estimates)

P(b \mid x, \mu_b, V_b) = \frac{\alpha_b \, P(x \mid \mu_b, V_b)}{\sum_{i=1}^{K} \alpha_i \, P(x \mid \mu_i, V_i)}

• M-step
▪ Update of the mixing weights:

\alpha_b^{new} = \frac{1}{N} \sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)

▪ Update of the means and covariances:

\mu_b^{new} = \frac{\sum_{i=1}^{N} x_i \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}

V_b^{new} = \frac{\sum_{i=1}^{N} (x_i - \mu_b^{new})(x_i - \mu_b^{new})^T \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}
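One EM iteration following these update equations, as a sketch in NumPy/SciPy; the responsibilities computed in the E-step are reused directly in the M-step, the function name is illustrative, and no safeguards against empty or degenerate blobs are included:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, alphas, means, covs):
    """One EM iteration for a GMM on data X of shape (N, d)."""
    N = len(X)
    K = len(alphas)
    # E-step: responsibilities P(b | x_i) for every point and blob.
    resp = np.column_stack([alphas[b] * multivariate_normal.pdf(X, means[b], covs[b])
                            for b in range(K)])
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update mixing weights, means and covariances.
    Nb = resp.sum(axis=0)                       # effective number of points per blob
    new_alphas = Nb / N
    new_means = (resp.T @ X) / Nb[:, None]
    new_covs = [((resp[:, b, None] * (X - new_means[b])).T @ (X - new_means[b])) / Nb[b]
                for b in range(K)]
    return new_alphas, new_means, new_covs
```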
Generalization: EM

• Replace the naive point assignment by a probabilistic (weighted) assignment: each point has a distinct probability of belonging to every cluster
GMMs and Expectation Maximization

Pros:
• Relatively simple
• Fast
• Convergence towards a local minimum
• Non-spherical clusters (more generic)
• Can better handle outliers: one can add an «outlier» cluster
• Can be applied to many vision problems: segmentation, but also model parameter fitting

Cons:
• K still needs to be chosen beforehand
• Sensitive to the initialization
• The generative model (shape of the blobs) needs to be chosen
Superpixels motivation & principle


• Previous aggregation is solely based on color distance

• How to integrate spatial proximity?

• Superpixels: a simple algorithm grouping similar pixels in the same region of the image

• Simple algorithm, SLIC: K-means in 5D, color + 2D coordinates


• Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk, SLIC
Superpixels Compared to State-of-the-art Superpixel Methods, IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 34, num. 11, p. 2274 – 2282, May 2012.
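For experimentation, SLIC is available in scikit-image as segmentation.slic; the file name and the n_segments / compactness values below are illustrative:

```python
from skimage import io, segmentation

img = io.imread("image.png")                                      # placeholder RGB image
labels = segmentation.slic(img, n_segments=200, compactness=10)   # 5D K-means (color + x, y)
overlay = segmentation.mark_boundaries(img, labels)               # draw superpixel borders
io.imsave("superpixels.png", (overlay * 255).astype("uint8"))
```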
Superpixels : example SLIC

Segmentation using a graph
partitioning strategy: principle


• Represent elements and their relations in a graph

• Cut the graph in sub-graphs

• Internal bindings are strong, inter-graph bindings are weak


Graphs of images

• One pixel -> one vertex


• One edge between each pair of pixels
• Each edge has a weight describing the

similarity (or affinity) between the pixels
Examples of similarities (affinities)

• Distance between positions

aff(x, y) = e^{-\frac{1}{2\sigma_d^2} \| x - y \|^2}

• Distance in intensity space

aff(x, y) = e^{-\frac{1}{2\sigma_i^2} \left( I(x) - I(y) \right)^2}

• Distance in color space (which one?)

aff(x, y) = e^{-\frac{1}{2\sigma_t^2} \| c(x) - c(y) \|^2}
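As an illustration, the color affinity above reduces to a few lines of code; the scale σ is a user-chosen parameter and the example values are arbitrary:

```python
import numpy as np

def color_affinity(c_x, c_y, sigma_c=0.1):
    """Affinity between two pixels from their color vectors (e.g. RGB in [0, 1])."""
    diff = np.asarray(c_x, dtype=float) - np.asarray(c_y, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma_c ** 2))

# Two similar reddish pixels get an affinity close to 1:
print(color_affinity([0.90, 0.10, 0.10], [0.85, 0.12, 0.10]))
```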
Segmentation using a graph
partitioning strategy

• Sub-graphs:
▪ remove the edges of low similarity

• Sub-graphs represent different groups / regions


How?

Graph cut [Boykov 2001, Shi & Malik]

• Cut: set of edges allowing a graph partition when removed


• Cost of a cut: sum of the edge weights being cut
• Efficient algorithms exist
Pros / cons

• Generic method, can be applied in many problems

• High memory and computation resources required


• Bias in the cuts
▪ regions with similar sizes are preferred
▪ "shortcuts" exist (local minima)
If the region characteristics are
known…
• Example of the ball

• Compute the statistics of the ball appearance: mean color, distribution, histogram
• Find the target pixels using the similarity
Source : Wasik & Saffiotti. Robust Color Segmentation for the RoboCup Domain. 16th International
Conference on Pattern Recognition (ICPR'02), Vol. 2, p. 20651, 2002.
Grabcut - interactive method

• User gives a coarse segmentation


• Similarity: color distribution in a region
• Iterative process:
▪ update the distribution
▪ cut the graph
▪ stop criterion: when the update / cut no longer change

Rother, Carsten, Vladimir Kolmogorov, and Andrew Blake. "GrabCut: Interactive foreground extraction using iterated graph cuts." ACM Transactions on Graphics (TOG) 23, no. 3 (2004): 309-314.
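OpenCV exposes this method as cv2.grabCut; below is a sketch of the rectangle-initialized variant, where the file name and rectangle coordinates are placeholders for the user's coarse segmentation:

```python
import cv2
import numpy as np

img = cv2.imread("image.png")                    # placeholder BGR image
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)        # internal GMM state (background)
fgd_model = np.zeros((1, 65), np.float64)        # internal GMM state (foreground)
rect = (50, 50, 300, 200)                        # coarse user rectangle (x, y, w, h)

cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)
fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype("uint8")
segmented = img * fg[:, :, None]                 # keep only the foreground pixels
```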
References

• Richard Szeliski teaching unit


▪ https://courses.cs.washington.edu/courses/cse576/05sp/lectures/segment.pdf

• OpenCV – Hough transform


▪ http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html
