
Computer Vision
Chapter 4: Feature detection and Image matching

Plan
• Edges
‒ Detection
‒ Linking
• Feature extraction
‒ Global features
‒ Local features
‒ Matching
• Applications

Local features vs Global features

• Two types of features are extracted from the image:
‒ local and global features (descriptors)
• Global features
‒ Describe the image as a whole to generalize the entire object
‒ Include contour representations, shape descriptors, and texture features
‒ Examples: invariant moments (Hu, Zernike), Histogram of Oriented Gradients (HOG), PHOG, Co-HOG, ...
• Local features
‒ Describe the image patches (around key points in the image) of an object
‒ Represent the texture/color in an image patch (color / shape / texture)
‒ Examples: SIFT, SURF, LBP, BRISK, MSER, FREAK, ...

Feature extraction
• Global features
• Local features
Global features?

• How to distinguish these objects?

Types of features

• Contour representation, shape features
• Color descriptors
• Texture features

Color features

• Histogram (e.g. a 256-bin or a 16-bin intensity histogram)

Distance / Similarity

• L1 or L2 (Euclidean) distances are often used:

d_{L1}(H, G) = \sum_{i=1}^{N} |h_i - g_i|

• Histogram intersection:

\cap(H, G) = \frac{\sum_i \min(h_i, g_i)}{\sum_i g_i}
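To make the two measures concrete, here is a minimal NumPy sketch (the 16-bin histograms and the random test images are illustrative):

```python
import numpy as np

def l1_distance(h, g):
    """L1 distance between two histograms: sum of absolute bin differences."""
    return np.sum(np.abs(h - g))

def histogram_intersection(h, g):
    """Normalized intersection: sum of bin-wise minima over the sum of g."""
    return np.sum(np.minimum(h, g)) / np.sum(g)

# Example with 16-bin intensity histograms of two random "images"
rng = np.random.default_rng(0)
img1 = rng.integers(0, 256, size=(64, 64))
img2 = rng.integers(0, 256, size=(64, 64))
h, _ = np.histogram(img1, bins=16, range=(0, 256))
g, _ = np.histogram(img2, bins=16, range=(0, 256))
print(l1_distance(h, g), histogram_intersection(h, g))
```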
Advantages of histogram

• Invariant to basic geometric transformations:
‒ Rotation
‒ Translation
‒ Zoom (scaling)

Some inconveniences

• The similarity between adjacent colors (bins) is not taken into account
• The spatial distribution of pixel values is not considered: 2 different images may have the same histogram

Some inconveniences

• Background effect: d(I1, I2) ? d(I1, I3)
• Color representation dependency (color space), device dependency, ...

Texture features

• A texture can be defined as
‒ a region with variation of intensity
‒ a spatial organization of pixels
Texture features

• There are several methods for analyzing textures:
‒ First-order statistics
• Statistics on the histogram
‒ Co-occurrence matrices
• Searching for repeated patterns
‒ Frequency analysis
• Gabor filters
‒ ...
• The most difficult part is to find a good representation (good parameters) for each texture

First order statistics

• Histogram-based: mean, variance, skewness, kurtosis, energy, entropy, ...
‒ h(i): the number of pixels at gray level i
• Skewness vs. kurtosis (red: normal distribution with mean = 5, variance = 4)
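As a sketch, all of these statistics can be computed from the normalized histogram p(i) = h(i)/N; the function below is a minimal NumPy version (the 256-level assumption and the variable names are illustrative):

```python
import numpy as np

def first_order_stats(image, levels=256):
    """First-order texture statistics from the gray-level histogram."""
    h, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = h / h.sum()                       # normalized histogram p(i)
    i = np.arange(levels)
    mean = np.sum(i * p)
    var = np.sum((i - mean) ** 2 * p)
    std = np.sqrt(var)
    skew = np.sum(((i - mean) / std) ** 3 * p)
    kurt = np.sum(((i - mean) / std) ** 4 * p) - 3   # excess kurtosis
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return dict(mean=mean, variance=var, skewness=skew,
                kurtosis=kurt, energy=energy, entropy=entropy)
```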
GLCM (Grey Level Co-occurrence Matrices)

• The idea here is to identify gray levels that repeat themselves given a distance and a direction
• Co-occurrence matrices (Haralick):

CM_{d,\theta}(c_i, c_j) = \frac{card(\{(p_1, p_2) \mid I(p_1) = c_i, I(p_2) = c_j, N_{d,\theta}(p_1, p_2) = true\})}{card(\{(p_1, p_2) \mid N_{d,\theta}(p_1, p_2) = true\})}

where N_{d,\theta}(p_1, p_2) = true means that p_2 is a neighbor of p_1 at distance d in direction θ.
• Matrix of size Ng x Ng
‒ Ng is the number of gray levels in the image (a 256x256 matrix); we often reduce that number to 8x8, 16x16 or 32x32
• Many matrices, one for each distance and direction:
‒ Distance: 1, 2, 3 (, 4, ...)
‒ Direction: 0°, 45°, 90°, 135° (, ...)
• Processing time can be very long

GLCM

• Example of how to compute these matrices, for distance = 1 and direction = 0°, on the 4x4 image

1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

• We loop over the image and, for each pair of pixels at the given distance and orientation, we increment the co-occurrence matrix: the first pair of neighbor pixels is (1,4), so entry (1,4) of the matrix is incremented; the next pair is (4,4), and so on.
GLCM

• After scanning the first two image rows, the matrix for distance = 1 and direction = 0° is:

     1 2 3 4
  1  0 0 0 1
  2  0 0 1 0
  3  0 1 0 0
  4  0 1 1 1

...and so on.

GLCM

• Final result of the example: the matrices for distance = 1 in directions 0° and 45° are

  direction=0°:      direction=45°:
     1 2 3 4            1 2 3 4
  1  0 2 0 2         1  0 2 1 0
  2  1 1 2 0         2  1 1 0 0
  3  0 1 0 0         3  0 0 0 1
  4  0 1 1 1         4  0 2 1 0

...and so on for each matrix (several matrices at the end).

• The most important/popular parameters computed from the GLCM:

Energy = \sum_i \sum_j CM_d(i, j)^2 (minimal when all elements are equal)

Entropy = -\sum_i \sum_j CM_d(i, j) \log(CM_d(i, j)) (a measure of chaos; maximal when all elements are equal)

Contrast = \sum_i \sum_j (i - j)^2 CM_d(i, j) (small values when the big elements are near the main diagonal)

IDM = \sum_i \sum_j \frac{CM_d(i, j)}{1 + (i - j)^2} (the inverse differential moment has small values when the big elements are far from the main diagonal)
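A minimal NumPy sketch of the computation, written to reproduce the worked example above (gray levels coded 1..Ng); a production implementation would more likely use skimage.feature.graycomatrix:

```python
import numpy as np

def glcm(image, levels, offset=(0, 1)):
    """Co-occurrence counts for one (distance, direction), given as a
    (row, col) offset; (0, 1) is distance 1 in direction 0 degrees.
    Gray levels are assumed to be integers in 1..levels."""
    dr, dc = offset
    m = np.zeros((levels, levels), dtype=int)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r, c] - 1, image[r2, c2] - 1] += 1
    return m

img = np.array([[1, 4, 4, 3],
                [4, 2, 3, 2],
                [1, 2, 1, 4],
                [1, 2, 2, 3]])
m = glcm(img, levels=4)            # matches the 0-degree matrix above

p = m / m.sum()                    # normalize before computing the parameters
i, j = np.indices(p.shape)
energy = np.sum(p ** 2)
entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))
contrast = np.sum((i - j) ** 2 * p)
idm = np.sum(p / (1 + (i - j) ** 2))
```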
GLCM

• Haralick features:
‒ For each GLCM, we can compute up to 14 (13) parameters characterizing the texture, of which the most important are: mean, variance, energy, inertia, entropy, inverse differential moment
‒ Ref: http://haralick.org/journals/TexturalFeatures.pdf

Invariances

• Rotation?
‒ Average over all directions
• Scaling?
‒ Multi-resolution

Texture features comparison

Source: William Robson Schwartz et al. Evaluation of Feature Descriptors for Texture Classification – 2012 JEI

Shape features

• Contour-based features
‒ Chain coding, polygon approximation, geometric parameters, angular profile, surface, perimeter, ...
• Region-based features
‒ Invariant moments, ...
Shape features

• Boundary-based methods
‒ Geometrical: perimeter, eccentricity, curvature, axes directionality, others
‒ Signatures: centroid distance, complex coordinates, curvature signature, turning angle, chain codes
‒ Signature descriptors: Fourier Descriptors, UNL-Fourier, NFD, Wavelet Descriptors, B-Splines
• Region-based methods
‒ Global geometrical: area, compactness, Euler number, grid method
‒ Global: moment invariants, Zernike moments, pseudo-Zernike moments
‒ Decomposition: triangulation, Medial Axis Transform (Skeleton Transform)

Examples: Freeman chain coding

Examples: angular profile, ...

Examples: Image moments

• Moments:

M_{p,q} = \sum_x \sum_y x^p y^q I(x, y)

‒ M_{0,0} = area of the region D
‒ (M_{1,0}/M_{0,0}, M_{0,1}/M_{0,0}) = centroid of D
• Central moments (computed about the centroid): invariant to translation
Invariant moments (Hu's moments)

• Invariant to translation, scale, rotation, and reflection
‒ (the 7th moment changes sign under image reflection)

Examples: Hu's moments

• 6 images and their Hu moments
https://www.learnopencv.com/wp-content/uploads/2018/12/HuMoments-Shape-Matching.png
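A short OpenCV sketch of computing Hu moments for shape matching (the file name is a placeholder; cv2.moments and cv2.HuMoments are the standard OpenCV calls):

```python
import cv2
import numpy as np

img = cv2.imread("shape.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

m = cv2.moments(binary)              # raw, central and normalized moments
hu = cv2.HuMoments(m).flatten()      # the 7 Hu invariant moments

# Log-scale the values, since they span several orders of magnitude
hu_log = -np.sign(hu) * np.log10(np.abs(hu))
print(hu_log)
```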

Shape Context

https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/sc_digits.html

Examples: PHOG

• PHOG: Pyramid Histogram of Oriented Gradients
Source: http://www.robots.ox.ac.uk/~vgg/research/caltech/phog.html
Why local features?

• Image matching: a challenging problem
Source: CS131 - Juan Carlos Niebles and Ranjay Krishna

Feature extraction
• Global features
• Local features
‒ Interest point detector
‒ Local descriptor

Image matching
(images by Diva Sian, swashford and scgbt)
Slide credit: Steve Seitz

Harder still?
NASA Mars Rover images
Slide credit: Steve Seitz
Answer below (look for tiny colored squares)
NASA Mars Rover images with SIFT feature matches (figure by Noah Snavely)
Slide credit: Steve Seitz

Motivation for using local features

• Global representations have major limitations
• Instead, describe and match only local regions
• Increased robustness to:
‒ Occlusions
‒ Articulation
‒ Intra-category variations
Source: CS131 - Juan Carlos Niebles and Ranjay Krishna

Local features and object recognition

• Recognition of specific objects/scenes
Sivic and Zisserman, 2003; D. Lowe 2002

Local features and alignment

• We need to match (align) images
• Global methods are sensitive to occlusion, lighting, and parallax effects, so look for local features that match well
• How would you do it by eye?
[Darya Frolova and Denis Simakov]
Local features and alignment

• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align the images
[Darya Frolova and Denis Simakov]

Local features for image matching

1. Find a set of distinctive keypoints
2. Define a region around each keypoint
3. Extract and normalize the region content
4. Compute a local descriptor from the normalized region
5. Match local descriptors: accept a match if the distance between descriptors is below a threshold, d(fA, fB) < T

Source: Jim Little, Lowe: features, UBC.
Slide credit: Bastian Leibe
Local features

• Objectives:
‒ Look for similar objects/regions
‒ Partial query: look for pictures that contain sunflowers
• Solution: local features, adding spatial constraints if needed

Local feature extraction

• Local features: how to determine image patches / local regions?
‒ Dividing into patches with a regular grid (without knowledge about the image content)
‒ Keypoint detection or image segmentation (based on the content of the image)
• Describing local regions

Common Requirements

• Problem 1:
‒ Detect the same point independently in both images
‒ Otherwise there is no chance to match: we need a repeatable detector!
• Problem 2:
‒ For each point, correctly recognize the corresponding one
‒ We need a reliable and distinctive descriptor!
Slide credit: Darya Frolova, Denis Simakov

Invariance: Geometric Transformations
Slide credit: Steve Seitz
Invariance: Geometric Transformations

• Levels of geometric invariance
Slide credit: Steve Seitz

Invariance: Photometric Transformations

• Often modeled as a linear transformation:
‒ Scaling + offset
Slide credit: Tinne Tuytelaars

Requirements

• Region extraction needs to be repeatable and accurate
‒ Invariant to translation, rotation, scale changes
‒ Robust or covariant to out-of-plane (affine) transformations
‒ Robust to lighting variations, noise, blur, quantization
• Locality: features are local, therefore robust to occlusion and clutter
• Quantity: we need a sufficient number of regions to cover the object
• Distinctiveness: the regions should contain “interesting” structure
• Efficiency: close to real-time performance

Main questions

• Where will the interest points come from?
‒ What are the salient features that we’ll detect in multiple views?
• How to describe a local region?
• How to establish correspondences, i.e., compute matches?
Feature extraction
• Global features
• Local features
‒ Interest point detector
‒ Local descriptor
• Matching

Interest points: why and where?

• Yarbus eye tracking: the same image viewed with different questions
Source: Derek Hoiem, Computer Vision, University of Illinois.

• Where will the interest points come from?
Keypoint Localization

• Goals:
‒ Repeatable detection
‒ Precise localization
‒ Interesting content
Slide credit: Bastian Leibe

Finding Corners

• Look for two-dimensional signal changes
• Key property:
‒ In the region around a corner, the image gradient has two or more dominant directions
• Corners are repeatable and distinctive
C. Harris and M. Stephens. "A Combined Corner and Edge Detector." Proceedings of the 4th Alvey Vision Conference, 1988.
Slide credit: Svetlana Lazebnik

Corners as distinctive interest points

• Design criteria
‒ We should easily recognize the point by looking through a small window (locality)
‒ Shifting the window in any direction should give a large change in intensity (good localization)
Slide credit: Alyosha Efros

Corners versus edges

• “Flat” region: no change in any direction (the intensity change is small in all directions)
• “Edge”: no change along the edge direction (the change is large across the edge, small along it)
• “Corner”: significant change in all directions (the change is large in all directions)
Harris detector formulation / Corner Detection by Auto-correlation

Change of intensity (change in appearance of window w(x,y)) for the shift [u,v]:

E(u, v) = \sum_{x,y} w(x, y) \, [I(x+u, y+v) - I(x, y)]^2

where w(x,y) is the window function (1 in the window and 0 outside, or a Gaussian), I(x,y) is the intensity and I(x+u, y+v) is the shifted intensity. By construction, E(0,0) = 0.
Source: R. Szeliski

Corner Detection by Auto-correlation

• We want to discover how E behaves for small shifts [u,v]
• But this is very slow to compute naively:
O(window_width^2 * shift_range^2 * image_width^2)
O(11^2 * 11^2 * 600^2) = 5.2 billion of these
Corner Detection by Auto-correlation

• We want to discover how E behaves for small shifts, and we know the response in E that we are looking for: a strong peak.
• The local quadratic approximation of E(u,v) in the neighborhood of (0,0) is given by the second-order Taylor expansion:

E(u, v) \approx E(0,0) + [u \; v] \begin{bmatrix} E_u(0,0) \\ E_v(0,0) \end{bmatrix} + \frac{1}{2} [u \; v] \begin{bmatrix} E_{uu}(0,0) & E_{uv}(0,0) \\ E_{uv}(0,0) & E_{vv}(0,0) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}

Notation: E_u, E_{uu}, ... are the partial derivatives of E.

Corner Detection: Mathematics

Since we are looking for a strong peak, we ignore the function value (set to 0) and the first derivative (set to 0), and just look at the shape of the second derivative. Differentiating E(u,v) gives:

E_u(u, v) = \sum_{x,y} 2 \, w(x, y) \, [I(x+u, y+v) - I(x, y)] \, I_x(x+u, y+v)

E_{uu}(u, v) = \sum_{x,y} 2 \, w(x, y) \, I_x(x+u, y+v) \, I_x(x+u, y+v) + \sum_{x,y} 2 \, w(x, y) \, [I(x+u, y+v) - I(x, y)] \, I_{xx}(x+u, y+v)

E_{uv}(u, v) = \sum_{x,y} 2 \, w(x, y) \, I_y(x+u, y+v) \, I_x(x+u, y+v) + \sum_{x,y} 2 \, w(x, y) \, [I(x+u, y+v) - I(x, y)] \, I_{xy}(x+u, y+v)
Corner Detection: Mathematics

Evaluating the Taylor terms at (u,v) = (0,0), the intensity difference I(x+u, y+v) - I(x,y) vanishes, so:

E(0,0) = 0, \quad E_u(0,0) = 0, \quad E_v(0,0) = 0

E_{uu}(0,0) = \sum_{x,y} 2 \, w(x, y) \, I_x(x, y) \, I_x(x, y)

E_{vv}(0,0) = \sum_{x,y} 2 \, w(x, y) \, I_y(x, y) \, I_y(x, y)

E_{uv}(0,0) = \sum_{x,y} 2 \, w(x, y) \, I_x(x, y) \, I_y(x, y)

Plugging these into the expansion:

E(u, v) \approx [u \; v] \begin{bmatrix} \sum w \, I_x^2 & \sum w \, I_x I_y \\ \sum w \, I_x I_y & \sum w \, I_y^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}

Harris detector formulation

• This measure of change can be approximated by:

E(u, v) \approx [u \; v] \, M \begin{bmatrix} u \\ v \end{bmatrix}

where M is a 2x2 matrix computed from the image derivatives (the gradient with respect to x times the gradient with respect to y, summed over the image region we are checking for a corner):

M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

Slide credit: Rick Szeliski

What does this matrix reveal?

• First, let's consider an axis-aligned corner:

M = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}

Slide credit: David Jacobs
What does this matrix reveal?

• For an axis-aligned corner, this means:
‒ The dominant gradient directions align with the x or y axis
‒ If either λ is close to 0, then this is not a corner, so look for locations where both are large
• What if we have a corner that is not aligned with the image axes?
Slide credit: David Jacobs

General case

• Since M is symmetric, we have the eigenvalue decomposition:

M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R

• We can visualize M as an ellipse, with axis lengths determined by the eigenvalues ((λ_max)^{-1/2} along the direction of the fastest change, (λ_min)^{-1/2} along the direction of the slowest change) and orientation determined by the rotation matrix R
Adapted from Darya Frolova, Denis Simakov

Interpreting the eigenvalues

• Classification of image points using the eigenvalues of M:
‒ “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
‒ “Edge”: λ1 >> λ2 (or λ2 >> λ1)
‒ “Flat” region: λ1 and λ2 are small; E is almost constant in all directions
Slide credit: Kristen Grauman

Corner response function

\theta = \det(M) - \alpha \, trace(M)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2

• “Corner”: θ > 0; “Edge”: θ < 0; “Flat” region: |θ| small
• This is a fast approximation: it avoids computing the eigenvalues
• α: constant (0.04 to 0.06)
Slide credit: Kristen Grauman
Window Function w(x,y)

• Option 1: uniform window
‒ Sum over a square window (1 in the window, 0 outside):

M = \sum_{x,y} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

‒ Problem: not rotation invariant
• Option 2: smooth with a Gaussian
‒ The Gaussian already performs a weighted sum:

M = g(\sigma) * \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

‒ Result is rotation invariant
Slide credit: Bastian Leibe

Summary: Harris Detector [Harris88]

• Compute the second moment matrix (autocorrelation matrix):
1. Image derivatives: I_x, I_y
2. Squares of derivatives: I_x^2, I_y^2, I_x I_y
3. Gaussian filter g(σ_I):

M(\sigma_I, \sigma_D) = g(\sigma_I) * \begin{bmatrix} I_x^2(\sigma_D) & I_x I_y(\sigma_D) \\ I_x I_y(\sigma_D) & I_y^2(\sigma_D) \end{bmatrix}

• Compute the corner response:
4. Cornerness function (two strong eigenvalues):

\theta = \det[M(\sigma_I, \sigma_D)] - \alpha [trace(M(\sigma_I, \sigma_D))]^2 = g(I_x^2) \, g(I_y^2) - [g(I_x I_y)]^2 - \alpha [g(I_x^2) + g(I_y^2)]^2

5. Perform non-maximum suppression
C. Harris and M. Stephens. “A Combined Corner and Edge Detector.” Proceedings of the 4th Alvey Vision Conference: pages 147–151, 1988.
Slide credit: Krystian Mikolajczyk
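The five steps map directly onto a few lines of NumPy/SciPy; the following is a minimal sketch (σ_D, σ_I, α and the window size are illustrative choices, and OpenCV's cv2.cornerHarris provides a ready-made version):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_response(image, sigma_d=1.0, sigma_i=2.0, alpha=0.05):
    """Harris corner response, following steps 1-4 above."""
    img = image.astype(float)
    # 1. Image derivatives via Gaussian derivative filters at scale σ_D
    ix = gaussian_filter(img, sigma_d, order=(0, 1))
    iy = gaussian_filter(img, sigma_d, order=(1, 0))
    # 2. Squares of derivatives, 3. integrated with a σ_I Gaussian window
    ixx = gaussian_filter(ix * ix, sigma_i)
    iyy = gaussian_filter(iy * iy, sigma_i)
    ixy = gaussian_filter(ix * iy, sigma_i)
    # 4. Cornerness: det(M) - alpha * trace(M)^2
    return ixx * iyy - ixy ** 2 - alpha * (ixx + iyy) ** 2

def harris_corners(theta, threshold, size=5):
    """5. Non-maximum suppression: keep local maxima above the threshold."""
    local_max = theta == maximum_filter(theta, size=size)
    return np.argwhere(local_max & (theta > threshold))  # (row, col) pairs
```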

Harris Detector: Workflow

• Compute the corner responses θ
• Take the points where θ > threshold
• Take only the local maxima of θ, where θ > threshold
Slide adapted from Darya Frolova, Denis Simakov

Harris Detector: Workflow

• Resulting Harris points
Slide adapted from Darya Frolova, Denis Simakov

Harris Detector – Responses [Harris88]

• Effect: a very precise corner detector
Slide credit: Krystian Mikolajczyk
Harris Detector: Properties

• Translation invariance? Yes.
• Rotation invariance? Yes: the ellipse rotates but its shape (i.e. its eigenvalues) remains the same, so the corner response θ is invariant to image rotation.
• Scale invariance? No: when the image is enlarged, all points along a corner can be classified as edges. The response is not invariant to image scale!
Slide credit: Kristen Grauman

Scale invariance: how to?

• Exhaustive search
• Invariance
• Robustness
Exhaustive search

• Multi-scale approach

Invariance

• Extract the patch from each image individually
Slide adapted from T. Tuytelaars ECCV 2006 tutorial

Automatic scale selection

• Solution:
‒ Design a function on the region which is “scale invariant” (the same for corresponding regions, even if they are at different scales)
• Example: average intensity. For corresponding regions (even of different sizes) it will be the same.
‒ For a point in one image, we can consider it as a function of the region size (patch width)
• Common approach: take a local maximum of this function
‒ Observation: the region size for which the maximum is achieved should be invariant to image scale
‒ Important: this scale-invariant region size is found in each image independently!
Automatic Scale Selection: Example

• Function responses for increasing scale (the scale signature), computed independently in two images related by a scale factor
• Same operator responses if the patch contains the same image up to a scale factor
K. Grauman, B. Leibe

Scale Invariant Detection

• A “good” function for scale detection has one stable sharp peak (flat or multi-peaked responses are bad)
• For usual images: a good function would be one which responds to contrast (sharp local intensity change)

What is a useful signature function?

• Functions for determining scale: f = Kernel * Image
• Kernels:

L = \sigma^2 (G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma)) (Laplacian)

DoG = G(x, y, k\sigma) - G(x, y, \sigma) (Difference of Gaussians)

where the Gaussian is

G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}

• Note: both kernels are invariant to scale and rotation
What is a useful signature function?

• Laplacian-of-Gaussian = “blob” detector
T. Lindeberg (1998). "Feature detection with automatic scale selection." IJCV 30(2): pp. 77–116.

Characteristic scale

• We define the characteristic scale as the scale that produces the peak of the Laplacian response
Source: K. Grauman, B. Leibe; Lana Lazebnik

Laplacian-of-Gaussian (LoG)

• Interest points: local maxima in the scale space of

L_{xx}(\sigma) + L_{yy}(\sigma)

→ a list of (x, y, σ)
Source: K. Grauman, B. Leibe

Example: Scale-space blob detector
Source: Lana Lazebnik

Alternative approach

• Approximate LoG with Difference-of-Gaussian (DoG):
‒ 1. Blur the image with a σ Gaussian kernel
‒ 2. Blur the image with a kσ Gaussian kernel
‒ 3. Subtract one blurred image from the other: DoG = (G(kσ) - G(σ)) * I
• A small k gives a closer approximation to LoG, but usually we want to build a scale space quickly out of this: k = 1.6 gives an appropriate scale space (k = sqrt(2) is also used)
Ruye Wang; Source: K. Grauman, B. Leibe
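These three steps in a minimal SciPy sketch (a grayscale array is assumed):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(image, sigma=1.0, k=1.6):
    """DoG approximation of the Laplacian-of-Gaussian."""
    img = image.astype(float)
    blur_small = gaussian_filter(img, sigma)      # 1. blur with sigma
    blur_large = gaussian_filter(img, k * sigma)  # 2. blur with k*sigma
    return blur_large - blur_small                # 3. (G(k*sigma) - G(sigma)) * I
```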
Find local maxima in position-scale space of DoG

• Build a stack of DoG images at scales σ, kσ, k²σ, ... and find the maxima jointly over position and scale → a list of (x, y, s)

Harris-Laplacian

• Harris-Laplacian [1]: find the local maximum of:
‒ the Harris corner detector in space (image coordinates)
‒ the Laplacian in scale
[1] K. Mikolajczyk, C. Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001
[2] D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. IJCV 2004

Scale Invariant Detectors

• Harris-Laplacian [1]: find the local maximum of:
‒ the Harris corner detector in space (image coordinates)
‒ the Laplacian in scale
• SIFT (D. Lowe) [2]: find the local maximum of:
‒ the Difference of Gaussians in space and scale
[1] K. Mikolajczyk, C. Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001
[2] D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. IJCV 2004

DoG (SIFT) keypoint Detector

• DoG at multiple octaves
• Extrema detection in scale space
• Keypoint location
‒ Interpolation
‒ Removing unstable points
• Orientation assignment
DoG (SIFT) Detector

• DoG at multiple octaves:

D(\sigma) = (G(k\sigma) - G(\sigma)) * I
D(k\sigma) = (G(k^2\sigma) - G(k\sigma)) * I
D(k^2\sigma) = (G(k^3\sigma) - G(k^2\sigma)) * I
\ldots

• Scale-space extrema: choose all extrema within a 3x3x3 neighborhood
‒ X is selected if it is larger or smaller than all 26 neighbors (its 8 neighbors in the current image and 9 neighbors each in the scales above and below)
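A minimal sketch of the 3x3x3 extrema test, assuming a dog_stack array of shape (n_scales, height, width) built, for instance, from the DoG function above (ties on plateaus are accepted here, whereas a strict implementation would reject them):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def scale_space_extrema(dog_stack):
    """3x3x3 extrema in a DoG stack of shape (n_scales, height, width)."""
    fmax = maximum_filter(dog_stack, size=(3, 3, 3))
    fmin = minimum_filter(dog_stack, size=(3, 3, 3))
    is_extremum = (dog_stack == fmax) | (dog_stack == fmin)
    is_extremum[0] = is_extremum[-1] = False  # need a scale above and below
    return np.argwhere(is_extremum)           # rows of (scale index, y, x)
```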

DoG (SIFT) Detector

• Orientation assignment:
‒ Create a histogram of local gradient directions at the selected scale
‒ Assign the canonical orientation at the peak of the smoothed histogram
• Each key specifies stable 2D coordinates (x, y, scale, orientation)
• If there are 2 major orientations, use both.

Example of keypoint detection

(a) 233x189 image
(b) 832 DoG extrema
(c) 729 left after peak value threshold
(d) 536 left after testing the ratio of principal curvatures (removing edge responses)
DoG (SIFT) Detector

• A SIFT keypoint: {x, y, scale, dominant orientation}
Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004

Scale Invariant Detectors

• Experimental evaluation of detectors w.r.t. scale change
• Repeatability rate: # correspondences / # possible correspondences
K. Mikolajczyk, C. Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001
Slide credit: CS131 - Juan Carlos Niebles and Ranjay Krishna

Many existing detectors available

• Hessian & Harris [Beaudet ‘78], [Harris ‘88]
• Laplacian, DoG [Lindeberg ‘98], [Lowe ‘99]
• Harris-/Hessian-Laplace [Mikolajczyk & Schmid ‘01]
• Harris-/Hessian-Affine [Mikolajczyk & Schmid ‘04]
• EBR and IBR [Tuytelaars & Van Gool ‘04]
• MSER [Matas ‘02]
• Salient Regions [Kadir & Brady ‘01]
• Others...
• These detectors have become a basic building block for many recent applications in Computer Vision.
Slide credit: Bastian Leibe

Feature extraction
• Global features
• Local features
‒ Interest point detector
‒ Local descriptor
• Matching
Local Descriptor

• A compact, good representation of local information
• Invariant to:
‒ Geometric transformations: rotation, translation, scaling, ...
‒ Camera view changes
‒ Illumination
• Examples:
‒ SIFT, SURF (Speeded Up Robust Features), PCA-SIFT, ...
‒ LBP, BRISK, MSER and FREAK, ...

Invariant local features

• Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters
Following slides credit: CVPR 2003 Tutorial on Recognition and Matching Based on Local Invariant Features, David Lowe

Advantages of invariant local features

• Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
• Distinctiveness: individual features can be matched to a large database of objects
• Quantity: many features can be generated for even small objects
• Efficiency: close to real-time performance
• Extensibility: can easily be extended to a wide range of differing feature types, each adding robustness

Becoming rotation invariant

• We are given a keypoint and its scale from DoG
• We will select a characteristic orientation for the keypoint (based on the most prominent gradient there)
• We will describe all features relative to this orientation
SIFT descriptor formation

• Use the blurred image associated with the keypoint’s scale
• Take image gradients over the keypoint neighborhood
• To become rotation invariant, rotate the gradient directions AND locations by (-keypoint orientation)
‒ Now we’ve cancelled out rotation and have gradients expressed at locations relative to the keypoint orientation θ
‒ We could also have just rotated the whole image by -θ, but that would be slower

• Using precise gradient locations is fragile. We’d like to allow some “slop” in the image and still produce a very similar descriptor
• Using a Gaussian filter avoids sudden changes in the descriptor with small changes in the position of the window, and gives less emphasis to gradients that are far from the center of the descriptor, as these are most affected by misregistration errors
• Create an array of orientation histograms (a 4x4 array is shown)

Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf
Image: Ashish A Gupta, PhD thesis 2013

SIFT descriptor formation

• Put the rotated gradients into their local orientation histograms
‒ A local orientation histogram has n bins (e.g. 8, as shown)
‒ To avoid boundary effects, in which the descriptor abruptly changes as a sample shifts smoothly from one histogram to another or from one orientation to another, interpolation is used to distribute the value of each gradient sample into adjacent histogram bins
‒ Also, gradient contributions are scaled down for gradients far from the bin center, with a weight of 1-d, where d is the distance of the sample from the central value of the bin
• The SIFT authors found that the best results were obtained with 8 orientation bins per histogram (an experimentally tuned value) and a 4x4 histogram array → a SIFT descriptor is a vector of 128 values

SIFT descriptor formation

• Adding robustness to illumination changes:
‒ The descriptor is made of gradients (differences between pixels), so it is already invariant to changes in brightness (e.g. adding 10 to all image pixels yields the exact same descriptor)
‒ A higher-contrast photo will increase the magnitude of gradients; to correct for contrast changes, normalize the vector (scale it to length 1.0)
‒ Very large image gradients are usually from unreliable 3D illumination effects (glare, etc.); to reduce their effect, clamp all values in the vector to be ≤ 0.2, then normalize the vector again
→ The result is a vector which is fairly invariant to illumination changes

Image: Ashish A Gupta, PhD thesis 2013
SIFT: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
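In practice the whole detector-plus-descriptor pipeline is available off the shelf; a minimal OpenCV sketch (the file names are placeholders, and cv2.SIFT_create requires opencv-python 4.4 or later):

```python
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file name

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each keypoint carries (x, y, scale, dominant orientation);
# each descriptor is the 128-dimensional vector described above
print(len(keypoints), descriptors.shape)   # e.g. N keypoints, (N, 128)
out = cv2.drawKeypoints(img, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("sift_keypoints.jpg", out)
```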
SIFT

• An extraordinarily robust matching technique
‒ Can handle changes in viewpoint: up to about 60 degrees of out-of-plane rotation
‒ Can handle significant changes in illumination (sometimes even day vs. night)
‒ Fast and efficient: can run in real time
Steve Seitz

Sensitivity to number of histogram orientations

Feature stability to noise

• Match features after a random change in image scale & orientation, with differing levels of image noise
• Find the nearest neighbor in a database of 30,000 features

Feature stability to affine change

• Match features after a random change in image scale & orientation, with 2% image noise and affine distortion
• Find the nearest neighbor in a database of 30,000 features
David G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, 60, 2 (2004), pp. 91-110
SIFT Keypoint Descriptor: summary

• Blur the image using the scale of the keypoint (scale invariance)
• Compute gradients with respect to the keypoint orientation (rotation invariance)
• Compute the orientation histogram in 8 directions over 4x4 sample regions
Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf

Distinctiveness of features

• Vary the size of the database of features, with 30 degree affine change, 2% image noise
• Measure % correct for a single nearest-neighbor match

Other detectors and descriptors

• Popular features: SURF, HOG, SIFT
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf
• Summary of some local features:
http://www.cse.iitm.ac.in/~vplab/courses/CV_DIP/PDF/Feature_Detectors_and_Descriptors.pdf

Feature extraction
• Global features
• Local features
‒ Interest point detector
‒ Local descriptor
• Matching
Feature matching

Given a feature in I1, how to find the best match in I2?
1. Define a distance function that compares two descriptors
• Use the L1, L2, cosine, Mahalanobis, ... distance
2. Test all the features in I2, and find the one with the minimum distance

In OpenCV:
- Brute-force matching
- FLANN matching: Fast Library for Approximate Nearest Neighbors [Muja and Lowe, 2009]

• How to define the difference between two features f1, f2?
‒ Simple approach: use only the distance value d(f1, f2)
• → can give a good score to very ambiguous matches
‒ Better approaches: add additional constraints
• Ratio of distances
• Spatial constraints between neighborhood pixels
• Fitting the transformation, then refining the matches (RANSAC)

Marius Muja and David G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP (1), pages 331–340, 2009

Feature matching

• Simple approach: use the distance value d(f1, f2)
‒ → can give a good score to very ambiguous matches
• Better approach: ratio of distances = d(f1, f2) / d(f1, f2')
‒ f2 is the best match to f1 in I2
‒ f2' is the 2nd-best match to f1 in I2
‒ An ambiguous/bad match will have a ratio close to 1
‒ Look for unique matches, which have a low ratio
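A minimal OpenCV sketch of brute-force matching with the ratio test (the file names are placeholders; 0.75 is a commonly used ratio threshold):

```python
import cv2

img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matcher; knnMatch with k=2 returns the best and 2nd-best match
bf = cv2.BFMatcher(cv2.NORM_L2)
pairs = bf.knnMatch(des1, des2, k=2)

# Ratio test: keep a match only if it is clearly better than the runner-up
good = [m for m, n in pairs if m.distance < 0.75 * n.distance]
print(len(good), "unambiguous matches")
```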
Ratio of distances reliable for matching

David G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, 60, 2 (2004), pp. 91-110

Feature matching

• Better approach: spatial constraints between neighborhood pixels
Source: from slides of Valérie Gouet-Brunet
Feature matching

• Better approach: fitting the transformation (RANSAC algorithm)
‒ Fitting a 2D transformation matrix
• Six variables
• Each pair of points gives two equations → at least three pairs of points are needed
• Least squares
‒ RANSAC: refinement of the matches

Evaluating the results

• How can we measure the performance of a feature matcher?
‒ Compute the error for the best feature distance (example distances: 50, 75, 200)
True/false positives

• The distance threshold affects performance (in the example: the distance-50 pair is a true match, the distance-200 pair is a false match)
‒ True positives = # of detected matches that are correct
• Suppose we want to maximize these: how to choose the threshold?
‒ False positives = # of detected matches that are incorrect
• Suppose we want to minimize these: how to choose the threshold?

Image matching

• How to define the distance between 2 images I1, I2?
‒ Using global features: easy, d(I1, I2) = d(feature of I1, feature of I2)
‒ Using local features:
• Voting strategy
• Solving an optimization problem (time consuming)
• Building a "global" feature from the local features: BoW (bag-of-words, bag-of-features), VLAD, ...

Voting strategy

• The similarity between 2 images is based on the number of matches (e.g. between a region selected in a query image and the images in a database)
Source: Modified from slides of Valérie Gouet-Brunet

Optimization problem

• Transportation problem:
‒ I1: {(r_i, w_i), i = 1..N} (the provider)
‒ I2: {(r'_j, w_j), j = 1..M} (the consumer)
• d(I1, I2) = ?

d(I_1, I_2) = \min \sum_i \sum_j f_{ij} \, d(r_i, r'_j)

subject to

f_{ij} \ge 0, \quad \sum_i f_{ij} \le w_j, \quad \sum_j f_{ij} \le w_i, \quad \sum_i \sum_j f_{ij} = \min(\sum_i w_i, \sum_j w_j)

• The distance is the optimal-flow cost normalized by the total flow f*:

d_{EMD}(I_1, I_2) = \frac{\sum_i \sum_j f^*_{ij} \, d(r_i, r'_j)}{\sum_i \sum_j f^*_{ij}}

http://vellum.cz/~mikc/oss-projects/CarRecognition/doc/dp/node29.html
Bag-of-words / Visual Vocabulary

• A local feature ~ a word
• An image ~ a document
• Apply a technique from textual document representation: the vector model

1. Extract local features from a set of images
2. Build the visual vocabulary (dictionary) using a clustering method
3. An image is represented by a bag of words
→ it can be represented by a tf.idf vector

Bag of words: outline

1. Extract features
2. Learn the “visual vocabulary”
3. Quantize features using the visual vocabulary
4. Represent images by their frequencies of “visual words”
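A minimal sketch of steps 1-4 with SIFT descriptors and k-means clustering (the vocabulary size k and the use of scikit-learn are illustrative choices):

```python
import numpy as np
import cv2
from sklearn.cluster import KMeans

def build_vocabulary(images, k=100):
    """Steps 1-2: extract SIFT descriptors and cluster them into k visual words."""
    sift = cv2.SIFT_create()
    descs = [sift.detectAndCompute(img, None)[1] for img in images]
    stacked = np.vstack([d for d in descs if d is not None])
    return KMeans(n_clusters=k, n_init=10).fit(stacked)

def bow_vector(image, vocabulary):
    """Steps 3-4: quantize each descriptor to its nearest word, count frequencies."""
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(image, None)
    words = vocabulary.predict(desc)                 # nearest visual word per descriptor
    hist = np.bincount(words, minlength=vocabulary.n_clusters)
    return hist / hist.sum()                         # normalized word frequencies
```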
Applications
Object detection/recognition/search

• Sivic and Zisserman, 2003
• Lowe 2002
• Rothganger et al. 2003

Object detection/recognition

David Lowe, Distinctive Image Features from Scale-Invariant Keypoints, IJCV 2004

Application: Image Panoramas

• Procedure:
‒ Detect feature points in both images
‒ Find corresponding pairs
‒ Use these pairs to align the images
Slide credit: Darya Frolova, Denis Simakov
Automatic mosaicing

http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html

Wide baseline stereo

[Image from T. Tuytelaars ECCV 2006 tutorial]

CBIR (Content-based image retrieval) / CBIR: partial retrieval

Source: http://www-rocq.inria.fr/imedia
CBIR: BoW with SIFT + histogram

• Offline: from the images stored in the DB (20 images/group), extract the SIFT feature set, build the visual dictionary, then compute the weight vector for each image → a database describing the SIFT image features as weight vectors (~60 to ~80 images/group)
• Test: 5 images/group are used as queries
Source: Graduation thesis – Phạm Xuân Trường K52 - BK

CBIR: BoW with SIFT + histogram

Source: Graduation thesis – Phạm Xuân Trường K52 - BK

References

• Lectures 5, 6: CS131 - Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab
• Vision par Ordinateur, Alain Boucher, IFI


Thank you for your attention!