Interest Point Detection 1

CNG 483
Introduction to Computer Vision
Interest Point Detection
Asst. Prof. Dr. Meryem Erbilek
Slide partially based on Stanford U. CS131
METU Lecture 6 - 1 CENG 483

What we will learn today?
• Local invariant features
– Motivation
– Requirements, invariances
• Keypoint localization
– Harris corner detector
Some background reading:

Richard Szeliski, Computer Vision: Algorithms and Applications, Chp.4.1
David Lowe, IJCV 2004

Image matching:
a challenging problem

Image matching:
a challenging problem
Slide credit: Steve Seitz

by Diva Sian
by swashford

Harder Case

by Diva Sian by scgbt

Harder Still?

NASA Mars Rover images

Answer Below (Look for tiny colored squares)

NASA Mars Rover images with SIFT feature matches
(Figure by Noah Snavely)

Motivation for using local features
• Global representations have major limitations
• Instead, describe and match only local regions
• Increased robustness to
– Occlusions
– Articulation
– Intra-category variations

General Approach
1. Find a set of
distinctive key-
points
A1 2. Define a region
around each
keypoint
A2 A3
3. Extract and
normalize the
region content
Slide credit: Bastian Leibe

Similarity
measure 4. Compute a local
descriptor from the
N pixels
normalized region
e.g. color e.g. color
N pixels 5. Match local

descriptors

Common Requirements
• Problem 1:
– Detect the same point independently in both images
Slide credit: Darya Frolova, Denis Simakov

No chance to match!
We need a repeatable detector!

Common Requirements
• Problem 1:
– Detect the same point independently in both images
• Problem 2:
– For each point correctly recognize the corresponding one
Slide credit: Darya Frolova, Denis Simakov

?
We need a reliable and distinctive descriptor!

Requirements
• Region extraction needs to be repeatable and accurate
– Invariant to translation, rotation, scale changes
– Robust or covariant to out-of-plane (≈affine) transformations
– Robust to lighting variations, noise, blur, quantization
• Locality: Features are local, therefore robust to occlusion

and clutter.
• Distinctiveness: The regions should contain “interesting”
structure.

• Efficiency: Close to real-time performance.
• Quantity: We need a sufficient number of regions to cover the
object.

Requirements
Slide credit: Silvio Savarese


Requirements
Slide credit: Silvio Savarese


Many Existing Detectors Available
• Hessian & Harris [Beaudet ‘78], [Harris ‘88]
• Laplacian, DoG [Lindeberg ‘98], [Lowe ‘99]
• Harris-/Hessian-Laplace [Mikolajczyk & Schmid ‘01]
• Harris-/Hessian-Affine [Mikolajczyk & Schmid ‘04]
• EBR and IBR [Tuytelaars & Van Gool ‘04]
• MSER [Matas ‘02]
• Salient Regions [Kadir & Brady ‘01]
• Others…

• Those detectors have become a basic building block for
many pre-deep learning approaches in Computer Vision.
• They are still very relevant & powerful tools

What we will learn today?
– Motivation

Keypoint Localization
• Goals:

– Repeatable detection
– Precise localization
– Interesting content
⇒ Look for two-dimensional signal changes

Finding Corners
• Key property:
– In the region around a corner, image gradient has two or
Slide credit: Svetlana Lazebnik

more dominant directions
• Corners are repeatable and distinctive
C.Harris and M.Stephens. "A Combined Corner and Edge Detector.“

Proceedings of the 4th Alvey Vision Conference, 1988.

Corners as Distinctive Interest Points
• Design criteria
– We should easily recognize the point by looking through a
small window (locality)
– Shifting the window in any direction should give a large
change in intensity (good localization)
Slide credit: Alyosha Efros

“flat” region: “edge”: “corner”:
no change in all no change along significant change
directions the edge direction in all directions

Corners versus edges
Large
Corner
Large
Small
Edge
Large
Small
Nothing
Small

Corners versus edges
??
Corner
??

Harris Detector Formulation
• Consider shifting the window W by (u,v):
– How do the pixels in W change?
– Auto-correlation function measures
the self similarity of a signal and is W
related to the sum-of-squared
difference.
– Compare each pixel before and
after the shift by summing up the
squared differences (SSD).
– This defines an SSD “error” of E(u,v):
SSD for (u,v) direction
ofSlide
a single
partially region
based on Stanford U. CS131 Slide partially from Selim Aksoy
• Change of intensity for the shift [u,v]:
Window Shifted Intensity

function intensity
Slide credit: Rick Szeliski

Window function w(x,y) = o
r
1 in window, 0 outside Gaussian

• Taylor Series expansion of I:
• If the motion (u,v) is assumed to be small, then first

order approximation is good.
• Plugging this into the formula on the previous slide...

Slide partially based on Stanford U. CS131 Slide partially from Selim Aksoy
• Sum-of-squared differences error E(u,v):
• This can be rewritten:
• For the example above:

– You can move the center of the green window to anywhere on the blue
unit circle.
– Which directions will result in the largest and smallest E values?
• This measure of change can be further simplified as:
where H is a 2×2 matrix computed from image derivatives:
Gradient with
respect to x,
times gradient
Sum over image region – the area we are

with respect to y
checking for corner

Image I Ix Iy IxIy

• E(u,v) is an equation of an ellipse, where M is the covariance

matrix.
• Let 1 and 2 be eigenvalues of M.
• Reminder:

• Interpreting the Eigenvalues
• Classification of image points using eigenvalues of M:
λ2 “Edge”
λ2 >> λ1
“Corner”
λ1 and λ2 are large, λ1 ~ λ2;
E increases in all directions
Slide credit: Kristen Grauman
λ1 and λ2 are small;

E is almost constant in “Flat”
all directions “Edge”
region λ1 >> λ2
λ1

• To measure the corner strength:
R = det(H) – k(trace(H))2
where
trace(H) = λ1 + λ2
det(H) = λ1 x λ2
(λ1 and λ2 are the eigenvalues of H).
• R is positive for corners, negative in edge regions, and small in
flat regions.
METU Lecture 6 -
Corner Response Function
λ2 “Edge”
θ<0
“Corner”
θ>0
• Fast approximation
– Avoid computing the
eigenvalues
– α: constant
(0.04 to 0.06) “Flat” “Edge”
region θ<0
λ1

Window Function w(x,y)
• Option 1: uniform window

– Sum over square window
– Problem: not rotation invariant 1 in window, 0 outside
• Option 2: Smooth with Gaussian

– Gaussian already performs weighted sum
– Result is rotation invariant Gaussian

Summary: Harris Detector [Harris88]
• Compute second moment matrix
(autocorrelation matrix)
Ix Iy
1. Image
derivatives
2. Square of Ix2 Iy2 IxIy

derivatives
Slide credit: Krystian Mikolajczyk
3. Gaussian
filter g(σI)
g(Ix2) g(Iy2) g(IxIy)
4. Cornerness function – two strong eigenvalues
5. Perform non-maximum suppression

Slide partially based on Stanford U. CS131 R
Algorithm: Harris Detector
METU Lecture 6 - 35
Harris detector example
METU Lecture 6 -
Corner responses R
R values (red high, blue low)

METU Lecture 6 -
Thresholding
Threshold (R > value)

METU Lecture 6 -
Non maximum suppression
Local maxima of R
METU Lecture 6 - 39
Harris features (red)

METU Lecture 6 - 40
Harris Detector – Responses [Harris88]

Effect: A very precise
corner detector.




• Results are well suited for finding stereo correspondences

Harris Detector: Properties
• Translation invariance?


• Translation invariance
• Rotation invariance?

Ellipse rotates but its shape (i.e.
eigenvalues) remains the same
Corner response θ is invariant to image rotation


• Translation invariance
• Rotation invariance
• Scale invariance?

Corner All points will be
classified as edges!
Not invariant to image scale!

What we learned so far?
– Motivation

Interest Point Detection 1

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Interest Point Detection 1

Uploaded by

Copyright:

Available Formats

CNG 483

Introduction to Computer Vision

Interest Point Detection

Asst. Prof. Dr. Meryem Erbilek

Slide partially based on Stanford U. CS131

METU Lecture 6 - 1 CENG 483

Some background reading:

Slide partially based on Stanford U. CS131

METU Lecture 6 - 2 CENG 483

Slide partially based on Stanford U. CS131

METU Lecture 6 - 3 CENG 483

Slide credit: Steve Seitz

METU Lecture 6 - 4 CENG 483

Slide credit: Steve Seitz

Slide partially based on Stanford U. CS131

METU Lecture 6 - 5 CENG 483

Slide credit: Steve Seitz

METU Lecture 6 - 6 CENG 483

Slide credit: Steve Seitz

METU Lecture 6 - 7 CENG 483

Slide partially based on Stanford U. CS131

METU Lecture 6 - 8 CENG 483

Slide credit: Bastian Leibe

N pixels 5. Match local

METU Lecture 6 - 9 CENG 483

Slide credit: Darya Frolova, Denis Simakov

We need a repeatable detector!

Slide partially based on Stanford U. CS131

METU Lecture 6 - 10 CENG 483

Slide credit: Darya Frolova, Denis Simakov

We need a reliable and distinctive descriptor!

Slide partially based on Stanford U. CS131

METU Lecture 6 - 11 CENG 483

• Locality: Features are local, therefore robust to occlusion

Slide credit: Bastian Leibe

Slide partially based on Stanford U. CS131

METU Lecture 6 - 12 CENG 483

Slide credit: Silvio Savarese

METU Lecture 6 - 13 CENG 483

Slide credit: Silvio Savarese

METU Lecture 6 - 14 CENG 483

Slide credit: Bastian Leibe

Slide partially based on Stanford U. CS131

METU Lecture 6 - 15 CENG 483

Slide partially based on Stanford U. CS131

METU Lecture 6 - 16 CENG 483

Slide credit: Bastian Leibe

Slide partially based on Stanford U. CS131

METU Lecture 6 - 17 CENG 483

Slide credit: Svetlana Lazebnik

C.Harris and M.Stephens. "A Combined Corner and Edge Detector.“

METU Lecture 6 - 18 CENG 483

Slide credit: Alyosha Efros

METU Lecture 6 - 19 CENG 483

Slide partially based on Stanford U. CS131

METU Lecture 6 - 20 CENG 483

Slide partially based on Stanford U. CS131

METU Lecture 6 - 21 CENG 483

Window Shifted Intensity

Slide credit: Rick Szeliski

METU Lecture 6 - 23 CENG 483

• If the motion (u,v) is assumed to be small, then first

• Plugging this into the formula on the previous slide...