Image Sub-Sampling and Pyramids

Image Sub-Sampling and
Pyramids
Today
► Sub-Sampling
► Image Pyramids
► Applications of Pyramids
Computer Vision Lecture # 2 2

Image Scaling
This image is too big to fit on the
screen. How can we generate a
half-sized version?
Source: S. Seitz
Image sub-sampling
1/8
1/4
Throw away every other row and

column to create a 1/2 size image
- called image sub-sampling
Source: S. Seitz
Image sub-sampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Why does this look so crufty?

Source: S. Seitz
Image sub-sampling
Source: F. Durand
Wagon-wheel effect
(See http://www.michaelbach.de/ot/mot_wagonWheel/index.html) Source: L. Zhang

Aliasing
► Occurs when your sampling rate is not high enough to

capture the amount of detail in your image
► Can give you the wrong signal/image—an alias
► To do sampling right, need to understand the structure

of your signal/image
► To avoid aliasing:
 sampling rate ≥ 2 * max frequency in the image
► said another way: ≥ two samples per cycle
 This minimum sampling rate is called the Nyquist rate Source: L. Zhang
Good and Bad Sampling
Good sampling:
•Sample often or,
•Sample wisely
Bad sampling:
•Aliasing!
Even worse for synthetic images
Aliasing
► When downsampling by a factor of two
 Original image has frequencies that are too high
► How can we fix this?

Smoothing as low-pass filtering
► High frequencies ► Common solution:

lead to trouble with use a Gaussian
sampling.  multiplying FT by
► Suppress high Gaussian is
equivalent to
frequencies before convolving image
sampling ! with Gaussian.
 truncate high
frequencies in FT
 or convolve with a
low-pass filter
Gaussian pre-filtering
G 1/8
G 1/4
Gaussian 1/2
• Solution: filter the image, then subsample

Source: S. Seitz
Subsampling with Gaussian pre-filtering
Gaussian 1/2 G 1/4 G 1/8
• Solution: filter the image, then subsample

Source: S. Seitz
Compare with...
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Source: S. Seitz
Upsampling
► This image is too small for this screen:
► How can we make it 10 times as big?
► Simplest approach:
repeat each row
and column 10 times
► (“Nearest neighbor
interpolation”)
Image interpolation
d = 1 in this
example
1 2 3 4 5
Recall how a digital image is formed
• It is a discrete point-sampling of a continuous function

• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale
Adapted from: S. Seitz

Image interpolation
d = 1 in this
example
1 2 3 4 5
Recall how a digital image is formed
• It is a discrete point-sampling of a continuous function

• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale

Image interpolation
1 d = 1 in this
example
1 2 2.5 3 4 5
• What if we don’t know ?

• Guess an approximation:
• Can be done in a principled way: filtering
• Convert to a continuous function:
• Reconstruct by convolution with a reconstruction filter, h

Image interpolation
Original image: x 10
Nearest-neighbor interpolation Bilinear interpolation Bicubic interpolation

Image interpolation
Also used for resampling
Gaussian
pre-filtering
• Solution: filter
the image, then
subsample
F00 F11 FF22
blur subsample blur subsample …

F00* H F11* H
{
Gaussian
pyramid
blur
F00
subsample blur
F11
subsample
FF22
…
F00* H F11* H
Gaussian pyramids
[Burt and Adelson, 1983]
Gaussian Pyramids have all sorts of applications in computer vision
Source: S. Seitz
Gaussian pyramids
[Burt and Adelson, 1983]
• How much space does a Gaussian pyramid take compared to the

original image?
Source: S. Seitz
Multi-Resolution Image Representation
• Fourier domain tells us “what” (frequencies, sharpness,

texture properties), but not “where”.
• Spatial domain tells us “where” (pixel location) but not “what”.
• We want a image representation that gives a local description

of image “events” – what is happening where.
• Naturally, think about representing images across varying scales.

Figure from David Forsyth
Multi-resolution Image Pyramids
Low resolution
High resolution
Space Required for Pyramids
Constructing a pyramid by
taking every second pixel
leads to layers that badly
misrepresent the top layer
Decimation
Expansion
Interpolation Results
The Gaussian Pyramid
► Smooth with Gaussians because

 a Gaussian*Gaussian=another Gaussian
► Synthesis
 smooth and downsample
► Gaussians are low pass filters, so repetition is
redundant
► Kernel width doubles with each level
Gaussian pyramids used for
►up- or downsampling images.
►Multi-resolution image analysis
 Look for an object over various spatial scales
 Coarse-to-fine image processing: form blur
estimate or the motion analysis on very low-
resolution image, upsample and repeat. Often a
successful strategy for avoiding local minima in
complicated estimation tasks.
62
37
The Gaussian Pyramid
Low resolution G4  (G3 * gaussian)  2
blur down-sample
G3  (G2 * gaussian)  2
down-sam
blur ple
dow
n-sa
mple
blur
do
wn
-s
am
G0  Image
ple
blur
High resolution
Pyramids at Same Resolution
The Laplacian Pyramid
Li  Gi  expand(Gi 1 )
Gaussian Pyramid Gi  Li  expand(Gi 1 ) Laplacian Pyramid
Gn Ln  Gn
expa
nd
G2 exp
and- = L2
G1 L1
ex
pa -
nd =
G0 L0
- =
Applications of Image Pyramids
► Coarse-to-Fine strategies for computational efficiency.

► Search for correspondence
 look at coarse scales, then refine with finer scales
► Edge tracking
 a “good” edge at a fine scale has parents at a coarser scale
► Control of detail and computational cost in matching
 e.g. finding stripes
 very important in texture representation
► Image Blending
► Data compression (laplacian pyramid)
Fast Template Matching

Multi-Resolution Correlation
Image blending
53
Image Blending
Pyramid blending of Regions
Horror Photo
© prof. dmartin
Facial Feature Blending

Facial Feature Blending

Image Fusion
Multi-scale Transform (MST) = Obtain Pyramid from Image

Inverse Multi-scale Transform (IMST) = Obtain Image from Pyramid
Multi-Sensor Fusion
Matlab resources for pyramids (with tutorial)
http://www.cns.nyu.edu/~eero/software.html
61
Matlab resources for pyramids (with tutorial)
http://www.cns.nyu.edu/~eero/software.html
62
Why use these representations?
► Handle real-world size variations with a
constant-size vision algorithm.
► Remove noise
► Analyze texture
► Recognize objects
► Label image features
63
63
Acknowledgements
►I have planned and made this course material
from the courses taught by following
computer vision experts.
 Steve Seitz, Washington University
 Richard Szeliski, Microsoft Research
 Noah Snavely, Cornell University
 Sirinavasa Narasimhan, Carnegie Mellon University
 Antonio Torralba, MIT
 Michael Black, Brown University

Image Sub-Sampling and Pyramids

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Image Sub-Sampling and Pyramids

Uploaded by

Copyright:

Available Formats

Image Sub-Sampling and

Computer Vision Lecture # 2 2

Throw away every other row and

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Why does this look so crufty?

(See http://www.michaelbach.de/ot/mot_wagonWheel/index.html) Source: L. Zhang

► Occurs when your sampling rate is not high enough to

► To do sampling right, need to understand the structure

► How can we fix this?

► High frequencies ► Common solution:

• Solution: filter the image, then subsample

Gaussian 1/2 G 1/4 G 1/8

• Solution: filter the image, then subsample

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function

Adapted from: S. Seitz

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function

Adapted from: S. Seitz

• What if we don’t know ?

• Reconstruct by convolution with a reconstruction filter, h

Adapted from: S. Seitz

Nearest-neighbor interpolation Bilinear interpolation Bicubic interpolation

F00 F11 FF22

blur subsample blur subsample …

Gaussian Pyramids have all sorts of applications in computer vision

• How much space does a Gaussian pyramid take compared to the

• Fourier domain tells us “what” (frequencies, sharpness,

• Spatial domain tells us “where” (pixel location) but not “what”.

• We want a image representation that gives a local description

• Naturally, think about representing images across varying scales.

► Smooth with Gaussians because

► Coarse-to-Fine strategies for computational efficiency.

Computer Vision Lecture # 2 45

Computer Vision Lecture # 2 57

Computer Vision Lecture # 2 58

Multi-scale Transform (MST) = Obtain Pyramid from Image

Computer Vision Lecture # 2 64

You might also like