
Image Sub-Sampling and Pyramids
Today
► Sub-Sampling
► Image Pyramids
► Applications of Pyramids

Computer Vision Lecture # 2 2


Image Scaling
This image is too big to fit on the
screen. How can we generate a
half-sized version?

Source: S. Seitz
Image sub-sampling

1/8

1/4

Throw away every other row and column to create a 1/2-size image.
This is called image sub-sampling.
Source: S. Seitz
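The row-and-column dropping described above can be sketched with NumPy slicing. This is a minimal illustration, not a recommended resizing method: the function name `subsample` and the 8x8 test array are assumptions made for the example, and no filtering is applied before samples are discarded.

```python
import numpy as np

def subsample(image, step=2):
    """Keep every `step`-th row and column (no pre-filtering)."""
    return image[::step, ::step]

# Hypothetical 8x8 test image holding the values 0..63.
image = np.arange(64).reshape(8, 8)
half = subsample(image)       # 4x4: rows and columns 0, 2, 4, 6
quarter = subsample(half)     # 2x2: rows and columns 0, 4 of the original
```

Applying `subsample` repeatedly yields the 1/4 and 1/8 versions shown on the slide.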
Image sub-sampling

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Why does this look so crufty?


Source: S. Seitz
Image sub-sampling

Source: F. Durand
Wagon-wheel effect

(See http://www.michaelbach.de/ot/mot_wagonWheel/index.html) Source: L. Zhang


Aliasing

► Occurs when your sampling rate is not high enough to capture the amount of detail in your image
► Can give you the wrong signal/image: an alias
► To do sampling right, you need to understand the structure of your signal/image
► To avoid aliasing:
 sampling rate ≥ 2 * max frequency in the image
 said another way: ≥ two samples per cycle
 This minimum sampling rate is called the Nyquist rate
Source: L. Zhang
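A small 1-D sketch may make the "alias" concrete. The sampling rate of 10 samples per second and the two frequencies are assumed values chosen for illustration: a 9 Hz cosine lies above half the sampling rate, and its samples coincide with those of a 1 Hz cosine.

```python
import numpy as np

fs = 10.0                     # sampling rate in samples per second (assumed)
t = np.arange(20) / fs        # sample times covering 2 seconds

high = np.cos(2 * np.pi * 9.0 * t)    # 9 Hz: above fs / 2 = 5 Hz
alias = np.cos(2 * np.pi * 1.0 * t)   # 1 Hz: below fs / 2

# After sampling, the two signals cannot be told apart.
indistinguishable = np.allclose(high, alias)
```

Sampling the 9 Hz signal without aliasing would need a rate of at least 18 samples per second under the rule above.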
Good and Bad Sampling

Good sampling:
• Sample often, or
• Sample wisely

Bad sampling:
• Aliasing!
Even worse for synthetic images
Aliasing
► When downsampling by a factor of two
 Original image has frequencies that are too high

► How can we fix this?


Smoothing as low-pass filtering

► High frequencies lead to trouble with sampling.
► Suppress high frequencies before sampling!
 truncate high frequencies in FT
 or convolve with a low-pass filter
► Common solution: use a Gaussian
 multiplying FT by Gaussian is equivalent to convolving image with Gaussian.
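The equivalence between multiplying Fourier transforms and convolving can be checked numerically. The sketch below works on a single 1-D row with circular (wrap-around) boundaries, which is what the discrete Fourier transform assumes. The signal length, the random test row, and sigma = 2 are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(64)          # stand-in for one image row
n = signal.size

# Periodic Gaussian kernel centred on index 0 (sigma = 2 samples, assumed).
x = np.minimum(np.arange(n), n - np.arange(n))   # circular distance to index 0
kernel = np.exp(-x**2 / (2 * 2.0**2))
kernel /= kernel.sum()

# Route 1: circular convolution computed directly in the spatial domain.
spatial = np.array([
    sum(signal[(i - j) % n] * kernel[j] for j in range(n))
    for i in range(n)
])

# Route 2: multiply the two Fourier transforms, then transform back.
spectral = np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)).real
```

Both routes give the same smoothed row up to floating-point rounding.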
Gaussian pre-filtering

G 1/8

G 1/4

Gaussian 1/2

• Solution: filter the image, then subsample


Source: S. Seitz
Subsampling with Gaussian pre-filtering

Gaussian 1/2 G 1/4 G 1/8

• Solution: filter the image, then subsample


Source: S. Seitz
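"Filter the image, then subsample" can be sketched with NumPy alone. The helper names, the 3-sigma kernel truncation, the edge-replication border handling, and the default sigma = 1 are all assumptions made for this example. The checkerboard test image is a worst case: it contains the highest frequency a pixel grid can hold.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    r = k.size // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def downsample(image, sigma=1.0):
    """Blur first, then keep every other row and column."""
    return blur(image, sigma)[::2, ::2]

# Checkerboard of 0s and 1s: pixel (i, j) has value (i + j) % 2.
checker = np.indices((16, 16)).sum(axis=0) % 2
naive = checker[::2, ::2]          # every kept pixel is 0: a false flat image
smooth = downsample(checker)       # interior values close to the mean, 0.5
```

Without the blur, the half-size image is uniformly black even though half of the original pixels are white. With the blur, the result away from the borders is close to the average gray level.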
Compare with...

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Source: S. Seitz
Upsampling
► This image is too small for this screen:
► How can we make it 10 times as big?
► Simplest approach: repeat each row and column 10 times
► (“Nearest neighbor interpolation”)
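Repeating rows and columns maps directly onto `np.repeat`. The function name `upsample_nearest` and the 2x2 input are assumptions for illustration.

```python
import numpy as np

def upsample_nearest(image, factor):
    """Repeat every row and every column `factor` times."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

small = np.array([[1, 2],
                  [3, 4]])
big = upsample_nearest(small, 10)   # shape (20, 20), four constant blocks
```

Each input pixel becomes a 10x10 block of identical values, which is why nearest-neighbor results look blocky.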
Image interpolation

d = 1 in this
example

1 2 3 4 5

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function


• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale

Adapted from: S. Seitz


Image interpolation

d = 1 in this example

1 2 2.5 3 4 5

• What if we don’t know the original function?
• Guess an approximation:
• Can be done in a principled way: filtering
• Convert to a continuous function:
• Reconstruct by convolution with a reconstruction filter, h

Adapted from: S. Seitz
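As one possible instance of reconstruction by convolution, the sketch below performs 1-D linear interpolation by spreading the samples apart with zeros and convolving with a tent (triangle) filter. The tent is only one choice of h. The function name, the sample values, and the upsampling factor are assumptions for the example, and at least three samples are assumed so the signal is longer than the filter.

```python
import numpy as np

def upsample_linear(samples, factor):
    """Linear interpolation written as zero insertion plus convolution."""
    samples = np.asarray(samples, dtype=float)
    # Place the known samples `factor` apart, with zeros in between.
    spread = np.zeros((samples.size - 1) * factor + 1)
    spread[::factor] = samples
    # Tent reconstruction filter h: 1 at the centre, falling to 0 at +-factor.
    offsets = np.arange(-factor + 1, factor)
    h = 1.0 - np.abs(offsets) / factor
    return np.convolve(spread, h, mode="same")

F = [2.0, 6.0, 4.0, 3.0, 5.0]      # hypothetical samples at x = 1, 2, 3, 4, 5
dense = upsample_linear(F, 2)      # values at x = 1, 1.5, 2, ..., 5
```

With these samples, the value reconstructed at x = 2.5 is the average of the samples at x = 2 and x = 3. Other reconstruction filters give nearest-neighbor, bicubic, and other interpolants in the same way.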


Image interpolation
Original image: x 10

Nearest-neighbor interpolation Bilinear interpolation Bicubic interpolation


Image interpolation
Also used for resampling
Gaussian pre-filtering
• Solution: filter the image, then subsample

F_0 → blur (F_0 * H) → subsample → F_1 → blur (F_1 * H) → subsample → F_2 → …
Gaussian pyramid

F_0 → blur (F_0 * H) → subsample → F_1 → blur (F_1 * H) → subsample → F_2
Gaussian pyramids
[Burt and Adelson, 1983]

Gaussian Pyramids have all sorts of applications in computer vision

Source: S. Seitz
Gaussian pyramids
[Burt and Adelson, 1983]

• How much space does a Gaussian pyramid take compared to the original image?
Source: S. Seitz
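One way to answer the question: each level has half the width and half the height of the level below it, so it holds a quarter of the pixels. The total is the geometric series 1 + 1/4 + 1/16 + ..., which converges to 4/3. The sketch below counts pixels for an assumed 512x512 image, with the pyramid continued down to a single pixel.

```python
def pyramid_pixels(width, height):
    """Total pixel count over all levels, halving each side down to 1 pixel."""
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        width, height = max(1, width // 2), max(1, height // 2)
    return total

# Ratio of pyramid storage to original storage: just under 4 / 3.
ratio = pyramid_pixels(512, 512) / (512 * 512)
```

The whole pyramid therefore costs about one third more storage than the original image alone.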
Multi-Resolution Image Representation

• Fourier domain tells us “what” (frequencies, sharpness, texture properties), but not “where”.
• Spatial domain tells us “where” (pixel location) but not “what”.
• We want an image representation that gives a local description of image “events”: what is happening where.
• Naturally, think about representing images across varying scales.


Figure from David Forsyth
Multi-resolution Image Pyramids
Low resolution

High resolution
Space Required for Pyramids
Constructing a pyramid by taking every second pixel leads to layers that badly misrepresent the top layer
Decimation
Expansion
Interpolation Results
The Gaussian Pyramid

► Smooth with Gaussians because
 a Gaussian*Gaussian=another Gaussian
► Synthesis
 smooth and downsample
► Gaussians are low-pass filters, so the representation is redundant
► Kernel width doubles with each level
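The statement that a Gaussian convolved with a Gaussian gives another Gaussian can be checked numerically: the variances add, so sigmas of 1.5 and 2.0 combine to sqrt(1.5^2 + 2.0^2) = 2.5. The sigma values and the 6-sigma kernel radii below are assumptions chosen so that truncation error is negligible.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian sampled on the integers -radius..radius."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

g_a = gaussian_kernel(1.5, 9)       # sigma = 1.5, radius = 6 sigma
g_b = gaussian_kernel(2.0, 12)      # sigma = 2.0, radius = 6 sigma
combined = np.convolve(g_a, g_b)    # full convolution, radius 9 + 12 = 21

# A single Gaussian with the combined sigma on the same support.
direct = gaussian_kernel(2.5, 21)
max_error = np.max(np.abs(combined - direct))
```

This is why repeated blurring in a pyramid behaves like one wider Gaussian blur applied to the original image.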
Gaussian pyramids used for
► Up- or down-sampling images.
► Multi-resolution image analysis
 Look for an object over various spatial scales
 Coarse-to-fine image processing: form a blur estimate or perform motion analysis on a very low-resolution image, upsample, and repeat. This is often a successful strategy for avoiding local minima in complicated estimation tasks.

The Gaussian Pyramid
Low resolution
G_4 = (G_3 * gaussian) ↓ 2   (blur, then down-sample)
G_3 = (G_2 * gaussian) ↓ 2   (blur, then down-sample)
G_2 = (G_1 * gaussian) ↓ 2   (blur, then down-sample)
G_1 = (G_0 * gaussian) ↓ 2   (blur, then down-sample)
G_0 = Image
High resolution
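The recurrence G_{i+1} = (G_i * gaussian) ↓ 2 can be sketched as a short loop. The 5-tap binomial kernel [1, 4, 6, 4, 1] / 16 is an assumed stand-in for the Gaussian, and the edge-replication border handling and the random 64x64 test image are also choices made for this example.

```python
import numpy as np

# 5-tap binomial approximation to a Gaussian (an assumed kernel choice).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(image):
    """Separable blur with KERNEL; borders use edge replication."""
    padded = np.pad(image.astype(float), 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def gaussian_pyramid(image, levels):
    """Return [G_0, G_1, ..., G_levels], each level blurred then halved."""
    pyramid = [image.astype(float)]
    for _ in range(levels):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

image = np.random.default_rng(0).random((64, 64))   # stand-in for G_0
G = gaussian_pyramid(image, 4)                      # G[4] is 4x4
```

`G[0]` is the original image and each later entry has half the width and height of the one before it.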
Pyramids at Same Resolution
The Laplacian Pyramid
Gaussian pyramid (G_n, ..., G_2, G_1, G_0) to Laplacian pyramid (L_n, ..., L_2, L_1, L_0):
L_i = G_i - expand(G_{i+1})   for i < n
L_n = G_n
Reconstruction: G_i = L_i + expand(G_{i+1})
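The Laplacian pyramid relations can be sketched end to end, including the reconstruction G_i = L_i + expand(G_{i+1}). Everything beyond those relations is an assumption: the 5-tap binomial kernel, the helper names, and in particular `expand`, which here is pixel replication followed by a blur rather than any specific published operator. Reconstruction stays exact for any `expand`, provided the same one is used in both directions.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # assumed 5-tap kernel

def blur(image):
    """Separable blur with KERNEL; borders use edge replication."""
    padded = np.pad(image, 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def downsample(image):
    """Blur, then keep every other row and column."""
    return blur(image)[::2, ::2]

def expand(image, shape):
    """Upsample to `shape`: pixel replication, crop, then blur."""
    up = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(image, levels):
    """L_i = G_i - expand(G_{i+1}) for i < levels, and L_n = G_n."""
    gaussian = [image.astype(float)]
    for _ in range(levels):
        gaussian.append(downsample(gaussian[-1]))
    laplacian = [g - expand(g_next, g.shape)
                 for g, g_next in zip(gaussian[:-1], gaussian[1:])]
    return laplacian + [gaussian[-1]]

def reconstruct(laplacian):
    """Invert the pyramid with G_i = L_i + expand(G_{i+1})."""
    image = laplacian[-1]
    for band in reversed(laplacian[:-1]):
        image = band + expand(image, band.shape)
    return image

image = np.random.default_rng(0).random((64, 48))
L = laplacian_pyramid(image, 3)
restored = reconstruct(L)          # equals `image` up to rounding
```

Only the coarsest Gaussian level is stored as an image; every other level stores the detail lost between two adjacent scales.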
Applications of Image Pyramids

► Coarse-to-fine strategies for computational efficiency.
► Search for correspondence
 look at coarse scales, then refine with finer scales
► Edge tracking
 a “good” edge at a fine scale has parents at a coarser scale
► Control of detail and computational cost in matching
 e.g. finding stripes
 very important in texture representation
► Image blending
► Data compression (Laplacian pyramid)
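The "look at coarse scales, then refine with finer scales" idea can be sketched as a two-level template search. This is an illustration under several assumptions, not the method used in the lecture's figures: a 2x2 box average stands in for the Gaussian blur, the match score is the sum of squared differences (SSD), only two levels are used, and the template is placed at even offsets so the coarse grids line up exactly.

```python
import numpy as np

def halve(image):
    """Average 2x2 blocks (a box filter stands in for the Gaussian here)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_match(image, template, candidates):
    """Return the (row, col) in `candidates` with the lowest SSD."""
    th, tw = template.shape
    def ssd(pos):
        r, c = pos
        return np.sum((image[r:r + th, c:c + tw] - template) ** 2)
    return min(candidates, key=ssd)

def coarse_to_fine_match(image, template):
    """Exhaustive search at half resolution, then a +-1 pixel refinement."""
    small_img, small_tpl = halve(image), halve(template)
    rows = small_img.shape[0] - small_tpl.shape[0] + 1
    cols = small_img.shape[1] - small_tpl.shape[1] + 1
    coarse = best_match(small_img, small_tpl,
                        [(r, c) for r in range(rows) for c in range(cols)])
    # Map the coarse estimate to full resolution and search only around it.
    max_r = image.shape[0] - template.shape[0]
    max_c = image.shape[1] - template.shape[1]
    nearby = [(r, c)
              for r in range(2 * coarse[0] - 1, 2 * coarse[0] + 2)
              for c in range(2 * coarse[1] - 1, 2 * coarse[1] + 2)
              if 0 <= r <= max_r and 0 <= c <= max_c]
    return best_match(image, template, nearby)

rng = np.random.default_rng(0)
image = rng.random((64, 64))
template = image[20:36, 12:28].copy()   # true location (20, 12)
found = coarse_to_fine_match(image, template)
```

The full-resolution search visits at most 9 positions instead of all 49 x 49 = 2401, and the coarse search runs on images with a quarter of the pixels. A practical version would blur before subsampling and use more levels.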
Fast Template Matching

Multi-Resolution Correlation
Image blending

Image Blending
Pyramid blending of Regions
Horror Photo

© prof. dmartin
Facial Feature Blending



Image Fusion

Multi-scale Transform (MST) = Obtain Pyramid from Image


Inverse Multi-scale Transform (IMST) = Obtain Image from Pyramid
Multi-Sensor Fusion
Matlab resources for pyramids (with tutorial)
http://www.cns.nyu.edu/~eero/software.html

Why use these representations?
► Handle real-world size variations with a constant-size vision algorithm.
► Remove noise
► Analyze texture
► Recognize objects
► Label image features

Acknowledgements
► I prepared this course material from courses taught by the following computer vision experts:
 Steve Seitz, University of Washington
 Richard Szeliski, Microsoft Research
 Noah Snavely, Cornell University
 Srinivasa Narasimhan, Carnegie Mellon University
 Antonio Torralba, MIT
 Michael Black, Brown University
