You are on page 1of 75

EE5811

Topics in Computer Vision


Dr. LI Haoliang
Department of Electrical Engineering
Recap of Week 1
• A brief History of Computer Vision
• Different Applications
What is an image?
What is an image?

Digital Camera

We’ll focus on these in this course

Also image formation The Eye


Source: A. Efros
What is an image?
• A grid (matrix) of intensity values
255 255 255 255 255 255 255 255 255 255 255 255

255 255 255 255 255 255 255 255 255 255 255 255

255 255 255 20 0 255 255 255 255 255 255 255

255 255 255 75 75 75 255 255 255 255 255 255

=
255 255 75 95 95 75 255 255 255 255 255 255

255 255 96 127 145 175 255 255 255 255 255 255

255 255 127 145 175 175 175 255 255 255 255 255

255 255 127 145 200 200 175 175 95 255 255 255

255 255 127 145 200 200 175 175 95 47 255 255

255 255 127 145 145 175 127 127 95 47 255 255

255 255 74 127 127 127 95 95 95 47 255 255

255 255 255 74 74 74 74 74 74 255 255 255

255 255 255 255 255 255 255 255 255 255 255 255

255 255 255 255 255 255 255 255 255 255 255 255

(common to use one byte per value: 0 = black, 255 = white)


What is an image?
• We can think of a (grayscale) image as a function, f,
from R2 to R:
• f (x,y) gives the intensity at position (x,y)
f (x, y)

3D view

• A digital image is a discrete (sampled, quantized) version


of this function
Image transformations
• As with any function, we can apply operators to an
image

Example

g (x,y) = f (x,y) + 20 g (x,y) = f (-x,y)


Characterizing image
transformations

[i] does not mean transformation is applied at each pixel separately

Source: Deva Ramanan


Characterizing image
transformations
• Properties of “nice” functional transformation
Impulse response
• Delta function
Impulse response
• Delta function
Convolution
Convolution
Example
Example
Properties of Convolution

We can efficiently implement complex operations

Powerful way to think about ANY image transformation that


satisfies additivity, scaling, and shift-invariance.
Size
• Given F of length N and H of length M, what’s size
of G = F * H?
(Cross) Correlation

Commutative properties do not hold


Convolution vs. Correlation
Cross-correlation
Let be the image, be the kernel (of
size 2k+1 x 2k+1), and be the output
image

This is called a cross-correlation operation:

• Can think of as a “dot product” between


local neighborhood and kernel for each pixel
Convolution
• Same as cross-correlation, except that the kernel is
“flipped” (horizontally and vertically)

This is called a convolution operation:

• Convolution is commutative and associative


Linear filtering
• Cross-correlation, convolution
• Replace each pixel by a linear combination (a weighted sum)
of its neighbors
• The prescription for the linear combination is called
the “kernel” (or “mask”, “filter”)

10 5 3 0 0 0
4 6 1 0 0.5 0 8
1 1 8 0 1 0.5
Local image data kernel Modified image data

Source: L. Zhang
Filters
• Filtering
• Form a new image whose pixels are a combination of the
original pixels
• Why?
• To get useful information from images
• E.g., extract edges or contours (to understand shape)
• To enhance the image
• E.g., to remove noise
• E.g., to sharpen or to “enhance image”
Canonical Image Processing
problems
• Image Restoration
• denoising
• deblurring
• Image Compression
• JPEG, JPEG2000, MPEG..
• Computing Field Properties
• optical flow
• disparity
• Locating Structural Features
• corners
• edges
Question: Noise reduction
• Given a camera and a still scene, how can you
reduce noise?

Take lots of images and average them!


What’s the next best thing?
Source: S. Seitz
Image filtering
• Modify the pixels in an image based on some function of
a local neighborhood of each pixel

10 5 3 Convolution
4 5 1 7
1 1 7

Local image data Modified image data

Source: L. Zhang
Convolution

Adapted from F. Durand


Border effects
Border padding
Mean filtering
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 10 20 30 30 30 20 10
0 0 0 90 90 90 90 90 0 0 0 20 40 60 60 60 40 20

1 1 1 0 0 0 90 90 90 90 90 0 0 0 30 60 90 90 90 60 30

1
1
1
1
1
1
*
0
0
0
0
0
0
0
0
0
90
90
90
90
0
90
90
90
90
90
90
90
90
90
90
0
0
0
0
0
0
= 0
0
0
30
30
20
50
50
30
80
80
50
80
80
50
90
90
60
60
60
40
30
30
20
0 0 0 0 0 0 0 0 0 0 10 20 30 30 30 30 20 10
0 0 90 0 0 0 0 0 0 0 10 10 10 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0
Mean filtering/Moving average
Mean filtering/Moving average
Mean filtering/Moving average
Mean filtering/Moving average
Mean filtering/Moving average
Mean filtering/Moving average
Linear filters: examples

0 0 0

=
* 0
0
1
0
0
0

Original Identical image

Source: D. Lowe
Linear filters: examples

0 0 0

=
* 1
0
0
0
0
0

Original Shifted left


By 1 pixel

Source: D. Lowe
Linear filters: examples

1 1 1

=
* 1
1
1
1
1
1

Original Blur (with a mean filter)

Source: D. Lowe
Linear filters: examples

-
0 0 0 1 1 1

=
* 0
0
2
0
0
0
1
1
1
1
1
1

Sharpening filter
Original (accentuates edges)

Source: D. Lowe
Sharpening

Source: D. Lowe
Smoothing with box filter revisited

Source: D. Forsyth
Gaussian Kernel

0.003 0.013 0.022 0.013 0.003


0.013 0.059 0.097 0.059 0.013
0.022 0.097 0.159 0.097 0.022
0.013 0.059 0.097 0.059 0.013
0.003 0.013 0.022 0.013 0.003

5 x 5,  = 1

• Constant factor at front makes volume sum to 1 (can be ignored, as


we should re-normalize weights to sum to 1 in any case)

Source: C. Rasmussen
Gaussian Kernel

Source: C. Rasmussen
Gaussian filters

= 1 pixel = 5 pixels = 10 pixels = 30 pixels


Mean vs. Gaussian filtering
Gaussian filter
• Removes “high-frequency” components from the
image (low-pass filter)
• Convolution with self is another Gaussian

* =

Source: K. Grauman
Sharpening revisited
• What does blurring take away?

– =
original smoothed (5x5) detail

Let’s add it back:

+α =
original detail sharpened
Source: S. Lazebnik
Filters: Thresholding
Image Scaling
This image is too big to fit on the
screen. How can we generate a
half-sized version?

Source: S. Seitz
Image sub-sampling

1/8

1/4

Throw away every other row and


column to create a smaller size image
- called image sub-sampling
Source: S. Seitz
Image sub-sampling

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Why does this look so bad?


Source: S. Seitz
Aliasing

• Occurs when your sampling rate is not high enough to


capture the amount of detail in your image
• Can give you the wrong signal/image—an alias

• To do sampling right, need to understand the structure of


your signal/image

• To avoid aliasing:
• sampling rate ≥ 2 * max frequency in the image
• said another way: ≥ two samples per cycle
• This minimum sampling rate is called the Nyquist rate
Source: L. Zhang
Nyquist limit – 2D example

Good sampling

Bad sampling
Aliasing
• When downsampling by a factor of two
• Original image has frequencies that are too high

• How can we fix this?


Gaussian pre-filtering

G 1/8

G 1/4

Gaussian 1/2

• Solution: filter the image, then subsample


Source: S. Seitz
Subsampling with Gaussian pre-filtering

Gaussian 1/2 G 1/4 G 1/8

• Solution: filter the image, then subsample


Source: S. Seitz
Compare with...

1/2 1/4 (2x zoom) 1/8 (4x zoom)

Source: S. Seitz
Gaussian
pre-filtering
• Solution: filter
the image, then
subsample

F0 F1 F2

blur subsample blur subsample …


F0 * H F1 * H
Gaussian
pyramid

F0 F1 F2

blur subsample blur subsample …


F0 * H F1 * H
Gaussian pyramids
[Burt and Adelson, 1983]

• In computer graphics, a mip map [Williams, 1983]


• A precursor to wavelet transform

Gaussian Pyramids have all sorts of applications in computer vision


Source: S. Seitz
Upsampling
• This image is too small for this screen:
• How can we make it 10 times as big?
• Simplest approach:
repeat each row
and column 10 times
• (“Nearest neighbor
interpolation”)
Image interpolation

d = 1 in this
example

1 2 3 4 5

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function


• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale

Adapted from: S. Seitz


Image interpolation

d = 1 in this
example

1 2 3 4 5

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function


• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale

Adapted from: S. Seitz


Image interpolation

1 d = 1 in this
example

1 2 2.5 3 4 5

• What if we don’t know ?


• Guess an approximation:
• Can be done in a principled way: filtering
• Convert to a continuous function:

• Reconstruct by convolution with a reconstruction filter, h

Adapted from: S. Seitz


Image interpolation
“Ideal” reconstruction

Nearest-neighbor
interpolation

Linear interpolation

Gaussian reconstruction

Source: B. Curless
Image interpolation
• What does the 2D version of this hat function look like?

performs
linear interpolation bilinear interpolation

Better filters give better resampled images


• Bicubic is common choice

Cubic reconstruction filter


Image interpolation
Original image: x 10

Nearest-neighbor interpolation Bilinear interpolation Bicubic interpolation

Potential project: Seam carving


https://en.wikipedia.org/wiki/Seam_carving
Image interpolation
• Resizing (resampling)
• Remapping (geometrical Transformation, rotation,...)
• Inpainting (restauration of holes)
• Morphing, nonlinear transformations
Coding Exercise
• Using Python to implement image filtering with the
provided image as input (cv2.filter2D)

-
0 0 0 1 1 1
1 1 1
0 2 0 1 1 1
1 1 1 0 0 0 1 1 1
1 1 1

You can also try


other filters.
Coding Exercise
• Image Interpolation
• Nearest Neighbor Interpolation: selects the value of the
nearest point and does not consider the values of
neighboring points at all, yielding a piecewise-constant
interpolant.
• Bilinear Interpolation: uses values of only the 4 nearest
pixels, located in diagonal directions from a given pixel
• Bicubic Interpolation: considers 16 pixels (4×4).

cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) → dst


Coding Exercise

https://docs.opencv.org/3.4/dc/dff/tutorial_py_pyramids.html
Question
• Suppose that you filter an image f(x,y) with a
spatial filter mask w(x,y) using convolution, where
the mask is smaller than the image in both spatial
directions.
• Show the important property that, if the coefficients of
the mask sum to zero, then the sum of all the elements
in the resulting filtered image will be zero also (you may
assume that the border of the image has been padded
with the appropriate number of zeros).
• Would the result be the same if the filtering is
implemented using correlation?
Solution

You might also like