

Sensor and Vision technologies for automotive vehicles including self-driving cars.

2017-03-30 Automotive Sensor Systems 2

Sensors that make cars safer
Cars will have to manage input data from myriad sensors, and make split-second decisions that might involve taking control from the driver. Here, when forward collision warning senses that a crash is imminent, data from body mass and position sensors in the cabin instantly adjust the amount of force with which air bags are deployed and seat belts are tightened.

2017-03-30 Automotive Sensor Systems 3

Sensors that make cars safer
3D sensing provides vital real-time information about seat occupants' size and position to improve safety and comfort.

This vital information will assist airbag systems in deciding if, when, and how hard to inflate after sensing the occupants.

2017-03-30 Automotive Sensor Systems 4

2017-03-30 Automotive Sensor Systems 5
Image Segmentation

2017-03-30 Automotive Sensor Systems 6

Feature Extraction

Acknowledgements to CSE486, Robert Collins

2017-03-30 Automotive Sensor Systems 7
Sensing Depth from Images: Stereo Vision

How do we perceive the three-dimensional properties of the world when
the images on our retinas are only two-dimensional?
Stereo is not the entire story!

2017-03-30 Automotive Sensor Systems 8

Sensing Depth from Images: Stereo Vision
1. Find correspondences over entire image automatically
2. Compute 3-D locations of points assuming calibrated cameras

Figure: depth of corresponding point in the scene

Correspondence between
left and right images

Left Camera Right Camera

2017-03-30 Automotive Sensor Systems 9
Stereo Vision Based 3D System

Range can be computed from disparity.
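
As a rough illustration of how range follows from disparity, the sketch below assumes a rectified stereo pair, for which depth is Z = f B / d. The names `focal_length_px` (focal length in pixels), `baseline_m` (camera separation) and `disparity_px` are illustrative assumptions, not symbols from the slides.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair (assumed geometry)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800 px focal length, 12 cm baseline, 20 px disparity.
depth_m = depth_from_disparity(disparity_px=20.0, focal_length_px=800.0, baseline_m=0.12)
```

Larger disparities correspond to closer points, so range resolution degrades with distance.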

2017-03-30 Automotive Sensor Systems 10

Automated Video Surveillance

Acknowledgements to CSE486, Robert Collins

2017-03-30 Automotive Sensor Systems 11
Head and face tracking
Computational speed
Kalman filtering

2017-03-30 Automotive Sensor Systems 12

Object Recognition

Manipulation: Find tool Navigation: Find landmarks

Mapping: Recover as-built models
Detect and recognize faces
2017-03-30 Automotive Sensor Systems 13
Articulated Body Fitting

Some slides in this lecture were kindly

provided by
Professor Allen Hanson
University of Massachusetts at

Acknowledgements to CSE486, Robert Collins

2017-03-30 Automotive Sensor Systems 14
Definition of a Sensor and a Transducer

A transducer is a device that converts input energy into output energy, the latter usually differing in kind but bearing a known relationship to the input.

A sensor is a transducer that receives an input stimulus and responds with an electrical signal bearing a known relationship to the input.

2017-03-30 Automotive Sensor Systems 15

The Electromagnetic Spectrum
Visible Spectrum: 700 nm to 400 nm

2017-03-30 (Source: Electromagnetic Spectrum) Automotive Sensor Systems 16

Time Of Flight Measurement
Principle of operation of most radar, laser and active acoustic devices
The time between the transmission of a pulse of energy and the reception of the echo is
measured to provide the range:

$R = \frac{vT}{2}$

where R = range (m)
v = wave propagation velocity (m/s)
T = round trip time (s)
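
A minimal sketch of the time-of-flight relation R = vT/2 (the factor of two accounts for the round trip). The propagation speeds below are standard approximate values chosen for illustration; the function and constant names are not from the slides.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0   # radar / laser
SPEED_OF_SOUND_M_S = 343.0           # assumed: air at about 20 degrees C


def tof_range(round_trip_time_s, velocity_m_s):
    """Range R = v * T / 2, where T is the measured round-trip time."""
    return velocity_m_s * round_trip_time_s / 2.0


radar_range_m = tof_range(1e-6, SPEED_OF_LIGHT_M_S)    # echo after 1 microsecond
sonar_range_m = tof_range(0.01, SPEED_OF_SOUND_M_S)    # echo after 10 milliseconds
```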

2017-03-30 Automotive Sensor Systems 17

Histogram-based Segmentation
Ex: bright object on dark background:

Select threshold T from the histogram (number of pixels versus gray value)
Create binary image:
I(x,y) < T -> O(x,y) = 0
I(x,y) >= T -> O(x,y) = 1
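
The thresholding rule can be sketched with NumPy as follows. The toy image and the threshold value are made up for illustration; in practice T would be chosen by inspecting the histogram.

```python
import numpy as np


def histogram(image, levels=256):
    """Number of pixels at each gray value."""
    return np.bincount(np.asarray(image).ravel(), minlength=levels)


def threshold_image(image, t):
    """Binary image: 1 where I(x,y) >= T, 0 otherwise."""
    return (np.asarray(image) >= t).astype(np.uint8)


# Bright object (values near 200) on a dark background (values near 10).
image = np.array([[10, 12, 200],
                  [11, 210, 205],
                  [9, 10, 198]], dtype=np.uint8)
hist = histogram(image)
binary = threshold_image(image, t=100)
```

Any T between the two histogram clusters gives the same binary image here, which is why well-separated peaks make threshold selection easy.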

2017-03-30 Automotive Sensor Systems 18
Threshold Value vs. Histogram
Thresholding usually involves analyzing the histogram.
Different image features give rise to distinct features in a histogram.
In general, the histogram peaks corresponding to two features will overlap.
An example of a threshold value is the mean intensity value.

Example image (gray values):
10 11 10  0  0  1
 9 10 11  1  0  1
10  9 10  0  2  1
11 10  9 10  9 11
 9 10 11  9 99 11
10  9  9 11 10 10

2017-03-30 Automotive Sensor Systems 19

Image Mid-level Thresholding

Multi-level Thresholding
A point (x,y) belongs to:
i. an object class if T1 < f(x,y) ≤ T2
ii. another object class if f(x,y) > T2
iii. background if f(x,y) ≤ T1

T depends on:
only f(x,y), i.e. only on gray-level values: global threshold
both f(x,y) and p(x,y), i.e. on gray-level values and its neighbors: local threshold

2017-03-30 Automotive Sensor Systems 20

Profiles of image intensity edges

2017-03-30 Automotive Sensor Systems 21

Edge detection
An edge is a set of connected pixels that lie on the boundary between two regions.
The slope of the ramp is inversely proportional to the degree of blurring in the edge.
We no longer have a thin (one pixel thick) path; instead, an edge point is now
any point contained in the ramp, and an edge is then a set of such points
that are connected.
The thickness is determined by the length of the ramp.
The length is determined by the slope, which is in turn determined by the degree
of blurring.
Blurred edges tend to be thick and sharp edges tend to be thin.

2017-03-30 Automotive Sensor Systems 22

Edge detection
Due to optics, sampling, and image
acquisition imperfection

2017-03-30 Automotive Sensor Systems 23


The 1st derivative can be used to detect the presence of an edge (whether a point is on a ramp).
The sign of the 2nd derivative can be used to determine
whether an edge pixel lies on the dark or light side of an edge.
The second derivative produces two values per edge.
Its zero crossing lies near the edge midpoint.
Non-horizontal edges define a profile perpendicular to the
edge direction.

2017-03-30 Automotive Sensor Systems 24

Edge detection
Edge detection models:

2017-03-30 Automotive Sensor Systems 25

Effects of noise
Consider a single row or column of the image
Plotting intensity as a function of position gives a signal

Where is the edge?

2017-03-30 Automotive Sensor Systems 26

Edge detection Image

Edge detection in noisy images

Derivatives are sensitive to (even fairly little) noise.
Consider image smoothing prior to the use of derivatives.
Edge point: a point whose first derivative is above a pre-specified threshold.
Edge: connected edge points.
Derivatives are computed through gradients (1st) and Laplacians (2nd).

Figure panels: a little noise; more noise; visible (to the eye) noise.

2017-03-30 Automotive Sensor Systems 27

Where is the edge?

Smooth first: convolve the signal f with a smoothing kernel h.

Look for peaks in $\frac{d}{dx}(h * f)$

2017-03-30 Automotive Sensor Systems 28

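Derivative theorem of convolution: for a signal f and a smoothing kernel h,

$\frac{d}{dx}(h * f) = \left(\frac{d}{dx}h\right) * f$

This saves us one operation: the kernel is differentiated once in advance, then applied in a single convolution.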
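
The identity d/dx (h * f) = (dh/dx) * f can be checked numerically. The sketch below is an illustration under assumptions: a sampled Gaussian for h, a central-difference kernel for the derivative, and a synthetic noisy step for f. None of these specifics come from the slides.

```python
import numpy as np


def gaussian_kernel(sigma, radius):
    """Sampled, normalised 1D Gaussian."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()


rng = np.random.default_rng(0)
# Noisy step edge between samples 49 and 50.
f = np.concatenate([np.zeros(50), np.ones(50)]) + 0.05 * rng.standard_normal(100)
h = gaussian_kernel(sigma=3.0, radius=9)
d = np.array([1.0, 0.0, -1.0]) / 2.0    # central difference written as a convolution kernel

two_step = np.convolve(np.convolve(f, h), d)   # smooth, then differentiate
one_step = np.convolve(f, np.convolve(h, d))   # convolve once with the derivative of the Gaussian

# The rising edge is the positive peak; undo the shift introduced by "full" convolution.
edge_position = int(np.argmax(one_step)) - (len(h) // 2 + 1)
```

Both routes give the same response because convolution is associative; the one-step form only needs the combined kernel to be computed once.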

2017-03-30 Automotive Sensor Systems 29

Laplacian of Gaussian

Laplacian of Gaussian

Where is the edge? Zero-crossings of bottom graph

2017-03-30 Automotive Sensor Systems 30
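2D edge detection filters

Figure panels: Gaussian; derivative of Gaussian; Laplacian of Gaussian

$\nabla^2$ is the Laplacian operator:

$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$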

2017-03-30 Automotive Sensor Systems 31

Edge detection
Edge detection: general procedure
1. Smooth the image before edge detection to reduce noise (good practice).
2. Detect edge points: find the points that may be part of an edge.
3. Select the true edge members and compose them into an edge.

2017-03-30 Automotive Sensor Systems 32

Edge detection
Edge point: criteria for deciding that a point is an edge point
The transition in grey level associated with the point has to be significantly
stronger than the background at that point.
Use a threshold to determine whether a value is significant or not.
The point's two-dimensional first-order derivative must be greater than a
specified threshold.

2017-03-30 Automotive Sensor Systems 33

Edge detection
Gradient Operators
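The gradient is a tool for finding edge strength and direction at location (x,y) of an
image, f:

$\nabla f = \mathrm{grad}(f) = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix}$

The vector points in the direction of the maximum rate of change of f at
coordinates (x,y).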

2017-03-30 Automotive Sensor Systems 34

Edge detection
Gradient Operators
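Magnitude: gives the quantity of the increase (sometimes referred to as
the gradient too).
First derivatives are implemented using the magnitude of the gradient:

$M(x,y) = \mathrm{mag}(\nabla f) = \left[G_x^2 + G_y^2\right]^{1/2} = \left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{1/2}$

Linear approximation:

$M(x,y) \approx |G_x| + |G_y|$

Direction: perpendicular to the direction of the edge at (x,y):

$\alpha(x,y) = \tan^{-1}\left(\frac{G_y}{G_x}\right)$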

2017-03-30 Automotive Sensor Systems 35

Edge detection
Gradient Operators
Figure: gradient vector and edge direction at a point.

The gradient angle is measured with respect to the x-axis.

The direction of an edge at (x, y) is perpendicular to the direction of the gradient vector at that point.

2017-03-30 Automotive Sensor Systems 36

Edge detection
Gradient Operators
Instead of computing the partial derivatives at every pixel
location in the image, an approximation is calculated
using a 2x2 or 3x3 neighbourhood centered about a point.
For Gx: subtract the pixels in the left column from the pixels in
the right column.
For Gy: subtract the pixels in the top row from the pixels in the
bottom row.
For instance, Sobel operators introduce some smoothing and
give more importance to the center point.
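
A small NumPy-only sketch of the 3x3 Sobel approximation: right column minus left column for the x derivative, bottom row minus top row for the y derivative, with the center row or column weighted by 2. The border handling (replicating edge pixels) and the helper names are assumptions made for this illustration.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # right column minus left column
SOBEL_Y = SOBEL_X.T                              # bottom row minus top row


def correlate3x3(image, kernel):
    """Slide a 3x3 mask over the image (borders replicated)."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros_like(image)
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


def sobel_gradient(image):
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)       # [Gx^2 + Gy^2]^(1/2)
    direction = np.arctan2(gy, gx)     # gradient angle with respect to the x axis
    return gx, gy, magnitude, direction


# Vertical step edge between columns 2 and 3.
image = np.zeros((5, 6))
image[:, 3:] = 10.0
gx, gy, magnitude, direction = sobel_gradient(image)
```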
2017-03-30 Automotive Sensor Systems 37
Canny Edge Detector

1D, first derivative: $E(x) = \frac{d}{dx}\big(I(x) * G(x)\big)$; edge where the absolute value $|E(x)| > Th$
2D, gradient vector: $E(x,y) = \nabla\big(I(x,y) * G(x,y)\big)$; edge where the magnitude $|E(x,y)| > Th$
2017-03-30 Automotive Sensor Systems 38
The input is image I; G is a zero-mean Gaussian filter (std = σ)

1. J = I * G (smoothing)
2. For each pixel (i,j): (edge enhancement)
   Compute the image gradient
   ∇J(i,j) = (Jx(i,j), Jy(i,j))
   Estimate edge strength
   es(i,j) = (Jx(i,j)^2 + Jy(i,j)^2)^(1/2)
   Estimate edge orientation
   eo(i,j) = arctan(Jy(i,j)/Jx(i,j))
The outputs are images Es and Eo
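
The enhancement step (smooth with a Gaussian, take the gradient, output strength Es and orientation Eo) might be sketched as below. This is a NumPy-only illustration under assumptions: a separable sampled Gaussian truncated at 3σ, replicated borders, central differences for the gradient, and `arctan2` in place of arctan(Jy/Jx) to avoid division by zero.

```python
import numpy as np


def gaussian_kernel(sigma):
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()


def smooth(image, sigma):
    """J = I * G, applied separably along rows and then columns."""
    g = gaussian_kernel(sigma)
    radius = len(g) // 2
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)


def canny_enhancer(image, sigma=1.0):
    j = smooth(image, sigma)
    jy, jx = np.gradient(j)      # axis 0 is rows (y), axis 1 is columns (x)
    es = np.hypot(jx, jy)        # edge strength
    eo = np.arctan2(jy, jx)      # edge orientation (gradient angle)
    return es, eo


image = np.zeros((8, 8))
image[:, 4:] = 1.0               # vertical step edge
es, eo = canny_enhancer(image, sigma=1.0)
```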

2017-03-30 Automotive Sensor Systems 39

The output image Es has the magnitudes of the smoothed gradient.
Sigma determines the amount of smoothing.
Es has large values at edges


2017-03-30 Automotive Sensor Systems 40

How do we detect edges?

Es has large values at edges: Find local maxima


2017-03-30 Automotive Sensor Systems 41

but it also may have wide ridges around the local maxima (large
values around the edges)


2017-03-30 Automotive Sensor Systems 42

The inputs are Es & Eo (outputs of CANNY_ENHANCER)
Consider 4 directions D = {0, 45, 90, 135} degrees with respect to the x axis

For each pixel (i,j) do:

1. Find the direction d in D such that d ≈ Eo(i,j) (normal to the edge)
2. If Es(i,j) is smaller than at least one of its neighbors along d, set IN(i,j) = 0
   Otherwise, IN(i,j) = Es(i,j)

The output is the thinned edge image IN
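
A sketch of non-maximum suppression over the four quantised directions. It assumes Eo is given in radians with rows increasing downward, and it leaves the one-pixel image border at zero; both are choices made for this illustration.

```python
import numpy as np

# Neighbour offsets (d_row, d_col) for each quantised direction in degrees.
OFFSETS = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}


def nonmax_suppression(es, eo):
    es = np.asarray(es, dtype=float)
    angle = np.rad2deg(np.asarray(eo, dtype=float)) % 180.0
    # Nearest of 0, 45, 90, 135 degrees (180 wraps back to 0).
    quantised = (np.round(angle / 45.0).astype(int) % 4) * 45
    thinned = np.zeros_like(es)
    rows, cols = es.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            di, dj = OFFSETS[int(quantised[i, j])]
            # Keep the pixel only if it is not smaller than either neighbour along d.
            if es[i, j] >= es[i + di, j + dj] and es[i, j] >= es[i - di, j - dj]:
                thinned[i, j] = es[i, j]
    return thinned


# A three-pixel-wide ridge whose gradient points along x.
es = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
eo = np.zeros_like(es)
thinned = nonmax_suppression(es, eo)
```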

2017-03-30 Automotive Sensor Systems 43

Non-maximum suppression

Check if pixel is local maximum along gradient direction

requires checking interpolated pixels p and r

2017-03-30 Automotive Sensor Systems 44

The next edge point

Assume the marked point is an edge point. Then we construct the tangent to the edge curve (which is normal to the gradient at that point) and use this to predict the next points (here either r or s).

2017-03-30 (Forsyth & Ponce)

Automotive Sensor Systems 45
Check that the maximum value of the gradient is sufficiently large
Drop-outs? Use hysteresis:
use a high threshold to start edge curves and a low threshold to continue them.

2017-03-30 Automotive Sensor Systems 46

Edges are found by thresholding the output of non-maximum suppression (IN)
If the threshold is too high:
Very few (or no) edges
High MISDETECTIONS, many gaps
If the threshold is too low:
Too many edges (possibly all pixels)
High FALSE POSITIVES, many extra edges

2017-03-30 Automotive Sensor Systems 47

Hysteresis Thresholding
This leads to the creation of 3 classes:
below low threshold, Es(i,j) < L (to be removed),
above high threshold, Es(i,j) > H (to be retained), and
between low and high thresholds, Es(i,j) > L and Es(i,j) < H (to be retained only if connected to an edge above high threshold).
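
The three-class rule can be sketched as a flood fill that starts from pixels above the high threshold and grows through pixels above the low threshold. The use of 8-connectivity and the function name are assumptions made for this illustration.

```python
from collections import deque

import numpy as np


def hysteresis_threshold(strength, low, high):
    strength = np.asarray(strength, dtype=float)
    strong = strength > high          # retained unconditionally
    candidate = strength > low        # retained only if connected to a strong pixel
    edges = strong.copy()
    rows, cols = strength.shape
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < rows and 0 <= nj < cols
                if inside and candidate[ni, nj] and not edges[ni, nj]:
                    edges[ni, nj] = True
                    queue.append((ni, nj))
    return edges


# One strong pixel (5), two weak pixels attached to it, one isolated weak pixel.
strength = np.array([[0.0, 5.0, 3.0, 3.0, 0.0, 3.0, 0.0]])
edges = hysteresis_threshold(strength, low=2.0, high=4.0)
```

The isolated weak pixel is dropped, while the weak pixels chained to the strong one survive, which is what closes gaps without admitting noise.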

2017-03-30 Automotive Sensor Systems 48

Sensor system

2017-03-30 Automotive Sensor Systems 49

General Types of Sensors
1. Resistive Sensors
2. Capacitive Sensors
3. Inductive Sensors
4. Potential Transformer Sensors
5. Eddy Current Sensors
6. Piezoelectric Transducers
7. Photoelectric Sensors
8. Thermoelectric Sensors
9. Thermocouple
10. Fiber Optic Sensor
11. Gas Sensors, Chemical Sensors, Biological Sensors
12. Accelerometers

2017-03-30 Automotive Sensor Systems 50

Sensors as Convertors?
Standard feature:
a sensor converts an input quantity to an output quantity
E.g., thermocouple converts temperature to voltage

Typically aim for the output quantity to be an electrical signal

Interfacing with computers and analogue electronics

2017-03-30 Automotive Sensor Systems 51

Sensor system (example)

A thermocouple measuring circuit with a heat source, cold junction and a measuring instrument.

2017-03-30 Automotive Sensor Systems 52

Sensor Characteristics
Standard definitions of characteristics that describe
how well sensors do their jobs.

These are the language of sensor data sheets.

Four key characteristics:
Bias / Offset

2017-03-30 Automotive Sensor Systems 66

Sensor modelling
Pressure sensor

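$\frac{P r^4}{E t^4} = \frac{16}{3(1-\nu^2)}\,\frac{y_0}{t} + \frac{7-\nu}{3(1-\nu)}\,\frac{y_0^3}{t^3}$

Here P is the applied pressure, r the diaphragm radius, t its thickness, E Young's modulus, ν Poisson's ratio, and y_0 the center deflection.

It is nonlinear in y_0. Therefore, it cannot be solved directly for y_0.

The 1st term represents the stiffness associated with the bending of the
diaphragm. The 2nd term represents the stiffness associated with stretching of
the diaphragm.

If y_0 << t, then the 2nd term can be ignored. So,

$y_0 = \frac{3(1-\nu^2)\,P r^4}{16\,E t^3}$

N.B.: Only valid for the case of small deflections.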
2017-03-30 Automotive Sensor Systems 77
Digital Transducers
Digital transducer is any measuring device that produces a digital output
In analog sensors (transducers) both sensing and transducer stages are analog.
But since physical systems are typically continuous time systems, the sensing stage
of a digital measuring device is analog.
The transducer stage generates the discrete output signal such as pulse trains or
encoded data
Digital transducers do not introduce quantization error
When the output is a pulse signal, a counter is used to count the pulses or to count
the clock cycles over the pulse duration
The count is represented as a digital word according to some code
Binary, BCD (Binary Coded Decimal),ASCII
The output of the transducer may be available in a coded form

3/30/2017 Automotive Sensor Systems 78

Shaft Encoders
Digital transducers used for measuring angular displacements and velocities
Applications: Robotic manipulators, machine tools, data storage systems, plotters,
printers and other rotating machinery
Advantages: High resolution (word size), high accuracy (noise immunity), relative
ease of adaption in digital control systems with associated reduction in cost
Two types
Incremental encoders: generate pulses
Absolute encoders: output a whole data word
Techniques of transducer signal generation
Optical (photosensor) method
Sliding contact (electrical conducting) method
Magnetic saturation (reluctance) method
Proximity sensor method

3/30/2017 Automotive Sensor Systems 79

Incremental Optical Encoders
Two types:
1. Offset sensor configuration
2. Offset track configuration

Figure: code disk with window, reference pulse, and pick-offs 1 and 2. The two pick-off signals are 90° out of phase: one signal lags the other by 90° for one direction of rotation and leads it by 90° for the other.

3/30/2017 Automotive Sensor Systems 80

Absolute Optical Encoders
Figure: code disk with binary code, sectors numbered 0 to 15 (reading shown: 1 1 1 1).

If there are n tracks, there are n pick-off elements and the disk is divided into 2^n
sectors. If n = 16, the angular resolution is 360°/2^16 ≈ 0.0055°.

The data word uniquely determines the position at the time
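
As a hedged illustration of how a data word maps to a position, the sketch below assumes a plain binary code read most significant bit first, with n tracks giving 2^n equal sectors. The function names are hypothetical.

```python
def encoder_resolution_deg(n_tracks):
    """Angular size of one sector for a disk with 2**n_tracks sectors."""
    return 360.0 / (2 ** n_tracks)


def decode_binary_word(bits):
    """Sector number from pick-off readings, most significant bit first."""
    sector = 0
    for bit in bits:
        sector = (sector << 1) | bit
    return sector


def sector_to_angle_deg(sector, n_tracks):
    """Angle at the start of the given sector."""
    return sector * encoder_resolution_deg(n_tracks)


# Four-track disk with every pick-off reading 1.
sector = decode_binary_word([1, 1, 1, 1])
angle_deg = sector_to_angle_deg(sector, n_tracks=4)
```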

3/30/2017 Automotive Sensor Systems 81

Chain Coding based segmentation

By Herbert Freeman (1961 &


Boundary as a sequence of
straight lines

4- or 8-connectedness

2017-03-30 Automotive Sensor Systems 82

Example (chain coding)

2017-03-30 Automotive Sensor Systems 83

Algorithm MEAN SHIFT
A non-parametric technique
Finds the peak of a given histogram
It is based on Robust Statistics

See: Robust Analysis of Feature Space: Color Image Segmentation, by D. Comaniciu and P.Meer,
CVPR 1997, pp. 750-755.

2017-03-30 Automotive Sensor Systems 84

Finding Modes in a Histogram

How Many Modes Are There?

Easy to see, hard to compute
2017-03-30 Automotive Sensor Systems 85
Mean Shift [Comaniciu & Meer]

Iterative Mode Search
1. Initialize a random seed and a fixed window
2. Calculate the center of gravity of the window (the mean)
3. Translate the search window to the mean
4. Repeat from Step 2 until convergence
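
The iterative mode search can be sketched in one dimension as follows. It uses a flat window of fixed half-width and explicit start positions instead of a random seed so that the run is repeatable; the synthetic two-cluster data are made up for illustration.

```python
import numpy as np


def mean_shift_1d(samples, start, half_width, tol=1e-6, max_iter=100):
    samples = np.asarray(samples, dtype=float)
    center = float(start)
    for _ in range(max_iter):
        in_window = samples[np.abs(samples - center) <= half_width]
        if in_window.size == 0:
            break                              # empty window: nothing to follow
        new_center = float(in_window.mean())   # center of gravity of the window
        converged = abs(new_center - center) < tol
        center = new_center                    # translate the window to the mean
        if converged:
            break
    return center


rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(20.0, 2.0, 500), rng.normal(60.0, 3.0, 300)])
mode_a = mean_shift_1d(samples, start=15.0, half_width=5.0)
mode_b = mean_shift_1d(samples, start=66.0, half_width=5.0)
```

Each start position climbs to the nearest density peak, so the number of distinct end points gives an estimate of the number of modes.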
2017-03-30 Automotive Sensor Systems 86
Fundamental Definitions Erosion and Dilation

While either set A or B can be thought of as an "image", A is usually considered as the image and B is called a structuring element.
The structuring element is to mathematical morphology what the convolution kernel is to linear filter theory.

Dilation, in general, causes objects to dilate or grow in size.
Erosion causes objects to shrink. The amount and the way that they grow or shrink depend upon the choice of the structuring element.
Dilating or eroding without specifying the structuring element makes no more sense than trying to lowpass filter an image without specifying the filter.
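
A minimal binary erosion and dilation in NumPy, to show how the structuring element drives the result. The sketch assumes an odd-sized, symmetric structuring element and treats pixels outside the image as background; these are simplifications for illustration.

```python
import numpy as np


def _shifted_views(image, selem):
    """Yield the image shifted by every offset where the structuring element is set."""
    image = np.asarray(image, dtype=bool)
    selem = np.asarray(selem, dtype=bool)
    sr, sc = selem.shape
    pr, pc = sr // 2, sc // 2
    padded = np.pad(image, ((pr, pr), (pc, pc)), mode="constant", constant_values=False)
    rows, cols = image.shape
    for dr in range(sr):
        for dc in range(sc):
            if selem[dr, dc]:
                yield padded[dr:dr + rows, dc:dc + cols]


def dilate(image, selem):
    out = np.zeros(np.asarray(image).shape, dtype=bool)
    for view in _shifted_views(image, selem):
        out |= view      # set if ANY pixel under the element is set
    return out


def erode(image, selem):
    out = np.ones(np.asarray(image).shape, dtype=bool)
    for view in _shifted_views(image, selem):
        out &= view      # set only if ALL pixels under the element are set
    return out


image = np.zeros((5, 5), dtype=bool)
image[1:4, 1:4] = True                  # 3x3 square object
selem = np.ones((3, 3), dtype=bool)     # 3x3 square structuring element
eroded = erode(image, selem)
dilated = dilate(image, selem)
opened = dilate(eroded, selem)          # opening = erosion followed by dilation
```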

2017-03-30 Automotive Sensor Systems 87

Basic Morphological Operations
Opening (example # 1)
Step 1: The segment
left after erosion ->

Step 2: Perform
dilation on the
segment left in step 1

Output: Output of
opening ->

2017-03-30 Automotive Sensor Systems 88

Basic Morphological Operations
Closing (example #2)

2017-03-30 Automotive Sensor Systems 89

Edge Tracking Methods
Adjusting a priori Boundaries:
Given: Approximate Location of Boundary
Task: Find Accurate Location of Boundary

Search for STRONG EDGES along normals to approximate boundary.

Fit curve (eg., polynomials) to strong edges.

12/22/2016 Automotive Sensor Systems 90

Edge Tracking Methods
Divide and Conquer:

Given: Boundary lies between points A and B

Task: Find Boundary

Connect A and B with Line

Find strongest edge along line bisector

Use edge point as break point

12/22/2016 Automotive Sensor Systems 91

Curve Fitting
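Find the polynomial

$y = f(x) = ax^3 + bx^2 + cx + d$

that best fits the given points $(x_i, y_i)$.

Minimize:

$E = \frac{1}{N}\sum_i \left[y_i - (a x_i^3 + b x_i^2 + c x_i + d)\right]^2$

Using:

$\frac{\partial E}{\partial a} = 0, \quad \frac{\partial E}{\partial b} = 0, \quad \frac{\partial E}{\partial c} = 0, \quad \frac{\partial E}{\partial d} = 0$

Note: f(x) is LINEAR in the parameters (a, b, c, d)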
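
Because f(x) is linear in (a, b, c, d), setting the four partial derivatives to zero gives a linear least-squares problem. The sketch below solves it with `numpy.linalg.lstsq`; the sample points are synthetic and chosen so the exact coefficients are known.

```python
import numpy as np


def fit_cubic(x, y):
    """Least-squares coefficients (a, b, c, d) of y = a x^3 + b x^2 + c x + d."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([x**3, x**2, x, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


x = np.linspace(-2.0, 2.0, 9)
y = 0.5 * x**3 - 1.0 * x**2 + 2.0 * x + 1.0     # noise-free samples of a known cubic
a, b, c, d = fit_cubic(x, y)
```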

12/22/2016 Automotive Sensor Systems 92

Line Grouping Problem

Slide credit: David Jacobs

12/22/2016 Automotive Sensor Systems 93

Line Grouping Problem

This is difficult because of:

Extraneous data: clutter or multiple models
We do not know what is part of the model.
Can we pull out models with a few parts from much
larger amounts of background clutter?
Missing data: only some parts of the model are present
It is not feasible to check all combinations of features
by fitting a model to each possible subset

12/22/2016 Automotive Sensor Systems 94

Hough Transform

12/22/2016 Automotive Sensor Systems 95

Hough Transform
Elegant method for direct object recognition. It is often simpler to
transform a problem to another domain -- solve it -- and come back.

We've been doing this with time- and frequency-domain concepts (Fourier) all our lives.

Hough Transforms exploit the fact that a large analytic curve may
encompass many pixels in image space, but be characterized by
only a few parameters.

Edges need not be connected

Complete object need not be visible
Key Idea: Edges VOTE for the possible model

12/22/2016 Automotive Sensor Systems 96

Hough Transform
The Hough Transform can detect lines or curves that are very broken
(after initial edge detection, for example).

HTs can only detect lines or curves that are analytically specifiable, or that
can be represented in a template-like form (GHT, Ballard).

Even for the GHT, the implementation is a bit awkward, and you have
to know what you're looking for. So the Hough Transform is primarily a
"hypothesize and test" tool.

12/22/2016 Automotive Sensor Systems 97

Hough Transform
Image and Parameter Spaces
Image space (axes x, y): the line y = mx + c through the points (x_i, y_i)
Parameter space (axes m, c): the single point (m, c)

Parameter space also called Hough Space.

Connection between image (x,y) and Hough (m,b)
A line in the image corresponds to a point in Hough space
To go from image space to Hough space:
given a set of points (x,y), find all (m,b) such that y = mx + b

12/22/2016 Automotive Sensor Systems 98

Hough Transform
Image space (axes x, y): the line y = mx + b through the point (x_i, y_i)
Parameter space (axes m, b): the point (m, b)

Equation of line: y = mx + b
Find: (m, b)
Consider point (x_i, y_i): y_i = m x_i + b, or b = −x_i m + y_i

12/22/2016 Automotive Sensor Systems 99

Finding lines in an image: Hough space
Figure: a line in image space (axes x, y) and the corresponding point at slope m0 in Hough (parameter) space (axes m, b).

Connection between image (x,y) and Hough (m,b)

A line in the image corresponds to a point in Hough space
To go from image space to Hough space:
given a set of points (x,y), find all (m,b) such that y = mx + b
Slide credit: Steve Seitz
12/22/2016 Automotive Sensor Systems 100
Finding lines in an image: Hough space
Figure: a point at x0 in image space (axes x, y) and the corresponding line in Hough (parameter) space (axes m, b).
Connection between image (x,y) and Hough (m,b) spaces
A line in the image corresponds to a point in Hough space
To go from image space to Hough space:
given a set of points (x,y), find all (m,b) such that y = mx + b
What does a point (x0, y0) in the image space map to?
Answer: the solutions of b = −x0 m + y0
This is a line in Hough space
Slide credit: Steve Seitz

12/22/2016 Automotive Sensor Systems 101

Finding lines in an image: Hough space
Figure: points (x0, y0) and (x1, y1) in image space (axes x, y) and their lines b = −x0 m + y0 and b = −x1 m + y1 in Hough (parameter) space (axes m, b).

What are the line parameters for the line that contains both (x0, y0) and (x1, y1)?
It is the intersection of the lines b = −x0 m + y0 and b = −x1 m + y1
12/22/2016 Automotive Sensor Systems 102
Finding lines in an image: Hough algorithm
Figure: image space (axes x, y) and Hough (parameter) space (axes m, b).

How can we use this to find the most likely parameters (m,b) for
the most prominent line in the image space?
Let each edge point in image space vote for a set of possible
parameters in Hough space
Accumulate votes in discrete set of bins; parameters with the
most votes indicate line in image space.

12/22/2016 Automotive Sensor Systems 103

Hough Transform
Basic Algorithm:

1. Quantize the parameter space (m, c)
2. Create an accumulator array A(m, c)
3. Set A(m, c) = 0 for all (m, c) in the parameter space
4. For each image edge point (x_i, y_i), increment
   A(m, c) = A(m, c) + 1
   if (m, c) lies on the line c = −x_i m + y_i
5. Find local maxima in A(m, c)

Figure: accumulator array A(m, c) with a line of cells incremented to 1.

12/22/2016 Automotive Sensor Systems 104

Polar representation for lines
Practical issues with the usual (m,b) parameter space: the slope can
take on infinite values and is undefined for vertical lines.

The slope of the line is −∞ < m < ∞, i.e. −π/2 < θ < π/2
The representation y = mx + b does not express lines of the form x = k

Solution: polar representation
d: perpendicular distance from line to origin
θ: angle the perpendicular makes with the x-axis

$x\cos\theta + y\sin\theta = d$

Figure: image with origin [0,0], x along image columns and y along image rows.
Point in image space -> sinusoid segment in Hough space
(Slides: Kristen Grauman)
12/22/2016 Automotive Sensor Systems 105
Hough Transform in (ρ, θ) plane

To avoid infinite slope, use polar coordinates to represent a line:

$x\cos\theta + y\sin\theta = \rho$

Q points on the same straight line give Q sinusoidal curves in the (ρ, θ) plane
intersecting at the same (ρ_i, θ_i) cell.

12/22/2016 Automotive Sensor Systems 106

Better Parameterization in (ρ, θ) plane

NOTE: −∞ ≤ m ≤ ∞
Large accumulator
More memory and computations

Improvement: (finite accumulator array size)

Line equation: $\rho = x\cos\theta + y\sin\theta$
Here 0 ≤ θ ≤ 2π
0 ≤ ρ ≤ ρ_max
Given points (x_i, y_i), find (ρ, θ)

Figure: a point (x_i, y_i) in image space maps to a sinusoid in Hough space.

12/22/2016 Automotive Sensor Systems 107
Hough Transform
Improved Algorithm:

Input is an edge image (E(i,j) = 1 for edges)

1. Discretize ρ and θ in increments of dρ and dθ. Let A(R,T) be an
   array of integer accumulators, initialized to 0.
2. For each pixel with E(i,j) = 1 and for h = 1, 2, ..., T do
   i. ρ = i cos(h * dθ) + j sin(h * dθ)
   ii. Find the closest integer k corresponding to ρ
   iii. Increment counter A(k,h) by one
3. Find local maxima in A(R,T)
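
A sketch of the (ρ, θ) accumulator, using ρ = i cos θ + j sin θ with i the row index and j the column index. The choice of θ in [0, π), the offset that maps negative ρ to a non-negative bin index, and the parameter names are assumptions made for this illustration.

```python
import numpy as np


def hough_lines(edge_image, n_theta=180, d_rho=1.0):
    edge_image = np.asarray(edge_image)
    rows, cols = edge_image.shape
    thetas = np.arange(n_theta) * (np.pi / n_theta)      # discretised theta in [0, pi)
    rho_max = np.hypot(rows - 1, cols - 1)
    n_rho = int(np.ceil(2 * rho_max / d_rho)) + 1        # bins cover [-rho_max, rho_max]
    accumulator = np.zeros((n_rho, n_theta), dtype=np.int64)
    columns = np.arange(n_theta)
    for i, j in zip(*np.nonzero(edge_image)):
        rho = i * np.cos(thetas) + j * np.sin(thetas)     # one sinusoid per edge pixel
        k = np.round((rho + rho_max) / d_rho).astype(int)
        accumulator[k, columns] += 1                      # one vote per theta bin
    return accumulator, thetas, rho_max


# Edge pixels along image column j = 5, i.e. rho = 5 at theta = pi/2.
edges = np.zeros((20, 20), dtype=np.uint8)
edges[:, 5] = 1
accumulator, thetas, rho_max = hough_lines(edges)
k, h = np.unravel_index(accumulator.argmax(), accumulator.shape)
rho_peak = k * 1.0 - rho_max
theta_peak = thetas[h]
```

With coarse bins, neighbouring θ cells can collect the same number of votes, so a practical detector would look for local maxima rather than a single global peak.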

12/22/2016 Automotive Sensor Systems 108

Finding Circles by Hough Transform
Equation of Circle:

$(x_i - a)^2 + (y_i - b)^2 = r^2$

If radius is known: (2D Hough Space)

Accumulator Array A(a, b)

12/22/2016 Automotive Sensor Systems 109

Finding Circles by Hough Transform

Equation of Circle:

$(x_i - a)^2 + (y_i - b)^2 = r^2$

If radius is not known: 3D Hough Space!

Use accumulator array A(a, b, r)

What is the surface in the Hough space?

12/22/2016 Automotive Sensor Systems 110

Using Gradient Information
Gradient information can save a lot of computation:

Edge location (x_i, y_i)
Edge direction φ_i

Assume radius is known:

$a = x - r\cos\phi$
$b = y - r\sin\phi$

Need to increment only one point in the accumulator!

12/22/2016 Automotive Sensor Systems 111

Matching with Features
Problem 2:
For each point correctly recognize the corresponding one

We need a reliable and distinctive descriptor

12/22/2016 Automotive Sensor Systems 112
Harris corner detector for 2D image

C.Harris, M.Stephens. A Combined Corner and Edge Detector. 1988

12/22/2016 Automotive Sensor Systems 113
The Basic Idea
We should easily recognize the point by looking
through a small window
Shifting a window in any direction should give a
large change in intensity

12/22/2016 Automotive Sensor Systems 114

Detection of Corner Features
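Need two strong edges:
Create the following matrix, summing over a neighborhood of the point:

$C = \begin{bmatrix} \sum E_x^2 & \sum E_x E_y \\ \sum E_x E_y & \sum E_y^2 \end{bmatrix}$

with eigenvalues λ1 and λ2.

Either Ex or Ey, but not both, is large in a
neighborhood of the corner

If min(λ1, λ2) > T
There is a corner!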
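
The corner test can be sketched by building the 2x2 matrix of summed gradient products in a small window and taking its smaller eigenvalue. Central differences for Ex and Ey and the 3x3 window are assumptions made for this illustration; border pixels are left at zero.

```python
import numpy as np


def min_eigenvalue_map(image, half_window=1):
    image = np.asarray(image, dtype=float)
    ey, ex = np.gradient(image)          # axis 0 is rows (y), axis 1 is columns (x)
    rows, cols = image.shape
    w = half_window
    lam_min = np.zeros_like(image)
    for i in range(w, rows - w):
        for j in range(w, cols - w):
            px = ex[i - w:i + w + 1, j - w:j + w + 1]
            py = ey[i - w:i + w + 1, j - w:j + w + 1]
            c = np.array([[np.sum(px * px), np.sum(px * py)],
                          [np.sum(px * py), np.sum(py * py)]])
            lam_min[i, j] = np.linalg.eigvalsh(c)[0]   # eigenvalues in ascending order
    return lam_min


# Bright square in the lower-right quadrant: one corner near (5, 5).
image = np.zeros((10, 10))
image[5:, 5:] = 1.0
lam_min = min_eigenvalue_map(image)
corner = np.unravel_index(lam_min.argmax(), lam_min.shape)
```

Along a straight edge only one gradient component is nonzero, so the smaller eigenvalue stays near zero; it becomes large only where both edge directions meet.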

12/22/2016 Automotive Sensor Systems 115

Detection of Corner Features
Solution: rotate the corner to align it with the image coordinate

12/22/2016 Automotive Sensor Systems 116

Harris Detector: Some Properties

Partial invariance to intensity change

Only derivatives are used => invariance to intensity
shift I -> I + b

Intensity scale: I -> a I

Figure: two plots against x (image coordinate).

12/22/2016 Automotive Sensor Systems 117

Harris Detector: Some Properties
But: non-invariant to image scale!

Figure, left: all points will be classified as edges. Right: corner!

12/22/2016 Automotive Sensor Systems 118

Fourier Descriptor (FD)

Obtained by applying the Fourier transform to a shape signature,
such as the centroid distance function R(θ).
Image segmentation
Extract boundary of objects.

3/30/2017 Automotive Sensor Systems 119

Fourier Descriptor (FD) Cont

Example of The centroid distance signature and the Fourier series


3/30/2017 Automotive Sensor Systems 120

Fourier Descriptor (FD) Cont
$a_n = \frac{1}{N}\sum_{\theta=0}^{N-1} R(\theta)\,\exp\left(\frac{-j 2\pi n \theta}{N}\right); \quad n = 0, 1, \ldots, N-1$

Translation: Invariant since we use R(θ)
Rotation: We can make it rotation invariant by choosing the starting point as
the target distance.
Scaling: Suppose that we resize the object. That's equivalent to simply
multiplying x(k) and y(k) by some constant. As you are well-acquainted by now,
that's just multiplication of the Fourier descriptor by the same constant.
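
A sketch of a Fourier descriptor built from the centroid distance signature. It assumes the boundary is given as an ordered list of (x, y) samples, takes the centroid as the mean of those samples, keeps only coefficient magnitudes (so the starting point does not matter), and divides by the magnitude of the zeroth coefficient to remove scale. These normalisation choices are illustrative, not taken from the slides.

```python
import numpy as np


def centroid_distance_signature(boundary):
    boundary = np.asarray(boundary, dtype=float)
    centroid = boundary.mean(axis=0)
    return np.linalg.norm(boundary - centroid, axis=1)


def fourier_descriptor(boundary, n_coeffs=8):
    r = centroid_distance_signature(boundary)
    a = np.fft.fft(r) / len(r)            # a_n = (1/N) sum R exp(-j 2 pi n k / N)
    magnitudes = np.abs(a)
    return magnitudes[1:n_coeffs + 1] / magnitudes[0]


t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.column_stack([3.0 * np.cos(t), 2.0 * np.sin(t)])

fd_original = fourier_descriptor(ellipse)
fd_moved_scaled = fourier_descriptor(2.5 * ellipse + np.array([10.0, -4.0]))
fd_new_start = fourier_descriptor(np.roll(ellipse, 5, axis=0))
```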

3/30/2017 Automotive Sensor Systems 121

Texture Description
Fourier Transform in small windows
Wavelets or Filter banks
Feature vectors
Statistical descriptors
Markov Chains

The auto-correlation
Describes the relations between neighboring pixels.
Equivalently, we can analyze the power spectrum of the window: We apply a
Fourier Transform in small windows.
Analyzing the power spectrum:
Periodicity: The energy of different frequencies.
Directionality: The energy of slices in different directions.
Simplest Texture Discrimination
Compare histograms.
Divide intensities into discrete ranges.
Count how many pixels in each range.

Histogram bins: 0-25, 26-50, 51-75, 76-100, ..., 225-250

Chi-square distance between texton histograms h_i and h_j with K bins:

$\chi^2(h_i, h_j) = \frac{1}{2}\sum_{m=1}^{K} \frac{[h_i(m) - h_j(m)]^2}{h_i(m) + h_j(m)}$
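
The chi-square distance between two histograms, half the sum over bins of the squared difference divided by the sum, can be sketched as below. Skipping bins that are empty in both histograms is an assumption added here to avoid division by zero.

```python
import numpy as np


def chi_square_distance(h_i, h_j):
    h_i = np.asarray(h_i, dtype=float)
    h_j = np.asarray(h_j, dtype=float)
    total = h_i + h_j
    used = total > 0                  # bins empty in both histograms contribute nothing
    return 0.5 * np.sum((h_i - h_j)[used] ** 2 / total[used])


same = chi_square_distance([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
disjoint = chi_square_distance([1.0, 0.0], [0.0, 1.0])
partial = chi_square_distance([0.5, 0.5], [0.25, 0.75])
```

For histograms normalised to sum to one, the distance ranges from 0 (identical) to 1 (no overlap).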
More Complex Discrimination
Histogram comparison is very limiting
Every pixel is independent.
Everything happens at a tiny scale.
Second order statistics (or co-occurrence matrices)

The intensity histogram is very limited in describing a texture (e.g.
a checkerboard versus separate white and black regions).
Use higher-level statistics: pairs distribution.

Example: co-occurrence matrix of I(x,y) and I(x+1,y).

Image I:
0 0 1 1
0 0 1 1
0 2 2 2
2 2 3 3

Co-occurrence matrix (row: value at (x,y); column: value at (x+1,y)):
    0 1 2 3
0   2 2 1 0
1   0 2 0 0
2   0 0 3 1
3   0 0 0 1

Normalize the matrix to get probabilities.

From this matrix, generate a list of features:

Entropy (can also be used as a measure for textureness).
Homogeneity: $\sum_{i,j} \frac{N(i,j)}{1 + |i - j|}$
Co-occurrence Matrix Features

A co-occurrence matrix is a 2D array C in which both the rows and columns
represent a set of possible image values V.
For gray-tone images, V is the set of possible gray-tones (1D)
For color images, V is the set of possible color values (3D)
C(i,j) indicates how many times value i co-occurs with value j in a particular
spatial relationship d.
The spatial relationship is specified by a vector d = (dr,dc).
dr: a displacement in rows (downward)
dc: a displacement in columns (to the right)
The gray-tone co-occurrence matrix Cd for image I is defined by

$C_d(i,j) = |\{(r,c) \mid I(r,c) = i \text{ and } I(r+dr,\, c+dc) = j\}|$
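
A direct sketch of the definition: count the pixel pairs (r, c) and (r + dr, c + dc) with values i and j, then divide by the total count to obtain probabilities. The function names and the two small test images are illustrative.

```python
import numpy as np


def cooccurrence_matrix(image, d, levels):
    """C_d(i, j): number of pixel pairs with I(r, c) = i and I(r + dr, c + dc) = j."""
    image = np.asarray(image)
    dr, dc = d
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts


def normalize(counts):
    """N_d(i, j) = C_d(i, j) / sum of all entries."""
    return counts / counts.sum()


# 4x4 gray-tone image, horizontal neighbour d = (0, 1).
gray = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 2, 2, 2],
                 [2, 2, 3, 3]])
c_horizontal = cooccurrence_matrix(gray, d=(0, 1), levels=4)

# 6x6 binary patch, diagonal neighbour d = (1, 1).
patch = np.array([[0, 1, 0, 0, 1, 0],
                  [1, 1, 0, 1, 1, 0]] * 3)
c_diagonal = cooccurrence_matrix(patch, d=(1, 1), levels=2)
n_diagonal = normalize(c_diagonal)
```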
3/30/2017 Automotive Sensor Systems 128
Co-occurrence Matrix Features Cont
Example: Three different co-occurrence matrix for a gray-tone image

4x4 image I and three different co-occurrence matrices for I: C(0;1), C(1;0), and C(1;1).

3/30/2017 Automotive Sensor Systems 129

Normalized Co-occurrence Matrix
The normalized gray-tone co-occurrence matrix Nd is defined by:

$N_d(i,j) = \frac{C_d(i,j)}{\sum_i \sum_j C_d(i,j)}$

which normalizes the co-occurrence values to lie between zero and one and
allows them to be thought of as probabilities in a large matrix.

3/30/2017 Automotive Sensor Systems 130

Normalized Co-occurrence Matrix
Image patch:
0 1 0 0 1 0
1 1 0 1 1 0
0 1 0 0 1 0
1 1 0 1 1 0
0 1 0 0 1 0
1 1 0 1 1 0

Displacement vector: d = (1,1)

Cd(i,j):
     0   1
0    2   9
1   10   4

Denominator = 2 + 9 + 10 + 4 = 25

$N_d(i,j) = \frac{1}{25}\begin{bmatrix} 2 & 9 \\ 10 & 4 \end{bmatrix}$

3/30/2017 Automotive Sensor Systems 131

Normalized Co-occurrence Matrix

Co-occurrence matrices capture properties of a texture, but they are not

directly useful for further analysis, such as comparing two textures.

Instead, numeric features are computed from the co-occurrence matrix that
can be used to represent the texture more compactly.

3/30/2017 Automotive Sensor Systems 132

Normalized Co-occurrence Matrix
These are standard features derivable
from a normalized co-occurrence matrix.

Where μi, μj are the means and σi, σj
are the standard deviations of the row
and column sums Nd(i) and Nd(j), defined by:

$N_d(i) = \sum_j N_d(i,j), \qquad N_d(j) = \sum_i N_d(i,j)$

3/30/2017 Automotive Sensor Systems 133

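Determinant: A must be square

$\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{21}a_{12}$

$\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = a_{11}\det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} - a_{12}\det\begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} + a_{13}\det\begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}$

Example: $\det\begin{bmatrix} 2 & 5 \\ 3 & 1 \end{bmatrix} = 2 - 15 = -13$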

3/30/2017 160
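Inverse: A must be square

$A_{n \times n} A^{-1}_{n \times n} = A^{-1}_{n \times n} A_{n \times n} = I$

$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^{-1} = \frac{1}{a_{11}a_{22} - a_{21}a_{12}}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$

Example: $\begin{bmatrix} 6 & 2 \\ 1 & 5 \end{bmatrix}^{-1} = \frac{1}{28}\begin{bmatrix} 5 & -2 \\ -1 & 6 \end{bmatrix}$

Check: $\begin{bmatrix} 6 & 2 \\ 1 & 5 \end{bmatrix} \cdot \frac{1}{28}\begin{bmatrix} 5 & -2 \\ -1 & 6 \end{bmatrix} = \frac{1}{28}\begin{bmatrix} 28 & 0 \\ 0 & 28 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$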
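
The 2x2 determinant and inverse formulas can be checked numerically. The helper names below are illustrative; the matrices are the ones used in the worked examples (determinants −13 and 28).

```python
import numpy as np


def det2(a):
    """Determinant of a 2x2 matrix: a11 a22 - a21 a12."""
    return a[0, 0] * a[1, 1] - a[1, 0] * a[0, 1]


def inverse2(a):
    """Inverse of a 2x2 matrix from the closed-form formula."""
    return np.array([[a[1, 1], -a[0, 1]],
                     [-a[1, 0], a[0, 0]]]) / det2(a)


m = np.array([[6.0, 2.0], [1.0, 5.0]])
m_inv = inverse2(m)
identity_check = m @ m_inv
det_example = det2(np.array([[2.0, 5.0], [3.0, 1.0]]))
```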

3/30/2017 161
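2D Vector

$v = (x_1, x_2)$

Magnitude: $\|v\| = \sqrt{x_1^2 + x_2^2}$

If $\|v\| = 1$, v is a UNIT vector

$\frac{v}{\|v\|} = \left(\frac{x_1}{\|v\|}, \frac{x_2}{\|v\|}\right)$ is a unit vector

Orientation: $\theta = \tan^{-1}\left(\frac{x_2}{x_1}\right)$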

3/30/2017 162
Vector Addition

$v + w = (x_1, x_2) + (y_1, y_2) = (x_1 + y_1,\; x_2 + y_2)$


3/30/2017 163
Vector Subtraction

v w ( x1 , x2 ) ( y1 , y2 ) ( x1 y1 , x2 y2 )


3/30/2017 164
Scalar Product

av a( x1 , x2 ) (ax1 , ax2 )


3/30/2017 165
Inner (dot) Product

w v.w ( x1 , x2 ).( y1 , y2 ) x1 y1 x2 . y2

The inner product is a SCALAR!

v.w ( x1 , x2 ).( y1 , y2 ) || v || || w || cos

v.w 0 v w

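Orthonormal Basis
$i = (1, 0)$, $\|i\| = 1$
$j = (0, 1)$, $\|j\| = 1$
$i \cdot j = 0$

$v = (x_1, x_2) = x_1 i + x_2 j$

$v \cdot i = (x_1 i + x_2 j) \cdot i = x_1 \cdot 1 + x_2 \cdot 0 = x_1$

$v \cdot j = (x_1 i + x_2 j) \cdot j = x_1 \cdot 0 + x_2 \cdot 1 = x_2$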
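Vector (cross) Product
$u = v \times w$
The cross product is a VECTOR!

Magnitude: $\|u\| = \|v \times w\| = \|v\|\,\|w\| \sin\alpha$

Orientation:
$u \perp v \Rightarrow u \cdot v = (v \times w) \cdot v = 0$
$u \perp w \Rightarrow u \cdot w = (v \times w) \cdot w = 0$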

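Vector Product Computation
$i = (1, 0, 0)$, $\|i\| = 1$
$j = (0, 1, 0)$, $\|j\| = 1$
$k = (0, 0, 1)$, $\|k\| = 1$
$i \cdot j = i \cdot k = j \cdot k = 0$

$u = v \times w = (x_1, x_2, x_3) \times (y_1, y_2, y_3) = \begin{vmatrix} i & j & k \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}$
$= (x_2 y_3 - x_3 y_2)\,i + (x_3 y_1 - x_1 y_3)\,j + (x_1 y_2 - x_2 y_1)\,k$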
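As a quick numerical illustration of the inner and cross products, the sketch below uses NumPy with two arbitrary example vectors (the values are not from the slides).

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

dot = np.dot(v, w)      # scalar: x1*y1 + x2*y2 + x3*y3
u = np.cross(v, w)      # vector: (x2*y3 - x3*y2, x3*y1 - x1*y3, x1*y2 - x2*y1)

# Angle between v and w recovered from the inner product.
cos_angle = dot / (np.linalg.norm(v) * np.linalg.norm(w))
sin_angle = np.sqrt(1.0 - cos_angle ** 2)
```

The cross product is perpendicular to both inputs, and its length equals the product of the two norms times the sine of the angle between them.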
Coordinate Systems
World coords. (Xw, Yw, Zw) -> Camera coords. (x, y, z) -> Film coords. (x, y) -> Image coords. (u, v)

Rigid transformation (world to camera): rotation & translation

Changing a coordinate system is equivalent to applying the inverse
transformation to the point coordinates.

Reverse Rotations
Q: How do you undo a rotation of $R(\theta)$?
A: Apply the inverse of the rotation: $R^{-1}(\theta) = R(-\theta)$

How to construct $R^{-1}(\theta) = R(-\theta)$:

Inside the rotation matrix: $\cos(\theta) = \cos(-\theta)$
The cosine elements of the inverse rotation matrix are unchanged
The sign of the sine elements will flip, since $\sin(-\theta) = -\sin(\theta)$
Therefore $R^{-1}(\theta) = R(-\theta) = R^T(\theta)$
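The identity can be checked numerically. The sketch below uses a 2D rotation for brevity; the helper `rot2d` and the 30-degree angle are illustrative choices, not part of the slides.

```python
import numpy as np

def rot2d(theta):
    """2D rotation matrix R(theta), theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

theta = np.deg2rad(30.0)
R = rot2d(theta)
R_back = rot2d(-theta)            # R(-theta)

p = np.array([2.0, 1.0])
p_restored = R_back @ (R @ p)     # rotating and then undoing returns p
```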

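3D Rotation of Coordinate Systems
Rotation around the coordinate axes, clockwise:

$R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}$

$R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}$

$R_z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$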

3D Translation of Coordinate Systems

Translate by a vector $t = (t_x, t_y, t_z)^T$:

$T = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$


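[Figure: frames W and C; labels zc, 4, 10, 6]

Translate W to C:

$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$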

Relationship in Perspective Projection

World to camera:

Camera: $P = (X, Y, Z)^T$
World: $P_w = (X_w, Y_w, Z_w)^T$
Transform: $R, T$

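Relationship in Perspective Projection

World to camera:

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T$   (1)

$R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}$, $T = \begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix}$

Eq. (1) can be rewritten as:

$X = r_{11}X_w + r_{12}Y_w + r_{13}Z_w + T_x$
$Y = r_{21}X_w + r_{22}Y_w + r_{23}Z_w + T_y$   (2)
$Z = r_{31}X_w + r_{32}Y_w + r_{33}Z_w + T_z$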

Relationship in Perspective Projection

Camera to image:

Camera: $P = (X, Y, Z)^T$, Image: $p = (x, y)^T$

$x = f\frac{X}{Z}$, $y = f\frac{Y}{Z}$

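Relationship in Perspective Projection

World to frame:

$-(x_{im} - o_x)\,s_x = x = f\frac{X}{Z}$

$-(y_{im} - o_y)\,s_y = y = f\frac{Y}{Z}$

Replace X, Y, and Z from eq. (2):

$x_{im} - o_x = -\frac{f}{s_x}\,\frac{r_{11}X_w + r_{12}Y_w + r_{13}Z_w + T_x}{r_{31}X_w + r_{32}Y_w + r_{33}Z_w + T_z}$

$y_{im} - o_y = -\frac{f}{s_y}\,\frac{r_{21}X_w + r_{22}Y_w + r_{23}Z_w + T_y}{r_{31}X_w + r_{32}Y_w + r_{33}Z_w + T_z}$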
Relationship in Perspective Projection

World to frame:
If we let $f_x = f/s_x$ and $\alpha = s_y/s_x$, we now have 4 independent intrinsic parameters: $o_x$, $o_y$, $f_x$, and $\alpha$
$f_x$ - Focal length expressed in the effective horizontal pixel size
$\alpha$ - Aspect ratio: pixel deformation introduced by the acquisition process.
Thus, with $f_y = f_x/\alpha$, we have:

$x_{im} - o_x = -f_x\,\frac{r_{11}X_w + r_{12}Y_w + r_{13}Z_w + T_x}{r_{31}X_w + r_{32}Y_w + r_{33}Z_w + T_z}$   (6)

$y_{im} - o_y = -f_y\,\frac{r_{21}X_w + r_{22}Y_w + r_{23}Z_w + T_y}{r_{31}X_w + r_{32}Y_w + r_{33}Z_w + T_z}$   (7)

Relationship in Perspective Projection


Why are we doing the manipulation stated in (6) and (7)?

Of the three coordinate systems (world, camera, and image), which one
can't be accessed?

Relationship in Perspective Projection


The camera coordinate system can't be accessed.

We have eliminated the camera coordinates from the relationships and directly linked the world coordinates $(X_w, Y_w, Z_w)$ with the image coordinates $(x_{im}, y_{im})$.

This suggests that, given a sufficient number of pairs of 3-D world points and their corresponding image points, we can try to solve (6) and (7) for the unknown parameters.
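To make the world-to-image relationship concrete, here is a minimal sketch of equations (6) and (7). It assumes the convention in which the image axes are flipped relative to the camera axes (a minus sign in front of f_x and f_y); the function name `project` and all numeric values are hypothetical.

```python
import numpy as np

def project(Pw, R, T, fx, alpha, ox, oy):
    """Map a 3-D world point to pixel coordinates following eqs. (6) and (7)."""
    X, Y, Z = R @ Pw + T        # camera coordinates, eq. (2)
    fy = fx / alpha             # aspect ratio alpha = s_y / s_x
    x_im = ox - fx * X / Z      # eq. (6), assumed minus-sign convention
    y_im = oy - fy * Y / Z      # eq. (7)
    return np.array([x_im, y_im])

# Hypothetical camera: axes aligned with the world frame, translated 5 units along Z.
R = np.eye(3)
T = np.array([0.0, 0.0, 5.0])
pixel = project(np.array([1.0, 2.0, 0.0]), R, T, fx=500.0, alpha=1.0, ox=320.0, oy=240.0)
on_axis = project(np.array([0.0, 0.0, 0.0]), R, T, fx=500.0, alpha=1.0, ox=320.0, oy=240.0)
```

A point on the optical axis lands on the image center (o_x, o_y), which is a quick sanity check for any chosen sign convention.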

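Pinhole Camera Model
(World Coordinates)
[Figure: pinhole camera geometry relating the world frame, the camera frame with optical center O and focal length f, and the image point p]

$P = [R \mid T]\,P_w = M_{ext} P_w$
$p = M_{int} P = M_{int} M_{ext} P_w$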
Camera Model Summary
Geometric Projection of a Camera
Pinhole camera model
Perspective projection
Weak-Perspective Projection

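Camera Model Summary
Camera Parameters
Extrinsic parameters (R, T): 6 DOF (degrees of freedom)
Intrinsic parameters: $f$, $o_x$, $o_y$, $s_x$, $s_y$

$P = [R \mid T]\,P_w = M_{ext} P_w$
$p = M_{int} P = M_{int} M_{ext} P_w$
$p = M P_w$, where $M = M_{int} M_{ext}$ is $3 \times 4$ and $M_{ext}$ has 6 DOF

$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = M_{int} M_{ext} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$

$M_{int} = \begin{bmatrix} -f/s_x & 0 & o_x \\ 0 & -f/s_y & o_y \\ 0 & 0 & 1 \end{bmatrix}$, $M_{ext} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & T_x \\ r_{21} & r_{22} & r_{23} & T_y \\ r_{31} & r_{32} & r_{33} & T_z \end{bmatrix}$

$x_{im} = x_1 / x_3$, $y_{im} = x_2 / x_3$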
The Calibration Problem

Direct parameter Calibration Summary
Algorithm (p130-131)
1. Measure N 3D coordinates (Xi, Yi, Zi)
2. Locate their corresponding image points (xi, yi) - Edge, Corner, Hough
3. Build matrix A of a homogeneous system Av = 0
4. Compute SVD of A, solution v
5. Determine aspect ratio $\alpha$ and scale $|\gamma|$
6. Recover the first two rows of R and the first two components of T up to a sign
7. Determine sign s of $\gamma$ by checking the projection equation
8. Compute the 3rd row of R by vector product, and enforce orthogonality constraint by SVD
9. Solve Tz and fx using Least Squares and SVD, then $f_y = f_x / \alpha$
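Steps 3 to 5 can be sketched as follows. This is an illustration under assumptions, not the slides' own code: dividing (6) by (7) gives x (r21 Xw + r22 Yw + r23 Zw + Ty) = y alpha (r11 Xw + r12 Yw + r13 Zw + Tx) for centered image coordinates x = x_im - o_x, y = y_im - o_y, which is linear in the unknown vector v = (r21, r22, r23, Ty, alpha r11, alpha r12, alpha r13, alpha Tx). The function name `solve_homogeneous` and the synthetic camera values are hypothetical.

```python
import numpy as np

def solve_homogeneous(world_pts, image_pts):
    """Steps 3-4: build A from N correspondences and solve A v = 0 by SVD.

    image_pts are assumed to be already centered, i.e. (x_im - o_x, y_im - o_y).
    """
    X, Y, Z = world_pts.T
    x, y = image_pts.T
    A = np.column_stack([x * X, x * Y, x * Z, x, -y * X, -y * Y, -y * Z, -y])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]    # right singular vector of the smallest singular value

# Synthetic check with a hypothetical camera (all values are illustrative).
rng = np.random.default_rng(0)
a = np.deg2rad(20.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
T = np.array([0.5, -0.3, 10.0])
fx, alpha = 800.0, 1.1
world = rng.uniform(-1.0, 1.0, size=(20, 3))     # non-coplanar 3D points
cam = world @ R.T + T
image = np.column_stack([-fx * cam[:, 0] / cam[:, 2],
                         -(fx / alpha) * cam[:, 1] / cam[:, 2]])

v = solve_homogeneous(world, image)
expected = np.concatenate([R[1], [T[1]], alpha * R[0], [alpha * T[0]]])
cosine = abs(v @ expected) / (np.linalg.norm(v) * np.linalg.norm(expected))

# Step 5: the rows of R have unit length, so the norm ratio gives the aspect ratio.
alpha_est = np.linalg.norm(v[4:7]) / np.linalg.norm(v[0:3])
```

The solution is only defined up to scale and sign, which is why the later steps determine the scale and then fix the sign from the projection equation.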
The Calibration Problem
Step 2: Estimate ox and oy
The computation of ox and oy will be based on the
following theorem:
Orthocenter Theorem: Let T be the triangle on the image
plane defined by the three vanishing points of three
mutually orthogonal sets of parallel lines in space. The
image center (ox , oy) is the orthocenter of T.

We can use the same calibration pattern to compute three vanishing points (use three pairs of parallel lines defined by the sides of the planes).

Note 1: it is important that the calibration pattern is imaged from a viewpoint guaranteeing that none of the three mutually orthogonal directions will be near parallel to the image plane!
Note 2: to improve the accuracy of the image center computation, it is a good idea to estimate the center using several views of the calibration pattern and average the results.
Estimating the Image Center
Vanishing points:
Due to perspective, all parallel lines in 3D space appear to meet in a
point on the image - the vanishing point, which is the common
intersection of all the image lines
Orthocenter Theorem: Estimating the Image Center

Input: three mutually orthogonal sets of parallel lines in an image
T: a triangle on the image plane defined by the three vanishing points
Image center = orthocenter of triangle T
Orthocenter of a triangle is the common intersection of the three altitudes
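Given the three vanishing points, the orthocenter follows from two altitude equations. The sketch below is one possible way to compute it; the function name `orthocenter` and the example coordinates are hypothetical.

```python
import numpy as np

def orthocenter(a, b, c):
    """Orthocenter h of the triangle with 2-D vertices a, b, c."""
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    # Altitude through a is perpendicular to side bc: (h - a) . (c - b) = 0
    # Altitude through b is perpendicular to side ac: (h - b) . (c - a) = 0
    M = np.array([c - b, c - a])
    rhs = np.array([a @ (c - b), b @ (c - a)])
    return np.linalg.solve(M, rhs)

# Hypothetical vanishing points in image coordinates.
center = orthocenter((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
```

Only two altitudes are needed because the third passes through the same point; the linear system becomes singular when the three points are collinear.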





Guidelines for Calibration
Pick a well-known technique, or a few
Design and construct calibration patterns (with known 3D coordinates)
Decide which parameters you want to find for your camera
Run algorithms on ideal simulated data
    You can use either the data of the real calibration pattern or computer-generated data
    Define a virtual camera with known intrinsic and extrinsic parameters
    Generate 2D points from the 3D data using the virtual camera
    Run algorithms on the 2D-3D data set
Add noise to the simulated data to test the robustness
Run algorithms on the real data (images of calibration target)
If successful, you are all set
Otherwise:
    Check how you select the distribution of control points
    Check the accuracy in 3D and 2D localization
    Check the robustness of your algorithms again
    Develop your own algorithms: NEW METHODS?
Finding the disparity map
Left image $I_l$
Right image $I_r$
Parameters that must be chosen:
Correlation window size $2W+1$
Search window size
Similarity measure $\Psi$
Let $p_l$ and $p_r$ be pixels on $I_l$ and $I_r$
Let $R(p_l)$ be the search window on $I_r$ associated with $p_l$
Let $d$ be the displacement between $p_l$ and a point in $R(p_l)$

For each pixel $p_l = [i, j]$ in $I_l$ do:

For each displacement $d = [d_1, d_2]$ in $R(p_l)$ compute:

$C(d) = \sum_{l=-W}^{W} \sum_{k=-W}^{W} \Psi\big(I_l(i+k, j+l),\ I_r(i+k-d_1, j+l-d_2)\big)$

The disparity at $p_l$ is the vector $d$ with the best $C(d)$ over $R(p_l)$ (max. $C_{fg}$, or min. for a dissimilarity measure)

Output the disparity for each pixel $p_l$
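The loop above can be sketched for a single pixel as follows. The choice of the sum of squared differences (SSD) as the measure, the square search range `max_d`, and the function name `disparity_at` are assumptions made for illustration; with SSD the best displacement is the one with the minimum cost.

```python
import numpy as np

def disparity_at(left, right, i, j, W=2, max_d=4):
    """Displacement d = (d1, d2) minimising the SSD cost C(d) at pixel (i, j).

    The (2W+1) x (2W+1) window around (i, j) must lie inside the left image.
    """
    size = 2 * W + 1
    patch_l = left[i - W:i + W + 1, j - W:j + W + 1].astype(float)
    best_d, best_cost = None, np.inf
    for d1 in range(-max_d, max_d + 1):
        for d2 in range(-max_d, max_d + 1):
            r0, c0 = i - d1 - W, j - d2 - W      # top-left corner of the shifted window
            if r0 < 0 or c0 < 0 or r0 + size > right.shape[0] or c0 + size > right.shape[1]:
                continue                         # window falls outside the right image
            patch_r = right[r0:r0 + size, c0:c0 + size].astype(float)
            cost = np.sum((patch_l - patch_r) ** 2)
            if cost < best_cost:
                best_cost, best_d = cost, (d1, d2)
    return best_d

# Synthetic pair: the right image is the left image shifted 3 columns to the left.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(30, 40))
right = np.roll(left, -3, axis=1)
d = disparity_at(left, right, 15, 20)
```

Running the function over every interior pixel yields the disparity map; practical implementations vectorise this loop and usually restrict the search to the epipolar line.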

Haar wavelet to Haar-like features

A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in each region and calculates the difference between these sums.

[Figure: examples of edge features, line features, and center features]

3/30/2017 Automotive Sensor Systems: Guest Lectures 202

Haar-like feature Application
A Haar-like feature considers adjacent rectangular regions at a specific location in a detection
window, sums up the pixel intensities in each region and calculates the difference between
these sums.

This difference is then used to categorize subsections of an image.
For example, let us say we have an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks.
Therefore a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region.
The position of these rectangles is defined relative to a detection window that acts like a bounding box to the target object (the face in this case).


Haar-like feature Application: Viola-Jones Face Detector

A window of the target size is moved over the input image, and for each subsection of the image the Haar-like feature is calculated.

This difference is then compared to a learned threshold that separates non-objects from objects.


Face Detection: Viola and Jones face detector

Source: Face detection, lecture slides, Prof. K.H. Wong

Pixel values inside the areas (just for example):
White area (+1):
216 102 78
129 210 111
Shaded area (-1):
10 20 4
7 45 7

Feature: f = (sum of pixels in white area) - (sum of pixels in shaded area)

White area sum = 216+102+78+129+210+111 = 846
Shaded area sum = 10+20+4+7+45+7 = 93
Feat_val = f = 846 - 93 = 753

Algorithm:
if f > threshold
    feature = +1;
else
    feature = -1;
end if;
If threshold = 700, then feature = +1.
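The worked example can be reproduced in a few lines. This is a minimal sketch; the function name `two_rect_feature` is hypothetical, and the two arrays hold the example pixel values of the white and shaded areas.

```python
import numpy as np

def two_rect_feature(white, shaded, threshold):
    """Return (f, label): f = sum(white) - sum(shaded); label is +1 if f > threshold, else -1."""
    f = int(np.sum(white)) - int(np.sum(shaded))
    return f, (1 if f > threshold else -1)

white = np.array([[216, 102, 78], [129, 210, 111]])
shaded = np.array([[10, 20, 4], [7, 45, 7]])
f, label = two_rect_feature(white, shaded, threshold=700)
```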
Face Detection: Viola - Jones face detector

Rectangle feature, f

Face detection algorithm:

If the calculated feature f is large, then it is a face, i.e.
if f > threshold, then
    face
else
    non-face

This is not a face: the calculated feature f < threshold.
This is a face: the eye area (shaded area) is dark and the nose area (white area) is bright, so f is large and f > threshold.
Viola-Jones Face Detector

Is it good enough to justify the detection?

A Haar-like feature is only a weak learner or classifier (its detection quality is slightly better than random guessing), so a large number of Haar-like features are necessary to describe an object with sufficient accuracy.


Viola-Jones Face Detector

Source: Face detection, lecture slides, Prof. K.H. Wong

You may consider these features as facial features, for example the left eye, the nose, and the mouth, each built from rectangles labelled A, B, C, ... in the figure.

They can have different sizes, polarities, orientations, and aspect ratios.
Feature Extraction & Matching
Texture-based Face Recognition: Local Binary Pattern (LBP)
LBP features are usually obtained from image pixels of a 3×3 neighbourhood region.

The basic LBP operator compares the 8 neighbouring pixel intensity values to the intensity value of the central pixel in the region and represents the result as an 8-bit binary code.
Feature Extraction & Matching
Texture-based Face Recognition: Local Binary Pattern (LBP)
LBP features = image pixels of a 3×3 neighbourhood region.
Basic LBP operator compares the 8 neighbouring pixels to the central pixel and represents the result as an 8-bit binary code.
LBP value of pixel $(x_c, y_c)$: $LBP(x_c, y_c) = \sum_{n=0}^{7} s(i_n - i_c)\,2^n$, where $s(x) = 1$ if $x \ge 0$ and $s(x) = 0$ if $x < 0$.
$i_n$ - neighbouring pixel intensity value, $i_c$ - central pixel intensity value; the sum runs over the 8 neighbours.
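A minimal sketch of the basic operator on a single 3×3 patch follows. The clockwise neighbour order starting at the top-left pixel is an assumption (the slides do not preserve the bit order), and the names `OFFSETS` and `lbp_code` are hypothetical.

```python
import numpy as np

# Neighbour offsets (row, col), clockwise from the top-left pixel; this bit order is an assumption.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(patch):
    """Basic LBP code of the centre pixel of a 3x3 patch."""
    patch = np.asarray(patch)
    centre = patch[1, 1]
    code = 0
    for n, (dr, dc) in enumerate(OFFSETS):
        if patch[1 + dr, 1 + dc] >= centre:   # s(i_n - i_c) = 1 when i_n >= i_c
            code += 2 ** n
    return code

code = lbp_code([[6, 5, 2],
                 [7, 6, 1],
                 [9, 8, 7]])
```

A different starting neighbour or direction only permutes the bits, so the codes differ but the resulting histograms carry the same information as long as one convention is used throughout.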


Feature Extraction & Matching
Texture-based Face Recognition: Multi-scale LBP (MLBP)
Extension of the basic LBP. It introduces a radius parameter R, which means that the
compared neighbours are R pixels away from the center pixel. There is also another
parameter P, which is the number of sampling points along the circle of radius R.

Example - Multi-scale LBP calculation.

R and P represent the distance of the sampling points from the center pixel
and the number of the sampling points to be used, respectively.
Feature Extraction & Matching
Texture-based Face Recognition: Uniform LBP
The LBP algorithm was further modified to deal with textures at different scales and to use neighbourhoods of different sizes.

Uniform LBP: A local binary pattern is classified as uniform if the binary pattern contains at most two bitwise transitions from 0 to 1 or vice versa, when the bit pattern is traversed circularly, either clockwise or anti-clockwise.

For instance, examine the following patterns:

00000000 - 0 transitions (uniform)
01110000 and 11001111 - 2 transitions (uniform)
11001001 - 4 transitions (not uniform)
01010010 - 6 transitions (not uniform)

By convention, uniform LBP is denoted $LBP^{u2}_{P,R}$, where the subscript represents the neighbourhood ($P$ sampling points with radius $R$) and the superscript $u2$ denotes the uniform pattern.
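The transition counts listed above can be verified with a short sketch. The helper names `transitions` and `is_uniform` are hypothetical; the bit string is treated as circular, so the last bit is also compared with the first.

```python
def transitions(bits):
    """Number of 0/1 transitions in a circular bit string such as '01110000'."""
    return sum(bits[k] != bits[(k + 1) % len(bits)] for k in range(len(bits)))

def is_uniform(bits):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(bits) <= 2

examples = {b: transitions(b)
            for b in ("00000000", "01110000", "11001111", "11001001", "01010010")}

# Count the uniform patterns among all 256 possible 8-bit codes.
n_uniform = sum(is_uniform(format(value, "08b")) for value in range(256))
```

For 8 sampling points there are 58 uniform patterns; a histogram with one extra bin for all non-uniform patterns therefore has 59 bins.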


Feature Extraction & Matching

Texture-based Face Recognition: $LBP^{u2}_{P,R}$, cont.
Uniform LBP determines only important local textures, such as ends of lines, edges, angles, and spots.

Example: detectable primitive textures of Uniform LBP

Spot, Spot/Flat, Line End, Edge, Corner


Feature Extraction & Matching
Texture-based Face Recognition: LBP coding Histogram

Example: $LBP^{u2}_{8,2}$ histogram
The input image is 60×60 pixels; it is divided into 6×6 = 36 regions with a window size of 10×10.
Thus, a (6×6)×59 = 1×2124 vector represents the histogram values of all the labels in the sub-images, and this vector contains all the useful information in the image.


Video representations are often the most efficient way to represent information

Video signals can effectively tell a temporally evolving story

Arise in cinema (image sequences)
Arise in surveillance
Arise in medical applications

03/23/2017 Automotive Sensor Systems 215

What is the Video Signal?

A video signal is basically any sequence of time-varying images.

A still image is a spatial distribution of intensities that remains constant with time, whereas a time-varying sequence has a spatial intensity distribution that varies with time.
A video signal is treated as a series of images called frames. An illusion of continuous video is obtained by changing the frames rapidly; the rate at which they change is generally termed the frame rate.


Motivation Behind Video Processing
Video Retrieval: searching for digital videos in large databases. The search analyzes the actual content of the video. The term content might refer to colours, shapes, or textures.

Video Surveillance: monitoring of the behaviour, activities, or other changing information, usually of people, for the purpose of influencing, managing, directing, or protecting them.

Human-computer interaction: designing of computer technology, focused on the interfaces between users and computers.


What is Action Recognition?
Perform some appropriate processing on a video, and output the action label.

Level of semantics:
action: walking, pointing, etc.
activity: watching TV, drinking tea, etc.
event: a volleyball game, a party, etc.

Huge amount of video is available and growing.

Human actions are major events in movies, TV news, and personal video.
300 hours every minute
Why is Action Recognition Challenging?
Different scales
    People may appear at different scales in different videos, while performing the same action.
Movement of the camera
    The camera may be handheld, and the person holding it can cause it to shake.
    The camera may be mounted on something that moves.
Movement with the camera
    The person performing an action (e.g., playing soccer) may be moving with the camera at a similar speed.


Occlusion
    The person performing an action may be occluded by another object, so the action may not be fully visible.
Background clutter
    Other people may be present in the video frames while we aim to recognize the action of a specific human.
Human variation
    Humans are of different sizes and shapes.
Action variation
    Different people perform the same action in different ways (e.g., walking can be slow or fast).
General Action Recognition Pipeline

Shape Features, Local Features, Motion Features, ... -> Bag of Features, Fisher Vector, ... -> Support Vector Machine, Extreme Learning Machine, ...

Bag of Features

[Figure: Object -> Bag of Features; Input Video -> Collection of space-time patches -> HOG patch descriptors -> Histogram of visual words]

Training of the model: feature detection -> codewords dictionary -> image representation -> category models (and/or) classifiers
Recognition: feature detection -> image representation -> category decision
Histogram of Oriented Gradients (HOG)
Gradient computation -> Orientation binning -> Descriptor blocks -> Block normalization

Gradient computation:
$G_x = I * [-1\ 0\ 1]$
$G_y = I * [-1\ 0\ 1]^T$

Magnitude: $|G| = \sqrt{G_x^2 + G_y^2}$

Angle: $\theta = \arctan\left(\frac{G_y}{G_x}\right)$

[Figure: example grids of pixel intensities with the resulting Gx, Gy, and |G| values]


HOG within a Block of Image

Computing the histogram of gradients based on orientation (using unsigned orientation, 0-180°, and 9 bins)

Magnitude: $|G| = \sqrt{G_x^2 + G_y^2}$
56 97 66 101 42 20
46 117 74 206 92 88
64 179 128 132 172 99
116 121 41 89 75 77

Pixel intensities:
63 65 90 90 131 20
41 65 39 134 62 78
87 29 81 139 93 81
97 36 67 119 121 102

Orientation: $\theta = \arctan\left(\frac{G_y}{G_x}\right)$
66 53 20 120 124 84
48 94 90 15 149 146
16 18 105 139 103 45
1 18 180 164 82 2
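The gradient and binning steps can be sketched as below. The sketch makes simplifying assumptions that are not in the slides: border pixels get a zero gradient, and each pixel votes with its full magnitude into a single 20-degree bin (no interpolation between bins). The function name `hog_cell_histogram` is hypothetical.

```python
import numpy as np

def hog_cell_histogram(image, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees)."""
    I = np.asarray(image, dtype=float)
    gx = np.zeros_like(I)
    gy = np.zeros_like(I)
    gx[:, 1:-1] = I[:, 2:] - I[:, :-2]     # [-1 0 1] applied along each row
    gy[1:-1, :] = I[2:, :] - I[:-2, :]     # [-1 0 1]^T applied along each column
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    return hist

ramp = np.tile(np.arange(6) * 10.0, (4, 1))    # intensity increases left to right
hist_horizontal = hog_cell_histogram(ramp)      # all votes fall in the first bin (0-20 degrees)
hist_vertical = hog_cell_histogram(ramp.T)      # all votes fall in the middle bin (80-100 degrees)
```

A full HOG descriptor would additionally group cells into blocks and normalise each block, as the pipeline on the previous slide indicates.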


What is 3D Digital Imaging?


3D Digital Imaging is:

entry point of reality into the virtual world


3D sensing techniques
Range Sensors
Acquire images encoding shape directly

Range Image
A special class of digital images. Each pixel of a range image expresses the distance between a known reference frame and a visible point in the scene.

Reproduces the 3-D structure of a scene.


3D sensing techniques
advantage: passive and low cost
disadvantage: insufficient measurements and correspondence problem

advantage: fast
disadvantage: high cost

Structured light
advantage: simplicity and low cost
disadvantage: specular reflection


Active vs. Passive Range Sensors

Active range sensors project energy (e.g., a pattern of light, sonar pulses) on the scene and detect its position to perform the measurement, or exploit the effect of controlled changes of some sensor parameters (e.g., focus).

Passive range sensors rely on intensity images to reconstruct depth (e.g., stereopsis).


Active sensors
Advantages:
get a large number of 3D data points automatically, without texture or features present
3D points are obtained fast using sensor hardware, without any required software at the user end
work in all kinds of lighting conditions
densely sampled 3D data makes it easier to find the topology of the model (connectivity of the points)

Disadvantages:
sensors are expensive
not always eye safe
get a lot of 3D data points, which must be processed


Passive Techniques
Advantages:
Simple image acquisition, even more so with the arrival of digital cameras
Scale independence: the same camera for small and large objects

Disadvantage:
Correspondence between points on 2D images



Based on trigonometry:
When a base and two angles are known,
the 3rd point can be calculated.


Specification for 3D Scanner
Workspace: the volume of space in which range data can be collected.

Stand-off distance: the approximate distance between the sensor and the workspace.

Depth of field: the depth of the workspace (along the sensor's Z axis).

Accuracy: statistical variations of repeated measurements of a known true value.

Resolution or precision: the smallest change in range that the sensor can measure.

Speed: the number of range points measured per second.

Size and weight.

Structured Light Range Scanner

[Figure: scanner setup with CCD camera, laser projector, laser stripe, and translation stage]


Structured light
[Figure: side view of the 3-D world with the laser projector and object height H]

Improved version of the single-point triangulation

Emit some pattern of light, e.g., a line, spots, or a grid
Observe from above: if the light hits an object, the displacement of the pattern is proportional to the distance of the object
The whole 3-D volume can be detected by moving a plane of light over the volume.

[Figure: side view and top view of the 3-D world with the laser projector, and the 2-D image with the optical center]

$D_1 = \frac{D_2\,W}{f}$
$f$: focal length
$D_2$: laser stripe displacement in image
$W$: working distance

$H = D_1 \tan(\Phi)$
$H$: object height
$D_1$: laser stripe displacement
$\Phi$: incident angle
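The two relations can be chained to turn an image measurement into an object height. The sketch below reads the first relation as D1 = D2 * W / f (similar triangles), which is an interpretation of the slide; the function name `object_height` and the numbers are hypothetical.

```python
import math

def object_height(d2, working_distance, focal_length, incident_angle_deg):
    """Object height H from the stripe displacement D2 measured in the image."""
    d1 = d2 * working_distance / focal_length                # D1 = D2 * W / f
    return d1 * math.tan(math.radians(incident_angle_deg))   # H = D1 * tan(Phi)

# Hypothetical numbers, all lengths in metres.
H = object_height(d2=0.001, working_distance=1.0, focal_length=0.05, incident_angle_deg=45.0)
```

With these values a 1 mm stripe shift in the image corresponds to a 20 mm shift in the scene and, at a 45-degree incident angle, to an object height of about 20 mm.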
