
Object Detection: The Viola-Jones Face Detector

Augusto Morgan
Institute of Computing - University of Campinas
augusto.morgan@students.ic.unicamp.br

June 9, 2014


Overview

Object Detection

Viola-Jones Face Detector


Haar-like features and the integral image
AdaBoost
Cascade of Weak Classifiers

Haar-like Features Extended Set



Object Detection

How can we detect objects in an image?


We can use a classifier:
Given an image, is it the object we are looking for or not?
But what if the image contains many other objects?
We want to find where in the image the objects are.


Sliding Window

We can apply the classifier to small portions of the image!

We slice the image into small sub-windows and apply the classifier to each
one of them.
Problems?
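
One way to see the problem is to enumerate the sub-windows. The sketch below is only an illustration: the function name `sliding_windows`, the 24-pixel base window, the step of 1 pixel and the scale factor of 1.25 are assumptions chosen for the example, not values stated on this slide.

```python
def sliding_windows(width, height, window=24, step=1, scale_factor=1.25):
    """Yield (x, y, size) for every square sub-window, at every scale.

    All parameter defaults are illustrative assumptions.
    """
    size = window
    while size <= min(width, height):
        # Larger windows move in proportionally larger strides.
        stride = max(1, round(step * size / window))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size
        # Grow the window; always advance by at least one pixel.
        size = max(size + 1, int(size * scale_factor))


# Count how many times a classifier would run on one modest image.
print(sum(1 for _ in sliding_windows(384, 288)))
```

Even a modest image yields a very large number of sub-windows, and almost all of them contain no object, so the per-window cost of the classifier dominates the running time.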


Viola-Jones Real-Time Face Detector

Proposed in 2001 by Paul Viola and Michael Jones


It discards the great majority of negative sub-windows early, before
spending much processing time on them, which allows high frame rates.
How does it achieve that?


Haar wavelet function


The classifier used in the paper is based on Haar-like features.

Haar wavelet function:

\[
\psi(t) =
\begin{cases}
1 & 0 \le t < \tfrac{1}{2}, \\
-1 & \tfrac{1}{2} \le t < 1, \\
0 & \text{otherwise.}
\end{cases}
\]

Figure: Haar wavelet


Haar-like Features
Rectangles that produce a score from positive (white) and negative (black) areas.
Three kinds of features: 2, 3 and 4 rectangles.
Each feature is calculated by:

\[
f(i) = \sum I_{\text{White}} - \sum I_{\text{Black}}
\]

Problem: the number of Haar-like features is too large!
For a 24x24 pixel window there are more than 160,000 distinct
Haar-like features.

Note: this set is overcomplete, since the window has only 576 pixels.
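
The size of the feature set can be checked by brute-force counting. The sketch below assumes five base shapes (two-rectangle horizontal and vertical, three-rectangle horizontal and vertical, and four-rectangle), each stretched by integer multiples and placed at every position inside the window. The exact total depends on which shapes and scalings are counted, so treat the function `count_haar_features` and its shape list as an illustration.

```python
def count_haar_features(window=24):
    """Count upright Haar-like features that fit in a square window."""
    # Base shapes as (width, height) in cells; an assumed set of five types.
    base_shapes = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]
    total = 0
    for base_w, base_h in base_shapes:
        # Every integer multiple of the base shape that still fits.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                # Number of positions where a w-by-h feature fits.
                total += (window - w + 1) * (window - h + 1)
    return total


print(count_haar_features(24))  # 162336
```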


Figure: The different types of Haar-like features

The Integral Image


New intermediate representation of the image, similar to the Summed
Area Table used in computer graphics.
Each pixel (x, y) contains the sum of the original pixels above and to the
left of (x, y), inclusive.

\[
ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')
\]

It can be computed in one pass over the original image.
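
A minimal sketch of the one-pass computation, assuming the image is a plain list of rows of numbers. Each entry is a running sum of the current row plus the entry directly above it. The name `integral_image` is chosen for this example.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img[0..y][0..x], inclusive."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # cumulative sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = above + row_sum
    return ii


print(integral_image([[1, 2, 3], [4, 5, 6]]))  # [[1, 3, 6], [5, 12, 21]]
```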

Figure: The integral image

Features Calculation using the Integral Image

The sum of each rectangle can be calculated using the integral
image in four array references.

\[
\mathrm{Sum}(R) = ii(A) - ii(B) - ii(D) + ii(C)
\]

Figure: The sum of one rectangle using the integral image

Each feature can then be calculated in a few array references:
six for a two-rectangle feature, eight for three rectangles and nine for four.
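
The four-lookup rectangle sum can be sketched as follows. This version assumes an integral image padded with an extra zero row and column, which removes the boundary checks. The helper names (`padded_integral_image`, `rect_sum`, `two_rect_feature`) and the (x, y, w, h) rectangle convention are assumptions made for the example.

```python
def padded_integral_image(img):
    """Integral image with a leading zero row and column.

    ii[y][x] is the sum of img[0..y-1][0..x-1].
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left pixel (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: white left half minus black right half."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return white - black


ii = padded_integral_image([[1, 2, 3, 4], [5, 6, 7, 8]])
print(rect_sum(ii, 1, 0, 2, 2))          # 18
print(two_rect_feature(ii, 0, 0, 4, 2))  # -8
```

For clarity the two halves are summed independently here. A tuned implementation would reuse the corners shared by adjacent rectangles.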


Advantages and Drawbacks

Rectangular features are very simple and coarse.

However, they are really fast!
They can be calculated at different scales without computing a
Gaussian pyramid and an integral image for each level, which speeds up
multiscale detection.
Feature strategies that need the pyramid to be computed for
multiscale detection run slower than this approach.


Training the Classifier

Given the features and a set of positive and negative examples, any
classifier could be trained.
There is, however, a huge number of features.
A very small number of features can be combined to create an effective
classifier.
How do we find these features?


A weak classifier

The weak classifier used in the paper takes as input a sub-window (x) and
consists of a feature (f), a threshold (θ) and a polarity (p) indicating the
direction of the following inequality:

\[
h(x, f, \theta, p) =
\begin{cases}
1 & \text{if } p\, f(x) < p\, \theta, \\
0 & \text{otherwise.}
\end{cases}
\]

The weak classifier can be viewed as a single-node decision tree, a
stump.
Each feature is associated with an optimal threshold, chosen to
minimize the number of misclassifications.
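
A minimal sketch of the stump and of a threshold search for a single feature. The search below is a brute-force scan over candidate thresholds, chosen for readability. It is not claimed to be the procedure used in the paper, and the names `stump_predict` and `best_stump` are hypothetical.

```python
def stump_predict(value, threshold, polarity):
    """Return 1 if polarity * value < polarity * threshold, else 0."""
    return 1 if polarity * value < polarity * threshold else 0


def best_stump(values, labels, weights):
    """Find (error, threshold, polarity) with the lowest weighted error.

    values[i] is the feature value of example i,
    labels[i] is 1 (face) or 0 (non-face), weights[i] its weight.
    """
    candidates = sorted(set(values))
    candidates.append(candidates[-1] + 1)  # a threshold above every value
    best = (float("inf"), None, None)
    for threshold in candidates:
        for polarity in (1, -1):
            error = sum(
                w for v, y, w in zip(values, labels, weights)
                if stump_predict(v, threshold, polarity) != y
            )
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


# Faces have small feature values here, so polarity +1 separates them.
print(best_stump([1, 2, 3, 4], [1, 1, 0, 0], [0.25] * 4))  # (0, 3, 1)
```

The polarity lets the same feature serve both cases: faces scoring below the threshold (p = 1) or above it (p = -1).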


AdaBoost

AdaBoost is used to boost the performance of a simple learning algorithm.

It combines weak classification functions to create a more powerful one.
At each round the examples are re-weighted to emphasize those that
were incorrectly classified by the previous weak classifier.
The final strong classifier is a weighted combination of weak classifiers
followed by a threshold.


AdaBoost

We can see the AdaBoost procedure as a greedy feature selection process:

AdaBoost is actually selecting a small set of good features.
To do so, at each round the weak learning algorithm selects the single rectangle
feature that best separates the positive and negative examples.


Training

Done in multiple rounds.

All examples start with the same weight.
At each round it searches over a large set of features and thresholds,
choosing the feature/threshold pair that minimizes the weighted error.
Correctly classified examples are then down-weighted, so that the wrongly
classified ones count more in the next round, and the process is repeated.
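
The rounds above can be sketched as follows, assuming the feature values have already been computed for every example. The sketch uses the standard discrete AdaBoost update with β = ε / (1 − ε) and α = log(1 / β). The function names, the brute-force threshold search and the clamping of the error are assumptions made to keep the example short and safe to run.

```python
import math


def train_adaboost(feature_values, labels, rounds):
    """feature_values[k][i] is the value of feature k on example i.
    labels[i] is 1 (positive) or 0 (negative).
    Returns a list of (alpha, feature_index, threshold, polarity)."""
    n = len(labels)
    weights = [1.0 / n] * n  # all examples start with the same weight
    strong = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]  # keep weights a distribution
        best = None
        # Greedy step: pick the feature/threshold/polarity with lowest error.
        for k, values in enumerate(feature_values):
            for threshold in sorted(set(values)):
                for polarity in (1, -1):
                    preds = [1 if polarity * v < polarity * threshold else 0
                             for v in values]
                    error = sum(w for w, p, y in zip(weights, preds, labels)
                                if p != y)
                    if best is None or error < best[0]:
                        best = (error, k, threshold, polarity, preds)
        error, k, threshold, polarity, preds = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and 1/0
        beta = error / (1.0 - error)
        # Down-weight the examples this weak classifier got right.
        weights = [w * (beta if p == y else 1.0)
                   for w, p, y in zip(weights, preds, labels)]
        strong.append((math.log(1.0 / beta), k, threshold, polarity))
    return strong


def strong_classify(strong, example):
    """example[k] is the value of feature k for one sub-window."""
    score = sum(alpha for alpha, k, thr, pol in strong
                if pol * example[k] < pol * thr)
    return 1 if score >= 0.5 * sum(alpha for alpha, *_ in strong) else 0


features = [[5, 6, 1, 2], [1, 9, 2, 8]]  # feature 0 separates the classes
model = train_adaboost(features, [1, 1, 0, 0], rounds=2)
print([strong_classify(model, [f[i] for f in features]) for i in range(4)])
```

Each round contributes one feature, which is why the number of rounds also bounds the number of features in the final classifier.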


Considerations

The set of possible features and related thresholds is huge (NK, where N is the
number of examples and K the number of features).
For 20,000 samples and 160,000 features (the number for the 24x24 pixel
sub-window), that is 3.2 billion distinct classifiers!
With M rounds, AdaBoost takes O(MKN).
For each sub-window, all the selected weak classifiers are evaluated and
combined to get the final answer.
What if we could eliminate sub-windows earlier?


The Attentional Cascade


The insight is that smaller, and therefore more efficient, boosted classifiers
can be constructed which reject many of the negative sub-windows while
detecting almost all positive instances.
This can be done by lowering the threshold of the boosted classifier to
minimize false negatives, at the cost of more false positives.

Figure: The first features selected by AdaBoost


The Attentional Cascade


The first classifier, with only 2 features, achieved a 100% hit rate with a
50% false positive rate.
That is far from acceptable as a final detector, but with a few operations it
discards around 50% of the non-face sub-windows. And this is only the first classifier.
A cascade of classifiers is built this way: a positive output from each
one activates the next one, so the more complex classifiers run only on the
sub-windows that are more likely to contain a face.
Since the great majority of sub-windows in an image are negative, the
cascade tries to eliminate as many sub-windows as possible at the earliest
stage possible.
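
The early-rejection logic can be sketched in a few lines. Here each stage is assumed to be any callable that returns True ("may be a face") or False ("reject"). The two toy stages and the helper `overall_rate`, which multiplies per-stage rates under the assumption that stages behave independently, are illustrative only.

```python
def cascade_classify(stages, window):
    """Run the stages in order; stop at the first rejection."""
    for stage in stages:
        if not stage(window):
            return False  # rejected early, later stages never run
    return True


def overall_rate(per_stage_rates):
    """Product of per-stage rates (assumes independent stages)."""
    rate = 1.0
    for r in per_stage_rates:
        rate *= r
    return rate


# Toy stages on a number standing in for a sub-window (hypothetical).
stages = [lambda w: w > 0, lambda w: w > 10]
print(cascade_classify(stages, -5))  # False, only the first stage ran
print(cascade_classify(stages, 20))  # True, both stages accepted

# Ten stages that each pass half of the non-faces.
print(overall_rate([0.5] * 10))
```

The last line shows why weak early stages are still useful: a per-stage false positive rate of 50% compounds to less than 0.1% after ten stages, while most sub-windows only ever pay for the first, cheapest stage.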


The Attentional Cascade

Figure: The Classifier Cascade

In the end, a post-processing step handles multiple detections of
the same face, so that the output has no duplicates.
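
One simple way to do this post-processing is to group detections whose boxes overlap and to output one averaged box per group. The sketch below follows that idea under stated assumptions: boxes are (x, y, w, h) tuples, any shared area counts as an overlap, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True when boxes a and b, given as (x, y, w, h), share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def merge_detections(boxes):
    """Group overlapping boxes and return one averaged box per group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Average x, y, w and h over each group.
    return [
        tuple(sum(coord) / len(group) for coord in zip(*group))
        for group in groups.values()
    ]


detections = [(0, 0, 10, 10), (2, 2, 10, 10), (50, 50, 10, 10)]
print(merge_detections(detections))
```

Because grouping is transitive, a chain of overlapping boxes collapses into a single detection. A stricter overlap test would be needed when faces stand close together.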

Haar-like Features Extended Set


Proposed by Rainer Lienhart and Jochen Maydt in 2002.
Same principle, more variability: the set adds features rotated by 45 degrees.

Figure: The extended Haar-like feature set


Rotated Summed Area Table

Figure: The rotated integral image


References

Viola, P. and Jones, M., Rapid Object Detection using a Boosted
Cascade of Simple Features, CVPR 2001.
Viola, P. and Jones, M., Robust Real-Time Face Detection,
International Journal of Computer Vision, v. 57, 2004.
Lienhart, R. and Maydt, J., An Extended Set of Haar-like Features
for Rapid Object Detection, IEEE ICIP 2002.
Weisstein, Eric W., Haar Function. From MathWorld, a Wolfram
Web Resource. http://mathworld.wolfram.com/HaarFunction.html
