Objectrecognition

Object Recognition in Images
Slides originally created by Bernd Heisele

Object Recognition
Object Detection Object Identification
Where is Jane? Who is it?

Where is a Face? Is it Jane or Erik?
Is there a face in the image?
Fall 2002 Pattern Recognition for Vision

General Problems of Recognition
Invariance:
•“External parameters”
•Pose
•Illumination
•“Internal parameters”
•Person identity
•Facial expression
Applicable to many classes

of objects

Object Detection
Task
Given an input image,
determine if there are
objects of a given class (e.g.
faces, people, cars..)
in the image and where they
are located.

Detection—Problems
1. Classifier must generalize over all exemplars of one

class.
2. Negative class consists of everything else.
3. High accuracy (small FP rate) required for most

applications.

Face Detection – basic scheme
Face examples Classification

Result
Off-line
training
Classifier
Feature vector (x1, x2 ,…, xn)
Feature Extraction
Non-face examples Pixel pattern
Search for faces at

different resolutions
and locations

Training and Testing
Training Set
Train Classifier
False Positive
Labeled Test Set Correct
Sensitivity
Classify

Image Features
19x19
Histogram Masking
Equalization
Gray 283 [0, 1]
x-y-Sobel
Histogram
Filtering
Equalization
Gradient 17x17 [0, 1]
Haar Wavelet Normalization

Transform
Wavelets 1740 [0, 1]

ROC Image Features

Classifiers

Positive Training Data
Real Synthetic
2900 faces + 2900 mirrored faces (T. Vetter, Univ. of Freiburg)
Illumination
Identity
Rotation
3D Morphing: + =

Real vs. Synthetic

Negative Training Data
Problem: 1 face in 116.440 examined windows
Add to
training set
Initial set of Determine false-

Retrain
25.000 non- positives on large set
faces Classifier of non-face images
Bootstrapping

Bootstrapping
0.9 FP/image 6.7 FP/image

Performance of Global Face Detectors

Rotation
Rotation out of
image plane
• Component-based classification
Rotation in the • Train on rotated faces
image plane
• Rotation invariant features

• Apply 2D rotation to image

Global vs. Components
Single
template
Component
templates

Component-based Detection
Eyes Nose Mouth
1st Level: classify

Components
2nd Level:
Geometrical
relation
between
components classify
maximum response
of each component
classifier + x, y
location

Learning Components
Components:
• discriminatory
• robust against changes in
pose and illumination
Synthetic faces:
• 7 different 3-D head models
• 2,500 faces
Rotation: -30o to + 30o
• 3-D correspondences
for automatic location of
components

Learning Components—One Way To Do It
Start with small initial regions
Expand into one of four directions
Extract new components from images
Train SVM classifiers
Choose best expansion according to

error bound of SVMs

Margin, Radius and Expected Error
x *2
Bound on error
M 2
E < c(R / M)
R
M
x*1
Feature Space
Cross V
alidation might be better

Some Examples

est on CMU PIE Database
T
Faces have been manually labeled

(only –45o to 45o of rotation)
• About 40,000 faces

• 68 people
• 13 poses
• 43 illumination conditions
• 4 different expressions

ROC Component vs. Global
Face Detection: Component-based vs. Global Approach

(5,000 faces 25, 000 non-faces)
1
0.9
0.8
0.7
0.6
Correct
0.5
14 learned components
0.4
whole face
0.3
0.2
0.1
0
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2
FP / inspected window
Heisele, B., T. Serre, M. Pontil, T. Vetter and T. Poggio. Categorization by Learning and Combining Object Parts.
In: Advances in Neural Information Processing Systems (NIPS'01), Vancouver, Canada

Advances on Component-base Face Detection Stan Bileschi
Components are small, and prone to false

detection, even within the face.
Stan Bileschi
Training on Faces
Use the remainder of the face in the negative training set

Positive Negative

Training on Faces Only
Red:
Trained
only with
faces.
Blue:
Trained
on faces
and non-
faces.

Errors
Often, many components classify correctly, with

only a few errors

Using Models of Pair-wise Positions

Pair-wise Biasing Leads to Tightened Result Images

Pair-wise Biasing

Application: Eye Detection

Identification
A B
Task:
Given an image of an
A B
object of a particular class
(e.g. face) identify which
exemplar it is.
C D
C D

Identification—Problems
1. Multi-class problem
2. Classifier must distinguish between exemplars that

might look very similar.
3. Classifier has to reject exemplars that were not in

the training database.

Problems in Face Identification
Limited information in a single face image
Illumination Rotation

System Architecture
Identification Result
Training Data
A
Classifier Support Vector Machine,

….
Feature extraction Gray, Gradient,

Wavelets, …
Face Image

Multi-class Classification with SVM
Bottom-Up 1vs1 1 vs. All
A or B or C or D A B,C,D
A or B C or D B A,C,D
C A,B,D
D A,B,C
A B C D
Training: L
Training: L (L-1) / 2
Classification : L
Classification : L-1

Global Approach
Detect and extract face
Feed gray values into N

SVMs
Classify based on SVM SVM SVM SVM

A B C D
maximum output
Max Operation
Identification result

Global Approach with Clustering
Partition training images of

each person into viewpoint-
specific
clusters
Train a linear SVM on each

cluster SVM SVM SVM SVM
C C C C
Take maximum over all SVM

outputs Max Operation
Identification result

Component-based Approach
Detect and extract components
Feed gray values of

components to N SVMs
Take max. over all SVM outputs

SVM SVM SVM SVM
A B C D
Max Operation
Identification Results

W hy Components for Identification?
Feature 1 != Feature 2
Feature 1 = Feature 2
Jennifer Huang

Component-based Face Recognition with 3D Morphable Models
More ROC Curves
Heisele, B., P. Ho and T. Poggio. Face Recognition with Support Vector Machines: Global Versus Component-based
Approach, International Conference on Computer Vision (ICCV'01), Vancouver, Canada, Vol. 2, 688-694, 2001.

Morphable Models for Face Identification, Jennifer Huang
3D morphable models Training data for

component-based
A Morphable Model for the Synthesis of 3D Faces.
Blanz, V. and Vetter, T. SIGGRAPH'99 Conference Proceedings, pp. 187-194
face recognition
Jennifer Huang

Morphable Model
Generation of 3D head model from

two images
See class on Morphable Models
Jennifer Huang

Some Training Images
Synthetic images are easily rendered from 3D head model

under varying illuminations and rotations in depth
Jennifer Huang

Preliminary Results on Synthetic Images
Jennifer Huang

Current Work – Testing on Real Images
Problems Encountered:
Detection
Inaccurate Component detection
Recognition
Accuracy of 3D models
Choice of Illumination and Pose
Jennifer Huang

Literature
B. Heisele, A. Verri and T. Poggio: Learning and

Vision Machines. Proceedings of the IEEE, Visual
Perception: Technology and Tools, Vol. 90, No. 7,
pp. 1164-1177, 2002.
See also CBCL Web page

Objectrecognition

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Objectrecognition

Uploaded by

Copyright:

Available Formats

Object Recognition in Images

Slides originally created by Bernd Heisele

Object Detection Object Identification

Where is Jane? Who is it?

Fall 2002 Pattern Recognition for Vision

Applicable to many classes

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

1. Classifier must generalize over all exemplars of one

2. Negative class consists of everything else.

3. High accuracy (small FP rate) required for most

Fall 2002 Pattern Recognition for Vision

Face examples Classification

Feature vector (x1, x2 ,…, xn)

Non-face examples Pixel pattern

Search for faces at

Fall 2002 Pattern Recognition for Vision

Labeled Test Set Correct

Fall 2002 Pattern Recognition for Vision

Haar Wavelet Normalization

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Problem: 1 face in 116.440 examined windows

Initial set of Determine false-

Fall 2002 Pattern Recognition for Vision

0.9 FP/image 6.7 FP/image

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

• Rotation invariant features

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Eyes Nose Mouth

1st Level: classify

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Start with small initial regions

Expand into one of four directions

Extract new components from images

Train SVM classifiers

Choose best expansion according to

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Faces have been manually labeled

• About 40,000 faces

Fall 2002 Pattern Recognition for Vision

Face Detection: Component-based vs. Global Approach

Fall 2002 Pattern Recognition for Vision

Components are small, and prone to false

Use the remainder of the face in the negative training set

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Often, many components classify correctly, with

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

Fall 2002 Pattern Recognition for Vision

2. Classifier must distinguish between exemplars that