
Object Recognition in Images

Slides originally created by Bernd Heisele


Object Recognition

Object Detection Object Identification

Where is Jane? Who is it?


Where is a Face? Is it Jane or Erik?
Is there a face in the image?

Fall 2002 Pattern Recognition for Vision


General Problems of Recognition

Invariance:
• “External parameters”
  • Pose
  • Illumination
• “Internal parameters”
  • Person identity
  • Facial expression

Applicable to many classes of objects



Object Detection

Task
Given an input image,
determine if there are
objects of a given class (e.g.
faces, people, cars, ...)
in the image and where they
are located.



Detection—Problems

1. Classifier must generalize over all exemplars of one class.

2. Negative class consists of everything else.

3. High accuracy (small FP rate) required for most applications.



Face Detection – basic scheme

Off-line training: Face examples and Non-face examples -> Classifier

Search for faces at different resolutions and locations:
Pixel pattern -> Feature Extraction -> Feature vector (x1, x2, …, xn) -> Classifier -> Classification Result
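
The search over resolutions and locations can be sketched as a sliding-window loop over an image pyramid. This is only an illustration under stated assumptions, not the lecture's implementation: the 19x19 window is taken from the Image Features slide, while the names `downscale`, `sliding_windows`, `detect`, the step size, the scale factor, and the nearest-neighbour resizing are all hypothetical choices. `classify` stands in for feature extraction plus the trained classifier.

```python
def downscale(image, factor):
    """Nearest-neighbour resize of a 2D list of gray values (assumed helper)."""
    h, w = len(image), len(image[0])
    new_h, new_w = max(1, int(h / factor)), max(1, int(w / factor))
    return [[image[min(h - 1, int(r * factor))][min(w - 1, int(c * factor))]
             for c in range(new_w)] for r in range(new_h)]


def sliding_windows(image, size, step):
    """Yield (top, left, patch) for every size x size window on a step grid."""
    h, w = len(image), len(image[0])
    for top in range(0, h - size + 1, step):
        for left in range(0, w - size + 1, step):
            patch = [row[left:left + size] for row in image[top:top + size]]
            yield top, left, patch


def detect(image, classify, size=19, step=2, scale_factor=1.2):
    """Return (top, left, size) boxes in original-image coordinates."""
    detections = []
    scale = 1.0
    current = image
    # Shrink the image until the fixed-size window no longer fits.
    while len(current) >= size and len(current[0]) >= size:
        for top, left, patch in sliding_windows(current, size, step):
            if classify(patch):
                detections.append(
                    (int(top * scale), int(left * scale), int(size * scale)))
        scale *= scale_factor
        current = downscale(image, scale)  # always resample from the original
    return detections
```

Shrinking the image while keeping the window size fixed is equivalent to searching with larger windows in the original image, which is why each box is mapped back by multiplying with the current scale.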



Training and Testing

Training Set

Train Classifier

False Positive

Labeled Test Set Correct

Sensitivity

Classify
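
As a rough sketch of how a labeled test set yields the quantities named on this slide, the snippet below computes the fraction of correctly detected positives (sensitivity) and the false positive rate at one threshold, then sweeps the threshold to obtain ROC points. The function names and the convention that label 1 means "object" and 0 means "non-object" are assumptions made for the example; both classes are assumed to be non-empty.

```python
def rates(scores, labels, threshold):
    """Return (sensitivity, false_positive_rate) at a decision threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    return tp / pos, fp / neg


def roc_points(scores, labels):
    """One (sensitivity, false_positive_rate) pair per distinct score,
    from the strictest threshold to the most permissive one."""
    thresholds = sorted(set(scores), reverse=True)
    return [rates(scores, labels, t) for t in thresholds]
```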



Image Features

Feature    Processing                                        Dimension   Range
Gray       19x19 -> Histogram Equalization -> Masking        283         [0, 1]
Gradient   x-y-Sobel Filtering -> Histogram Equalization     17x17       [0, 1]
Wavelets   Haar Wavelet Transform -> Normalization           1740        [0, 1]
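
A minimal sketch of the gray-value pipeline (histogram equalization, masking, values in [0, 1]) is given below. It assumes integer gray values in [0, 256) and a caller-supplied binary mask; the actual mask that reduces the 19x19 window to 283 values is not given on the slide, so `mask` and both function names are hypothetical.

```python
def equalize(patch, levels=256):
    """Histogram-equalize a 2D list of integer gray values; output in [0, 1]."""
    flat = [v for row in patch for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    # Cumulative distribution of gray values.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant patch: nothing to equalize
        return [[0.0 for _ in row] for row in patch]
    return [[(cdf[v] - cdf_min) / (n - cdf_min) for v in row] for row in patch]


def gray_features(patch, mask):
    """Equalize, then keep only the pixels where the mask is non-zero."""
    eq = equalize(patch)
    return [eq[r][c]
            for r in range(len(patch))
            for c in range(len(patch[0]))
            if mask[r][c]]
```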



ROC Image Features



Classifiers



Positive Training Data

Real: 2900 faces + 2900 mirrored faces

Synthetic (T. Vetter, Univ. of Freiburg): variation in illumination, identity, and rotation

3D Morphing



Real vs. Synthetic



Negative Training Data

Problem: 1 face in 116,440 examined windows

Bootstrapping:
1. Start with an initial set of 25,000 non-faces
2. Train (or retrain) the classifier
3. Determine false positives on a large set of non-face images
4. Add them to the training set and return to step 2
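
The bootstrapping loop can be sketched as follows. This is an illustration only: `train` is any function that returns a classifier (a callable giving True for "face"), and the fixed number of rounds, the early stop when no new false positives appear, and the one-dimensional toy trainer `train_threshold` are assumptions added for the example.

```python
def bootstrap(train, positives, negatives, negative_pool, rounds=3):
    """Retrain repeatedly, adding false positives from the pool to the negatives."""
    negatives = list(negatives)
    classifier = train(positives, negatives)
    for _ in range(rounds):
        # Every pool item is a non-face, so any positive answer is a false positive.
        false_positives = [x for x in negative_pool
                           if classifier(x) and x not in negatives]
        if not false_positives:
            break
        negatives.extend(false_positives)
        classifier = train(positives, negatives)
    return classifier, negatives


def train_threshold(positives, negatives):
    """Toy 1-D trainer: cut halfway between the classes (hypothetical)."""
    cut = (min(positives) + max(negatives)) / 2
    return lambda x: x > cut
```

With `positives=[10, 12]`, `negatives=[0, 1]` and a pool of `[2, 6, 7, 9]`, the first classifier accepts 6, 7 and 9; after they are added as negatives, the retrained classifier rejects the whole pool.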



Bootstrapping

0.9 FP/image 6.7 FP/image



Performance of Global Face Detectors



Rotation

Rotation out of image plane
• Component-based classification
• Train on rotated faces

Rotation in the image plane
• Rotation invariant features
• Apply 2D rotation to image



Global vs. Components

Single
template

Component
templates



Component-based Detection

1st Level: component classifiers (Eyes, Nose, Mouth) classify the components

2nd Level: classify the geometrical relation between components

Input to the 2nd level: maximum response of each component classifier + x, y location
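
One way to picture the input to the second level is the sketch below, which takes a response map per component classifier and returns the maximum response together with its x, y location. The dictionary layout, the alphabetical ordering of components, and the function name are assumptions made for the example.

```python
def second_level_input(response_maps):
    """response_maps: dict of component name -> 2D list of classifier outputs.

    Returns a flat list [max_response, x, y, ...] with one triple per
    component, in alphabetical order of the component names.
    """
    features = []
    for name in sorted(response_maps):
        grid = response_maps[name]
        best, bx, by = max((value, x, y)
                           for y, row in enumerate(grid)
                           for x, value in enumerate(row))
        features.extend([best, bx, by])
    return features
```

The resulting vector is what a second-level classifier would be trained on, so that both the strength and the relative position of the component detections contribute to the decision.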



Learning Components

Components:
• discriminatory
• robust against changes in pose and illumination

Synthetic faces:
• 7 different 3-D head models
• 2,500 faces
• Rotation: -30° to +30°
• 3-D correspondences for automatic location of components



Learning Components—One Way To Do It

Start with small initial regions

Expand into one of four directions

Extract new components from images

Train SVM classifiers

Choose best expansion according to error bound of SVMs
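
The steps above can be sketched as a greedy region-growing loop. In this illustration, `error_estimate(region)` is a hypothetical callback standing in for "extract the component from the images, train an SVM, and compute its error bound"; the rectangle encoding, the image bounds check, and the rule of stopping when no expansion lowers the estimate are assumptions, not details taken from the slides.

```python
def grow_component(initial, error_estimate, bounds, max_steps=20):
    """Greedy region growing.

    Regions are (left, top, right, bottom) with inclusive pixel coordinates;
    bounds is (width, height) of the face image.
    """
    region = initial
    best_error = error_estimate(region)
    width, height = bounds
    for _ in range(max_steps):
        left, top, right, bottom = region
        # Expand by one pixel into each of the four directions.
        candidates = [
            (left - 1, top, right, bottom),
            (left, top - 1, right, bottom),
            (left, top, right + 1, bottom),
            (left, top, right, bottom + 1),
        ]
        candidates = [c for c in candidates
                      if c[0] >= 0 and c[1] >= 0
                      and c[2] < width and c[3] < height]
        if not candidates:
            break
        error, candidate = min((error_estimate(c), c) for c in candidates)
        if error >= best_error:
            break  # no expansion improves the estimate
        region, best_error = candidate, error
    return region, best_error
```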



Margin, Radius and Expected Error

Bound on error:

$$E < c \left( \frac{R}{M} \right)^2$$

R: radius, M: margin

[Figure: feature space with axes x*1 and x*2, showing R and M]

Cross validation might be better
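
For a linear separator, the ratio in the bound can be evaluated directly. The sketch below assumes a hyperplane in canonical form, so that the margin is 1/||w||, and approximates R by the largest distance of any point from the centroid, which is an upper estimate of the radius of the smallest enclosing sphere. The constant c is left out, and the function name is hypothetical.

```python
import math


def radius_margin_ratio(points, weights):
    """(R / M)^2 for a canonical linear separator with weight vector `weights`."""
    dim = len(weights)
    centroid = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    # Centroid-based radius: an approximation, not the exact smallest sphere.
    radius = max(math.dist(p, centroid) for p in points)
    margin = 1.0 / math.sqrt(sum(w * w for w in weights))
    return (radius / margin) ** 2
```

A smaller value suggests a lower expected error, which is how the quantity can serve as a selection criterion when comparing candidate components.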



Some Examples



Test on CMU PIE Database

Faces have been manually labeled (only -45° to 45° of rotation)

• About 40,000 faces
• 68 people
• 13 poses
• 43 illumination conditions
• 4 different expressions



ROC Component vs. Global

[Figure: ROC curves, "Face Detection: Component-based vs. Global Approach" (5,000 faces, 25,000 non-faces). Curves: 14 learned components; whole face. x-axis: FP / inspected window (0 to 0.2). y-axis: Correct (0 to 1).]

Heisele, B., T. Serre, M. Pontil, T. Vetter and T. Poggio. Categorization by Learning and Combining Object Parts.
In: Advances in Neural Information Processing Systems (NIPS'01), Vancouver, Canada



Advances on Component-based Face Detection, Stan Bileschi

Components are small and prone to false detection, even within the face.


Training on Faces

Use the remainder of the face in the negative training set


Positive Negative



Training on Faces Only

Red: Trained only with faces.

Blue: Trained on faces and non-faces.



Errors

Often, many components classify correctly, with only a few errors



Using Models of Pair-wise Positions



Pair-wise Biasing Leads to Tightened Result Images



Pair-wise Biasing



Application: Eye Detection



Identification

Task:

Given an image of an object of a particular class (e.g. face), identify which exemplar it is.

[Figure: example exemplars A, B, C, D]



Identification—Problems

1. Multi-class problem

2. Classifier must distinguish between exemplars that might look very similar.

3. Classifier has to reject exemplars that were not in the training database.



Problems in Face Identification

Limited information in a single face image

Illumination Rotation



System Architecture

Face Image -> Feature extraction (Gray, Gradient, Wavelets, …) -> Classifier (Support Vector Machine, ….) -> Identification Result (A)

Training Data -> Classifier



Multi-class Classification with SVM

Bottom-Up 1vs1

A or B or C or D
A or B | C or D
A | B | C | D

Training: L (L-1) / 2
Classification: L-1

1 vs. All

A vs. B,C,D
B vs. A,C,D
C vs. A,B,D
D vs. A,B,C

Training: L
Classification: L
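
The two decision schemes can be sketched as below, which also makes the evaluation counts concrete: one-vs-all evaluates all L classifiers, while the bottom-up tournament evaluates L-1 of the L(L-1)/2 trained pairwise classifiers. The data layout is an assumption for the example: `classifiers` maps each label to a scoring function, and `pairwise[(a, b)]` returns a positive value when class `a` wins over class `b`, with `a` listed before `b` in `labels`.

```python
def one_vs_all(x, classifiers):
    """Pick the label whose classifier gives the highest output (L evaluations)."""
    return max(classifiers, key=lambda label: classifiers[label](x))


def bottom_up(x, labels, pairwise):
    """Pairwise tournament; returns (winning label, number of evaluations)."""
    remaining = list(labels)
    evaluations = 0
    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining) - 1, 2):
            a, b = remaining[i], remaining[i + 1]
            next_round.append(a if pairwise[(a, b)](x) > 0 else b)
            evaluations += 1
        if len(remaining) % 2:  # odd one out advances without a match
            next_round.append(remaining[-1])
        remaining = next_round
    return remaining[0], evaluations
```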



Global Approach

Detect and extract face

Feed gray values into N SVMs

Classify based on maximum output

[Diagram: SVMs A, B, C, D -> Max Operation -> Identification result]



Global Approach with Clustering

Partition training images of each person into viewpoint-specific clusters

Train a linear SVM on each cluster

Take maximum over all SVM outputs

[Diagram: one SVM per cluster -> Max Operation -> Identification result]
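
The max operation over cluster-specific SVMs can be sketched as follows. Here each entry pairs a person with the scoring function of one viewpoint cluster, so a person with several clusters appears several times; the list-of-pairs layout and the function name `identify` are assumptions for illustration.

```python
def identify(x, cluster_svms):
    """cluster_svms: list of (person, scoring function), one per viewpoint cluster.

    Returns the person owning the cluster SVM with the highest output.
    """
    person, _ = max(((p, score(x)) for p, score in cluster_svms),
                    key=lambda item: item[1])
    return person
```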



Component-based Approach

Detect and extract components

Feed gray values of components to N SVMs

Take max. over all SVM outputs

[Diagram: SVMs A, B, C, D -> Max Operation -> Identification Results]



Why Components for Identification?

Feature 1 != Feature 2

Feature 1 = Feature 2

Jennifer Huang



Component-based Face Recognition with 3D Morphable Models
More ROC Curves

Heisele, B., P. Ho and T. Poggio. Face Recognition with Support Vector Machines: Global Versus Component-based
Approach, International Conference on Computer Vision (ICCV'01), Vancouver, Canada, Vol. 2, 688-694, 2001.



Morphable Models for Face Identification, Jennifer Huang

3D morphable models -> Training data for component-based face recognition

Blanz, V. and Vetter, T. A Morphable Model for the Synthesis of 3D Faces. SIGGRAPH'99 Conference Proceedings, pp. 187-194



Morphable Model

Generation of 3D head model from two images
See class on Morphable Models
Jennifer Huang



Some Training Images

Synthetic images are easily rendered from 3D head model under varying illuminations and rotations in depth

Jennifer Huang



Preliminary Results on Synthetic Images

Jennifer Huang



Current Work – Testing on Real Images

Problems Encountered:

Detection
• Inaccurate component detection

Recognition
• Accuracy of 3D models
• Choice of illumination and pose

Jennifer Huang



Literature

B. Heisele, A. Verri and T. Poggio: Learning and Vision Machines. Proceedings of the IEEE, Visual Perception: Technology and Tools, Vol. 90, No. 7, pp. 1164-1177, 2002.

See also CBCL Web page
