
Face Recognition from Video using Robust Kernel Resistor-Average Distance
Prepared By :
Motivation

• Single-image face recognition has too many constraints
• Video is easily obtainable
• Video should provide more robust information
Problem Statement

• Goal: Recognition from video without heavy constraints on subject’s movement
• Problem: Unconstrained data includes highly non-linear variations
• Question: How to find distinguishable features in ‘disorganized’ data?
Problem Statement

• Given a sequence of face images in random* positions:
  – How to use data without determining real-world position?
  – How to distinguish different subjects?
Problem Statement

• Given a sequence of face images in random* positions:
  – How to use data without determining real-world position?
    Kernel PCA projection
  – How to distinguish different subjects?
    Resistor-Average Distance measure
General Technique to compare subjects

• For each set of images, align the faces through an affine transformation
• Remove outliers
• Create synthetic data to improve robustness
• Create K-face space using both sets
• Project data sets and compare distances
PCA Refresher

• Principal component analysis projects data $x_i \in \mathbb{R}^N$, $i = 1, \ldots, l$ into a subspace
• Preserves distinguishing features by using eigenvectors $V$ of the covariance matrix $C$

$$C = \frac{1}{l} \sum_{i=1}^{l} x_i x_i^T, \qquad \lambda V = C V$$
Kernel PCA

• Since PCA won’t work for non-linear variations, map the data into an approximately linear feature space $F$

$$\Phi : \mathbb{R}^N \rightarrow F$$
Kernel PCA

• New problem:

$$\bar{C} = \frac{1}{l} \sum_{i=1}^{l} \Phi(x_i) \Phi(x_i)^T \quad (1)$$

$$\lambda V = \bar{C} V$$

• Solving this directly is expensive
• Instead, use the ‘kernel trick’
Kernel PCA

• Eigenvectors will lie in the span of the projected data $\Phi(x_1), \ldots, \Phi(x_l)$
• We can use the equivalent system

$$\lambda \left( \Phi(x_k) \cdot V \right) = \left( \Phi(x_k) \cdot \bar{C} V \right), \quad k = 1, \ldots, l \quad (2)$$

and extract the same values for $V$.
Kernel PCA

• Further, $V$ can be described as a linear combination of the data’s projection:

$$V = \sum_{i=1}^{l} \alpha_i \Phi(x_i) \quad (3)$$

• A key to the trick: define the kernel matrix $K$

$$K_{ij} := \Phi(x_i)^T \Phi(x_j)$$
Kernel PCA

Substituting (1) and (3) into (2), we get:

$$\lambda \sum_{i=1}^{l} \alpha_i \left( \Phi(x_k) \cdot \Phi(x_i) \right) = \frac{1}{l} \sum_{i=1}^{l} \alpha_i \left( \Phi(x_k) \cdot \sum_{j=1}^{l} \Phi(x_j) \left( \Phi(x_j) \cdot \Phi(x_i) \right) \right), \quad k = 1, \ldots, l$$

Now, using the kernel matrix, we get:

$$l \lambda K \alpha = K^2 \alpha$$

which has equivalent solutions to the easily solved:

$$l \lambda \alpha = K \alpha$$
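A short NumPy sketch of this reduced eigenproblem (my own illustration; the helper name is an assumption, and centering of K in feature space is omitted for brevity):

```python
import numpy as np

def kernel_pca(K, n_components):
    """Kernel PCA from a precomputed kernel matrix K (l x l).

    Solves l*lambda*alpha = K*alpha, i.e. an ordinary eigenproblem on K,
    and returns projections of the training data onto the top components.
    """
    l = K.shape[0]
    eigvals, alphas = np.linalg.eigh(K)                  # K is symmetric
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, alphas = eigvals[order], alphas[:, order]
    # Normalize alpha so the feature-space eigenvectors V have unit length:
    # lambda_k * (alpha_k . alpha_k) = 1, with lambda_k = eigval_k / l
    alphas = alphas / np.sqrt(np.maximum(eigvals, 1e-12) / l)
    # Projection of training point x_i onto component k is sum_j alpha_jk * K_ij
    return K @ alphas
```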
The trick to Kernel PCA

• We can easily obtain our subspace, but how do we construct K?
• Consider a mapping with polynomial degree 5, and 16x16 pixel images
  – It would require $10^{10}$ dimensions!
• Instead, just define a kernel function

$$k(x, y) = \Phi(x)^T \Phi(y)$$
Kernel function

• Using the kernel function to build the kernel matrix will make computation reasonable
• Arandjelovic and Cipolla used

$$k(x, y) = e^{-0.6 (x - y)^T (x - y)}$$

with the constant 0.6 found empirically
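For illustration, building the kernel matrix with this Gaussian-style kernel might look like the following sketch (names and vectorization are my own):

```python
import numpy as np

def gaussian_kernel_matrix(X, gamma=0.6):
    """Build K_ij = exp(-gamma * ||x_i - x_j||^2) for the rows of X.

    X: (l, N) array of flattened face images; gamma = 0.6 as reported
    in the slides (found empirically by Arandjelovic and Cipolla).
    """
    sq_norms = np.sum(X**2, axis=1)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))
```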
Comparing datasets using RAD

• Resistor-Average Distance
• Based on Kullback-Leibler divergence
• Datasets represented as probability distributions

$$D_{RAD}(p, q) = \left( D_{KL}(p \| q)^{-1} + D_{KL}(q \| p)^{-1} \right)^{-1}$$
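A one-line sketch of that combination (hypothetical helper name):

```python
def resistor_average_distance(d_pq, d_qp):
    """Combine the two directed KL divergences into the symmetric
    Resistor-Average Distance: D_RAD = (D_KL(p||q)^-1 + D_KL(q||p)^-1)^-1."""
    return 1.0 / (1.0 / d_pq + 1.0 / d_qp)
```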
Kullback-Leibler divergence

• Defined as

$$D_{KL}(p \| q) = \int p(x) \log_2 \left( \frac{p(x)}{q(x)} \right) dx$$

• Often difficult to compute, but if the distributions are normal...

$$D_{KL}(p \| q) = \frac{1}{2} \left[ \log_2 \left( \frac{|C_q|}{|C_p|} \right) + \mathrm{Tr}\left( C_p C_q^{-1} + C_q^{-1} (\bar{x}_q - \bar{x}_p)(\bar{x}_q - \bar{x}_p)^T \right) - N \right]$$

where $\bar{x}_i$ is the data mean and $C_i$ is the covariance matrix
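A sketch of this closed-form expression for two Gaussians (my own helper; it follows the slide’s formula, using log base 2 for the determinant ratio):

```python
import numpy as np

def kl_gaussian(mean_p, cov_p, mean_q, cov_q):
    """Closed-form KL divergence between two multivariate normals
    p = N(mean_p, cov_p) and q = N(mean_q, cov_q)."""
    N = mean_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mean_q - mean_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    log_det_ratio = (logdet_q - logdet_p) / np.log(2.0)   # log_2 |C_q| / |C_p|
    trace_term = np.trace(cov_p @ cov_q_inv + cov_q_inv @ np.outer(diff, diff))
    return 0.5 * (log_det_ratio + trace_term - N)
```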


Summary of algorithm

• For 2 input sets, automatically detect eyes and nostrils & perform an affine transform to normalize
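Landmark detection itself is outside this sketch; assuming eye and nostril coordinates are already available, a least-squares affine fit to canonical positions might look like this (names are my own):

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares affine transform mapping detected landmarks
    (e.g. eyes and nostrils) to canonical positions.

    src_pts, dst_pts: (k, 2) arrays of corresponding points, k >= 3.
    Returns a 2x3 matrix A such that dst ~= A @ [x, y, 1]^T.
    """
    k = src_pts.shape[0]
    src_h = np.hstack([src_pts, np.ones((k, 1))])   # homogeneous coordinates
    M, *_ = np.linalg.lstsq(src_h, dst_pts, rcond=None)
    return M.T                                      # 2x3 affine matrix
```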
Resistor-Average Distance

• RAD works as a measure because, unlike KLD, it is symmetric
Summary of algorithm

• Using RANSAC, keep only inliers
• Create additional synthetic data for each set by applying slight, random perturbations to input images
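The slides do not specify the perturbations; as one plausible reading, here is a sketch that jitters aligned images with small random rotations and shifts (magnitudes and helper names are assumptions):

```python
import numpy as np
from scipy.ndimage import rotate, shift

def perturb_image(img, rng, max_shift=2.0, max_angle=3.0):
    """Create one synthetic sample by a slight random rotation and
    translation of an aligned face image (2-D grayscale array)."""
    angle = rng.uniform(-max_angle, max_angle)            # degrees
    dx, dy = rng.uniform(-max_shift, max_shift, size=2)   # pixels
    out = rotate(img, angle, reshape=False, mode='nearest')
    return shift(out, (dy, dx), mode='nearest')

def augment(images, n_new=20, seed=0):
    """Augment a list of images with n_new random perturbations each."""
    rng = np.random.default_rng(seed)
    synthetic = [perturb_image(img, rng) for img in images for _ in range(n_new)]
    return list(images) + synthetic
```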
Summary of algorithm

• Apply RANSAC Kernel PCA to the union of augmented datasets
  – Randomly select samples from data
  – Compute KPCA for samples
  – Project data into K-face space, count how many data {zi} lie within threshold of origin
  – Repeat
• Use the largest {zi} to create K-space
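A rough sketch of this consensus loop, reusing the hypothetical gaussian_kernel_matrix helper sketched earlier; sample size, threshold, component count, and iteration count are all assumptions:

```python
import numpy as np

def robust_kpca_consensus(X, n_iters=50, sample_size=40, n_components=10,
                          threshold=1.0, gamma=0.6, seed=0):
    """RANSAC-style loop: repeatedly fit KPCA on a random sample, project all
    data, and keep the largest consensus set of points near the origin."""
    rng = np.random.default_rng(seed)
    best_inliers = np.array([], dtype=int)
    for _ in range(n_iters):
        idx = rng.choice(len(X), size=sample_size, replace=False)
        S = X[idx]
        K = gaussian_kernel_matrix(S, gamma)                 # sample kernel matrix
        eigvals, alphas = np.linalg.eigh(K)
        order = np.argsort(eigvals)[::-1][:n_components]
        alphas = alphas[:, order] / np.sqrt(np.maximum(eigvals[order], 1e-12) / len(S))
        # Cross-kernel between all data and the sample, then project everything
        K_cross = np.exp(-gamma * ((X[:, None, :] - S[None, :, :]) ** 2).sum(-1))
        Z = K_cross @ alphas
        inliers = np.where(np.linalg.norm(Z, axis=1) < threshold)[0]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers        # use these samples to build the final K-space
```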
Summary of algorithm

• Separately, project each augmented set into K-face space
• Use the set of projections as a probability distribution model for each subject
• Because of the nonlinear projection, these distributions should be normal
• Use RAD to measure distances
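Putting the last three bullets together, a sketch that models each projected set as a Gaussian and compares the two subjects with RAD (reusing the hypothetical kl_gaussian and resistor_average_distance helpers from earlier sketches):

```python
import numpy as np

def compare_subjects(Z_a, Z_b):
    """Model each set of K-face-space projections as a Gaussian and
    return the Resistor-Average Distance between the two subjects.

    Z_a, Z_b: (n, d) arrays of projected samples for subjects A and B.
    """
    mean_a, cov_a = Z_a.mean(axis=0), np.cov(Z_a, rowvar=False)
    mean_b, cov_b = Z_b.mean(axis=0), np.cov(Z_b, rowvar=False)
    d_ab = kl_gaussian(mean_a, cov_a, mean_b, cov_b)
    d_ba = kl_gaussian(mean_b, cov_b, mean_a, cov_a)
    return resistor_average_distance(d_ab, d_ba)
```

A smaller distance suggests the two sets come from the same subject.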
Experimental Results

• Training consisted of 30-50 video images (taken at 10 fps); testing consisted of sets of 35
• At best, achieved 98% recognition with 2% error
References

• O. Arandjelovic and R. Cipolla. Face Recognition from Face Motion Manifolds using Robust Kernel Resistor-Average Distance. Face Processing in Video, 2004.
• D. H. Johnson and S. Sinanovic. Symmetrizing the Kullback-Leibler distance. Rice University Working Paper, 2001.
• B. Scholkopf, A. Smola, and K. Muller. Kernel principal component analysis. Advances in Kernel Methods – SV Learning, pages 327–352, 1999.
