
Master SC Information and Communication Security

Dr. Andrzej Drygajlo, Speech Processing and Biometrics Group, Signal Processing Institute, École Polytechnique Fédérale de Lausanne (EPFL)

Center for Interdisciplinary Studies in Information Security (ISIS)

Face Recognition

Face detection
Face tracking
Face recognition (appearance-based)
  Local features
    DCT-based methods
  Global features (holistic approach)
    Principal Component Analysis (PCA)
    Linear Discriminant Analysis (LDA)
Performance evaluation
Advantages and disadvantages

Principal Component Analysis (PCA)

Principal component analysis (PCA), or the Karhunen-Loeve transformation, is a data-representation method that finds an alternative set of parameters for a set of raw data (or features) such that most of the variability in the data is compressed down to the first few parameters.

The transformed PCA parameters are orthogonal. PCA diagonalizes the covariance matrix, and the resulting diagonal elements are the variances of the transformed PCA parameters.

Properties:
- It completely decorrelates any data in the transform domain
- It packs the most energy (variance) into the fewest transform coefficients
- It minimizes the MSE (mean square error) between the reconstructed and original data for any specified degree of data compression
- It minimizes the total entropy of the data

Limitations:
- There is no fast algorithm for its implementation
- PCA is not a fixed transform, but has to be generated for each type of data statistics
- Considerable computational effort is involved in generating the eigenvalues and eigenvectors of the covariance matrices
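The decorrelation and energy-packing properties can be checked numerically. The sketch below is an illustration with synthetic 2-D data (not from the slides): it diagonalizes a sample covariance matrix and verifies that the transformed parameters are decorrelated, with variances sorted in decreasing order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated 2-D data (illustrative only)
x = rng.standard_normal((2, 500))
x = np.array([[2.0, 0.0], [1.5, 0.5]]) @ x       # introduce correlation

S = np.cov(x)                                     # covariance (scatter) matrix
v, W = np.linalg.eigh(S)                          # eigenvalues / eigenvectors
order = np.argsort(v)[::-1]                       # sort by decreasing variance
v, W = v[order], W[:, order]

# Transform the mean-centred data into the PCA domain
x_pca = W.T @ (x - x.mean(axis=1, keepdims=True))
S_pca = np.cov(x_pca)                             # (nearly) diagonal
```

The covariance of the transformed data equals W^T S W, which is diagonal with the eigenvalues on the diagonal, so the off-diagonal (correlation) terms vanish.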


The covariance (scatter) matrix S of the data x, which encodes the variances and covariances of the data, is used in PCA to find the optimal rotation of the parameter space. PCA finds the eigenvectors and eigenvalues of the covariance matrix. These have the property that

S W = W V

where:
S - covariance (scatter) matrix
W = [w1, w2, ..., wD] - matrix of eigenvectors (columns)
V - transformed covariance matrix (diagonal scatter matrix of eigenvalues), diag(V) = v = [v1, v2, ..., vD]


A 2-D numerical example (W^T S W = V):

[  0.26  0.96 ] [ 14492.28  20760.14 ] [ 0.26  -0.96 ]   [ 302.84     0   ]
[ -0.96  0.26 ] [ 20760.14  14492.28 ] [ 0.96   0.26 ] = [    0     94.40 ]


Having found the eigenvectors and eigenvalues, the principal components are found by the following transformation:

x_PCA = W^T x

For the 2-D example:

x_PCA,1 =  0.26 x1 + 0.96 x2
x_PCA,2 = -0.96 x1 + 0.26 x2

i.e.

[ x_PCA,1 ]   [  0.26  0.96 ] [ x1 ]
[ x_PCA,2 ] = [ -0.96  0.26 ] [ x2 ]
The eigenvectors give an idea of the importance of each of the original parameters in accounting for the variance in the data

A face image defines a point in the high-dimensional image space. Different face images share a number of similarities with each other:
- They can be described by a relatively low-dimensional subspace
- They can be projected into an appropriately chosen subspace of eigenfaces, and classification can be performed by similarity computation (distance)





2D DCT and PCA

Feature Vector V = [C0, C1, C2, ... , Cuv];

Feature vector with first few local PCA basis functions

Graphs from: C. Sanderson, On Local Features for Face Verification, IDIAP-RR 04-36

Suppose the data consists of M faces with D feature values each (a D x M matrix: M faces as columns, D features as rows).

1) Place the data in a D x M matrix x
2) Mean-center the data: compute the D-dimensional mean μ and subtract it, x0 = x - μ
3) Compute the D x D covariance matrix C = x0 x0^T
4) Compute the eigenvectors and eigenvalues of the covariance matrix
5) Choose the K largest eigenvalues (K << D)
6) Form a D x K matrix W with K columns of eigenvectors
7) The new coordinates x_PCA of the data (in PCA space) are obtained by projecting the data into the K-dimensional subspace: x_PCA = W^T (x - μ)
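The seven steps above can be sketched directly in NumPy. Synthetic random data stands in for the face matrix, and D, M, K below are illustrative choices, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
D, M, K = 100, 8, 3                      # D features, M faces, K << D components

# 1) Place the data in a D x M matrix (columns are faces; synthetic here)
x = rng.standard_normal((D, M))

# 2) Mean-center: compute the D-dimensional mean and subtract it
mu = x.mean(axis=1, keepdims=True)
x0 = x - mu

# 3) D x D covariance (scatter) matrix
C = x0 @ x0.T

# 4) Eigenvectors and eigenvalues of the covariance matrix
vals, vecs = np.linalg.eigh(C)

# 5) Choose the K largest eigenvalues
order = np.argsort(vals)[::-1][:K]

# 6) Form a D x K matrix W with K eigenvector columns
W = vecs[:, order]

# 7) Project the mean-centred data into the K-dimensional subspace
x_pca = W.T @ x0                         # K x M coordinates in PCA space
```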

PCA seeks directions that are efficient for representing the data. Such directions are not necessarily efficient for discriminating between classes (illustrated in the original slides with two-class scatter plots, Class A and Class B).

Eigenfaces, the algorithm


The database consists of M = 8 training face images, each an N²-dimensional column vector of pixel values:

a = [a1, a2, ..., a_{N²}]^T,  b = [b1, b2, ..., b_{N²}]^T,  c = [c1, c2, ..., c_{N²}]^T,  d = [d1, d2, ..., d_{N²}]^T,
e = [e1, e2, ..., e_{N²}]^T,  f = [f1, f2, ..., f_{N²}]^T,  g = [g1, g2, ..., g_{N²}]^T,  h = [h1, h2, ..., h_{N²}]^T



We compute the average face

m = [m1, m2, ..., m_{N²}]^T = (1/M) [a1 + b1 + ... + h1,  a2 + b2 + ... + h2,  ...,  a_{N²} + b_{N²} + ... + h_{N²}]^T

where M = 8.



Then subtract it from the training faces:

a_m = a - m,  b_m = b - m,  c_m = c - m,  d_m = d - m,
e_m = e - m,  f_m = f - m,  g_m = g - m,  h_m = h - m

with, e.g., a_m = [a1 - m1, a2 - m2, ..., a_{N²} - m_{N²}]^T.
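The averaging and subtraction steps can be sketched as follows; random values stand in for the M = 8 training images, and N² is an illustrative size.

```python
import numpy as np

rng = np.random.default_rng(6)
N2, M = 16, 8                       # N2 = N*N pixel values, M = 8 training faces

# Columns are the training faces a, b, ..., h (synthetic, for illustration)
faces = rng.standard_normal((N2, M))

# Average face: element-wise mean over the M training faces
m = faces.mean(axis=1, keepdims=True)

# Subtract the average face from every training face
faces_m = faces - m                 # columns are a_m, b_m, ..., h_m
```

After the subtraction, the mean-centred faces sum to zero in every pixel, which is what makes the covariance computation in the next step well defined.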


Now we build the matrix x, which is N² by M:

x = [a_m  b_m  c_m  d_m  e_m  f_m  g_m  h_m]

The covariance matrix C = x x^T is N² by N². This matrix is very large, and the computational effort of finding its eigenvalues and eigenvectors is very big. Since we are interested in at most M eigenvalues (C has at most M non-zero eigenvalues, because x has only M columns), we can reduce the dimension of the covariance (scatter) matrix by working with the M by M matrix

S = x^T x

and finding its M eigenvalues and eigenvectors. The eigenvectors of C and S are equivalent: if S u = λ u, then C (x u) = λ (x u), so x u is an eigenvector of C with the same eigenvalue.

Build the transform matrix W from the (remapped) eigenvectors of S.
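This dimension-reduction trick can be sketched as follows (synthetic data; the sizes N² and M are illustrative): the eigenvectors u of the small M×M matrix S = xᵀx are mapped to eigenvectors x u of the large C = x xᵀ with the same eigenvalues, which the code verifies explicitly.

```python
import numpy as np

rng = np.random.default_rng(2)
N2, M = 1024, 8                          # N2 = N*N pixels, M training faces

x = rng.standard_normal((N2, M))         # stand-in for the mean-centred face matrix

# Small M x M matrix instead of the huge N2 x N2 covariance matrix
S = x.T @ x
vals, u = np.linalg.eigh(S)

# Map the M small eigenvectors back to eigenvectors of C = x @ x.T
W = x @ u                                # columns are (unnormalised) eigenfaces
W /= np.linalg.norm(W, axis=0)           # normalise each eigenface

# Check: each column of W is an eigenvector of C with the same eigenvalue
C = x @ x.T
residual = np.linalg.norm(C @ W - W * vals, axis=0)
```

In the real eigenface setting N² is the number of pixels (tens of thousands), so diagonalizing the M×M matrix instead of the N²×N² one is what makes the method practical.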



Compute for each face its projection onto the face space:

x_PCA,1 = W^T a_m,  x_PCA,2 = W^T b_m,  x_PCA,3 = W^T c_m,  x_PCA,4 = W^T d_m,
x_PCA,5 = W^T e_m,  x_PCA,6 = W^T f_m,  x_PCA,7 = W^T g_m,  x_PCA,8 = W^T h_m

Compute the threshold:

θ = (1/2) max_{i,j} || x_PCA,i - x_PCA,j ||,    i, j = 1, 2, ..., M
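The threshold computation can be sketched as follows, with random vectors standing in for the M training projections (K and M are illustrative values):

```python
import numpy as np

rng = np.random.default_rng(3)
K, M = 5, 8
x_pca = rng.standard_normal((K, M))      # projections of the M training faces (synthetic)

# theta = 1/2 * max over i, j of || x_PCA,i - x_PCA,j ||
diffs = x_pca[:, :, None] - x_pca[:, None, :]
dists = np.linalg.norm(diffs, axis=0)    # M x M matrix of pairwise distances
theta = 0.5 * dists.max()
```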



To recognize a face r = [r1, r2, ..., r_{N²}]^T, subtract the average face from it:

r_m = r - m = [r1 - m1, r2 - m2, ..., r_{N²} - m_{N²}]^T


Compute its projection onto the face space:

x_PCA = W^T r_m

Compute the distance in the face space between the face and all known faces:

ε_i² = || x_PCA - x_PCA,i ||²,    i = 1, 2, ..., M
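The projection and distance steps can be sketched as follows. The "training faces" are synthetic, the eigenfaces are taken from an SVD of the mean-centred data (an equivalent way to obtain them), and the probe reuses training face 0, so its distance to that face should be essentially zero.

```python
import numpy as np

rng = np.random.default_rng(4)
N2, M, K = 64, 8, 7

# Synthetic stand-ins for the training faces (illustrative, not real images)
faces = rng.standard_normal((N2, M))
m = faces.mean(axis=1, keepdims=True)                 # average face
x0 = faces - m
W = np.linalg.svd(x0, full_matrices=False)[0][:, :K]  # K eigenfaces
x_pca_train = W.T @ x0                                # projections of known faces

# Probe face: subtract the average face and project onto the face space
r = faces[:, [0]]                                     # reuse training face 0
x_pca = W.T @ (r - m)

# Squared distances eps_i^2 to all known faces in the face space
eps_sq = np.sum((x_pca_train - x_pca) ** 2, axis=0)
```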


Reconstruct the face from the eigenfaces:

r_PCA = W x_PCA

Compute the distance between the face and its reconstruction:

ξ² = || r_m - r_PCA ||²

Distinguish between the cases:
- If ξ ≥ θ, then it is not a face
- If ξ < θ and ε_i ≥ θ for all i (i = 1, 2, ..., M), then it is a new face
- If ξ < θ and min{ε_i} < θ, then it is a known face
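The recognition steps and decision rules can be put together in one end-to-end sketch. All data is synthetic, K = M - 1 is chosen so the mean-centred training data is fully spanned by the eigenfaces, and the probe is a noisy copy of a known face, so the rules should report a known face.

```python
import numpy as np

rng = np.random.default_rng(5)
N2, M = 64, 8
K = M - 1                                # mean-centred data has rank M - 1

# Synthetic training faces (illustrative stand-ins for real images)
faces = rng.standard_normal((N2, M))
m = faces.mean(axis=1, keepdims=True)
x0 = faces - m
W = np.linalg.svd(x0, full_matrices=False)[0][:, :K]  # eigenfaces
x_pca_train = W.T @ x0

# Threshold: half the maximum pairwise distance between known projections
d = np.linalg.norm(x_pca_train[:, :, None] - x_pca_train[:, None, :], axis=0)
theta = 0.5 * d.max()

# Probe: a noisy copy of known face 0
r = faces[:, [0]] + 0.01 * rng.standard_normal((N2, 1))
r_m = r - m
x_pca = W.T @ r_m

# Reconstruction from eigenfaces and distance xi to the face space
r_pca = W @ x_pca
xi = np.linalg.norm(r_m - r_pca)

# Distances eps_i to the known faces within the face space
eps = np.linalg.norm(x_pca_train - x_pca, axis=0)

# Decision rules
if xi >= theta:
    decision = "not a face"
elif eps.min() >= theta:
    decision = "new face"
else:
    decision = "known face"
```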



Perfect reconstruction is obtained with all eigenfaces:

face = 0.4 · (eigenface 1) + 0.2 · (eigenface 2) + ... + 0.6 · (eigenface M)

A reasonable reconstruction is obtained with just a few eigenfaces:

face ≈ 0.4 · (eigenface 1) + 0.2 · (eigenface 2) + ...
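The effect of truncating the eigenface expansion can be checked numerically: with synthetic stand-in faces, the reconstruction error shrinks as more eigenfaces are kept and vanishes (up to floating point) once all of them are used.

```python
import numpy as np

rng = np.random.default_rng(7)
N2, M = 64, 8
faces = rng.standard_normal((N2, M))                  # synthetic faces
x0 = faces - faces.mean(axis=1, keepdims=True)        # mean-centred (rank M - 1)

# Left singular vectors of the mean-centred data serve as the eigenfaces
U, s, Vt = np.linalg.svd(x0, full_matrices=False)

def recon_error(k):
    # Frobenius-norm error after reconstructing from the first k eigenfaces
    W = U[:, :k]
    return float(np.linalg.norm(x0 - W @ (W.T @ x0)))

errors = [recon_error(k) for k in range(1, M)]        # k = 1 .. M-1
```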



Eigenfaces do not distinguish between shape and appearance, and PCA does not use class information: PCA projections are optimal for reconstruction from a low-dimensional basis, but they may not be optimal from a discrimination standpoint. Much of the variation from one image to the next is due to illumination changes [Moses, Adini, Ullman].

Problems with eigenfaces:
- Different illumination
- Different head pose
- Different alignment
- Different facial expression



Database Samples



