
Face Recognition by Combining Gabor Wavelets and Nearest Neighbor Discriminant Analysis

Kadir Kırtaç, Onur Dolu, Muhittin Gökmen
Istanbul Technical University, Computer Engineering Dept.
34469, Maslak, Istanbul, Turkey
{kirtac, doluo, gokmen}@itu.edu.tr

Abstract
One of the successful approaches to face recognition is the Gabor wavelets-based approach. The importance of Gabor wavelets lies in the fact that their kernels are similar to the 2-D receptive field profiles of mammalian visual neurons, offering spatial locality, spatial frequency and orientation selectivity. In this work, we propose a new Gabor wavelets-based combination for illumination- and expression-invariant face recognition. It applies Nearest Neighbor Discriminant Analysis (NNDA) to the augmented Gabor feature vectors obtained from the Gabor wavelets representation of facial images. To make use of all the features provided by the different Gabor kernels, the kernel outputs are concatenated to form an augmented Gabor feature vector. The feasibility of the proposed method is demonstrated on the Yale database through a comparison with its predecessor, NNDA. Its effectiveness is shown by a comparative performance study against standard face recognition methods, namely the combinations of Gabor wavelets with Eigenfaces and with Fisherfaces, on a subset of the FERET database containing 600 facial images of 200 subjects exhibiting both illumination and facial expression variations. The achieved recognition rate of 98 percent on the FERET test demonstrates the efficiency of the proposed method.

1. Introduction
Due to the highly informative and discriminative nature of face stimuli, face recognition has been regarded as a central biometric identification application in the computer vision community. However, because of the 3-D nature of the stimuli, the appearance of a single face can change dramatically under variations of lighting, pose and expression.

One of the successful approaches to robust face recognition is the Gabor wavelets-based approach. After the pioneering work of Daugman [1], which extended 1-D Gabor wavelets to 2-D, Gabor wavelets have been used extensively in image processing and computer vision applications. The efficiency of Gabor wavelets lies in the fact that they resemble the 2-D receptive field profiles of mammalian visual neurons, offering spatial locality, spatial frequency and orientation selectivity. The Gabor wavelets representation of facial images should therefore be robust to illumination and facial expression variations. In [2], Lades et al. introduced the use of Gabor wavelets for face recognition with the Dynamic Link Architecture (DLA). The DLA forms and stores deformable model graphs whose vertices are labeled by Gabor jets computed over a rectangular subgrid centered on the object to be stored; it then applies flexible graph matching between the model graphs and the image graph. In [3], Wiskott et al. extended the DLA to Elastic Bunch Graph Matching. They utilized phase information for accurate node localization and introduced the bunch graph data structure, which combines the jets of a small set of individuals; in recognition, matching takes place between a bunch graph and an image graph. In [4], Donato et al. applied Gabor wavelets to facial action classification, achieving their best results with the Gabor wavelets approach and with an ICA-based scheme. In [5], Liu and Wechsler combined Gabor wavelets with the Enhanced Fisher Linear Discriminant Model (EFM) for face recognition: they applied a set of Gabor kernels to the whole face image to obtain augmented Gabor feature vectors, and then performed feature extraction on these vectors using the EFM. In this work, we propose a new Gabor wavelets-based combination, Gabor+NNDA. It applies Nearest Neighbor Discriminant Analysis [6] to the


augmented Gabor feature vectors obtained from the Gabor wavelets representation of facial images. To make use of all the features provided by the different Gabor kernels, the kernel outputs are concatenated to form an augmented Gabor feature vector. The feasibility of the proposed method is demonstrated on the Yale database [7] through a comparison with its predecessor, NNDA. Its effectiveness is shown by a comparative performance study against standard face recognition methods, namely the combinations of Gabor wavelets with Eigenfaces and with Fisherfaces, on a subset of the FERET database containing 600 facial images of 200 subjects exhibiting both illumination and facial expression variations. The achieved recognition rate of 98 percent on the FERET test demonstrates the efficiency of the proposed method.

2. Two-dimensional Gabor wavelets

Gabor wavelets (kernels) are a set of filters $\psi_{\mu,\nu}$ defined as

$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^2}{\sigma^2} \exp\left(-\frac{\|k_{\mu,\nu}\|^2 \|z\|^2}{2\sigma^2}\right) \left[ e^{i k_{\mu,\nu} \cdot z} - e^{-\sigma^2/2} \right],$  (1)

where $\mu$ and $\nu$ indicate the orientation and scale of the kernels, $z = (x, y)$ denotes the image coordinates, $\|\cdot\|$ denotes the norm operator and $k_{\mu,\nu}$ is the wave vector. Each kernel is the product of a Gaussian envelope function and a complex plane wave. The wave vector $k_{\mu,\nu}$ is defined as

$k_{\mu,\nu} = k_\nu e^{i\phi_\mu},$  (2)

where $k_\nu = k_{\max}/f^\nu$ and $\phi_\mu = \pi\mu/8$. Here $k_{\max}$ is the maximum frequency, $\sigma$ is the width of the Gaussian along the x and y axes, and $f$ is the spacing factor between kernels in the frequency domain [2]. Lades et al. investigated $\sigma = 2\pi$, $f = \sqrt{2}$ and $k_{\max} = \pi/2$, which yielded optimal results along with 5 scales, $\nu \in \{0, \dots, 4\}$, and 8 orientations, $\mu \in \{0, \dots, 7\}$. In [8], Shen et al. also discussed tuning the Gabor kernel parameters and showed in two experiments that 5 scales and 8 orientations yield optimal recognition performance. The first exponential term in the square brackets in (1) is the oscillatory part, while the second compensates for the DC value of the kernel, making the filter independent of the absolute intensity of the image. The kernel has a complex response, combining a real (cosine) part and an imaginary (sine) part. The wavelets are parameterized by $k_{\mu,\nu}$, which controls the width of the Gaussian window and the scale and orientation of the oscillatory part. The parameter $\sigma$ determines the ratio of window width to scale, in other words, the number of oscillations under the envelope function [2].

2.1. Two-dimensional Gabor wavelets-based feature representation

The 2-D Gabor wavelets representation of an image is the convolution of the image with the family of kernels $\psi_{\mu,\nu}$, where $\mu$ is the orientation and $\nu$ is the spatial scale of the kernel. The convolution of an image $I(z)$ with a Gabor kernel $\psi_{\mu,\nu}$ yields $G_{\mu,\nu}(z)$, defined as

$G_{\mu,\nu}(z) = I(z) * \psi_{\mu,\nu},$  (3)

where $z = (x, y)$ and $*$ denotes the convolution operation. Thus, the set $S = \{G_{\mu,\nu}(z) : \mu \in \{0, \dots, 7\},\ \nu \in \{0, \dots, 4\}\}$ forms the Gabor wavelets representation of the image $I(z)$.
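For concreteness, the following Python/NumPy sketch builds the 40-kernel bank of (1)-(2) and the representation set S of (3). It is a minimal illustration, not the authors' implementation: the 33-pixel window size and the FFT-based convolution are our own choices, while the parameter values follow Lades et al. [2] as quoted above.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(mu, nu, size=33, sigma=2*np.pi, kmax=np.pi/2, f=np.sqrt(2)):
    # Wave vector of Eq. (2): k_nu = kmax / f**nu, phi_mu = pi*mu/8.
    k = kmax / f**nu
    phi = np.pi * mu / 8
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2, z2 = k**2, x**2 + y**2          # ||k_{mu,nu}||^2 and ||z||^2
    envelope = (k2 / sigma**2) * np.exp(-k2 * z2 / (2 * sigma**2))
    # Oscillatory plane wave minus the DC-compensation term of Eq. (1).
    wave = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)
    return envelope * wave

def gabor_representation(img):
    # The set S of Eq. (3): one complex response per (orientation, scale).
    return {(mu, nu): fftconvolve(img, gabor_kernel(mu, nu), mode="same")
            for mu in range(8) for nu in range(5)}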

3. Nearest Neighbor Discriminant Analysis


Nearest neighbor discriminant analysis (NNDA) is a nonparametric feature extraction method which forms the between-class and within-class scatter matrices in a nonparametric way [6]. Consider a c-class problem with classes $C_i$ ($i = 1, 2, \dots, c$) and training samples $\{x_1, x_2, \dots, x_N\}$. The extra-class and intra-class nearest neighbors of a sample $x_n \in C_i$ are defined as

$x_n^E = \arg\min_z \|z - x_n\|,\ z \notin C_i,$  (4)

$x_n^I = \arg\min_z \|z - x_n\|,\ z \in C_i,\ z \neq x_n.$  (5)
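As an illustration, (4) and (5) can be computed directly with NumPy. In this sketch the rows of X are the training samples and labels holds their class indices; both names are our own.

import numpy as np

def nearest_neighbors(X, labels, n):
    # Euclidean distances from sample x_n to every training sample.
    d = np.linalg.norm(X - X[n], axis=1)
    extra = np.flatnonzero(labels != labels[n])               # z not in C_i
    intra = np.flatnonzero((labels == labels[n])
                           & (np.arange(len(X)) != n))        # z in C_i, z != x_n
    x_E = X[extra[np.argmin(d[extra])]]                       # Eq. (4)
    x_I = X[intra[np.argmin(d[intra])]]                       # Eq. (5)
    return x_E, x_I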

The nonparametric extra-class and intra-class differences are defined as

$\Delta_n^E = x_n - x_n^E,$  (6)

$\Delta_n^I = x_n - x_n^I.$  (7)

Using the extra-class and intra-class differences defined above, the nonparametric between-class and within-class scatter matrices are defined as

$S_B = \sum_{n=1}^{N} w_n\, (\Delta_n^E)(\Delta_n^E)^T,$  (8)

$S_W = \sum_{n=1}^{N} w_n\, (\Delta_n^I)(\Delta_n^I)^T,$  (9)

where the weight $w_n$ is defined as

$w_n = \frac{\|\Delta_n^I\|}{\|\Delta_n^I\| + \|\Delta_n^E\|}.$  (10)

$w_n$ is introduced to emphasize the samples near class boundaries and to deemphasize the samples near class centers. Utilizing the fact that the accuracy of nearest neighbor classification can be directly computed by

$\Theta_n = \|\Delta_n^E\|^2 - \|\Delta_n^I\|^2,$  (11)

Qiu and Wu [6] arrived at the following solution for the computation of the projection matrix $W$,

$W = \arg\max_W\ \mathrm{tr}\left(W^T (S_B - S_W) W\right).$  (12)

Thus, the columns of the projection matrix $W$ are the $m$ leading eigenvectors of $S_B - S_W$, corresponding to the $m$ greatest eigenvalues. Because obtaining the extra- and intra-class nearest neighbors is computationally inefficient when the data dimensionality is high, PCA is first applied to reduce the dimension of the data to $N-1$ (the rank of the total scatter matrix) by removing the null space of the total scatter matrix. To keep the nonparametric extra-class and intra-class differences of the high-dimensional space consistent with the projected extra-class and intra-class differences, Qiu and Wu [6] proposed a stepwise dimensionality reduction process. In this scheme, the nonparametric between-class and within-class matrices are recomputed each time in the current dimensionality, and the process is repeated until the desired dimensionality is reached. They also utilized a k-NN classification criterion in the training phase, rewriting the nonparametric extra- and intra-class differences as

$\Delta^E = x - x^E_{[k/2]},$  (13)

$\Delta^I = x - x^I_{([k/2]+1)},$  (14)

where $x^I_{([k/2]+1)}$ is the $([k/2]+1)$-th intra-class nearest neighbor and $x^E_{[k/2]}$ is the $[k/2]$-th extra-class nearest neighbor of the sample $x$. If the distance from $x$ to $x^I_{([k/2]+1)}$ is smaller than the distance from $x$ to $x^E_{[k/2]}$, then $x$ will be classified correctly by a k-NN classifier.
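A single optimization step of (6)-(12) can be sketched as follows, reusing the nearest_neighbors helper above. This is our own sketch, not the reference implementation: the stepwise scheme of [6] would wrap it in a loop that shrinks the dimensionality gradually, and the k-NN variant of (13)-(14) would select the [k/2]-th and ([k/2]+1)-th neighbors instead of the first.

import numpy as np

def nnda_step(X, labels, d):
    # Nonparametric scatter matrices of Eqs. (8)-(9) with weights of Eq. (10).
    N, D = X.shape
    S_B = np.zeros((D, D))
    S_W = np.zeros((D, D))
    for n in range(N):
        x_E, x_I = nearest_neighbors(X, labels, n)
        dE, dI = X[n] - x_E, X[n] - x_I          # differences of Eqs. (6)-(7)
        w = np.linalg.norm(dI) / (np.linalg.norm(dI) + np.linalg.norm(dE))
        S_B += w * np.outer(dE, dE)
        S_W += w * np.outer(dI, dI)
    # Eq. (12): columns of W are the d leading eigenvectors of S_B - S_W.
    evals, evecs = np.linalg.eigh(S_B - S_W)     # symmetric matrix, so eigh
    return evecs[:, np.argsort(evals)[::-1][:d]]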

4. Face recognition by combining Gabor wavelets and Nearest Neighbor Discriminant Analysis
After obtaining the feature representation described in (3), each $G_{\mu,\nu}(z)$ is downsampled by a scale factor $\rho$ and normalized to zero mean and unit variance. Downsampling is carried out by first smoothing the $G_{\mu,\nu}(z)$ image with a 5x5 Gaussian window and then picking out every $(k \cdot \sqrt{\rho})$-th sample in both the x and y directions, where $1 \leq k \leq (w/\sqrt{\rho})$, $k \in \mathbb{Z}$, and $w$ is the width of $G_{\mu,\nu}(z)$. The training algorithm of Gabor+NNDA is given in Table 1.

Table 1: Training algorithm of Gabor+NNDA.
(1) Given $D$-dimensional samples $\{x_1, x_2, \dots, x_N\}$, a $d$-dimensional discriminant subspace is searched.
(2) Normalize each sample $x_i$ to zero mean and unit variance.
(3) Apply a set of 40 Gabor kernels (5 scales and 8 orientations) to each sample $x_i$, resulting in $G_{i,\mu,\nu}(z)$; $i \in \{1, \dots, N\}$, $\mu \in \{0, \dots, 7\}$, $\nu \in \{0, \dots, 4\}$.
(4) Downsample each filter output $G_{i,\mu,\nu}(z)$ by a factor of $\rho$ to obtain $G^{(\rho)}_{i,\mu,\nu}(z)$, and normalize the final $G^{(\rho)}_{i,\mu,\nu}(z)$ to zero mean and unit variance.
(5) Concatenate the rows (or columns) of each resultant $G^{(\rho)}_{i,\mu,\nu}(z)$ to form an augmented feature vector $x^{(\rho)}_i = \{G^{(\rho)}_{i,0,0} | G^{(\rho)}_{i,0,1} | \dots | G^{(\rho)}_{i,7,4}\}^t$.
(6) Form the final Gabor feature matrix $X^{(\rho)}$ by assembling the vectors $x^{(\rho)}_i$ as columns, side by side: $X^{(\rho)} = \{x^{(\rho)}_1 | x^{(\rho)}_2 | \dots | x^{(\rho)}_N\}$.
(7) Apply PCA to $X^{(\rho)}$ to learn the PCA projection matrix $T_{pca} = [\psi_1 | \psi_2 | \dots | \psi_{N-1}]$, $T_{pca} \in \mathbb{R}^{D \times (N-1)}$.
(8) Project the feature matrix $X^{(\rho)}$ with the learned PCA model: $Y_{pca} = T^t_{pca} X^{(\rho)}$.
(9) Apply NNDA on $Y_{pca}$ to learn the NNDA projection matrix $T_{nnda} = [\phi_1 | \phi_2 | \dots | \phi_d]$, $T_{nnda} \in \mathbb{R}^{(N-1) \times d}$.
(10) Project the Gabor+PCA feature matrix $Y_{pca}$ with the learned NNDA model: $Y = T^t_{nnda} Y_{pca}$.

In recognition, steps (2) to (5) of the training algorithm are applied to a test image $y$ in the same way, yielding $y^{(\rho)}$. Then $y^{(\rho)}$ is projected using the projection matrix $T_{nnda}$,

$y' = T^t_{nnda}\, y^{(\rho)}.$  (15)

Finally, the L2 distance measure is applied to identify $y$ with the label of the closest feature vector in the Gabor+NNDA feature space,

$L_2(y', y) = (y' - y)^t (y' - y).$  (16)
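Steps (2)-(5) of Table 1 can be sketched in Python as follows, building on the gabor_representation helper from Section 2. Taking the magnitude of the complex Gabor response and approximating the 5x5 Gaussian window with scipy's gaussian_filter are our own simplifications; the stride assumes $\rho$ is a perfect square, as with $\rho = 64$ in the experiments. PCA and the NNDA projection, steps (7)-(10), would then be applied to the matrix of these vectors, and a test image is matched by the L2 distance of (16).

import numpy as np
from scipy.ndimage import gaussian_filter

def augmented_gabor_vector(img, rho=64):
    img = (img - img.mean()) / img.std()             # step (2): zero mean, unit variance
    step = int(np.sqrt(rho))                         # keep every sqrt(rho)-th sample
    parts = []
    for (mu, nu), G in sorted(gabor_representation(img).items()):   # step (3)
        mag = gaussian_filter(np.abs(G), sigma=1.0)  # smooth before downsampling
        sub = mag[::step, ::step].ravel()            # step (4): downsample by rho
        sub = (sub - sub.mean()) / sub.std()         # normalize each filter output
        parts.append(sub)
    return np.concatenate(parts)                     # step (5): x_i^(rho)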

5. Experimental Results
We performed the first set of experiments on the Yale database [7] to compare the proposed method with its predecessor, NNDA. The Yale database contains 165 images of 15 subjects, 11 images per subject, one for each of the following facial expressions or configurations: center-light, with glasses, happy, left-light, without glasses, normal, right-light, sad, sleepy, surprised, and wink. Scaled images of size 64x64 are used in this experiment. The downsampling factor of Gabor+NNDA is 64, i.e., $\rho = 64$. Five partitions are formed by randomly selecting 5 images per subject for training and leaving the remaining 6 images for testing in each trial. The step size of NNDA is set to 5, and the reduced subspace has 14 dimensions; note that the original dimensionality is 64x64 = 4,096. While the step size is held constant, the $(k, \alpha)$ tuple is varied over $k \in \{1, 3, 5\}$, $\alpha \in \{0, 1, 2, 3\}$. The result is shown in Figure 1.
Figure 1: Relative face recognition performance of Gabor+NNDA and NNDA on the Yale database (recognition rate versus the $(k, \alpha)$ tuple; plot not reproduced). Reduced feature dimension is 14; step size is set to 5.

From Figure 1 it can be observed that Gabor+NNDA outperforms NNDA on all $(k, \alpha)$ tuples, reaching 90 percent accuracy at 14 feature dimensions with step size = 5, $\alpha = 1$, and $k = 5$.

The next experiment compares two standard methods, Gabor+Eigenfaces and Gabor+Fisherfaces [5], with the proposed Gabor+NNDA on a subset of the FERET database [9]. The FERET database contains 1,564 sets of images, a total of 14,126 images of 1,199 individuals together with 365 duplicate sets. The experiment uses 600 facial images of 200 subjects, three images of size 256x384 with 256 gray levels per subject. The three images of each subject vary as follows: the first is in a neutral expression, the second shows a different expression from the first, and the third has different illumination from the first two. First, the centers of the eyes are manually detected and aligned to predefined locations by rotation and scaling transformations. The face image is then cropped to 128x128 to extract the facial region, which is further normalized to zero mean and unit variance. The training parameters of Gabor+NNDA are set to step size = 13, $\alpha = 1$, and $k = 1$. The result is shown in Figure 2.

Figure 2: Comparative face recognition performance of Gabor+Eigenfaces, Gabor+Fisherfaces and Gabor+NNDA on the augmented Gabor feature vector $X^{(\rho)}$, downsampled by a factor of 64, i.e., $\rho = 64$ (recognition rate versus number of features; plot not reproduced).

In this experiment, Gabor+NNDA achieves the highest recognition rate, 98 percent at 65 feature dimensions, while Gabor+Fisherfaces achieves 92.6 percent and Gabor+Eigenfaces 40.6 percent at the same dimensionality. Figure 2 shows the overall superior performance of the proposed method over the Gabor+Eigenfaces and Gabor+Fisherfaces methods.

6. Discussion and Conclusion


2-D Gabor wavelets are optimally localized in the space and frequency domains, and yield image features that are robust to illumination, expression and pose variations. Nearest Neighbor Discriminant Analysis (NNDA) [6] has been shown to be an efficient nonparametric feature extraction tool from the point of view of nearest neighbor classification. It does not suffer from the small sample size problem, and owing to its nonparametric nature it does not need to estimate any parametric distribution. Moreover, it does not suffer from the singularity of the within-class scatter matrix, since no matrix inversion is required in the eigenvector computation. Due to its nonparametric structure and the stepwise dimensionality reduction process, the training time complexity of NNDA is greater than that of the Fisherfaces method [7]; however, NNDA gives higher classification accuracy than Fisherfaces, and the two methods have the same time complexity in the recognition phase.

Gabor+NNDA extracts important discriminant features by exploiting the strengths of both Gabor wavelets and NNDA. The efficiency of the approach is shown with experiments on both the Yale and FERET databases. It achieved a 98 percent classification accuracy at 65 feature dimensions, outperforming the standard methods Gabor+Fisherfaces and Gabor+Eigenfaces [5] on a 200-class subset of the FERET database exhibiting both illumination and expression variations. In the Yale test we also observed that classification accuracy increases with increasing k and decreasing alpha values, which agrees with the theoretical results of [6]. The FERET test also confirmed the benefit of the stepwise dimensionality reduction process, reaching 98 percent accuracy after 13 steps; step sizes greater than 13, however, did not increase the classification accuracy and were not effective in terms of time complexity. In our experiments, training with 400 FERET images took 80 seconds for 13 steps using Matlab R14 on an Intel P4 3.2 GHz configuration.

It should be noted that the original NNDA work [6] gives no guidance on how to select the parameters alpha, k and step size. An optimization scheme such as evolutionary computing could be used to select the training parameters, though this would increase the time complexity of the system. Given the effectiveness of kernel approaches in Gabor wavelets-based face recognition [8], NNDA could also be extended to a kernel variant, with Gabor features utilized in the kernel-NNDA space. Moreover, instead of a simple distance measure such as the L2 norm, more sophisticated classification schemes such as Support Vector Machines or Neural Networks could be applied, although there is no guarantee that SVM or Neural Network classifiers would be effective in the Gabor+NNDA feature space.

Acknowledgments
This work has been partially supported by TUBITAK (the National Science Council of Turkey) under the grant "National Scholarship Programme for MSc Students" and TUBITAK project number 104E121. The authors would like to thank Fatih Kahraman and Abdülkerim Çapar for valuable discussions.

References
[1] J. G. Daugman, "Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters," J. Opt. Soc. Amer., vol. 2, no. 7, pp. 1160–1169, 1985.
[2] M. Lades, J. C. Vorbrüggen, J. M. Buhmann, J. M. Lange, C. von der Malsburg, R. P. Würtz, and W. Konen, "Distortion Invariant Object Recognition in the Dynamic Link Architecture," IEEE Trans. Computers, vol. 42, no. 3, pp. 300–311, 1993.
[3] L. Wiskott, J.-M. Fellous, N. Krüger, and C. von der Malsburg, "Face Recognition by Elastic Bunch Graph Matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 775–779, 1997.
[4] G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman, and T. J. Sejnowski, "Classifying Facial Actions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 10, pp. 974–989, 1999.
[5] C. Liu and H. Wechsler, "Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition," IEEE Trans. Image Processing, vol. 11, no. 4, pp. 467–476, 2002.
[6] X. Qiu and L. Wu, "Nearest Neighbor Discriminant Analysis," Int. J. Pattern Recognition and Artificial Intelligence, vol. 20, no. 8, pp. 1245–1260, 2006.
[7] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, 1997.
[8] L. Shen, L. Bai, and M. C. Fairhurst, "Gabor wavelets and General Discriminant Analysis for face identification and verification," Image and Vision Computing, vol. 25, no. 5, pp. 553–563, 2007.
[9] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and Vision Computing, vol. 16, no. 5, pp. 295–306, 1998.
