Pattern Recognition 41 (2008) 396–405
www.elsevier.com/locate/pr

Elastic shape-texture matching for human face recognition
Xudong Xie, Kin-Man Lam ∗
Department of Electronic and Information Engineering, Centre for Multimedia Signal Processing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

Received 26 August 2006; received in revised form 7 March 2007; accepted 12 June 2007

Abstract

In this paper, a novel elastic shape-texture matching method, namely ESTM, for human face recognition is proposed. In our approach, both the shape and the texture information are used to compare two faces without establishing any precise pixel-wise correspondence. The edge map is used to represent the shape of an image, while the texture information is characterized by both the Gabor representations and the gradient direction of each pixel. Combining these features, a shape-texture Hausdorff distance is devised to compute the similarity of two face images. The elastic matching is robust to small, local distortions of the feature points, such as those caused by facial expression variations. In addition, the use of the edge map, the Gabor representations and the direction of the image gradient can all alleviate the effect of illumination to a certain extent. With different databases, experimental results show that our algorithm consistently achieves a better performance than other face recognition algorithms under different conditions, except when an image is under poor and uneven illumination. Experiments based on the Yale database, AR database, ORL database and YaleB database show that our proposed method can achieve recognition rates of 88.7%, 97.7%, 78.3% and 89.5%, respectively.
© 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Face recognition; Hausdorff distance; Gabor wavelets; Elastic shape-texture matching

1. Introduction

The morphable face model [1,2] has achieved great success in encoding and representing human face images. This approach separates a given image into its shape and texture information. The shape encodes the feature geometry of the face, which is represented by a set of facial feature points and can be used to construct a pixel-wise correspondence with a reference image. The texture, which is shape-free, can be obtained after mapping the original image onto the reference image. Therefore, the shape-free texture information can be constructed only after the shape information about a face has been obtained. Although many different methods have been proposed to locate facial features [3,4] and to detect face contours [5,6], it is still a challenge to accomplish this automatically. Furthermore, Ref. [2] reported that the morphable face approach

∗ Corresponding author. Tel.: +852 2362 8439; fax: +852 2766 6207.

E-mail address: enkmlam@polyu.edu.hk (K.-M. Lam).

cannot achieve a robust performance for images under various conditions.

Psychological studies have indicated that line drawings of objects can be recognized as quickly and almost as accurately as photographs [7], which means that the edge-like retinal images of faces can be used for face recognition at the level of early vision. Therefore, the edges of a face image can be considered the aggregate of important feature points that are useful for face recognition. The Hausdorff distance [8] is such an approach, whereby the distance between two edge maps or point sets can be calculated without the explicit pairing of the points: the smaller the Hausdorff distance, the smaller the difference or deformation between the two corresponding edge maps, and the more similar the two corresponding face images. Takács [9] introduced a "doubly" modified Hausdorff distance (M2HD), which provides a more reliable and robust distance measure between two point sets than the original one. A spatially weighted modified Hausdorff distance [10] has also been proposed, which considers the importance of facial features and allocates different weights to the points according to

0031-3203/$30.00 © 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2007.06.008

the importance of the facial regions. Ref. [11] incorporates the a priori structure of a human face, namely eigen-mask, to emphasize the importance of facial regions, and achieves a better performance level. All these methods are based on edge maps without considering any texture information about the input images.

The Gabor wavelets (GW), whose kernels are similar to the response of the 2-D receptive field profiles of the mammalian simple cortical cell [12], exhibit the desirable characteristics of capturing salient visual properties such as spatial localization, orientation selectivity, and spatial frequency [13], which are useful for texture detection and face recognition [14,15]. In Ref. [14], the GW have been applied to face recognition via the dynamic link architecture (DLA) framework. The DLA first computes the Gabor jets of the face images, and then elastic graph matching (EGM) is used to compare their resulting image structures. Ref. [16] has introduced an automatic weighting for the nodes of the elastic graph according to their significance, and has also explored the significance of the elastic deformation for the application of face-based person authentication. The morphological DLA proposed in Ref. [17] adopts discriminatory power coefficients to weigh the matching error at each grid node. Although these methods can preserve some texture features and local geometry information [18], the graph structure cannot sufficiently and effectively represent the distribution of all the feature points of human faces.

In this paper, we propose a novel elastic shape-texture matching (ESTM) method for face recognition. In our approach, the edge map is used to represent the shape information about a face image, instead of using some specific feature points that are very difficult to locate accurately in practice. The angles (gradient direction) of the edge points [19], which provide additional information about the shape, are also incorporated in our algorithm. As the magnitudes of the Gabor representations can provide a measure of the local properties of an image [14], the GW can effectively abstract local and discriminating features, and are therefore used to extract the texture information in our method. Based on the shape and the texture information, an elastic matching is proposed for face recognition. The elastic matching is robust to small, local distortions of the feature points, such as facial expression variations, while the edge map, the Gabor representations and the direction of the image gradient can all alleviate the effect of illumination to a certain extent. Our algorithm, called ESTM, can therefore reduce the effect of expressions and achieve a good recognition performance under the different variations. Experimental results based on different databases show that ESTM outperforms other methods that employ either the shape (edge map) or the texture (GW) information only under various image conditions.

This paper is organized as follows. Our new ESTM method is presented in Section 2. Experimental results, which compare the performance of our proposed algorithm to other face recognition algorithms based on the Yale database [20], the AR database [21], the ORL database [22] and the YaleB database [23], are given in Section 3. Finally, conclusions are drawn in Section 4.

2. Elastic shape-texture matching

It has been shown that the combined shape and texture feature carries the most discriminating information for human face recognition [1]. In fact, these two features are complementary to each other and contain the complete information about face images. We therefore propose an efficient algorithm which combines these two types of information for face recognition. Unlike the morphable face model, our method does not need to find the pixel-wise correspondence between images, which is a very difficult task in practical applications.

2.1. Edge maps, Gabor maps and angle maps

In order to obtain the edge map of a face image, morphological operations [19] are first applied. The output of an image after edge detection is called an edge image, denoted as EG(x, y), while the binary image produced after a thresholding procedure is called an edge map of the image. In our approach, we consider not only the edge image, but also the intensity values of the original image f(x, y), where f(x, y) represents the gray-level intensity of the image at coordinates (x, y). This is because the important facial features, such as the eyes, mouth, etc., usually have lower gray-level intensities than other parts of a face. We therefore consider n(x, y) = EG(x, y)/f(x, y), so that a pixel which has a larger value of n(x, y) can be considered more likely to be an edge point of the facial features. When determining the threshold, the values of n(x, y) are sorted in descending order, and the threshold is set so that the 12% of the points with the largest magnitudes of n(x, y) are selected, where the threshold value of 12% is obtained by experiment based on the Yale database. This threshold can achieve the best recognition result with this database, and will then be employed for the other databases in our experiments. The binary edge map obtained is denoted as E(x, y), which represents the shape information about a face image. Fig. 1(b) shows the edge images EG obtained using morphological edge detection, and Fig. 1(c) displays the corresponding edge maps obtained by using this adaptive thresholding scheme.

As the magnitudes of the Gabor representations can provide a measure of the local properties of an image [14], they are used to characterize the corresponding texture information. The Gabor map of an image is denoted as G̃(x, y), which is obtained by concatenating the GW representations at different center frequencies and orientations.

Furthermore, we consider the gradient direction [19] of each edge point as a supplement to representing a shape. The gradient direction in this paper is also called the angle information, and is defined as follows:

Gx(x, y) = f(x, y) ∗ Kx(x, y),   Gy(x, y) = f(x, y) ∗ Ky(x, y),   (1)

A(x, y) = arctan( Gy(x, y) / Gx(x, y) ),   (2)

where Kx(x, y) and Ky(x, y) are the Sobel horizontal and vertical gradient kernels, and ∗ denotes the 2-D convolution operator. The angle map of an image is denoted as A(x, y).
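The edge-map and angle-map computations above can be sketched in a few lines of NumPy. This is our illustration, not the authors' code: the 3x3 morphological gradient and the small `eps` guard against division by zero are assumptions, since the paper only cites Ref. [19] for the operators.

```python
import numpy as np

def morphological_edge(f):
    """Gray edge image EG via a 3x3 morphological gradient
    (dilation minus erosion); a stand-in for the operations of Ref. [19]."""
    p = np.pad(f.astype(float), 1, mode='edge')
    h, w = f.shape
    shifts = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return shifts.max(axis=0) - shifts.min(axis=0)

def edge_map(f, rho=0.12, eps=1.0):
    """Binary edge map E: keep the rho fraction of pixels with the largest
    n(x, y) = EG(x, y)/f(x, y); dark features (eyes, mouth) score higher."""
    n = morphological_edge(f) / (f.astype(float) + eps)  # eps avoids /0 (assumption)
    k = int(rho * f.size)
    thresh = np.sort(n.ravel())[::-1][k]  # value of the k-th largest n
    return n > thresh

def angle_map(f, eps=1e-12):
    """Angle map A = arctan(Gy/Gx) from the Sobel kernels, Eqs. (1)-(2).
    The loop computes a correlation; convolution only flips the signs of
    Gx and Gy, which leaves the ratio Gy/Gx, and hence A, unchanged."""
    Kx = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    Ky = Kx.T
    p = np.pad(f.astype(float), 1, mode='edge')
    h, w = f.shape
    Gx = sum(Kx[i, j] * p[i:i + h, j:j + w] for i in range(3) for j in range(3))
    Gy = sum(Ky[i, j] * p[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return np.arctan(Gy / (Gx + eps))  # values in (-pi/2, pi/2), as in the paper
```

On a synthetic image with a single vertical intensity step, the selected edge points concentrate on the darker side of the boundary and the gradient direction is horizontal (A = 0), which matches the intent of the adaptive threshold.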

Fig. 1. (a) The original facial images; (b) the edge images obtained by morphological operations; (c) the edge maps obtained by the adaptive thresholding method.

To reduce the dimension of the Gabor representation, only one center frequency and eight orientations are considered in our algorithm. The center frequency is chosen to be π/2, and the orientation varies from 0 to 7π/8 in steps of π/8. The dimension of G̃(x, y) is therefore determined by the numbers of center frequencies and orientations used. The angle information is also useful for describing the shape; the gradient direction of a pixel varies from −π/2 to π/2.

2.2. Shape-texture Hausdorff distance

For an image, the edge map E(x, y), Gabor map G̃(x, y) and angle map A(x, y) can be computed as described above. Given two human face images A and B, two finite point sets AP = {a1, . . . , aNA} and BP = {b1, . . . , bNB} can be obtained, where the elements in AP and BP correspond to the points in the edge maps EA and EB of the original images, respectively. Then, our directed shape-texture Hausdorff distance is defined as follows:

hst(A, B) = (1/NA) Σ_{a∈AP} max( I · min_{b∈N_B^P(a)} d(a, b), (1 − I) · P ),   (3)

where d(a, b) is a distance measure between the point pair (a, b), P is an associated penalty, NA and NB are the corresponding numbers of points in the sets AP and BP, and I is an indicator, which is equal to 1 if there exists a point b ∈ N_B^P(a), and equal to 0 otherwise. Here

N_B^P(a) = { b ∈ BP : ‖b − a‖∞ ≤ (D − 1)/2 }   (4)

is the neighborhood of the point a in the set BP, i.e. the edge points of B that fall within a D × D window centered at a.

In fact, d(a, b) can be considered as a combination of three parts, which consists of three different terms as follows:

d(a, b) = α · de(a, b) + β · dg(a, b) + γ · da(a, b),   (5)

where de(a, b), dg(a, b) and da(a, b) are the edge distance, Gabor distance and angle distance, respectively, and α, β and γ are the coefficients used to adjust the weights of these three distance measures. All three measures are independent of each other and are defined as follows:

de(a, b) = ‖a − b‖,   (6)

dg(a, b) = ‖G̃A(a) − G̃B(b)‖,   (7)

da(a, b) = |AA(a) − AB(b)|,   (8)

where ‖ · ‖ is an underlying norm (the L2 norm is used in our method), G̃A and G̃B are the Gabor maps, and AA and AB are the angle maps of the two images, respectively. Similarly, the penalty P in Eq. (3) consists of three parts:

P = α · Pe + β · Pg + γ · Pa,   (9)

where Pe, Pg and Pa are the corresponding penalties for the three distance measures, and α, β and γ have the same values as in Eq. (5). An advantage of using Eq. (9) instead of a fixed P is that this allows us to adopt different penalties for the different distance measures. For the Gabor distance, we define Pg(a) = ‖G̃A(a) − G̃B(a)‖, i.e. the corresponding Gabor representations of image B at the position a will be considered when computing the penalty for the Gabor distance.

Finally, the shape-texture Hausdorff distance is

H(A, B) = max( hst(A, B), hst(B, A) ).   (10)
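The distance of Eqs. (3)-(10) can be written as a small reference sketch. The following is our illustration, not the authors' code; the weights `w` and penalties `Pe`, `Pa` are placeholder values rather than the tuned settings of Table 1, and the Gabor/angle maps are passed in as arrays indexed by pixel coordinates.

```python
import numpy as np

def directed_st_hausdorff(Ap, Bp, GA, GB, AA, AB,
                          w=(0.02, 0.93, 0.05), win=4,
                          Pe=4.0, Pa=np.pi / 20):
    """Directed distance h_st(A, B) of Eq. (3). w = (alpha, beta, gamma);
    win = 4 gives the 9 x 9 search neighborhood used in the experiments."""
    alpha, beta, gamma = w
    total = 0.0
    for a in Ap:
        # Neighborhood N_B^P(a): edge points of B within the window, Eq. (4).
        nb = [b for b in Bp
              if abs(b[0] - a[0]) <= win and abs(b[1] - a[1]) <= win]
        if nb:  # indicator I = 1: best combined distance d(a, b), Eq. (5)
            total += min(alpha * np.hypot(a[0] - b[0], a[1] - b[1])   # d_e, Eq. (6)
                         + beta * np.linalg.norm(GA[a] - GB[b])       # d_g, Eq. (7)
                         + gamma * abs(AA[a] - AB[b])                 # d_a, Eq. (8)
                         for b in nb)
        else:   # I = 0: pay the combined penalty of Eq. (9)
            Pg = np.linalg.norm(GA[a] - GB[a])  # adaptive Gabor penalty
            total += alpha * Pe + beta * Pg + gamma * Pa
    return total / len(Ap)

def st_hausdorff(Ap, Bp, GA, GB, AA, AB, **kw):
    """Symmetric shape-texture Hausdorff distance H(A, B), Eq. (10)."""
    return max(directed_st_hausdorff(Ap, Bp, GA, GB, AA, AB, **kw),
               directed_st_hausdorff(Bp, Ap, GB, GA, AB, AA, **kw))
```

As a sanity check, two identical point sets give H = 0, and shifting one edge point by a single pixel changes H by exactly alpha times the average point displacement, since only the edge term is affected.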

The values of {α, β, γ} are the weights of the three distance measures, which affect the recognition results. Here we should note that α, β and γ are normalized such that α + β + γ = 1. If β = 0 and γ = 0, ESTM is equivalent to M2HD. Table 1 shows some combinations of {α, β, γ}, which will be tested in Section 3, where the Yale database is used as the training data. For each of the combinations, the corresponding best set of parameters is also tabulated.

For simplicity, the penalties Pe(a) and Pa(a) are set as fixed values in our algorithm, while the Gabor penalty Pg(a) is adaptive. That is, we use P(a) instead of a fixed value P in Eq. (3), i.e.

hst(A, B) = (1/NA) Σ_{a∈AP} max( I · min_{b∈N_B^P(a)} d(a, b), (1 − I) · P(a) ).   (11)

From Eq. (11), we can see that the value of the penalty is adaptive to the point under consideration. Due to the fact that the magnitudes of the GW representation are less sensitive to the lighting conditions [24], they are useful to alleviate the effect of being unable to detect the edges under poor lighting.

2.3. ESTM for face recognition

Using the shape-texture Hausdorff distance, an ESTM for face recognition is proposed, which can tolerate small and local distortions of a human face and reduce the shape variations caused by expressions or perspective. For feature matching, the searching is non-rigid. Because only edge points are considered when computing the distance, and the matching is performed within a neighborhood, both the computational complexity and the memory requirement are greatly reduced. In fact, this ESTM approach can be considered as a combination of template matching and geometrical feature matching [4], which not only possesses the advantages of feature-based approaches, such as being invariant to illumination and having a low memory requirement, but also has the advantage of the high recognition performance of template matching.

A practical face recognition technique needs to be robust to the image variations caused by illumination conditions, facial expressions, poses and perspectives, as well as other factors such as aging, hair styles and glasses. However, the variations caused by different conditions disturb an image in different ways, and the compensation for one variation may have an adverse effect on another. Therefore, most of the existing methods only consider one or two variations, and most of them cannot achieve a satisfactory performance when only one image per person is available for training; indeed, most of the existing methods need more than one image per person for training.

The lighting or perspective variations affect the global components of the image [25], i.e. its low-frequency spectrum is influenced, while with facial expression variations, only the high-frequency spectrum is affected, which is called the high-frequency phenomenon [26]. For the features extracted in our method, firstly, the edges are relatively insensitive to illumination, and secondly, the GW and the image gradient can also reduce the effect of varying lightings [24,27]. So our method can maintain the recognition performance under uneven lighting conditions.

In our method, we aim to perform face recognition with variations due to illuminations, expressions and perspectives at the same time, but only one frontal face image of each person, under even illumination and with a normal expression, is available for training, as is often the case in real applications. Fig. 2 shows the whole process of our proposed face recognition system.

Fig. 2. The architecture of our face recognition system. An input face image is pre-processed; its gray edge image, edge map, Gabor map and angle map are computed by the elastic shape-texture matching module; and the shape-texture Hausdorff distance is computed between the query and the faces in the database.
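For the texture features used by the matching above, the paper specifies one center frequency (π/2) and eight orientations, but does not print its exact Gabor kernel. The sketch below is our illustration using a common DC-free Gabor wavelet formulation, with an assumed σ = π and an assumed 9 x 9 kernel support; brute-force convolution is used for clarity.

```python
import numpy as np

def gabor_kernel(f0, theta, sigma=np.pi, size=9):
    """Complex Gabor kernel with center frequency f0 and orientation theta.
    The DC-free formulation, sigma = pi and size = 9 are assumptions,
    not taken from the paper."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    kx, ky = f0 * np.cos(theta), f0 * np.sin(theta)
    envelope = (f0**2 / sigma**2) * np.exp(-f0**2 * (x**2 + y**2) / (2 * sigma**2))
    # Subtracting exp(-sigma^2/2) removes the DC response of the wavelet.
    return envelope * (np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2))

def gabor_map(f, f0=np.pi / 2, n_orient=8):
    """Gabor map: response magnitudes at one center frequency and n_orient
    orientations (0 to 7*pi/8 in steps of pi/8), stacked per pixel."""
    f = f.astype(float)
    h, w = f.shape
    out = np.zeros((h, w, n_orient))
    for t in range(n_orient):
        k = gabor_kernel(f0, t * np.pi / n_orient)
        p = np.pad(f, k.shape[0] // 2, mode='edge')
        resp = np.zeros((h, w), complex)
        # Brute-force 'same'-size filtering; an FFT implementation would
        # give the O(N^2 log N^2) cost quoted in Section 3.5.
        for i in range(k.shape[0]):
            for j in range(k.shape[1]):
                resp += k[i, j] * p[i:i + h, j:j + w]
        out[:, :, t] = np.abs(resp)
    return out
```

Each pixel thus carries an eight-dimensional magnitude vector, which is what Eq. (7) compares between the two images; on a constant image the response is uniform, reflecting the near-zero DC gain of the kernel.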

}.e. the number of center frequencies and orientations used in the EGM are five and eight. = 0. we have also combined the respective sub-classes of the same conditions to form the combined databases. 2 /4 and /4. Pa = /20 =1 = 0. In each database. = 0.e. = 0 = 0. = 0. histogram equalization is applied to all the images. The performances of the algorithms are the worst with the ORL database. Table 2 The test databases used in the experiments Yale Number of subjects Number of test images 15 150 AR 121 605 ORL 40 360 YaleB 10 640 EGM. AR database. we can observe that under normal conditions. Similarly.8. The neighborhood size is set at 9 × 9. For PCA. and all color images are converted to gray-scale images. one image for each subject under normal conditions was selected as a training sample. Pe = 4. For example. = 0 + + =1 Parameter set = 1. The eyes can also be detected automatically. = 0 = 0. i. the AR database. = 0 = 0. PCA can preserve the global structure of the image space. which are from 0 to 7 /8 in steps of /8.g. all the eigenfaces available for each database are used. This is consistent with the results in Ref. Lam / Pattern Recognition 41 (2008) 396 – 405 Table 1 The best sets of parameters for different conditions of { . K. facial expressions.96. Fig. = 0. To enhance the global contrast of the images and to reduce the effect of uneven illuminations.-M. Pa = /30 3. From the results. = 0. The Euclidean distance is computed and the nearest-neighbor rule is adopted for classification. as the resulting error in detecting the eyes will degrade their performances. M2HD [9]. because the faces in the ORL database have some small facial expression and perspective variations. In Table 3. Although the ESTMg uses only 12% of the pixels in an image as edge points and one center frequency for the Gabor filters. and the corresponding numbers are tabulated in Table 3.8 = 0. Pe = 4. the position of the two eyes is located manually. Pa = /20 = 0. = 0. 
we will evaluate the performances of the ESTM algorithm with different conditions of the parameter set { . and eight orientations.8. most of the algorithms can achieve a high recognition rate. ORL database and YaleB database only.05. Face recognition under normal conditions The respective recognition rates based on the different subdatabases with normal faces are shown in Table 4. which is suitable for small. Xie. The GW representation are concatenated to form a highdimensional vector. e. The GW always outperforms the M2HD. 3. the face images in the databases are divided manually into several sub-classes according to their different conditions. = 0 = 0. For each of the combined databases. respectively.04.32. A normal image means that the face image is of frontal view. and under even illumination and neutral expression. the number of eigenfaces used is 14. = 0. = 0. the texture carries more discriminating information than the shape. Pe = 4. In order to measure the recognition performances of our proposed algorithm. a face is under even illumination if the azimuth angle and the elevation angle of the lighting are both less than 20◦ . /2. } for face recognition based on different face databases. . This observation shows that the edge points can be considered as the aggregate of important feature points that carry the most discriminating information for face recognition.93. non-rigid local distortions in human face recognition. . at most M −1. for the Yale database. [1].02. i. 39 and 9. where Abbreviation M2HD ESTMa ESTMg ESTMea ESTMeg ESTM Conditions = 0. The GW √ employs three center frequencies.e. The databases used include the Yale database. while the GW. and EGM [14]. respectively. while the dimension of the elastic graph is 6 × 8. = 0.400 X. GW. = 0 = 0. are evaluated and compared with the PCA [28]. = 0. where M is the total number of training samples. 120. 
the training and testing images of the combined database under normal conditions come from the Yale database. The performances of ESTM and its several simplified versions. 3 shows some examples of the images. etc. this method can still achieve similar recognition rates to the GW. which is used directly to compute the distance between two images pixel by pixel.8 = 1. and M2HD adopt the local information about the image. All images are cropped to a size of 64 × 64 based on the eye locations. as listed in Table 1. The face images in different databases are captured under different conditions. the ORL database and the YaleB database. In our experiments. while the GW uses the texture information only in the matching. Experimental results In this section. and others formed the testing set. but the recognition rates cannot reflect the true performances of the respective methods.1. The number of distinct subjects and the number of testing images in the respective databases are tabulated in Table 2. the training set consists of images from the corresponding sub-classes. i. The recognition rates using . Pe = 4. such as varied lighting conditions. ORL database and YaleB database.68. and the EGM can preserve some of the texture and local geometry information. In order to investigate the effect of the different conditions on the face recognition algorithms. The M2HD considers only the shape information.
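The pre-processing step mentioned above (gray-scale conversion followed by global histogram equalization) is standard. A minimal sketch, assuming 8-bit gray-level images and our own implementation rather than the authors' code:

```python
import numpy as np

def hist_equalize(img, levels=256):
    """Global histogram equalization: remap gray levels so that the
    cumulative distribution of intensities becomes roughly linear,
    boosting global contrast and reducing uneven illumination."""
    img = img.astype(np.uint8)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Classic mapping: h(v) = round((cdf(v) - cdf_min)/(n - cdf_min)*(L - 1)).
    lut = np.round((cdf - cdf_min) / (img.size - cdf_min) * (levels - 1))
    return lut[img].astype(np.uint8)
```

For a two-level image, the darker level is mapped to 0 and the brighter level to 255, i.e. the available dynamic range is fully used regardless of the original illumination level.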

Fig. 3. Some cropped faces used in our experiments: (a) images from the Yale database; (b) images from the AR database; (c) images from the ORL database; and (d) images from the YaleB database.

Table 3. The sub-classes of the test databases used in the experiments (number of images).

Database   Normal   Facial expression variation   Lighting variation   Perspective variation
Yale       45       75                            30                   –
AR         –        242                           363                  –
ORL        189      63                            –                    108
YaleB      160      –                             480                  –
Combined   394      380                           873                  108

The recognition rates using ESTMa are similar to the results using M2HD, which is also based on edge maps. Moreover, ESTMea, which adopts not only the edge information but also the angle information, can perform better than both ESTMa and M2HD in most cases. Our proposed ESTM method always outperforms the other methods, which means that the combined features carry the most discriminating information.

3.2. Face recognition under varying lighting conditions

The experimental results based on the images under varying lighting are shown in Table 5. In the Yale database, the lighting is either from the left or the right of the face images. In the AR database, besides the lighting from the left and the right, lighting from both sides of the face is also adopted. The images in these two databases have some illumination variations, but they are not as large as those in the YaleB database. The YaleB database is often used to investigate the effect of lighting on face recognition; in this part of the experiments, we select only those images with azimuth angles or elevation angles of lighting larger than 20° as the testing images.

The performance of PCA degrades significantly compared to the results based on normal faces. This is because the lighting variations affect the global components of the image; due to the fact that PCA represents faces with their principal components, this method cannot work properly when a face is under severe lighting variations. The performance of the GW also degrades as compared to the results in Section 3.1; nevertheless, the GW outperforms the other methods when the faces are under poor lighting conditions.

The edge map can serve as a robust representation to illumination changes if the objects concerned have sharp edges only. However, when the objects have smooth surfaces, such as human faces, some of the edges may not be detected in a consistent manner [29]. Furthermore, when the lighting is not from the front of a face, the shadows produced will also affect the edge map generated. Therefore, in the case of large illumination variation, such as the YaleB database, the performances of those algorithms that rely on the edge information for recognition will be greatly affected; the recognition rate of M2HD based on the combined database falls from 80.2% to 45.1%. Compared to the results of the M2HD, the ESTM can achieve recognition rates that are 13.2–26.8% higher. Although our ESTM is not as good as the GW for the YaleB database, it outperforms the GW when the Yale database and the AR database are used.

3.3. Face recognition with different facial expressions and perspective variations

Experiments based on the face images with different facial expressions are performed, and the recognition results are summarized in Table 6.

Table 4. Face recognition results under normal conditions: recognition rates (%) on the Yale, ORL, YaleB and combined databases for PCA, M2HD, GW, EGM, ESTMa, ESTMg, ESTMea, ESTMeg and ESTM.

Table 5. Face recognition results under varying lighting conditions: recognition rates (%) on the Yale, AR, YaleB and combined databases for the same nine methods.

Table 6. Face recognition results under different facial expressions: recognition rates (%) on the Yale, AR, ORL and combined databases for the same nine methods.

Facial expressions often cause some local distortions of the feature points, which will then affect the corresponding local texture and shape properties. The GW considers the texture information about the neighborhood of each pixel, which is disturbed by the local distortions caused by changes in facial expression. Compared to the other methods, our proposed ESTM can achieve the best performance.

The relative performances of the different algorithms were also evaluated for faces with perspective variations. All the testing images were selected from the ORL database, with the faces either rotated out of the image plane, e.g. looking to the right, left, up or down, or rotated in the image plane, i.e. clockwise or anti-clockwise. The experimental results are tabulated in Table 7, and show that none of these face recognition methods can achieve a satisfactory performance under perspective variations. This is due to the effect of perspective variations, which has been discussed in Section 3.1: the recognition rate for the ORL database is always lower than the others, no matter which method is adopted. Nevertheless, the ESTM still outperforms the other methods.

3.4. Face recognition with different databases

We have evaluated and discussed the effect of different conditions on the different face recognition methods. In this section, we also show, in Table 8, the performances of the respective face recognition methods based on the different databases without dividing them into sub-databases, together with the performance based on the total combined database. From Table 8, we can see that the ESTM outperforms all the other methods based on the different databases, except for the YaleB database; in this case, the GW achieves the best performance. In addition, the simplified versions of ESTM, i.e. ESTMg, ESTMa, ESTMea and ESTMeg, also outperform the traditional methods in most cases. The recognition rate of ESTMg is slightly higher than that of the ESTM when using the AR database, and both methods have a recognition rate higher than 98%.

3.5. Storage requirements and computational complexity

For our approach, the data stored in a database for a face image include its edge map, Gabor map and angle map. Suppose that the size of the normalized face is N × N, and that ρ percent of the points are selected as edge points in the edge map. The average number of feature points for an edge map is ρ · N², where a feature point is represented by its x- and y-coordinates, each of which can be represented by two bytes; a factor of 2 is therefore multiplied for the edge-map storage. Each element in the Gabor map and the angle map is represented by a 16-bit floating-point number. The dimensions of the Gabor map and the angle map are nf · na · N² and N², respectively, where nf and na are the numbers of center frequencies and orientations used for the Gabor filters. The numbers of bits needed to represent an edge map, Gabor map and angle map are tabulated in the second row of Table 9, and the total number of bits used to represent a face image in the database is therefore 16(2ρ + nf · na + 1)N².
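As a quick check of this storage formula, the numbers for the settings used in the experiments (N = 64, ρ = 0.12, nf = 1, na = 8) can be worked out directly; the 16-bit-per-value and two-coordinate assumptions follow the text above.

```python
# Storage for one face under the settings used in the experiments:
# N = 64, rho = 0.12 (fraction of pixels kept as edge points),
# nf = 1 center frequency, na = 8 orientations, 16 bits per stored value.
N, rho, nf, na = 64, 0.12, 1, 8

edge_bits = 16 * 2 * rho * N**2    # rho*N^2 points, two 16-bit coordinates each
gabor_bits = 16 * nf * na * N**2   # nf*na*N^2 16-bit Gabor magnitudes
angle_bits = 16 * N**2             # N^2 16-bit gradient directions

total_bits = edge_bits + gabor_bits + angle_bits
# Equals 16*(2*rho + nf*na + 1)*N^2, i.e. roughly 74 KiB per face.
print(total_bits / 8 / 1024)
```

The Gabor map dominates the total, which is why restricting the representation to a single center frequency matters for the memory footprint.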

respectively. Experiments were conducted on a computer system with Pentium IV 2. O(N 2 log2 (N 2 )) and O(N 2 ). such as PCA.5 77. b).5 EGM 42. which show that our algorithm can always achieve the best performance compared to other algorithms. The runtime required for feature extraction is the time spent computing the corresponding edge map. when an image is under seriously uneven illumina- . Furthermore.5 ESTMa 50.5 69.9 75.7 63. b) and angle distance da (a. and the matching is performed within a neighborhood. Gabor map and angle map are first computed. The computational complexities for computing an edge map.9 ESTMg 56.3 EGM 67.7 67. In fact. Gabor map or angle map is considered.0 Table 8 Face recognition results based on different databases Recognition rate (%) Yale AR ORL YaleB Combined PCA 67. and the GW and gradient direction are employed to characterize the corresponding texture information. For searching in a large database. Lam / Pattern Recognition 41 (2008) 396 – 405 Table 7 Face recognition results under various perspectives Recognition rate (%) ORL PCA 39. Suppose that the average runtimes required to compute these three distances for one point pair (a. we only need to consider the time required to compute these maps of the query image. and M is the number of images stored in the database.1 75.0 95.2 94. This means that the possible number of pixels to be compared when matching each point pair is D 2 .7 97. the runtime for matching is the most significant part for the whole process. Gabor distance dg (a. This method does not need to construct a precise pixel-wise correspondence between the two images being compared.4 83. under different conditions.8 GW 56.9 ESTMg 80.1 72.7 ESTMea 85.7 95. and thus suitable for face recognition. and that the total runtime tall = te + tg + ta . uneven illuminations.3 79. where a factor of 2 is multiplied.4 GHz CPU and 512 MB RAM.9 73. we have proposed a novel elastic shape-texture matching algorithm. its edge map. 
The average runtime using ESTM for face recognition based on the ORL database (40 face subjects) is 0. B) and hst (B. and perspective variations. the performance of this method relies on the precision of edge detection.7 86.3 75. The paper also addresses the performances of different face recognition algorithms in terms of changes in facial expressions. Suppose that the size of the neighborhood considered when searching for a matching pair is D × D.9 77.-M.7 M2HD 72. and angle map. respectively. in which case the GW performs the best while the ESTM achieves the second highest recognition rate. Gabor map. tg and ta . b) between pixel a ∈ A and pixel b ∈ B are to be computed.5 ESTMea 48.4 403 ESTM 60. A) in Eq. In our approach.3 89.6 ESTM 88. only one image per person is used for training in our experiments. GW. As ESTM uses the edge map. which makes it very useful for practical face recognition applications.2 85.9 GW 79.3 82.3 95. Xie. namely ESTM.3 95.3 68. In this matching. EGM and M2HD.7 70.7 74.3 69.9 72. then the computational complexity of ESTM is in the order of O(2 N 2 D 2 Mt all ). for human face recognition.1 62.8 ESTMeg 85.4 ESTMa 84.0 Table 9 Storage requirements and computational complexity Edge map Storage requirements (bits) Computational complexity Feature extraction Matching 16 N2 Gabor map 16nf na N2 Angle map 16N 2 O(N 2 ) O(2 N 2 D 2 Mt a ) O(N 2 ) O(2 N 2 D 2 Mt e ) O(N 2 log2 (N 2 )) O(2 N 2 D 2 Mt g ) For a query image.1 ESTMeg 57.6 s.7 78. 4. the edge distance de (a. the edge map is used to represent the shape information about an input image. Gabor map and angle map are in the order of O(N 2 ). Conclusions In this paper.4 69. ESTMg and ESTMa are shown in the last row of Table 9.0 60.2.6 57. the computational time for face recognition includes two parts: feature extraction and matching.0 58.X. As all these maps of the training images have been computed and stored in the face database. as discussed in Section 3.6 M2HD 43. 
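To make the cost analysis above concrete, the directed matching that gives rise to the O(2N²D²M·t_all) term can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the per-pixel feature maps are assumed to be precomputed and stacked into one array, a plain L1 sum over the feature channels stands in for the combination of the edge, Gabor and angle distances, and the directed distance averages over all pixels in the spirit of the modified Hausdorff distance rather than reproducing Eq. (10) exactly.

```python
import numpy as np

def directed_distance(feat_a, feat_b, D=5):
    """Directed shape-texture distance from image A to image B.

    feat_a, feat_b: (N, N, C) arrays of per-pixel features (edge, Gabor and
    angle channels stacked).  For every pixel of A, a D x D neighborhood of
    the same location in B is searched for the best-matching pixel, so one
    directed pass costs O(N^2 * D^2) distance evaluations.
    """
    N = feat_a.shape[0]
    r = D // 2
    total = 0.0
    for y in range(N):
        for x in range(N):
            y0, y1 = max(0, y - r), min(N, y + r + 1)
            x0, x1 = max(0, x - r), min(N, x + r + 1)
            window = feat_b[y0:y1, x0:x1]                   # candidate pixels in B
            d = np.abs(window - feat_a[y, x]).sum(axis=-1)  # L1 stand-in for d_e+d_g+d_a
            total += d.min()                                # keep the best match
    return total / (N * N)

def shape_texture_hausdorff(feat_a, feat_b, D=5):
    # Both directed distances are computed, hence the factor 2 in the
    # O(2 N^2 D^2 M t_all) complexity quoted in the text.
    return max(directed_distance(feat_a, feat_b, D),
               directed_distance(feat_b, feat_a, D))
```

Comparing a query against all M stored images repeats this for each database entry, which is where the factor M in the matching complexity comes from.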
5. Conclusions

In this paper, we have proposed a novel elastic shape-texture matching algorithm, namely ESTM, for human face recognition. In our approach, the edge map is used to represent the shape information about an input image, and the GW and the gradient direction are employed to characterize the corresponding texture information; a shape-texture Hausdorff distance is then proposed to compute the difference between a query input and the faces in a database. This method does not need to construct a precise pixel-wise correspondence between the two images being compared, which makes the approach robust to small and local distortions of the facial feature points, and thus suitable for face recognition. In fact, only one image per person is used for training in our experiments, which makes the method very useful for practical face recognition applications.

The paper has also addressed the performances of different face recognition algorithms in terms of changes in facial expressions, uneven illuminations, and perspective variations. Experiments were conducted based on different databases, which show that our algorithm can always achieve the best performance compared to other algorithms, such as PCA, GW, EGM and M2HD, under different conditions. The only exception is when the face images are under very poor lighting conditions, in which case the GW performs the best while the ESTM achieves the second highest recognition rate. In fact, the performance of this method relies on the precision of edge detection.

Furthermore, when an image is under seriously uneven illumination, whereby the edge information is distorted by the shadow, the performance of ESTM will degrade. A proper way to solve this problem is to use an additional preprocessing method to improve the quality of the edge map. For example, Refs. [30,31] provide some methods of reconstructing a visually natural human face from an image under uneven illuminations. Another possible way around this problem is to assign different weights to different edge points according to their importance, as in Ref. [11]; this weighting function could then be incorporated into the shape-texture Hausdorff distance. When the image has some variations caused by occlusions, such as wearing glasses and/or a scarf, the performance will also be degraded. In this case, the algorithm cannot judge which edges come from the face and which from the accoutrements. Ref. [32] describes a method of removing glasses from facial images.

Acknowledgment

The work described in this paper was supported by The Hong Kong Polytechnic University.

References

[1] C. Nastar, N. Ayache, Frequency-based nonrigid motion analysis, IEEE Trans. Pattern Anal. Mach. Intell. 18 (11) (1996) 1067–1079.
[2] A. Lanitis, C.J. Taylor, T.F. Cootes, Automatic interpretation and coding of face images using flexible models, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 743–756.
[3] K.W. Wong, K.M. Lam, W.C. Siu, An efficient algorithm for human face detection and facial feature extraction under different conditions, Pattern Recognition 34 (10) (2001) 1993–2004.
[4] R. Brunelli, T. Poggio, Face recognition: features versus templates, IEEE Trans. Pattern Anal. Mach. Intell. 15 (10) (1993) 1042–1052.
[5] I. Craw, D. Tock, A. Bennett, Finding face features, in: G. Sandini (Ed.), Proceedings of the European Conference on Computer Vision, Springer, Berlin, 1992, pp. 92–96.
[6] W.P. Choi, K.M. Lam, W.C. Siu, An adaptive active contour model for highly irregular boundaries, Pattern Recognition 34 (2) (2001) 323–331.
[7] I. Biederman, Recognition-by-components: a theory of human image understanding, Psychol. Rev. 94 (1987) 115–147.
[8] D.P. Huttenlocher, G.A. Klanderman, W.J. Rucklidge, Comparing images using the Hausdorff distance, IEEE Trans. Pattern Anal. Mach. Intell. 15 (9) (1993) 850–863.
[9] B. Takács, Comparing face images using the modified Hausdorff distance, Pattern Recognition 31 (12) (1998) 1873–1880.
[10] B. Guo, K.M. Lam, K.H. Lin, W.C. Siu, Human face recognition based on spatially weighted Hausdorff distance, Pattern Recognition Lett. 24 (1–3) (2003) 499–507.
[11] K.H. Lin, K.M. Lam, W.C. Siu, Spatially eigen-weighted Hausdorff distances for human face recognition, Pattern Recognition 36 (8) (2003) 1827–1834.
[12] C. Liu, H. Wechsler, Independent component analysis of Gabor features for face recognition, IEEE Trans. Neural Networks 14 (4) (2003) 919–928.
[13] C. Liu, H. Wechsler, A shape- and texture-based enhanced Fisher classifier for face recognition, IEEE Trans. Image Process. 10 (4) (2001) 598–608.
[14] M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Wurtz, W. Konen, Distortion invariant object recognition in the dynamic link architecture, IEEE Trans. Comput. 42 (3) (1993) 300–311.
[15] D.H. Liu, K.M. Lam, L.S. Shen, Optimal sampling of Gabor features for face recognition, Pattern Recognition Lett. 25 (2) (2004) 267–276.
[16] B. Duc, S. Fischer, J. Bigun, Face authentication with Gabor information on deformable graphs, IEEE Trans. Image Process. 8 (4) (1999) 504–516.
[17] C. Kotropoulos, A. Tefas, I. Pitas, Frontal face authentication using morphological elastic graph matching, IEEE Trans. Image Process. 9 (4) (2000) 555–560.
[18] J. Zhang, Y. Yan, M. Lades, Face recognition: eigenface, elastic matching, and neural nets, Proc. IEEE 85 (9) (1997) 1423–1435.
[19] R.C. Gonzalez, R.E. Woods, Digital Image Processing, Addison-Wesley, Reading, MA, 1992.
[20] Yale University, http://cvc.yale.edu/projects/yalefaces/yalefaces.html.
[21] A.M. Martinez, R. Benavente, The AR face database, CVC Technical Report #24, June 1998.
[22] The Olivetti Research Laboratory in Cambridge, UK, http://www.uk.research.att.com/pub/data/att_faces.zip.
[23] Yale University, http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html.
[24] L. Shams, C. von der Malsburg, The role of complex cells in object recognition, Vision Res. 42 (22) (2002) 2547–2554.
[25] Y. Adini, Y. Moses, S. Ullman, Face recognition: the problem of compensating for changes in illumination direction, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 721–732.
[26] C.K. Chui, An Introduction to Wavelets, Academic Press, Boston, 1992.
[27] H.F. Chen, P.N. Belhumeur, D.W. Jacobs, In search of illumination invariants, in: Proceedings of the IEEE Conference CVPR, Hilton Head, 2000, pp. 254–261.
[28] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cognitive Neurosci. 3 (1) (1991) 71–86.
[29] Y. Moses, Face recognition: generalization to novel images, Ph.D. Thesis, Weizmann Institute of Science, 1993.
[30] X. Xie, K.M. Lam, Face recognition under varying illumination based on a 2D face shape model, Pattern Recognition 38 (2) (2005) 221–230.
[31] D.H. Liu, K.M. Lam, L.S. Shen, Illumination invariant face recognition, Pattern Recognition 38 (10) (2005) 1705–1716.
[32] J.S. Park, Y.H. Oh, S.C. Ahn, S.W. Lee, Glasses removal from facial image using recursive error compensation, IEEE Trans. Pattern Anal. Mach. Intell. 27 (5) (2005) 805–811.

About the Author—XUDONG XIE received his B.Eng. degree in Electronic Engineering and M.Sc. degree in Signal and Information Processing from the Department of Electrical Engineering, Tsinghua University, China, in 1999 and 2002, respectively. He received his Ph.D. degree from the Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, in 2006. His research interests include image processing, pattern recognition, and human face analysis.

About the Author—DR. KIN-MAN LAM received his Associateship in Electronic Engineering with distinction from The Hong Kong Polytechnic University (formerly called Hong Kong Polytechnic) in 1986. He won the S.L. Poa Scholarship for overseas studies and was awarded an M.Sc. degree in communication engineering from the Department of Electrical Engineering, Imperial College of Science, Technology and Medicine, England, in 1987. From 1990 to 1993, he was a lecturer at the Department of Electronic Engineering of The Hong Kong Polytechnic University. In August 1993, he undertook a Ph.D. degree program in the Department of Electrical Engineering at the University of Sydney, Australia, and won an Australia Postgraduate Award for his studies. He completed his Ph.D. studies in August 1996, and was awarded the IBM Australia Research Student Project Prize. He joined the Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, again as an Assistant Professor in October 1996, and has been an Associate Professor since February 1999. Dr. Lam has also been a member of the organizing committee and program committee of many international conferences. In particular, he was the Secretary of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), the Technical Chair of the 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing (ISIMP 2004), and a Technical Co-Chair of the 2005 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2005). In addition, Dr. Lam was a Guest Editor for the Special Issue on Biometric Signal Processing of the EURASIP Journal on Applied Signal Processing. Currently, Dr. Lam is the Chairman of the IEEE Hong Kong Chapter of Signal Processing and an Associate Editor of the EURASIP Journal on Image and Video Processing. His current research interests include human face recognition, image and video processing, and computer vision.