ABSTRACT

CONTENTS

1. Introduction
   1.1 Motivation: Biometric Security Technology
   1.2 Face Recognition
2. Analysis
   2.1 Problem Statement
   2.2 Literature Survey
       2.2.1 Eigenface Method
       2.2.2 Neural Network Approach
       2.2.3 Stochastic Modeling
       2.2.4 Geometrical Feature Matching
       2.2.5 Template Matching
       2.2.6 Graph Matching
       2.2.7 N-tuple Classifiers
       2.2.8 Line Edge Map
   2.3 Line Edge Map Method
       2.3.1 Face Detection
       2.3.2 Edge Detectors
       2.3.3 Thinning Algorithm
       2.3.4 Curve Fitting Algorithm
       2.3.5 Hausdorff Distance Algorithm
   2.4 Use Case Diagram
3. Design
   3.1 Class Relationship Diagram
   3.2 Class Diagram
   3.3 Sequence Diagram
4. Implementation and Test Results
Conclusion
References
Appendix A: Users Manual
Appendix B: .NET Framework Setup
Appendix C: SQL Server 7.0 Setup

CHAPTER 1 INTRODUCTION
1.1 Motivation: Biometric Security Technology

WHAT IS A BIOMETRIC?
The security field uses three different types of authentication:

• Something you know—a password, PIN, or piece of personal information (such as your mother's maiden name);
• Something you have—a card key, smart card, or token (like a Secure ID card); and/or
• Something you are—a biometric.

Of these, a biometric is the most secure and convenient authentication tool. It can't be borrowed, stolen, or forgotten, and forging one is practically impossible. Biometrics uses an individual's unique physical or behavioral characteristics to recognize or authenticate identity. Common physical biometrics include fingerprints; hand or palm geometry; and retina, iris, or facial characteristics. Behavioral characteristics include signature, voice (which also has a physical component), keystroke pattern, and gait. Of these classes of biometrics, technologies for signature and voice are the most developed. Figure 1 describes the process involved in using a biometric system for security.

Figure 1: How a biometric system works. (1) Capture the chosen biometric; (2) process the biometric and extract and enroll the biometric template; (3) store the template in a local repository, a central repository, or a portable token such as a smart card; (4) live-scan the chosen biometric; (5) process the biometric and extract the biometric template; (6) match the scanned biometric against stored templates; (7) provide a matching score to business applications; (8) record a secure audit trail with respect to system use.

Fingerprints
A fingerprint biometric looks at the patterns found on a fingertip. There are different approaches to fingerprint verification. Some emulate the traditional police method of matching minutiae; others use straight pattern-matching devices; and still others take more novel approaches, such as moiré fringe patterns and ultrasonics. Some verification approaches can detect when a live finger is presented; some cannot.

Hand geometry
Hand geometry involves analyzing and measuring the shape of the hand. This biometric offers a good balance of performance characteristics and is relatively easy to use. It is suitable for sites with large user populations, and it is also appropriate when users access the system infrequently or are perhaps less disciplined in their approach to the system. Accuracy can be very high if desired, and flexible performance tuning and configuration can accommodate a wide range of applications.

Retina
A retina-based biometric involves analyzing the layer of blood vessels situated at the back of the eye. An established technology, this technique involves using a low-intensity light source through an optical coupler to scan the unique patterns of the retina. Retinal scanning can be quite accurate but does require the user to look into a receptacle and focus on a given point. This is not particularly convenient if you wear glasses or are concerned about having close contact with the reading device.
Iris
An iris-based biometric, on the other hand, involves analyzing features found in the colored ring of tissue that surrounds the pupil. Iris scanning, undoubtedly the less intrusive of the eye-related biometrics, uses a fairly conventional camera element and requires no close contact between the user and the reader. In addition, it has the potential for higher-than-average template-matching performance. Iris biometrics work with glasses in place, and iris scanning is one of the few technologies that can work well in identification mode.

Face
Face recognition analyzes facial characteristics. It requires a digital camera to capture a facial image of the user for authentication. Facial features are the most important element in face recognition: the system extracts features from a face image and compares them with those stored in the database for identification.

Signature
Signature verification analyzes the way a user signs his or her name. Signing features such as speed, velocity, and pressure are as important as the finished signature's static shape.

Voice
Voice authentication is not based on voice recognition but on voice-to-print authentication, where complex technology transforms voice into text. Voice biometrics has the most potential for growth, as it requires no new hardware.

USES FOR BIOMETRICS
Security systems use biometrics for two basic purposes: to verify or to identify the user. Identification tends to be the more difficult of the two because a system must search a database of enrolled users to find a match (a one-to-many search). The biometric that a security system employs depends in part on what the system is protecting and what it is trying to protect against.

Physical access
For decades, many highly secure environments have used biometric technology for entry access. Today, the primary application of biometrics is in physical security, i.e., controlling access to secure locations (rooms or buildings). Unlike photo identification cards, which a security guard must verify, biometrics permits unmanned access control. Biometrics is useful for high-volume access control. For example, biometrics controlled access for 65,000 people during the 1996 Olympic Games, and Disney World uses a fingerprint scanner to verify season-pass holders entering the theme park.

Virtual access
For a long time, biometric-based network and computer access were areas often discussed but rarely implemented. Virtual access is the application that will provide the critical mass to move biometrics for network and computer access from the realm of science-fiction devices to regular system components. Physical lock-downs can protect hardware, and passwords are currently the most popular way to protect data on a network. Biometrics, however, can increase the ability to protect data by implementing a more secure key than a password. Biometrics also allows a hierarchical structure of data protection, making the data still more secure: passwords supply a minimal level of access to network data, and biometrics supplies the next level. You can even layer biometric technologies to enhance security further.

E-commerce applications
E-commerce developers are exploring the use of biometrics and smart cards to more accurately verify a trading party's identity. Some are using biometrics to obtain secure services over the telephone through voice authentication.

Covert surveillance
One of the more challenging research areas involves using biometrics for covert surveillance. Using facial and body recognition technologies, researchers hope to use biometrics to automatically identify known suspects entering buildings or traversing crowded security areas such as airports. The use of biometrics for covert identification, as opposed to authentication, must overcome technical challenges such as simultaneously identifying multiple subjects in a crowd and working with uncooperative subjects. In these situations, devices cannot count on consistency in pose, viewing angle, or distance from the detector.
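The verification-versus-identification distinction above is essentially one-to-one versus one-to-many matching. A minimal sketch, with toy feature vectors and a placeholder distance function standing in for a real biometric matcher:

```python
# Verification (one-to-one) checks a claimed identity against one
# enrolled template; identification (one-to-many) searches the whole
# database. Templates here are plain feature vectors for illustration.

def distance(t1, t2):
    return sum((a - b) ** 2 for a, b in zip(t1, t2)) ** 0.5

enrolled = {
    "alice": [0.1, 0.9, 0.4],
    "bob":   [0.8, 0.2, 0.7],
    "carol": [0.5, 0.5, 0.5],
}

def verify(claimed_id, probe, threshold=0.2):
    """One-to-one: compare the probe only against the claimed user."""
    return distance(enrolled[claimed_id], probe) <= threshold

def identify(probe):
    """One-to-many: return the enrolled user closest to the probe."""
    return min(enrolled, key=lambda uid: distance(enrolled[uid], probe))

probe = [0.12, 0.88, 0.41]
claimed_ok = verify("alice", probe)
best_match = identify(probe)
```

Identification is the harder problem in practice because its cost and its false-match exposure grow with the size of the enrolled database, while verification touches a single template.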

THE FUTURE OF BIOMETRICS
Although companies are using biometrics for authentication in a variety of situations, the industry is still evolving and emerging. To both guide and support the growth of biometrics, the Biometric Consortium was formed in December 1995.

Standardization
The biometrics industry includes more than 150 separate hardware and software vendors, each with its own proprietary interfaces, algorithms, and data structures. Standards are emerging to provide a common software interface, to allow sharing of biometric templates, and to permit effective comparison and evaluation of different biometric technologies. The BioAPI standard defines a common method for interfacing with a given biometric application. BioAPI is an open-systems standard developed by a consortium of more than 60 vendors and government agencies. Written in C, it consists of a set of function calls to perform basic actions common to all biometric technologies, such as:

• Enroll user,
• Verify asserted identity (authentication), and
• Discover identity.

Another draft standard is the Common Biometric Exchange File Format, which defines a common means of exchanging and storing templates collected from a variety of biometric devices. Biometric assurance, i.e., confidence that a biometric device can achieve the intended level of security, is another active research area. Current metrics for comparing biometric technologies, such as the crossover error rate and the average enrollment time, are limited because they lack a standard test bed on which to base their values. Several groups, including the US Department of Defense's Biometrics Management Office, are developing standard testing methodologies.

Hybrid technology uses
One of the more interesting uses of biometrics involves combining biometrics with smart cards and Public-Key Infrastructure (PKI). A major problem with biometrics is how and where to store the user's template. Because the template represents the user's personal characteristics, its storage introduces privacy concerns. Furthermore, storing the template in a centralized database leaves that template subject to attack and compromise. On the other hand, storing the template on a smart card enhances individual privacy and increases protection from attack, because individual users control their own templates. PKI uses public- and private-key cryptography for user identification and authentication. It has some advantages over biometrics: it is mathematically more secure, and it can be used across the Internet. The main drawback of PKI is the management of the user's private key: to be secure, the private key must be protected from compromise, while to be useful, it must be portable. The solution to these problems is to store the private key on a smart card and protect it with a biometric.

1.2 Face Recognition
With the advancement in computers and automated systems, one is seldom surprised to find such systems applied to many visual tasks in our daily activities. Automated systems on production lines inspect goods for our consumption, and law-enforcement agencies use computer systems to search databases of fingerprint records. Visual surveillance of scenes, visual feedback for control, etc. all have potential applications for automated visual systems.
One area that has grown significantly in importance over the past decade is that of computer face processing in visual scenes. Researchers attempt to teach the computer to recognize and analyze human faces from images so as to produce an easy and convenient platform for interaction between humans and computers. Law enforcement can be improved by automatically recognizing criminals from a group of suspects. Security can also be reinforced by verifying that the authorized person is physically present. Moreover, human facial expressions can be analyzed to direct robot motion to perform certain secondary, or even primary, tasks in our routine work. For more than a quarter of a century, research has been done in automatic face recognition. Psychophysicists and neuroscientists have attempted to understand why a human being is able to handle the task of face recognition nearly effortlessly. Engineers had, and still have, the dream of face recognition done fully automatically by computers, with efficiency comparable to the human ability to recognize faces. This problem has not yet been solved, and scientists still have a very long way to go to reach this goal.

Face recognition can basically be understood as a complex pattern recognition task; thus, most of the techniques that have been applied originate from the fields of signal processing and computer science. Probably because of the fast development of face recognition research and the large number of parallel existing approaches, there is no single textbook that can be recommended, though some survey articles are useful for getting acquainted with the subject. First attempts at face recognition dealt with the problem by describing features in the image and comparing them to stored data. Several other approaches have applied correlations with already stored feature templates. Automatic face recognition is a technique that can locate and identify faces automatically in an image and determine "who is who" from a database. It is gaining more and more attention in the areas of computer vision, image processing, and pattern recognition. There are several important steps involved in recognizing a face, such as detection, representation, and identification. Based on their representations, the various approaches can be grouped into feature-based and image-based methods. Usually every group of researchers uses its own database of manually normalized faces, but these have conditions far from the kinds of images expected in reality: in realistic situations the face would be somewhere in the image, not necessarily in the middle, and the background would be cluttered. Many techniques for face recognition have been developed whose principles span several disciplines, such as image processing, pattern recognition, computer vision, and neural networks. The increasing interest in face recognition is mainly driven by application demands, such as non-intrusive identification and verification for credit card and automatic teller machine transactions, non-intrusive access control to buildings, identification for law enforcement, etc.
Machine recognition of faces yields problems that belong to the following categories, whose objectives are briefly outlined:

1. Face Recognition: Given a test face and a set of reference faces in a database, find the N reference faces most similar to the test face.
2. Face Authentication: Given a test face and a reference face, decide whether the test face is identical to the reference face.

Face recognition has been studied more extensively than face authentication. The two problems are conceptually different. On one hand, a face recognition system usually assists a human expert in determining the identity of a test face by computing all similarity scores between the test face and each face stored in the system database and ranking them. On the other hand, a face authentication system must decide by itself whether a test face belongs to a client (i.e., one who claims his/her own identity) or to an impostor (i.e., one who pretends to be someone else). Cognitive psychological studies have indicated that human beings recognize line drawings as quickly and almost as accurately as gray-level pictures. These results might imply that edge images of objects could be used for object recognition while achieving accuracy similar to gray-level images. A novel concept, "faces can be recognized using a line edge map," is proposed. A compact face feature, the Line Edge Map (LEM), is extracted for face coding and recognition.
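The recognition/authentication distinction above reduces to ranking versus thresholding on similarity scores. A minimal sketch, with made-up similarity values standing in for the output of a real matcher:

```python
# Face recognition ranks all database faces by similarity to the test
# face and returns the top N; face authentication makes a single
# client/impostor decision against one reference. The scores below are
# toy values for illustration.

# face id -> similarity score of that reference face to the test face
scores = {"f1": 0.91, "f2": 0.35, "f3": 0.78, "f4": 0.60}

def recognize(scores, n=3):
    """Recognition: the N most similar reference faces, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

def authenticate(score, threshold=0.8):
    """Authentication: accept as client if similarity clears the bar."""
    return score >= threshold

candidates = recognize(scores)        # ranked shortlist for an expert
is_client = authenticate(scores["f1"])
```

The recognition output is a ranked shortlist a human expert can review; the authentication output is a hard accept/reject decision the system makes on its own.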

The faces were encoded into binary edge maps using the Sobel edge detection algorithm. The Hausdorff distance was chosen to measure the similarity of two point sets, i.e., the edge maps of two faces, because the Hausdorff distance can be calculated without an explicit pairing of points in their respective data sets. A pre-filtering scheme (two-stage identification) is used to speed up the search, using a 2D pre-filtering vector derived from the face LEM. A feasibility investigation and evaluation of face recognition based solely on the face LEM is conducted, covering all the conditions of human face recognition, i.e., face recognition under controlled/ideal conditions, varying lighting conditions, varying facial expressions, and varying pose.
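The Sobel step that produces the binary edge maps can be sketched as follows. This is a minimal pure-Python illustration on a toy image; a real system would use an image library, and the threshold value here is arbitrary:

```python
# Sobel edge detection: convolve with the horizontal and vertical
# gradient kernels, then threshold the gradient magnitude to get a
# binary edge map (image borders are left as 0).

SX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
SY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel

def sobel_edges(img, threshold):
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges

# A dark square on a bright background: edges appear at its boundary,
# while flat regions stay 0.
img = [[255] * 8 for _ in range(8)]
for y in range(3, 5):
    for x in range(3, 5):
        img[y][x] = 0
edge_map = sobel_edges(img, threshold=200)
```

The resulting binary edge map is the input to the thinning and line-fitting stages described in Chapter 2.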

Chapter 2 Analysis
2.1 Problem Statement
For more than a quarter of a century, research has been done in automatic face recognition. Psychophysicists and neuroscientists have attempted to understand why the human being is able to handle the task of face recognition nearly effortlessly. Engineers had, and still have, the dream of face recognition done fully automatically by computers, with efficiency comparable to the human ability to recognize faces. This problem has not yet been solved, and scientists still have a very long way to go to reach this goal. There is an increasing application demand, such as non-intrusive identification and verification for credit card and automatic teller machine transactions, non-intrusive access control to buildings, identification for law enforcement, etc., and a face recognition system is very useful for these demands. As we are using the LEM method, we will use certain algorithms to fulfill this task. First, we find the important region of the face in the image. Then we detect edges in that region and thin the edges. The thinned edge map is then converted to a line edge map, which is stored in the database. When an input image arrives, it is processed through the same steps and finally compared with the images stored in the database. In short, our problem comprises five main modules:

- Face Detection
- Edge Detection
- Thinning
- Line Edge Map
- Face Comparison

2.2 Literature Survey
There are many techniques available for human face recognition, but the major techniques apply mostly to frontal faces. The main methods considered for face recognition are eigenface (eigenfeature), neural network, dynamic link architecture, hidden Markov model, geometrical feature matching, and template matching. The approaches are analyzed in terms of the facial representations they use.

2.2.1 Eigenface Method
Eigenface is one of the most thoroughly investigated approaches to face recognition. It is also known as Karhunen-Loève expansion, eigenpicture, eigenvector, and principal component analysis. Sirovich and Kirby [5] and Kirby et al. [6] used principal component analysis to efficiently represent pictures of faces. They argued that any face image could be approximately reconstructed from a small collection of weights for each face and a standard face picture (eigenpicture). The weights describing each face are obtained by projecting the face image onto the eigenpicture. Turk and Pentland [7], motivated by the technique of Kirby and Sirovich, used eigenfaces for face detection and identification. In mathematical terms, eigenfaces are the principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images. The eigenvectors are ordered by the amount of variation among the faces that each represents. Each face can be represented exactly by a linear combination of the eigenfaces; it can also be approximated using only the best eigenvectors, i.e., those with the largest eigenvalues. The best M eigenfaces span an M-dimensional space: the face space. As the images include a large quantity of background area, the results can be influenced by the background. Grudin [26] showed that the correlation between images of whole faces is not efficient enough for satisfactory recognition performance.
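The eigenface idea can be sketched in a few lines with NumPy: flatten face images into vectors, compute principal components of the mean-centred data, and represent each face by its weights in the face space. The data below is random, standing in for real images:

```python
# Eigenface sketch: the best M right singular vectors of the centred
# data matrix are the eigenfaces (equivalently, eigenvectors of the
# covariance matrix of the face images).

import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((10, 64))          # 10 "images", 64 pixels each

mean_face = faces.mean(axis=0)
centred = faces - mean_face

_, _, vt = np.linalg.svd(centred, full_matrices=False)
M = 5
eigenfaces = vt[:M]                   # best M eigenfaces span the face space

def project(face):
    """Weights of a face in the M-dimensional face space."""
    return eigenfaces @ (face - mean_face)

def reconstruct(weights):
    """Approximate the face from its M weights."""
    return mean_face + eigenfaces.T @ weights

w = project(faces[0])                 # compact M-dimensional code
approx = reconstruct(w)               # approximation from M eigenfaces
```

Recognition then compares the weight vectors of the test face and each database face, which is why the method is fast once the eigenfaces have been computed.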
Illumination normalization [6] is usually necessary for the eigenface approach. Zhao and Yang [32] proposed a new method to compute the covariance matrix using three images of an object, each taken under a different lighting condition, to account for arbitrary illumination effects on a Lambertian surface. Pentland et al. [8] extended their early work on eigenfaces to eigenfeatures corresponding to face components, such as the eyes, nose, and mouth. They used a modular eigenspace composed of these eigenfeatures (i.e., eigeneyes, eigennose, and eigenmouth). This method is less sensitive to appearance changes than the standard eigenface method. In summary, eigenface is a fast, simple, and practical method. However, in general, it does not provide invariance to changes in scale and lighting conditions.

2.2.2 Neural Network Approach

The attractiveness of using neural networks comes from their non-linearity, so the feature extraction step may be more efficient than with the linear Karhunen-Loève methods. One of the first artificial neural network (ANN) techniques used for face recognition was a single-layer adaptive network called WISARD, which contains a separate network for each stored individual [9]. The way a neural network structure is constructed is crucial for successful recognition, and it is very much dependent on the intended application:

• For face detection, multilayer perceptrons [10] and convolutional neural networks [11] have been applied.
• For face verification, the Cresceptron [12], a multi-resolution pyramid structure, has been used.

Lawrence et al. [11] proposed a hybrid neural network, which combined local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample. The convolutional network extracts successively larger features in a hierarchical set of layers and provides partial invariance to translation, rotation, scale, and deformation. The authors reported 96.2 percent correct recognition on the ORL database of 400 images of 40 individuals. The classification time is less than 0.5 second, but the training time is as long as 4 hours. Lin et al. [13] used a probabilistic decision-based neural network (PDBNN), which inherited its modular structure from its predecessor, the decision-based neural network (DBNN) [14].
The PDBNN can be applied effectively as: 1) a face detector, which finds the location of a human face in a cluttered image; 2) an eye localizer, which determines the positions of both eyes in order to generate meaningful feature vectors; and 3) a face recognizer, a hierarchical neural network structure with non-linear basis functions and a competitive credit-assignment scheme. A PDBNN-based biometric identification system has the merits of both neural networks and statistical approaches, and its distributed computing principle is relatively easy to implement on a parallel computer. In [13], it was reported that the PDBNN face recognizer had the capability of recognizing up to 200 people and could achieve up to a 96 percent correct recognition rate in approximately 1 second. However, when the number of persons increases, the computing expense becomes more demanding. In general, neural network approaches encounter problems when the number of classes (i.e., individuals) increases. Moreover, they are not suitable for single-model-image recognition tasks, because multiple model images per person are necessary to train the systems to an optimal parameter setting.

2.2.3 Stochastic Modeling
Stochastic modeling of non-stationary vector time series based on hidden Markov models (HMMs) has been very successful for speech applications. Samaria and Fallside [27] applied this method to human face recognition. Faces were intuitively divided into regions such as the eyes, nose, mouth, etc., which can be associated with the states of a hidden Markov model. Since HMMs require a one-dimensional observation sequence and

images are two-dimensional, the images must be converted into either a 1D temporal sequence or a 1D spatial sequence. In [28], a spatial observation sequence was extracted from a face image using a band sampling technique: each face image was represented by a 1D vector series of pixel observations, where each observation vector is a block of L lines and there is an M-line overlap between successive observations. An unknown test image is first sampled to an observation sequence. Then, it is matched against every HMM in the model face database (each HMM represents a different subject). The match with the highest likelihood is considered the best match, and the relevant model reveals the identity of the test face. The recognition rate of the HMM approach is 87 percent using the ORL database of 400 images of 40 individuals. A pseudo-2D HMM [28] was reported to achieve a 95 percent recognition rate in preliminary experiments. Its classification time and training time were not given (but are believed to be very expensive), and the choice of parameters was based on subjective intuition.

2.2.4 Geometrical feature matching
Geometrical feature matching techniques are based on the computation of a set of geometrical features from the picture of a face. The fact that face recognition is possible even at a coarse resolution of 8x6 pixels [17], where the individual facial features are hardly revealed in detail, implies that the overall geometrical configuration of the face features is sufficient for recognition. The overall configuration can be described by a vector representing the position and size of the main facial features, such as the eyes and eyebrows, nose, mouth, and the shape of the face outline. One of the pioneering works on automated face recognition using geometrical features was done by Kanade [19] in 1973.
His system achieved a peak performance of 75 percent recognition rate on a database of 20 people using two images per person, one as the model and the other as the test image. Goldstein et al. [20] and Kaya and Kobayashi [18] showed that a face recognition program provided with manually extracted features could perform recognition with apparently satisfactory results. Brunelli and Poggio [21] automatically extracted a set of geometrical features from the picture of a face, such as nose width and length, mouth position, and chin shape. Thirty-five features were extracted to form a 35-dimensional vector, and recognition was then performed with a Bayes classifier. They reported a recognition rate of 90 percent on a database of 47 people. Cox et al. [22] introduced a mixture-distance technique, which achieved a 95 percent recognition rate on a query database of 685 individuals; each face was represented by 30 manually extracted distances. Manjunath et al. [23] used Gabor wavelet decomposition to detect feature points for each face image, which greatly reduced the storage requirement for the database. Typically, 35-45 feature points per face were generated. The matching process used the information contained in a topological graph representation of the feature points. After compensating for different centroid locations, two cost values, the topological cost and the similarity cost, were evaluated. The recognition accuracy in terms of the best match to the right person was 86 percent, and 94 percent of the time the correct person's face was in the top three candidate matches.
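The core of these methods is turning a face into a short vector of geometric measurements. A minimal sketch, where the landmark names and coordinates are hypothetical illustration values, not output of any real detector:

```python
# Build a geometrical feature vector from facial landmark coordinates;
# recognition then compares such vectors, e.g. by Euclidean distance
# or a Bayes classifier over the feature space.

import math

def feature_vector(lm):
    """lm: dict of landmark name -> (x, y) pixel position. Returns a
    few classic geometric measurements as one vector."""
    def d(a, b):
        return math.dist(lm[a], lm[b])
    return [
        d("eye_l", "eye_r"),          # inter-ocular distance
        d("nose_l", "nose_r"),        # nose width
        d("nose_tip", "mouth_c"),     # nose-to-mouth distance
        d("mouth_c", "chin"),         # mouth-to-chin distance
    ]

# Hypothetical landmark positions for one face image.
landmarks = {
    "eye_l": (30, 40), "eye_r": (70, 40),
    "nose_l": (45, 60), "nose_r": (55, 60),
    "nose_tip": (50, 62), "mouth_c": (50, 75),
    "chin": (50, 95),
}
v = feature_vector(landmarks)
```

The vector is tiny compared with the image itself, which is why geometrical matching scales well to large databases; the hard part, as the summary below notes, is locating the landmarks accurately in the first place.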

In summary, geometrical feature matching based on precisely measured distances between features may be most useful for finding possible matches in a large database such as a mug-shot album. However, it depends on the accuracy of the feature location algorithms, and current automated face feature location algorithms do not provide a high degree of accuracy and require considerable computational time.

2.2.5 Template Matching
A simple version of template matching is that a test image, represented as a two-dimensional array of intensity values, is compared using a suitable metric, such as the Euclidean distance, with a single template representing the whole face. There are several other, more sophisticated versions of template matching for face recognition. One can use more than one face template, from different viewpoints, to represent an individual's face. A face from a single viewpoint can also be represented by a set of multiple distinctive smaller templates [24], [21], and the gray-level face image may also be suitably processed before matching [25]. In [21], Brunelli and Poggio automatically selected a set of four feature templates, i.e., the eyes, nose, mouth, and the whole face, for all of the available faces. They compared the performance of their geometrical matching algorithm and template matching algorithm on the same database of faces, which contains 188 images of 47 individuals. The template matching was superior in recognition (100 percent recognition rate) to geometrical matching (90 percent recognition rate) and was also simpler. Since the principal components (also known as eigenfaces or eigenfeatures) are linear combinations of the templates in the database, the technique cannot achieve better results than correlation [21], but it may be less computationally expensive. One drawback of template matching is its computational complexity. Another problem lies in the description of the templates themselves.
Since the recognition system has to be tolerant of certain discrepancies between the template and the test image, this tolerance might average out the differences that make individual faces unique. In general, template-based approaches are a more logical approach than feature matching.

2.2.6 Graph Matching
Graph matching is another approach to face recognition. Lades et al. [15] presented a dynamic link structure for distortion-invariant object recognition, which employed elastic graph matching to find the closest stored graph. Dynamic link architecture is an extension of classical artificial neural networks. Memorized objects are represented by sparse graphs whose vertices are labeled with a multi-resolution description in terms of a local power spectrum, and whose edges are labeled with geometrical distance vectors. Object recognition can then be formulated as elastic graph matching, which is performed by stochastic optimization of a matching cost function. Wiskott and von der Malsburg [16] extended the technique and matched human faces against a gallery of 112 neutral frontal-view faces. Probe images were distorted due to rotation in depth and changing facial expression. Encouraging results on faces with large rotation angles were obtained. They reported recognition rates of 86.5 percent and 66.4 percent for the matching tests of 111 faces of 15 degree rotation and 110 faces of 30

degree rotation to a gallery of 112 neutral frontal views. In general, dynamic link architecture is superior to other face recognition techniques in terms of rotation invariance; however, the matching process is computationally expensive.

2.2.7 N-tuple classifiers
Conventional n-tuple systems have the desirable features of super-fast single-pass training, super-fast recognition, conceptual simplicity, straightforward hardware and software implementations, and accuracy that is often competitive with other more complex, slower methods. Because of these attractive features, n-tuple methods have been the subject of much research. In conventional n-tuple-based image recognition systems, the locations specified by each n-tuple are used to identify an address in a look-up table. The contents of this address either use a single bit to indicate whether or not the address was accessed during training, or store a count of how many times the address occurred. While the traditional n-tuple classifier deals with binary-valued input vectors, methods using n-tuple systems with integer-valued inputs have also been developed. Allinson and Kolcz [3] developed a method of mapping scalar attributes into bit strings based on a combination of CMAC and Gray coding methods. This method has the property that for small differences in the arithmetic values of the attributes, the Hamming distance between the bit strings equals the arithmetic difference; for larger arithmetic distances, the Hamming distance is guaranteed to be above a certain threshold. The continuous n-tuple method also shares some similarity at the architectural level with the single-layer look-up perceptron of Tattersall et al. [32], though they differ in the way the class outputs are calculated and in the training methods used to configure the contents of the look-up tables (RAMs). In summary, no existing technique is free from limitations.
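The look-up-table scheme the n-tuple description gives can be sketched for binary inputs as follows; the tuple count, tuple size, and toy training patterns are arbitrary illustration choices:

```python
# Conventional n-tuple classifier: each n-tuple is a fixed set of
# input locations; the sampled bits form an address into a per-class
# look-up table recording which addresses occurred during training.
# Classification scores a probe by how many of its addresses were seen.

import random

random.seed(1)
N_TUPLES, N = 8, 4                  # 8 tuples of 4 locations each
INPUT_LEN = 16
tuples = [random.sample(range(INPUT_LEN), N) for _ in range(N_TUPLES)]

def addresses(x):
    """One look-up address per n-tuple: its sampled bits as an integer."""
    return [sum(x[loc] << i for i, loc in enumerate(t)) for t in tuples]

def train(samples):
    """Single-pass training: mark every address each sample produces."""
    table = [set() for _ in range(N_TUPLES)]
    for x in samples:
        for t, addr in enumerate(addresses(x)):
            table[t].add(addr)
    return table

def score(table, x):
    """Number of n-tuples whose address was seen during training."""
    return sum(addr in table[t] for t, addr in enumerate(addresses(x)))

class_a = train([[1] * 8 + [0] * 8])    # toy one-sample "classes"
class_b = train([[0] * 8 + [1] * 8])
probe = [1] * 8 + [0] * 8
label = "A" if score(class_a, probe) >= score(class_b, probe) else "B"
```

Training is a single pass of set insertions and recognition is a handful of table look-ups, which is exactly the super-fast behaviour the survey credits these systems with.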
Further efforts are required to improve the performance of face recognition techniques, especially in the wide range of environments encountered in the real world. 2.2.8 Line Edge Map Cognitive psychological studies indicate that human beings recognize line drawings as quickly and almost as accurately as gray-level pictures. These results imply that edge images of objects could be used for object recognition and achieve accuracy similar to gray-level images. The faces were encoded into binary edge maps using the Sobel edge detection algorithm. The Hausdorff distance was chosen to measure the similarity of two point sets, i.e., the edge maps of two faces, because the Hausdorff distance can be calculated without an explicit pairing of points in their respective data sets. The modified Hausdorff distance in the formulation h(A, B) = (1/N_A) Σ_{a∈A} min_{b∈B} ||a − b|| was used, as it is less sensitive to noise than the maximum or kth-ranked Hausdorff distance formulations. Takacs argued that the process of face recognition might start at a much earlier stage and edge images can be used for the recognition of faces without the

involvement of high-level cognitive functions. However, the Hausdorff distance uses only the spatial information of an edge map without considering the inherent local structural characteristics inside such a map. A successful object recognition approach might need to combine aspects of feature-based approaches with template matching methods. A Line Edge Map (LEM) approach extracts lines from a face edge map as features. This approach can be considered a combination of template matching and geometrical feature matching. The LEM approach not only possesses the advantages of feature-based approaches, such as invariance to illumination and low memory requirements, but also has the high recognition performance of template matching. These considerations, together with the fact that edges are relatively insensitive to illumination changes, motivated this research. 2.3 Line Edge Map Method A novel face feature representation, the Line Edge Map (LEM), is proposed here to integrate the structural information with the spatial information of a face image by grouping pixels of the face edge map into line segments. After thinning the edge map, a polygonal line fitting process is applied to generate the LEM of a face. The LEM representation, which records only the end points of line segments on curves, further reduces the storage requirement. Efficient coding of faces is a very important aspect of a face recognition system. LEM is also expected to be less sensitive to illumination changes because it is an intermediate-level image representation derived from the low-level edge map representation. The basic unit of LEM is the line segment grouped from pixels of the edge map. In this study, we explore the information content of LEM and investigate the feasibility and efficiency of human face recognition using LEM. A Line Segment Hausdorff Distance (LHD) measure is then proposed to match LEMs of faces.
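The point-set modified Hausdorff distance quoted in Section 2.2.8 is straightforward to implement. A minimal Python sketch follows; the representation of an edge map as a list of (x, y) tuples, and the symmetric max combination of the two directed distances, are illustrative assumptions rather than details of the original system:

```python
import math

def directed_mhd(A, B):
    """Modified directed Hausdorff distance h(A, B): the average,
    over points a in A, of the distance from a to its nearest
    neighbour in B."""
    total = 0.0
    for (ax, ay) in A:
        total += min(math.hypot(ax - bx, ay - by) for (bx, by) in B)
    return total / len(A)

def mhd(A, B):
    """Symmetric combination H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_mhd(A, B), directed_mhd(B, A))

# Tiny illustration with two 3-point "edge maps" one pixel apart
A = [(0, 0), (1, 0), (2, 0)]
B = [(0, 1), (1, 1), (2, 1)]
print(mhd(A, B))  # every point is exactly 1 away -> 1.0
```

Note that no explicit pairing of points is needed: each point simply finds its nearest neighbour in the other set, which is exactly why the Hausdorff family of measures suits unstructured edge maps.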
LHD has better distinctive power because it can make use of the additional structural attributes of line orientation, line-point association, and number disparity in LEM; i.e., matching two lines with a large orientation difference is discouraged, and all the points on one line have to match points on a single corresponding line. 2.3.1 Face Detection The original algorithm is based on mosaic images of reduced resolution that attempt to capture the macroscopic features of the human face. It is assumed that there is a resolution level where the main part of the face occupies an area of about 4x4 cells. Accordingly, a mosaic image can be created for this resolution level; it is the so-called quartet image. The grey level of each cell equals the average value of the grey levels of all pixels included in the cell. An abstract model for the face at the resolution level of the quartet image is depicted in Fig. 1. The main part of the face corresponds to the region of 4x4 cells having an origin cell marked by "X". By subdividing each quartet image cell into 2x2 cells of half dimensions, the octet image results, where the main facial features such as the eyebrows/eyes, the nostrils/nose and the mouth are detected. Therefore, a hierarchical knowledge-based system can be designed that aims at detecting facial candidates by establishing rules applied to the quartet image and subsequently at validating the choice of a facial

candidate by establishing rules applied to the octet image for detecting the key facial features.
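The quartet (and octet) image construction described above is plain block averaging. A small Python sketch, assuming the image is a list of lists of grey levels and that the cell size divides the image dimensions exactly (both simplifying assumptions for illustration):

```python
def mosaic(image, cell_h, cell_w):
    """Build a mosaic image: each cell's grey level is the average
    of all pixels inside its cell_h x cell_w block."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, cell_h):
        out_row = []
        for c in range(0, cols, cell_w):
            block = [image[r + i][c + j]
                     for i in range(cell_h) for j in range(cell_w)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# A 4x4 image reduced with 2x2 cells gives a 2x2 mosaic
img = [[0, 0, 8, 8],
       [0, 0, 8, 8],
       [4, 4, 2, 2],
       [4, 4, 2, 2]]
print(mosaic(img, 2, 2))  # [[0.0, 8.0], [4.0, 2.0]]
```

The octet image is obtained the same way with cells of half the dimensions, e.g. `mosaic(img, cell_h // 2, cell_w // 2)`.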

Fig. 1: Abstract model of the face at the quartet image resolution. As can be seen, the underlying idea is very simple and very attractive, because it is close to our intuition for the human face. However, the implementation is computationally intensive. The algorithm is applied iteratively for the entire range of possible cell dimensions in order to determine the best cell dimensions for creating the quartet image for each person. Another limitation is that only square cells are employed. In order to avoid the iterative nature of the original method, we estimate the cell dimensions in the quartet image by processing the horizontal and the vertical profile of the image. Let us denote by n and m the vertical and the horizontal quartet cell dimensions, respectively. The horizontal profile of the image is obtained by averaging all pixel intensities in each image column. Abrupt transitions in the horizontal profile are detected as significant local minima; these correspond to the left and right sides of the head. Accordingly, the quartet cell dimension in the horizontal direction can easily be estimated. Similarly, the vertical profile of the image is obtained by averaging all pixel intensities in each image row. The significant local minima in the vertical profile correspond to the hair, eyebrows, eyes, mouth and chin. It is fairly easy to locate the row where the eyebrows/eyes appear in the image by detecting the local minimum after the first abrupt transition in the vertical profile. Next, the row of the nose tip should be detected; it corresponds to a significant maximum that occurs below the eyes. Then, the steepest minimum below the nose tip is associated with the upper lip. By setting the distance between the rows where the eyes and the upper lip have been found to 2n, the quartet cell dimension in the vertical direction can be estimated. It is evident that the proposed preprocessing step also overcomes the drawback of square cells, because the cell dimensions are adapted to each person separately.
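The horizontal and vertical profiles used above are simple column and row averages, and locating the significant minima is then a 1-D search. A minimal sketch (the strict-minimum criterion is a simplifying assumption; the text's "significant" minima would additionally require a depth threshold):

```python
def profiles(image):
    """Horizontal profile = mean intensity of each image column;
    vertical profile = mean intensity of each image row."""
    rows, cols = len(image), len(image[0])
    horizontal = [sum(image[r][c] for r in range(rows)) / rows
                  for c in range(cols)]
    vertical = [sum(image[r][c] for c in range(cols)) / cols
                for r in range(rows)]
    return horizontal, vertical

def local_minima(profile):
    """Indices of strict local minima in a 1-D profile."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i] < profile[i - 1] and profile[i] < profile[i + 1]]

# A dark blob in the centre dips both profiles at index 1
img = [[9, 9, 9],
       [9, 1, 9],
       [9, 9, 9]]
h, v = profiles(img)
print(local_minima(h), local_minima(v))  # [1] [1]
```

From the detected minima, the horizontal cell width m follows from the head-side columns and the vertical height n from setting the eyes-to-upper-lip distance to 2n, as described above.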

Having estimated the quartet cell dimensions, we can describe the facial candidate detection rules. Since the system is hierarchical, it is preferable to decide that a face exists in a scene even though there may be no actual face than to miss a face that does exist. The decision whether or not a region of 4x4 cells is a facial candidate is based on:
• the detection of a homogeneous region of 2x2 cells in the middle of the model, shown in light grey in Fig. 1 above;
• the detection of homogeneous connected components having significant length in the π-shaped region shown in black in Fig. 1; or
• the detection of a beard region, shown in dark grey in Fig. 1.
Moreover, a significant difference in the average cell intensity between the central 2x2 region and the π-shaped region must be detected. For the sake of completeness, we note that if there are not enough cells in the vertical direction, the π-shaped region may have a total length of 12 cells instead of 14 cells. We have found that the rules described above are successful in detecting facial candidates. Subsequently, eyebrows/eyes, nostrils/nose and mouth detection rules are developed to validate the facial candidates determined by the procedure outlined above. 2.3.2 Edge Detectors The operators described here are those whose purpose is to identify meaningful image features on the basis of distributions of pixel grey levels. The two categories of operators included here are: 1. Edge Pixel Detectors - which assign a value to a pixel in proportion to the likelihood that the pixel is part of an image edge (i.e. a pixel on the boundary between two regions of different intensity values). 2. Line Pixel Detectors - which assign a value to a pixel in proportion to the likelihood that the pixel is part of an image line (i.e. a dark narrow region bounded on both sides by lighter regions, or vice-versa).
Detectors for other features can be defined, such as circular arc detectors in intensity images (or even more general detectors, as in the generalized Hough transform), or planar point detectors in range images, etc. Note that the operators merely identify pixels likely to be part of such a structure. To actually extract the structure from the image it is then necessary to group together image pixels (which are usually adjacent). 1) Roberts Cross Edge Detector The Roberts Cross operator performs a simple, quick to compute, 2-D spatial gradient measurement on an image. It thus highlights regions of high spatial gradient, which often correspond to edges. In its most common usage, the input to the operator is a greyscale image, as is the output. Pixel values at each point in the output represent the estimated absolute magnitude of the spatial gradient of the input image at that point.

How It Works In theory, the operator consists of a pair of 2×2 convolution masks as shown in Figure 1. One mask is simply the other rotated by 90°. This is very similar to the Sobel operator.

Gx:        Gy:
+1  0       0 -1
 0 -1      +1  0

Figure 1: Roberts Cross convolution masks These masks are designed to respond maximally to edges running at 45° to the pixel grid, one mask for each of the two perpendicular orientations. The masks can be applied separately to the input image, to produce separate measurements of the gradient component in each orientation (call these Gx and Gy). These can then be combined to find the absolute magnitude of the gradient at each point and the orientation of that gradient. The gradient magnitude is given by: |G| = √(Gx² + Gy²) although typically, an approximate magnitude is computed using: |G| = |Gx| + |Gy| which is much faster to compute. The angle of orientation of the edge giving rise to the spatial gradient (relative to the pixel grid orientation) is given by: θ = arctan(Gy/Gx) − 3π/4 In this case, orientation 0 is taken to mean that the direction of maximum contrast from black to white runs from left to right on the image, and other angles are measured anticlockwise from this. Often, the absolute magnitude is the only output the user sees --- the two components of the gradient are conveniently computed and added in a single pass over the input image using the pseudo-convolution operator shown in Figure 2.

P1 P2
P3 P4

Figure 2: Pseudo-convolution mask used to quickly compute approximate gradient magnitude

Using this mask the approximate magnitude is given by: |G| = |P1 − P4| + |P2 − P3| The main reason for using the Roberts Cross operator is that it is very quick to compute. Only four input pixels need to be examined to determine the value of each output pixel, and only subtractions and additions are used in the calculation. In addition there are no parameters to set. Its main disadvantages are that, since it uses such a small mask, it is very sensitive to noise. It also produces very weak responses to genuine edges unless they are very sharp. The Sobel operator performs much better in this respect. 2) Sobel Edge Detector The Sobel operator performs a 2-D spatial gradient measurement on an image and so emphasizes regions of high spatial gradient that correspond to edges. Typically it is used to find the approximate absolute gradient magnitude at each point in an input greyscale image. How It Works In theory at least, the operator consists of a pair of 3×3 convolution masks as shown in Figure 1. One mask is simply the other rotated by 90°. This is very similar to the Roberts Cross operator.

Gx:            Gy:
-1 -2 -1       +1  0 -1
 0  0  0       +2  0 -2
+1 +2 +1       +1  0 -1

Figure 1: Sobel convolution masks

These masks are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, one mask for each of the two perpendicular orientations. The masks can be applied separately to the input image, to produce separate measurements of the gradient component in each orientation (call these Gx and Gy). These can then be combined to find the absolute magnitude of the gradient at each point and the orientation of that gradient. The gradient magnitude is given by: |G| = √(Gx² + Gy²) although typically, an approximate magnitude is computed using: |G| = |Gx| + |Gy| which is much faster to compute.

The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by: θ = arctan(Gy/Gx) − 3π/4 In this case, orientation 0 is taken to mean that the direction of maximum contrast from black to white runs from left to right on the image, and other angles are measured anticlockwise from this. Often, this absolute magnitude is the only output the user sees --- the two components of the gradient are conveniently computed and added in a single pass over the input image using the pseudo-convolution operator shown in Figure 2.

P1 P2 P3
P4 P5 P6
P7 P8 P9

Figure 2: Pseudo-convolution masks used to quickly compute approximate gradient magnitude Using this mask the approximate magnitude is given by:

|G| = |(P1 + 2×P2 + P3) − (P7 + 2×P8 + P9)| + |(P3 + 2×P6 + P9) − (P1 + 2×P4 + P7)|
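Both pseudo-convolution computations can be written down directly. A small Python sketch over plain 2-D lists, using the P1..P4 and P1..P9 grid labels from the figures above (border handling and image I/O are omitted for brevity):

```python
def roberts_magnitude(img, r, c):
    """Approximate Roberts Cross magnitude at (r, c):
    |G| = |P1 - P4| + |P2 - P3| over the 2x2 block
        P1 P2
        P3 P4
    """
    p1, p2 = img[r][c], img[r][c + 1]
    p3, p4 = img[r + 1][c], img[r + 1][c + 1]
    return abs(p1 - p4) + abs(p2 - p3)

def sobel_magnitude(img, r, c):
    """Approximate Sobel magnitude at (r, c) over the 3x3 block
        P1 P2 P3
        P4 P5 P6
        P7 P8 P9
    |G| = |(P1+2*P2+P3)-(P7+2*P8+P9)| + |(P3+2*P6+P9)-(P1+2*P4+P7)|
    """
    p = [img[r - 1][c - 1], img[r - 1][c], img[r - 1][c + 1],
         img[r][c - 1],     img[r][c],     img[r][c + 1],
         img[r + 1][c - 1], img[r + 1][c], img[r + 1][c + 1]]
    gx = (p[0] + 2 * p[1] + p[2]) - (p[6] + 2 * p[7] + p[8])
    gy = (p[2] + 2 * p[5] + p[8]) - (p[0] + 2 * p[3] + p[6])
    return abs(gx) + abs(gy)

# A vertical step edge: left half dark, right half bright
img = [[0, 0, 10, 10]] * 4
print(roberts_magnitude(img, 1, 1))  # |0-10| + |0-10| = 20
print(sobel_magnitude(img, 1, 1))    # |Gx| + |Gy| = 0 + 40 = 40
```

Note how the larger Sobel mask produces a considerably stronger response to the same edge, in line with the comparison made in the text.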
The Sobel operator is slower to compute than the Roberts Cross operator, but its larger convolution mask smoothes the input image to a greater extent and so makes the operator less sensitive to noise. The operator also generally produces considerably higher output values for similar edges compared with the Roberts Cross. As with the Roberts Cross operator, output values from the operator can easily overflow the maximum allowed pixel value for image types that only support smallish integer pixel values (e.g. 8-bit integer images). When this happens the standard practice is to simply set overflowing output pixels to the maximum allowed value. The problem can be avoided by using an image type that supports pixel values with a larger range. Natural edges in images often lead to lines in the output image that are several pixels wide due to the smoothing effect of the Sobel operator. Some thinning may be desirable to counter this. Failing that, some sort of hysteresis ridge tracking could be used as in the Canny operator. 3) Canny Edge Detector The Canny operator was designed to be an optimal edge detector (according to particular criteria --- there are other detectors around that also claim to be optimal with respect to

slightly different criteria). It takes as input a grey scale image, and produces as output an image showing the positions of tracked intensity discontinuities. How It Works The Canny operator works in a multi-stage process. First of all the image is smoothed by Gaussian convolution. Then a simple 2-D first derivative operator (somewhat like the Roberts Cross) is applied to the smoothed image to highlight regions of the image with high first spatial derivatives. Edges give rise to ridges in the gradient magnitude image. The algorithm then tracks along the top of these ridges and sets to zero all pixels that are not actually on the ridge top so as to give a thin line in the output, a process known as non-maximal suppression. The tracking process exhibits hysteresis controlled by two thresholds: T1 and T2 with T1 > T2. Tracking can only begin at a point on a ridge higher than T1. Tracking then continues in both directions out from that point until the height of the ridge falls below T2. This hysteresis helps to ensure that noisy edges are not broken up into multiple edge fragments. The effect of the Canny operator is determined by three parameters --- the width of the Gaussian mask used in the smoothing phase, and the upper and lower thresholds used by the tracker. Increasing the width of the Gaussian mask reduces the detector's sensitivity to noise, at the expense of losing some of the finer detail in the image. The localization error in the detected edges also increases slightly as the Gaussian width is increased. Usually, the upper tracking threshold can be set quite high, and the lower threshold quite low for good results. Setting the lower threshold too high will cause noisy edges to break up. Setting the upper threshold too low increases the number of spurious and undesirable edge fragments appearing in the output. One problem with the basic Canny operator is to do with Y-junctions i.e. places where three ridges meet in the gradient magnitude image. 
Such junctions can occur where an edge is partially occluded by another object. The tracker will treat two of the ridges as a single line segment, and the third as a line that approaches, but doesn't quite connect to, that line segment. 4) Compass Edge Detector Compass edge detection is an alternative approach to differential gradient edge detection. The operation usually outputs two images, one estimating the local edge gradient magnitude and one estimating the edge orientation of the input image. How It Works In compass edge detection the image is convolved with a set of (in general 8) convolution masks, each of which is sensitive to edges in a different orientation. For each pixel the local edge gradient magnitude is estimated as the maximum response over all 8 masks at that pixel location: |G| = max(|Gi| : i = 1 to n) where Gi is the response of mask i at the particular pixel position and n is the number of convolution masks. The local edge orientation is estimated as the orientation of the mask that yields the maximum response.
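The max-response rule can be sketched with the Prewitt templates of the next figure; the remaining six masks are generated by circularly shifting the eight border coefficients, as the text describes (the clockwise shift order below is chosen so that one step turns the 0° template into the 45° one):

```python
PREWITT_0 = [[-1, -1, -1],
             [+1, -2, +1],
             [+1, +1, +1]]

def rotate45(mask):
    """Shift the 8 border coefficients of a 3x3 mask one step
    clockwise (the centre stays fixed): the next 45° template."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2),
            (2, 2), (2, 1), (2, 0), (1, 0)]
    out = [row[:] for row in mask]
    for k, (r, c) in enumerate(ring):
        pr, pc = ring[(k - 1) % 8]
        out[r][c] = mask[pr][pc]
    return out

def compass_masks(base):
    """The full set of 8 orientation templates."""
    masks = [base]
    for _ in range(7):
        masks.append(rotate45(masks[-1]))
    return masks

def convolve_at(img, mask, r, c):
    """Response of a 3x3 mask centred on pixel (r, c)."""
    return sum(mask[i][j] * img[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def compass_response(img, r, c):
    """|G| = max over the 8 masks of |Gi|; the orientation is the
    1-based index of the winning mask."""
    mags = [abs(convolve_at(img, m, r, c))
            for m in compass_masks(PREWITT_0)]
    g = max(mags)
    return g, mags.index(g) + 1

# A horizontal step edge below the centre pixel
img = [[0, 0, 0],
       [0, 0, 0],
       [10, 10, 10]]
print(compass_response(img, 1, 1))
```

On ties between neighbouring orientations this sketch simply reports the first winning mask; a production implementation would need a documented tie-breaking rule.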

Various masks can be used for this operation; for the following discussion we will use the Prewitt masks. Two templates out of the set of 8 are shown in Figure 1:

0°:            45°:
-1 -1 -1       +1 -1 -1
+1 -2 +1       +1 -2 -1
+1 +1 +1       +1 +1 +1

Figure 1: Prewitt compass edge detecting templates sensitive to 0° and 45°. The whole set of 8 masks is produced by taking one of the masks and rotating its coefficients circularly. Each of the resulting masks is sensitive to another edge orientation ranging from 0° to 315° in steps of 45°, where 0° corresponds to a vertical edge. The maximum response |G| for each pixel gives the value of the corresponding pixel in the output magnitude image. The values for the output orientation image lie between 1 and 8, depending on which of the 8 masks produced the maximum response. This edge detection method is also called edge template matching, because a set of edge templates is matched to the image, each representing an edge in a certain orientation. The edge magnitude and orientation of a pixel are then determined by the template that best matches the local area of the pixel. The compass edge detector is an appropriate way to estimate both the magnitude and the orientation of an edge. Whereas differential gradient edge detection needs a rather time-consuming calculation to estimate the orientation from the magnitudes in the x- and y-directions, compass edge detection obtains the orientation directly from the mask with the maximum response. The compass operator is limited to (here) 8 possible orientations; however, experience shows that most direct orientation estimates are not much more accurate. On the other hand, the compass operator needs (here) 8 convolutions per pixel, whereas the gradient operator needs only 2, one mask being sensitive to edges in the vertical direction and one to the horizontal direction. The result for the edge magnitude image is very similar with both methods, provided the same convolution mask is used. Common Variants As already mentioned, various masks can be used for compass edge detection. The most common ones are shown in Figure 2:

Sobel:
0°:            45°:
-1 -2 -1        0 -1 -2
 0  0  0       +1  0 -1
+1 +2 +1       +2 +1  0

Kirsch:
0°:            45°:
-3 -3 -3       -3 -3 -3
-3  0 -3       +5  0 -3
+5 +5 +5       +5 +5 -3

Robinson:
0°:            45°:
-1 -1 -1        0 -1 -1
 0  0  0       +1  0 -1
+1 +1 +1       +1 +1  0

Figure 2: Some examples of the most common compass edge detecting masks, each example showing two masks out of the set of eight. For every template, the set of all eight masks is obtained by shifting the coefficients of the mask circularly. The results for different templates are similar; the main difference is the different scale in the magnitude image. The advantage of the Sobel and Robinson masks is that only 4 out of the 8 magnitude values must be calculated: since each pair of masks rotated by 180° is symmetric, each of the remaining four values can be generated by inverting the result of the opposite mask. 5) Zero Crossing Detector The zero crossing detector looks for places in the Laplacian of an image where the value of the Laplacian passes through zero --- i.e. points where the Laplacian changes sign. Such points often occur at `edges' in images --- i.e. points where the intensity of the image changes rapidly, but they also occur at places that are not as easy to associate with edges. It is best to think of the zero crossing detector as some sort of feature detector rather than as a specific edge detector. Zero crossings always lie on closed contours, and so the output from the zero crossing detector is usually a binary image with single-pixel-thick lines showing the positions of the zero crossing points. The starting point for the zero crossing detector is an image which has been filtered using the Laplacian of Gaussian filter. The zero crossings that result are strongly influenced by the size of the Gaussian used for the smoothing stage of this operator. As the smoothing is increased, fewer and fewer zero crossing contours will be found, and those that do remain will correspond to features of larger and larger scale in the image. How It Works The core of the zero crossing detector is the Laplacian of Gaussian filter, and so knowledge of that operator is assumed here. As described there, `edges' in images give

rise to zero crossings in the LoG output. For instance, Figure 1 shows the response of a 1-D LoG filter to a step edge in the image. However, zero crossings also occur at any place where the image intensity gradient starts increasing or starts decreasing, and this may happen at places that are not obviously edges. Often zero crossings are found in regions of very low gradient where the intensity gradient wobbles up and down around zero. Once the image has been LoG filtered, it only remains to detect the zero crossings. This can be done in several ways. The simplest is to threshold the LoG output at zero, to produce a binary image where the boundaries between foreground and background regions represent the locations of zero crossing points. These boundaries can then be easily detected and marked in a single pass, e.g. using some morphological operator. For instance, to locate all boundary points, we simply have to mark each foreground point that has at least one background neighbor. The problem with this technique is that it will tend to bias the location of the zero crossing edge to either the light side of the edge or the dark side of the edge, depending upon whether it is decided to look for the edges of foreground regions or for the edges of background regions.
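The simple thresholding scheme just described can be sketched as follows, assuming the LoG-filtered image is already available as a 2-D list; the choices of "foreground = LoG ≥ 0" and of 4-connectivity for neighbours are illustrative assumptions:

```python
def zero_crossings(log_img):
    """Mark foreground points (LoG >= 0) that have at least one
    background (LoG < 0) 4-neighbour; returns a binary image."""
    rows, cols = len(log_img), len(log_img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if log_img[r][c] < 0:
                continue  # background point, never marked
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and log_img[nr][nc] < 0:
                    out[r][c] = 1
                    break
    return out

# A 1-D style example: the sign change sits between columns 1 and 2
log_img = [[-3, -1, 2, 4]]
print(zero_crossings(log_img))  # [[0, 0, 1, 0]]
```

Marking foreground points as done here places the crossing on the light side of the edge; marking background points instead would bias it to the dark side, which is exactly the systematic bias the text warns about.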

Figure 1: Response of a 1-D LoG filter to a step edge. The left hand graph shows a 1-D image, 200 pixels long, containing a step edge. The right hand graph shows the response of a 1-D LoG filter with Gaussian standard deviation 3 pixels. A better technique is to consider points on both sides of the threshold boundary, and choose the one with the lowest absolute magnitude of the Laplacian, which will hopefully be closest to the zero crossing. Since the zero crossings generally fall between two pixels in the LoG filtered image, an alternative output representation is an image grid which is spatially shifted half a pixel across and half a pixel down relative to the original image. Such a representation is known as a dual lattice. This does not actually localize the zero crossing any more accurately, of course. A more accurate approach is to perform some kind of interpolation to estimate the position of the zero crossing to sub-pixel precision.

The behavior of the LoG zero crossing edge detector is largely governed by the standard deviation of the Gaussian used in the LoG filter. The higher this value is set, the more small features will be smoothed out of existence, and hence fewer zero crossings will be produced. Hence, this parameter can be set to remove unwanted detail or noise as desired. The idea that at different smoothing levels different sized features become prominent is referred to as `scale'. 6) Line detection While edges (i.e. boundaries between regions with relatively distinct grey levels) are by far the most common type of discontinuity in an image, instances of thin lines in an image occur frequently enough that it is useful to have a separate mechanism for detecting them. Here we present a convolution based technique which produces a gradient image description of the thin lines in an input image. Note that the Hough transform can also be used to detect lines; however, in that case, the output is a parametric description of the lines in an image. How It Works The line detection operator consists of a convolution mask tuned to detect the presence of lines of a particular width n, at a particular orientation θ. Figure 1 shows a collection of four such masks, which each respond to lines of single pixel width at the particular orientation shown.

(a)            (b)            (c)            (d)
-1 -1 -1       -1 +2 -1       -1 -1 +2       +2 -1 -1
+2 +2 +2       -1 +2 -1       -1 +2 -1       -1 +2 -1
-1 -1 -1       -1 +2 -1       +2 -1 -1       -1 -1 +2

Figure 1: Four line detection masks which respond maximally to horizontal, vertical, and oblique (+45 and -45 degree) single pixel wide lines. If Ri denotes the response of mask i, we can apply each of these masks across an image and, for any particular point, if |Ri| > |Rj| for all j ≠ i, that point is more likely to contain a line whose orientation (and width) corresponds to that of mask i. One usually thresholds Ri to eliminate weak lines corresponding to edges and other features with intensity gradients of a different scale than the desired line width. In order to find complete lines, one must join together line fragments, e.g., with an edge tracking operator. 2.3.3 Thinning Algorithm

“Thinning” plays an important role in digital image processing and pattern recognition, since the outcome of thinning can largely determine the effectiveness and efficiency of extracting the distinctive features from the images. In image processing and pattern

recognition problems, a digitized binary pattern is normally defined by a matrix, where each element, called a pixel, is either 1 (foreground/white pixel) or 0 (background/dark pixel). Thinning is a process that deletes foreground pixels and transforms the pattern into a "thin" line drawing. The resulting thin image is called the skeleton of the original image. The thinned image must preserve the basic structure and the connectedness of the original image. Skeletonization or thinning is a very important preprocessing step in pattern analysis tasks such as industrial parts inspection, fingerprint recognition, optical character recognition, and biomedical diagnosis [37]. One advantage of skeletonization is the reduction of the memory space required for storing the essential structural information present in a pattern. Moreover, it simplifies the data structure required in pattern analysis. Most skeletonization algorithms require iterative passes through the whole image, or at least through each pixel of the object considered. At each pass, a relatively complicated analysis of each pixel's neighborhood must be performed, which makes the algorithms time-consuming. The objective of thinning is to reduce the amount of information in an image pattern to the minimum needed for recognition. A thinned image helps the extraction of important features such as end points, junction points, and connections from image patterns. Many thinning algorithms have therefore been proposed. The two major approaches to thinning digital patterns are iterative boundary removal algorithms and distance transformation algorithms [38]. Iterative boundary removal algorithms delete pixels on the boundary of a pattern repeatedly until only a unit-pixel-width thinned image remains. Distance transformation algorithms are not appropriate for general applications since they are not robust, especially for patterns with highly variable stroke directions and thicknesses.
Thinning based on iterative boundary removal can be divided into sequential and parallel algorithms. In a sequential/serial method, the value of a pixel at the nth iteration depends on a set of pixels for some of which the result of the nth iteration is already known. In parallel processing, the value of a pixel at the nth iteration depends on the values of the pixel and its neighbors at the (n − 1)th iteration; thus, all the pixels of the digital pattern can be thinned simultaneously. There are two main steps in this thinning algorithm that are repeated until the obtained image approaches the medial axis of the original image. In the first step the contour of the image is calculated and marked for deletion (a serial approach), and in the second step the marked contour is deleted (a parallel approach). The contour of an image is formed by on-pixels found in the innermost and most distant positions of the image. These are the main characteristics of the TA algorithm: i) it maintains connectivity and preserves end points; ii) the resulting skeleton approaches the medial axis of the original image; iii) it is practically immune to noise; and iv) its execution time is very fast [39]. 2.3.4 Curve Fitting Algorithm

I. Dynamic Strip Algorithm Strip algorithms for curve fitting have received much attention recently because of their superior speed performance advantage. As shown in fig. 1, a strip is defined by one critical and two boundary lines. A critical line is defined by two reference points, the first and the second data points (i.e. points O and a in fig. 1) of a curve. Then two boundary lines which are parallel to the critical line and at a distance d from it are defined. The distance d is commonly called the error tolerance. These two boundary lines form a strip to restrict the line fitting process. The curve is then traversed point by point. The process stops and a line segment is generated when the first point which is outside the strip is found (e.g. point e in fig. 1). A line segment is then defined by the points O and c. Point c is used again as the starting point for the next strip fitting mechanism. One major problem with the strip algorithm is that if the second reference point is positioned in such a way that the third point on the curve is outside the strip, the resulting line segment will then be very short and is often not desirable. An example is shown as strip 1 in fig 2. it can be seen in the same figure that strip 2 is a more desirable strip because it contains more data points. From the above simple observation, Leung and Yang [40] proposed a Dynamic Strip algorithm (DSA) which rotates the strip using the starting point as a pivot. The basic idea is to rotate the strip to enclose as many data points as possible. An example to illustrate the advantage of the Dynamic Strip algorithm can be seen in fig.3 where case (a) illustrates the best possible strip without rotation while case (b) illustrates the best possible strip when rotation is allowed. Orientation of the strip is the only parameter to vary in the Dynamic Strip Algorithm. d d .f
Fig. 1: Definition of a strip.

Fig. 2: A badly and a properly chosen strip.

II. Dynamic Two-Strip Algorithm

The Dynamic Two-Strip algorithm has two stages. In the first stage, a generator called the Left-Right Strip Generator (LRSG) is employed to find the best fitted LHS and RHS strips at each data point. In our convention, a point that is traversed before (after) the data point P is said to be on the RHS (LHS) of P. The computed strips are used to compute the figure of merit of the data point. In the second stage, a local maximum detection process is applied to pick out desirable feature points, i.e. points of high curvature. The approximated curve is the one with the feature points connected by straight lines.

Left-Right Strip Generator (LRSG)

The LRSG is an extension of the Dynamic Strip algorithm: the strip is allowed to adjust its orientation as well as its width dynamically. To simplify the discussion, we assume the data points are labeled 0, 1, …, N−1 and are traversed in either clockwise or counter-clockwise fashion. Let L_i^Left (L_i^Right) and W_i^Left (W_i^Right) be the length and width of the fitted LHS (RHS) strip at the i-th data point. Initially, a strip with the minimum width (i.e. W_i = W_min) is used in each direction. When no more data points can be included in the strip, the ratio

E_i^Left = L_i^Left / W_i^Left   and   E_i^Right = L_i^Right / W_i^Right
is computed. Intuitively, E_i is a measure of the elongatedness of the strip: the longer or narrower the strip, the higher the value of E_i. An elongated strip, i.e. one with large E_i, is therefore desirable. The width is then increased by the smallest amount that allows the strip fitting iteration to resume, i.e. so that more data points can be included. This process continues until the maximum allowable width (W_max) of the strip is reached. The value of W_max can be set arbitrarily large, but the minimum width (W_min) cannot be set arbitrarily small. This is particularly true for digitized data, because the length of a strip has an upper bound L_max (bounded by the dimension of the screen) and a lower bound L_min (bounded by the distance between two consecutive vertical or horizontal pixels), whereas the width of a strip has no lower bound. If we arbitrarily choose W_min to be less than 1/L_max, no strip of width ≥ 1 can be chosen, since it always gives a smaller E_i. This can be illustrated by considering

E_i = L_i / W_i   and   E_i' = L_i' / W_i'

with W_i < 1/L_max and 1 ≤ W_i' < ∞. In this case we have E_i > L_i · L_max; since L_i is bounded from below by 1, E_i would be greater than L_max. On the other hand, E_i' can be at most equal to L_i' (with W_i' = 1); since L_i' is bounded from above by L_max, E_i would always be larger than E_i'. Therefore, with W_min < 1/L_max, no strip of width ≥ 1 will ever be chosen, and little or no data reduction (noise filtering) is done. In practice, data reduction or noise filtering is desirable. The result of the above operation is a collection of the longest possible LHS (RHS) strips of different widths at each data point. At each side of the data point, only the strip with the largest E_i is selected.
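The bound on W_min can be checked with concrete numbers. The following snippet uses an illustrative screen dimension (L_max = 512 is our assumption, not a value from the text):

```python
# Numeric check of the W_min bound argument (illustrative values).
L_max = 512          # longest possible strip, bounded by the screen dimension
L_min = 1            # shortest possible strip, one pixel spacing

W_bad = 0.5 / L_max  # a width below the 1/L_max bound
E_thin = L_min / W_bad   # elongatedness of the shortest strip at this width
E_wide = L_max / 1.0     # best elongatedness any strip of width >= 1 can reach

# Even the worst strip of width W_bad beats the best strip of width 1,
# so no strip of width >= 1 would ever be selected.
assert E_thin > E_wide
```

With these numbers E_thin = 1024 against E_wide = 512, confirming that any W_min below 1/L_max suppresses all strips of width ≥ 1.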

The LRSG simulates the side detection mechanism. The curvature at a point can then be determined from the angle subtended by the best fitted left and right strips. To determine whether the i-th data point P_i is a feature point, we define a figure of merit f_i that measures the worthiness of P_i to be included in the approximation. f_i is defined as:

f_i = E_i^Left · S_i^θ · E_i^Right

where θ is the angle subtended by the best fitted left and right strips and S_i^θ is the angle acuteness measure at point i:

S_i^θ = |180° − θ|,   0° ≤ θ ≤ 360°.

According to this computation, sharper angles give a larger value of S_i^θ. A sharp angle subtended by long strips results in a large f_i, whereas a blunt angle subtended by short strips results in a small f_i. The above discussion can be summarized in three steps:
(1) Determine E_i^Left and E_i^Right for all i.
(2) Determine the angle θ subtended by the left and right strips, and from it the value of S_i^θ.
(3) Determine f_i.
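The figure-of-merit computation can be sketched in a few lines of Python (an illustrative sketch only; angles are taken in degrees):

```python
def acuteness(theta_deg):
    """Angle acuteness measure S_i = |180 - theta|, for 0 <= theta <= 360."""
    return abs(180.0 - theta_deg)

def figure_of_merit(e_left, theta_deg, e_right):
    """Figure of merit f_i = E_left * S_theta * E_right: large for sharp
    corners (theta far from 180 degrees) supported by elongated strips."""
    return e_left * acuteness(theta_deg) * e_right
```

For example, a 90-degree corner supported by strips of equal elongatedness scores higher than a near-straight 170-degree point with the same strips, matching the intuition above.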

Local Maximum Detection

The local maximum detection process consists of three stages. First, non-local-maximum points (i.e. points with small f compared with their neighbors) are eliminated temporarily. The second step checks whether over-elimination has occurred; if so, some temporarily eliminated points are added back to the result. The final step fits narrow strips to the remaining points to eliminate points that align approximately on a straight line. Details of these steps are described in the following.

Non-local-maximum elimination process: basically, this is a process that allows each data point P_i with high f_i to eliminate other points that are in the left and right domains of P_i. A domain is defined by the area or length covered by the best fitted strip of a point. To simplify the discussion, the left and right domains of P_i are denoted D_i^Left and D_i^Right respectively. The statement that a point Q is in, say, the left domain of P_i is written as Q ∈ D_i^Left.

An ideal case is shown in fig. 3, where points A and B are local maxima, since all the other points between A and B (e.g. C) have strips subtending an angle of approximately 180° (fig. 3(a)), or strips of wider widths together with wider angles (fig. 3(b)). In these cases, the points between A and B (e.g. point C) are eliminated. In the algorithm, a point P_j is eliminated if one of the following conditions is satisfied:

(i) there exists m such that P_j ∈ D_{j−m}^Left and D_j^Left ⊆ D_{j−m}^Left and f_j < f_{j−m};

(ii) there exists m such that P_j ∈ D_{j+m}^Right and D_j^Right ⊆ D_{j+m}^Right and f_j < f_{j+m}.

In practice, complete compliance with the domain subsetting conditions, i.e. D_j^Left ⊆ D_{j−m}^Left or D_j^Right ⊆ D_{j+m}^Right, was found to be difficult to achieve, so the conditions are relaxed. We define the condition D_j^Left ⊆ D_{j−m}^Left to hold if half of the left domain of P_j is covered by D_{j−m}^Left. The same applies to the right domain.
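The relaxed elimination test can be sketched in Python. Representing each domain as an interval (lo, hi) of point indices is our own simplifying assumption for illustration; the actual domains are defined by the fitted strips:

```python
def covers_half(dom, dom_other):
    """Relaxed domain-subsetting test: dom 'subset of' dom_other is taken to
    hold if at least half of dom (an index interval (lo, hi)) is covered."""
    lo, hi = dom
    olo, ohi = dom_other
    overlap = max(0, min(hi, ohi) - max(lo, olo))
    return overlap * 2 >= (hi - lo)

def eliminated(j, n, f, left_dom, right_dom):
    """Non-local-maximum test for point j among n points: eliminated if some
    neighbour j-m (or j+m) has a left (right) domain that contains j, covers
    at least half of j's corresponding domain, and has a larger merit f."""
    for m in range(1, n):
        for k, doms in ((j - m, left_dom), (j + m, right_dom)):
            if 0 <= k < n:
                lo, hi = doms[k]
                if lo <= j <= hi and covers_half(doms[j], doms[k]) and f[j] < f[k]:
                    return True
    return False
```

In a three-point example where the middle point has the smallest merit and lies inside the end points' domains, only the middle point is eliminated, as in the fig. 3 discussion.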

Another problem can be understood by considering fig. 4. If the lines AB and FG are long enough, the curve BCDF is comparatively insignificant and can be ignored. On the other hand, if either AB or FG is short, the curve BCDF may be significant. The classification can be illustrated by considering the angle at point B. At point B, the best fitted right strip runs from B to A. If the line FG is long, the best fitted left strip of B runs from B to G. If the line FG is short, the best fitted left strip may instead run from B to C, since a narrower strip, which gives a larger value of f_i, can be used. In the first case the angle subtended by the left and right strips of point B is obtuse, while in the second it is acute; only in the first case will point B be eliminated.


Fig. 3: Examples of two local maximum points (A and B) and one weaker point (C) with its domains.

For example (see fig. 4), if the best fitted left strip of point A is from A to G, the process will examine the points in between (e.g. B, C, D and F) before eliminating any of them. If the lines AB and FG are long enough, all the points in between will have obtuse angles and are eliminated. Otherwise, those which have acute angles are retained. For example, if point B has an acute angle, only the points between A and B will be eliminated by A; consequently, the left domain of A is reduced to run from A to B only.
Figure 4: A possible but undesirable chosen strip (AG).

Bridging process: in the first process, weak and insignificant points are eliminated. In practice, some weak points may be significant, so a check is made for possible over-elimination; if it has occurred, some temporarily eliminated points are added back to the result. Ideally, neighboring feature points are supported by each other's domains (fig. 5(a)): the left domain of point A covers point B and the right domain of point B covers point A. If two selected neighboring points A and B are not bridged in this way, we say over-elimination has occurred. Bridges can be broken in the following ways:

(i) A ∉ D_B^Right and B ∉ D_A^Left (see fig. 5(b));

(ii) A ∉ D_B^Right and B ∈ D_A^Left, or A ∈ D_B^Right and B ∉ D_A^Left (see fig. 5(c)).

In either case, additional feature points are sought and the points involved are re-examined iteratively (or recursively) until all neighboring points are bridged together. The additional feature points are sought at the ends of the shortened domains by selecting the immediate local maximum points in the neighborhood. For example, in fig. 5(b), at the end of the shortened right domain of B, the process looks for the first local maximum going from point C to B; for the shortened left domain of A, the process starts from point D towards A. In short, the bridging process checks the termination condition (all neighboring points bridged together) in each iteration. If the condition is satisfied, the process terminates; otherwise, additional feature points are sought and the iteration continues.

Fig. 5(a): An ideal relationship between two local maximum points (A and B) and their domains.

Strip fitting process: this is a data reduction process that fits narrow strips to the remaining points. The motivation is that some consecutive feature points may align approximately on a straight line, and it is desirable to eliminate the points in between. For example, if points A, B, C and D are chosen as feature points after the first two processes, as shown in fig. 5(b), it is desirable to eliminate points C and D and let the more prominent points A and B represent the curve ADCB. In practice, the process first locates the most outstanding points, the local maximum points (e.g. A) among the remaining points, as starting points. Two narrow strips of fixed width (one half of the minimum width) are then fitted to the LHS and RHS of the data point, eliminating any points within the strips with smaller values of merit (e.g. C and D). The fitting stops whenever the last point that can fit within the strip is found or a point with a larger value of merit is met; in either case, the last point examined is not eliminated.

Fig. 5(b): An example of a bridge broken with condition (i).

Fig. 5(c): An example of a bridge broken with condition (ii).

2.3.5 Hausdorff Distance Algorithm

The Hausdorff distance is a shape comparison metric based on binary images. It is a distance defined between two point sets. Unlike most shape comparison methods, which build a point-to-point correspondence between a model and a test image, the Hausdorff distance can be calculated without explicit point correspondence. The use of the Hausdorff distance for binary image comparison and computer vision was originally proposed by Huttenlocher and colleagues [41]. In their paper the authors argue that the method is more tolerant to perturbations in the locations of points than binary correlation techniques, since it measures proximity rather than exact superposition. Furthermore, there is a natural allowance to compare partial images, and the method lends itself to simple and fast implementation. Formally, given two finite point sets A = {a1, …, ap} and B = {b1, …, bq}, the Hausdorff distance is defined as

H(A, B) = max( h(A, B), h(B, A) ),

where

h(A, B) = max_{a∈A} min_{b∈B} || a − b ||.

In the formulation above, ||·|| is some underlying norm over the point sets A and B; in the following discussion, we assume that the distance between any two data points is the Euclidean distance. h(A, B) can be trivially computed in time O(pq) for point sets of size p and q respectively, and this can be improved to O((p + q) log(p + q)). The function h(A, B) is called the directed Hausdorff distance from set A to B. It identifies

the point a ∈ A that is farthest from any point of B and measures the distance from a to its nearest neighbor in B. In other words, h(A, B) in effect ranks each point of A by its distance to the nearest point in B and then uses the largest-ranked point as the measure of distance (the most mismatched point of A). Intuitively, if h(A, B) = d, then each point of A must be within distance d of some point of B, and there is also some point of A that is exactly distance d from the nearest point of B. For practical implementations, it is also important (due to occlusion or noise conditions) to be able to compare portions of shapes rather than providing exact matches. To handle such situations, the Hausdorff distance can be naturally extended to find the best partial distance between sets A and B: while computing h(A, B), one simply ranks each point of A by its distance to the nearest point in B and takes the Kth-ranked value. This definition has the nice property that it automatically selects the K "best matching" points of set A that minimize the directed Hausdorff distance [41]. Realizing that there could be many different ways to define the directed (h(A, B), h(B, A)) and undirected (H(A, B)) distances between two point sets A and B, Dubuisson and Jain revised the original definition of h(A, B), proposing an improved measure, called the modified Hausdorff distance (MHD), which is less sensitive to noise. Specifically, in their formulation

h(A, B) = (1 / Na) Σ_{a∈A} min_{b∈B} || a − b ||

where Na = p, the number of points in set A. In their paper, the authors argue that even the Kth-ranked Hausdorff distance of Huttenlocher presents some problems for object matching under noisy conditions, and conclude that the modified distance above has the most desirable behavior for real-world applications. In this paper, we adopt the MHD formulation of Dubuisson, and further improve its performance by introducing the notion of a neighborhood function (N_B^a) and associated penalties (P). Specifically, we assume that for each point in set A, the corresponding point in B must fall within a range of a given diameter. This assumption is valid under the conditions that (i) the input and reference images are normalized by appropriate preprocessing algorithms, and (ii) the non-rigid transformation is small and localized. Let N_B^a be the neighborhood of point a in set B, and let the indicator I = 1 if there exists a point b ∈ N_B^a, and I = 0 otherwise. The complete formulation of the "doubly" modified Hausdorff distance (M2HD) can now be written as

d(a, B) = max( I · min_{b ∈ N_B^a} || a − b ||, (1 − I) · P ),

h(A, B) = (1 / Na) Σ_{a∈A} d(a, B),

H(A, B) = max( h(A, B), h(B, A) ).
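For illustration, the distance variants discussed in this section can be sketched in brute-force Python (the system itself implements them in C# in the HausdorffDistance package). Taking the neighborhood N_B^a to be a disc of a given radius is our assumption; the text only specifies a range of a given diameter:

```python
import math

def h_directed(A, B):
    """Directed Hausdorff distance h(A,B) = max over a in A of the distance
    from a to its nearest neighbour in B (brute force, O(pq))."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Classic undirected Hausdorff distance H(A,B) = max(h(A,B), h(B,A))."""
    return max(h_directed(A, B), h_directed(B, A))

def mhd_directed(A, B):
    """Dubuisson-Jain modified directed distance:
    h(A,B) = (1/Na) * sum over a in A of min over b in B of ||a-b||."""
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)

def m2hd_directed(A, B, radius, penalty):
    """'Doubly' modified directed distance sketch: each point a may only
    match points of B inside the neighbourhood N_B^a (here a disc of the
    given radius); if the neighbourhood is empty (I = 0), a fixed penalty
    P is charged instead, per d(a,B) = max(I*min||a-b||, (1-I)*P)."""
    total = 0.0
    for a in A:
        near = [math.dist(a, b) for b in B if math.dist(a, b) <= radius]
        total += min(near) if near else penalty
    return total / len(A)

def m2hd(A, B, radius, penalty):
    """M2HD: H(A,B) = max(h(A,B), h(B,A)) with the doubly modified h."""
    return max(m2hd_directed(A, B, radius, penalty),
               m2hd_directed(B, A, radius, penalty))
```

Note how the penalty term behaves: a point of A with no neighbor of B within the radius contributes P to the average, so images with many unmatched points are pushed apart even when the matched points agree closely.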
The notion of similarity encoded by this modified Hausdorff distance is that each point of A be near some point of B and vice versa. It requires, however, that all matching pairs fall

within a given neighborhood of each other, consistent with our initial assumption that local image transformations may take place. If no matching pair can be found, the present model introduces a penalty mechanism to ensure that images with large overlap are easily distinguished as well. As a result, the proposed modified Hausdorff measure (M2HD) is ideal for applications such as face recognition, where, although overall shape similarity is maintained, the matching algorithm has to account for small, non-rigid local distortions.

2.4 Use Case Diagram

Fig. Use Case Diagram for how the User interacts with the System.

Fig. Use Case Diagram for the System's internal process (actors: Trainer, Tester; use cases: Face Detection, Robert Cross Edge Detection, Convert Image to Binary Image, Generate Line Edge Map, Save Image to Database, Find Image; Hausdorff distance variants: Directed, Modified, Doubly Modified HD).

CHAPTER 3 DESIGN

3.1 Class Relationship Diagram

Package: FaceRecognitionSystem

Package: FaceRecognitionSystem.MainWin

Package: FaceRecognitionSystem.GUI

Package: FaceRecognitionSystem.CreateDB

Package: FaceRecognitionSystem.StoreDB

Package: FaceRecognitionSystem.Support

Package: FaceRecognitionSystem.Binary

Package: FaceRecognitionSystem.FaceRegion

Package: FaceRecognitionSystem.EdgeDetector

Package: FaceRecognitionSystem.Thinning

Package: FaceRecognitionSystem.Dynamic2Strip

Package: FaceRecognitionSystem.HausdorffDistance

3.2 Class Diagrams

Class FaceRecognitionSystem.GUI.MainWin (extends System.Windows.Forms.Form):
This is the main GUI class, which opens a form in which the different user-defined user controls are placed. It uses the other GUI classes to show processing and results.

Class FaceRecognitionSystem.Support.SuccPredec:
Provides two important methods to find the successor and predecessor of any point in an image.

Class FaceRecognitionSystem.Support.RGBHSL:
Used for conversion between the RGB and HSL models. It can also set/modify brightness, saturation and hue.

Class FaceRecognitionSystem.Support.HSL:
Contained in the RGBHSL class and used as external support for RGB to HSL and vice versa conversions.

Class FaceRecognitionSystem.FaceRegion.FaceRegion:
Used to extract the main region of the face from the images.

Class FaceRecognitionSystem.StoreDB.DataMgmt:
Used to insert data (images) into the database.

Class FaceRecognitionSystem.CreateDB.CreateDB:
Used to create the database, and a table in that database, in SQL Server 7.

Interface FaceRecognitionSystem.EdgeDetector.Convolution:
An interface which provides a method for convolution.

Abstract Class FaceRecognitionSystem.EdgeDetector.ImgMatrix:
An abstract class which provides image to 2-D matrix and vice versa conversion.

Class FaceRecognitionSystem.EdgeDetector.EdgeMap:
Generates the Edge Map of an image.

Class FaceRecognitionSystem.EdgeDetector.Sobel:
An implementation of a Sobel Edge Detector.

Class FaceRecognitionSystem.EdgeDetector.RobertCross:
An implementation of a Robert Cross Edge Detector.

Class FaceRecognitionSystem.Binary.BinImage:
Converts an image to a binary image.

Class FaceRecognitionSystem.Thinning.BinMatrix:
Used for performing binary operations on an image matrix.

Class FaceRecognitionSystem.Thinning.HitAndMiss:
Used to perform the Hit and Miss process on a face image.

Class FaceRecognitionSystem.Thinning.SerialThinning:
Performs the thinning operation over a binary face image.

Class FaceRecognitionSystem.Dynamic2Strip.LRSG:
A left-right strip generator based on the Dynamic Two-Strip algorithm.

Class FaceRecognitionSystem.Dynamic2Strip.LocalMaximum:
Finds pixels with local maxima and eliminates the other pixels.

Class FaceRecognitionSystem.HausdorffDistance.HausdorffDistance:
Used to find the Hausdorff distance from an input image to each image stored in the database.

Class FaceRecognitionSystem.HausdorffDistance.HD:
An implementation of the Directed Hausdorff Distance algorithm.

Class FaceRecognitionSystem.HausdorffDistance.MHD:
An implementation of the Modified Hausdorff Distance algorithm.

Class FaceRecognitionSystem.HausdorffDistance.M2HD:
An implementation of the Doubly Modified Hausdorff Distance algorithm.

3.3 Sequence Diagram

Fig. Sequence Diagram for Creating Database

Fig. Sequence Diagram for Full Testing

Fig. Sequence Diagram for Step By Step Testing

Fig. Sequence Diagram for Full Training

Fig. Sequence Diagram for Step By Step Training

CHAPTER 4 IMPLEMENTATION AND TEST RESULTS

Class FaceRecognitionSystem.GUI.MainWin

public class MainWin : System.Windows.Forms.Form
{
    private System.Windows.Forms.MainMenu mainMenu1;
    private System.Windows.Forms.MenuItem menuFile;
    private System.Windows.Forms.MenuItem menuOpen;
    private System.Windows.Forms.MenuItem menuExit;
    private System.Windows.Forms.MenuItem menuItem1;
    private System.Windows.Forms.MenuItem menuOptions;
    private System.Windows.Forms.MenuItem menuTraining;
    private System.Windows.Forms.MenuItem menuTesting;
    private System.Windows.Forms.MenuItem menuSBSTraining;
    private System.Windows.Forms.MenuItem menuFTraining;
    private System.Windows.Forms.MenuItem menuSBSTesting;
    private System.Windows.Forms.MenuItem menuFTesting;
    private System.Windows.Forms.OpenFileDialog openFileDialog;
    private System.ComponentModel.Container components = null;
    private FaceRecognitionSystem.GUI.ShowProcessing showProcessing1;
    private FaceRecognitionSystem.GUI.SBSTraining sbsTraining1;
    private System.Windows.Forms.MenuItem menuDatabase;
    private System.Windows.Forms.MenuItem menuDBCreate;
    private System.Windows.Forms.MenuItem menuHelp;
    private System.Windows.Forms.MenuItem menuUse;
    private System.Windows.Forms.MenuItem menuAbtUS;
    private System.Windows.Forms.TextBox txtWelcome;
    private FaceRecognitionSystem.GUI.FullTrainning fullTrainning1;
    private FaceRecognitionSystem.GUI.SBSTesting sbsTesting1;
    private FaceRecognitionSystem.GUI.FullTesting fullTesting1;
    private FaceRecognitionSystem.GUI.ShowResult showResult1;
    private System.Windows.Forms.MenuItem menuItem4;
    public FaceRecognitionSystem.GUI.PBoxPanel pBoxPanel1;
    public System.Windows.Forms.Panel WelcomePanel;
    public FaceRecognitionSystem.GUI.ShowInformation showInformation1;

    public MainWin()
    {
        InitializeComponent();
    }

    protected override void Dispose(bool disposing);

    static void Main()
    {
        Application.Run(new MainWin());
    }
}

This is the class from which the Main() method is called. When we run the application, this form is loaded first, and the other user-defined user controls are placed on it.

Class FaceRecognitionSystem.Support.HSL and Class FaceRecognitionSystem.Support.RGBHSL

public class HSL
{
    double _h;
    double _s;
    double _l;
    double H;
    double S;
    double L;
    public HSL();
}

public class RGBHSL
{
    public RGBHSL();
    public static Color SetBrightness(double brightness);
    public static Color ModifyBrightness(Color c, double brightness);
    public static Color SetSaturation(Color c, double Saturation);
    public static Color ModifySaturation(Color c, double Saturation);
    public static Color SetHue(Color c, double Hue);
    public static Color ModifyHue(Color c, double Hue);
    public static Color HSL_to_RGB(HSL hsl);
    public static HSL RGB_to_HSL(Color c);
}

This class is used as external support. It converts RGB to HSL and vice versa, and can also set/modify brightness, saturation and hue.

Class FaceRecognitionSystem.Support.SuccPredec

public class SuccPredec
{
    public SuccPredec();
    public Point successor(Point x, Point p, int[][] Q);
    public Point predecessor(Point x, Point p, int[][] Q);
}

public Point successor(Point x, Point p, int[][] Q)
Encapsulation : public
Return type : Point
Method name : successor

Arguments :
x - reference point with respect to which the successor of the current point will be found in the image matrix.
p - current point.
Q - image matrix: a matrix of integers representing the image as the intensity value of each pixel.
This method returns the point which is the successor of p with respect to x in the image matrix Q.

public Point predecessor(Point x, Point p, int[][] Q)
Encapsulation : public
Return type : Point
Method name : predecessor
Arguments :
x - reference point with respect to which the predecessor of the current point will be found in the image matrix.
p - current point.
Q - image matrix: a matrix of integers representing the image as the intensity value of each pixel.
This method returns the point which is the predecessor of p with respect to x in the image matrix Q.

Class FaceRecognitionSystem.CreateDB.CreateDB

public class CreateDB
{
    string str;
    SqlConnection con;
    SqlCommand comm;
    public CreateDB();
}

str : variable of type string, used as a query string for the database.
con : variable of type SqlConnection, used to create a connection to the database.
comm : variable of type SqlCommand, used to create the command stored in string str, which will be executed on the database connected with connection con.

public CreateDB()
Encapsulation : public
Method type : constructor
Method name : CreateDB
Arguments : N/A
This method will create a database named "FaceDB" in SQL Server and then create a table named "FaceTab" in FaceDB, which is used for storing images in the database for identification.

Class FaceRecognitionSystem.StoreDB.DataMgmt

public class DataMgmt
{
    SqlConnection con;
    SqlDataAdapter adap;
    SqlCommandBuilder builder;
    DataSet dataset;
    string insert;
    public DataMgmt();
    public void insertion(Image OI, Image PI);
    public void distroy();
}

con : variable of type SqlConnection, used to create a connection to the database.
adap : variable of type SqlDataAdapter, used to create a data adapter.
builder : variable of type SqlCommandBuilder, used to build and execute commands.
dataset : variable of type DataSet, used to insert data into the database.
insert : variable of type string, used to store the query for inserting data into the database.

public DataMgmt()
Encapsulation : public
Method type : constructor
Method name : DataMgmt
Arguments : N/A
This method will create a connection to the database and connect the dataset to the table "FaceTab" in the database.

public void insertion(Image OI, Image PI)
Encapsulation : public
Return type : void
Method name : insertion
Arguments :
OI : argument of type Image, the original image on which processing is done.
PI : argument of type Image, the processed image.
This method will store OI and PI in the database.

public void distroy()
Encapsulation : public
Return type : void
Method name : distroy
Arguments : N/A
This method will close the connection and dispose of the builder and adapter objects.

Class FaceRecognitionSystem.FaceRegion.FaceRegion

public class FaceRegion
{
    public Image FaceReg(Image I2);
    private Image ScaleImage(Image image, int width, int height);
    private Image MainRegion(Image I, int[] cols, int[] rows);
}

public Image FaceReg(Image I2)
Encapsulation : public
Return type : Image
Method name : FaceReg
Arguments :
I2 - argument of type Image, the original image from which the main region will be extracted.
This function will find the region of the face in the image passed as an argument and return that portion as an image.

private Image ScaleImage(Image image, int width, int height)
Encapsulation : private
Return type : Image
Method name : ScaleImage
Arguments :
image : the original image, which is to be scaled.
width : integer value giving the width of the scaled image.
height : integer value giving the height of the scaled image.
This function will scale the image to the given width and height and return the scaled image.

private Image MainRegion(Image I, int[] cols, int[] rows)
Encapsulation : private
Return type : Image
Method name : MainRegion
Arguments :
I : original image.
cols : integer array containing the x positions of the left-top and right-bottom points.
rows : integer array containing the y positions of the left-top and right-bottom points.
This method will extract the region specified by the two arrays cols and rows from the image I, convert that region to an image and return it.

Interface FaceRecognitionSystem.EdgeDetector.Convolution

interface Convolution
{
    double[][] Convolve(double[][] X, int[][] Y);
}

double[][] Convolve(double[][] X, int[][] Y)
Encapsulation : public
Return type : double[][]
Method name : Convolve
Arguments :
X : 2-D array of double, which will be convolved.
Y : 2-D array of integer, by which X will be convolved.

Class FaceRecognitionSystem.EdgeDetector.ImgMatrix

public abstract class ImgMatrix
{
    protected abstract double[][] ImgToMat(Image I);
    protected abstract Image MatToImg(double[][] X);
}

protected abstract double[][] ImgToMat(Image I)
Encapsulation : protected
Return type : double[][]
Method name : ImgToMat
Arguments :
I - argument of type Image.
This method will convert the image I to a matrix of double values filled with the intensity value of each pixel, and return the matrix.

protected abstract Image MatToImg(double[][] X)
Encapsulation : protected
Return type : Image
Method name : MatToImg
Arguments :
X - 2-D array of double, an image matrix.
This method will convert a matrix of double values filled with the intensity value of each pixel to an image, and return the image.

Class FaceRecognitionSystem.EdgeDetector.EdgeMap

public class EdgeMap : ImgMatrix, Convolution
{
    private double[][] Gx;
    private double[][] Gy;
    public EdgeMap(int[][] X, int[][] Y, double[][] Z);
    public double[][] Magnitude();
    public double[][] Angle();
    public double[][] Convolve(double[][] X, int[][] Y);
    protected override double[][] ImgToMat(Image I);
    protected override Image MatToImg(double[][] X);
}

Gx : 2-D array of double, stores the result of convolution with the x-kernel.
Gy : 2-D array of double, stores the result of convolution with the y-kernel.

public EdgeMap(int[][] X, int[][] Y, double[][] Z)
Encapsulation : public
Method type : constructor
Method name : EdgeMap
Arguments :
X - 2-D array of integers, the x-kernel.
Y - 2-D array of integers, the y-kernel.
Z - 2-D array of doubles, the image matrix.
This method will convolve Z with X and with Y and store the results in Gx and Gy respectively.

public double[][] Magnitude()
Encapsulation : public
Return type : double[][]
Method name : Magnitude
Arguments : N/A
This method will compute the gradient magnitude from Gx and Gy and return the result as a 2-D array of double values.

public double[][] Angle()
Encapsulation : public
Return type : double[][]
Method name : Angle
Arguments : N/A
This method will compute the gradient angle from Gx and Gy and return the result as a 2-D array of double values.

public double[][] Convolve(double[][] X, int[][] Y)
Encapsulation : public
Return type : double[][]
Method name : Convolve
Arguments :
X : 2-D array of double, which will be convolved.
Y : 2-D array of integer, by which X will be convolved.
This method will perform the convolution of X with Y and return the result as a 2-D array of double values.

protected override double[][] ImgToMat(Image I)
Encapsulation : protected
Return type : double[][]
Method name : ImgToMat
Arguments :
I - Image to convert.

This method will convert the image I into a 2-D array of double values filled with the intensity value of each pixel.

protected override Image MatToImg(double[][] X)
Encapsulation : protected
Return type : Image
Method name : MatToImg
Arguments :
X - 2-D array of double, an image matrix.
This method will convert an image matrix containing the intensity value of each pixel into an image and return that image.

Class FaceRecognitionSystem.EdgeDetector.RobertCross

public class RobertCross : EdgeMap
{
    int[][] Gx;
    int[][] Gy;
    public RobertCross();
    public Image RCrossED(Image I);
}

Gx : 2-D array of integer, the x-kernel.
Gy : 2-D array of integer, the y-kernel.

public RobertCross()
Encapsulation : public
Method type : constructor
Method name : RobertCross
Arguments : N/A
This method will initialize the Gx and Gy kernels.

public Image RCrossED(Image I)
Encapsulation : public
Return type : Image
Method name : RCrossED
Arguments :
I - an Image to process.
This method will process the image I, extract its edges, and return the extracted edge map as an image.

Class FaceRecognitionSystem.EdgeDetector.Sobel

public class Sobel : EdgeMap
{
    int[][] Gx;
    int[][] Gy;
    public Sobel();
    public Image SobelED(Image I);
}

Gx : 2-D array of integer, the x-kernel.
Gy : 2-D array of integer, the y-kernel.

public Sobel()
Encapsulation : public
Method type : constructor
Method name : Sobel
Arguments : N/A
This method will initialize the Gx and Gy kernels.

public Image SobelED(Image I)
Encapsulation : public
Return type : Image
Method name : SobelED
Arguments :
I - an Image to process.
This method will process the image I, extract its edges, and return the extracted edge map as an image.

Class FaceRecognitionSystem.Binary.BinImage

public class BinImage : EdgeDetector.EdgeMap
{
    public BinImage();
    public Image BinaryImage(Image I);
}

public Image BinaryImage(Image I)
Encapsulation : public
Return type : Image
Method name : BinaryImage
Arguments :
I - Image to convert into binary.
This method will convert the image I to a binary image based on a predefined threshold.

Class FaceRecognitionSystem.Thinning.BinMatrix

public class BinMatrix : support.SuccPredec
{
    public BinMatrix()
    public int[][] NOT(int[][] X)
    public Image OR(Image X1, Image Y1)
    public int[][] AND(int[][] X, int[][] Y)
    protected double[][] ImgToMat(Image I)
    protected Image MatToImg(double[][] X)
    public int[][] ImgMat(Image I)
    public Image MatImg(int[][] X)
}

public int[][] NOT(int[][] X)
Encapsulation : public
Return type : int[][]
Method name : NOT
Arguments : X – 2-D array of integers, is an image-matrix to invert.
This method will invert image-matrix X and return the resulting 2-D array.

public Image OR(Image X1, Image Y1)
Encapsulation : public
Return type : Image
Method name : OR
Arguments : X1 – first Image
            Y1 – second Image
This method will OR two images, generate a single resulting image, and return it.

public int[][] AND(int[][] X, int[][] Y)
Encapsulation : public
Return type : int[][]
Method name : AND
Arguments : X – first image-matrix
            Y – second image-matrix
This method will AND two image-matrices, generate a single resulting image-matrix, and return it.

public int[][] ImgMat(Image I)
Encapsulation : public
Return type : int[][]
Method name : ImgMat
Arguments : I – Image to convert

This method will convert an image I into a 2-D array of integers, filled with the intensity value of each pixel.

public Image MatImg(int[][] X)
Encapsulation : public
Return type : Image
Method name : MatImg
Arguments : X – 2-D array of integer, is an image-matrix
This method will convert an image-matrix, which contains the intensity value of each pixel, into an image and return that image.

Class FaceRecognitionSystem.Thinning.HitAndMiss

public class HitAndMiss
{
    public int[][] HitNMiss(int[][] I, int[][] SE)
}

public int[][] HitNMiss(int[][] I, int[][] SE)
Encapsulation : public
Return type : int[][]
Method name : HitNMiss
Arguments : I – 2-D array of integers, is an image-matrix.
            SE – 2-D array of integers, is a structuring element.
This method will process image I with structuring element SE, convert the resulting image into a 2-D array of integers, and return it.

Class FaceRecognitionSystem.Thinning.SerialThinning

public class SerialThinning : BinMatrix
{
    private Point First;
    private Point Prev;
    private int[][] Q;
    public SerialThinning()
    public SerialThinning(Image I)
    private void Deletion(Point p)
    public Image Thinning()
    private int B(Point p)
    private int A(Point p)
}

First : starting point of any loop.
Prev : previous point of the current point being traversed.

Q : 2-D array of integers, is an image-matrix.

public SerialThinning(Image I)
Encapsulation : public
Method type : constructor
Method name : SerialThinning
Arguments : I – image to thin.
This method will perform initialization of the different variables.

private void Deletion(Point p)
Encapsulation : private
Return type : void
Method name : Deletion
Arguments : p – point to delete
This method will delete point p from image-matrix Q.

public Image Thinning()
Encapsulation : public
Return type : Image
Method name : Thinning
Arguments : N/A
This method will perform the thinning operation over image-matrix Q and return the thinned image as a result.

private int B(Point p)
Encapsulation : private
Return type : int
Method name : B
Arguments : p – current point
This method will return the total number of pixels with value 1 in the neighborhood of p.

private int A(Point p)
Encapsulation : private
Return type : int
Method name : A
Arguments : p – current point
This method will return the number of 1-to-0 transitions in the neighborhood of p.
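The two neighborhood counts A(p) and B(p) drive the deletion decision in serial thinning. A compact sketch of both, assuming an 8-connected neighborhood visited in clockwise order on a 0/1 matrix Q (illustrative Python, not the project's C#):

```python
# Illustrative A(p) and B(p) counts for serial thinning on a binary matrix Q.
# Neighbors of p are visited clockwise, starting from the pixel above p.

OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
           (1, 0), (1, -1), (0, -1), (-1, -1)]

def neighbors(Q, y, x):
    """The 8 neighbors of pixel (y, x), in clockwise order."""
    return [Q[y + dy][x + dx] for dy, dx in OFFSETS]

def B(Q, y, x):
    """Total number of foreground (value 1) pixels around (y, x)."""
    return sum(neighbors(Q, y, x))

def A(Q, y, x):
    """Number of 1 -> 0 transitions in the circular neighbor sequence."""
    n = neighbors(Q, y, x)
    return sum(1 for i in range(8) if n[i] == 1 and n[(i + 1) % 8] == 0)
```

A point is a candidate for deletion when these counts show it lies on a removable contour rather than at an end point or a connectivity-critical junction; the exact deletion conditions are those of the report's serial algorithm.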

Class FaceRecognitionSystem.Dynamic2Strip.LocalMaximum

public class LocalMaximum : LRSG
{
    private int[][] Q1, Q;
    private float[][] fi;
    public LocalMaximum(Image I, ref PictureBox pb)
    public void LMax()
    public void NLMElim()
    private Point[] strip(Point p)
}

Q1 : 2-D array of type integer, an image-matrix, used to store the original input image.
Q : 2-D array of type integer, an image-matrix, used for processing.
fi : 2-D array of type float, used to store the calculated value of each pixel.

public LocalMaximum(Image I, ref PictureBox pb)
Encapsulation : public
Method type : constructor
Method name : LocalMaximum
Arguments : I – an input image.
            pb – reference variable to a PictureBox.
This method will initialize Q, Q1, and fi, and then call the other methods to process the input image. Finally, it places the processed image into the picture box whose reference is passed as an argument.

public void LMax()
Encapsulation : public
Return type : void
Method name : LMax
Arguments : N/A
This method will calculate the local maximum of each dark pixel of the image and store the result in matrix fi.

public void NLMElim()
Encapsulation : public
Return type : void
Method name : NLMElim
Arguments : N/A
This method will eliminate all pixels which are not local maxima.

private Point[] strip(Point p)
Encapsulation : private
Return type : Point[]
Method name : strip

Arguments : p – point for which strips are to be found.
This method will find the strips on both sides of a pixel p, and return an array containing all points that lie within the rectangle generated from the strips.

Class FaceRecognitionSystem.Dynamic2Strip.LRSG

public class LRSG : support.SuccPredec
{
    private double LLength, LWidth;
    private double RLength, RWidth;
    private double angle;
    private int[][] Q;
    private Point Lpt1, Lpt2, Lpt3, Lpt4;
    private Point Rpt1, Rpt2, Rpt3, Rpt4;
    private Point start;
    private float lm, rm, m1, m2;
    public LRSG()
    public float f(Point p, int[][] Q1)
    private double S()
    private double E(Point p, char d)
    private void strips(Point p, char d)
    private double length(char d)
    public float line_slop(Point p, Point z)
    public bool check(Point pt1, Point pt2, Point t, Point st, float m1, float m2)
    public bool same_side(Point x, float m, Point p, Point t)
    public int line_side(Point x, float m, Point p)
    public void points12(Point p, Point z, char d)
    public void pt12(Point p, Point z, ref Point pt1, ref Point pt2)
    public void points34(Point p, char d)
    public Point intersectln(Point p, Point pt)
}

LLength : length of the left-side strip.
LWidth : distance between the left-side strips.
RLength : length of the right-side strip.
RWidth : distance between the right-side strips.
angle : angle subtended between the left and right strips.
Q : an image-matrix.
Lpt1, Lpt2, Lpt3, Lpt4 : end points of the two left-side strips.
Rpt1, Rpt2, Rpt3, Rpt4 : end points of the two right-side strips.
start : starting point of the loop.
lm : slope of the left strip.
rm : slope of the right strip.

public LRSG()
Encapsulation : public
Method type : constructor

Method name : LRSG
Arguments : N/A
This method will initialize the variables.

public float f(Point p, int[][] Q1)
Encapsulation : public
Return type : float
Method name : f
Arguments : p – point for which to calculate f.
            Q1 – an image-matrix
This method will calculate f for point p.

private double S()
Encapsulation : private
Return type : double
Method name : S
Arguments : N/A
This method will find S for a point.

private double E(Point p, char d)
Encapsulation : private
Return type : double
Method name : E
Arguments : p – point for which to calculate elongatedness.
            d – character indicating the direction (left or right).
This method will calculate the elongatedness of a point p in direction d.

private void strips(Point p, char d)
Encapsulation : private
Return type : void
Method name : strips
Arguments : p – point around which strips are to be found.
            d – character indicating the direction.
This method will find the strips and store their end points in the global variables Lpt1, Lpt2, Lpt3, Lpt4 and Rpt1, Rpt2, Rpt3, Rpt4, according to the direction passed.

private double length(char d)
Encapsulation : private
Return type : double
Method name : length
Arguments : d – character indicating the direction.

This method will find the length of a strip based on the direction passed.

public float line_slop(Point p, Point z)
Encapsulation : public
Return type : float
Method name : line_slop
Arguments : p – first point.
            z – second point.
This method will find the slope of the line passing through p and z and return the slope found.

public bool check(Point pt1, Point pt2, Point t, Point st, float m1, float m2)
Encapsulation : public
Return type : bool
Method name : check
Arguments : pt1 – point on the first left-side strip.
            pt2 – point on the second left-side strip.
            t – point to be checked.
            st – point for which the strips are generated.
            m1 – slope of the strips on the left side.
            m2 – slope of the line perpendicular to the line with slope m1.
This method will check whether point t lies between the two lines passing through pt1 and pt2 with slope m1, and return true if the point lies between the two lines, else false.

public bool same_side(Point x, float m, Point p, Point t)
Encapsulation : public
Return type : bool
Method name : same_side
Arguments : x – point on a line with slope m.
            m – slope of the line.
            p, t – points to be checked.
This method will check whether points p and t are on the same side of the line passing through x with slope m.

public int line_side(Point x, float m, Point p)
Encapsulation : public
Return type : int
Method name : line_side
Arguments : x – point on a line with slope m.
            m – slope of the line.
            p – point to be checked.

This method will calculate on which side of the line passing through x with slope m the point p lies, and return the calculated value.

public void points12(Point p, Point z, char d)
Encapsulation : public
Return type : void
Method name : points12
Arguments : p, z – points on a line.
            d – character indicating the direction.
This method will find two points, one on each side of p, which will lie on the two strips of p.

public void points34(Point p, char d)
Encapsulation : public
Return type : void
Method name : points34
Arguments : p – last point in the region.
            d – character indicating the direction.
This method will find the two last points of the strips.

public Point intersectln(Point p, Point pt)
Encapsulation : public
Return type : Point
Method name : intersectln
Arguments : p, pt – two points on different lines.
This method will find the intersection of the two lines passing through p and pt with slopes m1 and m2 respectively.

Class FaceRecognitionSystem.HausdorffDistance.HausdorffDistance

public class HausdorffDistance : Thinning.BinMatrix
{
    private Image I;
    private float[] dist;
    private int P;
    private int N;
    private Image[] ImgArr;
    int ReqImg;
    string svr;
    public HausdorffDistance(string svraddr)
    public Image[] HausdorffDist(Image I1, int choice, int no)
    public Image[] HausdorffDist(Image I1, int P1, int N1, int no)
    public void distance(int choice)
    public int[] sort()

    public float max(float A, float B)
    public float h(int[][] A, int[][] B, int choice)
}

I : Image passed for searching.
dist : array of float values, stores the distance calculated against each image in the database.
P : penalty.
N : radius of the neighborhood.
ImgArr : array of images, stores the best matches in descending order.
ReqImg : integer, number of requested images.
svr : string, stores the server name/address.

public HausdorffDistance(string svraddr)
Encapsulation : public
Method type : constructor
Method name : HausdorffDistance
Arguments : svraddr – string, used to pass the server address.
This method will initialize, through svr, the server address that will be used in processing.

public Image[] HausdorffDist(Image I1, int choice, int no)
Encapsulation : public
Return type : Image[]
Method name : HausdorffDist
Arguments : I1 – Image, passed for searching.
            choice – integer, identifies which algorithm to use for searching.
            no – integer, number of best-match images to return.
This method will find the distance between the image passed as an argument and the images in the database, and return the requested number of images in an array of images.

public Image[] HausdorffDist(Image I1, int P1, int N1, int no)
Encapsulation : public
Return type : Image[]
Method name : HausdorffDist
Arguments : I1 – Image, passed for searching.
            P1 – integer, penalty value passed for the calculation.
            N1 – integer, radius of the neighborhood.
            no – integer, number of best-match images to return.
This method will find the distance between the image passed as an argument and the images in the database, and return the requested number of images in an array of images.

public void distance(int choice)
Encapsulation : public
Return type : void

Method name : distance
Arguments : choice – integer, identifies the algorithm to use.
This method will find the distance between the search image and each image in the database using the selected algorithm, and store the results in dist.

public int[] sort()
Encapsulation : public
Return type : int[]
Method name : sort
Arguments : N/A
This method will sort the dist array and return the sorted array of indexes.

public float max(float A, float B)
Encapsulation : public
Return type : float
Method name : max
Arguments : A, B – two float values from which to find the maximum value.
This method will return the maximum of A and B.

public float h(int[][] A, int[][] B, int choice)
Encapsulation : public
Return type : float
Method name : h
Arguments : A – 2-D array of integers, is the image-matrix to search.
            B – 2-D array of integers, is an image-matrix retrieved from the database to compare.
            choice – integer, identifies the comparison algorithm.
This method will find the distance from A to B.

Class FaceRecognitionSystem.HausdorffDistance.HD

public class HD
{
    public float h(int[][] A, int[][] B)
    public float max(float[] a)
    public float[] min(int[][] A, int[][] B)
    public float minimum(Point p, int[][] B)
}

public float h(int[][] A, int[][] B)
Encapsulation : public
Return type : float
Method name : h
Arguments : A – 2-D array of integers, is the image-matrix to search.
            B – 2-D array of integers, is an image-matrix retrieved from the database to compare.

This method will find the distance from A to B.

public float max(float[] a)
Encapsulation : public
Return type : float
Method name : max
Arguments : a – an array of float values from which to find the maximum value.
This method will return the maximum value from the float array a.

public float[] min(int[][] A, int[][] B)
Encapsulation : public
Return type : float[]
Method name : min
Arguments : A – image-matrix which is to be searched.
            B – image-matrix retrieved from the database.
This method will find the minimum distance of each point in A to the points in B and return the calculated distances as an array of float values.

public float minimum(Point p, int[][] B)
Encapsulation : public
Return type : float
Method name : minimum
Arguments : p – point in the image.
            B – image-matrix retrieved from the database.
This method will find the minimum distance from p to the points in B and return the calculated distance as a float value.

Class FaceRecognitionSystem.HausdorffDistance.MHD

public class MHD : HD
{
    public new float h(int[][] A, int[][] B)
    public float avg(float[] m)
}

public new float h(int[][] A, int[][] B)
Encapsulation : public
Return type : float
Method name : h
Arguments : A – 2-D array of integers, is the image-matrix to search.
            B – 2-D array of integers, is an image-matrix retrieved from the database to compare.

This method will find the distance from A to B.

public float avg(float[] m)
Encapsulation : public
Return type : float
Method name : avg
Arguments : m – an array of float values.
This method will return the average value of all elements of array m.

Class FaceRecognitionSystem.HausdorffDistance.M2HD

public class M2HD : MHD
{
    int P;
    int N;
    public M2HD()
    public M2HD(int P1, int N1)
    public new float h(int[][] A, int[][] B)
    public float[] ds(int[][] A, int[][] B)
    public float d(Point p, int[][] B)
    public float max(float a, float b)
    public float min(Point p, int[][] B)
}

P : integer, penalty.
N : integer, radius of the neighborhood.

public M2HD(int P1, int N1)
Encapsulation : public
Method type : constructor
Method name : M2HD
Arguments : P1 – integer, penalty passed for the calculation.
            N1 – integer, radius of the neighborhood.
This method will initialize the local parameters.

public new float h(int[][] A, int[][] B)
Encapsulation : public
Return type : float
Method name : h
Arguments : A – 2-D array of integers, is the image-matrix to search.
            B – 2-D array of integers, is an image-matrix retrieved from the database to compare.
This method will find the distance from A to B.

public float[] ds(int[][] A, int[][] B)
Encapsulation : public
Return type : float[]
Method name : ds
Arguments : A – image-matrix which is to be searched.
            B – image-matrix retrieved from the database.
This method will return the values calculated for all points in A against the points in B, based on the penalty.

public float d(Point p, int[][] B)
Encapsulation : public
Return type : float
Method name : d
Arguments : p – point in the image.
            B – image-matrix retrieved from the database.
This method will find the distance of point p to the points in B and return the calculated distance, based on the penalty.

public float max(float a, float b)
Encapsulation : public
Return type : float
Method name : max
Arguments : a, b – two float values from which to find the maximum value.
This method will return the maximum of a and b.

public float min(Point p, int[][] B)
Encapsulation : public
Return type : float
Method name : min
Arguments : p – point in the image.
            B – image-matrix retrieved from the database.
This method will find the minimum distance from p to the points in B and return the calculated distance as a float value.
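The three distance classes (HD, MHD, M2HD) differ only in how the per-point nearest-neighbor distances are aggregated. The following sketch captures that relationship, assuming Euclidean point distance and a simple "replace out-of-range distances by the penalty P" reading of M2HD's role for P and N (the exact formula from the cited papers may differ); illustrative Python, not the project's C#:

```python
# Illustrative directed Hausdorff distance family over point sets A and B.
# Assumes Euclidean point distance; P/N handling in m2hd is an assumption.

def point_dist(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def min_dists(A, B):
    """For each point a in A, the distance to its nearest point in B."""
    return [min(point_dist(a, b) for b in B) for a in A]

def h_hd(A, B):
    """Classical directed Hausdorff distance: the maximum of the minima."""
    return max(min_dists(A, B))

def h_mhd(A, B):
    """Modified Hausdorff distance (MHD): average instead of maximum."""
    ds = min_dists(A, B)
    return sum(ds) / len(ds)

def h_m2hd(A, B, P, N):
    """Doubly modified (M2HD): a point with no match within radius N
    contributes the penalty P instead of its raw nearest distance."""
    ds = [d if d <= N else P for d in min_dists(A, B)]
    return sum(ds) / len(ds)
```

Averaging (MHD) makes the measure robust to a few outlier edge points that would dominate the classical maximum, and the penalty/neighborhood pair in M2HD further bounds the influence of unmatched points — which is why the search interface exposes P and N as tunable parameters.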

CONCLUSION

REFERENCES
[1] Y. Gao and M.K.H. Leung, "Face Recognition Using Line Edge Map," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 6, June 2002.
[2] S. Gupta and K. Parmar, "A Combined Approach of Serial and Parallel Thinning Algorithm for Binary Face Image," Computing-2005, Division IV, CSI Conference, May 2005.
[3] Y. Gao, "Efficiently Comparing Face Images Using a Modified Hausdorff Distance," IEE Proc. Vision, Image and Signal Processing, vol. 150, no. 6, Dec. 2003.
[4] M.K.H. Leung and Y.H. Yang, "Dynamic Two-Strip Algorithm in Curve Fitting," Pattern Recognition, vol. 23, pp. 69-79, 1990.
[5] D.S. Bolme, "Elastic Bunch Graph Matching," Colorado State University, Fort Collins, Colorado, Summer 2003.
[6] L. Wiskott, "The Role of Topographical Constraints in Face Recognition," Pattern Recognition Letters, vol. 20, no. 1, pp. 89-96, 1999.
[7] D.L. Swets and J. Weng, "Using Discriminant Eigenfeatures for Image Retrieval," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, Aug. 1996.
[8] C. Kotropoulos and I. Pitas, "Rule-Based Face Detection in Frontal Views," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP-97), vol. 4, pp. 2537-2540, Apr. 1997.
[9] P.T. Jackway and M. Deriche, "Scale-Space Properties of the Multiscale Morphological Dilation-Erosion," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, pp. 38-51, Jan. 1996.
[10] L. Sirovich and M. Kirby, "Low-Dimensional Procedure for the Characterisation of Human Faces," J. Optical Soc. of America, vol. 4, pp. 519-524, 1987.
[11] M. Kirby and L. Sirovich, "Application of the Karhunen-Loève Procedure for the Characterisation of Human Faces," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, pp. 831-835, Dec. 1990.
[12] M. Turk and A. Pentland, "Eigenfaces for Recognition," J. Cognitive Neuroscience, vol. 3, pp. 71-86, 1991.
[13] M.A. Grudin, "A Compact Multi-Level Model for the Recognition of Facial Images," PhD thesis, Liverpool John Moores Univ., 1997.
[14] L. Zhao and Y.H. Yang, "Theoretical Analysis of Illumination in PCA-Based Vision Systems," Pattern Recognition, vol. 32, pp. 547-564, 1999.
[15] A. Pentland, B. Moghaddam, and T. Starner, "View-Based and Modular Eigenspaces for Face Recognition," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 84-91, 1994.
[16] T.J. Stonham, "Practical Face Recognition and Verification with WISARD," Aspects of Face Processing, pp. 426-441, 1984.
[17] K.K. Sung and T. Poggio, "Learning Human Face Detection in Cluttered Scenes," Computer Analysis of Images and Patterns, pp. 432-439, 1995.
[18] S. Lawrence, C.L. Giles, A.C. Tsoi, and A.D. Back, "Face Recognition: A Convolutional Neural-Network Approach," IEEE Trans. Neural Networks, vol. 8, pp. 98-113, 1997.
[19] J. Weng, J.S. Huang, and N. Ahuja, "Learning Recognition and Segmentation of 3D Objects from 2D Images," Proc. IEEE Int'l Conf. Computer Vision, pp. 121-128, 1993.
[20] S.H. Lin, S.Y. Kung, and L.J. Lin, "Face Recognition/Detection by Probabilistic Decision-Based Neural Network," IEEE Trans. Neural Networks, vol. 8, pp. 114-132, 1997.
[21] S.Y. Kung and J.S. Taur, "Decision-Based Neural Networks with Signal/Image Classification Applications," IEEE Trans. Neural Networks, vol. 6, pp. 170-181, 1995.
[22] F. Samaria and F. Fallside, "Face Identification and Feature Extraction Using Hidden Markov Models," Image Processing: Theory and Application, G. Vernazza, ed., Elsevier, 1993.
[23] F. Samaria and A.C. Harter, "Parameterisation of a Stochastic Model for Human Face Identification," Proc. Second IEEE Workshop Applications of Computer Vision, 1994.
[24] S. Tamura, H. Kawa, and H. Mitsumoto, "Male/Female Identification from 8×6 Very Low Resolution Face Images by Neural Network," Pattern Recognition, vol. 29, pp. 331-335, 1996.
[25] Y. Kaya and K. Kobayashi, "A Basic Study on Human Face Recognition," Frontiers of Pattern Recognition, S. Watanabe, ed., p. 265, 1972.
[26] T. Kanade, "Picture Processing by Computer Complex and Recognition of Human Faces," technical report, Dept. Information Science, Kyoto Univ., 1973.
[27] A.J. Goldstein, L.D. Harmon, and A.B. Lesk, "Identification of Human Faces," Proc. IEEE, vol. 59, p. 748, 1971.
[28] R. Brunelli and T. Poggio, "Face Recognition: Features versus Templates," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, pp. 1042-1052, 1993.
[29] I.J. Cox, J. Ghosn, and P.N. Yianilos, "Feature-Based Face Recognition Using Mixture-Distance," Computer Vision and Pattern Recognition, 1996.
[30] B.S. Manjunath, R. Chellappa, and C. von der Malsburg, "A Feature Based Approach to Face Recognition," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 373-378, 1992.
[31] B. Takács, "Comparing Face Images Using the Modified Hausdorff Distance," Pattern Recognition, vol. 31, pp. 1873-1881, 1998.
[32] C. Kotropoulos, A. Tefas, and I. Pitas, "Frontal Face Authentication Using Morphological Elastic Graph Matching," IEEE Trans. Image Processing, vol. 9, no. 4, pp. 555-560, Apr. 2000.
[33] P.J.M. van Laarhoven and E.H.L. Aarts, Simulated Annealing: Theory and Applications, Kluwer Academic Publishers, 1987.
[34] R.H.J.M. Otten and L.P.P.P. van Ginneken, The Annealing Algorithm, Kluwer Academic Publishers, 1989.
[35] O. de Vel and S. Aeberhard, "Line-Based Face Recognition under Varying Pose," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, Oct. 1999.
[36] H.I. Kim, S.H. Lee, and N.I. Cho, "Rotation-Invariant Face Detection Using Angular Projections," Electronics Letters, vol. 40, no. 12, June 2004.
[37] F.Y. Shih and W.-T. Wong, "A New Safe-Point Thinning Algorithm Based on the Mid-Crack Code Tracing," IEEE Trans. Systems, Man, and Cybernetics, vol. 25, no. 2, pp. 370-377, Feb. 1995.
[38] N.H. Han, C.W. La, and P.K. Rhee, "An Efficient Fully Parallel Thinning Algorithm," IEEE, 1997.
[39] E.L. Flores, "A Fast Thinning Algorithm," IEEE, 1998.
[40] M.K. Leung and Y. Yang, "A Region Based Approach for Human Body Motion Analysis," Pattern Recognition, vol. 20, pp. 321-339, 1987.
[41] D.P. Huttenlocher, G.A. Klanderman, and W.J. Rucklidge, "Comparing Images Using the Hausdorff Distance," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 9, pp. 850-863, 1993.

[30]B.S. Manjunath, R. Chellappa, and C. von der Malsburg, “A Feature Based Approach to Face Recognition,” Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 373-378, 1992. [31]B. TakaÂcs, “Comparing Face Images Using the Modified Hausdorff Distance,” Pattern Recognition, vol. 31, pp. 1873-1881, 1998. [32]C. Kotropoulos, A. Tefas, and I. Pitas, “Frontal Face Authentication Using Morphological Elastic Graph Matching,” IEEE Trans. Image Processing, vol. 4, no. 9, pp. 555-560, Apr. 2000. [33]P.J.M. van Laarhoven and E.H.L. Aarts, Simulated Annealing: Theory and Applications. Kluwer Academic Publishers, 1987. [34]R.H.J.M. Otten and L.P.P.P. van Ginneken, The Annealing Algorithm. Kluwer Academic Publishers, 1989. [35]Olivier de Vel and Stefan Aeberhard, “Line-Based Face Recognition under Varying Pose,” IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol. 21, No. 10, October 1999. [36]H.I. Kim, S.H. Lee and N.I. Cho, “Rotation-invariant face detection using angular projections,” ELECTRONICS LETTERS 10th June 2004 Vol. 40 No. 12 [37]Frank Y. Shih and Wai-Tak Wong, “A New Safe-Point Thinning Algorithm Based on the Mid-Crack Code Tracing”, IEEE Transactions on Systems. Man, and Cybernetics, vol. 25, no. 2, pp. 370-377, Feb. 1995. [38]N. H. Han, C. W. La, and P. K. Rhee, “An Efficient Fully Parallel Thinning Algorithm”, IEEE,1997. [39]Edna Lucia Flores, “A Fast Thinning Algorithm”, IEEE, 1998. [40]Leung M.K., Yang Y. “A region based approach for human body motion analysis”, Pattern Recognition 20:321-339; 1987. [41]D.P. Huttenlocher, G.A. Klanderman and W.J.Rucklidge “Comparing Images Using the Hausdorff Distance”, IEEE Trans., Pattern Analysis and Machine Intelligence 15(9), 850-863 (1993).