
CHAPTER 1

INTRODUCTION

Face recognition by humans is a high-level visual task for which it has

been extremely difficult to construct detailed neurophysiological and psychophysical models.

This is because faces are complex natural stimuli that differ dramatically from the artificially

constructed data often used in both human and computer vision research. Thus, developing a

computational approach to face recognition can prove to be very difficult indeed. In fact,

despite the many relatively successful attempts to implement computer-based face recognition

systems, we have yet to see one which combines speed, accuracy, and robustness to face

variations caused by 3D pose, facial expressions, and aging. The primary difficulty in

analyzing and recognizing human faces arises because variations in a single face can be very

large, while variations between different faces are quite small. That is, there is an inherent

structure to a human face, but that structure exhibits large variations due to the presence of a

multitude of muscles in a particular face. Given that recognizing faces is critical for humans

in their everyday activities, automating this process would be very useful in a wide range of

applications including security, surveillance, criminal identification, and video compression.

This paper discusses a new computational approach to face recognition that, when combined

with proper face localization techniques, has proved to be very efficacious. This section

begins with a survey of the face recognition research performed to date. The proposed

approach is then presented along with its objectives and

the motivations for choosing it. The section concludes with an overview of the structure of

the paper.

Face Recognition

A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image. It does that by comparing selected facial features in the live

image and a facial database. It is typically used for security systems and can be compared to

other biometrics such as fingerprint or eye iris recognition systems. The great advantage of

a facial recognition system is that it does not require aid from the test subject. Properly


designed systems installed in airports, multiplexes and other public places can detect the

presence of criminals among the crowd.

The development and implementation of face recognition systems depends entirely on the

development of computers, since without computers the efficient use of the algorithms is

impossible. So the history of face recognition goes side by side with the history of computers.

Research in automatic face recognition dates back at least to the 1960s. Bledsoe, in 1966,

was the first to attempt semi-automated face recognition with a hybrid human-computer

system that classified faces on the basis of fiducial marks entered on photographs by hand.

Parameters for the classification were normalized distances and ratios among points such as

eye corners, mouth corners, nose tip and chin point. Later work at Bell

Laboratories (Goldstein, Harmon, and Lesk, 1971; Harmon, 1971) developed a vector of up to

21 features and recognized faces using standard pattern classification techniques. The chosen

features were largely subjective evaluations (e.g., shade of hair, length of ears, lip thickness)

made by human subjects, each of which would be difficult to automate.

An early paper by Fischler and Elschlager (1973) attempted to measure similar features

automatically. They described a linear embedding algorithm that used local feature template

matching and a global measure of fit to find and measure facial features. This template

matching approach has been continued and improved by recent work of Yuille, Cohen and

Hallinan (1989). Their strategy is based on "deformable templates", which are parameterized

models of the face and its features in which the parameter values are determined by

interaction with the image. The connectionist approach to face identification seeks to capture the

configurational, or gestalt-like, nature of the task. Kohonen (1989) and Kohonen and Lehtiö

(1981) describe an associative network with a simple learning algorithm that can recognize

(classify) face images and recall a face image from an incomplete or noisy version input to

the network. Fleming and Cottrell (1990) extend these ideas using nonlinear units, training

the system by backpropagation. Stonham's WISARD system (1986) is a general pattern

recognition device based on neural net principles. It has been applied with some success to

binary face images, recognizing both identity and expression. Most connectionist systems

dealing with faces treat the input image as a general 2-D pattern, and can make no explicit

use of the configurational properties of the face. Moreover, some of these systems require an

inordinate number of training examples to achieve a reasonable level of performance. Kirby


and Sirovich were among the first to apply principal component analysis (PCA) to face

images and showed that PCA is an optimal compression scheme that minimizes the mean

squared error between the original images and their reconstructions for any given level of

compression. Turk and Pentland popularized the use of PCA for face recognition. They used

PCA to compute a set of subspace basis vectors (which they called "eigenfaces") for a

database of face images and projected the images in the database into the compressed

subspace. New test images were then matched to images in the database by projecting them

onto the basis vectors and finding the nearest compressed image in the subspace (eigenspace).

The initial success of eigenfaces popularized the idea of matching images in compressed

subspaces. Researchers began to search for other subspaces that might improve performance.

One alternative is Fisher's Linear Discriminant Analysis (LDA, a.k.a. "fisherfaces"). For any

N-class classification problem, the goal of LDA is to find the N-1 basis vectors that maximize

the interclass distances while minimizing the intraclass distances.
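As a concrete illustration, the N-1 discriminant directions can be obtained from the generalized eigenvectors of the between-class and within-class scatter matrices. The following NumPy sketch is illustrative only (the function name and the synthetic two-class data are assumptions, not from the paper):

```python
import numpy as np

def lda_basis(X, y):
    """Fisher LDA: return the C-1 basis vectors (C = number of classes)
    that maximize between-class scatter relative to within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem Sb w = lambda Sw w (Sw lightly regularized)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(evals.real)[::-1]
    return evecs[:, order[:len(classes) - 1]].real

# Two well-separated 2-D classes -> a single discriminant direction
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.1, (20, 2)),
               rng.normal([5, 0], 0.1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
W = lda_basis(X, y)
```

With the classes separated along the x-axis, the single discriminant direction found is essentially the x-axis itself.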

Acquisition

This is the entry point of the face recognition process. It is the module where the face image

under consideration is presented to the system. In other words, the user is asked to present a

face image to the face recognition system in this module. An acquisition module can request

a face image from several different environments: The face image can be an image file that is

located on a magnetic disk; it can be captured by a frame grabber and camera; or it can be

scanned from paper with the help of a scanner.

Pre-Processing

In this module, by means of early vision techniques, face images are normalized and, if

desired, they are enhanced to improve the recognition performance of the system. Some or all

of the pre-processing steps may be implemented in a face recognition system.

Feature Extraction

After performing some pre-processing (if necessary), the normalized face image is presented

to the feature extraction module in order to find the key features that are going to be used for

classification. In other words, this module is responsible for composing a feature vector that represents the face image well.


Classification

In this module, with the help of a pattern classifier, the extracted features of the face image are compared with the ones stored in a face library (or face database). After this comparison, the face image is classified as either known or unknown.

Eigenfaces Approach

The aim is to build a model that best describes a face, by extracting the most relevant information contained in that face. The eigenfaces approach is a principal component analysis method, in which a

small set of characteristic pictures are used to describe the variation between face images.

The goal is to find the eigenvectors (eigenfaces) of the covariance matrix of the

distribution, spanned by a training set of face images. Later, every face image is

represented by a linear combination of these eigenvectors.

Computing these eigenvectors directly is quite difficult for typical image sizes, but an

approximation that is suitable for practical purposes is also presented. Recognition is

performed by projecting a new image into the subspace spanned by the Eigenfaces and

then classifying the face by comparing its position in face space with the positions of

known individuals.
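The eigenfaces computation described above can be sketched in NumPy as follows. This is an illustrative implementation, not the paper's code: tiny random "faces" stand in for a real training set, and it uses the small-matrix approximation attributed to Turk and Pentland (computing eigenvectors of the small M × M matrix instead of the huge pixel-space covariance):

```python
import numpy as np

def eigenfaces(images, k):
    """Compute k eigenfaces from a stack of M training face images.

    images: (M, H, W) array. Works on the small M x M matrix A A^T
    rather than the (H*W) x (H*W) covariance matrix A^T A.
    """
    M = images.shape[0]
    A = images.reshape(M, -1).astype(float)  # rows = flattened faces
    mean_face = A.mean(axis=0)
    A = A - mean_face                        # subtract the average face
    evals, V = np.linalg.eigh(A @ A.T)       # small M x M eigenproblem
    order = np.argsort(evals)[::-1][:k]
    U = A.T @ V[:, order]                    # map back to pixel space
    U /= np.linalg.norm(U, axis=0)           # columns = unit eigenfaces
    return mean_face, U

def project(face, mean_face, U):
    """Project a face image onto the eigenface subspace (its 'face space'
    coordinates, used for matching against known individuals)."""
    return U.T @ (face.reshape(-1).astype(float) - mean_face)

# Tiny synthetic "faces": ten 8x8 random patterns
rng = np.random.default_rng(1)
train = rng.random((10, 8, 8))
mean_face, U = eigenfaces(train, k=4)
w = project(train[0], mean_face, U)          # 4-element face-space vector
```

Recognition then reduces to comparing `w` against the stored projections of known individuals, as the text describes.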

This approach is attractive due to its simplicity, speed, and learning capability. Experimental results are given to demonstrate the viability of the proposed face recognition method.

A signal is an information-carrying function of time. Real-time signals can be audio (voice) or video (image) signals. A still picture is called an image; a moving sequence of images is called a video. The difference between digital image processing (DIP) and classical signals and systems is that there is no time axis in DIP: the x and y coordinates in DIP are spatial coordinates. There is no time axis because a photograph does not change with time.

What is image?

Image : An image is defined as a two-dimensional function f(x, y), where x and y are spatial coordinates and the amplitude f at any point (x, y) is known as the intensity of the image at that point.


What is a pixel?

Pixel : A pixel (short for picture element) is a single point in a graphic image. Each such information element is not really a dot, nor a square, but an abstract sample. In a binary image matrix, each element is a pixel with dark = 0 and light = 1. A pixel with only 1 bit can represent only a black-and-white image. If the number of bits is increased, then the number of gray levels increases and a better picture quality is achieved.

All naturally occurring images are analog in nature. The more pixels an image has, the greater its clarity. An image is represented as a matrix in DIP, whereas in DSP we use only row matrices. Naturally occurring images must be sampled and quantized to obtain a digital image. A good image has on the order of 1024 × 1024 pixels, known as 1K × 1K = 1M pixels.
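The relationship between bit depth and the number of gray levels can be demonstrated with a short quantization sketch (illustrative only; the function name is an assumption):

```python
import numpy as np

def quantize(image, bits):
    """Quantize a float image in [0, 1] to 2**bits gray levels."""
    levels = 2 ** bits
    q = np.floor(image * levels).clip(0, levels - 1)  # level index 0..levels-1
    return q / (levels - 1) if levels > 1 else q      # rescale back to [0, 1]

img = np.linspace(0.0, 1.0, 256).reshape(16, 16)      # smooth gradient
one_bit = quantize(img, 1)    # black and white only: 2 gray levels
eight_bit = quantize(img, 8)  # 256 gray levels
```

With 1 bit the gradient collapses to pure black and white; with 8 bits all 256 gray levels survive, illustrating the better picture quality that more bits provide.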

Image acquisition : Digital image acquisition is the creation of digital images, typically

from a physical object. A digital image may be created directly from a physical scene by a

camera or similar device. Alternatively it can be obtained from another image in an analog

medium such as photographs, photographic film, or printed paper by a scanner or similar

device. Many technical images acquired with tomographic equipment, side-looking radar, or

radio telescopes are actually obtained by complex processing of non-image data.

Image enhancement : Recorded images often suffer degradation due to mechanical problems, out-of-focus blur, motion, inappropriate illumination, and noise. The goal of image enhancement is to start from such a recorded image and to produce the most visually pleasing image.

Image restoration : The goal of image restoration is to start from a recorded image and to produce the best possible estimate of the original image. The goal of enhancement is beauty; the goal of

restoration is truth. The measure of success in restoration is usually an error measure between

the original and the estimated image. No mathematical error function is known that

corresponds to human perceptual assessment of error.


Colour image processing : Colour image processing is based on the fact that any colour can be

obtained by mixing 3 basic colours red, green and blue. Hence 3 matrices are necessary each

one representing each colour.

In many signals, the spectral components occurring at any instant can be of particular interest. In these cases it may be very beneficial

to know the time intervals these particular spectral components occur. For example, in EEGs

the latency of an event-related potential is of particular interest.

Wavelet transform is capable of providing the time and frequency information

simultaneously, hence giving a time-frequency representation of the signal. Although the

time and frequency resolution problems are results of a physical phenomenon ( the

Heisenberg uncertainty principle) and exist regardless of the transform used, it is possible to analyze any signal by using an alternative approach called multiresolution analysis (MRA). MRA

analyzes the signal at different frequencies with different resolutions. MRA is designed to

give good time resolution and poor frequency resolution at high frequencies and good

frequency resolution and poor time resolution at low frequencies.
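A minimal sketch of multiresolution analysis, using a hand-rolled Haar wavelet decomposition (the wavelet choice and function name are illustrative assumptions; the text does not prescribe a particular wavelet):

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level Haar wavelet decomposition of a 1-D signal.

    Returns [approx_L, detail_L, ..., detail_1]. Each level halves the
    approximation length: high-frequency detail bands keep fine time
    resolution, while the coarse approximation gains frequency resolution
    at the cost of time resolution -- the MRA trade-off described above.
    """
    a = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        a_even, a_odd = a[0::2], a[1::2]
        details.append((a_even - a_odd) / np.sqrt(2))  # high-pass (detail)
        a = (a_even + a_odd) / np.sqrt(2)              # low-pass (approx)
    return [a] + details[::-1]

x = np.sin(2 * np.pi * np.arange(64) / 8.0)  # length-64 test signal
coeffs = haar_dwt(x, levels=3)               # [approx(8), d3(8), d2(16), d1(32)]
```

Because the Haar transform is orthonormal, the total energy of the coefficients equals that of the input signal.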

Image compression : Its objective is to reduce redundancy of the image data in order to be able to store or transmit

data in an efficient form.

Morphological processing : Morphological image processing consists of operations based on mathematical morphology. Since these techniques rely only on the relative ordering of pixel values, not on their numerical values, they are especially suited to the processing of

binary images and grayscale images.

Segmentation: In the analysis of the objects in images it is essential that we can distinguish

between the objects of interest and "the rest". This latter group is also referred to as the

background. The techniques that are used to find the objects of interest are usually referred to

as segmentation techniques.


CHAPTER 2

DCT is a well-known signal analysis tool used in compression standards due to its

compact representation power. Although the Karhunen-Loève transform (KLT) is known to be the optimal transform in terms of information packing, its data-dependent nature makes it unfeasible for use in some practical tasks. In contrast, the DCT closely approximates the

compact representation ability of the KLT, which makes it a very useful tool for signal

representation both in terms of information packing and in terms of computational

complexity due to its data independent nature.

Local Appearance Based Face Representation

Local appearance based face representation is a generic local approach and does not

require detection of any salient local regions, such as eyes, as in the modular or component

based approaches [5, 10] for face representation. Local appearance based face representation

can be performed as follows: A detected and normalized face image is divided into blocks of 8×8 pixels. Each block is then represented

by its DCT coefficients. The reason for choosing a block size of 8×8 pixels is to have blocks small enough that stationarity holds and transform complexity stays low on one hand, and big enough to provide sufficient

compression on the other hand. The top-left DCT coefficient is removed from the

representation since it only represents the average intensity value of the block. From the

remaining DCT coefficients the ones containing the highest information are extracted via zig-

zag scan.
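The block-based extraction described above (8×8 blocks, 2-D DCT per block, DC coefficient dropped, remaining coefficients taken in zig-zag order) might be sketched as follows. The number of retained coefficients per block and all function names are assumptions for illustration:

```python
import numpy as np
from scipy.fftpack import dct

def zigzag_indices(n):
    """Diagonal zig-zag scan order for an n x n block."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[1] if (p[0] + p[1]) % 2 else p[0]))

def block_dct_features(face, block=8, keep=10):
    """Local appearance features: per 8x8 block, take the 2-D DCT, drop
    the top-left DC coefficient, keep the first `keep` AC coefficients
    in zig-zag order."""
    h, w = face.shape
    order = zigzag_indices(block)[1:1 + keep]   # skip (0,0) = DC term
    feats = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            b = face[r:r + block, c:c + block].astype(float)
            d = dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
            feats.append([d[i, j] for (i, j) in order])
    return np.asarray(feats)                    # one row per block

face = np.random.default_rng(2).random((32, 32))
F = block_dct_features(face)                    # (32/8)^2 = 16 blocks
```

Dropping the DC term makes each block's features invariant to its average intensity: a constant block yields an all-zero feature row.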

Fusion

To fuse the local information, the extracted features from the 8×8-pixel blocks can be

combined at the feature level or at the decision level.

Feature Fusion

In feature fusion, the DCT coefficients obtained from each block are concatenated to

construct the feature vector which is used by the classifier.


Decision Fusion

In decision fusion, classification is done separately on each block and later, the

individual classification results are combined.

2.2 Definition

Ahmed, Natarajan, and Rao (1974) first introduced the discrete cosine transform (DCT) in

the early seventies. Ever since, the DCT has grown in popularity, and several variants have

been proposed (Rao and Yip, 1990). In particular, the DCT was categorized by Wang (1984)

into four slightly different transformations named DCT-I, DCT-II, DCT-III, and DCT-IV. Of

the four classes Wang defined, DCT-II was the one first suggested by Ahmed et al., and it is

the one of concern in this paper.
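The DCT-II can be computed directly from its definition; the sketch below (orthonormal scaling assumed, function name illustrative) checks the direct formula against SciPy's implementation:

```python
import numpy as np
from scipy.fftpack import dct

def dct2_direct(x):
    """DCT-II of a 1-D sequence, straight from the definition:
        X[k] = a(k) * sum_n x[n] * cos(pi * (2n + 1) * k / (2N)),
    with orthonormal scaling a(0) = sqrt(1/N), a(k) = sqrt(2/N) otherwise."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    X = np.empty(N)
    for k in range(N):
        a = np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)
        X[k] = a * np.sum(x * np.cos(np.pi * (2 * n + 1) * k / (2 * N)))
    return X

x = np.array([1.0, 2.0, 3.0, 4.0])
X = dct2_direct(x)   # matches scipy.fftpack.dct(x, norm='ortho')
```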

The DCT is often compared with other discrete transforms on the basis of a number of performance criteria. One of these criteria is the variance distribution of transform

coefficients. This criterion judges the performance of a discrete transform by measuring its

variance distribution for a random sequence having some specific probability distribution

function (Rao and Yip, 1990). It is desirable to have a small number of transform coefficients

with large variances such that all other coefficients can be discarded with little error in the

reconstruction of signals from the ones retained. The error criterion generally used when

reconstructing from truncated transforms is the mean-square error (MSE). In terms of pattern

recognition, it is noted that dimensionality reduction is perhaps as important an objective as

class separability in an application such as face recognition. Thus, a transform exhibiting

large variance distributions for a small number of coefficients is desirable. This is so because

such a transform would require less information to be stored and used for recognition. In this

respect, as well as others, the DCT has been shown to approach the optimality of the KLT

(Pratt, 1991). The variance distribution for the various discrete transforms is usually

measured when the input sequence is a stationary first-order Markov process (Markov-1

process). Such a process has an autocovariance matrix of the form shown in Eq. (2.6) and

provides a good model for the scan lines of gray-scale images (Jain, 1989). The matrix in Eq.

(2.6) is a Toeplitz matrix, which is expected since the process is stationary (Jain, 1989). Thus,


the variance distribution measures are usually computed for random sequences of length N that result in an auto-covariance matrix of the form:

$$ R = \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{N-1} \\ \rho & 1 & \rho & \cdots & \rho^{N-2} \\ \vdots & & \ddots & & \vdots \\ \rho^{N-1} & \rho^{N-2} & \cdots & \rho & 1 \end{bmatrix}, \qquad |\rho| \le 1 \qquad (2.6) $$

where ρ is the correlation coefficient.

Face Recognition Using the Discrete Cosine Transform 171

Figure 2.1 Variance distribution for N = 16 and ρ = 0.9 (adapted from K.R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, New York: Academic, 1990). Data are shown for the following transforms: discrete cosine transform (DCT), discrete Fourier transform (DFT), slant transform (ST), discrete sine transform (type I) (DST-I), discrete sine transform (type II) (DST-II), and Karhunen-Loève transform (KLT).

Figure 2.1 shows the variance distribution for a selection of discrete transforms given a first-order Markov process of length N = 16 and ρ = 0.9. The data for this curve were obtained directly from Rao and Yip (1990), in which


other curves for different lengths are also presented. The purpose here is to illustrate that the

DCT variance distribution, when compared to other deterministic transforms, decreases most

rapidly. The DCT variance distribution is also very close to that of the KLT, which confirms

its near optimality. Both of these observations highlight the potential of the DCT for data

compression and, more importantly, feature extraction.
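This variance-distribution comparison can be reproduced numerically: build the Markov-1 autocovariance matrix of Eq. (2.6), take the eigenvalues of R as the KLT coefficient variances, and take the diagonal of C R Cᵀ (with C the DCT analysis matrix) as the DCT coefficient variances. A sketch with N = 16 and ρ = 0.9, as in Fig. 2.1 (variable names are illustrative):

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.fftpack import dct

# Markov-1 autocovariance matrix of Eq. (2.6): R[i, j] = rho ** |i - j|
N, rho = 16, 0.9
R = toeplitz(rho ** np.arange(N))

# KLT coefficient variances = eigenvalues of R (optimal energy packing)
evals, _ = np.linalg.eigh(R)
klt_var = np.sort(evals)[::-1]

# DCT analysis matrix: column j of dct(I) is the DCT of the j-th unit vector
C = dct(np.eye(N), axis=0, norm='ortho')
# DCT coefficient variances = diagonal of C R C^T, sorted descending
dct_var = np.sort(np.diag(C @ R @ C.T))[::-1]
```

Both variance lists sum to trace(R) = N, and the sorted DCT variances decay almost as rapidly as the KLT eigenvalues, which is the near-optimality the text describes.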

The KLT completely decorrelates a signal in the transform domain, minimizes MSE in data

compression, contains the most energy (variance) in the fewest number of transform

coefficients, and minimizes the total representation entropy of the input sequence (Rosenfeld

and Kak, 1976). All of these properties, particularly the first two, are extremely useful in

pattern recognition applications. The computation of the KLT essentially involves the

determination of the eigenvectors of a covariance matrix of a set of training sequences

(images in the case of face recognition). In particular, given M training images of size, say, N × N, the covariance matrix of interest is given by

$$ C = A A^T \qquad (2.7) $$

where A is a matrix whose columns are the M training images (after having an average face image subtracted from each of them) reshaped into N²-element vectors. Note that because of the size of A, the

computation of the eigenvectors of C may be intractable. However, as discussed in Turk and Pentland (1991), because M is usually much smaller than N² in face recognition, the eigenvectors of C can be obtained more efficiently by computing the eigenvectors of another, smaller matrix (see (Turk and Pentland, 1991) for details). Once the eigenvectors of C are

obtained, only those with the highest corresponding eigenvalues are usually retained to form

the KLT basis set. One measure for the fraction of eigenvectors retained for the KLT basis set

is given by

$$ \theta = \frac{\sum_{i=1}^{M'} \lambda_i}{\sum_{i=1}^{M} \lambda_i} $$

where λ_i is the i-th eigenvalue of C and M' is the number of eigenvectors forming the KLT basis set. As can be seen from the definition of C in Eq. (2.7), the KLT basis functions are

data-dependent. Now, in the case of a first-order Markov process, these basis functions can

be found analytically (Rao and Yip,1990). Moreover, these functions can be shown to be

asymptotically equivalent to the DCT basis functions as ρ (of Eq. (2.6)) → 1 for any given N, and as N → ∞ for any given ρ (Rao and Yip, 1990). It is this asymptotic

equivalence that explains the near optimal performance of the DCT in terms of its variance

distribution for first-order Markov processes. In fact, this equivalence also explains the near

optimal performance of the DCT based on a handful of other criteria such as energy packing

efficiency, residual correlation, and mean-square error in estimation (Rao and Yip, 1990).

This provides a strong justification for the use of the DCT for face recognition. Specifically,

since the KLT has been shown to be very effective in face recognition (Pentland et al., 1994),

it is expected that a deterministic transform that is mathematically related to it would

probably perform just as well in the same application.

As for the computational complexity of the DCT and KLT, it is evident from the above

overview that the KLT requires significant processing during training, since its basis set is

data-dependent. This overhead in computation, albeit occurring in a non-time-critical off-line

training process, is alleviated with the DCT. As for online feature extraction, the KLT of an N × N image can be computed in O(M'N²) time, where M' is the number of KLT basis vectors. In comparison, the DCT of the same image can be computed in O(N² log₂ N) time because of

its relation to the discrete Fourier transform, which can be implemented efficiently using the

fast Fourier transform (Oppenheim and Schafer, 1989). This means that the DCT can be

computationally more efficient than the KLT, depending on the size of the KLT basis set. It

is thus concluded that the discrete cosine transform is very well suited to application in face

recognition. Because of the similarity of its basis functions to those of the KLT, the DCT exhibits striking feature extraction and data compression capabilities. In fact, coupled with these, the ease and speed of the computation of the DCT may even favor it over the KLT

in face recognition.


CHAPTER 3

FACE NORMALIZATION AND RECOGNITION

The face recognition algorithm discussed in this paper is depicted in Fig. 3.1. It involves both

face normalization and recognition. Since face and eye localization is not performed

automatically, the eye coordinates of the input faces need to be entered manually in order to

normalize the faces correctly. This requirement is not a major limitation because the

algorithm can easily be invoked after running a localization system such as the one presented

in Jebara (1996) or others in the literature. As can be seen from Fig. 3.2, the system receives

as input an image containing a face along with its eye coordinates. It then executes both

geometric and illumination normalization functions as will be described later. Once a

normalized (and cropped) face is obtained, it can be compared to other faces, under the same

nominal size, orientation, position, and illumination conditions.

This comparison is based on features extracted using the DCT. The basic idea here is to

compute the DCT of the normalized face and retain a certain subset of the DCT coefficients

as a feature vector describing this face. This feature vector contains the low-to-mid

frequency DCT coefficients, as these are the ones having the highest variance. To recognize a

particular input face, the system compares this face¶s feature vector to the feature vectors of

the database faces using a Euclidean distance nearest-neighbor classifier (Duda and Hart,

1973). If the feature vector of the probe is v and that of a database face is f, then the

Euclidean distance between the two is

$$ d = \sqrt{(v_0 - f_0)^2 + (v_1 - f_1)^2 + \cdots + (v_{M-1} - f_{M-1})^2} \qquad (3.1) $$

where

$$ \mathbf{v} = [v_0 \; v_1 \; \cdots \; v_{M-1}]^T, \qquad \mathbf{f} = [f_0 \; f_1 \; \cdots \; f_{M-1}]^T \qquad (3.2) $$

and M is the number of DCT coefficients retained as features. A match is obtained by minimizing d. Note that this approach computes the DCT on

the entire normalized image. This is different from the use of the DCT in the JPEG

compression standard (Pennebaker and Mitchell, 1993), in which the DCT is computed on

individual subsets of the image. The use of the DCT on individual subsets of an image, as


in the JPEG standard, for face recognition has been proposed in Shneier and Abdel-Mottaleb

(1996) and Eickeler et al. (2000). Also, note that this approach basically assumes no threshold on d. That is, the system described always assumes that the closest match is the correct match, and no probe is ever rejected as unknown. If a threshold d_t is defined on d, then the gallery face that minimizes d would only be output as the match when d ≤ d_t. Otherwise, the probe would be declared as unknown. In this way, one can actually define a threshold to achieve 100% recognition accuracy, but, of course, at the cost of a certain number of rejections. In other words, the system could end up declaring an input face as unknown even though it exists in the gallery. Suitable values of d_t can be obtained using the so-called Receiver Operating Characteristic (ROC) curve (Grzybowski and Younger, 1997), as will be illustrated later.
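The nearest-neighbor matching rule, including the optional rejection threshold, can be sketched as follows (the function name, gallery layout, and 2-D toy feature vectors are illustrative assumptions):

```python
import numpy as np

def match(probe, gallery, threshold=None):
    """Nearest-neighbor match by Euclidean distance in feature space.

    probe: (M,) feature vector; gallery: dict name -> (M,) feature vector.
    With no threshold, the closest gallery face is always returned; with a
    threshold, a probe farther than `threshold` from every gallery face is
    rejected as 'unknown'.
    """
    names = list(gallery)
    dists = np.array([np.linalg.norm(probe - gallery[n]) for n in names])
    best = int(np.argmin(dists))
    if threshold is not None and dists[best] > threshold:
        return 'unknown', float(dists[best])
    return names[best], float(dists[best])

gallery = {'alice': np.array([1.0, 0.0]), 'bob': np.array([0.0, 1.0])}
name, d = match(np.array([0.9, 0.1]), gallery)             # closest gallery face
rejected, _ = match(np.array([10.0, 10.0]), gallery, threshold=2.0)
```

Sweeping the threshold and recording acceptance versus rejection rates is exactly what tracing the ROC curve amounts to.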


Feature Extraction

To obtain the feature vector representing a face, its DCT is computed, and only a subset of

the obtained coefficients is retained. The size of this subset is chosen such that it can

sufficiently represent a face, but it can in fact be quite small, as will be seen in the next section. As an illustration, Fig. 3.2(a) shows a sample image of a face, and Fig. 3.2(b) shows the low-to-mid

frequency 8 × 8 subset of its DCT coefficients. It can be observed that the DCT coefficients

exhibit the expected behavior in which a relatively large amount of information about the

original image is stored in a fairly small number of coefficients. In fact, looking at Fig. 3.2

(b), we note that the DC term is more than 15,000 and the minimum magnitude in the

presented set of coefficients is less than 1. Thus there is an order of 10,000 reduction in

coefficient magnitude in the first 64 DCT coefficients. Most of the discarded coefficients

have magnitudes less than 1. For the purposes of this paper, square subsets, similar to the one

shown in Fig. 3.2(b), are used for the feature vectors. It should be noted that the size of the

subset of DCT coefficients retained as a feature vector may not be large enough for achieving

an accurate reconstruction of the input image. That is, in the case of face recognition, data

compression ratios larger than the ones necessary to render accurate reconstruction of input

images are encountered.
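The holistic feature extraction described in this section (2-D DCT of the whole normalized face, square low-to-mid frequency subset retained) might look like this in NumPy/SciPy (function name and test image are illustrative):

```python
import numpy as np
from scipy.fftpack import dct

def dct_feature_vector(face, k=8):
    """Holistic DCT features: 2-D DCT of the whole (normalized) face,
    keeping the k x k top-left (low-to-mid frequency) subset as the
    feature vector -- 64 coefficients for k = 8."""
    face = np.asarray(face, dtype=float)
    d = dct(dct(face, axis=0, norm='ortho'), axis=1, norm='ortho')
    return d[:k, :k].reshape(-1)

face = np.random.default_rng(3).random((128, 128))
v = dct_feature_vector(face)   # 64-element feature vector
```

As the text observes for Fig. 3.2(b), the DC term dominates: for any non-negative image it is far larger in magnitude than the retained AC coefficients.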


Figure 3.2 Typical face image (a) of size 128 × 128 and an 8 × 8 subset of its DCT (b).

This observation, of course, has no ramifications on the performance evaluation

of the system, because accurate reconstruction is not a requirement. In fact, this situation was

also encountered in Turk and Pentland (1991) where the KLT coefficients used in face

recognition were not sufficient to achieve a subjectively acceptable facial reconstruction.

Figure 3.3 shows the effect of using a feature vector of size 64 to reconstruct a typical face

image. Now, it may be the case that one chooses to use more DCT coefficients to represent

faces. However, there could be a cost associated with doing so. Specifically, more

coefficients do not necessarily imply better recognition results, because by adding them, one

may actually be representing more irrelevant information (Swets and Weng, 1996).


(a)

(b)

Figure 3.3 Effect of reconstructing a 128 × 128 image using only 64 DCT coefficients: (a)

original (b) reconstructed.

3.2 Normalization

Two kinds of normalization are performed in the proposed face recognition system. The first

deals with geometric distortions due to varying imaging conditions. That is, it attempts to compensate for position, scale, and

minor orientation variations in faces. This way, feature vectors are always compared for

images characterized by the same conditions. The second kind of normalization deals with

the illumination of faces. The reasoning here is that the variations in pixel intensities between

different images of faces could be due to illumination conditions. Normalization in this case

is not very easily dealt with because illumination normalization could result in an artificial

tinting of light colored faces and a corresponding lightening of dark colored ones. In the

following two subsections, the issues involved in both kinds of normalization are presented,

and the stage is set for various experiments to test their effectiveness for face recognition.

These experiments and their results are detailed in Section 4.

Geometry

The proposed system is a holistic approach to face recognition. Thus it uses the image of a

whole face and, as discussed in Section 1, it is expected to be sensitive to variations in facial

scale and orientation. An investigation of this effect was performed in the case of the DCT to

confirm this observation. The data used for this test were from the MIT database, which is

described, along with the other databases studied, in a fair amount of detail in Section 4. This

database contains a subset of faces that only vary in scale. To investigate the effects of scale

on face recognition accuracy, faces at a single scale were used as the gallery faces, and faces

from two different scales were used as the probes. Figure 3.5 illustrates how scale can

degrade the performance of a face recognition system. In the figure, the term "Training Case" refers to the scale in the gallery images, and the terms "Case 1" and "Case 2" describe the two scales that were available for the probes.

Figure 3.5 Three faces from the MIT database exhibiting scale variations. The labels refer to the experiments performed in Fig. 3.4, in which 64 DCT coefficients were used for feature vectors and 14 individuals of the MIT database were considered.

Figure 3.5 shows examples of faces

from the training set and from the two cases of scale investigated. These results indicate that

the DCT exhibits sensitivity to scale similar to that shown for the KLT (Turk and Pentland,

1991). The geometric normalization we have used basically attempts to make all faces have

the same size and same frontal, upright pose. It also attempts to crop face images such that

most of the background is excluded. To achieve this, it uses the input face eye coordinates

and defines a transformation to place these eyes in standard positions. That is, it scales faces

such that the eyes are always the same distance apart, and it positions

these faces in an image such that most of the background is excluded. This normalization

procedure is illustrated in Fig.3.6, and it is similar to that proposed in Brunelli and Poggio

(1993). Given the eye coordinates of the input face image, the normalization procedure

performs the following three transformations: rotate the image so that the eyes fall on a

horizontal line, scale the image (while maintaining the original aspect ratio) so that the eye

centers are at a fixed distance apart (36 pixels), and translate the image to place the eyes at

set positions within a 128×128 cropping window (see Fig. 3.6). Note that we only require the

eye coordinates of input faces in order to perform this normalization. Thus no knowledge of

individual face contours is available, which means that we cannot easily exclude the whole

background from the normalized images. Since we cannot tailor an optimal normalization

and cropping scheme for each face without knowledge of its contours, the dimensions shown

in Fig. 3.6 were chosen to result in as little background, hair, and clothing information as

possible, and they seemed appropriate given the variations in face geometry among people.

Another observation we can make about Fig. 3.6 is that the normalization performed accounts

for only two-dimensional perturbations in orientation. That is, no compensation is done for

three-dimensional (in depth) pose variations. This is a much more difficult problem to deal

with, and a satisfactory solution to it has yet to be found. Of course, one could increase the

robustness of a face recognition system to 3-D pose variations by including several training

images containing such variations for a single person. The effect of doing this will be

discussed in the next section. Also, by two-dimensional perturbations in orientation, we mean

slight rotations from the upright position. These rotations are the ones that may arise

naturally, even if people are looking straight ahead (see Fig. 3.7 for an example). Of course,

larger 2-D rotations do not occur naturally and always include some 3-D aspect to them,

which obviously 2-D normalization does not account for. As for the actual normalization

technique implemented, it basically consists of defining and applying a 2-D affine

transformation, based on the relative eye positions and their distance. Figure 3.8 illustrates

the result of applying such a transformation on a sample face image.
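The three transformations above can be sketched as a 2-D affine map built from the eye coordinates. The 36-pixel eye distance and the 128 × 128 cropping window come from the text; the target position of the left eye inside that window, and the function names, are assumptions for illustration only:

```python
import numpy as np

def eye_normalization_transform(left_eye, right_eye,
                                target_left=(46.0, 48.0), eye_dist=36.0):
    """Build a 2-D affine transform (rotation + uniform scale + translation)
    that maps the detected eyes onto standard positions.
    target_left is an assumed standard position inside the 128x128 window."""
    lx, ly = left_eye
    rx, ry = right_eye
    angle = np.arctan2(ry - ly, rx - lx)           # rotate eyes onto a horizontal line
    scale = eye_dist / np.hypot(rx - lx, ry - ly)  # make eye centers 36 pixels apart
    c, s = np.cos(-angle), np.sin(-angle)
    A = scale * np.array([[c, -s], [s, c]])
    t = np.array(target_left) - A @ np.array([lx, ly])  # place left eye at target
    return A, t

A, t = eye_normalization_transform((40.0, 60.0), (100.0, 60.0))
print(A @ np.array([100.0, 60.0]) + t)  # right eye lands 36 px to the right of target_left
```

Applying `A @ p + t` to every pixel coordinate (with interpolation) yields the normalized 128 × 128 face.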


3.3 Illumination

Variations in illumination can degrade the performance of a face recognition system, even though Turk and Pentland indicate that the correlation between face

images under different lighting conditions remains relatively high (Turk and Pentland, 1991).

In fact, experience has shown that for large databases of images, obtained with different

sensors under different lighting conditions, special care must be expended to ensure that

recognition thresholds are not affected. To compensate for illumination variations in our

experiments, we apply Hummel's histogram modification technique (Hummel, 1975). That is, we simply choose a target histogram and then compute a gray-scale transformation that would modify the input image histogram to resemble the target. It should be noted that

another interesting approach to illumination compensation can be found in Brunelli (1997), in

which computer graphics techniques are used to estimate and compensate for illuminant

direction. This alleviates the need to train with multiple images under varying pose, but it

also has significant computational costs. The key issue in illumination compensation is how

to select the target illumination. This is so because there could be tradeoffs involved in

choosing such a target, especially if the face database contains a wide variety of skin tones.

An extensive study of illumination

compensation of faces for automatic recognition was done in conjunction with these

experiments. The aim was to find an appropriate solution to this problem in order to improve

the performance of our system. The results of this study are documented in an unpublished

report available from the authors (Hafed, 1996). The main conclusion that can be drawn from

the study is that illumination normalization is very sensitive to the choice of target

illumination. That is, if an average face is considered as a target, then all histograms will be

mapped onto one histogram that has a reduced dynamic range (due to averaging), and the net

result is a loss of contrast in the facial images. In turn, this loss of contrast makes all faces

look somewhat similar, and some vital information about these faces, like skin color, is lost.

It was found that the best compromise was achieved if the illumination of a single face is

adjusted so as to compensate for possible non-uniform lighting conditions of the two halves


of the same face. That is, no inter-face normalization is performed, and in this way, no

artificial darkening or lightening of faces occurs due to attempts to normalize all faces to a

single target. Of course, the results of illumination normalization really depend on the

database being considered. For example, if the illumination of faces in a database is

sufficiently uniform, then illumination normalization techniques are redundant.
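The intra-face compensation described above (balancing the two halves of a single face instead of mapping all faces to one target) can be sketched roughly as follows. The study cited used histogram modification; the simple per-half gain correction below is only an illustrative stand-in for the same idea:

```python
import numpy as np

def balance_halves(face):
    """Scale each half of the face so both halves share the same mean intensity.
    An illustrative take on intra-face illumination compensation (no inter-face
    normalization is performed, so skin tone is not averaged away)."""
    h, w = face.shape
    left, right = face[:, :w // 2], face[:, w // 2:]
    target = (left.mean() + right.mean()) / 2.0
    out = face.astype(np.float64).copy()
    out[:, :w // 2] *= target / left.mean()
    out[:, w // 2:] *= target / right.mean()
    return np.clip(out, 0, 255)

# Toy face: dark left half, bright right half.
face = np.hstack([np.full((4, 2), 60.0), np.full((4, 2), 180.0)])
balanced = balance_halves(face)
print(balanced[:, :2].mean(), balanced[:, 2:].mean())  # both halves now 120.0
```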

3.4 Experiments

This section describes experiments with the developed face recognition system. These were

fairly extensive, and the hallmark of the work presented here is that the DCT was put to the

test under a wide variety of conditions. Specifically, several databases, with significant

differences between them, were used in the experimentation.

A flowchart of the system described in the previous section is presented in Fig.

3.9. As can be seen, there is a pre-processing stage in which the face codes for the individual

database images are extracted and stored for later use. This stage can be thought of as a

modeling stage, which is necessary even for human beings: we perform a correlation between

what is seen and what is already known in order to actually achieve recognition (Sekuler and

Blake, 1994). At run-time, a test input is presented to the system, and its face codes are

extracted. The closest match is found by performing a search that basically computes

Euclidean distances and sorts the results using a fast algorithm (Silvester, 1993). Figure 3.9 shows the various modules used and the flowchart of operation. This section begins with a brief overview of the

various face databases used for testing the system; the differences among these databases are

highlighted. Then the experiments performed and their results are presented and discussed.

We compare the proposed local appearance-based approach with several well-known holistic face recognition approaches: Principal Component Analysis (PCA) [15], Linear Discriminant Analysis (LDA) [2], Independent Component Analysis (ICA), and the local DCT approach of [12], which uses Gaussian mixture models for modeling the distributions of feature vectors. The latter approach will be named "local DCT + GMM" in the remainder of the paper. Moreover, we also test a local appearance-based approach using PCA for the representation instead of the DCT, which will be named "Local PCA" in the paper.
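As a rough illustration of a block-based local DCT representation with feature fusion, the sketch below divides a 64 × 64 face into 8 × 8 blocks, applies a 2-D DCT to each, and concatenates three low-frequency coefficients per block; 64 blocks × 3 coefficients gives a 192-dimensional vector, matching the "Local DCT + Feature Fusion (192)" entry in Table 3.1. The helper names, the plain low-frequency selection (instead of proper zig-zag ordering), and the skipped DC term are illustrative assumptions, not the exact configuration used in the experiments:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (no external DCT dependency)."""
    n = block.shape[0]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    T[0, :] /= np.sqrt(2.0)
    return T @ block @ T.T

def local_dct_features(img, block=8):
    """Concatenate a few low-frequency DCT coefficients from each block
    (feature fusion). The DC term d[0,0] is skipped here for illustration."""
    feats = []
    for r in range(0, img.shape[0], block):
        for c in range(0, img.shape[1], block):
            d = dct2(img[r:r + block, c:c + block].astype(np.float64))
            feats.extend([d[0, 1], d[1, 0], d[1, 1]])
    return np.array(feats)

img = np.random.rand(64, 64)
print(local_dct_features(img).shape)  # (192,)
```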


Fig. 3.10 Samples from the Yale database. First row: Samples from training set. Second row:

Samples from test set.

In all our experiments, except for the DCT+GMM approach, where the classification is

done with Maximum-Likelihood, we use the nearest neighbor classifier with the normalized

correlation d as the distance metric: d(a, b) = (a · b) / (||a|| ||b||), where a and b are the feature vectors being compared.
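A minimal sketch of this nearest-neighbor matching in Python (the gallery vectors and labels here are toy data; since normalized correlation is a similarity, the best match is the maximum, not the minimum):

```python
import numpy as np

def normalized_correlation(a, b):
    """d(a, b) = a.b / (||a|| ||b||); higher means more similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_neighbor(probe, gallery, labels):
    """Return the label of the gallery vector most correlated with the probe."""
    scores = [normalized_correlation(probe, g) for g in gallery]
    return labels[int(np.argmax(scores))]

gallery = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
labels = ["subject A", "subject B"]
print(nearest_neighbor(np.array([0.9, 0.1]), gallery, labels))  # subject A
```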

The Yale face database consists of 15 individuals, where for each individual, there are 11

face images containing variations in illumination and facial expression. From these 11 face

images, we use 5 for training, the ones with annotations "center light", "no glasses", "normal", "sleepy" and "wink". The remaining 6 images - "glasses", "happy", "left light", "right light", "sad", "surprised" - are used for testing. The test images with illumination from

sides and with glasses are put in the test set on purpose in order to harden the testing

conditions. The face images are closely cropped and scaled to 64x64 resolution. Fig. 3.10

depicts some sample images from the training and testing set.

In the first experiment, the performances of PCA, global DCT, local DCT

and local PCA with feature fusion are examined with varying feature vector dimensions.

Fig.3.11 plots the obtained recognition results for the four approaches for varying number of

coefficients (holistic and local approaches are plotted in different figures due to the difference

in the dimension of used feature vectors in the classification). It can be observed that while

there is no significant performance difference between PCA, local PCA and global DCT, local DCT with feature fusion outperforms these three approaches significantly. Fig. 3.11 shows that

Local DCT outperforms Local PCA significantly at each feature vector dimension, which


indicates that using DCT for local appearance representation is a better choice than using

PCA. Next, the block-based DCT with decision fusion is examined, again with varying

feature vector dimensions. Table 3.1 depicts the obtained results. It can be seen that further

improvement is gained via decision fusion. Using 20 DCT coefficients, 99% accuracy is

achieved. For comparison, the results obtained when using PCA for local representation are

also depicted in Table 3.1. Overall, the results obtained with PCA for local appearance representation are much lower than those obtained with the local DCT representation.

Fig. 3.11 Correct recognition rate versus number of used coefficients on the Yale

database. PCA vs. DCT.

40 eigenvectors are chosen, corresponding to 97.92% of the energy content. From the results depicted in Table 3.1 below, it can be seen that the proposed approaches using local DCT

features outperform the holistic approaches as well as the local DCT features modeled with a

GMM, which ignores location information.

Table 3.1 Recognition rates of the different methods on the Yale database

Method                                    Reco. Rate
PCA (20)                                  75.6%
LDA (14)                                  80.0%
ICA 1 (40)                                77.8%
ICA 2 (40)                                72.2%
Global DCT (64)                           74.4%
Local DCT (18) + GMM (8) as in [12]       58.9%
Local DCT + Feature Fusion (192)          86.7%
Local DCT (10) + Decision Fusion (64)     98.9%


CHAPTER 4

An Artificial Neural Network (ANN) is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The key element

of this paradigm is the novel structure of the information processing system. It is composed

of a large number of highly interconnected processing elements (neurons) working in unison

to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that

exist between the neurons. This is true of ANNs as well.

Historical background

Neural network simulations appear to be a recent development. However, this field was

established before the advent of computers, and has survived at least one major setback and

several eras.

Many important advances have been boosted by the use of inexpensive computer emulations.

Following an initial period of enthusiasm, the field survived a period of frustration and

disrepute. During this period when funding and professional supports was minimal, important

advances were made by relatively few researchers. These pioneers were able to develop

convincing technology which surpassed the limitations identified by Minsky and Papert.

Minsky and Papert published a book in 1969 in which they summed up a general feeling of frustration (against neural networks) among researchers; the book was thus accepted by most

without further analysis. Currently, the neural network field enjoys a resurgence of interest

and a corresponding increase in funding.

The first artificial neuron was produced in 1943 by the neurophysiologist Warren McCulloch and the logician Walter Pitts. But the technology available at that time did not allow them to

do too much.


Neural networks, with their remarkable ability to derive meaning from complicated or

imprecise data, can be used to extract patterns and detect trends that are too complex to be

noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and answer "what if" questions.

Other advantages

1. Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.

2. Self-organisation: an ANN can create its own organization or representation of the information it receives during learning time.

3. Real time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.

4. Fault tolerance via redundant information coding: partial destruction of a network leads to the corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.

Neural networks take a different approach to problem solving than that of conventional

computers. Conventional computers use an algorithmic approach i.e. the computer follows a

set of instructions in order to solve a problem. Unless the specific steps that the computer

needs to follow are known, the computer cannot solve the problem. That restricts the

problem solving capability of conventional computers to problems that we already

understand and know how to solve. But computers would be so much more useful if they

could do things that we don't exactly know how to do.

Neural networks process information in a similar way to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. Neural networks learn by example; they cannot be programmed to perform a specific task. The examples must be selected carefully, otherwise useful time is wasted or, even worse, the network might function incorrectly. The disadvantage is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable.

On the other hand, conventional computers use a cognitive approach to problem solving;

the way the problem is to be solved must be known and stated in small, unambiguous instructions.

These instructions are then converted to a high level language program and then into machine

code that the computer can understand. These machines are totally predictable; if anything

goes wrong, it is due to a software or hardware fault.

Neural networks and conventional algorithmic computers are not in competition but

complement each other. There are tasks that are more suited to an algorithmic approach, like

arithmetic operations and tasks that are more suited to neural networks. Even more, a large

number of tasks require systems that use a combination of the two approaches (normally a

conventional computer is used to supervise the neural network) in order to perform at

maximum efficiency.

The most common type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units.

- The activity of the input units represents the raw information that is fed into the network.

- The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.

- The behaviour of the output units depends on the activity of the hidden units and the weights between the hidden and output units.
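The three layers just described can be sketched as a single forward pass in Python. The logistic sigmoid activation and the random weights below are illustrative assumptions, not part of any specific system in this work:

```python
import numpy as np

def forward(x, W_hidden, W_out):
    """One forward pass through input -> hidden -> output layers.
    Each layer's activity is f(previous activity . weights);
    here f is the logistic sigmoid."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    hidden = sigmoid(x @ W_hidden)   # hidden activity from inputs and weights
    return sigmoid(hidden @ W_out)   # output from hidden activity and weights

rng = np.random.default_rng(0)
x = rng.random(4)                 # input unit activities (raw information)
W_hidden = rng.random((4, 3))     # input-to-hidden weights
W_out = rng.random((3, 2))        # hidden-to-output weights
print(forward(x, W_hidden, W_out).shape)  # (2,)
```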


Block Diagram


Figure 4.2 Block diagram of the face recognition system using Eigenface algorithm

Normalization is a process that changes the range of pixel intensity values. Applications include photographs with poor contrast due to glare, for

example. Normalization is sometimes called contrast stretching. In more general fields of

data processing, such as digital signal processing, it is referred to as dynamic range

expansion.


The purpose of dynamic range expansion is usually to bring the image, or other type of signal, into a range that is more familiar or normal to the

senses, hence the term normalization. Often, the motivation is to achieve consistency in

dynamic range for a set of data, signals, or images to avoid mental distraction or fatigue. For

example, a newspaper will strive to make all of the images in an issue share a similar range of

grayscale.

For example, if the intensity range of the image is 50 to 180 and the desired range is 0 to 255, the process entails subtracting 50 from each pixel

intensity, making the range 0 to 130. Then each pixel intensity is multiplied by 255/130,

making the range 0 to 255. Auto-normalization in image processing software typically

normalizes to the full dynamic range of the number system specified in the image file format.
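The 50-to-180 example above corresponds to a simple linear contrast stretch, sketched here in Python (a generic version; real software may additionally round to integers or clip):

```python
import numpy as np

def normalize_image(img, new_min=0, new_max=255):
    """Linear contrast stretch: map [img.min(), img.max()] to [new_min, new_max]."""
    img = img.astype(np.float64)
    old_min, old_max = img.min(), img.max()
    if old_max == old_min:
        return np.full_like(img, new_min)  # flat image: nothing to stretch
    return (img - old_min) * (new_max - new_min) / (old_max - old_min) + new_min

# The 50..180 example from the text: subtract 50, then scale by 255/130.
img = np.array([[50, 115, 180]])
print(normalize_image(img))  # maps 50 -> 0, 115 -> 127.5, 180 -> 255
```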

The normalization process will produce iris regions, which have the same constant

dimensions, so that two photographs of the same iris under different conditions will have

characteristic features at the same spatial location.


CHAPTER 5

When a linear transformation is applied, most vectors in the space are moved. Think of an eigen vector as an arrow whose direction is not changed. It may stretch,

or shrink, as space is transformed, but it continues to point in the same direction. Most

arrows will move, as illustrated by a spinning planet, but some vectors will continue to point

in the same direction, such as the north pole.

The scaling factor of an eigen vector is called its eigen value. An eigen value only makes

sense in the context of an eigen vector, i.e. the arrow whose length is being changed. In the

plane, a rigid rotation of 90° has no eigen vectors, because all vectors move. However, the

reflection that maps y to -y has the x and y axes as eigen vectors. In this function, x is scaled by 1 and y

by -1, the eigen values corresponding to the two eigen vectors. All other vectors move in the

plane. The y axis, in the above example, is subtle. The direction of the vector has been

reversed, yet we still call it an eigen vector, because it lives in the same line as the original

vector. It has been scaled by -1, pointing in the opposite direction. An eigen vector stretches,

or shrinks, or reverses course, or squashes down to 0. The key is that the output vector is a

constant (possibly negative) times the input vector.
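The reflection example above is easy to verify numerically. A small sketch using NumPy (whose `eig` routine solves M v = l v for column vectors; for this diagonal matrix the row and column conventions coincide):

```python
import numpy as np

# The reflection from the text that scales x by 1 and y by -1:
M = np.array([[1.0, 0.0],
              [0.0, -1.0]])
vals, vecs = np.linalg.eig(M)
print(vals)  # eigen values 1 and -1
print(vecs)  # columns are the x and y axes, the eigen vectors
```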

These concepts are valid over a division ring, as well as a field. Multiply by K on the left to

build the K vector space, and apply the transformation, as a matrix, on the right. However,

the following method for deriving eigen values and vectors is based on the determinant, and

requires a field.

Given a matrix M implementing a linear transformation, what are its eigen vectors and

values? Let the vector x represent an eigen vector and let l be the eigen value. We must

solve x*M = lx. Rewrite lx as x times l times the identity matrix and subtract it from both

sides. The right side drops to 0, and the left side is x*M-x*l*identity. Pull x out of both

factors and write x*Q = 0, where Q is M with l subtracted from the main diagonal. The eigen

vector x lies in the kernel of the map implemented by Q. The entire kernel is known as the

eigen space, and of course it depends on the value of l.


If the eigen space is nontrivial then the determinant of Q must be 0. Expand the determinant,

giving an n degree polynomial in l. (This is where we need a field, to pull all the entries to

the left of l, and build a traditional polynomial.) This is called the characteristic polynomial

of the matrix. The roots of this polynomial are the eigen values. There are at most n eigen

values.

Substitute each root in turn and find the kernel of Q. We are looking for the set of vectors x

such that x*Q = 0. Let R be the transpose of Q and solve R*x = 0, where x has become a

column vector. This is a set of simultaneous equations that can be solved using gaussian

elimination. In summary, a somewhat straightforward algorithm extracts the eigen values, by

solving an n degree polynomial, then derives the eigen space for each eigen value. Some

eigen values will produce multiple eigen vectors, i.e. an eigen space with more than one

dimension. The identity matrix, for instance, has an eigen value of 1, and an n-dimensional

eigen space to go with it. In contrast, an eigen value may have multiplicity > 1, yet there is

only one eigen vector. This is illustrated by [1,1|0,1], a function that tilts the x axis

counterclockwise and leaves the y axis alone. The eigen values are 1 and 1, and the eigen

vector is 0,1, namely the y axis.
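The characteristic-polynomial recipe can be checked numerically on the shear [1,1|0,1] from the text. This is an illustrative sketch: `np.poly` builds the characteristic polynomial l² - 2l + 1 from the matrix, and its roots are the eigen values:

```python
import numpy as np

# Shear matrix [1,1|0,1]: tilts the x axis, leaves the y axis alone.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])
coeffs = np.poly(M)        # characteristic polynomial coefficients: l^2 - 2l + 1
roots = np.roots(coeffs)   # the eigen values: 1 and 1 (up to rounding)
print(coeffs, roots)
```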

Let two eigen vectors have the same eigen value. Specifically, let a linear map multiply the

vectors v and w by the scaling factor l. By linearity, 3v+4w is also scaled by l. In fact every

linear combination of v and w is scaled by l. When a set of vectors has a common eigen

value, the entire space spanned by those vectors is an eigen space, with the same eigen value.

This is not surprising, since the eigen vectors associated with l are precisely the kernel of the

transformation defined by the matrix M with l subtracted from the main diagonal. This

kernel is a vector space, and so is the eigen space of l. Select a basis b for the eigen space of

l. The vectors in b are eigen vectors, with eigen value l, and every eigen vector with eigen

value l is spanned by b. Conversely, an eigen vector with some other eigen value lies outside

of b.

Different eigen values always lead to independent eigen spaces. Suppose we have the shortest

counterexample. Thus c1x1 + c2x2 + ... + ckxk = 0. Here x1 through xk are the eigen vectors,


and c1 through ck are the coefficients that prove the vectors form a dependent set.

Furthermore, the vectors represent at least two different eigen values. Let the first 7 vectors

share a common eigen value l. If these vectors are dependent then one of them can be

expressed as a linear combination of the other 6. Make this substitution and find a shorter list

of dependent eigen vectors that do not all share the same eigen value. The first 6 have eigen

value l, and the rest have some other eigen value. Remember, we selected the shortest list, so

this is a contradiction. Therefore the eigen vectors associated with any given eigen value are

independent. Scale all the coefficients c1 through ck by a common factor s. This does not

change the fact that the sum of cixi is still zero. However, other than this scaling factor, we

will prove there are no other coefficients that carry the eigen vectors to 0.

If there are two independent sets of coefficients that lead to 0, scale them so

the first coefficients in each set are equal, then subtract. This gives a shorter linear

combination of dependent eigen vectors that yields 0. More than one vector remains, else

cjxj = 0, and xj is the 0 vector. We already showed these dependent eigen vectors cannot

share a common eigen value, else they would be linearly independent; thus multiple eigen

values are represented. This is a shorter list of dependent eigen vectors with multiple eigen

values, which is a contradiction. If a set of coefficients carries our eigen vectors to 0, it must

be a scalar multiple of c1, c2, c3, ..., ck. Now take the sum of cixi and multiply by M on the

right. In other words, apply the linear transformation. The image of 0 ought to be 0. Yet

each coefficient is effectively multiplied by the eigen value for its eigen vector, and not all

eigen values are equal. In particular, not all eigen values are 0.

Here is a simple application of eigen vectors. A rigid rotation in 3 space always has an axis

of rotation. Let M implement the rotation. The determinant of M, with l subtracted from its

main diagonal, gives a cubic polynomial in l, and every cubic has at least one real root. Since

lengths are preserved by a rotation, l is ±1. If l is -1 we have a reflection. So l = 1, and the

space rotates through some angle θ about the eigen vector. That's why every planet, every

star, has an axis of rotation.


Matching Algorithm

Here, the two face images are compared and the system displays the result: Match Found or Not Found.


This is the entry point of the face recognition process. It is the module where the face image

under consideration is presented to the system. In other words, the user is asked to present a

face image to the face recognition system in this module. An acquisition module can request

a face image from several different environments: The face image can be an image file that is

located on a magnetic disk, it can be captured by a frame grabber and camera or it can be

scanned from paper with the help of a scanner.

In this module, by means of early vision techniques, face images are normalized and if

desired, they are enhanced to improve the recognition performance of the system. Some or all

of the following pre-processing steps may be implemented in a face recognition system:

1. Image size (resolution) normalization: it is usually done to change the acquired image

size to a default image size on which the face recognition system operates.

2. Histogram equalization: it is usually done on too dark or too bright images in order to

enhance the image quality and to improve face recognition performance. It modifies

the dynamic range (contrast range) of the image and as a result, some important facial

features become more apparent.

3. Median filtering: for noisy images especially obtained from a camera or from a frame

grabber, median filtering can clean the image without losing information.

4. High-pass filtering: feature extractors that are based on facial outlines may benefit from the results of an edge detection scheme. High-pass filtering

emphasizes the details of an image such as contours, which can dramatically improve

edge detection performance.

5. Background removal: in order to deal primarily with facial information itself, face

background can be removed. This is especially important for face recognition systems

where the entire information contained in the image is used. The pre-processing module should be capable of determining the face outline.

c

"&

6. Translational and rotational normalization: in some cases, it is possible to work on a face image in which the head is somehow shifted or rotated. The head plays the key

role in the determination of facial features. Especially for face recognition systems

that are based on the frontal views of faces, it may be desirable that the pre-processing

module determines and if possible, normalizes the shifts and rotations in the head

position.

7. Illumination normalization: face images taken under different illuminations can

degrade recognition performance especially for face recognition systems based on the

principal component analysis in which entire face information is used for recognition.

Hence, normalization is done to account for this.

The feature extraction module

After performing some pre-processing (if necessary), the normalized face image is presented

to the feature extraction module in order to find the key features that are going to be used for

classification. In other words, this module is responsible for composing a feature vector that

represents the face image well enough.

The classification module

In this module, with the help of a pattern classifier, the extracted features of the face image are compared with the ones stored in a face library (or face database). After this comparison, the face image is classified as either known or unknown.

Training set

Training sets are used during the "learning phase" of the face recognition process. The

feature extraction and the classification modules adjust their parameters in order to achieve

optimum recognition performance by making use of training sets.

Due to the dynamic nature of face images, a face recognition system encounters various

problems during the recognition process. It is possible to classify a face recognition system as

either "robust" or "weak" based on its recognition performance under these circumstances.


1. Scale invariance: the same face can be presented to the system at different scales.

This may happen due to the focal distance between the face and the camera. As this

distance decreases, the face image gets bigger.

2. Shift invariance: the same face can be presented to the system at different

perspectives and orientations. For instance, face images of the same person could be

taken from frontal and profile views. Besides, head orientation may change due to

translations and rotations.

3. Illumination invariance: face images of the same person can be taken under

different illumination conditions such as, the position and the strength of the light

source can be modified.

4. Emotional expression and detail invariance: face images of the same person can

differ in expressions when smiling or laughing. Also, some details such as dark

glasses, beards or moustaches can be present.

5. Noise invariance: a robust face recognition system should be insensitive to noise

generated by frame grabbers or cameras. Also, it should function under partially

occluded images.


CHAPTER 6

DEVELOPING TOOLS

MATLAB is a high-performance language for technical computing. It integrates computation, visualization and programming in an easy-to-use environment.

MATLAB stands for matrix laboratory. It was written originally to provide easy access to matrix software developed by the LINPACK (linear system package) and EISPACK (eigen system package) projects.

1. The basic element is the matrix, which does not require pre-dimensioning

2. Algorithm development

3. Data acquisition

4. Data analysis, exploration and visualization

5. Scientific and engineering graphics

The main features of MATLAB

1. Field matrix algebra

2. A large collection of predefined mathematical functions and the ability to define one's

own functions.

3. Two-and three dimensional graphics for plotting and displaying data

4. A complete online help system

5. Powerful, matrix or vector oriented high level programming language for individual

applications.

6. Toolboxes available for solving advanced problems in several application areas

Features and capabilities of MATLAB



2. How to write scripts (main functions) with matlab

3. How to write functions with matlab

4. How to use the debugger

5. How to use the graphical interface


After learning about matlab, we will be able to use it as a tool to help us with our

maths, electronics, signal & image processing, statistics, neural networks, control and

automation.

Matlab resources

à Functions

à Flow statements (for, while)

à Control statements (if, else)

à Data structures (struct, cells)

à Input/outputs (read, write, save)

à Object oriented programming.

Environment

à Command window.

à Editor

à Debugger

à Profiler (evaluate performances)

Mathematical libraries

API

à Call matlab functions from c

Scripts and main programs

In matlab, scripts are the equivalent of main programs. The variables declared in a

script are visible in the workspace and they can be saved. Scripts can therefore take a lot of

memory if you are not careful, especially when dealing with images. To create a script, you

will need to start the editor, write your code and run it.


Syntax:

A = imread(filename,fmt)

[X,map] = imread(filename,fmt)

[...] = imread(filename)

Description:

A = imread(filename,fmt) reads the image in filename into A. If the file contains a grayscale intensity image, A is a two-dimensional array. If the file contains a truecolor (RGB) image, A is a three-dimensional (m-by-n-by-3) array.

[X,map] = imread(filename,fmt) reads the indexed image in filename into X and its

associated colormap into map. The colormap values are rescaled to the range [0,1]. A and

map are two-dimensional arrays.

[...] = imread(filename) attempts to infer the format of the file from its content.

filename is a string that specifies the name of the graphics file, and fmt is a string that

specifies the format of the file. If the file is not in the current directory or in a directory in the

MATLAB path, specify the full pathname for a location on your system. If imread cannot

find a file named filename, it looks for a file named filename.fmt. If you do not specify a

string for fmt, the toolbox will try to discern the format of the file by checking the file header.


TIFF-Specific Syntax:

[...] = imread(...,idx) reads in one image from a multi-image TIFF file. idx is an

integer value that specifies the order in which the image appears in the file. For example, if

idx is 3, imread reads the third image in the file. If you omit this argument, imread reads the

first image in the file.
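The idx argument above can be combined with imfinfo to read every image in a multi-page TIFF; the file name stack.tif is an invented placeholder:

```matlab
% Read all pages of a hypothetical multi-image TIFF file.
info = imfinfo('stack.tif');            % one struct per image in the file
pages = cell(1, numel(info));
for k = 1:numel(info)
    pages{k} = imread('stack.tif', k);  % k plays the role of idx
end
```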

PNG-Specific Syntax:

The discussion in this section is only relevant to PNG files that contain transparent

pixels. A PNG file does not necessarily contain transparency data. Transparent pixels, when

they exist, are identified by one of two components: a transparency chunk

or an alpha channel. (A PNG file can only have one of these components, not both.) The transparency

chunk identifies which pixel values will be treated as transparent, e.g., if the value in the

transparency chunk of an 8-bit image is 0.5020, all pixels in the image with the color 0.5020

can be displayed as transparent. An alpha channel is an array with the same number of pixels

as are in the image, which indicates the transparency status of each corresponding pixel in the

image (transparent or nontransparent). Another potential PNG component related to


transparency is the background color chunk, which (if present) defines a color value that can

be used behind all transparent pixels. This section identifies the default behavior of the

toolbox for reading PNG images that contain either a transparency chunk or an alpha channel,

and describes how you can override it.

HDF-Specific syntax:

[...] = imread(...,ref) reads in one image from a multi-image HDF file. ref is an integer

value that specifies the reference number used to identify the image. For example, if ref is 12,

imread reads the image whose reference number is 12. (Note that in an HDF file the reference

numbers do not necessarily correspond to the order of the images in the file. You can use

imfinfo to match up image order with reference number.) If you omit this argument, imread

reads the first image in the file.

6.2 This table summarizes the types of images that imread can read

Format: Variants

BMP: 1-bit, 4-bit, 8-bit, and 24-bit uncompressed images; 4-bit and 8-bit run-length encoded (RLE) images

HDF: 8-bit raster image datasets, with or without associated colormap; 24-bit raster image datasets

JPEG: Any baseline JPEG image (8- or 24-bit); JPEG images with some commonly used extensions

PNG: Any PNG image, including 1-bit, 2-bit, 4-bit, 8-bit, and 16-bit grayscale images; 8-bit and 16-bit indexed images; 24-bit and 48-bit RGB images

TIFF: Any baseline TIFF image, including 1-bit, 8-bit, and 24-bit uncompressed images; 1-bit, 8-bit, 16-bit, and 24-bit images with packbits compression; 1-bit images with CCITT compression; also 16-bit grayscale, 16-bit indexed, and 48-bit RGB images

imshow

Syntax

imshow(I)

imshow(I,[low high])

imshow(RGB)

imshow(BW)

imshow(X,map)

imshow(filename)

himage = imshow(...)

Description

imshow(I,[low high]) displays the grayscale image I, specifying the display range for

I in [low high]. The value low (and any value less than low) displays as black; the value high

(and any value greater than high) displays as white. Values in between are displayed as

intermediate shades of gray, using the default number of gray levels. If you use an empty

matrix ([]) for [low high], imshow uses [min(I(:)) max(I(:))]; that is, the minimum value in I

is displayed as black, and the maximum value is displayed as white.
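The [low high] display range can be sketched as follows; coins.png is a sample image that ships with the Image Processing Toolbox:

```matlab
I = imread('coins.png');   % 8-bit grayscale sample image
imshow(I, [50 200])        % values <= 50 display as black, >= 200 as white
figure
imshow(I, [])              % [] means use [min(I(:)) max(I(:))] as the range
```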

imshow(BW) displays the binary image BW. imshow displays pixels with the value 0

(zero) as black and pixels with the value 1 as white.

imshow(X,map) displays the indexed image X with the colormap map. A color map

matrix may have any number of rows, but it must have exactly 3 columns. Each row is

interpreted as a color, with the first element specifying the intensity of red light, the second

green, and the third blue. Color intensity can be specified on the interval 0.0 to 1.0.
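A small sketch of an indexed image and its three-column colormap; the index values and colors are invented for illustration:

```matlab
X = [1 2; 3 2];         % indices into the rows of the colormap
map = [1 0 0;           % row 1: red   (intensities on the interval 0.0 to 1.0)
       0 1 0;           % row 2: green
       0 0 1];          % row 3: blue
imshow(X, map)          % each pixel is drawn in the color its index selects
```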


imshow(filename) displays the image stored in the graphics file filename. The file

must contain an image that can be read by imread or dicomread. imshow calls imread or

dicomread to read the image from the file, but does not store the image data in the MATLAB

workspace. If the file contains multiple images, the first one will be displayed. The file must

be in the current directory or on the MATLAB path.

Remarks

imshow is the toolbox's fundamental image display function, optimizing figure, axes,

and image object property settings for image display. imtool provides all the image display

capabilities of imshow but also provides access to several other tools for navigating and

exploring images, such as the Pixel Region tool, Image Information tool, and the Adjust

Contrast tool. imtool presents an integrated environment for displaying images and

performing some common image processing tasks.

Examples

X = imread('moon.tif');

imshow(X)


Introduction

When you start MATLAB, the MATLAB desktop appears, containing tools (graphical user

interfaces) for managing files, variables, and applications associated with MATLAB. The

following illustration shows the default desktop. You can customize the arrangement of tools

and documents to suit your needs; for more information about the desktop tools, see the MATLAB documentation.


6.5 Implementations

The best way for you to get started with MATLAB is to learn how to handle

matrices. Start MATLAB and follow along with each example. You can enter matrices into

MATLAB in several different ways:

- Create matrices with your own functions in M-files.

Start by entering Dürer's matrix as a list of its elements. You only have to follow a few basic conventions:

- Surround the entire list of elements with square brackets, [ ].

To enter the matrix, simply type the following in the Command Window:

A = [16 3 2 13

5 10 11 8

9 6 7 12

4 15 14 1]

This matrix matches the numbers in the engraving. Once you have entered the matrix, it is

automatically remembered in the MATLAB workspace. You can refer to it simply as A. Now

that you have A in the workspace, you can explore it with functions such as sum, transpose, and diag.

You are probably already aware that the special properties of a magic square have to do with

the various ways of summing its elements. If you take the sum along any row or column, or


along either of the two main diagonals, you will always get the same number. Let us verify

that using MATLAB. The first statement to try is sum(A)

ans =

34 34 34 34

When you do not specify an output variable, MATLAB uses the variable ans, short for

answer, to store the results of a calculation. You have computed a row vector containing the

sums of the columns of A. Sure enough, each of the columns has the same sum, the magic

sum, 34.

How about the row sums? MATLAB has a preference for working with the columns of a

matrix, so one way to get the row sums is to transpose the matrix, compute the column sums

of the transpose, and then transpose the result. For an additional way that avoids the double

transpose use the dimension argument for the sum function. MATLAB has two transpose

operators. The apostrophe operator (e.g., A') performs a complex conjugate transposition. It

flips a matrix about its main diagonal, and also changes the sign of the imaginary component

of any complex elements of the matrix. The apostrophe-dot operator (e.g., A'.), transposes

without affecting the sign of complex elements. For matrices containing all real elements, the

two operators return the same result.
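The difference between the two operators only shows up for complex data; a small sketch with an invented complex matrix:

```matlab
C = [1+2i 3; 4 5-1i];
C'      % complex conjugate transpose: the imaginary parts change sign
C.'     % plain transpose: elements are moved but left unchanged
```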

So A' produces

ans =

16 5 9 4

3 10 6 15

2 11 7 14

13 8 12 1

and sum(A')' produces a column vector containing the row sums:

ans =

34

34

34

34

The sum of the elements on the main diagonal is obtained with the sum and the diag

functions:

diag(A) produces

ans =

16

10

7

1

and sum(diag(A)) produces

ans =

34

The other diagonal, the so-called antidiagonal, is not so important mathematically, so

MATLAB does not have a ready-made function for it. But a function originally intended for

use in graphics, fliplr, flips a matrix from left to right:

sum(diag(fliplr(A)))

ans =

34

You have verified that the matrix in Dürer's engraving is indeed a magic square and, in the

process, have sampled a few MATLAB matrix operations.
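The row sums mentioned above can also be computed without the double transpose by giving sum a dimension argument; a short sketch:

```matlab
A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1];
sum(A, 2)    % sum along dimension 2 (across each row) gives the row sums,
             % a column vector of four 34s for this magic square
```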

Operators

+ Addition


- Subtraction

* Multiplication

/ Division

\ Left division (described in "Matrices and Linear Algebra" in the MATLAB documentation)

^ Power

Generating Matrices

Z = zeros(2,4)

Z =

0 0 0 0

0 0 0 0

F = 5*ones(3,3)

F =

5 5 5

5 5 5

5 5 5

N = fix(10*rand(1,10))

N =

9 2 6 4 8 7 4 0 8 4

R = randn(4,4)

R=

M-Files

You can create your own matrices using M-files, which are text files containing MATLAB

code. Use the MATLAB Editor or another text editor to create a file containing the same

statements you would type at the MATLAB command line. Save the file under a name that

ends in .m. For example, create a file containing these five lines: A = [...

Store the file under the name magik.m. Then the statement magik reads the file and creates a

variable, A, containing our example matrix.
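The original five-line listing is truncated in the source; a sketch of what magik.m could look like, using the Dürer matrix entered earlier (the exact formatting of the original lines is not recoverable):

```matlab
% magik.m -- script that recreates the example matrix A in the workspace
A = [16  3  2 13
      5 10 11  8
      9  6  7 12
      4 15 14  1];
```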

MATLAB displays graphs in a special window known as a figure. To create a graph, you

need to define a coordinate system. Therefore every graph is placed within axes, which are

contained by the figure. The actual visual representation of the data is achieved with graphics


objects like lines and surfaces. These objects are drawn within the coordinate system defined

by the axes, which MATLAB automatically creates specifically to accommodate the range of

the data. The actual data is stored as properties of the graphics objects.
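The figure/axes/graphics-object hierarchy described above can be seen in a minimal plot; all names below are standard MATLAB functions:

```matlab
x = linspace(0, 2*pi, 100);
figure                        % create a figure window
h = plot(x, cos(x));          % axes are created automatically to fit the data
xlabel('x'), ylabel('cos(x)')
get(h, 'YData');              % the plotted data is stored as a property of the line object
```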

Plotting Tools

Plotting tools are attached to figures and create an environment for creating graphs.

Display the plotting tools from the View menu or by clicking the plotting tools icon in the figure toolbar.


Editor/Debugger

Use the Editor/Debugger to create and debug M-files, which are programs you write to run

MATLAB functions. The Editor/Debugger provides a graphical user interface for text

editing, as well as for M-file debugging. To create or edit an M-file use File > New or File >

Open, or use the edit function.


CHAPTER 7

7.1 Conclusion

The high dimensionality and redundancy of raw face images can result in inefficiencies when such images are used directly in recognition tasks. In this

project, Discrete Cosine Transforms (DCTs) are used to reduce image information

redundancy because only a subset of the transform coefficients are necessary to preserve the

most important facial features, such as hair outline, eyes and mouth. We demonstrate

experimentally that when DCT coefficients are fed into a backpropagation neural network for

classification, high recognition rates can be achieved using only a small proportion (0.19%)

of available transform components.

7.2 Future Scope

A promising extension combines the DCT with Linear Discriminant Analysis (LDA) for face recognition. This method consists of three steps. First, face images are transformed into the

DCT domain. Second, an energy-probability criterion is applied to the DCT coefficients for the purpose of dimension reduction of the data and optimization of the valid

information. Third, in order to obtain the most salient and invariant features of face images, the

LDA is applied to the data extracted through a frequency mask that facilitates the selection

of useful DCT frequency bands for recognition, because not all bands are useful for

classification. Finally, linear discriminative features are extracted by LDA and classification is

performed with a nearest-neighbor classifier. Because of this dimension reduction and

retention of the most informative coefficients, the proposed method is expected to show better

recognition performance than PCA plus LDA and the existing DCT method.
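The DCT-based dimension reduction that this outline relies on can be sketched as follows; the 8-by-8 low-frequency mask and the file name face.pgm are illustrative assumptions, not the project's tuned values:

```matlab
% Sketch: keep only a low-frequency block of 2-D DCT coefficients.
img = im2double(imread('face.pgm'));   % 'face.pgm' is a placeholder file name
C = dct2(img);                         % 2-D DCT of the whole image
mask = zeros(size(C));
mask(1:8, 1:8) = 1;                    % frequency mask: retain an 8x8 low-frequency block
feature = C(logical(mask));            % 64-element feature vector for the classifier
```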


7.3 References

[1]. C. M. Bishop, Neural Networks for Pattern Recognition. Oxford: Oxford University Press, 1995.

[2]. B. Chalmond and S. Girard, "Nonlinear modeling of scattered multivariate data and its application to shape change," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 5, pp. 422-432, 1999.

[3]. R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: A survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705-740, 1995.

[4]. C. Christopoulos, J. Bormans, A. Skodras, and J. Cornelis, "Efficient computation of the two-dimensional fast cosine transform," in SPIE Hybrid Image and Signal Processing IV, pp. 229-237, 1994.

[5]. R. Gonzalez and R. Woods, Digital Image Processing. Reading, MA: Addison-Wesley, 1992.

[6]. A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, pp. 94-128, 1999; J. Karhunen and J. Joutsensalo, "Generalization of principal component analysis, optimization problems and neural networks," Neural Networks, vol. 8, no. 4, pp. 549-562, 1995.

[7]. M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990.

[8]. S. Lawrence, C. Lee Giles, A. Tsoi, and A. Back, "Face recognition: A convolutional neural network approach," IEEE Transactions on Neural Networks, vol. 8, no. 1, pp. 98-113, 1997.

[9]. C. Nebauer, "Evaluation of convolutional neural networks for visual recognition," IEEE Transactions on Neural Networks, vol. 9, no. 4, pp. 685-696, 1998.

[10]. Z. Pan, R. Adams, and H. Bolouri, "Dimensionality reduction of face images using discrete cosine transforms for recognition," submitted to IEEE Conference on Computer Vision and Pattern Recognition, 2000.

[11]. F. Samaria, Face Recognition using Hidden Markov Models. PhD thesis, Cambridge

University, 1994.


on Pattern Analysis and Machine Intelligence, vol. 11, no. 3, pp. 304-314, 1989.

[13]. D. Valentin, H. Abdi, A. O'Toole, and G. Cottrell, "Connectionist models of face processing: A survey," Pattern Recognition, vol. 27, pp. 1209-1230, 1994.

[14]. M. S. Bartlett et al., "Face recognition by independent component analysis," IEEE Transactions on Neural Networks, vol. 13, no. 6, pp. 1450-1454, 2002.

[15]. P. N. Belhumeur et al., "Eigenfaces vs. fisherfaces: Recognition using class specific linear projection," IEEE Transactions on PAMI, vol. 19, no. 7, pp. 711-720, 1997.

[16]. R. Gottumukkal and V. K. Asari, "An improved face recognition technique based on modular PCA approach," Pattern Recognition Letters, vol. 25, no. 4, 2004.

[17]. Z. M. Hafed and M. D. Levine, "Face recognition using the discrete cosine transform," International Journal of Computer Vision, vol. 43, no. 3, 2001.

[18]. B. Heisele et al., "Face recognition with support vector machines: Global versus component-based approach," in ICCV, pp. 688-694, 2001.

[19]. T. Kanade, Picture Processing by Computer Complex and Recognition of Human Faces. Technical report, Kyoto Univ., Dept. Inform. Sci., 1973.
