
CHAPTER 1

INTRODUCTION
The human face carries a significant amount of information and attributes such as
expression, gender and age. Most people can easily recognize human traits like
emotional states; they can tell from the face whether a person is happy, sad or
angry. Likewise, it is easy to determine a person's gender. However, knowing a
person's age just by looking at old or recent pictures of them is often a bigger
challenge. Our objective in this thesis is to develop human face detection and age
progression from face images: given a face image of a person, we label it with an
estimated age. Aging is a non-reversible process. Human face characteristics change
with time, which reflects major variations in appearance. The age progression signs
displayed on faces, such as hair whitening, muscle drooping and wrinkles, are
uncontrollable and personalized. The aging signs depend on many external factors
such as lifestyle and degree of stress. For instance, smoking causes several changes
in facial characteristics: a 30-year-old person who smokes a pack of cigarettes each
day will look like a 42-year-old. Compared with other facial characteristics such as
identity, expression and gender, aging effects display three unique characteristics:

 The aging progress is uncontrollable. No one can advance or delay aging at
will. The procedure of aging is slow and irreversible.
 Personalized aging variations. Different people age in different ways. The
aging variation of each person is determined by his/her genes as well as many
external factors, such as health, lifestyle, weather conditions, etc.
 The aging variations are temporal data. The aging progress must obey the
order of time. The face status at a particular age will affect all older faces, but
will not affect the younger ones.

Each of these characteristics contributes to the difficulty of automatic age estimation.

We have to distinguish between two computer vision problems. Age synthesis aims
at simulating the aging effects on human faces (i.e. simulating how the face would
look at a certain age) with customized single or mixed facial attributes (identity,
expression, gender, age, ethnicity, pose, etc.); it is the inverse procedure of Face
detection, as shown in Figure 1.1. Face detection with time-domain analysis, in
contrast, aims at automatically labelling a face image with the exact age (year) or
the age group (year range) of the individual face.

Figure 1.1: Age synthesis

The bio-inspired features (BIF) have had significant success in the past few years on
a wide range of computer vision tasks such as category recognition and face
recognition, but have only recently been applied to the human Face detection
problem. Our approach builds an extended version of BIF (Extended BIF, EBIF) that
can encode facial features for face representation. A face representation using EBIF
is extracted for a labelled database of face images by incorporating fine detailed
facial features, automatic initialization using Active Shape Models, and analysis of a
more complete facial area that includes the forehead details. Then, for the age
estimation, a cascade of Support Vector Machine (SVM) and Support Vector
Regression (SVR) models is trained on the extracted facial features to learn an age
estimator. We evaluated our algorithm on the widely used FG-NET and MORPH
benchmark databases, showing the superiority of our proposed algorithm (extended
BIF, EBIF) over state-of-the-art methods.

In this thesis, we study different facial parts using EBIF, mainly the whole face, the
eye wrinkles and the internal face, using different feature shape points. The analysis
shows that the eye wrinkles contain the most important aging features, while
covering only about 30% of the face compared to the internal face and the whole
face. We build our own large database using the internet and the fast-growing social
media content such as that of Flickr, which provides a rich repository of tagged
images that can serve the task of age estimation. We present an automatic Advance
Analysis of Image Processing Algorithm image mining engine that is capable of
collecting images using human-age-related text queries, covering various ethnicity
groups and image qualities. Then, we use the Active Shape Model for robust face
detection, built on Viola and Jones' AdaBoost Haar-like face detector; this also acts
as a removal step for non-face images. After that, we use the extended bio-inspired
features (EBIF) to extract the facial aging information. The images downloaded from
the internet are not accurately labelled with descriptive tags of their true ages, so we
cannot use the Advance Analysis of Image Processing Algorithm image collection
directly as a training dataset. That motivates us to introduce a robust universal
labeller algorithm to label Flickr images automatically, with no manual human
labelling. Finally, we use the FG-NET and MORPH standard databases as testing
datasets, showing the superiority of our proposed image web mining algorithm over
state-of-the-art methods. Our human age estimator has several advantages.
Primarily, it is an automatic system that does not require any kind of human
intervention or any input parameters to estimate a subject's age. Secondly, it is
based on recently proposed algorithms in image and facial image analysis for the
task of evaluating the age of a human face. Thirdly, only a single image of the
subject is required to estimate his/her age within seconds. Lastly, the estimated
ages are very close to the real ages. The availability of good-quality images is not
essential for the method, because the Active Shape Model is trained on images of
different qualities with large variations of lighting, pose and expression, and is
therefore able to extract the face accurately.

1.1 Motivation

Automatic Face detection and its progression in the time domain from facial images
has recently emerged as a technology with multiple interesting applications. The
following examples demonstrate some beneficial uses of such software.

1.1.1 Electronic Customer Relationship Management (ECRM)

ECRM is a management strategy that uses the latest computer vision algorithms and
approaches to build interaction tools for effectively establishing relationships with
all customers and serving them individually. Customers are classified into different
age groups such as babies, teenagers, adults and senior adults. It is important to
take their habits, preferences, responsiveness and expectations of marketing into
consideration: companies can earn more money by acknowledging this fact,
responding directly to customers' specific needs based on their age groups, and
customizing products or services for each age group. The most challenging part is to
maintain enough personal information records or histories for all customers' age
groups, where companies need to invest a large amount of cost to establish
long-term customer relationships. For example, the owner of a fast-food shop wants
to know the most popular sandwiches or meals for each age group; advertising
companies want to target specific audiences (potential customers) with specific
advertisements in terms of age groups; mobile manufacturers want to know which
age group is most interested in their new product models shown at a public kiosk;
and clothing stores may display suitable fashions for males or females according to
their age groups.

1.1.2 Security Control and Surveillance Monitoring

Security control and surveillance monitoring systems are becoming increasingly
important in our everyday life, especially with the rise in the number of crimes and
terrorist threats. With the help of a monitoring camera, a human age estimator
application can generate a warning sound or alarm when underage drinkers enter
bars; prevent minors from purchasing tobacco products from vending machines,
acting as a second checkpoint when IDs are faked; stop elderly people from riding a
roller coaster in an amusement park; and deny non-adult persons access to internet
pages with adult materials. So, it is clear that Face detection from surveillance
cameras can be useful in many situations. Face detection applications are not
limited to preventing crimes; they can also be used in health care systems, such as
robotic nurses and intelligent intensive care units, for customized services. For
example, a personalized avatar can be selected automatically from a custom-built
avatar database to interact with patients from different age groups with particular
preferences.

1.1.3 Information retrieval

The internet can be considered the largest image database that has ever existed; it
gives access to billions of face images uploaded by regular users with descriptive
tags and titles, albeit noisy in many cases. Among the most popular websites are
facebook.com, flickr.com and other sites, where users can benefit from a Face
detection application that estimates human age accurately and retrieves face
images based on an estimated-age image query for friendship requests and face
image collection.

1.1.4 Challenges

Several challenges were encountered when developing our algorithm, because face
images can demonstrate a wide degree of variation in both shape and texture.
Appearance variations are caused by individual differences, the deformation of an
individual face due to changes in expression and speaking, as well as lighting
variations. These issues are explained further in the following points:

 The Face detection problem is particularly challenging as age depends on many
factors; some of them are visual and many others are non-visual, such as ethnic
background, lifestyle, working environment, health condition and social life. For
instance, the effects of ultraviolet radiation, usually through exposure to sunlight,
may cause solar aging, which is another strong cause of advanced signs of face
aging. In particular, Stone stated that aging can be accelerated by smoking, genetic
predisposition, emotional stress, disease processes, dramatic changes in weight,
and exposure to extreme climates.
 The visual features that can help in evaluating age, such as people's facial
features, are affected by pose, lighting and imaging conditions.
 Males and females may have different general discriminative features displayed
in images due to the different extents of using makeup, accessories and cosmetic
surgery, which increases the negative influence of individual differences.
 The difficulty of acquiring large-scale databases that cover a wide enough age
range with chronological face aging images makes the estimation task harder to
achieve. Although Advance Analysis of Image Processing Algorithm image mining
can help the data collection, it is usually hard or even impractical to collect a large
database with a large number of subjects each providing a series of personal
images across different ages.

1.2 System Modules

Our system is mainly decomposed into three modules: 1) a core system module; 2)
an enhancement module; and 3) an application module, as shown in Figure 1.2.

Figure 1.2: Face detection system modules

1.3 Core system module

In the core system module, our Face detection algorithm consists of two tasks: face
representation using the extended bio-inspired features (EBIF), which build on the
bio-inspired features to encode the facial features robustly; and age estimation for
analysis over the time domain, where we train a cascade of Support Vector Machine
(SVM) and Support Vector Regression (SVR) models to learn the relationship
between the coded representation of the face and the actual age of the subjects.
Once this relationship is established, it is possible to estimate the age of a
previously unseen image. Some concepts need to be explained first.

We want to differentiate five definitions related to human age in this thesis.

 Actual Age: The real age (accumulated years after birth) of an individual.
 Appearance Age: The age information shown by the visual appearance.
 Perceived Age: The individual's age as gauged by human subjects from the visual
appearance.
 Estimated Age: The individual's age as recognized by a machine from the visual
appearance.
 Age Categorization: Faces are further categorized on the basis of their stage of
age progression.

We use the Actual Age and Estimated Age definitions, together with progression
estimation, in this work.

1.4 Enhancement module

In this module, we make several improvements to enhance the output of the core
system module, such as further analysis of different facial parts to validate their
importance, and increasing the number of pictures in the FG-NET aging database
using the MORPH dataset. We combine MORPH pictures with FG-NET ones under
pre-defined selection criteria, which ensure the integrity and fairness of the pictures
chosen from the MORPH database to be inserted into the FG-NET database,
resulting in a new database with large variations. Our selection criteria can be
applied to any combination of databases to obtain a more generalized one. The
output of this module will show significant improvements in face detection and age
progression analysis.

1.5 Application module

After building the core system and enhancement modules, we demonstrate the
application module, which has several components:

 An image collector that crawls images from the internet using human-age-related
text queries under several conditions such as image quality, different poses,
expressions, and multiple faces versus a single face in the same image. This leads
to a large database for the purpose of having more training images.
 The crawled images suffer from different problems such as face misalignment
and multiple face instances in the same image with possibly incorrect labels for the
image faces. This motivated us to propose different solutions to overcome the
above-mentioned problems.

We then build a more generalized database using the downloaded images as a
training set, using the internet as a rich image repository for the task of age
estimation. The FG-NET and MORPH datasets are used as testing sets to show the
superiority of our web-based application components over state-of-the-art methods.

1.6 Neural Network for Face Recognition System

A face recognition system consists of two parts: hardware and software. Such a
system is used for automatic identification of users or verification of a claimed
identity. The input is either a digital picture or a video frame from a video stream.
State institutions and some private organizations use face recognition systems,
especially for identifying faces captured by video cameras or for biometric identity
checks using cameras and 3D scanners. The system must recognize where the face
is in a picture, extract it, and perform verification. There are many approaches to
verification, but the most popular is recognition of the face's characteristics. A face
has about 80 characteristic parameters, some of which are: width of the nose,
spacing between the eyes, depth of the eye sockets, shape of the zygomatic bone,
and jaw width.

A retinally connected neural network examines small windows of an image and
decides whether each window contains a face. The system arbitrates between
multiple networks to improve performance over a single network. Training a neural
network for the face detection task is challenging because of the difficulty in
characterizing prototypical "non-face" images. Unlike face recognition, in which the
classes to be discriminated are different faces, the two classes to be discriminated
in face detection are "images containing faces" and "images not containing faces".
It is easy to get a representative sample of images which contain faces, but much
harder to get a representative sample of those which do not; the training set for the
second class can grow very quickly.
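The bootstrapping remedy commonly used for this problem can be sketched as
follows. This is a minimal illustration on synthetic stand-in data, not the thesis
implementation: the window size, network shape, and number of bootstrap rounds
are assumptions.

```python
# Minimal sketch of face / non-face window classification with
# bootstrapped negative mining, on synthetic stand-in data.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
WIN = 20 * 20                            # 20x20 gray windows, flattened

faces = rng.random((500, WIN))           # stand-ins for face windows
non_faces = rng.random((500, WIN))       # small initial non-face sample

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)

for _ in range(3):                       # bootstrap rounds
    X = np.vstack([faces, non_faces])
    y = np.r_[np.ones(len(faces)), np.zeros(len(non_faces))]
    clf.fit(X, y)
    # Scan "scenery" images known to contain no faces; every window the
    # network accepts is a false positive and becomes a new negative.
    scenery = rng.random((2000, WIN))
    false_pos = scenery[clf.predict(scenery) == 1]
    if len(false_pos) == 0:
        break
    non_faces = np.vstack([non_faces, false_pos])
```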

1.7 Thesis contributions

The key technical contributions of this thesis include:

 A novel framework for automatic Face detection in human face images using our
proposed algorithm for face representation. A robust Face detection scheme is then
introduced using a newly introduced integration of support vector machine (SVM)
and support vector regression (SVR) models.
 We analyze different facial parts: a) eye wrinkles, b) the internal face (the whole
face without forehead features) and c) the whole face (face with forehead features).
The purpose of this study is to locate the most important aging features in the input
face image. Next, we increase the number of missing pictures in older age groups
for performance enhancement. Predefined selection criteria are used to combine
the MORPH and FG-NET database images for range completion, resulting in a new
database with large variations of ethnicity groups. These selection criteria can be
applied to different databases to generate a more generalized one.
 We design a human Face detection application that utilizes web images to build a
more generalized database with different ethnicity groups. It has several
components that contribute to the construction of the database, such as an image
collector, a face detection and noise removal step, face representation, and a
labelling algorithm.

CHAPTER 2

LITERATURE SURVEY
Human face aging is a non-reversible process, causing human face characteristics
to change over time through hair whitening, muscle drooping and wrinkles, even
though some beauty care products may slightly reverse minor photo-aging effects.
People have different patterns of aging: over time, human faces take different forms
at different ages, yet there is general discriminative information we can always
describe. Previous work on aging can be broken down into two major categories:
(a) age progression and (b) age estimation and recognition.

2.1 Age progression

Age progression is used to modify and enhance photographs, by computer or
manually (with professional hand-drawing skills), for the purpose of suspect/victim
and lost-people identification in law enforcement. This technique evolved as police
investigative work and art united throughout history. When the photos of missing
family members (especially children) or wanted fugitives are outdated, professionals
can predict the natural aging of the subjects' faces and produce updated face
images, utilizing all available individual information, such as facial attributes,
lifestyle, occupation, and genetics. Age synthesis by machine can significantly
enhance the efficiency of professionals' work while at the same time providing more
photo-realistic aging effects that can satisfy aesthetic needs. Two popular synthesis
algorithms are discussed as follows:

 Implicit Statistical Synthesis: The implicit statistical synthesis focuses on the
appearance analysis, which considers shape and texture synthesis simultaneously
and often uses statistical methods.
 Explicit Mechanical Synthesis: The explicit mechanical synthesis focuses on the
texture analysis, which is more related to skin aging.

2.1.1 Implicit Statistical Synthesis

Zhi-Hua Zhou uses AAMs to build aging functions of young faces under 30 years of
age, in which PCA is applied to extract the shape and texture variations from a set of
training examples. The PCA coefficients for the linear reconstruction of training
samples are considered model parameters, which control different types of
appearance variations. This aging model can be used for age normalization and
further improvement of the face recognition performance [17].

Figure 2.1 shows the AAM example and the aging appearance simulation result of
Zhi-Hua Zhou's method. Ramanathan and Chellappa proposed an appearance-based
adult aging framework. The shape aging is manipulated by a physically-based
parametric muscle model, while the texture aging is manipulated by an
image-gradient-based wrinkle extraction function. This model can predict and
simulate facial aging in two cases: weight gain and weight loss. The wrinkle
simulation module can generate different effects such as subtle, moderate, and
strong.

Figure 2.1: Face aging using active appearance model and principal component
analysis

Volker Blanz presented a dynamic face aging model with multi-resolution and
multi-layer image representations. Associated with a layered And-Or graph, all 50
training face images are decomposed into different parts. The general aging effects
are learned from global hair style and shape changes, facial component
deformations, and facial wrinkle geography. A dynamic Markov process model is
built on the graph structures of the training data. The graph structures over
different age groups are finally sampled in terms of the dynamic model to simulate
new aging faces. Figure 2.2 shows this model and some aging simulation results,
which exhibit highly photorealistic results [7].

Figure 2.2: Multi-layer dynamic face aging model and results

Figure 2.3: M-Face results

2.1.2 Explicit Mechanical Synthesis

Felix Juefei-Xu presented an image-based method to transfer geometric details
between two object surfaces. They found that geometric details can be captured
without knowing the surface reflectance. After alignment, the geometric details can
be cloned to render other surfaces [11].

Both face aging and rejuvenation can be simulated using this method. The first and
third columns in Figure 2.4 show the original images of two persons of different
ages. The second and fourth columns show the two rendering results for face aging
and rejuvenation (exchanging the age between the two faces).

(a) Subject 1 (original face in the left picture, simulated face in the right picture);
(b) Subject 2 (original face in the left picture, simulated face in the right picture).

Figure 2.4: Face aging and rejuvenating results using image-based surface detail
transfer

2.2 Age estimation for progression analysis

The existing Face detection frameworks using face images typically consist of two
main stages: age image representation and Face detection techniques.

2.2.1 Image representation

2.2.1.1 Anthropometric models

Kwon and Lobo studied cranio-facial development theory. The theory uses a
mathematical model to describe the growth of a person's head from infancy to
adulthood. Farkas provided a comprehensive overview of face anthropometry, the
science of measuring sizes and proportions on human faces. For age growth
characterization, people usually utilize distance ratios measured from facial
landmarks instead of using the mathematical models directly. There are two
reasons the mathematical formulation is not used for age estimation: the
mathematical model cannot naturally analyze head growth, especially when the
ages approach adulthood; and it is hard to measure head growth from 2D face
images. Kwon and Lobo classified human faces into three groups: babies, young
adults and senior adults. The experiment was performed on a small database of 47
faces, and the authors did not report the overall performance on it. Face detection
approaches based on the anthropometric model can only deal with young ages,
since the human head shape does not change much in the adult period [15].
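As an illustration of this anthropometric idea, the sketch below computes a couple
of landmark distance ratios; the landmark names and the specific ratios are
illustrative assumptions, not Kwon and Lobo's exact feature set.

```python
# Toy illustration: age-related cues taken from ratios of distances
# between facial landmarks, rather than from a growth model directly.
import numpy as np

def ratio_features(lm: dict) -> dict:
    d = lambda a, b: float(np.linalg.norm(np.subtract(lm[a], lm[b])))
    eye_dist = d("left_eye", "right_eye")
    return {
        # In infants the eyes sit lower on the head, so eye-to-chin
        # distance relative to eye separation helps discriminate babies.
        "eyes_to_chin / eye_dist": d("eye_mid", "chin") / eye_dist,
        "nose_to_mouth / eye_dist": d("nose_tip", "mouth") / eye_dist,
    }

landmarks = {"left_eye": (30, 40), "right_eye": (70, 40),
             "eye_mid": (50, 40), "nose_tip": (50, 60),
             "mouth": (50, 75), "chin": (50, 95)}
print(ratio_features(landmarks))
```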

2.2.1.2 Active appearance models

The active appearance model (AAM) is a statistical face model proposed initially for
coding face images. Given a set of training face images, a statistical shape model
and an intensity model are learned separately, based on principal component
analysis. The AAMs have been used successfully for face encoding, and were
adapted to face aging by proposing an aging function, age = f(b), to explain the
variation in age. In the aging function, age is the actual age of an individual in a face
image, b is a vector containing 50 raw model parameters learned from the AAMs,
and f is the aging function. The aging function defines the relationship between the
age of individuals and the parametric description of the face images. The
experiments were performed on a database of 500 face images of 60 subjects.
Different classifiers were tried for Face detection based on this age image
representation, especially the quadratic aging function. In comparison with the
anthropometry-based approaches, the AAMs can deal with any age in general,
rather than just young ages. In addition, the AAM-based approaches to age image
representation consider both the shape and the texture, rather than just the facial
geometry as in the anthropometric-model-based methods. These approaches are
applicable to precise age estimation, since each test image is labelled with a
specific age value chosen from a continuous range. Further improvements on these
aging pattern representation methods would be (1) to provide evidence that the
relation between face and age can be essentially represented by a quadratic
function; (2) to deal with outliers in the age labelling space; and (3) to deal with
high-dimensional parameters.
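A small sketch of the quadratic aging function age = f(b) is given below, fitted on
random stand-in data with 50 AAM parameters per face as stated above; the
concrete regression setup is an assumption for illustration.

```python
# Sketch: a quadratic aging function age = f(b) over AAM parameters b.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
B = rng.normal(size=(500, 50))          # AAM parameter vectors b (synthetic)
ages = rng.uniform(0, 69, size=500)     # ground-truth ages (synthetic)

# Degree-2 polynomial expansion of b followed by a linear fit.
f = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
f.fit(B, ages)
print("estimated age:", f.predict(B[:1])[0])
```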

2.2.1.3 Aging pattern subspace

Geng et al. [19] introduced the AGing pattern Subspace (AGES), which deals with a
sequence of an individual's aging face images used all together to model the aging
process, instead of dealing with each aging face image separately. An aging pattern
is defined as a sequence of personal face images, coming from the same person,
sorted in temporal order. If the face images of all ages are available for an
individual, the corresponding aging pattern is called a complete aging pattern;
otherwise, it is called an incomplete aging pattern. The AGES method can simulate
the missing ages by using an EM-like iterative learning algorithm. The method works
in two stages: the learning stage and the Face detection stage. In learning the aging
pattern subspace, the PCA technique is used to obtain a subspace representation.
The difference from the standard PCA approach is that there are possibly some
missing age images in each aging pattern. The proposed EM-like iterative learning
method minimizes the reconstruction error, characterized by the difference
between the available age images and the reconstructed face images. The initial
values for missing faces are set to the average of the available face images. Then
the eigenvectors of the covariance matrix of all face images and the means can be
computed. Given the eigenvectors and the mean face, the reconstruction of the
faces can be computed. The process iterates until the reconstruction error is small
enough. In the Face detection stage, the test face image needs to find an aging
pattern suitable for it and a proper age position in that aging pattern. The age
position is returned as the estimate of the age of the test face image. To do this, the
test face is tried at every possible position in the aging pattern, and the one with the
minimum reconstruction error is selected.
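The EM-like loop can be sketched as below: missing entries are initialized with the
mean, then repeatedly replaced by their PCA reconstruction until the error on the
observed entries stabilizes. The dimensions and number of components are
illustrative assumptions, not the AGES settings.

```python
# Sketch of EM-like PCA learning with missing age images.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((100, 64))          # aging patterns (rows of stacked images)
mask = rng.random(X.shape) < 0.3   # True where the age image is missing

X_hat = X.copy()
X_hat[mask] = X[~mask].mean()      # initialize missing entries with the mean

for _ in range(20):
    pca = PCA(n_components=10).fit(X_hat)
    recon = pca.inverse_transform(pca.transform(X_hat))
    X_hat[mask] = recon[mask]      # only the missing entries are updated
    err = np.mean((recon[~mask] - X[~mask]) ** 2)

print("reconstruction error on observed entries:", err)
```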

In terms of utilizing the AAMs for encoding face images, Geng et al. used 200 AAM
features to encode each face image. The experiment evaluating the AGES method
was performed on the FG-NET aging database, and the Mean Absolute Error (MAE)
was reported as 6.77 years. In practical use of the AGES method, one problem is
that, in order to estimate the age of an input face image, the method assumes
there exist face images of the same individual at different ages, or at least a similar
aging pattern, in the training database. This assumption might not be satisfied for
some aging databases, and it is not easy to collect a large database containing face
images of the same individual at many different ages under close imaging
conditions. Another problem of the AGES method is that the AAM face
representation might not encode facial wrinkles well for senior people, because the
AAM method only encodes image intensities without using any spatial neighborhood
to calculate texture patterns. Intensities of single pixels usually cannot characterize
local texture information. In order to represent the facial wrinkles of older adults,
the texture patterns in local regions need to be considered.

2.2.1.4 Appearance models

Pramod Kumar Pisharady and Martin Saerbeck [18] proposed to use the Biologically
Inspired Features (BIF) for Face detection via faces. The theory behind this method
is explained in detail in the next section. BIF can achieve an MAE of 4.77 years on
the FG-NET aging database, and MAEs of 3.91 years and 3.47 years for females and
males on the YGA database, respectively. Considering both age and gender
estimation in an automatic framework, the BIF+Age Manifold features combined
with an SVM can achieve MAEs of 2.61 years and 2.58 years for females and males
on the YGA database, respectively. These results demonstrate the superior
performance of the BIF for the task of age estimation.

2.3 Face detection techniques

Given an aging feature representation, the next step is to estimate ages. Face
detection approaches fall into two categories: a) classification-based and b)
regression-based.
2.3.1 Classification-based

Zhi-Hua Zhou et al. [17] evaluated the performance of different classifiers for age
estimation, including the nearest neighbor classifier, Artificial Neural Networks
(ANNs), and a quadratic function classifier. The face images are represented by the
AAM method. From experiments on a small database containing 400 images at
ages ranging from 0 to 35 years, it was reported that the quadratic function
classifier can reach an MAE of 5.04 years, slightly better than the nearest neighbor
classifier but worse than the ANNs. The SVM was applied to Face detection by Guo
et al. on a large YGA database with 8,000 images. The MAEs are 5.55 and 5.52
years for females and males, respectively, and 7.16 years for the FG-NET aging
database. Kanno et al. proposed using an ANN for 4-class age-group classification,
which achieved 80% accuracy on 110 young male faces. Another approach fits
Gaussian models in a low-dimensional 2DLDA+LDA feature space using the EM
algorithm; the age group is determined by fitting the test image to each Gaussian
model and comparing the likelihoods. For 5-year-range age-group classification, this
system achieves accuracies of about 50% for males and 43% for females; for
10-year ranges, about 72% for males and 63% for females; and for 15-year ranges,
about 82% for males and 74% for females.

2.3.2 Regression-based

Michel F. Valstar and Timur Almaev [5] investigated three formulations of the aging
function: linear, quadratic, and cubic, each with 50 raw model parameters. The
optimal model parameters are learned from training face images of different ages
using a genetic algorithm. SDP is an effective tool but computationally very
expensive; when the size of the training set is large, the SDP solution may be
difficult to obtain. An Expectation Maximization (EM) algorithm was therefore used
to solve the regression problem and speed up the optimization process. The MAEs
are reported as 6.95 years for both females and males on the YGA database, and
5.33 years for the FG-NET aging database. Zhou et al. presented generalized
Image-Based Regression (IBR) aimed at multiple-output settings. A boosting scheme
is used to select features from a redundant Haar-like feature set, and the proposed
training algorithm can also significantly reduce the computational cost. The IBR
achieves an MAE of 5.81 years in a 5-fold cross-validation test on the FG-NET aging
database. Suo et al. compared Age-group-specific Linear Regression (ALR), MLP,
SVR, and logistic regression (multi-class AdaBoost) on FG-NET and their own
databases, and achieved the best performance with the MLP in their experiments.
Guo et al. chose the SVM as a representative classifier and the SVR as a
representative regressor, and compared their performance on the same input data.
In their experiments, the SVM performs much better than the SVR on the YGA
database (5.55 versus 7.00, and 5.52 versus 7.47 years, for females and males,
respectively), while it performs much worse than the SVR on the FG-NET database
(7.16 versus 5.16 years). From these experimental results, we can see that
classification-based Face detection can be much better or much worse than the
regression-based approach in different cases.

2.4 Bio-inspired Models

Recent work suggests that visual processing in the cortex can be modelled as a
hierarchy of increasingly sophisticated representations, and that the feed-forward
path of object recognition in the cortex, accounting for the first few hundred
milliseconds of visual processing in the primate cortex, follows a mostly
feed-forward hierarchy. Riesenhuber et al. proposed a new hierarchical model
derived from a feed-forward model of the primate visual object recognition pathway,
called the "HMAX" model. The standard framework, as shown in Figure 2.5, consists
of different layers of computational units called simple (S) and complex (C) cell
units, creating increasing complexity as the layers progress from the primary visual
cortex (V1) to the inferior temporal cortex (IT). A notable property of the model is
the nonlinear maximum operation "MAX" over the S units, rather than the linear
summation operation "SUM", in pooling inputs at the C layers. Specifically, the first
layer of the model, called the S1 layer, is created by convolving a pyramid of Gabor
filters at 4 orientations and 16 scales over the input gray-level image.

Two adjacent scales of S1 units are then grouped together to form 8 "bands" of
units for each orientation. The second layer, called the C1 layer, is then generated
by taking the maximum values within a local spatial neighbourhood and across the
scales within a band, so the resulting C1 representation contains 8 bands and 4
orientations. The advantage of taking the "MAX" operation within a small range of
positions and scales is that it tolerates small shifts and scale changes. A small code
sketch of the S1/C1 computation follows Figure 2.5.

Figure 2.5: Bio-inspired model
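A minimal sketch of the S1/C1 computation: Gabor responses at two adjacent
scales form one band, and C1 takes the maximum over the two scales and over a
local spatial neighborhood. The filter parameters and pooling size are illustrative
assumptions; the full model uses 16 scales and 4 orientations.

```python
# HMAX-style S1/C1 sketch on a synthetic gray-level image.
import numpy as np
from scipy.signal import fftconvolve
from skimage.filters import gabor_kernel
from skimage.measure import block_reduce

rng = np.random.default_rng(0)
image = rng.random((80, 59))  # stand-in gray-level face image

def s1(img, frequency, theta):
    """One S1 map: magnitude of the response to a real Gabor kernel."""
    k = np.real(gabor_kernel(frequency, theta=theta))
    return np.abs(fftconvolve(img, k, mode="same"))

for theta in [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]:
    a = s1(image, frequency=0.25, theta=theta)   # smaller scale
    b = s1(image, frequency=0.20, theta=theta)   # adjacent larger scale
    band = np.maximum(a, b)                      # MAX across the 2 scales
    c1 = block_reduce(band, (4, 4), np.max)      # MAX over 4x4 neighborhood
    print(f"theta={theta:.2f}: C1 map shape {c1.shape}")
```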

Thomas Serre, Lior Wolf and Tomaso Poggio extended the "HMAX" model of
Riesenhuber and Poggio to include two higher-level layers, called S2 and C2, for
object recognition. In the S2 layer, template matching is performed between the
patches of C1 units and pre-learned prototype patches extracted from natural
images. This S2 layer yields intermediate features that are more selective and thus
useful for discriminating between classes of objects. These S2 units are then
convolved over the entire image, and C2 units are assigned the maximum response
value over S2 [13].

2.5 Literature Review

[1] Mr. Dinesh Chandra Jain and Dr. V. P. Pawar proposed a new way to recognize
faces using facial recognition software and neural network methods, producing a
facial recognition system to protect against frauds and terrorists. Face recognition is
an important and secure way to prevent fraud; for example, government agencies
are investing a considerable amount of resources into improving security systems
as a result of recent terrorist events that dangerously exposed flaws and
weaknesses in today's safety mechanisms. Badge- or password-based
authentication procedures are too easy to hack.

[2] Sujata G. Bhele and V. H. Mankar made an attempt to comprehensively review a
wide range of methods used for face recognition. These include PCA, LDA, ICA,
SVM, Gabor wavelets, and soft computing tools like ANNs for recognition, as well as
various hybrid combinations of these techniques. The review investigates all of
these methods against the parameters that challenge face recognition, such as
illumination, pose variation, and facial expressions.

[3] Mamta Dhanda (Seth Jai Prakash Mukund Lal Institute of Engineering and
Technology) proposed a face recognition system design based upon "eigenfaces".
The original images of the training set are transformed into a set of eigenfaces E.
Then, the weights are calculated for each image of the training set and stored in the
set W. Upon observing an unknown image Y, the weights are calculated for that
particular image and stored in the vector WY. Afterwards, WY is compared with the
weights of images of which one knows for certain that they are faces.

[4] M. Nandini, P. Bhargavi and G. Raja Sekhar (Department of EConE, Sree
Vidyanikethan Engineering College, Tirupathi) proposed a novel approach for
recognizing human faces. The recognition is done by comparing the characteristics
of the new face to those of known individuals. It has a face localization part, where
the mouth end point and eyeballs are obtained. In feature extraction, the distance
between the eyeballs and the mouth end point is calculated. The recognition is
performed by a Neural Network (NN) using Back Propagation Networks (BPN) and
Radial Basis Function (RBF) networks.

[5] Michel F. Valstar, Timur Almaev, Jeffrey M. Girard, Gary McKeown, Marc Mehu,
Lijun Yin, Maja Pantic and Jeffrey F. Cohn proposed the second challenge in
automatic recognition of facial expressions, held in conjunction with the 11th IEEE
conference on Face and Gesture Recognition, May 2015, in Ljubljana, Slovenia.
Three sub-challenges are defined: the detection of AU occurrence, the estimation of
AU intensity for pre-segmented data, and fully automatic AU intensity estimation.
The work outlines the evaluation protocol, the data used, and the results of a
baseline method for the three sub-challenges.

[6] Marian Stewart Bartlett, Javier R. Movellan and Terrence J. Sejnowski observed
that the basis images found by PCA depend only on pairwise relationships between
pixels in the image database. In a task such as face recognition, in which important
information may be contained in the high-order relationships among pixels, it seems
reasonable to expect that better basis images may be found by methods sensitive
to these high-order statistics. Independent component analysis (ICA), a
generalization of PCA, is one such method.

[7] Volker Blanz and Thomas Vetter proposed a method for face recognition across
variations in pose, ranging from frontal to profile views, and across a wide range of
illuminations, including cast shadows and specular reflections. To account for these
variations, the algorithm simulates the process of image formation in 3D space
using computer graphics, and estimates the 3D shape and texture of faces from
single images. The estimate is achieved by fitting a statistical, morphable model of
3D faces to images. The model is learned from a set of textured 3D scans of heads.

[8] Juwei Lu, Konstantinos N. Plataniotis and Anastasios N. Venetsanopoulos
proposed a kernel machine-based discriminant analysis method, which deals with
the nonlinearity of the face patterns' distribution. The proposed method also
effectively solves the so-called "small sample size" (SSS) problem that exists in
most face recognition (FR) tasks. The new algorithm was tested, in terms of
classification error rate performance, on the multi-view UMIST face database.

[9] Wu Xiao-Jun, Josef Kittler, Yang Jing-Yu, Kieron Messer and Wang Shi-Tong
proposed a new kernel direct discriminant analysis (KDDA) algorithm. First, the
recently advocated direct linear discriminant analysis (DLDA) algorithm is reviewed.
Then the new KDDA algorithm is developed, which can be considered a kernel
version of the DLDA algorithm. The design of the minimum distance classifier in the
new kernel subspace is then discussed.

[10] Juwei Lu, K. N. Plataniotis and A. N. Venetsanopoulos (Bell Canada Multimedia
Laboratory, The Edward S. Rogers Sr. Department of Electrical and Computer
Engineering) proposed a kernel machine-based discriminant analysis method, which
deals with the nonlinearity of the face patterns' distribution. The proposed method
also effectively solves the so-called "small sample size" (SSS) problem that exists in
most FR tasks. The new algorithm was tested, in terms of classification error rate
performance, on the multi-view UMIST face database.

[11] Felix Juefei-Xu, Khoa Luu, Marios Savvides, Tien D. Bui and Ching Y. Suen
proposed a novel framework utilizing the periocular region for age-invariant face
recognition. To obtain age-invariant features, they first perform preprocessing
schemes such as pose correction, illumination and periocular region normalization,
and then apply robust Walsh-Hadamard transform encoded local binary patterns
(WLBP) on the preprocessed periocular region only. They find that the WLBP feature
of the periocular region maintains consistency for the same individual across ages.

[12] Guodong Guo, Guowang Mu, Yun Fu (BBN Technologies) and Thomas S. Huang
(UIUC) proposed biologically inspired features (BIF) for human age estimation from
faces. As in previous bio-inspired models, a pyramid of Gabor filters is used at all
positions of the input image for the S1 units. But unlike previous models, they find
that pre-learned prototypes for the S2 layer, progressing then to C2, cannot work
well for age estimation.

[13] Thomas Serre, Lior Wolf and Tomaso Poggio introduced a novel set of features
for robust object recognition. Each element of this set is a complex feature
obtained by combining position- and scale-tolerant edge detectors over neighboring
positions and multiple orientations. The system's architecture is motivated by a
quantitative model of the visual cortex.

[14] Anjali and Asst. Prof. Arun Kumar observed that human face characteristics
change with time, which reflects major variations in appearance. The age
progression signs displayed on faces are uncontrollable and personalized, such as
hair whitening, muscle drooping and wrinkles. Age synthesis is defined as
re-rendering a face image aesthetically with natural aging and rejuvenating effects
on the individual face. Automatic age progression is the process of modifying an
image showing the face of a person in order to predict his/her future facial
appearance.

[15] Young H. Kwon and Niels da Vitoria Lobo proposed a theory and practical
computations for visual age classification from facial images. Currently, the theory
has only been implemented to classify input images into one of three age groups:
babies, young adults, and senior adults. The computations are based on
cranio-facial development theory and skin wrinkle analysis. In the implementation,
primary features of the face are found first, followed by secondary feature analysis.

[16] Asuman Günay and Vasif V. Nabiyev proposed a novel age estimation method,
Global and Local feAture based Age estiMation (GLAAM), relying on global and local
features of facial images. Global features are obtained with Active Appearance
Models (AAM). Local features are extracted with regional 2D-DCT (2-dimensional
Discrete Cosine Transform) of normalized facial images.

[17] Xin Geng, Kate Smith-Miles and Zhi-Hua Zhou proposed an algorithm named
IIS-LLD for learning from label distributions, which is an iterative optimization
process based on the maximum entropy model. Experimental results show the
advantages of IIS-LLD over traditional learning methods based on single-labeled
data.

[18] Pramod Kumar Pisharady and Martin Saerbeck proposed a C2 feature
extraction system based on the standard model of the ventral stream of the visual
cortex, modified for the extraction of facial features. The S2 units in the learning
stage are tuned to frontal and profile views of faces to provide pose-invariant
recognition.

[19] Xin Geng and Kate Smith-Miles proposed an automatic age estimation method
named AGES (AGing pattErn Subspace). The basic idea is to model the aging
pattern, defined as the sequence of a particular individual's face images sorted in
time order, by constructing a representative subspace. The proper aging pattern for
a previously unseen face image is determined by the projection in the subspace
that can reconstruct the face image with minimum reconstruction error, while the
position of the face image in that aging pattern then indicates its age.

CHAPTER 3

SCOPE OF THE WORK

3.1 Scope and Analysis

Face detection accuracy depends on how well the input images are represented by
good, general, discriminative features. The choice between classification and
regression has an impact on the estimated age of an unknown image. In this
chapter, we present a complete Face detection framework with a description of
each component. The output of this framework is the estimated age for the input
face image. The core system module is the first module of our three-module system
for the Face detection task. The major contributions presented in this chapter are:
(1) automatic localization of facial landmarks for the input faces, for the first time
with the bio-inspired features (BIF) method, using Active Shape Models (ASM); (2)
utilizing micro facial features to reveal facial details in the forehead, leading to a
significant increase in overall accuracy over the state-of-the-art methods; (3) a new
set of Gabor feature parts (I, Imaginary, and M, Magnitude) that were not
investigated before for the Face detection problem, along with the commonly used
real part in BIF; and (4) increasing the number of shape features through inclusion
of the forehead shape features.

3.1.1 System structure

The proposed algorithm for Face detection is divided into five steps. First, the facial
landmarks of the face image are detected automatically using ASM (as opposed to
the case of BIF, where this step was performed manually). The image is cropped to
the area covering a fixed number of points generated from the ASM step (several
numbers of points were tested experimentally). Then, the cropped image undergoes
filtering by a family of Gabor functions that yields three sets of features, the Real,
Imaginary and Magnitude (RMI) parts, at different orientations and scales. The
filtered outputs undergo a feature dimensionality reduction step by keeping only the
maximum (MAX) and standard deviations (STD) of the Gabor filtered outputs.
Finally, both classification-based and regression-based models (SVM and SVR in
this case) are used in the training phase to produce the final age model estimator.
The complete framework block diagram is illustrated in Figure 3.1; a structural
outline in code follows the figure.

Figure 3.1: Face detection and progression framework
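The five steps can be outlined structurally as below. These are stubs only; each
stage is sketched concretely in the following sections, and all function names here
are assumptions for illustration.

```python
# Structural outline of the five-step Face detection pipeline (stubs).
import numpy as np

def asm_landmarks(image: np.ndarray) -> np.ndarray: ...    # step 1: ASM fitting
def crop_face(image, landmarks) -> np.ndarray: ...         # step 2: crop to points
def gabor_parts(face) -> list: ...                         # step 3: R, I, M parts
def pool_max_std(responses) -> np.ndarray: ...             # step 4: MAX and STD
def predict_age(features, svm_svr_cascade) -> float: ...   # step 5: cascade

def estimate_age(image, svm_svr_cascade) -> float:
    landmarks = asm_landmarks(image)
    face = crop_face(image, landmarks)
    features = pool_max_std(gabor_parts(face))
    return predict_age(features, svm_svr_cascade)
```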

3.1.2 Detailed implementation

3.1.2.1 Face feature localizer

Face images can demonstrate a wide degree of variation in both shape and texture.
Appearance variations are caused by individual differences, the deformation of an
individual face due to changes in expression, pose and speaking, as well as lighting
variations. Figure 3.2 shows how the image of a person varies with different poses,
and Figure 3.3 shows illumination changes caused by light sources at arbitrary
positions and intensities, which contribute a significant amount of variability.
Typically, we locate the features of a face in order to perform Face detection on the
detected shape features.

Figure 3.2: Images of the same person with different head pose

Figure 3.3: Images of the same person under different lighting conditions.

In this step, we aim at accurately localizing the facial region so as to extract
features only from the relevant parts of the input image. This localization step was
performed manually in earlier BIF work, which limits its practical usage. In this work,
we explore the use of Active Shape Models for the automatic localization of facial
landmark points, which has two main stages, namely training and fitting. In the
training stage, we manually locate landmark points for hundreds of images in such
a way that each landmark represents a distinguishable point present on every
example image. An object shape is represented by a set of labeled points or
landmarks. The number of landmarks should be large enough to capture the overall
shape as well as the details where they are needed.

We need to determine the number of landmark points that can adequately
represent the shape; we tried the 75-point and 68-point annotation schemes. We
use 75 points to cover the whole face including the forehead. Then, we build a
statistical shape model based on these annotated images. The model that will be
used to describe a shape and its typical appearances is based on the variations of
the spatial position of each landmark point within the training set; it represents the
expected shape and local grey-level structure of a target object in an image. Having
built the shape model, points on the incoming face image are fitted. First, we detect
the face in the input image. Second, we initialize the shape points and perform
image alignment fitting to obtain the detected shape features automatically. Finally,
the input image is cropped to the area covered by the ASM-fitted landmark points.
In this thesis, we use 75 landmark points as opposed to 68; this includes features
from the forehead region, which contributed an increase in accuracy, as shown in
the reported experiments. The difference between using 68 and 75 landmark points
is illustrated in Figure 3.4; a landmark localization sketch follows the figure.

Figure 3.4: 68 and 75 points samples from FG-NET.
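The thesis fits an Active Shape Model with 75 points; as an accessible stand-in for
the 68-point scheme, the sketch below localizes landmarks with dlib's pretrained
shape predictor and crops to their bounding box. The model file path is an
assumption (the predictor data file must be downloaded separately).

```python
# Landmark localization and cropping with dlib's 68-point predictor.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_and_crop(gray: np.ndarray):
    """gray: uint8 grayscale image. Returns (landmarks, cropped face)."""
    faces = detector(gray, 1)               # also rejects non-face images
    if not faces:
        return None, None
    shape = predictor(gray, faces[0])
    pts = np.array([(p.x, p.y) for p in shape.parts()])  # 68 (x, y) points
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return pts, gray[y0:y1, x0:x1]          # crop to the fitted region
```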

Figure 3.5 shows the output of the fitting stage with different head poses; it is clear
that the ASM is able to accurately extract the facial features of the original input
face images of the same person in Figure 3.2 under different head rotations.

Figure 3.5: Cropped face images from the fitting stage of the same person with
different head poses.

Figure 3.6: Cropped face images from the fitting stage under different lighting
conditions.

The ASM successfully extracted the facial features under different lighting
conditions, as shown in Figure 3.6. However, the fitting stage could not detect the
other input face images in Figure 3.3, due to the high intensity of the light source
and illumination. This issue can be solved by annotating images with different
illumination changes, caused by light sources at arbitrary positions and intensities,
in the training stage, which will contribute a significant amount of variability.

3.1.3 Texture-based face representation

Texture features have proven to be distinctive for the task of Face detection from
facial images. In particular, the use of Gabor features has proven successful in prior
work. We follow a similar approach, but with the addition of different forms of
Gabor functions, namely the imaginary and magnitude features. The cropped image
output from the feature detection localizer block using the Active Shape Model
(ASM) is filtered by a family of Gabor functions with different forms, 8 orientations
and 16 scales. Gabor functions for a particular scale (sigma) and orientation (theta)
are described by the following equations:

G_R(x, y) = exp(−(X² + γ²Y²) / (2σ²)) × cos(2πX / λ) .............................. (3.1)

G_I(x, y) = exp(−(X² + γ²Y²) / (2σ²)) × sin(2πX / λ) .............................. (3.2)

where X = x·cosθ + y·sinθ and Y = −x·sinθ + y·cosθ, and θ varies between 0 and π.
The parameters λ (wavelength), σ (effective width), and γ (aspect ratio) = 0.3 are
based on earlier BIF work, where only the R (Real) part (equation 3.1) was used.

The starting filter size is 3×3 rather than 5×5, which is capable of revealing facial
details in the forehead area (with the parameters shown in Table 3.1). This can be
observed in Figure 3.7, where the input face image is of size 59×80 and the S1
units at four orientations of band 1 (filter sizes 3×3 and 5×5) are displayed. Figure
3.8 zooms in on parts that show the difference between the 3×3 and 5×5 filter
sizes.

Figure 3.7: Gabor filtered results at band 1 (two scales with filter sizes 3×3 and 5×5)
at four orientations. Note that in the case of the 3x3 filter, the forehead features start
to be more visible.

Equations 3.1 and 3.2 generate two sets of features (the R, Real, and I, Imaginary
parts) respectively. We also generate a magnitude feature, which depends on the R
and I features and is not an independent feature like them. The benefit of using the
Magnitude part is shown in the reported experiments. M is described by the
following equation:

M = √(R² + I²) .............................. (3.3)
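Equations (3.1)-(3.3) can be sketched directly with OpenCV Gabor kernels, where a
phase offset of 0 gives the cosine (Real) carrier and −π/2 gives the sine (Imaginary)
carrier; the magnitude is computed from the two. The 3×3 starting size follows the
text, while the remaining parameter values are illustrative assumptions.

```python
# R, I and M Gabor parts for one scale and orientation (eqs. 3.1-3.3).
import cv2
import numpy as np

image = np.random.default_rng(0).random((80, 59)).astype(np.float32)

def gabor_rim(img, ksize=3, sigma=1.0, theta=0.0, lambd=2.0, gamma=0.3):
    # psi = 0 -> cosine carrier (Real); psi = -pi/2 -> sine carrier (Imaginary)
    k_real = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, 0)
    k_imag = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma,
                                -np.pi / 2)
    r = cv2.filter2D(img, cv2.CV_32F, k_real)   # R part, eq. (3.1)
    i = cv2.filter2D(img, cv2.CV_32F, k_imag)   # I part, eq. (3.2)
    m = np.sqrt(r ** 2 + i ** 2)                # M part, eq. (3.3)
    return r, i, m

R, I, M = gabor_rim(image)
```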

Figure 3.8: Gabor filtered results with filter sizes 3x3 and 5x5

The results of the M (Magnitude), R (Real), and I (Imaginary) image parts after
applying pyramids of Gabor functions with different forms, orientations and scales
are shown in Figures 3.9, 3.10 and 3.11, respectively. Gabor filter outputs can serve
as candidate features for the Face detection problem. However, they have a very
high dimension, leading to difficulties in training, and there are redundancies in the
Gabor filter outputs. Hence, a usually adopted scheme is to summarize the outputs
of the Gabor filters using some statistical measures. Here, we adopt operations
proven to work quite well, namely the maximum "MAX" and the standard deviation
"STD", with a variation on the "MAX" definition: we avoid image subsampling to
keep the local variations, which might be important for characterizing facial details
(e.g., wrinkles, creases and muscle droop). The three sets of Gabor features are
shown in Figure 3.12 with the "MAX" and "STD" operators applied to them.

Figure 3.9: Gabor filtered results for M – Magnitude image part

Figure 3.10: Gabor filtered results for R – Real image part

Figure 3.11: Gabor filtered results for I – Imaginary image part

Table 3.1: Gabor filter parameters

Figure 3.12: Real, Imaginary and Magnitude parts after applying “MAX” and “STD”
operators

F_i = max(x_ij, x_i(j+1)) .............................. (3.4)

where F_i corresponds to the maximum value of the two adjacent filter scales
within the same band of the S1 layer at pixel i, and x_ij and x_i(j+1) are the filtered
values at scales j and j+1 at pixel i.

For the "STD" operator, the standard deviation √((1/Ns²) Σ (x_i − F̄)²) of the
filtered values is computed within each pooling grid of size Ns×Ns, where F̄ is the
mean value of the filtered values within the grid.

The two nonlinear operations "MAX" and "STD" are applied for each orientation and
each scale band independently. For instance, consider the first band, S = 1: for
each orientation, it contains two S1 maps, one obtained using a filter of size 3×3
and the other obtained using a filter of size 5×5. The two maps have the same
width and height but different values, because different filters are applied. First, we
take a max over the two scales by recording only the maximum value from the two
maps, leading to a maximum map through the "MAX" operation; a pooling sketch in
code is given below.
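Both pooling operators can be sketched as below: "MAX" takes the element-wise
maximum of the two adjacent-scale maps in a band (equation 3.4, without
subsampling), and "STD" summarizes the filtered values over Ns×Ns grids. The
value of Ns here is an assumption.

```python
# "MAX" and "STD" pooling over Gabor-filtered maps.
import numpy as np

def band_max(x_j: np.ndarray, x_j1: np.ndarray) -> np.ndarray:
    """F_i = max(x_ij, x_i(j+1)): element-wise max over adjacent scales."""
    return np.maximum(x_j, x_j1)

def grid_std(x: np.ndarray, ns: int = 4) -> np.ndarray:
    """Standard deviation within non-overlapping Ns x Ns pooling grids."""
    h, w = (x.shape[0] // ns) * ns, (x.shape[1] // ns) * ns
    blocks = x[:h, :w].reshape(h // ns, ns, w // ns, ns)
    return blocks.std(axis=(1, 3))

rng = np.random.default_rng(0)
a, b = rng.random((80, 59)), rng.random((80, 59))
pooled = grid_std(band_max(a, b))
```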

3.1.4 Integration of Classification and Regression

Face detection and progression can be treated as a classification problem, where
each age is considered a class label. Alternatively, Face detection can be treated as
a regression problem, where each age is considered a regression value. In our
experiments, we use both SVR and SVM methods for Face detection on the FG-NET
and MORPH standard databases. The RBF SVR can address the three limitations of
the traditional quadratic regression model: (1) the simple quadratic function may
not model the complex aging process, especially over a large span of years, e.g.,
0-70; (2) the least squares estimate is sensitive to outliers that come from incorrect
labels when collecting a large image database; and (3) the least squares criterion
only minimizes the empirical risk, which may not generalize well to unseen
examples.

A face feature localizer is used to detect the face in each image using the Active
Shape Model (ASM) stage. Then, the images are cropped and resized to 59×80
gray-level images. For the face representation, we use our extension of the
biologically-inspired features method to model each face for the purpose of age
estimation, which leads to a total of 6,100 features per image. We use a cascade of
classification and regression: we build six SVR models and one SVM model using
the experimentally selected parameters provided in Table 3.2.

Using SVR or SVM separately cannot adequately estimate age, because of the
diversity of the aging process across different ages. Hence, we combine the SVR
and SVM models by selecting which model to use for each age group, based on the
MSE results over the training data. The age of a test image is predicted using a
cascade of SVM and SVR models by taking the average over the estimated ages, as
shown in Figure 3.14; then, based on the decision nodes, the final age is estimated.
A sketch of this combination follows Figure 3.14.

Figure 3.14: Face detection and categorization over the progression process for
test images using a cascade of SVR and SVM models
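A hedged sketch of the classification/regression combination: an SVM assigns a
coarse age group and a per-group RBF SVR refines the estimate, matching the
six-SVR/one-SVM setup above. The group boundaries, kernel parameters, and the
simplification of the averaging scheme in Figure 3.14 are assumptions.

```python
# SVM age-group classification followed by per-group SVR refinement.
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 6100))        # EBIF features (6,100 per face)
ages = rng.uniform(0, 69, size=600)     # synthetic ground-truth ages
groups = np.digitize(ages, [12, 24, 36, 48, 60])  # six coarse age groups

group_clf = SVC(kernel="rbf").fit(X, groups)      # one SVM
regressors = {g: SVR(kernel="rbf").fit(X[groups == g], ages[groups == g])
              for g in np.unique(groups)}         # six SVRs

def estimate(x: np.ndarray) -> float:
    g = int(group_clf.predict(x[None, :])[0])     # coarse group decision
    return float(regressors[g].predict(x[None, :])[0])  # refined age

print("estimated age:", estimate(X[0]))
```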

3.1.5 Evaluation measures

We used two measures to evaluate Face detection performance: (1) Mean Absolute Error (MAE) and (2) Cumulative Score (CS). The MAE is defined as the average of the absolute errors between the estimated ages and the ground truth ages:

MAE = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{l}_k - l_k \right|

where l_k is the ground truth age for test image k, \hat{l}_k is the estimated age, and N is the total number of test images. The cumulative score CS(j) is defined as

CS(j) = \frac{N_{e \le j}}{N} \times 100\%

where N_{e \le j} is the number of test images on which the Face detection makes an absolute error no higher than j years.
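Both measures are straightforward to compute; a short MATLAB sketch, assuming vectors ageTrue and ageEst holding the ground truth and estimated ages of the N test images:

absErr = abs(ageEst - ageTrue);
MAE    = mean(absErr);                            % Mean Absolute Error (years)

j  = 0:10;                                        % error levels for the CS curve
CS = arrayfun(@(t) 100 * mean(absErr <= t), j);   % CS(j) in percent
plot(j, CS); xlabel('Error level j (years)'); ylabel('Cumulative score (%)');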

3.1.6 Aging databases

Collecting face images is an important task for building accurate age estimation models. However, it is extremely hard in practice to collect large aging databases, especially when one wants to collect chronological image series of an individual. In this thesis, we have used two standard aging databases, FG-NET and MORPH; we summarize them below along with other existing benchmark aging databases.

3.1.7 FG-NET aging database

The FG-NET aging database is publicly available. It contains 1,002 high-resolution color or grayscale face images of 82 subjects with large variations of lighting, pose, and expression. The age range is from 0 to 69 years, with chronological aging images available for each subject (on average 12 images per subject). Figure 3.15 shows example images from the FG-NET database.

Figure 3.15: Some sample images from FG-NET

3.1.8 MORPH database

The publicly available MORPH face database was collected by the face aging group at the University of North Carolina at Wilmington for face biometrics applications. This longitudinal database records individuals' metadata, such as age, gender, ethnicity, height, weight, and ancestry, and is organized into two albums. Album 1 contains 1,724 face images of 515 subjects taken between 1962 and 1998. The average age is 27.3 years and the maximum is 68 years. There are 294 images of females and 1,430 images of males. The time span between an individual's first and last photos ranges from 46 days to 29 years. Figure 3.16 shows image examples from MORPH Album 1. Album 2 contains about 55,000 face images, of which about 77% are Black faces, 19% are White, and the remaining 4% include Hispanic, Asian, Indian, and other ancestries. Figure 3.17 shows example images from MORPH Album 2.

Figure 3.16: Some sample images from MORPH album 1

3.1.9 YGA database

The Yamaha Gender and Age (YGA) database is private and not publicly available, so we did not use it in our evaluations. The YGA database contains 8,000 high-resolution outdoor color images of 1,600 Asian subjects, 800 females and 800 males, with ages ranging from 0 (newborn) to 93 years. Each subject has about 5 near-frontal images at the same age and a ground truth label of his or her approximate age as an integer. The photos contain large variations in illumination, facial expression, and makeup. The faces are cropped automatically by a face detector and resized to 60×60 gray-level patches.

3.1.10 Experiments and results of earlier research work

A Leave-One-Person-Out (LOPO) test strategy is used on the FG-NET database, i.e., in each fold, the images of one person are used as the test set and those of the others are used as the training set. After 82 folds, each subject has been used as the test set once, and the final results are calculated over all the estimations. In this way, the algorithms are tested in a setting similar to real applications, i.e., the subject whose age the algorithm attempts to estimate is previously unseen in the training set.
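A sketch of the LOPO protocol, assuming features X, age labels y and a subject-id vector subj with 82 distinct ids; trainEstimator and predictAges are hypothetical placeholders for the training and prediction steps of the cascade described above:

ids  = unique(subj);
pred = zeros(size(y));
for k = 1:numel(ids)
    te = (subj == ids(k));                        % all images of one person -> test
    tr = ~te;                                     % all other subjects -> training
    model    = trainEstimator(X(tr,:), y(tr));    % hypothetical helper
    pred(te) = predictAges(model, X(te,:));       % hypothetical helper
end
MAE = mean(abs(pred - y));                        % final score over all 82 folds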
To further test the generalization ability, the algorithm trained on the FG-NET aging database is then tested on the MORPH database. Note that, as a test set, the possible age range of the MORPH data is assumed to be the same as that of the FG-NET aging database (0-69), although the actual range is much smaller (15-68); the same strategy was used in prior work. The results of using separate models for the different face representation features (R – Real, I – Imaginary, M – Magnitude) are shown in Table 3.1 on the FG-NET aging database in terms of MAE. It is clear that using SVR or SVM separately cannot adequately estimate age because of the diversity of the aging process across different ages. Hence, we combine SVR and SVM models by selecting which model to use for each age group, based on MSE results over the training set.

Table 3.1: MAEs of using separate SVR and SVM for different Gabor features on
FG-NET

Table 3.2: MAE (years) measures for 75 and 68 shape features.

We first measure the effect of utilizing detailed shape features by using 75 points as opposed to 68. Experimental results are provided in Table 3.2, which shows the superiority of the more detailed shape representation on FG-NET.

The second set of experiments aims at measuring the MAE and CS scores of the proposed method against the state-of-the-art methods. The MAE results on FG-NET and MORPH are shown in Table 3.3 for the different face representation features. On FG-NET, the R – Real part has an MAE of 3.16, which is lower than the 3.31 of the M – Magnitude and the 3.50 of the I – Imaginary parts. For the MORPH database (which was tested in prior work), the authors used 433 images of only Caucasian descent; we used the same images for consistency. The R – Real part has an MAE of 4.11 on MORPH, which is significantly lower than the M – Magnitude and I – Imaginary parts (4.34 and 4.47, respectively).

Table 3.3: MAE (years) measures for R, I and M parts on FG-NET and MORPH

The MAEs at each age group (about a 10-year span) are given in Table 3.4 for FG-NET. Our extension of the bio-inspired features and the new set of Gabor features (I – Imaginary and M – Magnitude) have MAEs that are lower than the state-of-the-art methods. These average errors are substantially smaller than the RUN method (5.78 years) and even significantly lower than the very recent BIF approach (4.77 years), which was announced as the best reported result so far. See Table 3.5 for an aggregate comparison of MAE values for both the FG-NET and MORPH databases.

Table 3.4: MAE (years) at different age groups on FG-NET.

Table 3.5: MAE (years) comparisons.

CS curves for the R – Real part, which we call the extended bio-inspired features (EBIF), and for the BIF method are similarly shown in Figure 3.18.

3.1.11 Summary

In this study, we presented a human age estimator based on the bio-inspired features method. We explored a new set of Gabor features for the first time (I – Imaginary and M – Magnitude). We combined BIF with the Active Shape Model (ASM) for initialization. Furthermore, we experimented with the extraction of finer facial features than in prior work and showed experimentally the superiority of the proposed contributions. Evaluated on the FG-NET and MORPH benchmark databases, our algorithm achieved high accuracy in estimating human ages compared to published methods. We also tested the proposed algorithm on the MORPH database to show its generalization capabilities. We further improve the output results by combining the core system module with a second module, called the enhancement module, in which we investigate different facial parts such as the eye wrinkles and the internal face, and we create a more generalized database by combining the FG-NET and MORPH databases to enhance the results in the older age groups by supplying the missing pictures.

3.2 Enhancement module

Most Face detection frameworks use the whole face to estimate the age of the input face image in order to achieve better accuracy. In the previous sections, we used the whole face, represented by 75 feature shape points, to estimate the age using our enhanced version of the bio-inspired features (EBIF), which achieved better accuracy than the state-of-the-art. Researchers have focused on finding a global aging function using larger datasets or different methods, but few have explored where the most important aging features are located in the face. An exception is Zhi-Hua Zhou, who empirically studied the significance of different facial parts for automatic age estimation. His algorithm is based on his previous work on statistical face models. His investigation assessed the following face regions: the whole face (including the hairline), the whole internal face, the upper part of the face, and the lower part of the face, as shown in Figure 3.17.

Figure 3.17: Different facial regions used for Face detection

Experimental results revealed that the area around the eyes (Figure 3.17(c)) is the most significant for age estimation. The model of the upper facial part minimized the estimation error and standard deviation, resulting in a mean error of 3.83 years and a standard deviation of 3.70 years. Zhi-Hua Zhou claims that the introduction of the hairline (when using the whole face) has a negative effect on the results, arguing that the increased variability of the hair region distorts the Face detection task. Zhou's analysis was limited to subjects ranging from 0 to 35 years old and contained 330 images, of which only 80 were used for testing purposes. Evidently, faces with more wrinkles were not used, leaving in doubt the method's ability to estimate the age of subjects older than 35 years. This motivated us to analyze these facial parts using faces with more wrinkles and to estimate the age of subjects older than 35 years, in order to assess the impact on the older ages. In this chapter, we use the core Face detection components explained earlier to analyze different facial parts: a) eye wrinkles, b) the whole internal face (without the forehead area), and c) the whole face (with the forehead area), using different feature shape points. The analysis shows that the eye wrinkles, which cover 30% of the facial area, contain the most important aging features compared to the internal face and the whole face. We use the I – Imaginary and M – Magnitude parts of the Gabor function introduced earlier to extensively analyze the eye wrinkles. As shown in the experiments section, the FG-NET MAE results in the older age groups are high compared to the younger age groups; this is due to the very limited number of pictures in the older age groups during the training phase. We enhance the results of those older age groups in FG-NET by increasing the number of missing pictures in the older age groups using the MORPH database. In the next section, we explain the proposed Face detection framework used to analyze the different human facial parts.

3.2.1 Face detection and progression framework

As shown in the diagram of Figure 3.18, the proposed Face detection algorithm consists of two main stages, namely 1) a pre-processing stage and 2) the Face detection process.

Figure 3.18: Proposed Face detection and progression framework.

In the pre-processing stage, the facial landmarks for the whole face, the internal face and the eye wrinkles are detected using the Active Shape Model (ASM) block, as performed previously. The images are cropped to the area covering a fixed number of points generated from the ASM stage. Then, in the Face detection process, the cropped images are filtered by a family of Gabor functions at different orientations and scales in the S1 block. The filtered outputs undergo a feature dimensionality reduction step that keeps the maximum (MAX) and standard deviation (STD) of the Gabor filtered outputs in the C1 block. Finally, both classification-based and regression-based models (SVM and SVR in this case) are used in the training phase to produce the final age estimator.

3.2.2 Shape boundary detection

We use the same methodology as before to extract the feature shape points, but here we aim at analyzing the eye wrinkles using 20 points, the internal face area using 58 points, and the whole face using 75 points, as described earlier. The purpose of this analysis is to determine the locations of the most important aging features using the eye wrinkles or the internal face only, rather than the whole face area. Finally, we build three separate statistical shape models using:

1) a 75-point shape model;

2) a 20-point shape model; and

3) a 58-point shape model, built by ASM from the annotated images.

Each face region has its own annotation points that were used during the training stage. The eye-wrinkle region has 20 points, taken from the BioID database, used to build a shape model that the fitting stage uses to extract the area covering these points. The whole face and the internal face have 75 and 58 points, respectively, taken from the annotated images to build their shape models. These shape models are used in the fitting stage to extract the areas covering their points. Thus, the point sets describing the facial parts are independent of each other. The difference between using 75, 20 and 58 landmark points is illustrated in the figure below. The facial regions covered by each shape model reproduce the facial parts studied by Zhi-Hua Zhou, except the mouth area, for which no shape model was built in our experiments because no annotation points exist for that area.

Figure 3.19: 58, 20 and 75 point samples from FG-NET.

3.2.3 Face Representation

We use the same face representation explained earlier in this chapter. Figure 3.20 shows the output of the S1 layer for the eye wrinkles, the internal face and the whole face.

Figure 3.20: Output of the S1 layer for eye wrinkles

3.2.4 Cascade of classification and regression

We use a cascade of classification and regression models as explained earlier. We build six SVR models and one SVM model using the experimentally selected parameters provided in Table 3.2.

Figure 3.21: Gabor filtered results using the I, M and R parts at different bands with filter size 3×3 at orientation 0

CHAPTER 4

MATERIAL AND METHODS

4.1 Proposed work

To perform age progression, we carry out the following steps.

4.1.1 Pose correction: the input face is warped to an approximately frontal pose using an alignment pipeline from prior work; denote the aligned photo I.

4.1.2 Texture age progression: Relight the source and target age cluster averages to match the lighting of I, yielding A_s^I and A_t^I. Compute the flow F_{source→input} between A_s^I and I and warp A_s^I to the input image coordinate frame, and do similarly for F_{target→input}. This yields a pair of illumination-matched projections, J_s and J_t, both warped to the input. The texture difference J_t − J_s is added to the input image I.

4.1.3 Flow age progression: Apply the flow from the source cluster to the target cluster, F_{target→source}, mapped to the input image, i.e., apply F_{input→target} ∘ F_{target→source} to the texture-modified image I + J_t − J_s. For efficiency, we precompute bidirectional flows from each age cluster to every other age cluster.

4.1.4 Aspect-ratio progression: Apply the change in aspect ratio to account for variation in head shape over time. Per-cluster aspect ratios were computed as the ratio of the distance between the left and right eyes to the distance between the eyes and the mouth, averaged over the fiducial point locations of the images in each cluster. We also allow for differences in skin tone (albedo) by computing a separate rank-4 subspace and projection for each color channel.
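A minimal MATLAB sketch of the texture and aspect-ratio steps, assuming the illumination-matched, input-aligned projections Js and Jt are already computed as double images the same size as I, and that arSource and arTarget are the precomputed per-cluster aspect ratios; the direction of the height rescaling is an assumption for illustration:

Iout = I + (Jt - Js);                     % transfer the aging texture difference

scaleY = arSource / arTarget;             % assumed direction of the correction
Iout = imresize(Iout, [round(size(Iout,1) * scaleY), size(Iout,2)]);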

The main focus of this study is to move research on human Face detection and progression towards real applications and practical everyday usage, rather than being bound to the existing databases with their limitations to a single human ethnic group or to well-annotated faces. All methods and algorithms should take into consideration a more generalized database that contains various races with different image qualities and conditions.

In this work, we address the following issues:

1. Though it is practically difficult, or even impossible, to collect a huge human Face detection database with correct true labels, the internet provides us with the facility to collect a large number of face images with possible age information in the form of tags or descriptions attached to a particular image. Popular photo-sharing websites such as Flickr can return thousands of correct images from various ancestry groups for a single age-related query such as "20 years old".

2. Face misalignment can be rectified by using the Active Shape Model (ASM)
to locate the correct facial landmarks for the face images.

3. The problem of multi-instance faces in the same image, with possibly incorrect labels of the image, motivated us to design the universal labeller algorithm for efficient and effective image labelling.

4.2 Limitation of Project

The proposed work is capable of detecting a small image that is part of a bigger image, but it can hardly detect a template of large size. This is due to the template matching algorithm, which is based on the matrix representation of the image. Furthermore, including a big image and its matrix information for template matching makes it quite hard to recognize it within the ancestor (parent) image.

The detection of eyes, ears, nose and other specific parts of the face image in this research follows the same template matching concept described in the earlier chapter.
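For reference, a hedged sketch of this kind of template matching using MATLAB's normalized cross-correlation (the file names are placeholders; the template must be smaller than the parent image, which is exactly the limitation discussed above):

scene    = rgb2gray(im2double(imread('parent_image.jpg')));   % parent image
template = rgb2gray(im2double(imread('face_part.jpg')));      % small face part

c = normxcorr2(template, scene);                 % correlation surface
[~, idx] = max(abs(c(:)));
[ypeak, xpeak] = ind2sub(size(c), idx);
yoff = ypeak - size(template,1);                 % top-left corner of best match
xoff = xpeak - size(template,2);
imshow(scene); hold on
rectangle('Position', [xoff+1, yoff+1, size(template,2), size(template,1)], ...
          'EdgeColor', 'g');                     % mark the detected region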

4.3 Limit of Age Progression

This thesis investigates face recognition and its modules, such as face part detection. Further, for age progression it includes three time-domain phases: the human face at childhood, at middle age and at old age. By observing these three different phases, this research identifies the factors by which we can differentiate the images:

 Texture of the skin
 Aging signs of the skin
 Slight shift or deformation of facial parts

The above parameters are only approximate criteria by which age progression can be determined, under conditions that may not match reality. Many factors can affect a person and their face as they age: environmental conditions, food consumption, exercise, daily routine and many other factors influence the looks as well as the overall personality of a human. So, it is hard to predict the actual face after some years.

CHAPTER 5

RESULTS AND DISCUSSION

5.1 SIMULATION UNDER MATLAB GUI

We encountered several problems in the downloaded images, such as face misalignment, multiple faces and non-face images; these problems were solved using the Active Shape Model and advanced image-processing analysis algorithms. We use the core system module components to extract the aging information.

 The success of any Face detection framework depends on the availability of data, so data collection is an extremely laborious and important task. An ideal dataset should cover a wide range of ages, include many different subjects, and contain at least one image for each age of each subject.

 For better Face detection results, facial attribute decomposition plays an important role, because a face image shows multiple facial attributes: identity, expression, gender, age, race, pose, etc. Decomposition of these facial attributes is essential to extract age-related features.

 People rely on multiple cues to estimate other people's age, such as face, voice, gait and hair. Combining the face with one or more other cues for Face detection might remarkably improve the current performance.

5.2 STEPS FOR FACE DETECTION

 Create a cascade detector object.

 Read a video frame and run the face detector.

 Draw the returned bounding box around the detected face.

 Convert the first box into a list of 4 points; this is needed to be able to visualize the rotation of the object.

 Detect feature points in the face region and display the detected points.

 Create a point tracker and enable the bidirectional error constraint to make it more robust in the presence of noise and clutter.

 Initialize the tracker with the initial point locations and the initial video frame.

 Make a copy of the points to be used for computing the geometric transformation between the points in the previous and the current frames.

 Get the next frame and track the points; note that some points may be lost.

 Estimate the geometric transformation between the old points and the new points and eliminate outliers.

 Apply the transformation to the bounding box points.

 Insert a bounding box around the object being tracked and display the tracked points.

 Reset the points.

 Display the annotated video frame using the video player object.

 Clean up.
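The steps above follow MATLAB's standard Viola-Jones plus KLT face-tracking recipe; a condensed sketch using the Computer Vision Toolbox is given below ('faces.avi' is a placeholder file name):

faceDetector = vision.CascadeObjectDetector();        % cascade detector object
v          = VideoReader('faces.avi');
videoFrame = readFrame(v);                            % read a video frame
bbox       = step(faceDetector, videoFrame);          % run the face detector

bboxPoints = bbox2points(bbox(1, :));                 % box -> 4 corner points
points = detectMinEigenFeatures(rgb2gray(videoFrame), 'ROI', bbox(1, :));

pointTracker = vision.PointTracker('MaxBidirectionalError', 2);  % robustness
initialize(pointTracker, points.Location, videoFrame);
oldPoints = points.Location;                          % copy for the transform

videoFrame = readFrame(v);                            % get the next frame
[points, isFound] = step(pointTracker, videoFrame);   % some points may be lost
visiblePoints = points(isFound, :);
oldInliers    = oldPoints(isFound, :);

% Estimate the old->new similarity transform; RANSAC eliminates outliers
[xform, oldInliers, visiblePoints] = estimateGeometricTransform( ...
    oldInliers, visiblePoints, 'similarity', 'MaxDistance', 4);
bboxPoints = transformPointsForward(xform, bboxPoints);

bboxPolygon = reshape(bboxPoints', 1, []);            % draw box and points
videoFrame  = insertShape(videoFrame, 'Polygon', bboxPolygon, 'LineWidth', 2);
videoFrame  = insertMarker(videoFrame, visiblePoints, '+', 'Color', 'white');
setPoints(pointTracker, visiblePoints);               % reset the points
imshow(videoFrame);                                   % display annotated frame
release(pointTracker);                                % clean up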

5.3 ALGORITHM FOR FACE RECOGNITION

The match metrics use a difference equation with the general form:

d_p(x, y) = \left( \sum_{i=1}^{n} \left| x_i - y_i \right|^p \right)^{1/p}

where \ell_n^p denotes the metric space (\mathbb{R}^n, d_p) for n \ge 1.

5.3.1 Sum of Absolute Differences (SAD)

This metric is also known as the Taxicab or Manhattan distance metric. It sums the absolute values of the differences between pixels in the original image and the corresponding pixels in the template image. This metric is the \ell^1 norm of the difference image. The lowest SAD score estimates the best position of the template within the search image. The general SAD distance metric becomes:

d_1(I, T) = \sum_{i=1}^{n} \left| I_i - T_i \right|

5.3.2 Sum of Squared Differences (SSD)

This metric is also known as the Euclidean distance metric. It sums the squares of the differences between pixels in the original image and the corresponding pixels in the template image. This metric is the square of the \ell^2 norm of the difference image. The general SSD distance metric becomes:

d_2(I, T)^2 = \sum_{i=1}^{n} \left( I_i - T_i \right)^2

5.3.3 Maximum Absolute Difference (MaxAD)

This metric is also known as the Uniform distance metric. It takes the maximum of the absolute values of the differences between pixels in the original image and the corresponding pixels in the template image. This distance metric provides the \ell^\infty norm of the difference image. The general MaxAD distance metric becomes:

d_\infty(I, T) = \lim_{p \to \infty} \left( \sum_{i=1}^{n} \left| I_i - T_i \right|^p \right)^{1/p}

which simplifies to:

d_\infty(I, T) = \max_{i} \left| I_i - T_i \right|
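For one candidate position, the three metrics can be computed directly; a sketch assuming patch and template are same-sized double gray-level arrays:

d     = patch - template;        % difference image
SAD   = sum(abs(d(:)));          % l1 norm
SSD   = sum(d(:).^2);            % squared l2 norm
MaxAD = max(abs(d(:)));          % l-infinity norm
% The best match is the candidate position with the lowest score.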

5.4 SIMULATION CODE


function varargout = New_Fig(varargin)
% NEW_FIG MATLAB code for New_Fig.fig
% NEW_FIG, by itself, creates a new NEW_FIG or raises the existing
% singleton*.
%
% H = NEW_FIG returns the handle to a new NEW_FIG or the handle to
% the existing singleton*.
%
% NEW_FIG('CALLBACK',hObject,eventData,handles,...) calls the local
% function named CALLBACK in NEW_FIG.M with the given input arguments.
%
% NEW_FIG('Property','Value',...) creates a new NEW_FIG or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before New_Fig_OpeningFcn gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to New_Fig_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%

% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help New_Fig

% Last Modified by GUIDE v2.5 25-May-2017 02:23:46

% Begin initialization code - DO NOT EDIT


gui_Singleton = 1;
gui_State = struct('gui_Name', mfilename, ...
'gui_Singleton', gui_Singleton, ...
'gui_OpeningFcn', @New_Fig_OpeningFcn, ...
'gui_OutputFcn', @New_Fig_OutputFcn, ...
'gui_LayoutFcn', [] , ...
'gui_Callback', []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT

% --- Executes just before New_Fig is made visible.


function New_Fig_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% varargin command line arguments to New_Fig (see VARARGIN)

% Choose default command line output for New_Fig


handles.output = hObject;

% Update handles structure


guidata(hObject, handles);

% UIWAIT makes New_Fig wait for user response (see UIRESUME)


% uiwait(handles.figure1);

% --- Outputs from this function are returned to the command line.
function varargout = New_Fig_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;

% --- Executes on button press in pushbutton1.


function pushbutton1_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton1 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
main

% --- Executes on button press in pushbutton2.


function pushbutton2_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton2 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
ageprogression

% --- Executes on button press in pushbutton3.


function pushbutton3_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton3 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
target

% --- Executes during object creation, after setting all properties.


function pushbutton2_CreateFcn(hObject, eventdata, handles)
% hObject handle to pushbutton2 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles empty - handles not created until after all CreateFcns called

clear all;
close all;
clc;

if ~exist('gabor.mat','file')
fprintf ('Creating Gabor Filters ...');
create_gabor;
end
if exist('net.mat','file')
load net;
else
createffnn
end
if exist('imgdb.mat','file')
load imgdb;

else
IMGDB = loadimages;
end
while (1==1)
choice=menu('Face Detection',...
'Create Database',...
'Initialize Network',...
'Train Network',...
'Test on Photos',...
'Exit');
if (choice ==1)
IMGDB = loadimages;
end
if (choice == 2)
createffnn
end
if (choice == 3)
net = trainnet(net,IMGDB);
end
if (choice == 4)
pause(0.001);
[file_name, file_path] = uigetfile ('*.jpg');
if file_path ~= 0
im = imread ([file_path,file_name]);
try
im = rgb2gray(im);
end
tic
im_out = imscan (net,im);
toc
figure;imshow(im_out,'notruesize');
end
end
if (choice == 5)
clear all;
clc;
close all;
return;
end
end
function Psi = gabor (w,nu,mu,Kmax,f,sig)

% w : Window [128 128]


% nu : Scale [0 ...4];
% mu : Orientation [0...7]
% kmax = pi/2
% f = sqrt(2)
% sig = 2*pi

m = w(1);

n = w(2);
K = Kmax/f^nu * exp(i*mu*pi/8);
Kreal = real(K);
Kimag = imag(K);
NK = Kreal^2+Kimag^2;
Psi = zeros(m,n);
for x = 1:m
for y = 1:n
Z = [x-m/2;y-n/2];
Psi(x,y) = (sig^(-2))*exp((-.5)*NK*(Z(1)^2+Z(2)^2)/(sig^2))*...
(exp(i*[Kreal Kimag]*Z)-exp(-(sig^2)/2));
end
end
function im_out = imscan (net,im)

close all

%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
% PARAMETERS
SCAN_FOLDER = 'imscan/';
UT_FOLDER = 'imscan/under-thresh/';
TEMPLATE1 = 'template1.png';
TEMPLATE2 = 'template2.png';
Threshold = 0.5;
DEBUG = 0;
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

warning off;
delete ([UT_FOLDER,'*.*']);
delete ([SCAN_FOLDER,'*.*']);
if (DEBUG == 1)
mkdir (UT_FOLDER);
mkdir (SCAN_FOLDER);
end

[m n]=size(im);

%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
% First Section
C1 = mminmax(double(im));
C2 = mminmax(double(imread (TEMPLATE1)));
C3 = mminmax(double(imread (TEMPLATE2)));
Corr_1 = double(conv2 (C1,C2,'same'));
Corr_2 = double(conv2 (C1,C3,'same'));
Cell.state = int8(imregionalmax(Corr_1) | imregionalmax(Corr_2));
Cell.state(1:13,:)=-1;

Cell.state(end-13:end,:)=-1;
Cell.state(:,1:9)=-1;
Cell.state(:,end-9:end)=-1;
Cell.net = ones(m,n)*-1;
[LUTm, LUTn] = find(Cell.state == 1);
imshow(im);
hold on
plot(LUTn,LUTm,'.y');pause(0.01);

%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
% Second Section
while (1==1)
[i j] = find(Cell.state==1,1);
if isempty(i)
break;
end
imcut = im(i-13:i+13,j-9:j+8);
Cell.state(i,j) = -1;
Cell.net(i,j) = sim(net,im2vec(imcut));
if Cell.net(i,j) < -0.95
for u_=i-3:i+3
for v_=j-3:j+3
try
Cell.state(u_,v_)=-1;
end
end
end
plot(j,i,'.k');pause(0.01);
continue;
elseif Cell.net(i,j) < -1*Threshold
plot(j,i,'.m');pause(0.01);
continue;
elseif Cell.net(i,j) > 0.95
plot(j,i,'.b');pause(0.01);
for u_=i-13:i+13
for v_=j-9:j+9
try
Cell.state(u_,v_)=-1;
end
end
end
elseif Cell.net(i,j) > Threshold
plot(j,i,'.g');pause(0.01);
elseif Cell.net(i,j) < Threshold
plot(j,i,'.r');pause(0.01);
end
for i_=-1:1
for j_=-1:1
m_=i+i_;

n_=j+j_;
if (Cell.state(m_,n_) == -1 || Cell.net(m_,n_)~=-1)
continue;
end
imcut = im(m_-13:m_+13,n_-9:n_+8);
Cell.net(m_,n_) = sim(net,im2vec(imcut));
if Cell.net(m_,n_) > 0.95
plot(n_,m_,'.b');pause(0.01);
for u_=m_-13:m_+13
for v_=n_-9:n_+9
try
Cell.state(u_,v_)=-1;
end
end
end
continue;
end
if Cell.net(m_,n_) > Threshold
Cell.state(m_,n_) = 1;
plot(n_,m_,'.g');pause(0.01);
if (DEBUG == 1)
imwrite(imcut,[SCAN_FOLDER,'@',int2str(m_),',',int2str(n_),' (',int2str(fix(Cell.net(m_,n_)*100)),'%).png']);
end
else
Cell.state(m_,n_) = -1;
plot(n_,m_,'.r');pause(0.01);
if (DEBUG == 1)
imwrite(imcut,[UT_FOLDER,'@',int2str(m_),',',int2str(n_),' (',int2str(fix(Cell.net(m_,n_)*100)),'%).png']);
end
end
end
end
end

%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
% Third Section
hold off
figure;imshow (Cell.net,[]);
xy_ = Cell.net > Threshold;
xy_ = imregionalmax(xy_);
xy_ = imdilate (xy_,strel('disk',2,4));
[LabelMatrix,nLabel] = bwlabeln(xy_,4);
CentroidMatrix = regionprops(LabelMatrix,'centroid');
xy_ = zeros(m,n);
for i = 1:nLabel
xy_(fix(CentroidMatrix(i).Centroid(2)),...
fix(CentroidMatrix(i).Centroid(1))) = 1;

end
xy_ = drawrec(xy_,[27 18]);
im_out (:,:,1) = im;
im_out (:,:,2) = im;
im_out (:,:,3) = im;
for i = 1:m
for j=1:n
if xy_(i,j)==1
im_out (i,j,1)=0;
im_out (i,j,2)=255;
im_out (i,j,3)=0;
end
end
end

%~~~~~~~~~~~~~~~~~~~~~
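% --- Fragment of the image-database loader (loadimages). The variables
% face_folder, non_face_folder, file_ext, out_max and out_min are assumed
% to be defined elsewhere in the original program.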

if exist('imgdb.mat','file')
load imgdb;
else
IMGDB = cell (3,[]);
end
fprintf ('Loading Faces ');
folder_content = dir ([face_folder,'*',file_ext]);
nface = size (folder_content,1);
for k=1:nface
string = [face_folder,folder_content(k,1).name];
image = imread(string);
[m n] = size(image);
if (m~=27 || n~=18)
continue;
end
f=0;
for i=1:length(IMGDB)
if strcmp(IMGDB{1,i},string)
f=1;
end
end
if f==1
continue;
end
fprintf ('.');
IM {1} = im2vec (image); % ORIGINAL FACE IMAGE
IM {2} = im2vec (fliplr(image)); % MIRROR OF THE FACE
IM {3} = im2vec (circshift(image,1));
IM {4} = im2vec (circshift(image,-1));
IM {5} = im2vec (circshift(image,[0 1]));
IM {6} = im2vec (circshift(image,[0 -1]));
IM {7} = im2vec (circshift(fliplr(image),1));
IM {8} = im2vec (circshift(fliplr(image),-1));

IM {9} = im2vec (circshift(fliplr(image),[0 1]));
IM {10} = im2vec (circshift(fliplr(image),[0 -1]));
for i=1:10
IMGDB {1,end+1}= string;
IMGDB {2,end} = out_max;
IMGDB (3,end) = {IM{i}};
end
end
fprintf ('\nLoading non-faces ');
folder_content = dir ([non_face_folder,'*',file_ext]);
nnface = size (folder_content,1);
for k=1:nnface
string = [non_face_folder,folder_content(k,1).name];
image = imread(string);
[m n] = size(image);
if (m~=27 || n~=18)
continue;
end
f=0;
for i=1:length(IMGDB)
if strcmp(IMGDB{1,i},string)
f=1;
end
end
if f==1
continue;
end
fprintf ('.');
IM {1} = im2vec (image);
IM {2} = im2vec (fliplr(image));
IM {3} = im2vec (flipud(image));
IM {4} = im2vec (flipud(fliplr(image)));
for i=1:4
IMGDB {1,end+1}= string;
IMGDB {2,end} = out_min;
IMGDB (3,end) = {IM{i}};
end
end
fprintf('\n');
save imgdb IMGDB;
fprintf('\n This program detects a target in an image')
fprintf('\n Entering the image for MATLAB...')
fprintf('\n Save the image or its copy in MATLAB working Directory')
imagname = input('\n Enter the name of the image file (filename.ext) : ','s');
w = imread(imagname);
w = im2double(w);
sizw = size(w);
figure
imshow(w)
title('Input Image')

pause(3.5);
close;
fprintf('\n Entering the target image for MATLAB...')
fprintf('\n Save the target image or its copy in MATLAB working Directory')
tarname = input('\n Enter the name of the target image file (filename.ext) : ','s');
t = imread(tarname);
t = im2double(t);
sizt = size(t);
figure
imshow(t)
title('Target Image')
pause(3.5);
close;
ww = rgb2gray(w);
tt = rgb2gray(t);
tedge = edge(tt);
wedge = edge(ww);
out = filter2(tedge,wedge);
o = max(max(out));
output = (1/o)*out;

pixel = find(output == 1);


pcolumn = fix(pixel / sizw(1));
prow = mod(pixel,sizw(1));
rdis = fix(sizt(1)/2);
cdis = fix(sizt(2)/2);
cmin = pcolumn - cdis;
cmax = pcolumn + cdis;
rmin = prow - rdis;
rmax = prow + rdis;
c = [cmin cmin cmax cmax];
r = [rmin rmax rmax rmin];
m = roipoly(ww,c,r);
m = im2double(m);
m = 0.5 * (m + 1);
mask(:,:,1) = m;
mask(:,:,2) = m;
mask(:,:,3) = m;
final = mask .* w;
figure
imshow(final)
title('Result Image')
pause(3.5);
close;
subplot(1,2,1)
imshow(w)
title('Input Image')
subplot(1,2,2)
imshow(final)
title('Result Image')

sav = input('\n Do you like to SAVE Result Image? (y/n) : ','s');
if (sav == 'y')
fprintf('\n You choose to SAVE the Result Image')
naming = input('\n Type the name of the new image file (filename.ext) : ','s');
fprintf('\n Saving ...')
imwrite(final,naming);
fprintf('\n The new file is called %s and it is saved in MATLAB working Directory',naming)
else
fprintf('\n You choose NOT to SAVE the Result Image')
end
clear

5.5 SIMULATION PROCESS WITH EXPLANATION

Fig 5.1: Basic layout designed in MATLAB

Fig 5.2: After clicking the face recognition button, a GUI appears that is part of the training set designed for face and non-face images.

Fig 5.3: Training on samples of different face and non-face images to distinguish between them

Fig 5.4: This layout appears after the face part detection push button has been pressed

This requires one picture containing a full face image, fetched by its file name with extension. Then the target area to be extracted from that image is specified.

Fig 5.5: Test image

Fig 5.6: Target image, which is a part of the above image

Fig 5.7: The Detected part after the execution

Fig 5.8: Result after executing age progression on an image of PM Modi

Fig 5.9: The GUI has slider buttons for the shape blend and color blend, which produce the main aging signs of old age.


Several further directions could improve the proposed framework:

 The Active Shape Model and the advanced image-processing analysis algorithms can be trained on images under pose and illumination variations, because pose and illumination variations are always troublesome in real applications. Pose-invariant and illumination-invariant techniques (intensively investigated in face recognition) might also be introduced into Face detection methods.

 Considering the numerous influential factors, Face detection can hardly be certain. Face detection can therefore be treated as a fuzzy classification problem, e.g., "I am 85% sure that you are 18 years old", rather than using SVM or SVR.

 Application-oriented Face detection techniques may treat the Face detection task as a binary classification problem, since most age-specific access control systems only need to determine whether the user is older than a certain age.

 Human aging patterns can be affected by many internal and external factors such as genetics, race, gender, health, lifestyle, and even weather conditions. Incorporating these influential factors may improve the performance of age estimation.

 Age progression using the bio-inspired features has not yet been investigated. Further research could establish an automatic aging scheme that requires only one input image of a subject and produces an age-progressed or age-regressed image of the person at a target age.

 Paying more attention to face detection algorithms, such as combining the classification and regression methods, might improve the results further, because the choice of Face detection algorithm is always critical and can increase or decrease the performance notably.

Test 1: Age progression face recognition

No. of images (group)   Detected images   Non-detected images
7                       7                 0
15                      13                2
8                       5                 3
62                      52                10

(II) Test 2: Age progression face part detection

Images   Part detected (eye)    Part not detected (eye)
1        Y
2        y

Images   Part detected (nose)   Part not detected (nose)
1        Y
2        y

AGE PROGRESSION-Test 3

When we slide the shape blend control, it changes the shape of the face, indicating that age progression is being applied to the face.

AGE PROGRESSION-Test 4

In this phase, age progression runs from childhood to old age.

CHAPTER 6

CONCLUSION AND FUTURE WORK

6.1 Conclusion

We have developed a fully automatic Face detection and progression framework in this thesis. A three-module architecture is proposed:

1) Core system module;

2) Enhancement module;

3) Application module.

We constructed a new database using the internet as a rich repository for image collection. A large number of images were crawled by an image collector using human age-related queries.

6.2 Core Module

In the core system module, we built the main components of our human Face detection system. We introduced a novel face representation scheme with two main steps: face cropping using the Active Shape Model (ASM), which crops the face image to the area covering the face boundary, followed by feature extraction using the extended bio-inspired features (EBIF).

6.3 Enhancement Module

In this module, we made several improvements to enhance the output of the core system module, such as further analysis of different facial parts to validate their importance, and increasing the number of pictures in the FG-NET aging database using the MORPH dataset.

6.4 Application module

After building the core system and enhancement modules, we demonstrate the
application module that has several components:

 An image collector that crawls images from the internet using human age-related text queries under several conditions, such as image quality, different poses, expressions, and multiple faces versus a single face in the image. This leads to a large database that provides more training images.

 The crawled images suffer from different problems, such as face misalignment and multi-instance faces in the same image with possibly incorrect labels of the image faces. This motivated us to propose different solutions to overcome the above-mentioned problems.

6.5 Future Scope

Face recognition systems used today work very well under constrained conditions,
although all systems work much better with frontal mug-shot images and constant
lighting. All current face recognition algorithms fail under the vastly varying
conditions under which humans need to and are able to identify other people. Next
generation person recognition systems will need to recognize people in real-time and
in much less constrained situations.

We believe that identification systems that are robust in natural environments, in the presence of noise and illumination changes, cannot rely on a single modality, so fusion with other modalities is essential. Technology used in smart environments has to be unobtrusive and allow users to act freely. Wearable systems in particular require their sensing technology to be small, low-powered and easily integrated with the user's clothing. Considering all these requirements, identification systems that use face recognition and speaker identification seem to us to have the most potential for widespread application.

Cameras and microphones today are very small, light-weight and have been successfully integrated with wearable systems. Audio- and video-based recognition systems have the critical advantage that they use the modalities humans use for recognition. Finally, researchers are beginning to demonstrate that unobtrusive audio- and video-based person identification systems can achieve high recognition rates without requiring the user to be in highly controlled environments.

Further, after face recognition and the detection of its several parts using the template matching algorithm, this research enhances the module for age progression of the face. From any stage of a human face we can predict its future and past appearance. In the future, this technique could be used for the automatic update of passport photos in airport databases and of employee databases of government agencies, or for examinations conducted for the government sector.

REFERENCES
[1] Dinesh Chandra Jain (Research Scholar in Computer Science) and V. P. Pawar (Associate Professor, School of Computer Science), SRTM University, Nanded (M.S.), "A Novel Approach for Recognition of Human Face Automatically Using Neural Network Method", Volume 2, Issue 1, January 2012, ISSN: 2277 128X.

[2] Sujata G. Bhele and V. H. Mankar, "A Review Paper on Face Recognition Techniques", International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 1, Issue 8, October 2012, ISSN: 2278-1323.

[3] Mamta Dhanda (Assistant Professor, Computer Department, Seth Jai Prakash Mukund Lal Institute of Engineering and Technology, Radaur), "Face Recognition Using Eigenvectors from Principal Component Analysis", International Journal of Advanced Engineering Research and Studies, E-ISSN 2249-8974.

[4] M. Nandini, P. Bhargavi and G. Raja Sekhar (Department of EConE, Sree Vidyanikethan Engineering College), "Face Recognition Using Neural Networks", International Journal of Scientific and Research Publications, Volume 3, Issue 3, March 2013, ISSN 2250-3153.

[5] Michel F. Valstar, Timur Almaev, Jeffrey M. Girard, Gary McKeown, Marc Mehu, Lijun Yin, Maja Pantic and Jeffrey F. Cohn, "FERA 2015 – Second Facial Expression Recognition and Analysis Challenge", 978-1-4799-6026-2/15/$31.00 ©2015 IEEE.

[6] Marian Stewart Bartlett, Member, IEEE, Javier R. Movellan, Member, IEEE, and
Terrence J. Sejnowski, Fellow, IEEE: “Face Recognition by Independent Component
Analysis” IEEE Transactions On Neural Networks, Vol. 13, No. 6, November 2002.

[7]Volker Blanz and Thomas Vetter, Member, IEEE “Face Recognition Based on
Fitting a 3D Morphable Model” IEEE Transaction On Pattern Analysis And Machine
Intelligence, Vol. 25, No. 9, September 2003.

[8] Juwei Lu, Student Member, IEEE, Konstantinos N. Plataniotis, Member, IEEE,
and Anastasios N. Venetsanopoulos, Fellow, IEEE, "Face Recognition Using Kernel Direct Discriminant Analysis Algorithms", IEEE Transactions on Neural Networks,
Vol. 14, No. 1, January 2003.

[9] Wu Xiao-Jun, Josef Kittler, Yang Jing-Yu, Kieron Messer and Wang Shi-Tong (Department of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, China), "A New Kernel Direct Discriminant Analysis (KDDA) Algorithm for Face Recognition".

[10] Juwei Lu, K.N. Plataniotis, A.N. Venetsanopoulos Bell Canada Multimedia
Laboratory, The Edward S. Rogers Sr. Department of Electrical and Computer
Engineering University of Toronto, Toronto, M5S 3G4, ONTARIO, CANADA:
“Face Recognition Using Kernel Direct Discriminant Analysis Algorithms” IEEE
Transactions on Neural Networks in December 12, 2001. Revised and re-submitted in
July 16, 2002. Accepted for publication in August 1, 2002

[11] Felix Juefei-Xu, Khoa Luu, Marios Savvides, Tien D. Bui and Ching Y. Suen, "Investigating Age Invariant Face Recognition Based on Periocular Biometrics", 978-1-4577-1359-0/11/$26.00 ©2011 IEEE.

[12] Guodong Guo, Guowang Mu, Yun Fu (BBN Technologies) and Thomas S. Huang (UIUC), "Human Age Estimation Using Bio-inspired Features", 978-1-4244-3991-1/09/$25.00 ©2009 IEEE.

[13] Thomas Serre, Lior Wolf and Tomaso Poggio, "Object Recognition with Features Inspired by Visual Cortex", Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA 02142, {serre,liorwolf}@mit.edu, tp@ai.mit.edu.

[14] Anjali and Asst. Prof. Arun Kumar, "Face Recognition and Age Progression Using AAIA Algorithm", International Journal for Technological Research in Engineering, Volume 2, Issue 11, July 2015.

[15] Young H. Kwon and Niels da Vitoria Lobo, "Age Classification from Facial Images", School of Computer Science, University of Central Florida, Orlando, Florida 32816. Received May 26, 1994; accepted September 6, 1996.

[16] Asuman Günay and Vasif V. Nabiyev, "Age Estimation Based on AAM and 2D-DCT Features of Facial Images", (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 6, No. 2, 2015.

[17] Xin Geng, Kate Smith-Miles and Zhi-Hua Zhou, "Facial Age Estimation by Learning from Label Distributions", supported by ARC (DP0987421), NSFC (60905031, 60635030), JiangsuSF (BK2009269), 973 Program (2010CB327903) and the Jiangsu 333 Program.

[18] Pramod Kumar Pisharady and Martin Saerbeck “Pose Invariant Face Recognition
Using Neuro-Biologically Inspired Features” International Journal of Future
Computer and Communication, Vol. 1, No. 3, October 2012.

[19] Xin Geng, Senior Member, IEEE, and Kate Smith-Miles, Senior Member, IEEE, "Automatic Age Estimation Based on Facial Aging Patterns", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 12, December 2007.

[20] N. Ramanathan and R. Chellappa, "Face Verification Across Age Progression", IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3349-3361, 2006.

[21] B. Ni, Z. Song, and S. Yan, "Web Image Mining Towards Universal Age Estimator", in Proceedings of ACM Multimedia, pp. 85-94, 2009.

[21] N. Hewahi, A. Olwan, N. Tubeel, S. El-Asar and Z. Abu-Sultan, "Face detection based on Neural Networks using Face Features", Journal of Emerging Trends in Computing and Information Sciences, vol. 1, no. 2, pp. 61-67, 2010.

[22] A. Stone, “The Aging Process of the Face & Techniques of Rejuvenation”,
http://www.aaronstonemd.com/Facial Aging Rejuvenation.shtm

[23] A. M. Albert, K. Ricanek and E. Patterson, "A Review of the Literature on the Aging Adult Skull and Face: Implications for Forensic Science Research and Applications", Forensic Science International, vol. 172, no. 1, pp. 1-9, 2007.

[24] L. A. Zebrowitz, Reading Faces: Window to the Soul?, Westview Press, 1997.

[25] A. Zhi-Hua Zhou, C. Taylor, and T. Cootes, “Toward Automatic Simulation of
Aging Effects on Face Images”. In Proceedings of IEEE Transactions on PAMI, vol.
24, no. 4, pp. 442-455, 2002.

[26] T. Cootes, G. Edwards, and C. Taylor, “Active Appearance Models”. In


Proceedings of European Conference on Computer Vision, vol. 2, pp. 484-498, 1998.
