Age Estimation From Facial Images: Bachelor of Technology Electronics and Communication Engineering

A PROJECT REPORT ON
AGE ESTIMATION FROM FACIAL IMAGES

A Project submitted in partial fulfilment of the requirement for the award of the
degree of
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
BHAVANA ATLURI (17071A0466)

SHARANYA GADDAM (17071A0477)
PRASHANTH JULURU (17071A0484)
JANAKI KAHAR (18075A0417)
UNDER THE GUIDANCE OF

Mr. V SAGAR REDDY
Assistant Professor
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

VNR VIGNANA JYOTHI INSTITUTE OF ENGINEERING
&TECHNOLOGY, NIZAMPET, PRAGATHI NAGAR,
HYDERABAD, TELANGANA 500090
2017-2021
VNR VIGNANA JYOTHI INSTITUTE OF ENGINEERING & TECHNOLOGY,
NIZAMPET, PRAGATHI NAGAR, HYDERABAD, TELANGANA 500090
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
CERTIFICATE
This is to certify that the Major project report entitled “Age Estimation from Facial Images” is being submitted by A. BHAVANA (17071A0466), G. SHARANYA (17071A0477), J. PRASHANTH (17071A0484), K. JANAKI (18075A0417) in partial fulfilment of the requirements for the degree of Bachelor of Technology in Electronics and Communication Engineering during the academic year 2017-2021.
Certified further, to the best of our knowledge, the work reported here is not a part
of any other project based on which a degree or an award has been given on an earlier
occasion to any other candidate. The result has been verified and found to be satisfactory.
Internal Guide External Examiner
Mr.V SAGAR REDDY

Assistant Professor
Head of the Department, ECE
DR.Y. PADMA SAI

Professor
Table of Contents
1 INTRODUCTION
1.1 OVERVIEW OF THE PROJECT
1.2 OBJECTIVE
1.3 MOTIVATION
1.4 TOOLS USED
2 CONVOLUTIONAL NEURAL NETWORK
2.1 INTRODUCTION
2.2 STRUCTURE OF CNN
2.2.1 CONVOLUTIONAL LAYER
2.2.2 POOLING LAYER
2.2.3 ACTIVATION FUNCTION
2.2.4 LOSS LAYER
2.3 SUMMARY
3 HAAR CASCADE FRONTAL FACE
3.1 INTRODUCTION
3.2 FEATURES
3.3 INTEGRAL IMAGE
3.4 ATTENTIONAL CASCADE
3.5 SUMMARY
4 METHODOLOGY
4.1 BLOCK DIAGRAM
4.2 IMAGE CAPTURE USING OPENCV
4.3 FACE DETECTION
4.4 MODEL
4.4.1 PRE-PROCESSING
4.4.2 MODEL DEFINITION
4.4.2.1 REGRESSION ACCURACY METRICS
4.4.3 MODEL TRAINING
4.5 PREDICTION
5 APPLICATION USING TKINTER
5.1 INTRODUCTION
5.2 IMPLEMENTATION
CONCLUSION
REFERENCES


CHAPTER 1
INTRODUCTION
1.1 OVERVIEW OF THE PROJECT
Age estimation from facial images using convolutional neural networks (CNNs) has attracted growing attention in recent times. Facial images contain biometric traits. Age estimation, gender classification, and expression recognition are widely studied research directions based on facial images. Age estimation can be used in biometrics, content access control, security, and surveillance.
Traditional methods of age estimation first design features manually, extract those features, and then perform age estimation. The convolutional neural network (CNN), a deep learning concept, has outstanding capabilities in image processing. This model is better than traditional models, and the final age estimation accuracy is increased.
• A human face image contains many attributes, including identity, age, gender, and emotional expressions. Age is one of the more important attributes among them. A human face image can be regarded as a complex signal composed of attributes including skin tone and many other features. Age estimation is a difficult and challenging task. These features are useful in analysing images in real-time applications. For example, automatic age estimation of people can be used for system access control and also in intelligence gathering.
• Automatic age estimation from facial images is a challenging problem in computer vision and image analysis, as the ageing process is affected by various factors such as gender, ethnicity, environment, and so on.
• Also, estimating a near-accurate age from facial images requires a large amount of data and a long training period. In this report, we propose an age estimator based on the convolutional neural network that can predict age from facial images accurately.
1.2 OBJECTIVE
⮚ The main objective is to achieve acceptable accuracy using Convolutional Neural Networks.
⮚ Make it usable for applications such as social networks, advertising applications, and surveillance.
1.3 MOTIVATION
Age estimation is an essential ability needed for smooth communication between people. A proper understanding of age tends to make communication between people more effective. Human age estimation from facial images is an important research topic because of its wide use, ranging from surveillance monitoring to forensic purposes and social networking platforms.
It is a useful technique for:
● Age-specific human-computer interaction (it caters to the preferences of all ages)
● Age-specific access control (for the safety of minors)
● Law enforcement (for security and surveillance purposes)
● Multi-cue identification (through face/fingerprint/iris + age identification)
● Age-based indexing of face images (photo albums)
This technique is used in Internet safety for minors, cigarette vending machines, and age-specific shopping HCI.
1.4 TOOLS USED
The major contributions of this model are:
We presented a new CNN structure for extracting ageing characteristics from facial photos, allowing us to extract both local and global ageing indicators. We also developed a new label encoding strategy for converting discrete age labels into a continuous probability vector, which increased the CNN structure's performance. Our suggested approach achieves state-of-the-art results on the age estimation problem.
We proposed an entirely different structure to handle the task of depth estimation from a single image. In this framework, we could effectively convert the regression task into a classification task by applying the label encoding method, which improved the performance of our CNN structure by a large margin. In addition, we applied surface normal constraints in the depth refining stage, which improved the accuracy of the depth estimate. Our proposed structure achieved promising performance on the depth estimation from a single image task.
Our proposed label encoding method showed strong potential to improve the performance of CNNs without significantly changing the architecture. This could become an interesting research direction for CNNs.


CHAPTER 2
BACKGROUND ON CONVOLUTIONAL NEURAL NETWORK
2.1 INTRODUCTION
A Convolutional Neural Network (CNN) is a kind of feed-forward artificial neural network in which the individual neurons are tiled so that they respond to overlapping regions in the visual field. The CNN is a biologically inspired variant. From the early work of Hubel and Wiesel on the cat's visual cortex in 1968, it is known that the visual cortex contains a complex arrangement of neurons. These neurons are sensitive to small sub-regions of the visual field, called receptive fields. The sub-regions are tiled to cover the entire visual field. These neurons act as local filters over the input space and are well suited to exploit the strong spatially local correlation present in natural images.
Fukushima [1] first proposed computational models based on these local connectivities between neurons and on hierarchically organized transformations of the image. In this work, he found that when neurons with the same parameters are applied to patches of the previous layer at different locations, a form of translational invariance is achieved. Following this idea, LeCun [2] designed and trained CNNs using the error gradient with the back-propagation algorithm, obtaining state-of-the-art performance on many pattern recognition tasks. One of the main contributions of CNNs to the neural network field is the use of weight sharing, applying the same weights across the same feature map, which increases learning efficiency by greatly reducing the number of trainable parameters.
More recently, activation functions and the dropout algorithm were introduced into CNNs, which increases the non-linearity and independence of feature maps and in turn leads to higher-level and more stable learned features.
2.2 STRUCTURE OF CNN
There are two main layer types in a CNN: convolutional layers and pooling layers. In addition, as CNNs have developed, activation functions and dropout have been added in order to increase the performance of the CNN. Moreover, in the last layer of a CNN, different loss layers are chosen according to the type of task. We will briefly explain all the components we implemented in this work.
2.2.1 CONVOLUTIONAL LAYER
The convolutional layer is the core building block of a CNN and is what distinguishes a CNN from a traditional artificial neural network. To avoid having to learn billions of parameters (as would happen if all layers were fully connected), the convolution operation is applied to small regions instead. One major advantage of convolutional networks is weight sharing in the convolutional layers, which means applying the same filter across the same feature map. Weight sharing helps to reduce the required computing memory and to improve CNN performance on computer vision tasks. In addition, by reducing the number of trainable parameters, it alleviates the overfitting problem of traditional neural networks.
The parameters of a convolutional layer consist of a set of learnable filters, each of which is small spatially. During the forward pass, each filter is convolved across the width and height of the input volume, producing a 2-dimensional activation map of that filter. The network learns filters that are activated by specific types of features at specific positions in the input, which is the same role that convolution plays in traditional hand-designed feature algorithms: extracting essential features from the inputs. Stacking these activation maps for all filters along the depth dimension forms the full output volume. With the help of weight sharing, the number of filters learnt in convolutional layers can be increased, which enables the extraction of more information from the input data.
In a convolutional layer, a feature map is obtained by repeated application of a function across sub-regions of the entire image, in other words, by convolution of the input image with a filter, followed by adding a bias term. If we denote the k-th input feature map of a given convolutional layer as $h_k$, whose filter is $W_k$ and bias is $b_k$, then the output feature map $h_{k+1}$ of this convolutional layer is obtained as:
$$h_{k+1} = h_k * W_k + b_k \tag{2.1}$$
where $*$ denotes the convolution operation.
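To make Equation (2.1) concrete, the following is a minimal NumPy sketch of a single-channel convolution with no padding, followed by a bias. The function name `conv2d_valid` and the example input and filter are illustrative assumptions, not part of the project code. Like most deep learning libraries, the sketch slides the filter without flipping it (cross-correlation).

```python
import numpy as np

def conv2d_valid(h, w, b=0.0):
    """Slide filter w over feature map h (no padding, stride 1) and add bias b."""
    fh, fw = w.shape
    out_h = h.shape[0] - fh + 1
    out_w = h.shape[1] - fw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # The same weights w are reused at every position (weight sharing)
            out[i, j] = np.sum(h[i:i + fh, j:j + fw] * w) + b
    return out

# Hypothetical 4x4 input with a vertical step edge, and a 2x2 edge filter
h_k = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]], dtype=float)
W_k = np.array([[-1, 1],
                [-1, 1]], dtype=float)

h_next = conv2d_valid(h_k, W_k, b=0.0)
print(h_next)  # each row is [0, 2, 0]: the filter responds only at the edge
```

The filter has only four trainable weights regardless of the size of the input, which is the parameter saving that weight sharing provides.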
The fully connected layer is a special case of the convolutional layer. A fully connected layer takes all neurons in the previous layer and connects them to every neuron it has, which corresponds to a convolutional layer without weight sharing in which each filter has size 1 x 1. The main function of the fully connected layer is to collapse the spatial arrangement of neurons and form high-level features.
2.2.2 POOLING LAYER
A pooling layer is usually inserted periodically between convolutional layers in a CNN architecture. The function of a pooling layer is to reduce the resolution of the feature maps, thereby achieving spatial invariance as well as reducing overfitting. In a pooling layer, each pooled feature map corresponds to one feature map of the previous layer. A small n x n patch is used to combine units of the feature map, thereby creating position invariance over larger neighbourhoods. At the same time, it down-samples the input by a factor of n along each direction.
There are two main pooling layers used in CNNs: sub-sampling pooling and max pooling.
Although average pooling was commonly used in traditional CNNs, max pooling has shown better performance in experiments and is widely used in recent CNN architecture designs. Scherer et al. [3] conducted a study comparing the performance of max pooling and sub-sampling on object recognition tasks. The results show that max pooling is superior to sub-sampling for capturing invariances in image-like data. In addition, max pooling enables a faster convergence rate by selecting superior invariant features, which improves generalization performance, and it reduces the number of trainable parameters, thereby reducing computation and computing time and resulting in better efficiency during training.
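As an illustration of the down-sampling described above, here is a minimal NumPy sketch of non-overlapping n x n max pooling. The helper name `max_pool2d` and the example feature map are assumptions made for this sketch only.

```python
import numpy as np

def max_pool2d(fmap, n=2):
    """Non-overlapping n x n max pooling; rows/columns that do not fill a block are dropped."""
    rows, cols = fmap.shape[0] // n, fmap.shape[1] // n
    trimmed = fmap[:rows * n, :cols * n]
    # Group pixels into n x n blocks, then take the maximum of each block
    blocks = trimmed.reshape(rows, n, cols, n)
    return blocks.max(axis=(1, 3))

# Hypothetical 4x4 feature map
fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 7, 2],
                 [6, 2, 3, 3]])

pooled = max_pool2d(fmap, n=2)
print(pooled)  # [[4 5]
               #  [6 7]]
```

A 4 x 4 map becomes 2 x 2, so the resolution is halved in each direction while the strongest response in every neighbourhood is kept.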
2.2.3 ACTIVATION FUNCTION
In a CNN, one of the most important factors is the choice of activation function. Activation functions increase the non-linearity of the network, which leads to a higher-level understanding of the input data. In addition to non-linearity, an activation function also produces a feature map without extreme values, which increases the independence of neurons in the next layer and in turn increases the stability of the whole network.
The three most commonly used activation functions in CNNs are presented here. One important common requirement is that all activation functions should be differentiable (ReLU is differentiable everywhere except at zero), which makes it possible to use the back-propagation algorithm during training.

Sigmoid Function:
The sigmoid function is defined as:
$$\mathrm{sigm}(x) = \frac{1}{1 + e^{-x}}$$
As can be seen from the figure of the sigmoid function, it takes a real-valued number and squashes it into the range between 0 and 1. In particular, large negative numbers tend to 0 and large positive numbers tend to 1. The sigmoid function has been frequently used because it has a nice interpretation as the firing rate of a neuron: from not firing at all (0) to fully saturated firing at an assumed maximum frequency (1).

Although the sigmoid function has been widely used, recent research shows that its non-linearity performs poorly in some training situations, since it has two major drawbacks.
First, the sigmoid function saturates easily and kills gradients during training. An undesirable property of the sigmoid neuron is that when the neuron's activation saturates at either tail, 0 or 1, the gradient in these regions is almost zero. During back-propagation, this local gradient is multiplied by the gradient of this gate's output with respect to the whole objective. Therefore, if the local gradient is very small, it effectively kills the gradient, and almost no signal flows through the neuron to its weights and, recursively, to its inputs. In addition, one must take extra care when initializing the weights of sigmoid neurons to prevent saturation. For example, if the initial weights are too large, most neurons become saturated and the network barely learns.
Second, the outputs of the sigmoid function [4] are not zero-centred. This is undesirable since neurons in the following layer receive data that is not zero-centred. This affects the dynamics of gradient descent: if the data coming into a neuron is always positive (for example, x > 0), then the gradients on the weights during back-propagation will be either all positive or all negative (depending on the gradient of the whole expression). This can introduce undesirable zig-zagging dynamics in the weight updates. However, when these gradients are added up across a batch of data, the final update for the weights can have variable signs, which mitigates the issue. Hence, this is an inconvenience, but it is less severe than the saturated activation problem.
Tanh Function:
The tanh function is defined as:
$$\tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}$$
Like the sigmoid function, the tanh function squashes a real-valued number into the range [-1, 1] in a non-linear way. Although its output is zero-centred, which avoids the zig-zagging dynamics in the weight updates, it still suffers from the saturated activation problem. Thus, although the tanh function is an improvement over the sigmoid function, it still does not perform well.
Rectified Linear Unit (ReLU):
ReLU is the most widely used activation function in CNNs today. It is defined as:
$$f(x) = \max(0, x)$$
Below is the visualization of the ReLU function.
There are two main advantages of the ReLU function. First, compared with the sigmoid and tanh functions, which involve expensive operations such as exponentials, ReLU can be implemented by simply thresholding at zero. As a result, a CNN with ReLU trains several times faster than its counterparts with tanh and sigmoid functions [5]. Second, ReLU does not suffer from saturation, which means the CNN does not need much pre-training.
However, the ReLU function also has a drawback: ReLU units can die during training. For example, a ReLU unit can irreversibly die and never activate on any data point again if it gets knocked off the data manifold because the learning rate is set too high. With a proper setting of the learning rate, however, this is less frequently an issue.
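The saturation behaviour discussed in this subsection can be checked numerically. The sketch below, written with NumPy and using helper names chosen only for illustration, evaluates the gradients of the three activation functions at a large negative input, at zero, and at a large positive input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2, at most 1
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Sub-gradient of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

x = np.array([-10.0, 0.0, 10.0])
print("sigmoid gradient:", sigmoid_grad(x))  # nearly 0 at both tails, 0.25 at x = 0
print("tanh gradient:   ", tanh_grad(x))     # nearly 0 at both tails, 1 at x = 0
print("ReLU gradient:   ", relu_grad(x))     # 0 for x <= 0, 1 for x > 0
```

The sigmoid and tanh gradients vanish at both tails, whereas the ReLU gradient stays at 1 for every positive input. The zero gradient of ReLU for non-positive inputs is what allows a unit to "die" when its input stays negative.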
2.2.4 LOSS LAYER
Different loss functions are chosen for different tasks in a CNN. In this subsection, we mainly present two commonly used loss functions: Softmax loss and Euclidean loss.
Euclidean Loss:
Euclidean loss is used for real-valued regression tasks. Since it is for a single real value, the last layer of a CNN with Euclidean loss has size 1x1. The mathematical form of the Euclidean loss is shown below:
$$L = \frac{1}{2N} \sum_{i=1}^{N} \lVert \hat{d}_i - d_i \rVert_2^2$$
where $\hat{d}_i$ denotes the regressed outputs, $d_i$ denotes the target outputs, and $N$ denotes the number of outputs.
Softmax Loss:
Softmax loss is used for predicting a single class out of K mutually exclusive classes and outputs a probability vector of size 1xK, in which all elements sum to one. Its functional form is as follows:
$$L = -\sum_{j} y_j \log p_j$$
where $y_j$ is the ground-truth label ($y_j = 1$ when the target belongs to the j-th class, otherwise $y_j = 0$) and $p_j$ denotes the predicted probability that the input belongs to the j-th class. The predicted probability vector is computed as:
$$p_j = \frac{e^{O_j}}{\sum_{k} e^{O_k}}$$
where $O_j$ denotes the output at the j-th position of the last layer of the CNN. Although Softmax loss is designed for classification tasks, it can also be applied to regression tasks. In this thesis, we applied Softmax loss to two regression tasks: age estimation and single-image depth prediction.
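The two loss functions above can be written in a few lines of NumPy. This is a minimal sketch with illustrative function names and made-up numbers, intended only to show how the formulas are evaluated. Subtracting the maximum inside `softmax` is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def euclidean_loss(d_hat, d):
    """L = 1/(2N) * sum_i (d_hat_i - d_i)^2 for N scalar outputs."""
    d_hat = np.asarray(d_hat, dtype=float)
    d = np.asarray(d, dtype=float)
    return np.sum((d_hat - d) ** 2) / (2 * d.size)

def softmax(o):
    """Convert raw outputs O_j into probabilities p_j that sum to one."""
    o = np.asarray(o, dtype=float)
    e = np.exp(o - np.max(o))  # shifting by the max avoids overflow
    return e / e.sum()

def softmax_loss(o, true_class):
    """-sum_j y_j log p_j with one-hot y reduces to -log p[true_class]."""
    return -np.log(softmax(o)[true_class])

# Hypothetical regressed ages versus target ages
print(euclidean_loss([25.0, 40.0], [23.0, 44.0]))  # (4 + 16) / 4 = 5.0

# Three classes with equal raw outputs: each probability is 1/3
print(softmax([1.0, 1.0, 1.0]))
print(softmax_loss([1.0, 1.0, 1.0], true_class=0))  # log(3)
```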
2.3 SUMMARY
The cornerstone of CNN, which is a self-learning structure, was presented in this chapter. This learning capacity grows as the network's non-linearity grows, which can be achieved by increasing the number of layers in the network. As a result, one CNN research direction is to increase the number of network layers while keeping the size of the feature maps at a reasonable scale. Furthermore, a sensible shape of the CNN structure is important, and a design with a pyramid shape regularly produces better results than structures with other shapes. As a result, improving the shape of the CNN structure by encoding output labels is also a possible research path.
In the two parts that follow, we show two of our proposed structures, for the age estimation from a single image task and the depth estimation from a single image task, respectively. Both of these proposed structures are CNN-based.


CHAPTER 3
HAAR CASCADE FRONTAL FACE ALGORITHM
3.1 INTRODUCTION
Face detection is a widely popular subject with a huge range of applications. Modern smartphones and laptops come with built-in face detection software, which can verify the identity of the user. There are numerous applications that can capture, detect, and process a face in real time, can identify the age and the gender of the user, and can also apply some really cool filters. The list is not limited to these mobile applications, as face detection also has a wide range of applications in surveillance, security, and biometrics. The origin of its success stories, however, dates back to 2001, when Viola and Jones proposed the first-ever object detection framework for real-time face detection in video footage.
So, what is Haar Cascade? It is an object detection algorithm used to identify faces in an image or a real-time video. The algorithm uses edge or line detection features proposed by Viola and Jones in their research paper “Rapid Object Detection using a Boosted Cascade of Simple Features” [6], published in 2001. The algorithm is given a lot of positive images consisting of faces, and a lot of negative images not consisting of any face, to train on.
3.2 FEATURES
[Figure: Haar features (a), (b), (c), (d)]
The first contribution of the research was the introduction of the Haar features shown above. These features make it easy to find the edges or the lines in the image, or to pick areas where there is a sudden change in the intensities of the pixels.
A sample calculation of the Haar value from a rectangular image section is shown here. The darker areas in the Haar feature are pixels with value 1, and the lighter areas are pixels with value 0. Each of these features is responsible for finding one particular characteristic in the image, such as an edge, a line, or any structure in the image where there is a sudden change of intensities. For example, in the image above, the Haar feature can detect a vertical edge with darker pixels on its right and lighter pixels on its left.
The objective here is to find the sum of all the image pixels lying in the darker area of the Haar feature and the sum of all the image pixels lying in the lighter area of the Haar feature, and then to find their difference. If the image has an edge separating dark pixels on the right from light pixels on the left, then the Haar value will be closer to 1. That means we say that an edge is detected if the Haar value is closer to 1. In the example above, there is no edge, as the Haar value is far from 1.
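A small numerical sketch may help here. It assumes pixel values normalised to the range 0 to 1 with 1 meaning dark (the convention used in the text), and it compares the mean of the dark region with the mean of the light region so that the result stays between -1 and 1. That normalisation and the function name `haar_edge_value` are assumptions of this sketch.

```python
import numpy as np

def haar_edge_value(patch):
    """Two-rectangle vertical-edge feature.

    The left half of the patch is the 'light' region and the right half is
    the 'dark' region. Returns mean(dark) - mean(light), which lies in [-1, 1]
    for pixel values in [0, 1].
    """
    half = patch.shape[1] // 2
    light = patch[:, :half]
    dark = patch[:, half:2 * half]
    return dark.mean() - light.mean()

# Ideal edge: light pixels (0) on the left, dark pixels (1) on the right
ideal_edge = np.array([[0, 0, 1, 1]] * 4, dtype=float)
# Uniform grey patch: no edge at all
flat = np.full((4, 4), 0.5)

print(haar_edge_value(ideal_edge))  # 1.0 -> edge detected
print(haar_edge_value(flat))        # 0.0 -> no edge
```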
The Haar feature continuously traverses the image from the top left to the bottom right to search for the particular feature. The figure is only a representation of the whole idea of Haar feature traversal. In actual operation, the Haar feature would traverse the image pixel by pixel. Also, all possible sizes of the Haar features are applied.
Now, traversing the Haar features over an image involves a lot of mathematical calculations. As we can see, a single rectangle on either side involves 18 pixel-value additions (for a rectangle enclosing 18 pixels). Imagine doing this for the whole image with all sizes of the Haar features. This would be a hectic operation even for a high-performance machine.

3.3 INTEGRAL IMAGE
To tackle this, Viola and Jones introduced another concept known as the Integral Image to perform the same operation. An Integral Image is calculated from the original image in such a way that each pixel in it is the sum of all the pixels lying to its left and above it in the original image. The last pixel at the bottom right corner of the Integral Image is therefore the sum of all the pixels in the original image.
With the Integral Image, only 4 constant-value additions are needed each time, for any feature size (compared with the 18 additions earlier). This reduces the time complexity of each summation considerably, as the number of additions no longer depends on the number of pixels enclosed.
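The four-value look-up can be sketched with NumPy as follows. The function names are illustrative, and the extra leading row and column of zeros is a convenience chosen for this sketch so that rectangles touching the image border need no special case.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a leading row and column of zeros,
    so that ii[r, c] equals the sum of img[:r, :c]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle from only four look-ups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(1, 17).reshape(4, 4)  # hypothetical image with pixel values 1..16
ii = integral_image(img)

print(ii[-1, -1])                 # 136, the sum of all pixels
print(rect_sum(ii, 1, 1, 2, 2))   # 6 + 7 + 10 + 11 = 34
```

The bottom-right entry equals the sum of the whole image, as stated above, and the cost of `rect_sum` does not grow with the rectangle size.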
3.4 ATTENTIONAL CASCADE
Now comes the cascading part. The selected subset of all 6000 features will again run on the training images to detect whether a facial feature is present or not. The authors took a standard window size of 24x24 within which the feature detection runs. This is again a tiresome task.
To simplify this, they proposed another technique called the Attentional Cascade. The idea behind it is that not all the features need to run on each and every window. If a feature fails on a particular window, we can say that the facial features are not present there. Hence, we can move on to the next windows where facial features may be present.
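The early-rejection idea can be sketched in plain Python. The stage tests and feature names below are hypothetical placeholders (real cascade stages are boosted classifiers over many Haar features). The sketch only shows that a window is abandoned as soon as one stage fails.

```python
def cascade_classify(window, stages):
    """Return (is_face, stages_evaluated).

    A window is rejected as soon as one stage fails, so the later
    (more expensive) stages never run on it.
    """
    for evaluated, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, evaluated
    return True, len(stages)

# Hypothetical stages: threshold tests on precomputed feature values
stages = [
    lambda w: w["eye_band"] > 0.2,
    lambda w: w["nose_bridge"] > 0.1,
    lambda w: w["mouth_band"] > 0.1,
]

background = {"eye_band": 0.0, "nose_bridge": 0.5, "mouth_band": 0.5}
face_like = {"eye_band": 0.6, "nose_bridge": 0.3, "mouth_band": 0.4}

print(cascade_classify(background, stages))  # (False, 1): rejected by the first stage
print(cascade_classify(face_like, stages))   # (True, 3): passed every stage
```

Because most windows in an image contain background, rejecting them after one or two cheap stages saves most of the computation.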
The figure below shows the output of the Haar cascade frontal face algorithm. The image is captured using OpenCV; if there is a face in it, the algorithm gives the following result, otherwise it throws an error.
Face Detection Using Haar Cascade Algorithm

3.5 SUMMARY
Haar Cascade detection is one of the oldest yet most powerful face detection algorithms invented. It has been around since long before deep learning became famous. Haar features were used not only to detect faces, but also eyes, lips, licence number plates, and so on.

CHAPTER 4
METHODOLOGY
4.1 BLOCK DIAGRAM
The image is captured using the OpenCV library available in Python 3. From that image we detect whether there is a face in it. If a face is detected, the trained CNN model gives the age prediction for the given input image.
The sections below give a detailed explanation of the functionality of each block and of how the CNN model is trained. We trained the model on the UTKFace dataset, which is an open-source dataset available on Kaggle.

4.2 IMAGE CAPTURE USING OpenCV
Python provides various libraries for image and video processing. One of them is OpenCV, a large library that provides many functions for image and video operations. With OpenCV, we can capture video from the camera. It lets you create a video capture object, which is used to capture video through the webcam, and you may then perform the desired operations on that video.
Steps to capture video:
• Use cv2.VideoCapture() to get a video capture object for the camera.
• Set up an infinite while loop and use the read() method to read the frames using the object created above.
• Use the cv2.imshow() method to show the frames of the video.
• Break the loop when the user presses a specific key.
Below is the implementation of the above steps. When the user presses the “q” key on the keyboard, the loop breaks; until then it works as a normal video capture.

import cv2

# Open the default camera (device index 0)
vid = cv2.VideoCapture(0)

while True:
    # Read one frame; ret is False if no frame could be grabbed
    ret, frame = vid.read()
    if not ret:
        break
    # Display the frame
    cv2.imshow('frame', frame)
    # Press the 'q' key to exit the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the capture object
vid.release()

# Close all windows
cv2.destroyAllWindows()
Output :

4.3 FACE DETECTION
The image captured above is given as input to the Haar cascade frontal face detection model. As discussed earlier, it traverses the entire image, checking all features, and outputs the image with a bounding box drawn around the face.
The implementation is similar to that of image capture, but here we need to download the XML file of the Haar cascade algorithm and set the path to it in the project. The cropped image (i.e., the detected face) has to be stored for further use, so that we can use it as input to the model and obtain the desired outputs.
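The cropping step can be sketched independently of the detector. OpenCV's `CascadeClassifier.detectMultiScale` returns face rectangles as (x, y, w, h) values. The helper below, whose name and behaviour are assumptions of this sketch rather than the project's code, crops the largest such rectangle from a frame and returns None when no face was found.

```python
import numpy as np

def crop_largest_face(image, boxes):
    """Crop the largest (x, y, w, h) box from the image.

    Returns None when the list of boxes is empty, so the caller can
    skip the frame instead of failing.
    """
    if len(boxes) == 0:
        return None
    x, y, w, h = max(boxes, key=lambda b: b[2] * b[3])
    return image[y:y + h, x:x + w]

# Stand-in for a captured frame (height 120, width 160, 3 colour channels)
frame = np.zeros((120, 160, 3), dtype=np.uint8)
# Hypothetical detections in (x, y, w, h) form
boxes = [(10, 20, 30, 30), (50, 40, 60, 60)]

face = crop_largest_face(frame, boxes)
print(face.shape)  # (60, 60, 3)
```

Choosing the largest box is one simple policy for frames that contain more than one detection; other policies, such as processing every face, are equally possible.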
4.4 MODEL
4.4.1 PRE-PROCESSING
Data preprocessing describes any type of processing performed on raw data to prepare it for another processing procedure. It has traditionally been used as a preliminary step in the data mining process. More recently, these techniques have been adapted for training machine learning and AI models and for running inferences against them. These techniques can also be used in combination with a variety of data sources, including data stored in files or databases, or being emitted by streaming data systems.
Data preprocessing transforms the data into a format that can be processed more easily and effectively for the purpose of the user, for example, in a neural network.
Real-world data is messy and is often created, processed, and stored by a variety of people, business processes, and applications. While it may be suitable for its original purpose, a data set may be missing individual fields, contain manual input errors, or have duplicate data or different names to describe the same thing. Although humans can often identify and rectify these problems in the course of their work, this data needs to be automatically preprocessed when it is used to train machine learning or deep learning algorithms.
As mentioned, we use the UTKFace dataset, so the preprocessing steps performed here are:
• The file name of each image contains the age of the person, so we store the ages in a list.
• Convert the images from BGR to RGB format.
• Resize the images to 48x48 pixels.
• Append all the images to a list, so that they can be used for training and validation purposes.


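Below is the implementation of the steps discussed above. All the steps are performed using the OpenCV library available in Python.
import os
import cv2

fldr = 'UTKFace'          # dataset folder; assumed name, adjust to your local path
files = os.listdir(fldr)  # file names in the dataset folder

ages = []
images = []
for fle in files:
    # The age is the first underscore-separated field of the file name
    age = int(fle.split('_')[0])
    total = fldr + '/' + fle
    print(total)
    image = cv2.imread(total)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (48, 48))
    ages.append(age)
    images.append(image)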
4.4.2 MODEL DEFINITION
The model contains an input layer and 4 convolutional layers, each followed by a MaxPooling layer. After these 4 blocks we have dense layers with the ReLU activation function. The model has one output, i.e., the age. The loss function used to train the model is the mean absolute error.
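To see how the spatial size of the 48x48 input shrinks through the four convolution and pooling blocks, the following plain-Python sketch traces the feature map sizes. It assumes convolutions that preserve the spatial size ("same" padding) and 2x2 non-overlapping max pooling. These settings, and the filter count used at the end, are assumptions made for illustration, since the report does not state them.

```python
def trace_feature_map_sizes(input_size=48, num_blocks=4, pool=2):
    """Spatial size after each conv + max-pooling block.

    Assumes size-preserving convolutions and pool x pool
    non-overlapping max pooling.
    """
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // pool)
    return sizes

sizes = trace_feature_map_sizes()
print(sizes)  # [48, 24, 12, 6, 3]

# With a hypothetical 256 filters in the last block, the flattened vector
# passed to the dense layers would hold 3 * 3 * 256 values.
flattened = sizes[-1] ** 2 * 256
print(flattened)  # 2304
```

Under these assumptions the dense layers receive a 3x3 spatial map per filter, which keeps the number of dense-layer weights manageable.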
4.4.2.1 REGRESSION ACCURACY METRICS
The MSE, MAE, RMSE, and R-squared are the metrics most commonly used to evaluate prediction error rates and model performance in regression analysis.
Here we have used the MAE, i.e., the mean absolute error, which represents the average absolute difference between the expected output and the predicted output.
The MAE can be expressed as
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \lvert y_i - \hat{y}_i \rvert$$
where $y_i$ is the expected output and $\hat{y}_i$ is the actual (predicted) output.
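As a worked example of the MAE formula, the short NumPy sketch below computes the error for four hypothetical age predictions. The numbers are made up for illustration and are not results from the trained model.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over all N samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

true_ages = [25, 40, 8, 63]   # hypothetical ground-truth ages
predicted = [28, 37, 10, 59]  # hypothetical model outputs

mae = mean_absolute_error(true_ages, predicted)
print(mae)  # (3 + 3 + 2 + 4) / 4 = 3.0
```

An MAE of 3.0 means the predictions are off by three years on average, which makes the metric easy to interpret for age estimation.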
4.4.3 MODEL TRAINING
An epoch in AI implies one complete pass of the preparation

dataset through the calculation. This epochs number is a significant
hyperparameter for the calculation. It indicates the quantity of ages or
complete passes of the whole preparing dataset going through the
preparation or learning interaction of the calculation.
Here we save the model with the '.h5' extension; using this file, the
model can be interfaced with the face detection stage discussed earlier.
Below is the snippet that saves the model to the file at the end of each
epoch in which the validation loss is reduced.

import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

fle_s = 'Age_sex_detection.h5'
# save the full model whenever the validation loss improves
checkpointer = ModelCheckpoint(fle_s, monitor='val_loss', verbose=1,
                               save_best_only=True, save_weights_only=False,
                               mode='auto', save_freq='epoch')
# stop training after 75 epochs without improvement in validation loss
early_stop = tf.keras.callbacks.EarlyStopping(patience=75, monitor='val_loss',
                                              restore_best_weights=True)
callback_list = [checkpointer, early_stop]

The early-stopping callback decides when to stop the training process.
Here the patience is 75, i.e., the model terminates its training when the
validation loss has not been reduced for 75 consecutive epochs.
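
The patience rule can be illustrated without training a network. The following is a simplified sketch of the idea behind early stopping, not the Keras implementation itself. The function name and the loss values are made up, and a patience of 3 is used instead of 75 to keep the example short.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training stops once the loss has failed to improve on its best value
    for `patience` consecutive epochs. Epochs are counted from 0.
    """
    best_loss = float('inf')
    best_epoch = None
    wait = 0                      # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch   # patience never ran out


# Made-up validation losses, for illustration only.
losses = [5.0, 4.0, 4.2, 4.1, 4.3, 3.9]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)
```

In this example training stops at epoch 4, three epochs after the best loss at epoch 1. Restoring the best weights corresponds to keeping the model from best_epoch rather than from stop_epoch.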
In the graph of loss versus number of epochs, the blue line corresponds
to the training loss and the other line to the loss observed during
validation.

4.5 PREDICTION USING THE TRAINED MODEL
Download the file that we created, i.e., 'Age_sex_detection.h5', and
load it using the load_model function available in Keras. The
implementation is as shown below.

from tensorflow.keras.models import load_model
import tensorflow as tf

def model_load():
    # load the trained model from the saved .h5 file
    model_1 = load_model('../Age_detection_201.h5')
    model_1.summary()   # print the layers of the model
    return model_1
The summary of the model lists the layers used in the model definition,
together with their output shapes and parameter counts. At the end we
return the instance of the model for predictions.
Now we integrate all the components discussed earlier. First, a picture
is captured using the OpenCV library; then face detection is performed
using the Haar cascade algorithm. The cropped face image is preprocessed
and passed to the model, which predicts the age [7].
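
A Haar cascade detector reports each face as an (x, y, w, h) rectangle, and the crop is taken from that rectangle. The sketch below shows one way to pick a rectangle and turn it into crop bounds. Choosing the largest detected face is an assumption made for illustration, and the helper names and sample boxes are hypothetical.

```python
def largest_face(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])


def crop_bounds(box, image_width, image_height):
    """Convert an (x, y, w, h) box to (top, bottom, left, right) slice bounds,
    clamped to the image so the crop never runs past the border."""
    x, y, w, h = box
    left, top = max(x, 0), max(y, 0)
    right, bottom = min(x + w, image_width), min(y + h, image_height)
    return top, bottom, left, right


# Made-up detections on a 640x480 frame, for illustration only.
detections = [(50, 60, 80, 80), (300, 120, 150, 150)]
face = largest_face(detections)
top, bottom, left, right = crop_bounds(face, image_width=640, image_height=480)
# With an image array, the crop would be image[top:bottom, left:right].
print(face, (top, bottom, left, right))
```

The cropped region would then go through the same colour conversion and 48x48 resize as the training images before being passed to the model.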
This age is written on the captured image using the putText method
available in the cv2 library. Below is the age predicted for my face
using the instance of the model.

Predicted age using the model

CHAPTER 5
APPLICATION USING TKINTER
5.1 INTRODUCTION
A Graphical User Interface (GUI) is a type of user interface that lets
users interact with computers through visual indicators such as icons,
menus, and windows. It has advantages over the Command Line Interface
(CLI), where users interact with the computer by typing commands on the
keyboard only, and which is more difficult to use than a GUI.
Tkinter is the built-in Python module that is used to create GUI
applications. It is one of the most commonly used modules for building
GUI applications in Python, as it is simple and easy to work with. There
is no need to install the Tkinter module separately, as it already comes
with Python. It provides an object-oriented interface to the Tk GUI
toolkit.
5.2 IMPLEMENTATION
Here, using Tkinter, we created a basic GUI for a content-driven
platform. It has a “Start Webcam” button, which captures a picture, and
the age is predicted using the same model. We store the value of the age
in a variable; if it is greater than 18 we show the message “Wooho,
enjoy your show!”, otherwise we show the error message “Sorry! You are
not authorized to view the content”.

Here, for demonstration purposes, I varied the threshold value between
18 and 30 and evaluated it using my own face. The implementation is as
shown below.
import tkinter as tk
from tkinter import ttk
from tkinter import messagebox

# webcam_capture (from the project's webcam module) is expected to
# capture a picture and return the predicted age
from webcam import webcam_capture

s = ''
age_value = 0

# improve rendering on high-DPI displays (Windows only)
try:
    from ctypes import windll
    windll.shcore.SetProcessDpiAwareness(1)
except Exception:
    pass

def check_age(age):
    # True when the predicted age is below the threshold (28 in this demo)
    if age < 28:
        return True
    else:
        return False

def startweb():
    age_value = webcam_capture()
    if check_age(age_value):
        messagebox.showerror('Sorry!', 'You are not authorized to view the content')
        root.destroy()
    else:
        messagebox.showinfo('Wooho', 'Enjoy your show!!')

root = tk.Tk()
root.title('Content Driven Access')
root.geometry("600x400")

# button to start the webcam capture
ttk.Button(root, text="Start Webcam", command=startweb).pack()
root.mainloop()
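
The access decision itself does not depend on the GUI and can be checked separately. The following is a minimal sketch with the threshold passed as a parameter. The function name access_message and the default of 18 are choices made for this illustration rather than part of the project code.

```python
def access_message(age, threshold=18):
    """Return (allowed, message) for a predicted age.

    Ages below the threshold are denied; ages at or above it are allowed.
    """
    if age < threshold:
        return False, 'You are not authorized to view the content'
    return True, 'Enjoy your show!!'


# Made-up predicted ages, checked against the demo threshold of 28.
for predicted_age in (17, 28, 41):
    allowed, message = access_message(predicted_age, threshold=28)
    print(predicted_age, allowed, message)
```

Keeping the threshold as a parameter makes it easy to try different values, as was done for the demonstration, without editing the comparison itself.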

This is the root page we get after we execute the program. Once we
click on the “Start Webcam” button, an image is captured and the age is
predicted using the model.
GUI USING TKINTER
If the age is greater than the threshold value, we get the following
message, i.e., “Enjoy your show!!”.
Case 1: Age is greater than threshold value
If the age is less than the threshold value, we get the following error
message, i.e., “You are not authorized to view the content”.
Case 2: Age is less than threshold value

CONCLUSION
The proposed model predicts human age from facial images. We applied
an appropriate activation function to each layer to make the model more
accurate; here we used the ReLU and sigmoid activation functions to
predict the probability of the output.
The proposed model can be used in content-driven platforms, for
surveillance purposes, and in electronic customer relationship
management, i.e., ECRM.

REFERENCES
[1] K. Fukushima, "Neocognitron: A self-organizing neural network
model for a mechanism of pattern recognition unaffected by shift in
position," Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980.
[2] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard,
W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten
zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551,
1989.
[3] D. Scherer, A. Müller, and S. Behnke, "Evaluation of pooling
operations in convolutional architectures for object recognition," in
Artificial Neural Networks - ICANN 2010. Springer, 2010, pp. 92-101.
[4] B. Xu, N. Wang, T. Chen, and M. Li, "Empirical evaluation of
rectified activations in convolutional network," arXiv preprint
arXiv:1505.00853, 2015.
[5] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet
classification with deep convolutional neural networks," in Advances in
Neural Information Processing Systems, 2012, pp. 1097-1105.
[6] P. Viola and M. Jones, "Rapid object detection using a boosted
cascade of simple features," in Proc. IEEE Conf. Computer Vision and
Pattern Recognition, 2001, vol. 1, p. I-511,
doi: 10.1109/CVPR.2001.990517.
[7] X. Liu, Y. Zou, H. Kuang, and X. Ma, "Face image age estimation
based on data augmentation and lightweight convolutional neural
network," Symmetry, vol. 12, no. 1, 146, 2020.
https://doi.org/10.3390/sym12010146
