
An Efficient Convolutional Neural Network Approach for Facial Recognition

Aayushi Mangal, Himanshu Malik, Garima Aggarwal
Department of Computer Science, ASET, Amity University, Noida, UP, India
aayushi.mangal9@gmail.com, malik.himanshu396@gmail.com, gmehta@amity.edu

Abstract— Data security, a major concern nowadays, faces many threats from information breaches, which require immediate attention. Biometrics, which now draws on Deep Learning, has long served this purpose. In the recent past, face recognition has become a very important tool for safety and security purposes. This paper presents an application of face recognition using a Convolutional Neural Network (CNN) implemented in Python, and draws a comparison with other techniques such as Principal Component Analysis (PCA), Local Binary Pattern (LBP) and K Nearest Neighbour (KNN). Unlike conventional methods, the proposed scheme uses four Convolutional layers with ReLU layers, four pooling layers, a fully connected layer and a Softmax Loss Layer to normalize the probability distribution. The dataset consists of 1500 images with different facial expressions, on which the CNN model is trained and tested. Experimental results show that the proposed Neural Network achieved an accuracy of 96.96%.

Keywords— Facial Recognition, Deep Learning, Face Database, Convolutional Neural Networks, K Nearest Neighbour, Principal Component Analysis, Local Binary Pattern Histogram.

I. INTRODUCTION

With the evolution of technology, an increase in data theft cases can be observed. Biometric recognition has helped to serve as a protection layer for data security. Biometric recognition can be divided into recognition of fingerprints, iris, face and hands. Recognizing a face or a digit, or more broadly an image, is a very easy task for a person, and most people perform it effortlessly, because our brain is good at making sense of what our eyes see every day. Difficulties arise when this task is to be performed by a computer. For this, the computer should be able to think and process like a human brain, taking decisions on its own and producing results.

To serve this purpose, the concept of “Deep Learning” is applied. The system is trained using neural networks [1,4], which are able to analyze the information provided and make predictions based on what they have learned. Deep Learning includes Convolutional Neural Networks (CNN), which are loosely modelled on the human brain: their units, termed artificial neurons, take multiple inputs and produce a single output. A CNN consists of many different layers [12], each of which is responsible for extracting deterministic features from the input image and producing an output. With the inclusion of every new layer, accuracy can be increased.

One of the important applications of neural networks is Pattern Recognition. In the field of Facial Recognition, trained neural networks study the geometrical pattern [10] of the face, i.e. the eyes, the nose, the mouth and every other component of the face, and then recognize the user/person from the dataset. Facial Recognition is becoming an increasingly important factor for security purposes.

Facial Recognition has proved to be more advantageous [5,7] than other biometric identifiers because of its non-contact nature, i.e. images can be studied with no interaction with the user/person, as the face can be captured from a distance. Moreover, it requires less processing than other biometric techniques. Insufficient light on the face and variety in facial expressions can be drawbacks of using face recognition as a security tool.

The main contribution of this paper is to propose a technique for facial recognition using neural networks, in particular Convolutional Neural Networks (CNN). The work draws a comparison between various face recognition techniques, namely Principal Component Analysis (PCA), Local Binary Pattern (LBP) and K Nearest Neighbor (KNN) [8]. The purpose of this study is to establish which of these methods is best suited to a facial recognition system [2]. The proposed network uses four Convolutional layers with ReLU layers, which help in better extraction of the facial features. Further, a SoftMax Loss Layer is used to normalize the output probability distribution. The techniques are tested on a manually created dataset, using the Python programming language in the Spyder IDE. Constraints such as different facial expressions, the angle of the image and variation in the intensity of background light are considered. Experimental results show a significant increase in the accuracy and efficiency of face recognition.

The paper is organized as follows. The Face Dataset is explained in Section II. The Literature Survey is described in Section III. The proposed Face Recognition technique is described in Section IV. The proposed method is described in Section V. Section VI describes the Experimental Results, while the

978-1-7281-2791-0/20/$31.00 ©2020 IEEE
Conclusion is under Section VII and Section VIII describes the future scope of the study.

II. FACE DATASET

The dataset contains a total of 1500 images of 15 different people, with 100 images per person, in PNG format at 256x256 resolution, with changes in light intensity and different facial expressions. It was created manually by recording a video of each person for at least 10 seconds and then extracting frames from that video. It includes different expressions, such as a straight or smiling face and open or closed eyes, for detailed analysis of the images. The position of the face relative to the camera is also captured at varying angles from the left to the right side.

III. LITERATURE SURVEY

Until recent times, facial recognition technology was considered something straight out of science fiction. But in the last decade, this innovative technology has not only been successful but has also gained popularity. Many sectors have benefited from the advent of this technology: mobile phone companies now provide a new form of biometric security using face recognition, airports use it to improve travellers' security and convenience, and it is used for the identification of terrorists and thieves, etc.

Woodrow Wilson Bledsoe is considered to be the father of Face Recognition. In the 1960s he developed a system based on the RAND tablet that could classify images with the help of an operator who manually recorded the coordinate locations of facial features such as the eyes, nose, etc. With the help of these metrics, a database could be created from which the most closely resembling image could be retrieved. Unfortunately, this technology was limited by the computer processing power of the time.

Later, Sirovich and Kirby used a new approach called EigenFaces, which applied linear algebra to facial recognition. They were able to show that analysis of the features of a collection of face images can form a set of key features. They also showed that fewer than one hundred values are required to encode a properly normalized face image.

Thereafter, Matthew A. Turk and Alex P. Pentland (1991) [11] presented an approach to the detection and identification of human faces using Principal Component Analysis, in which faces were treated as 2D objects and new faces could easily be learned in an unsupervised manner. Later, in 2006, facial recognition came into use as a security check in the form of CCTVs with the use of eigen images.

In the recent past, Facebook also introduced a Face Detection feature that helps identify people whose faces are included in the daily photo updates of Facebook users. This feature allowed users to tag their friends just by pointing the mouse cursor over the face of the person in the image.

As already noted, eigen vectors were first considered in 1994 [15] and drew attention as the most successful technique of that time. In 1991, Turk presented his approach, in which images could easily be trained on a 2D basis and in an unsupervised way [11].

Chakka Mounica and Venugopal P (2016) [12] analysed the increased crime rate in the world of ATMs and therefore tested the LBPH technique for face recognition. They used Haar-like features and carried out the process in three stages: feature extraction, matching and classification. The efficiency of the system built was 76%, and it could also recognise images tilted at 45 degrees. Shekhar Karanwal and Ravindra Kumar Purwar (2017) [13] compared different methods for image recognition using LBPH over two different datasets and found that LBPH(3x3)-PCA on the ORL face database and LBPI-PCA on the Extended Yale-B face database produce better results than other features for small and large training sample sizes.

Moreover, Poonam Tanwar, Divya Arora and Dhruv Anand (2016) [16] also worked on the problems posed by facial expressions. They suggested that the combination of KNN and a Hidden Markov Model could do justice to the problem. Later, Hadi Santoso, Agus Harjoko and Agfianto Eko Putra (2017) [17] worked on the efficiency of KNN, as it is time consuming when calculating distances. They suggested solving the problem with a priority k-d tree search to speed up the process of KNN classification.

Prashant P Thakare and Pravin S Patil [14] conducted a comparative study between KNN, Robust LBP and Distinct LBP, which focused on the problems caused by expressions such as anger, happiness, sadness and excitement. The model showed good efficiency for implementation.

CNNs then came under study. The very first Convolutional Neural Network (CNN) was used by Kunihiko Fukushima, who designed a multi-layer neural network. In 1979, he developed an artificial neural network (ANN) called Neocognitron, which uses a hierarchical, multi-layered design. This design allows computers to "learn" to recognize visual patterns. The network resembles modern versions, but was trained by reinforcing recurring activation across many layers, which grew stronger over time. In addition, Fukushima's design allows important features to be adjusted manually by increasing the "weight" of certain connections. In 2015, CNNs came into widespread use, backed by more accurate and in-depth research, and have since been applied successfully to recognition tasks.

Face recognition has drawn much attention, with CNNs being the most explored approach due to their easy implementation and strong results. In [15], Guosheng Hu placed strong emphasis on deep learning and CNNs. They used public databases instead of private ones. A classic metric learning method that exploits CNN-learned features was brought forward. The experiments showed that two important factors for good CNN-FRS performance are the combination of multiple CNNs and metric learning.

Patrik Kamencay [2] compared the efficiency of a proposed Convolutional Neural Network (CNN) with three well-known image recognition techniques: PCA, LBPH and KNN. The efficiency of PCA, LBPH and KNN was demonstrated and the CNN was put forward. All the experiments were performed on the ORL database. The observations and

818 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence)
results show that LBPH provides better results than PCA and KNN. For the proposed CNN, a top recognition accuracy of 98.3% was obtained.

Kewen Yan [3] studied CNNs and calculated the accuracy on two databases. The face recognition rates on the ORL face database and the AR face database based on this network are 99.82% and 99.78%, respectively. Syed Zulqarnain Gilani [7] studied the accuracy obtained with 3D photographs compared with databases of 2D photographs. They gave a method for generating a large corpus of labelled 3D face identities, with numerous examples of each for training, and a protocol for combining the most challenging existing 3D datasets for testing. Without preprocessing this dataset, their network already outperforms state-of-the-art face recognition by over 10%. Finally, they showed the efficiency of their method for the open-world face recognition problem.

Leon A. Gatys et al. [4] demonstrated the use of feature representations from high-performing Convolutional Neural Networks to transfer image style between arbitrary images. They used image representations derived from Convolutional Neural Networks optimized for object recognition, which make high-level image information explicit, and introduced A Neural Algorithm of Artistic Style that can separate and recombine the content and style of natural images. The algorithm makes it possible to produce new images of high perceptual quality that combine the content of an arbitrary photograph with the appearance of numerous well-known artworks.

IV. PROPOSED FACE RECOGNITION TECHNIQUE: CNN (Convolutional Neural Network)

Convolutional Neural Networks work similarly to the nervous system of a human brain. A CNN has many hidden layers and can take multiple inputs and produce a single output after processing the data, i.e. passing the data from one layer to another. With every layer, a new feature is extracted, and the final layer produces an output.
A CNN can be created using these layers:

• Convolution
This layer acts as the core building block of the neural system. The images are fed as the input layer to the model. The images are treated as matrices, and the important features are extracted using 3x3 feature matrices. Each feature matrix is compared against the input matrix: for every corresponding matching digit, 1 is returned, and a new matrix is formed, which helps reduce the size of the image.
The set of these matrices is called the convolutional layer.

• Pooling
The next layer used is known as “Max Pooling”, which is illustrated in Figure 1. This layer uses a MAX function to reduce the spatial size of its input so as to reduce the number of parameters and the computation in the network, and hence also to control overfitting. A 2x2 matrix filter is usually applied to the layer, eliminating around 75% of the activations. The MAX value out of each group of 4 is kept and acts as an input for the next layer. A ReLU layer is used for activation.

Figure 1: Pooling of the image

• Flattening
The next layer in a CNN involves flattening the pooled feature matrix. Flattening is a simple concept: once the pooled matrix has been formed after layer 2, the CNN simply converts that matrix into a column matrix. This step is called flattening.

• Full Connection
After the pooled feature maps have been flattened, the ANN takes over. An Artificial Neural Network is composed of multiple nodes, arranged in a fashion similar to the neurons of a human brain.

An ANN takes an input layer in the form of a column matrix (the last step of the CNN). These input values are then fed into the numerous hidden neurons of the ANN. The input features are assigned weights on the basis of which features are deterministic for a particular type of classification. It is up to the user how many such “hidden” layers to include. In general, the greater the number of these hidden layers, the better the performance of the ANN. The result of the ANN is a classification, often binary, in which it predicts a particular class for an image.

The overall structure of the network obtained after applying all the layers is shown in Figure 2.

Figure 2: Overall Structure of the CNN

V. PROPOSED METHOD

In this section, the accuracy and efficiency of the proposed method on the created dataset are calculated. The testing methods were implemented in the Python programming language.

The proposed Neural Network consists of 4 Convolutional layers and Pooling layers, where each convolutional layer is

followed by a pooling layer. Each layer receives its input in the form of a feature map and transforms it by means of a non-linearity. By down-sampling and stacking the outputs, CNNs extract abstract feature maps which are invariant to translations and distortions.

The flowchart for the proposed algorithm is shown in Figure 3.

Figure 3: The proposed network

1. For input image data, faces from the dataset are resized to 256x256 pixels.
2. These images are taken as input for the next layer, the first Convolutional layer, which has 32 feature maps with 5x5 kernel dimensions. ReLU (Rectified Linear Unit) is used as the activation function.
3. Next, the pooling layer performs max pooling with 2x2 dimensions. To cut the amount of data, the down-sampling layer uses the max-pooling method.
4. After down-sampling, the output serves as the input for the second Convolutional layer, which has double the number of feature maps of the first layer, i.e. 64, with 5x5 kernel dimensions and the ReLU activation function.
5. Again, a max pooling layer with a kernel dimension of 2x2 and a stride of 2 is used.
6. The third Convolutional layer has 128 feature maps with the same parameters as the first and second Convolutional layers.
7. The output from the above layer is again pooled with the same parameters as the first and second pooling layers.
8. After pooling from the third layer, the next layer is the fourth Convolutional layer, which has 256 feature maps with the same parameters as the other Convolutional layers of the network.
9. As the next layer, the last pooling layer performs max pooling with 2x2 kernel dimensions and a stride of 2.
10. The next layer is the standard dense (fully connected) layer used in the proposed network. The output of the very last layer is passed through the SoftMax cross-entropy-with-logits loss layer to normalize the probability distribution. Finally, the accuracy is calculated.

VI. EXPERIMENTAL RESULTS

The scope of the project is vast and important. A comparison among the algorithms was studied. The CNN was successfully implemented on the dataset, which consisted of 1500 images. The code was implemented in the Spyder 3.6 tool using the Python language. Libraries such as TensorFlow and Keras were installed on the system in order to implement the code and obtain the best results.

The training set included 225 images, as shown in Figure 4:

Figure 4: Training set for the implementation
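The layer operations described in Section IV (convolution, ReLU activation, 2x2 max pooling, flattening and SoftMax normalization) can be illustrated on a toy matrix. The following is a minimal sketch in plain Python and not the code used in the experiments. It assumes the standard sliding-window multiply-and-sum form of convolution with stride 1 and no padding, and the function names, the 6x6 input and the 3x3 kernel are all hypothetical.

```python
import math

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2: keeps 1 of every 4 activations."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]

def flatten(fmap):
    """Convert a 2D feature map into a single column (1D list)."""
    return [v for row in fmap for v in row]

def softmax(logits):
    """Normalize raw scores into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 6x6 "image" and 3x3 vertical-edge kernel, for illustration only.
image = [
    [1, 0, 2, 1, 0, 1],
    [0, 1, 3, 0, 1, 2],
    [2, 1, 0, 1, 2, 0],
    [1, 0, 1, 2, 0, 1],
    [0, 2, 1, 0, 1, 3],
    [1, 1, 0, 2, 1, 0],
]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

features = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool_2x2(features)               # 2x2 after pooling
vector = flatten(pooled)                      # 4 values
# In a full network a dense layer would sit between flattening and SoftMax;
# it is omitted here, so the flattened values are used directly as scores.
probs = softmax(vector)
```

Pooling the 4x4 feature map down to 2x2 keeps 4 of 16 activations, which matches the roughly 75% reduction mentioned in Section IV.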

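The ten steps of Section V can also be traced as a sequence of tensor shapes. The sketch below is illustrative only and rests on several assumptions that the paper does not state: 'same' padding and stride 1 in the convolutions, a 3-channel input, and a dense layer with 15 output units (one per person in the dataset). The names LAYERS and trace_shapes are hypothetical.

```python
# Layer list following steps 2-10 of Section V.
LAYERS = [
    ("conv", {"filters": 32, "kernel": 5}),
    ("pool", {"size": 2, "stride": 2}),
    ("conv", {"filters": 64, "kernel": 5}),
    ("pool", {"size": 2, "stride": 2}),
    ("conv", {"filters": 128, "kernel": 5}),
    ("pool", {"size": 2, "stride": 2}),
    ("conv", {"filters": 256, "kernel": 5}),
    ("pool", {"size": 2, "stride": 2}),
    ("flatten", {}),
    ("dense", {"units": 15}),  # assumption: one output per person
]

def trace_shapes(height, width, channels, layers):
    """Return the list of shapes produced after the input and each layer."""
    shape = (height, width, channels)
    shapes = [shape]
    for kind, params in layers:
        if kind == "conv":
            # Assumed 'same' padding, stride 1: spatial size is unchanged.
            shape = (shape[0], shape[1], params["filters"])
        elif kind == "pool":
            # 2x2 pooling with stride 2 halves each spatial dimension.
            shape = (shape[0] // params["stride"],
                     shape[1] // params["stride"],
                     shape[2])
        elif kind == "flatten":
            shape = (shape[0] * shape[1] * shape[2],)
        elif kind == "dense":
            shape = (params["units"],)
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append(shape)
    return shapes

# Step 1: inputs are resized to 256x256 (3 channels is an assumption).
shapes = trace_shapes(256, 256, 3, LAYERS)
for (kind, _), shape in zip([("input", {})] + LAYERS, shapes):
    print(f"{kind:8s} -> {shape}")
```

Under these assumptions the four pooling stages reduce the 256x256 input to 16x16 with 256 feature maps, so the dense layer receives 16 x 16 x 256 = 65536 values. With unpadded convolutions the spatial sizes would be smaller.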
The algorithms showed the following accuracies:
• PCA showed an accuracy of 87.4%
• LBPH showed an accuracy of about 93.1%
• KNN showed an accuracy of about 84.8%
• CNN, the implemented one, shows an accuracy of 96.96% (depicted in Figure 5)

Figure 5: Accuracy for CNN

Figure 6 shows a graphical representation of the techniques implemented versus the accuracy attained by each algorithm, whereas Figure 7 compares the time elapsed during the implementation of the different techniques.

Figure 7: Time Elapsed vs Algorithms Implemented

Table 1: Comparison among different Facial Techniques

Reference        | Technique                 | Size of Dataset | Accuracy Obtained | Time Elapsed (sec)
Tested Technique | PCA                       | 1500 images     | 87.4%             | 126.3
Tested Technique | LBPH                      | 1500 images     | 93.1%             | 118.4
Tested Technique | KNN                       | 1500 images     | 84.8%             | 105.6
Proposed Network | CNN: 4 Convolution layers | 1500 images     | 96.96%            | 51.4
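The figures in Table 1 can be compared directly with a few lines of arithmetic. The sketch below simply copies the reported accuracy and elapsed-time values. The dictionary layout and the helper name compare_to are hypothetical and are not part of the original implementation.

```python
# Accuracy (%) and elapsed time (seconds) copied from Table 1.
TABLE_1 = {
    "PCA":  {"accuracy": 87.4,  "seconds": 126.3},
    "LBPH": {"accuracy": 93.1,  "seconds": 118.4},
    "KNN":  {"accuracy": 84.8,  "seconds": 105.6},
    "CNN":  {"accuracy": 96.96, "seconds": 51.4},
}

def compare_to(reference, table):
    """Accuracy gap (percentage points) and time ratio versus `reference`."""
    ref = table[reference]
    return {
        name: {
            "accuracy_gap": ref["accuracy"] - row["accuracy"],
            "time_ratio": row["seconds"] / ref["seconds"],
        }
        for name, row in table.items()
        if name != reference
    }

best = max(TABLE_1, key=lambda name: TABLE_1[name]["accuracy"])
fastest = min(TABLE_1, key=lambda name: TABLE_1[name]["seconds"])
comparison = compare_to("CNN", TABLE_1)

for name, c in comparison.items():
    print(f"{name}: CNN is {c['accuracy_gap']:.2f} points more accurate "
          f"and {c['time_ratio']:.2f}x faster")
```

On the reported figures, the CNN is between roughly 3.9 and 12.2 percentage points more accurate than the three baselines and takes less than half of their elapsed time.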

Although CNNs do not give 100 percent accuracy, they are found to be the best among the algorithms tested here for Facial Recognition. Table 1 supports this. The table compares the well-known Facial Recognition techniques on attributes such as size of dataset, time elapsed and, most important of all, accuracy of recognition. It clearly shows that CNNs attain the maximum accuracy for this task.

Figure 6: Graphical representation of the techniques implemented

VII. CONCLUSION

This paper illustrates the working of Convolutional Neural Networks. A comparison is drawn with other facial recognition techniques such as PCA, LBPH and KNN. The proposed network uses four Convolutional layers for feature extraction, each followed by a pooling layer. Further, a SoftMax Loss layer is used to normalize the probability distribution. For the created dataset, the technique attains an accuracy of 96.96%. This technique can be extended to a much larger dataset for use on a larger scale.

VIII. FUTURE SCOPE


This study would be helpful in focusing on the different issues and various challenges of the future. The present work done using Neural Networks will surely be improved upon in the future, so as to achieve more accurate,

realistic and reliable results. Deep Learning derives its significance from the analogy it draws with the working of a human brain. In finding patterns, extracting features, estimating, predicting, disclosure of learning and so on, all three types of neural networks possess great potential in their respective fields of application in various business areas.

They have a wide application area in nearly every industry where there is a database and predictions need to be made from existing trends. Deep Learning certainly holds great importance for the future of Artificial Intelligence.

REFERENCES

[1] Guosheng Hu, Yongxin Yang, Dong Yi, Josef Kittler, William Christmas, “When Face Recognition Meets with Deep Learning: An Evaluation of Convolutional Neural Networks for Face Recognition,” pp. 4321-4329, 2017.
[2] Patrik Kamencay, Miroslav Benco, Tomas Mizdos, Roman Radil, “A New Method for Face Recognition Using Convolutional Neural Network,” in Digital Image and Computer Graphics, Vol. 15, pp. 663-677, 2017.
[3] Kewen Yan, Shaohui Huang, Yaoxian Song, Wei Liu, Neng Fan, “Face Recognition Based on Convolution Neural Network,” pp. 4077-4081, 2017.
[4] Chen L, Guo X, Geng C., “Human face recognition based on adaptive deep Convolution Neural Network,” in Chinese Control Conference, pp. 6967-6970, 2016.
[5] S. Z. Gilani and A. Mian, “Towards large-scale 3D face recognition,” in IEEE DICTA, pp. 1–8, 2016.
[6] Meena, D. and R. Sharan, “An approach to face detection and recognition,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE), Jaipur: IEEE, pp. 1–6, 2016.
[7] Syed Zulqarnain Gilani, Ajmal Mian, “Learning from Millions of 3D scans for Large-scale 3D Face Recognition,” pp. 1896-1905, 2018.
[8] Pan, J. S., Qiao, Y. L., & Sun, S. H., “A fast K nearest neighbors classification algorithm,” in IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, pp. 961-963, 2004.
[9] Toshev A, Szegedy C., “Deeppose: Human pose estimation via deep neural networks,” in Conference on Computer Vision and Pattern Recognition (CVPR), Los Alamitos: IEEE, pp. 1653-1660, 2014.
[10] Lawrence, S., Giles, C. L., & Tsoi, A. C., “Convolutional neural networks for face recognition,” in Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 217-222, 1996.
[11] M. A. Turk and A. P. Pentland, “Face recognition using eigen-faces,” in Computer Vision and Pattern Recognition, pp. 586–591, 1991.
[12] Chakka Mounica and Venugopal P (2016). Face Detection and Recognition using LBPH. International Journal of Engineering Research and Science and Technology. IEEE.
[13] Shekhar Karanwal and Ravindra Kumar Purwar (2017). Performance Analysis of Local Binary Pattern Features with PCA for Face Recognition. Indian Journal of Science and Technology, Vol 10(23).
[14] Thakare P and Patil S (2016). Facial Expression Recognition Algorithm Based on KNN Classifier. International Journal of Computer Science and Network, Vol. 5.
[15] G. Huang, M. Mattar, H. Lee, and E. G. Learned-Miller, “Learning to align from scratch,” in Advances in Neural Information Processing Systems, pp. 764–772, 2012.
[16] Tanwar, P., Arora, D., LingayasUniversity, F., & Anand, D. Facial Expression Detection using Hidden Markov model.
[17] Santoso, H., Harjoko, A., & Putra, A. E. (2017). Efficient K-Nearest Neighbor Searches for Multiple-Face Recognition In The Classroom Based On Three Levels Dwt-Pca. International Journal Of Advanced Computer Science And Applications, 8(11), 112-122.
[18] Krizhevsky A, Sutskever I, Hinton G E., “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, pp. 1097-1105, 2012.
[19] Yang M H, “Kernel eigenfaces vs. kernel fisherfaces: Face recognition using kernel methods,” in 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, Washington, pp. 0215-0215, 2002.
[20] https://medium.com/datadriveninvestor/machine-learning-on-facial-recognition-b3dfba5625a7
[21] Tobias, L., A. Ducournau, F. Rousseau, G. Mercier and R. Fablet, “Convolutional Neural Networks for object recognition on mobile devices: A case study,” in 23rd International Conference on Pattern Recognition (ICPR), Cancun: IEEE, pp. 3530–3535, 2016.

