Abstract—Facial recognition, used as biometric authentication in security, military, finance, and everyday applications, has become popular because of its natural and non-intrusive character. Many methods exist for face recognition, such as holistic learning, local features, shallow learning, and deep learning; some of these methods are susceptible to variations in pose, illumination, expression, and age. The state of the art in face recognition today is deep learning, which delivers high accuracy. In this paper the authors replicate face recognition using a deep learning architecture, the OpenFace Convolutional Neural Network. The authors vary image size, color depth, and age, and examine how those factors affect the accuracy of face recognition in that architecture. The results show that the accuracy of a model depends on image size, color depth, and age variation, but OpenFace CNN still provides fairly good accuracy when image size and color depth are reduced, as long as facial landmarks can still be detected in the image, so that the alignment process can be performed on the face image.

Keywords—Image Size, Color Depth, Age Variation, Convolutional Neural Network, Face Recognition

I. INTRODUCTION

A. Face Recognition
Face recognition is now widely used in biometric authentication, often for security, financial, and military purposes, as well as in everyday applications. It has become popular because it is natural and non-intrusive. Face recognition has been researched since the early 1990s, when Eigenface, a holistic learning technique, made face recognition famous as a research study; the Eigenface technique achieved 60% accuracy in the experiment conducted by Turk et al. [1].

Over the previous few years, feature-based enhancement in facial recognition has been divided into four essential techniques: the holistic approach, local handcrafted features, shallow learning, and deep learning. The holistic approach derives a low-dimensional representation through an assumption about the distribution of faces, such as a linear subspace [2][3][4], a manifold [5][6][7], or a sparse representation [8][9][10]. This family of techniques dominated the facial recognition community in the 1990s and 2000s, but, as is widely known from theoretical and practical problems, the holistic method fails when faces deviate from the distributional assumptions made beforehand. In particular, it struggles with changes in age, pose, illumination, and expression (A-PIE): when A-PIE varies, its accuracy in recognizing a person's face decreases. To address these problems, new techniques emerged in the early 2000s that used local features as a base, such as Gabor [11] and LBP [12] with their multilevel and multi-dimensional extensions [13][14][15], achieving robust performance through the invariant properties of local filtering. However, such local or handcrafted features are difficult to design and are not compact. In the early 2010s, learning-based local descriptors were introduced to the face recognition community [16][17][18], in which local filters offer better compactness. But the shallow representation of this technique still has limited robustness when faced with complex nonlinear variations in facial appearance.

In 2014, DeepFace [19] and DeepID [20] achieved the best verification accuracy on Labeled Faces in the Wild (LFW), surpassing human performance in the unconstrained scenario for the first time. Since then, research in facial recognition has shifted to deep learning-based approaches, which acquire invariant features progressively by stacking nonlinear filters. Deep learning architectures, including convolutional neural networks (CNNs) [21][22][23][24], deep belief networks (DBNs) [25], and stacked autoencoders (SAEs) [26], simulate the way perceptrons in the human brain work; deep networks can represent high-level abstractions through various layers of nonlinear transformation.
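The linear-subspace idea behind the holistic approach — Eigenfaces [1] — can be sketched with NumPy: project mean-centered face vectors onto the top principal components and recognize by nearest neighbour in that low-dimensional space. This is a minimal illustration under assumed data (random arrays stand in for real face images), not the pipeline used later in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "face images": 20 samples of 32x32 grayscale, flattened to vectors.
faces = rng.random((20, 32 * 32))

# Holistic (Eigenface-style) learning: PCA via SVD of the mean-centered data.
mean_face = faces.mean(axis=0)
centered = faces - mean_face
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:8]  # top-8 principal components span the linear subspace

def project(img):
    """Represent a face by its coordinates in the low-dimensional subspace."""
    return eigenfaces @ (img - mean_face)

def identify(img, gallery):
    """Recognition = nearest neighbour in the subspace."""
    dists = [np.linalg.norm(project(img) - project(g)) for g in gallery]
    return int(np.argmin(dists))

print(identify(faces[3], faces))  # a gallery image matches itself -> 3
```

Because the subspace is fitted to the training distribution, faces that violate its assumptions (strong pose or illumination change) project poorly — exactly the A-PIE weakness described above.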
978-1-5386-9422-0/18/$31.00 ©2018 IEEE
The 1st 2018 INAPR International Conference, 7 Sept 2018, Jakarta, Indonesia
B. Background and Terminology in Face Recognition
The basis of face recognition is classification: a face image is assigned to one person. A face image from any person is classified by calculating the similarity between the input face image and the face images in a database; a person is classified as the same person if the similarity between input and database is high, and as a different person if it is low. To find a good feature, or discriminator, for face recognition, one must find a technique that decreases the intra-class variation among face images of the same person and enlarges the inter-class variation between different people. The function for calculating the similarity between two face images can be described as in equation (1) below [27], where I_i and I_j are the two images whose resemblance we want to measure, P_i and P_j are preprocessing functions, f is a function that performs feature extraction, and S is the similarity level of the two images.

S(f(P_i(I_i)), f(P_j(I_j)))    (1)

As mentioned before, changes in age, pose, illumination, and expression (A-PIE) decrease the accuracy of face recognition. A posture change in the same person makes a big difference in the classification process: the same person may be classified as a different person because of differences in the appearance of the images (when using a 2D camera or CCTV). A face under a different level of illumination may likewise be classified as a different person, because illumination significantly changes the appearance of the images. The remaining factors, expression and age differences, also reduce the accuracy of face recognition.

To prevent this reduction of accuracy in classification, face recognition must follow certain steps. As mentioned by Ranjan et al. [28], a facial recognition system needs three things. The first is a face detector that detects and localizes faces in a video or image. The second is a facial landmark detector that determines the landmark points of the face; faces whose landmarks have been successfully detected are aligned into canonical coordinate positions. Once the video or face image is in canonical coordinates, the third step is the face recognition process itself. The face detection step is essential to reduce intra-class variation: the larger the area processed for face recognition, the greater the irrelevant differences, so we need to narrow the field of attention by processing only the face region, detecting the face in the image and cropping it to a region sufficient for face recognition. This also reduces the computation cost, because a smaller part of the image is processed. Face alignment is an equally important step, because faces aligned into a canonical coordinate position have greater similarity in face space than unaligned faces. This technique is also known as many-to-one normalization: recovering the canonical view of a face from one or many photos taken at a non-frontal angle, producing an image with a frontal view, so that facial recognition can work as if in controlled conditions.

The other factor needed to improve the accuracy of the algorithm is a large dataset: with a large dataset the computer can better determine which features can be used as a descriptor of a person's face. To enlarge the dataset, we can create a synthetic dataset, or apply operations such as mirroring, scaling, and changing the image mode, hue, and illumination. The other term for creating synthetic images is one-to-many augmentation: creating multiple images with various pose variations from a number of photos, to enable the deep learning network to learn pose-invariant features. We must choose between many-to-one normalization and one-to-many augmentation, as summarized in Table I.

TABLE I. PREPROCESSING IN FACE RECOGNITION

Data processing | Description | Methods
One-to-many | Create new faces with the pose variations of a single image | 3D model [29][30][31][32][33][34][35][36]; 2D deep model [37][38][39]; data augmentation [40][41][42][43][20][44][45][46]
Many-to-one | Recover a fixed canonical view of facial images from one face or many faces that have a non-frontal view | SAE [47][48][49]; CNN [50][51][52][53][54]; GAN [55][56][57][58]

C. Face Recognition in Deep Learning
In the deep learning technique, the computer is asked to learn by itself which essential features to use. This job needs high computational resources, especially when a large dataset is involved in the learning process, but this is no longer a problem, because current computer technology can handle the task.

D. Feature Extraction Using Deep Learning
Deep learning network architectures for facial recognition can be categorized into single networks and multiple networks. Inspired by the success of the ImageNet challenge [59], common CNN architectures such as AlexNet, VGGNet, GoogleNet, and ResNet [21][22][60][24] were introduced and became base models for face recognition, either directly or in modified form. New face recognition architectures are still being developed to improve efficiency. Beyond adopting a single network, a face recognition network can be trained as multiple networks with multiple inputs [55] or multiple tasks [61]; research results show that this provides an increase in performance when results are accumulated from the various networks.

The various deep learning architectures built on the CNN method, such as AlexNet, VGGNet, GoogleNet, and ResNet, are used as a basis for face recognition, and other architectures, such as multiple networks, are built to improve efficiency. Fundamentally, however, they use the same kind of loss function in their construction.
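The verification rule of equation (1) — preprocess, extract features, then score similarity — can be sketched as follows. The 128-dimensional embedding, the pooling stand-in for a trained CNN, and the 0.99 threshold are illustrative assumptions for this sketch, not the OpenFace implementation.

```python
import numpy as np

def preprocess(img):
    # P in equation (1): stand-in for detection + landmark alignment;
    # here only a normalization of pixel values to [0, 1].
    return img.astype(np.float64) / 255.0

def extract_features(face):
    # f in equation (1): a trained CNN would go here. As a placeholder,
    # average-pool the aligned face down to a 128-D vector and L2-normalize,
    # so squared Euclidean distance behaves like a bounded similarity score.
    v = face.reshape(128, -1).mean(axis=1)
    return v / np.linalg.norm(v)

def same_person(img_a, img_b, threshold=0.99):
    # S in equation (1): squared Euclidean distance between embeddings;
    # a distance below the threshold is classified as the same identity.
    d = np.sum((extract_features(preprocess(img_a)) -
                extract_features(preprocess(img_b))) ** 2)
    return d < threshold

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
print(same_person(img, img))  # identical images -> True
```

Lowering the intra-class distance and raising the inter-class distance, as described above, is exactly what makes such a fixed threshold workable.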
Fig 3. Accuracy result by varying the resolution (accuracy on threshold, SVM accuracy, and KNN accuracy at pixel widths of 250, 100, 75, and 50)

Fig 4. Accuracy change by reducing color depth

Then the authors also used their own dataset, obtained by collecting images from the internet: 263 images of 13 subjects (identities), with age variation included, to test whether age variation decreases the accuracy of recognizing a person's face; the results are shown in Table IV.
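The preprocessing variations in the experiment — shrinking images to the tested pixel widths and reducing color depth — can be sketched in NumPy as below. The random array stands in for a real face photo, and the nearest-neighbour resampling and 4-bit quantization are illustrative choices, not the authors' exact tooling.

```python
import numpy as np

def reduce_resolution(img, new_w):
    """Nearest-neighbour downscale to new_w pixels wide, keeping aspect ratio."""
    h, w = img.shape[:2]
    new_h = max(1, round(h * new_w / w))
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def reduce_color_depth(img, bits):
    """Quantize an 8-bit-per-channel image down to `bits` bits per channel."""
    step = 256 // (2 ** bits)
    return (img // step) * step

# Stand-in for a face photo; the experiment resizes to widths 250, 100, 75, 50.
rng = np.random.default_rng(2)
photo = rng.integers(0, 256, size=(300, 250, 3), dtype=np.uint8)

for width in (250, 100, 75, 50):
    small = reduce_color_depth(reduce_resolution(photo, width), bits=4)
    print(small.shape)  # e.g. (120, 100, 3) for width 100
```

As long as the downscaled, quantized image still allows facial landmark detection (and hence alignment), the paper's results suggest recognition accuracy degrades only gradually.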