

Article
An Identity Authentication Method Combining
Liveness Detection and Face Recognition
Shuhua Liu, Yu Song, Mengyu Zhang, Jianwei Zhao, Shihao Yang and Kun Hou *
School of Information Science and Technology, Northeast Normal University, Changchun 130117, China;
Liush129@nenu.edu.cn (S.L.); songy590@nenu.edu.cn (Y.S.); zhangmy167@nenu.edu.cn (M.Z.);
zhaojw374@nenu.edu.cn (J.Z.); yangsh861@nenu.edu.cn (S.Y.)
* Correspondence: houk431@nenu.edu.cn

Received: 9 September 2019; Accepted: 29 October 2019; Published: 31 October 2019

Abstract: In this study, an advanced Kinect sensor was adopted to acquire infrared radiation (IR)
images for liveness detection. The proposed liveness detection method based on infrared radiation
(IR) images can deal with face spoofs. Face pictures were acquired by a Kinect camera and converted
into IR images. Feature extraction and classification were carried out by a deep neural network to
distinguish between real individuals and face spoofs. IR images collected by the Kinect camera have
depth information. Therefore, the IR pixels from live images have an evident hierarchical structure,
while those from photos or videos have no evident hierarchical feature. Accordingly, two types of IR
images were learned through the deep network to realize the identification of whether images were
from live individuals. In a cross-database comparison with other liveness detection algorithms, our method
achieved a recognition accuracy of 99.8%, outperforming the alternatives. FaceNet is a face recognition
model that is robust to occlusion, blur, illumination, and steering. We combined the liveness detection
method with the FaceNet model for identity authentication. To make the authentication approach more
practical, we proposed two ways of improving how the FaceNet model is applied. Experimental results showed that the
combination of the proposed liveness detection and improved face recognition had a good recognition
effect and can be used for identity authentication.

Keywords: liveness detection; Kinect camera; infrared radiation; deep learning; FaceNet

1. Introduction
Face recognition is the most efficient and most widely used of the various biometric techniques,
such as fingerprinting, iris scanning, and hand geometry, because it is natural,
nonintrusive, and low cost [1]. Therefore, researchers have developed several recognition techniques
in the last decade. These techniques can generally be divided into two categories according to the face
feature extracting methodology: methods that manually extract features on the basis of traditional
machine learning and those that automatically acquire face features on the basis of deep learning.
The accuracy of face recognition is greatly improved using the deep learning network because of its
capability to extract the deep features of human faces. FaceNet is a face recognition model with high
accuracy, and it is robust to occlusion, blur, illumination, and steering [2]. It directly learns a mapping
from face images in a compact Euclidean space where distances directly correspond to a measure
of face similarity. Once this space has been produced, tasks such as face recognition can be easily
implemented using standard techniques with FaceNet embeddings as feature vectors. In addition,
end-to-end training of FaceNet simplifies the setup and shows that directly optimizing a loss relevant
to the task at hand improves performance. In this study, to make the FaceNet model more applicable in
practice, we proposed two improvements, namely, setting a valid threshold on the predicted similarity
and building an "unknown" category. The details are introduced in Section 3.4.

Sensors 2019, 19, 4733; doi:10.3390/s19214733 www.mdpi.com/journal/sensors



Although the improved FaceNet framework can accurately recognize human faces, like other
recognition systems, it cannot prevent cheating. Most existing face recognition systems are vulnerable
to spoofing attacks. A spoofing attack occurs when someone attempts to bypass a face biometric
system by presenting a fake face in front of the camera. For instance, the researchers in [3] examined
the threat that facial images disclosed on online social networks pose to several commercial
face authentication systems. Common spoof attacks include photos, videos (replays), masks, and synthetic 3D
face models.
Therefore, this paper proposes a liveness detection approach based on infrared radiation (IR)
images acquired using a Kinect camera. IR images from live faces are used as positive samples,
while IR images from photos or videos are used as negative samples. The samples above are input
into the convolutional neural network (CNN) for training to distinguish live faces and spoof attacks.
After liveness detection, an improved FaceNet will continue to recognize a face and provide the
corresponding ID or UNKNOWN output for accurate identity authentication.
The rest of the paper is organized as follows. Section 2 briefly reviews the related works and
recent liveness detection methods. Section 3 presents a framework that combines liveness detection
and face recognition, and then the proposed liveness detection method based on IR image features
and an improved FaceNet model, called IFaceNet, are described. Section 4 presents the experimental
verifications. Section 5 elaborates the conclusions.

2. Related Works
Face recognition has gradually become an important encryption and decryption method because
of its rapidity, effectiveness, and user friendliness. However, the security issues of face recognition
technology are becoming increasingly prominent. Therefore, liveness detection has become an
important part for reliable authentication systems. With the development of the Internet, criminals
collect user face images from the Internet and produce fake faces to attack an authentication system.
Ref. [3] passed the authentication of six commercial face recognition systems, namely, Face Unlock,
Facelock Pro, Visidon, Veriface, Luxand Blink, and FastAccess, by using photos of valid users. Common
spoof faces include photos (print), videos (replay), masks, and synthetic 3D face models. Among
them, photos and videos are 2D fake faces that are less expensive for spoof attacks and are the two
most popular forms of deception. Therefore, it is urgent to introduce liveness detection into identity
authentication systems to improve the practicality and safety of face recognition. Liveness detection
methods can be categorized in several ways depending on the classification criterion.
According to their application forms, current mainstream liveness detection methods are divided into
interactive and noninteractive categories. Interactive detection methods use action instructions to
interact with users and require users to cooperate to complete certain actions. The noninteractive
method does not need user interactions and automatically completes the detection task. According to
the extraction methods of face features, these methods can be divided into two categories: manual
feature extraction and automatic feature extraction using a deep learning network.
Common liveness detection methods are mainly based on texture, life information, different
sensors, and deep features. Live faces have complex 3D structures, while photo and video attacks
are 2D planar structures. Different light reflections of surfaces from the 3D and 2D structures will
exhibit differences in bright and dark areas of facial colors. Texture-based methods mainly use
these differences as clues to classify live and fake faces. The detection method based on texture is
implemented using Local Binary Pattern (LBP) [4,5] and improved LBP [6,7] algorithms. This method
has a low computational complexity and is easy to implement, but it is greatly influenced by hardware
conditions. The accuracy of the algorithm decreases when the image quality is low. The method
based on life features uses vital signs, such as heartbeat [8,9], blood flow [10], blinking [11,12], and
involuntary micromotion of facial muscles [13,14], to classify live and fake faces. Under the constraint
conditions, this method has a high detection accuracy if life features can be extracted stably; however,
this method requires face video as the input and needs a large amount of computation. Even more
unfortunately, the simulated micromotion of fake faces can also attack this method. The method based
on different sensors adopts different image acquisition systems, such as a multispectral camera, an
infrared camera, a depth camera, and a light field camera, to capture the corresponding types of human
face images for liveness detection. The overall recognition accuracy of this approach is high, but it
requires additional hardware and, thus, increases the system cost. The methods based on deep features
involve training an initial CNN to extract deep features followed by classification [15–17]. These methods
use pretrained ResNet-50, VGG, and other models to extract features [18–20] and 3D convolution to
extract spatiotemporal deep features [21,22].
Given that deep learning can extract high-level abstract features of human faces autonomously,
noninteractive liveness detection using deep learning is a future development trend. With the gradual
popularization of face recognition systems and the decreasing price of hardware, it is necessary and
worthwhile to add image acquisition equipment to important face authentication systems to improve
their security and reliability. Liveness detection of faces using real depth information is not commonly
used in biometrics technology and the literature [23]. All publicly available datasets, such as CASIA,
NUAA, and PRINT-ATTACK DB, are designed for 2D spoofing prevention, and no depth data are
included in them. Therefore, we adopted a Kinect camera to acquire real infrared radiation (IR) images
for liveness detection.
3. Methodology

3.1. Problem Statements

With the popularity of face recognition, criminals will try to attack face recognition systems, so liveness
detection has become an important part of the authentication system. Among current liveness detection
algorithms, methods based on deep learning with IR images collected by Kinect cameras are rarely
reported. Therefore, we proposed such a method in this paper. Infrared images captured by a Kinect
camera served as the training data. The data from real faces served as positive samples, while the data
from photos or videos served as negative samples for training the CNN to distinguish whether a face is
live. In addition, because the FaceNet model has a high face recognition accuracy, this paper put forward
two improved ways to use FaceNet in applications. The improved FaceNet was combined with the
liveness detection algorithm to form an integrated authentication system.

3.2. Model Framework

The proposed framework that combines FaceNet with liveness detection is shown in Figure 1.
[Figure 1: pipeline diagram. The Kinect outputs an IR picture and an RGB picture; MTCNN detects and
aligns the IR face and the RGB face; the resized IR face is fed to the liveness detection CNN; if the face is
judged live, the resized RGB face is fed to FaceNet for face recognition and an ID is output, otherwise an
error is reported.]

Figure 1. Identity authentication framework based on liveness detection and FaceNet. CNN,
convolutional neural network; IR, infrared radiation; MTCNN, multitask cascaded CNN; RGB, red,
green, blue.
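To make the flow in Figure 1 concrete, the sketch below expresses the same decision logic in Python. It is
purely illustrative: the three callables are hypothetical stand-ins for the MTCNN detector, the liveness
detection CNN of Section 3.3, and the IFaceNet recognizer of Section 3.4, not the actual implementation
of this work.

```python
def authenticate(ir_picture, rgb_picture, detect_and_align, is_live_face, recognize_face):
    """Identity authentication following the Figure 1 pipeline (illustrative sketch).

    detect_and_align: stand-in for MTCNN face detection and alignment.
    is_live_face:     stand-in for the liveness detection CNN (Section 3.3).
    recognize_face:   stand-in for the IFaceNet recognizer (Section 3.4).
    """
    ir_face = detect_and_align(ir_picture)     # clip and align the IR face
    rgb_face = detect_and_align(rgb_picture)   # clip and align the RGB face
    if ir_face is None or rgb_face is None:
        return "ERROR: no face detected"
    if not is_live_face(ir_face):              # liveness detection runs first
        return "ERROR: spoof attack detected"
    return recognize_face(rgb_face)            # a person ID or "unknown"
```

If the liveness check returns false, the algorithm terminates and face recognition is never reached, which
mirrors the False branch in Figure 1.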
Firstly, we adopted a Microsoft Kinect camera to collect RGB and IR images of human faces.
Secondly, a multitask cascaded convolutional network (MTCNN) [24] was used to clip and align the
face parts of the RGB and IR images. Finally, the IR images processed by MTCNN were used to train
the CNN for liveness detection, while the RGB images were utilized to train the FaceNet model for face
recognition. When the result of liveness detection is true, face recognition is continued to complete the
entire authentication process. If the liveness detection result is false, then the operation of the algorithm
is terminated, and face recognition is no longer performed.
The MTCNN consists of three stages. First, candidate windows are produced by a fast proposal
network (P-Net), a shallow CNN. Second, the candidates are refined in the next stage by a refinement
network (R-Net), a more complex CNN that rejects a large number of nonface windows. Third, the
output network (O-Net) uses a more powerful CNN to refine the results and to produce the final
bounding box and the facial landmark positions.
In the following sections, the liveness detection method based on IR image features and the face
recognition algorithm based on the improved FaceNet are described in detail.
3.3. Liveness Detection Based on IR Image Features (LDIR)

Face spoofing detection based on IR images can be treated as a two-class classification problem
whose goal is to discriminate fake faces from real ones. The proposed liveness detection algorithm
focuses on 3D facial space estimation. A Kinect camera was utilized to obtain IR images with depth and
gray information. These images were input to the CNN for training to learn the facial skin texture
information in space. Because whole facial textures are considered, 2D deceptions, such as those using
photos and videos, no longer have any effect. The process of liveness detection based on IR images is
shown in Figure 2. IR images of real faces and framed real faces were taken as positive samples, while
photos, photos stuck to human faces, and photos on an electronic pad were taken as negative samples
after Kinect collection and inputted to the CNN for training.

[Figure 2: training diagram. Kinect IR pictures pass through MTCNN for face detection and alignment;
the aligned positive samples (real faces) and negative samples (spoofs) are then fed to the CNN.]

Figure 2. Training process of liveness detection.

In this research, a CNN with four convolutional layers was designed for liveness detection. The CNN
structure is shown in Figure 3.

Figure 3. CNN structure.

After the second and the fourth convolutional layers, 2 × 2 maximum pooling was adopted. Dropout
of 0.25 was applied after each pooling layer, and dropout of 0.5 was applied after the first fully connected
layer. The activation function in the network was ReLU, and the learning rate was lr = 0.01. Stochastic
gradient descent (SGD) with a momentum of 0.9 was used as the optimizer, and the loss function was
cross entropy.
First, the IR images collected using Kinect were resized to 64 × 64, normalized, and inputted into
the network for training. When a new user was added, 10 rounds of training were automatically
performed (nb_epoch = 10). As the output categories only included true and false, the binary
cross-entropy classification loss was adopted, and the trained model outputted true or false at the
prediction stage. When the output result is true, it means the input image is live, and face recognition is
continued using FaceNet; otherwise, spoof information is given, and face recognition is no longer
performed.

3.4. Improved FaceNet Model for Application-IFaceNet

3.4.1. FaceNet Model

FaceNet consists of a batch input layer and a deep CNN followed by L2 normalization, which results
in the face embedding. This step is followed by the triplet loss during training [2], as shown in Figure 4.

Figure 4. FaceNet structure.

FaceNet strives to learn an embedding f(x) of an image x into a feature space R^d such that the
squared distance between all faces of the same identity is small regardless of imaging conditions,
whereas the squared distance between a pair of face images from different identities is large. The triplet
loss of the model is illustrated in Figure 5. The model minimizes the distance between an anchor and a
positive of the same identity and maximizes the distance between the anchor and a negative of a
different identity.

Figure 5. Function of the triplet loss.
When training FaceNet, three face images, namely Xi(a), Xi(p), and Xi(n), were extracted from the
training set each time. These images belonged to the anchor, positive, and negative classes, respectively,
and the three pictures formed a triplet. The training network forces ||f(Xi(a)) − f(Xi(p))||² to be as small as
possible and the distances ||f(Xi(a)) − f(Xi(n))||² and ||f(Xi(p)) − f(Xi(n))||² to be as large as possible. That is,
the triplet satisfies the following constraint:

||f(Xi(a)) − f(Xi(p))||² + α < ||f(Xi(a)) − f(Xi(n))||², (1)

where α is a real number (the margin) that keeps the distance between facial image features of the same
person smaller than the distance between facial features of different people by at least α. Thus, the loss
function is designed as

Loss = ||f(Xi(a)) − f(Xi(p))||² + α − ||f(Xi(a)) − f(Xi(n))||². (2)

The triplet loss tries to enforce a margin between each pair of faces from one person to all other
faces. This procedure allows the faces of one identity to live on a manifold while still enforcing the
distance and, thus, the discriminability from other identities. Therefore, the FaceNet model is robust
to pose, illumination, and expression [25], which were previously considered to be difficult for face
verification systems.
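For concreteness, the following NumPy sketch evaluates the triplet loss over a batch of embeddings. It
uses the standard hinge form from [2], in which only triplets that violate the margin constraint of
Equation (1) contribute; the margin value used here is an arbitrary illustration, not a setting reported in
this paper.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Hinge form of the triplet loss in Equation (2).

    f_a, f_p, f_n: arrays of shape (batch, d) holding the embeddings of the
    anchor, positive, and negative images; alpha is the margin (assumed value).
    """
    d_ap = np.sum((f_a - f_p) ** 2, axis=1)   # ||f(Xi(a)) - f(Xi(p))||^2
    d_an = np.sum((f_a - f_n) ** 2, axis=1)   # ||f(Xi(a)) - f(Xi(n))||^2
    # Triplets that already satisfy Equation (1) contribute zero loss.
    return np.mean(np.maximum(d_ap - d_an + alpha, 0.0))
```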

3.4.2. IFaceNet
The original FaceNet model was trained on a labeled faces in the wild (LFW) dataset. In the
recognition stage, the trained model extracted the feature vector, which was compared with the faces
that have been classified by SVM in the database. A set of predicted values of similarity between the
recognized face and various other faces was available in the database, and the maximum of similarity
was selected as the output result. If the person is in the database, then the correct identification will
be given. However, when the recognized person is not in the database, the classifier will select the
person category with the largest predicted value for the output, which results in false identification.
Accordingly, we improved the original FaceNet model in two ways for real applications and called
it IFaceNet.

Set a Valid Threshold


A person whose maximal predicted similarity is low is likely not in the dataset. Therefore, we set a
threshold of 0.9, and the person was marked as unknown when the predicted maximum was less than
0.9. The modified FaceNet model is shown in Figure 6. By setting the valid threshold to 0.9, unknown
people will be filtered effectively so as to reduce the false recognition rate.
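A minimal sketch of this thresholding step is given below. It assumes that a FaceNet embedding has
already been extracted and that an SVM classifier (a scikit-learn SVC trained with probability=True) is
available; the function and variable names are illustrative, not taken from the released code.

```python
import numpy as np

VALID_THRESHOLD = 0.9  # valid threshold described above

def classify_with_threshold(embedding, svm_classifier, class_names):
    """Return a person label, or "unknown" when the best predicted similarity is low."""
    probabilities = svm_classifier.predict_proba([embedding])[0]
    best = int(np.argmax(probabilities))
    if probabilities[best] < VALID_THRESHOLD:
        return "unknown"            # stranger: filtered out instead of mislabeled
    return class_names[best]        # person registered in the dataset
```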

Building an "Unknown" Category


We added an “unknown” category and placed a large number of photos from LFW into a named
“unknown” folder to reduce the false recognition rate. The unknown folder consisted of approximately
6000 faces from more than 4000 people in LFW. The number of photos of each labelled person should
be approximately equal to the number of photos in the unknown folder. When a face is not in the real
application dataset, it will have maximum similarity with the unknown category, and an “unknown”
will be the output. By adding the unknown category, the original FaceNet model could deal with this
case. Thus, we did not need to modify the FaceNet model.
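The second improvement leaves the FaceNet model untouched and only changes the data used to train
the classifier. Assuming embeddings have already been extracted for every image, the idea can be
sketched as follows; the function is hypothetical and the folder handling is omitted.

```python
from sklearn.svm import SVC

def train_classifier_with_unknown(labelled_embeddings, unknown_embeddings):
    """Train the SVM used after FaceNet, adding an extra "unknown" class.

    labelled_embeddings: dict mapping each registered person to a list of embeddings.
    unknown_embeddings:  embeddings of the LFW faces placed in the "unknown" folder.
    """
    features, labels = [], []
    for name, embeddings in labelled_embeddings.items():
        features.extend(embeddings)
        labels.extend([name] * len(embeddings))
    features.extend(unknown_embeddings)
    labels.extend(["unknown"] * len(unknown_embeddings))

    classifier = SVC(kernel="linear", probability=True)
    classifier.fit(features, labels)
    return classifier
```

A face that belongs to none of the registered people then tends to obtain its maximal similarity with the
"unknown" class, so the plain FaceNet pipeline outputs "unknown" without any change to the model itself.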
The experimental results showed that both of the improved approaches could reduce the false
recognition rate of strangers.
[Figure 6: flowchart. Classified photos and real-time detections are passed through MTCNN and
FaceNet to extract feature vectors; the embeddings of the classified photos are organized into face
profiles (.pkl) by an SVM; for a newly detected face, the maximal predicted similarity is compared with
the threshold, and the corresponding label is output when it exceeds the threshold, otherwise "unknown"
is output.]

Figure 6. Face recognition process by IFaceNet. SVM, support vector machine.
4. Experimental Results and Analysis

4.1. Experiment Settings
Our model used the Keras framework, which is based on the Tensorflow software library.
The hardware platform was an Intel® Core™ i5-8400 2.8 GHz 6-core CPU, 64 GB RAM, an Nvidia
GeForce GTX 1080 Ti, and the Ubuntu 18.04.1 OS. The learning rate was set to 0.01, and the decay was
set to 1 × 10⁻⁶. The batch size was 10, and the stochastic gradient descent with momentum optimizer
was adopted.
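As an illustration of how the liveness detection network of Section 3.3 could be assembled with these
settings, a Keras sketch is given below. It is a hedged reconstruction, not the released model: the filter
counts, kernel sizes, and dense layer width are assumptions, whereas the 64 × 64 input, the 2 × 2 pooling
after the second and fourth convolutions, the dropout rates, the ReLU activation, the SGD optimizer with
momentum 0.9 and learning rate 0.01, and the binary cross-entropy loss follow the settings reported in
Sections 3.3 and 4.1.

```python
from tensorflow.keras import layers, models, optimizers

def build_liveness_cnn(input_shape=(64, 64, 1)):
    """Four-convolution liveness classifier sketch (filter counts are assumed)."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),            # 2 x 2 max pooling after the 2nd convolution
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),            # 2 x 2 max pooling after the 4th convolution
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),   # first fully connected layer
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # true (live) vs. false (spoof)
    ])
    # The 1e-6 decay of Section 4.1 can be passed to SGD in Keras versions that
    # still accept a decay argument.
    sgd = optimizers.SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["accuracy"])
    return model

# model = build_liveness_cnn()
# model.fit(x_train, y_train, batch_size=10, epochs=10)  # 10 rounds when a new user is added
```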
4.2. Datasets
The proposed liveness detection algorithm was based on Kinect sensor hardware, and the dataset,
called NenuLD, was our own. NenuLD included 18,036 face photos collected by a Kinect camera.
The dataset consisted of a living set and a spoof set. In accordance with common attack forms, the spoof
set consisted of photos, photos stuck to human faces, and photos on an electronic pad. The NenuLD
dataset is shown in Table 1. The dataset was divided into training, verification, and test sets in the ratio
of 8:1:1. IR images acquired using Kinect are shown in Figure 7. Figure 7a presents IR images of a live
face collected from a real person, and Figure 7b presents spoof images from photos, including some
movie stars and ourselves.

(a) IR images of live faces

(b) IR images of photos

Figure 7. IR pictures collected with a Kinect camera.
Table 1. Liveness detection dataset, called NenuLD.

              Training Images   Validation Images   Test Images
Real faces         8400              1050              1050
Spoof faces        6030               753               753
Total             14,430             1803              1803
Spoof data are shown in Figure 8. Figure 8a–c show a hand-held photo, a photo stuck to a human
face to simulate 3D human face spoofing, and a photo or video on an electronic pad, respectively.
The left side presents the IR image output from Kinect, while the right side presents the RGB image.
The IR image corresponding to the hand-held photo had a clear face contour; the MTCNN could detect
and frame the face, and the algorithm recognized the face as false. However, the photo on the electronic
pad corresponded to an IR image that was black and could not be used to detect the outline of the
human face; thus, it had no spoofing capability.

(a) Photo spoof

(b) Simulated 3D photo spoof

(c) Replay spoof

Figure 8. Spoof pictures.

We also collected framed faces as positive samples, as shown in Figure 9a,b, apart from other real
face photos, to remove the influence of the photo boundary on liveness detection.

(a) Liveness detection    (b) Framed face by a box

Figure 9. Positive samples.
4.3. Integrated Experiments

In this study, the liveness detection algorithm was combined with the face recognition algorithm
based on IFaceNet for identity authentication. When the output of the liveness detection algorithm is
true, face recognition is carried out; otherwise, face recognition is not performed. If the recognized
person is in the dataset, then the corresponding label (name) is given; otherwise, "unknown" is displayed.
The recognition results are shown in Figure 10. Figure 10a shows the results for a real person who was
labeled in the dataset; thus, the identity was recognized, and the name of this person was the output.
Figure 10b presents the results for a real person who was not in the dataset; thus, "unknown" was
displayed. Figure 10c,d show a hand-held photo, and Figure 10e,f show a hand-held photo, a real person,
and a photo on an electronic pad.

(a) Labeled real person    (b) Unlabeled real person

(c) Labeled person holding a picture    (d) Unlabeled person holding a picture

Figure 10. Cont.
(e) Labeled person holding a picture and an electronic pad    (f) Unlabeled person holding a picture and
an electronic pad

Figure 10. Integrated test.

4.4. Performance Evaluation

The proposed identity authentication algorithm consisted of two parts, namely, liveness detection
and face recognition. The liveness detection algorithm consisted of a lightweight CNN model. After the
dataset was divided into training, verification, and test sets in the ratio of 8:1:1, the accuracy of the
algorithm was 99.8% on our dataset NenuLD. Comparisons with other liveness detection algorithms on
cross-databases are shown in Table 2; our error rate was only 0.2%, lower than that of the other methods.
Because we collected images with a Kinect camera, we could not perform intradatabase comparisons.
The improved FaceNet model was trained on the LFW dataset followed by cross-validation, and its
recognition accuracy was 99.7%. Contrasts between FaceNet and IFaceNet are shown in Table 3.
For strangers, IFaceNet gave a more accurate recognition output than FaceNet. The whole performance
evaluation of the identity authentication algorithm is shown in Table 4. The accuracy of liveness
detection was 99.8% on the NenuLD dataset, and the accuracy of IFaceNet was 99.7%. Thus, the accuracy
of the whole identity authentication algorithm was 99.5%, the product of the two recognition accuracies
(0.998 × 0.997 ≈ 0.995).
Table 2. Comparison of liveness detection with cross-databases.

                      Replay-Attack ERR (%)   CASIA ERR (%)   NenuLD ERR (%)
DOG (baseline) [26]            -                  17.0              -
DLTP [27]                     7.13                 7.02             -
Deep Learning [15]            6.1                  7.3              -
DPCNN [17]                    2.9                  4.5              -
LDIR                           -                    -              0.2

Table 3. Comparison of FaceNet and IFaceNet.

             People in Dataset                                     Strangers
FaceNet      Output correct results with 99.7% recognition rate   Output ID with maximal similarity (error)
IFaceNet     Output correct results with 99.7% recognition rate   Output "unknown" (correct)

Table 4. Accuracy of the proposed algorithm.

Accuracy of Live Detection   Accuracy of Face Recognition   Total Accuracy
          99.8%                         99.7%                    99.5%


The integrated identity authentication algorithm had a time performance of 0.01 s in identifying
a photo. Thus, this algorithm can be used for real-time detection and recognition. The IR images
for liveness detection were collected using a Kinect camera and, therefore, cannot be compared with
the existing face spoof datasets. However, the recognition accuracy from Table 4 indicates that the
proposed algorithm was applicable in terms of accuracy and real-time efficiency.

5. Conclusions and Future Work


This paper proposed an identity authentication system combining an improved FaceNet model
and a liveness detection method. IR images collected by the Kinect camera have depth information.
Therefore, the IR pixels from live images have an evident hierarchical structure, while those from
photos or videos do not have this feature. Experimental results showed that the proposed liveness
detection method achieved a higher recognition accuracy than existing methods. After that, we improved
the FaceNet model for real applications and combined it with liveness detection. The resulting system
can effectively defend against 2D deception.
The IR image features of live faces are greatly different from those of photos or videos, so liveness
detection can be treated as a binary classification problem. Thus, a lightweight CNN was designed to
realize accurate liveness recognition. The algorithm had a high time efficiency and can be applied in
real time.
In future work, more diverse liveness samples will be collected, and different types of spoofs will be
added to test the reliability of the models and algorithms, especially spoofs using 3D masks and
wax figures.

Author Contributions: This study was completed by the co-authors. S.L. conceived and led the research.
The major experiments and analyses were undertaken by Y.S. and J.Z.; M.Z. was responsible for drawing all
flowcharts. S.Y. recorded and analyzed experimental data. K.H. wrote the draft. All authors have read and
approved the final manuscript.
Funding: This research was funded by the project of Jilin Provincial Science and Technology Department under
grant 20180201003GX, which also funded the APC.
Acknowledgments: This work was supported partially by the project of Jilin Provincial Science and Technology
Department under grant 20180201003GX and the project of Jilin province development and reform commission
under grant 2019C053-4. The authors gratefully acknowledge the helpful comments and suggestions of the
reviewers, which have improved the presentation.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Chihaoui, M.; Elkefi, A.; Bellil, W.; Ben Amar, C. A survey of 2D face recognition techniques. Computers 2016,
5, 21. [CrossRef]
2. Schroff, F.; Kalenichenko, D.; Philbin, J. FaceNet: A unified embedding for face recognition and clustering.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA,
USA, 7–12 June 2015; pp. 815–823.
3. Li, Y.; Xu, K.; Yan, Q.; Li, Y.; Deng, R.H. Understanding OSN-based facial disclosure against face authentication
systems. In Proceedings of the 9th ACM Symposium on Information, Computer and Communications
Security, Kyoto, Japan, 3–6 June 2014; ACM: New York, NY, USA; pp. 413–424.
4. de Freitas Pereira, T.; Komulainen, J.; Anjos, A.; De Martino, J.M.; Hadid, A.; Pietikäinen, M.; Marcel, S. Face
liveness detection using dynamic texture. EURASIP J. Image Video Process. 2014, 2, 4. [CrossRef]
5. Boulkenafet, Z.; Komulainen, J.; Hadid, A. Face spoofing detection using colour texture analysis. IEEE Trans.
Inf. Forensics Secur. 2016, 11, 1818–1830. [CrossRef]
6. Chingovska, I.; Anjos, A.; Marcel, S. On the effectiveness of local binary patterns in face anti-spoofing.
In Proceedings of the 11th International Conference of the Biometrics Special Interes Group, Darmstadt,
Germany, 6–7 September 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1–7.
7. Raghavendra, R.; Raja, K.B.; Busch, C. Presentation attack detection for face recognition using light field
camera. IEEE Trans. Image Process. 2015, 24, 1060–1075. [CrossRef] [PubMed]

8. Li, X.B.; Komulainen, J.; Zhao, G.Y.; Yuen, P.C.; Pietikäinen, M. Generalized face anti-spoofing by detecting
pulse from face videos. In Proceedings of the 23rd International Conference on Pattern Recognition, Cancun,
Mexico, 4–8 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 4244–4249.
9. Hernandez-Ortega, J.; Fierrez, J.; Morales, A.; Tome, P. Time analysis of pulse-based face anti-spoofing in
visible and NIR. In Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops,
Salt Lake City, UT, USA, 18–22 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 544–552.
10. Wang, S.Y.; Yang, S.H.; Chen, Y.P.; Huang, J.W. Face liveness detection based on skin blood flow analysis.
Symmetry 2017, 9, 305. [CrossRef]
11. Sun, L.; Pan, G.; Wu, Z.H.; Lao, S.H. Blinking-based live face detection using conditional random fields.
In Proceedings of the International Conference on Biometrics, Seoul, Korea, 27–29 August 2007; Springer:
Berlin/Heidelberg, Germany, 2007; pp. 252–260.
12. Li, J.W. Eye blink detection based on multiple gabor response waves. In Proceedings of the International
Conference on Machine Learning and Cybernetics, San Diego, CA, USA, 11–13 December 2008; IEEE:
Piscataway, NJ, USA, 2008; pp. 2852–2856.
13. Kollreider, K.; Fronthaler, H.; Faraj, M.I.; Bigun, J. Real-time face detection and motion analysis with
application in “liveness” assessment. Trans. Inf. Forensics Secur. 2007, 2, 548–558. [CrossRef]
14. Bharadwaj, S.; Dhamecha, T.I.; Vatsa, M.; Singh, R. Computationally efficient face spoofing detection with
motion magnification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops, Portland, OR, USA, 23–28 June 2013; pp. 105–110.
15. Yang, J.W.; Lei, Z.; Li, S.Z. Learn convolutional neural network for face anti-spoofing. arXiv 2014,
arXiv:1408.5601.
16. Atoum, Y.; Liu, Y.J.; Jourabloo, A.; Liu, X.M. Face antispoofing using patch and depth-based cnns.
In Proceedings of the IEEE International Joint Conference on Biometrics, Denver, CO, USA, 1 October 2017;
IEEE: Piscataway, NJ, USA, 2017; pp. 319–328.
17. Li, L.; Feng, X.Y.; Boulkenafet, Z.; Xia, Z.Q.; Li, M.M.; Hadid, A. An original face anti-spoofing approach
using partial convolutional neural network. In Proceedings of the 6th International Conference on Image
Processing Theory Tools and Applications, Oulu, Finland, 12–15 December 2016; IEEE: Piscataway, NJ, USA,
2016; pp. 1–6.
18. Tu, X.K.; Fang, Y.C. Ultra-deep neural network for face anti-spoofing. In Proceedings of the International
Conference on Neural Information Processing, Guangzhou, China, 14–18 November 2017; Springer:
Berlin/Heidelberg, Germany, 2017; pp. 686–695.
19. Li, L.; Feng, X.Y.; Jiang, X.Y.; Xia, Z.Q.; Hadid, A. Face antispoofing via deep local binary patterns.
In Proceedings of the IEEE International Conference on Image Processing, Beijing, China, 17–20 September
2017; IEEE: Piscataway, NJ, USA, 2017; pp. 101–105.
20. Nagpal, C.; Dubey, S.R. A performance evaluation of convolutional neural networks for face anti spoofing.
arXiv 2018, arXiv:1805.04176.
21. Li, H.L.; He, P.S.; Wang, S.Q.; Rocha, A.; Jiang, X.H.; Kot, A.C. Learning generalized deep feature representation
for face anti-spoofing. IEEE Trans. Inf. Forensics Secur. 2018, 13, 2639–2652. [CrossRef]
22. Gan, J.Y.; Li, S.L.; Zhai, Y.K.; Liu, C.Y. 3D convolutional neural network based on face anti-spoofing.
In Proceedings of the International Conference on Multimedia and Image Processing, Wuhan, China, 17–19
March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–5.
23. Ghazel, A.; Sharifa, A. The effectiveness of depth data in liveness face authentication using 3D sensor cameras.
Sensors 2019, 19, 1928.
24. Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint face detection and alignment using multitask cascaded convolutional
networks. IEEE Signal Process. Lett. 2016, 23, 1499–1503. [CrossRef]
25. Sim, T.; Baker, S.; Bsat, M. The CMU pose, illumination, and expression (PIE) database. In Proceedings of the
Fifth IEEE International Conference on Automatic Face Gesture Recognition, Washington, DC, USA, 21 May
2002; IEEE: Piscataway, NJ, USA, 2002; pp. 2852–2856.

26. Zhang, Z.; Yan, J.; Liu, S.; Lei, Z.; Yi, D.; Li, S. A face antispoofing database with diverse attacks. In Proceedings
of the International Conference on Biometrics (ICB), New Delhi, India, 30 March–1 April 2012; pp. 26–31.
27. Parveen, S.; Ahmad, S.M.S.; Abbas, N.H.; Adnan, W.A.W.; Hanafi, M.; Naeem, N. Face liveness detection
using Dynamic Local Ternary Pattern (DLTP). Computers 2016, 5, 10. [CrossRef]

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
