Pedro C. Neto (1,2), pedro.d.carneiro@inesctec.pt
Ana F. Sequeira (1), ana.f.sequeira@inesctec.pt
Jaime S. Cardoso (2,1), jaime.cardoso@inesctec.pt

(1) INESC TEC, Porto, Portugal
(2) Faculty of Engineering, University of Porto, Porto, Portugal
Abstract

Presentation attacks are among the most frequent vulnerabilities of biometric systems. To perform these attacks, impostors attempt to bypass the biometric vision system. The human visual cortex can leverage distinct information from the background and the main focus. However, researchers still rely on the idea that the background is, in the majority of cases, harmful to machine learning algorithms, and thus face presentation attack detection models are trained with tight crops of the face. We argue that this rather limits the model and its performance. We further show that a binary classification system aware of the background is capable of outperforming its counterpart that receives no information regarding the background. The proposed methodology beats current approaches and achieves an equal error rate (EER) of just 0.9%. We further analyze the predictions from an interpretability point of view and argue that the background elements used by the model are similar to the ones used by humans.

… different subjects. The average clip length is 10 seconds. These clips were collected with five mobile devices (each with a distinct camera resolution) and under five lighting conditions. The front-facing camera was used, with a distance between face and camera of about 30 to 50 centimetres.

Table 1: List of attacks present in the ROSE-YOUTU dataset [12].

    Attack  Description
    -       Genuine (bona fide)
    #1      Still printed paper
    #2      Quivering printed paper
    #3      Video which records a Lenovo LCD display
    #4      Video which records a Mac LCD display
    #5      Paper mask with two eyes and mouth cropped out
    #6      Paper mask without cropping
    #7      Paper mask with the upper part cut in the middle

Figure 1: Samples collected from the ROSE-YOUTU dataset [12] containing images from attacks and genuine captures. Cropped images are displayed on the top row, whereas the bottom row contains the exact same images but with all the background information included. Panels shown: (g) Attack #4, (h) Attack #1, (i) Attack #6, (j), (k), (l) Genuine.
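The contrast in Figure 1 between cropped and full-frame inputs corresponds to two preprocessing choices. Below is a minimal sketch of the conventional tight-crop pipeline the paper argues against; the function name, the margin parameter, and the bounding box (assumed to come from some upstream face detector) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def tight_face_crop(frame, bbox, margin=0.1):
    """Crop the face region from a frame, as in the conventional PAD
    pipeline. `bbox` = (x, y, w, h) from a hypothetical face detector;
    `margin` adds a small border around the detected box."""
    x, y, w, h = bbox
    mx, my = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - mx), max(0, y - my)
    x1 = min(frame.shape[1], x + w + mx)
    y1 = min(frame.shape[0], y + h + my)
    return frame[y0:y1, x0:x1]

# The full-frame alternative advocated in this paper simply feeds
# `frame` unchanged (resized to the network input size), preserving
# background evidence such as bezels, paper edges, or reflections.
```

The tight crop discards exactly the regions (screen borders, pins, reflections) that the interpretability analysis later shows to be discriminative.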
Table 3: Comparison of the proposed approach with the state of the art. EER is displayed as %. The best result per column is in bold.

    Method                       EER
    Color LBP [1, 5]            27.6
    CoALBP (YCbCr) [12]         17.1
    CoALBP (HSV) [12]           16.4
    Color [2, 5]                13.9
    De-Spoofing [5, 9]          12.3
    RCTR-all spaces [5]         10.7
    ResNet-18 [7]                9.3
    SE-ResNet18 [8]              8.6
    AlexNet [12]                 8.0
    DR-UDA (SE-ResNet18) [13]    8.0
    DR-UDA (ResNet-18) [13]      7.2
    3D-CNN [11]                  7.0
    Blink-CNN [6]                4.6
    DRL-FAS [3]                  1.8
    Ours                         0.9

Finally, we produced explanations of our model for an example of each category of attacks. For the replay attack, the explanations in Figure 2a show that the model leveraged the presence of reflections in the attack image. Figure 2b shows the explanation for a paper mask attack; as expected, the explanation does not rely on the background. Instead, the model directs its focus to the mask area for the final prediction. Finally, the print attack explanation is shown in Figure 2c. It shows that the model understands the conditions of the given image and directs its focus to an important background artefact, the pin holding the printed image.

Figure 2: Explanations produced for a prediction from a frame of a video of subject #23. Colors closer to pink represent areas with larger relevance for the decision; bluish colors represent less important pixels. Panels: (a) Replay attack, (b) Paper mask attack, (c) Print attack.

5 Conclusions

In this work, we explored our belief that researchers have been creating limitations for face presentation attack detection models by cropping the face from the frame. The experiments of our work corroborated the view that a face PAD model is capable of leveraging both background and face elements to make a correct prediction. The proposed approach surpassed the state-of-the-art results on the ROSE-YOUTU dataset, with a lightweight model providing impressive results. The interpretability analysis corroborated our beliefs regarding the usage of background elements.

Acknowledgements: This work was partially funded by the Project TAMI - Transparent Artificial Medical Intelligence (NORTE-01-0247-FEDER-045905), financed by ERDF - European Regional Development Fund through the North Portugal Regional Operational Program - NORTE 2020, by the Portuguese Foundation for Science and Technology - FCT under the CMU-Portugal International Partnership, and within the PhD grant "2021.06872.BD".

References

[1] Zinelabidine Boulkenafet, Jukka Komulainen, and Abdenour Hadid. Face anti-spoofing based on color texture analysis. In 2015 IEEE International Conference on Image Processing (ICIP), pages 2636-2640. IEEE, 2015.
[2] Zinelabidine Boulkenafet, Jukka Komulainen, and Abdenour Hadid. Face spoofing detection using colour texture analysis. IEEE Transactions on Information Forensics and Security, 11(8):1818-1830, 2016. doi: 10.1109/TIFS.2016.2555286.
[3] Rizhao Cai, Haoliang Li, Shiqi Wang, Changsheng Chen, and Alex C. Kot. DRL-FAS: A novel framework based on deep reinforcement learning for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 16:937-951, 2020.
[4] Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, and Vineeth N. Balasubramanian. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 839-847, 2018. doi: 10.1109/WACV.2018.00097.
[5] Yuting Du, Tong Qiao, Ming Xu, and Ning Zheng. Towards face presentation attack detection based on residual color texture representation. Security and Communication Networks, 2021:6652727, 2021. ISSN 1939-0114. doi: 10.1155/2021/6652727. URL https://doi.org/10.1155/2021/6652727.
[6] Md. Mehedi Hasan, Md. Salah Uddin Yusuf, Tanbin Islam Rohan, and Shidhartho Roy. Efficient two stage approach to detect face liveness: Motion based and deep learning based. In 2019 4th International Conference on Electrical Information and Communication Technology (EICT), pages 1-6, 2019. doi: 10.1109/EICT48899.2019.9068813.
[7] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770-778, 2016.
[8] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7132-7141, 2018.
[9] Amin Jourabloo, Yaojie Liu, and Xiaoming Liu. Face de-spoofing: Anti-spoofing via noise modeling. In Proceedings of the European Conference on Computer Vision (ECCV), pages 290-306, 2018.
[10] Dakshina Ranjan Kisku and Rinku Datta Rakshit. Face spoofing and counter-spoofing: a survey of state-of-the-art algorithms. Transactions on Machine Learning and Artificial Intelligence, 5(2):31, 2017.
[11] Haoliang Li, Peisong He, Shiqi Wang, Anderson Rocha, Xinghao Jiang, and Alex C. Kot. Learning generalized deep feature representation for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 13(10):2639-2652, 2018. doi: 10.1109/TIFS.2018.2825949.
[12] Haoliang Li, Wen Li, Hong Cao, Shiqi Wang, Feiyue Huang, and Alex C. Kot. Unsupervised domain adaptation for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 13(7):1794-1809, 2018. doi: 10.1109/TIFS.2018.2801312.
[13] Guoqing Wang, Hu Han, Shiguang Shan, and Xilin Chen. Unsupervised adversarial domain adaptation for cross-domain face presentation attack detection. IEEE Transactions on Information Forensics and Security, 16:56-69, 2021. doi: 10.1109/TIFS.2020.3002390.
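As a closing note, the equal error rate used to compare methods in this paper is the operating point where the false acceptance rate (attacks scored as genuine) equals the false rejection rate (genuine samples scored as attacks). A minimal sketch of the computation, assuming score lists where higher means more likely genuine (a naive threshold sweep, not the authors' evaluation code):

```python
def eer(genuine_scores, attack_scores):
    """Equal error rate: sweep thresholds over all observed scores and
    return the mean of FAR and FRR at the point where they are closest."""
    thresholds = sorted(set(genuine_scores) | set(attack_scores))
    best = None  # (|FAR - FRR|, EER estimate)
    for t in thresholds:
        far = sum(s >= t for s in attack_scores) / len(attack_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]
```

With perfectly separated scores the EER is 0; the 0.9% reported here means FAR and FRR cross at under one error per hundred trials.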