Pavan C Internship

Sampoorna Institute of Technology And Research
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
PROJECT ON
“3D-FACIAL LANDMARK DETECTIONS FOR INTELLIGENT VIDEO SYSTEMS”
NAME: PAVAN C
USN: 1SZ18CS009
UNDER THE GUIDANCE OF: PROF. NIRUPADI

ABSTRACT
• Facial landmark detection is a fundamental research topic in computer vision
that is widely adopted in many applications.
• The Facial-Landmark Detector, which is based on a state-of-the-art architecture, is
used to obtain accurate facial landmark points.
• Additionally, it modifies the Hourglass modules by using Residual-Dense
blocks in the mainstream to capture more efficient features.

INTRODUCTION
• Facial landmark detection is a crucial technique for face recognition, gaze estimation,
facial attribute recognition, etc. However, fast head pose estimation running on the
terminal for video edge computation faces many challenges due to the computational
complexity of existing algorithms.
• Since the human face carries a large amount of information related to identity, emotion,
attention, gender, etc., many face-based applications have been developed, for
example, face recognition, identity verification, expression recognition, and facial
attribute analysis.
• Face detectors such as Dlib and Faster R-CNN are used to detect the bounding boxes of
faces inside the input.
LITERATURE SURVEY

SI NO: 1
TITLE: Facial Landmark Detection Using Generative Adversarial Network Combined with
Autoencoder for Occlusion Networks
YEAR: 2020
AUTHOR: Hongzhe Liu, Weicheng Zheng, Cheng Xu, Teng Liu, and Min Zuo
METHODOLOGY: Facial landmark detection is essential to many facial analysis applications,
e.g., face recognition, facial expression analysis, and 3D face modeling.
RESULTS: We introduce a novel generative adversarial network with improved autoencoders
to restore the partially occluded face region via deep regression networks.

SI NO: 2
TITLE: A Robust Facial Landmark Detection in Uncontrolled Natural Condition
YEAR: 2020
AUTHOR: Ping Wang, Qianyu Zhou, Yucong Zhao, Shumin Zhao, Jiquan Ma
METHODOLOGY: Facial landmark (FL) detection aims to identify some key points as a sparse
representation of a face.
RESULTS: Facial landmark detection in uncontrolled natural condition is a challenging task
due to illumination, occlusion, expression, posture, blur and other factors.

SI NO: 3
TITLE: Fast Head Pose Estimation via Rotation-Adaptive Facial Landmark Detection for
Video Edge Computation
YEAR: 2020
AUTHOR: Weiwei Wang, Xiaoyan Chen, Shuangwu Zheng, and Haiqing Li
METHODOLOGY: The human head pose estimation is an important and challenging problem,
which provides the estimation of the head posture in 3D space from 2D image.
RESULTS: Edge computing is a rising computing strategy that distributes parts of
computation to the terminal, rather than all of them to the cloud.
PROBLEM STATEMENT
• Earlier human pose estimation applications were mainly based on
pictorial structures.
• 2D facial landmark annotation hardly preserves the 3D structure
of the human face, especially for profile views.
• Older methods have not matched the high performance
exhibited by cascaded regression methods.
EXISTING SYSTEM
• Facial recognition is studied in the image recognition field using various datasets,
such as frontal face photos and images in the wild.
Steps:
(a) Conversion of data from the 3D facial image to a 2D image.
(b) Extraction of facial landmarks from the 2D image using a Convolutional
Neural Network (CNN).
(c) Inversion of the identified facial landmarks from the 2D image back to the 3D image.
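The three steps above can be illustrated with a toy round trip. This is a minimal sketch under stated assumptions, not the pipeline of any cited work: it assumes an orthographic projection of a point cloud onto a depth image, and `detect_2d` is a hypothetical stand-in for the CNN that simply picks the pixel closest to the camera.

```python
def project_to_depth_image(points, width, height):
    """Step (a): orthographic projection of 3D points onto a 2D depth image.

    Each pixel keeps the largest z seen at that (x, y) position.
    Assumes integer x, y that already lie inside the image grid.
    """
    depth = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if depth[y][x] is None or z > depth[y][x]:
            depth[y][x] = z
    return depth


def detect_2d(depth):
    """Step (b): hypothetical stand-in for the CNN landmark detector.

    Returns the (x, y) pixel with the largest depth value.
    """
    best = None
    for y, row in enumerate(depth):
        for x, z in enumerate(row):
            if z is not None and (best is None or z > depth[best[1]][best[0]]):
                best = (x, y)
    return best


def lift_to_3d(landmark_2d, depth):
    """Step (c): invert a 2D landmark back to 3D by reading its depth."""
    x, y = landmark_2d
    return (x, y, depth[y][x])


# Toy point cloud; two points fall on pixel (1, 1) and the nearer one wins.
cloud = [(0, 0, 1.0), (1, 1, 5.0), (2, 1, 2.0), (1, 1, 3.0)]
depth_image = project_to_depth_image(cloud, width=3, height=2)
landmark_3d = lift_to_3d(detect_2d(depth_image), depth_image)
```

The depth image is what makes step (c) possible: the 2D detector only returns pixel positions, and the stored depth supplies the missing third coordinate.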

PROPOSED SYSTEM
• The proposed method adopts a top-down approach.
• The full 3D facial landmarks can be generated from the detected
facial landmarks by adding an extra network that predicts the depth.
• It detects the landmark points of all faces in an input, which can be
an image or a frame of video.
• The Faster R-CNN face detector uses a region proposal step to propose possible regions
of interest (ROIs).

ADVANTAGES
• This study combines the advantages of the PCN face detection method and the LBF
landmark detection method to construct a novel landmark detection method,
RALBF, which is robust to rotational variation.
• Experiments show that the method has an obvious speed advantage, which makes it
suitable for running on the terminal for video edge computation.
• The test results show that the proposed head pose estimation method works
well under normal conditions, except that the estimation error increases when the
yaw angle exceeds the range of ±35°.
DISADVANTAGES
• Results may be inaccurate when the head pose angle is large.
• The verification procedure serves as a secondary inspection
of the estimation result to increase the reliability of the system.
METHODOLOGY
• The proposed method adopts a top-down approach.
• From a given input image or frame, the Faster R-CNN face detector with ResNet-50
[20] as the backbone generates the bounding boxes for all faces inside the
input.
• Then, the input is cropped based on these bounding boxes to produce a set of cropped
images, each corresponding to one bounding box (one face).
• After that, for each cropped image, the score-maps of all landmark points of the face
inside are generated by the proposed Facial-Landmark Detector.
• Finally, the landmarks of the faces are located from the max activations across these
score-maps and the offsets of the bounding boxes on the original input image or frame.
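The final step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each score-map covers the whole face crop, so one score-map cell spans `width / map_w` by `height / map_h` pixels of the frame. The names `argmax_2d` and `landmark_in_frame` are hypothetical helpers introduced for this example.

```python
def argmax_2d(score_map):
    """Return (row, col) of the maximum activation in a 2D score-map."""
    best_row, best_col = 0, 0
    for r, row in enumerate(score_map):
        for c, value in enumerate(row):
            if value > score_map[best_row][best_col]:
                best_row, best_col = r, c
    return best_row, best_col


def landmark_in_frame(score_map, bbox):
    """Map the peak of one score-map back to original frame coordinates.

    bbox is (x0, y0, width, height) of the face crop in the frame.
    Assumption: the score-map covers the whole crop, so the peak cell is
    rescaled to crop pixels and then shifted by the bounding-box offset.
    """
    x0, y0, width, height = bbox
    map_h, map_w = len(score_map), len(score_map[0])
    row, col = argmax_2d(score_map)
    x = x0 + col * width / map_w
    y = y0 + row * height / map_h
    return x, y


# Toy 4 x 4 score-map for one landmark, with its peak at row 2, column 1.
score_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.3, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.0],
]
point = landmark_in_frame(score_map, bbox=(10, 20, 40, 80))
```

In a full system this would run once per landmark score-map and once per detected face, which is what makes the approach top-down.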
Four major components are applied:
• Deep Residual Network
• Facial Landmark Detector
• Modified Hourglass Module
• Residual Dense Block

1. DEEP RESIDUAL NETWORK
• As a network gets deeper, its accuracy saturates and then degrades rapidly:
adding more layers to an already deep model leads to lower accuracy.
• This problem is called the degradation problem.
• ResNet was proposed by He et al. [20] to overcome it.
• Their solution is to add skip connections that create a residual mapping.
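The residual mapping can be written as y = F(x) + x, where F is the function learned by the stacked layers and x is carried over by the skip connection. The sketch below is a toy illustration on plain lists, with `zero_layer` and `halve` as hypothetical stand-ins for learned layers. It shows why an extra block need not hurt: if F outputs zero, the block reduces to the identity.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of F."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_layer(x):
    """A layer that contributes nothing: F(x) = 0 for every input."""
    return [0.0 for _ in x]


def halve(x):
    """A toy learned residual: F(x) = -0.5 * x."""
    return [-0.5 * xi for xi in x]


features = [1.0, -2.0, 4.0]
identity_out = residual_block(features, zero_layer)  # block acts as identity
scaled_out = residual_block(features, halve)         # block adds a correction
```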

2. FACIAL LANDMARK DETECTOR
• Unlike the original Stacked Hourglass Networks [11] and their variants [38], [40], which
use only a 7 × 7 convolution layer at the beginning followed by multiple Hourglass
modules at stride = 4 (a resolution 4 × 4 times lower than the input), the
proposed network uses the first three blocks of ResNet-50 [20] as a backbone to
extract features from the cropped input image at stride = 8.
• The most computationally expensive part of this system is the hourglass
modules.
• The upsampling step helps the proposed network generate stride = 4
score-maps of higher quality for the landmark points of the face
inside the crop.
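The stride values above translate directly into feature-map sizes. The arithmetic below is a small sketch; the 256-pixel crop size is an assumption made for illustration and is not stated in the text.

```python
def feature_size(input_size, stride):
    """Spatial size of a feature map for a square input at a given stride."""
    return input_size // stride


def upsample_factor(from_stride, to_stride):
    """Scale factor needed to go from a coarse stride to a finer one."""
    return from_stride // to_stride


INPUT_SIZE = 256  # assumed crop size, not given in the text

backbone_size = feature_size(INPUT_SIZE, stride=8)   # backbone output
score_map_size = feature_size(INPUT_SIZE, stride=4)  # final score-maps
factor = upsample_factor(8, 4)

# Number of spatial positions at stride 4 relative to stride 8.
position_ratio = (score_map_size ** 2) // (backbone_size ** 2)
```

Under this assumption, features at stride 8 have four times fewer spatial positions than features at stride 4, which suggests why working at stride 8 and upsampling only at the end keeps the cost of the hourglass modules down.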
Fig. 1 Architecture of the proposed facial landmark detector.
3. MODIFIED HOURGLASS MODULE
• The Modified Hourglass module replaces the original residual blocks
with residual-dense blocks in the mainstream to capture more
efficient information.
• Since the residual-dense block is quite heavy in computation and
size, the module uses 1 × 1 convolution layers in the branch streams to
reduce the model size and computational time.
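The saving from 1 × 1 convolutions follows from the weight count of a convolution layer, which is kernel height × kernel width × input channels × output channels. The sketch below compares a 3 × 3 and a 1 × 1 layer; the channel count of 256 is an assumption chosen for illustration, not a value from the text.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=False):
    """Number of learnable parameters in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


CHANNELS = 256  # assumed channel width, for illustration only

params_3x3 = conv_params(3, CHANNELS, CHANNELS)
params_1x1 = conv_params(1, CHANNELS, CHANNELS)
ratio = params_3x3 // params_1x1
```

For equal channel widths the 1 × 1 layer needs nine times fewer weights than the 3 × 3 layer, and the multiply-accumulate count per output position shrinks by the same factor.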
Fig. 2 Architecture of the Modified Hourglass Module
4. RESIDUAL DENSE BLOCK
• The Dense block uses direct connections from any layer to all subsequent
layers in the block.
• The Residual block adds a residual connection that bypasses the block function H_b(·)
with an identity function.
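The two ideas can be combined in a toy sketch on plain lists. This is an illustration under stated assumptions rather than the block used in the proposed network: `make_layer` is a hypothetical layer that outputs the mean of its input, and the fusion back to the input width is a simple placeholder.

```python
def concat(feature_list):
    """Concatenate a list of feature vectors along the channel axis."""
    return [v for feats in feature_list for v in feats]


def make_layer(growth):
    """Toy layer: emits `growth` channels, each the mean of its input."""
    def layer(x):
        mean = sum(x) / len(x)
        return [mean] * growth
    return layer


def residual_dense_block(x, num_layers, growth):
    """Densely connected layers followed by a residual connection."""
    feats = [x]
    input_widths = []
    for _ in range(num_layers):
        dense_in = concat(feats)  # every earlier output is visible here
        input_widths.append(len(dense_in))
        feats.append(make_layer(growth)(dense_in))
    # Placeholder fusion back to len(x) channels: mean of all features.
    all_feats = concat(feats)
    fused = [sum(all_feats) / len(all_feats)] * len(x)
    # Residual connection: the identity shortcut is added to the output.
    out = [f + xi for f, xi in zip(fused, x)]
    return out, input_widths


x = [1.0, 1.0, 1.0, 1.0]
out, widths = residual_dense_block(x, num_layers=3, growth=2)
```

The recorded input widths grow by `growth` channels per layer, which is the cost of dense connectivity and the reason such a block is heavy in computation and size.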
APPLICATION
• Airport security.
• Employee attendance in MNCs.
• CCTV cameras for detecting traffic rule violations.
• Other security purposes.

CONCLUSION
• The proposed network uses ResNet-50 as the backbone, followed by four Modified
Hourglass modules.
• It modifies the Hourglass modules by using Residual-Dense blocks in the
mainstream to capture more efficient features.
• It also enhances the features from the Modified Hourglass modules with finer-
resolution features.
