IMAGE DETECTION FROM VIDEOS USING AI

Abstract - How to accurately and effectively identify people has always been an interesting topic, both in research and in industry. With the rapid development of artificial intelligence in recent years, facial recognition has gained more and more attention. Compared with traditional card recognition, fingerprint recognition and iris recognition, face recognition has many advantages, including but not limited to being contactless, supporting high concurrency, and being user friendly. It has high potential to be used in government, public facilities, security, e-commerce, retailing, education and many other fields. Deep learning is one of the new and important branches of machine learning. Deep learning refers to a set of algorithms that solve various problems, such as images and texts, by using multi-layer neural networks. Deep learning can be classified as a neural network in the general sense, but there are many differences in the concrete realization. At the core of deep learning is feature learning, which is designed to obtain hierarchical information through hierarchical networks, so as to solve the important problems that previously required hand-crafted features. Deep learning is a framework that contains several important algorithms; for different applications (images, voice, text), you need to use different network models to achieve better results. With the development of deep learning and the introduction of deep convolutional neural networks, the accuracy and speed of face recognition have made great strides. However, as noted above, the results from different networks and models are very different. In this paper, facial features are extracted by merging and comparing multiple models, and then a deep neural network is constructed to train and construct the combined features.

CHAPTER 1: Introduction

Ever since IBM introduced the first personal computer in 1981, through the dot-com era of the early 2000s, the online shopping trend of the last 10 years, and the Internet of Things today, computers and information technologies have been rapidly integrating into everyday human life. As the digital world and the real world merge more and more, how to accurately and effectively identify users and improve information security has become an important research topic. This is true not only in the civil area; in particular, since the 9/11 terrorist attacks, governments all over the world have made urgent demands on this issue, prompting the development of new identification methods. Traditional identity recognition technology mainly relies on the individual's own memory (password, username, etc.) or foreign objects (ID card, key, etc.). However, whether relying on foreign objects or on one's own memory, there are serious security risks. Not only is it difficult to regain the original identity material, but the identity information is also easily acquired by others if the items that prove one's identity are stolen or forgotten. As a result, if the identity is impersonated by others, there can be serious consequences. Different from traditional identity recognition technology, biometrics uses the inherent characteristics of the body for identification, such as fingerprints, irises, the face and so on. Compared with traditional identity recognition technology, biological features have many advantages: 1. Reproducibility: biological characteristics are inborn and cannot be changed, so it is impossible to copy other people's biological characteristics. 2. Availability: biological features, as part of the human body, are readily available and will never be forgotten. 3. Ease of use: many biological characteristics do not require individuals to cooperate with the examination device. Based on the above advantages, biometrics has attracted the attention of major corporations and research institutes and has successfully replaced traditional recognition technologies in many fields. With the rapid development of computers and artificial intelligence, biometric technology can easily cooperate with computers and networks to realize automated management, and it is rapidly integrating into people's daily life. When comparing different biometrics, we can see that the cost of facial recognition is low, user acceptance is high, and the acquisition of information is easy. Facial recognition uses computer vision technology and related algorithms to find faces in pictures or videos and then analyse their identity. In addition, further analysis of the acquired face may derive additional attributes of the individual, such as gender, age and emotion.

APPLICATION OF THIS PROJECT
Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.

IDENTIFICATION OF CLASS & NEED
It is widely used in computer vision tasks such as image annotation, vehicle counting, activity recognition, face detection, face recognition, and video object co-segmentation. It is also used in tracking objects, for example tracking a ball during a football match, tracking the movement of a cricket bat, or tracking a person in a video.
Every object class has its own special features that help in classifying the class; for example, all circles are round. Object class detection uses these special features. For example, when looking for circles, objects that are at a particular distance from a point (i.e. the center) are sought. Similarly, when looking for squares, objects that are perpendicular at the corners and have equal side lengths are needed. A similar approach is used for face identification, where eyes, nose and lips can be found, and features like skin color and the distance between the eyes can be measured.

RELEVANT CONTEMPORARY ISSUES
The main disadvantage of manual attendance tracking is the risk of human error. We are all human, and we can easily make mistakes when collecting data manually. Manual tracking is also time consuming and increases the paper load.

PROBLEM IDENTIFICATION
With a face recognition attendance system, the entire environment is automated. You will not just take the attendance but also automatically record the entry and exit times of the employees. It also adds to the security of the workplace, as the system can accurately recognize who left the designated area and when.

TASK IDENTIFICATION
This project deals with face recognition, where each student in the class will be videographed and their attendance will be stored on a server. The teacher can then record the attendance by just taking some pictures of the classroom. The system will recognize the faces and mark their attendance.
CHAPTER 2: Literature survey

2.1 FACE DETECTION AND FACE TRACKING
The article Robust Real-time Object Detection [2] is the most frequently cited of the series of articles by Viola that made face detection truly practical. We can learn about several face detection methods and algorithms from this publication. The article Fast rotation invariant multi-view face detection based on real Adaboost [3] applied real AdaBoost to object detection for the first time and proposed a more mature and practical multi-view face detection framework; the nested structure it proposed as an improvement on the cascade structure also gives good results. The article Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans [4] is a good combination of a face detection model and tracking, and of an offline model and an online model, and it received the CVPR 2007 Best Student Paper award. The above three papers discuss face detection and face tracking problems. Based on the research results in these papers, we can build real-time face detection systems. The main purpose is to find the position and size of each face in the image or video; for tracking, it is also necessary to determine the correspondence between faces across frames.

2.1.2 FACE POSITIONING AND ALIGNMENT
Earlier localization of facial feature points focused on two or three key points, such as locating the center of the eyeballs and the center of the mouth, but later work introduced more points and added mutual constraints between them to improve the accuracy and stability of positioning. The article Active Shape Models - Their Training and Application [5] presents a model in which dozens of facial feature points, their texture, and their positional relationship constraints are considered together for calculation. Although many articles have improved on ASM, it is worth mentioning the AAM model, another important idea, which improves on the original article with an edge-based texture model. The regression-based approach presented in the paper Boosted Regression Active Shape Models [6] is better than the one based on the categorical appearance model. The article Face Alignment by Explicit Shape Regression [7] is another direction of ASM improvement, improving the shape model itself: it constrains the shape with a linear combination of training samples, and its alignment results are currently among the best. The purpose of facial feature point positioning is to further determine the positions of facial feature points (eye and mouth center points, eye and mouth contour points, organ contour points, etc.) on the basis of the face area found by face detection / tracking. These three articles show the methods for face positioning and face alignment. The basic idea of locating facial feature points is to combine the texture features of local face regions with the position constraints of the organ feature points.

2.1.3 FACE FEATURE EXTRACTION
PCA-based eigenfaces [8] are one of the most classic algorithms for face recognition. Although today PCA is used more for dimensionality reduction in real systems than for classification, such a classic approach deserves our attention. The article Local Gabor Binary Pattern Histogram Sequence (LGBPHS): A Novel Non-Statistical Model for Face Representation and Recognition [9] is close to many mature commercial systems. In many practical systems, the framework for extracting authentication information is PCA plus LDA: PCA is used to reduce dimensionality and avoid the matrix singularity problem in solving LDA, LDA is then used to extract the features suitable for classification, and the various extracted features are further combined by decision-level fusion for identification.
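The PCA-then-LDA pipeline described above can be prototyped in a few lines. The sketch below is illustrative only and is not code from the surveyed papers; it assumes face images have already been flattened into feature vectors X with identity labels y, and it uses scikit-learn's PCA and LinearDiscriminantAnalysis as stand-ins for the components discussed.

# Illustrative sketch of the PCA + LDA feature-extraction pipeline (placeholder data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64 * 64))   # in practice: 200 flattened, aligned face crops
y = rng.integers(0, 20, size=200)     # in practice: person identity labels (20 people)

# Step 1: PCA reduces dimensionality so the scatter matrices used by LDA
# are no longer singular (far fewer samples than pixels).
pca = PCA(n_components=100)
X_pca = pca.fit_transform(X)

# Step 2: LDA projects the PCA features onto directions that best separate identities.
lda = LinearDiscriminantAnalysis(n_components=19)   # at most (n_classes - 1) components
X_lda = lda.fit_transform(X_pca, y)

print(X_lda.shape)   # (200, 19): one compact, discriminative feature vector per face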
Although some of the LFW test protocols are not reasonable, it is indeed the face recognition data set closest to real data. In the article Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification [10], the idea of using precisely located points as references for a multi-scale, multi-region representation of the face is worth learning and can be combined with a variety of representations. The above three papers discuss facial feature positioning/alignment. Facial feature extraction is the process of turning a face image into a fixed-length numerical string. This string of numbers is called the "face feature" and has the ability to characterize the face. Face feature extraction takes as input "a face image" and "the coordinates of the facial feature key points", and its output is the corresponding numerical string (feature) of the face. The face feature algorithm first arranges the face into a predetermined pattern based on the coordinates of the facial key points and then calculates the feature. In recent years, deep learning methods have basically dominated face feature extraction; the articles mentioned above show the progress of research in this area. These algorithms take a fixed amount of time. Earlier face feature models were large and slow and were only used in back-end services; however, some recent studies can optimize the model size and speed so that the models can run on mobile devices while basically preserving the algorithm's accuracy.

CHAPTER 3: Design flow / Process

CONCEPT GENERATION
Every object class has its own special features that help in classifying the class; for example, all circles are round. Object class detection uses these special features. For example, when looking for circles, objects that are at a particular distance from a point (i.e. the center) are sought. Similarly, when looking for squares, objects that are perpendicular at the corners and have equal side lengths are needed. A similar approach is used for face identification, where eyes, nose and lips can be found, and features like skin color and the distance between the eyes can be measured.

SPECIFICATIONS / FEATURES
Face recognition is among the most productive image processing applications and has a pivotal role in the technical field. Recognition of the human face is an active issue for authentication purposes, specifically in the context of student attendance. An attendance system using face recognition is a procedure for recognizing students by using facial biometrics based on high-definition monitoring and other computer technologies.
The proposed system makes use of the opencv-python, face_recognition, numpy, cmake and dlib Python packages to detect facial biometrics. After face recognition, attendance reports will be generated and stored in Excel format.

IMPLEMENTATION PLAN
Steps to Build the Face Recognition System
1. INSTALL LIBRARIES
We need to install three libraries in order to implement face recognition, and they are as follows:
a) opencv-python: A library for some image pre-processing.

b) dlib: A modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems.
c) face_recognition: The face_recognition library, created and maintained by Adam Geitgey, wraps around dlib's facial recognition functionality.
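Before moving on, the packages above can be installed and imported as shown in the minimal sketch below. The pip command is shown as a comment; treating cmake purely as a build dependency of dlib is an assumption, and nothing here is specific to this project beyond the libraries already named.

# Assumed one-time setup (run in a shell):
#   pip install cmake dlib opencv-python face_recognition numpy

import cv2                 # opencv-python: image loading, drawing, webcam capture
import face_recognition    # wraps dlib's face detection and face encodings
import numpy as np         # numerical helpers (e.g. argmin over face distances)

print(cv2.__version__, np.__version__)   # quick sanity check that the imports work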

2. LOADING IMAGES
After importing the libraries you need to load an image. Images read with OpenCV are in BGR channel order, so in order to print the image with natural colors you should convert it to RGB; as the side-by-side output shows, RGB looks natural, so you will always change the channels to RGB.

3. FIND FACE LOCATION AND DRAW BOUNDING BOXES
You need to draw a bounding box around the faces in order to show whether a human face has been detected or not. face_recognition.compare_faces returns True if the person in both images is the same; otherwise it returns False.
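A short sketch of steps 2 and 3 follows. It is only illustrative: the file names 'train_image.jpg' and 'test_image.jpg' are placeholders, and the drawing color and thickness are arbitrary choices; it also assumes each image contains at least one face.

# Load an image, locate faces, draw bounding boxes, and compare two faces.
import cv2
import face_recognition

img = face_recognition.load_image_file('train_image.jpg')   # placeholder path, RGB array
img_bgr = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)               # convert for OpenCV display

# face_locations() returns (top, right, bottom, left) for every detected face.
for top, right, bottom, left in face_recognition.face_locations(img):
    cv2.rectangle(img_bgr, (left, top), (right, bottom), (255, 0, 255), 2)

cv2.imshow('Detected faces', img_bgr)
cv2.waitKey(0)

# Compare the first face in each image: True if both belong to the same person.
test = face_recognition.load_image_file('test_image.jpg')    # placeholder path
enc_train = face_recognition.face_encodings(img)[0]
enc_test = face_recognition.face_encodings(test)[0]
print(face_recognition.compare_faces([enc_train], enc_test))  # e.g. [True]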
4. TRAIN AN IMAGE FOR FACE RECOGNITION
This process is done in two steps, as follows:
a) Training: At this stage, we convert the training image into encodings and store the encodings together with the name of the person in that image.
b) Testing: For testing, we load an image, convert it into encodings, and then match these encodings with the encodings stored during training. The matching is based on finding the maximum similarity; when you find the encoding that matches the test image, you get the name associated with the training encodings.

Steps to Build the Face Recognition Attendance System
1. Install the necessary libraries.
2. Define a folder path where your training image dataset will be stored. Then:
   - Create a list to store person_name and the image array.
   - Traverse all image files present in the path directory, read the images, append each image array to the image list, and append each file name to classNames.
3. Create a function to encode all the training images and store the encodings in a variable encoded_face_train (see the sketch below).
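As a rough sketch of steps 2 and 3, the following reads every image from a training folder, records the file name as the person's name, and encodes all faces. The folder name 'Training_images' and the helper name findEncodings are assumptions; classNames and encoded_face_train follow the description above.

# Assumed layout: one image per person inside a 'Training_images' folder,
# where the file name (without extension) is the person's name.
import os
import cv2
import face_recognition

path = 'Training_images'              # placeholder folder path
images, classNames = [], []

for file in os.listdir(path):
    img = cv2.imread(os.path.join(path, file))        # BGR image array
    if img is None:
        continue                                       # skip non-image files
    images.append(img)
    classNames.append(os.path.splitext(file)[0])       # file name -> person name

def findEncodings(images):
    # Return one 128-d face encoding per training image
    # (assumes every training image contains at least one face).
    encodings = []
    for img in images:
        rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)      # face_recognition expects RGB
        encodings.append(face_recognition.face_encodings(rgb)[0])
    return encodings

encoded_face_train = findEncodings(images)
print(len(encoded_face_train), 'training faces encoded')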

4. Create a markAttendance function that records the attendance, together with the time, in an Attendance.csv file; you need to create the Attendance.csv file manually and give its path in the function (a sketch follows below).
with open("Attendance.csv", 'r+') opens the file, and the 'r+' mode is used to open a file for reading and writing.
We first check whether the name of the attendee is already present in Attendance.csv; if it is, we do not write the attendance again.
If the attendee's name is not present in Attendance.csv, we write the attendee's name with the time of the function call.
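One possible implementation of this function is sketched below. The name and time columns follow the description above; the time format string and the assumption that the file already contains a "Name,Time" header line are mine, not fixed by the report.

from datetime import datetime

def markAttendance(name):
    # 'r+' opens the existing Attendance.csv for both reading and writing;
    # the file must already exist (create it manually, e.g. with a "Name,Time" header).
    with open('Attendance.csv', 'r+') as f:
        lines = f.readlines()
        known = [line.split(',')[0] for line in lines]   # names already recorded
        if name not in known:
            now = datetime.now().strftime('%H:%M:%S')    # assumed time format
            f.write(f'\n{name},{now}')                   # append name with current time

# Example: markAttendance('ELON') appends "ELON,<time>" only the first time it is called.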
5. Read the webcam for real-time recognition (see the sketch after this list).
a) Resize the image to 1/4 size only for the recognition part; the output frame will be of the original size.
b) Resizing improves the frames per second.
c) face_recognition.face_locations() is called on the resized image (imgS), so the face bounding box coordinates must be multiplied by 4 in order to overlay them on the output frame.
d) face_recognition.face_distance() returns an array of the distances between the test face and all the faces present in our training directory.
e) The index of the minimum face distance will be the matching face.
f) After finding the matching name, we call the markAttendance function.
g) Draw the bounding box using cv2.rectangle().
h) Put the matching name on the output frame using cv2.putText().
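Putting points a) to h) together, the real-time loop might look like the following sketch. It assumes the encoded_face_train list, classNames list and markAttendance function from the earlier sketches; webcam index 0 and the 0.6 distance threshold are assumptions, while the 0.25/4 scaling matches points a) and c).

import cv2
import numpy as np
import face_recognition

cap = cv2.VideoCapture(0)                                   # default webcam
while True:
    success, img = cap.read()
    if not success:
        break
    imgS = cv2.resize(img, (0, 0), None, 0.25, 0.25)        # 1/4 size for speed
    imgS = cv2.cvtColor(imgS, cv2.COLOR_BGR2RGB)            # face_recognition expects RGB

    faces = face_recognition.face_locations(imgS)
    encodings = face_recognition.face_encodings(imgS, faces)

    for encode_face, face_loc in zip(encodings, faces):
        face_dist = face_recognition.face_distance(encoded_face_train, encode_face)
        match_index = np.argmin(face_dist)                  # smallest distance = best match
        if face_dist[match_index] < 0.6:                    # assumed tolerance threshold
            name = classNames[match_index].upper()
            top, right, bottom, left = [v * 4 for v in face_loc]   # scale back to full size
            cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
            cv2.putText(img, name, (left, top - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            markAttendance(name)

    cv2.imshow('Webcam', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):                   # press q to quit
        break

cap.release()
cv2.destroyAllWindows()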
CHAPTER 4: Results analysis and validation

Although building a facial recognition system seems easy, it is not as easy in the real world, where images are taken without any constraint. Several challenges faced by facial recognition systems are as follows:

1. Illumination: It changes the face appearance drastically; it is observed that even slight changes in lighting conditions have a significant impact on the results.

2. Pose: Facial recognition systems are highly sensitive to pose, which may result in faulty recognition or no recognition at all if the database is trained only on frontal face views.

3. Facial Expressions: Different expressions of the same individual are another significant factor that needs to be taken into account, although modern recognizers can deal with this easily.

4. Low Resolution: The recognizer must be trained on good-resolution pictures, otherwise the model will fail to extract features.

5. Aging: With increasing age, the shape, lines and texture of the human face change, which is yet another challenge.

RESULTS
The program was able to detect the faces and some objects in the video, and it was also able to name them correctly.

OUTPUTS
The screenshots of the outputs are given below:

CHAPTER 5: Conclusion and future work

CONCLUSION
In this project, we discussed how to create a face recognition system using the face_recognition library and built an attendance system with it.
The expected result of the proposed program was to capture the faces of the students from the webcam and record their attendance. Unfortunately, the dlib library proved very unstable and problematic, and for this reason the program sometimes does not even start; there is little chance that the program will work. The reason could be that we do not have a high-performance GPU, or it could be due to the operating system or the Python interpreter we used.

FUTURE WORK
When building a facial recognition model, there are many parameters which can be tuned to increase the model's performance. We can keep tuning our models to increase their accuracy. We will also look for a good alternative to the dlib library and implement it in our project, as this library is somewhat problematic to use. For a trained base model, we can re-train it using a specific dataset. So another way to increase the whole system's performance is to capture the specific people's images and re-train the model on this small dataset. For example, if an organization with 3000 people uses this system, the model can be trained to be very accurate on those 3000 people.

ACKNOWLEDGEMENT

We would like to express our sincere gratitude to the professor and Head of Department, Academic Unit 1, for providing us with this wonderful opportunity of designing this marvelous independent project on "IMAGE DETECTION FROM VIDEOS USING AI". We are highly indebted to Mr. Rohit Kumar, our mentor, for his persevering guidance and unflinching support in the process of crafting this project and documenting this report, which has been a worthy experience. We would also convey our special thanks to Mr Bilal Bashir for his substantial support and valuable feedback, which paved the way for improvements and changes, helping to uplift the quality of our design project. Last but definitely not the least, we thank our classmates of PH21BCS-404(A) for their constant motivation and competitive inspiration.

REFERENCES

[1] "cookbook.fortinet.com," 10 10 2018. [Online]. Available: https://cookbook.fortinet.com/face-recognition-configuration-in-forticentral/. [Accessed 10 10 2018].

[2] P. Viola and M. J. Jones, "Robust Real-time Object Detection," International Journal of Computer Vision, pp. 137-154, 2004.

[3] B. Wu, H. Ai, C. Huang and S. Lao, "Fast rotation invariant multi-view face detection based on real Adaboost," Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 79-84, 2004.

[4] Y. Li, H. Ai, T. Yamashita and S. Lao, "Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1728-1740, 2008.

[5] T. F. Cootes, C. J. Taylor, D. H. Cooper and J. Graham, "Active Shape Models - Their Training and Application," Computer Vision and Image Understanding, pp. 38-59, 1995.

[6] D. Cristinacce and T. Cootes, "Boosted Regression Active Shape Models," BMVC, 2007.

[7] X. Cao, Y. Wei, F. Wen and J. Sun, "Face Alignment by Explicit Shape Regression," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, 2012.

[8] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, pp. 71-86, 1991.

[9] W. Zhang, S. Shan, W. Gao, X. Chen and H. Zhang, "Local Gabor binary pattern histogram sequence (LGBPHS): a novel non-statistical model for face representation and recognition," Tenth IEEE International Conference on Computer Vision, pp. 786-791, 2005.

[10] D. Chen, X. Cao, F. Wen and J. Sun, "Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification," in 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, 2013.

[11] "Convolutional_neural_network," 2017. [Online]. Available: https://en.wikipedia.org/wiki/Convolutional_neural_network.

[12] S. S. Liew, "Research Gate," 1 3 2016. [Online]. Available: https://www.researchgate.net/figure/Architecture-of-the-classical-LeNet-5-CNN_fig2_299593011. [Accessed 10 10 2018].

[13] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, pp. 1-45, 11 1998.

[14] xlvector, "JIANSHU," 25 7 2016. [Online]. Available: https://www.jianshu.com/p/70a66c8f73d3. [Accessed 18 9 2018].

[15] F. S. Samaria, Face recognition using Hidden Markov Models, Doctoral thesis, 1995.

[16] E. Learned-Miller, G. B. Huang, A. RoyChowdhury, H. Li and G. Hua, "Labeled Faces in the Wild: A Survey," Advances in Face Detection and Facial Image Analysis, pp. 189-248, 2016.

[17] V. Chu, "Medium.com," 20 4 2017. [Online]. Available: https://medium.com/initializedcapital/benchmarking-tensorflow-performance-and-cost-across-different-gpu-options69bd85fe5d58. [Accessed 19 9 2018].

[18] B. Amos, B. Ludwiczuk and M. Satyanarayanan, "OpenFace: A general-purpose face recognition library with mobile applications," CMU School of Computer Science, Tech. Rep., 2016.

[19] B. Hill, "HOTHARDWARE," 20 8 2018. [Online]. Available: https://hothardware.com/news/nvidia-geforce-rtx-1080-rtx-1080-ti-799-1199-september20th. [Accessed 10 9 2018].

[20] 4psa, "4psa.com," 28 6 2013. [Online]. Available: https://blog.4psa.com/the-callbacksyndrome-in-node-js/. [Accessed 11 8 2018].

[21] W.-S. Chu, "Component-Based Constraint Mutual Subspace Method," 2017. [Online]. Available: http://www.contrib.andrew.cmu.edu/~wschu/project_fr.html.

[22] W. Hwang, "Face Recognition System Using Multiple Face Model of Hybrid Fourier Feature under Uncontrolled Illumination Variation," 2017. [Online]. Available: http://ispl.korea.ac.kr/~wjhwang/project/2010/TIP.html.