
Smart glasses for the blind with Face detection and

recognition system based on Deep Learning

Project Report Submitted In Partial Fulfillment Of


The Requirements for The Degree Of

Bachelor of Technology
In

Electronics and Communication Engineering


Of

Maulana Abul Kalam Azad University of Technology, West Bengal


By

DWAIPYAYAN MAHATA, 10900320063


JOYJIT DEY, 10900320056
KAUSTAV SARKAR, 10900320076

Under the guidance of

PROF. NILADRI SHEKHAR MISHRA

Department of Electronics and Communication Engineering


Netaji Subhash Engineering College
Garia, Kolkata – 700152

2023-24

Page 1 of 27
CERTIFICATE

This is to certify that this project report titled Smart glasses for the blind with Face detection and recognition system based on Deep Learning, submitted in partial fulfillment of the requirements for the award of the degree Bachelor of Technology (B.Tech) in Electronics and Communication Engineering of West Bengal University of Technology, is a faithful record of the original work carried out by

DWAIPYAYAN MAHATA, 10900320063, 201090100310033 OF 2020-21

JOYJIT DEY, 10900320056, 201090100310040 OF 2020-21

KAUSTAV SARKAR, 10900320076, 201090100310020 OF 2020-21

under my guidance and supervision.

It is further certified that it contains no material which, to a substantial extent, has been submitted for the award of any degree/diploma in any institute or has been published in any form, except the assistance drawn from other sources, for which due acknowledgement has been made.

___________
Date: Signature of the Supervisor

PROF. Niladri Shekhar Mishra


________________

Head of the Department

Electronics and Communication Engineering


Netaji Subhash Engineering College
Techno City, Garia, Kolkata – 700 152
Declaration

We hereby declare that this project report titled

Smart glasses for the blind with Face detection and


recognition system based on Deep Learning

is our own original work carried out as undergraduate students in Netaji Subhash Engineering College, except to the extent that assistance from other sources is duly acknowledged.

All sources used for this project report have been fully and properly cited. It contains no

material which to a substantial extent has been submitted for the award of any

degree/diploma in any institute or has been published in any form, except where due

acknowledgement is made.

Student’s names: Signatures: Date:

……………………….. ……………………….. ……………………

……………………….. ……………………….. ……………………

……………………….. ……………………….. ……………………

……………………….. ……………………….. ……………………

Certificate of Approval
We hereby approve this dissertation titled

Smart glasses for the blind with Face detection and recognition system based on
Deep Learning

carried out by

DWAIPYAYAN MAHATA, 10900320063, 201090100310033 OF 2020-21

JOYJIT DEY, 10900320056, 201090100310040 OF 2020-21

KAUSTAV SARKAR, 10900320076, 201090100310020 OF 2020-21

under the guidance of

PROF. NILADRI SHEKHAR MISHRA

of Netaji Subhash Engineering College, Kolkata in partial fulfillment of the requirements for the award of the degree Bachelor of Technology (B.Tech) in Electronics and Communication Engineering of West Bengal University of Technology

Date:………..

Examiners’ signatures:

1. ………………………………………….

2. ………………………………………….

3. ………………………………………….

Acknowledgement and/or Dedication

We hereby declare that the work presented in this report entitled “Smart glasses for the blind with Face detection and recognition system based on Deep Learning”, in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Electronics and Communication Engineering, submitted in the Department of Electronics and Communication Engineering, Netaji Subhash Engineering College, is an authentic record of our own work under the supervision of Prof. Niladri Shekhar Mishra, Department of Electronics and Communication Engineering. The matter embodied in the report has not been submitted for the award of any other degree or diploma.

Dwaipyayan Mahata

Joyjit Dey

Kaustav Sarkar

Dated:…………………

1. Introduction (Chapter 1: A description of the rationale of the project): (i)
Describe the problem being addressed (Why this problem? Application
areas? Social impact?) (ii) Literature review to address the problem (discuss
most recent publications, their approach, advantages, and disadvantages; all
references should be cited) (iii) Brief discussion of the proposed concept to
solve the problem (only discuss the concept briefly), (iv) brief description of
the workflow and outcome. - A description of the rationale of the project.
2. Methodology (Chapter 2): Detailed description of the proposed
methodology (design approach, analysis, theory etc.)
3. Results (Chapter 3): A description of the result analysis justifying the
proposed concept/ideas (data analysis, graphs, table, etc.)
4. Conclusion: (Chapter 4): Discuss the main outcome being achieved, other
possible approaches, merits and demerits of the proposed concept, social
impact, and future scope of the work.
5. References: A list of the references cited in the synopsis (must be referred
to in the text). Significant references are appreciated, preferably published in
peer-reviewed journals most recently.

Abstract
Face recognition, propelled by advances in deep learning, has evolved into a
transformative technology with applications spanning security, authentication, and
human-computer interaction. Leveraging Convolutional Neural Networks (CNNs),
this methodology offers a robust solution to the inherent challenges posed by
variations in lighting, pose, and facial expressions.

The process begins with face detection using state-of-the-art CNN architectures
like Single Shot Multibox Detector (SSD) or You Only Look Once (YOLO). Detected
faces then undergo feature extraction through layers of convolutional and pooling
operations, enabling the automatic learning of hierarchical facial features crucial for
accurate identification.

During training, deep learning models employ techniques such as triplet loss to
ensure that the embeddings (numerical representations) of the same person's
faces are closer in the feature space, enhancing discrimination between
individuals. The utilization of large, diverse datasets for training is pivotal, allowing
models to generalize well to real-world scenarios.

Transfer learning plays a vital role, enabling the application of pre-trained models
on extensive datasets to face recognition tasks with limited labeled data. This
facilitates robust performance even in scenarios where training data is scarce.

The applications of face recognition by deep learning are wide-ranging, encompassing security and surveillance, user authentication, and seamless
human-computer interaction. However, ethical considerations, including privacy
concerns and potential biases in training data, must be addressed to ensure
responsible deployment in practical settings.

Contents

1: INTRODUCTION Page No
1.1 Problem Definition.......................................................................................

1.2 Aim of The Project………..................................................................................

1.3 Project Overview/Specification........................................................................

1.4 Software/Hardware Specifications................................................................

2: REVIEW WORK
2.1 Existing System..........................................................................................

2.2 Proposed System.........................................................................................

3: WORKING METHODOLOGY
3.1 Flowcharts & Circuit Diagram..................................................................

3.2 Work Done Until Now…………………………………………………………………………….

4: RESULTS/OUTPUT & CONCLUSION


4.1 Results…………………………………………………………………………………………………..

4.2 Conclusion…………………………………………………………………………………………….

5: REFERENCES

Chapter 1

Introduction
Face recognition using deep learning has emerged as a powerful and efficient technology
in the field of computer vision. Deep learning, a subset of machine learning, involves
training artificial neural networks on large datasets to learn and extract hierarchical
representations of data. In the context of face recognition, deep learning algorithms can
automatically identify and authenticate individuals by analyzing facial features.

The traditional methods of face recognition often relied on handcrafted features, such as
the position of eyes, nose, and mouth, which could be sensitive to variations in lighting,
pose, and facial expressions. Deep learning approaches, on the other hand, have shown
remarkable success in addressing these challenges by automatically learning relevant
features directly from raw data.

Here's a brief introduction to the key components of face recognition using deep learning:

1. Convolutional Neural Networks (CNNs): CNNs are a fundamental building block in deep learning for computer vision tasks, including face recognition. These
networks are designed to automatically learn hierarchical features from images. In
the context of face recognition, CNNs can identify patterns and features in facial
images that are crucial for accurate identification.
2. Face Detection: Before recognition, faces need to be detected in an image.
Convolutional neural networks, such as Single Shot Multibox Detector (SSD) or
You Only Look Once (YOLO), are commonly used for accurate and efficient face
detection.
3. Feature Extraction: Once faces are detected, deep learning models extract
features that are crucial for distinguishing one face from another. This process
involves several layers of convolutional and pooling operations that progressively
learn and abstract features.
4. Embedding and Similarity Measurement: The features extracted from a face
image are transformed into a numerical representation, often referred to as an
embedding. The similarity between these embeddings is then measured to
determine whether two face images belong to the same person. Triplet loss is a
common technique used during training to ensure that the embeddings of the same
person's faces are closer in the feature space.
5. Training on Large Datasets: The effectiveness of deep learning models relies on
large datasets for training. In the case of face recognition, diverse datasets
containing images of individuals in various poses, lighting conditions, and
expressions are used to train models robust to real-world scenarios.

6. Transfer Learning: Transfer learning is often employed to leverage pre-trained
models on large datasets for face recognition tasks with limited labeled data. This
helps in achieving good performance even when the available training data is
relatively small.
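To make the embedding and triplet-loss idea in points 4–5 concrete, here is a minimal numerical sketch (NumPy only; the toy 4-dimensional vectors are invented for illustration — real systems use 128- or 512-dimensional embeddings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on face embeddings: pull the anchor toward the
    positive (same person) and push it away from the negative
    (different person) by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance, same identity
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance, other identity
    return max(d_pos - d_neg + margin, 0.0)

# Toy 4-D embeddings for three face images.
anchor   = np.array([0.9, 0.1, 0.0, 0.0])
positive = np.array([0.8, 0.2, 0.0, 0.0])  # same person: small distance
negative = np.array([0.0, 0.1, 0.9, 0.1])  # different person: large distance

print(triplet_loss(anchor, positive, negative))  # → 0.0 (triplet satisfied)
```

A loss of zero means this triplet already satisfies the margin; during training, only violating triplets contribute gradient and push the embedding network to separate identities.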

Face recognition by deep learning has found applications in various domains, including
security, surveillance, user authentication, and human-computer interaction. However, it's
essential to address ethical considerations, such as privacy concerns and potential biases
in the training data, when deploying face recognition systems in real-world applications.

FIG.1.1 Face detection based on deep learning


1.1 Problem Definition
Visually impaired individuals face challenges in recognizing and interacting with people in
their surroundings, leading to a potential reduction in independence and social
engagement. Existing assistive technologies often lack the ability to provide real-time
information about the faces and identities of individuals nearby. The goal is to develop
smart glasses equipped with face detection and recognition capabilities through deep
learning to address these challenges.

Key Challenges:

1. Limited Facial Recognition Solutions for the Visually Impaired:


Current assistive technologies for the blind often lack advanced facial
recognition capabilities, limiting the ability of users to identify individuals
around them.

2. Real-Time Processing and Feedback:


Providing real-time information about detected faces and recognized
individuals is a challenge, requiring efficient deep learning algorithms and
responsive feedback mechanisms.

3. Privacy Concerns:
Developing a facial recognition system that respects the privacy of
individuals and adheres to ethical standards is crucial to avoid potential
privacy violations.

4. Customization for Individual Needs:


Meeting the diverse needs of visually impaired users by allowing
customization of recognition sensitivity, feedback preferences, and other
settings.

5. Obstacle Detection and Navigation:


Integrating obstacle detection and navigation features to enhance overall
mobility and safety in different environments.

1.2 Aim of The Project


The aim of a project involving smart glasses for the blind with face detection and
recognition through deep learning is to enhance the independence and social interaction
of visually impaired individuals. The primary objectives of such a project are:

1. Assistive Technology for the Visually Impaired:


Develop smart glasses as an assistive technology to empower visually
impaired individuals in recognizing and interacting with people in their
surroundings.

2. Face Detection:
Implement a reliable face detection system that can identify faces in the
wearer's environment, providing information about the presence of
individuals nearby.

3. Face Recognition:
Incorporate deep learning-based face recognition to enable the smart
glasses to recognize and provide information about the identity of known
individuals, such as friends or acquaintances.

4. Real-Time Feedback:
Ensure that the system provides real-time auditory or haptic feedback to the
wearer, conveying information about the detected faces and recognized
individuals.

5. Navigation Assistance:
Integrate navigation features to help users navigate their surroundings,
providing information about obstacles and guiding them in unfamiliar
environments.
6. User-Friendly Interaction:
Design an intuitive and user-friendly interface to facilitate seamless
interaction with the smart glasses, allowing users to control and customize
the system easily.

7. Privacy Considerations:
Implement privacy-conscious features, ensuring that the facial recognition
system respects the privacy of individuals and complies with ethical
standards.

8. Customization for Individual Needs:


Allow users to customize settings and preferences based on their individual
needs and preferences, considering factors such as recognition sensitivity
and feedback preferences.

9. Integration of Additional Sensors:


Explore the integration of additional sensors, such as depth sensors or
obstacle detection sensors, to enhance the overall functionality of the smart
glasses in various environments.

10. Speech Interface:


Incorporate a speech interface that enables users to interact with the system
through voice commands, providing a natural and hands-free interaction
experience.

11. Connectivity with Mobile Devices:


Enable connectivity with mobile devices to enhance the capabilities of the
smart glasses, allowing users to access additional information and
functionalities.

12. Training and User Support:


Provide training resources and ongoing user support to ensure that visually
impaired individuals can effectively use and benefit from the smart glasses.

13. Integration with Existing Technologies:


Explore compatibility and integration with existing technologies, such as
GPS systems or navigation applications, to complement and enhance the
overall functionality.

14. Field Testing and User Feedback:


Conduct field testing with visually impaired users to gather feedback on the
usability, effectiveness, and potential improvements of the smart glasses in
real-world scenarios.

By achieving these objectives, the project aims to create a valuable tool that empowers
visually impaired individuals, enhances their social interactions, and contributes to their
overall independence and quality of life.
1.3 Project Overview

Fig. 1.2: A general view of a face recognition system
A face recognition system takes a video stream or an image as input, and its output is a verification or identification of the face(s) that appear in it. A general face recognition process consists of several essential stages, chief among them face detection and face recognition. Even though facial recognition is less accurate as a biometric technology than fingerprint or iris recognition, its contactless and non-invasive nature has made it widely adopted.

1.3.1 Face Detection


Face detection is a computer vision technique for locating or tracking face regions in an image or a video. It is the first and essential step for any face processing system, including face recognition, driver drowsiness detection in vehicles, access control, criminal identification and so on. Many methods and algorithms have been proposed to detect face(s) in real-time video or images, with differing accuracy and false detection rates.

1.3.2 Face Detection by Viola-Jones Algorithm

The Viola-Jones algorithm was introduced by Paul Viola and Michael Jones in 2001 (Dang and Sharma, 2017) as an effective algorithm for detecting human faces in real time. The Viola-Jones face detection method has four main stages: Haar-like feature selection, integral image creation, AdaBoost training and cascading classifiers.
(1) Haar-like feature selection. All human faces share similar properties, for example the eye region is darker than the nose bridge region. Haar-like features are used to compare these properties. There are three kinds of Haar-like features: (i) two-rectangle, (ii) three-rectangle and (iii) four-rectangle features.
(2) Integral image creation. An integral image is generated to allow very rapid evaluation of Haar-like features.
(3) AdaBoost training. AdaBoost is a simple and effective classifier that selects a small number of significant features from a massive library of potential features.
(4) Cascading classifiers. Successively more complex classifiers are combined in a cascade structure, which greatly increases detector speed by focusing attention on promising regions of the image.
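Stage (2) can be sketched in a few lines: the integral image is simply a 2-D cumulative sum, after which the sum of any rectangle — and hence any Haar-like feature — costs only four array lookups (a NumPy illustration, not code from this project):

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[0:y+1, 0:x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] via four corner lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]      # subtract the strip above
    if left > 0:
        total -= ii[bottom, left - 1]    # subtract the strip to the left
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]   # re-add the doubly subtracted corner
    return total

img = np.arange(16, dtype=np.int64).reshape(4, 4)  # toy 4x4 "image"
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # → 30, i.e. 5 + 6 + 9 + 10
```

This constant-time rectangle sum is what lets the cascade evaluate thousands of Haar-like features per window in real time.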

Dang and Sharma (2017) compared and analysed the precision and recall of four basic algorithms used for face detection: (1) Viola-Jones, (2) Support Vector Machines-based, (3) Neural Network-based face detection and (4) SMQT features with SNoW classifier. They concluded that Viola-Jones is the best among these algorithms.
Datta, Datta and Banerjee (2015) also noted that the Viola-Jones face detector can process face images rapidly with high true detection rates in a real-time framework.
Rajeshwari and Anala (2015) agreed that the Viola-Jones method gives better results, but that it consumes more time than skin colour-based detection and background subtraction methods.
From the works mentioned above, the Viola-Jones algorithm can be considered a popular method for face detection. However, it cannot detect faces in diverse positions or angles (Enriquez, 2018): accuracy drops when the face is not presented front-facing under proper lighting. In other words, the Viola-Jones face detection method cannot handle non-frontal faces efficiently.

1.3.3 Face Detection by Multi-Task Cascaded Convolutional Network (MTCNN)
The Viola-Jones face detector was prevalent in face detection tasks for a decade. As mentioned before, however, it degrades significantly with the greater visual variations of faces that usually occur in real-world applications. Inspired by the achievements of deep convolutional neural networks (CNNs) in computer vision tasks, numerous studies were motivated to use this architecture for face detection. In this respect, Zhang et al. (2016) proposed a Multi-task Cascaded Convolutional Networks (MTCNN) based framework for joint face detection and alignment, which implements three stages of designed deep CNNs in a cascaded structure that predicts face and landmark locations.

FIG. Pipeline of MTCNN framework


The MTCNN face detection method is a three-stage cascaded framework:
(1) Stage 1: Candidate facial windows and their bounding box regression vectors are produced quickly by a shallow CNN named the Proposal Network (P-Net). These candidates are calibrated based on the estimated bounding box regression vectors, and the non-maximum suppression (NMS) algorithm then merges highly overlapped candidate windows.
(2) Stage 2: The candidates are refined by a more complex CNN called the Refinement Network (R-Net), which discards a large number of false candidates (non-face windows).
(3) Stage 3: A more powerful CNN, the Output Network (O-Net), produces the final bounding box and facial landmark positions.

FIG: P-Net, R-Net, and O-Net architectures in MTCNN structure

In contrast to the Viola-Jones algorithm, CNNs are able to detect faces in various positions or angles and under different lighting conditions. On the other hand, the CNN face detection method needs to store a larger amount of information and requires much more space than the Viola-Jones algorithm (Enriquez, 2018). High RAM (Random-Access Memory) usage and the need for a stronger processing unit are constant problems when running CNNs, which limits where they can be deployed. Therefore, although CNNs are much more reliable in terms of face detection accuracy, the Viola-Jones algorithm is still widely used today.

1.4 Specifications
This section provides an outline for the hardware and software requirements of the
developed system.

1.4.1 Hardware Specifications

 Laptop webcam

1.4.2 Software Specifications

 Python language
 VScode / Jupyter Notebook (coding environment)
 Anaconda (ML libraries and algorithms)
 Python library packages
(i) OpenCV Library
(ii) TensorFlow Library
(iii) NumPy Library
Chapter 2

Review Work

2.1 Existing System


Smart glass systems for BVI people: one of the most important tasks for blind and visually impaired (BVI) people is to recognize the faces and identities of relatives and friends. Daescu et al. [13] created a face recognition system that receives facial images captured via the camera of smart glasses on commands from the user, processes them on a server, and thereafter returns the result via audio. The system is designed as a client–server architecture, with a cellphone, smart glasses, and a back-end server employed to implement face recognition using deep CNN models such as FaceNet and Inception-ResNet. However, this face recognition system needs to be retrained to recognize new faces that are not available on the server, thereby requiring increased time to function. Mandal et al. [39] focused on the ability to recognize faces under various lighting conditions and face poses, and developed a wearable face recognition system based on Google Glass and within-subclass discriminant analysis. However, this system suffers from a familiar problem: although it correctly recognized the faces of 88 subjects, the model had to be retrained for new faces that were not in the initial dataset.

Researchers and developers continue to improve existing algorithms and propose novel
solutions for face detection, making it a dynamic and evolving field within computer vision.
Keep in mind that the effectiveness of a specific method can depend on factors such as
the dataset used, the application context, and the computational resources available.
In recent years, there have been notable developments in utilizing face detection for smart glasses designed to assist individuals with visual impairments. These technologies aim to enhance the daily lives of the blind by providing real-time information about their surroundings. Key aspects of existing work include:

1. Object Recognition and Navigation:


 Face detection is often integrated into smart glasses' object recognition
systems, allowing users to identify and locate people in their vicinity. This
contributes to enhanced navigation and social interaction.
2. Deep Learning Techniques:
 Deep learning, particularly convolutional neural networks (CNNs), has
played a crucial role in improving the accuracy of face detection algorithms.
These techniques enable smart glasses to recognize faces in real-time,
providing valuable information to the user.

3. Real-Time Feedback:
 Smart glasses offer real-time auditory or haptic feedback about detected
faces, enabling users to receive instant information about people around
them. This feature enhances the user's situational awareness and social
interactions.
4. Wearable Computer Vision Systems:
 Researchers have developed wearable computer vision systems that
seamlessly integrate with smart glasses. These systems capture visual data
through embedded cameras, process it in real-time, and deliver relevant
information to the user, including the recognition of faces.

In recent years, artificial intelligence and deep learning approaches have been rapidly entering all areas, including autonomous vehicle systems, robotics, space exploration, medicine, pet and animal monitoring systems, and areas that start with the word smart, such as smart city, smart home, smart agriculture, etc. Computer vision and artificial intelligence methods play a key role in the development of smart glass systems. It is not possible to build a smart glass system without computer vision methods such as object detection and recognition, because the input data is an image or a video. Object detection and recognition has garnered the attention of researchers, and numerous new approaches are developed every year. To narrow the review, we analysed lightweight object detection and recognition models designed for embedded systems.

In 2016, Iandola et al. designed three primary mechanisms to squeeze CNN networks, named SqueezeNet: (1) 3 × 3 filters were replaced with 1 × 1 filters; (2) the number of input channels to 3 × 3 filters was reduced; and (3) the network was down-sampled late. These three approaches reduced the number of parameters in a CNN while maximizing accuracy within the limited parameter budget. Mobile deep learning is rapidly expanding. The Tiny-YOLO net for iOS, introduced by Apte et al. in 2017, was developed for mobile devices and tested with a Metal GPU for real-time applications, with accuracy approximately similar to the original YOLO. In the same year, Howard et al. built a lightweight deep neural network named MobileNet using a depth-wise separable convolution architecture for mobile and embedded systems. This model has inspired researchers and has been used in various applications. In 2018, the MobileNet-SSD network, derived from VGG-SSD, was proposed to improve accuracy on small objects at real-time speed. Further, Wong et al. developed a compact single-shot detection deep CNN based on the remarkable performance of the Fire microarchitecture presented in SqueezeNet and the macroarchitecture introduced in SSD. This Tiny SSD, created for real-time embedded systems by reducing the model size, consists of a stack of Fire subnetworks and optimized SSD-based convolutional feature layers. With the increasing capabilities of processors for mobile and embedded devices, numerous effective mobile deep CNNs for object detection and recognition have been introduced in recent years, such as ShuffleNet, PeleeNet, and EfficientDet.

It's important to note that the field of assistive technologies for the blind is dynamic, and
ongoing research and development are likely to bring further advancements.
2.2 Proposed System
Our goal is to create convenience and opportunities for blind people, facilitating independent travel during both day and night. To achieve this goal, wearable smart glasses and a multifunctional system that can capture images through a mini camera and return object recognition results to users with voice feedback are the most effective approach. It is also conceivable to perceive visual information by touching the contours of detected salient objects, according to the needs of blind people, via a refreshable tactile display. The system is required to use deep CNNs to detect objects with high accuracy, and a powerful processor to perform the processing sufficiently fast in real time. Therefore, we introduce a client–server architecture that consists of smart glasses and a smartphone/tactile pad as the local part, and an artificial intelligence server that performs the image processing tasks. Hereinafter, for simplicity, a smartphone is written instead of a smartphone/tactile pad. The local part comprises the smart glasses and a smartphone, and transfers data via a Bluetooth connection. Meanwhile, the artificial intelligence server receives the images from the local part, processes them, and returns the result in audio format. Note that the smart glass hardware has a built-in speaker for direct output and an earphone port for audio connection, to convey the returned audio results from the smartphone to users.

Chapter 3

Working Methodology
3.1 Flow Chart:

Figure 1: Flow Chart of working principle

3.2 Work done:

Step 1: Install the Required Software and Library Packages. The proposed face recognition system is developed using Anaconda with the Python programming language. Anaconda is essentially a nicely packaged Python IDE (Integrated Development Environment) that ships with many useful library packages, such as NumPy, Time and Matplotlib. Anaconda also uses the concept of environments to isolate different libraries and versions.

Some Python library packages are required for the developed system, including the OpenCV, TensorFlow, MTCNN, face_recognition, Dlib, NumPy, Threading, OS, Time, Pyttsx3, Openpyxl and Tkinter libraries. These packages can be installed by entering the relevant command in the Python terminal:

Library Package    Command
OpenCV             pip install opencv-python
TensorFlow         pip install tensorflow-gpu
NumPy              pip install numpy

Step 2: Face Detection. Face detection is the first and essential step for the face recognition system: a face must be captured in order to recognize it. The face detection technique in the developed system uses the pre-trained MTCNN face detector model, whose working principle is described in Chapter 1. It gives good face detection results and works well even for large-angle, non-frontal faces. The figure below shows the result of detecting facial regions. The locations of each person's eyes, nose, mouth and chin can also be obtained: MTCNN itself returns five key points per face, while the 68-coordinate landmark set comes from the Dlib shape predictor used by the face_recognition library. These landmarks are, however, not required for the face recognition system.

Work is still pending in the part of image recognition, dataset creation, linking the dataset,
and hardware part, to reach the proposed goal of ours.

Chapter 4

Result Analysis

FIG. Face Detection

Conclusion
This report describes a smart glass system that includes object detection, salient object extraction, and text recognition models using computer vision and deep learning for blind people. The proposed system is fully automatic and runs on an artificial intelligence server. It detects and recognizes objects from low-light and dark-scene images to assist blind people in a regular, day-to-day environment. The traditional smart glass system was extended using deep learning models, with the addition of salient object extraction for tactile graphics and text recognition for text-to-speech.
Smart glass systems based on deep learning models require more energy and memory than embedded systems typically provide. Therefore, we built the system on an artificial intelligence server to ensure real-time performance and solve the energy problem. With the advancement of the 5G era, transmitting image data to a server and receiving real-time results is no longer a concern for users. The experimental results showed that the object detection, salient object extraction, and text recognition models were robust and performed well with the help of low-light enhancement techniques in dark-scene environments. In the future, we aim to create low-light and dark-image datasets with bounding box and ground truth data to address object detection and text recognition tasks, as well as evaluations at night.

References
1. Adam, G. (2016) Machine Learning is Fun! Part 4: Modern Face Recognition with Deep Learning. Available at: https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78 (Accessed: 28 July 2019).

2. Adam, G. (2018) Face Recognition Accuracy Problems · ageitgey/face_recognition Wiki · GitHub. Available at: https://github.com/ageitgey/face_recognition/wiki/Face-Recognition-Accuracy-Problems (Accessed: 29 July 2019).

3. Ali, T. (2010) Face Recognition: An Introduction. Available at: https://alitarhini.wordpress.com/2010/12/05/face-recognition-an-introduction/ (Accessed: 22 June 2019).

4. Brownlee, J. (2019) How to Develop a Face Recognition System Using FaceNet in Keras. Available at: https://machinelearningmastery.com/how-to-develop-a-face-recognition-system-using-facenet-in-keras-and-an-svm-classifier/ (Accessed: 28 July 2019).

5. Cai, Z. et al. (2018) ‘Joint Head Pose Estimation with Multi-task Cascaded Convolutional Networks for Face Alignment’, Proceedings – International Conference on Pattern Recognition. IEEE, pp. 495–500. doi: 10.1109/ICPR.2018.8545898.

6. Datta, A. K., Datta, M. and Banerjee, P. K. (2015) Face Detection and Recognition: Theory and Practice. CRC Press.

7. Schneiderman, H. (2013) U.S. Patent No. 8,457,367.

8. Baron, R. J. ‘Mechanisms of human facial recognition’, International Journal of Man-Machine Studies.

9. Nixon, M. (1985) ‘Eye Spacing Measurement for Facial Recognition’, International Society for Optics and Photonics, Vol. 575.

10. Yu, H. and Yang, J. (2001) ‘A direct LDA algorithm for high-dimensional data — with application to face recognition’.

Rubrics
Project Assessment Report (Final Year: Project Part I)
Department of Electronics and Communication Engineering
Netaji Subhash Engineering College, Technocity, Garia, Kolkata

