END-SEM Report
Submitted By:
Name Roll No SAP Id Branch
Harshita Mittal R134219044 500077452 CSF
Ankur Gupta R110219018 500076127 CCVT
Astha Kumari R110219027 500076107 CCVT
F Rohith Immanuel R134219041 500075157 CSF
Approved By:
I. Abstract
Communication is difficult for those with hearing and speech impairments. The ability to
communicate in sign language can help deaf and hearing individuals communicate more
effectively. In this paper, a new sign language recognition technique is proposed for detecting
the 26 letters of the alphabet and 6 gestures of sign language. Using computer vision and neural
networks, we can recognize the signs and produce the corresponding text output. A complicated
and expensive hardware system is no longer needed to recognize sign language; a smartphone
or a webcam suffices. This is achieved using Google's MediaPipe framework, released in 2019,
together with an advanced recurrent neural network model.
Keywords: RNN, TensorFlow, OpenCV, MediaPipe.
II. Introduction
There have been various technological advances, as well as much research, to assist the
deaf and speech-impaired. Deep learning and computer vision can also be utilized for this
cause. Because not everyone understands sign language, our project can be very useful in
helping deaf and speech-impaired people interact with others. Furthermore, this work can
be extended to build automated editors, where a person can write using only their hand
movements.
Sign languages are sets of signs and gestures used by hearing- and speech-impaired people
to communicate easily with others. Sign language helps bridge the communication gap
between people. There are many different sign languages, such as ASL (American Sign
Language), ISL (Indian Sign Language), and BSL (British Sign Language). ASL is shown
in (Fig 1).
Fig 1 (ASL Symbols) [1]
2.2 Sign Language Detection
The use of AI algorithms, together with the availability of massive datasets and large
processing capacity, is a technological advantage. With the assistance of these
technologies, we can identify what people wish to say by translating gestures into letters
using machine learning and OpenCV, and this can be done in real-time situations.
2.2.1 OpenCV
OpenCV is an open-source computer vision and machine learning software library. It was
built to accelerate the use of machine perception in applications. It can be used to detect
and recognize faces, identify objects, and, most importantly, classify human actions by
accessing the webcam in real time. We use this library together with MediaPipe to extract
the landmarks of the face and hands.
2.2.2 Media-pipe
Live perception of simultaneous human pose, face landmarks, and hand tracking in real
time enables various modern applications: fitness and sports analysis, gesture control,
sign language recognition, and mid-air writing.
Using the inferred pose landmarks, we derive three region-of-interest (ROI) crops, one for
each hand (2x) and one for the face, and employ a re-crop model to refine each ROI.
We then crop the full-resolution input frame to these ROIs and apply task-specific face
and hand models to estimate their corresponding landmarks. Finally, we merge all
landmarks with those of the pose model to yield the full set of 540+ landmarks, as shown
in (Fig 2).
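The "540+" figure follows directly from the per-component landmark counts that MediaPipe documents: 33 pose landmarks, 468 face-mesh landmarks, and 21 landmarks per hand. A quick check:

```python
# MediaPipe Holistic landmark counts per component.
POSE_LANDMARKS = 33    # body pose
FACE_LANDMARKS = 468   # face mesh
HAND_LANDMARKS = 21    # per hand

total = POSE_LANDMARKS + FACE_LANDMARKS + 2 * HAND_LANDMARKS
print(total)  # 543 landmarks per frame, i.e. the "540+" in the text
```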
2.2.3 TensorFlow
TensorFlow is one of Google's most well-known deep learning frameworks. It is a
free and open-source software library based on the Python programming language.
TensorFlow is used to train the model on our dataset. The TensorFlow library combines
many APIs to build large-scale deep learning architectures such as RNNs. To reduce
computational effort, TensorFlow employs a graph structure.
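A minimal sketch of the kind of recurrent model meant here, using the tf.keras API. The layer sizes and the 30-frame × 1662-feature input shape are illustrative assumptions, not the exact project configuration; only the class count (26 alphabets + 6 gestures = 32) comes from the abstract:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(timesteps=30, features=1662, num_classes=32):
    """Stacked-LSTM classifier over sequences of per-frame keypoints.

    Each input sample is a sequence of `timesteps` frames, each frame a
    flat vector of `features` landmark coordinates; the output is a
    probability distribution over the sign classes.
    """
    model = tf.keras.Sequential([
        layers.Input(shape=(timesteps, features)),
        layers.LSTM(64, return_sequences=True, activation="tanh"),
        layers.LSTM(128, return_sequences=False, activation="tanh"),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])
    return model
```

Internally, `model.compile` builds the computation graph once, which is the graph structure that reduces per-step computational overhead during training.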
[1] Sign Language Recognition: State of the Art (Feb 2014) by Ashok K Sahoo,
Gouri Sankar Mishra, and Kiran Kumar Ravulakollu.
Journal: ARPN Journal of Engineering and Applied Sciences.
They propose that systems should be able to distinguish the face, the hands (right/left),
and other parts of the body simultaneously.
[3] Sign Language Recognition by Muskan Dhiman and Dr G.N. Rathna.
In this paper they show that, for a user-dependent system, the user gives a set of images
to the model for training, so that it becomes familiar with the user.
The objective of this project is to develop a system for recognizing symbolic expression
in images, so that the communication gap between hearing and physically impaired people
can be easily bridged.
VI. Methodology
Phase 1: Extracting the holistic key points
As described in Section 2.2.2, MediaPipe Holistic infers the pose landmarks, derives
three region-of-interest (ROI) crops (one per hand and one for the face), refines them
with a re-crop model, and applies task-specific face and hand models to the cropped
full-resolution frame. Merging these landmarks with the pose landmarks yields the full
set of 540+ landmarks per frame.
Step 1: We first defined draw_landmarks, through which we draw the landmarks for the face,
hands, and pose.
Step 2: We then defined styled_landmarks, which styles our landmarks by setting the radius,
thickness, and a dedicated color for the face, left hand, right hand, and pose.
mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACE_CONNECTIONS,
    mp_drawing.DrawingSpec(color=(80, 110, 10), thickness=1, circle_radius=1),
    mp_drawing.DrawingSpec(color=(80, 256, 121), thickness=1, circle_radius=1))
Step 3: We then saved the extracted key points (landmarks) for further use in dataset
creation, as shown in (Fig 4).
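The per-frame key-point vector saved in Step 3 can be sketched as follows. This is a common flattening pattern for MediaPipe Holistic results; the 1662-value layout (33×4 pose + 468×3 face + 21×3 per hand, zero-filled when a component is not detected) is our assumption, not the exact project code:

```python
import numpy as np

def extract_keypoints(results):
    """Flatten one frame's MediaPipe Holistic output into a 1662-vector.

    Missing components (e.g. a hand out of frame) are zero-filled so that
    every frame yields a vector of the same length.
    """
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])

# One frame can then be saved for dataset creation, e.g. (paths hypothetical):
# np.save("data/action/sequence/frame_num.npy", extract_keypoints(results))
```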
Fig 4 (Merged hand and face landmark extraction)
https://github.com/harshita0501/Sign-Language-Recongnition/
[1] https://www.ai-media.tv/ai-media-blog/sign-language-alphabets-from-around-the-world/
[2] https://google.github.io/mediapipe/
[3] https://www.researchgate.net/publication/262187093_Sign_language_recognition_State_of_the_art/
[4] https://www.researchgate.net/publication/326972551_American_Sign_Language_Recognition_System_An_Optimal_Approach/
[5] https://edu.authorcafe.com/academies/6813/sign-language-recognition/
[6] https://upcommons.upc.edu/bitstream/handle/2117/343984/ASL%20recognition%20in%20real%20time%20with%20RNN%20-%20Antonio%20Dom%C3%A8nech.pdf?sequence=1&isAllowed=y/