
2022 2nd Asian Conference on Innovation in Technology (ASIANCON)

Pune, India. Aug 26-28, 2022

Realtime Sign Language Detection and Recognition


Aakash Deep, CDAC ACTS, Pune, Maharashtra (aakashbishnoi108@gmail.com)
Aashutosh Litoriya, CDAC ACTS, Pune, Maharashtra (litoriya.aashu01@gmail.com)
Akshay Ingole, CDAC ACTS, Pune, Maharashtra (akshayingole48@gmail.com)
Vaibhav Asare, CDAC ACTS, Pune, Maharashtra (vbvasare@gmail.com)
Shubham M Bhole, CDAC ACTS, Pune, Maharashtra (shubhambhole81@gmail.com)
Dr. Shantanu Pathak, DHI Training & Research Consultancy, Pune, Maharashtra (shantanu.s.pathak@gmail.com)

978-1-6654-6851-0/22/$31.00 ©2022 IEEE | DOI: 10.1109/ASIANCON55314.2022.9908995

Abstract—A real-time sign language recognition system is developed for recognizing the gestures of Indian Sign Language (ISL). Generally, sign languages consist of hand gestures. For recognizing the signs, the Regions of Interest (ROI) are identified and tracked using the skin segmentation feature of OpenCV. Then, using MediaPipe [1], the landmarks of the hands are captured and their key points are stored in a NumPy array. The model is then trained on this data using TensorFlow, Keras and LSTM. Lastly, the model can be tested in real time on a live webcam feed.

Realtime sign detection and recognition is one of the potential applications for deaf and mute people, as it helps them connect with the world. Previous approaches to sign detection and recognition trained machine learning algorithms on images; here we use deep learning models to enhance realtime sign detection and recognition.

Keywords—Indian Sign Language, Realtime, MediaPipe [1], Landmarks, Key points, OpenCV, NumPy, LSTM, TensorFlow, Keras.

I. INTRODUCTION

According to the World Health Organization, the number of hearing-impaired individuals has recently reached 400 million. For this reason, recent studies have been accelerated to help disabled people communicate more easily. Sign language is a very important means of communication for deaf and mute people. In sign language, each gesture has a specific meaning, so complex meanings can be expressed by combining various basic elements. Sign language is a gesture-based, essentially non-verbal language that deaf and mute people usually use to communicate more effectively with each other or with hearing people. It has special rules and grammar for expressing meaning effectively.

II. OBJECTIVE AND METHODOLOGY

A. Objective of Project

The project aims to deliver a system for realtime sign language detection and recognition using MediaPipe [1] and computer vision, with model training using LSTM and CNN. The different processes used in this project are:

• Palm Detection Model
• Hand Landmark Model
• Multiclass Classification
• Realtime Sign Detection and Recognition

B. Methodologies and Technologies

The approach for detecting and recognizing sign language actions is as follows:

• First, we detect the hands in the live webcam feed.
• Landmarks are collected from the different positions of the hands.
• The key points of the landmarks are stored in an array so that they can be used further in the process.
• While capturing the landmarks, specific labels are given by the user.
• With the help of the labels, the given data is classified.
• The model is trained using LSTM (Long Short-Term Memory) layers, with the key points of the landmarks as input and the labels as output.
• Finally, we test the model on the live webcam feed in real time.
• We created a web application for the project using Flask.

III. MEDIA PIPE

MediaPipe [1] offers cross-platform, customizable ML solutions for live and streaming media. It provides various solutions such as Hands, Face, Holistic, Pose, etc. MediaPipe Holistic combines Hands, Face and Pose; in our project, however, we use only the MediaPipe Hands solution.

MediaPipe Hands uses two types of model:

• Palm Detection Model: used to detect hands in an image.
• Hand Landmark Model: recognizes and identifies the 21 landmarks of the hands.

IV. DATA PREPARATION

We collected the data by capturing hand images through an OpenCV window. We then captured the landmarks of the hands and stored the key points of the landmarks in a NumPy array.

Fig. 1. Data Collection

V. MODEL DESCRIPTION

The model is built with TensorFlow and Keras using Long Short-Term Memory (LSTM), an artificial Recurrent Neural Network (RNN) architecture. Unlike standard feedforward neural networks, an LSTM has feedback connections. It can process not only single data points such as images, but also entire sequences of data such as speech or video.

The LSTM model architecture consists of the following layers:

• LSTM layer; 64 nodes
• LSTM layer; 128 nodes
• LSTM layer; 64 nodes
• Fully connected layer; 64 nodes
• Fully connected layer; 32 nodes

The final layer is also a fully connected layer, with as many nodes as there are labels. All hidden layers use ReLU as the activation function; the output layer uses Softmax.

VI. SYSTEM ARCHITECTURE

Fig. 2. System Architecture

VII. ACCURACY COMPARISON

Model                     Activation Function   Accuracy   Loss
LSTM model on landmarks   ReLU/Softmax          0.9667     0.2673
CNN model on images       ReLU                  0.9827     0.0761

Fig. 3. Model Comparison

Parameters                   Activation Function   Accuracy   Loss
Epochs=500, Optimizer=Adam   ReLU/Softmax          0.9667     0.2673
Epochs=1000, Optimizer=Adam  Tanh/Softmax          0.9947     0.0127
Epochs=3000, Optimizer=Adam  Tanh/Softmax          0.9512     0.002

Fig. 4. Model Parameters Comparison

VIII. ACCURACY & LOSS GRAPH

Fig. 5. Accuracy Graph

Fig. 6. Loss Graph
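The LSTM stack listed in Section V (64, 128 and 64 LSTM units followed by 64- and 32-node dense layers, ReLU throughout, Softmax on the output) can be sketched in Keras as follows. The sequence length (frames per sign) and the number of labels are illustrative assumptions; the paper does not state them.

```python
import numpy as np
import tensorflow as tf

NUM_FRAMES = 30     # assumed frames per sign sequence (not given in the paper)
NUM_KEYPOINTS = 63  # 21 hand landmarks x (x, y, z)
NUM_LABELS = 3      # assumed number of sign classes, for illustration

# LSTM stack from Section V; the first two LSTM layers return full
# sequences so the next LSTM layer still receives a time dimension.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, NUM_KEYPOINTS)),
    tf.keras.layers.LSTM(64, return_sequences=True, activation="relu"),
    tf.keras.layers.LSTM(128, return_sequences=True, activation="relu"),
    tf.keras.layers.LSTM(64, return_sequences=False, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_LABELS, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# One dummy sequence, just to check the input/output shapes.
dummy = np.zeros((1, NUM_FRAMES, NUM_KEYPOINTS), dtype=np.float32)
probs = model.predict(dummy, verbose=0)
```

Training would then call `model.fit` on the stacked key-point sequences and one-hot labels, matching the Adam/epoch settings compared in Fig. 4.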

Fig. 7. Confusion Matrix

Fig. 8. Classification Report

IX. DEPLOYMENT

We deployed our project using Flask. We also created a login system for our website, including a face login system that works using a face recognition library.

A. Flow Chart

Fig. 9. Flow Chart

B. Web Application

Fig. 10. Face Login System

Fig. 11. Realtime Sign Detection
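A Flask deployment of the kind described above could look like the following minimal sketch. The route names and the prediction stub are illustrative assumptions, not the authors' actual application; a real app would run the trained LSTM model on key points extracted from the webcam feed.

```python
from flask import Flask, jsonify

app = Flask(__name__)


def predict_sign(keypoints):
    """Placeholder for the trained LSTM model's prediction step."""
    # A real app would call model.predict(...) on the key-point sequence.
    return "hello"


@app.route("/")
def index():
    return "Realtime Sign Language Detection and Recognition"


@app.route("/predict")
def predict():
    # In the real app, key points would come from the live webcam feed.
    return jsonify({"sign": predict_sign(None)})


if __name__ == "__main__":
    app.run(debug=True)
```

The face login step would sit in front of these routes as an authentication check before the detection page is served.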

The CNN model requires a large amount of storage because it works on raw images, whereas the LSTM model works on the key points of the images, which are much easier to store.
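The storage claim is easy to quantify with back-of-envelope arithmetic. The 224x224 RGB frame size below is an assumed example (the paper does not state its image resolution); the key-point side follows directly from 21 landmarks with three 64-bit coordinates each.

```python
import numpy as np

# One modest 224x224 RGB training frame versus one key-point vector of
# 21 landmarks x (x, y, z) stored as 64-bit floats.
image = np.zeros((224, 224, 3), dtype=np.uint8)
keypoints = np.zeros(21 * 3, dtype=np.float64)

image_bytes = image.nbytes       # 224 * 224 * 3 = 150,528 bytes
keypoint_bytes = keypoints.nbytes  # 63 * 8 = 504 bytes
ratio = image_bytes // keypoint_bytes
print(image_bytes, keypoint_bytes, ratio)  # key points are ~300x smaller
```

Under these assumptions a key-point frame is roughly 300 times smaller than the raw image it summarizes, which is why the LSTM-on-landmarks dataset is so much cheaper to store than the CNN-on-images one.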
Fig. 12. Sign Detection using Images

X. CONCLUSION

In this Python project, we have built a realtime sign detection and recognition system that can be implemented in numerous ways. We used the MediaPipe library [1] to capture the landmarks of the hands and then used an LSTM model to train on them and predict the sign language actions.

The objective of this project is partially met. The program is able to load and perform within the required time frame, and the accuracy of the resulting detections is acceptable. This prototype can be further tested and evaluated on various aspects of scalability and stability.

ACKNOWLEDGMENT

We would like to thank CDAC ACTS, Pune for supporting us and helping us choose this project to work on. We would also like to thank our guide, Dr. Shantanu Pathak, for guiding us throughout the project.

REFERENCES

[1] https://mediapipe.dev/
[2] https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html
[3] https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html
[4] Arpita Halder, Akshit Tayade, "Real-time Vernacular Sign Language Recognition using MediaPipe and Machine Learning," www.ijrpr.com, ISSN 2582-7421.
[5] Murat Taskiran, Mehmet Killioglu, Nihan Kahraman, "A Real-Time System for Recognition of American Sign Language by using Deep Learning," IEEE 18044537.
[6] Brandon Garcia, Sigberto Alarcon Viesca, "Real-time American Sign Language Recognition with Convolutional Neural Networks," stanford.edu.

