Sign Language
Submitted to
MASTER OF TECHNOLOGY
Submitted By
E. VENKATESWARAMMA (22X41D5805)
Ch. Ambedkar
July 2023
S.R.K INSTITUTE OF TECHNOLOGY
ENIKEPADU, VIJAYAWADA.
DEPARTMENT OF CSE
CERTIFICATE
22X41D5805 E. VENKATESWARAMMA
ACKNOWLEDGEMENT
Firstly, I would like to convey my heartfelt thanks to the Almighty for the
blessings on me to carry out this project work without any disruption.
I am very grateful to Dr. B. Asha Latha, H.O.D of the CSE Department, for her
valuable guidance, which helped me bring out this project successfully. Her wise approach
made me learn the minute details of the subject. Her mature and patient guidance paved
the way for completing my project with a sense of satisfaction and pleasure.
I am very thankful to our principal, Dr. M. Ekambaram Naidu, for his kind support
and the facilities provided at our campus, which helped me bring out this project successfully.
Finally, I would like to convey my heartfelt thanks to all the technical staff for their
guidance and support in every step of this project. I convey my sincere thanks to all the faculty
and friends who directly or indirectly helped me in the successful completion of this
project.
PROJECT ASSOCIATE
E.VENKATESWARAMMA (22X41D5805)
ABSTRACT
Sign Language Recognition (SLR) aims to interpret sign language as
text or speech, so as to facilitate communication between deaf-mute people and
ordinary people. This task has broad social impact, but it is still very challenging due to the
complexity and large variation of hand actions. Existing methods for SLR use hand-crafted
features to describe sign language motion and build classification models based on those
features. However, it is difficult to design reliable features that adapt to the large variation of
hand gestures. To approach this problem, we propose a novel convolutional neural network
(CNN) that automatically extracts discriminative spatio-temporal features from the raw video
stream without any prior knowledge, avoiding hand-designed features. To boost
performance, multiple channels of video streams, including color information, depth clues, and
body joint positions, are used as input to the CNN in order to integrate color, depth, and
trajectory information. We validate the proposed model on a real dataset collected with
Microsoft Kinect and demonstrate its effectiveness over traditional approaches based on
hand-crafted features.
CONTENTS
CONTENTS Page No
CERTIFICATE I
ACKNOWLEDGEMENT II
DECLARATION III
ABSTRACT IV
1. INTRODUCTION 1-2
2. LITERATURE SURVEY 3-5
3. REQUIREMENTS GATHERING 6
4. SYSTEM ANALYSIS 7-8
5. TECHNOLOGIES 9-28
6. DESIGN 29-45
a. UML Diagrams
b. System Architecture
c. Code
7. CONCLUSION 46
8. REFERENCES 47-48
INTRODUCTION
Sign language, as one of the most widely used means of communication for hearing-impaired
people, is expressed by variations of hand shapes, body movement, and even
facial expression. Since it is difficult to collaboratively exploit the information from hand
shapes and body movement trajectories, sign language recognition is still a very challenging
task. This paper proposes an effective recognition model to translate sign language into text
or speech in order to help the hearing impaired communicate with normal people through
sign language.
Kinect is a motion sensor which can provide a color stream and a depth stream. With the
public Windows SDK, body joint locations can be obtained in real time, as shown in Fig. 1.
Therefore, we choose Kinect as the capture device to record the sign-word dataset. Changes of
color and depth at the pixel level are useful information for discriminating different sign actions,
and the variation of body joints over time depicts the trajectory of sign actions.
Using multiple types of visual sources as input leads the CNN to pay attention to changes
not only in color, but also in depth and trajectory. It is worth mentioning that we can avoid
the difficulty of tracking hands, segmenting hands from the background, and designing
descriptors for hands, because CNNs have the capability to learn features automatically from
raw data without any prior knowledge [3].
CNNs have been applied to video stream classification in recent years. A potential
concern with CNNs is that they are time-consuming: it can take several weeks or months to
train a CNN on million-scale video datasets. Fortunately, it is still possible to achieve real-time
efficiency with the help of CUDA for parallel processing. We propose to apply CNNs to extract
spatial and temporal features from the video stream for Sign Language Recognition (SLR).
Existing methods for SLR use hand-crafted features to describe sign language motion and build
classification models based on these features.
In contrast, CNNs can capture motion information from raw video data automatically,
avoiding hand-designed features. We develop a CNN that takes multiple types of data as input. This
architecture integrates color, depth, and trajectory information by performing convolution
and subsampling on adjacent video frames. Experimental results demonstrate that 3D CNNs
can significantly outperform Gaussian mixture model with hidden Markov model (GMM-HMM)
baselines on sign words recorded by ourselves.
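The spatio-temporal convolution at the heart of such a 3D CNN can be sketched in plain NumPy. This is a simplified, single-filter illustration of how one filter slides over both the spatial axes and the time axis of a clip, not the full multi-channel model described in the paper.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D convolution over a (T, H, W) video clip.

    Illustrates how a spatio-temporal filter slides across both the
    spatial dimensions and the time axis, which is the core operation
    of a 3D CNN for video.
    """
    kt, kh, kw = kernel.shape
    T, H, W = clip.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(clip[t:t+kt, i:i+kh, j:j+kw] * kernel)
    return out

clip = np.random.rand(9, 32, 32)   # 9 frames of 32x32 grayscale video
kernel = np.random.rand(3, 3, 3)   # one 3x3x3 spatio-temporal filter
features = conv3d_valid(clip, kernel)
print(features.shape)              # (7, 30, 30)
```

In the real model many such filters are learned per input channel (color, depth, joint trajectories), and their responses are stacked into feature maps.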
LITERATURE SURVEY
[1] Indian Sign Language Recognition:
Sanil Jain and K. V. Sameer Raja worked on Indian Sign Language recognition using colored
images. They used feature extraction methods like bag of visual words, Gaussian random
projections, and the Histogram of Oriented Gradients (HoG). Three subjects were used to train an SVM,
and they achieved an accuracy of 54.63% when tested on a totally different user.
[2] An Efficient Framework for Indian Sign Language Recognition Using Wavelet
Transform:
In this paper, the authors presented a database-driven hand gesture recognition
scheme based on a skin color model and a thresholding approach, along with
effective template matching, which can be effectively used for human-robotics applications and
similar applications. Initially, the hand region is segmented by applying a skin color model in
the YCbCr color space. In the next stage, thresholding is applied to separate the foreground and
background. Finally, a template-based matching technique is developed using Principal
Component Analysis (PCA) for recognition.
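The skin-color segmentation step described in [2] can be sketched as below. The Cb/Cr bounds used here (77-127 and 133-173) are commonly cited skin-color ranges; the exact thresholds used by the paper's authors are an assumption.

```python
import numpy as np

def skin_mask_ycbcr(rgb):
    """Segment candidate skin pixels by thresholding in YCbCr space.

    Converts RGB to the chroma channels Cb/Cr (ITU-R BT.601 weights)
    and keeps pixels falling inside assumed skin-color bounds.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

img = np.zeros((2, 2, 3))
img[0, 0] = [200, 120, 100]        # a typical skin-toned pixel
mask = skin_mask_ycbcr(img)
print(mask[0, 0], mask[1, 1])      # True False
```

The resulting binary mask would then feed the thresholding and PCA-based template matching stages.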
The authors presented a static hand gesture recognition system using digital image
processing. The SIFT algorithm is used to obtain the hand gesture feature vector. The SIFT
features are computed at the edges and are invariant to scaling, rotation, and addition of noise.
In this paper, a method for automatic recognition of signs on the basis of shape-based
features is presented. For segmentation of the hand region from the images, Otsu's thresholding
algorithm is used, which chooses an optimal threshold to minimize the within-class variance of
thresholded black and white pixels. Features of the segmented hand region are calculated using
Hu's invariant moments, which are fed to an Artificial Neural Network for classification.
Performance of the system is evaluated on the basis of accuracy, sensitivity, and specificity.
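Otsu's criterion mentioned above can be sketched directly: sweep every candidate threshold and keep the one that maximizes between-class variance (equivalent to minimizing within-class variance). This is an illustrative NumPy version; in practice a library routine such as OpenCV's `cv2.threshold` with the `THRESH_OTSU` flag would be used.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold that maximizes between-class variance,
    which is equivalent to minimizing the within-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()     # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0       # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal image: dark background (20) and a bright hand region (200)
img = np.array([[20] * 6 + [200] * 4] * 10, dtype=np.uint8)
t = otsu_threshold(img)
print(20 < t <= 200)   # True: the threshold separates the two modes
```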
The authors presented various methods of hand gesture and sign language recognition
proposed in the past by various researchers. For deaf and mute people, sign language is the
only way of communication. With the help of sign language, these physically impaired people
express their emotions and thoughts to others.
[7] Design Issue and Proposed Implementation of Communication Aid for Deaf &
Dumb People:
In this paper, the authors proposed a system to aid the communication of deaf and mute
people with hearing people using Indian Sign Language (ISL), where hand
gestures are converted into appropriate text messages. The main objective is to design an
algorithm to convert dynamic gestures to text in real time. Finally, after testing, the
system will be implemented on the Android platform and made available as an application.
[8] Real Time Detection and Recognition of Indian and American Sign
Language Using SIFT:
The authors proposed a real-time vision-based system for hand gesture recognition for
human-computer interaction in many applications. The system can recognize 35 different
hand gestures of Indian and American Sign Language (ISL and ASL) at a faster rate
with good accuracy. An RGB-to-gray segmentation technique was used to minimize the
chances of false detection. The authors proposed an improvised Scale Invariant Feature
Transform (SIFT), which was used to extract features. The system is modeled using MATLAB.
To design an efficient, user-friendly hand gesture recognition system, a GUI model has been
implemented.
[9] A Review on Feature Extraction for Indian and American Sign Language:
The paper presented recent research and development in sign language based on
manual communication and body language. A sign language recognition system typically
comprises three steps: preprocessing, feature extraction, and classification. Classification
methods used for recognition include Neural Networks (NN), Support Vector Machines (SVM),
Hidden Markov Models (HMM), the Scale Invariant Feature Transform (SIFT), etc.
The authors presented an application that helps deaf and mute people communicate
with the rest of the world using sign language. The key feature of this system is real-time
gesture-to-text conversion. The processing steps include gesture extraction, gesture
matching, and conversion to speech. Gesture extraction involves the use of various image
processing techniques such as histogram matching, bounding box computation, skin colour
segmentation, and region growing. Techniques applicable to gesture matching include feature
point matching and correlation-based matching. The other features of the application include
voicing out text and text-to-gesture conversion.
[11] Offline Signature Verification Using SURF Feature Extraction and Neural
Networks Approach:
In this paper, offline signature recognition and verification using a neural network is
proposed, where the signature is captured and presented to the user in an image format.
REQUIREMENTS GATHERING
HARDWARE:
Processor - Pentium III
Speed - 2.4 GHz
SOFTWARE:
SYSTEM ANALYSIS
EXISTING SYSTEM:
• Many American Deaf people believe that one-handed fingerspelling makes for faster
fingerspelling than two-handed systems.
DISADVANTAGES:
• Existing methods for SLR use hand-crafted features to describe sign language
motion and build classification models based on these features.
PROPOSED SYSTEM:
• An ASL dataset consisting of 26 hand gestures is used for the input signs, and an RCNN
algorithm is used for prediction: when a sign is given as input through the webcam, voice
is received as output by the user.
• We propose to apply CNNs to extract spatial and temporal features from the video
stream for Sign Language Recognition (SLR).
ADVANTAGES:
• High accuracy in recognizing signs (hand gestures); the system outputs text or speech,
making their way of communication easy to understand.
• Learning sign language is not needed to understand what they are expressing.
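The final prediction step of the proposed system can be sketched as follows. The alphabetical A-Z label ordering and the mention of a text-to-speech engine such as pyttsx3 are assumptions for illustration, not details taken from the report.

```python
import numpy as np

# Assumed label set: the 26 static ASL letter gestures, in alphabetical order.
ASL_LABELS = [chr(ord('A') + i) for i in range(26)]

def predict_letter(softmax_scores):
    """Map the network's 26-way class probabilities to an ASL letter.

    In the full system, each webcam frame would be preprocessed, passed
    through the trained network to obtain `softmax_scores`, and the
    resulting letter spoken aloud (e.g. via a TTS engine such as
    pyttsx3, an assumption here).
    """
    return ASL_LABELS[int(np.argmax(softmax_scores))]

scores = np.zeros(26)
scores[2] = 0.9                   # network is most confident in class 2
print(predict_letter(scores))     # C
```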
TECHNOLOGIES
What is Python:
• Python is used by almost all tech giants like Google,
Amazon, Facebook, Instagram, Dropbox, Uber, etc.
• The biggest strength of Python is its huge collection of standard libraries, which can be
used for areas such as the following –
Machine Learning
Test frameworks
Multimedia
Advantages of Python:
1. Extensive Libraries
Python ships with an extensive library containing code for various
purposes like regular expressions, documentation generation, unit testing, web
browsers, threading, databases, CGI, email, image manipulation, and more. So, we don't
have to write the complete code for those tasks manually.
2. Extensible
As we have seen earlier, Python can be extended with other languages. You can
write some of your code in languages like C++ or C. This comes in handy, especially in
large projects.
3. Embeddable
4. Improved Productivity
5. IOT Opportunities
Since Python forms the basis of new platforms like Raspberry Pi, it finds the
future bright for the Internet of Things. This is a way to connect the language with the
real world.
6. Simple and Easy to Learn
When working with Java, you may have to create a class just to print 'Hello World'.
But in Python, a simple print statement will do. It is also quite easy to learn, understand,
and code. This is why, when people pick up Python, they have a hard time adjusting
to other, more verbose languages like Java.
7. Readable
Because it is not such a verbose language, reading Python is much like reading
English. This is the reason why it is so easy to learn, understand, and code. It also does
not need curly braces to define blocks, and indentation is mandatory. This further aids
the readability of the code.
8. Free and Open-Source
As we said earlier, Python is freely available. Not only can you download
Python for free, but you can also download its source code, make changes to it, and
even distribute it. It ships with an extensive collection of libraries to help you with
your tasks.
10. Portable
When you code your project in a language like C++, you may need to make some
changes to it if you want to run it on another platform. But it isn’t the same with Python.
Here, you need to code only once, and you can run it anywhere. This is called Write
Once Run Anywhere (WORA). However, you need to be careful enough not to include
any system-dependent features.
11. Interpreted
Advantages of Python Over Other Languages
1. Less Coding
Almost every task done in Python requires less code than the same task done in
other languages. Python also has awesome standard library support, so you often
don't need third-party libraries to get the job done. This is why many people suggest
that beginners learn Python.
2. Popularity
The 2019 GitHub annual survey showed that Python has overtaken Java in the
most-popular-programming-language category.
Python code can run on any machine, whether Linux, Mac, or Windows.
Programmers need to learn different languages for different jobs, but with Python you
can professionally build web apps, perform data analysis and machine learning,
automate things, do web scraping, and also build games and powerful visualizations. It is
an all-rounder programming language.
Disadvantages of Python
So far, we've seen why Python is a great choice for your project. But if you choose it, you
should be aware of its limitations as well. Let's now see the downsides of choosing
Python over another language.
1. Speed Limitations
We have seen that Python code is executed line by line. Since Python is
interpreted, this often results in slow execution. This, however, isn't a problem unless
speed is a focal point for the project. In other words, unless high speed is a
requirement, the benefits offered by Python outweigh its speed limitations.
2. Weak in Mobile Computing and Browsers
The reason Python is not so prominent in browsers despite the existence of Brython
is that it isn't that secure.
3. Design Restrictions
As you know, Python is dynamically typed. This means that you don't need to
declare the type of a variable while writing the code; it uses duck typing. But wait, what's
that? Well, it just means that if it looks like a duck, it must be a duck. While this makes
coding easy for programmers, it can raise run-time errors.
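A short example of duck typing and the run-time error it can defer:

```python
class Duck:
    def quack(self):
        return "quack"

class Person:
    def quack(self):
        return "I'm quacking!"

def make_it_quack(thing):
    # No type declaration: anything with a .quack() method works.
    return thing.quack()

print(make_it_quack(Duck()))      # quack
print(make_it_quack(Person()))    # I'm quacking!

# make_it_quack(42) would fail only here, at run time, with:
# AttributeError: 'int' object has no attribute 'quack'
```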
5. Simple
No, we’re not kidding. Python’s simplicity can indeed be a problem. Take my
example. I don’t do Java, I’m more of a Python person. To me, its syntax is so simple that
the verbosity of Java code seems unnecessary.
History of Python:
Guido van Rossum recalled in an interview: "I worked as an implementer on a team building
a language called ABC at Centrum voor Wiskunde en Informatica (CWI). I don't know how well
people know ABC's influence on Python. I try to mention ABC's influence because I'm indebted
to everything I learned during that project and to the people who worked on it." Later in the
same interview, Guido van Rossum continued: "I remembered all my experience and some of my
frustration with ABC. I decided to try to design a simple scripting language that possessed some
of ABC's better properties, but without its problems. So I started typing. I created a simple
virtual machine, a simple parser, and a simple runtime. I made my own version of the various
ABC parts that I liked. I created a basic syntax, used indentation for statement grouping instead
of curly braces or begin-end blocks, and developed a small number of powerful data types: a
hash table (or dictionary, as we call it), a list, strings, and numbers."
Before we take a look at the details of various machine learning methods, let's start
by looking at what machine learning is, and what it isn't. Machine learning is often
categorized as a subfield of artificial intelligence, but I find that categorization can often be
misleading at first brush. The study of machine learning certainly arose from research in this
context, but in the data science application of machine learning methods, it's more helpful
to think of machine learning as a means of building models of data.
At the most fundamental level, machine learning can be categorized into two main
types: supervised learning and unsupervised learning.
Supervised learning involves somehow modeling the relationship between
measured features of data and some label associated with the data; once this model is
determined, it can be used to apply labels to new, unknown data. This is
further subdivided into classification tasks and regression tasks: in classification, the labels
are discrete categories, while in regression, the labels are continuous quantities. We will see
examples of both types of supervised learning in the following section.
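The classification/regression distinction can be shown with two toy examples: a one-nearest-neighbor classifier (discrete labels) and a least-squares line fit (continuous labels). Both are purely illustrative.

```python
import numpy as np

# Classification: the labels are discrete categories.
X_train = np.array([[1.0], [2.0], [8.0], [9.0]])
y_train = np.array([0, 0, 1, 1])              # two discrete classes

def nearest_neighbor(x):
    """Label a point with the class of its closest training example."""
    return y_train[np.argmin(np.abs(X_train.ravel() - x))]

print(nearest_neighbor(1.5))   # 0  (close to the class-0 examples)

# Regression: the labels are continuous quantities.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
slope, intercept = np.polyfit(x, y, 1)        # fit a line to the data
print(slope, intercept)        # approximately 2.0 and 1.0
```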
Human beings are, at this moment, the most intelligent and advanced species on
earth because they can think, evaluate, and solve complex problems. AI, on the other hand, is
still in its initial stage and hasn't surpassed human intelligence in many aspects. The
question, then, is: what is the need to make machines learn? The most suitable reason for
doing this is "to make decisions, based on data, with efficiency and scale".
Challenges in Machine Learning:
Quality of data − Having good-quality data for ML algorithms is one of the biggest
challenges. Use of low-quality data leads to problems related to data preprocessing and
feature extraction.
No clear objective for formulating business problems − Having no clear objective and well-defined
goal for business problems is another key challenge for ML, because this technology
is not that mature yet.
Curse of dimensionality − Another challenge an ML model faces is that data points have
too many features. This can be a real hindrance.
Applications of Machine Learning:
• Emotion analysis
• Sentiment analysis
• Error detection and prevention
• Weather forecasting and prediction
• Speech synthesis
• Speech recognition
• Customer segmentation
• Object recognition
• Fraud detection
• Fraud prevention
Arthur Samuel coined the term “Machine Learning” in 1959 and defined it as a
“Field of study that gives computers the capability to learn without being explicitly
programmed”.
And that was the beginning of Machine Learning! In modern times, Machine Learning
is one of the most popular (if not the most!) career choices. According to Indeed, Machine
Learning Engineer Is The Best Job of 2019 with a 344% growth and an average base salary
of $146,085 per year.
But there is still a lot of doubt about what exactly Machine Learning is and how to
start learning it. So this section deals with the basics of Machine Learning and also the path
you can follow to eventually become a full-fledged Machine Learning Engineer. Now let's
get started!
Step 1 – Understand the Prerequisites
In case you are a genius, you could start ML directly, but normally there are some
prerequisites that you need to know: Linear Algebra, Multivariate Calculus,
Statistics, and Python. And if you don't know these, never fear! You don't need a Ph.D.
in these topics to get started, but you do need a basic understanding.
b) Learn Statistics
Data plays a huge role in Machine Learning. In fact, around 80% of your time
as an ML expert will be spent collecting and cleaning data. And statistics is a field that
handles the collection, analysis, and presentation of data. So it is no surprise that you
need to learn it!!!
Some of the key concepts in statistics that are important are statistical
significance, probability distributions, hypothesis testing, regression, etc. Bayesian
thinking is also a very important part of ML; it deals with concepts
like conditional probability, priors and posteriors, maximum likelihood, etc.
c) Learn Python
Some people prefer to skip Linear Algebra, Multivariate Calculus, and Statistics
and learn them as they go along, with trial and error. But the one thing that you
absolutely cannot skip is Python! While there are other languages you can use for
Machine Learning, like R, Scala, etc., Python is currently the most popular language for
ML. In fact, there are many Python libraries that are specifically useful for Artificial
Intelligence and Machine Learning, such as Keras, TensorFlow, Scikit-learn, etc. So if
you want to learn ML, it's best if you learn Python! You can do that using various
online resources and courses, such as the free Python courses available on GeeksforGeeks.
Step 2 – Learn Various ML Concepts
Now that you are done with the prerequisites, you can move on to actually learning
ML (Which is the fun part!!!) It’s best to start with the basics and then move on to the more
complicated stuff. Some of the basic concepts in ML are:
• Prediction – Once our model is ready, it can be fed a set of inputs, to which it
will provide a predicted output (label).
• Unsupervised Learning – This involves using unlabeled data and then finding
the underlying structure in the data in order to learn more and more about
the data itself using factor and cluster analysis models.
• Semi-supervised Learning – This involves using unlabeled data, as in
unsupervised learning, together with a small amount of labeled data. Using labeled
data vastly increases learning accuracy, and this approach is also more cost-effective
than supervised learning.
• Reinforcement Learning – This involves learning optimal actions through trial
and error. So, the next action is decided by learning behaviors that are based
on the current state and that will maximize the reward in the future.
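The cluster analysis mentioned under unsupervised learning can be sketched with a minimal one-dimensional k-means (k = 2). This is an illustrative toy, not a production clustering routine.

```python
import numpy as np

def two_means_1d(points, iters=10):
    """Minimal 1-D k-means (k=2): finds cluster structure in the data
    without using any labels, which is the essence of unsupervised
    learning described above."""
    # Initialize the two centroids at the extremes of the data.
    c = np.array([points.min(), points.max()], dtype=float)
    for _ in range(iters):
        # Assign each point to its nearest centroid ...
        assign = np.abs(points[:, None] - c[None, :]).argmin(axis=1)
        # ... then move each centroid to the mean of its points.
        for k in range(2):
            if np.any(assign == k):
                c[k] = points[assign == k].mean()
    return np.sort(c)

data = np.array([1.0, 1.2, 0.8, 9.0, 9.5, 8.5])
centers = two_means_1d(data)
print(centers)   # approximately [1.0, 9.0]
```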
Machine Learning can review large volumes of data and discover specific trends
and patterns that would not be apparent to humans. For instance, for an e-commerce
website like Amazon, it serves to understand the browsing behaviors and
purchase histories of its users, helping cater the right products, deals, and
reminders relevant to them. It uses the results to reveal relevant advertisements to
them.
With ML, you don't need to babysit your project every step of the way. Since it
means giving machines the ability to learn, it lets them make predictions and also
improve the algorithms on their own. A common example of this is antivirus software:
it learns to filter new threats as they are recognized. ML is also good at recognizing
spam.
Handling Multi-Dimensional Data
Machine Learning algorithms are good at handling data that is multi-dimensional
and multi-variety, and they can do this in dynamic or uncertain environments.
Wide Applications
You could be an e-tailer or a healthcare provider and make ML work for you.
Where it does apply, it holds the capability to help deliver a much more personal
experience to customers while also targeting the right customers.
Disadvantages of Machine Learning:
Data Acquisition
Machine Learning requires massive data sets to train on, and these should be
inclusive/unbiased and of good quality. There can also be times when one must wait
for new data to be generated.
Time and Resources
ML needs enough time to let the algorithms learn and develop enough to fulfill
their purpose with a considerable amount of accuracy and relevancy. It also needs
massive resources to function. This can mean additional requirements of computing
power for you.
Interpretation of Results
High error-susceptibility
Guido van Rossum published the first version of Python code (version 0.9.0) at
alt.sources in February 1991. This release already included exception handling, functions,
and the core data types list, dict, str, and others. It was also object oriented and had a
module system.
Python version 1.0 was released in January 1994. The major new features included in
this release were the functional programming tools lambda, map, filter, and reduce, which
Guido van Rossum never liked. Six and a half years later, in October 2000, Python 2.0 was
introduced. This release included list comprehensions, a full garbage collector, and
Unicode support. Python flourished for another 8 years in the versions 2.x before the
next major release, Python 3.0 (also known as "Python 3000" and "Py3K"), was released.
Python 3 is not backwards compatible with Python 2.x. The emphasis in Python 3 has been
on the removal of duplicate programming constructs and modules, thus fulfilling or coming
close to fulfilling the 13th aphorism of the Zen of Python: "There should be one -- and
preferably only one -- obvious way to do it." Some changes in Python 3.7:
Purpose:
Python
Python also acknowledges that speed of development is important. Readable
and terse code is part of this, and so is access to powerful constructs that avoid tedious
repetition of code. Maintainability also ties into this: lines of code may be an all-but-useless
metric, but it does say something about how much code you have to scan, read, and/or
understand to troubleshoot problems or tweak behaviors. This speed of development, the
ease with which a programmer coming from another language can pick up basic Python
skills, and the huge standard library are key to another area where Python excels: all its
tools have been quick to implement, have saved a lot of time, and several of them have later
been patched and updated by people with no Python background, without breaking.
TensorFlow
TensorFlow was developed by the Google Brain team for internal Google use.
It was released under the Apache 2.0 open-source license on November 9, 2015.
NumPy
Pandas
Pandas is an open-source Python library providing high-performance data
manipulation and analysis tools built on its powerful data structures. Previously, Python was
mostly used for data munging and preparation; it had very little to offer
for data analysis. Pandas solved this problem. Using Pandas, we can accomplish
five typical steps in the processing and analysis of data, regardless of the origin of the
data: load, prepare, manipulate, model, and analyze. Python with Pandas is used in a
wide range of fields, including academic and commercial domains such as finance,
economics, statistics, and analytics.
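Those five steps can be sketched with a toy table; the column names and values here are invented purely for illustration.

```python
import pandas as pd

# Load / prepare: build a small frame (in practice, pd.read_csv(...)).
df = pd.DataFrame({
    "sign": ["A", "B", "A", "C"],
    "accuracy": [0.91, 0.85, 0.95, 0.88],
})

# Manipulate: filter rows and aggregate by group.
high = df[df["accuracy"] > 0.86]
mean_by_sign = df.groupby("sign")["accuracy"].mean()

# Analyze: inspect the results.
print(len(high))                      # 3 rows pass the filter
print(round(mean_by_sign["A"], 2))    # 0.93
```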
Matplotlib
Scikit-learn
Django – Design Philosophies
Loosely Coupled − Django aims to make each element of its stack independent of
the others.
Don't Repeat Yourself (DRY) − Everything should be developed only in exactly one
place instead of repeating it again and again.
Clean Design − Django strictly maintains a clean design throughout its own code and
makes it easy to follow best web-development practices.
Advantages of Django
Here are a few advantages of using Django −
Framework Support − Django has built-in support for Ajax, RSS, Caching and various
other frameworks.
As you already know, Django is a Python web framework, and like most modern
frameworks, Django supports the MVC pattern. First let's see what the Model-View-Controller
(MVC) pattern is, and then we will look at Django's specific take on it, the Model-View-Template
(MVT) pattern.
MVC Pattern
When talking about applications that provide a UI (web or desktop), we usually talk about
MVC architecture. As the name suggests, the MVC pattern is based on three components:
Model, View, and Controller.
The Model-View-Template (MVT) pattern is slightly different from MVC. In fact, the main
difference between the two patterns is that Django itself takes care of the Controller part (the
software code that controls the interactions between the Model and the View), leaving us with
the template. The template is an HTML file mixed with the Django Template Language (DTL).
The following diagram illustrates how each of the components of the MVT pattern interacts
with each other to serve a user request −
The developer provides the Model, the View, and the template, then just maps them to a URL,
and Django does the magic of serving them to the user.
Jupyter Notebook
The Jupyter Notebook is an open-source web application that you can use to create and
share documents that contain live code, equations, visualizations, and text. Jupyter
Notebook is maintained by the people at Project Jupyter.
Jupyter Notebooks are a spin-off from the IPython project, which used to have an
IPython Notebook project itself. The name Jupyter comes from the core
programming languages it supports: Julia, Python, and R. Jupyter ships with the IPython
kernel, which allows you to write your programs in Python, but there are currently over 100
other kernels that you can also use.
DESIGN
UML DIAGRAMS:
UML is an acronym that stands for Unified Modeling Language. Simply put, UML is a
modern approach to modeling and documenting software. In fact, it’s one of the most
popular business process modeling techniques.
UML was created as a result of the chaos revolving around software development
and documentation. In the 1990s, there were several different ways to represent and
document software systems. The need arose for a more unified way to visually represent
those systems and as a result, in 1994-1996, the UML was developed by three software
engineers working at Rational Software. It was later adopted as the standard in 1997 and
has remained the standard ever since, receiving only a few updates.
GOALS:
USE CASE DIAGRAM:
CLASS DIAGRAM:
A class diagram describes the attributes and operations of a class and also the
constraints imposed on the system. Class diagrams are widely used in the modeling of
object-oriented systems because they are the only UML diagrams which can be mapped
directly to object-oriented languages.
SEQUENCE DIAGRAM:
ACTIVITY DIAGRAM:
Activity diagram is basically a flowchart to represent the flow from one activity to
another activity. The activity can be described as an operation of the system. The control
flow is drawn from one operation to another. This flow can be sequential, branched, or
concurrent.
STATE CHART DIAGRAM:
The name of the diagram itself clarifies its purpose. It describes the
different states of a component in a system; the states are specific
to a component/object of the system. A state chart diagram describes a state machine, which
can be defined as a machine that defines the different states of an object, where these
states are controlled by external or internal events. The state chart diagram defines the states
and is used to model the lifetime of an object.
DEPLOYMENT DIAGRAM:
The deployment diagram visualizes the physical hardware on which the software
will be deployed. It portrays the static deployment view of a system. It involves the nodes
and their relationships. It ascertains how software is deployed on the hardware. It maps the
software architecture created in design to the physical system architecture, where the
software will be executed as a node. Since it involves many nodes, the relationship is shown
by utilizing communication paths.
COLLABORATION DIAGRAM:
A collaboration shows only the roles and connectors of the participating
instances, not their specific classes or identities, so it can be reused to diagram
architectural patterns of collaborating objects and to model their common behavior,
similar to a template. To show a specific occurrence of a pattern, a collaboration use
is employed.
USE CASE DIAGRAM:
[Figure: use case diagram with actors User and System and the use case Train CNN with Gesture Image]
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
[Figure: sequence diagram showing the messages trainCNNwithGestureImage, signLanguageRecognitionfromWebCam, identifyingSignLanguagefromWebCam, and logout]
ACTIVITY DIAGRAM:
COMPONENT DIAGRAM:
STATE CHART DIAGRAM:
DEPLOYMENT DIAGRAM:
COLLABORATION DIAGRAM:
[Figure: collaboration diagram; user and application exchange the messages 1: uploadHandGestureDataset, 2: trainCNNwithGestureImage, 3: signLanguageRecognitionFromWebam, 4: logout]
SYSTEM ARCHITECTURE:
A CNN model is used to extract features from the frames and to predict hand
gestures. It is a multilayered feedforward neural network mostly used in image recognition.
The architecture consists of several convolution layers, each followed by a pooling
layer, an activation function, and optionally batch normalization, plus a set of fully
connected layers at the end. As an image moves through the network, it is progressively
reduced in size as a result of max pooling. The last layer gives the prediction of the
class probabilities.
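The size reduction caused by max pooling can be made concrete with a small NumPy sketch; the 4 by 4 input below is invented for illustration, and a 2 by 2 window with stride 2 halves each spatial dimension:

```python
import numpy as np

def max_pool_2x2(image):
    # Keep the largest value in each non-overlapping 2x2 window,
    # halving the height and width of the input.
    h, w = image.shape
    trimmed = image[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [1, 4, 3, 5]], dtype=float)
pooled = max_pool_2x2(x)  # 2x2 array: each entry is one window's maximum
```

Each entry of the pooled output summarizes a whole window, which is why a 64 by 64 input shrinks after every pooling layer in the model described above.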
CNNs are feature-extraction models in deep learning that have recently
proven to be very successful at image recognition. The models are in use by
industry leaders such as Google, Facebook, and Amazon, and researchers at
Google have also applied CNNs to video data. CNNs are inspired by the visual cortex of
the human brain: each artificial neuron connects to a local region of the visual field,
called a receptive field. This is accomplished by performing discrete convolutions on the
image with filter values as trainable weights. Multiple filters are applied for each
channel and, together with the activation functions of the neurons, they form feature
maps. This is followed by a pooling scheme, where only the most salient information in
the feature maps is retained. These techniques are performed in multiple layers.
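The discrete convolution described above can be sketched directly in NumPy. Strictly speaking, deep-learning frameworks compute a cross-correlation, which is what this minimal example does; the image and the edge-detecting filter below are invented for illustration:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over the image ('valid' mode, no padding) and
    # take a weighted sum at each position; the kernel entries play
    # the role of the trainable filter weights.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 4x4 image with a vertical edge, and a simple vertical-edge filter.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)
feature_map = conv2d_valid(image, kernel)
```

The filter responds strongly wherever its receptive field straddles the edge, which is the sense in which a trained filter "detects" a local pattern and contributes one feature map.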
Figure: Classification
In our case, the first convolutional layer of the network has 32 filters of size
3 by 3, and the activation function is a rectified linear unit (ReLU). The max-pooling
layer has a size of 2×2. The dropout is set to 50 percent, and the layer is then flattened.
The last layer of the network is a fully connected output layer with ten units and a
softmax activation function. The model is then compiled using categorical cross-entropy
as the loss function and Adam as the optimizer.
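The categorical cross-entropy loss named above can be written out directly: for a one-hot target, it reduces to the negative log of the probability the softmax assigns to the true class. A minimal NumPy sketch, with arbitrary illustrative logits:

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

def categorical_cross_entropy(y_true, y_pred):
    # -sum(t_i * log(p_i)); with a one-hot target t this picks out
    # a single term, -log(p_true).
    return -np.sum(y_true * np.log(y_pred))

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)             # class probabilities, summing to 1
y_true = np.array([1.0, 0.0, 0.0])  # one-hot label for class 0
loss = categorical_cross_entropy(y_true, probs)
```

Minimizing this loss with Adam pushes the probability of the labeled gesture class toward 1, which is what `classifier.compile(..., loss='categorical_crossentropy')` arranges in the training code below.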
Figure: CNN
CODING:
from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
from tkinter.filedialog import askopenfilename
import cv2
import random
import numpy as np
from keras.utils.np_utils import to_categorical
from keras.layers import MaxPooling2D
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D
from keras.models import Sequential
from keras.models import model_from_json
import pickle
import os
import imutils
from gtts import gTTS
from playsound import playsound
from threading import Thread
main = tkinter.Tk()
main.title("Sign Language Recognition to Text & Voice using CNN Advance")
main.geometry("1300x1200")
bg = None
playcount = 0

def deleteDirectory():
    filelist = [f for f in os.listdir('play') if f.endswith(".mp3")]
    for f in filelist:
        os.remove(os.path.join('play', f))
def play(playcount, gesture):
    class PlayThread(Thread):
        def __init__(self, playcount, gesture):
            Thread.__init__(self)
            self.gesture = gesture
            self.playcount = playcount
        def run(self):
            t1 = gTTS(text=self.gesture, lang='en', slow=False)
            t1.save("play/" + str(self.playcount) + ".mp3")
            playsound("play/" + str(self.playcount) + ".mp3")
    PlayThread(playcount, gesture).start()
def remove_background(frame):
    fgmask = bgModel.apply(frame, learningRate=0)
    kernel = np.ones((3, 3), np.uint8)
    fgmask = cv2.erode(fgmask, kernel, iterations=1)
    res = cv2.bitwise_and(frame, frame, mask=fgmask)
    return res
if os.path.exists('model1/model.json'):
    with open('model1/model.json', "r") as json_file:
        loaded_model_json = json_file.read()
        classifier = model_from_json(loaded_model_json)
    classifier.load_weights("model1/model_weights.h5")
    classifier._make_predict_function()
    print(classifier.summary())
    f = open('model1/history.pckl', 'rb')
    data = pickle.load(f)
    f.close()
    acc = data['accuracy']
    accuracy = acc[9] * 100
    text.insert(END, "CNN Hand Gesture Training Model Prediction Accuracy = " + str(accuracy))
else:
    classifier = Sequential()
    classifier.add(Convolution2D(32, 3, 3, input_shape=(64, 64, 3), activation='relu'))
    classifier.add(MaxPooling2D(pool_size=(2, 2)))
    classifier.add(Convolution2D(32, 3, 3, activation='relu'))
    classifier.add(MaxPooling2D(pool_size=(2, 2)))
    classifier.add(Flatten())
    classifier.add(Dense(output_dim=256, activation='relu'))
    classifier.add(Dense(output_dim=5, activation='softmax'))
    print(classifier.summary())
    classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    hist = classifier.fit(X_train, Y_train, batch_size=16, epochs=10, shuffle=True, verbose=2)
    classifier.save_weights('model1/model_weights.h5')
    model_json = classifier.to_json()
    with open("model1/model.json", "w") as json_file:
        json_file.write(model_json)
    f = open('model1/history.pckl', 'wb')
    pickle.dump(hist.history, f)
    f.close()
    f = open('model1/history.pckl', 'rb')
    data = pickle.load(f)
    f.close()
    acc = data['accuracy']
    accuracy = acc[9] * 100
    text.insert(END, "CNN Hand Gesture Training Model Prediction Accuracy = " + str(accuracy))
aWeight = 0.5
camera = cv2.VideoCapture(0)
top, right, bottom, left = 10, 350, 325, 690
num_frames = 0
oldresult = ""
while True:
    (grabbed, frame) = camera.read()
    frame = imutils.resize(frame, width=700)
    frame = cv2.flip(frame, 1)
    clone = frame.copy()
    (height, width) = frame.shape[:2]
    roi = frame[top:bottom, right:left]
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (41, 41), 0)
    if num_frames < 30:
        run_avg(gray, aWeight)   # background-model helper defined elsewhere in the project
    else:
        hand = segment(gray)     # hand-segmentation helper defined elsewhere in the project
        if hand is not None:
            (thresholded, segmented) = hand
            cv2.drawContours(clone, [segmented + (right, top)], -1, (0, 0, 255))
            roi = frame[top:bottom, right:left]
            roi = fgbg2.apply(roi)
            cv2.imwrite("test.jpg", roi)
            img = cv2.imread("test.jpg")
            img = cv2.resize(img, (64, 64))
            img = img.reshape(1, 64, 64, 3)
            img = np.array(img, dtype='float32')
            img /= 255
            predict = classifier.predict(img)
            value = np.amax(predict)
            result = names[np.argmax(predict)]
            if value >= 0.99:
                print(str(value) + " " + str(result))
                cv2.putText(clone, 'Gesture Recognize as : ' + str(result), (10, 25),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 255), 2)
                if oldresult != result:
                    play(playcount, result)
                    oldresult = result
                    playcount = playcount + 1
            cv2.imshow("video frame", roi)
    cv2.rectangle(clone, (left, top), (right, bottom), (0, 255, 0), 2)
    num_frames += 1
    cv2.imshow("Video Feed", clone)
    keypress = cv2.waitKey(1) & 0xFF
    if keypress == ord("q"):
        break
camera.release()
cv2.destroyAllWindows()
deleteDirectory()
main.config(bg='magenta3')
main.mainloop()
CONCLUSION
We developed a CNN model for sign language recognition. The model extracts
spatial features from segmented hand-gesture frames through stacked convolution
and subsampling (max-pooling) layers, and the recognized gesture is presented as
text and converted to voice.