SIGN LANGUAGE RECOGNITION

Project report submitted in partial fulfillment of the requirement for the degree of

BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING

By

ARCHITA GUPTA (171021)

SHRUTI SHARMA (171051)

UNDER THE GUIDANCE OF

MR. ANUJ KUMAR MAURYA

JAYPEE UNIVERSITY OF INFORMATION TECHNOLOGY, WAKNAGHAT


SOLAN, DECEMBER 2020

TABLE OF CONTENTS

CAPTION PAGE NO.

DECLARATION i
ACKNOWLEDGEMENT ii
LIST OF ACRONYMS AND ABBREVIATIONS iii
LIST OF FIGURES iv
ABSTRACT v
CHAPTER-1: INTRODUCTION 9
1.1 Literature Review 11
1.2 Motivation 12

CHAPTER-2: SYSTEM DESCRIPTION 13


2.1 Machine learning 13
2.1.1 Types of machine learning 13
2.1.2 Model Building Cycle 14
2.2 Neural Networks 17
2.2.1 Working 17
2.2.2 Convolutional Neural Network 19

CHAPTER-3: PROPOSED SYSTEM 24


3.1 Basic steps of sign language detection 24
3.2 Common classifiers used 27

CHAPTER-4: METHODOLOGY 29
4.1 Prerequisites 29
4.2 Steps Followed 30

CHAPTER-5: EXPERIMENTAL RESULTS 31


5.1 Data collection 31
5.2 Preprocessing images 35

CHAPTER-6: CONCLUSION AND FUTURE SCOPE 38


REFERENCES 39
PLAGIARISM REPORT 40
DECLARATION

We hereby declare that the work reported in the B.Tech Project Report entitled "SIGN LANGUAGE RECOGNITION", submitted at Jaypee University of Information Technology, Waknaghat, India, is an authentic record of our work carried out under the supervision of Mr. Anuj Kumar Maurya. We have not submitted this work elsewhere for any other degree or diploma.

-------------------------- -------------------------
ARCHITA GUPTA SHRUTI SHARMA
171021 171051

This is to certify that the above statement made by the candidates is correct to the best of my knowledge.

-------------------------
MR. ANUJ MAURYA

Date: 5 DEC 2020

Head of the Department/Project Coordinator

ACKNOWLEDGEMENT

The successful completion of this project required guidance and assistance from many individuals, and we are extremely privileged to have received this support throughout the project.

We take this opportunity to express our gratitude to our supervisor, Prof. Anuj Maurya, for his insightful advice, motivating suggestions, invaluable guidance and support in the successful completion of this project, and for his constant encouragement throughout.

We also gratefully acknowledge the in-house facilities provided by the department throughout the project. We would like to convey our thanks to the teaching and non-teaching staff of the Electronics and Communication Engineering Department for their invaluable help and support.

LIST OF ACRONYMS AND ABBREVIATIONS

CNN Convolutional Neural Network

RNN Recurrent Neural Network

SVM Support Vector Machine

ASL American Sign Language

ISL Indian Sign Language

kNN k-Nearest Neighbor

BOW Bag Of Words

PCA Principal Component Analysis

RF Random Forest

GB Gradient Boosting

HMM Hidden Markov Models


EMG Electromyography

LIST OF FIGURES

Fig.2.1 Types of machine learning

Fig.2.1.2a) Model building cycle

Fig.2.1.2b) Types of datasets

Fig.2.2.1a) Artificial Neural Network Model

Fig.2.2.1b) Working of a Neural Network Model

Fig.2.2.2a) Convolutional Neural Network

Fig.2.2.2b) Structure of CNN

Fig.2.2.2c) Layers in CNN

Fig.2.2.2d) Types of pooling

Fig.2.2.2e) Complete structure of CNN

Fig.3 Block diagram of sign language recognition

Fig.3.1a) Hand gestures

Fig.3.1b) Data preprocessing

ABSTRACT

Sign language is a natural language used for communication by people who are hearing or speech impaired. It uses hand gestures instead of sound to convey meaning.

In India, more than 2 million people are mute. They find it very difficult to interact with normal individuals, because sign language cannot be understood by those who have not learned it. A need therefore exists for sign language translators who can translate sign language into spoken language and vice versa.

The use of such translators, however, is minimal: they are expensive and cannot accompany a deaf person throughout life. This motivates the development of a sign language recognition device that can convert signs into text or speech automatically.

CHAPTER 1
INTRODUCTION

Communication is important in building a nation. Effective communication contributes to greater comprehension, as it involves all members of a group, including the deaf. In the Philippines, 1.23 percent of the population is mute, deaf or suffering from hearing impairment. Sign language closes this connectivity divide. Many hearing people, however, cannot interpret sign language, and it takes a lot of time to master. As a result, it becomes difficult to establish communication between a normal person and one suffering from hearing impairment.

Sign language is a form of communication used by persons with hearing and speech disabilities. People with such disabilities use sign language expressions to communicate their ideas and desires. But normal people find it incredibly hard to interpret, so during medical checkups and legal visits a person with a good understanding of sign language is required to convey their message. There has been a growing need for such interpretation over the past five years.[1]

Based on its input, the SLR structure can be divided into two key classes: data-glove-based and vision-based. Glove-based systems use microcontrollers and various sensors, such as flex sensors and accelerometers. Chouhan et al. use smart gloves to obtain inputs such as hand position, joint alignment and velocity. Using motion sensors, there are other ways to record signs, such as RGB cameras, electromyography (EMG) sensors, leap motion controllers and Kinect sensors. Higher precision is the advantage of this strategy; the downside is that it restricts movement. Vision-based approaches have become more common in recent years, taking feedback from a camera (stereo camera, web camera or 3D camera). To make hand identification easier, Sandjaja and Marcos used color-coded gloves. A fusion of the two architectures, called a hybrid architecture, is also possible. Vision-based approaches are also more cost-effective.
Usually, the structure of these devices is broken into two major parts. The first part is feature extraction, which uses image recognition or computer vision approaches to extract the desired features from a recording. The second component, the recognizer, learns recurring patterns from the extracted features in the training data and correctly identifies the test data, using machine learning algorithms. Many of the above-mentioned experiments rely on interpreting the signs produced by the signer, usually converting them into words that can be heard by the observer or interpreter. Although such experiments have demonstrated that the technology is beneficial in many respects, critics claim that some hearing-impaired people find it invasive. Instead, supporters recommend a method that would assist all learners who wish to master simple static sign language while not being invasive. It is also worth noting that mobile applications now allow non-signers to learn the basics of sign language through videos on YouTube and other platforms. Such applications and programs, however, demand a significant amount of storage and a decent internet connection.
The goal of the proposed study is to create a framework that enables us to identify and translate static sign gestures into corresponding words. To collect data from the signer, a vision-based solution using a 3D or web camera is applied, which can be used offline. The framework is intended to act as a learning aid for people who want to understand sign language basics, such as numbers, alphabets and static traditional signs. The proponents used a white backdrop and a particular location for hand image recognition, thereby enhancing the system's accuracy, and used a Convolutional Neural Network (CNN) as the system's recognizer. Basic numbers, ASL alphabets (A-Z) and static signs are included in the scope of the analysis. The ability of the machine to produce words by fingerspelling, without the use of external technology or sensors, is one of the key features of this research.

1.1 LITERATURE REVIEW

Mandeep Kaur Ahuja, Amardeep Singh, "Hand Gesture Recognition Using PCA": In this article, the authors introduce a scheme for data-driven hand gesture detection based on a skin color model and a thresholding approach, along with efficient template matching, which can be used for robotics and other related applications. The hand region is first segmented by applying the skin color model in the YCbCr color space. Thresholding is then applied to separate the foreground and background. Finally, for identification, template matching is performed using Principal Component Analysis (PCA).[2]

Chandandeep Kaur, and Nivit Gill, "An Automated System for Indian Sign Language Recognition": This paper surveys the different methods of hand gesture and sign language identification suggested by various researchers in the past. Sign language is the only hope of communication for hearing impaired and mute persons, who convey their feelings and ideas to other individuals with its aid.

Neelam K. Gilorkar, Manisha M. Ingle, "Real Time Detection And Recognition Of Indian And American Sign Language Using Sift": This paper discusses recent study and development of sign language recognition focused on body language and manual communication. A sign language recognition system usually consists of three steps: preprocessing, feature extraction and classification. Support Vector Machine (SVM), Neural Networks (NN), Scale Invariant Feature Transform (SIFT), etc. are classification methods used for recognition.

Neelam K. Gilorkar, and Manisha M. Ingle, "A Review on Feature Extraction for Indian and American Sign Language": The authors introduced an application that enables a deaf and mute person to use sign language, providing a platform to share their ideas with the rest of humanity. Real-time gesture-to-text translation is the main feature of this scheme. The processing steps include extraction of gestures, matching of gestures and translation to voice. Extraction of gestures requires different image processing techniques, such as bounding box computation, histogram matching, skin color segmentation and region growing. Gesture matching techniques include matching of feature points and correlation-based matching. Other functions of the program include translation of input and text-to-gesture conversion.[4]

1.2 Motivation

By nature, mankind has been blessed with voices that enable us to communicate and understand each other. Spoken language is therefore one of the core features of humans. Unfortunately, owing to the loss of the sense of hearing, not everyone has this capability. There are about 5 to 15 million deaf people in India. Sign language is the fundamental alternative form of communication for hearing impaired persons, and to make this communication practical, many dictionaries of words and single letters have been defined. It is difficult for most persons who do not know sign language to communicate without an interpreter. Therefore, for real-time communication, a device that transcribes sign language into plain text or audio would help. It could also provide people with immersive training to learn sign language.

CHAPTER 2
SYSTEM DESCRIPTION

2.1 MACHINE LEARNING


Machine learning is an artificial intelligence technique that gives devices the ability to learn and improve from experience automatically, without being explicitly programmed. It relies on computer algorithms that can access data and use it to learn for themselves. The machine learning model is provided with insights and data, such as examples and experiments. The primary aim is to avoid human intervention and guidance, and to enable computers to discover patterns automatically and change behavior accordingly.
The machine learning life cycle is a cyclical process involving three phases (pipeline development, the training phase and the inference phase), carried out by data scientists and data engineers to create, train and serve models using the enormous amounts of data involved in different applications. To derive functional business value, an enterprise should take advantage of artificial intelligence and deep learning algorithms in this way.

2.1.1 TYPES OF MACHINE LEARNING

Fig. 2.1 Types of machine learning [12]

SUPERVISED LEARNING:

It requires a target or outcome variable (dependent variable) to be predicted from a given set of predictors (independent variables). Using these sets of variables, we can produce a function that maps inputs to desired outputs.
Supervised learning problems can be further classified into two classes on the basis of the form of the target variable:
Regression: when the target variable is continuous in nature.

Classification: when the target variable is discrete in nature.


Regression and classification problems have different algorithms.

UNSUPERVISED LEARNING:

We do not have an outcome or target variable to predict in this type of learning. It is mostly used for clustering populations into different groups, for example customer or industry segmentation.
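
As a small illustration of the two settings (a sketch only, using the scikit-learn library, which is not part of this project's toolchain), a classifier is fit on labelled data, while a clustering algorithm groups unlabelled data:

```python
# Minimal illustration of supervised vs unsupervised learning.
# Assumes scikit-learn is installed; the data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))             # 100 samples, 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # labels, known only in the supervised case

# Supervised: learn a mapping from inputs X to known targets y.
clf = LogisticRegression().fit(X, y)
print("predicted class:", clf.predict([[0.5, 0.5]]))

# Unsupervised: no targets; group the samples into clusters.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print("cluster assignment:", km.predict([[0.5, 0.5]]))
```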

2.1.2 MODEL BUILDING CYCLE

Fig.2.1.2a) Model building cycle


Any machine learning model build can be divided broadly into the following steps:

1) PROBLEM DEFINITION: identifying and describing the problem in a comprehensive, detailed manner. We define the purpose of the problem and the target variable for prediction.

2) HYPOTHESIS GENERATION: the process of listing the parameters that are expected to have a significant association with the prediction target.

3) DATA COLLECTION: extracting the data surrounding the analytical problem from relevant sources, after the hypothesis has been generated.

4) DATA EXPLORATION AND TRANSFORMATION: processing and analysing the data and converting it into the required form. It assists in the detection of outliers and missing values.

In data exploration, there are multiple sub-steps involved:

a. Data reading: we read the usable raw data into the software/analysis environment.

b. Identification of variables: the process of labelling the variables as dependent or independent, and continuous or discrete.

c. Univariate analysis: we examine one variable at a time, summarizing and classifying it.

d. Bivariate analysis: the statistical relationship between two variables is analysed.

e. Missing value treatment: recognizing missing values and handling them.

f. Outlier treatment: detecting and correcting anomalies.

g. Variable transformation: a method by which we replace a variable with some function of that variable.

5) MODEL BUILDING: constructing a statistical model to forecast future outcomes based on evidence from the past. First, a base or benchmark model is built.

To create the dataset for the predictive model, we divide the dataset into two groups:

TRAINING DATA and TESTING DATA

Fig 2.1.2 b) Types of datasets [12]

TRAINING DATA: The observations in the training set form the experience that the algorithm uses to learn. In supervised learning problems, each observation consists of an observed output component and one or more observed input variables. This is the portion of the data we use to train our model: it is the knowledge the model currently has, learned from seeing both inputs and outputs.

TESTING DATA: The test set is a set of observations used to assess the performance of the model using some performance metric. It is important that no observations from the training set are included in the test set. If the test set does contain samples from the training set, it is hard to determine whether the algorithm has learned to generalize or has merely memorized the training data. Once our model is fully trained, the test set provides an unbiased evaluation: the model predicts values for the test inputs (without seeing the real values), and we assess the predictions by comparing them with the actual outputs present in the test data. This is how we measure how much our model has learned from the experience fed in as training data.
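
As a small illustration of this split (a sketch assuming the scikit-learn library, which this report does not otherwise use):

```python
# Hypothetical example: splitting a dataset into training and testing sets.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)  # 100 samples, 2 features
y = np.arange(100) % 2              # binary labels

# Hold out 20% of the samples as the test set; the model never sees them
# during training, so the test score is an unbiased estimate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(len(X_train), "training samples,", len(X_test), "test samples")
```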
2.2 NEURAL NETWORKS

In basic terms, a neural network is a collection of algorithms that is remarkably effective at identifying correlations or associations in a data set, with a structure that tries to mimic the operation of the human brain.
As people, in our daily lives, we have a remarkable capacity to recognize patterns. Think of any time a puzzle is solved, or when you recall a song automatically after a few seconds of hearing it, or when you look somewhere and immediately recognize the object you are looking at, or even when you speak. We perform these amazing tasks without ever having to think about them. This is due to our powerful brain, which gives us the ability to recognize patterns and similarities, and it has become the entire inspiration behind Deep Learning: the belief that by attempting to reproduce, and even enhance, what humans can already do, we can build even more powerful machines.
In today's world, neural networks have countless uses. The possibilities are nearly infinite, from addressing business challenges such as revenue analysis, consumer study, data validation and risk control, to image and voice recognition in the medical field, to self-driving vehicles.[6]

2.2.1 WORKING

Fig. 2.2.1 a) Artificial Neural network model [12]

An ANN is a model that, using a very complex mathematical function, solves a very complex problem. Given a bunch of data defining it (the input layer), we give it a problem, and it finds the optimal solution (the output layer, what you want to predict) by computing a complex equation.

The lines connecting the nodes/neurons (portrayed as circles) symbolize the connections between neurons, and allow the model to become more accurate over time (by updating the weights of the connections).

Input layer: what the system learns from. Ex: a customer's banking behaviour.

Hidden layer: where the magic takes place.

Output layer: what the system will predict. Ex: whether or not the customer will leave in the next 6 months.

Node/Neuron: an object that holds a number, depicted in the picture by a circle.

Gradient descent: the algorithm that lets the model gain more and more accurate knowledge as training progresses, by changing the weights of the connections.

Weights: the values the model changes in order to become more accurate after each iteration. They are represented by the connections formed between the neurons; each connection has its own weight.

Fig.2.2.1b) Working of a Neural Network[12]

Import the training set, which acts as the input layer. Propagate the data from the first (input) layer to the last (output) layer via the hidden layers, where a predicted value y is produced. Forward propagation is the mechanism by which each input node is multiplied by a random weight and an activation function is applied. Measure the error between the predicted value and the actual value. To adjust the weights of the connections, backpropagate the error using gradient descent. Repeat these steps until the optimal weights are found and the error is appropriately reduced.[6]
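
The following toy sketch mirrors these steps for a single sigmoid neuron using NumPy only; it is an illustration of forward propagation and gradient descent, not the network used in this project:

```python
# Toy sketch: forward propagation, error measurement, and gradient-descent
# weight updates for one sigmoid neuron, on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))            # input layer: 50 samples, 3 features
y = (X.sum(axis=1) > 0).astype(float)   # actual target values

w = rng.normal(size=3)                  # random initial weights
b = 0.0
lr = 0.5                                # learning rate for gradient descent

for epoch in range(100):
    z = X @ w + b                       # forward propagation: weighted sum
    y_hat = 1 / (1 + np.exp(-z))        # sigmoid activation function
    error = y_hat - y                   # error between prediction and truth
    # Backpropagation: gradient of the mean squared error w.r.t. w and b.
    grad_w = X.T @ (error * y_hat * (1 - y_hat)) / len(y)
    grad_b = np.mean(error * y_hat * (1 - y_hat))
    w -= lr * grad_w                    # update weights along the gradient
    b -= lr * grad_b

print("final mean error:", np.abs(error).mean())
```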

2.2.2 CONVOLUTIONAL NEURAL NETWORKS

Fig. 2.2.2 a) Convolutional Neural Network [7]

Image classification is the process of taking an image as input and producing a class, or a likelihood of classes, that best describes the image. In a CNN, we take an image as input, assign significance to the different features in the image, and learn to differentiate them from one another. In contrast to other classification algorithms, the preprocessing needed for a CNN is much smaller.

The neurons in the layers of a CNN are organized in three dimensions, unlike standard neural networks: width, height and depth. The neurons in a layer are connected only to a specific region (window) of the layer before it, instead of to all the neurons in a fully-connected way. Moreover, the final output layer has dimensions equal to the number of classes, since by the end of the CNN architecture we reduce the entire picture into a single vector of class scores.[7]

Fig. 2.2.2 b) CNN structure [12]

A CNN structure typically consists of three layers: a convolutional layer, pooling layer, and fully
connected layer.

1. Convolution Layer: The primary purpose of convolution is to extract features from the data, such as edges, shades and corners. As we go deeper into the network, it learns to recognize more specific characteristics such as shapes, digits, and even facial parts.

In the convolution layer we take a small window (typically of size 5*5) that extends through the depth of the input matrix. The layer is composed of learnable filters of this window size. At each iteration we slide the window by the stride size and compute the dot product of the filter entries and the input values at the given location. As this process continues, a 2-dimensional activation matrix is built up that gives the response of the filter at every spatial location. That is, the network learns filters that activate when they see some type of visual feature, such as an edge of some orientation or a blotch of some colour.

At the end of the convolution phase we have a feature matrix that has fewer parameters (dimensions) than the actual image, as well as simpler features, and we work with this feature matrix from then on.[7]

Fig. 2.2.2 c) Layers in CNN [7]

2. Pooling Layer: To decrease the size of the activation matrix and eventually decrease the learnable parameters, we use the pooling layer. This layer is intended primarily to reduce the computing resources needed for data processing, which it achieves by further reducing the dimensions of the feature matrix. In this layer we extract the dominant features from small neighborhood regions.
Two ways of pooling exist:

a) Max Pooling: We take a window (for example, of size 2*2) and keep only the maximum of its 4 values. We slide this window across the matrix and repeat the process, finally obtaining an activation matrix half its original size.

b) Average Pooling: The average of all the values in the window is taken.

After the pooling layer we have a matrix containing the key features of the input image; it has much smaller size and dimensions, which is beneficial in the subsequent steps.

Fig. 2.2.2 d) Types of pooling [12]

3. Fully Connected Layer: Neurons are connected only to a local region in the convolution layer, while in a fully connected layer all the inputs are connected to every neuron.

4. Final Output Layer: The outputs of the fully connected layer are connected to the final layer of neurons, which estimates the probability of the image belonging to each class.[7]

Fig. 2.2.2 e) Complete structure of CNN[7]

1. Provide the input image to the convolution layer.

2. Perform convolution with kernels/filters to extract features.

3. Apply a pooling layer to reduce the dimensions.

4. Repeat these layers several times.

5. Flatten the obtained output and feed it to a fully connected layer.

6. The model is then trained using logistic regression and backpropagation.
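
A minimal Keras sketch following these steps is shown below; the layer sizes and filter counts are illustrative assumptions, not the exact architecture trained in this project (the 50 x 50 grayscale input and 26 output classes match the dataset described later):

```python
# Minimal CNN sketch following the steps above; sizes are assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(50, 50, 1)),                 # 50x50 grayscale input
    layers.Conv2D(32, (5, 5), activation='relu'),    # convolution with filters
    layers.MaxPooling2D((2, 2)),                     # pooling reduces dimensions
    layers.Conv2D(64, (5, 5), activation='relu'),    # repeat conv + pool
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                                # flatten to a vector
    layers.Dense(128, activation='relu'),            # fully connected layer
    layers.Dense(26, activation='softmax'),          # one score per class (A-Z)
])
# Trained with backpropagation; softmax plus cross-entropy plays the role
# of the logistic-regression output stage mentioned above.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```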

CHAPTER 3

PROPOSED SYSTEM

FIG. 3 BLOCK DIAGRAM OF SIGN LANGUAGE DETECTION

The Sign Language Recognition (SLR) device takes an input expression from the hearing-impaired person and delivers text or speech output to the hearing person.
With the support of sign language, our project aims to take a basic step toward bridging the social and communication gap between ordinary citizens and people with disabilities.

3.1 BASIC STEPS IN SIGN LANGUAGE DETECTION

1) Data collection or acquisition


2) Preprocessing of data
3) Extraction of features
4) Gesture classification

DATA ACQUISITION:
Information about the motion of the hand can be collected in the following ways:
1. Use of sensory instruments: electromechanical devices are used to obtain the precise hand configuration and position. Different glove-based methods can be used to extract the information, but they are expensive and not user friendly.
2. Vision-based approach: the computer camera is the input medium for observing information from the hands or fingertips. Vision-based approaches need only a camera, thus enabling natural contact between humans and computers without any external hardware. These systems aim to supplement biological vision with artificial vision implemented in software and/or hardware. The main difficulty of vision-based hand detection is dealing with the considerable variety in the appearance of the human hand, due to the wide range of hand gestures, the variation in skin tone, and differences in viewpoint and camera speed.

Fig. 3.1a) Hand gestures

DATA PREPROCESSING

Because images are not captured in a controlled environment, they have varying resolutions and sizes, so image preprocessing is necessary. It is a technique for digitizing images and extracting from them the valuable information called the region of interest (ROI). Three steps are used in this stage: image segmentation (skin masking), skin identification and edge detection. By converting the image to the HSV color space, a skin mask is created from the raw image.
The skin can then be segmented using the skin mask. Finally, to identify and understand the presence of sharp discontinuities in the image, the Canny edge procedure is used to detect the edges.

Fig. 3.1 b) Data preprocessing
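
A rough OpenCV sketch of these preprocessing stages follows; the file name and HSV threshold values are assumptions for illustration:

```python
# Sketch of the preprocessing stages described above, using OpenCV.
import cv2
import numpy as np

img = cv2.imread('hand.jpg')                      # raw captured image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)        # convert to HSV color space

lower_skin = np.array([0, 20, 70], dtype=np.uint8)    # assumed skin range
upper_skin = np.array([20, 255, 255], dtype=np.uint8)
mask = cv2.inRange(hsv, lower_skin, upper_skin)   # skin mask

roi = cv2.bitwise_and(img, img, mask=mask)        # segment the skin (ROI)
edges = cv2.Canny(mask, 100, 200)                 # Canny edge detection
cv2.imwrite('edges.jpg', edges)
```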

FEATURE EXTRACTION

Feature extraction is one of the most critical steps in sign language understanding, since it provides the feature vector that is used as input by the classifier. Feature extraction methods used to locate objects and shapes must be accurate and robust, regardless of the orientation, degree of lighting, location and size of the object in the image.
The features can be collected using various methods such as texture properties, orientation histograms, etc. In some situations, Principal Component Analysis (PCA) is used to reduce dimensionality and obtain the feature vector from the ROI.

CLASSIFICATION

Once the dataset is created, classification is the next step. Before classification, the data must be split for training and testing. Once the data is ready, the training data is fed to the machine learning model. During the testing process, the trained classes corresponding to the signs are identified and the output is given in text or audio format.[11]

3.2 COMMON CLASSIFIERS USED

Artificial Neural Network (ANN)

An artificial neural network (ANN) contains artificial neurons whose dynamic behaviour is determined by the connections between elements and their parameters. An ANN is used to infer a function from given inputs and observations. The Kohonen Self-Organizing Map is one of the central unsupervised learning networks; it has been used to describe the alphabet gestures of sign languages.
The two most used supervised learning networks are the Feed Forward Back Propagation Network (BPN) and the Radial Basis Function Neural Network (RBFNN). RBFNN is used in sign language recognition for static gesture recognition.[11]

K-Nearest Neighbor (KNN)

The k-nearest neighbor (KNN) classifier is a supervised learning algorithm that classifies objects based on the feature space. The nearest-neighbor algorithm is the most common classification approach.
An object is categorized into the class that is most prevalent among its K closest neighbors. K-nearest neighbors is an algorithm that memorizes all the cases and, upon receiving a new case, predicts on the basis of similarity.

Support Vector Machine (SVM)

SVM is another supervised learning method for pattern recognition. A simple SVM takes input data and predicts which of two possible groups produced it. Support vector machines are based on decision hyperplanes that determine the decision boundaries. A decision plane separates two sets of objects with different class memberships. The SVM aims to optimize the decision boundary between the hyperplanes. After classification, we need to verify the accuracy of our results.[11]
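
For illustration only (a sketch using scikit-learn, which is not the classifier stack used in this project), KNN and SVM classifiers can be applied to extracted feature vectors like this:

```python
# Hypothetical sketch: training KNN and SVM classifiers on feature vectors
# extracted from sign images; the data here is synthetic.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(2)
features = rng.normal(size=(300, 64))    # 300 signs, 64-dim feature vectors
labels = rng.integers(0, 26, size=300)   # class label per sign (A-Z)

knn = KNeighborsClassifier(n_neighbors=5).fit(features, labels)
svm = SVC(kernel='rbf').fit(features, labels)

sample = features[:1]
print("KNN:", knn.predict(sample), "SVM:", svm.predict(sample))
```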
TESTING

To validate the accuracy of recognition of number or letter gestures, the number of correctly identified numbers or letters produced by the device is divided by the product of the total number of users and the number of trials:

accuracy = correctly identified gestures / (number of users × number of trials)

If the device takes longer than 15 seconds to produce the identified letter/number, that result is not counted in the cumulative number of valid letters/numbers recognized.
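
This metric reduces to a one-line computation; the numbers in the example below are purely illustrative:

```python
def recognition_accuracy(correct: int, users: int, trials: int) -> float:
    # Accuracy = correctly identified gestures / (users * trials).
    return correct / (users * trials)

# e.g. 270 correct recognitions from 20 users doing 15 trials each
print(recognition_accuracy(correct=270, users=20, trials=15))  # 0.9
```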

CHAPTER 4
METHODOLOGY

The aim of this project is to recognize symbolic gestures from photographs, in order to reduce the communication gap between a normal and a hearing-impaired person. The steps are:
(a) To gather a dataset.
(b) To segment the skin portion of the photograph, as the remaining part can be treated as noise with respect to the character classification problem.
(c) To extract from the skin-segmented images specific features that may prove useful for the next stage, i.e. learning and classification.
(d) To use the derived features as input to separate supervised learning models and then, eventually, use the trained classification models for recognition.

4.1 PREREQUISITES

TensorFlow: TensorFlow is an open-source software library for numerical computation. We first define the nodes of the computational graph, then the actual computation takes place within a session. TensorFlow is used extensively in machine learning.

Keras: Keras is a high-level neural network library written in Python that functions as a wrapper for TensorFlow. It is used when we want to construct and validate a neural network easily with minimal lines of code. It provides implementations of widely used neural network elements such as layers, objectives, activation functions and optimizers, along with methods that make it easy to work with image and text data.

OpenCV: OpenCV (Open Source Computer Vision) is an open-source library that provides real-time computer vision features. It is primarily used for image processing, video capture and analysis, for features such as face and object detection. It is written in C++, which is its main interface, but bindings for Python exist.

Jupyter Notebook: The Jupyter Notebook is an open-source web application that enables you to create and share documents containing live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modelling, data analysis, machine learning, and much more.[9]

A laptop with a 1080p Full-HD web camera is used to deploy the system. The camera takes pictures of the hands, which are fed into the device. Note that the signer must be calibrated to the frame scale so that the device can capture the signer's hand position. When a user's gesture has been detected by the camera, the machine classifies the test sample, matches it against the gestures stored in the dictionary, and the resulting output is shown to the user on the screen.[8]

4.2 STEPS FOLLOWED

A. Data collection

The datasets for static SLR were gathered using Python by continuously capturing images. The images were immediately cropped and transformed into black-and-white samples of 50 x 50 pixels. Every class contained 1,200 images, which were then flipped horizontally to account for left-handed signers.[8]
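
A sketch of such a capture loop is given below; the key parameters (crop region, output paths, class letter) are assumptions for illustration, while the 50 x 50 size, the 1,200 images per class and the horizontal flip follow the description above:

```python
# Sketch of the dataset capture loop described above; crop region, key
# bindings and paths are assumptions, not the project's exact script.
import cv2

cap = cv2.VideoCapture(0)
count = 0
while count < 1200:                      # 1,200 images per class
    ret, frame = cap.read()
    if not ret:
        break
    roi = frame[100:400, 100:400]        # crop the hand region (assumed)
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (50, 50))   # 50 x 50 black-and-white sample
    flipped = cv2.flip(small, 1)         # horizontal flip for left-handers
    cv2.imwrite(f'data/train/A/{count}.jpg', small)
    cv2.imwrite(f'data/train/A/{count}_flip.jpg', flipped)
    count += 1
cap.release()
```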

B. Hand Skin Color Detection using Image Processing

For better skin color identification, the signer was recommended to use a plain backdrop behind the hands, making it easier for the machine to identify skin colors. Skin detection was done using cv2.cvtColor: photos were converted from RGB to HSV. The HSV frame was passed to the cv2.inRange function, with the lower and upper skin ranges as arguments. The output of the cv2.inRange function was the mask: regions of the frame rated as skin-like appear as white pixels in the generated mask, while black pixels are ignored. The functions cv2.erode and cv2.dilate delete small regions that may indicate false-positive skin; using a kernel, two erosion and dilation iterations were performed. Finally, a Gaussian blur was applied to smooth out the resulting mask.[8]
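
Putting the named OpenCV calls together, a reconstruction of this pipeline might look as follows; the HSV bounds and kernel size are assumptions, since the report does not state the exact values:

```python
# Reconstruction of the skin-detection pipeline described above, using the
# OpenCV functions the report names; HSV bounds and kernel are assumed.
import cv2
import numpy as np

frame = cv2.imread('frame.jpg')
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)          # RGB -> HSV

lower = np.array([0, 20, 70], dtype=np.uint8)         # assumed lower range
upper = np.array([20, 255, 255], dtype=np.uint8)      # assumed upper range
mask = cv2.inRange(hsv, lower, upper)                 # skin mask

kernel = np.ones((3, 3), np.uint8)
mask = cv2.erode(mask, kernel, iterations=2)          # remove small
mask = cv2.dilate(mask, kernel, iterations=2)         # false-positive areas
mask = cv2.GaussianBlur(mask, (5, 5), 0)              # smooth the mask
cv2.imwrite('mask.jpg', mask)
```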

CHAPTER 5
EXPERIMENTAL RESULTS

5.1 DATA COLLECTION:

A Data folder containing two folders, test and train, is created. The test and train folders contain datasets of all the alphabets from A to Z.

The datasets were gathered using Python by continuously capturing images.
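
For reference, the resulting folder layout is assumed to look like this (letter folders abbreviated):

```
data/
  train/
    A/  B/  ...  Z/    (captured images per letter)
  test/
    A/  B/  ...  Z/
```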

Letter H dataset

Letter L dataset

Letter O dataset

Letter W dataset

Letter Z dataset

5.2 PREPROCESSING IMAGES

A Data 2 folder containing two folders, test and train, is created. The test and train folders contain the preprocessed datasets of all the alphabets from A to Z.

Letter H dataset

Letter L dataset

Letter O dataset

Letter W dataset

Letter Z dataset

CHAPTER 6
CONCLUSION AND FUTURE SCOPE

Our project aims to simplify communication between deaf and mute individuals and others by introducing computers into the communication route, so that sign language can be captured, understood, converted to text and displayed on an LCD automatically. There are different approaches to sign language translation. Some use wired electronic gloves, while others use a vision-based strategy. Electronic gloves are expensive, and one person cannot use another person's gloves. In the vision-based approach, different strategies are used to distinguish the captured movements and match them with gestures in the database. Converting the RGB image to binary and comparing it with the database using a matching algorithm is a simple, powerful and robust technique, sufficiently precise to translate sign language into text.
FUTURE SCOPE
For future work, the proposed framework can be developed and deployed using a Raspberry Pi. The image processing portion should be expanded so that the device can interact in both directions, i.e. translate natural language into sign language and vice versa. We will also attempt to recognize signs that require motion. In addition, we will concentrate on translating sequences of gestures into language, i.e. words and phrases, and then converting them into voice that can be understood.

REFERENCES

[1] https://digitalcommons.kennesaw.edu/cs_etd
[2] V. Athitsos, C. Neidle, S. Sclaroff and J. Nash, "The American Sign Language Lexicon Video Dataset," 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008.
[4] J. Huang, W. Zhou and Q. Zhang, "Video-based Sign Language Recognition without Temporal Segmentation," arXiv, 2018.
[5] T.-W. Chong and B.-G. Lee, "American Sign Language Recognition Using Leap Motion Controller with Machine Learning Approach," Sensors, vol. 18, 2018.
[6] https://medium.com/@gongster/how-does-a-neural-network-work-intuitively-in-code-f51f7b2c1e3f
[7] https://medium.com/nybles/a-brief-guide-to-convolutional-neural-network-cnn-642f47e88ed4
[8] www.researchgate.net/publication/337285019
[9] en.wikipedia.org
[10] en.m.wikipedia.org/wiki/Machine_learning
[11] www.ijircce.com
[12] www.google.com

PLAGIARISM REPORT

PLAGIARISM: 8%
Shruti_08122020
by Shruti Rp

Submission date: 08-Dec-2020 09:29AM (UTC+0530)


Submission ID: 1468266706
File name: Final_Year_Project_Report_1.pdf (3.02M)
Word count: 5813
Character count: 30681
Shruti_08122020
ORIGINALITY REPORT

SIMILARITY INDEX: 8%
INTERNET SOURCES: 5%
PUBLICATIONS: 2%
STUDENT PAPERS: 6%

PRIMARY SOURCES

1. Submitted to Jaypee University of Information Technology (Student Paper): 2%
2. www.studymode.com (Internet Source): 1%
3. Submitted to Issaquah High School (Student Paper): 1%
4. ijtrs.com (Internet Source): 1%
5. Submitted to Multimedia University (Student Paper): 1%
6. "Advances in Decision Sciences, Image Processing, Security and Computer Vision", Springer Science and Business Media LLC, 2020 (Publication): 1%
7. www.slideshare.net (Internet Source): 1%
8. www.ijcse.com (Internet Source): <1%
9. "Big Data Analytics", Springer Science and Business Media LLC, 2019 (Publication): <1%
10. Submitted to University College London (Student Paper): <1%
11. Submitted to Visvesvaraya Technological University, Belagavi (Student Paper): <1%
12. Submitted to Higher Education Commission Pakistan (Student Paper): <1%
13. bmcmedimaging.biomedcentral.com (Internet Source): <1%
14. diposit.ub.edu (Internet Source): <1%
15. Submitted to University of Surrey (Student Paper): <1%
16. Submitted to The University of Manchester (Student Paper): <1%
17. H. Horii, "Estimate modelling for assessing the safety performance of occupant restraint systems", WIT Press, 2013 (Publication): <1%

Exclude quotes: On
Exclude bibliography: On
Exclude matches: < 10 words
