Abstract. In recent years, the number of people who are deaf and mute has increased rapidly because of birth defects and other causes. Since a deaf and mute person cannot talk with an ordinary person, they must rely on some kind of communication system. A gesture is a physical movement of the body that conveys a message, and gesture recognition is the mathematical interpretation of a person's motion by an information processing system. Sign language provides the most effective communication platform for a deaf and mute person to speak with an ordinary person. The goal of this paper is to develop a real-time system for hand gesture recognition that recognizes hand gestures and then converts them into voice and text. The system is implemented on a Raspberry Pi with a camera module and programmed in the Python programming language, supported by the Open Source Computer Vision (OpenCV) library. It also contains a 5-inch 800x480 resistive HDMI touch screen display for I/O and a 5-megapixel Pi camera to capture images of a person's hand. In this paper, efforts have been made to detect 8 different gestures, each assigned a unique sound and text output. In the experimental results, 800 samples were taken into consideration, out of which 760 were detected correctly and 40 incorrectly. Hence the proposed system gives an accuracy of 95%.
1 INTRODUCTION
In our daily routine, we communicate with one another using speech. Gestures are a more natural way for humans to interact with computers, so they build a bridge between humans and machines. For many deaf and mute persons, sign language is their primary language, creating a strong sense of social and cultural identity. The proposed system is based on a vision-based hand recognition approach, which is more natural and does not require any additional devices to identify a particular gesture. The hand gestures must be recognized under varying illumination conditions. Many feature extraction methods and classification techniques are available, and deciding which of them to use is a difficult task. The proposed method performs background segmentation of the hand from the input and then assigns a gesture to each sentence. It involves feature extraction through angle calculation of the hand gestures; finally, the gestures are detected and converted into text and voice. The proposed system is based on a Raspberry Pi with a camera module.
2 LITERATURE SURVEY

Rathi et al. [1] presented a framework for recognizing dynamic hand gestures of Indian signs and converting the recognized signals into text and voice, and vice versa. Eigenvectors and eigenvalues were used for feature extraction, and an eigenvalue-weighted Euclidean distance based classifier was used. Geethu and Anu [2] used an ARM Cortex-A8 processor board; a Haar classifier is used for image classification, while a 1-D HMM is used for speech conversion. Sign recognition has gained importance in several areas such as human-computer interaction (HCI), robotic control, and home automation. Quiapo et al. [3] were able to fulfill the requirements of a sign language translator; the project improved the range of the flex detection components while adding new types of sensor states for additional filtering. The GUI provided most of the functions required in the two-way translation process. Sayan Tapadar et al. [5] trained on acquired features that are nearly unique for different hand gestures, so that sign language can be recognized and disabled people can be integrated socially; distinctive feature extraction is used. Hamid A. Jalab and Herman K. Omer [6] proposed a hand gesture interface for controlling a media player using a neural network. The proposed method recognizes a set of four specific hand gestures, namely Play, Stop, Forward, and Reverse, and is based on four stages: image acquisition, hand segmentation, feature extraction, and classification. Geethu G Nath and Arun C [7] implemented a sign recognition framework for deaf people on an ARM Cortex-A8 processor board using a convex hull based method and template matching. The framework is used to control devices such as instruments, car audio systems, and home appliances. Shweta et al. [8] built a real-time hand gesture recognition system that recognizes hand signs using features such as peak count and angle calculation, and converts gesture images into voice and vice versa using image processing. Ali A. Abed and Sarah A. Rahman [9] developed and tested a mobile robot to demonstrate the effectiveness of the proposed method; the robot moves and navigates in different directions (Forward, Backward, Right, Left, and Stop), and the recognition rate of the robotic system reached about 98% using a Raspberry Pi with a camera module programmed in Python. Muhammad Yaqoob Javed et al. [10] presented the Digital Dactylology Converser (DOC), a device that converts sign language into voice signals and text messages; the proposed device works well and translates the letters of the alphabet into text and sound. Anup Nandy et al. [11] achieved efficient recognition accuracy on a limited set of dynamic ISL gestures, including excellent results for Euclidean distance and K-nearest neighbor metrics.
3 PROPOSED ARCHITECTURE

i. Frame capture

The input is a frame or a sequence of video frames taken by a Raspberry Pi camera module pointed at the user's hand. A 5 MP camera module, capable of 1080p video and still images as well as 720p60 and 640x480p60/90, captures the frame. The picture is captured with a plain background and stable lighting. The region of interest (ROI) is the hand region, so the system captures images of the hand and converts them to binary in order to find the ROI [9].
A contour is a curve joining all the continuous points along a boundary. It is a useful tool for shape analysis, detection, and recognition. In OpenCV, finding contours is like finding a white object on a black background. The green line around the hand (fig. 2) is termed the convex hull: the convex set enclosing the hand region, which is used to locate the fingertips. A convex hull can look similar to a contour approximation, but it is not the same (although both can give the same result in some cases). The curve is checked for convexity defects where it deviates from the hull.

Any deviation of the object from this hull is considered a convexity defect, denoted by blue dots (fig. 2). Each defect has three points: a start point, a far point (the defect point), and an end point between two fingers. If the angle between two fingers is greater than 30 degrees and less than 90 degrees, the cavity formed between them is termed a defect. The defects are most likely triangular in shape, with three corner points: the start point, far point, and end point.
The defect triangle formed by the start point, end point, and far point has side lengths a (start to end), b (start to far), and c (end to far):

a = √((x_end − x_start)² + (y_end − y_start)²)
b = √((x_far − x_start)² + (y_far − y_start)²)
c = √((x_end − x_far)² + (y_end − y_far)²)

With the semi-perimeter s = (a + b + c)/2, the area follows from Heron's formula:

ar = √(s(s − a)(s − b)(s − c))

The angle at the far point between the two fingers follows from the cosine rule:

Angle = arccos((b² + c² − a²) / (2bc))

Accuracy = (number of correctly detected samples / total number of samples) × 100
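The defect-triangle computation can be sketched in a few lines. This is a minimal illustration of the geometry described above, assuming each point is given as (x, y) coordinates; the function names are the author's own choices, not from the paper.

```python
import math

def defect_angle(start, end, far):
    """Angle at the far (defect) point of the start-far-end triangle,
    in degrees, computed from the side lengths via the cosine rule."""
    a = math.dist(start, end)  # side opposite the far point
    b = math.dist(start, far)
    c = math.dist(end, far)
    return math.degrees(math.acos((b**2 + c**2 - a**2) / (2 * b * c)))

def is_defect(start, end, far):
    # Count the valley between two fingers as a defect only when the
    # opening angle lies strictly between 30 and 90 degrees, as in the text.
    return 30 < defect_angle(start, end, far) < 90
```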
3.2 METHODOLOGY

The camera coupled with the Raspberry Pi first captures the image of the hand to be processed and identified. The input image must be converted to a specific format before the gesture is processed and identified. After processing, the gesture is identified and the corresponding text is generated. This text can be read by an ordinary person, and a text-to-speech message is produced for those who cannot read it.
A. Image Capture:
The input is a picture or series of pictures taken by a single camera directed at the user's hand. The gesture pictures are real images of various sizes taken with the camera.
B. RGB to Binary:
Two skin color bounds are defined in HSV, a lower bound and an upper bound. The input image captured by the camera is converted using these bounds into a binary (grayscale) mask.
C. Identify Color:
In this step, the skin color region is extracted from the frame. Next, the image is cropped to remove the unwanted parts of the initial picture. Finally, we get clear result images with uniform size and a consistent background.
D. Identify Gesture:
Identify the convex hull and contour within the green box around the hand. In this box, a gap between two fingers is called a defect. If the angle between two fingers is less than 90 degrees and greater than 30 degrees, it is counted as a defect; these defects are marked with blue dots. Based on the hand area and the number of defects, we can identify the gesture.
4.1 ALGORITHM

Convert the image to HSV, obtain the skin color object, blur the mask, and find the hand contour and its convex hull as follows:

import cv2

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
# skin colour mask (lower/upper HSV bounds defined beforehand)
mask = cv2.inRange(hsv, lower_skin, upper_skin)
mask = cv2.GaussianBlur(mask, (5, 5), 100)
contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnt = max(contours, key=cv2.contourArea)  # largest contour is taken as the hand
hull = cv2.convexHull(cnt)
Fig. 6 Flowchart
4.3 RESULT

The table above shows how an input image is converted into sound and display output. Each gesture has a unique sound and display output, as discussed in the next table.
Table 2: Result Related Discussion

Gesture   Number of Defects   Area Ratio       Display and Audio Output
1         0                   Less than 12     NO
2         0                   Less than 17.5   HELLO
3         1                   Less than 17.5   PLEASE HELP
4         2                   Less than 27     I AM THIRSTY
5         3                   Less than 27     YES
6         4                   Less than 27     THANK YOU
7         0                   Less than 17.5   ALL THE BEST
8         2                   Less than 27     OK
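A first-match lookup over Table 2 can be sketched as below. Note that the table as printed does not separate all eight gestures: gesture 7 shares the key (0 defects, ratio < 17.5) with gesture 2, and gesture 8 shares (2 defects, ratio < 27) with gesture 4, so the paper must use additional features to distinguish them; those rows are therefore omitted from this sketch, and the function name is the author's assumption.

```python
# (defect count, area-ratio upper bound, output phrase) from Table 2;
# gestures 7 and 8 are omitted because their keys duplicate rows 2 and 4.
GESTURE_TABLE = [
    (0, 12.0, "NO"),
    (0, 17.5, "HELLO"),
    (1, 17.5, "PLEASE HELP"),
    (2, 27.0, "I AM THIRSTY"),
    (3, 27.0, "YES"),
    (4, 27.0, "THANK YOU"),
]

def classify(defects, area_ratio):
    """Return the display/audio phrase of the first matching table row."""
    for d, max_ratio, phrase in GESTURE_TABLE:
        if defects == d and area_ratio < max_ratio:
            return phrase
    return None
```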
For hand gesture detection we take 8 different gestures. Each gesture was repeated one hundred times, so the total number of tested pictures was 800; among these, 760 were recognized correctly and 40 incorrectly, giving a mean hand gesture detection rate of 95.00%.

Accuracy = (number of correctly detected samples / total number of samples) × 100 = (760 / 800) × 100 = 95%

The graph of the average recognition rate of the various hand gestures is shown below (fig. 7).
(Fig. 7: bar chart per gesture showing the total number of gestures, the number of correctly recognized gestures, the number of wrongly recognized gestures, and the accuracy.)
5 CONCLUSION

The proposed system is simple to implement, as there is no complicated feature calculation. The system was implemented on a Raspberry Pi with the camera module and programmed in the Python programming language, supported by the Open Source Computer Vision (OpenCV) library. It was used to recognize the sign language used by deaf and mute persons and thus helps to overcome the communication gap between a mute person and an ordinary person. Further research is needed on feature extraction and illumination so that the system becomes more reliable. A deaf and mute person can use hand gestures for communication, and they will be converted into voice and text with 95% accuracy.
REFERENCES
1. Rathi, S., & Gawande, U. (2017). Development of full duplex intelligent communication
system for deaf and dumb people. 2017 7th International Conference on Cloud
Computing, Data Science & Engineering -
Confluence.doi:10.1109/confluence.2017.7943247
2. Geethu G Nath, Anu V S, “Embedded Sign Language Interpreter System For Deaf and
Dumb People”, 2017 International Conference on Innovations in information Embedded
and Communication Systems (ICIIECS)
3. Quiapo, Carlos Emmanuel A. and Ramos, Katrina Nicole M., “Development of a Sign Language Translator Using Simplified Tilt, Flex and Contact Sensor Modules”, © 2016 IEEE
4. Subhankar Chattoraj Karan Vishwakarma, “Assistive System for Physically Disabled
People using Gesture Recognition”, 2017 IEEE 2nd International Conference on Signal
and Image Processing
5. Sayan Tapadar, Suhrid Krishna Chatterjee, Himadri Nath Saha, Shinjini Ray, Sudipta Saha, “A Machine Learning Based Approach for Hand Gesture Recognition using Distinctive Feature Extraction”, © 2018 IEEE
6. Hamid A. Jalab, Herman K. Omer, “Human Computer Interface Using Hand Gesture Recognition Based On Neural Network”, © 2015 IEEE
7. Geethu G Nath, Arun C S, “Real Time Sign Language Interpreter”, 2017 International
Conference on Electrical, Instrumentation and Communication Engineering
(ICEICE2017)
8. Shweta, Rajesh, Vitthal, “Real Time Two Way Communication Approach for Hearing
Impaired and Dumb Person Based on Image Processing”, 2016 IEEE International
Conference on Computational Intelligence and Computing Research
9. Ali A. Abed, Sarah A. Rahman, “Python-based Raspberry Pi for Hand Gesture
Recognition”, International Journal of Computer Applications (0975 – 8887) Volume 173
– No.4, September 2017
10. Muhammad Yaqoob Javed, Muhammad Majid Gulzar, Syed Tahir Hussain Rizvi, M. Junaid Asif, Zaineb Iqbal, “Implementation of Image Processing Based Digital Dactylology Converser for Deaf-Mute Persons”, © 2016 IEEE
11. Anup Nandy, Jay Shankar Prasad, Soumik Mondal, Pavan Chakraborty, and G.C. Nandi,”
Recognition of Isolated Indian Sign Language Gesture in Real Time”, BAIP 2010, CCIS
70, pp. 102–107, 2010. © Springer-Verlag Berlin Heidelberg 2010
12. Kapil Yadav and Jhilik Bhattacharya, “Real-Time Hand Gesture Detection and
Recognition for Human Computer Interaction”, Intelligent Systems Technologies and
Applications, Advances in Intelligent Systems and Computing 384, DOI: 10.1007/978-3-
319-23036-8_49
13. Giuseppe Airò Farulla, Ludovico Orlando Russo, Chiara Pintor, Daniele Pianu, Giorgio Micotti, Alice Rita Salgarella, Domenico Camboni, Marco Controzzi, Christian Cipriani, Calogero Maria Oddo, Stefano Rosa, and Marco Indaco, “Real-Time Single Camera Hand Gesture Recognition System for Remote Deaf-Blind Communication”, © Springer International Publishing Switzerland 2014
14. R. C. Gonzalez, R. E. Woods, Digital Image Processing, New Jersey: Pearson Education, Inc.
15. S. E. Umbaugh, Computer Vision and Image Processing: A Practical Approach Using CVIPtools with CD-ROM, Prentice Hall PTR.
16. H. A. Jalab, R. W. Ibrahim, “Texture enhancement based on the Savitzky-Golay fractional differential operator”, Mathematical Problems in Engineering, 2013.