Hand Gesture Recognition Using Neural Networks

A
Project Report On
“Hand Gesture Recognition using Neural Networks”
Submitted by
Kiran P V Exam No. B3223021
Vidit Mediratta Exam No. B3223027
Gaurav Sharma Exam No. B3223042
Under the Guidance of
Prof. Vijay Karra
For the partial fulfillment of
B.E. (Electronics & Telecommunication) 2008-2009
To
DEPARTMENT OF ELECTRONICS & TELECOMMUNICATION
ARMY INSTITUTE OF TECHNOLOGY
DIGHI HILLS, PUNE-411015
Under
University of Pune
CERTIFICATE
This is to certify that Kiran P V, Gaurav Sharma and Vidit Mediratta have
successfully submitted the seminar report on
“HAND GESTURE RECOGNITION USING NEURAL
NETWORK”
during the academic year 2008-2009 in partial fulfillment towards completion
of the Bachelor’s Degree Program in Engineering (Electronics and Telecommunication)
under the University of Pune.

Mrs. Surekha K.S Prof. Vijay
Karra
Head of Department Project Guide
Electronics and Telecommunication Electronics and Telecommunication

Mrs. Surekha K.S
Principal
Army Institute of Technology
Dighi Hills, Pune-411015
ACKNOWLEDGEMENT
We wish to express our sincere gratitude to our guide, Prof. Vijay Karra, for his valuable
guidance at all stages of our project. We acknowledge his wholehearted, unreserved and positive
encouragement, which helped us tackle our problems and ensured the successful
completion of the project.
We are thankful to the staff of the Department of Electronics & Telecommunication, A.I.T., for
all their direct and indirect help and for making the resources of the department available for the
timely completion of our project.
We are also thankful to Prof. Surekha K S for her valuable suggestions.
We sincerely believe that our guides were the motivating forces behind the project. It was their
constant encouragement and constructive criticism that brought our project to its present
form.
We would be ungrateful if we did not acknowledge our family and friends, who were always by
our side.
Last but not least, we express deep gratitude to everyone we have named; to those we
have not named, please note that you are appreciated more than you know.
Kiran P V
Vidit Mediratta
Gaurav Sharma
Table of contents
1. Abstract
2. Introduction
   A. Brief description
   B. Literature survey
   C. Software Engineering Approach
3. Problem Definition
4. Design
   I. A. Hand Gesture Recognition
      B. Image Database
      C. Image Processing
      D. Matlab
      E. Neural Network
      F. Block Diagram
   II. Defining the different issues
      A. Database Creation
      B. Counting the fingers
      C. Matlab Operations
      D. Neuron Model
      E. Microcontroller & Robot
5. Stepwise procedure flow
6. Time Activity Chart
7. Conclusion
8. Future scope
9. Bibliography
ABSTRACT
Hand gesture recognition techniques have been studied for more than two decades. Several
solutions have been developed; however, little attention has been paid to human factors, e.g.
the intuitiveness of the applied hand gestures. This study was inspired by the movie Minority
Report, in which a gesture-based interface was presented to a large audience. In the movie, a
video-browsing application was controlled by hand gestures.
Nowadays the tracking of hand movements and the computer recognition of gestures are
realizable; however, for a usable system it is essential to have an intuitive set of gestures. The
system functions used in Minority Report were reverse engineered and a user study was
conducted, in which participants were asked to express these functions by means of hand
gestures. We were interested in how people formulate gestures and whether we could find any
pattern in these gestures. In particular, we focused on the types of gestures in order to study
intuitiveness, and on the kinetic features to discover how they influence computer recognition.
We found that there are typical gestures for each function, and these are not necessarily related to
the technology people are used to. This result suggests that an intuitive set of gestures can be
designed, which is not only usable in this specific application, but can be generalized for other
purposes as well. Furthermore, directions are given for computer recognition of gestures
regarding the number of hands used and the dimensions of the space where the gestures are
formulated.
INTRODUCTION
BRIEF DESCRIPTION
Several successful approaches to spatio-temporal signal processing, such as speech recognition
and hand gesture recognition, have been proposed. Most of them involve time alignment, which
requires substantial computation and considerable memory storage. In this paper, we present a
neural-network-based approach to spatio-temporal pattern recognition. This approach employs
a powerful method based on Hyper Rectangular Composite Neural Networks (HRCNNs) for
selecting templates; therefore, the memory requirement is considerably reduced.
Due to congenital malfunctions, diseases, head injuries, or virus infections, deaf or
non-vocal individuals are unable to communicate with hearing persons through speech. They
use sign language or hand gestures to express themselves; however, most hearing persons do
not have the necessary sign language expertise. Hand gestures can be classified into two classes:
(1) static hand gestures, which rely only on information about the angles of the fingers, and (2)
dynamic hand gestures, which rely not only on the fingers' flex angles but also on the hand
trajectories and orientations. The dynamic hand gestures can be further divided into two
subclasses. The first subclass consists of hand gestures involving hand movements and the
second subclass consists of hand gestures involving finger movements but without changing
the position of the hands. That is, a gesture of the second subclass requires at least two different
hand shapes connected sequentially. Therefore samples of these hand gestures are
spatio-temporal patterns. The basic idea of our method for recognizing these spatio-temporal
hand gestures is as follows. We generate templates for each basic hand shape by training a
Hyper Rectangular Composite Neural Network (HRCNN). Templates for each hand shape are
then represented in the form of crisp IF-THEN rules, which are extracted from the values of
synaptic weights of the corresponding trained HRCNN. The accumulated similarity associated
with all samples of the input is computed for each hand gesture in the vocabulary, and the
unknown gesture is classified as the gesture yielding the highest accumulative similarity.
Developing sign language applications for deaf people can be very important, as many of them,
not being able to speak a language, are also unable to read or write a spoken language.
Ideally, a translation system would make it possible to communicate with deaf people.
Compared to speech commands, hand gestures are advantageous in noisy environments, in
situations where speech commands would be disturbing, as well as for communicating
quantitative information and spatial relationships.
A gesture is a form of non-verbal communication made with a part of the body and used instead
of verbal communication (or in combination with it). Most people use gestures and body
language in addition to words when they speak. A sign language is a language which uses
gestures instead of sound to convey meaning combining hand-shapes, orientation and movement
of the hands, arms or body, facial expressions and lip-patterns. As in automatic speech
recognition (ASR), we focus on gesture recognition, the result of which can later be translated
into a certain machine movement.
The goal of this project is to develop a program implementing real-time gesture recognition. At
any time, a user can show his hand making a specific gesture in front of a video camera linked to
a computer. However, the user is not expected to be in exactly the same place each time he shows
his hand. The program has to collect pictures of this gesture from the video camera, analyze them
and identify the sign. It has to do so as fast as possible, given that real-time processing is
required. To keep the project manageable, it has been decided that the identification would consist
of counting the number of fingers shown by the user in the input picture.
We propose a fast algorithm for automatically recognizing a limited set of gestures from hand
images for a robot control application. Hand gesture recognition is a challenging problem in its
general form. We consider a fixed set of manual commands and a reasonably structured
environment, and develop a simple, yet effective, procedure for gesture recognition. Our
approach contains steps for segmenting the hand region, locating the fingers and finally
classifying the gesture. The algorithm is invariant to translation, rotation, and scale of the
hand. We demonstrate the effectiveness of the technique on real imagery.
LITERATURE SURVEY
Objective:
Our objective is to identify requirements (i.e., quality attributes and functional
requirements) for Gesture Based Recognition. We especially focus on requirements
for research tools that target the domains of visualization for software maintenance,
reengineering, and reverse engineering.
Method:
The requirements are identified through a comprehensive literature survey based on relevant
publications in journals, conference proceedings, and theses. We have referred to
documents and journals available on the Internet. Most of the material was obtained
from the IEEE website. As our library has an online subscription to the IEEE journals, it
provided immense help in locating the resources.
The various journals referred are:
1) Implementation of adaptive feed-forward algorithm by Jaroslaw Szewinski and Wojciech
Jalmuzna, University of Technology, Institute of Electronic Systems, Warsaw, Poland.

This paper describes the various algorithms used in Neural Networks, viz.
feed-forward (FF), feedback (FB) and adaptive feed-forward (AFF).

2) Gesture Based Robot Control by V. S. Rao and C. Mahanta, Department of
Electronics and Communication Engineering, Indian Institute of Technology, Guwahati.

This paper deals with past and recent developments in gesture recognition systems. It
presents the work of different scientists in different parts of the globe on the
same aim: a visual gesture recognition system for controlling robots.

3) A Fast Algorithm For Vision-Based Hand Gesture Recognition For Robot
Control by Asanterabi Malima, Erol Özgür, and Müjdat Çetin, Faculty of Engineering and
Natural Sciences, Sabancı University, Tuzla, İstanbul, Turkey.
The approach contains steps for segmenting the hand region, locating the fingers, and
finally classifying the gesture. The algorithm is invariant to translation, rotation, and scale of
the hand.
4) A Gesture controlled robot for object perception and Manipulation by Mark
Batcher, Institute of Neuroinformatics, Germany.
Gripsee is the name of the robot whose design is discussed in the paper. It is used
for identifying an object, grasping it, and moving it to a new position. It serves as a
multipurpose service robot that can perform a number of tasks.
5) Programming-By-Example Gesture Recognition by Kevin Gabayan and Steven Lansel.
Machine learning and hardware improvements to a programming-by-example rapid
prototyping system are proposed. This paper deals with the dynamic time warping gesture
recognition approach involving single signal channels.
SOFTWARE ENGINEERING APPROACH
For developing the code and the whole algorithm, Matlab was the preferred environment. In this
environment, image display, graphical analysis and image processing are simple to code,
because Matlab has a large and very complete “Image Processing Toolbox”. In addition, Matlab is
optimized for matrix-based computation, which makes any image processing easier, given that any
image can be considered as a matrix.
That is why the whole code was first developed in the Matlab environment. Only the
code of the Neural Network Method and of the Weighted Averaging Analysis method is
provided. Since the latter is a combination of the Pixel Counting Method
and the Edge Counting Method, their respective codes may be extracted from the code of the
Weighted Averaging Method.
For the movement of the robot, the program has been written in assembly language, since it is
the most suitable and we are familiar with it. The IC used is the 8051 microcontroller; hence
the code was written and tested in the RIDE software.
PROBLEM DEFINITION
The experimental setup consists of a digital camera used to take the images. The camera
is interfaced to a computer. The computer is used to create the database and to analyze the
images. It runs a program written in MATLAB for the various
operations on the images. The analysis of the images is done using the Neural Network Toolbox.
The initial step is to create the database of images used for training and
testing. The image database can have different formats. Images can be hand drawn,
digitized photographs or a 3D hand. Photographs were used, as they are the
most realistic approach. Two operations were carried out on all of the images: they were
converted to grayscale and the background was made uniform. The images from internet
databases already had uniform backgrounds, but the ones taken with the digital camera
had to be processed in Photoshop. The pattern recognition system that will be used
consists of some transformation T, which converts an image into a feature vector, which
will then be compared with the feature vectors of a training set of gestures.
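The comparison of a feature vector with a training set can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the report does not specify the transformation T or the comparison rule, so the flattening function `to_feature_vector` and the nearest-neighbour rule in `classify` are hypothetical stand-ins, not the method used in the project.

```python
import math

def to_feature_vector(image):
    """Hypothetical transformation T: flatten a grayscale image (list of rows) into one list."""
    return [float(pixel) for row in image for pixel in row]

def classify(image, training_set):
    """Return the label of the training image whose feature vector is closest.

    training_set is a list of (label, image) pairs; closeness is the Euclidean
    distance between feature vectors (an assumption made for this sketch).
    """
    query = to_feature_vector(image)
    best_label, best_distance = None, math.inf
    for label, example in training_set:
        distance = math.dist(query, to_feature_vector(example))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

# Two tiny 2 x 2 "training images" and one unknown image.
training = [("one", [[0, 255], [0, 0]]), ("two", [[255, 255], [0, 0]])]
print(classify([[10, 250], [0, 0]], training))
```

All images passed to `classify` are assumed to have the same size, which is what the preprocessing described later in the report is meant to guarantee.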

DESIGN
HAND GESTURE RECOGNITION
Consider a robot navigation problem, in which a robot responds to the hand pose signs given by
a human, visually observed by the robot through a camera. We are interested in an algorithm that
enables the robot to identify a hand pose sign in the input image as one of five possible
commands (or counts). The identified command will then be used as a control input for the robot
to perform a certain action or execute a certain task. For examples of the signs to be used in our
algorithm, see the figure. The signs could be associated with various meanings depending on the
function of the robot. For example, a “one” count could mean “move forward” and a “five” count
could mean “stop”. Furthermore, “two”, “three”, and “four” counts could be interpreted as
“reverse”, “turn right,” and “turn left.”
Set of hand gestures, or “counts” considered in our work.
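The association between counts and robot commands described above can be written as a simple lookup table. The following Python sketch uses the example meanings given in the text; the names `COMMANDS` and `command_for_count` and the fallback value "no command" are assumptions introduced for illustration.

```python
# Example meanings taken from the text; a real robot may assign other meanings.
COMMANDS = {
    1: "move forward",
    2: "reverse",
    3: "turn right",
    4: "turn left",
    5: "stop",
}

def command_for_count(finger_count):
    """Map a recognized finger count to a command; unknown counts give 'no command'."""
    return COMMANDS.get(finger_count, "no command")

print(command_for_count(3))
```

Keeping the mapping in one table makes it easy to reassign meanings when the function of the robot changes.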
IMAGE DATABASE
The starting point of the project was the creation of a database with all the images that
would be used for training and testing.
The image database can have different formats. Images can be hand drawn,
digitized photographs or a 3D hand. Photographs were used, as they are the
most realistic approach.
Images came from two main sources: various ASL databases on the Internet and
photographs I took with a digital camera. This meant that they had different sizes,
different resolutions and sometimes almost completely different shooting angles.
Images belonging to the last case were very few, and they were discarded, as there was no
chance of classifying them correctly. Two operations were carried out on all of the
images: they were converted to grayscale and the background was made uniform. The
internet databases already had uniform backgrounds, but the images I took with the digital
camera had to be processed in Adobe Photoshop.
Drawn images can still simulate translational variances with the help of an editing
program (e.g. Adobe Photoshop).
The database itself was constantly changing throughout the project, as it
determines the robustness of the algorithm. Therefore, it had to be built in
such a way that different situations could be tested and the thresholds above which the
algorithm no longer classifies correctly could be determined.
The construction of such a database is clearly dependent on the application. If the
application is, for example, a crane controller operated by the same person for long periods,
the algorithm does not have to be robust to images of different persons. In this case noise
and motion blur should be tolerable.
IMAGE PROCESSING
Image processing is any form of signal processing for which the input is an image, such as
photographs or frames of video; the output of image processing can be either an image or a set of
characteristics or parameters related to the image. Most image-processing techniques involve
treating the image as a two-dimensional signal and applying standard signal-
processing techniques to it.
Typical operations
Typical image processing operations include, among many others:
• Geometric transformations such as enlargement, reduction, and rotation
• Color corrections such as brightness and contrast adjustments, quantization, or
conversion to a different color space
• Digital compositing or Optical compositing (combination of two or more images). Used
in filmmaking to make a "matte"
• Image editing (e.g., to increase the quality of a digital image)
• Image registration (alignment of two or more images), differencing and morphing
• Image segmentation
• Extending dynamic range by combining differently exposed images
• 2-D object recognition with affine invariance
Applications
• Computer vision
• Face detection
• Feature detection
• Lane departure warning system
• Non-photorealistic rendering
• Medical image processing
• Microscope image processing
• Morphological image processing
• Remote sensing
MATLAB
The name MATLAB stands for matrix laboratory.
MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where
problems and solutions are expressed in familiar mathematical notation. Typical uses
include:
• Math and computation
• Algorithm development
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including Graphical User Interface building
MATLAB is an interactive system whose basic data element is an array that does not
require dimensioning. This allows you to solve many technical computing problems,
especially those with matrix and vector formulations, in a fraction of the time it would
take to write a program in a scalar non-interactive language such as C or Fortran.
MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses
in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for
high-productivity research, development, and analysis.
The reason that I have decided to use MATLAB for the development of this project is its
toolboxes. Toolboxes allow you to learn and apply specialized technology. Toolboxes
are comprehensive collections of MATLAB functions (M-files) that extend the
MATLAB environment to solve particular classes of problems. The available toolboxes include,
among others, the Image Processing and Neural Network toolboxes.
NEURAL NETWORK
An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly
just neural network (NN), is an interconnected group of artificial neurons that uses
a mathematical or computational model for information processing based on a
connectionist approach to computation. In most cases an ANN is an adaptive system that
changes its structure based on external or internal information that flows through the network.
In more practical terms, neural networks are non-linear statistical data modeling or decision
making tools. They can be used to model complex relationships between inputs and outputs or
to find patterns in data.
An artificial neural network involves a network of simple processing elements (artificial
neurons) which can exhibit complex global behavior, determined by the connections between the
processing elements and element parameters. One classical type of artificial neural network is the
Hopfield net.
In a neural network model simple nodes, which can be called variously "neurons", "neurodes",
"Processing Elements" (PE) or "units", are connected together to form a network of nodes —
hence the term "neural network". While a neural network does not have to be adaptive per se, its
practical use comes with algorithms designed to alter the strength (weights) of the connections in
the network to produce a desired signal flow.
In modern software implementations of artificial neural networks the approach inspired by
biology has more or less been abandoned for a more practical approach based on statistics and
signal processing. In some of these systems neural networks, or parts of neural networks (such as
artificial neurons) are used as components in larger systems that combine both adaptive and non-
adaptive elements.
Neural networks are composed of simple elements operating in parallel. These elements
are inspired by biological nervous systems. As in nature, the network function is
determined largely by the connections between elements. We can train a neural network
to perform a particular function by adjusting the values of the connections (weights)
between elements.
Commonly, neural networks are adjusted, or trained, so that a particular input leads to a
specific target output. The network is adjusted, based on a comparison of the output and the
target, until the network output matches the target.
Figure: Neural Net block diagram
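The adjust-compare-repeat loop described above can be sketched in Python for the simplest possible case, a single artificial neuron. This is an illustrative sketch only, not the network used in the project: the perceptron learning rule, the step activation, the learning rate and the logical AND training data are all assumptions chosen to keep the example small.

```python
def step(x):
    """Step activation: the neuron fires (1) when its weighted input is non-negative."""
    return 1 if x >= 0 else 0

def predict(weights, bias, inputs):
    """Output of a single neuron for one input vector."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(samples, learning_rate=1.0, max_epochs=100):
    """Adjust weights until every output matches its target (perceptron rule).

    samples is a list of (inputs, target) pairs with targets 0 or 1.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # compare output and target
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # network output matches the target for every sample
            break
    return weights, bias

# Logical AND as a toy training set.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_neuron(and_samples)
print([predict(weights, bias, inputs) for inputs, _ in and_samples])
```

A single neuron of this kind can only learn linearly separable functions; multi-layer networks trained with other rules are needed for harder problems such as gesture images.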
Neural networks have been trained to perform complex functions in various fields of
application including pattern recognition, identification, classification, speech, vision and
control systems.
Today neural networks can be trained to solve problems that are difficult for conventional
computers or human beings. The supervised training methods are commonly used, but
other networks can be obtained from unsupervised training techniques or from direct
design methods. Unsupervised networks can be used, for instance, to identify groups of
data. Certain kinds of linear networks and Hopfield networks are designed directly. In
summary, there are a variety of kinds of design and learning techniques that enrich the
choices that a user can make.
Applications
The utility of artificial neural network models lies in the fact that they can be used to infer a
function from observations and also to use it. This is particularly useful in applications where the
complexity of the data or task makes the design of such a function by hand impractical.
Real life applications
The tasks to which artificial neural networks are applied tend to fall within the following broad
categories:
• Function approximation, or regression analysis, including time series prediction and
modelling.
• Classification, including pattern and sequence recognition, novelty detection and
sequential decision making.
• Data processing, including filtering, clustering, blind signal separation and
compression.
Application areas include system identification and control (vehicle control, process control),
game-playing and decision making (backgammon, chess, racing), pattern recognition (radar
systems, face identification, object recognition, etc.), sequence recognition (gesture, speech,
handwritten text recognition), medical diagnosis, financial applications, data mining (or
knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.
BLOCK DIAGRAM
[Figure: system block diagram with the blocks PC with MATLAB, 8051 microcontroller, motor driver and motor]
[Figure: pattern recognition block diagram with the blocks pattern to recognize, sampling, generation of templates, pattern recognition, decision logic and recognized pattern]
DEFINING THE DIFFERENT ISSUES
Collecting the pictures
First of all, it is necessary to collect pictures. A choice has to be made
concerning the way these pictures are collected, since it depends on how the
main program is implemented. Running in the MATLAB environment requires the pictures to be
saved in memory and loaded when running the program, because the Image Acquisition
Toolbox is not available in the MATLAB version used for the design of the program.
That is why, for real-time processing, it will be necessary to implement the program in a
C or C++ environment. The easiest way to collect pictures is then to use VideoOCX, for example,
assuming coding in C++.
However, to develop the body of the program, there are no real-time constraints: it is
possible to work on typical and representative pictures previously chosen and saved. The whole
MATLAB program has been developed using such saved pictures. It has then been modified so
that it can be used in real-time C++ stand-alone functions.
Finding the hand
Now, let us suppose that a set of representative pictures is provided. We then need to
analyze each picture and find its relevant part, since the user will never put his
hand in exactly the same area of the picture. A few examples are given here of the same sign made
in different areas, which all have to lead to the same identification result, ‘2’:
Analysis and identification
Then the real work can start. Let us suppose we have the relevant part of the image, which
contains only the hand. How can we “guess” the type of sign? To make the problem easier, we
consider that we are interested only in the number of fingers shown by the user. The problem
can thus be summed up as: how can we count the number of fingers in a picture of a hand?
There are plenty of ways to do it. In the following pages, the advantages and drawbacks
of a few of them are described. There are geometrical methods that solve the problem
by counting the number of blocks within a picture, and more sophisticated methods,
such as neural networks or Laplacian filtering, which can lead to interesting results.
Examples of Allowed pictures

It has already been explained that the position of the hand in the picture is not important.
Given that the background is known, it is possible to build a new picture that corresponds to the
difference between the current picture of the hand and the background. It is thus possible to
obtain a picture that contains only the hand and some noise.
After noise removal, the resulting picture will be black almost everywhere except
where the hand is. Zooming can then easily be achieved by cropping away the areas whose pixel
values are close to 0.
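The zooming step can be sketched as a bounding-box crop. The following Python sketch is an illustration under assumptions, not the project's MATLAB code: the function name `crop_to_content` and the `tolerance` parameter (the value below which a pixel counts as "close to 0") are hypothetical.

```python
def crop_to_content(image, tolerance=0):
    """Crop away border rows and columns whose pixels are all <= tolerance.

    image is a list of rows of numbers. Returns the smallest rectangle that
    contains every pixel above the tolerance, or [] if there is none.
    """
    rows = [r for r, row in enumerate(image) if any(v > tolerance for v in row)]
    if not rows:
        return []
    cols = [c for c in range(len(image[0])) if any(row[c] > tolerance for row in image)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]

picture = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
]
print(crop_to_content(picture))
```

Because the crop keeps every pixel above the tolerance, a single bright noise pixel far from the hand would enlarge the box considerably, which is why noise removal has to come first.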
Picture of the difference with the background
The difference with the background can be computed using the Matlab function “imabsdiff”.
After that, to make all the preprocessing easier, it is better to create a binary picture. To do so, it
is necessary to choose a threshold: pixels with a value lower than this threshold will be set to 0
(black) and the others will be set to 1. The choice of this threshold depends on the video camera
properties: if the camera provides pixels coded on one byte, pixel values will range
from 0 to 255. Measurements have shown that in this case, the presence of the hand
implies a variation of pixel values larger than 20 units. Of course, the optimal threshold depends
on the background; nevertheless, this threshold is suitable in most cases.
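The background difference and thresholding can be sketched in plain Python as follows. This is a minimal illustration of the idea, not the project's MATLAB code: `abs_difference` plays the role that “imabsdiff” plays in Matlab, `binarize` is a hypothetical helper, and the default threshold of 20 is the value quoted above for 8-bit pixels.

```python
def abs_difference(image, background):
    """Pixel-wise absolute difference of two equally sized grayscale images."""
    return [
        [abs(pixel - bg) for pixel, bg in zip(image_row, bg_row)]
        for image_row, bg_row in zip(image, background)
    ]

def binarize(difference, threshold=20):
    """Set pixels below the threshold to 0 (black) and all others to 1."""
    return [[0 if value < threshold else 1 for value in row] for row in difference]

background = [[100, 100], [100, 100]]
picture = [[100, 130], [95, 60]]
print(binarize(abs_difference(picture, background)))
```

Using the absolute difference means that a hand darker than the background is detected as well as a brighter one.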
Next, noise-removal functions must be executed; otherwise every noisy pixel whose
value is too high may be considered part of the hand and will be included in the zoom-in
picture. For example, suppose that the hand is in one corner of the picture and that there is a
noisy pixel in the opposite corner of the difference picture: the zooming function will
keep it, and the resulting picture, after zooming, will not be very different from the initial picture!
That is why noise removal functions are necessary.
The noise removal is performed using the function bwmorph with the 'open' operation, which
erodes and then dilates the noisy picture. In this way, isolated pixels disappear during the erosion,
while other elements are restored to their initial shape by the subsequent dilation.
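The erosion followed by dilation can be sketched in plain Python. This is an illustrative sketch, not the Matlab implementation: it assumes a 3 x 3 square neighbourhood and simply ignores neighbours that fall outside the picture; the function names are hypothetical.

```python
def _neighbourhood(image, r, c):
    """Values of the 3 x 3 neighbourhood of (r, c), restricted to the picture."""
    height, width = len(image), len(image[0])
    return [
        image[rr][cc]
        for rr in range(max(0, r - 1), min(height, r + 2))
        for cc in range(max(0, c - 1), min(width, c + 2))
    ]

def erode(image):
    """A pixel stays 1 only if its whole neighbourhood is 1."""
    return [[1 if all(_neighbourhood(image, r, c)) else 0
             for c in range(len(image[0]))] for r in range(len(image))]

def dilate(image):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return [[1 if any(_neighbourhood(image, r, c)) else 0
             for c in range(len(image[0]))] for r in range(len(image))]

def open_binary(image):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(image))

noisy = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],  # isolated noisy pixel
]
for row in open_binary(noisy):
    print(row)
```

In this toy picture the 3 x 3 block survives the opening while the isolated pixel in the corner is removed. Shapes narrower than the neighbourhood would also be removed, so the opening is best applied before the picture is reduced in size.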
A few examples of the resulting pictures are given here.
[Figure panels: background, input picture, binary picture]
Standard Re-sizing
According to the requirements, the user is not expected to always be at the same
distance from the video camera. The consequences are obvious: if he is close to it, the hand will
occupy a large part of the input picture. On the contrary, when he is far from it, the hand will
appear small in the picture. The pictures of the hands after cropping may therefore have
very different sizes, and it is necessary to resize all the pictures to a standard size so
that they can all be processed in the same way.
It is not useful to resize a picture to a size larger than the original one, given
that this will not add information. Worse, it would be a serious drawback because it would increase
the amount of computation, which is contrary to the constraint of real-time processing. For
these reasons, it is more interesting to reduce the size, but not too much. Indeed, in an
excessively reduced picture, some fingers can disappear, and some spaces between two fingers
may also disappear so that it seems there is only one finger.
After a few tests and measurements, it has been decided that a size of 30x30 is small
enough to make the computation fast, and large enough to avoid any major damage to the initial picture.
In these conditions, the average dimensions of a finger are:
- width: 3~5 pixels
- length: 15~20 pixels
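The standard re-sizing can be sketched with nearest-neighbour sampling in plain Python. This is an assumption made for the illustration: the report does not state which interpolation method was used, and the function name `resize_nearest` is hypothetical. As in the text, the target is a square picture of 30 x 30 pixels, so the aspect ratio of the cropped hand is not preserved.

```python
def resize_nearest(image, size=30):
    """Resize a picture (list of rows) to size x size by nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]

# A 60 x 90 binary picture whose right half is white.
cropped = [[1 if c >= 45 else 0 for c in range(90)] for _ in range(60)]
small = resize_nearest(cropped)
print(len(small), len(small[0]))
```

Each output pixel copies one input pixel, so the result of resizing a binary picture is still binary and can be passed directly to the counting methods.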
Of course, different users all have different hands, hence different absolute
measurements. Nevertheless, such standard re-sizing provides relative measurements: although the
actual sizes of the thumb and ring fingers depend on the user, their ratio is generally constant.
For almost all users:
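- $\mathrm{Width}(\text{thumb}) \approx \mathrm{Width}(\text{ring}) \approx \dots \approx \mathrm{Width}(\text{atrial})$
- $\dfrac{\mathrm{Length}_{\text{User 1}}(\text{ring})}{\mathrm{Length}_{\text{User 1}}(\text{atrial})} \approx \dfrac{\mathrm{Length}_{\text{User 2}}(\text{ring})}{\mathrm{Length}_{\text{User 2}}(\text{atrial})} \approx \dots \approx \mathrm{Cst}$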
That’s why this re-sizing operation can be considered as a standardization process: for any user,
the final re-sized image will have almost identical properties concerning the dimension of its
elements.
Finally, the fact that the width of a finger is 4 to 5 pixels implies that in the resulting
picture
[Figure: a schematic example, and a real example showing the input picture, the binary picture,
the zoom-in and the resizing.]
Under these conditions, for any input picture and for any hand gesture that involves the
thumb, the preprocessing algorithm provides a standard-sized binary picture that corresponds to
a zoom on the hand. Once this preprocessing is finished, the real processing, that is to say
the identification process, can start.
[Figure: initial picture, size 240 x 320; hand, size ? x ?; re-sized hand, 30 x 30.]
Counting the fingers
Simple Pixel Counting Analysis
The first immediate idea is the following: a picture that contains only the hand of the user
is provided to the program. In this picture, if only one or two fingers are exhibited, the
number of pixels with value ‘1’ will be small. If the five fingers are shown, there will be
more pixels at ‘1’. So, there is a strong link between the number of fingers and the number of
pixels set to ‘1’. The easiest way to classify an image is then to compute the sum of the pixels
of the re-sized hand picture, and to compare the resulting value to different ranges:
If sum < range_1
Then No fingers
If range_1 < sum < range_2
Then 1 finger
If range_2 < sum < range_3
Then 2 fingers
If range_3 < sum < range_4
Then 3 fingers
If range_4 < sum < range_5
Then 4 fingers
If range_5 < sum
Then 5 fingers
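The comparison against ranges can be sketched as below. This is a minimal Python illustration; the numeric values in `RANGES` are hypothetical placeholders, since the report does not give the values of range_1 to range_5, and a sum exactly equal to a bound (a case the strict inequalities above leave undefined) is assigned here to the higher class.

```python
import bisect

# Hypothetical values for range_1 .. range_5 (not taken from the report).
RANGES = [30, 90, 150, 210, 270]


def count_fingers_by_pixel_sum(image, ranges=RANGES):
    """Classify a binary picture by the number of pixels set to 1."""
    total = sum(sum(row) for row in image)
    # Number of bounds that the sum has reached = number of fingers (0 to 5).
    return bisect.bisect_right(ranges, total)


# Usage: a 30 x 30 picture in which 100 pixels are set to 1.
picture = [[1 if r < 10 and c < 10 else 0 for c in range(30)] for r in range(30)]
fingers = count_fingers_by_pixel_sum(picture)
```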
The advantage of this method is that it is easy to program and very fast. However, it is not
very reliable.
According to the previous sections, the width of a finger will generally be 4 to 5 pixels,
and let’s suppose its length is 15 to 20 pixels, depending on the user. So let’s consider that
for User 1, each finger has the dimension:
\( 4 \text{ pixels (width)} \times 15 \text{ pixels (length)} = 60 \text{ pixels/finger} \)
For User 1, four fingers will lead to about 240 pixels. Let’s suppose that for User 2, the width
of a finger is 5 and its length is 20. The finger dimension is:
\( 5 \text{ pixels (width)} \times 20 \text{ pixels (length)} = 100 \text{ pixels/finger} \)
For User 2, two fingers already lead to 200 pixels, which is close to the total that User 1
reaches with four fingers. The consequence is that the program may get confused and tell
User 2 that he is exhibiting five fingers (four fingers and the thumb) when he shows only three
of them (two fingers and the thumb)!
Another issue is that even if it is always the same user who makes the signs, and the
different ranges have been optimized for his average finger size, errors will probably occur if
he does not open the hand widely. Indeed, if the hand is fully open, let’s assume no error will
occur; but if the fingers are slightly curled (“closed”), then for each finger the sum of its
pixels will be smaller, and if this is the case for several fingers, the global sum may lead to
a mistake. An example of this phenomenon is given here:
[Figure: two hand pictures. For the first, the program answers ‘5’; for the second, the program
answers ‘4’.]
In this example, when the last two fingers are curled, the sum of their pixels makes the
program consider that there are only four fingers, because the global sum is almost the same as
the one the program would obtain if four fingers were exhibited in a fully opened hand.
This very simple method is efficient only for a single user who also accepts additional
constraints on the allowed signs. Such a solution is not acceptable for the project, if only
because it has to work with several users. It is therefore necessary to consider more
sophisticated solutions.
Simple Block Counting Analysis
The program has to count the number of fingers, so let’s create a picture in which only the
fingers remain. This is easy to do, given that the orientation of the hand is known: cropping
the left part of the picture (including the thumb) leaves only the fingers in the picture.
In such cases, the number of fingers is the number of blocks in the cropped picture, plus
1, because the thumb has to be counted even though it has been cropped.
This method offers a huge advantage: its simplicity. Indeed, no heavy computation or special
treatment is required; the only operation to perform is to count the number of blocks in the
cropped image. The Matlab function bwlabel, from the Image Processing Toolbox, makes the
coding very easy.
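The block counting idea can be sketched as follows. This is a Python illustration only, not the project's Matlab code: the labelling is done with a hand-written 8-connected flood fill standing in for bwlabel, and the names `count_blocks`, `count_fingers_by_blocks` and the cropping column are hypothetical.

```python
from collections import deque


def count_blocks(image):
    """Count 8-connected blocks of pixels set to 1 in a binary picture."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blocks = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            blocks += 1  # a new, not yet visited block starts here
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # flood fill over the 8 neighbours
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return blocks


def count_fingers_by_blocks(image, first_finger_column):
    """Crop the left part (palm and thumb), count the blocks, add 1 for the thumb."""
    cropped = [row[first_finger_column:] for row in image]
    return count_blocks(cropped) + 1


# Synthetic 30 x 30 hand: palm in columns 0-14, three fingers drawn as bars.
hand = [[0] * 30 for _ in range(30)]
for r in range(5, 25):
    for c in range(15):
        hand[r][c] = 1
for top in (5, 11, 17):
    for r in range(top, top + 3):
        for c in range(15, 28):
            hand[r][c] = 1

fingers = count_fingers_by_blocks(hand, first_finger_column=15)
```

Before cropping, the whole hand forms one single block; after cropping, each finger becomes its own block, which is why the palm has to be removed first.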
However, this method also has some major drawbacks. Indeed, the re-sizing operation can
turn two well-separated fingers into two joined fingers that look like one single big finger,
which causes an error in the evaluation of the number of fingers.
If the user wants to avoid such problems, he has to open the hand widely, so that no
confusion is possible. The problem is that if the user opens the hand too widely, the index
finger or the little finger (the fifth one, called the atrial finger here) may not be present
in the last columns of the picture. So the user has to open the hand widely, but not too much,
and he may need time to find the best opening for each of the different signs he wants to make.
And even supposing that he succeeds in doing so, another phenomenon occurs:
If the user opens the hand just enough for the sign he makes, some noisy pixels that remain
despite the noise removal may join two fingers. The function bwlabel will then consider that
they are just one block, which implies an error in the estimation of the number of fingers.
This method is interesting and efficient considering its low level of complexity and its
simple coding. However, there is room to improve it, because the error rate can be reduced.
With this method, around 70-75 percent of the allowed signs (that is, those that include the
thumb) are successfully classified.
Weighted Averaging Analysis
In order to understand the basic idea that is discussed here, let’s consider the differences
and the common points between the methods that have already been introduced:
- The Pixel Counting method and the Edges Counting method were very simple solutions, but
their problem was that they were not efficient enough. Their advantage was their low
implementation complexity, given that they were geometrical solutions.
- The Neural Networks solution has proven considerably more efficient, but it requires
training, and special management and processing of the binary picture. Moreover, when
looking at the weights of the input layer, it appears that the neural network just realizes a
kind of weighted averaging.
Hence, the motivation in this section is to try to realize weighted averaging in a simpler
way.
Choosing the weights
In this section, the explanations will refer to the following picture, which has already
been introduced in the section “Edge Counting Analysis”. This picture was an example that leads
to a classification error:
First of all, let’s suppose no weights are used, that is, the weights are all set to the same
value, one for example. When averaging the pixel values, all the pixels then have the same
importance. Given that the left part of the picture is not relevant for computing the number of
fingers (except for the thumb, all the fingers are in the right part of the picture), the only
columns that will be considered are, for example, columns 15 to 25.
It has been shown previously that edge counting alone in this area is not efficient in this
case, and that counting only the number of pixels set to 1 may lead to incoherent results,
given that the relative dimensions of a finger depend on the user and that the following
picture will lead to ‘4’.
One solution is to mix these two methods, that is, to realize a weighted averaging in which
the weight of each pixel set to 1 is half the number of edges in the column of the considered
pixel. For example, according to the picture provided at the beginning of this section, the
pixel at line 19, column 16 is set to one and its weight is 6, given that there are 12 edges in
column 16.
A quick approximate calculation leads to the following results:
• If there is only the thumb in the picture, no pixel will be set to 1 in the columns 15
up to 25, and the weighted averaging will lead to 0.
• If there are the thumb and one finger in the picture, about 60 to 100 pixels will be set to
1 in the columns 15 to 25, and the number of edges in this area should be 1. So the
weighted averaging should lead to values from 60 to 100.
• If there are the thumb and two fingers in the picture, about 2*60 to 2*100 pixels will be
set to 1 in the columns 15 to 25, and the number of edges in this area should be 2. So the
weighted averaging should lead to values from 2*60*2 to 2*100*2, that is, 240 to 400.
• If there are the thumb and three fingers in the picture, about 3*60 to 3*100 pixels will be
set to 1 in the columns 15 to 25, and the number of edges in this area should be 3. So the
weighted averaging should lead to values from 3*60*3 to 3*100*3, that is, 540 to 900.
• If there are the thumb and four fingers in the picture, about 4*60 to 4*100 pixels will be
set to 1 in the columns 15 to 25, and the number of edges in this area should be 4. So the
weighted averaging should lead to values from 4*60*4 to 4*100*4, that is, 960 to 1600.
According to these values, let’s create the following bounds:
• Bound between 1 and 2 fingers: \( \frac{0 + 60}{2} = 30 \)
• Bound between 2 and 3 fingers: \( \frac{100 + 240}{2} = 170 \)
• Bound between 3 and 4 fingers: \( \frac{400 + 540}{2} = 470 \)
• Bound between 4 and 5 fingers: \( \frac{900 + 960}{2} = 930 \)
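The arithmetic behind these bounds can be checked with a few lines of Python. This is only a sketch: the helper name `typical_range` is hypothetical, and the 60 to 100 pixels per finger are the approximate figures used in the text above.

```python
def typical_range(fingers_besides_thumb, min_pixels=60, max_pixels=100):
    """Expected WA range: n fingers give n * pixels, weighted again by n."""
    n = fingers_besides_thumb
    return (n * min_pixels * n, n * max_pixels * n)


# Thumb only, thumb + 1 finger, ..., thumb + 4 fingers.
ranges = [typical_range(n) for n in range(5)]

# Each bound is the midpoint between the top of one range and the bottom of the next.
bounds = [(ranges[i][1] + ranges[i + 1][0]) // 2 for i in range(4)]
```

With these figures, `ranges` reproduces the intervals 0, 60-100, 240-400, 540-900 and 960-1600, and `bounds` the four midpoints listed above.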
That is to say that the algorithm has to perform the following operations:
1) Calculate
\[
WA = \sum_{\text{column}=15}^{25} \; \sum_{\text{line}=1}^{30}
\Big[ \text{number\_edges}(\text{column}) \times \text{pixel}(\text{line}, \text{column}) \Big]
\]
2) Estimate the number of fingers in the picture using:
\[
\text{Number\_of\_fingers} =
\begin{cases}
1 & \text{if } WA < 30 \\
2 & \text{if } 30 < WA < 170 \\
3 & \text{if } 170 < WA < 470 \\
4 & \text{if } 470 < WA < 930 \\
5 & \text{if } 930 < WA
\end{cases}
\]
With these weights, the typical value of WA behaves as:
\[
WA \approx 100 \times \left( \text{Number\_of\_edges} \right)^{2}
\]
The consequence is that the distance between typical WA values (values of the weighted
averaging) grows quadratically with the number of fingers, and that makes the classification
less sensitive to errors. Indeed, in this case, the gap between two neighbouring possibilities
is always large: for example, the typical WA for 5 fingers is (960+1600)/2=1280. An error can
occur only if the calculated WA, which should be 1280, falls under 930, so the calculation
error has to be bigger than 350. This can happen only if there are a lot of errors on the
number of edges in each column and if the relative dimensions of the fingers are “strange”:
one finger very thick, three fingers very thin, and the thumb.
In order to understand the efficiency of this method, let’s compare it to the bound that
would have been considered in a simple pixel counting algorithm: for four fingers, the sum of
the pixels would be about 3*60=180, and for five fingers, it would be equal to 4*60=240. The
bound between 4 and 5 fingers would be (180+240)/2=210. An error on five fingers happens when
fewer than 210 pixels are counted in the columns 15 to 25. The margin is:
240-210=30.
When comparing the error margins, it appears that without any weights the margin is equal
to 30, and that with weights chosen as the number of edges in the column of the analyzed pixel,
this margin is about 350, more than 10 times the previous margin! That is why this method is
considerably better than the simple pixel counting one: different numbers of fingers lead to
different ranges that are separated by very large gaps that only huge errors can cross, and
such errors are not very frequent.
Without weights, confusion may occur when several fingers are exhibited (three, four or
five fingers). The use of weights makes this confusion much rarer, because pictures with
three, four and five fingers turn into WA values that are very distant from one another.
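The weighted averaging procedure of this section can be sketched as follows. This is a hedged Python illustration rather than the project's Matlab implementation: the weight of a column is taken here as the number of runs of 1s in that column (each run has two edges, so this equals half the number of edges), pixels outside the picture are assumed to be 0, and the function names and the synthetic test picture are hypothetical.

```python
import bisect

BOUNDS = [30, 170, 470, 930]  # bounds between 1|2, 2|3, 3|4 and 4|5 fingers


def column_weight(image, column):
    """Number of runs of 1s in a column, i.e. half the number of 0/1 edges."""
    runs, previous = 0, 0
    for row in image:
        if row[column] == 1 and previous == 0:
            runs += 1
        previous = row[column]
    return runs


def weighted_average(image, first_column=15, last_column=25):
    """Sum of the pixels of columns 15..25 (1-based), each weighted by its column."""
    total = 0
    for column in range(first_column - 1, last_column):  # convert to 0-based
        total += column_weight(image, column) * sum(row[column] for row in image)
    return total


def count_fingers(image):
    """Number of fingers, thumb included, from the WA value and the bounds."""
    return bisect.bisect_right(BOUNDS, weighted_average(image)) + 1


# Synthetic 30 x 30 picture: palm on the left, three fingers of width 5.
hand = [[0] * 30 for _ in range(30)]
for r in range(2, 23):
    for c in range(14):
        hand[r][c] = 1
for top in (2, 10, 18):
    for r in range(top, top + 5):
        for c in range(14, 28):
            hand[r][c] = 1

wa = weighted_average(hand)
fingers = count_fingers(hand)
```

In this synthetic picture every analysed column crosses three fingers of 5 pixels each, so each column contributes 3 × 15 = 45 and the eleven columns give WA = 495, which falls between 470 and 930: four fingers, thumb included.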
Matlab Operations
Building GUI interfaces in Matlab
This example shows how to build a GUI in Matlab.
Start the GUI builder by typing
>>guide

Select "Blank GUI", click OK
The GUI window will open

Resize the design window.
Using the palette on the left, drag and drop, resize and position the canvas, buttons, and
static text windows.

Double-click on an object to open the properties dialog. Change the captions on the buttons
and remove the "Static Text" string from the text window. Set the font size to 30 for the text
windows and change the horizontal alignment to "right."


The GUI is finished. Save the work.
The rest of the design process will take care of the functionality provided by each GUI component
Neural Network Toolbox
The Neural Network Toolbox extends MATLAB with tools for designing, implementing, visualizing,
and simulating neural networks. Neural networks are invaluable for applications where formal
analysis would be difficult or impossible, such as pattern recognition and nonlinear system
identification and control. Neural Network Toolbox software provides comprehensive support for
many proven network paradigms, as well as graphical user interfaces (GUIs) that enable you to
design and manage your networks. The modular, open, and extensible design of the toolbox
simplifies the creation of customized functions and networks.
Neural Network Toolbox GUIs make it easy to work with neural networks. The Neural
Network Fitting Tool is a wizard that leads you through the process of fitting data using
neural networks. You can use the tool to import large and complex data sets, quickly
create and train networks, and evaluate network performance.
Key features
• GUI for creating, training, and simulating neural networks
• Support for the most commonly used supervised and unsupervised network architectures
• Comprehensive set of training and learning functions
• Dynamic learning networks, including time delay, nonlinear autoregressive (NARX),
layer-recurrent, and custom dynamic
• Simulink blocks for building neural networks and advanced blocks for control systems
applications
• Support for automatically generating Simulink blocks from neural network objects
• Preprocessing and postprocessing functions and Simulink blocks for improving network
training and assessing network performance
Network Architectures
Neural network toolbox supports both supervised and unsupervised networks.
Supervised Networks
Supervised neural networks are trained to produce desired outputs in response to
sample inputs, making them particularly well suited to modeling and controlling dynamic
systems, classifying noisy data, and predicting future events.
Neural Network Toolbox supports four supervised networks: feedforward, radial basis, dynamic,
and learning vector quantization (LVQ).
Feedforward networks have one-way connections from input to output layers. They are most
commonly used for prediction, pattern recognition, and nonlinear function fitting. Supported
feedforward networks include feedforward backpropagation, cascade-forward backpropagation,
feedforward input-delay backpropagation, linear, and perceptron networks.
Radial basis networks provide an alternative, fast method for designing nonlinear feedforward
networks. Supported variations include generalized regression and probabilistic neural
networks.
Dynamic networks use memory and recurrent feedback connections to recognize spatial and
temporal patterns in data. They are commonly used for time-series prediction, nonlinear dynamic
system modeling, and control system applications. Prebuilt dynamic networks in the toolbox
include focused and distributed time-delay, nonlinear autoregressive (NARX), layer-recurrent,
Elman, and Hopfield networks. The toolbox also supports dynamic training of custom networks
with arbitrary connections.
LVQ is a powerful method for classifying patterns that are not linearly separable. LVQ lets you
specify class boundaries and the granularity of classification.
Unsupervised Networks
Unsupervised neural networks are trained by letting the network continually adjust itself
to new inputs. They find relationships within data and can automatically define classification
schemes.
Neural Network Toolbox supports two types of self-organizing, unsupervised networks:
competitive layers and self-organizing maps.
Competitive layers recognize and group similar input vectors. By using these groups, the
network automatically sorts the inputs into categories.
Training and Learning Functions
Training and learning functions are mathematical procedures used to automatically adjust the
network’s weights and biases. The training function dictates a global algorithm that affects all the
weights and biases of a given network. The learning function can be applied to individual weights
and biases within a network.
Neuron Model

Simple Neuron
A neuron with a single scalar input and no bias is shown on the left below.

Figure : Neuron
The scalar input p is transmitted through a connection that multiplies its strength by the
scalar weight w, to form the product wp, again a scalar. Here the weighted input wp is the
only argument of the transfer function f, which produces the scalar output a. The neuron
on the right has a scalar bias, b. You may view the bias as simply being added to the
product wp as shown by the summing junction or as shifting the function f to the left by
an amount b. The bias is much like a weight, except that it has a constant input of 1. The
transfer function net input n, again a scalar, is the sum of the weighted input wp and the
bias b. This sum is the argument of the transfer function f. Here f is a transfer function,
typically a step function or a sigmoid function, that takes the argument n and produces
the output a. Examples of various transfer functions are given in the next section. Note
that w and b are both adjustable scalar parameters of the neuron. The central idea of
neural networks is that such parameters can be adjusted so that the network exhibits some
desired or interesting behavior.
Thus, we can train the network to do a particular job by adjusting the weight or bias
parameters, or perhaps the network itself will adjust these parameters to achieve some
desired end. All of the neurons in the program written in MATLAB have a bias.
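The single-input neuron described above, a = f(wp + b), can be sketched in a few lines. This is a Python illustration only; the two transfer functions are a step function and a sigmoid, named here after the toolbox functions hardlim and logsig, and the numeric values are arbitrary.

```python
import math


def hardlim(n):
    """Step transfer function: 1 when the net input is >= 0, otherwise 0."""
    return 1.0 if n >= 0 else 0.0


def logsig(n):
    """Sigmoid (logistic) transfer function."""
    return 1.0 / (1.0 + math.exp(-n))


def neuron(p, w, b=0.0, f=logsig):
    """Scalar neuron: net input n = w*p + b, output a = f(n)."""
    n = w * p + b
    return f(n)


# Usage with arbitrary values: w*p + b = 0.5*2.0 - 1.0 = 0.
a_sigmoid = neuron(2.0, 0.5, b=-1.0)
a_step = neuron(2.0, 0.5, b=-1.0, f=hardlim)
```

Training consists of adjusting `w` and `b` so that the output `a` moves toward the desired value for each input `p`.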
Feed forward Neural Networks
Feed forward neural networks (FF networks) are the most popular and most widely used models
in many practical applications. They are known by many different names, such as "multi-layer
perceptrons."
The figure illustrates a one-hidden-layer FF network with inputs \( x_1, \dots, x_n \) and
output \( \hat{y} \). Each arrow in the figure symbolizes a parameter in the network. The
network is divided into layers. The input layer consists of just the inputs to the network.
Then follows a hidden layer, which consists of any number of neurons, or hidden units, placed
in parallel. Each neuron performs a weighted summation of the inputs, which then passes
through a nonlinear activation function \( \sigma \), also called the neuron function.
[Figure: a feedforward network with one hidden layer and one output.]
Mathematically the functionality of a hidden neuron is described by
\[
\sigma\Big( \sum_{j=1}^{n} w_j x_j + b \Big)
\]
where the weights \( \{ w_j, b \} \) are symbolized with the arrows feeding into the neuron.
The network output is formed by another weighted summation of the outputs of the neurons in
the hidden layer. This summation on the output is called the output layer. In the figure there
is only one output in the output layer since it is a single-output problem. Generally, the
number of output neurons equals the number of outputs of the approximation problem.
The output of this network is given by
\[
\hat{y}(\theta) = \sum_{i=1}^{n_h} w^{2}_{i} \, \sigma\Big( \sum_{j=1}^{n} w^{1}_{i,j} x_j + b^{1}_{i} \Big) + b^{2}
\]
where n is the number of inputs and nh is the number of neurons in the hidden layer. The
variables \( \{ w^{1}_{i,j}, b^{1}_{i}, w^{2}_{i}, b^{2} \} \) (the superscripts are layer
indices, not powers) are the parameters of the network model that are represented collectively
by the parameter vector \( \theta \).
Note that the sizes of the input and output layers are defined by the number of inputs and
outputs of the network and, therefore, only the number of hidden neurons has to be specified
when the network is defined.
In training the network, its parameters are adjusted incrementally until the training data
satisfy the desired mapping as well as possible; that is, until \( \hat{y}(\theta) \) matches
the desired output y as closely as possible, up to a maximum number of iterations.
The FF network in the figure is just one possible architecture of an FF network. You can
modify the architecture in various ways by changing the options. For example, you can change
the activation function to any differentiable function you want.
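The forward pass of such a network (a hidden layer of weighted sums passed through an activation function, followed by a weighted sum at the output) can be sketched as below. This is a Python illustration under stated assumptions: the sigmoid is chosen as the activation function, the parameter names `w1`, `b1`, `w2`, `b2` are hypothetical, and the numeric values are arbitrary rather than trained.

```python
import math


def sigmoid(z):
    """Logistic activation function (an assumed choice; any differentiable one works)."""
    return 1.0 / (1.0 + math.exp(-z))


def ff_output(x, w1, b1, w2, b2, activation=sigmoid):
    """Output of a one-hidden-layer feedforward network with a single output.

    x  : list of n inputs
    w1 : one list of n weights per hidden neuron
    b1 : one bias per hidden neuron
    w2 : one output weight per hidden neuron
    b2 : output bias
    """
    hidden = [
        activation(sum(w * xj for w, xj in zip(weights, x)) + bias)
        for weights, bias in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


# Usage: 2 inputs, 2 hidden neurons, arbitrary (untrained) parameters.
y_hat = ff_output(
    x=[1.0, 2.0],
    w1=[[0.0, 0.0], [0.0, 0.0]],
    b1=[0.0, 0.0],
    w2=[1.0, 1.0],
    b2=0.5,
)
```

Only the number of hidden neurons (the length of `b1`) is a free design choice; the lengths of `x` and of the output are fixed by the problem.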
Advantages of Neural Computing
There are a variety of benefits that an analyst realizes from using neural networks in their
work.
Pattern recognition is a powerful technique for harnessing the information in
the data and generalizing about it. Neural nets learn to recognize the patterns
which exist in the data set.
The system is developed through learning rather than programming.
Programming is much more time consuming for the analyst and requires the
analyst to specify the exact behavior of the model. Neural nets teach
themselves the patterns in the data freeing the analyst for more interesting
work.
Neural networks are flexible in a changing environment. Rule based systems
or programmed systems are limited to the situation for which they were
designed--when conditions change, they are no longer valid. Although neural
networks may take some time to learn a sudden drastic change, they are
excellent at adapting to constantly changing information.
Neural networks can build informative models where more conventional
approaches fail. Because neural networks can handle very complex
interactions they can easily model data which is too difficult to model with
traditional approaches such as inferential statistics or programming logic.
Performance of neural networks is at least as good as classical statistical
modeling, and better on most problems. The neural networks build models
that are more reflective of the structure of the data in significantly less time.
Limitations of Neural Computing
There are some limitations to neural computing. The key limitation is the neural
network's inability to explain the model it has built in a useful way. Analysts often want
to know why the model is behaving as it is. Neural networks get better answers but they
have a hard time explaining how they got there.
There are a few other limitations that should be understood. First, it is difficult to extract
rules from neural networks. This is sometimes important to people who have to explain
their answer to others and to people who have been involved with artificial intelligence,
particularly expert systems which are rule-based.
As with most analytical methods, you cannot just throw data at a neural net and get a
good answer. You have to spend time understanding the problem or the outcome you are
trying to predict. And, you must be sure that the data used to train the system are
appropriate and are measured in a way that reflects the behavior of the factors. If the data
are not representative of the problem, neural computing will not produce good results.
This is a classic situation where "garbage in" will certainly produce "garbage out."
Finally, it can take time to train a model from a very complex data set. Neural techniques
are computer intensive and will be slow on low end PCs or machines without math
coprocessors. It is important to remember though that the overall time to results can still
be faster than other data analysis approaches, even when the system takes longer to train.
Processing speed alone is not the only factor in performance, and neural networks do not
require the time spent programming, debugging, or testing assumptions that other analytical
approaches do.
MICROCONTROLLER AND ROBOT
Power Supply
A 12 V DC supply is provided directly and is also converted into a 5 V DC supply: 12 V is
required for driving the motors and 5 V for the microcontroller assembly.
The 12 V is converted into 5 V with the help of a 7805 regulator and capacitor combination.
Microcontroller(8051)
A microcontroller has a CPU together with a fixed amount of RAM, ROM, I/O ports, and timers,
all embedded on one chip. Microcontrollers are used in embedded systems. We have used the
80C51 8-bit Flash microcontroller family device AT89C5124PI, with 64 kB of Flash memory and
1 kB of RAM. The 89C5124PI device contains a non-volatile 64 kB Flash program memory that is
both parallel programmable and serial In-System and In-Application Programmable. In-System
Programming (ISP) allows the user to download new code while the microcontroller sits in the
application.
In-Application Programming (IAP) means that the microcontroller fetches new program code
and reprograms itself while in the system. This allows for remote programming over a modem
link. A default serial loader (boot loader) program in ROM allows serial In-System programming
of the Flash memory via the UART without the need for a loader in the Flash code. For In-
Application Programming, the user program erases and reprograms the Flash memory by use of
standard routines contained in ROM.
This device is a single-chip 8-bit microcontroller manufactured in an advanced CMOS process
and is a derivative of the 80C51 microcontroller family. The instruction set is 100% compatible
with the 80C51 instruction set. The device also has four 8-bit I/O ports, three 16-bit
timer/event counters, a multi-source, four-priority-level, nested interrupt structure, an
enhanced UART and on-chip oscillator and timing circuits.

The added features of the AT89C5124PI make it a powerful microcontroller for applications
that require pulse width modulation, high-speed I/O and up/down counting capabilities such as
motor control.
Features :-
a) 80C51 Central Processing Unit
b) On-chip Flash Program Memory with In-System Programming(ISP) and In-Application
Programming (IAP) capability
c) Boot ROM contains low level Flash programming routines for downloading via the UART
d) Can be programmed by the end-user application (IAP)
e) 6 clocks per machine cycle operation (standard)
f) 12 clocks per machine cycle operation (optional)
g) Speed up to 20 MHz with 6 clock cycles per machine cycle (40 MHz equivalent
performance); up to 33 MHz with 12 clocks per machine cycle
h) Fully static operation
i) RAM expandable externally to 64 kB
j) 4 level priority interrupt
k) 8 interrupt sources
l) Four 8-bit I/O ports
m) Full-duplex enhanced UART
n) Framing error detection
o) Automatic address recognition
p) Power control modes
Clock can be stopped and resumed
– Idle mode
– Power down mode
q) Programmable clock out
r) Second DPTR register
s) Asynchronous port reset
t) Low EMI (inhibit ALE)
u) Programmable Counter Array (PCA)
--- PWM
---Capture/Compare
PIN DESCRIPTION :
a) Ground: 0 V reference.
b) Power Supply(Vcc): This is the power supply voltage for normal, idle, and power- down
operation.
c) Port 0(8 I/O pins from 39-32):
Port 0 is an open-drain, bidirectional I/O port. Port 0 pins that have 1s written to them float
and can be used as high-impedance inputs. Port 0 is also the multiplexed low-order address and
data bus during accesses to external program and data memory. In this application, it uses
strong internal pull-ups when emitting 1s.
d) Port 1(8 I/O numbered 1-8):
Port 1 is an 8-bit bidirectional I/O port with internal pull-ups on all pins except P1.6 and
P1.7, which are open drain. Port 1 pins that have 1s written to them are pulled high by the
internal pull-ups and can be used as inputs. As inputs, port 1 pins that are externally pulled
low will source current because of the internal pull-ups.
Alternate functions for 89C51RB2/RC2/RD2 Port 1 include:
1) T2 (P1.0): Timer/Counter 2 external count input/Clockout
2) T2EX (P1.1): Timer/Counter 2 Reload/Capture/Direction Control
3) ECI (P1.2): External Clock Input to the PCA
4) CEX0 (P1.3): Capture/Compare External I/O for PCA module 0
5) CEX1 (P1.4): Capture/Compare External I/O for PCA module 1
6) CEX2 (P1.5): Capture/Compare External I/O for PCA module 2
7) CEX3 (P1.6): Capture/Compare External I/O for PCA module 3
8) CEX4 (P1.7): Capture/Compare External I/O for PCA module 4
e) Port 2(21-28):
Port 2 is an 8-bit bidirectional I/O port with internal pull-ups. Port 2 pins that have 1s
written to them are pulled high by the internal pull-ups and can be used as inputs. As inputs,
port 2 pins that are externally being pulled low will source current because of the internal
pull-ups. Port 2 emits the high-order address byte during fetches from external program memory
and during accesses to external data memory that use 16-bit addresses (MOVX @DPTR).
f) Port 3(10-17):
Port 3 is an 8-bit bidirectional I/O port with internal pull-ups. Port 3 pins that have 1s
written to them are pulled high by the internal pull-ups and can be used as inputs. As inputs,
port 3 pins that are externally being pulled low will source current because of the pull-ups.
Port 3 also serves the special features of the 89C51RB2/RC2/RD2, as listed below:
I. RxD (P3.0): Serial input port
II. TxD (P3.1): Serial output port
III. INT0 (P3.2): External interrupt
IV. INT1 (P3.3): External interrupt
V. T0 (P3.4): Timer 0 external input
VI. T1 (P3.5): Timer 1 external input
VII. WR (P3.6): External data memory write strobe
VIII. RD (P3.7): External data memory read strobe
g) RST Reset(pin 9): A high on this pin for two machine cycles while the
oscillator is running, resets the device. An internal diffused resistor to
VSS permits a power-on reset using only an external capacitor to VCC.
h) ALE (Address Latch Enable, pin 30): Output pulse for latching the low
byte of the address during an access to external memory. In normal
operation, ALE is emitted twice every machine cycle, and can be used
for external timing or clocking. Note that one ALE pulse is skipped
during each access to external data memory. ALE can be disabled by
setting SFR auxiliary.0. With this bit set, ALE will be active only during
a MOVX instruction.
i) PSEN (Program Store Enable, pin 29): The read strobe to external
program memory. When executing code from the external program
memory, PSEN is activated twice each machine cycle, except that two
PSEN activations are skipped during each access to external data
memory. PSEN is not activated during fetches from internal program
memory.
j) EA/VPP(External Access Enable/Programming Supply Voltage, pin 31):
EA must be externally held low to enable the device to fetch code
from external program memory locations. If EA is held high, the device
executes from internal program memory. The value on the EA pin is
latched when RST is released and any subsequent changes have no
effect. This pin also receives the programming supply voltage (VPP)
during Flash programming.
k) XTAL1 and XTAL2(pin 19 & 18): Input & output respectively to the
inverting oscillator amplifier and input to the internal clock generator
circuits.
To avoid “latch-up” effect at power-on, the voltage on any pin (other than
VPP) must not be higher than VCC + 0.5 V or less than VSS – 0.5 V.
Motor Driver(ULN2004A)
The ULN2004A is a high-voltage, high-current Darlington array containing seven open-collector
Darlington pairs with common emitters. Each channel is rated at 500 mA and can withstand
peak currents of 600 mA. Suppression diodes are included for driving inductive loads, and the inputs
are pinned opposite the outputs to simplify board layout.
These versatile devices are useful for driving a wide range of loads, including solenoids, relays, DC
motors, LED displays, filament lamps, thermal print-heads and high-power buffers.
The maximum output voltage is 50 V.
The ULN2004A is supplied in a 16-pin plastic DIP package with a copper lead frame to reduce thermal
resistance.

Robot
The robot is a two-wheeled robot with a castor wheel provided for support. A ULN2004A IC is
used for driving the motors, which are stepper motors. As the name suggests, stepper
motors do not spin freely like DC motors; they rotate in discrete steps, under the command of a
controller. This makes them easier to control, as the controller knows exactly how far they
have rotated without having to use a sensor. They are therefore used on many robots. The stepper
motor used is a unipolar motor and hence has six wires coming out of it. Four of them
receive the drive signals from the microcontroller, while the other two are shorted together and
connected to the 12 V DC supply.
To move the motor, its alternate windings are excited continuously by the
assembly code.
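The excitation order described above can be sketched as follows. This is only an illustration in Python, not the report's 8051 assembly code: the wave-drive pattern (one winding energised at a time), the bit order and the name `step_patterns` are assumptions made for the example.

```python
from itertools import cycle, islice

# Hypothetical wave-drive sequence for a unipolar stepper motor: each 4-bit
# pattern energises exactly one of the four winding inputs. The bit order and
# its mapping to microcontroller port pins are assumptions, not from the report.
WAVE_DRIVE = (0b0001, 0b0010, 0b0100, 0b1000)


def step_patterns(steps, forward=True):
    """Return the winding patterns to output, one per step."""
    sequence = WAVE_DRIVE if forward else tuple(reversed(WAVE_DRIVE))
    return list(islice(cycle(sequence), steps))


for pattern in step_patterns(6):
    print(format(pattern, "04b"))
```

Reversing the order of the patterns reverses the direction of rotation, which is why the direction is a parameter of the sketch.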

Stepwise procedure/ flow:
1) Input pattern to be recognized
2) Sampling
3) Generation of templates
4) Template matching with input pattern
5) Best match
6) Recognized pattern
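The template-matching and best-match steps of this flow can be illustrated with a minimal sketch. The similarity measure (fraction of agreeing positions), the binary feature vectors and the names `similarity`, `best_match` and `TEMPLATES` are hypothetical choices for the example, not the report's implementation.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length binary vectors agree."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def best_match(sample, templates):
    """Return (label, score) of the stored template most similar to the sample."""
    scored = ((label, similarity(sample, t)) for label, t in templates.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical templates: one short binary feature vector per gesture.
TEMPLATES = {
    "one": [1, 0, 0, 0, 0],
    "two": [1, 1, 0, 0, 0],
    "five": [1, 1, 1, 1, 1],
}

label, score = best_match([1, 1, 0, 0, 1], TEMPLATES)
print(label, score)
```

The recognized pattern is simply the label of the template with the highest score; a real system would also reject samples whose best score is too low.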

TIME ACTIVITY CHART:
[Figure: time-activity chart. Vertical axis: activities 1 to 5. Horizontal axis: months, with ticks at 3, 4, 7, 10 and 12.]
Activities:
1- Literature review
2- Selection of the application and deciding the specifications of the
equipment required for it
3- Making an experimental set-up
4- Conducting trials, plotting results and drawing conclusions
5- Preparation of the report
CONCLUSION
We proposed a fast and simple algorithm for a hand gesture recognition problem. Given
observed images of the hand, the algorithm segments the hand region, and then makes an
inference on the activity of the fingers involved in the gesture. We have demonstrated the
effectiveness of this computationally efficient algorithm on real images we have acquired. Based
on our motivating robot control application, we have only considered a limited number of
gestures. Our algorithm can be extended in a number of ways to recognize a broader set of
gestures. The segmentation portion of our algorithm is too simple, and would need to be
improved if this technique were to be used in challenging operating conditions. However,
we should note that the segmentation problem in a general setting is itself an open research
problem. Reliable performance of hand gesture recognition techniques in a general setting requires
dealing with occlusions, temporal tracking for recognizing dynamic gestures, as well as 3D
modeling of the hand, which are still mostly beyond the current state of the art.

FUTURE SCOPE
Even with limited processing power, it will be possible to design very efficient algorithms in
order to
• Track people,
• (Re-)identify them
• Understand their (static) gestures
• Control a robot
Our software has been designed to be reusable, and many more complex behaviors may
be added to our work. Because we limited ourselves to low processing power, our system could
easily be made faster by using a state-of-the-art processor. The use of a real embedded
OS could improve our system in terms of speed and stability. In addition, implementing more
sensor modalities would improve robustness even in very complex scenes. Our system has
shown that interaction with machines through gestures is feasible, and the
set of detected gestures could be extended to more commands by implementing a more complex
model of a human being. In the future, service robots executing many different tasks, from
housemaid work to nuclear power plant services, might arise and become a part of everyday
life as common as computers are nowadays.



ABSTRACT
Hand gesture recognition techniques have been studied for more than two decades. Several solutions have been developed; however, little attention has been paid to the human factors, e.g. the intuitiveness of the applied hand gestures. This study was inspired by the movie Minority Report, in which a gesture-based interface was presented to a large audience. In the movie, a video-browsing application was controlled by hand gestures. Nowadays the tracking of hand movements and the computer recognition of gestures is realizable; however, for a usable system it is essential to have an intuitive set of gestures. The system functions used in Minority Report were reverse engineered and a user study was conducted, in which participants were asked to express these functions by means of hand gestures. We were interested in how people formulate gestures and whether we could find any pattern in these gestures. In particular, we focused on the types of gestures in order to study intuitiveness, and on the kinetic features to discover how they influence computer recognition. We found that there are typical gestures for each function, and these are not necessarily related to the technology people are used to. This result suggests that an intuitive set of gestures can be designed, which is not only usable in this specific application, but can be generalized for other purposes as well. Furthermore, directions are given for computer recognition of gestures regarding the number of hands used and the dimensions of the space where the gestures are formulated.

INTRODUCTION

BRIEF DESCRIPTION
Several successful approaches to spatio-temporal signal processing, such as speech recognition and hand gesture recognition, have been proposed. Most of them involve time alignment, which requires substantial computation and considerable memory storage. In this paper, we present a neural-network-based approach to spatio-temporal pattern recognition. This approach employs a powerful method based on Hyper Rectangular Composite Neural Networks (HRCNNs) for selecting templates; therefore, the considerable memory requirement is alleviated.

Due to congenital malfunctions, diseases, head injuries, or virus infections, deaf or non-vocal individuals are unable to communicate with hearing persons through speech. They use sign language or hand gestures to express themselves; however, most hearing persons do not have the special sign language expertise. Hand gestures can be classified into two classes: (1) static hand gestures, which rely only on information about the angles of the fingers, and (2) dynamic hand gestures, which rely not only on the fingers' flex angles but also on the hand trajectories and orientations. The dynamic hand gestures can be further divided into two subclasses. The first subclass consists of hand gestures involving hand movements, and the second subclass consists of hand gestures involving fingers' movements but without changing the position of the hands. That is, it requires at least two different hand shapes connected sequentially to form a particular hand gesture. Therefore samples of these hand gestures are spatio-temporal patterns.

The basic idea of our method for recognizing these spatio-temporal hand gestures is as follows. We generate templates for each basic hand shape by training a Hyper Rectangular Composite Neural Network (HRCNN). Templates for each hand shape are then represented in the form of crisp IF-THEN rules, which are extracted from the values of synaptic weights of the corresponding trained HRCNN. The accumulated similarity associated with all samples of the input is computed for each hand gesture in the vocabulary, and the unknown gesture is classified as the gesture yielding the highest accumulated similarity.

A gesture is a form of non-verbal communication made with a part of the body and used instead of verbal communication (or in combination with it). Most people use gestures and body language in addition to words when they speak. A sign language is a language which uses gestures instead of sound to convey meaning, combining hand-shapes, orientation and movement of the hands, arms or body, facial expressions and lip-patterns.

Developing sign language applications for deaf people can be very important, as many of them, not being able to speak a language, are also not able to read or write a spoken language. Ideally, a translation system would make it possible to communicate with deaf people. Compared to speech commands, hand gestures are advantageous in noisy environments, in situations where speech commands would be disturbing, as well as for communicating quantitative information and spatial relationships. Similar to automatic speech recognition (ASR), we focus on gesture recognition which can later be translated to a certain machine movement.

The goal of this project is to develop a program implementing real time gesture recognition. At any time, a user can exhibit his hand doing a specific gesture in front of a video camera linked to a computer. However, the user cannot be assumed to be at exactly the same place when showing his hand. The program has to collect pictures of this gesture using the video camera, analyze them and identify the sign. It has to do so as fast as possible, given that real time processing is required. In order to lighten the project, it has been decided that the identification would consist in counting the number of fingers that are shown by the user in the input picture.

We propose a fast algorithm for automatically recognizing a limited set of gestures from hand images for a robot control application. Hand gesture recognition is a challenging problem in its general form. We consider a fixed set of manual commands and a reasonably structured environment, and develop a simple, yet effective, procedure for gesture recognition. Our approach contains steps for segmenting the hand region, locating the fingers and finally classifying the gesture. The algorithm is invariant to translation, rotation, and scale of the hand. We can even demonstrate the effectiveness of the technique on real imagery.

LITERATURE SURVEY
Objective: Our objective is to identify requirements (i.e., quality attributes and functional requirements) for Gesture Based Recognition. We especially focus on requirements for research tools that target the domains of visualization for software maintenance, reengineering, and reverse engineering.
Method: The requirements are identified with a comprehensive literature survey based on publications in journals, conference proceedings, and theses. We have referred to documents and journals available on the net for the same. Most of the data has been taken from the IEEE website. As our library has an online subscription to the IEEE journals, it provided immense help in locating the resources.
The various relevant journals referred to are:
1) Implementation of adaptive feed-forward algorithm by Jaroslaw Szewinski and Wojciech Jalmuzna, Warsaw University of Technology, Institute of Electronic Systems, Poland. This deals with the description of the various algorithms used in Neural Networks, viz.
• feed-forward (FF)
• feedback (FB)
• adaptive feed-forward (AFF)
2) Gesture Based Robot Control by V. S. Rao and C. Mahanta, Department of Electronics and Communication Engineering, Indian Institute of Technology, Guwahati. This journal deals with the past and recent developments in gesture recognition systems. It presents the work of different scientists in different parts of the globe working towards the same aim: a visual gesture recognition system for controlling robots.
3) A Fast Algorithm For Vision-Based Hand Gesture Recognition For Robot Control by Asanterabi Malima, Erol Özgür, and Müjdat Çetin, Faculty of Engineering and Natural Sciences, Sabancı University, Tuzla, İstanbul, Turkey. The approach contains steps for segmenting the hand region, locating the fingers, and finally classifying the gesture. The algorithm is invariant to translation, rotation, and scale of the hand.
4) A Gesture controlled robot for object perception and Manipulation by Mark Batcher, Institute of Neuroinformatics, Germany. Gripsee is the name of the robot whose design is discussed in the paper; it is used as a service robot. It serves as a multipurpose robot which can perform a number of tasks: it is used for identifying an object, grasping it, and moving it to a new position.
5) Programming-By-Example Gesture Recognition by Kevin Gabayan and Steven Lansel. Machine learning and hardware improvements to a programming-by-example rapid prototyping system are proposed. This paper deals with the dynamic time warping gesture recognition approach involving single signal channels.

SOFTWARE ENGINEERING APPROACH
For developing the code, it was preferable to use Matlab. Indeed, in this environment, image display, graphical analysis and image processing become simple coding tasks, because Matlab has a large and very complete “Image Processing Toolbox”, and the fact that Matlab is optimized for matrix-based calculation makes any image treatment easier, given that any image can be considered as a matrix. That’s why the whole code has been developed first in the Matlab environment.
Only the code of the Neural Network Method and of the Weighted Averaging Analysis method is provided. Indeed, given that the last one is a kind of combination of the Pixel Counting Method and of the Edge Counting Method, their respective codes may be extracted from the code of the Weighted Averaging Method and the whole algorithm.
For the movement of the robot, the program has been written in assembly language, since it is the most suitable and we are well acquainted with it. The IC used is the 8051 microcontroller; hence the code was written and tested in the RIDE software.

PROBLEM DEFINITION
The experimental setup consists of a digital camera used to take the images. The camera is interfaced to a computer, which is used to create the database and to analyze the images. The computer runs a program prepared in MATLAB for the various operations on the images. The analysis of the images is done using the Neural Network toolbox.
The initial step is to create the database of the images which are used for training and testing. The image database can have different formats. Images can be either hand drawn, digitized photographs or a three-dimensional hand. Photographs were used, as they are the most realistic approach. Two operations were carried out on all of the images: they were converted to grayscale and the background was made uniform. The images from internet databases already had uniform backgrounds, but the ones taken with the digital camera had to be processed in Photoshop.
The pattern recognition system that will be used consists of some transformation T, which converts an image into a feature vector, which will then be compared with the feature vectors of a training set of gestures.

DESIGN

HAND GESTURE RECOGNITION
Consider a robot navigation problem, in which a robot responds to the hand pose signs given by a human, visually observed by the robot through a camera. We are interested in an algorithm that enables the robot to identify a hand pose sign in the input image, as one of five possible commands (or counts). The identified command will then be used as a control input for the robot to perform a certain action or execute a certain task. For examples of the signs to be used in our algorithm, see Figure . The signs could be associated with various meanings depending on the function of the robot. For example, a “one” count could mean “move forward”, a “five” count could mean “stop”. Furthermore, “two”, “three”, and “four” counts could be interpreted as “reverse”, “turn right,” and “turn left.”
Figure : Set of hand gestures, or “counts”, considered in our work.
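The example assignment of counts to commands given above can be written as a small lookup table. This is a minimal sketch: the names `COMMANDS` and `command_for` are hypothetical, and falling back to “stop” for an unrecognized count is an assumption made for safety, not something stated in the report.

```python
# Count-to-command mapping taken from the example meanings in the text.
COMMANDS = {
    1: "move forward",
    2: "reverse",
    3: "turn right",
    4: "turn left",
    5: "stop",
}


def command_for(count):
    """Map a recognized finger count to a robot command.

    Assumption: any count outside 1..5 is treated as "stop".
    """
    return COMMANDS.get(count, "stop")


for count in (1, 3, 5, 0):
    print(count, "->", command_for(count))
```

Keeping the mapping in a table makes it easy to reassign meanings when the function of the robot changes.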

IMAGE DATABASE
The starting point of the project was the creation of a database with all the images that would be used for training and testing. The image database can have different formats. Images can be either hand drawn, digitized photographs or a three-dimensional hand. Photographs were used, as they are the most realistic approach.
Images came from two main sources: various ASL databases on the Internet and photographs we took with a digital camera. This meant that they have different sizes, different resolutions and sometimes almost completely different angles of shooting. Images belonging to the last case were very few, and they were discarded, as there was no chance of classifying them correctly. Two operations were carried out on all of the images: they were converted to grayscale and the background was made uniform. The internet databases already had uniform backgrounds, but the images we took with the digital camera had to be processed in Adobe Photoshop. Drawn images can still simulate translational variances with the help of an editing program (e.g. Adobe Photoshop).
The database itself was constantly changing throughout the project, as it was the database that would decide the robustness of the algorithm. Therefore, it had to be built in such a way that different situations could be tested and the thresholds above which the algorithm no longer classified correctly could be determined. The construction of such a database is clearly dependent on the application. If the application is, for example, a crane controller operated by the same person for long periods, the algorithm does not have to be robust to images of different persons. In this case noise and motion blur should be tolerable.

IMAGE PROCESSING
Image processing is any form of signal processing for which the input is an image, such as photographs or frames of video; the output of image processing can be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it.
Typical operations
Among many other image processing operations are:
• Geometric transformations such as enlargement, reduction, and rotation
• Color corrections such as brightness and contrast adjustments, quantization, or conversion to a different color space
• Digital compositing or optical compositing (combination of two or more images), used in filmmaking to make a "matte"
• Image editing (e.g., to increase the quality of a digital image)
• Image registration (alignment of two or more images), differencing and morphing
• Image segmentation
• Extending dynamic range by combining differently exposed images
• 2-D object recognition with affine invariance
Applications
• Computer vision
• Face detection
• Feature detection
• Lane departure warning system
• Non-photorealistic rendering
• Medical image processing
• Microscope image processing
• Morphological image processing
• Remote sensing

MATLAB

The name MATLAB stands for matrix laboratory. MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:
• Math and computation
• Algorithm development
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including Graphical User Interface building
MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or Fortran. MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-productivity research, development, and analysis.
The reason that we have decided to use MATLAB for the development of this project is its toolboxes. Toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. They include, among others, the image processing and neural network toolboxes.

NEURAL NETWORK

An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network. In more practical terms, neural networks are non-linear statistical data modeling or decision making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters. One classical type of artificial neural network is the Hopfield net. In a neural network model, simple nodes, which can be called variously "neurons", "neurodes", "Processing Elements" (PE) or "units", are connected together to form a network of nodes, hence the term "neural network". While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.
In modern software implementations of artificial neural networks, the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing. In some of these systems, neural networks, or parts of neural networks (such as artificial neurons), are used as components in larger systems that combine both adaptive and non-adaptive elements.

Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. We can train a neural network to perform a particular function by adjusting the values of the connections (weights) between elements. Commonly, neural networks are adjusted, or trained, so that a particular input leads to a specific target output. The network is adjusted, based on a comparison of the output and the target, until the network output matches the target.
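The idea of adjusting weights from a comparison of output and target can be illustrated with a single threshold neuron trained by the perceptron rule. This is a minimal sketch under stated assumptions: the logical AND task, the learning rate and the names `train_neuron` and `predict` are chosen for the example and are not the network or toolbox used in the report.

```python
def predict(weights, bias, inputs):
    """Output of a single threshold neuron: 1 if the weighted sum is positive."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_neuron(samples, rate=0.1, max_epochs=100):
    """Adjust weights until every output matches its target (perceptron rule).

    samples is a list of (inputs, target) pairs with targets 0 or 1.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # compare output and target
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:  # all outputs match their targets
            break
    return weights, bias


# Illustrative training set: the logical AND of two inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(AND_SAMPLES)
print([predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES])
```

The loop stops as soon as a full pass over the samples produces no error, which mirrors the "until the network output matches the target" criterion.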

Figure : Neural Net block diagram
Neural networks have been trained to perform complex functions in various fields of application including pattern recognition, identification, classification, speech, vision and control systems. Today neural networks can be trained to solve problems that are difficult for conventional computers or human beings. Supervised training methods are commonly used, but other networks can be obtained from unsupervised training techniques or from direct design methods. Unsupervised networks can be used, for instance, to identify groups of data. Certain kinds of linear networks and Hopfield networks are designed directly. In summary, there are a variety of kinds of design and learning techniques that enrich the choices that a user can make.
Applications
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.
Real life applications
The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
• Function approximation, or regression analysis, including time series prediction and modelling.
• Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
• Data processing, including filtering, clustering, blind signal separation and compression.
Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition, etc.), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.

BLOCK DIAGRAM
[Figure: block diagram of the system. Blocks: Pattern to recognize, Sampling, Generation of templates, Pattern recognition, Decision logic, Recognized pattern, 8051 microcontroller, Motor driver.]


DEFINING THE DIFFERENT ISSUES
Collecting the pictures
First of all, and obviously, it will be necessary to collect pictures. There is a choice to make concerning the way we collect these pictures, given that it depends on how we implement the main program. Running in the MATLAB environment requires the pictures to be saved in memory and called back when running the program, because the Image Acquisition Toolbox is not available in the MATLAB version used for the design of the program. So, to develop the body of the program, there are no real time constraints: it is possible to work on typical and representative pictures previously chosen and saved. The whole MATLAB program has been developed using such saved pictures. However, for real time processing, it will be necessary to implement the program in a C or C++ environment. That’s why it has been modified so that it can be used in real time C++ stand-alone functions. Then, assuming encoding in C++, the easiest way to collect pictures is to use VideoOCX, for example.
Finding the hand
Now, let’s suppose that a set of representative pictures is provided. We then need to analyze the picture and find its relevant part. Indeed, the user will never put his hand in the same area of the picture. Here are a few examples of the same sign made in different areas, which have to lead to the same identification result, which should be ‘2’:
Figure : Examples of allowed pictures
Analysis and identification
Then, the real work can start. Let’s suppose we have the relevant part of the image, which contains only the hand. How can we “guess” the type of sign? To make the problem easier, we can consider that we are interested only in the number of fingers exhibited by the user. So, we can sum up the problem: how can we count the number of fingers in a picture of a hand? There are plenty of ways to do it. There are some geometrical methods that solve the problem by counting the number of blocks within a picture, and some more sophisticated methods, such as neural networks or Laplacian filtering, which can lead to interesting results. In the following pages, the advantages and drawbacks of a few of them will be described.

It has already been explained that the position of the hand in the picture is not important. So it is possible to collect a picture that contains only the hand. Given that the background is known, it is possible to build a new picture that corresponds to the difference between the current picture of the hand and the background. The difference with the background can be computed using the Matlab function "imabsdiff". The resulting picture will be black almost everywhere except where the hand is, plus some noise.

Figure: Picture of the difference with the background

After that, to make all the preprocessing easier, it is better to create a binary picture. To do so, it is necessary to choose a threshold: pixels with a value lower than this threshold will be set to 0 (black) and the others will be set to 1. The choice of this threshold depends on the video camera properties: if we consider that the camera provides pixels coded on bytes, pixel values will range from 0 to 255. Some measurements have shown that in this case, the presence of the hand implies a variation of pixel values bigger than 20 units. Of course, the optimal threshold depends on the background; nevertheless, this threshold is correct in most cases.

After noise removal, zooming can then be easily realized by cropping the areas whose pixel values are close to 0. It is necessary to execute the noise-removal functions first, or else every noisy pixel whose value is too high may be considered as part of the hand and will be included in the zoom-in picture. For example, suppose that the hand is in one corner of the picture and that there is a noisy pixel in the opposite corner of the picture of the differences. The zooming function will keep that pixel, and the resulting picture, after zooming, will not be very different from the initial picture! That's why it is necessary to use noise-removal functions.

The noise removal is processed using the function bwmorph(open), which erodes and then dilates the noisy picture. In this way, lonely pixels disappear during the erosion, and the other elements are restored to their initial shape thanks to the following dilation. Here are a few examples of resulting pictures.

Figure: Background, Input picture, Binary picture
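The preprocessing chain described above (background difference, thresholding at 20, opening, then cropping) can be sketched in a few lines. This is only an illustrative Python sketch, not the report's Matlab code: the function names are hypothetical, images are plain lists of rows, and the 3x3 neighbourhood used for erosion and dilation is an assumption about what the opening operation does.

```python
def abs_diff(image, background):
    """Pixel-wise absolute difference of two equally sized grayscale images."""
    return [[abs(p - b) for p, b in zip(img_row, bg_row)]
            for img_row, bg_row in zip(image, background)]


def binarize(image, threshold=20):
    """Pixels below the threshold become 0, all others become 1."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def _window(img, r, c):
    """Values of the in-bounds 3x3 neighbourhood around (r, c)."""
    h, w = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, h))
            for j in range(max(c - 1, 0), min(c + 2, w))]


def erode(img):
    """A pixel stays 1 only if its whole in-bounds neighbourhood is 1."""
    return [[1 if all(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def dilate(img):
    """A pixel becomes 1 if any pixel of its neighbourhood is 1."""
    return [[1 if any(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes lonely pixels, restores shapes."""
    return dilate(erode(img))


def crop_to_hand(img):
    """Keep only the bounding box of the pixels set to 1 (the zoom-in step)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        return []
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]
```

With a 4x4 "hand" in one corner and a single noisy pixel in the opposite corner, cropping the raw binary picture keeps almost the whole frame, whereas cropping after `opening` returns just the 4x4 block, which is the effect the text describes.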

Standard Re-sizing

According to the requirements, the user is not supposed to be systematically at the same distance from the video camera. The consequences are obvious: if he is close to it, the hand will occupy a large part of the input picture. On the contrary, when he is far from it, the hand will appear rather small in the picture. In these conditions, the pictures of the hands after cropping may have very different sizes. That is to say, it is necessary to resize all the pictures to a standard size so that we can process them all the same way.

It seems evident that it is not useful to resize a picture to a size larger than the original one, given that it will not add information. Worse, it would be a serious drawback because it would increase the amount of calculation, which is contrary to the constraint of real-time processing. So, it is more interesting to reduce the size, but not too much. Indeed, in an excessively reduced picture, some fingers can disappear, and some spaces between two fingers may also disappear so that it seems there is only one finger. For these reasons, it has been decided that a size of 30x30 is small enough to make calculation fast, and large enough to avoid any major damage to the initial picture.

After a few tests and measurements, the average dimensions of a finger are:
width: 3~5 pixels
length: 15~20 pixels

Of course, different users will all have different hands, hence different absolute measurements. Nevertheless, such standard re-sizing provides relative measurements: even if the real sizes of the thumb and the ring finger depend on the user, the ratios will generally be constant. For almost all users:

$$\mathrm{Width}(thumb) \approx \mathrm{Width}(ring) \approx \dots \approx \mathrm{Width}(little)$$

$$\frac{\mathrm{Length}(ring)_{User1}}{\mathrm{Length}(little)_{User1}} \approx \frac{\mathrm{Length}(ring)_{User2}}{\mathrm{Length}(little)_{User2}} \approx \dots \approx Cst$$

That's why this re-sizing operation can be considered as a standardization process: for any user, the final re-sized image will have almost identical properties concerning the dimensions of its elements.
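The standardization effect can be illustrated with a small sketch. The report does not say which interpolation is used for the re-sizing, so nearest-neighbour sampling is assumed here, and `resize_nearest` is a hypothetical helper name.

```python
def resize_nearest(img, out_h=30, out_w=30):
    """Resize a binary picture (list of rows) by nearest-neighbour sampling.

    Each output pixel copies the input pixel found at the proportionally
    scaled position, so shapes keep their relative dimensions.
    """
    in_h, in_w = len(img), len(img[0])
    return [[img[(r * in_h) // out_h][(c * in_w) // out_w]
             for c in range(out_w)]
            for r in range(out_h)]
```

Two cropped pictures showing the same shape at different scales (a user close to the camera and a user far from it) end up as the same 30x30 picture, which is why the later processing can use fixed pixel ranges.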

Finally, for any input picture, the preprocessing algorithm provides a standard-sized binary picture that corresponds to a zoom on the hand.

Figure: A schematic example. Initial picture, size: 240 x 320. Hand, size: ? x ?. Re-sized hand, 30 x 30.

Figure: A real example. Input picture, Binary picture, Zoom-in, Resizing.

In these conditions, for any hand gesture that involves the thumb, the fact that the width of a finger is 4 to 5 pixels implies that in the resulting picture

Once this preprocessing is finished, the real processing can start, that is to say, the identification process can be launched.

Counting the fingers

Simple Pixel Counting Analysis

The first immediate idea is the following: a picture that contains only the hand of the user is provided to the program. In this picture, there is a strong link between the number of fingers and the number of pixels set to '1'. So, if only one or two fingers are exhibited, the number of pixels with value '1' will be small. If the five fingers are shown, there will be more pixels at '1'. The easiest way to classify an image is then to compute the sum of the pixels of the re-sized hand picture, and to compare the resulting value to different ranges:

If sum < range_1 Then No fingers
If range_1 < sum < range_2 Then 1 finger
If range_2 < sum < range_3 Then 2 fingers
If range_3 < sum < range_4 Then 3 fingers
If range_4 < sum < range_5 Then 4 fingers
If range_5 < sum Then 5 fingers

The advantage of this method is huge: such programming is quite easy and very fast. However, it is not a very reliable way. According to the previous sections, the width of a finger will generally be 4 to 5 pixels and, let's suppose, its length 15 to 20 pixels, according to the user. So let's consider that for User 1, each finger has a dimension of:

4 pixels (width) * 15 pixels (length) = 60 pixels / finger

and that the different ranges have been optimized for his average finger size. For User 1, let's assume no error will occur. Let's suppose that for User 2, the width of a finger is 5 and its length is 20. The finger dimension is:

5 pixels (width) * 20 pixels (length) = 100 pixels

For User 1, four fingers will lead to about 200 pixels. For User 2, two fingers will also lead to 200 pixels. The consequence is that the program will get confused and may tell User 2 he is exhibiting five fingers (four fingers and the thumb) when he just shows three of them (two and the thumb)!

Another issue is that even if it is always the same user who makes the signs, errors will probably occur if he doesn't open the hand widely. Indeed, the method works if the hand is fully open, but if the fingers are a little bit curled ("closed"), then for each finger the sum of its pixels will be smaller, and if this is the case for several fingers, the global sum may lead to a mistake. An example of this phenomenon is given here:

Figure: The program answers '5'. The program answers '4'.

In this example, when the last two fingers are curled, the sum of their pixels makes the program consider that there are only four fingers, because the global sum is almost the same as the one the program would obtain if four fingers were exhibited in a fully opened hand.

This very simple method is efficient for a single user, provided he accepts more constraints concerning the allowed signs. Such a solution is not acceptable for the project, at least because it has to work with several users. It is therefore necessary to consider some more sophisticated solutions.
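The range comparison and its weakness can be sketched as follows. The bounds below are hypothetical: they are simply midpoints between multiples of 60 pixels, that is, ranges tuned for User 1's finger size; the report does not give its actual range values.

```python
# Hypothetical ranges tuned for a user whose fingers cover 60 pixels each:
# midpoints between 0, 60, 120, 180, 240 and 300 pixels.
BOUNDS_USER_1 = (30, 90, 150, 210, 270)


def classify_by_pixel_sum(pixel_sum, bounds=BOUNDS_USER_1):
    """Return the estimated number of fingers for a given sum of '1' pixels.

    The estimate is the number of bounds that the sum exceeds, which is
    equivalent to the chain of range comparisons in the text.
    """
    return sum(1 for bound in bounds if pixel_sum > bound)


def pixel_sum(img):
    """Sum of all pixels of a binary picture given as a list of rows."""
    return sum(sum(row) for row in img)
```

With these ranges, two fingers of User 1 (120 pixels) are classified as 2, but two fingers of User 2 (200 pixels) are classified as 3, which is the kind of confusion described above.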

Simple Block Counting Analysis

The program has to count the number of fingers? So let's create a picture in which only the fingers remain. It is easy to do, given that the orientation of the hand is known. Cropping the left part of the picture (including the thumb) leaves only the fingers in the picture. In such cases, the only operation we have to do is to compute the number of blocks in the shortened image. Indeed, the number of fingers is the number of blocks in the cropped picture, plus 1, because the thumb has to be counted even though it has been cropped.

This method offers a huge advantage: its simplicity. No calculation or special treatment is required. Using the Matlab function bwlabel, from the Image Processing Toolbox, makes the coding very easy.
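As an illustration of what the block count involves, here is a minimal Python sketch of connected-component counting on the cropped picture. It is not the report's code: the function names and the `first_col` parameter are hypothetical, and 8-connectivity is assumed (to my knowledge this matches the default of bwlabel, but that is an assumption here).

```python
def count_blocks(img):
    """Count 8-connected groups of pixels set to 1 in a binary picture."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blocks = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                blocks += 1
                seen[r][c] = True
                stack = [(r, c)]
                # Flood-fill the whole block so it is counted only once.
                while stack:
                    y, x = stack.pop()
                    for ny in range(max(y - 1, 0), min(y + 2, h)):
                        for nx in range(max(x - 1, 0), min(x + 2, w)):
                            if img[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                stack.append((ny, nx))
    return blocks


def count_fingers_by_blocks(img, first_col):
    """Drop the columns left of first_col (palm and thumb), count the
    remaining blocks, and add 1 for the thumb that was cropped away."""
    cropped = [row[first_col:] for row in img]
    return count_blocks(cropped) + 1
```

If a few leftover pixels bridge two neighbouring fingers, `count_blocks` merges them into a single block and the estimate drops by one.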

However, this method also has some major drawbacks. Indeed, despite the noise removal, some noisy pixels that remain may join two fingers, which will then look like one single big finger. The function bwlabel will consider that they are just one block, and this will cause an error in the estimation of the number of fingers.

So the user has to open the hand widely. In this way, any confusion becomes impossible. The problem is that if the user opens the hand widely, the index finger or the little finger (the fifth one) may not be present in the last columns of the picture, and this will cause an error in the evaluation of the number of fingers. If the user wants to avoid such problems, he has to open the hand widely, but not too much, and he may need time to find the best opening for each one of the different signs he wants to make.

And even if we suppose that he succeeds in doing so, another phenomenon occurs: if the user opens the hand just enough for the sign he makes, the re-sizing operation can turn some well-separated fingers into two joined fingers.

With this method, around 70-75 percent of the allowed signs (that is, those that include the thumb) are successfully classified. This method is very interesting and efficient considering its low level of complexity and its simple coding. However, there are possibilities to improve this method, because the error rate can be reduced.

Weighted Averaging Analysis

In order to understand the basic idea that is discussed here, let's consider the differences and the common points between the methods that have already been introduced. The Pixel Counting method and the Edges Counting method were very simple solutions, given that they were geometrical solutions. Their advantage was their low level of complexity for the implementation, but their problem was that they were not efficient enough. The Neural Networks solution has proven more efficient, but it requires training, as well as special management and processing of the binary picture. Moreover, when looking at the weights of the input layer, it appears that the neural network just realizes a kind of weighted averaging. Hence, the motivation in this section is to try to realize weighted averaging in a simpler way.

Choosing the weights

In this section, the explanations will refer to the following picture, which has already been introduced in the section "Edge Counting Analysis". This picture was an example that leads to a classification error:

First of all, let's suppose no weights are used, that is, the weights are all set to the same value, one for example. When averaging the pixel values, all the pixels will have the same importance, and counting only the number of pixels set to 1 may lead to incoherent results, given that the relative dimensions of a finger depend on the user. It has also been shown previously that counting only the edges in this area is not efficient in this case, and that the following picture will lead to '4'.

One solution is to mix these two methods, that is, to realize a weighted averaging in which the weight of each pixel set to 1 is half the number of edges in the column of the considered pixel. For example, according to the picture provided at the beginning of this section, the pixel at line 19, column 16 is set to one and its weight is 6, given that there are 12 edges in column 16.

Given that the left part of the picture is not relevant for computing the number of fingers (except for the thumb, all the fingers are in the right part of the picture), the only columns that will be considered are, for example, columns 15 to 25. A quick approximate calculation leads to the following results:

• If there is only the thumb in the picture, no pixel will be set to 1 in columns 15 to 25, and the weighted averaging will lead to 0.
• If there are the thumb and one finger in the picture, about 60 to 100 pixels will be set to 1 in columns 15 to 25, and the number of edges in this area should be 1. So the weighted averaging should lead to values from 60 to 100.
• If there are the thumb and two fingers in the picture, about 2*60 to 2*100 pixels will be set to 1 in columns 15 to 25, and the number of edges in this area should be 2. So the weighted averaging should lead to values from 2*60*2 to 2*100*2, say 240 to 400.
• If there are the thumb and three fingers in the picture, about 3*60 to 3*100 pixels will be set to 1 in columns 15 to 25, and the number of edges in this area should be 3. So the weighted averaging should lead to values from 3*60*3 to 3*100*3, say 540 to 900.

• If there are the thumb and four fingers in the picture, about 4*60 to 4*100 pixels will be set to 1 in columns 15 to 25, and the number of edges in this area should be 4. So the weighted averaging should lead to values from 4*60*4 to 4*100*4, say 960 to 1600.

According to these values, let's create the following bounds:

• Bound between 1 and 2 fingers: (0 + 60) / 2 = 30
• Bound between 2 and 3 fingers: (100 + 240) / 2 = 170
• Bound between 3 and 4 fingers: (400 + 540) / 2 = 470
• Bound between 4 and 5 fingers: (900 + 960) / 2 = 930

That is to say that the algorithm has to realize the following operations:

1) Calculate

$$WA = \sum_{column=15}^{25} \left[ number\_edges(column) \cdot \left( \sum_{line=1}^{30} pixel(line, column) \right) \right]$$

2) Estimate the number of fingers in the picture using:

if WA < 30 then Number_of_fingers = 1
if 30 < WA < 170 then Number_of_fingers = 2
if 170 < WA < 470 then Number_of_fingers = 3
if 470 < WA < 930 then Number_of_fingers = 4
if 930 < WA then Number_of_fingers = 5

$$WA = 100 \cdot \left( Number\_of\_edges \right)^{2}$$
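The two operations can be sketched in Python as follows. This is a sketch under stated assumptions, not the report's implementation: an "edge" is taken to be a 0/1 transition when scanning a column from top to bottom, and the weight is half that count, following the text's "half the number of edges", so that a column crossing k fingers gets weight k, as in the estimates above. Function names and the 0-based column arguments are hypothetical.

```python
# Bounds between 1|2, 2|3, 3|4 and 4|5 fingers, taken from the text.
BOUNDS = (30, 170, 470, 930)


def edges_in_column(img, col):
    """Number of 0/1 transitions met when scanning one column top to bottom
    (the picture is treated as surrounded by 0s)."""
    column = [0] + [row[col] for row in img] + [0]
    return sum(1 for a, b in zip(column, column[1:]) if a != b)


def weighted_average(img, first_col, last_col):
    """WA: for each column in [first_col, last_col] (0-based, inclusive),
    the pixel sum of the column times half its number of edges."""
    wa = 0.0
    for col in range(first_col, last_col + 1):
        weight = edges_in_column(img, col) / 2
        wa += weight * sum(row[col] for row in img)
    return wa


def count_fingers(wa, bounds=BOUNDS):
    """Thumb only gives 1; each bound exceeded adds one finger."""
    return 1 + sum(1 for bound in bounds if wa > bound)
```

For a synthetic 30x30 picture in which each finger is a bar 4 pixels thick and 15 columns long, k fingers give WA = 60·k², that is 60, 240, 540 and 960, which are exactly the lower estimates quoted in the list above.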

The consequence is that the distance between typical WA values (values of the weighted averaging) grows rapidly, roughly with the square of the number of fingers, and that makes the classification less sensitive to errors. Indeed, the distance between a typical value and the nearest bound is always large: for example, it has been said that the typical WA for 5 fingers is (960+1600)/2 = 1280. An error can occur only if the calculated WA, which should be 1280, is under 930. In this case, the calculation error has to be bigger than 350. This can happen only if there are a lot of errors on the number of edges in each column and if the relative dimensions of the fingers are "strange" (one finger very thick, three fingers very thin, and the thumb), and such errors are not very frequent.

In order to understand the efficiency of this method, let's compare it to the bound that would have been considered in a simple pixel counting algorithm: for four fingers (the thumb and three others), the sum of the pixels will be about 3*60 = 180, and for five fingers, it would be equal to 4*60 = 240. The bound between 4 and 5 fingers would be (180+240)/2 = 210. An error on five fingers happens when fewer than 210 pixels are counted in columns 15 to 25. The margin is 240-210 = 30.

When comparing the error margins, it appears that without any weights, the margin is equal to 30, and that with weights chosen as the number of edges in the column of the analyzed pixel, this margin tends to 350, so more than 10 times the previous margin! That's why this method is much better than the simple pixel counting one: different numbers of fingers lead to different ranges that are separated by very large gaps that only huge errors can get through. Without weights, confusion may occur when several fingers are exhibited (three, four or five fingers). The use of weights makes this confusion much rarer, because three-, four- and five-finger pictures turn into WA values that are very distant from one another.
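The margin comparison above is plain arithmetic and can be checked with a short Python sketch; the variable names are illustrative only, and all figures come from the text.

```python
def midpoint(a, b):
    """Bound placed halfway between two neighbouring typical values."""
    return (a + b) / 2


# Simple pixel counting: 3 fingers vs 4 fingers besides the thumb.
pixel_bound = midpoint(3 * 60, 4 * 60)       # bound between 4 and 5 fingers
pixel_margin = 4 * 60 - pixel_bound          # distance from 240 to the bound

# Weighted averaging: typical 5-finger value vs the 4|5 bound.
wa_typical = midpoint(4 * 60 * 4, 4 * 100 * 4)
wa_bound = midpoint(3 * 100 * 3, 4 * 60 * 4)
wa_margin = wa_typical - wa_bound

margin_ratio = wa_margin / pixel_margin
```

Here `pixel_margin` is 30 and `wa_margin` is 350, so the ratio is a little under 12, consistent with the "more than 10 times" stated above.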

Matlab Operations

Building GUI interfaces in Matlab

This example shows how to build a user GUI in Matlab. Start the GUI builder by typing

>>guide

Select "Blank GUI" and click OK.

The GUI window will open. Resize the design window. Using the palette on the left, drag and drop, resize and position the canvas, buttons, and static text windows.

Double-click on an object to open the properties dialog. Change the captions on the buttons and remove the "Static Text" string from the text window. Set the font size to 30 for the text windows and change the horizontal alignment to "right."

The GUI is finished. Save the work. The rest of the design process will take care of the functionality provided by each GUI component.

Neural Network Toolbox

Neural Network Toolbox extends MATLAB with tools for designing, implementing, visualizing, and simulating neural networks. Neural networks are invaluable for applications where formal analysis would be difficult or impossible, such as pattern recognition and nonlinear system identification and control. Neural Network Toolbox software provides comprehensive support for many proven network paradigms, as well as graphical user interfaces (GUIs) that enable you to design and manage your networks. The modular, open, and extensible design of the toolbox simplifies the creation of customized functions and networks.

Neural Network Toolbox GUIs make it easy to work with neural networks. The Neural Network Fitting Tool is a wizard that leads you through the process of fitting data using neural networks. You can use the tool to import large and complex data sets, quickly create and train networks, and evaluate network performance.

Key features
 GUI for creating, training, and simulating neural networks
 Support for the most commonly used supervised and unsupervised network architectures
 Comprehensive set of training and learning functions
 Dynamic learning networks, including time delay, nonlinear autoregressive (NARX), layer-recurrent, and custom dynamic
 Simulink blocks for building neural networks and advanced blocks for control systems applications
 Support for automatically generating Simulink blocks from neural network objects
 Preprocessing and postprocessing functions and Simulink blocks for improving network training and assessing network performance

Network Architectures

Neural Network Toolbox supports both supervised and unsupervised networks.

Supervised Networks

Supervised neural networks are trained to produce desired outputs in response to sample inputs, making them particularly well suited to modeling and controlling dynamic systems, classifying noisy data, and predicting future events. Neural Network Toolbox supports four supervised networks: feedforward, radial basis, dynamic, and learning vector quantization (LVQ).

Feedforward networks have one-way connections from input to output layers. They are most commonly used for prediction, pattern recognition, and nonlinear function fitting. Supported feedforward networks include feedforward backpropagation, cascade-forward backpropagation, feedforward input-delay backpropagation, linear, and perceptron networks.

Radial basis networks provide an alternative, fast method for designing nonlinear feedforward networks. Supported variations include generalized regression and probabilistic neural networks.

Dynamic networks use memory and recurrent feedback connections to recognize spatial and temporal patterns in data. They are commonly used for time-series prediction, nonlinear dynamic system modeling, and control system applications. Prebuilt dynamic networks in the toolbox include focused and distributed time-delay, nonlinear autoregressive (NARX), layer-recurrent, Elman, and Hopfield networks. The toolbox also supports dynamic training of custom networks with arbitrary connections.

LVQ is a powerful method for classifying patterns that are not linearly separable. LVQ lets you specify class boundaries and the granularity of classification.

Unsupervised Networks

Unsupervised neural networks are trained by letting the network continually adjust itself to new inputs. They find relationships within data and can automatically define classification schemes. Neural Network Toolbox supports two types of self-organizing, unsupervised networks: competitive layers and self-organizing maps. Competitive layers recognize and group similar input vectors. By using these groups, the network automatically sorts the inputs into categories.

Training and Learning Functions

Training and learning functions are mathematical procedures used to automatically adjust the network's weights and biases. The training function dictates a global algorithm that affects all the weights and biases of a given network. The learning function can be applied to individual weights and biases within a network.

Neuron Model

Simple Neuron

A neuron with a single scalar input and no bias is shown on the left below.

Figure: Neuron

The scalar input p is transmitted through a connection that multiplies its strength by the scalar weight w, to form the product wp, again a scalar. Here the weighted input wp is the only argument of the transfer function f, which produces the scalar output a. The neuron on the right has a scalar bias, b. You may view the bias as simply being added to the product wp, as shown by the summing junction, or as shifting the function f to the left by an amount b. The bias is much like a weight, except that it has a constant input of 1. The transfer function net input n, again a scalar, is the sum of the weighted input wp and the bias b. This sum is the argument of the transfer function f. Here f is a transfer function, typically a step function or a sigmoid function, that takes the argument n and produces the output a. Examples of various transfer functions are given in the next section.

Note that w and b are both adjustable scalar parameters of the neuron. The central idea of neural networks is that such parameters can be adjusted so that the network exhibits some desired or interesting behavior. Thus, we can train the network to do a particular job by adjusting the weight or bias parameters, or perhaps the network itself will adjust these parameters to achieve some desired end. All of the neurons in the program written in MATLAB have a bias.
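The neuron just described computes a = f(wp + b). A minimal Python sketch, with hypothetical function names and two common choices for f (a step function and the logistic sigmoid), might look like this:

```python
import math


def step(n):
    """Step transfer function: 1 when the net input is non-negative."""
    return 1.0 if n >= 0 else 0.0


def sigmoid(n):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-n))


def neuron(p, w, b=0.0, f=sigmoid):
    """Single-input neuron: net input n = w*p + b, output a = f(n).

    Leaving b at 0 gives the neuron without bias shown on the left.
    """
    n = w * p + b
    return f(n)
```

For instance, with p = 2, w = 1.5 and b = -3 the net input is 0, so the step neuron outputs 1 and the sigmoid neuron outputs 0.5. Changing b shifts the point at which the output switches, which is the "shifting" view of the bias mentioned above.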


Feed forward Neural Networks

Feed forward neural networks (FF networks) are the most popular and most widely used models in many practical applications. They are known by many different names, such as "multi-layer perceptrons."

The figure illustrates a one-hidden-layer FF network with several inputs and one output. Each arrow in the figure symbolizes a parameter in the network. The network is divided into layers. The input layer consists of just the inputs to the network. Then follows a hidden layer, which consists of any number of neurons, or hidden units, placed in parallel. Each neuron performs a weighted summation of the inputs, which then passes through a nonlinear activation function, also called the neuron function.

Figure: A feedforward network with one hidden layer and one output.

Mathematically, the functionality of a hidden neuron is described by this weighted summation followed by the activation function, where the weights are symbolized by the arrows feeding into the neuron.

The network output is formed by another weighted summation of the outputs of the neurons in the hidden layer. This summation on the output is called the output layer. In the figure there is only one output in the output layer, since it is a single-output problem. Generally, the number of output neurons equals the number of outputs of the approximation problem.
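The symbols of the hidden-neuron expression did not survive in this copy of the report. A standard form that matches the description (a weighted summation of the inputs passed through a nonlinear activation function) is given below; the symbol names x_j for the inputs, w_j for the weights, b for the bias and σ for the activation function are assumptions, not the report's own notation.

```latex
% Output h of one hidden neuron with n inputs (assumed notation)
h = \sigma\!\left( \sum_{j=1}^{n} w_j \, x_j + b \right)
```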

The output of this network is given by a weighted summation over the hidden neurons, each of which applies the activation function to a weighted summation of the inputs, where n is the number of inputs and nh is the number of neurons in the hidden layer. The weights and biases of both layers are the parameters of the network model, and they are represented collectively by a parameter vector. In training the network, its parameters are adjusted incrementally until the training data satisfy the desired mapping as well as possible, that is, until the network output matches the desired output y as closely as possible, up to a maximum number of iterations.

The FF network in the figure is just one possible architecture of an FF network. You can modify the architecture in various ways by changing the options. For example, you can change the activation function to any differentiable function you want. Note that the sizes of the input and output layers are defined by the number of inputs and outputs of the network and, therefore, only the number of hidden neurons has to be specified when the network is defined.
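The forward computation of such a one-hidden-layer, single-output network can be sketched as follows. This is an illustrative Python sketch with hypothetical parameter names (`w1`, `b1` for the hidden layer, `w2`, `b2` for the output layer) and a sigmoid activation chosen as an example; it is not the toolbox's implementation and it does not include training.

```python
import math


def sigmoid(n):
    """Example of a differentiable activation function."""
    return 1.0 / (1.0 + math.exp(-n))


def ff_output(x, w1, b1, w2, b2, activation=sigmoid):
    """Output of a feedforward network with one hidden layer and one output.

    x  : list of n inputs
    w1 : nh rows of n hidden-layer weights
    b1 : nh hidden-layer biases
    w2 : nh output-layer weights
    b2 : output-layer bias
    """
    # Each hidden neuron: weighted summation of the inputs, then activation.
    hidden = [activation(sum(w * xj for w, xj in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    # Output layer: another weighted summation of the hidden outputs.
    return sum(w * h for w, h in zip(w2, hidden)) + b2
```

Only the number of hidden neurons, here the number of rows of `w1`, is a free design choice; the lengths of `x` and of each row follow from the number of inputs.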

Advantages of Neural Computing

There are a variety of benefits that an analyst realizes from using neural networks in their work.

Pattern recognition is a powerful technique for harnessing the information in the data and generalizing about it. Neural nets learn to recognize the patterns which exist in the data set.

The system is developed through learning rather than programming. Programming is much more time consuming for the analyst and requires the analyst to specify the exact behavior of the model. Neural nets teach themselves the patterns in the data, freeing the analyst for more interesting work.

Neural networks are flexible in a changing environment. Rule-based systems or programmed systems are limited to the situation for which they were designed: when conditions change, they are no longer valid. Although neural networks may take some time to learn a sudden drastic change, they are excellent at adapting to constantly changing information.

Neural networks can build informative models where more conventional approaches fail. Because neural networks can handle very complex interactions, they can easily model data which is too difficult to model with traditional approaches such as inferential statistics or programming logic.

Performance of neural networks is at least as good as classical statistical modeling, and better on most problems. The neural networks build models that are more reflective of the structure of the data in significantly less time.

Limitations of Neural Computing

There are some limitations to neural computing. The key limitation is the neural network's inability to explain the model it has built in a useful way. Analysts often want to know why the model is behaving as it is. Neural networks get better answers but they have a hard time explaining how they got there.

There are a few other limitations that should be understood. First, it is difficult to extract rules from neural networks. This is sometimes important to people who have to explain their answer to others and to people who have been involved with artificial intelligence, particularly expert systems, which are rule-based.

As with most analytical methods, you cannot just throw data at a neural net and get a good answer. You have to spend time understanding the problem or the outcome you are trying to predict. And you must be sure that the data used to train the system are appropriate and are measured in a way that reflects the behavior of the factors. If the data are not representative of the problem, neural computing will not produce good results. This is a classic situation where "garbage in" will certainly produce "garbage out."

Finally, it can take time to train a model from a very complex data set. Neural techniques are computer intensive and will be slow on low-end PCs or machines without math coprocessors. It is important to remember, though, that the overall time to results can still be faster than with other data analysis approaches, even when the system takes longer to train. Processing speed alone is not the only factor in performance, and neural networks do not require the time spent programming, debugging, or testing assumptions that other analytical approaches do.

MICROCONTROLLER AND ROBOT

Power Supply

We directly provide a 12 V DC supply. 12 V is required for driving the motors and 5 V for the microcontroller assembly. The 12 V DC is converted into a 5 V DC supply with the help of a 7805 regulator and capacitor combination.

Microcontroller (8051)

A microcontroller has a CPU together with a fixed amount of RAM, ROM, I/O ports, and timers, all embedded on one chip. Such devices are used in embedded systems. We have used the AT89C5124PI, an 8-bit flash microcontroller of the 80C51 family, with 64 kB of flash memory and 1 kB of RAM.

The 89C5124PI device contains a non-volatile 64 kB Flash program memory that is both parallel programmable and serial In-System and In-Application Programmable. In-System Programming (ISP) allows the user to download new code while the microcontroller sits in the application. In-Application Programming (IAP) means that the microcontroller fetches new program code and reprograms itself while in the system. This allows for remote programming over a modem link. A default serial loader (boot loader) program in ROM allows serial In-System Programming of the Flash memory via the UART without the need for a loader in the Flash code. For In-Application Programming, the user program erases and reprograms the Flash memory by use of standard routines contained in ROM.

This device is a Single-Chip 8-Bit Microcontroller manufactured in an advanced CMOS process and is a derivative of the 80C51 microcontroller family. The instruction set is 100% compatible with the 80C51 instruction set. The device also has four 8-bit I/O ports, three 16-bit timer/event counters, a multi-source, four-priority-level, nested interrupt structure, an enhanced UART, and on-chip oscillator and timing circuits. The added features of the AT89C5124PI make it a powerful microcontroller for applications that require pulse width modulation, high-speed I/O and up/down counting capabilities, such as motor control.

Features:
a) 80C51 Central Processing Unit
b) On-chip Flash Program Memory with In-System Programming (ISP) and In-Application Programming (IAP) capability
c) Boot ROM contains low-level Flash programming routines for downloading via the UART
d) Can be programmed by the end-user application (IAP)
e) 6 clocks per machine cycle operation (standard)
f) 12 clocks per machine cycle operation (optional)
g) Speed up to 20 MHz with 6 clock cycles per machine cycle (40 MHz equivalent performance); up to 33 MHz with 12 clocks per machine cycle
h) Fully static operation
i) RAM expandable externally to 64 kB
j) 4 level priority interrupt
k) 8 interrupt sources
l) Four 8-bit I/O ports
m) Full-duplex enhanced UART
n) Framing error detection
o) Automatic address recognition
p) Power control modes: clock can be stopped and resumed, Idle mode, Power down mode
q) Programmable clock out
r) Second DPTR register
s) Asynchronous port reset
t) Low EMI (inhibit ALE)
u) Programmable Counter Array (PCA): PWM, Capture/Compare
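The "40 MHz equivalent performance" in feature g) follows from the machine-cycle arithmetic, which the short Python sketch below works through. The helper name is hypothetical; the clock figures are those listed above.

```python
def machine_cycle_us(clock_hz, clocks_per_cycle):
    """Duration of one machine cycle in microseconds."""
    return clocks_per_cycle / clock_hz * 1e6


# 6-clock mode at 20 MHz.
fast_mode = machine_cycle_us(20e6, 6)

# A classic 12-clock 80C51 would need a 40 MHz clock for the same cycle time.
equivalent_12_clock = machine_cycle_us(40e6, 12)

# 12-clock mode at the maximum 33 MHz.
slow_mode = machine_cycle_us(33e6, 12)
```

Both `fast_mode` and `equivalent_12_clock` come out at 0.3 µs per machine cycle, while 12-clock operation at 33 MHz gives about 0.36 µs.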

PIN DESCRIPTION:

a) Ground: 0 V reference.

b) Power Supply (Vcc): This is the power supply voltage for normal, idle, and power-down operation.

c) Port 0 (8 I/O pins from 39-32): Port 0 is an open-drain, bidirectional I/O port. Port 0 pins that have 1s written to them float and can be used as high-impedance inputs. Port 0 is also the multiplexed low-order address and data bus during accesses to external program and data memory. In this application, it uses strong internal pull-ups when emitting 1s.

d) Port 1 (8 I/O numbered 1-8): Port 1 is an 8-bit bidirectional I/O port with internal pull-ups on all pins except P1.6 and P1.7, which are open drain. Port 1 pins that have 1s written to them are pulled high by the internal pull-ups and can be used as inputs. As inputs, port 1 pins that are externally pulled low will source current because of the internal pull-ups. Alternate functions for 89C51RB2/RC2/RD2 Port 1 include:
1) T2 (P1.0): Timer/Counter 2 external count input/Clockout
2) T2EX (P1.1): Timer/Counter 2 Reload/Capture/Direction Control
3) ECI (P1.2): External Clock Input to the PCA
4) CEX0 (P1.3): Capture/Compare External I/O for PCA module 0
5) CEX1 (P1.4): Capture/Compare External I/O for PCA module 1
6) CEX2 (P1.5): Capture/Compare External I/O for PCA module 2
7) CEX3 (P1.6): Capture/Compare External I/O for PCA module 3
8) CEX4 (P1.7): Capture/Compare External I/O for PCA module 4

e) Port 2 (21-28): Port 2 is an 8-bit bidirectional I/O port with internal pull-ups. Port 2 pins that have 1s written to them are pulled high by the internal pull-ups and can be used as inputs. As inputs, port 2 pins that are externally being pulled low will source current because of the internal pull-ups. Port 2 emits the high-order address byte during fetches from external program memory and during accesses to external data memory that use 16-bit addresses (MOVX @DPTR).

f) Port 3 (10-17): Port 3 is an 8-bit bidirectional I/O port with internal pull-ups. Port 3 pins that have 1s written to them are pulled high by the internal pull-ups and can be used as inputs. As inputs, port 3 pins that are externally being pulled low will source current because of the pull-ups. Port 3 also serves the special features of the 89C51RB2/RC2/RD2, as listed below:
I. RxD (P3.0): Serial input port
II. TxD (P3.1): Serial output port
III. INT0 (P3.2): External interrupt
IV. INT1 (P3.3): External interrupt
V. T0 (P3.4): Timer 0 external input
VI. T1 (P3.5): Timer 1 external input
VII. WR (P3.6): External data memory write strobe
VIII. RD (P3.7): External data memory read strobe
g) RST Reset (pin 9): A high on this pin for two machine cycles while the oscillator is running resets the device. An internal diffused resistor to VSS permits a power-on reset using only an external capacitor to VCC.
h) ALE (Address Latch Enable, pin 30): Output pulse for latching the low byte of the address during an access to external memory. In normal operation, ALE is emitted twice every machine cycle, and can be used for external timing or clocking. Note that one ALE pulse is skipped during each access to external data memory. ALE can be disabled by setting SFR auxiliary.0. With this bit set, ALE will be active only during a MOVX instruction.
i) PSEN (Program Store Enable, pin 29): The read strobe to external program memory. When executing code from the external program memory, PSEN is activated twice each machine cycle, except that two PSEN activations are skipped during each access to external data memory. PSEN is not activated during fetches from internal program memory.
j) EA/VPP (External Access Enable/Programming Supply Voltage, pin 31): EA must be externally held low to enable the device to fetch code from external program memory locations. If EA is held high, the device executes from internal program memory. The value on the EA pin is latched when RST is released and any subsequent changes have no effect. This pin also receives the programming supply voltage (VPP) during Flash programming.
k) XTAL1 and XTAL2 (pin 19 & 18): XTAL1 is the input to the inverting oscillator amplifier and to the internal clock generator circuits; XTAL2 is the output from the inverting oscillator amplifier.

To avoid the "latch-up" effect at power-on, the voltage on any pin (other than VPP) must not be higher than VCC + 0.5 V or less than VSS - 0.5 V.

Motor Driver (ULN2004A)
The ULN2004A is a high voltage, high current Darlington array containing seven open collector Darlington pairs with common emitters. Each channel is rated at 500 mA and can withstand peak currents of 600 mA. Suppression diodes are included for inductive load driving, and the inputs are pinned opposite the outputs to simplify board layout. These versatile devices are useful for driving a wide range of loads including solenoids, relays, DC motors, LED displays, filament lamps, thermal print-heads and high power buffers. Maximum output voltage is 50 V. The 2004A is supplied in 16 pin plastic DIP packages with a copper lead frame to reduce thermal resistance.

Robot
The robot is a two-wheel robot with a castor wheel provided for support. Stepper motors have been used. As the name suggests, stepper motors do not spin freely like DC motors; they rotate in discrete steps, under the command of a controller. This makes them easier to control, as the controller knows exactly how far they have rotated without having to use a sensor. Therefore they are used on many robots. The ULN2004A IC is used for driving the motors. The stepper motor used is a unipolar motor, hence it has six wires coming out of it. Four of them receive data from the microcontroller for its movement, while two are short circuited and connected to a 12 V DC supply. To move the motor, its alternate windings are excited continuously with the help of assembly code.
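The report's assembly code is not reproduced here, so the following Python sketch only illustrates one common way of exciting the four phase wires of a unipolar stepper in turn: a one-phase-on ("wave drive") sequence. The bit-to-winding mapping, the sequence itself, and the names `WAVE_DRIVE` and `step_patterns` are assumptions made for illustration, not the authors' implementation.

```python
from itertools import cycle, islice

# Assumed mapping: bit 0..3 = the four phase wires, driven through four
# ULN2004A channels. A 1 turns the channel on, which sinks current through
# that winding from the 12 V common wires.
WAVE_DRIVE = (0b0001, 0b0010, 0b0100, 0b1000)


def step_patterns(steps: int, clockwise: bool = True) -> list[int]:
    """Return the 4-bit output pattern for each of `steps` successive steps.

    Reversing the order of the patterns reverses the direction of rotation.
    Which order corresponds to "clockwise" depends on the wiring, so the
    flag name is only a convention in this sketch.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    sequence = WAVE_DRIVE if clockwise else WAVE_DRIVE[::-1]
    return list(islice(cycle(sequence), steps))


# Five steps in one direction wrap around to the first winding again.
print([format(p, "04b") for p in step_patterns(5)])
# prints ['0001', '0010', '0100', '1000', '0001']
```

On the microcontroller, each pattern would be written to the port pins wired to the driver inputs, with a delay between writes that sets the stepping speed.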

Stepwise procedure/flow:
1. Input pattern to be recognized
2. Sampling
3. Generation of templates
4. Template matching with input pattern
5. Best match
6. Recognized pattern

TIME ACTIVITY CHART:
[Chart: activities 1 to 5 plotted against time in months, with month ticks at 3, 4, 7, 10 and 12.]

Activities:
1. Literature review
2. Selection of application and deciding the specifications of the equipment required for it
3. Make an experimental set-up
4. Conduct trials, plot results and conclusion
5. Preparation of report

CONCLUSION
We proposed a fast and simple algorithm for a hand gesture recognition problem. Given observed images of the hand, the algorithm segments the hand region, and then makes an inference on the activity of the fingers involved in the gesture. We have demonstrated the effectiveness of this computationally efficient algorithm on real images we have acquired. Based on our motivating robot control application, we have only considered a limited number of gestures. Our algorithm can be extended in a number of ways to recognize a broader set of gestures. The segmentation portion of our algorithm is too simple, and would need to be improved if this technique were to be used in challenging operating conditions. However, we should note that the segmentation problem in a general setting is itself an open research problem. Reliable performance of hand gesture recognition techniques in a general setting requires dealing with occlusions, temporal tracking for recognizing dynamic gestures, as well as 3D modeling of the hand, which are still mostly beyond the current state of the art.
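The template-matching and best-match steps of the stepwise flow can be illustrated with a small Python sketch. It assumes the sampled input and every template are equally sized binary images (1 = hand pixel, 0 = background) and scores a template by the fraction of pixels that agree. The scoring rule, the function names `match_score` and `best_match`, and the tiny 3x3 patterns are hypothetical choices for illustration; they are not the authors' method and do not involve a neural network.

```python
def match_score(pattern, template):
    """Fraction of pixels on which two equally sized binary images agree."""
    same_shape = len(pattern) == len(template) and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(pattern, template)
    )
    if not same_shape or not pattern or not pattern[0]:
        raise ValueError("pattern and template must be non-empty and equally sized")
    total = sum(len(row) for row in pattern)
    agree = sum(
        p == t
        for p_row, t_row in zip(pattern, template)
        for p, t in zip(p_row, t_row)
    )
    return agree / total


def best_match(pattern, templates):
    """Return the label of the template that agrees best with the pattern."""
    if not templates:
        raise ValueError("at least one template is required")
    return max(templates, key=lambda label: match_score(pattern, templates[label]))


# Hypothetical 3x3 templates standing in for two robot commands.
templates = {
    "forward": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "stop": [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
}
# The "forward" shape with one corrupted pixel.
noisy_input = [[0, 1, 0], [0, 1, 0], [1, 1, 0]]
recognized = best_match(noisy_input, templates)
```

Here `recognized` is "forward", because that template agrees with the input on 8 of 9 pixels while "stop" agrees on only 4. A pixel-agreement score like this is sensitive to shifts and scale, which is one reason the segmentation and normalization stages matter.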

FUTURE SCOPE
Even with limited processing power, it will be possible to design very efficient algorithms in order to
• Track people
• (Re-)identify them
• Understand their (static) gestures
• Control a robot
Our software has been designed to be reusable, and many more complex behaviors may be added to our work. Because we limited ourselves to low processing power, the performance of our work could easily be improved by adding a state-of-the-art processor. The use of a real embedded OS could improve our system in terms of speed and stability. In addition, implementing more sensor modalities would improve robustness even in very complex scenes. Our system has shown that interaction with machines through gestures is feasible, and the set of detected gestures could be extended to more commands by implementing a more complex model of a human being. In the future, service robots executing many different tasks, from housemaid work to nuclear power plant services, might arise and become a common part of everyday life, as normal as computers are nowadays.

BIBLIOGRAPHY
Books and references
• Matlab by R P Singh
• The 8051 Microcontroller by Mazidi
• Image Processing book by Bijith Marakarkandy
• Digital Image Processing: An Algorithmic Approach by Joshi M A
• Neural Network by Gonzales Cenelia
• www.google.com
• www.wikipedia.com
• ieeexplore.ieee.org
