


1. Introduction
For the past few years, the common computer input devices have changed very little: interaction with computers is still largely limited to the mouse, keyboard, trackball, webcam and light pen. This is because the existing input devices are adequate for most of the functions a computer can perform. At the same time, new applications and software are constantly being introduced to the market, and much of this software performs many functions using only these common input devices.

Vision-based interfaces are now feasible and popular because a computer can communicate with the user through a webcam. The user can issue commands simply by performing actions in front of the camera, without typing on a keyboard or clicking a mouse button. This makes human-machine interaction (HMI) more user-friendly and, eventually, enables commands that are not possible with current computer input devices.

Lately there has been a surge of interest in recognizing human hand gestures. Hand gesture recognition has several applications, such as computer games, gaming machines, mouse replacement and machinery control (e.g. cranes and surgical machines). Moreover, controlling computers with hand gestures can make many applications more intuitive than using a mouse, keyboard or other input devices.

The most structured sets of gestures belong to sign language, where each gesture has an assigned meaning (or meanings). This project focuses on American Sign Language (ASL), the language of choice for most deaf people; ASL was invented to allow deaf people to communicate with hearing people. ASL consists of approximately 6000 gestures for common words, plus finger spelling, which is used to communicate proper nouns. Finger spelling is performed with one hand, using 26 gestures to communicate the 26 letters of the alphabet.
Examples of ASL signs are shown in Figure 1.

Figure 1: ASL examples

2. Background
There are a few ways to perform hand gesture recognition; they can be classified into three categories.

The first category is heavily hardware-based, such as glove-based analysis: sensors (mechanical or optical) attached to a glove transduce finger flexions into electrical signals to determine the hand posture. The sensors used are normally acoustic or magnetic sensors embedded in the glove. For example, one system translates between Japanese Sign Language (JSL) and Japanese by recognizing one-handed motions. A VPL (Visual Programming Language) Data Glove Model II is used for acquiring hand data. It has two sensors on each finger for measuring the bending angles of two joints, one over the knuckle and the other over the middle joint. A further sensor attached to the back of the glove measures three position values and three orientation values relative to a fixed magnetic source; the position data is calibrated by subtracting the neutral position data from the raw position data.

The second category is the analysis of drawing gestures, which involves special input devices such as a stylus. Most current hand gesture recognition works by mechanical sensing, most often for direct manipulation of a virtual environment, but this type of sensing suffers from a range of problems such as accuracy, reliability and electromagnetic noise. Both of these categories require external hardware.

The third category is vision-based analysis, which is based on the way human beings perceive information about their surroundings. Visual sensing has the potential to make gestural interaction more practical, and it is the most intuitive method of hand gesture recognition because it needs no external hardware: the hand gesture can be recognized freely, without anything attached to the hand. All that is required is a camera, webcam, camcorder or any other image-capture device that can interface with the computer. This project focuses on vision-based analysis.

3. Objectives
a) Implement pattern recognition using a neural network in MATLAB.
b) The implemented system should be able to perform classification correctly.
c) The implemented application should be user-friendly enough for anyone to use.
d) The system should be able to capture a static image through the webcam and perform the classification.

4. Aim
The aim of this project is to create a vision-based analysis application that performs hand gesture recognition of American Sign Language (ASL). The application should recognize a set of ASL hand gestures, such as those for the letters A, B and C and the numbers 1, 2 and 3, reliably, regardless of the user's hand size and other external factors.

5. Approach
This project should not involve any external hardware beyond a computer equipped with a webcam. This keeps the cost to a minimum so that anyone can own and use the application easily. Since the hardware is limited to a computer and webcam, only the software and programming need to be considered. Several environments can be used to implement hand gesture recognition, such as MATLAB, Microsoft Visual C#, Microsoft Visual C++ and Microsoft Visual Basic; the most common are MATLAB and Microsoft Visual C#, and both are very powerful tools. This project is based on MATLAB. MATLAB was chosen over Microsoft C# because it speeds up the development process: it allows the user to work faster and to concentrate on results rather than on programming details, and its toolboxes make it easy to learn and apply specialized technology. It is a tool of choice for high-productivity research, analysis and development. In this project, two toolboxes are used: Neural Network and Image Processing.

6. Vision-based Analysis

This hand gesture recognition project is based on Human-Computer Interaction (HCI) technology: the computer performs hand gesture recognition of American Sign Language (ASL). The system uses the MATLAB Neural Network Toolbox to perform the gesture recognition. It works by feeding numerous hand gesture images into a neural network, which the system then trains. Once trained, the network can recognize multiple ASL hand gestures. How the images are fed into the network, and how the network processes them, is discussed later in this report.

7. Neural Network
A neural network, also known as an artificial neural network (ANN), is an artificial intelligence system based on biological neural networks. A neural network is made up of simple elements operating in parallel. These elements, known as neurons, are discussed in the Simple Neuron section; they work in much the same way as biological nervous systems. The function of the network is largely determined by the connections between these elements. A neural network can be trained to perform a particular function by adjusting the values of the connections (weights) between the elements.

Figure 2: Neural Network Block Diagram

A neural network is adjusted, or trained, so that a particular input leads to a specific target output. In the example in Figure 2, the network is adjusted, based on a comparison of the output and the target, until the network output matches the target. Neural networks have been used in various fields of application, such as identification, speech recognition, vision, classification and control systems, and can be trained to perform these complex functions. Nowadays, neural networks can be trained to solve many difficult problems faced by humans and computers.

7.1 Simple Neuron

Figure 3 below shows a neuron with a single scalar input: on the left without a bias, and on the right with a bias. The neurons built into the MATLAB Neural Network Toolbox include a bias. [9]

Figure 3: Neuron with a Single Scalar Input

The scalar input p is transmitted through a connection that multiplies it by the scalar weight w to form the product wp, again a scalar. The weighted input wp is the argument of the transfer function f, which produces the scalar output a. The neuron on the right also has a scalar bias b, which is simply added to the product wp; the bias acts like a weight with a constant input of 1. The net input n, again a scalar, is the sum of the weighted input wp and the bias b, and this sum is the argument of the transfer function:

a = f(wp + b)

Here f is a transfer function, typically a step function, which takes the argument n and produces the output a. Both w and b are adjustable scalar parameters of the neuron. As a result, a neural network can be trained to perform a particular task by adjusting the bias and weight parameters, or the network itself can adjust these parameters to achieve some desired result.

The hard-limit transfer function limits the output of the neuron to 0 if the net input n is less than 0, and to 1 if n is greater than or equal to 0. This function is used in the perceptron, which creates neurons that make classification decisions. A perceptron is a program that learns concepts. There are two types of training in neural networks, supervised and unsupervised; the perceptron falls under supervised training.
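The neuron described above can be sketched in a few lines of code. The project itself uses MATLAB, but as no MATLAB code is given in this report, the following is an illustrative Python sketch of the same model, a single-input neuron with a hard-limit transfer function:

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if the net input n >= 0, else 0."""
    return 1 if n >= 0 else 0

def neuron(p, w, b):
    """Single scalar-input neuron: output a = f(w*p + b)."""
    n = w * p + b          # net input: weighted input plus bias
    return hardlim(n)      # scalar output a
```

With w = 0.5 and b = -0.2, an input p = 1.0 gives a net input of 0.3 and therefore an output of 1; lowering the bias to -0.7 makes the net input negative and the output 0.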

7.2 Supervised Training

Supervised training is based on the system trying to predict the outcomes for known examples: it compares its predictions with the target answers and learns from its mistakes by adjusting the weights and biases accordingly. First, the input data is fed into the input-layer neurons, which pass the inputs along to the next nodes. As the inputs are passed along, each connection weighting is applied, and when the inputs reach the next node the weighted values are summed; this summed value is either weakened or intensified. The process continues until the data reaches the output layer. Then comes the classification step, where the predicted output of the network is compared with the actual (target) output. If the predicted output equals the target, no change is made to the weights in the network; if it differs, there is an error, which is propagated back through the network, and the weights are adjusted accordingly.

In this project, supervised training was preferred because the project is about classification and pattern recognition: supervised training can compare the hand gestures with the correct targets and let the system learn by itself. Unsupervised training, by contrast, focuses more on clustering, since it works by placing similar cases together, and performs better on tasks such as data compression and data mining.
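The predict-compare-adjust cycle described above can be sketched for a single hard-limit neuron. This is an illustrative Python sketch (the function name and learning-rate parameter are assumptions for illustration, not the project's MATLAB code):

```python
def train_step(weights, bias, inputs, target, lr=1.0):
    """One supervised update: predict, compare with the known target,
    and adjust the weights and bias only when the prediction is wrong."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    predicted = 1 if net >= 0 else 0
    error = target - predicted            # 0 when the prediction is correct
    if error != 0:                        # adjust only on a mistake
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias = bias + lr * error
    return weights, bias, error
```

When the prediction already matches the target, the weights are left untouched, exactly as described in the text; only a wrong prediction triggers an adjustment.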

7.4 Perceptron Learning Rule

The perceptron is a neural network program that learns concepts. For example, it can learn to respond to the inputs presented to it with True (1) or False (0) by repeatedly "studying" the examples it is shown. This makes it suitable for classification and pattern recognition. The structure of a single perceptron is quite simple: a few inputs (depending on the input data), a bias, and an output. The inputs and outputs must be in binary form, taking only the values 0 or 1. Figure 5 shows the perceptron schematic diagram.

Figure 5: Perceptron Schematic Diagram

Inside the perceptron layer, each input (0 or 1) is multiplied by its weight, which is generally a real number between 0 and 1. The weighted values are then fed into the neuron together with the bias, which is also a real value between 0 and 1. Once the output value is obtained, the next step is to adjust the weights: the perceptron learns by modifying its weights. The value obtained from the limiter function is known as the actual output, and the perceptron adjusts each weight by the difference between the desired output and the actual output. This can be written as:

Change in Weight_i = Input_i × (Desired Output − Actual Output) [12]

The perceptron keeps adjusting its weights until the actual output matches the desired output, or differs from it by only a minimal amount.
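The learning rule quoted above can be run end to end on a small example. The following is an illustrative Python sketch (not the project's MATLAB code); it trains a perceptron on the logical AND function as a tiny stand-in for the binary gesture feature vectors:

```python
def hardlim(n):
    """Hard-limit (limiter) function: 1 if n >= 0, else 0."""
    return 1 if n >= 0 else 0

def train_perceptron(samples, epochs=20):
    """samples: list of (input_vector, desired_output) pairs with 0/1 values.
    Applies: change in weight_i = input_i * (desired - actual)."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):               # each pass over the inputs is one epoch
        errors = 0
        for x, desired in samples:
            actual = hardlim(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = desired - actual
            if error != 0:                # adjust weights only on a mistake
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
                errors += 1
        if errors == 0:                   # stop when every sample is correct
            break
    return weights, bias

AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(AND)
```

Because AND is linearly separable, the weights converge within a few epochs, after which every input pair is classified correctly.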

8. Introduction of MATLAB
This hand gesture recognition project is developed using MATLAB. MATLAB stands for "matrix laboratory". It is a powerful, high-performance language for technical computing that integrates programming, computation and visualization in a user-friendly environment where problems and solutions are presented in common mathematical notation. [13] MATLAB is used for:

- Mathematics and computation
- Development of algorithms
- Prototyping, modelling and simulation
- Data analysis, exploration and visualization
- Application and software development

MATLAB is an interactive system that allows us to solve many technical computing problems, especially those involving vector and matrix formulations. This is because MATLAB's basic data element is an array that does not require dimensioning, which saves the time it would take to write an equivalent program in a non-interactive language such as Fortran or C. MATLAB has been developed and refined over a period of years by researchers and is still evolving. In education, MATLAB is a standard instructional tool for introductory and advanced courses in engineering, science and mathematics. In industry, it is widely used for high-productivity research, development and analysis.

Another thing that makes MATLAB special is its toolboxes: families of application-specific solutions that allow the user to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Examples of available toolboxes include fuzzy logic, neural networks, signal processing, image processing, wavelets and image acquisition.

9. Methodology
9.1 Image Databases
The first step in this project is the creation of an image database, which will be used for training and testing the artificial neural network. The image database can be in different formats; for this project, digital photographs were used because they are the most realistic approach. The images were captured using a laptop with a webcam. The images can be in any file format, such as .jpg, .tif or .bmp; this project uses .jpg files. These images then go through a transformation approach called Transformation T.
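The report does not spell out Transformation T, so the following is only an illustrative Python sketch of one common way to turn a gesture photo into the binary feature vector a perceptron needs: threshold the grayscale pixels and flatten them into a 0/1 list. The function name and the plain list-of-lists "image" are assumptions for illustration:

```python
def to_feature_vector(image, threshold=128):
    """Flatten a grayscale image (rows of 0-255 pixel values) into a
    binary feature vector: 1 for bright pixels, 0 for dark ones."""
    return [1 if pixel >= threshold else 0 for row in image for pixel in row]

# A tiny 3x3 stand-in "image" of a plus-shaped bright region:
img = [[0, 200, 0],
       [255, 255, 255],
       [0, 200, 0]]
features = to_feature_vector(img)
```

In practice the captured webcam image would first be resized to a small fixed grid so that every gesture yields a feature vector of the same length.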

Figure 10: Pattern Recognition System

This method was chosen because its algorithms are fast and simple.

9.2 Perceptron
As discussed earlier, the perceptron is a neural network that learns concepts. The next part uses the perceptron learning rule to train the network to perform pattern recognition. Figure 13 below shows the flow chart of the perceptron program.

The output is compared with the target vector. If there is an error, the perceptron network re-adjusts the weight values until the error is eliminated or minimized, and then it stops. Each pass through the input vectors is called an epoch.

Testing Perceptron Network Implementation

Figure 17: Network Testing Flow Chart

1st Step: Select a test-set image / Get an image from the webcam. Once the perceptron network is fully trained, it is ready for testing. First, we either select a test-set image that has already been converted into feature-vector form, or capture an image through the webcam and process it into feature-vector form. The image can be of any hand gesture sign, not necessarily one of the trained signs, since this is only for testing.

2nd Step: Process by the perceptron network. The feature vector is fed into the network. Its values pass through all the adjusted weights (neurons) inside the perceptron network, and an output is produced.

3rd Step: Display the matched output. The system displays the matched output in vector format. An improvement was made so that the output is displayed both in vector format and as the meaning of the gesture sign in graphical form.
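Steps 2 and 3 above can be sketched as follows. This is an illustrative Python sketch, not the project's MATLAB code; the one-output-neuron-per-gesture layout, the label list and the function names are assumptions for illustration:

```python
def hardlim(n):
    return 1 if n >= 0 else 0

def classify(feature_vector, weight_rows, biases):
    """Step 2: pass the feature vector through the trained weights.
    One output neuron per gesture; returns the 0/1 output vector."""
    return [hardlim(sum(w * x for w, x in zip(row, feature_vector)) + b)
            for row, b in zip(weight_rows, biases)]

LABELS = ["A", "B", "C"]   # hypothetical subset of trained gesture signs

def display_match(output_vector):
    """Step 3: map the matched output vector back to a gesture name."""
    for label, fired in zip(LABELS, output_vector):
        if fired:
            return label
    return "no match"
```

For example, with trained weights that isolate each feature, a feature vector matching the second gesture would produce the output vector [0, 1, 0], which displays as "B".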

10. References
Books & Journals

C. Nugent and J. C. Augusto (2006). Smart Homes and Beyond. IOS Press.

Klimis Symeonidis (2000). Hand Gesture Recognition using Neural Network. University of Surrey.