
Gesture Control Gaming

Tarinee Prasad Sahoo Kumar Mapanip Saheb Sahoo


Department of Computer Science and Engineering Department of Computer Science and Engineering
National Institute of Technology Rourkela National Institute of Technology Rourkela
Rourkela, India Rourkela, India
116CS0224@nitrkl.ac.in 716CS1046@nitrkl.ac.in

Abstract: This project applies deep learning and image processing to design a gesture-based system that uses hand gestures to mimic keyboard commands. These virtual commands serve as the input for a game, which is built with the PyGame module of Python.

I. INTRODUCTION

The main aim of this project is to enable interaction between a human and the applications running on a computer through basic shapes made by hand. Hand movements play an important role when we interact with other people, as they convey rich information in many ways. Gesture recognition is a type of perceptual computing user interface that allows computers to capture and interpret human gestures as commands. It is essentially the ability of a computer to understand gestures and execute commands based on those gestures. Many companies are working on gesture recognition systems. Most of them rely on additional sensors, such as wireless gloves fitted with sensors or the popular Microsoft Kinect, which uses a depth camera to segment the body and derives the gesture from that segmentation. In contrast, we decided to feed a neural network a large amount of data and see whether we can achieve the result without much additional hardware. We need only a laptop camera to build a gesture recognition system.

A. Why gesture?

Gesture is a natural form of expression and has been a vital mode of communication for humans. Building a gesture-based recognition system would be a first step toward a computer understanding human body language, thus building a bridge between human and computer.

B. How Gesture Recognition works?

Gesture recognition is an alternative technique for providing real-time data to a computer. Instead of typing with keys or tapping on a touch screen, a motion sensor perceives and interprets movements as the primary source of data input. The recognized gesture can be used as data input for games. The camera feeds image data into a sensing device (camera) connected to the computer. The software then identifies meaningful gestures from a predetermined gesture library in which every gesture is mapped to an associated command. The software correlates each real-time gesture, interprets it, and uses the library to identify the meaningful gestures that match it. Once the gesture has been interpreted, classification is performed.

C. Stages of Processing

There are four main stages in the working of the system, namely
Data Collection
Data Preprocessing
Deep Learning
Score Computation

D. Related Work

Gesture recognition is essentially the ability of a computer to understand gestures and execute commands based on those gestures. Most consumers are familiar with the concept through Wii Fit, Xbox and PlayStation games such as Just Dance and Kinect Sports.

II. ALGORITHMS USED AND WORKING

A. Adaptive Thresholding

Adaptive thresholding typically takes a grayscale or color image as input and, in the simplest implementation, outputs a binary image representing the segmentation. For every pixel in the image, a threshold has to be calculated. If the pixel value is below the threshold, it is set to the background value; otherwise it assumes the foreground value.

There are two main approaches to finding the threshold: (i) the Chow and Kaneko approach and (ii) local thresholding. The assumption behind both methods is that smaller image regions are more likely to have approximately uniform illumination and are therefore more suitable for thresholding. Chow and Kaneko divide an image into an array of overlapping subimages and then find the optimum threshold for every subimage by investigating its histogram. The threshold for every single pixel is found by interpolating the results of the subimages. The disadvantage of this method is that it is computationally expensive and, therefore, not appropriate for real-time applications.
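The per-pixel rule described above (below the threshold becomes background, otherwise foreground) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name apply_threshold_map, the 0/255 output values and the toy threshold map are hypothetical, and the sketch does not show how the per-pixel thresholds are obtained.

```python
import numpy as np

def apply_threshold_map(gray, thresholds, background=0, foreground=255):
    """Binarize `gray` using one threshold per pixel (hypothetical helper)."""
    gray = np.asarray(gray)
    thresholds = np.asarray(thresholds)
    if gray.shape != thresholds.shape:
        raise ValueError("threshold map must match the image shape")
    # Pixels below their threshold become background, all others foreground.
    return np.where(gray < thresholds, background, foreground).astype(np.uint8)

# Toy 2x4 image: a dark left half and a bright right half.
gray = np.array([[10, 40, 200, 240],
                 [20, 30, 210, 250]], dtype=np.uint8)

# Assumed threshold map: each half gets its own threshold, which is the
# point of adapting the threshold to local illumination.
thresholds = np.array([[25, 25, 225, 225],
                       [25, 25, 225, 225]], dtype=np.uint8)

binary = apply_threshold_map(gray, thresholds)
print(binary)
```

A single global threshold could not separate the structure in both halves of this toy image, whereas the per-pixel map does.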
An alternative approach to finding the local threshold is to statistically examine the intensity values of the local neighborhood of every pixel. The statistic that is most appropriate depends largely on the input image. Simple and fast functions include the mean of the local intensity distribution,

T = Mean

the median value,

T = Median

or the mean of the minimum and maximum values,

T = (Max + Min)/2

Fig 1: Adaptive Thresholding

B. Convolutional Neural Network

A convolutional neural network (CNN) is a class of deep neural networks, most commonly applied to analyzing visual imagery. CNNs use a variation of multilayer perceptrons designed to require minimal preprocessing. They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on their shared-weights architecture and translation invariance characteristics. Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field. CNNs use relatively little preprocessing compared to other image classification algorithms. This means that the network learns the filters that in traditional algorithms were hand-engineered. This independence from prior knowledge and human effort in feature design is a major advantage.

Fig 2: CNN

C. CNN Parameters

First layer: Convolution layer with a 5 * 5 sliding window and ReLU activation function
Second layer: Max pooling layer with a 2 * 2 pooling window
Third layer: Convolution layer with a 5 * 5 sliding window and ReLU activation function
Fourth layer: Max pooling layer with a 2 * 2 pooling window
Flattening of the outputs from the fourth layer
Fifth layer: Fully connected layer with 256 input nodes and ReLU activation
Sixth layer: Four nodes with sigmoid function

III. PROGRESS AND CONCLUSION

A. Setup

We developed our algorithms in Python 3. We employed TensorFlow 1.4 and Keras 2.0.9 to build the network and the loss function. We used OpenCV (cv2) 3.3.0 for image processing and for drawing bounding boxes and text. NumPy 1.2.1 was used for mathematical computations. The network model was trained on an NVIDIA NC6 in AWS.

B. Dataset

Training a convolutional neural network to good accuracy requires a lot of data. We prepared the entire dataset ourselves and trained the network on it.

C. Data Collection

We used the OpenCV module of Python for image processing. Images are extracted frame by frame from the live camera video and then converted to grayscale. Adaptive mean thresholding with binary thresholding is used, and a mask is applied. The final image that was saved was a 64x64 grayscale image.

D. Data Preprocessing

This step preprocesses the data, i.e. normalizes it and splits it into training and test sets. The split ratio is 75
Train 8000
Test 2000

E. CNN Model

Fig 3: Flow Chart Diagram

This section describes the architecture used. The model consists of two CONV BOX blocks followed by a fully connected layer. One CONV BOX consists of three layers:
Convolution Operation -> ReLU -> Max Pooling
So this operation is carried out twice. The data is then flattened and fed into the fully connected ANN.
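The CONV BOX sequence described above can be traced with a small NumPy sketch. It is not the authors' Keras model: it assumes a single-channel 64x64 input, one 5x5 filter per convolution, no padding, a stride of one and random weights, none of which the text specifies, and the function names are hypothetical. The fully connected layers are left out; the sketch only shows how the image size changes through the two boxes before flattening.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding (stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Replace negative activations with zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Take the maximum over non-overlapping size x size windows."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    x = x[:h, :w]
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def conv_box(x, kernel):
    """One CONV BOX: convolution -> ReLU -> max pooling."""
    return max_pool(relu(conv2d_valid(x, kernel)))

rng = np.random.default_rng(0)
image = rng.random((64, 64))             # stand-in for one 64x64 input image
kernel_1 = rng.standard_normal((5, 5))   # assumed single 5x5 filter
kernel_2 = rng.standard_normal((5, 5))

stage_1 = conv_box(image, kernel_1)      # 64 -> 60 after conv, 30 after pooling
stage_2 = conv_box(stage_1, kernel_2)    # 30 -> 26 after conv, 13 after pooling
features = stage_2.ravel()               # flattened input to the dense layers

print(stage_1.shape, stage_2.shape, features.shape)
```

With several filters per layer the flattened vector would be correspondingly longer, since each filter produces its own 13x13 map.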
The flow chart in Fig. 3 shows the steps performed in training the neural network with an appropriate, properly tuned set of hyperparameters.
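The normalization and train/test split mentioned under Data Preprocessing could look roughly like the sketch below. The text does not say how the data is normalized, so dividing 8-bit pixel values by 255 is an assumption, as are the shuffling, the helper names and the toy data. The example uses 10 random images split 8 to 2 as a small stand-in for the 8000 training and 2000 test images.

```python
import numpy as np

def normalize(images):
    """Scale 8-bit pixel values to the range [0, 1] (assumed scheme)."""
    return np.asarray(images, dtype=np.float32) / 255.0

def split_train_test(images, labels, n_train, seed=0):
    """Shuffle the samples, then keep the first n_train for training."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    order = np.random.default_rng(seed).permutation(len(images))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return (images[train_idx], labels[train_idx]), (images[test_idx], labels[test_idx])

# Toy stand-in for the real dataset: 10 random 64x64 images, 4 gesture classes.
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(10, 64, 64), dtype=np.uint8)
labels = rng.integers(0, 4, size=10)

x = normalize(images)
(x_train, y_train), (x_test, y_test) = split_train_test(x, labels, n_train=8)

print(x_train.shape, x_test.shape)
```

Shuffling before the split keeps the gesture classes from being grouped in one of the two sets when the images were recorded class by class.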

RESULT

Fig 4: Validation loss for the training set

Fig 5: Validation loss for the test set
CONCLUSION AND FUTURE WORK
As can be observed from the graphs, the accuracy of our model is very good, and hence it should work satisfactorily for systems based on four controls. As future work, we plan to develop a game based on multiple controls and to control it with gestures instead of the keyboard.
