A method for gesture recognition using a hidden Markov model approach

Pi19404

December 23, 2012

Contents

Gesture Recognition using HMM

0.1 Gesture
0.2 Hidden Markov Model
0.3 Gestures and HMM
    0.3.1 Gestures Representation
    0.3.2 Normalizing Gestures
    0.3.3 Discretization of Gestures
0.4 References

0.1 Gesture

A gesture is represented as a spatio-temporal sequence of feature vectors that describe the direction of hand movement. The hand gesture is a continuous-time sequence of feature vectors; each vector is converted to one of the codewords from a predefined set using a vector quantizer. We construct a codebook for each gesture and build an HMM that uses the spatio-temporal information encoded by the codewords representing the gesture. A unique initial state and a unique final state are defined in the model. The number of states in the model is determined by the complexity of the gesture. A higher number of states represents the gesture better, but at the cost of performance.

0.2 Hidden Markov Model

A hidden Markov model is a collection of finite states connected by transitions. Each state is characterized by two sets of probabilities: a transition probability distribution and an output probability distribution, which is discrete in the models used here.

A hidden Markov model is defined using:

Q - sequence of hidden states
O - sequence of observed states
S - input symbol set
V - observation symbol set
Q0 - initial state of the system
A - transition matrix
B - emission matrix

For each class we create a hidden Markov model.
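The definition above can be written down as a small data structure. The sketch below is only an illustration: the class name `DiscreteHMM` and the method `is_valid` are hypothetical, the transition values are the ones reported for A1 in the results of this article, and the uniform emission matrix is a placeholder rather than a trained B.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass
class DiscreteHMM:
    """Parameters of a discrete HMM with N hidden states and M symbols."""
    A: Matrix        # N x N, A[i][j] = P(next state j | current state i)
    B: Matrix        # N x M, B[i][k] = P(symbol k emitted | state i)
    pi: List[float]  # length N, initial state distribution

    def is_valid(self, tol: float = 1e-3) -> bool:
        """Check the shapes and that every probability row sums to 1."""
        n = len(self.pi)
        if n == 0 or len(self.A) != n or len(self.B) != n:
            return False
        if any(len(row) != n for row in self.A):
            return False
        m = len(self.B[0])
        if m == 0 or any(len(row) != m for row in self.B):
            return False
        rows = self.A + self.B + [self.pi]
        return all(abs(sum(row) - 1.0) <= tol for row in rows)


# Transition matrix A1 as reported in the results (N = 4 states).
A1 = [
    [0.6251, 0.3749, 0.0, 0.0],
    [0.0, 0.3728, 0.6272, 0.0],
    [0.0, 0.0, 0.8556, 0.1444],
    [0.0, 0.0, 0.0, 1.0],
]
# Placeholder emissions over M = 8 symbols (an assumption, not the trained B1).
uniform_B = [[1.0 / 8] * 8 for _ in range(4)]
model = DiscreteHMM(A=A1, B=uniform_B, pi=[1.0, 0.0, 0.0, 0.0])
```

Every row of A and B is a probability distribution, so a row-sum check of this kind is a cheap way to catch a badly initialised or badly loaded model before it is used.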

After observing a sequence of observations, we can determine which HMM is more likely to have generated it. Each HMM outputs a likelihood measure for the observed sequence, and the sequence is assigned to the class whose HMM gives the higher likelihood. Since any random input will produce some likelihood of belonging to one class or another, we need a threshold below which a sequence is considered to belong to neither class. This threshold is determined from the average likelihood obtained by processing observed sequences from the training or validation data with the HMM of the specific class. If the likelihood is below this threshold, we infer that none of the HMMs is likely to have produced the given observation.
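As a rough sketch of this decision rule, the likelihood of a symbol sequence under a discrete HMM can be computed with the scaled forward algorithm and compared across models. Everything named here is an assumption made for illustration: the function names, the two toy 2-state models `a` and `b`, and the log-domain thresholds of -3.0. Scores in this sketch are log-likelihoods, so higher means more likely, and the sequence must be non-empty.

```python
import math
from typing import Dict, List, Optional, Tuple

# A model is (A, B, pi): transition matrix, emission matrix, initial distribution.
Model = Tuple[List[List[float]], List[List[float]], List[float]]


def log_likelihood(model: Model, obs: List[int]) -> float:
    """Scaled forward algorithm: log P(obs | model), or -inf if impossible."""
    A, B, pi = model
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through A, then weight by the emission of obs[t].
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        total += math.log(scale)
    return total


def classify(models: Dict[str, Model], thresholds: Dict[str, float],
             obs: List[int]) -> Optional[str]:
    """Return the most likely model, or None if it scores below its threshold."""
    scores = {name: log_likelihood(m, obs) for name, m in models.items()}
    best = max(scores, key=lambda name: scores[name])
    return best if scores[best] >= thresholds[best] else None


# Toy left-to-right models (hypothetical values, 2 states and 2 symbols).
MODELS: Dict[str, Model] = {
    "a": ([[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]], [1.0, 0.0]),
    "b": ([[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]], [1.0, 0.0]),
}
THRESHOLDS = {"a": -3.0, "b": -3.0}
```

With these toy values, `classify(MODELS, THRESHOLDS, [0, 0, 1, 1])` selects model `a`, while an alternating sequence such as `[0, 1, 0, 1]` fits neither model well enough and is rejected.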

0.3 Gestures and HMM

0.3.1 Gestures Representation

The task of gesture recognition is, given a gesture, to decide whether it belongs to a known class of gestures. Thus a mathematical representation of gestures is required, together with a means to compare two gestures. Consider the case of a 2D gesture. A point of a 2D gesture is represented by its coordinates (x, y), and the gesture can be represented as a spatio-temporal sequence of 2D points. The first step is to record the gesture. Thus we record several examples of each of two types of gesture.

0.3.2 Normalizing Gestures

The representation of a gesture is required to be independent of its scale and of the position in 2D space at which it is performed. Thus we need to normalize the gesture. All gestures need to be scaled to the same size, aligned with each other and represented using the same number of points. First scaling is performed, followed by translation and then re-sampling. To scale the points, calculate the minimum and maximum values in each dimension and apply a linear scaling to the points. To center the gesture about the origin, the centroid of the points is calculated and all the points are translated with respect to the centroid.
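The scaling and centering steps can be sketched as follows. The function name `normalize` is hypothetical, and mapping each dimension independently onto the range [0, 1] is an assumption; the text only states that a linear scaling based on the per-dimension minimum and maximum is applied.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize(points: List[Point]) -> List[Point]:
    """Scale each dimension to unit range, then move the centroid to the origin."""
    if not points:
        return []

    def scale(values: List[float]) -> List[float]:
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            # Degenerate case: no extent in this dimension.
            return [0.0 for _ in values]
        return [(v - lo) / span for v in values]

    xs = scale([p[0] for p in points])
    ys = scale([p[1] for p in points])
    cx = sum(xs) / len(xs)  # centroid of the scaled points
    cy = sum(ys) / len(ys)
    return [(x - cx, y - cy) for x, y in zip(xs, ys)]
```

After this step, two recordings of the same shape drawn at different sizes and positions map to the same point sequence. Scaling each axis separately does not preserve the aspect ratio; a single scale factor for both axes would be an alternative design.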

The plots below show the gestures after the normalization steps.

(a) gesture 1

(b) gesture 2

Figure 1: Normalized gesture plots

The next step is to re-sample so that all gestures are represented by the same number of points. The simplest re-sampling strategy is uniform re-sampling. Below are example gestures after re-sampling.
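One possible reading of uniform re-sampling is to place a fixed number of points at equal spacing along the drawn path, interpolating linearly between the recorded points. The sketch below follows that reading. It is an assumption, since the text does not say whether the spacing is uniform in path length or in sample index, and the function name `resample` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample(points: List[Point], n: int) -> List[Point]:
    """Return n points spaced evenly along the polyline through `points`."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative path length at every recorded point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n  # all recorded points coincide
    out: List[Point] = []
    seg = 0  # index of the segment containing the current target length
    for k in range(n):
        target = total * k / (n - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0.0 else (target - cum[seg]) / seg_len
        t = min(max(t, 0.0), 1.0)
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

For example, `resample(recorded_points, 30)` would produce the 30-point representation used in the discussion of discretization, where `recorded_points` stands for one normalized gesture.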

(a) gesture 1

(b) gesture 2

Figure 2: Gesture plots after re-sampling

0.3.3 Discretization of Gestures

We require a mathematical representation of the gesture. Consider the ideal gesture that is required to be identified. Each discrete pixel value can be represented as a symbol, so a gesture can be represented as a sequence of symbols. However, the number of symbols in this case would be 1600, one for each pixel. Since a gesture does not occupy all the pixels, this representation is typically sparse and highly redundant. Calculations over such a large representation are expensive and not suitable for real-time applications.

Since a gesture is represented by 30 points, only 30 symbols are required to represent a particular gesture, so it is desirable to represent the gesture with a reduced set of symbols. Even 30 symbols may be too many to achieve real-time performance; if we reduce the symbol set further, we trade accuracy for performance.

One simple method is clustering: fix the number of desired symbols and perform clustering on the input data. Each point is then represented by a cluster, which reduces the size of the representation. First, the mean is calculated over all the samples at each point of the re-sampled data. Thus, if a gesture is represented using 30 points, we calculate the average coordinates of each of these 30 points over all the examples of the gesture. Clustering is then performed over the mean. The centroids of the clusters define the new points used to represent the gesture. Instead of 30 points, the gesture is represented using a reduced number of states. Once the centroids are determined, every point of the gesture is assigned to its closest centroid.
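The averaging and clustering steps can be sketched as below. The text does not name the clustering algorithm, so plain k-means (Lloyd's algorithm) is used here as an assumption, seeded deterministically with points spread along the mean gesture. The function names `mean_gesture`, `nearest` and `kmeans` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_gesture(examples: List[List[Point]]) -> List[Point]:
    """Average the k-th point over all examples (all of equal length)."""
    count = len(examples)
    return [
        (sum(g[k][0] for g in examples) / count,
         sum(g[k][1] for g in examples) / count)
        for k in range(len(examples[0]))
    ]


def nearest(point: Point, centroids: List[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[Point]:
    """Plain k-means, seeded with k points spread along the sequence."""
    step = len(points) / k
    centroids = [points[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else centroids[i]  # an empty cluster keeps its old centroid
            for i, g in enumerate(groups)
        ]
    return centroids
```

Once the centroids are known, a gesture becomes the symbol sequence `[nearest(p, centroids) for p in gesture]`, which is the form a discrete HMM consumes.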

(a) gesture 1

(b) gesture 2

Figure 3: Result after Clustering

Each centroid can be thought of as an observed state. Each data point is represented by a centroid (symbol), so the continuous data has been discretized. A gesture is thus defined as a sequence of observed states estimated from the training data: each input data point is associated with an observed state by the discretization process, and the gesture is presented as the resulting sequence of observed states.

During the training process, the model parameters that maximize the likelihood of the observed sequences are determined. The parameters to be chosen beforehand are the number of input states and the number of output states. Based on the number of output states, the gestures are discretized as sequences of observed states, which are used in the training process to determine the parameters of the HMM. Once the parameters are determined, the average likelihood of the observed sequences in the training/validation set is calculated; this gives the threshold of the given HMM.

For the given two sets of gestures:

1. M (8) - number of observed states
2. N (4) - number of input states

Thus we train two HMMs. The transition matrices are

$$A_1 = \begin{pmatrix} 0.6251 & 0.3749 & 0 & 0 \\ 0 & 0.3728 & 0.6272 & 0 \\ 0 & 0 & 0.8556 & 0.1444 \\ 0 & 0 & 0 & 1.0000 \end{pmatrix}$$

$$A_2 = \begin{pmatrix} 0.1000 & 0.9000 & 0 & 0 \\ 0 & 0.7568 & 0.2432 & 0 \\ 0 & 0 & 0.7501 & 0.2499 \\ 0 & 0 & 0 & 1.0000 \end{pmatrix}$$

and the initial state distributions are

$$P_1 = P_2 = \begin{pmatrix} 1.0000 & 0.0000 & 0.0000 & 0.0000 \end{pmatrix}$$

The emission matrices B1 and B2 are each N x M = 4 x 8. Their row and column layout cannot be recovered from the source; the entries, in source order, are:

0.0000 0.0000 0.0000 0.0000 0.0000 0.6017 0.3983 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0077 0.9923 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0270 0.0000 0.9730 0.0000 0.0000 0.0000 0.0000 0.5866 0.4134 0.0000 0.0000 0.0000 0.0000 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000 0.9997 0.2525 0.1163 0.2359 0.0000 0.1429 0.0000 0.0000 0.2525 0.2193 0.2246 0.1818 0.0053 0.2032 0.1604 0.0053 0.0000

We compute the likelihood of the validation/training data and set the threshold for each gesture to twice the average likelihood. For the present data we obtain TA = 79.4310 and TB = 60.5704. Given a test gesture, its likelihood is computed against each HMM. For the first model, the likelihoods of the 4 test gestures are 42.296347, 142.296347, 30.364364 and 70.747603. The third gesture is incorrectly identified as being generated by the model.
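The thresholding step can be sketched with the numbers reported above. The sketch rests on an assumption that the text does not state: the reported values are treated as costs (for example negative log-likelihoods), so a smaller value means a better match and a gesture is accepted when its value does not exceed the threshold. The function names are hypothetical.

```python
from typing import List


def rejection_threshold(training_scores: List[float], factor: float = 2.0) -> float:
    """Threshold = factor x mean score over the training/validation sequences."""
    return factor * sum(training_scores) / len(training_scores)


def accepted(score: float, threshold: float) -> bool:
    """Scores are treated as costs: accept only if not above the threshold."""
    return score <= threshold


T_A = 79.4310  # threshold reported for the first model
test_scores = [42.296347, 142.296347, 30.364364, 70.747603]
decisions = [accepted(s, T_A) for s in test_scores]
```

Under this assumption the second test gesture is rejected and the other three are accepted by the first model, which includes the third gesture that the text reports as incorrectly identified.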

(a) gesture 1

(b) gesture 1

(c) gesture 1

(d) gesture 1

Figure 4: Test Data

Different numbers of samples, input states and observed symbols were tried in order to find reasonable model parameters. The final configuration was as follows. 120 samples were chosen to represent each gesture. The numbers of input and output symbols were both chosen to be 20. The thresholds in this case were 405.235169 and 389.913413.

The values of the 4 gestures for model 1 were 212.277444, Inf, 373.451343 and 325.134281.
The values of the 4 gestures for model 2 were 560.252356, 210.214212, 717.249978 and 439.386705.

In this simple model, although the respective gestures are identified correctly, some incorrect gestures are still accepted in spite of using a threshold-based approach to eliminate gestures with lower likelihood. The gestures that were incorrectly identified form part of gesture model 1. A template-based approach gives better performance for such gestures. How to improve the likelihood-based rejection still needs to be investigated. This aspect needs to be improved if HMMs are to be used in a real-time system, where many invalid gestures occur and the model must be capable of eliminating them.
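The decision over both models for the final configuration can be reproduced from the values listed above. As before, this is a sketch under the assumption that the values are costs, so the model with the smallest value wins and the gesture is rejected if that value exceeds the winning model's threshold. The function name `decide` is hypothetical.

```python
import math
from typing import List, Optional

THRESHOLDS: List[float] = [405.235169, 389.913413]  # model 1, model 2
SCORES: List[List[float]] = [
    [212.277444, math.inf, 373.451343, 325.134281],    # model 1, test gestures 1-4
    [560.252356, 210.214212, 717.249978, 439.386705],  # model 2, test gestures 1-4
]


def decide(gesture: int) -> Optional[int]:
    """Return the 1-based index of the winning model, or None if rejected."""
    costs = [row[gesture] for row in SCORES]
    best = min(range(len(costs)), key=lambda m: costs[m])
    return best + 1 if costs[best] <= THRESHOLDS[best] else None


labels = [decide(g) for g in range(4)]
```

With these numbers the first gesture is assigned to model 1 and the second to model 2, while the third and fourth are also accepted by model 1 rather than rejected. This is consistent with the observation that the wrongly accepted gestures resemble gesture model 1.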

0.4 References

1. http://cs229.stanford.edu/section/cs229-hmm.pdf

2. http://www.creativedistraction.com/demos/gesture-recognition-kinect-with-hidden-marko
