4.1 Architecture

The proposed virtual keyboard system performs gesture acquisition by a mono vision sensor, which can typically be a CCD or CMOS imaging sensor. The functional architecture of the proposed system is shown in Figure 4-1.


[Figure: block diagram of the stages Gesture Acquisition → Gesture Feature Extraction → Gesture Recognition → Alphabet / Character Emitted]

Figure 4-1 Architecture of Gesture Recognition System


The sensor senses the demonstrated gesture and presents the information in signal form. In a vision based gesture recognition system, a 2D signal is captured, which is then combined to form a temporal signal. The gesture acquisition stage captures this signal and records it in an appropriate way. Usually there is a pre-processing stage, which is optional and filters the noise from the acquired signal. Then the gesture feature extraction stage extracts the features from the acquired and pre-processed information. These derived features are further utilized in the gesture recognition stage.
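The staged flow described above can be sketched as a simple function pipeline. This is an illustrative sketch only: the stage functions below (moving-average filtering, frame differencing, a two-way classifier) are invented placeholders, not the thesis implementation.

```python
from typing import List

# Hypothetical sketch of the acquisition -> recognition pipeline; all stage
# bodies are toy stand-ins for the real signal processing.

def preprocess(signal: List[float]) -> List[float]:
    """Optional noise-filtering stage: a simple 3-point moving average."""
    if len(signal) < 3:
        return signal[:]
    return [sum(signal[i - 1:i + 2]) / 3.0 for i in range(1, len(signal) - 1)]

def extract_features(signal: List[float]) -> List[float]:
    """Toy feature extraction: per-frame differences of the temporal signal."""
    return [b - a for a, b in zip(signal, signal[1:])]

def recognize(features: List[float]) -> str:
    """Toy recognition stage: map net motion direction to a character."""
    return "A" if sum(features) >= 0 else "B"

def gesture_pipeline(acquired_signal: List[float]) -> str:
    filtered = preprocess(acquired_signal)      # optional pre-processing
    features = extract_features(filtered)       # feature extraction
    return recognize(features)                  # recognition -> character
```

The point of the sketch is the separation of concerns: each stage consumes only the previous stage's output, so any stage (e.g. the noise filter) can be swapped without touching the others.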

[Figure: comparison of signal paths. Traditional keyboard: key stroke hand movement → transducer action → character emitted. Mono vision based virtual keyboard: key stroke hand movement → gesture analysis → character emitted.]

Figure 4-2 Keyboard Comparison

Suppose the output of the keyboard system is defined as

C = {c_1, c_2, ..., c_L}

where c_i is either an alphanumeric character, such as c_1 = 'A', c_2 = 'B' etc., or some control key, with i = 1, 2, ..., L, where L = 63 is the total number of keys on the keyboard (as L = 63 in [9]). Both the traditional and the gesture based virtual keyboard emit c_i as output, as shown in Figure 4-2.
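To make the key count concrete, the output set can be enumerated as below. The split into 26 letters, 10 digits, and 27 control keys is an assumption for illustration; the thesis states only that L = 63, not the exact layout.

```python
import string

# Illustrative stand-in for the output set C = {c_1, ..., c_L} with L = 63.
# The partition into letters/digits/control keys is assumed, not specified.
alphabet = list(string.ascii_uppercase)      # c_1 = 'A', c_2 = 'B', ...
digits = list(string.digits)
control_keys = [f"CTRL_{n}" for n in range(63 - len(alphabet) - len(digits))]

C = alphabet + digits + control_keys         # the emitted-character set
L = len(C)                                   # total number of keys, L = 63
```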

In a traditional keyboard, the transducer action is performed in an electro-mechanical switch fashion, while the mono vision gesture based virtual keyboard analyzes the hand and finger gestures in the video sequence. The concept of making a key stroke by both the traditional keyboard and the gesture based virtual keyboard is shown in Figure 4-3.

[Figure: traditional keyboard: key stroke → switch → character emitted. Virtual keyboard: key stroke hand movement → gesture estimation → character emitted.]

Figure 4-3 Keyboard Functional Comparison

Hand video is captured continuously. The concept of a dominant finger is introduced, which defines the dominant finger as the finger responsible for making the key stroke. Whenever the dominant finger is triggered, a gesture estimation procedure is initiated, as shown in Figure 4-4. This gesture estimation procedure reveals the key stroke c_i, where i = 1, 2, ..., L.





[Figure: flow chart in which each frame is tested for "Finger Triggered?"; on Yes, Gesture Estimation runs and emits the key stroke c.]

Figure 4-4 Concept of Dominant Finger
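The trigger-then-estimate control flow of Figure 4-4 can be sketched as follows. The motion threshold and the estimation placeholder are assumptions; the thesis defines the actual trigger via the fuzzy machinery of the following sections.

```python
from typing import Iterable, Optional

# Sketch of the Figure 4-4 loop, under the assumption that a "trigger" is the
# dominant finger's per-frame motion exceeding a threshold.
TRIGGER_THRESHOLD = 5.0  # assumed motion threshold (pixels per frame)

def estimate_gesture(frame_index: int) -> str:
    """Placeholder for the gesture estimation procedure of Section 4.2."""
    return "c_%d" % frame_index

def monitor(dominant_finger_motion: Iterable[float]) -> Optional[str]:
    """Scan frames until the dominant finger triggers, then estimate."""
    for frame_index, motion in enumerate(dominant_finger_motion):
        if motion > TRIGGER_THRESHOLD:            # finger triggered?
            return estimate_gesture(frame_index)  # yes -> key stroke c_i
    return None                                   # no key stroke this pass
```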

4.2 Algorithmic Design of Mono Vision Gesture Based Virtual Keyboard

The algorithmic detail of the gesture based virtual keyboard is shown in Figure 4-5. The proposed virtual keyboard captures the digital video sequence by a mono vision sensor such as a CCD or CMOS sensor [7]. Frames F(x, y) are extracted, where (x, y) are the spatial coordinates in the image plane.





[Figure: block diagram: Hand Detection → Finger Isolation → Finger Movement Estimation → Trajectory Queue; hand gesture trajectories feed a Knowledge Fusion / Fuzzy Rule Based stage with a Fuzzifier, which emits the key stroke c.]

Figure 4-5 Algorithmic Detail of Virtual Keyboard

The pre-processing stage performs segmentation, where the hand and finger joints are extracted. This has been thoroughly investigated [3], [4] and is not discussed here. This research work adopted a parametric approach, shown in Figure 4-6, where each hand is modeled as a set of joints J_i, where i = 1, 2, ..., N, with N = 19 for either the right or the left hand.


Each J_i is defined as J_i = [J_i,x, J_i,y], where J_i,x and J_i,y define the position of the i-th joint in the (x, y) coordinates of frame F(x, y).

Figure 4-6 Hand Representation by Parametric Approach

Recording the positions of a particular joint for M number of frames represents the trajectory of that joint and is defined as:

T_i = [J_i(1), J_i(2), ..., J_i(M)]

Cumulatively, the trajectories of all joints are defined as:

T = [T_1, T_2, ..., T_N]^T

where a particular T_i represents the trajectory of the i-th joint over M number of frames and [·]^T represents the transpose. The overall information contained by the matrix T of dimension N × M can be defined as:

T = | J_1(1)  J_1(2)  ...  J_1(M) |
    | J_2(1)  J_2(2)  ...  J_2(M) |
    |  ...     ...    ...    ...  |
    | J_N(1)  J_N(2)  ...  J_N(M) |
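The assembly of the N × M trajectory matrix from per-frame joint positions can be sketched as below. The joint coordinates here are invented sample data, not output of the actual segmentation stage, and small N and M are used in place of N = 19.

```python
# Minimal sketch of building the trajectory matrix T, where T[i][m] = J_i(m):
# row i is the trajectory T_i of joint i over M frames.
N, M = 3, 4  # 3 joints over 4 frames (the thesis uses N = 19 per hand)

# frames[m][i] is the (x, y) position of joint i in frame m (sample data)
frames = [[(i + m, i - m) for i in range(N)] for m in range(M)]

# Transpose the per-frame lists into per-joint trajectories
T = [[frames[m][i] for m in range(M)] for i in range(N)]
```

Storing the data joint-major (one row per joint) makes the per-joint trajectory T_i directly available as a row, which is the form the membership computation of the next section consumes.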

4.2.1 Fuzzy Representation of Trajectories


Let G_j, j = 1, 2, ..., L, represent the pre-stored templates of the reference gestures of key strokes. Each G_j corresponds to a particular key stroke c_j. The G_j's are learned and stored in an off-line training phase. Let J_i^T(m) be the location of the i-th joint in the m-th frame of the instantaneous trajectory T, and J_i^Gj(m) be the location of the same joint in the pre-stored gesture G_j. A fuzzy membership function [8] μ_Gj is computed that defines the degree of membership of the instantaneous trajectory T in the pre-stored trajectories of gestures G_j over a universe of discourse (the set of all possible gestures) as

μ_Gj = 1 − D_j

where D_j is the distance between the instantaneous trajectory and the pre-stored gesture G_j, normalized over the universe of discourse:

D_j = (1 / (N · M)) Σ_{i=1..N} Σ_{m=1..M} || J_i^T(m) − J_i^Gj(m) ||

The output of the μ_Gj membership function lies in the range [0, 1]: '0' corresponds to minimum membership of the instantaneous trajectory in some pre-stored gesture, while '1' corresponds to the maximum.
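The membership computation can be sketched as follows. The normalizing constant d_max (the distance at which membership reaches zero) is an assumption, since the exact normalization is not pinned down here; trajectories are passed in the [joint][frame] layout of the trajectory matrix.

```python
from math import hypot

# Sketch of mu_Gj = 1 - D_j, assuming D_j is the mean joint-position distance
# scaled by an assumed maximum distance d_max so that mu lies in [0, 1].
def membership(traj, gesture, d_max=100.0):
    """traj, gesture: [joint][frame] -> (x, y). Returns mu_Gj in [0, 1]."""
    n_joints, n_frames = len(traj), len(traj[0])
    total = 0.0
    for i in range(n_joints):
        for m in range(n_frames):
            (tx, ty), (gx, gy) = traj[i][m], gesture[i][m]
            total += hypot(tx - gx, ty - gy)   # ||J_i^T(m) - J_i^Gj(m)||
    d = total / (n_joints * n_frames * d_max)  # normalized distance D_j
    return max(0.0, 1.0 - d)                   # clamp into [0, 1]
```

An identical trajectory yields membership 1, and a trajectory displaced by d_max or more yields 0, matching the stated range of μ_Gj.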

4.2.2 Fuzzy Representation of Dominant Finger

The finger responsible for making the key stroke is extracted and is called the dominant finger. The same joint J_i is now classified as J_pq(x, y), where J_pq(x, y) is the position of joint p in finger q, as shown in Figure 4-7, with p = 1, ..., 4 representing the total number of joints in a finger, and q = 1, ..., 5 representing the total number of fingers in either hand.

Let R_q(m) be the motion of finger q between two successive frames, given below:

R_q(m) = Σ_{p=1..4} sqrt( (dJ_pq,x/dm)^2 + (dJ_pq,y/dm)^2 )

The aggregate motion of the q-th finger is defined as

S_q = Σ_{m=1..M−1} R_q(m)

Figure 4-7 Model of Hand
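The per-frame and aggregate finger-motion measures can be sketched as below, with the frame-to-frame difference standing in for the derivative dJ/dm. The [frame][joint] data layout is an assumption for illustration.

```python
from math import hypot

# Sketch of R_q(m) and S_q: R_q(m) sums, over the joints of finger q, the
# Euclidean displacement between frames m-1 and m; S_q accumulates R_q(m).
def finger_motion(joints):
    """joints[m][p] = (x, y) of joint p of one finger in frame m.
    Returns (list of R_q(m) for m = 1..M-1, aggregate motion S_q)."""
    r = []
    for m in range(1, len(joints)):
        step = sum(hypot(x1 - x0, y1 - y0)
                   for (x0, y0), (x1, y1) in zip(joints[m - 1], joints[m]))
        r.append(step)
    return r, sum(r)
```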

Then the membership of the dominant finger in making the key stroke is computed by the membership function μ_Fq, which is mathematically represented as

μ_Fq = 0                                      for 0 ≤ S_q ≤ 0.2 S_q,max
μ_Fq = (S_q − 0.2 S_q,max) / (0.2 S_q,max)    for 0.2 S_q,max < S_q ≤ 0.4 S_q,max
μ_Fq = 1                                      for 0.4 S_q,max < S_q ≤ 0.55 S_q,max
μ_Fq = (0.7 S_q,max − S_q) / (0.15 S_q,max)   for 0.55 S_q,max < S_q < 0.7 S_q,max
μ_Fq = 0                                      for S_q ≥ 0.7 S_q,max          (4-12)
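A trapezoidal membership of this kind can be sketched directly. The breakpoints at 0.2, 0.4, 0.55, and 0.7 of S_q,max are taken from the fragments of Eq. (4-12); the exact slopes are an assumption, not necessarily the thesis's precise values.

```python
# Sketch of a trapezoidal membership function for the dominant finger,
# assuming breakpoints at 0.2, 0.4, 0.55, and 0.7 of S_q,max (Eq. 4-12).
def dominant_membership(s_q: float, s_max: float) -> float:
    x = s_q / s_max                 # normalized aggregate motion
    if x <= 0.2 or x >= 0.7:
        return 0.0                  # too little or too much motion
    if x <= 0.4:
        return (x - 0.2) / 0.2      # rising edge
    if x <= 0.55:
        return 1.0                  # plateau: clearly dominant
    return (0.7 - x) / 0.15         # falling edge
```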

4.2.3 Knowledge Fusion / Fuzzy Rule Base

The recorded trajectory information is compared to the pre-stored gestures by a fuzzy rule base approach. Thirty two fuzzy rules are defined, which are simultaneously applied to both hands. Each time, the key stroke defined by one hand is reported. For example, the general architecture of the fuzzy rule base may be defined as follows:

IF μ_Gj is G_j ∧ μ_F1 is high ∧ μ_F2 is low ∧ μ_F3 is low ∧ μ_F4 is low THEN K is c_j

The implementation of the above rule base for particular characters is given below:

IF μ_Ga is G_a ∧ μ_F1 is high ∧ μ_F2 is low ∧ μ_F3 is low ∧ μ_F4 is low THEN K is k_a
IF μ_Gb is G_b ∧ μ_F1 is low ∧ μ_F2 is low ∧ μ_F3 is low ∧ μ_F4 is high THEN K is k_b
IF μ_Gc is G_c ∧ μ_F1 is low ∧ μ_F2 is high ∧ μ_F3 is low ∧ μ_F4 is low THEN K is k_c

The above rule base is extended to all keys on the keyboard. Finally, a defuzzifier [6] based upon the centroid method is incorporated to declare the key pressed.

4.2.4 Degree of Confidence of Key Stroke

DoC (Degree of Confidence) is an essential parameter for the evaluation of the accuracy of the key stroke revealed by the gesture recognition algorithm. DoC gives the level of confidence in the declared decision of the key stroke. If DoC is below a threshold, then additional supportive tools, e.g. dictionary look-up tables, can be employed. DoC is defined as

DoC = ((K_a − K_b) / K_a) × 100

where K_a = max(K_j), j = 1, 2, ..., L, is the highest rule output and K_b = max(K_j) for j ≠ a is the second highest.


4.3 Summary

This chapter was devoted to the description of the developed virtual keyboard algorithm. The proposed algorithm is based upon a fuzzy logic based gesture recognition implementation. A parametric approach for hand representation has been followed, where the joints in the fingers become the reference points. This parametric information is processed in two different streams whose outcomes are combined by a fuzzy rule base approach. DoC (Degree of Confidence) [5] is consulted to check the health of the declared key stroke. The threshold for DoC has been selected empirically.

4.4 References

[1] M. M. Cerney, J. M. Vance, "Gesture recognition in virtual environments: a review and framework for future development," Iowa State University Human Computer Interaction Technical Report ISU-HCI-2005-01, Mar. 28, 2005.
[2] M. Turk, M. Kolsch, "Keyboards without keyboards: a survey of virtual keyboards," UCSB Technical Report, 2002.
[3] M. B. Caglar, N. Lobo, "Open hand detection in a cluttered single image using finger primitives," Computer Vision and Pattern Recognition Workshop 2006 (CVPRW'06).
[4] M. Kolsch, M. Turk, "Robust Hand Detection," 6th IEEE International Conference on Automatic Face and Gesture Recognition (FGR'04).
[5] M. Mufti, "Fault detection and identification using fuzzy wavelets," PhD Dissertation, Georgia Institute of Technology, August 1995.
[6] T. J. Ross, "Fuzzy Logic with Engineering Applications," Wiley, 2004.
[7] C. Demant, B. Streicher-Abel, P. Waszkewitz, "Industrial Image Processing: Visual Quality Control in Manufacturing," Springer, 1999.
[8] H. X. Li, V. C. Yen, "Fuzzy Sets and Fuzzy Decision Making," CRC Press, 1995.
[9] www.vkb.co.il

