ABSTRACT
In this paper, we present an interactive dancing game based on motion capture technology. We address the problem of real-
time recognition of the user’s live dance performance in order to determine the interactive motion to be rendered by a
virtual dance partner. The real-time recognition algorithm is based on a human body partition indexing scheme with
flexible matching to determine the end of a move as well as to detect unwanted motion. We show that the system can
recognize the live dance motions of users with good accuracy and render the interactive dance move of the virtual partner.
Copyright © 2011 John Wiley & Sons, Ltd.
KEYWORDS
interactive dancing game; motion capture; real-time motion recognition
*Correspondence
Liqun Deng, Computer Science and Technology, University of Science and Technology of China, Hefei, China.
E-mail: dlqun@mail.ustc.edu.cn
dance move and its corresponding reactive move is predefined. In the game, the dance performance of a user is captured live, preprocessed, and recognized in real time by a novel online classifier. At the same time, according to the recognition result, the appropriate interactive dance motion of the virtual dance partner is promptly determined and rendered. This gives the user the impression that he/she is dancing in collaboration with the virtual partner. The primary contributions of this paper include the following:

- We propose a novel approach to recognize dance motions in real time. We develop an online human motion recognizer based on a human body partition scheme. We present our flexible matching method and several rules which contribute to the good performance on both continuous motion recognition and unwanted motion detection.

- We implement an interactive dancing game system that can be used for dance training and entertainment. Experiments and the user study demonstrated that our classifier is effective in online motion recognition, and the proposed system is well accepted by the investigated users.

The rest of this paper is organized as follows: the next section briefly reviews the related work. Next, the real-time motion classifier is described and the interactive dancing game application is illustrated in the following two sections, respectively. The performance evaluation section presents the experimental results and user studies. The conclusion and future work are provided in the final section.

2. RELATED WORK

There have been many efforts towards the offline analysis of a single person's motion. Kovar et al. [6] created the novel concept of the motion graph, which organizes mocap data as a directed graph and was demonstrated to be efficient in generating different locomotion styles for animation. Lee et al. [7] proposed to precompute the reactive motions for avatars from a large collection of motion data so as to reduce the time delay during the interactive control of avatars in the animation.

Synchronizing motion with music is another important aspect. Shiratori et al. [8] and Kim et al. [9] identified appropriate dance motions by matching them with the input music. Alankus et al. [10] proposed an automatic approach to synthesize dance motions with musical rhythms.

Lee et al. [11] worked on the real-time control of avatars with mocap data. In their approach, the motion data in the database are preprocessed both by a Markov process in the lower layer and a clustering technique in the higher layer.

Various classification techniques have been applied to recognizing motions. Li et al. [12] applied singular value decomposition (SVD) for feature extraction on motions and proposed a new eigen-feature-based similarity measure to classify them. Tormene et al. [13] proposed a new variant of dynamic time warping named open-end DTW (OE-DTW), which allows matching incomplete time series patterns with complete ones. As another extension of DTW, continuous dynamic programming (CDP) was also utilized for human gesture recognition [14]. Liang et al. [2] recognized motions recorded by accelerometers using a continuous hidden Markov model (HMM) based classifier.

In our prior work [15], we addressed the problem of continuously recognizing the dance moves of a long dance sequence, but did not consider the real-time issue. The existing methods above are not suitable for our case, in which a recognition decision is required each time a few new frames are input, instead of waiting for the end of the entire pattern before starting the recognition process. We need a fast method that does not require too much training data.

Our recognition method is motivated by the success of indexing techniques applied to motion retrieval [16–18], which are efficient and fast in searching a large dataset for a query motion. For example, Chiu et al. [16] proposed to
230 Comp. Anim. Virtual Worlds 2011; 22:229–237 ß 2011 John Wiley & Sons, Ltd.
DOI: 10.1002/cav
L. Deng et al. Real-time mocap dance recognition
partition a human skeletal model into nine body parts and construct an index map for each of the body parts through self-organizing map (SOM) clustering. These maps are then used for querying a motion in a long motion sequence. Two advantages of body-partition-based retrieval are: (1) reducing the computation cost by partitioning whole-body motions of high dimension into a set of body-part motions of low dimension, and (2) avoiding the disharmony that usually occurs in dance between different body parts.

We use a similar scheme, and extend it from motion retrieval to real-time motion recognition. In addition, each time a new block of input motion is recognized, DTW is used to further extract its temporal alignment with the corresponding template motion, and thus to decide the exact interactive motion clip.

3. REAL-TIME MOTION CLASSIFIER

The framework of our proposed classifier is shown in Figure 1. It is divided into the indexing and recognition stages, which are presented in the following subsections.

3.1. Indexing Stage

3.1.1. Motion Representation.
In this prototype system, we consider the Agogo dance. We assume that this dance is composed of 19 classes of dance moves of different difficulty levels. Some moves are symmetric, in which the male and female dancers' motions are the same, while in other moves the two dancers' motions are completely different but collaborative. More information about our dance moves is given in our prior work [15]. For each class, 15 trials are pre-captured from five subjects using an optical system. The durations of the motion clips range from 85 to 360 frames. Each motion is measured by a set of 3D rotations of 20 body joints (see Figure 2(a)).

On the other hand, to facilitate a higher-level description of the motions, the joints of the skeletal model are grouped into five partitions: torso, left upper limb, right upper limb, left lower limb, and right lower limb, as shown in Figure 2(b). Hence each motion is represented by five disjoint sub-motion matrices corresponding to the body partitions along the frame time, which are clustered and indexed separately.

3.1.2. Clustering Motions With SOM.
For each of the C classes, we select K trials as the training source and build the index structure (in our system, C = 19, K = 5). Thus there are K × C sub-motions for each body partition. These sets of sub-motions are clustered using a SOM-based approach.

For the technical theory of SOM, we refer readers to the book by Duda et al. [19]. In our case, for each set of sub-motions specified by a certain body part, we collect all the corresponding frames and cluster them using a two-step procedure. The first step is to train a SOM with an existing SOM toolbox [20], and take the weight vectors of the SOM nodes as cluster centers. During SOM training, the initial SOM parameters are set to the defaults of the toolbox [21]. After this step, we find that most of the resulting clusters are not or rarely indexed by the training frames, so the second step iteratively refines the result. We first discard the clusters that are not or rarely indexed (fewer than 40 times in our case), then resize the SOM map according to the number of remaining cluster centers and retrain on the frames. This step is repeated until all the resulting cluster centers are well indexed by the training frames. The rationale behind this step is to reduce the noise in the result caused by irregular training frames.

Figure 3 shows an example of projecting motions into the clusters. Following the notation of Wu et al. [17], the resulting sequences of cluster IDs are known as motion strings, and each motion trial is transformed into five motion strings, with each frame represented by a vector of five elements as the projections of the body parts.
Figure 2. (a) Human skeletal model. (b) Five body partitions of a skeletal model.
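As a concrete illustration of the projection step above, the sketch below maps each frame's per-partition pose vector to the ID of its nearest cluster center, yielding one motion string per body part. This is a minimal sketch under our own assumptions: the names (`motion_to_strings`, `part_slices`) are hypothetical, and a plain Euclidean nearest-center lookup stands in for the trained SOM projection.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def nearest_center(vec, centers):
    """Return the index (cluster ID) of the center closest to vec."""
    return min(range(len(centers)), key=lambda i: dist(vec, centers[i]))

def motion_to_strings(frames, part_slices, part_centers):
    """Project a motion onto per-part cluster centers.

    frames       : list of per-frame feature vectors (joint rotations).
    part_slices  : dict part_name -> slice selecting that part's features.
    part_centers : dict part_name -> list of trained cluster-center vectors.
    Returns one cluster-ID string (list of ints) per body part.
    """
    strings = {part: [] for part in part_slices}
    for frame in frames:
        for part, sl in part_slices.items():
            strings[part].append(nearest_center(frame[sl], part_centers[part]))
    return strings

# Toy example: 1-D "torso" and "left arm" features, two centers each.
slices = {"torso": slice(0, 1), "left_arm": slice(1, 2)}
centers = {"torso": [[0.0], [1.0]], "left_arm": [[0.0], [2.0]]}
frames = [[0.1, 1.9], [0.9, 0.2], [1.1, 2.1]]
print(motion_to_strings(frames, slices, centers))
# {'torso': [0, 1, 1], 'left_arm': [1, 0, 1]}
```

In the paper's setting, each frame would thus contribute one cluster ID per body partition, and the five ID sequences together form the indexed representation of a trial.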
The final sizes of the five SOM cluster sets are 10, 28, 33, 15, and 19, respectively.

3.1.3. Building Indexing Maps.
This step aims to build five index maps corresponding to the torso, left upper limb, left lower limb, right upper limb, and right lower limb, respectively.

First, for each motion class, the K trials are combined into a template model (TM). Let L be the largest length among the K trials. We scale the trials to contain L frames by uniform scaling [22] and combine them (see Figure 4 for an example). Thus the resulting model also has L frames, and each frame is composed of five items with respect to the five body parts, each item being a unification of the corresponding K values. Second, the C TMs are used to build the index maps. Each map is actually a hash table consisting of two parts: entries and content. The number of entries is exactly the number of SOM centers of the corresponding body part generated in the previous subsection, and the entries are specified by the cluster IDs (see Figure 5). The content of the entries is filled by traversing the TMs; that is, if the torso of the i-th frame of model m contains ID j, then a new pair (m, i) is added to the content of the j-th entry of the torso map. The average numbers of pairs per entry in the five index maps are 120, 102, 88, 217, and 139, respectively.

3.2. Recognition Stage

In this subsection, we apply the index maps to real-time motion recognition. We propose a flexible matching scheme to search for the match of a query motion among the template models.
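To make the index-map structure concrete, the sketch below builds a hash table mapping each cluster ID to its (model, frame) pairs, in the spirit of the torso-map description above, and adds a simple per-model hit counter. The `vote` helper is our own illustrative assumption; the paper's actual matching rules (flexible matching and end-point detection) are more elaborate than this.

```python
from collections import defaultdict

def build_index_map(template_strings):
    """Build one index map for a single body part.

    template_strings : dict model_id -> motion string (one cluster ID
                       per frame of that template model).
    Returns dict cluster_id -> list of (model_id, frame) pairs.
    """
    index_map = defaultdict(list)
    for model_id, string in template_strings.items():
        for frame, cluster_id in enumerate(string):
            index_map[cluster_id].append((model_id, frame))
    return dict(index_map)

def vote(index_map, query_ids):
    """Count, per template model, how many index hits the query frames score."""
    counts = defaultdict(int)
    for cid in query_ids:
        for model_id, _frame in index_map.get(cid, []):
            counts[model_id] += 1
    return dict(counts)

# Toy torso motion strings for two template models.
tms = {"move_1": [0, 0, 1, 2], "move_2": [2, 2, 1, 1]}
imap = build_index_map(tms)
print(imap[1])                # [('move_1', 2), ('move_2', 2), ('move_2', 3)]
print(vote(imap, [0, 1, 1]))  # {'move_1': 4, 'move_2': 4}
```

A query frame's cluster ID then costs one hash lookup per body part, which is what makes the per-block recognition decision cheap enough for real-time use.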
is considered as reaching the end point. If the length of the currently recognized motion is larger than a fixed length minLen, it is considered a meaningful dance move; otherwise it is considered an unwanted move (in our system, k = 2, minLen = 60).

After the end point of a recognized move is detected, the input motion sequence is segmented and the counts for all the TMs are reset to zero, hence restarting the recognition process.

3.2.4. Extracting Temporal Alignment.
After the previous step, the label of the input motion is determined. However, this does not guarantee that the alignment between the input motion and the recognized template motion is exact, which is important for the system to produce smooth interactive motions.

We employ OE-DTW [13] to align the motions. One advantage of OE-DTW over traditional DTW is that it does not require prior knowledge of the length of the motion to be matched, and its distance is determined by the last column of the distance table, so it is suitable for online applications. Suppose Q is an online motion clip and R is a template motion; Figure 7 shows an example of OE-DTW. The current input Q matches the prefix R1...I of R.

For each move class, among the K trials, the trial with minimal average DTW distance is chosen as the template motion of that class. The real-time input frames are continuously aligned with the template motions, and the result is then used to decide the exact interactive motion clip to animate the avatar in real time.

4. INTERACTIVE DANCING GAME

We have implemented the interactive dancing game system. The application provides two modes: the training mode and the freestyle mode.

The training mode demonstrates pre-captured dance motions so that users can watch and learn how they should dance with the virtual partner. An example scenario is shown in Figure 8(a).

In the freestyle mode, the user is allowed to dance freely in a given time period, and his/her dance is captured live and processed by the game system in real time (see Figure 8(b)). The music related to the dance is also played. At the same time, the virtual partner performs the interactive motions to make the user feel more immersed in the virtual environment. At the top of the screen, as shown in Figure 8(b), several messages are displayed. The message "interactive move: 4 (25%)" means that the current input motion is recognized as motion class 4 and that the user has completed 25% of this move. For each run, we allow the user to dance for 20 seconds, and the remaining time is shown on the screen. When a meaningful dance move is detected, one mark is awarded to the user.
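The open-end alignment used in Section 3.2.4 can be sketched as follows. This is a minimal OE-DTW over 1-D sequences with absolute-difference cost, assuming the standard DTW recurrence; real use would operate on joint-rotation frame vectors, and the function name `oe_dtw` is ours.

```python
def oe_dtw(query, ref):
    """Open-end DTW: align a (possibly incomplete) query against a
    complete reference; return (best_distance, matched_ref_length).

    The full DTW table is computed, but instead of reading only the
    final cell, the minimum over the last row (the whole query against
    every prefix of the reference) gives the open-end distance.
    """
    n, m = len(query), len(ref)
    INF = float("inf")
    # dtw[i][j] = cost of aligning query[:i] with ref[:j]
    dtw = [[INF] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - ref[j - 1])
            dtw[i][j] = cost + min(dtw[i - 1][j],      # insertion
                                   dtw[i][j - 1],      # deletion
                                   dtw[i - 1][j - 1])  # match
    # Open end: best alignment of the whole query to ANY prefix of ref.
    best_j = min(range(1, m + 1), key=lambda j: dtw[n][j])
    return dtw[n][best_j], best_j

dist_val, matched = oe_dtw([1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0, 5.0])
print(dist_val, matched)  # 0.0 3 — the query matches the first 3 ref frames
```

The returned `matched_ref_length` is what yields the percent-completion figure shown to the user (e.g., "25%" when the query aligns to the first quarter of the template).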
5. PERFORMANCE EVALUATION
Figure 8. (a) Interactive dance game in training mode, where the right avatar dances the input moves and the left avatar dances the interactive moves. (b) A snapshot of the freestyle mode. The inset at the bottom right shows the real-time dancing of a real user.
provided. The subjects were required to give their marks on a five-level Likert scale (1 means strongly disagree and 5 means strongly agree) for each of the questions. The last column of Table 2 shows the average mark as well as the statistical significance of each question. Since a high mark represents positive feedback and vice versa, we can see that our system impressed the subjects greatly. The marks for questions 3, 4, and 5 support that the live dances are accurately recognized, and those for questions 6, 7, and 8 suggest that the system is well designed and interacts nicely with users. The marks for questions 1 and 2, on the whole, indicate that the system achieves good performance and impresses users.

6. CONCLUSIONS AND FUTURE WORK

This paper presented an interactive dancing game based on mocap technology. We proposed a novel approach to handle the real-time recognition of the user's dance motion based on human body partition indexing. The matching was flexible in identifying the end of a move and detecting unwanted motion. Experiments showed that our proposed method has good performance on both isolated and continuous motion recognition, and positive feedback was obtained from the subjects in the user study. However, our classifier is trained with the trials of all motion classes as a whole; hence, if a new motion class is added to the system, the whole training procedure needs to be redone.

As future work, we will consider introducing more kinds of dances into the system to increase its complexity and variety. Also, since different users have different dance styles and progress at different rates when using the system for learning, personalizing the system is a promising direction. We may combine this work with our prior work [5] to produce more meaningful feedback to the user regarding his/her dance performance.

ACKNOWLEDGEMENTS

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 1165/09E).

REFERENCES

1. Chai J, Hodgins JK. Performance animation from low-dimensional control signals. ACM Transactions on Graphics 2005; 24(3): 686–696.
2. Liang X, Li Q, Zhang X, Zhang S, Geng W. Performance driven motion choreographing with accelerometers. Computer Animation and Virtual Worlds 2009; 20: 89–99.
3. Shin HJ, Lee J, Shin SY, Gleicher M. Computer puppetry: an importance-based approach. ACM Transactions on Graphics 2001; 20(2): 67–94.
4. Magnenat-Thalmann N, Protopsaltou D, Kavakli E. Learning how to dance using a web 3D platform. In ICWL '07: Lecture Notes in Computer Science, Vol. 4823, 2008; 1–12.
5. Chan JCP, Leung H, Tang JKT, Komura T. A virtual reality dance training system using motion capture technology. IEEE Transactions on Learning Technologies, 17 Aug. 2010. <http://doi.ieeecomputersociety.org/10.1109/TLT.2010.27>
6. Kovar L, Gleicher M, Pighin F. Motion graphs. ACM Transactions on Graphics 2002; 21(3): 473–482.
7. Lee J, Lee KH. Precomputing avatar behavior from human motion data. In SCA '04: Proceedings of the 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2004; 79–87.
8. Shiratori T, Nakazawa A, Ikeuchi K. Dancing-to-music character animation. Computer Graphics Forum 2006; 25(3): 449–458.
9. Kim JW, Fouad H, Sibert JL, Hahn JK. Perceptually motivated automatic dance motion generation for music. Computer Animation and Virtual Worlds 2009; 20(2–3): 184–375.
10. Alankus G, Bayazit AA, Bayazit OB. Automated motion synthesis for dancing characters. Computer Animation and Virtual Worlds 2005; 16(3–4): 259–271.
11. Lee J, Chai J, Reitsma PSA, Hodgins JK, Pollard NS. Interactive control of avatars animated with human motion data. ACM Transactions on Graphics 2002; 21(3): 491–500.
12. Li C, Zheng SQ, Prabhakaran B. Segmentation and recognition of motion streams by similarity search. ACM Transactions on Multimedia Computing, Communications, and Applications 2007; 3(3): Article 16.
13. Tormene P, Giorgino T, Quaglini S, Stefanelli M. Matching incomplete time series with dynamic time warping: an algorithm and an application to post-stroke rehabilitation. Artificial Intelligence in Medicine 2009; 45(1): 11–34.
14. Mori A, Uchida S, Kurazume R, Taniguchi R, Hasegawa T, Sakoe H. Early recognition and prediction of gestures. In ICPR '06: Proceedings of the International Conference on Pattern Recognition, 2006; 560–563.
15. Deng LQ, Leung H, Gu NJ, Yang Y. Automated recognition of sequential patterns in captured motion streams. In WAIM '10: Lecture Notes in Computer Science, Vol. 6184, 2010; 250–261.
16. Chiu C, Chao S, Wu M, Yang S, Lin H. Content-based retrieval for human motion data. Journal of Visual Communication and Image Representation 2004; 15: 446–466.
17. Wu S, Wang Z, Xia S. Indexing and retrieval of human motion data by a hierarchical tree. In VRST '09: Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology, 2009; 207–214.