1 Introduction
Human gait is defined as the manner in which a person walks. Recent studies have shown that gait can be seen as a biometric characteristic and used as a signature to recognize people. Its great advantage is that it can be captured at a distance, even surreptitiously. However, the effectiveness of gait recognition methods strongly depends on many factors, such as camera view, carried accessories, a person's clothes, or the walking surface.
Gait recognition methods can be divided into two major categories: model-based and appearance-based approaches. Appearance-based methods [4,6,10] generally characterize the whole motion pattern of the human body by a compact representation, regardless of the underlying structure. They usually combine human silhouettes extracted from each video frame into a single gait image that preserves temporal information. Nevertheless, recognition based on the comparison of such gait images is restricted to a single viewpoint. To use gait features in unconstrained views, we need to adopt the model-based concept in order to estimate 3D models of walking persons.
The model-based concept fits various kinds of stick figures onto the walking human. The recovered stick structure allows accurate measurements to be performed, independently of the camera view. BenAbdelkader et al. [1] computed the average stride length and cadence of the feet and used just these two numbers for gait recognition. Tanawongsuwan and Bobick [7] compared joint-angle trajectories of hips, knees, and feet by the dynamic time warping (DTW) similarity function, with normalization for noise reduction. Cunado et al. [5] used a pendulum model
G. Bebis et al. (Eds.): ISVC 2012, Part II, LNCS 7432, pp. 11–20, 2012.
c Springer-Verlag Berlin Heidelberg 2012
12 J. Sedmidubsky et al.
in which the thigh's motion and rotation were analyzed using a Fourier transformation. The approach of Wang et al. [9] measured a mean shape of silhouettes obtained by Procrustes analysis and combined it with the absolute positions of the angles of specific joints. Yoo et al. [11] compared sequences of 2D stick figures by a back-propagation neural network algorithm. Recent advances in gait recognition have been surveyed in [3].
We adopt the model-based concept to recover a 3D stick figure of the human body by capturing the spatial coordinates of significant anatomical landmarks, such as hands, hips, knees, or feet. The recovered stick figure is used to compute distance-time dependency signals that express how the distance between two specific joints of the human body changes in time. The collection of such signals defines a gait pattern of a person's walk (Section 2). In Section 3, a novel similarity function for comparing gait patterns is introduced. To compare gait patterns effectively, we normalize them to encapsulate signals corresponding exclusively to a single walk cycle (Section 4). In Section 5, the influence of normalization and the effectiveness of the similarity function are thoroughly evaluated on a real-life 3D motion database. We differ from existing approaches by also taking the movements of arms into account and by comparing gait patterns on the basis of normalized walk cycles.
The main contributions of this paper are: (1) the proposal of a gait pattern that encapsulates information about a person's walk in the form of viewpoint-invariant distance-time dependency signals, (2) the introduction of a novel similarity function for comparing gait patterns that takes the movements of both legs and arms into account, and (3) an experimental evaluation of the recognition rate of the proposed function and its modifications, based also on diverse normalization methods.
2 Gait Representation
We introduce a structural model of a human body. This model is used for the extraction of viewpoint-invariant planar signals. The collection of such planar signals forms a gait pattern of a person's walk. Recognition of persons is based on comparing their gait patterns by a sophisticated similarity function.
M = (C_L, C_R, E_L, E_R, H_L, H_R, L_L, L_R, K_L, K_R, F_L, F_R) ,

[Figure 1: anatomical landmarks of the structural model and their motion trajectories in (x, y, z) space.]

Each landmark is captured at every video frame, so the size of the frame domain F corresponds to the length of the input video in terms of the number of frames, i.e., to the number of times a specific body point has been captured.
A collection of consecutive points represents a motion trajectory (see Figure 1). Formally, each point P moving in time, as the person walks, constitutes a discrete trajectory T_P, defined as:

T_P = {P_f | f ∈ F} .
The discrete domain F allows us to utilize metric functions for point-by-point comparison of trajectories. Trajectories cannot be used directly for recognition because the values of their spatial coordinates depend on the calibration of the system that detects and estimates the particular coordinates. Moreover, persons do not walk in the same direction, which makes trajectories of different walks (even of the same person) incomparable. Instead, we compute distances between selected pairs of trajectories to construct distance-time dependency signals (DTDSs). Such signals are independent of both the walk direction and the system calibration.
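To make the construction concrete, here is a minimal sketch (not the authors' implementation) that derives a DTDS from two landmark trajectories; using the per-frame Euclidean point distance is our assumption, as is the toy data:

```python
import math

def dtds(traj_a, traj_b):
    """Distance-time dependency signal: the distance between two
    landmark trajectories at every common video frame."""
    n = min(len(traj_a), len(traj_b))
    return [math.dist(traj_a[f], traj_b[f]) for f in range(n)]

# Toy trajectories of two joints over 3 frames, one (x, y, z) point per frame.
left_knee  = [(0.0, 1.0, 0.0), (0.1, 1.0, 0.2), (0.2, 1.0, 0.4)]
right_foot = [(0.0, 0.0, 0.3), (0.1, 0.0, 0.1), (0.2, 0.0, 0.0)]

signal = dtds(left_knee, right_foot)
print(signal)  # one distance value per frame
```

Because only relative joint distances enter the signal, translating or rotating both trajectories together leaves it unchanged, which is exactly the viewpoint invariance claimed above.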
Φ(S, S') = Σ_{f ∈ F ∩ F'} |d_f − d'_f| .   (1)

This function, also known as the L1 or Manhattan distance function, sums the point-by-point differences between two specific DTDSs. In case the domains F and F' are not the same, similarity is computed over their common frames only. The function returns 0 if the signals are identical, and with increasing distance their similarity decreases.
We introduce a novel similarity function D for comparing gait patterns G and G'. This function is based on the aggregation of four Φ functions and is formally defined as:

D(G, G') = Φ(S_{L_L F_L}, S'_{L_L F_L}) + Φ(S_{L_R F_R}, S'_{L_R F_R}) + Φ(S_{C_L H_L}, S'_{C_L H_L}) + Φ(S_{C_R H_R}, S'_{C_R H_R}) .   (2)
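A minimal sketch of Equations (1) and (2), assuming gait patterns are stored as dictionaries of DTDS lists (the key names are our own):

```python
def phi(s, s_prime):
    """Eq. (1): L1 (Manhattan) distance summed over the common frame domain."""
    common = min(len(s), len(s_prime))
    return sum(abs(s[f] - s_prime[f]) for f in range(common))

# The four DTDSs aggregated by Eq. (2): two leg signals and two arm signals.
KEYS = ("S_LL_FL", "S_LR_FR", "S_CL_HL", "S_CR_HR")

def d(g, g_prime):
    """Eq. (2): similarity of two gait patterns as the sum of four phi terms."""
    return sum(phi(g[k], g_prime[k]) for k in KEYS)

g1 = {k: [1.0, 2.0, 3.0] for k in KEYS}
g2 = {k: [1.0, 2.5, 3.0] for k in KEYS}
print(d(g1, g2))  # each of the four signals differs by 0.5, so 4 * 0.5 = 2.0
```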
Gait Recognition Based on Normalized Walk Cycles 15
Fig. 2. Normalization of two feet DTDSs with a different number of footsteps (each hill represents a single footstep). Figure (a) represents these signals without any normalization. Figure (b) denotes the identified minima of each signal. Figure (c) constitutes just the first walk cycle of the signals, which starts with the move of the left foot ahead. Figure (d) shows the extracted walk cycles after linear transformation to 150 frames.
we pick the video frames m1, m2, m3, m4 where the first four minima were identified. The pairs of adjacent minima determine individual footsteps, alternately with the left or the right foot in front. The requested walk cycle is formed by the first two footsteps, so each S ∈ G is cropped according to the m1-th and m3-th video frame. The cropped signals are linearly transformed to the standardized length of 150 video frames.
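The steps above – locate the minima of the feet signal, crop each DTDS to the first two footsteps, and stretch the result to 150 frames – can be sketched as follows. This is a simplified assumption: minima detection on real, noisy signals would need smoothing first.

```python
def local_minima(signal):
    """Indices of strict local minima of a 1-D signal."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]

def resample(signal, length=150):
    """Linearly transform a signal to a standardized length."""
    n = len(signal)
    out = []
    for i in range(length):
        t = i * (n - 1) / (length - 1)   # fractional source index
        lo = int(t)
        hi = min(lo + 1, n - 1)
        out.append(signal[lo] + (t - lo) * (signal[hi] - signal[lo]))
    return out

def footstep_normalize(gait_pattern, feet_signal):
    """Crop every DTDS to the first walk cycle (minima m1..m3 of the
    feet signal) and stretch it to 150 frames."""
    m = local_minima(feet_signal)
    m1, m3 = m[0], m[2]                  # first walk cycle = first two footsteps
    return {name: resample(s[m1:m3 + 1])
            for name, s in gait_pattern.items()}

feet = [3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0]   # toy signal with three minima
normalized = footstep_normalize({"feet": feet}, feet)
print(len(normalized["feet"]))  # 150
```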
The FN approach extracts the walk cycle regardless of whether the first footstep belonged to the left or the right foot. However, a characteristic of some DTDSs depends on the leg which undertook a given footstep – such DTDSs are periodic on the level of walk cycles. Moreover, human walking might not be balanced, e.g., due to an injury, which results in a different characteristic of the feet signal for the left and right foot. The walk cycle normalization WN solves this problem by extracting a single walk cycle that always starts with the move of the left foot ahead – the footstep of the left leg and the consecutive footstep of the right leg. To identify the first footstep of the left leg, we analyze the signal S_{K_L F_R} = {d_f | f ∈ F} that constitutes the changing distance between the left knee and the right foot. When the feet are passing each other, this signal achieves a higher value if the left foot is moving ahead, in comparison with the opposite situation when the right foot is moving ahead. In this way, if the condition d_{m1} < d_{m2} is met, we crop each signal S ∈ G according to the m1-th and m3-th video frame (m1 and m3 are the frames where the first and third minima of the feet signal were found). Otherwise, signals are cropped according to the m2-th and m4-th video frame. Both extracted footsteps form the requested walk cycle with the first footstep undertaken by the left leg. Similar to FN, the requested walk cycle is finally transformed to the length of 150 video frames. The whole normalization process is depicted in Figure 2 and described in [8] in more detail.
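The left-foot-first test described above can be sketched as a small helper; the minima of the feet signal and the S_{K_L F_R} signal are assumed to be precomputed elsewhere, and the toy data is our own:

```python
def pick_left_first_cycle(minima, knee_foot_signal):
    """Given the first four minima m1..m4 of the feet signal, return the
    (start, end) frames of the walk cycle whose first footstep is taken
    by the left leg, using the d_m1 < d_m2 test on S_{K_L F_R}."""
    m1, m2, m3, m4 = minima[:4]
    if knee_foot_signal[m1] < knee_foot_signal[m2]:
        return m1, m3            # the cycle m1..m3 already starts with the left leg
    return m2, m4                # otherwise shift the cycle by one footstep

# Toy data: minima of the feet signal at frames 10, 40, 70, 100.
knee_foot = [0.3] * 110
knee_foot[10], knee_foot[40] = 0.2, 0.5   # d_m1 < d_m2 holds
print(pick_left_first_cycle([10, 40, 70, 100], knee_foot))  # (10, 70)
```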
5 Experimental Evaluation
We evaluate the effectiveness of the proposed similarity function for gait recognition and compare it with other functions and different normalization approaches. Firstly, we describe the motion-capture database used. Secondly, the methodology for evaluating the experimental trials is presented. Thirdly, the effectiveness of the examined similarity functions and the influence of the diverse normalization processes are reported.
5.1 Database
We utilized the Motion Capture Database (MoCap DB)^1 from the CMU Graphics Lab as the primary data source of trajectories of walking humans. This database contains motion sequences of different kinds of movements (e.g., dance, walk, box) for 144 recorded persons. We performed experiments on the subset of motion sequences that corresponded to common walking. We took all 131 walking sequences belonging to 24 recorded persons; each person had at least two different sequences. Walking sequences are the only ones that could meaningfully be used for gait recognition.
We implemented specialized software to extract gait patterns from the 131 walking sequences. In particular, we extracted the trajectories of all landmarks P ∈ M (see Section 2) for each walking sequence. The obtained trajectories were employed to compute the DTDSs specified in Table 1. These DTDSs were normalized and used to construct a gait pattern for each walking sequence.
5.2 Methodology
We concentrated on verifying the effectiveness of our approach by evaluating nearest-neighbor queries. To be maximally fair, we constructed one query for each person – the query object for each query was randomly chosen from the gait patterns belonging to the given person. Thus, 24 queries were constructed and evaluated against a database of all 131 gait patterns. The nearest found neighbor was always the same as the query gait pattern (i.e., the exact match), so it was omitted and the next closest neighbor was analyzed. If the gait pattern of the analyzed neighbor belonged to the same person as the query pattern, the search was successful because the correct person was identified. The search could always be successful, since at least two different gait patterns were available in the database for each person. Effectiveness – the recognition rate – was stated as the ratio between the number of correctly identified persons and the number of all persons (i.e., the number of successful queries divided by 24). Since we do not define any recognition threshold, it is not possible to calculate false positives; this is part of our future work.
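The evaluation protocol above can be sketched as follows; the pattern and label containers and the distance callback are our own framing of it. Excluding the query index is equivalent to omitting the exact match and taking the next closest neighbor.

```python
import random

def recognition_rate(patterns, labels, dist):
    """One nearest-neighbor query per person: pick a random gait pattern
    of that person as the query, exclude the exact match, and check that
    the closest remaining pattern belongs to the same person."""
    persons = sorted(set(labels))
    correct = 0
    for person in persons:
        q = random.choice([i for i, l in enumerate(labels) if l == person])
        nn = min((i for i in range(len(patterns)) if i != q),
                 key=lambda i: dist(patterns[i], patterns[q]))
        correct += (labels[nn] == person)
    return correct / len(persons)

# Toy 1-D "patterns": two persons, two well-separated recordings each.
rate = recognition_rate([0.0, 0.1, 5.0, 5.1], ["a", "a", "b", "b"],
                        dist=lambda x, y: abs(x - y))
print(rate)  # 1.0 -- with this toy data the random query choice cannot matter
```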
^1 http://mocap.cs.cmu.edu
5.3 Results
The results were studied in depth for diverse similarity functions and the three normalization approaches presented in Section 4: (1) Simple Normalization (SN), (2) Footstep Normalization (FN), and (3) Walk Cycle Normalization (WN). We expect the SN normalization to achieve the worst recognition rate, since it does not take individual footsteps into account. The WN normalization should be more effective than FN because it additionally distinguishes between the left and right foot.
The normalized signals served as input for the computation of the similarity of gait patterns. We also evaluated three different types of similarity by changing the Φ function in Equation 1. In addition to the original Manhattan distance (L1), the Euclidean distance (L2) and the dynamic time warping approach (DTW) [2] were used to measure the similarity of two DTDSs.
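For reference, a textbook dynamic programming formulation of DTW over two DTDSs (a generic sketch under standard assumptions, not necessarily the exact variant of [2]):

```python
def dtw(s, t):
    """Dynamic time warping distance between two 1-D signals,
    computed in O(len(s) * len(t)) time and space."""
    inf = float("inf")
    n, m = len(s), len(t)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch s
                                 cost[i][j - 1],      # stretch t
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Warping absorbs the repeated sample, so the distance is still zero:
print(dtw([1.0, 2.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
```

Unlike the L1 and L2 functions, DTW tolerates local timing differences between two signals, which is why it pairs well with walk cycles whose phases are only roughly aligned.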
We also modified Equation 2 to evaluate the suitability of different DTDSs for gait recognition. Firstly, we simply modified the function D to recognize persons based purely on a single DTDS, i.e., D(G, G') = Φ(S, S'). This approach is denoted as single-DTDS recognition. Secondly, we modified the function D to combine several DTDSs with the same weight (e.g., the original setting with four DTDSs in Equation 2). The use of several DTDSs is referred to as multi-DTDS recognition.
Single-DTDS recognition rates:

                 |       L1       |       L2       |      DTW
Examined DTDS    |  WN   FN   SN  |  WN   FN   SN  |  WN   FN   SN
S_{L_L F_R}      | 0.77 0.54 0.27 | 0.75 0.52 0.21 | 0.67 0.58 0.44
S_{L_L F_L}      | 0.69 0.60 0.33 | 0.63 0.56 0.21 | 0.75 0.56 0.54
S_{L_R F_R}      | 0.71 0.58 0.25 | 0.67 0.50 0.25 | 0.73 0.67 0.56
S_{C_L H_L}      | 0.73 0.63 0.40 | 0.73 0.63 0.40 | 0.58 0.58 0.40
S_{L_R F_L}      | 0.56 0.52 0.23 | 0.56 0.54 0.25 | 0.69 0.48 0.38
S_{C_R H_R}      | 0.67 0.60 0.40 | 0.65 0.60 0.35 | 0.60 0.52 0.54

Multi-DTDS recognition rates:

Combination of examined DTDSs                         |       L1       |       L2       |      DTW
                                                      |  WN   FN   SN  |  WN   FN   SN  |  WN   FN   SN
S_{C_L H_L} + S_{C_R H_R}                             | 0.94 0.71 0.44 | 0.88 0.67 0.38 | 0.96 0.73 0.69
S_{L_L F_L} + S_{L_R F_R}                             | 0.83 0.65 0.40 | 0.81 0.63 0.35 | 0.92 0.67 0.60
S_{L_L F_L} + S_{C_L H_L}                             | 0.81 0.81 0.54 | 0.85 0.79 0.56 | 0.83 0.75 0.67
S_{L_L F_L} + S_{C_L H_L} + S_{L_R F_R} + S_{C_R H_R} | 0.71 0.67 0.29 | 0.69 0.65 0.27 | 0.83 0.67 0.50
6 Conclusions
We investigated the problem of gait recognition based on processing trajectories of human walking. Trajectories were used to extract distance-time dependency signals to ensure viewpoint-invariant recognition. These signals were normalized in the form of walk cycles that were compared by a specialized similarity method. The results were evaluated on a real-life database and compared across diverse similarity functions and normalization approaches. The combination of signals expressing the manner of movement of the left and right arm, along with the walk-cycle normalization and DTW-like comparison, led to 96% recognition effectiveness. We demonstrated that the normalization process and the movement of arms are important characteristics to be considered for gait recognition.
We are aware of the fact that the database of 131 walking sequences is too small, so we plan to build a bigger 3D database of motion trajectories acquired by the Kinect^2 equipment. In the future, we also plan to improve the normalization approach along with the similarity function, so that gait patterns could be composed of more than a single walk cycle for more effective recognition.
References
1. BenAbdelkader, C., Cutler, R., Davis, L.: Stride and cadence as a biometric in
automatic person identification and verification. In: 5th International Conference
on Automatic Face Gesture Recognition, pp. 372–377. IEEE (2002)
2. Berndt, D.J., Clifford, J.: Finding patterns in time series: a dynamic programming
approach. In: Advances in Knowledge Discovery and Data Mining, pp. 229–248.
American Association for Artificial Intelligence, Menlo Park (1996)
3. Bhanu, B., Han, J.: Human Recognition at a Distance in Video. In: Advances in
Computer Vision and Pattern Recognition. Springer (2010)
4. Chen, C., Liang, J., Zhao, H., Hu, H., Tian, J.: Frame difference energy image for gait recognition with incomplete silhouettes. Pattern Recognition Letters 30(11), 977–984 (2009)
5. Cunado, D., Nixon, M.S., Carter, J.N.: Automatic extraction and description of human gait models for recognition purposes. Computer Vision and Image Understanding 90(1), 1–41 (2003)
6. Han, J., Bhanu, B.: Individual recognition using gait energy image. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence 28(2), 316–322 (2006)
7. Tanawongsuwan, R., Bobick, A.F.: Gait recognition from time-normalized joint-
angle trajectories in the walking plane. In: International Conference on Computer
Vision and Pattern Recognition (CVPR 2001), vol. 2(C), II–726–II–731 (2001)
8. Valcik, J., Sedmidubsky, J., Balazia, M., Zezula, P.: Identifying Walk Cycles for
Human Recognition. In: Chau, M., Wang, G.A., Yue, W.T., Chen, H. (eds.) PAISI
2012. LNCS, vol. 7299, pp. 127–135. Springer, Heidelberg (2012)
9. Wang, L., Ning, H., Tan, T., Hu, W.: Fusion of static and dynamic body biomet-
rics for gait recognition. IEEE Transactions on Circuits and Systems for Video
Technology 14(2), 149–158 (2004)
10. Xue, Z., Ming, D., Song, W., Wan, B., Jin, S.: Infrared gait recognition based
on wavelet transform and support vector machine. Pattern Recognition 43(8),
2904–2910 (2010)
11. Yoo, J.H., Hwang, D., Moon, K.Y., Nixon, M.S.: Automated human recognition
by gait using neural network. In: Workshops on Image Processing Theory, Tools
and Applications, pp. 1–6. IEEE (2008)
^2 http://www.xbox.com/kinect