Review
a Centre for Clinical Research Excellence in Clinical Gait Analysis and Gait Rehabilitation, Murdoch Childrens Research Institute, Royal Childrens Hospital, Melbourne, Australia
b Hugh Williamson Gait Analysis Service, Royal Childrens Hospital, Melbourne, Australia
c Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
d School of Physiotherapy, The University of Melbourne, Melbourne, Australia
ARTICLE INFO
Article history:
Received 5 March 2008
Received in revised form 5 September 2008
Accepted 5 September 2008
ABSTRACT
Background/Aim: Three-dimensional kinematic measures of gait are routinely used in clinical gait analysis
and provide a key outcome measure for gait research and clinical practice. This systematic review
identifies and evaluates current evidence for the inter-session and inter-assessor reliability of three-dimensional kinematic gait analysis (3DGA) data.
Method: A targeted search strategy identified reports that fulfilled the search criteria. Full-text reports were tabulated and their quality was evaluated using a customised critical appraisal tool.
Results: Fifteen full manuscripts and eight abstracts were included. Studies addressed both within-assessor and between-assessor reliability, with most examining healthy adults. Four full-text reports
evaluated reliability in people with gait pathologies. The highest reliability indices occurred in the hip and
knee in the sagittal plane, with lowest errors in pelvic rotation and obliquity and hip abduction. Lowest
reliability and highest error frequently occurred in the hip and knee transverse plane. Methodological
quality varied, with key limitations in sample descriptions and strategies for statistical analysis. Reported
reliability indices and error magnitudes varied across gait variables and studies. Most studies providing
estimates of data error reported values (S.D. or S.E.) of less than 5°, with the exception of hip and knee
rotation.
Conclusion: This review provides evidence that clinically acceptable errors are possible in gait analysis.
Variability between studies, however, suggests that they are not always achieved.
© 2008 Elsevier B.V. All rights reserved.
Keywords:
Gait
Gait analysis
Reliability
Reproducibility
Measurement error
Contents
1. Introduction
2. Method
2.1. Study identification and selection
2.2. Data extraction and quality appraisal
3. Results
3.1. Sample selection, composition and description
3.2. Study procedures
3.3. Statistical analysis
3.4. Reliability findings: overview
4. Discussion
4.1. Methodological considerations: participant and assessor samples
4.2. Methodological considerations: study design and procedures
4.3. Methodological considerations: statistical analysis
* Corresponding author at: Gait CCRE, Murdoch Childrens Research Institute, Hugh Williamson Gait Laboratory, Royal Childrens Hospital, Flemington Rd Parkville, Victoria
3052, Australia. Tel.: +61 3 9345 5354; fax: +61 3 9345 5447.
E-mail address: jennifer.mcginley@mcri.edu.au (J.L. McGinley).
0966-6362/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.gaitpost.2008.09.003
1. Introduction
2. Method
2.1. Study identification and selection
The search strategy for this review began with retrieval of published reports
indexed in health or biomechanics-related electronic databases: MEDLINE
(1970 to July 2007), EMBASE (1980 to July 2007), CINAHL (1982 to July 2007), RECAL
Bibliographic Database (pre-1990 to July 2007) and Inspec (1970 to July 2007). The
search was limited to literature reporting studies of human subjects with abstracts
written in English. The search terms were customised to each database and
included the following keywords: gait, gait disorders, gait analysis, observer
variation, reproducibility of results, and reliability. Bibliographies of identified
papers and relevant conference proceedings were hand searched.
The review was conducted to be of primary relevance to gait laboratories
collecting typical multi-joint lower body gait kinematic data. The titles and
abstracts identified by the initial search strategy were screened by the first named
author (JM) to identify potentially eligible reports and retrieve full-text reports.
When the title or abstract did not clearly indicate whether an article should be
included, the complete article was obtained and reviewed. Full-text reports
were then evaluated by two authors (JM and RB) for the following inclusion criteria:
(1) reports of the inter-session or inter-assessor reliability of three-dimensional
kinematic gait or running measures of human participants; (2) including at least
three joints of the lower body (pelvis, hips, knees, ankles); (3) reporting numerical
findings from repeated kinematic data capture from more than one measurement
occasion (with markers replaced each occasion); (4) full papers or abstracts (not
later published as full papers); (5) published with an English abstract.
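The five inclusion criteria above can be read as a single screening rule. The following is a minimal sketch of that rule, not the procedure the reviewers actually used: the `Report` record, its field names and the `meets_inclusion_criteria` function are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

LOWER_BODY_JOINTS = {"pelvis", "hip", "knee", "ankle"}
ELIGIBLE_RELIABILITY = {"inter-session", "inter-assessor"}


@dataclass
class Report:
    """Hypothetical record of one screened report (illustrative fields only)."""
    reliability_types: set           # e.g. {"inter-session"}
    three_dimensional: bool          # 3D kinematic gait or running measures
    joints: set                      # lower-body joints reported
    sessions: int                    # number of measurement occasions
    markers_replaced: bool           # markers replaced on each occasion
    numerical_findings: bool         # numerical reliability findings reported
    abstract_later_published: bool   # abstract superseded by a full paper
    english_abstract: bool


def meets_inclusion_criteria(r: Report) -> bool:
    """Return True only if all five criteria listed in Section 2.1 hold."""
    return (
        bool(r.reliability_types & ELIGIBLE_RELIABILITY)      # criterion 1
        and r.three_dimensional                               # criterion 1
        and len(r.joints & LOWER_BODY_JOINTS) >= 3            # criterion 2
        and r.numerical_findings                              # criterion 3
        and r.sessions > 1 and r.markers_replaced             # criterion 3
        and not r.abstract_later_published                    # criterion 4
        and r.english_abstract                                # criterion 5
    )


example = Report({"inter-session"}, True, {"pelvis", "hip", "knee", "ankle"},
                 2, True, True, False, True)
print(meets_inclusion_criteria(example))
```

Writing the criteria as one conjunction makes explicit that a report failing any single criterion, for example one covering only two joints, is excluded.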
2.2. Data extraction and quality appraisal
Reports were retained as either full-text reports or published abstracts. A
standardised data extraction and appraisal form was constructed to identify and
detail key features of each study. Two reviewers (JM and RB) initially independently
piloted the form with a small subset of representative studies to confirm the
content and to assess the reliability. The extracted study details focused on
participant characteristics and recruitment, study procedures and biomechanical
models, and the statistical analysis techniques.
The quality of study design and conduct are key elements in evaluating scientific
evidence, with contemporary systematic reviews providing study quality
appraisals in addition to quantitative reviews. Although a large body of literature
exists to provide guidelines for the systematic evaluation of research methodology
[10,11], the majority are focussed primarily upon studies of healthcare interventions, in particular randomized controlled trials. As no standardised or established
guidelines were located for reviews of reliability, a customised quality appraisal
form was developed. The appraisal component was developed to integrate relevant
examples of methodological quality criteria from other systematic reviews of
reliability [12–14], and gait classification [15]. Relevant quality themes and
principles were also adapted from quality criteria proposed for the measurement
properties of health status questionnaires [16], and the QUADAS tool used to
appraise studies of diagnostic accuracy [17]. Additionally, an initial expert panel
was formed to consider and define the data extraction and appraisal criteria for the
study. Quality appraisal indicators were developed into a standardised form to
ensure a structured approach to evaluation of key quality elements and to ensure
equal appraisal of all papers. Appraisal items were not scored as the validity of such
scoring systems is currently unproven [18]. The appraisal criteria included themes
related to external validity such as sampling methods and description, standardisation and description of procedures, and selection of statistical analysis techniques.
Appraisal criteria were not applied to the abstract-only reports because their
brevity limited the provision of methodological detail.
The data extraction and appraisal form was used independently by two
reviewers (JM and RB) to extract key details from each report and to evaluate the
quality of each full-text paper. Any rating disagreements on quality criteria were
checked against the original article to ascertain the correct rating according to a
pre-defined procedure, in accordance with established and recommended protocols
[15,18].
3. Results
The electronic searches and hand-search of references and
selected conference proceedings yielded a total of 510 articles.
Following the application of the inclusion/exclusion criteria, 23
reflect the heterogeneity of the sample, and allow insight into the
generalisability of the findings to other populations. The number of
gait participants varied widely across full-text studies from 1 [27]
to 40 [23,28], with 10 reports including more than 10 participants.
Justification for the sample size of gait participants was not
provided in any study.
The sampling method used to recruit assessors or the related
inclusion and exclusion criteria were not reported in any of the
full-text reports. Descriptions of the assessors were generally
poor, with only two studies reporting the desired complete
details including the number of assessors, professional background and experience or training [29,30]. Physiotherapists were
the most often reported assessor group, with six of the reports also
describing their assessors as either experienced or highly trained
[9,24,28–31]. The number of assessors was frequently small,
between one and five.
Table 1
Characteristics of the identified studies of the reliability of 3DGA data.
Columns: Study | Biomechanical model | Participant characteristics (n, age (years), type, gender) | Assessor characteristics (n, profession) | Statistical analysis
n = 5, Discipline: NS
n = 3, Discipline: PT
n = 2, Discipline: PT
PiG
VCM
VCM
Vicon and Motion Analysis
Corporation software
Vicon and Motion Analysis
Corporation software
21 marker model similar to Kadaba model with ANALYZE software.
Conventional gait model
n = 1, Discipline:
therapist
n = 1, Discipline: NS
Unilateral CODA
Conventional biomechanical
model
VCM, CCM/Orthotrak
NS
VCM
VCM
VCM
n = 1, Age: NS
n = 1, Discipline = NS
n = 1 Age: 7, Healthy, F
n = 5, Discipline: PT
Inter-assessor, Interval: NS
n = NS, Discipline = NS
n = 1, Discipline: NS
n = 3, Discipline:
Clinician
n = 4 laboratories,
Discipline: NS
n = NS, Discipline: NS
n = 1, Discipline: NS
n = 4, Discipline: PT
n = 2, Discipline: PT
n = 1, Discipline:
Technician
n = 1 or 2, Discipline:
Physician & technician
n = 2, Discipline: NS
n = 24 (in 12 labs),
Discipline: clinicians
n = 24 (in 12 labs),
Discipline: clinicians
n = 1, Discipline: NS
(A), Abstract only; F, female; M, male; VCM, Vicon Clinical Manager; GC, gait cycle; S.D., standard deviation; SEM, standard error of measurement; W-Ass, within-assessor; ANOVA, analysis of variance; CODA, Cartesian
Optoelectric Dynamic Anthropometer; OLGA, optimised lower-limb gait analysis; LOA, limits of agreement; CCM, Cleveland Clinic Model; NS, not stated; PiG, Plug-in-Gait; PT, physiotherapist; CMC, coefficient of multiple
correlation; CMD, coefficient of multiple determination; CV%, coefficient of variation; ICC, intra-class correlation; CI, confidence interval; # data refers to Leardini et al. [29] Study 2 (inter-examiner).
Table 2
Methodological quality of the reviewed full-text studies.
Rows follow the order of the original table; the study labels are missing from this copy.
Row | Gait participants: sampling method | Gait participants: inclusion and exclusion criteria | Gait participants: description | Assessor participant description | Protocol standardisation and description | Model description | Data description | Statistical analysis
1 | Not stated | Not stated | Partial | Partial | Adequate | Adequate | Adequate | Adequate
2 | Not stated | Not stated | Inadequate | Partial | Limited | Adequate | Limited | Adequate
3 | Not stated | Not stated | Adequate | Inadequate | Adequate | Limited | Limited | Adequate
4 | Convenience | Stated | Partial | Partial | Adequate | Adequate | Limited | Limited
5 | Not stated | Not stated | Partial | Inadequate | Limited | Adequate | Adequate | Adequate
6 | Not stated | Limited | Partial | Partial | Limited | Adequate | Adequate | Limited
7 | Not stated | Not stated | Adequate | Adequate | Limited | Adequate | Adequate | Adequate
8 | Convenience | Stated | Adequate | Inadequate | Limited | Adequate | Adequate | Adequate
9 | Not stated | Stated | Partial | Inadequate | Adequate | Adequate | Adequate | Adequate
10 | Not stated | Stated | Adequate | Partial | Adequate | Adequate | Adequate | Adequate
11 | Convenience | Stated | Partial | Partial | Limited | Adequate | Adequate | Adequate
12 | Not stated | Not stated | Adequate | Partial | Limited | Adequate | Adequate | Adequate
13 | Convenience | Stated | Partial | Inadequate | Adequate | Limited | Adequate | Limited
14 | Not stated | Stated | Adequate | Adequate | Adequate | Adequate | Adequate | Limited
15 | Case consecutive | Stated | Adequate | Partial | Limited | Adequate | Adequate | Limited
Table 3
Summary of studies reporting within-assessor reliability of 3DGA data as coefficient of multiple correlation (CMC).
Kinematic variables (rows):
Sagittal: pelvic tilt, hip flexion, knee flexion, ankle dorsiflexion
Coronal: pelvic obliquity, hip abduction, knee varus/valgus
Transverse: pelvic rotation, hip rotation, knee rotation, foot progression
Studies (columns):
Besier et al. [31] (AL)a: healthy adults
Besier et al. [31] (FUN)a: healthy adults
Gorton et al. [19]: healthy children
Growney et al. [34]: healthy
Kadaba et al. [28]: healthy adults
Steinwender et al. [23]: healthy children
Steinwender et al. [23]: children with CP
Tsushima et al. [30]: healthy adult
Yavuzer et al. [24]: adults with stroke
.97
.96
.92
.93
.80
.62
.83
.98
.96
.93
.79
.99
.99
.96
.91
.64
.96
.99
.98
.85
.90
.74
.88
.74
.54
.55
.24
.98
.99
.93
.89
.89
.61
.72
.41
.49
.58
.32
.96
.96
.87
.75
.85
.49
.67
.59
.34
.37
.56
.96
.96
.83
.73
.76
.58
.71
.57
.41
.49
.38
.99
.99
.98
.98
.97
.79
.89
.82
.81
.82
.95
.89
.85
.85
.92
.82
.63
.87
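a Besier et al. CMCs derived from coefficient of multiple determination data, L side.
Median CMC: pelvic tilt .56; hip flexion .96; knee flexion .96; ankle dorsiflexion .93; pelvic obliquity .85; hip abduction .89; knee varus/valgus .74; pelvic rotation .72; hip rotation .62; knee rotation .54; foot progression .55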
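To make the CMC values in Table 3 easier to interpret, the sketch below computes a CMC for a set of repeated waveforms using one commonly cited formulation (the within-session form usually attributed to Kadaba et al.), with CMC taken as the square root of the coefficient of multiple determination. It is an illustration under stated assumptions: the reviewed studies may have used other variants, and the function name `cmc` and the synthetic sine-wave curves and offsets are hypothetical, not data from any study.

```python
import numpy as np


def cmc(waveforms) -> float:
    """CMC for an (n_repeats x n_time_points) array of joint-angle curves.

    CMD = 1 - [sum (y_nt - mean_t)^2 / (T (N - 1))] / [sum (y_nt - grand_mean)^2 / (N T - 1)]
    CMC = sqrt(CMD); returns NaN when CMD is negative.
    """
    y = np.asarray(waveforms, dtype=float)
    n, t = y.shape
    mean_curve = y.mean(axis=0)   # mean across repeats at each time point
    grand_mean = y.mean()         # mean over all repeats and time points
    within = ((y - mean_curve) ** 2).sum() / (t * (n - 1))
    total = ((y - grand_mean) ** 2).sum() / (n * t - 1)
    cmd = 1.0 - within / total
    return float(np.sqrt(cmd)) if cmd >= 0 else float("nan")


# Synthetic example: the same +/-2 degree offset between three sessions,
# applied to a large-range curve and to a small-range curve.
x = np.linspace(0.0, 1.0, 101)
offsets = np.array([-2.0, 0.0, 2.0])[:, None]
knee_like = 30.0 * np.sin(2 * np.pi * x) + offsets    # about 60 degrees of range
pelvis_like = 2.0 * np.sin(2 * np.pi * x) + offsets   # about 4 degrees of range

print(round(cmc(knee_like), 2), round(cmc(pelvis_like), 2))
```

With identical session-to-session offsets, the small-range curve yields a much lower CMC than the large-range curve. This is consistent with the low median CMC for pelvic tilt in Table 3, and it is one reason error estimates in degrees are easier to compare across variables than CMC values.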
also more variable than older children [19]. Children with CP were
more variable for some kinematic gait variables than healthy
children [23]. Measures of gait data reliability are intrinsically
related to the variability within the studied group [54], with
measurements widely considered to be population-specific [55].
Whether estimates of error can be reasonably generalised across
clinical populations should be carefully considered, in the context
of the characteristics of the specific pathology, and the associated
impairment and gait dysfunction characteristics. Furthermore,
although different gait disorders may be associated with variable
levels of intrinsic gait repeatability, it is not clear whether the
nature of the gait disorder has any direct effect on procedural
sources of error such as marker placement. It is likely that such
errors may be related to patient-specific factors such as cognition,
compliance and cooperation which may or may not be related to
the gait disorder.
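The dependence of reliability indices on variability within the studied group can be illustrated with a short worked example. The sketch assumes the simplest variance-ratio form of the ICC (between-participant variance divided by between-participant plus error variance); the function name `icc_from_sds` and the standard deviations used are illustrative values, not figures from the reviewed studies.

```python
def icc_from_sds(sd_between: float, sd_error: float) -> float:
    """ICC as between-participant variance over total variance (simplest form)."""
    var_between = sd_between ** 2
    var_error = sd_error ** 2
    return var_between / (var_between + var_error)


SD_ERROR = 2.0  # hypothetical measurement error in degrees, identical in both groups

homogeneous = icc_from_sds(sd_between=2.0, sd_error=SD_ERROR)
heterogeneous = icc_from_sds(sd_between=8.0, sd_error=SD_ERROR)

print(round(homogeneous, 2), round(heterogeneous, 2))  # prints 0.5 0.94
```

The measurement error is the same 2° in both cases, yet the index rises from 0.5 to about 0.94 when the group is more heterogeneous. A reliability index obtained in one population therefore cannot be assumed to carry over to another, whereas the error in degrees is unchanged.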
The potential influence of the assessor characteristics on the
reliability of 3DGA data received very limited focus within the
studies in this review, with generally poor detailing of assessor
recruitment and descriptions. Kinematic 3D gait measurement
using landmark-specific models requires specialised staff skills,
including accurate and consistent placement of markers, and
expert knowledge of the underlying biomechanical model.
Training of clinical staff in standardised protocols is widely
considered to be important [1,29]. The consistency of the measures
may therefore be influenced by assessor experience, expertise,
professional background and additional training [56], with
experience of the clinical team potentially contributing to random
error in gait data [57]. Inclusion criteria or sampling methods for
assessors were not reported in any study, and it seems probable
that assessors were convenience samples of staff working within
the authors' laboratories. Whether the samples were influenced by
any biasing factors, such as recruitment of only the most
experienced or best assessors, is uncertain. If experience or
discipline-specific training is a determinant of 3DGA measurement
reliability, then it is uncertain whether the results of the best
assessors can be applied to inexperienced assessors, or to those
from different professional backgrounds. Similarly, if the findings
are from novice assessors, then the error sizes reported may be
larger than those typically achieved by experienced assessors with
greater expertise.
4.2. Methodological considerations: study design and procedures
Although the majority of studies described the use of
standardised protocols, wide variation was apparent in the
duration between measurement sessions. Justification of the time
interval duration is recognized as a desirable attribute of study
quality [16], but was absent in the majority of reports. Selection of
an optimal interval in repeated 3DGA measures requires consideration of both practical and theoretical issues. In principle,
intervals should be far enough apart to minimise fatigue or memory bias
effects, but short enough to avoid genuine change in the
measurements [16,55]. Artificially short intervals within a day
are often most feasible to achieve, yet may leave visible signs of
marker placement on skin to unblind a repeat assessment or
subsequent assessor, or increase the possibility that assessors may
remember aspects of anthropometric measures or landmark
identification. Fatigue may also cause true variations in the gait
patterns of clinical subjects when measured repeatedly within a
day by multiple assessors. In contrast, longer time periods of
months increase the possibility that real change has occurred
within the measurement interval, potentially introducing disease
progression bias [17]. In clinical populations such as CP,
deterioration in gait has been documented over periods of 12
Table 4
Factors to consider when planning or reporting a 3DGA gait reliability study.
Descriptor
Methods
Participants (gait)
Participants (assessors)
Protocol and model
Study design
Results
Participants (gait)
Participants (assessors)
Data
ences (MCID) [65]. Further evidence may also be sought for the
responsiveness of 3DGA measures. Whether the error magnitudes
are sufficiently low will be relative to the magnitude of expected
intervention effect size and specific population context. Further
studies are necessary in typical clinical populations to provide high
quality evidence indicating whether 3DGA measures are sufficiently reliable to detect clinically important change.
5. Considerations and recommendations for future research
A number of limitations should be considered when interpreting the findings of this review. All papers were retained for
inclusion regardless of study quality, in order to provide a
comprehensive overview of available data. Statistical synthesis
of the data was not performed. The findings of this review are
limited to the published papers identified by the search strategies.
Potential publication bias was not assessed and may have resulted
in an over-estimation of reliability. Study quality was reviewed
only with the appraisal tool developed for the purpose of this study.
Future studies of the reliability of 3DGA require careful
consideration of optimal design to enhance the generalisability
of the findings. If the intention is to apply the reliability estimates
to clinical populations, then careful attention is necessary to
recruit and describe samples which are representative of the
clinical populations of interest. Assessor recruitment and characterization warrant comparable attention. Protocols should
carefully consider what standardised measurement interval is
most appropriate and minimise predictable sources of assessor
bias. Appropriate statistical strategies should include reliability
estimates in units of degrees to enhance interpretation. Future
studies should also consider evaluation of the reliability of kinetics
and consider study designs that allow evaluation of the responsiveness of 3DGA. Table 4 proposes a list of factors that should be
considered when designing or reporting a study of the reliability of
3DGA.
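As an illustration of a reliability estimate expressed in degrees, the sketch below computes a standard error of measurement as the within-participant standard deviation across repeated sessions. This is one common way to obtain such an estimate and is an assumption here, not a method prescribed by this review; the function name `sem_degrees` and the peak knee flexion values are hypothetical.

```python
import numpy as np


def sem_degrees(angles) -> float:
    """Within-participant SD in degrees from a (participants x sessions) array."""
    a = np.asarray(angles, dtype=float)
    within_var = a.var(axis=1, ddof=1)   # variance across sessions, per participant
    return float(np.sqrt(within_var.mean()))


# Hypothetical peak knee flexion (degrees): four participants, two sessions each.
peak_knee_flexion = np.array([
    [60.0, 62.0],
    [55.0, 55.0],
    [70.0, 66.0],
    [58.0, 60.0],
])

sem = sem_degrees(peak_knee_flexion)
print(f"SEM = {sem:.2f} degrees")  # prints SEM = 1.73 degrees
```

An estimate in degrees can be compared directly with the size of change expected from an intervention, which a correlation-type index does not allow.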