
Research Report

Clinimetric Properties of the Performance-Oriented Mobility Assessment

Downloaded from https://academic.oup.com/ptj/article/86/7/944/2805177 by guest on 16 October 2023



Background and Purpose. The Performance-Oriented Mobility Assessment (POMA) is a widely used instrument that provides an evaluation of balance and gait. It is used clinically to determine the mobility status of older adults or to evaluate changes over time. To support the use of the POMA for these purposes, the clinimetric properties (in particular, responsiveness) were determined. Subjects. Participants (78% female; mean age=84.9 years) were living in either self-care or nursing-care residences. Concurrent and discriminant validity were assessed with the total group (N=245), whereas reliability and responsiveness were determined with a subsample (n=30). Fall-related predictive validity was assessed with a subsample of 72 participants. Methods. In addition to the POMA, several reference performance tests were administered. The POMA was assessed on 2 consecutive days by 2 raters (observers). The analyses included the calculation of Spearman rank correlation coefficients (R), limits of agreement (LOA) with Bland-Altman plots, minimal detectable changes at the 95% confidence level (MDC95), and sensitivity and specificity with regard to predicting falls. When possible, findings for the total scale (POMA-T) were complemented by findings for its balance subscale (POMA-B) and its gait subscale (POMA-G). Results. The interrater and test-retest reliability for the POMA-T and the POMA-B were good (R=.74–.93), whereas for the POMA-G, the reliability values, although high as well, were systematically slightly lower (R=.72–.89). The Spearman correlations with the reference performance tests (R=|.64|–|.68|) indicated satisfactory concurrent validity for the POMA-T and the POMA-B, but the corresponding findings for the POMA-G (R=|.52|–|.56|) were less convincing. The discriminant validity values of the 3 scales were about the same. The LOA for the POMA-T were on the order of −4.0 to 4.0 for test-retest agreement and −3.0 to 3.0 for interrater agreement. On the basis of the MDC95 values, it was concluded that changes in POMA-T scores at the individual level should be at least 5 points and that those at the group level (n=30) should be at least 0.8 point to be considered reliable. Even when optimal cutoff points were used, sensitivity and specificity values (varying between 62.5% and 66.1%) for the POMA-T as well as for its 2 subscales indicated poor accuracy in predicting falls. Discussion and Conclusion. The POMA-T and its subscale POMA-B have adequate reliability and validity for assessing mobility in older adults. The POMA-T is useful for demonstrating intervention effects at the group level. Changes within subjects, however, should be at least 5 points before being interpreted as reliable changes. The accuracy of the POMA-T in predicting falls is poor. [Faber MJ, Bosscher RJ, van Wieringen PCW. Clinimetric properties of the Performance-Oriented Mobility Assessment. Phys Ther. 2006;86:944–954.]

Key Words: Minimal detectable change, Older people, Performance-Oriented Mobility Assessment,
Reliability, Validity.

Marjan J Faber, Ruud J Bosscher, Piet CW van Wieringen



944 Physical Therapy . Volume 86 . Number 7 . July 2006


The Performance-Oriented Mobility Assessment (POMA) scale, developed by Tinetti and first published in 1986,1 is a widely used tool for assessing mobility and fall risk in older people. It is easily applied in clinical settings; other than a standard chair and a stopwatch, no further equipment is required, and only little experience is needed to master its use.1 After a few practice sessions, the observer can complete the assessment in less than 15 minutes.2

Several adapted versions of the POMA have been published, but in this article, only the original 28-point version is considered, as it is the most commonly used version.3 The total POMA scale (POMA-T) consists of a balance scale (POMA-B) and a gait scale (POMA-G). The POMA-B carries the subject through positions and changes in position, reflecting stability tasks that are related to daily activities. In the POMA-G, several qualitative aspects of the locomotion pattern are examined. Each item is scored on a 2- or 3-point scale, resulting in a maximum score of 28 on the POMA-T and maximum scores of 16 and 12 on the POMA-B and the POMA-G, respectively. Originally, the POMA-T was developed to predict falls in an institutionalized population.3 Later, the scale also was used in various clinical contexts as a measure of mobility impairment4–6 and to study the effects of interventions.7–13

A prerequisite for using a clinical measurement tool is that its clinimetric properties, including validity, reliability, and responsiveness, are satisfactory. Validity indicates whether the instrument does indeed measure what it is intended to measure. Concurrent validity refers to the relationship between scores on the scale in question and scores on other scales intended to measure the same construct. Predictive validity refers to the degree to which the scores predict an external criterion. Reliability refers to the extent to which the measurements are objective (interrater reliability) and stable over time (test-retest reliability). Absolute reliability is the degree to which repeated measurements vary for subjects, with the changes being expressed in the units of measurement of the instrument. Relative reliability is the degree to which subjects maintain their position in a sample with repeated measurements, usually assessed with some type of correlation coefficient.14 Responsiveness is defined as the ability of an instrument to accurately detect change when it has occurred.15,16

Only limited clinimetric data on the original POMA have been published. With regard to test-retest reliability of POMA-T scores, intraclass correlation coefficients (ICCs) of .88 (for 40 residents of skilled nursing homes)17 and .97 (for 8 community-dwelling older people)4 have been reported. The concurrent validity of POMA-T scores was investigated in a cross-sectional study6 of 167 older people with mild balance impairments. Spearman correlations (R) of POMA-T scores with the results of several balance-related tests were calculated; these measures included maximum step length (R=.75), tandem stance time (R=.69), stance time on one foot (R=.74), tandem walk time (R=−.62), Timed “Up & Go” Test (TUG) (R=−.65), and 6-Minute Walk Test (R=.62). For a group of 59 community-dwelling older people, a Spearman correlation of .79 between POMA-T scores and gait impairment scores based on a neurologic examination was found.5

With regard to the POMA-B, a test-retest reliability value (ICC) of .93 was reported for a group of 14 residential care facility residents.7 Interrater reliability values in that study, expressed as Pearson correlation coefficients (r), varied from .76 to .90. For a group of 40 residents of skilled nursing homes, the ICC indicating interrater reliability was .75.17 In one study focusing on the interrater reliability of scores on the 8 individual items of the POMA-B, kappa coefficients ranging from .40 to 1.00 were reported across many raters with various levels of experience for 29 hospital inpatients and nursing home residents.18 The predictive validity of scores on the POMA-B for falls was investigated by Verghese et al19 with a group of 60 community-dwelling older people;

MJ Faber, PhD, is Senior Researcher, Faculty of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands. Address
all correspondence to Dr Faber at Centre for Quality of Care Research (WOK), Radboud University Nijmegen Medical Centre, PO Box 9101, 117
KWAZO, 6500 HB Nijmegen, the Netherlands (m.faber@kwazo.umcn.nl).

RJ Bosscher, PhD, is Associate Professor, Faculty of Human Movement Sciences, Vrije Universiteit Amsterdam.

PCW van Wieringen, PhD, is Associate Professor, Faculty of Human Movement Sciences, Vrije Universiteit Amsterdam.

Dr Faber, Dr Bosscher and Dr van Wieringen provided concept/idea/research design. Dr Faber and Dr van Wieringen provided writing. Dr Faber
provided data collection, project management, subjects, and data analysis. Dr van Wieringen provided fund procurement. Dr Bosscher provided
consultation (including review of manuscript before submission).

The medical ethical committee of the Vrije Universiteit Medical Centre approved the study protocol.

This article was received May 23, 2005, and was accepted January 31, 2006.



with a cutoff value set at a score of 10 points, the sensitivity was 61.5% and the specificity was 69.5%.

With regard to the POMA-G, an interrater reliability value (ICC) of .83 was reported for a group of 40 residents of skilled nursing homes.17 The concurrent validity of scores on the POMA-G was investigated for 34 community-dwelling older people by correlating POMA-G scores with their ankle ranges of motion, resulting in a Spearman correlation of .63.4

Although the data presented above are encouraging, the number of clinimetric studies is still relatively small, in particular, with regard to validity. Moreover, all reliability values reported so far refer to relative reliability; no findings have been published with regard to absolute reliability or to the related characteristic of responsiveness of the POMA scale. This dearth of published data raises questions about the use of the POMA for monitoring patients’ clinical recovery process or responses to interventions,20 even though the POMA has been used extensively for these goals.7–13

Given these considerations, we conducted a large-scale clinimetric study with older adults living in long-term care facilities in order to extend the small database with respect to the relative interrater and test-retest reliability and validity (concurrent, discriminant, and predictive) of scores on the original POMA and to add important information about its absolute reliability and the minimal detectable change, which was the type of change chosen for a study on responsiveness.15

Method

Participants
Data for the present study were collected from participants in a randomized controlled trial (RCT) investigating the effects of 2 exercise programs. These participants were recruited from 15 long-term self-care and nursing care residences, with the number of residents ranging from 120 to 500. In self-care residences, people live independently but have access to on-site nursing care and dining and recreational facilities. In nursing care residences, people live less independently, with provisions for full nursing care if necessary. Preceding the RCT, all residents were invited to meetings in which information about the setup of the RCT and exercise programs was given. People who were interested in participation were screened on the basis of the following inclusion criteria: ability to walk independently across a distance of at least 6 m, with or without the use of a walking aid; capacity to understand instructions to be provided during the programs; and absence of medical contraindications to participation, as judged by the volunteers’ general practitioners. The second criterion was operationalized by a score of at least 18 on the Mini-Mental State Examination.21 In addition, the nursing staff judged all volunteers meeting this criterion to be fit to participate.

Of the 278 interested and eligible participants, 33 were excluded because they had Mini-Mental State Examination scores of less than 18. The concurrent and discriminant validity data for the present study were obtained from the remaining 245 participants in the RCT. The reliability and responsiveness data were collected from a sample of 30 participants living in the last 3 included residences. Participants in the RCT who were living in the latter residences could volunteer to participate in the reliability and responsiveness study. Predictive validity was determined for the participants who were randomly assigned to the control group in the RCT; these participants did not receive any intervention, and their fall history was recorded over a period of 10 months after randomization for the RCT (n=72). The characteristics of the participants belonging to the 3 study groups are summarized in Table 1. All participants gave written informed consent.

Procedure and Data Collection
To assess concurrent validity, 2 research physical therapists, both with 4 years of experience in physical testing of older adults, made individual assessments of all participants. In addition to the POMA, the TUG,22 the balance test from the Frailty and Injuries: Cooperative Studies of Intervention Techniques (FICSIT-4),23 and a gait speed test24 were administered. Information about the type of walking aid commonly used by the participants was collected to determine discriminant validity.

For the reliability and responsiveness part of the study, 2 graduate students who were studying human movement sciences and who received 8 hours of training in scoring the POMA scored the POMA for the 30 participants on 2 consecutive days while the physical therapists gave the test instructions to the participants. On both days, the students scored the POMA simultaneously but independently from each other. Given the short interval of about 24 hours between the 2 assessments, changes in performance attributable to changing health conditions or interventions seemed highly improbable. As indicated earlier, fall-related predictive validity was determined with the group of 72 control participants in the RCT, that is, participants who were not involved in an intervention program. Fall data were collected by means of fall diaries that were kept by the participants over a period of 10 months. A fall was defined as “an event that results in a person coming unintentionally to rest on the ground or other lower level.”25
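
The fall records collected this way feed the sensitivity and specificity figures reported for the POMA. The following is a minimal sketch of that calculation, not the authors' procedure. It uses the article's definition of a faller (at least 2 falls during follow-up) and assumes that a score at or below the cutoff predicts a faller, because lower POMA scores indicate poorer performance. The function name and the example records are hypothetical and are not study data.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    records: Iterable[Tuple[int, int]], cutoff: int
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for (poma_score, n_falls) records."""
    tp = fn = tn = fp = 0
    for score, n_falls in records:
        is_faller = n_falls >= 2            # "faller": fell at least twice
        predicted_faller = score <= cutoff  # ASSUMPTION: low score flags a faller
        if is_faller and predicted_faller:
            tp += 1
        elif is_faller:
            fn += 1
        elif predicted_faller:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one faller and one nonfaller")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical illustration records, NOT data from the study.
example_records = [
    (12, 3), (17, 2), (22, 2),                    # fell at least twice
    (15, 0), (21, 1), (25, 0), (27, 0), (19, 1),  # fell once or not at all
]
sens, spec = sensitivity_specificity(example_records, cutoff=19)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Repeating the call over every possible cutoff gives the points of a receiver operating characteristic curve, from which a cutoff can be chosen.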



Table 1.
Characteristics of Participants in the Concurrent and Discriminant Validity Study, the Fall-Related Validity Study, and the Reliability and Responsiveness Study

                                          Value in the Following Study:
                                          Concurrent and
                                          Discriminant    Fall-Related    Reliability and
                                          Validity        Validity        Responsiveness
Characteristic^a                          (n=245)         (n=72)          (n=30)

Age, y, mean (SD)                         84.9 (6.0)      84.7 (6.1)      83.1 (7.3)
No. (%) of women                          191 (78)        58 (81)         24 (80)
MMSE score, mean (SD) (range=0–30)        25.7 (2.9)      25.7 (2.9)      26.5 (2.4)
Self-reported health status, no. (%) of subjects
  Poor                                    4 (1.6)         0 (0)           1 (3.3)
  Fair                                    87 (35.7)       21 (29.2)       8 (26.7)
  Good                                    104 (42.6)      34 (47.2)       17 (56.7)
  Excellent                               49 (20.1)       17 (23.7)       4 (13.3)
Walking aid, no. (%) of subjects
  None                                    86 (35)         25 (35)         12 (40)
  Cane or stick                           26 (11)         9 (13)          5 (17)
  Walker                                  121 (49)        30 (42)         13 (43)
  Wheelchair                              12 (5)          8 (11)          0 (0)
Physical activity, min/d, mean (SD)       65 (49)         70 (46)         75 (50)
Maximum walking speed, m/s, mean (SD)     0.76 (0.31)     0.79 (0.34)     0.95 (0.25)
FICSIT-4 score, mean (SD) (range=0–5)     2.4 (1.3)       2.5 (1.3)       2.6 (1.4)
TUG score, s, mean (SD)                   24.6 (14.8)     23.0 (14.6)     16.6 (6.0)
GARS score, mean (SD) (range=18–90)       42.4 (13.1)     39.4 (13.6)     36.8 (11.6)
POMA score, mean (SD)
  Total (range=0–28)                      19.3 (5.3)      19.7 (5.7)      18.9 (5.4)
  Gait (range=0–12)                       9.0 (2.5)       9.2 (2.6)       8.4 (2.6)
  Balance (range=0–16)                    10.3 (3.5)      10.5 (3.8)      10.5 (3.1)

^a MMSE=Mini-Mental State Examination, FICSIT-4=balance test from the Frailty and Injuries: Cooperative Studies of Intervention Techniques, TUG=Timed “Up & Go” Test, GARS=Groningen Activity Restriction Scale, POMA=Performance-Oriented Mobility Assessment.

Measurement Instruments
The original POMA version used in this clinimetric evaluation (Appendix)1 consists of 8 balance items and 8 gait items to be scored on a 2- or 3-point scale. The balance items include sitting balance, rising from a chair and sitting down again, standing balance (eyes open and eyes closed), and turning balance, adding up to a maximum score of 16 points (POMA-B). The gait items include gait initiation, step length, step height, step length symmetry and continuity, path direction, and trunk sway, adding up to a maximum score of 12 points (POMA-G). The total score (POMA-T) ranges from 0 to 28 points. Lower scores indicate poorer performance.

The TUG is a test of basic functional mobility and is scored as the minimum time needed to stand up from a standard armchair, walk across a distance of 3 m, turn around, walk back to the chair, and sit down again. Interrater reliability (ICC=.99) and test-retest reliability (ICC=.99) of TUG scores have been determined for 22 patients attending a geriatric hospital.22 In that same study, the concurrent validity of TUG scores was determined for a larger group of 60 patients by correlating the time to complete the TUG with the Berg Balance Scale (Pearson r=−.81), a gait speed test (Pearson r=−.61), and the Barthel Index (Pearson r=−.51).22

The FICSIT-4 is used to test a person’s ability to maintain balance in parallel stance, semitandem stance, tandem stance, and one-leg stance. Each position was tested for a maximum of 10 seconds, and participants proceeded to the next stance only when the previous stance could be maintained for at least 3 seconds. A summary score for the 4 positions was computed as suggested by Rossiter-Fornoff et al,23 resulting in a scale ranging from 0 to 5 points, with higher scores indicating better balance performance. The test-retest reliability of scores on the FICSIT-3 (similar to the FICSIT-4, but without the one-leg stance) has been determined over intervals between 2 measurements ranging from 3 to 12 months. The Pearson r ranged from .25 to .74, with longer intervals resulting in lower test-retest correlations.23
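
The POMA scoring structure can be sketched as a small data structure. This is an illustration only. It assumes the subscale maxima of 16 (POMA-B) and 12 (POMA-G) given by the score ranges in Table 1. The class name and constants are hypothetical, and item-level scoring is not modeled.

```python
from dataclasses import dataclass

POMA_B_MAX = 16  # balance subscale maximum (Table 1: range 0-16)
POMA_G_MAX = 12  # gait subscale maximum (Table 1: range 0-12)


@dataclass(frozen=True)
class PomaScore:
    """Hypothetical container for one POMA assessment (subscale totals only)."""

    balance: int  # POMA-B
    gait: int     # POMA-G

    def __post_init__(self) -> None:
        # Reject subscale totals outside the published ranges.
        if not 0 <= self.balance <= POMA_B_MAX:
            raise ValueError(f"POMA-B must be in 0..{POMA_B_MAX}")
        if not 0 <= self.gait <= POMA_G_MAX:
            raise ValueError(f"POMA-G must be in 0..{POMA_G_MAX}")

    @property
    def total(self) -> int:
        """POMA-T: sum of the balance and gait subscales (0..28)."""
        return self.balance + self.gait


score = PomaScore(balance=10, gait=9)
print(score.total)
```

Keeping the two subscales separate, rather than storing only the total, matches the way the article reports findings for the POMA-T, POMA-B, and POMA-G side by side.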



Fast gait speed was determined across a distance of 6 m, which was marked on the floor with tape. The participants, who were allowed to use their usual walking aid, were asked to walk as fast as possible without running. They were instructed to wait with both feet 1 m behind the starting line and to start walking after a verbal command. Timing began after the leading foot crossed the starting line and stopped after the leading foot crossed the finish line. The participants were instructed to continue walking for a short distance after the finish line was crossed to prevent them from decelerating before this line was reached. Speed was computed by dividing distance (in meters) by time (in seconds).24 The highest speed attained during 1 of 2 attempts was used for analysis. The test-retest reliability (ICC) for gait speed over an interval of about 2 weeks for a group of 105 frail older people (mean age=78.0 years) was .79.26 The test-retest reliability values (ICCs) determined on the same day for comfortable and maximum gait speeds for a group of 96 subjects between 60 and 89 years of age were .97 and .96, respectively.24

Self-reported limitations in basic activities of daily living (BADL) and instrumental activities of daily living (IADL) were assessed by means of the Groningen Activity Restriction Scale (GARS).27 The GARS consists of 18 items, covering 11 BADL and 7 IADL tasks, all scored on a 5-point scale (possible scoring range of 18–90 points, with higher scores indicating more limitations). The GARS has been used to determine changes in disablement over time, to differentiate between degrees of disability, and to assess the need for professional care.27 The test-retest correlation, determined within a group of 77 subjects over a 4-month interval, was .74 (ref 28); the interrater reliability has not been determined. An indication for concurrent validity was found in a population-based study of 4,777 subjects in which the GARS scores correlated highly with the scores on the physical functioning subscale of the 20-Item Short-Form Health Survey (SF-20) (Pearson r=−.72).29 The latter subscale measures the extent to which health problems interfere with a variety of activities (eg, playing sports, carrying groceries, climbing stairs, and walking).30

Finally, the average number of minutes per day spent on habitual daily physical activities during the preceding 2 weeks was determined by administering the Longitudinal Aging Study Amsterdam Physical Activity Questionnaire (LAPAQ).31 The LAPAQ covers the frequency and duration of walking outside, bicycling, gardening, light and heavy household activities, and sport activities during the preceding 2 weeks. The total amounts of activity measured by the LAPAQ and by means of a 7-day diary were highly correlated (Spearman R=.68; n=356; men and women 65 years of age and older). The test-retest reliability was established with the same group, and the weighted kappa coefficient of the total number of activities measured by the LAPAQ over 1 year was .65.31

Data Analysis
Assumptions of normality were not met for the POMA-T, POMA-B, POMA-G, and TUG. Therefore, all calculations of relative reliability and of concurrent and discriminant validity were based on nonparametric statistics. The computation of absolute reliability and responsiveness is based on differences in paired observations, assuming that these differences are normally distributed. This assumption held true for POMA-T but not for POMA-B and POMA-G. Consequently, absolute reliability findings are provided only for the former scale.

The relative interrater and test-retest reliability of the POMA scores were expressed in terms of Spearman rank correlations (R). These calculations were complemented by testing the differences between the paired scores given by the 2 raters and between the paired scores on the 2 test days by means of a Wilcoxon signed rank test.

Absolute interrater and test-retest reliability for the POMA-T were visualized by means of Bland-Altman plots with 95% limits of agreement (LOA).32 In those plots, the differences (d) between each pair of observations are presented as a function of the average value for each pair of observations. Assuming a normal distribution of the differences, 95% of those differences may be expected to fall within the interval d̄ ± (1.96 × SDdiff), with d̄ being the mean difference and SDdiff being the standard deviation of the difference. The mean difference d̄ captures the systematic difference between the paired observations, whereas the SDdiff captures the agreement at the level of individual observations.

The responsiveness of the POMA-T was considered at both the individual level and the group level and is presented in the units of measurements of this scale. The responsiveness at the individual level is captured as the minimal detectable change with a confidence level of 95% (MDC95) at the individual level (MDC95,ind), as follows:

\[ \mathrm{MDC}_{95,\mathrm{ind}} = 1.96 \times \sqrt{2} \times \mathrm{SEM}, \]

where SEM is the standard error of measurement (ie, the square root of the within-subject variance).15 Changes smaller than MDC95,ind cannot be reliably (with a confidence level of 95%) interpreted as “real” changes in the score for a subject compared with chance fluctuations. The responsiveness to changes at the group level, known as the MDC95 at the group level (MDC95,group), depends on the size of the group (n), as follows33:



\[ \mathrm{MDC}_{95,\mathrm{group}} = \frac{\mathrm{MDC}_{95,\mathrm{ind}}}{\sqrt{n}}. \]
Changes smaller than MDC95,group cannot be reliably
(with a confidence level of 95%) interpreted as “real”
changes in the mean score for a group compared with
chance fluctuations.
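
The two MDC95 formulas can be sketched in a few lines. This is an illustrative calculation, not the authors' SPSS procedure. The function names are hypothetical, and the SEM of 1.5 points is an invented input used only to exercise the first formula. The value 4.2 is the MDC95,ind reported for rater 1 in Table 2.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence level


def mdc95_individual(sem: float) -> float:
    """MDC95,ind = 1.96 * sqrt(2) * SEM."""
    return Z_95 * math.sqrt(2) * sem


def mdc95_group(mdc95_ind: float, n: int) -> float:
    """MDC95,group = MDC95,ind / sqrt(n) for a group of n subjects."""
    if n < 1:
        raise ValueError("group size n must be at least 1")
    return mdc95_ind / math.sqrt(n)


# Hypothetical SEM of 1.5 POMA-T points (not a study value).
print(round(mdc95_individual(1.5), 2))

# Reported MDC95,ind of 4.2 with the study's subsample size n = 30.
print(round(mdc95_group(4.2, 30), 2))
```

With MDC95,ind = 4.2 and n = 30, the group-level value is about 0.77, which matches the 0.8 shown in Table 2 after rounding. Because the divisor is √n, quadrupling the group size only halves the change in the group mean that is needed.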

The concurrent validity of the POMA scores was assessed by calculating their Spearman rank correlations (R) with the scores on a number of reference tests described above. The discriminant validity was calculated by relating the POMA scores to the type of walking aid commonly used by the participants (none, cane or stick, walker, or wheelchair) by means of a Kruskal-Wallis test with type of walking aid as the experimental factor, followed by post hoc comparisons by means of Mann-Whitney U tests with Bonferroni adjustments.

Fall-related predictive validity was determined by predicting future falls on the basis of the POMA scores. A “nonfaller” was defined as a subject who did not fall or fell only once during the follow-up period, whereas a “faller” was defined as a subject who fell at least twice during the follow-up period (as in the study by Tinetti et al3). Predictive validity was expressed in terms of sensitivity and specificity. Sensitivity, in this context, is defined as the probability that a future faller is indeed predicted to be a faller, whereas specificity is defined as the probability that a future nonfaller is indeed predicted to be a nonfaller. Receiver operating characteristic curves were used for selecting the optimal cutoff scores, and 95% confidence intervals were calculated. All analyses were performed with SPSS version 11.5* for Windows.

* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

Results

Floor and Ceiling Effects
The scores on the 3 POMA scales were inspected for possible floor and ceiling effects by determining the number of participants with the lowest and highest possible scores on the 3 scales. The lowest possible scores, that is, 0 points, on the POMA-T, POMA-B, and POMA-G were not obtained, whereas 11 (4.5%), 13 (5.3%), and 52 (21.2%) of the participants obtained the highest possible scores on these tests. The distribution of the POMA-T scores for the 245 participants is shown in Figure 1.

Figure 1.
Histogram of Performance-Oriented Mobility Assessment (POMA) scores for a group of 245 older adults with mobility impairments and living in long-term-care facilities.

Reliability
The Spearman correlations indicating the interrater and test-retest relative reliability for the POMA scales are shown in Table 2. All test-retest reliability values for the POMA-T, POMA-B, and POMA-G varied between .72 and .86, whereas the interrater reliability values ranged from .80 to .93. No significant differences between the pairs of scores were found, except with regard to the test-retest reliability of the POMA-G scores; on the latter scale, rater 1 had a significantly lower mean score on day 2 than on day 1 (Wilcoxon signed rank test, P=.03).

The Bland-Altman plots illustrating the absolute reliability of the POMA-T are shown in Figure 2. From these plots, it is clear that the mean differences between the paired observations showed only small and nonsignificant deviations from 0, indicating that no systematic differences in scores emerged between the 2 raters or between day 1 and day 2 of assessment. The 95% LOA for POMA-T, which also are shown in Table 2, ranged from −4.0 to 4.6 for test-retest reliability and from −3.6 to 2.9 for interrater reliability.

Responsiveness
The MDC95 values for both individual and group changes in POMA-T scores are shown in Table 2. For individual assessments, the MDC95,ind values were 4.0 to 4.2. When the test-retest assessments were evaluated at the group level, the MDC95,group values were 0.7 to 0.8. These values indicate that changes in scores at the individual level should be at least 5 points and that changes in mean group scores should exceed 0.8 to be deemed reliable with a confidence level of 95%.

Validity
The Spearman correlations between the scores on the POMA scales and the scores on the reference tests (walking speed, TUG, FICSIT-4, GARS, and LAPAQ),



Table 2.
Reliability and Responsiveness of the Performance-Oriented Mobility Assessment (POMA) Total Scale (POMA-T), Balance Subscale (POMA-B), and Gait Subscale (POMA-G) for Test-Retest and Interrater Situations (n=30)

                      Test-Retest                     Interrater
Parameter^a           Rater 1         Rater 2         Day 1           Day 2

POMA-T (range=0–28)
  Reliability
    Spearman R        .86             .82             .93             .91
    Mean difference   0.5             0.0             0.1             −0.4
    95% LOA           −3.6 to 4.6     −4.0 to 4.0     −2.8 to 2.9     −3.6 to 2.8
  Responsiveness
    MDC95,ind         4.2             4.0
    MDC95,group       0.8             0.7
POMA-B (range=0–16)
  Reliability
    Spearman R        .78             .74             .90             .88
POMA-G (range=0–12)
  Reliability
    Spearman R        .72             .77             .80             .89

^a LOA=limits of agreement, MDC95,ind=minimal detectable change for individual subjects at a 95% confidence level, MDC95,group=minimal detectable change for a group at a 95% confidence level.
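
The test-retest LOA and MDC95,ind rows of Table 2 can be cross-checked against each other. The sketch below assumes SEM = SDdiff/√2, the usual relation when each subject is measured twice. The article does not state this relation explicitly. Under that assumption, MDC95,ind = 1.96 × SDdiff, which is half the width of the 95% LOA, and the midpoint of the LOA is the mean difference. The helper names are hypothetical.

```python
def loa_half_width(lower: float, upper: float) -> float:
    """Half the width of the 95% LOA, i.e. 1.96 * SDdiff."""
    return (upper - lower) / 2


def loa_midpoint(lower: float, upper: float) -> float:
    """Midpoint of the 95% LOA, i.e. the mean difference."""
    return (upper + lower) / 2


# Test-retest values for the POMA-T copied from Table 2:
# rater -> (LOA lower, LOA upper, reported mean difference, reported MDC95,ind)
table2_test_retest = {
    "rater 1": (-3.6, 4.6, 0.5, 4.2),
    "rater 2": (-4.0, 4.0, 0.0, 4.0),
}

for rater, (low, high, mean_diff, mdc_ind) in table2_test_retest.items():
    print(
        f"{rater}: half-width={loa_half_width(low, high):.1f} "
        f"(reported MDC95,ind={mdc_ind}), "
        f"midpoint={loa_midpoint(low, high):.1f} "
        f"(reported mean difference={mean_diff})"
    )
```

The half-widths of 4.1 and 4.0 agree with the reported MDC95,ind values of 4.2 and 4.0 to within the rounding of the published limits.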

indicating the concurrent validity of scores for the scales, are shown in Table 3. All correlations were significant at the .01 level. Except for the correlations with LAPAQ, which were low, all correlations between the POMA-T and the POMA-B on the one hand and the reference tests on the other hand ranged from |.64| to |.70|. The corresponding correlations between the POMA-G and the reference tests were lower, ranging from |.51| to |.56|.

Figure 2.
Bland-Altman plot representing the absolute test-retest reliability on days 1 and 2 for rater 1 (a) and rater 2 (b) and the absolute interrater reliability on day 1 (c) and day 2 (d). The within-subject mean total Performance-Oriented Mobility Assessment score is plotted against the within-subject difference. The solid horizontal line indicates the mean difference, and the dashed horizontal lines indicate the upper and lower 95% limits of agreement.

The mean scores (and standard deviations) on the POMA scales for the subjects who used no walking aid (n=86), a cane or a stick (n=26), a walker (n=121), or a wheelchair (n=12) are shown in Table 4. Significant group differences between mean POMA-T scores emerged between the independent ambulators and the cane and walker users on the one hand and the wheelchair users on the other hand and between the independent ambulators and the walker users. The POMA-B scores differentiated the independent ambulators and the cane users on the one hand from the walker and wheelchair users on the other hand. Finally, the POMA-G scores led to the same differentiations as the POMA-T
scores. The wheelchair users were dif-



Table 3.
Concurrent Validity of the Performance-Oriented Mobility Assessment (POMA) Total Scale (POMA-T), Balance Subscale (POMA-B), and Gait Subscale (POMA-G), Expressed as Spearman Rank Correlations

                        Spearman Rank Correlation^b for:
Test^a                  POMA-T    POMA-B    POMA-G

Maximum walking speed   .65       .64       .52
TUG                     −.68      −.66      −.56
FICSIT-4                .67       .67       .51
GARS                    −.70      −.68      −.55
LAPAQ                   .38       .35       .33

^a TUG=Timed “Up & Go” Test, FICSIT-4=balance test from the Frailty and Injuries: Cooperative Studies of Intervention Techniques, GARS=Groningen Activity Restriction Scale, LAPAQ=Longitudinal Aging Study Amsterdam Physical Activity Questionnaire.
^b All correlations were significant at P<.01.

Table 4.
Discriminant Validity of the Performance-Oriented Mobility Assessment (POMA) Total Scale (POMA-T), Balance Subscale (POMA-B), and Gait Subscale (POMA-G) for Categories Based on the Commonly Used Walking Aid for Daily Mobility

              Mean (SD) Score on^a:
Walking Aid   POMA-T              POMA-B              POMA-G

None          21.9 (5.5)^A,B      11.9 (3.6)^A,B      10.0 (2.3)^A,B
Cane          20.0 (4.5)^B        11.1 (2.5)^A,B      8.8 (2.4)
Walker        17.9 (4.4)^B,C      9.3 (3.0)^C,D       8.6 (2.2)^B,C
Wheelchair    12.9 (5.0)^A,C,D    6.8 (3.1)^C,D       6.2 (2.9)^A,C

^a Post hoc testing revealed significant subgroup differences after Bonferroni corrections for the following comparisons: A=comparison with walker, B=comparison with wheelchair, C=comparison with none, and D=comparison with cane.

for the POMA-G, the test scores tended to be lower on the retest than on the first test. Combined with the high ICCs found in previous studies,4,7,17 these data indicate that the relative reliability characteristics of the POMA scales seem to be adequate.

From a clinical point of view, relative reliability must be considered less relevant than absolute reliability. The LOA showed that for the POMA-T, no systematic bias was present for test-retest and interrater situations. The test-retest reliability data have direct implications for responsiveness. The responsiveness findings with regard to the POMA-T indicated that, given a confidence interval of 95%, intervention effects should be at least 5 points at the individual level and at least 0.8 point at the group level (with a group size of n=30) before a real improvement rather than a chance fluctuation can be reliably concluded. It should be emphasized, however, that this real change should be attributed to the intervention only when other systematic influences, such as spontaneous recovery, are controlled for by means of an adequate control group.

ferentiated from the other 3 groups, and the independent In earlier clinical trials in which the POMA was used as
ambulators were differentiated from the walker users. an outcome measure, statistically significant intervention
effects of 3.5 to 5.3 points (relative to the results for a
Among the subsample of 72 participants whose data control group) were reported.8,11,34,35 Given these aver-
were entered into the analysis involving falls, 24 (33%) age group effects and the order of magnitude of the
were classified as “fallers” (at least 2 falls) and 48 (67%) critical MDC95,ind determined in the present study, one
were classified as “nonfallers” (either no falls or one may safely conclude that for a number of subjects,
fall). Sensitivity and specificity values indicating the reliable intervention effects indeed have occurred. Even
predictive validity of scores for the POMA scales in terms in those cases, however, the clinical relevance of the
of discriminating future fallers from nonfallers, are improvement is not beyond doubt. Clinical relevance
shown in Table 5. It is evident that the predictive powers can be demonstrated by showing that the change scores
of the POMA-T, POMA-B, and POMA-G are about the also exceed the minimal clinically important difference,
same: Given optimal cutoff values of 19, 10, and 9, the defined as the smallest change that ensures clinically
sensitivity (95% confidence interval) of all of the scales relevant improvement. Several methods have been pro-
was 64.0% (44.5%–79.8%), and their specificity values posed to determine the minimal clinically important
were 66.1% (53.0%–77.1%), 66.1% (53.0%–77.1%), and difference.36 An anchor-based method is preferred, in
62.5% (49.4%–74.0%), respectively. which the change in an external criterion that may
be determined from either a clinician’s or a patient’s
Discussion perspective is used to “anchor” improvement. How-
The relative interrater and test-retest reliability values for ever, finding a valid external criterion, which often
the POMA-T, POMA-B, and POMA-G, as quantified by will be very difficult,37 was beyond the scope of the
Spearman correlation coefficients, were rather high, but present study.
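The relationship between the limits of agreement and the individual-level and group-level thresholds for a reliable change can be illustrated with a short calculation. The sketch below is not the authors' analysis code. It assumes one common set of definitions that this section does not restate (MDC95 for an individual as 1.96 times the SD of the paired differences, and MDC95 for a group as the individual value divided by the square root of the group size), and the test and retest scores are hypothetical values chosen only for illustration.

```python
import math
import statistics


def limits_of_agreement(first, second):
    """Bland-Altman summary for paired scores from two sessions or raters."""
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    return mean_diff, sd_diff, (lower, upper)


def mdc95(sd_diff, group_size):
    """Assumed definitions: MDC95_ind = 1.96 * SD_diff,
    MDC95_group = MDC95_ind / sqrt(group_size)."""
    individual = 1.96 * sd_diff
    group = individual / math.sqrt(group_size)
    return individual, group


# Hypothetical test and retest totals for six subjects (not data from the study).
test_scores = [20, 18, 25, 14, 22, 19]
retest_scores = [21, 16, 25, 17, 20, 19]

mean_diff, sd_diff, loa = limits_of_agreement(test_scores, retest_scores)
mdc_ind, mdc_group = mdc95(sd_diff, group_size=len(test_scores))
print(f"mean difference = {mean_diff:.2f}, LOA = ({loa[0]:.2f}, {loa[1]:.2f})")
print(f"MDC95 individual = {mdc_ind:.2f}, MDC95 group = {mdc_group:.2f}")
```

Under these assumed definitions the group-level threshold shrinks with the square root of the group size, which is why a change in a group mean can be judged reliable at a much smaller magnitude than a change in a single subject's score.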

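The 95% confidence intervals reported for sensitivity and specificity in Table 5 can be checked with a small calculation. The sketch below uses the Wilson score interval for a proportion; the section does not name the interval method, so this choice is an assumption, although it reproduces the reported bounds. The counts (16 of 25 fallers, 37 and 35 of 56 nonfallers) are not stated in the text and are back-calculated here from the reported percentages and the group sizes given in Table 5.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def as_percent(interval):
    """Round both bounds to one decimal place, expressed as percentages."""
    return tuple(round(100 * bound, 1) for bound in interval)


# Counts are assumptions inferred from Table 5, not values given in the text.
print(as_percent(wilson_interval(16, 25)))  # sensitivity 64.0%  -> (44.5, 79.8)
print(as_percent(wilson_interval(37, 56)))  # specificity 66.1%  -> (53.0, 77.1)
print(as_percent(wilson_interval(35, 56)))  # specificity 62.5%  -> (49.4, 74.0)
```

The width of these intervals, roughly 25 to 35 percentage points, reflects the small number of fallers and nonfallers on which each estimate rests.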


Table 5.
Predictive Validity of the Performance-Oriented Mobility Assessment (POMA) Total Scale (POMA-T), Balance Subscale (POMA-B), and Gait Subscale (POMA-G) in Predicting Fallers and Nonfallers

                                            Value for:
Parameter                                   POMA-T              POMA-B              POMA-G
X̄ (SD) score
  Nonfallers (n=56)                         20.8 (5.5)          11.1 (3.8)          9.7 (2.4)
  Fallers (n=25)                            17.4 (5.5)          9.2 (3.6)           8.3 (2.7)
Optimal cutoff point                        19                  10                  9
Sensitivity, % (95% confidence interval)    64.0 (44.5–79.8)    64.0 (44.5–79.8)    64.0 (44.5–79.8)
Specificity, % (95% confidence interval)    66.1 (53.0–77.1)    66.1 (53.0–77.1)    62.5 (49.4–74.0)

The concurrent validity values for the POMA-T and the POMA-B were quite acceptable, as demonstrated by the association with other physical performance tests (R=|.64|–|.68|) and self-reported limitations (R=|.68|–|.70|). The validity of the POMA-G scores was weaker. The Spearman correlations in question ranged from |.51| to |.56|. The correlations between the scores on the POMA scales and the self-reported amounts of physical activity (LAPAQ) were low, ranging from .33 to .38. It may be argued, however, that self-reported physical activity is less adequate as a reference test, because it is a measure not of performance but of perception.38 Generally speaking, the concurrent validity values for the POMA-T and the POMA-B concur with the (sparse) data from previous studies.4–6 For the POMA-G, no such data were reported earlier.

Discriminant validity was demonstrated by finding significant differences between subgroups of subjects defined according to the type of walking aid that they used. Although the POMA-T and the POMA-G differentiated among the same (combined) subgroups and the POMA-B differentiated between other subgroups, there is no evidence for clear differences among the discriminatory powers of the 3 scales.

The predictive validity with regard to falling was not satisfactory for any of the POMA scales. Given optimal cutoff criteria, both the sensitivity and the specificity of the POMA-T and its subscales ranged from 62.5% to 66.1%. However, in studies in which other versions of the POMA scale were used, similar values for sensitivity and specificity were reported. In a prospective study of 60 community-dwelling older adults and using a 16-point version of the POMA-B, the sensitivity was 61.5% and the specificity was 69.5%.19 In another prospective study of 225 community-dwelling adults 75 years of age and older and using a 40-point version of the POMA-T, the sensitivity was 70% and the specificity was 52%.39 In a case-control study of 80 participants and using a modified 57-point version of the POMA, the sensitivity was 70% and the specificity was 65%.40 Only one study, a case-control study involving community-dwelling older people and using a 24-point version of the POMA-B, demonstrated much higher sensitivity and specificity: 95.5% for frequent fallers versus nonfallers.41

Conclusion
The POMA-T and its subscales POMA-B and POMA-G showed good relative reliability, as well as concurrent and discriminant validity. Nevertheless, the POMA-G performed less well with regard to these clinimetric properties. Given that the latter scale also showed a ceiling effect, the POMA-T and the POMA-B should be preferred. Responsiveness could be assessed only for POMA-T; at the individual level, a change in score of at least 5 points proved to be reliable, whereas a change in the mean score of 0.8 point indicated a reliable change in the mean score for a group of 30 subjects. Furthermore, it was demonstrated that the usefulness of the POMA scales for predicting future falls was severely limited.

References
1 Tinetti ME. Performance-oriented assessment of mobility problems in elderly patients. J Am Geriatr Soc. 1986;34:119–126.
2 Hayes KW, Johnson ME. Measures of adult general performance tests. Arthritis Care Res. 2003;49:S28–S42.
3 Tinetti ME, Williams TF, Mayewski R. Fall risk index for elderly patients based on number of chronic disabilities. Am J Med. 1986;80:429–434.
4 Mecagni C, Smith JP, Roberts KE, O’Sullivan SB. Balance and ankle range of motion in community-dwelling women aged 64 to 87 years: a correlational study. Phys Ther. 2000;80:1004–1011.
5 Baloh RW, Ying SH, Jacobson KM. A longitudinal study of gait and balance dysfunction in normal older people. Arch Neurol. 2003;60:835–839.
6 Cho BL, Scarpace D, Alexander NB. Tests of stepping as indicators of mobility, balance, and fall risk in balance-impaired older adults. J Am Geriatr Soc. 2004;52:1168–1173.
7 Harada N, Chiu V, Fowler E, et al. Physical therapy to improve functioning of older people in residential care facilities. Phys Ther. 1995;75:830–838.
8 MacRae PG, Asplund LA, Schnelle JF, et al. A walking program for nursing home residents: effects on walk endurance, physical activity, mobility, and quality of life. J Am Geriatr Soc. 1996;44:175–180.
9 Protas EJ, Harris C, Moch C, Rusk M. Sensitivity of a clinical scale of balance and gait in frail nursing home residents. Disabil Rehabil. 2000;22:372–378.



10 Rubenstein LZ, Josephson KR, Trueblood PR, et al. Effects of a group exercise program on strength, mobility, and falls among fall-prone elderly men. J Gerontol A Biol Sci Med Sci. 2000;55:M317–M321.
11 Hauer K, Pfisterer M, Schuler M, et al. Two years later: a prospective long-term follow-up of a training intervention in geriatric patients with a history of severe falls. Arch Phys Med Rehabil. 2003;84:1426–1432.
12 DeVito CA, Morgan RO, Duque M, et al. Physical performance effects of low-intensity exercise among clinically defined high-risk elders. Gerontology. 2003;49:146–154.
13 Dyer CA, Taylor GJ, Reed M, et al. Falls prevention in residential care homes: a randomised controlled trial. Age Ageing. 2004;33:596–602.
14 Atkinson G, Nevill AM. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med. 1998;26:217–238.
15 Beaton DE, Bombardier C, Katz JN, Wright JG. A taxonomy for responsiveness. J Clin Epidemiol. 2001;54:1204–1217.
16 Dekker J, Dallmeijer AJ, Lankhorst GJ. Clinimetrics in rehabilitation medicine: current issues in developing and applying measurement instruments. J Rehabil Med. 2005;37:193–201.
17 McGinty SM, Masters LD, Till DB. Inter-tester reliability using the Tinetti gait and balance assessment scale. Issues on Aging. 1999;22:3–5.
18 Cipriany-Dacko LM, Innerst D, Johannsen J, Rude V. Inter-rater reliability of the Tinetti balance scores in novice and experienced physical therapy clinicians. Arch Phys Med Rehabil. 1997;78:1160–1164.
19 Verghese J, Buschke H, Viola L, et al. Validity of divided attention tasks in predicting falls in older individuals: a preliminary study. J Am Geriatr Soc. 2002;50:1572–1576.
20 de Vet HC, Terwee CB, Bouter LM. Current challenges in clinimetrics. J Clin Epidemiol. 2003;56:1137–1141.
21 Folstein MF, Folstein SE, McHugh PR. “Mini-Mental State”: a practical method for grading cognitive state of patients for the clinician. J Psychiatr Res. 1975;12:189–198.
22 Podsiadlo D, Richardson S. The Timed “Up & Go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc. 1991;39:142–148.
23 Rossiter-Fornoff JE, Wolf SL, Wolfson LI, Buchner DM. A cross-sectional validation study of the FICSIT common data base static balance measures: frailty and injuries: cooperative studies of intervention techniques. J Gerontol A Biol Sci Med Sci. 1995;50:M291–M297.
24 Steffen TM, Hacker TA, Mollinger L. Age- and gender-related test performance in community-dwelling elderly people: Six-Minute Walk Test, Berg Balance Scale, Timed “Up & Go” Test, and gait speeds. Phys Ther. 2002;82:128–137.
25 Buchner DM, Hornbrook MC, Kutner NG, et al. Development of the common data base for the FICSIT trials. J Am Geriatr Soc. 1993;41:297–308.
26 Jette AM, Jette DU, Ng J, et al. Are performance-based measures sufficiently reliable for use in multicenter trials? J Gerontol A Biol Sci Med Sci. 1999;54:M3–M6.
27 Suurmeijer TP, Doeglas DM, Moum T, et al. The Groningen Activity Restriction Scale for measuring disability: its utility in international comparisons. Am J Public Health. 1994;84:1270–1273.
28 Kempen GIJM, Doeglas DM, Suurmeijer TPBM. The Assessment of (I)ADL With the Groningen Activity Restriction Scale: A Manual [in Dutch]. Groningen, the Netherlands: Northern Centre for Health Care Research, University of Groningen; 1993.
29 Kempen GIJM, Miedema I, Ormel J, Molenaar W. The assessment of disability with the Groningen Activity Restriction Scale: conceptual framework and psychometric properties. Soc Sci Med. 1996;43:1601–1610.
30 Stewart AL, Hays RD, Ware JE Jr. The MOS short-form general health survey: reliability and validity in a patient population. Med Care. 1988;26:724–735.
31 Stel VS, Smit JH, Pluijm SM, et al. Comparison of the LASA Physical Activity Questionnaire with a 7-day diary and pedometer. J Clin Epidemiol. 2004;57:252–258.
32 Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–310.
33 de Vet HC, Bouter LM, Bezemer PD, Beurskens AJ. Reproducibility and responsiveness of evaluative outcome measures: theoretical considerations illustrated by an empirical example. Int J Technol Assess Health Care. 2001;17:479–487.
34 Galindo-Ciocon DJ, Ciocon JO, Galindo DJ. Gait training and falls in the elderly. J Gerontol Nurs. 1995;21:10–17.
35 Hauer K, Specht N, Schuler M, et al. Intensive physical training in geriatric patients after severe falls and hip surgery. Age Ageing. 2002;31:49–57.
36 Wells G, Beaton D, Shea B, et al. Minimal clinically important differences: review of methods. J Rheumatol. 2001;28:406–412.
37 Schuck P, Zwingmann C. The ‘smallest real difference’ as a measure of sensitivity to change: a critical analysis. Int J Rehabil Res. 2003;26:85–91.
38 Reuben DB, Valle LA, Hays RD, Siu AL. Measuring physical function in community-dwelling older persons: a comparison of self-administered, interviewer-administered, and performance-based measures. J Am Geriatr Soc. 1995;43:17–23.
39 Raiche M, Hebert R, Prince F, Corriveau H. Screening older adults at risk of falling with the Tinetti balance scale. Lancet. 2000;356:1001–1002.
40 Meldrum D, Finn AM. An investigation of balance function in elderly subjects who have and have not fallen. Physiotherapy. 1993;79:839–842.
41 Chiu AY, Au-Yeung SS, Lo SK. A comparison of four functional tests in discriminating fallers from non-fallers in older people. Disabil Rehabil. 2003;25:45–50.

Appendix.
Performance-Oriented Mobility Assessment3

Balance
Instructions: The subject is seated on a hard, armless chair. The following maneuvers are tested.

Maneuver Description of Scoring

1. Sitting balance                          0=Leans or slides on the chair, unable to maintain an upright position
                                            1=Holds onto the chair to be able to sit upright
                                            2=Sits stably, upright, and safely on the chair
2. Arising                                  0=Unable to arise without help
                                            1=Able to arise but uses arms
                                            2=Able to arise in one smooth motion without using arms
3. Immediate standing balance (first 5 s)   0=Unsteady, marked staggering, moves feet, marked trunk sway, or grabs object for support
                                            1=Steady but uses walker or cane or mild staggering but catches self without grabbing object for support
                                            2=Steady without walker or cane or other support
4. Standing balance (after 5 s)             0=Unsteady
                                            1=Steady but wide stance or uses cane or other support
                                            2=Steady with narrow stance and without support
5. Standing balance with eyes closed        0=Unsteady
   and feet together                        1=Steady but wide stance or support is needed
                                            2=Steady with narrow stance and without support
6. Nudged (light push on sternum,           0=Begins to fall, needs support to prevent falling
   subject with feet close together)        1=Takes more than 2 steps backward to prevent falling
                                            2=Steady, takes fewer than 2 steps backward
7. Turning 360°                             0=Unstable, needs support
                                            1=Stable with discontinuous steps (places one foot first before lifting the other)
                                            2=Stable without support and with continuous steps
8. Sitting down                             0=Unsafe (misjudged distance, falls onto chair)
                                            1=Uses arms or not a smooth motion
                                            2=Safe, smooth motion
Total balance score                         0–16 points

Gait
Instructions: The subject stands with the examiner, walks down the hallway or room at the usual pace. The subject is asked to walk down the walkway,
turn, and walk back after being instructed to “go.” The subject should use the usual walking aid. The following characteristics are scored.

Characteristic Description of Scoring

1. Initiation of gait (immediately          0=Any hesitancy or multiple attempts to start
   after “go”)                              1=No hesitancy
2a. Step height                             0=Left foot does not clear floor completely with step or is lifted too high (above right medial malleolus)
                                            1=Left foot completely clears floor
                                            0=Right foot does not clear floor completely with step or is lifted too high (above left medial malleolus)
                                            1=Right foot completely clears floor
2b. Step length                             0=Left swing foot does not pass right stance foot with step
                                            1=Left foot passes right stance foot with step
                                            0=Right swing foot does not pass left stance foot with step
                                            1=Right foot passes left stance foot with step
3. Step symmetry                            0=Right and left step lengths are not equal
                                            1=Right and left step lengths appear equal
4. Step continuity                          0=Stopping or discontinuity between steps
                                            1=Steps appear continuous
5. Path deviation                           0=Marked deviation to both sides or in one direction
                                            1=Mild or moderate deviation or straight with a walking aid
                                            2=Straight without a walking aid
6. Trunk sway                               0=Marked sway or flexed knees or trunk or uses arms to maintain balance
                                            1=Stable trunk balance without sway, no flexion, no use of arms, and no use of walking aid
7. Walking stance                           0=Heels apart while walking
                                            1=Heels almost touching while walking
8. Turning while walking (180°)             0=Staggering, taking breaks, discontinuous motion
                                            1=Smooth, continuous motion
Total walking score                         0–12 points
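
The scoring scheme in this Appendix can be expressed as a small data structure in which each item has a maximum score and the subscale totals are sums of the item scores. The following sketch is illustrative only: the item keys, the function name, and the example ratings are hypothetical, and the maxima are taken from the scoring descriptions as listed above (the left and right foot are scored separately for step height and step length).

```python
# Maximum score per item, as listed in the Appendix (item keys are hypothetical).
BALANCE_MAX = {
    "sitting_balance": 2, "arising": 2, "immediate_standing_balance": 2,
    "standing_balance": 2, "standing_balance_eyes_closed": 2, "nudged": 2,
    "turning_360": 2, "sitting_down": 2,
}
GAIT_MAX = {
    "initiation_of_gait": 1,
    "step_height_left": 1, "step_height_right": 1,
    "step_length_left": 1, "step_length_right": 1,
    "step_symmetry": 1, "step_continuity": 1, "path_deviation": 2,
    "trunk_sway": 1, "walking_stance": 1, "turning_while_walking": 1,
}


def subscale_score(ratings, item_max):
    """Sum the item ratings after checking that every item is present and in range."""
    missing = set(item_max) - set(ratings)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    total = 0
    for item, maximum in item_max.items():
        value = ratings[item]
        if not 0 <= value <= maximum:
            raise ValueError(f"{item} must be between 0 and {maximum}, got {value}")
        total += value
    return total


# Hypothetical ratings for one subject: full marks except three items.
balance = dict(BALANCE_MAX, arising=1, nudged=1)
gait = dict(GAIT_MAX, path_deviation=1)

poma_b = subscale_score(balance, BALANCE_MAX)
poma_g = subscale_score(gait, GAIT_MAX)
poma_t = poma_b + poma_g
print(poma_b, poma_g, poma_t)  # 14 11 25
```

The item maxima sum to 16 for balance and 12 for gait, in agreement with the totals stated in the Appendix, so the total score ranges from 0 to 28.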
