Barthels and Katz

Arthritis & Rheumatism (Arthritis Care & Research)
Vol. 49, No. 5S, October 15, 2003, pp S15–S27

DOI 10.1002/art.11415
© 2003, American College of Rheumatology
MEASURES OF FUNCTION
Measures of Adult General Functional Status
The Barthel Index, Katz Index of Activities of Daily Living, Health Assessment
Questionnaire (HAQ), MACTAR Patient Preference Disability Questionnaire, and
Modified Health Assessment Questionnaire (MHAQ)
Patricia P. Katz for the Association of Rheumatology Health Professionals Outcomes Measures Task Force
Patricia P. Katz, PhD: Arthritis Research Group, University of California, San Francisco.

Address correspondence to Patricia Katz, PhD: Arthritis Research Group, University of California, San Francisco, 3333 California Street, Suite 270, San Francisco, CA 94143-0920. E-mail: pkatz@itsa.ucsf.edu.

Submitted for publication July 30, 2003; accepted July 30, 2003.

BARTHEL INDEX

General Description

Purpose. Measure functional independence and need for assistance in mobility and self-care (1). The Barthel Index was developed in a chronic hospital setting; it has generally not been adopted for use in community-based studies (2). Items were chosen to reflect the level of nursing care required. The item weightings are based on the level of nursing care required and social acceptability (3).

Content. Basic activities of daily living (e.g., feeding, transfer, hygiene). Items are rated in terms of whether individuals can perform activities independently, can perform with some assistance, or are dependent.

Developer/contact information. Dorothea Barthel, PT, BA.

Versions. Original, modified 10-item version (4), expanded 15-item version (5), and 5-item short form (6) in English. Other variants are also available. There is little consensus over which should be considered the definitive version (3), but the original and the 10-item and 15-item modifications are the most commonly used. It has been translated into Turkish and Japanese (self-rated version).

Number of items in scale. Ten in original and modified English version; 15 in an expanded version; 5 in short version. Other variants have different numbers of items. Ten in Turkish; 13 in Japanese.

Subscales. None.

Populations. Developmental/target. Rehabilitation patients with stroke and other neuromuscular or musculoskeletal disorders.

Other uses. Oncologic disorders.

WHO ICF Components. Impairment, Activity limitation.

Administration

Method. Data obtained from medical records, direct observation, or by interview. Can also be self-administered (7). One study suggests that the scale can be administered reliably over the telephone to subjects; data provided by proxy respondents were slightly less reliable (8).

Training. Self-explanatory.

Time to administer/complete. Completion by health professional takes <5 minutes. Self-administration takes <10 minutes.

Equipment needed. None.

Cost/availability. Original form available at http://www.medal.org/adocs/docs_ch37/doc_ch37.05.html#A37.05.03. Modified 10-item version available at: http://www.medal.org/adocs/docs_ch37/doc_ch37.05.html#A37.05.04.
Scoring

Responses. Scale. The scale depends on the version used. In the original version, functional status descriptions are provided, and the rater selects the appropriate description of functional ability for each category. Items are weighted according to the level of nursing care required. Weights range from 0–15. A modified scoring system has been suggested by Shah and colleagues (9) using a 5-level ordinal scale. In the modified 10-item version, functional categories may be scored from 0 to 1, 0 to 2, or 0 to 3, depending on the function. The 15-item version uses a 4-point response scale, and the 5-item version is scored 0 to 1, 0 to 2, or 0 to 3, depending on the function.

Score range. Original version 0–100. The score ranges for individual items are Feeding 0–10 (3 functional descriptions: unable, needs some help, independent; scored as 0, 5, or 10), Moving from wheelchair to bed/return 0–15, Personal hygiene 0–5, Toileting 0–10, Bathing 0–5, Walking on level surface 0–15, Ascend/descend stairs 0–10, Dressing 0–10, Controlling bowels 0–10, and Controlling bladder 0–10. In the 10-item modified version, the scores range from 0 to 20. The 15-item version has scores ranging from 0 to 100, and the 5-item version scores range from 0 to 20.

Interpretation of scores. Higher scores reflect greater independence. In the original version, a patient who scores 100 is continent; independent in feeding, dressing, getting in and out of bed, and bathing; can walk at least one block; and can ascend and descend stairs without help. Shah and colleagues (9) note that a score of 0–20 suggests total dependence, 21–60 severe dependence, 61–90 moderate dependence, and 91–99 slight dependence. In the 15-item version, a score of 60 is commonly considered to be the threshold score for marked dependence (10).

Method of scoring. Arithmetic computation by hand.

Time to score. Less than 5 minutes.

Training to score. Not reported.

Training to interpret. Not reported.

Norms available. No.

Psychometric Information

Reliability. In the original version, the interrater reliability for each item via weighted kappa ranged from 0.53 to 0.94 (11); the interrater reliability for the total score was intraclass correlation (ICC) = 0.94 (95% confidence interval [95% CI] 0.91–0.96) (11), and the internal consistency via Cronbach’s alpha was 0.87–0.92 (9). In the 5-item version, the test-retest reliability was 0.89 (12), interrater agreement was 0.99 (13), and Cronbach’s alpha 0.98 (13). In the 5-item version, Cronbach’s alpha was 0.88 (6).

Validity. Original version criterion-related. A comparison of self-report versus observed performance in 126 patients ≥75 years old (14) gave the following figures; values shown are kappa (95% CI; % exact agreement): Eating 0.34 (0.19–0.49; 69.8), Dressing 0.33 (0.20–0.47; 56.1), Hygiene 0.21 (0.06–0.37; 66.7), Bathing 0.34 (0.22–0.47; 50.8), Control bladder 0.32 (0.17–0.47; 65.9), Control bowel 0.10 (0.00–0.26; 73.0), Transfer bed 0.36 (0.00–0.71; 80.2), Transfer toilet 0.40 (0.21–0.58; 79.4), Walking 0.30 (0.15–0.44; 59.5), Stairs 0.37 (0.36–0.38; 51.6).

Original version construct validity. Correlation with Short Form-36 subscales is r = 0.22 (Role Emotional subscale) to 0.81 (Physical Functioning subscale) (15). Correlation with Nottingham Health Profile subscales: r = 0.189–0.840 depending on subscale (15). Correlation with Berg Balance Scale and Fugl-Meyer: r ≥ 0.78 (11). Correlation with PULSES: Pearson r = -0.61 to -0.80, depending on the point in time at which the measurement was taken (i.e., admission, discharge, and followup at 2 years); the scales are inverse to each other (12).

Original version, predictive validity. Correlation with Frenchay Activities Index at 180 days after stroke: r ≥ 0.59 (11).

Modified 10-item version validity. Wade and Hewer showed high concurrent validity (r = 0.73–0.77) with a measure of motor ability (16). Predictive validity: Barthel scores were predictive of 6-month mortality, hospital length of stay, and progress following stroke (10,17,18).

The 15-item version validity. There were high correlations (overall r = 0.91) between scores and performance of tasks and role performance (5), and high correlations with other measures of function (e.g., with Katz Index of Activities of Daily Living, r = 0.78; with PULSES profile, r = -0.74 to -0.90) (10,13,19,20). Scores were predictive of return to independent living after 6 months (21).
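The hand computation described under Scoring for the original version (ten weighted items summing to 0–100, read against the dependence bands of Shah and colleagues) can be sketched in a few lines. This is only an illustrative sketch: the item keys, the function names, and the label used for a score of 100 are assumptions made here, not part of the published instrument.

```python
# Maximum weight for each item of the original Barthel Index,
# taken from the item ranges listed under "Score range".
# The dictionary keys are hypothetical names chosen for this sketch.
BARTHEL_ITEM_MAX = {
    "feeding": 10,
    "transfer_wheelchair_bed": 15,
    "personal_hygiene": 5,
    "toileting": 10,
    "bathing": 5,
    "walking_level_surface": 15,
    "stairs": 10,
    "dressing": 10,
    "bowels": 10,
    "bladder": 10,
}


def barthel_total(item_scores):
    """Sum the ten item scores after checking each one against its range."""
    missing = set(BARTHEL_ITEM_MAX) - set(item_scores)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    total = 0
    for item, max_score in BARTHEL_ITEM_MAX.items():
        score = item_scores[item]
        if not 0 <= score <= max_score:
            raise ValueError(f"{item} must be between 0 and {max_score}")
        total += score
    return total


def shah_band(total):
    """Map a 0-100 total to the bands noted by Shah and colleagues (9)."""
    if not 0 <= total <= 100:
        raise ValueError("total must be between 0 and 100")
    if total <= 20:
        return "total dependence"
    if total <= 60:
        return "severe dependence"
    if total <= 90:
        return "moderate dependence"
    if total <= 99:
        return "slight dependence"
    # Assumption: the text describes a score of 100 as independent in the
    # rated activities; the short label below is chosen for this sketch.
    return "independent"


# Example: a patient at the maximum on every item except stairs and bathing.
example = dict(BARTHEL_ITEM_MAX, stairs=5, bathing=0)
print(barthel_total(example), shah_band(barthel_total(example)))
```

The sketch validates only the range of each item; the modified 10-item, 15-item, and 5-item versions use different response scales and would need their own tables.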
The 5-item short form validity. Concurrent validity correlation with original version, r = 0.90 (6).

Sensitivity/responsiveness to change. Studies have shown that the Barthel Index can detect change with effect sizes equivalent to those of the Functional Independence Measure (22). Among patients in a geriatric day hospital, the Barthel Index was not as sensitive as the London Handicap Scale (23).

Comments and Critique

The many variants of the index may produce confusion. Some authors have noted that interpreting the middle scoring categories may be difficult (3,4). The Barthel Index is not designed to detect low levels of disability. Individuals may receive the highest score and still require assistance with other activities (3). There is considerably more psychometric data available for the Barthel Index than for many other activities of daily living scales (3).

References

1. (Original) Mahoney FI, Barthel DW. Functional evaluation: the Barthel Index. MD State Med J 1965;14:61–5.
2. Spector WD. Functional disability scales. In: Spilker B, editor. Quality of life and pharmacoeconomics in clinical trials. 2nd edition. Philadelphia: Lippincott-Raven; 1996. p. 133–43.
3. McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. 2nd edition. New York: Oxford University Press; 1996. p. 63–7.
4. Collin C, Wade DT, Davies S, Horne V. The Barthel ADL Index: a reliability study. Int Disabil Stud 1988;10:61–3.
5. Fortinsky RH, Granger CV, Seltzer GB. The use of functional assessment in understanding home care needs. Med Care 1981;19:489–97.
6. Hobart JC, Thompson AJ. The five item Barthel index. J Neurol Neurosurg Psychiatry 2001;71:225–30.
7. McGinnis GE, Seward ML, DeJong G, Osberg JS. Program evaluation of physical medicine and rehabilitation departments using self-report Barthel. Arch Phys Med Rehabil 1986;14:61–5.
8. Korner-Bitensky N, Wood-Dauphinee S. Barthel Index information elicited over the telephone: is it reliable? Am J Phys Med Rehabil 1995;74:9–18.
9. Shah S, Vanclay F, Cooper B. Improving the sensitivity of the Barthel Index for stroke rehabilitation. J Clin Epidemiol 1989;42:703–9.
10. Granger CV, Sherwood CC, Greer DS. Functional status measures in a comprehensive stroke care program. Arch Phys Med Rehabil 1977;58:555–61.
11. Hsueh I-P, Lee M-M, Hsieh C-L. Psychometric characteristics of the Barthel Activities of Daily Living Index in stroke patients. J Formos Med Assoc 2001;100:526–32.
12. Granger CV, Albrecht GL, Hamilton BB. Outcome of comprehensive medical rehabilitation: measurement by PULSES Profile and the Barthel Index. Arch Phys Med Rehabil 1979;60:145–54.
13. Shinar D, Gross CR, Bronstein KS, Licara-Gehr EE, Eden DT, Cabrera AR, et al. Reliability of the Activities of Daily Living Scale and its use in telephone interview. Arch Phys Med Rehabil 1987;68:723–8.
14. Sinoff G, Ore L. The Barthel Activities of Daily Living Index: self-reporting versus actual performance in the old-old (≥75 years). J Am Geriatr Soc 1997;45:832–6.
15. Wilkinson PR, Wolfe CDA, Warburton FG, Rudd AG, Howard RS, Ross-Russell RW, et al. Longer term quality of life and outcome in stroke patients: is the Barthel index alone an adequate measure of outcome? Qual Health Care 1997;6:125–30.
16. Wade DT, Hewer RL. Functional abilities after stroke: measurement, natural history and prognosis. J Neurol Neurosurg Psychiatry 1987;50:177–82.
17. Wylie CM. Gauging the response of stroke patients to rehabilitation. J Am Geriatr Soc 1967;5:797–805.
18. Granger CV, Greer DS, Liset E, Coulombe J, O’Brien E. Measurement of outcomes of care for stroke patients. Stroke 1975;6:34–41.
19. Granger CV. Outcome of comprehensive medical rehabilitation: an analysis based upon the impairment, disability, and handicap model. Int Rehabil Med 1985;7:45–50.
20. Rockwood K, Stolee P, Fox RA. Use of goal attainment scaling in measuring clinically important change in the frail elderly. J Clin Epidemiol 1993;46:1113–8.
21. Granger CV, Hamilton BB, Gresham GE, Kramer AK. The Stroke Rehabilitation Outcome Study: Part II. Relative merits of the total Barthel Index Score and a four-item subscore in predicting patient outcomes. Arch Phys Med Rehabil 1989;70:100–3.
22. Van der Putten JJMF, Hobart JC, Freeman JA, Thompson AJ. Measuring change in disability after inpatient rehabilitation: comparison of the responsiveness of the Barthel Index and the Functional Independence Measure. J Neurol Neurosurg Psychiatry 1999;66:480–4.
23. Harwood RH, Ebrahim S. Measuring the outcomes of day hospital attendance: a comparison of the Barthel Index and London Handicap Scale. Clin Rehabil 2000;14:527–31.

KATZ INDEX OF INDEPENDENCE IN ACTIVITIES OF DAILY LIVING, OR INDEX OF ADL

General Description

Purpose. Measure independence in activities of daily living (ADL).
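As detailed under Scoring for this index, a person can be summarized either by a hierarchical letter grade (A through G, or Other) or by the simplified 0–6 count of ADLs in which the person is dependent. The following is a minimal sketch of both rules, assuming each ADL has already been reduced to dependent versus independent; the ADL labels and function names are hypothetical and chosen only for illustration.

```python
# The six ADLs covered by the index (labels are assumptions for this sketch).
ADLS = ("bathing", "dressing", "toileting", "transferring", "continence", "feeding")

# For grades C-F, the functions that must be among those lost; the grade
# applies when exactly one further function is also lost.
HIERARCHY = {
    "C": {"bathing"},
    "D": {"bathing", "dressing"},
    "E": {"bathing", "dressing", "toileting"},
    "F": {"bathing", "dressing", "toileting", "transferring"},
}


def katz_count(dependent):
    """Simplified 0-6 score: the number of ADLs in which the person is dependent."""
    deps = set(dependent)
    unknown = deps - set(ADLS)
    if unknown:
        raise ValueError(f"unknown ADLs: {sorted(unknown)}")
    return len(deps)


def katz_grade(dependent):
    """Original letter grade derived from the pattern of dependent ADLs."""
    deps = set(dependent)
    n = katz_count(deps)
    if n == 0:
        return "A"  # independent in all six functions
    if n == 1:
        return "B"  # independent in all but one function
    if n == len(ADLS):
        return "G"  # dependent in all six functions
    for grade, required in HIERARCHY.items():
        if n == len(required) + 1 and required <= deps:
            return grade
    return "Other"  # at least two dependencies that do not follow the hierarchy


print(katz_grade(["bathing", "feeding"]), katz_count(["bathing", "feeding"]))
print(katz_grade(["dressing", "feeding"]))
```

Patterns that break the expected order of loss, such as dependence in dressing and feeding but not bathing, fall into the Other category rather than being forced into a letter grade.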
Content. Basic activities of daily living (bathing, dressing, toileting, transfers, continence, and feeding). Katz et al noted that the loss of functional skills occurs in a specific order, with the most complex being lost first (1). The initial scoring method for this scale reflects this hierarchy of function.

Developer/contact information. Sidney Katz, MD.

Versions. Original.

Number of items in scale. There are 6, one for each ADL. Some research has suggested that continence should not be considered an ADL and should not be included in the scale (2).

Subscales. None.

Populations. Developmental/target. Older adults and individuals with chronic diseases.

Other uses. None.

WHO ICF Components. Impairment, Activity limitation.

Administration

Method. Observation.

Training. Observer must be trained to administer the scale.

Time to administer/complete. Not reported.

Equipment needed. None.

Cost/availability. Available at http://www.medal.org/adocs/docs_ch37/doc_ch37.05.html#A37.05.05.

Scoring

Responses. Scale. Each ADL is scored on a 3-point scale of independence. The Katz Index of ADL is a semi-Guttman scale, meaning that the scale items are ordered in terms of difficulty. The scoring reflects this, although some variation in the hierarchy of difficulty is allowed. Katz et al (1) reported that the function of 86% of persons evaluated was consistent with the hierarchy. The original scoring using the ADL hierarchy uses an 8-level ordinal scale where A = independent in feeding, continence, transferring, going to toilet, dressing, and bathing; B = independent in all but one of these functions; C = independent in all but bathing and one additional function; D = independent in all but bathing, dressing, and one additional function; E = independent in all but bathing, dressing, going to toilet, and one additional function; F = independent in all but bathing, dressing, going to toilet, transferring, and one additional function; G = dependent in all six functions; Other = dependent in at least two functions, but not classifiable as C, D, E, or F. Katz and Akpom (3) later proposed a simplified scoring system in which individuals are scored 0–6, reflecting the number of ADLs in which they are dependent.

Score range. Range is A–G, or 0–6.

Interpretation of scores. Scores reflect the specific ADLs, or number of ADLs, in which an individual is dependent. Higher (alphabetically or numerically) scores reflect greater dependence.

Method of scoring. For the alphabetic scale, independence in various combinations of ADLs determines the ordinal rank; for the numeric scale, add the number of ADLs in which the individual is dependent.

Time to score. Less than 5 minutes.

Training to score. Not reported.

Training to interpret. Not reported.

Norms available. Not reported.

Psychometric Information

Reliability. The interrater reliability is 0.95 or better after training (1,4). The coefficient of reproducibility (a measure of the internal consistency of an ordered measure) is 0.96–0.99 (5).

Validity. Construct validity. Scores on the Katz ADL Index are correlated with scores on the Barthel index (r = 0.78 [6], kappa = 0.77 [7]).

Predictive validity. Correlation with mobility dysfunction (0.50) and house confinement (0.39) among older adult patients 2 years later (8). Correlation between ADL dependency level and mortality among nursing home residents (4). In a comparison of outcome (home versus hospitalized/deceased) at one month post stroke between patients with grade A-B-C versus patients with grade D-E-F-G using 2 different hospital samples, the following were found (9): positive predictive value 94%, 96%; negative predictive value 92%,
96%; sensitivity 83%, 94%, and specificity 97%, 97%. In a comparison of outcome (survived versus deceased) 1 month after stroke between patients with grades A-B-C-D-E-F for survival versus patients with grade G, the following were found (9): positive predictive value 94%, 98%; negative predictive value 68%, 62%; sensitivity 84%, 86%, and specificity 86%, 96%.

Sensitivity/responsiveness to change. This scale has a significant floor effect, in that it is relatively insensitive to variations at low levels of disability (10).

Comments and Critique

The Katz Index of ADL is very widely used, in a wide variety of populations (10), although relatively little has been published on its reliability and validity. The Katz ADL scale is sensitive to environment; that is, different scores may be obtained for individuals in different settings or with different environmental modifications (11).

References

1. (Original) Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW. The Index of ADL: a standardized measure of biological and psychosocial function. JAMA 1963;185:914–9.
2. Jagger C, Clarke M, Davies RA. The elderly at home: indices of disability. J Epidemiol Community Health 1986;40:139–42.
3. Katz S, Akpom CA. A measure of primary sociobiological functions. Internat J Health Serv 1976;6:493–507.
4. Spector WD, Takada HA. Characteristics of nursing homes that affect resident outcomes. J Aging Health 1991;3:427–54.
5. Brorsson B, Asberg KH. Katz Index of Independence in ADL: reliability and validity in short term care. Scand J Rehabil Med 1984;16:125–32.
6. Rockwood K, Stolee P, Fox RA. Use of goal attainment scaling in measuring clinically important change in the frail elderly. J Clin Epidemiol 1993;46:1113–8.
7. Gresham GE, Phillips TF, Labi MLC. ADL status in stroke: relative merits of three standard indices. Arch Phys Med Rehabil 1980;61:355–8.
8. Katz S, Downs TD, Cash HR, Grotz RC. Progress in development of the Index of ADL. Gerontologist 1970;10:20–30.
9. Asberg KH, Nydevik I. Early prognosis of stroke outcome by means of Katz Index of Activities of Daily Living. Scand J Rehab Med 1991;23:187–91.
10. McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. 2nd edition. New York: Oxford University Press; 1996. p. 63–7.
11. Spector WD. Functional disability scales. In: Spilker B, editor. Quality of life and pharmacoeconomics in clinical trials. 2nd edition. Philadelphia: Lippincott-Raven; 1996. p. 133–43.

HEALTH ASSESSMENT QUESTIONNAIRE (HAQ)

General Description

Purpose. Although the original (“full”) HAQ covers 5 dimensions of health outcomes, the version most commonly used includes only the Disability Index, the Visual Analog Scale (VAS) Pain Scale, and the VAS Patient Global Assessment. This review will focus only on the Disability Index.

The HAQ Disability Index measures difficulty in performing activities of daily living. It is the most widely used functional measure in rheumatology. The HAQ was specifically developed for use among adults with arthritis, but it has since been used in a wide range of populations (1).

Content. Questions assessing difficulty over the past week in 20 specific functions, grouped into 8 categories: dressing and grooming, arising, eating, walking, personal hygiene, reaching, gripping, and other activities.

Developer/contact information. James F. Fries, MD, Division of Immunology and Rheumatology, Stanford University Medical Center, 1000 Welch Road, Suite 203, Palo Alto, CA 94304-1808.

Versions. Many adaptations and/or translations are available including English (US, Canada, Australia), Belgian Flemish and French, Canadian French, Chinese (Cantonese, Hong Kong), Danish, French, German, Spanish (US, Spain, many Central and South American countries), Swedish, and Turkish. (For complete listing, see Bruce and Fries [reference 2], p. 172.)

Number of items in scale. There are 20 items covering 8 categories. In addition, questions are asked about personal assistance or assistive aids or devices needed to perform the 20 functions.

Subscales. Dressing and grooming (2 items, dress yourself, including tying shoelaces and doing buttons, shampoo your hair); Arising (2 items, stand up straight from an armless straight chair, get in and out of bed); Eating (3 items, cut your meat, lift a full cup or glass to your mouth, open a new milk carton); Walking (2 items, walk outdoors on flat ground, climb up 5 steps); Personal hygiene (3 items, wash and dry your entire body, take a tub bath, get on and off the toilet); Reaching (2 items,
reach and get down a 5-pound object from just above your head, bend down to pick up clothing from the floor); Gripping (3 items, open car doors, open jars that have been previously opened, turn faucets on and off); Other activities (3 items, run errands and shop, get in and out of a car, do chores such as vacuuming or yardwork).

Populations. Developmental/target. Individuals with rheumatoid arthritis and osteoarthritis.

Other uses. Modifications have been made for other rheumatologic and non-rheumatologic conditions.

WHO ICF Components. Activity limitation.

Administration

Method. Interviewer (in person or by telephone) or self-administered.

Training. None required.

Time to administer/complete. Less than 10 minutes.

Equipment needed. None.

Cost/availability. Contact the author.

Scoring

Responses. Scale. Each item is rated from 0 to 3, with 0 = no difficulty, 1 = some difficulty, 2 = much difficulty, 3 = unable to do.

Score range. The highest score within a category is used as the category score. Dependence on physical assistance or equipment automatically raises the category score to 2. The HAQ score is calculated as the mean of the 8 category scores. Scores range from 0 to 3, in increments of 0.125. If fewer than 6 category scores are present, the overall score is not calculated.

Interpretation of scores. Higher scores reflect more limitation.

Method of scoring. Hand scored. Alternate methods of scoring have been developed (for example, scoring without taking use of assistance or aids into account [3] or using the mean category score instead of the highest score [4]), but these scoring methods have not gained wide use. Wolfe (5) suggests that even if alternative scoring methods are used, the traditional score should also be calculated in order to compare with published data.

Time to score. Less than 2 minutes.

Training to score. None.

Training to interpret. Not reported.

Norms available. Not reported.

Psychometric Information

Reliability. Test-retest correlations have ranged from 0.87 to 0.99 (2,6). In alternate forms, i.e., interviewer versus self-administered forms of the instrument (1), the Spearman rank correlation was Dressing 0.60, Arising 0.82, Eating 0.85, Walking 0.83, Hygiene 0.56, Reach 0.80, Grip 0.64, and Index 0.88.

Validity. Criterion validity. Fries et al (1) compared an interviewer-administered HAQ to observed performance and found an overall correlation of 0.88, with component scores ranging from 0.47 (arising) to 0.88 (walking). Daltroy et al (7) also found a high correlation (-0.72) between HAQ scores and a physical capacity measure.

Construct validity. Numerous studies have shown significant correlations of HAQ scores with clinical (e.g., joint count, grip strength) and laboratory (e.g., erythrocyte sedimentation rate [ESR]) measures, and other measures of function (e.g., Arthritis Impact Measurement Scales [AIMS], WOMAC) (2,6,8).

Predictive. HAQ scores are among the best predictors of long-term outcomes, including work disability, economic loss, and joint surgery, among people with rheumatoid arthritis (9). HAQ scores have been found to be better predictors of mortality than other patient self-report measures, laboratory, radiographic, or physical examination data (10).

Sensitivity/responsiveness to change. Liang et al (11) found that the HAQ was less responsive to change within subjects before and after hip or knee surgery than the Sickness Impact Profile (SIP), AIMS, and the Index of Well-Being (IWB). However, Hawley and Wolfe (12) showed that the HAQ may be more sensitive to detecting change than traditional measures of disease activity and disease-related impairments (e.g., ESR, grip strength). Ziebland et al (13) reported that the modified HAQ (MHAQ) was more responsive to
change than pre-post differences in HAQ. They attributed this to the MHAQ’s use of “transition” questions. On the other hand, Stucki et al (14) found that changes in clinical and laboratory parameters were associated with improved, unchanged, or worse HAQ scores, but this association was not noted for the MHAQ. Wolfe (5) also found the HAQ better at detecting treatment change than the MHAQ. HAQ may also be more sensitive to detecting change in the middle of the scale rather than at the ends of the scale (15).

Comments and Critique

The HAQ is probably the most widely used instrument in rheumatology. Available evidence indicates high reliability and validity. However, the HAQ may be less responsive to change than other instruments, especially among individuals with very low or very high levels of disability.

References

1. (Original) Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of patient outcome in arthritis. Arthritis Rheum 1980;23:137–45.
2. Bruce B, Fries JF. The Stanford Health Assessment Questionnaire: a review of its history, issues, progress, and documentation. J Rheumatol 2003;30:167–78.
3. Van der Heide A, Jacobs JWG, van Albada-Kuipers GA, Kraaimaat FW, Geenan R, Bijlsma JWJ. Self report functional disability scores and the use of devices: two distinct aspects of physical function in rheumatoid arthritis. Ann Rheum Dis 1993;52:497–502.
4. Tomlin GS, Holm MG, Rogers JC, Kwoh CK. Comparison of standard and alternate Health Assessment Questionnaire scoring procedures for documenting functional outcomes in patients with rheumatoid arthritis. J Rheumatol 1996;23:1524–30.
5. Wolfe F. Which HAQ is best? A comparison of the HAQ, MHAQ and RA-HAQ, a Difficult 8-Item HAQ (DHAQ), and a Rescored 20 Item HAQ (HAQ20): analyses in 2,491 rheumatoid arthritis patients following leflunomide initiation. J Rheumatol 2001;28:982–9.
6. McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. 2nd edition. New York: Oxford University Press; 1996. p. 106–15.
7. Daltroy LH, Larson MG, Eaton HM, Phillips CB, Liang MH. Discrepancies between self-reported and observed physical function in the elderly: the influence of response shift and other factors. Soc Sci Med 1999;48:1549–61.
8. Ramey DR, Fries JF, Singh G. The Health Assessment Questionnaire 1995: status and review. In: Spilker B, editor. Quality of life and pharmacoeconomics in clinical trials. 2nd edition. Philadelphia: Lippincott-Raven; 1996. p. 227–37.
9. Wolfe F. A reappraisal of HAQ disability in rheumatoid arthritis. Arthritis Rheum 2000;43:2751–61.
10. Wolfe F, Michaud K, Gefeller O, Choi HK. Predicting mortality in patients with rheumatoid arthritis. Arthritis Rheum 2003;48:1530–42.
11. Liang MH, Larson MG, Cullen KE, Schwartz JA. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum 1985;28:542–7.
12. Hawley DJ, Wolfe F. Sensitivity to change of the Health Assessment Questionnaire (HAQ) and other clinical and health status measures in rheumatoid arthritis: results of short-term clinical trials and observational studies versus long-term observational studies. Arthritis Care Res 1992;5:130–6.
13. Ziebland S, Fitzpatrick R, Jenkinson C, Mowat A, Mowat A. Comparison of two approaches to measuring change in health status in rheumatoid arthritis: the Health Assessment Questionnaire (HAQ) and modified HAQ. Ann Rheum Dis 1992;51:1202–5.
14. Stucki G, Stucki S, Bruhlmann P, Michel BA. Ceiling effects of the Health Assessment Questionnaire and its modified version in some ambulatory rheumatoid patients. Ann Rheum Dis 1995;54:461–5.
15. Daltroy LH. Common problems in using, modifying, and reporting on classic measurement instruments. Arthritis Care Res 1997;10:441–7.

MACTAR PATIENT PREFERENCE DISABILITY QUESTIONNAIRE

General Description

Purpose. To assess disability in patients with rheumatoid arthritis (RA), focusing only on those activities affected by RA and judged to be important to the patient.

Content. Patients identify the 5 specific activities in which they would most like to have improvement. Questions are provided for baseline and followup assessments. Followup assessments focus on changes in ability to perform the activities identified at baseline. The scale is scored by assessing changes in the ability to perform these activities from baseline to followup.

Developer/contact information. Peter Tugwell, MD and colleagues, Center for Global Health Institute of Population Health, 1 Stewart St, Rm 312, Ottawa, ON K1N 6L5, Canada. E-mail: elacasse@uottawa.ca.

Versions. Original.

Number of items in scale. The baseline interview includes 5 questions intended to elicit activities that have been affected by arthritis.
Additional questions are used to query the Training to score. Not reported.
patient’s priorities for improvement. The number
of additional questions depends on how many Training to interpret. Not reported.
activities are ranked for priority; 5 is the usual
number. The followup interview assesses changes Norms available. Not reported.
in each of the priority activities.
Psychometric Information
Subscales. None.
Reliability. Interrater reliability: ICC ⫽ 0.78 (3).
Populations. Developmental/target. Individuals
with rheumatic conditions. Validity. In assessing concurrent validity,
Tugwell et al (4) noted significant correlations of
Other uses. None. MACTAR scores with other end-points: McMaster
Health Index Questionnaire, Physical subscale 0.53, Physician’s global assessment r = 0.52, Lee Functional Index r = 0.50, Joint pain/tenderness count r = −0.38, Grip strength r = 0.33.
Verhoeven et al (2) examined the correlation of MACTAR scores (using the weighted summary scores) with other functional and outcome measures. HAQ r = −0.66, AIMS mobility scale r = 0.52, Grip strength r = 0.43, Patient’s global assessment r = −0.57.

Sensitivity/responsiveness to change. The MACTAR has generally been found to be highly responsive to change (2–4). Tugwell et al (4) attribute the high degree of responsiveness to the use of “transition” questions (“have you noticed a change in your ability to . . .?”) rather than pre-post changes in “single-state” questions, and to the tailoring of items to patients. Wright and Young (3) noted that, although the MACTAR was a highly responsive scale, its use of a “transition” question may reduce variability and elevate the score, and limiting individuals to 5 items (activities) may exaggerate improvement because problems not listed do not change.

Comments and Critique
The unique aspect of the MACTAR is its focus on patient preferences. Tugwell and others (1,2,4,5) have noted that standard functional status questions often do not include activities that are important to patients, and that the activities that are included in standard functional status measures are often not considered to be important by patients.

WHO ICF Components. Activity limitation, Participation restriction.

Administration
Method. Interviewer-administered.

Training. Interviewers need to be trained to administer the interview.

Time to administer/complete. Time is 10–15 minutes (1,2).

Equipment needed. None.

Cost/availability. Available in original reference (1). Copy available at the Arthritis Care & Research Web site at http://www.interscience.wiley.com/jpages/0004-3591:1/suppmat/index.html.

Scoring
Responses. Scale. Based on responses to followup interview: (“Have you noticed any change in your ability to ___?” [If yes], “Has your ability to ___ improved or become worse?”) Worse = −1; no change = 0; better = 1.

Score range. A summary score can be created by weighting each change score according to its priority ranking, with the highest ranked activity’s change score multiplied by 5, and the lowest ranked multiplied by 1. The formula is Σ([6 − rank] × change score).
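
The weighted MACTAR summary score described under Score range (each change score multiplied by 6 minus its priority rank, then summed) can be illustrated with a minimal Python sketch. The function name, the dictionary input, and the strict validation of ranks 1 to 5 are assumptions made for illustration; the source notes that the number of ranked activities can vary, and this sketch covers only the usual 5-activity case.

```python
def mactar_summary_score(change_by_rank):
    """Weighted MACTAR summary score: sum of (6 - rank) * change score.

    change_by_rank maps a priority rank (1 = highest priority, 5 = lowest)
    to the change score at followup: -1 = worse, 0 = no change, 1 = better.
    The name and input shape are hypothetical, chosen for this example.
    """
    total = 0
    for rank, change in change_by_rank.items():
        if rank not in (1, 2, 3, 4, 5):
            raise ValueError(f"rank must be 1-5, got {rank!r}")
        if change not in (-1, 0, 1):
            raise ValueError(f"change score must be -1, 0, or 1, got {change!r}")
        # Rank 1 gets weight 5, rank 5 gets weight 1.
        total += (6 - rank) * change
    return total


# Example: top two activities improved, third unchanged,
# fourth worsened, fifth improved: 5 + 4 + 0 - 2 + 1
example = mactar_summary_score({1: 1, 2: 1, 3: 0, 4: -1, 5: 1})
print(example)
```

With five ranked activities the score under this formula can range from −15 (all worse) to 15 (all better), which is consistent with the interpretation that positive scores reflect improvement and negative scores reflect worsening.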
Interpretation of scores. Scores are interpreted to represent change for each individual over time. If the summary score is used, higher positive scores reflect improvement; negative scores reflect worsening.

Method of scoring. Hand-scored.

Time to score. Not reported.

Limitations of the MACTAR include its unique method of calculating change and a lack of knowledge of the reliability and stability of patient preferences during a stable functional period (6). The latter issue may be particularly relevant to longer-term followup, i.e., activities selected at baseline may become less important as the pattern of disability changes (2). Some authors have noted that evaluating each patient according to different
criteria (e.g., different activities) may also be problematic (7); the developers of the scale argued that using different items to assess functional change was no more problematic than a situation in which individuals may obtain the same overall score on a standard functional status instrument by exhibiting problems with different activities (1).
Verhoeven et al (2) noted that the scoring system was complex and required amendments. Although the MACTAR is a valid and highly responsive instrument, the complexity of administration and scoring may limit the feasibility of its use for clinical trials or other aggregate situations (2). Its strength may be in monitoring change in function for individual patients over relatively short periods of time.

References
1. (Original) Tugwell P, Bombardier C, Buchanan W, Goldsmith CH, Grace E, Hanna B. The MACTAR Patient Preference Disability Questionnaire: an individualized functional priority approach for assessing improvement in clinical trials in rheumatoid arthritis. J Rheumatol 1987;14:446–51.
2. Verhoeven AC, Boers M, van der Linden S. Validity of the MACTAR Questionnaire as a functional index in a rheumatoid arthritis clinical trial. J Rheumatol 2000;27:2801–9.
3. Wright JG, Young NL. A comparison of different indices of responsiveness. J Clin Epidemiol 1997;50:239–46.
4. Tugwell P, Bombardier C, Buchanan WW, Goldsmith C, Grace E, Bennett KJ, et al. Methotrexate in rheumatoid arthritis: impact on quality of life assessed by traditional standard-item and individualized patient preference health status questionnaires. Arch Intern Med 1990;150:59–62.
5. Hewlett S, Smith AP, Kirwan JR. Values for function in rheumatoid arthritis: patients, professionals, and public. Ann Rheum Dis 2001;60:928–33.
6. Bell MJ, Bombardier C, Tugwell P. Measurement of functional status, quality of life, and utility in rheumatoid arthritis. Arthritis Rheum 1990;33:591–601.
7. Karlson EW, Katz JN, Liang MH. Chronic rheumatic disorders. In: Spilker B, editor. Quality of life and pharmacoeconomics in clinical trials. 2nd edition. Philadelphia: Lippincott-Raven; 1996. p. 1029–37.

MODIFIED HEALTH ASSESSMENT QUESTIONNAIRE (MHAQ)

General Description
Purpose. The MHAQ is a modification of the Health Assessment Questionnaire (HAQ).

Content. The number of specific activities queried is reduced from 20 to 8 (one item is used from each of the 8 categories covered in the HAQ). The MHAQ has subscales that assess degree of difficulty, satisfaction with function, change in function over the past 6 months, and perceived need for help with each activity, although the degree of difficulty items are the most commonly used.

Developer/contact information. Theodore Pincus, MD, Division of Rheumatology, Vanderbilt University, 203 Oxford House, Box 5, Nashville, TN 37232-4500. E-mail: t.pincus@vanderbilt.edu.

Versions. Original.

Number of items in scale. There are 8 items (dressing, arising, eating, walking, hygiene, reaching, gripping, getting in and out of car) repeated in each of 4 subscales.

Subscales. Difficulty, satisfaction, change in function, need for help.

Populations. Developmental/target. Individuals with rheumatic conditions.

Other uses. None.

WHO ICF Components. Activity limitation.

Administration
Method. Self-report.

Training. Not reported.

Time to administer/complete. Less than 5 minutes.

Equipment needed. None.

Cost/availability. Available in original reference (1).

Scoring
Responses. Scale. For Difficulty (“Are you able to . . .?”), the scale is 0 = Without any difficulty, 1 = With some difficulty, 2 = With much difficulty, 3 = Unable to do. Any positive response regarding help or assistive devices raises the score to 2. For Satisfaction (“How satisfied are you with your ability to . . .?”), 0 = Satisfied and 1 = Dissatisfied. Change in difficulty (“Compared to 6 months ago, how difficult is it NOW (this week) to . . . ?”), 0 = Less difficult now, 1 = No change, and 2 = More difficult now. Need for help
(“Do you need help to . . .?”), 0 = Do not need help, and 1 = Need help.

Score range. Scale scores are the mean of the scores on the 8 items within the scale. Difficulty 0–3; Satisfaction 0–1; Change in function 0–2; Need for help 0–1.

Interpretation of scores. Higher scores on all scales are more negative (i.e., reflect more difficulty, less satisfaction, function more difficult now than previously, and need for more help).

Method of scoring. Arithmetic calculation by hand.

Time to score. Less than 5 minutes.

Sensitivity/responsiveness to change. Difficulty scale. Blalock and colleagues (2) suggest that the MHAQ is relatively insensitive to low levels of disability, and, because of its restricted range and skewed distribution, should be used with caution when the intent is to assess functional change. Stucki and colleagues (4) and Wolfe (6) also noted clustering of scores at the low end of the scale (Stucki at scores <0.3; Wolfe at scores ≤1.0).

Change in Difficulty scale. Ziebland and colleagues (7) found that the MHAQ change in difficulty scale was more sensitive to changes in clinical variables (i.e., correlated more highly with variables such as grip strength, pain, morning stiffness, and ESR) than a pre-post difference in the traditional HAQ score.
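
The MHAQ scale scores described under Score range (the mean of the 8 item scores within a scale) can be illustrated with a short Python sketch. The item keys, the scale names, and the choice to reject incomplete or out-of-range responses are assumptions made for this example; the source does not describe how missing items are handled.

```python
from statistics import mean

# Item keys and scale names are hypothetical labels for this sketch.
MHAQ_ITEMS = ("dressing", "arising", "eating", "walking",
              "hygiene", "reaching", "gripping", "car")

# Maximum item response per scale, taken from the score ranges in the text.
MHAQ_MAX_RESPONSE = {"difficulty": 3, "satisfaction": 1, "change": 2, "help": 1}


def mhaq_scale_score(scale, responses):
    """Return the mean of the 8 item responses for one MHAQ scale.

    responses maps each item key to an integer response.
    Higher scores are more negative on every scale.
    """
    max_response = MHAQ_MAX_RESPONSE[scale]
    if set(responses) != set(MHAQ_ITEMS):
        # Assumption: all 8 items are required; the source is silent on this.
        raise ValueError("responses must cover exactly the 8 MHAQ items")
    for item, value in responses.items():
        if value not in range(max_response + 1):
            raise ValueError(f"{item}: response {value!r} outside 0-{max_response}")
    return mean(responses.values())


difficulty = {"dressing": 0, "arising": 1, "eating": 0, "walking": 1,
              "hygiene": 0, "reaching": 2, "gripping": 1, "car": 1}
print(mhaq_scale_score("difficulty", difficulty))  # 6 / 8 = 0.75
```

Because each scale score is a mean, it stays within the same range as its item responses (for example, 0–3 for Difficulty).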
Training to score. Not reported.

Training to interpret. Not reported.

Norms available. Not reported.

Psychometric Information
Reliability. For the Difficulty scale, the test-retest reliability at one month = 0.91 (1).

Validity. Concurrent validity for the Difficulty scale. Concurrent validity showed the following correlation with HAQ subscale scores (1): Dressing 0.75, Arising 0.71, Eating 0.75, Walking 0.74, Hygiene 0.79, Reaching 0.82, Gripping 0.76, and In/Out Car 0.84. Blalock and colleagues (2) also examined the equivalency of the HAQ with the MHAQ, and found that although the scores were highly correlated, the MHAQ scores were consistently and significantly lower (indicating better function) than the HAQ score. In every category, HAQ items chosen for the MHAQ had a lower mean than the MHAQ-excluded items.

Construct validity for the Difficulty Scale. Although Pincus et al (3) reported significant correlations of MHAQ scores with clinical (e.g., joint count, radiographic measures) and laboratory (e.g., ESR) measures, Stucki et al (4) found that MHAQ scores were not correlated with changes in laboratory or clinical measures. Arvidson et al (5) reported that MHAQ scores were not correlated with radiographic evidence of joint damage, but were correlated with performance measures (e.g., walk test, grip strength).

Construct validity for Dissatisfaction With Function scale. Scores were incrementally greater (more dissatisfied) as difficulty in function increased (1).

Comments and Critique
The majority of psychometric analysis of the MHAQ has focused on the difficulty subscale, and has generally found that it appears to be less psychometrically sound than the HAQ. Blalock and colleagues (2) noted that scores on the MHAQ were consistently lower than those on the HAQ. Mean differences on the overall difficulty score were 0.67 lower using HAQ scores calculated with adjustment for help and/or assistive devices, and 0.52 lower using HAQ scores without such adjustments. The MHAQ does not make adjustments for use of help or assistive devices. Blalock also noted that while HAQ scores were normally distributed across the scale’s full possible range (0–3), MHAQ scores were not normally distributed and ranged only from 0 to 1.75. Similar findings were also noted by Stucki et al (4) and by Wolfe (6). There are conflicting reports about correlations between MHAQ scores and clinical and laboratory variables. Wolfe concluded that the advantages in length of the MHAQ over the HAQ were offset by loss of sensitivity and responsiveness to change (6).
On the other hand, there is some evidence that the change in difficulty scale may be more sensitive to changes in clinical variables than a pre-post change score calculated from the HAQ. This finding is consistent with findings of high responsiveness using the MACTAR’s “transition” questions.

References
1. (Original) Pincus T, Summey JA, Soraci SA, Wallston KA, Hummon NP. Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum 1983;26:1346–53.
2. Blalock SJ, Sauter SVH, DeVellis RF. The Modified Health Assessment Questionnaire difficulty scale: a health status measure revisited. Arthritis Care Res 1990;3:182–8.
3. Pincus T, Callahan LF, Brooks RH, Fuchs HA, Olsen NJ, Kaye JJ. Self-report questionnaire scores in rheumatoid arthritis compared with traditional physical, radiographic, and laboratory measures. Ann Intern Med 1989;110:259–66.
4. Stucki G, Stucki S, Bruhlmann P, Michel BA. Ceiling effects of the Health Assessment Questionnaire and its modified version in some ambulatory rheumatoid arthritis patients. Ann Rheum Dis 1995;54:461–5.
5. Arvidson NG, Larsson A, Larsen A. Simple function tests, but not the modified HAQ, correlate with radiological joint damage in rheumatoid arthritis. Scand J Rheumatol 2002;31:146–50.
6. Wolfe F. Which HAQ is best? A comparison of the HAQ, MHAQ and RA-HAQ, a Difficult 8 Item HAQ (DHAQ), and a Rescored 20 Item HAQ (HAQ20): analyses in 2491 rheumatoid arthritis patients following leflunomide initiation. J Rheumatol 2001;28:982–9.
7. Ziebland S, Fitzpatrick R, Jenkinson C, Mowat A, Mowat A. Comparison of two approaches to measuring change in health status in rheumatoid arthritis: the Health Assessment Questionnaire (HAQ) and modified HAQ. Ann Rheum Dis 1992;51:1202–5.

Acknowledgments
Members of the Association of Rheumatology Health Professionals Outcomes Measures Task Force are Patricia P. Katz, PhD, (Chairman), Karen W. Hayes, PT, PhD, John E. Hewett, PhD, Carol Oatis, PT, PhD, Janet L. Poole, PhD, OTR/L, Elizabeth A. Schlenk, PhD, RN, Christina H. Stenström, PhD, RPT and Janalee Taylor, MSN, RN.
Summary Table of Adult Functional Status Measures*

Barthel Index
Content: Functional independence and need for assistance in self-care/basic activities of daily living
No. of items: Original: 10. Modifications: 5, 10, 15
Response format: Unable, needs help, independent (with some variation)
Method of administration: Observer, from medical records or interview (in-person or telephone), self-administered
Time for administration: Observer: Less than 5 minutes. Self: Less than 10 minutes
Primary scale outputs: Overall score of dependence in ADL
Validated populations: Rehabilitation patients with stroke and other neuromuscular or musculoskeletal disorders
Psychometric properties: Reliability: Excellent. Validity: Excellent. Responsiveness: Moderate

Katz Index of ADL
Content: Independence in ADL
No. of items: 6
Response format: 3-point scale of independence
Method of administration: Observer scored
Time for administration: Not reported
Primary scale outputs: Overall score of dependence in ADL. Scored from A–G, or from 0–6
Validated populations: Older adults and individuals with chronic diseases
Psychometric properties: Reliability: Little work. Validity: Predictive validity is excellent. Responsiveness: Unknown

Health Assessment Questionnaire (HAQ)
Content: Difficulty in performing activities of daily living
No. of items: 20 items (activities) over 8 categories, plus queries about use of help or aids
Response format: 0–3; 0 = no difficulty, 1 = some difficulty, 2 = much difficulty, 3 = unable to do
Method of administration: Interviewer or self-administered
Time for administration: Less than 10 minutes
Primary scale outputs: Overall score from 0–3 (mean of category scores)
Validated populations: Individuals with rheumatoid arthritis and osteoarthritis. Also used in other rheumatic and non-rheumatic conditions
Psychometric properties: Reliability: Excellent. Validity: Excellent. Responsiveness: Moderate

MACTAR Patient Preference Disability Questionnaire
Content: Disability, focusing only on activities judged to be important by the patient
No. of items: Baseline: 5 questions to elicit activities. Additional items to rank priority. Number can vary, usually 5. Followup: assesses changes in each priority activity
Response format: At followup: Worse, no change, better
Method of administration: Interviewer
Time for administration: Less than 10 minutes
Primary scale outputs: Summary score of changes in priority activities
Validated populations: Individuals with arthritis, rheumatic conditions
Psychometric properties: Reliability: Acceptable. Validity: Excellent. Responsiveness: Excellent

Modified Health Assessment Questionnaire (MHAQ)
Content: Degree of difficulty, satisfaction with function, changes in function over past 6 months, and need for help in activities of daily living
No. of items: 8 items (activities) within each subscale. Items are a subset of HAQ items
Response format: Difficulty (0–3): 0 = no difficulty, 1 = some difficulty, 2 = much difficulty, 3 = unable to do. Satisfaction (0–1): 0 = dissatisfied, 1 = satisfied. Change in difficulty (0–2): 0 = less difficult now, 1 = no change, 2 = more difficult now. Need for help (0–1): 0 = do not need help, 1 = need help
Method of administration: Interviewer or self-administered
Time for administration: Less than 10 minutes
Primary scale outputs: Scores for difficulty, satisfaction, change in difficulty, need for help. Difficulty score most commonly reported
Validated populations: Individuals with rheumatoid arthritis
Psychometric properties: Reliability: Difficulty: Excellent. Validity: Difficulty: Moderate. Responsiveness: Difficulty: Moderate

* ADL = activities of daily living.

Content of Adult Functional Status Measures*
Barthel Katz HAQ MHAQ
Impairments
Continence + +
Basic ADL-Mobility
Bed Activities + +
Standing + +
Transfers + + +
Ambulation + + +
Inclines/Stairs + +
Basic ADL-Personal Care
Bathing + + + +
Toileting + + +
Grooming + +
Dressing + + + +
Feeding + + + +
Hand Functions + +
Instrumental ADL
Home chores +
* + = has item(s) related to this activity or concept. Content of MACTAR is dependent on patient-identified activities. For another comparative presentation of the content of functional status measures, see Stewart AL, Painter PL. Issues in measuring physical functioning and disability in arthritis patients. Arthritis Care Res 1997;10:395–405. HAQ = Health Assessment Questionnaire; MHAQ = Modified Health Assessment Questionnaire; ADL = activities of daily living.
