
Screening Performance of the 15-Item Geriatric Depression Scale in a Diverse Elderly Home Care Population
Linda G. Marc, Sc.D., M.P.H., M.S., Patrick J. Raue, Ph.D.,
Martha L. Bruce, Ph.D., M.P.H.

Objective: To empirically evaluate the psychometric properties of the 15-item Geriatric Depression Scale (GDS-15); determine the optimal cutoff points and screening performance for the detection of major depression; and examine differential item functioning (DIF) to determine the variability of item responses across sociodemographics in an elderly home care population. Design: A secondary analysis of data collected from a random sample study. Setting: Homebound subjects newly admitted over a 2-year period to a large visiting nurse service agency in Westchester, New York. Participants: Five hundred twenty-six subjects over age 65, newly admitted to home care for skilled nursing. Measurements: Major depression was diagnosed using both the patient Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, and best estimate procedures. Self-report measures included the GDS-15, activities of daily living (ADL), instrumental ADL, and pain intensity. Cognitive impairment was assessed using the Mini-Mental State Examination and medical morbidity using the Charlson Comorbidity Index. Results: The optimal cutoff (5) yielded a sensitivity of 71.8% and a specificity of 78.2%; however, the accuracy of the GDS-15 was not influenced by severity of medical burden. Persons with a cluster of ailments were more than twice as likely (Adj odds ratio = 2.47; 95% confidence interval = 1.49–4.09) to be diagnosed with depression. DIF analyses revealed no variability of item responses across sociodemographics. Conclusion: Main findings suggest that the accuracy of the GDS-15 was not influenced by severity of clinical or functional factors, or sociodemographics. This has broad implications suggesting that the very old, ill, and diverse populations can be appropriately screened for depression using the GDS-15. (Am J Geriatr Psychiatry 2008; 16:914–921)

Key Words: Depression, differential item functioning, home care, optimal cutoff value,
sensitivity, specificity

Received November 20, 2007; revised July 9, 2008; accepted July 10, 2008. From the Department of Psychiatry, Weill Medical College of Cornell
University, Westchester, NY; and the Center for Multicultural Mental Health Research, Cambridge Health Alliance-Harvard Medical, Somerville,
MA. Send correspondence and reprint requests to Linda Marc, Sc.D., M.P.H., M.S., 120 Beacon Street, 4th Floor, Center for Multicultural Mental
Health Research, Cambridge Health Alliance-Harvard Medical, Somerville, MA 02143. e-mail: linda.marc@post.harvard.edu
© 2008 American Association for Geriatric Psychiatry

914 Am J Geriatr Psychiatry 16:11, November 2008


Marc et al.

The assessment of depression in elderly home care patients is essential for determining the magnitude and nature of depression¹; however, in clinical practice, where time is at a premium, diagnostic instruments like the Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (SCID)² are not routinely used in the home care setting. Although screening for depression is part of the comprehensive assessment of home care patients, there is no information on the validity of standardized screens relative to diagnostic assessment in such populations. This study examines the sensitivity and specificity of the 15-item Geriatric Depression Scale (GDS-15)³ compared with the SCID, a gold standard assessment.

More than 20 years ago the 30-item GDS was developed as a self-report instrument to screen for clinical depression among the elderly.⁴ The instrument excludes certain somatic symptoms, which might be due to medical illness, and makes use of a simple response format (yes/no, rated 1 or 0), which facilitates use by individuals with impaired cognitive functions. The endorsed items are then totaled, generating a score from which patients are classified as depressed or nondepressed. The development, validation, and factor structure of the shorter GDS-15 have been described elsewhere,³ and the scale has been evaluated in a variety of inpatient, outpatient, primary care, and nursing home populations.⁵ Although the short form is more practical for use among the elderly, its administration to home care patients burdened with poor medical and functional status has not been reported on. Furthermore, its validity, reliability, sensitivity, and specificity compared with a gold standard have not yet been examined in homebound patients.

Although the use of the GDS score assumes unidimensionality (a single underlying construct of depressive symptoms) and no item-level bias, the effect of independent factors (i.e., age, educational attainment, gender, and race/ethnicity) on the measurement properties of the GDS in homebound patients is unknown. Although there have been reports that the instrument performs poorly in the "old-old,"⁶ and among persons with low or no formal level of education,⁷,⁸ Tang et al.⁹ found no differential item functioning (DIF) across age or education in an elderly population. To our knowledge, there have been no reports on item bias due to the effect of gender or race/ethnicity on the measurement properties of the GDS. To further examine the effects of these variables on the properties of the GDS-15 in a homebound population, we will employ DIF analysis to examine the degree to which items that constitute the scale are systematically related to these independent factors.¹⁰,¹¹ As an example, item difficulty bias can be determined across gender if we investigate whether women, compared with men, more frequently respond higher on certain items, after matching the subgroups on level of depression (usually the total scale score).¹² Item discrimination bias is determined by evaluating whether the item difficulty bias increases or decreases as a function of the level of depressive symptoms (the underlying construct). Drawing a parallel from the field of epidemiology to the field of psychometrics, the evaluations of item difficulty (uniform DIF) and item discrimination (nonuniform DIF) are analogous to confounding and effect modification, respectively (Personal Communication, Crane P, 2003).

Thus, the primary aim of this study was to empirically evaluate the psychometric properties of the GDS-15 in an elderly home healthcare population, determine the optimal cutoff points and screening performance for the detection of major depression, and examine the effect of age, level of education, gender, and race/ethnicity on the measurement properties of the scale. As distress associated with medical illness and disability in an elderly homebound population may confound the ability of this tool to correctly detect or recognize depression, we hypothesize that the sensitivity and specificity of this instrument will be influenced by severity of medical burden, impaired functioning, and cognitive impairment. Finally, as there have been no previous reports to date on item bias analyses in the GDS-15 administered to elderly homebound patients, we propose exploratory DIF analyses herein.

METHODS

The study received full review and approval from the Institutional Review Board of Weill Medical College of Cornell University. All study participants were provided an informed consent for signature.




Participants

This was a secondary analysis of data collected from a random sample study including 526 subjects aged 65 and older, newly admitted over a 2-year period (December 1997 to December 1999) to a large visiting nurse service agency in Westchester, NY. As the validity of the 15-item GDS in cognitively impaired elderly subjects has been questioned, study patients who scored <18 on the Mini-Mental State Examination (MMSE) were excluded from the psychometric analyses.¹³

The sampling strategy to recruit a representative sample of agency patients has been described elsewhere.¹⁴ The original study including 539 patients was designed to report on the distribution, course, and outcomes of Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM–IV) major depression in elderly patients receiving home care for medical and surgical problems. Selected participants were interviewed by bachelor's- and master's-level research associates 2 weeks after admission.

Study subjects had a mean age of 78.3 (SD ± 7.5) years, and were predominantly women (N = 351 of 539; 65.1%). Overall demographic characteristics were diverse across ethnicity, education, marital and poverty status, and very similar to reported national statistics of home care patients.¹⁵ Ethnic composition included non-Hispanic White (N = 458 of 539; 85.0%), non-Hispanic Black (N = 56 of 539; 10.4%), and Hispanics (N = 21 of 539; 3.9%). At time of interview, one-third were married (N = 204 of 539; 37.9%); one-fourth (N = 94 of 363) reported living in poverty, as defined by the 1998 guidelines of the U.S. Department of Health and Human Services¹⁶; and there was variation reported across educational attainment: less than high school (N = 164; 30.6%); high school (N = 170; 31.7%); some college (N = 91; 17.0%); college (N = 53; 9.9%); postcollege (N = 58; 10.8%).

Measures

Self-report measures of depression were taken by research assistants using the 15-item GDS. Separate patient and informant interviews assessed current DSM–IV criteria for major depression using the mood module of the SCID. The diagnosis of depression was established by consensus of the study geriatric psychiatrist, geriatrician, clinical psychologist (PJR), and principal investigator (MLB) using clinical information based on all sources of information, including patient interview, informant interview, and patient medical status and medications as documented in the medical record (Health Care Financing Administration form 485). The consensus best estimate procedure was used in the parent study¹⁴ and has been previously described as a reliable method for both interviewed and noninterviewed individuals for most diagnostic categories.¹⁷,¹⁸ An inclusive approach to assigning diagnoses was used, whereby symptoms were rated as present regardless of whether they were due to general medical conditions or medications.¹⁹

Cognitive impairment was assessed using the MMSE,²⁰ an instrument shown to have consistency and reliability in detecting cognitive functioning in an elderly population.²¹ Patients were grouped into a dichotomous category indicating cognitive impairment (MMSE score ranging 18–23) or no cognitive impairment (MMSE score ≥24), which is the most widely used and accepted cutoff for the MMSE.²² Medical morbidity was determined from the medical record and patient interview by a geriatric internist using the Charlson Comorbidity Index.²³ The Charlson is the most extensively studied comorbidity index, which shows strong evidence of moderate-to-high psychometric properties in the disabled and elderly populations.²⁴,²⁵ Other self-report measures include counts of activities of daily living (ADLs) and instrumental ADLs the patient was unable to do without assistance.²⁶ Evidence from the literature suggests that the criterion validity of these indexes is satisfactory when assessed in terms of the correlation with an outcome variable (i.e., home help).²⁷ Finally, pain intensity was assessed by the single three-level item ("a great deal," "a little bit," or "none") from the Medical Outcomes Study 36-item Short-Form Health Survey.²⁸

Statistical Analysis

A receiver operating characteristic (ROC) curve was plotted for the GDS-15 and SCID diagnosis to compare the sensitivity and specificity of each threshold for major depression.²⁹ Sensitivity was defined as the probability of a positive screening for depression given that the individual met criteria for depression using information from the SCID. Specificity was defined as the probability of a negative screen for depression, given that the individual did not meet the clinical criteria for depression.
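
The definitions of sensitivity and specificity given above can be illustrated with a minimal Python sketch. The function name `screen_performance`, the convention that a score at or above the cutoff counts as a positive screen, and the toy scores and diagnoses are assumptions made for illustration only; they are not the study's analysis code or data.

```python
def screen_performance(scores, has_depression, cutoff):
    """Return (sensitivity, specificity) for the rule 'score >= cutoff is a positive screen'.

    scores         -- screening totals (for example, GDS-15 totals), one per person
    has_depression -- True/False reference-standard diagnosis for each person
    cutoff         -- lowest score counted as a positive screen (assumed convention)
    """
    tp = fn = tn = fp = 0
    for score, depressed in zip(scores, has_depression):
        positive = score >= cutoff
        if depressed and positive:
            tp += 1          # depressed, screened positive
        elif depressed:
            fn += 1          # depressed, screened negative
        elif positive:
            fp += 1          # not depressed, screened positive
        else:
            tn += 1          # not depressed, screened negative
    sensitivity = tp / (tp + fn)   # P(positive screen | depressed)
    specificity = tn / (tn + fp)   # P(negative screen | not depressed)
    return sensitivity, specificity


# Hypothetical toy data, not drawn from the study.
toy_scores = [2, 7, 5, 1, 9, 4, 3, 6]
toy_diagnosis = [False, True, True, False, True, True, False, False]

sens, spec = screen_performance(toy_scores, toy_diagnosis, cutoff=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Repeating the call for every candidate cutoff gives the set of (sensitivity, specificity) pairs from which an ROC curve is drawn.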


ROC curves were plotted separately for cognitive impairment, disability, and pain. The goal was to compare the sensitivity and specificity of each threshold of morbidity. Threshold categories were based on the median score for each measure (MMSE mean = 3; Charlson mean = 2; ADL mean = 1; instrumental ADL mean = 4; pain mean = 2), assigning patients "worse" or "better" health status. Accuracy was measured by the area under the ROC curve (AUC), and a p-value was reported for the statistical test comparing the equality of ROC curves, detecting the difference between areas under the curves for medical conditions and for sociodemographic characteristics.

Optimal cutoff scores were determined using Youden's Index to summarize the information into a single numeric value.³⁰ Internal consistency was evaluated using the Kuder-Richardson formula 20 (KR-20), a special version of alpha for items that are dichotomous.³¹ Two-tailed t tests were used for continuous variables to compare the mean scale score between different subgroups. Chi-square analyses were performed on dichotomous categorical variables. An alpha level of 0.05 was used for determining significance.

DIF was evaluated using logistic regression to predict item responses across dichotomous categories for gender (0 = male; 1 = female) and nonwhite race (0 = white; 1 = nonwhite). Ordinal logistic regression, using the proportional odds model, was used to predict item responses across the ordinal categorical variables for educational attainment (0 = <high school; 1 = high school; 2 = some college; 3 = college; 4 = postcollege) and age, which was categorized into a three-level ordinal variable for this analysis (0 = 65–74 years; 1 = 75–84 years; 2 = 85 and older).³²,³³ Measures of magnitude consisted of the odds ratio (OR) with 95% confidence intervals (CI). Evidence of DIF was defined as an OR ≥2.0 or, conversely, ≤0.50.¹⁰,¹² Items with no evidence of DIF were totaled and scored to assess their overall performance, compared with the original score using the 15 items. Software programs used to manage and analyze the data described herein include STATA,³⁴ SPSS,³⁵ and EXCEL.³⁶

RESULTS

Examining the study population (excluding cases with moderate to severe dementia, N = 13), prevalence of depression using the SCID was 15.4% (N = 81/526). A GDS-15 scale was then generated using casewise deletion, dropping 30 subjects who did not respond to all 15 items, and 4 subjects who had not responded to any items. Examining the remaining cases (N = 492), bivariate analyses showed that major depression was not significantly associated with any sociodemographic factors, and these findings are similar to results previously reported in the parent study.¹⁴

Optimal Cutoff, Sensitivity, and Specificity Using the SCID

Preliminary psychometric analyses of the scale, based on 492 study participants, show an overall mean of 3.5 (ranging from 0 to 13) with an internal consistency reliability equal to 0.80 (Table 1). Individuals with depression, compared with nondepressed individuals, had significantly higher scores on the GDS-15 (t = 10.23, df = 490, p <0.001), Charlson Medical Comorbidity Index (t = 2.21, df = 524, p = 0.027), instrumental ADL (t = 2.73, df = 513, p = 0.007), and pain intensity (t = 4.23, df = 512, p <0.001) (Table 1). In an unadjusted logistic regression model, analyses showed that patients reporting at least 3 ailments (cluster), compared with patients reporting fewer or none, were 2.5 times more likely to be diagnosed with major depression (defined by SCID) (OR = 2.47; 95% CI = 1.49–4.09). These results remained unchanged in an adjusted stepwise logistic regression model (OR = 2.47; 95% CI = 1.49–4.09; Wald χ² = 12.26; 1 df; N = 524 with data on all variables) (Table 1).

The optimal cutoff using the trade-off between true-positives and true-negatives (Youden's Index) is 5, with a sensitivity of 71.8%, specificity of 78.2%, and ROC area under the curve of 0.793 (Table 2). When the sensitivity and specificity of the GDS-15 were examined across clinical, functional, cluster, and demographic variables, using the cutoff of 5, there were no statistical differences across any of these factors. Sensitivity seemed somewhat higher among patients with cognitive impairment, greater medical morbidity, disability, and pain but, again, these differences did not reach statistical significance. We also explored whether sensitivity differed by a composite variable indicating patients who had at least 3 of these conditions, but the difference did not reach statistical significance.
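
The internal consistency figure reported for the GDS-15 is a KR-20 coefficient. As a hedged illustration, the sketch below implements the textbook formula KR-20 = k/(k − 1) × (1 − Σ pᵢqᵢ / σ²), where k is the number of items, pᵢ is the proportion endorsing item i, qᵢ = 1 − pᵢ, and σ² is the variance of the total scores. The function name `kr20`, the use of the population variance (some software uses the sample variance instead), and the tiny response matrix are assumptions for illustration, not the study's code or data.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for dichotomous items.

    responses -- list of respondents, each a list of 0/1 item answers
                 (every respondent must answer every item).
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Sum over items of p * q, where p is the proportion endorsing the item.
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(person[item] for person in responses) / n_people
        pq_sum += p * (1 - p)

    # Variance of total scores (population variance: an assumption, see lead-in).
    total_variance = pvariance([sum(person) for person in responses])

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical 4-person, 3-item response matrix.
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy_responses), 2))
```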


TABLE 1. Mean Scores of Clinical and Functional Factors by Depression Status Among Elderly Home Healthcare Patients

                                                                           Major Depressionᵃ
                                                                           Yes          No           Analysis
Factor                                                       Alpha         Mean  SD     Mean  SD     t       df    p
GDS-15 (range = 0–13; N = 492)                               KR-20 = 0.80  6.7   3.5    3.0   2.6    10.23   490   <0.001
Medical morbidity (Charlson Comorbidity Index; range, 0–10)                3.3   2.3    2.5   2.0    2.21    524   0.027
ADL disability (range, 0–6)                                                1.3   1.5    1.0   1.2    0.54    512   0.591
Instrumental ADL (IADL) disability (range, 0–6)                            3.7   1.4    3.2   1.5    2.73    513   0.007
Cognitive function (MMSE, range = 0–30)                                    26.3  3.2    26.6  2.8    −0.76   520   0.450
Reported pain (range, 1–3)                                                 2.3   0.8    1.9   0.8    4.23    512   <0.001

                                                                           Depression   No Depression
Clusterᵇ                                                                   N     %      N     %
Subjects reporting cluster including medical morbidity, ADL, IADL, or pain 31    38.3   89    20.0   Adj OR = 2.47 (95% CI = 1.49–4.09)
Subjects reporting none or less than three comorbid factors                50    61.7   356   80.0   Wald χ² = 12.26, 1 df, p = 0.005

ᵃ Depression defined using SCID.
ᵇ Subjects reporting at least 3 of the 4 comorbid factors (Charlson Index, ADL disability, IADL, or pain).
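
As a rough check on the cluster result in Table 1, a crude odds ratio and a Wald-type (Woolf) 95% confidence interval can be computed directly from the four cell counts. This is only a sketch: the function name `odds_ratio_ci` is hypothetical, and the calculation is the standard 2×2 formula rather than the logistic regression the authors report. Applied to the Table 1 counts it gives an odds ratio of 2.48 with an interval of roughly 1.50 to 4.11, close to the published 2.47 (1.49–4.09); the small gap is plausibly because the reported model used N = 524 rather than all 526 subjects.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a -- exposed cases        (cluster present, depressed)
    b -- exposed non-cases    (cluster present, not depressed)
    c -- unexposed cases      (cluster absent, depressed)
    d -- unexposed non-cases  (cluster absent, not depressed)
    z -- normal quantile for the interval (1.96 for 95%)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Cell counts as printed in Table 1.
crude_or, ci_lower, ci_upper = odds_ratio_ci(a=31, b=89, c=50, d=356)
print(f"OR = {crude_or:.2f}, 95% CI = {ci_lower:.2f} to {ci_upper:.2f}")
```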

TABLE 2. GDS-15: Sensitivity, Specificity and Optimal Cutoff

Cutoff Points     ≥0     ≥1     ≥2     ≥3     ≥4     ≥5     ≥6     ≥7     ≥8     ≥9     ≥10    ≥11    ≥12    ≥13    >13
Sensitivity (%)   100.0  97.2   93.0   83.1   76.1   71.8   60.6   54.9   38.0   25.3   21.1   16.9   11.3   7.0    0.0
Specificity (%)   0.0    14.7   34.2   51.3   65.1   78.2   86.2   91.2   93.1   95.5   97.2   98.1   98.8   99.5   100.0
Youden's          0.00   0.12   0.27   0.34   0.41   0.50ᵃ  0.47   0.46   0.31   0.21   0.18   0.15   0.10   0.07   0.00

GDS-15
AUC                         0.7933
Standard error              0.0308
Lower confidence interval   0.7330
Upper confidence interval   0.8536

ᵃ Optimal cutoff.
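
The Youden's row and the AUC in Table 2 can be reproduced from the sensitivity and specificity rows alone. The sketch below assumes that each cutoff column means "score at or above this value", uses Youden's J = sensitivity + specificity − 1, and approximates the AUC with the trapezoidal rule over the tabulated ROC points; the variable names are illustrative, and the trapezoidal rule is an assumption about how the area could be obtained, not a statement of the authors' software settings.

```python
# Sensitivity and specificity (%) from Table 2 for cutoffs 0 through 13,
# read as "score >= cutoff" (an assumption about the column headers).
cutoffs = list(range(14))
sensitivity = [100.0, 97.2, 93.0, 83.1, 76.1, 71.8, 60.6,
               54.9, 38.0, 25.3, 21.1, 16.9, 11.3, 7.0]
specificity = [0.0, 14.7, 34.2, 51.3, 65.1, 78.2, 86.2,
               91.2, 93.1, 95.5, 97.2, 98.1, 98.8, 99.5]

# Youden's J for each cutoff, on the 0-1 scale.
youden = [(se + sp) / 100 - 1 for se, sp in zip(sensitivity, specificity)]

# The optimal cutoff is the one with the largest J.
best_index = max(range(len(youden)), key=lambda i: youden[i])
best_cutoff = cutoffs[best_index]

# ROC points as (false-positive rate, true-positive rate), ordered by
# increasing false-positive rate; (0, 0) corresponds to the ">13" column.
roc_points = [(0.0, 0.0)] + [
    (1 - sp / 100, se / 100)
    for se, sp in zip(reversed(sensitivity), reversed(specificity))
]

# Trapezoidal-rule area under the piecewise-linear ROC curve.
auc = sum(
    (x1 - x0) * (y0 + y1) / 2
    for (x0, y0), (x1, y1) in zip(roc_points, roc_points[1:])
)

print(f"best cutoff = {best_cutoff}, J = {youden[best_index]:.2f}, AUC = {auc:.4f}")
```

With the tabulated values, the largest J falls at the cutoff of 5 and the trapezoidal area comes out at about 0.793, in line with the figures in the table.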

Differential Item Functioning

Results of the DIF analyses indicated that age, educational attainment, gender, and race do not influence the measurement properties of the GDS-15. Of the 15 items evaluated, 12 showed no evidence of either uniform or nonuniform DIF and were relatively free of item bias. However, 3 items met the criteria for evidence of bias, with ORs ≥2.0 or, conversely, ≤0.50 (Table 3). Briefly, only uniform DIF was observed for items 5, 10, and 14 by sex, and for item 10 by race.

Results show that the likelihood of women responding negatively to the item "Are you in good spirits most of the time" was 3.43 times (OR = 3.43; Wald χ² = 85.1, df = 2, p = 0.04) that of men, controlling for total GDS score. In contrast, men were more than twice as likely as women to respond positively to the items "Do you feel you have more problems with memory" (OR = 0.43; Wald χ² = 51.2, df = 2, p = 0.04) and "Do you feel that your situation is hopeless" (OR = 0.35; Wald χ² = 85.5, df = 2, p = 0.01), again controlling for total GDS score. Finally, the likelihood of non-Whites responding positively to the item "Do you feel you have more problems with memory" was 4 times (OR = 4.37, Wald χ² = 51.0, df = 2, p <0.001) that of Whites with equivalent GDS scores.

Based on this information, an analysis of the shortened GDS score was performed deleting the 3 items with evidence of bias (i.e., items 5, 10, and 14). The psychometric properties of this 12-item version (AUC = 0.8681) were not significantly improved over the original 15-item scale (AUC = 0.8716). In another attempt to examine whether improvements could be made, a 14-item version of the scale was created omitting item 10, as it showed item-level bias by both gender and race.


TABLE 3. Differential Item Functioning: Odds Ratios for GDS-15 Items Across Categories for Age, Education, Gender, and Race

     Age                                                 Education
     Uniform                   Nonuniform                Uniform                   Nonuniform
     OR     χ²     df  p       OR     χ²     df  p       OR     χ²     df  p       OR     χ²     df  p
Q1   0.80   100.7  2   0.26    0.78ᵃ  93.1   3   <0.001  1.17   104.4  2   0.14    1.12ᵃ  111.7  3   0.03
Q2   1.06   82.7   2   0.68    0.97   84.9   3   0.80    1.05   82.3   2   0.53    1.08   87.7   3   0.30
Q3   1.03   105.5  2   0.90    1.01   105.5  3   0.83    0.79   104.8  2   0.07    0.99   113.3  3   0.88
Q4   0.89   108.2  2   0.46    0.99   113.7  3   0.87    0.87   103.6  2   0.12    1.04   113.3  3   0.27
Q5   0.97   91.5   2   0.92    0.97   90.9   3   0.65    1.04   91.3   2   0.84    1.05   91.1   3   0.33
Q6   0.98   75.8   2   0.93    1.09   76.6   3   0.10    1.16   78.3   2   0.20    0.98   79.4   3   0.51
Q7   0.76   84.8   2   0.21    0.97   87.4   3   0.65    1.44ᵃ  78.6   2   0.01    1.03   79.6   3   0.69
Q8   1.29   102.1  2   0.13    1.14   106.5  3   0.11    0.84   102.9  2   0.07    1.01   103.2  3   0.75
Q9   1.22   59.4   2   0.16    0.96   61.4   3   0.48    0.91   61.4   2   0.25    1.05   61.7   3   0.11
Q10  1.02   51.8   2   0.94    1.00   51.9   3   0.98    0.97   51.5   2   0.85    1.00   52.5   3   0.78
Q11  1.72   67.1   2   0.09    0.92   67.3   3   0.34    1.31   71.4   2   0.13    0.95   70.3   3   0.34
Q12  0.82   116.0  2   0.35    1.14   118.5  3   0.17    0.79   126.7  2   0.08    0.89ᵃ  130.6  3   0.01
Q13  0.87   50.6   2   0.31    0.99   51.4   3   0.87    1.28ᵃ  58.7   2   0.002   1.09   54.2   3   0.17
Q14  1.12   82.6   2   0.65    1.07   80.6   3   0.36    0.85   80.9   2   0.28    1.05   82.6   3   0.22
Q15  1.19   95.4   2   0.37    0.95   97.6   3   0.42    0.70ᵃ  97.9   2   0.01    1.00   98.3   3   0.82

     Gender                                              Race
     Uniform                   Nonuniform                Uniform                   Nonuniform
     OR     χ²     df  p       OR     χ²     df  p       OR     χ²     df  p       OR     χ²     df  p
Q1   1.28   102.7  2   0.43    1.03   102.9  3   0.80    0.48   100.1  2   0.15    0.88   99.7   3   0.40
Q2   1.30   82.0   2   0.27    0.76   91.0   3   0.12    0.98   82.9   2   0.95    1.17   87.6   3   0.50
Q3   0.90   106.1  2   0.75    1.18   105.5  3   0.18    1.20   105.8  2   0.70    1.05   115.7  3   0.75
Q4   1.20   107.7  2   0.49    1.04   107.6  3   0.66    0.94   106.7  2   0.85    1.37   105.5  3   0.14
Q5   3.43ᵇ  85.1   2   0.04    1.15   85.3   3   0.30    1.21   91.6   2   0.78    0.89   90.1   3   0.43
Q6   1.31   77.4   2   0.41    0.86   81.5   3   0.11    0.61   77.0   2   0.29    1.14   75.4   3   0.48
Q7   1.33   79.5   2   0.45    1.21   76.2   3   0.19    0.66   79.9   2   0.38    1.25   80.4   3   0.42
Q8   1.67   103.7  2   0.08    1.21   105.3  3   0.11    0.65   104.5  2   0.34    0.72ᵃ  113.8  3   0.02
Q9   1.06   59.1   2   0.78    0.95   63.1   3   0.50    1.30   59.4   2   0.37    0.98   59.5   3   0.87
Q10  0.43ᵇ  51.2   2   0.04    0.86   50.1   3   0.13    4.37ᵇ  51.0   2   <0.001  1.19   48.6   3   0.24
Q11  1.37   67.3   2   0.48    0.68ᵃ  80.5   3   0.01    1.07   71.0   2   0.91    0.96   71.8   3   0.83
Q12  0.84   119.9  2   0.59    1.00   120.3  3   1.00    1.41   119.3  2   0.47    0.79   116.4  3   0.10
Q13  1.12   51.5   2   0.60    0.98   52.3   3   0.88    0.78   51.3   2   0.37    1.04   51.6   3   0.84
Q14  0.35ᵇ  85.5   2   0.01    0.99   88.2   3   0.94    1.00   82.1   2   1.00    1.11   81.6   3   0.63
Q15  0.53ᵃ  94.4   2   0.04    0.80   92.4   3   0.06    1.52   95.6   2   0.32    0.86   95.2   3   0.22

Demographic categories: Age (0 = 65–74 years; 1 = 75–84 years; 2 = 85 and older); Education (0 = <high school; 1 = high school; 2 = some college; 3 = college; 4 = postcollege); Gender (0 = male; 1 = female); Race (0 = white; 1 = nonwhite). OR: odds ratio; χ²: Wald statistic; df: degrees of freedom.
ᵃ Statistically significant, does not meet DIF criteria.
ᵇ Statistically significant, meets DIF criteria.
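
The footnote markers in Table 3 can be read as a two-part rule: an item is flagged only if its odds ratio is statistically significant and is also at least 2.0 or at most 0.50. The sketch below encodes that reading for a handful of uniform-DIF entries copied from the table. The function name `meets_dif_criteria`, the 0.05 significance threshold applied to the printed p-values, and the use of 0.001 as a stand-in for the entry reported as "<0.001" are assumptions for illustration.

```python
def meets_dif_criteria(odds_ratio, p_value, alpha=0.05):
    """Flag an item when the OR is significant and >= 2.0 or <= 0.50."""
    large_effect = odds_ratio >= 2.0 or odds_ratio <= 0.50
    return p_value < alpha and large_effect


# Selected uniform-DIF results from Table 3: (odds ratio, p-value).
uniform_results = {
    ("gender", "Q5"): (3.43, 0.04),
    ("gender", "Q10"): (0.43, 0.04),
    ("gender", "Q14"): (0.35, 0.01),
    ("gender", "Q15"): (0.53, 0.04),   # significant, but OR inside (0.50, 2.0)
    ("race", "Q1"): (0.48, 0.15),      # OR below 0.50, but not significant
    ("race", "Q10"): (4.37, 0.001),    # printed as p < 0.001; 0.001 is a stand-in
}

flagged = {
    key
    for key, (odds_ratio, p_value) in uniform_results.items()
    if meets_dif_criteria(odds_ratio, p_value)
}
print(sorted(flagged))
```

Under this reading the flagged entries are items 5, 10, and 14 by gender and item 10 by race, which matches the entries marked as meeting the DIF criteria in the table.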

Properties of this 14-item version showed no significant improvement over the original GDS-15, although the AUC was slightly higher (0.8732). Overall, DIF analyses suggest that age, level of education, gender, and race do not have an effect on the measurement properties of the GDS-15 in an elderly homebound population.

DISCUSSION

In this study population of elderly home healthcare patients, the psychometric properties of the GDS-15 were similar to other published reports on its sensitivity, specificity, and optimal cutoff value. The analyses confirm that using the cutoff of 5 yields an optimal sensitivity of 71.8% and specificity of 78.2%, when compared with SCID criteria for depression. Hence, the findings in this patient population are similar to those reported in pooled studies of the GDS-15 indicating a sensitivity of 80.5% and a specificity of 75.0%, with optimal cutoff values of 5 and 6.⁵

Although we found some evidence that the GDS-15 (using cutoff of 5) has higher sensitivity and lower specificity in patients with greater comorbidity, these findings were not statistically significant. We also found no evidence that the accuracy of the GDS-15 was influenced by sociodemographic factors. Analyses were performed to compare AUC across the diverse subgroups of this population, showing the instrument performs comparably, increasing the generalizability of these study results. In addition, DIF analyses revealed no variability of item responses across subgroups identified by age, level of education, gender, or race. Thus, these results suggest that the GDS-15 offers no evidence of a difference across the middle aged, elderly and old-old (≥75 years), across low versus high levels of educational attainment, nor across gender or race.

We did find some evidence of sociodemographic variation in which symptoms were endorsed after matching for total GDS score. Women were less likely than men to endorse being in "good spirits most of the time." Again matching for total GDS, men were more likely than women, and non-Whites were more likely than Whites, to endorse memory problems. But omitting these items from the total GDS score did not improve the score's psychometric properties. Whether or not these differences in item endorsement across demographic groups are clinically meaningful is another question that could be pursued with different types of analyses in the future.

Limitations

A potential limitation to the generalizability of these findings is that the study sample was drawn from a single visiting nurse service agency in Westchester, NY. However, the sociodemographic and clinical characteristics of the study sample reflect those of home healthcare patients nationally, suggesting that the findings have broad relevance. A second limitation is the high rate of nonparticipation among sample patients that, although common to studies of homebound seniors, results in potential selection bias. We do not know if these findings apply to patients who did not participate. A third limitation is that patients were grouped into cognitive impairment versus no cognitive impairment, using MMSE cutoff scores of 23 and 24²²; however, without more in-depth clinical evaluation, some subjects labeled "no cognitive impairment" may in fact have had "mild cognitive impairment" of various nosologies.

CONCLUSIONS

The GDS was developed to give a simple, easy to use approach to screening for depression in older adults. The advantage of the GDS for medically ill populations is that the instrument purposely does not assess the somatic symptoms of depression, so as not to inflate the total score by inadvertently attributing symptoms of medical illness to depression. A risk in this approach is that the scale might underestimate cases of depression by systematically excluding those symptoms of depression that are somatic. Our data suggest, however, that both the sensitivity and specificity of the GDS are well within acceptable ranges. Further, accuracy of the GDS-15 is not influenced by severity of medical burden, age, or other sociodemographic characteristics even in a medically ill and disabled patient population.

These results have broad implications for depression screening, suggesting that i) the "very old" and ill can be screened appropriately despite clinician beliefs that this population is too difficult to assess; ii) the presence of a major depressive episode among elderly homebound adults can be reliably detected; and iii) the tool is useful for the detection of depression across culture-specific populations.³⁷,³⁸

This work was supported by grants from the National Institute of Mental Health, #T32 MH19132 and #T32 MH067555 (to LGM), #K23 MH069784 (to PJR), and #R01 MH 56482 (to MLB). Additional support for methodological consultation was provided to LGM as a 2006–2007 Scholar of the African-American Mental Health Research Scientist Consortium (AAMHRS), and 2006 Program Scholar in the Summer Institute for Applied Multi-Ethnic Research, at the Inter-University Consortium for Political and Social Science Research, University of Michigan, Ann Arbor. LGM received consultancy fees from Behavioral Science International LLC, GlaxoSmithKline and Pfizer during the study period.


References
1. Lyness JM, Noel TK, Cox C, et al: Screening for depression in elderly primary care patients: a comparison of the Center for Epidemiologic Studies-Depression Scale and the Geriatric Depression Scale. Arch Intern Med 1997; 157:449–454
2. Spitzer RL: User's Guide for the Structured Clinical Interview for DSM–III–R: SCID. Washington DC, American Psychiatric Press, 1990
3. Sheikh JI, Yesavage JA: Geriatric Depression Scale (GDS): recent evidence and development of a shorter version, in Clinical Gerontology: A Guide to Assessment and Intervention. Edited by Brink TL. New York, The Haworth Press, 1986, pp 165–173
4. Yesavage JA, Brink TL, Rose TL, et al: Development and validation of a geriatric depression screening scale: a preliminary report. J Psychiatr Res 1982; 17:37–49
5. Wancata J, Alexandrowicz R, Marquart B, et al: The criterion validity of the Geriatric Depression Scale: a systematic review. Acta Psychiatr Scand 2006; 114:398–410
6. Watson LC, Lewis CL, Kistler CE, et al: Can we trust depression screening instruments in healthy "old-old" adults? Int J Geriatr Psychiatry 2004; 19:278–285
7. Cwikel J, Ritchie K: Screening for depression among the elderly in Israel: an assessment of the Short Geriatric Depression Scale (S-GDS). Isr J Med Sci 1989; 25:131–137
8. Kim JM, Prince MJ, Shin IS, et al: Validity of Korean Form of Geriatric Depression Scale (KGDS) among cognitively impaired Korean elderly and the development of a 15-item short version (KGDS-15). Int J Methods Psychiatr Res 2001; 10:204–210
9. Tang WK, Wong E, Chiu HF, et al: The Geriatric Depression Scale should be shortened: results of Rasch analysis. Int J Geriatr Psychiatry 2005; 20:783–789
10. Cole SR: Assessment of differential item functioning in the Perceived Stress Scale-10. J Epidemiol Community Health 1999; 53:319–320
11. Holland PW, Wainer H: Differential Item Functioning. Hillsdale, NJ, Lawrence Erlbaum Associates, 1993
12. Cole SR, Kawachi I, Maller SJ, et al: Test of item-response bias in the CES-D scale: experience from the New Haven EPESE study. J Clin Epidemiol 2000; 53:285–289
13. Feher EP, Larrabee GJ, Crook TH III: Factors attenuating the validity of the Geriatric Depression Scale in a dementia population. J Am Geriatr Soc 1992; 40:906–909
14. Bruce ML, McAvay GJ, Raue PJ, et al: Major depression in elderly home health care patients. Am J Psychiatry 2002; 159:1367–1374
15. Haupt BJ, Jones A: The national home and hospice care survey: 1996 summary. Vital Health Stat 13 1999; (141):1–238
16. Federal Register; 1998, Feb 24. p 9235–9238
17. Leckman JF, Sholomskas D, Thompson WD, et al: Best estimate of lifetime psychiatric diagnosis: a methodological study. Arch Gen Psychiatry 1982; 39:879–883
18. Klein DN, Ouimette PC, Kelly HS, et al: Test-retest reliability of team consensus best-estimate diagnoses of axis I and II disorders in a family study. Am J Psychiatry 1994; 151:1043–1047
19. Koenig HG, George LK, Peterson BL, et al: Depression in medically ill hospitalized older adults: prevalence, characteristics, and course of symptoms according to six diagnostic schemes. Am J Psychiatry 1997; 154:1376–1383
20. Folstein MF, Folstein SE, McHugh PR: "Mini-mental state." A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975; 12:189–198
21. Lopez MN, Charter RA, Mostafavi B, et al: Psychometric properties of the Folstein Mini-Mental State Examination. Assessment 2005; 12:137–144
22. Folstein MF, Folstein SE, McHugh PR, et al: Mini-Mental State Examination user's guide. Odessa, FL, 2001
23. Charlson ME, Pompei P, Ales KL, et al: A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis 1987; 40:373–383
24. de Groot V, Beckerman H, Lankhorst GJ, et al: How to measure comorbidity: a critical review of available methods. J Clin Epidemiol 2003; 56:221–229
25. Charlson ME, Peterson JC, Syat BL, et al: Outcomes of community-based social service interventions in homebound elders. Int J Geriatr Psychiatry 2008; 23:427–432
26. Lawton MP, Brody EM: Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist 1969; 9:179–186
27. Norstrom T, Thorslund M: The structure of IADL and ADL measures: some findings from a Swedish study. Age Ageing 1991; 20:23–28
28. Ware JE Jr, Sherbourne CD: The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992; 30:473–483
29. Erdreich LS, Lee ET: Use of relative operating characteristic analysis in epidemiology: a method for dealing with subjective judgement. Am J Epidemiol 1981; 114:649–662
30. Youden WJ: Index for rating diagnostic tests. Cancer 1950; 3:32–35
31. Kuder GF, Richardson MW: The theory of the estimation of test reliability. Psychometrika 1937; 2:151–160
32. Ananth C, Kleinbaum D: Regression models for ordinal responses: a review of methods and applications. Int J Epidemiol 1997; 26:1323–1333
33. Swaminathan H, Rogers H: Detecting differential item functioning using logistic regression procedures. J Educ Meas 1990; 27:361–370
34. Stata [computer program]. Release 9 Ed. College Station, TX, Stata Corp., 2005
35. SPSS for Windows 14.0 [computer program]. Chicago, IL, SPSS Inc., 2005
36. Microsoft Excel [computer program]: Microsoft Corp., 2000
37. Rait G, Burns A, Baldwin R, et al: Screening for depression in African-Caribbean elders. Fam Pract 1999; 16:591–595
38. Harralson TL, White TM, Regenberg AC, et al: Similarities and differences in depression among black and white nursing home residents. Am J Geriatr Psychiatry 2002; 10:175–184
