Optimising The Diagnostic Performance of The Geriatric Depression Scale Izal-2010

Psychiatry Research 178 (2010) 142–146
Optimising the diagnostic performance of the Geriatric Depression Scale

María Izal a,*, Ignacio Montorio a, Roberto Nuevo b,c, Gema Pérez-Rojo a, Isabel Cabrera a
a Universidad Autónoma de Madrid, Facultad de Psicología, Spain
b Department of Psychiatry, Autónoma University of Madrid, La Princesa University Hospital, Madrid, Spain
c Instituto de Salud Carlos III, Centro de Investigación en Red de Salud Mental, CIBERSAM, Spain
Article info
Article history:
Received 1 July 2008
Received in revised form 10 December 2008
Accepted 22 February 2009
Keywords:
Screening
Diagnostic efficiency
GDS
Older adults
Abstract
The aim of this work is to empirically generate a shortened version of the Geriatric Depression Scale (GDS),
with the intention of maximising the diagnostic performance in the detection of depression compared with
previously validated GDS versions, while optimising the size of the instrument. A total of 233 individuals
(128 from a Day Hospital, 105 randomly selected from the community) aged 60 or over completed the GDS
and other measures. The 30 GDS items were entered in the Day Hospital sample as independent variables in a
stepwise logistic regression analysis predicting diagnosis of Major Depression. A final solution of 10 items
was retained, which correctly classified 97.4% of cases. The diagnostic performance of these 10 GDS items was
analysed in the random sample with a receiver operating characteristic (ROC) curve. Sensitivity (100%),
specificity (97.2%), positive (81.8%) and negative (100%) predictive power, and the area under the curve
(0.994) were comparable with values for GDS-30 and higher compared with GDS-15, GDS-10 and GDS-5. In
addition, the new scale proposed had excellent fit when testing its unidimensionality with CFA for categorical
outcomes (e.g., CFI = 0.99). The 10-item version of the GDS proposed here, the GDS-R, seems to retain the
diagnostic performance of the 30-item GDS for detecting depression in older adults, while increasing the
sensitivity and predictive values relative to other shortened versions.
© 2009 Elsevier Ireland Ltd. All rights reserved.
1. Introduction
The Geriatric Depression Scale (GDS; Brink et al., 1982) is one of the
most frequently used instruments in the diagnosis and study of
depression in older adults (Mui, 1996; Stiles and McGarrahan, 1998;
Jongenelis et al., 2005). The original GDS is a measurement made up of
30 items ('yes'/'no') that was designed to assess the severity of
depression in older adults in response to the recognition that depression
scales based on the general population might not be adequate for use in
the elderly population. In fact, items referred to somatic symptoms were
removed from the initial pool in the construction process of the scale
(Brink et al., 1982; Yesavage et al., 1983). The GDS has demonstrated high
diagnostic precision in identifying depression, although its large number of
items makes its application difficult in a number of health contexts, such
as primary health care (Heisel et al., 2005). Moreover, its application to
elderly people entails the risk of biases due to fatigue or concentration
problems/attention span difficulties (Herrmann et al., 1996), which also
increases the time needed to complete it.
* Corresponding author. Facultad de Psicología, Universidad Autónoma de Madrid,
Ciudad Universitaria de Cantoblanco, 28049 Madrid, Spain. Tel.: +34 914974060; fax: +34 914975215.
E-mail address: maria.izal@uam.es (M. Izal).
0165-1781/$ – see front matter © 2009 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.psychres.2009.02.018

Depression often goes unrecognised in the elderly (Rabins, 1996),
and it is a significant source of concern for families, increases use of
medical services and pharmaceutical costs, and impairs immunologic
function (Schleifer et al., 1999). It is also one of the main predictors of the
risk of suicide among older adults. The World Health Organization
indicated in its annual report (WHO, 2006) that depression would be the
second cause of disability by 2020, only below that of cardiopathy and
higher than cancer or acquired immunodeficiency syndrome (AIDS),
since older adults as a population group are particularly vulnerable to
disability. The identification of depressive syndromes in the elderly is
therefore a health priority, highlighting the necessity of developing and
validating economic, simple, and efficacious screening measures for
depression in this age group. In this sense, different short versions of the
GDS (15, 10, 8, 5 and 4 items) have been proposed, which offer the merits
of a simpler administration, easy response format, and economy of time
(Sheikh and Yesavage, 1986; Yesavage, 1988; Stiles and McGarrahan,
1998). The GDS-15 appears to have good psychometric properties and
adequate performance identifying depression, with a sensitivity up to
91% (D'Ath et al., 1994) for a cut-score of 5 and a specificity up to 81% for
a cut-score of 4 (Brown and Schinka, 2005). There are, however,
different reasons to be cautious with the use of the shortened forms of
the GDS. Other studies have found moderate performance for the GDS-15
(for example, 67% sensitivity and 73% specificity, for an optimal cut-score
of 3, in the Van Marwijk study, 1995), and particularly low positive
predictive power values (e.g., 18.4% in Arthur et al., 1999, or 31% in Brown
and Schinka, 2005). In a systematic and thorough recent review of the
properties of different versions of the GDS in a large number of
published studies (Wancata et al., 2006), the sensitivity for the GDS-15
varied between 0.600 and 0.940 (with a mean of 0.805), and the
specificity between 0.570 and 0.870 (with a mean of 0.750). Although
these values do not appear to differ from the values for the GDS-30
reported in the same review, an additional problem with the shortened
versions of the GDS arises from the procedure for selecting the items for
the GDS-15 from the original version (Cheng and Chan, 2004), since the
latter was based on correlations with somatic symptoms of depression
and suicidal thoughts, an approach that is considered problematic when
diagnosing depression in older adults (Brink et al., 1982; Yesavage et al.,
1983). Moreover, the remainder of the shorter versions, including the
GDS-5, were all developed from the GDS-15 rather than the GDS-30
(Tang et al., 2005; Jongenelis et al., 2007), under the questionable
assumption that the GDS-15 already contained the most useful items
taken from the full scale. In fact, the potential problems of performance
have been even more marked with shorter versions such as the GDS-10
(D'Ath et al., 1994) or the GDS-5 (Hoyl et al., 1999). A further criticism is
that some of the GDS versions have not been validated by standard
diagnostic criteria (DSM, International Classification of Diseases (ICD)),
but by other self-reports of depression or by similar indirect
measurements.
The goal of this work was to propose a new version of the GDS,
relying on empirical criteria addressed to optimise the discriminative
ability of the scale as a screening measure for depression, contrasting
its validity with standard diagnostic criteria. In addition, our objective
was to optimise the size of the questionnaire using the original scale of
30 items as a starting point, and to compare the relative efficacy of
the most frequently used versions of the GDS (GDS-30, GDS-15,
GDS-10, and GDS-5) with the new proposed scale.
2. Methods
2.1. Sample
The sample was composed of 233 individuals aged 60 years or over (mean = 74.5;
S.D. = 8.2); 69.0% were women. The study was carried out in two phases. In the first
one, the sample was made up of 128 older adults recruited in the Geriatric Unit of a
day hospital. The average age was 75.6 years (S.D. = 9.6; 77.3% were women). The
prevalence of major depression in this sample according to the diagnostic interview
(see below) was 33.3% (n = 18), 100% women. The second sample comprised
105 older adults randomly selected from the census of an urban locality in the Madrid
area (Spain). The mean age was 72.9 years (S.D. = 5.7), with 58.1% women. The
prevalence of major depression in this sample, according also to the diagnostic
interview, was 8.6% (n = 9), 88.9% of them being women.
2.2. Measurements
The GDS (Brink et al., 1982; Yesavage et al., 1983) is a measurement of the severity
of depression made up of 30 items with a "yes"/"no" format. Respondents describe their
feelings about items addressing issues that are typical of older adults who experience
depression. The GDS score is calculated by counting the number of responses that
suggest probable depression, reversing the inverse items, which gives a total score
ranging from 0 to 30. The Spanish version of the questionnaire used here (Montorio
and Izal, 1996) has demonstrated good psychometric properties. Spanish validations of
the GDS have demonstrated adequate properties across Hispanic cultures, including, for
instance, Spaniards (Fernández-San Martín et al., 2002), Mexicans (Baker and Espino,
1997) or Argentinians (Carrete et al., 2001).
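The scoring rule described above (count the responses that suggest depression, reversing the inverse items) can be sketched as follows. This is only an illustration: the function name `gds_score` and the example respondent are hypothetical, and the reversed set {7, 15, 21} is taken from the items flagged as reversed in Table 4, not from the full 30-item key.

```python
from typing import Mapping, Set


def gds_score(responses: Mapping[int, bool], reversed_items: Set[int]) -> int:
    """Count the responses that suggest depression.

    `responses` maps an item number to True ("yes") or False ("no").
    A "yes" scores 1 on a regular item; a "no" scores 1 on a reversed item.
    """
    score = 0
    for item, answered_yes in responses.items():
        if item in reversed_items:
            depressive = not answered_yes
        else:
            depressive = answered_yes
        score += int(depressive)
    return score


# Items flagged as reversed in Table 4 (assumed complete for these 10 items).
REVERSED = {7, 15, 21}

# Hypothetical respondent answering the 10 items later retained in the GDS-R.
answers = {3: True, 4: False, 6: True, 7: False, 10: False,
           14: True, 15: True, 16: True, 21: False, 23: False}

total = gds_score(answers, REVERSED)
print(total)
```

Passing the reversed set as a parameter keeps the same function usable for any GDS version, provided the caller supplies the correct key for that version.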
The presence of cognitive impairment was controlled through the mini-mental state
examination (MMSE; Folstein et al., 1975; the Spanish version by Lobo et al., 1979) using a
cut-score of 25 or lower (Del Ser and Peña-Casanova, 1994). Thirteen persons from the
second sample (random sampling) were under this threshold and were not included in the
sample of 105 persons reported above. No one in the sample recruited in the Geriatric Unit
of a Day Hospital was eliminated because of the presence of cognitive impairment.
The diagnosis of major depression was established through SCID-I (First et al.,
1999), a structured diagnostic interview according to the Diagnostic and Statistical
Manual of Mental Disorders (DSM-IV) (APA, 1994) criteria for anxiety and mood
disorders. Following the decision trees of the DSM-IV, this interview first inquires about
the core criteria of each disorder, and then probes for other characteristics of the
disorder to establish the severity and the clinical symptoms of the individual.
Medication and the presence of somatic problems are also assessed to rule out the
presence of depression due to medical conditions or substance consumption.
2.3. Procedure
Diagnostic interviews and self-reports were obtained by trained psychologists and
within an individual interview format. Interviewers were different for both types of
assessments and they were blind to the outcome of the other interview. For self-reports,
printed cards with the different response options were used in the application of the
instruments, thus facilitating understanding and minimising the induction or
deduction of the answers. The only role of the interviewer was reading the content
of the item and writing down the response and no additional information was provided.
In the sample recruited in the day hospital, individuals attending for general geriatric
consultations were asked to participate in a survey about the concerns of older adults.
Those who agreed to participate were interviewed by two different psychologists. The
census-based random sample was stratified by age and gender. The rate of participation
was 53.4%. As in the other sample, participants were interviewed by two different
psychologists. The locality of Majadahonda was selected for the study due to its
proximity to Madrid and similarity in demographic characteristics (Pérez et al., 2007).
Persons in both samples received a diagnostic interview and an interview including
self-reports measures obtained by three different psychologists. Participants completed
a consent form, and did not receive any compensation for their participation.
2.4. Statistical analyses
Data analyses were conducted separately for the two samples. First, the sample selected
from a day hospital was used as a basis for the extraction of the most discriminative items
and the pool of items extracted with this procedure was revised for purposes of content
validity; retaining the items that were empirically extracted in a second step involving the
potential inclusion of relevant items. The second sample (random sampling) was used as a
cross-validation sample to test the diagnostic performance of the scale composed by the
retained items and compare it with other versions of the GDS. The rationale for using the
random sample for purposes of cross-validation relies on its being representative of
the characteristics of a general older population, as well as the expectation that it would
provide an adequate estimation of the prevalence rate for depression, which could affect
the positive and negative predictive power values obtained in analyses of the relative
performance of the scales.
First, a forward stepwise binary logistic regression was performed, entering the 30
GDS items as independent variables for identifying the presence of depression. Thus,
only the items most predictive and informative for depression were retained, and the
excluded ones were considered redundant with this set for purposes of screening. The
process was iterative and it was finished when there were no variables reaching the
significance threshold to be removed (P < 0.10) or included (P < 0.05) in the model. Next,
given that logistic regression gives an optimal solution with the minimum number of
items, while diagnostic performance improves with the increase in variability of scores,
we revised the content of the scale to guarantee that core symptoms of depression had
been included and to analyse the effects of their manual inclusion in case they had been
removed in the regression analysis. This revision was considered as a means of
maximising content validity as content validity cannot be guaranteed by a simple
empirical reduction, even when the main criterion for shortening is empirical and the
core block of items empirically extracted is retained, regardless of their content.
Then, using the random sample, the performance of the new version of the GDS for
detecting depression was compared with the four previous versions (GDS-30, GDS-15,
GDS-10, and GDS-5). Receiver operating characteristic (ROC) curves were calculated. ROC
analysis gives the sensitivity and specificity for each score of the scales and graphically
plots sensitivity values against 1 − specificity. It enables optimal cut-scores for identifying
clinical cases to be quantitatively established. The area under the curve (AUC; range 0–1)
and the Youden index (Youden, 1950) were used to compare the global performance of
each version of the GDS. Sensitivity, specificity, positive (PPP) and negative (NPP)
predictive power, and positive and negative likelihood ratios (+LR and −LR, respectively)
were calculated for the optimal cut-off score in each GDS version and for its score in the
version previously extracted. According to recommendations in the literature, 95%
confidence intervals were calculated for each of these values (Hilgers, 1991). This seems to
be particularly important given the small size of the clinical sample (n = 9) considered for
the analyses. The prevalence of depression for older adults in Madrid, which will affect the
calculation of predictive power was established as 8.6% (Montorio et al., 2001). The AUCs
were compared to evaluate the diagnostic performance of the four versions of the GDS
with a nonparametric test based on the Wilcoxon rank sum test (DeLong et al., 1988;
Hanley and McNeil, 1982). These analyses were performed with the programs SPSS for
Windows, release 14.0, Stata version 10.0, and MedCalc version 9.6.4.0.
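The indices named in this section follow directly from a 2 x 2 classification table. The sketch below is not the authors' SPSS/Stata/MedCalc procedure; it simply applies the standard definitions. The cell counts are an assumption inferred from the figures reported for the GDS-R in the random sample (9 cases among 105 persons, 100% sensitivity, 97.9% specificity), and the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard screening indices from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Predictive power depends on the prevalence embodied in the counts.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        "pos_lr": (sensitivity / false_positive_rate
                   if false_positive_rate else float("inf")),
        "neg_lr": (1 - sensitivity) / specificity if specificity else float("inf"),
        "youden": sensitivity + specificity - 1,
    }


# Assumed counts: 9 depressed (all detected), 96 non-depressed (2 false positives).
m = diagnostic_metrics(tp=9, fp=2, fn=0, tn=94)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because the predictive values are computed from raw counts, they reflect the 8.6% prevalence of the random sample; a different prevalence would change PPP and NPP while leaving sensitivity, specificity and the likelihood ratios unchanged.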
Finally, we tested the factorial structure of the scale including the retained items,
through Confirmatory Factor Analysis (CFA) for categorical outcomes, with robust weighted least
squares estimator using tetrachoric correlations. The total sample (clinical and random) was
used for this analysis. A unidimensional model was assumed to test whether the retained
items assessed only the core dimensions of depression or whether the multidimensionality
existing in the GDS-30 remained in the shortened version. According to the usual
recommendations (e.g., Reise et al., 1993) several indices were used to assess fit, in accordance
with the values reported by Yu (2002): (a) lack of significance of χ²; (b) comparative fit
index (CFI; > 0.96); (c) Tucker–Lewis index (TLI; > 0.96); (d) root mean square error of
approximation (RMSEA; < 0.05); and (e) weighted root mean square residual (WRMR; < 1.0).
These analyses were performed using the program MPLUS, release 4.21.
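The five fit criteria listed above amount to a simple threshold check. The sketch below is illustrative only: the function name is hypothetical, the 0.05 significance level for the χ² test is an assumption (the text only requires a lack of significance), and the example values are those reported for the unidimensional model in the Results section.

```python
def fit_checks(chi2_p: float, cfi: float, tli: float,
               rmsea: float, wrmr: float, alpha: float = 0.05) -> dict:
    """Compare fit indices with the cut-offs attributed to Yu (2002) in the text."""
    return {
        "chi2_nonsignificant": chi2_p > alpha,  # alpha = 0.05 is an assumption
        "CFI > 0.96": cfi > 0.96,
        "TLI > 0.96": tli > 0.96,
        "RMSEA < 0.05": rmsea < 0.05,
        "WRMR < 1.0": wrmr < 1.0,
    }


# Values reported for the GDS-R unidimensional model.
reported = fit_checks(chi2_p=0.279, cfi=0.990, tli=0.991, rmsea=0.025, wrmr=0.774)
for criterion, passed in reported.items():
    print(f"{criterion}: {'met' if passed else 'not met'}")
```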
3. Results
Binary logistic regression analysis, performed with the sample
selected from a day hospital, produced an optimal solution of seven
items, which correctly classified 94.7% of cases (three false positives,
four false negatives). Nagelkerke's R² was 0.903; Cox and Snell's R² was
0.637; chi-square (7) = 128.64 (P < 0.001). The revision of the content
of the items retained indicated that three relevant items had been
removed from the final equation (item 3: emptiness; item 7: mood
state; and item 10: helplessness). Items 3 and 7 are affective
symptoms core to the depression diagnosis (APA, 1994). Item 10,
helplessness, is considered a core factor of depression in all ages, both
in animal and human models (Barlow, 2002). To maximise the content
validity of the scale, these three items were added to the empirically
generated 7-item version. The equation including these three items
improved the cases correctly classified (97.6%; two false positives,
one false negative), as well as Nagelkerke's (0.949) and Cox and
Snell's (0.669) R². Likewise, there was a significant increase in
chi-square (3) = 11.66 (P = 0.009). These results of the new shortened GDS
version (hereafter called GDS-R) are presented in Table 1.

Table 1
Weights for the logistic regression, percentage of affirmative responses and chi-square
tests for retained items of the GDS-R.

Item of original GDS                                               Weights        % of yes,    % of yes,      χ²
                                                                   (S.E.)         Depression   Non-clinical
3. Do you feel that your life is empty?                            0.36 (0.97)    63.8         16.8           42.9
4. Do you often get bored?                                         3.78 (1.44)    66.0         17.3           44.9
6. Are you bothered by thoughts you can't get out of your head?    5.30 (1.93)    93.6         30.3           61.3
7. Are you in good spirits most of the time?                       3.55 (1.32)    44.7         7.6            40.3
10. Do you often feel helpless?                                    2.50 (1.08)    46.8         11.9           29.7
14. Do you feel you have more problems with memory than most?      2.49 (0.91)    59.6         14.6           41.9
15. Do you think it is wonderful to be alive now?                  4.33 (1.61)    40.4         5.4            42.0
16. Do you often feel downhearted and blue?                        6.03 (2.22)    89.4         18.9           83.9
21. Do you feel full of energy?                                    3.04 (1.11)    66.0         11.9           62.2
23. Do you feel that most people are better off than you are?      2.12 (1.01)    63.8         32.4           15.6
Constant                                                           1.34 (1.09)

*P < 0.05; **P < 0.01.
All chi-square contrasts were significant with P < 0.001.
Table 2
Comparison of performance between versions of the GDS and the generated GDS-R.

                              GDS-30         GDS-15         GDS-5          GDS-10         GDS-R
Optimal cut-score             15             5              2              3              5
Sensitivity                   100%           100%           66.7%          100%           100%
  95% CI                      66.2/100       66.2/100       30.1/92.1      66.2/100       66.2/100
Specificity                   95.8%          87.5%          78.1%          81.3%          97.9%
  95% CI                      89.7/98.8      79.2/93.4      68.5/85.9      72.0/88.5      92.7/99.7
Positive predictive power     69.2%          42.9%          22.2%          33.3%          81.8%
  95% CI                      38.6/90.7      21.9/66.0      8.7/42.3       16.6/54.0      48.2/97.2
Negative predictive power     100%           100%           96.2%          100%           100%
  95% CI                      96.1/100       95.7/100       89.2/99.2      95.3/100       96.1/100
Positive likelihood ratio     24.0           8.0            3.1            5.33           48.0
  95% CI                      23.0/25.0      7.4/8.6        1.9/4.9        4.8/5.9        46.6/49.4
Negative likelihood ratio     0.00           0.00           0.43           0.00           0.00
  95% CI                                                    0.2/1.2
Area under the curve          0.980          0.966          0.816          0.949          0.994
  Standard error              0.033          0.042          0.065          0.052          0.018
  95% confidence interval     0.931/0.997    0.912/0.992    0.685/0.947    0.888/0.982    0.953/0.998
Youden index                  0.958          0.875          0.448          0.813          0.979

All AUCs were significant with P < 0.001.
Table 3
Cut-scores and Youden Index for the GDS-R.

Score    Sensitivity (95% CI)     Specificity (95% CI)     Youden Index
2        1.0 (0.662/1.0)          0.604 (0.499/0.703)      0.604
3        1.0 (0.662/1.0)          0.771 (0.674/0.850)      0.771
4        1.0 (0.662/1.0)          0.906 (0.829/0.956)      0.906
5        1.0 (0.662/1.0)          0.979 (0.927/0.997)      0.979
6        0.778 (0.401/0.965)      0.990 (0.943/0.998)      0.768

Values in bold indicate the score simultaneously maximizing both sensitivity and
specificity, the first score in the table is maximizing sensitivity while retaining a specificity
higher than that expected by random chance (0.50), and the last score is maximizing
specificity while retaining a sensitivity higher than that expected by random chance.
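The choice of cut-score in Table 3 can be reproduced with the Youden index, J = sensitivity + specificity − 1. The following sketch uses the sensitivity and specificity values listed in Table 3; the variable and function names are hypothetical.

```python
# Sensitivity and specificity per GDS-R cut-score, as listed in Table 3.
table3 = {
    2: (1.0, 0.604),
    3: (1.0, 0.771),
    4: (1.0, 0.906),
    5: (1.0, 0.979),
    6: (0.778, 0.990),
}


def youden(sensitivity: float, specificity: float) -> float:
    """Youden index: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# The optimal cut-score is the one with the largest Youden index.
best = max(table3, key=lambda score: youden(*table3[score]))

for score, (sens, spec) in table3.items():
    print(score, round(youden(sens, spec), 3))
print("optimal cut-score:", best)
```

With these values the index peaks at a cut-score of 5, which agrees with the cut-score reported as optimal for the GDS-R.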
In the next stage, using the random sample for cross-validation,
sensitivity and specificity of the generated 10-item version were
analysed and compared with those obtained by the GDS-5, GDS-10,
GDS-15 and GDS-30. The GDS-R presented a high Area Under the Curve
(AUC): 0.994; P < 0.001. It also had a clearly high sensitivity and
specificity for detecting depression (100% and 97.9%, respectively, for a
cut-score of 5 or higher). The values of the AUCs for GDS-R, GDS-30,
GDS-15, GDS-10, and GDS-5 are presented in Table 2, together with
sensitivity, specificity, and positive and negative predictive power (PPP
and NPP, respectively) for the optimal cut-score (that which simultaneously
maximises sensitivity and specificity) on each scale. In Table 3,
values for different possible cut-scores of the GDS-R are presented. The
omnibus comparison for the five AUCs indicated that there were
significant differences between them (χ²(4) = 15.2; P = 0.004). Paired
comparisons for the AUCs indicated that the AUC of the GDS-5 was
significantly lower than the AUC of GDS-30 (χ²(1) = 6.64; P = 0.010),
GDS-15 (χ²(1) = 6.86; P = 0.009), GDS-10 (χ²(1) = 4.52; P = 0.034),
and GDS-R (χ²(1) = 7.64; P = 0.006). The AUC of the GDS-R was
significantly higher than the AUC for the GDS-10 (χ²(1) = 4.85;
P = 0.028). There were no statistically significant differences between
the GDS-30 or the GDS-15 compared with the GDS-10 or the GDS-R. The
GDS-R showed a trend for a higher AUC than the GDS-15 (χ²(1) = 3.04;
P = 0.082). The ROC curves of the five versions of the GDS are graphically
presented in Fig. 1.
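The AUC has a rank-based interpretation: it equals the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half (the Mann-Whitney form of the Wilcoxon statistic). The sketch below computes a single AUC in that way. It does not implement the test used here to compare correlated AUCs, and the scores are invented for illustration, not study data.

```python
from typing import Sequence


def auc_mann_whitney(case_scores: Sequence[float],
                     control_scores: Sequence[float]) -> float:
    """AUC as the share of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scale scores, for illustration only.
cases = [5, 6, 7, 8, 9]
controls = [0, 1, 1, 2, 3, 5]

auc = auc_mann_whitney(cases, controls)
print(round(auc, 3))
```

The pairwise loop is quadratic in the sample size, which is adequate for samples of this size; a rank-sum formulation would be preferable for large data sets.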
Further, to test whether extending the scale for theoretical reasons,
in order to optimise content validity, affected its accuracy, we compared,
in the random sample, the performance of the GDS-R with the
empirically generated 7-item scale. The results for the GDS-R were
similar on most values, although slightly higher, and clearly better regarding
the positive predictive value: 50% for the seven items, 81.8% for the
10-item GDS-R.
Finally, the CFA revealed adequate fit indices for the unidimensional
model: χ²(26) = 29.722, P = 0.279; CFI = 0.990; CMIN/DF = 1.143;
TLI = 0.991; RMSEA = 0.025; WRMR = 0.774. The standardised and
unstandardised loadings of each item are presented in Table 4.

Fig. 1. Graphical representation of the ROC curves for the GDS versions.

Table 4
Regression weights for the unidimensional model of the GDS-R.

Item of original GDS               Standardized    Unstandardized (S.E.)    Residual variance
3. Life empty                      0.651           1.0 (0.00)               0.576
4. Often bored                     0.562           0.864 (0.16)             0.684
6. Bothered by ruminations         0.745           1.145 (0.162)            0.444
7. Not in good spirits (a)         0.665           1.021 (0.15)             0.558
10. Feel helpless                  0.524           0.804 (0.16)             0.726
14. Memory problems                0.556           0.854 (0.17)             0.691
15. Not wonderful to be alive (a)  0.596           0.916 (0.18)             0.644
16. Downhearted and blue           0.859           1.320 (0.18)             0.261
21. Not full of energy (a)         0.691           1.061 (0.17)             0.523
23. Others better off              0.354           0.543 (0.15)             0.875

Values within parentheses are standard errors.
All weights significant with P < 0.001.
(a) Reversed items.
4. Discussion
The present study aimed to optimise the performance of the GDS as
a screening measure for depression, reducing the size of the scale
through a stepwise logistic regression analysis, completed with a
revision of the content to guarantee the validity of the scale. The new
scale proposed appears to improve both sensitivity and specificity for
detecting depression in older adults in the cross-validation sample
compared with other shortened versions of the GDS, and it is at least as
good for this purpose as the original 30-item version. The values for an
optimal cut-score of 5 (5 or higher) can be considered excellent (100%
sensitivity; 97.9% specificity), and are higher than the values usually
reported in the empirical literature for the other versions of the GDS or
other self-report questionnaires assessing depression (Tuunainen et
al., 2001; Vázquez et al., 2007). For example, a maximum sensitivity of
91% and specificity of 81% for the GDS-15 have been found with
different cut-scores in various studies (D'Ath et al., 1994; Brown and
Schinka, 2005), and mean values of 81% (sensitivity) and 75%
(specificity) have been found in a recent systematic review (Wancata
et al., 2006). The sensitivity of the GDS-R is similar to that of the latest
published version of the GDS made up of eight items (97.9% vs. 96.3%),
but considerably superior in its specificity (96.2% vs. 71.7%), which was
to be expected since a lower cut-score was proposed for this latest
version (Jongenelis et al., 2007). Although the wide range in the 95%
confidence intervals for sensitivity limits the generalisability of these
results, all values are higher for the GDS-R, and the CIs for specificity
and +LR do not overlap with those of the GDS-5, GDS-10 or GDS-15,
providing additional support for the improved performance of the
scale (see Table 2).
Furthermore, the positive predictive power, the negative predictive
power (NPP), the positive likelihood ratios, the total AUC, and the
Youden Index all indicate that the GDS-R could improve the diagnostic
performance of the other versions, and is at least as good as the GDS-30.
The differences in PPP are especially relevant, as the high value for
the GDS-R (81.8%) decreases to 69.2% for the GDS-30, and dramatically
decreases for the GDS-15 (42.9%), for the GDS-10 (33.3%) and for the
GDS-5 (22.2%). Although what should be considered as the optimal
cut-score for each scale is not only an empirical issue, as it depends on
the specific objectives for detecting depression, all of the values for the
GDS-5 and the GDS-15 clearly deteriorated even when other cut-scores
were selected (these data are available upon request to the first
author).
Thus, the GDS-R seems to equal the empirical performance of the
GDS-30 for detecting depression, with better values in several indices
(e.g., a 12.6% increase in PPP), apparently removes items related to
anxiety and is briefer, and therefore improves the balance of benefit/costs
and its feasibility. Compared to the GDS-15, apart from the
reduction in length, the procedures for selecting items have been mainly
empirical, with a rational review of the content validity, which led to the
addition of three items to those that were empirically extracted, whereas the
GDS-15 was only rationally generated.
Results of the CFA for categorical outcomes provide support to the
unidimensional structure of the GDS-R. The procedures used for
selecting the items of the GDS-R in the present study were both
empirical and theoretical, trying to maximise the diagnostic performance as well as the content validity for the core criteria of
depression. As thoroughly discussed by Cheng and Chan (2004), the
original GDS includes items probably assessing different dimensions
of negative affect related to depression. Therefore, given that the
reduction was performed according to both empirical and theoretical
criteria, it was necessary to review whether the retained items are
evident in different dimensions of depression or whether they assess a
unique core dimension of old age depression. The excellent fit of the
unidimensional model indicates that the GDS-R has a
unidimensional structure. Furthermore, the analysis of the contents
of the 10 items retained suggests that only items central to depression
in old age have been included, whereas items more related to anxiety
or worry have been removed.
A limitation of the present work is the lack of validation of the results
about the diagnostic performance of the GDS-R to distinguish major
depression from cases with subsyndromal depression or with different
disorders. Further studies should analyse the robustness of the properties
of the GDS-R in different settings and age groups. In any case,
although the small size of the second sample (especially the clinical sub-sample)
precluded separate analyses according to these characteristics,
the randomisation procedure for selecting the sample from the census of
the general community-dwelling population provides support to the
generalisation of the results to the non-clinical population of older
adults. An additional problem of the clinical sub-sample used here is that
88.9% were women, limiting the generalisability of the results to
older men with depression.
In conclusion, the results of this work provide support for the use
of the proposed shortened version of the GDS, composed of 10 items
included in the original 30-item version. This scale at least retains the
diagnostic accuracy of the questionnaire compared with the total
scale, while it is a shorter, more economical, and easier to apply
measure of depression. Thus, the GDS-R can be used as an adequate
substitute for the GDS, especially in the context of limited time or
excessive assessment load.
References
American Psychiatric Association, 1994. Diagnostic and Statistical Manual of Mental
Disorders, 4th ed. APA, Washington, DC.
Arthur, A., Jagger, C., Lindesay, J., Graham, C., Clarke, M., 1999. Using an annual over-75
health check to screen for depression: validation of the short Geriatric Depression
Scale (GDS-15) within general practice. International Journal of Geriatric Psychiatry
14 (6), 431439.
Baker, F.M., Espino, D.V., 1997. A Spanish version of the Geriatric Depression Scale in
Mexican-American elders. International Journal of Geriatric Psychiatry 12 (1),
2125.
Barlow, D.H., 2002. Anxiety and its Disorders: The Nature and Treatment of Anxiety and
Panic. Guilford Press, New York.
Brink, T.L., Yesavage, J.A., Lum, B., Heersma, P., Adey, M., Rose, T.L., 1982. Screening tests
for geriatric depression. Clinical Gerontologist, 1, 3744.
Brown, L.M., Schinka, J.A., 2005. Development and initial validation of a 15-item
informant version of the geriatric depression scale. International Journal of
Geriatric Psychiatry 20, 911918.
Carrete, P., Augustovski, F., Gimpel, N., Fernandez, S., Di Paolo, R., Schaffer, I.,
Rubinstein, F., 2001. Validation of a telephone-administered Geriatric Depression
Scale in a Hispanic elderly population. Journal of General Internal Medicine 16 (7),
446450.
Cheng, S., Chan, A.C.M., 2004. A brief version of the Geriatric Depression Scale for the
Chinese. Psychological Assessment 16, 182186.
D' Ath, P., Katona, P., Mullan, E., Evans, S., Katona, C., 1994. Screening, detection and
management of depression in elderly primary care attenders. I: The acceptability
and performance of the 15 item Geriatric Depression Scale (GDS15) and the
development of short versions. The Journal of Family Practice 11, 260266.
146
DeLong, E.R., DeLong, D.M., Clarke-Pearson, D.L., 1988. Comparing the areas under two or
more correlated receiver operating curves: a nonparametric approach. Biometrics 44,
837845.
Del Ser, T., Pea-Casanova, J. (Eds.), 1994. Evaluacin neuropsicolgica y funcional de la
demencia.[Neuropsychological and functional assessment of dementia], Prous
Editores, Barcelona.
Fernndez-San Martn, M.I., Andrade, C., Molina, Muoz, P.E., Carretero, B., Rodrguez,
M., Silva, A., 2002. Validation of the Spanish version of the geriatric depression scale
(GDS) in primary care. International Journal Geriatric of Psychiatry 17, 279287.
First, M.B., Spitzer, R.L., Gibbon, M., Williams, J.B.W., 1999. Entrevista Clínica
Estructurada para los Trastornos del Eje I del DSM-IV (SCID-I) [Structured Clinical
Interview for the DSM-IV Axis I Disorders (SCID-I)]. Masson, Barcelona.
Folstein, M., Folstein, S., McHugh, P.R., 1975. Mini-Mental State: a practical method for
grading the cognitive state of patients for the clinician. Journal of Psychiatric
Research 12, 189-198.
Hanley, J.A., McNeil, B.J., 1982. The meaning and use of the area under a receiver
operating characteristic (ROC) curve. Radiology 143, 29-36.
Heisel, M.J., Flett, G.L., Duberstein, P.R., Lyness, J.M., 2005. Does the geriatric depression
scale (GDS) distinguish between older adults with high versus low ideation?
American Journal of Geriatric Psychiatry 13, 876-883.
Herrmann, N., Mittman, N., Silver, I.L., Shulman, K.I., Busto, U.A., Shear, N.H., Naranjo, C.A.,
1996. A validation study of the Geriatric Depression Scale (GDS) short form.
International Journal of Geriatric Psychiatry 11, 457-460.
Hilgers, R.A., 1991. Distribution-free confidence bounds for ROC curves. Methods of
Information in Medicine 30, 96-101.
Hoyl, M.T., Alessi, C.A., Harker, J.O., Josephson, K.R., Pietruszka, E.M., Koelfgen, M.,
Mervis, J.R., Fitten, L.J., Rubenstein, L.Z., 1999. Development and testing of a five-item
version of the Geriatric Depression Scale. Journal of the American Geriatrics
Society 47, 873-878.
Jongenelis, K., Pot, A.M., Eisses, A.M.H., Gerritsen, D.M., Derksen, M., Beekman, A.T.F.,
Kluiter, H., Ribbe, M.W., 2005. Diagnostic accuracy of the original 30-item and
shortened versions of the Geriatric Depression Scale in nursing home patients.
International Journal of Geriatric Psychiatry 20, 1067-1074.
Jongenelis, K., Gerritsen, D.L., Pot, A.M., Beekman, A.T., Eisses, A.M., Kluiter, H., Ribbe, M.W.,
2007. Construction and validation of a patient- and user-friendly nursing home version
of the Geriatric Depression Scale. International Journal of Geriatric Psychiatry 22,
837-842.
Lobo, A., Esquerra, J., Gomez-Burgada, F., Sala, J.M., Seva, A., 1979. El Mini-Examen
Cognoscitivo: un test sencillo y práctico para detectar alteraciones intelectuales en
pacientes médicos [The Cognoscitive mini-test: a simple practical test to detect
intellectual changes in medical patients]. Actas Luso-Española de Neurología y
Psiquiatría 3, 189-202.
Montorio, I., Izal, M., 1996. The Geriatric Depression Scale: a review of its development
and utility. International Psychogeriatrics 8, 103-111.
Montorio, I., Nuevo, R., Losada, A., Márquez, M., 2001. Prevalencia de trastornos de
ansiedad y depresión en una muestra de personas mayores residentes en la
comunidad [Prevalence of anxiety and depressive disorders in a sample of
community-dwelling older adults]. Mapfre Medicina 12, 19-26.
Mui, A., 1996. Geriatric Depression Scale as a community screening instrument for
elderly Chinese immigrants. International Psychogeriatrics 8 (3), 445-458.
Pérez, M.A., Moreno, V.M., Puerta, D.R., Martínez, Y.G., Vicario, I.H., Ceruelo, E.E., de la
Cámara, A.G., 2007. Factores socioeconómicos y frecuentación en las consultas de
medicina de familia de la red sanitaria pública madrileña [Socioeconomic factors
and utilization of public family practice facilities in Madrid]. Gaceta Sanitaria 21,
219-226.
Rabins, P., 1996. Barriers to diagnosis and treatment of depression in elderly patients.
American Journal of Geriatric Psychiatry 44, 79-84.
Reise, S.P., Widaman, K.F., Pugh, R.H., 1993. Confirmatory factor analysis and item response
theory: two approaches for exploring measurement invariance. Psychological Bulletin
114 (3), 552-566.
Schleifer, S.J., Keller, S.E., Bartlett, J.A., 1999. Depression and immunity: clinical factor
and therapeutic course. Psychiatry Research 85, 63-69.
Sheikh, J.I., Yesavage, J.A., 1986. Geriatric Depression Scale (GDS): recent evidence and
development of a shorter version. Clinical Gerontology 5, 165-173.
Stiles, P.G., McGarrahan, J.F., 1998. The Geriatric Depression Scale: a comprehensive
review. Journal of Clinical Gerontology 4, 89-110.
Tang, W.K., Wong, E., Chiu, H.F.K., Lum, C.M., Ungvari, G.S., 2005. The Geriatric
Depression Scale should be shortened: results of Rasch analysis. International
Journal of Geriatric Psychiatry 20, 783-789.
Tuunainen, A., Langer, R.D., Klauber, R.M., Kripke, D.F., 2001. Short version of the CES-D
(Burnam screen) for depression in reference to the structured psychiatric interview.
Psychiatry Research 103, 261-270.
Van Marwijk, H.W., Wallace, P., de Bock, G.H., Hermans, J., Kaptein, A.A., Mulder, J.D.,
1995. Evaluation of the feasibility, reliability and diagnostic value of shortened
versions of the geriatric depression scale. The British Journal of General Practice 45,
195-199.
Vázquez, L.F., Blanco, V., López, M., 2007. An adaptation of the Center for Epidemiologic
Studies Depression Scale for use in non-psychiatric Spanish populations. Psychiatry
Research 149, 247-252.
Wancata, J., Alexandrowicz, R., Marquart, B., Weiss, M., Friedrich, F., 2006. The criterion
validity of the Geriatric Depression Scale: a systematic review. Acta Psychiatrica
Scandinavica 114, 398-410.
WHO, 2006. The world health report 2006: working together for health. [Online].
Available at: http://www.who.int/whr/2006/whr06_en.pdf [accessed 10 May
2007].
Yesavage, J.A., 1988. Geriatric Depression Scales. Psychopharmacology Bulletin 24,
709-711.
Yesavage, J.A., Brink, T.L., Rose, T.L., Lum, O., Huang, V., Adey, M.B., Leirer, V.O., 1983.
Development and validation of a geriatric depression screening scale: a preliminary
report. Journal of Psychiatric Research 17, 37-49.
Youden, W.J., 1950. Index for rating diagnostic tests. Cancer 3, 32-35.
Yu, C.Y., 2002. Evaluating cutoff criteria of model fit indices for latent variable models with
binary and continuous outcomes. Doctoral dissertation, University of California, Los
Angeles.
