Haft BDIDASS PsyArXiv

Linking the Beck Depression Inventory (BDI-I) and the Depression Subscale from the
Depression Anxiety Stress Scale-21 (DASS-D)
Stephanie L. Haft1, Cannon Thomas2,3, Jacqueline B. Persons1,4

1Department of Psychology, University of California, Berkeley, Berkeley, CA
2San Francisco Group for Evidence-Based Psychotherapy, San Francisco, CA
3University of California San Francisco, San Francisco, CA
4Oakland Cognitive Behavior Therapy Center, Oakland, CA
Corresponding author: Stephanie L. Haft, stephanie.haft@berkeley.edu

Publication status: Unpublished
Draft date: February 3, 2023
Abstract
The Beck Depression Inventory (BDI-I) and depression subscale from the Depression
Anxiety Stress Scale-21 (DASS-D) are two common measures of depressive symptoms. The aim
of the present study was to use unidimensional item response theory (IRT) and equipercentile
linking methods to produce score cross-walk tables so that the BDI-I and DASS-D could be
compared. A sample of college students (N=455; 75% female, 21% White) completed both the
BDI-I and the DASS-D simultaneously. Methodology outlined by the PROsetta Stone project
(Choi et al., 2021) was used to link the BDI-I and DASS-D using data from this sample. A
separate validation sample of individuals recruited from Amazon mTurk (N=136) was used to
further test the accuracy of instrument linking methods. Results suggested that the BDI-I and
DASS-D were sufficiently unidimensional to be linked, and that IRT-based methods with fixed
parameter calibration produced the most accurate score cross-walk tables.

Depression is a salient public health issue given its relatively high lifetime prevalence
(2%-15%) and its disease burden (Moussavi et al., 2007). Consequently, numerous research
studies aim to better understand causes of and treatment for depression and depressive
symptoms. Over 280 measures of depression currently exist (Santor et al., 2006), making
comparisons across studies and across time challenging. Recently, the issue of multiple measures
in patient outcomes research has been addressed by the PROsetta Stone project and the Patient-
Reported Outcomes Measurement Information System (PROMIS; Bjorner, 2021). PROMIS uses
unidimensional item response theory (IRT) and equipercentile linking methods to establish a
common, standardized metric for frequently used patient-reported outcome measures. The
PROsetta Stone project publicly provides linking tables for several common metrics for
researchers to equate two measures of the same construct. In addition, the project has
outlined an optimal methodology for general IRT-based measurement linking procedures (Choi
et al., 2021). The aim of the present study is to link two common measurements of depression:
the Beck Depression Inventory (BDI-I) and depression subscale from the Depression Anxiety
Stress Scale-21 (DASS-D). Although the DASS-D is not included in the current PROsetta Stone
linking bank, the present study will use the methodology outlined by the PROsetta Stone project
to link the BDI-I and DASS-D using a sample of adults who simultaneously completed both
measures.
Methods
Sample
Main Sample (N=455)
A sample of 510 participants who met initial eligibility requirements (English fluency,
age 18 or older) was recruited from a large West Coast university to complete online
questionnaires. For the present study, participants were excluded if they did not complete the
BDI-I (N=31) or if they completed questionnaires in less than fifteen minutes (N=24), which
could indicate invalid responses. The final sample of 455 undergraduate participants was 75%
female, and 62% Asian, 21% White, 12% multiracial, 1% Black, <1% American Indian/Alaska
Native, <1% Native Hawaiian/Pacific Islander, and 4% chose not to report race. Participants
received course credit for completing the questionnaires, and all participants provided informed
consent.
Validation Sample (N=136)
A sample of 243 participants who met initial eligibility requirements (English fluency, age 18
or older) was initially recruited from Amazon Mechanical Turk. After excluding those who
did not complete key study measures, those who were flagged by a suspicious-ISP detection
algorithm (Prims et al., 2018), and those who completed the questionnaire in an improbably fast
time (less than 15 minutes), the final sample consisted of 136 participants for the present study.
This sample was 57% female and 85% White, 6% Native Hawaiian/Pacific Islander, 3% Asian,
3% multiracial, 2% American Indian/Alaska Native, and 1% chose not to report race.
Participants received $5 compensation for completing the questionnaires, and all participants
provided informed consent.
Measures
Beck Depression Inventory
The Beck Depression Inventory (BDI-I; Beck et al., 1987) is a 21-item self-report
inventory that assesses depression severity using a 4-point Likert scale. For the present study,
item 9 assessing suicidality was excluded from participant questionnaires, resulting in a 20-item
BDI-I. To score the BDI-I, individual item scores are summed to yield a possible score ranging
from 0 to 80 for the present study. In the main sample, the internal consistency of the 20-item
BDI-I was 0.91, and the split-half reliability was 0.86.
Depression Anxiety Stress Scale-21 – Depression
The depression subscale of the Depression Anxiety Stress Scale-21 (DASS-D; Lovibond
& Lovibond, 1995) consists of 7 items that ask individuals to rate the extent of several
depressive symptoms using a 4-point scale. To score the DASS-D, individual item scores are
summed to yield a possible score ranging from 0 to 21 for the present study. In the main sample,
the internal consistency of the 7-item DASS-D was 0.90, and the split-half reliability was 0.79.
Analytic Plan
We used a single group design in which a main sample (N=455) completed all items on
both the DASS-D and BDI-I. Factor analyses were conducted in Mplus (Muthén & Muthén,
2017), while all other analyses were conducted using the PROsetta package in R (Choi et al.,
2021).
Evaluating Linking Assumptions
To ensure that our data met assumptions for linking, we first qualitatively inspected item
content overlap between the DASS-D and BDI-I to select items that were conceptually congruent
in measurement of depression. Second, we evaluated the unidimensionality of a combined BDI-I
and DASS-D scale using exploratory and confirmatory factor analytic methods. We also
estimated the proportion of total variance that was attributable to a general factor with
hierarchical omega (ωh); values of 0.70 or higher indicate sufficient unidimensionality for
linking procedures (Reise et al., 2013). In addition, we calculated the correlation between the
separate BDI-I and DASS-D instruments, which is recommended to be a minimum of 0.70 (Cella
et al., 2016). We also calculated coefficient alpha and item-total correlations of the combined
item set. Third, we tested for another linking assumption, subgroup invariance, by comparing
standardized mean differences between the BDI-I and DASS-D among demographic subsamples
based on gender, age, and racial identity – the recommended SMD to demonstrate adequate
subgroup invariance is below 0.08 (Dorans & Holland, 2000).
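As a rough illustration of the internal-consistency check on the combined item set, coefficient alpha can be computed from a respondents-by-items score matrix. The sketch below is written in Python rather than the Mplus and R tools used in the study; the function name and the toy responses are hypothetical and are not study data.

```python
from statistics import pvariance

def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose to one tuple per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Toy data (hypothetical): 5 respondents x 3 items on a 4-point scale.
toy = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [3, 2, 3],
    [3, 3, 3],
]
alpha = coefficient_alpha(toy)
```

The same variance estimator is used for the items and the total, so the ratio is unaffected by the choice between population and sample variance.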
IRT-based and Equipercentile Linking
Once linking assumptions were met, we proceeded to link the BDI-I and DASS-D using
two main methods: item response theory (IRT)-based fixed-parameter equating, and
equipercentile linking. When deriving the BDI-to-DASS linking functions, we treated DASS-D
item parameters as the anchor measure during equating, and used a graded response model. From
the IRT-based estimated item parameters, we created a cross-walk score conversion table of
summed scores. We also used equipercentile linking (Kolen, 2004), a non-IRT method that
maps each score on the source measure to the score on the target measure with the same
percentile rank in the sample. This method also produces a cross-walk score
conversion table. Because equipercentile linking functions can be noisy, loglinear
smoothing was used.
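The percentile-matching idea behind equipercentile linking can be sketched in a few lines. This is a simplified illustration, not the procedure used in the study: it omits loglinear smoothing, uses mid-percentile ranks, and picks the nearest target score instead of interpolating. The function names and toy scores are hypothetical.

```python
def percentile_rank(scores, x):
    """Mid-percentile rank of score x: percent below x plus half the percent at x."""
    n = len(scores)
    below = sum(1 for s in scores if s < x)
    at = sum(1 for s in scores if s == x)
    return 100.0 * (below + 0.5 * at) / n

def equipercentile_crosswalk(source, target, source_range, target_range):
    """Map each source score to the target score with the closest percentile rank."""
    target_pr = {y: percentile_rank(target, y) for y in target_range}
    table = {}
    for x in source_range:
        pr_x = percentile_rank(source, x)
        table[x] = min(target_range, key=lambda y: abs(target_pr[y] - pr_x))
    return table

# Toy single-group data (hypothetical): each person has a source and a target score.
source_scores = [0, 1, 1, 2, 2, 2, 3, 4]
target_scores = [0, 2, 2, 4, 4, 4, 6, 8]
crosswalk = equipercentile_crosswalk(
    source_scores, target_scores, range(0, 5), range(0, 9)
)
```

With real data the unsmoothed score distributions are irregular, which is the motivation for the smoothing step described above.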
Evaluating Linking Results
Ultimately three linking methods were compared and evaluated for accuracy by
computing the correlation, mean difference, and standard deviation of difference scores for
linked and actual scores. The first linking method evaluated was IRT pattern scoring with an
expected a posteriori (EAP) estimate, which uses the item parameter estimates and the pattern of
item responses to produce a linked score. The second linking method was IRT cross-walk scoring,
which also uses EAP estimation but derives linked scores from the cross-walk table of summed
scores. The third linking method was equipercentile linking with loglinear smoothing. To
evaluate these three methods, we first compared actual and linked scores using the main sample
(N=455). Next, we used the produced cross-walk score tables to evaluate the accuracy of IRT
cross-walk scoring and equipercentile linking in a separate validation sample (N=136).
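The three accuracy indices named above can be computed directly from paired actual and linked scores. The following is a minimal sketch; the sign convention for the difference (linked minus actual) is an assumption, and the function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def linking_accuracy(actual, linked):
    """Return (Pearson r, mean difference, SD of differences) for linked vs. actual scores."""
    diffs = [l - a for a, l in zip(actual, linked)]   # assumed direction: linked - actual
    mean_a, mean_l = mean(actual), mean(linked)
    cov = sum((a - mean_a) * (l - mean_l) for a, l in zip(actual, linked))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_l = sum((l - mean_l) ** 2 for l in linked)
    return cov / sqrt(ss_a * ss_l), mean(diffs), stdev(diffs)

# Toy paired scores (hypothetical), one pair per respondent.
actual_dass = [6, 8, 10, 12, 14]
linked_dass = [7, 8, 9, 13, 14]
r, mean_diff, sd_diff = linking_accuracy(actual_dass, linked_dass)
```

A mean difference near zero indicates little systematic bias, while the SD of differences reflects how far individual linked scores fall from the actual scores.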
Results
Item Content Overlap
Comparison of item content across the 20-item BDI-I and 7-item depression subscale
from the DASS-21 (DASS-D) revealed that the BDI-I had several items that assessed constructs
not measured on the DASS-D, including guilt feelings, sense of punishment, self-accusation,
crying, irritability, social withdrawal, indecisiveness, somatic distortion, sleep disturbance,
fatigability, loss of appetite, weight loss, somatic preoccupation, and loss of libido (see Table 1).
Results from factor analyses from another study that compared the BDI-I and DASS-D
suggested that “the BDI differs from the DASS Depression scale primarily in that the BDI
includes items such as weight loss, insomnia, somatic preoccupation and irritability, which fail to
discriminate between depression and other affective states” (Lovibond & Lovibond, 1995,
p. 335). In addition, the DASS-D had an item evaluating devaluation of life, which was not
measured by any BDI-I items. These items without conceptual overlap were consequently
dropped from both scales, resulting in a 6-item BDI-I and 6-item DASS-D.
Testing Unidimensionality
Exploratory factor analysis (EFA) on the 12-item combined BDI-I and DASS-D revealed
that the ratio of first to second eigenvalues was above 6, indicating sufficient unidimensionality
(ratio > 4; Reeve et al., 2007). A confirmatory factor analysis (CFA) was conducted on the
combined measure to evaluate fit of a unidimensional model. The following conventions were
used, which are considered adequate to evaluate appropriateness for IRT modeling approaches
(Choi et al., 2011): comparative fit index (CFI) > 0.90, Tucker Lewis index (TLI) > 0.90, and
root-mean-square-error of approximation (RMSEA) < 0.10. All items loaded onto a single factor
in significant and expected directions, and model fit was adequate (CFI = 0.93, TLI = 0.91,
RMSEA = 0.078). In addition, the hierarchical omega (ωh) value of 0.79 indicated a sufficient
index of unidimensionality (ωh>0.70; Reise et al., 2013). The computed Pearson correlation of
the BDI-I and DASS-D summed scores was r=0.75, which is above the recommended threshold
(>0.70) for scale linking (Cella et al., 2016). In addition, coefficient alpha for the combined scale
indicated good internal consistency (0.92), and item-total correlations ranged from 0.64 to 0.82.
Subgroup Invariance Analysis
Standardized mean differences (SMD) were computed for gender-related differences,
education-related differences (>14 years vs. < 14 years), and racial identity differences (person
of color vs. White). The SMDs were comparable between the DASS-D and BDI-I summed
scores for each subgroup – differences between SMDs were 0.068 for gender, 0.026 for
education, and 0.043 for racial identity. Given that SMD values were less than 0.08, subgroup
invariance was supported and this assumption for scale linking was met (Dorans & Holland,
2000).
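The subgroup invariance check compares, for each demographic split, the standardized mean difference on one measure with the same difference on the other. A minimal sketch follows, assuming a pooled-standard-deviation SMD (the text does not state which standardizer was used); the function names and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def smd(group_a, group_b):
    """Standardized mean difference between two groups, using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def smd_difference(bdi_a, bdi_b, dass_a, dass_b):
    """Absolute difference between the subgroup SMDs on the two measures."""
    return abs(smd(bdi_a, bdi_b) - smd(dass_a, dass_b))

# Toy subgroup scores (hypothetical): subgroup a vs. subgroup b on each measure.
bdi_a, bdi_b = [8, 10, 12], [6, 8, 10]
dass_a, dass_b = [7, 9, 12], [5, 8, 10]
gap = smd_difference(bdi_a, bdi_b, dass_a, dass_b)
meets_criterion = gap < 0.08   # threshold cited in the text
```

In this toy example the gap is far above 0.08, so the criterion would not be met; the values reported above (0.068, 0.026, and 0.043) all fall below it.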
Evaluation of Linking Results
Table 2 displays item parameters for the IRT model using fixed-parameter calibration.
Score cross-walk tables are shown in Table 3, using IRT-based item parameter estimates as well
as equipercentile linking methods. To compare the accuracy of the three linking methods (IRT-
based pattern scoring, IRT-based cross-walk scoring, and equipercentile linking), we first
compared the linked DASS-D summed raw score to the actual DASS-D summed score within
the main sample. We then used the cross-walk tables in Table 3 to produce linked
scores for the DASS-D based on BDI-I scores in a separate validation sample. Results from these
comparisons are shown in Table 4. The IRT-based cross-walk scoring and equipercentile linking
produced adequate and identical correlations for both the main sample (0.74) and the validation
sample (0.84). However, of the three methods, the IRT-based cross-walk method
resulted in the least variation in difference scores (SD of differences) in the main sample. Therefore,
the IRT-based cross-walk method was selected as the most accurate linking method.
Conclusion
The present study produces a score cross-walk table that can be used to link a modified 6-
item version of the BDI-I and 6-item version of the DASS-D. Linking the BDI-I and DASS-D
can help researchers who study depression compare these metrics across studies and across time.
Of the methods tested, results suggested that using IRT-based cross-walk tables was the most
accurate linking method. This study is limited in that findings are derived from a Western,
nonclinical sample of adults – it will be important for future studies to replicate these equating
procedures and relationships across clinical and non-Western samples.

References
Beck, A. T., Steer, R. A., & Brown, G. K. (1987). Beck Depression Inventory. New York:
Harcourt Brace Jovanovich.
Bjorner, J. B. (2021). Solving the tower of babel problem for patient-reported outcome measures:
comments on: linking scores with patient-reported health outcome instruments: a validation
study and comparison of three linking methods. Psychometrika, 86(3), 747–753.
Cella, D., Lai, J.-S., Jensen, S. E., Christodoulou, C., Junghaenel, D. U., Reeve, B. B., & Stone,
A. A. (2016). PROMIS fatigue item bank had clinical validity across diverse chronic
conditions. Journal of Clinical Epidemiology, 73, 128–134.
Choi, S. W., Lim, S., Schalet, B. D., Kaat, A. J., & Cella, D. (2021). PROsetta: An R package for
linking patient-reported outcome measures. Applied Psychological Measurement, 45(5),
386–388.
Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic
theory and the linear case. Journal of Educational Measurement, 37(4), 281–306.
Kolen, M. J. (2004). Linking assessments: Concept and history. Applied Psychological
Measurement, 28(4), 219–226.
Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states:
Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and
Anxiety Inventories. Behaviour Research and Therapy, 33(3), 335–343.
Moussavi, S., Chatterji, S., Verdes, E., Tandon, A., Patel, V., & Ustun, B. (2007). Depression,
chronic diseases, and decrements in health: results from the World Health Surveys. The
Lancet, 370(9590), 851–858.
Muthén, B., & Muthén, L. (2017). Mplus. In Handbook of item response theory (pp. 507–518).
Chapman and Hall/CRC.
Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2013). Multidimensionality and
structural coefficient bias in structural equation modeling: A bifactor perspective.
Educational and Psychological Measurement, 73(1), 5–26.
Santor, D. A., Gregus, M., & Welch, A. (2006). FOCUS ARTICLE: Eight decades of
measurement in depression. Measurement: Interdisciplinary Research and Perspectives,
4(3), 135–155.
Table 1
Item Conceptual Content and Overlap in the Beck Depression Inventory (BDI-I) and DASS
Depression Scales (DASS-D)
BDI-I                                           DASS-D
Item #  Construct Assessed     Dropped (d)      Item #  Construct Assessed     Dropped (d)
1       Mood                                    3       Anhedonia
2       Hopelessness                            5       Inertia
3       Self-deprecation                        10      Hopelessness
4       Anhedonia                               13      Mood
5       Guilt feelings         d                16      Anhedonia
6       Sense of Punishment    d                17      Self-deprecation
7       Self-deprecation                        21      Devaluation of life    d
8       Self-accusation        d
10      Crying                 d
11      Irritability           d
12      Social withdrawal      d
13      Indecisiveness         d
14      Somatic Distortion     d
15      Inertia
16      Sleep disturbance      d
17      Fatigability           d
18      Loss of Appetite       d
19      Weight loss            d
20      Somatic preoccupation  d
21      Loss of libido         d
Table 2
DASS-D and BDI-I Item Parameter Estimates from Fixed-Parameter Calibration
Item Slope CB1 CB2 CB3

DASS3 2.994 0.558 1.390 1.851
DASS10 3.902 0.515 1.169 1.713
DASS13 2.311 -0.008 1.213 1.521
DASS16 3.009 0.487 1.365 1.712
DASS17 2.706 0.680 1.211 1.757
DASS21 2.666 0.878 1.320 1.640
BDI1 2.700 0.061 1.409 2.073
BDI2 2.567 0.247 1.436 2.286
BDI3 2.233 0.533 1.529 2.646
BDI4 2.095 0.346 1.887 2.453
BDI7 2.391 0.174 1.814 2.408
BDI15 2.237 0.060 1.301 2.469
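To show how parameters such as those in Table 2 translate into response probabilities and a pattern-based score, the sketch below implements a graded response model and a grid-based EAP estimate. It assumes that CB1 to CB3 are category boundary (threshold) parameters in the standard logistic parameterization, that responses are coded 0 to 3, and that the prior is standard normal; none of these details is confirmed by the text, and the function names are hypothetical. It is not the PROsetta implementation.

```python
from math import exp

def grm_category_probs(theta, slope, thresholds):
    """Graded response model: probability of each ordered response category."""
    # Cumulative curves P(X >= k), bracketed by 1.0 (lowest) and 0.0 (above highest).
    cum = [1.0] + [1.0 / (1.0 + exp(-slope * (theta - b))) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def eap_theta(responses, items):
    """EAP estimate of theta on a fixed grid, assuming a standard normal prior."""
    grid = [i / 10.0 for i in range(-40, 41)]   # theta from -4.0 to 4.0
    num = den = 0.0
    for t in grid:
        weight = exp(-0.5 * t * t)              # unnormalized N(0, 1) prior density
        for x, (slope, thresholds) in zip(responses, items):
            weight *= grm_category_probs(t, slope, thresholds)[x]
        num += t * weight
        den += weight
    return num / den

# Parameters copied from Table 2 as (slope, [CB1, CB2, CB3]).
DASS3 = (2.994, [0.558, 1.390, 1.851])
BDI1 = (2.700, [0.061, 1.409, 2.073])
BDI2 = (2.567, [0.247, 1.436, 2.286])

probs = grm_category_probs(theta=0.558, slope=DASS3[0], thresholds=DASS3[1])
low = eap_theta([0, 0], [BDI1, BDI2])    # lowest category on both items
high = eap_theta([3, 3], [BDI1, BDI2])   # highest category on both items
```

At a theta equal to an item's first threshold, the probability of the lowest category is exactly one half, which gives a quick sanity check on the parameterization.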
Table 3
Summed Score Cross-walk Tables for the BDI-I and DASS-D
IRT-Based Method Equipercentile Linking

BDI-I Score DASS-D Score BDI-I Score DASS-D Score
6 6 6 6
7 6 7 6
8 7 8 7
9 8 9 8
10 9 10 9
11 10 11 10
12 11 12 12
13 13 13 14
14 14 14 15
15 16 15 16
16 17 16 18
17 19 17 19
18 20 18 19
19 21 19 20
20 22 20 21
21 23 21 22
22 23 22 22
23 24 23 23
24 24 24 24
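Applying a cross-walk table amounts to a lookup from the source summed score to the linked score. The sketch below encodes the IRT-based columns of Table 3 as a dictionary; the names `IRT_CROSSWALK` and `link_bdi_to_dass` are hypothetical, and the table applies only to the 6-item versions of both scales.

```python
# IRT-based cross-walk from Table 3: 6-item BDI-I summed score -> 6-item DASS-D summed score.
IRT_CROSSWALK = {
    6: 6, 7: 6, 8: 7, 9: 8, 10: 9, 11: 10, 12: 11, 13: 13, 14: 14, 15: 16,
    16: 17, 17: 19, 18: 20, 19: 21, 20: 22, 21: 23, 22: 23, 23: 24, 24: 24,
}

def link_bdi_to_dass(bdi_sum):
    """Return the linked DASS-D summed score for a 6-item BDI-I summed score."""
    if bdi_sum not in IRT_CROSSWALK:
        raise ValueError(f"BDI-I summed score {bdi_sum} is outside the table range (6-24)")
    return IRT_CROSSWALK[bdi_sum]

linked = [link_bdi_to_dass(score) for score in (6, 13, 17, 24)]
```

Scores outside the tabulated range raise an error rather than being extrapolated, since the table provides no basis for values below 6 or above 24.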
Table 4
Correlations, Mean Differences, and Standard Deviations of Actual vs. Linked DASS-D Scores
Linking Method Correlation Mean Difference SD of Differences

Main Sample (N=455)
IRT Pattern Scoring 0.69 -0.53 2.83
IRT Cross-Walk Scoring 0.74 0.15 2.76
Equipercentile Linking 0.74 0.00 2.80
Validation Sample (N=136)
IRT Cross-Walk Scoring 0.87 0.54 2.50
Equipercentile Linking 0.87 0.48 2.50
