Author manuscript
Psychol Med. Author manuscript; available in PMC 2023 June 19.
3Rhode Island Hospital, Butler Hospital, and Alpert Medical School of Brown University,
Providence, RI, USA
4New York-Presbyterian Hospital and Weill Cornell Medical College, New York, NY, USA
Abstract
Background.—Serotonin-reuptake inhibitors (SRIs) are first-line pharmacotherapy for the
treatment of body dysmorphic disorder (BDD), a common and severe disorder. However, prior
research has not focused on or identified definitive predictors of SRI treatment outcomes.
Leveraging precision medicine techniques such as machine learning can facilitate the prediction of
treatment outcomes.
Methods.—The study used 10-fold cross-validation support vector machine (SVM) learning
models to predict three treatment outcomes (i.e. response, partial remission, and full remission) for
97 patients with BDD receiving up to 14 weeks of open-label treatment with the SRI escitalopram.
SVM models used baseline clinical and demographic variables as predictors. Feature importance
analyses complemented traditional SVM modeling to identify which variables most successfully
predicted treatment response.
Conclusions.—The current study is the first to demonstrate that machine learning algorithms
can successfully predict treatment outcomes for pharmacotherapy for BDD. Consistent with
precision medicine initiatives in psychiatry, the current study provides a foundation for
personalized pharmacotherapy strategies for patients with BDD.
Keywords
Body dysmorphic disorder; machine learning; pharmacotherapy; SRI
Introduction
Patients with body dysmorphic disorder (BDD) experience distressing or impairing
Serotonin-reuptake inhibitors (SRIs) are the first-line pharmacologic treatment for BDD
and are often efficacious (Phillips, 2017). However, not all patients achieve response or
remission, and thus it is important to identify predictors of treatment outcomes. For SRI
treatment, intent-to-treat nonresponse rates range from 27% to 47%, and a completer
analysis of the current study (the only medication study to report completer analyses)
yielded a non-response rate of 19% (Phillips, 2017). Likewise, for cognitive behavioral
therapy (CBT) for BDD, intent-to-treat non-response rates range from 46% to 60%
(Harrison, de la Cruz, Enander, Radua, & Mataix-Cols, 2016), and a completer analysis
yielded non-response rates of 15% to 17% across two different sites in a recent trial
(Wilhelm et al., 2019). Only a few SRI studies have examined predictor variables. Moreover,
this research has focused principally on co-morbidity as the primary predictor rather than a
more comprehensive set of constructs and variables that have relevance to BDD. Phillips,
Dwight, and McElroy (1998) found that comorbid major depressive disorder (MDD) and
obsessive–compulsive disorder (OCD) did not predict the treatment response of BDD to
fluvoxamine (n = 30). A randomized placebo-controlled trial of fluoxetine in 67 patients
with BDD similarly found that treatment response was independent of the presence of
comorbid MDD and OCD as well as severity and duration of BDD (Phillips, Albertini,
& Rasmussen, 2002). Likewise, a small open-label trial of citalopram for BDD (n = 15)
found that treatment response was as likely for those with and without MDD (Phillips
& Najjar, 2003). And in a double-blind cross-over trial of the SRI clomipramine v. the
non-SRI antidepressant desipramine (n = 29), treatment efficacy was not moderated by
the presence of comorbid MDD, OCD, or social anxiety disorder (SAD) (Hollander et
al., 1999). In addition, in all of these studies, delusionality/insight of BDD beliefs did not
predict treatment response.
There is, however, some limited evidence that personality disorder (PD) pathology might
predict poorer response to SRI treatment in BDD, although findings are mixed.
In two studies, comorbid PD did not predict response to the SRIs fluoxetine (n = 67)
or fluvoxamine (n = 30), although this might be attributable to type II error (Phillips &
McElroy, 2000; Phillips et al., 2002). However, fluvoxamine responders had significantly
fewer PDs than non-responders at study baseline. And although the latter study did not
find that neuroticism predicted SRI response, another, larger study did (Fang, Porth,
Phillips, & Wilhelm, 2019). The primary report from the current study briefly noted that the
only baseline variable that predicted BDD response was the presence of a PD (Phillips et al.,
2016).
To our knowledge, these are the only studies that have examined predictors of SRI outcomes
in BDD. None of these reports used supervised machine learning approaches, which have
multiple advantages when examining predictors of treatment outcome (see below). In
addition, all prior reports examined treatment response but not remission, most were limited
by small sample sizes, and most examined just a few potential predictors. Furthermore,
given recent initiatives in promoting precision medicine frameworks in psychiatry and
clinical psychology (Bernardini et al., 2017; Hayes et al., 2019; Hofmann, Curtiss, & Hayes,
2020), it would be profitable to determine whether machine learning approaches would
enhance our ability to predict who would be most likely to benefit from pharmacotherapy for
BDD. Traditionally, researchers have used familiar statistical procedures such as ordinary
least squares methods to test whether a small number of hypothesized psychological
Although several prior studies have examined a limited number of predictor variables, as
mentioned above, there has been no research employing state-of-the-art predictive modeling
approaches such as machine learning to develop more refined predictive clinical tools
for BDD treatment. Machine learning can assist in determining: (a) whether meaningful
tools can be developed to predict individual treatment outcomes, and (b) what individual
predictors contribute most to the accurate classification of treatment outcomes.
The current report is the first to focus on predictors of medication treatment response in
BDD and, more specifically, to utilize machine learning to determine whether baseline
clinical and demographic characteristics can predict treatment outcome with an SRI for
BDD. This report leverages data from the largest study of SRI treatment for BDD (Phillips
et al., 2016). Consistent with the primary goals of a machine learning approach, the current
study adopted a wide array of predictors that have generally been studied in prior BDD
studies and were available in the current dataset (Phillips, 2017). Specifically, support vector
machines (SVM) were used to predict three outcomes of interest: responder status, partial
remission status, and full remission status. Being able to differentially predict whether
any given patient with BDD will achieve one of these three outcomes can facilitate
decision-making processes about what course of treatment is most advisable. To optimize
our ability to predict treatment outcomes, recursive feature elimination (RFE) procedures
were implemented to identify the best performing model using the most successful
predictors. Furthermore, feature importance analyses were conducted to complement the
primary SVM analyses to determine which baseline variables contributed most to prediction
performance. The predictive modeling approach adopted in the current study may be able to
promote better precision medicine tools for BDD treatment. Specifically, machine learning
procedures may be able to inform an applied framework such as an online prediction
calculator, whereby patient scores can be inputted to determine the likelihood of achieving a
certain treatment outcome for SRI treatment. By identifying whether a patient will likely
achieve remission, partial remission, or no response at all from SRI treatment, these
tools can inform clinical decisions about whether SRI treatment is likely to be sufficient
or whether additional interventions might be indicated (e.g. CBT). However, translating
machine learning algorithms for use in clinical practice necessitates circumspection insofar
as these prediction tools may be undermined by poor model performance and by being
validated on non-representative samples (Senior, Fanshawe, Fazel, & Fazel, 2021). The
current study provides the initial steps in leveraging machine learning in a precision
medicine context for BDD treatment.
Methods
Participants
Participants in the study were 100 adults with a diagnosis of DSM-IV BDD. Diagnoses were
obtained using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I; First,
Spitzer, Gibbon, & Williams, 1997) and the Structured Clinical Interview for DSM-IV Axis
II Personality Disorders (SCID-II; First, Gibbon, Spitzer, Williams, & Benjamin, 1997). Further inclusion
criteria included a score of ⩾24 on the Yale-Brown Obsessive–Compulsive Scale
Modified for BDD (BDD-YBOCS, Phillips et al., 1997; Phillips, Hart, & Menard, 2014),
reflecting BDD of at least moderate severity, and a score of at least moderate on the CGI
Severity Scale (Guy, 1976). Exclusion criteria included current or past bipolar disorder or
a psychotic disorder, current clinically significant suicidality or a suicide attempt within the
past year, substance abuse or dependence within the past 3 months, concurrent CBT, and use
of psychotropic medication during the study or for 2 weeks before baseline assessment (6
weeks for fluoxetine). The current research is a secondary data analysis of the original trial
(Phillips et al., 2016), and full inclusion and exclusion criteria are more fully described in
the original publication.
Procedure
Full details about the experimental procedures for this clinical trial are reported in the
original study (Phillips et al., 2016). As previously reported, participants were recruited
from one of two sites (i.e. Massachusetts General Hospital, or Butler Hospital and then
Rhode Island Hospital, the latter two both affiliated with Brown University). The clinical trial consisted
of two phases. In phase I, all participants received open-label escitalopram treatment (up
to 30 mg/day) for 14 weeks. In phase II, responders to the initial treatment were randomly
assigned to either 6 months of continuation of escitalopram treatment or to discontinuation
of escitalopram and substitution with pill placebo. The current report uses data from phase I
of the study, specifically pre-treatment baseline data and response and remission status after
14 weeks of escitalopram treatment.
Measures
Predictor variables—Several demographic variables were considered for the analysis, but
the final models included gender and race because they were the two variables most likely to
be associated with the outcome variables after a pre-screening process (see online Appendix
I of Supplementary Methods). With respect to gender, all the participants identified as either
male or female. Because there was a small proportion of participants who did not identify as
white (16%), the race variable was binary coded to represent white and non-white, as further
divisions would lead to categories with very low percentages that could not be analyzed.
Clinical variables included the following. The BDD-YBOCS (Phillips et al., 1997,
2014) is a 12-item semi-structured rater-administered scale adapted from the Yale-
Brown Obsessive–Compulsive Scale, rating past-week BDD severity. Interrater reliability
(intraclass correlations) on the BDD-YBOCS for all scale items and the total score was
greater than 0.9 (Phillips et al., 2016). Reliability (Cronbach’s alpha) of the scale in the
current study was α = 0.78. The Clinical Global Impression Scale (CGI; Guy, 1976) is
a global rating scale that ranges from 1 (normal, not ill at all) to 7 (among the most
extremely ill patients), which was used to assess BDD severity. The Brown Assessment of
Beliefs Scale (BABS; Eisen et al., 1998) is a 7-item semi-structured rater-administered scale,
assessing past-week BDD-related insight/delusional beliefs (e.g. ‘I am ugly’). Interrater
reliability (intraclass correlations) on the BABS for all scale items and the total score
was >0.9 (Phillips et al., 2016). Reliability (Cronbach’s alpha) in the current dataset was
α = 0.76. The Hamilton Depression Rating Scale (HAM-D; Miller, Bishop, Norman, &
Maddever, 1985) is a 17-item semi-structured instrument assessing the current severity
of depressive symptoms (α = 0.78). The Q-LES-Q Short Form (Endicott, Nee, Harrison,
& Blumenthal, 1993) assesses the quality of life in social, leisure, household, work,
emotional well-being, physical, and school domains (α = 0.89). The Beck Depression
Inventory II (BDI-II; Beck, Steer, & Brown, 1996) is a 21-item, self-report questionnaire
assessing depression (α = 0.92). The Beck Hopelessness Scale (BHS; Beck & Steer,
1988) is a 20-item, self-report questionnaire assessing major aspects of hopelessness
including feelings about the future, loss of motivation, and expectations (α = 0.92). The
Brief Symptom Inventory (BSI; Derogatis & Melisaratos, 1983) is a 53-item, self-report
instrument assessing a broad range of general psychopathology and distress (α = 0.97).
Finally, the presence of certain comorbid Axis I diagnoses [i.e. SAD, MDD, and OCD] and
the presence of any Axis-II PD were assessed with the SCID-I and SCID-II. Thus, a total of
with previous research that has empirically characterized these states in BDD (Fernández
de la Cruz et al., 2021; Phillips et al., 1997, 2014). Specifically, treatment response is
characterized by a 30% or greater reduction in BDD-YBOCS scores, whereas achieving
at least partial remission is defined as a BDD-YBOCS score of less than or equal to
16. Consistent with Fernández de la Cruz et al. (2021), response and partial remission
were stipulated as lasting for at least one week, which differs from the operationalization
by Phillips et al. (2016) requiring a reduction in symptoms being preserved for at least
two consecutive assessment points. Because a cutpoint for full remission has not been
established for the BDD-YBOCS, the field’s primary treatment outcome measure, we
characterized full remission by a score of less than or equal to 2 on the Psychiatric Status
Rating Scale for Body Dysmorphic Disorder (BDD-PSR), which is a reliable global 7-point
scale measuring BDD severity and diagnostic status. This cutpoint on the BDD-PSR was
used in the primary paper from this study to determine full remission (Phillips, Pagano,
Menard, & Stout, 2006).
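The three outcome definitions above can be expressed as simple threshold rules. The following Python sketch is illustrative only and is not the code used in the study: the function and field names are hypothetical, and the duration criterion (an outcome lasting at least one week) is deliberately not modeled.

```python
from dataclasses import dataclass

# Thresholds taken from the outcome definitions in the text.
RESPONSE_MIN_REDUCTION = 0.30   # 30% or greater reduction in BDD-YBOCS from baseline
PARTIAL_REMISSION_MAX = 16      # BDD-YBOCS score of 16 or less
FULL_REMISSION_MAX_PSR = 2      # BDD-PSR score of 2 or less


@dataclass
class Outcomes:
    response: bool
    partial_remission: bool
    full_remission: bool


def classify_outcomes(baseline_ybocs: float,
                      endpoint_ybocs: float,
                      endpoint_psr: int) -> Outcomes:
    """Apply the three threshold rules to one (hypothetical) patient record."""
    if baseline_ybocs <= 0:
        raise ValueError("baseline BDD-YBOCS must be positive")
    # Proportional reduction relative to the baseline score.
    reduction = (baseline_ybocs - endpoint_ybocs) / baseline_ybocs
    return Outcomes(
        response=reduction >= RESPONSE_MIN_REDUCTION,
        partial_remission=endpoint_ybocs <= PARTIAL_REMISSION_MAX,
        full_remission=endpoint_psr <= FULL_REMISSION_MAX_PSR,
    )
```

Under these rules a patient can be a responder without reaching partial remission: a drop from 32 to 20 is a 37.5% reduction, yet 20 remains above the cutpoint of 16.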
Data analysis—To predict treatment response, partial remission, and full remission,
machine learning algorithms were evaluated using all the aforementioned predictor
variables. Initially, several model algorithms were considered and compared, as indicated
in the online Supplementary Methods section. Overall, a radial kernel SVM algorithm
was used to predict the outcomes of interest as it exhibited the best performance compared
to other algorithms (see online Supplementary Tables S1–S3). SVM features a number
of advantages that make it particularly suited to the current project. Specifically, SVM
procedures attempt to maximize generalizability (i.e. how accurately the model performs on unseen data),
are suitable for situations in which there are relatively smaller sample sizes and a larger
number of predictor variables, and are robust to outliers (Boehmke & Greenwell, 2019).
Machine learning models were examined using 10-fold cross-validation, which partitions the
sample into 10 subsets, of which nine are used in the training process and predictions are
made in the remaining subset. This process is repeated across all 10 subsets,
with each subset being used exactly once as the testing data. Results of the 10
folds are averaged to produce a single estimate.
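The partitioning logic of 10-fold cross-validation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names and the `fit_and_score` callback are hypothetical, the split is a plain shuffled partition (whether the original folds were stratified by outcome is not stated here), and model fitting is left to the caller.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average a per-fold score.

    `fit_and_score(train_idx, test_idx)` is a caller-supplied (hypothetical)
    callback that trains on the training indices and returns a score
    computed on the held-out indices.
    """
    scores = [fit_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

With 97 participants and k = 10, this scheme produces seven folds of 10 and three folds of 9, so each model is trained on 87 or 88 participants.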
under the curve (AUC) metrics were calculated. An AUC value of 0.5 denotes discrimination
between classes at the chance level, and values greater than 0.5 denote successful
classification (i.e. maximize the true positive rate and minimize the false positive rate).
Although the exact meaning of AUC values must be interpreted in relation to the particular
classification problem of interest, we will adopt the following generally accepted AUC
framework as an interpretive guideline: AUC = 0.50 reflects no discrimination, 0.70
⩽ AUC < 0.80 reflects acceptable discrimination, and AUC ⩾ 0.80 reflects excellent
discrimination (Hosmer & Lemeshow, 1999). Also, standard ROC metrics were evaluated
such as sensitivity (i.e. the proportion of positives correctly identified) and specificity (i.e.
the proportion of negatives that are correctly identified). These metrics range between 0 and
1 such that larger values indicate better performance. Accuracy, which is the percentage of
total items classified correctly, was estimated as well.
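For concreteness, the performance metrics described in this paragraph can be computed as follows. The sketch is a plain-Python illustration with hypothetical function names; it assumes binary labels coded 1 (positive) and 0 (negative), computes AUC as the proportion of positive-negative pairs ranked correctly (ties counted as one half), and treats AUC values below 0.70 simply as below the acceptable range.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a random negative case."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary predictions."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),   # proportion of positives correctly identified
        "specificity": tn / (tn + fp),   # proportion of negatives correctly identified
        "accuracy": (tp + tn) / len(y_true),
    }


def interpret_auc(auc):
    """Map an AUC value onto the interpretive bands used in the text."""
    if auc >= 0.80:
        return "excellent"
    if auc >= 0.70:
        return "acceptable"
    return "below acceptable"
```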
it can be helpful to determine which subset of predictors results in the best classification
performance. Thus, RFE procedures were implemented to identify the best performing
model using the most successful predictors. In brief, RFE is an iterative process by which a
machine learning model is trained and tested with varying subsets of the predictor variables,
and a final model is fitted with the optimal subset of predictors (Kuhn & Johnson, 2013). In
the current study, the RFE algorithm was estimated for subset sizes ranging from 1 to 14 (i.e.
up to the total number of predictors available). Results are presented for the best-performing,
final model containing the optimal subset of predictors.
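The general shape of an RFE loop can be sketched as below. This is a schematic illustration rather than the procedure used in the study: `rank_features` and `score_subset` are hypothetical caller-supplied callbacks (in practice they would wrap a cross-validated model fit), and details such as whether predictors are re-ranked at each step are assumptions.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def recursive_feature_elimination(
    features: Sequence[str],
    rank_features: Callable[[List[str]], List[str]],
    score_subset: Callable[[List[str]], float],
    sizes: Optional[Iterable[int]] = None,
) -> Tuple[List[str], float]:
    """Return the best-scoring predictor subset and its score.

    rank_features(subset) -> the subset ordered from most to least important.
    score_subset(subset)  -> a performance estimate (higher is better),
                             e.g. a cross-validated AUC.
    """
    if sizes is None:
        sizes = range(1, len(features) + 1)   # e.g. 1 to 14 in the current study
    current = list(features)
    best_subset: List[str] = []
    best_score = float("-inf")
    # Work from the largest subset size down, dropping the weakest predictors.
    for size in sorted(set(sizes), reverse=True):
        if size > len(current):
            continue
        current = rank_features(current)[:size]   # re-rank, then keep the top `size`
        score = score_subset(current)
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score
```

Because the comparison is strict and sizes are visited in descending order, ties are resolved in favor of the larger subset; a different tie-breaking rule (e.g. preferring the most parsimonious model) would be an equally defensible choice.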
Furthermore, feature importance analyses were conducted to determine the ranked order of
predictors in terms of their predictive power. For each predictor, an individual AUC value
was estimated to indicate individual predictive performance. Feature importance values are
presented for both the predictors in the final best performing model, as well as all other
predictors used to build the initial SVM model prior to RFE procedures being implemented.
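One way to obtain a per-predictor AUC of the kind described above is to treat each predictor's raw values as a classification score. The Python sketch below illustrates this idea under explicit assumptions: the function names are hypothetical, the data layout (a dictionary of columns) is chosen for brevity, and folding AUC values below 0.5 onto the 0.5 to 1.0 range (so that importance does not depend on the direction of the association) is an assumption rather than a documented detail of the study.

```python
def roc_auc(y_true, scores):
    """AUC as the proportion of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def univariate_auc_importance(columns, y_true):
    """Rank predictors by the AUC each achieves on its own.

    columns: dict mapping predictor name -> list of baseline values.
    Returns a list of (name, importance) pairs, most important first.
    """
    importance = {}
    for name, values in columns.items():
        auc = roc_auc(y_true, values)
        # Assumption: a predictor that separates the classes "in reverse"
        # (AUC < 0.5) is as informative as one with AUC = 1 - AUC.
        importance[name] = max(auc, 1.0 - auc)
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)
```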
Because SVM algorithms are often considered ‘black-box’ models, for which there is no
readily meaningful interpretation of the predictor weights, partial dependence plots (PDP)
were estimated to visualize the relationship between a given predictor and the outcome
(Boehmke & Greenwell, 2019). PDPs display the probability of the outcome variable being
a certain value (e.g. responder status) for each value of a predictor variable. PDPs were
estimated for the top three predictors in each of the final models for each of the three
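The computation behind a partial dependence curve can be illustrated with a short Python sketch. It is a generic illustration, not the study's code: `partial_dependence` and `toy_model` are hypothetical names, and `toy_model` is a made-up logistic function standing in for a fitted classifier, with coefficients chosen only so that the example runs.

```python
import math


def partial_dependence(predict_proba, rows, feature, grid):
    """Average predicted probability at each grid value of one predictor.

    predict_proba(row) -> probability of the outcome for a single row (dict).
    For each grid value, the predictor is set to that value in every row
    (all other predictors keep their observed values) and the predictions
    are averaged across rows.
    """
    curve = []
    for value in grid:
        probs = [predict_proba({**row, feature: value}) for row in rows]
        curve.append((value, sum(probs) / len(probs)))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted classifier (illustrative coefficients only)."""
    return 1.0 / (1.0 + math.exp(0.3 * row["bhs"] + 1.0 * row["pd"] - 2.0))
```

Calling `partial_dependence(toy_model, rows, "bhs", [0, 10, 20])` on a list of patient dictionaries returns one averaged probability per grid value; plotting those pairs gives the PDP for that predictor.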
Results
Participant characteristics
A total of 100 participants with BDD were included in the intent-to-treat population in Phase
I, and the 97 subjects with a postbaseline assessment (i.e. those who were in the study long enough to
have an assessment after the initial intake assessment) were included in the current study.
Demographic characteristics are presented in Table 1. The mean age was 33.5 (S.D. = 12.4)
with a range between 18 and 68, and 64% of the participants identified as female. With
respect to race, 84% of participants identified as White, 7% as Black, 2% as Asian, 1%
as American Indian, 1% as Alaskan Native, and 5% as multi-racial. Regarding ethnicity,
12% identified as Latinx. The mean baseline BDD-YBOCS score was 32.72 (S.D. = 5.43)
with a range between 24 and 46, which reflects moderate to severe symptoms. Overall, in
the intent-to-treat sample, 72% of participants achieved a response, 51% achieved partial
remission, and 20% achieved full remission [these percentages differ slightly from those in
this study’s primary report (Phillips et al., 2016), reflecting slight differences in definitions
used for treatment response]. Full demographic and clinical characteristics are also provided
in the primary report from this study (Phillips et al., 2016).
(95% CI 0.63–0.82). The best model performance was associated with a cost parameter
of 0.5 and a sigma tuning parameter of 0.2732069. Results of the feature importance
analysis revealed that hopelessness (BHS), having a PD, and BDD severity (CGI) were
the three most important predictors. Feature importance results for all predictors, as well
as those emphasized in the final model, are presented in Table 2. PDPs for the top three
predictors are displayed in Fig. 1. In general, higher scores of hopelessness and having a
PD were associated with lower probabilities of achieving treatment response, whereas higher
scores of BDD symptom severity on the CGI were associated with a greater probability of
achieving treatment response.
RFE procedures revealed that the best performing model contained eleven predictors (i.e.
BSI, BDI-II, QLESQ, CGI-BDD, BDD-YBOCS, PD, BHS, BABS, HAM-D, OCD, and
MDD). The final SVM model exhibited acceptable classification performance with an AUC
of 0.75. Sensitivity was 0.67, and specificity was 0.73. Accuracy was 0.70 (95% CI 0.60–
0.79). The best model performance was associated with a cost parameter of 1 and a sigma
tuning parameter of 0.07505707. Results of the feature importance analysis of the final
model revealed that general psychopathology (BSI), depression (BDI-II), and quality of life
(QLESQ) were the most important predictors, respectively. Feature importance results for
all predictors, as well as those emphasized in the final model, are presented in Table 2.
PDPs for the top three predictors are displayed in Fig. 2. In general, higher scores of overall
psychopathology and of depression were associated with lower probabilities of achieving
partial remission, whereas higher scores of quality of life were associated with a greater
probability of achieving partial remission.
and general psychopathology (BSI) were the only important predictors in the final model.
Feature importance results for all predictors, as well as those emphasized in the final model,
are presented in Table 2. PDPs for the only two predictors in the final model are displayed
in Fig. 3. In general, higher scores of overall psychopathology were associated with lower
probabilities of achieving full remission, whereas higher scores of quality of life were
associated with a greater probability of achieving full remission.
Discussion
The current study provides the first evidence that machine learning algorithms can
successfully predict treatment outcomes for pharmacotherapy for BDD. In the BDD
literature, very little prognostic information exists about what factors are predictive of
successful pharmacotherapy treatment. Results of the final SVM models identified using
RFE procedures indicated acceptable prediction of each of the three primary outcomes
(i.e. response, partial remission, and full remission), as AUC values were all above
0.70. Furthermore, feature importance analyses supported constructs such as quality of
life, depression symptoms, general psychopathology symptoms, and hopelessness as most
predictive of treatment outcomes. The presence of a PD was more strongly predictive of
poorer treatment response and partial remission than of full remission. Demographic
variables such as gender and race were the least predictive of treatment outcome. By
probing the PDPs, it appears that higher levels of some psychopathology measures (e.g.
depression, general psychopathology, hopelessness, PD) were associated with less favorable
treatment outcomes, whereas better quality of life and more severe BDD as assessed by
the CGI were predictive of better outcomes.
machine learning models in the present study is noteworthy given that it relied only on
baseline self-report and clinical interview data, which is more feasible than requiring costly
neuroimaging or genetic data as has been emphasized in other machine learning studies in
psychiatry (Lee et al., 2018).
AUC values for the overall models were in the acceptable range for all three treatment
outcomes, although relatively few individual predictor variables were in the acceptable range
(above 0.70). This is consistent with prior studies, which did not consistently identify any
predictors of BDD response to an SRI, although most prior studies had limited statistical
power for this purpose. This attests to the value of leveraging machine learning models
which can uncover patterns across individual predictors in the dataset to bolster overall
prediction accuracy.
The current machine learning models may provide a platform for facilitating a more
informed decision-making process about the potential utility of escitalopram for BDD
for individual patients. However, using such models also incurs a risk of
misclassification. For example, patients might decide to forgo the first-line medication
treatment (an SRI) for this severe disorder when it might in fact prove efficacious
for them. Because none of the models exhibits perfect classification performance, and performance might
differ in different patient populations or with different medications, clinicians and patients
can collectively deliberate on the role of uncertainty in these predictions and collaborate on
the extent to which it is advisable to follow a machine learning model’s recommendation.
Greater general psychopathology on the BSI did predict poorer outcomes for partial and
full remission (but not treatment response), with AUCs in the acceptable range. A possible
explanation for this finding is that the BSI contains some items that might interfere with
treatment response. For instance, feeling others are to blame for most of your troubles, or
having ideas that someone else can control your thoughts, might inhibit social interaction
and functioning, which is assessed by the BDD-YBOCS. It is unclear why more severe
depression on the BDI-II reached the acceptable range for prediction of partial and full
remission (but not for response), as our clinical impression is that depression severity does
not affect improvement with SRI treatment. Moreover, comorbid MDD was one of the
weaker predictors of treatment outcome. Perhaps the BDI-II had more power to predict the
outcomes because it is a continuous variable, whereas the dichotomous nature of the MDD
variable conveys less information, which may undermine its predictive power.
Regarding predictors of treatment response, it is unclear why hopelessness was the most
salient predictor. However, hopelessness has also been shown to predict poorer response to
SRI treatment in MDD (e.g. Papakostas et al., 2007). It is possible that greater hopelessness
Our finding that comorbid MDD, OCD, and SAD did not predict treatment response much
better than chance is consistent with clinical experience and with results from prior studies.
The presence of comorbid PD was overall a stronger predictor of treatment response and
partial remission than Axis I comorbidity. Studies of a broad range of Axis I disorders
indicate that PDs tend to have an adverse effect on treatment response (Reich & Vasile,
1993), although BDD studies that have examined this issue have had mixed findings
(Phillips et al., 2002, 2016).
Our finding that BDD-related delusionality/insight, as assessed by the BABS, did not
predict response and full remission much better than chance is consistent with prior studies
indicating that SRI monotherapy is equally efficacious for both delusional and nondelusional
BDD (Phillips, 2017). This has been a consistent and interesting finding in all prior
BDD pharmacotherapy studies and is worth highlighting because SRI monotherapy is not
considered efficacious for other disorders that are often characterized by delusional beliefs.
A principal strength of the current study is that machine learning models were validated
across three different levels of treatment outcome (i.e. response, partial remission,
and full remission). SVM algorithms accomplished acceptable classification discrimination
in predicting all outcomes. Furthermore, both sensitivity and specificity across all three
models were largely balanced, indicating that the models exhibited acceptable ability in
both ruling-in and ruling-out each of the three outcomes. The model predicting response
status had slightly higher sensitivity than specificity, indicating it might be slightly
better suited to ruling out someone being a ‘responder’. Moreover, the algorithm predicting
full remission possessed slightly better specificity than sensitivity, perhaps indicating this
model is somewhat better at ruling in an outcome (i.e. being a ‘full-remitter’). Another
important aspect of the current study was the feature importance analyses, which determine
Notwithstanding the strengths of the current study, certain limitations warrant mention. First,
from a clinical perspective, it is not known whether similar findings would pertain to the
higher doses of escitalopram that can be used to treat BDD in clinical practice (Phillips,
2017). Second, the sample size is relatively modest in the context of traditional machine
learning studies. That notwithstanding, successful machine learning studies have been
accomplished using smaller sample sizes (e.g. Flygare et al., 2020), and approaches were
adopted in the current study to optimize performance, given the sample size. Specifically,
SVM algorithms have utility in smaller sample sizes (Boehmke & Greenwell, 2019), and
10-fold cross-validation was employed to prevent biasing the model training on any single
subsample of the dataset. Future research would benefit from replicating these results
meaning. In an effort to better explicate the directionality of the relationship between the
top predictors and treatment outcomes in each model, PDPs were produced to visualize
relationships between predictors and outcomes. Fifth, the number of predictors in the current
study was relatively modest compared to what might be typical for machine learning studies
leveraging big data. It would be profitable for future studies to model much larger numbers
of predictors to better identify which features are most predictive of treatment success for
BDD patients. Furthermore, it may be profitable to consider individual item level data as
predictors in addition to aggregate sum scores. Sixth, a traditional 10-fold cross-validation
procedure was used such that each of the 10 subsets of the data was used exactly once as the
testing data. This is a limitation, as performance estimates from cross-validation tend
to be optimistic relative to performance in an independent validation set. In future machine learning research,
it would be especially beneficial to consider strategies to leverage multiple trial datasets
to permit model testing in whole, novel datasets that were not used for training purposes.
Finally, it is rare that any machine learning model yields perfectly accurate predictions,
and this is true of the models validated in the current study. There is always a risk of
misclassification, and, therefore, it would be necessary for clinicians to discuss the potential
consequences of misclassification (e.g. waste of time in expecting treatment response,
potential medication side-effects without intended therapeutic benefit, forgoing a medication
that would actually be beneficial, etc.) with the patient and prescriber collaboratively making
an informed decision. Importantly, a broader range of clinical factors than were examined
in this report must be considered when deciding whether to treat with an SRI. For example,
patients with a high degree of suicidality, which is common in BDD (Snorrason et al., 2019),
were not included in this study but should always receive an SRI (Phillips, 2017).
Thus, machine learning models are best viewed as tools that provide useful information for
treatment planning rather than rigid prescriptions about treatment courses.
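As an illustration of the 10-fold cross-validation procedure noted among the limitations above, the following minimal Python sketch partitions 97 observations into 10 folds so that each fold serves as the testing data exactly once. It is not the code used in the study; the function names, the random seed, and the interleaved fold assignment are assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical usage with the sample size reported in the abstract (97 patients).
splits = list(cross_validation_splits(n_samples=97, k=10))

# Count how often each observation appears in a test fold.
test_counts = [0] * 97
for train, test in splits:
    for idx in test:
        test_counts[idx] += 1
```

In a full analysis, a model would be fitted on each training set and evaluated on the corresponding test fold. Testing in a whole, independent dataset, as recommended above, would require data that lie outside all of these folds.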
Overall, these results support the utility of machine learning in predicting three important
treatment outcomes for pharmacotherapy for BDD. Consistent with precision medicine
initiatives in psychiatry (Bernardini et al., 2017), the current study provides the foundation
for personalized pharmacotherapy strategies for patients with BDD, although further studies
are needed. Future directions include replicating the machine learning framework of
the current study for SRI treatment and other types of treatment for BDD (e.g. other
medications, CBT, combined pharmacotherapy and CBT). By doing so, it may be possible to
develop online prediction calculators to determine the likelihood of response and remission
for any given treatment, which can facilitate shared decision making with a clinician.
The relative merits of individual interventions can then be evaluated to determine the most
appropriate course of treatment for a single patient with BDD. A noteworthy benefit
of the current study is that successful predictive modeling for BDD treatment response
can be accomplished using accessible and cost-effective data from self-report and clinical
interview assessments. By facilitating the dual goals of leveraging precision medicine and
data feasibility, machine learning approaches to BDD treatment are poised to advance the
current standard of patient care and improve outcomes at the individual level.
Supplementary Material
Refer to Web version on PubMed Central for supplementary material.
Financial support.
The original trial presented in this paper was funded by a Collaborative R01 grant from the National Institute of
Mental Health to Dr Phillips (R01 MH072917) and Dr Wilhelm (R01 MH072854).
References
Angelakis I, Gooding PA, & Panagioti M (2016). Suicidality in body dysmorphic disorder (BDD):
A systematic review with meta-analysis. Clinical Psychology Review, 49, 55–66. [PubMed:
27607741]
Beck AT, & Steer RA (1988). Manual for the Beck Hopelessness Scale. San Antonio, TX: Psychological
Corporation.
Beck AT, Steer RA, & Brown GK (1996). Beck Depression Inventory – second edition manual. San
Antonio, TX: The Psychological Corporation.
Bernardini F, Attademo L, Cleary SD, Luther C, Shim R, Quartesan R, & Compton MT (2017).
Risk prediction models in psychiatry: Toward a new frontier for the prevention of mental illnesses.
Derogatis L, & Melisaratos N (1983). The brief symptom inventory: An introductory report.
Psychological Medicine, 13, 595–605. [PubMed: 6622612]
Eisen JL, Phillips KA, Baer L, Beer DA, Atala KD, & Rasmussen SA (1998). The Brown Assessment
of Beliefs Scale: Reliability and validity. American Journal of Psychiatry, 155, 102–108. [PubMed:
9433346]
Endicott J, Nee J, Harrison W, & Blumenthal R (1993). Quality of life enjoyment and satisfaction
questionnaire: A new measure. Psychopharmacology Bulletin, 29, 321–326. [PubMed: 8290681]
Fang A, Porth R, Phillips KA, & Wilhelm S (2019). Personality as a predictor of treatment response
to escitalopram in adults with body dysmorphic disorder. Journal of Psychiatric Practice, 25, 347–
357. [PubMed: 31505519]
Fernández de la Cruz LF, Enander J, Rück C, Wilhelm S, Phillips KA, Steketee G, … Veale D
(2021). Empirically defining treatment response and remission in body dysmorphic disorder.
Psychological Medicine, 51, 1–7. [PubMed: 33267920]
First MB, Gibbon M, Spitzer RL, Williams JBW, & Benjamin LS (1997). Structured Clinical Interview
for DSM-IV Axis II personality disorders (SCID-II). Washington, DC: American Psychiatric Press.
First MB, Spitzer RL, Gibbon M, & Williams JBW (1997). Structured clinical interview for DSM-IV
axis I disorders (SCID I). New York: Biometric Research Department.
Flygare O, Enander J, Andersson E, Ljótsson B, Ivanov VZ, Mataix-Cols D, & Rück C (2020).
Predictors of remission from body dysmorphic disorder after internet-delivered cognitive behavior
therapy: A machine learning approach. BMC Psychiatry, 20, 1–9. [PubMed: 31898506]
Greenberg JL, Phillips KA, Steketee G, Hoeppner SS, & Wilhelm S (2019). Predictors of response
to cognitive-behavioral therapy for body dysmorphic disorder. Behavior Therapy, 50, 839–849.
[PubMed: 31208692]
Guy W (1976). ECDEU Assessment manual for psychopharmacology: Revised. Rockville, MD:
ECDEU Assessment Manual. U.S. Department of Health, Education, and Welfare, Public Health
Service, Alcohol, Drug Abuse, and Mental Health Administration, National Institute of Mental
Health, Psychopharmacology Research Branch, Division of Extramural Research Programs.
Harrison A, de la Cruz LF, Enander J, Radua J, & Mataix-Cols D (2016). Cognitive-behavioral therapy
for body dysmorphic disorder: A systematic review and meta-analysis of randomized controlled
trials. Clinical Psychology Review, 48, 43–51. [PubMed: 27393916]
Hayes SC, Hofmann SG, Stanton CE, Carpenter JK, Sanford BT, Curtiss JE, & Ciarrochi J (2019).
The role of the individual in the coming era of process-based therapy. Behaviour Research and
Therapy, 117, 40–53. [PubMed: 30348451]
Hofmann SG, Curtiss JE, & Hayes SC (2020). Beyond linear mediation: Toward a dynamic network
approach to study treatment processes. Clinical Psychology Review, 76, 101824. [PubMed:
32035297]
Hollander E, Allen A, Kwon J, Aronowitz B, Schmeidler J, Wong C, & Simeon D (1999).
Clomipramine vs desipramine crossover trial in body dysmorphic disorder: Selective efficacy of a
serotonin reuptake inhibitor in imagined ugliness. Archives of General Psychiatry, 56, 1033–1039.
[PubMed: 10565503]
Hoogendoorn M, Berger T, Schulz A, Stolz T, & Szolovits P (2016). Predicting social anxiety
treatment outcome based on therapeutic email conversations. IEEE Journal of Biomedical and
Health Informatics, 21(5), 1449–1459. [PubMed: 27542187]
Hosmer DW, & Lemeshow S (1999). Applied logistic regression (2nd ed.). New York: John Wiley &
Sons.
Koran LM, Abujaoude E, Large MD, & Serpe RT (2008). The prevalence of body dysmorphic disorder
in the United States adult population. CNS Spectrums, 13, 316–322. [PubMed: 18408651]
Kuhn M (2008). Caret package. Journal of Statistical Software, 28, 1–26. [PubMed: 27774042]
Kuhn M, & Johnson K (2013). Applied predictive modeling. New York: Springer.
Lee Y, Ragguett RM, Mansur RB, Boutilier JJ, Rosenblat JD, Trevizol A, … McIntyre RS (2018).
Applications of machine learning algorithms to predict therapeutic outcomes in depression: A
meta-analysis and systematic review. Journal of Affective Disorders, 241, 519–532. [PubMed:
30153635]
Miller IW, Bishop S, Norman WH, & Maddever H (1985). The modified Hamilton rating scale for
depression: Reliability and validity. Psychiatry Research, 14, 131–142. [PubMed: 3857653]
Nie Z, Vairavan S, Narayan VA, Ye J, & Li QS (2018). Predictive modeling of treatment-resistant
depression using data from STAR*D and an independent clinical study. PLoS One, 13, e0197268.
[PubMed: 29879133]
Papakostas GI, Petersen T, Homberger CH, Green CH, Smith J, Alpert JE, & Fava M (2007).
Hopelessness as a predictor of non-response to fluoxetine in major depressive disorder. Annals
of Clinical Psychiatry, 19, 5–8. [PubMed: 17453655]
Phillips KA (2017). Pharmacotherapy and other somatic treatments for body dysmorphic disorder.
In Phillips KA (Ed.), Body dysmorphic disorder: Advances in research and clinical practice (pp.
333–356). New York: Oxford University Press.
Phillips KA, Albertini RS, & Rasmussen SA (2002). A randomized placebo-controlled trial of
fluoxetine in body dysmorphic disorder. Archives of General Psychiatry, 59, 381–388. [PubMed:
11926939]
Phillips KA, Coles ME, Menard W, Yen S, Fay C, & Weisberg RB (2005a). Suicidal ideation and
suicide attempts in body dysmorphic disorder. The Journal of Clinical Psychiatry, 66, 717–725.
[PubMed: 15960564]
Phillips KA, Dwight MM, & McElroy SL (1998). Efficacy and safety of fluvoxamine in body
dysmorphic disorder. Journal of Clinical Psychiatry, 59, 165–171. [PubMed: 9590666]
Phillips KA, Hart AS, & Menard W (2014). Psychometric evaluation of the Yale–Brown
Obsessive–Compulsive Scale modified for body dysmorphic disorder (BDD-YBOCS). Journal of
Obsessive–Compulsive and Related Disorders, 3, 205–208.
Phillips KA, Hollander E, Rasmussen SA, Aronowitz BR, DeCaria C, & Goodman WK (1997). A
severity rating scale for body dysmorphic disorder: Development, reliability, and validity of a
Phillips KA, Keshaviah A, Dougherty DD, Stout RL, Menard W, & Wilhelm S (2016).
Pharmacotherapy relapse prevention in body dysmorphic disorder: A double-blind, placebo-
controlled trial. American Journal of Psychiatry, 173, 887–895. [PubMed: 27056606]
Phillips KA, & McElroy SL (2000). Personality disorders and traits in patients with body dysmorphic
disorder. Comprehensive Psychiatry, 41, 229–236. [PubMed: 10929788]
Phillips KA, Menard W, Fay C, & Weisberg R (2005b). Demographic characteristics, phenomenology,
comorbidity, and family history in 200 individuals with body dysmorphic disorder.
Psychosomatics, 46, 317–325. [PubMed: 16000674]
Phillips KA, & Najjar F (2003). An open-label study of citalopram in body dysmorphic disorder. The
Journal of Clinical Psychiatry, 64, 715–720. [PubMed: 12823088]
Phillips KA, Pagano ME, Menard W, & Stout RL (2006). A 12-month follow-up study of the course of
body dysmorphic disorder. American Journal of Psychiatry, 163, 907–912. [PubMed: 16648334]
Phillips KA, Quinn G, & Stout RL (2008). Functional impairment in body dysmorphic disorder: A
prospective, follow-up study. Journal of Psychiatric Research, 42, 701–707. [PubMed: 18377935]
Reich JH, & Vasile RG (1993). Effect of personality disorders on the treatment outcome of axis I
conditions: An update. The Journal of Nervous and Mental Disease, 181, 475–484. [PubMed:
8103074]
Rief W, Buhlmann U, Wilhelm S, Borkenhagen A, & Brähler E (2006). The prevalence of body
dysmorphic disorder: A population-based survey. Psychological Medicine, 36, 877–885. [PubMed:
16515733]
Schieber K, Kollei I, de Zwaan M, & Martin A (2015). Classification of body dysmorphic disorder
– what is the advantage of the new DSM-5 criteria? Journal of Psychosomatic Research, 78,
223–227. [PubMed: 25595027]
Senior M, Fanshawe T, Fazel M, & Fazel S (2021). Prediction models for child and adolescent mental
health: A systematic review of methodology and reporting in recent research. JCPP Advances,
e12034.
Snorrason I, Beard C, Christensen K, Bjornsson AS, & Björgvinsson T (2019). Body dysmorphic
disorder and major depressive episode have comorbidity-independent associations with suicidality
in an acute psychiatric setting. Journal of Affective Disorders, 259, 266–270. [PubMed:
31450136]
Wilhelm S, Phillips KA, Greenberg JL, O’Keefe SM, Hoeppner SS, Keshaviah A, … Schoenfeld DA
(2019). Efficacy and posttreatment effects of therapist-delivered cognitive behavioral therapy vs
supportive psychotherapy for adults with body dysmorphic disorder: A randomized clinical trial.
JAMA Psychiatry, 76(4), 363–373. [PubMed: 30785624]
Fig. 1.
PDP of top features for response.
Note: In the partial dependence plots, the y axis (i.e. yhat) denotes the probability of
predicting response status given a particular value of the predictor. In general, higher scores
of hopelessness and having a personality disorder were associated with lower probabilities
of achieving treatment response, whereas higher scores of BDD symptom severity on the
CGI were associated with a greater probability of achieving treatment response. PDP, partial
dependence plot; BHS, Beck Hopelessness Scale; PD, personality disorder diagnosis; CGI-
BDD, clinical global impression of BDD severity.
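The quantity plotted on the y axis of a partial dependence plot can be illustrated with a minimal Python sketch: for each value of one predictor, that value is assigned to every observation and the model's predicted probabilities are averaged. The toy model, its coefficients, and the data values below are hypothetical and are not taken from the study, which used SVM models.

```python
import math


def toy_model(row):
    """Hypothetical classifier: predicted probability falls as feature 0 rises."""
    score = 1.0 - 0.2 * row[0] + 0.5 * row[1]
    return 1.0 / (1.0 + math.exp(-score))


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value  # overwrite one predictor, keep the rest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Hypothetical data: feature 0 is a continuous score, feature 1 is a binary indicator.
rows = [[2.0, 1.0], [8.0, 0.0], [15.0, 1.0], [5.0, 0.0]]
grid = [0.0, 5.0, 10.0, 15.0, 20.0]
yhat = partial_dependence(toy_model, rows, feature=0, grid=grid)
```

Plotting `yhat` against `grid` would give a curve of the kind shown in the figures. In this toy example the curve decreases, because the hypothetical model assigns lower probabilities to higher values of feature 0.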
Fig. 2.
PDP of top features for partial remission.
Note: In the partial dependence plots, the y axis (i.e. yhat) denotes the probability of
predicting partial remission status given a particular value of the predictor. In general,
higher scores of overall psychopathology and of depression were associated with lower
probabilities of achieving partial remission, whereas higher (better) scores of quality of
life were associated with a greater probability of achieving partial remission. PDP, partial
dependence plot; BSI, Brief Symptom Inventory; BDI, Beck Depression Inventory; QOL,
quality of life.
Fig. 3.
PDP of top features for full remission.
Note: In the partial dependence plots, the y axis (i.e. yhat) denotes the probability of
predicting full remission status given a particular value of the predictor. In general, higher
scores of overall psychopathology were associated with lower probabilities of achieving full
remission, whereas higher (better) scores of quality of life were associated with a greater
probability of achieving full remission. PDP, partial dependence plot; BSI, Brief Symptom
Inventory; QOL, quality of life.
Table 1.
Demographic characteristics
Variable                    N     %
Female                      64    64%
Race
  Asian                     2     2%
  Black/African American    7     7%
  White                     84    84%
Marital Status
  Single                    63    63%
  Married                   20    20%
  Divorced                  12    12%
  Separated                 2     2%
  Other                     3     3%
Mean S.D.
Note: Values are based on the original full sample of 100 patients.
Table 2.
FI, feature importance; PD, having any personality disorder; SAD, social anxiety disorder; MDD, major depressive disorder; OCD, obsessive-compulsive disorder.
Note: The final models identified by RFE are highlighted in grey. For response status, RFE included the top 5 predictors. For partial remission status, RFE included the top 11 predictors. For full remission
status, RFE included the top 2 predictors. Values denote area under the ROC curve for each predictor, indicating relative predictive power. Higher values indicate a better classification of responder status.
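As a rough illustration of how an area under the ROC curve can be obtained for a single predictor, the Python sketch below uses the pairwise definition: the proportion of responder/non-responder pairs in which the responder has the higher score, with ties counted as one half. The scores and labels are hypothetical, and this is not necessarily the computation performed by the software used in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical baseline scores on one predictor; label 1 = responder, 0 = non-responder.
scores = [3, 5, 8, 12, 15, 18]
labels = [1, 1, 1, 0, 1, 0]

auc_raw = auc(scores, labels)
# When higher scores go with non-response, reversing the score flips the AUC.
auc_reversed = auc([-s for s in scores], labels)
```

A value near 0.5 indicates that the predictor separates the two groups no better than chance. Values far from 0.5 in either direction indicate discrimination, with the direction depending on whether higher or lower scores accompany response.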