Professional Documents
Culture Documents
BACKGROUND: Eating disorders affect upwards of 30 mil- Further examination of the validity of the SCOFF or de-
lion people worldwide and often go undertreated and velopment of a new screening tool, or multiple tools, to
underdiagnosed. The purpose of this systematic review screen for the range of DSM-5 eating disorders heteroge-
and meta-analysis was to evaluate the diagnostic accuracy nous populations is warranted.
of the Sick, Control, One, Fat and Food (SCOFF) question- TRIAL REGISTRATION: This study is registered online
naire for DSM-5 eating disorders in the general population. with PROSPERO (CRD42018089906).
METHOD: The Preferred Reporting Items for Systematic
KEY WORDS: eating disorders; SCOFF; screening; systematic review;
Reviews and Meta-analyses (PRISMA) were followed. A
diagnostic test accuracy.
PubMed search was conducted among peer-reviewed ar-
ticles. Information regarding validation of the SCOFF was J Gen Intern Med 35(3):885– 93
required for inclusion. Study quality was assessed using DOI: 10.1007/s11606-019-05478-6
the Quality Assessment of Diagnostic Accuracy Studies-2 © Society of General Internal Medicine (This is a U.S. government work and
(QUADAS-2) tool. not under copyright protection in the U.S.; foreign copyright protection
RESULTS: The final analysis included 25 studies. The va- may apply) 2019
lidity of the SCOFF was high across samples with a pooled
sensitivity of 0.86 (95% CI, 0.78–0.91) and specificity of 0.83
(95% CI, 0.77–0.88). Subgroup analyses were conducted to
INTRODUCTION
examine the impact of methodology, study quality, and clin-
ical characteristics on diagnostic accuracy. Studies with the Eating disorders effect upwards of 30 million people and carry
highest sensitivity tended to be case-control studies of young with them significant morbidity and mortality.1 Effective screen-
women with anorexia nervosa (AN) and bulimia nervosa ing for eating disorders is critical as these disorders are commonly
(BN). Studies which included more men, included those
underdiagnosed and undertreated.1–3 The 5-item SCOFF (Sick,
diagnosed with binge eating disorder, and recruited from
large community samples tended to have lower sensitivity. Control, One, Fat and Food; see Fig. 1) questionnaire, developed
Few studies reported on BMI and race/ethnicity; thus, sub- in 1999 by Morgan and colleagues, is the most widely used
groups for these factors could not be examined. No studies screening measure for eating disorders. With the inclusion of
used reference standards which assessed all DSM-5 eating binge eating disorder and other specified eating disorders (i.e.,
disorders. atypical anorexia, low frequency or limited duration bulimia
CONCLUSION: This meta-analysis of 25 validation stud- nervosa and binge eating disorder, purging disorder, night eating
ies demonstrates that the SCOFF is a simple and useful
syndrome) in DSM-5,4 it has become increasingly important to
screening tool for young women at risk for AN and BN.
However, there is not enough evidence to support utilizing expand awareness of various types of eating pathology. Of
the SCOFF for screening for the range of DSM-5 eating particular importance, these new categories of eating disorders
disorders in primary care and community-based settings. had not yet been defined at the time that the SCOFF was
developed.
Prior Presentations This project was presented at the annual meeting
The changing landscape of diagnostic eating disorder catego-
for the Society for Behavioral Medicine (SBM) in April 2018 and at the
International Conference on Eating Disorders (ICED) in March 2019. ries since the publication of DSM-5 highlights the importance of
ensuring screening tools are appropriate for detecting the full
Electronic supplementary material The online version of this article range of eating disorders in the general population. To date, the
(https://doi.org/10.1007/s11606-019-05478-6) contains supplementary SCOFF has been the recommended screening tool across numer-
material, which is available to authorized users.
ous validation studies; however, these recommendations have not
been systematically assessed. The purpose of this systematic
Received July 15, 2019
Accepted October 7, 2019 review and meta-analysis is to evaluate whether the SCOFF
Published online November 8, 2019 can appropriately screen patients in the general population for
885
886 Kutz et al.: Eating Disorder Screening: a Systematic Review and Meta-analysis JGIM
Identification
Records identified through
database search Additional records identified
PubMed n = 979 through other sources
Cochrane n = 5 n=3
Screening
Studies included in
qualitative/quantitative synthesis
n = 25
English, and article representing re-publication of prior data which was diagnosed with an eating disorder based on the
or commentary on data. criterion used. The range of any eating disorder diagnosis,
which included anorexia nervosa, bulimia nervosa, binge eat-
Study Characteristics ing disorder, and eating disorder not otherwise specified, was
1.2 to 64.3%. The range of the sample that had each eating
Table 1 depicts the characteristics of included studies.7, 8, 10–32
disorder diagnosis was as follows: anorexia nervosa = 0 to
The 25 studies reviewed included a total of 11,531 individuals.
32.1%, bulimia nervosa = 0 to 23.1%, eating disorder not
Thirteen unique countries were represented with 16 studies
otherwise specified = 0.4 to 46.3%, binge eating disorder =
conducted in Europe, four in North America, three in Asia,
0.4 to 11.6%. Sixteen studies explicitly reported on the percent
and two in South America. Samples were recruited from three
of individuals with anorexia and bulimia. Eleven studies re-
primary locations: medical settings (primary care clinics, spe-
ported on the percent of the sample meeting criteria for eating
cialty clinics), schools (grade school, high school, and univer-
disorder not otherwise specified and six reported on binge
sities), and the general community. Ages across studies ranged
eating disorder. Aside from binge eating disorder, no studies
from 10 to 95, with the majority (n = 18) of studies including
explicitly examined validity for any of the other newly includ-
primarily adult samples and seven being conducted in entirely
ed eating disorders in the DSM-5.
adolescent or young adult populations. Twelve studies includ-
Other demographic characteristics were not frequently re-
ed an entirely female sample and four additional studies in-
ported across all studies and thus are not included in Table 1.
cluded samples which were at least 70% female. The percent-
Only four studies reported any information about race or
age of females included in the remaining eight studies ranged
ethnicity, and samples tended to be primarily Caucasian
from 46.2 to 68%. Thirteen studies utilized interview format
(57.2 to 87.9%). Studies also did not frequently include infor-
(SCID-I, CIDI interview, DSM-IV interview, EDE) for the
mation about BMI. The six studies reporting average BMI
reference standard. The remaining sixteen studies utilized
found that it ranged from 21.98 to 28.1. Of these, four studies
various self-report measures including the EDI-3, the EDE-
included samples with an average BMI in the normal range
Q, the EAT-26, the Q-EDD, and an ICD-10 symptom rating
(i.e., between 18.5 and 24.9).
scale. Eighteen studies reported on the percent of the sample
888 Kutz et al.: Eating Disorder Screening: a Systematic Review and Meta-analysis JGIM
Quality Assessment were rated as low across all risk of bias and applicability concern
domains. Risk of bias was high within patient selection across
Summary index scores for the QUADAS-2 are depicted in
five studies. Four of these studies used a case-control design
Table 2. Risk of bias and applicability concerns were rated as
while one recruited an at-risk sample as opposed to utilizing a
low risk/concern (depicted as “+” signs in the table), high risk/
random or consecutive sample. Risk of bias was high in four
concern (represented as “-” signs in the table), and unknown risk/
studies in the flow and timing domain due to the SCOFF and
concern (depicted as “?” signs in the table). Only two studies
reference standard being administered at different times (i.e., not
JGIM Kutz et al.: Eating Disorder Screening: a Systematic Review and Meta-analysis 889
sequentially) or if there was any ambiguity about questionnaires interview vs questionnaire reference standard) and clinical char-
being completed at the same time (such as would be present in acteristics (age, gender, location, and diagnosis) on diagnostic
surveys that were mailed). In general, risk of bias was low across accuracy. Table 3 presents pooled sensitivity, specificity, and
studies in the reference standard domain. Applicability concerns heterogeneity values for each subgroup. The diagnostic accuracy
were most prevalent in the patient selection domain, with 14 of the SCOFF was higher in case-control studies (p < 0.01), when
studies having high applicability concerns. This was often be- an interview was used as a reference standard as opposed to a
cause the sample utilized was restricted demographically (e.g., questionnaire (p = 0.05) and when the percentage of women in
only included females). Applicability concerns were high across the sample was larger than the percentage of men (p < 0.01).
five studies for the reference standard. In these studies, certain Additionally, diagnostic accuracy was higher when risk of bias
subgroups of patients were not given the reference standard or was high for patient selection (p < 0.01). Sensitivity and speci-
were excluded for unknown reasons. ficity were lower in studies which included individuals diagnosed
with BED; however, this difference was not significant (p =
Diagnostic Accuracy 0.22). Of note, subgroup analysis did not explain the high overall
heterogeneity of the included studies as all subgroups had an I2
Diagnostic accuracy rates for each study are depicted in the forest
value of greater than 60%.
plot in Figure 3 and the receiver operating curve (SROC) in
The likelihood ratio scattergram (Fig. 5) shows the distri-
Figure 4. Pooled sensitivity was 0.86 (95% CI, 0.78–0.91) and
bution of positive and negative likelihood ratios. The pooled
specificity was 0.83 (95% CI, 0.77–0.88). The area under the
positive likelihood ratio of 5.0 (95% CI, 3.6–6.8) suggests that
curve (AUC) was 0.91 (95% CI, 0.88–0.93). Heterogeneity was
the SCOFF is moderately helpful in detecting eating disorders.
statistically significant for sensitivity (I2 = 97.63; 95% CI, 97.17–
The negative likelihood ratio of 0.17 (95% CI, 0.11–0.27)
98.09) and specificity (I2 = 98.22; 95% CI, 97.91–98.54). Dif-
suggests that the SCOFF is moderately helpful in ruling out
ferences in study methodology or clinical characteristics of the
the presence of an eating disorder.33, 34
sample may result in elevated heterogeneity. Heterogeneity may
also be elevated due to different thresholds utilized across studies
to define cases. This was not the case for the SCOFF as the
threshold effect was not significant (r = − 0.21; p = 0.32). DISCUSSION
In order to address the significant heterogeneity found across We conducted a meta-analysis of 25 validation studies on the
studies, subgroup analyses were conducted to examine the im- SCOFF to determine whether this screen is a valid tool for
pact of methodological (i.e., case-control vs non-case-control; identifying eating disorders in diverse settings and populations.
890 Kutz et al.: Eating Disorder Screening: a Systematic Review and Meta-analysis JGIM
Our in-depth examination of the SCOFF calls into question the recruited from community samples and included the highest
effectiveness of this tool for eating disorder screening in primary reported rates of BED. Sensitivity was also lower in locations
care and community settings, with diverse populations, and with where rates of obesity tend to be higher (e.g., North America).
the full range of DSM-5 eating disorder diagnoses. This exam- Comparing demographic variables across studies shows
ination provides a critical context given that we also found in our that while the SCOFF has been validated numerous times
study, as was found in a previous meta-analysis,35 that the
SCOFF is an effective tool for identifying the presence of
particular eating disorders (i.e., AN and BN) in the population
for which it was initially developed (i.e., young women with
eating disorder symptoms).
The purpose of screening is to capture the range of patholog-
ical eating and identify cases that might not be identified by other
means. The SCOFF was originally developed and subsequently
validated several times using case-control study designs. Case-
control studies dramatically limit samples to a specific target
population (i.e., cases and matched controls) and do not capture
the diversity and range of disorders in the general population.
Additionally, validity data from case-control studies may artifi-
cially inflate the efficacy of screening measures and lead to
erroneous conclusions about the utility of the measure in the
general population.6 As expected, our analyses revealed that the
highest levels of sensitivity were found in case-control studies
including young women diagnosed with AN and BN. These
findings are important as higher rates of sensitivity and specificity
in case-control samples highlight that when patients are at risk for
AN and BN, the SCOFF is a highly robust screening measure. Figure 4 Receiver operating curve (SROC).
Conversely, studies with lower sensitivity rates were primarily
JGIM Kutz et al.: Eating Disorder Screening: a Systematic Review and Meta-analysis 891
Methodologic
Recruitment
Case-control 5 0.96 (0.92–1.00) 0.89 (0.81–0.98) 85 (68–100)
Random/consecutive sample 20 0.80 (0.73–0.88) 0.81 (0.75–0.87)
Reference standard type
Interview 14 0.91 (0.86–0.96) 0.81 (0.74–0.89) 66 (24–100)
Questionnaire 11 0.75 (0.63–0.87) 0.84 (0.77–0.92)
Quality
Patient selection (bias)
High bias 5 0.95 (0.91–1.00) 0.78 (0.63–0.94) 97 (95–99)
Low bias 16 0.79 (0.70–0.89) 0.84 (0.78–0.91)
Patient selection (applicability)
High applicability concerns 14 0.91 (0.85–0.96) 0.81 (0.73–0.89) 94 (89–99)
Low applicability concerns 9 0.72 (0.57–0.87) 0.86 (0.78–0.94)
Flow and timing
High bias 12 0.81 (0.63–0.99) 0.76 (0.61–0.91) 99 (98–99)
Low bias 4 0.81 (0.71–0.91) 0.87 (0.82–0.93)
Demographics
Diagnosis*
BED included 6 0.77 (0.60–0.94) 0.78 (0.65–0.91) 33 (0–100)
BED not included 19 0.88 (0.82–0.94) 0.84 (0.78–0.90)
Gender†
Majority female 19 0.88 (0.81–0.93) 0.82 (0.74–0.88) 89 (78–100)
Equal female and male 5 0.75 (0.46–0.91) 0.86 (0.73–0.93)
Sample
Medical/research settings 12 0.90 (0.78–0.96) 0.86 (0.80–0.91)
Schools/universities 9 0.83 (0.74–0.89) 0.81 (0.73–0.87) —
Community 4 0.71 (0.45–0.96) 0.86 (0.74–0.97)
Age‡
Adults 12 0.84 (0.74–0.94) 0.86 (0.79–0.93) 83 (65–100)
Children, adolescents, young adult 12 0.86 (0.74–0.95) 0.79 (0.69–0.88)
Location§
Europe 17 0.88 (0.82–0.95) 0.84 (0.78–0.98)
North America 4 0.69 (0.43–0.95) 0.85 (0.72–0.98) 97 (95–99)
*Diagnosis: Binge eating disorder assessed in study and rates reported in manuscript; binge eating disorder not assessed and/or rates not included in manuscript
† Majority female: sample is ≥ 60% female; Equal female and men: sample is < 60% female
‡ Adults: average age is ≥ 25; Children, adolescents, young adult: average age is < 25
§
There were too few studies conducted in Asia and South America to obtain pooled results
since its development, it is often validated in samples highly (i.e., young women with AN and BN). In fact, of the 25 studies
similar to the population in which it was initially validated reviewed, more than half utilized a predominately or entirely
female sample. Of the studies that did include males, only diagnoses and for diverse populations. The present review sug-
three were conducted using adults. In addition, many studies gests that the SCOFF is a highly sensitive screening measure for
did not report on important demographic and clinical charac- young women at risk for AN and BN but analyses and quality
teristics including certain eating disorder diagnoses (e.g., assessment of studies raised concerns about the generalizability
BED), race, and BMI. Of those that did report on these and reliability of these results for other eating disorder diagnoses.
characteristics, there was evidence that samples utilized in Currently, there is insufficient evidence to recommend the use of
these validation studies often did not reflect the racial and the SCOFF for large-scale screening in primary care and diverse
weight diversity seen across DSM-5 eating disorders outside community settings. This review identifies the need for the
of AN and BN. In addition, only six studies explicitly exam- development of a new screening tool, or multiple tools, for
ined the efficacy of the SCOFF for identifying BED and none validation for the full range of DSM-5 eating disorder diagnoses
examined efficacy for any of the other specified eating disor- in heterogenous samples.
ders in DSM-5. Reflecting the lack of demographic variability
in the samples across the 25 studies, applicability concerns Corresponding Author: Amanda M. Kutz, PhD; VA Salt Lake City
Health Care System, Salt Lake City, UT, USA (e-mail: Amanda.
were high in many studies on the QUADAS-2 risk of bias tool kutz@va.gov).
in the patient selection domain. Given these high applicability
concerns, it is difficult to make conclusions about the appro-
Author Contributions RMM, SM, and AMK were involved in
priateness of using the SCOFF to screen for eating disorders conception of the review. AMK and AGM were involved in review of
with the exception of young women at risk for AN and BN. literature, extraction of data, and creation of data tables. AMK and
Compared with a prior systematic review on this topic con- CGG were involved in statistical analyses. CGG provided expert
guidance on systematic review manuscript preparation. AMK, AGM,
ducted by Botella and colleagues,35 the present systematic review and RMM drafted the manuscript. All authors were involved in critical
provides a more in-depth and comprehensive analysis and, most review of the manuscript.
importantly, includes an assessment of the quality of included
Funding Information This project was supported in part by the VA’s
studies. Additionally, ten new validation studies had been pub- Heath Services Research and Development (CIN 13-407) (HSR&D)
lished and were included in our analysis for a total of 25 valida- Center of Innovation (COIN) Pain Research, Informatics, Multi-morbid-
tion studies. As per PRISMA-DTA guidelines, this review also ities, and Education (PRIME) Center, West Haven, CT.
10. Aoun A, et al. Validation of the Arabic version of the SCOFF questionnaire 24. Pannocchia L, et al. A psychometric exploration of an Italian translation
for the screening of eating disorders. East Mediterr Health J. 2015;21(5). of the SCOFF questionnaire. Eur Eat Disord Rev. 2011;19(4):371-373.
11. Berger U, et al. Screening of disordered eating in 12-Year-old girls and 25. Parker SC, Lyons J., Bonner J. Eating disorders in graduate students:
boys: psychometric analysis of the German versions of SCOFF and EAT- exploring the SCOFF questionnaire as a simple screening tool. J Am Coll
26. Psychother Psychosom Med Psychol. 2011;61(7):311-318. Heal. 2005;54(2):103-107.
12. Caamaño F, et al. Validation of SCOFF questionnaire among 26. Richter F, et al. Screening disordered eating in a representative sample
preteenagers. BMJ. 2002. of the German population: Usefulness and psychometric properties of the
13. Cotton MA, Ball C, Robinson P. Four simple questions can help screen German SCOFF questionnaire. Eat Behav. 2017;25:81-88.
for eating disorders. J Gen Intern Med. 2003;18(1):53-56. 27. Rueda GE, et al. Validación de la encuesta SCOFF para tamizaje de
14. Garcia FD, et al. Validation of the French version of SCOFF question- trastornos de la conducta alimentaria en mujeres universitarias.
naire for screening of eating disorders among adults. World J Biol Biomédica. 2005;25(2):196-202.
Psychiatry. 2010;11(7):888-893. 28. Rueda GJ, et al. Validation of the SCOFF questionnaire for screening the eating
15. Garcia FD, et al. Detection of eating disorders in patients: validity and behaviour disorders of adolescents in school. Aten Primaria. 2005;35(2):89-94.
reliability of the French version of the SCOFF questionnaire. Clin Nutr. 29. Sanchez-Armass O, et al. Validation of the SCOFF questionnaire for
2011;30(2):178-181. screening of eating disorders among Mexican university students. Eat
16. Garcia-Campayo J, et al. Validation of the Spanish version of the SCOFF Weight Disord. 2017;22(1):153-160.
questionnaire for the screening of eating disorders in primary care. J 30. Siervo M, et al. Application of the SCOFF, Eating Attitude Test 26 (EAT
Psychosom Res. 2005;59(2):51-55. 26) and Eating Inventory (TFEQ) Questionnaires in young women seeking
17. Lähteenmäki S, et al. Validation of the Finnish version of the SCOFF diet-therapy. Eat Weight Disord. 2005;10(2):76-82.
questionnaire among young adults aged 20 to 35 years. BMC Psychiatry. 31. Solmi F, et al. Validation of the SCOFF questionnaire for eating disorders
2009;9(1):5. in a multiethnic general population sample. Int J Eat Disord.
18. Leung SF, et al. Psychometric properties of the SCOFF questionnaire 2015;48(3):312-316.
(Chinese version) for screening eating disorders in Hong Kong secondary 32. Wahida WMZW, Lai PSM, Hadi HA. Validity and reliability of the english
school students: a cross-sectional study. Int J Nurs Stud. version of the sick, control, one stone, fat, food (SCOFF) in Malaysia. Clin
2009;46(2):239-247. Nutr ESPEN. 2017;18:55-58.
19. Lichtenstein MB, Hemmingsen SD, Støving RK. Identification of eating 33. McGee S. Simplifying likelihood ratios. J Gen Intern Med.
disorder symptoms in Danish adolescents with the SCOFF Question- 2002;17(8):647-650.
naire. Nordic J Psychiatry. 2017;71(5):340-347. 34. Wilson MC, Henderson MC, Smetana GW. Chapter 5. Evidence-Based
20. Luck AJ, et al. The SCOFF questionnaire and clinical interview for eating Clinical Decision Making. In: Henderson MC, Tierney LM, Jr., Smetana
d i s o r d e r s i n g e n e r a l p r a c t i c e : c o m p a r a t i v e s t u d y. B M J . GW. eds. The Patient History: An Evidence-Based Approach to Differential
2002;325(7367):755-756. Diagnosis. New York: McGraw-Hill; 2012.
21. Mond JM, et al. Screening for eating disorders in primary care: EDE-Q 35. Botella J, et al. A meta-analysis of the diagnostic accuracy of the SCOFF.
versus SCOFF. Behav Res Ther. 2008;46(5):612-622. Span J Psychol. 2013;16:E92.
22. Morgan JF, Reid F, Lacey JH., The SCOFF questionnaire: assessment of a
new screening tool for eating disorders. BMJ. 1999;319(7223):1467-1468.
23. Muro-Sans P., Amador-Campos JA, Morgan JF. The SCOFF-c: Psycho- Publisher’s Note Springer Nature remains neutral with regard to
metric properties of the Catalan version in a Spanish adolescent sample. jurisdictional claims in published maps and institutional affiliations.
J Psychosom Res. 2008;64(1):81-86.