
BJD

QUALITATIVE AND OUTCOMES RESEARCH
British Journal of Dermatology

Validation of an Oral Disease Severity Score (ODSS) tool for use in oral mucous membrane pemphigoid
M. Ormond,1 H. McParland,1 P. Thakrar,2 A.N.A. Donaldson,3 M. Andiappan,3 R.J. Cook,1,4 M.E. Escudier,1,5 J. Higham,2 E. Hullah,1 R. McMillan,6 J. Taylor,7 P.J. Shirlaw,1 S.J. Challacombe1,5 and J.F. Setterfield1,5,8
1Department of Oral Medicine and 8St John’s Institute of Dermatology, Guy’s Hospital, Guy’s and St Thomas’ NHS Foundation Trust, London, U.K.
2Department of Oral Medicine, Birmingham Dental Hospital and School of Dentistry, Birmingham, U.K.
3Biostatistics and Research Methods Centre, 4Centre for Oral Clinical and Translational Sciences, and 5Centre for Host–Microbiome Interactions, King’s College London Faculty of Dentistry, Oral & Craniofacial Sciences, London, U.K.
6Department of Oral Medicine, Eastman Dental Hospital, UCLH/Eastman Dental Institute, UCL, London, U.K.
7Department of Oral Medicine, Glasgow Dental Hospital and School, Glasgow, U.K.

Summary

Background Mucous membrane pemphigoid (MMP) is a rare autoimmune bullous disease predominantly affecting the oral mucosa. Optimal management relies upon thorough clinical assessment and documentation at each visit.
Objectives The primary aim of this study was to validate the Oral Disease Severity Score (ODSS) for the assessment of oral involvement in MMP. We also compared its inter- and intraobserver reliability with those of the oral parts of the Mucous Membrane Pemphigoid Disease Area Index (MMPDAI), Autoimmune Bullous Skin Disorder Intensity Score (ABSIS) and Physician’s Global Assessment (PGA).
Methods Fifteen patients with mild-to-moderately severe oral MMP were scored for disease severity by 10 oral medicine clinicians from four U.K. centres using the ODSS, the oral sections of MMPDAI and ABSIS, and PGA. Two clinicians rescored all patients after 2 h.
Results In terms of reliability, the interobserver ODSS total score intraclass correlation coefficient (ICC) was 0.97, MMPDAI activity 0.59 and damage 0.15, ABSIS total 0.84, and PGA 0.72. The intraobserver ICCs (two observers) for ODSS total were 0.97 and 0.93; for MMPDAI activity 0.93 and 0.70 and damage 0.93 and 0.79; for ABSIS total 0.99 and 0.94; and for PGA 0.92 and 0.94. Convergent validity between ODSS and MMPDAI was good (correlation coefficient 0.88). The mean ± SD time for completion of ODSS was 93 ± 31 s, with MMPDAI 102 ± 24 s and ABSIS involvement 71 ± 18 s. The PGA took < 5 s.
Conclusions This study has validated the ODSS for the assessment of oral MMP. It has shown superior interobserver agreement over MMPDAI, ABSIS and PGA, and superior intraobserver reliability to MMPDAI. It is quick and easy to perform.

Correspondence
Jane Setterfield.
E-mail: jane.setterfield@kcl.ac.uk

Accepted for publication
11 September 2019

Funding sources
None.

Conflicts of interest
None to declare.

M.O. and H.M. contributed equally to this work.

DOI 10.1111/bjd.18566

What’s already known about this topic?

• There are no validated scoring methodologies for oral mucous membrane pemphigoid (MMP).
• Proposed disease activity scoring tools for MMP include the Mucous Membrane Pemphigoid Disease Area Index (MMPDAI) and the Autoimmune Bullous Skin Disorder Intensity Score (ABSIS).
• The Oral Disease Severity Score (ODSS) has been validated for use in oral pemphigus vulgaris (PV). It has been shown to be reliable and sensitive in both lichen planus (LP) and MMP.

78 British Journal of Dermatology (2020) 183, pp78–85 © 2019 British Association of Dermatologists
Validation of an ODSS tool for use in oral MMP, M. Ormond et al. 79

What does this study add?


• The ODSS has been shown to be a thorough, sensitive and reproducible, yet quick
scoring tool for the assessment of oral involvement in MMP.
• Its versatility for use in oral PV, MMP and LP is an added advantage over other
scoring methodologies.

What are the clinical implications of this work?


• We propose that the ODSS be used as a clinical scoring tool for monitoring activity
in oral MMP in clinical practice as well as for use in multicentre studies.

Mucous membrane pemphigoid (MMP) is a heterogeneous disease with a wide spectrum of clinical and immunopathological presentations. Affected sites include the oral mucosa, eyes, skin, genitalia, larynx, oesophagus and nasopharynx.1 Oral manifestations of MMP include painful desquamative gingivitis; blistering and ulceration of the palate, buccal mucosae and oropharynx; and occasional lesions seen on the floor of mouth, tongue and lips. There appear to be distinct oral phenotypes.1 Approximately one-third of patients will experience skin lesions, which are typically seen on the head and neck area, although occasionally they also affect the limbs.2 In a small subgroup of patients it may present with a more widespread cutaneous blistering condition akin to bullous pemphigoid, but the oral lesions are usually more prominent from the outset than would be expected in bullous pemphigoid and are a clue to the diagnosis of MMP. The disease can have a long course and rarely remits without treatment. A wide variety of therapies have been used to treat MMP but there are no large placebo-controlled randomized controlled studies, and there are few studies utilizing a standardized scoring methodology to quantify objective improvement with systemic treatment.3

As the mouth and eyes are the two most frequently involved sites in MMP, it is paramount that these are assessed with the most accurate, reproducible and validated methods available. In addition, to be valuable, a scoring tool should also be feasible and sensitive to change, and have external validity.4 To date there have been no validated scoring methodologies for any site in MMP.5 For ocular disease the most frequently used prospective scoring system is the Foster–Tauber tool, in which the presence of subconjunctival scarring, together with qualitative assessment of the extent of both forniceal foreshortening and symblepharon, are combined to create a four-stage alphanumeric measure of the extent of conjunctival scarring.6 However, further ocular MMP scoring methodologies including assessments of the severity of inflammation using image-based grading and quantitative measurements of forniceal shortening are under evaluation.7–9

In 1998 we published a multisite scoring methodology for MMP. It was used to assess disease severity alongside establishing biomarkers for more severely affected patients.1 Each affected site was formally assessed by the relevant specialist – for example in dermatology, oral medicine, otolaryngology or ophthalmology – and the methodology was devised with their input. This scoring methodology was later used in further MMP cohort studies.10 However, for more accurate sequential assessment, a more detailed oral severity score was needed.

The Oral Disease Severity Score (ODSS) was devised by the oral medicine group at Guy’s Hospital as part of a strategy to develop a comprehensive scoring system to record outcomes for oral mucosal diseases.11 It was developed from a multisite methodology previously published for MMP.1 ODSS records the presence of lesions and degree of activity at multiple oral sites and includes a subjective assessment of the patient’s oral pain over the preceding week. Previous studies have demonstrated that the ODSS is a reliable and sensitive tool in both oral lichen planus (LP) and MMP.11,12 It has been recently validated for use in oral pemphigus vulgaris (PV)13 and has been shown to be useful to assess therapeutic response over time in both severe mucosal LP and PV.14,15

In 2012 an international panel of experts, predominantly dermatologists, proposed a new scoring system, the Mucous Membrane Pemphigoid Disease Area Index (MMPDAI).16 This was adapted from the validated Pemphigus Disease Area Index and the Bullous Pemphigoid Disease Area Index,17–19 and is a multisite scoring methodology that is yet to be validated. A further tool advocated for potential use in MMP is the Autoimmune Bullous Skin Disorder Intensity Score (ABSIS), which has been validated for PV but not for MMP.20

The primary aim of our study was to validate the ODSS for use in MMP by investigating its inter- and intraobserver reliability and ease of use. A secondary aim was to compare its inter- and intraobserver reliability and ease of use with those of the oral parts of the MMPDAI and ABSIS and the Physician’s Global Assessment (PGA).

Patients and methods

Research ethics approval was obtained (REC15/ES/0038). The study was conducted over 1 day within the Department of Oral Medicine at Guy’s Hospital, London.


Patients

Fifteen patients (aged 31–79 years) with a confirmed diagnosis of predominantly oral MMP (based on clinical findings, histopathology and direct immunofluorescence) were recruited consecutively from the outpatient clinic of the departments of oral medicine and dermatology at Guy’s Hospital, London. The visit replaced one of their routine follow-up appointments. All patients had mild-to-moderately severe oral lesions. Fourteen patients were taking systemic treatment: eight sulfapyridine, two dapsone, one azathioprine, one mycophenolate mofetil, one prednisolone plus sulfapyridine and one prednisolone plus mycophenolate mofetil. One patient was using topical treatment only (fluticasone propionate nasules).

Physicians

Ten clinicians experienced in diagnosing and managing patients with MMP were included from four oral medicine centres in the U.K. All were oral medicine specialists; six were medically and dentally qualified, one of whom was a practising dermatologist. Patients were scored using the ODSS, the oral parts of the MMPDAI and ABSIS, and the PGA. All physicians were familiar with the PGA, while five were experienced in using the ODSS and five were not. None of the clinicians routinely used either the MMPDAI or ABSIS. Prior to the study, a set of training slides demonstrating the ODSS system, MMPDAI, ABSIS and PGA was sent to all clinicians. On the study day the chief investigator met with all the clinicians for a detailed discussion of methodologies, using clinical slides as examples. All clinicians examined and scored each patient once, and two clinicians examined all patients twice with a 2-h interval to reduce recall. All physicians were asked for feedback regarding the scoring tools in relation to ease of use. An assistant recorded the scores in random order and the time taken for each methodology. Twelve sets of scores were recorded for each patient.

Oral Disease Severity Score

The ODSS is a comprehensive oral scoring system, detailed in Figure 1, for MMP and previously validated for oral PV.13 It has been used as an outcome measure in the assessment of therapeutic efficacy in oral LP and PV.11,14,15 In the ODSS the oral cavity is divided into 17 sites weighted 0–2 according to the area of possible involvement. These sites are the outer and inner lips, buccal mucosae right and left, soft palate, hard palate, dorsum of tongue, ventrolateral tongue right and left, floor of mouth, oropharynx and the gingivae (divided into six segments). As detailed in Figure 1, a score of 2 corresponds to > 50% of the buccal mucosa on one side being affected, or bilateral involvement of the dorsum of tongue, floor of mouth, hard and soft palate or oropharynx. Each unit of site is then allocated an activity score, which ranges from 0 to 3 (no activity = 0, mild inflammation – erythema or healing areas = 1, prominent erythema = 2, and blistering or ulceration = 3) (Fig. 2). For a specific area (e.g. buccal mucosa), the activities for each unit of the site are added together. The third component is a pain score, which is subjective and on a scale of 0–10 provided by the patient as an average for the preceding week. The three components are summed to give a total score with a theoretical maximum of 106; however, > 95% of patients would be expected to have scores in the range of 0–60, representing a clinical range from remission to severe disease.

Fig 1. Oral Disease Severity Score.

Fig 2. (a) Upper central gingivae indicating mild erythema (site score 1, activity 1). (b) Upper central gingivae indicating marked erythema (site score 1, activity 2). (c) Hard palate demonstrating areas of ulceration bilaterally (site score 2, activity 3 + 3 = 6).

Mucous Membrane Pemphigoid Disease Area Index

The MMPDAI was designed through consensus by a group of international experts in autoimmune bullous disease and is currently awaiting validation.16 Only the section relating to the oral cavity was used in our study. The oral cavity is divided into seven areas, the buccal mucosa, palate, upper and lower gingiva, tongue/floor of mouth, labial mucosa and posterior pharynx. Scores are recorded in two columns, to separate active ulcers and blisters from postinflammatory changes and scarring (damage). Scores are allocated based on the number and size of lesions, with a total possible score of 70 points for mucosal activity and seven points for disease damage. The activity and damage scores are not added together.

Autoimmune Bullous Skin Disorder Intensity Score

The ABSIS scoring tool was developed by Pfütze et al.,20 initially for use in PV to score both skin and mucous membrane involvement. In our study, only the oral part of the tool was used. Oral involvement is scored by recording the presence or absence of lesions in 11 sites in the oral cavity, with each site scoring 0 or 1, giving a maximum of 11. The oral cavity is divided into upper gingivae, lower gingivae, upper lip, lower lip, right buccal mucosa, left buccal mucosa, tongue, floor of mouth, hard palate, soft palate and pharynx. A subjective severity scale based on the patient’s assessment of discomfort during eating and drinking a range of foods is included, with a maximum score of 45. The two components are summed to provide a total score for oral severity.

Physician’s Global Assessment

The PGA is a 10-point analogue scale in which clinicians make a judgement on the overall health of, in this case, the oral mucosa. The scorer rates the mucosa from 0 = perfect health to 10 = worst mucosal disease imaginable. The PGA has previously been used as an outcome measure in dermatological conditions,21–23 and has been validated for oral PV.13 It has not been validated for use in other autoimmune bullous disorders.

Time for completion of each outcome measure

An independent assistant used a stopwatch to record the time taken (in seconds) by each clinician to obtain a disease severity score for each scoring tool, except the PGA, which took < 5 s and was therefore not timed. ODSS time included the subjective pain component whereas the ABSIS timing did not include the subjective severity score.

Statistical methods

Interobserver reliability was assessed by 10 clinicians (observers) scoring all patients with each of the four scoring tools. A sample size of 15 participants was required to achieve intraclass correlations (ICCs) of 0.77 for the interobserver reliability.

Intraobserver reliability was tested with two replications per patient (as per test–retest), with a minimum of 2 h between scores to minimize the risk of recall. As the


involvement was more onerous (burden, time or money resources etc.) for the rater than for the patient, taking the rater as fixed in the factorial design was more efficient. We fixed the number of raters to two and found the sample size required in terms of the number of patients (who are assumed to be a random sample from the population of patients). With both raters performing two replications in each methodology in all of the patients, a total of 15 patients provided 80% power to detect an ICC difference of 0.50 (relative to a null value of 0.20). Anticipating an ICC of 0.85, 15 patients with two replications will yield a width of 0.30 in the 95% confidence interval (CI).

Multilevel models were used to quantify the inter- and intraobserver reliabilities of the continuous measures. Assessment for the level of agreement in terms of the ICC for ordinal or continuous measures followed well-established benchmark limits (Fleiss’ and Altman’s benchmark scales).24,25 Landis–Koch’s benchmark values were followed when kappa coefficients were used for categorical outcomes.26 In all cases, for more rigour, in addition to the point estimate we took into account the lower bound of the 95% CI.24 Convergent validity was calculated using the Spearman rank correlation coefficient.

Results

Fifteen patients (five female, 10 male) with confirmed MMP were included. The mean ± SD age was 65 ± 11.8 years (range 31–79).

Distribution of scores

The mean ± SD total ODSS score was 25.6 ± 10.9, range 6–56, median 25, interquartile range (IQR) 17–32.8, reflecting mild-to-moderately severe disease. The mean MMPDAI activity score was 7.2 ± 6.1, range 0–31, median 6, IQR 2–11 and the mean damage score was 1.6 ± 1.4, range 0–6, median 1.5, IQR 0–3. The mean ABSIS total score was 12.1 ± 10.5, range 0–36, median 10, IQR 2–20. The mean score for involvement was 3.2 ± 1.5, median 3, IQR 2–4. For disease severity the mean was 9.9 ± 9.6, median 8, IQR 0–17.5. The mean PGA score was 4.1 ± 2.2, range 1–9, median 4, IQR 2–6 (Table 1).

Reliability

Interobserver reliability

The ICC (95% CI) for the ODSS total was 0.97 (0.94–0.99), ODSS site 0.73 (0.57–0.88), ODSS activity 0.82 (0.71–0.94) and ODSS pain 0.81 (0.68–0.93). For MMPDAI the ICC for activity was 0.59 (0.39–0.79) and for damage 0.15 (0.08–0.30). For ABSIS total the ICC score was 0.84 (0.74–0.95), for involvement 0.74 (0.59–0.89) and for patient-reported severity 0.87 (0.78–0.96). Finally, the ICC for PGA was 0.72 (0.56–0.88) (Table 1). Table 1 also includes benchmark values following Fleiss’ and Altman’s benchmark scales.24,25

Test–retest reliability

Intraobserver agreement between initial scoring and rescoring of the same patients by two observers demonstrated ICCs (95% CIs) for ODSS total of 0.97 (0.93–1.00) and 0.93 (0.86–0.99), site 0.95 (0.90–1.00) and 0.93 (0.85–1.00), activity 0.95 (0.90–1.00) and 0.94 (0.88–1.00) and pain 0.97 (0.94–1.00) and 0.59 (0.25–0.93). The ICCs for MMPDAI activity were 0.93 (0.87–1.00) and 0.70 (0.44–0.96) and damage 0.93 (0.85–1.00) and 0.79 (0.60–0.98). For ABSIS total, ICCs were 0.99 (0.98–1.00) and 0.94 (0.89–1.00), with involvement 0.98 (0.97–1.00) and 0.90 (0.79–1.00) and severity 0.99 (0.97–1.00) and 0.94 (0.89–1.00). The PGA ICCs were 0.92 (0.84–1.00) and 0.94 (0.89–1.00) (Table 2).

Convergent validity

There was good correlation between the MMPDAI activity and the ODSS total score (0.884, P < 0.001), ODSS total and ABSIS total (0.797, P < 0.001) and MMPDAI activity and ABSIS total (0.791, P < 0.001) (Table 3).

Time for completion

The mean time to obtain a disease severity score using ODSS (total) was 93 ± 31 s, with MMPDAI (activity and damage) 102 ± 24 s and ABSIS (involvement) 71 ± 18 s.

Discussion

This study has shown that the ODSS is a valid scoring method for assessment of oral disease severity in MMP, with higher inter- and intraobserver reliability than MMPDAI and higher interobserver reliability than ABSIS and PGA. The study has also validated the oral components of the MMPDAI and ABSIS for use in oral MMP.

The methodologies were compared using standardized benchmark scales. The interobserver score for ODSS total was classified as excellent, in contrast to the MMPDAI activity score, which was fair/moderate and the MMPDAI damage score, which was poor. For ABSIS the total score was good/substantial and for PGA moderate/good. Thus, this study has shown that the ODSS was more reliable as a scoring tool among these clinicians than MMPDAI, ABSIS or PGA.

The sample size was calculated in advance using appropriate statistical calculations in order to achieve an intraclass correlation of 0.77, which is considered more than adequate to allow assessment of reliability. As MMP is a rare disease, combined with the requirement for all participants to attend on one day, increasing the sample size would have been practically difficult.

We sought statistical advice on how to undertake the intraobserver methodology. By using two clinicians to score each patient twice, 30 scores were produced, compared with 20 if each clinician had rescored one patient. We retested with a minimum 2-h interval to reduce recall bias. While a longer interval (24–48 h) might have further reduced recall, this might also have been associated with subtle changes in disease activity, and of practical relevance patients would have required two visits, adding an extra burden. Although five of the 10 investigators had experience of the ODSS, with the potential for possible bias in the data, there was no difference detected in the reliability of the tool between these groups.

For intraobserver reliability the scores for ODSS total, ABSIS total and PGA were classified as excellent. For MMPDAI activity and damage, benchmark ratings were good/substantial. Thus, the ODSS had similar or better intraobserver scores than the other scoring tools examined.

The accuracy of any scoring system is best assessed against a gold standard. However, for MMP no gold standard exists, as there are no validated scoring methods with which to compare. There was good convergent validity between the oral part of MMPDAI activity and ODSS total, between ODSS total and ABSIS, and between MMPDAI and ABSIS. The validity of a tool also depends on how well the variables in the study represent the phenomenon of interest (construct validity). While analysis of construct validity was not the aim of this study, nor was it formally addressed, the variables in the ODSS contain objective measures of disease activity and severity, as well as including patient subjective data, allowing for a comprehensive appraisal of mucosal disease.11 Construct validity may be an area for further investigation in the future. These outcome measures were identified as important by the World Workshop on Oral Medicine VI.27 The scores can be reported separately, giving an accurate assessment both by the clinician and as a patient-reported outcome. It should be noted that although ODSS contains a subjective element it does not replace a quality-of-life measure. The granularity of the ODSS scoring system allows this component to be assessed separately from the other aspects of disease severity. In order to capture patient experience, ODSS, MMPDAI and ABSIS can be combined with patient-reported outcome measures. The ODSS has been externally evaluated for use in MMP10 and validated for use in PV.13

In terms of the practical use of these methodologies in the clinical setting, the average time taken to complete each of the scoring tools was < 2 min (ODSS 93 s, MMPDAI 102 s and ABSIS 71 s), which we consider feasible for use in routine clinics. The PGA typically takes < 5 s and was therefore not timed. As there was no significant difference in the time taken by those familiar with ODSS and other clinicians, we extrapolate that these mean differences reflect ease of use. However, all measures were felt to be practical for use in the clinical setting.

Clinicians were asked to comment on their experience of using each scoring tool. They reported that the ODSS was easy to use and most accurately recorded the extent of oral disease in MMP, recording more detailed activity across more sites, thereby having a wider scale. ABSIS was the easiest tool with which to assess activity, recording blisters or ulcers at only 11 sites. However, there was loss in sensitivity as erythema was not included, although it is present in preblistering lesions or stable inflammation. The substantial subjective component of ABSIS requires the patient to report symptoms with food types, and the results showed differing answers depending on how the questions were put to the patient. Clinicians felt the ABSIS scoring system was weighted too strongly on this subjective component. The PGA, while simple and very quick, was felt to offer little information regarding the objective oral involvement of MMP, so its potential for sequential monitoring of disease was limited.

The MMPDAI is the only scoring tool to include activity and damage components. For the mouth it requires lesions to be an ulcer, or a blister to be active, while scarring, erythema and postinflammatory changes are descriptors for damage. Inclusion of a damage score for multisite MMP is valuable, albeit that scarring is an unusual finding in the oral mucosa. However, the omission of erythema as a descriptor for activity precludes those lesions that are stable but erythematous and therefore, in the oral mucosa, inflamed. The authors of the MMPDAI have subsequently acknowledged the absence of erythema from the activity descriptors for mucosal sites and are preparing a revision. In contrast to PV, gingival erythema in MMP may last months or years and is usually symptomatic, requiring active monitoring and often systemic therapy. Thus, oral physicians unanimously

Table 1 Scores and interobserver reliability for each of the disease severity scoring systems and their individual components

Scoring system            Range   Mean ± SD      Median (IQR)    Interobserver ICC (95% CI)   Overall benchmark value (a)
ODSS site                 0–10    3.2 ± 2.4      3 (2–4.5)       0.73 (0.57–0.88)             Moderate/good
ODSS activity             3–37    15.0 ± 7.4     15 (9–19)       0.82 (0.71–0.94)             Good/substantial
ODSS pain                 0–10    3.2 ± 2.4      3 (2–4.5)       0.81 (0.68–0.93)             Good/substantial
ODSS total (0–106)        6–56    25.6 ± 10.9    25 (17–32.8)    0.97 (0.94–0.99)             Excellent
MMPDAI activity (0–70)    0–31    7.2 ± 6.1      6 (2–11)        0.59 (0.39–0.79)             Fair/moderate
MMPDAI damage (0–12)      0–6     1.6 ± 1.4      1.5 (0–3)       0.15 (0.08–0.30)             Poor
ABSIS involvement         0–8     3.2 ± 1.5      3 (2–4)         0.74 (0.59–0.89)             Moderate/good
ABSIS severity            0–32    9.9 ± 9.6      8 (0–17.5)      0.87 (0.78–0.96)             Good/substantial
ABSIS total (0–56)        0–36    12.1 ± 10.5    10 (2–20)       0.84 (0.74–0.95)             Good/substantial
PGA (0–10)                1–9     4.1 ± 2.2      4 (2–6)         0.72 (0.56–0.88)             Moderate/good

ODSS, Oral Disease Severity Score; MMPDAI, Mucous Membrane Pemphigoid Disease Area Index; ABSIS, Autoimmune Bullous Skin Disorder Intensity Score; PGA, Physician’s Global Assessment; IQR, interquartile range; ICC, intraclass correlation coefficient; CI, confidence interval.
(a) Assessment for the level of agreement in terms of the ICCs followed Fleiss’ and Altman’s benchmark scales.24,25
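
The ODSS components reported separately in Table 1 (site, activity, pain) combine by simple addition, as described in the Oral Disease Severity Score section. The following is a minimal sketch of that arithmetic, not the published scoring form. The site names, the `SITE_MAX` table and the `odss_total` function are hypothetical. The per-site maxima are an assumption inferred from the text: seven sites can score 2 and the remaining ten score at most 1, which reproduces the theoretical maximum of 106. Figure 1 remains the authoritative description.

```python
# Hypothetical site list and per-site maxima, inferred from the text
# (17 sites; theoretical maximum total of 106). Not the official ODSS form.
SITE_MAX = {
    "outer_lips": 1,
    "inner_lips": 1,
    "buccal_right": 2,
    "buccal_left": 2,
    "soft_palate": 2,
    "hard_palate": 2,
    "tongue_dorsum": 2,
    "tongue_ventrolateral_right": 1,
    "tongue_ventrolateral_left": 1,
    "floor_of_mouth": 2,
    "oropharynx": 2,
    # The gingivae are divided into six segments, each scoring at most 1.
    **{f"gingiva_segment_{i}": 1 for i in range(1, 7)},
}


def odss_total(findings, pain):
    """Return the site, activity, pain and total scores.

    findings: dict mapping a site name to a list of activity scores (0-3),
              one entry per affected unit of that site.
    pain:     patient-reported average pain over the preceding week (0-10).
    """
    if not 0 <= pain <= 10:
        raise ValueError("pain must be between 0 and 10")
    site_score = 0
    activity_score = 0
    for site, activities in findings.items():
        if site not in SITE_MAX:
            raise KeyError(f"unknown site: {site}")
        if len(activities) > SITE_MAX[site]:
            raise ValueError(f"too many affected units for {site}")
        if any(a not in (0, 1, 2, 3) for a in activities):
            raise ValueError("activity scores must be 0, 1, 2 or 3")
        site_score += len(activities)      # one point per affected unit
        activity_score += sum(activities)  # activities of each unit are added
    return {
        "site": site_score,
        "activity": activity_score,
        "pain": pain,
        "total": site_score + activity_score + pain,
    }


# Bilateral hard-palate ulceration, as in Fig. 2(c): site score 2, activity 3 + 3 = 6.
example = odss_total({"hard_palate": [3, 3]}, pain=0)
```

Keeping the three components in the returned dictionary mirrors the way Table 1 reports them individually as well as in total.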


Table 2 Within-observer (intraobserver) reliability data for each scoring methodology

Scoring system       Observer 1 ICC (95% CI)   Observer 2 ICC (95% CI)   P-value   Overall benchmark values (a)
ODSS site            0.95 (0.90–1.00)          0.93 (0.85–1.00)          0.26      Excellent
ODSS activity        0.95 (0.90–1.00)          0.94 (0.88–1.00)          0.39      Excellent
ODSS pain            0.97 (0.94–1.00)          0.59 (0.25–0.93)          0.35      Moderate/good
ODSS total           0.97 (0.93–1.00)          0.93 (0.86–0.99)          0.75      Excellent
MMPDAI activity      0.93 (0.87–1.00)          0.70 (0.44–0.96)          0.30      Good/substantial
MMPDAI damage        0.93 (0.85–1.00)          0.79 (0.60–0.98)          0.28      Good/substantial
ABSIS involvement    0.98 (0.97–1.00)          0.90 (0.79–1.00)          0.80      Excellent
ABSIS severity       0.99 (0.97–1.00)          0.94 (0.89–1.00)          0.57      Excellent
ABSIS total          0.99 (0.98–1.00)          0.94 (0.89–1.00)          0.55      Excellent
PGA                  0.92 (0.84–1.00)          0.94 (0.89–1.00)          0.05      Excellent

ODSS, Oral Disease Severity Score; MMPDAI, Mucous Membrane Pemphigoid Disease Area Index; ABSIS, Autoimmune Bullous Skin Disorder Intensity Score; PGA, Physician’s Global Assessment; ICC, intraclass correlation coefficient; CI, confidence interval.
(a) Assessment for the level of agreement in terms of the ICCs followed Fleiss’ and Altman’s benchmark scales.24,25
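
For readers unfamiliar with the ICC values in Tables 1 and 2, the sketch below may help. The study itself estimated ICCs with multilevel models. The code here swaps in a simpler technique: the ANOVA-based two-way random-effects, absolute-agreement, single-rater coefficient, often labelled ICC(2,1) in the Shrout and Fleiss formulation.24 The function name `icc_2_1` and the toy ratings matrix are illustrative assumptions and are not study data.

```python
def icc_2_1(ratings):
    """ANOVA-based ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per patient, one column per rater.
    """
    n = len(ratings)                      # number of patients (targets)
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete matrix with at least 2 patients and 2 raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for patients (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy matrix: 6 patients scored by 4 raters (illustrative values only).
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
toy_icc = icc_2_1(toy)  # about 0.29: raters differ systematically in level
```

A coefficient close to 1, such as the ODSS total values in the tables, indicates that almost all score variance comes from differences between patients rather than between raters or occasions. The toy matrix gives a low value because its raters disagree markedly in absolute level.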

felt this should more correctly represent activity rather than damage and it may in part have accounted for the longer time taken to complete the scoring. Clinicians also reported difficulty in assessing the level of activity where there was localized or patchy gingival ulceration, finding the 10-point scale more difficult to adapt to the gingiva.

While the MMPDAI is intended for use by dermatologists in ‘mild MMP’ but not in patients with severe ocular or laryngeal disease, patients with oral lesions occur in both groups. Furthermore, those with mild disease will present to a range of specialists and therefore a scoring methodology that is applicable to all patients and all clinicians is arguably more useful. As neither oral physicians nor dermatologists are skilled in ocular assessment, and disease progression does not always manifest as erythema, a total score that includes oral- and ocular-specific methods but with additional sites assessed would be optimal.

An ideal scoring system should be able to assess a patient’s baseline severity to aid assessment, monitor progress longitudinally throughout treatment and detect relapse.4 The ODSS has been used in this way in our department for over 10 years for oral MMP, PV and LP. It has been shown to be a valuable tool in longitudinal studies in oral LP and PV14,15 and has revealed distinct clinical phenotypes in MMP, which appear to be stable over long periods of time.

This study has demonstrated the value of the ODSS for the assessment of disease severity in oral MMP and by assessing activity at 17 oral sites it provides the potential to accurately monitor response to treatment. It is easy to use and quick to learn and is designed for use in oral medicine, dermatology and other relevant clinical specialities. Its additional versatility for use in PV and LP is an added advantage over other scoring methodologies. We propose that this scoring tool would be useful for recording sequential disease activity routinely in the clinic, as well as in future multicentre studies.

Table 3 Convergent validity of the disease severity scoring systems

Scoring system                        Correlation coefficient   P-value
ODSS total and MMPDAI activity        0.884                     < 0.001
ODSS total and ABSIS total            0.797                     < 0.001
MMPDAI activity and ABSIS total       0.791                     < 0.001

ODSS, Oral Disease Severity Score; MMPDAI, Mucous Membrane Pemphigoid Disease Area Index; ABSIS, Autoimmune Bullous Skin Disorder Intensity Score.

References

1 Setterfield J, Shirlaw PJ, Kerr-Muir M et al. Mucous membrane pemphigoid: a dual circulating antibody response with IgG and IgA signifies a more severe and persistent disease. Br J Dermatol 1998; 138:602–10.
2 Ahmed AR, Kurgis BS, Rogers RS. Cicatricial pemphigoid. J Am Acad Dermatol 1991; 24:987–1001.
3 Taylor J, McMillan R, Shephard M et al. World Workshop on Oral Medicine VI: a systematic review of the treatment of mucous membrane pemphigoid. Oral Surg Oral Med Oral Pathol Oral Radiol 2015; 120:161–71.
4 Bastuji-Garin S, Sbidian E. How to validate outcome instruments for pemphigus. J Invest Dermatol 2009; 129:2328–30.
5 Lee BWH, Tan JCK, Radjenovic M et al. A review of scoring systems for ocular involvement in chronic cutaneous bullous disease. Orphanet J Rare Dis 2018; 13:83.
6 Tauber J, Jabbur N, Foster CS. Improved detection of disease progression in ocular cicatricial pemphigoid. Cornea 1992; 11:446–51.
7 Ong HS, Setterfield JF, Minassian DC, Dart JK. Mucous membrane pemphigoid with ocular involvement: the clinical phenotype and its relationship to direct immunofluorescence findings. Ophthalmology 2018; 125:496–504.
8 Williams GP, Saw VP, Saeed T et al. Validation of a fornix depth measurer: a putative tool for the assessment of progressive cicatrising conjunctivitis. Br J Ophthalmol 2011; 95:842–7.
9 Dart JK. The 2016 Bowman Lecture Conjunctival curses: scarring conjunctivitis 30 years on. Eye (Lond) 2017; 31:301–32.
10 Cozzani E, Di Zenzo G, Calabresi I et al. Autoantibody profile of a cohort of 78 Italian patients with mucous membrane pemphigoid: correlation between reactivity profile and clinical involvement. Acta Derm Venereol 2016; 96:768–73.
11 Escudier M, Ahmed N, Shirlaw P et al. A scoring system for mucosal disease severity with special reference to oral lichen planus. Br J Dermatol 2007; 157:765–70.
12 Reeves GMB, Lloyd M, Rajlawat BP et al. Ocular and oral grading of mucous membrane pemphigoid. Graefes Arch Clin Exp Ophthalmol 2012; 250:611–18.
13 Ormond M, McParland H, Donaldson ANA et al. An Oral Disease Severity Score validated for use in oral pemphigus vulgaris. Br J Dermatol 2018; 179:872–81.
14 Wee J, Shirlaw PJ, Challacombe SJ, Setterfield JF. Efficacy of mycophenolate mofetil in severe mucocutaneous lichen planus: a retrospective review of 10 patients. Br J Dermatol 2012; 167:36–43.
15 Greenblatt DT, Benton EC, Groves RW, Setterfield JF. Crescendo response to rituximab in oral pemphigus vulgaris: a case with 7-year follow-up. Clin Exp Dermatol 2016; 41:529–32.
16 Murrell DF, Marinovic B, Caux F et al. Definitions and outcome measures for mucous membrane pemphigoid: recommendations of an international panel of experts. J Am Acad Dermatol 2015; 72:168–74.
17 Murrell DF, Dick S, Ahmed AR et al. Consensus statement on definitions of disease, end points and therapeutic response for pemphigus. J Am Acad Dermatol 2008; 58:1043–6.
18 Rosenbach M, Murrell D, Bystryn JC et al. Reliability and convergent validity of two outcome instruments for pemphigus. J Invest Dermatol 2009; 129:2404–10.
19 Wijayanti A, Zhao CY, Boettiger D et al. The reliability, validity and responsiveness of two disease scores (BPDAI and ABSIS) for bullous pemphigoid: which one to use? Acta Derm Venereol 2017; 97:24–31.
20 Pfütze M, Niedermeier A, Hertl M, Eming R. Introducing a novel Autoimmune Bullous Skin Disorder Intensity Score (ABSIS) in pemphigus. Eur J Dermatol 2007; 17:4–11.
21 Langley RG, Ellis CN. Evaluating psoriasis with Psoriasis Area and Severity Index, Psoriasis Global Assessment, and Lattice System Physician’s Global Assessment. J Am Acad Dermatol 2004; 51:563–9.
22 Heald P, Mehlmauer M, Martin AG et al. Topical bexarotene therapy for patients with refractory or persistent early-stage cutaneous T-cell lymphoma: results of the phase III clinical trial. J Am Acad Dermatol 2003; 49:801–15.
23 Guzzo CA, Weiss JS, Mogavero HS et al. A review of two controlled multicenter trials comparing 0.05% halobetasol propionate ointment to its vehicle in the treatment of chronic eczematous dermatoses. J Am Acad Dermatol 1991; 25:1179–83.
24 Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979; 86:420–8.
25 Altman DG. Practical Statistics for Medical Research. London: Chapman and Hall, 1991.
26 Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33:159–74.
27 Ní Ríordáin R, Shirlaw P, Alajbeg I et al. World Workshop on Oral Medicine VI: patient-reported outcome measures and oral mucosal disease: current status and future direction. Oral Surg Oral Med Oral Pathol Oral Radiol 2015; 120:152–60.

