
RESEARCH ARTICLE

Molecular-Based Diagnosis of Multiple Sclerosis and Its Progressive Stage
Christopher Barbour, MS,1,2* Peter Kosa, PhD,1* Mika Komori, MD PhD,1
Makoto Tanigawa, BS,1 Ruturaj Masvekar, PhD,1 Tianxia Wu, PhD,3
Kory Johnson, PhD,4 Panagiotis Douvaras, PhD,5 Valentina Fossati, PhD,5
Ronald Herbst, PhD,6 Yue Wang, PhD,6 Keith Tan, PhD,7
Mark Greenwood, PhD,2 and Bibiana Bielekova, MD1

Objective: Biomarkers aid diagnosis, allow inexpensive screening of therapies, and guide selection of patient-specific therapeutic regimens in most internal medicine disciplines. In contrast, neurology lacks validated measurements of the physiological status, or dysfunction(s) of cells of the central nervous system (CNS). Accordingly, patients with chronic neurological diseases are often treated with a single disease-modifying therapy without understanding patient-specific drivers of disability. Therefore, using multiple sclerosis (MS) as an example of a complex polygenic neurological disease, we sought to determine whether cerebrospinal fluid (CSF) biomarkers are intraindividually stable, cell type-, disease- and/or process-specific, and responsive to therapeutic intervention.
Methods: We used statistical learning in a modeling cohort (n = 225) to develop diagnostic classifiers from DNA-aptamer–based measurements of 1,128 CSF proteins. An independent validation cohort (n = 85) assessed the reliability of derived classifiers. The biological interpretation resulted from in vitro modeling of primary or stem cell–derived human CNS cells and cell lines.
Results: The classifier that differentiates MS from CNS diseases that mimic MS clinically, pathophysiologically, and on imaging achieved a validated area under the receiver operating characteristic curve (AUROC) of 0.98, whereas the classifier that differentiates relapsing–remitting from progressive MS achieved a validated AUROC of 0.91. No classifiers could differentiate primary progressive from secondary progressive MS better than random guessing. Treatment-induced changes in biomarkers greatly exceeded intraindividual and technical variabilities of the assay.
Interpretation: CNS biological processes reflected by CSF biomarkers are robust, stable, disease specific, or even disease stage specific. This opens opportunities for broad utilization of CSF biomarkers in drug development and precision medicine for CNS disorders.
ANN NEUROL 2017;82:795–812

Biomarkers play a critical role in diagnostic and therapeutic decisions in many areas of internal medicine. Cell-specific analytes (such as liver function tests) provide essential information about functionality in their cells of origin and represent the basis of molecular diagnosis. Molecular dissection of complex disorders allows selection of optimal, individualized therapy. Such “precision” therapy consists of simultaneous application of (multiple)

View this article online at wileyonlinelibrary.com. DOI: 10.1002/ana.25083

Received Feb 4, 2017, and in revised form Oct 4, 2017. Accepted for publication Oct 17, 2017.
Address correspondence to Dr Bielekova, Neuroimmunological Diseases Unit (NDU), National Institute of Neurological Disorders and Stroke (NINDS),
National Institutes of Health (NIH), Building 10, Room 5N248, Bethesda, MD 20892. E-mail: Bibi.Bielekova@nih.gov

From the 1Neuroimmunological Diseases Unit, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD; 2Department of Mathematical Sciences, Montana State University, Bozeman, MT; 3Clinical Trials Unit, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD; 4Bioinformatics Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD; 5New York Stem Cell Foundation Research Institute, New York, NY; 6Department of Oncology Research, MedImmune, Gaithersburg, MD; and 7Translational Medicine, Neuroscience, Innovative Medicines, and Early Development, AstraZeneca, Cambridge, United Kingdom.

M.K. contributed to this work as a former employee of the National Institute of Neurological Disorders and Stroke, and the opinions expressed in this
work do not represent her current affiliation, Eli Lilly Japan, Kobe, Japan.

*C.B. and P.K. contributed equally.

Additional supporting information can be found in the online version of this article.

Published 2017. This article is a US Government work and is in the public domain in the USA. 795
ANNALS of Neurology

drugs that collectively target all pathological processes that underlie expression of a disease in a particular patient.

In contrast, neurologists lack tools that provide reliable information about the dysfunction of constituent cells of the central nervous system (CNS). This ambiguity leads to 20 to 40% diagnostic errors,1,2 slow therapeutic progress,3 and suboptimal clinical outcomes. Complex neurological disorders such as multiple sclerosis (MS) are generally treated by a single disease-modifying treatment (DMT), without understanding patient-specific drivers of disability. The multiplicity of mechanisms in neurodegenerative diseases and heterogeneity within patient populations make successful treatment by a single therapy unlikely. Conversely, proving clinical efficacy of a single therapy is difficult precisely because of the limited contribution of the targeted mechanism to the overall disease process.

Thus, reliable quantification of diverse pathogenic processes in the CNS of living subjects is a prerequisite for broad therapeutic progress in neurology. Although cerebrospinal fluid (CSF), an outflow of CNS interstitial fluid,4 is an ideal source for molecular biomarkers, remarkably few CSF biomarkers have reached clinical practice or drug development.5 This reality is partly based on a circular argument: CSF examinations are not implemented in clinical trials or clinics because of a lack of validated, commercially available biomarker measurements, while reliable data on surrogacy of biomarkers to clinical outcomes can be obtained only from clinical trials or wide clinical use.

Consequently, the goal of this proof-of-concept study was to investigate, using MS as an example, the following hypotheses: (1) a subset of CSF biomarkers are intraindividually stable in the absence of disease process or therapeutic intervention, and such biomarkers can be assembled into clinically useful tests; (2) a subgroup of CSF biomarkers have restricted cellular origin and can be used to develop clinically useful classifiers; (3) healthy and different disease states of the CNS are sufficiently dissimilar on a molecular level that CSF biomarker-based classifiers can differentiate a specific disease from those that have similar clinical phenotype, pathophysiology, or imaging features; (4) CSF biomarker-based classifiers can also quantify evolution of a single disease process, thus differentiating its stages; and (5) therapy-induced changes in CSF biomarkers can be readily distinguished from intraindividual variability, demonstrating that CSF biomarkers could serve as pharmacodynamic markers in drug development.

Subjects and Methods

Subjects
Subjects were prospectively recruited (May 2009–March 2015) as part of a Natural History protocol “Comprehensive Multimodal Analysis of Neuroimmunological Diseases of the Central Nervous System” (ClinicalTrials.gov identifier: NCT00794352). The patients’ eligibility criteria included age 18 to 75 years and presentation with a clinical syndrome consistent with immune-mediated CNS disorder or neuroimaging consistent with inflammatory or demyelinating CNS disease. The inclusion criteria for healthy donors (HDs) were age 18 to 75 years, with vital signs within normal range at the time of the screening visit. The diagnostic workup included a neurological examination, magnetic resonance imaging (MRI) of the brain, and laboratory tests (blood, CSF) as described.6 Diagnoses of relapsing–remitting MS (RRMS), primary progressive MS (PPMS), and secondary progressive MS (SPMS) were based on 2010 revised McDonald diagnostic criteria.7 The remaining subjects were classified as either other inflammatory neurological disorders (OIND; eg, meningitis/encephalitis, Susac syndrome, CNS vasculitis, systemic lupus erythematosus, and genetic immunodeficiencies with CNS inflammation) or noninflammatory neurological disorders (NIND; eg, epilepsy, vascular/ischemic disorders, leukodystrophy) based on the evidence of intrathecal inflammation as published.6,8 The final clinical diagnostic classification was based on longitudinal follow-up, but reached prior to development of SOMAscan-based diagnostic classifiers. The vast majority of subjects (with few OIND exceptions described elsewhere6) were not treated by any DMTs at the time of CSF collection.

Clinical information on the validation cohort (n = 85) was not available to developers of the diagnostic classifiers, and the results of the molecular diagnostic tests were not available to clinicians determining diagnoses.

CSF Collection and Processing
CSF was collected on ice and processed according to a written standard operating procedure. Research CSF aliquots were assigned prospective alphanumeric codes, and centrifuged (335 g for 10 minutes at 4°C) within 15 minutes of collection. The CSF supernatant was aliquoted and stored in polypropylene tubes at −80°C until use.

SOMAscan
SOMAscan (SomaLogic, Boulder, CO) is a relative quantification of 1,128 proteins (ie, SOMAScan version available between June 2012 and October 2016)9 or 1,300 proteins (ie, SOMAScan version available after October 2016) using single-stranded DNA molecules synthesized from chemically modified nucleotides (SOMAmers; Slow Off-rate Modified DNA Aptamers). Chemical modifications enhance affinity binding to specific proteins. SOMAmers play a dual role of protein affinity-binding reagents and a DNA sequence recognized by complementary DNA probes. This enables quantification of individual protein concentration using a DNA concentration quantified by hybridization.10–12 The raw data (relative fluorescent units [RFUs]) are normalized and calibrated; hybridization normalization uses a set of 12 hybridization controls, and a common pooled calibrator corrects for plate-to-plate variation.

796 Volume 82, No. 5


Barbour et al: MS Molecular-Based Diagnosis

SOMAScan focuses on secreted soluble proteins, using a single discovery platform for all research applications.

Assessment of Cellular Origin of Tested Biomarkers from a Subset of Human Immune Cells and CNS Cells
Fresh peripheral blood mononuclear cells (PBMCs) of 2 HDs were obtained from Ficoll gradient-treated lymphocytapheresis samples. Monocytes, B cells, CD4+ T cells, CD8+ T cells, natural killer cells, innate lymphoid cells, and myeloid dendritic cells were fluorescence-activated cell sorted. Each purified cell subtype was cultured (1 × 10⁶ cells/ml) in serum-free X-VIVO 15 medium (Lonza, Walkersville, MD) with or without 10 µg/ml phorbol 12-myristate 13-acetate (PMA) and 1 µM Ionomycin. Supernatants were collected after 48 hours and frozen until use.

ISOLATED PRIMARY HUMAN CNS CELLS OR CELL LINES. Human neurons (ScienCell, Carlsbad, CA), human astrocytes (ScienCell), human brain endothelial cell line (HCMEC/D3; provided by Pierre-Olivier Couraud, INSERM, France13), human microglia cell line (CHME5; provided by Nazira El-Hage, Florida International University, FL), and human choroid plexus epithelial cells (hCPEpiC; ScienCell) were plated (10⁵ cells/ml, 10 ml/flask). Cells were treated with PBMC culture media (control) and an inflammatory mediator (supernatant from lipopolysaccharide- and CD3/CD28 beads-stimulated human PBMCs; 50% vol/vol). Oligodendrocytes were differentiated from the NIH-approved human embryonic stem cell line RUES1 using published protocol.14 Cell-culture supernatants were collected after 24-hour incubation and frozen until use.

Measuring Signal-to-Noise Ratio in Biomarkers
Differences in biomarker measurements in identical samples analyzed blindly at different times (n = 29, 88 samples) quantified the technical variability. Similarly, differences in longitudinal samples of HDs (n = 11, 24 samples) collected 1 year apart quantified the biological variation. From each technical or biological replicate, we calculated the relative percentage change for each SOMAmer as the difference in the repeated measures divided by the average of the replicates.

To constrain the number of biomarkers used for statistical learning, we calculated signal-to-noise ratios (SNRs) using R statistical software15 as exemplified in Figure 1A. Briefly, we estimated the residual variance for each log-transformed biomarker from a linear mixed model,16 after accounting for subject-to-subject variation using the random intercept adjustment. The residual variance measured when identical samples were analyzed repeatedly represents “technical” variation (ie, variation caused by differences in assay runs). Analogously, the variance measured in multiple different CSF samples derived from individual HDs represents “biological” variation (ie, intraindividual variation in healthy state). A third type of variance, which we call “clinical,” reflects how individual biomarkers vary in the presence of different disease states (eg, variation across all disease states). This variance was estimated by using one observation for each biomarker from each of the n = 225 subjects in the training dataset. The SNR was calculated as the clinical variance divided by the sum of all (the clinical, biological, and technical) variances. Thus, SNR reflects the proportion of the total variability that is attributable to variation among subjects from different diagnostic categories. In other words, biomarkers with high SNR (ie, values close to 1) are biomarkers with large differences across subjects in the training cohort, but with low variation in longitudinal sampling of healthy people that can be detected with minimal variations across different assay runs. These markers have a high diagnostic potential. In contrast, biomarkers with SNR close to 0 represent proteins that are either not detectable in the CSF, or offer low diagnostic value because the physiological (intraindividual) variation or assay noise is comparable to the difference between diagnostic categories we desire to measure. An analogous SNR procedure restricted the number of biomarker ratios used for statistical learning.

Area under the Receiver Operating Characteristics Curve
The area under the receiver operating characteristics curve (AUROC; R statistical software using the roc function in the pROC package17) quantified the ability of biomarkers, biomarker ratios, and diagnostic classifiers to differentiate diagnostic categories. Higher AUROC values imply a larger potential for separating diagnostic groups. The AUROCs were calculated for the 124,750 ratios formed from the top 500 SNR SOMAmers in the modeling cohort (n = 225), and these were used to restrict the number of biomarker ratios used for statistical learning. Graphical exploration of the distributions of the SNR and AUROCs was used to determine cutoffs for the best ratios on the 2 criteria in each situation. For differentiating MS (RRMS, PPMS, SPMS) from non-MS subjects (OIND, NIND, HD), cutoffs of AUROC > 0.65 and SNR > 0.75 were selected. Similarly, cutoffs of AUROC > 0.7 and SNR > 0.7 were used in differentiating progressive (PPMS, SPMS) MS from RRMS. Finally, cutoffs of AUROC > 0.65 and SNR > 0.65 were used for differentiating SPMS from PPMS. Ratios meeting these cutoffs were used in statistical learning to assemble diagnostic classifiers.

The performance of resulting classifiers was assessed by AUROC, along with its 95% confidence interval (CI), in the independent validation cohort.

Statistical Learning to Develop Diagnostic Classifiers Using a Random Forest Methodology
Random forests18 (randomForest R package19; see description in Fig 2) were built by sequentially estimating multiple classification trees (between 500 and 1,500; selected based on the stability of the out-of-bag [OOB] error; see Fig 2) using bootstrapped training cohort samples (n = 225) with a random subset of predictors for each node. The trees in the “forest” are averaged together, providing more reliable predictions than are possible using a single classification tree. Biomarkers and biomarker ratios were natural log-transformed prior to classifier construction. Variable importance measures (average decrease in

November 2017 797



accuracy from permuting each covariate across all trees)20,21 evaluated the contribution of individual biomarkers to the classifier. The R code for constructing these classifiers is provided as Supplementary File 1.

Results

SOMAscan Assay on CSF Samples
Using SOMAscan, we analyzed CSF samples in blinded fashion in 2 independent cohorts of subjects (Table): (1) a modeling cohort (n = 225) consisted of untreated subjects from 6 diagnostic groups (RRMS, n = 40; PPMS, n = 40; SPMS, n = 40; NIND, n = 39; OIND, n = 41; and HD, n = 25); and (2) an independent validation cohort (n = 85 untreated subjects) consisted of 14 subjects per non-MS diagnostic category (OIND and NIND), 16 subjects per RRMS and PPMS group, 15 SPMS subjects, and 10 HDs. The raw SOMAscan data for both cohorts are available in Supplementary Table 1.

In addition, to assess the technical and biological variability of the SOMAscan, we analyzed 88 CSF samples representing technical replicates (identical CSF aliquots analyzed at different time points) and 24 longitudinal CSF samples collected from 11 HDs over a 1-year span, serving as biological replicates. The average technical and biological relative percentage change (see Fig 1B and Supplementary Table 2) was 11.9% and 12.9%, respectively. To determine the effect of immunomodulatory DMT on SOMAmers, we analyzed 10 longitudinal CSF samples from 5 OIND patients collected before and after therapy with high-dose methylprednisolone. The resulting average relative percentage change of 52.1% exceeded the highest technical and biological relative percentage change. Therefore, we concluded that the SOMAscan reliably measures CSF biomarkers and can detect an effect of DMT in subjects over time.

Development of Diagnostic Tests for MS
Considering the complex biological mechanisms underlying MS, a single biomarker cannot reliably differentiate MS from all other CNS diseases. Machine learning strategies,18 such as random forests,19 combine multiple biomarkers using a statistical algorithm that enhances sensitivity and specificity,22 resulting in clinically useful classifiers (see Fig 2 for explanation of the random forests algorithm).

We employed random forests trained using the modeling (n = 225) cohort to develop 3 classifiers: (1) one that differentiates MS from all other diagnostic groups, (2) one that differentiates RRMS from progressive MS (ie, PPMS + SPMS), and (3) one that differentiates PPMS from SPMS. Because the validity of the classifier must be tested in an independent cohort not used for the model construction,23 we assessed the performance of the diagnostic tests in the independent (n = 85) cohort by predicting AUROCs and their CIs (Fig 3).

Classifiers that used all 1,128 SOMAmers achieved validated AUROC = 0.91 (95% CI = 0.84–0.97) for the MS versus non-MS test, AUROC = 0.73 (95% CI = 0.57–0.90) for the RRMS versus progressive MS test, and AUROC = 0.64 (95% CI = 0.44–0.84) for the SPMS versus PPMS test (see Fig 3).

We expected that not all 1,128 proteins measured by SOMAscan will be detectable in the CSF. We filtered out noise stemming from the poorly detectable biomarkers by restricting the number of SOMAmers to the 500 with the highest SNR (see Subjects and Methods for details). We reasoned that the most useful biomarkers will vary greatly among subjects from different diagnostic categories, whereas they will have stable physiological levels (ie, they will have low variance in biological replicates of HDs) and can be measured with high precision (ie, low variance in technical replicates). This reduced set of 500 biomarkers (Supplementary Table 3) improved performance of the classifiers: AUROC for the RRMS versus progressive MS classifier increased to 0.80 (95% CI = 0.65–0.95); the performance of the diagnostic test for

FIGURE 1: Signal-to-noise ratio (SNR) calculation and differences in technical and biological replicates versus pre- and posttreatment samples. (A) Graphical example of technical (top graphs) and biological (bottom graphs) variance calculation for SOMAmer SL004672. The x-axes correspond to the patient/sample that the measurements were produced from. The upper panels show technical replicates (n = 88), and the lower panels show biological replicates (n = 24). The left panels show the raw measurements (natural-log scale relative fluorescent units [RFUs]) for SOMAmer SL004672 for each sample on the y-axes. The right panels show the identical raw observations with the random intercept effect subtracted to account for subject-to-subject variation. These residuals (after subtracting means of technical or biological replicates) were used to estimate the technical and biological variance, respectively. The horizontal black lines are the estimated mean from the technical (top graphs) and biological (bottom graphs) variance models. (B) Differences in biomarker measurements in identical samples analyzed repeatedly (technical replicates, n = 88, left), in longitudinal healthy donor samples measured at different time points (biological replicates, n = 24, middle), and in patient samples before and after application of immunomodulatory therapy (biological changes, n = 10, right) were quantified in 2 ways: (1) an average of Spearman rho values calculated across 500 high-signaling SOMAmers with high SNR, and (2) an average of variabilities (a median of relative percentage changes calculated as absolute difference of RFUs for each of the 500 high-signaling SOMAmers between 2 replicates divided by the average of the 2 RFUs) for all pairs of replicates in each respective category. Examples of the strongest and the weakest correlations between 2 samples in each category are visualized on 500 high-signaling SOMAmers. The axes show log10 scales of RFUs of SOMAmers. CSF = cerebrospinal fluid; SD = standard deviation.
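
The two quantities described in this caption and in Subjects and Methods can be illustrated with a minimal Python sketch. It is not the authors' R code (which is provided as Supplementary File 1): the function names are hypothetical, and the pooled within-subject variance below is a simplified stand-in, assumed here for illustration, for the residual variance of the random-intercept linear mixed model used in the study.

```python
from statistics import mean


def relative_percentage_change(rfu_a: float, rfu_b: float) -> float:
    """Absolute difference of two replicate RFUs divided by their average, in percent."""
    return abs(rfu_a - rfu_b) / mean([rfu_a, rfu_b]) * 100.0


def within_subject_variance(replicates_by_subject: dict[str, list[float]]) -> float:
    """Pooled variance of log-scale values around each subject's own mean.

    Simplified stand-in (an assumption of this sketch) for the residual
    variance of a random-intercept model. At least one subject must have
    two or more replicates.
    """
    squared_residuals = 0.0
    degrees_of_freedom = 0
    for values in replicates_by_subject.values():
        subject_mean = mean(values)
        squared_residuals += sum((v - subject_mean) ** 2 for v in values)
        degrees_of_freedom += len(values) - 1
    return squared_residuals / degrees_of_freedom


def signal_to_noise_ratio(clinical: float, biological: float, technical: float) -> float:
    """Clinical variance as a share of the summed clinical, biological, and technical variances."""
    return clinical / (clinical + biological + technical)
```

With this definition, a biomarker whose clinical variance is 8 times each of its biological and technical variances would have an SNR of 0.8, which would pass, for example, the SNR > 0.75 cutoff quoted for the MS versus non-MS comparison.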


FIGURE 2: Highly simplified artificial example explaining the principles of random forests. (A) A decision tree differentiates groups of observations (elements) using selected features. For example, to differentiate relapsing–remitting multiple sclerosis (RRMS) from progressive multiple sclerosis (MS), useful features may be magnetic resonance imaging (MRI) contrast-enhancing lesions (CELs) and T2 lesions, IgG index, and disability. (B) Assembling features into a decision tree provides better results than classifying based on any single feature. A decision tree algorithm selects from available features one that best differentiates diagnostic categories and computes its optimal threshold (ie, the value on which to split the elements). The algorithm then finds the next best feature to split the categories, and this process repeats itself until meeting termination criterion, for example, when a certain number of splits has occurred. The number of splits corresponds to the depth of a tree. A random forest algorithm mitigates the problem of unreliable predictions by using a collection of decision trees, each generated slightly differently, using a random subset of features and elements. First, the algorithm restricts the number of features from which each new tree is constructed; if testing p features, the algorithm randomly selects √p features available for every split in a tree. Second, each tree is constructed from a random sample of patients (with replacement) of the same size as the original training cohort (bootstrapping). The observations withheld from each tree due to bootstrapping are used to calculate out-of-bag (OOB) misclassification error. In our example (C), only √4 = 2 features are used for each split in the decision trees of depth 2, with each of the decision trees generated from a bootstrapped subset of the training dataset. (C) Possible partitions for CELs-T2 lesions (upper) and CELs-disability (lower) combinations of features; (D) corresponding examples of decision trees, with a total of 4 trees in the forest. The final prediction is derived as an average prediction from all randomly generated trees. For example, if 1 tree classified a patient as progressive MS but the other 3 trees classified the patient as RRMS, the subject will be classified as RRMS with 75% probability. Because of the high variability in individual trees, the algorithm is typically run for many trees until the OOB predictions stabilize. Therefore, it cannot be described by a mathematical equation or a single decision tree. (E) The randomness assures that the algorithm searches the entire p-dimensional partition space (only 3 dimensions shown, but the search space is 4-dimensional) for the best features, and by averaging the partitioning thresholds in the training cohort, the classifier also effectively derives optimal global thresholds. (F) By calculating the average OOB error when a feature is omitted from the construction of a tree, we can generate a global “variable importance” metric that reflects decrease in accuracy of the random forest classifier in the absence of the specific feature.
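
The three mechanics named in this caption (a √p feature subset per split, bootstrap sampling with an out-of-bag remainder, and averaging of tree votes) can be sketched in a few lines of Python. This is an illustrative sketch only, not the randomForest R package used in the study; the function names are hypothetical, and rounding √p down is an assumption of the sketch.

```python
import math
import random
from collections import Counter


def features_per_split(total_features: int) -> int:
    """Number of features offered at each split: sqrt(p), rounded down (assumed)."""
    return max(1, math.isqrt(total_features))


def bootstrap_indices(n_subjects: int, rng: random.Random) -> tuple[list[int], list[int]]:
    """Draw n subjects with replacement; subjects never drawn form the out-of-bag set."""
    in_bag = [rng.randrange(n_subjects) for _ in range(n_subjects)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_subjects) if i not in drawn]
    return in_bag, out_of_bag


def forest_vote(tree_predictions: list[str]) -> dict[str, float]:
    """Fraction of trees voting for each class."""
    counts = Counter(tree_predictions)
    return {label: n / len(tree_predictions) for label, n in counts.items()}


# The 4-tree example from the caption: 3 trees say RRMS, 1 says progressive MS.
votes = forest_vote(["progressive MS", "RRMS", "RRMS", "RRMS"])
print(votes["RRMS"])  # 0.75
```

Each tree is fit only on its in-bag subjects, so its predictions for the out-of-bag subjects give an error estimate that does not reuse training data.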

MS marginally improved (AUROC = 0.92, 95% CI = 0.86–0.98), and remained unchanged for SPMS versus PPMS test (see Fig 3).

Although biomarkers are secreted by diverse cells under physiological or pathological states, many biomarkers are biologically related; for example, they physically interact or belong to the same network. However, random forests consider biomarkers only sequentially within any given tree. For related biomarkers, such as receptor–ligand pairs, the pathogenic process may depend more on their stoichiometry than on their respective concentrations. Therefore, we hypothesized that considering


TABLE. Demographic Data

                            Diagnosis
Characteristic              A, HD       B, NIND     C, OIND     D, RRMS     E, PPMS     F, SPMS

Modeling cohort, n = 225
No., F/M                    9/16        29/10(a)    17/24       26/14       17/23       23/17
Age, yr
  Average                   40.5        47.9        46.9        39.6        51.7        53.6
  SD                        11.7        10.8        15.8        9.2         9.6         10.7
  Differing categories(b)   E,F         D           NS          B,E,F       A,D         A,D
  Range                     22.4–58.4   18.2–70.6   15.4–74.2   18.0–59.9   28.2–70.4   27.4–69.6
Disease duration, yr
  Average                   NA          5.4         4.2         4.4         10.6        20.4
  SD                        NA          7.6         3.9         6.0         7.5         10.7
  Differing categories(b)   NA          E,F         E,F         E,F         B,C,D,F     B,C,D,E
  Range                     NA          0.2–34.5    0.5–13.1    0.0–20.7    0.5–30.3    1.5–42.4
EDSS
  Average                   0.5         2.0         3.7         1.6         5.2         5.8
  SD                        0.5         2.1         2.8         1.3         1.7         1.4
  Differing categories(b)   B,C,D,E,F   A,E,F       A,D,F       A,C,E,F     A,B,D       A,B,C,D
  Range                     0.0–1.5     0.0–6.5     0.0–9.0     0.0–6.0     2.0–8.5     2.0–8.0
SNRS
  Average                   97.4        90.4        76.6        92.1        67.4        60.1
  SD                        3.1         13.3        19.6        8.9         15.6        15.9
  Differing categories(b)   B,C,D,E,F   A,E,F       A,F         A,D,F       A,B,D       A,B,C,D
  Range                     87–100      51–100      42–100      65–100      24–94       29–87
SDMT
  Average                   51.2        49.8        45.3        53.2        40.6        37.6
  SD                        11.5        14.3        14.7        10.5        12.5        12.4
  Differing categories(b)   E,F         E,F         NS          E,F         A,B,D       A,B,D
  Range                     32–71       19–80       26–77       32–76       4–58        12–59

Validation cohort, n = 85
No., F/M                    5/5         12/2        6/8         10/6        9/7         11/4
Age, yr
  Average                   38.8        46.3        45.9        38.3        55.5        54.0
  SD                        28.9        13.3        15.0        9.6         7.4         7.2
  Differing categories(b)   E,F         NA          NA          E,F         A,D         A,D
  Range                     28.9–56.2   21.0–66.0   22.1–71.0   27.9–56.1   36.0–64.5   38.4–66.0
Disease duration, yr
  Average                   NA          5.6         4.1         2.6         10.0        25.0
  SD                        NA          5.3         4.1         5.1         5.9         7.7
  Differing categories(b)   NA          F           E,F         E,F         C,D,F       B,C,D,E
  Range                     NA          0.4–14.9    0.4–14.9    0.1–20.1    1.6–24.2    9.5–38.4
EDSS
  Average                   0.3         3.8         2.6         1.7         5.0         6.3
  SD                        0.5         2.2         2.4         0.8         1.6         0.5
  Differing categories(b)   B,D,E,F     A,F         F           A,E,F       A,D         A,B,C,D,E
  Range                     0.0–1.0     1.0–6.5     0.0–6.5     1.0–4.0     1.5–6.5     5.0–7.0
SNRS
  Average                   98.6        79.6        82.8        93.8        68.5        57.5
  SD                        2.0         13.3        17.4        4.4         12.0        7.1
  Differing categories(b)   B,D,E,F     A,F         F           A,E,F       A,D,F       A,B,C,D,E
  Range                     95–100      57–98       57–100      80–98       50–92       49–75
SDMT
  Average                   49.1        47.4        44.7        52.1        43.4        37.4
  SD                        14.4        10.3        11.9        9.7         11.0        10.4
  Differing categories(b)   NA          NA          NA          F           NA          D
  Range                     32–69       31–62       28–65       36–70       28–68       20–51

Demographic data are given for 6 diagnostic categories represented by columns A–F. Age, disease duration, EDSS, SNRS, and SDMT are shown as averages with SD and range. For each of the 5 continuous demographic variables (age, EDSS, SNRS, SDMT, and disease duration), one-way ANOVA was used to compare the variable means across the 6 diagnosis groups: HD, NIND, OIND, PPMS, RRMS, SPMS (for disease duration without HD). Based on Levene test for homogeneity of variances, ANOVA with equal or unequal variances was applied. Multiple comparisons between all pairwise means were performed using Tukey’s method. Normality was evaluated by Shapiro–Wilk test based on the residuals. Natural log transformation was applied to disease duration. SAS version 9.4 was used for the statistical analyses.
(a) Fisher exact test shows statistically significant difference in gender (p = 0.005) that disappears when the NIND group is excluded (p = 0.0738).
(b) Letters A–F identify diagnostic categories that show statistically significant difference with adjusted p < 0.05.
ANOVA = analysis of variance; EDSS = Expanded Disability Status Scale; F = female; HD = healthy donor; M = male; NA = not applicable; NIND = noninflammatory neurological disorder; NS = not statistically significant; OIND = other inflammatory neurological disorder; PPMS = primary progressive multiple sclerosis; RRMS = relapsing–remitting multiple sclerosis; SD = standard deviation; SDMT = Symbol Digit Modality Test; SNRS = Scripps Neurological Rating Scale; SPMS = secondary progressive multiple sclerosis.

related biomarkers simultaneously, for example as ratios, will add discriminatory value. Mathematically, this corresponds to broadening the biomarker-based random forests from partitioning the predictor space based on absolute concentrations of individual markers (represented by blue dotted perpendicular lines in Fig 2C) to considering predictors built from the relative proportion of biomarkers (represented by diagonal orange line in Fig 2C).

Consequently, we used the 500 high-SNR SOMAmers to generate 124,750 biomarker ratios. To build random forests only from ratios with the highest clinical utility, we combined AUROC with SNR values in the modeling (n = 225) cohort, selecting logical cutoffs described in Subjects and Methods that allowed for enough diversity to capture the biological processes while limiting the dimension of the search space. This led to 5,401 retained ratios for MS versus non-MS, 3,626 retained ratios for progressive versus relapsing MS, and 1,504 retained ratios for SPMS versus PPMS. Using these sets of ratios strongly enhanced the performance of random forest models distinguishing MS from non-MS and RRMS from progressive MS (validated AUROC = 0.95, 95% CI = 0.91–0.99 and AUROC = 0.88, 95% CI = 0.76–1.00, respectively). However, the performance of SPMS versus PPMS diagnostic test remained low (AUROC = 0.45, 95% CI = 0.24–0.67).

A clinical test has to fulfill technical requirements of reproducible measurement across different laboratories, often achieved by using standard curves; this makes measuring several hundreds of proteins prohibitive.
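
The count of 124,750 ratios is the number of unordered pairs among 500 biomarkers (500 × 499 / 2). The following Python sketch, which is not the authors' R code, shows one way such pairwise ratios could be formed on the natural-log scale and screened against a pair of AUROC and SNR cutoffs. The biomarker names, the placeholder values, and both function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_log_ratios(log_rfu: dict[str, float]) -> dict[tuple[str, str], float]:
    """Natural-log ratio of every unordered biomarker pair: log(a / b) = log(a) - log(b)."""
    return {
        (a, b): log_rfu[a] - log_rfu[b]
        for a, b in combinations(sorted(log_rfu), 2)
    }


def passes_cutoffs(auroc: float, snr: float, auroc_cutoff: float, snr_cutoff: float) -> bool:
    """Keep a ratio only if it exceeds both the AUROC and the SNR cutoff."""
    return auroc > auroc_cutoff and snr > snr_cutoff


# 500 hypothetical biomarker names, each with a placeholder log-scale value.
example = {f"marker_{i:03d}": math.log(100.0 + i) for i in range(500)}
ratios = pairwise_log_ratios(example)
print(len(ratios))  # 124750, i.e. 500 * 499 / 2
```

Working on the log scale turns each ratio into a simple difference, which is consistent with the natural log transformation applied before classifier construction. For the MS versus non-MS comparison, `passes_cutoffs(auroc, snr, 0.65, 0.75)` would apply the cutoffs quoted in Subjects and Methods.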


FIGURE 3: Schematic diagram of the SOMAscan analysis leading to molecular diagnostic tools. The SOMAscan assay comprises 1,128 SOMAmers (solid black line curves). Calculation of technical and biological signal-to-noise ratio (SNR) reduced the number of SOMAmers considered for further analysis to 500 (dashed red line curves). Using the 500 high-signaling SOMAmers, 124,750 biomarker ratios were generated that were subsequently tested for their SNR and their ability to differentiate 2 diagnostic groups (based on area under the receiver operating characteristics curve [AUROC] in the modeling cohort), resulting in 5,401 high-signaling biomarker ratios for multiple sclerosis (MS) versus non-MS diagnostic test, 3,626 biomarker ratios for progressive versus relapsing–remitting MS (RRMS) diagnostic test, and 1,504 biomarker ratios for secondary progressive MS (SPMS) versus primary progressive MS (PPMS) diagnostic test (green dotted line curves). Out-of-bag AUROC estimates (bottom graphs) examined from different random forests generated by sequentially adding ratios with the highest variable importance led to a logical cutoff (marked by solid red lines) of 22 SOMAmer ratios for MS versus non-MS diagnostic comparison, 21 SOMAmer ratios for RRMS versus progressive MS diagnostic comparison, and 33 SOMAmer ratios for SPMS versus PPMS (blue dash–dot line curves). Restriction of SOMAmer ratios to the most important ones resulted in validated AUROC = 0.98 (MS vs non-MS, 95% confidence interval [CI] = 0.94–1.00), AUROC = 0.91 (RRMS vs progressive MS, 95% CI = 0.80–1.00), and AUROC = 0.58 (SPMS vs PPMS, 95% CI = 0.37–0.79). *Based on observations in modeling dataset only. ER = error rate.
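The cutoff on the out-of-bag AUROC curves was chosen by inspecting where the estimate appeared to stabilize. One hypothetical way to make that choice explicit is sketched below; the function `stabilization_point`, its `tolerance` parameter, and the toy curve are all assumptions for illustration and do not reproduce the study's data or procedure.

```python
def stabilization_point(auroc_by_size, tolerance=0.005):
    """Return the smallest model size after which no larger model
    improves the AUROC by more than `tolerance`.

    `auroc_by_size` maps number of ratios -> out-of-bag AUROC estimate.
    """
    if not auroc_by_size:
        raise ValueError("need at least one (size, AUROC) pair")
    sizes = sorted(auroc_by_size)
    for i, k in enumerate(sizes):
        later = [auroc_by_size[s] for s in sizes[i + 1:]]
        # The largest size always qualifies because nothing follows it.
        if not later or max(later) - auroc_by_size[k] <= tolerance:
            return k


# Synthetic curve (made-up values, not taken from the study)
toy_curve = {5: 0.80, 10: 0.88, 15: 0.93, 20: 0.950, 25: 0.952, 30: 0.951}
cutoff = stabilization_point(toy_curve)
```

A tolerance-based rule like this trades a small amount of apparent accuracy for a simpler model, which matches the stated goal of high predictive ability at lower complexity.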

November 2017 803


ANNALS of Neurology

Therefore, we sought to identify the smallest number of biomarkers that can be assembled into the random forest classifiers without a significant loss of accuracy. To achieve this, we examined the OOB AUROC estimates from random forests generated from the modeling (n = 225) cohort by sequentially adding ratios with the highest variable importance (see Fig 2D). The point of inflection where the OOB AUROC appeared to stabilize was selected to achieve models with high predictive ability at a lower complexity (see Fig 3). Interestingly, the reduction of the number of ratios further improved the performance of the classifiers in the validation cohort; the 22 most important ratios in the MS versus non-MS classifier led to validated AUROC = 0.98 (95% CI = 0.94–1.00), and the 21 most important ratios distinguished RRMS from progressive MS with AUROC = 0.91 (95% CI = 0.80–1.00; see Fig 3). The 33 most important ratios distinguishing SPMS from PPMS led to a classifier with performance comparable to random guessing (AUROC = 0.58, 95% CI = 0.37–0.79), and therefore this model was abandoned from further analyses.
In all 3 classifiers, variable importance measures were dominated by ratios in models that also included single SOMAmers (data not shown), validating our mathematical and biological foundation of the ratio-based hypothesis.
The clinical properties of the validated models are summarized in Figure 4. When using 50% cutoff to convert continuous probabilities into dichotomous classifiers, the MS molecular diagnostic test shows 87.2% sensitivity (95% CI = 77.7–96.8%) and 94.7% specificity (95% CI = 87.6–100.0%) with a diagnostic odds ratio of 123.0. The progressive MS classifier differentiates RRMS from progressive MS with 93.5% sensitivity (95% CI = 84.9–100.0%) and 81.3% specificity (95% CI = 62.1–100.0%), reaching a diagnostic odds ratio of 62.8.

Deconvolution of Biomarkers’ Cell of Origin
To investigate whether statistical learning preferentially selected biomarkers with restricted cellular origin into clinically useful tests, we used in vitro modeling on human primary immune and CNS cells (see Subjects and Methods), complemented with data from the public domain, such as the RNA sequencing database of human CNS cells24,25 and the Human Protein Atlas,26 to assess cellular origin of the biomarkers in the optimized random forest classifiers (Figs 5 and 6). The results of SOMAscan analysis of supernatants from human CNS cell lines and from freshly sorted human peripheral immune cells are depicted in Supplementary Table 4. We analyzed cell cultured media in resting state and upon activation; CNS cells/cell lines were exposed to supernatants from lipopolysaccharide- and CD3/CD28 beads–stimulated human PBMCs to mimic inflammatory conditions, and immune cells were activated by PMA/Ionomycin to achieve robust activation of all immune cells. Supernatants were analyzed by SOMAscan assay at time 0 and after 24-hour incubation/stimulation. Results are shown as the stimulation index, a ratio of RFUs at 24 hours and time 0 for each condition for 48 SOMAmers that form the 2 diagnostic classifiers. These results are the basis for the biological interpretation of the diagnostic classifiers.

Biological Interpretation of MS versus non-MS Diagnostic Test
The 22 most important ratios of the MS diagnostic test were dominated by immune cell–specific biomarkers (see Fig 5). Twenty-one of these contain plasma cell-specific biomarkers TNFRSF17 (BCMA) or IGG. The main insight from the MS diagnostic classifier is that in MS, activation of humoral immunity represented by plasma cells and plasmablasts is out of proportion to activation of all other cellular components of innate or adaptive immunity. Combining cellular origin and known biological functions, the biomarkers constituting the MS diagnostic test can be divided into 9 subgroups, where plasma cell activation/levels is compared to overall intrathecal inflammation (group 1a; PDCD1LG2, SLAMF6, CD48, CSF3, CXCL13, TNFRSF4) and amount of activation of myeloid lineage (groups 1b and 1c; PLA2G7, CCL7, TLR4:LY96 complex, PRTN3). MS patients also show higher plasma cell activation/levels in comparison to vascular injury and/or ongoing CNS stress (groups 1d, 2a, and 2c; FLT4, CDKN1B, TNFRSF6B, DSG2, CRK, PGK1, MAPK14, F9, DCTPP1). A ratio of IgG and IgM (group 2b) points to differences in immunoglobulin subtypes between MS and non-MS subjects. Higher plasma cell immunoglobulin secretion in comparison to levels of CNS injury and higher plasma cell activation in comparison to astrocyte activation (group 2d; TNC) was also observed in MS. Lastly, MS subjects have increased epithelial stress compared to activation of neutrophils (group 3; MMP7, PRTN3).

Biological Interpretation of Progressive MS versus RRMS Diagnostic Test
The 21 top ratios distinguish RRMS from progressive MS (see Fig 6). Seven were ratios of EDA2R or EDAR with markers released predominantly by CNS cells, especially neurons and oligodendrocytes: STX1A, EPHA5, JAM3, NTRK3, RGMA, BOC, and UNC5 (group 1a); all of these ratios demonstrate relative loss of CNS-specific markers in progressive MS. Moreover, 3 ratios show


FIGURE 4: STARD (Standards for Reporting Diagnostic accuracy studies) diagrams and confusion matrices reporting the flow of
subjects used for validation of the molecular diagnostic test. (A) STARD diagram and (C) confusion matrix for 85 subjects used
for validation of the multiple sclerosis (MS) molecular diagnostic test and (B) STARD diagram and (D) confusion matrix for 47
subjects used for validation of the progressive MS molecular diagnostic test. cerebrospinal fluid (CSF); relapsing–remitting MS
(RRMS).
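The diagnostic odds ratios reported for the validated models follow from sensitivity and specificity. The sketch below applies the standard definition to the rounded percentages given in the text; the function name is an assumption, and because the inputs are rounded, the results come out close to, but not exactly at, the published 123.0 and 62.8 (which were presumably computed from the unrounded confusion-matrix counts).

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = [sens / (1 - sens)] / [(1 - spec) / spec]."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    positive_odds = sensitivity / (1.0 - sensitivity)
    false_positive_odds = (1.0 - specificity) / specificity
    return positive_odds / false_positive_odds


# Rounded values quoted in the text for the two validated classifiers
dor_ms = diagnostic_odds_ratio(0.872, 0.947)           # MS vs non-MS
dor_progressive = diagnostic_odds_ratio(0.935, 0.813)  # RRMS vs progressive MS
```

Because the odds ratio is very sensitive to small changes near 100% sensitivity or specificity, a difference of about one unit from rounding alone is expected.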

proportional loss of neuronal and oligodendroglial markers in relation to ICOSLG, expressed on activated antigen-presenting cells, in progressive MS (group 1b; EPHA5, JAM3, TYRO3). Similarly, the ratio INHBA/JAM3 measures relative loss of an oligodendroglial marker in comparison to a marker secreted predominantly by myeloid cells during wound healing and tissue remodeling (group 1c). Progressive MS patients also show increased epithelial stress in comparison to overall intrathecal inflammation (group 1d; SELL, EDAR). Another group of biomarker ratios points out enhanced alternative pathways of complement activation in comparison to overall intrathecal inflammation (groups 2a and 2b; CFD, GZMA, SELL, SERPING1, IL22). Finally, a group of ratios relates to dysregulation of LTA-LTB and IL22 pathways, which play an important role in the formation of tertiary lymphoid follicles in progressive MS (groups 2c, 2d, 2e, and 3; LILRB2, SELL, LTA-LTB, IL10, PRTN3, ETHE1, GP6, CLEC1B) and also in platelet aggregation (group 3).

Discussion
Exploring recent advances in proteomics, we asked whether CSF biomarkers can reliably measure intrathecal processes and thus facilitate diagnosis, drug development, and clinical management of patients with complex CNS
diseases. We hypothesized and confirmed that intraindividually stable CSF biomarkers with restricted cellular origin are over-represented in clinically useful classifiers. Collected data provide proof-of-principle evidence that molecular diagnosis of polygenic CNS diseases is feasible with current technologies.
In contrast to internal medicine disciplines that utilize molecular biomarkers,27 contemporary diagnostic process and therapeutic decisions for polygenic neurological diseases are based on clinical findings and structural imaging, both of which lack molecular specificity. This may contribute to a high misclassification rate in neurodegenerative diseases against pathology.1,2 Finding that 1 of the first 3 “MS” subjects who succumbed to natalizumab-induced progressive multifocal leukoencephalopathy demonstrated no pathological evidence of MS28 suggests that a >20% misdiagnosis rate may also be applicable to MS, despite advances provided by imaging. We observed approximately 10% discrepancy between clinical- and biomarker-based MS diagnosis (Fig 7A); the absence of pathological evidence prevents determining which classification is correct. The majority of SOMAscan-misclassified “MS” patients lacked defining biological features of MS: intrathecal activation of plasma cells and adaptive immunity, validated by alternative assays (see Fig 7B). Therefore, patients misclassified by molecular classifiers either exhibited a noninflammatory form of MS, observed by pathologists at frequencies analogous to our MS misclassification rate,29 or had alternative conditions. Regardless of what we call such ailments, these patients lack targets of immunomodulatory DMTs and are unlikely to reap their benefit. Thus, providing therapeutically relevant information represents the first advantage of molecular diagnosis. The second advantage is reporting diagnostic probabilities as a continuous variable that captures the strength of biological evidence, in comparison to a dichotomous clinical diagnosis. Indeterminate results close to 50% probability (which represented 88% of subjects with discrepant clinical and molecular diagnosis; see Fig 7A) should be repeated, ideally after cessation of the acute process that prompted diagnostic testing.
Diagnostic specificity is of high clinical importance, because false-positive results expose subjects to potential harms of unnecessary therapies. Because the MS diagnostic classifier is antigen-nonspecific, dysregulated immunity targeting non-MS antigens may be misclassified as MS if it elicits qualitatively similar intrathecal inflammation. We observed that 2 OIND patients (1 with CTLA-4 haploinsufficiency and another with chronic aseptic meningitis) were misclassified as MS. It is plausible that analogously to mutations shared among different cancers, conditions with pathogenic mechanisms similar to MS may respond to MS treatments. This is the third advantage of molecular taxonomy, as it could promote CNS therapeutics from disease-specific monotherapies to process-specific therapies, where treatments are shared among pathophysiologically related conditions and rationally assembled into patient-specific polypharmacy regimens.
Our study has the following limitations. SOMAscan represents a selection of proteins that are not specifically targeted to the CNS. This drawback, however, also proves that molecular signatures of distinct diseases are sufficiently robust that sampling 1% of the relevant proteome can reliably differentiate among them. Our observation that unbiased statistical learning selected virtually all available SOMAmers with restricted CNS cellular origin suggests that deliberate broadening of the sampled proteome to more CNS-relevant biomarkers has a potential to improve classifications, and expand understanding of disease mechanisms. Also, SOMAscan is a discovery platform, routinely optimized and expanded, and therefore lacking standards of clinical applications. We dealt with this problem by embedding many technical replicates that allowed normalization between different assay runs. However, even after normalizing and focusing on biomarkers with high SNR, the interassay variability (measured by technical replicates) decreased the performance of the classifiers. Furthermore, during

FIGURE 5: Multiple sclerosis (MS) versus non-MS molecular diagnostic test. A parallel coordinate plot (PCP) is shown for the 22 most important features that distinguish MS from non-MS. The plot displays individual patients from combined modeling (n = 225) and validation cohort (n = 85) divided into MS group (relapsing–remitting MS [RRMS], primary progressive MS [PPMS], secondary progressive MS [SPMS]; thin red lines) and non-MS group (healthy donor [HD], noninflammatory neurological disorder [NIND], other inflammatory neurological disorder [OIND]; thin blue lines). A group average is shown as a thick yellow line for PPMS, thick red line for RRMS, thick orange line for SPMS, thick purple line for HD, thick green line for NIND, and thick blue line for OIND. The y-axis shows SOMAmer natural log ratios scaled to 0–1 range. SOMAmer ratios were grouped based on the cellular origin and known functions of the individual components into 9 groups. Different cell types are shown above the PCP to highlight the cell origin of individual SOMAmers. MS patients show higher plasma cell/plasmablast activation/levels compared to overall intrathecal inflammation (group 1a), myeloid lineage (groups 1b and 1c), epithelial damage (group 1d), central nervous system (CNS) destruction and epithelial injury (group 2a), differences in immunoglobulin subtypes (group 2b), CNS and endothelial damage (group 2c), astrocyte activation (group 2d), and higher epithelial injury compared to neutrophil activation (group 3). *Ratios in the classifier are inverted. BBB = blood–brain barrier; IT = intrathecal.
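The y-axis transformation in the parallel coordinate plots can be sketched as follows. The caption states only that natural log ratios were scaled to a 0–1 range; min-max scaling across subjects is assumed here, and the function name and toy values are hypothetical.

```python
from math import log


def scaled_log_ratios(numerators, denominators):
    """Natural log of each subject's marker ratio, min-max scaled to [0, 1].

    Min-max scaling is an assumption; the source says only "scaled to 0-1 range".
    """
    logs = [log(n / d) for n, d in zip(numerators, denominators)]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        # All subjects share the same ratio; avoid division by zero.
        return [0.0 for _ in logs]
    return [(x - lo) / (hi - lo) for x in logs]


# Three toy subjects with made-up intensities (not study data)
scaled = scaled_log_ratios([100.0, 400.0, 200.0], [100.0, 100.0, 100.0])
```

Scaling each ratio separately puts all 22 axes on a common range, which is what makes the group averages comparable across the plot.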



FIGURE 6: Progressive versus relapsing–remitting MS (RRMS) molecular diagnostic test. A parallel coordinate plot (PCP) is shown for the 21 most important variables that distinguish progressive multiple sclerosis (MS) from relapsing MS. The plot displays individual MS patients from combined modeling (n = 120) and validation cohort (n = 47) divided into progressive MS (PMS) group (primary progressive MS [PPMS] and secondary progressive MS [SPMS]; thin blue lines) and relapsing MS group (relapsing–remitting MS [RRMS]; thin red lines). A group average is shown as thick purple line for PPMS, thick blue line for SPMS, and thick red line for RRMS. The y-axis shows SOMAmer natural log ratios scaled to 0–1 range. SOMAmer ratios were grouped based on the cellular origin and known functions of the individual components into 10 groups. Different cell types are shown above the PCP to highlight the cell origin of individual SOMAmers. Progressive MS patients show increased loss of neuronal, oligodendroglial, astrocytic, and neuroprotective markers (groups 1a and 1b), proportional loss of oligodendroglial marker compared to myeloid lineage and epithelial marker (group 1c), increased epithelial injury in comparison to overall immune activation (1d), enhanced complement activation (groups 2a and 2b), dysregulation of pathways linked to formation of tertiary lymphoid follicles (groups 2c, 2d, 2e, 3), and platelet aggregation (group 3). *Ratios in the classifier are inverted. BBB = blood–brain barrier; CNS = central nervous system; NK = natural killer.

FIGURE 7: Comparison of the performance of clinical and molecular diagnostic tests. (A) The multiple sclerosis (MS) diagnostic
probability (on the y-axis) of 85 subjects from the validation cohort is shown in the graph. Blue circles represent subjects with origi-
nal non-MS diagnosis (healthy donors, other inflammatory neurological disorders [OINDs], noninflammatory neurological disorders),
and orange circles represent subjects with original MS diagnosis (relapsing–remitting MS [RRMS], primary progressive MS, second-
ary progressive MS). The red line represents an arbitrary cutoff at 50%. The pink background marks an area between 30% and
70% where the certainty of the molecular classification is weak (contains 22.4% of the validation cohort’s subjects). The orange
background highlights 70.0% of the validation cohort’s MS subjects with highly probable MS molecular diagnosis (>70%), and the
blue background labels 86.8% of the validation cohort’s non-MS subjects with high probability of non-MS molecular diagnosis
(<30%). The gray bars represent a frequency distribution bar chart with the bin size of 5%. (B) Misdiagnosed subjects (pink circles)
were evaluated for non-SOMAlogic biomarkers of inflammation—IgG index, BCMA, sCD27, and CHI3L1—using alternative assays
(for details on methodology, see Komori et al34). The group medians are shown for MS subjects as an orange line and for non-MS
subjects as a blue line. The 7 MS subjects who were classified as non-MS by the molecular diagnostic test showed a noninflamma-
tory type of disease, whereas the 2 non-MS (OIND) subjects who were categorized as MS according to the SOMAlogic MS molecu-
lar classifier showed significant levels of inflammatory markers, overlapping with MS. (C) Comparison of IgG index data (left) and
molecular MS diagnostic probability (right) in the combined modeling and validation cohort shows distributions of non-MS (blue
circles) and MS subjects (orange circles). The black dotted lines on the left show the lower and upper limits of normal IgG index and,
on the right, the 50% cutoff for the MS molecular diagnostic test. (D) Separation of RRMS (green circles) and progressive MS (PMS; purple
circles) subjects into 2 age categories (<45 years, left side and >45 years, right side) shows that age does not affect performance
of the progressive MS classifier. The black dotted line shows the 50% cutoff for the progressive MS diagnostic test.
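The three certainty bands in panel A can be expressed as a simple rule on the diagnostic probability. This is a minimal sketch that assumes probabilities on a 0–1 scale; the function name and the labels are illustrative, and the treatment of values exactly at 30% or 70% as indeterminate is an assumption, since the caption does not say which band includes the boundaries.

```python
def certainty_zone(probability, low=0.30, high=0.70):
    """Map an MS diagnostic probability (0-1 scale) to one of three bands."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability > high:
        return "probable MS"
    if probability < low:
        return "probable non-MS"
    # Boundary values (exactly 0.30 or 0.70) are treated as indeterminate here.
    return "indeterminate"


zones = [certainty_zone(p) for p in (0.85, 0.10, 0.50, 0.70)]
```

Reporting the band alongside the dichotomous call at 50% preserves the information that a result near the cutoff carries weak evidence and may warrant repeat testing.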

the submission and review of this article, Somalogic updated SOMAscan from an assay that quantifies 1,128 proteins to an assay that uses different buffers and dilutions, and quantifies 1,300 proteins. We subsequently used the original validation cohort to test whether random forests constructed from the selected set of
biomarker ratios depicted in Figures 5 and 6 can still reliably differentiate MS from other diseases and RRMS from progressive MS using the new version of SOMAscan. We observed OOB AUROC of 0.89 (95% CI = 0.82–0.97) for MS versus non-MS classifier and OOB AUROC of 0.84 (95% CI = 0.66–1.00) for RRMS versus progressive MS classifier. This shows that our results are not assay-dependent, but reflect true biological processes. Nevertheless, biomarker-based precision neurology cannot be achieved without the biotech industry, which needs to develop fully quantitative, CSF-targeted assays that conform to technical requirements of clinical tests. To facilitate this, we considered “assay economy,” as an optimum between assay cost (dependent on the number of proteins that need absolute and relative quantification) and accuracy. Biomarker ratios simplify assay commercialization, by limiting the need to run standard curves for every analyte and providing internal normalization that avoids false-positive results caused by, for example, high protein levels. However, absolute quantification of at least the dominant biomarker partners, such as BCMA and EDA2R, will likely be necessary for quality assurance for clinical applications.
The creative use of SNR screened out biomarkers of low clinical value; this improved the efficacy of the random forest algorithm, which searched for optimal biomarkers in the lower dimensional search space. Although such use of external data and domain-expert knowledge is encouraged in statistical learning, as it typically improves performance (as it did in our case), we acknowledge that this methodology leads to some arbitrariness in the selection of markers, and thus it may not work in other settings or for other cohorts. To assure adequate representation of all MS subtypes and both inflammatory and noninflammatory MS mimics in the classifier construction, we designed this study to have approximately equal representation from all patient groups. This may not be representative of the rates in the real population of patients, where, for example, RRMS/SPMS patients are much more frequently encountered than PPMS subjects. Because SNR is dependent on the composition of the training cohort, we acknowledge that different compositions of the training cohort may lead to selection of different biomarkers, potentially better or worse for separating certain disease states. This behavior is inherent to any statistical learning process and, therefore, sampling and population structure must be considered carefully in the study design. We view our population selection as appropriate for the stated goals, because in addition to HDs, controls included subjects with varied inflammatory and noninflammatory CNS diseases, who presented for the diagnostic workup of MS or related neuroimmunological diseases and who must be differentiated from all 3 MS subtypes.
Supporting the notion that CSF biomarkers can expand understanding of CNS diseases, the following knowledge was gained from the current study. The essential difference between MS and its mimics is selective expansion/activation of B-cell/plasma cell lineages, out of proportion to the activation of other immune cells and to the resultant injury/stress of CNS-resident cells. An ancillary pathway that helps to diagnose MS is linked to a marker of tissue remodeling and repair, MMP7. These features are shared by all MS subtypes, indicating that PPMS is not a pathophysiologically distinct “non-inflammatory” entity,30 but rather an equivalent disease stage to SPMS. This conclusion is supported by the observed inability to validate a molecular classifier that differentiates PPMS from SPMS with accuracy higher than random guessing and by therapeutic response of PPMS to immunomodulation by ocrelizumab.31
The dominance of plasma cell biomarkers in the molecular classifier poses a question of its value against current CSF tests such as IgG index and oligoclonal bands (OCBs). We have included IgG index and MS classifier prediction rates to Figure 7C to demonstrate the superiority of the classifier. Similar data were obtained for OCBs; in the cohort of patients with available OCB data, the sensitivity of the OCB test (93.9%, 95% CI = 90.3–97.6%) was comparable to the sensitivity of molecular classifier (96.4%, 95% CI = 93.5–99.2%), and the specificity of the OCB test (80.0%, 95% CI = 72.8–87.2%) was highly outperformed by the specificity of the molecular classifier (98.3%, 95% CI = 96.0–100.0%).
Statistical learning also enhanced our understanding of progressive MS by demonstrating that PPMS and SPMS are biologically indistinguishable. These data argue for merging PPMS and SPMS cohorts in future drug development and clinical considerations. Features that differentiate progressive MS from RRMS are greater CNS tissue destruction, including more widespread endothelial/epithelial cell stress, and reactive gliosis with increased permeability of CNS barriers and greater activation of innate immunity. In addition to proportional loss of oligodendroglial and neuronal biomarkers that likely reflect injury or loss of their cells of origin, there are also immunological differences between RRMS and progressive MS. These relate to innate immunity (complement, myeloid lineage, and antigen presentation), and to pathways involved in the formation of tertiary lymphoid follicles, such as lymphotoxin complex and IL22. This is consistent with the pathological evidence of tertiary lymphoid follicles in progressive MS32 and with a recent report that the level of
compartmentalization of immune responses to the CNS can differentiate RRMS from 2 progressive MS subtypes.6 Finally, it is intriguing that 4 of 21 ratios that differentiate progressive MS from RRMS (ie, containing SERPING1 and CFD, which is essential in alternative complement activation by cleaving C3) are linked to “neurotoxic reactive astrocytes,” recently shown to mediate neuronal death in MS and other neurodegenerative diseases.33
One may ask to what degree the identified MS progression-specific processes reflect aging. Reanalyzing probabilities of progressive MS in patients younger and older than 45 years demonstrated that the molecular classifier correctly differentiates RRMS from progressive MS irrespective of age (see Fig 7D). Thus, the biological interpretation of MS classifiers offers the following unifying hypothesis for future longitudinal studies: although aberrant activation of B-/plasma cell lineage is essential for development of MS, the complex response of CNS tissue, exemplified by microglial activation, toxic astrogliosis, and endothelial/epithelial stress, determines the extent and irreversibility of demyelination and neuronal death, which underlie progressive accumulation of disability in MS.
Although longitudinal data represented only a small part of the current study, they were instrumental for selecting high SNR biomarkers, which improved the accuracy of the molecular classifiers. They also demonstrated the ability of CSF biomarkers to measure broad biological effects of applied therapies in the intrathecal compartment. Expanding CSF biomarker studies to longitudinal cohorts could identify molecular signatures that forecast therapeutic efficacy, as well as biological synergisms among different treatments. Longitudinal cohorts are also required to determine the extent and stability of pathogenic heterogeneity.29 Implementation of CSF biomarkers in phase I/II trials can guide dose and patient selection, and eliminate unpromising agents without accruing excessive costs and sequestering large numbers of available patients.34 Such biomarker-supported trials thus offer the promise to propel CSF biomarker–based precision medicine into neurology practice. Although the presented results make these prospects realistic, they cannot be achieved without broader, visionary investment of efforts and resources to exploit the full potential of CSF biomarkers in neurology.

Acknowledgment
This study was supported by the intramural research program of the NIH National Institute of Neurological Disorders and Stroke (NINDS) and a material transfer agreement between NINDS and Medimmune (a member of the AstraZeneca Group), which partially funded the SOMAscan assays. P.K. received fellowship support from the Myelin Research Foundation, and M.K. received postdoctoral fellowship support from the Japan Society of the Promotion of Science.
We thank Dr P.-O. Courad (Institute Cochin, INSERM, France) for providing the HCMEC/D3 cell line; E. Romm for processing of CSF samples; clinicians P. Williamson, A. Panackal, A. Wichman, J. Cherup, I. Cortese, J. Ohayon, K. Fenton, C. Toro, D. Landis, A. Vanderver, E. Wells, C. Pardo, and L. Krupp and research nurse J. Dwyer for expert patient care; regulatory nurse R. Hayden for help with regulatory paperwork; schedulers A. Mayfield and K. Pumphrey for patient scheduling; D. Maric for help with cell sorting; and finally the patients, their caregivers, and healthy volunteers, without whom this work would not be possible.

Author Contributions
Study concept and design: B.B.; data acquisition and analysis: all authors; drafting the manuscript and figures: C.B., P.K., M.G., and B.B.

Potential Conflicts of Interest
B.B., P.K., M.K., C.B., and M.G. are coinventors of U.S. patent application number 62/038,530: Biomarkers for Diagnosis and Management of Neuro-immunological Diseases, which pertains to the results of this paper. B.B., P.K., and M.K. have assigned their patent rights to the U.S. Department of Health and Human Services. C.B. and M.G. assigned their patent rights to Montana State University.

References
1. Koga S, Aoki N, Uitti RJ, et al. When DLB, PD, and PSP masquerade as MSA: an autopsy study of 134 patients. Neurology 2015;85:404–412.
2. Rizzo G, Copetti M, Arcuti S, et al. Accuracy of clinical diagnosis of Parkinson disease: a systematic review and meta-analysis. Neurology 2016;86:566–576.
3. Kola I, Landis J. Can the pharmaceutical industry reduce attrition rates? Nat Rev Drug Discov 2004;3:711–715.
4. Johanson CE, Duncan JA III, Klinge PM, et al. Multiplicity of cerebrospinal fluid functions: new challenges in health and disease. Cerebrospinal Fluid Res 2008;5:10.
5. Vlassenko AG, McCue L, Jasielec MS, et al. Imaging and cerebrospinal fluid biomarkers in early preclinical Alzheimer disease. Ann Neurol 2016;80:379–387.
6. Komori M, Blake A, Greenwood M, et al. Cerebrospinal fluid markers reveal intrathecal inflammation in progressive multiple sclerosis. Ann Neurol 2015;78:3–20.
7. Polman CH, Reingold SC, Banwell B, et al. Diagnostic criteria for multiple sclerosis: 2010 revisions to the McDonald criteria. Ann Neurol 2011;69:292–302.
8. Bielekova B, Komori M, Xu Q, et al. Cerebrospinal fluid IL-12p40, CXCL13 and IL-8 as a combinatorial biomarker of active intrathecal inflammation. PLoS One 2012;7:e48370.


9. Rohloff JC, Gelinas AD, Jarvis TC, et al. Nucleic acid ligands with protein-like side chains: modified aptamers and their use as diagnostic and therapeutic agents. Mol Ther Nucleic Acids 2014;3:e201.
10. Gold L, Walker JJ, Wilcox SK, Williams S. Advances in human proteomics at high scale with the SOMAscan proteomics platform. N Biotechnol 2012;29:543–549.
11. Kraemer S, Vaught JD, Bock C, et al. From SOMAmer-based biomarker discovery to diagnostic and clinical applications: a SOMAmer-based, streamlined multiplex proteomic assay. PLoS One 2011;6:e26332.
12. Gold L, Ayers D, Bertino J, et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS One 2010;5:e15004.
13. Weksler BB, Subileau EA, Perriere N, et al. Blood-brain barrier-specific properties of a human adult brain endothelial cell line. FASEB J 2005;19:1872–1874.
14. Douvaras P, Fossati V. Generation and isolation of oligodendrocyte progenitor cells from human pluripotent stem cells. Nat Protoc 2015;10:1143–1154.
15. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2016.
16. Pinheiro J, Bates D, DebRoy S, et al. nlme: Linear and nonlinear mixed effects models. R package version 3.1-128. 2016. http://CRAN.R-project.org/package=nlme Access date 2016-05-04.
17. Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011;12:77.
18. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. New York, NY: Springer; 2009.
19. Liaw A, Wiener M. Classification and regression by randomForest. R News 2002;2:18–22.
20. Breiman L. Random forests. Mach Learn 2001;45:5–32.
21. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat 2001;29:1189–1232.
23. Ioannidis JP. A roadmap for successful applications of clinical proteomics. Proteomics Clin Appl 2011;5:241–247.
24. Zhang Y, Chen K, Sloan SA, et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J Neurosci 2014;34:11929–11947.
25. Hawrylycz MJ, Lein ES, Guillozet-Bongaarts AL, et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 2012;489:391–399.
26. Uhlen M, Fagerberg L, Hallstrom BM, et al. Proteomics. Tissue-based map of the human proteome. Science 2015;347:1260419.
27. Morgan P, Van Der Graaf PH, Arrowsmith J, et al. Can the flow of medicines be improved? Fundamental pharmacokinetic and pharmacological principles toward improving phase II survival. Drug Discov Today 2012;17:419–424.
28. Kleinschmidt-DeMasters BK, Tyler KL. Progressive multifocal leukoencephalopathy complicating treatment with natalizumab and interferon beta-1a for multiple sclerosis. N Engl J Med 2005;353:369–374.
29. Lucchinetti C, Bruck W, Parisi J, et al. Heterogeneity of multiple sclerosis lesions: implications for the pathogenesis of demyelination. Ann Neurol 2000;47:707–717.
30. Stys PK, Zamponi GW, van Minnen J, Geurts JJ. Will the real multiple sclerosis please stand up? Nat Rev Neurosci 2012;13:507–514.
31. Montalban X, Hauser SL, Kappos L, et al. Ocrelizumab versus placebo in primary progressive multiple sclerosis. N Engl J Med 2017;376:209–220.
32. Magliozzi R, Howell O, Vora A, et al. Meningeal B-cell follicles in secondary progressive multiple sclerosis associate with early onset of disease and severe cortical pathology. Brain 2007;130(pt 4):1089–1104.
33. Liddelow SA, Guttenplan KA, Clarke LE, et al. Neurotoxic reactive astrocytes are induced by activated microglia. Nature 2017;541:481–487.

22. Bielekova B, Vodovotz Y, An G, Hallenbeck J. How implementa- 34. Komori M, Lin YC, Cortese I, et al. Insufficient disease inhibition
tion of systems biology into clinical trials accelerates understand- by intrathecal rituximab in progressive multiple sclerosis. Ann Clin
ing of diseases. Front Neurol 2014;5:102. Transl Neurol 2016;3:166–179.

