
Artificial Intelligence In Medicine 103 (2020) 101807

Contents lists available at ScienceDirect

Artificial Intelligence In Medicine


journal homepage: www.elsevier.com/locate/artmed

Prognostic factors of Rapid symptoms progression in patients with newly diagnosed Parkinson’s disease
Kostas M. Tsiouris (a,b), Spiros Konitsiotis (c), Dimitrios D. Koutsouris (a), Dimitrios I. Fotiadis (b,d,*)
(a) Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, GR15773, Athens, Greece
(b) Unit of Medical Technology and Intelligent Information Systems, Dept. of Material Science and Engineering, University of Ioannina, GR45110, Ioannina, Greece
(c) Dept. of Neurology, Medical School, University of Ioannina, GR45110, Ioannina, Greece
(d) Dept. of Biomedical Research, Institute of Molecular Biology and Biotechnology, FORTH, GR45110, Ioannina, Greece

ARTICLE INFO

Keywords: Parkinson’s disease; Rapid progression; Prognostic factors; Machine learning

ABSTRACT

Tracking symptoms progression in the early stages of Parkinson’s disease (PD) is a laborious endeavor as the disease can be expressed with vastly different phenotypes, forcing clinicians to follow a multi-parametric approach in patient evaluation, looking for not only motor symptomatology but also non-motor complications, including cognitive decline, sleep problems and mood disturbances. Being neurodegenerative in nature, PD is expected to inflict a continuous degradation in patients’ condition over time. The rate of symptoms progression, however, is found to be even more chaotic than the vastly different phenotypes that can be expressed in the initial stages of PD. In this work, an analysis of baseline PD characteristics is performed using machine learning techniques, to identify prognostic factors for early rapid progression of PD symptoms. Using open data from the Parkinson’s Progression Markers Initiative (PPMI) study, an extensive set of baseline patient evaluation outcomes is examined to isolate determinants of rapid progression within the first two and four years of PD. The rate of symptoms progression is estimated by tracking the change of the Movement Disorder Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) total score over the corresponding follow-up period. Patients are ranked according to their progression rates and those who expressed the highest rates of MDS-UPDRS total score increase per year of follow-up period are assigned into the rapid progression class, using 5- and 10-quantiles partition. Classification performance against the rapid progression class was evaluated in a per quantile partition analysis scheme and in a quantile-independent approach, respectively. The results showed a more accurate patient discrimination with quantile partitioning; however, a much more compact subset of baseline factors is extracted in the latter, making it more suitable for actual interventions in practice. Classification accuracy improved in all cases when using the longer 4-year follow-up period to estimate PD progression, suggesting that a prolonged patient evaluation can provide better outcomes in identifying the rapid progression phenotype. Non-motor symptoms are found to be the main determinants of rapid symptoms progression in both follow-up periods, with autonomic dysfunction, mood impairment, anxiety, REM sleep behavior disorders, cognitive decline and memory impairment being alarming signs at baseline evaluation, along with rigidity symptoms, certain laboratory blood test results and genetic mutations.

1. Introduction

Parkinson’s disease (PD) is the second most common neurological disorder affecting an estimated 0.3 % of the global population [1]. Population-based studies have shown that the prevalence of PD increases significantly with age as disease diagnosis is rare in individuals below 50 years old, while the number of patients per 100,000 people doubles for each age decade above 60 years old, with males being slightly more prone. Some geographical variations have also been found with the prevalence of PD being highest in South America and significantly lower in Asia with prevalence in populations from Europe, North America and Australia in between [2]. Although the neurophysiology of molecular pathways and pathological consequences of the disease have been studied at neuron-level scale as well as wider brain connectivity networks [3–5], the exact causes leading to PD remain elusive. On a genetic level, the list of gene mutations associated with PD


Corresponding author at: Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University Campus of
Ioannina, GR 45110, Ioannina, Greece.
E-mail address: fotiadis@cc.uoi.gr (D.I. Fotiadis).

https://doi.org/10.1016/j.artmed.2020.101807
Received 26 February 2019; Received in revised form 7 January 2020; Accepted 13 January 2020
0933-3657/ © 2020 Elsevier B.V. All rights reserved.

Downloaded for Samuel Escares (siescares@uc.cl) at Pontifical Catholic University of Chile from ClinicalKey.com by Elsevier on March 20, 2020.
For personal use only. No other uses without permission. Copyright ©2020. Elsevier Inc. All rights reserved.
K.M. Tsiouris, et al. Artificial Intelligence In Medicine 103 (2020) 101807

continues to grow in size suggesting the existence of a partially inherent etiology [6,7], while environmental and lifestyle conditions have also been suggested as potential risk factors for PD diagnosis [8].

PD is predominantly characterized as a movement disorder due to its motor clinical symptomatology, leading to disabilities as motor symptoms spread from the initial onset location and progress in severity. Motor symptoms are caused by the brain’s inability to adjust the levels of dopamine production and release, a neurotransmitter that is crucial for communication between neurons and for movement control, due to the impairment of dopaminergic neurons [9]. However, the non-motor symptomatology of the disease has equally important complications in patient’s condition, with recent studies showing that PD patients have increased risk of experiencing cognitive decline, dementia, sleep disorders, depression and psychosis [10–12]. This great variation in PD symptoms expression can be seen in almost every aspect of patient’s condition, from distinctive tremor-dominant and gait disturbances-dominant motor symptoms patient groups [13], to cognitive impairment versus non-impaired and presence of sleep disorders versus non-sleep disturbances groups in the non-motor symptoms front [14]. To make matters worse, previous studies have shown that these differences in PD phenotype can significantly affect the overall disease progression rate [15,16].

A phenotype that is also caused by the heterogeneous nature of PD and is not so often studied is that of atypical rapid disease progression. In general, progression of PD can vary significantly from patient to patient, with differences in motor symptoms progression in particular being among the most notable in clinical environment. However, since there is no official guideline regarding what should be considered as (atypically) rapid progression of PD [17], the term is used rather vaguely and usually describes qualitative changes in patient’s condition when reaching certain milestones of the typical PD progression pathway prematurely. Leaving definition difficulties aside, early rapid progression risk stratification is an important unmet need in the initial post-diagnosis PD patient evaluation, as it can have a positive impact in multiple levels by: (a) alerting physicians to follow a more precise disease management plan, as these patients would require increased attention to cope with the aggressive phenotype of their condition, (b) improving the screening process for inclusion of PD patients in clinical studies, in order to evaluate their clinical outcomes under more challenging disease conditions. For example, the expected impact of a new treatment can be better validated in a medication-placebo group trial if the enlisted patients have a natural predisposition in developing a severe symptomatology faster, in time, for the relatively limited duration of the study.

In research on prognostic factors, most of the previous studies relied on univariate and, to a lesser extent, multivariate statistical analysis in their search for baseline determinants of rapid progression in PD [18–21]. Data mining and machine learning algorithms are viable alternatives to statistical analysis, as they can perform multi-parametric evaluations of a wide range of features and symptoms to discover hidden dependencies among them, offering a significant advantage over univariate statistical analysis. However, machine learning has not gained much attention in this field and only recently such methods are starting to emerge in PD progression analysis, partially because of increasing data availability, as big cohort studies such as PPMI are now providing access to large-scale, longitudinal data with clinical evaluations over several years [22]. In the era prior to such initiatives open datasets were scarce, limiting independent research, and even most private cohorts consisted of about a couple of hundred patients [23], which restricted the use of machine learning techniques, as data volume is key to increase the robustness of the analysis and its outcomes.

In this study, a machine learning-based methodology is proposed to isolate baseline features that are indicative of early rapid progression in PD. The change in the total score of the Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS) [24] is used to track progression of PD in a cohort of newly diagnosed patients from the PPMI dataset. Patients are split into groups according to their progression rate using quantiles partition, with the ones experiencing the fastest progression within the follow-up period considered as the rapid progression class. An extensive search is then performed in the available feature space, using every baseline evaluation outcome from the PPMI dataset (i.e. clinical, imaging and laboratory examinations), to identify the most informative baseline features in relation to symptoms progression. To the best of our knowledge, this is the first time that such an extensive baseline feature vector is used in a machine learning-based analysis to detect potential prognostic factors of rapid PD symptoms progression and imminent functional decline in newly diagnosed patients. In addition, prognostic factors are evaluated in two different timelines, assessing PD progression at the 2- and 4-year follow-up interval independently.

2. Markers of disease progression

With official guidelines lacking, one of the most debatable aspects in the evaluation of PD progression is the core definition of the disease progression marker itself: How do we accurately track and measure progression in PD? According to the literature, there is no definitive answer due to the complexity of the disease and its wide range of phenotypes with vastly different symptomatology. A wide variety of potential disease progression markers have been studied in the past, including blood, serum and urine analysis, cerebrospinal fluid (CSF) analysis, brain imaging (i.e. MRI, SPECT, DaTscans), clinical outcomes and neurophysiological biomarkers. However, a systematic review of such studies by McGhee et al. concluded that they were generally of poor quality, with small numbers of participants, excessive inclusion/exclusion criteria and potentially flawed methodologies with simplistic statistical analysis being conducted, even in longitudinal ones. Based on their search, the authors suggested that currently there is not sufficient evidence and cross-study agreement in the literature to recommend the use of any biomarker to effectively track progression of PD [25].

Using clinical evaluations, PD progression can be estimated in a qualitative or quantitative manner. A qualitative approach relies on tracking the time required for each patient to reach major milestones in the disease progression pathway in search for atypically early onset. For example, changing stages in the Hoehn and Yahr scale [26] has been proposed as a viable way to track such important milestones to progression of PD symptoms and disability in [27], while PD patients have been considered as rapidly progressed when a lower than expected time interval is required to reach stages 3 and above in the scale, as these stages declare severe functional disability [28]. A downside of this approach is that the scale is mostly dependent on symptom locality to track PD progression as they expand from lateral to bilateral and axial and does not consider symptom severity per stage. In addition, qualitative analyses are rather a condition-based triggered assignment of PD patients as rapidly progressed cases and do not provide a way to quantify or measure the rate of progression over any time interval (e.g. progression per month/year/visit).

Therefore, to obtain a more accurate estimation of PD progression rate, quantitative progression analyses have been proposed instead. In quantitative approaches, the MDS-UPDRS is the most frequently used scale to evaluate progression of functional decline and disability due to PD symptoms, as it is currently considered the gold standard for motor symptoms evaluation and their effects in patient’s daily living in both motor and non-motor experiences [29–31]. Despite being predominantly skewed towards motor symptoms evaluation, the first parts of the scale evaluate a wide variety of non-motor aspects of PD, including mental cognition decline, psychosis, depression, sleep problems and daytime sleepiness, etc., making the MDS-UPDRS the most multi-parametric PD evaluation scale. The non-motor symptoms evaluation parts of MDS-UPDRS may lack the level of detail found on other scales targeting non-motor aspects specifically (e.g. MoCA, RBD, SCOPA, STAI) to be able to provide an equally comprehensive non-motor


Fig. 1. Schematic representation of the proposed methodology for the identification of prognostic factors for rapid PD progression at baseline evaluation.

symptoms assessment, but the total MDS-UPDRS score was able to outperform these scales and other biological markers, by showing lower variance in discriminating disease severity in the early stages of PD in a recent study estimating evolution trajectories using data from the PPMI dataset [21]. In addition, the motor evaluation part of the scale was found to have a strong correlation with neuronal loss in the substantia nigra, with neuronal density decreasing linearly with every point of increased total MDS-UPDRS part III score [32]. Thus, despite having a far from perfect balance between motor and non-motor symptoms, MDS-UPDRS is nonetheless as good as it gets when considering the range of PD symptoms that are covered in a single evaluation scale.

A potential drawback of using the MDS-UPDRS scale for quantitative measuring of PD progression is that it does not account for the effects of initiating symptomatic therapy [33]. Observed symptoms are expected to reduce in severity, interfering with an accurate representation of the true rate of PD progression. A solution is to estimate a function that describes the bounded benefit of symptomatic treatment and adjust the rater’s scoring values. However, as noted in a recent review of such models with medication-adjusted estimations of MDS-UPDRS scores, these models were empirically developed, lacking mechanistic approaches to consider systems biology and pharmacology modeling [34]. Thus, despite being promising, there is not enough evidence and clinical trials to validate these findings. Regardless, in our case study this is less of an issue since patients who continue to experience a severe decline in their observed symptomatology despite medication treatment constitute the targeted PD population of our analysis. Considering the high standards in patient management under the supervision of the PPMI study protocols, patients ranking continuously higher MDS-UPDRS scores during the study can be confidently considered as cases of rapid PD progression. Furthermore, to minimize the confounding effects of dopaminergic medication, the analysis of the motor MDS-UPDRS part III in this study is restricted to scores recorded only in the OFF state, with a minimum of 6 h between the last dose and the MDS-UPDRS evaluation.

3. Materials and methods

3.1. PPMI dataset

The Parkinson’s Progression Markers Initiative (PPMI) is a prospective observational, multi-center study with 34 clinical partners around the world, aiming to develop the largest collection of open data, providing longitudinal evaluations of PD patients with clinical evaluations, imaging results and biologic specimens. The PPMI cohort aimed to promote independent research in identifying potential markers of PD progression, in order to expand our understanding regarding the evolution of PD symptoms in the early stages of the disease. The study is still ongoing, collecting data according to the initial follow-up visits schedule, with plans to extend patient evaluation with annual visits until 2023, in an attempt to offer long-term evaluation outcomes with observational data from subjects monitored for 5–13 years (depending on the cohort). In this study we focus on the De Novo cohort as it is the only one tracking the progression of PD in newly diagnosed and untreated patients. This cohort includes 423 patients (i.e. 65.5 % males and 34.5 % females), with a mean age of 61.6 years (i.e. 34–85 years old) at baseline evaluation, with an average total UPDRS score of 32.35 ± 13.14 points. Resting tremor was present at diagnosis in 331 patients (i.e. 78.25 %), rigidity in 320 patients (i.e. 75.65 %), bradykinesia in 348 patients (i.e. 82.27 %) and postural instability in 29 patients (i.e. 6.86 %). Patient monitoring included four follow-up visits in the first year after enrolment (i.e. in the 3rd, 6th, 9th and 12th month), followed by two visits per year at six-month intervals for every year of follow-up afterwards.

3.2. Rapid PD progression and early functional decline

An outline of the proposed methodology for identifying prognostic factors of rapid symptoms progression in patients with PD using machine learning algorithms is shown in Fig. 1. The proposed methodology consists of four primary modules: (a) extraction of the baseline features for each patient, (b) estimation of the progression rate based on the change of the total MDS-UPDRS score, (c) patient segregation in quantiles according to progression rate (i.e. 5-quantiles and 10-quantiles, respectively) with the patients in the top quantile reporting the most rapid progression, and (d) an extensive search in the baseline feature space to isolate the features that can most accurately discriminate the rapid progression class of patients from the rest, who have reported slower progression rates. Each module is presented in more detail in the following subsections.

3.2.1. Baseline feature extraction

A wide range of baseline features are included in this study, by gathering every patient evaluation outcome that is available in the Clinical Data files of the PPMI dataset, covering every aspect of PD with motor and non-motor clinical symptoms evaluation, brain imaging results, blood/serum/CSF laboratory results, genetics, general neurological and physical examination, etc. The extracted feature vector includes in total 601 baseline feature values for every patient in the De Novo cohort. The complete list of the baseline features that are used in our analysis is shown in Table 1. If a baseline feature is missing from a


Table 1
The complete list of the baseline features that are available in the PPMI clinical dataset.

Group of features | Evaluation scales/Clinical examinations/Imaging/Laboratory results at baseline | Number of features
Neurological Exams | Cranial Nerves | 9
Neurological Exams | General neurological exam | 18
Physical Exams | General physical exam | 11
Physical Exams | Vital signs | 10
General patient information | Demographics | 14
General patient information | Family history | 25
General patient information | Socio-Economics | 2
MRI/DatScan – SPECT Imaging | DatScan – SPECT results | 5
MRI/DatScan – SPECT Imaging | Brain Region | 4
MRI/DatScan – SPECT Imaging | MRI results | 5
Lab Exams | Blood Chemistry and Hematology | 44
Lab Exams | Biomarker analysis | 55
Lab Exams | Biospecimen analysis results | 4
Motor symptoms | MDS-UPDRS part II | 14
Motor symptoms | MDS-UPDRS part III | 37
Motor symptoms | MDS-UPDRS total score | 1
Motor symptoms | Modified Schwab and England ADL | 1
Motor symptoms | PASE Household Activity | 9
Genetics | SNCA multiplication and Genetic Risk Score | 2
Genetics | NeuroX Genotyping Selected SNPs | 33
Non-motor symptoms | MDS-UPDRS part I | 15
Non-motor symptoms | MDS-UPDRS part I total score | 1
Non-motor symptoms | Benton Judgment of Line Orientation | 35
Non-motor symptoms | Cognitive Categorization | 5
Non-motor symptoms | Epworth Sleepiness Scale, ESS | 10
Non-motor symptoms | Geriatric Depression Scale (short), GDS | 17
Non-motor symptoms | Hopkins Verbal Learning Test, HVLT | 14
Non-motor symptoms | Letter-Number Sequencing, LNS | 24
Non-motor symptoms | Montreal Cognitive Assessment, MoCA | 35
Non-motor symptoms | Questionnaire for Impulsive-Compulsive disorders, QUIP | 14
Non-motor symptoms | REM Sleep Behavior Disorder Questionnaire, RBDSQ | 23
Non-motor symptoms | Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms, SCOPA-AUT | 39
Non-motor symptoms | Lexical and Semantic Fluency | 6
Non-motor symptoms | State-Trait Anxiety Inventory, STAI | 43
Non-motor symptoms | Symbol Digit Modalities, SDM | 5
Non-motor symptoms | University of Pennsylvania Smell ID Test, UPSIT | 5
PD features | Symptoms present at diagnosis | 7
Total | | 601
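
When the feature vector summarized in Table 1 is assembled, a value that is missing at baseline is taken from the screening visit if one exists there. A minimal sketch of that fallback rule follows, assuming each visit is held as a plain dictionary with `None` marking a missing value; the function name and the feature keys are hypothetical and are not PPMI field names.

```python
def baseline_with_screening_fallback(baseline, screening):
    """Merge two visit records, preferring the baseline value.

    A feature that is absent or None at baseline is filled from the
    screening visit when available; otherwise it stays None (no imputation).
    """
    merged = {}
    for feature in baseline.keys() | screening.keys():
        value = baseline.get(feature)
        if value is None:
            # Fall back to the screening visit (about one month earlier).
            value = screening.get(feature)
        merged[feature] = value
    return merged


if __name__ == "__main__":
    # Hypothetical feature keys, for illustration only.
    baseline = {"moca_total": 27, "upsit_total": None}
    screening = {"upsit_total": 22, "hvlt_recall": 9}
    print(baseline_with_screening_fallback(baseline, screening))
```

Features recorded only at screening (such as the illustrative `hvlt_recall` above) are carried over unchanged, and anything missing at both visits is left missing.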


patient’s record, the respective screening value for that particular feature is obtained instead (if available) in order to minimize missing values, as the two visits are about one month apart. In addition, some evaluations were solely performed during the screening visit, so their corresponding features are recovered from the screening visit instead of the baseline as well.

3.2.2. PD progression rate estimation

Progression of PD is estimated using the total score of the MDS-UPDRS. However, since the patients in the De Novo PPMI cohort were newly diagnosed and had to be medication-free in order to be eligible for participation in the study, the vast majority did not develop motor fluctuations and therefore part IV was not assessed and, thus, these values are not reported in the respective files. Consequently, part IV of the MDS-UPDRS is not used in the estimation of the total score for each visit and the progression rate is estimated using the respective values of parts I-III (i.e. Part I: Non-Motor Aspects of Experiences of Daily Living, Part II: Motor Aspects of Experiences of Daily Living and Part III: Motor Examination). As mentioned in Section 2, if there are both “ON” and “OFF” evaluations of the MDS-UPDRS motor examination part III, the results from the evaluation at “OFF” state are used in the estimation of the total score, to minimize the effect of medication on the observed symptoms severity. For clarity, we denote the total MDS-UPDRS score used in this study as total MDS-UPDRS parts I-III score in the rest of the manuscript.

The total MDS-UPDRS parts I-III score is extracted from the PPMI records of each patient for the baseline and the subsequent follow-up evaluations up to four years later (i.e. months 3, 6, 9, 12, 18, 24, 30, 36, 42 and 48). Then, depending on the duration of the follow-up evaluation period (i.e. 2 or 4 years), a 2-D data points matrix is formed containing the discrete visit time intervals and the respective value of the total MDS-UPDRS parts I-III score for each visit. The baseline score is also included as the starting point in the estimation of the progression rate, resulting in 7 and 11 discrete evaluation visits for the 2 and 4 year follow-up periods, respectively. A simple polynomial curve fitting model is then used to extract a function that best describes the progression of total MDS-UPDRS parts I-III score per visit with a linear polynomial, as follows:

\[
y = ax + b. \tag{1}
\]

The curve fitting model proposes a solution for $a$, $b$ that best fits the 2-D data matrix by minimizing the sum of squared deviation (least-squares fitting) using a recursive analysis:

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix},
\tag{2}
\]

where $a$ is the slope, $b$ is the y-intercept, $x_{1,2,\ldots,n}$ is the time interval of baseline and follow-up visits in months, and $y_{1,2,\ldots,n}$ is the total MDS-UPDRS parts I-III score per visit.

In order to estimate their PD progression rate, patients need to have a total MDS-UPDRS parts I-III score available for their baseline or screening visit, the last visit of the follow-up period (i.e. at month 24 or 48 for the 2-year and 4-year follow-up, respectively) and at least one evaluation score per year in between the baseline and the final visit. Since no imputation for missing values is performed, patients lacking this minimum number of MDS-UPDRS evaluation scores were excluded, reducing the number of available patients to 358 for the 2-year follow-up period and 212 for the 4-year follow-up period, respectively. The more simplistic linear progression fitting model was selected to cope with the restricted time resolution in follow-up visits and the presence of missing values, as it can provide an accurate-enough estimation of the overall trend of PD progression for each patient, and not to indicate that a linear progression of PD symptoms is actually expressed in real life. Positive linear slope values declare symptoms worsening, negative values declare a mitigation of symptoms (e.g. due to treatment initiation) and close to zero slope values denote a steady, overall, course of symptomatology over the follow-up period, respectively.

3.2.3. Rapid progression class

Once the progression rate has been extracted for every patient in each follow-up period, it is possible to sort and split patients into subgroups according to their estimated progression values. The quantiles partition approach is used to separate patients into groups of equal size, moving from slower progression rates at the lower end, to higher progression rates towards the top. The quantile-splitting is a viable alternative considering that there are no official guidelines or thresholds to signify rapid progression. Thus, in order to avoid using empirical thresholds, we opted to implement a 5 and 10-quantiles partition split of the patient population for each follow-up period of 2 and 4 years. As it is shown in Fig. 2, the upper quantile in both splits contains the patients who experienced the highest rates of symptoms progression and they are assigned into the rapid progression class (i.e. top 20 % of cases compared to top 10 %, respectively). The number of patients and the mean progression rate of each quantile is shown in Fig. 2 as well. Testing with both 5 and 10-quantiles partition will allow us to investigate the existence of a potential data-oriented threshold in defining rapid progression, should a notable difference in the discrimination accuracy between the two be observed.

An advantage of using quantile partition to split PD patients into subgroups of advancing progression rates is that it allows us to evaluate if there are different prognostic baseline features depending on the estimated progression rate trend, by comparing the rapid progression class with varying levels of slower progression rates. Thus, as it is shown in Fig. 2, the rapid progression class (i.e. “RP class”) is evaluated for significant differences in patients’ baseline features against each lower quantile separately, to obtain clinical information in the type of baseline factors depending on progression rate. In addition, the proposed methodology will be also evaluated in a quantile partition-independent classification scheme, with the rapid progression class being tested against all the remaining patients, who are assigned into a single lower progression class (i.e. green bar in Fig. 2) for this analysis, disregarding their respective variance in progression rate per quantile. It should be noted that each one of the 4 rapid progression classes across the respective evaluation splits shown in Fig. 2 is different, as each one consists of different patients. For example, from the 358 patients in the 2 year follow-up only 212 remain at 4 years, so not all patients who were considered as rapidly progressed at 2 years will also progress after 4 years. Thus, it is very likely that patients have switched classes compared to their 2 year evaluation or are completely missing due to early withdrawal. The rapid progression classes within the same follow-up period are also different due to using the 5- and 10-quantiles partition to obtain the top 20 % and 10 % of most rapidly advanced patients (i.e. the top 20 % consists of the same top 10 % and another 10 % of patients below them).

3.2.4. Baseline feature selection and classification

A preprocessing step is initially performed to remove features with many missing values, exceeding a predefined threshold of 0.3 in relation to the total number of instances. In addition, features containing a single unique category of values are also removed as they have no discriminative potential with respect to the classification outcome (e.g. all instances have “Male” as gender). Our goal is to perform a complete evaluation analysis for potential baseline prognostic factors and, hence, every baseline feature remaining after preprocessing is considered in the feature selection process, in order to identify the most informative baseline features. Feature selection allows for not only improved classification performance, but it also enhances the robustness of the classifier, since the number of available baseline features remaining is larger than the number of available instances (i.e. curse of


Fig. 2. Class partition scheme of the PPMI De Novo cohort per follow-up period, with and without quantile-specific evaluation. The values on the left of each quantile
denote the mean rate of change in MDS-UPDRS parts I-III score in points/year. N = total number of patients, n1-n10=number of patients per quantile.
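
As a rough illustration of how the quantile classes in Fig. 2 can be derived, the sketch below fits the linear model y = ax + b to one patient's visit scores by ordinary least squares, converts the slope from points/month to points/year, and places the fastest-progressing quantile in the rapid progression class. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names and patient identifiers are hypothetical, and the closed-form least-squares solution stands in for whatever curve-fitting routine was actually used.

```python
from statistics import mean


def fit_line(months, scores):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(months), mean(scores)
    sxx = sum((x - mx) ** 2 for x in months)
    sxy = sum((x - mx) * (y - my) for x, y in zip(months, scores))
    a = sxy / sxx          # slope, in score points per month
    b = my - a * mx        # y-intercept
    return a, b


def points_per_year(slope_per_month):
    """Convert a slope in points/month into points/year."""
    return 12.0 * slope_per_month


def assign_quantiles(rates, q):
    """Map each patient id to a quantile index 1..q (q = fastest progression).

    `rates` is a dict {patient_id: progression rate}. Patients are ranked
    from slowest to fastest and cut into q groups of (near-)equal size.
    """
    ranked = sorted(rates, key=rates.get)
    n = len(ranked)
    return {pid: (rank * q) // n + 1 for rank, pid in enumerate(ranked)}


def rapid_progression_class(rates, q):
    """Patients in the top quantile, i.e. the rapid progression (RP) class."""
    quantile = assign_quantiles(rates, q)
    return {pid for pid, k in quantile.items() if k == q}


if __name__ == "__main__":
    # Visit months of the 2-year follow-up, baseline included (7 visits).
    months = [0, 3, 6, 9, 12, 18, 24]
    scores = [30, 31, 33, 35, 36, 40, 42]   # made-up total parts I-III scores
    a, _ = fit_line(months, scores)
    print(round(points_per_year(a), 2), "points/year")
```

With the 5-quantile split the RP class holds the top 20 % of patients and with the 10-quantile split the top 10 %, so the second set is always contained in the first when both are computed from the same rates.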

dimensionality). To avoid such issues, we implemented a non-greedy, best-first feature selection algorithm which performs an exhaustive search in the entire feature space, to extract the most comprehensive and compact subset of features that can still provide high classification accuracy.

The feature selection process is schematically depicted in Fig. 3. We initially set a starting node consisting of two different features from the baseline feature vector, $F_i$ and $F_j$, with $i \in [1, 2, \ldots, N]$, $j \in [1, 2, \ldots, N]$, $i \neq j$ and $N$ denoting the total number of baseline features available. Using only the 2 features $[F_i, F_j]$ we evaluate their potential classification performance by fitting the selected classification algorithm on the training portion of the dataset. If the fit provides both a sensitivity and a specificity above 0.5 with respect to the rapid progression class, the feature selection process continues by adding more features to further improve classification performance; otherwise the starting node is discarded and another starting node, with a different set of features $[F_{i'}, F_{j'}]$ that has not been previously tested, is selected for evaluation instead.

Let us denote by $F_k$, with $k \in [1, 2, \ldots, N]$ and $k \neq i \neq j$, a third feature to be added to the initial subset of features $[F_i, F_j]$ of a starting node that provided sensitivity and specificity higher than 0.5. Using only the 3-feature subset $[F_i, F_j, F_k]$, the potential classification performance is evaluated again by fitting the classification algorithm on the training data. If the fit provides an increase in classification accuracy of at least 0.01 compared to $[F_i, F_j]$, the 3-feature subset is stored along with its classification performance; otherwise the 3-feature subset $[F_i, F_j, F_k]$ is discarded. Then, a different subset of 3 features is selected and evaluated by changing $F_k$, let us denote it as $[F_i, F_j, F_{k'}]$ with $k' \neq k \neq i \neq j$, in an iterative process, until every possible 3-feature combination has been evaluated (i.e. $N - 2$ iterations are required in total to evaluate all potential 3-feature subsets). In each iteration, a unique 3-feature subset is evaluated and it is either stored, if its classification accuracy improved on that of $[F_i, F_j]$ by at least 0.01, or discarded if no such improvement could be obtained, as described for the subset $[F_i, F_j, F_k]$ above.

Once all iterations are completed, the stored 3-feature subsets, which improved the classification accuracy of the starting node $[F_i, F_j]$, are further expanded into 4-feature subsets (i.e. $[F_i, F_j, F_k, F_l]$ with $l \in [1, 2, \ldots, N]$), by adding a different fourth baseline feature (i.e. $l \neq k \neq i \neq j$) to each stored 3-feature subset, following a similar iterative process as before. Each unique subset $[F_i, F_j, F_k, F_l]$ is evaluated using the classification algorithm and the training data and is either stored for further expansion into 5-feature subsets, if its classification accuracy improved on that of its preceding $[F_i, F_j, F_k]$ subset by at least 0.01, or discarded otherwise, as before. The repetitive process of adding extra features is eventually terminated when the

Fig. 3. Baseline feature selection and classification. Starting nodes of two features are expanded to their full potential during training. The best subset of features is
then used for the evaluation with the testing set.
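
The expansion of starting nodes depicted in Fig. 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `evaluate` is a hypothetical callable that fits the chosen classifier on the training data for a given tuple of feature indices and returns `(accuracy, sensitivity, specificity)`; starting nodes are enumerated as unordered pairs (the count in Eq. (3), N!/(N-2)!, equals N(N-1), which counts ordered pairs); and ties between equally accurate subsets are resolved only by preferring the smaller subset.

```python
from itertools import combinations


def expand_starting_node(pair, n_features, evaluate, min_gain=0.01):
    """Grow one 2-feature starting node level by level.

    Returns the stored (subset, accuracy) entries, or an empty list when the
    starting node fails the sensitivity/specificity > 0.5 check.
    """
    acc, sens, spec = evaluate(pair)
    if sens <= 0.5 or spec <= 0.5:
        return []                                  # starting node discarded
    stored = [(pair, acc)]
    frontier = [(pair, acc)]                       # subsets still to expand
    while frontier:
        next_frontier = []
        for subset, parent_acc in frontier:
            for f in range(n_features):
                if f in subset:
                    continue
                child = subset + (f,)
                child_acc = evaluate(child)[0]
                # keep the child only if it beats its parent by >= min_gain
                if child_acc - parent_acc >= min_gain:
                    stored.append((child, child_acc))
                    next_frontier.append((child, child_acc))
        frontier = next_frontier                   # empty list ends the loop
    return stored


def select_best(n_features, evaluate, min_gain=0.01):
    """Expand every starting node; return the most accurate stored subset,
    preferring fewer features when accuracies are identical."""
    stored = []
    for pair in combinations(range(n_features), 2):
        stored.extend(expand_starting_node(pair, n_features, evaluate, min_gain))
    if not stored:
        return None
    return max(stored, key=lambda item: (item[1], -len(item[0])))
```

Because every starting node is expanded regardless of how well earlier nodes performed, the cost grows quickly with the number of features, which is the price of the exhaustive search described in the text.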


Table 2
Classification performance results for the rapid progression class at the 2-year follow-up without considering quantile
partition.

MDS-UPDRS=Movement Disorder Society‐Unified Parkinson's Disease Rating Scale, STAI=State-Trait Anxiety Inventory,
MoCA=Montreal Cognitive Assessment, GDS=Geriatric Depression Scale, RBDSQ=REM Sleep Behavior Disorder
Questionnaire, ESS=Epworth Sleepiness Scale, SDM=Symbol Digit Modalities, LNS=Letter-Number Sequencing,
rsXXXXXX=Single Nucleotide Polymorphisms (SNPs).
*p ≤ 0.05 indicates significant prognostic value at baseline.

accuracy of the last (n)-sized feature subset can no longer improve by more than 0.01 in the following (n+1)-sized feature subsets. At this point the initial starting node $[F_i, F_j]$ has been evaluated to its full potential and a new starting node is set for evaluation, using a different set of features $[F_{i'}, F_{j'}]$ which has not been tested before. The total number of unique combinations of 2-feature starting nodes (SNs) that have to be fully expanded and evaluated depends on the size of the baseline feature vector and can be estimated as:

$$ \mathrm{SNs} = \frac{N!}{(N-2)!}, \tag{3} $$

where $N$ is the total number of available baseline features.

When the search within the entire feature space is completed and all possible SN combinations have been evaluated, the stored classification performance results from each n-feature subset are compared to find the subset that provided the best overall classification accuracy during training. This subset is selected for the final evaluation with the testing portion of the dataset. In case of identical classification performance between subsets with a different number of features, the subset with the smaller number of features is selected (e.g. a 4-feature subset is chosen over a 5-feature subset). Finally, in case of identical classification performance between subsets with the same feature count (e.g. two 4-feature subsets), the respective unique features are selected for testing.

Two different classifiers are selected as classification algorithms for evaluation, namely the Naïve Bayes and the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithms. Both algorithms are widely used in biomedical applications and medical data analysis since they offer easier interpretation, the first in terms of probability estimates demonstrating feature significance, the latter in terms of classification rules providing insights into decision making in natural language (e.g. “IF (MDS-UPDRS part III score ≥ 19) AND (age > 67) AND (…) THEN Class = Rapid progression”). The ease of interpretation of the extracted results is very important since the targeted group of users consists of non-machine learning experts, such as the medical personnel in this work. The proposed methodology was developed in Python 3.6 using the python-weka-wrapper3 package [35], which offers complete access to the WEKA API and its extensive toolkit of open libraries [36], directly from within Python.

4. Results

For a more complete evaluation, two different follow-up periods are tested, considering the first 2 and 4 years of follow-up visits independently, and two distinct rapid progression classes per follow-up period, consisting of the top 20 % and 10 % of the patients with the fastest progression, respectively. In addition, two classification schemes are evaluated: a) a quantile partition-independent analysis in Section 4.1, where the rapid progression class is compared against all the remaining patients, who are assigned to a single slower progression class, disregarding their individual progression rates, and b) a more detailed analysis between the rapid progression class and each corresponding partition of patients with slower progression rates as independent classes in Section 4.2, in order to investigate the connection between the various progression rates (i.e. from slower to faster) and the observed baseline clinical symptomatology, and whether different phenotypes are expressed. The classification performance is estimated with the RIPPER and Naïve Bayes algorithms. Statistically significant differences between the two classes were estimated for each selected baseline feature using one-way ANOVA or Mann-Whitney U tests (depending on criteria such as normal distribution and homogeneity of variances).

The classification performance results for all classification experiments below are estimated using 5-fold cross-validation to minimize the potential risk of overfitting due to the relatively low volume of samples compared to the number of baseline features. Classification accuracy is estimated as (TP + TN)/(TP + TN + FP + FN), sensitivity is estimated as TP/(TP + FN) and specificity as TN/(TN + FP), where:


Table 3
Classification performance results for the rapid progression class at the 4-year follow-up without considering quantile
partition.

MDS-UPDRS=Movement Disorder Society‐Unified Parkinson's Disease Rating Scale, STAI=State-Trait Anxiety Inventory,
SCOPA-AUT=Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms, GDS=Geriatric Depression Scale,
RBDSQ=REM Sleep Behavior Disorder Questionnaire, MoCA=Montreal Cognitive Assessment, LNS=Letter-Number
Sequencing, rsXXXXX=Single Nucleotide Polymorphisms (SNPs).
*p ≤ 0.05 indicates significant prognostic value at baseline.

• True Positives (TP) denote the patients correctly classified as rapidly progressed.
• True Negatives (TN) denote the patients correctly classified as slower progressed.
• False Positives (FP) denote the patients incorrectly classified as rapidly progressed.
• False Negatives (FN) denote the patients incorrectly classified as slower progressed.

4.1. Evaluation without quantile partition split

In this classification scheme, the top 20 % and 10 % of the PD patients with the most rapid progression are evaluated against all the remaining patient population in a 2-class classification output (i.e. rapid vs slower progression class). To overcome classification performance limitations due to the high class imbalance without excluding any cases from the evaluation, the patients from the slower progression class are randomly split into 4 equal subgroups for the classification with the top 20 % of patients and into 9 equal subgroups with the top 10 %, respectively. Each subgroup is independently evaluated against the rapid progression class and the mean value of all the respective performance metrics (i.e. accuracy, sensitivity and specificity) that are estimated per subgroup is reported.

The classification performance results for the 2-year follow-up period are presented in Table 2. Restricting the number of patients in the rapid progression class to the top 10 % of the most rapidly advanced cases resulted in higher classification performance, with the mean classification accuracy being 11.36 % higher using the RIPPER algorithm (i.e. 82.84 % compared to 71.48 %) and 11.53 % higher using the Naïve Bayes algorithm (i.e. 84.76 % compared to 73.23 %), respectively. The number of features selected per classification run with each random split (i.e. features selected per subgroup) is shown in Table 2 along with the total number of selected features from all subgroups. The results show a slight advantage of the Naïve Bayes algorithm over RIPPER (less than 2 %); however, it also required up to more than double the number of baseline features to achieve this, which is not an ideal trade-off. Thus, we opted to select the more compact feature subset of the RIPPER algorithm as the better outcome from this evaluation and include the selected features in Table 2 along with the corresponding p-values from statistical analysis. The exact baseline features that were selected using the Naïve Bayes algorithm and their respective p-values are provided in Supplementary Table 1.

The corresponding analysis results for the 4-year follow-up period are presented in Table 3. The increased duration of the follow-up period resulted in a 10.32 % and 5.84 % increase in accuracy (i.e. 81.80 % and 88.68 % compared to 71.48 % and 82.84 %) for the RIPPER algorithm compared to the 2-year follow-up period, with similar levels of increase for the Naïve Bayes algorithm (i.e. 82.38 % and 91.12 % compared to 73.23 % and 84.76 %). Thus, an advantage of tracking PD progression rates over longer periods can be observed based on these results. As in the 2-year follow-up, classification performance improved when only the top 10 % of the most rapidly advanced cases were assigned to the rapid progression class, with mean accuracy increasing from 81.80 % to 88.68 % and from 82.38 % to 91.12 % for the RIPPER and Naïve Bayes algorithms, respectively. As shown by the total number of features and the features selected per subgroup, feature selection with Naïve Bayes is again less optimal, with substantially more features required for only a slight increase in classification performance; thus, the RIPPER algorithm is preferred and its selected features are presented in Table 3 along with the corresponding p-values from statistical analysis. Due to size limitations, the baseline features selected using Naïve Bayes are provided in Supplementary Table 2 along with their respective p-values. The small number of features selected per evaluation subgroup in Tables 2 and 3 using RIPPER (i.e. 2–5 features) showcases the importance of feature selection in minimizing the effects of overfitting. Commonly selected features across the different classification experiments in Tables 2 and 3 are highlighted for convenience. Although perfect overlap between selected features is not expected considering


Table 4
Classification performance results for the rapid progression class per follow-up period, using 5 and 10-quantiles partition.
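
The accuracy, sensitivity and specificity values reported in the performance tables follow the confusion-matrix definitions given in Section 4, and in the quantile-independent scheme of Section 4.1 they are averaged over random subgroups of the slower progression class. The sketch below illustrates that bookkeeping only. The helper names `metrics`, `split_into_subgroups` and `mean_metrics` and the fixed random seed are assumptions for illustration, and subgroup sizes differ by at most one patient when the class size is not divisible by the number of subgroups.

```python
import random
from statistics import mean


def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts,
    with the rapid progression class taken as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def split_into_subgroups(patients, k, seed=0):
    """Randomly split the slower progression class into k subgroups
    (k = 4 against the top 20 %, k = 9 against the top 10 %)."""
    shuffled = list(patients)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def mean_metrics(per_subgroup):
    """Average each metric over the subgroup-wise evaluations."""
    return {name: mean(m[name] for m in per_subgroup)
            for name in per_subgroup[0]}
```

Each subgroup would then be classified against the full rapid progression class, and `mean_metrics` applied to the resulting list of per-subgroup dictionaries.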

Table 5
Baseline factors for rapid progression selected using RIPPER algorithm per follow-up period using 5-quantiles partition.
Rows: 5-quantiles partition, 100-80 % versus the listed quantile. Columns: baseline features selected at 2 years and at 4 years follow-up.

vs 80-60 %
2 years follow-up: MDS-UPDRS: Speech, rs11158026, STAI: Some unimportant thought is bothering, Blood test: Serum chloride, Number of full siblings, STAI: Worrying over possible misfortunes, GDS: Good spirits most of time, STAI: Have disturbing thoughts
4 years follow-up: MDS-UPDRS: Leg agility, rs11060180, Semantic Fluency: Animal score, STAI: Not feeling calm cool and collected, RBDSQ: Sudden limb movements, STAI: Not feeling steady, rs199347, SCOPA-AUT: Feel full very quickly during meal

vs 60-40 %
2 years follow-up: Blood test: Serum glucose, Semantic Fluency: Total number of vegetables, STAI: High state subscore, RBDSQ: Disturbed sleep, STAI: Some unimportant thought bothering
4 years follow-up: SCOPA-AUT: Been impotent, STAI: Feel strained, SCOPA-AUT: Trouble tolerating heat, Blood test: Urea nitrogen, rs11060180, LNS: Trial 4a, SCOPA-AUT: Pass urine again within 2 hours, Number of half siblings, Blood test: Eosinophils, SCOPA-AUT: Saliva dribbled out of mouth

vs 40-20 %
2 years follow-up: MDS-UPDRS: Anxious mood, Blood test: Total protein, MDS-UPDRS: Rest tremor amplitude upper extremity, Height (taller), Rigidity present at diagnosis, Brain region: Left putamen
4 years follow-up: MDS-UPDRS: Urinary problems, Benton Test: Item 9, RBDSQ: Aggressive/Action packed dreams, MDS-UPDRS: Toe tapping, HVLT: Immediate recall trial 2, SCOPA-AUT: Difficulty retaining urine, rs11158026, rs115462410, rs11724635, rs118117788, rs17649553, rs8192591

vs 20-0 %
2 years follow-up: Neurological Exam: Plantar reflex, MDS-UPDRS: Toe tapping, RBDSQ: Speaking in sleep, HVLT: Immediate recall trial 1, HVLT: Immediate recall trial 3, rs12637471, SCOPA-AUT: Saliva dribbled out of mouth, RBDSQ: Aggressive/Action packed dreams, rs118117788, RBDSQ: Hurt bed partner, Semantic Fluency: Animal score, Postural Instability present at diagnosis
4 years follow-up: MDS-UPDRS: Urinary problems, Neurological Exam: Plantar reflex, MDS-UPDRS: Toe tapping, RBDSQ: Disturbed sleep, SCOPA-AUT: Pass urine at night, RBDSQ: Sudden limb movements, STAI: Worry too much over something that really doesn’t matter, rs11158026, MoCA: Delayed recall red, LNS: Questions 1-7 score, rs14235

MDS-UPDRS=Movement Disorder Society‐Unified Parkinson's Disease Rating Scale, STAI=State-Trait Anxiety Inventory, SCOPA-AUT=Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms, GDS=Geriatric Depression Scale, RBDSQ=REM Sleep Behavior Disorder Questionnaire, HVLT=Hopkins Verbal Learning Test, MoCA=Montreal Cognitive Assessment, LNS=Letter-Number Sequencing, Benton Test=Benton Judgment of Line Orientation, rsXXXXXX=Single Nucleotide Polymorphisms (SNPs).
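
Features that recur across partitions can be recovered mechanically from the per-partition lists. The sketch below is an illustration only: it uses an excerpt of Table 5 (the first four 4-year features of the "vs 40-20 %" and "vs 20-0 %" rows), and the helper name `recurring_features` and the exact string matching of feature names are assumptions, so contextually similar items with different wording would not be matched.

```python
from collections import Counter

# Excerpt of Table 5, 4 years follow-up (first four features of each row).
selected = {
    "vs 40-20 %": [
        "MDS-UPDRS: Urinary problems",
        "Benton Test: Item 9",
        "RBDSQ: Aggressive/Action packed dreams",
        "MDS-UPDRS: Toe tapping",
    ],
    "vs 20-0 %": [
        "MDS-UPDRS: Urinary problems",
        "Neurological Exam: Plantar reflex",
        "MDS-UPDRS: Toe tapping",
        "RBDSQ: Disturbed sleep",
    ],
}


def recurring_features(selected_by_partition, min_count=2):
    """Return, in alphabetical order, the features selected in at least
    min_count partitions (each partition counts a feature once)."""
    counts = Counter(
        feature
        for features in selected_by_partition.values()
        for feature in set(features)
    )
    return sorted(f for f, c in counts.items() if c >= min_count)
```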


Table 6
Baseline factors for rapid progression selected using RIPPER algorithm per follow-up period using 10-quantiles partition.
Rows: 10-quantiles partition, 100-90 % versus the listed quantile. Columns: baseline features selected at 2 years and at 4 years follow-up.

vs 90-80 %
2 years follow-up: MDS-UPDRS: Hygiene, MoCA: Delayed recall subscore, RBDSQ: Parkinsonism, Benton Test: Item 19, STAI: Feel confused, rs12637471
4 years follow-up: MDS-UPDRS: Gait, STAI: Not being content, MDS-UPDRS: Total score, Number of full siblings, Side predominately affected at onset

vs 80-70 %
2 years follow-up: rs11060180, rs71628662, MoCA: Delayed recall subscore, MDS-UPDRS: Finger tapping, MDS-UPDRS: Constipation problems, Neurological Exam: Reflex arm
4 years follow-up: STAI: Some unimportant thought bothering, RBDSQ: Speaking in sleep, RBDSQ: Aggressive/Action packed dreams, Standing blood pressure systolic, Blood test: Basophils, RBDSQ: Vivid dreams, Height (taller), Benton Test: Item 27

vs 70-60 %
2 years follow-up: MDS-UPDRS: Pain and other sensations, Benton Test: Item 27, STAI: Feel rested, MoCA: Abstraction, MDS-UPDRS: Turning in bed, SCOPA-AUT: Difficulty retaining urine
4 years follow-up: Age (older), STAI: Not feeling pleasant, STAI: Worry too much over something that really doesn’t matter, HVLT: Derived retention total score

vs 60-50 %
2 years follow-up: ESS: Being sleepy, STAI: Being jittery, Age (older), Blood test: Serum bicarbonate, MDS-UPDRS: Constipation problems, MDS-UPDRS: Eating tasks, MDS-UPDRS: Leg agility, MDS-UPDRS: Postural tremor, MoCA: Delayed recall face
4 years follow-up: MDS-UPDRS: Urinary problems, Semantic Fluency: Total number of vegetables, Standing heart rate, Age (older), Supine blood pressure, Number of half siblings, STAI: Worry too much over something that really doesn’t matter, Depression category, QUIP: Summary score, SCOPA-AUT: Feel full very quickly during meal, MDS-UPDRS: Rest tremor amplitude lower extremity, rs11724635, rs12456492

vs 50-40 %
2 years follow-up: Cranial Nerve VIII, Blood Test: Serum uric acid, MoCA: Language subscore, RBDSQ: Speaking in sleep, UPSIT: Score booklet #4, STAI: Feel satisfied about self, MDS-UPDRS: Rigidity neck, MDS-UPDRS: Facial expression, RBDSQ: Hurt bed partner, SCOPA-AUT: Having difficulty swallowing or choked, ESS: Fall asleep while sitting/reading, LNS: Trial 4a
4 years follow-up: SCOPA-AUT: Constipation problems, SCOPA-AUT: 1-21 score, STAI: Feel strained, Blood test: Serum chloride, Blood test: Serum IGF-1, Blood test: Basophils, MoCA: Verbal Fluency number of words, Brain Region: Right putamen, HVLT: Recognition, UPSIT: Score booklet #1, Semantic Fluency: Animal score

vs 40-30 %
2 years follow-up: Blood test: Serum chloride, HVLT: Recognition, UPSIT: Score booklet #3, SCOPA-AUT: 1-21 score, Blood test: Eosinophils, Identify self as Asian/Black-African American/Indian/Alaska Native
4 years follow-up: Blood test: ALT(SGPT), MDS-UPDRS: Urinary problems, RBDSQ: Aggressive/Action packed dreams, MoCA: Attention serial 7 sec, Number of full siblings

vs 30-20 %
2 years follow-up: MDS-UPDRS: Hand movements, HVLT: Immediate recall trial 1, RBDSQ: Depression, MDS-UPDRS: Constipation problems, LNS: Derived scaled score, Biological father with PD
4 years follow-up: MDS-UPDRS: Rigidity upper extremity, RBDSQ: Have aggressive/action packed dreams, STAI: Not feeling satisfied, STAI: Not feeling at ease

vs 20-10 %
2 years follow-up: MDS-UPDRS: Eating tasks, MoCA: Total score, RBDSQ: Summary score, Number of children, STAI: Not feeling satisfied, SCOPA-AUT: 24-25 score
4 years follow-up: GDS: Being afraid of something bad happening, Number of paternal aunts/uncles, STAI: Worry too much over something that really doesn’t matter, MDS-UPDRS: Rigidity lower extremity, Height (taller)

vs 10-0 %
2 years follow-up: MDS-UPDRS: Eating tasks, RBDSQ: Speaking in sleep, SCOPA-AUT: 22-23 score, MDS-UPDRS: Toe tapping, MDS-UPDRS: Leg agility, UPSIT: Score booklet #1, Blood test: Basophils, MoCA: Delayed recall subscore
4 years follow-up: MDS-UPDRS: Body bradykinesia, Benton Test: Item 1, STAI: Not Feeling pleasant, STAI: Not being content

*MDS-UPDRS=Movement Disorder Society‐Unified Parkinson's Disease Rating Scale, STAI=State-Trait Anxiety Inventory, SCOPA-AUT=Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms, GDS=Geriatric Depression Scale, RBDSQ=REM Sleep Behavior Disorder Questionnaire, HVLT=Hopkins Verbal Learning Test, MoCA=Montreal Cognitive Assessment, UPSIT=University of Pennsylvania Smell ID Test, LNS=Letter-Number Sequencing, Benton Test=Benton Judgment of Line Orientation, ESS=Epworth Sleepiness Scale, QUIP=Questionnaire for Impulsive-Compulsive disorders, rsXXXXXX=Single Nucleotide Polymorphisms (SNPs).

that the classes across the four evaluation experiments are different, the majority of the selected baseline features are repeatedly found between the evaluation experiments in Tables 2 and 3, while the contextual similarities of others that do not overlap make them practically identical (e.g. “STAI: Worry too much over something that really does not matter” and “STAI: Some unimportant thought bothering”, “SCOPA-AUT: Weak urine stream in past month” and “MDS-UPDRS: Urinary problems”).

4.2. Evaluation per quantile partition split

In this classification scheme, the rapid progression class is evaluated against the corresponding lower quantiles, which are considered as independent classes. Classification performance is estimated for both follow-up periods, using the RIPPER and Naïve Bayes algorithms as before. The results for all classification experiments are presented in Table 4, including the number of selected features per classification algorithm. Regarding the effect of the follow-up period, the results show an increase in classification accuracy for the RIPPER algorithm when moving to the 4-year follow-up by 11.16 % on average (1.03–17.99 %) for the 5-quantiles and by 7.56 % on average (2.37–10.42 %) for the 10-quantiles partition, respectively. Almost identical margins of improvement are noted for Naïve Bayes as well. As expected, the 10-quantiles partition provided higher classification performance for both algorithms compared to the 5-quantiles partition. The RIPPER algorithm again utilizes a more compact subset of selected features, offering similar classification performance to Naïve Bayes with fewer baseline features. In fact, there are only two instances where Naïve Bayes required fewer features (i.e. partitions 60-50 % and 50-40 %, 4-year follow-up). A multiclass classification approach is not evaluated due to the relatively high number of individual classes, especially when considering the 10-quantile partition split.

The most informative baseline features per classification quantile and follow-up period for the 5-quantiles partition are presented in Table 5. The baseline features of each quantile classification are presented in order of selection by the best-first search algorithm. The majority of the selected baseline features come from non-motor evaluation outcomes including urinary and speech problems, saliva drooling, signs of cognitive decline and memory impairment (i.e. MoCA, LNS, Benton Test, HVLT and Lexical and Semantic fluency scales), depressed mood and higher levels of anxiety as captured by the STAI scale, autonomic dysfunction captured with the SCOPA-AUT scale and presence of sleep disorders (i.e. RBDSQ scale). Multiple single nucleotide polymorphisms (SNPs) are also found in the rapid progression class along with variations in the concentration of some


Fig. 4. Comparison of the proposed methodology with our preliminary study showcasing the improvement in classification performance by extending the baseline features included in feature selection.

components found in the blood test laboratory results. Finally, motor symptoms are substantially fewer and are primarily found to originate from the lower extremities, including agility and toe tapping, with minor signs of postural instability at baseline. In order to retain the size of Table 5, the results from the statistical analysis for all baseline features selected by the RIPPER algorithm are shown in detail in Supplementary Table 3. The baseline features that were selected using Naïve Bayes are provided in Supplementary Table 5 along with their respective p-values.

The respective baseline features per follow-up period for the 10-quantiles partition are presented in Table 6. The results suggest again that the majority of baseline features originate from non-motor symptoms, including sleep-related problems and daytime sleepiness (i.e. RBDSQ and ESS scale), autonomic dysfunctions with urinary problems (i.e. SCOPA-AUT, MDS-UPDRS scales), impaired cognition and memory (i.e. MoCA, Benton test, HVLT, LNS and Lexical and Semantic Fluency scales), depressed mood and anxiety (i.e. GDS, STAI scales) and genetic mutations. Combined with the above, laboratory findings from blood test analysis are also helpful. In this higher tier of most rapidly progressed patients, however, there are some features that were not previously reported using the 5-quantiles partition, including olfactory dysfunction detected with the University of Pennsylvania smell identification test (UPSIT), age at baseline evaluation and constipation problems. Motor-oriented features are also limited in the top 10 % of patients, with the identification of symptoms at the lower extremities including leg agility and toe tapping, along with signs of rigidity and postural and rest tremor at the hands. Commonly selected features across different partitions and/or classification experiments within Tables 5 and 6 are highlighted in bold for convenience. Statistical analysis is also performed for all baseline features selected by RIPPER and the results are shown in Supplementary Table 6 due to size limitations, while the corresponding baseline features using the Naïve Bayes algorithm are provided in Supplementary Table 6 along with their respective p-values.

5. Discussion

This study aimed to demonstrate a different approach in the research for early prognostic factors of rapid PD progression and functional decline, proposing a machine learning-based alternative in a field that is traditionally dominated by statistical analysis and regression-based modeling. The proposed methodology can extract baseline factors of rapid progression by performing multi-parametric evaluations over an extensive feature space of about 430 features (i.e. after preprocessing), while previous studies were often limited to fewer features and univariate feature analyses, which is not ideal considering the heterogeneous nature of PD. The change of MDS-UPDRS score over time was used to estimate the trend of progression rate for each patient and two different classification schemes were evaluated over two distinct follow-up patient monitoring periods.

5.1. Rapid progression classification performance

The Naïve Bayes and RIPPER classification algorithms were tested in this study to compare their ability to successfully extract the most informative baseline features during training and provide accurate patient classification when validating with the testing data. The two algorithms provided, overall, similar levels of classification performance during testing, with Naïve Bayes being marginally better, by between 1 and 5 % in some cases. This slightly better classification performance, however, came with the added drawback of requiring more baseline features, which would frequently reach more than 2 times the number of features selected by RIPPER for the same classification task. This trade-off in feature selection density was not favorable and, thus, going with the more compact feature subsets that were extracted using RIPPER in the vast majority of the experiments was judged to be the better alternative.

The results of the evaluation with the De Novo cohort of the PPMI dataset revealed an improvement when a longer follow-up period is considered in the estimation of the progression rate. Moving from 2 to 4 years, the classification accuracy increased in all cases tested, with and without quantile partitions, suggesting a better outcome in identifying the rapid progression phenotype. As expected, judging by the rate of progression in the first couple of years is not ideal, as progression can be affected by non-inherent factors such as delayed treatment initiation. Most of these issues are gradually smoothed out over time with appropriate patient management strategies, leaving a clearer depiction of the patients who have an inherent predisposition for rapid progression and usually show poor response to medication. Performance is also affected by the number of patients assigned to the rapid progression class, with classification results improving in both classification schemes in Sections 4.1 and 4.2 when using only the upper 10th quantile. The results from Tables 2 and 3 show that the mean sensitivity of the RIPPER algorithm increased by 6.17 % and 11.23 % for the 2- and 4-year follow-up period, respectively, when using only the patients from the upper 10-quantile in the rapid progression class. Specificity gains show greater variance, as its mean value reached a significant 13.23 % improvement in the 2-year follow-up, while only an increase of 2.71 % was obtained at the 4-year follow-up. In addition, similar improvements can be noted in the results of Table 4 from the quantile-dependent evaluation, as sensitivity and specificity increased from 77.14–85.71 % and 73.61–83.33 % to 79.41–91.18 % and 80.56–100 % in the 2-year follow-up, and from 85.37–97.56 % and 80.95–97.67 % to 95–100 % and 90.48–100 % in the 4-year follow-up, respectively.

Discrimination of the rapid progression class has also improved by considering a wider range of baseline features in the search for potential prognostic factors. Compared to our preliminary results in [37], where a smaller subset of only 139 baseline features was used to classify the same cohort of the De Novo PPMI patients within a 2-year follow-up period, the total classification accuracy has increased significantly for both the 5- and 10-quantiles partitions. The reduced baseline feature evaluation in [37] provided a lower mean classification accuracy of 76.6 % compared to 88.4 % in this study for the 10-quantiles partition and 70.5 % compared to 79.54 % for the 5-quantiles partition, respectively, while extending the follow-up period to four years provided even better results regardless of the size of the quantiles partition. The classification accuracy across all cases is shown in Fig. 4. Regarding similar methodologies in the literature using machine learning techniques, Faghri et al. [38] proposed a model for rapid progression


Fig. 5. ROC curves for classification performance across all evaluation experiments using RIPPER algorithm.
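
For readers who want to reproduce ROC curves and AUC values of the kind summarised in Fig. 5, the sketch below computes both from per-patient scores. It is an illustration under stated assumptions: `scores` are taken to be the classifier's estimated probabilities for the rapid progression class (label 1), the function names are hypothetical, and the AUC is computed by the pairwise-ranking (Mann-Whitney) formulation rather than by any routine the authors may have used.

```python
def roc_points(labels, scores):
    """(false positive rate, true positive rate) pairs, one per distinct
    score threshold, starting from (0, 0)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_auc(labels, scores):
    """Probability that a randomly chosen rapid-progression patient scores
    higher than a randomly chosen slower one (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```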

prognosis based on baseline features consisting of 140 clinical parameters to discriminate between slow, moderate and fast disease progression. Selecting 52 features as the most significant determinants, the evaluation results with the De Novo PPMI cohort reported a 4-year progression prognosis with an average AUC of 0.93. The proposed methodology provided an average AUC of 0.82 and 0.87 for the top 20 % and 10 % in the quantile-independent analysis using significantly fewer baseline features (i.e. 13 and 14, respectively). Using the 10-quantile evaluation, the proposed methodology provided an average AUC of 0.96 using 59 baseline features and the RIPPER algorithm in the same 4-year follow-up evaluation period. The corresponding ROC curves are shown in Fig. 5.

5.2. Baseline prognostic factors analysis

The majority of the baseline features selected in both classification schemes evaluated in this study (i.e. Sections 4.1 and 4.2) consists predominantly of non-motor symptoms, including signs of autonomic dysfunction, sleep disorders, signs of cognitive decline, memory impairment, emotional disturbances with signs of depression and anxiety, higher age at baseline evaluation and genetic risk factors due to SNPs. Despite the wide range of the specific baseline features that are reported in Tables 2–6, broader categories of alarming early symptoms can be easily identified by aggregating the selected baseline features according to the general class of symptoms and conditions that they are used to evaluate patients for, as shown in Table 7. The contextual similarities of the different selected baseline features in each class of symptoms presented in Table 7 highlight the relevance of the underlying non-motor symptomatology even in a more abstract way. Furthermore, this aggregated level of prognostic symptoms can also be more useful in the everyday clinical environment as it is easier to remember and look for. The prevalence of non-motor baseline symptoms as prognostic factors is evident in Table 7 and, as shown by the detailed evaluation results in Tables 5 and 6, is consistent and unaffected by the different rates of progression expressed in each quantile partition. Finally, the transition from the 5- to the 10-quantiles partition, where fewer of the most advanced cases are assigned into the rapid progression class, revealed the high prognostic value of early smell dysfunction, daytime sleepiness and advancing age for rapid progression of PD.

Our findings are in agreement with previously reported determinants for faster progression of PD symptoms and increased disability. Older age of patients during either the diagnosis of the disease or the baseline evaluation has been previously suggested as one of the most valid factors for rapid progression and shorter survival time in many studies [23,29,39–44]. Cognitive impairment at baseline evaluation is another feature that has been widely correlated with faster worsening of motor function and disability in PD [29,41,42] and has even been suggested to increase the risk of mortality [44] and early development of dementia [45]. In fact, diagnosis of dementia at baseline evaluation is also highly indicative of faster functional decline with patients requiring home nursing much sooner [20,39]. Then, the selection of features declaring early urinary, constipation and eating problems signifies the commonly reported prognostic value of autonomic symptoms for rapid progression phenotyping [46–48]. Furthermore, urinary, gastrointestinal, cardiovascular and thermoregulatory dysfunction has been reported to greatly impact PD patients’ quality of life [49,50]. Smell dysfunction is also very common in PD and the appearance of low UPSIT scale scores in the top 10 % of patients with faster progression shows its usefulness as a prognostic factor for rapid progression as well [51].

REM Sleep Behavior Disorders (RBD) and excessive daytime sleepiness have been known to impact most neurodegenerative diseases and PD in particular [52,53]. The expression of both conditions is highly correlated with more rapid disease progression [54–56], earlier onset and increased severity of non-motor PD symptoms affecting patients’ quality of life [57,58]. RBD and excessive daytime sleepiness at baseline evaluation were also reported as potential risk factors for later emotional disorders, including depression and anxiety [12]. Depressed patients have also been shown to express significant decline in cognition, memory, activities of daily living and a more rapid progression of disability by advancing faster through the Hoehn and Yahr stages [23,59,60]. Presence of anxiety, which commonly coexists with depression and other mood disturbances, is also correlated with severe non-motor phenotypes [61,62] and more rapid progression [63,64]. The most common symptoms of anxiety include distress, worries and fear, which are usually accompanied by social withdrawal due to the embarrassment of the PD symptoms expression [65]. In accordance with the literature, symptoms of anxiety and depression are also found to be indicative of rapid PD progression in this study.

Based on our results, neuron degeneration and brain atrophy are also


Table 7
Categorization of the most prominent selected baseline features into their relevant class of symptoms and related conditions.

| Broad category | Class of symptoms | Relevant baseline features selected |
|---|---|---|
| Autonomic dysfunction [46–50] | Urinary problems | MDS-UPDRS: Urinary problems, SCOPA-AUT: Weak urine stream, Difficulty retaining urine, Pass urine again within 2 hours, Pass urine at night |
| | Digestive difficulties | MDS-UPDRS: Constipation problems, SCOPA-AUT: Constipation problems, Feel full very quickly during meal, Having difficulty swallowing or choked |
| | Other | SCOPA-AUT: 1–21 score, Saliva dribbled out of mouth |
| Sleep Problems [12,54–58] | REM sleep | RBDSQ: Move arms/legs during sleep, Sudden limb movements, Aggressive/Action packed dreams, Speaking in sleep, Hurt bed partner, Disturbed sleep |
| | Daytime Sleepiness | ESS: Fall asleep while watching TV, Fall asleep while sitting/reading, Being sleepy |
| Cognitive dysfunction [29,39,41–45] | Cognitive impairment | Semantic Fluency: Animal score, Total number of vegetables, MoCA: Verbal fluency, Visuospatial/Executive subscore, Language subscore, Total score, LNS: 5a/5c/4a, Derived scaled score, Benton Test: Multiple items |
| | Memory problems | MoCA: Delayed recall subscore, HVLT: Immediate recall, Recognition, GDS: More memory problems than most |
| Emotional disorders [23,59–64] | Mood impairment | GDS: Feel situation is hopeless, Being afraid of something bad happening, Depression category, RBDSQ: Depression, MDS-UPDRS: Apathy |
| | Anxiety disorders | STAI: Some unimportant thought is bothering, Worry too much over something that really doesn’t matter, Feel strained, Not feeling/being content, Not feeling satisfied, Not feeling pleasant, MDS-UPDRS: Anxious mood |
| Lab exams [74–76] | Blood tests | Basophils, Serum chloride, APTT-QT, Eosinophils, Serum glucose |
| Clinical Evaluation [20,23,41,77–79] | Motor | MDS-UPDRS: Rigidity upper/lower extremities & neck, Leg agility, Toe tapping |
| | Activities of daily living | MDS-UPDRS: Eating tasks, Speech, Hygiene, Part I summary score |
| | Neurological | Plantar reflex, Cranial Nerves |
| Genetics [69–73] | SNPs | rs823118, rs329648, rs11060180, rs118117788, rs11158026, rs11724635, rs12637471 |
| Other [23,51,66–68] | | UPSIT: Booklet scores, Age, MDS-UPDRS: Pain and other sensations, Brain region: Left/Right putamen, Height, Number of siblings |

*MDS-UPDRS=Movement Disorder Society-Unified Parkinson’s Disease Rating Scale, STAI=State-Trait Anxiety Inventory, SCOPA-AUT=Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms, GDS=Geriatric Depression Scale, RBDSQ=REM Sleep Disorder Questionnaire, HVLT=Hopkins Verbal Learning Test, MoCA=Montreal Cognitive Assessment, UPSIT=University of Pennsylvania Smell ID Test, LNS=Letter-Number Sequencing, Benton Test=Benton Judgment of Line Orientation, ESS=Epworth Sleepiness Scale, rsXXXXXX=Single Nucleotide Polymorphisms (SNPs).
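
The aggregation behind Table 7 can be illustrated with a small sketch. It is not the authors’ code: the lookup table `FEATURE_CLASSES` and the function `aggregate_by_class` are hypothetical names introduced here, and only a handful of Table 7 entries are included. Each selected feature is mapped to its broad category and class of symptoms, and unmapped features fall into "Other".

```python
# Hypothetical lookup built from a few rows of Table 7:
# feature name -> (broad category, class of symptoms).
FEATURE_CLASSES = {
    "MDS-UPDRS: Urinary problems": ("Autonomic dysfunction", "Urinary problems"),
    "SCOPA-AUT: Weak urine stream": ("Autonomic dysfunction", "Urinary problems"),
    "SCOPA-AUT: Constipation problems": ("Autonomic dysfunction", "Digestive difficulties"),
    "RBDSQ: Speaking in sleep": ("Sleep Problems", "REM sleep"),
    "ESS: Being sleepy": ("Sleep Problems", "Daytime Sleepiness"),
    "MoCA: Delayed recall subscore": ("Cognitive dysfunction", "Memory problems"),
    "MDS-UPDRS: Anxious mood": ("Emotional disorders", "Anxiety disorders"),
}


def aggregate_by_class(selected_features, feature_classes=FEATURE_CLASSES):
    """Group selected feature names as {broad category: {class: [features]}}.

    Features missing from the lookup are placed under ("Other", "").
    """
    grouped = {}
    for name in selected_features:
        category, symptom_class = feature_classes.get(name, ("Other", ""))
        grouped.setdefault(category, {}).setdefault(symptom_class, []).append(name)
    return grouped


if __name__ == "__main__":
    selected = [
        "ESS: Being sleepy",
        "SCOPA-AUT: Weak urine stream",
        "MDS-UPDRS: Urinary problems",
        "Height",
    ]
    for category, classes in aggregate_by_class(selected).items():
        for symptom_class, names in classes.items():
            print(category, "|", symptom_class, "|", ", ".join(names))
```

Grouping at this level gives a short list of symptom classes to look for, rather than individual questionnaire items.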

found to be indicative of rapid symptoms progression in PD, enhancing previous findings reported in [66] for the faster overall progression subgroup of PPMI patients when analyzing PD-specific brain networks for deformations using MRI data. A connection between structural changes in the putamen brain region and sleep disorders was reported by Boucetta et al., with patients scoring higher in the RBD scale than matching controls [67], while an inverse correlation between depression and anxiety symptoms severity and basal ganglia DAT availability, especially in the left anterior putamen region, was shown in [68]. Regarding genetic analysis, as shown by the selected SNPs in Table 7, patients from the rapid progression class carried various genetic mutations, which are among the most frequent SNPs found in PD patients [69] and were also previously reported as risk factors [70,71] and determinants of faster motor symptoms progression [72,73]. The concentration of eosinophils and basophils from blood test results showcases the impact of inflammation in PD phenotyping and as a pathogenic factor for the progression of the disease [74,75], while reduced levels of glucose have been reported to increase the risk of early cognitive decline [76].

Motor symptoms, in contrast, are found to be less informative as determinants of rapid progression. Presence of rigidity and impaired movement of lower extremities, including reduced leg agility and toe tapping potential, were among the few motor symptoms that were selected as prognostic factors. The rigidity-bradykinesia phenotype has also been reported in the literature [23,41], while reduced leg movement and strength and peripheral sensation in PD patients have been associated with increased risk of falls [77,78]. Early impairment in activities of daily living has been found to be indicative of rapid motor symptoms progression [20] and a potential marker for PD progression overall [79]. Finally, some of the selected features that were not previously reported in the literature include patient’s height, family history dependencies with close relatives with and without PD and number of siblings. It should be noted, however, that these features were usually among the last to be added in the subset of selected baseline features, suggesting that their prognostic value might be comparatively smaller. Gender did not seem to impact the rate of symptoms progression.

A limitation of this study lies within the framework of the evaluation process as, in the absence of official guidelines, the quantiles partition split resulted in a relatively high number of individual classes, preventing the effective evaluation of multi-class approaches and leading to high variance in the selected features in Tables 5 and 6. This issue was mitigated in this work by using a quantile-independent analysis, which resulted in a much more compact set of prognostic baseline features (i.e. Tables 2 and 3). Towards this direction, the reduction of the number of quantiles considered as in [80] or a phenotype-specific partitioning approach, turning the variation in the expressed symptomatology from patient to patient into an advantage, could be viable alternatives and potential solutions to overcome such limitations and are planned for our future work.
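
To make the partitioning discussed above concrete, the sketch below labels the top quantile of patients as the rapid progression class. It rests on assumptions not stated in this section: each patient has a single numeric progression score, higher means faster progression, and the rapid class is the top 1/q of patients for a q-quantiles partition. The function name `label_rapid_progressors` is hypothetical.

```python
def label_rapid_progressors(progression, n_quantiles):
    """Return 1 for patients in the top 1/n_quantiles by progression score, else 0.

    Assumption: a higher score means faster progression. Ties are broken
    by original order, since Python's sort is stable.
    """
    if n_quantiles < 2:
        raise ValueError("n_quantiles must be at least 2")
    n = len(progression)
    n_rapid = max(1, n // n_quantiles)  # size of the top quantile
    # Patient indices sorted from fastest to slowest progression.
    order = sorted(range(n), key=lambda i: progression[i], reverse=True)
    labels = [0] * n
    for i in order[:n_rapid]:
        labels[i] = 1
    return labels


if __name__ == "__main__":
    scores = list(range(20))  # toy progression scores for 20 patients
    print(sum(label_rapid_progressors(scores, 5)))   # top 20 % of patients
    print(sum(label_rapid_progressors(scores, 10)))  # top 10 % of patients
```

Moving from 5 to 10 quantiles halves the rapid class in this toy setting, so only the most advanced cases remain in it.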


6. Conclusions

Patient stratification for rapid symptoms progression and early functional decline in newly diagnosed PD patients has the potential to completely change our perspective regarding patient management in the early stages of the disease, leading to better and more precise interventions. In this study, a wide variety of baseline features were assessed for their ability to signal higher risk of rapid progression, using machine learning techniques. The results suggest that there are significant variations in the observed non-motor symptomatology and mental condition of PD patients at baseline, which should alert clinicians that an aggressive phenotype of PD might be present. Early signs of autonomic dysfunction, sleep disorders, affected mood and anxiety, cognitive or memory decline along with advanced age, olfactory impairment, presence of genetic mutations and rigidity are found to be useful prognostic factors for rapid progression within the first 2 and 4 years of follow-up, while a slower progression is expected when such symptoms do not appear at diagnosis.

Funding

This work was supported by the PD_Manager project, funded within the EU Framework Programme for Research and Innovation Horizon 2020, under grant number 643706.

Declaration of Competing Interest

The authors declare that there is no conflict of interest regarding the publication of this work.

Acknowledgements

Data used in the preparation of this article were obtained from the Parkinson’s Progression Markers Initiative (PPMI) database (www.ppmi-info.org/data). For up-to-date information on the study, visit www.ppmi-info.org. PPMI is a public-private partnership funded by the Michael J. Fox Foundation for Parkinson’s Research and funding partners. The list with full names of all of the PPMI funding partners can be found at www.ppmi-info.org/fundingpartners.

Appendix A. Supplementary data

Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.artmed.2020.101807.

References

[1] Poewe W, et al. Parkinson disease. Nat Rev Dis Primers 2017;3:17013.
[2] Pringsheim T, et al. The prevalence of Parkinson’s disease: a systematic review and meta-analysis. Mov Disord 2014;29(13):1583–90.
[3] Yau Y, et al. Network connectivity determines cortical thinning in early Parkinson’s disease progression. Nat Commun 2018;9(1):12.
[4] Huang C, et al. Changes in network activity with the progression of Parkinson’s disease. Brain 2007;130(Pt 7):1834–46.
[5] Halliday G, Lees A, Stern M. Milestones in Parkinson’s disease—clinical and pathologic features. Mov Disord 2011;26(6):1015–21.
[6] Tropea TF, Chen-Plotkin AS. Unlocking the mystery of biomarkers: a brief introduction, challenges and opportunities in Parkinson Disease. Parkinsonism Relat Disord 2018;46:S15–8.
[7] Martino R, et al. Onset and progression factors in Parkinson’s disease: a systematic review. NeuroToxicology 2017;61:132–41.
[8] Ascherio A, Schwarzschild MA. The epidemiology of Parkinson’s disease: risk factors and prevention. Lancet Neurol 2016;15(12):1257–72.
[9] Guest PC. Parkinson’s disease, biomarkers and beyond. Biomarkers and mental illness: it’s not all in the mind. Springer International Publishing: Cham; 2017. p. 157–71.
[10] Hely MA, et al. The Sydney multicenter study of Parkinson’s disease: the inevitability of dementia at 20 years. Mov Disord 2008;23(6):837–44.
[11] DeMaagd G, Philip A. Parkinson’s disease and its management: part 1: disease entity, risk factors, pathophysiology, clinical presentation, and diagnosis. P & T: a peer-reviewed journal for formulary management 2015;40(8):504–32.
[12] Barrett MJ, et al. Baseline symptoms and basal forebrain volume predict future psychosis in early Parkinson disease. Neurology 2018. p. 10.1212/WNL.0000000000005421.
[13] Rajput AH, et al. Baseline motor findings and Parkinson disease prognostic subtypes. Neurology 2017. p. 10.1212/WNL.0000000000004078.
[14] Rolinski M, et al. REM sleep behaviour disorder is associated with worse quality of life and other non-motor features in early Parkinson’s disease. J Neurol Neurosurg Psychiatr 2014;85(5):560–6.
[15] Fereshtehnejad S-M, Postuma RB. Subtypes of Parkinson’s disease: what do they tell us about disease progression? Curr Neurol Neurosci Rep 2017;17(4):34.
[16] Baumann CR, et al. Body side and predominant motor features at the onset of Parkinson’s disease are linked to motor and nonmotor progression. Mov Disord 2014;29(2):207–13.
[17] Postuma RB, et al. MDS clinical diagnostic criteria for Parkinson’s disease. Mov Disord 2015;30(12):1591–601.
[18] Mollenhauer B, et al. Baseline predictors for progression 4 years after Parkinson’s disease diagnosis in the De Novo Parkinson Cohort (DeNoPa). Movement Disorders. 0(0).
[19] Jankovic J, Kapadia AS. Functional decline in Parkinson disease. Arch Neurol 2001;58(10):1611–5.
[20] Louis ED, et al. Progression of parkinsonian signs in Parkinson disease. Arch Neurol 1999;56(3):334–7.
[21] Iddi S, et al. Estimating the evolution of disease in the Parkinson’s progression markers initiative. Neurodegener Dis 2018;18(4):173–90.
[22] Marek K, et al. The Parkinson progression marker initiative (PPMI). Prog Neurobiol 2011;95(4):629–35.
[23] Post B, et al. Prognostic factors for the progression of Parkinson’s disease: a systematic review. Mov Disord 2007;22(13):1839–51.
[24] Goetz CG, et al. Movement Disorder Society-sponsored revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Movement disorders: official journal of the Movement Disorder Society 2008;23(15):2129–70.
[25] McGhee DJ, et al. A systematic review of biomarkers for disease progression in Parkinson’s disease. BMC Neurol 2013;13(1):35.
[26] Goetz CG, et al. Movement Disorder Society Task Force report on the Hoehn and Yahr staging scale: status and recommendations the Movement Disorder Society Task Force on rating scales for Parkinson’s disease. Mov Disord 2004;19(9):1020–8.
[27] Vavougios GD, et al. Identification of a prospective early motor progression cluster of Parkinson’s disease: data from the PPMI study. J Neurol Sci 2018;387:103–8.
[28] Ferguson LW, Rajput ML, Muhajarine N, Shah SM, Rajput A. Clinical features at first visit and rapid disease progression in Parkinson’s disease. Parkinsonism Relat Disord 2008;14(5):431–5.
[29] Alves G, et al. Progression of motor impairment and disability in Parkinson disease. A population-based study 2005;65(9):1436–41.
[30] Latourelle JC, et al. Large-scale identification of clinical and genetic predictors of motor progression in patients with newly diagnosed Parkinson’s disease: a longitudinal cohort study and validation. Lancet Neurol 2017;16(11):908–16.
[31] Lewis SJG, et al. Heterogeneity of Parkinson’s disease in the early clinical stages using a data driven approach. J Neurol Neurosurg Psychiatr 2005;76(3):343–8.
[32] Greffard S, et al. Motor score of the Unified Parkinson Disease Rating Scale as a good predictor of Lewy body-associated neuronal loss in the substantia nigra. Arch Neurol 2006;63(4):584–8.
[33] Holden SK, et al. Progression of MDS-UPDRS scores over five years in de novo Parkinson disease from the Parkinson’s progression markers initiative cohort. Mov Disord Clin Pract 2018;5(1):47–53.
[34] Venuto CS, et al. A review of disease progression models of Parkinson’s disease and applications in clinical trials. Mov Disord 2016;31(7):947–56.
[35] Reutemann P. Python-weka-wrapper3: Python 3 wrapper for weka using javabridge. GitHub repository. 2016 (Accessed 21 February 2019) [Online]. https://github.com/fracpete/python-weka-wrapper3.
[36] Hall M, et al. The WEKA data mining software: an update. Acm Sigkdd Explor Newsl 2009;11(1):10–8.
[37] Tsiouris KM, et al. Predicting rapid progression of Parkinson’s disease at baseline patients evaluation. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2017.
[38] Faghri F, et al. Predicting onset, progression, and clinical subtypes of Parkinson disease using machine learning. bioRxiv 2018:338913.
[39] Parashos SA, et al. Medical services utilization and prognosis in Parkinson disease: a population-based study. Mayo Clin Proc 2002;77(9):918–25.
[40] Kempster PA, et al. Relationships between age and late progression of Parkinson’s disease: a clinico-pathological study. Brain 2010;133(Pt 6):1755–62.
[41] Reinoso G, et al. Clinical evolution of Parkinson’s disease and prognostic factors affecting motor progression: 9-year follow-up study. Eur J Neurol 2015;22(3):457–63.
[42] Velseboer DC, et al. Prognostic factors of motor impairment, disability, and quality of life in newly diagnosed PD. Neurology 2013;80(7):627–33.
[43] Suchowersky O, et al. Practice Parameter: diagnosis and prognosis of new onset Parkinson disease (an evidence-based review). Report of the Quality Standards Subcommittee of the American Academy of Neurology 2006;66(7):968–75.
[44] Oosterveld LP, et al. Prognostic factors for early mortality in Parkinson’s disease. Parkinsonism Relat Disord 2015;21(3):226–30.
[45] Pedersen KF, et al. Prognosis of mild cognitive impairment in early Parkinson disease: the Norwegian ParkWest Study. JAMA Neurol 2013;70(5):580–6.
[46] De Pablo-Fernandez E, et al. Association of autonomic dysfunction with disease progression and survival in Parkinson disease. JAMA Neurol 2017;74(8):970–6.


[47] Picillo M, et al. The PRIAMO study: urinary dysfunction as a marker of disease progression in early Parkinson’s disease. Eur J Neurol 2017;24(6):788–95.
[48] Merola A, et al. Autonomic dysfunction in Parkinson’s disease: a prospective cohort study. Mov Disord 2018;33(3):391–7.
[49] Damian A, et al. Autonomic function, as self-reported on the SCOPA-autonomic questionnaire, is normal in essential tremor but not in Parkinson’s disease. Parkinsonism Relat Disord 2012;18(10):1089–93.
[50] Magerkurth C, Schnitzer R, Braune S. Symptoms of autonomic failure in Parkinson’s disease: prevalence and impact on daily life. Clin Auton Res 2005;15(2):76–82.
[51] Cavaco S, et al. Abnormal olfaction in Parkinson’s disease is related to faster disease progression. Behav Neurol 2015;2015.
[52] Comella CL. Sleep disorders in Parkinson’s disease: an overview. Mov Disord 2007;22(S17):S367–73.
[53] Boeve BF, et al. Pathophysiology of REM sleep behaviour disorder and relevance to neurodegenerative disease. Brain 2007;130(11):2770–88.
[54] Duarte Folle A, et al. Clinical progression in Parkinson’s disease with features of REM sleep behavior disorder: a population-based longitudinal study. Parkinsonism Relat Disord 2019;62:105–11.
[55] Bohnen NI, Hu MTM. Sleep disturbance as potential risk and progression factor for Parkinson’s disease. J Parkinsons Dis 2019;9(3):603–14.
[56] Pagano G, et al. REM behavior disorder predicts motor progression and cognitive decline in Parkinson disease. Neurology 2018;91(10):e894–905.
[57] Kim YE, Jeon BS. Clinical implication of REM sleep behavior disorder in Parkinson’s disease. J Parkinsons Dis 2014;4(2):237–44.
[58] Rolinski M, et al. REM sleep behaviour disorder is associated with worse quality of life and other non-motor features in early Parkinson’s disease. J Neurol Neurosurg Psychiatr 2014;85(5):560–6.
[59] Starkstein SE, et al. A prospective longitudinal study of depression, cognitive decline, and physical impairments in patients with Parkinson’s disease. J Neurol Neurosurg Psychiatr 1992;55(5):377–82.
[60] Gison A, et al. Dispositional optimism, depression, disability and quality of life in Parkinson’s disease. Funct Neurol 2014;29(2):113–9.
[61] Rutten S, et al. Anxiety in Parkinson’s disease: symptom dimensions and overlap with depression and autonomic failure. Parkinsonism Relat Disord 2015;21(3):189–93.
[62] Yamanishi T, et al. Anxiety and depression in patients with Parkinson’s disease. Intern Med 2013;52(5):539–45.
[63] Hiller A, Quinn J, Schmidt P. Does psychological stress affect the progression of Parkinson’s disease (N5.002). Neurology 2017;88(16 Supplement):N5.002.
[64] Martens KAE, et al. Anxiety is associated with freezing of gait and attentional set-shifting in Parkinson’s disease: a new perspective for early intervention. Gait Posture 2016;49:431–6.
[65] Dissanayaka NNW, et al. Disease-specific anxiety symptomatology in Parkinson’s disease. Int Psychogeriatr 2016;28(7):1153–63.
[66] Fereshtehnejad S-M, et al. Clinical criteria for subtyping Parkinson’s disease: biomarkers and longitudinal progression. Brain 2017;140(7):1959–76.
[67] Boucetta S, et al. Structural brain alterations associated with rapid eye movement sleep behavior disorder in Parkinson’s disease. Sci Rep 2016;6:26782.
[68] Weintraub D, et al. Striatal dopamine transporter imaging correlates with anxiety and depression symptoms in Parkinson’s disease. J Nucl Med 2005;46(2):227–32.
[69] Lill CM. Genetics of Parkinson’s disease. Mol Cell Probes 2016;30(6):386–96.
[70] Cai M, et al. Association between rs823128 polymorphism and the risk of Parkinson’s disease: a meta-analysis. Neurosci Lett 2018;665:110–6.
[71] Ibanez L, et al. Parkinson disease polygenic risk score is associated with Parkinson disease status and age at onset but not with alpha-synuclein cerebrospinal fluid levels. BMC Neurol 2017;17(1):198.
[72] Iwaki H, et al. Genetic risk of Parkinson disease and progression. An analysis of 13 longitudinal cohorts 2019;5(4):e348.
[73] Webb J, Willette AA. Aging modifies the effect of GCH1 RS11158026 on DAT uptake and Parkinson’s disease clinical severity. Neurobiol Aging 2017;50:39–46.
[74] Umehara T, et al. Differential leukocyte count is associated with clinical phenotype in Parkinson’s disease. J Neurol Sci 2020;409:116638.
[75] Kannarkat GT, Boss JM, Tansey MG. The role of innate and adaptive immunity in Parkinson’s disease. J Parkinsons Dis 2013;3(4):493–514.
[76] Firbank MJ, et al. Cerebral glucose metabolism and cognition in newly diagnosed Parkinson’s disease: ICICLE-PD study. J Neurol Neurosurg Psychiatr 2017;88(4):310–6.
[77] Latt MD, et al. Clinical and physiological assessments for elucidating falls risk in Parkinson’s disease. Mov Disord 2009;24(9):1280–9.
[78] Kerr GK, et al. Predictors of future falls in Parkinson disease. Neurology 2010;75(2):116–24.
[79] Harrison MB, et al. UPDRS activity of daily living score as a marker of Parkinson’s disease progression. Mov Disord 2009;24(2):224–30.
[80] Vásquez-Correa JC, et al. Multimodal assessment of Parkinson’s disease: a deep learning approach. IEEE J Biomed Health Inform 2019;23(4):1618–30.
