Early Diagnosis of Parkinson's Disease: A Combined Method Using Deep Learning and Neuro-Fuzzy Techniques

Computational Biology and Chemistry 102 (2023) 107788
Mehrbakhsh Nilashi a,b,*, Rabab Ali Abumalloh c, Salma Yasmin Mohd Yusuf d, Ha Hang Thi e,f,**, Mohammad Alsulami g, Hamad Abosaq g, Sultan Alyami g, Abdullah Alghamdi g

a UCSI Graduate Business School, UCSI University, No. 1 Jalan Menara Gading, UCSI Heights, 56000 Cheras, Kuala Lumpur, Malaysia
b Centre for Global Sustainability Studies (CGSS), Universiti Sains Malaysia, 11800, USM Penang, Malaysia
c Computer Department, Applied College, Imam Abdulrahman Bin Faisal University, P.O. Box. 1982, Dammam, Saudi Arabia
d Primary Care Medicine Department, Faculty of Medicine, Universiti Teknologi MARA, Sungai Buloh 47000, Selangor, Malaysia
e Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
f International School, Duy Tan University, Da Nang, Vietnam
g Computer Science Dept., College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
ARTICLE INFO

Keywords: Computational intelligence; Parkinson's disease; UPDRS; Diagnosis; Accuracy; Time complexity

ABSTRACT

Predicting the Unified Parkinson's Disease Rating Scale (UPDRS) in the Total-UPDRS and Motor-UPDRS clinical scales is an important part of controlling PD. Computational intelligence approaches have been used effectively in the early diagnosis of PD by predicting UPDRS. In this research, we present a combined approach for PD diagnosis using an ensemble learning approach with the ability of online learning from large clinical datasets. The method is developed using Deep Belief Network (DBN) and Neuro-Fuzzy approaches. A clustering approach, Expectation-Maximization (EM), is used to handle large datasets. The Principal Component Analysis (PCA) technique is employed for noise removal from the data. The UPDRS prediction models are constructed for PD diagnosis. To handle the missing data, K-NN is used in the proposed method. We use incremental machine learning approaches to improve the efficiency of the proposed method. We assess our approach on a real-world PD dataset and compare the findings with other PD diagnosis approaches developed by machine learning techniques. The findings revealed that the approach can improve the UPDRS prediction accuracy and the time complexity of previous methods in handling large datasets.
* Corresponding author at: UCSI Graduate Business School, UCSI University, No. 1 Jalan Menara Gading, UCSI Heights, 56000 Cheras, Kuala Lumpur, Malaysia.
** Corresponding author at: Institute of Research and Development, Duy Tan University, Da Nang, Vietnam.
E-mail addresses: nilashidotnet@hotmail.com (M. Nilashi), hntha@duytan.edu.vn (H.H. Thi).
https://doi.org/10.1016/j.compbiolchem.2022.107788
Received 3 June 2022; Received in revised form 28 October 2022; Accepted 8 November 2022
Available online 10 November 2022
1476-9271/© 2022 Elsevier Ltd. All rights reserved.

1. Introduction

Parkinson's Disease (PD) is a progressive neurodegenerative disorder (Kim et al., 2020) triggered by polygenic and environmental factors (Krohn et al., 2020). PD mostly affects people over 60 years (Razali and Ahmad, 2011). Parkinson's symptoms and the rate of progression vary among individuals. PD symptoms include muscle rigidity, slowness of movement, walking problems, tremor of the limbs, impaired balance and coordination, cognitive and vocal impairment, eating and swallowing problems, and mood disturbances (Eskidere et al., 2012; Sakar and Kursun, 2010). Voice impairments such as dysphonia and dysarthria have been reported in approximately 90% of PD patients (Eskidere et al., 2012; Little et al., 2008). Parkinson's patients often experience dysphonia (Majdinasab et al., 2012), which is distinguished by a breathy and harsh voice as the PD develops (Sapir et al., 1999). PD includes non-motor symptoms and motor symptoms (see Fig. 1). In most cases, medical tests are not available for definitive detection of the disease, so an accurate diagnosis can be difficult. Parkinson's disease can affect both men and women, but it affects approximately 50% more men than women. Currently, there are no laboratory tests for the diagnosis of non-genetic causes of PD. Accurate detection depends on the neurological history of a person with PD. While PD cannot be cured, some symptoms are frequently alleviated with medications, surgical treatment, or other therapies. Usually, the Unified Parkinson's Disease Rating Scale (UPDRS) is used to track PD symptom progression (Eskidere et al., 2012; Nilashi et al., 2020b, 2016; Tsanas et al., 2009). UPDRS reflects the presence and severity of symptoms (Eskidere et al., 2012; Prashanth and Roy, 2018a) and is widely used as a clinical rating scale for PD (Disease, 2003; Eskidere et al., 2012; Nilashi et al., 2020b, 2016; Prashanth and Roy, 2018b; Tsanas et al., 2009). Both motor impairment and motor disability are assessed by this scale.

Many studies have used up-to-date approaches based on machine learning techniques to inspect medical data and extract insights from it. Disease diagnosis using machine learning techniques has been a center of interest for researchers in both the medical and artificial intelligence fields. Machine learning algorithms are widely used for disease detection in the healthcare sector (Arji et al., 2019; Kishore et al., 2020; Nilashi et al., 2018b; Senturk, 2020; Sharma et al., 2020). These algorithms have been found to be effective in PD diagnosis on a set of real-world datasets. While these techniques have provided robust outcomes on different data sets in the context of PD (Ferreira et al., 2022; Valla et al., 2022), the majority of research has used supervised approaches without incremental learning. In fact, both accuracy and real-time prediction are important criteria that must be considered in the development of disease diagnosis systems. Incremental learning avoids re-computing the prediction models from scratch, a process that increases the prediction time of the model. Incremental learning enables fast computation on the data (Guo et al., 2014) and can handle large-scale problems. In contrast to conventional approaches, it presents promising outcomes in real applications (Li et al., 2021). For this reason, it has been integrated with different approaches such as continual algorithms (Wiwatcharakoses and Berrar, 2021), extreme learning machines (Li et al., 2021), generative rehearsal strategy (Lee et al., 2021), and transfer learning (Koivu et al., 2021). Still, although incremental learning has gained the interest of researchers, its deployment in disease diagnosis and prediction, particularly PD detection, is fairly unexplored. In addition, for large datasets, the sole use of supervised learning methods may not be an efficient way to predict the disease, as the prediction models may not be appropriately constructed from the input and output variables of the dataset. In fact, the use of unsupervised and incremental learning techniques can significantly reduce both processing time and memory consumption. Hence, unsupervised and incremental machine learning techniques must be taken into account in the development of disease diagnosis algorithms and systems.

In this research, a novel approach using deep learning and neuro-fuzzy techniques is presented to track PD progression from a real-world dataset. Specifically, we employ Deep Belief Network (DBN) (Hinton, 2009) combined with Adaptive Neuro-Fuzzy Inference System (ANFIS) (Jang, 1993) in constructing UPDRS prediction models. The fuzzy logic proposed by Zadeh (1965) has been an effective approach for modeling systems for disease diagnosis (Chen et al., 2013; Ganji and Abadeh, 2011). The mathematical model of fuzzy logic expresses the degree of uncertainty in terms of the distribution of possibilities (Foroughi et al., 2023; Nilashi et al., 2020a; Yadegaridehkordi et al., 2020). In this study, the combination of fuzzy logic with the neural network approach (neuro-fuzzy) aims to provide an efficient way of predicting UPDRS. In addition, the DBN can examine the data's essential characteristics in depth. It minimizes the influence of human factors and significantly enhances neural network training outcomes. The approach also applies Expectation-Maximization (EM) to segment the data. To overcome the computation time issue, we develop the proposed method for incremental learning. Through several experiments, we will demonstrate that the use of incremental approaches is efficient for real-time PD diagnosis. The approach is assessed using real-world PD data and the outcomes are evaluated and compared with other PD detection approaches.

The remainder of this work is organized as follows. Section 2 reviews related work on PD diagnosis using machine learning techniques. Section 3 presents the proposed methodology. Section 4 gives the mathematical background of EM and DBN. Section 5 describes the dataset used for method evaluation. Section 6 reports the results of the method evaluation on the dataset. The final section concludes this research. A list of abbreviations utilized in this research is shown in Table 1.
Fig. 1. Proposed method.
Table 1
List of abbreviations utilized in this research.

Abbreviation  Description
ANFIS         Adaptive Neuro-Fuzzy Inference System
CD            Contrastive Divergence
CML           Conventional Machine Learning
CNN           Convolutional Neural Networks
DBN           Deep Belief Network
DL            Deep Learning
DNN           Deep Neural Network
DR            Deep Ranking
DT            Decision Tree
ELM           Extreme Learning Machines
ET            Essential Tremor
GRBM          Gaussian Restricted Boltzmann Machine
GMM           Gaussian Mixture Model
HC            Healthy Control
HOSVD         Higher-Order Singular Value Decomposition
K-NN          K-Nearest Neighbour
KSVM          Kernel Support Vector Machines
NMS-MRI       Neuromelanin Sensitive Magnetic Resonance Imaging
LR            Logistic Regression
LSTM          Long Short-Term Memory
ML            Machine Learning
MSA           Multiple System Atrophy
NNs           Neural Networks
PCA           Principal Component Analysis
PD            Parkinson's Disease
PT            Parkinsonian Tremor
RBM           Restricted Boltzmann Machine
SOM           Self-Organizing Map
SVD           Singular Value Decomposition
SVM           Support Vector Machine
SVR           Support Vector Regression
UPDRS         Unified Parkinson's Disease Rating Scale

2. Related work

Several researchers have explored PD in terms of the detection and diagnosis of the disease (Camps et al., 2018; Nilashi et al., 2020b; Ortiz et al., 2019). Other approaches have focused primarily on the remote tracking of the severity or progress of PD (El Maachi et al., 2020; Grover et al., 2018). In Table 2, we present a summary of the approaches that were used in PD diagnosis, classification, and tracking in previous literature.

Table 2
Previous literature on PD diagnosis.

Method            References
SVM               (Bhattacharya and Bhatia, 2010; Ozcift, 2012; Eskidere et al., 2012; Chen et al., 2013; Behroozi and Sami, 2016; Nilashi et al., 2018b; Nilashi et al., 2018a; Asgari and Shafran, 2010; Yadav et al., 2012)
KNN               (Polat, 2012; Chen et al., 2013; Jain and Shetty, 2016; Behroozi and Sami, 2016; Wan et al., 2018)
NN                (Das, 2010; Afonso et al., 2019; Pereira et al., 2018; Shinde et al., 2019; Eskidere et al., 2012; Babu and Suresh, 2013; Hariharan et al., 2014; Khan et al., 2016; Buza and Varga, 2016; Al-Fatlawi et al., 2016; Jain and Shetty, 2016; Grover et al., 2018)
ANFIS             (Nilashi et al., 2019; Polat, 2012; Nilashi et al., 2018a)
Fuzzy Logic       (Li et al., 2011; Polat, 2012; Chen et al., 2013)
K-Means           (Polat, 2012)
GP                (Guo et al., 2010; Khan et al., 2016; Avci and Dogantekin, 2016)
EM                (Guo et al., 2010; Nilashi et al., 2017a; Nilashi et al., 2019; Hariharan et al., 2014; Nilashi et al., 2018a)
PCA               (Nilashi et al., 2017a; Chen et al., 2013; Hariharan et al., 2014; Nilashi et al., 2018a)
Random Forest     (Peterek et al., 2013)
LDA               (Hariharan et al., 2014; Wan et al., 2018)
DT                (Nilashi et al., 2017a; Froelich et al., 2015; Yadav et al., 2012; Exarchos et al., 2012)
Association Rule  (Jain and Shetty, 2016)
NB                (Naranjo et al., 2016; Behroozi and Sami, 2016; Abdar and Zomorodi-Moghadam, 2018)
Regression        (Wan et al., 2018; Yadav et al., 2012)
Deep Learning     (Grover et al., 2018; Gunduz, 2019; Wan et al., 2018; Johri and Tripathi, 2019; Nilashi et al., 2020c)

Das (2010) compared several PD diagnosis approaches. The method used in the study aimed to effectively identify healthy people. Four classification methods (NNs, Regression, DMneural, and DT) were deployed and compared, and several assessment approaches were used to score their outcomes. The study found that NNs gave better classification outcomes than regression, DMneural, and decision tree (92.9% classification accuracy). Additionally, the outcomes were compared with KSVM and were encouraging.

SVM is a nonlinear pattern classification algorithm. SVM maps input data into a high-dimensional feature space, where it uses a nonlinear kernel function to construct an optimal discriminant hyperplane. SVM was deployed in several studies on PD classification and diagnosis (Bhattacharya and Bhatia, 2010; Chen et al., 2013; Eskidere et al., 2012; Ozcift, 2012). For example, along with DT and LR, SVM was used by Yadav et al. (2012) for PD detection, and the outcome was compared with other techniques. The authors concentrated on signs of speech pronunciation complexity among PD-affected individuals and framed the model using three data mining approaches from various perspectives. The outcomes of the deployed approaches were assessed on three main aspects: accuracy, specificity, and sensitivity. Regarding accuracy and sensitivity, SVM presented the best outcome. In terms of specificity, LR presented the best results.

Exarchos et al. (2012) used the DT module and association rules for PD diagnosis. The designed approach was part of the PERFORM system, which was used for PD management and therapy. In the suggested approach, a PD patient carried out some primary assessments and the system predicted the future occurrence of the symptoms depending on the primary tests and therapy. Following that, a particular therapy is specified for the PD patient. The experts can explain the predictions by displaying the conditions. This approach was assessed using real-world data and presented an encouraging prediction accuracy (57.1–77.4%). As stated by the authors, the research has one limitation concerning the lack of real-world data.

Deep Learning (DL) techniques, and specifically CNN, have presented encouraging outcomes in many artificial intelligence applications owing to their robust pattern recognition capabilities (Afonso et al., 2019; Pereira et al., 2018; Shinde et al., 2019). Afonso et al. (2019) adopted CNN for modeling signals in PD diagnosis. The authors assessed the designed approach under multiple contexts and reported improved classification accuracy compared to their previous research. Pereira et al. (2018) adopted CNN to extract features from images of handwritten dynamics. The researchers also presented a dataset that contains signals of PD patients and healthy individuals for the research community. Additionally, Shinde et al. (2019) developed a computerized technology to analyze a dataset collected from 100 individuals with various conditions and used CNN to establish biological indicators to classify PD patients from NMS-MRI. The research achieved an accuracy of 80% and made it possible to determine the most discriminative parts of the neuromelanin contrast.

Few researchers have adopted DBN in PD diagnosis and identification. In a similar context to this research, Nilashi et al. (2020b) used DBN and SVR to measure UPDRS. The authors used the Self-Organizing-Map (SOM) technique in the clustering process to enhance the accuracy of the outcome. The approach was assessed using a real-world dataset. Nine segments were identified, and a DBN was designed for each segment. The outcomes of the DBN prediction models were combined by the SVR approach. Additionally, the findings were compared with other approaches (SVR, Neuro-Fuzzy techniques, and supervised learning techniques). The findings indicated that
integrating the clustering, DBN, and SVR presented improved UPDRS predictions compared to other approaches.

Additionally, various approaches have been advanced for remote monitoring of PD progress. Many other researchers have concentrated on detecting the degree of the disease or predicting its severity (Grover et al., 2018; Kim et al., 2018). Several studies have applied feature extraction to speech signals as input data to determine the intensity of the disease (Asgari and Shafran, 2010; Galaz et al., 2016; Grover et al., 2018). For example, Grover et al. (2018) designed a new approach for PD severity identification using NNs. The research reported good accuracy (81.66%) compared to other approaches.

3. Methodology

The main goal of the study is to develop a new approach for PD diagnosis using an ensemble learning approach with the ability of online learning from a large clinical dataset (see Fig. 1). The Deep Belief Network and Neuro-Fuzzy approaches are used to achieve this goal. First, a clustering approach, Expectation-Maximization, is applied to handle the large dataset. Second, Principal Component Analysis (PCA) is used for data noise removal by solving the multi-collinearity issue (Alin, 2010; Graham, 2003; Mansfield and Helms, 1982) within the PD data. Third, the DBN-ANFIS prediction is performed. This accordingly increases the accuracy of the UPDRS prediction on the PD data. We assess our approach on a real-world PD dataset and compare the findings with other PD diagnosis approaches developed by machine learning techniques. The proposed DBN-ANFIS is presented in Fig. 2.

Fig. 2. A general structure of DBN-ANFIS.

The proposed method entails two further steps to handle missing data and perform online learning. For missing data, we have considered K-NN in the proposed method. The method is further developed for incremental learning through Incremental PCA (IPCA) and Incremental DBN-ANFIS. Finally, if new data is available, the method is able to train on the new data and update the previous prediction models through the incremental versions of the techniques. This procedure is provided to reduce the computation time of UPDRS prediction from the features provided in the PD dataset.

4. Mathematical background of EM and DBN

In the following sub-sections, we present the mathematical background of the EM and DBN techniques used in the proposed method.

4.1. Gaussian mixture model (GMM)

As a probabilistic method, clustering with GMM allows modeling each cluster by a parametric Gaussian distribution (Bouman et al., 1997). EM has been an effective clustering technique for different applications (Ambroise et al., 1997; Garriga et al., 2016; Ramirez-Rozo et al., 2012).

A linear superposition of Gaussian components represents the distribution of the full data $f(x)$ as follows:

$$ f(x) = \sum_{j=1}^{J} p_j\, f(x \mid \mu_j, \Sigma_j) = \sum_{j=1}^{J} p_j \frac{\exp\left(-\frac{1}{2}(x-\mu_j)^{T}\Sigma_j^{-1}(x-\mu_j)\right)}{(2\pi)^{d/2}\,|\Sigma_j|^{1/2}} \tag{1} $$

where $\sum_{j=1}^{J} p_j = 1$ and $\int f(x \mid \mu_j, \Sigma_j)\,dx = 1$.

The function $f(x)$ is formed using $J$ distributions, with probability values $p_j$, $j = 1, 2, \ldots, J$. Each value $x$ can be generated by any of the $J$ model distributions. The maximum-likelihood approach may be adopted to estimate the unknown parameters $\Theta = \{\mu_j, \Sigma_j, p_j\}_{j=1}^{J}$ by optimizing the log-likelihood over the set of training inputs $\{x_k\}_{k=1}^{N}$, as follows:

$$ L = \sum_{k=1}^{N} \log\left(\sum_{j=1}^{J} p_j\, f(x_k \mid \mu_j, \Sigma_j)\right) \tag{2} $$

EM can be used to solve this estimation problem with incomplete data: the labels indicating which component generated each training sample are not observed. To calculate the GMM parameters, EM proceeds through the following stages:

1. Determine the initial values $p_j(0)$, $\mu_j(0)$, $\Sigma_j(0)$, $j = 1, 2, \ldots, J$, and calculate the initial log-likelihood:
$$ L(0) = \sum_{k=1}^{N} \log\left(\sum_{j=1}^{J} p_j(0)\, f(x_k \mid \mu_j(0), \Sigma_j(0))\right) \tag{3} $$

2. Expectation stage: calculate

$$ p(j \mid x_k, \Theta(t)) = \frac{p_j(t)\, f(x_k \mid \mu_j(t), \Sigma_j(t))}{\sum_{l=1}^{J} p_l(t)\, f(x_k \mid \mu_l(t), \Sigma_l(t))}, \quad j = 1, 2, \ldots, J,\; k = 1, 2, 3, \ldots, N \tag{4} $$

3. Find the new values in the maximization stage.

$$
\begin{aligned}
p_j(t+1) &= \frac{1}{N}\sum_{k=1}^{N} p(j \mid x_k, \Theta(t)) \\
\mu_j(t+1) &= \frac{\sum_{k=1}^{N} x_k\, p(j \mid x_k, \Theta(t))}{\sum_{k=1}^{N} p(j \mid x_k, \Theta(t))} \\
\Sigma_j(t+1) &= \frac{\sum_{k=1}^{N} p(j \mid x_k, \Theta(t))\,(x_k - \mu_j(t+1))(x_k - \mu_j(t+1))^{T}}{\sum_{k=1}^{N} p(j \mid x_k, \Theta(t))}
\end{aligned} \tag{5}
$$

4. Calculate the new log-likelihood value in the convergence check stage.

$$ L(t+1) = \sum_{k=1}^{N} \log\left(\sum_{j=1}^{J} p_j(t+1)\, f(x_k \mid \mu_j(t+1), \Sigma_j(t+1))\right) \tag{6} $$

5. Go back to the second stage if $|L(t+1) - L(t)| > \delta$ for a predefined threshold $\delta$; else terminate.

4.2. Rule of a DBN

4.2.1. RBM

RBM is a stochastic Neural Network (NN) (Shao et al., 2016; Wang and Shang, 2014). Fig. 3 presents the structure of RBM, in which two layers, a visible layer $v = \{v_i \mid i = 1, 2, 3, \ldots, m\}$ and a hidden layer $h = \{h_j \mid j = 1, 2, 3, \ldots, n\}$, construct the NN. The connections between the binary nodes are restricted in the RBM: every visible node is connected to every hidden node, but nodes within the same layer (v or h) are not connected.

As an energy-based method, the energy equation of the RBM is represented by weights and biases as follows:

$$ E(v, h) = -\sum_{j=1}^{n}\sum_{i=1}^{m} w_{ij} h_j v_i - \sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j \tag{7} $$

In the v layer, $v_i$ is the visible node $i$, whereas, in the hidden layer h, $h_j$ is the hidden node $j$. The weight between $v_i$ and $h_j$ is represented by $w_{ij}$. $a_i$ and $b_j$ are the bias values of $v_i$ and $h_j$, respectively. The equation implies that the joint distribution of $v$ and $h$ is described as follows:

$$ p(v, h) = \frac{\exp(-E(v, h))}{Z} \tag{8} $$

$Z$ is the normalizing constant, which is computed using the following equation:

$$ Z = \sum_{v}\sum_{h} \exp(-E(v, h)) \tag{9} $$

Because the nodes within the h layer or within the v layer are not connected, the conditional probabilities of $h_j$ and $v_i$ factorize and can be calculated as follows:

$$
\begin{aligned}
p(h_j = 1 \mid v) &= \mathrm{sigmoid}\left(\sum_{i=1}^{m} w_{ij} v_i + b_j\right) \\
p(v_i = 1 \mid h) &= \mathrm{sigmoid}\left(\sum_{j=1}^{n} w_{ij} h_j + a_i\right)
\end{aligned} \tag{10}
$$

The sigmoid function is calculated using the following equation:

$$ \mathrm{sigmoid}(x) = \frac{1}{1 + \exp(-x)} \tag{11} $$
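As an illustration of Eqs. (10) and (11), the following minimal NumPy sketch computes the two conditional probabilities of a binary RBM and performs one Gibbs step v -> h -> v'. It is not the authors' MATLAB implementation; the function names, the toy layer sizes (m = 4 visible and n = 3 hidden nodes), and the random weights are assumptions made only for this example.

```python
import numpy as np

def sigmoid(x):
    """Logistic function of Eq. (11)."""
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v, W, b):
    """p(h_j = 1 | v) for every hidden node j (first line of Eq. (10)).

    v: (m,) visible vector, W: (m, n) weights w_ij, b: (n,) hidden biases.
    """
    return sigmoid(v @ W + b)

def p_v_given_h(h, W, a):
    """p(v_i = 1 | h) for every visible node i (second line of Eq. (10))."""
    return sigmoid(W @ h + a)

def gibbs_step(v, W, a, b, rng):
    """One full Gibbs step v -> h -> v', sampling binary states."""
    ph = p_h_given_v(v, W, b)
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = p_v_given_h(h, W, a)
    v_rec = (rng.random(pv.shape) < pv).astype(float)
    return h, v_rec

# Toy sizes and values (assumed): m = 4 visible nodes, n = 3 hidden nodes.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 3))
a = np.zeros(4)
b = np.zeros(3)
v0 = np.array([1.0, 0.0, 1.0, 1.0])
h0, v1 = gibbs_step(v0, W, a, b, rng)
```

Because the nodes inside a layer are not connected, each conditional is a single matrix-vector product followed by an element-wise sigmoid, which is what makes block Gibbs sampling in an RBM cheap.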
To estimate the parameters $\{w, a, b\}$, conjugate gradient descent is used to increase the log-likelihood of the data used to train the NN. The updates of $\{w, a, b\}$ can be computed as follows:

$$
\begin{aligned}
\Delta w_{ij} &= \frac{\partial \ln P(v)}{\partial w_{ij}} \approx \eta\left(\langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{rec}\right) \\
\Delta a_i &= \frac{\partial \ln P(v)}{\partial a_i} \approx \eta\left(\langle v_i\rangle_{data} - \langle v_i\rangle_{rec}\right) \\
\Delta b_j &= \frac{\partial \ln P(v)}{\partial b_j} \approx \eta\left(\langle h_j\rangle_{data} - \langle h_j\rangle_{rec}\right)
\end{aligned} \tag{12}
$$

$\eta$ is the learning rate; $\langle\cdot\rangle_{data}$ denotes the expectation under the input data distribution; $\langle\cdot\rangle_{rec}$ is the expectation under the reconstructed input data distribution (Yu and Yan, 2019). Still, the expectation $\langle\cdot\rangle_{rec}$ cannot be calculated efficiently (Hinton and Salakhutdinov, 2007). To conduct RBM training effectively, the CD algorithm is used to update the parameters (Hinton, 2002). The updates for $\{w, a, b\}$ can be calculated from the following equations:
$$
\begin{aligned}
\Delta w_{ij} &\approx \eta\left(\langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{k}\right) \\
\Delta a_i &\approx \eta\left(\langle v_i\rangle_{data} - \langle v_i\rangle_{k}\right) \\
\Delta b_j &\approx \eta\left(\langle h_j\rangle_{data} - \langle h_j\rangle_{k}\right)
\end{aligned} \tag{13}
$$

where $\langle\cdot\rangle_{k}$ denotes the expectation over the distribution of samples obtained after $k$ steps of Gibbs sampling. In general, the CD algorithm uses a single full step of Gibbs sampling to reconstruct the data (Hinton and Salakhutdinov, 2007). In implementation, the training input is partitioned into mini-batches, which improves the efficiency of training. The update formulas of the RBM parameters may be represented as:

$$
\begin{aligned}
w_{ij}^{\,n+1} &= w_{ij}^{\,n} + \Delta w_{ij}^{\,n} + m\,\Delta w_{ij}^{\,n-1} \\
a_{i}^{\,n+1} &= a_{i}^{\,n} + \Delta a_{i}^{\,n} + m\,\Delta a_{i}^{\,n-1} \\
b_{j}^{\,n+1} &= b_{j}^{\,n} + \Delta b_{j}^{\,n} + m\,\Delta b_{j}^{\,n-1}
\end{aligned} \tag{14}
$$

Here $n$ indexes the current training batch and $m$ denotes the momentum, which is adopted to speed up learning (Hinton, 2012).

Fig. 3. Restricted Boltzmann Machine.

4.2.2. GRBM

Gaussian RBM can be used to present a closed-form depiction of the distribution of the training data (Fischer and Igel, 2014). Besides, GRBM can be used to represent high-dimensional real-valued data. Several studies have deployed GRBM in contexts such as digit recognition (Bengio, 2009) and image recognition (Hinton and Salakhutdinov, 2006). In this study, the nodes in the v layer and h layer of the RBM are binary and stochastic. In GRBM, the binary visible values are replaced by real values with independent Gaussian noise. The energy function of GRBM is expressed as:

$$ E(v, h) = \sum_{i=1}^{m} \frac{(v_i - a_i)^2}{2\sigma_i^{2}} - \sum_{j=1}^{n}\sum_{i=1}^{m} \frac{w_{ij} h_j v_i}{\sigma_i} - \sum_{j=1}^{n} b_j h_j \tag{15} $$

where $\sigma_i$ and $a_i$ are respectively the standard deviation and mean of the Gaussian noise added to the input data. Accordingly, in GRBM, the conditional distributions of the v and h layers are transformed to:

$$
\begin{aligned}
p(h_j = 1 \mid v) &= \mathrm{sigmoid}\left(\sum_{i=1}^{m} \frac{w_{ij} v_i}{\sigma_i} + b_j\right) \\
p(v_i = x \mid h) &= N\left(\sigma_i \sum_{j=1}^{n} w_{ij} h_j + a_i,\; \sigma_i^{2}\right)
\end{aligned} \tag{16}
$$

where $x$ indicates the real value of the input data and $N(\mu, \sigma^2)$ indicates a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. The value $\sigma^2$ is treated as a trainable parameter, so that its best value is learned. The update rules for $\{w, a, b, \sigma\}$ can be expressed as:

$$
\begin{aligned}
\Delta w_{ij} &= \frac{\partial \ln P(v)}{\partial w_{ij}} \approx \eta\left(\left\langle \frac{v_i h_j}{\sigma_i}\right\rangle_{data} - \left\langle \frac{v_i h_j}{\sigma_i}\right\rangle_{k}\right) \\
\Delta a_i &= \frac{\partial \ln P(v)}{\partial a_i} \approx \eta\left(\left\langle \frac{v_i}{\sigma_i^{2}}\right\rangle_{data} - \left\langle \frac{v_i}{\sigma_i^{2}}\right\rangle_{k}\right) \\
\Delta b_j &= \frac{\partial \ln P(v)}{\partial b_j} \approx \eta\left(\langle h_j\rangle_{data} - \langle h_j\rangle_{k}\right) \\
\Delta \sigma_i &= \frac{\partial \ln P(v)}{\partial \sigma_i} \approx \eta\left(\left\langle \frac{(v_i - a_i)^2}{\sigma_i^{3}} - \sum_{j=1}^{n} \frac{w_{ij} h_j v_i}{\sigma_i^{2}}\right\rangle_{data} - \left\langle \frac{(v_i - a_i)^2}{\sigma_i^{3}} - \sum_{j=1}^{n} \frac{w_{ij} h_j v_i}{\sigma_i^{2}}\right\rangle_{k}\right)
\end{aligned} \tag{17}
$$

In the same way, the update formulas for the biases $\{a, b\}$ and weights $w$ stay the same as in Eq. (14), and the update formula for $\sigma$ is:

$$ \sigma_{i}^{\,n+1} = \sigma_{i}^{\,n} + \Delta \sigma_{i}^{\,n} + m\,\Delta \sigma_{i}^{\,n-1} \tag{18} $$

RBM and GRBM differ in the method of producing the reconstructions and in the energy function.

4.2.3. DBN architecture

The DBN technique is considered a probabilistic generative model with a deep structure (Hinton et al., 2006). It is trained through greedy layer-wise algorithms in two main stages (Hua et al., 2015), pre-training and fine-tuning (see Fig. 4). In the pre-training stage, RBMs are trained sequentially from the bottom up. The v layer of RBM1 processes the continuous data, and the CD algorithm updates the weights and biases. Once the weights and biases of RBM1 are settled, the output of the h layer in RBM1 is processed by the v layer in RBM2. The remaining RBMs are trained similarly and the procedure continues until the final RBM. This stage is unsupervised. The weights and biases are initialized from the pre-training step of each RBM. Then, as a supervised stage, the back propagation (BP) technique is deployed to adjust all the weights to enhance the resulting accuracy by using the target and DBN outputs. This process continues until the maximum number of epochs is reached.

Fig. 4. Architecture of DBN.

5. Dataset

This research used a real-world data set to assess the proposed hybrid approach. This dataset is widely used for machine learning approaches developed for PD diagnosis (Eskidere et al., 2012; Nilashi et al., 2018b; Tsanas et al., 2009). The dataset consists of 5875 records with 16 features and two outputs, around 200 records per patient for 28 males and 14 females (Tsanas et al., 2009). The features are HNR, NHR, RPDE, DFA, PPE, MDVP:Jitter (Abs), Shimmer:DDA, MDVP:Jitter:PPQ5, MDVP:Shimmer, MDVP:Jitter:RAP, Jitter:DDP, MDVP:Shimmer (dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11 and MDVP:Jitter (%). The outputs of this dataset are Total-UPDRS and Motor-UPDRS, which the proposed method aims to predict. In Fig. 5, the histogram with a normal distribution fit for Motor- and Total-UPDRS is presented.

6. Results

The method was implemented in MATLAB R2013b. The EM clustering was applied to the PD dataset. Selecting the correct number of clusters is critical in any clustering technique. In EM clustering with the Gaussian mixture model, the likelihood needs to be optimized. Thus, the best number of clusters is chosen by assessing different measures across candidate cluster counts. We applied information theoretical criteria such as the Akaike Information Criterion (AIC) (Akaike, 1974) to determine the optimum number of clusters, following Pelleg and Moore (2000). We therefore used the re-substitution AIC estimate to evaluate the data for the optimum number of clusters. Furthermore, we used 10-fold cross-validation in the clustering process to achieve unbiased outcomes. Finally, the best clustering results were obtained for EM with 13 clusters: the optimal criterion value (275755.9052) was obtained when EM generated 13 segments from the PD data. The results of EM are visualized for 13 clusters in Fig. 6. In this figure, the distribution of Total-UPDRS and Motor-UPDRS in the 13 clusters of EM is visualized using PC1 generated by PCA. For ease of visualization, a part of the results for the PD features is presented in Fig. 7 using PC1 and PC2.

After clustering the data through EM, we applied supervised learning with DBN to predict Total- and Motor-UPDRS. The DBN predictive models were trained for each EM cluster. A different number of training epochs was considered in each cluster. In each cluster, we tried 100 DBNs with several epochs ranging from 100 to 1000 and a step size of 100. In every DBN's hidden layer, the number of neurons is set to 100. In
addition, in the DBN’s input layer 16 neurons were used. In ELM there PC1 = 0.777, PC2 = 0.786 and PC3 = 0.0522, the value of 31.5 is ob
were 100 hidden layers. The data were divided into training and test tained for Motor-UPDRS. In addition, if PC1 = 0.352, PC2 = 0.663, and
data in each cluster. In an interval (0, 1), the data was first normalized PC3 = − 0.208, the value of 32.1 is obtained by Total-UPDRS. Note that
using Eq. [19]. the prediction of Motor-UPDRS and Total-UPDRS can be performed
through all PCs generated from the dataset. However, in the results, as
X − min (X)
(19) shown in Fig. 10, only 3 PCs have been considered.
′
X =
Max (X) − min(X)
Choosing the appropriate number of components for PCA is a crucial
The feature extracted by DBN in each cluster was used as the input of issue. The rule proposed by (Cattell, 1966) was used to determine the
the ANFIS model for prediction, and a hybrid method with different most important PCs in the PCA analysis. According to this rule, the
fuzzy membership functions was employed in the ANFIS network to chosen PCs in each cluster are listed in Table 3.
improve the prediction accuracy. For ANFIS, we have considered The proposed method is evaluated using two evaluation metrics,
Gaussian membership functions (Ahani et al., 2021; Kurtulus and Flipo, R2adjusted (adjusted coefficient of determination) and RMSE (Root Mean
2012; Mardani et al., 2018; Nilashi et al., 2017b, 2015). In each cluster, Square Error) as respectively shown in Eq. (20) and Eq. (21). Consid
the DBN-ANFIS was implemented with these membership functions. For ering Pi as real outputs for Motor- and Total-UPDRS, Ak for predicted
each input, three linguistic variables (Low, Moderate, and High) were Motor- and Total-UPDRS, Pm as the mean value of predicted output and
considered (see Fig. 8). Am as actual mean value with m independent variables, RMSE and
We present the results of DBN-ANFIS in Fig. 9 for different segments.
R2adjusted are computed for N samples as follows:
To visualize the results, 3D plots in the ANFIS are generated for every
√̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅
two inputs versus the outputs of the models, Motor-UPDRS and Total- 1 ∑N
UPDRS. These plots show the behavior of the ANFIS models to predict RMSE = (Pi − Ak )2 (20)
N i=1
Motor-UPDRS and Total-UPDRS through the PD features which have
⎛ ⎞
been transformed into the PCs vectors. ∑
N
In Fig. 10, we present the prediction result of Total- and Motor- ⎜
⎜
(Ai − Am )(Pi − Pm ) ⎟(
⎟
)
N− 1
UPDRS based on 3 PCs out of 16 PCs generated by the PCA technique.
i=1
R2adjusted = 1 − ⎜
⎜1 − √̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅ √̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅
⎟
⎟ N− m− 1
These PCs are used instead of the main variables. Similar to the main ⎝ ∑ ⎠
N
(Ai − Am )2 × (Pi − Pm )2
variables, each PC includes membership functions based on three lin i=1
( )
guistic variables in ANFIS modelling. Based on the values of PCs the ( )2 N− 1
Motor-UPDRS and Total-UPDRS are predicted through two fuzzy rules = 1− 1− R
N− m− 1
constructed by training the ANFIS models. It is found that if (21)
7
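As an illustration of Eqs. (19)–(21), the following minimal Python sketch implements min-max normalization, RMSE, and the adjusted coefficient of determination in its second form, 1 − (1 − R²)(N − 1)/(N − m − 1). The function names and the toy numbers are assumptions introduced here for illustration only; they are not taken from the authors' implementation or dataset.

```python
import math


def min_max_normalize(values):
    """Eq. (19): rescale a feature linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant feature")
    return [(v - lo) / (hi - lo) for v in values]


def rmse(actual, predicted):
    """Eq. (20): root mean square error over N paired samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def adjusted_r2(r2, n_samples, n_predictors):
    """Eq. (21), second form: penalize R^2 for the number of predictors m."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need N > m + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Toy values (hypothetical, not from the PD dataset).
scaled = min_max_normalize([2.0, 4.0, 6.0])
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
adj = adjusted_r2(0.9, n_samples=101, n_predictors=3)
print(scaled, round(error, 4), round(adj, 4))
```

With m = 3 predictors, as in the 3-PC models of Fig. 10, the penalty factor (N − 1)/(N − m − 1) approaches 1 as N grows, so the adjusted value stays close to R² on large clusters.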
Fig. 5. Visualizing (a) histogram with a normal distribution fit for Total-UPDRS and Motor-UPDRS and (b) distribution of Total-UPDRS and Motor-UPDRS by PCA.
Fig. 6. Visualizing the EM clustering results for PD dataset.
Fig. 7. PD features in EM clusters.
Fig. 8. Membership functions used in ANFIS models.
Fig. 9. The prediction results in 3D plots in ANFIS.
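The Gaussian membership functions of Fig. 8, with three linguistic variables (Low, Moderate, High) per input, can be sketched as follows. This is a hedged illustration only: the centers and widths in `TERMS`, the helper names, and the weighted-average step (the usual Sugeno-style defuzzification used in ANFIS) are assumptions, not the parameters learned by the authors' models.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x for a fuzzy set with the given center and width."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Hypothetical (center, sigma) pairs for one input normalized to [0, 1].
TERMS = {"Low": (0.0, 0.2), "Moderate": (0.5, 0.2), "High": (1.0, 0.2)}


def fuzzify(x, terms=TERMS):
    """Return the membership degree of x in each linguistic term."""
    return {name: gaussian_mf(x, c, s) for name, (c, s) in terms.items()}


def weighted_average(firing_strengths, rule_outputs):
    """Combine rule outputs by their normalized firing strengths (Sugeno-style)."""
    total = sum(firing_strengths)
    if total == 0:
        raise ValueError("no rule fired")
    return sum(w * y for w, y in zip(firing_strengths, rule_outputs)) / total


degrees = fuzzify(0.1)
print(max(degrees, key=degrees.get))  # the term with the highest membership degree
print(weighted_average([1.0, 3.0], [10.0, 20.0]))
```

In a trained ANFIS, the centers, widths, and rule consequents are fitted to data, so the fixed values above only show the shape of the computation.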
The results of predicting Motor- and Total-UPDRS by DBN-ANFIS are presented for two EM clusters in Fig. 11. According to the results, DBN-ANFIS has accurately predicted the Motor- and Total-UPDRS in the clusters.

In Fig. 12, the RMSE over 40 epochs is displayed for the 13 clusters. In Table 4 and Table 5, we present the results of the proposed method, EM+DBN+ANFIS, along with the other prediction methods, DBN+ANFIS, DBN+SVR, SVR, and ANFIS. The training was performed for 100 epochs in ANFIS. The hybrid approach was used for training. In this experimental evaluation, we have selected the RBF kernel for SVR with the best C and γ values (C = 2^3, γ = 2^−2) obtained by an exhaustive search method. The results are presented for RMSE and adjusted R² for each training and test set, as max, min, and mean values. Note that the average values of R² and RMSE over all clusters are considered in these tables. In the case of Motor-UPDRS, the results demonstrate that EM+DBN+ANFIS (RMSE = 0.537; R² = 0.893) outperforms the other machine learning prediction techniques, DBN+ANFIS (RMSE = 0.894; R² = 0.866), DBN+SVR (RMSE = 0.875; R² = 0.886), SVR (RMSE = 1.313; R² = 0.761) and ANFIS (RMSE = 1.584; R² = 0.702). In the case of Total-UPDRS, it is found that EM+DBN+ANFIS (RMSE = 0.513; R² = 0.913) provides more accurate results compared to DBN+ANFIS (RMSE = 0.880; R² = 0.880), DBN+SVR (RMSE = 0.870; R² = 0.892), SVR (RMSE = 1.297; R² = 0.784) and ANFIS (RMSE = 1.568; R² = 0.722). We also compare the RMSE of the proposed method with the method proposed by Zhao et al. (2015). The authors evaluated their method, HSLSSVR, on the PD dataset used in this study and provided the RMSE results for Motor-UPDRS and Total-UPDRS. It is found that the RMSE values obtained by HSLSSVR are 0.8158 and 0.8004 for Motor-UPDRS and Total-UPDRS, respectively, which are higher than the RMSE values of our method.

Fig. 10. Prediction result of Motor-UPDRS and Total-UPDRS based on 3 PCs.

Table 3
Chosen PCs in each cluster.
Cluster No. PCs
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 PC13 PC14 PC15 PC16
Cluster 1 √ √ √ √ √ √ √ √ √
Cluster 2 √ √ √ √ √
Cluster 3 √ √ √ √ √ √ √ √
Cluster 4 √ √ √ √ √
Cluster 5 √ √ √ √ √ √ √ √
Cluster 6 √ √ √ √ √ √
Cluster 7 √ √ √ √ √ √ √
Cluster 8 √ √ √ √ √ √ √
Cluster 9 √ √ √ √ √ √ √ √
Cluster 10 √ √ √ √ √
Cluster 11 √ √ √ √ √ √ √ √
Cluster 12 √ √ √ √ √
Cluster 13 √ √ √ √ √ √ √ √ √

Fig. 11. Motor- and Total-UPDRS prediction results in two clusters.

We also extended our data analysis in terms of the computation time of the methods. We compared the computation time of the incremental learning technique with that of the non-incremental technique. The incremental EM+DBN+ANFIS is compared with the non-incremental EM+DBN+ANFIS, DBN+SVR and DBN+ANFIS on the same dataset. The results are presented in Fig. 13, where the computation time is plotted as a function of the incremental data ratio. It is found that the incremental EM+DBN+ANFIS method is effective in reducing the computation time relative to the non-incremental EM+DBN+ANFIS, DBN+SVR and DBN+ANFIS methods. The plots in Fig. 13 for both incremental and non-incremental approaches show that the increase of data has less impact on the computation time of the incremental approach. This is because a method that can learn online from the available data does not need to rerun the model on the whole data. Instead, the incremental approach updates the previous prediction model with the newly added data.

7. Conclusion

Machine learning has been effectively utilized in the diagnosis of PD. Accurate prediction of UPDRS is important in the diagnosis of PD. Multiple researchers have concentrated on the development of algorithms to accurately predict the scales of UPDRS, Total-UPDRS, and Motor-UPDRS. In this research, we proposed a novel approach through machine learning techniques for UPDRS prediction. The method was developed using Expectation-Maximization (EM), Deep Learning (DL), and ANFIS techniques. DBN was used to predict UPDRS from the clusters discovered by EM. The method was assessed using real-world PD input data and the outcomes were evaluated and compared with previous PD diagnosis approaches. It was found that this approach was effective in improving the accuracy of UPDRS prediction as well as the time complexity compared to previous approaches that utilize huge input data. The use of the clustering technique, with the aid of an incremental approach, was effective in improving the computation time. In addition, incorporating the DBN and ANFIS led to more accurate prediction of UPDRS than the other approaches, SVR and ANFIS, as measured by the RMSE and adjusted coefficient of determination metrics. The method proposed in this research has proved to be effective in accurate UPDRS prediction. However, it is recommended that future studies further investigate the use of deep learning and clustering techniques for PD diagnosis approaches. In addition, as the majority of previous works have focused on non-incremental approaches, the development of online learning approaches on large PD datasets merits investigation. Accordingly, in a future study on PD diagnosis, we aim to refine the accuracy and computation time of the method through optimization and incremental deep learning techniques for large datasets.
11
Fig. 12. RMSE in 40 epochs for 13 clusters.
Table 4
Adjusted R² and RMSE results for Motor-UPDRS.
Metric      Stat  EM+DBN+ANFIS    DBN+ANFIS       SVR             ANFIS           DBN+SVR
                  Train   Test    Train   Test    Train   Test    Train   Test    Train   Test
RMSE        Max   0.520   0.544   0.875   0.905   1.243   1.368   1.654   1.731   0.862   0.891
            Min   0.512   0.520   0.872   0.882   1.136   1.259   1.356   1.438   0.850   0.859
            Mean  0.516   0.537   0.871   0.894   0.932   1.313   1.505   1.584   0.856   0.875
Adjusted R² Max   0.931   0.914   0.897   0.875   0.881   0.792   0.811   0.765   0.913   0.898
            Min   0.907   0.879   0.869   0.853   0.805   0.731   0.665   0.639   0.891   0.875
            Mean  0.922   0.893   0.877   0.866   0.843   0.761   0.738   0.702   0.902   0.886
Table 5
Adjusted R² and RMSE results for Total-UPDRS.
Metric      Stat  EM+DBN+ANFIS    DBN+ANFIS       SVR             ANFIS           DBN+SVR
                  Train   Test    Train   Test    Train   Test    Train   Test    Train   Test
RMSE        Max   0.512   0.521   0.874   0.886   1.229   1.353   1.641   1.715   0.864   0.869
            Min   0.495   0.512   0.865   0.876   1.121   1.241   1.341   1.421   0.862   0.871
            Mean  0.506   0.513   0.866   0.880   1.175   1.297   1.491   1.568   0.863   0.870
Adjusted R² Max   0.943   0.928   0.915   0.897   0.889   0.815   0.834   0.785   0.921   0.912
            Min   0.925   0.894   0.891   0.861   0.827   0.753   0.687   0.659   0.909   0.872
            Mean  0.937   0.913   0.909   0.880   0.858   0.784   0.761   0.722   0.915   0.892
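To make the size of the gaps in Tables 4 and 5 concrete, the short sketch below computes the relative reduction in mean test RMSE achieved by EM+DBN+ANFIS over each baseline. The RMSE values are the mean test figures reported in the tables; the dictionary layout and function names are assumptions introduced here for illustration.

```python
# Mean test RMSE values as reported in Table 4 (Motor-UPDRS) and Table 5 (Total-UPDRS).
TEST_RMSE = {
    "Motor-UPDRS": {
        "EM+DBN+ANFIS": 0.537, "DBN+ANFIS": 0.894, "DBN+SVR": 0.875,
        "SVR": 1.313, "ANFIS": 1.584,
    },
    "Total-UPDRS": {
        "EM+DBN+ANFIS": 0.513, "DBN+ANFIS": 0.880, "DBN+SVR": 0.870,
        "SVR": 1.297, "ANFIS": 1.568,
    },
}


def relative_reduction(baseline, proposed):
    """Fraction by which the proposed method lowers the baseline RMSE."""
    return (baseline - proposed) / baseline


def reductions_vs_proposed(table, proposed="EM+DBN+ANFIS"):
    """Relative RMSE reduction of the proposed method against every other method."""
    result = {}
    for target, scores in table.items():
        best = scores[proposed]
        result[target] = {
            method: relative_reduction(score, best)
            for method, score in scores.items()
            if method != proposed
        }
    return result


for target, gaps in reductions_vs_proposed(TEST_RMSE).items():
    for method, gap in gaps.items():
        print(f"{target}: {gap:.1%} lower RMSE than {method}")
```

A relative measure of this kind is only a reading aid for the tables; it does not replace a statistical comparison across clusters.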
CRediT authorship contribution statement

Mehrbakhsh Nilashi: Supervision, Conceptualization, Methodology, Investigation, Software, Data curation, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Rabab Ali Abumalloh: Conceptualization, Methodology, Investigation, Software, Data curation, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Salma Yasmin Mohd Yusuf: Investigation, Writing – original draft, Writing – review & editing, Validation. Ha Hang Thi: Conceptualization, Writing – review & editing, Validation. Mohammad Alsulami: Investigation, Writing – review & editing, Visualization. Hamad Abosaq: Investigation, Writing – review & editing, Visualization. Sultan Alyami: Investigation, Writing – review & editing, Visualization. Abdullah Alghamdi: Investigation, Writing – review & editing, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Fig. 13. Computation time for incremental and non-incremental approaches.
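The reason an incremental method scales better in Fig. 13 is that it updates an existing model from new samples instead of refitting on all data. The sketch below illustrates that principle on the simplest possible case, a running mean and variance (Welford's algorithm). It is not the authors' incremental EM+DBN+ANFIS procedure; the class name and the sample values are assumptions for illustration.

```python
class RunningStats:
    """Mean and population variance updated one sample at a time (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        """Fold one new sample into the statistics in O(1), without revisiting old data."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        if self.count == 0:
            raise ValueError("no samples seen yet")
        return self._m2 / self.count


stats = RunningStats()
for value in [1.0, 2.0, 3.0, 4.0]:  # hypothetical stream of newly arriving samples
    stats.update(value)
print(stats.mean, stats.variance)
```

Each update costs the same regardless of how much data has already been seen, whereas a batch recomputation grows with the total data size; the same trade-off underlies the incremental and non-incremental curves compared in Fig. 13.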
Data Availability

Data will be made available on request.

Acknowledgements

The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Collaboration Funding program grant code NU/RC/SERC/11/13.

References

Abdar, M., Zomorodi-Moghadam, M., 2018. Impact of patients’ gender on parkinson’s disease using classification algorithms. J. AI Data Min. 6 (2), 277–285.
Afonso, L.C., Rosa, G.H., Pereira, C.R., Weber, S.A., Hook, C., Albuquerque, V.H.C., Papa, J.P., 2019. A recurrence plot-based approach for Parkinson’s disease identification. Future Gener. Comput. Syst. 94, 282–292.
Ahani, A., Nilashi, M., Zogaan, W.A., Samad, S., Aljehane, N.O., Alhargan, A., Mohd, S., Ahmadi, H., Sanzogni, L., 2021. Evaluating medical travelers’ satisfaction through online review analysis. J. Hosp. Tour. Manag. 48, 519–537.
Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control 19 (6), 716–723.
Al-Fatlawi, A.H., Jabardi, M.H., Ling, S.H., 2016. Efficient diagnosis system for Parkinson’s disease using deep belief network, 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE 1324–1330.
Alin, A., 2010. Multicollinearity. Wiley Interdiscip. Rev. Comput. Stat. 2 (3), 370–374.
Ambroise, C., Dang, M., Govaert, G., 1997. Clustering of spatial data by the EM algorithm, geoENV I—Geostatistics for environmental applications. Springer, pp. 493–504.
Arji, G., Ahmadi, H., Nilashi, M., Rashid, T.A., Ahmed, O.H., Aljojo, N., Zainol, A., 2019. Fuzzy logic approach for infectious disease diagnosis: A methodical evaluation, literature and classification. Biocybern. Biomed. Eng. 39 (4), 937–955.
Asgari, M., Shafran, I., 2010. Extracting cues from speech for predicting severity of parkinson’s disease, 2010 IEEE International Workshop on Machine Learning for Signal Processing. IEEE 462–467.
Avci, D., Dogantekin, A., 2016. An expert diagnosis system for parkinson disease based on genetic algorithm-wavelet kernel-extreme learning machine. Parkinson’s Dis. 2016.
Babu, G.S., Suresh, S., 2013. Parkinson’s disease prediction using gene expression–a projection based learning meta-cognitive neural classifier approach. Expert Syst. Appl. 40 (5), 1519–1529.
Behroozi, M., Sami, A., 2016. A multiple-classifier framework for Parkinson’s disease detection based on various vocal tests. Int. J. Telemed. Appl. 2016.
Bengio, Y., 2009. Learning Deep Architectures for AI. Now Publishers Inc.
Bhattacharya, I., Bhatia, M.P.S., 2010. SVM classification to distinguish Parkinson disease patients, Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India, pp. 1–6.
Bouman, C.A., Shapiro, M., Cook, G., Atkins, C.B., Cheng, H., 1997. Cluster: An unsupervised algorithm for modeling Gaussian mixtures.
Buza, K., Varga, N.Á., 2016. Parkinsonet: estimation of updrs score using hubness-aware feedforward neural networks. Appl. Artif. Intell. 30 (6), 541–555.
Camps, J., Sama, A., Martin, M., Rodriguez-Martin, D., Perez-Lopez, C., Arostegui, J.M.M., Cabestany, J., Catala, A., Alcaine, S., Mestre, B., 2018. Deep learning for freezing of gait detection in Parkinson’s disease patients in their homes using a waist-worn inertial measurement unit. Knowl.-Based Syst. 139, 119–131.
Cattell, R.B., 1966. The scree test for the number of factors. Multivar. Behav. Res. 1 (2), 245–276.
Chen, H.-L., Huang, C.-C., Yu, X.-G., Xu, X., Sun, X., Wang, G., Wang, S.-J., 2013. An efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest neighbor approach. Expert Syst. Appl. 40 (1), 263–271.
Das, R., 2010. A comparison of multiple classification methods for diagnosis of Parkinson disease. Expert Syst. Appl. 37 (2), 1568–1572.
El Maachi, I., Bilodeau, G.-A., Bouachir, W., 2020. Deep 1D-Convnet for accurate Parkinson disease detection and severity prediction from gait. Expert Syst. Appl. 143, 113075.
Eskidere, Ö., Ertaş, F., Hanilçi, C., 2012. A comparison of regression methods for remote tracking of Parkinson’s disease progression. Expert Syst. Appl. 39 (5), 5523–5528.
Exarchos, T.P., Tzallas, A.T., Baga, D., Chaloglou, D., Fotiadis, D.I., Tsouli, S., Diakou, M., Konitsiotis, S., 2012. Using partial decision trees to predict Parkinson’s symptoms: A new approach for diagnosis and therapy in patients suffering from Parkinson’s disease. Comput. Biol. Med. 42 (2), 195–204.
Ferreira, M.I.A., Barbieri, F.A., Moreno, V.C., Penedo, T., Tavares, J.M.R., 2022. Machine learning models for Parkinson’s disease detection and stage classification based on spatial-temporal gait parameters. Gait Posture 98, 49–55.
Fischer, A., Igel, C., 2014. Training restricted Boltzmann machines: an introduction. Pattern Recognit. 47 (1), 25–39.
Foroughi, B., Nhan, P.V., Iranmanesh, M., Ghobakhloo, M., Nilashi, M., Yadegaridehkordi, E., 2023. Determinants of intention to use autonomous vehicles: findings from PLS-SEM and ANFIS. J. Retail. Consum. Serv. 70, 103158.
Froelich, W., Wrobel, K., Porwik, P., 2015. Diagnosis of Parkinson’s disease using speech samples and threshold-based classification. J. Med. Imaging Health Inform. 5 (6), 1358–1363.
Galaz, Z., Mzourek, Z., Mekyska, J., Smekal, Z., Kiska, T., Rektorova, I., Orozco-Arroyave, J.R., Daoudi, K., 2016. Degree of Parkinson’s disease severity estimation based on speech signal processing, 2016 39th International Conference on Telecommunications and Signal Processing (TSP). IEEE, pp. 503–506.
Ganji, M.F., Abadeh, M.S., 2011. A fuzzy classification system based on Ant Colony Optimization for diabetes disease diagnosis. Expert Syst. Appl. 38 (12), 14650–14659.
Garriga, J., Palmer, J.R., Oltra, A., Bartumeus, F., 2016. Expectation-maximization binary clustering for behavioural annotation. PLoS One 11 (3), e0151984.
Graham, M.H., 2003. Confronting multicollinearity in ecological multiple regression. Ecology 84 (11), 2809–2815.
Grover, S., Bhartia, S., Yadav, A., Seeja, K., 2018. Predicting severity of Parkinson’s disease using deep learning. Procedia Comput. Sci. 132, 1788–1794.
Gunduz, H., 2019. Deep learning-based Parkinson’s disease classification using vocal feature sets. IEEE Access 7, 115540–115551.
Guo, L., Hao, J.-h, Liu, M., 2014. An incremental extreme learning machine for online sequential learning problems. Neurocomputing 128, 50–58.
Guo, P.-F., Bhattacharya, P., Kharma, N., 2010. Advances in detecting Parkinson’s disease, International Conference on Medical Biometrics. Springer, pp. 306–314.
Hariharan, M., Polat, K., Sindhu, R., 2014. A new hybrid intelligent system for accurate detection of Parkinson’s disease. Comput. Methods Prog. Biomed. 113 (3), 904–913.
Hinton, G.E., 2002. Training products of experts by minimizing contrastive divergence. Neural Comput. 14 (8), 1771–1800.
Hinton, G.E., 2009. Deep belief networks. Scholarpedia 4 (5), 5947.
Hinton, G.E., 2012. A practical guide to training restricted Boltzmann machines. Neural networks: Tricks of the trade. Springer, pp. 599–619.
Hinton, G.E., Salakhutdinov, R.R., 2006. Reducing the dimensionality of data with neural networks. Science 313 (5786), 504–507.
Hinton, G.E., Salakhutdinov, R.R., 2007. Using deep belief nets to learn covariance kernels for Gaussian processes. Adv. Neural Inf. Process. Syst. 20, 1249–1256.
Hinton, G.E., Osindero, S., Teh, Y.-W., 2006. A fast learning algorithm for deep belief nets. Neural Comput. 18 (7), 1527–1554.
Hua, Y., Guo, J., Zhao, H., 2015. Deep belief networks and deep learning, Proceedings of 2015 International Conference on Intelligent Computing and Internet of Things. IEEE, pp. 1–4.
Jain, S., Shetty, S., 2016. Improving accuracy in noninvasive telemonitoring of progression of Parkinson’s Disease using two-step predictive model, 2016 Third International Conference on Electrical, Electronics, Computer Engineering and their Applications (EECEA). IEEE, pp. 104–109.
Jang, J.-S., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst., Man, Cybern. 23 (3), 665–685.
Johri, A., Tripathi, A., 2019. Parkinson Disease Detection Using Deep Neural Networks, 2019 Twelfth International Conference on Contemporary Computing (IC3). IEEE, pp. 1–4.
Khan, M.M., Chalup, S.K., Mendes, A., 2016. Parkinson’s disease data classification using evolvable wavelet neural networks, Australasian Conference on Artificial Life and Computational Intelligence. Springer, pp. 113–124.
Kim, H.B., Lee, W.W., Kim, A., Lee, H.J., Park, H.Y., Jeon, H.S., Kim, S.K., Jeon, B., Park, K.S., 2018. Wrist sensor-based tremor severity quantification in Parkinson’s disease using convolutional neural network. Comput. Biol. Med. 95, 140–146.
Kim, J.J., Bandres-Ciga, S., Blauwendraat, C., Gan-Or, Z., Consortium, I.Ps.D.G., 2020. No genetic evidence for involvement of alcohol dehydrogenase genes in risk for Parkinson’s disease. Neurobiol. Aging 87 (140), e119–140 e122.
Kishore, P., Kumari, C.U., Kumar, M., Pavani, T., 2020. Detection and analysis of Alzheimer’s disease using various machine learning algorithms. Mater. Today. Proc.
Koivu, A., Sairanen, M., Airola, A., Pahikkala, T., Leung, W.-c, Lo, T.-k, Sahota, D.S., 2021. Adaptive risk prediction system with incremental and transfer learning. Comput. Biol. Med. 138, 104886.
Krohn, L., Grenn, F.P., Makarious, M.B., Kim, J.J., Bandres-Ciga, S., Roosen, D.A., Gan-Or, Z., Nalls, M.A., Singleton, A.B., Blauwendraat, C., 2020. Comprehensive assessment of PINK1 variants in Parkinson’s disease. Neurobiol. Aging.
Kurtulus, B., Flipo, N., 2012. Hydraulic head interpolation using anfis—model selection and sensitivity analysis. Comput. Geosci. 38 (1), 43–51.
Lee, S., Chang, K., Baek, J.-G., 2021. Incremental learning using generative-rehearsal strategy for fault detection and classification. Expert Syst. Appl. 184, 115477.
Li, D.-C., Liu, C.-W., Hu, S.C., 2011. A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. Artif. Intell. Med. 52 (1), 45–52.
Li, Q., Xiong, Q., Ji, S., Yu, Y., Wu, C., Gao, M., 2021. Incremental semi-supervised Extreme Learning Machine for Mixed data stream classification. Expert Syst. Appl. 185, 115591.
Little, M., McSharry, P., Hunter, E., Spielman, J., Ramig, L., 2008. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. Nat. Preced. 1-1.
Majdinasab, F., Karkheiran, S., Moradi, N., Shahidi, G.A., Salehi, M., 2012. Relation between Voice Handicap Index (VHI) and disease severity in Iranian patients with Parkinson’s disease. Med. J. Islam. Repub. Iran. 26 (4), 157.
Mansfield, E.R., Helms, B.P., 1982. Detecting multicollinearity. Am. Stat. 36 (3a), 158–160.
Mardani, A., Streimikiene, D., Nilashi, M., Arias Aranda, D., Loganathan, N., Jusoh, A., 2018. Energy consumption, economic growth, and CO2 emissions in G20 countries: application of adaptive neuro-fuzzy inference system. Energies 11 (10), 2771.
Movement Disorder Society Task Force on Rating Scales for Parkinson’s Disease, 2003. The unified Parkinson’s disease rating scale (UPDRS): status and recommendations. Mov. Disord. 18 (7), 738–750.
Naranjo, L., Pérez, C.J., Campos-Roca, Y., Martín, J., 2016. Addressing voice recording replications for Parkinson’s disease detection. Expert Syst. Appl. 46, 286–292.
Nilashi, M., Ibrahim, O., Ahani, A., 2016. Accuracy improvement for predicting Parkinson’s disease progression. Sci. Rep. 6 (1), 1–18.
Nilashi, M., Ibrahim, O.B., Ithnin, N., Zakaria, R., 2015. A multi-criteria recommendation system using dimensionality reduction and Neuro-Fuzzy techniques. Soft Comput. 19 (11), 3173–3207.
Nilashi, M., bin Ibrahim, O., Ahmadi, H., Shahmoradi, L., 2017a. An analytical method for diseases prediction using machine learning techniques. Comput. Chem. Eng. 106, 212–223.
Nilashi, M., Bin Ibrahim, O., Mardani, A., Ahani, A., Jusoh, A., 2018a. A soft computing approach for diabetes disease classification. Health Inform. J. 24 (4), 379–393.
Nilashi, M., Ibrahim, O., Ahmadi, H., Shahmoradi, L., Farahmand, M., 2018b. A hybrid intelligent system for the prediction of Parkinson’s Disease progression using machine learning techniques. Biocybern. Biomed. Eng. 38 (1), 1–15.
Nilashi, M., Dalvi-Esfahani, M., Ibrahim, O., Bagherifard, K., Mardani, A., Zakuan, N., 2017b. A soft computing method for the prediction of energy performance of residential buildings. Measurement 109, 268–280.
Nilashi, M., Ibrahim, O., Samad, S., Ahmadi, H., Shahmoradi, L., Akbari, E., 2019. An analytical method for measuring the Parkinson’s disease progression: A case on a Parkinson’s telemonitoring dataset. Measurement 136, 545–557.
Nilashi, M., Ahmadi, H., Manaf, A.A., Rashid, T.A., Samad, S., Shahmoradi, L., Aljojo, N., Akbari, E., 2020a. Coronary heart disease diagnosis through self-organizing map and fuzzy support vector machine with incremental updates. Int. J. Fuzzy Syst. 22 (4), 1376–1388.
Nilashi, M., Ahmadi, H., Sheikhtaheri, A., Naemi, R., Alotaibi, R., Alarood, A.A., Munshi, A., Rashid, T.A., Zhao, J., 2020b. Remote Tracking of Parkinson’s Disease Progression Using Ensembles of Deep Belief Network and Self-Organizing Map. Expert Syst. Appl., 113562.
Nilashi, M., Ahmadi, H., Sheikhtaheri, A., Naemi, R., Alotaibi, R., Alarood, A.A., Munshi, A., Rashid, T.A., Zhao, J., 2020c. Remote tracking of Parkinson’s disease progression using ensembles of deep belief network and self-organizing map. Expert Syst. Appl. 159, 113562.
Ortiz, A., Murcia, F.J.M., Munilla, J., Gorriz, J.M., Ramirez, J., 2019. Label aided deep ranking for the automatic diagnosis of Parkinsonian syndromes. Neurocomputing 330, 162–171.
Ozcift, A., 2012. SVM feature selection based rotation forest ensemble classifiers to improve computer-aided diagnosis of Parkinson disease. J. Med. Syst. 36 (4), 2141–2147.
Pelleg, D., Moore, A.W., 2000. X-means: Extending k-means with efficient estimation of the number of clusters. ICML 727–734.
Pereira, C.R., Pereira, D.R., Rosa, G.H., Albuquerque, V.H., Weber, S.A., Hook, C., Papa, J.P., 2018. Handwritten dynamics assessment through convolutional neural networks: An application to Parkinson’s disease identification. Artif. Intell. Med. 87, 67–77.
Peterek, T., Dohnálek, P., Gajdoš, P., Šmondrk, M., 2013. Performance evaluation of Random Forest regression model in tracking Parkinson’s disease progress, 13th International Conference on Hybrid Intelligent Systems (HIS 2013). IEEE, pp. 83–87.
Polat, K., 2012. Classification of Parkinson’s disease using feature weighting method on the basis of fuzzy C-means clustering. Int. J. Syst. Sci. 43 (4), 597–609.
Prashanth, R., Roy, S.D., 2018a. Early detection of Parkinson’s disease through patient questionnaire and predictive modelling. Int. J. Med. Inform. 119, 75–87.
Prashanth, R., Roy, S.D., 2018b. Novel and improved stage estimation in Parkinson’s disease using clinical scales and machine learning. Neurocomputing 305, 78–103.
Ramirez-Rozo, T.J., Garcia-Alvarez, J.C., Castellanos-Dominguez, C., 2012. Infrared thermal image segmentation using expectation-maximization-based clustering, 2012 XVII Symposium of Image, Signal Processing, and Artificial Vision (STSIVA). IEEE, pp. 223–226.
Razali, R., Ahmad, F., Abd Rahman, F.N., Midin, M., Sidi, H., 2011. Burden of care among caregivers of patients with Parkinson disease: a cross-sectional study. Clin. Neurol. Neurosurg. 113 (8), 639–643.
Sakar, C.O., Kursun, O., 2010. Telediagnosis of Parkinson’s disease using measurements of dysphonia. J. Med. Syst. 34 (4), 591–599.
Sapir, S., Pawlas, A., Ramig, L., Countryman, S., O’Brien, C., Hoehn, M., Thompson, L., 1999. Speech and voice abnormalities in Parkinson disease: relation to severity of motor impairment, duration of disease, medication, depression, gender and age. NCVS Status Prog. Rep. 14, 149–161.
Senturk, Z.K., 2020. Early diagnosis of Parkinson’s disease using machine learning algorithms. Med. Hypotheses 138, 109603.
Shao, S., Sun, W., Wang, P., Gao, R.X., Yan, R., 2016. Learning features from vibration signals for induction motor fault diagnosis, 2016 International Symposium on Flexible Automation (ISFA). IEEE, pp. 71–76.
Sharma, P., Choudhary, K., Gupta, K., Chawla, R., Gupta, D., Sharma, A., 2020. Artificial plant optimization algorithm to detect heart rate & presence of heart disease using machine learning. Artif. Intell. Med. 102, 101752.
Shinde, S., Prasad, S., Saboo, Y., Kaushick, R., Saini, J., Pal, P.K., Ingalhalikar, M., 2019. Predictive markers for Parkinson’s disease using deep neural nets on neuromelanin sensitive MRI. Neuroimage: Clin. 22, 101748.
Tsanas, A., Little, M.A., McSharry, P.E., Ramig, L.O., 2009. Accurate telemonitoring of Parkinson’s disease progression by noninvasive speech tests. IEEE Trans. Biomed. Eng. 57 (4), 884–893.
Valla, E., Nõmm, S., Medijainen, K., Taba, P., Toomela, A., 2022. Tremor-related feature engineering for machine learning based Parkinson’s disease diagnostics. Biomed. Signal Process. Control 75, 103551.
Wan, S., Liang, Y., Zhang, Y., Guizani, M., 2018. Deep multi-layer perceptron classifier for behavior analysis to estimate parkinson’s disease severity using smartphones. IEEE Access 6, 36825–36833.
Wang, D., Shang, Y., 2014. A new active labeling method for deep learning, 2014 International joint conference on neural networks (IJCNN). IEEE, pp. 112–119.
Wiwatcharakoses, C., Berrar, D., 2021. A self-organizing incremental neural network for continual supervised learning. Expert Syst. Appl. 185, 115662.
Yadav, G., Kumar, Y., Sahoo, G., 2012. Predication of Parkinson’s disease using data mining methods: A comparative analysis of tree, statistical and support vector machine classifiers, 2012 National Conference on Computing and Communication Systems. IEEE, pp. 1–8.
Yadegaridehkordi, E., Hourmand, M., Nilashi, M., Alsolami, E., Samad, S., Mahmoud, M., Alarood, A.A., Zainol, A., Majeed, H.D., Shuib, L., 2020. Assessment of sustainability indicators for green building manufacturing using fuzzy multi-criteria decision making approach. J. Clean. Prod. 277, 122905.
Yu, J., Yan, X., 2019. Active features extracted by deep belief network for process monitoring. ISA Trans. 84, 247–261.
Zadeh, L.A., 1965. Fuzzy sets. Inf. Control 8 (3), 338–353.
Zhao, Y.-P., Li, B., Li, Y.-B., Wang, K.-K., 2015. Householder transformation based sparse least squares support vector regression. Neurocomputing 161, 243–253.
