
International Journal of Information Management Data Insights

A Novel Diagnosis System for Parkinson Disease Based on Ensemble Random Forest
--Manuscript Draft--

Manuscript Number:

Full Title: A Novel Diagnosis System for Parkinson Disease Based on Ensemble Random Forest

Article Type: Full Length Article

Keywords: Parkinson Disease; Ensemble; Random Forest; Machine Learning Classifiers

Corresponding Author: Avijit Kumar Chaudhuri, Ph.D.
Techno Engineering College Banipur
Habra, WEST BENGAL INDIA

Corresponding Author Secondary Information:

Corresponding Author's Institution: Techno Engineering College Banipur

Corresponding Author's Secondary Institution:

First Author: Avijit Kumar Chaudhuri, Ph.D.

First Author Secondary Information:

Order of Authors: Avijit Kumar Chaudhuri, Ph.D.

Order of Authors Secondary Information:

Abstract: Diagnosing Parkinson's disease (PD) is an important concern in healthcare
and machine learning research. This study aims to develop a PD prediction model.
It proposes an ensemble random forest (ERF) model, which combines several
classification approaches to achieve this goal. The proposed ERF classifier is
evaluated on the PD dataset from the machine learning repository of the University
of California, Irvine (UCI). It is also compared with several state-of-the-art
machine-learning classifiers: random forest, naive Bayes, support vector machine
with a radial basis function kernel, and decision tree. The effectiveness of the
ERF classifier is assessed with several performance indicators (accuracy,
sensitivity, specificity, F-measure, receiver operating characteristic, and area
under the curve) and with a statistical test, the kappa statistic. The ERF model
shows its potential in the classification results, with a 96% accuracy rate.

Suggested Reviewers: Dr. Deepankar Sinha, PHD


Indian Institute of Foreign Trade - Kolkata Campus
dsinha2000@gmail.com
NA

SANKHAYAN CHOUDHURI
University of Calcutta Rashbehari Siksha Prangan: University of Calcutta - Rajabazar
Science College Campus
sankhayan@gmail.com
NA

Additional Information:

Question Response

Cover Letter

To
The Editor-in-Chief

Dear Editor-in-Chief,
I am submitting the manuscript for consideration of publication in your esteemed journal.
The manuscript is entitled “A Novel Diagnosis System for Parkinson Disease Based on
Ensemble Random Forest”.

I confirm that it has not been published elsewhere.


The author is:
Avijit Kumar Chaudhuri; e-mail: c.avijit@gmail.com
Thanks and Regards
Dr. Avijit Kumar Chaudhuri - Corresponding author
Title Page

A Novel Diagnosis System for Parkinson Disease Based on Ensemble Random Forest

Dr. Avijit Kumar Chaudhuri

Computer Science and Engineering, Techno Engineering College Banipur, Kolkata, India

c.avijit@gmail.com
1. Introduction
Parkinson's disease (PD) is a serious health problem that affects individuals worldwide. It is the prototypical adult-onset neurodegenerative disorder, first described by Dr. James Parkinson as the shaking palsy (Parkinson, 2002). It is a chronic illness that progressively degrades the patient's health. Dopamine, a neurotransmitter produced by neurons, is necessary for regulating many motor and non-motor processes in the human body. In PD, certain clusters of neurons lose the ability to produce dopamine. As PD advances, the quantity of dopamine in the nervous system diminishes, leaving the person unable to move properly (Langston, 2002; Meissner et al., 2011). Tremors in one hand, foot, or leg are often the first symptoms of PD. Other symptoms include slow movement, stiffness, impaired balance, difficulty standing, reduced facial expression, coordination problems, difficulty thinking, understanding, and writing, a distorted sense of smell, dribbling of urine, impaired voice, soft speech, and voice box spasms (Langston, 2002; Meissner et al., 2011; Gallagher et al., 2010; Mittel, 2003). Of these patients, 90% have vocal dysfunction and trouble speaking or communicating (Schley et al., 1982). Careful analysis of the voice of PD patients with modern signal processing algorithms therefore assists in diagnosing the disease and tracking its progress.

In the absence of known risk factors, diagnosis is especially difficult, and consistent results across doctors are hard to achieve. It is difficult to determine the risk of PD or to make a diagnosis on the basis of risk factors alone (Akyol, 2017). Machine learning is useful in this situation because of its ability to collect, store, and process data in order to reveal patterns and provide insights. Machine-learning techniques can predict risk early in the course of PD. Any error in disease prediction, especially a Type II error (a patient with PD classified as healthy), has the potential to be fatal. Performance measures beyond accuracy, such as consistency, sensitivity, and specificity, are therefore important in such studies. Several machine-learning techniques for disease prediction in general, and for PD in particular, fail to meet these additional performance criteria.

This paper proposes an Ensemble Random Forest (ERF) classifier. It is compared with several state-of-the-art machine-learning classifiers, namely Logistic Regression (LR), Naïve Bayes (NB), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF), as well as with earlier research on the same dataset, as shown in Table 8. The author compares several performance measures and statistical tests (accuracy, sensitivity, specificity, ROC, AUC, and the kappa statistic) on 50–50%, 66–34%, and 80–20% splits of training and testing data and on 20-fold cross-validation.

The goal is to find answers to the following research questions:

Research Question 1: Is the suggested ERF classifier suitable for PD prediction?

Research Question 2: Does the suggested ERF classifier fulfill the added sensitivity and specificity criteria?

Research Question 3: Is the suggested ERF classifier consistent and statistically significant across the dataset's various training and testing partitions?

Figure 1 depicts the flowchart of the experimental design and model construction.

Figure 1. PD prediction model

2. Related Work

Advances in technology are making new methods for disease diagnosis and detection possible (Ray & Chaudhuri, 2021). Researchers have proposed machine learning, expert system, and soft computing approaches in practically every field of medicine (Chaudhuri et al., 2021; Chaudhuri et al., 2021). The literature offers various ways of diagnosing PD through suitable modeling of voice and speech datasets. Preprocessing, feature extraction, and classification are the critical phases of these computerized classification systems. Little et al. (2008) used a kernel support vector machine (SVM) with a feature selection approach to diagnose PD and achieved a promising accuracy of 91.4%. Shahbaba & Neal (2009) proposed nonlinear models using Dirichlet process mixtures for PD diagnosis. Psorakis et al. (2010) employed improved multi-class multi-kernel relevance vector machines (mRVMs) and attained 89.47% accuracy using ten-fold cross-validation. Guo et al. (2010) suggested a model based on genetic programming and the expectation-maximization technique. Sakar & Kursun (2010) built a model using the mutual information measure and support vector machines. Das (2010) conducted a comparative analysis of four classification models and found that the artificial neural network-based model had the highest accuracy (92.9%). Luukka (2011) used fuzzy entropy measures and a similarity classifier; that study produced an average accuracy of 85.03% using a 50/50 training and testing split. Ozcift & Gulten (2011) suggested a correlation-based feature selection (CFS) technique with rotation forest ensemble classifiers for the diagnosis of PD. To improve classification performance on relatively small datasets, Li et al. (2011) presented a fuzzy-based nonlinear transformation approach using PCA and SVM. To demonstrate the performance of their technique, they applied it to six medical datasets, including a PD dataset.

Using a parallel artificial neural network architecture, Åström & Koker (2011) achieved 91.20% classification accuracy. Spadoto et al. (2011) used evolutionary-based feature selection strategies to increase the accuracy of an optimum-path forest classifier for PD diagnosis. Polat (2012) used a fuzzy c-means (FCM) clustering-based feature weighting approach and a kNN classifier to achieve 97.93% accuracy. Daliri (2013) used a support vector machine with a chi-square distance kernel and achieved 91.20% accuracy. Chen et al. (2013) employed PCA and a fuzzy kNN system; with 10-fold cross-validation they attained a promising accuracy of 96.07%. Zuo et al. (2013) used an evolutionary approach (particle swarm optimization) to improve the performance of a fuzzy k-nearest neighbor classifier, achieving an accuracy of 97.47%. Zhang (2017) presented PD diagnosis using time-frequency characteristics, stacked autoencoders (SAE), and kNN classifiers.

3. Methodology
3.1. Dataset

Max Little of the University of Oxford created the dataset in collaboration with the National Centre for Voice and Speech in Denver, Colorado, which recorded the speech signals. The original paper reported the feature extraction approaches for general voice disorders. The dataset contains biomedical voice measurements from 31 people, 23 of whom have PD. Each column in the table is a particular voice measure, and each row corresponds to one of the 195 voice recordings made by these people ("name" column), as described in Table 1. The primary goal of the data is to distinguish healthy people from those with PD using the "status" column, which is set to 0 for healthy people and 1 for those with PD (Little et al., 2007).

The data is stored in ASCII CSV format. Each row in the CSV file corresponds to a single voice recording. There are approximately six recordings per patient, with the patient's name in the first column.

Table 1. Description of the dataset

Sl. No | Attribute | Description | Range of Values | Mean | Std. Dev
1 | name | Patient name in ASCII and recording number | - | - | -
2 | MDVP:Fo(Hz) | Vocal fundamental frequency on average | 88.333 - 260.105 | 154.228641 | 41.39006475
3 | MDVP:Fhi(Hz) | Maximum fundamental frequency of the voice | 102.145 - 592.03 | 197.1049179 | 91.49154764
4 | MDVP:Flo(Hz) | Minimum fundamental frequency of the voice | 65.476 - 239.17 | 116.3246308 | 43.52141318
5 | MDVP:Jitter(%) | Several measures of fundamental frequency variation (rows 5-9) | 0.00168 - 0.03316 | 0.00622 | 0.004848
6 | MDVP:Jitter(Abs) | | 0.000007 - 0.00026 | 4.40E-05 | 3.48E-05
7 | MDVP:RAP | | 0.00068 - 0.02144 | 0.00330641 | 0.002967774
8 | MDVP:PPQ | | 0.00092 - 0.01958 | 0.003446359 | 0.002758977
9 | Jitter:DDP | | 0.00204 - 0.06433 | 0.009919949 | 0.008903344
10 | MDVP:Shimmer | Several amplitude variation measures (rows 10-15) | 0.00954 - 0.11908 | 0.029709128 | 0.018856932
11 | MDVP:Shimmer(dB) | | 0.085 - 1.302 | 0.282251282 | 0.19487729
12 | Shimmer:APQ3 | | 0.00455 - 0.05647 | 0.015664154 | 0.010153162
13 | Shimmer:APQ5 | | 0.0057 - 0.0794 | 0.017878256 | 0.012023706
14 | MDVP:APQ | | 0.00719 - 0.13778 | 0.024081487 | 0.016946736
15 | Shimmer:DDA | | 0.01364 - 0.16942 | 0.046992615 | 0.030459119
16 | NHR | Two measurements of the noise-to-tonal component ratio in the voice (rows 16-17) | 0.00065 - 0.31482 | 0.024847077 | 0.040418449
17 | HNR | | 8.441 - 33.047 | 21.88597436 | 4.425764269
18 | RPDE | Two measurements of nonlinear dynamical complexity (rows 18-19) | 0.25657 - 0.685151 | 0.498535538 | 0.103941714
19 | D2 | | 1.423287 - 3.671155 | 2.381826087 | 0.382799047
20 | DFA | Exponent of signal fractal scaling | 0.574282 - 0.825288 | 0.718099046 | 0.05533583
21 | spread1 | Three nonlinear measurements of fundamental frequency fluctuation (rows 21-23) | (-7.964984) - (-2.434031) | -5.684396744 | 1.090207764
22 | spread2 | | 0.006274 - 0.450493 | 0.226510349 | 0.083405763
23 | PPE | | 0.044539 - 0.527367 | 0.206551641 | 0.090119322
24 | status | Patients' health conditions: 1 - PD, 0 - healthy | | |

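Because each person contributes roughly six recordings, one precaution when partitioning such data is to keep all recordings of a speaker on the same side of the split. The manuscript does not describe how recordings were assigned to partitions, so the following is only a minimal sketch under stated assumptions: the recording-name pattern (`S01_1`, speaker ID before the last underscore) and the helper names `subject_of` and `split_by_subject` are hypothetical and not taken from the dataset or the paper.

```python
import random
from collections import defaultdict


def subject_of(recording_name):
    # Hypothetical naming rule: the text before the last underscore is the speaker ID.
    return recording_name.rsplit("_", 1)[0]


def split_by_subject(names, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with no speaker on both sides."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, name in enumerate(names):
        groups[subject_of(name)].append(idx)
    subjects = sorted(groups)
    rng.shuffle(subjects)
    # Hold out whole speakers, never individual recordings.
    n_test = max(1, round(test_fraction * len(subjects)))
    test = [i for s in subjects[:n_test] for i in groups[s]]
    train = [i for s in subjects[n_test:] for i in groups[s]]
    return train, test


# Toy data: 5 hypothetical speakers with 6 recordings each.
names = [f"S{p:02d}_{r}" for p in range(1, 6) for r in range(1, 7)]
train_idx, test_idx = split_by_subject(names, test_fraction=0.2)
```

With the toy names above, one whole speaker (six recordings) is held out and the remaining four speakers form the training side.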
3.2. Algorithm for the ERF and Stacking

Stacking combines the strengths of several state-of-the-art classifiers to obtain a prediction accuracy better than that of any individual classifier in the ensemble. Stacking divides the training dataset into as many subsets as there are classifiers in the ensemble, and each classifier trains on its own non-overlapping subset. Each classifier is given a relative weight based on the fit it obtains, and a meta RF classifier is formed (Sikora, 2015; Bhasuran et al., 2016). The proposed ERF technique employs RF as the final meta-classifier and trains an RF classifier on each non-overlapping subset, which is produced by stratified random sampling (to ensure an equal class distribution across subsets). The working of the proposed classifier is illustrated in Figure 2.

Figure 2. Working of ERF classifier

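The partitioning step described above, non-overlapping subsets drawn by stratified random sampling, can be sketched as follows. This is an illustrative sketch rather than the author's implementation: the function name `stratified_partitions`, the round-robin dealing strategy, and the toy label counts are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_partitions(labels, k, seed=0):
    """Split sample indices into k disjoint subsets with similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    subsets = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices of this class round-robin so each subset gets its share.
        for pos, idx in enumerate(indices):
            subsets[pos % k].append(idx)
    return subsets


# Toy labels (not the real dataset): 12 positive and 6 negative samples.
labels = [1] * 12 + [0] * 6
subsets = stratified_partitions(labels, k=3)
```

Each of the three subsets then holds four positive and two negative samples, so a base classifier trained on any one subset sees the same class ratio as the full training set.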
Suppose we have a set of training data $\{(x_j, y_j)\}_{j=1}^{N}$, where $x_j \in \mathbb{R}^Q$ and $y_j \in \{-1, 1\}$, and suppose we are given a potentially large number of weak classifiers, denoted $f_m(x) \in \{-1, 1\}$, and a 0-1 loss function $I$, defined as

\[ I(f_m(x_j), y_j) = \begin{cases} 0 & \text{if } f_m(x_j) = y_j \\ 1 & \text{if } f_m(x_j) \neq y_j \end{cases} \]

Then the AdaBoost algorithm can be illustrated as follows:

for j = 1 to N: $w_j^{(1)} = 1$
for m = 1 to M do
    Fit weak classifier m (random forest) to minimize the objective function
        \[ \epsilon_m = \frac{\sum_{j=1}^{N} w_j^{(m)} \, I(f_m(x_j) \neq y_j)}{\sum_{j=1}^{N} w_j^{(m)}} \]
    where $I(f_m(x_j) \neq y_j) = 1$ if $f_m(x_j) \neq y_j$ and 0 otherwise
        \[ \alpha_m = \ln \frac{1 - \epsilon_m}{\epsilon_m} \]
    for all j do
        \[ w_j^{(m+1)} = w_j^{(m)} \, e^{\alpha_m I(f_m(x_j) \neq y_j)} \]
    end for
end for

After learning, the final classifier is a weighted linear combination of the weak classifiers:

\[ g(x) = \operatorname{sign}\left( \sum_{m=1}^{M} \alpha_m f_m(x) \right) \]

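The boosting loop above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the author's code: to keep it self-contained, one-feature decision stumps are swapped in for the random forest weak learner, and all function names (`fit_stump`, `stump_predict`, `adaboost`, `predict`) and the toy data are assumptions.

```python
import math


def fit_stump(X, y, w):
    """Return the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[f] <= t else -pol for row in X]
                err = sum(wj for wj, p, yj in zip(w, pred, y) if p != yj)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best[1:]


def stump_predict(stump, row):
    f, t, pol = stump
    return pol if row[f] <= t else -pol


def adaboost(X, y, M=10):
    w = [1.0] * len(X)                      # w_j^(1) = 1
    ensemble = []
    for _ in range(M):
        stump = fit_stump(X, y, w)
        miss = [stump_predict(stump, row) != yj for row, yj in zip(X, y)]
        eps = sum(wj for wj, m in zip(w, miss) if m) / sum(w)
        eps = min(max(eps, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = math.log((1 - eps) / eps)
        # Misclassified samples are up-weighted by exp(alpha); others keep their weight.
        w = [wj * math.exp(alpha * m) for wj, m in zip(w, miss)]
        ensemble.append((alpha, stump))
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data with labels in {-1, 1}.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [-1, -1, 1, 1]
ensemble = adaboost(X, y, M=5)
predictions = [predict(ensemble, row) for row in X]
```

Replacing `fit_stump` with any learner that accepts per-sample weights would give the same loop structure; the weight update and the final sign rule are unchanged.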
Table 2: Performance evaluation metrics and statistical tests

Metrics | Formula/Description
Accuracy | (TP + TN) / (TP + TN + FP + FN)
Sensitivity | TP / (TP + FN)
Specificity | TN / (TN + FP)
F-Score | 2 × Precision × Sensitivity / (Precision + Sensitivity), where Precision = TP / (TP + FP)
Kappa statistics | (PRa - PRac) / (1 - PRac), where 'PRa' represents total agreement probability and 'PRac' represents probability 'by chance'
Receiver operating characteristic (ROC) | ROC is plotted between Sensitivity and (1 - Specificity). The area under the curve (AUC) measures the degree to which the curve is up in the northwest corner (Vandewiele et al., 2021).

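As a small worked example of the confusion-matrix metrics in Table 2, the sketch below computes accuracy, sensitivity, specificity, and F-score from the four counts using the standard definitions. The function name `confusion_metrics` and the counts are illustrative assumptions, not results from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Made-up counts for illustration only.
m = confusion_metrics(tp=20, fp=10, tn=15, fn=5)
```

For these counts the accuracy is 0.7, the sensitivity 0.8, the specificity 0.6, and the F-score 40/55, about 0.727.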
4. Results and Discussion

The author developed and simulated the proposed model in the Python programming language. The study compares five state-of-the-art machine-learning algorithms, namely LR, RF, NB, SVM, and DT, with the proposed model. Some of these five popular techniques show better accuracy than others. To boost the accuracy and performance of the weak classifiers, the author used ensemble machine learning and proposed an ensemble meta-algorithmic technique (Figure 2).

The main aim of the experimentation is to distinguish healthy people from people with PD. For this purpose, the machine-learning techniques were applied to the UCI dataset of biomedical voice measurements named “Oxford Parkinson’s Disease Detection (OPD)”, on which the proposed model showed 96% accuracy. The performance evaluation metrics used in this study are presented in Table 2.

As shown in Table 3, different approaches yielded different levels of accuracy: NB recorded an accuracy of 70% under 20-fold cross-validation, while the proposed ERF reached 96% on the 50-50 partition.

Table 3. Comparison of accuracies

Training-Testing Partition | LR | NB | DT | SVM | RF | ERF
50-50 | 0.90 | 0.75 | 0.86 | 0.92 | 0.90 | 0.96
66-34 | 0.84 | 0.90 | 0.85 | 0.85 | 0.94 | 0.93
80-20 | 0.87 | 0.69 | 0.92 | 0.87 | 0.92 | 0.95
20-fold cross-validation | 0.82 | 0.70 | 0.79 | 0.84 | 0.79 | 0.88

A confusion matrix presents the counts of actual and predicted classifications produced by a classification system. The performance of such systems is generally assessed using the data in this matrix. Table 4 shows the results derived from the confusion matrices of the different machine-learning algorithms. The performance of the proposed model, along with that of the other methods, was evaluated on sensitivity, specificity, and accuracy, which are defined in terms of the true positive (TP), true negative (TN), false negative (FN), and false positive (FP) counts.

Table 4. Comparison of sensitivity and specificity (Sens. = sensitivity, Spec. = specificity)

Training-Testing | LR Sens. | LR Spec. | NB Sens. | NB Spec. | DT Sens. | DT Spec. | SVM Sens. | SVM Spec. | RF Sens. | RF Spec. | ERF Sens. | ERF Spec.
50-50 | 0.89 | 0.93 | 0.96 | 0.48 | 0.91 | 0.70 | 0.90 | 1 | 0.93 | 0.78 | 0.96 | 0.95
66-34 | 0.86 | 0.75 | 0.92 | 0.47 | 0.90 | 0.71 | 0.87 | 0.77 | 0.94 | 0.93 | 0.94 | 0.88
80-20 | 0.89 | 0.75 | 0.92 | 0.33 | 0.94 | 0.83 | 0.89 | 0.75 | 0.94 | 0.83 | 0.94 | 1
20-fold cross-validation | 0.82 | 0.85 | 0.70 | 0.97 | 0.79 | 0.87 | 0.84 | 0.86 | 0.89 | 0.94 | 0.88 | 0.87

The sensitivity and specificity results in Table 4 demonstrate the potential of the proposed model in two-class classification. The comparison of the proposed model with other widely used independent classification techniques is shown in Figure 6. The comparative results show that the proposed classification technique attains the highest accuracy, sensitivity, and specificity values for the PD dataset (accuracy 96% and sensitivity 0.96 on the 50-50 partition, specificity 1 on the 80-20 partition).

Taking the best value of each measure across the partitions in Tables 3 and 4, the classification accuracy of LR was 90%, with 89% sensitivity and 93% specificity. RF achieved a classification accuracy of 94%, with 94% sensitivity and 94% specificity. The classification accuracy of SVM was 92%, with 90% sensitivity and 100% specificity. The best performance of the six classifiers evaluated, however, was that of the proposed classifier, which achieved 96% classification accuracy, with 96% sensitivity and 100% specificity. Table 3 and Table 4 show the complete set of results.

The analytical results of the suggested classifier are equally acceptable. The improved specificity and sensitivity with which the proposed classifier predicts PD is a significant outcome. The NB classifier achieved 96% sensitivity and 97% specificity, but its high sensitivity came with low specificity on the same partition (0.96 and 0.48 on the 50-50 partition, Table 4); the other classifiers produced lower sensitivity and specificity than ERF. The suggested classifier removed that disadvantage, yielding best sensitivity and specificity values of 96% and 100% respectively. The improved specificity implies that fewer healthy people would be referred for further PD testing. At the same time, the greater sensitivity would save money and minimize the waiting periods of patients who are really ill, both of which would be crucial in saving lives.

Table 5. Comparison of F-Score

Training-Testing Partition | LR | NB | DT | SVM | RF | ERF
50-50 | 0.94 | 0.81 | 0.91 | 0.95 | 0.93 | 0.97
66-34 | 0.90 | 0.78 | 0.90 | 0.90 | 0.96 | 0.95
80-20 | 0.93 | 0.79 | 0.95 | 0.93 | 0.95 | 0.97
20-fold cross-validation | 0.80 | 0.78 | 0.80 | 0.82 | 0.90 | 0.90

The F-measure is a commonly used metric in information retrieval and in class imbalance problems, and it has been studied by numerous researchers. Its fundamental advantage is that it compares classifier performance in terms of recall (the true positive rate) and precision, using a factor that adjusts their relative importance (Soleymani, Granger & Fumera, 2020). The F-measure results in Table 5 show the potential of the suggested model in two-class classification. Table 5 compares the proposed model with several frequently applied independent classification algorithms. The comparison shows that the suggested classification algorithm has the highest F-measure value (97%) for the PD dataset.

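For reference, the weighting factor mentioned above is conventionally written as $\beta$ in the general F-measure. The manuscript does not state which value was used; the block below gives the standard definition and assumes the balanced case $\beta = 1$ for the F-scores in Table 5.

```latex
F_\beta = \frac{(1 + \beta^2)\, P \, R}{\beta^2 P + R},
\qquad P = \frac{TP}{TP + FP},
\qquad R = \frac{TP}{TP + FN}
% With \beta = 1 this reduces to the harmonic mean of precision and recall:
% F_1 = \frac{2 P R}{P + R}
```

Values of $\beta$ above 1 weight recall more heavily, which may suit screening settings where a missed PD case is the costlier error.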
The ROC charts for these experiments with the individual machine-learning techniques are summarized in Table 6. The table contains six ROC charts, one per classifier, drawn in blue for 20-fold cross-validation. The experimental results show that, in terms of cross-validation accuracy, the proposed classifier outperformed most of the previously used methods discussed in the literature review. With the proposed model, the AUC value reaches 0.95.

Table 6. Comparison of AUC

Training-Testing Partition | LR | NB | DT | SVM | RF | ERF
20-fold cross-validation | [ROC charts for the six classifiers]
AUC | 0.84 | 0.82 | 0.73 | 0.80 | 0.93 | 0.95

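An AUC value such as those in Table 6 can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way from labels and scores; it is an illustration of the definition only, and the function name `auc_from_scores` and the four toy scores are assumptions.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two negatives and two positives with classifier scores.
auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Here three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75. The pairwise loop is quadratic in the number of samples, which is acceptable for a dataset of this size.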
Comparing the performance of different machine-learning classifiers may give an ambiguous result if the comparison is based only on accuracy-based metrics. Cohen’s Kappa Statistic (CKS) is used to support a sound comparison of the efficiency of different classifiers (Chaudhuri, Banerjee & Das, 2021). The cost of error must be considered in such evaluations. In this respect, the CKS is an excellent measure for detecting classifications that may be due to chance. The CKS usually takes a value between -1 and +1. The closer a classifier’s kappa value is to 1, the less its agreement with the true labels can be attributed to chance. The CKS is therefore a recommended metric in the performance analysis of classifiers (Ben-David, 2008). The kappa value is calculated using Equation 1:

\[ \mathrm{CKS} = \frac{PR_a - PR_{ac}}{1 - PR_{ac}} \tag{1} \]

where $PR_a$ represents the total agreement probability and $PR_{ac}$ represents the probability of agreement 'by chance'.

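Equation 1 can be checked on a small confusion matrix. In the sketch below, the observed agreement is the accuracy and the chance agreement is computed from the row and column totals, which is the usual construction for Cohen's kappa; the function name `cohen_kappa` and the counts are illustrative assumptions, not values from the study.

```python
def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa for a 2x2 confusion matrix, following Equation 1."""
    total = tp + fp + tn + fn
    pr_a = (tp + tn) / total                     # observed agreement
    # Chance agreement: product of actual and predicted marginals, per class.
    pr_ac = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    return (pr_a - pr_ac) / (1 - pr_ac)


# Made-up counts for illustration only.
kappa = cohen_kappa(tp=20, fp=10, tn=15, fn=5)
```

For these counts the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is 0.4, noticeably lower than the raw accuracy.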
The results of the CKS analysis of the five popular machine-learning techniques and the proposed model are shown in Table 7. The proposed model attains the highest kappa value overall (0.88, on the 50-50 partition) and leads on the 50-50 and 80-20 partitions, while RF is slightly higher on the 66-34 partition and under 20-fold cross-validation.

Table 7. Kappa statistic for each model

Training-Testing | LR | NB | DT | SVM | RF | ERF
50-50 | 0.68 | 0.46 | 0.60 | 0.74 | 0.72 | 0.88
66-34 | 0.52 | 0.40 | 0.61 | 0.57 | 0.84 | 0.80
80-20 | 0.48 | 0.28 | 0.72 | 0.48 | 0.72 | 0.80
20-fold cross-validation | 0.40 | 0.48 | 0.43 | 0.45 | 0.70 | 0.66

Table 8. PD dataset performance comparison (× = not reported)

Study, Year | Method | Classification Accuracy (%) | Sensitivity/Specificity | F-Measure | Kappa | ROC/AUC
Çağlar et al., 2010 | ANFC-LH | 94.72 | 0.88/1 | × | × | ×
Çağlar et al., 2010 | MLPNN | 89.69 | 0.88/0.96 | × | × | ×
Çağlar et al., 2010 | RBFNN | 87.63 | 0.88/0.96 | × | × | ×
Ene, 2008 | PNN | 81.28 | × | × | × | ×
Avci & Dogantekin, 2016 | GA-WK-ELM | 96.81 | 0.95/0.98 | × | × | ×
Cai et al., 2018 | CBFO-FKNN | 96.97 | 0.97/0.99 | × | × | 0.98
Ozcift & Gulten, 2011 | CFS-RF | 87.10 | × | × | × | 0.86
Cai et al., 2017 | SVM-BFO | 97.42 | 0.99/0.92 | × | × | ×
Chen et al., 2013 | PCA-FKNN | 96.07 | 0.96/0.96 | × | × | 0.96
Kadam & Jadhav, 2019 | FESA-DNN | 92.19 | 0.97/0.90 | × | × | ×
This study | ERF | 96 | 0.96/1 | 0.97 | 0.88 | 0.95

5. Conclusion

PD is one of the major causes of mortality worldwide, particularly in low- and middle-income nations. Detecting PD requires a series of medical tests and their interpretation by experts. When the findings of medical tests disagree, making a diagnosis becomes difficult, and the consistency of diagnosis among physicians may suffer. In this case, machine learning can identify patterns and give insights. Machine-learning approaches can identify the risk of PD at an early stage based on elements of regular lifestyles and the results of a few medical tests.

The suggested ensemble random forest classifier, like the stacking-ensemble approach, combines the predictions of random forest classifiers, using RF also as the meta-classifier. The PD dataset contains biomedical voice measurements from 31 people, 23 of whom have PD. Each column in the table is a distinct voice measure, and each row corresponds to one of these people's 195 voice recordings ("name" column).

The classes in the dataset are distributed unequally. Such unbalanced datasets increase the risk of a sick patient being misdiagnosed as healthy (which is very severe) and decrease the likelihood of a healthy patient being misdiagnosed.

The author compares the proposed classifier with two parametric classifiers (LR and NB) and three non-parametric classifiers (SVM with radial basis kernel, DT, and RF). The comparison is based on accuracy, sensitivity, specificity, F-measure, ROC, AUC, and the kappa statistic. The author uses 50–50, 66–34, and 80–20% train-test splits, as well as 20-fold cross-validation. The author confirms that the training and testing datasets do not contain identical records of PD patients. Non-parametric classifiers outperform parametric classifiers (whose learning is bound by their assumptions), and among the five baseline classifiers the Random Forest classifier performs best. The suggested classifier outperforms the Random Forest classifier in terms of accuracy, with 96% accuracy and the lowest false positives/negatives overall.
7
8
9
10
11
12
13
14
15 Akyol, K. (2017). A study on the diagnosis of Parkinson’s disease using digitized wacom
16
17 graphics tablet dataset. Int J Inf Technol Comput Sci, 9, 45-51.
18
19 American Parkinson Disease Association (APDA), Symptoms of Parkinson’s disease. https://
20
21 Åström, F., & Koker, R. (2011). A parallel neural network approach to prediction of Parkinson’s
22
23 Disease. Expert systems with applications, 38(10), 12470-12474.
24
25
Avci, D., & Dogantekin, A. (2016). An expert diagnosis system for parkinson disease based on
26 genetic algorithm-wavelet kernel-extreme learning machine. Parkinson’s disease, 2016.
27
28 Ben-David, A. (2008). Comparison of classification accuracy using Cohen’s weighted
29
30 kappa.Expert Systems with Applications, 34(2), 825–832.
31
32 Bhasuran, B., Murugesan, G., Abdulkadhar, S., & Natarajan, J. (2016). Stacked ensemble
33
34 combined with fuzzy matching for biomedical named entity recognition of diseases.
35
Journal of biomedical informatics, 64, 1-9.
36
37 ÇAĞLAR, M. F., ÇETİŞLİ, B., & Toprak, I. B. (2010). Automatic recognition of Parkinson’s
38
39 disease from sustained phonation tests using ANN and adaptive neuro-fuzzy classifier.
40
41 Mühendislik Bilimleri ve Tasarım Dergisi, 1(2), 59-64.
42
43 Cai, Z., Gu, J., Wen, C., Zhao, D., Huang, C., Huang, H., ... & Chen, H. (2018). An intelligent
44
45 Parkinson’s disease diagnostic system based on a chaotic bacterial foraging optimization
46 enhanced fuzzy KNN approach. Computational and mathematical methods in medicine,
47
48 2018.
49
50 Chaudhuri, A. K., Banerjee, D. K., & Das, A. (2021). A Dataset Centric Feature Selection and
51
52 Stacked Model to Detect Breast Cancer. International Journal of Intelligent Systems and
53
54 Applications (IJISA), 13(4), 24-37.
55
56 Chaudhuri, A. K., Sinha, D., Banerjee, D. K., & Das, A. (2021). A novel enhanced decision tree
57 model for detecting chronic kidney disease. Network Modeling Analysis in Health
58
59 Informatics and Bioinformatics, 10(1), 1-22.
60
61
62
63
64
65
1
2
3
4 Chaudhuri, A.K., Ray, A., Banerjee, D.K., & Das, A. (2021). A Multi-Stage Approach
5
6 Combining Feature Selection with Machine Learning Techniques for Higher Prediction
7
8 Reliability and Accuracy in Cervical Cancer Diagnosis. International Journal of
9
10 Intelligent Systems and Applications.
11
12 Chen, H. L., Huang, C. C., Yu, X. G., Xu, X., Sun, X., Wang, G., & Wang, S. J. (2013). An
13
14
efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest
15 neighbor approach. Expert systems with applications, 40(1), 263-271.
16
17 Chen, H. L., Huang, C. C., Yu, X. G., Xu, X., Sun, X., Wang, G., & Wang, S. J. (2013). An
18
19 efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest
20
21 neighbor approach. Expert systems with applications, 40(1), 263-271.
22
23 Daliri, M. R. (2013). Chi-square distance kernel of the gaits for the diagnosis of Parkinson's
24
25
disease. Biomedical Signal Processing and Control, 8(1), 66-70.
Das, R. (2010). A comparison of multiple classification methods for diagnosis of Parkinson disease. Expert Systems with Applications, 37(2), 1568-1572.
Ene, M. (2008). Neural network-based approach to discriminate healthy people from those with Parkinson's disease. Annals of the University of Craiova-Mathematics and Computer Science Series, 35, 112-116.
Gallagher, D. A., Lees, A. J., & Schrag, A. (2010). What are the most important nonmotor symptoms in patients with Parkinson's disease and are we missing them? Movement Disorders, 25(15), 2493-2500.
Guo, P. F., Bhattacharya, P., & Kharma, N. (2010, June). Advances in detecting Parkinson’s disease. In International Conference on Medical Biometrics (pp. 306-314). Springer, Berlin, Heidelberg.
Kadam, V. J., & Jadhav, S. M. (2019). Feature ensemble learning based on sparse autoencoders for diagnosis of Parkinson’s disease. In Computing, communication and signal processing (pp. 567-581). Springer, Singapore.
Langston, J. W. (2002). Parkinson’s disease: current and future challenges. Neurotoxicology, 23(4-5), 443-450.
Li, D. C., Liu, C. W., & Hu, S. C. (2011). A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. Artificial Intelligence in Medicine, 52(1), 45-52.
Little, M., McSharry, P., Hunter, E., Spielman, J., & Ramig, L. (2008). Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. Nature Precedings, 1-1.
Little, M., McSharry, P., Roberts, S., Costello, D., & Moroz, I. (2007). Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection. Nature Precedings, 1-1.
Luukka, P. (2011). Feature selection using fuzzy entropy measures with similarity classifier. Expert Systems with Applications, 38(4), 4600-4607.
Meissner, W. G., Frasier, M., Gasser, T., Goetz, C. G., Lozano, A., Piccini, P., ... & Bezard, E. (2011). Priorities in Parkinson's disease research. Nature Reviews Drug Discovery, 10(5), 377-393.
Mittel, C. S. (2003). Parkinson's disease: Overview and current abstracts.
Ozcift, A., & Gulten, A. (2011). Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms. Computer Methods and Programs in Biomedicine, 104(3), 443-451.
Polat, K. (2012). Classification of Parkinson's disease using feature weighting method on the basis of fuzzy C-means clustering. International Journal of Systems Science, 43(4), 597-609.
Psorakis, I., Damoulas, T., & Girolami, M. A. (2010). Multiclass relevance vector machines: sparsity and accuracy. IEEE Transactions on Neural Networks, 21(10), 1588-1598.
Ray, A., & Chaudhuri, A. K. (2021). Smart healthcare disease diagnosis and patient management: Innovation, improvement and skill development. Machine Learning with Applications, 3, 100011.
Sahu, B., & Mohanty, S. N. (2021). CMBA-SVM: a clinical approach for Parkinson disease diagnosis. International Journal of Information Technology, 13(2), 647-655.
Sakar, C. O., & Kursun, O. (2010). Telediagnosis of Parkinson’s disease using measurements of dysphonia. Journal of Medical Systems, 34(4), 591-599.
Schley, W. S., Fenton, E., & Niimi, S. (1982). Vocal symptoms in Parkinson disease treated with levodopa: a case report. Annals of Otology, Rhinology & Laryngology, 91(1), 119-121.
Shahbaba, B., & Neal, R. (2009). Nonlinear models using Dirichlet process mixtures. Journal of Machine Learning Research, 10(8).
Sikora, R. (2015). A modified stacking ensemble machine learning algorithm using genetic algorithms. In Handbook of Research on Organizational Transformations through Big Data Analytics (pp. 43-53). IGI Global.
Soleymani, R., Granger, E., & Fumera, G. (2020). F-measure curves: A tool to visualize classifier performance under imbalance. Pattern Recognition, 100, 107146.
Spadoto, A. A., Guido, R. C., Carnevali, F. L., Pagnin, A. F., Falcão, A. X., & Papa, J. P. (2011, August). Improving Parkinson's disease identification through evolutionary-based feature selection. In 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 7857-7860). IEEE.
Vandewiele, G., Dehaene, I., Kovács, G., Sterckx, L., Janssens, O., Ongenae, F., ... & Demeester, T. (2021). Overly optimistic prediction results on imbalanced data: a case study of flaws and benefits when applying over-sampling. Artificial Intelligence in Medicine, 111, 101987.
www.apdaparkinson.org/what-is-parkinsons/symptoms/ (2017). Accessed 21 Nov 2017.
Zhang, Y. N. (2017). Can a smartphone diagnose Parkinson disease? A deep neural network method and telediagnosis system implementation. Parkinson’s Disease, 2017.
Zuo, W. L., Wang, Z. Y., Liu, T., & Chen, H. L. (2013). Effective detection of Parkinson's disease using an adaptive fuzzy k-nearest neighbor approach. Biomedical Signal Processing and Control, 8(4), 364-373.

Declaration of Interests

The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Highlights
• This study addresses Parkinson's disease, a disease associated with high mortality
• It proposes a novel ensemble random forest model that works on an ensemble of classifiers
• The model compares favorably with other classifiers and with earlier research on Parkinson's disease
• The model achieves a 96% accuracy rate in the classification results
• The approach could significantly reduce the medical cost of predicting Parkinson's disease