
A Machine Learning Model for Prediction of Parkinson’s Disease using Vocal Features


Abstract— Parkinson's disease is a progressive neurological disorder that affects motor skills and is characterized by symptoms such as muscle rigidity, tremors, and difficulty maintaining balance and coordination. Diagnosing Parkinson's disease involves analyzing the patient's medical history and conducting multiple imaging tests such as MRI, computed tomography, and DaTscan, along with physical and neurological assessments. All of these steps are needed to confirm a diagnosis. Recent research has shown that variations in speech can serve as a quantifiable indicator for detecting Parkinson's disease in its early stages.
This method is both non-invasive and cost-effective compared with alternative diagnostic procedures. To build a speech-based diagnosis system, we combine a feature selection algorithm with classification algorithms and a stacking classifier that combines these base models. For feature selection, we apply the mutual information gain method in combination with three classifiers: K-nearest neighbors (KNN), support vector machine (SVM), and XGBoost. To evaluate the various combinations, we use the speech dataset provided by the UCI repository. To address the issue of imbalanced data, we apply the Synthetic Minority Oversampling Technique (SMOTE). The stacking classifier shows the best performance, attaining an accuracy of 97.05%.

Keywords—Parkinson’s Disease, Vocal features, SVM, KNN, XGBoost, Stacking classifier, SMOTE

I. INTRODUCTION

Parkinson's disease (PD) is a neurodegenerative disease that affects the central nervous system and generally manifests as motor and non-motor symptoms. Motor symptoms include tremors, slowed movements, difficulty with daily activities, and shuffling while walking. Non-motor symptoms include loss of smell, speech problems, constipation, fatigue, insomnia, and memory problems. Detecting non-motor symptoms early is important for managing the disease.

It has been observed that PD begins before motor symptoms appear, and voice disorders affect around 90% of patients. Individuals with PD often experience vocal difficulties known as dysphonia, which encompasses various voice-related disorders resulting from both pathological and functional issues. Dysphonia can lead to a raspy, tense, or effortful voice quality and to reduced speech intelligibility because of low volume or reduced distinctness. These problems can arise from improper use of the vocal instrument and from postural problems, sometimes involving a combination of misuse and physiological factors. Currently, diagnosing PD relies heavily on human expertise. Hence, better techniques are needed to detect early non-motor symptoms and slow the progression of the disease. Machine learning tools have been used widely in medicine to detect diseases, often improving accuracy. However, considerations such as execution time and algorithm complexity are crucial in medical applications. Because they can enhance decision-making performance, machine learning techniques are a promising aid to Parkinson's disease diagnosis.

This paper discusses a method for detecting Parkinson’s disease that learns features from voice recordings and uses the learned features as inputs to supervised classification techniques. The model uses an ensemble of machine learning classifiers, including SVM, KNN, and XGBoost. The performance of each technique is assessed using performance metrics to determine the most effective approach. The method also applies synthetic oversampling of the minority class and a feature selection algorithm that retains the features contributing most to the accuracy of the model.

II. LITERATURE SURVEY

Individuals affected by PD frequently exhibit a range of vocal symptoms involving difficulty producing normal vocal sounds, referred to as dysphonia. Dysphonia encompasses diverse voice-related disorders that can arise from both pathological and functional issues affecting the voice. It can result in a voice quality that is raspy, tense, or effortful. Furthermore, speech intelligibility may be hindered by low volume or reduced distinctness, diverting attention from the intended message. These voice-related disorders may also emerge from incorrect use of the vocal instrument, including pitch misalignment, improper volume control, or inadequate breath support, often attributed to postural problems. Some instances of dysphonia seem to stem from a combination of misuse and physiological factors. Current diagnostic methods for PD rely heavily on human expertise.

Several studies have explored machine learning models for PD diagnosis. One study used a stacking ensemble learning framework with various classifiers and achieved 96.88% accuracy using multi-modal features [1]. Another study predicted disease severity using machine learning models, achieving high predictive values [2]. Additionally, a model based on handwriting samples showed promising results in early PD identification [3].

Speech processing is a significant area of research in Parkinson's disease diagnosis, where the choice of feature extraction technique is critical for achieving high performance. Various feature extraction techniques have been explored, and their strengths and weaknesses have been analyzed [4].

The literature on Parkinson's disease has extensively explored speech measurement techniques for general voice disorders. Several studies highlight that up to



90% of individuals with PD experience a specific voice disorder known as dysphonia [5]. Dysphonia can be categorized as either pathological or functional and can be diagnosed from a constellation of vocal symptoms, including signs of vocal impairment. Researchers such as [6] propose speech analysis methods to investigate the impact of voice disorders on PD patients. These techniques extract features such as fractal scaling and recurrence patterns to distinguish between disordered and normal voices. An alternative approach introduces a hybrid model for the diagnosis of Parkinson's disease based on speech signals. It uses feature selection techniques including the genetic algorithm, mutual information gain, and extra tree methods, and achieves an accuracy of 95.58% with the genetic algorithm in combination with a random forest classifier [7].

Senturk [8] proposed feature selection through feature importance and recursive feature elimination methods, using classifiers such as classification and regression trees, artificial neural networks, and SVM for PD diagnosis. Furthermore, Lahmiri and Shmuel [9] used voice disorder patterns to differentiate between healthy individuals and those with Parkinson's disease. They applied various pattern ranking techniques for feature selection in combination with an optimized SVM classifier, reaching an accuracy of 92.21% with the Wilcoxon-based feature ranking method. In a related context, Polat [10] introduced a hybrid model for diagnosing Parkinson's disease using speech signal analysis. The process starts with preprocessing using SMOTE and then proceeds to classification with a random forest classifier, achieving an accuracy of 94.8%.

Further studies, such as Gupta et al. [11], highlight the enhancement of classifier performance through techniques like multi-feature evolutionary algorithms and optimized feature selection algorithms.

The initial symptoms of PD may be subtle and go unnoticed, progressing to more pronounced manifestations as the disease advances. These symptoms vary among patients and include both motor symptoms, such as bradykinesia, tremors, rigidity, and postural instability, and non-motor symptoms, such as psychiatric manifestations, sleep disturbances, sensory impairments, and autonomic dysfunction. Notable symptoms in Parkinson’s patients include changes in speech, tremors, slowed movement, altered handwriting, muscle rigidity, impaired balance, and loss of automatic movements. Researchers have identified vocal disorders in around 90% of individuals with Parkinson’s disease, often evident in its early stages. These vocal disorders include reduced volume, defective voice quality, articulation difficulties, and reduced pitch [12].

The early detection of PD is a complex task, primarily because it requires extensive medical assessments and repeated examinations conducted by neurologists and specialists in movement disorders. This procedure can be lengthy and cumbersome, especially for older patients, the majority of whom are aged sixty or above.

III. METHODOLOGY

The dataset plays an important role in any machine learning based approach. The first part of this section describes the dataset used, and the second part describes the proposed approach for detecting Parkinson’s disease.

A. Parkinson’s Speech dataset

This research uses a publicly accessible dataset from the UCI repository [14]. The dataset comprises speech recordings from 8 healthy individuals and 23 individuals diagnosed with PD. The dataset was created by Max Little of Oxford University. It contains 195 rows, each corresponding to one voice recording of a participant, and each column represents a distinct vocal measure. Of the 195 recordings, 147 are from PD patients and the remaining 48 are from healthy individuals. A "status" column holds binary values: zero denotes a healthy individual and one denotes a PD patient.

B. Proposed Methodology

The proposed approach uses speech signals for Parkinson's disease diagnosis. As depicted in Figure 1, the proposed hybrid system comprises three main components: feature reduction, SMOTE balancing, and binary classification. Features from the UCI Parkinson's speech dataset are selected using the mutual information method. The dataset has an inherent class imbalance, with 147 of the 195 recordings coming from PD patients and the remainder from healthy individuals. To address this, the study employs SMOTE, in which minority-class samples are synthetically generated from existing instances in the dataset. Classification uses SVM, KNN, and XGBoost classifiers, which are then combined in a stacking classifier with logistic regression as the final classifier.

Fig. 1. System Overview

1) Feature Selection Method

Mutual information gain feature selection is a technique used in machine learning to select the most informative features from a dataset. Mutual information measures how much information one variable carries about another. Applied to feature selection, it quantifies how much knowing the value of a feature reduces uncertainty about the target variable. The 11 selected features that are fed to the classifiers are shown in Figure 2.
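As a rough illustration of the first two stages of the proposed pipeline (mutual information feature ranking followed by SMOTE balancing), the sketch below uses only the Python standard library. It is not the authors' implementation: the equal-width binning used to estimate mutual information, the function names (`mutual_information`, `select_top_k`, `smote`), their parameters, and the toy data are all assumptions made for this example, and library implementations often use different estimators.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, n_bins=4):
    """Estimate I(feature; label) in nats after equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # constant feature: everything lands in bin 0
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint, p_bin, p_lab = Counter(zip(bins, labels)), Counter(bins), Counter(labels)
    return sum(
        (c / n) * math.log((c / n) / ((p_bin[b] / n) * (p_lab[y] / n)))
        for (b, y), c in joint.items()
    )


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a random minority
    sample and one of its k nearest minority neighbors (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
rows = [
    [0.90, 5.0], [0.80, 1.0], [0.95, 3.0], [0.85, 4.0], [0.88, 2.0], [0.92, 1.0],
    [0.10, 5.0], [0.20, 1.0], [0.15, 3.0],
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0]  # 1 = PD, 0 = healthy (minority class)

# Stage 1: keep the most informative feature(s).
selected, scores = select_top_k(rows, labels, k=1)
reduced = [[row[j] for j in selected] for row in rows]

# Stage 2: oversample the minority class until both classes have equal size.
minority = [r for r, y in zip(reduced, labels) if y == 0]
n_needed = labels.count(1) - labels.count(0)
balanced_rows = reduced + smote(minority, n_needed, k=2)
balanced_labels = labels + [0] * n_needed
print(selected, Counter(balanced_labels))
```

On the real dataset, `k` in `select_top_k` would be 11 and the minority class would be the healthy recordings; the order of the two stages here follows the order in which the components are listed above.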
Fig. 2. Feature Selection using Mutual Information Gain

2) Machine Learning Classifiers

In this study, Parkinson's disease detection is treated as a binary classification problem with the classes "PD detected" and "PD not detected". This section discusses the classification methods used for Parkinson’s disease detection.

• Support Vector Machine (SVM)
SVM is a machine learning algorithm applied to regression and classification tasks. SVMs serve as robust nonlinear classifiers that operate by identifying an optimal hyperplane that separates the classes of data points in a high-dimensional space. The hyperplane is chosen to maximize the margin, that is, the separation between the hyperplane and the nearest data points of each class, which increases the likelihood of correctly classifying new samples. SVMs are applied across a broad range of use cases and are recognized for their accuracy and their ability to handle complex datasets. Hyperparameter tuning was used to determine the most suitable parameters for SVM performance.

• K-Nearest Neighbour (KNN)
KNN is a widely used machine learning algorithm applicable to both classification and regression. It operates by identifying the K training points closest to the input data point and predicting its class or value from them, using either the majority class or the average value among these nearest neighbors.
In classification tasks, KNN assigns a class label to a new input data point by examining the class labels of its K nearest neighbors in the training dataset; the label that occurs most frequently among them is assigned to the new point. In regression tasks, KNN assigns a value to the new input data point by averaging the values of its K nearest neighbors in the training dataset. In our model, the KNN classifier relies on lazy learning, also known as instance-based learning: the entire training dataset is used during classification, as there is no dedicated training phase.

• XGBoost
XGBoost, or Extreme Gradient Boosting, is a popular open-source gradient boosting library widely used in supervised learning tasks, particularly regression and classification. Its primary attributes are high scalability, efficiency, and accuracy. XGBoost is built on gradient boosting, a technique that combines many weak models, such as decision trees, into a robust predictive model. These weak models are trained sequentially, with each new model attempting to correct the errors made by its predecessors. XGBoost, as an implementation of gradient boosting, uses a more advanced algorithm to optimize the training process and improve the model's accuracy.
The essence of XGBoost lies in the sequential addition of decision trees to the model. Each new tree is constructed to correct the errors of the trees before it. Specifically, XGBoost fits the new tree to the residuals (more generally, the gradients of the loss) of the current ensemble's predictions, so that each subsequent tree is trained to predict the error that remains. Additionally, XGBoost incorporates regularization to mitigate the risk of overfitting. This includes L1 and L2 regularization on the leaf weights and a penalty for adding new leaves to the tree structure. In this way, XGBoost improves the generalization of the model and reduces the likelihood of the model memorizing the training data.

• Stacking Classifier
A stacking classifier is an ensemble learning method used in machine learning that combines multiple classification models to improve predictive performance. In a stacking classifier, multiple classification models are trained on the same dataset to make predictions. The predictions from those models are then combined by training a meta-classifier on the outputs of the base classifiers. The meta-classifier takes the outputs of the base classifiers as input and makes the final prediction. The base classifiers may be any classification algorithm, including logistic regression, decision tree, random forest, KNN, or SVM. The meta-classifier can also be any classification algorithm, such as logistic regression, a decision tree, or a neural network. One advantage of a stacking classifier is that it improves the overall performance of the classification model by combining the strengths of multiple models. It can also reduce the risk of overfitting that may arise when a single model is used.

IV. RESULTS AND DISCUSSION

The proposed system employs the mutual information gain feature selection method along with three classifiers: SVM, KNN, and XGBoost. Subsequently, a stacking classifier combines the outputs of these three classifiers. In this section, the performance of the classifiers is examined on the full feature set and on the selected feature subset. To assess the validity of our results, k-fold cross-validation is used with k set to 10.

Model performance is evaluated using several metrics, including accuracy, recall, F-measure, precision, and AUC.
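The evaluation protocol used in this section, k-fold cross-validation with k = 10 scored by accuracy, precision, recall, and F-measure, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code: a simple KNN classifier stands in for the paper's models, the folds are stratified by class, the helper names (`knn_predict`, `stratified_folds`, `binary_metrics`, `cross_validate`) and the synthetic two-cluster data are hypothetical, and AUC is omitted for brevity.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to query."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    nearest = sorted(range(len(train_x)), key=sq_dist)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def stratified_folds(labels, n_folds, seed=0):
    """Split sample indices into n_folds lists with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 score for one fold."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def cross_validate(x, y, n_folds=10, k=3):
    """Average the per-fold metrics over n_folds train/test splits."""
    totals = Counter()
    for test_idx in stratified_folds(y, n_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        preds = [knn_predict(train_x, train_y, x[i], k) for i in test_idx]
        totals.update(binary_metrics([y[i] for i in test_idx], preds))
    return {name: value / n_folds for name, value in totals.items()}


# Synthetic data (hypothetical): two well-separated Gaussian clusters.
rng = random.Random(42)
x = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
     + [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(50)])
y = [0] * 50 + [1] * 50

scores = cross_validate(x, y)
print(scores)
```

Each sample is used for testing exactly once, so the averaged scores reflect performance on data the classifier did not see during fitting; the same loop could wrap any of the classifiers described above.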
Table I shows the performance metrics (K-fold scores) for the full feature set (23 features), while Table II shows the performance metrics for the 11 selected features.

The stacking classifier trained on all features achieves the highest accuracy of the four classifiers, at 97.05%. For the features selected by the mutual information gain method, the stacking classifier again achieves the highest accuracy, at 94.5%. With this feature selection method, the accuracy of KNN improves by about 3.5 percentage points (89.37% to 92.83%), the accuracy of SVM decreases by about 3.4 percentage points (96.63% to 93.22%), and the accuracy of XGBoost is essentially unchanged (93.64% to 93.66%).

TABLE I. FULL FEATURES K-FOLD SCORES (%)

Classifier                              Recall   Accuracy   F1 score   Precision   ROC AUC
SVM                                     96.66    96.63      96.62      96.97       99.49
KNN                                     89.47    89.37      89.11      91.59       98.39
XGBoost                                 93.75    93.64      93.61      94.31       98.28
Proposed Method (Stacking Classifier)   97.08    97.05      97.04      97.3        99.42

TABLE II. SELECTED FEATURES K-FOLD SCORES (%)

Classifier                              Recall   Accuracy   F1 score   Precision   ROC AUC
SVM                                     93.3     93.22      93.2       93.77       98.14
KNN                                     92.88    92.83      92.8       93.4        95.84
XGBoost                                 93.71    93.66      93.6       94.28       98.43
Proposed Method (Stacking Classifier)   94.55    94.5       94.46      95.08       98.36

The results of this study are also compared with those of other researchers who have used the UCI Parkinson’s Speech Dataset. The comparison in Table III shows that the proposed approach achieves higher accuracy than the earlier studies in [7], [8], [11], and [13].

TABLE III. PERFORMANCE COMPARISON WITH THE EXISTING APPROACHES SUGGESTED BY OTHER AUTHORS

Study             Feature Selection Methods                                 Classifiers                                     Accuracy (%)
[11]              Optimized Cuttlefish Algorithm                            KNN, Decision Tree                              92.19
[13]              Modified Grey Wolf Optimization                           KNN, Random Forest, Decision Tree               93.87
[8]               Recursive Feature Elimination, Feature Importance         SVM, ANN, Classification and Regression Trees   93.84
[7]               Extra Tree, Mutual Information Gain, Genetic Algorithm    Naïve Bayes, KNN, Random Forest                 95.58
Proposed Method   All Features                                              SVM, KNN, XGBoost (Stacking Classifier)         97.05

V. CONCLUSION

Parkinson’s disease is a chronic condition; consequently, the best way to improve a patient’s life is early detection. Proper treatment, a balanced diet, and exercise can reduce the symptoms of Parkinson’s disease. The primary focus of this study is the early detection of Parkinson's disease using speech signals. There is no universal feature selection technique and classifier for medical datasets. While a feature selection technique may improve the accuracy of one classifier, the same cannot be said of its effect on other classifiers. The proposed hybrid system achieves an accuracy of 97.05%. The suggested system should not be seen as a replacement for healthcare professionals but can serve as a supplementary tool for Parkinson's disease diagnosis. Detecting Parkinson's disease can also be accomplished through a diagnostic tool that analyzes handwritten drawings, since slowness and tremors are early symptoms of Parkinson's disease that can significantly affect a person's handwriting. Different tools and strategies, including different SMOTE variants, can be tried to obtain different results. Alternative strategies for extracting voice features can also be explored.

REFERENCES

[1] Y. Yang, L. Wei, Y. Hu and S. Nie, “Classification of Parkinson’s Disease based on multi-modal features and stacking ensemble learning”, J. Neurosci. Methods, vol. 350, p. 109019, Feb. 2021.
[2] K. P. Nguyen et al., “Predicting Parkinson’s Disease trajectory using clinical and neuroimaging baseline measures”, Parkinsonism Relat. Disord., vol. 85, pp. 44-51, Apr. 2021.
[3] I. Kamran, S. Naz, I. Razzak and M. Imran, “Handwriting dynamics assessment using deep neural network for early identification of Parkinson’s Disease”, Future Gener. Comput. Syst., vol. 117, pp. 234-244, Apr. 2021.
[4] M. A. Mazumdar, R. A. Salam, “Feature Extraction Techniques for Speech Processing: A Review”, Int. J. Adv. Trends Comput. Sci. Eng., vol. 8, no. 1.3, pp. 285-292, Aug. 2019.
[5] M. Little, P. McSharry et al., “Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection”, Biomed. Eng. Online, vol. 6, p. 23, Feb. 2007.
[6] M. Little, P. McSharry et al., “Suitability of dysphonia measurements for telemonitoring of Parkinson’s Disease”, Nat. Preced., pp. 1-1, Sept. 2008.
[7] R. Lamba, T. Gulati, Hadeel A., A. Jain, “A Hybrid System for Parkinson’s Disease Diagnosis using Machine Learning Techniques”, Springer, Int. J. of Speech Tech., 14 Apr. 2021.
[8] Z. Karapinar Senturk, “Early Diagnosis of Parkinson’s Disease using Machine Learning Algorithms”, Med. Hypotheses, vol. 138, p. 109603, May 2020.
[9] S. Lahmiri, A. Shmuel, “Detection of Parkinson’s Disease based on voice patterns ranking and optimized Support Vector Machine”, Biomed. Signal Process. Control, vol. 49, pp. 427-433, Mar. 2019.
[10] K. Polat, “A hybrid approach to Parkinson disease classification using speech signal: The combination of SMOTE and random forests”, in 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), IEEE, pp. 1-3, 2019.
[11] D. Gupta, A. Julka, S. Jain, T. Aggarwal, A. Khanna, N. Arunkumar and V. H. C. de Albuquerque, “Optimized cuttlefish algorithm for diagnosis of Parkinson’s disease”, Cognitive Systems Research, vol. 52, pp. 36-48, 2018.
[12] Zesiewicz et al., “Management of Early Parkinson’s Disease”, Clinics in Geriatric Medicine, 2019.
[13] P. Sharma, S. Sundaram, M. Sharma, A. Sharma, “Diagnosis of Parkinson’s Disease using Modified grey wolf optimization”, Cognitive Systems Research, vol. 54, pp. 100-115, 2019.
[14] https://archive.ics.uci.edu/dataset/174/parkinsons
