
Computer Aided Diagnosis of Thyroid Disease Using Machine Learning Algorithms

Conference Paper · April 2021


DOI: 10.1109/ICECE51571.2020.9393054



2020 11th International Conference on Electrical and Computer Engineering (ICECE)

Md. Asfi-Ar-Raihan Asif, Mirza Muntasir Nishat, Fahim Faisal, Md. Fahim Shikder, Mahmudul Hasan Udoy,
Rezuanur Rahman Dip and Ragib Ahsan
Department of Electrical and Electronic Engineering, Islamic University of Technology, Dhaka, Bangladesh
Email: raihanasif@iut-dhaka.edu, mirzamuntasir@iut-dhaka.edu, faisaleee@iut-dhaka.edu, fahimshikder@iut-dhaka.edu,
mahmudul81@iut-dhaka.edu, rezuanurrahman@iut-dhaka.edu, ragibahsan@iut-dhaka.edu

Abstract: This paper presents a comparative study of the performance of different machine learning algorithms in the diagnosis of thyroid disease. Detecting thyroid disease at an early stage is of utmost importance because fatal thyroid diseases such as thyroid cancer can be fully cured with proper treatment. Machine learning (ML) has therefore become a reliable tool for predicting thyroid disease. A dataset from the University of California, Irvine (UCI) repository was used to train and test the classifiers. Several classification algorithms were applied to the dataset and their confusion matrices are presented. A detailed comparative analysis in terms of accuracy, precision, sensitivity, F1 score, and ROC-AUC provided conclusive evidence that the Multilayer Perceptron classifier (MLPC) was the most proficient of these algorithms, with an accuracy of 99.70% after hyperparameter optimization.

Keywords: Machine learning, Thyroid disease, UCI dataset, Hyperparameter optimization.

I. INTRODUCTION
The thyroid is a small gland located in the anterior part of the neck, just beneath the Adam’s apple and wrapped around the trachea. It is in charge of some vital functions of the human body: it creates and secretes hormones that control crucial functions such as metabolism and protein synthesis. The metabolic rate is predominantly governed by two active thyroid hormones, T4 (thyroxine) and T3 (triiodothyronine), which tell the body's cells how much energy to use [1]. Thyroid disease arises when the thyroid gland fails to produce the right amount of thyroid hormones. The most common thyroid diseases are hyperthyroidism, hypothyroidism, thyroiditis, and Hashimoto’s thyroiditis, all of which result from abnormal secretion of thyroid hormones [2]. Because the symptoms are very similar to those of other medical conditions, diagnosis is often delayed until the disease reaches a critical stage [3].
Thyroid disease detection is often an arduous task because its symptoms overlap with those of other medical conditions. For example, hyperthyroidism appears with symptoms such as weight loss, muscle weakness, anxiety, irritation, nervousness, and insomnia, while hypothyroidism presents with symptoms such as weight gain, fatigue, forgetfulness, and intolerance to low temperatures [4]. Conventionally, thyroid disease is detected by performing blood tests, imaging tests, and physical exams. Additional blood tests such as calcitonin and thyroglobulin tests are conducted to identify thyroid cancer. In recent times, machine learning (ML) has emerged as a handy diagnostic tool for identifying different thyroid diseases and their advanced stages with good accuracy [5]. Many researchers have worked with various algorithms to predict thyroid disease. Pan et al. proposed a classification model based on the Random Forest (RF) algorithm using principal component analysis, achieving 95.63% accuracy in diagnosing thyroid disease [6]. Duggal and Shukla applied the Support Vector Machine (SVM) alongside Recursive Feature Elimination (RFE) to their dataset, obtaining an accuracy of 92.92% in detecting thyroid disease [7]. Chaubey et al. reported an accuracy of 96.87% using the KNN algorithm [8]. Proper use of these algorithms can therefore substantially reduce human error and assist medical professionals in making credible decisions [9].
In this paper, we have implemented several ML algorithms, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM), AdaBoost (AdB), XGBoost (XGB), Gaussian Process Classifier (GPC), Gradient Boosting Classifier (GBC), and Multilayer Perceptron Classifier (MLPC), on the UCI thyroid disease dataset and observed the performance of the models. Section II presents the data preprocessing and Section III briefly describes the ML algorithms. The results of the simulations are reported in Section IV. Finally, Section V concludes the paper with a promising outcome for detecting thyroid disease effectively.

II. DATA PREPROCESSING
Data preprocessing is one of the fundamental steps in machine learning, and feature scaling is an integral part of it. Popular feature scaling methods include Min-Max scaling, Standardization (Variance Scaling), and Normalization. The attribute information for the dataset is tabulated in Table I [10].

TABLE I. ATTRIBUTE INFORMATION
Attribute   Elaboration                   Domain
TSH         Thyroid Stimulating Hormone   0.4 to 5.0 milli-IU
T3          Triiodothyronine              100 to 200 ng/dL
TT4         Total T4 (Thyroxine)          4.6 to 12 mg/dL
T4U         T4 uptake                     0.7 to 1.9 ng/dL
FTI         Free T4 Index                 4 to 11 ng/dL

978-0-7381-1102-5/20/$31.00 ©2020 IEEE
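As a minimal sketch of the feature scaling methods discussed in this section, the following Python snippet compares Min-Max scaling, standardization, and a median/IQR-based robust scaling on a small sample containing one outlier. The sample values and the function names are illustrative assumptions; they are not taken from the UCI dataset or from the authors' implementation.

```python
import statistics


def min_max_scale(values):
    """Map the values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical TSH-like readings; the last value is a deliberate outlier.
tsh = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 60.0]

min_max = min_max_scale(tsh)
standard = standard_scale(tsh)
robust = robust_scale(tsh)

# Range occupied by the eight non-outlier readings under each method.
print("min-max inlier range:", min(min_max[:-1]), max(min_max[:-1]))
print("robust  inlier range:", min(robust[:-1]), max(robust[:-1]))
```

On this sample the single outlier compresses the eight ordinary readings into less than 6% of the Min-Max range, whereas the median/IQR scaling spreads them between -1 and 0.75, which illustrates why a robust scaler may be preferable when outliers are present.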


The UCI dataset contains 3164 instances with 25 attributes. Exploratory Data Analysis (EDA) was performed and revealed some outliers in the data distribution. General scaling techniques such as standard scaling and min-max scaling do not work well in the presence of outliers. Robust scaling performs better whenever there are outliers in the data and also improves classification accuracy for biomedical data [11].

Fig. 1 Correlation heatmap

A correlation heat map was plotted to investigate salient features of the dataset (Fig. 1). The data were split into training and testing sets comprising 80% and 20% of the total data respectively, as this ratio displayed lower bias and lower variance than the other ratios.

III. STUDY OF ALGORITHMS
The prime objective of this work is to observe the performance of different ML algorithms in predicting thyroid disease. The algorithms studied in this work are briefly explained below.

A. K-Nearest Neighbor (KNN):
KNN is a basic machine learning algorithm which predicts the class of a data point based on its nearest neighbors [12]. The algorithm relies on the Euclidean distance function, which is given below:

$$d(x_i, y_i) = \sqrt{(x_{i,1} - y_{i,1})^2 + \dots + (x_{i,m} - y_{i,m})^2}$$

B. Support Vector Machine (SVM):
SVM is a popular supervised machine learning algorithm that can be used for both classification and regression problems. The objective of SVM is to find a hyperplane with maximum margin in an X-dimensional space (X being the number of features) that distinctly separates the data points of the classes. The loss function that helps to maximize the margin is known as the hinge loss, which can be written as follows [13]:

$$c(x, y, f(x)) = \begin{cases} 0, & \text{if } y \cdot f(x) \ge 1 \\ 1 - y \cdot f(x), & \text{otherwise} \end{cases}$$

C. AdaBoost (AdB):
AdaBoost is a boosting algorithm that builds a robust classifier by combining several weak-learner classifiers [14]. It assigns different weight coefficients to the classifiers depending on the decision tree stump. For a weak learner h(x_t), the quantity minimized after i iterations is

$$E_i = \sum_{t} E\left[F_{i-1}(x_t) + \alpha_i h(x_t)\right]$$

Here, α_i is the assigned coefficient and F_{i-1} is the boosted classifier built in the previous iterations.

D. XGBoost (XGB):
To push the limit of computation for boosted algorithms, Tianqi Chen developed XGBoost, also known as extreme gradient boosting, based on the principles of gradient boosting [15]. It builds its prediction model as an ensemble of weak prediction models, typically decision trees. The model is generally obtained by optimizing an arbitrary differentiable loss function.

E. Gaussian Process Classifier (GPC):
The Gaussian Process Classifier (GPC) has emerged as one of the powerful tools in machine learning; it makes predictions by incorporating prior knowledge. Its main application is fitting a function to the data. For any set of data points there are many functions that fit the data; the Gaussian process assigns a probability to each of these functions [16]. The probability density function of a Gaussian random variable is stated below:

$$P(x; \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-0.5\,(x - \mu)^{T} \Sigma^{-1} (x - \mu)\right)$$

Here, µ is the mean, Σ is the covariance matrix, |Σ| is the determinant of Σ, and d is the dimension of x.

F. Gradient Boosting Classifier (GBC):
Gradient Boosting (GB) is a boosting method that converts weak learners into strong learners, here by means of gradient descent. It combines the best possible next model with the previous models so as to minimize the overall error. Let the loss function be the squared error, where y_i is the target value and y_i^p is the prediction [17]:

$$L_{MSE} = \sum_{i} (y_i - y_i^p)^2$$

Gradient descent moves each prediction against the gradient of this loss, $\partial L_{MSE} / \partial y_i^p = -2\,(y_i - y_i^p)$, so the predicted value after a gradient boosting step is

$$y_i^p \leftarrow y_i^p + 2\alpha\,(y_i - y_i^p)$$

Here, α is the learning rate and (y_i − y_i^p) is the residual.

G. Multilayer Perceptron Classifier (MLPC):
The perceptron is a simple binary classifier inspired by the human neuron. A multilayer perceptron is a stack of several perceptrons arranged in many layers, much like a network of human neurons [18]. The first layer sends signals to all the perceptrons of the next layer, and so on up to the final layer. The perceptrons use weights to carry the signal. The error at the final layer output is backpropagated to the previous layers. The final layer output is

$$y = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i - \theta \ge 0 \\ 0, & \text{if } \sum_{i=1}^{n} w_i x_i - \theta < 0 \end{cases}$$

where w_i are the weights, x_i are the inputs, and θ is the threshold.
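Three of the closed-form expressions in this section, the Euclidean distance used by KNN, the hinge loss used by SVM, and the threshold output of a single perceptron, can be checked numerically with a short Python sketch. The function names and the sample inputs are illustrative assumptions, and the hinge loss is written under the usual convention that the label y takes the values -1 or +1, which the text does not state explicitly.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two feature vectors of equal length m."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def hinge_loss(y, fx):
    """Hinge loss for a label y (assumed to be -1 or +1) and a score fx = f(x).

    Returns 0 when y * f(x) >= 1 and 1 - y * f(x) otherwise.
    """
    return max(0.0, 1.0 - y * fx)


def perceptron_output(weights, inputs, theta):
    """Return 1 if the weighted sum minus the threshold is non-negative, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) - theta
    return 1 if activation >= 0 else 0


# Illustrative values only.
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))        # 5.0
print(hinge_loss(+1, 2.0))                                # 0.0, outside the margin
print(hinge_loss(-1, 0.5))                                # 1.5, wrong side of the hyperplane
print(perceptron_output([0.5, 0.5], [1.0, 1.0], 0.8))     # 1
print(perceptron_output([0.5, 0.5], [1.0, 0.0], 0.8))     # 0
```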
IV. RESULTS & ANALYSIS
After studying the ML algorithms, the models were built on the dataset to predict thyroid disease. Performance parameters such as accuracy, precision, sensitivity, specificity, F1 score, and ROC-AUC were calculated both with default hyperparameters (DHP) and after hyperparameter optimization (HPO) with the random search method. With default hyperparameters, the Multilayer Perceptron classifier (MLPC) showed the best accuracy (87.12%). Here, by default, a single hidden layer of 100 neurons with the ReLU activation function was used, with 0.0001 and 0.001 taken as the L2 regularization parameter and the learning rate respectively. The next-best classifiers with DHP were AdB (86.33%), SVM (85.74%), and GPC (84.79%). The confusion matrices for all the classifiers are tabulated in Table II so that further analysis can be carried out.

TABLE II. CONFUSION MATRICES FOR ALL CLASSIFIERS
Classifier   Actual   Predicted True   Predicted False
KNN          True     41               28
             False    14               548
SVM          True     54               15
             False    12               550
AdB          True     67               9
             False    6                549
XGB          True     56               13
             False    14               548
GPC          True     52               17
             False    11               551
GBC          True     68               5
             False    6                552
MLPC         True     81               1
             False    1                548

Hyperparameter optimization (HPO) improves the performance of machine learning algorithms. In automated machine learning, two popular optimization methods are ‘grid search’ and ‘random search’. Grid search suffers from a high dimensionality problem when there is a lot of data. Moreover, increasing the resolution of the discretization increases the number of function evaluations, which greatly increases the computation time. Implementing a machine learning model in practice using grid search is therefore inefficient from an economic point of view [19]. Any model proposed for practical purposes, particularly for medical diagnosis, should be efficient as well as economical. Therefore, random search was chosen for hyperparameter optimization in our proposed model.

After optimizing the hyperparameters with the proposed method, the machine learning models were trained to keep the bias as low as possible and were then tested with cross-validation to keep the variance as low as possible, guarding against overfitting and data leakage. The Multilayer Perceptron classifier (MLPC) gave the best accuracy, 99.70%, among all the classifiers. After hyperparameter optimization, four hidden layers of 1000 neurons each were found to be most appropriate for this classifier. ReLU was used as the activation function, with 0.0005 and 0.001 as the L2 regularization parameter and the learning rate respectively. GBC (98.26%), AdB (97.62%), and XGB (96.20%) also produced promising results in terms of accuracy and the other performance metrics. For all the other performance parameters (precision, sensitivity, specificity, F1 score, and ROC-AUC), hyperparameter optimization (HPO) outperforms the default hyperparameters (DHP). Table III and Table IV present the outcomes for DHP and HPO respectively. Furthermore, Receiver Operating Characteristic (ROC) curves for each machine learning algorithm are plotted in Fig. 2.

Fig. 2 ROC curve for all the models

The ROC curve illustrates the performance of a classifier across all classification thresholds by plotting the tradeoff between the True Positive Rate (TPR) and the False Positive Rate (FPR). The area under the ROC curve signifies the strength of a classifier: the nearer it is to one, the better the classifier separates the class labels. It is evident from Table IV that MLPC presents the best results among all the algorithms in terms of precision (0.9878), sensitivity (0.9878), specificity (0.9982), F1 score (0.9878), and ROC-AUC (0.957). The Gradient Boosting Classifier (GBC) is the second-best performing classifier, with precision, sensitivity, specificity, F1 score, and ROC-AUC of 0.9315, 0.9189, 0.9910, 0.9826, and 0.956 respectively. After GBC, AdaBoost (AdB) is the next classifier with good results; its precision, sensitivity, specificity, F1 score, and ROC-AUC are 0.9178, 0.8816, 0.9892, 0.8993, and 0.950 respectively. Thus, this investigation reports that MLPC, GBC, and AdB perform better than all the other algorithms considered.
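The cost argument for preferring random search over grid search can be illustrated with a small Python sketch. The search space below is hypothetical and only loosely modelled on the MLPC settings reported in this section, and `toy_score` is a placeholder for a cross-validated accuracy; neither is taken from the authors' experiments.

```python
import math
import random

# Hypothetical search space (an assumption made for illustration).
SPACE = {
    "hidden_layers": [1, 2, 3, 4],
    "neurons": [100, 250, 500, 1000],
    "l2_alpha": [0.0001, 0.0005, 0.001, 0.005],
    "learning_rate": [0.0001, 0.001, 0.01],
}


def grid_size(space):
    """Number of model evaluations an exhaustive grid search would need."""
    return math.prod(len(choices) for choices in space.values())


def random_search(space, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly drawn configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iter):
        config = {name: rng.choice(choices) for name, choices in space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Placeholder objective standing in for cross-validated accuracy."""
    return -(abs(config["hidden_layers"] - 4)
             + abs(math.log10(config["neurons"]) - 3))


print("grid search evaluations:", grid_size(SPACE))   # 4 * 4 * 4 * 3 = 192
best_config, best_score = random_search(SPACE, toy_score, n_iter=20)
print("random search evaluations: 20")
print("best configuration found:", best_config)
```

With this space an exhaustive grid needs 192 evaluations, and refining any one axis multiplies that count, whereas the random search budget is fixed in advance by `n_iter`. In practice each evaluation would be a full cross-validated training run, which is where the saving matters.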
TABLE III. PERFORMANCE METRICS OF ML ALGORITHMS USING DEFAULT HYPERPARAMETERS (DHP)

ML Algorithm   Accuracy (%)   Precision   Sensitivity   Specificity   F1 Score   ROC-AUC
KNN            79.00          0.3494      0.2762        0.8971        0.3085     0.801
SVM            85.74          0.4675      0.4235        0.9249        0.4400     0.823
AdB            86.33          0.4810      0.4578        0.9249        0.4691     0.833
XGB            83.00          0.3736      0.3953        0.8973        0.3842     0.805
GPC            84.79          0.4198      0.4146        0.9142        0.4146     0.811
GBC            83.65          0.4096      0.3864        0.9096        0.3908     0.806
MLPC           87.12          0.5135      0.4578        0.9341        0.4841     0.891

TABLE IV. PERFORMANCE METRICS OF ML ALGORITHMS USING HYPERPARAMETER OPTIMIZATION (HPO)

ML Algorithm   Accuracy (%)   Precision   Sensitivity   Specificity   F1 Score   ROC-AUC
KNN            93.34          0.5942      0.7455        0.9514        0.6613     0.8916
SVM            96.00          0.7826      0.8182        0.9735        0.80       0.921
AdB            97.62          0.9178      0.8816        0.9892        0.8993     0.950
XGB            96.20          0.8429      0.8194        0.9803        0.8310     0.921
GPC            95.56          0.7536      0.8254        0.9701        0.7879     0.943
GBC            98.26          0.9315      0.9189        0.9910        0.9826     0.956
MLPC           99.70          0.9878      0.9878        0.9982        0.9878     0.957
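The threshold-based metrics in these tables follow directly from confusion matrix counts. As a hedged illustration, the Python sketch below recomputes them for the AdB counts of Table II, assuming that the rows of Table II are the actual classes and the columns the predicted classes; the function name is an illustrative choice, and ROC-AUC is omitted because it cannot be derived from a single confusion matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, precision, sensitivity, specificity and F1 from counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# AdB counts as read from Table II (rows assumed to be the actual classes).
adb = classification_metrics(tp=67, fn=9, fp=6, tn=549)

for name, value in adb.items():
    print(f"{name:12s} {value:.4f}")
```

Under this reading the sketch yields an accuracy of 97.62%, precision 0.9178, sensitivity 0.8816, specificity 0.9892, and F1 score 0.8993, which agrees with the AdB row of Table IV.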

V. CONCLUSION
The thyroid, being an essential part of the human body, produces several hormones that serve multifarious vital tasks. Thyroid disease therefore threatens all physiological systems of the human body, such as the endocrine system, cardiovascular system, nervous system, respiratory system, digestive system, muscular system, and reproductive system. Heart failure, loss of consciousness, and mental disorders are common occurrences and can cause death in many cases. Consequently, early detection and accurate clinical diagnosis of thyroid diseases can keep the physiological systems of the human body balanced and can save many lives. In this research, we applied an efficient data preprocessing technique, investigated several machine learning algorithms for the early prediction of thyroid disease, and proposed the Multilayer Perceptron classifier (MLPC), which gave the highest accuracy of 99.70%. It can therefore be implemented in practice to aid medical professionals in the early diagnosis of thyroid disease. Thus, our proposed model can contribute to the fight against thyroid disease and to the welfare of human beings.

REFERENCES
[1] Molina, Patricia E. Endocrine physiology. Edited by McGraw-Hill Education. New York: Lange Medical Books/McGraw-Hill, 2006.
[2] Silva de Morais, Nathalie, Jessica Stuart, Haixia Guan, Zhihong Wang, Edmund S. Cibas, Mary C. Frates, Carol B. Benson et al. "The impact of Hashimoto thyroiditis on thyroid nodule cytology and risk of thyroid cancer." Journal of the Endocrine Society 3, no. 4 (2019): 791-800.
[3] Gyuricsko, Eric. "The “slightly” abnormal thyroid test: What is the pediatrician to do?" Current Problems in Pediatric and Adolescent Health Care (2020): 100770.
[4] Boelaert K, Visser WE, Taylor PN, Moran C, Léger J, Persani L. Management of hyperthyroidism and hypothyroidism. Endocrinology. 2020; 183:G33-9.
[5] Razia, Shaik, P. Siva Kumar, and A. Srinivasa Rao. "Machine learning techniques for Thyroid disease diagnosis: A systematic review." In Modern Approaches in Machine Learning and Cognitive Science: A Walkthrough, pp. 203-212. Springer, Cham, 2020.
[6] Pan, Qiao, et al. "Improved ensemble classification method of thyroid disease based on random forest." 2016 8th International Conference on Information Technology in Medicine and Education (ITME). IEEE, 2016.
[7] Duggal, Priyanka & Shukla, Shipra. (2020). Prediction Of Thyroid Disorders Using Advanced Machine Learning Techniques. 670-675. 10.1109/Confluence47617.2020.9058102.
[8] Chaubey, Gyanendra & Bisen, Dhananjay & Arjaria, Siddharth & Yadav, Vibhash. (2020). Thyroid Disease Prediction Using Machine Learning Approaches. National Academy Science Letters. 10.1007/s40009-020-00979-z.
[9] Nishat, Mirza Muntasir, Fahim Faisal, Rezuanur Rahman Dip, Md. Fahim Shikder, Ragib Ahsan, Md. Asfi-Ar-Raihan Asif and Mahmudul Hasan Udoy. “Performance Investigation of Different Boosting Algorithms in Predicting Chronic Kidney Disease.” In 2nd International Conference on Sustainable Technologies for Industry 4.0 (STI), 19-20 December 2020, Dhaka, Bangladesh, in press.
[10] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml/machine-learning-databases/thyroid-disease/sick-euthyroid.data]. Irvine, CA: University of California, School of Information and Computer Science.
[11] Quiring, Erwin, David Klein, Daniel Arp, Martin Johns, and Konrad Rieck. "Adversarial Preprocessing: Understanding and Preventing Image-Scaling Attacks in Machine Learning." In 29th USENIX Security Symposium (USENIX Security 20). 2020.
[12] Li, Wei, Yumin Chen, and Yuping Song. "Boosted K-nearest neighbor classifiers based on fuzzy granules." Knowledge-Based Systems (2020): 105606.
[13] Wang, Mingjing, and Huiling Chen. "Chaotic multi-swarm whale optimizer boosted support vector machine for medical diagnosis." Applied Soft Computing 88 (2020): 105946.
[14] Fan, Zhao, Fanyu Xu, Cai Li, and Lili Yao. "Application of KPCA and AdaBoost algorithm in classification of functional magnetic resonance imaging of Alzheimer’s disease." Neural Computing and Applications (2020): 1-10.
[15] Abdurrahman, G., and M. Sintawati. "Implementation of xgboost for classification of parkinson’s disease." In Journal of Physics: Conference Series, vol. 1538, no. 1, p. 012024. IOP Publishing, 2020.
[16] Desai, Rahul, Pratik Porob, Penjo Rebelo, Damodar Reddy Edla, and Annushree Bablani. "EEG Data Classification for Mental State Analysis Using Wavelet Packet Transform and Gaussian Process Classifier." Wireless Personal Communications (2020): 1-21.
[17] Bahad, Pritika, and Preeti Saxena. "Study of adaboost and gradient boosting algorithms for predictive analytics." In International Conference on Intelligent Computing and Smart Communication 2019, pp. 235-244. Springer, Singapore, 2020.
[18] Pathak, Ketki C., and Swathi S. Kundaram. "Accuracy-Based Performance Analysis of Alzheimer’s Disease Classification Using Deep Convolution Neural Network." In Soft Computing: Theories and Applications, pp. 731-744. Springer, Singapore, 2020.
[19] Géron, Aurélien. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O'Reilly Media, 2019.
