
Machine Learning analysis in the prediction of diabetes mellitus: a systematic review of the literature

Abstract— In recent years, the prevalence of diabetes mellitus has increased worldwide and, in the context of COVID-19, people with diabetes mellitus are the most likely to develop a critical clinical picture of this disease. Early diagnosis is therefore essential, and the implementation of machine learning has played a vital role in it in recent years. In this study we conducted a systematic review of 55 studies in which models for predicting diabetes mellitus and its different types were developed or implemented. The articles were retrieved from major databases: IEEE Xplore, Scopus, ScienceDirect, IOPscience, EBSCOhost and Wiley. The results show that one of the models based on the Support Vector Machine algorithm achieved 100% accuracy in disease prediction. The vast majority of the studies used the Weka platform as a modeling tool, although the best-performing models were developed in MATLAB (100%) and RStudio (99%). Most studies seek to predict diabetes without specifying the type; however, a considerable number of articles predict type 2 diabetes.

Keywords—Diabetes Mellitus, Diabetes Types, Gestational Diabetes, Machine Learning, Systematic Review

I. INTRODUCTION

Over the years, diabetes has become a global public health problem. Recent studies show that more than 381 million people over the age of 18 suffer from diabetes and that approximately 45.8% of them have not yet been diagnosed [1].

The disease is classified into 3 types. Type 1 diabetes is caused by insulin deficiency. Type 2 diabetes is caused by varying degrees of insulin resistance, altered insulin secretion, increased glucose production and various genetic and metabolic defects in the action of insulin. Finally, gestational diabetes occurs in women during pregnancy [2].

Most physicians would agree that this disease, which is largely related to lifestyle, can be prevented. Unfortunately, the medical community has been largely absent from the battle to improve these conditions. In fact, numerous studies show that physicians discuss weight management, physical activity or proper nutrition with fewer than 40% of the people they see in their offices [3].

In recent years, humanity has been immersed in health problems, especially in low-income environments, and the situation is aggravated by the limited capacity of the health system [4]. This is why it is important to develop and implement technologies such as machine learning models that serve as tools for doctors and patients, supporting preventive medicine that can help diagnose patients early and provide them with health advice.

The aim of this article is to analyze and report the machine learning models used to detect and predict diabetes mellitus and its types. It provides an analytical summary based on research conducted in different countries around the world in the last 4 years.

II. METHODOLOGY

A. Type of study
This article uses a systematic review of the scientific literature, a process that collects the relevant evidence on a given topic according to established eligibility criteria in order to answer the formulated research questions [5].

B. Research questions
The proposed research questions are as follows:

RQ1: Which diabetes mellitus prediction models have shown the best results according to performance metrics over the past 4 years?

RQ2: Which tools and languages are the most widely used in the world to develop or implement machine learning models for diabetes mellitus prediction?

RQ3: In which countries has the most research related to the prediction of diabetes mellitus been conducted in the last 4 years?

RQ4: What type of diabetes mellitus has had the highest amount of scientific research focused on prediction with machine learning worldwide?



C. Search strategies

Based on the questions posed, an exhaustive search of articles published in the main databases (IEEE Xplore, Scopus, ScienceDirect, IOPscience, EBSCOhost and Wiley) was carried out, from which 683 scientific articles from the last 4 years were collected.

The following search equations were used to find diabetes-related research:

TABLE I. SEARCH EQUATIONS

Database        Equation
IEEE Xplore     prediction of diabetes by machine learning NOT deep learning NOT risk
Scopus          prediction AND of AND diabetes AND by AND machine AND learning AND NOT "deep learning" AND NOT risk
ScienceDirect   diabetes AND machine learning AND predict model NOT deep learning NOT drug
IOPscience      diabetes AND machine learning AND predict model
EBSCOhost       prediction of diabetes by machine learning NOT deep learning NOT risk
Wiley           prediction of diabetes by machine learning NOT deep learning NOT risk

To select the research, search formulas and keywords were applied first, followed by the inclusion and exclusion criteria and, finally, the removal of duplicates, which yielded 55 relevant papers.

[Figure 1: diagram of the selection stages (search by formulas and keywords; applying exclusion and inclusion criteria; duplicate removal), with the number of articles found and the number of relevant papers retained per database]

Database        Found   Relevant
IEEE            78      22
SCOPUS          162     15
ScienceDirect   331     12
IOPscience      74      2
EBSCOhost       18      2
Wiley           20      2
Total relevant papers: 55

Figure 1. Selection methodology diagram

D. Inclusion and Exclusion criteria
The inclusion and exclusion criteria presented in the following table were applied in the systematic review.

TABLE II. INCLUSION AND EXCLUSION CRITERIA

Criteria
Inclusion
  I01  Articles related to the development or performance comparison of diabetes mellitus prediction models.
  I02  Articles related to the application or implementation of machine learning models for the prediction of diabetes mellitus.
  I03  Articles related to the prediction of type 1 diabetes, type 2 diabetes or gestational diabetes using machine learning.
Exclusion
  E01  Articles unrelated to the development of machine learning models for the prediction of diabetes mellitus.
  E02  Articles unrelated to the implementation of machine learning models for diabetes prediction.
  E03  Articles unrelated to the prediction of type 1 diabetes, type 2 diabetes or gestational diabetes using machine learning.

A total of 683 articles were analyzed, of which 3 duplicates were discarded. From the remainder, 55 articles were selected for the systematic review; in all, 628 articles were excluded, either as duplicates, under the exclusion criteria, or because they did not contribute to answering the research questions.

[Figure 2: flow diagram]
Identification: articles identified through database searching (n = 683); articles identified through other sources (n = 0).
Articles after duplicates removed (n = 680).
Screening: articles to be screened (n = 80); articles excluded (n = 625), did not meet the inclusion criteria.
Eligibility: articles assessed for eligibility (n = 55); articles excluded with reasons (n = 0), did not address the research question.
Included: articles included in quantitative synthesis (meta-analysis) (n = 55).

Figure 2. Document Inclusion and Exclusion Flowchart

III. RESULTS
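As a consistency check on the selection process, the per-database counts can be tallied in a few lines of Python. The sketch below is illustrative only: the (found, selected) pairs are a reading of the numbers printed in the selection diagram (Figure 1), and the variable names are hypothetical. It recomputes the two totals and each database's percentage share of the selected articles.

```python
# Counts as read from the selection diagram (Figure 1): database -> (found, selected).
# The pairing of numbers with databases is an assumption based on that diagram.
counts = {
    "IEEE Xplore": (78, 22),
    "Scopus": (162, 15),
    "ScienceDirect": (331, 12),
    "IOPscience": (74, 2),
    "EBSCOhost": (18, 2),
    "Wiley": (20, 2),
}

# Totals across all databases.
total_found = sum(found for found, _ in counts.values())
total_selected = sum(selected for _, selected in counts.values())

# Percentage share of each database among the selected articles, to one decimal.
contribution = {
    name: round(100 * selected / total_selected, 1)
    for name, (_, selected) in counts.items()
}

print(f"found: {total_found}, selected: {total_selected}")
for name, pct in contribution.items():
    print(f"{name}: {pct}%")
```

Under these assumptions the totals come to 683 articles found and 55 selected, matching the figures stated in the search strategy. Shares computed this way are rounded independently, so they may differ by a point from the labels of a chart drawn from the same data.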
The following graph represents the percentage contribution of each of the databases in this review.

[Figure 3: pie chart; slice labels: 40%, 27%, 22%, 4%, 4%, 3%; legend: EBSCOhost, IEEE Xplore, IOPScience, ScienceDirect, SCOPUS, WILEY]

Figure 3. Contribution by Database

The following graph represents the number of published articles focused on developing and/or evaluating mathematical models that allow the prediction of diabetes in the world, by year and database.

[Figure 4: bar chart; x-axis: year (2018, 2019, 2020, 2021); y-axis: number of articles (0 to 9); series: EBSCOhost, IEEE Xplore, IOPScience, ScienceDirect, SCOPUS, WILEY]

Figure 4. Articles by year and database

The following graph shows the number of articles by country; the countries that have published the largest number of scientific research papers are shown with greater intensity.

[Figure 5: world map shaded by number of articles per country]

Figure 5. Map of articles by country

TABLE III. ARTICLES BY COUNTRY

Country                 Quantity  Articles
Saudi Arabia            1         [6]
Bangladesh              1         [7]
China                   4         [8], [9], [10], [11]
Korea                   1         [12]
United Arab Emirates    1         [13]
India                   32        [2], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23],
                                  [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34],
                                  [35], [36], [37], [38], [39], [40], [41], [42], [43], [44]
Indonesia               3         [45], [46], [47]
England                 1         [1]
Israel                  1         [48]
Italy                   1         [49]
North Macedonia         1         [50]
Morocco                 1         [51]
Pakistan                1         [52]
Poland                  1         [53]
USA                     5         [54], [55], [56], [57], [58]
Total                   55

The following graph represents the number of articles published by continent, where Asia is predominant.

[Figure 6: horizontal bar chart; x-axis: number of articles (0 to 50)]

Continent   Articles
Asia        45
America     5
Europe      4
Africa      1

Figure 6. Articles by continent

The following graph represents the number of articles by database and type of diabetes.


[Figure 7: bar chart; x-axis: database (EBSCOhost, IEEE Xplore, IOPScience, ScienceDirect, SCOPUS, WILEY); y-axis: number of articles (0 to 25); series: Type 1, Gestational, Type 2, Diabetes (type not specified)]

Figure 7. Articles by database and type of diabetes

The following table shows the categories of articles according to the results found.

TABLE IV. ARTICLES BY TYPE OF DIABETES

Categories                                   Articles
Type 1 Diabetes Mellitus                     [49], [8]
Type 2 Diabetes Mellitus                     [50], [54], [14], [15], [16], [19], [20], [32], [33], [35], [36],
                                             [57], [12], [38], [58], [6], [47], [52], [42], [43]
Gestational Diabetes Mellitus                [2], [13], [9], [41]
Diabetes Mellitus (Type is not specified)    [1], [55], [17], [45], [18], [21], [56], [7], [22], [23], [24],
                                             [46], [25], [26], [51], [27], [28], [29], [30], [31], [34], [37],
                                             [48], [10], [39], [40], [53], [11], [44]

The following graph represents the number of articles per dataset used for the development of the models.

[Figure 8: horizontal bar chart; x-axis: number of articles (0 to 40); categories: PIMA Indian Diabetes; Hospitals, Clinics and…; Own Datasets; Institutions and Foundations; UCI Datasets; Other Datasets; bar values shown: 32, 9, 7, 4, 2, 1]

Figure 8. Articles by dataset

The following graph represents the number of articles per machine learning algorithm used in the research.

[Figure 9: horizontal bar chart; x-axis: number of articles (0 to 12)]

Algorithm                        Articles
AdaBoost                         1
C4.5                             1
Convolutional Neural Networks    1
Decision Tree                    2
Decision Tree+Neural network     1
Extreme Learning Machine         1
Glmnet                           1
Gradient Boost                   2
Gradient Boosted Trees           1
K-Nearest Neighbor               3
Logistic Model Tree              1
Logistic Regression              3
Modified Support Vector…         1
Naive Bayes                      3
Neural Networks                  8
Random Forest                    10
RBF + C4.5 + NB                  1
Support Vector Machine           10
Tree Partitioning Adaptive…      1
Two-class Logistic Regression    1
XGBoost                          2

Figure 9. Articles by machine learning algorithm

The following graph shows the number of articles per evaluation metric identified in the research articles.

[Figure 10: horizontal bar chart; x-axis: number of articles (0 to 60)]

Metric             Articles
Accuracy           51
Precision          21
Recall             18
Specificity        13
Sensitivity        12
AUC                10
ROC                8
F-Measure          8
F1-Score           8
Error Rate         5
MCC                3
FP-Rate            3
Cross Validation   3
Other              1

Figure 10. Articles by evaluation metrics

According to the metrics chart, accuracy is the most frequently reported metric; the models with the best performance on this metric are shown below.

TABLE V. MODELS WITH BETTER ACCURACY

Model                      Accuracy  Article
Support Vector Machine     100%      [49]
Random Forest              99%       [30]
Extreme Learning Machine   99%       [28]
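Most of the metrics counted in Figure 10 are derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, not code taken from any of the reviewed articles: the function name `classification_metrics` and the example counts are hypothetical, and the function assumes that no denominator is zero.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Assumes every denominator below is non-zero.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as sensitivity
    specificity = tn / (tn + fp)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "fp_rate": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Hypothetical example: 100 patients, 10 with diabetes, of whom the model detects 2.
m = classification_metrics(tp=2, fp=0, tn=90, fn=8)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

For these made-up counts accuracy is 0.92 while recall is 0.2, which illustrates that the metrics listed in Figure 10 measure different things and are not interchangeable.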


The following graph represents the number of articles per model development tool used in the research.

[Figure 11: bar chart; y-axis: number of articles (0 to 35); bar values shown: 31, 10, 4, 4, 2, 1, 1, 1, 1]

Figure 11. Articles by model development tool

The following graph represents the number of articles by programming language used in the research.

[Figure 12: chart; categories: GUI, M, PHP/JavaScript…, Python, Python/R, R, Not indicated; values shown: 23, 12, 10, 4, 4, 1, 1]

Figure 12. Articles by programming language

The following table shows the tools and languages used in the research with the highest performing models.

TABLE VI. TOOLS AND LANGUAGES ACCORDING TO MODELS WITH BETTER ACCURACY

Tool       Language     Model                      Accuracy  Article
MATLAB     M language   Support Vector Machine     100%      [49]
R Studio   R            Random Forest              99%       [30]
MATLAB     M language   Extreme Learning Machine   99%       [28]

IV. DISCUSSION
In this systematic review of the scientific literature, we analyze machine learning models for diabetes prediction in order to identify the best machine learning models, the most frequent implementation tools, and the largest amount of research according to the type of diabetes and country, and thereby answer the proposed questions:

RQ1: Which diabetes mellitus prediction models have shown the best results according to performance metrics over the past 4 years?

Figure 10 shows that most of the articles analyzed in this review use accuracy as a metric to evaluate their models. This result allows us to use accuracy as the reference for identifying the models with the best performance.

According to Table V, the Support Vector Machine, Random Forest and Extreme Learning Machine models obtained the best performance, with 100%, 99% and 99% accuracy, respectively.

It is also worth noting that, according to Figure 9, Support Vector Machine and Random Forest are the models most used by researchers.

RQ2: Which tools and languages are the most widely used in the world to develop or implement machine learning models for diabetes mellitus prediction?

According to Figure 11, although most of the researchers do not indicate the development environment of their models, a large number use the free software platform Weka as their modeling, visualization and data analysis tool. These authors describe it as a simple and intuitive tool that makes those tasks much easier to carry out.

Furthermore, according to Figure 12, Graphical User Interfaces (GUI) are used for modeling more often than programming languages themselves.

According to Table VI, MATLAB, which uses the M language, is the tool with which the highest accuracy (100%) was obtained when developing a machine learning model (Support Vector Machine), followed by R Studio with the R language, with which a Random Forest model reached an accuracy of 99%.

RQ3: In which countries has the most research related to the prediction of diabetes mellitus been conducted in the last 4 years?

Figure 4 shows that over the last four years (2018-2021) the amount of research related to diabetes mellitus prediction has grown. Between 2019 and 2020 a large number of publications appeared, especially in the IEEE Xplore database, followed by Scopus; this result is supported by Figure 3. This indicates that researchers prefer these databases for their scientific publications on diabetes prediction.

According to Figure 6, all the articles analyzed in the present review come from Asia, North America, Europe and Africa (from highest to lowest). This result indicates that the development of machine learning models for the prediction of diabetes mellitus is a recurrent research topic almost all over the world.

According to Figure 5 and Table III, the vast majority of the research was conducted in India, followed by China. This result indicates that these are the countries with the most experience in developing machine learning models for the prediction of diabetes mellitus; it is therefore important to include research from these countries in future studies.

RQ4: What type of diabetes mellitus has had the highest amount of scientific research focused on prediction with machine learning worldwide?

According to Table IV, most of the articles analyzed in the present review do not focus on a specific type of diabetes mellitus. A significant number seek to predict type 2 diabetes mellitus, and a small group focuses on gestational diabetes and type 1 diabetes.

Figure 7 shows that in almost all the databases consulted for this review, at least one study related to the prediction of diabetes mellitus was found. This result indicates that research on the development of models for the prediction of diabetes mellitus is relevant in the most important scientific databases.

In addition, Figure 8 shows that the vast majority of the articles analyzed used the PIMA Indian Diabetes dataset to train prediction models, while a minority generated their own dataset. This result indicates that the information provided by this dataset is highly valued by researchers. It is also related to the greater amount of research on predictive models for type 2 diabetes and gestational diabetes, because the variables in these data are identified as determinants for the prediction of those types.

RELATED WORK

Other systematic reviews, such as [59], analyze diabetes predictions based on machine learning and deep learning techniques published in the last six years (2013 to 2019). That review identified the main datasets as Electrocardiograms, Breath Dataset, ICU Datasets and PIMA Indian Diabetes, with the latter being predominant. Likewise, it determined that the most used classifiers for diabetes prediction are Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree and Naive Bayes.

In [60], 31 articles were selected to identify the applications of artificial intelligence (AI) for the care of type 2 diabetes mellitus; the main applications were screening and diagnosis. Among all the AI methods reviewed, machine learning methods were the most applied techniques, and the most used methods were Support Vector Machine and Naive Bayes. The most important variables used in the selected studies for the prediction of type 2 diabetes were also identified.

In contrast to our target group of articles, the systematic review [61] evaluates the use and performance of ML prediction models applied in community settings for early detection of risk groups. Studies conducted in hospitals, primary care centers and laboratories were excluded from that review, which took into account twenty-three articles published between 2009 and 2019. It found that models based on Artificial Neural Network showed the highest predictive capacity, followed by Logistic Regression, Decision Trees and Random Forest.

In the present study, 55 articles were selected to identify the most used machine learning models, as in other studies. In addition, we identify the ML models for diabetes prediction with the best performance according to the metrics recommended by numerous studies, the most used development or implementation tools, and the largest amount of research by type of diabetes and country. We agree with study [59] in identifying PIMA Indian Diabetes as the most used dataset, as shown in Figure 8. Regarding the most used models, according to Figure 9 these are Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest; we agree on two models with the results of article [59] and on one model with study [60]. Regarding the ML models with the best performance according to the accuracy metric, we agree on one model with article [61].

V. CONCLUSIONS
After carrying out a systematic review of the scientific literature covering 55 articles on the topic, it is concluded that:

The diabetes mellitus prediction models that have shown the best results in the last 4 years are Support Vector Machine (100%), Random Forest (99%) and Extreme Learning Machine (99%), according to accuracy. Accuracy was chosen as the metric for identifying the best models because most of the studies report it and treat it as a determining metric.

The tools and languages most commonly used worldwide to develop or implement machine learning models are the free software platform Weka, for its ease of use, MATLAB, which uses the M language, and R Studio with the R language. The last two tools produced the models with the highest accuracy scores.

The vast majority of research is conducted in India, followed by China. This result indicates that these countries have the greatest experience in, and preference for, developing predictive models for diabetes.

Likewise, most studies do not focus on a specific type of diabetes mellitus, although a significant number seek to predict type 2 diabetes mellitus.
Based on the results obtained, future research within the scope of this review is recommended to develop predictive models with the Support Vector Machine algorithm, given the good results obtained, to consider accuracy as a metric for evaluating model performance, and to use MATLAB or R Studio as development tools.

VI. REFERENCES

[1] N. Nnamoko, A. Hussain, and D. England, “Predicting Diabetes Onset: An Ensemble Supervised Learning Approach,” 2018 IEEE Congr. Evol. Comput. CEC 2018 - Proc., pp. 1–7, 2018, doi: 10.1109/CEC.2018.8477663.
[2] I. Gnanadass, “Prediction of Gestational Diabetes by Machine Learning Algorithms,” IEEE Potentials, vol. 39, no. 6, pp. 32–37, 2020, doi: 10.1109/MPOT.2020.3015190.
[3] J. M. Rippe, “The Silent Epidemic,” Am. J. Med., vol. 134, no. 2, pp. 164–165, 2021, doi: 10.1016/j.amjmed.2020.09.028.
[4] WHO, “Recommendations for people living with NCDs, caregivers, family members and the public,” World Heal. Organ., no. April, pp. 1–6, 2020, [Online]. Available: https://apps.who.int/iris/handle/10665/331473.
[5] W. Mengist, T. Soromessa, and G. Legese, “Method for conducting systematic literature review and meta-analysis for environmental science research,” MethodsX, vol. 7, p. 100777, 2020, doi: 10.1016/j.mex.2019.100777.
[6] S. P. Chatrati et al., “Smart home health monitoring system for predicting type 2 diabetes and hypertension,” J. King Saud Univ. - Comput. Inf. Sci., no. xxxx, Jan. 2020, doi: 10.1016/j.jksuci.2020.01.010.
[7] S. K. Dey, A. Hossain, and M. M. Rahman, “Implementation of a Web Application to Predict Diabetes Disease: An Approach Using Machine Learning Algorithm,” 2018 21st Int. Conf. Comput. Inf. Technol. ICCIT 2018, pp. 1–5, 2019, doi: 10.1109/ICCITECHN.2018.8631968.
[8] J. Xue, F. Min, and F. Ma, “Research on diabetes prediction method based on machine learning,” J. Phys. Conf. Ser., vol. 1684, no. 1, 2020, doi: 10.1088/1742-6596/1684/1/012062.
[9] H. Liu et al., “Machine learning risk score for prediction of gestational diabetes in early pregnancy in Tianjin, China,” Diabetes. Metab. Res. Rev., no. February, 2020, doi: 10.1002/dmrr.3397.
[10] G. Li, Y. Liu, H. Li, R. Yao, and C. Li, “MCMC impute missing values and Bayesian variable selection for logistic regression model to predict Pima Indian Diabetes,” J. Phys. Conf. Ser., vol. 1865, no. 4, p. 042087, Apr. 2021, doi: 10.1088/1742-6596/1865/4/042087.
[11] C. Zhu, C. U. Idemudia, and W. Feng, “Improved logistic regression model for diabetes prediction by integrating PCA and K-means techniques,” Informatics Med. Unlocked, vol. 17, no. April, p. 100179, 2019, doi: 10.1016/j.imu.2019.100179.
[12] H. M. Deberneh and I. Kim, “Prediction of type 2 diabetes based on machine learning algorithm,” Int. J. Environ. Res. Public Health, vol. 18, no. 6, pp. 9–11, 2021, doi: 10.3390/ijerph18063317.
[13] Y. Srivastava, P. Khanna, and S. Kumar, “Estimation of Gestational Diabetes Mellitus using Azure AI Services,” Proc. - 2019 Amity Int. Conf. Artif. Intell. AICAI 2019, pp. 321–326, 2019, doi: 10.1109/AICAI.2019.8701307.
[14] M. Tanvir Islam, M. Raihan, F. Farzana, P. Ghosh, and S. Ahmed Shaj, “An empirical study on diabetes mellitus prediction using apriori algorithm,” Adv. Intell. Syst. Comput., vol. 1166, pp. 539–550, 2021, doi: 10.1007/978-981-15-5148-2_48.
[15] M. T. Islam, M. Raihan, F. Farzana, N. Aktar, P. Ghosh, and S. Kabiraj, “Typical and Non-Typical Diabetes Disease Prediction using Random Forest Algorithm,” 2020 11th Int. Conf. Comput. Commun. Netw. Technol. ICCCNT 2020, pp. 1–6, 2020, doi: 10.1109/ICCCNT49239.2020.9225430.
[16] A. Mir and S. N. Dhage, “Diabetes Disease Prediction Using Machine Learning on Big Data of Healthcare,” Proc. - 2018 4th Int. Conf. Comput. Commun. Control Autom. ICCUBEA 2018, pp. 1–6, 2018, doi: 10.1109/ICCUBEA.2018.8697439.
[17] K. L. Priya, M. S. Charan Reddy Kypa, M. M. Sudhan Reddy, and G. R. Mohan Reddy, “A Novel Approach to Predict Diabetes by Using Naive Bayes Classifier,” Proc. 4th Int. Conf. Trends Electron. Informatics, ICOEI 2020, no. Icoei, pp. 603–607, 2020, doi: 10.1109/ICOEI48184.2020.9142959.
[18] R. S. Raj, D. S. Sanjay, M. Kusuma, and S. Sampath, “Comparison of Support Vector Machine and Naïve Bayes Classifiers for Predicting Diabetes,” 1st Int. Conf. Adv. Technol. Intell. Control. Environ. Comput. Commun. Eng. ICATIECE 2019, pp. 41–45, 2019, doi: 10.1109/ICATIECE45860.2019.9063792.
[19] R. Syed, R. K. Gupta, and N. Pathik, “An Advance Tree Adaptive Data Classification for the Diabetes Disease Prediction,” 2018 Int. Conf. Recent Innov. Electr. Electron. Commun. Eng. ICRIEECE 2018, pp. 1793–1798, 2018, doi: 10.1109/ICRIEECE44171.2018.9009180.
[20] G. Tripathi and R. Kumar, “Early Prediction of Diabetes Mellitus Using Machine Learning,” ICRITO 2020 - IEEE 8th Int. Conf. Reliab. Infocom Technol. Optim. (Trends Futur. Dir.), pp. 1009–1014, 2020, doi: 10.1109/ICRITO48877.2020.9197832.
[21] D. Vigneswari, N. K. Kumar, V. Ganesh Raj, A. Gugan, and S. R. Vikash, “Machine Learning Tree Classifiers in Predicting Diabetes Mellitus,” 2019 5th Int. Conf. Adv. Comput. Commun. Syst. ICACCS 2019, pp. 84–87, 2019, doi: 10.1109/ICACCS.2019.8728388.
[22] S. C. Gupta and N. Goel, “Performance enhancement of diabetes prediction by finding optimum K for KNN classifier with feature selection method,” Proc. 3rd Int. Conf. Smart Syst. Inven. Technol. ICSSIT 2020, no. Icssit, pp. 980–986, 2020, doi: 10.1109/ICSSIT48917.2020.9214129.
[23] P. S. Kohli and A. L. Regression, “Application of Machine Learning in Disease Prediction,” 2020 IEEE 5th Int. Conf. Comput. Commun. Autom. ICCCA 2020, pp. 1–4, 2020.
[24] P. Kaur, N. Sharma, A. Singh, and B. Gill, “CI-DPF: A Cloud IoT based Framework for Diabetes Prediction,” 2018 IEEE 9th Annu. Inf. Technol. Electron. Mob. Commun. Conf. IEMCON 2018, pp. 654–660, 2019, doi: 10.1109/IEMCON.2018.8614775.
[25] Karthikeyan S. M, C. P.J, G. C. B, and M. J, “Performance Analysis Based on Data Mining Technique in Predicting the Diabetic Disease – Decision tree and Naïve Bayes,” 2019 1st Int. Conf. Adv. Inf. Technol., pp. 2019–2022, 2019.
[26] S. Thenappan, M. Valan Rajkumar, and P. S. Manoharan, “Predicting Diabetes Mellitus Using Modified Support Vector Machine with Cloud Security,” IETE J. Res., vol. 0, no. 0, pp. 1–11, 2020, doi: 10.1080/03772063.2020.1782781.
[27] R. Patil and S. Tamane, “A comparative analysis on the evaluation of classification algorithms in the prediction of diabetes,” Int. J. Electr. Comput. Eng., vol. 8, no. 5, pp. 3966–3975, 2018, doi: 10.11591/ijece.v8i5.pp3966-3975.
[28] B. Suvarnamukhi and M. Seshashayee, “Big data processing system for diabetes prediction using machine learning technique,” Int. J. Innov. Technol. Explor. Eng., vol. 8, no. 12, pp. 4478–4483, 2019, doi: 10.35940/ijitee.L3515.1081219.
[29] R. G. Franklin and B. Muthukumar, “Detection of diabetes mellitus using machine learning algorithms,” Int. J. Res. Pharm. Sci., vol. 11, no. 4, pp. 6881–6887, 2020, doi: 10.26452/ijrps.v11i4.3662.
[30] P. S. Kumar and S. Pranavi, “Performance analysis of machine learning algorithms on diabetes dataset using big data analytics,” 2017 Int. Conf. Infocom Technol. Unmanned Syst. Trends Futur. Dir. ICTUS 2017, vol. 2018-Janua, no. Iddm, pp. 508–513, 2018, doi: 10.1109/ICTUS.2017.8286062.
[31] P. Pandeeswary and M. Janaki, “Performance analysis of big data classification techniques on diabetes prediction,” Int. J. Innov. Technol. Explor. Eng., vol. 8, no. 10, pp. 533–537, 2019, doi: 10.35940/ijitee.J8840.0881019.
[32] J. Beschi Raja, R. Anitha, R. Sujatha, V. Roopa, and S. Sam Peter, “Diabetics prediction using gradient boosted classifier,” Int. J. Eng. Adv. Technol., vol. 9, no. 1, pp. 3181–3183, 2019, doi: 10.35940/ijeat.A9898.109119.
[33] P. A. Ebenzer, R. Bhattalwar, H. Patel, and R. Kumar, “Patient readmission prediction due to diabetes using machine learning classification,” Int. J. Innov. Technol. Explor. Eng., vol. 9, no. 1, pp. 678–681, 2019, doi: 10.35940/ijitee.A4561.119119.
[34] S. Raghavendra and J. Santosh Kumar, “Performance evaluation of random forest with feature selection methods in prediction of diabetes,” Int. J. Electr. Comput. Eng., vol. 10, no. 1, pp. 353–359, 2020, doi: 10.11591/ijece.v10i1.pp353-359.
[35] M. T. Student, K. Lakshmaih, E. Foundation, and G. District, “Diabetic Prediction Using Kernel Based Support Vector Machine,” vol. 9, no. 2, pp. 1178–1183, 2020.
[36] M. S. Geetha Devasena, R. Kingsy Grace, and G. Gopu, “PDD: Predictive diabetes diagnosis using datamining algorithms,” 2020 Int. Conf. Comput. Commun. Informatics, ICCCI 2020, pp. 22–25, 2020, doi: 10.1109/ICCCI48352.2020.9104108.
[37] S. C. Gupta and N. Goel, “Enhancement of Performance of K-Nearest Neighbors Classifiers for the Prediction of Diabetes Using Feature Selection Method,” 2020 IEEE 5th Int. Conf. Comput. Commun. Autom. ICCCA 2020, pp. 681–686, 2020, doi: 10.1109/ICCCA49541.2020.9250887.
[38] V. L. Helen Josephine, A. P. Nirmala, and V. L. Alluri, “Impact of Hidden Dense Layers in Convolutional Neural Network to enhance Performance of Classification Model,” IOP Conf. Ser. Mater. Sci. Eng., vol. 1131, no. 1, p. 012007, Apr. 2021, doi: 10.1088/1757-899X/1131/1/012007.
[39] A. Mujumdar and V. Vaidehi, “Diabetes Prediction using Machine Learning Algorithms,” Procedia Comput. Sci., vol. 165, pp. 292–299, 2019, doi: 10.1016/j.procs.2020.01.047.
[40] P. Samant and R. Agarwal, “Machine learning techniques for medical diagnosis of diabetes using iris images,” Comput. Methods Programs Biomed., vol. 157, pp. 121–128, 2018, doi: 10.1016/j.cmpb.2018.01.004.
[41] D. Sisodia and D. S. Sisodia, “Prediction of Diabetes using Classification Algorithms,” Procedia Comput. Sci., vol. 132, no. Iccids, pp. 1578–1585, 2018, doi: 10.1016/j.procs.2018.05.122.
[42] D. Jashwanth Reddy et al., “Predictive machine learning model for early detection and analysis of diabetes,” Mater. Today Proc., no. xxxx, 2020, doi: 10.1016/j.matpr.2020.09.522.
[43] N. P. Tigga and S. Garg, “Prediction of Type 2 Diabetes using Machine Learning Classification Methods,” Procedia Comput. Sci., vol. 167, no. 2019, pp. 706–716, 2020, doi: 10.1016/j.procs.2020.03.336.
[44] B. Jain, N. Ranawat, P. Chittora, P. Chakrabarti, and S. Poddar, “A machine learning perspective: To analyze diabetes,” Mater. Today Proc., no. xxxx, 2021, doi: 10.1016/j.matpr.2020.12.445.
[45] M. Radja and A. W. R. Emanuel, “Performance Evaluation of Supervised Machine Learning Algorithms Using Different Data Set Sizes for Diabetes Prediction,” Proceeding - 2019 5th Int. Conf. Sci. Inf. Technol. Embrac. Ind. 4.0 Towar. Innov. Cyber Phys. Syst. ICSITech 2019, pp. 252–258, 2019, doi: 10.1109/ICSITech46713.2019.8987479.
[46] R. Aminah and A. H. Saputro, “Diabetes prediction system based on iridology using machine learning,” 2019 6th Int. Conf. Inf. Technol. Comput. Electr. Eng. ICITACEE 2019, pp. 1–6, 2019, doi: 10.1109/ICITACEE.2019.8904125.
[47] R. B. Lukmanto, Suharjito, A. Nugroho, and H. Akbar, “Early detection of diabetes mellitus using feature selection and fuzzy support vector machine,” Procedia Comput. Sci., vol. 157, pp. 46–54, 2019, doi: 10.1016/j.procs.2019.08.140.
[48] A. Cahn et al., “Prediction of progression from pre-diabetes to diabetes: Development and validation of a machine learning model,” Diabetes. Metab. Res. Rev., vol. 36, no. 2, pp. 1–8, 2020, doi: 10.1002/dmrr.3252.
[49] E. Cordelli, G. Maulucci, M. De Spirito, A. Rizzi, D. Pitocco, and P. Soda, “A decision support system for type 1 diabetes mellitus diagnostics based on dual channel analysis of red blood cell membrane fluidity,” Comput. Methods Programs Biomed., vol. 162, pp. 263–271, 2018, doi: 10.1016/j.cmpb.2018.05.025.
[50] L. Loku, B. Fetaji, and M. Fetaji, “Prevention of Diabetes by Devising A Prediction Analytics Model,” HORA 2020 - 2nd Int. Congr. Human-Computer Interact. Optim. Robot. Appl. Proc., pp. 1–4, 2020, doi: 10.1109/HORA49412.2020.9152894.
[51] T. Nibareke and J. Laassiri, “Using Big Data-machine learning models for diabetes prediction and flight delays analytics,” J. Big Data, vol. 7, no. 1, 2020, doi: 10.1186/s40537-020-00355-0.
[52] T. Mahboob Alam et al., “A model for early prediction of diabetes,” Informatics Med. Unlocked, vol. 16, no. July, p. 100204, 2019, doi: 10.1016/j.imu.2019.100204.
[53] A. Viloria, Y. Herazo-Beltran, D. Cabrera, and O. B. Pineda, “Diabetes Diagnostic Prediction Using Vector Support Machines,” Procedia Comput. Sci., vol. 170, pp. 376–381, 2020, doi: 10.1016/j.procs.2020.03.065.
[54] R. Lee and C. Chitnis, “Improving health-care systems by disease prediction,” Proc. - 2018 Int. Conf. Comput. Sci. Comput. Intell. CSCI 2018, pp. 726–731, 2018, doi: 10.1109/CSCI46756.2018.00145.
[55] R. Deo and S. Panigrahi, “Performance Assessment of Machine Learning Based Models for Diabetes Prediction,” 2019 IEEE Healthc. Innov. Point Care Technol. HI-POCT 2019, pp. 147–150, 2019, doi: 10.1109/HI-POCT45284.2019.8962811.
[56] J. Ma, “Machine Learning in Predicting Diabetes in the Early Stage,” Proc. - 2020 2nd Int. Conf. Mach. Learn. Big Data Bus. Intell. MLBDBI 2020, pp. 167–172, 2020, doi: 10.1109/MLBDBI51377.2020.00037.
[57] L. Kopitar, P. Kocbek, L. Cilar, A. Sheikh, and G. Stiglic, “Early detection of type 2 diabetes mellitus using machine learning-based prediction models,” Sci. Rep., vol. 10, no. 1, pp. 1–12, 2020, doi: 10.1038/s41598-020-68771-z.
[58] J. J. Khanam and S. Y. Foo, “A comparison of machine learning algorithms for diabetes prediction,” ICT Express, no. xxxx, 2021, doi: 10.1016/j.icte.2021.02.004.
[59] S. Larabi-Marie-Sainte, L. Aburahmah, R. Almohaini, and T. Saba, “Current techniques for diabetes prediction: Review and case study,” Appl. Sci., vol. 9, no. 21, 2019, doi: 10.3390/app9214604.
[60] S. Abhari, S. R. N. Kalhori, M. Ebrahimi, H. Hasannejadasl, and A. Garavand, “Artificial intelligence applications in type 2 diabetes mellitus care: Focus on machine learning methods,” Healthc. Inform. Res., vol. 25, no. 4, pp. 248–261, 2019, doi: 10.4258/hir.2019.25.4.248.
[61] K. De Silva, W. K. Lee, A. Forbes, R. T. Demmer, C. Barton, and J. Enticott, “Use and performance of machine learning models for type 2 diabetes prediction in community settings: A systematic review and meta-analysis,” Int. J. Med. Inform., vol. 143, no. August, p. 104268, 2020, doi: 10.1016/j.ijmedinf.2020.104268.
