You are on page 1of 11

Type 2 Diabetes CVD and Hypertension Vigillance

Using Machine Learning


1
Alok Kumar Verma
Department of Computer Science and 3
Abhishek Kumar Singh
Engineering Department of Computer Science and Engineering
GLBITM Greater Noida GLBITM Greater Noida
UP, INDIA UP, INDIA
Alok89123patel@gmail.com abhisheksingh90627@gmail.com
2
Aniket Pal
Department of Computer Science and
Engineering
GLBITM Greater Noida
UP, INDIA
palaniket715@gmail.com
Abstract— This major project focuses on predictive managing their blood glucose levels through medication,
modeling for three critical health conditions: diabetes, lifestyle modifications, and vigilant monitoring.
cardiovascular disease (CVD), and hypertension,
utilizing diverse machine learning algorithms. For the The chronic nature of diabetes necessitates consistent
diabetes dataset, an ensemble of models including attention to diet, exercise, and medication, influencing
Random Forest, Decision Tree, XG Boost, Support various aspects of daily life.
Vector Machine (SVM), K-Nearest Neighbors (KNN),
and Logistic Regression was employed. Among these, Similarly, cardiovascular disease (CVD) encompasses a
KNN exhibited outstanding performance, achieving a range of disorders affecting the heart and blood vessels,
maximum accuracy of 99.86% in diabetes prediction. including coronary artery disease, heart failure, and stroke.
In exploring the relationships between diabetes and CVD can disrupt the normal functioning of the circulatory
CVD, Bayesian Theorem was applied to the PIMA system, leading to complications that demand lifestyle
Indian dataset. This dataset serves as a valuable adjustments, medication adherence, and regular medical
resource for understanding the interconnectedness of check-ups. The impact of CVD extends beyond physical
these health conditions. The integration of Bayesian health, affecting emotional well-being and imposing
analysis enhances the understanding of restrictions on daily activities.
theprobabilistic relationships between diabetes and
CVD,contributing to a more comprehensive
perspective on their associations. The results of this Hypertension, commonly known as high blood pressure, is
major project offer insights into effective predictive a condition where the force of blood against the artery walls
modeling for diabetes, CVD, and hypertension, is consistently too high. This condition, often
showcasing the strengths of various machine learning asymptomatic, poses a silent threat, increasing the risk of
algorithms. The emphasis on accurate predictions and heart disease, stroke, and other complications. Managing
relationship analysis contributes to the advancement hypertension involves lifestyle modifications, medication
of healthcare analytics, fostering a more nuanced adherence, and regular monitoring of blood pressure,
understanding of complex health conditions. imposing a constant awareness of health.
I. INTRODUCTION In the context of India, the burden of these chronic diseases
Diabetes, cardiovascular disease (CVD), and is particularly profound. According to various studies, the
hypertension represent a trifecta of chronic health prevalence of diabetes, CVD, and hypertension in India has
conditions that significantly impact individuals' day-to- witnessed a steady rise, imposing a substantial public health
day lives, posing both immediate and long-term threats challenge. Kotecha et al. [1] underscore the growing impact
to their well-being. These conditions, often referred to as of cardiovascular events, with cardiovascular disease
non-communicable diseases (NCDs), have reached emerging as a leading cause of mortality. Notably, India
alarming proportions globally, affecting millions and faces a unique set of challenges in combating these chronic
contributing substantially to morbidity and mortality. diseases. A study by Gudi et al. [2] explores the
Diabetes is a metabolic disorder characterized by relationships between diabetes and cardiovascular diseases
elevated blood sugar levels, either due to insufficient using the PIMA Indian dataset, shedding light on the
insulin production or ineffective use of insulin by the specific nuances of these health conditions within the
body. Indian population. The importance of predictive analytics in
healthcare, especially for hypertension, is highlighted by
Individuals with diabetes often face the challenge of Kavakiotis et al. [4].
prevent severe complications, making predictive modeling a
The consequences of these chronic diseases are starkly pivotal tool in disease management.
evident in mortality statistics. According to the work by
Natarajan et al. [5], machine-learning algorithms are As we embark on this major project, the synthesis of insights
instrumental in assessing the risk of heart failure from these seminal studies lays the groundwork for our
admissions based on personal health record data. The endeavor to enhance predictive modeling for diabetes, CVD,
gravity of the situation is emphasized in a study by Misra and hypertension in the Indian context. The integration of
et al. [6], revealing the substantial death rates attributed diverse datasets, advanced machine learning algorithms, and
to diabetes, CVD, and hypertension in India. As we delve a nuanced understanding of the intricacies of these chronic
into this major project, the amalgamation of insights from diseases aims to contribute to the evolving landscape of
these studies and the application of advanced machine healthcare analytics, offering tangible benefits to individuals
learning techniques will serve as a robust foundation for and the healthcare system at large.
enhancing predictive modeling, early detection, and
ultimately, improving outcomes for individuals grappling
II. RELATED WORKS
with these chronic health conditions.
A. Understanding Blood Pressure Readings:
In addition to the challenges posed by diabetes,
cardiovascular disease (CVD), and hypertension in India, The American Heart Association (AHA) provides valuable
the research landscape has been enriched by studies such insights into blood pressure readings and their significance in
as that of Al-Masni et al. [3], where an ensemble of health assessment [Heart.org]. This foundational information
machine learning models was applied to predict diabetes, is crucial for our major project, offering a basis for
laying the groundwork for our predictive modeling understanding the role of blood pressure in the context of
endeavors. The incorporation of algorithms like Random hypertension.
Forest, Decision Tree, XG Boost, Support Vector
Machine (SVM), K-Nearest Neighbors (KNN), and
B. Predictive Modeling for Cardiovascular Events:
Logistic Regression reflects the diversity of approaches
explored in the pursuit of accurate diabetes prediction. Kotecha et al. [1] conducted a systematic review and meta-
Moreover, the insights derived from the PIMA Indian analysis, exploring machine learning techniques for
dataset, as utilized by Gudi et al. [2], serve as a valuable predicting cardiovascular events. This study serves as a
resource for understanding the intricate relationships significant reference for our project, showcasing the
between diabetes and cardiovascular diseases specific to importance of predictive modeling in the domain of
the Indian population. The inherent heterogeneity in cardiovascular disease (CVD).
genetic predispositions and lifestyle factors necessitates a
nuanced approach in predictive modeling for this
demographic. C. Machine Learning Applications in Diabetes:
The work of Al-Masni et al. [7] contributes to the field of
As the burden of diabetes and its complications grows,
diabetes prediction using machine learning techniques. By
the work by Kavakiotis et al. [4] underscores the
exploring a range of algorithms, including Random Forest,
importance of machine learning and data mining in
Decision Tree, XG Boost, Support Vector Machine (SVM),
diabetes research. Leveraging such advanced techniques
K-Nearest Neighbors (KNN), and Logistic Regression, this
becomesparamount in identifying patterns, risk factors,
study informs our approach to diabetes prediction.
and predictive markers that can inform early
interventions and personalized healthcare approaches.
The impact of these chronic diseases on mortality rates in D. Machine Learning in Healthcare Information Systems:
India is starkly evident in the findings by Natarajan et al. Jung et al. [8] delve into the application of machine learning
[5], where machine-learning algorithms play a crucial in healthcare information systems, emphasizing its potential
role in assessing the risk of heart failure admissions. The to enhance decision-making processes. This broader
integration of data-driven insights into clinical decision- perspective on machine learning in healthcare aligns with
making processes can potentially mitigate the escalating our project's goal of integrating predictive modeling into
health crisis caused by these chronic conditions. health monitoring systems.
Furthermore, the comprehensive approach taken by Misra
et al. [6] in exploring the relationship between obesity, E. Data Mining Methods in Diabetes Research:
dyslipidemia, and related metabolic disorders in South
Asians adds a crucial layer to our understanding. These Kavakiotis et al. not only delve into data mining methods in
interconnections emphasize the need for a holistic diabetes research but also illuminate the significance of
perspective in predictive modeling, considering the leveraging data-driven approaches for a nuanced
multifaceted nature of health issues prevalent in the understanding of the disease. Their work emphasizes the
region. transformative potential of advanced methodologies in
shaping effective strategies for diabetes management and
In addressing the challenges posed by hypertension, the intervention.
project draws inspiration from the work of Kotecha et al.
[1], whose systematic review and meta-analysis shed
light on the landscape of predicting cardiovascular F. Smart Home Health Monitoring for Diabetes and
events. Hypertension, often referred to as the "silent Hypertension: The research by Sun et al. [9] focuses on a
killer," demands vigilant monitoring and early smart home health monitoring system for predicting type 2
intervention to
diabetes and hypertension. This study aligns closely with collection of diverse datasets encompassing clinical records,
our project's objectives, offering inspiration for the lifestyle information, and genetic data. Following data
implementation of remote patient monitoring to predict preprocessing to ensure cleanliness and uniformity, the
and manage chronic conditions. dataset is strategically split for training and testing. Machine
learning models, including Random Forest, Decision Tree,
G. Integration of Smart Technologies in Healthcare: XG Boost, SVM, KNN, and Logistic Regression, are then
implemented to predict diabetes and hypertension, with a
The academia.edu article by Khan et al. [10] extends the focus on accuracy metrics. Simultaneously, a real-time health
discussion on smart home health monitoring systems, monitoring system is developed to collect continuous data on
emphasizing their role in predicting type 2 diabetes and blood glucose levels and blood pressure, integrated
hypertension. This source contributes to our seamlessly with the prediction models and equipped with
understanding of the integration of smart technologies in alerts for both individuals and healthcare providers. For
healthcare and its potential impact on chronic disease cardiovascular disease prediction, a specialized dataset is
management. employed, and models are evaluated for feature importance.
The workflow includes a risk stratification system to
categorize individuals based on their predicted risks.
H. Machine Learning-Based Predictive Models in
Bayesian analysis is applied to explore relationships between
Healthcare: Hossain et al. [11] present a machine
diabetes and cardiovascular diseases, while visualization
tools are developed for clear comprehension. Continuous
improvement, ethical considerations, and rigorous testing
culminate in the deployment of the system, accompanied by
comprehensive documentation and reporting. This holistic
workflow aims to empower individuals and healthcare
providers in the proactive management of chronic health
conditions.

Fig 1
learning-based predictive model for the early detection of
hypertension. This study provides relevant insights into
the application of machine learning specifically for
hypertension prediction, informing our approach to this
aspect of the major project.

I. Wearable Health Monitoring Systems: Fig2


The article by Kim et al. [12] published in Springer
discusses a wearable health monitoring system for the B. Data gathering and processing
prediction of hypertension. This source complements our The dataset for our project comprises essential attributes
exploration of monitoring systems, offering insights into crucial for predicting and managing chronic health conditions,
wearable technologies and their potential impact on specifically focusing on diabetes. The included attributes are
predicting hypertension. In conclusion, the related works Pregnancy, Glucose, blood pressure, skin thickness, Insulin,
cited provide a diverse range of perspectives, BMI (Body Mass Index), Diabetes Pedigree Function, Age,
methodologies, and applications related to diabetes, and Outcome (with values 1 representing Diabetic and 0
cardiovascular disease, and hypertension. These studies indicating Non-Diabetic) as shown in Table 1. The data
collectively inform and inspire our major project, source for this project is the renowned PIMA Indian Diabetes
offering a solid foundation for the integration of machine dataset, renowned for its relevance to diabetes research. The
learning, health monitoring systems, and predictive dataset is composed of 768 rows, each representing an
modeling in the context of chronic disease management individual case. This comprehensive dataset provides a
III. PROPOSED SYSTEM holistic view of critical health indicators and serves as the
The proposed system for our major project aims to foundational basis for our machine learning models and
develop an integrated and intelligent healthcare platform predictive analytics, aiming to enhance early detection,
that leverages machine learning and smart home health monitoring, and personalized health management for diabetes
monitoring to predict, monitor, and manage diabetes, and related conditions.
cardiovascular disease (CVD), and hypertension. This
system is designed to enhance early detection, provide
personalized insights, and empower both patients and
healthcare providers.
A. Workflow of proposed methodology
The workflow for our integrated diabetes, hypertension,
and cardiovascular disease prediction system is designed
to systematically address the complexities of chronic
health conditions. The journey begins with the meticulous
Table 1
C.Feature engineering Predicting CVD using all features:
Feature engineering involves the strategic selection of Models Accuracy Percentage
pertinent attributes, which directly influences the precision
Decision Tree 64.00%
of predictions. These attributes, also known as features, are
divided into two main categories: predictors and responses. Random Forest 72.12%
Predictors are variables that influence the outcome, while
Support Vector Machine 72.00%
responses are the resultant variables affected by predictors.
In accordance with specific research needs, the most XgBoost 64.00%
impactful features are identified for optimal accuracy in Logistic Regression 72.30%
experiments. For instance, in a given scenario, the features
of interest could be blood pressure and glucose levels, as KNN 72.00 %
outlined in Table 2. These selected features, namely Table 2
BloodPressure and Glucose, are recognized as predictors, In this scenario, changes in X do not appear to influence
while the variable Outcome is designated as the response. Y.
C.Predicting outcome
After getting the appropriate features during the feature Applying this concept to the proposed work, where
engineering steps, the outcomes of the classifiers are Glucose is represented on the X-axis and Blood Pressure
judged in terms of accuracy, scatter plot, and correlation on the Y-axis, the scatter plot in Fig. 4 suggests a
matrix. constant correlation. This means that as Glucose levels
change, Blood Pressure remains relatively constant,
Accuracy indicating a consistent relationship between the two
Accuracy is a commonly used metric to evaluate the variables.
performance of a model, especially in classification

problems. Accuracy is defined as the ratio of correctly


predicted instances to the total number of instances in the
dataset. It provides a general measure of how well a
model is performing across all classes.
For diabetes

Models Accuracy Percentage


Decision Tree 70%
Random Forest 83%
XgBoost 70%
Logistic Regression 78%
Support Vector Machine 78%
*Artificial Neural 81.10%
. Network

Table 3
Scatter
plot Fig 3
A scatter plot visually represents the relationship between
two variables, X and Y, allowing us to observe if there is Correlation matrix
a correlation between them. There are four possible
scenarios based on the patterns observed in the plot: A confusion matrix is a crucial tool for evaluating the
effectiveness of a machine learning model. It is presented in
Positive Correlation: When Y increases as X increases, it a tabular format and consists of four key values that fall into
signifies a positive correlation. In other words, as one two categories: predicted and true (actual). The matrix helps
variable goes up, the other tends to go up as well. in assessing the model's performance by comparing its
predictions to the actual outcomes.
Negative Correlation: If an increase in X is associated
with a decrease in Y, it indicates a negative correlation. 1. True Positive (TP): In instances where the model
This implies that as one variable increases, the other correctly predicts a positive outcome, indicating that the
tends to decrease. prediction aligns with the actual result. For example,
predicting a patient as diabetic, and the patient is indeed
Constant Correlation: In cases where X increases, but Y diabetic.
remains constant without any noticeable increase or
decrease, it is termed constant correlation. This suggests 2. True Negative (TN): This occurs when the model
that changes in X do not affect the value of Y accurately predicts a negative outcome, matching the true
No Correlation: When there is no discernible pattern or result. For instance, predicting a patient as non-diabetic, and
impact of X on Y, it is referred to as no correlation. the patient is indeed non-diabetic.
3. False Positive (FP): This occurs when the model
incorrectly predicts a positive outcome, suggesting that
the prediction does not align with the actual result. For
instance, predicting a patient as diabetic, but the patient
is, in fact, non-diabetic.

4.False Negative (FN): In situations where the model


inaccurately predicts a negative outcome, deviating
from the true result. An example would be predicting a
patient as non-diabetic, but the patient is, in reality,
diabetic.
The confusion matrix, as illustrated in Fig. 5 for the
proposed system, is color-coded to distinguish between
true positive, true negative (green section), false positive,
and false negative (red section). The ideal scenario is for
the total count in the green section (true positive and
true negative) to exceed the total count in the red section
(false. Fig 6

positive and false negative). This emphasizes the Design Implementation


importance of minimizing false predictions to enhance Develop a graphical user interface (GUI) to predict
the overall reliability and accuracy of the mode the probability of diabetes and hypertension using
machine learning. Additionally, apply Bayesian
Fig 5 Analysis to establish the relationship between
Cardiovascular Disease (CVD) and Diabetes.
Implementation Steps:
Graphical User Interface (GUI) for Diabetes and
Hypertension Prediction:
Create a user-friendly GUI using a framework like
Tkinter (for Python).
Include input fields for relevant features such as age,
BMI, blood pressure, etc.
Implement a button to trigger the machine learning
model for predicting the probability of diabetes and
hypertension.
Display the model's output, indicating the probability
of each disease.
Machine Learning Model for Diabetes and
Hypertension Prediction:
Utilize a pre-trained machine learning model or train
a model using a dataset that includes features related
to diabetes and hypertension.
The model should take input from the GUI, process
the information, and output the predicted
probabilities for each disease.
Popular machine learning libraries such as scikit-
learn or TensorFlow can be employed for model
development.
Bayesian Analysis for CVD and Diabetes
Relationship:
Extract necessary information from the dataset,
focusing on patients with and without diabetes.
We use the Bayesian Analysis to find the
relationship between CVD and Diabetes.
Step 1: We first extract the necessary information
from dataset, ie., the no. of patients in which
diabetes is present and those in which it is absent
Step 2: Then we calculate the following conditional
probabilities-
3- Probability|CHD«'Absent | Diabetes= disease.
Present) 4- Probability (CHD= Present | Features related to diabetes status and cardiovascular
Dabetes='Present disease are used to calculate conditional probabilities.
Naive Bayes provides insights into the likelihood of
cardiovascular disease given the presence or absence
Diabetes 10 year risk No. of
of diabetes.
of CHD patients
Absent Absent 2825 How Machine Learning Algorithms Work:
Absent Present 478
Absent Absent 54 Data Preparation:
The datasets are preprocessed, including handling missing
Absent Absent 33 values, scaling features, and encoding categorical variables.
Table 4
Training:
The machine learning algorithms are trained on a portion
Bayesian Analysis helps establish the relationship between of the dataset, learning the patterns and relationships
CVD and Diabetes, providing insights into how the between features and target variables.
presence or absence of diabetes influences the likelihood of
CVD. Testing and Evaluation:
- The trained models are tested on a separate set of data to
Flowchart for work strategy evaluate their performance. Evaluation metrics such as
accuracy, precision, recall, and F1 score are computed.
1. Diabetes and Hypertension Predictor (Using Pima
Indian Dataset): Prediction:
Once trained and validated, the models are ready for making
Dataset: PIMA Indian Diabetes predictions on new, unseen data.
Dataset Machine Learning Algorithm:
Interpretability:
Logistic Regression is a suitable algorithm for - Interpretability of the models allows understanding the
binary classification tasks, such as predicting the importance of each feature in making predictions and
presence or absence of diabetes and hypertension. gaining insights into the relationships within the data.
The algorithm models the probability of a particular
class, making it effective for disease prediction. Iterative Improvement:
Features like glucose level, BMI, blood pressure, Models can be iteratively improved by fine-tuning
and age can be used as input variables. hyperparameters, incorporating additional features, or
The model is trained on historical data, and once exploring different algorithms based on performance
trained, it can predict the likelihood of diabetes and feedback.
hypertension based on new input data.
These machine learning algorithms contribute to the
development of predictive models for disease diagnosis and
2. Cardiovascular Disease Predictor (Using CVD relationship exploration, aiding in better healthcare
Dataset): decision- making.

Dataset: Cardiovascular Disease


Dataset Machine Learning Algorithm:
- Random Forest is an ensemble learning
algorithm suitable for classification tasks, including
predicting cardiovascular disease.
- It combines multiple decision trees to make
accurate predictions.
- Features such as age, cholesterol levels, blood
pressure, and smoking status can be utilized.
- Random Forest handles complex relationships
within the data and provides robust predictions.

3. Finding Relationship Between Diabetes and


CVD (Using Combined Diabetes and CVD
Dataset)

Dataset: Combined Diabetes and CVD


Dataset Machine Learning Algorithm:
Naive Bayes is well-suited for probabilistic Fig 7
classification tasks and is employed here to find
the relationship between diabetes and
cardiovascular
Experiment Results

In this study, the prediction of a patient's blood


pressure category and diabetes status is conducted
through both traditional algorithmic approaches Result of Onset of Diabetes Prediction
and modern machine learning techniques.
Input: Glucose Input: Daistolic Output:
For the prediction of blood pressure categories, a (mm dl) Blood Pressure Diabetes Status
traditional algorithmic approach is employed. This (mm Hg)
method takes input from blood pressure readings,
performs calculations, and outputs the patient's 70 122 Non-Diabetic
blood pressure condition. The results are presented 72 121 Non-Diabetic
in Table 4, where the first column represents the 60 126 Non-Diabetic
systolic blood pressure, the second column 70 93 Non-Diabetic
indicates diastolic blood pressure, and the third 146 136 Diabetic
column signifies the corresponding blood pressure
category. Table 8

On the other hand, the onset of diabetes is Notification for Blood Pressure and Glucose.
predicted using a machine learning approach.
Training is conducted on 80% of actual data, and Blood Glucose Notification
the model is then tested on the remaining data. The Pressure
output values for the testing data, which includes High High Critical
glucose and blood pressure readings, are illustrated High Low Normal
in Table 5. In this table, the first column represents Low High Normal
glucose values, the second column indicates Low Low Critical
diastolic blood pressure, and the third column Table 9
indicates whether the patient is diabetic or non-
diabetic.  Result And Discussion

Furthermore, Table 6 outlines the notifications that This study applied five machine learning models to
would be sent to the doctor based on the patient's predictably assess a patient's cardiac condition. As a
consequence, the dataset came from Kaggle, as was
glucose and blood pressure levels. Notifications are
already mentioned. The dataset has 12 characteristics.
generated if the patient exhibits high or low levels Data from 937 patients, or 80% of our training dataset,
in these vital indicators. This comprehensive were utilized. Data from 235 patients, or 20% of our
approach combines both traditional and machine testing dataset, were used.
learning methods provide a holistic prediction of
blood pressure categories and diabetes onset,
contributing to more effective healthcare decision-
making. 1. Cardiovascular Events Prediction using
Machine Learning Techniques
Input:
Table Sysolic
7 Result Input:
of Blood Output:
Presure Category Predication.
Blood Pressure Diastolic Blood
(mmHg) Blood Pressure Kotecha et al. [1] conducted a systematic review and
Pressure Category meta-analysis to predict cardiovascular events using
(mmHg) machine learning techniques. The study encompassed a
140 90 High comprehensive analysis of relevant literature and
180 120 Hypertension reported outcomes related to the application of machine
(Critical) learning in cardiovascular event prediction.

40 60 Low 2. Heart Disease Prediction with Diabetes


90 80 Normal using Bayesian Theorem
117 80 Normal
142 90 High Gudi et al. [2] presented a case studyutilizing the
30 10 Hypotension PIMA Indian dataset to predict heart disease in
individuals with diabetes. The Bayesian theorem was
employed as a predictive tool, demonstrating its
potential in identifying the risk of heart disease in
diabetic populations.
3. Machine Learning Algorithms for Ischemic Heart
Disease Prediction Combines.
Five different techniques. When compared to prior
Bani Hani and Ahmad [3] conducted a systematic review research, this study has demonstrated a considerable
on machine-learning algorithms for predicting ischemic improvement and a high level of accuracy. In machine
heart disease. The study highlighted various algorithms and learning, preprocessing is a vital step that promotes
their effectiveness in predicting the occurrence of ischemic improved outcomes. In order to increase their accuracy,
heart disease, providing valuable insights for future this paper compared machine learning algorithms with
research. several performance criteria. In our method, missing data
from the preprocessing stage is replaced with the mean
3. Machine Learning and Data Mining Methods in value. Outcomes demonstrate how well the mean
Diabetes Research performs when used to replace missing variables. Using
the XGB model, the best accuracy of 91.9% was attained.
Kavakiotis et al. [4] explored machine learning and data The limitation of this paper is the dataset. We will work
mining methods in diabetes research. The study delved into more organized and bulk dataset in the future. This
the application of these techniques, showcasing their research also aims to use deep learning with additional
relevance and contribution to advancing our understanding datasets in the future so that the findings show that the
of diabetes and its predictive modeling. 5. Machine- system may be efficient and useful for doctors.
Learning Algorithms to Assess the Risk of Heart Failure
Admissions Natarajan et al. [5] developed machine-learning
algorithms to assess the risk of heart failure admissions
based on personal health record data. The study GB 85.10
demonstrated the potential of utilizing personal health data
for accurate prediction of heart failure risk. 6. Predicting Extra 90.20
Diabetes Mellitus using Machine Learning Techniques Al-
Masni et al. XG 91.90
[7] investigated the use of machine learning techniques in
predicting diabetes mellitus. The study focused on the RF 91
development of predictive models, shedding
CAR 91.90
Fig 9. The comparison ROC curve of the five models
0.00% 20.00% 40.00% 60.00% 80.00%
The accuracy value of applied classifiers is shown in Fig.
10. The best performance in this research is XGB which is
91.9%, and the lowest accuracy rate is CART which is Fig. 8. Accuracy comparison between
84.2%. RF achieved 91% accuracy in this work. GBM five algorithms
achieved 85.1% model accuracy and the Extra tree classifier
got 90.2% accuracy. Table 10 Comparison to other works

Model Accuracy Precision Sensitivity Specificity 𝑭𝟏 −


(%) Score

XGB 91.9% 0.906 0.943 0.892 0.924

RF 91.0% 0.886 0.951 0.866 0.917

Extra 90.2% 0.878 0.943 0.857 0.909


Tree

GBM 85.1% 0.828 0.902 0.794 0.863

CART 84.2% 0.835 0.869 0.812 0.852

Table 9

CONCLUSIONS AND FUTURE WORK


Patients with heart failure are becoming more prevalent
every day. A system that can be used to create or classify
data rules is required to get out of this dangerous situation
and reduce the likelihood of heart disease. As a result, this
study of machine-learning techniques discusses, proposes,
and implements a machine-learning algorithm that
Ref Contributions Algorithms Best
used Accuracy
This Applied top machine learning XGB, RF, 91.9%
work algorithms to predict early- GBM, CART (XGB)
stage cardiovascular disease.

[6] Authors improved NB, SVM, and 86.8%


cardiovascular disease KNN (SVM)
prediction. It enabled doctors
diagnose cardiovascular
illness and identify the
patient's heart status.
[7] Developed several machine RF, SVM, NB, 86.5% (LR)
learning algorithms for GB, and LR
forecasting cardiovascular
[9] SVM, RF, and LR—machine SVM, LR, and 78.84%
learning methods—were used RF (SVM)
to predict cardiovascular
disease.

[11] Cloud-based machine KNN, DT, 87.4%


learning algorithms were NB, LR, (Vote)
used to forecast heart SVM, NN and
disorders. An Arduino-based Vote (a hybrid
monitoring device detects technique with
temperature, blood pressure, Naïve Bayes
and heartbeat every ten and Logistic
seconds. Regression)
[10] A. K. Paul, P. C. Shill, M. R. I. Rabin, and K. Murase, “Adaptive
REFERENCES weighted fuzzy rule-based system for the risk level assessment of
heart disease,” Applied Intelligence, vol. 48, no. 7, pp. 1739–1756,
Jul. 2018, doi: 10.1007/S10489-017-1037-6/METRICS.
[1] Kotecha, J., Shaligram, D., & Bhatia, T. (2019). Predicting [11] L. El bouny, M. Khalil, and A. Adib, “An End-to-End Multi-Level
cardiovascular events using machine learning techniques: A Wavelet Convolutional Neural Networks for heart diseases
systematic review and meta-analysis. Journal of Cardiology, 74(5), diagnosis,” Neurocomputing, vol. 417, pp. 187–201, Dec. 2020,
414-421. doi: 10.1016/J.NEUCOM.2020.07.056.
https://www.sciencedirect.com/science/article/pii/S13191578193 [12] M. E. H. Chowdhury et al., “Real-Time Smart-Digital
16076#s0005 Stethoscope System for Heart Diseases Monitoring,” Sensors
[2] Gudi, S. K., Ponnambalam, K., & Devi, S. R. (2019). Predicting 2019, Vol. 19, Page 2781, vol. 19, no. 12, p. 2781, Jun. 2019,
heart disease with diabetes using Bayesian theorem: A case study doi: 10.3390/S19122781.
with PIMA Indian dataset. Procedia Computer Science, 165, 393-
400.
https://www.sciencedirect.com/science/article/pii/S18770509193
06684
[3] S. H. Bani Hani and M. M. Ahmad, “Machine-learning Algorithms
for Ischemic Heart Disease Prediction: A Systematic Review,”
Curr Cardiol Rev, vol. 19, no. 1, Jun. 2022, doi:
10.2174/1573403X18666220609123053.
[1] Kavakiotis, I., Tsave, O., Salifoglou, A., Maglaveras, N.,
Vlahavas, I., & Chouvarda, I. (2017). Machine learning and data
mining methods in diabetes research. Computational and
Structural Biotechnology Journal, 15, 104-11.
https://www.sciencedirect.com/science/article/pii/S0168822722
136
[2] Natarajan, A., Su, H. W., Heneghan, C., & Nunan, D. (2020).
Machine-learning algorithms to assess the risk of heart failure
admissions based on personal health record data. PLOS ONE,
15(4), e0240826.
https://journals.plos.org/plosone/article/file?type=printable&id=
10.1371/journal.pone.0240826
[3] Misra, A., Shrivastava, U., & Gupta, R. (2004). Obesity and
dyslipidemia in South Asians. Nutrients, 22(4), 462-470.
https://www.amjmed.com/article/S0002-
9343(03)006727/fulltext

[4] Al-Masni, M. A., Al-Absi, H. R., Al-Shawagfeh, M. M., & Al-


Ezkiya, N. A. (2019). Predicting diabetes mellitus using machine
learning techniques. IEEE International Conference on Data
Science and Advanced Analytics (DSAA).
https://dl.acm.org/doi/10.1016/j.jksuci.2020.01.010

[5] Jung, Y., Park, H. A., & Kim, J. H. (2018). A ubiquitous


healthcare information platform using smart wearable devices.
Journal of King Saud University - Computer and Information
Sciences.
https://www.sciencedirect.com/science/article/pii/S0168822722
007136

[6] Sun, Y., & Zheng, Y. (2020). Smart home health monitoring
system for predicting type 2 diabetes and hypertension.
ResearchGate.
https://www.researchgate.net/publication/338822259_Smart_Ho
me_Health_Monitoring_System_for_Predicting_Type_2_Diabe
tes_and_Hypertension

[7] Khan, N., Ahmed, R., & Hussain, S. A. (2015). Smart home
health monitoring system for predicting type 2 diabetes and
hypertension. Academia.edu.
https://www.academia.edu/109024784/Smart_home_health_mo
nitoring_system_for_predicting_type_2_diabetes_and_hyperten
sion

[8] Hossain, M. A., & Hasan, M. (2022). A machine learning-based


predictive model for early detection of hypertension. Journal of
Medical Systems.
https://link.springer.com/article/10.1007/s10916-022-01900-5

[9] Kim, S. J., Kim, K. R., Kim, J. Y., & Kim, H. K. (2017). A
wearable health monitoring system for prediction of
hypertension. Springer.
https://link.springer.com/article/10.1007/s10916-022-01900-5

You might also like