Diabetes

GURU GHASIDAS VISHWAVIDYALAYA,
BILASPUR
A project on
DIABETES PREDICTION
PRESENTED BY: POONAM OGRE, MCA III SEM, ROLL NO. 21072141
GUIDED BY: Mr. PRASHANT VAISHNAV, C.S.I.T dept
TABLE OF CONTENTS
SL.NO CONTENTS
1. INTRODUCTION
2.
3.
4.
6.
7.
8.
9.
INTRODUCTION
➢ Diabetes is a disease that occurs when our blood glucose, also called blood sugar,
is too high. Blood glucose is our main source of energy and comes from the food
we eat. Insulin, a hormone made by the pancreas, helps glucose from food get
into our cells to be used for energy. Sometimes the body doesn’t make enough
insulin, or any at all, or doesn’t use insulin well. Glucose then stays in the blood
and doesn’t reach the cells.
➢ Diabetes is a chronic (long-lasting) health condition that affects how
our body turns food into energy. Our body breaks down most of the
food we eat into sugar (glucose) and releases it into our bloodstream.
When our blood sugar goes up, it signals our pancreas to release
insulin. Insulin acts like a key to let the blood sugar into our body’s cells
for use as energy.
➢ With diabetes, our body doesn’t make enough insulin or can’t use it as
well as it should. When there isn’t enough insulin or cells
stop responding to insulin, too much blood sugar stays in our
bloodstream. Over time, that can cause serious health problems, such
as heart disease, vision loss, and kidney disease.
RISK FACTORS FOR DIABETES
The following are some of the known risk factors for diabetes.
➢ Type 1 Diabetes
Type 1 diabetes is thought to be caused by an immune reaction (the body
attacks itself by mistake). Risk factors for type 1 diabetes are not as clear as for
prediabetes and type 2 diabetes. Known risk factors include:
• Family history: Having a parent, brother, or sister with type 1 diabetes.
• Age: You can get type 1 diabetes at any age, but it usually develops in children,
teens, or young adults.
➢ Type 2 Diabetes
You are at risk for type 2 diabetes if you:
• Have prediabetes.
• Are overweight.
• Are 45 years or older.
• Are physically active less than 3 times a week.
ROLE OF MACHINE LEARNING IN DETECTION OF BREAST CANCER
➢ A mammogram is an x-ray picture of the breast. It can be used to check for breast
cancer in women who have no signs or symptoms of the disease. It can also be used if
you have a lump or other sign of breast cancer. Screening mammography is the type of
mammogram that checks you when you have no symptoms. It can help reduce the
number of deaths from breast cancer among women ages 40 to 70. But it can also have
drawbacks.
➢ Mammograms can sometimes find something that looks abnormal but isn't cancer.
This leads to further testing and can cause you anxiety. Sometimes mammograms can
miss cancer when it is there. A mammogram also exposes you to radiation. You should talk to your
doctor about the benefits and drawbacks of mammograms. Together, you can decide
when to start and how often to have a mammogram. Because it is difficult for
physicians to tell from x-ray images alone whether a tumor is malignant or not,
a machine learning model trained to identify tumors can be of
great help.
FLOW CHART
MODEL CREATION
We don’t know in advance which algorithm will work best for this problem.
Let’s check each algorithm in a loop and print its accuracy, so that we can select
the best one.
Let’s test 6 different algorithms:
• Logistic Regression (LR)
• Linear Discriminant Analysis (LDA)
• K-Nearest Neighbors (KNN)
• Classification and Regression Trees (CART)
• Gaussian Naive Bayes (NB)
• Support Vector Machines (SVM)
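The loop described above could look roughly like the following sketch. It assumes scikit-learn and uses a synthetic dataset from make_classification as a stand-in for the project data, which is not shown here; the fold count, random seeds and the names `models`, `results` and `best` are illustrative choices, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data with 8 numeric features and a binary outcome.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=7)

# The six candidate algorithms named in the list above.
models = [
    ("LR", LogisticRegression(max_iter=1000)),
    ("LDA", LinearDiscriminantAnalysis()),
    ("KNN", KNeighborsClassifier()),
    ("CART", DecisionTreeClassifier(random_state=7)),
    ("NB", GaussianNB()),
    ("SVM", SVC()),
]

# Score every model with the same 10-fold stratified cross-validation splits.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
results = {}
for name, model in models:
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.4f} (std {scores.std():.4f})")

# Pick the algorithm with the highest mean cross-validated accuracy.
best = max(results, key=results.get)
print("Best by mean accuracy:", best)
```

Using the same folds for every model keeps the comparison fair, since each algorithm is scored on identical train/validation splits.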
MODEL IMPLEMENTATION
➢ LOGISTIC REGRESSION
➢ KNEIGHBORS CLASSIFIER
➢ SVC
Models Comparison Table:
[Bar chart: Training Accuracy (%), axis 0 to 100, for LogisticRegression, KNeighborsClassifier and SVC. Bar labels: 95.4902, 93.2026, 57.3856.]
TESTING ACCURACY
[Bar chart: Testing Accuracy (%), axis 0 to 100, for LogisticRegression, KNeighborsClassifier and SVC. Bar labels: 91.1111, 91.1111, 42.2222.]
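A training-versus-testing comparison like the one charted here can be produced along the following lines. This is a minimal sketch assuming scikit-learn; the data is a synthetic stand-in, and the split ratio, seeds and the `accuracy` dictionary are assumptions, so the printed numbers will not match the figures reported in the slides.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, 300 rows, 8 features, binary outcome.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=0)

# Hold out 15% of the rows (45 samples here) for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)

# Fit each model on the training rows, then score it on both subsets.
accuracy = {}
for model in (LogisticRegression(max_iter=1000), KNeighborsClassifier(), SVC()):
    model.fit(X_train, y_train)
    name = type(model).__name__
    accuracy[name] = (100 * model.score(X_train, y_train),
                      100 * model.score(X_test, y_test))

for name, (train_acc, test_acc) in accuracy.items():
    print(f"{name:22s} train {train_acc:8.4f}  test {test_acc:8.4f}")
```

A large gap between the two columns for a model suggests overfitting, which is why both accuracies are reported side by side.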
VISUALIZATION
➢ PAIR PLOT: Plotting the violin plot to compare the distribution of a
variable:
VISUALIZATION
➢ HEATMAP
➢ Plotting the heatmap to check the correlation.
dataset.corr() is used to find the pairwise correlation of all columns in the
dataframe.
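As a small illustration of what dataset.corr() returns, the sketch below builds a tiny DataFrame and prints its correlation matrix. The column names follow features mentioned in this project, but the five rows are made-up illustrative values, not project data.

```python
import pandas as pd

# Assumption: made-up values, used only to show the shape of the result.
dataset = pd.DataFrame({
    "Glucose": [85, 90, 120, 150, 170],
    "BMI": [22.0, 24.5, 28.1, 31.0, 35.2],
    "Outcome": [0, 0, 0, 1, 1],
})

# Pairwise Pearson correlation between all numeric columns.
corr = dataset.corr()
print(corr.round(2))

# With seaborn installed, seaborn.heatmap(corr, annot=True) would draw
# this matrix as a colour-coded grid.
```

The result is a square, symmetric table with 1.0 on the diagonal, which is what the heatmap colours encode.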
VISUALIZATION
➢ BOXPLOT
A boxplot displays a summary of a set of data values: the minimum,
first quartile, median, third quartile and maximum.
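The five values a boxplot summarises can be computed directly, as in this sketch using NumPy. The `values` array is an arbitrary example, not project data.

```python
import numpy as np

# Assumption: arbitrary example values.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
minimum = values.min()
q1, median, q3 = np.percentile(values, [25, 50, 75])
maximum = values.max()

print(minimum, q1, median, q3, maximum)
```

Note that plotting libraries often draw the whiskers at 1.5 times the interquartile range by default and show points beyond them as outliers, so the whisker ends do not always coincide with the minimum and maximum.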
VISUALIZATION
➢ COUNTPLOT
The countplot() method shows the counts of observations in each
categorical bin using bars.
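The numbers behind a count plot are simply per-category counts. The sketch below computes them with pandas for a made-up Outcome column; the six values are illustrative only.

```python
import pandas as pd

# Assumption: made-up class labels (0 = non-diabetic, 1 = diabetic).
outcome = pd.Series([0, 1, 0, 0, 1, 0], name="Outcome")

# Number of observations in each category, ordered by label.
counts = outcome.value_counts().sort_index()
print(counts)

# With seaborn installed, seaborn.countplot(x=outcome) would draw
# one bar per category with these heights.
```

Checking these counts early shows whether the classes are balanced, which affects how accuracy should be interpreted.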
RESULT
➢ On the basis of the above experiment, KNN and LR are the
best models for the prediction of diabetes, each with about 91%
testing accuracy.
CONCLUSION
➢ Early detection of diabetes is one of the significant challenges in the
health care industry. In our research, we designed a system that can
predict diabetes with high accuracy. Using a feature reduction method,
we dropped three features. We used five input features (Glucose, BMI,
Insulin, Pregnancy, and Age) and one output feature (Outcome) in the PIMA
dataset. We applied different machine learning algorithms, including DT,
KNN, RF, NB, AB, LR, and SVM, to the Pima Indian Diabetes Dataset to predict
diabetes and evaluated their performance on various measures. All models
show good results on measures such as accuracy, precision, recall, and
F-measure. All models provided an accuracy greater than 70%. LR and SVM
provided approximately 77%–78% accuracy for the train/test split.
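For reference, the four measures named above can be derived from the counts of a binary confusion matrix, as in this sketch. The function name `classification_metrics` and the example counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F-measure from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


# Hypothetical counts for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=30, fp=10, fn=10, tn=50)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} F-measure={f1:.2f}")
```

Reporting precision and recall alongside accuracy matters when the classes are unbalanced, because accuracy alone can look good while most positive cases are missed.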
