
DIABETES PREDICTIVE SYSTEM

USING MACHINE LEARNING


A Project Work Synopsis

Submitted in partial fulfillment of the requirements for the award of the degree of

BACHELOR OF ENGINEERING

IN

CSE- INFORMATION SECURITY

Submitted by:

NAME OF THE STUDENT


RISHABH MOHATA (19BCS3542)
MADHAV SINGH (19BCS3543)
AKANSHA SACHDEVA (19BCS3510)
SIDDHANT BHARDWAJ (19BCS3502)
SHREY JAIN (19BCS3541)

Under the Supervision of:


SHUBHAM GARGRISH


CHANDIGARH UNIVERSITY, GHARUAN, MOHALI – 140413


Table of Contents
Title Page
Abstract
List of Figures
Timeline / Gantt Chart

1. INTRODUCTION
1.1 Problem Definition
1.2 Project Overview/Specifications
1.3 Hardware Specification
1.4 Software Specification

2. LITERATURE SURVEY
2.1 Existing System
2.2 Proposed System

3. PROBLEM FORMULATION

4. RESEARCH OBJECTIVES
5. METHODOLOGY
6. TENTATIVE CHAPTER PLAN FOR THE PROPOSED WORK
7. REFERENCES
8. APPENDICES
List of Figures
Figure Title

3.1 Pictorial representation of diabetes prediction system


1 INTRODUCTION
1.1 PROBLEM DEFINITION: Diabetes is one of the most harmful diseases in the world. It is
associated with factors such as obesity and high blood glucose levels. It affects the hormone
insulin, resulting in abnormal metabolism of carbohydrates and a raised level of sugar in the
blood. Diabetes occurs when the body does not make enough insulin or cannot use it
effectively. According to the World Health Organization (WHO), about 422 million people
suffer from diabetes, particularly in low- and middle-income countries, and this number
could rise to 490 million by 2030. Diabetes is prevalent in many countries, including
Canada, China and India. India has a population of more than 1.3 billion, of whom an
estimated 40 million are diabetic. Diabetes is a major cause of death worldwide. Early
prediction of a disease like diabetes allows it to be controlled and can save lives.

1.2 PROJECT OVERVIEW/SPECIFICATIONS: To detect diabetes easily, we are designing a
prediction system. This work explores the prediction of diabetes from various attributes
related to the disease. For this purpose we use the Pima Indian Diabetes Dataset and apply
various machine learning classification and ensemble techniques to predict diabetes.
Machine learning is a method for training computers to learn from data rather than
programming them explicitly. By building classification and ensemble models from a
collected dataset, these techniques extract knowledge that can be used to predict diabetes.
Many machine learning techniques are capable of prediction, but it is hard to choose the
best one. We therefore apply several popular classification and ensemble methods to the
dataset and compare their predictions.

1.3 HARDWARE SPECIFICATIONS:

• RAM – 2 GB or more
• Intel Core i3 or i5 processor

1.4 SOFTWARE SPECIFICATIONS:

• Windows 10 x64
• Google Colaboratory / Jupyter Notebook
• Python 3.9
• Flask
• Heroku
2 LITERATURE SURVEY

2.1 EXISTING SYSTEM: In the present state of diabetes prediction, the chance of obtaining an
accurate result is uncertain for several reasons. Genetic features, even when the underlying
patient record is correct, can give false information about whether the patient has diabetes.
Diagnosis of diabetes is considered a challenging problem for quantitative research. Some
parameters, such as A1c, fructosamine, white blood cell count, fibrinogen and hematological
indices, were shown to be ineffective due to some limitations. Because some forms of
diabetes are autoimmune in nature, the disease may also be correlated with thyroid and
other immune-related diseases.
Therefore, problems arise at various levels when detecting and diagnosing diabetes
correctly, including:
• Many people do not have the resources to go for regular diabetes checkups.
• Errors may occur during report collection.
• Portable machines for home checkups are costly, so people avoid buying them.
• Record maintenance issues, human error during diagnosis, and neglect for personal
reasons.

2.2 PROPOSED SYSTEM: Owing to recent advances in computer science, several studies show
that diabetes can be predicted accurately in advance by applying machine learning
algorithms to a person's previous medical records. Such a system can be deployed on a cloud
architecture or on a server inside a hospital or medical institution, and can then be used
to monitor patient records while maintaining their privacy and security. The system will not
only help predict a correct and useful outcome, but also help identify early symptoms so
that the patient can be treated before reaching a chronic state of the disease. The
advantages of the proposed system are:

• Automation along with a self-reporting system
• Cloud-based operation
• Cost-effective
• Increased security
• Time saving
• Easy to manage
3 PROBLEM FORMULATION

From the literature review, it is observed that studies highlight the need for an efficient and
scalable approach to diabetes prediction. The existing techniques have disadvantages: there is
always a risk of human error, and record maintenance and report generation are very
time-consuming. The manual approach is also outdated and prone to typing and printing
errors, and an incorrect entry may lead to harmful outcomes. This is a supervised machine
learning classification problem. The objective is to predict whether or not a patient has
diabetes, based on certain diagnostic measurements included in the dataset.

0 – Absence of Diabetes
1 – Presence of Diabetes
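
The label encoding above can be made concrete with a short sketch. It assumes the dataset is a CSV file whose last column, here called Outcome, holds the 0/1 label. The column names follow the commonly distributed CSV version of the dataset and the two sample rows are illustrative only, so both should be checked against the actual file. The function name load_dataset is hypothetical.

```python
import csv
import io

# Assumed column layout; verify against the real file before use.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
"""

LABEL_NAMES = {0: "Absence of Diabetes", 1: "Presence of Diabetes"}


def load_dataset(handle, label_column="Outcome"):
    """Split CSV rows into feature vectors X and 0/1 labels y."""
    reader = csv.DictReader(handle)
    feature_names = [name for name in reader.fieldnames if name != label_column]
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in feature_names])
        y.append(int(row[label_column]))
    return feature_names, X, y


# Illustrative rows only; a real run would pass an open file handle instead.
feature_names, X, y = load_dataset(io.StringIO(SAMPLE_CSV))
for features, label in zip(X, y):
    print(features, "->", LABEL_NAMES[label])
```

Keeping the features and the label in separate structures from the start makes it harder to leak the target column into training by accident.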

Using machine learning, we get the following benefits over the existing system:

• High Accuracy Rates
The main advantage is accuracy. The system evaluates every record in the same way,
avoiding the errors of manual interpretation, so diabetes can be detected at the right time.
• Saves Time
Individuals do not need to visit centers regularly for check-ups. The system predicts within
seconds whether or not an individual has diabetes.
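
Since the benefits above rest on accuracy, it helps to state how accuracy would be measured. The sketch below computes accuracy, precision and recall from true and predicted 0/1 labels. The function name and the example labels are hypothetical and chosen only for illustration; they are not results from this project.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diabetic, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diabetic, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall is worth reporting alongside accuracy because a missed diabetic patient (a false negative) is usually the costlier error in a screening setting.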
4 RESEARCH OBJECTIVES
The proposed research aims to develop an approach for a diabetes prediction system. This aim
will be achieved by dividing the work into the following objectives:

• Data will be collected from the National Institute of Diabetes and Digestive and Kidney
Diseases and pre-processed for further analysis.

• The dataset will be visualized using libraries such as Matplotlib and Seaborn for better
understanding.

• The data will be divided into training and test sets, the former being used to train models
based on algorithms such as Random Forest, KNN, SVM and Logistic Regression.

• The accuracy of every model will be evaluated and the best model will be chosen for
deployment on the web using Flask on a local server.

• Using Heroku, we will deploy the model on a cloud-based server so that it is easy to use
for any organization or individual.

• The result will be an end-to-end machine learning system that predicts whether an
individual has diabetes and helps people check their health with considerable accuracy.
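
The split, train and compare workflow in the objectives can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it uses synthetic two-feature data instead of the real dataset, and a hand-written k-nearest-neighbours classifier plus a majority-class baseline stand in for the library models that the project would train. All function names are hypothetical.

```python
import math
import random


def make_synthetic_data(n_per_class=100, seed=0):
    """Two Gaussian blobs standing in for class 0 and class 1 records."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = [rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)]
            data.append((point, label))
    return data


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, point, k=5):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def majority_predict(train, point):
    """Baseline: always predict the most common training label."""
    ones = sum(label for _, label in train)
    return 1 if ones * 2 >= len(train) else 0


def accuracy(model, train, test):
    """Fraction of held-out points whose label the model predicts correctly."""
    correct = sum(1 for point, label in test if model(train, point) == label)
    return correct / len(test)


train, test = train_test_split(make_synthetic_data())
scores = {
    "knn": accuracy(knn_predict, train, test),
    "majority_baseline": accuracy(majority_predict, train, test),
}
best_model = max(scores, key=scores.get)
print(scores, "-> best:", best_model)
```

Including a trivial baseline in the comparison shows whether a model has learned anything beyond the class proportions, which matters when the two classes are not equally common.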
5 METHODOLOGY

The following methodology will be followed to achieve the objectives defined for the proposed
research work:
• Various parameters will be identified to evaluate the proposed system.

• The newly implemented approach will be compared with existing approaches.

• A generative learning model, such as Naive Bayes, will assume that the presence of a
specific feature in a class is unrelated to the presence of any other feature.
• A statistical method will be used so that the dataset can be analyzed easily.
• Logistic Regression is fast to train; it models the probability of diabetes by applying a
sigmoid function to a weighted sum of the features.
• Support Vector Machine (SVM) classification separates the classes with a hyperplane, so
that the data points of each class lie on opposite sides of it.
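
To make the logistic regression step concrete, the sketch below fits a model by batch gradient descent on the log-loss, using a sigmoid applied to a weighted sum of the features. It is a from-scratch illustration on toy one-feature data assumed for this example only. The learning rate, epoch count and function names are hypothetical, and the project itself would more likely rely on a library implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # derivative of log-loss w.r.t. score
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, features, threshold=0.5):
    """Return 1 (presence) if the predicted probability reaches the threshold."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= threshold else 0


# Toy data: one standardised feature; higher values belong to class 1.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
weights, bias = fit_logistic_regression(X, y)
predictions = [predict(weights, bias, row) for row in X]
print(predictions)
```

The threshold parameter is exposed deliberately: lowering it below 0.5 trades more false alarms for fewer missed cases, which may suit a screening tool.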
6 TENTATIVE CHAPTER PLAN FOR THE PROPOSED WORK

CHAPTER 1: INTRODUCTION

This chapter will cover an overview of the problem definition and the software and hardware
requirements of the proposed project.

CHAPTER 2: LITERATURE REVIEW


This chapter covers the literature available on diabetes prediction systems using machine
learning. The findings of previous researchers will be highlighted, as they form the basis of
the current implementation. The chapter will introduce the concepts necessary to understand
the proposed system and will highlight the faults in the existing system.

CHAPTER 3: PROBLEM FORMULATION


This chapter discusses the flaws of the existing system and how a new diabetes predictive
system will be beneficial.

CHAPTER 4: RESEARCH OBJECTIVES

This chapter will cover the objectives and work flow of the proposed approach.

CHAPTER 5: METHODOLOGY

This chapter will cover the technical details of the proposed approach.
