MANAKULA VINAYAGAR INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

DIABETIC PREDICTION AND DIET PLAN FOR PATIENT SECURED USING BLOCKCHAIN IN ELECTRONIC HEALTH RECORD

DONE BY (BATCH NO : A2, IV year CSE - A) :
1. Dikshith Roshan T (18TD0618)
2. Haritha R (18TD0629)
3. Ranjani R (18TD0664)

PROJECT GUIDE :
Dr. N. POONGUZHALI, ASSOCIATE PROFESSOR, DEPARTMENT OF CSE

Date : 07/07/2022

AGENDA
• ABSTRACT
• INTRODUCTION
• PROBLEM DEFINITION
• OBJECTIVE
• LITERATURE SURVEY
• EXISTING SYSTEM
• DRAWBACKS OF EXISTING SYSTEM
• PROPOSED SYSTEM
• DIABETIC INSIGHT AND DELICIOUS DIET IN MEDCHAIN(DIDDM)
• MODULE I - DIABETIC INSIGHT
• MODULE II - DIABETIC DIET PLAN
• MODULE III - MEDCHAIN
• DATASET DESCRIPTION
• RESULT ANALYSIS
• CONCLUSION
• REFERENCES

ABSTRACT
• Diabetes has become one of the most prevalent diseases in the modern
world. We use machine learning to determine whether a patient has
diabetes, employing the Random Forest algorithm to improve prediction
accuracy. We also aim to help patients keep their diabetes under
control by prescribing a diet plan based on the patient's age and BMI
category. All patient-provided data is kept in an electronic health
record (EHR) and secured with a private blockchain built on the
InterPlanetary File System (IPFS), which is used to store and access
patient data.

INTRODUCTION

• Diabetes is a disease that occurs when your blood glucose is too high.
• Insulin, a hormone made by the pancreas, helps glucose from food get into your
cells to be used for energy.
• Sometimes your body doesn't make enough insulin or doesn't use insulin
well. Over time, having too much glucose in your blood can cause health
problems. This condition is known as diabetes.
• Although diabetes has no cure, you can take steps to manage your diabetes and
stay healthy.
Causes of diabetes:
• Obesity and an inactive lifestyle are two of the most common causes of type 2
diabetes, which accounts for about 90% to 95% of diabetes cases.

MACHINE LEARNING
• Machine learning (ML) is the study of computer algorithms that can
improve automatically through experience and by the use of data.
• Random forest is a Supervised Machine Learning Algorithm that
is used widely in Classification and Regression problems. It builds
decision trees on different samples and takes their majority vote for
classification and average in case of regression.

[Figure: Input data → Analyze data → Find patterns → Prediction]

ELECTRONIC HEALTH RECORD
• An Electronic Health Record is a digital version of a patient's
paper chart. EHRs are real-time, patient-centred records that make
information available instantly and securely to authorized users.

[Figure: Patient personal details and patient medical details are combined into the EHR]

BLOCKCHAIN
• Blockchain is a collection of records (blocks) that are strongly
linked to each other and resistant to alteration. It uses a
peer-to-peer network for sharing information.
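
To illustrate why linked records resist alteration, here is a minimal Python sketch. It is not the project's implementation: it assumes each block stores the SHA-256 hash of the previous block, and the names block_hash, make_block, build_chain and chain_is_valid are hypothetical.

```python
import hashlib
import json

GENESIS_LINK = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block):
    """Hash the block's data together with the previous block's hash."""
    payload = json.dumps(
        {"data": block["data"], "prev_hash": block["prev_hash"]}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(data, prev_hash):
    """Create a block that is linked to its predecessor through prev_hash."""
    block = {"data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    return block


def build_chain(records):
    """Link the records one after another into a chain of blocks."""
    chain = []
    prev_hash = GENESIS_LINK
    for record in records:
        block = make_block(record, prev_hash)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def chain_is_valid(chain):
    """Check every link and every stored hash against the block contents."""
    prev_hash = GENESIS_LINK
    for block in chain:
        if block["prev_hash"] != prev_hash or block["hash"] != block_hash(block):
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["BLOCK I", "BLOCK II", "BLOCK III", "BLOCK IV"])
valid_before = chain_is_valid(chain)

# Altering a stored record no longer matches the hash recorded for it.
chain[1]["data"] = "BLOCK II (altered)"
valid_after = chain_is_valid(chain)
```

In this sketch an attacker would have to recompute the hash of the altered block and of every later block, which is what makes tampering detectable when other peers hold copies of the chain.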

[Figure: BLOCK I, BLOCK II, BLOCK III and BLOCK IV linked to one another]

PROBLEM DEFINITION
• To develop a diabetes prediction system using a machine learning
technique and to provide a diet plan based on the severity of diabetes.
The patient medical records (EHR) are secured using a private blockchain
and are accessible only to authorized users.

OBJECTIVE
• To predict the patient's diabetic status by gathering the patient's
medical information, including Name, Age, Insulin, Glucose Level, and
BMI, and applying the Random Forest algorithm.
• To suggest a diet plan for the patient based on the medical data
obtained during the diagnosis of diabetes.
• To secure the EHR that contains the patient's medical information by
implementing a private blockchain with the InterPlanetary File System
(IPFS).

LITERATURE SURVEY

YEAR : 2022
TITLE : Effects of a diet based on the Dietary Guidelines on vascular health and TMAO in women with cardiometabolic risk factors.
PROBLEM : To check whether a diet based on dietary guidelines reduces cardiometabolic risk factors.
TECHNOLOGY USED : 8 weeks of controlled diet based on dietary guidelines.
SOLUTION : An 8-week controlled diet is given to obese women to check whether the diet reduces the risk of cardiovascular disease.

YEAR : 2021
TITLE : A systematic literature review on obesity: Understanding the causes & consequences of obesity and reviewing various machine learning approaches used to predict obesity.
PROBLEM : To identify obesity and the diseases associated with it.
TECHNOLOGY USED : Machine Learning.
SOLUTION : Obesity is predicted with different ML algorithms, and the diseases that may affect obese people are detected.

YEAR : 2021
TITLE : Machine learning for prediction of diabetes risk in middle-aged Swedish people.
PROBLEM : Machine learning methodology can be used to detect persons with increased type 2 diabetes or prediabetes risk among people without known abnormal glucose regulation.
TECHNOLOGY USED : Machine learning.
SOLUTION : Data collected from people aged 34-54 is used to predict diabetes with an ML algorithm.

YEAR : 2021
TITLE : Machine learning to promote health management through lifestyle changes for hypertension patients.
PROBLEM : To detect hypertension.
TECHNOLOGY USED : Deep Learning and convolutional neural networks.
SOLUTION : Convolutional neural networks and a decision tree are used to predict hypertension.

YEAR : 2021
TITLE : Diet Planning with Machine Learning: Teacher-forced REINFORCE for Composition Compliance with Nutrition Enhancement
PROBLEM : To create a diet plan for children that takes nutrient value and calories into account.
TECHNOLOGY USED : Neural machine translation and reinforcement learning.
SOLUTION : Teacher-forced REINFORCE connects neural machine translation and reinforcement learning to propose a diet.


YEAR : 2021
TITLE : BioTHR: Electronic Health Record Servicing Scheme in IoT-Blockchain Ecosystem.
PROBLEM : To provide security and privacy to the Electronic Health Record, since the sensitive information of the patients can be tampered with.
TECHNOLOGY USED : Blockchain assisted EHR management using IoT.
SOLUTION : Swarm exchange techniques facilitate seamless and secure transmission of user data.

YEAR : 2020
TITLE : A Novel Patient-Centric Architectural Framework for Blockchain-Enabled Healthcare Applications
PROBLEM : Traditional EHR-based systems are plagued by data loss risks, security and immutability consensus over health records, gapped communication among constituted hospitals, and inefficient clinical data retrieval systems, among others.
TECHNOLOGY USED : Blockchain
SOLUTION : The Hyperledger Fabric platform is used in order to maintain the privacy of the Electronic Health Record.

YEAR : 2020
TITLE : Machine learning and artificial intelligence based Diabetes Mellitus detection and self-management: A systematic review.
PROBLEM : To detect diabetes and recommend diet and tablet dosage for the patients.
TECHNOLOGY USED : Machine Learning and Artificial Intelligence.
SOLUTION : Machine Learning is used to detect diabetes, and AI is used to create an interactive chatbot to communicate with patients.

YEAR : 2019
TITLE : BinDaaS: Blockchain-Based Deep-Learning as-a-Service in Healthcare 4.0 Applications.
PROBLEM : Privacy, confidentiality, and data consistency are major challenges in maintaining the Electronic Health Records of patients.
TECHNOLOGY USED : Blockchain and Deep Learning.
SOLUTION : The first phase contains an authentication and signature scheme based on lattice-based cryptography. The second phase provides Deep Learning as-a-Service (DaaS), which is used on stored EHR datasets to predict future diseases.

EXISTING SYSTEM
• The existing system uses machine learning methodology to detect
persons with increased type 2 diabetes or prediabetes risk among
people without known abnormal glucose regulation. The parameters used
in the existing system are BMI, waist-hip ratio, age, systolic and
diastolic blood pressure, and diabetes heredity.

DRAWBACKS OF EXISTING SYSTEM

• Patients may not always have access to the prediction's results.


• The patient information is not protected, and their medical report
may be altered.
• After prediction, there is no advice on how individuals can keep
their insulin level the same or reduce it based on the forecast
results.

PROPOSED SYSTEM
• To develop a prediction system for diabetes using a machine learning
model based on the random forest algorithm. The system also recommends
a diet plan for the patient based on insulin level, age and BMI. The
patient medical records (EHR) are secured using a private blockchain
and are accessible only to authorized users.

DIABETIC INSIGHT AND DELICIOUS DIET IN MEDCHAIN(DIDDM)
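[Figure: DIDDM architecture with three modules.
DIABETIC INSIGHT: input → data cleaning → data visualization → random forest algorithm → prediction output.
DIABETIC DIET PLAN: condition clause on the prediction output, age and BMI range → diet plan.
MEDCHAIN: patient medical details and medical information form the EHR; a validator either accepts (YES) or denies (NO) the request; accepted records pass through the transaction process into blocks I to IV.]
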
MODULE I - DIABETIC INSIGHT
PREDICTION OF DIABETES USING RANDOM FOREST ALGORITHM

PREDICTION MODULE DIAGRAM

[Figure: Input → Data cleaning → Data visualization → Random forest algorithm → Output]

PREDICTION MODULE
INPUT :
• A dataset of diabetic patients is the input for our project.
• A dataset is a collection of instances that all share common attributes.
• In machine learning projects, we need a training dataset to train our model.
• The more data you provide to the ML system, the better the model can learn
and improve.

PARAMETERS IN DATASET
• Glucose Level
• Diabetes Pedigree Function
• Blood Pressure
• Skin Thickness
• Age
• Pregnancies
• Insulin Level
• BMI

PREDICTION MODULE

DATA CLEANING :
The main aim of Data Cleaning is to identify and remove errors &
duplicate data, in order to create a reliable dataset.
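
As a hedged illustration of this cleaning step, the sketch below removes exact duplicate records and records with an implausible zero measurement. It assumes, following the dataset description in this deck, that a zero in Glucose, Blood Pressure, Insulin or BMI marks missing data; the deck does not say how the project handled such rows, so dropping them is only one possible choice. The function name clean_records and the sample values are hypothetical.

```python
# Columns where a zero is treated as a missing or erroneous measurement
# (an assumption based on the dataset description).
ZERO_MEANS_MISSING = ("Glucose", "BloodPressure", "Insulin", "BMI")


def clean_records(records):
    """Return the records without exact duplicates or zero-valued measurements."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        if any(record[column] == 0 for column in ZERO_MEANS_MISSING):
            continue  # a required measurement is missing
        cleaned.append(record)
    return cleaned


# Illustrative values only, not rows taken from the project's dataset.
raw = [
    {"Glucose": 148, "BloodPressure": 72, "Insulin": 94, "BMI": 33.6, "Age": 50},
    {"Glucose": 148, "BloodPressure": 72, "Insulin": 94, "BMI": 33.6, "Age": 50},
    {"Glucose": 0, "BloodPressure": 66, "Insulin": 88, "BMI": 26.6, "Age": 31},
    {"Glucose": 89, "BloodPressure": 66, "Insulin": 94, "BMI": 28.1, "Age": 21},
]
cleaned = clean_records(raw)
```

An alternative to dropping rows is to replace each zero with the column median, which keeps more of the records available for training.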

[Figure: Raw data (input) → Data cleaning (process) → Error- and duplicate-free data (output)]

PREDICTION MODULE
DATA VISUALIZATION :
• Data visualization is the graphical representation of information and
data.
• Data visualization helps to analyze data quickly and efficiently.
• It is important to understand how data is used in a particular machine
learning model, and visualization helps in analyzing it.

[Figure: Error-free data (input) → Data visualization (process) → Output]

PREDICTION MODULE
RANDOM FOREST ALGORITHM :
Random forest builds multiple decision trees and merges them together to
get a more accurate and stable prediction.
STEPS :
STEP 1 : Pick N random records from the dataset.
STEP 2 : Build a decision tree based on these N records.
STEP 3 : Choose the number of trees you want in your algorithm and repeat
steps 1 and 2.
STEP 4 : Each decision tree predicts the output using its subset of the data.
STEP 5 : The final output is based on the majority of the outputs from the
decision trees.
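
The five steps above can be sketched in plain Python as follows. This is a simplified illustration, not the project's code: each "tree" is reduced to a single split on one randomly chosen feature (a decision stump), the N records are drawn with replacement (an assumption), and the function names and sample values are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, rng):
    """Fit a one-split tree on one randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        if not right:
            continue  # nothing to split off at the largest value
        left_label, right_label = majority(left), majority(right)
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # all sampled records share one value: predict the majority
        return (feature, float("inf"), majority(labels), majority(labels))
    return (feature,) + best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):  # STEP 3: repeat for the chosen number of trees
        picks = [rng.randrange(len(rows)) for _ in rows]  # STEP 1: N random records
        sample_rows = [rows[i] for i in picks]
        sample_labels = [labels[i] for i in picks]
        forest.append(fit_stump(sample_rows, sample_labels, rng))  # STEP 2
    return forest


def predict_forest(forest, row):
    votes = [predict_stump(stump, row) for stump in forest]  # STEP 4
    return majority(votes)  # STEP 5: majority voting


# Illustrative (glucose, BMI) values; 1 = diabetic, 0 = non-diabetic.
rows = [(85, 22.0), (90, 24.5), (100, 21.0), (110, 26.0),
        (150, 31.0), (165, 34.5), (180, 30.0), (190, 38.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
low_risk = predict_forest(forest, (80, 19.0))
high_risk = predict_forest(forest, (200, 40.0))
```

A full random forest grows deeper trees and considers a random subset of features at every split; a library implementation such as scikit-learn's RandomForestClassifier would normally be used in practice.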
WORKFLOW OF RANDOM FOREST ALGORITHM
[Figure: Dataset → three subsamples → one prediction per subsample → majority voting → final prediction]

MODULE II - DIABETIC DIET PLAN
DIET PLAN AND STORING PATIENT RECORD IN EHR

DIET PLAN
• Based on the output of the prediction process, we provide a diet plan for
the patient.

DIET PLAN USING CONDITION CLAUSE

STEP 1 : The process begins with the prediction results.
STEP 2 : Based on the prediction results, the patient is categorized as either
diabetic or non-diabetic.
STEP 3 : If the patient is diabetic, a diet plan is generated based on the
patient's age category and BMI category.
STEP 4 : If the patient is non-diabetic, a curated diet plan is generated
based on their age.
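
A minimal sketch of this condition clause in Python is shown below. The plan numbers 1 to 12 and the age groups follow the workflow diagram of this module; the exact boundary handling is an assumption, because the diagram's BMI ranges (below 18.5, 18 – 24, above 25) leave small gaps. The function names are hypothetical.

```python
def age_group(age):
    """Age categories from the workflow diagram: 1-18, 19-50, above 50."""
    if age <= 18:
        return "child"
    if age <= 50:
        return "adult"
    return "older"


def bmi_group(bmi):
    """BMI categories; values from 18.5 to 25 are treated as the middle range (assumption)."""
    if bmi < 18.5:
        return "below"
    if bmi <= 25:
        return "middle"
    return "above"


# Diabetic patients: plan chosen by age category and BMI category.
DIABETIC_PLANS = {
    ("child", "below"): 1, ("child", "middle"): 2, ("child", "above"): 3,
    ("adult", "below"): 4, ("adult", "middle"): 5, ("adult", "above"): 6,
    ("older", "below"): 7, ("older", "middle"): 8, ("older", "above"): 9,
}

# Non-diabetic patients: plan chosen by age category only.
NON_DIABETIC_PLANS = {"child": 10, "adult": 11, "older": 12}


def diet_plan(is_diabetic, age, bmi):
    """Return the diet plan number for a prediction result, age and BMI."""
    if is_diabetic:
        return DIABETIC_PLANS[(age_group(age), bmi_group(bmi))]
    return NON_DIABETIC_PLANS[age_group(age)]


plan_for_diabetic_adult = diet_plan(True, 30, 22.0)
plan_for_healthy_adult = diet_plan(False, 40, 30.0)
```

Keeping the plans in lookup tables rather than nested if statements makes it easy to review the mapping against the diagram and to change a plan number later.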
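WORKFLOW OF DIET PLAN PREDICTION
[Figure: Decision flow from the prediction output to a diet plan.]
Prediction output YES (diabetic): select by age, then by BMI
  CHILD (age 1-18): BMI below 18.5 → DIET PLAN 1; BMI 18 – 24 → DIET PLAN 2; BMI above 25 → DIET PLAN 3
  ADULT (age 19-50): BMI below 18.5 → DIET PLAN 4; BMI 18 – 24 → DIET PLAN 5; BMI above 25 → DIET PLAN 6
  OLDER (age above 50): BMI below 18.5 → DIET PLAN 7; BMI 18 – 24 → DIET PLAN 8; BMI above 25 → DIET PLAN 9
Prediction output NO (non-diabetic): select by age
  CHILD (age 1-18) → DIET PLAN 10
  ADULT (age 19-50) → DIET PLAN 11
  OLDER (age above 50) → DIET PLAN 12
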
STORING PATIENT DETAILS IN DATABASE
• We created a database named flaskapp to store the patient details
given by the user. This database is considered the Electronic Health
Record (EHR).

MODULE III - MEDCHAIN
BLOCKCHAIN BASED ELECTRONIC HEALTH RECORD

INTERPLANETARY FILE SYSTEM (IPFS)
• IPFS gives a unique hash value to each file.
• The hash is totally different even if the files differ by only a single
character.
• IPFS is a file sharing system that can be leveraged to store and share
large files more efficiently.
• It relies on cryptographic hashes that can easily be stored on a blockchain.

INTERPLANETARY FILE SYSTEM (IPFS)
• CONTENT ADDRESSING
IPFS uses content addressing to identify content by what's in it rather than by where it's
located. Every piece of content in IPFS has a content identifier, or CID, which is its hash.
The hash is unique to the content that it came from, even though it may look short compared
to the original content. IPLD translates between hash-linked data structures, allowing for
the unification of the data across distributed systems.

• DIRECTED ACYCLIC GRAPH
IPFS and many other distributed systems take advantage of a data structure called a directed
acyclic graph, or DAG.

INTERPLANETARY FILE SYSTEM (IPFS)
• DISTRIBUTED HASH TABLES
To find which peers are hosting the content you're after (discovery),
IPFS uses a distributed hash table, or DHT. A hash table is a database of
keys to values. A distributed hash table is one where the table is split
across all the peers in a distributed network.
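
The idea of splitting one table across peers can be sketched with a toy example. This is a simplification under stated assumptions: the IPFS DHT is based on Kademlia, which uses XOR distance between identifiers, but the real protocol adds routing tables, replication and network lookups that are omitted here. The class ToyDHT, the peer names and the keys are hypothetical.

```python
import hashlib


def node_id(name: str) -> int:
    """Map a peer name or a content key to a 256-bit integer identifier."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")


class ToyDHT:
    """A key-value table whose entries are split across several peers."""

    def __init__(self, peer_names):
        # Each peer holds only its own slice of the table.
        self.tables = {name: {} for name in peer_names}

    def responsible_peer(self, key: str) -> str:
        """Pick the peer whose identifier is closest to the key by XOR distance."""
        key_id = node_id(key)
        return min(self.tables, key=lambda name: node_id(name) ^ key_id)

    def put(self, key: str, value) -> None:
        self.tables[self.responsible_peer(key)][key] = value

    def get(self, key: str):
        return self.tables[self.responsible_peer(key)].get(key)


dht = ToyDHT(["peer-1", "peer-2", "peer-3", "peer-4"])
# In IPFS the value would list the peers that host the content.
dht.put("hash-of-ehr-record-1", ["peer-3"])
dht.put("hash-of-ehr-record-2", ["peer-1", "peer-4"])
providers = dht.get("hash-of-ehr-record-1")
```

Any peer can compute which peer is responsible for a key from the key alone, so no central index of the whole table is needed.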

STEPS FOR STORING EHR IN IPFS

• STEP 1 : To begin using IPFS, launch the command line interface and
confirm that the daemon is active.
• STEP 2 : A hash value is produced during program execution.
• STEP 3 : We can access the EHR through the IPFS gateway using the hash
value.
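
These steps might look as follows when driven from Python. This is a sketch under assumptions, not the project's program: it assumes the ipfs command line tool is installed, that the daemon was started beforehand with "ipfs daemon", and that the local gateway listens on its default address http://127.0.0.1:8080. The function names and the file name in the usage comment are hypothetical.

```python
import subprocess


def add_file_to_ipfs(path: str) -> str:
    """Add a file with the IPFS CLI and return the hash (CID) that it prints.

    Uses `ipfs add -Q`, which prints only the final hash.
    """
    result = subprocess.run(
        ["ipfs", "add", "-Q", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def gateway_url(cid: str, gateway: str = "http://127.0.0.1:8080") -> str:
    """Build the gateway address under which the content can be opened."""
    return f"{gateway.rstrip('/')}/ipfs/{cid}"


# Usage (requires a running IPFS daemon, so it is left commented out):
# cid = add_file_to_ipfs("ehr_record.json")
# print(gateway_url(cid))
```

The hash returned in step 2 is the only thing that has to be kept, for example on the blockchain, in order to retrieve the record later.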


WORKFLOW OF SECURING EHR USING IPFS
[Figure: Data are broken into blocks and uploaded as encrypted data to IPFS (distributed ledger); the data is downloaded from IPFS using the hash value.]

RESULT ANALYSIS

DATA SET DESCRIPTION
S.NO | ATTRIBUTE      | DESCRIPTION OF ATTRIBUTES                     | MEAN VALUE | STD VALUE
1    | Pregnancies    | Number of times pregnant                      | 0.2262     | 0.19
2    | Glucose        | Plasma glucose concentration (mg/dL)          | 0.6075     | 0.16
3    | Blood Pressure | Diastolic blood pressure (mm Hg)              | 0.5664     | 0.15
4    | Skin thickness | Triceps skin fold thickness (mm)              | 0.2074     | 0.16
5    | Insulin        | 2-hour serum insulin (mu U/ml)                | 0.09       | 0.13
6    | BMI            | Body mass index (weight in kg/(height in m)^2) | 0.47       | 0.11
7    | Pedi           | Diabetes pedigree function                    | 0.16       | 0.14
8    | Age            | Years                                         | 0.20       | 0.19
9    | Target         | 1: diabetic, 0: non-diabetic                  | 0.34       | 0.47

DATA SET DESCRIPTION
• There are a total of 768 records and 9 features in the dataset.
• The parameters used in our system are Age, BMI, Insulin and Glucose Level.
• Each feature is of either integer or float datatype.
• Some features, such as Glucose, Blood Pressure, Insulin and BMI, have zero
values, which represent missing data.
• There are no NaN values in the dataset.
• In the outcome column, 1 represents diabetes positive and 0 represents
diabetes negative.
[Figure legend: 0 – Non-Diabetic, 1 – Diabetic]

EVALUATION PARAMETER
• Accuracy
It is the most common performance metric for classification algorithms. It is
defined as the number of correct predictions as a ratio of all predictions
made. It can be calculated from the confusion matrix with the following formula:
ACCURACY = (TP + TN) / (TP + TN + FN + FP)

• Reading latency
• Writing latency
ACCURACY COMPARISON OF ML ALGORITHMS
[Figure: Bar chart. X axis: algorithms (KNN, Naive Bayes, Gradient Boosting, Random Forest). Y axis: accuracy, 0 to 100.]

CONFUSION MATRIX OF RANDOM FOREST ALGORITHM

                  | PREDICTED CLASS: YES | PREDICTED CLASS: NO
ACTUAL CLASS: YES | TP                   | FN
ACTUAL CLASS: NO  | FP                   | TN

ACCURACY = (TP + TN) / (TP + TN + FN + FP)
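
The accuracy formula can be computed from a list of actual and predicted labels as in the short sketch below. The labels are illustrative values chosen for the example, not results from the project, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN for binary labels (1 = diabetic, 0 = non-diabetic)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACCURACY = (TP + TN) / (TP + TN + FN + FP)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    return (tp + tn) / total


# Illustrative labels only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(actual, predicted)
score = accuracy(tp, tn, fp, fn)  # 6 correct out of 8 predictions
```

The parentheses matter: without them the expression TP+TN/TP+TN+FN+FP would be evaluated as TP + (TN/TP) + TN + FN + FP.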

INTERPLANETARY FILE SYSTEM
FILE SYSTEM | PROTOCOL | VERSION | OPERATIONS
IPFS        | IPFS     | 0.7.0   | ADD & GET
Vsftpd      | FTP      | 3.0.3   | COPY & GET

WRITING LATENCY
[Figure: Two charts comparing IPFS and FTP.
"IPFS and FTP writing small data": X axis file size (1 KB, 4 KB, 16 KB, 64 KB, 256 KB); Y axis latency time (ms), 0 to 350.
"IPFS and FTP writing large data": X axis file size (1 MB, 4 MB, 16 MB, 64 MB); Y axis latency time (ms), 0 to 4000.]

READING LATENCY
[Figure: Two charts comparing IPFS and FTP.
"IPFS and FTP reading small data": X axis file size (1 KB, 4 KB, 16 KB, 64 KB, 256 KB); Y axis latency time (ms), 0 to 250.
"IPFS and FTP reading large data": X axis file size (1 MB, 4 MB, 16 MB, 64 MB); Y axis latency time (ms), 0 to 7000.]

CONCLUSION
• The primary application of this system is to predict whether a patient
is diabetic using the random forest algorithm, which reached an accuracy
of 97% in our comparison with other machine learning algorithms. The
system also provides a diet plan based on the patient's age and BMI
categories. Furthermore, patient information is stored in an electronic
health record and is secured using the InterPlanetary File System.

FUTURE WORK
REFERENCE
• Sridevi Krishnan, Erik R. Gertz, Sean H. Adams, John W. Newman, Theresa L.
Pedersen, Nancy L. Keim, Brian J. Bennett. Effects of a diet based on the Dietary
Guidelines on vascular health and TMAO in women with cardiometabolic risk
factors. Nutrition, Metabolism & Cardiovascular Diseases 32 (2022) 210-219.
• Mahmood Safaei, Elankovan A. Sundararajan, Maha Driss, Wadii Boulila,
Azrulhizam Shapi'i. A systematic literature review on obesity: Understanding the
causes & consequences of obesity and reviewing various machine learning
approaches used to predict obesity. Computers in Biology and Medicine 136
(2021) 104754.
• Lara Lama, Oskar Wilhelmsson, Erik Norlander, Lars Gustafsson, Anton Lager, Per
Tynelius, Lars Warvik, Claes-Goran Ostenson. Machine learning for prediction of
diabetes risk in middle-aged Swedish people. Heliyon 7 (2021) e07419.
• Md. Mazharul Islam, Rittika Shamsuddin. Machine learning to promote health
management through lifestyle changes for hypertension patients. Array 12 (2021)
100090.
• Akhilendra Pratap Singh, Nihar Ranjan Pradhan, Ashish K. Luhach, Sivansu
Agnihotri, Noor Zaman Jhanjhi, Sahil Verma, Kavita, Uttam Ghosh, Diptendu Sinha
Roy. A Novel Patient-Centric Architectural Framework for Blockchain-Enabled
Healthcare Applications. IEEE Transactions on Industrial Informatics, vol. 17,
no. 8, August 2021.
• Rosario Catelli, Francesco Gargiulo, Valentina Casola, Giuseppe De Pietro,
Hamido Fujita, Massimo Esposito. A Novel COVID-19 Data Set and an Effective Deep
Learning Approach for the De-Identification of Italian Medical Records. Digital
Object Identifier 10.1109/ACCESS.2021.3054479.
• Jyotismita Chaki, S. Thillai Ganesh, S.K Cidham, S. Ananda Theertan. Machine
learning and artificial intelligence based Diabetes Mellitus detection and
self-management: A systematic review. Journal of King Saud University –
Computer and Information Sciences, 2020.

THANK YOU
