Content
Introduction
Challenges
Literature Survey
Motivation
Proposed methodology
Experimental result
Conclusion
Future scope
Reference
Introduction
• Type 2 diabetes mellitus (T2DM) is a life-threatening disease that accounts for
about 90% of diabetes cases worldwide
• The development of T2DM is caused by a combination of lifestyle and genetic
factors
• Cardiovascular disease (CVD) is a serious long-term complication of diabetes
• The risk of CVD increases by 2 to 4 times for T2DM patients
• The mortality rate of CVD in Indian T2DM patients is 70%
Challenges (what problems exist in this field)
Related Work
• The European Association for the Study of Diabetes recommends the Framingham model [] as a
CVD risk prediction model
Limitations
Applicable to the general population rather than specifically to T2DM patients
Related Work
Year Author Data Mining Algorithms Accuracy
Proposed methodology(1/)
[Figure 1: Block diagram of the proposed framework. Stages shown: 420 samples of T2DM patients; data processing; 294 samples of T2DM patients; sample labelling (162 cases, 132 controls); feature construction; original feature set; feature selection; selected features; training data and testing data; classification algorithm; final evaluation; estimated accuracy.]
Proposed methodology(2/)
Data Collection
Proposed methodology(3/)
Data Preparation
Proposed methodology(4/)
Feature Description
A feature is an individual measurable property of a phenomenon.
Proposed methodology(4/)
Feature          Type         Description
Foot infection   Categorical  Presence of foot infection (Yes or No)
Smoking          Categorical  Smoking habit of the patient (Yes or No)
Family history   Categorical  Family history of CVD (Yes or No)
Proposed methodology(6/)
Classification model
The component of a machine learning (ML) algorithm that extracts patterns from data in order to assign observations to classes
Proposed methodology(7/)
Outlier detection
Detection of outliers is a crucial task for classification models.
Proposed methodology(8/)
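• Cook’s distance
Measures how much the fitted values change when a single observation is removed, which helps detect outliers in multivariate data.
$$D_i = \frac{\sum_{j=1}^{n} \left(\hat{y}_j - \hat{y}_{j(i)}\right)^2}{p\,\hat{\sigma}^2}$$
Where,
$\hat{y}_j$ is the prediction value for observation j
$\hat{y}_{j(i)}$ is the prediction value for observation j without including point i
$\hat{\sigma}^2$ is the estimate of the error variance.
p is the number of independent variables present in the model.
n is the number of observations.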
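The following is a minimal sketch of Cook's distance as described above, not the implementation used in this work. It assumes a simple linear regression with one predictor and refits the line once per left-out observation. The function names `fit_line` and `cooks_distances` and the toy data are hypothetical. Here p is taken as the number of fitted coefficients (intercept and slope, so p = 2), a common convention that may differ from the count of independent variables used on the slide.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cooks_distances(xs, ys, p=2):
    """Cook's distance for every observation, by explicit leave-one-out refits.

    p is an assumption: the number of fitted coefficients (intercept + slope).
    """
    n = len(xs)
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    # Estimate of the error variance from the full fit.
    sigma2 = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit without observation i.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        # Sum of squared changes in all n predictions.
        shift = sum((f - (a_i + b_i * x)) ** 2 for f, x in zip(fitted, xs))
        distances.append(shift / (p * sigma2))
    return distances


# Toy data (hypothetical): the last point lies far from the trend of the others.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 20.0]
d = cooks_distances(xs, ys)
most_influential = max(range(len(d)), key=lambda i: d[i])
```

Observations with an unusually large distance relative to the rest are the ones to inspect as suspicious.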
Proposed methodology(9/)
• DFFITS method
Measures the impact of an observation on its own fitted response value
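As an illustration only, DFFITS can be sketched with the standard textbook definition: the change in the fitted value of observation i when it is left out, scaled by the leave-one-out residual standard error and the square root of the leverage. The sketch assumes a simple linear regression with one predictor. The function names, the toy data, and the cutoff of 2·sqrt(p/n) (a common rule of thumb, not taken from the slides) are assumptions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def dffits(xs, ys):
    """DFFITS for each observation of a one-predictor linear regression."""
    n = len(xs)
    p = 2  # assumption: fitted coefficients are the intercept and the slope
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    values = []
    for i in range(n):
        xs_i = xs[:i] + xs[i + 1:]
        ys_i = ys[:i] + ys[i + 1:]
        a_i, b_i = fit_line(xs_i, ys_i)
        # Residual standard error of the fit without observation i.
        rss_i = sum((y - (a_i + b_i * x)) ** 2 for x, y in zip(xs_i, ys_i))
        s_i = math.sqrt(rss_i / (n - 1 - p))
        # Leverage of observation i in the full fit.
        h_ii = 1.0 / n + (xs[i] - mx) ** 2 / sxx
        change = (a + b * xs[i]) - (a_i + b_i * xs[i])
        values.append(change / (s_i * math.sqrt(h_ii)))
    return values


# Toy data (hypothetical): the last point lies far from the trend of the others.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 20.0]
scores = dffits(xs, ys)
cutoff = 2 * math.sqrt(2 / len(xs))  # rule-of-thumb threshold (assumption)
flagged = [i for i, v in enumerate(scores) if abs(v) > cutoff]
```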
Proposed methodology(10/)
Multicollinearity Check
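The slide does not say how multicollinearity is checked. One common approach, shown here purely as an assumption, is the variance inflation factor (VIF). For the special case of exactly two predictors, the VIF reduces to 1 / (1 - r²), where r is their Pearson correlation. The function names and the data below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
```

A VIF close to 1 indicates little collinearity. Rules of thumb often treat values above 5 or 10 as a warning sign, although the cutoff is a matter of convention.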
Proposed methodology(11/)
Feature Selection
The process of identifying a relevant subset of the available features
Proposed methodology(12/)
[Figure: feature selection workflow. Components shown: training set, feature generation, features, feature selection, label information, learning algorithm, classifier.]
Identifies the optimal value of the decision threshold for the proposed classifier
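The slides do not state which criterion is used to choose the threshold. As one possible illustration, the sketch below picks the threshold that maximises Youden's J statistic (sensitivity + specificity - 1). The criterion, the function name `best_threshold`, and the example scores are all assumptions.

```python
def best_threshold(scores, labels):
    """Return (threshold, J) maximising Youden's J over the observed scores.

    A sample is predicted positive when its score is >= threshold.
    labels are 1 for cases and 0 for controls.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:  # keep the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical classifier scores and true labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, j_stat = best_threshold(scores, labels)
```

Other criteria, such as minimising a cost-weighted error, would follow the same loop with a different objective.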
Proposed methodology(14/)
Model Validation
The process of determining how closely the model agrees with real data.
Proposed methodology(15/)
Current scenario
A marked rise in CVD risk among T2DM patients is observed in India
Proposed methodology(16/)
Name of test          Average price  Purpose
TMT                   Rs. 1800       To assess the functioning of the heart and blood vessels
Coronary angiography  Rs. 6000       To check for blood vessel problems and heart abnormalities
Carotid ultrasound    Rs. 1000       To assess the risk of stroke
Experimental Analysis(1/)
Outlier Detection analysis
The Cook’s distance method flags the following observations as suspicious:
28, 172, 67, 265, 102, 58, 70, 215
Experimental Analysis(13/)
Score System analysis
• The score system is developed using the features identified by the Backward Deletion or
Stepwise Regression method.
• A score is carefully assigned to each individual risk factor.
• The efficacy of the score system is evaluated using both an in-sample dataset and an
out-of-sample dataset.