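[Figure: five-step staircase of the modelling process, from bottom to top]
1 Understanding Business Problem
2 Data Discovery and Collection
3 Data Preparation
4 Model Selection and Building: Identify best fit ML Model
5 Model Validation: Find out how accurate the Predictions are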
Stages: Business Problem, Data Discovery and Collection, Data Preparation, Model Selection and Building, Model Validation
Business Problem
➢ Understand Project Objectives
➢ Investigate Question and gather Requirements
➢ Convert into a Statistical Problem
Data Discovery and Collection
➢ Understand Data architecture
Data Preparation
➢ Univariate Analysis
➢ Data Cleaning
➢ Outlier Treatment
➢ Missing Value Treatment
➢ Feature Engineering
➢ Variable Creation
➢ Data Transformation
➢ Dimension Reduction
➢ Bivariate Analysis and Hypothesis Testing
➢ Data Split
➢ Training Set
➢ Testing Set
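The cleaning steps listed above (Missing Value Treatment and Outlier Treatment) can be sketched as follows. This is a minimal illustration, not the deck's own method: the choice of median imputation and Tukey-fence capping, the function names `impute_missing` and `cap_outliers`, and the sample column `raw_income` are all assumptions made for the example.

```python
import statistics

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def cap_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Made-up sample column with two missing values and one extreme value.
raw_income = [42, 39, None, 45, 41, 400, 38, None, 44]

filled = impute_missing(raw_income)   # missing value treatment
cleaned = cap_outliers(filled)        # outlier treatment
print(cleaned)
```

Capping keeps every row in the data set, which matters when the sample is small; dropping extreme rows is the usual alternative when they are known to be data-entry errors.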
Copyright © 2019 ShyftPath, All rights reserved.
Anatomy of Statistical Model
A step-by-step approach to solve a Business Problem
Model Selection and Building
➢ Regression
➢ Classification
➢ Clustering
➢ Association Rule Mining, etc.
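As an illustration of the first family in the list above, a one-variable least-squares regression might look like the sketch below. The function name `fit_line` and the sample points are hypothetical, chosen only to show the idea of fitting a model to data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```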
Model Validation
➢ Check Model Performance, Accuracy, ROC, AUC, KS, etc.
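Three of the measures named above (Accuracy, AUC, and KS) can be computed for a binary classifier roughly as follows. This is a sketch under stated assumptions: the labels and scores are made up, the 0.5 cut-off is arbitrary, and the helper names `accuracy`, `auc`, and `ks_statistic` are introduced only for this example.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a
    random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions
    of the positive and negative classes."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    gaps = []
    for threshold in sorted(set(scores)):
        cdf_pos = sum(s <= threshold for s in pos) / len(pos)
        cdf_neg = sum(s <= threshold for s in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)

# Made-up labels and model scores for six test records.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # arbitrary 0.5 cut-off

print(accuracy(y_true, y_pred), auc(y_true, scores), ks_statistic(y_true, scores))
```

Accuracy depends on the chosen cut-off, while AUC and KS are computed from the scores directly and so describe how well the model ranks records regardless of cut-off.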
Business Problem
➢ Understand Project Objectives
➢ Define the Problem
➢ Investigate Question and gather Requirements
➢ Convert into a Statistical Problem
Data Discovery and Collection
➢ Understand Data architecture
➢ Data List Preparation and Identification of Data Sources
➢ Collect Initial Data
➢ Define Variables and Create Data Dictionary
➢ Validate for Correctness
Data Preparation
➢ Univariate Analysis
➢ Data Cleaning
  • Outlier Treatment
  • Missing Value Treatment
➢ Feature Engineering
  • Variable Creation
  • Data Transformation
  • Dimension Reduction
➢ Bivariate Analysis and Hypothesis Testing
➢ Data Split
  • Training Set
  • Testing Set
Model Selection and Building
➢ Regression
➢ Classification
➢ Clustering
➢ Association Rule Mining, etc.
Model Validation
➢ Score and Predict using Test Sample
➢ Check the Robustness and Stability of the Model
➢ Check Model Performance, Accuracy, ROC, AUC, KS, etc.
[Figure: train/test workflow in six numbered steps. The finalized data is split into Training Data and Testing Data. The training data is used to TRAIN the model, the testing data is used to TEST the model, and the model's accuracy is then checked.]
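The split, train, test, and check-accuracy flow in the diagram above could be sketched as follows. Everything here is an assumption made for illustration: the data are made up, the 70/30 split ratio is arbitrary, and the "model" is a toy threshold classifier rather than any technique the deck prescribes. The helper names `split_data`, `train`, `predict`, and `check_accuracy` are hypothetical.

```python
import random

def split_data(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and return (training, testing) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def train(training):
    """Toy model: a threshold halfway between the two class means."""
    xs0 = [x for x, label in training if label == 0]
    xs1 = [x for x, label in training if label == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def predict(threshold, x):
    """Predict class 1 when x lies above the learned threshold."""
    return 1 if x > threshold else 0

def check_accuracy(threshold, rows):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)

# Made-up data: (feature, label) pairs with two well-separated classes.
rows = [(x, 0) for x in range(1, 11)] + [(x, 1) for x in range(21, 31)]

training, testing = split_data(rows)          # split the data
threshold = train(training)                   # train on the training data
test_accuracy = check_accuracy(threshold, testing)  # test on unseen data
print(len(training), len(testing), test_accuracy)
```

The model never sees the testing rows during training, so the accuracy measured on them is an estimate of how it would perform on new data.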