
Anatomy of Statistical Model

A step-by-step approach to solve a Business Problem

1. Business Problem: Understanding the Business Problem
2. Data Discovery and Collection: Gathering data
3. Data Preparation: Preparing the Data for the Model
4. Model Selection and Building: Identify the best fit ML Algorithm
5. Model Validation: Find out how accurate the Predictions are

Copyright © 2019 ShyftPath, All rights reserved.


Step 1: Business Problem

➢ Understand Project Objectives
➢ Define the Problem
➢ Investigate Question and gather Requirements
➢ Convert into a Statistical Problem



Step 2: Data Discovery and Collection

➢ Understand Data architecture
➢ Data List Preparation and Identification of Data Sources
➢ Collect Initial Data
➢ Define Variables and Create Data Dictionary
➢ Validate for Correctness
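
As a rough illustration of the last two bullets, a data dictionary can be kept as a plain mapping from variable name to its expected type and range, and each collected record checked against it. This is only a sketch: the variable names, ranges, and the `validate_record` helper are hypothetical and not taken from the slides.

```python
# Hypothetical data dictionary: variable name -> expected type and allowed range.
DATA_DICTIONARY = {
    "customer_id": {"type": int, "required": True},
    "age": {"type": int, "required": True, "min": 18, "max": 100},
    "monthly_spend": {"type": float, "required": False, "min": 0.0},
}


def validate_record(record, dictionary):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for name, rule in dictionary.items():
        value = record.get(name)
        if value is None:
            if rule.get("required", False):
                problems.append(f"{name}: missing required value")
            continue
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: above maximum {rule['max']}")
    # Flag variables that are not documented in the dictionary at all.
    for name in record:
        if name not in dictionary:
            problems.append(f"{name}: not defined in data dictionary")
    return problems


# Illustrative records (made up for this sketch).
good = {"customer_id": 1, "age": 34, "monthly_spend": 120.5}
bad = {"customer_id": 2, "age": 150, "region": "north"}

print(validate_record(good, DATA_DICTIONARY))
print(validate_record(bad, DATA_DICTIONARY))
```

Keeping the rules in one mapping means the same definitions can serve both as documentation and as an automated correctness check on the initial data.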


Step 3: Data Preparation

➢ Univariate Analysis
➢ Data Cleaning
  • Outlier Treatment
  • Missing Value Treatment
➢ Feature Engineering
  • Variable Creation
  • Data Transformation
  • Dimension Reduction
➢ Bivariate Analysis and Hypothesis Testing
➢ Data Split
  • Training Set
  • Testing Set
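
The two Data Cleaning items can be sketched in a few lines of standard-library Python. The slides do not say which treatments to use, so the choices here are assumptions: median imputation for missing values and capping at the 1.5 × IQR fences for outliers. The function names and sample values are illustrative only.

```python
import statistics


def impute_median(values):
    """Missing Value Treatment: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def cap_outliers_iqr(values, k=1.5):
    """Outlier Treatment: cap values outside [Q1 - k*IQR, Q3 + k*IQR] at the fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Illustrative raw column with two missing entries and one extreme value.
raw = [12, 15, None, 14, 13, 200, 16, None, 15]
filled = impute_median(raw)
cleaned = cap_outliers_iqr(filled)

print(filled)
print(cleaned)
```

Capping keeps the row in the dataset while limiting its influence; dropping the row or transforming the variable are equally valid alternatives depending on the business problem.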

Step 4: Model Selection and Building

➢ Regression

➢ Classification

➢ Clustering

➢ Association Rule Mining, etc.



Step 5: Model Validation

➢ Score and Predict using Test Sample
➢ Check the Robustness and Stability of the Model
➢ Check Model Performance: Accuracy, ROC, AUC, KS, etc.
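
For a binary classifier, three of the measures listed above can be computed directly from the true labels and the model scores on the test sample. The sketch below assumes that KS refers to the Kolmogorov-Smirnov statistic (the largest gap between the score distributions of the two classes) and that AUC is the area under the ROC curve. The labels and scores are made up for illustration.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Share of cases where the thresholded score matches the true label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(predicted, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    gaps = []
    for cut in sorted(set(scores)):
        cdf_pos = sum(s <= cut for s in pos) / len(pos)
        cdf_neg = sum(s <= cut for s in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)


# Illustrative test-sample labels and model scores (not from the source).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.65, 0.9]

print(accuracy(y_true, scores))      # 0.75
print(roc_auc(y_true, scores))       # 0.8125
print(ks_statistic(y_true, scores))  # 0.5
```

Accuracy depends on the chosen threshold, while AUC and KS describe how well the scores separate the two classes across all thresholds, which is why they are usually reported together.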



Summary: A step-by-step approach to solve a Business Problem

Business Problem: Understand Project Objectives; Define the Problem; Investigate Question and gather Requirements; Convert into a Statistical Problem
Data Discovery and Collection: Understand Data architecture; Data List Preparation and Identification of Data Sources; Collect Initial Data; Define Variables and Create Data Dictionary; Validate for Correctness
Data Preparation: Univariate Analysis; Data Cleaning (Outlier Treatment, Missing Value Treatment); Feature Engineering (Variable Creation, Data Transformation, Dimension Reduction); Bivariate Analysis and Hypothesis Testing; Data Split (Training Set, Testing Set)
Model Selection and Building: Regression; Classification; Clustering; Association Rule Mining, etc.
Model Validation: Score and Predict using Test Sample; Check the Robustness and Stability of the Model; Check Model Performance (Accuracy, ROC, AUC, KS, etc.)

Business Impact: Return on Investment (ROI)



Why Split Data into Train & Test Sets?

Splitting Data
1. Split the Data to obtain the Training Data
2. Split the Data to obtain the Testing Data

Model Building
3. Use Training data to TRAIN the Model
4. Finalize the trained Model
5. Use Testing data to TEST the Model
6. Check Accuracy
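
The six steps can be sketched end to end in plain Python. The point of the split is that accuracy measured on data the model never saw during training is a fairer estimate of how it will behave on new cases. Everything specific here is an assumption made for illustration: the 75/25 split, the fixed seed, the toy one-feature dataset, and the very simple "model" (a cut-off halfway between the two class means).

```python
import random


def split_data(rows, test_fraction=0.25, seed=42):
    """Steps 1-2: shuffle the rows, then cut them into training and testing data."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, testing)


def train_model(training):
    """Steps 3-4: learn a cut-off halfway between the two class means."""
    xs0 = [x for x, label in training if label == 0]
    xs1 = [x for x, label in training if label == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2


def predict(threshold, x):
    """Score one case with the finalized model."""
    return 1 if x >= threshold else 0


def check_accuracy(threshold, testing):
    """Steps 5-6: predict on the testing data and compare with the true labels."""
    correct = sum(predict(threshold, x) == label for x, label in testing)
    return correct / len(testing)


# Toy dataset of (feature, label) pairs with some overlap between the classes.
data = [(x, 0) for x in range(20)] + [(x, 1) for x in range(15, 35)]

training, testing = split_data(data)
model = train_model(training)
accuracy = check_accuracy(model, testing)
print(f"train={len(training)} test={len(testing)} accuracy={accuracy:.2f}")
```

The testing rows are never passed to `train_model`, so the reported accuracy is not inflated by cases the model has already seen.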
