INTRODUCTION
Analytics for Business
§ Introduction
§ Analytics Process
§ Application
Analytics
Descriptive
• Total revenue
• The average revenue per customer
• YoY (%) increase in sales
• YoY (%) increase in premium collected

Diagnostic
• Whether sales went down due to a recent change in the color scheme of the web page
• Is the introduction of a new product cannibalizing the sales of an existing product?
• Is there any effect of discount or promotional activity on the sales?
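The descriptive measures listed above (total revenue, average revenue per customer, YoY increase) are simple aggregations. A minimal sketch follows, assuming a hypothetical list of (year, customer, revenue) records; the data and function names are illustrative and do not come from the slides.

```python
# Hypothetical order records: (year, customer_id, revenue). Illustrative values only.
orders = [
    (2021, "C1", 100.0),
    (2021, "C2", 150.0),
    (2022, "C1", 120.0),
    (2022, "C2", 180.0),
    (2022, "C3", 75.0),
]


def total_revenue(rows, year):
    """Sum of revenue over all orders placed in the given year."""
    return sum(revenue for y, _, revenue in rows if y == year)


def average_revenue_per_customer(rows, year):
    """Total revenue divided by the number of distinct customers in that year."""
    customers = {customer for y, customer, _ in rows if y == year}
    return total_revenue(rows, year) / len(customers)


def yoy_increase_pct(rows, year):
    """Year-over-year change in total revenue, as a percentage of the previous year."""
    previous = total_revenue(rows, year - 1)
    current = total_revenue(rows, year)
    return (current - previous) / previous * 100.0


print("Total revenue 2022:", total_revenue(orders, 2022))
print("Average revenue per customer 2022:", average_revenue_per_customer(orders, 2022))
print("YoY (%) increase in sales 2022:", yoy_increase_pct(orders, 2022))
```

Descriptive analytics of this kind only summarises what has already happened; the diagnostic questions need a comparison between periods or groups on top of such summaries.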
Predictive
• What will be sales in the forthcoming years?
• What will be the demand in the following week?
• What will be the cash flow over the next six months?
• What will be the number of covid infected in the next two months?
• What will be the number of tourist arrivals?

Prescriptive
• Google Maps recommends the best route from start to destination
• Routing systems recommend the cost-effective route that truck drivers should follow to complete their delivery schedule
• Recommendation systems propose videos to watch or products to buy
• Pricing management systems used by airlines to decide the dynamic pricing of the seats
PROCESS – CRISP-DM
Business Understanding: This initial phase focuses on understanding the
project objectives and requirements from a business perspective, and then
converting this knowledge into a data mining problem definition, and a
preliminary project plan designed to achieve the objectives.
Evaluation: At this stage, you have built one or more models that appear
to have high quality, from a data analysis perspective. Before proceeding to
final deployment of the model, it is important to more thoroughly evaluate
the model, and review the steps executed to construct the model, to be
certain it properly achieves the business objectives.
Deployment: Creation of the model is generally not the end of the project.
Usually, the knowledge gained will need to be organized and presented in a
way that the customer can use it. Depending on the requirements, the
deployment phase can be as simple as generating a report or as complex as
implementing a repeatable data mining process.
[Figure: headline figures (8.1 Trillion, 1.1 Trillion, $48 Bn) with years (2023, 2026, 2022)]
Business Understanding
Product Returns
• Part of business model
• Products returned without being experienced
• Bracketing
• Incorrect Address
• Unresponsive
• Untraceable
• Incomplete shipping information to TP

• Rechecked, Reprocessed, Repackaged
• Lower service levels
• Customer dissatisfaction
• Negative impact on future purchase intention
• Negative impact on environment
[Figure: order flow between the Customer and Alivekart. Steps shown: Place Order (Online Payment or CoD), Receive Order, Dispatch Order, Customer Unreachable, Receive “Undelivered Item”]
Data Understanding
S. No | Attribute | Description | Data Type
1 | Year | Year of the order | Date/Time
2 | Prod_Description | SKU Description | Nominal
3 | Payment_Mode | Payment Mode: Online / Cash-on-Delivery | Categorical
4 | Order_Number | Order Number | Nominal
5 | Value | Order Value | Ratio
6 | Order_Status | Order Status – Delivered, Returned, Returned Undelivered, Cancelled | Categorical
7 | Street Name | Shipping Address – Street | Nominal
8 | City | Shipping Address – City | Nominal
9 | State | Shipping Address – State | Nominal
10 | Pin_Code | Shipping Address – Pin Code | Nominal
11 | Contact_Number | Customer Contact Number | Nominal
12 | Tracking_ID | Shipment Tracking ID | Nominal
Data Preparation
Feature | Method / Remark
Year | Nominal Attribute – Dropped
Prod_Description | Redundant Feature – Dropped
Prod_Code | Parsed from SKU Description and label_encoded
Gender | Parsed from SKU Description and label_encoded
Color | Parsed from SKU Description and label_encoded
Size | Parsed from SKU Description and label_encoded
Weight_Fit | Parsed from SKU Description and label_encoded
Payment_Mode | Redundant Feature – Dropped
MOP | Online = 0, COD = 1
Order_Number | Nominal Attribute – Dropped
Multiple_Item | Constructed by processing Order_Number
Value | Ratio
Order_Status | Redundant Feature – Dropped
Customer_Type | Derived from Mobile Number
Class | Constructed from Order_Status: RWOD = 1, Others = 0
Street Name | Nominal – Dropped
Add_Bin | Derived from Street_Name
Add_Strings | Derived from Street Name
City | Nominal – Dropped
State | Retained and Label Encoded
Pin Code | Nominal – Dropped
Pincode_Available | No = 0; Yes = 1
Contact_Number | Nominal – Dropped
Contact_Available | No = 0; Yes = 1
Contact_Digit | Derived from Contact_Number and Label Encoded. Series ER was used when Contact_Number was unavailable
Tracking_ID | Nominal – Dropped
Tracking_ID_S | Derived from Tracking_ID and Label Encoded. Series ZZ was used when Contact_Number was unavailable
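A few of the transformations in the table above can be sketched in plain Python. This is a minimal illustration, not the procedure used in the case: the sample records are invented, the helper names `label_encode` and `prepare` are hypothetical, and it assumes that the RWOD class corresponds to the "Returned Undelivered" order status.

```python
def label_encode(values):
    """Map each distinct value to an integer code (codes follow sorted order)."""
    mapping = {value: code for code, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


def prepare(order):
    """Build a subset of the engineered features for one raw order record."""
    return {
        # Assumption: RWOD is the "Returned Undelivered" status; everything else is 0.
        "Class": 1 if order["Order_Status"] == "Returned Undelivered" else 0,
        # MOP: Online = 0, COD = 1
        "MOP": 1 if order["Payment_Mode"] == "Cash-on-Delivery" else 0,
        # Availability flags: No = 0, Yes = 1 (missing or empty counts as unavailable)
        "Pincode_Available": 1 if order.get("Pin_Code") else 0,
        "Contact_Available": 1 if order.get("Contact_Number") else 0,
        "Value": order["Value"],
        "State": order["State"],  # label-encoded after all rows are collected
    }


# Invented sample records for illustration only.
raw_orders = [
    {"Payment_Mode": "Online", "Order_Status": "Delivered", "Value": 499.0,
     "State": "Kerala", "Pin_Code": "682001", "Contact_Number": "9000000001"},
    {"Payment_Mode": "Cash-on-Delivery", "Order_Status": "Returned Undelivered",
     "Value": 899.0, "State": "Assam", "Pin_Code": "", "Contact_Number": None},
]

prepared = [prepare(order) for order in raw_orders]

# State is retained and label-encoded across the whole data set.
state_codes, state_mapping = label_encode([row["State"] for row in prepared])
for row, code in zip(prepared, state_codes):
    row["State"] = code

for row in prepared:
    print(row)
```

Features such as Prod_Code, Add_Bin, or Contact_Digit would need parsing rules that the slides do not specify, so they are left out of this sketch.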
Modelling and Evaluation
S. No | Model | AUC Value
1 | Logistic Regression | 0.625
2 | Naïve Bayes | 0.660
3 | Decision Trees | 0.621
4 | Linear SVM | 0.545
5 | RBF SVM | 0.586
6 | Polynomial SVM | 0.587
7 | Sigmoid SVM | 0.532
8 | Random Forest | 0.680
9 | Adaptive Boosting | 0.601
10 | Gradient Boosting | 0.631
11 | Stochastic Gradient Boosting | 0.628
12 | Deep Neural Network | 0.645
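The models above are compared on AUC, which can be read as the probability that a randomly chosen positive case (here, an RWOD order) receives a higher score than a randomly chosen negative case. The sketch below computes AUC by direct pairwise comparison and then picks the best model from the reported values. The labels and scores in the usage example are invented; only the `reported_auc` figures come from the table.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Invented labels and model scores, for illustration only.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.90, 0.40, 0.35, 0.20, 0.80]
print("Example AUC:", round(auc(example_labels, example_scores), 3))

# AUC values as reported in the table above.
reported_auc = {
    "Logistic Regression": 0.625,
    "Naïve Bayes": 0.660,
    "Decision Trees": 0.621,
    "Linear SVM": 0.545,
    "RBF SVM": 0.586,
    "Polynomial SVM": 0.587,
    "Sigmoid SVM": 0.532,
    "Random Forest": 0.680,
    "Adaptive Boosting": 0.601,
    "Gradient Boosting": 0.631,
    "Stochastic Gradient Boosting": 0.628,
    "Deep Neural Network": 0.645,
}
best_model = max(reported_auc, key=reported_auc.get)
print("Best model by AUC:", best_model, reported_auc[best_model])
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be preferable on a full order history.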
Deployment
High-Risk Order
Takeaways - You should learn
• Programming Languages (R & Python), Statistics, Machine Learning, Deep Learning
• Data Exploration & Understanding
• Modelling and Evaluation
• Deployment (Strategies Prescribed)
Analytics Courses – Statistics, R, Python, DSML, AIDL, NLP, Data Visualization, Big Data, Econometrics, and other analytics electives – strengthen your analytical perspective
Thank You