Analytics For Decision Making

ANL203
Analytics for Decision Making

Seminar 2
ECA – Generic Dataset
Please note that the Generic Dataset provided for the ECA serves as a guideline.
You can modify the dataset to suit your needs. For example, you may rename some of the variables, derive new
variables from the dataset and/or create your own variables.
If you have a better dataset and would like to use it to build your dashboard, please do so. You will need to state the
source of your dataset in your report.
IMPORTANT
Provide screenshots of the dashboard in your report, and save the dashboard as a single Tableau packaged
workbook file with extracted data named “student_number.twbx”. Put both items (i.e., the Tableau packaged
workbook and the dataset used to construct the dashboard) into a folder, zip it and submit the zipped folder to ECA_PPT.
Recap – Seminar 1
o Data Visualisation
o Benefits
o Stages – Four basic stages and three loops
o Data – Attributes and Measurement (Nominal, Ordinal, Ratio and Interval)
o Various types of data – Categorical, Time Series, Spatial, Multi Variable and Distribution
o Business Performance Measures

o 7 Principles
o Strategic Business Performance Management Framework
o Vision and Mission
o Strategic Themes – main high level business strategies; major areas that the organization will focus on
o Strategic Objectives – defined objectives; 4 aspects need to be considered – impact, measurable, frequency, and
availability and accessibility
o Strategy Map – a diagram that is used to communicate the strategic objectives
o Business Performance Targets – SMART
o Strategic Initiatives – specific projects, programmes or planned activities to help to meet or exceed the targets
Recap – Seminar 1
o Dashboards
o 3 types – Strategic, Tactical and Operational
o Benefits
o Design Principles
o Tableau Hands-On
Connecting the dots…
Strategic Objectives | Strategic Initiatives | Data Mining | Dashboard
Increase Sales by $1M | Increase sales of Product X by $200K | Who to target? | Monitor sales of product X
Increase Market Share by 5% | Reduce churn by 15% | Who will churn? | Monitor the number of churners
Connecting the dots…
Strategic Objectives | Strategic Initiatives | Dashboard | Data Mining
Increase Sales by $1M | Increase sales of Product X by $200K | Monitor sales of product X | Who to target?
Increase Market Share by 5% | Reduce churn by 15% | Monitor the number of churners | Who will churn?
Topics
• Overview of Data Mining
• Data Mining Methodology
o CRISP-DM
• Data Mining Techniques
o Description and Visualisation
o Association Analysis
o Clustering
o Predictive Modelling
• Pre-Requisites of Successful Data Mining
• Limitations of Data Mining
• Case Study: New York City Fights Fire with Data
• Demo
Overview of Data Mining
How did data mining come about?
• Data explosion problem
• There is a tremendous increase in the amount of data recorded and stored on digital media
• We are drowning in data, but starving for knowledge!
“The greatest problem of today is how to teach people to ignore the irrelevant, how to
refuse to know things, before they are suffocated. For too many facts are as bad as
none at all.” (W.H. Auden)
Definition of Data Mining
• The best way to understand data mining is to view it in a decision-making context. To make
good decisions, decision-makers need good information.
• Data Mining can be defined as the process of finding previously unknown and valid
information from databases and using that information to take certain actions or make good
decisions
• Uses techniques from the disciplines of
o statistics,
o artificial intelligence,
o and machine learning.
1. Restaurant Payment
2. Credit Card
3. Credit Card Swipe
4. Transmission of Card Holder and Transaction Data
5. Card Company Processing of Data
6. DM Model Checking of Fraudulent Card Use
7a. DM Result: Low Probability of Fraud
8a. Card Payment Approval
OR
7b. DM Result: High Probability of Fraud
8b. Card Payment Non-Approval
Data Mining Methodology
CRoss-Industry Standard Process for Data Mining
Source: https://upload.wikimedia.org/wikipedia/commons/thumb/b/b9/CRISP-DM_Process_Diagram.png/1200px-CRISP-DM_Process_Diagram.png
Phase 1 – Business Understanding
This phase comprises four tasks:
• Determine business objectives
• Assess situation
• Determine data mining goals
• Produce project plan
What are the data mining goals?
Phase 2 – Data Understanding
There are three tasks in this phase:
• Collect data
• Describe and explore data
• Verify data quality
Phase 3 – Data Preparation
Data preparation can be divided into the following subtasks:
• Select data
• Clean data
• Construct and integrate data
• Format data
Phase 4 – Modelling
This phase can be divided into the following subtasks:
• Select the modelling technique
• Generate test design
• Build and assess model
Phase 5 – Evaluation
• Evaluate results
• Review process and determine next steps
Phase 6 – Deployment
• Plan deployment
• Monitoring and maintenance
• Final report
Overview of the CRISP-DM tasks
(Source: https://www.the-modelling-agency.com/crisp-dm.pdf)
Data Mining Methodology – An illustration
Case Study – ABC Bank
The number of credit card frauds at ABC Bank has been increasing over the past few years, and
the bank has suffered considerable losses in the past few months. The bank has
engaged you and your group as business analysts.
Phase 1 – Business Understanding
Business Problem:
• Credit card fraud is causing losses to both the bank and its customers
Business Goal:
• Detect credit card fraud accurately and prevent it from happening in future
Data Mining Goal:
• Accurately identify/predict credit card fraud based on criteria derived from transaction
patterns.
Phase 2 – Data Understanding
Collecting data:
Assume the data has already been collected from past transaction history in the data warehouse. Which attributes
should be used (e.g., time, venue, type)?
Describing and exploring data:
Use the SPSS Modeler ‘Data Audit’ node to describe the data.
Verifying data quality:
Use the SPSS Modeler ‘Data Audit’ node to assess the data quality.
Phase 3 – Data Preparation
Selecting data:
The set of attributes relevant to classification includes ‘Time of fraud’, ‘Day of fraud’, etc.
Cleaning data:
Remove any outliers, replace missing values, etc.
Constructing and integrating data:
Incorporate external databases or create new calculated fields. For example, we can integrate the
transaction database (which contains only transactions) with the customer database (which contains
only customer information).
Phase 4 – Modelling
Selecting the modelling technique:
Classification – decision trees, neural networks.
Generating test design:
Use 70% of the past fraud transaction database as training data and the remaining 30% as testing data.
Building and assessing the model:
Evaluate the decision tree's performance using the SPSS Modeler Analysis node by comparing the predicted and
actual target values for both the training and the testing data.
Phase 5 – Evaluation
Evaluating results:
Identify the best tool for data mining classification. Check the percentage of testing data that is correctly
classified.
Reviewing the process and determining the next step:
If the model needs to be improved, we have to start again from the initial phase of business
understanding.
Phase 6 – Deployment
Deployment:
If the model is ready, we can proceed with the deployment.
Notes:
The model is based on past transactional data of past frauds; hence, when there are new frauds, the
model should be updated.
Data Mining Techniques
Description and Visualisation
• Description and visualisation can contribute towards the understanding and detection of hidden
patterns in the data.
• Description refers to the summarisation of data to facilitate understanding.
• Visualisation can be considered an enhanced graphical approach that allows user input.
Association Analysis
• Association analysis is a descriptive technique that finds grouping by looking for patterns or
clusters among a set of items.
• The objective of association analysis is to determine which items co-occur frequently within a
database.
• Association analysis produces rules that are intuitive and easy to understand.
• Possible applications:
o Market Basket Analysis: What are the items inside each market basket?
o Fraud Detection: Association of certain patterns or incidences in transactions may flag a
potential fraud.
Association Analysis
• An itemset is a collection of one or more items.
• Association rules are statements in the form of:
If antecedent(s) then consequent(s): antecedent(s) → consequent(s)
• Association implies co-occurrence and not causality.
• Support is defined as the frequency of occurrence of an itemset. Support = % of times X appears = P(X)
• Rule support is the fraction of total transactions that contain both X and Y.
Rule support = % of times X and Y appear together = P(X and Y)
• Confidence measures how often the items in the consequent set Y appear in transactions
containing the antecedent X.
Confidence = likelihood that Y appears when X occurs = P(X and Y)/P(X)
Exercise
ID Item
1 Bread, Jam
2 Bread, Chips, Biscuits, Ice-cream
3 Jam, Chips, Biscuits, Muffins
4 Bread, Jam, Chips, Biscuits, Muffins
5 Bread, Jam, Chips, Muffins
1. What is the rule support for {Jam, Bread}?
2. What is the confidence for {Jam, Chips} → {Biscuits}?
Solution
ID Item
1 Bread, Jam
2 Bread, Chips, Biscuits, Ice-cream
3 Jam, Chips, Biscuits, Muffins
4 Bread, Jam, Chips, Biscuits, Muffins
5 Bread, Jam, Chips, Muffins
1. What is the rule support for {Jam, Bread}? 60% (3 of the 5 transactions, IDs 1, 4 and 5, contain both Jam and Bread.)
2. What is the confidence for {Jam, Chips} → {Biscuits}?
X = Jam and Chips; Y = Biscuits
Support count for (X, Y) = {Jam, Chips, Biscuits} is 2 (IDs 3 and 4).
Support count for X is 3 (IDs 3, 4 and 5).
Confidence = 2/3 ≈ 67%.
Hence, 67% of the transactions that contain Jam and Chips also contain Biscuits.
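The two answers can be checked with a few lines of Python. This is a minimal sketch that uses only the five transactions from the exercise; the helper names `support` and `confidence` are chosen here for illustration and are not part of any tool used in the seminar.

```python
# Transactions from the exercise (ID -> set of items).
transactions = {
    1: {"Bread", "Jam"},
    2: {"Bread", "Chips", "Biscuits", "Ice-cream"},
    3: {"Jam", "Chips", "Biscuits", "Muffins"},
    4: {"Bread", "Jam", "Chips", "Biscuits", "Muffins"},
    5: {"Bread", "Jam", "Chips", "Muffins"},
}

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """P(X and Y) / P(X) for the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)

rule_support = support({"Jam", "Bread"})
rule_confidence = confidence({"Jam", "Chips"}, {"Biscuits"})
print(f"Rule support for {{Jam, Bread}}: {rule_support:.0%}")
print(f"Confidence for {{Jam, Chips}} -> {{Biscuits}}: {rule_confidence:.0%}")
```

Representing each basket as a set keeps the "contains all items" check to a single subset comparison (`itemset <= items`).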
Clustering
• Clustering is an exploratory technique that attempts to discover natural groupings in data.
• The objective is to group similar (homogeneous) objects into the same cluster and dissimilar
(heterogeneous) objects into different clusters on the basis of distances among these objects.
• Domain knowledge plays an important role in deciding on the most useful results.
• Possible applications:
o Customer Segmentation: Customer population is partitioned into distinct groups based on
purchasing patterns and other customer characteristics
o Fraud Detection: Identifying subgroups/clusters that might behave differently or abnormally
Clustering
• Cluster sizes should not be too small (i.e. containing only a very small number of observations)
so that the clusters are meaningful to use. The rule-of-thumb for the minimum number of
observations in a cluster is about 5 to 10 percent of the dataset size.
• A clustering solution should not consist of too many clusters because deployment of the
results will then be less feasible or useful.
• The number of clustering criteria should not be excessive, so that the clustering solution can
be meaningfully interpreted and used.
Clustering
In K-means clustering, “K” refers to the number of clusters and “means” refers to the cluster centroids
(i.e., the centre or average of all the observations within a cluster).
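To make the roles of “K” and “means” concrete, here is a minimal pure-Python sketch of the K-means loop. It is an illustration only, not the procedure SPSS Modeler runs: the customer data are hypothetical, and taking the first K points as starting centroids is an assumption made to keep the run deterministic.

```python
import math

def k_means(points, k, iterations=10):
    """Plain K-means on a list of numeric tuples; returns (centroids, labels)."""
    # Assumption: start from the first k points so the result is reproducible.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical (age, monthly spend) observations, not taken from the seminar data.
customers = [(22, 40), (25, 45), (24, 50), (58, 210), (61, 190), (60, 200)]
centroids, labels = k_means(customers, k=2)
print(centroids)
print(labels)
```

In practice the inputs would usually be standardised first, since a variable with a large scale (such as spend) otherwise dominates the distance calculation.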
Predictive Modelling
• Classification refers to the prediction of a target that is categorical in nature. Examples of classification
include predicting fraud versus non-fraud, high-risk versus low-risk or purchase versus non-purchase.
• Estimation, on the other hand, refers to the prediction of a target that is quantitative (i.e., continuous) in
nature (e.g., predicting the amount spent, duration of a telephone call or account balance).
• Predictive modelling attempts to predict a target (also called a response variable or a dependent variable) on
the basis of one or more inputs (also called predictor variables or independent variables).
• Three data mining tools are commonly used for predictive modelling, namely, regression,
neural networks and decision trees.
• There is no one best data mining tool for predictive modelling, as each of these models has its
own pros and cons.
Logistic Regression | Neural Network | Decision Tree

Data mining is used in many industries and has been proven to aid decision making.
• Telecommunication: One of the most common uses of predictive modelling in the telecommunication
industry is to predict whether a customer is likely to churn.
• Banking: Credit scoring, an application of predictive modelling, is widely used in banks to aid
decisions on whether to offer credit to their customers.
• Retail: Retail shops use predictive modelling to aid their up-selling and cross-selling efforts.
They send mailers to customers who are likely to respond to an up-selling or cross-selling campaign,
so as to achieve higher profit with lower marketing cost.
Predictive Modelling Methodology
Source: https://upload.wikimedia.org/wikipedia/commons/thumb/b/b9/CRISP-DM_Process_Diagram.png/1200px-CRISP-DM_Process_Diagram.png
Predictive Modelling: Decision Trees
• A decision tree comprises a series of decision points, where the entire set of observations or subset of these
observations is split on the basis of some splitting criteria.
• Observations within each resulting subset are similar in characteristics
• Classification Tree
Able to predict the values of a categorical target variable, given a set of inputs
• Regression Tree
Able to estimate the values of a continuous target variable, given a set of inputs
• Recursive partitioning algorithm
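The idea of splitting on “some splitting criteria” can be sketched as a search for the single best split. The code below assumes Gini impurity as the criterion (the one used by CART); the slides do not name a criterion, and CHAID, for example, relies on chi-square tests instead. The records are hypothetical. A full tree would repeat this search on each resulting subset, which is what makes the partitioning recursive.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, target):
    """Return (feature, threshold, weighted impurity) of the best binary split."""
    best = None
    features = [f for f in rows[0] if f != target]
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            left = [r[target] for r in rows if r[feature] < threshold]
            right = [r[target] for r in rows if r[feature] >= threshold]
            if not left or not right:
                continue  # a split must send observations to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best

# Hypothetical customer records (income in $'000), invented for illustration.
rows = [
    {"age": 22, "income": 30, "buyer": "No"},
    {"age": 24, "income": 120, "buyer": "No"},
    {"age": 30, "income": 45, "buyer": "Yes"},
    {"age": 41, "income": 60, "buyer": "Yes"},
    {"age": 35, "income": 150, "buyer": "Yes"},
    {"age": 52, "income": 80, "buyer": "Yes"},
]
print(best_split(rows, "buyer"))
```

On this toy data the split "age < 30 versus age >= 30" separates buyers from non-buyers completely, so its weighted impurity is zero.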
What are the rules for the buyers?
Rules for Buyers:
Node 4: Income < $100,000; Age >= 25 (75%)
Node 8: Income >= $100,000; Male; Race is Malay or Indian (85%)
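Rules read off a tree translate directly into if-then logic. The sketch below encodes the two buyer nodes; the function name and argument names are hypothetical, and the percentages are assumed to be the share of buyers in each node, which the extracted slide does not state explicitly.

```python
def buyer_rule(income, age, gender, race):
    """Return (node, assumed buyer share) if the customer matches a buyer node, else None."""
    if income < 100_000 and age >= 25:
        return ("Node 4", 0.75)
    if income >= 100_000 and gender == "Male" and race in {"Malay", "Indian"}:
        return ("Node 8", 0.85)
    return None  # the customer does not fall into either buyer node

print(buyer_rule(income=60_000, age=30, gender="Female", race="Chinese"))
print(buyer_rule(income=120_000, age=40, gender="Male", race="Indian"))
print(buyer_rule(income=120_000, age=40, gender="Female", race="Indian"))
```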
Predictive Modelling: Evaluation
• A “good” predictive model should have predictive performance that is
considered effective for practical application.
• In general, there are two aspects to consider when evaluating model
performance:
o the performance measure, and
o the model evaluation method.
Performance Measure
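The symbols are not defined in the extracted text. The definitions here are inferred from the formulas and from the worked confusion matrix example later in this seminar: p = actual No predicted No, q = actual No predicted Yes, u = actual Yes predicted No, v = actual Yes predicted Yes, and m = p + q + u + v.
\[ \text{overall accuracy} = \frac{p+v}{m} = \frac{p+v}{p+q+u+v} \]
\[ \text{error rate} = 1 - \text{accuracy} = 1 - \frac{p+v}{m} \quad \text{or} \quad \frac{q+u}{m} \]
\[ \text{accuracy for positive cases} = \frac{v}{u+v}, \qquad \text{hit rate for positive cases} = \frac{v}{q+v} \]
\[ \text{accuracy for negative cases} = \frac{p}{p+q}, \qquad \text{hit rate for negative cases} = \frac{p}{p+u} \]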
Simple Train-and-Test Evaluation Method
• Partition the data into a training set and a testing set. Construct the model
based on the training data.
• Test the model using the testing data. The testing data serves as a data set that
the model has not seen before.
• The Simple Train-and-Test approach allows analysts to infer how well the model
would perform on unseen data; that is, observations that have not been seen by
the model.
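The partitioning step can be sketched with the Python standard library alone. The 70/30 ratio, the fixed seed and the stand-in record IDs below are assumptions chosen for illustration; tools such as SPSS Modeler provide their own partitioning options.

```python
import random

def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

records = list(range(1, 1001))  # hypothetical record IDs standing in for real rows
train, test = train_test_split(records)
print(len(train), len(test))
```

Shuffling before cutting matters when the data are stored in a meaningful order (for example by date), because an unshuffled cut would make the testing data systematically different from the training data.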
Performance Measure
Actual Response | Predicted Yes | Predicted No | Total
Yes | 65 | 35 | 100
No | 30 | 70 | 100
Total | 95 | 105 | 200
Based on the confusion matrix, what is the overall accuracy rate, the overall error rate
and the hit rate for “Yes”?
Performance Measure
Actual Response | Predicted Yes | Predicted No | Total
Yes | 65 | 35 | 100
No | 30 | 70 | 100
Total | 95 | 105 | 200
• Overall accuracy rate of the predictive model is 67.5% (i.e., (65 + 70)/200)
• Overall error rate is 32.5% (i.e., 1 – (65 + 70)/200)
• Hit rate for “Yes” is 68.4% (i.e., 65/95).
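The three figures can be reproduced from the four cell counts. This is a minimal sketch; the dictionary layout (actual response first, predicted response second) is a choice made here for readability.

```python
# Keys are (actual response, predicted response); counts come from the matrix above.
confusion = {
    ("Yes", "Yes"): 65, ("Yes", "No"): 35,
    ("No", "Yes"): 30, ("No", "No"): 70,
}

total = sum(confusion.values())
accuracy = (confusion[("Yes", "Yes")] + confusion[("No", "No")]) / total
error_rate = 1 - accuracy

# Hit rate for "Yes": of all cases predicted "Yes", the share that really are "Yes".
predicted_yes = confusion[("Yes", "Yes")] + confusion[("No", "Yes")]
hit_rate_yes = confusion[("Yes", "Yes")] / predicted_yes

print(f"Overall accuracy: {accuracy:.1%}")
print(f"Overall error rate: {error_rate:.1%}")
print(f"Hit rate for Yes: {hit_rate_yes:.1%}")
```

Note that the hit rate divides by the column total (95 cases predicted “Yes”), whereas the accuracy for “Yes” cases would divide by the row total (100 actual “Yes” cases).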
Pre-Requisites and Limitations
Pre-Requisites and Limitations
The state of the data, the coordination among different departments and
the complexity of a data mining project are important determinants of the
duration of the data mining project.
The data mining team should possess the following knowledge and skills:
domain knowledge, data mining knowledge and skills, IT expertise, and
statistical and research expertise.
Pre-Requisites and Limitations
• Some patterns, trends and relationships found in data mining may not be useful.
• Data mining is well developed for modelling (i.e., prediction) but it is not as well
developed for effect assessment.
• Data mining requires a substantial investment of resources.
• Data mining requires intensive planning and technical preparation work.
• Collective knowledge and skills are needed; hence, different departments in an
organisation have to work closely together.
New York City Fights Fire with Data
New York City Fights Fire with Data
Based on the article, what do you think is
• Business issue
• Data mining objective
• Data mining technique
• Data mining application
• How the application had helped to address the business issue
• Limitations of the application
http://www.govtech.com/public-safety/New-York-City-Fights-Fire-with-Data.html
Based on the article, what do you think is
1. Business issue
• Rising volume of building inspection requirements.
2. Data mining objective
• Predict where fires might spark
3. Data mining technique
• Each data element is given a weight to calculate fire risk; similar to credit scoring; likely
technique used is regression
4. Data mining application
• Risk score engine; higher risk score buildings were inspected first.
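A weighted risk score of this kind can be sketched as a weighted sum followed by a sort. Everything specific below is hypothetical: the factor names, the weights and the three buildings are invented for illustration and are not taken from the article or from New York City's actual model.

```python
def risk_score(building, weights):
    """Weighted sum of a building's risk factors (higher means riskier)."""
    return sum(weights[name] * value for name, value in building.items() if name in weights)

# Hypothetical weights; a real engine would estimate them from data (e.g., by regression).
weights = {"age_over_50_years": 3.0, "past_violations": 2.0, "vacant": 4.0}

# Hypothetical buildings described by the same factors.
buildings = {
    "A": {"age_over_50_years": 1, "past_violations": 0, "vacant": 0},
    "B": {"age_over_50_years": 1, "past_violations": 3, "vacant": 1},
    "C": {"age_over_50_years": 0, "past_violations": 1, "vacant": 0},
}

# Inspect the highest-scoring buildings first.
inspection_order = sorted(
    buildings, key=lambda name: risk_score(buildings[name], weights), reverse=True
)
print(inspection_order)
```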
5. How the application had helped to address the business issue?
• Helped to better deploy the limited resources to inspect buildings with higher fire risk scores
before a fire sparks.
6. Limitations of the application
• Manual inputs by inspectors are subject to data quality issues
• Only applicable to New York City
Data Mining Demo – SPSS Modeler
Demo
• MailPurchase is a mail order company with a database of 1400 customers.
• For each customer, the following data for last year were captured.
Attribute | Description
Status | Whether the customer has purchased a promoted product in any of the quarterly marketing campaigns
Expend | Average monthly expenditure on the company’s products
Numpur | Average number of purchases per quarter
Age | Age of customer as at 1 January
Gender | Gender of customer
Income | Annual income of customer as at 1 January (in $’000)
Race | Race of customer
Marital | Marital status of customer as at 1 January
Member | Whether the customer is a member of the loyalty card programme
Demo
• Suppose that, to develop the next marketing campaign, MailPurchase is interested in targeting
only existing customers with a high probability of purchase.
• Hence, it is useful to classify existing customers as likely purchasers or non-purchasers.
• To construct this prediction model, MailPurchase has decided to use ‘Status’ as the target
variable and the other attributes as inputs.
• From the prediction model, MailPurchase will be able to predict the probability of purchase
and hence classify existing customers into the purchaser and non-purchaser
groups.
Demo
CHAID is chosen as the final model.
Demo
Node | First Variable | Second Variable | Third Variable | Fourth Variable | Fifth Variable | Probability
20 | Race = 1, 2 or 4 | Member = 0 | Gender = 1 | Marital = 3 | | 0.596
24 | Race = 1, 2 or 4 | Member = 0 | Gender = 2 | Age <= 25 | Income <= 92 | 0.909
10 | Race = 1, 2 or 4 | Member = 1 | Age <= 29 | | | 0.800
 | Race = 1, 2 or 4 | Member = 1 | Age > 29 and <= 43 | | |
 | Race = 1, 2 or 4 | Member = 1 | Age > 60 | | |
14 | Race = 3 | Marital = 2 | Income <= 61 | | | 0.765
15 | Race = 3 | Marital = 2 | Income > 61 and <= 186 | | | 0.970
17 | Race = 3 | Marital = 3 | Gender = 1 | | | 0.919
18 | Race = 3 | Marital = 3 | Gender = 2 | | | 0.600
suss.edu.sg
