
Chapter 9

Market Basket Analysis and Association Rules
2
Data Mining Techniques So Far
Chapter 5 Statistics
Chapter 6 Decision Trees
Chapter 7 Neural Networks
Chapter 8 Nearest Neighbor Approaches: Memory-Based Reasoning and Collaborative Filtering
3
Questions related to Market Basket
4
What can be inferred?
I purchase diapers
I purchase a new car
I purchase OTC cough medicine
I purchase a prescription medication
I don't show up for class
5
Market Basket Analysis
Retail: each customer purchases a different set of
products, in different quantities, at different times
MBA uses this information to:
Identify who customers are (not by name)
Understand why they make certain purchases
Gain insight about its merchandise (products):
Fast and slow movers
Products which are purchased together
Products which might benefit from promotion
Take action:
Store layouts
Which products to put on specials, promote, coupons
Combined with customer loyalty card data, all of this
becomes even more valuable
6
Association Rules
DM technique most closely allied with
Market Basket Analysis
AR can be automatically generated
AR represent patterns in the data without a
specified target variable
Good example of undirected data mining
Whether patterns make sense is up to
humanoids (us!)
7
Association Rules Apply Elsewhere
Items purchased on a credit card, such as rental cars and
hotel rooms, provide insight into the next product that
customers are likely to purchase.
Optional services purchased by telecommunications
customers (call waiting, call forwarding, DSL, speed call, and
so on) help determine how to bundle these services together
to maximize revenue.
Banking services used by retail customers (money market
accounts, CDs, investment services, car loans, and so on)
identify customers likely to want other services.
Unusual combinations of insurance claims can be a sign of
fraud and can spark further investigation.
Medical patient histories can give indications of likely
complications based on certain combinations of treatments.
8
Market Basket Analysis Drill-Down
MBA is a set of techniques, Association
Rules being most common, that focus on
point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
Customers
Orders (basic purchase data or baskets or
item sets)
Items (merchandise/services purchased)
9
Typical Data Structure (Relational Database)
Lots of questions can be answered
Avg # of orders/customer
Avg # unique items/order
Avg # of items/order
For a product
What % of customers have purchased
Avg # of orders/customer that include it
Avg quantity of it purchased/order
Visualization is extremely helpful (see next slide)
[Figure: Transaction Data]
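The averages listed on this slide can be computed directly from order-line records. The sketch below is only an illustration: the `order_lines` rows, the tuple layout, and the helper `product_profile` are all made up for this example, and "orders per customer that include it" is interpreted here as orders per customer who bought the product.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity)
order_lines = [
    ("c1", "o1", "beer", 2),
    ("c1", "o1", "diaper", 1),
    ("c1", "o2", "beer", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "beer", 3),
    ("c3", "o4", "milk", 2),
]

orders_by_customer = defaultdict(set)   # customer -> set of order ids
items_by_order = defaultdict(set)       # order -> set of distinct items
units_by_order = defaultdict(int)       # order -> total quantity
for customer, order, item, qty in order_lines:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    units_by_order[order] += qty

n_customers = len(orders_by_customer)
n_orders = len(items_by_order)

avg_orders_per_customer = n_orders / n_customers
avg_unique_items_per_order = sum(len(s) for s in items_by_order.values()) / n_orders
avg_units_per_order = sum(units_by_order.values()) / n_orders


def product_profile(product):
    """Per-product measures (assumption: averages are over buyers of the product)."""
    lines = [(c, o, q) for c, o, i, q in order_lines if i == product]
    buyers = {c for c, _, _ in lines}
    orders_with = {o for _, o, _ in lines}
    units = sum(q for _, _, q in lines)
    return {
        "pct_customers": len(buyers) / n_customers,
        "avg_orders_per_buyer": len(orders_with) / len(buyers),
        "avg_quantity_per_order": units / len(orders_with),
    }


print(avg_orders_per_customer, avg_unique_items_per_order, avg_units_per_order)
print(product_profile("beer"))
```

In a real setting these would be SQL aggregates over the customer, order, and item tables; the Python version just makes the definitions explicit.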
10
Combining data
These measures give broad insight into the business.
In some cases, there are few repeat customers, so the number of
orders per customer is close to 1.
This suggests a business opportunity to increase the
number of sales per customer.
Or, the number of products per order may be close to 1,
suggesting an opportunity for cross-selling during the
process of making an order.
It can be useful to compare these measures to each
other.
11
Questions about ...
Sales Order Characteristics
Item Popularity
Tracking Marketing Interventions
Clustering Products by Usage
12
Sales Order Characteristics
Customer purchases have additional interesting characteristics.
For instance, the average order size varies by time and region
For Web purchases and mail-order transactions, additional
information may also be gathered at the point of sale:
Did the order use gift wrap?
Is the order going to the same address as the billing address?
Did the purchaser accept or decline a particular cross-sell offer?
13
Item popularity
What is the most common item found on a
one-item order?
What is the most common item found on a
multi-item order?
What is the most common item for repeat
customer purchases?
How has ordering of an item changed over
time?
How does the ordering of an item vary
geographically?
14
Tracking Marketing Interventions
Including marketing interventions along with the product sales over
time makes it possible to see the effect of the interventions.
Prior to the intervention, sales are hovering at 50 units / week.
After the intervention, they peak at 7-8 times that amount.
A challenge in answering this question is determining whether the
additional sales are incremental or are made by customers who
would purchase the product anyway at some later time.
We can also look at the number of baskets containing the item.
If the number of customers is not increasing, there is evidence that
existing customers are simply stocking up on the item at a lower cost.
15
Clustering Products by Usage
What groups of products often appear together?
Such groups of products are very useful for making recommendations
to customers: customers who have purchased some of the products
may be interested in the rest of them
A lot of information is available about products.
In addition to the product hierarchy, such information
includes the color of clothes, whether food is low calorie,
whether a poster includes a frame, and so on
Questions:
Do diet products tend to sell together?
Are customers purchasing similar colors of clothing at the same
time?
Do customers who purchase framed posters also buy other
products?
16
Pivoting for Cluster Algorithms
17
Association Rules
Wal-Mart customers who purchase Barbie dolls
have a 60% likelihood of also purchasing one of
three types of candy bars
Customers who purchase maintenance
agreements are very likely to purchase large
appliances
When a new hardware store opens, one of the
most commonly sold items is toilet bowl cleaners
So what?
18
Famous Rules: Beer & Diapers
19
Famous Rules: Beer & Diapers
WHY?
Beer drinkers do not want to interrupt their enjoyment of
televised sports, so they buy diapers to reduce trips to the
bathroom. No, that's not it.
Families with young children are preparing for the weekend.
What can a retailer do with this information?
Put the beer and diapers close together, so when one is
purchased, customers remember to buy the other one.
Put them as far apart as possible, so customers have the opportunity
to buy yet more items on the way.
Put higher-margin diapers a bit closer to the beer, although
mixing baby products and alcohol would probably be
unseemly.
20
Association Rules
Examples:
If buy {Diaper} Then buy {Beer}
If buy {Beer, Diaper} Then buy {Cheese, Chocolate}
Shoppers who buy Diaper are very likely to buy Beer.
Shoppers who buy Beer and Diaper are likely to buy Cheese and Chocolate.
For a frequent itemset {Diaper, Beer}, is Diaper
promoting the purchase of Beer, or Beer
increasing the chance of Diaper purchase?
We need directions.
21
Association Rules
Rule format:
If {set of items} Then {set of items}
LHS implies RHS *
Example: If {Diaper, Baby Food} Then {Beer, Wine}
LHS = {Diaper, Baby Food}, RHS = {Beer, Wine}
An association rule is valid if it satisfies some evaluation measures
* RHS = "Right Hand Side", LHS = "Left Hand Side"
22
Association Rules
Association rule types:
Actionable Rules: contain high-quality, actionable
information
Trivial Rules: information already well-known by
those familiar with the business
Results from market basket analysis may simply be measuring
the success of previous marketing campaigns
Inexplicable Rules: have no explanation and do not
suggest action
Trivial and Inexplicable Rules occur most often
23
Rule Evaluation
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Wine
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
...
Milk & Wine co-occur
But only 2 out of 200K transactions contain these items
24
Rule Evaluation - Support
Support:
The frequency with which the items in LHS and RHS co-occur.
Support = (No. of transactions containing items in LHS and RHS) / (Total No. of transactions in the dataset)
E.g., the support of the {Diaper} -> {Beer} rule is 3/5:
60% of the transactions contain both items.
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
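The support calculation above can be checked with a few lines of Python. This is a minimal sketch: the `transactions` list simply re-types the five-row table from this slide as sets, and `support` is a helper name chosen for the example.

```python
# The five transactions from the slide, one set of items per basket.
transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Support of the rule {Diaper} -> {Beer} is the support of {Diaper, Beer}.
print(support({"Diaper", "Beer"}, transactions))  # 0.6
```

Note that support is symmetric: {Diaper} -> {Beer} and {Beer} -> {Diaper} have the same support, which is why a second measure is needed to give the rule a direction.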
25
Rule Evaluation - Confidence
Is Beer leading to Diaper purchase or Diaper leading to Beer purchase?
Among the transactions with Diaper, 100% have Beer. P(Beer|Diaper) = 100%
Among the transactions with Beer, 75% have Diaper. P(Diaper|Beer) = 75%
Confidence = (No. of transactions containing both LHS and RHS) / (No. of transactions containing LHS)
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
Confidence for {Diaper} -> {Beer}: 3/3
When Diaper is purchased, the likelihood of Beer purchase is 100%
Confidence for {Beer} -> {Diaper}: 3/4
When Beer is purchased, the likelihood of Diaper purchase is 75%
So, {Diaper} -> {Beer} is a more important rule according to confidence.
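The two confidence values can be reproduced by counting. A minimal sketch, again re-typing the slide's table; the function name `confidence` is an assumption of this example.

```python
transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]


def confidence(lhs, rhs, transactions):
    """P(RHS | LHS): transactions with LHS and RHS over transactions with LHS."""
    lhs = set(lhs)
    both = lhs | set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_lhs


print(confidence({"Diaper"}, {"Beer"}, transactions))  # 1.0
print(confidence({"Beer"}, {"Diaper"}, transactions))  # 0.75
```

Unlike support, confidence depends on which side is the LHS, so it answers the "direction" question posed at the top of the slide.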
26
Rule Evaluation - Lift
Transaction No.  Item 1  Item 2     Item 3     Item 4
100              Beer    Diaper     Chocolate
101              Milk    Chocolate  Shampoo
102              Beer    Milk       Vodka      Chocolate
103              Beer    Milk       Diaper     Chocolate
104              Milk    Diaper     Beer
What's the support and confidence for rule {Chocolate} -> {Milk}?
Support = 3/5 Confidence = 3/4
Very high support and confidence.
Does Chocolate really lead to Milk purchase?
No! Milk occurs in 4 out of 5 transactions, so Chocolate is actually
decreasing the chance of Milk purchase
3/4 < 4/5, i.e. P(Milk|Chocolate) < P(Milk). Lift = (3/4)/(4/5) = 0.9375 < 1
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using
the negative rule (if LHS then NOT RHS) produces a better rule than guessing
27
Rule Evaluation - Lift (cont.)
Measures how much more likely the RHS is given the LHS
than the RHS on its own
Lift = confidence of the rule / probability of the RHS
i.e. = P(RHS|LHS)/P(RHS)
Example: {Diaper} -> {Beer}
Total number of customers in database: 1000
No. of customers buying Diaper: 200
No. of customers buying Beer: 50
No. of customers buying Diaper & Beer: 20
Probability of Beer = 50/1000 (5%)
Confidence = 20/200 (10%)
Lift = 10%/5% = 2
Lift higher than 1 implies people have a higher chance of buying
Beer when they buy Diaper. Lift lower than 1 implies people
have a lower chance of buying Milk when they buy Chocolate.
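Both lift examples can be verified from raw counts. The sketch below uses exact fractions to avoid floating-point rounding; `lift_from_counts` and its parameter names are chosen for this illustration only.

```python
from fractions import Fraction


def lift_from_counts(n_total, n_lhs, n_rhs, n_both):
    """Lift = P(RHS | LHS) / P(RHS), computed exactly from counts."""
    confidence = Fraction(n_both, n_lhs)
    p_rhs = Fraction(n_rhs, n_total)
    return confidence / p_rhs


# {Diaper} -> {Beer}: 1000 customers, 200 buy Diaper, 50 buy Beer, 20 buy both.
diaper_beer = lift_from_counts(1000, 200, 50, 20)

# {Chocolate} -> {Milk}: 5 transactions, 4 with Chocolate, 4 with Milk, 3 with both.
chocolate_milk = lift_from_counts(5, 4, 4, 3)

print(diaper_beer)             # 2
print(float(chocolate_milk))   # 0.9375
```

A lift of exactly 1 would mean the LHS tells us nothing about the RHS; the two examples fall on opposite sides of that line.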
28
Rule Evaluation - Practical Impact
Most methods for extracting association rules find too
many trivial rules. Most are either obvious or
uninteresting.
Example: If Maternity Ward then patient is a woman.
Confidence 100%, support 100%
Need to screen for rules that are of particular interest and
significance.
Actionable: Keep only rules that can be acted upon.
Interestingness: Various measures for how surprising or
unexpected a rule is.
Example: A rule is interesting if it contradicts what is currently known
(e.g., it contradicts a rule that was previously discovered).
29
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the
counts in the co-occurrence matrix
3. Overcoming the practical limits imposed
by thousands or tens of thousands of
unique items
30
Creating Association Rules
31
Creating Association Rules
Choosing the right set of items
Within a grocery store where there are tens of
thousands of products on the shelves, a frozen pizza
might be considered an item for analysis purposes,
regardless of its toppings (extra cheese, pepperoni, or
mushrooms), its crust (extra thick, whole wheat, or
white), or its size.
On the other hand, the manager of frozen foods or a
chain of pizza restaurants may be very interested in
the particular combinations of toppings that are
ordered.
32
Creating Association Rules
Choosing the right set of items
What level of the product hierarchy is the right one to use?
Market basket analysis produces the best results when the items occur in roughly the
same number of transactions in the data. This helps prevent rules from being dominated
by the most common items. Product hierarchies can help here. Roll up rare items to
higher levels in the hierarchy, so they become more frequent. More common items may
not have to be rolled up at all.
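Rolling rare items up the hierarchy, as described above, might look like the following. This is a sketch under stated assumptions: the `parent` mapping, the sample baskets, and the `min_count` threshold are all hypothetical, and only a single level of roll-up is shown.

```python
from collections import Counter

# Hypothetical one-level product hierarchy: item -> parent category.
parent = {
    "pepperoni frozen pizza": "frozen pizza",
    "mushroom frozen pizza": "frozen pizza",
    "extra cheese frozen pizza": "frozen pizza",
}

transactions = [
    {"milk", "pepperoni frozen pizza"},
    {"milk", "mushroom frozen pizza"},
    {"milk", "extra cheese frozen pizza"},
    {"milk", "bread"},
]


def roll_up(transactions, parent, min_count):
    """Replace items seen in fewer than `min_count` baskets by their parent.

    Items with no parent in the hierarchy are left unchanged.
    """
    counts = Counter(item for t in transactions for item in t)
    return [
        {parent.get(i, i) if counts[i] < min_count else i for i in t}
        for t in transactions
    ]


rolled = roll_up(transactions, parent, min_count=2)
print(Counter(item for t in rolled for item in t))
```

After the roll-up, "frozen pizza" appears in three baskets instead of three different pizzas appearing once each, while the common item "milk" is untouched.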
33
Creating Association Rules
Generating rules by deciphering the
counts in the co-occurrence matrix
if condition, then result.
if Barbie doll, then candy bar
= if a customer purchases a Barbie doll, then
the customer is also expected to purchase a
candy bar.
Saying that the rule "if B and C then A" has a
confidence of 0.33 is equivalent to saying that
when B and C appear in a transaction, there is a
33 percent chance that A also appears in it.
34
Creating Association Rules
Overcoming the practical limits imposed
by thousands or tens of thousands of
unique items
1. Generate co-occurrence matrix for single
items: if OJ then soda
2. Generate co-occurrence matrix for two
items: if OJ and Milk then soda
3. Generate co-occurrence matrix for three
items: if OJ and Milk and Window Cleaner
then soda
4. And so on
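The co-occurrence counts in the steps above can be sketched with `itertools.combinations` and a `Counter`. The five baskets below are made-up sample data, and `cooccurrence_counts` is a name chosen for this example; a production implementation would not enumerate every combination this way.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets for illustration.
transactions = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def cooccurrence_counts(transactions, size):
    """Count how many baskets contain each combination of `size` items."""
    counts = Counter()
    for t in transactions:
        # Sorting gives every combination one canonical tuple form.
        for combo in combinations(sorted(t), size):
            counts[combo] += 1
    return counts


singles = cooccurrence_counts(transactions, 1)
pairs = cooccurrence_counts(transactions, 2)

# Confidence of "if OJ then soda" read off the two tables.
conf_oj_soda = pairs[("OJ", "soda")] / singles[("OJ",)]
print(pairs[("OJ", "soda")], singles[("OJ",)], conf_oj_soda)
```

Calling the same function with `size=3` gives the counts needed for rules such as "if OJ and Milk then soda"; the number of combinations grows quickly, which is the practical limit this slide refers to.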
35
Algorithm to Extract Association Rules
The standard algorithm: Apriori
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining
Association Rules in Large Databases. VLDB 1994: 487-499
The Association Rules problem was defined as:
Generate all association rules that have
support greater than the user-specified minimum support
and confidence greater than the user-specified minimum
confidence
The base algorithm uses support and confidence, but we
can also use lift to rank the rules discovered by Apriori.
The algorithm performs an efficient search over the
data to find all such rules.
36
Finding Association Rules from Data
Association rules discovery problem is decomposed into
two sub-problems:
1. Find all sets of items (itemsets) whose support is above
minimum support - called frequent itemsets or large itemsets
2. From each frequent itemset, generate rules whose
confidence is above minimum confidence.
Given a large itemset Y, and X is a subset of Y
Calculate confidence of the rule X -> (Y - X)
If its confidence is above the minimum confidence, then X ->
(Y - X) is an association rule we are looking for.
37
Example
A data set with 5 transactions
Minimum support = 40%, Minimum confidence = 80%
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
Phase 1: Find all frequent itemsets
{Beer} (support = 80%),
{Diaper} (60%),
{Chocolate} (40%),
{Beer, Diaper} (60%)
Phase 2: Generate rules from the frequent itemsets
Beer -> Diaper (conf. 3/4 = 75%)
Diaper -> Beer (conf. 3/3 = 100%)
Only Diaper -> Beer meets the 80% minimum confidence.
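The two phases of this example are small enough to brute-force. The sketch below enumerates every possible itemset, which is only feasible because there are nine distinct items; it is meant to confirm the numbers on the slide, not to stand in for Apriori. All names are chosen for this illustration.

```python
from itertools import combinations

transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]
MIN_SUPPORT = 0.4
MIN_CONFIDENCE = 0.8


def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


# Phase 1: keep every itemset whose support reaches the minimum.
items = sorted(set().union(*transactions))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        s = support(frozenset(combo))
        if s >= MIN_SUPPORT:
            frequent[frozenset(combo)] = s

# Phase 2: split each frequent itemset into LHS -> RHS and test confidence.
rules = []
for itemset, s in frequent.items():
    for k in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), k):
            lhs = frozenset(lhs)
            conf = s / frequent[lhs]  # every subset of a frequent set is frequent
            if conf >= MIN_CONFIDENCE:
                rules.append((set(lhs), set(itemset - lhs), conf))

print(sorted(sorted(f) for f in frequent))
print(rules)
```

The only surviving rule is Diaper -> Beer, matching the slide.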
38
Phase 1: Finding all frequent itemsets
How to perform an efficient search of all frequent itemsets?
A naive way is to calculate the support for every possible itemset.
There are 2^N possible itemsets given N items: impossible to do!
Need a smart method: frequent itemsets of size n contain itemsets of size
n-1 that must also be frequent
Example: if {diaper, beer} is frequent then {diaper} and {beer} are each
frequent as well
This means that
If an itemset is not frequent (e.g., {wine}) then no itemset that includes
wine can be frequent either, such as {wine, beer}.
We therefore first find all itemsets of size 1 that are frequent.
Then try to expand these by counting the frequency of all itemsets of size
2 that include frequent itemsets of size 1.
Example:
If {wine} is not frequent we need not try to find out whether {wine, beer} is
frequent. But if both {wine} & {beer} were frequent then it is possible
(though not guaranteed) that {wine, beer} is also frequent.
Then take only itemsets of size 2 that are frequent, and try to expand
those, etc.
39
Phase 2: Generating Association Rules
Assume {Milk, Bread, Butter} is a frequent itemset.
Using items contained in the itemset, list all possible rules
{Milk} -> {Bread, Butter}
{Bread} -> {Milk, Butter}
{Butter} -> {Milk, Bread}
{Milk, Bread} -> {Butter}
{Milk, Butter} -> {Bread}
{Bread, Butter} -> {Milk}
Calculate the confidence of each rule
Confidence of {Milk} -> {Bread, Butter}:
= Support {Milk, Bread, Butter} / Support {Milk}
= (No. of transactions that support {Milk, Bread, Butter}) / (No. of transactions that support {Milk})
Pick the rules with confidence above the minimum confidence
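Listing the candidate rules for a frequent itemset amounts to enumerating every non-empty proper subset as the LHS. A minimal sketch; the five baskets are hypothetical data invented to give the confidences something to count, and the function names are assumptions of this example.

```python
from itertools import combinations


def candidate_rules(itemset):
    """Yield (lhs, rhs) for every split of `itemset` into two non-empty parts."""
    items = sorted(itemset)
    for k in range(1, len(items)):
        for lhs in combinations(items, k):
            rhs = tuple(i for i in items if i not in lhs)
            yield lhs, rhs


def confidence(lhs, rhs, transactions):
    lhs = set(lhs)
    both = lhs | set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_lhs


# Hypothetical baskets, only for illustration.
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread"},
    {"Milk"},
    {"Bread", "Butter"},
]

rules = list(candidate_rules({"Milk", "Bread", "Butter"}))
for lhs, rhs in rules:
    print(lhs, "->", rhs, confidence(lhs, rhs, transactions))
```

An itemset of size n yields 2^n - 2 candidate rules (6 for three items), so rule generation is cheap compared with finding the frequent itemsets in the first place.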
40
Agrawal (94)'s Apriori Algorithm - An Example
Transactions
T-ID  Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E
1st scan -> C1
Itemset  sup
{A}      2
{B}      3
{C}      3
{D}      1
{E}      3
L1
Itemset  sup
{A}      2
{B}      3
{C}      3
{E}      3
C2
Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}
2nd scan -> C2 with counts
Itemset  sup
{A, B}   1
{A, C}   2
{A, E}   1
{B, C}   2
{B, E}   3
{C, E}   2
L2
Itemset  sup
{A, C}   2
{B, C}   2
{B, E}   3
{C, E}   2
C3
Itemset
{B, C, E}
3rd scan -> L3
Itemset    sup
{B, C, E}  2
What about {A, B, C} and {A, C, E}?
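The level-wise search in this example can be sketched as follows. This is a compact illustration of the join, prune, and count steps, not the optimized algorithm from the paper. The minimum support count of 2 is an assumption inferred from the tables (itemsets with sup 1 are dropped), and the function and variable names are chosen for this example.

```python
from itertools import combinations

transactions = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MIN_COUNT = 2  # assumed: the tables keep itemsets with sup >= 2


def count(candidates):
    """One scan of the data: count the baskets containing each candidate."""
    return {c: sum(1 for t in transactions.values() if c <= t) for c in candidates}


def apriori():
    items = {i for t in transactions.values() for i in t}
    candidates = {frozenset([i]) for i in items}  # C1
    levels = []                                   # levels[k-1] is L_k
    k = 1
    while candidates:
        frequent = {c: n for c, n in count(candidates).items() if n >= MIN_COUNT}
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return levels


levels = apriori()
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", {"".join(sorted(c)): n for c, n in level.items()})
```

The prune step is what removes {A, B, C} and {A, C, E} before the third scan: each contains a pair ({A, B} and {A, E} respectively) that is not in L2, so neither can be frequent.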
41
The number of combinations with n items is proportional to the number of
items raised to the nth power - a number that gets very large, very fast.
42
Final Thought on Association Rules:
The Problem of Lots of Data
Fast Food Restaurant: could have 100 items on its menu
How many combinations are there with 3 different menu
items? 161,700!
Supermarket: 10,000 or more unique items
About 50 million 2-item combinations
More than 100 billion 3-item combinations
How to reduce data:
Use of product hierarchies (groupings)
Pruning: reducing the number of items and combinations
of items being considered at each step
Minimum support pruning requires that a rule hold on a minimum
number of transactions.
If there are one million transactions and the minimum support is 1%,
then only rules supported by at least 10,000 transactions are of interest.
Finally, know that the number of transactions in a given time-
period could also be huge (hence expensive to analyze)
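The combination counts quoted on this slide follow from the binomial coefficient, which Python exposes as `math.comb`. A quick check, with variable names chosen for this example:

```python
from math import comb

# Number of ways to pick 3 different items from a 100-item menu.
menu_triples = comb(100, 3)

# Pairs and triples for a supermarket with 10,000 unique items.
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

# Minimum support of 1% on one million transactions, as a transaction count.
min_support_count = 1_000_000 // 100

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
print(f"{min_support_count:,}")    # 10,000
```

The jump from roughly 50 million pairs to over 166 billion triples is why pruning by minimum support at every level matters in practice.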
43
Using Association Rules to
Compare Stores
EX: compare sales at store openings versus existing stores:
1. Gather data for a specific period (such as 2 weeks) from
store openings.
Augment each of the transactions in this data with a virtual item
saying that the transaction is from a store opening.
2. Gather about the same amount of data from existing stores.
Here you might use a sample across all existing stores, or you
might take all the data from stores in comparable locations.
Augment the transactions in this data with a virtual item saying
that the transaction is from an existing store.
3. Apply market basket analysis to find association rules in
each set.
4. Pay particular attention to association rules containing the
virtual items.
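The virtual-item step can be sketched as below. The item label format `STORE=...`, the sample baskets, and the helper names are all hypothetical. As a choice made for this sketch, the two tagged sets are pooled so that rules with a virtual item on the left-hand side can be compared directly.

```python
def tag(transactions, virtual_item):
    """Return copies of the transactions with a virtual item added to each."""
    return [set(t) | {virtual_item} for t in transactions]


def confidence(lhs, rhs, transactions):
    lhs = set(lhs)
    both = lhs | set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_lhs if n_lhs else 0.0


# Hypothetical baskets from store openings and from existing stores.
opening_sales = [
    {"toilet bowl cleaner", "paint"},
    {"toilet bowl cleaner", "hammer"},
    {"paint"},
]
existing_sales = [
    {"paint", "brush"},
    {"hammer", "nails"},
    {"toilet bowl cleaner"},
]

pooled = tag(opening_sales, "STORE=opening") + tag(existing_sales, "STORE=existing")

conf_opening = confidence({"STORE=opening"}, {"toilet bowl cleaner"}, pooled)
conf_existing = confidence({"STORE=existing"}, {"toilet bowl cleaner"}, pooled)
print(conf_opening, conf_existing)
```

A large gap between the two confidences for the same right-hand side is the kind of rule "containing the virtual items" that step 4 asks you to look for.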
44
Dissociation
Rules
if A and not B, then C
Dissociation rules can be generated by a simple adaptation of
the basic market basket analysis algorithm.
Downsides to including new items:
doubling the number of items seriously degrades performance
the size of a typical transaction grows because it now includes
inverted items
the frequency of the inverse items tends to be much larger than the
frequency of the original items.
So, minimum support constraints tend to produce rules in which
all items are inverted, such as if NOT A and NOT B then NOT C.
These rules are less likely to be actionable.
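The adaptation mentioned above amounts to adding an inverted item for everything a basket does not contain. A minimal sketch with hypothetical data; the `NOT ` prefix and the function names are conventions invented for this example.

```python
def add_inverted_items(transactions, items_to_invert):
    """Add a 'NOT x' item to each basket for every listed item x it lacks."""
    augmented = []
    for t in transactions:
        inverted = {f"NOT {i}" for i in items_to_invert if i not in t}
        augmented.append(set(t) | inverted)
    return augmented


def confidence(lhs, rhs, transactions):
    lhs = set(lhs)
    both = lhs | set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_lhs if n_lhs else 0.0


# Hypothetical baskets over three items.
transactions = [{"A", "C"}, {"A", "B"}, {"C"}, {"A", "C"}]
augmented = add_inverted_items(transactions, ["A", "B", "C"])

# Dissociation rule: if A and not B, then C.
conf = confidence({"A", "NOT B"}, {"C"}, augmented)
print(augmented)
print(conf)

avg_before = sum(len(t) for t in transactions) / len(transactions)
avg_after = sum(len(t) for t in augmented) / len(augmented)
print(avg_before, avg_after)
```

The averages printed at the end illustrate the downside listed on the slide: every basket now carries one entry per item in the catalog, present or inverted, so transactions grow and the inverted items dominate the counts.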
45
Sequential Analysis Using
Association Rules
Association rules find things that happen at the same time -
what items are purchased at a given time.
The next natural question concerns sequences of events
and what they mean. Examples:
New homeowners purchase shower curtains before purchasing
furniture.
Customers who purchase new lawnmowers are very likely to purchase
a new garden hose in the following 6 weeks.
When a customer goes into a bank branch and asks for an account
reconciliation, there is a good chance that he or she will close all his or
her accounts.
In order to consider time-series analyses on your customers,
there has to be some way of identifying customers. Without a
way of tracking individual customers, there is no way to
analyze their behavior over time.
46
Sequential Patterns
Instead of finding associations between items in a single
transaction, find associations between items across
related transactions over time.
Customer ID  Transaction Date  Item 1                 Item 2
AA           2/2/2001          Laptop                 Case
AA           1/13/2002         Wireless network card  Router
BB           4/5/2002          Laptop                 iPaq
BB           8/10/2002         Wireless network card  Router

Sequence: {Laptop}, {Wireless network card, Router}
A sequence has to satisfy some predetermined minimum
support
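Support for a sequence is counted per customer rather than per transaction. The sketch below is one simple way to do that for the table above; the dates are read as month/day/year, item names are normalized to a single spelling, and `sequence_support` is a name chosen for this example.

```python
from datetime import date

# The four transactions from the slide: (customer, date, items).
purchases = [
    ("AA", date(2001, 2, 2), {"Laptop", "Case"}),
    ("AA", date(2002, 1, 13), {"Wireless network card", "Router"}),
    ("BB", date(2002, 4, 5), {"Laptop", "iPaq"}),
    ("BB", date(2002, 8, 10), {"Wireless network card", "Router"}),
]


def sequence_support(sequence, purchases):
    """Fraction of customers whose dated baskets contain `sequence` in order."""
    by_customer = {}
    for customer, when, items in sorted(purchases, key=lambda p: (p[0], p[1])):
        by_customer.setdefault(customer, []).append(items)

    def contains(baskets):
        pos = 0  # index of the next sequence element still to be matched
        for basket in baskets:
            if pos < len(sequence) and sequence[pos] <= basket:
                pos += 1
        return pos == len(sequence)

    matching = sum(1 for baskets in by_customer.values() if contains(baskets))
    return matching / len(by_customer)


pattern = [{"Laptop"}, {"Wireless network card", "Router"}]
print(sequence_support(pattern, purchases))                  # 1.0
print(sequence_support(list(reversed(pattern)), purchases))  # 0.0
```

Both customers bought the laptop first and the networking gear later, so the sequence has full support, while the reversed order has none. This only works because each transaction carries a customer ID.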
47
Exercise 1 by hand
Given the list of transactions below, do the following:
1) Find all the frequent itemsets (minimum support 40%)
2) Find all the association rules (minimum confidence 70%)
3) For the discovered association rules, calculate the lift
Transaction No.  Item 1  Item 2     Item 3   Item 4
100              Beer    Diaper     Chocolate
101              Milk    Chocolate  Shampoo
102              Beer    Soap       Vodka
103              Beer    Cheese     Wine
104              Milk    Diaper     Beer     Chocolate
48
RapidMiner Practice
To see:
RapidMiner Tutorial example 2 / 26
To practice:
Do the exercise presented in the tutorial using the
file Iris.ioo.
49
Exercise 1 using RapidMiner
Take Beer.xls file and find the
association rules
First process the data to the right format
(Beer1.xls )
50
RapidMiner Practice
To see:
Training Videos\05 - Akhtar Fareed -
RapidMinerTutorial\RapidMiner Tutorial (part 9_9)
Association Rules
To practice:
Do the exercises presented in the movie using the
file BalanceScale.xls.
51
RapidMiner Practice
Data Preprocessing
Bank.xls -> Bank.ioo
Save as .ioo format
Process design
Take a look at the .ioo file and its attributes / variables
Process the attributes using Select Attributes
Association rules can only handle categorical data types
Find association rules
Use operators: FP-Growth, then Create Association Rules
Association Rules
Read and interpret the results
