
ASSOCIATION RULES: Business Analytics

APRIORI Lecture 5/6


LEARNING OBJECTIVES
• Explain what Association Rule Mining is
• Define Support, Confidence and Lift
• Demonstrate the Apriori Algorithm
• Define Market Basket Analysis and how it can be applied
• Conduct Market Basket Analysis in R (Not included in this review)
• Conduct Link Analysis in R (Not included in this review)
A. What is Association Rule Mining?

ASSOCIATION RULE
• Association Rule Mining (ARM) is a method for discovering
interesting relationships between variables in large databases.
• It is intended to identify strong rules using some measures of
interestingness.
• It was proposed by Agrawal et al. (1993).
• It is studied extensively by the database and data mining
community.
• It assumes all data are categorical and is not a good algorithm for
numeric data.

APPLICATION OF
ASSOCIATION RULES
• Credit Card Purchases
Items purchased on a credit card, such as rental cars and hotel rooms, provide insight
into the next product that customers are likely to purchase.

• Market Basket Analysis
Given a database of customer transactions, where each transaction is a set of items, the
goal is to find groups of items that are frequently purchased together.

• Banking
Assessing banking products used by retail customers (savings account, certificate of
deposit, investment services, car loans, and so on) to identify a customer’s need for
other products.

• Medicine
Medical patient histories can give indications of likely complications based on certain
combinations of treatments.
B. Define Support, Confidence and Lift

ASSOCIATION RULES
USING APRIORI
Apriori is an algorithm used for association rule learning over
transactional databases.
Support

• The support s of an itemset X is the percentage of transactions in
the transaction database D that contain X.

• The support of the rule X⇒Y in the transaction database D is the
support of the itemset X ∪ Y in D.
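
The two definitions of support can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the five-transaction database and the function names `support` and `rule_support` are assumptions chosen for the example.

```python
# Hypothetical five-transaction database (an assumption for illustration).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, db):
    """Fraction of transactions in db that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)


def rule_support(x, y, db):
    """Support of the rule X => Y, i.e. the support of the itemset X ∪ Y."""
    return support(set(x) | set(y), db)


print(support({"Milk", "Diaper"}, transactions))                  # 0.6
print(rule_support({"Milk", "Diaper"}, {"Beer"}, transactions))   # 0.4
```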
Confidence

• The confidence of the rule X⇒Y in the transaction database D is the
ratio of the number of transactions in D that contain X ∪ Y to the
number of transactions that contain X in D.

• Confidence is a measure of how often the consequent is true when
the antecedent is true.
E.g., the rule bread⇒milk has a confidence of 80% if 80% of the purchases that
include bread also include milk.
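
As a rough sketch of the confidence calculation, the snippet below reproduces the bread⇒milk case. The six baskets and the function name `confidence` are assumptions, constructed so that four of the five baskets containing bread also contain milk.

```python
# Hypothetical baskets (an assumption): 5 contain bread, 4 of those contain milk.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "milk", "jam"},
    {"bread", "eggs"},
    {"milk"},
]


def confidence(x, y, db):
    """Confidence of X => Y: count(X ∪ Y) / count(X)."""
    x, y = set(x), set(y)
    n_x = sum(x <= t for t in db)          # transactions containing X
    n_xy = sum((x | y) <= t for t in db)   # transactions containing X and Y
    if n_x == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return n_xy / n_x


print(confidence({"bread"}, {"milk"}, baskets))  # 0.8
```

Note that confidence is not symmetric: in the same hypothetical baskets, milk⇒bread has a different value because five baskets contain milk and four of them contain bread only by coincidence of the data chosen here.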
Lift

• Lift indicates how much better the rule predicts the consequent than
guessing would, that is, than assuming the items are unrelated.

• The lift is calculated as:
lift(A⇒B) = confidence(A⇒B) / support(B)
          = support(A ∪ B) / (support(A) × support(B))

• Lift is similar to correlation:
−If items A and B are independent, the lift has a value of one.
−A lift greater than one or less than one suggests that A and B are dependent.
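
Lift is commonly computed as support(A ∪ B) divided by support(A) × support(B). The sketch below applies that ratio to a small database; the transactions and the function names are assumptions made for illustration only.

```python
# Hypothetical five-transaction database (an assumption for illustration).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, db):
    """Fraction of transactions in db that contain every item in itemset."""
    return sum(set(itemset) <= t for t in db) / len(db)


def lift(a, b, db):
    """Lift of A => B: support(A ∪ B) / (support(A) * support(B))."""
    a, b = set(a), set(b)
    return support(a | b, db) / (support(a, db) * support(b, db))


# Above one: the items appear together more often than independence predicts.
print(round(lift({"Milk", "Diaper"}, {"Beer"}, transactions), 2))  # 1.11
# Below one: the items appear together less often than independence predicts.
print(round(lift({"Bread"}, {"Beer"}, transactions), 2))           # 0.83
```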
• Lift greater than one indicates attraction of the items. These items
are positively correlated.
For example, a rule X⇒Y with a lift of 1.31 suggests that when X is purchased, Y is
31% more likely to be purchased than it is across all transactions.

• Lift less than one indicates repulsion of the items. These items are
negatively correlated.
For example, a rule X⇒Y with a lift of 0.25 is better restated as the rule
X⇒NOT Y, with the lift recalculated for the restated rule.
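
One way to recalculate the lift for the restated rule is to work from transaction counts. The counts below are hypothetical, picked so that X⇒Y has a lift of 0.25; the function names are likewise assumptions.

```python
def lift_from_counts(n, n_x, n_y, n_xy):
    """Lift of X => Y from counts: (n_xy / n) / ((n_x / n) * (n_y / n))."""
    return (n_xy * n) / (n_x * n_y)


def lift_of_negated_rule(n, n_x, n_y, n_xy):
    """Lift of X => NOT Y.

    Transactions without Y number n - n_y; transactions with X but
    without Y number n_x - n_xy.
    """
    return lift_from_counts(n, n_x, n - n_y, n_x - n_xy)


# Hypothetical counts: 100 transactions, 40 with X, 40 with Y, 4 with both.
n, n_x, n_y, n_xy = 100, 40, 40, 4

print(lift_from_counts(n, n_x, n_y, n_xy))      # 0.25
print(lift_of_negated_rule(n, n_x, n_y, n_xy))  # 1.5
```

With these counts the restated rule has a lift above one, so it describes the same repulsion as a positive association between X and the absence of Y.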
C. Demonstrate the Apriori Algorithm

APRIORI
Steps

1. Find all itemsets that have high support.
   These are known as frequent itemsets. If an itemset satisfies minimum
   support, then it is a frequent itemset.

2. Generate association rules from frequent itemsets.

3. Calculate the lift to accept or reject the rule and determine the type
   of relationship among the items.
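
The three steps can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the transactions, the thresholds (minimum support 0.4, minimum confidence 0.6), the "keep if lift > 1" decision, and all function names are choices made for the example.

```python
from itertools import combinations

# Hypothetical five-transaction database (an assumption for illustration).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, db):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in db) / len(db)


def frequent_itemsets(db, min_support):
    """Step 1: level-wise search for itemsets that meet min_support."""
    items = sorted({item for t in db for item in t})
    level = [frozenset([i]) for i in items]
    level = [s for s in level if support(s, db) >= min_support]
    frequent = {}
    k = 1
    while level:
        frequent.update((s, support(s, db)) for s in level)
        known = set(level)
        # Join frequent k-itemsets into candidate (k+1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune candidates that have an infrequent k-subset.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in known for sub in combinations(c, k))}
        level = [c for c in candidates if support(c, db) >= min_support]
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Steps 2 and 3: build rules X => Y with their confidence and lift."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                x = frozenset(antecedent)
                y = itemset - x
                conf = s / frequent[x]
                if conf >= min_confidence:
                    rules.append((x, y, s, conf, conf / frequent[y]))
    return rules


frequent = frequent_itemsets(transactions, min_support=0.4)
rules = generate_rules(frequent, min_confidence=0.6)
for x, y, s, conf, lift in rules:
    verdict = "keep" if lift > 1 else "reject"  # assumed decision rule
    print(f"{sorted(x)} => {sorted(y)}  s={s:.2f} c={conf:.2f} "
          f"lift={lift:.2f}  {verdict}")
```

The pruning step relies on the fact that every subset of a frequent itemset must itself be frequent, which is what lets the search skip most candidate itemsets.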

DEFINITION: FREQUENT ITEMSET
• Itemset
−A collection of one or more items
−Example: {Milk, Bread, Diaper}
• Support count (σ)
−Frequency of occurrence of an itemset
−E.g. σ({Milk, Bread, Diaper}) = 2
• Support
−Fraction of transactions that contain an itemset
−E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
−An itemset whose support is greater than or equal to a minsup threshold

EXAMPLE
• Association Rule
−An implication expression of the form X⇒Y, where X and Y are itemsets
−Example: {Milk, Diaper} ⇒ {Beer}
• Rule Evaluation Metrics
−Support (s): fraction of transactions that contain both X and Y
−Confidence (c): measures how often items in Y appear in transactions
that contain X

MINING ASSOCIATION RULES
Examples of Rules:
{Milk, Diaper} ⇒ {Beer} (s=0.4, c=0.67)
{Milk, Beer} ⇒ {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} ⇒ {Milk} (s=0.4, c=0.67)

Observations:
• All the above rules are binary partitions of the same itemset:
{Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support,
but can have different confidence
• Thus, we may decouple the support and confidence requirements
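
The observations can be checked by enumerating every binary partition of {Milk, Diaper, Beer}. The transaction table that accompanied the slide is not reproduced in the text, so the database below is an assumption, chosen to be consistent with the support and confidence values quoted above.

```python
from itertools import combinations

# Assumed database, consistent with the quoted values (s=0.4, c=0.67, c=1.0).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(set(itemset) <= t for t in transactions)


itemset = {"Milk", "Diaper", "Beer"}
n = len(transactions)
results = {}

# Every non-empty proper subset X gives one rule X => (itemset - X).
for r in range(1, len(itemset)):
    for antecedent in combinations(sorted(itemset), r):
        x = set(antecedent)
        y = itemset - x
        s = count(itemset) / n          # same numerator for every rule
        c = count(itemset) / count(x)   # denominator depends on X
        results[(frozenset(x), frozenset(y))] = (s, c)
        print(f"{sorted(x)} => {sorted(y)}  s={s:.2f}  c={c:.2f}")
```

Because the support is fixed by the itemset alone, it can be checked once during frequent-itemset mining, and only the confidence needs to be evaluated per rule.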

ASSOCIATION RULE TYPES


• Actionable Rules
−Contain high‐quality, actionable information
• Trivial Rules
−Information already well‐known by those familiar with the business
• Inexplicable Rules
−No explanation and do not suggest action
• Trivial and Inexplicable Rules occur most often

EXAMPLES OF RULES
• Beer and Diapers: What did managers learn? That beer and diapers are
often bought by men on Thursdays and Saturdays. Why?
• Data → Information → Knowledge (Insight)
• Wal‐Mart customers who purchase Barbie dolls have a 60% likelihood
of also purchasing one of three types of candy bars [Forbes, Sept 8,
1997]
• Customers who purchase maintenance agreements are very likely to
purchase large appliances (Linoff and Berry experience)
• When a new hardware store opens, one of the most commonly sold
items is toilet bowl cleaners (Linoff and Berry experience)
D. Define Market Basket Analysis and how it can be applied

WHAT IS
MARKET BASKET ANALYSIS?
• Modelling technique based on the premise that if you buy a certain
group of items, you are more (or less) likely to buy another group of
items.
• For example, Jamaicans are likely to buy both spice bun and cheese
for Easter. We can represent this as a rule:
If {Spice bun} Then {Cheese}
• A market basket analysis can also be conducted on items that were
not purchased at the same time.

EXAMPLES OF
HOW TO ACTION RESULTS
Marketing and Sales Promotion:
• Let the rule discovered be
{Bagels, … } ⇒ {Potato Chips}
• Potato Chips as consequent: can be used to determine what
should be done to boost its sales.
• Bagels in the antecedent: can be used to see which products
would be affected if the store discontinues selling bagels.
• Bagels in antecedent and Potato Chips in consequent: can be
used to see what products should be sold with Bagels to promote
the sale of Potato Chips.
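
The three uses above amount to filtering a mined rule set by where an item appears. The sketch below shows one possible way to do that; the rule list is entirely hypothetical (every item other than Bagels and Potato Chips is invented for illustration), and the helper names are assumptions.

```python
# Hypothetical mined rules as (antecedent, consequent) pairs.
# Items other than Bagels and Potato Chips are invented for illustration.
rules = [
    (frozenset({"Bagels", "Cream Cheese"}), frozenset({"Potato Chips"})),
    (frozenset({"Soda"}), frozenset({"Potato Chips"})),
    (frozenset({"Bagels"}), frozenset({"Orange Juice"})),
]


def with_consequent(rules, item):
    """Rules that predict item: what drives its sales?"""
    return [r for r in rules if item in r[1]]


def with_antecedent(rules, item):
    """Rules that depend on item: what is affected if it is discontinued?"""
    return [r for r in rules if item in r[0]]


drivers = with_consequent(rules, "Potato Chips")
dependents = with_antecedent(rules, "Bagels")
# Rules with Bagels on the left and Potato Chips on the right.
bundles = [r for r in drivers if "Bagels" in r[0]]

for x, y in bundles:
    print(f"{sorted(x)} => {sorted(y)}")
```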
QUESTIONS?
