Association Rule

Yudi Agusta, PhD
Data Warehouse and Data Mining, Lecture 12
Copyright © Yudi Agusta, PhD, 2006

Lecture Structure
- Definition and Benefit
- Method
- Apriori Algorithm
- Support for an Item Set
- Confidence for an Association Rule
- Support for an Association Rule
- Extended Rule Selection

Definition
- Market Basket Analysis
- Find regularities in the shopping behavior of customers of supermarkets, mail-order companies and the like
- Try to find sets of products that are frequently bought together
- The information is expressed in the form of rules

Benefit
- The rules can be used to increase the number of items sold, for instance by arranging the products appropriately on the shelves of a supermarket (they may be placed adjacent to each other)
- The rules can be used to plan promotions
- The rules can be used to ensure the availability of products on the shelves

Association Rule
- Example: If a customer buys wine and bread, he or she often buys cheese, too
- An association rule expresses an association between sets of items/products
- Can be applied to the items/products of:
  - a supermarket
  - a mail-order company
  - the special equipment options of a car
  - the optional services offered by telecommunication companies
  - etc.

Association Rule
- The goal is not merely to find rules, but to find rules that are “good”, “expressive” and “reliable”
- Standard measures of a good rule: the support and the confidence of the rule
- Main problem: there are very many possible rules
  - Example: products in a supermarket
- An efficient algorithm is needed to search through the possible rules
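
To give a feeling for how many rules are possible, the sketch below counts them. It assumes that a rule consists of a non-empty antecedent and a non-empty consequent that share no item; the helper name `count_rules` is hypothetical and not part of the lecture.

```python
def count_rules(n_items: int) -> int:
    """Count rules 'antecedent THEN consequent' over n_items distinct items.

    Assumption: antecedent and consequent are non-empty and disjoint.
    """
    # Each item is in the antecedent, in the consequent, or in neither: 3**n.
    # Remove the assignments with an empty antecedent (2**n) and those with
    # an empty consequent (2**n); the all-empty case was removed twice.
    return 3 ** n_items - 2 ** (n_items + 1) + 1


for n in (3, 10, 100):
    print(n, count_rules(n))
```

Three items already give 12 rules and ten items give 57002, so inspecting every rule for a supermarket assortment is not feasible.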

Apriori Algorithm
- Developed by Agrawal et al. in 1993
- Restricts the search space
- Checks only a subset of all rules
- Tries not to miss important rules
- Some terms:
  - Support
  - Confidence
  - Item Set: a group of products such as {bread, wine, cheese}
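
The way the search space is restricted can be sketched as a level-wise search for frequent item sets: an item set is only considered if all of its smaller subsets already reached the minimum support. This is a minimal sketch, not the original implementation; the function name `frequent_item_sets`, the sample baskets and the 60% threshold are assumptions made for illustration.

```python
from itertools import combinations


def frequent_item_sets(transactions, min_support):
    """Return {item set: support in %} for all item sets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(item_set):
        return 100.0 * sum(item_set <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Candidates of size k are unions of two frequent sets of size k - 1 ...
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # ... and are kept only if every (k - 1)-subset is frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical baskets, for illustration only.
baskets = [
    {"bread", "wine", "cheese"},
    {"bread", "wine"},
    {"bread", "cheese"},
    {"wine", "cheese", "bread"},
    {"milk"},
]
result = frequent_item_sets(baskets, min_support=60)
for item_set, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(item_set), s)
```

In this sample {wine, cheese} stays below 60%, so {bread, wine, cheese} is discarded without counting its support.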

Support of an Item Set
- Let T be the set of all transactions under consideration
  - Example: the set of all baskets of products bought by the customers of a supermarket
- The support of an item set S is the percentage of all transactions in T that contain S, given by the formula:
  - Support(S) = (|U|/|T|)*100%
  - U is the set of all transactions that contain all items in S
  - |U| and |T| are the numbers of elements in U and T
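
The support formula can be sketched in a few lines of Python. The function name `support` and the four sample baskets are assumptions made for illustration.

```python
def support(item_set, transactions):
    """Support(S) = |U| / |T| * 100%, where U holds the transactions that contain S."""
    s = set(item_set)
    u = [t for t in transactions if s <= set(t)]
    return 100.0 * len(u) / len(transactions)


# Hypothetical set T of four baskets.
T = [
    {"milk", "bread", "wine", "cheese"},
    {"bread", "wine", "cheese"},
    {"bread", "potatoes"},
    {"wine", "onions"},
]
print(support({"bread", "wine", "cheese"}, T))  # 2 of the 4 baskets contain S
```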

Example
- A customer buys the set X = {milk, bread, apples, wine, sausages, cheese, onions, potatoes}
- Item set S = {bread, wine, cheese}
- S is obviously a subset of X, hence X is in U
- If there are 318 customers and 242 of them buy such a set X or a similar one that contains S, then Support(S) = 76.1%
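
The numbers of this example can be checked directly. The variable names below are chosen for illustration only.

```python
X = {"milk", "bread", "apples", "wine", "sausages", "cheese", "onions", "potatoes"}
S = {"bread", "wine", "cheese"}

contains_s = S <= X          # every item of S is in the basket X
support_s = 242 / 318 * 100  # 242 of the 318 baskets contain S
print(contains_s, round(support_s, 1))
```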

Confidence of An Association Rule
- A measure used by Agrawal et al. (1993) to evaluate association rules
- Intuitively, it is the number of cases in which the rule is correct relative to the number of cases in which it is applicable
- The confidence of a rule R = “A AND B THEN C” is the support of the set of all items that appear in the rule divided by the support of the antecedent of the rule
  - Confidence(R) = (Support({A,B,C})/Support({A,B}))*100%
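
A minimal sketch of the confidence computation follows. It counts applicable and correct cases directly, which gives the same value as dividing the two supports because |T| cancels. The function name `confidence` and the sample baskets are assumptions.

```python
def confidence(antecedent, consequent, transactions):
    """Confidence of 'antecedent THEN consequent' in percent."""
    a = set(antecedent)
    all_items = a | set(consequent)
    n_applicable = sum(a <= set(t) for t in transactions)     # antecedent present
    n_correct = sum(all_items <= set(t) for t in transactions)  # consequent present too
    if n_applicable == 0:
        raise ValueError("the rule is never applicable")
    return 100.0 * n_correct / n_applicable


# Hypothetical baskets: the rule is applicable in 4 of them and correct in 3.
T = [
    {"bread", "wine", "cheese"},
    {"bread", "wine", "cheese", "milk"},
    {"bread", "wine", "cheese", "onions"},
    {"bread", "wine"},
    {"bread", "cheese"},
    {"milk"},
]
print(confidence({"wine", "bread"}, {"cheese"}, T))
```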

Example
- Let R = “wine AND bread THEN cheese”
- If a customer buys wine and bread, then the rule is applicable and it says that he/she can be expected to buy cheese.
- If he/she doesn’t buy wine or doesn’t buy bread or buys neither, then the rule is not applicable and it doesn’t say anything about the customer.
- When the rule is applicable, the customer may or may not buy cheese, and thus the rule may or may not be correct.

Rule Confidence
- Sets a threshold on how good a rule must be at predicting an event.
- In the last example, the rule confidence is the number of correct predictions divided by the number of all predictions, expressed as a percentage.
- In some programs, the minimum rule confidence is set to 80% (a rule must reach it to count as a good rule)

Support of An Association Rule
- Example: “A AND B THEN C”
- According to Agrawal et al. (1993)
  - It is the support of the set of all items that appear in the rule.
  - The support of the association rule is the support of {A,B,C}
- According to Borgelt (2002)
  - It is the support of the antecedent of the rule.
  - The support of the association rule is the support of {A,B}
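
The two ways of measuring the support of a rule can be put side by side in code. This is a sketch only: the function names are hypothetical and the four transactions are made up for illustration.

```python
def support(item_set, transactions):
    """Percentage of transactions that contain every item of item_set."""
    s = set(item_set)
    return 100.0 * sum(s <= set(t) for t in transactions) / len(transactions)


def rule_support_antecedent(antecedent, consequent, transactions):
    """Support of the antecedent only, i.e. how often the rule is applicable."""
    return support(antecedent, transactions)


def rule_support_all_items(antecedent, consequent, transactions):
    """Support of all items in the rule, i.e. how often the rule is correct."""
    return support(set(antecedent) | set(consequent), transactions)


# Hypothetical transactions for the rule "A AND B THEN C".
T = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
print(rule_support_antecedent({"A", "B"}, {"C"}, T))  # support of {A,B}
print(rule_support_all_items({"A", "B"}, {"C"}, T))   # support of {A,B,C}
```

The two values differ whenever the rule is applicable but not correct, so a minimum support threshold selects different rules depending on the definition in use.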

Extended Rule Selection
- Good rules are not always interesting.
- Example 1:
  - 60% of all customers buy bread (THEN bread)
  - 62% of the customers who buy cheese also buy bread (cheese THEN bread)
  - Interesting?
- Example 2:
  - 60% of all customers buy bread (THEN bread)
  - 20% of the customers who buy cheese also buy bread (cheese THEN bread)
  - Interesting?

Extended Rule Selection
- An interesting rule must have a confidence that differs significantly from its prior confidence, i.e. the confidence of the rule with the same consequent but without antecedent.
- Comparing a good rule with the rules for the same consequent that have a shorter antecedent, including the rule without antecedent, is therefore important.
- This comparison can be performed using:
  - Absolute Confidence Difference to Prior
  - Difference of Confidence Quotient to 1
  - Absolute Difference of Improvement Value to 1
  - Information Difference to Prior
  - Normalised Chi2 Measure

Absolute Confidence Difference to Prior
- Absolute value of the difference between the confidence of a rule and its prior confidence
- Example:
  - Confidence of (THEN bread) = 60%
  - Confidence of (cheese THEN bread) = 62%
  - Absolute Confidence Difference = 2%
- A threshold on the difference to the prior confidence can be set, for example 20%
- An ‘interesting rule’ must then have a confidence less than (60-20)% = 40% or greater than (60+20)% = 80%
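
As a sketch, the measure and the threshold test look as follows. Confidences are given in percent; the function names are assumptions, and "interesting" is read as a difference strictly larger than the threshold, i.e. a confidence below 40% or above 80% in this example.

```python
def abs_confidence_difference(rule_confidence, prior_confidence):
    """Absolute difference between rule confidence and prior confidence (in %)."""
    return abs(rule_confidence - prior_confidence)


def is_interesting(rule_confidence, prior_confidence, threshold):
    """True if the confidence deviates from the prior by more than the threshold."""
    return abs_confidence_difference(rule_confidence, prior_confidence) > threshold


print(abs_confidence_difference(62, 60))  # the bread/cheese example
print(is_interesting(62, 60, 20))         # deviation of 2 is too small
print(is_interesting(20, 60, 20))         # deviation of 40 exceeds the threshold
```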

Difference of Confidence Quotient to 1
- Quotient: the smaller of the rule confidence and the prior confidence divided by the larger; the measure is 1 minus this quotient
- For the last example, if the threshold is set to 20%, then an interesting rule has:
  - a confidence less than (1-20%)*60% = 0.8*60% = 48%, or
  - a confidence greater than 60%/(1-20%) = 60%/0.8 = 75%
- Unlike the previous measure, the required deviation depends on the prior confidence
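
A minimal sketch of this measure, assuming the quotient is formed as the smaller of the two confidences divided by the larger (which reproduces the 48% and 75% limits of the example). The function name is hypothetical and both confidences are assumed to be greater than zero.

```python
def confidence_quotient_difference(rule_confidence, prior_confidence):
    """1 - (smaller confidence / larger confidence); assumes both are > 0."""
    smaller, larger = sorted((rule_confidence, prior_confidence))
    return 1 - smaller / larger


for conf in (48, 62, 75):
    print(conf, round(confidence_quotient_difference(conf, 60), 3))
```

With a prior confidence of 60%, the confidences 48% and 75% both sit exactly on the 20% threshold, while 62% is far below it.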

Absolute Difference of Improvement Value to 1
- Improvement value: the rule confidence divided by the prior confidence
- For the last example, if the threshold is set to 20%, then an interesting rule has:
  - a confidence less than (1-20%)*60% = 0.8*60% = 48%, or
  - a confidence greater than (1+20%)*60% = 1.2*60% = 72%
- The absolute difference of the improvement value to 1 can be greater than 100%
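
A minimal sketch, assuming the improvement value is the rule confidence divided by the prior confidence (which reproduces the 48% and 72% limits above). The function name is hypothetical and the prior confidence is assumed to be greater than zero.

```python
def improvement_difference(rule_confidence, prior_confidence):
    """|rule confidence / prior confidence - 1|; assumes prior_confidence > 0."""
    return abs(rule_confidence / prior_confidence - 1)


for conf in (48, 62, 72):
    print(conf, round(improvement_difference(conf, 60), 3))

# With a low prior confidence the value can exceed 1, i.e. 100%.
print(improvement_difference(50, 10))
```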

Information Difference to Prior
- It is the information gain criterion used in decision tree learners such as C4.5 to select the split attributes
- The idea: without any further information about other items in the set, we have a certain probability distribution for, say, “bread” and “no bread”
- From the last example:
  - H = -P(bread) log2 P(bread) - P(no bread) log2 P(no bread)
  - P(bread): the support of “bread” = the prior confidence of “bread”
- The entropy of a probability distribution is a lower bound on the average number of yes-no questions you have to ask in order to determine the actual value
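
The entropy formula can be evaluated for the running example with P(bread) = 60%. The function name `entropy` is an assumption; terms with probability 0 are skipped because they contribute nothing.

```python
from math import log2


def entropy(p):
    """Entropy in bits of a yes/no distribution with P(yes) = p."""
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


print(round(entropy(0.6), 3))  # bread vs. no bread with P(bread) = 0.6
print(entropy(0.5))            # a 50/50 distribution needs one full question
```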

Information Difference to Prior
- From the last example: we also need to calculate the entropy of P(*|cheese) and P(*|no cheese)
  - P(*|cheese): the probabilities of buying and of not buying * among customers who buy cheese
  - P(*|no cheese): the probabilities of buying and of not buying * among customers who do not buy cheese
- The difference between the entropy of the prior distribution and the expected entropy of the posterior distributions is the information difference to prior
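
The whole computation can be sketched as below. The lecture does not state how many customers buy cheese, so P(cheese) = 0.5 is a made-up value, and the probabilities for customers without cheese are chosen so that P(bread) stays at 60%. The function names are hypothetical.

```python
from math import log2


def entropy(p):
    """Entropy in bits of a yes/no distribution with P(yes) = p."""
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


def information_difference(p_cheese, p_bread_given_cheese, p_bread_given_no_cheese):
    """Prior entropy of bread minus the expected posterior entropy given cheese."""
    # The prior P(bread) follows from the two conditional distributions.
    p_bread = (p_cheese * p_bread_given_cheese
               + (1 - p_cheese) * p_bread_given_no_cheese)
    expected_posterior = (p_cheese * entropy(p_bread_given_cheese)
                          + (1 - p_cheese) * entropy(p_bread_given_no_cheese))
    return entropy(p_bread) - expected_posterior


# Assumed P(cheese) = 0.5; in both cases P(bread) works out to 0.6.
small = information_difference(0.5, 0.62, 0.58)  # cheese THEN bread at 62%
large = information_difference(0.5, 0.20, 1.00)  # cheese THEN bread at 20%
print(round(small, 4), round(large, 4))
```

Under these assumed numbers the 62% rule changes almost nothing about what is known about bread, while the 20% rule removes a large part of the uncertainty.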

Normalised Chi2 Measure
- Measures how strongly two variables depend on each other
- It does so by measuring the difference between the distribution of two discrete variables under assumed independence and their actual joint distribution
- Takes values between 0 (no dependence) and 1 (very strong dependence)
- A threshold is set, and rules with a higher Chi2 measure are chosen as interesting rules
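
For two yes/no variables (for example "buys cheese" and "buys bread") the measure can be sketched from a 2x2 table of counts. The sketch assumes that "normalised" means the Chi2 statistic divided by the number of transactions, which gives the 0 to 1 range for a 2x2 table; the function and parameter names are hypothetical.

```python
def normalised_chi2(n_both, n_x_only, n_y_only, n_neither):
    """Chi2 / N for a 2x2 table of counts of two yes/no variables X and Y."""
    row_x, row_not_x = n_both + n_x_only, n_y_only + n_neither
    col_y, col_not_y = n_both + n_y_only, n_x_only + n_neither
    denominator = row_x * row_not_x * col_y * col_not_y
    if denominator == 0:
        return 0.0  # a variable that never varies shows no dependence
    return (n_both * n_neither - n_x_only * n_y_only) ** 2 / denominator


print(normalised_chi2(30, 20, 30, 20))  # Y equally likely with and without X
print(normalised_chi2(50, 0, 0, 50))    # X and Y always occur together
```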

Discussion Topics
- What obstacles are often encountered when applying association rules?
- Several ways of scoring rules in order to select the interesting ones were explained today. Why is this scoring so important?
- Discuss one of these methods in detail.
