
DATA WAREHOUSING & DATA MINING

Lecture 17
Association Rule
Association rule learning
◦ It is a family of algorithms for finding associations between items.
◦ Association rule learning is a rule-based machine learning method for discovering
interesting relations between variables in large databases. ... It identifies frequent if-then
associations, called association rules, each of which consists of an antecedent (if) and a
consequent (then).
◦ (antecedent) A → B (consequent)
◦ A relationship between two items, A and B, is known as single cardinality.
◦ If a customer buys A, B, and C, the cardinality increases accordingly.
Apriori algorithm
◦ The Apriori algorithm is used to find association rules between objects, that is, how two
or more objects are related to one another.
◦ Apriori is a frequent pattern mining algorithm: it first finds the itemsets that occur
frequently, then derives rules from them.

Market Basket Analysis


◦ It is one of the key techniques used by large retailers to uncover associations between items.
E.g. a retailer is offering a discount on eggs, so a customer who buys bread and butter
should be encouraged to buy eggs as well.
Apriori algorithm

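Support(A) = (number of transactions containing A) / (total number of transactions)

Confidence(A → B) = Support(A, B) / Support(A)

Lift(A → B) = Support(A, B) / (Support(A) × Support(B))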
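The three measures can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function names and the toy `baskets` data are assumptions made for the example, and exact fractions are used so that later comparisons against a threshold are not affected by floating-point rounding.

```python
from fractions import Fraction

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent, transactions):
    """Support(A, B) / Support(A) for the rule A -> B."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence(A -> B) / Support(B), i.e. Support(A, B) / (Support(A) * Support(B))."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

# Hypothetical toy data, used only to exercise the functions.
baskets = [frozenset(t) for t in ({"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"})]

print(support({"A", "B"}, baskets))       # 1/2
print(confidence({"A"}, {"B"}, baskets))  # 2/3
print(lift({"A"}, {"B"}, baskets))        # 8/9
```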
Example
◦ For the following transaction dataset, generate rules using the Apriori algorithm. Use a
minimum support of 50% and a minimum confidence of 75%.

Transaction ID    Items purchased
1                 Bread, Cheese, Egg, Juice
2                 Bread, Cheese, Juice
3                 Bread, Milk, Yogurt
4                 Bread, Juice, Milk
5                 Cheese, Juice, Milk
Create the frequent 1-item sets and compute their support

Item      Frequency    Support
Bread     4            4/5 = 80%
Cheese    3            3/5 = 60%
Egg       1            1/5 = 20%
Juice     4            4/5 = 80%
Milk      3            3/5 = 60%
Yogurt    1            1/5 = 20%

We keep only the items whose support is at least 50%, so Egg and Yogurt are discarded.
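The counting step for single items can be reproduced with a short sketch. The variable names are assumptions made for illustration, and the threshold is treated as inclusive (no item in this dataset sits at exactly 50%, so the choice does not change the result).

```python
from collections import Counter

# The five transactions from the lecture example.
transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yogurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
MIN_SUPPORT = 0.5  # 50%, treated as inclusive (assumption)

# Count how many transactions contain each item.
item_counts = Counter(item for t in transactions for item in t)
n = len(transactions)
item_support = {item: count / n for item, count in item_counts.items()}

# Keep only the items that reach the minimum support.
frequent_items = sorted(i for i, s in item_support.items() if s >= MIN_SUPPORT)
print(frequent_items)  # ['Bread', 'Cheese', 'Juice', 'Milk']
```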
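Form the 2-item candidate set and write the frequency of each pair

Items              Frequency    Support
(Bread, Cheese)    2            2/5 = 40%
(Bread, Juice)     3            3/5 = 60%
(Bread, Milk)      2            2/5 = 40%
(Cheese, Juice)    3            3/5 = 60%
(Cheese, Milk)     1            1/5 = 20%
(Juice, Milk)      2            2/5 = 40%

Only (Bread, Juice) and (Cheese, Juice) reach the 50% minimum support; the other pairs are
discarded.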
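The pair-counting step can be sketched as follows. The names are illustrative assumptions; the list of frequent single items (Bread, Cheese, Juice, Milk) is taken from the previous step of the example.

```python
from itertools import combinations

# The five transactions from the lecture example.
transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yogurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
MIN_SUPPORT = 0.5  # 50%, treated as inclusive (assumption)

# Frequent single items carried over from the 1-item step.
frequent_items = ["Bread", "Cheese", "Juice", "Milk"]

# Build every 2-item candidate and measure its support.
n = len(transactions)
pair_support = {}
for pair in combinations(frequent_items, 2):
    count = sum(1 for t in transactions if set(pair) <= t)
    pair_support[pair] = count / n

# Keep only the pairs that reach the minimum support.
frequent_pairs = [p for p, s in pair_support.items() if s >= MIN_SUPPORT]
print(frequent_pairs)  # [('Bread', 'Juice'), ('Cheese', 'Juice')]
```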
Form the 3-item candidate set and write its frequency
(Bread, Cheese, Juice) is the only candidate triple. It is not kept because its support is 2/5 = 40%, which is less than 50%.
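A minimal sketch of the 3-item step is shown below. Besides counting the triple directly, it also applies the standard Apriori pruning rule (every subset of a frequent itemset must itself be frequent), which the slide does not spell out; the variable names are assumptions made for illustration.

```python
from itertools import combinations

# The five transactions from the lecture example.
transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yogurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
MIN_SUPPORT = 0.5  # 50%, treated as inclusive (assumption)

# Frequent pairs carried over from the 2-item step.
frequent_pairs = {frozenset({"Bread", "Juice"}), frozenset({"Cheese", "Juice"})}

# Join step: merge pairs whose union has exactly three items.
candidates = {a | b for a in frequent_pairs for b in frequent_pairs if len(a | b) == 3}

# Prune step: a triple survives only if all of its 2-item subsets are frequent.
pruned = {
    c for c in candidates
    if all(frozenset(s) in frequent_pairs for s in combinations(c, 2))
}

# Direct count, to confirm the support quoted in the example.
n = len(transactions)
triple_support = {c: sum(1 for t in transactions if c <= t) / n for c in candidates}
frequent_triples = {c for c in pruned if triple_support[c] >= MIN_SUPPORT}

print(triple_support)    # the single candidate has support 0.4
print(frequent_triples)  # set()
```

Either check alone is enough to reject the triple here: (Bread, Cheese) is not a frequent pair, and the triple itself appears in only 2 of the 5 transactions.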
Generate Rule:

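(Bread, Juice): frequency 3, support 3/5 = 60%

1. Bread → Juice 2. Juice → Bread

1. Confidence = Support(Bread, Juice) / Support(Bread) = 3/5 × 5/4 = 3/4 = 75%

2. Confidence = Support(Bread, Juice) / Support(Juice) = 3/5 × 5/4 = 3/4 = 75%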
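(Cheese, Juice): frequency 3, support 3/5 = 60%

1. Cheese → Juice 2. Juice → Cheese

1. Confidence = Support(Cheese, Juice) / Support(Cheese) = 3/5 × 5/3 = 100%

2. Confidence = Support(Cheese, Juice) / Support(Juice) = 3/5 × 5/4 = 75%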
Rule 1: Bread → Juice, confidence = 75%
Rule 2: Juice → Bread, confidence = 75%
Rule 3: Cheese → Juice, confidence = 100%
Rule 4: Juice → Cheese, confidence = 75%

All 4 rules (item associations) meet the minimum confidence of 75%, so all are accepted.
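The rule-generation step can be checked with the sketch below. The names are illustrative assumptions. Exact fractions are used on purpose: with ordinary floats, 0.6 / 0.8 can round to just under 0.75 and a rule sitting exactly on the threshold would be wrongly rejected. Lift is printed for reference only; the accept/reject decision in the example uses confidence alone.

```python
from fractions import Fraction

# The five transactions from the lecture example.
transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yogurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
MIN_CONFIDENCE = Fraction(3, 4)  # 75%

# Frequent pairs carried over from the 2-item step.
frequent_pairs = [("Bread", "Juice"), ("Cheese", "Juice")]

def support(items):
    """Exact fraction of transactions containing every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))

rules = []
for a, b in frequent_pairs:
    # Each frequent pair yields two candidate rules: a -> b and b -> a.
    for antecedent, consequent in ((a, b), (b, a)):
        conf = support({antecedent, consequent}) / support({antecedent})
        rule_lift = conf / support({consequent})
        if conf >= MIN_CONFIDENCE:
            rules.append((antecedent, consequent, conf, rule_lift))

for antecedent, consequent, conf, rule_lift in rules:
    print(f"{antecedent} -> {consequent}: confidence {conf}, lift {rule_lift}")
```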
