Unit - III
Association Rules
Prepared by
R.Poonguzhali
Periyar Maniammai Institute of Science and Technology
04/13/2020 1
Outline
Basic Concepts
Apriori Algorithm
Basic Concepts
What is Frequent pattern mining?
Frequent pattern: a pattern (a set of items,
subsequences, substructures, etc.) that occurs
frequently in a data set
Example:
A set of items such as milk and bread that occur
frequently together in a transaction dataset is a
frequent itemset.
Example: Market Basket Analysis
• This process analyzes customer buying habits by
finding associations between the different items
that customers place in their shopping basket.
Transactional Database
Transaction ID (TID)    List of Items
T1                      {Jam, milk, bread}
T2                      {biscuit, eggs, salt, yogurt}
T3                      {Jam, eggs, bread}
Support:
Support (A => B) = P(A ∪ B)
Percentage of transactions in D that contain A ∪ B,
i.e. every item of A and every item of B
Confidence:
Confidence (A => B) = P(B | A)
Percentage of transactions in D containing A that also
contain B
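The two measures can be checked on the three-transaction database above. This is a minimal Python sketch, not part of the original slides; the helper names `support` and `confidence` and the example rule {Jam} => {bread} are chosen here for illustration only.

```python
# Toy database from the Transactional Database slide (T1..T3).
transactions = [
    {"Jam", "milk", "bread"},
    {"biscuit", "eggs", "salt", "yogurt"},
    {"Jam", "eggs", "bread"},
]

def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(antecedent, consequent, db):
    """Estimate P(consequent | antecedent) as support(A and B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, db) / support(antecedent, db)

# Illustrative rule {Jam} => {bread}:
rule_support = support({"Jam", "bread"}, transactions)          # 2 of 3 transactions
rule_confidence = confidence({"Jam"}, {"bread"}, transactions)  # 2 of 2 Jam transactions
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
# prints: support = 0.67, confidence = 1.00
```

Note that the set union in `confidence` mirrors the A ∪ B in the definition: a transaction supports the rule only if it contains all items from both sides.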
Association Rule Mining
Goal:
Find all rules that satisfy the user-specified minimum
support (minsup) and minimum confidence
(minconf).
The Apriori algorithm
• Probably the best known algorithm for Association
Rule Mining
• Two steps:
Step1:
– Find all itemsets that have minimum support
(frequent itemsets, also called large itemsets).
Step2:
– Use frequent itemsets to generate rules.
Start with STEP : 1
Find all itemsets that have minimum support count
(min_sup = 2)
Example:1 - Transaction Database
TID List of items
1 I1,I2,I5
2 I2,I4
3 I2,I3
4 I1,I2,I4
5 I1,I3
6 I2,I3
7 I1,I3
8 I1,I2,I3,I5
9 I1,I2,I3
C1: Scan D for count of each candidate
L1: Compare candidate support count to minimum support count
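The first pass (C1, then L1) can be reproduced from the Example 1 database. The following is a minimal Python sketch under two assumptions: the variable names `D`, `MIN_SUP`, `C1`, and `L1` are illustrative, and the entries for TID 7 and TID 8 are read as comma-separated item lists.

```python
from collections import Counter

# Example 1 database (TID 1..9).
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]
MIN_SUP = 2  # minimum support count from the slide

# C1: scan D once and count every single item.
C1 = Counter(item for t in D for item in t)

# L1: keep only the candidates whose count reaches the minimum support count.
L1 = {item: n for item, n in sorted(C1.items()) if n >= MIN_SUP}
print(L1)
# prints: {'I1': 6, 'I2': 7, 'I3': 6, 'I4': 2, 'I5': 2}
```

With min_sup = 2, every item in this database is frequent, so L1 contains all five items.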
C2: Generate C2 candidates from L1, then scan D for count of each candidate C2

Itemset     Support count
[I1, I2]    4
[I1, I3]    4
[I1, I4]    1
[I1, I5]    2
[I2, I3]    4
[I2, I4]    2
[I2, I5]    2
[I3, I4]    0
[I3, I5]    1
[I4, I5]    0

L2: Compare candidate support count with minimum support count

Itemset     Support count
[I1, I2]    4
[I1, I3]    4
[I1, I5]    2
[I2, I3]    4
[I2, I4]    2
[I2, I5]    2
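The second pass can be sketched in the same way. This Python sketch is illustrative only: it takes the five frequent items as given, forms every pair as a C2 candidate, counts each pair in D, and filters by the minimum support count. The names `D`, `L1`, `C2`, `counts`, and `L2` are assumptions made for this example.

```python
from itertools import combinations

# Example 1 database (TID 1..9).
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_SUP = 2
L1 = ["I1", "I2", "I3", "I4", "I5"]  # frequent 1-itemsets, taken as given

# C2: every unordered pair of frequent items (10 candidates for 5 items).
C2 = list(combinations(L1, 2))

# Scan D: a transaction supports a pair if it contains both items.
counts = {pair: sum(set(pair) <= t for t in D) for pair in C2}

# L2: candidates that reach the minimum support count.
L2 = {pair: n for pair, n in counts.items() if n >= MIN_SUP}
for pair, n in L2.items():
    print(pair, n)
```

Six of the ten candidate pairs survive; the four pairs with counts 0 or 1 are dropped.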
C3: Generate C3 candidates from L2, then scan D for count of each candidate C3

Itemset         Support count
[I1, I2, I3]    2
[I1, I2, I5]    2
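Only two 3-itemsets appear in C3 because candidate generation in Apriori is usually described as a join step followed by a prune step: a candidate is kept only if all of its 2-item subsets are in L2. The slides do not spell this out, so the sketch below is an assumption-labeled illustration; the function name `apriori_gen` and the tuple representation are choices made here.

```python
from itertools import combinations

D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]

def apriori_gen(prev_frequent):
    """Build size-k candidates from sorted, equal-length frequent (k-1)-tuples."""
    prev = set(prev_frequent)
    k = len(prev_frequent[0]) + 1
    candidates = []
    for a, b in combinations(sorted(prev), 2):
        if a[:-1] == b[:-1]:              # join: same first k-2 items
            cand = a + (b[-1],)
            # prune: every (k-1)-subset of the candidate must be frequent
            if all(s in prev for s in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates

C3 = apriori_gen(L2)
C3_counts = {c: sum(set(c) <= t for t in D) for c in C3}
print(C3_counts)
# prints: {('I1', 'I2', 'I3'): 2, ('I1', 'I2', 'I5'): 2}

# One more round: the only joined 4-itemset contains the infrequent
# subset (I1, I3, I5), so it is pruned and the search stops.
C4 = apriori_gen(C3)
print(C4)  # prints: []
```

For instance, the join also produces (I1, I3, I5), but it is pruned because (I3, I5) is not in L2.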
Start with STEP : 2
Generating Association Rules from Frequent Itemsets
Generate strong association rules from the
frequent itemsets
Example 1:
The data contain the frequent itemset X = {I1, I2, I5}. What
are the association rules generated from X? The minimum
confidence threshold is 70%.
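The resulting association rules, with confidence computed from
the Example 1 support counts, are
{I1, I2} => I5    confidence = 2/4 = 50%
{I1, I5} => I2    confidence = 2/2 = 100%
{I2, I5} => I1    confidence = 2/2 = 100%
I1 => {I2, I5}    confidence = 2/6 = 33%
I2 => {I1, I5}    confidence = 2/7 = 29%
I5 => {I1, I2}    confidence = 2/2 = 100%
With a minimum confidence threshold of 70%, only the second,
third and last rules are strong.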
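Rules from X can also be enumerated and filtered programmatically. The following minimal Python sketch splits X into every non-empty antecedent and its complement, and computes confidence as count(X) / count(antecedent) over the Example 1 database. The names `count`, `rules`, and `strong` are illustrative assumptions, not part of the original slides.

```python
from itertools import combinations

D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
X = ("I1", "I2", "I5")   # frequent itemset from the example
MIN_CONF = 0.70          # minimum confidence threshold from the example

def count(itemset):
    """Support count: number of transactions in D containing the itemset."""
    return sum(set(itemset) <= t for t in D)

rules = []
for r in range(1, len(X)):                      # antecedent sizes 1 and 2
    for antecedent in combinations(X, r):
        consequent = tuple(i for i in X if i not in antecedent)
        conf = count(X) / count(antecedent)     # P(consequent | antecedent)
        rules.append((antecedent, consequent, conf))

strong = [(a, c) for a, c, conf in rules if conf >= MIN_CONF]
for a, c, conf in rules:
    flag = "strong" if conf >= MIN_CONF else "rejected"
    print(f"{a} => {c}: {conf:.0%} ({flag})")
```

Three of the six candidate rules reach the 70% threshold: I5 => {I1, I2}, {I1, I5} => I2, and {I2, I5} => I1.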
This lecture is based on the following resources - slides:
1. J.Han : Data Mining Concepts and Techniques.
2. G.Piatetsky-Shapiro : Association Rules and Frequent
Item Analysis.
3. Jerzy Stefanowski : Institute of Computing Sciences
Poznan University of Technology Poznan, Poland.
THANK YOU