Professional Documents
Culture Documents
Associations and
Correlations
1
Mining Frequent Patterns
Methods
Summary
2
Basic Concepts: Definitions
3
Basic Concepts: Definitions
4
Market Basket Analysis
5
Market Basket Analysis
6
Market Basket Analysis
7
Market Basket Analysis
8
Market Basket Analysis: Applications – (1)
9
9
Market Basket Analysis: Applications – (2)
10
10
Market Basket Analysis: Applications – (3)
11
11
Frequent Patterns, Associations and Correlations
Basic Concepts: Frequent Patterns
12 12
Basic Concepts: Association Rules
13 13
Frequent Patterns, Associations and Correlations
14
Basic Concepts: Support and Confidence
Mining Association Rules
2.Rule Generation
Generate rules that satisfy minimum support and minimum
confidence.
15
Frequent Pattern Generation
16
Frequent Patterns, Associations and Correlations
17
Frequent Patterns Generations
Frequent Patterns Generations
18
Frequent Patterns Generations
19
Kinds of Frequent Pattern Mining
20
Kinds of Frequent Pattern Mining
21
Kinds of Frequent Pattern Mining
22
Kinds of Frequent Pattern Mining
23
Kinds of Frequent Pattern Mining
24
Kinds of Frequent Pattern Mining
25
Kinds of Frequent Pattern Mining
26
Mining Frequent Patterns
Methods
Summary
27
Frequent Patterns: Complete Search Approach
28
The Downward Closure Property and
@SIGMOD’00)
Vertical data format approach (Charm—Zaki & Hsiao
@SDM’02)
29 29
Apriori:
33
The Apriori Algorithm—An Example
C3 Itemset
3rd scan L3 Itemset sup
{B, C, E} {B, C, E} 2
34 34
Frequent Patterns, Associations and Correlations
35
The Apriori Algorithm—An Example
Frequent Patterns, Associations and Correlations
36
The Apriori Algorithm—An Example
Pruning
The Apriori Algorithm (Pseudo-Code)
L1 = {frequent items};
for (k = 1; Lk !=; k++) do begin
Ck+1 = candidates generated from Lk;
for each transaction t in database do
increment the count of all candidates in Ck+1 that are
contained in t
Lk+1 = candidates in Ck+1 with min_support
end
return k Lk;
37 37
Implementation of Apriori
39 39
Pattern-Growth Approach: Mining Frequent Patterns
40 40
Construct FP-tree from a Transaction
41 41
Construct FP-tree from a Transaction Database
TID Items bought (ordered) frequent items
100 {f, a, c, d, g, i, m, p} {f, c, a, m, p}
200 {a, b, c, f, l, m, o} {f, c, a, b, m}
300 {b, f, h, j, o, w} {f, b}
400 {b, c, k, s, p} {c, b, p} min_support = 3
500 {a, f, c, e, l, p, m, n} {f, c, a, m, p}
{}
Header Table
1. Scan DB once, find
frequent 1-itemset (single Item frequency head f:4 c:1
item pattern) f 4
c 4 c:3 b:1 b:1
2. Sort frequent items in a 3
frequency descending b 3 a:3 p:1
order, f-list m 3
p 3
3. Scan DB again, construct m:2 b:1
FP-tree
F-list = f-c-a-b-m-p p:2 m:1
42
Partition Patterns and Databases
43 43
Find Patterns Having P From P-conditional Database
Starting at the frequent item header table in the FP-tree
Traverse the FP-tree by following the link of each frequent item p
Accumulate all of transformed prefix paths of item p to form p’s
conditional pattern base
{}
Header Table
f:4 c:1 Conditional pattern bases
Item frequency head
f 4 item cond. pattern base
c 4 c:3 b:1 b:1
c f:3
a 3
b 3 a:3 p:1 a fc:3
m 3 b fca:1, f:1, c:1
p 3 m:2 b:1 m fca:2, fcab:1
p:2 m:1 p fcam:2, cb:1
44
From Conditional Pattern-bases to Conditional FP-trees
pattern base
{}
c:3
f:3
am-conditional FP-tree
c:3 {}
Cond. pattern base of “cm”: (f:3)
a:3 f:3
m-conditional FP-tree
cm-conditional FP-tree
{}
Cond. pattern base of “cam”: (f:3) f:3
cam-conditional FP-tree
46
Frequent Patterns, Associations and Correlations
Construct FP-tree from a Transaction Database : Another Example
47
Frequent Patterns, Associations and Correlations
Construct FP-tree from a Transaction Database: Another Example
48
Frequent Patterns, Associations and Correlations
Extension of Pattern Growth Mining Methodology
Mining closed frequent itemsets and max-patterns
CLOSET (DMKD’00), FPclose, and FPMax (Grahne & Zhu, Fimi’03)
Pattern-growth-based Clustering
MaPle (Pei, et al., ICDM’03)
Pattern-Growth-Based Classification
Mining frequent and discriminative patterns (Cheng, et al, ICDE’07)
49 49
Mining Frequent Itemsets Using the Vertical
50
Mining Frequent Itemsets Using the Vertical
51
Mining Frequent Itemsets Using the Vertical
52
Mining Frequent Closed Patterns :CLOSET
53
MaxMiner: Mining Max-Patterns
Tid Items
1st scan: find frequent items
10 A, B, C, D, E
A, B, C, D, E 20 B, C, D, E,
2nd scan: find support for 30 A, C, D, F
Methods
Summary
55
Interestingness Measure: Correlations (Lift)
56
Interestingness Measure
57
Home Work
58
Home Work
59
Frequent Patterns, Associations and Correlations
60
The End
Thanks