FP Growth
FP Growth Algorithm
Divide-and-conquer strategy:
- Compress a large database into a compact Frequent-Pattern tree (FP-tree) structure
  - highly condensed, but complete for frequent pattern mining
  - avoids costly database scans
- Then divide the compressed database into a set of conditional databases (a special kind of projected database)
- Mine each such database separately
ASSOCIATION RULE MINING
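- Construct an FP-tree
  - Create a root node labeled null
  - Scan the database
    - Process the items in each transaction in L order (L is the list of frequent items sorted by descending support count)
    - From the root, add nodes in the order in which the items appear in the sorted transaction, incrementing the count of any node that already exists on the path
  - Link nodes representing the same item along different branches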
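The construction steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `FPNode` and `build_fp_tree` are hypothetical, and ties in support count are broken alphabetically, which is an assumption the slides do not state.

```python
from collections import Counter


class FPNode:
    """One FP-tree node: an item, its count, and tree/node-link pointers."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}   # item -> child FPNode
        self.link = None     # next node in the tree that carries the same item


def build_fp_tree(transactions, min_count):
    # First scan: support count of every item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_count}
    # L order: descending support count; alphabetical tie-break is an assumption.
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: pos for pos, item in enumerate(ordered)}

    root = FPNode(None)   # root node labeled null
    header = {}           # item -> first node of its node-link chain
    tails = {}            # item -> last node of its node-link chain

    # Second scan: insert each transaction, items in L order.
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                # Link nodes representing the same item along different branches.
                if item in tails:
                    tails[item].link = child
                else:
                    header[item] = child
                tails[item] = child
            child.count += 1
            node = child
    return root, header


# Toy input, invented for illustration only.
root, header = build_fp_tree([["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]], 2)
```

Following `header[item]` and then each `.link` visits every node for that item, which is what the mining phase relies on.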
TID  Items
1    I1, I2, I5
2    I2, I4
3    I2, I3, I6
4    I1, I2, I4
5    I1, I3
6    I2, I3
7    I1, I3
8    I1, I2, I3, I5
9    I1, I2, I3

- Minimum support of ~20% (count of 2)
- Frequent 1-itemsets: I1, I2, I3, I4, I5
- Construct list L = {(I2,7),(I1,6),(I3,6),(I4,2),(I5,2)}
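As a quick check of the list L, the support counts can be recomputed from the nine transactions. The sketch below assumes ties (I1 and I3, I4 and I5) are ordered by item name, which happens to agree with the order shown in L; the variable names are illustrative.

```python
from collections import Counter

transactions = {
    1: ["I1", "I2", "I5"],
    2: ["I2", "I4"],
    3: ["I2", "I3", "I6"],
    4: ["I1", "I2", "I4"],
    5: ["I1", "I3"],
    6: ["I2", "I3"],
    7: ["I1", "I3"],
    8: ["I1", "I2", "I3", "I5"],
    9: ["I1", "I2", "I3"],
}
min_count = 2  # minimum support count from the example

# Count in how many transactions each item occurs.
support = Counter(item for items in transactions.values() for item in items)

# Keep frequent items, sorted by descending count (ties by name: an assumption).
L = sorted(
    ((item, count) for item, count in support.items() if count >= min_count),
    key=lambda pair: (-pair[1], pair[0]),
)
print(L)
# [('I2', 7), ('I1', 6), ('I3', 6), ('I4', 2), ('I5', 2)]
```

I6 occurs only once, so it falls below the minimum count and is left out of L.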
Header table and FP-tree after inserting transactions 1 and 2:

Item  Count
I2    2
I1    1
I3    0
I4    1
I5    1

null
  (I2,2)
    (I1,1)
      (I5,1)
    (I4,1)
FP Growth Algorithm: FP Tree Mining

Item  Count
I2    7
I1    6
I3    6
I4    2
I5    2

null
  (I2,7)
    (I1,4)
      (I5,1)
      (I4,1)
      (I3,2)
        (I5,1)
    (I4,1)
    (I3,2)
  (I1,2)
    (I3,2)

Mining I5:
- Prefix paths: (I2 I1, 1), (I2 I1 I3, 1)
- Conditional path: (I2 I1, 2)
- Conditional FP-tree:
  null
    (I2,2)
      (I1,2)
- Frequent patterns: (I2 I1 I5, 2), (I2 I5, 2), (I1 I5, 2)
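The patterns for I5 can be reproduced from its two prefix paths. The sketch below does not recurse on a conditional FP-tree; it simply enumerates item combinations over the conditional pattern base and counts their support, which is adequate for an example this small. The function name `patterns_for_suffix` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def patterns_for_suffix(prefix_paths, suffix, min_count):
    """Brute-force frequent patterns ending in `suffix` from its prefix paths."""
    # Support of each item inside the conditional pattern base.
    counts = Counter()
    for path, n in prefix_paths.items():
        counts.update({item: n for item in path})
    kept = sorted(item for item, c in counts.items() if c >= min_count)

    result = {}
    for size in range(1, len(kept) + 1):
        for combo in combinations(kept, size):
            # A path supports the combination if it contains every item in it.
            support = sum(n for path, n in prefix_paths.items()
                          if set(combo) <= set(path))
            if support >= min_count:
                result[combo + (suffix,)] = support
    return result


# Prefix paths of I5, each with the count of the I5 node that ends it.
i5_paths = {("I2", "I1"): 1, ("I2", "I1", "I3"): 1}
i5_patterns = patterns_for_suffix(i5_paths, "I5", min_count=2)
print(i5_patterns)
# {('I1', 'I5'): 2, ('I2', 'I5'): 2, ('I1', 'I2', 'I5'): 2}
```

I3 has a count of only 1 within the prefix paths, so it is dropped, leaving the single conditional path (I2 I1, 2).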
Data Mining - Ayesha Khan November 3, 2015
Header table
Item  Frequency
f     4
c     4
a     3
b     3
m     3
p     3

FP-tree
{}
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1

Conditional pattern bases
Item  Cond. pattern base
c     f:3
a     fc:3
b     fca:1, f:1, c:1
m     fca:2, fcab:1
p     fcam:2, cb:1
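The conditional pattern bases can be read off mechanically: for every occurrence of an item in the tree, record the path from the root down to (but excluding) that node, weighted by the node's count. The sketch below assumes the tree is given as its four root-to-leaf branches with leaf counts, as read from the tree in this example; `conditional_pattern_bases` is an illustrative name.

```python
from collections import defaultdict

# Root-to-leaf branches of the example tree, with the count of each leaf.
branches = [
    (["f", "c", "a", "m", "p"], 2),
    (["f", "c", "a", "b", "m"], 1),
    (["f", "b"], 1),
    (["c", "b", "p"], 1),
]


def conditional_pattern_bases(branches):
    bases = defaultdict(lambda: defaultdict(int))
    for path, count in branches:
        # Every item below the first level contributes its prefix path.
        for pos in range(1, len(path)):
            prefix = "".join(path[:pos])
            bases[path[pos]][prefix] += count
    return {item: dict(base) for item, base in bases.items()}


bases = conditional_pattern_bases(branches)
for item in "cabmp":
    print(item, bases[item])
# c {'f': 3}
# a {'fc': 3}
# b {'fca': 1, 'f': 1, 'c': 1}
# m {'fca': 2, 'fcab': 1}
# p {'fcam': 2, 'cb': 1}
```

The item f never has a prefix, because it only appears directly under the root, so it has no conditional pattern base.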
m-conditional pattern base: fca:2, fcab:1

m-conditional FP-tree
{}
  f:3
    c:3
      a:3

All frequent patterns concerning m:
m, fm, cm, am, fcm, fam, cam, fcam
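Because the m-conditional FP-tree is a single path, the patterns concerning m are exactly the combinations of nodes on that path, each with m appended. A short sketch of that enumeration follows; the variable names are illustrative, and the support of each pattern is taken as the smallest node count in the chosen combination.

```python
from itertools import combinations

single_path = [("f", 3), ("c", 3), ("a", 3)]  # the m-conditional FP-tree
suffix, suffix_count = "m", 3                 # m's frequency from the header table

patterns = {suffix: suffix_count}
for size in range(1, len(single_path) + 1):
    for combo in combinations(single_path, size):
        name = "".join(item for item, _ in combo) + suffix
        # Support of the pattern is the minimum count along the chosen nodes.
        patterns[name] = min(count for _, count in combo)

print(sorted(patterns, key=lambda p: (len(p), p)))
# ['m', 'am', 'cm', 'fm', 'cam', 'fam', 'fcm', 'fcam']
```

A path with k nodes yields 2^k patterns including the suffix on its own, here 2^3 = 8.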
Example: Sorted by their frequency

Transaction  Items
100          Bread, Cheese, Eggs, Juice
200          Bread, Cheese, Juice
300          Bread, Milk, Yogurt
400          Bread, Juice, Milk
500          Cheese, Juice, Milk

Item    Frequency
Bread   4
Juice   4
Cheese  3
Milk    3
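Removing the non-frequent items and reordering

Eggs and Yogurt each occur only once and are dropped; the remaining items in each transaction are sorted by descending frequency.

Transaction  Items
100          Bread, Juice, Cheese
200          Bread, Juice, Cheese
300          Bread, Milk
400          Bread, Juice, Milk
500          Juice, Cheese, Milk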
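The filtering and reordering step can be sketched as below. The slides do not state the minimum support for this example, so `min_count = 2` is an assumed value (any threshold that excludes the items occurring once gives the same result), and the alphabetical tie-break between equally frequent items is also an assumption.

```python
from collections import Counter

transactions = {
    100: ["Bread", "Cheese", "Eggs", "Juice"],
    200: ["Bread", "Cheese", "Juice"],
    300: ["Bread", "Milk", "Yogurt"],
    400: ["Bread", "Juice", "Milk"],
    500: ["Cheese", "Juice", "Milk"],
}
min_count = 2  # assumed threshold, not given in the slides


def filter_and_reorder(transactions, min_count):
    support = Counter(item for items in transactions.values() for item in items)
    return {
        tid: sorted(
            (item for item in items if support[item] >= min_count),
            key=lambda item: (-support[item], item),
        )
        for tid, items in transactions.items()
    }


reordered = filter_and_reorder(transactions, min_count)
for tid, items in reordered.items():
    print(tid, ", ".join(items))
# 100 Bread, Juice, Cheese
# 200 Bread, Juice, Cheese
# 300 Bread, Milk
# 400 Bread, Juice, Milk
# 500 Juice, Cheese, Milk
```

Transactions that share a prefix after reordering (100, 200 and 400 all start with Bread, Juice) share a branch in the FP-tree, which is where the compression comes from.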
FP-tree (header table items: B, J, C, M)

NULL
  B:4
    J:3
      C:2
      M:1
    M:1
  J:1
    C:1
      M:1
Mining FP-tree for frequent items
- For any frequent item A, all the frequent itemsets containing A can be obtained by following A's node-links, starting from A's head in the FP-tree header
- Starting with M, the paths ending in M are:
  - BM(1)
  - BJM(1)
  - JCM(1)
- Next we look at C and find the following:
  - BJC(2)
  - JC(1)
- Looking at J, the next frequent item in the header table, we obtain:
  - BJ(3)
  - J(1)
- We obtain a frequent itemset,
"Conditional" trees for M

NULL
  B:4
    J:3
      M:1
    M:1
  J:1
    C:1
      M:1
FP-growth vs. Apriori: Scalability With the Support Threshold
[Figure: run time (sec.) plotted against support threshold (%)]