
Faculty of Computer Science & Engineering

Ho Chi Minh City University of Technology

Chapter 6: Association Rule Mining

Data Mining

Semester 1, 2009-2010

Contents

6.1. Overview of association rule mining

6.2. Representing association rules

6.3. Discovering frequent patterns

6.4. Discovering association rules from frequent patterns

6.5. Constraint-based association rule discovery

6.6. Correlation analysis

6.7. Summary


6.0. Scenario 1: Market basket analysis

6.0. Scenario 2: Cross-marketing

6.0. Scenarios

Basket data analysis

Cross-marketing

Catalog design

Classification and clustering with frequent patterns

6.1. Overview of association rule mining

The association rule mining process

Basic concepts

Types of association rules

The association rule mining process

[Figure: Raw Data -> Preprocessing -> Items of Interest -> Mining -> Relationships among Items (Rules) -> Postprocessing -> User]

The association rule mining process: a market basket analysis example

[Figure: the same pipeline, annotated with an example. Raw data: transactional/relational data. Items of interest: A, B, C, D, F, ... Relationships among items: association rules.]

Transaction   Items_bought
-----------   ------------
2000          A, B, C
1000          A, C
4000          A, D
5000          B, E, F

Association rule: A => C (support 50%, confidence 66.6%)
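The figures quoted for A => C can be checked directly against the four example transactions. The following is a minimal sketch; the function names `support` and `confidence` are chosen here for illustration and do not come from the slides.

```python
# Transactions copied from the example table on the slide.
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A U B) / support(A) for the rule A => B."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rule_support = support({"A", "C"}, transactions)
rule_confidence = confidence({"A"}, {"C"}, transactions)
print(f"A => C: support={rule_support:.1%}, confidence={rule_confidence:.1%}")
# prints: A => C: support=50.0%, confidence=66.7%
```

Two of the four transactions contain both A and C (support 2/4), and two of the three transactions containing A also contain C (confidence 2/3).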

Sample AllElectronics data (after preprocessing)

[Table: sample AllElectronics transaction data]

Basic concepts

Item

Itemset

Transaction

Association and association rule

Support

Confidence

Frequent itemset

Strong association rule

Sample AllElectronics data (after preprocessing), annotated with examples

Itemsets: {I1, I2, I5}, {I2}, ...

Item: I4

Transaction: T800

Basic concepts

Item
  An element, pattern, or object of interest.
  J = {I1, I2, ..., Im}: the set of all m items that can occur in the dataset.

Itemset
  A set of items.
  An itemset containing k items is called a k-itemset.

Transaction
  One interaction with the system (for example, a customer purchase).
  Each transaction is associated with a set T of the items involved in it.

Basic concepts

Association and association rule
  Association: items that occur together in one or more transactions.
    Expresses a relationship between items or itemsets.
  Association rule: a conditional association between itemsets.
    Expresses a conditional relationship between itemsets.
    Given itemsets A and B, the association rule between A and B is written A => B:
    B occurs under the condition that A occurs.

Basic concepts

Support
  A measure of how frequently an item or itemset occurs.
  Minimum support threshold: the smallest acceptable support value, specified by the user.

Confidence
  A measure of how frequently one itemset occurs given that another itemset occurs.
  Minimum confidence threshold: the smallest acceptable confidence value, specified by the user.

Basic concepts

Frequent itemset
  An itemset whose support satisfies the minimum support threshold.
  Given an itemset A, A is a frequent itemset iff support(A) >= minimum support threshold.

Strong association rule
  An association rule whose support and confidence satisfy the minimum support threshold and the minimum confidence threshold.
  Given an association rule A => B, where A and B are itemsets, A => B is a strong association rule iff support(A => B) >= minimum support threshold and confidence(A => B) >= minimum confidence threshold.

Types of association rules

Boolean association rule / quantitative association rule

Single-dimensional association rule / multidimensional association rule

Single-level association rule / multilevel association rule

Association rule / correlation rule

Types of association rules: Boolean vs. quantitative

Boolean association rule: a rule describing an association between the presence or absence of items.
  Computer => Financial_management_software [support=2%, confidence=60%]

Quantitative association rule: a rule describing an association between quantitative items or attributes.
  Age(X, "30..39") ∧ Income(X, "42K..48K") => buys(X, "high resolution TV")

Types of association rules: single-dimensional vs. multidimensional

Single-dimensional association rule: a rule involving items or attributes of only one data dimension.
  Buys(X, "computer") => Buys(X, "financial_management_software")

Multidimensional association rule: a rule involving items or attributes of more than one dimension.
  Age(X, "30..39") => Buys(X, "computer")

Types of association rules: single-level vs. multilevel

Single-level association rule: a rule involving items or attributes at a single level of abstraction.
  Age(X, "30..39") => Buys(X, "computer")
  Age(X, "18..29") => Buys(X, "camera")

Multilevel association rule: a rule involving items or attributes at different levels of abstraction.
  Age(X, "30..39") => Buys(X, "laptop computer")
  Age(X, "30..39") => Buys(X, "computer")

Types of association rules: association rule vs. correlation rule

Association rule: a strong association rule A => B, that is, a rule that meets the minimum support threshold and the minimum confidence threshold.

Correlation rule: a strong association rule A => B that additionally meets a requirement on the statistical correlation between A and B.

6.2. Representing association rules

Rule form: A => B [support, confidence]

  Given a minimum support threshold (min_sup) and a minimum confidence threshold (min_conf)

  A and B are itemsets

Frequent itemsets/subsequences/substructures

Closed frequent itemsets

Maximal frequent itemsets

Constrained frequent itemsets

Approximate frequent itemsets

Top-k frequent itemsets

Frequent itemsets/subsequences/substructures

An itemset, subsequence, or substructure X is frequent if support(X) >= min_sup.

Itemsets: sets of items

Subsequences: ordered sequences of events/items

Substructures: structural forms (graph, lattice, tree, sequence, set, ...)

Closed frequent itemsets

An itemset X is closed in J if no proper superset Y of X in J has the same support as X.

  For X ⊆ J, X is closed iff for every Y ⊆ J with X ⊂ Y: support(Y) <> support(X).

X is a closed frequent itemset in J if X is both frequent and closed in J.

Maximal frequent itemsets

An itemset X is a maximal frequent itemset in J if X is frequent and no proper superset Y of X in J is a frequent itemset.

  For X ⊆ J, X is a maximal frequent itemset iff X is frequent and for every Y ⊆ J with X ⊂ Y: Y is not a frequent itemset.
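The two definitions above can be applied by brute force on a small dataset. The sketch below reuses the four-transaction example from section 6.1 and assumes a minimum support count of 2 (an assumption made for this illustration). It enumerates every itemset over the item universe, which is only practical for tiny examples.

```python
from itertools import combinations

transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
min_count = 2  # assumed minimum support count for this illustration

items = sorted(set().union(*transactions))


def count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Support count of every nonempty itemset over the item universe.
support_count = {
    frozenset(c): count(frozenset(c))
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
}
frequent = {x for x, n in support_count.items() if n >= min_count}


def proper_supersets(x):
    return (y for y in support_count if x < y)


# Closed: no proper superset has the same support.
closed_frequent = {
    x for x in frequent
    if all(support_count[y] != support_count[x] for y in proper_supersets(x))
}
# Maximal: no proper superset is frequent.
maximal_frequent = {
    x for x in frequent
    if all(y not in frequent for y in proper_supersets(x))
}

print(sorted(sorted(x) for x in closed_frequent))   # [['A'], ['A', 'C'], ['B']]
print(sorted(sorted(x) for x in maximal_frequent))  # [['A', 'C'], ['B']]
```

Here {C} is frequent but not closed, because its superset {A, C} has the same support count. Every maximal frequent itemset is also closed, but not the other way round ({A} is closed but not maximal).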

Constrained frequent itemsets

  Frequent itemsets that satisfy user-defined constraints.

Approximate frequent itemsets

  Frequent itemsets from which approximate support values are derived for the frequent itemsets to be mined.

Top-k frequent itemsets

  The k most frequent itemsets, where k is specified by the user.

Boolean, single-level, single-dimensional association rules between frequent itemsets: A => B [support, confidence]

A and B are frequent itemsets

  single-dimensional

  single-level

  Boolean

Support(A => B) = Support(A U B) >= min_sup

Confidence(A => B) = Support(A U B)/Support(A) = P(B|A) >= min_conf

6.3. Discovering frequent patterns

Apriori algorithm: discovering frequent patterns with candidate generation

  R. Agrawal, R. Srikant. Fast algorithms for mining association rules. In VLDB 1994, pp. 487-499.

FP-Growth algorithm: discovering frequent patterns with an FP-tree

  J. Han, J. Pei, Y. Yin. Mining frequent patterns without candidate generation. In SIGMOD 2000, pp. 1-12.

Apriori algorithm

Uses prior knowledge of the properties of frequent itemsets.

Iterative approach: frequent itemsets are found one level at a time (level-wise search).

  (k+1)-itemsets are generated from k-itemsets.

  At each level, the entire dataset is scanned.

Apriori property, used to reduce the search space: all nonempty subsets of a frequent itemset must also be frequent.

  Proof: if A ⊆ B, every transaction that contains B also contains A, so support(A) >= support(B) >= min_sup.

  Antimonotone: if a set cannot pass a test, all of its supersets will fail the same test as well.
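The level-wise search with Apriori pruning can be sketched as follows. This is a simplified illustration, not the pseudocode from the original paper: the join step simply unions pairs of frequent k-itemsets instead of using the prefix-based join. The input is the four-transaction example from section 6.1 with an assumed minimum support count of 2.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: n for s, n in counts.items() if n >= min_count}
    frequent = dict(level)
    k = 1
    while level:
        # Join step: union pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step (Apriori property): every k-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        # One full pass over the data per level to count the survivors.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: n for s, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent


result = apriori(
    [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}], min_count=2
)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
# ['A'] 3
# ['B'] 2
# ['C'] 2
# ['A', 'C'] 2
```

Each iteration of the `while` loop corresponds to one level of the search and performs one scan of the transaction list, which is the cost the slides attribute to Apriori.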


Sample AllElectronics data (after preprocessing)

min_sup = 2/9
minimum support count = 2

Apriori algorithm

Characteristics

Generates many candidate itemsets
  10^4 frequent 1-itemsets lead to more than 10^7 (about 10^4(10^4 - 1)/2) candidate 2-itemsets.
  Discovering a frequent k-itemset requires generating at least 2^k - 1 candidate itemsets first.

Scans the dataset many times
  The cost is high and grows with the size of the itemsets.
  If k-itemsets are discovered, the dataset must be scanned k+1 times.
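The candidate counts quoted above follow from simple counting, as the short calculation below shows. The pattern length k = 100 is a hypothetical value chosen only to show how quickly 2^k - 1 grows.

```python
from math import comb

n_frequent_1 = 10**4
# Every pair of frequent 1-itemsets is a candidate 2-itemset.
candidate_2 = comb(n_frequent_1, 2)  # 10^4 * (10^4 - 1) / 2
print(candidate_2)  # 49995000, which is more than 10^7

k = 100  # hypothetical length of one frequent pattern
# Every nonempty subset of a frequent k-itemset is itself frequent,
# so each of them has to appear as a candidate at some level.
nonempty_subsets = 2**k - 1
print(f"{nonempty_subsets:.3e}")  # about 1.268e+30
```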

Apriori algorithm

Improvements to the Apriori algorithm

Hash-based technique
  A k-itemset whose hash bucket count is below the minimum support threshold cannot be a frequent itemset.

Transaction reduction
  A transaction that contains no frequent k-itemset need not be scanned in later passes (for (k+1)-itemsets).

Partitioning
  An itemset can be frequent in the whole dataset only if it is frequent in at least one partition.

Sampling
  Mine only a given subset of the data, using a lower support threshold, together with a method for checking completeness.

Dynamic itemset counting
  Add a candidate itemset only when all of its subsets are estimated to be frequent.

FP-Growth algorithm

Compresses the dataset into a tree structure (Frequent Pattern tree, FP-tree)
  Reduces the cost of working with the full dataset during mining.
    Infrequent items are discarded early.
  Guarantees that the mining results are unaffected.

Divide-and-conquer method
  The mining process is divided into smaller tasks:
    1. Build the FP-tree
    2. Discover frequent itemsets from the FP-tree

Avoids generating candidate sets

Examines only part of the dataset at a time

FP-Growth algorithm

1. Build the FP-tree

1.1. Scan the dataset and find the frequent 1-itemsets.

1.2. Sort the frequent 1-itemsets in descending order of support count (frequency).

1.3. Scan the dataset again and build the FP-tree.
  Create the root of the FP-tree, labeled null {}.
  Each transaction corresponds to one branch of the FP-tree.
  Each node on a branch corresponds to one item of the transaction.
    The items of a transaction are sorted in descending order of support count.
    Each node carries the support count of its item.
  Transactions that share items form branches with a common prefix.
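Steps 1.1 to 1.3 can be sketched as below. This is a minimal illustration that omits the header table and node-links a full FP-tree normally carries. The class name `FPNode`, the helper `show`, and the tie-breaking rule (alphabetical order among items with equal counts) are assumptions made for this example. The input is the four-transaction example from section 6.1 with an assumed minimum support count of 2.

```python
from collections import Counter


class FPNode:
    def __init__(self, item, parent=None):
        self.item = item      # None for the root ("null")
        self.count = 0
        self.parent = parent
        self.children = {}    # item -> FPNode


def build_fp_tree(transactions, min_count):
    # Step 1.1: first scan, count single items and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: n for i, n in counts.items() if n >= min_count}
    # Step 1.2: order by descending support count (ties broken by item name).
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}
    # Step 1.3: second scan, insert each transaction as a path from the root.
    root = FPNode(None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = FPNode(item, node)
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, frequent


def show(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for item in sorted(node.children):
        child = node.children[item]
        lines.append("  " * depth + f"{item}:{child.count}")
        lines.extend(show(child, depth + 1))
    return lines


transactions = [["A", "B", "C"], ["A", "C"], ["A", "D"], ["B", "E", "F"]]
root, frequent = build_fp_tree(transactions, min_count=2)
tree_lines = show(root)
print("\n".join(tree_lines))
# A:3
#   B:1
#     C:1
#   C:1
# B:1
```

The three transactions that contain A share the single node A:3, which is the prefix sharing that makes the tree more compact than the raw transaction list. The infrequent items D, E, and F never enter the tree.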


FP-Growth algorithm

2. Discover frequent itemsets from the FP-tree

2.1. Build a conditional pattern base for each node of the FP-tree.
  Accumulate the node's prefix paths together with their frequencies.

2.2. Build a conditional FP-tree from each conditional pattern base.
  Accumulate the frequency of each item in the base.
  Build the conditional FP-tree from the frequent items of the base.

2.3. Mine the conditional FP-trees and grow frequent itemsets recursively.
  If a conditional FP-tree has a single path, enumerate all itemsets formed from the items on that path.
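The recursion in steps 2.1 to 2.3 can be sketched without building explicit trees. The version below is a simplification: it keeps each conditional pattern base as a plain list of (prefix path, count) pairs rather than compressing it into a conditional FP-tree, and it does not use the single-path shortcut. It yields the same frequent itemsets, but without the compression benefit of the real data structure. The function names are chosen for this example.

```python
from collections import Counter


def fp_growth(weighted_paths, min_count, suffix=frozenset()):
    """Yield (itemset, support_count) pairs.

    weighted_paths: list of (tuple_of_items, count). Each tuple follows the
    global item order, so everything before an item is its prefix path.
    """
    # Accumulate the frequency of each item in this (conditional) base.
    counts = Counter()
    for path, n in weighted_paths:
        for item in path:
            counts[item] += n
    for item, total in counts.items():
        if total < min_count:
            continue
        pattern = suffix | {item}
        yield pattern, total
        # Conditional pattern base of `item`: its prefix paths with counts.
        base = []
        for path, n in weighted_paths:
            if item in path:
                prefix = path[:path.index(item)]
                if prefix:
                    base.append((prefix, n))
        # Recurse on the conditional base to grow longer patterns.
        yield from fp_growth(base, min_count, pattern)


def mine(transactions, min_count):
    counts = Counter(i for t in transactions for i in set(t))
    order = sorted((i for i in counts if counts[i] >= min_count),
                   key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(order)}
    # Keep only frequent items, sorted by descending support count.
    paths = [
        (tuple(sorted((i for i in set(t) if i in rank), key=rank.get)), 1)
        for t in transactions
    ]
    return dict(fp_growth(paths, min_count))


result = mine([["A", "B", "C"], ["A", "C"], ["A", "D"], ["B", "E", "F"]], 2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
# ['A'] 3
# ['B'] 2
# ['C'] 2
# ['A', 'C'] 2
```

For item C, the conditional pattern base is [(A, B): 1, (A): 1]. Within it only A reaches the minimum count, which grows the pattern {A, C} with support count 2. No candidate itemsets are generated or tested against the full dataset.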


FP-Growth algorithm

Characteristics

Does not generate candidate itemsets
  No test of whether candidate itemsets are actually frequent itemsets.

Uses a compact data structure that compresses the dataset.

Reduces the cost of scanning the dataset.

The main cost is counting and building the initial FP-tree.

Efficient and scalable for discovering both long and short frequent itemsets.

Comparing the Apriori and FP-Growth algorithms

[Figure: scalability with the support threshold]

[Figure: linear scalability with the number of transactions]

6.4. Discovering association rules from frequent patterns

Strong association rules A => B

In terms of support:

  Support(A => B) = Support(A U B) >= min_sup

  Confidence(A => B) = Support(A U B)/Support(A) = P(B|A) >= min_conf

In terms of support counts (with min_sup given as a count):

  Support(A => B) = Support_count(A U B) >= min_sup

  Confidence(A => B) = P(B|A) = Support_count(A U B)/Support_count(A) >= min_conf

Generating strong association rules from the set of frequent itemsets

For each frequent itemset l, generate the nonempty subsets of l.
  Support_count(l) >= min_sup

For each nonempty proper subset s of l, output the rule s => (l - s) if Support_count(l)/Support_count(s) >= min_conf.
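The rule-generation procedure above can be sketched as follows. The support counts are those of the frequent itemsets in the four-transaction example from section 6.1 (minimum support count 2), and min_conf = 50% is an assumption chosen for this illustration. The function name `generate_rules` is hypothetical.

```python
from itertools import combinations


def generate_rules(support_count, min_conf):
    """support_count: {frozenset: count} covering all frequent itemsets.

    Returns a list of (antecedent, consequent, confidence) triples.
    """
    rules = []
    for l, count_l in support_count.items():
        if len(l) < 2:
            continue  # a rule needs a nonempty antecedent and consequent
        for k in range(1, len(l)):
            for s in map(frozenset, combinations(sorted(l), k)):
                # Every subset of a frequent itemset is frequent,
                # so its count is available in support_count.
                conf = count_l / support_count[s]
                if conf >= min_conf:
                    rules.append((s, l - s, conf))
    return rules


support_count = {
    frozenset("A"): 3,
    frozenset("B"): 2,
    frozenset("C"): 2,
    frozenset("AC"): 2,
}
rules = generate_rules(support_count, min_conf=0.5)
for antecedent, consequent, conf in rules:
    print(f"{sorted(antecedent)} => {sorted(consequent)} (confidence {conf:.1%})")
# ['A'] => ['C'] (confidence 66.7%)
# ['C'] => ['A'] (confidence 100.0%)
```

No further scan of the transactions is needed at this stage: all confidences are computed from the support counts collected while mining the frequent itemsets.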

Example: rules generated from the frequent itemset {I1, I2, I5} with min_conf = 50%

I1 ∧ I2 => I5
I1 ∧ I5 => I2
I2 ∧ I5 => I1
I5 => I1 ∧ I2

6.5. Constraint-based association rule discovery

Constraints
  Guide the process of mining patterns and rules.
  Limit the search space during mining.

Types of constraints
  Knowledge type constraints
  Data constraints
  Level/dimension constraints
  Interestingness constraints
  Rule constraints

Types of constraints

Knowledge type constraints
  Association rules or correlation rules

Data constraints
  Task-relevant data (association rule mining)

Level/dimension constraints
  Data dimensions (attributes) or levels of abstraction/concept

Interestingness constraints
  Thresholds on the measures

Rule constraints
  The form of the rules to be discovered

Constraint-based rule discovery

Makes the data mining process more effective and more efficient.

  More effective: rules are discovered according to the user's requirements (constraints).

  More efficient: an optimizer can be used to exploit the user's constraints.

Rule discovery based on rule constraints

Rule form (metarule-guided mining)
  Metarules specify the syntactic form of the rules to be discovered.

Rule content
  Constraints among the variables in A and/or B of a rule A => B:
    Set/subset relationships
    Value domains
    Aggregate functions

Metarules

Specify the syntactic form of the rules to be discovered.

Are based on the data analyst's experience, expectations, and intuition.

Form a hypothesis about the relationships in the rules the user is interested in.

Mining combines association rule discovery with a search for rules that match the given metarules.

Metarules

Rule template: P1 ∧ P2 ∧ ... ∧ Pl => Q1 ∧ Q2 ∧ ... ∧ Qr

  P1, P2, ..., Pl, Q1, Q2, ..., Qr: instantiated predicates or predicate variables

  Usually involve several dimensions/attributes

Example

  Metarule:
    P1(X, Y) ∧ P2(X, W) => buys(X, "office software")

  A rule that satisfies the metarule:
    age(X, "30..39") ∧ income(X, "41k..60k") => buys(X, "office software")

Constraints among the variables S1, S2, ... in A and/or B of a rule A => B

Set/subset relationships: S1 ⊂ S2 or S1 ⊄ S2

Value domains

  S1 θ value, θ ∈ {=, <>, <, <=, >, >=}

  value ∈ S1 or value ∉ S1

  ValueSet θ S1 or S1 θ ValueSet, θ ∈ {=, <>, ⊆, ⊂, ⊄}

Aggregate functions

  Agg(S1) θ value, Agg() ∈ {min, max, sum, count, avg}, θ ∈ {=, <>, <, <=, >, >=}

Properties of constraints

Anti-monotone

Monotone

Succinctness

Convertible

Properties of constraints

Anti-monotone

  A constraint Ca is anti-monotone iff for any pattern S not satisfying Ca, none of the super-patterns of S can satisfy Ca.

  Example: sum(S.Price) <= value (with non-negative prices)

Monotone

  A constraint Cm is monotone iff for any pattern S satisfying Cm, every super-pattern of S also satisfies it.

  Example: sum(S.Price) >= value (with non-negative prices)
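The two definitions can be checked by brute force on a small item universe. The price table and the threshold below are hypothetical values chosen for illustration. A check on one table illustrates the definitions but does not prove them in general.

```python
from itertools import combinations

price = {"pen": 2, "book": 15, "lamp": 40, "tv": 300}  # hypothetical prices
value = 50  # hypothetical threshold

# All nonempty itemsets over the item universe.
itemsets = [
    frozenset(c)
    for k in range(1, len(price) + 1)
    for c in combinations(price, k)
]


def total(s):
    return sum(price[i] for i in s)


def is_anti_monotone(constraint):
    """No superset of a violating itemset satisfies the constraint."""
    return all(
        not constraint(t)
        for s in itemsets if not constraint(s)
        for t in itemsets if s < t
    )


def is_monotone(constraint):
    """Every superset of a satisfying itemset satisfies the constraint."""
    return all(
        constraint(t)
        for s in itemsets if constraint(s)
        for t in itemsets if s < t
    )


def at_most(s):
    return total(s) <= value


def at_least(s):
    return total(s) >= value


results = {
    "sum(S.Price) <= value": (is_anti_monotone(at_most), is_monotone(at_most)),
    "sum(S.Price) >= value": (is_anti_monotone(at_least), is_monotone(at_least)),
}
for name, (anti, mono) in results.items():
    print(f"{name}: anti-monotone={anti}, monotone={mono}")
# sum(S.Price) <= value: anti-monotone=True, monotone=False
# sum(S.Price) >= value: anti-monotone=False, monotone=True
```

An anti-monotone constraint lets a level-wise miner discard a violating itemset together with all its supersets, exactly as the Apriori property does for minimum support.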

Properties of constraints

Succinctness

  A subset of items Is ⊆ I is a succinct set if it can be expressed as σ_p(I) for some selection predicate p, where σ is the selection operator.

  SP ⊆ 2^I is a succinct power set if there is a fixed number of succinct sets I1, ..., Ik ⊆ I such that SP can be expressed in terms of the strict power sets of I1, ..., Ik using union and minus.

  A constraint Cs is succinct provided SAT_Cs(I) is a succinct power set.

  The sets that satisfy a succinct constraint can be generated explicitly and exactly.

  Example: min(S.Price) <= value

Properties of constraints

Convertible

  Constraints that are not anti-monotone, monotone, or succinct,

  but that become anti-monotone or monotone when the items of the itemset under test are arranged in a suitable order.

  Examples:
    If items are added in ascending price order, avg(I.price) <= 100 is a convertible anti-monotone constraint.
    If items are added in descending price order, avg(I.price) <= 100 is a convertible monotone constraint.
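The effect of the item ordering on avg(I.price) <= 100 can be checked by brute force. The price table below is hypothetical. An "extension" here means appending only items that come later in the chosen order, which is how a prefix-based miner grows an itemset.

```python
from itertools import combinations

price = {"a": 20, "b": 80, "c": 150, "d": 400}  # hypothetical prices


def avg_ok(itemset):
    return sum(price[i] for i in itemset) / len(itemset) <= 100


def extensions(order):
    """Yield (itemset, extended_itemset) pairs where the extension appends
    only items that come after every item of the itemset in `order`."""
    n = len(order)
    for k in range(1, n):
        for idx in combinations(range(n), k):
            tail = range(idx[-1] + 1, n)
            for m in range(1, len(tail) + 1):
                for extra in combinations(tail, m):
                    yield ([order[i] for i in idx],
                           [order[i] for i in idx + extra])


ascending = sorted(price, key=price.get)
descending = ascending[::-1]

# Ascending order: once the average exceeds 100, appending pricier items
# can never bring it back down (anti-monotone behaviour).
anti_monotone_asc = all(
    not avg_ok(t) for s, t in extensions(ascending) if not avg_ok(s)
)
# Descending order: once the average is within 100, appending cheaper items
# keeps it within 100 (monotone behaviour).
monotone_desc = all(avg_ok(t) for s, t in extensions(descending) if avg_ok(s))

# Without any ordering, the constraint has neither property.
all_sets = [
    frozenset(c)
    for k in range(1, len(price) + 1)
    for c in combinations(price, k)
]
anti_monotone_plain = all(
    not avg_ok(t)
    for s in all_sets if not avg_ok(s)
    for t in all_sets if s < t
)
monotone_plain = all(
    avg_ok(t)
    for s in all_sets if avg_ok(s)
    for t in all_sets if s < t
)

print(anti_monotone_asc, monotone_desc)     # True True
print(anti_monotone_plain, monotone_plain)  # False False
```

For instance {c} violates the constraint (average 150) while its superset {a, c} satisfies it (average 85), so the plain constraint is not anti-monotone. In ascending order, {c} can only be extended with d, and the violation persists.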


Discovering rules/frequent itemsets that satisfy constraints

Direct approach
  Apply the traditional algorithms.
  Check the constraints against each result obtained.
    Results that satisfy the constraints are returned as the final output.

Approach based on the properties of the constraints
  Analyze the properties of the constraints thoroughly.
  Check the constraints as early as possible during the discovery of rules/frequent itemsets.
    The data space is narrowed as early as possible.

6.6. Correlation analysis

Strong association rules A => B
  Based on how frequently A and B occur (min_sup)
  Based on the conditional probability of B given A (min_conf)

The support and confidence measures depend on the user's subjective choice of thresholds.

A very large number of association rules may be returned.

Example: of 10,000 transactions, 6,000 include computer games, 7,500 include videos, and 4,000 include both computer games and videos.

  Buys(X, "computer games") => Buys(X, "videos") [support = 40%, confidence = 66%]

Correlation analysis for an association rule A => B

Checks the correlation and mutual dependence between A and B.

Based on statistics of the data.

The measures are objective and do not depend on the user.

Example: of 10,000 transactions, 6,000 include computer games, 7,500 include videos, and 4,000 include both computer games and videos.

  Buys(X, "computer games") => Buys(X, "videos") [support = 40%, confidence = 66%]

  P(videos) = 75% > 66%: computer games and videos are negatively correlated.

Correlation rules: A => B [support, confidence, correlation]

correlation: a measure of the correlation between A and B.

Correlation measures: lift, χ2 (chi-square), all_confidence, cosine

  lift: tests whether A and B occur independently, based on probabilities.
  χ2 (chi-square): tests the independence of A and B, based on expected and observed values.
  all_confidence: evaluates the rule based on the maximum support value.
  cosine: similar to lift, but removes the dependence on the total number of transactions.

all_confidence and cosine work well for large datasets because they are not affected by transactions that contain none of the itemsets under test (null-transactions).

all_confidence and cosine are null-invariant measures.

The lift correlation measure

lift(A, B) < 1: A is negatively correlated with B

lift(A, B) > 1: A is positively correlated with B

lift(A, B) = 1: A and B are independent; there is no correlation

lift(A, B) = P(A U B) / (P(A) P(B)) = P(B|A) / P(B) = confidence(A => B) / support(B)

lift({game}=>{video}) = 0.89 < 1: {game} and {video} are negatively correlated.
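The lift value above, together with the other correlation measures listed in this section, can be computed from the four counts of the games/videos example. The slides give a formula only for lift. The formulas used below for all_confidence, cosine, and chi-square are the commonly used textbook definitions and should be read as assumptions of this sketch.

```python
from math import sqrt

n = 10_000
n_games, n_videos, n_both = 6_000, 7_500, 4_000

p_a, p_b, p_ab = n_games / n, n_videos / n, n_both / n

lift = p_ab / (p_a * p_b)
# Assumed definitions (not spelled out on the slides):
all_confidence = p_ab / max(p_a, p_b)
cosine = p_ab / sqrt(p_a * p_b)

# Chi-square over the 2x2 contingency table (observed vs. expected counts).
observed = {
    (True, True): n_both,
    (True, False): n_games - n_both,
    (False, True): n_videos - n_both,
    (False, False): n - n_games - n_videos + n_both,
}
chi_square = 0.0
for (has_game, has_video), obs in observed.items():
    row = n_games if has_game else n - n_games
    col = n_videos if has_video else n - n_videos
    expected = row * col / n  # count expected under independence
    chi_square += (obs - expected) ** 2 / expected

print(f"lift           = {lift:.2f}")            # 0.89
print(f"all_confidence = {all_confidence:.2f}")  # 0.53
print(f"cosine         = {cosine:.2f}")          # 0.60
print(f"chi-square     = {chi_square:.1f}")      # 555.6
```

The lift of 0.89 matches the value on the slide. The observed count of transactions containing both items (4,000) is below the 4,500 expected under independence, which is consistent with a negative correlation.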

6.7. Summary

Association rule mining
  Regarded as one of the most important contributions of the database community to knowledge discovery.

Types of rules: Boolean/quantitative association rules, single-dimensional/multidimensional association rules, single-level/multilevel association rules, association rules/correlation rules

Types of items/patterns: frequent itemsets/subsequences/substructures, closed frequent itemsets, maximal frequent itemsets, constrained frequent itemsets, approximate frequent itemsets, top-k frequent itemsets

Discovering frequent itemsets: the Apriori algorithm and the FP-Growth algorithm, which uses an FP-tree

Questions & Answers
