
DATA MINING AND ASSOCIATION RULES

I. Data Mining

It is often said today that we are drowning in data yet starved of knowledge. Databases keep growing and now hold enormous volumes of data, which requires us, the users, to know how to mine that sea of data and pick out what is useful to us. That is why Data Mining (DM) came into being.

1. Definition

Data mining is the discovery of interesting patterns, and of useful information and data, in the course of working with very large volumes of data.

2. Purpose

Data mining analyzes data and uses software engineering techniques to find the patterns and rules in the data, and from them gains knowledge and understanding of the data at hand. The main purpose of DM is therefore not just to retrieve information already stored in the data but, more importantly, to obtain the insight that can be derived from that data. That insight is the important information to aim for.

3. Functions of DM

- Data classification
- Data association
- Data sequencing...

4. Comparison with a DBMS

- DBMS: queries the database. Example: find the purchase records that contain both item A and item B.
- DM: analyzes query results and discovers insight from them. Example: why do the companies that produce items A and B cooperate and share profits?

5. Applications

- Commercial transactions: finding association rules in commercial transactions
- Medicine: gene analysis
- Banking: analysis related to customers' payment cards

II. Association rules

In sales transactions, the variety of items is very large. Nevertheless, the transaction records that contain a particular set of items together make up a noteworthy proportion. We do not know who the buyers are, so the question is whether this co-occurrence is a random coincidence or follows some rule or underlying basis. This question is the premise behind association rules.

1. Definition

An association rule is a rule that expresses a relationship between two or more objects (here, the objects are items for sale). A rule has the following structure:

A => B (sup, con)

It means that the presence of A implies B, backed by a support and a confidence, where:

- sup = support: the proportion of transactions that contain both item A and item B.
- con = confidence: the proportion of transactions containing item B among the transactions that contain item A.
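The two measures defined above can be computed directly from a list of transactions. The following is a minimal sketch, not a reference implementation: the function name `support_confidence` and the toy transactions are hypothetical and chosen only for illustration, and support and confidence are returned as fractions rather than percentages.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support_confidence(
    transactions: List[FrozenSet[str]],
    a: Iterable[str],
    b: Iterable[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule A => B as fractions."""
    a_set, b_set = frozenset(a), frozenset(b)
    total = len(transactions)
    # Transactions that contain every item of A.
    with_a = sum(1 for t in transactions if a_set <= t)
    # Transactions that contain every item of both A and B.
    with_both = sum(1 for t in transactions if (a_set | b_set) <= t)
    support = with_both / total if total else 0.0
    confidence = with_both / with_a if with_a else 0.0
    return support, confidence


# Hypothetical toy data, not taken from the text.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"milk"}),
    frozenset({"eggs"}),
]

sup, con = support_confidence(transactions, {"bread"}, {"milk"})
print(f"bread => milk (sup={sup:.0%}, con={con:.0%})")
```

On this toy data, two of the five transactions contain both bread and milk, and three contain bread, so the rule is reported with a support of 2/5 and a confidence of 2/3.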

Example of an association rule: bread => milk (40%, 45%) means that bread implies milk on the following basis: 40% of all transactions contain both bread and milk, and 45% of the records that contain bread also contain milk.

Not every association rule between items is meaningful, however. We only care about rules that meet some baseline, called a threshold. One commonly used threshold is the minimum support, min_sup. Example: we consider only the association rules whose support is greater than min_sup, so the rules found are of higher value.

2. Significance

An important application of association rules is market analysis: analyzing customers' buying habits to find associations between the different items they buy in a single shopping trip. Example: returning to the example above, a customer who buys bread during a trip to the supermarket will usually also buy milk. Information like this can guide sellers in choosing items and their positions on the shelves. A seller can place milk and bread close together to encourage customers to buy both. Recognizing which items are commonly bought together helps the seller sell more and so increase revenue.

Association rule mining aims to find interesting associations or correlations in a large set of objects. In commercial transactions, discovering the relationships in a large number of transaction records can help many businesses handle questions such as: how should the catalog be designed?...

III. The Apriori algorithm

The question is how to find association rules in an enormous volume of data. Where in the data do the relationships appear? Which association rules are the most interesting? How do we find the interesting ones?

1. Function

Apriori is a powerful algorithm for mining frequent itemsets for logical (Boolean) association rules. Its function is to find the frequent itemsets, from which the association rules are then built.

2. Frequent itemsets

A frequent itemset is an itemset whose support meets a given minimum support threshold. Example: the itemset {A,B} meets the threshold when Sup(AB) >= min_sup.

Property: every non-empty subset of a frequent itemset is also a frequent itemset.

3. Analysis

3.1. Finding frequent itemsets using candidate generation

a. The Apriori property

Apriori uses what is already known about frequent itemsets: the frequent k-itemsets are used to explore and produce the (k+1)-itemsets. First, the frequent 1-itemsets (the set L1) are found; L1 is used to find L2, the frequent 2-itemsets; L2 is then used to find L3, and so on. Finding each Lk requires a full scan of the database.
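The scan that turns a set of candidates into a set of frequent itemsets can be sketched as follows. This is an illustrative sketch only: the helper names `count_support` and `frequent_itemsets`, the toy transactions, and the choice to treat min_sup as a fraction compared with `>=` are assumptions made for the example.

```python
from collections import Counter
from typing import FrozenSet, List, Set


def count_support(
    transactions: List[FrozenSet[str]],
    candidates: Set[FrozenSet[str]],
) -> Counter:
    """One full pass over the database, counting transactions per candidate."""
    counts: Counter = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate is contained in this transaction
                counts[c] += 1
    return counts


def frequent_itemsets(
    transactions: List[FrozenSet[str]],
    candidates: Set[FrozenSet[str]],
    min_sup: float,
) -> Set[FrozenSet[str]]:
    """Keep the candidates whose support fraction is at least min_sup."""
    n = len(transactions)
    counts = count_support(transactions, candidates)
    return {c for c in candidates if n and counts[c] / n >= min_sup}


# Hypothetical toy data, not taken from the text.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"milk"}),
    frozenset({"eggs"}),
]

# C1: every single item that appears in the database is a candidate.
c1 = {frozenset({item}) for t in transactions for item in t}
l1 = frequent_itemsets(transactions, c1, min_sup=0.4)
print(sorted(sorted(s) for s in l1))
```

Here butter appears in only one of five transactions and is dropped, while bread, milk and eggs remain in L1. The same two functions could be reused for each later level, which is why each Lk costs one pass over the data.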
From the property of frequent itemsets we can reason as follows: if an itemset I does not satisfy the minimum support threshold, min_sup, then I is not frequent, that is, P(I) < min_sup. If an item A is added to the itemset I to give I ∪ A, the result cannot occur more often than I, so I ∪ A is not a frequent itemset either, that is, P(I ∪ A) < min_sup.

b. Generating Lk from Lk-1 proceeds as follows:

Step 1: Join - finding Ck

Ck is the set of candidate k-itemsets generated by joining Lk-1 with itself. The join of Lk-1 with Lk-1 is defined as follows: li is the i-th itemset in Lk-1, and li[j] is the j-th item of the itemset li. Two itemsets li and lt in Lk-1 are joined if and only if their first (k-2) items are identical. The condition li[k-1] < lt[k-1] ensures that no duplicates are generated in Ck. The result of joining li and lt is li[1] li[2] ... li[k-2] li[k-1] lt[k-1].

Step 2: Prune

Ck is a superset of Lk: its members may or may not be frequent, but every frequent k-itemset is contained in Ck. Scanning the database and counting the members of Ck removes those that do not meet the support threshold and yields Lk. The size of Ck is reduced as follows:

- No (k-1)-itemset that is not frequent can be a subset of a frequent k-itemset.
- If any (k-1)-subset of a candidate k-itemset is not in Lk-1, that candidate is not a frequent itemset and is removed from Ck.
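The join and prune steps above can be sketched together in a few lines. This is a sketch under stated assumptions, not a definitive implementation: the function name `apriori_gen` is hypothetical, itemsets are represented as tuples with their items in sorted order, and the example L2 is made up. Python tuples are indexed from 0, so the text's li[k-1] corresponds to `a[k - 2]` in the code.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple

Itemset = Tuple[str, ...]


def apriori_gen(l_prev: Iterable[Iterable[str]], k: int) -> Set[Itemset]:
    """Build the candidate k-itemsets C_k from the frequent (k-1)-itemsets."""
    # Assumption: items are kept in sorted order inside every itemset.
    prev_sorted = sorted(tuple(sorted(s)) for s in l_prev)
    prev_set = set(prev_sorted)
    candidates: Set[Itemset] = set()
    for a in prev_sorted:
        for b in prev_sorted:
            # Join: identical first k-2 items, and a's last item < b's last item.
            if a[: k - 2] == b[: k - 2] and a[k - 2] < b[k - 2]:
                cand = a + (b[k - 2],)
                # Prune: every (k-1)-subset of the candidate must be in L_{k-1}.
                if all(sub in prev_set for sub in combinations(cand, k - 1)):
                    candidates.add(cand)
    return candidates


# Hypothetical L2, not taken from the text.
l2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
c3 = apriori_gen(l2, k=3)
print(c3)
```

In this example the join produces (A, B, C) and (B, C, D). The second is pruned because its subset (C, D) is not in L2, so only (A, B, C) remains as a candidate whose support still has to be counted against the database.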
