Da Nang University of Economics, Faculty of Statistics and Informatics
Course: Databases

EXPLORING THE WEKA SOFTWARE

Instructor: Mr. Nguyễn Văn Chức
Presented by: Group 15

OUTLINE
Introduction to the classification functionality
Some common classifiers
    ID3 decision tree
    NaiveBayes
Summary

WHAT IS CLASSIFICATION?
Classification is a data mining task: given a predefined set of classes, assign a new sample to one of those classes with the highest possible accuracy.
Examples:
    Predicting whether a tumor is benign or malignant.
    Categorizing documents by topic: news, sports, education...
Weka supports classification in the Explorer, which belongs to its Applications group of tools.

CLASSIFICATION WITH WEKA

This panel lets the user choose one of the built-in classification algorithms and apply it to the data.

Step 1: click the Choose button to open the algorithm selection dialog.
Step 2: click the text that shows the selected algorithm to open the parameter dialog.
Step 3: click the Start button to run the algorithm on the current data.

CLASSIFICATION WITH WEKA

This is the output produced after a successful run. It includes information about the dataset, the classification model (a decision tree, probability values), the predictions on the test set, and summary statistics.

CLASSIFICATION WITH WEKA
This is the list that records each run. The results of a run can be saved to a file.

CLASSIFICATION WITH WEKA
This is the panel for choosing the test mode used to evaluate the classifier that has been built.
    Use training set: use the training set itself as the test set.
    Supplied test set: specify a separate dataset as the test set.
    Cross-validation: evaluate with cross-validation (the data is divided into folds, and each fold is used once as the test set).
    Percentage split: split the original dataset into a training set and a test set by a given percentage.
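The two splitting modes can be illustrated with a short sketch. This is plain Python rather than Weka's own code; the helper names `percentage_split` and `k_fold_splits` are made up for this example, and Weka's actual shuffling, rounding, and stratification may differ.

```python
import random

def percentage_split(rows, train_percent, seed=1):
    """Shuffle a copy of rows, then cut it into (train, test) by percentage."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(rows, k):
    """Yield (train, test) pairs; every row lands in exactly one test fold."""
    for i in range(k):
        test = rows[i::k]  # rows whose position j satisfies j % k == i
        train = [r for j, r in enumerate(rows) if j % k != i]
        yield train, test

rows = list(range(14))  # stand-ins for 14 instances
train, test = percentage_split(rows, 66)
folds = list(k_fold_splits(rows, 7))
```

With 14 rows and a 66 % split, this sketch puts 9 rows in the training set and 5 in the test set. With 7 folds, each test fold holds 2 rows.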

CLASSIFICATION WITH WEKA
Other utility options:
    Choosing the class attribute
    Choosing what to include in the output


CLASSIFICATION ALGORITHMS

Weka supports a fairly wide range of classification algorithms. They are organized into groups according to how they work. Some representative groups:
    Bayes: Bayesian networks, NaiveBayes
    Functions: SVM, regression functions
    Trees: ID3, J48
    Rules: rule-based methods

DECISION TREES
A decision tree is a tree-shaped classification model: starting from some attributes (internal nodes), one follows the branches until reaching a classification decision for the sample (a leaf node). Examples: ID3, J48

STEPS TO FOLLOW

ANALYZING THE RESULTS
Summary of the run: the algorithm used, the input data (relation name, attributes), and the test mode. The Scheme line gives the algorithm name followed by its parameters.

    === Run information ===

    Scheme:       weka.classifiers.trees.Id3 ..
    Relation:     weather.symbolic
    Instances:    14
    Attributes:   5
                  outlook
                  temperature
                  humidity
                  windy
                  play
    Test mode:    evaluate on training data

ANALYZING THE RESULTS
The decision tree built by the ID3 algorithm from the weather data.

    === Classifier model (full training set) ===

    Id3

    outlook = sunny
    |  humidity = high: no
    |  humidity = normal: yes
    outlook = overcast: yes
    outlook = rainy
    |  windy = TRUE: no
    |  windy = FALSE: yes

    Time taken to build model: 0 seconds
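ID3 chooses, at each node, the attribute with the highest information gain, which is why outlook ends up at the root of the tree above. The sketch below computes the gain for each attribute. The 14 rows are not listed in the slides; they are assumed to match the weather sample file shipped with Weka, and they are consistent with the tree and the class counts shown in this deck.

```python
from collections import Counter
from math import log2

ATTRS = ["outlook", "temperature", "humidity", "windy"]
# Assumed copy of the 14-instance weather data; the last column is the class (play).
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attr_index):
    """Class entropy minus the weighted entropy after splitting on one attribute."""
    remainder = 0.0
    for value in set(r[attr_index] for r in rows):
        subset = [r[-1] for r in rows if r[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)
```

On this data, `best` is `outlook`, with a gain of roughly 0.247 bits.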

ANALYZING THE RESULTS
Compares the predicted class of each sample with its actual class. To enable this output, choose More options > Output predictions.

    === Predictions on test data ===

    inst#   actual   predicted   error   probability distribution
    1       2:no     2:no                 0    *1
    2       1:yes    1:yes               *1     0
    3       2:no     2:no                 0    *1
    4       1:yes    1:yes               *1     0
    5       2:no     1:yes       +       *1     0
    6       1:yes    1:yes               *1     0
    7       2:no     2:no                 0    *1
    8       1:yes    2:no        +        0    *1

ANALYZING THE RESULTS
Statistics on the proportion of correctly and incorrectly classified samples, along with several common error measures.

    === Summary ===

    Correctly Classified Instances          12               85.7143 %
    Incorrectly Classified Instances         2               14.2857 %
    Kappa statistic                          0.6889
    Mean absolute error                      0.1429
    Root mean squared error                  0.378
    Relative absolute error                 30      %
    Root relative squared error             76.6097 %
    Total Number of Instances               14
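The mean absolute error and root mean squared error in the summary can be reproduced with a small sketch. It assumes that the error is averaged over every instance and every class, and that each prediction is a hard 0/1 distribution like those in the predictions listing. The helper `mae_rmse` and the arrangement of the 14 instances are illustrative. Only the counts (12 correct, 2 wrong, one error per class) come from the deck.

```python
from math import sqrt

def mae_rmse(actual, predicted_dists):
    """actual: class index per instance; predicted_dists: class probabilities per instance."""
    abs_sum = sq_sum = 0.0
    for cls, dist in zip(actual, predicted_dists):
        for j, p in enumerate(dist):
            target = 1.0 if j == cls else 0.0  # 1 for the true class, 0 otherwise
            abs_sum += abs(target - p)
            sq_sum += (target - p) ** 2
    n = len(actual) * len(predicted_dists[0])  # instances times classes
    return abs_sum / n, sqrt(sq_sum / n)

# Class 0 = yes, class 1 = no: 8 yes and 4 no predicted correctly, then 1 wrong yes and 1 wrong no.
actual = [0] * 8 + [1] * 4 + [0, 1]
dists = [[1.0, 0.0]] * 8 + [[0.0, 1.0]] * 4 + [[0.0, 1.0], [1.0, 0.0]]
mae, rmse = mae_rmse(actual, dists)
```

Under these assumptions the sketch gives a mean absolute error of about 0.1429 and a root mean squared error of about 0.378, matching the summary.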

ANALYZING THE RESULTS
The confusion matrix shows how the classes predicted by Weka compare with the actual classes. Each column counts the samples that Weka assigned to that class; each row counts the samples that actually belong to that class. Example: column a contains 9 samples, so Weka assigned 9 samples to class a. These 9 samples come from two rows, a = yes (8) and b = no (1), so Weka misclassified 1 of them.

    === Confusion Matrix ===

     a b   <-- classified as
     8 1 | a = yes
     1 4 | b = no
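The accuracy and the Kappa statistic reported in the summary can be recomputed from this matrix. The sketch below uses the standard definition of Cohen's kappa; the function name `accuracy_and_kappa` is just a label for this example.

```python
def accuracy_and_kappa(matrix):
    """matrix[i][j] = number of samples of actual class i predicted as class j."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Agreement expected by chance: product of row and column totals per class.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

matrix = [[8, 1],   # actual yes
          [1, 4]]   # actual no
accuracy, kappa = accuracy_and_kappa(matrix)
```

Here the accuracy is 12/14, about 85.7143 %, and kappa comes out at about 0.6889, in agreement with the summary statistics.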
NAIVE BAYES
Naive Bayes is a classification model based on statistical probability, following Bayes' theorem. In Weka, we focus on the simplest Bayes variant, NaiveBayesSimple. Usage: follow the same steps as for the ID3 decision tree. Instead of a decision tree, the output model is a set of probability values.

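ANALYZING THE RESULTS

    === Classifier model (full training set) ===

    Naive Bayes (simple)

    Class yes: P(C) = 0.625

    Attribute outlook
    sunny        overcast     rainy
    0.25         0.41666667   0.33333333
    ..

    Class no: P(C) = 0.375

    Time taken to build model: 0 seconds

The probabilities are computed with Laplace smoothing. For a class:

$$P(\text{play} = \text{yes}) = \frac{N(\text{play} = \text{yes}) + 1}{N + n}$$

where $N$ is the total number of instances and $n$ is the total number of classes. For an attribute value given a class:

$$P(\text{outlook} = \text{sunny} \mid \text{play} = \text{yes}) = \frac{N(\text{outlook} = \text{sunny},\ \text{play} = \text{yes}) + 1}{N(\text{play} = \text{yes}) + m}$$

where $m$ is the total number of values of the attribute.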
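The Laplace-smoothed values in this output can be checked with a few lines of Python. The class counts (9 yes, 5 no) follow from the row totals of the confusion matrix in this deck. The per-value outlook counts (2 sunny, 4 overcast, 3 rainy among the yes instances) are an assumption chosen to be consistent with the printed probabilities, since the slides do not list the raw data.

```python
def laplace(count, total, categories):
    """Laplace-smoothed estimate: (count + 1) / (total + number of categories)."""
    return (count + 1) / (total + categories)

N = 14                        # total instances
CLASS_COUNTS = {"yes": 9, "no": 5}
# Assumed outlook counts among the 9 "yes" instances.
OUTLOOK_GIVEN_YES = {"sunny": 2, "overcast": 4, "rainy": 3}

# Class priors: n = 2 classes.
priors = {c: laplace(k, N, len(CLASS_COUNTS)) for c, k in CLASS_COUNTS.items()}

# Conditional probabilities: m = 3 outlook values.
conditionals = {
    v: laplace(k, CLASS_COUNTS["yes"], len(OUTLOOK_GIVEN_YES))
    for v, k in OUTLOOK_GIVEN_YES.items()
}
```

The priors come out as 10/16 = 0.625 and 6/16 = 0.375, and the conditionals as 3/12, 5/12, and 4/12, matching the values in the Weka output.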


SUMMARY
Classification is supported in Weka's Explorer. It lets the user classify data through a two-step process:
    Training: build a classifier from training data that has already been labeled with classes.
    Prediction: use the classifier to decide which class a new sample belongs to.
Some common classifiers: decision trees (ID3, J48), NaiveBayes, kNN.
