PRESENTATION OUTLINE
Introduction to the classification function
Some common classifiers: ID3 decision tree, NaiveBayes
Summary
WHAT IS CLASSIFICATION?
Classification is a data mining task in which, given a predefined set of classes, a new sample is assigned to one of those classes with the highest possible accuracy.
Examples:
Predicting whether a tumor is benign or malignant.
Categorizing documents by topic: news, sports, education...
Weka supports classification in the Explorer function of the Applications group.
CLASSIFICATION WITH WEKA
This function lets the user choose one of the built-in classification algorithms and apply it to the data.
Step 1: click the Choose button to open the algorithm selection dialog.
Step 2: click the field that displays the algorithm to open the parameter dialog.
The output area shows the data obtained after a successful run: information about the dataset, the classification model (decision tree, probability values), the predictions on the test set, and summary statistics.
A list panel records information about each run. The results of a run can be saved to a file.
The test options panel selects the test mode used to evaluate the classifier that has been built.
Use training set: evaluate on the training set itself.
Supplied test set: specify a separate dataset for testing.
Cross-validation: test by cross-validation.
Percentage split: split the dataset by percentage into a training part and a test part.
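The test options differ only in which rows are used to build the classifier and which are used to score it. The sketch below illustrates the index bookkeeping for cross-validation and percentage split in plain Python. It is not Weka's implementation (Weka's own procedure may differ in details such as shuffling), and the helper names `k_fold_indices` and `percentage_split` are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def percentage_split(n_samples, train_percent):
    """Return (train_idx, test_idx), using the first train_percent % for training."""
    cut = round(n_samples * train_percent / 100)
    indices = list(range(n_samples))
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    # 14 samples, the size of the weather data used later in these slides.
    for train_idx, test_idx in k_fold_indices(14, 5):
        print(len(train_idx), "training rows, test rows:", test_idx)
    train_idx, test_idx = percentage_split(14, 66)
    print(len(train_idx), "training rows,", len(test_idx), "test rows")
```

Every sample lands in exactly one test fold, so cross-validation scores each sample once with a model that never saw it during training.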
Other utility options.
Result output options.
DECISION TREE
A tree-shaped classification model in which, starting from some attribute (an internal node), one follows the branches to reach a classification decision for a sample (a leaf node). Examples: ID3, J48.
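ID3 picks the attribute to split on by information gain, that is, the reduction in class entropy after the split. The sketch below computes it for the 14-row weather data that these slides use later. The rows are typed in from the standard Weka sample file and should be treated as an assumption, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]

# Columns: outlook, temperature, humidity, windy, play (class label).
# Assumed contents of the Weka weather sample data.
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset = [row[-1] for row in rows if row[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


if __name__ == "__main__":
    for i, name in enumerate(ATTRIBUTES):
        print(f"{name}: {information_gain(ROWS, i):.3f}")
```

With these rows, outlook has the largest gain (about 0.247), which is consistent with outlook being the root of the ID3 tree printed for this data later in the slides.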
STEPS TO FOLLOW
ANALYZING THE RESULTS
Summary information about the run: the algorithm used, the input data (name, attributes), and the test mode. In the Scheme line, the class name is the algorithm name and whatever follows it is the accompanying parameters.
=== Run information ===
Scheme:       weka.classifiers.trees.Id3 ..
Relation:     weather.symbolic
Instances:    14
Attributes:   5
              outlook
              temperature
              humidity
              windy
              play
Test mode:    evaluate on training data
The decision tree built by the ID3 algorithm from the weather data.
=== Classifier model (full training set) ===
Id3

outlook = sunny
|  humidity = high: no
|  humidity = normal: yes
outlook = overcast: yes
outlook = rainy
|  windy = TRUE: no
|  windy = FALSE: yes

Time taken to build model: 0 seconds
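To show how such a tree is read, the sketch below hand-transcribes the printed model into a nested Python dictionary and walks it from the root to a leaf for one sample. The dictionary layout and the `classify` helper are illustrative assumptions, not Weka's internal representation.

```python
# Internal nodes are dicts {"attribute": name, "branches": {value: subtree}};
# leaves are plain class-label strings.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {"TRUE": "no", "FALSE": "yes"},
        },
    },
}


def classify(tree, sample):
    """Follow the branch matching the sample's attribute value until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node


if __name__ == "__main__":
    sample = {"outlook": "sunny", "temperature": "cool",
              "humidity": "normal", "windy": "FALSE"}
    # Path taken: outlook = sunny, then humidity = normal.
    print(classify(TREE, sample))
```

Attributes that do not appear on the path (here temperature and windy) are simply never consulted for that sample.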
Compare the prediction for each sample with its actual class. To enable this output, choose More options > Output predictions.
=== Predictions on test data ===
inst#   actual   predicted   error   probability distribution
1       2:no     2:no                0     *1
2       1:yes    1:yes               *1    0
3       2:no     2:no                0     *1
4       1:yes    1:yes               *1    0
5       2:no     1:yes       +       *1    0
6       1:yes    1:yes               *1    0
7       2:no     2:no                0     *1
8       1:yes    2:no        +       0     *1
Statistics on the rate of correct and incorrect classifications, together with several common error measures.
=== Summary ===
Correctly Classified Instances        85.7143 %
Incorrectly Classified Instances      14.2857 %
Kappa statistic
Mean absolute error
Root mean squared error
Relative absolute error
Root relative squared error
Total Number of Instances
The confusion matrix shows how the classes predicted by Weka are distributed against the actual classes. Each column gives the number of samples that Weka assigned to the corresponding class; each row gives the number of samples that actually belong to the corresponding class. Example: column a contains 9 samples, so Weka assigned 9 samples to class a. These 9 samples fall in two rows, a = yes (8) and b = no (1), so Weka misclassified 1 of them.
=== Confusion Matrix ===
 a b   <-- classified as
 8 1 | a = yes
 1 4 | b = no
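The percentages in the summary can be recomputed from the confusion matrix. The sketch below does this for accuracy and for Cohen's kappa, using the standard definition (assumed here to be what the Kappa statistic line reports). The function names are hypothetical.

```python
def accuracy(matrix):
    """Fraction of samples on the diagonal (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement from the marginal totals of each class.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    matrix = [[8, 1],   # actual yes
              [1, 4]]   # actual no
    print(f"Accuracy: {100 * accuracy(matrix):.4f} %")
    print(f"Kappa:    {cohen_kappa(matrix):.4f}")
```

For this matrix, 12 of the 14 samples lie on the diagonal, which gives the 85.7143 % reported as correctly classified.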
NAIVE BAYES
A classification model based on statistical probability, following Bayes' theorem. In Weka, we focus on the simplest form of Bayes classifier, NaiveBayesSimple. Usage: the same steps as for the ID3 decision tree, except that the output is a set of probability values instead of a decision tree model.
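ANALYZING THE RESULTS
=== Classifier model (full training set) ===
Naive Bayes (simple)

Class yes: P(C) = 0.625

Attribute outlook
sunny        overcast     rainy
0.25         0.41666667   0.33333333
..

The probabilities are estimated with Laplace smoothing:
$P(\text{play} = \text{yes}) = \dfrac{N(\text{play} = \text{yes}) + 1}{N + n}$, where $N$ is the total number of samples and $n$ is the total number of classes.
$P(\text{outlook} = \text{sunny} \mid \text{play} = \text{yes}) = \dfrac{N(\text{outlook} = \text{sunny},\ \text{play} = \text{yes}) + 1}{N(\text{play} = \text{yes}) + m}$, where $m$ is the total number of values of the attribute.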
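The probabilities printed above can be reproduced by hand. The sketch below applies Laplace smoothing to counts from the weather data: 9 of the 14 samples have play = yes, and among those outlook is sunny 2 times, overcast 4 times and rainy 3 times. These counts come from the standard Weka sample file and are an assumption here; the function name `laplace` is hypothetical.

```python
def laplace(count, total, n_outcomes):
    """Laplace-smoothed estimate: (count + 1) / (total + n_outcomes)."""
    return (count + 1) / (total + n_outcomes)


if __name__ == "__main__":
    # Class prior: 9 "yes" samples out of 14, with 2 classes.
    print("P(play = yes) =", laplace(9, 14, 2))

    # Conditional probabilities of outlook given play = yes (3 possible values).
    outlook_counts_given_yes = {"sunny": 2, "overcast": 4, "rainy": 3}
    for value, count in outlook_counts_given_yes.items():
        print(f"P(outlook = {value} | yes) =", round(laplace(count, 9, 3), 8))
```

The results are 0.625 for the class prior and 0.25, 0.41666667 and 0.33333333 for sunny, overcast and rainy, matching the model output. Adding 1 to every count keeps a value that never occurs with a class from getting probability zero.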
SUMMARY
Classification is supported in the Explorer function of Weka. It lets the user classify data through a two-step process:
Training: build a classifier from training data that has already been labeled.
Prediction: use the classifier to decide which class a new sample belongs to.
Some common classifiers: decision trees (ID3, J48), NaiveBayes, kNN.
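The two steps can be sketched with kNN, the simplest of the classifiers listed. The code below is a minimal illustration in plain Python, not Weka's implementation: the class name `KNNClassifier`, the choice of Hamming distance for nominal attributes, and the five toy samples are all assumptions made up for this example.

```python
from collections import Counter


class KNNClassifier:
    """k-nearest-neighbour classifier for nominal attributes (Hamming distance)."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []

    def train(self, samples, labels):
        # Training step: kNN simply stores the labeled samples.
        self.examples = list(zip(samples, labels))

    def predict(self, sample):
        # Prediction step: majority vote among the k closest stored samples.
        def distance(other):
            return sum(a != b for a, b in zip(sample, other))

        nearest = sorted(self.examples, key=lambda pair: distance(pair[0]))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy (outlook, humidity) samples, invented for illustration only.
    samples = [("sunny", "high"), ("sunny", "normal"), ("overcast", "high"),
               ("rainy", "high"), ("rainy", "normal")]
    labels = ["no", "yes", "yes", "no", "yes"]

    model = KNNClassifier(k=3)
    model.train(samples, labels)
    print(model.predict(("overcast", "normal")))
```

The same train-then-predict shape applies to decision trees and NaiveBayes; only what is stored during training differs (a tree, or a table of probabilities).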