Chapter 4, Part 2
A decision tree is a type of predictive model.
The machine learning technique that induces decision trees is called decision tree learning, or simply decision trees for short.
It is a descriptive tool for computing conditional probabilities.
It combines mathematical and computational techniques to help describe, classify, and generalize a given data set.
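Definition of a decision tree
A decision tree is a hierarchical structure of nodes and branches.
There are three kinds of nodes in the tree:
    Root node
    Internal node: labeled with the name of an attribute of the database
    Leaf node: labeled with the name of a class Ci
Branch: labeled with a possible value of the attribute
A decision tree classifies an object by traversing the tree from the root node until a leaf node is reached; the class at that leaf is assigned to the object.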
Example
David is the manager of a famous golf club. He has trouble predicting whether members will show up. On some days everyone wants to play golf and the club does not have enough staff to serve them. On other days, for no apparent reason, nobody comes and the club is overstaffed.
David's goal is to optimize the number of staff on duty each day by using the weather forecast to predict when people will come to play golf. To do that, he needs to understand why customers decide to play and whether that decision can be explained.
So for two weeks he records:
    Outlook: sunny, overcast, or raining
    Temperature, in degrees F
    Humidity
    Wind: whether it is strong or not
    And, of course, the number of people who come to play golf that day
David ends up with a data set of 14 rows and 5 columns.
Example: the collected data
Day  Outlook   Temp.  Humidity  Wind    Play?
1    Sunny     Hot    High      Weak    No
2    Sunny     Hot    High      Strong  No
3    Overcast  Hot    High      Weak    Yes
4    Rain      Mild   High      Weak    Yes
5    Rain      Cool   Normal    Weak    Yes
6    Rain      Cool   Normal    Strong  No
7    Overcast  Cool   Normal    Strong  Yes
8    Sunny     Mild   High      Weak    No
9    Sunny     Cool   Normal    Weak    Yes
10   Rain      Mild   Normal    Weak    Yes
11   Sunny     Mild   Normal    Strong  Yes
12   Overcast  Mild   High      Strong  Yes
13   Overcast  Hot    Normal    Weak    Yes
14   Rain      Mild   High      Strong  No
Example
Determining when to play golf and when not to:
Outlook
    Sunny: Humidity
        High: No
        Normal: Yes
    Overcast: Yes
    Rain: Wind
        Strong: No
        Weak: Yes
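Traversing the decision tree
Record to classify: Day = 1, Outlook = Sunny, Temp. = Hot, Humidity = High, Wind = Weak, Play? = No
Starting at the root, Outlook = Sunny leads to the Humidity node, and Humidity = High leads to the leaf No, which agrees with the recorded Play? = No.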
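A tree that answers Yes only when Outlook = Sunny and Wind = Weak:
Outlook
    Sunny: Wind
        Strong: No
        Weak: Yes
    Overcast: No
    Rain: No
A tree that answers Yes when Outlook = Sunny or Wind = Weak:
Outlook
    Sunny: Yes
    Overcast: Wind
        Strong: No
        Weak: Yes
    Rain: Wind
        Strong: No
        Weak: Yes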
Building a decision tree
The tree is constructed top-down.
Continuous-valued attributes are discretized beforehand.
All training samples start at the root of the tree.
An attribute is selected to split the samples into branches; it is chosen using a statistical or heuristic measure.
The construction is then repeated recursively for each branch.
Building a decision tree
Stopping conditions
    All samples that fall into a node belong to the same class (the node becomes a leaf).
    No attribute remains that can be used to split the samples further.
    No samples remain at the node.
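The top-down procedure and its stopping conditions can be sketched as follows. This is a minimal ID3-style illustration, not necessarily the exact algorithm the slides have in mind. It assumes information gain as the selection measure and assumes the rows are the classic 14-record golf data behind the example; the names `ROWS`, `ATTRIBUTES`, `split`, and `build` are hypothetical. Falling back to the majority class when no attribute is left is also an assumption of this sketch.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temp.", "Humidity", "Wind"]
# Assumed data: the classic 14 golf records; the last field is the class (Play?).
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(rows):
    """Entropy of the class labels found in `rows`."""
    counts = Counter(row[-1] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def split(rows, index):
    """Group rows by the value of the attribute at position `index`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[index], []).append(row)
    return groups


def gain(rows, index):
    """Information gain obtained by splitting `rows` on one attribute."""
    parts = split(rows, index).values()
    return entropy(rows) - sum(len(p) / len(rows) * entropy(p) for p in parts)


def build(rows, available):
    classes = Counter(row[-1] for row in rows)
    # Stop: all samples share one class, or no attribute is left to split on
    # (in the latter case the majority class is used, an assumption here).
    if len(classes) == 1 or not available:
        return classes.most_common(1)[0][0]
    best = max(available, key=lambda i: gain(rows, i))
    rest = [i for i in available if i != best]
    # The "no samples left" case cannot arise here, because branches are only
    # created for attribute values that actually occur in `rows`.
    return {"attribute": ATTRIBUTES[best],
            "branches": {v: build(g, rest) for v, g in split(rows, best).items()}}


tree = build(ROWS, list(range(len(ATTRIBUTES))))
print(tree["attribute"])
```

On this data the sketch selects Outlook at the root, then Humidity under Sunny and Wind under Rain.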
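Information gain
Attribute A has the values {a1, a2, ..., an}.
Attribute A is used to partition the training set S into n subsets {S1, S2, ..., Sn}.
s_ij: the number of samples of class Ci in subset Sj (A = aj).
Expected information for m classes, with s_i samples of class Ci out of s samples in total:
\[ I(s_1, \dots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s} \]
Entropy of attribute A:
\[ E(A) = \sum_{j=1}^{n} \frac{s_{1j} + \dots + s_{mj}}{s} \, I(s_{1j}, \dots, s_{mj}) \]
Information gain obtained by branching on attribute A:
\[ Gain(A) = I(s_1, \dots, s_m) - E(A) \]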
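As a rough illustration, the three quantities can be computed directly from class counts. The function names `info`, `expected_info`, and `gain` are hypothetical, and the numbers fed in are the golf counts used in the example slides (9 Yes and 5 No overall; Humidity splits them into [3+, 4-] and [6+, 1-]).

```python
from math import log2


def info(counts):
    """I(s1, ..., sm): expected information of a vector of class counts."""
    total = sum(counts)
    # Classes with zero samples contribute nothing (0 * log 0 is taken as 0).
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def expected_info(partition):
    """E(A): size-weighted information of the subsets S1..Sn.

    `partition` holds one class-count vector per subset.
    """
    total = sum(sum(counts) for counts in partition)
    return sum(sum(counts) / total * info(counts) for counts in partition)


def gain(class_counts, partition):
    """Gain(A) = I(s1, ..., sm) - E(A)."""
    return info(class_counts) - expected_info(partition)


print(round(info([9, 5]), 3))
print(round(gain([9, 5], [[3, 4], [6, 1]]), 3))
```

Working from counts rather than raw rows keeps the functions close to the notation of the formulas: each inner list plays the role of (s_1j, ..., s_mj).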
Information gain, example
We have:
    s = 14 samples and m = 2 classes, C1 = Yes and C2 = No
    s1 = 9, s2 = 5
Information gain, example
Humidity
    High:    [3+, 4-]   E = 0.985
    Normal:  [6+, 1-]   E = 0.592
Information gain, example
Wind
    Weak:    [6+, 2-]   E = 0.811
    Strong:  [3+, 3-]   E = 1.000
Information gain, example
Outlook
    Sunny:     [2+, 3-]   E = 0.971
    Overcast:  [4+, 0-]   E = 0.000
    Rain:      [3+, 2-]   E = 0.971
Gain(S, Wind) = 0.048
Gain(S, Humidity) = 0.151
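For reference, the gains can be worked out by hand from the per-branch entropies listed on these slides, weighting each branch by its share of the 14 samples. The arithmetic below uses the rounded entropies, so the last digit may differ slightly from an exact calculation. The Outlook line is not stated in the slides; it is derived here from the same formula.

```latex
\[
\begin{aligned}
I(9, 5) &= -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940 \\
Gain(S, \text{Humidity}) &= 0.940 - \tfrac{7}{14}(0.985) - \tfrac{7}{14}(0.592) \approx 0.151 \\
Gain(S, \text{Wind}) &= 0.940 - \tfrac{8}{14}(0.811) - \tfrac{6}{14}(1.000) \approx 0.048 \\
Gain(S, \text{Outlook}) &= 0.940 - \tfrac{5}{14}(0.971) - \tfrac{4}{14}(0.000) - \tfrac{5}{14}(0.971) \approx 0.246
\end{aligned}
\]
```

Outlook has the largest gain of the three, which is consistent with Outlook being tested at the root of the golf tree.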
Gini index
Gini index of node t:
\[ GINI(t) = 1 - \sum_{j} [p(j \mid t)]^2 \]
where p(j | t) is the relative frequency of class j at node t.
Gini index, example
\[ GINI(t) = 1 - \sum_{j} [p(j \mid t)]^2 \]
    C1 = 0, C2 = 6:  P(C1) = 0/6 = 0, P(C2) = 6/6 = 1;  GINI = 1 - (P(C1)^2 + P(C2)^2) = 1 - (0 + 1) = 0
    C1 = 1, C2 = 5:  P(C1) = 1/6, P(C2) = 5/6;  GINI = 1 - (1/6)^2 - (5/6)^2 = 0.278
    C1 = 2, C2 = 4:  P(C1) = 2/6, P(C2) = 4/6;  GINI = 1 - (2/6)^2 - (4/6)^2 = 0.444
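The three node values can be checked with a short sketch. The function name `gini` is a hypothetical choice, and p(j | t) is taken to be the relative class frequency at the node.

```python
def gini(counts):
    """GINI(t) = 1 - sum_j p(j|t)^2 for a node given as a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# The three nodes from the example, as (C1, C2) counts.
for counts in ([0, 6], [1, 5], [2, 4]):
    print(counts, round(gini(counts), 3))
```

A pure node gives 0, and for two classes the value grows toward its maximum of 0.5 as the classes become evenly mixed.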
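Gini split
\[ GINI_{split} = \sum_{i=1}^{k} \frac{n_i}{n} \, GINI(i) \]
where
    n_i is the number of samples in child node i
    n is the number of samples in the parent node p
    k is the number of child nodes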
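Gini split, example
Parent node p: C1 = 6, C2 = 6, Gini = 0.500
Split into two child nodes:
          N1   N2
    C1     5    1
    C2     2    4
Gini(N1) = 1 - (5/7)^2 - (2/7)^2 = 0.408
Gini(N2) = 1 - (1/5)^2 - (4/5)^2 = 0.320
Gini split = (7/12)(0.408) + (5/12)(0.320) = 0.371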
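A small sketch of the weighted sum, with hypothetical function names `gini` and `gini_split`. For the child counts N1 = (5, 2) and N2 = (1, 4), the weighted Gini evaluates to about 0.371, against 0.500 for the parent.

```python
def gini(counts):
    """Gini index of a node given as a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_split(children):
    """Gini of a split: each child's Gini weighted by its share of the samples."""
    n = sum(sum(child) for child in children)   # samples in the parent node
    return sum(sum(child) / n * gini(child) for child in children)


parent = [6, 6]
children = [[5, 2], [1, 4]]   # N1 and N2 as (C1, C2) counts
print(round(gini(parent), 3), round(gini_split(children), 3))
```

A split is worth making when its weighted Gini is lower than the parent's, as it is here.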
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Binary split on the continuous attribute Tax: > 80K versus < 80K
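One way to evaluate such a threshold split is to partition the records at the cut point and compute the weighted Gini of the two sides. The sketch below makes several assumptions: the third column is the taxable income in thousands, the last column is the class label, and the names `RECORDS` and `gini_for_threshold` are hypothetical.

```python
from collections import Counter

# (income in thousands, class label) for the ten records; column meanings assumed.
RECORDS = [(125, "No"), (100, "No"), (70, "No"), (120, "No"), (95, "Yes"),
           (60, "No"), (220, "No"), (85, "Yes"), (75, "No"), (90, "Yes")]


def gini(labels):
    """Gini index of a list of class labels."""
    counts = Counter(labels)
    return 1.0 - sum((c / len(labels)) ** 2 for c in counts.values())


def gini_for_threshold(records, threshold):
    """Weighted Gini of the binary split income < threshold / income >= threshold."""
    low = [label for income, label in records if income < threshold]
    high = [label for income, label in records if income >= threshold]
    parts = [part for part in (low, high) if part]   # skip an empty side
    return sum(len(part) / len(records) * gini(part) for part in parts)


print(round(gini_for_threshold(RECORDS, 80), 3))
```

Repeating the calculation for other candidate thresholds and keeping the one with the lowest value is a common way to pick the cut point.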
Each path from the root to a leaf of the golf tree gives one rule:
R1: If (Outlook=Sunny) and (Humidity=High) Then Play=No
R2: If (Outlook=Sunny) and (Humidity=Normal) Then Play=Yes
R3: If (Outlook=Overcast) Then Play=Yes
R4: If (Outlook=Rain) and (Wind=Strong) Then Play=No
R5: If (Outlook=Rain) and (Wind=Weak) Then Play=Yes
Advantages of decision trees
Decision trees are easy to understand.
They need little or no data preparation.
They can handle both numerical and categorical data.
A decision tree is a white-box model.
A model can be validated with statistical tests.
They can process large amounts of data well in a short time.