8 Machine Learning


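Học viện Công nghệ Bưu chính Viễn thông (Posts and Telecommunications Institute of Technology)

Faculty of Information Technology 1

Introduction to Artificial Intelligence

Introduction to Machine Learning
Ngô Xuân Bách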

Contents
Introduction
Decision tree learning
Naive Bayes classification
Instance-based learning

http://www.ptit.edu.vn

References

N. Nilsson. Introduction to Machine Learning. http://ai.stanford.edu/people/nilsson/mlbook.html
T. Mitchell. Machine Learning. McGraw-Hill, 1997.
E. Alpaydin. Introduction to Machine Learning. MIT Press, 2004.
M. Mohri, A. Rostamizadeh, A. Talwalkar. Foundations of Machine Learning. MIT Press, 2012.

Tools and data
Weka toolkit
  o http://www.cs.waikato.ac.nz/~ml/weka
UC Irvine sample data repository
  o http://www.ics.uci.edu/~mlearn/ML/Repository.html

Some applications of machine learning (1/3)
Applications that are hard to program in the conventional way, because the required human experience or skill does not exist or is hard to explain
  o Recognition of handwriting, speech, and images
  o Autonomous driving, Mars exploration
Adaptive computer programs: the solution changes over time or according to the specific situation
  o Personal assistant programs
  o Network routing

Some applications of machine learning (2/3)
Data mining (data analysis)
  o Medical records → medical knowledge
  o Sales data → business rules

Some applications of machine learning (3/3)
Most artificial intelligence applications today use machine learning

What is machine learning?
Learning:
  o Acquiring knowledge or skills
  o Solving problems from experience
Machine learning: learning carried out by a computer program that is able to
  o Perform a task T better
  o According to a performance criterion P
  o By using sample data or experience E

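Examples
Learning to play chess
  o T: playing chess
  o P: number of games won
  o E: experience gained from self-play
Learning to recognize characters
  o T: recognizing characters from images
  o P: percentage of characters recognized correctly
  o E: digital images of characters together with the corresponding characters
Machine translation
  o T: translating an English sentence into Vietnamese
  o P: a translation quality measure (for example, number of correct sentences, number of correct clauses, …)
  o E: pairs of corresponding English and Vietnamese sentences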

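Issues to consider (1/2)
What exactly is the experience?
  o Direct and indirect experience
      o Direct: a specific board state plus the corresponding correct move
      o Indirect: an entire game and its outcome
  o Supervised (guided) and unsupervised
      o Supervised
      o Unsupervised
      o Semi-supervised
What needs to be learned? How is the learned knowledge represented?
  o The knowledge to be learned is represented as a target function; a specific target function must be chosen
  o Example for chess:
      o Choose_move: maps a board state to a move
      o Score: maps a board state to a numeric score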

Issues to consider (2/2)
Which algorithm is used for learning?
  o Use a function
      o Example: $\text{Score} = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + \dots$
  o Use rules
  o Use neural networks
  o Use decision trees
  o Use probabilistic models
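A minimal sketch of the function-based representation listed above: the score of a board is a weighted sum of numeric board features. The function name `score`, the feature values, and the weights are illustrative assumptions, not taken from the slides.

```python
def score(weights, features):
    """Linear target function: Score = w_1*x_1 + w_2*x_2 + ... + w_n*x_n."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


# Hypothetical example: four board features and made-up weights.
weights = [1.0, -1.0, 3.0, -3.0]
features = [12, 11, 2, 1]
print(score(weights, features))  # 4.0
```

Learning then amounts to adjusting the weights so that the score agrees with the available experience.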

Designing a machine learning program
Choose the data or experience
Choose the target function
Choose a representation for the target function
Choose the learning algorithm
Carry out the learning (training)

Some concepts
Sample, or example: an object to be processed (for instance, classified)
  o Example: in spam filtering, each email is a sample
A sample is usually described by a set of attributes, also called features
  o Example: in medical diagnosis, the attributes are the patient's symptoms and other parameters such as height, weight, …
Class label: indicates the category of the object that we need to predict
  o Example: the class label of an email can be "spam" or "normal"

Common types of machine learning
Supervised learning
  o Classification
  o Regression
Unsupervised learning
  o Association rule learning
  o Clustering
Reinforcement learning

Classification
[Figure: samples plotted by weight and height]

Regression
[Figure: life expectancy plotted against wealth, with a fitted line]
Applications: price prediction, driving, …

Association rule learning
Example
  o Analysis of purchase transactions (sales receipts)
  o $P(Y \mid X)$: the probability that a customer who buys item $X$ also buys item $Y$
Examples of association rules
  o People who buy bread often buy butter
  o People who buy roasted peanuts often buy beer
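The conditional probability P(Y | X) above can be estimated by simple counting. The sketch below assumes each transaction is a set of item names; the function name `confidence` and the baskets are hypothetical.

```python
def confidence(transactions, x, y):
    """Estimate P(y | x): among baskets containing x, the fraction that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)


# Hypothetical shopping baskets (each basket is a set of items).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"peanuts", "beer"},
    {"beer"},
]
print(confidence(transactions, "bread", "butter"))   # 2 of the 3 bread baskets
print(confidence(transactions, "peanuts", "beer"))   # 1 of the 1 peanut baskets
```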

Clustering
Groups similar cases together
There are no output values
Applications
  o Customer clustering, student clustering
  o Image segmentation
  o Integrated circuit design

Reinforcement learning
Experience is not given directly in the form of input / output pairs
The system receives a reward value as the result of some sequence of actions
The algorithm must learn how to act so as to maximize the reward
Example: learning to play chess
  o The system is not told which move is appropriate in each specific situation
  o It only knows the win or loss result after a sequence of moves

Contents

Introduction
Decision tree learning
Naive Bayes classification
Instance-based learning

Training data

Day  Outlook   Temperature  Humidity  Wind    Play tennis
D1   sunny     hot          high      weak    no
D2   sunny     hot          high      strong  no
D3   overcast  hot          high      weak    yes
D4   rain      mild         high      weak    yes
D5   rain      cool         normal    weak    yes
D6   rain      cool         normal    strong  no
D7   overcast  cool         normal    strong  yes
D8   sunny     mild         high      weak    no
D9   sunny     cool         normal    weak    yes
D10  rain      mild         normal    weak    yes
D11  sunny     mild         normal    strong  yes
D12  overcast  mild         high      strong  yes
D13  overcast  hot          normal    weak    yes
D14  rain      mild         high      strong  no

Training data: terminology
Each row of the training data table (D1 to D14) is a sample, the four weather columns are its attributes, and the last column (play tennis or not) is its label.

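Data
n training samples, each sample is a pair $\langle x, y \rangle$
  o $x$ is the vector of attributes
  o $y$ is the class label, $y \in C$ (the set of labels)
Example: sample D4
  o $x = \langle \text{rain}, \text{mild}, \text{high}, \text{weak} \rangle$
  o $y = \text{yes}$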

Example decision tree

Outlook
  sunny    → Humidity
               high   → no
               normal → yes
  overcast → yes
  rain     → Wind
               strong → no
               weak   → yes

What is a decision tree?
A classification model in the form of a tree
  o Each internal (non-leaf) node corresponds to a test on an attribute; each branch of the node corresponds to one value of that attribute
  o Each leaf node corresponds to a class label
Classification proceeds as follows
  o The sample to be classified travels from the root of the tree downward
  o At each internal node, the attribute associated with the node is tested; depending on the attribute's value, the sample moves down the corresponding branch
  o On reaching a leaf node, the sample receives that node's class label
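The classification procedure above can be sketched in a few lines. The nested-tuple encoding of the tree and the names `TREE` and `classify` are assumptions made for illustration; the tree itself is the play-tennis example tree with attribute values rendered in English.

```python
# Internal nodes are (attribute, {value: subtree}) pairs; leaves are label strings.
TREE = ("Outlook", {
    "sunny": ("Humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rain": ("Wind", {"strong": "no", "weak": "yes"}),
})


def classify(tree, sample):
    """Walk from the root to a leaf, following the branch that matches the sample."""
    node = tree
    while not isinstance(node, str):      # stop once a leaf (a label) is reached
        attribute, branches = node
        node = branches[sample[attribute]]
    return node


# Sample D4 from the training data.
d4 = {"Outlook": "rain", "Temperature": "mild", "Humidity": "high", "Wind": "weak"}
print(classify(TREE, d4))  # yes
```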

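Representation as rules
A decision tree can be represented equivalently as a set of logical rules
Each tree is a disjunction of rules, and each rule consists of conjunctions
Example
$$(\text{Outlook} = \text{sunny} \wedge \text{Humidity} = \text{normal}) \vee (\text{Outlook} = \text{overcast}) \vee (\text{Outlook} = \text{rain} \wedge \text{Wind} = \text{weak})$$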

Learning decision trees
A decision tree is learned (built) from training data
Many different decision trees can be built for the same data set
  o Which tree should be chosen?
Learning is the process of searching for a decision tree that fits the training data
  o One that classifies the training data correctly

The ID3 algorithm
Builds the nodes of the tree one at a time, starting from the root
Algorithm
  o Initially, the current node is the root, which holds the entire training set
  o At the current node, select an attribute that
      o Has not been used at any ancestor node
      o Splits the current data set into subsets in the best possible way
  o For each value of the selected attribute, add a child node below
  o Distribute the examples at the current node to the child nodes according to their value of the selected attribute
  o Repeat (recursively) until
      o All attributes have been used at the nodes above, or
      o All examples at the current node have the same class label
  o The label of a node is the majority label among the examples at that node
How is the attribute at each node selected?

ID3's attribute selection criterion
At each node
  o There is a (sub)set of data associated with the node
  o We need to select the attribute that splits this data set best
Criterion:
  o The more homogeneous the data after the split, the better
  o Measured by information gain (IG)
  o Choose the attribute with the largest information gain
  o IG is based on the entropy of the data (sub)set

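Entropy
When the data set $S$ has two kinds of labels, positive (+) or negative (-):
$$Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_-$$
$p_+$: fraction of positive samples, $p_-$: fraction of negative samples
General case with $c$ kinds of labels:
$$Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$$
$p_i$: fraction of the examples in $S$ that belong to class $i$
Example
$$Entropy([9+, 5-]) = -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) = 0.94$$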
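As a quick check of the entropy definition, the sketch below computes the entropy of a set from its per-class counts. The function name `entropy` and the count-list input format are choices made for this illustration.

```python
import math


def entropy(counts):
    """Entropy (in bits) of a label distribution given as a list of class counts."""
    total = sum(counts)
    result = 0.0
    for count in counts:
        if count > 0:                      # 0 * log2(0) is taken to be 0
            p = count / total
            result -= p * math.log2(p)
    return result


print(round(entropy([9, 5]), 3))   # 0.94  (9 positive, 5 negative samples)
print(entropy([7, 7]))             # 1.0   (evenly mixed set)
print(entropy([14, 0]))            # 0.0   (pure set)
```

Entropy is largest for an evenly mixed set and zero for a set whose samples all share one label, which is why a split that lowers entropy is a split toward more homogeneous subsets.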

Information gain (IG)

For a (sub)set of samples $S$ and an attribute $A$:
$$IG(S, A) = Entropy(S) - \sum_{v \in values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)$$
where:
  o $values(A)$ is the set of values of $A$
  o $S_v$ is the subset of $S$ consisting of the samples whose value of $A$ equals $v$
  o $|S|$ is the number of elements of $S$

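Example: computing IG

Compute $IG(S, \text{Wind})$, writing $H$ for entropy:
$$values(\text{Wind}) = \{\text{weak}, \text{strong}\}$$
$$S = [9+, 5-], \quad H(S) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} = 0.94$$
$$S_{\text{weak}} = [6+, 2-], \quad H(S_{\text{weak}}) = -\tfrac{6}{8}\log_2\tfrac{6}{8} - \tfrac{2}{8}\log_2\tfrac{2}{8} = 0.811$$
$$S_{\text{strong}} = [3+, 3-], \quad H(S_{\text{strong}}) = -\tfrac{3}{6}\log_2\tfrac{3}{6} - \tfrac{3}{6}\log_2\tfrac{3}{6} = 1$$
$$IG(S, \text{Wind}) = H(S) - \tfrac{8}{14} H(S_{\text{weak}}) - \tfrac{6}{14} H(S_{\text{strong}}) = 0.94 - \tfrac{8}{14} \cdot 0.811 - \tfrac{6}{14} \cdot 1 = 0.048$$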
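The same computation can be reproduced in code. This is a sketch under stated assumptions: the rows are the 14 play-tennis training samples with values rendered in English, and the names `DATA`, `entropy`, and `information_gain` are introduced here for illustration.

```python
import math
from collections import Counter

COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind", "Play"]
# Rows D1..D14 of the training data, one sample per line.
DATA = [dict(zip(COLUMNS, line.split())) for line in """\
sunny hot high weak no
sunny hot high strong no
overcast hot high weak yes
rain mild high weak yes
rain cool normal weak yes
rain cool normal strong no
overcast cool normal strong yes
sunny mild high weak no
sunny cool normal weak yes
rain mild normal weak yes
sunny mild normal strong yes
overcast mild high strong yes
overcast hot normal weak yes
rain mild high strong no""".splitlines()]


def entropy(rows, target="Play"):
    """Entropy of the label distribution among the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target="Play"):
    """IG(S, A) = Entropy(S) - sum over values v of |S_v|/|S| * Entropy(S_v)."""
    gain = entropy(rows, target)
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset, target)
    return gain


print(round(information_gain(DATA, "Wind"), 3))  # 0.048
for attribute in COLUMNS[:-1]:
    print(attribute, round(information_gain(DATA, attribute), 3))
```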

Characteristics of ID3
ID3 is an algorithm that searches for a decision tree fitting the training data
The search is greedy, starting from the empty tree
The evaluation function is information gain
ID3 has a bias toward simple trees
  o Few nodes
  o Attributes with large information gain are placed near the root
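Putting the pieces together, a compact recursive sketch of ID3 on the play-tennis data might look as follows. It is an illustration under assumptions rather than a reference implementation: the tree encoding (nested `(attribute, branches)` tuples with label strings as leaves) and all function names are choices made here, and attribute values are rendered in English.

```python
import math
from collections import Counter

COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind", "Play"]
DATA = [dict(zip(COLUMNS, line.split())) for line in """\
sunny hot high weak no
sunny hot high strong no
overcast hot high weak yes
rain mild high weak yes
rain cool normal weak yes
rain cool normal strong no
overcast cool normal strong yes
sunny mild high weak no
sunny cool normal weak yes
rain mild normal weak yes
sunny mild normal strong yes
overcast mild high strong yes
overcast hot normal weak yes
rain mild high strong no""".splitlines()]


def entropy(rows, target):
    counts = Counter(row[target] for row in rows)
    return -sum(n / len(rows) * math.log2(n / len(rows)) for n in counts.values())


def information_gain(rows, attribute, target):
    gain = entropy(rows, target)
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset, target)
    return gain


def id3(rows, attributes, target="Play"):
    """Greedy top-down tree construction; returns a label or (attribute, branches)."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when all samples share one label or no unused attribute remains.
    if len(set(labels)) == 1 or not attributes:
        return majority
    # Pick the unused attribute with the largest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target)
    return (best, branches)


tree = id3(DATA, COLUMNS[:-1])
print(tree)
```

On this data the sketch places Outlook at the root, then Humidity under the sunny branch and Wind under the rain branch, matching the example tree shown earlier in the deck.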

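The overfitting problem

A hypothesis $h$ overfits the training data if there exists another hypothesis $h'$ such that $h$ has a smaller error than $h'$ on the training data but a larger error than $h'$ over the entire distribution of data.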

Preventing overfitting by pruning the tree
Split the data into two parts
  o Training
  o Validation
Build a sufficiently large tree on the training data
Compute the accuracy of the tree on the validation set
Remove the subtree whose removal improves the result on the validation data the most
Repeat until the result can no longer be improved

After pruning
[Figure]

Preventing overfitting by pruning rules (C4.5)
Convert the tree into rules
Prune each rule independently of the other rules
  o Drop some of the conditions on the left-hand side of the rule
Sort the pruned rules by their accuracy

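Using attributes with continuous values
Create new discrete attributes
For example, for a continuous attribute $A$, create a discrete attribute $A_c$ as follows
  o $A_c = \text{true}$ if $A > c$
  o $A_c = \text{false}$ if $A \le c$
How is the threshold $c$ determined?
  o Usually $c$ is chosen so that it yields the largest information gain
The range can also be divided into several intervals using several thresholds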
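One way to pick the threshold is to try a set of candidate values and keep the one with the largest information gain. The sketch below assumes the candidates are the midpoints between consecutive distinct sorted values, which is a common choice but not something stated on the slide; the temperatures and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (c, gain) for the split 'A > c' with the largest information gain."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, -1.0)
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue                       # no threshold fits between equal values
        c = (v1 + v2) / 2                  # candidate: midpoint of neighbouring values
        low = [label for value, label in pairs if value <= c]
        high = [label for value, label in pairs if value > c]
        gain = (base
                - len(low) / len(pairs) * entropy(low)
                - len(high) / len(pairs) * entropy(high))
        if gain > best[1]:
            best = (c, gain)
    return best


# Hypothetical continuous temperatures with play / don't play labels.
temperatures = [40, 48, 60, 72, 80, 90]
labels = ["no", "no", "yes", "yes", "yes", "no"]
threshold, gain = best_threshold(temperatures, labels)
print(threshold, round(gain, 3))  # 54.0 0.459
```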

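Other measures
The information gain (IG) measure favors attributes with many values; for example, the Day attribute would have the highest information gain
Split information
$$SplitInformation(S, A) = -\sum_{i=1}^{c} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$
where $S_1, \ldots, S_c$ are the subsets obtained by partitioning $S$ according to the $c$ values of $A$
Attribute evaluation criterion
$$GainRatio(S, A) = \frac{InformationGain(S, A)}{SplitInformation(S, A)}$$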
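To see the effect of the split information on the Day attribute, the sketch below computes information gain and gain ratio on the play-tennis data with a Day column added. The helper names are assumptions made for this illustration, and attribute values are rendered in English.

```python
import math
from collections import Counter

COLUMNS = ["Day", "Outlook", "Temperature", "Humidity", "Wind", "Play"]
RAW = """\
sunny hot high weak no
sunny hot high strong no
overcast hot high weak yes
rain mild high weak yes
rain cool normal weak yes
rain cool normal strong no
overcast cool normal strong yes
sunny mild high weak no
sunny cool normal weak yes
rain mild normal weak yes
sunny mild normal strong yes
overcast mild high strong yes
overcast hot normal weak yes
rain mild high strong no"""
# Prefix each row with its day identifier D1..D14.
DATA = [dict(zip(COLUMNS, [f"D{i}"] + line.split()))
        for i, line in enumerate(RAW.splitlines(), start=1)]


def entropy_of(values):
    """Entropy (in bits) of the empirical distribution of a list of values."""
    total = len(values)
    return -sum(n / total * math.log2(n / total) for n in Counter(values).values())


def information_gain(rows, attribute, target="Play"):
    gain = entropy_of([row[target] for row in rows])
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        gain -= len(subset) / len(rows) * entropy_of(subset)
    return gain


def split_information(rows, attribute):
    """Entropy of the attribute's own value distribution."""
    return entropy_of([row[attribute] for row in rows])


def gain_ratio(rows, attribute, target="Play"):
    split = split_information(rows, attribute)
    return information_gain(rows, attribute, target) / split if split > 0 else 0.0


for attribute in COLUMNS[:-1]:
    print(attribute,
          round(information_gain(DATA, attribute), 3),
          round(gain_ratio(DATA, attribute), 3))
```

On this small table Day has the largest information gain, and dividing by its large split information (log2 of 14) shrinks that advantage considerably, although it does not remove it entirely here.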

Contents

Introduction
Decision tree learning
Naive Bayes classification
Instance-based learning

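Bayes classification method (1/2)
In the training phase we have a set of samples, each given by a pair $\langle x, y \rangle$, where
  o $x$ is the feature (attribute) vector
  o $y$ is the class label, $y \in C$ ($C$ is the set of labels)
After training, the classifier must predict the label for a new sample $x = \langle x_1, x_2, \ldots, x_n \rangle$:
$$\hat{y} = \arg\max_{y \in C} P(y \mid x_1, x_2, \ldots, x_n)$$
Using Bayes' rule
$$\hat{y} = \arg\max_{y \in C} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)} = \arg\max_{y \in C} P(x_1, x_2, \ldots, x_n \mid y)\, P(y)$$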

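Bayes classification method (2/2)
$$\hat{y} = \arg\max_{y \in C} P(x_1, x_2, \ldots, x_n \mid y)\, P(y)$$
$P(y)$ is the frequency with which label $y$ is observed in the data set $D$:
$$P(y) = \frac{count(y)}{|D|}$$
Using the independence assumption (the "naive" simplification):
$$P(x_1, x_2, \ldots, x_n \mid y) = P(x_1 \mid y)\, P(x_2 \mid y) \cdots P(x_n \mid y)$$
$P(x_i \mid y)$ is the number of times $x_i$ occurs together with $y$, divided by the number of times $y$ occurs:
$$P(x_i \mid y) = \frac{count(x_i, y)}{count(y)}$$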

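Example
Determine the class label for the following sample
$$\langle \text{Outlook} = \ldots,\ \text{Temperature} = \ldots,\ \text{Humidity} = \ldots,\ \text{Wind} = \ldots \rangle$$
$$\hat{y} = \arg\max_{y \in \{\text{yes}, \text{no}\}} P(y)\, P(\text{Outlook} = \ldots \mid y)\, P(\text{Temperature} = \ldots \mid y)\, P(\text{Humidity} = \ldots \mid y)\, P(\text{Wind} = \ldots \mid y)$$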
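A counting-based sketch of the naive Bayes decision rule on the play-tennis data follows. The query sample used here is an assumption (it is the sample from the worked example in Mitchell's textbook, not necessarily the one on this slide), and the function name `naive_bayes_scores` is introduced for illustration; attribute values are rendered in English.

```python
from collections import Counter

COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind", "Play"]
DATA = [dict(zip(COLUMNS, line.split())) for line in """\
sunny hot high weak no
sunny hot high strong no
overcast hot high weak yes
rain mild high weak yes
rain cool normal weak yes
rain cool normal strong no
overcast cool normal strong yes
sunny mild high weak no
sunny cool normal weak yes
rain mild normal weak yes
sunny mild normal strong yes
overcast mild high strong yes
overcast hot normal weak yes
rain mild high strong no""".splitlines()]


def naive_bayes_scores(rows, query, target="Play"):
    """Return {label: P(label) * product of P(x_i | label)} from relative frequencies."""
    label_counts = Counter(row[target] for row in rows)
    scores = {}
    for label, n_label in label_counts.items():
        score = n_label / len(rows)                   # P(y) = count(y) / |D|
        for attribute, value in query.items():
            n_both = sum(1 for row in rows
                         if row[target] == label and row[attribute] == value)
            score *= n_both / n_label                 # P(x_i | y) = count(x_i, y) / count(y)
        scores[label] = score
    return scores


# Assumed query sample (hypothetical for this slide).
query = {"Outlook": "sunny", "Temperature": "cool", "Humidity": "high", "Wind": "strong"}
scores = naive_bayes_scores(DATA, query)
print({label: round(value, 4) for label, value in scores.items()})
print(max(scores, key=scores.get))  # no
```

Plain relative frequencies give a zero score whenever an attribute value never occurs with a label in the training data; smoothing the counts is a common remedy that this sketch leaves out.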

Contents

Introduction
Decision tree learning
Naive Bayes classification
Instance-based learning

General principle
No model is built
Only the training samples are stored
The label of a new sample is determined from the training samples most similar to it
This is called lazy learning

The k-nearest neighbors algorithm
k-nearest neighbors (k-NN)
Select the k samples most similar to the sample to be classified, called its k neighbors
Assign a class label to the sample using only the information from these k neighbors
  o For example, take the majority label among the k neighbors
How are the k neighbors chosen?

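Computing distances
Suppose a sample $x$ has attribute values $\langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$, where the attributes are real numbers
The distance between two samples $x_i$ and $x_j$ is the Euclidean distance
$$d(x_i, x_j) = \sqrt{\sum_{l=1}^{n} \left(a_l(x_i) - a_l(x_j)\right)^2}$$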

The k-NN algorithm
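A minimal sketch of k-NN with Euclidean distance and majority voting, as described on the preceding slides. The function names, the default k = 3, and the small height/weight data set with labels "A" and "B" are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(training, query, k=3):
    """training: list of (feature_vector, label) pairs.

    Returns the majority label among the k training samples nearest to query.
    """
    neighbors = sorted(training, key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical (height in cm, weight in kg) samples with made-up labels.
training = [
    ((150.0, 45.0), "A"), ((155.0, 50.0), "A"), ((160.0, 52.0), "A"),
    ((175.0, 80.0), "B"), ((180.0, 85.0), "B"), ((185.0, 90.0), "B"),
]
print(knn_predict(training, (158.0, 51.0), k=3))  # A
print(knn_predict(training, (178.0, 82.0), k=3))  # B
```

Because no model is built, all the work happens at prediction time: every query is compared against every stored sample.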
