$$P(w_i \mid x) = \frac{p(x \mid w_i)\,P(w_i)}{p(x)}$$

$$v(x) = \frac{p(x \mid w_1)}{p(x \mid w_2)} > \frac{P(w_2)}{P(w_1)}$$
Bayes Algorithm and Applications
Figure 1: Histograms of feature N for the two classes of cork stoppers. The threshold value N = 65 is marked with a vertical line.
Assume that each cork stopper has a single feature N, so that the feature vector is x = [N], and suppose that a cork stopper has x = [65].
From the histograms we obtain the likelihoods:

$$p(x \mid w_1) = 20/24 = 0.833 \;\Rightarrow\; P(w_1)\,p(x \mid w_1) = 0.333 \tag{1-5a}$$

$$p(x \mid w_2) = 16/23 = 0.696 \;\Rightarrow\; P(w_2)\,p(x \mid w_2) = 0.418 \tag{1-5b}$$
We therefore assign x = [65] to class $w_2$, even though the likelihood of $w_1$ is larger than that of $w_2$.
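The comparison above can be reproduced with a few lines of Python. This is only a sketch: the likelihoods are the histogram counts quoted in the text, and the prior probabilities 0.4 and 0.6 are assumed to be those given in the Figure 3 caption.

```python
# Likelihoods p(x | w_i) read from the histogram at x = [65] (values from the text)
likelihood = {"w1": 20 / 24, "w2": 16 / 23}
# Prior probabilities (prevalences); assumed to be the 0.4 / 0.6 of the Figure 3 caption
prior = {"w1": 0.4, "w2": 0.6}

# Unnormalised posteriors P(w_i) * p(x | w_i)
joint = {c: prior[c] * likelihood[c] for c in prior}

# Dividing by the evidence p(x) gives the posteriors P(w_i | x)
evidence = sum(joint.values())
posterior = {c: joint[c] / evidence for c in joint}

# Bayes decision: pick the class with the largest posterior
decision = max(posterior, key=posterior.get)
print(decision)
```

Normalising by the evidence does not change which class wins, so comparing the products $P(w_i)\,p(x \mid w_i)$ directly gives the same decision.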
Figure 2 illustrates how adjusting the prior probabilities (prevalences) moves the decision threshold between the probability density functions.
Equal prior probabilities (equal prevalences). With identical probability density functions, the decision threshold lies halfway between the class means. The number of misclassified cases corresponds to the shaded area. This is the same threshold that a minimum distance classifier would use.
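Prior probability of $w_1$ larger than that of $w_2$. The decision threshold is displaced towards the class with the smaller prior probability. This reduces the number of misclassified cases of the class with the higher prior probability, which seems desirable.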
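The displacement of the threshold can be illustrated with a small sketch. It assumes two one-dimensional normal classes with the same standard deviation, for which the Bayes threshold has a closed form; the function name and all numerical values are hypothetical and are not taken from the cork stopper data.

```python
import math

def bayes_threshold(mu1, mu2, sigma, p1, p2):
    """Decision threshold between two 1-D normal classes with equal variance.

    Assumes mu1 < mu2; points below the threshold are assigned to class 1.
    Obtained by solving p1 * N(x; mu1, sigma) = p2 * N(x; mu2, sigma) for x.
    """
    midpoint = (mu1 + mu2) / 2
    return midpoint + sigma ** 2 / (mu2 - mu1) * math.log(p1 / p2)

# Equal priors: the threshold is exactly halfway between the means
equal = bayes_threshold(0.0, 2.0, 1.0, 0.5, 0.5)
# Class 1 more prevalent: the threshold moves towards the mean of class 2
skewed = bayes_threshold(0.0, 2.0, 1.0, 0.7, 0.3)
print(equal, skewed)
```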
Figure 2: Equal prior probabilities (a) and unequal prior probabilities (b).
We see that shifting the decision threshold does indeed give better results for class $w_2$ than for class $w_1$. This seems reasonable, since class $w_2$ now occurs more often. However, the overall error has increased, so one may wonder whether taking the prior probabilities into account was beneficial at all. The answer to this question is related to the topic of classification risk, which is presented next.
Assume that the price of a cork stopper of class $w_1$ is 0.025 and that of a class $w_2$ stopper is 0.015. Suppose that class $w_1$ stoppers are used for special bottles, whereas class $w_2$ stoppers are used for normal bottles.
If a class $w_1$ cork stopper is misclassified, the loss is 0.025 - 0.015 = 0.01.
If a class $w_2$ cork stopper is misclassified, it will be rejected, and the loss is 0.015.
We denote:
SB: the action of using a cork stopper for special bottles.
NB: the action of using a cork stopper for normal bottles.
$w_1$ = S (super class); $w_2$ = A (average class).
Figure 3: Classification results for the cork stoppers with unequal prior probabilities: 0.4 for class $w_1$ and 0.6 for class $w_2$.
Definition: $\lambda_{ij} = \lambda(\alpha_i \mid w_j)$ is the loss associated with action $\alpha_i$ when the correct class is $w_j$, with $\alpha_i \in \{SB, NB\}$.
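$\lambda_{11} = \lambda(\alpha_1 \mid w_1) = \lambda(SB \mid S) = 0$,
$\lambda_{12} = \lambda(\alpha_1 \mid w_2) = \lambda(SB \mid A) = 0.015$,
$\lambda_{21} = \lambda(\alpha_2 \mid w_1) = \lambda(NB \mid S) = 0.01$,
$\lambda_{22} = \lambda(\alpha_2 \mid w_2) = \lambda(NB \mid A) = 0$.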
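We can arrange the $\lambda_{ij}$ into a loss matrix $\Lambda$:

$$\Lambda = \begin{bmatrix} 0 & 0.015 \\ 0.01 & 0 \end{bmatrix} \tag{1-6}$$

The risk of the action of using a cork stopper, described by feature vector $x$, for special bottles can therefore be written as:

$$R(\alpha_1 \mid x) = R(SB \mid x) = \lambda(SB \mid S)\,P(S \mid x) + \lambda(SB \mid A)\,P(A \mid x) \tag{1-6a}$$

$$R(\alpha_1 \mid x) = 0.015\,P(A \mid x)$$

Similarly, for the action of using it for normal bottles:

$$R(\alpha_2 \mid x) = R(NB \mid x) = \lambda(NB \mid S)\,P(S \mid x) + \lambda(NB \mid A)\,P(A \mid x) \tag{1-6b}$$

$$R(\alpha_2 \mid x) = 0.01\,P(S \mid x)$$

We assume that the risk is affected only by wrong decisions. A correct decision therefore causes no loss, $\lambda_{ii} = 0$, as in (1-6).
If instead of 2 classes we have $c$ classes, the risk associated with an action $\alpha_i$ is:

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid w_j)\,P(w_j \mid x) \tag{1-6c}$$

We are interested in minimizing the average risk computed over an arbitrarily large number of cork stoppers. The Bayes minimum-risk rule achieves this by minimizing the conditional risks $R(\alpha_i \mid x)$.
Assume first that all wrong decisions carry the same loss, scaled to one unit of loss:

$$\lambda_{ij} = \lambda(\alpha_i \mid w_j) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } i \neq j \end{cases} \tag{1-7a}$$

In this case, since all the posterior probabilities sum to one, we need to minimize:

$$R(\alpha_i \mid x) = \sum_{j \neq i} P(w_j \mid x) = 1 - P(w_i \mid x) \tag{1-7b}$$

With unequal losses in the two-class case, comparing (1-6a) with (1-6b) shows that $w_1$ is chosen when:

$$\lambda_{12}\,P(w_2 \mid x) < \lambda_{21}\,P(w_1 \mid x)$$

This is equivalent to applying the Bayes rule with the adjusted prior probabilities:

$$P^*(w_1) = \frac{\lambda_{21}\,P(w_1)}{\lambda_{21}\,P(w_1) + \lambda_{12}\,P(w_2)}$$

$$P^*(w_2) = \frac{\lambda_{12}\,P(w_2)}{\lambda_{21}\,P(w_1) + \lambda_{12}\,P(w_2)}$$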
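As a minimal sketch of the minimum-risk rule, the following Python snippet evaluates the conditional risk of each action from the loss values given in the text and selects the action with the smaller risk. The posterior values are hypothetical, chosen only for illustration, and the function names are not from the source.

```python
# Losses from the text: outer keys are actions (SB, NB), inner keys are true classes (S, A)
LOSS = {
    "SB": {"S": 0.0, "A": 0.015},
    "NB": {"S": 0.01, "A": 0.0},
}

def conditional_risk(action, posterior):
    """R(action | x) = sum over classes of loss(action | class) * P(class | x)."""
    return sum(LOSS[action][cls] * p for cls, p in posterior.items())

def min_risk_action(posterior):
    """Return the action with the smallest conditional risk."""
    return min(LOSS, key=lambda action: conditional_risk(action, posterior))

# Hypothetical posteriors for one cork stopper (not taken from the dataset)
posterior = {"S": 0.55, "A": 0.45}
risks = {action: conditional_risk(action, posterior) for action in LOSS}
best = min_risk_action(posterior)
print(risks, best)
```

With these illustrative posteriors the stopper is more likely to be of class S, yet the cheaper action is NB, because wrongly using an average stopper in a special bottle costs more than the reverse mistake.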
With the losses $\lambda_{12} = 0.015$ and $\lambda_{21} = 0.01$, and using the prior probabilities above, we obtain $P^*(w_1) = 0.308$ and $P^*(w_2) = 0.692$. The loss is larger when a class $w_2$ stopper is misclassified, so $P^*(w_2)$ must be increased relative to $P^*(w_1)$. The effect of this adjustment is to reduce the number of class $w_2$ stoppers that are misclassified as $w_1$. See the classification results in Figure 6.
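The adjusted prior probabilities quoted above can be checked numerically: each prior is weighted by the loss incurred when that class is misclassified, and the two weights are renormalised to sum to one. The sketch below assumes the priors 0.4 and 0.6 from the Figure 3 caption; the function name is illustrative.

```python
def adjusted_priors(p1, p2, loss12, loss21):
    """Loss-weighted priors for two classes.

    loss21 is the loss of misclassifying a class-1 pattern,
    loss12 is the loss of misclassifying a class-2 pattern.
    """
    weight1 = loss21 * p1
    weight2 = loss12 * p2
    total = weight1 + weight2
    return weight1 / total, weight2 / total

p1_star, p2_star = adjusted_priors(0.4, 0.6, loss12=0.015, loss21=0.01)
print(round(p1_star, 3), round(p2_star, 3))  # 0.308 0.692
```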
The average risk in the two-class case can be computed as:

$$R = \lambda_{12}\int_{R_1} P(w_2 \mid x)\,p(x)\,dx + \lambda_{21}\int_{R_2} P(w_1 \mid x)\,p(x)\,dx = \lambda_{12}\,Pe_{12} + \lambda_{21}\,Pe_{21} \tag{1-9}$$
$R_1$ and $R_2$ are the decision regions of classes $w_1$ and $w_2$, and $Pe_{ij}$ is the probability of the error of deciding class $w_i$ when the true class is $w_j$.
Let us use the training set to estimate these errors: $Pe_{12} = 0.1$ and $Pe_{21} = 0.46$ (see Figure 6). The average risk per cork stopper is then:
$R = 0.015\,Pe_{12} + 0.01\,Pe_{21} = 0.0061$.
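The arithmetic can be verified with a short sketch that sums loss times error probability over all wrong decisions. The dictionary layout and function name are illustrative choices; the numbers are those given in the text.

```python
def average_risk(loss, error_prob):
    """Sum loss[i][j] * error_prob[i][j] over all wrong decisions (i != j)."""
    return sum(
        loss[i][j] * error_prob[i][j]
        for i in loss
        for j in loss[i]
        if i != j
    )

# loss[i][j]: loss of deciding class i when the true class is j
loss = {1: {1: 0.0, 2: 0.015}, 2: {1: 0.01, 2: 0.0}}
# error_prob[i][j]: estimated probability of deciding class i when the true class is j
error_prob = {1: {2: 0.10}, 2: {1: 0.46}}

risk = average_risk(loss, error_prob)
print(round(risk, 4))  # 0.0061
```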
With $\Omega$ denoting the set of classes, formula (1-9) generalizes to:

$$R = \sum_{w_i \in \Omega} \int_X \lambda(\alpha(x) \mid w_i)\,P(w_i, x)\,dx = \int_X \sum_{w_i \in \Omega} \lambda(\alpha(x) \mid w_i)\,P(w_i \mid x)\,p(x)\,dx \tag{1-9a}$$

The Bayes decision rule is not the only option in statistical classification. Note also that, in practice, one attempts to minimize the average risk by using estimates of the probability density functions computed from a training set, as we did above for the cork stoppers. If we have grounds to believe that the probability distributions follow a parametric model, we instead compute the appropriate parameters from the training set. Alternatively, we can use empirical risk minimization (ERM), whose principle is to minimize the empirical risk instead of the true risk.
2.3 Normal Bayesian classification
So far we have not assumed any particular form for the distributions of the likelihoods. However, the normal model is a reasonable assumption. It is related to the well-known central limit theorem, according to which the sum of a large number of independent and identically distributed random variables has a distribution that converges to the normal law. In practice a good approximation to the normal law is obtained even when a relatively small number of random variables are added. For features that can be regarded as the result of adding independent variables, the normal assumption is usually acceptable.
The normal likelihood for class $w_i$ is given by the probability density function:

$$p(x \mid w_i) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_i|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - \mu_i)'\,\Sigma_i^{-1}(x - \mu_i)\right) \tag{1-10}$$

where $d$ is the dimension of the feature vector $x$.
$\mu_i$ and $\Sigma_i$ are the distribution parameters; so far we have used the sample estimates $m_i$ and $C_i$.
Figure 7 illustrates the normal distribution in the two-dimensional case.
Consider a training set of $n$ samples, $T = \{x_1, x_2, \ldots, x_n\}$, described by a distribution with probability density function $p(T \mid \theta)$, where $\theta$ is a parameter vector of the distribution (for example, the mean vector of a normal distribution). A notable way of obtaining a sample estimate of the parameter vector is to maximize $p(T \mid \theta)$, which, viewed as a function of $\theta$, is called the likelihood of $\theta$ for the training set. Assuming that each sample is drawn independently from an infinite population, the likelihood can be written as:

$$p(T \mid \theta) = \prod_{i=1}^{n} p(x_i \mid \theta)$$

When using maximum likelihood estimation of the distribution parameters, it is often easier to maximize $\ln[p(T \mid \theta)]$, which is equivalent. For Gaussian distributions, the sample estimates given by formulas (1-10a) and (1-10b) are the maximum likelihood estimates, and they converge to the true values.
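To make the idea concrete, here is a small sketch of maximum likelihood estimation for a one-dimensional Gaussian. The data are synthetic (drawn with an arbitrary mean of 5 and standard deviation of 2), and the function names are illustrative. The estimates are the sample mean and the sample variance with divisor $n$, and the log-likelihood at these estimates is compared with its value at slightly perturbed parameters.

```python
import math
import random

def gaussian_mle(samples):
    """Maximum likelihood estimates of the mean and variance of a 1-D Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # the ML estimate divides by n
    return mean, var

def log_likelihood(samples, mean, var):
    """ln p(T | theta) for independent samples: the sum of the log densities."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        for x in samples
    )

random.seed(0)
T = [random.gauss(5.0, 2.0) for _ in range(500)]  # synthetic training set

m_hat, v_hat = gaussian_mle(T)
best = log_likelihood(T, m_hat, v_hat)
shifted = log_likelihood(T, m_hat + 0.1, v_hat)  # perturbed mean
widened = log_likelihood(T, m_hat, v_hat * 1.2)  # perturbed variance
print(best >= shifted and best >= widened)
```

Working with the sum of log densities rather than the product of densities avoids numerical underflow for large training sets, which is one practical reason for maximizing $\ln[p(T \mid \theta)]$.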
As can be seen from (1-10), the surfaces of equal probability density for the normal likelihood satisfy the Mahalanobis metric:

$$(x - \mu_i)'\,\Sigma_i^{-1}(x - \mu_i) = \text{constant}$$
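We now derive the decision functions for normally distributed features:

$$g_i(x) = P(w_i)\,p(x \mid w_i) \tag{1-11}$$

which equals the posterior probability $P(w_i \mid x)$ up to the common factor $p(x)$.
Taking logarithms and inserting the normal density for $p(x \mid w_i)$, we obtain:

$$h_i(x) = \ln g_i(x) = -\tfrac{1}{2}(x - \mu_i)'\,\Sigma_i^{-1}(x - \mu_i) - \tfrac{1}{2}\ln|\Sigma_i| - \tfrac{d}{2}\ln 2\pi + \ln P(w_i)$$

where $d$ is the dimension of $x$.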
Using these decision functions, which clearly depend on the Mahalanobis metric, we can build the Bayes minimum-risk classifier, which is the optimal classifier. Note that formula (1-11b) uses the true value of the Mahalanobis distance, whereas previously we used an estimate of this distance.
When the covariance is the same for all classes ($\Sigma_i = \Sigma$), and ignoring constant terms, we obtain:

$$h_i(x) = -\tfrac{1}{2}(x - \mu_i)'\,\Sigma^{-1}(x - \mu_i) + \ln P(w_i) \tag{1-11c}$$
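The decision function for a common covariance matrix can be sketched in plain Python for two features. The means, covariance matrix and priors below are hypothetical values chosen for illustration and are not the cork stopper estimates; the helper names are likewise assumptions.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def h(x, mean, cov_inv, prior):
    """h_i(x) = -1/2 (x - mu_i)' Sigma^-1 (x - mu_i) + ln P(w_i)."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    maha2 = (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
             + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))
    return -0.5 * maha2 + math.log(prior)

def classify(x, means, cov, priors):
    """Return the index of the class with the largest decision function."""
    cov_inv = inv2(cov)
    scores = [h(x, m, cov_inv, p) for m, p in zip(means, priors)]
    return scores.index(max(scores))

# Hypothetical two-class problem sharing one covariance matrix
means = [(0.0, 0.0), (3.0, 3.0)]
cov = [[1.0, 0.5], [0.5, 1.0]]
priors = [0.5, 0.5]
print(classify((0.5, 0.2), means, cov, priors), classify((2.8, 3.1), means, cov, priors))
```

Because the quadratic term in $x$ is the same for every class when the covariance is shared, the differences between these decision functions are linear in $x$, which is why this case leads to linear decision boundaries.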
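$$\delta^2 = \begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} 0.8 & -0.8 \\ -0.8 & 1.6 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 8, \qquad Pe = 1 - \operatorname{erf}(\sqrt{2}) = 7.9\%$$

where erf denotes the standard normal cumulative distribution function.
The training set error estimate for this dataset is 5%. Introducing perturbations of 0.1 into the values of the transformation matrix A for the dataset, which correspond to deviations between 15% and 42% of the covariance values, gives a training set error of 6%.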
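The 7.9% figure can be checked numerically. The sketch below rests on several assumptions that the surviving text does not state explicitly: that the vector [2 3] is the difference between the two class means, that the 2x2 matrix is the inverse covariance with negative off-diagonal entries, that the error for two equally likely normal classes with a common covariance is $1 - \Phi(\delta/2)$, and that "erf" in the formula stands for the standard normal cumulative distribution function $\Phi$.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function, built from math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

diff = [2.0, 3.0]                        # assumed difference between the class means
cov_inv = [[0.8, -0.8], [-0.8, 1.6]]     # assumed inverse covariance matrix

# Squared Mahalanobis distance: delta^2 = diff' * cov_inv * diff
tmp = [sum(cov_inv[i][j] * diff[j] for j in range(2)) for i in range(2)]
delta2 = sum(diff[i] * tmp[i] for i in range(2))

# Error probability for equal priors and a common covariance: 1 - Phi(delta / 2)
pe = 1.0 - normal_cdf(math.sqrt(delta2) / 2.0)
print(delta2, round(100 * pe, 1))
```

Under these assumptions the computation gives $\delta^2 = 8$ and an error of about 7.9%. With the mathematical error function in place of $\Phi$, $1 - \operatorname{erf}(\sqrt{2})$ would be about 4.6%, which supports reading "erf" as the normal distribution function here.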
Returning to the cork stopper data, consider the classification problem using the two features N and PRT with equal prior probabilities. Note that statistical classification, apart from numerical considerations, is invariant to scaling operations, so the same results are obtained whether PRT or PRT10 is used.
A partial listing of the posterior probabilities, which is useful when examining the classification errors, is shown in Figure 11.
The covariance matrices are given in Table 1. The deviations of the covariance matrix elements from their central values lie between 5% and 30%. The shapes of the clusters are similar, which is evidence that the classification is close to optimal.
By using decision functions based on the individual covariance matrices, instead of a single pooled covariance matrix, we would obtain quadratic decision boundaries. However, quadratic classifiers are more sensitive to deviations than linear ones, especially in high-dimensional spaces, and they require much larger training sets (see, for example, Fukunaga and Hayes, 1989).
2.4 Reject region
In practical pattern recognition applications, the simple use of a decision rule such as (1-2a) or (1-7c) often produces many borderline decisions, which are very sensitive to the noise in the data and to the numerical accuracy of the classifier computations. Many patterns lying near the decision boundary can change their assigned class after only a small adjustment. In fact, this means that such patterns largely share the characteristics of both classes. For such patterns it is appropriate to place them in a special class for closer inspection. This is certainly necessary in some applications, for example in the medical field, where borderline cases between normal and abnormal deserve further analysis. One way of dealing with this is to attach a qualification to the computed posterior probability $P(w_i \mid x)$ of class $w_i$. For example, we may assign the qualification "definite" if the probability is larger than 0.9, "probable" if it lies between 0.9 and 0.8, and "possible" if it is smaller than 0.8. In this way, the cork stopper of case 55 (see Figure 11) is classified as a "possible" cork of class "super", and case 54 as a "probable" cork of class "average".
Instead of attaching a qualitative description to the assigned class, another method used in certain circumstances is to provide for the existence of a special class, called the reject class or reject region.
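Notation:
$w^*$: the assigned class;
$w_i$: the class with maximum posterior probability, i.e. $P(w_i \mid x) = \max_j P(w_j \mid x)$, so that $P(w_i \mid x) \ge P(w_j \mid x)$ for every class $w_j \neq w_i$.
The Bayes rule can then be written as $w^* = w_i$.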
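We now require the posterior probability for a cork stopper to reach a given reject threshold $\lambda_r$; otherwise the stopper is assigned to the reject class $w_r$. The Bayes rule is rewritten as:

$$w^* = \begin{cases} w_i & \text{if } P(w_i \mid x) \ge \lambda_r \\ w_r & \text{if } P(w_i \mid x) < \lambda_r \end{cases} \tag{1-14}$$

When the likelihood ratio is compared with the prior probability ratio (prevalence ratio), the latter must now be multiplied by a factor that depends on $\lambda_r$. For two classes, $w_2$ is chosen when $v(x)$ does not exceed the prevalence ratio $P(w_2)/P(w_1)$ multiplied by $(1-\lambda_r)/\lambda_r$, and $w_1$ is chosen when $v(x)$ is at least the prevalence ratio multiplied by the reciprocal factor $\lambda_r/(1-\lambda_r)$; values of $v(x)$ between these two bounds are rejected. A $c$-class classifier never produces a rejection if $\lambda_r < 1/c$, because the largest posterior probability is always at least $1/c$; therefore $\lambda_r \in [1/c, 1]$, which for two classes is $[0.5, 1]$.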
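The two ways of handling borderline cases described in this section, qualitative labels and a reject class, can be sketched as follows. The function names, the "reject" label and the posterior values are illustrative assumptions; the treatment of a posterior of exactly 0.8 or 0.9 is also a choice made here, since the text does not specify the boundary cases.

```python
REJECT = "reject"  # label used here for the reject class w_r

def qualify(posterior):
    """Qualitative label for the posterior probability of the assigned class."""
    if posterior > 0.9:
        return "definite"
    if posterior >= 0.8:
        return "probable"
    return "possible"

def classify_with_reject(posteriors, reject_threshold):
    """Assign the class with maximum posterior, or the reject class if it is too low."""
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] >= reject_threshold:
        return best
    return REJECT

# Hypothetical posterior probabilities for two cork stoppers
borderline = {"super": 0.55, "average": 0.45}
clear = {"super": 0.12, "average": 0.88}

print(qualify(max(borderline.values())), classify_with_reject(borderline, 0.7))
print(qualify(max(clear.values())), classify_with_reject(clear, 0.7))
```

With a two-class problem, a threshold of 0.5 or lower never rejects, since the larger of two posteriors is always at least 0.5.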