
Hidden Markov Model

I. ABSTRACT

The Hidden Markov Model (HMM) is a popular statistical model for sequential data that varies noisily over time. In Natural Language Processing (NLP), HMMs have been applied with great success to problems such as extracting attributes of speech and phrase chunking.

II. INTRODUCTION

The Hidden Markov Model is a very powerful statistical tool for modeling generative sequences, in other words, sequences that can be characterised by an underlying sequence of states which generates the different observation sequences.

HMMs are used in many areas of signal processing in general and speech processing in particular. They have also been applied successfully in NLP tasks such as part-of-speech tagging, phrase chunking, and extracting target information from documents.
A Hidden Markov Model is defined by:

- S is the set of all states: S = {s_1, s_2, ..., s_N}
- V is the set of all possible observations: V = {v_1, v_2, ..., v_M}
- Q is a state sequence of length T: Q = q_1 q_2 ... q_T
- O is the corresponding sequence of observations: O = o_1 o_2 ... o_T
- A is the transition matrix, holding the probabilities of moving from state i to state j; these transition probabilities are independent of time: A = {a_{ij}}, a_{ij} = P(q_{t+1} = s_j | q_t = s_i)
- B is the observation probability matrix, holding the probability of observing symbol k from state i, also independent of time: B = {b_i(k)}, b_i(k) = P(o_t = v_k | q_t = s_i)
- π is the initial probability vector: π = {π_i}, π_i = P(q_1 = s_i)

The full model is written compactly as λ = (A, B, π).

We assume that the Hidden Markov Model satisfies the following two conditions:

First, it is a first-order Markov model: the current state depends only on the immediately preceding state, which characterises the memory of the model:

P(q_t | q_{t-1}, ..., q_1) = P(q_t | q_{t-1})

Second, the observation at time t depends only on the current state and is independent of past states and past observations:

P(o_t | q_t, ..., q_1, o_{t-1}, ..., o_1) = P(o_t | q_t)
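The definition and assumptions above can be written down concretely. Below is a minimal Python sketch of a toy HMM; every state name and probability value here is invented purely for illustration:

```python
# A toy HMM with N = 2 states and M = 3 observation symbols.
# All names and numbers are illustrative, not taken from the text.
S = ["s1", "s2"]                 # state set S
V = ["v1", "v2", "v3"]          # observation alphabet V

# A[i][j] = P(q_{t+1} = s_j | q_t = s_i); each row sums to 1.
A = [[0.7, 0.3],
     [0.4, 0.6]]

# B[i][k] = P(o_t = v_k | q_t = s_i); each row sums to 1.
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]

# pi[i] = P(q_1 = s_i).
pi = [0.6, 0.4]

# Sanity checks: all probability rows are normalised.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in A + B)
assert abs(sum(pi) - 1.0) < 1e-9
```

The same list-of-lists layout is reused in the sketches that follow.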
III. EVALUATION

Given a Hidden Markov Model and an observation sequence O, we can compute P(O|λ), the probability that the model generates the observation sequence. From this we can judge how well a model predicts a given sequence O and select the most suitable model.

The probability of the observation sequence O given a state sequence Q is:

P(O | Q, λ) = ∏_{t=1..T} b_{q_t}(o_t)

The probability of the state sequence Q under the Hidden Markov Model is:

P(Q | λ) = π_{q_1} ∏_{t=2..T} a_{q_{t-1} q_t}

From these we obtain the probability of the observation sequence O under the HMM:

P(O | λ) = Σ_Q P(O | Q, λ) P(Q | λ)

We could compute the probability of the sequence O directly from this sum, but the number of operations required is enormous.
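The direct sum can be sketched in a few lines of Python. This brute-force version enumerates all N^T state sequences, which is exactly why it becomes infeasible for realistic T (the function name and toy usage are my own):

```python
from itertools import product

def brute_force_likelihood(O, A, B, pi):
    """P(O | lambda) by summing over every one of the N^T state
    sequences. Feasible only for tiny T; shown to motivate the
    forward algorithm, not for real use."""
    N, T = len(pi), len(O)
    total = 0.0
    for Q in product(range(N), repeat=T):      # every state sequence
        p = pi[Q[0]] * B[Q[0]][O[0]]           # P(q_1) * P(o_1 | q_1)
        for t in range(1, T):
            p *= A[Q[t - 1]][Q[t]] * B[Q[t]][O[t]]
        total += p
    return total
```

Summing this quantity over every possible observation sequence of length T gives exactly 1, which is a handy sanity check.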
A better approach is to recognise the redundant computation and cache it, which is how we reduce the complexity of the calculation. We model each time step with a trellis: at each step we compute a value (denoted α) for every state by summing over all states leading into it. α_t(i) is then the probability of the partial observation sequence o_1 ... o_t ending in state i at time t:

α_t(i) = P(o_1 o_2 ... o_t, q_t = s_i | λ)

We then fill in the trellis column by column; the sum of the values in the last column is exactly the probability of the observation sequence.

The steps of the algorithm are as follows:

1. Initialisation (fill in the first column):

α_1(i) = π_i b_i(o_1), 1 ≤ i ≤ N

2. Induction (fill in the remaining columns):

α_{t+1}(j) = [Σ_{i=1..N} α_t(i) a_{ij}] b_j(o_{t+1})

3. Termination (compute the probability of the sequence O):

P(O | λ) = Σ_{i=1..N} α_T(i)

The figure above shows the computations in one induction step from time t to time t+1. Using α, the amount of computation drops from about 2T·N^T operations to about T·N^2.
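The three steps translate directly into code. Here is a minimal sketch of the forward algorithm, assuming the same list-of-lists parameterisation used above (names are mine, not from the text):

```python
def forward(O, A, B, pi):
    """Forward (trellis) algorithm: returns P(O | lambda) in
    O(N^2 * T) time instead of the O(T * N^T) direct sum."""
    N, T = len(pi), len(O)
    # 1. Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][O[0]] for i in range(N)]
    # 2. Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) a_ij) * b_j(o_{t+1})
    for t in range(1, T):
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][O[t]]
                 for j in range(N)]
    # 3. Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)
```

Because it only keeps the current trellis column, the sketch also uses O(N) memory rather than storing the whole lattice.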

IV. DECODING

The purpose of decoding is to determine the state sequence most likely to have produced a given observation sequence. One solution to this problem is the Viterbi algorithm.

The Viterbi algorithm is another form of trellis algorithm, similar to the Forward algorithm, except that at each step it takes the largest transition value instead of summing them.

First, we define:

δ_t(i) = max_{q_1 ... q_{t-1}} P(q_1 ... q_{t-1}, q_t = s_i, o_1 ... o_t | λ)

together with a back-pointer ψ_t(i) recording which previous state achieved that maximum.

- The Viterbi algorithm proceeds as follows:
1. Initialisation:

δ_1(i) = π_i b_i(o_1), ψ_1(i) = 0

2. Recursion:

δ_t(j) = max_i [δ_{t-1}(i) a_{ij}] b_j(o_t), ψ_t(j) = argmax_i [δ_{t-1}(i) a_{ij}]

(The Recursion step of the Viterbi algorithm)

3. Termination:

P* = max_i δ_T(i), q*_T = argmax_i δ_T(i)

4. Optimal state sequence backtracking:

q*_t = ψ_{t+1}(q*_{t+1}), t = T-1, ..., 1

(The Backtracking step of the Viterbi algorithm)
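The four steps can be sketched as follows. This is a straightforward, unoptimised implementation (real implementations usually work in log space to avoid numerical underflow on long sequences):

```python
def viterbi(O, A, B, pi):
    """Viterbi decoding: the most likely state sequence for O.
    Same trellis as the forward algorithm, with max in place of sum."""
    N, T = len(pi), len(O)
    # 1. Initialisation: delta_1(i) = pi_i * b_i(o_1), psi_1(i) = 0
    delta = [pi[i] * B[i][O[0]] for i in range(N)]
    psi = []                                   # back-pointers per step
    # 2. Recursion: delta_t(j) = max_i(delta_{t-1}(i) a_ij) * b_j(o_t)
    for t in range(1, T):
        new_delta, back = [], []
        for j in range(N):
            i_best = max(range(N), key=lambda i: delta[i] * A[i][j])
            new_delta.append(delta[i_best] * A[i_best][j] * B[j][O[t]])
            back.append(i_best)                # psi_t(j)
        delta, psi = new_delta, psi + [back]
    # 3. Termination: best final state and its probability
    q = max(range(N), key=lambda i: delta[i])
    path = [q]
    # 4. Backtracking: q*_t = psi_{t+1}(q*_{t+1})
    for back in reversed(psi):
        q = back[q]
        path.append(q)
    return list(reversed(path)), max(delta)
```

The function returns both the decoded state sequence and P*, the probability of the best path.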

Next, we consider a weather-forecasting example to understand the Viterbi algorithm better.

The Viterbi Algorithm: determining the most likely weather sequence.

There are three kinds of weather: sunny, rainy, and foggy, and we assume that a day's weather lasts the whole day and does not change during the day. Now suppose you are locked inside a room for several days and are asked about the weather outside. The only piece of evidence you have is whether the friend who brings your daily meals is carrying an umbrella or not; you cannot see the weather yourself while locked up.

On the first three days you observe your friend as follows:

Find the most probable weather sequence using the Viterbi algorithm (assume that the initial probabilities are equal on day 1).
1. Initialisation (computing the initial values):

The formulas here are computed in the same way as step 1 above.

Figure 5: The Viterbi algorithm finds the most likely weather sequence. We find the path leading to the sunny state at n = 2.

2. Recursion:

We compute the likelihood of reaching each state at the current time from all three previous states, and pick the most likely one.

The formulas here are computed as in step 2 above.

That is, in the figure we look for the path leading to the largest likelihood, considering the best likelihood at the previous time step and the transition into the current state, then multiply by the likelihood of the current observation given the current state. The result is the best path. The likelihood is stored in δ, and the most probable previous state is stored in ψ. See Figure 5.

We do the same for the remaining weather states and for n = 3. Finally, we obtain the most likely path ending at each state of the model. See Figure 6.

Figure 6: The Viterbi algorithm finds the most likely weather sequence at n = 3.

Figure 7: The Viterbi algorithm finds the most likely weather sequence.

3. Termination:

The most likely path is determined, starting by finding the final state of the most likely sequence.

4. Backtracking:

The best state sequence can be read off from the vector ψ. See Figure 7.

In this way, the most likely weather sequence is:

V. LEARNING

Given a set of example sequences from a process, we can estimate the parameters of the Hidden Markov Model λ = (A, B, π) so that they describe the process as well as possible. There are two usual approaches, depending on the form of the available examples: supervised training and unsupervised training.

If the examples contain both inputs and outputs, we can perform supervised training, where the input is the observation sequence and the output is the state sequence. If the examples contain only inputs, we can only train unsupervised, by guessing the model parameters that could have produced the given observation sequences.

In this article we discuss only supervised training; unsupervised training with the Baum-Welch algorithm is presented in [6].

The simplest way to set the parameters of an HMM is to use a set of labelled examples. A typical application of this approach is PoS tagging.
We describe two sets:

- t_1 ... t_N is the set of tags, corresponding to the states s_1 ... s_N of the HMM
- w_1 ... w_M is the set of words, corresponding to the observations v_1 ... v_M of the HMM

To determine the model parameters we use Maximum Likelihood Estimation (MLE) from the observation sequences and their corresponding state sequences.

The transition matrix is estimated as:

a_{ij} = Count(t_i, t_j) / Count(t_i)

where Count(t_i, t_j) is the number of transitions from t_i to t_j, and Count(t_i) is the number of times t_i occurs.

The observation matrix is estimated as:

b_j(k) = Count(w_k, t_j) / Count(t_j)

where Count(w_k, t_j) is the number of times w_k is observed with tag t_j, and Count(t_j) is the number of times t_j occurs.

The parameter π is given by the relative frequency of each tag at the start of a training sequence.

In practice, when estimating these parameters we need smoothing techniques, both to avoid probabilities being estimated as 0 and to improve the performance of the model on data that does not appear in the training examples.
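The counting formulas above, combined with simple add-k smoothing to avoid zero probabilities, might look like this in Python. The function name, the dict-based layout, and the choice of add-k smoothing are my own; the text does not prescribe a particular smoothing method:

```python
from collections import Counter

def train_supervised(tagged_sentences, tags, words, k=1.0):
    """MLE with add-k smoothing for a PoS-tagging HMM.
    tagged_sentences: list of [(word, tag), ...] sequences.
    Returns (A, B, pi) as nested dicts of probabilities."""
    trans, emit, starts, tag_count = Counter(), Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        starts[sent[0][1]] += 1                 # tag at sequence start
        for (w, t) in sent:
            tag_count[t] += 1                   # Count(t)
            emit[(t, w)] += 1                   # Count(w, t)
        for (_, t1), (_, t2) in zip(sent, sent[1:]):
            trans[(t1, t2)] += 1                # Count(t_i, t_j)
    # Transitions out of each tag (denominator for the smoothed A).
    n_out = {t: sum(c for (a, _), c in trans.items() if a == t) for t in tags}
    A = {ti: {tj: (trans[(ti, tj)] + k) / (n_out[ti] + k * len(tags))
              for tj in tags} for ti in tags}
    B = {t: {w: (emit[(t, w)] + k) / (tag_count[t] + k * len(words))
             for w in words} for t in tags}
    total = sum(starts.values())
    pi = {t: (starts[t] + k) / (total + k * len(tags)) for t in tags}
    return A, B, pi
```

With k = 0 this reduces to the unsmoothed MLE counts given above; k = 1 is classic add-one (Laplace) smoothing.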
