
KHOA CNTT - ĐH KHTN





UNIVERSITY OF SCIENCE
FACULTY OF INFORMATION TECHNOLOGY
DEPARTMENT OF KNOWLEDGE ENGINEERING

NGUYỄN HỒNG QUANG - 0012081

VIETNAMESE SPEECH RECOGNITION
STUDY AND APPLICATION

BACHELOR'S THESIS IN INFORMATICS

SUPERVISOR
MSc. BÙI TIẾN LÊN

CLASS OF 2000 - 2004




Acknowledgments

This thesis owes a great deal to Mr. Bùi Tiến Lên, who supervised me and made the work possible throughout my research on speech recognition. I sincerely thank him.
I would also like to thank the teachers of the university, and especially those of the Department of Knowledge Engineering, who created an excellent environment for us to study and do research.
I must also mention the encouragement and care of my family, and the cooperation, help and moral support of my friends.
I am grateful to all of them.

Ho Chi Minh City, July 2004.
Nguyễn Hồng Quang





TABLE OF CONTENTS
LIST OF FIGURES
INTRODUCTION
Chapter 1 OVERVIEW
1.1 Recognition
1.2 Speech recognition
1.2.1 Audio processing
1.2.2 Classification of speech recognition
1.2.2.1 Continuous-word and isolated-word recognition
1.2.2.2 Speaker-dependent and speaker-independent recognition
1.2.3 Automatic speech recognition systems
1.2.4 Theory of speech recognition
1.2.4.1 Feature vector extraction
1.2.4.2 Classification
Chapter 2 SPEECH PROCESSING - FEATURE VECTOR EXTRACTION
2.1 Speech processing
2.1.1 Signal sampling
2.1.2 Signal filters
2.1.3 End-point detection
2.2 Feature extraction
2.2.1 Feature extraction steps
2.2.1.1 Pre-emphasis
2.2.1.2 Segmentation into frames
2.2.1.3 Windowing
2.2.2 Types of speech features
2.2.2.1 Transforming the signal to the frequency domain
2.2.2.2 Energy features
2.2.2.3 MFCC features
2.2.2.4 LPC features
2.2.2.5 Fundamental frequency features
Chapter 3 HIDDEN MARKOV MODELS
3.1 Hidden Markov models
3.2 Applying hidden Markov models to speech recognition
3.2.1 The forward algorithm
3.2.2 The backward algorithm
3.2.3 Finding the optimal state sequence
3.2.4 The Viterbi algorithm
3.2.5 Baum-Welch estimation
3.3 Language structure and phoneme-based recognition models
3.3.1 Language structure
3.3.2 Phoneme models
3.3.3 Allophones
3.3.4 Remarks
Chapter 4 HMM TOOLKIT
4.1 File structures in HTK
4.1.1 Structure of HTK feature vector files
4.1.2 Structure of HMM model files
4.1.3 Structure of data label files
4.1.4 Structure of grammar files
4.2 Whole-word recognition
4.3 Phoneme-model recognition
Chapter 5 APPLICATION: VOICE CONTROL OF AN AUTOMATED VEHICLE
5.1 Vietnamese speech recognition experiments
5.1.1 Offline recognition
5.1.1.1 Using LPCEPSTRA_E_D feature vectors
5.1.1.2 Using LPCEPSTRA_E_D_A feature vectors
5.1.1.3 Using MFCC_0_D feature vectors
5.1.1.4 Using MFCC_0_D_A feature vectors
5.1.1.5 Using MFCC_0_D_A_Z feature vectors
5.1.2 Real-time (online) recognition
5.1.2.1 Phoneme-model recognition using MFCC_0_D_A_Z
5.1.2.2 Whole-word recognition using MFCC_0_D_A_Z
5.2 Speech recognition application
CONCLUSION
REFERENCES
Appendix SOME HTK TOOLS






LIST OF FIGURES
Figure 1.1: General recognition diagram
Figure 1.2: Areas of speech processing
Figure 1.3: The boundary between "c" and "y" is not clear
Figure 1.4: Different speakers pronounce differently
Figure 1.5: Semi-speaker-independent recognition model
Figure 1.6: Basic components of an ASR system
Figure 1.7: Common window types
Figure 1.8: Overview of feature vector extraction
Figure 1.9: Speech recognition techniques and development trends
Figure 1.10: HMM with 3 states and state transition weights
Figure 2.1: Example of sampling a signal f(t) in the time domain
Figure 2.2: Illustration of FIR filter operation
Figure 2.3: Illustration of IIR filter operation
Figure 2.4: End-point detection based on energy level
Figure 2.5: General feature vector extraction diagram
Figure 2.6: Detailed feature extraction diagram
Figure 2.7: Segmenting speech into overlapping frames
Figure 2.8: Differences between signal window types
Figure 2.9: Graph of the relationship between Mel and Hz
Figure 2.10: Steps of MFCC feature extraction
Figure 2.11: Filter bank on the Mel scale
Figure 2.12: Filter bank on the real frequency scale
Figure 2.13: Illustration of the MFCC transformation steps
Figure 2.14: LPC processing diagram for speech feature extraction
Figure 2.15: Shape of the speech signal
Figure 2.16: F0 extraction result
Figure 2.17: Result after median filtering
Figure 3.1: Illustration of how a hidden Markov model operates
Figure 3.2: Left-Right model
Figure 3.3: Bakis model
Figure 3.4: Linear model
Figure 3.5: Illustration of the forward algorithm
Figure 3.6: Illustration of the backward algorithm
Figure 3.7: Example illustrating the Viterbi algorithm
Figure 3.8: Example illustrating the Viterbi algorithm (continued)
Figure 3.9: Example of matching using the forward-backward algorithm
Figure 3.10: Illustration of Baum-Welch estimation
Figure 3.11: Illustration of phoneme recognition in HMM
Figure 4.1: Simple model in speech recognition
Figure 4.2: Modules and functions in HTK
Figure 4.3: Tools and functions in HTK
Figure 4.4: Distribution of parameters in some HTK feature vectors
Figure 4.5: Feature types that can be converted into one another with HCopy
Figure 4.6: Basic form of an HMM file (not yet initialized)
Figure 4.7: Basic form of an HMM file using Gaussian mixtures
Figure 4.8: Basic form of an HMM file using multiple streams
Figure 4.9: Role of the grammar in recognition with HTK
Figure 4.10: Grammar diagram
Figure 4.11: Model illustrating the grammars
Figure 4.12: Illustration of whole-word recognition
Figure 4.13: Whole-word training and supporting tools
Figure 4.14: Operation of HInit
Figure 4.15: Operation of HCompV
Figure 4.16: Operation of HRest
Figure 4.17: Phoneme-model training with HTK
Figure 4.18: File processing in HERest
Figure 5.1: Language model for the recognition system






INTRODUCTION
Speech is the most basic means of human communication; it arose and developed alongside human evolution. For people, speaking is the simplest and most effective way to express themselves. The first advantage of spoken communication is its speed: the listener understands what is said as soon as it is uttered. Speech is also the most widely used means of communication, since anyone can speak (apart, of course, from people with a disability that prevents it).
Today, thanks to advances in science and technology, machines are gradually replacing manual labour. To operate a machine, however, people must carry out many time-consuming operations and need training. This is a considerable obstacle to the use of machines and other technological achievements. Controlling machines and devices by voice would be much easier. The need for voice control is even more pressing for handheld devices such as mobile phones and Palm/Pocket PC devices.
To make computers able to hear, many people have wrestled with audio signals in the field of speech recognition for more than half a century. This effort has been marked by outstanding research results in speech analysis and processing and by quite useful practical applications. Even so, the capabilities of machines remain limited and need further development before they can truly meet real-world needs. Moreover, speech recognition is being developed mainly for other languages and has not yet been strongly developed or applied in our country. Given the state of development in Vietnam, it is genuinely difficult for speech recognition to receive real attention and investment, and for groups and laboratories specializing in speech recognition research to be formed.
This thesis was written in the hope of helping to advance that process. By building on the work of earlier students and by studying achievements abroad, I hope to contribute to progress in speech recognition in our country. In the course of this research I came to see that if knowledge is spread widely, not only among information technology specialists but also among non-specialists, we can certainly promote and develop the field and achieve more success, because the problem is then studied and developed by many people rather than by a few. Other research fields could do the same.
For these reasons I did not only study the theory but also tried to develop it into an application.




Chapter 1 OVERVIEW
1.1 Recognition
Human recognition is a near-perfect process: a person observes the object to be recognized, notes its characteristic features, classifies it, and uses judgement and inference to distinguish it from other objects (within a practically unlimited set of objects).
Automatic recognition, that is, recognition by computer, is by contrast simply the process of distinguishing one signal from others (within a finite set of signals). It is carried out through the following general steps (see Figure 1.1).

Figure 1.1: General recognition diagram
[Diagram blocks: Signal; Preprocessing and feature vector extraction; Training signal; Pattern learning and classification; Dictionary of signal classes; Signal to be recognized; Recognition and pattern matching; Threshold and decision rules; Result]
- Signal acquisition and feature extraction: acquire the signal to be recognized, remove noise and filter it (preprocessing), and extract its features (the feature vector).
- Pattern learning: cluster and classify the feature vectors of each group of signals (using heuristic algorithms, neural networks, or hyperplanes found with algorithms such as K-means or Batchelor-Wilkins). This step produces the signal classes, each of which characterizes one group of signals.
- Recognition and pattern matching: find the relationship between the signal to be recognized and the signal classes created in the previous step (for example, with the nearest-neighbour rule). If the signal best matches some class, and the degree of match satisfies some threshold, the system decides that the signal belongs to that group. The rate at which such decisions are correct is called the accuracy of the recognition system (the higher this rate, the better).
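The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method used later in the thesis: the "learning" step is reduced to computing one mean vector per class, matching uses Euclidean distance with the nearest-neighbour rule, and the names `train_centroids`, `recognize` and `max_distance` as well as the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_centroids(labelled_vectors):
    """Pattern learning (simplified): one mean vector per signal class."""
    sums, counts = {}, {}
    for label, vec in labelled_vectors:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def recognize(vec, centroids, max_distance):
    """Nearest-neighbour matching with a decision threshold.

    Returns the label of the closest class, or None when even the best
    match is farther away than max_distance.
    """
    label, dist = min(((lab, euclidean(vec, c)) for lab, c in centroids.items()),
                      key=lambda item: item[1])
    return label if dist <= max_distance else None


# Toy 2-D "feature vectors" (assumed data, for illustration only).
training = [("yes", [1.0, 1.0]), ("yes", [1.2, 0.8]),
            ("no", [5.0, 5.0]), ("no", [4.8, 5.2])]
centroids = train_centroids(training)
print(recognize([1.1, 1.0], centroids, max_distance=1.0))  # close to class "yes"
print(recognize([9.0, 0.0], centroids, max_distance=1.0))  # rejected by the threshold
```

The threshold is what lets the system answer "unknown" instead of forcing every input into some class.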
1.2 Speech recognition
1.2.1 Audio processing
Once computers could handle sound, the need to process audio appeared. This need has given rise to many practical application areas, such as sound synthesis, audio compression, speaker recognition and speech recognition. The different application areas of speech processing are shown in Figure 1.2.
Audio processing plays an important role in speech recognition: it is needed for noise filtering, signal transformation, feature vector extraction and so on.






Figure 1.2: Areas of speech processing
1.2.2 Classification of speech recognition
1.2.2.1 Continuous-word and isolated-word recognition
A speech recognition system can take one of two forms: continuous recognition or isolated-word recognition.
Continuous recognition means recognizing speech uttered continuously in one signal, such as a sentence, a command or a passage read by the user. Systems of this kind are very complex. The difficulty is that continuously spoken words are hard to process in time (when real-time operation is required) and hard to separate when the speaker talks without pauses (which happens very often in practice). The result of word segmentation strongly affects the later steps, so this stage must be handled very well.
In isolated-word recognition, by contrast, each word to be recognized is pronounced separately, with a pause before and after it. This model is naturally simpler than continuous recognition. It also has practical applications, such as voice-control systems and voice dialling, where it reaches fairly high accuracy, although it cannot be applied as widely as the continuous model.

Figure 1.3: The boundary between "c" and "y" is not clear
1.2.2.2 Speaker-dependent and speaker-independent recognition
In speaker-dependent recognition, each system serves only one person and will not understand anyone else unless it is retrained from scratch. Speaker-dependent systems are therefore hard to accept widely, because not everyone has the ability, the knowledge and above all the patience to train the system. In particular, systems of this kind cannot be used in public places.
A speaker-independent system, by contrast, is closer to the ideal: it can be applied more widely and meets most of the stated requirements. Unfortunately, such an ideal system runs into several problems, above all accuracy.
In practice every person has a different voice, and even the same person sounds different at different times. This strongly affects recognition and reduces the accuracy of the system many times over. To overcome this weakness, a speaker-independent system needs a more complex design and many times more training data (collected from many different voices of many people). Even so, recognition quality does not improve much. A practical compromise is therefore semi-speaker-independent recognition. This method collects samples from a large number of different voices. When the system is put to use, it is adapted to the user's voice by learning a few more sentences containing the necessary words (before using the system, the user goes through a short training session). Microsoft has built this approach into its Office software suite.
Speaker-independent recognition is much harder than speaker-dependent recognition. Even when one person tries to pronounce the same word in exactly the same way, the utterances still differ. The human brain, a near-perfect system, can ignore such differences thanks to context and to its own ability to blur them. For a computer, however, it is very hard to build a model that handles all of these variations.

Figure 1.4: Different speakers pronounce differently





Figure 1.5: Semi-speaker-independent recognition model
1.2.3 Automatic speech recognition systems
Automatic Speech Recognition (ASR) means automatically converting speech into text or into one of the functions of a device.
An automatic speech recognition system consists of the following components:
- Speech feature extraction: converts the audio signal into a sequence of feature vectors. This stage also handles end-point detection (deciding which parts of the recorded audio are speech and which are background noise) and noise filtering.
- Classification and recognition: this is the recognition process proper, based on the system's acoustic model, pronunciation dictionary and language model. The language model here simply represents some grammar. It may correspond to a specific language, or it may be restricted to the scope of the application, which narrows recognition down to a few words rather than the whole vocabulary.
- Decoding: this stage may simply output the recognized text, or it may analyse the recognized sequence to determine which task it corresponds to and then carry out that task.

Figure 1.6: Basic components of an ASR system
Applications:
- Voice control (about 30 words): recognizing names and digits for voice dialling on mobile phones, controlling electronic devices, and so on.
- Electronics and telecommunications (about 2,000 words): automatic form filling in information processing systems, telephone exchanges, and so on.
- Dictionary-sized vocabulary (about 64k words): voicemail transcription (large vocabulary), electronic secretaries, and so on.
1.2.4 Theory of speech recognition
Speech recognition is the technique of recognizing the components of human speech. The process starts with capturing the speech signal from a microphone and ends with the recognized word output by the system. The steps of this process are described in later sections.
Research on speech recognition began in the late 1940s, and the rapid development of computer technology has contributed greatly to it. Today, thanks to leaps in technology, speech recognition is present in several areas of industry. In industrial settings, where the operator's hands and eyes are already fully occupied, voice control is a great advantage. Other applications use speech recognition in automatic telephone room-booking systems, which customers find more convenient than pressing telephone keys. Speech recognition is also applied in many other forms, such as dictation systems, children's toys and games.
Ideally, a recognizer would recognize any word spoken by any person in any environment. In practice, the capability of a system depends on many factors. Vocabulary size, multiple users and continuous recognition (which is much more complex than isolated-word recognition) all make speech recognition harder and more complex. The same is true of background noise.
1.2.4.1 Feature vector extraction
Today, speech signals are processed in the digital domain. The digital signal is obtained by sampling at a fixed frequency, that is, by measuring the signal once per time period. In theory, any band-limited signal can be reconstructed completely if the sampling frequency $F_S$ is at least twice the highest frequency in the signal (Alan and Willsky, 1997). The quality of the sampled signal also depends on the amplitude resolution, which is determined by the number of bits used.
For ASR applications, a frequency-domain representation of the signal is more suitable: a more compact and more useful representation is needed. Feature vector extraction is the processing that converts the audio signal into a sequence of feature vectors. Several kinds of audio features can be used as feature vectors, for example MFCC (Mel Frequency Cepstral Coefficients) and LPC (Linear Prediction Filter Coefficients).




To parameterize the waveform, the signal is divided into a sequence of frames that overlap in time. Each frame is usually about 25 ms long, a duration short enough for the signal within it to be treated as nearly stationary (Figure 1.8).
To reduce noise and emphasize the signal, each frame is multiplied by a window function before processing, usually a Hamming or Hanning window. After windowing, the edges of the frame become smoother, and the window also helps the high-frequency components of the signal show up in the spectrum.

Figure 1.7: Common window types





Figure 1.8: Overview of feature vector extraction
1.2.4.2 Classification
Once speech has been converted into feature vectors, the next task is to recognize what was actually said. There are several approaches to this problem, such as the knowledge-based approach and the template-matching approach, and these methods can be combined.





Figure 1.9: Speech recognition techniques and development trends
a) Template matching
A template-matching system is based on the idea of matching the spoken input against a set of stored templates, such as sample audio segments. Usually each template corresponds to one word in the dictionary. The classifier computes the acoustic difference between the input utterance and each stored template, and then chooses the template that best matches the input.
In a program, an algorithm is needed to find a non-linear alignment between the time scales of the two signals, which compensates for differences caused by speaking rate.
Template matching was widely used in commercial products in the 1970s and 1980s, but it was later increasingly replaced by more powerful methods (Holmes, 2001).
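The non-linear time alignment mentioned above can be illustrated with dynamic time warping (DTW), one well-known algorithm of this kind. The text does not name the algorithm, so using DTW here is an assumption; the function names and the toy one-dimensional "templates" are likewise hypothetical, and a real system would compare sequences of feature vectors rather than single numbers.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers.

    cost[i][j] is the cheapest way to align the first i items of a with
    the first j items of b, allowing items to be repeated (time stretching).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def best_template(utterance, templates):
    """Return the word whose stored template is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Toy templates: one contour per word (assumed data).
templates = {"up": [1, 3, 5, 3, 1], "down": [5, 3, 1, 3, 5]}
slow_up = [1, 1, 3, 3, 5, 5, 3, 3, 1, 1]  # "up" spoken at half speed
print(best_template(slow_up, templates))
```

Because every sample of the slow utterance can be aligned with an identical sample of the "up" template, the warped distance to that template is zero even though the two sequences have different lengths.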
b) Neural networks




A neural network is a model that tries to imitate the human nervous system. It consists of a number of nodes arranged in layers, and the layers are connected to one another by links with different weights. Information enters through the input layer, is processed by the network, and leaves through the output layer. The output of each node is computed as a non-linear function of its weighted input values.
The ability of the network to classify correctly depends on the weights, whose optimal values are determined during training. In training, information about a number of audio samples, such as the amplitude spectrum, is fed into the network through the input nodes, and the output values are compared with the desired values. The difference between them is used to change the weights. This process is repeated several times for each training sample, which increases the accuracy of the network.
Although it is an interesting and promising technique, the neural network has not yet been truly successful in a complete continuous speech recognition system.
c) Knowledge-based approach
A knowledge-based system uses knowledge to tell sounds apart. In the 1970s and 1980s it was well suited to expert systems, relying on a set of rules derived from knowledge about the audio signal.
Another form of system is derived from the human speech production process. Here, instead of a rule set, an intermediate component is defined. In this approach, discrimination is performed by comparing synthesized speech with the speech to be recognized. Although it is a technique with potential, such a system has its limitations.
d) Hidden Markov models (HMM)




The hidden Markov model is a powerful statistical method for modelling speech signals, and it is the dominant approach in speech recognition today. A hidden Markov model represents one unit of the language, such as a word or a phoneme. It consists of a finite number of states and transitions between them. Transitions are governed by transition probabilities, and a Gaussian distribution function is commonly chosen for the probability distributions involved.
Given a sequence of observations, the model can determine the probability of observing that sequence. The state sequence behind the observations is hidden, however: from the observations alone one cannot say with certainty which states were visited or in what order.
The transition probabilities and the probability distributions depend on the weights of the model. During training these weights are optimized to fit the training data (Figure 1.10).

Figure 1.10: HMM with 3 states and state transition weights
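As a small illustration of how such a model assigns a probability to a sequence of observations, the sketch below evaluates a 3-state left-to-right HMM with one-dimensional Gaussian observation densities using the forward recursion. All numbers (transition weights, means, variances, observation values) are assumed purely for illustration and are not taken from the thesis, and the helper names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_likelihood(obs, start, trans, means, variances):
    """Likelihood of the observation sequence under the model.

    alpha[s] holds the joint density of the observations seen so far and
    of being in state s at the current time step (forward recursion).
    """
    n = len(start)
    alpha = [start[s] * gaussian_pdf(obs[0], means[s], variances[s]) for s in range(n)]
    for x in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n))
                 * gaussian_pdf(x, means[s], variances[s])
                 for s in range(n)]
    return sum(alpha)


# Assumed 3-state left-to-right model: each state either repeats or moves on.
start = [1.0, 0.0, 0.0]
trans = [[0.6, 0.4, 0.0],
         [0.0, 0.6, 0.4],
         [0.0, 0.0, 1.0]]
means = [0.0, 5.0, 10.0]
variances = [1.0, 1.0, 1.0]

rising = [0.1, 0.2, 4.9, 5.1, 9.8, 10.1]   # follows the state order 1 -> 2 -> 3
falling = list(reversed(rising))           # same values, wrong order
rising_score = forward_likelihood(rising, start, trans, means, variances)
falling_score = forward_likelihood(falling, start, trans, means, variances)
print(rising_score > falling_score)
```

The sum over all predecessor states in the recursion is what makes the state sequence "hidden": the likelihood accounts for every path through the states rather than committing to one.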




Chapter 2 SPEECH PROCESSING - FEATURE VECTOR EXTRACTION
2.1 Speech processing
A signal is any object or phenomenon that carries or contains information which we can interpret according to an agreed convention. Signals in the real world are continuous (analog signals). They are extremely complex and lack the precision a computer requires. They are therefore usually converted into digital signals (digitized), a form of information that a computer can process.
Speech is also an analog signal, so it too must be digitized.
2.1.1 Signal sampling
The sampling function is the bridge between discrete systems and continuous systems. It is also known as the Dirac delta function or the sifting function.
Formula 2.1
$$x_s(t) = x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT)$$
For a computer, sampling simply means measuring the signal once every time period (for audio and similar signals) or once every spatial period (for images and similar signals).
This process produces a sequence of numbers that represents the signal and can be processed by a computer.
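A minimal sketch of this idea: a continuous signal, represented here by an ordinary Python function of time, is measured once every period T. The 100 Hz tone, the 8000 Hz sampling rate and the helper names are assumptions chosen only to make the example concrete.

```python
import math


def sample(signal, sample_rate_hz, duration_s):
    """Measure signal(t) once every T = 1 / sample_rate_hz seconds."""
    period = 1.0 / sample_rate_hz
    count = int(round(duration_s * sample_rate_hz))
    return [signal(n * period) for n in range(count)]


def tone(t):
    """A 100 Hz sine wave standing in for the analog input."""
    return math.sin(2.0 * math.pi * 100.0 * t)


samples = sample(tone, sample_rate_hz=8000, duration_s=0.01)
print(len(samples))                       # 80 measurements for 10 ms of signal
print([round(v, 3) for v in samples[:5]])
```

With 8000 samples per second, one full cycle of the 100 Hz tone is described by 80 numbers, which is the sequence the later processing steps operate on.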





Figure 2.1: Example of sampling a signal f(t) in the time domain
2.1.2 Signal filters
Digital filters play a very important role in speech processing. They are used for two main purposes:
- Separating the wanted signal: the original signal usually contains noise or other unwanted signals that considerably degrade its quality, so the wanted signal has to be separated out.
Example: a recorded audio signal usually also contains environmental noise, such as a ceiling fan blowing into the microphone. For photographs, the equivalent is the speckles on old photos when they are scanned.
- Restoring distorted signals: in some cases, for some reason (usually related to the equipment), the input signal is distorted. It must then be corrected to improve the quality of the digital signal.
Example: old microphones produce poor audio signals, and a blurred lens (focus lens) in a scanner makes the scanned images blurred as well.
In engineering practice, two kinds of linear filter are used to filter signals:




- Finite Impulse Response (FIR) filter: the output depends only on the input, so these systems are also called non-recursive circuits. The filter is given by:
Formula 2.2
$$y(n) = b_0 x(n) + b_1 x(n-1) + \dots + b_q x(n-q) = \sum_{j=0}^{q} b_j x(n-j)$$

Figure 2.2: Illustration of FIR filter operation
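Formula 2.2 translates directly into code. The sketch below is a plain, unoptimized transcription that assumes x(n) = 0 for n < 0; the 4-tap moving-average coefficients and the test signal are assumptions chosen for illustration.

```python
def fir_filter(b, x):
    """y(n) = sum over j = 0..q of b[j] * x(n - j), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for j, coeff in enumerate(b):
            if n - j >= 0:
                acc += coeff * x[n - j]
        y.append(acc)
    return y


moving_average = [0.25, 0.25, 0.25, 0.25]                # q = 3, equal weights
alternating = [0.0, 4.0, 0.0, 4.0, 0.0, 4.0, 0.0, 4.0]   # rapidly changing input
print(fir_filter(moving_average, alternating))
# [0.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]

impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(fir_filter(moving_average, impulse))
# [0.25, 0.25, 0.25, 0.25, 0.0, 0.0]
```

The second call shows why the response is called finite: after q + 1 samples the output of a single impulse is exactly zero again.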
- Infinite Impulse Response (IIR) filter: a system whose impulse response has infinite length. The output depends not only on the input but also on the past values of the output itself, so these systems are also called recursive circuits. The filter is given by:
Formula 2.3
$$y_n = \sum_{i=1}^{p} a_i y_{n-i} + \sum_{j=0}^{q} b_j x_{n-j}$$

Figure 2.3: Illustration of IIR filter operation
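Formula 2.3 can be transcribed in the same way. The sketch assumes zero initial conditions (y(n) = 0 and x(n) = 0 for n < 0) and uses a single feedback coefficient a_1 = 0.5 purely as an example.

```python
def iir_filter(a, b, x):
    """y(n) = sum_{i=1..p} a[i-1] * y(n-i) + sum_{j=0..q} b[j] * x(n-j)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, coeff in enumerate(a, start=1):   # feedback from past outputs
            if n - i >= 0:
                acc += coeff * y[n - i]
        for j, coeff in enumerate(b):            # contribution of the input
            if n - j >= 0:
                acc += coeff * x[n - j]
        y.append(acc)
    return y


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(iir_filter([0.5], [1.0], impulse))
# [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

Each output is half of the previous one, so the response to a single impulse keeps decaying but never becomes exactly zero, which is the "infinite" impulse response the name refers to.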




2.1.3 End-point detection
End-point detection is the processing that tries to find exactly when a person starts and stops speaking. It is also used to determine when the person is not really saying anything, or is saying something unexpected (for example, words outside the predefined vocabulary). End-point detection thus reduces the number of frames the recognizer has to process, which lowers the computational load. It is not as easy as one might think, however, because of background noise, background speech and the linking of syllables, and because unvoiced segments at the beginning and end of speech are hard to detect.
End-point detection is carried out in three steps, each of which locates the end points more precisely. Detection is based on the energy level of the signal, characterized by
$$E = \log \sum_{n=1}^{N} x^2(n)$$
(see section 2.2.2.2).
a) Coarse detection: based on the least accurate energy technique. It looks for a segment whose energy level is higher than that of the preceding segment and takes as the start point a position a number of frames (usually about 40) before the first frame with the higher energy level. The position a further number of frames (usually about 20) later is taken as the end point, without checking any of those frames.
b) Fine detection: this step checks the energy level of the speech. It tries to pick out the start and end points on the assumption that the energy level of speech is higher than that of the background noise (higher than some threshold).
c) VUS technique (Voice, Unvoice and Silence): this technique tries to classify each frame as voiced, unvoiced or silence. The classification is based on the energy distribution within the frame, the spectral shape and the classification of the preceding frame. The method tries to remove the non-speech parts, such as lip smacks, breathing or background noise (for example, a door closing).
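The energy-threshold idea behind the fine detection step can be sketched as follows. This is a deliberately simplified illustration, not the three-step procedure itself: the signal is cut into non-overlapping frames, the log energy of each frame is computed with the formula above, and the first and last frames above an assumed threshold are reported. The small `floor` added before the logarithm (to avoid log 0), the threshold value and the synthetic signal are all assumptions.

```python
import math


def log_energy(frame, floor=1e-10):
    """E = log(sum of squared samples); floor avoids log(0) on pure silence."""
    return math.log(sum(v * v for v in frame) + floor)


def find_endpoints(samples, frame_len, threshold):
    """Return (first, last) indices of frames whose log energy exceeds threshold.

    Frames are non-overlapping blocks of frame_len samples. Returns None
    when no frame is loud enough to count as speech.
    """
    energies = [log_energy(samples[k:k + frame_len])
                for k in range(0, len(samples) - frame_len + 1, frame_len)]
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]


# Synthetic input: quiet background, a louder burst, quiet background again.
silence = [0.001] * 40
speech = [0.5, -0.5] * 20
signal = silence + speech + silence
print(find_endpoints(signal, frame_len=10, threshold=-5.0))  # (4, 7)
```

In a real recording the threshold would have to be tied to the measured background noise level, which is exactly where the difficulties listed above (noise, unvoiced onsets) come in.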

Figure 2.4: End-point detection based on energy level




2.2 Feature extraction

Figure 2.5: General feature vector extraction diagram
A speech recognition system needs to extract feature vectors from speech. This reduces the amount of data used in training and recognition, and so considerably reduces the computation in the system. Feature extraction also brings out the differences between one word and another, and blurs the differences between two pronunciations of the same word. Figure 2.6 illustrates the processing steps in extracting speech feature vectors.




2.2.1 Feature extraction steps

Figure 2.6: Detailed feature extraction diagram
2.2.1.1 Pre-emphasis
The purpose of this step is to strengthen the signal, bring out its characteristic features, and make it less sensitive to finite-precision effects in the later processing steps. The pre-emphasis stage is usually a high-pass filter with the following difference equation:
Formula 2.4
$$\tilde{s}(n) = s(n) - a\,s(n-1), \quad 0.9 \le a \le 1$$
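Formula 2.4 is a one-line filter. In the sketch below the coefficient a = 0.95 is an assumed value inside the stated range, and s(-1) is taken to be 0; both are choices made for the example rather than values given in the text.

```python
def pre_emphasize(s, a=0.95):
    """s~(n) = s(n) - a * s(n - 1), taking s(-1) = 0."""
    return [s[n] - a * (s[n - 1] if n > 0 else 0.0) for n in range(len(s))]


constant = [1.0, 1.0, 1.0, 1.0]        # lowest possible frequency
alternating = [1.0, -1.0, 1.0, -1.0]   # highest possible frequency
print(pre_emphasize(constant))
print(pre_emphasize(alternating))
```

After the first sample, the constant input is reduced to about 0.05 while the alternating input grows to about 1.95 in magnitude, which shows the high-pass behaviour: slow variations are attenuated and fast ones are boosted.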
2.2.1.2 Segmentation into frames
In the frame segmentation step, $\tilde{s}(n)$ is divided into frames of N samples each, with a spacing of M samples between the starts of consecutive frames. Figure 2.7 illustrates the segmentation for the case M = (1/3)N.




Specifically, the first frame consists of the first N speech samples (from $\tilde{s}(0)$ to $\tilde{s}(N-1)$). The second frame starts at sample M and ends at position M+N-1. In general, the frame with index i (counting from 0) starts at sample i*M and ends at position i*M+N-1. The process continues until every speech sample belongs to one or more frames.
It is easy to see that if M < N, adjacent frames overlap (as in Figure 2.7), so the features extracted from one frame are correlated with those of the next. When M << N, the change from frame to frame is very smooth. Conversely, if M > N there is no overlap between adjacent frames, and some speech samples are lost (that is, they do not appear in any frame).
If we denote frame i by $x_i(n)$ and assume there are L frames in the speech signal, then:
$$x_i(n) = \tilde{s}(M \cdot i + n), \quad n = 0, 1, \dots, N-1; \quad i = 0, 1, \dots, L-1$$

Figure 2.7: Segmenting speech into overlapping frames
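The frame definition above can be written as a short function. The sketch keeps only complete frames, so a few trailing samples may be left out; that choice (rather than padding the last frame) is an assumption, as are the function name and the toy input.

```python
def split_into_frames(s, frame_len, frame_shift):
    """x_i(n) = s(M * i + n) for n = 0..N-1, where N = frame_len, M = frame_shift.

    Only complete frames are returned; an incomplete final frame is dropped.
    """
    frames = []
    start = 0
    while start + frame_len <= len(s):
        frames.append(s[start:start + frame_len])
        start += frame_shift
    return frames


samples = list(range(12))   # sample values equal their own indices
for frame in split_into_frames(samples, frame_len=6, frame_shift=2):
    print(frame)
# [0, 1, 2, 3, 4, 5]
# [2, 3, 4, 5, 6, 7]
# [4, 5, 6, 7, 8, 9]
# [6, 7, 8, 9, 10, 11]
```

With M = N/3, as in Figure 2.7, each sample in the middle of the signal appears in three consecutive frames.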
2.2.1.3 Windowing
The next processing step is to window each frame in order to minimize the signal discontinuities at the beginning and end of the frame. A sub-sequence taken from a longer, or infinitely long, signal x(n) is called a signal window. Observing the signal x(n) through a segment $x_N(n)$ on the interval from $n_0$ to $n_0 + N - 1$ is equivalent to multiplying x(n) by a window function $w(n - n_0)$.
Formula 2.5
$$x_N(n) = x(n)\,w(n - n_0) = \begin{cases} x(n), & n_0 \le n \le n_0 + N - 1 \\ 0, & (n < n_0) \lor (n > n_0 + N - 1) \end{cases}$$





Window types
In digital signal processing, the commonly used windows can be expressed through the generalised Hamming window:
Formula 2.6
$$w(n) = \begin{cases} \alpha + (1 - \alpha)\cos(2\pi n / N), & |n| \le N/2 \\ 0, & |n| > N/2 \end{cases}$$
Different values of $\alpha$ give different windows:
- $\alpha = 0.54$ gives the Hamming window, the most commonly used form.
- $\alpha = 0.5$ gives the Hanning window.
- $\alpha = 1$ gives the rectangular window.
In addition, the width of the window also has a considerable effect on the results of the analysis.
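As a small illustration of Formula 2.6, the sketch below evaluates the generalised Hamming window for the three values of $\alpha$ listed above. The function name `generalized_hamming` is hypothetical, and the sketch assumes an even N with the window indexed from -N/2 to N/2 as in the formula.

```python
import math


def generalized_hamming(N, alpha=0.54):
    """Samples of w(n) = alpha + (1 - alpha)*cos(2*pi*n/N) for n = -N/2..N/2.

    Assumption: N is even, so the window has N + 1 points centred on n = 0.
    """
    half = N // 2
    return [alpha + (1 - alpha) * math.cos(2 * math.pi * n / N)
            for n in range(-half, half + 1)]


hamming = generalized_hamming(8, 0.54)      # tapers to about 0.08 at the edges
hanning = generalized_hamming(8, 0.5)       # tapers to 0 at the edges
rectangular = generalized_hamming(8, 1.0)   # constant 1
print(hamming[0], hanning[0], rectangular[0])
```

To window a frame, each frame sample is multiplied by the window sample at the same position.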




Several other windows are also used in digital signal processing, such as the triangular, Kaiser, Blackman, and cosine windows. The following examples show the differences between window types.

Figure 2.5a: The vowel /a/, rectangular window, 512 points (45 ms, left) and 64 points (5.6 ms, right)


Figure 2.5b: The vowel /a/, Hamming window, 512 points (45 ms, left) and 64 points (5.6 ms, right)

Figure 2.5c: The vowel /a/, Hanning window, 512 points (45 ms, left) and 64 points (5.6 ms, right)






Figure 2.8: Differences between window types




2.2.2 Types of speech features
When extracting features, we need to choose features that satisfy the following requirements:
- Able to express speech information independently of the speaker
- Easy to compute
- Stable over time
- Occurring naturally and continuously in speech
- Varying little with the speaking environment (environment independent)
- Unaffected by distortion
- Unaffected by background noise and limited bandwidth
- Unaffected by the speaker's condition
No feature with all of these properties exists.
Current feature types
Acoustic-domain features
- Autocorrelation coefficients (COR)
- Linear Prediction Coefficients (LPC)
- Partial Correlation coefficients (PARCOR)
- Log Area Ratio coefficients (LAR)
- Perceptual Linear Prediction (PLP)
Frequency-domain and cepstral features
- Line Spectrum Pairs (LSP)
- Bank of filters (linear)
- Bank of filters (Mel)
- Mel Frequency Cepstral Coefficients (MFCC)
2.2.2.1 Transforming the signal to the frequency domain
There are two transforms:
a) Discrete Fourier transform




The Fourier transform is an invertible transform used to take a signal into the frequency domain. It uses the following discrete transform formulas:
Forward transform:
Formula 2.7
$$X(k) = \sum_{n=0}^{N-1} x(n)\,e^{-j 2\pi k n / N}, \quad k = 0, 1, 2, \ldots, N-1$$
Inverse transform:
Formula 2.8
$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,e^{j 2\pi k n / N}, \quad n = 0, 1, 2, \ldots, N-1$$
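The forward and inverse formulas can be checked against each other with a direct (slow, $O(N^2)$) Python sketch. The function names `dft` and `idft` are illustrative, and the sketch assumes the 1/N normalisation is placed on the inverse transform.

```python
import cmath


def dft(x):
    """Direct evaluation of X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """Direct evaluation of x(n) = (1/N) * sum_k X(k) * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)
print([round(abs(v), 6) for v in X])       # magnitude spectrum
print([round(v.real, 6) for v in idft(X)])  # recovers x up to rounding
```

Applying `idft` to the output of `dft` returns the original samples up to rounding error, which is the invertibility property stated above.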
b) Discrete cosine transform
The cosine transform is a powerful transform, used in JPEG image compression. It also takes a signal into the frequency domain, using the following formulas:
Forward transform:
Formula 2.9
$$X(k) = \alpha(k)\sum_{n=0}^{N-1} x(n)\cos\frac{(2n+1)k\pi}{2N}, \quad k = 0, 1, 2, \ldots, N-1$$
Inverse transform:
Formula 2.10
$$x(n) = \sum_{k=0}^{N-1} \alpha(k)\,X(k)\cos\frac{(2n+1)k\pi}{2N}, \quad n = 0, 1, 2, \ldots, N-1$$
where:
$$\alpha(k) = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & 1 \le k \le N-1 \end{cases}, \quad k \in \mathbb{Z}$$




Both transforms have fast versions, the FFT (Fast Fourier Transform) and the FCT (Fast Cosine Transform), which speed up processing and suit real-time tasks such as audio processing. These fast transforms are based on radix-2 decomposition: instead of transforming the whole signal at once, the transform splits the sequence into two sub-sequences and recursively applies itself to each half. Because of the repeated halving, the sequence length must be a power of 2 (this is easily arranged by extending the sequence and padding it with zeros).
For example, the decomposition of a 16-point signal proceeds as follows:
1 signal of 16 points:   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2 signals of 8 points:   0 2 4 6 8 10 12 14 | 1 3 5 7 9 11 13 15
4 signals of 4 points:   0 4 8 12 | 2 6 10 14 | 1 5 9 13 | 3 7 11 15
8 signals of 2 points:   0 8 | 4 12 | 2 10 | 6 14 | 1 9 | 5 13 | 3 11 | 7 15
16 signals of 1 point:   0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15
The complexity of this method is $O(N \log_2 N)$.
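The recursive even/odd split can be made concrete with the following Python sketch. `decimation_order` reproduces the final ordering of the 16-point example, and `fft` is a minimal recursive radix-2 transform. Both names are illustrative; the sketch assumes the input length is already a power of 2 and makes no attempt at efficiency.

```python
import cmath


def decimation_order(indices):
    """Recursively split into even- and odd-positioned halves."""
    if len(indices) <= 1:
        return list(indices)
    return decimation_order(indices[0::2]) + decimation_order(indices[1::2])


def fft(x):
    """Recursive radix-2 FFT; len(x) is assumed to be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t            # first half of the spectrum
        out[k + N // 2] = even[k] - t   # second half reuses the same product
    return out


print(decimation_order(list(range(16))))
print(fft([1.0, 2.0, 3.0, 4.0]))
```

Each level of recursion does O(N) work and there are $\log_2 N$ levels, which is where the $O(N \log_2 N)$ complexity comes from.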
2.2.2.2 Energy feature
The energy of a signal expresses its level, that is, how much signal is present per unit of time. The energy of a speech signal is a physical property of the signal; it is used as a parameter in the feature vector for speech recognition and also to detect silence in the speech signal. Signal energy is usually computed after framing and windowing, by summing the squares of the samples x(n) within the window.
The energy feature used here is the log of the signal energy, computed as follows:
Formula 2.11
$$E = \log \sum_{n=1}^{N} x^2(n)$$
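Formula 2.11 translates directly into code. In the sketch below the function name `log_energy` and the small `floor` value are assumptions: the floor only guards against taking the log of zero on an all-silent frame, and the natural logarithm is assumed.

```python
import math


def log_energy(frame, floor=1e-10):
    """E = log(sum of x(n)^2) over one windowed frame.

    Assumption: natural log, with a small floor to avoid log(0) on silence.
    """
    energy = sum(x * x for x in frame)
    return math.log(max(energy, floor))


print(log_energy([1.0, 2.0, 2.0]))   # log(9)
print(log_energy([0.0, 0.0, 0.0]))   # log(floor), a very low value
```

Frames whose log energy falls far below that of speech frames are candidates for silence.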

2.2.2.3 MFCC features
Studies show that the human auditory system does not perceive the frequencies of sound on a linear scale. Perceptual frequency scales were therefore introduced to match human hearing.
These scales were constructed experimentally, so formulas have been devised to approximate the conversion. Of the available scales and formulas, MFCC features use the Mel scale, shown in the following graph:

Figure 2.9: Relationship between Mel and Hz





Figure 2.10: MFCC feature extraction steps
We use the Fourier transform to take the signal from the time domain to the frequency domain. The signal is then passed through a filter bank of triangular filters whose centre frequencies are evenly spaced on the Mel scale.

Figure 2.11: Filter bank on the Mel scale





Figure 2.12: Filter bank on the true frequency scale
Taking the log of the filter-bank outputs and applying the discrete cosine transform yields the MFCC coefficients.

Figure 2.13: Illustration of the MFCC transform steps




2.2.2.4 LPC features
The basic idea of the LPC method is that, at time n, the speech sample s(n) can be approximated by a linear combination of the p preceding samples.
Formula 2.12
$$s(n) \approx \tilde{s}(n)$$
where $\tilde{s}(n) = \sum_{k=1}^{p} a_k\,s(n-k)$ is the predicted value of s(n)
(assuming $a_1, a_2, \ldots, a_p$ are constant over the frame of data under consideration).
We turn this relation into an equality by adding a term G·u(n), called the excitation source:
Formula 2.13
$$s(n) = \tilde{s}(n) + G\,u(n)$$
where u(n) is the normalised excitation and G is its gain.
The prediction error $\tilde{e}(n)$ is then defined as:
Formula 2.14
$$\tilde{e}(n) = s(n) - \tilde{s}(n) = G\,u(n)$$
To find the set of coefficients $a_k$, k = 1, 2, ..., p, on the frame being analysed, the basic approach is to minimise the mean squared error. This leads to a system of equations in p unknowns. There are several ways to solve this system, but in practice the method usually used is autocorrelation analysis.





Figure 2.14: LPC processing diagram for speech feature extraction
Figure 2.14 shows the detailed diagram of the LPC processing used to extract speech features. The basic steps in the process are as follows:
Autocorrelation analysis
After windowing, each frame goes through autocorrelation analysis, which yields (p + 1) autocorrelation coefficients:
Formula 2.15
$$r_i(m) = \sum_{n=0}^{N-1-m} \tilde{x}_i(n)\,\tilde{x}_i(n+m), \quad m = 0, 1, \ldots, p$$
Here the highest autocorrelation lag, p, is called the order of the LPC analysis. Values of p between 8 and 16 are typically used.
LPC analysis
In this step, each frame of (p + 1) autocorrelation coefficients is converted into p LPC coefficients using the Levinson-Durbin algorithm.




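The Levinson-Durbin algorithm is given by the following pseudocode. Its input is the p+1 autocorrelation coefficients in r, and its output is the p LPC coefficients in a.
Procedure Levinson_Durbin (Vector a, Vector r)
    E(0) ← r(0)
    For i = 1 to p do
        k_i ← r(i)
        For j = 1 to i − 1 do
            k_i ← k_i + α_j^(i−1) · r(i − j)
        End for
        k_i ← k_i / E(i−1)
        α_i^(i) ← −k_i
        For j = 1 to i − 1 do
            α_j^(i) = α_j^(i−1) − k_i · α_(i−j)^(i−1)
        End for
        E(i) = (1 − k_i²) · E(i−1)
    End for
    For m = 1 to p do
        a(m) ← α_m^(p)
    End for
End Procedure
With the signs as written, the a(m) are the coefficients of the inverse filter $A(z) = 1 + \sum_{m=1}^{p} a(m)\,z^{-m}$; the predictor coefficients $a_k$ of Formula 2.12 are their negatives.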
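The two steps (autocorrelation analysis, then the Levinson-Durbin recursion) can be sketched in Python as follows. The function names are illustrative. Note the stated assumption about signs: this sketch returns the predictor coefficients in the sense of Formula 2.12, so that $\tilde{s}(n) = \sum_k a_k s(n-k)$; it also assumes r(0) > 0, so an all-zero frame would need to be handled separately.

```python
def autocorrelation(x, p):
    """r(m) = sum_{n=0}^{N-1-m} x(n) * x(n+m) for m = 0..p (Formula 2.15)."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) for m in range(p + 1)]


def levinson_durbin(r):
    """Solve for the predictor coefficients from r(0..p).

    Returns (a, E): a[1..p] are the predictor coefficients of Formula 2.12
    (a[0] is unused and left at 0), E is the final prediction error.
    Assumption: r[0] > 0.
    """
    p = len(r) - 1
    a = [0.0] * (p + 1)
    E = r[0]
    for i in range(1, p + 1):
        # Reflection coefficient for order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / E
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        E *= 1.0 - k * k
    return a, E


# Autocorrelation of a first-order process with r(m) = 0.5**m:
a, E = levinson_durbin([1.0, 0.5, 0.25])
print(a, E)
```

For this first-order example a single coefficient explains the whole autocorrelation sequence, so the second coefficient comes out as zero.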
At this point the LPC coefficients could be used as the feature vector for each frame. However, a further transform derives from the LPC coefficients a different set of coefficients that is more compact: cepstral analysis.
Cepstral analysis
From the p LPC coefficients of each frame we derive Q cepstral coefficients c(m) with the following recursion:
Formula 2.16
$$c_0 = \ln \sigma^2$$
$$c_m = a_m + \sum_{k=1}^{m-1} \frac{k}{m}\,c_k\,a_{m-k}, \quad 1 \le m \le p$$
$$c_m = \sum_{k=1}^{m-1} \frac{k}{m}\,c_k\,a_{m-k}, \quad p < m \le Q$$
where $\sigma^2$ is the gain of the LPC model and $a_j$ is taken as 0 for j > p. Typically Q ≈ (3/2)p is chosen.
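The recursion of Formula 2.16 can be sketched as below. The function name `lpc_to_cepstrum` is hypothetical; the sketch assumes the coefficients a[1..p] follow the predictor convention of Formula 2.12 and that terms with m - k > p are skipped (equivalent to treating those coefficients as zero).

```python
import math


def lpc_to_cepstrum(a, gain_sq, Q):
    """Cepstral coefficients c[0..Q] from predictor coefficients a[1..p].

    a[0] is ignored. gain_sq is sigma^2, the gain of the LPC model.
    """
    p = len(a) - 1
    c = [0.0] * (Q + 1)
    c[0] = math.log(gain_sq)
    for m in range(1, Q + 1):
        acc = a[m] if m <= p else 0.0
        for k in range(1, m):
            if m - k <= p:          # a_(m-k) is zero beyond order p
                acc += (k / m) * c[k] * a[m - k]
        c[m] = acc
    return c


# Single-pole model with a_1 = 0.5: the cepstrum is 0.5**m / m.
print(lpc_to_cepstrum([0.0, 0.5], gain_sq=1.0, Q=3))
```

The single-pole case has the closed form $c_m = a_1^m / m$, which gives a quick sanity check of the recursion.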
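Weighting the cepstral coefficients
Because the low-order cepstral coefficients are sensitive to the overall spectral slope and the high-order coefficients are sensitive to noise, a weighting technique is usually applied to reduce these sensitivities:
$$\hat{c}_i(m) = c(m)\,w(m)$$
where w(m) is the weighting function. A suitable weighting function is usually the band-pass filter:
Formula 2.17
$$w(m) = 1 + \frac{Q}{2}\sin\frac{\pi m}{Q}, \quad 1 \le m \le Q$$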
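The weighting of Formula 2.17 is a one-line computation per coefficient. In this sketch the names `lifter_weights` and `apply_lifter` are illustrative, and the input list is assumed to hold c(1)..c(Q) only (without c(0)).

```python
import math


def lifter_weights(Q):
    """w(m) = 1 + (Q/2) * sin(pi*m/Q) for m = 1..Q."""
    return [1.0 + (Q / 2.0) * math.sin(math.pi * m / Q) for m in range(1, Q + 1)]


def apply_lifter(c):
    """Weight the coefficients c(1)..c(Q); Q is taken from len(c)."""
    return [cm * wm for cm, wm in zip(c, lifter_weights(len(c)))]


print(lifter_weights(12))
```

The weights rise from about 1 to a maximum of 1 + Q/2 near m = Q/2 and fall back to 1 at m = Q, so both the lowest and highest orders are de-emphasised relative to the middle ones.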
Remarks
The LPC model is particularly well suited to speech signals. For voiced speech, which is quasi-stationary, the all-pole LPC model gives a good approximation of the spectral envelope. For unvoiced speech the LPC model is less effective than for voiced speech, but it is still a useful model for speech recognition purposes. The LPC model is simple and easy to implement in both hardware and software. In particular, experience has shown that LPC performs better than filter-bank feature extractors.
2.2.2.5 Fundamental frequency feature
The fundamental frequency plays an important role in speech recognition. It allows syllables to be distinguished by certain phonetic characteristics. It also conveys nuance, tone, the speaker's voice, and so on. Determining the fundamental frequency is therefore an indispensable part of speech recognition systems, especially for tonal languages such as Vietnamese.
The realisation of tones is tied to the value and the variation of the fundamental frequency. In speech processing the signal is divided into consecutive frames, so a tone is expressed by the fundamental frequency within each frame as well as by its movement from one frame to the next.
The input to the extraction methods in this section is either the raw speech signal or a signal that has been processed by a nonlinear operation (such as center clipping) or replaced by the prediction error (of the LPC model).
The fundamental frequency exists only in voiced sounds, so fundamental-frequency extraction must also take on the task of distinguishing unvoiced from voiced sounds.
To improve effectiveness, some preprocessing steps are applied to the raw speech signal in order to increase accuracy and reduce computation. Typically the raw signal goes through two steps before F0 extraction:
- Low-pass filtering: the speech signal is passed through a low-pass filter to remove components above $F_{max}$ (the highest possible fundamental frequency of speech). Typically $F_{max}$ = 900 Hz.
- Resampling, which shortens the waveform. The sampling rate is reduced to 2 kHz (by the Nyquist theorem, the sampling rate must be at least twice the highest fundamental frequency). Resampling considerably reduces the computation: for the usual F0 methods, the computation falls by a factor of about $Z^2$, where Z is the ratio by which the sampling rate is reduced.
a) Autocorrelation method




Compute the autocorrelation function over a speech frame of length N:
Formula 2.18
$$r_N(p) = \sum_{k=0}^{N-1-p} s(k)\,s(k+p)$$
Here p is restricted to the range in which the pitch period can lie. If the signal s(n) is periodic, peaks appear at p = 0, P, 2P, ... (P is the pitch period). The usual decision threshold for a peak is $r_N(p) > 0.8\,r_N(0)$. There are also several ideas for a dynamic threshold based on the relation between the energy of the frame and the average energy of the whole signal.
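A minimal sketch of the autocorrelation pitch detector is given below. The function names, the explicit search range (`p_min`, `p_max`), and the rule of returning `None` for a frame judged unvoiced are assumptions made for illustration; only Formula 2.18 and the fixed 0.8 threshold come from the text.

```python
import math


def autocorr(s, p):
    """r_N(p) = sum_{k=0}^{N-1-p} s(k) * s(k+p) (Formula 2.18)."""
    return sum(s[k] * s[k + p] for k in range(len(s) - p))


def pitch_period_autocorr(s, p_min, p_max, threshold=0.8):
    """Return the lag in [p_min, p_max] with the largest r_N(p),
    or None if that peak does not exceed threshold * r_N(0)."""
    r0 = autocorr(s, 0)
    if r0 == 0:
        return None
    best = max(range(p_min, p_max + 1), key=lambda p: autocorr(s, p))
    return best if autocorr(s, best) > threshold * r0 else None


# A pure tone with a period of 10 samples.
tone = [math.sin(2 * math.pi * n / 10) for n in range(200)]
print(pitch_period_autocorr(tone, 5, 15))
```

Given the sampling rate $f_s$, the fundamental frequency follows as $F0 = f_s / P$.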
Remarks:
- The signal is usually multiplied by a window function to reduce the effect of changes in intonation.
- Applied to the raw speech signal the method does not work well: the peaks are not distinct.
- Some preprocessing is needed to remove the vocal-tract information.
- Applying the method to $\tilde{e}(n)$ works better (the Simplified Inverse Filter Tracking method).
- Several peaks can be kept in each frame, and dynamic programming then used to find the F0 sequence over a run of consecutive frames.
A method derived from this one uses the cross-correlation between two signals x(n) and y(n), where y(n) = x(n + P) (y(n) is x(n) shifted by P samples).
b) LPC error and the SIFT method
The LPC model is characterised by a transfer function of the form:
$$H(z) = \frac{S(z)}{G\,U(z)}$$
In the time domain this is:
$$s(n) = \sum_{k=1}^{p} a_k\,s(n-k) + G\,u(n)$$
Here G·u(n) is the excitation source; for voiced sounds, G·u(n) describes the vibration of the vocal cords, that is, F0.
We also define the estimation error $\tilde{e}(n)$ as:
$$\tilde{e}(n) = s(n) - \tilde{s}(n) = G\,u(n)$$
The LPC model thus produces a prediction-error signal $\tilde{e}(n)$ that carries the information about the excitation source, which makes F0 easier to determine in the voiced case.
The SIFT (Simplified Inverse Filter Tracking) method applies the autocorrelation method to the input signal $\tilde{e}(n)$ obtained above.
c) Cepstral method
This method can be described simply as follows:
- Apply real cepstral analysis to the input signal. The raw speech can be used directly as input.
- Find the peak in the appropriate region of the cepstrum $c_n$.
Remarks
- The peak is located fairly accurately, and harmonics are seldom picked by mistake.
- The method works well for low-pitched speech.
- The threshold for deciding whether there is a peak at $c_{n_0}$ depends on the speaker, which makes voiced/unvoiced discrimination difficult.




d) CLIP method
The CLIP method (center clipping pitch detector) is similar to the autocorrelation method above, but the signal is first processed to remove the formant information (the information about the vocal tract).
There are several ways of doing this; one is center clipping. It removes the small peaks from the waveform, making it look more like a pulse train. The clipping operation C is defined as follows:
$$C\{s(n)\} = \begin{cases} s(n) - C_L, & s(n) > C_L \\ 0, & |s(n)| \le C_L \\ s(n) + C_L, & s(n) < -C_L \end{cases}$$
where $C_L$ is the clipping threshold, usually taken as 30% of the maximum value of the signal.
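The clipping operation can be sketched as follows. The function name `center_clip` and the `ratio` parameter are illustrative; the sketch assumes a non-empty frame and takes "maximum value" to mean the maximum absolute value within that frame.

```python
def center_clip(s, ratio=0.3):
    """Center-clip a frame with threshold C_L = ratio * max|s(n)|.

    Assumption: s is non-empty and the threshold is computed per frame.
    """
    c_l = ratio * max(abs(x) for x in s)
    out = []
    for x in s:
        if x > c_l:
            out.append(x - c_l)
        elif x < -c_l:
            out.append(x + c_l)
        else:
            out.append(0.0)
    return out


print(center_clip([10, 2, -5, -10, 6], ratio=0.5))
```

The clipped frame is then passed to the autocorrelation method in place of the raw signal.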
e) AMDF
The AMDF (Average Magnitude Difference Function) method resembles the autocorrelation method above, but requires less computation because it uses no multiplications.
We define the average magnitude difference function as follows:
Formula 2.19
$$D(p) = \frac{1}{N-p}\sum_{n=0}^{N-1-p} |s(n) - s(n+p)|$$
D(p) is computed over the range in which $P_0$ may occur, and the lag with the minimum value $D(P_0)$ is chosen; $P_0$ is the fundamental period.
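Formula 2.19 and the minimum search can be sketched in Python as below. The function names and the explicit search range (`p_min`, `p_max`) are assumptions for illustration, and p_max is assumed to be smaller than the frame length.

```python
def amdf(s, p):
    """D(p) = (1/(N-p)) * sum_{n=0}^{N-1-p} |s(n) - s(n+p)| (Formula 2.19)."""
    n_terms = len(s) - p
    return sum(abs(s[n] - s[n + p]) for n in range(n_terms)) / n_terms


def pitch_period_amdf(s, p_min, p_max):
    """Return the lag in [p_min, p_max] that minimises D(p)."""
    return min(range(p_min, p_max + 1), key=lambda p: amdf(s, p))


# A waveform that repeats every 4 samples.
wave = [0, 1, 0, -1] * 10
print(pitch_period_amdf(wave, 2, 6))
```

Unlike the autocorrelation method, the period shows up as a valley rather than a peak, and only subtractions and absolute values are needed.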
f) Amplitude matching method
The speech signal fed to the computer has a roughly sinusoidal shape. We therefore look for two points oscillating in phase; the time between them is the period T, and from T we obtain the frequency f.




Note, however, that a speech signal is a mixture of many frequencies (see the figure), so the two in-phase points considered must be zero crossings. We must also make sure that the two zero crossings really delimit a period of F0, because resonances can also produce zero crossings.

Figure 2.15: Shape of a speech signal
The amplitude matching method proceeds as follows:
1. Find the first zero crossing in some direction (for example rising, as in the figure) and call it $X_1$.
2. Find the next two zero crossings in the same direction and call them $X_2$ and $X_3$, such that the intervals $X_1X_2$ and $X_2X_3$ have approximately equal duration and lie within the time range allowed for a pitch period.
3. Compare the amplitudes of corresponding points in the two intervals $X_1X_2$ and $X_2X_3$ one by one. Let S be the sum of the squared amplitude differences, where $x_{12}$ runs over the samples from $X_1$ to $X_2$ and $x_{23}$ over the corresponding samples from $X_2$ to $X_3$.
Formula 2.20
$$S = \sum_{x_{12} = X_1,\; x_{23} = X_2}^{X_2,\; X_3} (x_{12} - x_{23})^2$$
4. If S is below the deviation threshold (that is, the two intervals $X_1X_2$ and $X_2X_3$ are similar), conclude that each interval is one period. Otherwise change the interval, that is, look for other zero crossings.




Smoothing the F0 result with a median filter
The median filter is widely used for noise removal. The technique works as follows:
- To produce one output sample, a window of adjacent input samples is selected.
- The data in this window are sorted.
- The middle value of the sorted sequence is taken as the median of the samples in the window.
In other words, the median filter recomputes the value at each point by taking the middle-ranked value among the surrounding points.
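The three steps above can be sketched as follows. The function name `median_filter`, the default window width, and the treatment of the edges (the window is simply truncated at the ends of the sequence, and for an even-sized truncated window the upper of the two middle values is taken) are assumptions for illustration.

```python
def median_filter(values, width=5):
    """Replace each value by the median of a window centred on it.

    Assumptions: width is odd; at the edges the window is truncated.
    """
    half = width // 2
    out = []
    for i in range(len(values)):
        window = sorted(values[max(0, i - half):i + half + 1])
        out.append(window[len(window) // 2])
    return out


# An isolated F0 outlier (300) is removed.
print(median_filter([100, 102, 300, 101, 99], width=3))
```

Isolated estimation errors, such as a single frame where a harmonic was picked instead of F0, are removed without smearing the surrounding contour the way an averaging filter would.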

Figure 2.16: F0 extraction result

Figure 2.17: Result after median filtering






Chapter 3 HIDDEN MARKOV MODELS

Figure 3.1: Illustration of how a hidden Markov model operates
The hidden Markov model (HMM) is a statistical model for signals that evolve over time, and it has been used very successfully in recognition applications. It can model speech over time on the basis of a structure with rigorous mathematical constraints, which is why HMM-based speech recognition achieves better results than other methods. In practice, in the field of speech recognition, hidden Markov models have given better results than neural networks. Thanks to these advantages, hidden Markov models are also used in many other recognition fields and in a variety of applications.
3.1 Hidden Markov models
A hidden Markov model consists of states and a matrix of state-transition weights, which together form a state-transition network. In HMM-based speech recognition, each reference word is represented by one hidden Markov model. At any given time the system is in a state $q_t$ from a set $S = \{S_i\}$ of N states. At discrete time steps the system moves to other states. Writing $q_t$ for the state at time t, we have:
$$P[q_t = S_j \mid q_{t-1} = S_i, q_{t-2} = S_k, \ldots] = P[q_t = S_j \mid q_{t-1} = S_i]$$
We consider only processes for which the right-hand side does not depend on time. The state-transition probabilities $a_{ij}$ then have the form:
$$a_{ij} = P[q_t = S_j \mid q_{t-1} = S_i], \quad a_{ij} \ge 0, \quad \sum_{j=1}^{N} a_{ij} = 1$$
A hidden Markov model is therefore characterised by the following parameters:
1. N: the number of states in the model
   Set of states: $S = \{S_1, S_2, \ldots, S_N\}$
   State at time t: $q_t \in S$
2. M: the number of distinct observation symbols per state
   Set of observation symbols: $V = \{v_1, v_2, \ldots, v_M\}$
   Observation symbol at time t: $o_t \in V$
3. State-transition probabilities: $A = \{a_{ij}\}$
   $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$, $1 \le i, j \le N$
4. Probabilities of observing symbol $v_k$ in a state: $B = \{b_j(k)\}$
   $b_j(k) = P(v_k \text{ at } t \mid q_t = S_j)$, $1 \le j \le N$, $1 \le k \le M$
5. Initial state probabilities: $\pi = \{\pi_i\}$
   $\pi_i = P[q_1 = S_i]$, $1 \le i \le N$
A hidden Markov model is written $\lambda = (A, B, \pi)$.
Some commonly used HMM topologies are:





Figure 3.2: Left-Right model

Figure 3.3: Bakis model

Figure 3.4: Linear model
3.2 Applying hidden Markov models to speech recognition
To apply hidden Markov models to speech processing, three basic problems must be solved:
1. Scoring: given the observation sequence $O = \{o_1, o_2, \ldots, o_T\}$ and the model $\lambda = (A, B, \pi)$, compute the conditional probability $P(O \mid \lambda)$ of the observation sequence.
   Solved by the forward-backward algorithm.
2. Matching: given the observation sequence O and the model $\lambda$, find the state sequence $Q = \{q_1, q_2, \ldots, q_T\}$ for which $P(Q, O \mid \lambda)$ is maximal.
   Solved by the Viterbi algorithm.
3. Training: given the observation sequence O and the model $\lambda$, re-estimate the model parameters so that the conditional probability $P(O \mid \lambda)$ of the observation sequence is maximised.
   Solved by Baum-Welch re-estimation.
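3.2.1 Forward algorithm
The forward variable $\alpha_t(i)$ is the probability of the partial observation sequence $o_1 o_2 \ldots o_t$ together with state $S_i$ at time t, given the hidden Markov model $\lambda$.
$$\alpha_t(i) = P(o_1 o_2 \ldots o_t,\; q_t = S_i \mid \lambda)$$
The forward variable can be computed inductively as follows:
Step 1: Initialisation
$$\alpha_1(i) = \pi_i\,b_i(o_1), \quad 1 \le i \le N$$
Step 2: Induction
$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\,a_{ij}\right] b_j(o_{t+1}), \quad 1 \le t \le T-1, \quad 1 \le j \le N$$
Step 3: Termination
$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$$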
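The three steps of the forward algorithm can be sketched for a discrete HMM as follows. The function name, the 0-based state and symbol indices, and the small two-state example model are all assumptions made for illustration; no scaling is applied, so for long sequences the probabilities would underflow and a scaled or log-domain variant would be needed.

```python
def forward(pi, A, B, obs):
    """Forward algorithm for a discrete HMM lambda = (A, B, pi).

    pi[i], A[i][j], B[j][k] use 0-based indices; obs is a list of
    observation symbol indices. Returns (alpha, P(O | lambda)).
    """
    N = len(pi)
    # Step 1: initialisation.
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # Step 2: induction.
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    # Step 3: termination.
    return alpha, sum(alpha[-1])


# Hypothetical 2-state, 2-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
alpha, prob = forward(pi, A, B, [0, 1])
print(prob)
```

In word recognition, this score is computed for the observation sequence against every word model, and the word whose model gives the highest $P(O \mid \lambda)$ is chosen.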







Figure 3.5: Illustration of the forward algorithm
3.2.2 Backward algorithm

Figure 3.6: Illustration of the backward algorithm




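The backward variable $\beta_t(i)$ is the probability of the partial observation sequence $o_{t+1} o_{t+2} \ldots o_T$, given state $S_i$ at time t and the hidden Markov model $\lambda$.
$$\beta_t(i) = P(o_{t+1} o_{t+2} \ldots o_T \mid q_t = S_i, \lambda)$$
$\beta_t(i)$ can be computed inductively as follows:
Step 1: Initialisation
$$\beta_T(i) = 1, \quad 1 \le i \le N$$
Step 2: Induction
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j), \quad t = T-1, T-2, \ldots, 1, \quad 1 \le i \le N$$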
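3.2.3 Finding the optimal state sequence
One criterion for choosing the state $q_t$ is to maximise the expected number of correct states.
Consider the variable $\gamma_t(i)$, the probability that the system is in state $S_i$ at time t, given the observation sequence O and the model $\lambda$.
$$\gamma_t(i) = P(q_t = S_i \mid O, \lambda)$$
$$\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\,\beta_t(j)}$$
The variable $\delta_t(i)$ is the maximum probability along a single path:
$$\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P(q_1 q_2 \ldots q_{t-1},\; q_t = S_i,\; o_1 o_2 \ldots o_t \mid \lambda)$$
Induction:
$$\delta_{t+1}(j) = \left[\max_{i} \delta_t(i)\,a_{ij}\right] b_j(o_{t+1})$$
To recover the state sequence, we must keep track of the optimal path at each time t. This is stored in the array $\psi_t(i)$.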




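3.2.4 Viterbi algorithm
Step 1: Initialisation
$$\delta_1(i) = \pi_i\,b_i(o_1), \quad 1 \le i \le N$$
$$\psi_1(i) = 0$$
Step 2: Induction
$$\delta_t(j) = \max_{1 \le i \le N}\left[\delta_{t-1}(i)\,a_{ij}\right] b_j(o_t), \quad 2 \le t \le T, \quad 1 \le j \le N$$
$$\psi_t(j) = \arg\max_{1 \le i \le N}\left[\delta_{t-1}(i)\,a_{ij}\right], \quad 2 \le t \le T, \quad 1 \le j \le N$$
Step 3: Termination
$$P^* = \max_{1 \le i \le N}\left[\delta_T(i)\right]$$
$$q_T^* = \arg\max_{1 \le i \le N}\left[\delta_T(i)\right]$$
Step 4: Backtracking to find the optimal state sequence (path)
$$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1$$
Number of operations: on the order of $N^2 T$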
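The four steps of the Viterbi algorithm can be sketched for a discrete HMM as follows. The function name, the 0-based indices, and the small example model are assumptions for illustration; probabilities are multiplied directly, whereas a practical implementation would normally work with log probabilities to avoid underflow.

```python
def viterbi(pi, A, B, obs):
    """Viterbi algorithm for a discrete HMM with 0-based states and symbols.

    Returns (best state sequence, its probability P*).
    """
    N, T = len(pi), len(obs)
    # Step 1: initialisation.
    delta = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    psi = [[0] * N]
    # Step 2: induction.
    for t in range(1, T):
        d_row, p_row = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[t - 1][i] * A[i][j])
            p_row.append(best_i)
            d_row.append(delta[t - 1][best_i] * A[best_i][j] * B[j][obs[t]])
        delta.append(d_row)
        psi.append(p_row)
    # Step 3: termination.
    path = [max(range(N), key=lambda i: delta[T - 1][i])]
    # Step 4: backtracking.
    for t in range(T - 1, 0, -1):
        path.insert(0, psi[t][path[0]])
    return path, max(delta[T - 1])


# Hypothetical 2-state, 2-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
print(viterbi(pi, A, B, [0, 1, 1]))
```

The two nested loops over states inside the loop over time give the $N^2 T$ operation count stated above.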





Figure 3.7: Worked example of the Viterbi algorithm





Figure 3.8: Worked example of the Viterbi algorithm (continued)





Figure 3.9: Worked example of matching with the forward-backward algorithm
3.2.5 Baum-Welch re-estimation
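Consider the variable $\xi_t(i, j)$, the probability that the system is in state $S_i$ at time t and in state $S_j$ at time t+1, given the observation sequence O and the hidden Markov model $\lambda$.
$$\xi_t(i, j) = P(q_t = S_i,\; q_{t+1} = S_j \mid O, \lambda)$$
Then:
$$\xi_t(i, j) = \frac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{P(O \mid \lambda)}$$
$$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$$
Combining $\gamma_t(i)$ and $\xi_t(i, j)$ gives the re-estimation formulas:
$\bar{\pi}_i$ = the probability that the system is in state $S_i$ at time t = 1, that is, $\gamma_1(i)$.
$$\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$$
$$\bar{b}_j(k) = \frac{\sum_{t=1,\; o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$$
Then $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$ is the re-estimated model.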
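One re-estimation pass can be sketched for a discrete HMM and a single observation sequence as follows. This is a minimal illustration under stated assumptions: the function name and the example model are hypothetical, indices are 0-based, no scaling is applied, every state is assumed to have non-zero occupancy (otherwise a division by zero occurs), and the emission update sums $\gamma_t(j)$ over all T frames.

```python
def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation pass.

    Returns (new_pi, new_A, new_B, P(O | old model)).
    """
    N, M, T = len(pi), len(B[0]), len(obs)
    # Forward variables alpha_t(i).
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    # Backward variables beta_t(i).
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    prob = sum(alpha[T - 1])
    # State occupancy gamma_t(i) and transition occupancy xi_t(i, j).
    gamma = [[alpha[t][i] * beta[t][i] / prob for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / prob
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # Re-estimation formulas.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B, prob


# Hypothetical 2-state, 2-symbol model and a short observation sequence.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 1, 0, 1]
pi1, A1, B1, p0 = baum_welch_step(pi, A, B, obs)
_, _, _, p1 = baum_welch_step(pi1, A1, B1, obs)
print(p0, p1)  # the likelihood does not decrease after re-estimation
```

Training repeats this step until $P(O \mid \lambda)$ stops improving; each pass is guaranteed not to lower the likelihood of the training sequence.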

Figure 3.10: Illustration of Baum-Welch re-estimation




3.3 Language structure and phoneme-based recognition models

Figure 3.11: Illustration of phoneme recognition with HMMs
3.3.1 Language structure
The most basic assumption is that words are built from basic units called phonemes. For example, the word "cat" consists of the phoneme sequence /kæt/. Phonemes are given in dictionaries such as the Oxford English Dictionary, which contain a great deal of information about how a word is pronounced, but this is not sufficient for a recognition system. When a phoneme is spoken it is affected by its context at that moment. For example, it is lengthened if it is in final position, or it may be influenced by the phonemes before and after it in the word. Phonemes as actually pronounced are called phones; phones can sometimes be predicted from spelling rules.




Consider, for example, the word "can", represented by the phoneme sequence /kæn/. In spoken English, unstressed vowels are commonly reduced. A reduced vowel is not articulated carefully but slurred, and as a result it becomes the schwa vowel. Schwa differs somewhat from the original phone and is very short. Sometimes, however, the speaker articulates carefully and the vowel returns to the original phone. The word "can" can therefore have two pronunciations, [kæn] and [kən].
List of English phonemes
Vowels and Diphthongs
Phoneme   Word examples        Description
iy        feel, eve, me        front close unrounded
ih        fill, hit, lid       front close unrounded (lax)
ae        at, carry, gas       front open unrounded (tense)
aa        father, ah, car      back open rounded
ah        cut, bud, up         open mid-back rounded
ao        dog, lawn, caught    open-mid back rounded
ay        tie, ice, bite       diphthong with quality: aa + ih
ax        ago, comply          central close mid (schwa)
ey        ate, day, tape       front close-mid unrounded (tense)
eh        pet, berry, ten      front open-mid unrounded
er        turn, fur, meter     central open-mid unrounded
ow        go, own, town        back close-mid rounded
aw        foul, how, our       diphthong with quality: aa + uh
oy        toy, coin, oil       diphthong with quality: ao + ih
uh        book, pull, good     back close-mid unrounded (lax)
uw        tool, crew, moo      back close rounded





Consonants and Liquids
Phoneme   Word examples           Description
b         big, able, tab          voiced bilabial plosive
p         put, open, tap          voiceless bilabial plosive
d         dig, idea, wad          voiced alveolar plosive
t         talk, sat               voiceless alveolar plosive
t         meter                   alveolar flap
g         gut, angle, tag         voiced velar plosive
k         cut, ken, take          voiceless velar plosive
f         fork, after, if         voiceless labiodental fricative
v         vat, over, have         voiced labiodental fricative
s         sit, cast, toss         voiceless alveolar fricative
z         zap, lazy, haze         voiced alveolar fricative
th        thin, nothing, truth    voiceless dental fricative
dh        then, father, scythe    voiced dental fricative
sh        she, cushion, wash      voiceless postalveolar fricative
zh        genre, azure            voiced postalveolar fricative
l         lid                     alveolar lateral approximant
l         elbow, sail             velar lateral approximant
r         red, part, far          retroflex approximant
y         yacht, yard             palatal sonorant glide
w         with, away              labiovelar sonorant glide
hh        help, ahead, hotel      voiceless glottal fricative
m         mat, amid, aim          bilabial nasal
n         no, end, pan            alveolar nasal
ng        sing, anger             velar nasal
ch        chin, archer, march     voiceless alveolar affricate: t + sh
jh        joy, agile, edge        voiced alveolar affricate: d + zh






3.3.2 Phoneme models
Once a word has several pronunciations, it must be decided which models will be used to recognise it. Given enough training data, the best models are word models, with one model per pronunciation of the word. But then the amount of training data required is very large, usually too large to be practical, and the amount of computation needed for recognition is very large even for a small vocabulary.
To solve this problem, words are broken into phonemes; each phoneme has several pronunciations (phones), and each model corresponds to one phone. The models are then concatenated to form the words of the recognition vocabulary. For example, the word "cat" can be assembled from the phones [k], [a], and [t].
This approach has several advantages: the number of models needed is small, close to the number of basic units of the language (about 45 in English), which considerably reduces the amount of computation; moreover, the amount of data required is also small.
3.3.3 Allophones
Using the phoneme as the modelling unit within a word is only the intuitive starting point. Modelling any word (given its pronunciation) then simply assumes that the word is made of phonemes. In practice, however, the same phoneme can sound very different depending on its phonetic context (the phonemes before and after it, and its position in the word, in the phrase, and in the sentence). For example, the [t] in "string" differs slightly from the [t] in "cat".
To distinguish the different realisations of the same phoneme, allophones are used. For example, the [t] in "string" is labelled [t1], used when [t] lies between [s] and [r], while the [t] in "cat" is labelled [t2], used when [t] is at the end of a word.




Exactly which allophones to use, and how many, is still an open question. The more allophones there are, the less variation each allophone model has to cover; but more allophones also mean more models.
Two simple kinds of allophone are the triphone and the biphone. A triphone's context depends on the phonemes before and after it. For example, the [t] in "string" has the allophone [s#t#r], and the [t] in "cat" has the allophone [a#t#]. The # sign is only a separator: [s#t#r] denotes the allophone of [t] in the context of [s] before it and [r] after it. Triphones are simple and effective but have some drawbacks. First, they need a lot of training, and there are not enough examples to train every allophone model. When an allophone occurs rarely, there is not enough data to distinguish between the allophones of a phoneme. Second, triphones needed during recognition may be missing from the training set. In fact, the triphone [E#t#] (in "bet") is quite similar to [a#t#], and trying to distinguish them is wasted effort.
The second kind of allophone is the biphone. It is similar to the triphone but can overcome the triphone's shortcomings. Its context takes into account only the phoneme that follows. Biphones have limitations too, but are less affected by them: they are more general and more likely to occur in words. In addition, there are fewer biphone models than triphone models.
Another approach is an algorithm based on binary decision trees. The leaves of the tree are allophones, and the internal nodes are binary questions (such as "Is the phone to the left a liquid?"). If the answer is "Yes" one branch is taken; if "No", the other. To train the decision tree for each phoneme, a list of candidate questions is drawn up, and the training data are repeatedly split into allophones by choosing the leaf and the question that best fit the training data. The advantage is that this can discover more general rules about phonemes than biphones or triphones can. With biphones and triphones, if a particular model does not occur in the training set it is not trained and a context-independent model is used in its place, even if another, trained model is very close to it. With a decision tree, such a case is handled by the training algorithm and the closest model is substituted. The drawback of decision trees is that it is unclear how much training data is enough, and it is not easy to determine a reasonable choice of allophones.
3.3.4 Remarks
Although the number of pronounceable syllables in Vietnamese is limited (only about 7000-8000), from the standpoint of speech recognition that number is considerable. The possibility of applying phoneme models to Vietnamese in order to enlarge the vocabulary of the recognition system is therefore well worth considering.
It is not easy to do, however. The perennial problem is tone in Vietnamese. After studying the matter, I found that although tone affects the whole syllable in Vietnamese, it affects the vowels most strongly. This suggests the following solution:
Solution for applying phoneme models to Vietnamese recognition:
We consider the following phonemes:
- Consonants: b, d, đ, g, h, k, l, m, n, p, r, s, t, v, x, ch, th, kh, qu, nh
- Vowels, including tone marks: a, á, à, ả, ã, ạ, ă, ắ, ằ, ẳ, ẵ, ặ, â, ấ, ầ, ẩ, ẫ, ậ, e, é, è, ẻ, ẽ, ẹ, ê, ế, ề, ể, ễ, ệ, i, í, ì, ỉ, ĩ, ị, o, ó, ò, ỏ, õ, ọ, ô, ố, ồ, ổ, ỗ, ộ, ơ, ớ, ờ, ở, ỡ, ợ, u, ú, ù, ủ, ũ, ụ, ư, ứ, ừ, ử, ữ, ự, ai, ái, ài, ải, ãi.
The total number of phonemes is not very large, so the approach is feasible. It is still too many for phoneme-based recognition, however, and in practice the experimental program performed poorly. The effort spent outweighed the results: the approach was effective only in recognising consonants, so the accuracy of the model is not high.
To solve this problem thoroughly, we need to look further into multi-stream recognition.
Proposal for multi-stream recognition:
The model consists of two parallel recognition streams:
- The first stream uses a feature vector made of the static MFCCs and their first and second derivatives (because MFCC has proved effective at filtering out speaker dependence, noise, background noise, and tone in speech).
- The second stream uses a feature vector made of the energy level E, the fundamental frequency F0, and their first and second derivatives.
The two streams use two sets of Markov models, whose outputs are combined to produce the result. This approach can be applied both to whole-syllable models and to phoneme models.
In experiments, however, the feature vector of the second stream converged poorly and processing was still slow, so the approach is hard to use in practice. Solving this requires further improvement of that feature vector.




Chapter 4 HMM TOOLKIT

Figure 4.1: Simple model of speech recognition
The HMM ToolKit (HTK) is a powerful recognition toolkit by Steve Young and his research group. It integrates most of the techniques for hidden Markov models and for speech processing and speech recognition. It also incorporates language models and grammar syntax, which makes recognition more effective. Its functions and structure are shown in Figures 4.2 and 4.3.





Figure 4.2: Modules and their functions in HTK





Figure 4.3: Tools and their functions in HTK
4.1 File structures in HTK
4.1.1 HTK feature vector file structure
The structure of an HTK feature vector file can be illustrated by the following structure:
struct FeatureVectorFile
{
    long nSamples;       // number of samples in the file (4 bytes)
    long sampPeriod;     // sample period = window period, in units of 100 ns (4 bytes)
    short int sampSize;  // bytes per sample = vector dimension * 4 (2 bytes)
    short int parmKind;  // sample kind (2 bytes)
    FeatureVector sample[nSamples];  // sizeof(FeatureVector) = sampSize
}




parmKind is defined as follows (low 6 bits):
0: WAVEFORM (sampled waveform)
1: LPC (Linear Prediction Filter Coefficients)
2: LPREFC (Linear Prediction Reflection Coefficients)
3: LPCEPSTRA (LPC Cepstral Coefficients)
4: LPDELCEP (LPC Cepstra plus Delta Coefficients)
5: IREFC (LPC Reflection Coef in 16 bit integer format)
6: MFCC (Mel-Frequency Cepstral Coefficients)
7: FBANK (Log Mel-Filter bank channel outputs)
8: MELSPEC (Linear Mel-filter bank channel outputs)
9: USER (User defined sample kind)
10: DISCRETE (vector quantised data)
The remaining bits specify additional qualifiers of the feature vector file, as follows (values in octal):
000100 (suffix _E): has an energy feature
000200 (suffix _N): absolute energy suppressed
000400 (suffix _D): has first-order derivative (delta) coefficients
001000 (suffix _A): has second-order derivative (acceleration) coefficients
002000 (suffix _C): data is compressed
004000 (suffix _Z): zero mean applied to the static coefficients
010000 (suffix _K): has a CRC checksum
020000 (suffix _0): has the 0th (first) cepstral coefficient
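The header layout and the parmKind encoding can be illustrated with a short Python sketch. The function names are hypothetical; the sketch assumes the 12-byte header is stored in big-endian byte order (an assumption about the file, which should be checked against the HTK configuration in use), and it lists the qualifier suffixes in bit order rather than in any canonical HTK order.

```python
import struct

BASE_KINDS = ["WAVEFORM", "LPC", "LPREFC", "LPCEPSTRA", "LPDELCEP", "IREFC",
              "MFCC", "FBANK", "MELSPEC", "USER", "DISCRETE"]
QUALIFIERS = [(0o000100, "_E"), (0o000200, "_N"), (0o000400, "_D"),
              (0o001000, "_A"), (0o002000, "_C"), (0o004000, "_Z"),
              (0o010000, "_K"), (0o020000, "_0")]


def parse_header(raw12):
    """Decode (nSamples, sampPeriod, sampSize, parmKind) from 12 bytes.

    Assumption: big-endian, two 4-byte integers then two 2-byte integers.
    """
    return struct.unpack(">iihh", raw12)


def describe_parm_kind(parm_kind):
    """Turn a parmKind code into a name such as 'MFCC_E_D_A'."""
    name = BASE_KINDS[parm_kind & 0o77]          # low 6 bits: base kind
    return name + "".join(tag for bit, tag in QUALIFIERS if parm_kind & bit)


# Build a header for 100 frames of 39-dimensional MFCC_E_D_A at 10 ms.
kind = 6 | 0o000100 | 0o000400 | 0o001000
header = struct.pack(">iihh", 100, 100000, 39 * 4, kind)
n_samples, samp_period, samp_size, parm_kind = parse_header(header)
print(n_samples, samp_period * 100e-9, samp_size // 4, describe_parm_kind(parm_kind))
```

The vector dimension is recovered as sampSize / 4 for uncompressed data, and the sample period in seconds is sampPeriod multiplied by 100 ns.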
Consider the following concrete example:





Figure 4.4: Layout of the parameters in some HTK feature vectors

Figure 4.5: Feature kinds that can be converted into one another with HCopy
4.1.2 HMM model file structure





Figure 4.6: Basic form of an HMM file (not yet initialised),
also called an HMM prototype
An HMM model file begins with: ~h "model name"
The body is enclosed by two tags: <BeginHMM> and <EndHMM>
It contains several states; the number of states is given after the <NumStates> tag. Each state extends from its <State> tag to the tag of the following state, or to the <TransP> tag marking the state-transition matrix. Each state may contain one or more streams; the <SWeights> tag gives the number of streams and the weight of each stream. Each stream begins with a <Stream> tag and ends at the next stream or the next state. Each stream may contain one or more Gaussian mixture components, whose number is given by the <NumMixes> tag. Each <Mixture> tag is followed by the index and weight of that mixture component; the weights of all mixture components in a stream must sum to 1. Each mixture component holds two vectors, marked by the <Mean> and <Variance> tags; each starts with the vector dimension, followed by the vector contents (that many floating-point numbers).
These files are created by the HTK user to suit the task at hand. In particular, the dimension of the feature vector and the type of feature vector can be changed (both the supported basic kinds and user-defined vectors), so the tool can be applied to recognition more flexibly, and feature vectors suited to Vietnamese can be introduced.
These files are then initialised with tools such as HInit or HCompV.





Figure 4.7: Basic form of an HMM file using Gaussian mixtures





Figure 4.8: Basic form of an HMM file using multiple streams
4.1.3 Label file structure
Each label file labels one data file (it has the same name as the data file but a different extension, .lab by default). Each line marks the start and end position of one phoneme or one word.
Each line has the following syntax:
[start [end] ] name [score] { auxname [auxscore] } [comment]




Here start marks the start position and end marks the end position of name in
the data, in units of 100 ns; score is the confidence of name; auxname is a
larger unit that contains name, with confidence auxscore; and comment is a
note from the labeller. name is a part of auxname, for example a phoneme
within a syllable, or a syllable within a word.
Example:
0000000 2200000 ay ice
2200000 3600000 s
3600000 4300000 k cream
4300000 5000000 r
5000000 7400000 iy
7400000 8200000 m
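To make the time unit concrete, the sketch below parses label lines of the simple form used in this example (start, end, name and optional auxiliary names) and converts the 100 ns units to seconds. It is a minimal illustration, not HTK's own parser: the function name is hypothetical, and scores and comments are assumed to be absent.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # one HTK time unit is 100 ns


def parse_label_line(line):
    """Parse 'start end name [auxname ...]' into a dict (scores not handled)."""
    fields = line.split()
    start, end = int(fields[0]), int(fields[1])
    return {
        "start_s": start / HTK_UNITS_PER_SECOND,
        "end_s": end / HTK_UNITS_PER_SECOND,
        "name": fields[2],
        "aux": fields[3:],   # empty when the line carries no auxiliary label
    }


example = """0000000 2200000 ay ice
2200000 3600000 s
3600000 4300000 k cream"""
segments = [parse_label_line(line) for line in example.splitlines()]
for seg in segments:
    print(seg)
```

Under this reading, the first line says that the phoneme ay spans the first 0.22 seconds of the recording and opens the word ice.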
To make labelling more convenient, especially when many data files share the
same labels, HTK also supports a more capable kind of label file, the Master
Label File (MLF). Its syntax is as follows:




MLF = "#!MLF!#"
      MLFDef { MLFDef }
MLFDef = ImmediateTranscription | SubDirDef
ImmediateTranscription = Pattern
                         Transcription
                         "."
SubDirDef = Pattern SearchMode String
SearchMode = "->" | "=>"
Pattern = String
Here String is a character string that may contain the wildcard characters
'?' and '*': '?' stands for exactly one character and '*' for zero or more
characters.
The syntax above is illustrated by the following examples.
Example 1: Suppose we have the two label files below.
The file a.lab contains the lines
000000 590000 sil
600000 2090000 a
2100000 4500000 sil
and the file b.lab contains
000000 990000 sil
1000000 3090000 b
3100000 4200000 sil
Then the Master Label File that replaces them contains
#!MLF!#
"*/a.lab"
000000 590000 sil
600000 2090000 a
2100000 4500000 sil
.
"*/b.lab"
000000 990000 sil
1000000 3090000 b
3100000 4200000 sil
.
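The merge shown in Example 1 is mechanical, so it can be sketched as a short Python function. This is only an illustration of the immediate-transcription form described above; the function name and the in-memory representation of the label files are assumptions.

```python
def build_mlf(transcriptions):
    """transcriptions: list of (pattern, label_lines) pairs -> MLF text."""
    out = ["#!MLF!#"]
    for pattern, label_lines in transcriptions:
        out.append('"%s"' % pattern)   # quoted file name pattern
        out.extend(label_lines)        # the label lines, copied unchanged
        out.append(".")                # a lone period closes the transcription
    return "\n".join(out) + "\n"


mlf = build_mlf([
    ("*/a.lab", ["000000 590000 sil", "600000 2090000 a", "2100000 4500000 sil"]),
    ("*/b.lab", ["000000 990000 sil", "1000000 3090000 b", "3100000 4200000 sil"]),
])
print(mlf)
```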




Example 2: Suppose the training data consists of the files nhanh.001.mfc,
nhanh.002.mfc, ..., nhanh.100.mfc, all of which contain the feature vector
sequence of the word "nhanh". The Master Label File can then be written as
#!MLF!#
"*/nhanh.???.lab"
nhanh
.
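The effect of the wildcards in Example 2 can be tried out with Python's standard fnmatch module, whose '?' and '*' behave as described above. This is only an approximation of HTK's matching: fnmatch also gives '[' a special meaning, and the candidate file names below are invented for illustration.

```python
from fnmatch import fnmatchcase

pattern = "*/nhanh.???.lab"
candidates = [
    "data/nhanh.001.lab",    # three characters in the '???' position
    "data/nhanh.100.lab",
    "data/nhanh.1000.lab",   # four characters, so '???' cannot match
    "data/cham.001.lab",     # a different word
]
matches = [name for name in candidates if fnmatchcase(name, pattern)]
print(matches)
```

Only the first two names match, so one pattern line is enough to label all one hundred files of the word "nhanh".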
4.1.4 Structure of the grammar file

Figure 4.9: The role of the grammar in recognition with HTK




In HTK, the language model file is generated from a metalanguage
(specification language) file by the HParse and HBuild tools.
For example, consider the following grammar defined in the metalanguage:
(
Start < bit | but > End
)
It represents the model below.

Figure 4.10: Grammar diagram
HParse turns this grammar into a file of the following form:
# Network parameters (here the network is a directed graph)
# N = number of nodes, L = number of arcs
N=4 L=8
# Node list: I = node index, W = word
I=0 W=start
I=1 W=end
I=2 W=bit
I=3 W=but
# Arc list: J = arc index, S = start node, E = end node
J=0 S=0 E=2
J=1 S=0 E=3
J=2 S=3 E=1
J=3 S=2 E=1
J=4 S=2 E=3
J=5 S=3 E=3
J=6 S=3 E=2
J=7 S=2 E=2
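To see that this network really encodes the grammar, the listing can be read back into a graph. The Python sketch below handles only the fields that appear in this listing (N, L, I, W, J, S, E); the function names are hypothetical and the real lattice format has further optional fields that are ignored here.

```python
LATTICE = """\
N=4 L=8
I=0 W=start
I=1 W=end
I=2 W=bit
I=3 W=but
J=0 S=0 E=2
J=1 S=0 E=3
J=2 S=3 E=1
J=3 S=2 E=1
J=4 S=2 E=3
J=5 S=3 E=3
J=6 S=3 E=2
J=7 S=2 E=2
"""


def parse_lattice(text):
    """Return (words, arcs): node index -> word, and a list of (start, end)."""
    words, arcs = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = dict(item.split("=", 1) for item in line.split())
        if "I" in fields:
            words[int(fields["I"])] = fields["W"]
        elif "J" in fields:
            arcs.append((int(fields["S"]), int(fields["E"])))
    return words, arcs


def successors(words, arcs, word):
    """Words that may directly follow `word` according to the arcs."""
    ids = {i for i, w in words.items() if w == word}
    return sorted({words[e] for s, e in arcs if s in ids})


words, arcs = parse_lattice(LATTICE)
for w in ("start", "bit", "but", "end"):
    print(w, "->", successors(words, arcs, w))
```

After start only bit or but can follow, and after either of them the network can loop over bit and but or move to end, which is the repetition written as < bit | but > in the grammar.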
Some other examples:




a) Example 1 (see Figure 4.11a)
(
one | two | three | four | five |
six | seven | eight | nine | zero
)
b) Example 2 (see Figure 4.11b)
(
sil (one | two | three | four | five |
six | seven | eight | nine | zero) sil
)
c) Example 3 (see Figure 4.11c)
$digit = one | two | three | four | five |
six | seven | eight | nine | zero;
(
sil < $digit > sil
)
d) Example 4 (see Figure 4.11d)
$digit = one | two | three | four | five |
six | seven | eight | nine | zero;
(
[sil] < $digit > [sil]
)

Figure 4.11: Models illustrating the grammars




4.2 Whole-word recognition

Figure 4.12: Illustration of whole-word recognition
Whole-word recognition is carried out with the following HTK tools. HInit,
HCompV and HRest are used for training; they produce model files whose
parameters have been re-estimated during training. HParse and HVite take part
in recognition.
The training samples are files containing feature vector sequences produced
by HCopy, which extracts the features of the audio signal stored in .wav
files.




During training, HInit and HCompV initialize the model parameters: they set
the parameters of the HMM prototype from the training samples, which yields
an initialized HMM. HRest then takes over the training: it uses the training
set to re-estimate the initialized model. This is done for each model in turn
(one model per word of the vocabulary to be recognized), and the result is a
set of models ready for recognition.
Before recognition, HParse builds the language model used in recognition from
the metalanguage file. Then the feature vector sequence of the audio signal
to be recognized is computed (with HCopy). Finally HVite performs the
recognition: it uses the language model and the set of hidden Markov models
to recognize the feature vector sequence.
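The per-word training loop described above can be sketched as a list of command lines. The Python function below only builds the argument lists and does not run anything; it uses just the -S option (script file of training data) that the appendix shows for HInit and HRest. The per-word script file names are hypothetical, and a real run would normally need further options that are not covered in this text.

```python
def whole_word_training_commands(words):
    """Return argv lists: HInit followed by HRest for every word model."""
    commands = []
    for word in words:
        script = "%s.scp" % word   # hypothetical list of this word's feature files
        commands.append(["HInit", "-S", script, word])   # initialize the model
        commands.append(["HRest", "-S", script, word])   # re-estimate it
    return commands


for argv in whole_word_training_commands(["trais", "phair"]):
    print(" ".join(argv))
```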

Figure 4.13: Whole-word training and the supporting tools





Figure 4.14: Operation of HInit

Figure 4.15: Operation of HCompV






Figure 4.16: Operation of HRest




4.3 Phoneme-model recognition

Figure 4.17: Phoneme-model training with HTK
Training phoneme models uses the tools HInit, HRest, HCompV and HERest. The
hidden Markov models can be initialized in two ways. The first way is used
when the training data has been labelled; it uses HInit and HRest to create a
model for each phoneme. The second way uses HCompV to initialize an HMM when
the data has not been labelled (only a transcription is available); the
initialized model is then copied once for each phoneme. After initialization,
the models are trained with HERest (two passes are recommended). The set of
models is then ready for recognition.
HVite is again used for recognition with phoneme models. It works together
with the language model (created by HParse) and a pronunciation dictionary,
in which each word is defined by its pronunciation. (Whole-word recognition
with HVite also needs a pronunciation dictionary, but only as a formality:
each word is simply pronounced as itself.)
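The remark about the formal dictionary in whole-word recognition can be illustrated as follows. The sketch assumes the simplest dictionary layout, one line per word with the word followed by its pronunciation; the function name is hypothetical and the sorting is a precaution rather than something stated in the text.

```python
def whole_word_dictionary(words):
    """One entry per word, whose 'pronunciation' is the word's own model."""
    return ["%s %s" % (w, w) for w in sorted(set(words))]


for entry in whole_word_dictionary(["trais", "phair", "sil"]):
    print(entry)
```

In a phoneme-based system the right-hand side would instead list the phoneme models that make up the word.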





Figure 4.18: File processing in HERest




Chapter 5 APPLICATION: VOICE CONTROL OF AN AUTOMATED VEHICLE
The syllables used by the vehicle-control recognition system are:
No.  Syllable    Model     Command
1    trái        trais     Steer the vehicle left
2    phải        phair     Steer the vehicle right
3    thẳng       thawngr   Drive straight
4    tiến        tieens    Move forward
5    lui         lui       Move backward
6    dừng        duwngf    Stop the vehicle
7    (silence)   sil       Used only in recognition
The system uses 100 training samples for each syllable (600 training samples
in total). It uses the following language model:

Figure 5.1: Language model of the recognition system
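The last column of the table amounts to a lookup from recognized model names to vehicle commands. A minimal Python sketch of that lookup is given below; the command strings and the function name are illustrative assumptions, not the actual interface of the demo program.

```python
COMMANDS = {
    "trais": "steer left",
    "phair": "steer right",
    "thawngr": "drive straight",
    "tieens": "move forward",
    "lui": "move backward",
    "duwngf": "stop",
}


def commands_from_labels(labels):
    """Translate recognized model names into commands, skipping silence."""
    return [COMMANDS[label] for label in labels if label != "sil"]


print(commands_from_labels(["sil", "trais", "sil", "duwngf"]))
```

The sil model produces no command, which matches its role in the table as a unit used only during recognition.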




5.1 Vietnamese speech recognition experiments
5.1.1 Offline recognition
5.1.1.1 Whole-word recognition using LPCEPSTRA_E_D
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600
Test samples: 300
Correct: 300 (accuracy: 100%)
5.1.1.2 Whole-word recognition using LPCEPSTRA_E_D_A
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600
Test samples: 300
Correct: 300 (accuracy: 100%)
Remark: In this recognition system, LPC cepstral features with both the
first- and second-order derivatives (a 39-dimensional vector) are not
needed. To save computation, the first-order derivative alone is enough,
that is, the LPCEPSTRA_E_D feature vector (26 dimensions).
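The two dimensions quoted in the remark follow from how the qualifiers stack. The Python sketch below reproduces that arithmetic under one assumption that the text does not state explicitly, namely 12 cepstral coefficients per frame; the function name is hypothetical.

```python
def feature_dim(target_kind, num_ceps=12):
    """Dimension of a feature kind such as 'LPCEPSTRA_E_D_A' (12 cepstra assumed)."""
    _base, *qualifiers = target_kind.split("_")
    static = num_ceps
    if "E" in qualifiers:
        static += 1      # log energy appended
    if "0" in qualifiers:
        static += 1      # zeroth cepstral coefficient appended
    # one block of static values, plus one block per derivative order requested
    blocks = 1 + ("D" in qualifiers) + ("A" in qualifiers)
    return static * blocks   # _Z (mean normalization) does not change the size


for kind in ("LPCEPSTRA_E_D", "LPCEPSTRA_E_D_A", "MFCC_0_D_A_Z"):
    print(kind, feature_dim(kind))
```

Dropping the second-order derivative therefore removes 13 of the 39 values in every frame.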
5.1.1.3 Whole-word recognition using MFCC_0_D
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600




Test samples: 300
Correct: 300 (accuracy: 100%)
5.1.1.4 Whole-word recognition using MFCC_0_D_A
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600
Test samples: 300
Correct: 300 (accuracy: 100%)
5.1.1.5 Whole-word recognition using MFCC_0_D_A_Z
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600
Test samples: 300
Correct: 300 (accuracy: 100%)
5.1.1.6 Recognition with triphone phoneme models using MFCC_0_D_A_Z
Recognition on the training set:
Samples recognized: 600
Correct: 600 (accuracy: 100%)
Recognition on new data:
Training samples: 600
Test samples: 300
Correct: 300 (accuracy: 100%)




5.1.2 Real-time (online) recognition
5.1.2.1 Whole-word recognition using MFCC_0_D_A_Z
Training samples: 600
Test samples: 180
o trais: 30
    Correct: 29
    Incorrect: 1 (recognized as duwngf)
    Accuracy: 96.67%
o phair: 30
    Correct: 28
    Incorrect: 2 (recognized as trais)
    Accuracy: 93.33%
o thawngr: 30
    Correct: 29
    Incorrect: 1 (recognized as phair)
    Accuracy: 96.67%
o tieens: 30
    Correct: 28
    Incorrect: 2 (recognized as duwngf)
    Accuracy: 93.33%
o lui: 30
    Correct: 30
    Accuracy: 100%
o duwngf: 30
    Correct: 30
    Accuracy: 100%
Total correct: 174 (accuracy: 96.67%)




5.1.2.2 Recognition with triphone phoneme models using MFCC_0_D_A_Z
Training samples: 600
Test samples: 180
o trais: 30
    Correct: 28
    Incorrect: 1 (recognized as duwngf)
    Accuracy: 96.67%
o phair: 30
    Correct: 28
    Incorrect: 1 (recognized as trais)
    Accuracy: 93.33%
o thawngr: 30
    Correct: 30
    Incorrect: 1 (recognized as phair)
    Accuracy: 93.33%
o tieens: 30
    Correct: 28
    Incorrect: 2 (recognized as duwngf)
    Accuracy: 93.33%
o lui: 30
    Correct: 30
    Accuracy: 100%
o duwngf: 30
    Correct: 30
    Accuracy: 100%
Total correct: 175 (accuracy: 97.22%)
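The overall figures of the two online experiments can be cross-checked from the misrecognition counts as listed for each syllable (30 test samples per syllable). The Python sketch below does only that arithmetic; the function and variable names are hypothetical.

```python
def accuracy(errors_per_word, samples_per_word=30):
    """Return (correct, total, percent) from per-word error counts."""
    total = samples_per_word * len(errors_per_word)
    correct = total - sum(errors_per_word.values())
    return correct, total, round(100.0 * correct / total, 2)


# Misrecognition counts as listed in sections 5.1.2.1 and 5.1.2.2
whole_word = {"trais": 1, "phair": 2, "thawngr": 1, "tieens": 2, "lui": 0, "duwngf": 0}
triphone = {"trais": 1, "phair": 1, "thawngr": 1, "tieens": 2, "lui": 0, "duwngf": 0}

print(accuracy(whole_word))
print(accuracy(triphone))
```

With these counts the whole-word system has 6 errors out of 180 and the triphone system 5 out of 180, which agrees with the reported totals of 174 and 175 correct samples.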




5.2 Speech recognition application
Based on the experiments above, the vehicle-control recognition system uses
triphone phoneme models with MFCC_0_D_A_Z features.
Operating model of the application:
[Block diagram: Microphone -> Signal capture (frame by frame) -> Feature
extraction -> Recognition, using the HMM set and the language model ->
Processing and command issuing -> Device]
Some pictures of the device.




CONCLUSION
Through the study of Vietnamese speech recognition, this thesis accomplished
the following:
o Surveyed speech features and tried applying them to Vietnamese speech
  recognition.
o Surveyed phoneme models, and tested and applied recognition with triphone
  phoneme models.
o Implemented a real-time recognition system and applied it to device
  control.
Within the limits of one person's time and effort, I have only taken a first
step in speech recognition research, so the thesis certainly still has many
shortcomings. Compared with the state of the field, the results achieved here
are modest, but I hope that this thesis will contribute in some part to
advancing the research and application of Vietnamese speech recognition
systems.
The demo application was installed on a system with the following
specifications:
o PC: AMD XP 2500+ 1.8 GHz, FSB 333, 512 MB DDR.
o Onboard sound card.
o Handheld microphone for recording.
o Speech recorded at a sampling rate of 16000 Hz with 16 bits per sample.
Future work:
In the short time available, the collected data was not yet rich enough
(in variety or in quantity), so the results are not yet fully accurate. The
accuracy of the recognizer can therefore be improved by enlarging the
training data.




Further study of the phonetic characteristics of Vietnamese that affect tone
could lead to better feature vectors that characterize Vietnamese speech more
faithfully. This is a promising direction for the future.
At present, syllables are segmented from a signal only by checking its energy
level, so the accuracy of word segmentation within an utterance is not high.
Properties of the fundamental frequency could be used to support word
segmentation and to remove silence, background noise and interference.













Appendix SOME HTK TOOLS
1. HCopy
HCopy is the HTK tool for converting between the file formats supported by
HTK (see Figure 4.5); it is the tool that extracts features from files
containing speech. HCopy can be used as follows:
Step 1: Create a script file (for example convert.scp) that lists the files
        to convert and the corresponding output files.
        Each line of the script file contains:
input_file_name corresponding_output_file_name
Example:
Data\Nhanh\Wav\nhanh.001.wav Data\Nhanh\MFCC\nhanh.001.mfcc
Data\Nhanh\Wav\nhanh.002.wav Data\Nhanh\MFCC\nhanh.002.mfcc
Data\Nhanh\Wav\nhanh.003.wav Data\Nhanh\MFCC\nhanh.003.mfcc
Data\Nhanh\Wav\nhanh.004.wav Data\Nhanh\MFCC\nhanh.004.mfcc
Data\Nhanh\Wav\nhanh.005.wav Data\Nhanh\MFCC\nhanh.005.mfcc
Data\Nhanh\Wav\nhanh.006.wav Data\Nhanh\MFCC\nhanh.006.mfcc
Step 2: Create a configuration file (named HCopy.cfg) that specifies the
        source file kind, the target file kind, the source and target rates
        (given as sample periods) and the window size, all in units of
        100 ns; the other settings keep their default values.
Example:
SOURCEKIND = WAVEFORM       # waveform
SOURCEFORMAT = WAV          # .wav file
SOURCERATE = 625            # source sampling rate 16 kHz
TARGETKIND = LPCEPSTRA      # output kind is LPCEPSTRA
TARGETFORMAT = HTK          # HTK file format
TARGETRATE = 100000         # output frame rate 100 Hz
WINDOWSIZE = 250000.0       # window size 25 ms




Step 3: Run HCopy from the command line, passing the files created above,
        for example:
HCopy -C HCopy.cfg -S convert.scp
When it finishes, the output files have been created as required.
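The numeric settings in the HCopy configuration example are all expressed in units of 100 ns, which is easy to get wrong by hand. The Python sketch below shows the conversions behind the three values and also generates script file lines of the form used in Step 1. The function names are hypothetical and the directory names are taken from the example above.

```python
def period_in_htk_units(rate_hz):
    """Sample or frame period for a rate in Hz, in units of 100 ns."""
    return 10_000_000 / rate_hz


def duration_in_htk_units(milliseconds):
    """A duration in milliseconds, in units of 100 ns."""
    return milliseconds * 10_000


def scp_lines(stems, wav_dir, out_dir, ext):
    """One 'source target' line per file stem, using backslash separators."""
    return ["%s\\%s.wav %s\\%s.%s" % (wav_dir, s, out_dir, s, ext) for s in stems]


print(period_in_htk_units(16000))   # value for SOURCERATE at 16 kHz
print(period_in_htk_units(100))     # value for TARGETRATE at 100 frames per second
print(duration_in_htk_units(25))    # value for WINDOWSIZE at 25 ms
for line in scp_lines(["nhanh.001", "nhanh.002"],
                      "Data\\Nhanh\\Wav", "Data\\Nhanh\\MFCC", "mfcc"):
    print(line)
```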
2. HCompV
HCompV initializes a hidden Markov model when the training set has not been
labelled. It can be used as follows:
Step 1: Create a script file listing all the training files (for example
        train.scp).
Step 2: Create the HMM prototype file (named proto).
Step 3: Run HCompV from the command line with its arguments, for example:
HCompV -S train.scp proto
When it finishes, the file proto contains the initialized HMM.
3. HInit
HInit initializes a hidden Markov model when the training set has been
labelled, or when each training file contains only one word (or phoneme)
corresponding to one model file. It can be used as follows:
Step 1: Create a script file listing all the training files (for example
        train.scp).
Step 2: Create the HMM prototype file (named proto).
Step 3: Run HInit from the command line with its arguments, for example:
HInit -S train.scp proto
When it finishes, the file proto contains the initialized HMM.
4. HParse
HParse creates the language model file (a network) from the grammar file.
It can be used as follows:




Step 1: Create a grammar file suited to the recognition system being built
        (named grammar), for example with the following contents:
(
sil (one | two | three | four | five |
six | seven | eight | nine | zero) sil
)
Step 2: Run HParse from the command line with its arguments, for example:
HParse grammar lattice
When it finishes, the file lattice contains the language network; this file
is used by HVite.
5. HRest
HRest trains a hidden Markov model. It can be used as follows:
Step 1: Create a script file listing all the training files (for example
        train.scp).
Step 2: Initialize the HMM model file (with HInit or HCompV); for the word
        "nhanh", for example, the HMM file is also named nhanh.
Step 3: Run HRest from the command line with its arguments, for example:
HRest -S train.scp nhanh
When it finishes, the file nhanh contains the trained HMM.
6. HERest
HERest performs the training in a speech recognition system based on phoneme
models. It can be used as follows:
Step 1: Create a script file listing all the training files (for example
        train.scp).
Step 2: Prepare the following files: the list of HMM names hmmlist, the set
        of initialized HMMs hmmset, and the Master Label File train.mlf
        containing all the transcriptions (or labels) of the training data.
Step 3: Run HERest from the command line with its arguments, for example:
HERest -I train.mlf -S train.scp -H hmmset hmmlist
Step 3 should be carried out twice. When it finishes, hmmset contains the
set of trained HMMs.
7. HVite
HVite performs the recognition in a speech recognition system based on hidden
Markov models. It can be used as follows:
Step 1: Create a script file listing all the files to be recognized (for
        example test.scp).
Step 2: Prepare the following files: the pronunciation dictionary
        dictionary, the language network lattice, the list of HMM names
        hmmlist, and the set of trained HMMs hmmset.
Step 3: Run HVite from the command line with its arguments, for example:
HVite -w lattice -i recout.mlf -S test.scp -H hmmset dictionary hmmlist
When it finishes, the file recout.mlf has been created; it is a Master Label
File describing the data that was recognized.
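Since the recognition result is itself a Master Label File, a program that acts on it first has to read the file back. The Python sketch below reads the immediate-transcription form described in section 4.1.3 and keeps only the label names. It is a simplified illustration: the function name is hypothetical, the sample text is invented, and label names made up only of digits would be mistaken for time stamps.

```python
def parse_mlf(text):
    """Map each quoted file pattern in an MLF to its list of label names."""
    result, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):
            current = line.strip('"')      # start of a new transcription
            result[current] = []
        elif line == ".":
            current = None                 # end of the current transcription
        elif current is not None:
            fields = line.split()
            # drop the leading start/end times when they are present
            while len(fields) > 1 and fields[0].isdigit():
                fields.pop(0)
            result[current].append(fields[0])
    return result


sample = """#!MLF!#
"*/a.lab"
000000 590000 sil
600000 2090000 a
2100000 4500000 sil
.
"*/nhanh.???.lab"
nhanh
.
"""
print(parse_mlf(sample))
```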
