
Supervisor: Đào Thị Thu Thủy, M.Sc.


INTRODUCTION
Prosody is what gives the human voice its distinctive timbre. The prosody of speech is closely tied to intonation, which is the rise and fall of the voice within a sentence. Vietnamese is a fairly complex language that involves both prosody and intonation, so speech recognition for it demands a great deal of investment and research. Results so far remain incomplete, however, because the object being recognized, human speech and Vietnamese speech in particular, is complex and variable.
Many approaches to speech recognition exist today. The Fujisaki model is widely used in systems for Japanese, the MFGI model (Mixdorff-Fujisaki model of German Intonation) is used for German, and there are hidden Markov models (HMM), among others.
Within these models, different recognition methods are applied, each with its own characteristics and advantages:
- LPC (linear predictive coding): its weakness is that words with similar pronunciations are often confused.
- AMDF (average magnitude difference function): its advantages are a small number of inputs, a small training network, and little dependence on the manner of pronunciation, so its misrecognition rate is lower than that of LPC. Its drawbacks are that it does not distinguish tones and is hard to use when words are spoken continuously.
- AMDF & LPC: given the strengths and weaknesses of LPC and AMDF, the two methods need to be combined.
- MFCC (mel-frequency cepstrum coefficients), the fourth method.
Speech recognition is a pattern recognition process whose goal is to classify the speech signal into a sequence of patterns that were learned beforehand and stored in the recognition system. The patterns are the recognition units; they can be words or phonemes. If these patterns were invariant, speech recognition would reduce to comparing the incoming speech data with the learned patterns stored in the system.
Speech recognition is not a new field, but it is an extremely complex one. Research on it began around the world more than 50 years ago, and the practical results achieved so far are very encouraging. Even so, it will be a long time before anyone builds a system that understands speech the way a human does.
Within the scope of a graduation project, we build a program that recognizes the ten Vietnamese digits using tools available in Matlab. The long-term aim is a program that recognizes all Vietnamese words and sentences and can be used in practice. However, we are new to this field, our skills and knowledge are still limited, and we faced constraints on time and equipment, so we could only build a small recognition system. If we have the opportunity to study the field in more depth, we hope to develop this project further so that it can be applied in practice.


Chapter 1 OVERVIEW OF SPEECH RECOGNITION

1.1 Speech recognition
Put simply, speech recognition (speech recognition by machine) is the use of a computer to convert a language signal from acoustic form into text. More precisely, speech recognition is the segmentation and linguistic labeling of the speech signal.
Speech recognition has many applications:
- Dictation: the most widely used application of recognition systems. Instead of typing on a keyboard, the user speaks into a microphone and the machine identifies the words spoken.
- Control and wireless communication: for example, a system that lets a computer take voice commands such as running a program or shutting down. Compared with standard input devices such as the keyboard and mouse, voice is convenient and fast, is not constrained by cables or distance, and requires no training to use.
- Telephony: some systems (mobile phones, for example) let the user say a contact's name instead of dialing a number. Others (at banks and stock exchanges) automatically answer calls asking about accounts.

Speech recognition nevertheless faces many difficulties. The main ones are:
- Speech is a time-varying signal. Every person has a different voice and pronunciation, and even one person saying the same word twice does not produce identical signals (they differ in speed, loudness, and so on).
- Current machine recognition methods are rather mechanical and still far from human reasoning.
- Noise is always present in the environments where recognition systems operate, and it strongly affects recognition results.

Because of these difficulties, speech recognition draws on knowledge from many related sciences:
- Signal processing: methods for extracting characteristic, stable information from the speech signal and for reducing the effects of noise and of the time variation of speech.
- Acoustics: the relationship between the physical speech signal and the physiological mechanisms of human speech production and hearing.
- Pattern recognition: algorithms for classifying, training on, and comparing data patterns.
- Information theory: statistical and probabilistic models; search, coding, and decoding algorithms; estimation of model parameters.
- Linguistics: the relationship between phonetics and the semantics, grammar, and context of speech.
- Psychophysiology: the higher-level mechanisms of the neural system of the human brain in hearing and speaking.
- Computer science: algorithms and methods for implementing and using recognition systems efficiently in practice.
Three basic principles of speech recognition:
- The speech signal is represented accurately by its spectral values over a short time frame. Features can therefore be extracted from short time intervals and used as the data for recognition.
- The content of speech is represented in written form as a sequence of phonetic symbols.
- Speech recognition is a cognitive process. Spoken language carries meaning, so semantic information and inference are valuable during recognition, especially when the acoustic information is unclear.


1.2 Some common speech recognition methods

1.2.1 Pattern comparison by dynamic programming
A captured signal has to be compared against all the stored templates, which costs a great deal of computation time. Dynamic programming is used to reduce that time and speed up recognition. In this pattern recognition method, the word to be recognized is compared with each template stored in the system, two signals at a time, to find the template with the smallest error. Sound signals produced at different times are never exactly alike; they always differ in stress, intonation, speed, and so on. The two patterns therefore have to be compared with a warping algorithm that minimizes the error. DTW (Dynamic Time Warping) can be regarded as the most effective algorithm for comparing signals of different lengths with the smallest error. The algorithm works recursively.
For example, subroutines (procedures) are called automatically with different parameters to compute the error against each template signal. The template whose error relative to the input signal is smallest is the one sought.
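The template search described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the sequences are made-up one-dimensional values (real systems compare feature vectors), the local cost is an absolute difference, and the variable names are hypothetical.

```matlab
% Minimal DTW sketch: pick the stored template closest to an input pattern.
x = [1 2 3 4 3 2];                       % input pattern (hypothetical data)
templates = {[1 1 2 3 4 4 3 2], ...      % same shape as x, spoken more slowly
             [4 4 3 2 1 1 2 3]};         % a different shape
dist = zeros(1, numel(templates));
for t = 1:numel(templates)
    y = templates{t};
    n = numel(x);
    m = numel(y);
    D = inf(n + 1, m + 1);               % D(i+1,j+1): best cost aligning x(1:i) with y(1:j)
    D(1, 1) = 0;
    for i = 1:n
        for j = 1:m
            cost = abs(x(i) - y(j));     % local distance between two samples
            best = min([D(i, j + 1), D(i + 1, j), D(i, j)]);
            D(i + 1, j + 1) = cost + best;   % dynamic-programming recursion
        end
    end
    dist(t) = D(n + 1, m + 1);           % accumulated error for this template
end
[minDist, bestIdx] = min(dist);          % the template with the smallest error wins
fprintf('Best template: %d (distance %g)\n', bestIdx, minDist);
```

The first template is a time-stretched copy of the input, so warping absorbs the difference in speed, while the second template keeps a non-zero error.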
1.2.2 Hidden Markov Model (HMM) method
An isolated-word recognition system based on HMMs has the following block diagram:
[Block diagram with the blocks: input speech; analysis and parameter extraction; vector quantization; comparison against the stored HMM templates; selection rule; recognized word]
The word to be recognized is divided into a time sequence of T frames and analyzed with an algorithm such as MFCC, linear predictive coding (LPC) analysis, or the fast Fourier transform (FFT). This step yields the observation sequence $O_t$ ($t = 1, 2, 3, \dots, T$). The sequence $O_t$ is then vector-quantized against a set of M representative speech patterns. The system compares the input word with the M speech patterns and recognizes it as the stored word that matches it most closely.
Mathematically, each word model $M_i$, $i = 1, 2, \dots, W$, is defined by the parameter set $[A, B, \pi]$.
Let $P(O_t \mid M_i)$ be the probability of obtaining the observation sequence $O_t$ from model $M_i$. The recognized word RW is given by:
$$RW = \arg\max_{i} P(O_t \mid M_i)$$
where argmax returns the index i of the model $M_i$ with the highest probability $P(O_t \mid M_i)$.
Computing $P(O_t \mid M_i)$ requires considering every state sequence that could produce the observation sequence and then finding the state sequence with the highest probability. Doing so exhaustively is impractical because the number of state sequences is enormous. To reduce the computation, two recursive methods can be used: the Baum-Welch algorithm and the Viterbi algorithm.
1.2.3 Neural network method
Neural networks with a multilayer perceptron structure, as in the figure, are widely used in recognition systems. The perceptron is the simplest kind of layered network (a network with no connections between processing units in the same layer and no connections from the output layer back to the input layer) and uses a supervised learning algorithm. A perceptron network consists of processing units arranged in layers and is trained with the delta rule or one of its variants. The layers comprise an input layer, a hidden layer, and an output layer. Weighted connections link each processing unit in one layer to all the processing units in the adjacent layer.
[Figure: Perceptron networks. (a) Single-layer perceptron, (b) multilayer perceptron]
A network of this kind is trained by presenting a pattern vector at the input layer and computing the outputs. The outputs are then compared with the desired output patterns. The error between the actual and desired outputs is computed and propagated back through the network to every unit, and the input weights of each unit are adjusted to minimize the error. This process is repeated until the actual output differs from the desired output by less than a predefined tolerance. Many input-output pattern pairs are passed through the network, and the process is repeated for each pair. Recognition then consists of presenting an unknown speech pattern at the input nodes of the trained network and computing the values of the output nodes to identify the pattern.
1.2.4 Artificial intelligence method
The basic idea of applying artificial intelligence to speech recognition is to gather knowledge from different knowledge sources to solve the problem at hand. For example, using artificial intelligence for speech segmentation and labeling requires combining acoustic, phonetic, lexical, syntactic, semantic, and pragmatic knowledge.
- Acoustic knowledge: knowledge of the characteristics of the sounds (phonetic units) being produced, based on spectral measurements and on voiced and unvoiced properties.
- Lexical knowledge: the rules set by the lexicon for combining sounds into words and, conversely, for breaking words down into sounds.
- Syntactic knowledge: how words combine into grammatically correct phrases or sentences.
- Semantic knowledge: an understanding of context, so that sentences or phrases fit the intended meaning and are consistent with the surrounding structures.
- Pragmatic knowledge: the logical inference needed to resolve ambiguity based on the way words are commonly used.
There are several ways to integrate these knowledge sources into a speech recognition system. The most common is bottom-up processing, in which the processing stages run sequentially from low level to high level. Analysis of the input signal, feature extraction, segmentation, and labeling come first, followed by sound classification and then word and sentence determination. Each stage requires a knowledge source, and these knowledge sources accumulate gradually through practical use, much like human knowledge.

1.2.5 Two-word and three-word models
To make the system more accurate, a statistical method can be integrated alongside HMM-based recognition. Two-word and three-word statistics accumulated over many recognition runs are used to build a context-checking stage. The advantage of this method is that the system remembers the contexts the speaker habitually uses. The longer the system works with one person, the more familiar it becomes with that person's way of speaking, and accuracy improves accordingly. When the system works with the two-word statistical model, two words in the sentence just read are compared with each other. If that pair of words already exists in earlier sentences held in the statistics memory, the recognized word is taken to be correct. Recognition systems from Dragon, Philips, and Lernout & Hauspie all use the two-word statistical method. The three-word statistical method works in the same way and gives more accurate results, but the system runs more slowly because the method is far more complex than the two-word one.


1.3 Methods for analyzing the features of the speech signal
[Figure: Speech feature extraction]

1.3.1 The LPC model (Linear Predictive Coding model)
The LPC model is widely used in speech recognition systems for the following reasons:
- LPC provides a good model of the speech signal. For quasi-stationary sounds in particular, the LPC model gives a fairly good approximation of the sound spectrum. It is less effective in transient and unvoiced regions than in voiced ones, but it still provides a model that serves well for speech recognition.
- The way LPC is applied to speech analysis leads to a reasonable separation of the sound source, so a detailed representation of the characteristics of the vocal tract is entirely feasible.
- The LPC computation is mathematically precise and is simple and direct to implement in either hardware or software. LPC processing also requires less computation than the filter-bank method.
- The LPC model works well in recognition applications. Experience shows that recognition systems using the LPC model give better results than systems using a filter bank.

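The basic idea of the LPC model is that a speech sample at time n, s(n), can be approximated by a linear combination of the p preceding samples:
$$s(n) \approx a_1 s(n-1) + a_2 s(n-2) + \dots + a_p s(n-p) \qquad (1)$$
where the coefficients $a_1, a_2, \dots, a_p$ are assumed constant over the analysis frame. Adding an excitation term $G\,u(n)$ to (1) gives:
$$s(n) = \sum_{i=1}^{p} a_i\, s(n-i) + G\,u(n)$$
where u(n) is the normalized excitation and G is the excitation gain. In the z-domain this becomes:
$$S(z) = \sum_{i=1}^{p} a_i z^{-i} S(z) + G\,U(z)$$
which leads to the transfer function of the model:
$$H(z) = \frac{S(z)}{G\,U(z)} = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}} = \frac{1}{A(z)}$$
LPC analysis equations
Starting from the exact relation between s(n) and u(n),
$$s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + G\,u(n)$$
we treat the linear combination of past samples as an estimate $\tilde{s}(n)$:
$$\tilde{s}(n) = \sum_{k=1}^{p} a_k\, s(n-k)$$
The prediction error e(n) is defined as:
$$e(n) = s(n) - \tilde{s}(n) = s(n) - \sum_{k=1}^{p} a_k\, s(n-k)$$
with the error transfer function:
$$A(z) = \frac{E(z)}{S(z)} = 1 - \sum_{k=1}^{p} a_k z^{-k}$$
The basic problem of linear prediction analysis is to determine the set of predictor coefficients $\{a_k\}$ directly from the speech signal so that the spectral properties of the filter match those of the speech waveform within the analysis window.
Because the spectral characteristics of speech change over time, the predictor coefficients at time n must be estimated from a short segment of the speech signal around n. The basic approach is therefore to find the set of predictor coefficients that minimizes the squared prediction error over a short segment of the speech waveform. Speech is usually analyzed in consecutive frames about 10 ms long.
This problem is solved with the autocorrelation method, in which the estimated coefficients $\hat{a}_k$ are the solution of:
$$\sum_{k=1}^{p} r_n(|i-k|)\,\hat{a}_k = r_n(i), \quad 1 \le i \le p$$
or, in matrix form:
$$\begin{bmatrix} r_n(0) & r_n(1) & \cdots & r_n(p-1) \\ r_n(1) & r_n(0) & \cdots & r_n(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_n(p-1) & r_n(p-2) & \cdots & r_n(0) \end{bmatrix} \begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \vdots \\ \hat{a}_p \end{bmatrix} = \begin{bmatrix} r_n(1) \\ r_n(2) \\ \vdots \\ r_n(p) \end{bmatrix}$$
where r(k) is the autocorrelation of the signal at a lag of k samples:
$$r(k) = \sum_{n=-\infty}^{\infty} x(n)\, x(n+k)$$
This system of equations is solved with the Levinson-Durbin algorithm.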


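The Levinson-Durbin algorithm:
Initialization: p = 1
Compute the first-order mean squared error:
$$E_1 = r(0)\left(1 - a_1^2(1)\right), \quad \text{where } a_1(1) = \frac{r(1)}{r(0)}$$
Recursion: for p = 2, 3, ..., P
Compute the coefficient $K_p$ (the PARCOR coefficient):
$$K_p = \frac{r(p) - \sum_{i=1}^{p-1} a_{p-1}(i)\, r(p-i)}{E_{p-1}}$$
Compute the order-p predictor coefficients:
$$a_p(k) = a_{p-1}(k) - K_p\, a_{p-1}(p-k), \quad k = 1, 2, \dots, p-1$$
$$a_p(p) = K_p$$
Compute the order-p mean squared error:
$$E_p = E_{p-1}\left(1 - K_p^2\right)$$
Return to the first step of the recursion with p replaced by p + 1 while $p \le P$.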
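The recursion can be checked numerically with a short script. The sketch below assumes made-up autocorrelation values r(0)..r(3), follows the sign convention used above (A(z) = 1 - sum of a_k z^-k), and cross-checks the result against a direct solve of the Toeplitz system; all variable names are illustrative.

```matlab
% Levinson-Durbin sketch for the normal equations sum_k r(|i-k|) a_k = r(i).
r = [1.0 0.5 0.2 0.1];            % r(0), r(1), ..., r(P): hypothetical values
P = numel(r) - 1;                 % prediction order
a = zeros(1, P);
a(1) = r(2) / r(1);               % a_1(1) = r(1) / r(0)
E = r(1) * (1 - a(1)^2);          % first-order error E_1
for p = 2:P
    acc = r(p + 1);               % r(p)
    for i = 1:p-1
        acc = acc - a(i) * r(p - i + 1);     % minus a_{p-1}(i) r(p-i)
    end
    K = acc / E;                  % PARCOR coefficient K_p
    aPrev = a;
    for k = 1:p-1
        a(k) = aPrev(k) - K * aPrev(p - k);  % a_p(k)
    end
    a(p) = K;                     % a_p(p) = K_p
    E = E * (1 - K^2);            % E_p = E_{p-1} (1 - K_p^2)
end
% Cross-check: solve the same Toeplitz system directly.
R = toeplitz(r(1:P));
aDirect = (R \ r(2:P+1)')';
disp([a; aDirect]);
```

The recursion needs on the order of P^2 operations, whereas a general linear solve needs on the order of P^3, which is why it is preferred for frame-by-frame analysis.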


Steps of the LPC algorithm for extracting signal features
[Figure: Steps of the LPC algorithm]
Step 1: Noise filtering with a high-pass filter
A lower cutoff frequency of 50-250 Hz is used to remove the low-frequency noise introduced by the microphone.
Step 2: Pre-emphasis to flatten the spectrum (spectral flattening)
The signal s(n) is passed through a first-order high-pass filter:
$$H(z) = 1 - a z^{-1}$$
$$\tilde{x}(n) = x(n) - a\,x(n-1), \quad 0.9 \le a \le 1$$
A common choice is a = 0.9375.
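A quick way to see what this filter does is to run it on a short constant burst. The sketch below uses a made-up input; `filter` is the standard MATLAB function, and the coefficient is the a = 0.9375 quoted above.

```matlab
% Pre-emphasis sketch: y(n) = x(n) - a*x(n-1) with a = 0.9375.
a = 0.9375;
x = [1 1 1 1 1 0 0];             % hypothetical input: a constant burst, then silence
y = filter([1 -a], 1, x);        % FIR filter with H(z) = 1 - a z^-1
disp(y);
```

The constant part of the burst is attenuated to 1 - a = 0.0625 while the edges pass almost unchanged, which is the high-pass behaviour that flattens the speech spectrum.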


[Figure: the original signal in blue and the signal after pre-emphasis in green]
Step 3: The signal is split into frames of N samples each, overlapping by M samples, with M = N/3.
- Choose the sampling frequency
- Choose N and M
Step 4: Window each frame to reduce the signal discontinuities at the start and end of the frame. In other words, taper the signal toward 0 at the beginning and end of each frame.
The Hamming window is commonly used:
$$\tilde{x}_l(n) = x_l(n)\, W(n), \quad 0 \le n \le N-1$$
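Steps 3 and 4 can be sketched together. The example below takes the text literally and treats M as the number of overlapping samples, so consecutive frames start N - M samples apart; that reading, the toy signal, and the tiny frame size are assumptions made for illustration. The Hamming window is written out explicitly so that no toolbox is needed.

```matlab
% Frame blocking and Hamming windowing sketch.
x = (1:20)';                              % hypothetical signal (column vector)
N = 6;                                    % frame length in samples
M = 2;                                    % overlap between consecutive frames (M = N/3)
hop = N - M;                              % distance between frame starts
numFrames = floor((numel(x) - N) / hop) + 1;
n = (0:N-1)';
w = 0.54 - 0.46 * cos(2 * pi * n / (N - 1));   % Hamming window W(n)
frames = zeros(N, numFrames);
for l = 1:numFrames
    first = (l - 1) * hop + 1;
    frames(:, l) = x(first:first + N - 1) .* w; % windowed frame x_l(n) W(n)
end
disp(size(frames));
```

Each column of `frames` holds one windowed frame; samples that do not fill a complete final frame are dropped in this sketch.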

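Step 5: Compute the LPC coefficients of each frame with the Levinson-Durbin algorithm:
$a_p(m)$ = LPC coefficients, $0 \le m \le p$
This gives p + 1 coefficients a, with a(0) = 1. Choosing p and dropping a(0) leaves a feature vector of length p for each frame.
Step 6: Convert the linear prediction coefficients into cepstral coefficients:
$$c_0 = \ln \sigma^2 \quad (\sigma^2 \text{ is the gain term } G \text{ of the LPC model})$$
$$c_m = a_m + \sum_{k=1}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \quad 1 \le m \le p$$
$$c_m = \sum_{k=m-p}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \quad p < m \le Q$$
Usually $Q \approx \frac{3}{2}\,p$.
The cepstral coefficients are the coefficients of the Fourier transform of the log magnitude spectrum. They are regarded as more reliable than the LPC coefficients.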
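The conversion in Step 6 is a short recursion. The sketch below uses a single made-up predictor coefficient (p = 1) because that case has a known closed form, c_m = a^m / m, which makes the result easy to check; the choice Q = 3 is only for the example and does not follow the Q = 3p/2 guideline.

```matlab
% LPC-to-cepstrum sketch using the recursion of Step 6.
a = 0.5;                          % predictor coefficients a_1..a_p (hypothetical, p = 1)
p = numel(a);
Q = 3;                            % number of cepstral coefficients to compute
c = zeros(1, Q);
for m = 1:Q
    acc = 0;
    for k = max(1, m - p):(m - 1) % only terms with 1 <= m-k <= p contribute
        acc = acc + (k / m) * c(k) * a(m - k);
    end
    if m <= p
        c(m) = a(m) + acc;        % case 1 <= m <= p
    else
        c(m) = acc;               % case p < m <= Q
    end
end
disp(c);
```

For a = 0.5 the script produces 0.5, 0.125 and 0.0417 (rounded), matching a, a^2/2 and a^3/3.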
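Step 7: Compute the weighted cepstral coefficients
$$\hat{c}_m = w_m\, c_m, \quad 1 \le m \le Q$$
where:
$$w_m = 1 + \frac{Q}{2} \sin\left(\frac{\pi m}{Q}\right), \quad 1 \le m \le Q$$
The weighting reduces the influence of the overall spectral slope on the low-order cepstral coefficients and of noise on the high-order ones. In effect it is a cepstral window that tapers at both ends: the function $w_m$ de-emphasizes $c_m$ around m = 1 and m = Q.
Step 8: Compute the derivative of the cepstral coefficients
$$\Delta \hat{c}_m(t) = \mu \sum_{k=-K}^{K} k\, \hat{c}_m(t+k)$$
where $\mu$ is a normalization constant (usually 0.375) and (2K + 1) is the number of frames over which the derivative is computed.
Result:
The feature vector has 2Q components: Q weighted cepstral coefficients and Q cepstral derivatives.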
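Step 8 amounts to a weighted sum over neighbouring frames. The sketch below assumes the derivative formula mu * sum of k * c(t+k) for k = -K..K, uses made-up cepstral tracks (a ramp and a constant), and simply leaves the first and last K frames at zero, which is one of several possible edge treatments.

```matlab
% Delta-cepstrum sketch: dC(t) = mu * sum_{k=-K..K} k * C(t+k).
mu = 0.375;                          % normalization constant
K = 3;                               % (2K+1) = 7 frames per estimate
T = 10;                              % number of frames
C = [(1:T)', 2 * ones(T, 1)];        % hypothetical tracks: a ramp and a constant
dC = zeros(size(C));
for t = (K + 1):(T - K)              % edge frames are left at zero (assumption)
    for k = -K:K
        dC(t, :) = dC(t, :) + mu * k * C(t + k, :);
    end
end
disp(dC(5, :));
```

For the ramp the sum reduces to mu times the sum of k^2, that is 0.375 * 28 = 10.5, and for the constant track it is zero, as expected of a derivative estimate.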
Some commonly used parameters [1]

Parameter   Fs = 6.67 kHz    Fs = 8 kHz       Fs = 10 kHz
N           300 (45 msec)    240 (30 msec)    300 (30 msec)
M           100 (15 msec)    80 (10 msec)     100 (10 msec)
p           (not legible)    10               10
Q           12               12               12

Table: LPC parameters


1.3.2 The MFCC method (Mel-Frequency Cepstrum Coefficients)
Alongside LPC, MFCC is another widely used method. MFCC is based on studies of the critical bands of the human ear with respect to frequency. To capture the phonetically important features, filters spaced linearly at low frequencies and logarithmically at high frequencies are used. In this method the mel scale is linear for frequencies below 1000 Hz and logarithmic for frequencies above 1000 Hz.

1.3.2.1 Mel-frequency scale
Psychophysical studies show that human perception of the frequency content of speech sounds does not follow a linear scale. A measure based on the mel scale is therefore used.
To convert from the frequency scale to the mel scale we use:
$$m = 1127.01048 \ln\left(1 + \frac{f}{700}\right) \quad \text{or} \quad \mathrm{Mel}(f) = m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \qquad \text{(Mel)}$$
and the inverse conversion is:
$$f = 700\left(e^{m/1127.01048} - 1\right)$$
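The two forms of the mel formula and the inverse can be compared numerically. The sketch below uses anonymous functions with illustrative names (`hz2mel`, `mel2hz`) and a few arbitrary test frequencies.

```matlab
% Mel-scale conversion sketch based on the formulas above.
hz2mel = @(f) 1127.01048 * log(1 + f / 700);       % natural-log form
mel2hz = @(m) 700 * (exp(m / 1127.01048) - 1);     % inverse conversion
f = [100 500 1000 4000];                           % test frequencies in Hz
m = hz2mel(f);
fBack = mel2hz(m);                                 % should reproduce f
mAlt = 2595 * log10(1 + f / 700);                  % base-10 form of the same scale
disp([f; m; mAlt; fBack]);
```

By construction 1000 Hz maps to about 1000 mel, and the two forms agree to within a small fraction of a mel over the speech band.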

[Figure: The mel frequency scale]
1.3.2.2 Feature extraction with the MFCC method
[Figure: The MFCC feature extraction process]
a) Frame blocking
The signal is cut into frames of N samples with an overlap of M samples, usually M = N/3.
(Here N = 512, which is convenient for the FFT, and M = 100.)
b) Windowing
Each frame is windowed to reduce the signal discontinuities at its start and end. In other words, the signal is tapered toward 0 at the beginning and end of each frame.
The Hamming window is commonly used:
$$\tilde{x}_l(n) = x_l(n)\, W(n), \quad 0 \le n \le N-1$$

c) Fast Fourier transform (FFT)
After multiplication by the window function, each frame is transformed to the frequency domain with the discrete Fourier transform:
$$X_n = \sum_{k=0}^{N-1} x_k\, e^{-2\pi j k n / N}, \quad n = 0, 1, 2, \dots, N-1$$
d) Mel-frequency conversion
The conversion uses formula (Mel):
$$B(f) = 1127.01048 \ln\left(1 + \frac{f}{700}\right)$$
Figure 1: Triangular filters computing the energy in each frequency band
e) Warping and the DCT
To compute M MFCC coefficients, the mel scale is divided into M bands, each of width $B_{max}(f)/M$. From these bands we build M triangular filters $H_m$ and compute M energy values:
$$S_m = \log\left[\sum_{k=1}^{N} |X(k)|^2\, H_m(k)\right], \quad m \in \{1, 2, \dots, M\}$$
The discrete cosine transform (DCT) is then applied to obtain the MFCC coefficients:
MFCC = DCT($S_m$)
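The filter-bank stage can be sketched as below. The layout is an assumption: the filters are overlapping triangles whose edges are equally spaced on the mel scale between 0 and Fs/2, which is a common construction but not necessarily the exact one used in the project. The sampling rate, the filter count and the 1 kHz test tone are made-up values, and the final DCT is left out.

```matlab
% Mel filter-bank sketch: triangular filters and log band energies S_m.
fs = 8000;                                   % sampling rate in Hz (hypothetical)
N = 512;                                     % FFT length
M = 20;                                      % number of triangular filters
hz2mel = @(f) 1127.01048 * log(1 + f / 700);
mel2hz = @(m) 700 * (exp(m / 1127.01048) - 1);
edgesHz = mel2hz(linspace(0, hz2mel(fs / 2), M + 2));  % M+2 edges, equal mel spacing
binHz = (0:N/2) * fs / N;                    % centre frequency of each FFT bin
H = zeros(M, N/2 + 1);
for m = 1:M
    lo = edgesHz(m);
    ce = edgesHz(m + 1);
    hi = edgesHz(m + 2);
    rising = (binHz - lo) / (ce - lo);       % left slope of the triangle
    falling = (hi - binHz) / (hi - ce);      % right slope of the triangle
    H(m, :) = max(0, min(rising, falling));  % triangle with peak 1 at ce
end
x = sin(2 * pi * 1000 * (0:N-1) / fs);       % test frame: a 1 kHz tone
X = fft(x);
P = abs(X(1:N/2 + 1)).^2;                    % power spectrum |X(k)|^2
S = log(H * P' + eps);                       % log energy in each mel band
[maxS, strongest] = max(S);
fprintf('Strongest band is centred near %.0f Hz\n', edgesHz(strongest + 1));
```

Applying a DCT to `S` would then give the MFCC vector of the frame; the small `eps` term only guards against taking the logarithm of zero in empty bands.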
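Discrete cosine transform:
$$C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[\frac{\pi u}{N}\left(x + \frac{1}{2}\right)\right], \quad u = 0, 1, 2, \dots, N-1$$
Inverse transform:
$$f(x) = \sum_{u=0}^{N-1} \alpha(u)\, C(u) \cos\left[\frac{\pi u}{N}\left(x + \frac{1}{2}\right)\right], \quad x = 0, 1, 2, \dots, N-1$$
with
$$\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \ne 0 \end{cases}$$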

1.3.2.3 Other issues
a) Determining the start and end points of the signal (speech detection)
The purpose of speech detection is to separate the speech segments of interest from the rest of the signal (background, noise, and so on). This is essential in many areas. For automatic speech recognition, speech detection is needed to isolate the speech segments from which the patterns used for recognition are built.
The question is how to locate the speech signal accurately so as to provide the best patterns for recognition. When the signal is recorded under near-ideal conditions (almost no noise), locating the speech accurately is not difficult. In practice, however, several issues make it harder. One of the most typical is the speaker's manner of articulation: while speaking, people often produce artifacts such as lip smacks, breath noise, or clicks in the mouth.
The second factor is the environment in which the speech is produced. An ideal environment with almost no noise is unrealistic, so speech has to be handled in noisy environments (machinery, fans, the murmur of people nearby) and even in non-stationary ones (slamming doors, traffic, and so on).
The last factor that degrades signal quality is loss in the transmission system, such as the quality of the communication channel or losses introduced by quantization and digitization.
Speech detection is especially important for recognition based on pattern comparison, and it also improves pattern quality for the HMM and neural network methods. Because this project concentrates on HMMs and neural networks, it does not go deeply into speech detection; the speech signal is detected at a threshold of 5%.
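One simple reading of the 5% threshold is sketched below. The text does not say what the 5% refers to, so the sketch assumes it means 5% of the peak absolute amplitude; that interpretation, the synthetic signal and the variable names are all assumptions.

```matlab
% Endpoint detection sketch: keep the span where |x| exceeds 5% of the peak.
x = [zeros(1, 50), sin(2 * pi * 0.05 * (0:99)), zeros(1, 30)];  % silence, tone, silence
thr = 0.05 * max(abs(x));          % threshold: 5% of the peak amplitude (assumption)
active = find(abs(x) > thr);       % indices of samples above the threshold
first = active(1);                 % start point
last = active(end);                % end point
speech = x(first:last);            % signal with leading and trailing silence removed
fprintf('Speech runs from sample %d to sample %d\n', first, last);
```

On real recordings a sample-wise amplitude test like this is sensitive to clicks and background noise, which is why practical detectors usually work on short-time energy per frame.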


Chapter 2 ALGORITHMS AND MODELS FOR SPEECH RECOGNITION

2.1 Introduction
A hidden Markov model is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the task is to determine the hidden parameters from the observable ones. The extracted model parameters can then be used for further analysis, for example in pattern recognition.
In an ordinary Markov model the state is directly visible to the observer, so the state transition probabilities are the only parameters. A hidden Markov model adds outputs: each state has a probability distribution over the possible output symbols. Looking at the sequence of symbols generated by an HMM therefore does not directly reveal the sequence of states.
Note: in probability theory, a Markov process is a stochastic process with the following property. The state $c_k$ at time k takes a value in the finite set {1, ..., M}. Assuming the process runs only from time 0 to time N and that the first and last states are known, the state sequence is represented by the finite vector $C = \{c_0, \dots, c_N\}$. Let $P(c_k \mid c_0, c_1, \dots, c_{k-1})$ denote the probability of state $c_k$ at time k given all the states up to time k-1. Suppose that $c_k$ depends only on the preceding state $c_{k-1}$ and is independent of all earlier states. The process is then called a first-order Markov process. In other words, the probability of state $c_k$ at time k, given every state up to time k-1, depends only on the previous state $c_{k-1}$ at time k-1, and we have:
$$P(c_k \mid c_0, c_1, \dots, c_{k-1}) = P(c_k \mid c_{k-1})$$
In short, a system with the Markov property is called a (first-order) Markov process.
For a Markov process of order n:
$$P(c_k \mid c_0, c_1, \dots, c_{k-1}) = P(c_k \mid c_{k-n}, c_{k-n+1}, \dots, c_{k-1})$$
In general, for the Viterbi algorithm the underlying process is assumed to be a Markov process with the following properties:
- Finite state: the number M is finite.
- Discrete time: moving from one state to another always takes one unit of time.
- Memoryless observation: the probability of each observation depends only on the immediately preceding state (so little memory is needed).
2.2 Presentation
The information-theoretic approach to recognition
[Figure 1]
Recognition seeks the most likely linguistic sequence W given the acoustic evidence A:
$$\hat{W} = \arg\max_{W} P(W \mid A)$$
By Bayes' rule:
$$P(W \mid A) = \frac{P(A \mid W)\, P(W)}{P(A)}$$
In the HMM we are concerned with $P(A \mid W)$.
Notation: $A \to O$, $W \to \lambda$, $P(A \mid W) \to P(O \mid \lambda)$

Example 1:
[Figure 5.2]
Consider three mugs, each containing a mixture of state stones labeled 1 and 2.
The contents of the i-th mug are split in the proportions $a_{i1}$, $a_{i2}$, with $a_{i1} + a_{i2} = 1$.
Consider two urns, each containing black balls and white balls.
The contents of the i-th urn are split in the proportions $b_i(B)$, $b_i(W)$, with $b_i(B) + b_i(W) = 1$.
The parameter vector of this model is:
$$\lambda = \{a_{01}, a_{02}, a_{11}, a_{12}, a_{21}, a_{22}, b_1(B), b_1(W), b_2(B), b_2(W)\}$$
[Figure 5.3]
Observation sequence: O = {B, W, B, W, W, B}
State sequence: Q = {1, 1, 2, 1, 2, 1}
Goal: given the model $\lambda$ and the observation sequence O, how can the state sequence Q be determined?


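Elements of a discrete hidden Markov model
o N: the number of states in the model
   States, $s = \{s_1, s_2, \dots, s_N\}$
   State at time t, $q_t \in s$
o M: the number of observation symbols (discrete observations)
   Set of observation symbols, $v = \{v_1, v_2, \dots, v_M\}$
   Observation symbol at time t, $o_t \in v$
o $A = \{a_{ij}\}$: the state transition probability distribution
   $a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i)$, $1 \le i, j \le N$
o $B = \{b_j(k)\}$: the observation symbol probability distribution in state j
   $b_j(k) = P(v_k \text{ at } t \mid q_t = s_j)$, $1 \le j \le N$, $1 \le k \le M$
o $\pi = \{\pi_i\}$: the initial state distribution
   $\pi_i = P(q_1 = s_i)$, $1 \le i \le N$
An HMM is written compactly as $\lambda = \{A, B, \pi\}$.
Example 2:
$$\pi = \{a_{01}, a_{02\}}, \quad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad B = \begin{bmatrix} b_1(B) & b_1(W) \\ b_2(B) & b_2(W) \end{bmatrix}$$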
State diagrams
Some commonly used models
[Figure 5.4a: 2-state and 3-state models]
[Figure 5.4b: Left-Right model]
[Figure 5.4c: Bakis model]
[Figure 5.4d: Linear model]

Generating an observation sequence with an HMM
o Choose an initial state $q_1 = s_i$ according to the initial state distribution $\pi$.
o For t running from 1 to T:
   Choose $o_t = v_k$ according to the symbol probability distribution of state $s_i$, $b_i(k)$.
   Move to a new state $q_{t+1} = s_j$ according to the state transition distribution of state $s_i$, $a_{ij}$.
   Increase t by 1 and return to step 2 if $t \le T$; otherwise stop.
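The generation procedure above can be sketched for the two-state, two-symbol example. The probability values are made up for illustration, symbols 1 and 2 stand for B and W, and `draw` is a hypothetical helper that samples an index from a discrete distribution.

```matlab
% Sketch: generate an observation sequence from a discrete HMM.
A = [0.7 0.3; 0.4 0.6];            % transition probabilities a_ij (hypothetical)
B = [0.9 0.1; 0.2 0.8];            % symbol probabilities b_i(k); columns: 1 = B, 2 = W
pi0 = [0.6 0.4];                   % initial state distribution
T = 6;                             % length of the sequence
draw = @(p) 1 + sum(rand > cumsum(p(1:end-1)));   % sample an index from distribution p
q = zeros(1, T);                   % state sequence (hidden)
o = zeros(1, T);                   % observation sequence (visible)
q(1) = draw(pi0);                  % step 1: choose the initial state
for t = 1:T
    o(t) = draw(B(q(t), :));       % emit a symbol from the current state
    if t < T
        q(t + 1) = draw(A(q(t), :));   % move to the next state
    end
end
disp([q; o]);
```

Each run prints a different pair of sequences because the draws are random; only the row `o` would be visible to a recognizer.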

[Figure 5.5: Evolution of the Markov model]
Representing the state diagram as a trellis
[Figure 5.6: Dashed lines denote null state transitions, where no observation vector is generated.]
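2.3 The three basic problems of HMMs
Problem 1: Scoring. Given an observation sequence $O = \{o_1, o_2, \dots, o_T\}$ and a model $\lambda = \{A, B, \pi\}$, how do we compute the conditional probability $P(O \mid \lambda)$ (the likelihood of the observation sequence)?
Use the forward-backward algorithm.
Problem 2: Matching. Given an observation sequence $O = \{o_1, o_2, \dots, o_T\}$, how do we choose a state sequence $Q = \{q_1, q_2, \dots, q_T\}$ that is optimal in some sense?
Use the Viterbi algorithm.
Problem 3: Training. How do we adjust the model parameters $\lambda = \{A, B, \pi\}$ to maximize $P(O \mid \lambda)$?
Use the Baum-Welch procedure.
Computing $P(O \mid \lambda)$
$$P(O \mid \lambda) = \sum_{\text{all } Q} P(O, Q \mid \lambda)$$
$$P(O, Q \mid \lambda) = P(O \mid Q, \lambda)\, P(Q \mid \lambda)$$
Consider a fixed state sequence $Q = q_1 q_2 \dots q_T$:
$$P(O \mid Q, \lambda) = b_{q_1}(o_1)\, b_{q_2}(o_2) \cdots b_{q_T}(o_T)$$
$$P(Q \mid \lambda) = \pi_{q_1}\, a_{q_1 q_2}\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}$$
Therefore:
$$P(O \mid \lambda) = \sum_{q_1, q_2, \dots, q_T} \pi_{q_1} b_{q_1}(o_1)\, a_{q_1 q_2} b_{q_2}(o_2) \cdots a_{q_{T-1} q_T} b_{q_T}(o_T)$$
The number of operations required is about $2T \cdot N^T$ (there are $N^T$ such sequences).
Example: N = 5, T = 100 gives $2 \cdot 100 \cdot 5^{100} \approx 10^{72}$ operations.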
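2.3.1 The forward and backward algorithms
a) The forward algorithm:
The forward variable $\alpha_t(i)$ is the probability of the partial observation sequence up to time t together with being in state $s_i$ at time t, given the model:
$$\alpha_t(i) = P(o_1 o_2 \dots o_t,\ q_t = s_i \mid \lambda)$$
It is easy to see that:
$$\alpha_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$$
$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$$
By induction:
$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \quad 1 \le t \le T-1,\ 1 \le j \le N$$
Number of operations: about $N^2 T$.
Example: N = 5, T = 100 gives $5^2 \cdot 100 = 2500$ operations (instead of $10^{72}$).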
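The saving can be verified on a small case by computing P(O | lambda) both ways. The sketch below uses made-up parameters for a two-state, two-symbol model and a three-symbol observation sequence; the brute-force loop enumerates all N^T state sequences, which is only feasible because the example is tiny.

```matlab
% Forward algorithm sketch, cross-checked against brute-force enumeration.
A = [0.7 0.3; 0.4 0.6];            % a_ij (hypothetical values)
B = [0.9 0.1; 0.2 0.8];            % b_i(k): rows are states, columns are symbols
pi0 = [0.6 0.4];                   % initial state distribution
O = [1 2 1];                       % observation sequence as symbol indices
N = size(A, 1);
T = numel(O);
alphaF = zeros(T, N);
alphaF(1, :) = pi0 .* B(:, O(1))';                 % alpha_1(i) = pi_i b_i(o_1)
for t = 1:T-1
    alphaF(t + 1, :) = (alphaF(t, :) * A) .* B(:, O(t + 1))';   % induction step
end
pForward = sum(alphaF(T, :));                      % P(O | lambda)
pBrute = 0;                                        % sum over all N^T state sequences
for idx = 0:N^T - 1
    q = zeros(1, T);
    rest = idx;
    for t = 1:T                                    % decode idx into a state sequence
        q(t) = mod(rest, N) + 1;
        rest = floor(rest / N);
    end
    pq = pi0(q(1)) * B(q(1), O(1));
    for t = 2:T
        pq = pq * A(q(t - 1), q(t)) * B(q(t), O(t));
    end
    pBrute = pBrute + pq;
end
fprintf('forward: %.5f  brute force: %.5f\n', pForward, pBrute);
```

Both methods give 0.10893 for this example; the forward pass visits N^2 T terms while the enumeration visits N^T sequences.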

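[Figure 5.7: Illustration of the forward algorithm]
b) The backward algorithm:
Similarly, define the backward variable $\beta_t(i)$ as the likelihood of the partial observation sequence from time t+1 to the end, given state $s_i$ at time t and the model:
$$\beta_t(i) = P(o_{t+1} o_{t+2} \dots o_T \mid q_t = s_i, \lambda)$$
It is easy to see that:
$$\beta_T(i) = 1, \quad 1 \le i \le N$$
$$P(O \mid \lambda) = \sum_{i=1}^{N} \pi_i\, b_i(o_1)\, \beta_1(i)$$
By induction:
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \dots, 1;\ 1 \le i \le N$$
[Figure 5.8: Illustration of the backward procedure]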
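Finding the optimal state sequence:
One criterion for choosing the optimal state $q_t$ is to maximize the expected number of correct states.
The variable $\gamma_t(i)$ is the probability of the system being in state $s_i$ at time t, given the observation sequence O and the model $\lambda$:
$$\gamma_t(i) = P(q_t = s_i \mid O, \lambda), \quad \sum_{i=1}^{N} \gamma_t(i) = 1$$
Note that it can be written as:
$$\gamma_t(i) = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)}$$
With this individually optimal criterion, however, the resulting state sequence may violate the state transition constraints.
Another optimality criterion is to maximize $P(Q, O \mid \lambda)$, which can be done with the Viterbi algorithm.
Let $\delta_t(i)$ be the highest probability along a single path that accounts for the first t observations:
$$\delta_t(i) = \max_{q_1, q_2, \dots, q_{t-1}} P(q_1 q_2 \dots q_{t-1},\ q_t = s_i,\ o_1 o_2 \dots o_t \mid \lambda)$$
By induction:
$$\delta_{t+1}(j) = \left[\max_{i} \delta_t(i)\, a_{ij}\right] b_j(o_{t+1})$$
To recover the state sequence, we need to keep track of the state sequence that gave the best path, at time t, to state $s_i$. We do this in an array $\psi_t(i)$.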
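2.3.2 The Viterbi algorithm
Initialization:
$$\delta_1(i) = \pi_i\, b_i(o_1), \quad \psi_1(i) = 0, \quad 1 \le i \le N$$
Recursion:
$$\delta_t(j) = \max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right] b_j(o_t), \quad \psi_t(j) = \arg\max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right], \quad 2 \le t \le T,\ 1 \le j \le N$$
Termination:
$$P^* = \max_{1 \le i \le N} \delta_T(i), \quad q_T^* = \arg\max_{1 \le i \le N} \delta_T(i)$$
Backtracking to find the optimal path (state sequence):
$$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \dots, 1$$
Number of operations: about $N^2 T$.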
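The four stages of the algorithm can be followed on a small example. The sketch below reuses a made-up two-state, two-symbol model; the variable names mirror the symbols delta and psi, and the numbers are for illustration only.

```matlab
% Viterbi sketch: most likely state sequence for a discrete HMM.
A = [0.7 0.3; 0.4 0.6];            % a_ij (hypothetical values)
B = [0.9 0.1; 0.2 0.8];            % b_j(k): rows are states, columns are symbols
pi0 = [0.6 0.4];                   % initial state distribution
O = [1 2 1];                       % observation sequence as symbol indices
N = size(A, 1);
T = numel(O);
delta = zeros(T, N);
psi = zeros(T, N);
delta(1, :) = pi0 .* B(:, O(1))';                  % initialization
for t = 2:T                                        % recursion
    for j = 1:N
        [best, arg] = max(delta(t - 1, :) .* A(:, j)');
        delta(t, j) = best * B(j, O(t));
        psi(t, j) = arg;                           % best predecessor of state j
    end
end
[pStar, qLast] = max(delta(T, :));                 % termination
bestPath = zeros(1, T);
bestPath(T) = qLast;
for t = T-1:-1:1                                   % backtracking
    bestPath(t) = psi(t + 1, bestPath(t + 1));
end
fprintf('Best path: %s  (P* = %.6f)\n', mat2str(bestPath), pStar);
```

For this model the best path is 1, 2, 1 with P* = 0.046656. Long sequences are normally processed with log probabilities to avoid numerical underflow, which this sketch omits.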

[Figure: Example of the Viterbi algorithm]
[Figure 5.10]
[Figure: Matching example using the forward-backward algorithm]

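2.3.3 Re-estimation with the Baum-Welch algorithm
Baum-Welch re-estimation uses EM to find the maximum-likelihood (ML) parameters.
Define $\xi_t(i,j)$ as the probability of the system being in state i at time t and in state j at time t+1, given the observation sequence O and the hidden Markov model $\lambda$:
$$\xi_t(i,j) = P(q_t = s_i,\ q_{t+1} = s_j \mid O, \lambda)$$
Then:
$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}$$
Combining $\gamma_t(i)$ and $\xi_t(i,j)$ we obtain:
$$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i,j)$$
$$\sum_{t=1}^{T-1} \gamma_t(i) = \text{expected number of transitions from state } s_i$$
$$\sum_{t=1}^{T-1} \xi_t(i,j) = \text{expected number of transitions from state } s_i \text{ to } s_j$$
The Baum-Welch re-estimation procedure
[Figure 5.12]
Baum-Welch re-estimation formulas:
$$\bar{\pi}_i = \gamma_1(i), \quad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \quad \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$$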
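If $\lambda = \{A, B, \pi\}$ is the original model and $\bar{\lambda} = \{\bar{A}, \bar{B}, \bar{\pi}\}$ is the re-estimated model, it can be shown that either:
the original model $\lambda$ defines a critical point of the likelihood function, in which case $\bar{\lambda} = \lambda$;
or:
the model $\bar{\lambda}$ is more likely than $\lambda$, in the sense that $P(O \mid \bar{\lambda}) > P(O \mid \lambda)$.
The probability of the observation sequence O can therefore be increased by repeatedly using $\bar{\lambda}$ in place of $\lambda$ and repeating the re-estimation until a critical point is reached. The resulting model is called the maximum-likelihood hidden Markov model.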
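The monotone improvement described above can be observed on a toy model. The sketch below runs five re-estimation passes on a single made-up observation sequence for a discrete two-state HMM, using the standard re-estimation formulas for pi, A and B; it omits the scaling that real implementations need for long sequences, and all names and values are illustrative.

```matlab
% Baum-Welch sketch: a few re-estimation passes on one observation sequence.
A = [0.7 0.3; 0.4 0.6];                  % initial guesses (hypothetical)
B = [0.9 0.1; 0.2 0.8];
pi0 = [0.6 0.4];
O = [1 2 1 1 2 2 1];                     % observed symbol indices
N = size(A, 1);
numSym = size(B, 2);
T = numel(O);
logLik = zeros(1, 5);
for it = 1:5
    alphaF = zeros(T, N);                % forward variables
    betaB = ones(T, N);                  % backward variables, beta_T(i) = 1
    alphaF(1, :) = pi0 .* B(:, O(1))';
    for t = 1:T-1
        alphaF(t + 1, :) = (alphaF(t, :) * A) .* B(:, O(t + 1))';
    end
    for t = T-1:-1:1
        betaB(t, :) = (A * (B(:, O(t + 1)) .* betaB(t + 1, :)'))';
    end
    pO = sum(alphaF(T, :));              % P(O | lambda) under the current model
    logLik(it) = log(pO);
    gam = alphaF .* betaB / pO;          % gamma_t(i)
    xiSum = zeros(N, N);                 % sum over t of xi_t(i,j)
    for t = 1:T-1
        xiSum = xiSum + (alphaF(t, :)' * (B(:, O(t + 1)) .* betaB(t + 1, :)')') .* A / pO;
    end
    pi0 = gam(1, :);                                         % new pi
    A = xiSum ./ repmat(sum(gam(1:T-1, :), 1)', 1, N);       % new a_ij
    for k = 1:numSym
        B(:, k) = sum(gam(O == k, :), 1)' ./ sum(gam, 1)';   % new b_j(k)
    end
end
disp(logLik);
```

The printed log-likelihoods never decrease from one pass to the next, which is the behaviour the re-estimation result predicts.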
2.3.4 ADVANTAGES AND DISADVANTAGES OF HMMs:
1) Advantages:
HMMs are used in many fields. This success rests mainly on the existence of an automatic iterative learning algorithm (SFA), which adjusts the HMM parameters to fit the given training sequences.
One of the strengths of HMMs is the ability to compute efficiently both the probability of a given sequence and the most likely path that generates it. They can handle systems in which the state sequence the system passes through cannot be known, and even cases where the number of states of the model is itself hidden. Using an HMM gives a smaller model with fewer parameters to compute.
2) Disadvantages:
The assumption that all probabilities depend only on the current state does not hold for speech. One consequence is that HMMs have difficulty modeling clear pronunciation patterns, because real acoustic distributions depend heavily on past states. Another is that durations are modeled inaccurately, by a decaying exponential distribution instead of a more accurate Poisson distribution.
The independence assumption, that consecutive frames are uncorrelated, does not hold for speech either. Under this assumption an HMM examines only one frame of speech at a time.
The probability density models (discrete or continuous) all have suboptimal modeling accuracy. Discrete models in particular suffer from large quantization errors.


Chapter 3 IMPLEMENTING RECOGNITION IN MATLAB

Matlab (Matrix Laboratory) is a powerful environment for computation and visualization developed by MathWorks. Its current level of development shows that it has a very capable interface and many advantages as a programming tool for solving diverse problems in scientific and engineering research.
Besides its library of basic computation and graphics functions, Matlab has toolboxes, which are libraries for specific fields. Examples include toolboxes for signal processing (Signal Processing), model simulation (Simulink), fuzzy logic (Fuzzy Logic), neural networks (Neural Network), and even aircraft design (Aerospace) and solving partial differential equations (PDE).
This chapter mainly introduces the toolboxes and functions needed to process speech and to build a speech recognition system. The recognition method used is the hidden Markov model (HMM).
3.1 Speech processing
Speech is processed in the following steps:
- Record speech or read it from a file: speech is captured from the microphone, digitized, and stored in Matlab. The stored samples are used in the later processing stages to distinguish different words or sounds.
- High-frequency filtering of the input signal: this corresponds to a first-order analog high-pass filter with a 3 dB cutoff somewhere between 100 Hz and 1 kHz. It flattens the signal spectrum and makes it less sensitive to external disturbances.
  The digital high-pass filter is:
  y[n] = x[n] - a x[n - 1], with 0.9 < a < 1
  where
  y[n] is the current output sample of the filter
  x[n] is the current input sample
  x[n-1] is the previous input sample
- Endpoint detection: cut the regions that contain no speech.


Some speech processing functions

Matlab code | Meaning
[x fs] = wavread(wavfile); | Reads the audio signal from the wav file named by the string wavfile. x is the vector describing the audio signal (real values from -1 to 1) and fs is the sampling frequency (an integer).
wavwrite(x,fs,wavfile); | Writes the audio signal to the wav file named by the string wavfile. x is the vector describing the audio signal and fs is the sampling frequency.
sound(x); | Plays the sound through the loudspeaker. x is the vector describing the audio signal.
x = wavrecord(n, fs) | Records (from the microphone) n samples at sampling frequency fs. The result is the vector x.
x = filter([1 -0.9375], 1, x); | High-pass filter
y = detector(x); | Endpoint detection

3.2 Trch c trng ting ni

VoiceBox l mt toolbox ca Matlab chuyn v x l ting ni do Mike Brookes


pht trin. VoiceBox yu cu Matlab phin bn 5 tr ln. VoiceBox gm cc hm c th
chia thnh mt s nhm chc nng sau:

Phn tch ph tn hiu


Phn tch LPC
Tnh ton MFCC, chuyn i spectral cepstral

37

n tt nghip

Chuyn i tn s (mel-scale, midi,...)


Bin i Fourier, Fourier ngc, Fourier thc...
Tnh khong cch (sai lch) gia cc vector v dy vector.
Loi tr nhiu trong tn hiu ting ni.
Trong ti ny ta chn phng php MFCC trch c trng ting ni

S khi ca qu trnh trch chn c trng MFCC


The VoiceBox function that computes the MFCCs of a signal is melcepst:
c = melcepst(s, fs, w, nc, p, n, inc, fl, fh)
The function takes many parameters; the most important ones are:
s is the speech signal vector (obtained from wavrecord or wavread), and fs is the sampling frequency (default 11025).
nc is the number of MFCC coefficients to compute, that is, the number of elements of the feature vector (default 12).
p is the number of mel-scale filters.
w is a string of further options: if it contains e, the log energy is also computed; if it contains d, the delta features are also computed.
The call returns a matrix c in which each row holds the MFCC coefficients of one frame (12 with the default nc). To append the log energy and the delta data, as other recognition systems do, the corresponding options are added to w.
We use melcepst to extract the MFCC features in the recognition system.
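The mel scale behind the filterbank parameter p can be illustrated with the widely used conversion mel = 2595 log10(1 + f/700). The Python sketch below assumes this formula and places p filter centres evenly on the mel scale. The function names are hypothetical, and VoiceBox's own routines may differ in detail, for example in how the band edges are handled.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_centres_hz(p, f_low, f_high):
    """Centre frequencies (Hz) of p filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (p + 1)  # p centres between the two band edges
    return [mel_to_hz(lo + k * step) for k in range(1, p + 1)]


# Example: 32 filters between 0 Hz and 4 kHz (half of an 8 kHz sampling rate).
centres = filter_centres_hz(32, 0.0, 4000.0)
```

Because equal steps in mel correspond to growing steps in Hz, the centres are dense at low frequencies and sparse at high frequencies, which is the intended perceptual weighting.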


3.3 HMM training

The speech signals for the training stage are prepared manually: Matlab tools are used to record the audio, filter the noise and cut the recording into separate words, each word being saved to its own file (the file name records the corresponding word).

The self-built data set consists of:
10 monosyllabic words, the Vietnamese digits (không, một, hai, ..., chín).
16-bit, 8 kHz wav files, with four recordings of each word.
40 speech samples in total, used for training.

Start
Record speech or read it from a file
Extract the MFCC features
Combine all the features of each word into the training data set
Label each word
Initialize an HMM for each word
Train the HMM for each word
End

Figure 2. HMM training process


With the MFCC method, the signal is divided into frames of length N = 512 samples with an overlap of M = 100.
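The effect of the frame length and overlap on the number of feature vectors can be checked with a small Python sketch. It rests on two assumptions: the overlap M is interpreted as a hop of N - M samples between frame starts, and trailing samples that do not fill a whole frame are dropped. The function name frame_signal is hypothetical.

```python
def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples whose start
    positions are hop samples apart; an incomplete final frame is dropped."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    if len(x) < frame_len:
        return []
    n_frames = 1 + (len(x) - frame_len) // hop
    return [x[k * hop:k * hop + frame_len] for k in range(n_frames)]


# One second of audio at 8 kHz, with the two settings that appear in this
# chapter: N = 512 with overlap 100 (hop 412), and the melcepst call with
# frame length 256 and increment 80.
one_second = [0.0] * 8000
frames_text = frame_signal(one_second, 512, 512 - 100)
frames_code = frame_signal(one_second, 256, 80)
```

Under these assumptions the first setting gives 19 frames per second of audio and the second gives 97, so the increment has a large influence on the length of the observation sequence passed to the HMM.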
Main Matlab functions

melcepst.m : builds the matrix of feature vectors for each signal sample.
hmmtrain.m : trains the HMMs on the data quantized with the codebook computed in the previous step, and produces 10 separate HMMs, one for each keyword to be recognized. These models are saved in the file hmmdata.mat

clear all
clc;
addpath('VOICEBOX')
addpath('HMMs')
nc = 16;                % nc: number of cepstral coefficients
p = 32;                 % p: number of filters in the filterbank
M = 4;                  % M: number of observable states (also used below as the
                        %    number of recordings per word and of mixtures per state)
N = 4;                  % N: number of states
SM_mat = M*ones(1,N);   % SM_mat: number of mixture components for each state
so_lan_lap = 5;         % number of training iterations

% Read the wav files
traindata = cell(1,10);
for i = 0:9
    temp = cell(1,M);                     % one cell per recording of this digit
    for j = 1:M
        fname = sprintf('Train/s%dt%d.wav',i,j);
        [x,fs] = wavread(fname);
        x = endcut(x, 500, 0.1);          % cut the silent parts
        x = filter([1 -0.9375],1,x);      % pre-emphasis (high-pass filter)
        temp{1,j} = x';
    end
    traindata{1,i+1} = temp;
end
% Train the digits 0 to 9
hmmdata = cell(1,10);
for i = 1:10
    fprintf('\n\nTRAINING DIGIT %d\n',i-1);
    sample = [];
    for k = 1:length(traindata{i})
        % Extract the speech features; the signal is already pre-emphasized,
        % so it is filtered once, as in the recognition code
        sample(k).data = melcepst(traindata{i}{k},fs,'M',nc,p,256,80);
    end
    hmmdata{i} = hmmtrain(sample,SM_mat,so_lan_lap);
end
save('hmmdata.mat')     % saves the whole workspace, including nc and p


3.4 Recognition

Once the HMMs of the words have been trained, the system can determine which of the trained words a spoken word is. The HMM-based speech recognition algorithm is as follows:

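Start
Record the spoken words, each word uttered several times
Extract the features of each word
Convert the vector c into the observation sequence O
Read the HMMs of the trained words
Compute the probability of the observation sequence O under the HMM of each word
Choose the word with the highest probability; this is the recognized word
End

Figure 3. Recognition process using HMMs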


Main Matlab functions:
viterbi.m : tests the sample to be recognized, using the forward procedure to compute the probability of the observation sequence given an HMM. The viter flag takes two values, 1 and 0, which control whether the best state sequence for the observation sequence is displayed.

addpath('VOICEBOX')
addpath('HMMs')
load hmmdata                              % also restores nc and p from training
[fname,pathname] = uigetfile('Test/*.wav');
[x,fs] = wavread([pathname,fname]);       % fs of the test file
axes(handles.axes1);                      % draw into the GUI axes
plot(x);
grid minor;
x = endcut(x, 500, 0.1);                  % cut the silent parts
x = filter([1 -0.9375],1,x);              % pre-emphasis (high-pass filter)
% Extract the speech features
m = melcepst(x,fs,'M',nc,p,256,80);
pout = zeros(1,10);
for j = 1:10
    pout(j) = viterbi(hmmdata{j},m);      % score against the model of digit j-1
end
% Take the highest observation-sequence probability
[d,n] = max(pout);
set(handles.So,'String',num2str(n-1));
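The decision rule of the recognition step (score the observation sequence against every word model, then pick the maximum) can be sketched for a discrete-observation HMM as follows. This Python sketch is an assumption for illustration: it runs a scaled forward algorithm on codebook indices, whereas the internals of viterbi.m and the model structure stored in hmmdata.mat are not given in this chapter. The names forward_log_likelihood and recognise, and the toy models, are hypothetical.

```python
import math


def forward_log_likelihood(obs, pi, A, B):
    """log P(obs | model) for a discrete HMM, by the scaled forward algorithm.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik


def recognise(obs, models):
    """Return the label of the best-scoring model and all the scores."""
    scores = {w: forward_log_likelihood(obs, *m) for w, m in models.items()}
    return max(scores, key=scores.get), scores


# Two toy left-to-right word models over a two-symbol codebook.
toy_models = {
    "a": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.8, 0.2]]),
    "b": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.2, 0.8]]),
}
best, all_scores = recognise([0, 0, 0], toy_models)
```

Working with log-likelihoods keeps the scores comparable even for long observation sequences, where the raw probabilities would underflow.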

3.5 Program interface

Besides training and recognition, the program can also play back the pronunciation of the digits (from zero to nine).
Test runs and checking the results
The parameters were varied and tested on a data set from 20 speakers: 13 male and 7 female.

The data were recorded with a microphone and a personal computer, at a fairly high noise level.

Each speaker provides 5 samples: 3 go into the training set and 2 into the test set.
The feature vectors extracted with MFCC consist of 13 MFCCs and 12 deltas.
The feature vectors extracted with LPC have a size that depends on the LPC order.
The results obtained for the different parameters:
Figure 4. Results by codebook size (horizontal axis: codebook size, 32, 64 and 128; vertical axis: recognition rate in %, 60 to 85)

A codebook size of 64 is chosen.

Figure 5. Results by number of HMM states (vertical axis: recognition rate in %, 77 to 80.5)

Feature extraction with the MFCC method

Figure 6. Results by number of HMM states (vertical axis: recognition rate in %, 69 to 77)


The recognition rate reaches 76%.
Remarks on the results
The best result is obtained with 8th-order LPC, a 3-state HMM and a codebook size of 64, giving an average recognition rate of 80%.
This result is not yet good: some words are recognized fairly well (80-90%), but for the other words the model gives low accuracy. In other words, recognition performance is uneven across words.
Some groups of words are often confused with each other: {dừng, dưới}; {trái, chạy}; {tiến, trên}; {tắt, phải}.


Causes:
The quality of the data samples is low (heavy noise, and recordings made in different noisy environments).
The chosen parameters are not yet optimal.
Some words are pronounced similarly.
Remarks on the results:
The results show that recognition with a neural network performs better than the hidden Markov model (a recognition rate of 94% compared with 80%).
Therefore, after experimenting with the models and the feature extraction methods, the final decision is to use a neural network model with 250 hidden nodes, with log-sigmoid and purelin transfer functions, together with 13-coefficient MFCC feature extraction.

CONCLUSION
General remarks on the results of the project
The project built speech recognition models, specifically for recognizing the isolated control words Tắt, Bật, Chạy, Dừng, Tiến, Lùi, Trái, Phải, Trên, Dưới (off, on, run, stop, forward, back, left, right, up, down), and ran experiments with two signal feature analysis methods, LPC and MFCC.
Based on the collected data, the most suitable recognition model was identified.
However, some limitations remain:
The number of samples is small, so the convergence of the algorithm cannot yet be confirmed.
The sample quality is low and inconsistent, because the recordings were made with personal computers in different locations.
The working method still has shortcomings due to lack of experience.
Porting the algorithm from Matlab to C for the DSP still introduces some numerical errors, so the recognition rate in practice is not as high as in the experiments on the computer.

There are also some objective difficulties: the characteristics of Vietnamese words differ from those of English words, and some control words share many features, which leads to misrecognition (for example trái - phải; dừng - dưới).

During the project, the student made every effort to study and work seriously in order to meet the requirements of the topic, and gained very useful knowledge and experience. Nevertheless, the working process and the project itself inevitably contain shortcomings, so the guidance and comments of the teachers will be invaluable help in improving the project.

REFERENCES

Lawrence Rabiner. Fundamentals of Speech Recognition. Prentice Hall, 1993.
Lawrence Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. IEEE, 1989.
Daniel Jurafsky and James H. Martin. Chapter 9: Automatic Speech Recognition, in Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2007.
Hoàng Đình Chiến. Nhận dạng tiếng Việt dùng mạng Neuron và trích đặc trưng dùng LPC và AMDF (Vietnamese speech recognition using a neural network with LPC and AMDF feature extraction).
Joe Tebelskis. Speech Recognition using Neural Networks. 1995.
Trần Hoài Linh. Lecture notes for the pattern recognition course. ĐHBK Hà Nội, 2007.
Rulph Chassaing. Digital Signal Processing and Applications with the C6713 and C6416 DSK. Wiley, 2004.
Paul M. Embree. C Algorithms for Real-Time DSP. Prentice Hall, 1995.
Nguyễn Hoàng Hải, Nguyễn Việt Anh. Lập trình Matlab và ứng dụng (Matlab programming and applications). NXB Khoa học kỹ thuật, Hà Nội, 2005.
Mel-frequency cepstral coefficients. Wikipedia.org and the links referenced there.
HMM toolbox for Matlab. http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
Auditory toolbox for Matlab. http://www.slaney.org/malcolm/pubs.html
D. Richard Brown III. ECE4703 Real-Time DSP Orientation Lab. 2004.
And some other documents.
