
Course essay: Multimedia Communications

Supervisor: Dr. Nguyễn Hoàng Lan

PREFACE
The purpose of this report is to study methods for analyzing speech features, and to analyze and test an application related to speech recognition.
Extracting feature parameters is the step that most strongly determines the results of a speech recognition program. Many feature extraction methods exist, but in general they rely on two mechanisms: simulating how the human ear perceives sound, and simulating how the vocal organs produce sound.
Under the dedicated guidance of Dr. Nguyễn Hoàng Lan, I have tried to complete this essay well. Mistakes were nevertheless unavoidable during the work, and I welcome comments so that the essay can be improved.

My sincere thanks!

Hanoi, June 2010.


HV: Nguyn Ngc ng


PART I: FUNDAMENTALS OF SPEECH

I. The human vocal apparatus

i. Mechanism of speech production

The human speech production system is illustrated in Figure 1.

Figure 1: The vocal apparatus

(1) Nasal cavity, (2) Hard palate, (3) Teeth, (4) Soft palate, (5)-(6)-(8) Tongue, (7) Uvula, (9) Pharynx, (10) Epiglottis, (11)-(12) Vocal cords, (13) Larynx, (14) Esophagus, (15) Trachea.
The human speech production system consists of the lungs, the trachea, the larynx, the oral cavity and the nasal cavity. The larynx contains two folds called the vocal cords, which are stretched taut when speech is produced. The oral cavity forms an acoustic tube about 17 cm long in adult males, ending at the lips at the front and at the vocal cords or larynx at the back. It acts as a dynamic resonator whose volume can be controlled by the articulators (lips, tongue, jaw and soft palate). The nasal cavity is a tube about 12 cm long in adult males, running between the nostrils and the soft palate. The soft palate (velum) controls whether air leaves through the mouth or through the nose. For non-nasalised sounds, the soft palate closes off the nasal cavity and air leaves only through the mouth. For nasal sounds, the soft palate moves downward, the oral passage is closed and air leaves only through the nose. In a third case, air leaves through both passages.
Speech production: when speaking, the lungs are filled with air. This air is pushed through the trachea and the glottis. The airflow through the glottis excites the vocal cords into vibration, which produces sound. The sound is carried out through the oral and nasal cavities, which act as filters that attenuate some frequencies while letting others pass.
ii. Physical characteristics
- Pitch:
The perceived height of a sound. It depends on how fast the air vibrates within a given time interval, that is, on the vibration frequency. The higher the frequency, the higher the sound.
- Loudness:
Usually called intensity, it is determined by the vibration amplitude. In speech, vowels usually carry more energy than consonants, and this is one of the cues that helps distinguish consonants from vowels in the speech signal.
- Duration:
The length of a sound, which depends on how long the air particles keep vibrating. Duration is used to distinguish long and short vowels, for example "a" from "ă" and "ơ" from "â" in Vietnamese.
- Timbre:
The distinctive color of a sound as produced by different individuals. Timbre is what makes one person's voice differ from another's. It arises from resonance.
- Noise and tonal sounds:
Noise is caused by irregular motion of the air particles (no stable period). Tonal sound is caused by regular motion of the air particles (a stable period).
iii. Classification of speech sounds
- Voiced sounds:
Produced when the vocal cords are tensed and vibrate as the air pressure rises, so that the glottis opens and then closes again as the airflow passes through. The vocal tract acts as a resonator, amplifying some harmonic components and attenuating others to form the voiced sound. How the vocal cords vibrate depends on the air pressure in the lungs and on the tension of the vocal cords. The speaker can control these two factors to change the fundamental period (the pitch) of the sound. The fundamental frequency is about 50-250 Hz for men and usually falls within 120-500 Hz for women.
In linguistic terms, vowels are acoustically voiced sounds.
- Unvoiced sounds:
Produced when the vocal cords do not vibrate. There are two basic kinds: fricatives and aspirated sounds.
For fricatives, such as "s" and "x", the vocal tract is constricted at some point; as the airflow passes through the constriction, turbulence occurs and produces random noise. Because the constriction is usually near the front of the mouth, the resonances of the vocal tract have little effect on the character of the fricative.
For aspirated sounds, such as "h", turbulence occurs near the glottis while the vocal cords are held partly closed. In this case the resonances of the vocal tract shape the spectrum of the random noise. The effect is clearly audible in whispering.
Consonants in every language are built essentially from unvoiced sounds.
Besides these two basic classes there is an intermediate class with properties of both vowels and consonants, called semivowels or semiconsonants, for example the sounds "i" and "u" at the end of words such as "ai".
- Plosive sounds:
To produce these sounds, the vocal tract is closed completely at some point. The air pressure behind the closure builds up briefly and is then released abruptly. The rapid release of this pressure creates a short transient excitation of the vocal tract.
iv. Source-filter model of speech production
Figure 2 shows a very simple model of the vocal tract producing the vowel "e": a uniform tube of length L with the sound source (the vocal cords) at one end and an opening (the lips) at the other. The tube resonates at the odd multiples f0, 3f0, 5f0, ... of f0 = c/(4L), where c is the speed of sound in air. For example, with L = 17 cm and c = 340 m/s, resonances occur at 500 Hz, 1500 Hz, 2500 Hz, and so on. These resonance peaks are called formants. The vocal tract can take many different shapes, each producing different resonance peaks, that is, different formant values, and therefore different sounds. In speech, the formant frequencies change continually from one sound to the next.
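The quarter-wavelength resonance formula can be checked numerically. The short sketch below is only an illustration: the function name `tube_formants` is hypothetical, and it assumes a speed of sound of about 340 m/s, the value that reproduces the 500 Hz, 1500 Hz and 2500 Hz figures quoted for a 17 cm tube.

```python
def tube_formants(length_m, speed_of_sound=340.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    The k-th resonance is (2k - 1) * c / (4 L), i.e. f0, 3*f0, 5*f0, ...
    """
    f0 = speed_of_sound / (4.0 * length_m)
    return [(2 * k - 1) * f0 for k in range(1, count + 1)]


formants = tube_formants(0.17)
print([round(f) for f in formants])  # [500, 1500, 2500]
```

A shorter tube (for example a shorter vocal tract) shifts every formant upward in proportion, which is one reason formant values differ between speakers.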

Figure 2: Uniform-tube model of the vocal tract

The speech production process is represented by the source-filter model:

Figure 3: Speech production according to the source-filter model

The input is the signal from the sound source, which may be periodic or noise-like. It is filtered by a filter whose resonant properties resemble those of the vocal tract. The spectrum of the speech signal is obtained by multiplying the spectrum of the source signal by the frequency response of the filter. AV and AN are gains that set the intensity of the voiced source and of the noise source.
A vocal tract has many formants, but only the first three or four, in the band from 100 Hz to 3.5 kHz, need to be considered, because the amplitudes of higher formants are almost completely attenuated by a roll-off of -12 dB/octave. For unvoiced speech the spectrum is relatively flat, and this number of formants is still sufficient even though unvoiced speech extends up to 7-8 kHz. In addition, radiation at the mouth raises the amplitude by about 6 dB/octave in the 0-3 kHz band. For this reason the signal pre-processing stage uses a pre-emphasis filter to add a further +6 dB/octave of compensation.

II. The human auditory system

i. Structure

Figure 4: Structure of the auditory system

Outer ear:
Consists of the pinna and the ear canal. The ear canal guides the sound to the eardrum and sets it vibrating. The displacement of the eardrum is on the order of a few nanometers, and a whisper can produce a displacement of only about one tenth of the radius of a hydrogen atom.
Middle ear:
A small bone called the malleus (hammer) rests against the eardrum. When the eardrum vibrates, the malleus, which is linked to another bone called the incus (anvil), makes the incus rotate. The incus is in turn linked to a third bone, the stapes (stirrup), which rests against the oval window of the inner ear. These three bones (malleus, incus and stapes) are the smallest bones in the human body and are collectively called the ossicles. Their function is to transmit the vibration of the eardrum to the oval window.
Inner ear:
The oval window is a membrane-covered opening in the bony wall of a spiral-shaped structure called the cochlea. The fluid in the cochlea is divided along its length by two membranes, Reissner's membrane and the basilar membrane. Vibration of the oval window sends a pressure wave through the cochlear fluid, and this pressure deflects the basilar membrane at different points along its length. Attached to the basilar membrane is the organ of Corti, which contains about 30,000 hair cells. Each cell has many tiny hairs projecting from it. These hairs bend as the basilar membrane moves, which activates the hair cells. The hair cells are connected to the nerve endings of the neurons that carry the signal to the brain.
ii. Mechanism of hearing
When we hear a pure tone (a sine wave), different points on the basilar membrane vibrate at the frequency of the incoming tone. The point of maximum displacement depends on the tone frequency: high frequencies produce their maximum near the base and low frequencies near the apex. The basilar membrane thus acts as a frequency analyzer for complex input signals, separating different frequencies at different points along its length. Each such point can be regarded as a band-pass filter with a specific center frequency and bandwidth. The responses are not symmetric about the center frequency: the roll-off on the high-frequency side is much steeper than on the low-frequency side. The position of maximum displacement along the membrane varies non-linearly (logarithmically) with frequency.
Studies show that the hearing threshold of a tone rises in the presence of other nearby tones (maskers), and that only a narrow band around the tone contributes to the masking effect. This band is called the critical band. Its width depends on the frequency of the test tone: about 90 Hz for a 100 Hz tone and about 1 kHz for a 5 kHz tone.
The hearing process can therefore be viewed as a bank of band-pass filters with overlapping responses whose effective bandwidths approximate the critical bandwidths. This is the basis for the filter-bank designs used later.

III. Vietnamese phonetics

i. Phonemes

Linguistically, speech can be viewed as a sequence of basic sounds called phonemes. A phoneme is an abstract linguistic unit and cannot be observed directly in the speech signal. Different phonemes combine in particular ways to produce different sounds.
ii. Vowels
Vowels are determined by the resonant cavities of the mouth and the pharynx, which are the origin of the formants. The oral and pharyngeal cavities are separated by the tongue, so a change in one cavity implies a change in the other. Specifying the volume, shape and air outlet of these resonant cavities, and hence their resonant behavior, amounts to describing the mouth opening, the tongue position and the lip shape.

iii. Consonants

The basic feature of a consonant is that it is formed by an obstructed airflow. The obstruction varies in degree, in manner and in the part of the vocal organs where it occurs. Consonants are divided into stops (such as "p", "t", "đ", "b") and fricatives (such as "v", "s", "x").
Stops: characterized by a burst, because the airflow is blocked completely and must break through the closure to escape. Stops are further divided into aspirated consonants (such as "th") and nasal consonants (such as "m", "n", "ng", "nh").
Fricatives: characterized by friction noise, which arises because the outgoing airflow is only partly obstructed: it has to squeeze through a narrow gap and rubs against the walls of the vocal tract as it escapes.
iv. Tones

Tone is the raising or lowering of the voice within a syllable. The syllable is the smallest unit of pronunciation; in Vietnamese, a syllable is a word. Tone is a change in the pitch of the voice, that is, a change in the fundamental frequency of voiced sounds. Tone is therefore determined from the fundamental frequency.

PART II: METHODS FOR EXTRACTING SPEECH FEATURE PARAMETERS
The phonetic analysis above shows that when a word is pronounced (in general consisting of consonants, vowels and a tone), the vibrating vocal cords shape the waveform of the airflow, and the slowly changing articulators and lips in turn modify that waveform as it leaves the mouth, producing different words. The speech signal is therefore the excitation pulse train convolved with the slowly varying response of the articulators. This leads to a very effective way of extracting speech parameters, cepstral analysis, whose aim is to keep the slowly varying (low-quefrency) component contributed by the articulators.
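The reason cepstral analysis can separate the excitation from the articulator response is that taking the logarithm of the magnitude spectrum turns a convolution in time into a sum. The sketch below illustrates this with a toy example; the two short sequences `excitation` and `tract` are hypothetical stand-ins rather than real speech, circular convolution is used so that the identity is exact, and the direct DFT is chosen only to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Direct DFT (O(N^2)); adequate for a short illustrative frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of log|DFT(x)|."""
    n = len(x)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    # log_mag is real and symmetric, so the inverse DFT reduces to a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
            for q in range(n)]


def circular_convolve(a, b):
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


N = 32
excitation = [0.0] * N            # hypothetical pulse pair, 8 samples apart
excitation[0], excitation[8] = 1.0, 0.5
tract = [0.0] * N                 # hypothetical short, smooth filter response
tract[0], tract[1], tract[2] = 1.0, 0.5, 0.25
speech = circular_convolve(excitation, tract)

c_speech = real_cepstrum(speech)
c_sum = [a + b for a, b in zip(real_cepstrum(excitation), real_cepstrum(tract))]
max_diff = max(abs(a - b) for a, b in zip(c_speech, c_sum))
print(max_diff < 1e-9)  # the cepstrum of the convolution equals the sum of the cepstra
```

Because the two contributions simply add, keeping only the low-quefrency coefficients retains mostly the smooth filter part and discards most of the excitation.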

I. Mel-scale cepstral analysis

Computing MFCC coefficients is a widely used method of speech feature extraction, valued for its effectiveness; it is based on cepstral analysis on the mel scale.
The method is built on how the human ear perceives different frequency bands. For low frequencies (below 1000 Hz) the ear's perception is linear; for higher frequencies it varies logarithmically. Filters spaced linearly at low frequencies and logarithmically at high frequencies are used to extract the important acoustic features of speech. The computation of MFCC coefficients is outlined in Figure 5.

Figure 5: Block diagram for computing MFCC coefficients

The meaning of each block in the diagram and how its parameters are determined are described below.

Block 1: Pre-emphasis filter

The speech signal $s(n)$ is passed through a low-order digital filter to flatten its spectrum and to reduce the effect an uneven spectrum would have on later processing. The filter is usually a fixed first-order filter of the form

$$H(z) = 1 - a z^{-1}, \qquad 0.9 \le a \le 1.0$$

The output is related to the input by

$$\tilde{s}(n) = s(n) - a\,s(n-1)$$

The value of $a$ is usually chosen as 0.97.
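As a minimal sketch, the first-order pre-emphasis difference equation can be written as follows. The function name `pre_emphasis` and the choice to treat the sample before the start of the signal as zero are assumptions made for the example.

```python
def pre_emphasis(signal, a=0.97):
    """Apply y(n) = s(n) - a * s(n - 1), taking s(-1) as 0."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - a * signal[n - 1] for n in range(1, len(signal))]


# A constant (zero-frequency) input is almost entirely removed after the first sample.
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0])
# A rapidly alternating (high-frequency) input is boosted.
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])
```

With a = 0.97 the constant input drops to about 0.03 per sample while the alternating input grows to about 1.97 in magnitude, which is the high-frequency emphasis the filter is meant to provide.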
Block 2: Frame blocking
In this block the pre-emphasized signal $\tilde{s}(n)$ is split into frames of N samples each, with adjacent frames offset by M samples. The first frame contains the first N samples; the second frame starts M samples after the first and overlaps it by N-M samples. Likewise, the third frame starts 2M samples after the first (M samples after the second) and overlaps the first by N-2M samples. The process continues until every speech sample to be analyzed belongs to one or more frames.
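The frame-blocking rule can be sketched as below. The function name `frame_signal` is hypothetical, and zero-padding the final frame is an assumption made here so that every sample ends up in at least one frame; the text does not say how the tail is handled.

```python
def frame_signal(samples, frame_len, frame_shift):
    """Split samples into frames of N = frame_len samples, adjacent frames offset by
    M = frame_shift samples (M <= N). The last frame is zero-padded if needed."""
    frames = []
    start = 0
    while True:
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))   # pad only the final short frame
        frames.append(frame)
        if start + frame_len >= len(samples):
            break
        start += frame_shift
    return frames


frames = frame_signal(list(range(10)), frame_len=4, frame_shift=2)
print(frames[0], frames[1])  # [0, 1, 2, 3] [2, 3, 4, 5]
```

With N = 4 and M = 2 the second frame shares N - M = 2 samples with the first, matching the overlap described above.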
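Block 3: Windowing
The next step windows each frame individually to reduce the discontinuities of the speech signal at the start and end of the frame. If the window is $w(n)$, $0 \le n \le N-1$, the windowed frame is $\tilde{x}(n) = x(n)\,w(n)$. The Hamming window is normally used; it has the form

$$w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \qquad 0 \le n \le N-1$$

Block 4: Discrete Fourier transform (FFT)

The FFT converts each frame of N samples from the time domain to the frequency domain. The FFT is a fast algorithm for computing the DFT, which is defined as

$$X(k) = \sum_{n=0}^{N-1} \tilde{x}(n)\,e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1$$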
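The windowing and transform blocks can be illustrated together. The sketch below assumes the standard Hamming window with coefficients 0.54 and 0.46, and evaluates the DFT sum directly instead of using an FFT; an FFT would give the same values faster. The function names are hypothetical.

```python
import cmath
import math


def hamming(n_samples):
    """Hamming window w(n) = 0.54 - 0.46 cos(2 pi n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def power_spectrum(frame):
    """|X(k)|^2 for k = 0 .. N/2 of the Hamming-windowed frame, by direct DFT."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        xk = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(xk) ** 2)
    return spectrum


# A sine wave that completes 8 cycles in a 64-sample frame peaks at bin k = 8.
frame = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
spec = power_spectrum(frame)
print(spec.index(max(spec)))  # 8
```

Only bins 0 to N/2 are kept because the spectrum of a real frame is symmetric.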

Block 5: Conversion to the mel scale in the frequency domain

As noted above, the human ear does not perceive changes in speech frequency linearly but on the mel scale. A 1 kHz tone at 40 dB above the hearing threshold is defined as 1000 mel. The relation between the mel scale and linear frequency is then approximated by

$$\mathrm{mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$

Figure 6: Triangular filter banks on the mel frequency scale

One way to convert to the mel scale is to use a filter bank (Figure 6) in which each filter has a triangular frequency response. More than 20 filters are typically used. The filters normally cover 0 to Fs/2 (Fs is the speech sampling frequency), but a band limited to the range LOFREQ to HIFREQ can be used to discard frequencies that are not needed. For telephone speech, for example, the band can be limited to LOFREQ = 300 Hz and HIFREQ = 3400 Hz.
After the FFT we have the signal spectrum $S(f_n)$, from which the energy sequence $W(m) = |S(f_n)|^2$ is formed. Passing $W(m)$ through a bank of K triangular filters gives the weighted sequences $\tilde{W}(m)$; summing $\tilde{W}(m)$ within each filter gives the coefficients $m_k$ ($k = 1, 2, \dots, K$).
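A triangular mel filter bank can be sketched as follows. This is one common construction and not necessarily the exact one the report had in mind: it assumes the approximation mel(f) = 2595 log10(1 + f/700), filter edges equally spaced on the mel scale, and unit peak height for every triangle. The names `mel_filterbank` and `filterbank_energies` are hypothetical; `lo_freq` and `hi_freq` play the role of LOFREQ and HIFREQ.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, sample_rate, lo_freq=0.0, hi_freq=None):
    """Return num_filters rows of n_fft // 2 + 1 triangular weights."""
    hi_freq = sample_rate / 2.0 if hi_freq is None else hi_freq
    lo_mel, hi_mel = hz_to_mel(lo_freq), hz_to_mel(hi_freq)
    # num_filters + 2 edge points: filter k spans edges k-1 (left), k (peak), k+1 (right).
    edges = [mel_to_hz(lo_mel + i * (hi_mel - lo_mel) / (num_filters + 1))
             for i in range(num_filters + 2)]
    bank = []
    for k in range(1, num_filters + 1):
        left, center, right = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for b in range(n_fft // 2 + 1):
            f = b * sample_rate / n_fft          # frequency of FFT bin b in Hz
            if left < f <= center:
                row.append((f - left) / (center - left))
            elif center < f < right:
                row.append((right - f) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def filterbank_energies(power_spec, bank):
    """m_k: the power spectrum weighted by filter k and summed over its band."""
    return [sum(w * p for w, p in zip(row, power_spec)) for row in bank]


# Telephone-band example: 24 filters between 300 Hz and 3400 Hz at Fs = 8 kHz.
bank = mel_filterbank(24, n_fft=512, sample_rate=8000, lo_freq=300.0, hi_freq=3400.0)
energies = filterbank_energies([1.0] * 257, bank)
```

Filters are narrow at low frequencies and progressively wider at high frequencies, which is how the bank imitates the frequency resolution of the ear.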

Block 6: Discrete cosine transform (DCT)

In this step the logarithms of the values $m_k$ are transformed back to the time domain with the discrete cosine transform. The result is the set of MFCC coefficients:

$$c_i = \sqrt{\frac{2}{K}} \sum_{k=1}^{K} \log(m_k)\,\cos\left(\frac{\pi i}{K}\left(k - 0.5\right)\right)$$

Usually only the first few values of $c_i$ are used. In speech recognition applications, 12 MFCC coefficients plus one normalized frame-energy coefficient are commonly taken as the feature parameters of the speech signal (Q = 13 coefficients in total).
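The DCT step can be sketched as below, assuming the commonly used form c_i = sqrt(2/K) * sum over k of log(m_k) * cos(pi * i * (k - 0.5) / K). The function name `mfcc_from_energies` and the small floor applied before the logarithm are assumptions added for the example.

```python
import math


def mfcc_from_energies(energies, num_ceps=12, floor=1e-10):
    """DCT of the log filter-bank energies m_1 .. m_K, returning c_1 .. c_num_ceps."""
    k_total = len(energies)
    log_e = [math.log(max(m, floor)) for m in energies]   # floor avoids log(0)
    scale = math.sqrt(2.0 / k_total)
    return [scale * sum(log_e[k - 1] * math.cos(math.pi * i * (k - 0.5) / k_total)
                        for k in range(1, k_total + 1))
            for i in range(1, num_ceps + 1)]


# A perfectly flat filter-bank output has no spectral shape, so c_1 .. c_12 vanish.
flat_ceps = mfcc_from_energies([2.0] * 24)
# A spectrum tilted toward low frequencies gives a positive c_1.
tilted_ceps = mfcc_from_energies([float(24 - k) for k in range(24)])
```

The low-order coefficients describe the broad shape of the spectrum, which is why only the first dozen or so are kept.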
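Block 7: Weighted cepstrum
Low-order cepstral coefficients are sensitive to the overall spectral slope and high-order coefficients are sensitive to noise, so a cepstral window is commonly applied to minimize these sensitivities. The weighted cepstral coefficients are

$$\hat{c}_i = w_i\,c_i, \qquad w_i = 1 + \frac{Q}{2}\sin\left(\frac{\pi i}{Q}\right), \qquad 1 \le i \le Q$$

Block 8: Time derivatives of the MFCC coefficients

To improve recognition quality, the time derivatives of the MFCC coefficients are appended to the speech feature vector. They are computed as

$$d_t = \frac{\sum_{\theta=1}^{\Theta} \theta\,(c_{t+\theta} - c_{t-\theta})}{2\sum_{\theta=1}^{\Theta}\theta^2}$$

where $\Theta$ is the length of the delta window (usually 2 or 3).

After these steps each frame yields a vector of 2Q components representing the speech features.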
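Delta coefficients can be sketched with the usual regression formula, d_t = sum over k of k * (c_{t+k} - c_{t-k}) divided by 2 * sum of k^2, for k from 1 to the window length. Repeating the first and last frames at the edges is an assumption made here; the function name `delta` is hypothetical.

```python
def delta(features, theta=2):
    """Regression-based time derivative of a list of per-frame feature vectors."""
    num_frames = len(features)
    denom = 2.0 * sum(k * k for k in range(1, theta + 1))
    result = []
    for t in range(num_frames):
        row = []
        for j in range(len(features[t])):
            num = sum(
                k * (features[min(t + k, num_frames - 1)][j]   # clamp at the last frame
                     - features[max(t - k, 0)][j])             # clamp at the first frame
                for k in range(1, theta + 1))
            row.append(num / denom)
        result.append(row)
    return result


# A coefficient that grows by 1 per frame has a delta of 1 away from the edges.
ramp = [[float(t)] for t in range(10)]
ramp_delta = delta(ramp, theta=2)
# Static and delta parts are concatenated into one vector per frame (Q + Q = 2Q values).
combined = [c + d for c, d in zip(ramp, ramp_delta)]
```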

II. Linear predictive coding (LPC)

The LPC model is used to extract the feature parameters of the speech signal. The signal is first analyzed into a sequence of speech frames, and these frames are then transformed for use in acoustic analysis.
The idea of linear prediction is that a speech sample is approximated by a linear combination of the preceding samples. Minimizing the sum of squared errors between the actual samples and the predicted samples determines a unique set of predictor coefficients, which are the weights of the linear combination.
For a speech sequence $s(n)$ the predicted value is

$$\hat{s}(n) = \sum_{k=1}^{p} \alpha_k\, s(n-k)$$

where $p$ is the prediction order and the $\alpha_k$ are the coefficients that characterize the system.

Figure 7: LPC processing chain for speech feature extraction

Figure 7 shows the block diagram of the LPC analyzer used to extract speech features. The prediction error is

$$e(n) = s(n) - \hat{s}(n) = s(n) - \sum_{k=1}^{p} \alpha_k\, s(n-k)$$

To minimize the error, the best-fitting set of values $\{\alpha_k\}$ must be found.
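Because the speech signal changes over time, the predictor coefficients must be estimated from short segments of the signal. The problem is to find the set of predictor coefficients that minimizes the mean error over a short segment.
The short-time prediction error is

$$E_n = \sum_{m} e_n^2(m) = \sum_{m} \left[ s_n(m) - \sum_{k=1}^{p} \alpha_k\, s_n(m-k) \right]^2$$

where $s_n(m)$ is a segment of the speech signal in the neighborhood of sample $n$.

The values $\alpha_k$ that minimize $E_n$ are found by setting $\partial E_n / \partial \alpha_i = 0$ for $i = 1, 2, \dots, p$.

This gives the equations

$$\sum_{m} s_n(m-i)\, s_n(m) = \sum_{k=1}^{p} \alpha_k \sum_{m} s_n(m-i)\, s_n(m-k), \qquad 1 \le i \le p$$

Define

$$\phi_n(i,k) = \sum_{m} s_n(m-i)\, s_n(m-k)$$

The equations above can then be written as

$$\sum_{k=1}^{p} \alpha_k\, \phi_n(i,k) = \phi_n(i,0), \qquad i = 1, 2, \dots, p$$

Solving this system of p equations gives the p unknowns $\{\alpha_k\}$. This set of coefficients minimizes the mean squared prediction error for the segment $s_n(m)$. The prediction error is

$$E_n = \sum_{m} s_n^2(m) - \sum_{k=1}^{p} \alpha_k \sum_{m} s_n(m)\, s_n(m-k)$$

and, by substitution,

$$E_n = \phi_n(0,0) - \sum_{k=1}^{p} \alpha_k\, \phi_n(0,k)$$

In principle linear prediction analysis is simple, but computing $\phi_n(i,k)$ and solving the system of equations is laborious. The remedy is to use the autocorrelation function to solve these equations.
Assume the segment $s_n(m)$ is zero outside the interval $0 \le m \le N-1$. The segment can then be written as $s_n(m) = s(n+m)\,w(m)$, where $w(m)$ is a window of finite length (usually a Hamming window). The prediction error $E_n$ becomes

$$E_n = \sum_{m=0}^{N+p-1} e_n^2(m)$$

and the definition of $\phi_n(i,k)$ becomes

$$\phi_n(i,k) = \sum_{m=0}^{N-1-(i-k)} s_n(m)\, s_n(m+i-k), \qquad 1 \le i \le p,\; 0 \le k \le p$$

Let $R_n(k)$ be the autocorrelation function

$$R_n(k) = \sum_{m=0}^{N-1-k} s_n(m)\, s_n(m+k)$$

Since $R_n(k)$ is an even function,

$$\phi_n(i,k) = R_n(|i-k|)$$

Therefore

$$\sum_{k=1}^{p} \alpha_k\, R_n(|i-k|) = R_n(i), \qquad 1 \le i \le p$$

This system can be written in matrix form as $\mathbf{R}\,\boldsymbol{\alpha} = \mathbf{r}$, where

$$\mathbf{R} = \begin{bmatrix} R_n(0) & R_n(1) & \cdots & R_n(p-1) \\ R_n(1) & R_n(0) & \cdots & R_n(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ R_n(p-1) & R_n(p-2) & \cdots & R_n(0) \end{bmatrix}, \qquad \boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_p \end{bmatrix}, \qquad \mathbf{r} = \begin{bmatrix} R_n(1) \\ R_n(2) \\ \vdots \\ R_n(p) \end{bmatrix}$$

Note: $\mathbf{R}$ is a symmetric matrix in which all elements along each diagonal are equal (a Toeplitz matrix). For a frame that is not identically zero it is also positive definite, so its inverse exists and the system has a unique solution.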
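The autocorrelation equations can be solved without a general matrix inverse by exploiting the equal-diagonal structure. The sketch below uses the Levinson-Durbin recursion, which is a standard technique for this kind of system and is named here as an illustration rather than as the report's own procedure; the function names and the test sequence are hypothetical.

```python
def autocorrelation(frame, p):
    """R(k) = sum_{m=0}^{N-1-k} s(m) s(m+k) for k = 0 .. p."""
    n = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(n - k)) for k in range(p + 1)]


def levinson_durbin(r, p):
    """Solve sum_k a_k R(|i-k|) = R(i), i = 1..p. Returns ([a_1..a_p], residual energy).

    Assumes r[0] > 0, i.e. the frame is not identically zero.
    """
    a = [0.0] * (p + 1)          # a[0] is unused; a[j] holds alpha_j
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err            # reflection coefficient of order i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)     # prediction error energy after order i
    return a[1:], err


# Hypothetical test frame: a deterministic, noise-like sequence of 50 samples.
frame = [float((7 * n) % 11) - 5.0 for n in range(50)]
order = 4
r = autocorrelation(frame, order)
alphas, residual = levinson_durbin(r, order)
```

The returned `residual` corresponds to the minimum prediction error energy and never exceeds R(0), since each order multiplies it by a factor of at most one.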

III. The PLP method

This method combines the two methods presented above. Figure 8 shows the steps for determining the PLP coefficients.

Figure 8: Steps for determining the PLP coefficients

Processing blocks
Block 1: Fast Fourier transform (FFT)
As in the MFCC method, the speech signal is divided into frames and converted to the frequency domain with the FFT.

Block 2: Filtering on the Bark frequency scale

The speech signal is passed through filters distributed on a non-linear frequency scale, in this case the Bark scale.

Block 3: Emphasis with the equal-loudness function

This step is similar to the pre-emphasis step of the MFCC method. The function models the equal-loudness curve.

Block 4: Power law of hearing

This step plays the same role as taking the logarithm in the MFCC method. The cube-root function used has the form

$$\Phi(f) = \Psi(f)^{0.33}$$

Block 5: Inverse Fourier transform (inverse DFT)
The inverse Fourier transform yields the autocorrelation coefficients, which are the input to the LPC analysis.
Block 6: Durbin algorithm
The Durbin algorithm is used to compute the linear prediction coefficients, as in the LPC method.
Block 7: Delta values
These are computed in the same way as for the MFCC coefficients.
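The three perceptual operations in Blocks 2 to 4 can be sketched as simple functions. The Bark warping and equal-loudness expressions below are the forms commonly attributed to Hermansky's PLP formulation; they are assumptions here, since the report's own formulas were not recoverable, and the constants should be checked against the reference being followed. The function names are hypothetical.

```python
import math


def hz_to_bark(f):
    """Bark warping, assumed form: 6 * ln(f/600 + sqrt((f/600)^2 + 1))."""
    x = f / 600.0
    return 6.0 * math.log(x + math.sqrt(x * x + 1.0))


def equal_loudness(f):
    """Equal-loudness weight, assumed form, as a function of w = 2*pi*f."""
    w2 = (2.0 * math.pi * f) ** 2
    return ((w2 + 56.8e6) * w2 * w2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))


def cube_root_compress(band_energies):
    """Power law of hearing: raise each band energy to the power 0.33."""
    return [e ** 0.33 for e in band_energies]


bark_1k = hz_to_bark(1000.0)
compressed = cube_root_compress([1.0, 8.0, 1000.0])
```

The compression step narrows the dynamic range much as the logarithm does in the MFCC chain, so a thousandfold change in band energy becomes less than a tenfold change after compression.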

PART III: APPLYING SPEECH FEATURE EXTRACTION TO RECOGNITION

I. Overview of speech recognition

A speech recognition system gives a machine the ability to recognize the content of spoken language. In essence, it converts the acoustic signal captured from a speaker through a microphone, a telephone line or another device into a sequence of words. The recognition result can be used for device control, data entry, dictation, automatic telephone dialing, or passed on to higher-level language processing.

Figure 9: Basic components of a speech recognition system

Speech recognition systems can be classified as follows:
- Isolated-word or continuous speech recognition;
- Speaker-dependent or speaker-independent recognition;
- Small-vocabulary systems (fewer than 20 words) or large-vocabulary systems (thousands of words);
- Recognition in low-noise or high-noise environments;
- Speaker recognition.

An isolated-word recognition system requires pauses between the words of a sentence; a continuous speech recognition system does not. Depending on the scale and the recognition method, different recognition models are used. Figure 9 shows the general model of a typical speech recognition system.
After acquisition and quantization, the speech signal is converted into a set of feature vectors computed over segments 10-30 ms long. These features are used to match or search for the closest words, subject to acoustic, lexical and grammatical constraints. A speech database is used during training (modeling/classification) to determine the system parameters.

II. Approaches to speech recognition

Three approaches are commonly used in speech recognition today:
- The acoustic-phonetic method;
- The pattern recognition method;
- The artificial intelligence method.

Each is summarized below.

i. Acoustic-phonetic method
This method is based on acoustic-phonetic theory, which holds that speech contains well-defined, distinctive phonetic units and that each unit is characterized by a set of properties of the speech signal. The recognition steps are:
Step 1: Segmentation and labeling. The speech signal is divided into segments whose acoustic properties are characteristic of one (or a few) phonetic units, and each segment is assigned one or more matching phonetic labels.
Step 2: Recognition. Using lexical, grammatical and other constraints, this step determines a valid word or word sequence from the sequences of phonetic labels produced by Step 1.
The block diagram of the method is shown in Figure 10. It works as follows:
Feature extraction: after digitization, the speech signal is passed to the feature extraction block, which determines the signal spectra. Common feature extraction techniques include filter banks and linear predictive coding (LPC).
Feature detection: the signal spectra are converted into a set of properties describing the acoustic character of the different phonetic units. These properties may include nasality, frication, formant positions, voiced or unvoiced classification, and ratios of signal energy levels.
Segmentation and labeling: the recognition system identifies stable regions of sound (regions where the properties change very little) and assigns each region a label matching the properties of a phonetic unit. This is the key step of an acoustic-phonetic recognition system and also the step whose reliability is hardest to guarantee.

Recognition: the phonetic units are selected and combined correctly to form the recognized words.
Characteristics of the acoustic-phonetic approach to speech recognition:
- The designer needs fairly deep and broad knowledge of acoustics and phonetics;
- The analysis of phonetic units is intuitive and lacks precision;
- Classifying speech by phonetic units is usually not optimal because it is hard to analyze with mathematical tools.

Figure 10: Block diagram of acoustic-phonetic speech recognition


ii. Pattern recognition method

Figure 11: Block diagram of a pattern-based speech recognition system

The pattern recognition method does not need to determine acoustic properties or segment the speech; it uses the speech patterns directly in the recognition process. Systems of this kind are developed in two steps (Figure 11):

Step 1: Use a set of speech samples (a speech pattern database) to train the characteristic speech patterns (reference patterns) or the system parameters.
Step 2: Compare the incoming speech pattern with the characteristic patterns to reach a decision.
In this method, if the training database contains enough versions of the patterns to be recognized, training can determine the acoustic properties of each pattern accurately (a pattern may be a phoneme, a word or a phrase). Pattern recognition techniques that have been applied successfully to speech recognition include vector quantization, dynamic time warping (DTW), hidden Markov models (HMM) and artificial neural networks (ANN).
The system performs the following operations:
Feature extraction: the speech signal is analyzed into a sequence of measurements that define the pattern to be recognized. These measurements are the output of spectral analysis techniques such as band-pass filtering, linear predictive coding (LPC) analysis and the discrete Fourier transform (DFT).
Pattern training: many speech samples of the same sound unit are used to train representative patterns or models, called reference patterns or templates.
Recognition: the speech patterns are passed to the pattern classifier, which compares the input pattern with the reference patterns. The recognizer then uses its evaluation criteria to decide which reference pattern the input most resembles.
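Of the matching techniques mentioned above, dynamic time warping is the simplest to show in a few lines. The sketch below is a basic textbook form of DTW with a Euclidean local distance and no path constraints; the function names and the tiny one-dimensional "reference patterns" are hypothetical and stand in for real sequences of feature vectors.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated Euclidean cost of the best monotonic alignment of two sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.sqrt(sum((x - y) ** 2
                                  for x, y in zip(seq_a[i - 1], seq_b[j - 1])))
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(sample, references):
    """Return the label of the reference pattern closest to the sample."""
    return min(references, key=lambda label: dtw_distance(sample, references[label]))


references = {
    "rising": [[0.0], [1.0], [2.0], [3.0]],
    "falling": [[3.0], [2.0], [1.0], [0.0]],
}
# The same rising shape spoken more slowly: some frames are repeated.
slow_rising = [[0.0], [0.0], [1.0], [2.0], [2.0], [3.0]]
print(recognize(slow_rising, references))  # rising
```

The warping absorbs differences in speaking rate, so the slower utterance still matches its reference with zero accumulated cost.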
Some characteristics of the pattern recognition method:
- System performance depends on the number of patterns supplied. The more patterns there are, the higher the accuracy of the system; however, memory use and training time also increase.
- The reference patterns depend on the recording environment and the transmission channel.
- No deep linguistic knowledge is required.

iii. Artificial intelligence method

The artificial intelligence method combines the methods above to make the most of their strengths, while imitating the human ability to analyze and perceive external events and applying it to speech recognition. Figure 12 shows the block diagram of the artificial intelligence method in its bottom-up form.
Systems of this kind have the following characteristics:
- An expert system is used for segmentation and phonetic labeling, which simplifies the system compared with the acoustic-phonetic method.
- An artificial neural network is used to learn the relationships between phonetic units and is then used to recognize speech.

Figure 12: Block diagram of a bottom-up speech recognition system

The expert system brings human knowledge into the recognizer:
- Acoustic knowledge: to analyze spectra and determine the acoustic properties of the speech patterns.
- Lexical knowledge: to combine phonetic units into the words to be recognized.
- Syntactic knowledge: to combine words into the sentences to be recognized.
- Semantic knowledge: to check that the recognized sentences make sense.
There are several ways to integrate these knowledge sources into a speech recognizer. The most common is bottom-up processing, in which the system proceeds sequentially from low-level to high-level steps. In Figure 12, the low-level steps (signal analysis, feature detection, segmentation, labeling) are carried out before the high-level steps (sound classification, word determination, sentence determination). Each step requires one or more particular knowledge sources. For example, speech segmentation requires a thorough understanding of the acoustic-phonetic properties of the phonetic units; word determination requires lexical knowledge; sentence determination requires knowledge of the language model (the grammar rules).
This method has been and continues to be applied successfully in practical speech recognition applications. The first step of the recognition process is to extract the parameters of the speech signal.
a. Speech parameter analysis

Speech recognition, synthesis and coding all require parameter analysis. The approach used here is mel-scale cepstral analysis, which computes the MFCC coefficients through a bank of filters.
The basic concept in speech signal analysis is short-time analysis. Over long intervals the speech signal is non-stationary, but over sufficiently short intervals (10-30 ms) it can be treated as stationary. Speech processing applications therefore usually divide the speech into many segments of equal duration called frames, each 10 to 30 ms long.
b. Speech detection

Detecting where speech starts and ends (separating speech from silence) is an essential part of a speech recognition program, especially in real-time operation. This section presents three speech detection methods, based on the short-time energy SE, the zero-crossing rate ZCR and the short-term spectral energy.
Speech detection based on short-time energy
The short-time energy of the speech signal is computed by dividing the signal into frames of N samples each. Each frame is multiplied by a window function $W(n)$. If the window starts at sample $m$, the short-time energy $E_m$ is

$$E_m = \sum_{n=m}^{m+N-1} \left[ x(n)\, W(n-m) \right]^2$$

where
n: the discrete time index;
m: the index of the first sample of the frame;
N: the number of samples in a frame.
The window function $W(n)$ normally used is the rectangular window:

$$W(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$$

The algorithm for finding the start and end of speech with this method is:

Step 1: For each frame of the signal, compute the short-time energy $E_m$. If $E_m > E_{threshold}$ (a preset energy threshold), mark the frame as the start frame (frame B). Otherwise, examine the following frames until frame B is found. If no frame B is found, conclude that the signal is not speech.
Step 2: Compute $E_m$ for the frames following frame B until $E_m < E_{threshold}$, then stop and mark that frame as the end of the word (frame E). Once the start and end have been found, use the duration of the segment as an additional check on whether the signal really is speech (a clearly pronounced Vietnamese word usually lasts more than 200 ms).
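The two steps above can be sketched as follows. The function names are hypothetical, a rectangular window is assumed so the frame energy is just the sum of squared samples, and the frames are assumed to be non-overlapping and `frame_ms` milliseconds long so that the 200 ms duration check can be expressed in frames.

```python
def short_time_energy(frame):
    """E_m with a rectangular window: the sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def find_word(frames, e_threshold, frame_ms=10.0, min_word_ms=200.0):
    """Return (B, E) frame indices of the first detected word, or None."""
    energies = [short_time_energy(f) for f in frames]
    # Step 1: the first frame whose energy exceeds the threshold is frame B.
    begin = next((i for i, e in enumerate(energies) if e > e_threshold), None)
    if begin is None:
        return None                       # no speech in this signal
    # Step 2: the first later frame whose energy falls below the threshold is frame E.
    end = next((i for i in range(begin + 1, len(energies))
                if energies[i] < e_threshold), len(energies) - 1)
    # Extra check: reject segments too short to be a clearly pronounced word.
    if (end - begin) * frame_ms < min_word_ms:
        return None
    return begin, end


silence, loud = [0.0] * 4, [1.0] * 4
utterance = [silence] * 5 + [loud] * 25 + [silence] * 5
print(find_word(utterance, e_threshold=1.0))  # (5, 30)
```

A 50 ms burst of the same loudness would pass the energy test but fail the duration check, which is the purpose of the final step.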
Speech detection based on pseudo-energy and zero-crossing rate
This algorithm finds the start and end of the speech signal from two quantities computed from the signal: the pseudo-energy E and the zero-crossing rate ZCR.
In a sequence of discrete speech samples, a zero crossing is a point at which the signal changes sign:

$$\mathrm{sgn}[s(n)] \ne \mathrm{sgn}[s(n+1)]$$

where sgn(.) is the sign function.
Energy is used to locate regions containing voiced and unvoiced sounds, but the energy function is usually very sensitive to noise. The pseudo-energy function is therefore commonly used instead. It is defined by

$$\hat{E}(n) = \sum_{i=n-N+1}^{n} |s(i)|$$

where
$\hat{E}(n)$: the pseudo-energy function;
N: the size of the window frame.
Zero-crossing rate ZCR
Frames with higher energy tend to have a lower zero-crossing rate, and vice versa. The zero-crossing rate is thus a quantity that reflects the frequency content of the speech signal. Three thresholds must be set: an upper and a lower threshold for the pseudo-energy, and one threshold for the zero-crossing rate.

Notation:
EUp: upper (high) energy threshold;
EDown: lower (low) energy threshold;
ZCR_T: zero-crossing rate threshold.
The algorithm is as follows:
Step 1: Divide the speech signal into frames. Compute the pseudo-energy $\hat{E}(n)$ and the zero-crossing rate ZCR for each frame.
Step 2: Start from the first frame. Mark frame i as the start point if its zero-crossing rate exceeds the threshold (ZCR > ZCR_T) and its pseudo-energy rises above the lower threshold ($\hat{E}(n)$ > EDown).
Step 3: Examine the following frames. Mark a frame as belonging to the word if the pseudo-energy rises above the upper threshold ($\hat{E}(n)$ > EUp).
Step 4: The start point of the word is re-determined when the pseudo-energy of a frame is below the lower threshold ($\hat{E}(n)$ < EDown) and, at the same time, its zero-crossing rate is above the threshold (ZCR > ZCR_T).
Step 5: The end point of the word is found where the zero-crossing rate is below the threshold (ZCR < ZCR_T) and the pseudo-energy falls below the lower threshold ($\hat{E}(n)$ < EDown).
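The two per-frame measurements used by this algorithm can be sketched as below. The sketch covers only the measurements and not the threshold logic, since the thresholds are set empirically. It assumes the pseudo-energy is the sum of absolute sample values and that the zero-crossing rate is expressed as the fraction of adjacent sample pairs with different signs; both choices, and the function names, are assumptions for illustration.

```python
def pseudo_energy(frame):
    """Sum of absolute sample values in the frame (assumed pseudo-energy form)."""
    return sum(abs(x) for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# A slowly varying, strong frame (voiced-like) versus a weak, rapidly alternating one.
voiced_like = [4.0, 5.0, 4.0, 3.0, -3.0, -4.0, -5.0, -4.0]
noise_like = [0.2, -0.1, 0.2, -0.2, 0.1, -0.2, 0.1, -0.1]
print(zero_crossing_rate(voiced_like), zero_crossing_rate(noise_like))
```

The strong frame has high pseudo-energy and few sign changes, while the weak one shows the opposite pattern, which is the contrast the thresholds EUp, EDown and ZCR_T are meant to exploit.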
Speech detection based on short-term spectral energy
The main idea of this method is to use a voice activity detector (VAD) based on the short-term spectral energy $E_f$ of the speech frames. The VAD decides whether a frame contains speech or noise. Its output for frame m is v[m]: v[m] = 1 for a frame containing speech (possibly with noise) and v[m] = 0 for a frame containing only noise.
The algorithm is as follows:
Step 1: Compute the short-term spectral energy $E_f$ of each frame from the outputs of the NumChan channels of the triangular filter bank (normalized with a logarithm).
Step 2: Determine the long-term mean spectral energy $E_m$ for each frame from $E_f$. If the update condition holds, $E_m$ is updated by rule (4-2); otherwise it is updated by rule (4-3). The rule involves a threshold on the long-term mean spectrum.
Step 3: Check whether the frame contains speech. If the decision condition holds, v[m] = 1; otherwise v[m] = 0. The condition uses a parameter determined experimentally.

This method prevents fricative consonants and speech at the end of the utterance from being misclassified.

CONCLUSION
This essay has given an initial overview of methods for analyzing speech features. It presented the fundamentals of speech, namely the human vocal apparatus and the auditory system, then used these properties to analyze methods for extracting speech features, and finally described the speech recognition methods built on those features.
Much research has been done on speech recognition using these feature extraction methods, and many results have become commercial products, such as ViaVoice and Dragon, voice-based security systems and voice dialing systems. Carrying this research forward and putting it into practical use is highly worthwhile, especially in the country's current period of industrialization and modernization.


TABLE OF CONTENTS
PREFACE
PART I: FUNDAMENTALS OF SPEECH
I. The human vocal apparatus
i. Mechanism of speech production
ii. Physical characteristics
iii. Classification of speech sounds
iv. Source-filter model of speech production
II. The human auditory system
i. Structure
ii. Mechanism of hearing
III. Vietnamese phonetics
i. Phonemes
ii. Vowels
PART II: METHODS FOR EXTRACTING SPEECH FEATURE PARAMETERS
I. Mel-scale cepstral analysis
II. Linear predictive coding (LPC)
III. The PLP method
PART III: APPLYING SPEECH FEATURE EXTRACTION TO RECOGNITION
I. Overview of speech recognition
II. Approaches to speech recognition
i. Acoustic-phonetic method
ii. Pattern recognition method
iii. Artificial intelligence method
CONCLUSION
