FOREWORD
The purpose of this report is to study methods for analysing speech features, and to analyse and test an application related to speech recognition.
Feature parameter extraction is the step that decides the results of speech recognition programs. There are many ways to extract feature parameters, but in general they rest on two mechanisms: simulating how the human ear perceives sound, and simulating how the vocal organs produce sound.
Under the dedicated guidance of Ms. Nguyễn Hoàng Lan, I have tried my best to complete this essay. Mistakes were nonetheless unavoidable along the way, and I would welcome the instructor's comments so that the essay can be improved.
Figure 1: The vocal apparatus
(1) Nasal cavity, (2) Hard palate, (3) Teeth, (4) Soft palate, (5)-(6)-(8) Tongue, (7) Uvula, (9) Pharynx, (10) Epiglottis, (11)-(12) Vocal cords, (13) Larynx, (14) Oesophagus, (15) Trachea.
The human speech production system consists of the lungs, the trachea, the larynx, the oral cavity and the nasal cavity. The larynx contains two folds called the vocal cords, which are stretched taut when speech is produced. The oral cavity forms an acoustic tube about 17 cm long in adult males; it ends at the lips at the front and at the vocal cords or larynx at the back.
- Voiced sounds:
Voiced sounds are produced when the vocal cords are tensed and vibrate as the air pressure rises, so that the glottis opens and then closes as the airflow passes through. The vocal tract acts like a resonator, amplifying some harmonic components and attenuating others to produce the voiced sound. The degree of vibration of the vocal cords depends on the air pressure in the lungs and on the tension of the vocal cords. A speaker can control these two factors to change the fundamental period of the sound, which is perceived as pitch. For men the fundamental frequency is roughly 50 to 250 Hz, while for women it usually falls between 120 and 500 Hz.
In language, vowels are by their acoustic nature voiced sounds.
- Unvoiced sounds:
Unvoiced sounds are produced when the vocal cords do not vibrate. There are two basic kinds: fricatives and aspirated sounds.
For fricatives, for example "s" and "x", some point along the vocal tract is constricted; as the airflow passes through it, turbulence arises and produces random noise. Because the constriction usually lies toward the front of the mouth, the resonances of the vocal tract have little influence on the character of a fricative.
For aspirated sounds, such as "h", turbulence arises near the glottis when the vocal cords are held partly open. In this case the resonances of the vocal tract shape the spectrum of the random noise. The effect is clearly audible in whispered speech.
In every language, consonants are built fundamentally on unvoiced sounds.
Besides these two basic kinds there is an intermediate kind that has both vowel and consonant character, called a semivowel or semiconsonant: for example the sounds "i" and "u" in words such as "ai".
- Plosive sounds:
When these sounds are produced, the vocal tract closes completely at some point. The air pressure inside the vocal tract rises momentarily and is then released abruptly. This rapid release of pressure creates a transient excitation of the vocal tract.
iv. Source-filter model of speech production
Figure 2 illustrates a very simple model of the vocal tract producing the vowel "e": a uniform tube of length L, with the sound source (the vocal cords) at one end and the other end open (the lips). This tube resonates at the odd multiples f0, 3f0, 5f0, ... of f0 = c/(4L), where c is the speed of sound in air. For example, with L = 17 cm and c = 340 m/s the resonances lie at 500 Hz, 1500 Hz and 2500 Hz. These resonance peaks are called formants. The vocal tract can take many different shapes, producing different resonance peaks, that is, different formant values, and hence different sounds. In speech, the formant frequencies change continually from one sound to the next.
Student: Nguyn Ngc ng
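The tube arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `tube_formants` is made up for illustration, and it assumes c = 340 m/s, the speed of sound for which a 17 cm tube gives a first resonance of 500 Hz.

```python
def tube_formants(length_m, speed_of_sound=340.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    The resonances are the odd multiples f0, 3*f0, 5*f0, ... of
    f0 = c / (4 * L).
    """
    f0 = speed_of_sound / (4.0 * length_m)
    return [(2 * k + 1) * f0 for k in range(count)]


# Hypothetical usage: a 17 cm vocal tract.
for index, freq in enumerate(tube_formants(0.17), start=1):
    print(f"F{index} = {freq:.0f} Hz")
```

Shortening the tube in the call (for example `tube_formants(0.14)`) raises every resonance, which is one way to see why formant values differ between speakers.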
II. The human auditory organ
Outer ear:
The outer ear comprises the pinna and the ear canal. The ear canal carries the sound signal to the eardrum and makes it vibrate. The displacement of the eardrum is on the order of a few nanometres, and a whisper may produce a displacement of only one tenth of the radius of a hydrogen atom.
Middle ear:
A small bone called the malleus (hammer) rests against the eardrum. When the eardrum vibrates, the malleus, which is linked to another bone called the incus (anvil), makes the incus rotate. As it rotates, the incus in turn drives a third bone, the stapes (stirrup), which rests against the oval window of the inner ear. These three bones (malleus, incus, stapes) are the smallest bones in the human body and are collectively called the ossicles. Their function is to transmit the vibration of the eardrum to the oval window of the inner ear.
Inner ear:
The oval window is a membrane-covered opening in the bony wall of a spiral structure called the cochlea. The fluid in the cochlea is divided along its length by two membranes, Reissner's membrane and the basilar membrane. Vibration of the oval window sets up a pressure wave in the cochlear fluid, and the pressure of this wave displaces the basilar membrane at different points along its length. Attached to the basilar membrane is the organ of Corti, which contains about 30,000 hair cells. Each hair cell has many tiny hairs protruding from it. These hairs are bent by the motion of the basilar membrane, and in this way the hair cells are activated ...
III. Vietnamese phonetics
1. Phonemes
1. Consonants
I. Mel-scale cepstral analysis
Computing the MFCC (mel-frequency cepstral coefficients) is a widely used way of extracting speech parameters, owing to the effectiveness of cepstral analysis on the mel scale.
The method is built on how the human ear perceives different frequency bands. At low frequencies (below 1000 Hz) the ear's perception is linear; at high frequencies it varies logarithmically. Filter banks that are linearly spaced at low frequencies and logarithmically spaced at high frequencies are therefore used to extract the important acoustic features of speech. The model for computing the MFCC coefficients is shown in Figure 5.
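Block 1: Pre-emphasis
The speech signal s(n) is first passed through a first-order pre-emphasis filter:
\[ s'(n) = s(n) - a \, s(n-1) \]
The value of a is usually chosen as 0.97.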
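The pre-emphasis formula can be sketched in Python as follows. The function name `pre_emphasis` is illustrative, and taking s(-1) = 0 for the very first sample is an assumption, since the text does not say how the boundary is handled.

```python
def pre_emphasis(samples, a=0.97):
    """Apply s'(n) = s(n) - a * s(n-1) to a sequence of samples.

    Assumption: the sample before the first one is taken to be 0.
    """
    out = []
    previous = 0.0
    for s in samples:
        out.append(s - a * previous)
        previous = s
    return out


# Hypothetical usage: a constant (low-frequency) input is strongly attenuated.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
```

A constant input is almost cancelled after the first sample, while rapid sample-to-sample changes pass through largely intact, which is the high-frequency boost the filter is meant to provide.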
Block 2: Frame blocking
In this block the pre-emphasized signal s'(n) is split into frames of N samples each, with adjacent frames offset by M samples. The first frame holds the first N samples; the second frame starts M samples after the first and overlaps it by N - M samples. Likewise, the third frame starts 2M samples after the first (M samples after the second) and overlaps the first by N - 2M samples. This continues until every speech sample to be analysed belongs to one or more frames.
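A minimal sketch of frame blocking in Python is given below. The name `frame_blocking` is illustrative, and zero-padding the last frame is an assumption made here so that every sample ends up in at least one frame; the text does not say how the final partial frame is treated.

```python
def frame_blocking(samples, frame_len, hop):
    """Split samples into frames of N = frame_len samples, shifted by M = hop.

    Adjacent frames overlap by N - M samples. The last frame is zero-padded
    (an assumption) so that all samples are covered.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    while start < len(samples):
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))  # pad the final frame
        frames.append(frame)
        if start + frame_len >= len(samples):
            break  # every sample now belongs to some frame
        start += hop
    return frames


# Hypothetical usage: N = 4 samples per frame, M = 2 samples between frames.
for frame in frame_blocking(list(range(10)), frame_len=4, hop=2):
    print(frame)
```

With a 16 kHz signal, a 25 ms frame and a 10 ms shift would correspond to `frame_len=400` and `hop=160`; these particular values are only an example.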
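Block 3: Windowing
The next step is to window each individual frame so as to reduce the discontinuities of the speech signal at the start and end of the frame. The window is denoted w(n), 0 ≤ n ≤ N-1, and each frame is multiplied sample by sample by w(n).
The Hamming window is normally used. It has the form:
\[ w(n) = 0.54 - 0.46 \cos\left( \frac{2\pi n}{N-1} \right), \quad 0 \le n \le N-1 \]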
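The windowing step can be sketched in Python as below, using the usual Hamming coefficients 0.54 and 0.46. The helper names `hamming` and `apply_window` are illustrative only.

```python
import math


def hamming(n_samples):
    """Hamming window w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_samples == 1:
        return [1.0]  # avoid dividing by N - 1 = 0
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def apply_window(frame):
    """Multiply a frame sample by sample by a Hamming window of equal length."""
    window = hamming(len(frame))
    return [x * w for x, w in zip(frame, window)]


# Hypothetical usage: the frame edges are attenuated, the centre is kept.
print(apply_window([1.0, 1.0, 1.0, 1.0, 1.0]))
```

The window tapers the frame toward 0.08 at both ends, which reduces the discontinuity between the end of one frame and the start of the next.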
One way to convert to the mel scale is to use a filter bank (Figure 6) in which each filter has a triangular frequency response. More than 20 filters are typically used. The frequency range is normally taken from 0 to Fs/2, where Fs is the sampling frequency of the speech. A restricted band from LOFREQ to HIFREQ can also be used to filter out frequencies that are not needed for processing. For telephone speech, for instance, the band can be limited to LOFREQ = 300 Hz and HIFREQ = 3400 Hz.
Computing the FFT gives the signal spectrum S(f_n), from which the energy sequence W(m) = |S(f_n)|^2 is formed. Passing W(m) through a bank of K triangular filters and summing the weighted energies within each filter gives the coefficients m_k (k = 1, 2, 3, ..., K).
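The filter-bank step can be sketched in Python as follows. Several things here are assumptions rather than part of the text: the mel conversion mel = 2595 log10(1 + f/700) is one common convention, the filters are placed with centres equally spaced on that mel scale between LOFREQ and HIFREQ, and all function names are illustrative.

```python
import math


def hz_to_mel(freq_hz):
    # Assumption: a commonly used mel formula, not given in the text.
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, sample_rate, lo_freq=0.0, hi_freq=None):
    """Triangular filters whose centres are equally spaced on the mel scale.

    Returns one list of weights per filter, covering FFT bins 0 .. n_fft/2.
    """
    if hi_freq is None:
        hi_freq = sample_rate / 2.0
    lo_mel, hi_mel = hz_to_mel(lo_freq), hz_to_mel(hi_freq)
    # num_filters + 2 edges: filter k spans edges k-1 (left), k (peak), k+1 (right)
    edges = [mel_to_hz(lo_mel + (hi_mel - lo_mel) * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for k in range(1, num_filters + 1):
        left, centre, right = edges[k - 1], edges[k], edges[k + 1]
        weights = []
        for f in bin_freqs:
            if left < f <= centre:
                weights.append((f - left) / (centre - left))
            elif centre < f < right:
                weights.append((right - f) / (right - centre))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def filterbank_energies(power_spectrum, bank):
    """Sum the weighted energies W(m) inside each filter, giving m_1 .. m_K."""
    return [sum(w * p for w, p in zip(weights, power_spectrum))
            for weights in bank]


# Hypothetical usage: 20 filters over the telephone band of an 8 kHz signal.
bank = mel_filterbank(20, n_fft=256, sample_rate=8000, lo_freq=300.0, hi_freq=3400.0)
flat_spectrum = [1.0] * (256 // 2 + 1)
print(filterbank_energies(flat_spectrum, bank))
```

Taking the logarithm of these energies and applying a discrete cosine transform would then give the cepstral coefficients; that last step is not shown here.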
II. Linear predictive coding (LPC)
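Figure 7 shows the block diagram of the LPC analyser used to extract the feature parameters of the speech signal. With a predictor of order p and coefficients \alpha_k, the prediction error is computed as:
\[ e(n) = s(n) - \sum_{k=1}^{p} \alpha_k \, s(n-k) \]
To minimize the error, the best-fitting set of values \{\alpha_k\} must be found.
Because the speech signal changes over time, the prediction coefficients must be estimated over short segments s_n(m) of the signal, by minimizing the short-time squared error:
\[ E_n = \sum_m e_n^2(m) = \sum_m \Big[ s_n(m) - \sum_{k=1}^{p} \alpha_k \, s_n(m-k) \Big]^2 \]
Setting \partial E_n / \partial \alpha_i = 0 for i = 1, ..., p, and defining:
\[ \phi_n(i,k) = \sum_m s_n(m-i) \, s_n(m-k) \]
substitution gives:
\[ \sum_{k=1}^{p} \alpha_k \, \phi_n(i,k) = \phi_n(i,0), \quad i = 1, 2, \ldots, p \]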
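In principle, linear prediction analysis is very simple, but computing \phi_n(i,k) and solving the resulting system of equations is quite involved. The remedy is to use the autocorrelation function to solve these equations.
Assume the segment s_n(m) = 0 outside the interval 0 ≤ m ≤ N-1. The segment can then be written as s_n(m) = s(n+m) w(m), where w(m) is a window of finite length (a Hamming window is typically used). The prediction error E_n is then:
\[ E_n = \sum_{m=0}^{N+p-1} e_n^2(m) \]
and \phi_n(i,k) reduces to the short-time autocorrelation, \phi_n(i,k) = R_n(i-k). Since R_n(k) is an even function:
\[ \phi_n(i,k) = R_n(|i-k|), \quad i = 1, \ldots, p, \; k = 0, \ldots, p \]
Hence, for a predictor of order p with coefficients \alpha_k:
\[ \sum_{k=1}^{p} \alpha_k \, R_n(|i-k|) = R_n(i), \quad 1 \le i \le p \]
where:
\[ R_n(k) = \sum_{m=0}^{N-1-k} s_n(m) \, s_n(m+k) \]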
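The autocorrelation method can be sketched in Python as below. The sketch solves the equations sum_k alpha_k R(|i-k|) = R(i), i = 1..p, with the Levinson-Durbin recursion, a standard solver for this kind of Toeplitz system; the text itself does not name a solver, so this choice and all function names are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Short-time autocorrelation R(k) = sum_m s(m) * s(m + k), k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve sum_k a_k * R(|i - k|) = R(i) for i = 1..order.

    Returns (coefficients a_1..a_p, final prediction error energy).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error == 0.0:
            break  # silent frame: nothing to predict
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient of stage i
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_coefficients(frame, order):
    """LPC coefficients of one (already windowed) frame."""
    return levinson_durbin(autocorrelation(frame, order), order)


# Hypothetical usage on a short test frame with predictor order p = 3.
coeffs, residual = lpc_coefficients([1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0], 3)
print(coeffs, residual)
```

The recursion needs on the order of p^2 operations per frame instead of the p^3 of a general linear solver, which is why it is the usual companion of the autocorrelation method.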
III. The PLP method
Figure 8: Steps for computing the PLP coefficients
Processing blocks
Block 1: Fast Fourier transform (FFT)
As in the MFCC method, the speech signal is divided into frames and transformed to the frequency domain with the FFT algorithm.
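II. Approaches to speech recognition
Three approaches are commonly used in speech recognition today: the acoustic-phonetic approach, the pattern-recognition approach, and the artificial-intelligence approach.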
The analysis of phonetic units is intuitive and lacks precision;
iii. The artificial-intelligence approach
Here we describe mel-scale cepstral analysis for computing the MFCC coefficients by means of a bank of filters.
The basic concept in speech signal analysis is short-time analysis. Over a long interval the speech signal is non-stationary, but over a sufficiently short interval (10 to 30 ms) it can be treated as stationary. Speech processing applications therefore usually divide speech into segments of equal duration called frames, each 10 to 30 ms long.
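2. Speech detection
The short-time energy E_m is defined as follows:
\[ E_m = \sum_{n=m-N+1}^{m} \big[ s(n) \, W(m-n) \big]^2 \]
where:
n: the discrete time variable;
m: the index of the m-th sample;
N: the total number of speech samples.
The window function W(n) most commonly used is the rectangular window, defined as:
\[ W(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \]
The zero-crossing rate counts how often the signal changes sign within the window:
\[ ZCR_m = \frac{1}{2} \sum_{n=m-N+1}^{m} \big| \mathrm{sgn}[s(n)] - \mathrm{sgn}[s(n-1)] \big| \, W(m-n) \]
where sgn(.) is the sign function.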
Energy is the quantity used to locate voiced and unvoiced regions. The energy function, however, is usually very sensitive to noise, so a pseudo-energy function is normally used in the computation instead. The pseudo-energy function Ê(n) is evaluated over a window frame of N samples.
Zero-crossing rate (ZCR)
A frame with higher energy tends to have a lower zero-crossing rate, and vice versa. The zero-crossing rate is thus a quantity that characterizes the frequency content of the speech signal. Here we need to set threshold parameters: an upper and a lower threshold for the pseudo-energy function, and one threshold for the zero-crossing rate.
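The per-frame quantities discussed above can be sketched in Python as follows. Two points are assumptions: the pseudo-energy is taken here to be the sum of absolute sample values, and the zero-crossing rate is normalized by the frame length. The function names are illustrative.

```python
def short_time_energy(frame):
    """Energy of one frame under a rectangular window: sum of squared samples."""
    return sum(x * x for x in frame)


def pseudo_energy(frame):
    # Assumption: magnitude sum, which grows less steeply than the squared sum.
    return sum(abs(x) for x in frame)


def sgn(x):
    """Sign function, counting zero as positive."""
    return 1 if x >= 0 else -1


def zero_crossing_rate(frame):
    """Fraction of samples at which the signal changes sign."""
    if not frame:
        return 0.0
    crossings = sum(abs(sgn(frame[n]) - sgn(frame[n - 1]))
                    for n in range(1, len(frame))) / 2
    return crossings / len(frame)


# Hypothetical usage: a rapidly alternating frame versus a constant one.
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]), zero_crossing_rate([1.0, 1.0, 1.0, 1.0]))
```

A noisy or fricative-like frame gives a high zero-crossing rate with modest energy, while a voiced frame typically gives the opposite pattern.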
Notation:
EUp: upper (high) energy threshold;
EDown: lower (low) energy threshold;
ZCR_T: zero-crossing rate threshold.
The algorithm is as follows:
Step 1: Divide the speech signal into frames. Compute the pseudo-energy Ê(n) and the zero-crossing rate ZCR for each frame.
Step 2: Scan from the first frame. Mark frame i as the start point if its zero-crossing rate exceeds the threshold (ZCR > ZCR_T) and its pseudo-energy exceeds the lower threshold (Ê(n) > EDown) while the pseudo-energy is rising.
Step 3: Examine the following frames. Mark each subsequent frame as belonging to the word if the pseudo-energy exceeds the upper threshold (Ê(n) > EUp) while the energy is rising.
Step 4: The start point of the word is re-determined when the pseudo-energy of a frame falls below the lower threshold (Ê(n) < EDown) while the zero-crossing rate of that frame is above the threshold (ZCR > ZCR_T).
Step 5: The end point of the word is found at the frame where the zero-crossing rate is below the threshold (ZCR < ZCR_T) and the pseudo-energy is below the lower threshold (Ê(n) < EDown) while the pseudo-energy is falling.
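One possible reading of the five steps above is sketched below in Python. It is a simplified interpretation, not a definitive implementation: the per-frame pseudo-energy and ZCR values are assumed to be computed already, the rising and falling checks are left out, step 4 is read as discarding the candidate start point, and the function name `detect_word` is made up for illustration.

```python
def detect_word(energies, zcrs, e_up, e_down, zcr_t):
    """Return (start_frame, end_frame) of the first detected word, or None.

    energies and zcrs hold one pseudo-energy and one ZCR value per frame.
    """
    start = None
    confirmed = False
    for i, (energy, zcr) in enumerate(zip(energies, zcrs)):
        if start is None:
            # Step 2: candidate start point.
            if zcr > zcr_t and energy > e_down:
                start = i
        elif not confirmed:
            if energy > e_up:
                # Step 3: energy passed the upper threshold, the word is confirmed.
                confirmed = True
            elif energy < e_down and zcr > zcr_t:
                # Step 4 (one reading): drop the candidate and search again.
                start = None
        elif zcr < zcr_t and energy < e_down:
            # Step 5: end point of the word.
            return start, i
    if start is not None and confirmed:
        return start, len(energies) - 1  # word runs to the end of the signal
    return None


# Hypothetical usage with made-up per-frame values and thresholds.
energies = [0.1, 0.6, 2.0, 3.0, 1.0, 0.2, 0.1]
zcrs = [0.1, 0.5, 0.2, 0.1, 0.1, 0.05, 0.1]
print(detect_word(energies, zcrs, e_up=1.5, e_down=0.5, zcr_t=0.3))
```

In practice the three thresholds would be tuned on recordings of background noise; the values in the usage example are arbitrary.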
Speech detection based on short-time spectral energy
The main idea of this method is to use a voice activity detector (VAD) based on the short-time spectral energy Ef of the speech frames. The VAD decides whether a frame contains speech or noise. Its output for frame m is v[m]: v[m] = 1 for a frame containing speech (possibly with noise), and v[m] = 0 for a frame containing only noise.
The algorithm is as follows:
Step 1: Compute the short-time spectral energy Ef of each frame, where NumChan is the number of channels in the triangular filter bank.
Then: (4-2)
Otherwise:
where the threshold is that of the long-term mean spectrum (4-3)
then v[m] = 1
Otherwise:
v[m] = 0
CONCLUSION
This essay has gone some way toward surveying methods for analysing speech features. It presented the fundamentals of speech, namely the human vocal apparatus and the auditory organ, and from those characteristics went on to analyse methods for extracting speech features. Speech recognition methods are in turn built on these feature extraction methods.
There has been a great deal of research in speech recognition based on speech feature extraction, and many results have become commercial products such as ViaVoice and Dragon, security systems based on speech recognition, and voice-dialling telephone systems. Carrying this research forward and putting it into practice is highly worthwhile work, especially in the country's current period of industrialization and modernization.
TABLE OF CONTENTS
FOREWORD
PART I: FUNDAMENTALS OF SPEECH
I. The human vocal apparatus
i. Speech production mechanism
ii. Physical characteristics
iii. Classification of speech sounds
iv. Source-filter model of speech production
II. The human auditory organ
1. Structure
ii. Hearing mechanism
III. Vietnamese phonetics
1. Phonemes
ii. Vowels
PART II: METHODS FOR EXTRACTING SPEECH FEATURE PARAMETERS
I. Mel-scale cepstral analysis
II. Linear predictive coding (LPC)
III. The PLP method
PART III: APPLYING SPEECH FEATURE EXTRACTION TO RECOGNITION
I. Overview of speech recognition
II. Approaches to speech recognition
1. The acoustic-phonetic approach
ii. The pattern-recognition approach
iii. The artificial-intelligence approach
CONCLUSION