Ky Thuat Nhan Dang (Recognition Techniques)
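Short-time energy function and short-time zero-crossing rate

The short-time zero-crossing rate is

$$Z(m) = \frac{1}{N} \sum_{n=m-N+1}^{m} \frac{\left|\operatorname{sgn}[x(n)] - \operatorname{sgn}[x(n-1)]\right|}{2}\, w(m-n) \qquad (2.2)$$

where N is the length of the window w(m-n).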
Many endpoint detection algorithms rely on the magnitude of the short-time energy signal and on the zero-crossing rate to locate the endpoints as accurately as possible. The basic procedure is as follows: a short sample of the background noise is taken during the silence interval just before the speech signal begins. From this ...
Figure 2. Relationship between the speech signal and the background noise.
For a window ending at sample m, the short-time energy function E(m) is defined by:

$$E(m) = \sum_{n=m-N+1}^{m} \left[x(n)\, w(m-n)\right]^2$$
Spectral analysis
If the frequency values ω are equally spaced, i.e. ω = 2πk/N, then the discrete Fourier transform (DFT) of every frame of the signal is:

$$X_t(k) = X_t\left(e^{j 2\pi k / N}\right), \qquad k = 0, \ldots, N-1.$$

Figure 5. Pulse shape after combined processing with the short-time energy function and the zero-crossing rate.
Figure 5 shows that determining the minimum length of a word is enough to separate the words from the background noise. At this point module 1 has completed its task. This is a very important part of a speech recognition system, since it strongly affects the recognition result.
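The word-separation idea can be sketched as below. The sketch assumes that the combined energy and zero-crossing processing has already produced one binary value per frame (1 for speech, 0 for background); the names `pulse` and `min_len` are hypothetical, and the source does not state the actual minimum length used.

```python
def extract_words(pulse, min_len):
    """Return (start, end) frame indices of runs of 1s at least min_len long."""
    segments = []
    start = None
    # A trailing 0 is appended as a sentinel so that a final run is closed.
    for i, p in enumerate(list(pulse) + [0]):
        if p and start is None:
            start = i                      # a run of speech frames begins
        elif not p and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                segments.append((start, i - 1))
            start = None
    return segments


pulse = [0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1]
words = extract_words(pulse, 3)
```

Runs shorter than `min_len` frames are treated as noise bursts and discarded; only the longer run is kept as a word.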
2.2 Implementation of module 2
At this point we have speech samples from which the noise has been removed. Module 2 extracts features from the speech samples obtained in module 1. There are many feature extraction methods, such as wavelets, LPC and MFCC. The MFCC method (feature extraction on the Mel frequency scale) is chosen here because it is fast to compute, highly reliable, and has been used very effectively in speech recognition programs around the world.
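In addition, if the number of samples N is a power of two (N = 2^p, where p is an integer), the computational complexity is reduced considerably, from on the order of N^2 to N log2 N operations, by using the FFT (Fast Fourier Transform).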
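To make the saving concrete, here is a minimal sketch comparing a direct DFT with a recursive radix-2 FFT. It is an illustration under the assumption that the frame length is a power of two; it is not taken from the authors' program, and a production system would normally call an optimized library routine instead.

```python
import cmath


def dft(x):
    """Direct DFT: X(k) = sum_n x(n) exp(-j 2 pi k n / N), about N^2 operations."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out


frame = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
same = all(abs(a - b) < 1e-9 for a, b in zip(dft(frame), fft(frame)))
```

Both functions return the same spectrum; the FFT simply reuses the two half-length transforms at each level instead of recomputing every sum.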
Filter processing
Physiological studies have shown that human perception of the frequency content of speech does not follow a linear scale. Each tone corresponds to a frequency f, measured in Hz. To describe accurately how the auditory system perceives frequency, another scale was constructed: the Mel scale. The mel frequency scale is linear below 1000 Hz and logarithmic above 1000 Hz. The mapping between the physical frequency scale (Hz) and the perceptual Mel scale is given by the following formula:
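$$F_{mel} = \frac{1000}{\log_{10} 2}\,\log_{10}\left(1 + \frac{F_{Hz}}{1000}\right)$$

or, in another commonly used form,

$$F_{mel} = 2595\,\log_{10}\left(1 + \frac{F_{Hz}}{700}\right) \qquad (2.3)$$

The algorithm flow of the MFCC method is as follows:

Figure 6. Computation of the MFCC coefficients.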
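The Hz-to-mel mapping can be sketched in a few lines. The sketch assumes the two widely used approximations, one with constants 2595 and 700 and one based on a base-2 logarithm with a 1000 Hz reference; the function names are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Common approximation: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(f_mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (f_mel / 2595.0) - 1.0)


def hz_to_mel_base2(f_hz):
    """Alternative form: (1000 / log10 2) * log10(1 + f / 1000)."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f_hz / 1000.0)


mel_at_1k = hz_to_mel(1000.0)
```

Both forms map 1000 Hz to approximately 1000 mel, and both grow roughly linearly at low frequencies and logarithmically at high frequencies.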
Windowing
Classical spectral estimation methods are reliable only for stationary signals, i.e. signals whose characteristics do not vary with time. For speech this holds only over a short time interval, which is achieved by windowing the signal x(n) into a succession of consecutive windowed sequences x_t(n), t = 1, 2, ..., T, called frames.
In automatic recognition systems the most commonly used window is the Hamming window, whose impulse response is a raised cosine:
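$$w(n) = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{N-1}\right), & n = 0, \ldots, N-1 \\ 0, & \text{otherwise} \end{cases}$$

Spectral analysis reveals the features of the speech signal that are produced by the shape of the vocal tract. The spectral features of speech are obtained at the output of a bank of filters. On the Mel frequency scale there is one filter for each desired frequency component (Figure 7). Each filter has a triangular frequency response, and its spacing and bandwidth are determined by a constant Mel interval.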
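Splitting a signal into Hamming-windowed frames can be sketched as follows. The frame length `N` and the step `hop` between successive frames are assumptions made for illustration; the source does not state the values used in the experiments.

```python
import math


def hamming(N):
    """Hamming window of length N (N >= 2): 0.54 - 0.46 cos(2 pi n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1))
            for n in range(N)]


def windowed_frames(x, N, hop):
    """Cut x into frames of N samples taken every hop samples, each weighted by w(n)."""
    w = hamming(N)
    return [[x[s + n] * w[n] for n in range(N)]
            for s in range(0, len(x) - N + 1, hop)]


w5 = hamming(5)
frames = windowed_frames([float(v) for v in range(10)], 4, 2)
```

With a hop smaller than the frame length, consecutive frames overlap, so samples attenuated at the edge of one window fall near the centre of a neighbouring one.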
Figure 7. An example of a Mel-scale filter bank.
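A bank of triangular filters with a constant spacing on the Mel scale can be sketched as below. The number of filters, the DFT size and the sampling rate are hypothetical parameters, and the Hz-to-mel conversion assumes the common 2595/700 approximation.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(f_mel):
    return 700.0 * (10.0 ** (f_mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Return num_filters lists of triangular weights over n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, equally spaced on the Mel scale
    edges = [mel_to_hz(top * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        weights = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft        # centre frequency of bin k in Hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


bank = mel_filterbank(10, 256, 8000)
```

Because the edges are equally spaced in mel, the triangles are narrow at low frequencies and become progressively wider in Hz at high frequencies.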
Log energy computation (LOG)
The previous steps smooth the spectrum and perform processing similar to that of the human ear. This step computes the logarithm of the squared magnitude of the coefficients at the filter-bank output. Note that the human ear handles magnitude and logarithmic processing very well. Moreover, the magnitude operation discards unnecessary information, while the logarithm performs a dynamic compression, making the extracted features less sensitive to dynamic variations.
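This step can be sketched as follows, assuming each filter output is the weighted sum of the DFT magnitudes of one frame. The small `floor` value, which avoids taking the logarithm of zero, and the function names are assumptions introduced for the sketch.

```python
import math


def filter_outputs(spectrum, bank):
    """Weighted sum of DFT magnitudes for each filter of the bank."""
    mags = [abs(c) for c in spectrum]
    return [sum(w * m for w, m in zip(weights, mags)) for weights in bank]


def log_energies(outputs, floor=1e-10):
    """Natural log of the squared magnitude of each filter output."""
    return [math.log(max(y * y, floor)) for y in outputs]


# Toy example: 3 DFT bins and 2 filters (hypothetical values).
spectrum = [1 + 0j, 0 + 2j, 3 + 0j]
toy_bank = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
log_e = log_energies(filter_outputs(spectrum, toy_bank))
```

Taking the magnitude drops the phase of each bin, and the logarithm compresses the large range of output values into a much smaller one.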
Mel frequency cepstrum computation
The final step in computing the mel-frequency cepstral coefficients (MFCC) is to apply the inverse DFT to the log magnitude of the filter-bank outputs. Because the log energy spectrum is real and symmetric, the inverse DFT reduces to the discrete cosine transform (DCT). The DCT has the property of producing highly uncorrelated features. The DCT also smooths the spectrum when only the first coefficients are retained. In speech recognition the number of MFCC coefficients is usually less than 15. [6]
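The DCT step can be sketched as below. The sketch assumes an unnormalized DCT-II over the log filter-bank energies and keeps only the first `num_coeffs` coefficients; the exact indexing and scaling used by the authors are not given in the source.

```python
import math


def dct_cepstrum(log_energies, num_coeffs):
    """First num_coeffs coefficients of the unnormalized DCT-II of log_energies."""
    M = len(log_energies)
    return [sum(e * math.cos(math.pi * i * (j + 0.5) / M)
                for j, e in enumerate(log_energies))
            for i in range(num_coeffs)]


# A flat log spectrum has all of its content in coefficient 0.
flat = dct_cepstrum([1.0, 1.0, 1.0, 1.0], 3)
```

Low-order coefficients describe the smooth envelope of the log spectrum, which is why truncating the sequence to a dozen or so values acts as spectral smoothing.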
After feature extraction, each spoken word is characterized by a matrix of real coefficients. Because a discrete HMM is used for recognition, these feature vectors must be vector quantized (VQ) into discrete codebook indices. A popular algorithm for designing the codebook is LBG (Linde, Buzo and Gray).
Figure 8. Vector quantization (VQ) in recognition.
The method used here for vector quantization is K-means.
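A minimal K-means quantizer can be sketched as follows. The initialisation with the first k vectors, the fixed number of iterations and the function names are simplifying assumptions made here; the source does not describe these details.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))


def kmeans(vectors, k, iterations=20):
    """Build a k-entry codebook from the training vectors."""
    codebook = [list(v) for v in vectors[:k]]   # naive initialisation
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:                          # keep old centroid if cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def quantize(codebook, vectors):
    """Replace each feature vector by the index of its nearest codeword."""
    return [nearest(codebook, v) for v in vectors]


features = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
codebook = kmeans(features, 2)
symbols = quantize(codebook, features)
```

The resulting sequence of indices is the discrete observation sequence that a discrete HMM consumes.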
2.3 Implementation of module 3
Once the two modules above have run, we have a database of feature vectors for each word. In this module we build a hidden Markov model whose training data are the feature vectors obtained from module 2. The HMM training and recognition scheme is shown in Figure 9 for a vocabulary of 3 words: tới (forward), lui (backward), trái (left).
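Figure 9. Diagram of the HMM model. Training: the sample utterances of each word (tới, lui, trái) are used to estimate the parameters of the corresponding model (λ_tới, λ_lui, λ_trái). Recognition: for an observation sequence O, the probabilities P(O|λ_tới), P(O|λ_lui) and P(O|λ_trái) are computed.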
For each word to be recognized we have a database of features from different utterances (3 utterances in the diagram). We then estimate the parameters of the model λ = (A, B, π) so that the probability P(O|λ) is maximized; each word thus has its own λ. To recognize a word, we compute the probability of its observation sequence under each trained λ and choose the model with the highest probability.
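The recognition step can be sketched with the forward algorithm for a discrete HMM. Training (estimating A, B and π) is not shown; the toy model parameters and the ASCII word labels below are hypothetical values chosen only to make the example runnable.

```python
def forward_probability(obs, A, B, pi):
    """P(O | lambda) for a discrete HMM lambda = (A, B, pi), forward algorithm."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def recognize(obs, models):
    """Return the word whose model gives the observation sequence the highest probability."""
    return max(models, key=lambda word: forward_probability(obs, *models[word]))


# Two toy 2-state models over a 2-symbol codebook (hypothetical parameters).
A = [[0.5, 0.5], [0.0, 1.0]]
pi = [1.0, 0.0]
models = {
    "toi": (A, [[0.9, 0.1], [0.8, 0.2]], pi),   # mostly emits symbol 0
    "lui": (A, [[0.1, 0.9], [0.2, 0.8]], pi),   # mostly emits symbol 1
}
best = recognize([0, 0, 0], models)
```

For long observation sequences the products become very small, so practical implementations usually scale the forward variables or work with log probabilities.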
Based on the references and on information about successfully built recognition systems, we found that for speech recognition the HMM is usually chosen to be a left-right model with 5 to 6 states. In our experiments the 6-state model gave better results, so the authors built an HMM with 6 states in their program; see Figure 10.
Figure 10. Left-right HMM with 6 states.
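The structure of a left-right model can be sketched through its transition matrix. The sketch assumes that each state may only stay where it is or move to the next state (no skips) and uses an arbitrary initial self-transition probability; the actual topology in Figure 10 and the trained values are not given in the text.

```python
def left_right_transitions(num_states, stay=0.5):
    """Initial transition matrix of a left-right HMM without skip transitions."""
    A = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states - 1):
        A[i][i] = stay               # remain in state i
        A[i][i + 1] = 1.0 - stay     # advance to state i + 1
    A[-1][-1] = 1.0                  # the last state is absorbing
    return A


A6 = left_right_transitions(6)
pi6 = [1.0] + [0.0] * 5              # a left-right model starts in the first state
```

All entries below the diagonal are zero, so the model can never return to an earlier state, which matches the forward-in-time nature of a spoken word.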
3 MODEL OF THE CONTROLLED CAR SYSTEM
The diagram of the radio-controlled car driven by voice commands from a computer is shown in Figure 11.
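Figure 11. Overview diagram of the experimental system. Blocks shown: voice commands (lui, tới, phải, trái), remote control unit with switches SW1 to SW4, transmitting antenna, receiving antenna, and the on-board controller of the car (phải, trái, tới, lui).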
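The radio-controlled car can be driven remotely by voice from a computer. The spoken command word is captured and recognized by the speech recognizer, which passes the recognized word sequence to a decision block that issues the control command through the COM port. A computer interface circuit using the serial port (RS232) was designed for this purpose. The interface circuit receives the signal and opens or closes switches, which act as the inputs of the remote control unit. Whenever a switch is closed or a key combination is pressed, the remote control unit encodes the command and sends it to the transmitting antenna. The control signal is modulated and transmitted to the car by radio at a carrier frequency F_C = 27 MHz. The controller on the car then drives the car accordingly. The model works well with a vocabulary of 4 words, phải (right), trái (left), tới (forward) and lui (backward), with good results (99%).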
4 CONCLUSION
The experimental model for Vietnamese speech recognition based on the combination of MFCC and HMM still has many limitations, but it has met the objectives of the project. The program was used to control a robot with a small vocabulary (fewer than 16 words) and gave acceptable accuracy (above 90%). In the near future the authors will optimize the recognition program to obtain better results and faster processing.
REFERENCES
1. GS. Phạm Văn Ất, Kỹ thuật lập trình C, Nhà xuất bản Khoa Học và Kỹ Thuật, 1999.
2. Nguyễn Hoàng Hải, Nguyễn Khắc Kiểm, Lập trình Matlab, Nhà xuất bản Khoa Học và Kỹ Thuật, 2003.
3. PGS.TS. Nguyễn Hữu Phương, Xử lý tín hiệu số, Nhà xuất bản Giao thông vận tải, 2000.
4. Lê Tiến Thường, Xử lý tín hiệu số và wavelets, Nhà xuất bản Đại Học Quốc Gia TP. Hồ Chí Minh, 2002.
5. Claudio Becchetti and Lucio Prina Ricotti, Speech Recognition: Theory and C++ Implementation, John Wiley & Sons, Ltd, 2000.
6. Gordon E. Pelton, Voice Processing, McGraw-Hill, 1992.
7. John R. Deller, John G. Proakis and John H. L. Hansen, Discrete-Time Processing of Speech Signals, Macmillan Publishing Company, 1993.
8. F. J. Owens, Signal Processing of Speech, Macmillan, 1993.