
Speech Recognition Using Artificial Neural Networks

Neural networks are a tool capable of solving many difficult problems. In practice, research on neural networks offers an approach that differs from the traditional approaches of pattern recognition theory. In this paper the author discusses a speech recognition method that uses a neural network.

Introduction

There has been a great deal of research on speech recognition based on the theory of artificial intelligent systems. Many results have become commercial products such as ViaVoice and Dragon, security systems based on speech recognition, and voice-dialing telephone systems. Carrying this research through and putting it into practical use is highly meaningful work, especially in the country's current period of industrialization and modernization.

The speech recognition method discussed in this paper combines:
- Linear Predictive Coding (LPC), used to extract the basic features of speech.
- A back-propagation neural network, used to learn the patterns and to decide which object has been recognized.

Speech signal processing

Signal preprocessing converts speech from its waveform representation into a parametric representation. The parameters that represent the speech signal can be the short-time energy, the zero-crossing rate, and the level-crossing rate. There are many ways to extract acoustic information directly from the digital speech signal; the more effective and widely used approach is a spectral representation of the signal. Spectral analysis by linear predictive coding (LPC) extracts the basic characteristics of the speech signal, which serve as the input parameters of the speech recognition system. The method represents the speech sample at time n, x(n), as a linear approximation of the p past samples:

$$x(n) \approx a_1 x(n-1) + a_2 x(n-2) + \dots + a_p x(n-p)$$

where x(n) is the sample being predicted at time n and the coefficients $a_1, a_2, \dots, a_p$ are assumed constant over the speech analysis frame. Adding the excitation term $G\,u(n)$ gives:

$$x(n) = \sum_{i=1}^{p} a_i\, x(n-i) + G\,u(n)$$

where u(n) is the normalized excitation source and G is the excitation gain. The normalized excitation is scaled by the gain G and fed into an all-pole system to produce the speech signal. The excitation function must be either a periodic pulse train (for voiced sounds) or a random noise source (for unvoiced sounds). In this model the excitation source is therefore a switch, controlled by the voiced/unvoiced character of the speech, that selects either the periodic pulse train or the random noise. The excitation gain G is estimated directly from the speech signal. The parameters of the LPC model are thus the voiced/unvoiced classification, the position of the syllable peaks, and the filter coefficients $\{a_k\}$.

If $\tilde{x}(n)$ denotes the linear combination of the past samples:

$$\tilde{x}(n) = \sum_{k=1}^{p} a_k\, x(n-k)$$

then the linear prediction error is defined as:

$$e(n) = x(n) - \tilde{x}(n) = x(n) - \sum_{k=1}^{p} a_k\, x(n-k)$$

The basic problem of the LPC method is to determine the prediction coefficients so that the spectral characteristics of the digital filter in the speech synthesis model match the speech waveform within the analysis window. Because the spectrum of the speech signal varies over time, the prediction coefficients at time n must be estimated over a short interval around n. The basic approach is therefore to minimize the squared linear prediction error over a short segment of the speech waveform. In speech processing, short-time spectral analysis is usually performed on successive speech frames spaced 10 ms apart.

Pattern recognition theory and neural networks

Pattern recognition

Pattern recognition can be understood as the construction of a computer system able to sense, perceive, and identify physical objects in a way close to human ability. Tied to these three abilities, it is a very broad field that involves signal processing in multidimensional spaces, models, graphs, languages, databases, decision methods, and more. A recognition system must be able to reproduce the human cognitive process at the following levels:
- Level 1, sensing: detecting the existence of the observed objects, that is, the objects the system has to recognize. This level also covers data acquisition through the sensors of the recognition system. In a speech recognition system, for example, the object is speech and the input is acquired through a microphone or from .wav audio files.
- Level 2, perception: the learning process, in which the objects are modelled in order to form a classification of the objects to be recognized.
- Level 3, identification: given an observed object, answering the question of what the object is. This is the decision process.

Let X be an object to be recognized, $X = (x_1, x_2, x_3, \dots, x_n)$ with $x_i \in \mathbb{R}$.
Let $\mathcal{X}$ be the representation space of the objects:

$$\mathcal{X} = \{X_1, X_2, \dots, X_m\}$$

Let $\Omega$ be the interpretation space, that is, the set of names of the classes $C_1, C_2, \dots, C_n$:

$$\Omega = \{\omega_1, \omega_2, \dots, \omega_n\}$$

Recognizing an object then means finding the mapping rule from the representation space to the interpretation space, $f: \mathcal{X} \to \Omega$, such that $X_j \in C_k$ (object $X_j$ belongs to class $C_k$).

For a recognition system, the objects X are known (through observation, sensing, and measurement), while the interpretation space $\Omega$ and the rule $f$ are unknown. The problem is therefore to build a self-structuring system, which requires a learning process that goes from the observed objects (determining the space $\Omega$) to finding the rule $f$ (making the decision). Figure 1 shows the diagram of the recognition system.

Figure 1. Overview diagram of the recognition system

Artificial neural networks

Modelled on the operation of biological neurons, an artificial neural network is a system made up of many simple processing elements (neurons) operating in parallel. The behavior of the system depends on its structure, on the connection weights between neurons, and on the computation performed at each individual neuron. A neural network can learn from sample data and generalize from the samples it has learned.

Figure 2. Neural network model

A group of neurons is organized so that all of them receive the same input vector X and process it at the same time. The outputs of the group are produced simultaneously. Because each neuron has its own set of weights, the group produces as many different output signals as it has neurons. Such a group of neurons is called a layer. Several layers can be combined into a multilayer network. The layer that receives the input signal (the input vector x) is called the input layer; in practice it acts as a buffer that holds the input signal. The output signals of the network are produced by the output layer. Any layer between these two is called a hidden layer: it is internal to the network and has no contact with the outside environment. The number of hidden layers can range from zero to several.

The artificial neuron model requires three basic components:
- A set of connection weights, which play the role of the synapses.
- A summing unit (Sum), which adds up the products of the input signals and their connection weights.
- An activation function (squashing function), also called a transfer function, which limits the weighted input sum to a bounded output range.

In the artificial neuron model each neuron is connected to other neurons and receives signals $x_j$ from them with weights $w_j$. The total weighted input is:

$$Net = \sum_j w_j x_j$$

Speech recognition using a neural network

Figure 3 presents the basic functions of the speech recognition system, including the sub-functions of each block. To keep the implementation simple and the demand on computer resources moderate, all samples use 8 bits per sample, a sampling frequency of 22025 Hz, and mono sound.
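As a small illustration of the neuron model described in this section, the sketch below computes the weighted sum $Net = \sum_j w_j x_j$ and passes it through a squashing function. The logistic function and the names `neuron_output` and `layer_output` are assumptions made for the example; the text does not say which activation function the author used.

```python
import math

def neuron_output(weights, inputs):
    """Return (net, out) for one neuron: net = sum_j w_j * x_j, out = squash(net)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs))
    # Logistic squashing function (an assumption): maps net into (0, 1).
    out = 1.0 / (1.0 + math.exp(-net))
    return net, out

def layer_output(weight_rows, inputs):
    """Every neuron of a layer sees the same input vector; one output per neuron."""
    return [neuron_output(row, inputs)[1] for row in weight_rows]

if __name__ == "__main__":
    x = [1.0, 2.0, 0.5]
    print(neuron_output([0.5, -1.0, 2.0], x))
    print(layer_output([[0.5, -1.0, 2.0], [0.1, 0.1, 0.1]], x))
```

Each row of `weight_rows` holds the weights of one neuron, so a layer with k neurons produces k outputs from the same input vector, as described above.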

Figure 3. Block diagram of the speech recognition system

Speech signal analysis
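For orientation, the LPC front end that this section describes step by step (pre-emphasis, frame blocking, windowing, autocorrelation, Durbin recursion, cepstral conversion) might be sketched as follows. This is a minimal pure-Python sketch, not the author's implementation: the pre-emphasis factor 0.95, the Hamming window, the frame length of 256 samples, the frame shift of 128 samples, and all function names are assumptions. Only the 12 cepstral coefficients and the LPC order range of 8 to 16 come from the text.

```python
import math

def preemphasize(signal, alpha=0.95):
    """First-order FIR filter s'(n) = s(n) - alpha * s(n-1); alpha is an assumed value."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def frame_blocks(signal, frame_len, frame_shift):
    """Frames of N = frame_len samples; adjacent frames start M = frame_shift apart."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, frame_shift)]

def hamming(frame):
    """Taper the frame ends to reduce discontinuities (Hamming window assumed)."""
    size = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1)))
            for n, x in enumerate(frame)]

def autocorrelation(frame, order):
    """Autocorrelation values r(0)..r(order) of one frame."""
    return [sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
            for lag in range(order + 1)]

def levinson_durbin(r):
    """Durbin recursion: r(0..p) -> (predictor coefficients a_1..a_p, residual energy)."""
    order = len(r) - 1
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps from LPC coefficients a_1..a_p."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(1, m):
            if m - k <= p:
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

def cepstral_features(signal, frame_len=256, frame_shift=128, order=12, n_ceps=12):
    """One vector of n_ceps cepstral coefficients per frame."""
    features = []
    for frame in frame_blocks(preemphasize(signal), frame_len, frame_shift):
        a, _ = levinson_durbin(autocorrelation(hamming(frame), order))
        features.append(lpc_to_cepstrum(a, n_ceps))
    return features

if __name__ == "__main__":
    demo = [math.sin(0.01 * n * n) for n in range(512)]  # synthetic test signal
    print(len(cepstral_features(demo)), "frames of 12 coefficients")
```

Concatenating the vectors of L consecutive frames would give the 12*L-component input vector used by the network described later in this section.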

LPC analysis of the speech signal proceeds through the following steps:
- Pre-emphasis: the speech signal s(n) is passed through a low-order digital system (for example a low-order FIR filter) to flatten its spectrum. This system is either fixed or slowly adaptive.
- Frame blocking: the pre-emphasized speech signal is divided into L frames of N samples each, with adjacent frames separated by M samples.
- Windowing: each frame is passed through a window function to minimize the discontinuities at the start and end of the frame.
- Autocorrelation analysis: each windowed frame is autocorrelated. The highest autocorrelation lag is called the order of the LPC analysis and typically takes a value from 8 to 16.
- LPC analysis: Durbin's algorithm converts the autocorrelation coefficients into the set of LPC parameters.
- Conversion of the LPC parameters into spectral analysis coefficients. In the experiments, 12 cepstral coefficients are chosen as the features of the signal.

Description of the neural network used for recognition

- Choosing the number of nodes in each layer: following the experience of neural network experts with classification problems that use back-propagation networks, a single computational layer, a Kohonen layer, is used as the hidden layer. The number of neurons in each layer is determined as follows.
  + Number of input-layer neurons = dimension of the input vector. Here 12 cepstral coefficients are chosen as the features of a sample and each sample is processed in L speech frames, so the input layer has 12*L neurons. For example, with 5 frames per sample the input layer has 60 neurons.
  + Number of Kohonen-layer neurons = number of values in the answer sets. For example, to recognize 10 words, each described by an input vector of 60 components, the hidden layer has to store 600 solution values, so 625 hidden neurons are used (a 25x25 matrix of neurons).

  + Number of output-layer neurons = number of output results, encoded by the number of bits needed to represent that many results. For example, recognizing 128 words requires 7 output neurons, since 7 neurons can encode $2^7 = 128$ values.
- Competitive learning in the hidden layer and supervised learning at the output layer proceed in the following steps:
  + Weight initialization: the elements of the weight matrices are initialized with random values.
  + Reading the network input: the data in the sample file, which contains the training sample, is read. The result has two components: a one-dimensional array holding the input signal vector and a two-dimensional array holding the initial connection weight matrix of the Kohonen layer.
  + Adjusting the Kohonen-layer weight matrix: the connection weights of the Kohonen hidden-layer neurons are adjusted so that the network learns the samples as well as possible. From the one-dimensional array holding the input vector, the two-dimensional array holding the hidden-layer connection weights, and the learning constants $a_{min}$, $a_{max}$, $t_{max}$, this function computes the updated hidden-layer weight matrix according to:

$$HidWeight^{new}(i,j) = HidWeight^{old}(i,j) + rate(t)\; topo(Winner, i)\; \bigl(InVec(j) - HidWeight^{old}(i,j)\bigr)$$

  + Adjusting the output-layer weight matrix: the connection weights of the output-layer neurons are adjusted to build a lookup table. From the one-dimensional array holding the output signal vector and the two-dimensional array holding the output-layer connection weights, this function sets the output-layer weight matrix according to:

$$OutWeight(Winner, k) = OutVec(k)$$

Recognition method

- Input: a wave file containing the speech signal to be recognized and a dat file containing the connection weights of the hidden-layer and output-layer neurons. The sound source can also be a microphone connected through the sound card, in which case the speech data is read from the Windows audio data buffer.
- Output: the recognition result.
- The recognition process is carried out in the following steps:

  + Reading the input signal: the data is read from the wav file or from the audio data buffer.
  + Processing the signal, using the same LPC analysis function as above.
  + Reading the connection weight matrices of the hidden and output layers of the network.
  + Determining the central (winning) neuron.
  + Looking up the result: the topological map of the neural network is consulted to return the recognized value.

Conclusion

This paper has presented a study and an experiment that use linear predictive coding to analyze the signal and a back-propagation neural network with a self-organizing Kohonen layer to accumulate knowledge in the network. This forms the basis for recognizing speech by looking up the network's topological map to reach a conclusion about the speech fed into the system. In practice, the experimental system can recognize a number of Vietnamese vowels.

The LPC method is quite widely used in speech processing because it provides a good model of the speech signal. LPC is also relatively simple and easy to implement in both hardware and software while still ensuring accuracy. Its drawback is that it does not handle the dynamic properties of the glottis, which reduces the performance of recognition systems, especially speaker-independent ones.

Proposed directions for development: to raise the recognition performance of the system, and in line with developments in modern computer science, this research can be extended in the following directions:
- Fuzzy neural networks, developed along three lines:
  + Neural networks with fuzzy inputs and fuzzy weights.
  + Neural networks used to determine membership functions.
  + Fuzzy inference with neural networks.

- Genetic algorithms for optimizing the structure of the neural network. These further directions follow the model of hybrid intelligent systems, namely: Hybrid Intelligent System = Neural Networks + Expert System + Genetic Algorithms + Fuzzy Logic.
- Other digital signal processing methods, as a basis for improving the system and raising its recognition quality. Hidden Markov models and vector quantization are also proposed as additions to the system.
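The training and recognition procedure described in the sections on the network and the recognition method can be sketched in a few lines. The sketch below follows the two update formulas given there (the Kohonen-layer update and the output-layer lookup table) and the binary coding of output classes. Several details are assumptions, because the text does not specify them: the linear decay used for `rate(t)`, the Gaussian form of `topo`, Euclidean distance for choosing the winner, and all function names.

```python
import math
import random

def encode_index(index, n_bits):
    """Binary code of a class index, least significant bit first (7 bits cover 128 words)."""
    return [(index >> b) & 1 for b in range(n_bits)]

def find_winner(hid_weight, in_vec):
    """Hidden neuron whose weight vector is nearest to the input (Euclidean distance)."""
    return min(range(len(hid_weight)),
               key=lambda i: sum((x - w) ** 2 for x, w in zip(in_vec, hid_weight[i])))

def rate(t, a_min, a_max, t_max):
    """Assumed schedule: linear decay from a_max at t = 0 to a_min at t = t_max."""
    return a_max - (a_max - a_min) * min(t, t_max) / t_max

def topo(winner, i, grid_width, radius=1.0):
    """Assumed Gaussian neighbourhood between two neurons on a square grid."""
    d_row = winner // grid_width - i // grid_width
    d_col = winner % grid_width - i % grid_width
    return math.exp(-(d_row ** 2 + d_col ** 2) / (2.0 * radius ** 2))

def update_hidden(hid_weight, in_vec, winner, learning_rate, grid_width):
    """HidWeight(i,j) += rate * topo(Winner, i) * (InVec(j) - HidWeight(i,j)), in place."""
    for i, row in enumerate(hid_weight):
        step = learning_rate * topo(winner, i, grid_width)
        for j, x in enumerate(in_vec):
            row[j] += step * (x - row[j])

def train(samples, grid_width, a_min=0.01, a_max=0.5, t_max=100, seed=0):
    """samples: list of (in_vec, out_vec) pairs. Returns (hid_weight, out_weight)."""
    rng = random.Random(seed)
    n_hidden = grid_width * grid_width
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Random initial weights for the Kohonen (hidden) layer.
    hid_weight = [[rng.random() for _ in range(n_in)] for _ in range(n_hidden)]
    for t in range(t_max):
        for in_vec, _ in samples:
            winner = find_winner(hid_weight, in_vec)
            update_hidden(hid_weight, in_vec, winner,
                          rate(t, a_min, a_max, t_max), grid_width)
    # Supervised step: OutWeight(Winner, k) = OutVec(k) builds the lookup table.
    out_weight = [[0] * n_out for _ in range(n_hidden)]
    for in_vec, out_vec in samples:
        out_weight[find_winner(hid_weight, in_vec)] = list(out_vec)
    return hid_weight, out_weight

def recognize(hid_weight, out_weight, in_vec):
    """Find the winning neuron for the input and look up its stored output code."""
    return out_weight[find_winner(hid_weight, in_vec)]

if __name__ == "__main__":
    toy = [([0.0, 0.0], encode_index(0, 1)), ([1.0, 1.0], encode_index(1, 1))]
    hid, out = train(toy, grid_width=3)
    print(recognize(hid, out, [0.9, 1.0]))
```

In the system described in the paper the input vectors would be the 12*L cepstral coefficients of each word and the hidden layer a 25x25 grid; the two-component toy vectors here only keep the example short. If two training samples end up with the same winning neuron, the later one overwrites the earlier entry in the lookup table, which is one reason the hidden layer is given more neurons than there are stored values.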
