Tutorial on LP


Linear prediction is one of the most important tools in speech processing. It can be utilized in many ways, but for speech processing its most important property is the ability to model the vocal tract. It can be shown that the lattice-structured (lossless tube) model of the vocal tract is an all-pole filter, i.e., a filter that has only poles. One can also think of it this way: the lack of zeros restricts the filter to emphasizing certain frequencies, which in this case are the formant frequencies of the vocal tract. In reality the vocal tract is not composed of lossless uniform tubes, but in practice modeling the vocal tract with an all-pole filter works fine. Linear prediction (LP) is a useful method for estimating the parameters of this all-pole filter from a recorded speech signal. Let us first study an example of the usefulness of LP in this respect. Figure 1 presents a 30 ms window of the vowel [a] with a sampling frequency of 16 kHz. Its amplitude spectrum can be found in figure 2, showing the fundamental frequency (dense peaks) and the formants (broad peaks in the spectral envelope). The same figure also shows the amplitude response of a 20th-degree LP model, which models the broad-peak envelope very well.

[Figure 1: windowed speech frame ("ikkunoitu äänne", windowed phoneme); x-axis: sample index from 50 to 500, y-axis: amplitude from -0.8 to 0.8.]

5.1 Linear Prediction

The term linear prediction refers to the prediction of the output $y(n)$ of a linear system based on its input $x(n)$ and previous outputs $y(n-1), y(n-2), \ldots$:

$$\hat{y}(n) = \sum_{k=1}^{p} a(k)\, y(n-k) + \sum_{k=0}^{q} b(k)\, x(n-k) \tag{1}$$

[Figure 2: "Amplitudispektri ja LPC-spektri" (amplitude spectrum and LPC spectrum); x-axis: frequency ("taajuus") from 0 to 8000 Hz, y-axis: magnitude in dB.]

Figure 2: Amplitude spectrum and LP spectrum.

The notation $\hat{y}(n)$ refers to the estimate or prediction of $y(n)$. The idea is that once we know the input $x(n)$ and the output $y(n)$, we would like to predict the behaviour of the unknown system $H(z)$, as illustrated in figure 3. In the figure the output has been delayed so that we can't use the real output. The problem is now to determine the constants $a(k)$ and $b(k)$ in such a way that $\hat{y}(n)$ approximates the real output as accurately as possible. The following terms describe the model:

- autoregressive (AR) model: The output is predicted using only previous outputs and the current input, which means that $b(k) = 0$ for $k \neq 0$, so only the $a(k)$ and $b(0)$ must be determined. This corresponds to an all-pole filter.
- moving average (MA) model: In this model the prediction is based only on the input, which gives $a(k) = 0$. This model corresponds to an FIR filter.


- autoregressive moving average (ARMA) model: This is the general model of equation (1), corresponding to a general linear recursive filter.

In speech processing the AR model is preferred for the following reasons:

- the input (the excitation signal at the vocal cords) is unknown
- the parameters $a(k)$ are computationally easy to determine
- as shown before, the vocal tract is theoretically an all-pole filter (excluding nasal sounds)


[Figure 3: block diagram with input $x(n)$ feeding the unknown system $H(z)$ and the filter $B(z)$; the output $y(n)$ passes through a delay $z^{-1}$ and the filter $A(z)$; the predictor output is $\hat{y}(n)$.]

Figure 3: Prediction of the output of the unknown system $H(z)$ based on its input and previous outputs. In speech processing $H(z)$ corresponds to the vocal tract, and the input is usually unavailable.

- an AR model of sufficiently high degree can also model an ARMA model
- a stable all-pole model can be used to represent the amplitude response of any system with any desired precision (however, the degree of the required all-pole model may be considerably high)
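The three model classes can be tried out numerically by running the same input through each filter type. Below is a minimal numpy sketch of equation (1); the coefficient values and the helper name `filter_arma` are illustrative choices, not from the text:

```python
import numpy as np

def filter_arma(a, b, x):
    """General linear recursive filter as in equation (1):
    y(n) = sum_{k=1..p} a(k) y(n-k) + sum_{k=0..q} b(k) x(n-k)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        # feedforward (zero) part, based on the input
        y[n] = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        # feedback (pole) part, based on previous outputs
        y[n] += sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(256)

y_ar = filter_arma(a=[1.3, -0.8], b=[1.0], x=x)         # AR: b(k) = 0 for k != 0
y_ma = filter_arma(a=[], b=[0.5, 0.3, 0.2], x=x)        # MA: a(k) = 0, an FIR filter
y_arma = filter_arma(a=[1.3, -0.8], b=[0.5, 0.3], x=x)  # ARMA: the general case
```

The AR case is the all-pole filter that models the vocal tract in the rest of this section.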

The AR model corresponds to an all-pole filter with transfer function

$$H(z) = \frac{G}{A(z)}, \quad \text{where} \quad A(z) = 1 - \sum_{k=1}^{p} a(k)\, z^{-k} \tag{2}$$

and $G$ denotes the gain. The transfer function is the ratio of the z-transformed output $Y(z)$ to the z-transformed input $X(z)$,

$$H(z) = \frac{Y(z)}{X(z)} = \frac{G}{A(z)},$$

which implies

$$A(z)\, Y(z) = G\, X(z). \tag{3}$$

Taking the inverse z-transform of (3) yields the time-domain relation

$$y(n) - \sum_{k=1}^{p} a(k)\, y(n-k) = G\, x(n),$$

which is

$$y(n) = \sum_{k=1}^{p} a(k)\, y(n-k) + G\, x(n) \tag{4}$$

where $x(n)$ is the input, $y(n)$ is the response, and $a(1), \ldots, a(p)$ are the coefficients of the filter $A(z)$. In other words, the output of an all-pole system can be predicted perfectly if the input and the previous outputs are known. In practice the prediction is never perfect, since real systems are neither linear nor of all-pole type, and there is generally some noise in the output. Moreover, in speech processing the input $x(n)$ is unknown. Nevertheless, the vocal tract (as well as any other system) can be modeled using an all-pole model, and in this case the model works really well. So, by getting rid of the dependence on the input $x(n)$ in equation (4), we end up with the following model that will be used from now on:

$$\hat{y}(n) = \sum_{k=1}^{p} a(k)\, y(n-k)$$
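Equation (4) says that the all-pole output is determined exactly by the previous outputs and the current input. A small numpy check, with arbitrary illustrative coefficients $a(1)$, $a(2)$ and gain $G$:

```python
import numpy as np

a1, a2 = 1.2, -0.6   # illustrative prediction coefficients (stable filter)
G = 0.8              # gain

rng = np.random.default_rng(1)
x = rng.standard_normal(200)

# synthesize the all-pole output y(n) = a1*y(n-1) + a2*y(n-2) + G*x(n)
y = np.zeros(len(x))
for n in range(len(x)):
    y[n] = G * x[n]
    if n >= 1:
        y[n] += a1 * y[n - 1]
    if n >= 2:
        y[n] += a2 * y[n - 2]

# with the input and the previous outputs known, the "prediction" is exact
n = 100
y_hat = a1 * y[n - 1] + a2 * y[n - 2] + G * x[n]
```

Dropping the term $G\,x(n)$ from `y_hat` gives the prediction model used from here on; the remaining mismatch is exactly the prediction error that LP analysis minimizes.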

The hat over $y$ refers to the estimate of $y$. Our goal is to determine the parameters $a(1), \ldots, a(p)$ so that $\hat{y}(n)$ is close to the recorded speech in some frame of the signal, i.e., so that the prediction error is minimized. Once the parameters have been determined we may, according to equation (2), use the following model of the vocal tract:

$$H(z) = \frac{G}{A(z)}, \quad \text{where} \quad A(z) = 1 - \sum_{k=1}^{p} a(k)\, z^{-k},$$

but we are now mainly interested in the prediction coefficients $a(k)$.

5.2 Autocorrelation Method

The parameters $a(1), \ldots, a(p)$ are to be determined so that the sum of squared errors

$$E = \sum_{n} \big( y(n) - \hat{y}(n) \big)^2$$

is minimized over all indices. In practice the sum is finite due to the finiteness of the signal, but it is useful to think of the frame as infinitely long with only a few samples nonzero. In the following, the output $y(n)$ will be denoted $s(n)$ (s referring to speech). So we have a windowed speech signal in which only a finite number of samples are nonzero. With given prediction coefficients $a(1), \ldots, a(p)$, the energy of the prediction error can be written as

$$E = \sum_{n=-\infty}^{\infty} e^2(n) = \sum_{n} \big( s(n) - \hat{s}(n) \big)^2 = \sum_{n} \Big( s(n) - \sum_{k=1}^{p} a(k)\, s(n-k) \Big)^2,$$

where $p$ is the length of the prediction filter and $\hat{s}(n)$ is the estimate of $s(n)$. By adopting the convention $a(0) = -1$, the energy becomes

$$E = \sum_{n} \Big( \sum_{k=0}^{p} a(k)\, s(n-k) \Big)^2.$$


Let us minimize $E$ by choosing suitable coefficients $a(1), \ldots, a(p)$. A necessary condition for the optimality of the choice of $a(j)$ is that the partial derivative of $E$ with respect to the variable $a(j)$ equals zero. Notice that $E$ depends on the variables $a(1), \ldots, a(p)$, so it could be written as $E(a(1), \ldots, a(p))$, but we omit this to keep the notation short. So let's differentiate! The partial derivative with respect to $a(j)$ ($j = 1, 2, \ldots, p$) is

$$\frac{\partial E}{\partial a(j)} = \frac{\partial}{\partial a(j)} \sum_{n} \Big( \sum_{k=0}^{p} a(k)\, s(n-k) \Big)^2 = \sum_{n} 2 \Big( \sum_{k=0}^{p} a(k)\, s(n-k) \Big)\, s(n-j),$$

where the differentiation rule $\frac{d}{dx} f(x)^2 = 2 f(x) f'(x)$ has been utilized. By regrouping this we get

$$\frac{\partial E}{\partial a(j)} = 2 \sum_{k=0}^{p} a(k) \sum_{n} s(n-k)\, s(n-j) = 2 \sum_{k=0}^{p} a(k)\, c(k,j),$$

where

$$c(k,j) = \sum_{n} s(n-k)\, s(n-j).$$

This is the correlation of the signal $s$ with itself, with delay $k - j$, which is seen by the change of variable $m = n - j$:

$$c(k,j) = \sum_{n} s(n-k)\, s(n-j) = \sum_{m} s(m-(k-j))\, s(m) = \sum_{m} s(m)\, s(m-(k-j)).$$

Moreover, the term $c(k,j)$ depends only on the value $k - j$, so it can be denoted by the one-variable autocorrelation function

$$c(k,j) = r(k-j) = r(j-k).$$


Setting the derivatives to zero and using $a(0) = -1$ gives the equations

$$\sum_{k=1}^{p} a(k)\, r(j-k) = r(j), \qquad j = 1, 2, \ldots, p,$$

or, in matrix form,

$$\begin{pmatrix} r(0) & r(1) & r(2) & \cdots & r(p-1) \\ r(1) & r(0) & r(1) & \cdots & r(p-2) \\ r(2) & r(1) & r(0) & \cdots & r(p-3) \\ \vdots & & & \ddots & \vdots \\ r(p-1) & r(p-2) & r(p-3) & \cdots & r(0) \end{pmatrix} \begin{pmatrix} a(1) \\ a(2) \\ a(3) \\ \vdots \\ a(p) \end{pmatrix} = \begin{pmatrix} r(1) \\ r(2) \\ r(3) \\ \vdots \\ r(p) \end{pmatrix},$$

where $r(k) = r(-k)$. Notice that the coefficient matrix is symmetric (due to $r(k) = r(-k)$) and Toeplitz (due to $c(k,j) = r(k-j)$), which is crucial when deriving a fast computational method for finding the coefficients $a(1), \ldots, a(p)$.
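The autocorrelation method can be carried out directly by computing the autocorrelation values of the windowed frame and solving the resulting Toeplitz system. A minimal numpy sketch (`lp_coefficients` is an illustrative helper name; a library routine such as Matlab's lpc does the same job):

```python
import numpy as np

def lp_coefficients(s, p):
    """Autocorrelation method: solve sum_k a(k) r(|j-k|) = r(j), j = 1..p."""
    s = np.asarray(s, dtype=float)
    # autocorrelation values r(0)..r(p) of the (windowed) frame
    r = np.array([np.dot(s[:len(s) - k], s[k:]) for k in range(p + 1)])
    # symmetric Toeplitz coefficient matrix
    R = np.array([[r[abs(j - k)] for k in range(p)] for j in range(p)])
    return np.linalg.solve(R, r[1:])

# sanity check: estimate the coefficients of a synthetic AR(2) signal
rng = np.random.default_rng(2)
s = np.zeros(4000)
e = 0.01 * rng.standard_normal(4000)
for n in range(2, 4000):
    s[n] = 1.3 * s[n - 1] - 0.8 * s[n - 2] + e[n]
a = lp_coefficients(s, 2)   # close to (1.3, -0.8)
```

The direct matrix solve shown here costs on the order of $p^3$ operations; the next subsection exploits the Toeplitz structure to do the same job far more cheaply.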

5.2.1 Levinson-Durbin Recursion

Recap: at this point we have derived the equations (the so-called normal equations) for the prediction coefficients $a(1), \ldots, a(p)$ based on the minimization of the prediction error. The coefficients could now be solved by inverting the autocorrelation matrix, but this is computationally rather demanding. To help us, Levinson and Durbin have developed an efficient algorithm for solving a symmetric Toeplitz-type group of equations. The basic idea is to solve the matrix equation

$$R\, \mathbf{a} = \mathbf{r}$$

in steps, that is, by increasing the length of the vector $\mathbf{a}$ using the previous solution. The optimal coefficients satisfy

$$E = r(0) - \sum_{k=1}^{p} a(k)\, r(k),$$

where $E$ is the sum of squares of the prediction error (more information can be found, for instance, in the book T. W. Parsons, Voice and Speech Processing, McGraw-Hill, Inc., 1987). By using this, the normal equations and the error can be collected into the augmented form


$$\begin{pmatrix} r(0) & r(1) & \cdots & r(p) \\ r(1) & r(0) & \cdots & r(p-1) \\ \vdots & & \ddots & \vdots \\ r(p) & r(p-1) & \cdots & r(0) \end{pmatrix} \begin{pmatrix} 1 \\ -a(1) \\ \vdots \\ -a(p) \end{pmatrix} = \begin{pmatrix} E \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

The matrix on the left is still symmetric and Toeplitz. Assume that we have already solved the equation for degree $p = m$. Let us now see how this helps us to solve the equation for $p = m+1$; in the following, the subscript refers to the degree of the equation. So this is what we have already solved:

$$R_{m+1}\, \mathbf{a}_m = \begin{pmatrix} E_m \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad \mathbf{a}_m = \big(1,\ -a_m(1),\ \ldots,\ -a_m(m)\big)^{T}.$$

Symmetric Toeplitz matrices (and only they) have the nice property that when the coefficient vector and the result vector are flipped upside down (switch the last and the first element, the second-to-last and the second, and so on), the equation is still satisfied:

$$R_{m+1}\, \tilde{\mathbf{a}}_m = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ E_m \end{pmatrix},$$

where the tilde denotes the flipped vector. Let us now try the following kind of solution for the bigger group of equations:

$$\mathbf{a}_{m+1} = \begin{pmatrix} \mathbf{a}_m \\ 0 \end{pmatrix} - k_{m+1} \begin{pmatrix} 0 \\ \tilde{\mathbf{a}}_m \end{pmatrix}, \qquad \text{which gives} \qquad R_{m+2}\, \mathbf{a}_{m+1} = \begin{pmatrix} E_m \\ 0 \\ \vdots \\ 0 \\ q \end{pmatrix} - k_{m+1} \begin{pmatrix} q \\ 0 \\ \vdots \\ 0 \\ E_m \end{pmatrix},$$


where

$$q = r(m+1) - \sum_{j=1}^{m} a_m(j)\, r(m+1-j).$$

For this to be a solution, we only require that all the elements except the first one in the vector on the right side are equal to zero. This holds if

$$q - k_{m+1} E_m = 0,$$

in other words

$$k_{m+1} = \frac{q}{E_m} = \frac{1}{E_m}\Big( r(m+1) - \sum_{j=1}^{m} a_m(j)\, r(m+1-j) \Big).$$

We also notice that the first element of the right-hand side becomes the prediction error of the new solution:

$$E_{m+1} = E_m - k_{m+1}\, q = E_m - k_{m+1}^2\, E_m = \big(1 - k_{m+1}^2\big) E_m.$$

We have thus found that by trying a vector that is the sum of the lower-degree solution and its flipped version multiplied by a constant, we get a solution to the problem of higher degree. The same deduction works in general when increasing the size from $m$ to $m+1$. Thus, the results of one recursion step are

$$k_{m+1} = \frac{1}{E_m}\Big( r(m+1) - \sum_{j=1}^{m} a_m(j)\, r(m+1-j) \Big),$$
$$a_{m+1}(j) = a_m(j) - k_{m+1}\, a_m(m+1-j), \qquad j = 1, \ldots, m,$$
$$a_{m+1}(m+1) = k_{m+1},$$
$$E_{m+1} = \big(1 - k_{m+1}^2\big) E_m.$$

Because $E_m \ge 0$ for every $m$ ($E_m$ is the prediction error of the $m$th-degree filter), it follows from $E_{m+1} = (1 - k_{m+1}^2)\, E_m$ that

$$|k_m| \le 1.$$

The values $k_m$ are called reflection coefficients. The Levinson-Durbin recursion is started with the condition

$$E_0 = r(0),$$

which may be thought of as the error of the 0th-degree predictor (no prediction at all). There exist also other methods and variations for solving the coefficients, but the Levinson-Durbin recursion is the most commonly used one. Besides, calculating the coefficients in this way guarantees that the absolute values of the reflection coefficients are always at most one, yielding a stable filter.
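The recursion steps derived above can be written down compactly. A numpy sketch (the function name is an illustrative choice); it reproduces the direct matrix solution of the normal equations at a fraction of the cost:

```python
import numpy as np

def levinson_durbin(r, p):
    """Solve the normal equations from autocorrelation values r(0)..r(p).
    Returns the prediction coefficients a(1)..a(p), the final prediction
    error E_p, and the reflection coefficients k_1..k_p."""
    a = np.zeros(p)
    k = np.zeros(p)
    E = r[0]                        # E_0 = r(0): error of the 0th-degree predictor
    for m in range(p):
        # k_{m+1} = (r(m+1) - sum_j a_m(j) r(m+1-j)) / E_m
        k[m] = (r[m + 1] - np.dot(a[:m], r[m:0:-1])) / E
        # a_{m+1}(j) = a_m(j) - k_{m+1} a_m(m+1-j);  a_{m+1}(m+1) = k_{m+1}
        a[:m] = a[:m] - k[m] * a[:m][::-1]
        a[m] = k[m]
        E *= 1.0 - k[m] ** 2        # E_{m+1} = (1 - k_{m+1}^2) E_m
    return a, E, k

# agrees with solving the Toeplitz system directly (illustrative r values)
r = np.array([2.0, 1.2, 0.6, 0.2])
a, E, k = levinson_durbin(r, 3)
R = np.array([[r[abs(i - j)] for j in range(3)] for i in range(3)])
a_direct = np.linalg.solve(R, r[1:])
```

Each recursion step costs on the order of $m$ operations, so the whole solve is $O(p^2)$ instead of the $O(p^3)$ of a general matrix solve, and the reflection coefficients come out for free.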

The degree $p$ of the model is chosen by considering that one pole pair corresponds to one formant, and because there is approximately one formant per kilohertz of bandwidth, the degree is usually the same as the sampling frequency in kHz. For instance, when the sampling frequency is 8 kHz, the degree of the model is 8. In practice, to compensate for the inaccuracies of the model (the AR assumption and others), the degree is usually chosen to be a little higher. For instance, with a sampling frequency of 8 kHz a reasonable model degree is 10 or 12, and with a 16 kHz sampling frequency the degree should be 18 or 20.

The LP analysis method discussed above is perhaps the most important method in speech processing. In speech coding, for instance, it is used to code the excitation and vocal tract contributions separately; in speech recognition it gives information about the spectrum of the speech (and in this way about the phoneme); and in speech synthesis it enables controlling the vocal tract and the excitation separately. In Matlab, LPC (or LP) analysis is implemented by the command lpc.
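The rule of thumb above, sampling frequency in kHz plus a small margin, fits in a one-line helper (the name `lp_order` and the choice of +2 as the margin are illustrative; the text also mentions +4 as reasonable):

```python
def lp_order(fs_hz):
    """Model degree: about one formant (pole pair) per kHz of sampling
    frequency, plus a small margin for the non-ideal parts of the model."""
    return int(fs_hz / 1000) + 2

# lp_order(8000) gives 10 and lp_order(16000) gives 18,
# matching the examples in the text
```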

