The z-transform of the autocorrelation sequence \gamma_{xx}(m) is

\Gamma_{xx}(z) = \sum_{m=-\infty}^{\infty} \gamma_{xx}(m)\, z^{-m}        (2.1)
from which we obtain the power spectral density by evaluating \Gamma_{xx}(z) on the unit circle (that is, by substituting z = e^{j2\pi f}).
Assume that \log \Gamma_{xx}(z) is analytic (possesses derivatives of all orders) in an annular region in the z-plane that includes the unit circle. Then \log \Gamma_{xx}(z) may be expanded in a Laurent series of the form, for z = e^{j2\pi f},
\log \Gamma_{xx}(f) = \sum_{m=-\infty}^{\infty} v(m)\, e^{-j2\pi fm}        (2.2)
where the v(m) are the coefficients in the series expansion. Further, v(m) may be viewed as the sequence with z-transform V(z) = \log \Gamma_{xx}(z). Thus
\Gamma_{xx}(z) = \exp\!\left[\sum_{m=-\infty}^{\infty} v(m)\, z^{-m}\right] = \sigma_w^2\, H(z)\, H(z^{-1})        (2.3)
where, by definition, \sigma_w^2 = \exp[v(0)] and

H(z) = \exp\!\left[\sum_{m=1}^{\infty} v(m)\, z^{-m}\right], \qquad |z| > r_1
On evaluating the above equation on the unit circle, we have the equivalent representation of the power spectral density as

\Gamma_{xx}(f) = \sigma_w^2\, |H(f)|^2        (2.4)
The filter with the system function H(z) is analytic in the region |z| > r_1, where r_1 < 1. Hence, in this region, it has a Taylor series expansion as a causal system of the form

H(z) = \sum_{n=0}^{\infty} h(n)\, z^{-n}        (2.5)
The output of this filter in response to a white noise input sequence w(n) with power spectral density \sigma_w^2 is a stationary random process x(n) with power spectral density \Gamma_{xx}(f) = \sigma_w^2\, |H(f)|^2.
Conversely, the stationary random process x(n) with the power spectral density \Gamma_{xx}(f) = \sigma_w^2 |H(f)|^2 may be transformed into a white noise process by passing x(n) through a linear filter with system function 1/H(z). This filter is called a noise-whitening filter. Its output, denoted w(n), is called the innovations process associated with the stationary random process x(n).
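This generation/whitening duality is easy to demonstrate numerically. The following minimal MATLAB sketch assumes a hypothetical first-order all-pole model H(z) = 1/(1 - 0.9z^{-1}); it is an illustration, not a filter estimated from speech:

% Minimal sketch of the shaping/whitening duality (hypothetical H(z)).
w = randn(1, 1000);         % white noise innovations, sigma_w^2 = 1
a = [1 -0.9];               % denominator of H(z) = 1/A(z)
x = filter(1, a, w);        % shaping filter H(z): white noise -> x(n)
w_rec = filter(a, 1, x);    % whitening filter 1/H(z): x(n) -> innovations
max(abs(w - w_rec))         % zero to machine precision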
Figure 2.2: Filters for generating the random process from white noise, and the inverse filter. [Block diagram: white noise w(n) drives the linear causal filter H(z) to give x(n) = \sum_{k=0}^{\infty} h(k)\, w(n-k); x(n) drives the linear causal filter 1/H(z) to recover the white noise w(n).]
2.2 Rational Power Spectra
Consider the case in which the power spectral density of the stationary random process x(n) is a rational function, expressed as

\Gamma_{xx}(z) = \sigma_w^2\, \frac{B(z)\, B(z^{-1})}{A(z)\, A(z^{-1})}, \qquad r_1 < |z| < r_2        (2.6)
where the polynomials B(z) and A(z) have roots that fall inside the unit circle in the z-plane. Then the linear filter H(z) for generating the random process x(n) from the white noise sequence w(n) is also rational and is expressed as
H(z) = \frac{B(z)}{A(z)} = \frac{\sum_{k=0}^{q} b_k\, z^{-k}}{1 + \sum_{k=1}^{p} a_k\, z^{-k}}, \qquad |z| > r_1        (2.7)
where b_k and a_k are the filter coefficients that determine the locations of the zeros and poles of H(z), respectively. Thus H(z) is causal, stable, and minimum phase. Its reciprocal 1/H(z) is also a causal, stable, and minimum-phase linear system. Therefore the random process x(n) uniquely represents the statistical properties of the innovations process w(n), and vice versa.
For the linear system with the rational system function H(z) given by the above equation, the output x(n) is related to the input w(n) by the difference equation

x(n) + \sum_{k=1}^{p} a_k\, x(n-k) = \sum_{k=0}^{q} b_k\, w(n-k)        (2.8)
We distinguish among three special cases:
Autoregressive (AR) process: b_0 = 1, b_k = 0, k > 0.
In this case the linear filter H(z) = 1/A(z) is an all-pole filter, and the difference equation for the input-output relationship is

x(n) + \sum_{k=1}^{p} a_k\, x(n-k) = w(n)        (2.9)

In turn, the noise-whitening filter for generating the innovations process is an all-zero filter.
Moving average (MA) process: a_k = 0, k >= 1.
In this case the linear filter H(z) = B(z) is an all-zero filter, and the difference equation for the input-output relationship is

x(n) = \sum_{k=0}^{q} b_k\, w(n-k)        (2.10)

The noise-whitening filter for generating the innovations process is an all-pole filter.
Autoregressive, moving average (ARMA) process: In this case the linear filter H(z) = B(z)/A(z) has both poles and zeros in the z-plane, and the corresponding difference equation is given by 2.8. The inverse system for generating the innovations process w(n) from x(n) is also a pole-zero system of the form 1/H(z) = A(z)/B(z).
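To make the three cases concrete, here is a minimal MATLAB sketch with hypothetical coefficients (illustrative values only, not models used elsewhere in this report):

% Simulating the three special cases of 2.8 with filter().
w = randn(1, 2000);                     % white innovations w(n)
x_ar   = filter(1, [1 -0.7 0.2], w);    % AR:   H(z) = 1/A(z), all-pole
x_ma   = filter([1 0.5 0.3], 1, w);     % MA:   H(z) = B(z),   all-zero
x_arma = filter([1 0.5], [1 -0.7], w);  % ARMA: H(z) = B(z)/A(z)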
2.3 Relationships between the Filter Parameters and the
Autocorrelation Sequence
When the power spectral density of the stationary random process is a rational function, there is a basic relationship between the autocorrelation sequence \gamma_{xx}(m) and the parameters a_k and b_k of the linear filter H(z) that generates the process by filtering the white noise sequence w(n). This relationship may be obtained by multiplying the difference equation in 2.8 by x^*(n-m) and taking the expected value of both sides of the resulting equation, to get

\gamma_{xx}(m) = -\sum_{k=1}^{p} a_k\, \gamma_{xx}(m-k) + \sum_{k=0}^{q} b_k\, \gamma_{wx}(m-k)        (2.11)
where

\gamma_{wx}(m) = E\!\left[\sum_{k=0}^{\infty} h(k)\, w^*(n-k)\, w(n+m)\right] = \sigma_w^2\, h(-m)        (2.12)

where, in the last step, it was assumed that the sequence w(n) is white.
From 2.12 the following relationship is obtained:
\gamma_{xx}(m) = \begin{cases} -\sum_{k=1}^{p} a_k\, \gamma_{xx}(m-k), & m > q \\[4pt] -\sum_{k=1}^{p} a_k\, \gamma_{xx}(m-k) + \sigma_w^2 \sum_{k=0}^{q-m} h(k)\, b_{k+m}, & 0 \le m \le q \\[4pt] \gamma_{xx}^*(-m), & m < 0 \end{cases}        (2.13)
This represents a nonlinear relationship between \gamma_{xx}(m) and the parameters a_k and b_k. For an AR process, 2.13 simplifies to
\gamma_{xx}(m) = \begin{cases} -\sum_{k=1}^{p} a_k\, \gamma_{xx}(m-k), & m > 0 \\[4pt] -\sum_{k=1}^{p} a_k\, \gamma_{xx}(m-k) + \sigma_w^2, & m = 0 \\[4pt] \gamma_{xx}^*(-m), & m < 0 \end{cases}        (2.14)
Thus a linear relationship is obtained between \gamma_{xx}(m) and the parameters a_k, which may be expressed in matrix form as
\begin{bmatrix} \gamma_{xx}(0) & \gamma_{xx}^*(1) & \cdots & \gamma_{xx}^*(p) \\ \gamma_{xx}(1) & \gamma_{xx}(0) & \cdots & \gamma_{xx}^*(p-1) \\ \vdots & \vdots & & \vdots \\ \gamma_{xx}(p) & \gamma_{xx}(p-1) & \cdots & \gamma_{xx}(0) \end{bmatrix} \begin{bmatrix} 1 \\ a_1 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \sigma_w^2 \\ 0 \\ \vdots \\ 0 \end{bmatrix}        (2.15)
This correlation matrix is Toeplitz, and hence it can be efficiently inverted by the use of the Levinson-Durbin algorithm, as shown later.
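As an illustration, the sketch below simulates a hypothetical AR(2) process, estimates its autocorrelation, and solves the p x p Yule-Walker system with a generic linear solver (xcorr and toeplitz are standard MATLAB functions):

% Yule-Walker solve for a hypothetical AR(2) process.
a_true = [1 -0.8 0.15];                 % A(z): a_1 = -0.8, a_2 = 0.15
x = filter(1, a_true, randn(1, 5000));  % simulate the process
p = 2;
r = xcorr(x, p, 'biased');              % gamma_xx(-p) ... gamma_xx(p)
r = r(p+1:end);                         % keep gamma_xx(0) ... gamma_xx(p)
R = toeplitz(r(1:p));                   % p x p autocorrelation matrix
a = -R \ r(2:p+1)'                      % estimates of a_1, a_2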
2.4 Theory of Linear Prediction
Linear prediction involves predicting future values of a stationary random process from observations of past values of the process. Consider, in particular, a one-step forward linear predictor, which forms the prediction of the value x(n) by a weighted linear combination of the past values x(n-1), x(n-2), ..., x(n-p). Hence the linearly predicted value of x(n) is

\hat{x}(n) = -\sum_{k=1}^{p} a_p(k)\, x(n-k)        (2.16)
where the a_p(k) represent the weights in the linear combination. These weights are called the prediction coefficients of the one-step forward linear predictor of order p. The negative sign in the definition of \hat{x}(n) is for mathematical convenience.
The difference between the value x(n) and the predicted value \hat{x}(n) is called the forward prediction error, denoted f_p(n):

f_p(n) = x(n) - \hat{x}(n) = x(n) + \sum_{k=1}^{p} a_p(k)\, x(n-k)        (2.17)
For information-bearing signals, the prediction error f_p(n) may be regarded as the information, or innovation, content of the sample. To calculate the optimum prediction coefficients for our prediction filter, we minimize the mean square error, i.e., we choose the coefficients so that E\left[(x(n) - \hat{x}(n))^2\right] is minimum.
Two approaches can be used to obtain the LPC coefficients a_k characterizing an all-pole model H(z). The least-squares method selects the a_k that minimize the mean energy in the error e(n) over a frame of signal data, while the lattice filter approach permits instantaneous updating of the coefficients.
2.4.1 The Autocorrelation Method
The first of the two common least-squares techniques is the autocorrelation method, which multiplies the speech signal by a window w(m), so that

x_n(m) = \begin{cases} x(m+n)\, w(m), & 0 \le m \le N-1 \\ 0, & \text{otherwise} \end{cases}        (2.19)
Another least-squares technique, called the covariance method, windows the error signal instead of the actual speech signal.
Autocovariance measures the redundancy in a signal.
2.4.2 The Covariance Method
An alternative to using a weighting function or window for defining x_n(m) is to fix the interval over which the mean squared error is computed to the range 0 \le m \le N-1 and to use the unweighted speech directly. That is,

E_n = \sum_{m=0}^{N-1} e_n^2(m)

Minimizing E_n leads to a set of equations in the quantities \phi_n(i,k), defined as
\phi_n(i,k) = \sum_{m=0}^{N-1} x_n(m-i)\, x_n(m-k), \qquad 1 \le i \le p, \;\; 0 \le k \le p        (2.21)
or, by a change of variable,

\phi_n(i,k) = \sum_{m=-i}^{N-1-i} x_n(m)\, x_n(m+i-k), \qquad 1 \le i \le p, \;\; 0 \le k \le p        (2.22)
Using the extended speech interval to define the covariance values \phi_n(i,k), the matrix form of the LPC analysis equation becomes
\begin{bmatrix} \phi_n(1,1) & \phi_n(1,2) & \cdots & \phi_n(1,p) \\ \phi_n(2,1) & \phi_n(2,2) & \cdots & \phi_n(2,p) \\ \vdots & \vdots & & \vdots \\ \phi_n(p,1) & \phi_n(p,2) & \cdots & \phi_n(p,p) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = -\begin{bmatrix} \phi_n(1,0) \\ \phi_n(2,0) \\ \vdots \\ \phi_n(p,0) \end{bmatrix}        (2.23)
The resulting covariance matrix is symmetric (since \phi_n(i,k) = \phi_n(k,i)) but not Toeplitz, and the system can be solved efficiently by a set of techniques called the Cholesky decomposition.
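A minimal MATLAB sketch of the covariance method under these definitions, assuming a hypothetical test signal and order; it accumulates \phi_n(i,k) and solves the system through MATLAB's chol:

% Covariance method sketch: build phi(i,k) and solve via Cholesky.
x = randn(1, 400); p = 8;               % hypothetical signal and order
N = length(x) - p;                      % error computed over N samples
Phi = zeros(p, p); phi0 = zeros(p, 1);
for i = 1:p
    for k = 1:p
        Phi(i,k) = x(p+1-i:p+N-i) * x(p+1-k:p+N-k)';
    end
    phi0(i) = x(p+1-i:p+N-i) * x(p+1:p+N)';
end
U = chol(Phi);                          % Phi = U'*U, U upper triangular
a = -(U \ (U' \ phi0))                  % predictor coefficients a_p(1..p)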
The mean-square value of the forward linear prediction error f_p(n) based on the autocorrelation method is

E_p^f = E\left[|f_p(n)|^2\right] = \gamma_{xx}(0) + 2\,\mathrm{Re}\!\left[\sum_{k=1}^{p} a_p^*(k)\, \gamma_{xx}(k)\right] + \sum_{k=1}^{p} \sum_{l=1}^{p} a_p^*(l)\, a_p(k)\, \gamma_{xx}(l-k)        (2.24)
Now, E_p^f is a quadratic function of the predictor coefficients, and its minimization leads to the set of linear equations

\sum_{k=1}^{p} a_p(k)\, \gamma_{xx}(l-k) = -\gamma_{xx}(l), \qquad l = 1, 2, \ldots, p        (2.25)
These are called the normal equations for the coefficients of the linear predictor. The minimum mean-square prediction error is then

\min\left[E_p^f\right] \equiv E_p^f = \gamma_{xx}(0) + \sum_{k=1}^{p} a_p(k)\, \gamma_{xx}(-k)        (2.26)
Writing 2.25 in terms of vectors,

R_{xx}\, a_p = -r_{xx}        (2.27)

from which the predictor coefficients can be obtained as

a_p = -R_{xx}^{-1}\, r_{xx}        (2.28)
A question may arise as to whether to use the autocorrelation method or the covariance method in estimating the predictor parameters. The covariance method is quite general and can be used with no restrictions; the only problem is that of the stability of the resulting filter. In the autocorrelation method, on the other hand, the filter is guaranteed to be stable, but problems of parameter accuracy can arise because of the necessity of windowing (truncating) the time signal. This is usually a problem if the signal is a portion of an impulse response. For example, if the impulse response of an all-pole filter is analyzed by the covariance method, the filter parameters can be computed accurately from only a finite number of samples of the signal. Using the autocorrelation method, one cannot obtain the exact parameter values unless the whole infinite impulse response is used in the analysis. However, in practice, very good approximations can be obtained by truncating the impulse response at a point where most of the decay of the response has already occurred.
Chapter 3
LATTICE FILTERS
3. Lattice Filters
Linear prediction can be viewed as being equivalent to linear filtering where the predictor
is embedded in the linear filter, as shown in figure 3.1.
Figure 3.1: Forward Linear Prediction.
This is called a prediction-error filter, with input sequence x(n) and output sequence f_p(n). An equivalent realization of the prediction-error filter is shown in figure 3.2.
Figure 3.2: Prediction Error Filter.
This realization is a direct-form FIR filter with the system function

A_p(z) = \sum_{k=0}^{p} a_p(k)\, z^{-k}        (3.1)

where, by definition, a_p(0) = 1.
Prediction-error filters can also be realized in another way, which takes the form of a lattice structure. To find the relationship between the lattice filter coefficients and the FIR filter structure, let us begin with a predictor of order p = 1. The output of such a filter is

f_1(n) = x(n) + a_1(1)\, x(n-1)        (3.2)

This output can be obtained from the single-stage lattice filter shown in figure 3.3 below by exciting both inputs with x(n) and selecting the output from the top branch.
Figure 3.3: Single stage Lattice Filter.
Thus the output is exactly that given by the above equation if we select K_1 = a_1(1). The parameter K_1 in the lattice filter is called a reflection coefficient. The negated reflection coefficient, -K_m, is also called the partial correlation (PARCOR) coefficient.
Next, consider a predictor of order p = 2. For this case, the output of the direct-form FIR filter is

f_2(n) = x(n) + a_2(1)\, x(n-1) + a_2(2)\, x(n-2)        (3.3)
By cascading two lattice stages as shown in figure 3.4, it is possible to obtain the same output as above.
Figure 3.4: Two stage Lattice Filter.
The two outputs from the first stage are

f_1(n) = x(n) + K_1\, x(n-1)
g_1(n) = K_1^*\, x(n) + x(n-1)        (3.4)
Similarly, the two outputs from the second stage are

f_2(n) = f_1(n) + K_2\, g_1(n-1)
g_2(n) = K_2^*\, f_1(n) + g_1(n-1)        (3.5)
Substituting the values of f_1(n) and g_1(n-1) into the above equations yields

f_2(n) = x(n) + (K_1 + K_1^* K_2)\, x(n-1) + K_2\, x(n-2)        (3.6)
On equating coefficients we get

a_2(2) = K_2  and  a_2(1) = K_1 + K_1^* K_2        (3.7)
or, equivalently,

K_2 = a_2(2), \qquad K_1 = a_1(1)        (3.8)
By continuing this process, the equivalence between an mth-order direct-form FIR filter and an m-stage lattice filter can be demonstrated. The lattice is described by the following set of order-recursive equations:

f_0(n) = g_0(n) = x(n)
f_m(n) = f_{m-1}(n) + K_m\, g_{m-1}(n-1), \qquad m = 1, 2, \ldots, p
g_m(n) = K_m^*\, f_{m-1}(n) + g_{m-1}(n-1), \qquad m = 1, 2, \ldots, p        (3.9)
A p-stage lattice filter for the pth-order predictor can be shown as follows:
Figure 3.5: P stage Lattice Filter.
As a consequence of the equivalence between the direct-form prediction-error filter and the FIR lattice filter, the output of the p-stage lattice filter is expressed as

f_p(n) = \sum_{k=0}^{p} a_p(k)\, x(n-k), \qquad a_p(0) = 1        (3.10)
The lattice form's characterization requires only the p reflection coefficients K_i for the whole family of predictors of order 1 through p, in comparison with the p(p+1)/2 filter coefficients required by the direct-form FIR implementation of the same family. The reason the lattice provides a more compact representation is that appending stages to the lattice does not alter the parameters of the previous stages. On the other hand, appending the pth stage to an FIR-based predictor results in a system function A_p(z) whose coefficients are totally different from the coefficients of the lower-order FIR filter with system function A_{p-1}(z).
Although the direct-form implementation of the linear predictor is the most convenient method, for many applications, such as transmission of the predictor coefficients in speech coding, it is advantageous to use the lattice form of the predictor. This is because the lattice form can be conveniently checked for stability: for a stable model, the magnitude of each reflection coefficient is bounded by unity, and therefore it is relatively easy to check a lattice structure for stability.
The quantization of the filter coefficients for transmission can create a major problem, since errors in the filter coefficients can lead to instability in the vocal-tract filter and create an inaccurate output signal. This potential problem is averted by quantizing and transmitting the reflection coefficients generated by the Levinson-Durbin algorithm. These coefficients can be used to rebuild the set of filter coefficients {a_i}, and they guarantee a stable filter if their magnitudes are strictly less than one.
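The rebuild mentioned here is the step-up recursion (the order recursion of the Levinson-Durbin algorithm run forward); a minimal sketch for real-valued, hypothetical reflection coefficients:

% Step-up sketch: reflection coefficients K -> direct-form a_p(1..p).
K = [0.5 -0.3 0.2];             % hypothetical reflection coefficients
a = [];                         % the order-0 predictor has no coefficients
for m = 1:length(K)
    a = [a 0] + K(m) * [fliplr(a) 1];  % a_m(k) = a_{m-1}(k) + K_m a_{m-1}(m-k)
end
% [1 a] is A_p(z); the filter is guaranteed stable when all |K| < 1.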
A major attraction of a lattice structure is its modular form and the relative ease with which the model order can be extended. Furthermore, a perturbation of the parameters of any one section of the lattice structure has a limited and more localized effect.
3.1 Predictor Model Order Selection
One procedure for determining the correct model order is to increment the model order and monitor the differential change in the error power until the change levels off. The incremental change in error power in going from model order i-1 to i is defined as

\Delta E(i) = E(i-1) - E(i)        (3.11)

The order p beyond which the decrease in the error power becomes less than a threshold is taken as the model order.
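A sketch of this procedure, using the prediction-error power returned by MATLAB's lpc on a hypothetical test signal (the threshold value is an arbitrary assumption):

% Grow the model order until the drop in error power levels off.
x = filter(1, [1 -1.2 0.5], randn(1, 4000));  % hypothetical signal
threshold = 1e-3;
E_prev = var(x); p = 20;                      % fallback order
for i = 1:20
    [a, E] = lpc(x, i);                       % E = prediction error power
    if E_prev - E < threshold, p = i - 1; break; end
    E_prev = E;
end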
When the model order is less than the correct order, the signal is under-modelled. In this case, the prediction error is not well decorrelated and will be larger than the optimal minimum. A further consequence of under-modelling is a decrease in the spectral resolution of the model: adjacent spectral peaks of the signal may be merged and appear as a single spectral peak when the model order is too small. When the model order is larger than the correct order, the signal is over-modelled. Over-modelling can result in an ill-conditioned matrix equation, unreliable numerical solutions, and the appearance of spurious spectral peaks in the model.
Chapter 4
Levinson Durbin Algorithm
4. The Levinson-Durbin Algorithm
The Levinson-Durbin algorithm is a computationally efficient algorithm for solving the normal equations. It is so named in recognition of its first use by Levinson (1947) and its subsequent, independent reformulation by Durbin (1960). The normal equations,

\sum_{k=0}^{p} a_p(k)\, \gamma_{xx}(l-k) = 0, \qquad l = 1, 2, \ldots, p, \quad a_p(0) = 1,        (4.1)

are solved for the prediction coefficients. The algorithm exploits the special symmetry in the autocorrelation matrix
\Gamma_{xx} = \begin{bmatrix} \gamma_{xx}(0) & \gamma_{xx}^*(1) & \cdots & \gamma_{xx}^*(p) \\ \gamma_{xx}(1) & \gamma_{xx}(0) & \cdots & \gamma_{xx}^*(p-1) \\ \vdots & \vdots & & \vdots \\ \gamma_{xx}(p) & \gamma_{xx}(p-1) & \cdots & \gamma_{xx}(0) \end{bmatrix}        (4.2)
Since \Gamma_{xx}(i,j) = \Gamma_{xx}(i-j), the autocorrelation matrix is a Toeplitz matrix. Also, since \Gamma_{xx}(i,j) = \Gamma_{xx}^*(j,i), the matrix is Hermitian.
The key to the Levinson-Durbin method of solution, which exploits the Toeplitz property of the matrix, is to proceed recursively: beginning with a predictor of order m = 1 (one coefficient), the order is increased recursively, using the lower-order solutions to obtain the solution at the next higher order. Thus the solution for the first-order predictor, obtained by solving the normal equations, is

a_1(1) = -\gamma_{xx}(1)/\gamma_{xx}(0)        (4.3)
and the resulting minimum mean square error (MMSE) is
E_1^f = \gamma_{xx}(0)\left[1 - |a_1(1)|^2\right]        (4.4)
The next step is to solve for the coefficients a_2(1) and a_2(2) of the second-order predictor and to express the solution in terms of a_1(1). The resulting equations are

a_2(1)\,\gamma_{xx}(0) + a_2(2)\,\gamma_{xx}^*(1) = -\gamma_{xx}(1)
a_2(1)\,\gamma_{xx}(1) + a_2(2)\,\gamma_{xx}(0) = -\gamma_{xx}(2)        (4.5)
By using the solutions in 4.3 and 4.4 to eliminate \gamma_{xx}(1), the following equations are obtained:

a_2(2) = -\frac{\gamma_{xx}(2) + a_1(1)\,\gamma_{xx}(1)}{\gamma_{xx}(0)\left[1 - |a_1(1)|^2\right]} = -\frac{\gamma_{xx}(2) + a_1(1)\,\gamma_{xx}(1)}{E_1^f}        (4.6)

a_2(1) = a_1(1) + a_2(2)\, a_1^*(1)        (4.7)
In this manner, to obtain a recursive procedure, we express the coefficients of the mth-order predictor in terms of the coefficients of the (m-1)st-order predictor. We can write the coefficient vector a_m as the sum of two vectors, namely

a_m = \begin{bmatrix} a_m(1) \\ a_m(2) \\ \vdots \\ a_m(m) \end{bmatrix} = \begin{bmatrix} a_{m-1} \\ 0 \end{bmatrix} + \begin{bmatrix} d_{m-1} \\ K_m \end{bmatrix}        (4.8)
where a_{m-1} is the predictor coefficient vector of the (m-1)st-order predictor, and the vector d_{m-1} and the scalar K_m are to be determined. To this end, the m x m autocorrelation matrix \Gamma_{xx,m} is partitioned as
\Gamma_m = \begin{bmatrix} \Gamma_{m-1} & \gamma_{m-1}^{b*} \\ \gamma_{m-1}^{bt} & \gamma_{xx}(0) \end{bmatrix}        (4.9)
where \gamma_{m-1}^{bt} = [\gamma_{xx}(m-1)\;\; \gamma_{xx}(m-2)\;\; \cdots\;\; \gamma_{xx}(1)] = (\gamma_{m-1}^b)^t. The superscript b on \gamma_{m-1} denotes the vector \gamma_{m-1} = [\gamma_{xx}(1)\;\; \gamma_{xx}(2)\;\; \cdots\;\; \gamma_{xx}(m-1)]^t with its elements taken in reverse order.
The solution of the equation \Gamma_m a_m = -\gamma_m may be expressed as

\begin{bmatrix} \Gamma_{m-1} & \gamma_{m-1}^{b*} \\ \gamma_{m-1}^{bt} & \gamma_{xx}(0) \end{bmatrix} \left( \begin{bmatrix} a_{m-1} \\ 0 \end{bmatrix} + \begin{bmatrix} d_{m-1} \\ K_m \end{bmatrix} \right) = -\begin{bmatrix} \gamma_{m-1} \\ \gamma_{xx}(m) \end{bmatrix}        (4.10)
This is the key step in the Levinson-Durbin algorithm. From 4.10, two equations are obtained as follows:

\Gamma_{m-1} a_{m-1} + \Gamma_{m-1} d_{m-1} + K_m\, \gamma_{m-1}^{b*} = -\gamma_{m-1}        (4.11)

\gamma_{m-1}^{bt} a_{m-1} + \gamma_{m-1}^{bt} d_{m-1} + K_m\, \gamma_{xx}(0) = -\gamma_{xx}(m)        (4.12)
Since \Gamma_{m-1} a_{m-1} = -\gamma_{m-1}, 4.11 yields the solution

d_{m-1} = -K_m\, \Gamma_{m-1}^{-1} \gamma_{m-1}^{b*} = K_m\, a_{m-1}^{b*} = K_m \begin{bmatrix} a_{m-1}^*(m-1) \\ a_{m-1}^*(m-2) \\ \vdots \\ a_{m-1}^*(1) \end{bmatrix}        (4.13)
The scalar equation 4.12 can then be used to solve for K_m:

K_m = -\frac{\gamma_{xx}(m) + \gamma_{m-1}^{bt} a_{m-1}}{\gamma_{xx}(0) + \gamma_{m-1}^{bt} a_{m-1}^{b*}}        (4.14)
Thus substituting the solutions in 4.13 and 4.14 into 4.8, we get the recursive equations
for the predictor coefficients and the reflection coefficients for the lattice filters. The
recursive equations are given as
( )
( )
( )
( )
( ) ( )
( ) ( ) ( ) ) (
) (
0
*
1 1
*
1 1
1 1
*
1 1
1 1
k m a m a k a k a
k m a K k a k a
E
a m
a
a m
K m a
m m m m
m m m m
f
m
m
bt
m xx
m
bt
m xx
m
bt
m xx
m m
+ =
+ =
+
=
+
= =
4.15
From these equations we note that the predictor coefficients form a recursive set of equations. K_m is the reflection coefficient of the mth stage; moreover, K_m = a_m(m), the mth coefficient of the mth-order predictor.
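The companion update E_m^f = (1 - |K_m|^2) E_{m-1}^f avoids recomputing the error power at each order. As a quick cross-check of these recursions, the sketch below compares them with MATLAB's built-in levinson (a Signal Processing Toolbox function) on a hypothetical autocorrelation sequence:

% Cross-check: recursion 4.15 vs. MATLAB's levinson(), real data.
x = filter(1, [1 -0.9 0.4], randn(1, 5000));
r = xcorr(x, 8, 'biased'); r = r(9:end);      % gamma_xx(0) ... gamma_xx(8)
[a_ml, e_ml, K_ml] = levinson(r, 8);          % built-in solver
a = zeros(1, 0); E = r(1);
for m = 1:8
    K = -(r(m+1) + fliplr(r(2:m)) * a') / E;  % 4.14 for real data
    a = [a 0] + K * [fliplr(a) 1];            % 4.15
    E = E * (1 - K^2);                        % error-power update
end
max(abs([1 a] - a_ml))                        % near machine precision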
The important virtue of the Levinson-Durbin algorithm is its computational efficiency: its use results in a large saving in the number of operations. The Levinson-Durbin recursion requires O(m) multiplications and additions (operations) to go from stage m to stage m+1. Therefore, for p stages it takes on the order of 1+2+3+...+p = p(p+1)/2, or O(p^2), operations to solve for the prediction filter coefficients or the reflection coefficients, compared with O(p^3) operations if the Toeplitz property is not exploited.
Chapter 5
Progress Report
5. Progress Report.
As required by the normal equations, we have to calculate the autocorrelation matrix and solve the equation a_p = -(R_{xx})^{-1} r_{xx}. Our first program calculated the autocorrelation matrix and compared the values of the prediction coefficients obtained using the matrix inversion function of MATLAB 6.1 with those from its linear prediction function. The results are shown below.
Using inv() function Using lpc() function
1.0000 1.0000
-1.6538 -1.6538
1.3089 1.3089
-1.0449 -1.0449
0.4729 0.4729
-0.0588 -0.0588
-0.1057 -0.1057
0.6557 0.6557
-0.8568 -0.8568
0.5507 0.5507
-0.2533 -0.2533
-0.0262 -0.0262
0.1345 0.1345
Table 5.1: Prediction coefficients obtained with inv() and with lpc().
Since both functions returned identical values, our autocorrelation matrix was calculated correctly. Next we implemented the Levinson-Durbin algorithm using the recursive relations explained previously. The function we wrote returns both the prediction coefficients and the reflection coefficients; hence it can be used to implement the linear predictor using either the FIR filter or the lattice filter.
To check the accuracy of our function, we wrote a program that calculated the prediction coefficients for a speech input signal and implemented the FIR predictive filter to generate the error sequence. The same coefficients were then used to implement the inverse filter that regenerated the speech signal from the error signal. The figure below shows the original, error, and recreated signals. It is clear that the recreated signal is the same as the original input signal.
Figure 5.1: Original, Error and Recreated signal (without segmentation)
The above sample is for the utterance of the word N-S-I-T. The speech sample is encoded in 8 bits and sampled at a frequency of 11025 Hz, corresponding to telephone quality. In all the examples in this section, 8th-order linear prediction is used. The error signal is clearly smaller than the original signal and will require fewer bits to encode. This program does not use advanced techniques such as segmentation, windowing, or end detection. But since speech is only quasi-stationary and cannot be assumed stationary over such a large number of samples, we have to perform segmentation and windowing of this speech sample. This will further reduce the magnitude of the error signal, helping in improved compression of the speech.
Next we divided the speech into non-overlapping segments and again calculated the error and the recreated signal. The result is shown below. In this program we used the filter routines created by us, with a constant window size of 15 ms, which corresponds to 166 samples.
Figure 5.2: Original, Error and Recreated signal (with non-overlapping
segmentation)
The figure shows that using non-overlapping segments causes spikes, due to discontinuities, where the errors are large. This is because we are trying to predict speech from 0 at the segment edges. To overcome this limitation, overlapping segments of the speech are taken and windowing is done. The most popular windows are the Hamming and Hanning windows.
Below is shown the output when we take segments overlapping by N/2, where N is the window size. A Hanning window is used in this example.
Figure 5.3: Original, Error and Recreated signal (with overlapping segmentation
and windowing)
It can be clearly seen from the above figure that the error signal has become smooth as
compared to the non-overlapping segmentation case. Hence this error signal can be
encoded successfully.
As the final program in this preliminary phase, we solve the same case using lattice filters instead of the FIR and IIR filters. Lattice filters have a number of advantages over the FIR/IIR implementation in the case of linear prediction; these have already been explained earlier.
Figure 5.4: Original, Error and Recreated signal with lattice filters.
Hence we can see that Lattice Filters give the same result as the other filters.
The spectra of the input signal and of the recreated signal are shown below. There is no difference between the two spectra.
Figure 5.5: Frequency Spectrum of Original Signal.
Figure 5.6: Frequency Spectrum of Recreated Signal.
Chapter 6
Observations and Results
6. Observations and Results
All software simulations were performed in MATLAB 6.1, which is a registered trademark of The MathWorks, Inc.
We have shown earlier how we reached the stage where the error is minimized, by using segmentation of the speech and windowing. In this section we show the effect of changing various parameters, such as the order of prediction and the size of the window, on the error signal and the recreated signal. To take the effects of quantization and compression into account, we encode the error signal in a smaller number of bits before using it to recreate the speech signal. This ensures that the observations are accurate.
The speech sample used for all the observations is the word N-S-I-T, encoded in 8 bits and sampled at a frequency of 11025 Hz, which corresponds to telephone quality. The order of prediction is 8 and the window size is 15 ms, corresponding to 166 samples of speech. Several encodings are compared: the error signal is encoded in 8 bits down to 3 bits. In each sub-section, one parameter is varied while all the remaining parameters hold the values specified above.
6.1 Effect of order of prediction
The order of prediction governs how many previous samples are used to predict the next sample. As the order of the predictor is increased, up to the order of the process which generated the signal, the power spectrum of the error signal becomes flatter and flatter. But it is not possible to increase the order arbitrarily, since the autocorrelation matrix is a p x p matrix, where p is the order, and solving this system requires a lot of computation time. Extensive experimentation and observation have shown that an order of 8-12 is suitable for most speech samples.
In this section we vary the order and observe the effect on the error signal. We measure the prediction gain, i.e., 10 log10(variance of the original signal / variance of the error signal), in dB. A table of the observations is shown below.
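For reference, the measurement itself is a one-liner; the signals below are placeholders, not our speech data:

% Prediction gain in dB: x = original frame, e = prediction error.
x = randn(1, 1000); e = 0.3*randn(1, 1000);   % placeholder signals
gain_dB = 10*log10(var(x) / var(e))           % about 10.5 dB here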
Prediction gain (dB) for each error word length (number of bits):

Order    8 bits    7 bits    6 bits    5 bits    4 bits    3 bits
2        3.8176    3.7277    3.4369    3.0975    2.0599    0.29527
4        6.7546    6.6762    6.3967    5.9849    4.8896    3.8079
6        9.7913    9.6751    9.3888    5.9849    7.9312    7.0689
8        11.113    11.037    10.683    10.246    9.1502    8.0151
10       13.269    13.157    12.78     12.402    11.391    12.291
12       15.264    15.145    14.808    14.236    13.505    14.385
20       15.493    15.378    15.019    14.556    13.964    14.67
40       15.225    15.099    14.768    14.368    13.502    14.271
60       15.539    15.421    15.099    14.607    13.835    14.6

Table 6.1: Predictive gain vs. order of prediction.
Plotting the above table as a graph:
Figure 6.1: Graph of Prediction Gain Vs Order of Prediction.
Index
8 Bits: Purple
7 Bits: Green
6 Bits: Magenta
5 Bits: Black
4 Bits: Blue
3 Bits: Red
6.1.1 Spectrum of the Error Signal
As the order of prediction increases, the spectrum of the error signal is flattened, since the prediction filter removes all the correlation from the input signal and gives an output that is a nearly white noise sequence.
Below are shown a few plots of the frequency-domain representation of the error signal as the order of the predictor is increased. Clearly, the spectrum is gradually flattened as the order is increased. Once the order becomes greater than the order of the system that generated the original signal, there is no further flattening of the error signal, because we cannot predict with more accuracy than the actual system that generated the signal.
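Plots of this kind can be produced with a few lines of MATLAB; the sketch below uses a hypothetical test signal rather than our speech sample (lpc is a Signal Processing Toolbox function):

% Residual spectrum for increasing predictor order.
x = filter(1, [1 -1.3 0.6], randn(1, 4096));  % hypothetical signal
for p = [2 8 20]
    a = lpc(x, p);              % prediction-error filter A_p(z)
    e = filter(a, 1, x);        % forward prediction error
    plot(abs(fft(e, 1024))); hold on
end
hold off; xlabel('Frequency bin'); ylabel('|E(f)|');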
Figure 6.2: Frequency representation of error when order p=2.
Figure 6.3: Frequency representation of error when order p=6.
Figure 6.4: Frequency representation of error when order p=8.
Figure 6.5: Frequency representation of error when order p=12.
Figure 6.6: Frequency representation of error when order p=20.
Figure 6.7: Frequency representation of error when order p=40.
From the above figures we can see that there is not much flattening of the error after order 20. Generally, for speech signals, an order of 12 is sufficient and gives good results.
Shown below are the frequency spectra of the original signal and the recreated signal.
Figure 6.8: Frequency representation of original signal.
This figure shows the frequency content of the original signal. Let us represent the
frequency spectrum in a more convenient form by using the function fftshift().
Figure 6.9: Frequency representation of original signal (Shifted).
Below we show the frequency response of the recreated signal.
Figure 6.10: Frequency representation of recreated signal (Shifted) p=8.
Figure 6.11: Frequency representation of recreated signal (Shifted) p=12.
6.2 Effect of segment size
Window size determines the number of segments that the speech is divided into. Since speech is quasi-stationary, it can be assumed stationary only for a small duration. The smaller the segment size, the more segments are created, and the more computation time the program requires. But as the segment size is increased, the prediction gain is reduced and hence compression is not very efficient. Hence we have to find a balance between the size of the segments and the computation time required.
Prediction gain (dB) for each error word length (number of bits):

Window size (ms)    8 bits    7 bits    6 bits    5 bits    4 bits    3 bits
5                   11.956    11.832    11.499    11.024    9.9649    9.7677
7                   11.164    11.076    10.705    10.294    9.4267    8.2087
10                  11.471    11.4      10.996    10.569    9.4888    8.6293
13                  11.032    10.935    10.618    10.07     9.2504    8.0782
15                  11.113    11.037    10.683    10.246    9.1502    8.0151
17                  11.035    10.944    10.576    10.153    9.1392    8.0709
20                  10.945    10.829    10.492    9.9965    9.0453    7.9843
23                  10.971    10.878    10.51     10.048    9.0963    8.0501
25                  10.984    10.891    10.514    10.085    9.1083    7.9189
30                  11.03     10.943    10.574    10.097    9.0816    8.1633
50                  11.08     10.985    10.636    10.136    9.2119    8.3669

Table 6.2: Predictive gain vs. window segment size.
Plotting the above table as a graph:
Figure 6.12: Window size Vs Predictive Gain.
Index
8 bits: Red circles
7 bits: Blue circles
6 bits: Magenta Triangles
5 bits: Green Stars
4 bits: Blue Squares
3 bits: Black V
6.3 Effect of quantization and compression of error signal.
Once the error signal has been quantized into a smaller number of bits, some information has been lost and cannot be recovered. We have to concentrate on how best we can reproduce the original signal from this quantized forward prediction error. Below are shown the recreated signals for the same word N-S-I-T used earlier. The original signal is encoded in 8 bits. We encode the forward prediction error in 7, 6, 5, 4, and 3 bits and observe the output waveform.
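The encoding step itself can be sketched as a uniform requantization; this stands in for the writewav/readwav round trip used in the programs of Appendix II, with placeholder data:

% Uniform quantizer: requantize e (assumed in [-1,1)) to n bits.
e = 0.8*sin(2*pi*(0:199)/50);       % placeholder error signal
n = 4;                              % target word length in bits
step = 2 / 2^n;                     % step size over [-1, 1)
e_q = step * round(e / step);       % quantized error
e_q = min(max(e_q, -1), 1 - step);  % clamp to the representable range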
Figure 6.13: Original Signal encoded in 8 bits.
Figure 6.14: Recreated Signal when error is encoded in 8 bits.
Figure 6.15: Recreated Signal when error is encoded in 7 bits.
Figure 6.16: Recreated Signal when error is encoded in 6 bits.
Figure 6.17: Recreated Signal when error is encoded in 5 bits.
Figure 6.18: Recreated Signal when error is encoded in 4 bits.
Figure 6.19: Recreated Signal when error is encoded in 3 bits.
The loss of quality of the speech as the error signal is encoded in fewer and fewer bits is obvious.
Figure 6.20: Recreated Signal Spectrum when error is encoded in 8 bits.
Figure 6.21: Recreated Signal Spectrum when error is encoded in 7 bits.
Figure 6.22: Recreated Signal Spectrum when error is encoded in 6 bits.
Figure 6.23: Recreated Signal Spectrum when error is encoded in 5 bits.
Figure 6.24: Recreated Signal Spectrum when error is encoded in 4 bits.
Figure 6.25: Recreated Signal Spectrum when error is encoded in 3 bits.
Chapter 7
Conclusions and Future Scope of the Project
7.1 Conclusions
From our experiments we have verified the accuracy and the efficiency of the Levinson-Durbin algorithm and its application to linear prediction. We have also clearly shown that it is possible to compress speech signals in the form of the forward prediction error together with the filter parameters. If the error samples are stored with a sufficient number of bits, then the recreated signal is of good quality.
As we have shown above, we managed to encode an 8-bit speech sample into a 4-bit error value while maintaining the intelligibility of the speech. Depending on the application and the quality of signal required, we can choose the number of bits of precision used to store the error signal.
Our results show that the use of windowing and segmentation is very useful for efficient compression of the error signal.
The simulation results show that a prediction order in the range of 8-12 is sufficient for linear prediction of speech; using an order above this does not give much improvement.
Figures obtained from our simulations in MATLAB 6.1 show that a window size of 5-10 ms gives very good results for speech samples at 11025 Hz. For good results, the window should ideally contain 60 to 100 speech samples.
7.2 Future Scope of the Project
In our project we successfully simulated compression of speech in MATLAB. We also wrote all the relevant functions ourselves, so that the work can be easily implemented in hardware.
Future students working on this project will find the hardware implementation of our project straightforward, as we have provided a ready-made set of important functions along with the source code. These functions can easily be coded in the C language, from which they can be converted into the assembly language of any hardware.
Future work on the project can also explore improved DSP techniques to further reduce the error and make compression more efficient.
The linear predictor coefficients can be calculated faster using the Schur algorithm on a parallel architecture, which is much more computationally efficient; hence real-time speech processing can be performed.
Appendices
A. Appendix I: New MATLAB Functions.
Function A.1: The Levinson-Durbin Algorithm.
function [lpc_coeffs, ref_coeffs]=levdurbin(samples,num_coeff)
%syntax => [lpc_coeffs, ref_coeffs]=levdurbin(samples,num_coeff)
%function for calculating the LPC and reflection coeffs
%inputs needed are the signal samples and the number of coefficients.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
%finding the autocorrelation matrix
N=length(samples);
p=num_coeff;
auto_corr=zeros(p,p);
for k=1:p
for j=1:p
for n=p+1:N
auto_corr(k,j)=auto_corr(k,j)+samples(n-k).*samples(n-j);
end
end
end
%Reading the Auto-correlation Matrix and extracting R_dash(0),R_dash(1) ..
for i=1:num_coeff
R(i)=auto_corr(1,i);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
E=0; % initialising the error term.
k=zeros(1,num_coeff); %initialising the reflection coeff matrix
a=zeros(num_coeff);   %initialising the pXp coeff matrix without overwriting previous entries
E=R(0+1);             %autocorrelation index + 1 because matrix indices start from 1
if R(1)==0 % To prevent divide by zero.
R(1)=1;
end
k(1)=-R(1+1)./R(0+1);
a(1,1)=k(1);
for j=2:num_coeff
E=0;
temp1=0;
temp2=0;
for L=1:j-1
temp1=temp1+a(j-1,L).*R(j-L+1);
end
for m=1:j-1
temp2=temp2+R(j-m+1).*a(j-1,j-m);
E=R(0+1)+temp2;
end
k(j)=-(R(j-1+1)+temp1)./E;
a(j,j)=k(j);
for m=1:j-1
a(j,m)=a(j-1,m)+k(j).*a(j-1,j-m);
end
end
lpc_coeffs=[1 a(p,:)];
ref_coeffs=k;
Function A.2: Ntoone
function output=Ntoone(sample)
%output=Ntoone(sample)
%This is a function for converting an N dimension matrix to 1 dimension
%matrix
%The input parameter is an N dimensional matrix and the
%output is a one dimensional matrix
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
[R,C]=size(sample);
output=zeros(1,R*C);
for i=1:R
for j=1:C
output(1,(i-1)*C+j)=sample(i,j);
end
end
Function A.3: onetoN
function output=onetoN(sample,R,C)
%output=onetoN(sample,R,C)
%Function for converting a one dimensional matrix into a
%N dimension matrix.
%The input parameter is an one dimensional matrix and the number
%of rows and columns
%The output is a N dimensional matrix
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
len=length(sample);
output=zeros(R,C);
for i=1:R
for j=1:C
output(i,j)=sample((i-1)*C+j);
end
end
Function A.4: Ntoone_overlap
function one_D=Ntoone_overlap(windowed_samples,padded_number_of_windows,padded_window_length)
%syntax => one_D=Ntoone_overlap(windowed_samples,padded_number_of_windows,padded_window_length)
%The input parameters are as follows:
%windowed_samples = matrix returned after windowing, which contains overlapping elements
%padded_number_of_windows = number of windows after equalizing the size of the sample
%and the length required for windowing
%padded_window_length = length after equalizing all sizes
%This function is used in conjunction with the overlapping windowing function and converts
%an N dimensional matrix having overlapping windows to a one dimensional matrix in such a
%way that the overlapping elements are considered only once and not twice.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
one_D=zeros(1,padded_number_of_windows*(padded_window_length/2)+padded_window_length/2);
for i=0:padded_number_of_windows-1
    for j=1:padded_window_length
        one_D(1,i*(padded_window_length/2)+j)=one_D(1,i*(padded_window_length/2)+j)+windowed_samples(i+1,j);
    end
end
Function A.5: rect_win
function [windowed_samples, padded_window_length, padded_number_of_windows]=rect_win(samples,sam_rate,time_segment)
%syntax => [windowed_samples, padded_window_length, padded_number_of_windows]=rect_win(samples,sam_rate,time_segment)
%This function is used for segmenting and windowing a speech sample. The segments are
%non-overlapping and the window type is rectangular. Non-overlapping segments use each
%value only once for the purpose of windowing.
%samples = the matrix containing the signal to be segmented and windowed
%sam_rate = the sampling rate of the speech signal
%time_segment = the length of the window in milliseconds, i.e. the length of the time segment
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
win_length=sam_rate*time_segment;
padded_window_length=ceil(win_length);
number_of_windows=length(samples)/win_length;
padded_number_of_windows=ceil(number_of_windows);
padded_total_length=padded_number_of_windows * padded_window_length;
padded_samples=[samples' zeros(1,padded_total_length-length(samples))];
windowed_samples=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
for j=1:padded_window_length
windowed_samples(i+1,j)=padded_samples(i*padded_window_length+j);
end
end
Function A.6: han_win_overlap
function [windowed_samples, padded_window_length, padded_number_of_windows]=han_win_overlap(samples,sam_rate,time_segment)
%syntax => [windowed_samples, padded_window_length, padded_number_of_windows]=han_win_overlap(samples,sam_rate,time_segment)
%This function is used for segmentation and windowing of the input speech signal.
%The segmentation is overlapping, with an overlap of N/2 where N is the window size.
%The windowing type is the Hanning window (the 0.50/0.50 raised-cosine weights below).
%This function can be used for other window types, such as Hamming or rectangular,
%by changing the multiplying values.
%The input parameters are:
%samples = the matrix containing the signal to be segmented and windowed
%sam_rate = the sampling rate of the speech signal
%time_segment = the length of the window in milliseconds, i.e. the length of the time segment
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
win_length=sam_rate*time_segment;
padded_window_length=ceil(win_length);
if rem(padded_window_length,2)==1
    padded_window_length=padded_window_length+1;
end
i=0;
while (length(samples)-i*(padded_window_length./2))>padded_window_length
    for j=1:padded_window_length
        windowed_samples(i+1,j)=samples(1,i*(padded_window_length/2)+j).*(0.50-0.50.*(cos((2*pi*(j-1)./(padded_window_length-1)))));
    end
    i=i+1;
end
for j=1:padded_window_length
    if i*(padded_window_length/2)+j < length(samples)
        windowed_samples(i+1,j)=samples(1,i*(padded_window_length/2)+j).*(0.50-0.50.*(cos((2*pi*(j-1)./(padded_window_length-1)))));
    else
        windowed_samples(i+1,j)=0;
    end
end
padded_number_of_windows=i+1;
Function A.7: fir_cpp
function [y]=fir_cpp(b,a,x)
% syntax =>[y]=fir_cpp(b,a,x)
% This function is a self made function for implementation of an FIR filter
% The inputs are the standard inputs which are given to the MATLAB filter command.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
p=length(b);
p=p-1;
temp=zeros(p+1,1);
N=length(x);
for i=1:N
for j=1:p+1
if i-j >=0
temp(j,1)=x(i-j+1);
end
end
b_tran=b;
y(i)=b_tran*temp;
end
Function A.8: iir_cpp
function [y]=iir_cpp(b,a,x)
% syntax => [y]=iir_cpp(b,a,x)
% This function is a self made function for implementation of an IIR filter
% The inputs are the standard inputs which are given to the MATLAB filter command.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
p=length(a);
for k=2:p
q(1,k-1)=a(k);
end
p=p-1;
temp=zeros(p,1);
N=length(x);
y=zeros(1,N);
for i=1:N
for j=1:p
if i-j >0
temp(j,1)=y(i-j);
end
end
%q_tran=q';
y(i)=q*temp;
y(i)=-y(i)+x(i);
end
Function A.9: lat_cpp
function [output] = lat_cpp(b,x)
%syntax => [output] = lat_cpp(b,x)
%This function implements an all-zero lattice filter. This filter is equivalent to an FIR filter.
%The inputs required are the reflection coefficients and the signal to be filtered.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
N=length(x);
p=length(b);
F=zeros(p+1,N);
G=zeros(p+1,N);
F(1,:)=x;
G(1,:)=x;
for j=2:N
    for i=2:p+1
        F(i,j) = F(i-1,j) + b(i-1)*G(i-1,j-1);   % forward output of stage i-1, i.e. the error
        G(i,j) = b(i-1)*F(i-1,j) + G(i-1,j-1);   % backward output of stage i-1
    end
end
output=F(p+1,:);
Function A.10: latrec_cpp
function [output1]=latrec_cpp(b,x)
%syntax => [output1]=latrec_cpp(b,x)
%This function is used to implement the all-pole lattice. The filter is equivalent to
%an IIR (all-pole) filter.
%The inputs are the reflection coefficients and the signal to be filtered.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
p=length(b);
N=length(x);
F=zeros(p+1,N); % this is one way of implementing...that is combining both the outer
G=zeros(p+1,N);
% loop for N samples and the inner loop for 1 to p order
F(1,:)=x;
G(1,:)=x;
for j=2:N
for i=2:p+1
F(i,j) = F(i-1,j) + b(i-1)*G(i-1,j-1); % here we r generating the output of
lattice filter i.e error
G(i,j) = b(i-1)*F(i-1,j) + G(i-1,j-1);
end
end
% generating recreated signal
F_rec=F;
G_rec=G; % i-1 is varying from 1 to 12 (i.e p)
%replace i-1 by p+1 -(i-1) = p-i+2
for j=2:N
    for i=2:p+1 % here we get back the recreated signal
F_rec(p-i+2,j) = F_rec(p-i+3,j) - b(p-i+2)*G_rec(p-i+2,j-1);
G_rec(p-i+3,j) = b(p-i+2)*F_rec(p-i+2,j) + G_rec(p-i+2,j-1);
end
end
output1=F_rec(1,:);
Function A.11: threshhold
function out=threshhold(in,min)
%syntax => out=threshhold(in,min)
%This function is used for end detection. The inputs are the input speech signal and the
%threshhold value below which the signal should be rejected.
%Made by Vidhu Niti Singh-565\ECE\99 and Sandeep Dabas-549\ECE\99.
N=length(in);
j=1;
max=30;
for i=1:N-max
if abs(in(i))>min
out(j)=in(i);
j=j+1;
else
for k=1:max
if abs(in(i+k))>min
out(j)=in(i);
j=j+1;
break
end
end
end
end
out=out-mean(out);
B. Appendix II: MATLAB Codes
Code B.1: Calculation of the Error Signal and Recreation of Speech.
% In this program we use the filter approach to find the error signal and then recreate
% the sample from the error. This is the same as assuming a causal and causally
% invertible filter which converts a white noise sequence into a WSS process, and the
% inverse filter which gives a white noise output when a WSS process is taken as the input.
samples=readwav('vid2.wav');
p=12; % number of coefficients ie order of the predictor
N=length(samples);
y=levdurbin(samples,p);%<----using my own function.
%------------- making a predictive filter which gives error as the output ---------------
% the error can be treated as a white noise which is made after removing all the
%correlation
% in the input speech sample.
b=y;
a=1;
error=filter(b,a,samples);
%----------Passing the error through the inverse filter which again generates the speech
%sample.
a_dash=y;
b_dash=1;
recreate=filter(b_dash,a_dash,error);
subplot(3,1,1),plot(samples);title('Original Signal');axis([0 length(samples) -1 1]);
xlabel('Samples');ylabel('Amplitude');
subplot(3,1,2),plot(error);title('Error Signal');axis([0 length(samples) -1 1]);
xlabel('Samples');ylabel('Amplitude');
subplot(3,1,3),plot(recreate);title('Recreated Signal');axis([0 length(samples) -1 1]);
xlabel('Samples');ylabel('Amplitude');
Code B.2: Non-Overlapping Segmentation of Speech Sample
%This program segments the sample by using windowing. We want to see the effect on
%the forward prediction error of using this technique. The forward prediction error
%should decrease.
[samples sam_rate FIDX Wmode]=readwav('vid2.wav');
samples=threshhold(samples,0.03);
[windowed_samples padded_window_length padded_number_of_windows]=rect_win(samples',sam_rate,0.015);
p=8; %order of prediction
lpc_matrix=zeros(padded_number_of_windows,1+p);
error=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    lpc_matrix(i+1,:)=levdurbin(windowed_samples(i+1,:),p);
    error(i+1,:)=fir_cpp(lpc_matrix(i+1,:),1,windowed_samples(i+1,:)); %noise-whitening filter
end
recreated=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    recreated(i+1,:)=iir_cpp(1,lpc_matrix(i+1,:),error(i+1,:)); %inverse filter
end
error_1d=reshape(error',1,padded_number_of_windows*padded_window_length);
recreated_1d=reshape(recreated',1,padded_number_of_windows*padded_window_length);
subplot(3,1,1);plot(samples);title('Original signal');axis([0 length(samples) -1 1]);ylabel('Amplitude');
subplot(3,1,2);plot(error_1d);title('Error signal (windowed)');axis([0 length(samples) -1 1]);ylabel('Amplitude');
subplot(3,1,3);plot(recreated_1d);title('Recreated signal');axis([0 length(samples) -1 1]);ylabel('Amplitude');
Code B.3: Overlapping Segmentation with Hanning Windowing
%In this program the speech sample is segmented into overlapping segments and a
%Hanning window is used on each segment.
%The size of each segment is 15 milliseconds.
[samples sam_rate FIDX Wmode]=readwav('vid2');
samples=threshhold(samples,0.030);
[windowed_samples padded_window_length padded_number_of_windows]=han_win_overlap(samples,sam_rate,0.015);
p=8; %number of coefficients, i.e. order of the predictor
lpc_matrix=zeros(padded_number_of_windows,1+p);
error=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    lpc_matrix(i+1,:)=levdurbin(windowed_samples(i+1,:),p);
    error(i+1,:)=fir_cpp(lpc_matrix(i+1,:),1,windowed_samples(i+1,:));
end
figure(1);
one_D=Ntoone_overlap(error,padded_number_of_windows,padded_window_length);
recreated=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    recreated(i+1,:)=iir_cpp(1,lpc_matrix(i+1,:),error(i+1,:));
end
recreated_1d=Ntoone_overlap(recreated,padded_number_of_windows,padded_window_length);
subplot(3,1,1);plot(samples);title('Original Signal');axis([0 length(samples) -.6 .6]);
subplot(3,1,2);plot(one_D);title('Overlap window error');axis([0 length(samples) -.6 .6]);
subplot(3,1,3);plot(recreated_1d);title('Recreated Signal');axis([0 length(samples) -.6 .6]);
Code B.4: Encoding of error signal using different number of bits
%In this program we encode the error signal using different numbers of bits and then
%read it back, to take the effect of quantisation into account.
%-------------Reading the file ------------------------
[samples,sam_rate,wmode,fidx]=readwav('vid2.wav');
p=8; % number of coefficients, i.e. order of the predictor
N=length(samples);
[windowed_samples padded_window_length padded_number_of_windows]=han_win_overlap(samples',sam_rate,0.015);
lpc_matrix=zeros(padded_number_of_windows,1+p);
error=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    lpc_matrix(i+1,:)=real(lpc(windowed_samples(i+1,:),p));
    error(i+1,:)=fir_cpp(lpc_matrix(i+1,:),1,windowed_samples(i+1,:));
end
one_D=Ntoone_overlap(error,padded_number_of_windows,padded_window_length);
writewav(one_D,sam_rate,'error_8.wav','8','s');
writewav(one_D,sam_rate,'error_7.wav','7','s');
writewav(one_D,sam_rate,'error_6.wav','6','s');
writewav(one_D,sam_rate,'error_5.wav','5','s');
writewav(one_D,sam_rate,'error_4.wav','4','s');
writewav(one_D,sam_rate,'error_3.wav','3','s');
error_8=readwav('error_8','s');
error_7=readwav('error_7','s');
error_6=readwav('error_6','s');
error_5=readwav('error_5','s');
error_4=readwav('error_4','s');
error_3=readwav('error_3','s');
err=error_8;
error=onetoN_overlap(err',padded_number_of_windows,padded_window_length);
recreated=zeros(padded_number_of_windows,padded_window_length);
for i=0:padded_number_of_windows-1
    recreated(i+1,:)=iir_cpp(1,lpc_matrix(i+1,:),error(i+1,:));
end
recreated_1d=Ntoone_overlap(recreated,padded_number_of_windows,padded_window_length);
writewav(recreated_1d,sam_rate,'recreated_8');
subplot(3,1,1);plot(samples);title('Original signal');axis([0 length(samples) -.6 .6]);ylabel('Amplitude');
subplot(3,1,2);plot(one_D);title('Error signal');axis([0 length(samples) -.6 .6]);ylabel('Amplitude');
subplot(3,1,3);plot(recreated_1d);title('Recreated signal');axis([0 length(samples) -2.5 2.5]);ylabel('Amplitude');
C. Appendix III: C Codes
Code C.1 Levinson Durbin Algorithm
#include "levd.h"
M levd(M samples, M num_coeff) {
double old_nargin=nargin; nargin=2; nargin_set=1;
double old_nargout=nargout; nargout=1; nargout_set=1;
M lpc_coeffs__out(0,0,"lpc_coeffs__out");
levd(samples, num_coeff, i_o, lpc_coeffs__out, junk_M);
nargout=old_nargout;
nargin=old_nargin;
return(lpc_coeffs__out);
}
M levd(M samples, M num_coeff, i_o_t, Mr lpc_coeffs__out, Mr ref_coeffs__out) {
M ans(0,0,"ans"), lpc_coeffs(0,0,"lpc_coeffs"), ref_coeffs(0,0,"ref_coeffs")\
, N(0,0,"N"), p(0,0,"p"), auto_corr(0,0,"auto_corr"), k(0,0,"k"), k_v0(\
0,0,"k_v0"), j(0,0,"j"), j_v1(0,0,"j_v1"), n(0,0,"n"), n_v2(0,0,"n_v2")\
, i(0,0,"i"), i_v3(0,0,"i_v3"), R(0,0,"R"), E(0,0,"E"), a(0,0,"a"), j_v4(\
0,0,"j_v4"), temp1(0,0,"temp1"), temp2(0,0,"temp2"), L(0,0,"L"), L_v5(0,\
0,"L_v5"), m(0,0,"m"), m_v6(0,0,"m_v6"), m_v7(0,0,"m_v7");
double old_nargin=nargin; if (!nargin_set) nargin =2;
double old_nargout=nargout; if (!nargout_set) nargout=2;
nargin_set=0; nargout_set=0;
N=length(samples);
p=num_coeff;
auto_corr=zeros(p,p);
k_v0=colon(1.0,1,p);
for (int k_i0=1;k_i0<=forsize(k_v0);k_i0++) {
forelem(k,k_v0,k_i0);
j_v1=colon(1.0,1,p);
for (int j_i1=1;j_i1<=forsize(j_v1);j_i1++) {
forelem(j,j_v1,j_i1);
n_v2=colon(p+1.0,1,N);
for (int n_i2=1;n_i2<=forsize(n_v2);n_i2++) {
forelem(n,n_v2,n_i2);
auto_corr(k,j)=auto_corr(k,j)+dot_mul(samples(n-k),samples(n-j)\
);
}
}
}
//Reading the Auto-correlation Matrix and extracting R_dash(0),R_dash(1) ..
i_v3=colon(1.0,1,num_coeff);
for (int i_i3=1;i_i3<=forsize(i_v3);i_i3++) {
forelem(i,i_v3,i_i3);
R(i)=auto_corr(1.0,i);
}
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
E=0.0;
// initialising the error term.
k=zeros(1.0,num_coeff);
//initialising the reflection coeff matrix
a=zeros(num_coeff);
//initialising the coeff pXp with no overwriting the previous entries
E=R(0.0+1.0);
//% in the form of aurocorrelation + 1 because matrix index has to be from 1
if (istrue(R(1.0)==0.0)) {
// To prevent divide by zero.
R(1.0)=1.0;
}
k(1.0)=-dot_div(R(1.0+1.0),R(0.0+1.0));
a(1.0,1.0)=k(1.0);
j_v4=colon(2.0,1,num_coeff);
for (int j_i4=1;j_i4<=forsize(j_v4);j_i4++) {
forelem(j,j_v4,j_i4);
E=0.0;
temp1=0.0;
temp2=0.0;
L_v5=colon(1.0,1,j-1.0);
for (int L_i5=1;L_i5<=forsize(L_v5);L_i5++) {
forelem(L,L_v5,L_i5);
temp1+=dot_mul(a(j-1.0,L),R(j-L+1.0));
}
m_v6=colon(1.0,1,j-1.0);
for (int m_i6=1;m_i6<=forsize(m_v6);m_i6++) {
forelem(m,m_v6,m_i6);
temp2+=dot_mul(R(j-m+1.0),a(j-1.0,j-m));
E=R(0.0+1.0)+temp2;
}
k(j)=-dot_div((R(j-1.0+1.0)+temp1),E);
a(j,j)=k(j);
m_v7=colon(1.0,1,j-1.0);
for (int m_i7=1;m_i7<=forsize(m_v7);m_i7++) {
forelem(m,m_v7,m_i7);
a(j,m)=a(j-1.0,m)+dot_mul(k(j),a(j-1.0,j-m));
}
}
lpc_coeffs=brackets('M',(M)1.0,(M)a(p,c_p),l_M);
ref_coeffs=k;
nargin=old_nargin; nargout=old_nargout;
lpc_coeffs__out=lpc_coeffs; ref_coeffs__out=ref_coeffs; return(nop_M)\
;
}
main() {
initM();
levd();
exitM();
return 0;
}
Code C.2 Levinson Durbin Header File
#ifndef __levd_h
#define __levd_h
#include "matlib.h"
M levd(M samples, M num_coeff);
M levd(M samples, M num_coeff, i_o_t, Mr lpc_coeffs__out, Mr ref_coeffs__out);
#endif
Code C.3 MA Lattice Filter
#include "lat.h"
M lat(M b, M x) {
M ans(0,0,"ans"), output(0,0,"output"), N(0,0,"N"), p(0,0,"p"), F(0,0,"F")\
, G(0,0,"G"), j(0,0,"j"), j_v0(0,0,"j_v0"), i(0,0,"i"), i_v1(0,0,"i_v1")\
;
double old_nargin=nargin; if (!nargin_set) nargin =2;
double old_nargout=nargout; if (!nargout_set) nargout=1;
nargin_set=0; nargout_set=0;
//syntax =>[output] = lat_cpp(b,x)
N=length(x);
p=length(b);
F=zeros(p+1.0,N);
G=zeros(p+1.0,N);
F(1.0,c_p)=x;
G(1.0,c_p)=x;
j_v0=colon(2.0,1,N);
for (int j_i0=1;j_i0<=forsize(j_v0);j_i0++) {
forelem(j,j_v0,j_i0);
i_v1=colon(2.0,1,p+1.0);
for (int i_i1=1;i_i1<=forsize(i_v1);i_i1++) {
forelem(i,i_v1,i_i1);
F(i,j)=F(i-1.0,j)+b(i-1.0)*G(i-1.0,j-1.0);
// here we r generating the output of lattice filter i.e error
G(i,j)=b(i-1.0)*F(i-1.0,j)+G(i-1.0,j-1.0);
}
}
output=F(p+1.0,c_p);
nargin=old_nargin; nargout=old_nargout;
return(output);
}
main() {
initM();
lat();
exitM();
return 0;
}
Code C.4 MA Lattice Filter Header File
#ifndef __lat_h
#define __lat_h
#include "matlib.h"
M lat(M b, M x);
#endif
Code C.5 AR Lattice Filter
#include "latrec.h"
M latrec(M b, M x) {
M ans(0,0,"ans"), output1(0,0,"output1"), p(0,0,"p"), N(0,0,"N"), F(0,0,\
"F"), G(0,0,"G"), j(0,0,"j"), j_v0(0,0,"j_v0"), i(0,0,"i"), i_v1(0,0,"i_v1")\
, F_rec(0,0,"F_rec"), G_rec(0,0,"G_rec"), j_v2(0,0,"j_v2"), i_v3(0,0,"i_v3")\
;
double old_nargin=nargin; if (!nargin_set) nargin =2;
double old_nargout=nargout; if (!nargout_set) nargout=1;
nargin_set=0; nargout_set=0;
p=length(b);
N=length(x);
F=zeros(p+1.0,N);
// this is one way of implementing...that is combining both the outer
G=zeros(p+1.0,N);
// loop for N samples and the inner loop for 1 to p order
F(1.0,c_p)=x;
G(1.0,c_p)=x;
j_v0=colon(2.0,1,N);
for (int j_i0=1;j_i0<=forsize(j_v0);j_i0++) {
forelem(j,j_v0,j_i0);
i_v1=colon(2.0,1,p+1.0);
for (int i_i1=1;i_i1<=forsize(i_v1);i_i1++) {
forelem(i,i_v1,i_i1);
F(i,j)=F(i-1.0,j)+b(i-1.0)*G(i-1.0,j-1.0);
// here we r generating the output of lattice filter i.e error
G(i,j)=b(i-1.0)*F(i-1.0,j)+G(i-1.0,j-1.0);
}
}
// generating recreated signal
F_rec=F;
G_rec=G;
// i-1 is varying from 1 to 12 (i.e p)
//replace i-1 by p+1 -(i-1) = p-i+2
j_v2=colon(2.0,1,N);
for (int j_i2=1;j_i2<=forsize(j_v2);j_i2++) {
forelem(j,j_v2,j_i2);
i_v3=colon(2.0,1,p+1.0);
for (int i_i3=1;i_i3<=forsize(i_v3);i_i3++) {
forelem(i,i_v3,i_i3);
// here we r getting back the recreated signal
F_rec(p-i+2.0,j)=F_rec(p-i+3.0,j)-b(p-i+2.0)*G_rec(p-i+2.0,j-1.0)\
;
G_rec(p-i+3.0,j)=b(p-i+2.0)*F_rec(p-i+2.0,j)+G_rec(p-i+2.0,j-1.0)\
;
}
}
output1=F_rec(1.0,c_p);
nargin=old_nargin; nargout=old_nargout;
return(output1);
}
main() {
initM();
latrec();
exitM();
return 0;
}
Code C.6 AR Lattice Filter Header File
#ifndef __latrec_h
#define __latrec_h
#include "matlib.h"
M latrec(M b, M x);
#endif
Code C.7 Segmentation and Hanning Window
#include "hanwin.h"
M hanwin(M samples, M sam_rate, M time_segment) {
double old_nargin=nargin; nargin=3; nargin_set=1;
double old_nargout=nargout; nargout=1; nargout_set=1;
M windowed_samples__out(0,0,"windowed_samples__out");
hanwin(samples, sam_rate, time_segment, i_o, windowed_samples__out, junk_M,
junk_M);
nargout=old_nargout;
nargin=old_nargin;
return(windowed_samples__out);
}
M hanwin(M samples, M sam_rate, M time_segment, i_o_t, Mr
windowed_samples__out, Mr padded_window_length__out) {
double old_nargin=nargin; nargin=3; nargin_set=1;
double old_nargout=nargout; nargout=2; nargout_set=1;
hanwin(samples, sam_rate, time_segment, i_o, windowed_samples__out,
padded_window_length__out, junk_M);
nargout=old_nargout;
nargin=old_nargin;
return(sixpack_M);
}
M hanwin(M samples, M sam_rate, M time_segment, i_o_t, Mr
windowed_samples__out, Mr padded_window_length__out, Mr
padded_number_of_windows__out) {
M ans(0,0,"ans"), windowed_samples(0,0,"windowed_samples"),
padded_window_length(\
0,0,"padded_window_length"),
padded_number_of_windows(0,0,"padded_number_of_windows")\
, win_length(0,0,"win_length"), i(0,0,"i"), j(0,0,"j"), j_v0(0,0,"j_v0")\
, j_v1(0,0,"j_v1");
double old_nargin=nargin; if (!nargin_set) nargin =3;
double old_nargout=nargout; if (!nargout_set) nargout=3;
nargin_set=0; nargout_set=0;
win_length=sam_rate*time_segment;
padded_window_length=ceil(win_length);
if (istrue(rem(padded_window_length,2.0)==1.0)) {
padded_window_length+=1.0;
}
i=0.0;
while (istrue((length(samples)-i*(dot_div(padded_window_length,2.0)))\
>padded_window_length)) {
j_v0=colon(1.0,1,padded_window_length);
for (int j_i0=1;j_i0<=forsize(j_v0);j_i0++) {
forelem(j,j_v0,j_i0);
windowed_samples(i+1.0,j)=dot_mul(samples(1.0,i*(padded_window_length/2.0)\
+j),(0.5-dot_mul(0.5,(cos((dot_div(2.0*pi*(j-1.0),(padded_window_length-1.0)\
)))))));
}
i+=1.0;
}
j_v1=colon(1.0,1,padded_window_length);
for (int j_i1=1;j_i1<=forsize(j_v1);j_i1++) {
forelem(j,j_v1,j_i1);
if (istrue(i*(padded_window_length/2.0)+j<length(samples))) {
windowed_samples(i+1.0,j)=dot_mul(samples(1.0,i*(padded_window_length/2.0)\
+j),(0.5-dot_mul(0.5,(cos((dot_div(2.0*pi*(j-1.0),(padded_window_length-1.0)\
)))))));
} else {
windowed_samples(i+1.0,j)=0.0;
}
}
padded_number_of_windows=i+1.0;
nargin=old_nargin; nargout=old_nargout;
windowed_samples__out=windowed_samples;
padded_window_length__out=padded_window_length; \
padded_number_of_windows__out=padded_number_of_windows; return(nop_M)\
;
}
main() {
initM();
hanwin();
exitM();
return 0;
}
Code C.8 Segmentation and Hanning Window Header File
#ifndef __hanwin_h
#define __hanwin_h
#include "matlib.h"
M hanwin(M samples, M sam_rate, M time_segment);
M hanwin(M samples, M sam_rate, M time_segment, i_o_t, Mr
windowed_samples__out, Mr padded_window_length__out);
M hanwin(M samples, M sam_rate, M time_segment, i_o_t, Mr
windowed_samples__out, Mr padded_window_length__out, Mr
padded_number_of_windows__out);
#endif