Ioan Tabus
LMS algorithm
Variants of the LMS algorithm
Linear smoothing of LMS gradient estimates
We have shown that the adaptation equation (SD − p, R) can be written in an equivalent form (see also the figure with the implementation of the SD algorithm). In order to simplify the algorithm, instead of the true gradient of the criterion one uses the instantaneous (noisy) estimate
$$ \hat{\nabla}_{w(n)} J(n) = -2\,u(n)\,e(n) $$
LMS algorithm
Using the noisy gradient, the adaptation is carried out according to the equation
$$ w(n+1) = w(n) - \frac{1}{2}\,\mu\,\hat{\nabla}_{w(n)} J(n) = w(n) + \mu\,u(n)\,e(n) $$
In order to gain new information about the gradient estimate at each time instant, the procedure will go through the whole data set {(d(1), u(1)), (d(2), u(2)), . . .}, many times if needed.
LMS algorithm
Given
• the (correlated) input signal samples {u(1), u(2), u(3), . . .}, generated randomly;
• the desired signal samples {d(1), d(2), d(3), . . .}, correlated with {u(1), u(2), u(3), . . .}.
1. Initialize the algorithm with an arbitrary parameter vector w(0), for example w(0) = 0.
2. Iterate for n = 0, 1, 2, 3, . . . , n_max:
   2.0 Read/generate a new data pair (u(n), d(n))
   2.1 (Filter output) $y(n) = w(n)^T u(n) = \sum_{i=0}^{M-1} w_i(n)\,u(n-i)$
   2.2 (Output error) e(n) = d(n) − y(n)
   2.3 (Parameter adaptation) w(n + 1) = w(n) + µ u(n) e(n)
In vector form:
$$ \begin{bmatrix} w_0(n+1) \\ w_1(n+1) \\ \vdots \\ w_{M-1}(n+1) \end{bmatrix} = \begin{bmatrix} w_0(n) \\ w_1(n) \\ \vdots \\ w_{M-1}(n) \end{bmatrix} + \mu\,e(n) \begin{bmatrix} u(n) \\ u(n-1) \\ \vdots \\ u(n-M+1) \end{bmatrix} $$
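As a concrete illustration, here is a minimal NumPy sketch of the algorithm above; the system-identification setup (the plant w_true, the noise level, and the signal length) is an assumption made for the demo, not part of the lecture.

```python
import numpy as np

def lms(u, d, M, mu, w0=None):
    """LMS following steps 2.0-2.3: at each n, filter, compute the error,
    and adapt w(n+1) = w(n) + mu * e(n) * u(n)."""
    w = np.zeros(M) if w0 is None else np.asarray(w0, dtype=float).copy()
    e = np.zeros(len(u))
    for n in range(M - 1, len(u)):
        un = u[n - M + 1 : n + 1][::-1]   # tap-input vector [u(n), ..., u(n-M+1)]
        y = w @ un                        # 2.1 filter output
        e[n] = d[n] - y                   # 2.2 output error
        w = w + mu * e[n] * un            # 2.3 parameter adaptation
    return w, e

# Hypothetical usage: identify an unknown 4-tap FIR plant from noisy data
rng = np.random.default_rng(0)
u = rng.standard_normal(5000)
w_true = np.array([0.8, -0.4, 0.2, 0.1])                  # assumed plant
d = np.convolve(u, w_true)[: len(u)] + 0.01 * rng.standard_normal(len(u))
w_hat, e = lms(u, d, M=4, mu=0.01)
print(w_hat)   # should end up close to w_true
```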
The SD algorithm is guaranteed to converge to the Wiener optimal filter if the value of µ is selected properly (see the last lecture):
$$ w(n) \to w_o $$
The SD iterations are deterministic: starting from a given w(0), all iterates w(n) are perfectly determined.
The LMS iterations are not deterministic: the values w(n) depend on the realization of the data d(1), . . . , d(n) and u(1), . . . , u(n). Thus, w(n) is now a random variable.
The convergence of LMS can be analyzed from the following perspectives:
• Convergence of the parameters w(n) in the mean:
$$ E\,w(n) \to w_o $$
• Convergence of the criterion J(w(n)) (convergence in the mean square of the error).
Denote by ε(n) = w(n) − w_o the deviation of the parameters from the optimal ones. Subtracting w_o from both sides of the LMS update gives
$$ w(n+1) - w_o = w(n) - w_o + \mu\,u(n)\big(d(n) - w(n)^T u(n)\big) $$
$$ \varepsilon(n+1) = \varepsilon(n) + \mu\,u(n)\big(d(n) - w_o^T u(n)\big) + \mu\,u(n)\big(u(n)^T w_o - u(n)^T w(n)\big) $$
$$ = \varepsilon(n) + \mu\,u(n)\,e_o(n) - \mu\,u(n)u(n)^T \varepsilon(n) = \big(I - \mu\,u(n)u(n)^T\big)\varepsilon(n) + \mu\,u(n)\,e_o(n) $$
Taking expectations,
$$ E\,\varepsilon(n+1) = E\big[(I - \mu\,u(n)u(n)^T)\,\varepsilon(n)\big] + E\big[\mu\,u(n)\,e_o(n)\big] $$
and now, assuming the statistical independence of u(n) and ε(n), we have
$$ E\,\varepsilon(n+1) = \big(I - \mu\,E[u(n)u(n)^T]\big)E[\varepsilon(n)] + \mu\,E[u(n)\,e_o(n)] $$
Using the principle of orthogonality, which states that E[u(n) e_o(n)] = 0, the last equation becomes
$$ E[\varepsilon(n+1)] = \big(I - \mu\,E[u(n)u(n)^T]\big)E[\varepsilon(n)] = (I - \mu R)\,E[\varepsilon(n)] $$
This is the recursion
$$ c(n+1) = (I - \mu R)\,c(n) $$
which was used in the analysis of the stability of the SD algorithm; identifying now c(n) with E ε(n), we obtain the stability condition on the next page.
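The recursion E[ε(n+1)] = (I − µR) E[ε(n)] can be checked numerically by averaging the deviation over many independent runs. A sketch, assuming a white unit-variance input (so R = I and ∥E ε(n)∥ should decay like (1 − µ)ⁿ) and a hypothetical plant w_o:

```python
import numpy as np

rng = np.random.default_rng(0)
M, mu, runs, n_iter = 4, 0.02, 2000, 200
w_o = np.array([0.5, -0.3, 0.2, 0.1])     # hypothetical optimal filter
eps_mean = np.zeros((n_iter, M))          # accumulates eps(n) = w(n) - w_o
for _ in range(runs):
    w = np.zeros(M)
    u = rng.standard_normal(n_iter + M)
    for n in range(n_iter):
        un = u[n : n + M][::-1]           # tap-input vector
        e = w_o @ un + 0.1 * rng.standard_normal() - w @ un
        eps_mean[n] += w - w_o
        w = w + mu * e * un
eps_mean /= runs
# White input => R = I, so ||E eps(n)|| should shrink by (1 - mu) per step
print(np.linalg.norm(eps_mean[0]),                             # ~ ||w_o||
      np.linalg.norm(eps_mean[-1]),                            # measured
      (1 - mu) ** (n_iter - 1) * np.linalg.norm(eps_mean[0]))  # predicted
```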
The steady-state mean square error of LMS is
$$ J_\infty = J_o + \sum_{i=1}^{M} \frac{\mu\,J_o\,\lambda_i}{2 - \mu\lambda_i} $$
For µλ_i ≪ 2,
$$ J_\infty \approx J_o + \mu J_o \sum_{i=1}^{M}\lambda_i/2 = J_o\Big(1 + \mu\sum_{i=1}^{M}\lambda_i/2\Big) = J_o\big(1 + \mu\,\mathrm{tr}(R)/2\big) $$
and, since tr(R) = M r(0),
$$ J_\infty = J_o\big(1 + \mu\,M\,r(0)/2\big) = J_o\Big(1 + \frac{\mu M}{2}\cdot\text{(power of the input)}\Big) $$
Learning curves
The statistical performance of adaptive filters is studied using learning curves, averaged over many realizations (ensemble-averaged).
• The mean square error (MSE) learning curve: take the ensemble average of the squared estimation error e(n)²,
$$ J(n) = E\,e^2(n) \qquad (1) $$
• The mean square deviation (MSD) learning curve: take the ensemble average of the squared norm ∥ε(n)∥² of the deviation ε(n) = w(n) − w_o of the adaptive filter parameters from the optimal parameters,
$$ D(n) = E\,\|\varepsilon(n)\|^2 \qquad (2) $$
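A minimal sketch of how both curves are estimated by ensemble averaging (the plant w_o, the noise level, and the run counts are assumptions for the demo):

```python
import numpy as np

rng = np.random.default_rng(1)
M, mu, runs, N = 4, 0.05, 500, 1000
w_o = np.array([1.0, -0.5, 0.25, -0.125])    # assumed optimal parameters
J = np.zeros(N)                              # MSE curve, Eq. (1)
D = np.zeros(N)                              # MSD curve, Eq. (2)
for _ in range(runs):
    w = np.zeros(M)
    u = rng.standard_normal(N + M)
    for n in range(N):
        un = u[n : n + M][::-1]              # tap-input vector
        d = w_o @ un + 0.05 * rng.standard_normal()
        e = d - w @ un
        J[n] += e ** 2                       # accumulate e(n)^2
        D[n] += np.sum((w - w_o) ** 2)       # accumulate ||eps(n)||^2
        w = w + mu * e * un
J /= runs
D /= runs                                    # ensemble averages over the runs
```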
It is therefore enough to study the transient behavior of J_ex(n), since D(n) follows its evolution.
• The condition for stability:
$$ 0 < \mu < \frac{2}{\lambda_{\max}} \qquad (6) $$
• The steady-state excess MSE:
$$ J_{ex}(\infty) = \frac{\mu J_{\min}}{2} \sum_{k=1}^{M} \lambda_k \qquad (7) $$
• The misadjustment:
$$ \mathcal{M} = \frac{J_{ex}(\infty)}{J_{\min}} = \frac{\mu}{2}\sum_{k=1}^{M}\lambda_k = \frac{\mu}{2}\,\mathrm{tr}(R) = \frac{\mu}{2}\,M\,r(0) \qquad (8) $$
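As a quick numerical check of (8): with M = 10 taps, a unit-power input (r(0) = 1, hence tr(R) = 10) and µ = 0.02, the misadjustment is $\mathcal{M} = \frac{0.02}{2} \cdot 10 = 0.1$, i.e. the steady-state MSE sits 10% above J_min; halving µ halves the misadjustment, at the price of slower convergence.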
Example: consider the raised-cosine impulse response
$$ h(n) = \begin{cases} \dfrac{1}{2}\Big[1 + \cos\!\Big(\dfrac{2\pi}{W}(n-2)\Big)\Big], & n = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases} $$
The resulting autocorrelation values are
$$ r(0) = h(1)^2 + h(2)^2 + h(3)^2 + \sigma_v^2 $$
$$ r(1) = h(1)h(2) + h(2)h(3) $$
$$ r(2) = h(1)h(3) $$
$$ r(3) = r(4) = \cdots = 0 $$
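The sketch below evaluates these formulas and the resulting eigenvalue spread χ(R) = λ_max/λ_min (defined a few slides ahead) of the M × M autocorrelation matrix; the values σ_v² = 0.001, M = 11, and the grid of W values are assumptions chosen for illustration:

```python
import numpy as np

def eig_spread(W, sigma_v2=0.001, M=11):
    n = np.arange(1, 4)
    h = 0.5 * (1 + np.cos(2 * np.pi / W * (n - 2)))   # h(1), h(2), h(3)
    r = np.zeros(M)                                   # autocorrelation r(0..M-1)
    r[0] = np.sum(h ** 2) + sigma_v2
    r[1] = h[0] * h[1] + h[1] * h[2]
    r[2] = h[0] * h[2]                                # r(3) = r(4) = ... = 0
    R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])
    lam = np.linalg.eigvalsh(R)                       # ascending eigenvalues
    return lam[-1] / lam[0]                           # chi(R) = lmax / lmin

for W in (2.9, 3.1, 3.3, 3.5):                        # assumed test values
    print(f"W = {W}: chi(R) = {eig_spread(W):.1f}")
```

Increasing W makes the channel taps h(1) and h(3) larger, which increases the eigenvalue spread and slows LMS convergence.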
We define the eigenvalue spread χ(R) of a matrix R as the ratio of its maximum eigenvalue to its minimum eigenvalue, $\chi(R) = \lambda_{\max}/\lambda_{\min}$.
[Figure: ensemble-averaged learning curve (log scale) versus time step n]
[Figure: ensemble-averaged learning curve (log scale) versus time step n]
Summary
• The condition for stability is
$$ 0 < \mu < \frac{2}{\lambda_{\max}} $$
• For LMS filters with filter length M moderate to large, the convergence condition on the step size is
$$ 0 < \mu < \frac{2}{M\,S_{\max}} $$
where S_max is the maximum value of the power spectral density of the tap inputs.
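For example, for a white tap-input process the power spectral density is flat, S(ω) = r(0), so S_max = r(0) and the condition reads 0 < µ < 2/(M r(0)) = 2/tr(R). Since the eigenvalues of a correlation matrix never exceed S_max, this condition is more conservative than, and therefore implies, 0 < µ < 2/λ_max.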
For the Normalized LMS (NLMS) algorithm, the new parameter vector is required to reproduce the desired output exactly while the change
$$ \delta w(n+1) = w(n+1) - w(n) $$
is as small as possible. Thus we have to minimize
$$ \|\delta w(n+1)\|^2 = \sum_{k=0}^{M-1}\big(w_k(n+1) - w_k(n)\big)^2 $$
under the constraint
$$ w(n+1)^T u(n) = d(n) $$
The solution can be obtained by the method of Lagrange multipliers:
$$ J(w(n+1), \lambda) = \|\delta w(n+1)\|^2 + \lambda\big(d(n) - w(n+1)^T u(n)\big) = \sum_{k=0}^{M-1}\big(w_k(n+1) - w_k(n)\big)^2 + \lambda\Big(d(n) - \sum_{i=0}^{M-1} w_i(n+1)\,u(n-i)\Big) $$
To obtain the minimum of J(w(n+1), λ), we equate to zero the partial derivatives ∂J(w(n+1), λ)/∂w_j(n+1):
$$ \frac{\partial}{\partial w_j(n+1)}\bigg[\sum_{k=0}^{M-1}\big(w_k(n+1) - w_k(n)\big)^2 + \lambda\Big(d(n) - \sum_{i=0}^{M-1} w_i(n+1)\,u(n-i)\Big)\bigg] = 2\big(w_j(n+1) - w_j(n)\big) - \lambda\,u(n-j) = 0 $$
from which we have
$$ w_j(n+1) = w_j(n) + \frac{1}{2}\,\lambda\,u(n-j) $$
Substituting this into the constraint,
$$ d(n) = \sum_{i=0}^{M-1} w_i(n+1)\,u(n-i) = \sum_{i=0}^{M-1}\Big(w_i(n) + \frac{1}{2}\,\lambda\,u(n-i)\Big) u(n-i) = \sum_{i=0}^{M-1} w_i(n)\,u(n-i) + \frac{1}{2}\,\lambda \sum_{i=0}^{M-1} u(n-i)^2 $$
and solving for the multiplier,
$$ \lambda = \frac{2\big(d(n) - \sum_{i=0}^{M-1} w_i(n)\,u(n-i)\big)}{\sum_{i=0}^{M-1} u(n-i)^2} = \frac{2\,e(n)}{\sum_{i=0}^{M-1} u(n-i)^2} $$
Substituting λ back into the update gives
$$ w_j(n+1) = w_j(n) + \frac{e(n)}{\sum_{i=0}^{M-1} u(n-i)^2}\,u(n-j) $$
In order to add an extra degree of freedom to the adaptation strategy, a constant µ̃ controlling the step size is introduced:
$$ w_j(n+1) = w_j(n) + \tilde{\mu}\,\frac{1}{\sum_{i=0}^{M-1} u(n-i)^2}\,e(n)\,u(n-j) = w_j(n) + \frac{\tilde{\mu}}{\|u(n)\|^2}\,e(n)\,u(n-j) $$
To overcome possible numerical difficulties when ∥u(n)∥ is very close to zero, a constant a > 0 is used:
$$ w_j(n+1) = w_j(n) + \frac{\tilde{\mu}}{a + \|u(n)\|^2}\,e(n)\,u(n-j) $$
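A minimal NumPy sketch of this normalized update (the function name, defaults, and regularizer value are illustrative):

```python
import numpy as np

def nlms(u, d, M, mu_tilde=0.5, a=1e-6):
    """NLMS: w(n+1) = w(n) + mu_tilde / (a + ||u(n)||^2) * e(n) * u(n)."""
    w = np.zeros(M)
    e = np.zeros(len(u))
    for n in range(M - 1, len(u)):
        un = u[n - M + 1 : n + 1][::-1]   # tap-input vector u(n)
        e[n] = d[n] - w @ un              # a priori error
        w = w + mu_tilde / (a + un @ un) * e[n] * un
    return w, e
```

Because the effective step scales inversely with the instantaneous input power, the same µ̃ works across input levels, which is what makes the condition on µ̃ discussed below meaningful.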
Thus we may view the Normalized LMS algorithm as an LMS algorithm with a data-dependent adaptation step size
$$ \mu(n) = \frac{\tilde{\mu}}{a + \|u(n)\|^2} $$
• Considering the approximate expression
$$ \mu(n) \approx \frac{\tilde{\mu}}{a + M\,E[u^2(n)]} $$
the condition for convergence becomes
$$ 0 < \tilde{\mu} < 2 $$
[Figure: learning curves E e²(n) (log scale) versus time step n. Left panel, LMS with µ = 0.075, 0.025, 0.0075; right panel, Normalized LMS with µ̃ = 1.5, 1.0, 0.5, 0.1.]
• The LMS was run with three different step sizes: µ = 0.075, 0.025, 0.0075.
• The NLMS was run with four different step sizes: µ̃ = 1.5, 1.0, 0.5, 0.1.
• With the step size µ̃ = 1.5, NLMS behaved definitely worse than with µ̃ = 1.0 (slower, and with a higher steady-state square error), so µ̃ = 1.5 is ruled out.
• Each of the three step sizes µ̃ = 1.0, 0.5, 0.1 is interesting: on one hand, the larger the step size, the faster the convergence; on the other hand, the smaller the step size, the lower the steady-state square error. So each step size may be a useful tradeoff between convergence speed and stationary MSE (both cannot be very good simultaneously).
• LMS with µ = 0.0075 and NLMS with µ̃ = 0.1 achieved a similar (very good) average steady-state square error; however, NLMS was faster.
• LMS with µ = 0.075 and NLMS with µ̃ = 1.0 had a similar (very good) convergence speed; however, NLMS achieved a lower steady-state average square error.
• To conclude: NLMS offers better trade-offs than LMS.
• The computational complexity of NLMS is slightly higher than that of LMS.
[Figure: learning curves E e²(n) (log scale) versus time step n. Left panel, LMS with µ = 0.075, 0.025, 0.0075; right panel, variable-step-size LMS with c = 10, 20, 50, and LMS with µ = 0.0075 for reference.]
The instantaneous gradient estimate is
$$ g(n) = \hat{\nabla}_w J = -2\,u(n)\,e(n), \qquad g_i(n) = -2\,e(n)\,u(n-i) $$
Passing the signals g_i(n) through low-pass filters will prevent large fluctuations of the direction during the adaptation process:
$$ b_i(n) = \mathrm{LPF}\big(g_i(n)\big) $$
where LPF denotes a low-pass filtering operation. The updating process will use the filtered noisy gradient:
$$ w(n+1) = w(n) - \mu\,b(n) $$
Writing the update at time n together with γ times the update at time n − 1,
$$ w(n+1) = w(n) - \mu\,b(n) $$
$$ \gamma\,w(n) = \gamma\,w(n-1) - \gamma\mu\,b(n-1) $$
and subtracting the two equations,
$$ w(n+1) - \gamma\,w(n) = w(n) - \gamma\,w(n-1) - \mu\,b(n) + \gamma\mu\,b(n-1) $$
$$ w(n+1) = w(n) + \gamma\big(w(n) - w(n-1)\big) - \mu\big(b(n) - \gamma\,b(n-1)\big) $$
For a first-order low-pass filter b(n) = γ b(n − 1) + (1 − γ) g(n) we have b(n) − γ b(n − 1) = (1 − γ) g(n), and therefore
$$ w(n+1) = w(n) + \gamma\big(w(n) - w(n-1)\big) - \mu(1-\gamma)\,g(n) = w(n) + \gamma\big(w(n) - w(n-1)\big) + 2\mu(1-\gamma)\,e(n)\,u(n) $$
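In this momentum form, the smoothed-gradient LMS needs only the previous parameter vector. A minimal sketch under the same first-order LPF assumption (the function name and defaults are illustrative):

```python
import numpy as np

def momentum_lms(u, d, M, mu=0.05, gamma=0.9):
    """LMS with a low-pass-smoothed gradient, in the momentum form of the
    last equation above:
    w(n+1) = w(n) + gamma*(w(n) - w(n-1)) + 2*mu*(1-gamma)*e(n)*u(n)."""
    w = np.zeros(M)
    w_prev = np.zeros(M)
    e = np.zeros(len(u))
    for n in range(M - 1, len(u)):
        un = u[n - M + 1 : n + 1][::-1]   # tap-input vector u(n)
        e[n] = d[n] - w @ un
        w_next = w + gamma * (w - w_prev) + 2 * mu * (1 - gamma) * e[n] * un
        w_prev, w = w, w_next             # shift the parameter history
    return w, e
```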
• If there is impulsive interference in either d(n) or u(n), the performance of the LMS algorithm will degrade drastically (sometimes even leading to instability). Smoothing the noisy gradient components using a nonlinear filter provides a potential solution.
• The Median LMS algorithm: computing the median over a window of size N + 1, for each component of the gradient vector, will smooth out the effect of impulsive noise. The adaptation equation can be implemented as
$$ w_i(n+1) = w_i(n) + \mu\,\mathrm{med}\big(e(n)u(n-i),\; e(n-1)u(n-1-i),\; \ldots,\; e(n-N)u(n-N-i)\big) $$
* Experimental evidence shows that the smoothing effect in impulsive noise environments is very strong.
* If the environment is not impulsive, the performance of Median LMS is comparable with that of LMS, so the extra computational cost of Median LMS is not worthwhile.
* However, the convergence rate may be slower than for LMS, and there are reports of instability occurring with Median LMS.
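A minimal NumPy sketch of this median-smoothed update (the window length, plant, and impulsive-noise model are illustrative assumptions; note the gradient-history buffer starts at zero, which biases the very first updates):

```python
import numpy as np

def median_lms(u, d, M, mu=0.05, N_win=4):
    """Median LMS: each gradient component e(n)u(n-i) is replaced by the
    median of its last N_win+1 values (window of size N+1 above)."""
    w = np.zeros(M)
    hist = np.zeros((N_win + 1, M))       # recent e(n)u(n-i) products
    e = np.zeros(len(u))
    for n in range(M - 1, len(u)):
        un = u[n - M + 1 : n + 1][::-1]   # tap-input vector
        e[n] = d[n] - w @ un
        hist = np.roll(hist, 1, axis=0)   # shift the window
        hist[0] = e[n] * un
        w = w + mu * np.median(hist, axis=0)
    return w, e

# Hypothetical test: rare large impulses added to d(n)
rng = np.random.default_rng(2)
u = rng.standard_normal(4000)
w_true = np.array([0.8, -0.4, 0.2, 0.1])
d = np.convolve(u, w_true)[: len(u)]
d += np.where(rng.random(len(u)) < 0.01, 50.0, 0.0)
w_hat, _ = median_lms(u, d, M=4)
```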