Abstract—The data-reuse RLS (DR-RLS) algorithm is a computationally efficient technique that has been recently introduced to improve the tracking capabilities of the RLS. Nevertheless, no analysis of its convergence has been proposed so far. The aim of this work is to fill this gap with theoretical analyses in the mean and mean-square error sense. Transient and steady-state models are provided. They are validated through numerical simulations.

Index Terms—Adaptive filtering, data-reuse, RLS algorithm, transient convergence analysis, steady-state analysis.

I. INTRODUCTION

[...] echo cancellation problems [21]. Based on the data-reuse principle, the low-complexity DR-RLS algorithm with dichotomous coordinate descent iterations was introduced [22].

No theoretical analysis of the performance of the DR-RLS algorithm has been proposed so far in the literature, essentially because of the difficulty in statistically evaluating the effects of the data-reuse term. Here we address this issue by carrying out a detailed analysis of the performance of the DR-RLS in the mean and mean-square error sense. Simulation results are provided to validate the resulting theoretical transient and steady-state models. This work therefore contributes to the general comprehension of the stochastic convergence characteristics of the DR-RLS algorithm.
... time-averaged correlation matrix Φ_n of the input data [1], [2], defined as:

Φ_n = \sum_{i=0}^{n} λ^{n−i} x_i x_i^⊤ + δ λ^{n+1} I_L = λ Φ_{n−1} + x_n x_n^⊤.  (6)

Setting M = 1, the DR-RLS algorithm reduces to the RLS. Replacing (3) into (5), the recursion of the DR-RLS can be reformulated as:

w_n = w_{n−1} + \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (x_n^⊤ P_n x_n)^{i−1} e_n P_n x_n.  (7)

III. TRANSIENT PERFORMANCE

We now analyze the convergence behavior of the DR-RLS in the mean and mean-square error sense. Defining the weight error vector w̃_n at time instant n between w_n and w⋆ as:

w̃_n = w_n − w⋆,  (8)

we aim to study the recursion of E{w̃_n} and its correlation matrix K_n = E{w̃_n w̃_n^⊤} over time. Before proceeding, for mathematical tractability, we need to introduce the following simplifying assumption.

Assumption A1: The weight error vector w̃_{n−1} is statistically independent of the regression input vector x_n.

The above independence assumption A1 is commonly used for analyzing the transient behavior of adaptive filters [1], [2].

A. Mean error transient behavior model

We start with the error analysis of the DR-RLS in the mean error sense. By taking the expected value of (6), we obtain:

E{Φ_n} = λ E{Φ_{n−1}} + R_x  (9)

with the initial condition Φ_0 = δ I_L and δ > 0 a regularization parameter. Note that (9) will be utilized to derive the analytical models below. Substituting (1) into (4), then using definition (8), the instantaneous estimation error (4) becomes: [...]

Taking the expected value of (12), then considering the statistical independence of the noise z_n from any other signal, and E{z_n} = 0, yields:

E{Φ_n w̃_n} = E{Φ_n w̃_{n−1}} − \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1}}.  (13)

The second term on the r.h.s. of (13) can be approximated as follows:

\sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1}}  (14)

 (a)≈ \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} E{(∥x_n∥²_{P_n})^{i−1} x_n x_n^⊤} E{w̃_{n−1}}

 (b)≈ \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} E{(∥x_n∥²_{P_n})^{i−1}} R_x E{w̃_{n−1}}

 (c)≈ \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (tr{R_x E{Φ_n}^{−1}})^{i−1} R_x E{w̃_{n−1}}.

Approximation (a) results from assumption A1. The rationale behind approximation (b) is as follows. First, we observe that the entries of the matrix x_n x_n^⊤ are of the form x(n−i)x(n−j). A common approximation that works well for a reasonably long regression vector x_n is to ignore the correlation between its squared norm ∥x_n∥²_{P_n} and the products x(n−i)x(n−j), since the former tends to vary slowly around its mean value (1−λ)L [23]–[25]. Approximation (c) results from a similar argument. The rationale of the above approximations is experimentally confirmed in the simulations section. Finally, with the approximation E{Φ_n w̃_n} ≈ E{Φ_n} E{w̃_n} successfully utilized in [26]–[28], and E{Φ_n w̃_{n−1}} ≈ E{Φ_n} E{w̃_{n−1}} studied in Appendix A, and then using (14), (13) becomes:

E{Φ_n} E{w̃_n} = E{Φ_n} E{w̃_{n−1}} − \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (tr{R_x E{Φ_n}^{−1}})^{i−1} R_x E{w̃_{n−1}}.  (15)
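To make the recursion above concrete, here is a minimal Python sketch of one DR-RLS iteration based on (6)–(7). It is a sketch under stated assumptions, not the paper's code: the helper name `dr_rls_update` is ours, P_n = Φ_n^{−1} is propagated with the matrix inversion lemma applied to (6), and e_n is taken as the a priori error d_n − w_{n−1}^⊤ x_n; since eqs. (1)–(5) are not reproduced in this excerpt, these conventions are assumptions rather than the paper's exact definitions.

```python
import numpy as np
from math import comb

def dr_rls_update(w, P, x, d, lam=0.999, M=2):
    """One DR-RLS iteration sketched from (6)-(7).

    w : current weight vector w_{n-1}
    P : inverse correlation matrix P_{n-1} = Phi_{n-1}^{-1}
    x : regression vector x_n
    d : desired sample d_n
    """
    # Update P_n = Phi_n^{-1} via the matrix inversion lemma applied to (6)
    Px = P @ x
    P = (P - np.outer(Px, Px) / (lam + x @ Px)) / lam
    e = d - w @ x                              # a priori error (assumed definition of e_n)
    Px = P @ x                                 # P_n x_n
    q = x @ Px                                 # x_n^T P_n x_n
    # Data-reuse gain: sum_{i=1}^{M} C(M,i) (-1)^{i+1} q^{i-1}, cf. (7)
    g = sum(comb(M, i) * (-1) ** (i + 1) * q ** (i - 1) for i in range(1, M + 1))
    w = w + g * e * Px                         # weight update (7)
    return w, P
```

For M = 1 the gain collapses to g = 1 and the update is the conventional RLS, consistent with the remark below (7).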
The mean-square-deviation (MSD) at time instant n is defined by [2]:

MSD_n = E{∥w̃_n∥²} = tr{K_n}.  (18)

To evaluate (17) and (18), we have to devise a recursion for calculating K_n. Post-multiplying the recursive relation (12) by its transpose, taking the expected value, and then considering the statistics of z_n, yields:

E{Φ_n w̃_n w̃_n^⊤ Φ_n} = E{Φ_n w̃_{n−1} w̃_{n−1}^⊤ Φ_n} + T_1 + T_2 − (T_3 + T_3^⊤)  (19)

where

T_1 = \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} E{(x_n^⊤ P_n x_n)^{i+j−2} z_n² x_n x_n^⊤},  (20)

T_2 = \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} E{(x_n^⊤ P_n x_n)^{i+j−2} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ x_n x_n^⊤},  (21)

T_3 = \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ Φ_n}.  (22)

We shall now focus on calculating the matrices T_1 to T_3. Applying similar approximations as in (14), and using the statistical properties of z_n, the matrix T_1 can be approximated by:

T_1 ≈ \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} (tr{R_x E{Φ_n}^{−1}})^{i+j−2} σ_z² R_x.  (23)

Using A1 and Isserlis' theorem, the matrix T_2 can be determined as follows [1], [29]:

T_2 ≈ \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} (tr{R_x E{Φ_n}^{−1}})^{i+j−2} E{x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ x_n x_n^⊤}
    = \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} (tr{R_x E{Φ_n}^{−1}})^{i+j−2} (R_x tr{R_x K_{n−1}} + 2 R_x K_{n−1} R_x).  (24)

To evaluate the matrix T_3, we introduce the following approximation studied in Appendix A:

E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ Φ_n} ≈ E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤} E{Φ_n}.  (25)

Using (25) and considering A1, the matrix T_3 is given by:

T_3 ≈ \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (tr{R_x E{Φ_n}^{−1}})^{i−1} R_x K_{n−1} E{Φ_n}.  (26)

Replacing (23), (24), and (26) into (19), using the two necessary approximations E{Φ_n w̃_n w̃_n^⊤ Φ_n} ≈ E{Φ_n} K_n E{Φ_n} and E{Φ_n w̃_{n−1} w̃_{n−1}^⊤ Φ_n} ≈ E{Φ_n} K_{n−1} E{Φ_n}, and then pre-multiplying and post-multiplying by E{Φ_n}^{−1}, we arrive at the mean-square error behavior of the DR-RLS algorithm, given by the recursion (27).

IV. STEADY-STATE PERFORMANCE

We shall now investigate the error of the DR-RLS at steady-state from the transient models devised in the previous section. Because 0 ≪ λ < 1 and using (6), the steady-state expectation of the matrix P_n can be approximately computed as follows [2], [30], [31]:

lim_{n→∞} E{P_n} ≈ [lim_{n→∞} E{P_n^{−1}}]^{−1} = [lim_{n→∞} E{Φ_n}]^{−1} = (1−λ) R_x^{−1}.  (28)

Theorem 1 (Mean stability): Assume that model (1) and A1 hold. The weight error vector of the DR-RLS algorithm converges to the zero vector as n → ∞ if the forgetting factor λ satisfies:

1 − 2 / ((M−1)L) < λ < 1.  (29)

Proof: Assume that (28) approximately holds when n is sufficiently large. Replacing (28) into (16) as n → ∞ results in the mean weight error of the DR-RLS at steady-state:

E{w̃_n} = [1 − \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (1−λ)^i L^{i−1}] E{w̃_{n−1}},  (30)

as n → ∞. To guarantee convergence in the mean, (30) implies that:

0 < \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (1−λ)^i L^{i−1} < 1.  (31)

Since 0 ≪ λ < 1, the r.h.s. of (31) holds. Hence, we only need to prove the l.h.s. of (31). When M is odd, we find that the last term is positive. Consider now the case when M is even. Then, we split the terms defined by (31) into M/2 sub-terms, namely,

M!/(M−1)! (1−λ) − M!/(2!(M−2)!) (1−λ)² L > 0, …,
M!/(M−1)! (1−λ)^{M−1} L^{M−2} − (1−λ)^M L^{M−1} > 0,

which are positive provided that (29) is satisfied. Therefore, we conclude that the DR-RLS solution is asymptotically unbiased if condition (29) is satisfied. ■

Let us define the steady-state correlation matrix of w̃_n by K_∞ = lim_{n→∞} K_n. Substituting (28) into (27), and assuming that the DR-RLS converges as n → ∞, yields:

2(ρ_1 − ρ_2) K_∞ = ρ_2 σ_z² R_x^{−1} + ρ_2 tr{R_x K_∞} R_x^{−1}  (32)

with

ρ_1 = \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} [L(1−λ)]^{i−1} (1−λ)  (33)
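As a quick numerical sanity check of Theorem 1 (a sketch with our own helper names, not code from the paper), the stability bound (29) and the bracketed contraction factor of the mean recursion (30) can be evaluated directly:

```python
from math import comb

def lambda_lower_bound(L, M):
    """Lower bound on the forgetting factor from condition (29); requires M >= 2."""
    return 1.0 - 2.0 / ((M - 1) * L)

def mean_contraction_factor(lam, L, M):
    """Bracketed factor in (30): 1 - sum_i C(M,i) (-1)^{i+1} (1-lam)^i L^{i-1}.
    Mean convergence requires this factor to lie strictly between 0 and 1."""
    s = sum(comb(M, i) * (-1) ** (i + 1) * (1.0 - lam) ** i * L ** (i - 1)
            for i in range(1, M + 1))
    return 1.0 - s
```

For instance, with L = 64 and M = 4 (the values used in the simulations), (29) requires λ > 1 − 2/192 ≈ 0.9896; the choice λ = 0.9995 then gives a factor of about 0.998, inside (0, 1), whereas λ = 0.95 violates the bound and yields a factor larger than 1.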
K_n = K_{n−1} + \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} (tr{R_x E{Φ_n}^{−1}})^{i+j−2} σ_z² E{Φ_n}^{−1} R_x E{Φ_n}^{−1}
    + \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} (tr{R_x E{Φ_n}^{−1}})^{i+j−2} E{Φ_n}^{−1} (R_x tr{R_x K_{n−1}} + 2 R_x K_{n−1} R_x) E{Φ_n}^{−1}
    − \sum_{i=1}^{M} \binom{M}{i} (−1)^{i+1} (tr{R_x E{Φ_n}^{−1}})^{i−1} (E{Φ_n}^{−1} R_x K_{n−1} + K_{n−1} R_x E{Φ_n}^{−1})  (27)

and

ρ_2 = \sum_{i=1}^{M} \sum_{j=1}^{M} \binom{M}{i} \binom{M}{j} (−1)^{i+j+2} [L(1−λ)]^{i+j−2} (1−λ)².  (34)

Consider the following properties:

tr{ΣB} = [vec(B^⊤)]^⊤ σ,  (35)

vec{AΣB} = (B^⊤ ⊗ A) σ,  (36)

where σ = vec(Σ). Post-multiplying both sides of (32) by R_x, then applying (35) and (36) with k_∞ = vec(K_∞), yields:

2(ρ_1 − ρ_2) (R_x ⊗ I_L) k_∞ = ρ_2 σ_z² vec(I_L) + ρ_2 vec(I_L) [vec(R_x)]^⊤ k_∞.

Then, we obtain:

k_∞ = ρ_2 σ_z² [2(ρ_1 − ρ_2) (R_x ⊗ I_L) − ρ_2 vec(I_L) [vec(R_x)]^⊤]^{−1} vec(I_L),

which allows us to recover the steady-state matrix K_∞. Finally, we can compute the steady-state EMSE or MSE, and the MSD of the DR-RLS according to (17) and (18) as n → ∞.

V. SIMULATION RESULTS

The overall performance of the DR-RLS has already been validated thoroughly in [20], [21]. We now verify the correctness and precision of the presented theoretical models with simulations. All empirical curves were averaged over 200 Monte Carlo runs. The input signal was obtained by filtering a zero-mean white Gaussian noise s(n) through an AR(1) model, namely, x(n) = 0.6 x(n−1) + s(n), where the variances σ_s² and σ_x² were set to 0.64 and 1, respectively. The noise variance σ_z² was set to 0.02. Figure 1(a) depicts w⋆ ∈ R^64, generated from a standard normal distribution and scaled by an exponential decay factor of 0.5. The data-reuse parameter M was set to 4. The forgetting factor λ and the parameter δ were set to 0.9995 and 1 × 10³, respectively. The weight vector w_0 was initialized to zero. Figure 1(b) illustrates the mean weight behavior of the DR-RLS. We can see that all the theoretical curves (dotted red) of the weight coefficients w_n(ℓ) predicted by (16) are consistent with the empirical curves (solid blue), which validates the asymptotic unbiasedness of the algorithm. Figure 1(c) illustrates the good consistency between the empirical learning curves of the EMSE and their theoretical prediction obtained from (17), (27), and (39) in the transient and steady-state phases. Figure 1(d) shows that the empirical learning curves of the MSD generally match well with their transient and steady-state theoretical predictions based on (18), (27), and (39). To conclude, Fig. 1 illustrates the accuracy of our models and the rationality of all the approximations used in the analysis.

[Fig. 1. Comparisons of empirical results and theoretical predictions. (c) EMSE curves. (d) MSD curves.]

APPENDIX A
PROOFS OF APPROXIMATIONS

To justify the consistency of some approximations introduced in this paper, we decompose Φ_n = E{Φ_n} + Δ_n, with Δ_n some random fluctuations. We have:

E{Φ_n w̃_{n−1}} = E{Φ_n} E{w̃_{n−1}} + E{Δ_n w̃_{n−1}},  (40)

E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ Φ_n}
    = E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤} E{Φ_n}
    + E{(x_n^⊤ P_n x_n)^{i−1} x_n x_n^⊤ w̃_{n−1} w̃_{n−1}^⊤ Δ_n},  (41)

E{Φ_n w̃_{n−1} w̃_{n−1}^⊤ Φ_n} = E{Φ_n} K_{n−1} E{Φ_n} + E{Φ_n} E{w̃_{n−1} w̃_{n−1}^⊤ Δ_n} + E{Δ_n w̃_{n−1} w̃_{n−1}^⊤ Δ_n} + E{Δ_n w̃_{n−1} w̃_{n−1}^⊤} E{Φ_n}.  (42)

We assume that the entries of Δ_n are small in comparison with those of E{Φ_n}, because equation (6) means that Φ_n is a low-pass filtering of x_n x_n^⊤ [26]–[28]. Therefore, the first term on the r.h.s. of (40)–(42) dominates the remaining ones, leading to the approximations.
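The theoretical transient curves compared against the empirical ones in Section V can be generated by simply iterating the models. Below is a minimal Python sketch (the function name is ours, and a shorter filter, smaller λ, and fewer iterations than in the paper are used only to keep the run brief) that propagates E{Φ_n} via (9) and E{w̃_n} via the mean recursion obtained from (15):

```python
import numpy as np
from math import comb

def mean_model(w_star, Rx, lam, M, delta, n_iter):
    """Iterate the theoretical mean model: E{Phi_n} from (9) and
    E{w_tilde_n} from the recursion implied by (15), with w_0 = 0.
    Returns the trajectory of ||E{w_tilde_n}||."""
    L = len(w_star)
    Phi = delta * np.eye(L)                    # E{Phi_0} = delta * I_L
    wt = -w_star.astype(float)                 # E{w_tilde_0} = w_0 - w_star
    norms = [np.linalg.norm(wt)]
    for _ in range(n_iter):
        Phi = lam * Phi + Rx                   # (9)
        G = np.linalg.solve(Phi, Rx)           # E{Phi_n}^{-1} R_x
        t = np.trace(G)                        # tr{R_x E{Phi_n}^{-1}}
        c = sum(comb(M, i) * (-1) ** (i + 1) * t ** (i - 1)
                for i in range(1, M + 1))
        wt = wt - c * (G @ wt)                 # mean recursion from (15)
        norms.append(np.linalg.norm(wt))
    return np.array(norms)
```

With an AR(1) input as in Section V, R_x is the Toeplitz matrix with entries 0.6^{|i−j|}; iterating the model with λ inside the bound (29) drives ∥E{w̃_n}∥ toward zero, consistent with the asymptotic unbiasedness observed in Fig. 1(b).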
REFERENCES

[1] S. Haykin, Adaptive Filter Theory, 2nd ed. New Jersey: Prentice-Hall, 1991.
[2] A. H. Sayed, Fundamentals of Adaptive Filtering. New York: Wiley, 2003.
[3] P. S. R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, 4th ed. New York, USA: Springer, 2013.
[4] J. Cioffi and T. Kailath, "Fast, recursive-least-squares transversal filters for adaptive filtering," IEEE Trans. Acoust., Speech, Signal Process., vol. 32, no. 2, pp. 304–337, 1984.
[5] A. Carini and E. Mumolo, "A numerically stable fast RLS algorithm for adaptive filtering and prediction based on the UD factorization," IEEE Trans. Signal Process., vol. 47, no. 8, pp. 2309–2313, 1999.
[6] J. Benesty and T. Gänsler, "New insights into the RLS algorithm," EURASIP J. Adv. Signal Process., vol. 3, pp. 331–339, 2004.
[7] L. Rey Vega, H. Rey, J. Benesty, and S. Tressens, "A fast robust recursive least-squares algorithm," IEEE Trans. Signal Process., vol. 57, no. 3, pp. 1209–1216, Mar. 2009.
[8] B. Toplis and S. Pasupathy, "Tracking improvements in fast RLS algorithms using a variable forgetting factor," IEEE Trans. Acoust., Speech, Signal Process., vol. 36, no. 2, pp. 206–227, 1988.
[9] S.-H. Leung and C. So, "Gradient-based variable forgetting factor RLS algorithm in time-varying environments," IEEE Trans. Signal Process., vol. 53, no. 8, pp. 3141–3150, 2005.
[10] C. Paleologu, J. Benesty, and S. Ciochina, "A robust variable forgetting factor recursive least-squares algorithm for system identification," IEEE Signal Process. Lett., vol. 15, pp. 597–600, 2008.
[11] Y. Cai, R. C. de Lamare, M. Zhao, and J. Zhong, "Low-complexity variable forgetting factor mechanism for blind adaptive constrained constant modulus algorithms," IEEE Trans. Signal Process., vol. 60, no. 8, pp. 3988–4002, 2012.
[12] Y. J. Chu and S. C. Chan, "A new local polynomial modeling-based variable forgetting factor RLS algorithm and its acoustic applications," IEEE/ACM Trans. Audio, Speech, Language Process., vol. 23, no. 11, pp. 2059–2069, 2015.
[13] Y. V. Zakharov, G. P. White, and J. Liu, "Low-complexity RLS algorithms using dichotomous coordinate descent iterations," IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3150–3161, 2008.
[14] Y. V. Zakharov and V. H. Nascimento, "DCD-RLS adaptive filters with penalties for sparse identification," IEEE Trans. Signal Process., vol. 61, no. 12, pp. 3198–3213, 2013.
[15] C. Elisei-Iliescu, C. Paleologu, J. Benesty, C. Stanciu, C. Anghel, and S. Ciochină, "Recursive least-squares algorithms for the identification of low-rank systems," IEEE/ACM Trans. Audio, Speech, Language Process., vol. 27, no. 5, pp. 903–918, May 2019.
[16] Y. Yu, L. Lu, Z. Zheng, W. Wang, Y. Zakharov, and R. C. de Lamare, "DCD-based recursive adaptive algorithms robust against impulsive noise," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 67, no. 7, pp. 1359–1363, 2020.
[17] J. Apolinário, M. L. R. Campos, and P. S. R. Diniz, "Convergence analysis of the binormalized data-reusing LMS algorithm," IEEE Trans. Signal Process., vol. 48, no. 11, pp. 3235–3242, 2000.
[18] P. S. R. Diniz and S. Werner, "Set-membership binormalized data-reusing LMS algorithms," IEEE Trans. Signal Process., vol. 51, no. 1, pp. 124–134, 2003.
[19] R. A. Soni, K. A. Gallivan, and W. K. Jenkins, "Low-complexity data reusing methods in adaptive filtering," IEEE Trans. Signal Process., vol. 52, no. 2, pp. 394–405, 2004.
[20] C. Paleologu, J. Benesty, and S. Ciochină, "Data-reuse recursive least-squares algorithms," IEEE Signal Process. Lett., vol. 29, pp. 752–756, 2022.
[21] L. M. Dogariu, C. Paleologu, J. Benesty, and S. Ciochina, "On the performance of a data-reuse fast RLS algorithm for acoustic echo cancellation," in IEEE BlackSeaCom, 2022, pp. 135–140.
[22] I. D. Fîciu, C. L. Stanciu, C. Elisei-Iliescu, C. Anghel, and R. M. Udrea, "Low-complexity implementation of a data-reuse RLS algorithm," in 45th International Conference on Telecommunications and Signal Processing (TSP), 2022, pp. 289–293.
[23] C. Samson and V. Reddy, "Fixed point error analysis of the normalized ladder algorithm," IEEE Trans. Acoust., Speech, Signal Process., vol. 31, no. 5, pp. 1177–1191, 1983.
[24] S. de Almeida, J. Bermudez, N. Bershad, and M. Costa, "A statistical analysis of the affine projection algorithm for unity step size and autoregressive inputs," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 7, pp. 1394–1405, 2005.
[25] J. Chen, C. Richard, J.-C. M. Bermudez, and P. Honeine, "Variants of non-negative least-mean-square algorithm and convergence analysis," IEEE Trans. Signal Process., vol. 62, no. 15, pp. 3990–4005, 2014.
[26] E. Eweda, N. J. Bershad, and J. C. M. Bermudez, "Stochastic analysis of the recursive least squares algorithm for cyclostationary colored inputs," IEEE Trans. Signal Process., vol. 68, pp. 676–686, 2020.
[27] W. Gao, J. Chen, and C. Richard, "Transient theoretical analysis of diffusion RLS algorithm for cyclostationary colored inputs," IEEE Signal Process. Lett., vol. 28, pp. 1160–1164, 2021.
[28] W. Gao, J. Chen, C. Richard, W. Shi, and Q. Zhang, "Transient performance analysis of the ℓ1-RLS," IEEE Signal Process. Lett., vol. 29, pp. 90–94, 2022.
[29] J. Chen, C. Richard, Y. Song, and D. Brie, "Transient performance analysis of zero-attracting LMS," IEEE Signal Process. Lett., vol. 23, no. 12, pp. 1786–1790, 2016.
[30] A. Bertrand, M. Moonen, and A. H. Sayed, "Diffusion bias-compensated RLS estimation over adaptive networks," IEEE Trans. Signal Process., vol. 59, no. 11, pp. 5212–5224, Nov. 2011.
[31] R. Arablouei, K. Dogancay, S. Werner, and Y. Huang, "Adaptive distributed estimation based on recursive least-squares and partial diffusion," IEEE Trans. Signal Process., vol. 62, no. 14, pp. 3510–3522, Jul. 2014.