
Transform Domain LMF Algorithm for Sparse System Identification Under Low SNR
Murwan Bashir and Azzedine Zerguine

Abstract—In this work, a transform domain Least Mean Fourth (LMF) adaptive filter for sparse system identification under low Signal-to-Noise Ratio (SNR) is proposed. Unlike the Least Mean Square (LMS) algorithm, the LMF algorithm, because of its error nonlinearity, performs very well in these environments. Moreover, its transform domain version performs outstandingly when the input signal is correlated. However, it lacks the capability to exploit sparsity information. To overcome this limitation, a zero-attractor mechanism based on the l1 norm is implemented, yielding the Zero-Attractor Transform-Domain LMF (ZA-TD-LMF) algorithm. The ZA-TD-LMF algorithm ensures fast convergence and attracts all the filter coefficients to zero. Simulation results substantiate these claims.
Index Terms—Least Mean Fourth (LMF), Transform Domain (TD), Zero Attractor (ZA), sparse solution.

I. INTRODUCTION

The LMF algorithm [1] is known to perform better than the LMS algorithm in the case of non-Gaussian noise and in low SNR environments. However, neither algorithm exploits the special structure of sparsity that appears in many systems, e.g., digital transmission channels [2] and wide area wireless channels. Several approaches have been used to endow adaptive algorithms with the ability to recognize such systems.
For example, the work in [3] uses sequential updating, since
the sparse filters are long by nature and most of the elements
are zeros. The proportionate LMF algorithm [4] and the PNLMS algorithm [5] have been applied to sparse system identification; there, the update gain assigned to each coefficient depends on the value of the weight.
The advent of compressive sensing [6] and the least absolute shrinkage and selection operator (LASSO) [7] inspired a different approach: endowing adaptive algorithms with the ability to recognize sparse structures. By adding an l1-norm penalty to the LMS cost function, the sparsity-aware zero-attractor LMS (ZA-LMS) algorithm was derived [8].
This algorithm fundamentally tries to attract all the weights to zero, hence the name zero attractor. To avoid the strong bias of the ZA-LMS algorithm when the system is not sparse, a weighted zero attractor was introduced in [8], endowing the resulting adaptation with the ability to recognize the nonzero elements and to apply only a small attraction to zero for this group of elements. The sparse LMF algorithms introduced in [9] were shown to outperform their sparse-aware LMS counterparts in low SNR environments. However, both the sparse LMS and the sparse LMF families inherit the slow
M. Bashir and A. Zerguine are with the Department of Electrical Engineering, King Fahd University of Petroleum & Minerals, Dhahran, 31261, KSA
(e-mail: {g201304570,azzedine}

convergence property in the correlated environment, where the
eigenvalue spread of the autocorrelation matrix of the input
signal is large.
Applying a discrete transformation (e.g., the DCT or the DFT) accompanied by power normalization to the input is known to whiten the input and shrink the eigenvalue spread, as in the case of the transform domain LMS algorithm in [10]. Moreover, endowing the TD-LMS algorithm with a zero attractor yields the TD-ZA-LMS and TD-WZA-LMS algorithms [11], whose convergence was shown to be faster than that of their ZA-LMS and WZA-LMS counterparts.
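The whitening effect underlying this family of algorithms can be illustrated numerically. The following Python sketch (our illustration, not part of the original paper) builds the autocorrelation matrix of an AR(1) input with correlation coefficient 0.9 and compares its eigenvalue spread before and after an orthonormal DCT-II followed by power normalization:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix (rows are the transform basis vectors)."""
    n, k = np.meshgrid(np.arange(N), np.arange(N))
    C = np.sqrt(2.0 / N) * np.cos(np.pi * k * (2 * n + 1) / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def eig_spread(R):
    """Ratio of largest to smallest eigenvalue of a symmetric matrix."""
    lam = np.linalg.eigvalsh(R)  # ascending order
    return lam[-1] / lam[0]

N, a = 32, 0.9
# Autocorrelation matrix of an AR(1) process: r(k) = a^|k|
R = np.array([[a ** abs(i - j) for j in range(N)] for i in range(N)])

C = dct_matrix(N)
S = C @ R @ C.T                          # covariance after the DCT
D = np.diag(1.0 / np.sqrt(np.diag(S)))
S_norm = D @ S @ D                       # ... and after power normalization

print("spread before:", eig_spread(R), "after:", eig_spread(S_norm))
```

For a strongly correlated input such as this one, the spread collapses by more than an order of magnitude, which is exactly why the transform domain recursions converge faster.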
In this work, we investigate a sparse-aware transform domain LMF algorithm and assess its performance in a low SNR environment. We specifically study the TD-ZA-LMF algorithm, since the l1 penalty added to the LMF cost function is convex, unlike other penalties such as lp (0 < p < 1) and l0.
Notations: In the remainder of the paper, matrices and vectors are denoted by capital and lowercase boldface letters, respectively; the superscripts H, T, and -1 denote the Hermitian, transpose, and inverse operators, respectively; and, finally, ||·||_1 and E[·] denote the l1 norm and the statistical expectation, respectively.
A. The Transform Domain LMF Algorithm
Consider a system identification scenario with input ui and
the desired output d(i) defined by
    d(i) = u_i w_o + n(i)                                                          (1)

where w_o is the optimal filter of length N and n(i) is the measurement noise. The transformed input vector is defined as
input vector is defined as
    x_i = u_i T                                                                    (2)

where T is an N × N transformation matrix. Both x_i and u_i are row vectors of length N. The
transform domain LMF algorithm is given by
    w_i = w_{i-1} + µ Λ_i^{-1} x_i^T e^3(i)                                        (3)

    w̄_i = T^T w_i                                                                 (4)



where w̄_i is the time-domain weight vector and w_i is the transform-domain weight vector, µ is the step size, e(i) is the error, i.e., the difference between the desired output and the output of the adaptive filter, and Λ_i is the power normalization matrix. Clearly, (3) does not exploit any sparsity information
in its recursion. In order to exploit sparsity, we will alter the

cost function to include a sparsity-aware term and then apply the general gradient algorithm formula

    w_i = w_{i-1} - µ Λ_i^{-1} ∂J/∂w_i^T                                           (5)

to yield the proposed TD-ZA-LMF algorithm.

B. Zero Attractor TD-LMF Algorithm

The cost function of the zero attractor is given by

    J_ZA = (1/4) e^4(i) + λ_ZA ||T w_i||_1                                         (6)

where λ_ZA is the zero-attraction force (Lagrangian multiplier). Differentiating (6) with respect to the weight vector gives

    ∂J_ZA/∂w_i^T = -e^3(i) x_i^T + λ_ZA T^T sgn(T w_i)                             (7)

Substituting (7) into (5) gives the TD-ZA-LMF recursion

    w_i = w_{i-1} + µ Λ_i^{-1} x_i^T e^3(i) - ρ_ZA Λ_i^{-1} T^T sgn(T w_i)

where ρ_ZA = µ λ_ZA. Note that the coefficients are transformed back to the time domain inside the attractor term in order to exploit sparsity, since the transformed weight vector itself is not sparse. In contrast to the algorithm introduced in [9], this algorithm is designed for non-Gaussian noise and correlated inputs, is suitable for long filters, and has the ability to attract all the filter coefficients to zero.

III. PERFORMANCE ANALYSIS OF THE TD-ZA-LMF

A. Convergence Analysis

In this section, we study the convergence in the mean of the sparse-aware transform domain LMF algorithm. We launch the analysis by using the following general recursion:

    w_i = w_{i-1} + µ v_i^T e^{2k-1}(i) - s_i                                      (8)

where s_i is the sparsity penalty term and k ∈ N*. Note that k = 1 and k = 2 result in the LMS and the LMF algorithms, respectively; in the case of the TD-ZA-LMF, k = 2, v_i^T = Λ_i^{-1} x_i^T, and s_i = ρ_ZA Λ_i^{-1} T^T sgn(T w_i). Defining the weight error vector z_i = w_o - w_{i-1}, the error can be written as

    e(i) = n(i) - z_i^T v_i                                                        (9)

Employing this relation between z_i and e(i) in (8) yields

    z_i = z_{i-1} + µ v_i^T {n(i) - z_i^T v_i}^{2k-1} - s_i                        (10)

Ignoring the high powers of z_i^T v_i and applying the statistical expectation to (10) results in

    E[z_i] = E[z_{i-1}] + µ E[v_i n^{2k-1}(i)] - µ(2k-1) E[n^{2k-2}(i) v_i v_i^T z_i] - E[s_i]   (11)

By modeling the measurement noise as a white Gaussian process, E[v_i n^{2k-1}(i)] = 0. Moreover, the noise and the regressor can be assumed to be independent of each other at steady state. This assumption suggests the following relation:

    E[n^{2k-2}(i) v_i v_i^T z_i] = E[n^{2k-2}(i)] E[v_i v_i^T] E[z_i]
                                 = E[n^{2k-2}(i)] R_v E[z_i]                       (12)

Finally, taking (12) into account, (11) becomes

    E[z_i] = (I - µ(2k-1) E[n^{2k-2}(i)] R_v) E[z_{i-1}] - E[s_i]                  (13)

Equation (13) is quite similar to the LMS convergence relation introduced in [12]. Clearly, the sparsity contribution E[s_i] does not affect the convergence; for a very small step size, its effect keeps decreasing at steady state regardless of the noise plant power and the regressor power. A necessary condition for the stability of (8) in the mean is that the step size µ satisfy

    0 < µ < 2 / ((2k-1) E[n^{2k-2}(i)] Tr(R_v))                                    (14)

For the LMF case (k = 2), this range becomes

    0 < µ < 2 / (3 σ_n^2 Tr(R_v))                                                  (15)

where σ_n^2 is the variance of the noise: the higher the noise plant power (i.e., the lower the SNR), the smaller this bound, which reflects in a small step size. Owing to the power normalization, Tr(R_v) = M, where M is the number of filter coefficients; since sparse filters are long by nature, M is known to be large, and the condition in (15) becomes

    0 < µ < 2 / (3 σ_n^2 M)                                                        (16)
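The resulting recursion can be sketched in a few lines of Python. This is our illustrative implementation, not the authors' code: it assumes an orthonormal DCT-II for T, a diagonal, exponentially smoothed power estimate for Λ_i, and illustrative values for µ, ρ_ZA, and the smoothing factor:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix (the transform T assumed in this sketch)."""
    n, k = np.meshgrid(np.arange(N), np.arange(N))
    C = np.sqrt(2.0 / N) * np.cos(np.pi * k * (2 * n + 1) / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def za_td_lmf(u, d, N, mu=0.002, rho=1e-5, beta=0.99, eps=1e-4):
    """Sketch of the TD-ZA-LMF recursion: a power-normalized transform-domain
    LMF update plus an l1 zero attractor applied, through the inverse
    transform, to the time-domain weights."""
    C = dct_matrix(N)
    w = np.zeros(N)          # transform-domain weight vector w_i
    p = np.ones(N)           # running power estimate (diagonal of Lambda_i)
    for i in range(N - 1, len(u)):
        x = C @ u[i - N + 1:i + 1][::-1]      # transformed regressor x_i
        e = d[i] - x @ w                      # output error e(i)
        p = beta * p + (1.0 - beta) * x * x   # power normalization update
        w += mu * (e ** 3) * x / (p + eps)    # LMF correction term
        w -= rho * C @ np.sign(C.T @ w)       # zero attractor on time-domain weights
    return C.T @ w                            # recovered time-domain filter
```

A usage sketch: pass a (possibly correlated) input `u` through a sparse channel, add noise to form `d`, and compare the returned time-domain weights against the true channel.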

B. Steady State Analysis

The steady-state analysis of the proposed algorithm is derived in this section. For the case of the TD-ZA-LMF, the recursion is governed by

    w_i = w_{i-1} + µ v_i^T e^3(i) - s_i                                           (17)

where v_i^T = Λ_i^{-1} x_i^T. Using energy conservation arguments [12] leads to the following relation:

    E||w̃_i||^2 + E[|e_a(i)|^2 / ||v_i||_2^2] = E||w̃_{i-1}||^2 + E[|e_p(i)|^2 / ||v_i||_2^2]   (18)

where w̃_i = w_o - w_i is the weight error vector, and e_a(i) = v_i w̃_{i-1} and e_p(i) = v_i w̃_i are the a priori and a posteriori errors, respectively. At steady state, the following condition is always valid:

    E||w̃_i||_2^2 = E||w̃_{i-1}||_2^2                                               (19)

and the Excess Mean Square Error (EMSE) is defined as

    η = EMSE = E[|e_a(i)|^2]                                                       (20)

Hence, equation (18) becomes

    E[|e_a(i)|^2 / ||v_i||_2^2] = E[|e_p(i)|^2 / ||v_i||_2^2]                      (21)

From (17), the a priori and a posteriori errors can be found to be related through the following recursion:

    e_p(i) = e_a(i) - µ ||v_i||_2^2 e^3(i) + v_i s_i                               (22)

In order to simplify (22), the following assumptions are adopted:

A1: At steady state, the adaptation error is independent of the input (the separation principle).
A2: The measurement noise is independent of the adaptation error.

Expanding (22) in (21) and grouping terms gives

    0 = A + B + C + D + E                                                          (23)

where

    A = µ^2 E[e^6(i) ||v_i||_2^2],
    B = E[||s_i||^2_{v_i^T v_i} / ||v_i||_2^2],
    C = -2µ E[e_a(i) e^3(i)],
    D = E[e_a(i) v_i s_i / ||v_i||_2^2],
    E = -2µ E[e^3(i) v_i s_i].

We explore each of these terms in turn. To simplify the analysis, we use the following rational expectation approximation [14], assuming further that the input is not too non-stationary, so that the power normalization matrix is almost constant:

    E[v_i^T v_i / ||v_i||_2^2] ≈ E[v_i^T v_i] / E[||v_i||_2^2] = R_v / Tr(R_v)     (24)

By applying the separation principle (A1), the term A reads

    A = µ^2 E[e^6(i)] E[||v_i||_2^2]                                               (25)

Using the relation between the error, the a priori error and the noise,

    e(i) = e_a(i) + n(i)                                                           (26)

the term A results in the following:

    A ≈ 45 µ^2 Tr(R_v) σ_n^4 E[e_a^2(i)] + 15 µ^2 Tr(R_v) σ_n^6                    (27)

The sparsity enforcing term at steady state is independent of the regressors, which is the case for sparse filters; hence, the term B lands in

    B = E[||s_i||^2_{R_v}] / Tr(R_v)                                               (28)

Using A2 and ignoring the high power terms, C approximates to

    C ≈ -6µ σ_n^2 E[e_a^2(i)]                                                      (29)

Applying the same procedure, the term D can be shown to result in

    D = E[w̃_i^T v_i^T v_i s_i / ||v_i||_2^2] = E[w̃_i^T R_v s_i] / Tr(R_v)         (30)

and, finally, the term E gives

    E ≈ -6µ σ_n^2 E[w̃_i^T R_v s_i]                                                (31)

Now, substituting the different terms in (23), the EMSE reads

    η = 5µ σ_n^4 Tr(R_v) / (2 - 9µ σ_n^2 Tr(R_v))                                        [EMSE_TD-LMF]
      + γ ( E[||s_i||^2_{R_v}] / Tr(R_v) - E[w̃_i^T R_v s_i] (6µ σ_n^2 - 1/Tr(R_v)) )    [EMSE_sparse]   (32)

where

    γ = 1 / (6µ σ_n^2 - 45 µ^2 Tr(R_v) σ_n^4)                                      (33)

If the transformation and power normalization matrices are set to I, then, from (32) and (33), the EMSE of the TD-LMF falls back to that of the LMF given in [12]. It should also be noted that the TD-LMF algorithm inherits the same step size conditions for convergence.

To get more insight into (32), we study the EMSE for the two sets of elements in the vector w_i, namely the non-zero (NZ) elements and the zero (Z) elements. For the NZ elements, the sparsity enforcing term s_i is independent of the error w̃_i, so the cross term vanishes and the exerted EMSE is

    EMSE_{m∈NZ} = 5µ σ_n^4 Tr(R_v) / (2 - 9µ σ_n^2 Tr(R_v)) + γ E[||s_i||^2_{R_v}] / Tr(R_v)   (34)

while, for the zero elements, the term E[||s_i||^2_{R_v}] is expected to vanish and

    EMSE_{m∈Z} = 5µ σ_n^4 Tr(R_v) / (2 - 9µ σ_n^2 Tr(R_v)) - γ E[w̃_i^T R_v s_i] (6µ σ_n^2 - 1/Tr(R_v))   (35)

Comparing (34) and (35), we can see the tension in the EMSE between the NZ and Z elements: the attractor raises the EMSE of the former and lowers that of the latter. When the sparsity rate is low (i.e., |Z| >> |NZ|), which is the case for sparse filters, the overall EMSE will be lower than that of the non-sparse-aware algorithm.

IV. SIMULATION RESULTS

In this section, the performance of the proposed TD-ZA-LMF algorithm is evaluated against the ZA-LMS, ZA-LMF and TD-ZA-LMS algorithms using Monte Carlo simulations; the results are averaged over 500 runs. The transform used is the Discrete Cosine Transform (DCT-II). The SNR is set to 5 dB (σ_n^2 = 0.31) and M = 32, which sets the step size range to 0 < µ < 0.098; we choose µ = 0.005. The system under identification has sparsity m = 2, with random locations for the non-zero elements. The zero-attraction parameters are set to allow all algorithms to reach the same level of mean-square deviation (MSD), which allows a fair comparison of the algorithms; the values are given in Table I. Two experiments are conducted for the input: a white Gaussian input and a correlated one.
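As a quick sanity check on the analysis, the snippet below (ours, not part of the paper) evaluates the TD-LMF portion of the EMSE in (32), i.e., with the sparsity contribution set to zero, at the paper's simulation parameters (µ = 0.005, σ_n^2 = 0.31, Tr(R_v) = M = 32):

```python
import math

mu, sigma_n2, M = 0.005, 0.31, 32
tr_Rv = float(M)  # Tr(R_v) = M after power normalization

den = 2.0 - 9.0 * mu * sigma_n2 * tr_Rv          # denominator of EMSE_TD-LMF
emse_td_lmf = 5.0 * mu * sigma_n2 ** 2 * tr_Rv / den
print("theoretical EMSE:", round(10 * math.log10(emse_td_lmf), 1), "dB")
```

The denominator stays positive for this choice of µ, confirming that the selected step size is comfortably inside the region where the steady-state expression is meaningful.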

For the correlated input, a first-order filter with transfer function 1/(1 + 0.9 z^{-1}) is used to generate the correlated samples.

TABLE I: Zero-attraction parameter values for the different algorithms.

    Algorithm   | First Experiment | Second Experiment
    ZA-LMS      | 5.5 × 10^-5      | 1 × 10^-5
    ZA-LMF      | 5 × 10^-6        | 9 × 10^-6
    TD-ZA-LMS   | 4.5 × 10^-5      | 3.1 × 10^-5
    TD-ZA-LMF   | 9 × 10^-6        | 9 × 10^-6

Fig. 1: MSD curves for white Gaussian input with sparsity rate = 2.

Fig. 2: MSD curves for correlated input with sparsity rate = 2.

As can be seen from Fig. 1, the LMF-based algorithms reach steady state faster than the LMS-based ones. The transform domain algorithms, moreover, are comparatively faster in convergence and lower in steady-state level. Because of the bias introduced by the power normalization matrix in the zero attractor, the proposed TD-ZA-LMF is slightly slower than the ZA-LMF in the case of white Gaussian input. Figure 2 depicts that the ZA-LMF algorithm is better than the ZA-LMS algorithm when the SNR is low; however, both algorithms suffer from very slow convergence due to the highly correlated input, and the TD-ZA-LMF is the best performing.

V. CONCLUSION

By introducing the zero attractor to the TD-LMF algorithm, the resultant recursion is able to exploit sparse structure in a correlated environment with low SNR. Compared to the existing methods, the proposed algorithm improved the convergence, as confirmed by simulations. This proves the power of the hybridisation of the transform domain and the sparsity-aware ability in enhancing the convergence behavior of the LMF algorithm at low SNR.

REFERENCES

[1] E. Walach and B. Widrow, "The least mean fourth (LMF) adaptive algorithm and its family," IEEE Transactions on Information Theory, vol. 30, no. 2, pp. 275-283, 1984.
[2] W. Schreiber, "Advanced television systems for terrestrial broadcasting: Some problems and some proposed solutions," Proceedings of the IEEE, vol. 83, no. 6, pp. 958-981, 1995.
[3] D. Etter, "Identification of sparse impulse response systems using an adaptive delay filter," in Proc. IEEE ICASSP '85, vol. 10, 1985, pp. 1169-1172.
[4] Y. Yilmaz and S. S. Kozat, "An extended version of the NLMF algorithm based on proportionate Krylov subspace projections," in Proc. Int. Conf. on Machine Learning and Applications (ICMLA '09), 2009, pp. 404-408.
[5] D. Duttweiler, "Proportionate normalized least-mean-squares adaptation in echo cancelers," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 508-518, 2000.
[6] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[7] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), pp. 267-288, 1996.
[8] Y. Chen, Y. Gu, and A. O. Hero, "Sparse LMS for system identification," in Proc. IEEE ICASSP 2009, 2009, pp. 3125-3128.
[9] G. Gui and F. Adachi, "Sparse least mean fourth algorithm for adaptive channel estimation in low signal-to-noise ratio region," International Journal of Communication Systems, vol. 27, no. 11, pp. 3147-3157, 2014.
[10] S. Narayan, A. Peterson, and M. Narasimha, "Transform domain LMS algorithm," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, no. 3, pp. 609-615, 1983.
[11] K. Shi and X. Ma, "Transform domain LMS algorithms for sparse system identification," in Proc. IEEE ICASSP 2010, 2010, pp. 3714-3717.
[12] A. H. Sayed, Fundamentals of Adaptive Filtering. John Wiley & Sons, 2003.
[13] S. Zhao, Z. Man, S. Khoo, and H. R. Wu, "Stability and convergence analysis of transform-domain LMS adaptive filters with second-order autoregressive process," IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 119-130, 2009.
[14] C. Samson and V. Reddy, "Fixed point error analysis of the normalized ladder algorithm," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, no. 5, pp. 1177-1191, 1983.