
IEEE SIGNAL PROCESSING LETTERS, VOL. 27, 2020

A New Diffusion Variable Spatial Regularized QRRLS Algorithm

Yijing Chu, S. C. Chan, Yi Zhou, and Ming Wu

Abstract—This paper develops a framework for the design of diffusion adaptive algorithms, where a network of nodes aims to estimate system parameters from distinct collected local data streams. We explore the time and spatial knowledge of the system responses and model their evolution in both the time and spatial domains. A weighted maximum a posteriori probability (MAP) criterion is used to derive an adaptive estimator, in which recent data have more influence on the statistics via weighting factors. The resulting recursive least squares (RLS) local estimate can be implemented by the QR decomposition (QRD). To mediate the incorporation of distinct spatial information from neighboring estimates, a variable spatial regularization (VSR) parameter is introduced. The estimation bias and variance of the proposed algorithm are analyzed. A new diffusion VSR QRRLS (Diff-VSR-QRRLS) algorithm is derived that balances the bias and variance terms. Simulations are carried out to illustrate the effectiveness of the theoretical analysis and evaluate the performance of the proposed algorithm.

Index Terms—Diffusion adaptive algorithm, variable spatial regularization, performance analysis.

Manuscript received May 2, 2020; accepted May 23, 2020. Date of publication June 4, 2020; date of current version June 25, 2020. This work was supported by the National Natural Science Foundation of China under Grant 61901174 and by the Guangdong Basic and Applied Basic Research Foundation under Grant 2019A1515010771. The associate editor coordinating the review of this manuscript and approving it for publication was Luis Antonio Azpicueta-Ruiz. (Corresponding author: Yijing Chu.)

Yijing Chu is with the State Key Laboratory of Subtropical Building Science, South China University of Technology, Guangzhou 510641, China (e-mail: chuyj@scut.edu.cn).

S. C. Chan is with the Department of Electrical and Electronics Engineering, The University of Hong Kong, Hong Kong (e-mail: scchan@eee.hku.hk).

Yi Zhou is with the Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China (e-mail: zhouy@cqupt.edu.cn).

Ming Wu is with the Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Beijing 100190, China (e-mail: mingwu@mail.ioa.ac.cn).

Digital Object Identifier 10.1109/LSP.2020.2999883

I. INTRODUCTION

WE CONSIDER the problem of distributed adaptive estimation of system parameters via a sensor network collecting distinct local data streams. To reveal how the diffusion strategy allows information exchange between network nodes while affecting the distinct local estimation, the maximum a posteriori probability (MAP) criterion is used for the algorithm derivation. The regularization arising from the spatial prior information is specified using the bias and variance analysis.

Distributed optimization attracts a lot of attention in real-time applications such as communication [1], data processing [2]–[4], and control systems [5]. A group of methods has been developed to solve this problem, where local estimates reach an agreement constraint. The techniques include the alternating direction method of multipliers (ADMM) [6], [7], adaptation and combination methods [8]–[13], and Lagrange multiplier methods [14]. These algorithms are based on the hypothesis that the statistical property of the observations at each node is identical. However, this does not hold for a variety of applications such as acoustic signal processing and control systems [15], [16].

More research work focuses on the setting where the local optimal solutions are close to one another but do not coincide. This kind of network poses the so-called multitask problem [17], [18]. One solution is to decompose the space-varying parameters into basis functions and estimate a large number of coefficients [19]. More generally, regularization can be imposed on the local objectives to track multitask and interrelated optima [20]–[25]. The estimation bias and variance analysis of these diffusion algorithms has been carried out in [17], [20], [26]–[31], which provides insights into dealing with multitask problems. It turns out that the variable combination strategy [17] is efficient in reducing the bias and improving the estimation accuracy, which is worthy of further study. However, most of the multitask analyses and strategies focus on least mean squares (LMS)-type algorithms [32]–[34]. To the best of our knowledge, similar work is not available for distributed recursive least squares (RLS) algorithms [35], [36].

To bridge this gap, we explore the interrelated system responses and use the maximum a posteriori probability (MAP) to derive a diffusion (Diff) RLS algorithm, aiming at developing a new method that automatically adjusts the diffusion strength. To do so, the likelihood function (LF) is augmented with a prior of the correlated system responses. Applying an exponentially weighted window to the MAP estimator, the Diff RLS algorithm is obtained, which is implemented using the numerically more stable QR decomposition (QRD). To mediate the incorporation of distinct spatial information from neighboring estimates, a variable spatial regularization (VSR) parameter is introduced. The estimation mean squares deviation (MSD) of the Diff-QRRLS algorithm is analyzed theoretically. A new regularization selection rule is obtained by balancing the bias and variance terms, resulting in the proposed Diff-VSR-QRRLS algorithm. Simulations are conducted to evaluate the theoretical analysis and the performance of the proposed algorithm.

II. DERIVATION OF DIFF-QRRLS

A. Problem Formulation

Consider a network of K connected nodes. The neighborhood of node k is denoted by 𝒩_k, the nodes within which have the ability to share information. The network is used to estimate K vectors of size L, e.g., w_k(n) = [w_k1(n) ... w_kL(n)]^T is the unknown vector at node k, where n indicates the discrete time index. Each node collects the observation {d_k(n)} and an input vector of size L, x_k(n) = [x_k(n) ... x_k(n − L + 1)]^T, which relates

1070-9908 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: Kongu Engineering College. Downloaded on January 24,2022 at 09:05:40 UTC from IEEE Xplore. Restrictions apply.

the observation with the unknown coefficient vector at n. It is assumed that the observations at different nodes are conditionally independent. Let d_k(n) = [d_k(n) ... d_k(L)]^T denote a vector of observations for node k, let d(n) = col{d_1(n) ... d_K(n)} stack d_k(n) of all nodes over the time duration, and let w(n) = col{w_1(n) ... w_K(n)} stack the coefficient vectors of all nodes at time n. The conditional probability density function (pdf) can be written as P(d(n)|w(n)). It is further assumed that this pdf, as well as the coefficient pdf P(w(n)), is log-concave [14]. Using these notations, the MAP estimate of w(n) can be expressed as

  ŵ(n) = arg max_w P(w|d(n)) = arg max_w P(d(n)|w) P(w)    (1)

where Bayes' rule is applied to the second equality.

Due to the conditional independence of the observations at different nodes, the conditional probability can be written as

  P(d(n)|w) = Π_{i=L}^{n} Π_{k=1}^{K} P(d_k(i)|w_k(i)).    (2)

Taking the logarithm of the pdfs, the MAP estimate (1) becomes

  ŵ(n) = arg max_w Σ_{i=L}^{n} [ Σ_{k=1}^{K} ln P(d_k(i)|w_k(i)) + ln P(w(i)) ].    (3)

B. System Model and Diff-QRRLS

In this section, we assume an evolution of w(n) in both the time and spatial domains [5] and a linear model of the observations such that the MAP estimate is specified. Then we show how the estimate is obtained via the Diff-QRRLS algorithm. To start with, the following system model for the kth node is considered

  w_k(n) = w_k(n − 1) + ε_k(n)    (4)

  w_l(n) = w_k(n) + υ_lk(n) for l ∈ 𝒩_k/k    (5)

  d_k(n) = x_k^T(n) w_k(n) + q_k(n)    (6)

where the measurement noises {ε_k(n), υ_lk(n), q_k(n)} are assumed to be zero-mean white Gaussian sequences with diagonal covariance matrices or variance {σ_εk^2 I_L, σ_vlk^2 I_L, σ_qk^2}. I_L is an identity matrix of size L.

The white Gaussian assumption of {q_k(n)} in (6) implies that P(d_k(n)|w_k(n)) is normal with mean x_k^T(n) w_k(n) and variance σ_qk^2. Following the derivation and using the property that {ε_k(n)} and {υ_lk(n)} are independent, P(w_k(n)) can be written as a product of several normal pdfs, which have means {w_k(n − 1), w_l(n)} and covariances {σ_εk^2 I_L, σ_vlk^2 I_L} for l ∈ 𝒩_k/k. Accordingly, the log-LFs in (3) turn out to be quadratic terms and the MAP estimator reduces to

  ŵ(n) = arg max_w Σ_{i=L}^{n} Σ_{k=1}^{K} [ −(1/(2σ_qk^2))(d_k(i) − x_k^T(i) w_k(i))^2
      − (1/(2σ_εk^2))(w_k(i) − w_k(i − 1))^T (w_k(i) − w_k(i − 1))
      − Σ_{l∈𝒩_k/k} (1/(2σ_vlk^2))(w_k(i) − w_l(i))^T (w_k(i) − w_l(i)) ].    (7)

It is clear that the log-LFs naturally break into several terms. To have a concise expression, the time and spatial regularization parameters are defined, i.e., ξ_k = σ_qk^2/σ_εk^2 and b_lk = σ_qk^2/σ_vlk^2.

TABLE I: DIFF-VSR-QRRLS ALGORITHM

Moreover, a set of weights that decrease exponentially towards past data is multiplied to the squared error to derive a recursive estimate ŵ_λ(n). Thus ŵ_λ(n) = arg max_w Σ_{k=1}^{K} f_k(w_k(n)) with

  f_k(w_k(n)) = −(1/(2σ_qk^2)) Σ_{i=L}^{n} [ λ_{n−i}(n)(d_k(i) − x_k^T(i) w_k(n))^2
      + ξ_k (w_k(i) − w_k(i − 1))^T (w_k(i) − w_k(i − 1))
      + Σ_{l∈𝒩_k/k} b_lk (w_k(i) − w_l(i))^T (w_k(i) − w_l(i)) ].    (8)

Here λ_{n−i}(n) = λ λ_{n−i−1}(n − 1) with λ_0(n) = 1, and the forgetting factor (FF) λ satisfies 0 ≤ λ < 1. Define R_XXk(n) = Σ_{i=L}^{n} λ_{n−i}(n) x_k(i) x_k^T(i) and p_Xdk(n) = Σ_{i=L}^{n} λ_{n−i}(n) d_k(i) x_k(i). Taking the derivative of (8) in terms of w_k(n) (rather than w_k(i)) and setting it to zero, the local estimate is obtained

  (R_XXk(n) + ξ_k I_L) w_k(n) = p_Xdk(n) + ξ_k w_k(n − 1) + Σ_{l∈𝒩_k/k} b_lk (w_l(n) − w_k(n)).    (9)

Eq. (9) is still not a distributed estimator. To make it so, we follow the formulation of the distributed solution in [10], define ψ_l as an estimate of the optimum w_l, and let the combination coefficients have the property 1^T B = 1^T for the non-negative matrix B formed by {b_lk(n)} and the vector 1 of unit entries. Moreover, we introduce a positive parameter α_k(n) that controls the spatial regularization and apply the incremental strategy to (9) such that a diffusion solution is obtained

  (R_XXk(n) + ξ_k I_L) ψ_k(n) = p_Xdk(n) + ξ_k w_k(n − 1)    (10)

  w_k(n) = Σ_{l∈𝒩_k} a_lk(n) ψ_l(n)    (11)

  a_kk(n) = 1 − α_k(n)(1 − b_kk), a_lk = α_k(n) b_lk for l ≠ k.    (12)

The local update (10) can be implemented by a QRD structure [37], leading to the Diff-VSR-QRRLS. The rank-one update of the covariance matrix R_XXk(n) can be implemented by updating the Cholesky factor R_k^(1)(n) of R_XXk(n) recursively (1st QRD in recursion (i), Table I). The QRD is executed once for the data vector and once for the regularization [√μ(n) z_l, w_kl(n − 1)] at each time instant, where z_l is the l-th row of the identity


matrix I_L. The computational complexity of solving (10) is identical to that of the conventional QRRLS algorithms, which is O(L^2) as shown in [38]. Note that if λ = 0, only the current data is used in the update and (10) reduces to an LMS-like algorithm.

III. MSD ANALYSIS AND VSR METHOD

We are interested in studying the MSD of the proposed Diff-VSR-QRRLS such that a practical SR selection rule can be obtained that minimizes the estimation MSD.

Consider a system identification problem via a network of K nodes. The signal collected at the kth node can be described as

  d_k(n) = x_k^T(n) h_k + η_k(n)    (13)

where h_k is the L-order system to be identified and η_k(n) is zero-mean additive Gaussian noise. In multitask problems, the system responses differ from each other. Suppose they have the coefficients h_0 in common, the deviation from which is denoted as Δh_k = h_k − h_0 at each node. To have an expression of the global system response h and deviation Δh that stack the local values of all nodes, the global combination matrix is defined as the Kronecker product of the combination matrix A formed by {a_lk(n)} and the identity matrix, i.e., G = A^T ⊗ I_L. Since the common part satisfies G(h − Δh) = h − Δh, the global system deviation has the property (I_KL − G)h = (I_KL − G)Δh.

The global representations in terms of the stochastic quantities that appear in the system model are defined as

  X(n) = diag{X_1(n) ... X_K(n)}, X_k(n) = [x_k(n) ... x_k(L)]^T

  η(n) = col{η_1(n) ... η_K(n)}, η_k(n) = [η_k(n) ... η_k(L)].

Using these expressions, the system reads

  d(n) = X(n) h + η(n).    (14)

A. MSD Analysis

In this subsection, the MSD analysis, including the bias and variance terms, is carried out. To have a concise expression in the MSD analysis, the covariance matrices are introduced

  R_XXk(n) = X_k^T(n) Λ(n) X_k(n), R(n) = X^T(n) Λ_D(n) X(n)

where Λ(n) = diag{[1, λ_1(n), ..., λ_{n−L}(n)]} is a diagonal weight matrix and Λ_D(n) block-diagonalizes K weight matrices Λ(n).

To make the analysis tractable, the input signal is assumed to be stationary such that R_xxk = E[x_k(i) x_k^T(i)] and R_XXk = (1/(1−λ)) R_xxk. Accordingly, R(n) = (1/(1−λ)) R_xx with R_xx = diag{R_xx1 ... R_xxK}.

The global expression for the VSR parameter is defined as D_α(n) = diag{α_1(n) ... α_K(n)}. According to (12), the global combination matrix becomes G = [I_K − (I_K − B) D_α(n)] ⊗ I_L. Without loss of generality, we assume a symmetric property such that G = G^T in the rest of the paper.

Based on these definitions, the effect of the SR on the MSD is studied. It is assumed that the proposed algorithm is asymptotically unbiased in the time domain such that Eq. (10) reduces to R_XXk(n) ψ_k(n) = p_Xdk(n). Applying this approximation as well as the global representations to (11), we have

  w(n) = G R^{−1}(n) X^T(n) Λ_D(n) d(n).    (15)

Substituting (14) into (15) leads to

  w(n) = G h + G R^{−1}(n) X^T(n) Λ_D(n) η(n).    (16)

We consider the deviation of the Diff-QRRLS estimator from the true system response, which can be split into two terms

  w(n) − h = {E[w(n)] − h} + {w(n) − E[w(n)]}    (17)

where the optimal solution reads from (16)

  E[w(n)] = G h.    (18)

The first term in the brackets of (17) corresponds to the estimation bias while the second corresponds to the variance. Consequently, the MSD becomes

  J_MSD(n) = ||E[w(n)] − h||_2^2 + E[||w(n) − E[w]||_2^2].    (19)

Using (16), (18) and the property of the system deviation, the estimation bias and variance terms are obtained in (20) and (21)

  E[w(n)] − h = (G − I_KL) h = (G − I_KL) Δh    (20)

  w(n) − E[w(n)] = G R^{−1}(n) X^T(n) Λ_D(n) η(n).    (21)

To have a comprehensive result for the estimation bias, a specific case of an identical SR for each node is considered. In this case, D_α(n) = α(n) I_K. Recall the global combination matrix; it then reduces to G = I_KL − α(n) B_Δ, where B_Δ = (I_K − B) ⊗ I_L. Consequently, the estimation bias is found to be

  ||E[w(n)] − h||_2^2 = α^2(n) ||B_Δ Δh||_2^2.    (22)

Then, taking the trace of the outer product of (21) and using some algebraic operations, the estimation variance becomes

  Tr(Λ_D(n) X(n) R^{−1}(n) G G R^{−1}(n) X^T(n) Λ_D(n) η(n) η^T(n)).

To derive a formula for practical use, the noise variance is averaged over all nodes, σ̄_η^2 = (1/K) Σ_{k=1}^{K} E[η_k^2(n)], such that

  E[||w(n) − E[w]||_2^2] = σ̄_η^2 Tr(G^2 R^{−1}(n) R̂(n) R^{−1}(n)) = ((1−λ)/(1+λ)) σ̄_η^2 Tr(G G R_xx^{−1})    (23)

where we have used R̂(n) = X^T(n) Λ_D^2(n) X(n) → (1/(1−λ^2)) R_xx as n → ∞.

Using (22) and inserting G = I_KL − α(n) B_Δ into (23), the MSD can be obtained from (19)

  J_MSD(n) = a α^2(n) − 2b α(n) + c, where

  a = ||B_Δ Δh||_2^2 + ((1−λ)/(1+λ)) σ̄_η^2 Tr(B_Δ B_Δ R_xx^{−1}),

  b = ((1−λ)/(1+λ)) σ̄_η^2 Tr(B_Δ R_xx^{−1}), and c = ((1−λ)/(1+λ)) σ̄_η^2 Tr(R_xx^{−1}).    (24)

B. VSR Algorithm Design

To minimize J_MSD(n), the derivative of (24) is set to 0 to obtain the optimal SR parameter

  α_opt = b/a, (α_opt ≤ 1).    (25)

It can be seen that the optimal SR parameter is proportional to the noise variance and inversely proportional to the FF, input power, and system deviation, which coincides with the definition of the SR parameter b_lk for the MAP estimator. A practical estimate of the noise variance can be used to formulate a VSR algorithm. In this paper, the following criterion is employed

  α_k(n) = α_0 + κ σ_ηk^2(n)    (26)

  σ_ηk^2(n + 1) = λ_s σ_ηk^2(n) + (1 − λ_s) e_k^2(n)    (27)
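To make the overall recursion concrete, the sketch below renders one iteration of the scheme in NumPy: the regularized local update (10), the combination step (11)–(12), and the VSR rule (25)–(27). This is a hedged, illustrative sketch, not the authors' implementation: all function and variable names are ours, and for brevity the local normal equations are solved directly instead of via the numerically more stable QRD recursion of Table I.

```python
import numpy as np

def local_update(R_xx, p_xd, w_prev, x, d, lam, xi):
    """Local exponentially weighted estimate, eq. (10):
    (R_XXk(n) + xi*I) psi_k(n) = p_Xdk(n) + xi * w_k(n-1).
    The normal equations are solved directly here; the paper uses
    a QRD (rank-one Cholesky-factor update) for numerical stability."""
    L = len(x)
    R_xx = lam * R_xx + np.outer(x, x)   # rank-one update of R_XXk(n)
    p_xd = lam * p_xd + d * x            # update of p_Xdk(n)
    psi = np.linalg.solve(R_xx + xi * np.eye(L), p_xd + xi * w_prev)
    return R_xx, p_xd, psi

def combine(psis, B, k, alpha_k):
    """Diffusion combination, eqs. (11)-(12):
    a_kk = 1 - alpha_k*(1 - b_kk), a_lk = alpha_k*b_lk for l != k."""
    w = (1.0 - alpha_k * (1.0 - B[k, k])) * psis[k]
    for l in range(len(psis)):
        if l != k and B[l, k] > 0:
            w = w + alpha_k * B[l, k] * psis[l]
    return w

def vsr_update(alpha0, kappa, sigma2_eta, e, lam_s):
    """VSR rule, eqs. (26)-(27): a smoothed residual-power estimate
    drives the spatial-regularization parameter alpha_k(n)."""
    sigma2_eta = lam_s * sigma2_eta + (1.0 - lam_s) * e ** 2
    alpha = min(alpha0 + kappa * sigma2_eta, 1.0)  # alpha <= 1, cf. (25)
    return alpha, sigma2_eta
```

With alpha_k(n) held constant this reduces to a fixed-SR diffusion scheme; the VSR rule lets a node lean more on its neighbors when its own residual power is high, mirroring the dependence of alpha_opt = b/a in (25) on the noise variance.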


Fig. 1. The mean convergence curves of the deviation to the true system response and the optimal solution.

Fig. 2. The simulated and theoretical results for the averaged MSD curves: r = 0.03 (left) and 0.3 (right).

Fig. 3. The MSD curves of different Diff-RLS algorithms under a variety of radius and SNR settings.

where α_0 is a small non-negative parameter. According to (25) and the requirement that α_k(n) ≤ 1, we use the largest possible noise variance ē_0^2 = (1/K) Σ_{k=1}^{K} E[d_k^2(n)] for σ̄_η^2 in a and calculate the scale according to some prior of the system

  κ = ((1−λ)/(1+λ)) Tr(B_Δ R_xx^{−1}) / ( ||B_Δ Δh||_2^2 + ((1−λ)/(1+λ)) Tr(B_Δ B_Δ R_xx^{−1}) ē_0^2 ).    (28)

IV. SIMULATION RESULTS

In this section, the effect of the proposed VSR method on the multitask system identification problem is investigated and the theoretical analysis in terms of the optimal solution and MSD is verified. We consider a system of 10 nodes. The system responses h_i are distributed on a circle of radius r centered at h_0, i.e., h_i = h_0 + r g_i for i = 1, 2, ..., where g_i is a Gaussian sequence of unit norm. A first-order autoregressive (AR) process is used to generate the input signal and the AR parameter is set to 0.9. The Metropolis rule is used to generate the combination matrix if not specified otherwise.

Fig. 1 verifies the optimal solution to Diff-QRRLS derived in (18). The curves measure the deviation of E[w(n)] from the true responses h and from the averaged one Gh. The radius r is set to 0.03 and 0.3. The SNR at the receiver is set to 10 dB and λ = 0.99. It can be seen that the deviation to Gh increases slightly with r, while the deviation to h rises significantly when r changes from 0.03 to 0.3. Under both multitask settings, the deviation to Gh is much smaller than that to h. This verifies that Diff-QRRLS converges to the solution in (18).

In the second experiment, the simulated and theoretical MSD results are compared. The settings are identical to those in the first experiment. Fig. 2 shows that the simulated and theoretical MSD curves generally agree well with each other. J_MSD is slightly overestimated by (24) as r increases, due to the approximations used. The optimal SR decreases as the radius increases, so as to reduce the estimation bias.

The proposed algorithms are then compared with several typical Diff RLS algorithms under different settings of radius and SNR. The algorithms under test include the conventional Diff-RLS proposed in [12], the non-cooperative Diff-RLS (A = I), and the proposed Diff-VSR-QRRLS and its constant-SR version using the Metropolis rule (denoted as Diff-SR-QRRLS). The SNR is set to 10 dB. The FFs for all algorithms are set to 0.99. Different radius values r = 0, 0.03 and 0.3 are used. The user parameters {ξ_k, α_0, λ_s} for the proposed algorithm are set to {0.0001, 0.0001, 0.99}, while κ is calculated to be 1.5, 1, and 0.07 according to (28). Fig. 3 shows the MSD curves. It can be seen that the algorithms generally converge to a larger MSD as r increases. When r is small, the Diff algorithms have similar performance and outperform the non-cooperative one. This is because the combination reduces the estimation variance, as shown in (23). When r becomes as large as 0.3, the bias term dominates. Diff-RLS and Diff-SR-QRRLS result in an MSD similar to the non-cooperative Diff-RLS. Diff-VSR-QRRLS minimizes the MSD by choosing the SR parameter automatically.

Next, we fix the radius to 0.3 and set the SNR to different values. κ is calculated to be 0.02 and 0.05, respectively, for SNR = 0 and 20 dB. Diff-VSR-QRRLS has a performance similar to Diff-RLS via a large SR when SNR = 0 dB, which outperforms the non-cooperative Diff-RLS. When SNR = 20 dB, Diff-VSR-QRRLS adjusts the SR to a small value and achieves the best performance of all algorithms under comparison.

V. CONCLUSIONS

In this short paper, we propose a Diff-RLS algorithm that adjusts the spatial regularization automatically. The algorithm is derived from a weighted MAP criterion that discloses the mechanism of the diffusion strategy. A VSR rule is derived from the MSD analysis that balances the estimation bias and variance under different conditions. It is clear that the spatial evolution prior averages the local estimates via the combination strategy so as to increase the SNR. The result is general for Diff adaptive algorithms, and the VSR strategy provides a framework for algorithm design in multitask problems.
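For readers who wish to reproduce the experiments, the multitask setup of Section IV can be sketched as follows. This is a hedged reconstruction: the paper specifies h_i = h_0 + r g_i with unit-norm deviations, a first-order AR input with parameter 0.9, and the Metropolis combination rule; the RNG and the exact normalization below are our assumptions.

```python
import numpy as np

def multitask_responses(h0, r, K, rng):
    """Node responses h_i = h_0 + r*g_i with ||g_i|| = 1, so every
    local optimum lies at distance r from the common response h0."""
    H = []
    for _ in range(K):
        g = rng.standard_normal(len(h0))
        H.append(h0 + r * g / np.linalg.norm(g))
    return np.array(H)

def ar1_input(n_samples, a, rng):
    """First-order AR input x(n) = a*x(n-1) + v(n), v white Gaussian."""
    x = np.zeros(n_samples)
    for n in range(1, n_samples):
        x[n] = a * x[n - 1] + rng.standard_normal()
    return x

def metropolis_weights(adjacency):
    """Metropolis rule: b_lk = 1/(1 + max(deg_l, deg_k)) for linked
    nodes l != k, with the diagonal chosen so that each column sums
    to one (the property 1^T B = 1^T used in Section II)."""
    K = adjacency.shape[0]
    deg = adjacency.sum(axis=1)
    B = np.zeros((K, K))
    for k in range(K):
        for l in range(K):
            if l != k and adjacency[l, k]:
                B[l, k] = 1.0 / (1.0 + max(deg[l], deg[k]))
        B[k, k] = 1.0 - B[:, k].sum()
    return B
```

Feeding these responses and inputs into the per-node recursion, with the noise power set by the desired SNR, reproduces the kind of bias/variance trade-off against r and SNR discussed above.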


REFERENCES

[1] L. Sun and Y. Wang, "CTBRNN: A novel deep-learning based signal sequence detector for communications systems," IEEE Signal Process. Lett., vol. 27, pp. 21–25, 2020.
[2] Y. Zhou, C. Huang, H. Liu, D. Li, and T. K. Truong, "Clutter removal in through-the-wall radar based on weighted nuclear norm minimization," IEEE Geosci. Remote Sens. Lett., vol. 58, no. 1, pp. 486–499, Jan. 2020.
[3] K. Yuan, B. Ying, X. Zhao, and A. H. Sayed, "Exact diffusion for distributed optimization and learning—Part I: Algorithm development," IEEE Trans. Signal Process., vol. 67, no. 3, pp. 708–723, Feb. 2019.
[4] K. Yuan, B. Ying, X. Zhao, and A. H. Sayed, "Exact diffusion for distributed optimization and learning—Part II: Convergence analysis," IEEE Trans. Signal Process., vol. 67, no. 3, pp. 724–739, Feb. 2019.
[5] Y. J. Chu, C. M. Mak, Y. Zhao, S. C. Chan, and M. Wu, "Performance analysis of a diffusion control method for ANC systems and the network design," J. Sound Vib., vol. 475, no. 9, Jun. 2020, Art. no. 115273.
[6] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. New York, NY, USA: Now, 2011.
[7] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation. Old Tappan, NJ, USA: Prentice-Hall, 1989.
[8] J. Li, F. Deng, and J. Chen, "A fast distributed variational Bayesian filtering for multisensor LTV system with non-Gaussian noise," IEEE Trans. Cybern., vol. 49, no. 7, pp. 2431–2443, Jul. 2019.
[9] A. H. Sayed, "Adaptive networks," Proc. IEEE, vol. 102, no. 4, pp. 460–497, Apr. 2014.
[10] F. S. Cattivelli and A. H. Sayed, "Diffusion LMS strategies for distributed estimation," IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1035–1048, Mar. 2010.
[11] F. S. Cattivelli and A. H. Sayed, "Diffusion strategies for distributed Kalman filtering and smoothing," IEEE Trans. Autom. Control, vol. 55, no. 9, pp. 2069–2084, Sep. 2010.
[12] F. S. Cattivelli, C. G. Lopes, and A. H. Sayed, "Diffusion recursive least-squares for distributed estimation over adaptive networks," IEEE Trans. Signal Process., vol. 56, no. 5, pp. 1865–1877, May 2008.
[13] C. G. Lopes and A. H. Sayed, "Incremental adaptive strategies over distributed networks," IEEE Trans. Signal Process., vol. 55, no. 8, pp. 4064–4077, Aug. 2007.
[14] F. Y. Jakubiec and A. Ribeiro, "D-MAP: Distributed maximum a posteriori probability estimation of dynamic systems," IEEE Trans. Signal Process., vol. 61, no. 2, pp. 450–466, Jan. 2013.
[15] Y. Huang, J. Benesty, and J. Chen, Acoustic MIMO Signal Processing. Berlin, Germany: Springer, 2006.
[16] Y. J. Chu, S. C. Chan, Y. Zhao, and M. Wu, "Performance analysis of diffusion filtered-x algorithms in multitask ANC systems," presented at the 27th Eur. Signal Process. Conf. (EUSIPCO), A Coruna, Spain, Sep. 2–6, 2019.
[17] J. Chen, C. Richard, and A. H. Sayed, "Diffusion LMS over multitask networks," IEEE Trans. Signal Process., vol. 63, no. 11, pp. 2733–2748, Jun. 2015.
[18] R. Caruana, "Multitask learning," Mach. Learn., vol. 28, no. 1, pp. 41–75, Jul. 1997.
[19] R. Abdolee, B. Champagne, and A. H. Sayed, "Estimation of space-time varying parameters using a diffusion LMS algorithm," IEEE Trans. Signal Process., vol. 62, no. 2, pp. 403–418, Jan. 2014.
[20] J. Chen, C. Richard, and A. H. Sayed, "Multitask diffusion adaptation over networks," IEEE Trans. Signal Process., vol. 62, no. 16, pp. 4129–4144, Aug. 2014.
[21] R. Nassif, C. Richard, A. Ferrari, and A. H. Sayed, "Proximal multitask learning over networks with sparsity-inducing coregularization," IEEE Trans. Signal Process., vol. 64, no. 23, pp. 6329–6344, Dec. 2016.
[22] R. Nassif, S. Vlaski, C. Richard, and A. H. Sayed, "A regularization framework for learning over multitask graphs," IEEE Signal Process. Lett., vol. 26, no. 2, pp. 297–301, Feb. 2019.
[23] A. Koppel, B. M. Sadler, and A. Ribeiro, "Proximity without consensus in online multiagent optimization," IEEE Trans. Signal Process., vol. 65, no. 12, pp. 3062–3077, Jun. 2017.
[24] S. Vlaski, L. Vandenberghe, and A. H. Sayed, "Regularized diffusion adaptation via conjugate smoothing," 2019, arXiv:1909.09417.
[25] J. Chen and A. H. Sayed, "Distributed Pareto optimization via diffusion strategies," IEEE J. Sel. Topics Signal Process., vol. 7, no. 2, pp. 205–220, Apr. 2013.
[26] J. Chen, J. Li, S. Yang, and F. Deng, "Weighted optimization-based distributed Kalman filter for nonlinear target tracking in collaborative sensor networks," IEEE Trans. Cybern., vol. 47, no. 11, pp. 3892–3905, Nov. 2017.
[27] X. Zhao and A. H. Sayed, "Distributed clustering and learning over networks," IEEE Trans. Signal Process., vol. 63, no. 13, pp. 3285–3300, Jul. 2015.
[28] X. Zhao and A. H. Sayed, "Clustering via diffusion adaptation over networks," in Proc. 3rd Int. Workshop Cogn. Inf. Process., 2012, pp. 1–6.
[29] R. Nassif, S. Vlaski, C. Richard, J. Chen, and A. H. Sayed, "Learning over multitask graphs," 2020, arXiv:2001.02112.
[30] R. Nassif, S. Vlaski, C. Richard, J. Chen, and A. H. Sayed, "Learning over multitask graphs—Part II: Performance analysis," IEEE Open J. Signal Process., vol. 1, pp. 46–63, 2020.
[31] R. Nassif, C. Richard, A. Ferrari, and A. H. Sayed, "Multitask diffusion adaptation over asynchronous networks," IEEE Trans. Signal Process., vol. 64, no. 11, pp. 2835–2850, Jun. 2016.
[32] J. Ni, "Diffusion sign subband adaptive filtering algorithm for distributed estimation," IEEE Signal Process. Lett., vol. 22, no. 11, pp. 2029–2033, Nov. 2015.
[33] J. Ni, J. Chen, and X. Chen, "Diffusion sign-error LMS algorithm: Formulation and stochastic behavior analysis," Signal Process., vol. 128, pp. 142–149, Nov. 2016.
[34] H. Lee, S. Kim, J. Lee, and W. Song, "A variable step-size diffusion LMS algorithm for distributed estimation," IEEE Trans. Signal Process., vol. 63, no. 7, pp. 1808–1820, Apr. 2015.
[35] Y. J. Chu and C. M. Mak, "A variable forgetting factor diffusion recursive least squares algorithm for distributed estimation," Signal Process., vol. 140, pp. 219–225, 2017.
[36] L. Zhang, Y. Cai, C. Li, R. C. de Lamare, and M. Zhao, "Low-complexity correlated time-averaged variable forgetting factor mechanism for diffusion RLS algorithm in sensor networks," in Proc. IEEE SAM, Rio de Janeiro, Brazil, 2016, pp. 1–5.
[37] Y. J. Chu and C. M. Mak, "A new parametric adaptive nonstationarity detector and application," IEEE Trans. Signal Process., vol. 56, no. 19, pp. 5203–5214, Nov. 2017.
[38] S. C. Chan and Y. J. Chu, "A new state-regularized QRRLS algorithm with a variable forgetting factor," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 59, no. 3, pp. 183–187, Mar. 2012.

