PII: S0165-1684(20)30132-8
DOI: https://doi.org/10.1016/j.sigpro.2020.107589
Reference: SIGPRO 107589
Please cite this article as: Shuyong Zhou, Haiquan Zhao, Statistics Variable Kernel Width for Maximum Correntropy Criterion Algorithm, Signal Processing (2020), doi: https://doi.org/10.1016/j.sigpro.2020.107589
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.
This paper summarizes several variable kernel width maximum correntropy criterion (MCC) algorithms and discusses their basic principles. A close relationship between these algorithms and the LMS algorithm is analyzed and established.
Then a new statistics variable kernel width MCC (SVKW-MCC) algorithm is proposed on the basis of previous variable kernel width algorithms.
The SVKW-MCC algorithm addresses the shortcomings of several well-known variable kernel width algorithms.
The SVKW-MCC algorithm uses a statistical method to compute the kernel width and to eliminate the abnormal errors caused by impulsive noise.
The stability and steady-state mean-square performance of the proposed algorithm are analyzed and verified by experiments.
Statistics Variable Kernel Width for Maximum Correntropy Criterion Algorithm
Abstract: Since the maximum correntropy criterion (MCC) algorithm with a constant kernel width leads to a trade-off between convergence rate and steady-state misalignment, various adaptive kernel width MCC algorithms have been derived to solve this problem. However, the superior performance of these algorithms depends mainly on a specific data range, or requires complicated calculation and parameter setting. Thus, this paper proposes a statistics variable kernel width MCC (SVKW-MCC) algorithm to overcome these problems. Specifically, the proposed algorithm calculates the mean and variance of the error signal, removes the data that significantly deviate from the mean value of the error signal, recalculates the mean and variance after removing these abnormal data, and then computes the new kernel width from the new mean and variance. Simulation results in system identification and echo cancellation scenarios show that the proposed algorithm outperforms the existing variable kernel width methods. Moreover, the stability and steady-state mean-square performance of the proposed algorithm are analyzed and verified by experiments.
Keywords: Maximum correntropy criterion, variable kernel width, impulsive interferences, statistics variable kernel width.
1. Introduction
In recent years, information theoretic learning (ITL) [1, 2] has been applied to non-Gaussian signal processing, especially in impulsive noise environments. The minimum entropy [3-5] and the maximum correntropy [6-15] criteria are the most used in ITL theory. Among them, the maximum correntropy criterion (MCC) is popular for its simplicity and robustness.
Correntropy is defined as a measure of how similar [16] two random variables are in a neighborhood of the joint space controlled by the kernel width, i.e., the kernel width acts as a zoom lens [9], controlling the "observation window" in which similarity is assessed. The smaller the kernel width, the more sensitive the MCC algorithm is to observation errors; however, when the kernel width is large enough, the algorithm degrades to the LMS algorithm [22]. The kernel width of MCC thus leads to a trade-off between learning speed and steady-state accuracy. The selection of a suitable kernel width
Shuyong Zhou, Haiquan Zhao are with the Key Laboratory of Magnetic Suspension Technology and Maglev Vehicle, and the National Rail Transportation
Electrification and Automation Engineering Technology Research Center under Grant NEEC-2019-A02, Ministry of Education, and the School of Electrical
Engineering, Southwest Jiaotong University, Chengdu, 610031, China.
* Corresponding author
E-mail addresses: 2241903@qq.com; hqzhao_swjtu@126.com
is critical to MCC-based algorithms.
In order to resolve the contradiction between steady-state performance and fast convergence, several adaptive kernel width methods have been proposed [16-21], such as the switch kernel width maximum correntropy criterion (SMCC) algorithm [16], the variable kernel width maximum correntropy criterion (VKW-MCC) algorithm [18], the adaptive kernel width maximum correntropy criterion (AMCC) algorithm [17], and the improved variable kernel width maximum correntropy criterion (IVKW-MCC) algorithm [21]. However, the existing kernel width selection methods are not entirely satisfactory: the SMCC and AMCC algorithms only perform well in certain environments, the parameter setting of the VKW-MCC algorithm is very complicated, and the IVKW-MCC algorithm is computationally expensive.
This paper summarizes several variable kernel width MCC algorithms and discusses their basic principles. A close relationship between these algorithms and the LMS algorithm is analyzed and established [22]. Then a new statistics variable kernel width MCC (SVKW-MCC) algorithm is proposed on the basis of previous variable kernel width algorithms. Afterwards, the convergence performance of the algorithm is analyzed, and the steady-state excess mean square error (EMSE) of the SVKW-MCC algorithm is studied based on the energy conservation relation [24-28]. Simulations in system identification and echo cancellation scenarios that include non-Gaussian impulsive interferences show that the proposed
2. The MCC algorithm and existing variable kernel width methods
In the MCC algorithm, correntropy is a nonlinear local similarity measure between two random variables X and Y, defined as
$V(X,Y) = E[k(X,Y)] = \int k(x,y)\, dF_{X,Y}(x,y)$, (1)
where E[·] denotes the expectation operator, $F_{X,Y}(x,y)$ is the joint distribution function of X and Y, and k(·,·) is a symmetric positive definite Gaussian kernel controlled by the kernel width $\sigma_0$, defined as [8]:
$k(x,y) = \frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{(x-y)^2}{2\sigma_0^2}\right)$. (2)
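As a quick sanity check, the kernel of Eq. (2) and a sample estimate of correntropy can be sketched in Python (the function names are illustrative, not from the paper):

```python
import numpy as np

def gaussian_kernel(x, y, sigma0):
    """Gaussian kernel of Eq. (2):
    k(x, y) = exp(-(x - y)^2 / (2*sigma0^2)) / (sqrt(2*pi)*sigma0)."""
    return np.exp(-(np.asarray(x) - np.asarray(y)) ** 2
                  / (2.0 * sigma0 ** 2)) / (np.sqrt(2.0 * np.pi) * sigma0)

def sample_correntropy(x, y, sigma0):
    """Sample estimator of correntropy V(X, Y) = E[k(X, Y)] from paired samples,
    replacing the expectation by the sample mean."""
    return float(np.mean(gaussian_kernel(x, y, sigma0)))
```

For identical sequences the estimate attains its maximum value $1/(\sqrt{2\pi}\sigma_0)$, which shrinks as the samples move apart in the joint space.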
Like the MSE criterion, the cost function of the MCC algorithm can be defined as [6, 9]
$J_{MCC}(w_n) = E\left[\exp\left(-\frac{(d_n - x_n^T w_n)^2}{2\sigma_0^2}\right)\right]$, (3)
where $e_n = d_n - x_n^T w_n$ is the a priori output error, $w_n$ is the estimate at iteration n of the unknown system $w_{opt}$, $w_{opt}$ is an L×1 weight vector, E[·] denotes the expectation operator, and $d_n$ is defined as
$d_n = x_n^T w_{opt} + v_n$, (4)
where $d_n$ is the output signal, $x_n = [x_n, x_{n-1}, \ldots, x_{n-L+1}]^T$ is the L×1 input signal vector, and $v_n$ is the Gaussian background noise plus impulsive noise. Similarly to the MSE criterion, we can use the stochastic gradient ascent approach:
$W_{n+1} = W_n + \mu E\left[\exp\left(-\frac{e_n^2}{2\sigma_0^2}\right) e_n x_n\right]$. (5)
According to [9], the weight update equation using the maximum correntropy cost function can be reduced to the simple form
$W_{n+1} = W_n + \mu \exp\left(-\frac{e_n^2}{2\sigma_0^2}\right) e_n x_n$.
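A minimal sketch of this fixed-kernel-width MCC update in Python, assuming a length-L FIR system identification setup (names and default values are illustrative):

```python
import numpy as np

def mcc_filter(x, d, L, mu=0.01, sigma0=2.0):
    """Fixed-kernel-width MCC adaptive filter, implementing
    W_{n+1} = W_n + mu * exp(-e_n^2 / (2*sigma0^2)) * e_n * x_n."""
    w = np.zeros(L)
    for n in range(L - 1, len(x)):
        xn = x[n - L + 1:n + 1][::-1]        # regressor [x_n, x_{n-1}, ..., x_{n-L+1}]
        en = d[n] - xn @ w                   # a priori error
        # exponential gain suppresses the update when |e_n| is impulsive
        w = w + mu * np.exp(-en ** 2 / (2.0 * sigma0 ** 2)) * en * xn
    return w
```

With a large kernel width the exponential gain stays near one and the loop behaves like plain LMS, which is exactly the trade-off the paper discusses.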
The switch kernel width maximum correntropy criterion (SMCC) algorithm is obtained by maximizing the following cost function [16]:
$\max_{\sigma_n} J'_{SMCC}(n) = \frac{1}{\sigma_n^2}\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n$. (6)
The SMCC algorithm considers $\sigma_n$ a function of $e_n$ in (6). After a simple calculation, the weight update expression and the kernel width of the SMCC algorithm are
$W_{n+1} = W_n + \mu\,\frac{1}{\sigma_n^2}\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n x_n$, with $\sigma_n^2 = e_n^2/2$. (7)
Equation (8) can be obtained by substituting the kernel width of the SMCC algorithm into its weight update equation:
$W_{n+1} = W_n + \mu\,\frac{2}{e_n}\exp(-1)\, x_n$. (8)
In this case, the algorithm is no longer an MCC-based algorithm and will diverge according to formula (8); therefore, the kernel width of the SMCC algorithm switches between this error-based kernel width and a predetermined constant kernel width $\sigma_0$ [16].
The kernel width calculation principle of the AMCC algorithm [17] is the same as that of the previous algorithm, and its kernel width is $\sigma_n^2 = e_n^2 + \sigma_0^2$. (10)
However, these algorithms are only robust for specific data matching the predicted kernel width; thus, [18] proposes the variable kernel width MCC (VKW-MCC) algorithm to overcome this problem. The kernel width of the VKW-MCC algorithm is calculated at each iteration by maximizing the following cost function with respect to the kernel width $\sigma_n$:
$\max_{\sigma_n} J_{VKW}(e_n) = \exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)$, (11)
which yields
$\sigma_n = \kappa\, |e_n|$. (12)
In order to prevent the VKW-MCC algorithm from losing robustness, the VKW-MCC algorithm uses the following window of past absolute errors:
$A_{|e|,n} = [|e_n|, |e_{n-1}|, \ldots, |e_{n-N_w+1}|]$, (13)
where $N_w$ is the width of the window function $A_{|e|,n}$; then the threshold of the error is expressed as
$\theta_n = \zeta\,\theta_{n-1} + (1-\zeta)\min(A_{|e|,n})$, (14)
where $0 < \zeta < 1$ is the smoothing factor and min(·) denotes the sample minimum operation, which helps to remove the impulsive samples. The error used in (12) is then shrunk as
$\bar{e}_n = \begin{cases} 0, & |e_n| \ge \theta_n/\zeta \\ e_n, & |e_n| < \theta_n/\zeta \end{cases}$. (15)
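Under the assumptions above about the threshold form (the exact constants and threshold rule in [18] may differ), one VKW-MCC kernel width step can be sketched as:

```python
import numpy as np

def vkw_kernel_width(e_window, theta_prev, kappa=2.0, zeta=0.95):
    """Illustrative sketch of the VKW-MCC kernel width rule:
    smooth a minimum-based error threshold to suppress impulses,
    shrink the current error if it exceeds the threshold, then set
    sigma_n = kappa * |e_n|. kappa, zeta and the fallback when the
    error is zeroed are assumptions, not values from the paper."""
    abs_e = np.abs(e_window)                                  # window A_{|e|,n}
    theta = zeta * theta_prev + (1.0 - zeta) * np.min(abs_e)  # smoothed threshold
    en = e_window[0]                                          # current error e_n
    e_shrunk = 0.0 if abs(en) >= theta / zeta else en         # impulse rejection
    # fall back to the threshold itself when the error was zeroed (assumption)
    sigma_n = kappa * abs(e_shrunk) if e_shrunk != 0.0 else kappa * theta
    return sigma_n, theta
```

The minimum over the window is what keeps a single large impulse from inflating the threshold, since an impulse raises the maximum of the window but not its minimum.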
3. Proposed algorithm
3.1 Discussion
First, let us discuss the SMCC algorithm. Substituting the kernel width of the SMCC algorithm into the weight update equation gives the piecewise update
$W_{n+1} = W_n + \begin{cases} \mu\,\frac{1}{\sigma_0^2}\exp\left(-\frac{e_n^2}{2\sigma_0^2}\right) e_n x_n, & |e_n| \le \sqrt{2}\sigma_0 \\ \mu\,\frac{2}{e_n}\exp(-1)\, x_n, & |e_n| > \sqrt{2}\sigma_0 \end{cases}$. (16)
It is clearly seen from (16) that the update equation of the SMCC algorithm is divided into two segments by $|e_n| = \sqrt{2}\sigma_0$. When $|e_n| \le \sqrt{2}\sigma_0$, the SMCC algorithm is the ordinary MCC algorithm. When $|e_n| > \sqrt{2}\sigma_0$, the SMCC algorithm uses $1/|e_n| \ll 1$ to guarantee the robustness of the algorithm in the impulsive environment. The convergence speed of the SMCC algorithm can be compared with that of the MCC algorithm through the following gain functions:
$\Phi_{SMCC} = \frac{1}{\sigma_n^2}\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n$, (17)
$\Phi_{MCC} = \frac{1}{\sigma_0^2}\exp\left(-\frac{e_n^2}{2\sigma_0^2}\right) e_n$, (18)
where $\sigma_0$ is a constant. The simulation results of (17) and (18) are depicted in Fig. 1. It can be seen from Fig. 1 that the SMCC algorithm is essentially similar to the MCC algorithm; the SMCC algorithm only works better than the MCC algorithm under certain conditions.
Then we analyze the AMCC algorithm. Substituting (10) into the weight update equation (7), the AMCC algorithm [17] can be expressed as
$W_{n+1} = W_n + \mu\,\frac{1}{e_n^2+\sigma_0^2}\exp\left(-\frac{e_n^2}{2(e_n^2+\sigma_0^2)}\right) e_n x_n$, (19)
and the corresponding gain function is
$\Phi_{AMCC} = \frac{1}{e_n^2+\sigma_0^2}\exp\left(-\frac{e_n^2}{2(e_n^2+\sigma_0^2)}\right) e_n$. (20)
The curves of (20) under different $\sigma_0$ are depicted in Fig. 2. It can be seen from Fig. 2 that the error peak value of the curve always fluctuates between -1 and 1 no matter how the value of the kernel width changes. So the algorithm only maintains robustness against impulsive noise in some special data range. The simulation results show that when the data deviate from this range, the convergence performance is even worse than that of the conventional MCC algorithm.
Fig. 1. The curves of MCC and SMCC versus $e_n$. Fig. 2. The curves of AMCC versus $e_n$ with different $\sigma_0$.
3.2. Proposed SVKW-MCC algorithm
Because the existing variable kernel width algorithms have the shortcomings discussed above, the following statistics variable kernel width algorithm (SVKW-MCC) is proposed based on statistical probability. The MCC algorithm can be regarded as a variable step size LMS algorithm with the equivalent step-size factor
$\lambda_n = \exp\left(-\frac{e_n^2}{2\sigma^2}\right)$. (21)
From the mathematical point of view, this exponential term is a normal distribution function with a mean of zero and a variance of $\sigma^2$, where $\sigma$ is usually defined as the kernel width of the MCC algorithm. A simulation result in Fig. 3 is presented to illustrate the influence of different kernel widths. It is obvious that the equivalent step size $\lambda_n$ decreases rapidly as the error increases; when the error exceeds three times the kernel width, it almost decays to zero. This is the reason why the MCC algorithm is robust against impulsive noise. Also, the curve clearly displays the inherent shortcoming of the MCC algorithm: as the filter iterates, the error decreases while the step size increases. The relationship between error and iteration speed is
$\Phi(e_n) = \exp\left(-\frac{e_n^2}{2\sigma^2}\right) e_n$. (22)
The simulation result of (22) is shown in Fig. 4. It can be seen from Fig. 4 that the iteration speed is fastest when the error is equal to the kernel width. From the mathematical point of view, the formula is derived as
$\max_{\sigma_n} J_{SVKW-MCC}(e_n) = \exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n$. (23)
We take the derivative of (23) with respect to $e_n$ and set it to zero,
$\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\left(1-\frac{e_n^2}{\sigma_n^2}\right) = 0$, (24)
which gives
$\sigma_n = |e_n|$. (25)
The new variable kernel width algorithm update formula is derived as:
$W_{n+1} = W_n + \mu\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n x_n$, with $\sigma_n = |e_n|$. (26)
Observing the above formula, with $\sigma_n = |e_n|$ the exponential term becomes the constant $\exp(-1/2)$, so (26) reverts to the LMS algorithm, and when impulsive interferences occur the algorithm loses robustness. To avoid this drawback, the SMCC and AMCC algorithms change the kernel width form. However, it is well known that the LMS algorithm has excellent performance in the ordinary Gaussian environment. The fundamental conclusion is that the kernel width should not be equal to the error when impulsive noise occurs.
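The reduction of (26) to LMS can be checked numerically: with $\sigma_n = |e_n|$ the exponential gain is the same constant $\exp(-1/2)$ for every nonzero error magnitude, so even a huge impulsive error receives full (scaled) LMS gain:

```python
import numpy as np

# With sigma_n = |e_n|, the MCC gain exp(-e_n^2 / (2*sigma_n^2)) equals
# exp(-1/2) regardless of |e_n|, so Eq. (26) is an LMS update with the
# scaled step size mu * exp(-0.5) and no impulse suppression at all.
for en in (0.01, -1.5, 40.0, 1e6):
    gain = np.exp(-en ** 2 / (2.0 * abs(en) ** 2))
    assert abs(gain - np.exp(-0.5)) < 1e-12
```

This is why an error-proportional kernel width alone cannot provide robustness: the gain never shrinks for impulses.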
Fig. 3. Equivalent step size versus error for the MCC algorithm. Fig. 4. The curve of $\Phi(e_n)$ of the MCC algorithm.
When calculating the kernel width, the most important principle is to remove the impact of impulsive noise on the error, so we propose that the optimal kernel width should be equal to the mean value of the absolute error after the impulsive impacts have been removed:
$\sigma_n = E[|e_n|]$, (27)
$|e_n| = \alpha\, E[|e_n|]$, (28)
where $\alpha$ is a coefficient: when the error is normal, it is close to one; when the error is impulsive noise, it is very large. Substituting (28) and (27) into the update formula of the variable kernel width algorithm gives
$W_{n+1} = W_n + \mu\,\alpha\exp\left(-\frac{\alpha^2}{2}\right) E[|e_n|]\, x_n$. (29)
From the above formula (29), when the error is disturbed only by white Gaussian noise, the coefficient $\alpha$ is small and the algorithm decays to the LMS algorithm, achieving the optimal convergence effect. When the error is disturbed by impulsive noise, the exponential term converges quickly to near zero, and the algorithm has better robustness. The equivalent step-size factor is
$\lambda(\alpha) = \exp\left(-\frac{\alpha^2}{2}\right)$. (30)
It can be seen from Fig. 5 that the step size is largest when the error is equal to the mean value, and decreases sharply when the error deviates from the mean value.
According to the three-sigma rule of probability theory, most data lie near the mean:
$P\left(\left||e_n| - E[|e_n|]\right| \le 3\sqrt{D(|e_n|)}\right) > 0.95$, (31)
where P(·) denotes probability, E[·] represents the mean value, and D(·) denotes the variance. According to (31), most of the $|e_n|$ values fluctuate near the mean value during the updating process. In order to make most of the $|e_n|$ fall within the kernel width, the kernel width of SVKW-MCC is set as
$\sigma_n = E[|e_n|] + 3\sqrt{D(|e_n|)}$. (32)
According to (32), the mean and variance of the absolute value of the error need to be calculated. In order to estimate them at the current iteration, we take a window of recent error data ending at time n:
$A_{e,n} = [e_n, e_{n-1}, e_{n-2}, \ldots, e_{n-N_w+1}]$. (33)
Potential impulsive noise in the window can lead to serious distortion of the mean and variance; it can simply be considered that the impulsive noise constitutes only a small part of the data, far from the mean value. The mean and standard deviation of the above error window are calculated as
$E_A(n) = \frac{1}{N_w}\sum_{i=0}^{N_w-1}|e_{n-i}|$, (34)
$D_A(n) = \sqrt{\frac{1}{N_w}\sum_{i=0}^{N_w-1}\left(|e_{n-i}| - E_A(n)\right)^2}$. (35)
The proposed algorithm zeroes the data that deviate significantly from the mean of the error. More accurate mean and standard deviation values than the first calculation can then be obtained by recalculating $E_A(n)$ and $D_A(n)$ over the remaining data; based on experience, this is repeated three times. To guarantee a smooth update of the kernel width, the following sliding average is used:
$\sigma_n = \gamma\,\sigma_{n-1} + (1-\gamma)\left(E_A(n) + 3D_A(n)\right)$, (36)
where $\gamma$ is a constant coefficient close to 1. The proposed algorithm is summarized in Table 1.
Table 1. Summary of the SVKW-MCC algorithm.
Initialization: $W_0 = 0$, initial $\sigma_0$, step size $\mu$, window width $N_w$, smoothing factor $\gamma$.
for each iteration n:
  $e_n = d_n - x_n^T W_n$
  form the window $A_{e,n}$ and compute $E_A(n)$, $D_A(n)$
  repeat three times: zero the entries with $\left||e_{n-i}| - E_A(n)\right| > 3D_A(n)$ and recompute $E_A(n)$, $D_A(n)$
  $\sigma_n = \gamma\,\sigma_{n-1} + (1-\gamma)\left(E_A(n) + 3D_A(n)\right)$
  $W_{n+1} = W_n + \mu\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n x_n$
end
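The kernel width computation of Table 1, with the three-fold trimming of abnormal window entries and the sliding average of (36), can be sketched as follows (parameter values are illustrative; the paper zeroes abnormal entries, here they are equivalently dropped before recomputing the statistics):

```python
import numpy as np

def svkw_kernel_width(e_window, sigma_prev, gamma=0.95, n_trim=3):
    """Sketch of the SVKW-MCC kernel width: trim window entries farther
    than 3 standard deviations from the mean of |e| (repeated n_trim
    times, per the paper's empirical choice), then apply the sliding
    average sigma_n = gamma*sigma_prev + (1-gamma)*(E_A + 3*D_A)."""
    a = np.abs(np.asarray(e_window, dtype=float))
    for _ in range(n_trim):
        m, s = a.mean(), a.std()
        keep = np.abs(a - m) <= 3.0 * s     # drop abnormal (impulsive) entries
        if keep.all():
            break
        a = a[keep]
    m, s = a.mean(), a.std()
    return gamma * sigma_prev + (1.0 - gamma) * (m + 3.0 * s)
```

A single impulse in the window inflates the first mean and standard deviation, but it is exactly the entry that lands outside the three-sigma band, so the recomputed statistics (and hence the kernel width) are driven by the normal errors only.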
4. Performance analysis
In this section, we analyze the stability and steady-state mean-square performance of the proposed SVKW-MCC algorithm.
Similar to the MCC algorithm [6-15], the weight vector update expression of the SVKW-MCC algorithm can be written as
$W_{n+1} = W_n + \mu f(e_n) x_n$, (37)
$f(e_n) = \exp\left(-\frac{e_n^2}{2\sigma_n^2}\right) e_n$, (38)
where $\sigma_n = E[|e_n|] + 3\sqrt{D(|e_n|)}$ is the calculated kernel width. To analyze the convergence performance, the weight error vector is defined as
$\tilde{W}_n = W_{opt} - W_n$, (39)
where $W_{opt}$ contains the system parameters to be identified. Subtracting both sides of (37) from $W_{opt}$ yields
$\tilde{W}_{n+1} = \tilde{W}_n - \mu f(e_n) x_n$. (40)
Equation (41) can be obtained by taking the squared norm and expectation of both sides of (40) [25-28]:
$E\left[\|\tilde{W}_{n+1}\|^2\right] = E\left[\|\tilde{W}_n\|^2\right] - 2\mu E\left[e_{a,n} f(e_n)\right] + \mu^2 E\left[f^2(e_n)\|x_n\|^2\right]$, (41)
where $e_{a,n} = \tilde{W}_n^T x_n$ is the a priori error, $e_n = e_{a,n} + v_n$, and $v_n$ is the Gaussian background noise plus impulsive noise; in order to simplify the calculation, we only consider the white Gaussian noise. The convergence of the SVKW-MCC algorithm in the mean-square sense can be guaranteed if the squared weight error satisfies
$E\left[\|\tilde{W}_{n+1}\|^2\right] \le E\left[\|\tilde{W}_n\|^2\right]$, (42)
which, from (41), requires the step size to satisfy
$0 < \mu < \frac{2E\left[e_{a,n} f(e_n)\right]}{E\left[f^2(e_n)\|x_n\|^2\right]}$. (44)
In order to solve (44), the following assumptions, widely used in the theoretical analysis of adaptive filters [17, 18], are adopted:
A1: The noise sequence $v_n$ is independent, identically distributed, and independent of the input sequence $x_n$.
A2: The filter is long enough that the a priori error $e_{a,n}$ is zero-mean Gaussian and independent of the noise $v_n$.
A3: $f^2(e_n)$ is asymptotically uncorrelated with $\|x_n\|^2$.
Substituting $e_n = e_{a,n} + v_n$ and (38) into the numerator of (44), after a simple calculation we obtain
$\lim_{n\to\infty} 2E\left[e_{a,n} f(e_n)\right] = \lim_{n\to\infty} 2E\left[\left(e_{a,n}^2 + e_{a,n} v_n\right)\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right]$. (45)
According to assumption A2, $\lim_{n\to\infty} E\left[e_{a,n} v_n\right] = 0$. Substituting this into (45) yields
$\lim_{n\to\infty} 2E\left[e_{a,n} f(e_n)\right] = \lim_{n\to\infty} 2E\left[e_{a,n}^2\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right]$. (46)
According to assumption A3, the denominator in (44) can be calculated as
$E\left[f^2(e_n)\|x_n\|^2\right] = E\left[f^2(e_n)\right] E\left[\|x_n\|^2\right]$, (47)
$E\left[f^2(e_n)\right] = E\left[(e_{a,n} + v_n)^2\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right]$ (48)
$= E\left[\left(e_{a,n}^2 + 2e_{a,n}v_n + v_n^2\right)\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right]$. (49)
According to assumption A2, substituting $E\left[e_{a,n} v_n\right] = 0$ into (49), we obtain
$E\left[f^2(e_n)\right] = E\left[\left(e_{a,n}^2 + v_n^2\right)\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right]$. (50)
Combining (46), (47), and (50) with (44) gives the step-size bound
$0 < \mu < \frac{2E\left[e_{a,n}^2\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right]}{E\left[\left(e_{a,n}^2 + v_n^2\right)\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right] E\left[\|x_n\|^2\right]}$. (51)
In particular, if the variance of the background noise is much smaller than the a priori error, i.e., $E\left[v_n^2\right] \ll E\left[e_{a,n}^2\right]$, we have
$0 < \mu < \frac{2}{E\left[\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right] E\left[\|x_n\|^2\right]}$, (52)
where $E\left[\|x_n\|^2\right] = \mathrm{Tr}(R_{xx})$, $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix, and $R_{xx}$ is the covariance matrix of the input signal. When the algorithm reaches the steady state,
$E\left[\|\tilde{W}_{n+1}\|^2\right] = E\left[\|\tilde{W}_n\|^2\right]$. (53)
Comparing (54) with (43), we can conclude that the result is similar to (51), and we obtain
$2E\left[e_{a,n}^2\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right] = \mu E\left[\left(e_{a,n}^2 + v_n^2\right)\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right] E\left[\|x_n\|^2\right]$. (55)
The steady-state behavior of an adaptive filtering algorithm is generally evaluated by the excess mean square error (EMSE), which can be defined as
$S = \lim_{n\to\infty} E\left[e_{a,n}^2\right]$. (56)
Combining (56) and (55) and taking the limit of both sides of (55), we obtain
$S = \lim_{n\to\infty} \frac{\mu\,\mathrm{Tr}(R_{xx})\, E\left[\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right]\sigma_v^2}{2E\left[\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right] - \mu\,\mathrm{Tr}(R_{xx})\, E\left[\exp\left(-\frac{e_n^2}{\sigma_n^2}\right)\right]}$. (57)
The input signal is generated from a zero-mean Gaussian distribution with unit variance, and the length of the system to be identified is L, so $\mathrm{Tr}(R_{xx}) = L$. Let $E\left[v_n^2\right] = \sigma_v^2$ and $\gamma_e = \lim_{n\to\infty} E\left[\exp\left(-\frac{e_n^2}{2\sigma_n^2}\right)\right]$, where $\sigma_v^2$ is the variance of the background noise; then (57) reduces to
$S = \frac{\mu L\gamma_e\,\sigma_v^2}{2 - \mu L\gamma_e}$, (58)
where the factor $\gamma_e$ is
$\gamma_e = \lim_{n\to\infty} E\left[\exp\left(-\frac{e_n^2}{2\left(E[|e_n|] + 3\sqrt{D(|e_n|)}\right)^2}\right)\right]$. (59)
The error obeys a normal distribution with mean zero and variance $\sigma_e^2$, so $\lim_{n\to\infty} E\left[e_n^2\right] = \sigma_e^2$. According to probability and statistics theory, $E[|e_n|] = \sqrt{2/\pi}\,\sigma_e$ and $D(|e_n|) = \left(1 - 2/\pi\right)\sigma_e^2$, obtained as follows:
$E[|e_n|] = \int_{-\infty}^{\infty} |e_n| f(e_n)\, de_n = \int_{-\infty}^{\infty} |e_n|\,\frac{1}{\sqrt{2\pi}\,\sigma_e}\exp\left(-\frac{e_n^2}{2\sigma_e^2}\right) de_n$. (60)
Observing the above formula, it has obvious symmetry, so we can deduce that
$E[|e_n|] = 2\int_0^{\infty} \frac{e_n}{\sqrt{2\pi}\,\sigma_e}\exp\left(-\frac{e_n^2}{2\sigma_e^2}\right) de_n = \frac{2}{\sqrt{2\pi}\,\sigma_e}\left[-\sigma_e^2\exp\left(-\frac{e_n^2}{2\sigma_e^2}\right)\right]_0^{\infty}$ (61)
$= \frac{2\sigma_e}{\sqrt{2\pi}} = \sqrt{\frac{2}{\pi}}\,\sigma_e$. (62)
$D(|e_n|) = E\left[e_n^2\right] - \left(E[|e_n|]\right)^2$ (63)
$= \sigma_e^2 - \frac{2}{\pi}\sigma_e^2 = \left(1 - \frac{2}{\pi}\right)\sigma_e^2$. (64)
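Equations (62) and (64) are the standard moments of the absolute value of a zero-mean Gaussian variable (folded normal), and can be verified with a quick Monte-Carlo check:

```python
import numpy as np

# Monte-Carlo check of Eqs. (62) and (64): for e ~ N(0, sigma_e^2),
# E[|e|] = sqrt(2/pi) * sigma_e and D(|e|) = (1 - 2/pi) * sigma_e^2.
rng = np.random.default_rng(0)
sigma_e = 1.5
e = rng.normal(0.0, sigma_e, 2_000_000)
mean_abs = np.abs(e).mean()
var_abs = np.abs(e).var()
assert abs(mean_abs - np.sqrt(2.0 / np.pi) * sigma_e) < 5e-3
assert abs(var_abs - (1.0 - 2.0 / np.pi) * sigma_e ** 2) < 1e-2
```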
Substituting (62) and (64) into (59) yields
$\gamma_e = \lim_{n\to\infty} E\left[\exp\left(-\frac{e_n^2}{2\left(\sqrt{2/\pi} + 3\sqrt{1-2/\pi}\right)^2\sigma_e^2}\right)\right] = \exp\left(-\frac{1}{2\left(9 - \frac{16}{\pi} + 6\sqrt{\frac{2}{\pi} - \frac{4}{\pi^2}}\right)}\right)$, (65)
which is evaluated as
$\gamma_e \approx 0.8$. (66)
Substituting (66) into (58), the excess mean square error (EMSE) is obtained as
$S = \frac{0.8\,\mu L\,\sigma_v^2}{2 - 0.8\,\mu L}$. (67)
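The closed-form EMSE of (67) is straightforward to evaluate, e.g., to predict the theoretical steady-state level for a given step size (the helper name is illustrative):

```python
def emse_svkw(mu, L, sigma_v2):
    """Theoretical steady-state EMSE of Eq. (67) for a length-L system
    with unit-variance white Gaussian input and background noise
    variance sigma_v2: S = 0.8*mu*L*sigma_v2 / (2 - 0.8*mu*L)."""
    return 0.8 * mu * L * sigma_v2 / (2.0 - 0.8 * mu * L)
```

As expected from a steady-state analysis, the predicted EMSE grows monotonically with the step size $\mu$ while $0.8\mu L < 2$.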
Fig. 6. Steady-state excess mean square error versus step size when $\sigma_v^2 = 0.001$.
Fig. 7. Noise variance versus steady-state excess mean square error.
5. Simulation results
5.1. System identification
The normalized mean square deviation (NMSD) used in the simulations is defined as $\mathrm{NMSD}(n) = 10\log_{10}\left(\|w_{opt} - w_n\|^2 / \|w_{opt}\|^2\right)$. The system parameters obey a uniform distribution between zero and one half. The order of the system is 64 taps (L = 64). The input signal follows a Gaussian distribution with mean zero and variance one. Halfway through the run, the system parameters suddenly change sign. Furthermore, the desired signal is disturbed by white Gaussian noise with a signal-to-noise ratio of 30 dB. The impulsive noise $\eta(n)$ is modeled by the Bernoulli-Gaussian (BG) distribution, $\eta(n) = c(n)A(n)$, where $A(n)$ is Gaussian with mean zero and standard deviation 100, and $c(n)$ is a Bernoulli process with probability density function defined by $P(c(n)=1) = P_r$ and $P(c(n)=0) = 1 - P_r$, with $P_r = 0.01$ [13]. All simulation results are averaged over 300 independent trials.
[Table 2: step-size and kernel-width settings of each algorithm for the Gaussian background, colored-signal, and speech-signal scenarios; only fragments survive in this version, e.g., the MCC step sizes 0.0008 and 0.0009.]
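The Bernoulli-Gaussian impulsive noise described above can be generated as follows (the helper name is illustrative):

```python
import numpy as np

def bernoulli_gaussian(n, pr=0.01, std=100.0, rng=None):
    """Impulsive noise eta(n) = c(n)*A(n), with A(n) ~ N(0, std^2) and
    c(n) a Bernoulli process with P(c(n)=1) = pr, as in the simulation
    setup (Pr = 0.01, standard deviation 100)."""
    rng = np.random.default_rng(rng)
    c = rng.random(n) < pr          # Bernoulli occurrence process c(n)
    A = rng.normal(0.0, std, n)     # Gaussian amplitude process A(n)
    return c * A                    # nonzero only where an impulse fires
```

Roughly one sample in a hundred is then an impulse two orders of magnitude larger than the 30 dB background noise, which is the regime where the kernel width choice matters.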
Fig. 8. Experimental results for the MCC, SMCC, AMCC, VKW-MCC, and proposed SVKW-MCC algorithms in the Gaussian scenario, where the system parameter is suddenly changed from $w_{opt}$ to $-w_{opt}$ at the 30000th iteration. (a) System impulse response. (b) NMSD learning curves. (c) Kernel width change curve.
From the above simulation, it is clearly seen that the constant kernel width of the MCC algorithm leads to a contradiction between the convergence rate and the steady-state misalignment. When the system to be identified is changed to chart (a) in Fig. 9 and the variance of the impulsive noise is 1000, the results are shown in Fig. 9.
Fig. 9. Experimental results for the MCC, SMCC, AMCC, VKW-MCC, and proposed SVKW-MCC algorithms in the Gaussian scenario, where the system parameter is suddenly changed from $w_{opt}$ to $-w_{opt}$ at the 20000th iteration. (a) System impulse response. (b) NMSD learning curves. (c) Kernel width change curve.
The SMCC and AMCC algorithms perform worse than MCC under this particular condition. Experiments show that, in order to surpass the MCC algorithm, VKW-MCC needs appropriate parameter settings, and it is very difficult to set appropriate parameters for VKW-MCC. The proposed algorithm needs neither this prior knowledge nor such parameter tuning. The convergence speed and steady-state performance of the proposed algorithm outperform the above algorithms in the Gaussian noise environment. From the curve of the kernel width, it is observed that there is an approximately linear relationship between the kernel width and the convergence curve.
5.2. Colored input signal
When the input is a colored (correlated) signal, the results are as follows.
Fig. 10. Experimental results for the MCC, SMCC, AMCC, VKW-MCC, and proposed SVKW-MCC algorithms in the correlated-input scenario, where the system parameter is suddenly changed from $w_{opt}$ to $-w_{opt}$ at the 30000th iteration. (a) NMSD learning curves. (b) Kernel width change curve.
From the above figure, it is shown that the convergence speed and steady-state performance of the proposed algorithm are better than those of the other algorithms under correlated input conditions.
5.3. Echo cancellation simulation
In this section, the speech signal is used as the input, and an abrupt change of the echo path occurs at the $4\times10^5$th input sample. Fig. 11 compares these algorithms in this case; the parameters are set the same as in Table 2. Evidently, compared with the other variable kernel width algorithms, the proposed algorithm much better solves the trade-off between convergence rate and steady-state misalignment.
Fig. 11. Experimental results for the MCC, SMCC, AMCC, VKW-MCC, and proposed SVKW-MCC algorithms in an echo cancellation scenario, where the system parameter is suddenly changed from $w_{opt}$ to $-w_{opt}$ at the 400000th iteration. (a) Impulse response of the echo channel. (b) Input speech signal for the echo cancellation experiment. (c) NMSD learning curves.
From the above figure, it is observed that the convergence speed and steady-state performance of the proposed algorithm are better than those of the other algorithms in the speech signal environment.
Fig.12. Experimental result for the MCC, SMCC, AMCC, VKW-MCC, and proposed SVKW-MCC algorithm in Gaussian scenario,
where the system parameter is suddenly changed from w opt to –w opt at the 20000th iteration.
Because the steady-state performance of the algorithm is analyzed after it has stabilized, a straight line is used to represent the theoretical steady-state value. Therefore, the EMSE figure is not mathematically exact over the whole run, but it can be seen that the actual simulation results are close to the theoretical values.
6. Conclusion
In this paper, the statistics variable kernel width maximum correntropy criterion (SVKW-MCC) algorithm is proposed to address the shortcomings of some well-known variable kernel width algorithms. The SVKW-MCC algorithm eliminates the abnormal errors caused by impulsive noise using a statistical method. Moreover, it calculates the optimal kernel width by using the mean and variance of the errors after eliminating the abnormal errors. Simulation results show that the proposed algorithm has obvious advantages in the Gaussian environment. Since the study is based on the assumption of a Gaussian distribution, the parameters still need to be modified to achieve better results for non-Gaussian inputs such as colored signals and speech signals. Notwithstanding that, this algorithm can be extended to other MCC algorithms or M-estimation algorithms. The new algorithm is shown to be simple in calculation, wide in range of application, and better in convergence speed and steady-state performance.
Acknowledgment
This work was partially supported by the National Science Foundation of P.R. China (Grants 61871461, 61571374, 61433011).
References
[1] Principe J C. Information theoretic learning: Renyi's entropy and kernel perspectives. Springer Science & Business Media, 2010.
[2] B. Chen, Y. Zhu, J. Hu, J.C. Principe. System parameter identification: information criteria and algorithms. Newnes, 2013.
[3] D. Erdogmus and J.C. Principe, "Generalized information potential criterion for adaptive system training," Neural Networks, IEEE
Transactions on. 13.5(2002). pp.1035-1044
[4] B. Chen, J. Hu, L. Pu, Z. Sun. "Stochastic gradient algorithm under (h, φ)-entropy criterion." Circuits, Systems & Signal Processing 26.6 (2007): pp. 941-960.
[5] B. Chen, P. Zhu, J.C. Principe. "Survival information potential: a new criterion for adaptive system training." Signal Processing, IEEE Transactions on, 60.3 (2012): pp. 1184-1194.
[6] W. Liu, P.P. Pokharel, J.C. Principe. "Correntropy: A localized similarity measure." Neural Networks, 2006. IJCNN'06. International Joint Conference on. IEEE, 2006. pp. 4919-4924.
[7] Wang W, Zhao H, Zeng X. "Geometric Algebra Correntropy: Definition and Application to Robust Adaptive Filtering." IEEE
Transactions on Circuits and Systems II: Express Briefs, 2019.
[8] I. Santamaría, P.P. Pokharel, J.C. Príncipe. "Generalized correlation function: definition, properties, and application to blind
equalization." Signal Processing, IEEE Transactions on, 54.6 (2006).pp.2187-2197.
[9] W. Liu, P.P. Pokharel, J.C. Principe. "Correntropy: properties and applications in non-Gaussian signal processing." Signal Processing,
IEEE Transactions on, 55.11 (2007): pp.5286-5298
[10] Gogineni V. C, Mula S. “Improved proportionate-type sparse adaptive filtering under maximum correntropy criterion in impulsive
noise environments,” Digital Signal Processing, 2018, 79: 190-198.
[11] Li Y, Jiang Z, Shi W, et al. “Blocked maximum correntropy criterion algorithm for cluster-sparse system identifications,” IEEE
Transactions on Circuits and Systems II: Express Briefs, 2019, 66(11): 1915-1919.
[12] Li Y, Wang Y, Yang R, et al. “A soft parameter function penalized normalized maximum correntropy criterion algorithm for sparse
system identification,” Entropy, 2017, 19(1): 45.
[13] Chen B, Liu X, Zhao H, et al. “Maximum correntropy Kalman filter,” Automatica, 2017, 76: 70-77.
[14] Ma W, Chen B, Duan J, et al. “Diffusion maximum correntropy criterion algorithms for robust distributed estimation,” Digital Signal
Processing, 2016, 58: 10-19.
[15] Liu C, Qi Y, Ding W. “The data-reusing MCC-based algorithm and its performance analysis,” Chinese Journal of Electronics, 2016,
25(4): 719-725.
[16] Wang W, J. Zhao, H. Qu, B. Chen, J. C. Principe "A switch kernel width method of correntropy for channel estimation." Proc. IEEE
International Joint Conference on Neural Networks, (IJCNN’15), Killarney, Ireland. p. 1–7.
[17] W. Wang, J. Zhao, H. Qu, B. Chen, and J. C. Principe, “An adaptive kernel width update method of correntropy for channel
estimation,” in Proc. Inter. Conf. Digit. signal process. (ICDSP), 2015, pp. 916–920.
[18] F. Huang, J. Zhang, and S. Zhang, “Adaptive filtering under a variable kernel width maximum correntropy criterion,” IEEE
Transactions on Circuits and Systems II: Express Briefs, vol. 64, no. 10, pp. 1247–1251, 2017.
[19] L. Shi and Y. Lin, “Convex combination of adaptive filters under the maximum correntropy criterion in impulsive interference,”
IEEE Signal Process. Lett., vol. 21, no. 11, pp. 1385–1388, 2014.
[20] S. Zhao, B. Chen, and J. C. Principe, “An adaptive kernel width update for correntropy,” in Proc. Inter. Joint Conf. Neural Netw.,
2012, pp.1–5.
[21] Shi L, Zhao H, Y. Zakharov. “An improved variable kernel width for maximum correntropy criterion algorithm.” IEEE Transactions
on Circuits and Systems II, Exp. Briefs, to be published. doi: 10.1109/TCSII.2018.2880564.
[22] Chen B, Xing L, Zhao H, et al. “Generalized correntropy for robust adaptive filtering.” IEEE Transactions on Signal Processing,
2016, 64(13): 3376-3387.
[23] S. Wang, L. Dang, B. Chen, S. Duan, L. Wang, and C. K. Tse, “Random Fourier filters under maximum correntropy criterion,” IEEE
Transactions on Circuits and Systems I: Regular Papers., DOI:10.1109/TCSI.2018.2825241, 2018.
[24] Al-Naffouri TY, Sayed AH. “Adaptive filters with error non-linearities: Mean square analysis and optimum design.” EURASIP J
Adv Signal Process 2001;2001(1):192–205.
[25] Chen B, Xing L, Liang J, Zheng N, Principe JC. “Steady-state mean-square error analysis for adaptive filtering under the maximum
correntropy criterion.” IEEE Signal Process Letters, 2014, 21(7):880–4.
[26] Wang W, Zhao J, Qu H, et al. “Convergence performance analysis of an adaptive kernel width MCC algorithm.“ AEU-International
Journal of Electronics and Communications, 2017, 76: 71-76.
[27] Wang W, Zhao H, Zeng X, et al. “Steady-State Performance Analysis of Nonlinear Spline Adaptive Filter under Maximum
Correntropy Criterion,” IEEE Transactions on Circuits and Systems II: Express Briefs, 2019.
[28] Khalili A, Rastegarnia A, Islam M K, et al. “Steady-state tracking analysis of adaptive filter with maximum correntropy criterion,”
Circuits, Systems, and Signal Processing, 2017, 36(4): 1725-1734.
[29] B.W. Silverman, “Density estimation for statistics and data analysis,” vol.3, Chapman and hall London, 1986.
[30] M.C. Jones, J.S. Marron, and S.J. Sheather, "A brief survey of bandwidth selection for density estimation," Journal of the American Statistical Association, 91.433 (1996): pp. 401-407.
[31] A.W. Bowman, "An alternative method of cross-validation for the smoothing of density estimates," Biometrika, 71.2 (1984): pp. 353-360.
Ethical Statement
Dear Editor:
I certify that this manuscript is original and has not been published, and will not be submitted elsewhere for publication while being considered by Signal Processing. The study is not split up into several parts to increase the number of submissions, whether to various journals or to one journal over time. No data have been fabricated or manipulated (including images) to support our conclusions. No data, text, or theories by others are presented as if they were our own. Consent to submit has been received explicitly from all co-authors, and the authors whose names appear on the submission have contributed sufficiently to the scientific work and therefore share collective responsibility for the results.
This article does not contain any studies with human participants or animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.