ABSTRACT

In this paper, different adaptive algorithms for stereophonic acoustic echo cancellation are compared. The algorithms include the simple LMS algorithm and two specialized two-channel adaptive algorithms. Due to the high calculation complexity needed in stereophonic acoustic echo cancellation applications, the time domain algorithms are applied in a subband structure. The comparison includes aspects such as convergence rate, calculation complexity, signal path delay and memory usage. Real-life recordings are used in the evaluation.
1. INTRODUCTION

The increasing use of teleconferencing systems and desktop conferencing, where the acoustic echo canceler (AEC) plays a central role, has led to the requirement of faster and better performing algorithms. In these applications there is a desire to have far better sound quality and sound localization than what has been provided previously. These quality improvements can be achieved by increasing the signal bandwidth and also by adding more audio channels to the system. This last fact spurred the need for multi-channel acoustic echo cancelers, of which the two-channel (stereo) AEC is the most interesting since only complexity issues differ in the more general multi-channel case. Figure 1 illustrates the concept of stereophonic echo cancellation between a transmission room and a receiving room. As depicted in the figure, the echo is due to the acoustic coupling between the loudspeakers and the microphones in the receiving room. The solution is to estimate this coupling and subtract an estimated echo from the return signal.

Figure 1: Schematic diagram of a stereophonic echo canceler. (Diagram labels: S, g1(n), g2(n), x1(n), x2(n), h1, h2, ĥ1(n), ĥ2(n), y(n), e(n); the rooms are marked "Rec. room" and "Trans. room".)

Stereophonic acoustic echo cancellation (SAEC) is fundamentally different from traditional mono echo cancellation. Four mono echo cancelers straightforwardly implemented in the stereo case would have to track changing echo paths not only in the receiving room but also in the transmission room! For example, the canceler has to reconverge if one talker stops talking and another starts talking at a different location in the transmission room. There is no adaptive algorithm that can track such a change sufficiently fast, and this scheme therefore results in poor echo suppression. Thus, a generalization of the mono AEC to the stereo case does not result in satisfactory performance.

The theory explaining the problem of SAEC is described in [1]. The fundamental problem is that the two channels may carry linearly related signals, which in turn may make the normal equations to be solved by the adaptive algorithm singular. This implies that there is no unique solution to the equations but an infinite number of solutions, and it can be shown that all solutions (except the physically true one) depend on the transmission room. As a result, intensive studies have been made of how to handle this properly.

A complete theory of non-uniqueness and characterization of the SAEC solution was presented in [2]. It was shown that the only solution to the non-uniqueness problem is to reduce the correlation between the stereo signals, and an efficient low-complexity method for this purpose was also given [2]. Several decorrelation methods have since been presented [3, 4, 5], and even if these decrease the correlation between the channels, the normal equations to be solved will still represent an ill-conditioned problem.

The performance of the SAEC is also more severely affected by the choice of algorithm than that of its monophonic counterpart. This is easily recognized since the performance of most adaptive algorithms depends on the condition number of the input signal covariance matrix. In the stereo case, the condition number is very high, and algorithms that do not take the cross-correlation between the channels into account, such as the standard Least Mean Square (LMS) or Normalized LMS (NLMS), converge slowly to the true solution.

More sophisticated algorithms such as the APA (Affine Projection Algorithm) or RLS (Recursive Least Squares), which are less affected by a high condition number, handle the stereo case much better. This is even more true for algorithms that are specially derived for the two-channel situation. In the following we will study two different types of two-channel algorithms: the two-channel fast RLS algorithm and a frequency domain adaptive algorithm.
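The non-uniqueness described in the introduction can be illustrated numerically. The following is a minimal sketch, not taken from the paper, assuming real-valued signals, a single source filtered by two short transmission room responses (the hypothetical `g1`, `g2`), and adaptive filters at least as long as those responses. Because x1 filtered by g2 equals x2 filtered by g1, adding a multiple of g2 to the first filter and subtracting the same multiple of g1 from the second leaves the echo estimate unchanged. All names and numeric values are illustrative assumptions.

```python
import random

def convolve(signal, taps):
    """Causal FIR filtering: out[n] = sum_k taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out

def residual_echo_power(y, x1, x2, h1_hat, h2_hat):
    """Mean squared error between y and the two-channel echo estimate."""
    est1 = convolve(x1, h1_hat)
    est2 = convolve(x2, h2_hat)
    errs = [yn - a - b for yn, a, b in zip(y, est1, est2)]
    return sum(e * e for e in errs) / len(errs)

random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(400)]  # single talker (assumed white)

# Hypothetical transmission room responses and receiving room echo paths.
g1, g2 = [1.0, 0.5, -0.2], [0.7, -0.3, 0.1]
h1, h2 = [0.9, 0.1, 0.05], [-0.4, 0.3, 0.2]

x1 = convolve(source, g1)
x2 = convolve(source, g2)
y = [a + b for a, b in zip(convolve(x1, h1), convolve(x2, h2))]

# A second, physically wrong solution that depends on g1 and g2.
alpha = 2.5
h1_alt = [h + alpha * g for h, g in zip(h1, g2)]
h2_alt = [h - alpha * g for h, g in zip(h2, g1)]

# A generic wrong filter pair, for contrast.
h1_wrong = [h1[0] + 0.5] + h1[1:]

mse_true = residual_echo_power(y, x1, x2, h1, h2)
mse_alt = residual_echo_power(y, x1, x2, h1_alt, h2_alt)
mse_wrong = residual_echo_power(y, x1, x2, h1_wrong, h2)
print(mse_true, mse_alt, mse_wrong)
```

Both `mse_true` and `mse_alt` vanish up to rounding, while the generic perturbation does not. The alternative solution changes whenever `g1` or `g2` changes, which is why a canceler that has settled on it must reconverge when the talker moves.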
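Table 1: Two-channel FRLS algorithm for complex arithmetic. The transposition and the Hermitian transposition operators are denoted $T$ and $H$, respectively, and the conjugate is denoted $*$. $\lambda \in (0, 1]$ is the forgetting factor and $k \in [1, 2.5]$ a stabilization parameter. Matrix sizes are given in parentheses.

Input signals
$\boldsymbol{\chi}(n) = [\,x_1(n) \;\; x_2(n)\,]^T$  (2 × 1)
$\mathbf{x}(n) = [\,\boldsymbol{\chi}^T(n) \;\cdots\; \boldsymbol{\chi}^T(n-L+1)\,]^T$  (2L × 1)

Prediction
$\mathbf{e}_A(n) = \boldsymbol{\chi}(n) - \mathbf{A}^H(n-1)\,\mathbf{x}(n-1)$  (2 × 1)
$\varphi_1(n) = \varphi(n-1) + \mathbf{e}_A^H(n)\,\mathbf{E}_A^{-1}(n-1)\,\mathbf{e}_A(n)$  (1 × 1)
$\begin{bmatrix} \mathbf{M}(n) \\ \mathbf{m}(n) \end{bmatrix} = \begin{bmatrix} \mathbf{0} \\ \mathbf{G}(n-1) \end{bmatrix} + \begin{bmatrix} \mathbf{I}_2 \\ -\mathbf{A}(n-1) \end{bmatrix} \mathbf{E}_A^{-1}(n-1)\,\mathbf{e}_A(n)$  ((2L + 2) × 1)
$\mathbf{e}_{B2}(n) = \boldsymbol{\chi}(n-L) - \mathbf{B}^H(n-1)\,\mathbf{x}(n)$  (2 × 1)
$\varphi(n) = \varphi_1(n) - \mathbf{e}_{B2}^H(n)\,\mathbf{m}(n)$  (1 × 1)
$\mathbf{A}(n) = \mathbf{A}(n-1) + \mathbf{G}(n-1)\,\mathbf{e}_A^H(n)/\varphi(n-1)$  (2L × 2)
$\mathbf{E}_A(n) = \lambda\,[\,\mathbf{E}_A(n-1) + \mathbf{e}_A(n)\,\mathbf{e}_A^H(n)/\varphi(n-1)\,]$  (2 × 2)
$\mathbf{G}(n) = \mathbf{M}(n) + \mathbf{B}(n-1)\,\mathbf{m}(n)$  (2L × 1)
$\mathbf{e}_{B1}(n) = \mathbf{E}_B(n-1)\,\mathbf{m}(n)$  (2 × 1)
$\mathbf{e}_B(n) = k\,\mathbf{e}_{B2}(n) + (1-k)\,\mathbf{e}_{B1}(n)$  (2 × 1)
$\mathbf{B}(n) = \mathbf{B}(n-1) + \mathbf{G}(n)\,\mathbf{e}_B^H(n)/\varphi(n)$  (2L × 2)
$\mathbf{E}_B(n) = \lambda\,[\,\mathbf{E}_B(n-1) + \mathbf{e}_{B2}(n)\,\mathbf{e}_{B2}^H(n)/\varphi(n)\,]$  (2 × 2)

Filtering
$e(n) = y(n) - \hat{\mathbf{h}}^H(n-1)\,\mathbf{x}(n)$  (1 × 1)
$\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n-1) + \mathbf{G}(n)\,e^*(n)/\varphi(n)$  (2L × 1)

Definition
$\hat{\mathbf{h}}(n) = [\,\hat{h}_{1,0}(n) \;\; \hat{h}_{2,0}(n) \;\cdots\; \hat{h}_{1,L-1}(n) \;\; \hat{h}_{2,L-1}(n)\,]^T$

Table 2: Two-channel frequency domain adaptive filter; $i$ denotes channel 1 or 2. The block size is denoted $L$ and the number of adaptive coefficients per frequency bin $K$, resulting in a total adaptive filter length of $KL$. The forgetting factor is $0 < \beta < 1$, the channel cross-correlation dependence $0 \le \rho \le 1$ and the adaptive step size $\mu_b$. $\mathbf{F}$ denotes the Fourier matrix. Underlined quantities are frequency domain vectors. Matrix sizes are given in parentheses.

Input signals
$\mathbf{X}_i(m) = \mathrm{diag}\{\mathbf{F}\,[\,x_i(mL-L) \;\cdots\; x_i(mL+L-1)\,]^T\}$  (2L × 2L)
$\mathbf{y}(m) = [\,\mathbf{0}_{1 \times L} \;\; y(mL) \;\cdots\; y(mL+L-1)\,]^T$  (2L × 1)

Power spectrum estimation with regularization
$\mathbf{S}_{i,j}(m) = \beta\,\mathbf{S}_{i,j}(m-1) + (1-\beta)\,\mathbf{X}_i^*(m)\,\mathbf{X}_j(m)$  (2L × 2L)
$\tilde{\mathbf{S}}_{i,i}(m) = \mathbf{S}_{i,i}(m) + \mathrm{diag}\{\mathbf{e}_{\mathrm{reg}}\}$  (2L × 2L)
$\mathbf{S}_i(m) = \tilde{\mathbf{S}}_{i,i}(m)\,[\,\mathbf{I}_{2L \times 2L} - \rho^2\,\mathbf{S}_{1,2}^*(m)\,\mathbf{S}_{1,2}(m)\,\{\tilde{\mathbf{S}}_{1,1}(m)\,\tilde{\mathbf{S}}_{2,2}(m)\}^{-1}\,]$  (2L × 2L)

Filtering
$\underline{\hat{\mathbf{y}}}'_i(m) = \sum_{k=0}^{K-1} \mathbf{X}_i(m-k)\,\underline{\hat{\mathbf{h}}}_{i,k}(m-1)$  (2L × 1)
$\mathbf{e}(m) = \mathbf{y}(m) - \mathbf{W}_1\,\mathbf{F}^{-1}\,[\,\underline{\hat{\mathbf{y}}}'_1(m) + \underline{\hat{\mathbf{y}}}'_2(m)\,]$  (2L × 1)
$\underline{\hat{\mathbf{h}}}_{1,k}(m) = \underline{\hat{\mathbf{h}}}_{1,k}(m-1) + \mu_b\,\mathbf{F}\,\mathbf{W}_2\,\mathbf{F}^{-1}\,\mathbf{S}_1^{-1}\,[\,\mathbf{X}_1^*(m-k) - \rho\,\mathbf{S}_{1,2}\,\tilde{\mathbf{S}}_{2,2}^{-1}\,\mathbf{X}_2^*(m-k)\,]\,\mathbf{F}\,\mathbf{e}(m)$  (2L × 1)
$\underline{\hat{\mathbf{h}}}_{2,k}(m) = \underline{\hat{\mathbf{h}}}_{2,k}(m-1) + \mu_b\,\mathbf{F}\,\mathbf{W}_2\,\mathbf{F}^{-1}\,\mathbf{S}_2^{-1}\,[\,\mathbf{X}_2^*(m-k) - \rho\,\mathbf{S}_{2,1}\,\tilde{\mathbf{S}}_{1,1}^{-1}\,\mathbf{X}_1^*(m-k)\,]\,\mathbf{F}\,\mathbf{e}(m)$  (2L × 1)

Definitions
$\mathbf{e}(m) = [\,\mathbf{0}_{1 \times L} \;\; e(mL) \;\cdots\; e(mL+L-1)\,]^T$  (2L × 1)
$\underline{\hat{\mathbf{h}}}_{i,k}(m) = \mathbf{F}\,[\,\hat{\mathbf{h}}_{i,k}^T(m) \;\; \mathbf{0}_{1 \times L}\,]^T$  (2L × 1)
$\hat{\mathbf{h}}_{i,k}(m) = [\,\hat{h}_{i,kL}(m) \;\cdots\; \hat{h}_{i,kL+L-1}(m)\,]^T$  (L × 1)
$\mathbf{W}_1 = \mathrm{diag}\{[\,\mathbf{0}_{1 \times L} \;\; \mathbf{1}_{1 \times L}\,]\}$, $\mathbf{W}_2 = \mathrm{diag}\{[\,\mathbf{1}_{1 \times L} \;\; \mathbf{0}_{1 \times L}\,]\}$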
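The power spectrum estimation step of Table 2 can be sketched per frequency bin, since all matrices in that step are diagonal. The sketch below is illustrative only. It assumes the diagonals are stored as plain Python lists, that the regularization is a single positive constant `e_reg` applied to every bin, and that the function names (`update_spectra`, `normalization_spectra`) are hypothetical. None of these choices comes from the paper.

```python
def update_spectra(prev, X1, X2, beta):
    """One recursion S_ij(m) = beta*S_ij(m-1) + (1-beta)*conj(X_i)*X_j, per bin.

    prev is a tuple (S11, S22, S12) of lists holding the diagonal entries.
    """
    S11, S22, S12 = prev
    new11 = [beta * s + (1 - beta) * (a.conjugate() * a).real
             for s, a in zip(S11, X1)]
    new22 = [beta * s + (1 - beta) * (b.conjugate() * b).real
             for s, b in zip(S22, X2)]
    new12 = [beta * s + (1 - beta) * a.conjugate() * b
             for s, a, b in zip(S12, X1, X2)]
    return new11, new22, new12

def normalization_spectra(S11, S22, S12, rho, e_reg):
    """S_i = S~_ii * (1 - rho^2 * |S_12|^2 / (S~_11 * S~_22)), per bin."""
    S1, S2 = [], []
    for s11, s22, s12 in zip(S11, S22, S12):
        t11 = s11 + e_reg  # regularized auto-spectrum, channel 1
        t22 = s22 + e_reg  # regularized auto-spectrum, channel 2
        coherence_term = rho ** 2 * abs(s12) ** 2 / (t11 * t22)
        S1.append(t11 * (1 - coherence_term))
        S2.append(t22 * (1 - coherence_term))
    return S1, S2

# Fully coherent channels (X2 equal to X1) as a worst-case illustration.
X1 = [complex(1, 1), complex(2, 0), complex(0, -1)]
X2 = list(X1)
zeros = [0.0] * len(X1)
S11, S22, S12 = update_spectra((zeros, zeros, [0j] * len(X1)), X1, X2, beta=0.9)

S1_rho0, S2_rho0 = normalization_spectra(S11, S22, S12, rho=0.0, e_reg=0.01)
S1_rho1, S2_rho1 = normalization_spectra(S11, S22, S12, rho=1.0, e_reg=0.01)
print(S1_rho0, S1_rho1)
```

With ρ = 0 the normalization reduces to the regularized auto-spectrum of each channel. With ρ = 1 and strongly coherent channels the normalization spectrum shrinks, which enlarges the effective step through $\mathbf{S}_i^{-1}$ in the coefficient update; the regularization keeps it strictly positive.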
2. ADAPTIVE ALGORITHMS

A few adaptive algorithms that exploit the correlation between the two channels in SAEC have been derived. In this section, we investigate important properties of such algorithms. The algorithms that have been chosen are the two-channel fast recursive least squares (FRLS) algorithm, Table 1 [6], applied in a subband scheme, and the unconstrained version of a two-channel frequency domain adaptive filter, Table 2 [7]. For comparison, the normalized least mean square (NLMS) algorithm is also studied. In Table 3, the calculation complexity, signal path delay and memory usage of these algorithms are exemplified, and finally, in Sec. 3, convergence and tracking properties are shown in simulations. All complexity and memory usage numbers are calculated for the full SAEC, i.e. with four adaptive filters and two return signals, as depicted in Fig. 1.

2.1. Normalized Least Mean Square

calculated as

$e(n) = y(n) - \mathbf{x}_1^H\,\hat{\mathbf{h}}_1(n-1) - \mathbf{x}_2^H\,\hat{\mathbf{h}}_2(n-1)$  (1)

$\hat{\mathbf{h}}_i(n) = \hat{\mathbf{h}}_i(n-1) + \dfrac{\mu}{\mathbf{x}_1^H \mathbf{x}_1 + \mathbf{x}_2^H \mathbf{x}_2}\,\mathbf{x}_i\,e(n), \quad i = 1, 2$  (2)

where $\mathbf{x}_i$ is the transmission room signal vector containing the last $L$ samples for channel $i$, and $\hat{\mathbf{h}}_i(n)$ is the filter estimate vector. The main disadvantage of this algorithm is its slow convergence on signals with a large eigenvalue spread of the correlation matrix [8], such as speech signals, and on correlated channels in the two-channel situation. For the long adaptive filters needed in SAEC, calculation complexity is also significant. The number of real-valued multiplications per sample needed for the four adaptive filters depicted in Fig. 1 is 8L for real-valued signals and 32L for complex-valued signals. The calculation complexity can be reduced by applying the adaptive filters in a subband structure, but this will also introduce signal transmission delay to the otherwise delayless NLMS algorithm. This is described in Sec. 2.4.
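Equations (1) and (2) can be sketched as a single update step. The following is a minimal illustration for real-valued signals, not the implementation used in the paper. The small constant `eps` guarding the division, the filter length, the step size and the test signals (two independent white noise channels, chosen so that the solution is unique) are all assumptions made for the example.

```python
import random

def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(p * q for p, q in zip(a, b))

def nlms_step(h1, h2, x1, x2, y, mu, eps=1e-8):
    """One two-channel NLMS iteration for real-valued signals.

    x1 and x2 hold the last L samples of each channel, newest first.
    eps is an added safeguard against division by zero (an assumption).
    """
    # Eq. (1): a priori error using the previous filter estimates.
    e = y - dot(x1, h1) - dot(x2, h2)
    # Eq. (2): both filters share one normalization by the total input power.
    gain = mu * e / (dot(x1, x1) + dot(x2, x2) + eps)
    h1_new = [h + gain * x for h, x in zip(h1, x1)]
    h2_new = [h + gain * x for h, x in zip(h2, x2)]
    return h1_new, h2_new, e

random.seed(1)
L = 4
h1_true = [0.8, -0.3, 0.2, 0.05]   # hypothetical receiving room echo paths
h2_true = [-0.5, 0.4, 0.1, -0.02]
h1_hat, h2_hat = [0.0] * L, [0.0] * L
x1_buf, x2_buf = [0.0] * L, [0.0] * L

for _ in range(2000):
    # Independent white channels: the well-conditioned case, unlike real stereo.
    x1_buf = [random.gauss(0.0, 1.0)] + x1_buf[:-1]
    x2_buf = [random.gauss(0.0, 1.0)] + x2_buf[:-1]
    y = dot(x1_buf, h1_true) + dot(x2_buf, h2_true)
    h1_hat, h2_hat, e = nlms_step(h1_hat, h2_hat, x1_buf, x2_buf, y, mu=0.5)

true_all = h1_true + h2_true
est_all = h1_hat + h2_hat
misalignment = sum((a - b) ** 2 for a, b in zip(est_all, true_all)) / dot(true_all, true_all)
print(misalignment)
```

In this idealized setting the filters converge quickly; with strongly correlated channels the same update converges far more slowly. As a rough accounting, each of the four filters needs about L multiplications for filtering and L for the update, and a complex multiplication costs four real ones, which is consistent with the 8L and 32L figures quoted above.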
the two algorithms specially derived for the two-channel situation are less affected than the other two.

Finally, Fig. 4 shows that even though the fullband NLMS appears to perform reasonably well in Figs. 2 and 3, it performs well only in the region with the most signal energy, 200–2500 Hz. As the human ear is also sensitive to higher frequencies, Fig. 4 tells us

[Figure: level in dB for Subband NLMS, Subband FRLS, NLMS and Frequency Alg.; the caption fragment reads "types as in Fig. 2."]
Figure 2: (a) Transmission room speech signal (left channel). The instantaneous receiving room impulse responses, h_i(t), change after 20 s. (b) Mean square error performance (in dB, over 0 to 30 s) for four adaptive algorithms: NLMS (dash-dotted line), subband NLMS (dotted line), two-channel FRLS (solid line), two-channel frequency algorithm (dashed line).

4. REFERENCES

[1] M. M. Sondhi, D. R. Morgan, and J. L. Hall, “Stereophonic acoustic echo cancellation - an overview of the fundamental problem,” IEEE Signal Processing Lett., vol. 2, no. 8, pp. 148–151, August 1995.
[2] J. Benesty, D. R. Morgan, and M. M. Sondhi, “A better understanding and an improved solution to the specific problems of stereophonic acoustic echo cancellation,” IEEE Trans. Speech and Audio Processing, vol. 6, no. 2, pp. 156–165, March 1998.
[3] T. Gänsler and P. Eneroth, “Influence of audio coding on stereophonic acoustic echo cancellation,” in Proc. IEEE ICASSP, 1998, pp. 3649–3652.
[4] A. Gilloire and V. Turbin, “Using auditory properties to improve the behavior of stereophonic acoustic echo cancellers,” in Proc. IEEE ICASSP, 1998, pp. 3681–3684.
[5] S. Shimauchi, Y. Haneda, S. Makino, and Y. Kaneda, “New configuration for a stereo echo canceller with nonlinear pre-processing,” in Proc. IEEE ICASSP, 1998, pp. 3685–3688.
[6] J. Benesty, F. Amand, A. Gilloire, and Y. Grenier, “Adaptive filtering algorithms for stereophonic acoustic echo cancellation,” in Proc. IEEE ICASSP, 1995, pp. 3099–3102.
[7] J. Benesty, A. Gilloire, and Y. Grenier, “A frequency domain stereophonic acoustic echo canceler exploiting the coherence between the channels,” The Journal of the Acoustical Society of America, vol. 106, pp. L30–L35, Sept 1999.
[8] S. Haykin, Adaptive Filter Theory, chapter 9, Prentice Hall International, 3rd edition, 1996.
[9] M. G. Bellanger, Adaptive Digital Filters and Signal Analysis, Marcel Dekker, 1987.
[10] S. L. Gay and J. Benesty, Eds., Acoustic Signal Processing For Telecommunication, vol. 551, chapter 8, Kluwer Academic Publishers, 2000, ISBN 0-7923-7814-8.
[11] K. Ochiai, T. Araseki, and T. Ogihara, “Echo cancellation with two path models,” IEEE Trans. on Communications, vol. COM-25, no. 6, pp. 589–595, June 1977.
[12] A. Gilloire and M. Vetterli, “Adaptive filtering in subbands with critical sampling: Analysis, experiments, and application to acoustic echo cancellation,” IEEE Trans. on Signal Processing, vol. 40, no. 8, pp. 1862–1875, August 1992.
[13] W. Kellermann, “Analysis and design of multirate systems for cancellation of acoustical echoes,” in Proc. of ICASSP, 1988, pp. 2570–2573.