
PREDICTION ERROR METHODS FOR TIME-DOMAIN BLIND IDENTIFICATION
OF MULTICHANNEL FIR FILTERS

K. Abed Meraim*, P. Duhamel*, D. Gesbert**, P. Loubaton*, S. Mayrargue**, E. Moulines*, D. Slock***

* Télécom Paris / URA 820 and GDR 134/CNRS, 46 rue Barrault, 75634 Paris CEDEX 13, FRANCE
** France Télécom/CNET/PAB and GDR 134/CNRS, 38-40 rue du Général Leclerc, 92131 Issy-Les-Moulineaux, FRANCE
*** Eurecom Institute, 2229 route des Crêtes, BP 193, 06904 Sophia Antipolis Cedex, FRANCE

ABSTRACT

Blind channel identification based on the oversampled channel output is a problem of current theoretical and practical interest. In this contribution, it is first demonstrated that the subspace methods developed in [1] are not robust to errors in the determination of the model order. An alternative solution is then proposed, based on a linear prediction approach. The effect of overestimating the channel order is investigated by simulations: it is demonstrated that the prediction error method is "robust" to over-determination, in contrast to most of the schemes suggested to date.

1. INTRODUCTION

Since the early work by Sato [2], most blind identification/equalization techniques have been based on the use (implicit or not) of higher-order statistics, which are known to suffer from many drawbacks (in terms of finite-sample variance, robustness to noise...). The recent proposal by Gardner [3] and Tong, Xu and Kailath [4] of methods allowing the blind identification of the channels using only second-order statistics appears as a major breakthrough in this field. The basic idea motivating these approaches consists in recognizing that appropriately oversampled communication signals are cyclostationary and that, under a mild additional hypothesis, the phase of a linear time-invariant channel can be retrieved from the periodically time-varying channel output correlation function [3, 4]. Since these algorithms rely on fractional sampling and second-order statistics, they show several desirable properties that make them suitable in a mobile communication context (see [4]): in the first place, these algorithms require fewer symbols than most of the schemes suggested to date, making these solutions attractive even in the presence of relatively rapid channel variation. Some improvements over these methods have been presented in [1] (subspace method) and [5] (deterministic maximum likelihood method). Our contribution in this paper is twofold. (i) The potential drawbacks of the subspace and related methods (proposed in [5, 1]), when the exact order of the system is unknown, are outlined and (more importantly) (ii) a time-domain procedure based on a prediction approach, which is more robust to the over-determination of the system order, is presented.

*The authors are listed in alphabetic order. This contribution is the result of a collective research effort carried out in the working groups GT4 -multivariate systems- and GT8 -communication and coding- inside the GDR 134-CNRS.

2. SUBSPACE METHODS

The continuous-time output from a linear time-invariant channel driven by a PAM/QAM sequence is given by

$$ y(t) = \sum_k s(k)\, h(t - kT) + w(t) \qquad (1) $$

where $s(k)$ is an independent and identically distributed (i.i.d.) sequence of symbols ($E(s(k)) = 0$, $E(s(k)^2) = 1$), $w(t)$ is an additive temporally white noise, $E(w(t_1) w(t_2)) = \sigma^2 \delta(t_1 - t_2)$, $T$ is the symbol period, and $h(t)$ is the impulse response of the cascade of channel, transmit and receive filters, assumed to be causal and time-limited to $(M+1)$ symbol durations. Oversampling by a factor $q > 1$ leads to a $q \times 1$ discrete sequence $y(n) = [y_1(n), \ldots, y_q(n)]^T$ (respectively $w(n) = [w_1(n), \ldots, w_q(n)]^T$), where $y_l(n) = y((l-1)T/q + nT + t_0)$ (respectively $w_l(n) = w((l-1)T/q + nT + t_0)$). According to (1), $y(n)$ can be seen as the noisy output of a $q \times 1$ time-invariant polynomial transfer function $h(z)$ driven by the scalar sequence $s(n)$, i.e.

$$ y(n) = [h(z)]\, s(n) + w(n), \qquad h(z) = \sum_{k=0}^{M} h(k) z^{-k} \qquad (2) $$

Denote by $r(\tau)$ the autocovariance function of $y(n)$, $r(\tau) = E\{y(n+\tau)\, y(n)^T\}$. Under the fundamental hypothesis $h(z) \neq 0$ for each $z$, it has been shown in [4] that $h(z)$ is identifiable from a finite number of values of $r(\tau)$. A subspace-based identification scheme was later introduced in [1]. It relies on geometric properties of the so-called Sylvester $q(N+1) \times (M+N+1)$ block-Toeplitz matrix associated with the polynomial vector $h(z)$, defined as:

$$ \mathcal{T}_N(h) = \begin{pmatrix} h(0) & \cdots & h(M) & 0 & \cdots & 0 \\ 0 & h(0) & \cdots & h(M) & \ddots & \vdots \\ \vdots & \ddots & \ddots & & \ddots & 0 \\ 0 & \cdots & 0 & h(0) & \cdots & h(M) \end{pmatrix} $$

Denoting by $R_N$ the covariance matrix of the $q(N+1) \times 1$ vector $Y_N(n) = [y(n)^T, \ldots, y(n-N)^T]^T$, it comes, by using

0-7803-2431-5/95 $4.00 © 1995 IEEE


the signal model (2) and the noise properties:

$$ Y_N(n) = \mathcal{T}_N(h)\, S_{M+N}(n) + W_N(n) $$
$$ S_{M+N}(n) = [s(n), \ldots, s(n-M-N)]^T $$
$$ R_N = \mathcal{T}_N(h)\, \mathcal{T}_N(h)^T + \sigma^2 I $$

Since $q > 1$, $R_N - \sigma^2 I$ is singular as soon as $q(N+1) > (N+1+M)$. Denote by $\Pi$ the orthogonal projection matrix onto the noise subspace of $R_N$ (i.e. the orthogonal complement of $\mathrm{Range}(\mathcal{T}_N(h))$). The subspace identification method is ultimately related to the following theorem:

Theorem 1 Assume that (H1) $h(z) \neq 0\ \forall z$, then
1. for $N \geq M - 1$, the matrix $\mathcal{T}_N(h)$ is full rank.
2. let $f(z)$ be a $q \times 1$ polynomial transfer function, $\deg(f(z)) = P$. For $N \geq M$, we have

$$ \Pi\, \mathcal{T}_N(f) = 0 \iff f(z) = p(z) h(z) $$

where $p(z)$ is some scalar polynomial. In this case, the degree $P$ must be greater than $M$.

In practice the orthogonal projector $\Pi$ is estimated from the observed signal covariance matrix $R_N$ and, if $M$ is known, $h$ can be estimated (up to a constant) by minimizing over the set of all $q \times 1$ degree-$M$ polynomials the following quadratic criterion

$$ \mathrm{Trace}\left(\mathcal{T}_N(f)^T\, \Pi\, \mathcal{T}_N(f)\right) \qquad (3) $$

under a suitable constraint (see [1] for more details). If $M$ is overestimated, the minimizing argument $\hat f(z)$ of (3) will represent an estimate of a certain polynomial $r(z)h(z)$, and another algorithm would then be required to factor out of the components of $\hat f(z)$ some approximately common zeros. But using such an algorithm would require some test to decide how many zeros are common, and the relevance of the final estimate would again depend on the result of the test. The same conclusion holds for the method proposed in [5], which is closely related to the above subspace method. The method proposed in [4] estimates the Sylvester matrix $\mathcal{T}_N(h)$ without using the specific parametrisation of the matrix. Based on simulations, it seems to be 'more' robust to errors in the estimation of the model order than subspace methods but, as will be shown in section 4, its performance is still somewhat inferior to that of the linear prediction method presented in the following.

3. PREDICTION BASED METHODS

In this section, a linear prediction approach is developed to identify $h(z)$. This approach was first investigated by Slock in [6] in the special case where $q = 2$ and the degree $M$ of $h(z)$ is known. The main contribution of this section is to generalize the results of [6] and, more importantly, to show that the linear prediction approach is robust to an over-determination of the order $M$. In order to simplify the exposition, the noiseless case $w(n) = 0$ is considered in a first part. A straightforward extension to noisy data is presented in a second part.

The basic idea behind the linear prediction approach is to recognize that the moving-average (MA) process $y(n) = [h(z)]s(n)$ is also a finite order autoregressive (AR) process. This property is related to the generalized Bezout identity (see, for example, [7]). Under (H2) ($h(z)$ is irreducible), it is known [7] that there exists a $1 \times q$ polynomial vector $g(z)$ such that $g(z)h(z) = 1$. By applying $g(z)$ to $y(n)$, it comes that $[g(z)]y(n) = s(n)$: $h(z)$ can be exactly inverted by an FIR causal filter. This relation is the key behind all subsequent derivations.

In order to proceed, some additional notations and definitions are in order. Denote by $H_{n-1}(y)$ the past of $y$ up to time $n-1$, defined by

$$ H_{n-1}(y) = \mathrm{sp}\{y_i(n-l) \,/\, i = 1, \ldots, q,\ l \geq 1\} \qquad (4) $$

Here, $\mathrm{sp}\{z_i,\ i \in I\}$ stands for the Hilbert subspace generated by $\{z_i,\ i \in I\}$. Denote accordingly by $H_{n-1,N}(y)$ the finite past of $y$,

$$ H_{n-1,N}(y) = \mathrm{sp}\{y_i(n-l) \,/\, i = 1, \ldots, q,\ 1 \leq l \leq N\} \qquad (5) $$

The innovation process $(i(n))_{n \in \mathbb{Z}}$ of $y$ is the $q$-variate white noise sequence defined by

$$ i(n) = y(n) - y(n)|_{H_{n-1}(y)} \qquad (6) $$

where $|$ stands for the orthogonal projection in $H$. The process $y$ will be said to be autoregressive of order $N$ if $i(n)$ coincides with the finite order innovation sequence $i_N(n) = y(n) - y(n)|_{H_{n-1,N}(y)}$. We have the following theorem.

Theorem 2 Under (H1-2), $y(n)$ is an order $M$ autoregressive process. Its innovation process is given by $i(n) = h(0)s(n)$.

Proof: Consider $\mathcal{T}_N(h)$, the Sylvester matrix associated to the filter $h(z)$. Under (H1), for $N = (M-1)$, $\mathcal{T}_N(h)$ is full rank and is thus left invertible, i.e. there exists a $(2M \times qM)$ matrix $K$ such that $K \mathcal{T}_{M-1}(h) = I$. Therefore, $H_{n,M-1}(y) = H_{n,2M-1}(s)$, so that $H_n(y) = H_n(s)$: the infinite pasts of the processes $y$ and $s$ coincide. Since $y(n) = h(0)s(n) + \sum_{k=1}^{M} h(k)s(n-k)$, $s(n)$ is a white noise, and $\sum_{k=1}^{M} h(k)s(n-k) = y(n)|_{H_{n-1}(y)}$, the innovation of $y(n)$ is given by $i(n) = h(0)s(n)$. As $H_{n,M-1}(y) = H_{n,2M-1}(s)$, it holds that $H_{n-1,M}(y) = H_{n-1,2M}(s)$. Thus, the variables $s(n-k)$, $1 \leq k \leq M$, belong to $H_{n-1,M}(y)$, and hence,

$$ \sum_{k=1}^{M} h(k)s(n-k) = y(n)|_{H_{n-1}(y)} \in H_{n-1,M}(y) $$

In some sense, $s(n)$ represents the normalized innovation of $y(n)$, and a $1 \times q$ polynomial filter $g(z)$ verifying $[g(z)]y(n) = s(n)$ can be seen as a prediction error filter (PEF). Such a filter can be easily computed by using generalized Yule-Walker equations, as shown below. The important point is that the polynomial transfer function $h(z)$ can be recovered from any PEF $g(z)$ by using $h(k) = E(y(n)([g(z)]y(n-k))^*)$.

The innovation $i(n)$ is computed by projecting $y(n)$ onto the space generated by the random variables $\{y_i(n-l) \,/\, i = 1, \ldots, q,\ l = 1, \ldots, P\}$ where $P \geq M$. Let $[A(1), \ldots, A(P)]$ be $q \times q$ matrices such that $y(n) + \sum_{k=1}^{P} A(k)y(n-k) = i(n)$:

$$ [A(1), \ldots, A(P)]\, \mathcal{R}_{P-1} = -[r(1), \ldots, r(P)] \qquad (7) $$
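As a numerical illustration of Theorem 2 and relation (7), the following Python/NumPy sketch builds the exact covariances of a noiseless MA process and checks that the innovation covariance is the rank-one matrix $h(0)h(0)^T$ for several prediction orders $P \geq M$. The 3-channel, order-2 filter `h_demo`, the helper names, and the choice of the minimum-norm solution of (7) through a pseudo-inverse with an explicit cutoff are assumptions made for this illustration only; they are not taken from the simulations of this paper.

```python
import numpy as np

def exact_lags(h, P):
    """Exact r(tau) = E{y(n+tau) y(n)^T}, tau = 0..P, for y(n) = [h(z)]s(n)
    with s(n) white and unit-variance. h has shape (M+1, q), h[k] = h(k)."""
    M, q = h.shape[0] - 1, h.shape[1]
    r = np.zeros((P + 1, q, q))
    for tau in range(min(P, M) + 1):
        for k in range(M - tau + 1):
            r[tau] += np.outer(h[k + tau], h[k])
    return r

def block_toeplitz(r, n):
    """n x n block matrix whose (k, j) block is r(j - k), with r(-m) = r(m)^T."""
    q = r.shape[1]
    R = np.zeros((n * q, n * q))
    for k in range(n):
        for j in range(n):
            R[k*q:(k+1)*q, j*q:(j+1)*q] = r[j - k] if j >= k else r[k - j].T
    return R

def innovation_covariance(r, P):
    """Solve (7) for [A(1), ..., A(P)] (minimum-norm solution, an assumption)
    and return D = r(0) + sum_k A(k) r(k)^T."""
    rhs = np.hstack(list(r[1:P + 1]))        # [r(1), ..., r(P)], size q x qP
    R = block_toeplitz(r, P)                 # covariance of P stacked samples
    A = -rhs @ np.linalg.pinv(R, 1e-10)      # explicit cutoff: R is singular
    return r[0] + A @ rhs.T

# Hypothetical channel with q = 3 and M = 2 (rows are h(0), h(1), h(2)).
h_demo = np.array([[1.0, 0.5, -0.3],
                   [0.4, -0.8, 0.6],
                   [-0.2, 0.3, 0.7]])
M = h_demo.shape[0] - 1
for P in (M, M + 1, M + 3):
    D = innovation_covariance(exact_lags(h_demo, P), P)
    # One eigenvalue equal to ||h(0)||^2, the others numerically zero.
    print(P, np.round(np.linalg.eigvalsh(D), 6))
```

With exact statistics the result does not depend on $P$ as long as $P \geq M$, which is the property exploited by the prediction approach when the order is over-estimated.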

Since the innovation is not full-rank, these coefficients are not uniquely defined. A particular set of coefficients may be obtained by:

$$ [A(1), \ldots, A(P)] = -[r(1), \ldots, r(P)]\, \mathcal{R}_{P-1}^{\#} $$

where $\mathcal{R}_{P-1}$ is the covariance matrix of $Y_{P-1}(n) = [y(n)^T, \ldots, y(n-P+1)^T]^T$ and the superscript $\#$ denotes the Moore-Penrose pseudo-inverse. According to theorem 2, the covariance of the innovation $D = r(0) + \sum_{k=1}^{P} A(k) r^T(k)$ is of rank one; the non-zero eigenvalue of $D$ is equal to $\lambda_d = \|h(0)\|^2$; the associated unit-norm eigenvector is $d = \pm h(0)/\|h(0)\|$ (the sign is not identifiable at the second order). Take $l = d/\sqrt{\lambda_d}$. It comes

$$ l^T i(n) = \pm s(n) $$

so that $g(z) = l^T [I + \sum_{k=1}^{P} A(k) z^{-k}]$ is, up to the sign, a prediction error filter.

Comments

• For the prediction method, when $P > M$, $h(M+1), \ldots, h(P)$ are equal to zero. Thus, when exact statistics are available, an exact identification of the filter coefficients is achieved under the sole hypothesis that the prediction order $P$ is greater than the true model order. In practice, when only sample estimates of the statistics are available, order estimation is necessary to perform the pseudo-inverse of the covariance matrix $\mathcal{R}_{P-1}$. However, a failure of the order determination procedure does not seriously affect the performance of the method. Heuristically, the reasons are twofold. First, let $P > M$ and let $u$ be a 'noise' eigenvector of the matrix $\mathcal{R}_{P-1}$: $\mathcal{R}_{P-1} u = 0$; from (7), it comes that $[r(1), \ldots, r(P)]u = 0$: the noise eigenvectors are orthogonal to $[r(1), \ldots, r(P)]^T$. This relation is exact for ensemble-averaged covariances; it is expected to be approximately verified for sample statistics. Next, the order determination method will fail when sample noise eigenvalues are close to some signal eigenvalues (in the case where there is a clear cut between the two sets, the correct model order can be inferred directly). Then, it is likely that the $\hat M - M$ dominant noise eigenvalues are bounded away from zero. These two reasons together imply that the inversion does not cause serious numerical trouble. All these claims are supported by the numerous numerical simulations we have conducted, as illustrated in section 4.

• The prediction method can be straightforwardly extended to deal with noisy signals. In this case, it suffices to estimate $\sigma^2$ and to deal with the noise-free covariance, obtained by subtracting the noise covariance $\sigma^2 I$ from the observed signal covariance.

A practical implementation. From the results of the previous section, and assuming the additive noise spatially and temporally white, we derive the following algorithm.

1. Given $T > P$ samples of the process $y$, $Y_T = [y(1), \ldots, y(T)]$, compute $(P+1)$ ($P$ is greater than $M$, the order of the filter) autocovariance coefficients according to

$$ \hat r(n) = [\hat r_{ab}(n)]_{1 \leq a,b \leq q} = \frac{1}{T-n} \sum_{t=1}^{T-n} y(t+n)\, y(t)^T, \qquad 0 \leq n \leq P $$

From these coefficients, a sample estimate $\hat{\mathcal{R}}_P$ of the $q(P+1) \times q(P+1)$ covariance matrix of the vectorized observation $Y_P(n)$ can be obtained using the following construction

$$ \hat{\mathcal{R}}_P = \begin{pmatrix} \hat r(0) & \hat r(1) & \cdots & \hat r(P) \\ \hat r(1)^T & \hat r(0) & \cdots & \hat r(P-1) \\ \vdots & & \ddots & \vdots \\ \hat r(P-1)^T & \cdots & \hat r(0) & \hat r(1) \\ \hat r(P)^T & \cdots & \hat r(1)^T & \hat r(0) \end{pmatrix} $$

2. From the eigen decomposition of $\hat{\mathcal{R}}_P$, estimate (i) the order $\hat M$ of the filter $h(z)$ by using, for example, a standard information criterion test, such as the MDL test (see [8] and the references therein), (ii) the noise variance $\hat\sigma^2$ as the average of the $N = q(P+1) - (\hat M + P + 1)$ smallest eigenvalues of $\hat{\mathcal{R}}_P$.

3. Compute the vectorial prediction error filter $\hat A$ as

$$ \hat A = [I, \hat A(1), \ldots, \hat A(\hat M)] $$
$$ [\hat A(1), \ldots, \hat A(\hat M)] = -[\hat r(1), \ldots, \hat r(\hat M)]\, (\hat{\mathcal{R}}_{\hat M - 1} - \hat\sigma^2 I)^{\#} $$

4. Estimate the filter coefficient $l = h(0)/\|h(0)\|^2$ from the eigen-decomposition of the estimated covariance matrix $\hat D$:

$$ \hat D = [\hat A(1), \ldots, \hat A(\hat M)]\, [\hat r(1), \ldots, \hat r(\hat M)]^T + \hat r(0) - \hat\sigma^2 I_q $$
$$ \hat D = P \beta P^T, \qquad P = [e_1, \ldots, e_q], \quad P^T P = I, $$
$$ \beta = \mathrm{diag}(\beta_1, \ldots, \beta_q), \qquad \beta_1 \geq \beta_2 \geq \ldots \geq \beta_q $$

so that, following $l = d/\sqrt{\lambda_d}$, $\hat l = e_1/\sqrt{\beta_1}$.

5. Compute the $1 \times q(\hat M + 1)$ prediction error filter $\hat g$ as:

$$ \hat g = \hat l^T \hat A $$

6. If $\mathcal{S}(\hat M)$ is the $q(\hat M + 1) \times q(\hat M + 1)$ matrix

$$ \mathcal{S}(\hat M) = \begin{pmatrix} \hat r(\hat M) & 0 & \cdots & 0 \\ \hat r(\hat M - 1) & \hat r(\hat M) & \ddots & \vdots \\ \vdots & & \ddots & 0 \\ \hat r(0) & \hat r(1) & \cdots & \hat r(\hat M) \end{pmatrix} $$

estimate the $q(\hat M + 1) \times 1$ filter coefficient vector $\hat h = [\hat h(\hat M)^T, \ldots, \hat h(0)^T]^T$ as: $\hat h = \mathcal{S}(\hat M)\, \hat g^T$

4. SIMULATION RESULTS

We present here some numerical simulations to assess the performance of our algorithm. We run simulations for a 4-variate ($q = 4$) MA model, with coefficients given by ($M = 5$):

$$ h_1(z) = -0.04 - 0.30z^{-1} - 1.28z^{-2} - 0.53z^{-3} + 0.14z^{-4} - 0.26z^{-5} $$
$$ h_2(z) = 0.91 - 0.20z^{-1} - 0.44z^{-2} - 1.02z^{-3} - 0.54z^{-4} - 0.08z^{-5} $$
$$ h_3(z) = -1.18 + 0.49z^{-1} - 0.31z^{-2} + 0.40z^{-3} + 0.13z^{-4} - 1.85z^{-5} $$
$$ h_4(z) = 1.30 + 0.05z^{-1} + 0.34z^{-2} - 0.03z^{-3} + 0.40z^{-4} + 0.88z^{-5} $$

The output observation noise is an i.i.d. sequence of zero-mean Gaussian variables and the input signal is an i.i.d. sequence of zero-mean, unit-variance Gaussian variables independent of the noise signal.

The number of samples $T$ is held constant ($T = 100$). For comparison, the algorithm of [4] was also simulated for the same channels. The criterion used in this section is the mean-square estimation error (MSE), defined as the square root of the sample average, over the Monte-Carlo simulations, of the total estimation variance. For each experiment, 300 independent Monte-Carlo simulations are performed.

In figure 1, the a priori estimated order is set to $P = 7$, the input signal is Gaussian and the noise variance is varied between -30 dB and 0 dB. The AIC model-order determination procedure is used for both methods. The figure shows the MSE in dB for the linear prediction method (respectively, the method of [4]) in solid line (respectively, in dashed line) as a function of the noise power. This shows that our method consistently provides more reliable estimates in the case of an unknown model order $M$.

In the second experiment, we consider a two-variate ($q = 2$) MA model of order $M = 2$. The noise variance is set to -10 dB, the number of samples $T$ is held constant ($T = 1000$) and the estimated order $P$ is varied between 2 (i.e. the exact order) and 10. To deal with a difficult order estimation context, the transfer functions are chosen as $h_1(z) = 1.1650 + 0.0751z^{-1} - 0.6965z^{-2}$ and $h_2(z) = 0.0352 - 0.0697z^{-1} + 0.1696z^{-2}$. The eigenvalues of the exact noise-free covariance matrix are given by (0, 0.0152, 0.0510, 1.0464, 1.8581, 2.6775).

Fig. 2 shows the MSE of the total estimation variance of $h = [h(0)^T, \ldots, h(P)^T]^T$ for the linear prediction method when the MDL procedure is used for order determination (dashed line) and for the linear prediction method when the exact order is a priori known (solid line), as a function of the over-estimated filter degree $P$. Table 1 shows the error rate (ER) of the order estimation. This shows the robustness of the method to errors in the estimation of the model order.

Table 1. Error rate on the estimated order by the MDL criterion

M  | 2    | 3    | 5    | 7    | 10
ER | 0.62 | 0.51 | 0.67 | 0.67 | 0.79

5. CONCLUSION

This contribution presents a linear prediction approach to the blind identification of multichannel FIR filters. This technique yields good estimates of the channel coefficients even when the number of coefficients is over-estimated. Numerical simulations have been performed to demonstrate the usefulness of the method and to support our theoretical claims.

REFERENCES

[1] E. Moulines, P. Duhamel, J. Cardoso, and S. Mayrargue, "Subspace methods for the blind identification of multichannel FIR filters." To appear in IEEE Trans. on Signal Proc., 1994.
[2] Y. Sato, "A method for self recovering equalization," IEEE Tr. on Comm., vol. 23, no. 6, pp. 679-682, 1975.
[3] W. Gardner, "A new method of channel identification," IEEE Tr. on Comm., vol. 39, pp. 813-817, August 1991.
[4] L. Tong, G. Xu, and T. Kailath, "A new approach to blind identification and equalization of multipath channels," in Proc. of the 25th Asilomar Conference, Pacific Grove, CA, pp. 856-860, 1991.
[5] H. Liu, G. Xu, and L. Tong, "A deterministic approach to blind identification of multichannel FIR systems," in Proc. Int. Conf. on Acoust. Speech and Sig. Proc., vol. 4, pp. 581-584, 1994.
[6] D. Slock, "Blind fractionally-spaced equalization, perfect-reconstruction filter-banks and multichannel linear prediction," in Proc. Int. Conf. on Acoust. Speech and Sig. Proc., vol. 4, pp. 585-588, 1994.
[7] T. Kailath, Linear systems. Prentice-Hall, 1980.
[8] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Tr. on Sig. Proc., vol. 33, pp. 387-392, Apr. 1985.
