Toeplitz and Circulant Matrices: A Review

$$T_n = \begin{bmatrix}
t_0 & t_{-1} & t_{-2} & \cdots & t_{-(n-1)} \\
t_1 & t_0 & t_{-1} & & \\
t_2 & t_1 & t_0 & & \vdots \\
\vdots & & & \ddots & \\
t_{n-1} & & \cdots & & t_0
\end{bmatrix}$$
Robert M. Gray
Information Systems Laboratory
Department of Electrical Engineering
Stanford University
Stanford, California 94305
Revised March 2000
© Robert M. Gray, 1971, 1977, 1993, 1997, 1998, 2000.
The preparation of the original report was financed in part by the National Science Foundation and by the Joint Services Program at Stanford. Since then it has been done as a hobby.
Abstract
In this tutorial report the fundamental theorems on the asymptotic behavior of eigenvalues, inverses, and products of finite section Toeplitz matrices and Toeplitz matrices with absolutely summable elements are derived. Mathematical elegance and generality are sacrificed for conceptual simplicity and insight in the hope of making these results available to engineers lacking either the background or endurance to attack the mathematical literature on the subject. By limiting the generality of the matrices considered, the essential ideas and results can be conveyed in a more intuitive manner without the mathematical machinery required for the most general cases. As an application the results are applied to the study of the covariance matrices and their factors of linear models of discrete time random processes.
Acknowledgements
The author gratefully acknowledges the assistance of Ronald M. Aarts of the Philips Research Labs in correcting many typos and errors in the 1993 revision, Liu Mingyu for pointing out errors corrected in the 1998 revision, Paolo Tilli of the Scuola Normale Superiore of Pisa for pointing out an incorrect corollary and providing the correction, and David Neuhoff of the University of Michigan for pointing out several typographical errors and some confusing notation.
Contents

1 Introduction
2 The Asymptotic Behavior of Matrices
3 Circulant Matrices
4 Toeplitz Matrices
   4.1 Finite Order Toeplitz Matrices
   4.2 Toeplitz Matrices
   4.3 Toeplitz Determinants
5 Applications to Stochastic Time Series
Bibliography
Chapter 1
Introduction
A Toeplitz matrix is an $n \times n$ matrix $T_n = \{t_{k,j}\}$ where $t_{k,j} = t_{k-j}$, i.e., a matrix of the form

$$T_n = \begin{bmatrix}
t_0 & t_{-1} & t_{-2} & \cdots & t_{-(n-1)} \\
t_1 & t_0 & t_{-1} & & \\
t_2 & t_1 & t_0 & & \vdots \\
\vdots & & & \ddots & \\
t_{n-1} & & \cdots & & t_0
\end{bmatrix}. \qquad (1.1)$$
Examples of such matrices are covariance matrices of weakly stationary stochastic time series and matrix representations of linear time-invariant discrete time filters. There are numerous other applications in mathematics, physics, information theory, estimation theory, etc. A great deal is known about the behavior of such matrices, the most common and complete references being Grenander and Szegő [1] and Widom [2]. A more recent text devoted to the subject is Böttcher and Silbermann [15]. Unfortunately, however, the necessary level of mathematical sophistication for understanding reference [1] is frequently beyond that of one species of applied mathematician for whom the theory can be quite useful but is relatively little understood. This caste consists of engineers doing relatively mathematical (for an engineering background) work in any of the areas mentioned. This apparent dilemma provides the motivation for attempting a tutorial introduction on Toeplitz matrices that proves the essential theorems using the simplest possible and most intuitive mathematics. Some simple and fundamental methods that are deeply buried (at least to the untrained mathematician) in [1] are here made explicit.
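As a concrete illustration (our own sketch, not part of the original report), the defining rule $t_{k,j} = t_{k-j}$ of (1.1) is easy to check numerically; the sequence $t_k = 2^{-|k|}$ used below is an arbitrary absolutely summable choice.

```python
import numpy as np

# Build the n x n Toeplitz matrix of (1.1) from a two-sided sequence t_k.
# The example sequence t_k = 2^{-|k|} is a hypothetical, absolutely summable choice.
def toeplitz_matrix(t, n):
    """Return T_n = [t(k - j)] for k, j = 0, ..., n-1."""
    return np.array([[t(k - j) for j in range(n)] for k in range(n)])

t = lambda k: 2.0 ** (-abs(k))
Tn = toeplitz_matrix(t, 5)

# Every diagonal is constant: the (k, j) entry depends only on k - j.
assert all(Tn[k, j] == t(k - j) for k in range(5) for j in range(5))
```

If SciPy is available, `scipy.linalg.toeplitz` builds the same matrix from its first column and first row.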
In addition to the fundamental theorems, several related results that naturally follow but do not appear to be collected together anywhere are presented. The essential prerequisites for this report are a knowledge of matrix theory, an engineer's knowledge of Fourier series and random processes, calculus (Riemann integration), and hopefully a first course in analysis. Several of the occasional results required of analysis are usually contained in one or more courses in the usual engineering curriculum, e.g., the Cauchy-Schwarz and triangle inequalities. Hopefully the only unfamiliar results are a corollary to the Courant-Fischer theorem and the Weierstrass approximation theorem. The latter is an intuitive result which is easily believed even if not formally proved. More advanced results from Lebesgue integration, functional analysis, and Fourier series are not used.
The main approach of this report is to relate the properties of Toeplitz matrices to those of their simpler, more structured cousin, the circulant or cyclic matrix. These two matrices are shown to be asymptotically equivalent in a certain sense, and this is shown to imply that eigenvalues, inverses, products, and determinants behave similarly. This approach provides a simplified and direct path (in the author's point of view) to the basic eigenvalue distribution and related theorems. This method is implicit but not immediately apparent in the more complicated and more general results of Grenander in Chapter 7 of [1]. The basic results for the special case of a finite order Toeplitz matrix appeared in [16], a tutorial treatment of the simplest case which was in turn based on the first draft of this work. The results were subsequently generalized using essentially the same simple methods, but they remain less general than those of [1].
As an application, several of the results are applied to study certain models of discrete time random processes. Two common linear models are studied and some intuitively satisfying results on covariance matrices and their factors are given. As an example from Shannon information theory, the Toeplitz results regarding the limiting behavior of determinants are applied to find the differential entropy rate of a stationary Gaussian random process. We sacrifice mathematical elegance and generality for conceptual simplicity in the hope that this will bring an understanding of the interesting and useful properties of Toeplitz matrices to a wider audience, specifically to those who have lacked either the background or the patience to tackle the mathematical literature on the subject.
Chapter 2
The Asymptotic Behavior of
Matrices
In this chapter we begin with relevant definitions and a prerequisite theorem and proceed to a discussion of the asymptotic eigenvalue, product, and inverse behavior of sequences of matrices. The remaining chapters of this report will largely be applications of the tools and results of this chapter to the special cases of Toeplitz and circulant matrices.
The eigenvalues $\lambda_k$ and the eigenvectors ($n$-tuples) $x_k$ of an $n \times n$ matrix $M$ are the solutions to the equation

$$M x = \lambda x \qquad (2.1)$$

and hence the eigenvalues are the roots of the characteristic equation of $M$:

$$\det(M - \lambda I) = 0. \qquad (2.2)$$

A corollary of the Courant-Fischer theorem characterizes the extreme eigenvalues of a Hermitian matrix as extrema of the quadratic form:

Corollary 2.1 If $M$ is Hermitian with maximum and minimum eigenvalues $\eta_M$ and $\eta_m$, then

$$\eta_M = \max_{x^* x = 1} x^* M x, \qquad (2.3)$$

$$\eta_m = \min_{x^* x = 1} x^* M x. \qquad (2.5)$$

This corollary will be useful in specifying the interval containing the eigenvalues of a Hermitian matrix.
The following lemma is useful when studying non-Hermitian matrices
and products of Hermitian matrices. Its proof is given since it introduces
and manipulates some important concepts.
Lemma 2.1 Let $A$ be a matrix with eigenvalues $\alpha_k$. Define the eigenvalues of the Hermitian matrix $A^* A$ to be $\lambda_k$. Then

$$\sum_{k=0}^{n-1} \lambda_k \ge \sum_{k=0}^{n-1} |\alpha_k|^2, \qquad (2.6)$$

$$\sum_{k=0}^{n-1} (A^* A)_{k,k} = \sum_{k=0}^{n-1} \lambda_k. \qquad (2.7)$$

Proof.

By Schur's theorem there is a unitary matrix $U$ and an upper triangular matrix $R$, whose diagonal entries are the eigenvalues $\alpha_k$, such that

$$A = U R U^*. \qquad (2.8)$$

Since the trace is invariant under unitary similarity,

$$\sum_{k=0}^{n-1} \lambda_k = \mathrm{Tr}(A^* A) = \mathrm{Tr}(R^* R) = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} |r_{j,k}|^2 = \sum_{k=0}^{n-1} |\alpha_k|^2 + \sum_{k \ne j} |r_{j,k}|^2 \ge \sum_{k=0}^{n-1} |\alpha_k|^2. \qquad (2.9)$$
Equation (2.9) will hold with equality iff $R$ is diagonal and hence iff $A$ is normal. Lemma 2.1 is a direct consequence of Schur's theorem [3, pp. 229-231] and is also proved in [1, p. 106].
To study the asymptotic equivalence of matrices we require a metric, or equivalently a norm, of the appropriate kind. Two norms, the operator or strong norm and the Hilbert-Schmidt or weak norm, will be used here [1, pp. 102-103].

Let $A$ be a matrix with eigenvalues $\alpha_k$ and let $\lambda_k$ be the eigenvalues of the Hermitian matrix $A^* A$. The strong norm $\|A\|$ is defined by

$$\|A\| = \max_{x^* x = 1} \left[ x^* A^* A x \right]^{1/2}, \qquad (2.10)$$

so that, by Corollary 2.1,

$$\|A\|^2 = \max_{x^* x = 1} x^* A^* A x \qquad (2.11)$$

$$= \max_k \lambda_k = \lambda_M. \qquad (2.12)$$
If $A$ is itself Hermitian, then its eigenvalues $\alpha_k$ are real and the eigenvalues $\lambda_k$ of $A^* A$ are simply $\lambda_k = \alpha_k^2$. This follows since if $e^{(k)}$ is an eigenvector of $A$ with eigenvalue $\alpha_k$, then $A^* A e^{(k)} = \alpha_k A^* e^{(k)} = \alpha_k^2 e^{(k)}$. Thus, in particular, if $A$ is Hermitian then

$$\|A\| = \max_k |\alpha_k| = |\alpha_M|. \qquad (2.13)$$
The weak norm $|A|$ is defined by

$$|A| = \left[ n^{-1} \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} |a_{k,j}|^2 \right]^{1/2} = \left( n^{-1} \, \mathrm{Tr}[A^* A] \right)^{1/2} = \left[ n^{-1} \sum_{k=0}^{n-1} \lambda_k \right]^{1/2}. \qquad (2.14)$$

From Lemma 2.1 we therefore have

$$|A|^2 \ge n^{-1} \sum_{k=0}^{n-1} |\alpha_k|^2, \qquad (2.15)$$

with equality, i.e.,

$$n^{-1} \sum_{k=0}^{n-1} |\alpha_k|^2 = |A|^2, \qquad (2.16)$$

iff $A$ is normal.
Both norms satisfy the usual axioms of a norm: they are nonnegative and zero only for the zero matrix, they obey the triangle inequality

$$\|A + B\| \le \|A\| + \|B\|, \qquad |A + B| \le |A| + |B|, \qquad (2.17)$$

and they scale as $\|cA\| = |c| \, \|A\|$ and $|cA| = |c| \, |A|$. The triangle inequality in (2.17) will be used often, as is the following direct consequence:

$$\big| \, \|A\| - \|B\| \, \big| \le \|A - B\|, \qquad \big| \, |A| - |B| \, \big| \le |A - B|. \qquad (2.18)$$
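Both norms are straightforward to compute numerically. The sketch below (the helper names are ours, not the report's) evaluates each and checks the ordering $|A| \le \|A\|$ that follows from (2.12) and (2.14), since the weak norm averages the $\lambda_k$ while the strong norm takes their maximum.

```python
import numpy as np

def strong_norm(A):
    # ||A|| = largest singular value, i.e. (max eigenvalue of A* A)^{1/2}, as in (2.10)-(2.12)
    return np.linalg.norm(A, 2)

def weak_norm(A):
    # |A| = (n^{-1} Tr[A* A])^{1/2}, the normalized Hilbert-Schmidt norm of (2.14)
    n = A.shape[0]
    return np.sqrt(np.trace(A.conj().T @ A).real / n)

A = np.array([[1.0, 2.0], [0.0, 1.0]])
# Averaging the lambda_k can never exceed their maximum, so |A| <= ||A||.
assert weak_norm(A) <= strong_norm(A)
```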
The weak norm is usually the most useful and easiest to handle of the
two but the strong norm is handy in providing a bound for the product of
two matrices as shown in the next lemma.
Lemma 2.2 Given two $n \times n$ matrices $G = \{g_{k,j}\}$ and $H = \{h_{k,j}\}$, then

$$|GH| \le \|G\| \, |H|. \qquad (2.19)$$
Proof.

$$|GH|^2 = n^{-1} \sum_i \sum_j \Big| \sum_k g_{i,k} h_{k,j} \Big|^2 = n^{-1} \sum_j h_j^* G^* G h_j, \qquad (2.20)$$

where $*$ denotes conjugate transpose and $h_j$ is the $j$th column of $H$. From (2.10),

$$(h_j^* G^* G h_j) / (h_j^* h_j) \le \|G\|^2$$

and therefore

$$|GH|^2 \le n^{-1} \|G\|^2 \sum_j h_j^* h_j = \|G\|^2 |H|^2.$$
Lemma 2.2 is the matrix equivalent of 7.3a of [1, p. 103]. Note that the
lemma does not require that G or H be Hermitian.
We will be considering $n \times n$ matrices that approximate each other when $n$ is large. As might be expected, we will use the weak norm of the difference of two matrices as a measure of the distance between them. Two sequences of $n \times n$ matrices $A_n$ and $B_n$ are said to be asymptotically equivalent, written $A_n \sim B_n$, if

1. $A_n$ and $B_n$ are uniformly bounded in strong (and hence in weak) norm:

$$\|A_n\|, \ \|B_n\| \le M < \infty, \qquad (2.21)$$

and

2. $A_n - B_n = D_n$ goes to zero in weak norm as $n \to \infty$:

$$\lim_{n \to \infty} |A_n - B_n| = \lim_{n \to \infty} |D_n| = 0.$$

The main consequences of asymptotic equivalence are collected in the following theorem.

Theorem 2.1 Let $A_n$, $B_n$, $C_n$, and $D_n$ be sequences of matrices satisfying the indicated relations.

1. If $A_n \sim B_n$, then $\lim_{n \to \infty} |A_n| = \lim_{n \to \infty} |B_n|$. $\qquad (2.22)$

2. If $A_n \sim B_n$ and $B_n \sim C_n$, then $A_n \sim C_n$.

3. If $A_n \sim B_n$ and $C_n \sim D_n$, then $A_n C_n \sim B_n D_n$.

4. If $A_n \sim B_n$ and $\|A_n^{-1}\|, \|B_n^{-1}\| \le K < \infty$, then $A_n^{-1} \sim B_n^{-1}$.

5. If $A_n B_n \sim C_n$ and $\|A_n^{-1}\| \le K < \infty$, then $B_n \sim A_n^{-1} C_n$.
Proof.

1. Equation (2.22) follows directly from (2.17) and its consequence (2.18): $\big| \, |A_n| - |B_n| \, \big| \le |A_n - B_n| \to 0$.

2. $|A_n - C_n| \le |A_n - B_n| + |B_n - C_n| \to 0$.

3. Using Lemma 2.2,
$$|A_n C_n - B_n D_n| \le |A_n C_n - A_n D_n| + |A_n D_n - B_n D_n| \le \|A_n\| \, |C_n - D_n| + \|D_n\| \, |A_n - B_n| \mathop{\to}_{n \to \infty} 0.$$

4. $|A_n^{-1} - B_n^{-1}| = |B_n^{-1} B_n A_n^{-1} - B_n^{-1} A_n A_n^{-1}| \le \|B_n^{-1}\| \cdot \|A_n^{-1}\| \cdot |B_n - A_n| \mathop{\to}_{n \to \infty} 0$.

5. $|B_n - A_n^{-1} C_n| = |A_n^{-1} A_n B_n - A_n^{-1} C_n| \le \|A_n^{-1}\| \cdot |A_n B_n - C_n| \mathop{\to}_{n \to \infty} 0$.

The simplest application of asymptotic equivalence is to arithmetic averages of eigenvalues.

Lemma 2.3 Let $A_n$ and $B_n$ be asymptotically equivalent sequences of matrices with eigenvalues $\alpha_{n,k}$ and $\beta_{n,k}$, respectively. Then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \alpha_{n,k} = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \beta_{n,k}, \qquad (2.23)$$

in the sense that the difference of the two averages converges to zero.
Proof.

Let $D_n = \{d_{k,j}\} = A_n - B_n$. Since the sum of the eigenvalues of a matrix equals its trace, Eq. (2.23) is equivalent to

$$\lim_{n \to \infty} n^{-1} \mathrm{Tr}(D_n) = 0. \qquad (2.24)$$

By the Cauchy-Schwarz inequality,

$$\Big| n^{-1} \sum_{k=0}^{n-1} d_{k,k} \Big|^2 \le n^{-1} \sum_{k=0}^{n-1} |d_{k,k}|^2 \le n^{-1} \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} |d_{k,j}|^2 = |D_n|^2 \mathop{\to}_{n \to \infty} 0, \qquad (2.25)$$

which proves (2.24) and hence the lemma.

Since $A_n^2 \sim B_n^2$ by Theorem 2.1 and the eigenvalues of $A_n^2$ are $\alpha_{n,k}^2$, applying Lemma 2.3 to the squared matrices yields

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \alpha_{n,k}^2 = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \beta_{n,k}^2. \qquad (2.26)$$
Note that (2.23) and (2.26) relate limiting sample (arithmetic) averages of
eigenvalues or moments of an eigenvalue distribution rather than individual
eigenvalues. Equations (2.23) and (2.26) are special cases of the following
fundamental theorem of asymptotic eigenvalue distribution.
Theorem 2.2 Let $A_n$ and $B_n$ be asymptotically equivalent sequences of matrices with eigenvalues $\alpha_{n,k}$ and $\beta_{n,k}$, respectively. Assume that the eigenvalue moments of either matrix converge, e.g., that

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \alpha_{n,k}^s$$

exists and is finite for any positive integer $s$. Then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \alpha_{n,k}^s = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \beta_{n,k}^s. \qquad (2.27)$$
Proof.

Write $A_n^s - B_n^s$ as the telescoping sum $\sum_{j=0}^{s-1} A_n^j (A_n - B_n) B_n^{s-j-1}$, so that by Lemma 2.2 and the uniform bound (2.21)

$$A_n^s = B_n^s + \Delta_n, \qquad (2.28)$$

where

$$|\Delta_n| \le K |D_n| \mathop{\to}_{n \to \infty} 0 \qquad (2.29)$$

and $K$ does not depend on $n$. Thus $A_n^s \sim B_n^s$, and Lemma 2.3 applied to these sequences yields (2.27).
Theorem 2.2 is the fundamental theorem concerning asymptotic eigenvalue behavior. Most of the succeeding results on eigenvalues will be applications or specializations of (2.27). Since (2.27) holds for any positive integer $s$, we can add sums corresponding to different values of $s$ to each side of (2.27). This observation immediately yields the following corollary.
Corollary 2.2 Let $A_n$ and $B_n$ be asymptotically equivalent sequences of matrices with eigenvalues $\alpha_{n,k}$ and $\beta_{n,k}$, respectively, and let $f(x)$ be any polynomial. Then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} f(\alpha_{n,k}) = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} f(\beta_{n,k}). \qquad (2.30)$$
Corollary 2.2 can be strengthened via the following famous theorem.

Theorem 2.3 (Weierstrass) Let $F(x)$ be a continuous function on the interval $[a, b]$. There exists a sequence of polynomials $p_n(x)$ that converges to $F(x)$ uniformly on $[a, b]$.
Stated simply, any continuous function defined on a real interval can be approximated arbitrarily closely by a polynomial. Applying Theorem 2.3 to Corollary 2.2 immediately yields the following theorem:
Theorem 2.4 Let $A_n$ and $B_n$ be asymptotically equivalent sequences of Hermitian matrices with eigenvalues $\alpha_{n,k}$ and $\beta_{n,k}$, respectively. Since $A_n$ and $B_n$ are bounded there exist finite numbers $m$ and $M$ such that

$$m \le \alpha_{n,k}, \beta_{n,k} \le M, \qquad n = 1, 2, \ldots, \quad k = 0, 1, \ldots, n-1. \qquad (2.31)$$

Let $F(x)$ be any function continuous on $[m, M]$. Then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\alpha_{n,k}) = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\beta_{n,k}), \qquad (2.32)$$

where the limits exist together: if one exists, so does the other, and they are equal.
An immediate application is to determinants: if, in addition, the eigenvalues are bounded away from zero, say $\alpha_{n,k}, \beta_{n,k} \ge m > 0$, then the limiting normalized determinants agree.

Proof.

From Theorem 2.4 we have for $F(x) = \ln x$ (continuous on $[m, M]$ since $m > 0$)

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \ln \alpha_{n,k} = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \ln \beta_{n,k}$$

and hence

$$\lim_{n \to \infty} \exp\Big[ n^{-1} \ln \prod_{k=0}^{n-1} \alpha_{n,k} \Big] = \lim_{n \to \infty} \exp\Big[ n^{-1} \ln \prod_{k=0}^{n-1} \beta_{n,k} \Big]$$

or equivalently

$$\lim_{n \to \infty} \exp\big[ n^{-1} \ln \det A_n \big] = \lim_{n \to \infty} \exp\big[ n^{-1} \ln \det B_n \big],$$

that is,

$$\lim_{n \to \infty} \big( \det A_n \big)^{1/n} = \lim_{n \to \infty} \big( \det B_n \big)^{1/n}.$$
Chapter 3
Circulant Matrices
The properties of circulant matrices are well known and easily derived [3, p.
267],[19]. Since these matrices are used both to approximate and explain the
behavior of Toeplitz matrices, it is instructive to present one version of the
relevant derivations here.
A circulant matrix $C$ is one having the form

$$C = \begin{bmatrix}
c_0 & c_1 & c_2 & \cdots & c_{n-1} \\
c_{n-1} & c_0 & c_1 & & c_{n-2} \\
 & c_{n-1} & c_0 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & c_1 \\
c_1 & c_2 & \cdots & c_{n-1} & c_0
\end{bmatrix}, \qquad (3.1)$$
where each row is a cyclic shift of the row above it. The matrix $C$ is itself a special type of Toeplitz matrix. The eigenvalues $\psi_k$ and the eigenvectors $y^{(k)}$ of $C$ are the solutions of

$$C y = \psi y \qquad (3.2)$$

or, equivalently, of the $n$ difference equations

$$\sum_{k=0}^{m-1} c_{n-m+k} \, y_k + \sum_{k=m}^{n-1} c_{k-m} \, y_k = \psi y_m \, ; \quad m = 0, 1, \ldots, n-1. \qquad (3.3)$$

Changing the summation dummy variable results in

$$\sum_{k=0}^{n-m-1} c_k \, y_{k+m} + \sum_{k=n-m}^{n-1} c_k \, y_{k-(n-m)} = \psi y_m \, ; \quad m = 0, 1, \ldots, n-1. \qquad (3.4)$$
One can solve the difference equations (3.4) by guessing an intuitive solution and then proving that it works. Since the equations are linear with constant coefficients, a reasonable guess is $y_k = \rho^k$, in which case (3.4) becomes, after cancelling the common factor $\rho^m$,

$$\sum_{k=0}^{n-m-1} c_k \rho^k + \rho^{-n} \sum_{k=n-m}^{n-1} c_k \rho^k = \psi. \qquad (3.5)$$

Thus if we choose $\rho^{-n} = 1$, i.e., $\rho$ is one of the $n$ complex $n$th roots of unity, then we have an eigenvalue

$$\psi = \sum_{k=0}^{n-1} c_k \rho^k$$

with corresponding eigenvector

$$y = n^{-1/2} \left( 1, \rho, \rho^2, \ldots, \rho^{n-1} \right)', \qquad (3.6)$$

where the normalization is chosen to give the eigenvector unit energy. Choosing $\rho_j$ as the complex $n$th root of unity, $\rho_j = e^{-2\pi i j/n}$, we have eigenvalue

$$\psi_m = \sum_{k=0}^{n-1} c_k e^{-2\pi i mk/n} \qquad (3.7)$$

and eigenvector

$$y^{(m)} = n^{-1/2} \left( 1, e^{-2\pi i m/n}, \ldots, e^{-2\pi i m(n-1)/n} \right)'.$$

From (3.2) we can thus write

$$C = U \Psi U^*, \qquad (3.8)$$

where $U = [\, y^{(0)} \, | \, y^{(1)} \, | \cdots | \, y^{(n-1)} \,]$ has entries $n^{-1/2} e^{-2\pi i mk/n}$, $\Psi = \{\psi_k \delta_{k,j}\}$ is the diagonal matrix of eigenvalues, and $*$ denotes conjugate transpose.
To verify (3.8) we note that the $(k, j)$th element of $U \Psi U^*$, say $a_{k,j}$, is

$$a_{k,j} = n^{-1} \sum_{m=0}^{n-1} e^{-2\pi i m(k-j)/n} \psi_m = n^{-1} \sum_{m=0}^{n-1} e^{-2\pi i m(k-j)/n} \sum_{r=0}^{n-1} c_r e^{-2\pi i mr/n} = \sum_{r=0}^{n-1} c_r \Big( n^{-1} \sum_{m=0}^{n-1} e^{-2\pi i m(k-j+r)/n} \Big). \qquad (3.9)$$

But we have

$$\sum_{m=0}^{n-1} e^{-2\pi i m(k-j+r)/n} = \begin{cases} n, & k - j = -r \bmod n \\ 0, & \text{otherwise,} \end{cases}$$

so that $a_{k,j} = c_{-(k-j) \bmod n}$. Thus (3.8) and (3.1) are equivalent. Furthermore (3.9) shows that any matrix expressible in the form (3.8) is circulant.
Since C is unitarily similar to a diagonal matrix it is normal. Note that
all circulant matrices have the same set of eigenvectors. This leads to the
following properties.
Theorem 3.1 Let $C$ and $B$ be $n \times n$ circulant matrices with first rows $\{c_k\}$ and $\{b_k\}$ and hence, by (3.7), eigenvalues

$$\psi_m = \sum_{k=0}^{n-1} c_k e^{-2\pi i mk/n}, \qquad \beta_m = \sum_{k=0}^{n-1} b_k e^{-2\pi i mk/n},$$

respectively.

1. $C$ and $B$ commute and

$$CB = BC = U \Gamma U^*,$$

where $\Gamma = \{\psi_m \beta_m \delta_{k,m}\}$, and $CB$ is also a circulant matrix.

2. $C + B$ is a circulant matrix and $C + B = U (\Psi + \Phi) U^*$, where $\Phi = \{\beta_m \delta_{k,m}\}$.

3. If $\psi_m \ne 0$ for all $m$, then $C$ is nonsingular and $C^{-1} = U \Psi^{-1} U^*$.
Proof.

We have $C = U \Psi U^*$ and $B = U \Phi U^*$ where $\Psi$ and $\Phi$ are diagonal matrices with elements $\psi_m \delta_{k,m}$ and $\beta_m \delta_{k,m}$, respectively.

1. $CB = U \Psi U^* U \Phi U^* = U \Psi \Phi U^* = U \Phi \Psi U^* = BC$. Since $\Psi \Phi$ is diagonal, (3.9) implies that $CB$ is circulant.

2. $C + B = U (\Psi + \Phi) U^*$.

3. $C^{-1} = (U \Psi U^*)^{-1} = U \Psi^{-1} U^*$ if $\Psi$ is nonsingular.
Circulant matrices are an especially tractable class of matrices since inverses, products, and sums are also circulants and hence both straightforward to construct and normal. In addition, the eigenvalues of such matrices can easily be found exactly. In the next chapter we shall see that certain circulant matrices asymptotically approximate Toeplitz matrices, and hence from the results of Chapter 2, results similar to those in Theorem 3.1 will hold asymptotically for Toeplitz matrices.
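These properties are easy to verify numerically. The sketch below (our own construction, not the report's) builds the circulant of (3.1) and checks that its eigenvalues are exactly the DFT values $\psi_m = \sum_k c_k e^{-2\pi i mk/n}$ of (3.7), which is what `np.fft.fft` computes for the first row.

```python
import numpy as np

def circulant(c):
    """Circulant matrix of (3.1): row k holds the first row cyclically shifted k places."""
    n = len(c)
    return np.array([[c[(j - k) % n] for j in range(n)] for k in range(n)])

c = np.array([4.0, 1.0, 2.0, 1.0])
C = circulant(c)

psi = np.fft.fft(c)                 # psi_m = sum_k c_k e^{-2 pi i m k / n}, as in (3.7)
eig = np.linalg.eigvals(C)
# Compare as multisets since the eigenvalue ordering returned by eigvals is arbitrary.
assert np.allclose(np.sort(psi.real), np.sort(eig.real))
assert np.abs(psi.imag).max() < 1e-9   # this symmetric example has real eigenvalues
```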
Chapter 4
Toeplitz Matrices
In this chapter the asymptotic behavior of inverses, products, eigenvalues, and determinants of finite Toeplitz matrices is derived by constructing an asymptotically equivalent circulant matrix and applying the results of the previous chapters. Consider the infinite sequence $\{t_k \, ; \, k = 0, \pm 1, \pm 2, \ldots\}$ and define the finite ($n \times n$) Toeplitz matrix $T_n = \{t_{k-j}\}$ as in (1.1). Toeplitz matrices can be classified by the restrictions placed on the sequence $t_k$. If there exists a finite $m$ such that $t_k = 0$ for $|k| > m$, then $T_n$ is said to be a finite order Toeplitz matrix. If $t_k$ is an infinite sequence, then there are two common constraints. The most general is to assume that the $t_k$ are square summable, i.e., that

$$\sum_{k=-\infty}^{\infty} |t_k|^2 < \infty. \qquad (4.1)$$
Unfortunately this case requires mathematical machinery beyond that assumed in this paper, namely Lebesgue integration and a relatively advanced knowledge of Fourier series. We will make the stronger assumption that the $t_k$ are absolutely summable, i.e.,

$$\sum_{k=-\infty}^{\infty} |t_k| < \infty. \qquad (4.2)$$

This assumption greatly simplifies the mathematics but does not alter the fundamental concepts involved. As the main purpose here is tutorial and we wish chiefly to relay the flavor and an intuitive feel for the results, this paper will be confined to the absolutely summable case. The main advantage of (4.2) over (4.1) is that it ensures the existence and continuity of the Fourier
series $f(\lambda)$ defined by

$$f(\lambda) = \sum_{k=-\infty}^{\infty} t_k e^{ik\lambda} = \lim_{n \to \infty} \sum_{k=-n}^{n} t_k e^{ik\lambda}. \qquad (4.3)$$
Not only does the limit in (4.3) converge if (4.2) holds, it converges uniformly for all $\lambda$; that is, we have that

$$\Big| f(\lambda) - \sum_{k=-n}^{n} t_k e^{ik\lambda} \Big| = \Big| \sum_{k=-\infty}^{-n-1} t_k e^{ik\lambda} + \sum_{k=n+1}^{\infty} t_k e^{ik\lambda} \Big| \le \Big| \sum_{k=-\infty}^{-n-1} t_k e^{ik\lambda} \Big| + \Big| \sum_{k=n+1}^{\infty} t_k e^{ik\lambda} \Big| \le \sum_{k=-\infty}^{-n-1} |t_k| + \sum_{k=n+1}^{\infty} |t_k|,$$

where the right-hand side does not depend on $\lambda$ and goes to zero as $n \to \infty$ from (4.2). Thus given $\epsilon > 0$ there is a single $N$, not depending on $\lambda$, such that

$$\Big| f(\lambda) - \sum_{k=-n}^{n} t_k e^{ik\lambda} \Big| \le \epsilon, \quad \text{all } \lambda \in [0, 2\pi], \text{ if } n \ge N. \qquad (4.4)$$

Note also that (4.2) implies (4.1), since

$$\sum_{k=-\infty}^{\infty} |t_k|^2 \le \Big( \sum_{k=-\infty}^{\infty} |t_k| \Big)^2,$$

and that $f$ is bounded, since

$$|f(\lambda)| \le \sum_{k=-\infty}^{\infty} |t_k e^{ik\lambda}| = \sum_{k=-\infty}^{\infty} |t_k| < \infty.$$
Since $f(\lambda)$ is the Fourier series of the sequence $t_k$, we could alternatively begin with a bounded and hence Riemann integrable function $f(\lambda)$ on $[0, 2\pi]$ ($|f(\lambda)| \le M_{|f|} < \infty$ for all $\lambda$) and define the sequence of $n \times n$ Toeplitz matrices

$$T_n(f) = \Big\{ (2\pi)^{-1} \int_0^{2\pi} f(\lambda) e^{-i(k-j)\lambda} \, d\lambda \, ; \ k, j = 0, 1, \ldots, n-1 \Big\}. \qquad (4.5)$$

As before, the Toeplitz matrices will be Hermitian iff $f$ is real. The assumption that $f(\lambda)$ is Riemann integrable implies that $f(\lambda)$ is continuous except possibly at a countable number of points. Which assumption is made depends on whether one begins with a sequence $t_k$ or a function $f(\lambda)$; either assumption will be equivalent for our purposes since it is the Riemann integrability of $f(\lambda)$ that simplifies the bookkeeping in either case. Before finding a simple asymptotically equivalent matrix to $T_n$, we use Corollary 2.1 to find a bound on the eigenvalues of $T_n$ when it is Hermitian and an upper bound to the strong norm in the general case.
Lemma 4.1 Let $\tau_{n,k}$ be the eigenvalues of $T_n(f)$. If $T_n(f)$ is Hermitian ($f$ real) and $m_f$ and $M_f$ denote the infimum and supremum of $f(\lambda)$, then

$$m_f \le \tau_{n,k} \le M_f. \qquad (4.6)$$

Whether or not $f$ is real,

$$\|T_n(f)\| \le 2 M_{|f|}. \qquad (4.7)$$

Proof.

Property (4.6) follows from Corollary 2.1:

$$\max_k \tau_{n,k} = \max_x \, (x^* T_n x)/(x^* x), \qquad \min_k \tau_{n,k} = \min_x \, (x^* T_n x)/(x^* x), \qquad (4.8)$$
so that

$$x^* T_n x = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} \bar{x}_k \, t_{k-j} \, x_j = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} \Big[ (2\pi)^{-1} \int_0^{2\pi} f(\lambda) e^{-i(k-j)\lambda} \, d\lambda \Big] \bar{x}_k x_j = (2\pi)^{-1} \int_0^{2\pi} f(\lambda) \Big| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \Big|^2 d\lambda, \qquad (4.9)$$

and likewise, by Parseval's theorem,

$$x^* x = \sum_{k=0}^{n-1} |x_k|^2 = (2\pi)^{-1} \int_0^{2\pi} \Big| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \Big|^2 d\lambda. \qquad (4.10)$$

Combining the two, for any $x \ne 0$

$$m_f \le \frac{x^* T_n x}{x^* x} \le M_f, \qquad (4.11)$$

which with (4.8) yields (4.6). Alternatively, observe in (4.11) that if $e^{(k)}$ is the eigenvector associated with $\tau_{n,k}$, then the quadratic form with $x = e^{(k)}$ yields $x^* T_n x = \tau_{n,k} \sum_{j=0}^{n-1} |x_j|^2$. Thus (4.11) implies (4.6) directly.
We have already seen in (2.13) that if $T_n(f)$ is Hermitian, then $\|T_n(f)\| = \max_k |\tau_{n,k}| = |\tau_{n,M}|$, which we have just shown satisfies $|\tau_{n,M}| \le \max(|M_f|, |m_f|)$, which in turn must be less than or equal to $M_{|f|}$. This proves (4.7) for Hermitian matrices. Suppose that $T_n(f)$ is not Hermitian or, equivalently, that $f$ is not real. Any function $f$ can be written in terms of its real and imaginary parts, $f = f_r + i f_i$, where both $f_r$ and $f_i$ are real. In particular, $f_r = (f + \bar{f})/2$ and $f_i = (f - \bar{f})/2i$. Since the strong norm is a norm,

$$\|T_n(f)\| = \|T_n(f_r + i f_i)\| = \|T_n(f_r) + i T_n(f_i)\| \le \|T_n(f_r)\| + \|T_n(f_i)\| \le M_{|f_r|} + M_{|f_i|}.$$

Since $|(f \pm \bar{f})/2| \le (|f| + |\bar{f}|)/2 \le M_{|f|}$, we have $M_{|f_r|} + M_{|f_i|} \le 2 M_{|f|}$, proving (4.7).
Note for later use that the weak norm between Toeplitz matrices has a simpler form than (2.14). Let $T_n = \{t_{k-j}\}$ and $T_n' = \{t_{k-j}'\}$ be Toeplitz; then by collecting equal terms we have

$$|T_n - T_n'|^2 = n^{-1} \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} |t_{k-j} - t_{k-j}'|^2 = n^{-1} \sum_{k=-(n-1)}^{n-1} (n - |k|) \, |t_k - t_k'|^2 = \sum_{k=-(n-1)}^{n-1} (1 - |k|/n) \, |t_k - t_k'|^2. \qquad (4.12)$$
We are now ready to put all the pieces together to study the asymptotic behavior of $T_n$. If we can find an asymptotically equivalent circulant matrix then all of the results of Chapters 2 and 3 can be instantly applied. The main difference between the derivations for the finite and infinite order cases is the circulant matrix chosen. Hence to gain some feel for the matrix chosen we first consider the simpler finite order case where the answer is obvious, and then generalize in a natural way to the infinite order case.
4.1 Finite Order Toeplitz Matrices
Let $T_n$ be a sequence of finite order Toeplitz matrices of order $m$, i.e., $t_k = 0$ for $|k| > m$. For $n > 2m$ the matrix is banded:

$$T_n = \begin{bmatrix}
t_0 & t_{-1} & \cdots & t_{-m} & & & 0 \\
t_1 & t_0 & t_{-1} & & \ddots & & \\
\vdots & t_1 & t_0 & \ddots & & \ddots & \\
t_m & & \ddots & \ddots & & & t_{-m} \\
 & \ddots & & & & & \vdots \\
 & & \ddots & & t_1 & t_0 & t_{-1} \\
0 & & & t_m & \cdots & t_1 & t_0
\end{bmatrix}. \qquad (4.13)$$
We can make this matrix exactly into a circulant if we fill in the upper right and lower left corners with the appropriate entries. Define the circulant matrix $C_n$ in just this way, i.e., as the circulant (3.1) whose first row is

$$\big( t_0, t_{-1}, \ldots, t_{-m}, 0, \ldots, 0, t_m, \ldots, t_1 \big), \qquad (4.14)$$

so that $C_n$ consists of cyclic shifts of $(c_0^{(n)}, \ldots, c_{n-1}^{(n)})$, where

$$c_k^{(n)} = \begin{cases} t_{-k}, & k = 0, 1, \ldots, m \\ t_{n-k}, & k = n-m, \ldots, n-1 \\ 0, & \text{otherwise.} \end{cases} \qquad (4.15)$$
Lemma 4.2 The matrices $T_n$ and $C_n$ defined in (4.13) and (4.14) are asymptotically equivalent, i.e., both are bounded in the strong norm and

$$\lim_{n \to \infty} |T_n - C_n| = 0. \qquad (4.16)$$

Proof.

The $t_k$ are obviously absolutely summable, so both matrices are uniformly bounded in the strong norm. The matrices differ only in their corners, so that counting the corner entries diagonal by diagonal gives

$$|T_n - C_n|^2 = n^{-1} \sum_{k=0}^{m} k \big( |t_k|^2 + |t_{-k}|^2 \big) \le m n^{-1} \sum_{k=0}^{m} \big( |t_k|^2 + |t_{-k}|^2 \big) \mathop{\to}_{n \to \infty} 0.$$
The above lemma is almost trivial since the matrix $T_n - C_n$ has fewer than $2m^2$ non-zero entries, and hence the $n^{-1}$ in the weak norm drives $|T_n - C_n|$ to zero.
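A quick numerical check of Lemma 4.2 (the helper names and the example sequence are ours, not the report's): the difference between a banded Toeplitz matrix and its corner-filled circulant has a fixed number of nonzero entries, so the weak norm of (2.14) decays like $n^{-1/2}$.

```python
import numpy as np

def weak_norm(A):
    return np.sqrt((np.abs(A) ** 2).sum() / A.shape[0])

def banded_toeplitz(t, m, n):
    # T_n of (4.13): entries t_{k-j}, zero outside the band |k - j| > m
    return np.array([[t[k - j] if abs(k - j) <= m else 0.0 for j in range(n)]
                     for k in range(n)])

def corner_filled_circulant(t, m, n):
    # C_n of (4.14)-(4.15): first row c_k = t_{-k} for k <= m, t_{n-k} for k >= n - m
    c = np.zeros(n)
    for k in range(m + 1):
        c[k] = t[-k]
    for k in range(n - m, n):
        c[k] = t[n - k]
    return np.array([[c[(j - k) % n] for j in range(n)] for k in range(n)])

t = {-1: 0.5, 0: 1.0, 1: 0.5}        # a first order (m = 1) example sequence
norms = [weak_norm(banded_toeplitz(t, 1, n) - corner_filled_circulant(t, 1, n))
         for n in (8, 32, 128)]
assert norms[0] > norms[1] > norms[2]     # |T_n - C_n| -> 0 as n grows
```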
From Lemma 4.2 and Theorem 2.2 we have the following lemma.

Lemma 4.3 Let $T_n$ and $C_n$ be as in (4.13) and (4.14) and let their eigenvalues be $\tau_{n,k}$ and $\psi_{n,k}$, respectively. Then for any positive integer $s$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \tau_{n,k}^s = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \psi_{n,k}^s. \qquad (4.17)$$

In fact, for sufficiently large $n$

$$\Big| n^{-1} \sum_{k=0}^{n-1} \tau_{n,k}^s - n^{-1} \sum_{k=0}^{n-1} \psi_{n,k}^s \Big| \le K n^{-1/2}, \qquad (4.18)$$

where $K$ does not depend on $n$.

Proof.

Equation (4.17) is direct from Lemma 4.2 and Theorem 2.2. Equation (4.18) follows from Lemma 2.3 and Lemma 4.2.

Lemma 4.3 is of interest in that for finite order Toeplitz matrices one can find the rate of convergence of the eigenvalue moments. It can be shown that $K \le s M_f^{s-1} K'$ for a constant $K'$ determined by the sequence $t_k$.
The above two lemmas show that we can immediately apply the results of Chapter 2 to $T_n$ and $C_n$. Although Theorem 2.1 gives us immediate hope of fruitfully studying inverses and products of Toeplitz matrices, it is not yet clear that we have simplified the study of the eigenvalues. The next lemma clarifies that we have indeed found a useful approximation.
Lemma 4.4 Let $C_n$ be the circulant matrix of (4.14) with eigenvalues $\psi_{n,k}$. Then for any positive integer $s$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \psi_{n,k}^s = (2\pi)^{-1} \int_0^{2\pi} f^s(\lambda) \, d\lambda. \qquad (4.19)$$

If $T_n(f)$ and hence $C_n(f)$ are Hermitian, then for any function $F(x)$ continuous on $[m_f, M_f]$ we have

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\psi_{n,k}) = (2\pi)^{-1} \int_0^{2\pi} F[f(\lambda)] \, d\lambda. \qquad (4.20)$$

In addition, the eigenvalues are given exactly by

$$\psi_{n,j} = \sum_{k=0}^{m} t_{-k} e^{-2\pi i jk/n} + \sum_{k=n-m}^{n-1} t_{n-k} e^{-2\pi i jk/n} = f(2\pi j/n). \qquad (4.21)$$

Proof.

From Chapter 3 we have exactly

$$\psi_{n,j} = \sum_{k=0}^{n-1} c_k^{(n)} e^{-2\pi i jk/n} = \sum_{k=0}^{m} t_{-k} e^{-2\pi i jk/n} + \sum_{k=n-m}^{n-1} t_{n-k} e^{-2\pi i jk/n} = \sum_{k=-m}^{m} t_k e^{2\pi i jk/n} = f(2\pi j/n),$$

where the third form follows by collecting terms ($r = -k$ in the first sum, $r = k - n$ in the second), which is (4.21). Thus the eigenvalues of $C_n$ are simply the values of $f(\lambda)$ with $\lambda$ uniformly spaced between $0$ and $2\pi$. Defining $2\pi k/n = \lambda_k$ and $2\pi/n = \Delta\lambda$, we have

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \psi_{n,k}^s = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} f(2\pi k/n)^s = \lim_{n \to \infty} (2\pi)^{-1} \sum_{k=0}^{n-1} f(\lambda_k)^s \, \Delta\lambda = (2\pi)^{-1} \int_0^{2\pi} f(\lambda)^s \, d\lambda, \qquad (4.22)$$

where the continuity of $f$ guarantees the existence of the limiting Riemann integral. This proves (4.19), and (4.20) then follows from (4.19) and the Weierstrass approximation theorem (Theorem 2.3) exactly as in the proof of Theorem 2.4.

Combining Lemmas 4.2-4.4 with the results of Chapter 2 yields the basic eigenvalue distribution theorem for the finite order case.

Theorem 4.1 Let $T_n$ be a sequence of Hermitian finite order Toeplitz matrices as in (4.13) with eigenvalues $\tau_{n,k}$. Then for any positive integer $s$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \tau_{n,k}^s = (2\pi)^{-1} \int_0^{2\pi} f(\lambda)^s \, d\lambda, \qquad (4.23)$$

and for any function $F(x)$ continuous on $[m_f, M_f]$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\tau_{n,k}) = (2\pi)^{-1} \int_0^{2\pi} F[f(\lambda)] \, d\lambda; \qquad (4.24)$$

i.e., the sequences $\tau_{n,k}$ and $f(2\pi k/n)$ are asymptotically equally distributed.
This behavior should seem reasonable since the equations $T_n x = \tau x$ and $C_n x = \psi x$, for $n > 2m + 1$, are essentially the same $n$th order difference equation with different boundary conditions. It is in fact the nice boundary conditions that make $\psi$ easy to find exactly, while exact solutions for $\tau$ are usually intractable.
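The equal-distribution conclusion can be seen numerically. In this sketch (our own example, not from the report) $f(\lambda) = 1 + \cos\lambda$, so $t_0 = 1$ and $t_{\pm 1} = 1/2$, and the second moment of the eigenvalues of the tridiagonal $T_n$ approaches $(2\pi)^{-1} \int_0^{2\pi} f(\lambda)^2 \, d\lambda = 3/2$.

```python
import numpy as np

def tridiagonal_toeplitz(n):
    # T_n for f(lambda) = 1 + cos(lambda): t_0 = 1 on the diagonal, t_{+-1} = 0.5 beside it
    T = np.diag(np.full(n, 1.0))
    idx = np.arange(n - 1)
    T[idx, idx + 1] = 0.5
    T[idx + 1, idx] = 0.5
    return T

n, s = 512, 2
tau = np.linalg.eigvalsh(tridiagonal_toeplitz(n))
moment = (tau ** s).mean()
# (2 pi)^{-1} * integral of (1 + cos(lambda))^2 over [0, 2 pi] is exactly 1.5
assert abs(moment - 1.5) < 0.01
```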
With the eigenvalue problem in hand we could next write down theorems on inverses and products of Toeplitz matrices using Lemma 4.2 and the results of Chapters 2-3. Since these theorems are identical in statement and proof with the infinite order absolutely summable Toeplitz case, we defer these theorems momentarily and generalize Theorem 4.1 to more general Toeplitz matrices with no assumption of finite order.
4.2 Toeplitz Matrices

Now drop the finite order assumption and let $T_n(f)$ be generated by an absolutely summable sequence $t_k$, so that

$$f(\lambda) = \sum_{k=-\infty}^{\infty} t_k e^{ik\lambda}, \qquad t_k = (2\pi)^{-1} \int_0^{2\pi} f(\lambda) e^{-ik\lambda} \, d\lambda. \qquad (4.25)$$
In analogy with (4.14)-(4.15), define the circulant matrix $C_n(f)$ as the circulant whose first row entries are

$$c_k^{(n)} = n^{-1} \sum_{j=0}^{n-1} f(2\pi j/n) e^{2\pi i jk/n}. \qquad (4.26)$$

Since $f(\lambda)$ is Riemann integrable, we have that for each fixed $k$

$$\lim_{n \to \infty} c_k^{(n)} = \lim_{n \to \infty} n^{-1} \sum_{j=0}^{n-1} f(2\pi j/n) e^{2\pi i jk/n} = (2\pi)^{-1} \int_0^{2\pi} f(\lambda) e^{ik\lambda} \, d\lambda = t_{-k}, \qquad (4.27)$$

and hence the $c_k^{(n)}$ are simply the Riemann sum approximations to the integrals giving the $t_{-k}$, consistent with (4.15). Equations (4.26), (3.7), and (3.9) show that the eigenvalues $\psi_{n,m}$ of $C_n$ are simply $f(2\pi m/n)$; that is, from (3.7) and (3.9)
$$\psi_{n,m} = \sum_{k=0}^{n-1} c_k^{(n)} e^{-2\pi i mk/n} = \sum_{k=0}^{n-1} \Big[ n^{-1} \sum_{j=0}^{n-1} f(2\pi j/n) e^{2\pi i jk/n} \Big] e^{-2\pi i mk/n} = \sum_{j=0}^{n-1} f(2\pi j/n) \Big[ n^{-1} \sum_{k=0}^{n-1} e^{2\pi i k(j-m)/n} \Big] = f(2\pi m/n), \qquad (4.28)$$

since the bracketed sum is $1$ for $j = m$ and $0$ otherwise. Thus $C_n(f)$ has the useful property (4.21) of the circulant approximation (4.15) used in the finite case. As a result, the conclusions of Lemma 4.4 hold for the more general case with $C_n(f)$ constructed as in (4.26). Equation (4.28) in turn defines $C_n(f)$ since, if we are told that $C_n$ is a circulant matrix with eigenvalues $f(2\pi m/n)$, $m = 0, 1, \ldots, n-1$, then from (3.9)

$$c_k^{(n)} = n^{-1} \sum_{m=0}^{n-1} \psi_{n,m} e^{2\pi i mk/n} = n^{-1} \sum_{m=0}^{n-1} f(2\pi m/n) e^{2\pi i mk/n}, \qquad (4.29)$$

as in (4.26).
The basic structural properties of $C_n(f)$ are collected in the following lemma.

Lemma 4.5

1. Given a function $f$ of (4.25) and the circulant matrix $C_n(f)$ defined by (4.26),

$$c_k^{(n)} = \sum_{m=-\infty}^{\infty} t_{-k+mn}, \qquad k = 0, 1, \ldots, n-1; \qquad (4.30)$$

that is, the first row of $C_n(f)$ is obtained by aliasing the sequence $t_k$ modulo $n$.

2. An analogous (finite) aliasing formula, (4.33) below, holds for the circulant matrix built from the truncated series (4.31)-(4.32).

Proof.

1. From (4.25) evaluated at $\lambda = 2\pi j/n$,

$$f(2\pi j/n) = \sum_{m=-\infty}^{\infty} t_m e^{2\pi i jm/n} = \sum_{l=0}^{n-1} \Big( \sum_{m=-\infty}^{\infty} t_{l+mn} \Big) e^{2\pi i jl/n},$$

where the second form groups the summation index modulo $n$ (permissible by absolute summability). Substituting this into (4.26),

$$c_k^{(n)} = n^{-1} \sum_{j=0}^{n-1} f(2\pi j/n) e^{2\pi i jk/n} = \sum_{l=0}^{n-1} \sum_{m=-\infty}^{\infty} t_{l+mn} \Big[ n^{-1} \sum_{j=0}^{n-1} e^{2\pi i j(l+k)/n} \Big] = \sum_{m=-\infty}^{\infty} t_{-k+mn},$$

since the bracketed sum is $1$ when $l = (n-k) \bmod n$ and $0$ otherwise. This is (4.30).

2. Instead of $f$, next use the truncated series

$$f_n(\lambda) = \sum_{m=-n}^{n} t_m e^{im\lambda} \qquad (4.31)$$

and define the circulant matrix

$$C_n = C_n(f_n) \qquad (4.32)$$

in analogy to (4.26).
Then, exactly as above,

$$c_k^{(n)} = n^{-1} \sum_{j=0}^{n-1} f_n(2\pi j/n) e^{2\pi i jk/n} = \sum_{m=-n}^{n} t_m \Big[ n^{-1} \sum_{j=0}^{n-1} e^{2\pi i j(k+m)/n} \Big] = \sum_{\substack{-n \le m \le n \\ m \equiv -k \bmod n}} t_m, \qquad (4.33)$$

so that for $0 < k < n$ the first row of $C_n$ is $c_k^{(n)} = t_{-k} + t_{n-k}$; only finitely many aliases survive.
Lemma 4.6 Let $T_n(f)$ be a sequence of Toeplitz matrices generated by an absolutely summable sequence,

$$\sum_{k=-\infty}^{\infty} |t_k| < \infty, \qquad f(\lambda) = \sum_{k=-\infty}^{\infty} t_k e^{ik\lambda}.$$

Define the circulant matrices $C_n(f)$ and $C_n = C_n(f_n)$ as in (4.26) and (4.31)-(4.32). Then

$$C_n(f) \sim C_n \sim T_n. \qquad (4.34)$$
Proof.

Since both $C_n(f)$ and $C_n$ are circulant matrices with the same eigenvectors (Theorem 3.1), we have from part 2 of Theorem 3.1 and (2.14) and the comment following it that

$$|C_n(f) - C_n|^2 = n^{-1} \sum_{k=0}^{n-1} |f(2\pi k/n) - f_n(2\pi k/n)|^2.$$

Recall from (4.4) and the related discussion that $f_n(\lambda)$ converges uniformly to $f(\lambda)$, and hence given $\epsilon > 0$ there is an $N$ such that for $n \ge N$ we have for all $k, n$ that

$$|f(2\pi k/n) - f_n(2\pi k/n)|^2 \le \epsilon$$

and hence for $n \ge N$

$$|C_n(f) - C_n|^2 \le n^{-1} \sum_{i=0}^{n-1} \epsilon = \epsilon.$$

Since $\epsilon$ is arbitrary,

$$\lim_{n \to \infty} |C_n(f) - C_n| = 0,$$

proving that

$$C_n(f) \sim C_n. \qquad (4.35)$$
Next, $C_n$ is compared with $T_n$ using the Toeplitz form (4.12) of the weak norm, with the aliased coefficients of (4.33) in the role of $t_k'$:

$$|T_n - C_n|^2 = \sum_{k=-(n-1)}^{n-1} (1 - |k|/n) \, |t_k - t_k'|^2,$$

where by (4.33) the diagonals of $C_n$ carry

$$c_{|k|}^{(n)} = t_{-|k|} + t_{n-|k|}, \qquad c_{n-k}^{(n)} = t_{-(n-k)} + t_k, \qquad k \ge 0, \qquad (4.36)$$

so that the difference along each diagonal is an aliased tail term. Collecting terms,

$$|T_n - C_n|^2 \le |t_n + t_{-n}|^2 + \sum_{k=1}^{n-1} (1 - k/n) \big( |t_{n-k}|^2 + |t_{-(n-k)}|^2 \big) = |t_n + t_{-n}|^2 + \sum_{k=1}^{n-1} (k/n) \big( |t_k|^2 + |t_{-k}|^2 \big).$$

Since the $t_k$ are absolutely and hence square summable, given $\epsilon > 0$ we can choose $N$ so large that

$$\sum_{k=N}^{\infty} \big( |t_k|^2 + |t_{-k}|^2 \big) \le \epsilon,$$

and hence (the isolated term $|t_n + t_{-n}|^2$ vanishing as $n \to \infty$ by summability)

$$\limsup_{n \to \infty} |T_n - C_n|^2 \le \limsup_{n \to \infty} \Big[ n^{-1} \sum_{k=1}^{N-1} k \big( |t_k|^2 + |t_{-k}|^2 \big) + \sum_{k=N}^{n-1} \big( |t_k|^2 + |t_{-k}|^2 \big) \Big] \le \epsilon.$$

Since $\epsilon$ is arbitrary,

$$\lim_{n \to \infty} |T_n - C_n| = 0$$

and hence

$$T_n \sim C_n, \qquad (4.37)$$

which together with (4.35) and Theorem 2.1 proves (4.34).
From Lemma 4.6 and the results of Chapter 2, Theorem 4.1 extends immediately to the infinite order case.

Theorem 4.2 Let $T_n(f)$ be a sequence of Toeplitz matrices generated by an absolutely summable sequence $t_k$ with eigenvalues $\tau_{n,k}$. Then for any positive integer $s$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \tau_{n,k}^s = (2\pi)^{-1} \int_0^{2\pi} f(\lambda)^s \, d\lambda. \qquad (4.38)$$

If in addition $f$ is real, so that the $T_n(f)$ are Hermitian, then for any function $F(x)$ continuous on $[m_f, M_f]$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\tau_{n,k}) = (2\pi)^{-1} \int_0^{2\pi} F[f(\lambda)] \, d\lambda. \qquad (4.39)$$
Theorem 4.2 yields the limiting distribution of the eigenvalues.

Corollary 4.1 Under the assumptions of Theorem 4.2 with $f$ real, let $D_n(x)$ denote the fraction of the eigenvalues $\tau_{n,k}$ that are at most $x$, and suppose that

$$\int_{\lambda : f(\lambda) = x} d\lambda = 0.$$

Then the limiting distribution $D(x) = \lim_{n \to \infty} D_n(x)$ exists and is given by

$$D(x) = (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda.$$

The technical condition of a zero integral over the region of the set of $\lambda$ for which $f(\lambda) = x$ is needed to ensure that $x$ is a point of continuity of the limiting distribution.
Proof.

Define the characteristic function

$$1_x(\alpha) = \begin{cases} 1, & m_f \le \alpha \le x \\ 0, & \text{otherwise.} \end{cases}$$

We have

$$D(x) = \lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} 1_x(\tau_{n,k}).$$
Unfortunately, $1_x(\alpha)$ is not a continuous function and hence Theorem 4.2 cannot be immediately applied. To get around this problem we mimic Grenander and Szegő, p. 115, and define two continuous functions that provide upper and lower bounds to $1_x$ and converge to it in the limit. Define

$$1_x^+(\alpha) = \begin{cases} 1, & \alpha \le x \\ 1 - \dfrac{\alpha - x}{\epsilon}, & x < \alpha \le x + \epsilon \\ 0, & x + \epsilon < \alpha \end{cases}$$

$$1_x^-(\alpha) = \begin{cases} 1, & \alpha \le x - \epsilon \\ 1 - \dfrac{\alpha - (x - \epsilon)}{\epsilon}, & x - \epsilon < \alpha \le x \\ 0, & x < \alpha. \end{cases}$$

The idea here is that the upper bound has an output of 1 everywhere $1_x$ does, but then it drops in a continuous linear fashion to zero at $x + \epsilon$ instead of immediately at $x$. The lower bound has a 0 everywhere $1_x$ does and it rises linearly from $x$ to $x - \epsilon$ to the value of 1 instead of instantaneously as does $1_x$. Clearly

$$1_x^-(\alpha) < 1_x(\alpha) < 1_x^+(\alpha) \quad \text{for all } \alpha.$$

Since both $1_x^+$ and $1_x^-$ are continuous, Theorem 4.2 can be used to conclude that

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} 1_x^+(\tau_{n,k}) = (2\pi)^{-1} \int 1_x^+(f(\lambda)) \, d\lambda = (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda + (2\pi)^{-1} \int_{x < f(\lambda) \le x + \epsilon} \Big( 1 - \frac{f(\lambda) - x}{\epsilon} \Big) d\lambda \le (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda + (2\pi)^{-1} \int_{x < f(\lambda) \le x + \epsilon} d\lambda$$

and

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} 1_x^-(\tau_{n,k}) = (2\pi)^{-1} \int 1_x^-(f(\lambda)) \, d\lambda = (2\pi)^{-1} \int_{f(\lambda) \le x - \epsilon} d\lambda + (2\pi)^{-1} \int_{x - \epsilon < f(\lambda) \le x} \Big( 1 - \frac{f(\lambda) - (x - \epsilon)}{\epsilon} \Big) d\lambda \ge (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda - (2\pi)^{-1} \int_{x - \epsilon < f(\lambda) \le x} d\lambda.$$
These inequalities imply that for any $\epsilon > 0$, as $n$ grows the sample average $n^{-1} \sum_{k=0}^{n-1} 1_x(\tau_{n,k})$ will be sandwiched between

$$(2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda + (2\pi)^{-1} \int_{x < f(\lambda) \le x + \epsilon} d\lambda$$

and

$$(2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda - (2\pi)^{-1} \int_{x - \epsilon < f(\lambda) \le x} d\lambda.$$

Since $\epsilon$ can be made arbitrarily small, this means the sum will in the limit be sandwiched between

$$(2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda \quad \text{and} \quad (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda - (2\pi)^{-1} \int_{f(\lambda) = x} d\lambda.$$

Thus if

$$\int_{f(\lambda) = x} d\lambda = 0,$$

then

$$D(x) = (2\pi)^{-1} \int_0^{2\pi} 1_x[f(\lambda)] \, d\lambda = (2\pi)^{-1} \int_{f(\lambda) \le x} d\lambda.$$
Corollary 4.2 Under the assumptions of Theorem 4.2 with $f$ real,

$$\lim_{n \to \infty} \min_k \tau_{n,k} = m_f, \qquad \lim_{n \to \infty} \max_k \tau_{n,k} = M_f.$$

Proof.

From Corollary 4.1 we have for any $\epsilon > 0$

$$D(m_f + \epsilon) = (2\pi)^{-1} \int_{f(\lambda) \le m_f + \epsilon} d\lambda > 0,$$

since $f$ is continuous; thus eventually a nonvanishing fraction of the eigenvalues falls below $m_f + \epsilon$, which combined with the lower bound of Lemma 4.1 proves the first limit, and the second follows symmetrically.

We can now state the main result on inverses of Toeplitz matrices.

Theorem 4.3 Let $T_n(f)$ be a sequence of Hermitian Toeplitz matrices generated by an absolutely summable sequence $t_k$ with real $f$, and let $\rho_{n,k}$ denote the eigenvalues of $T_n(f)^{-1}$ when the inverse exists.

1. If $f(\lambda) > 0$ except possibly at a finite number of points, then $T_n(f)$ is nonsingular for every $n$.

2. If $f(\lambda) \ge m_f > 0$, then

$$T_n(f)^{-1} \sim C_n(f)^{-1}, \qquad (4.40)$$

and, writing $D_n = C_n(f) - T_n(f)$,

$$T_n(f)^{-1} = [C_n(f) - D_n]^{-1} = C_n(f)^{-1} \big[ I + D_n C_n(f)^{-1} + \big( D_n C_n(f)^{-1} \big)^2 + \cdots \big], \qquad (4.41)$$

and the expansion converges (in weak norm) for sufficiently large $n$.

3. If $f(\lambda) \ge m_f > 0$, then

$$T_n(f)^{-1} \sim T_n(1/f) = \Big\{ (2\pi)^{-1} \int_0^{2\pi} e^{-i(k-j)\lambda} / f(\lambda) \, d\lambda \Big\}; \qquad (4.42)$$

that is, if the spectrum is strictly positive then the inverse of a Toeplitz matrix is asymptotically Toeplitz. Furthermore, if $F(x)$ is any continuous function on $[1/M_f, 1/m_f]$, then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\rho_{n,k}) = (2\pi)^{-1} \int_0^{2\pi} F[1/f(\lambda)] \, d\lambda. \qquad (4.43)$$

4. If $m_f = 0$, i.e., $f$ has zeros, then $\|T_n(f)^{-1}\|$ is unbounded and $1/f$ need not be integrable; nevertheless, for any finite $\theta$ and any $F(x)$ continuous on $[1/M_f, \theta]$,

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F[\min(\rho_{n,k}, \theta)] = (2\pi)^{-1} \int_0^{2\pi} F[\min(1/f(\lambda), \theta)] \, d\lambda. \qquad (4.44)$$
Proof.

1. Since $f(\lambda) > 0$ except possibly at a finite number of points, we have from (4.9) that for any $x \ne 0$

$$x^* T_n x = (2\pi)^{-1} \int_0^{2\pi} f(\lambda) \Big| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \Big|^2 d\lambda > 0,$$

so that for all $n$

$$\min_k \tau_{n,k} > 0$$

and hence

$$\det T_n = \prod_{k=0}^{n-1} \tau_{n,k} \ne 0,$$

so that $T_n(f)$ is nonsingular.

2. From Lemma 4.6, $T_n \sim C_n$, and hence (4.40) follows from Theorem 2.1 since $f(\lambda) \ge m_f > 0$ ensures that

$$\|T_n^{-1}\|, \ \|C_n^{-1}\| \le 1/m_f < \infty. \qquad (4.45)$$

The Neumann series in (4.41) converges in weak norm provided $|D_n C_n^{-1}| < 1$; since

$$|D_n C_n^{-1}| \le \|C_n^{-1}\| \, |D_n| \le (1/m_f) |D_n| \mathop{\to}_{n \to \infty} 0,$$

this must hold for large enough $n$. From (4.40), however, if $n$ is large enough, then probably the first term of the series is sufficient.
3. We have

$$|T_n(f)^{-1} - T_n(1/f)| \le |T_n(f)^{-1} - C_n(f)^{-1}| + |C_n(f)^{-1} - T_n(1/f)|.$$

From part 2, for any $\epsilon > 0$ we can choose $n$ large enough so that

$$|T_n(f)^{-1} - C_n(f)^{-1}| \le \frac{\epsilon}{2}. \qquad (4.46)$$

From Theorem 3.1 and (4.28), $C_n(f)^{-1}$ is the circulant matrix with eigenvalues $1/f(2\pi m/n)$, i.e., $C_n(f)^{-1} = C_n(1/f)$, and hence Lemma 4.6 applied to the bounded Riemann integrable function $1/f$ gives $C_n(1/f) \sim T_n(1/f)$, so that for $n$ large enough

$$|C_n(f)^{-1} - T_n(1/f)| \le \frac{\epsilon}{2}. \qquad (4.47)$$

Thus for any $\epsilon > 0$ from (4.46)-(4.47) we can choose $n$ such that

$$|T_n(f)^{-1} - T_n(1/f)| \le \epsilon,$$

which is (4.42). Equation (4.43) follows from (4.42) and Theorem 2.4. Alternatively, if $G(x)$ is any continuous function on $[1/M_f, 1/m_f]$, then (4.43) follows directly from Lemma 4.6 and Theorem 2.4 applied to $G(1/x)$.
4. When $f(\lambda)$ has zeros ($m_f = 0$) then from Corollary 4.2 $\lim_{n \to \infty} \min_k \tau_{n,k} = 0$ and hence

$$\|T_n^{-1}\| = \max_k \rho_{n,k} = 1/\min_k \tau_{n,k} \qquad (4.48)$$

is unbounded as $n \to \infty$. To see whether $1/f(\lambda)$ is integrable, consider the sets

$$E_k = \{\lambda : 1/(k+1) \le f(\lambda)/M_f \le 1/k\}; \qquad (4.49)$$

since $f(\lambda)$ is continuous and has at least one zero, all of these sets are nonzero intervals of size, say, $|E_k|$. From (4.49)

$$\int_{E_k} d\lambda / f(\lambda) \ge |E_k| \, k / M_f \qquad (4.50)$$

and hence

$$\int_0^{2\pi} d\lambda / f(\lambda) \ge \sum_{k=1}^{\infty} |E_k| \, k / M_f. \qquad (4.51)$$

If, in addition, the derivative of $f$ exists and is bounded, say by $\eta$, then $|E_k| \ge M_f \eta^{-1} \big( 1/k - 1/(k+1) \big) = M_f \eta^{-1} / (k(k+1))$, so that

$$\int_0^{2\pi} d\lambda / f(\lambda) \ge \eta^{-1} \sum_{k=1}^{\infty} 1/(k+1), \qquad (4.52)$$

which diverges, so that $1/f(\lambda)$ is not integrable. To prove (4.44) let $F(x)$ be continuous on $[1/M_f, \theta]$; then $F[\min(1/x, \theta)]$ is continuous on $[0, M_f]$ and hence Theorem 2.4 yields (4.44). Note that (4.44) implies that the eigenvalues of $T_n^{-1}$ are asymptotically equally distributed, up to any finite $\theta$, as the eigenvalues of the sequence of matrices $T_n[\min(1/f, \theta)]$.

A special case of part 4 is when $T_n(f)$ is finite order and $f(\lambda)$ has at least one zero. Then the derivative indeed exists and is bounded, since

$$|df/d\lambda| = \Big| \sum_{k=-m}^{m} ik \, t_k e^{ik\lambda} \Big| \le \sum_{k=-m}^{m} |k| \, |t_k| < \infty.$$
The series expansion of part 2 is due to Rino [6]. The proof of part 4 is motivated by one of Widom [2]. Further results along the lines of part 4 regarding unbounded Toeplitz matrices may be found in [5]. Extending part 1 to the case of non-Hermitian matrices can be somewhat difficult, i.e., finding conditions on $f(\lambda)$ to ensure that $T_n(f)$ is invertible.
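Part 3 of the theorem can be checked numerically. In the sketch below (our own helper names; the Fourier coefficients are computed by a Riemann sum rather than in closed form) the symbol is $f(\lambda) = 2 + \cos\lambda \ge 1 > 0$, and the weak-norm distance between $T_n(f)^{-1}$ and $T_n(1/f)$ shrinks as $n$ grows.

```python
import numpy as np

def weak_norm(A):
    return np.sqrt((np.abs(A) ** 2).sum() / A.shape[0])

def toeplitz_from_symbol(f, n, grid=4096):
    lam = 2.0 * np.pi * np.arange(grid) / grid
    # t_k = (2 pi)^{-1} int f(lambda) e^{-ik lambda} d lambda, approximated by a Riemann sum
    t = {k: (f(lam) * np.exp(-1j * k * lam)).mean() for k in range(-(n - 1), n)}
    return np.array([[t[k - j] for j in range(n)] for k in range(n)])

f = lambda lam: 2.0 + np.cos(lam)
norms = []
for n in (8, 32, 128):
    inv_Tf = np.linalg.inv(toeplitz_from_symbol(f, n))
    T_inv_f = toeplitz_from_symbol(lambda lam: 1.0 / f(lam), n)
    norms.append(weak_norm(inv_Tf - T_inv_f))
assert norms[0] > norms[1] > norms[2]    # T_n(f)^{-1} ~ T_n(1/f) in weak norm
```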
Parallel results hold for products of Toeplitz matrices.

Theorem 4.4 Let $T_n(f)$ and $T_n(g)$ be defined as in (4.5) with bounded Riemann integrable real functions $f$ and $g$, and let $\zeta_{n,k}$ denote the eigenvalues of $T_n(f) T_n(g)$.

1. $$T_n(f) T_n(g) \sim C_n(f) C_n(g) = C_n(fg) \qquad (4.53)$$

and

$$T_n(f) T_n(g) \sim T_n(g) T_n(f), \qquad (4.54)$$

so that

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \zeta_{n,k}^s = (2\pi)^{-1} \int_0^{2\pi} [f(\lambda) g(\lambda)]^s \, d\lambda, \qquad s = 1, 2, \ldots. \qquad (4.55)$$

2. If $T_n(f)$ and $T_n(g)$ are Hermitian, then for any $F(x)$ continuous on $[m_f m_g, M_f M_g]$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} F(\zeta_{n,k}) = (2\pi)^{-1} \int_0^{2\pi} F[f(\lambda) g(\lambda)] \, d\lambda. \qquad (4.56)$$

3. $$T_n(f) T_n(g) \sim T_n(fg). \qquad (4.57)$$

4. Let $f_1(\lambda), \ldots, f_m(\lambda)$ be Riemann integrable. Then if the $C_n(f_i)$ are defined as in (4.29),

$$\prod_{i=1}^{m} T_n(f_i) \sim C_n\Big( \prod_{i=1}^{m} f_i \Big) \sim T_n\Big( \prod_{i=1}^{m} f_i \Big). \qquad (4.58)$$

5. If $\zeta_{n,k}$ now denote the eigenvalues of $\prod_{i=1}^{m} T_n(f_i)$, then for any positive integer $s$

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \zeta_{n,k}^s = (2\pi)^{-1} \int_0^{2\pi} \Big[ \prod_{i=1}^{m} f_i(\lambda) \Big]^s d\lambda. \qquad (4.59)$$

If the $T_n(f_i)$ are Hermitian, then the $\zeta_{n,k}$ are asymptotically real, i.e., the imaginary parts converge to a distribution at zero, so that

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \big( \mathrm{Re}[\zeta_{n,k}] \big)^s = (2\pi)^{-1} \int_0^{2\pi} \Big[ \prod_{i=1}^{m} f_i(\lambda) \Big]^s d\lambda \qquad (4.60)$$

and

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} \big( \mathrm{Im}[\zeta_{n,k}] \big)^2 = 0. \qquad (4.61)$$
Proof.
1. Equation (4.53) follows from Lemmas 4.5 and 4.6 and Theorems 2.1
and 3. Equation (4.54) follows from (4.53). Note that while Toeplitz
matrices do not in general commute, asymptotically they do. Equation
(4.55) follows from (4.53), Theorem 2.2, and Lemma 4.4.
2. Proof follows from (4.53) and Theorem 2.4. Note that the eigenvalues
of the product of two Hermitian matrices are real [3, p. 105].
3. Applying Lemmas 4.5 and 4.6 and Theorem 2.1
|Tn (f )Tn (g) Tn (f g)| = |Tn (f )Tn (g) Cn (f )Cn (g)
+Cn (f )Cn (g) Tn (f g)|
.
|Tn (f )Tn (g) Cn (f )Cn (g)|
+|Cn (f g) Tn (f g)|n
0
4. Follows from repeated application of (4.53) and part (c).
5. Equation (4.59) follows from (4.58) and Theorem 2.2. For the Hermitian case, however, we cannot simply apply Theorem 2.4 since the eigenvalues ρ_{n,k} of Π_{i=1}^{m} Tn(fi) may not be real. We can show, however, that they are asymptotically real.
Write ρ_{n,k} = α_{n,k} + iβ_{n,k}. Since Π_{i=1}^{m} Tn(fi) ∼ Cn(Π_{i=1}^{m} fi), we have
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} (α_{n,k} + iβ_{n,k})^s = lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} ρ_{n,k}^s = (2π)^{−1} ∫_0^{2π} [Π_{i=1}^{m} fi(λ)]^s dλ.   (4.62)
From (2.14),
n^{−1} Σ_{k=0}^{n−1} |ρ_{n,k}|^2 = n^{−1} Σ_{k=0}^{n−1} (α_{n,k}^2 + β_{n,k}^2) ≤ |Π_{i=1}^{m} Tn(fi)|^2.
Since
lim_{n→∞} |Cn(Π_{i=1}^{m} fi)|^2 = (2π)^{−1} ∫_0^{2π} |Π_{i=1}^{m} fi(λ)|^2 dλ,   (4.63)
comparison with (4.62) for s = 2 shows that
n^{−1} Σ_{k=0}^{n−1} β_{n,k}^2 →_{n→∞} 0.
Thus the distribution of the imaginary parts tends to the origin and hence
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} α_{n,k}^s = (2π)^{−1} ∫_0^{2π} [Π_{i=1}^{m} fi(λ)]^s dλ.
Parts 4 and 5 are proved here as in Grenander and Szegö [1, pp. 105–106].
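Part 1 of the theorem is easy to check numerically. The following sketch (assuming NumPy is available; the symbols f(λ) = 2 + 2 cos λ and g(λ) = 3 + 2 cos λ are illustrative choices, not taken from the text) compares the eigenvalue moments of Tn(f)Tn(g) against the integral on the right-hand side of (4.55):

```python
import numpy as np

def toeplitz_from_coeffs(coeffs, n):
    """Build T_n(f) from Fourier coefficients {k: t_k}; (T_n)_{j,l} = t_{j-l}."""
    T = np.zeros((n, n))
    for k, t in coeffs.items():
        if abs(k) < n:
            T += t * np.eye(n, k=-k)
    return T

# illustrative banded symbols: f(lam) = 2 + 2 cos(lam), g(lam) = 3 + 2 cos(lam)
n = 512
Tf = toeplitz_from_coeffs({0: 2.0, 1: 1.0, -1: 1.0}, n)
Tg = toeplitz_from_coeffs({0: 3.0, 1: 1.0, -1: 1.0}, n)
rho = np.linalg.eigvals(Tf @ Tg)            # eigenvalues of T_n(f) T_n(g)

lam = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
fg = (2.0 + 2.0 * np.cos(lam)) * (3.0 + 2.0 * np.cos(lam))

for s in (1, 2):
    # eigenvalue moments versus (2 pi)^{-1} integral of (f g)^s
    print(s, np.mean(rho.real ** s), np.mean(fg ** s))
```

At n = 512 the printed pairs agree to a few parts per thousand; the imaginary parts of ρ are negligible here because both matrices are symmetric and positive definite.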
We have developed theorems on the asymptotic behavior of eigenvalues, inverses, and products of Toeplitz matrices. The basic method has been to find an asymptotically equivalent circulant matrix whose special simple structure can be analyzed directly.
4.3 Toeplitz Determinants

We close the chapter with a result on determinants. Since the determinant of Cn is the product of its eigenvalues f(2πm/n), m = 0, 1, . . . , n − 1, we have
ln (det(Cn))^{1/n} = n^{−1} ln det Cn = n^{−1} Σ_{m=0}^{n−1} ln f(2πm/n).
These sums are the Riemann approximations to the limiting integral, whence
lim_{n→∞} ln (det(Cn))^{1/n} = ∫_0^1 ln f(2πλ) dλ = (2π)^{−1} ∫_0^{2π} ln f(λ) dλ,
and hence
lim_{n→∞} (det(Cn))^{1/n} = e^{(2π)^{−1} ∫_0^{2π} ln f(λ) dλ}.   (4.64)
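Because Tn(f) and Cn(f) are asymptotically equivalent, the same limit governs the Toeplitz determinants when f is bounded away from zero. A minimal numerical sketch (assuming NumPy; the tridiagonal symbol f(λ) = 3 + 2 cos λ ≥ 1 is an illustrative choice, not from the text):

```python
import numpy as np

def tridiag_toeplitz(n, t0, t1):
    # T_n(f) for the symbol f(lam) = t0 + 2*t1*cos(lam)
    return t0 * np.eye(n) + t1 * (np.eye(n, k=1) + np.eye(n, k=-1))

# right-hand side of (4.64): exp((2 pi)^{-1} integral of ln f)
lam = np.linspace(0.0, 2.0 * np.pi, 1 << 14, endpoint=False)
limit = np.exp(np.mean(np.log(3.0 + 2.0 * np.cos(lam))))

for n in (16, 64, 256):
    sign, logdet = np.linalg.slogdet(tridiag_toeplitz(n, 3.0, 1.0))
    print(n, np.exp(logdet / n), limit)   # n-th root of det approaches the limit
```

For this symbol the limit evaluates to (3 + √5)/2 ≈ 2.618, which the determinant roots approach as n grows.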
Chapter 5
Applications to Stochastic
Time Series
Toeplitz matrices arise quite naturally in the study of discrete time random processes. Covariance matrices of weakly stationary processes are Toeplitz and triangular Toeplitz matrices provide a matrix representation of causal linear time invariant filters. As is well known and as we shall show, these two types of Toeplitz matrices are intimately related. We shall take two viewpoints in the first section of this chapter to show how they are related. In the first part we shall consider two common linear models of random time series and study the asymptotic behavior of the covariance matrix, its inverse and its eigenvalues. The well known equivalence of moving average processes and weakly stationary processes will be pointed out. The lesser known fact that we can define something like a power spectral density for autoregressive processes even if they are nonstationary is discussed. In the second part of the first section we take the opposite tack: we start with a Toeplitz covariance matrix and consider the asymptotic behavior of its triangular factors. This simple result provides some insight into the asymptotic behavior of system identification algorithms and Wiener-Hopf factorization.
The second section provides another application of the Toeplitz distribution theorem to stationary random processes by deriving the Shannon information rate of a stationary Gaussian random process.
Let {Xk; k ∈ I} be a discrete time random process. Generally we take I = Z, the space of all integers, in which case we say that the process is two-sided, or I = Z+, the space of all nonnegative integers, in which case we say that the process is one-sided. We will be interested in vector representations of the process, so define the column vector X^n = (X0, X1, . . . , X_{n−1})^t, the mean vector m^n = E(X^n), and the covariance matrix Rn = {r_{k,j}} given by
Rn = E[(X^n − m^n)(X^n − m^n)∗].   (5.1)
This is the autocorrelation matrix when the mean vector is zero. Subscripts will be dropped when they are clear from context. If the matrix Rn is Toeplitz, say Rn = Tn(f), then r_{k,j} = r_{k−j} and the process is said to be weakly stationary. In this case we can define
f(λ) = Σ_{k=−∞}^{∞} r_k e^{ikλ}
as the power spectral density of the process. If the matrix Rn is not Toeplitz but is asymptotically Toeplitz, i.e., Rn ∼ Tn(f), then we say that the process is asymptotically weakly stationary and once again define f(λ) as the power spectral density. The latter situation arises, for example, if an otherwise stationary process is initialized with Xk = 0, k ≤ 0. This will cause a transient and hence the process is strictly speaking nonstationary. The transient dies out, however, and the statistics of the process approach those of a weakly stationary process as n grows.
The results derived herein are essentially trivial if one begins and deals only with doubly infinite matrices. As might be hoped, the results for asymptotic behavior of finite matrices are consistent with this case. The problem is of interest since one often has finite order equations and one wishes to know the asymptotic behavior of solutions, or one has some function defined as a limit of solutions of finite equations. These results are useful both for finding theoretical limiting solutions and for finding reasonable approximations for finite order solutions. So much for philosophy. We now proceed to investigate the behavior of two common linear models. For simplicity we will assume the process means are zero.
5.1 Moving Average Processes

A common class of linear models is obtained by driving a causal linear time-invariant filter with {Wk}, a zero-mean white noise process with variance σ²; the output process is then determined by the filter and the noise. The most common such model is called a moving average process and is defined by the difference equation
Un = Σ_{k=0}^{n} bk W_{n−k} = Σ_{k=0}^{n} b_{n−k} Wk,   (5.2)
Un = 0,   n < 0.
We assume that b0 = 1 with no loss of generality since otherwise we can incorporate b0 into σ². Note that (5.2) is a discrete time convolution, i.e., Un is the output of a filter with impulse response (actually Kronecker δ response) bk and input Wk. We could be more general by allowing the filter bk to be noncausal and hence act on future Wk's. We could also allow the Wk's and Uk's to extend into the infinite past rather than being initialized. This would lead to replacing (5.2) by
Un = Σ_{k=−∞}^{∞} bk W_{n−k} = Σ_{k=−∞}^{∞} b_{n−k} Wk.   (5.3)
We will restrict ourselves to causal filters for simplicity and keep the initial conditions since we are interested in limiting behavior. In addition, since stationary distributions may not exist for some models it would be difficult to handle them unless we start at some fixed time. For these reasons we take (5.2) as the definition of a moving average.
Since we will be studying the statistical behavior of Un as n gets arbitrarily large, some assumption must be placed on the sequence bk to ensure that (5.2) converges in the mean-squared sense. The weakest possible assumption that will guarantee convergence of (5.2) is that
Σ_{k=0}^{∞} |bk|² < ∞.   (5.4)
In keeping with the previous sections, however, we will make the stronger
assumption
Σ_{k=0}^{∞} |bk| < ∞.   (5.5)
Define the lower triangular matrix

     [ 1                           ]
     [ b1       1                  ]
Bn = [ b2       b1     1           ]   (5.6)
     [  :                .     .   ]
     [ b_{n−1}  · · ·  b2   b1   1 ]
so that
U^n = Bn W^n.   (5.7)
If the filter bn were not causal, then Bn would not be triangular. If in addition (5.3) held, i.e., we looked at the entire process at each time instant, then (5.7) would require infinite vectors and matrices as in Grenander and Rosenblatt [12]. Since the covariance matrix of Wk is simply σ²In, where In is the n × n identity matrix, we have for the covariance of Un:
R_U^{(n)} = E[U^n (U^n)∗] = E[Bn W^n (W^n)∗ Bn∗] = σ² Bn Bn∗   (5.8)
or, equivalently
rk,j = 2
n1
bkbj
=0
(5.9)
min(k,j)
= 2
b+(kj)b
=0
From (5.9) it is clear that r_{k,j} is not Toeplitz because of the min(k, j) in the sum. However, as we next show, as n → ∞ the upper limit becomes large and R_U^{(n)} becomes asymptotically Toeplitz. If we define
b(λ) = Σ_{k=0}^{∞} bk e^{ikλ},   (5.10)
then
Bn = Tn(b)   (5.11)
so that
R_U^{(n)} = σ² Tn(b) Tn(b)∗.   (5.12)
We can now apply the results of the previous sections to obtain the following theorem.

Theorem 5.1 Let Un be a moving average process with covariance matrix R_U^{(n)} and let η_{n,k} be the eigenvalues of R_U^{(n)}. Then

1.
R_U^{(n)} ∼ σ² Tn(|b|²) = Tn(σ²|b|²).   (5.13)

2. For any F(x) continuous on [0, σ² M_{|b|²}],
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} F(η_{n,k}) = (2π)^{−1} ∫_0^{2π} F(σ²|b(λ)|²) dλ.   (5.14)

3. If |b(λ)|² ≥ m > 0, then
(R_U^{(n)})^{−1} ∼ σ^{−2} Tn(1/|b|²).   (5.15)
Proof.
(Theorems 4.2-4.4 and 2.4.)
If the process Un had been initiated with its stationary distribution then we would have had exactly
R_U^{(n)} = σ² Tn(|b|²).
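Equation (5.13) and part 2 of the theorem are easy to visualize numerically. A sketch (assuming NumPy; the MA(1) choice b0 = 1, b1 = 0.5 with σ = 1 is an illustrative assumption, not from the text):

```python
import numpy as np

n, b1 = 256, 0.5
Bn = np.eye(n) + b1 * np.eye(n, k=-1)    # lower triangular Toeplitz as in (5.6)
RU = Bn @ Bn.T                            # covariance (5.8) with sigma^2 = 1

eig = np.linalg.eigvalsh(RU)
lam = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
spec = np.abs(1.0 + b1 * np.exp(1j * lam)) ** 2   # sigma^2 |b(lam)|^2

# eigenvalue moments match the spectral moments predicted by (5.14)
print(np.mean(eig), np.mean(spec))         # both near 1 + b1^2 = 1.25
print(np.mean(eig ** 2), np.mean(spec ** 2))
```

Only the first row of R_U differs from the Toeplitz pattern, which is why the moments already agree closely at moderate n.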
5.2 Autoregressive Processes

An autoregressive process is defined by the difference equation
Xn = − Σ_{k=1}^{n} ak X_{n−k} + Wn,   n = 0, 1, . . .   (5.16)
Xn = 0,   n < 0.
Defining the lower triangular matrix

     [ 1                        ]
     [ a1       1               ]
An = [ a2       a1     1        ]   (5.17)
     [  :               .    .  ]
     [ a_{n−1}  · · ·  a1     1 ]

so that
An X^n = W^n,   (5.18)
we have
R_W^{(n)} = An R_X^{(n)} An∗   (5.19)
and hence
(R_X^{(n)})^{−1} = σ^{−2} An∗ An   (5.20)
or
t_{k,j}^{(n)} = σ^{−2} Σ_{m=0}^{n−1} a∗_{m−k} a_{m−j} = σ^{−2} Σ_{m=0}^{n−1−max(k,j)} a∗_m a_{m+(k−j)}.
Unlike the moving average process, we have that the inverse covariance matrix is the product of triangular Toeplitz matrices. Defining
a(λ) = Σ_{k=0}^{∞} ak e^{ikλ},   a0 = 1,   (5.21)
we have that
(R_X^{(n)})^{−1} = σ^{−2} Tn(a)∗ Tn(a).   (5.22)
Applying the results of the previous sections yields the following theorem.

Theorem 5.2 Let Xn be an autoregressive process with covariance matrix R_X^{(n)} and let λ_{n,k} be the eigenvalues of R_X^{(n)}. Then

1.
(R_X^{(n)})^{−1} ∼ σ^{−2} Tn(|a|²).   (5.23)

2. For any F(x) continuous on [0, σ^{−2} M_{|a|²}],
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} F(1/λ_{n,k}) = (2π)^{−1} ∫_0^{2π} F(σ^{−2}|a(λ)|²) dλ.   (5.24)

3. If |a(λ)|² ≥ m > 0, then
R_X^{(n)} ∼ σ² Tn(1/|a|²).   (5.25)
Proof.
(Theorem 5.1.)
Note that if |a(λ)|² > 0, then 1/|a(λ)|² is the spectral density of Xn. If |a(λ)|² has a zero, then R_X^{(n)} may not be even asymptotically Toeplitz and hence Xn may not be asymptotically stationary (since 1/|a(λ)|² may not be integrable), so that strictly speaking Xk will not have a spectral density. It is often convenient, however, to define σ²/|a(λ)|² as the spectral density and it often is useful for studying the eigenvalue distribution of Rn. We can relate σ²/|a(λ)|² to the eigenvalues of R_X^{(n)} even in this case by using Theorem 4.3 part 4.
Corollary 5.1 Given the assumptions of Theorem 5.2, for any finite θ and any F(x) continuous on [0, θ],
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} F[min(λ_{n,k}, θ)] = (2π)^{−1} ∫_0^{2π} F[min(σ²/|a(λ)|², θ)] dλ.   (5.26)
Proof.
(Theorem 5.2 and Theorem 4.3.)
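The truncated distribution (5.26) can be illustrated numerically. A sketch (assuming NumPy; the AR(1) choice Xn = 0.9 X_{n−1} + Wn, i.e. a1 = −0.9 with σ = 1, is an illustrative assumption, and the spectral density is sharply peaked though not actually unbounded here):

```python
import numpy as np

n, a1 = 512, -0.9
An = np.eye(n) + a1 * np.eye(n, k=-1)     # X_n = 0.9 X_{n-1} + W_n
RX = np.linalg.inv(An.T @ An)             # covariance via (5.20), sigma^2 = 1

eig = np.linalg.eigvalsh(RX)
lam = np.linspace(0.0, 2.0 * np.pi, 1 << 14, endpoint=False)
spec = 1.0 / np.abs(1.0 + a1 * np.exp(1j * lam)) ** 2   # sigma^2 / |a(lam)|^2

theta = 20.0   # truncation level as in (5.26), taking F(x) = x
print(np.mean(np.minimum(eig, theta)), np.mean(np.minimum(spec, theta)))
```

The truncation at θ tames the spectral peak near λ = 0, so the two averages agree even though the untruncated moments converge slowly.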
If we consider two models of a random process to be asymptotically equivalent if their covariances are asymptotically equivalent, then from Theorems 5.1 and 5.2 we have the following corollary.
Corollary 5.2 Consider the moving average process defined by
U^n = Tn(b) W^n
and the autoregressive process defined by
Tn(a) X^n = W^n.
Then the processes Un and Xn are asymptotically equivalent if
a(λ) = 1/b(λ)
and M ≥ a(λ) ≥ m > 0, so that 1/b(λ) is integrable.
Proof.
(Theorem 4.3 and Theorem 4.5.)
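Corollary 5.2 can be checked directly. A sketch (assuming NumPy; b(λ) = 1 + 0.5 e^{iλ} is an illustrative choice, so that a(λ) = 1/b(λ) has coefficients ak = (−0.5)^k):

```python
import numpy as np

n = 256
Bn = np.eye(n) + 0.5 * np.eye(n, k=-1)                   # T_n(b), b = (1, 0.5, 0, ...)
a = (-0.5) ** np.arange(n)                                # coefficients of 1/b(lam)
An = sum(a[k] * np.eye(n, k=-k) for k in range(n))        # T_n(a)

RU = Bn @ Bn.T                     # moving average covariance
RX = np.linalg.inv(An.T @ An)      # autoregressive covariance, sigma^2 = 1

# weak (normalized Frobenius) distance between the two covariances
print(np.linalg.norm(RU - RX) / np.sqrt(n))
```

For this particular pair the finite-n triangular Toeplitz matrices are exact inverses of one another, so the distance is zero to machine precision; in general the corollary only guarantees that the weak-norm distance vanishes as n → ∞.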
5.3 Factorization

Consider the asymptotic behavior of the triangular factors of a sequence of Toeplitz covariance matrices
R_X^{(n)} = Tn(f).   (5.27)
It is well known that any such matrix can be factored into the product of a lower triangular matrix and its conjugate transpose [12, p. 37], in particular
Tn(f) = {t_{k,j}} = Bn Bn∗,   (5.28)
where Bn is a lower triangular matrix with entries
b_{k,j}^{(n)} = {(det Tk)(det T_{k−1})}^{−1/2} γ(j, k),   (5.29)
where γ(j, k) is the determinant of the matrix Tk with the right hand column replaced by (t_{j,0}, t_{j,1}, . . . , t_{j,k−1})^t. Note in particular that the diagonal elements are given by
b_{k,k}^{(n)} = {(det Tk)/(det T_{k−1})}^{1/2}.   (5.30)
Assume that f(λ) ≥ m_f > 0 and that f has the Wiener-Hopf factorization f(λ) = σ²|b(λ)|² with
b(λ) = Σ_{k=0}^{∞} bk e^{ikλ},   b0 = 1.   (5.31)
From (5.28) and Theorem 4.4 we have
Bn Bn∗ = Tn(f) ∼ σ² Tn(b) Tn(b)∗,   (5.32)
which suggests that Bn itself should behave like σTn(b). Defining Tn = σTn(b), we shall show that
Bn^{−1} Tn ∼ In.   (5.33)
Proof.
Since det Tn = σ^n det Tn(b) = σ^n ≠ 0, Tn is invertible. Likewise, since det Bn = [det Tn(f)]^{1/2}, we have from Theorem 4.3 part 1 that det Tn(f) ≠ 0, so that Bn is invertible. Thus from Theorem 2.1 (e) and (5.32) we have
Bn^{−1} Tn ∼ Bn∗ (Tn∗)^{−1} = [Tn^{−1} Bn]∗.   (5.34)
Since Bn and Tn are both lower triangular matrices, so are Bn^{−1} and Bn^{−1}Tn, while [Tn^{−1}Bn]∗ is upper triangular. Thus (5.34) states that a lower triangular matrix is asymptotically equivalent to an upper triangular matrix. This is only possible if both matrices are asymptotically equivalent to a diagonal matrix, say Gn = {g_{k,k}^{(n)} δ_{k,j}}. Furthermore, from (5.34) we have Gn ∼ [Gn∗]^{−1}, so that
{|g_{k,k}^{(n)}|² δ_{k,j}} ∼ In.   (5.35)
In particular, since the diagonal entries of Tn^{−1}Bn are σ^{−1} b_{k,k}^{(n)}, this implies that
lim_{n→∞} n^{−1} Σ_{k=0}^{n−1} |σ^{−1} b_{k,k}^{(n)}|² = 1.   (5.36)
From (5.30) and Corollary 2.3, and the fact that Tn(b) is triangular, we have that
lim_{n→∞} σ^{−1} n^{−1} Σ_{k=0}^{n−1} b_{k,k}^{(n)} = σ^{−1} lim_{n→∞} {(det Tn(f))/(det T_{n−1}(f))}^{1/2}
= σ^{−1} lim_{n→∞} {det Tn(f)}^{1/2n}
= σ^{−1} · σ = 1.   (5.37)
Combining (5.36) and (5.37) yields
lim_{n→∞} |Bn^{−1} Tn − In| = 0.   (5.38)
5.4 Differential Entropy Rate of Gaussian Processes

As a final application, consider a weakly stationary Gaussian random process {Xn} with covariance matrix Rn and power spectral density
S(f) = Σ_{n=−∞}^{∞} RX(n) e^{−2πinf},
the Fourier transform of the covariance function. For a fixed positive integer n, the probability density function is
f_{X^n}(x^n) = (2π)^{−n/2} (det Rn)^{−1/2} e^{−(1/2)(x^n − m^n)^t Rn^{−1}(x^n − m^n)}.
The corresponding differential entropy is
h(X^n) = (1/2) ln (2πe)^n det Rn,
and applying the Toeplitz determinant limit gives the differential entropy rate
h(X) = lim_{n→∞} n^{−1} h(X^n) = (1/2) ln 2πe + (1/2) ∫_0^1 ln S(f) df.   (5.40)
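A sketch comparing the per-sample entropy n^{−1} h(X^n) with the rate formula (5.40) for an assumed Gaussian AR(1) source (NumPy; Xn = 0.9 X_{n−1} + Wn with σ = 1, natural logarithms throughout):

```python
import numpy as np

n = 512
An = np.eye(n) - 0.9 * np.eye(n, k=-1)     # X_n = 0.9 X_{n-1} + W_n
Rn = np.linalg.inv(An.T @ An)              # covariance matrix, sigma^2 = 1

sign, logdet = np.linalg.slogdet(Rn)
h_finite = 0.5 * np.log(2.0 * np.pi * np.e) + 0.5 * logdet / n   # n^{-1} h(X^n)

f = (np.arange(1 << 12) + 0.5) / (1 << 12)
S = 1.0 / np.abs(1.0 - 0.9 * np.exp(-2j * np.pi * f)) ** 2
h_rate = 0.5 * np.log(2.0 * np.pi * np.e) + 0.5 * np.mean(np.log(S))  # (5.40)

print(h_finite, h_rate)   # both approach (1/2) ln(2 pi e) for this source
```

For this source det Rn = 1 (the triangular model matrix has unit diagonal) and the mean of ln S over frequency vanishes, so both quantities equal the white-noise entropy rate, illustrating that filtering by a minimum-phase unit filter does not change the entropy rate.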
The Toeplitz distribution theorems have also found application in more
complicated information theoretic evaluations, including the channel capacity
of Gaussian channels [17, 18] and the rate-distortion functions of autoregressive sources [9].
Bibliography
[1] U. Grenander and G. Szegö, Toeplitz Forms and Their Applications, University of California Press, Berkeley and Los Angeles, 1958.
[2] H. Widom, "Toeplitz Matrices," in Studies in Real and Complex Analysis, edited by I.I. Hirschman, Jr., MAA Studies in Mathematics, Prentice-Hall, Englewood Cliffs, NJ, 1965.
[3] P. Lancaster, Theory of Matrices, Academic Press, NY, 1969.
[4] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, NY, 1964.
[5] R.M. Gray, "On Unbounded Toeplitz Matrices and Nonstationary Time Series with an Application to Information Theory," Information and Control, 24, pp. 181–196, 1974.
[6] C.L. Rino, "The Inversion of Covariance Matrices by Finite Fourier Transforms," IEEE Trans. on Info. Theory, IT-16, No. 2, pp. 230–232, March 1970.
[7] G. Baxter, "A Norm Inequality for a Finite-Section Wiener-Hopf Equation," Illinois J. Math., pp. 97–103, 1962.
[8] G. Baxter, "An Asymptotic Result for the Finite Predictor," Math. Scand., 10, pp. 137–144, 1962.
[9] R.M. Gray, "Information Rates of Autoregressive Processes," IEEE Trans. on Info. Theory, IT-16, No. 4, pp. 412–421, July 1970.
[10] F.R. Gantmacher, The Theory of Matrices, Chelsea Publishing Co., NY, 1960.