You are on page 1of 147

Editorial Poiicy

lix the publication or mOllographs

In what !()llo\\s all refercnces to monographs arc applicahle also to ll1ultiauthorship


\oIUllleS such as seminar notes.
~ I. Lccturc \otes aim to report Ile\\ de\'clopments - quickly. inf()I"l11ally. and at a high
bel. Monograph manuscrirts should be reasonably sc': r-contained and rounded off Thus
they may. and oflen will. present not only results oCthe author but abo related work by
other people. Furthermore. the manuscripts should prm ide sufricient l1loti\atiot1. ex-
amples. and applications. This clearly distingUtshes Lecture ;\otes manuscript:; n'om
journal anicles which normally arc \Cry concise. /\rticles intended I(lr a journal but too
long to be accepted by rnost journals usually do not haw this "lecture notes" character
I-"or similar reasons it is unusual f(lr Ph.D. theses to be accepted f()r the Lecturc "Jotes
senes.
~ 2. \1alluscripts or plans for I.ecture '\otes \olumes should be submitted (preferably
in dupliL:atcl either to one 01" the series editors or to Springer-Verlag. '\Je\\ York. These
proposals arc then relcreed. A final decision concerning puhlication can nnly he made
llnthe basis of the complete manllscript. but a preliminary lb.:ision can often be based
on partial inf()I"Ill<ltion: a fllirly detailed outline describing the planned conients or \~ach
chapter. and an indicat ion of the est imated length. a bibliography. and olle or t \\"() sample
chap!crs . or a first drall orlhe manuscript. The editors \\ill try to make the prcliminary
(kcisiOI1 as definite as they can on the basis of the mailable infi.JI"Illation.
~ J. Final manuscripts should be in English. They should contain at least 100 pages of
sClentil'iL; text and should include
.. a tahle of contellts:
.. an inf(xll1atin:: introduction. perhaps \\'ith somc historical remarks: it should be
acceSSible to a reader not particularly fllll1iliar \\ith the \\)pic treated:
.. a sll~iect index: as a rule this is genuinely helpf1.i! fill" the reader.
Lecture Notes in Statistics 85
Edited by S. Fienberg, J. Gani, K. Krickeberg,
I. Olkin, and N. Wennuth
Paul Doukhan

Mixing

Properties and Examples

Springer-Verlag
New York Berlin Heidelberg London Paris
Tokyo Hong Kong Barcelona Budapest
Paul Doukhan
Department of Economy
University of Cergy-Pontoise
33, Bd. du Port
95011 Cergy-Pontoise
France

Mathematics Subject Classification: 60F99, 6OE99

Library of Congress Cataloging-in-Publication Data


Doukhan, Paul.
Mixing: properties and examples I Paul Doukhan.
p. cm. -- (Lecture notes in statistics; 85)
Includes bibliographical references and index.
ISBN -13 :978-0-387-94214-8
1. Limit theorems (Probability theory) 2. Stochastic processes.
1. Title. II. Series: Lecture notes in statistics (Springer-Verlag)
; v. 85.
QA273.67.D68 1995
519.2--dc20 93-47442
Printed on acid-free paper.
@ 1994 Springer-Verlag New York, Inc.

All rights reselVed. This work may not be translated or copied in whole orin part without the written
permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY
10010, USA), except for brief excerpts in connection with reviews or scholady analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the
fonnerare not especially identified, is not to be taken as a sign that such names, as understood by the
Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Camera ready copy provided by the author.

987654321

ISBN -13 :978-0-387-94214-8 e-ISBN-13:978-1-4612-2642-0


DOl: 10.1007/978-1-4612-2642-0
I want to thank my parents and all the friends and colleagues who helped me in a
decisive way during the preparation of the present manuscript.

For all she brings to me, this work is dedicated

to Geraldine
Introduction

These notes are devoted to the study of mixing theory. The underlying goal is to provide
statisticians dealing with problems involving weak dependence properties, with a powerful and
easy tool. Up to now, this approach to dependence has been mainly considered from an abstract
point of view. For an excellent review on this subject we refer to :

"Dependence in Probability and Statistics, a Survey of Recent Results" (I).

The aim of this work is to study applications of these results. We obtain bounds for the decay
of mixing coefficient sequences associated to random processes or to random fields which are
actually used in Statistics. In some cases we will give counterexamples which show that some
frequently held ideas are wrong. In fact, it would be of little interest to study a probabilistic
technique with no field of application.

This work is divided in two parts. In the first part we focus on the definitions and probabilistic
properties of mixing theory. The second part describes the mixing properties of classical
processes and random fields. Let us describe in more details the contents of these two parts.

PART 1. Our presentation of mixing theory is mainly based on covariance inequalities,


moment inequalities of the Rosenthal kind and on the recent reconstruction techniques in Berbee
and Bradley's work. The new results presented in the first part are roughly equivalents in a
mixing framework to the inequalities of Rosenthal and Bernstein. Powerful Rosenthal
inequalities given in Doukhan & Portal (1983) and Uteev (1984) are extended here to the case
of random fields. They are not well known since no reference to them is recalled in the previous
collective reference. Rosenthal inequalities lead to improved rates of convergence for invariance
principles. Doukhan & Portal (1983) proved a weak invariance principle for the empirical
distribution function with the rate n- 1I27 • For instance Dehling (1983) shows a strong
invariance principle with the rate n-112880 while the use of the previous inequalities led Massart
(1987) to a better n- 1I33 rate. The same can be said about the exponential inequalities proved in
Carbon (1983), in Collomb (1984), or in Doukhan, Le6n & Portal (1984). Berbee (1979) and
Lin (1989) yield sharp Bernstein inequalities for various kinds of mixing processes. Some
results are extended here to the case of random fields.

Strong mixing is the weakest mixing condition used in statistics, even if it is much stronger
than the mixing conditions used in ergodic theory (2). We shall essentially focus on the strong
mixing properties of processes and fields. We only note some differences between the various
properties. Only note that mixing for random processes and for random fields are very distinct
notions. The corresponding conditions are classical in the case of processes. In one aim of
simplicity, we authoritatively define a notion of mixing random field; notations are very close
and the reader should take care to distinguish the mixing coefficient sequences associated to
random fields and to random processes. We do not intend to give very extensive results or
bibliography which may be found in the previously cited collective work. The Doctoral
Dissertation of A.V. Bulinskii (3) and the book by B. Nahapetian (4) deal mainly with the case
of random fields. X. Guyon (1992) presents the main statistical aspects of the theory of random
fields. The main basic other references are included in the Bibliography at the end of this
volume.

1 Oberwolfach 1985, Birkhiiuser (1986).


2 In our probabilist framework it is worth noticing that the reference measure is always finite.
The other difference is mainly the uniformity of the present mixing assumptions.
3 Moscow State University (1990).
4 Teubner Texte zur Mathematik (1992).
viii

PART 2. The second part presents a review of examples. It also generalizes various results
already known in order to unify them. For instance, mixing properties for new models in
financial mathematics and linear or Gaussian fields arc studied here in detail. Prohahilistic
results may also imply a certain knowledge of the decay rate for the mixing sequences. Thus the
results presented determine the rate of decay of mixing sequences associated with the classical
modcls - in an optimal way for some cases. We show that most of the processes or fields used
in Statistics that satisfy some stability or ergodicity property are strongly mixing. The classes of
processes investigated are the Gaussian Random Fields, the Gibhs Markov Fields and the
General Linear Random nelds. The properties of discrete time Markov Processes yield the
properties of ARMA, Bilinear, ARCH and GARCH processes, as well as their non linear
versions. In order to investigate Continuous Time Processes we recall topics concerning the
hypermixing property.

This study is mainly self-contained hut for the hasic notions of Prohahility Theory and
Mathemutical Analysis. For the sake of shortness we do not present proofs for all the results.
Rather, we only give them when they arc easy or they correspond to fundamental results, in
order to introduce the reader to some of the suhtleties in mixing theory. We also include the
proofs of results which arc not well known. We intend to give an introductory exposition of
mixing techniques and references which will refer the reader to significative extensions as well
as to the proofs of the results included here.

Results are numbered linearly inside each chapter, thus Proposition 4 in chapter 1.5 will be
referrcd as Proposition 1.5.4 in anothcr chapter and only as Proposition 4 inside chapter 1.5.
Bihliographical comments at thc cnd of each chapter indicate developments as well as the
sources of previous results. The final hibliography includes the relationship hetween the
references and the related chapters of this volume. An index and a list of notations may be
useful tools for the reader. The end of a proof is identified by the symhol •.

I want to express my gratitude to all those who helped me while I was writing
this work specially Xavier Guyon, Abdelkader Mokkadem and Emmanuel Rio. Their results
arc widely developed in these notes. Many other colleagues and friends helped me, providing
for instance adequate references or performing calculations included here as Denis Bosq, Jean
Bretagnolle, Alexander Bulinskii, Hans Fiillmer, IIdar Ihragimov, Jose Rafacl Leon, JoaquIn
Ortega, AlexanderTsyhakov and Sergei Uteev. My orthograph was essentially smoothed hy
Joaquin Ortega. Marie-Claude Viano as well as the Referee have really improved the quality of
the manuscript hy their very attentive reading.

Key Words AMS Classification Subject 1985


60 F 99 Mixing, 60 G 99 Weak dependence, 60 U 10 Stationary process, 60 Ii 15 Inequalities.
60 E 99 Rosenthal inequalities, 60 E 99 Exponential inequalities, 60 F 05 Central limit theorem,
60 G 15 Uaussian process, 60 K 99 Marko" random field, Linear process, 60 J 05 Markov process with
discrete parameter, 60 J 27 Markov process with continuous parameter.
Contents

Introduction ........................................................ vii

l'O ota tions ........•................................................•.. xi

1. General properties ................................................... 1

I. I. Dependence of cr-lields ............................................................... 3

1.2. Basic tools ....................................... '" .................................... 7


1.2. I. Reconstruction techniques .................................................. 7
1.2.2. Covariance inequalities ..................................................... 9

1.3. Mixing ................................................................................ 15


1.3.1. Mixing random lields ..................................................... 16
1.3.2. Mixing processes .......................................................... 111
1.3.3. Weak conditions ti.)r processes .......................................... 21
1.3.4. Miscellany .................................................................. 22
1.4. Tools .................................................................................. 25
1.4.1. Moment incqualities ....................................................... 25
1.4.2. Exponential inequalities ................................................... 33
1.4.3. Maximal inequalities ...................................................... 40

1.5. Central limit theorem ................................................................ 45


1.5. I Sufticient conditions ...................................................... 4 5
I .5.2. Convergence rates ......................................................... 411
I .5.3. Dimension dependent rates ............................................... 49

2. Examples ..•.•.........................................•............ 55

2. I . Gaussian rundom fields ............................................................. 57


2.1.1. An explicit bound .......................................................... 511
2.1.2. Mixing rates ................................................................ 61
2.2. Gibbs fields .......................................................................... 63
2.2.1 Dobrushin theory .......................................................... 63
2.2.1.1. Comparison hclwccn specifications ........................................... 64
2.2.1.2. [)ohrushin's condilion ............................................................. 65
2.2.1.3. Mixing condilion ................................................................... 66
2.2.2 Markov fields .............................................................. 67
2.2.2.1. POlCnlials ............................................................................. 6 7
2.2.3 Non compact case ......................................................... 70
2.2.3.1. Poinl pmce~ses ..................................................................... 7 I
2.2.3.2. Diffusions ............................................................................ 7 2
x

2.3. Linear fields .......................................................................... 75


2.3.1. Independent innovations .................................................. 77
2.3.2. Dependent innovations .................................................... 80
2.3.3. Proofs ....................................................................... 81
2.3.4. Miscellany .................................................................. 84

2.4. Markov processes ................................................................... 87


2.4.0.1. A class of non linear models .................................................... 93
2.4.0.2. Dynamical systems approach ................................................... 93
2.4.0.3. Annealing ............................................................................ 94
2.4.1. Polynomial AR processes ................................................ 96
2.4.1.1. Bilinear models ..................................................................... 98
2.4.1.2. ARMA models ...................................................................... 98
2.4.2. Nonlinear processes ...................................................... 100
2.4.2.1. ARX(k. q) nonlinear processes ............................................... 100
2.4.2.2. AR(I) nonlinear processes ..................................................... 104
2.4.2.3. Financial nonlinear processes ................................................. 106

2.5. Continuous time processes ........................................................ 111


2.5.1. Markov processes ........................................................ 111
2.5.2. Operators .................................................................. 111
2.5.3. Diffusion processes ...................................................... 113
2.5.4. Hypermixing .............................................................. 118
2.5.5. Hypercontractivity ........................................................ 119
2.5.6. lJltrcl.Contractivity ......................................................... 121
2.5.7. General SDE .............................................................. 122
Bibliography ............................................................... 125
Index ..................................................................... 141
Notations
complement of A in E.

=Min {a, b j, minimum value of a finite family of real numbers.


A\B = AnBe is the difference of subsets A and B.
avb = Max {a, b}, maximum value of a finite family of real numbers.

CCU") space of real valued continuous functions on If .

COORd) space of real valued continuous functions on [Rd, with limit 0 at 00.

Cb(lRd) space of real valued continuous and bounded functions on [Rd.

C~(lRd) space of real valued continuous functions on [Rd, k-times continuously


differentiable with bounded k-th partial derivatives.

CK([Rd) space of real valued continuous compactly supported functions on [Rd.

Ck([Rd) space of k-times continuously differentiable functions on [Rd.

cX(A;B) is any measure of dependence of the process X = (X')'ET between index


subsets A, BeT: n, ~, p, '1>, or'll are defined in § 1.1.

cX(k; a, b) = sup {cx(A; 8); A, BeT, IAI::;; a, IBI::;; b}, mixing coefficient for
random fields.

cX,k;a,b = sup {cXClt-a, t]nT; ]t+k, t+k+b]nT); t E "tl. j, if T c "tl..

Symmetric difference of two subsets: Ad8 = A\8u8\A

ess-sup X = Inf{a E [Ru{+ooj; IP(X > a) = OJ.


InfA infimum of A c IR, we set Inf 0 = + 00, and analogously for Sup A.

u n
00 00

Iiminf An Am.
n=1 m=n

n
00 00

Iimsup An U Am.
n=1 m=n

spaceofmeasurablefunctionson[Rdwith flf(x)I P dx < 00.

IR d

when it is not defined another way, A. is Lebesgue measure on [Rd.


xii

tensor produet of the cr-algebras J·t, and JV'.

J.1®V tensor product of the measures J.1 and v defined on a product cr-algebra.

IN set of nonnegative integers.

10(a )1
a function st limsup t-~O() ___at.1 < 00

Landau's notation.

o(a)
a funetion st lim 1--)00 ___ 1_. = O. Landau's notation.
at

set of probability distributions on (n, 05-i).

Q set of rational numbers.

I. :"\
-C)
set of real numbers .

= {w; X( OJ) E A}.

= (X t ; tEA), denotes the A-marginal of the process X.

=cr(X t ; tEA}, denotes the cr-algcbra generated by X A .


set of integers.

1.1 denotes either the standard norm of a normed space or the cardinal number
of a finite set.

p-norl1l of a I3anach valued random variable, IIXlip = 1.[ IXIIlJ III'.


total variation of a signed measure.
1. General properties

In this part, we intend to present some of the most important properties of mixing
processes and random fields. For the sake of simplicity we shall omit some proofs as well as
some obvious generalizations. Our aim is to give some important asymptotic results as well as
some useful tools leading to them.

In the first section we present some of the measures of dependence between O'-fields and
their immediate properties.

In section 1.2 we present the reconstruction results for dependent random variables of
Berbee and Bradley and we give the fundamental covariance inequalities for dependent 0'-
fields. Together with the reconstruction results they are the only available basic tools of mixing
techniques.

We are then in a position to define mixing processes and mixing random fields in section
1.3. Mixing random fields may be defined in several distinct ways, only one of them will be
presented because it only involves the metric structure of the index set. We also present some
important relations linking some the various notions of mixing.

Section 1.4 is devoted to obtaining tools for the mixing theory. That is inequalities for the
moments of sums of mixing processes and mixing random fields. We give in detail a mixing
analogue of the Rosenthal inequality and we give without proofs other inequalities for moments
of sums. We also present exponential inequalities which generalize those of Hoeffding and
Bernstein to mixing cases. After that we recall the maximal inequalities given by Billingsley and
Moricz, Serfling and Stout in the case of processes, and by Moricz in the case of random fields
as well as Ottaviani's inequalities.

Finally we consider asymptotic properties of mixing structures as the central limit


theorem in section 1.5, we shall however not develop all the results already included in
Dependence in probability and statistics. a survey of recent results 1986, Birkhiiuser.
Properties: Dependence of cr-algebras 3

1.1. Dependence of a-fields

The different notions of mixing are related to underlying measures of dependence


between a-fields. More precisely let (n, .fb, IP) be a probability space and V, 'V' be two sub
a-algebras of .fb, various measures of dependence between V and 'V' have been defmed

(1) a(V, 'V') = Sup{IIP(U) IP(V) -IP(U II V)I; U E V, VE 'V'}.

(2) ~(V, 'V') = IE ess-sup{ IIP(V/V) - IP(V)I; V E 'V'}.

(3) <!>(V,'V')=SUp{IIP(V)_IP(UIIV)I;UE V,IP(U)*O, VE 'V'}.


IP(U)

(4) '!'(V,'V')=Sup{ll- :i~)~~tUE V,IP(U)*O, VE 'V',IP(V)*O}.

(5) p(V,'V')= Sup{ICorr(X, Y)I;XE L 2(V), YE L2('V')} [see (I)].

Coefficient (1) is called the strong mixing or a-mixing coefficient and (see (2)),

(I') a(V, 'V') = Sup{ICov(U, V)I; ° S; U, V S; 1, a(U) c V, a(V) c 'V'}.

Coefficient (2) is called the absolute regularity or ~-mixing coefficient, it may be rewritten as

(2')

the supremum is taken over all the partitions (Ui), (Vj) of n with Ui E V, Vj E 'V'. Set
1P'lf for the restriction of IP to the a-algrebra V. This relation may also be written as
~(V, 'V')= 1I1P'lf®lPqr -1P'lf®qrIlVar and thus ~(V, 'V') is also the supremum of

f W [dIP 'If ®IP qr - dIP 'If ®qr] for random variables W defined on the product probability
space (nxn, V ®'V') with ° S; W S; 1.

The coefficient (3) is the uniform mixing coefficient or <!>-mixing coefficient. One can prove that

(3') <!>(V,'V')=ess-sup{IIP(V/V)-IP(V)I;VE'V'}.

The coefficient (4) is the *-mixing or ,!,-mixing coefficient. It may be shown that

I Set Corr(X, Y) =~X~ with Cov(X, Y) =IEX.Y - IE X. IE Y, Var X =IEX2 - [IEX]2.


Var X VarY
2 One inequality is obvious, the other one is proved using Lemma 3 in § 1.2.2
ICov(U, V)I =~ ICov(2U-l, 2V-l)l~ 112U-llI~1I2V-1II~ a('lf, qr)~ a('lf, qr).
4 Mixing

1
(4') \jf(V, 0/) = ess-sup{- HP(V/V) - lP (V)I; V E 0/, lP (V) :f. O}.
lP(V)

Remark 1. The ranges of the previous dependence coefficients are defined by the following
inequalities
Os;a(V,o/)s;~, OS;q,(V,o/)S;I, OS;\jf(V,o/)S;oo, OS;p(V,o/)S; 1.
Moreover if one of them vanishes then the related a-fields V and 0/ are independent.

Proposition 1. The following inequalities hold

2 a( V, 0/) s f3( V, 0/) s ¢( V, 0/) s ~ 1Jf( V, 0/),


4 a(V, 0/) Sp(V, 0/) s2 ¢1I2(V, 0/) ¢1I2(0/, V) and
p( V, 0/) S 1Jf( V, 0/).

Proof. The following are obvious

2 a(V, 0/) S; ~(V, 0/) S; q,(V, 0/) S; ~\jf(V, 0/),


Considering X = lI. u , Y = l1y leads to 4 a(V, 0/) S; p(V, 0/). Indeed the variance of
X is lP (V) (1 - lP (V)) S; i. Thus,
4IlP(V) lP(V) - lP(VnV)1 = 4ICov(l1 u , l1y)1 S; ICorr(l1 u ' l1y)1.
Taking suprema in this inequality leads to the result.

The inequality p(V, 0/) S; 2 q, 1I2(V, 0/) is a consequence of Theorem 3-(1), § 1.2.2 for
the case p = q = ~. The symmetric inequality p(V, 0/) S; 2 q,112(V, 0/) q,112(0/, V) is
I
proved using the same arguments (see Peligrad (1983». Let X = I, Xi 11 u. and
i=i 1
J
Y = I, Yj l1 y . be simple V and 0/ -measurable random variables, Shwartz's inequality
j=i J
yields
ICov(X, Y)12 S; I, x~ lP(V i) [I, lP(V i) {I, Yj IlP(Vj I Vi) - lP(V j )I}2]
i i j
S; [EX 2 [I, lP(V i) (I, y]llP(Vj I Vi) - lP(Vj)I)(I, IlP(Vj I Vi) - lP(Vj)I)]
i j j
S; [E X2 [E y2 Maxi I, IlP (Vj I Vi) - lP (Vj)1 Maxj I, IlP (Vi I Vj ) - lP (V i)1
j i
Let Ai (resp. B i) be the union of those Vj for which lP(Vj I Vi) - lP(Vj);:::: 0 (resp. < 0),
I, IlP (Vj I Vi) - lP(Vj)l::; IlP(Ai I Vi) - lP(Vj)1 + IlP(Bi I Vi) - lP(Vj)1 S; 2 q,(V, 0/).
j
A symmetry argument completes the proof.•

We also recall the following result in Bradley (1986), part (b) is due to Csaki & Fischer (1963).

Theorem 1. Let Vn and o/n denote two sequences of a-fields such that ( Vn vo/n)n~J are
independent then
Properties: Dependence of a-algebras 5

~ L c( V n, 'V'n}' if c = a,
GO

(a) c( -;:; V n, -;:; 'V'n} /3, or €p.


n=l n=l n=1

(b) p( -;:; V n, -;:; 'V'n}


n=l n=l
=SUPn2:1 p(Vn, 'V'n}·
(c)

This theorem may be used in order to symmetrize mixing random variables. Let c denote the
dependence coefficient used between the a-fields generated by random variables X and Y. It is
usual to consider an independent copy (X', Y') of the random variable (X, y) in order to use
Paul Levy's symmetrization inequality relating the tails of X and X - X'. The previous result
relates the dependence coefficients associated to the a-algebras a(X), a(Y) to those associated
to the couples of a-algebras a(X, X'), a(Y, Y') or a(X - X') and a(Y - Y').

Many other dependence coefficients have been introduced. For instance, consider for a, b ;::: 0

(6) aa b('lJ, 0/) =Sup{ HP(U) ~(V) - ~(UnV)I; U E 'lJ, V E 0/, IP(U) IP(V);t:O}.
, IP (U) IP (V)

In the sequel (0, ~, IP) will always denote the underlying probability space.

Survey of the literature

The coefficient a has been introduced by Rosenblatt (1956). The ~-mixing coefficient,
introduced by Kolmogorov, first appeared in the paper by Wolkonski & Rozanov (1959).
Ibragimov (1962) introduced the coefficient cp, see also in Ibragimov & Rozanov (1978). Blum,
Hanson & Koopmans (1963) the *-mixing coefficient. Hirschfeld (1935), Gebelein (1941)
introduced the coefficient p and Kolmogorov & Rozanov (1960) defined the corresponding
dependence condition. Moreover Bradley (1983, 1985, 1987), Peligrad (1983) and Bulinskii
(1984, 1987, 1989) introduced various related measures of dependence. Among them,let us
mention also the Information Regularity Coefficients introduced Wolkonski & Rozanov (1959)
and related to the classical measure of entropy. Statuljavichus (1983) describes an almost
Markov regularity coefficient. A variant of which is later used extensively in Veijanen (1989).
Further information may be found in Bradley (1986) and Bulinskii (1984 & 1989 b). Bradley
& Bryc (1985) and Bulinskii (1987) define aa,b.

Properties of those general dependence coefficients are proved in Doob (1953),


Wolkonski & Rozanov (1959), Billingsley (1968), and Ibragimov & Linnik (1974). Hall &
Heyde (1980) and Roussas & Ioannides (1987) present a synthesis of the previously cited
works. A large amount of very precise properties is also given by the work of R.C. Bradley.
An extensive bibliography concerning mixing coefficients is given in Bradley (1986).
Properties: Basic Tools 7

1.2. Basic tools

This part presents the basic tools concerning weakly dependent a-fields. These tools are
the reconstruction techniques and the covariance inequalities. They are summed up in the two
forthcorning subsections.

1.2.1. Reconstruction techniques


Reconstruction techniques are a very powerful tool to get exponential and maximal
inequalities as we shall see all along this book. The fust direct approximation of dependent
random variables by independent ones was introduced by Berkes & Philipp (1977). Theorem 1
concerning the absolute regularity coefficient was proved in Berbee (1979). Theorem 2
concerning the strong mixing coefficient was proved in Bradley (1983 b).

Let E and F be two Polish spaces and (X, Y) some ExF-valued random variable. We shall set
P= p(a(X), a(Y)) and a = a(a(X), a(Y)) for the mixing coefficients relative to the
a-algebras generated by X and Y. We assume throughout this section that the probability space
on which X and Yare defined is rich enough to define another random variable with uniform
distribution on the interval [0,1] and independent of X and Y.

Theorem 1. A random variable y* can be defined with the same probability distribution as Y,
independent of X and such that [P(Y:;f: Y*) = (3. For some measurable function f on
ExFx[O, 1], and some uniform random variable .1 on the interval, y* takes the form
y* = f(X, Y, .1).

Lemma 1. Let T be a finite set with cardinality N and let A and /.l be two probability measures
on T. There exists a probability distribution V on T with marginals A and /.l such that
v(f(t, t)}) = A({t}) A /.l(ft}).

Proof of Lemma 1. Order the set T = I tl , ... , t N } in such a way that Ai ~ ~i for
i = 1, ... , k and Ai > ~i for i = k+1, ... , N, where Ai = A({tiD and ~i = ~({tiD for
k N
i = 1, ... , N. The relation a =I. (~i - Ai) = I. (Ai - 11) follows from the fact that the
i=1 i=k+1
total mass of a probability measure is equal to 1. We define the NXN-matrix Q = (qi)ij by
setting
if j ~ k and i > k,
if j~N,
otherwise.

Sketch of the proof of Theorem 1. Let A be a Borel set in E such that [P (X E A) > o.
We first assume that Y is atomic and let ffi = {B I , ... , B N } the set of Y's atoms. Set
8 Mixing

[P((XE A)n(Y E B i ))
T = {I, ... , N}, AA,i = and !Li = [P (Y E B i). The probability
[P(XEA)
distribution v A built as in Lemma 1 on T2 satisfies
Cb ~ 1 ~ 1[P((X E A)n(Y E B i)) - [P(XE A) [P(Y E Bi)1
c( w, A) = IIv A - II.A ®!LIIVar= 2" £.., .
i=1 [P(XEA)
Setting !D (AxB ixB j) = v A,i,j [P (X E A) one obtains a distribution on ExF2. Note that
[P((X E A)n(Y E B i)) = !D(AxBixF) and [P(XE A) [P(Y E B) = !D(AxFxB/ There
exists an F-valued random variable Y* such that (X, Y, y*) has the distribution !D. From
Lemma 1, we obtain that [P(Y"* Y*I XE A) = c( ffi, A), hence
[P (Y "* y*) = Sup L c( ffi, A k ) [P (X E A k ) = ~.
k

The supremum is considered over measurable partitions {Ak } of E. To extend the proof to non
atomic random variables Y, choose (1) a sequence ffi n of finite measurable partitions of F such
that

Setting Y n = [(Y I ffi n) construct, as previously, a sequence of atomic F-valued random


variables Y~. Some subsequence ofY~ is [P-almost surely convergent..

Remarks 1. The uniform random variable involved by Theorem 1 may be seen as a


nonatomic equivalent of the permutation implicit in Lemma 1. Measure theoretic arguments
omitted here may be found in Berbee (1979) in which the proof is completely distinct. Berbee's
notation for ~ is.1. The proof given is close to Bryc (1982) who provides alternatives to the
proofs of Berbee (1979) and Berkes & Philipp (1979) results. In the latter paper the case of
<j>-rnixing is considered. Another approach related to this result is given in Schwartz (1980),
proposition 1 and lemma 7, the information numbers yield a similar reconstruction result for
very weak Bernoulli processes.

Theorem 2. Let rand q be positive numbers. If Y is a real random variable with moments up
to order y, then a random variable y* can be defined with the same probability distribution as
Y, independent of X and such that
IP(IY - Y*I :::?q) ~ 18 (a2(lEIYlrFrt(2r+1).
q
For some measurable function f on ExFx[O, 11, and some uniform random variable .1 on the
interval, y* takes the form y* = f(X, Y, .1).

We shall only sketch the proof of this result which may be found in Bradley (1983 b). The
following lemma gives a sharp inverse bound relating ~ and a, when Y is a discrete random

I Proceed as in Bryc (1982). From the tightness of Y's distribution choose a compact set Kn such that
lP(Y;; K)::; n-I. Compactness allows to determine elements (x.I, n) of Kn such that the n-I-balls centered at
those points are a finite covering of Kn' Let J!1,n denote the union of such ball, then limn !f(Y 1J!1, n) = Y in
probability. Some subsequence of J!1, n is thus convenient.
Properties: Basic Tools 9

variable with N atoms. Note that ~:::; N a. would be an obvious bound.

Lemma 2. If the probability distribution of Y is atomic with N atoms, then f3 ::; {8N a..

Proof of Lemma 2. Szarek (1976)'s bound in Khinchin inequality leads Bradley (1983 b)
to prove that it is possible to extract subsets S c {I, ... , M} and T c {I, ... , N} from a real
valued matrix A = (aij)!Si:>M,!SjSN such that I~ ai) ;:: "3~N ~ lai} Let now B I ,· .. , BN
SxT I,J

be the atoms of Y's probability distribution. If AI"'" AM denotes a cr(X)-measurable partition


ofQ let ai,j = [P(At'lBj) - [P(Ai) [P(B/ It is enough to show that ~ ~ lai):::;" 32Na..
1 J
Hence a. ;:: 1[P(ArlB) - [peA) [P(B)I = I~ ai,jl setting A = ~ Ai and B = yBj . This together
SxT
with the previous result concludes the proof. •

Proof of Theorem 2. Using Lemma 2 and Theorem 1, Bradley proves that if HI"'" HN is
a mesurable partition of the support of Y's probability distribution in IR with [P(Y E Hi) > 0,
then it is possible to construct y* with the required properties and such that

L
N
[P(3 i; Y, y* E H i):::;.,j8N a.. For this use the discrete random variable Y I = Yi 1l{YeHi }
i=1
where Yi is chosen in Hi' Let now a be some positive real number, m be some integer and
N = 2 m + 3. Set Hi = [~+ (i - l)a, ~+ i a[ for Iii:::; m and H_ m- 1 =]-oo,-~-ma[,
a
Hm+I=[Z+ ma,+oo[.

Then [P (IY - Y*I ;:: a) :::; {8N a. + 2 [P (IYI ;:: ~ + rna). Use Markov inequality provides a
bound for the second term. Choose the best possible values a and m yields the result. •

1.2.2. Covariance inequalities


We present here the fundamental ineqUalities for the covariance of mixing random
variables.

Let X and Y be measurable random variables with respect to V and CV' respectively. Recall
that we denote IIXlip = (IE IXIP) lip for p < 00, and IIXII= = ess-sup IXI. An essential property
of the mixing coefficients defined in § 1.1. is given by the following covariance ineqUalities.

Theorem 3.
(1) ICov(X, Y)I::;8 ex.
11
reV, 0/) IIXllp IIYllq,for any p, q, r 21 and
1
p 1
+ q+ r = 1.
1

(3) ICov(X, Y)I ::; 2 cpl/P( V, 0/) IIXllp IIYllq,for any p, q 21 and ~+ ~ 1.
(4) ICov(X, Y)I::; lfI( V, 0/) IIXlljllYll1'
(5) ICov(X, Y)I ::;p(V, 0/) IIX11 2 11Y11 2 •
10 Mixing

Remark 1. For the sake of homogeneity with § 1.1, we do not give any inequality (2)
concerning ~-mixing, the power of this notion lies in the reconstruction Theorem 1, § 1.2.1.
Inequality (1) is due to W olkonski & Rozanov (1959) in the case p q 00 and to Davydov = =
(1970) in the present form. Davydov (1968) proved it in the weaker form where the constant 8
is replaced by 12. Inequality (3), due to Ibragimov (1962), may be found in Billingsley (1968,
p. 170). Inequality (4) is due to Blum, Hanson & Koopmans (1963). Inequality (5) is evident
taking in account the defInition of p.

Remark 2. Almost all of these inequalities are easily extended to separable Hilbert space
valued random variables. However, the inhomogeneous mixing condition (1) is much more
complicated to extend in this setting. This was achieved by Dehling (1983), using deep Banach
space theory arguments. The factor 8 is then replaced by 15 in Inequality (1).

Remark 3. Bulinskii replaces LP-norms by convenient Luxemburg norms (see (2)) in (1), in
Bulinskii (1987, 1990). He gets inequalities in Orlicz spaces. The loss due to the
inhomogeneity of inequality (1) is thus reduced as much as possible. That is, let ell E ff p'

'I' E
ff q for p+q=
11
1. Setting e(t) =t eIl(C
-1-1-
) 'I'(t- ) for eIl(t) = Inf{s ~ 0; eIl(s) ~ t}, the
following inequality due to Bulinskii (1987) extends inequality (1) (see (3))
ICov(X, Y)I ~ 10 8(a(V, 0/)) IIXII<f> IIYlI'f'.
We do not present those results in detail because the constraint to get better covariance
inequalities yields a big loss on the function of the mixing coefficient; see Bulinskii, Doukhan
= =
(1987). For instance if eIl(t) tP InU(Uvt) and 'I'(t) tq Inv(Yvt) are as previously for U and
Y big enough, we get e(t) = O(ln- w (Wvt)) for w = ~ + ~ and W big enough. It is worth
mentioning that before the systematic work of Bulinskii, Hermdorf (1985) used Orlicz norms
in order to prove central limit results.

Proofs. (1) We prove this result in three steps.

(i) First step : IIXlioo < 00, IIYll oo < 00.

Lemma 3. ICov(X, Y)I ~4 a(V, 0/) IIXllooIlYli oo.

Proof of Lemma 3. Let u = sign(1E (Xlo/) - IE X), v = sign(1E (YI '\.!) - IE Y) (using
notation (4)), then: ICov(X, Y)I = IIE(X IE (YI'\.!) -IEY)I::; IIXlioo IE lIE (YI'\.!) -IEYI
::; IIXlioo IE (v(1E(YI'\.!) - lEY)) ::; IIXlioo lIE (vY) - IE v IEYI
Similar arguments lead to ICov(X, Y)I::; IIXlloollYlioollEvu -lEv lEul.
Let now U+ ={u = I}, U- = {u =- I}, y+ = {v = I}, Y- = {v =- I}, then

2 Set 'fF p = (ell: IR+ ~ IR+, eIl(O) = 0, eIl;c 0 and convex, x-Pell(x) i} for p> I.
IIXII
Luxemburg norms are defined for ell e 'fF P by IIXII<f> = Inf{ t > 0; IE eIl(-t-) :s; I}.
3 In the case of separable Hilbert valued rvs we replace, in Bulinskii, Doukhan (1987), the factor 10 by 16.
4 sign(x) = -I, 0, or I according to the fact that x < 0, x = 0 or x> O.
Properties: Basic Tools 11

IlEvu - [v [ul = I[[P(U+nV+) -IP(U+)IP(V+)] + [1P(U-nV-) -IP(U-)IP(V-)]


- [IP (U+ n V-) - IP eU+)1P (V-)] - [IP (U-n V+) - IP (U-)IP (V+)] I,

Hence,l[vu-[v[ul:5:4a(V,o/).+

(ii) IIXlip < 00, IIYII~ < 00, 1<P< 00.

Define X = X 11 {IXI$;aj' ~ = X 11 {IXI>aj and write X = X+~.


Thus Icov(X,Y)I=lcov(X,Y)+cov(~,Y)I:5: 4 a(V, 0/) a IIYII~ + 211YII~ [I~I.

Now Markov's inequality leads to [I X I :5: 2 a [IXIP. Set [I XaplP = a (V , 0/) then
- aP
I-lip
Icov(X, Y)I :5: 6 a (V, 0/) IIYII~ IIXlip. +

(iii) IIXlip < 00, IIYll q < 00, ~+ ~< 1,

Define Y = Y 11 {IYI$;bj and}': = Y 11 {IYI>bj" Write Y = Y + }':, hence analogously


I-lIp
Icov(Y,X)I:5: 6 a (V, o/)bIlXllp + 2 b [I}':I.
[IYlq I-lip
Setting - - = a (V , 0/) yields
aq
Icov(X, Y)I:5: 8 al/reV, 0/) IIYll q IIXllp.+

Alternative proof for (ii) and (iii). Define X=X11{IXI$;aj and Y=Y11{IYI$;b)" Write
cov(X, Y)=cov(X, Y)+cov(X, }':)+cov(~, Y)+cov(~,}':) which follows from the

identities X = X + X and Y = Y + Y. Now Markov inequality leads, with u = P (1 - 1)


- - r
1
and v = q (1 - r)' to
- [IYlq - [IXiP
Icov(X, Y)I:5: 2 a b - - , Icov(Y,X)I:5: 2 b a - - ,
bq - aP
( IIXII IIYII )r/Cr-l)
andlcov(~,}':)I:5: 2 a b 7 T '

Hence
[IYlq [IXIP {IIXlip IIYllq}r/(r-I»)
Icov(X, Y)I :5: 2 a b ( 2 a(V, 0/) + - - + - - + -=- -b .
bq aP a

Choosing the best constant a and b in this expression would lead to a constant smaller than 8 in
. . [IYlq [ IXIP
(1). The value 81S obtamed for - - = - - = 2 a(V , 0/). +
bq aP
12 Mixing

(3) The random variables are first assumed to be simple.


I J
For instance, writing X = I, Xj 1lUj and Y = I, Yj 1lVj yields
i=l j=l
I 11 I J
IIXlip = (I, IXjl P IP(Uj») P and Cov(X, Y) = I, I, Xj Yj {IP(U/,V j) -IP(U)IP(V.)}.
~ ~~ J
I J
Thus Cov(X, Y) = I, Xj (IP(U)1/p I, Yj {IP(V/U j) -1P(V j)}(IP(U j»1/q.
i=l j=l
I J 1/
Now ICov(X, Y)I::; IIXlip (I, IP(U) II, lyjlllP(V/U j) -1P(V)llq) q.
i=l j=l
J J
But II, lyjlllP(Vj)-IP(V/Uj)llq = II, Iy} lIP (V/U) - IP(V/ 1q lIP (V/U) - IP(V/IPl q,
j=l j=l
J J
::; I, Iy}q IIP(V j) + IP(V/Uj)III, IIP(Vj)-IP(V/Uj)llq/P.
j=l j=l
J q/p
Hence ICov(X, Y)lq ::; 2 IIXII~ IIYII~ {SuplI, IIP(Vj)-IP(V/Uj)lI} .
j=l
Let now ct (resp. Ci) be the union of 0/ -sets with IP(Vj) - IP(V/U j) ;;:: 0 (resp. < 0) then
J
I, lIP (Vj ) -1P(V/Uj)1 = [1P(ct)-IP(ctlU)] - [1P(Ci)-IP(CjlU j »] ::;
2 <I>('lf, 0/),
j=l
The previous inequalities are proved for simple random variables, thus approximating the initial
random variables in LP (resp. in Lq), by simple ones leads to the desired results. The proof is
analogous for p = 1 and q = 00 • •
(4 ) The random variables are first assumed to be simple. Let for measurable partitions
I J I
lUi} and {Vi} : X = I, Xi 1lUj and Y = I, Yj 1lvj" We have IIXII 1 = I, IXillP(Ui) and
i=l j=l i=l
I J
ICov(X, Y)I = II, I, Xj Yj {1P(U)IP(Vj) -1P(U/iV)}I, hence
i=l j=l
I J
ICov(X, Y)I::; 'l'('lf, o/)I, I, IXjlly} IP(U)IP(Vj) = 'l'('lf, 0/) IIxlI, IIYII,.
i=l j=l
The previous inequalities are thus proved for simple random variables. Approximating the
initial random variables in Ll by simple ones leads to the desired result..

Survey of the literature


The first direct approximation of dependent random variables by independent ones was
introduced by Berkes & Philipp (1977). It is not mentioned here because bounds given for
strong mixing involve deeply the structure of the random variables considered. The aim there is
to introduce grouping techniques in order to prove strong invariance principles, see Philipp
(1986) for a review of this technique. Other results in the field of reconstruction are presented
in Berbee (1979) and in Bradley (1983 b), see also Bradley (1986) for further references;
moreover Bryc (1982) presents alternative short proofs of such results. Schwarz (1980) gives a
different approach to those results concerning finitely determined processes adapted to Doeblin
Properties: Basic Tools 13

Markov chains. The related paper by Bosq (1991) is also of interest. We also omit the results
by Gordin (unpublished, Vilnius conference, 1975) or Gordin (1969). He approximates
mixing sequences by martingales, having in view a CLT as in Mc Leish's result, Theorem 1, in
§ 1.5.1., see in Hall & Heyde (1980).

The previous results in § 1.2.2. may be found in the initial papers. See Blum, Hanson &
Koopmans (1963), Davydov (1970), Rosenblatt (1956), Ibragimov & Rozanov (1974) and
Wolkonski & Rozanov (1959). Additional informations may be found in Billingsley (1968), in
Doob (1953), in Hall & Heyde (1980), in Ibragimov & Linnik (1974), in Iosifescu (1980), in
Rosenblatt (1971) or in Roussas & Ioannides (1987). Bradley (1986) and Peligrad (1986)
present other measures of dependence. Bulinskii (1987) proves sharp ineqUalities in Orlicz
space. The work of Dehling (1983) extends the mixing inequality for covariances to separable
Hilbert spaces in the difficult strong mixing case; Bulinskii, Doukhan (1987) extend this result
to Orlicz spaces. Finally Rio (1994) proves a new covariance inequality (5) which does not
involve LP-norms but integrals of quantile functions with respect to the distribution generated
by the strong mixing sequence: this inequality is sharper than Davydov's (1970).

J
2et
5 Set Qx(u) =Inf{t; !P(IXI > t):5 uJ, then Icov(X, YI:5 2 Qx(u) Qy(u) du for rvs X, Y with a finite

variance and =IX(cr(X), cr(Y». A converse result is also given.


(X
Properties: Definition of Mixing 15

1.3. Mixing

Once the measures of dependence between two <i-algebras have been introduced,
various notions of mixing may be defined for general processes. A multitude of definitions of a
mixing random field can be introduced, we shall focus on the simplest in § 1.3.1. The
definition of a mixing random process proposed in § 1.3.2. is the classical one. Useful
definitions from ergodic theory are recalled in § 1.3.3. and relations between the mixing
notions for processes and fields are given in § 1.3.4.

Let X = (Xt)te T be a random field, viewed as a family of random variables indexed by


a time set T. T is a metric space with distance d. Xc = {Xt; t E C) denotes the C-marginal
of X. XC is the <i-algebra generated by Xc for C c T, and ICI the cardinal of C if it is finite.
Moreover the distance of subsets A and B will be written dCA, B). Assume that c(., .) denotes
any of the dependence coefficients previously defined as u, ~, p, <I> and 'l'. Set

v u, V E [}I *, cX(k; u, v) = Sup{ cx(A, B); dCA, B) c k, IAI ~ u, IBI ~ v}.

We shall thus write ux(k; u, v), ~x(k; u, v), Px(k; u, v), <l>x(k; u, v) or 'l'x(k; u, v).
Many other coefficients could be introduced here, depending on the stucture of the index set T.

None of them is universally known as the mixing coefficients sequence.

Let note also that cx(k; u, v) is a decreasing function with respect to k and an increasing
function with respect to u and v. We shall make use of a classical convention setting
cx(k; u, 00) = SUPy cx(k; u, v) and cx(k) = cx(k; 00, 00). The main interest of those
coefficients is perhaps the fact that they only depend on <i-fields. Thus considering Y = (Yt)
with Yt = ft(X t) instead of X = (Xt) may only make the corresponding coefficients decrease.

Another definition of mixing (see e.g. Bradley (1986» does not satisfy this property.
Let X = (Xt)teT be a real second order random field, set Lx(A) for the closure in L2(Q) of
Span{Xt; tEA}, the vector space spanned by XA . A measure of dependence of X is defined
using the linear correlation coefficient or r-mixing coefficient

V A, BeT, rx(A, B) = Sup{lcorr(U, V)I; U E Lx(A), V E Lx(B)}.

This property is a second order property, only depending on first or second order moments of
X. This mixing notion is the cosine of the angle beetween the linear spans of paste and future in
the Hilbert space L 2(Q). The inequality rx(A, B) ~ Px(A, B) is clear and a reverse inequality
holds for the gaussian case (§ 2.1.); indeed in this case the whole distribution of a random field
is determined by second order properties. Linear correlation mixing can be defined following
the same lines as for other coefficients.
16 Mixing

1.3.1. Mixing random fields


The random field X is said c-mixing - e.g. a-mixing or <j>-mixing - if
limk~~ cx(k; u, v) = 0, for any integers u, v ~ O.
The dependence of coefficients with respect to u and v is not given precisely in this definition.
In Part 2, we shall see examples such that cx(k; u, v) depends explicitely on u and v.

IAI:S;u
IB I:S;v
dCA, B) ~ k

Definition of a mixing coefficient for fields


Figure 1.3.1.

Remark 1. Bulinskii (1987) gives a wider definition for random fields indexed by :zd.

C(B)

C(A)

IAI:S;u
IBI:S;v
d(C(A), C(B» ~ k

A and B arc not


separated by
boxes
Bulinskii's definition of a mixing coefficient
Figure 1.3.2.

Using subsets A and B separated by hyperplanes, he considers the smallest parallelepiped


d
C(A) = II [ai' bi] with A c C(A). He defines cx(k; u, v) as the largest value of c(A, B)
i=!
d
where IAI:S; u, IBI:S; v and Bn II [ai - k, b i + k] = 0. If BnC(A) 'I; 0, this coefficient is
i=!
set equal to O. The examples of figure 1.3.2. proves that his definition is distinct from the
previous one. Figure 1.3.2. indeed shows that two disjoint subsets of :z2 may be separated or
Properties: Definition of Mixing 17

not in the sense of Bulinskii.

This notion is thus really weaker than the classical one given before. However this complicated
notion does not appear to be of interest to us because the current random fields used in statistics
fit our definition of mixing. This kind of unnatural generalization will thus be systematically
omitted.

An important inequality was recently proved by Bradley (1991 a). It shows that a-mixing and
p-mixing - uniform with respect to u, v - are equivalent conditions for stationary random fields
indexed by Z d for d > 1.

Theorem 1. If X = (Xt)tE Zd is a strictly stationary random field and has the mixing
property limk--7oo aX(k; 00, 00) = 0 ,then
ax(k; 00, 00) S Px(k; 00, 00) S 21r:a X (k; 00, 00).

Sketch of proof. This inequality is well known in the Gaussian case (see § 2.1) and thus it
will be proved here using a Central Limit Theorem argument. Let e > 0 be arbitrarily small real
number, then there exist A and B, two finite subsets of T separeted by a distance k with
peA, B) 2: p(k, 00, 00) - e. There also exist adapted and normalized rvs X = f(X A) and
Y = g(X B) with [XY = peA, B). Assume first that d l' 1. There is some direction e,
orthogonal to the one minimizing the distance of the subsets A and B. The shifted rvs
Xi = f(XA+I(i)e)' Yi =g(XB+I(i)e) are equally distributed and almost independent if I(i)
increases very fast to infinity with i. If d = 1 the rvs Xi and Yi take the same form but e
denotes here a positive number larger than the diameter of AuB. Thoses rvs obey a CLT and
n n
Theorem 2.1.1 may now be used with the gaussian limits of _~ I, Xi and _~ I, Y i'.
'I n i = I ' I n i=1

The following result in -Bradley (1989) shows that /3-mixing is a trivial notion in the case of
random fields indexed by Z d if dependence on the cardinality of the subsets considered is not
allowed. This will appear clearly all along the examples.

Theorem 2. If X = (Xt) tE Zd is a strictly stationary random field with


lim f3 x (k; 00, 00) = 0,
k--7 OO

then the random field is m-dependent (see (1)).

=
Sketch of proof. In fact, Bradley shows that /3 x (k; 00, 00) 0 or 1. The proof is similar
to that of Theorem 1 and we assume as well first the assumption d l' 1. It runs as follows.
Assume that /3(X(A), X(B)) > 0 for some finite subsets A and B separated at least k. There

1 A random field (Xt)tE T is said to be m-dependent if for any subsets S, S' c T, d(S, S') ;0: m implies that
X.S is independent of X. s'; this implies in turn that for all the dependence measures cX(m; u, v) = o.

If (Zn) is an independent sequence then for any finite non zero sequence (a 1, ... , am) the moving average
process Xn = a 1 Zn + ... + am Zn_m+l is m-dependent but not (m-I)-dependent.
18 Mixing

exists some direction, given by a vector e in ;Zd, such that A* = u Ai and B* = u Bi


ie;l ie;l
are also separated at least k and Ai = A + ie, Bi = B + ie. If Xl(U) and X-J(V) are non
independent events define Y i = ll(Xi.(U)nx~.(V)). Recall that ~(X(A), X(B)) appears as
1 1

the total variation on X(AuB) of the difference of the image distribution P = [Pxand the
AvB
product Q = [P X ®[P X . Under the distribution P, the sequence (Y i) is ergodic so that
A B
li~~~ ~ (Y I + ... + Y n) = P(AnB), P-a.s. The same holds for Q. This and the relation
Q(AnB) = P(A)P(B) *" P(AnB) imply P[limn~~ ~(Y 1+"'+ Y n) = P(AnB)] = 1 and

Q(li~~~ ~ (Y I + ... + Y n) = P(AnB)] = O. Thus this yields ~(X (A), X (B)) = 1.


Modifications concerning the case d = 1 follow the same lines .•

Remarks 2. The previous argument does not work in the case of mixing processes (this
notion defined for d = 1 in section § 1.3.2 does not allow to consider interlaced subsets A and
B; Example 3 provides a counterexample) neither without the mixing assumption (set Xt = B
for some binomial random variable with a small parameter). The part d=1 in the proofs of
Theorems 1 & 2 seems to be new and was suggested by an anonymous referee.

It is worth mentioning the work by Veijanen (1989) concerning partially observable


random fields. He introduces the adequate notion of conditionally mixing random fields and
random processes. The typical example is the case where X t = Y t + Zt is observed for some
mixing random field Y t and some random field Zt conditionally independent of Y t. A
conditionally mixing random field is defined as previously using now a conditional probability
with respect to some a-field instead of a fixed probability in relations § 1.1.(1)-(5).

1.3.2. Mixing processes


If T is ordered - e.g. T c !R - another notion of mixing may be defined for the process
X. Let cX,k;u,v = Sup{ cx(A, B)}; the sup is considered over A and B with IAI:5: u, IBI :5: v
and a < b + k if a E A, b E B.

The process X is said to be c-mixing if limk~~ cX,k;u,v =0 for any u, v ~ O.

In most of the interesting cases, we shall see that there is no dependence over u or v, and we
shall thus write cX,k = SUP{cX,k;u,v; u, v ~ O}.

A B
15555555 ~N~~~N~~~~~~\'<
o t+k
Definition of a mixing coefficient for processes
Figure 1.3.3.

The previous definitions are non overlapping as show the following examples.
Properties: Definition of Mixing 19

Example 1. There exist stationary processes which are a-mixing without being p-mixing,
or a-mixing without being p-mixing. Examples are given in Bradley (1981 a, 1980).

Example 2. Gaussian stationary processes yield various counterexamples proposed in


section § 2.l. In this case a-mixing and p-mixing are equivalent. An a-mixing Gaussian
stationary process which is not m-dependent is not <I>-mixing. Ibragimov & Solev (1969) give
an example of a stationary a-mixing Gaussian process which is not p-mixing. Such a process
is p-mixing but not p-mixing.

Example 3. For stationary Markov processes - see § 2.4. - Bradley (1986) states that
c n = C(Ci(X O)' Ci(X n» for the previous measures of dependence. Moreover if p, <I> or
\jI-mixing condition holds, then the decay of the corresponding sequence is geometric. A
geometrically ergodic Markov process which is not Doeblin recurrent is p-mixing and not
<I>-mixing. The simple AR-process Xn+l = ~ Xn + Zn for some independent and identically
JV'(O, I)-distributed sequence (Zn) is so. However, if the distribution of (Zn) is binomial, then
the process is not even a-mixing; see Andrews (1984). It is possible to construct a stationary
Markov process with denumerable states, such that the decay of an is less than geometric; see
Davydov (1973), examples 1 and 2 or Kesten & O'Brien (1976), corollary l. Such a process is
a-mixing and not p-mixing in view of previous results. Moreover Rosenblatt (1971) gives an
example of a p-mixing real valued stationary Markov process which fails to be p-mixing.

Example 4. A necessary and sufficient condition for Markov processes to be *-mixing is


given in Blum, Hanson & Koopmans (1963) ; see § 2.4. The *-mixing coefficient of a
»,
stationary process is defined as *(n) = 'V(Ci(Xk , k::; 0), Ci(X n it satisfies *(n) = \jIn for a
Markov process. <I>-mixing Markov sequences which are not \jI-mixing may be obtained using
°
this characterization. The rate of decay to of the mixing sequence \jIn is geometric for Markov
*-mixing processes. Athreya & Pantula (1986 b) give an example of \jI-mixing sequences with
arbitrary rate of decay of the mixing sequence \jIn'

Finally we recall the celebrated example of continuous fractions introduced by Levy, see e.g.
Billingsley (1968). Any real number x in the interval ]0, 1[ may be expanded in an unique way
in the form

x for some sequence of integers xl' x2' ..... .

t
Consider the probability distribution with the density [In 2] 1 + x] on the interval ]0, 1[. It is
shown in Philipp (1970) that the subsequent process (Xl' X 2, ... ) is 'V-mixing and that the
\jI-mixing sequence has a geometric decay.

We thus get the following diagram for mixing properties of a process


20 Mixing

* fr
~-mixing ~ a-mixing
\jI-mixing ~ <j>-mixing ~ { &
p-mixing : a-mixing

Asymptotic results concerning mixing coefficients for mixing sequences are included in the
work of Bradley (see Bradley (1986) for a review). In particular it may be shown that the
mixing sequences have very few asymptotic possibilities (that is, few possible limiting values)
under ergodic theory mixing assumptions: they equal 1 or decrease to O.

Example S. a) The following is an "almost natural" example of a process such that


limk-7= aX,k;u,v = 0 and for which we conjecture that limk-7oo aX,k *' 0 (see (2)). We
investigate the following example suggested by J. Bretagnolle. Let (€n)nE;Z and (11 n)nE;Z be
two independent sequences of independent and identically Bernoulli (3) distributed, with
1 n .
parameter 2' we set S(O) = 0, Sen) = .L. 11 i and Xn = €S(n)' Let k, u, v be mtegers, we
1=1
determine a bound for ax(k; u, v). For this let Em,r,p= {IS(r+m)-S(r)l::;p}, then
IT> (Em,r,p) ::; L (i) 2-m the sum being extended to integers i such that i=¥- for some integer
i
Ijl::; p. Stirling's formula yields, for fixed p and large m, (i) rm = _~(I +0(1)) thus
'121tm

IT> (Em r p) ::; _~ (l + 0(1» where the 0 is uniform with respect to r. Set D = lw, u+w],
, , '11tm

V = lk+u+w, k+u+v+wl then axeD, V) = SUPA,B Ih(A, B)I for


h(A,B)=IT>(XuE A,XyE B)-IT>(XuE A)IT>(XyE B). It is straightforward to see that
k+u+v
Ih(A, B)I::; 2 L
IT>(Em,w,u)' We have proved the existence of some constant C with
m=k+u+l
aX,k;u,v <
- C --Jk'
UV
thus
We conjecture that the previous bound is sharp. In order to show that limk-7~aX,k*'O it is
enough to exhibit sequences u(k) and v(k) such that for any integer k, aX,k;u(k),v(k) ~ a> O.

b) Bulinskii and Zhurbenko (1976) gave such an example of random field for which
limk-7~ ax(k; u, v) = 0 and ax(k; =, =) = ~. This example is also based on Bernoulli
sequences. Let (11 n)nE;Z be as before an independent and identically distributed sequence of
Bernoulli random variables, they set Xn(k) = 11 a(k) 11 b(k) 11 c(k)' where a(k) = 4 k-I ,
b(k) = 2 4 k- l , c(k) = 3 4 k- 1, n(k) = 4k - 1 for k> 1 and else, a(l) = 0, b(1) = 1,
c(I) = 2, n(I) = 3, and Xn = 11n if n is not an element of the sequence n(k). Here,

2 As a matter of fact, part 2 is mainly devoted to give useful examples of processes such that
limk .... ~ c X•k = O.
3 That is : IP (e n = I) = IP (e n = - I) = 12 and IP (T] n = 1) = IP (T] n = - I) = 12 .
Properties: Definition of Mixing 21

ux(k; u, v) =0 if k> 2 uvv.

1.3.3. Weak conditions for processes


We recall here some definitions from classical ergodic theory which will be useful in the sequel.

Let X = (X_n)_{n∈Z} be a stationary process; for convenience assume that the underlying
probability space is the canonical space R^Z equipped with the probability P defining the
distribution of X on the Borel σ-field B of R^Z. Let θ be the shift operator defined by
(θX)_n = X_{n+1}.
X = (X_n)_{n∈Z} is mixing in the sense of ergodic theory if for any events A, B in B,
lim_{n→∞} P(A ∩ θ^n B) = P(A) P(B).
X = (X_n)_{n∈Z} is ergodic if the σ-field of invariant events is trivial [if the event A is invariant
under θ then P(A) = 0 or P(A) = 1].

Let B(R^{Z−}) denote the Borel σ-field on R^{Z−}, the set of bilateral sequences vanishing for
nonnegative values of the time index. X = (X_n)_{n∈Z} is uniformly ergodic if the following
holds: lim_{n→∞} sup_{A,B ∈ B(R^{Z−})} | (1/n) Σ_{k=1}^{n} P(A ∩ θ^k B) − P(A) P(B) | = 0.
X = (X_n)_{n∈Z} is regular if the tail σ-field X_{−∞} = ∩_t X_{]−∞,t]} is trivial.
Now mixing in the sense of ergodic theory implies ergodicity. Regularity implies mixing in the
sense of ergodic theory. Strong mixing implies regularity as well as uniform ergodicity.
Uniform ergodicity implies ergodicity, and a mixing uniformly ergodic stationary process is
strongly mixing (Rosenblatt, 1972).

We now introduce a terminology used mainly in the case of Markov processes (see
§ 2.4.). Let N_n(A) = Σ_{k=0}^{n−1} 1_A(X_k) be the time spent by the process X in the measurable set A
until time n. This defines a measure by setting N_n(f) = Σ_{k=0}^{n−1} f(X_k) for any random variable f
on (E, E). The process X is said to be recurrent if there exists a σ-finite nonnegative measure μ
on (E, E) such that for any μ-integrable bounded random variables f and g with ∫ g(x) μ(dx) ≠ 0
we have the following ergodic theorem

N_n(f)/N_n(g) → ∫ f(x) μ(dx) / ∫ g(x) μ(dx)   a.s. as n → ∞.

The process X is said to be positive recurrent if μ is bounded and null recurrent otherwise; μ is
called a stationary distribution in the positive recurrent case.
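As an informal illustration (ours, not from the text), the ratio ergodic theorem above can be checked numerically on a positive recurrent Gaussian AR(1) chain, whose stationary law is N(0, 1/(1−a²)); all names and parameters below are our own choices.

```python
import math, random

def ar1_occupation_ratio(n=200_000, a=0.5, seed=1):
    # Positive recurrent chain X_{k+1} = a X_k + xi_{k+1}, stationary law N(0, 1/(1 - a^2)).
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - a * a))   # start at stationarity
    f = lambda y: 1.0 if y > 0 else 0.0                # f = indicator of ]0, +inf[
    g = lambda y: 1.0                                  # g = 1, so N_n(g) = n
    nf = ng = 0.0
    for _ in range(n):
        nf += f(x)
        ng += g(x)
        x = a * x + rng.gauss(0.0, 1.0)
    return nf / ng                                     # should approach mu(f)/mu(g) = 1/2

print(ar1_occupation_ratio())
```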

1.3.4. Miscellany
Note that c_{X,k;u,v} ≤ c_X(k; u, v); thus a mixing field on the index set N or Z is always
a mixing sequence. Very interesting converse results follow from Takahata (1986).

Theorem 3. Let (X_t)_{t∈T} be a random process indexed by some subset T of R; then
α_X(k; u, v) ≤ 3 (v − 1) β_{X,k;u∨v,u∨v},   φ_X(k; u, v) ≤ 3 (v − 1) ψ_{X,k;u∨v,u∨v}.
A β-mixing (resp. ψ-mixing) random sequence is an α-mixing (resp. φ-mixing) random field
(see (4)).

Sketch of proof. Takahata (1986) proves the inequalities α_X(A, B) ≤ 3 β_{X,k;u,v},
φ_X(A, B) ≤ 3 ψ_{X,k;u,v} for sets A = [a, a'[ and B = [b, b'[ ∪ [c, c'[ with b ≤ b',
b' + k ≤ a ≤ a' ≤ c − k, c ≤ c' and u = Max{b'−b, a'−a}, v = Max{a'−a, c'−c}.
For the general case B may be decomposed as the union of c(A, B) components (B_i) such that
for any i the convex envelope of B_i does not intersect A. A factor c(A, B) − 1 appears in
Theorem 3, and v is obviously an upper bound of c(A, B) over the class of sets B with cardinal v
(see (5)).

From now on we shall essentially focus on the strong mixing properties of processes
and fields. Such properties are the weakest that can be used in statistics, even if they are much
stronger than those generally used in ergodic theory. Indeed, in order to construct tests of
hypotheses, statisticians need limit theorems in distribution.

Survey of the literature

Authors cited in the previous chapters have naturally defined the corresponding notions of
mixing structures as it may be seen by reading the papers or the books (6) by Billingsley
(1968), Blum, Hanson & Koopmans (1963), Bradley & Bryc (1985), Bradley (1983, 1986,
1987), Bulinskii (1989), Doob (1953), Gebelein (1941) and Hirschfeld (1935), Ibragimov &
Linnik (1971) [they detail the links between the various notions in ergodic theory and mixing],
Ibragimov (1962, 1975), Ibragimov & Rozanov (1978), Iosifescu (1980), Peligrad (1983,
1986), Rosenblatt (1956, 1985), Roussas & Ioannides (1987), Wolkonski & Rozanov (1959).
A detailed description of examples in Davydov (1973) and in Herrndorf (1983) is given in
Nahapetian (1991). Linear correlation coefficients are not intrinsic and the present exposition is
short; for this see Bradley (1986) and Bulinskii (1989) and Remarks 1.5.1 & 1.5.2 or
§ 2.1.2., moreover Bradley (1985) relates p-mixing with strong mixing.

Takahata (1986) relates the mixing properties of fields to those of processes. Bradley (1986)
and Bulinskii (1989) give various examples proving that the notions presented are sharp in the
sense that it is possible to construct processes satisfying a notion of mixing and not another,
subject to the natural restrictions given by the Proposition 1.1.1.

I think it is also important to recall here some alternative ways of defining mixing, introduced in
Gastwirth & Rubin (1975) (see also Hall & Heyde (1980)); they involve norms of linear
operators, following Rosenblatt (1973). Withers (1981) is interested in linear measurable

4 According to the definition in § 1.3.1.

5 In the factor v − 1, v may be replaced by the number, N, of components of the set B; that means, setting
w = Max{Card A, Card B}:
α_X(A, B) ≤ 3 (N − 1) β_{X,k;w,w},   φ_X(A, B) ≤ 3 (N − 1) ψ_{X,k;w,w}.
6 Listed in alphabetical order.

functionals of the initial process. Mac Leish (1975) introduced a generalization of martingales
integrating the mixing processes. Finally, in view of statistical applications, we refer to Duflo
(1990) and Meyn & Tweedie (1992). The ergodicity assumption is replaced by the weaker one
of stability for Markov processes.

1.4. Tools

This chapter is devoted to giving and/or recalling the most important tools known in the
field of mixing theory. All of them follow from the results in § 1.2. Three subsections include
respectively:

Analogues of Rosenthal inequalities for the moments of sums of random variables
taken from a mixing random field or from a mixing random sequence.

Analogues for mixing random variables of the Bernstein and Hoeffding inequalities.

Analogues of Kolmogorov's and Ottaviani's maximal inequalities for sums of mixing
random variables.

1.4.1. Moment inequalities


In this section we give analogues to Rosenthal inequalities for moments of the sum of
random variables taken from a mixing random field or from a mixing random sequence. The
main results are concerned with the a-mixing case while the corresponding results for the
<j>-mixing case are given in Remark 4.

Related moment inequalities are reviewed at the end of this section.

The main interest of Rosenthal moment inequalities is their use for triangular arrays;
they lead to evaluations of the oscillation of empirical processes (see for instance Doukhan,
Portal (1987) or Massart (1987)) and give the right bound for integrated moments of
nonparametric estimators (see for instance Doukhan, Portal (1983), Doukhan & Bulinskii (1987),
or Doukhan (1991)).

Let (Y_t)_{t∈T} be a finite family of real random variables; we shall use the following
notations:
D(τ, ε, T) = L(τ, 0, T) if 0 < τ ≤ 1, ε ≥ 0, and, for ε > 0,
D(τ, ε, T) = L(τ, ε, T) if 1 < τ ≤ 2,
D(τ, ε, T) = Max{ L(τ, ε, T), [L(2, ε, T)]^{τ/2} } if τ > 2,
with L(μ, ε, T) = Σ_{t∈T} (E|Y_t|^{μ+ε})^{μ/(μ+ε)} = Σ_{t∈T} ||Y_t||_{μ+ε}^{μ}.

Example 1. Set M_{μ,ε} = sup_{t∈T} ||Y_t||_{μ+ε}^{μ}; then L(μ, ε, T) ≤ |T| M_{μ,ε}. Thus, in this case,
D(τ, ε, T) ≤ Max{ |T| M_{τ,ε}, |T|^{τ/2} M_{2,ε}^{τ/2} }. The second term has order |T|^{τ/2} M_{2,ε}^{τ/2} in the
stationary situation. Let us give precise bounds for these expressions in the special case of
triangular arrays Y_t = f_n(X_t) and T = {1, ..., n}. Assume that (X_t)_{t∈Z} is a stationary
process; then D(τ, ε, T) ≤ Max{ n M_{τ,ε}, n^{τ/2} M_{2,ε}^{τ/2} }. Here M_{μ,ε} = ||f_n(X_0)||_{μ+ε}^{μ} may vary
with n in such a way that this bound is sharper than the simpler one D(τ, ε, T) ≤ n^{τ/2} M_{τ,ε}.
Let f be a density function on the real line and f_n(x) = f(n^a x); then ||f_n(X_0)||_{μ+ε}^{μ} = O(n^{−aμ/(μ+ε)})
if X_0 has a bounded density. Different choices of a yield very different behaviours of D(τ, ε, T).

This example comes from kernel estimation; it shows the interest of such a complicated
formulation of the forthcoming results.
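As a purely illustrative sketch (ours, not from the text), the quantities L(μ, ε, T) and D(τ, ε, T) defined above can be approximated from simulated data, the moment norms being estimated by Monte Carlo; all names are ours.

```python
import numpy as np

def L(mu, eps, Y):
    """L(mu, eps, T) = sum_t (E|Y_t|^(mu+eps))^(mu/(mu+eps)); Y has shape (replicates, |T|)."""
    moments = np.mean(np.abs(Y) ** (mu + eps), axis=0)      # E|Y_t|^(mu+eps), one value per t
    return float(np.sum(moments ** (mu / (mu + eps))))

def D(tau, eps, Y):
    """D(tau, eps, T) as defined in the text, with the case distinction on tau."""
    if tau <= 1:
        return L(tau, 0.0, Y)
    if tau <= 2:
        return L(tau, eps, Y)
    return max(L(tau, eps, Y), L(2.0, eps, Y) ** (tau / 2.0))

rng = np.random.default_rng(0)
Y = rng.standard_normal((10_000, 50))          # 50 variables, 10_000 Monte Carlo replicates
print(D(4.0, 0.5, Y))
```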

Assume that the following mixing condition on Y holds:

(1)  ∃ ε > 0, ∃ c ∈ 2N, c ≥ τ, ∀ u, v ∈ N*, u + v ≤ c, u, v ≥ 2,
     Σ_{r=1}^{∞} s_r b_r^{c−u} [α_Y(r; u, v)]^{ε/(c+ε)} < ∞.

Here s_r and b_r denote the maximal value of the cardinal of a ring with thickness 1 and
radius r, and of a ball with radius r, in the metric of the index set I of Y. Moreover condition (1)
will be empty for τ ≤ 1 and then we shall set c = 0.

For the case of Z^d consider the metric |z| = max_i |z_i| for z = (z_1, ..., z_d) in Z^d. It is easy to
see that there are constants θ_d and σ_d > 0, only depending on d, with b_r ≤ θ_d r^d and s_r ≤ σ_d r^{d−1}.

Expression (1) is rewritten as

(1')  Σ_{r=1}^{∞} (r+1)^{d(c−u+1)−1} [α_Y(r; u, v)]^{ε/(c+ε)} < ∞.

Theorem 1. Assume the previous strong mixing assumption (1) for some τ, ε > 0 and let c
be the smallest even integer such that c ≥ τ. Let T be any finite subset of Z^d. If Y_t belongs to
L^{τ+ε} and is centered for t ∈ T, then there is a constant C only depending on τ and on the
mixing coefficients of Y, α_Y(r; u, v), for u + v ≤ c, such that

E|Σ_{t∈T} Y_t|^τ ≤ C D(τ, ε, T).

The case of processes is analogous (1). In fact the strong mixing random fields with
index set Z are strong mixing random processes. However, the order structure of the line
yields a similar result. Consider the mixing assumption

(2)  ∃ ε > 0, ∃ c ∈ 2N, c ≥ τ :  Σ_{r=1}^{∞} (r+1)^{c−2} [α_{Y,r}]^{ε/(c+ε)} < ∞.

Theorem 2. Assuming the previous strong mixing assumption (2), let T be a finite subset of
N such that if t ∈ T, then Y_t belongs to L^{τ+ε} and is centered. Then there is some constant C
only depending on τ and on the mixing coefficients of Y, α_{Y,r}, such that

E|Σ_{t∈T} Y_t|^τ ≤ C D(τ, ε, T).
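For intuition only, here is a small Monte Carlo sketch (ours) comparing E|Σ Y_t|^τ with D(τ, ε, T) for a geometrically strong mixing AR(1) sequence; the printed ratio is merely an empirical analogue of the constant C of Theorem 2, not the constant itself.

```python
import numpy as np

def ar1_paths(reps, n, a=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(reps) / np.sqrt(1 - a * a)      # stationary start
    out = np.empty((reps, n))
    for t in range(n):
        x = a * x + rng.standard_normal(reps)
        out[:, t] = x
    return out

tau, eps = 4.0, 0.5
Y = ar1_paths(20_000, 100)                                   # centered stationary AR(1) sample
lhs = np.mean(np.abs(Y.sum(axis=1)) ** tau)                  # E|sum_t Y_t|^tau
L2 = np.sum(np.mean(np.abs(Y) ** (2 + eps), axis=0) ** (2 / (2 + eps)))
Ltau = np.sum(np.mean(np.abs(Y) ** (tau + eps), axis=0) ** (tau / (tau + eps)))
D = max(Ltau, L2 ** (tau / 2))                               # D(tau, eps, T) for tau > 2
print(lhs, D, lhs / D)
```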

Remarks 1. For 0 < τ ≤ 1 the inequality (x + y)^τ ≤ x^τ + y^τ holds for x, y ≥ 0. It
proves the previous results in this case, with C = 1 and c = 0, and this inequality has
nothing to do with mixing. Otherwise the previous results are a consequence of Uteev (1984),
Bulinskii, Doukhan (1990), Doukhan, León & Portal (1984), and Doukhan, Portal (1983,
1987) for τ > 1, and are proved below. They are extensions of the Rosenthal inequality. That
is also the case of the weaker and partial results in Dasgupta (1988). This work is perhaps the
first one of this kind (2).

In the independent case, the result in Theorems 1 and 2 is called Rosenthal's inequality; see
Petrov (1975) or Hall & Heyde (1980). It holds with ε = 0 and it is optimal. The only loss
here is the fact that ε > 0. Moreover, a sharp constant C with order τ^{τ/2} is given in this case.

The following interpolation lemma, due to Uteev (1984), will allow us to consider only the case
where τ is an even integer. Remark that, contrary to Uteev, we do not allow here the value
ε = 0 (3).

1 In fact the strong mixing random fields with index set Z are strong mixing random processes, see
Theorem 1.3.3.

Let F = (F_i)_{1≤i≤n} be a family of sub-σ-algebras of A, and let B be a separable Banach space. A
family of centered random variables η = (η_i)_{1≤i≤n}, defined on a space (Ω, A, P), is said to be
(F, B)-adapted if the random variable η_i is B-valued and F_i-measurable.

Set, in view of the following interpolation lemma,

M(ν, δ, η) = Σ_{i=1}^{n} (E||η_i||^{ν+δ})^{ν/(ν+δ)},
Q(ν, δ, η) = M(ν, δ, η) for 1 ≤ ν ≤ 2 and
Q(ν, δ, η) = Max{ M(ν, δ, η), (M(2, δ, η))^{ν/2} } for ν > 2.

Lemma 1. Assume that, for some ν ≥ 1 and a fixed constant c, any family η = (η_i)_{1≤i≤n},
centered and (F, B)-adapted, satisfies E||Σ_{i=1}^{n} η_i||^ν ≤ c Q(ν, δ, η).

Then if δ > 0, for any t ≤ ν, there is a constant C = C(c, ν, δ, t) = c 2^{4ν((ν−t)/δ+1)} such
that any centered family φ = (φ_i)_{1≤i≤n}, (F, B)-adapted, satisfies

E||Σ_{i=1}^{n} φ_i||^t ≤ C Q(t, δ, φ).

Proof of Lemma 1. Set Q = Q(t, δ, φ), y = Q^{1/t}, φ_i = ψ_i + η_i, ψ_i = T_i − E T_i,
η_i = Y_i − E Y_i, with T_i = φ_i 1{||φ_i|| ≤ y}, Y_i = φ_i 1{||φ_i|| > y} for i = 1, ..., n. The
convexity of x ↦ x^t yields

E||Σ_{i=1}^{n} φ_i||^t ≤ 2^{t−1} ( E||Σ_{i=1}^{n} ψ_i||^t + E||Σ_{i=1}^{n} η_i||^t ),

E||Σ_{i=1}^{n} η_i||^t ≤ E ( Σ_{i=1}^{n} ||η_i||^{t/ν} )^{ν}.
2 An attentive reading shows that the manuscript was first proposed in 1982.
3 This point was not clear in the proof by Uteev and the present proof comes from personal discussions with
A. Bulinskii.

(L
n n n
[ilL ll j lll::;; 2 - I (
V
[IL {lIlljlltlv - [lllllV} IV + [Hlll V f),
j=1 j=1 j=1
n n

[ilL \jfl::;; ([IlL \jfjln l/V


j=1 j=1
Let I; = (I; j) I ~j~n be the (F, B)-adapted centered sequence defined by
I;i = z(lItJ jlltlv - [lIlljIIIlV) for some z E B with IIzll = 1, satisfying the assumption of the
n

Lemma. Set U=(cQ(v,O,\jf)t!v, V=cQ(v,o,l;) and W=(L[lIll jll tlv r,then


j=1
n
[IIL<I> jlll::;; 2 1- 1 U + 2 1+V - 1 (V + W).
j=1
We thus have to estimate these three terms.

u) M(v, 0, \jf) = L
n
([II\jfjllv+ Of /(v+O)::;; 2 v Ln ([ IITl(v+O)t(U(V+O)
i=1 i=1
h
were · u = v(t+o)
we d e f me --> _ 1 so t h at rL
IL
liT iIIU(v+O) <
_ Y(u(V+O)-(I+O» rL
IL
11m
't'j 111+0 an d
t(v+o)
M(v, 0, \jf) ::;; 2 v ( Q(t, 0, <1»
) vlt
.

This inequality yields, considering the different cases, Q(v, 0, \jf) ::;; 2 v Q vii.
o II v+o v/(v+o) l(v+O)/v
L L
tI 0 v/(v+o)
v) M(v, 0,1;) = ([llIllili v - [lIll jll VI) ::;; 2 v ([lIll jll ) and
i=1 j=1
v+t n
analogously, M(v, 0, 1;)::;;2 L ([IIY jlt+o)
l)
11(1+0)
::;;c2
v+t
Q.
i=1

w)
)V ( 0 tI )V
W = ( L [Hlllv ::;; 2 v L ([HYjll) v .
n

j=1 j=1
t ~v-0 yields ([IIY jll)tlv = ([P(II<1>jll>y) [(II<1>jll/ll<l>jll > y»tlv.
tlV tlv tlv-I t
We get from Markov inequality ( [IIYjll ) ::;; [P (lI<1>jll>y)y ([(II<1>jll I II <l>j II > y» .
tlV tlv IIv-1 I
Jensen inequality implies ( [IIYjll ) ::;; [P (lI<1>jll > y) y [(lI<I>ill/ll<l>jll > y).
tlV tlv-l tlv-I I
Thelastinequalityisrewrittenas ( [IIYjll ) ::;;[P (lI<1>jll>y)y [(lI<I>illll{lhhll>y})'
tlV tlv-l+O/(t+O) tlv-t 1+0 tI(t+ll)
We obtain from Holder inequality ( [IIYjll ) ::;; [P (lI<1>jll>y)y ([II<1>jll) .
( ) IIv t/v-I t+o t/(l+o)
Now t/v-l+o/(t+o) ~ 0 leads to, [IIYjll ::;; y ([ II<I>jll) for i such that
[P(II<1>jll > y) 7= 0; else this inequality is trivial. Thus'W::;; 2 v Q.
Add the previous inequalities to get the lemma with CCt) = c 24v, if t ~ v - O. For general
t < v, if k ~ v-t is some integer, we get by recurrence CCt) = c 24vk. A suitable constant is
o
thus C = c 24v «v-I)/0+1) .•

The following computational lemma will also be very useful for our purpose, it is an
extension of previous results in Doukhan, Portal (1983,1987) or in Doukhan (1992).

Lemma 2. Set D_c = D(c, ε, T) = L(c, ε, T) ∨ [L(2, ε, T)]^{c/2} for ε > 0 and c ≥ 2 an
even integer; then D_a D_b ≤ D_{a+b}.

Y
Proof of Lemma 2. Replacing Y t by Lt leads to Lc = 1, Dc = D(c, e, T) = Lc v 1
c
with Lc = L(c, e, T), for e > 0 and where c;:: 2 is an integer. Note that
Da Db = La Lb v La V Lb V 1 is the maximum value of four terms. Set c = a + b. We
have to bound La by some function of Lc' In order to do this, note that HOlder's inequality
. r 'th (c+E)(a-2)
Imp les, WI u = C _ 2 ,v =
(2+E)(c-a)
c _2
[E IYl a+E ::; ([E IYlc+E)uJ(C+E) ([E IYI 2+E)y/(2+E).
ua a a-2 C+E ~ .sI(l-r). a c-a 2+E
Set r = c(a+E) = = =
C c-2 a+E' and A £." IIY ~12+E ' with s 2: c-2 a+E' It follows
teT
that La::; A (LJ.
s a
Remark that l-r;:: I to deduce that A::; 1 and r::; c so that La ::; (Lc) ~
. Thus
Da Db::; La+b V 1 = D a+b, concluding the proof of this lemma.•

Proof of Theorem 1. Write first [ElL Y tiC::;


teT 'teTC
L
I[E Y t,"'Y lei = Ac(T). Fix tc = s

in T and assume that it realizes the maximum distance to the other points tl ,... , tc_1
c-l

Ac(T)::; c Lu=l L LT Bc(s, u), where Bc(s, u) ='teT(s,c-u,r)-u


r~l SF
L I[E Y~ Ytr .. Y Ie I
and = {'t = (tl, ... ,ty) E (T\{s})Y; diam{t1, ... ,ty} ::; r, d(s, {tl, ... ,ty}) = r}.
T(s, v, r)

L et v = -, = c+e
-, an d r~ =-.
c+e J.l e The IDlxmg
. . .mequality lor covanances YleIds
C • •

uc-u c+e

I[EY~Y~'''YIe_) ::; I[EY~II[EYt""YIe_ul + 8(<xy(r;u,c-u»s IIY~lIv IIY~'''YIe-uIlW


Now L L
'teT(s,c-u,r)
r~l
I[EY~II[EY~"'YIe_ul::;[EIYt LI[EYt,"YIe_ul=[EIYsluAc_u(T)
'teTC-U
implies with the previous inequality
c-l c-l
L
u=l r~l
L I[E Y~II[EY~'''YIe_)::; Lu=l Ac_u(T) (L
L Lse T 'teT(s,c-u,r) seT
[EIYl)·

Set Mu(T) = L ([E IYl+E)u1(u+E). HOlder inequality yields the bound


seT

Also

u c-u
1I~lIv IIY~ ... Y Ic-}Il::; IIYA:+£ rrIlY~lIc+E'
i=!
Now for each 't = (t!, ... ,tc-d consider the index s for which IIY JI C+Etakes its maximal value to
see that

L
SE T
L IIYslI~+E rrllY~
~ET(s.c-u.r) i=!
IIC+E::;
C-u-! C c-u-l
::; 2 (c - u) srbr I, IIYsll c+E::; 2(c - u) srbr Mc(T),
SET

Use now the previous inequalities to get


c-!
Ac(T) ::; c I, (Ac_u(T) Mu(T) + 2CcM c(T»).
u=!
Lemma 2 implies the result for even integers, Lemma I and Remark I extend it to arbitrary real
exponents. +

Proof of Theorem 2. Let T = {I, 2, ... , n}, and assume first that c = 2 q ;:: 2 is an
even integer in assumption (2) and denote by C the bound for the sum of the corresponding
series. Note that

Let r = r(t!, ... ,tc) be the largest interval among successive points in the sequence {t!, ... ,tc },
c+e c+e e
r = tm+! - tm, (r = Max!$i<c (ti+1 - ti»' we set v = m' 11 = c-m and 1;, = -.
c+e
Using the
Davydov inequality for covariance and HOlder inequality yields

Now use repeatedly the previous inequality to see that


~ c-!
Ac ::; 8 C £. IIY tll~+E + I, Cr:;) Am A c_m.
!$t$n m=!

Lemma 2 implies the result for even integers, and Lemma I extends it to arbitrary nonnegative
real exponents. +

Extensions or related results

Remark 2. Assume that ||X_t||_{τ+ε} ≤ M; then D(τ, ε, T) ≤ n^{τ/2} M^τ, and if the mixing assumption
Σ_{r=1}^{∞} (r+1)^{τ/2−1} [α_X(r)]^{ε/(τ+ε)} < ∞ holds then Yokoyama (1980) has shown that there is a
constant K not depending on moments of X with E|X_1 + ... + X_n|^τ ≤ K M^τ n^{τ/2}. This result
relaxes the previous mixing assumption; however the inequality obtained does not have the form
of a Rosenthal inequality.

In the case where the random variables are bounded by 1, Yokoyama (1980) shows that
E|X_1 + ... + X_n|^τ ≤ K n^{τ/2} (the author gives there a distinct interpolation argument) under

the weaker mixing assumption Σ_{r=1}^{∞} (r+1)^{τ/2−1} α_X(r) < ∞. Extensions of this result may be
found in Nahapetian (1991).

In the case where the random variables are bounded by 1, we use the inequality employed
in the proof of Theorems 1 and 2, with different values of (ζ, μ, ν), to get the following result.

Theorem 3. Set now R(c, ε, T) = Max{ Σ_{t∈T} E|X_t|^c, D(c, ε, T) }; then we have, for any
even integer c,

E|Σ_{t∈T} X_t|^c ≤ const · R(c, ε, T).

Here X is either a random field such that Σ_{r=1}^{∞} (r+1)^{cd−du+d−1} [α_X(r; u, v)]^{ε/(2+ε)} < ∞ for u, v ≥ 2
with u + v ≤ c,

or X is a random process such that Σ_{r=1}^{∞} (r+1)^{c−2} [α_X(r)]^{ε/(2+ε)} < ∞.

All the results indicated for random sequences naturally hold for random fields via evident
modifications of the assumptions.

Remark 3. If Y is a vector valued random field with 't-order moments and values in a
separable Hilbert space H, Theorem 2 holds using Dehling's inequalities for covariances (in
Dehling (1983» and the development given in Doukhan LeOn & Portal (1984)

lEILY ll2C ~ LllE(Yll,Y~) (Y\"Y~)I=A2C(T).


...
leT <J,'teTC
For a random strong mixing sequence we do not get the same result. If ~-mixing holds
Takahata's (1986) result [see Theorem 1.3.3] also implies the previous inequality. We indicate
a technique to extend such results, it was communicated by S. Uteev. Let (ek) be an
orthonormal denumerable basis of H, and ~ = (~k) a Gaussian white noise independent of Y.

Write Yl = L ~ek' Note that Yl = L Y~ ~k satisfies lEI L Y f LY f/lEl~I't and


= lE I
k k leT leT

that for any 'Y:::; 't + E, [ IY t = [Iy t / [ I~ II Y• Thus it is easily shown that

[II, yt:::; c ITI~/2 MaxtETlyg+E' extending Yokoyama's result to Hilbert space valued
lET
random variables.

Remark 4. Let Y be a φ-mixing stationary and centered random sequence with finite τ-order
moments and such that Σ_{r≥1} φ_r^{ε/(2+ε)} < ∞. Ibragimov (1962) proved the inequality

E|Y_1 + ... + Y_n|^τ ≤ K n^{τ/2};

the mixing assumption on the convergence rate of the mixing sequence is very weak because
the author shows that for ρ-mixing sequences
E|Y_1 + ... + Y_n|^{2+δ} ≤ K (E|Y_1 + ... + Y_n|²)^{(2+δ)/2}.

Yokoyama (1980) shows that such a result does not directly hold for the strong mixing case.

However all the results proved here in the α-mixing case extend to the φ-mixing case for
even integers, changing only the corresponding assumptions, putting now ε = 0 and using
mixing assumptions analogous to those of Theorems 1 & 2:

Σ_{r=1}^{∞} (r+1)^{cd−du+d−1} [φ_X(r; u, v)]^{1/c} < ∞   (u, v ≥ 2, u + v ≤ c),

Σ_{r=1}^{∞} (r+1)^{c−2} [φ(r)]^{1/c} < ∞.

The case of bounded random variables given in Theorem 3 now requires, for u, v ≥ 2,
u + v ≤ c,

Σ_{r=1}^{∞} (r+1)^{cd−du+d−1} [φ_X(r; u, v)]^{1/2} < ∞,

Σ_{r=1}^{∞} (r+1)^{c−2} [φ(r)]^{1/2} < ∞.

We do not know how to extend this result to arbitrary exponents c ≥ 1 because Uteev's
interpolation lemma does not work for δ = 0. The same problem is true for Rosenthal
inequalities in Orlicz spaces when strong mixing holds. Bulinskii & Doukhan (1987) prove
such an inequality only for even numbers.

Applications to W.L.L.N.
Remark 5. If Y is a stationary and centered random sequence, the WLLN holds with the rate
n^{−τ/2} for processes with (τ+ε)-order finite moments.

Remark 6. If X is a stationary random sequence, under the previous mixing assumptions,
f_n(x) = (1/n) Σ_{r=1}^{n} k^{(n)}(x − X_r) estimates the density function of the marginal distribution of the
process X. Assume that k is a compactly supported function with integral 1 on the real line. The
bias of such estimates converges to 0, uniformly over compact subsets of R, under a regularity
assumption on f. The variance of this estimate is controlled using Theorem 2 applied to the

adapted random variables Yn(x) = ~(n)(x-Xn) - [~(nlx-Xn)' The previous results show
n
that Sn(x) =I, Yr(x) has 't-th order moments with order n m~(l-fi(HE))(n); in the case of
r=1
<j>-mixing set e = 0 for even integers 't under a suitable mixing assumption. The loss is
decreasing for increasing values of e. A generalization is given for any moment of such
estimates in Doukhan (1992).
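To fix ideas, here is a small self-contained sketch (ours, not from the text) of a kernel density estimate of the marginal density of a mixing AR(1) sequence, using a compactly supported kernel of integral 1; the bandwidth choice and all names are our own assumptions.

```python
import numpy as np

def ar1(n, a=0.6, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = rng.standard_normal() / np.sqrt(1 - a * a)
    for t in range(1, n):
        x[t] = a * x[t - 1] + rng.standard_normal()
    return x

def kernel_density(data, grid, h):
    # Epanechnikov kernel k(u) = 0.75 (1 - u^2) on [-1, 1]: compact support, integral 1.
    u = (grid[:, None] - data[None, :]) / h
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return k.mean(axis=1) / h          # f_n(x) = (1/(n h)) sum_r k((x - X_r)/h)

X = ar1(5000)
grid = np.linspace(-4, 4, 81)
fhat = kernel_density(X, grid, h=5000 ** (-1 / 5))
print(grid[40], fhat[40])              # estimate near x = 0
```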

1.4.2. Exponential inequalities


We shall first introduce results which give an idea of two representative and distinct
ways to get exponential inequalities analogous to those given in the independent case:
first, the optimization of Rosenthal moment inequalities (or the use of bounds for cumulant
sums) and the direct approach based on grouping techniques and exponential inequalities for
independent random variables; afterwards we give a result based on the reconstruction
technique. Recall first Bernstein's and Hoeffding's inequalities. For any independent and
identically distributed sequence of random variables (X_t) centered at expectation, with |X_t| ≤ 1
and E|X_t|² ≤ σ², we have (see for instance Pollard (1984, p. 192 & 194))

P(|Σ_{t=1}^{n} X_t| ≥ x √n) ≤ 2 exp{ − x² / (2 (σ² + x/(3√n))) }   (Bernstein),

P(|Σ_{t=1}^{n} X_t| ≥ x √n) ≤ 2 exp{ − x²/2 }   (Hoeffding).

Recall that the previous inequalities are optimal. For instance, Bernoulli random variables yield
analogous inequalities with the De Moivre-Laplace CLT.
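As a quick numerical illustration (ours, independent case only), the empirical tail of |Σ X_t| can be compared with the Bernstein bound recalled above; the choice of uniform variables is our own.

```python
import numpy as np

def bernstein_check(n=200, reps=20_000, x=1.5, seed=0):
    rng = np.random.default_rng(seed)
    # X_t uniform on [-1, 1]: centered, |X_t| <= 1, variance sigma^2 = 1/3.
    X = rng.uniform(-1.0, 1.0, size=(reps, n))
    emp = np.mean(np.abs(X.sum(axis=1)) >= x * np.sqrt(n))
    sigma2 = 1.0 / 3.0
    bound = 2.0 * np.exp(-x ** 2 / (2.0 * (sigma2 + x / (3.0 * np.sqrt(n)))))
    return emp, bound

print(bernstein_check())   # the empirical tail should sit below the Bernstein bound
```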
Let us come to the case of dependent random variables.
Proposition 1. Assume that (X_t) is a mixing sequence of centered random variables with
|X_t| ≤ 1 and such that the strong mixing sequence (resp. the uniform mixing sequence)
satisfies α_n ≤ u v^n (resp. φ_n ≤ u v^n) for some u > 0, 0 ≤ v < 1. Set σ² = sup_t (E|X_t|^{2+ε})^{2/(2+ε)}
(resp. σ² = sup_t E|X_t|²). If, moreover, n satisfies n σ² ≥ 1, then there are some constants a,
b > 0 such that for any x ≥ 0,

P(|Σ_{t=1}^{n} X_t| ≥ x √n σ^{1−ε}) ≤ a exp{−b √x}

(resp. P(|Σ_{t=1}^{n} X_t| ≥ x √n σ) ≤ a exp{−b √x}).

Proof of Proposition 1. The proof of Theorem 2 implies for any integer p, the inequalities
n 2p-l
[II, X /P ~ C 2p Max {na2, (na 2)p} for constants C2p ~ 8 C(2p) + I, CZJ) C m C 2p _m
t=1 m=l
where,
~

C(p) = I, (r+ 1)p-2


r=1
a: < 00 and a 2 = SUPn ([ IX/+ E)21(2+E) in the strong mixing case, and

C(p) = I, (r+ 1)p-2 cjJl:2 < 00 and 0'2 = SUPn [E Ixi in the uniform mixing case.
1'=1
Note that a2 :0; SUPn ([E IXi)21(2+EJ in the ftrst case and combinatorial arguments lead in the
case of a geometric decrease of the mixing sequences (see Doukhan, Portal (1983), Doukhan,
A p t)4P 4p
Leon & Portal (1984) and Massart (1987)) to C 2p :O; ( --€ and C2p :0; (A p)
I-a
respectively for constants A> 0 and 0:0; a < 1. Now Markov's inequality implies that if
n0'22:: 1, then for a constant B > 0
n B 2 2p
!P(II, Xtl 2:: XCO'I-€{Ii) :0; (7) in the strong mixing case and
t=1
n B 2 2p
!P(II, X tl 2:: x 0' {Ii) :0; (7) in the uniform mixing case.
1=1

Optimizing the previous inequalities implies the result; see Doukhan, Portal (1983 a, 1987) and
Massart (1987) for more details .•

Statuljavichus & Yackimavicius (1989) give very precise similar results using the cumulant
sums technique. This method is developed in Nahapetian (1991).

We now indicate a result due to Collomb (1984), the proof of this result is based on a direct
grouping argument

Proposition 2. If (Xt ) is a uniform mixing sequence of centered random variables with

I cfi n <
00

IXtl ~1 such that 00, b- 1 = 8 (1 + 4 I, cfi n), a = 2 exp 3-{e if n and x satisfy
n=1 n=1
2 O'{n cfik 1
n 0' ;? 1, 0 ~ x ~ 8 b kn for kn = Inf{k; k ~ ,/ then for any x;? 0
n
!poI, X tl ;? x ,{Ii 0') ~ a exp{ -b x 2 }.
t=1

Remarks 7. A close result is shown in Bosq (1975). This result was extended in Carbon
(1983) to the strong mixing case. We do not state it in its complete form because of its
complexity. We only present it for examples of decay rates of strong mixing sequences. For
any 0 < a < 1, there is some b > 0 such that for n big enough,
n n
if a) an:o; v e for O:O;v < 1 then !P(II, Xtl 2:: x ,{Ii):O; 2 exp{-bn l12 - a x)),
t=1
n
if b) an:o; v n for 0:0; v < 1 then !P(lI, Xtl 2:: x "Ii):o; 2 exp{ -b n- aJ2 x)} and
t=1
n _ I-a
if c) an:o; n- v for v> 0 then !P(II, Xtl 2:: x" n):O; 2 exp{ -bx l~}.
t=1 ..[ii
Assumption (a) seems inadequate but it leads to a good result while the more frequent case. (b)
leads to a loss n€ in the exponential factor of Theorem 1 and a gain replacing by x. A recent -vx
result due to Bosq (1991) using the reconstruction results of § 1.2. improves sharply the
inequality of Carbon (1983). The main problem in this inequality seems to be a minorization

assumption. The equilibrium is not good for a O(Jn) deviation in the case (c). Moreover the

dependence with respect to (j is not explicit.

We now present results which seem much closer to their independent analogues for the case of
β-mixing and φ-mixing processes. In order to fix some ideas, let us first set the general
assumptions. The sequence X_1, X_2, ... is a sequence of real valued random variables satisfying
some of the forthcoming assumptions:
(i) ∀ t ∈ N*, E X_t = 0.
(ii) ∃ σ² ∈ R+*, ∀ n, m ∈ N*: (1/m) E(X_{n+1} + ... + X_{n+m})² ≤ σ².
(iii) ∀ t ∈ N*, |X_t| ≤ M.
(iv) ∀ t ∈ N*, E|X_t|^r ≤ M^r.

Remark 8. (ii) is the only assumption related to stationarity. Note that it holds if (i) and the
ρ-mixing assumption hold; σ² = (1 + 2 Σ_{n=0}^{∞} ρ_n) sup_n E X_n² < ∞ is here a convenient
bound.
Under (iii) and a strong mixing assumption, it still holds with the inhomogeneous
bound σ² = (1 + 2 Σ_{n=0}^{∞} α_n^{δ/(2+δ)}) M^{2δ/(2+δ)} sup_n (E X_n²)^{2/(2+δ)} < ∞.
If (iii) is not assumed but (iv) holds, then the M^{2δ/(2+δ)} factor must be replaced by
M^{2rδ/((r−2)(2+δ))}. We do not present an explicit use of condition (iv); however a truncation
argument and the use of the Borel-Cantelli lemma classically yields exponential inequalities
under this assumption. An explicit example of this technique is given in Uteev (1985).
Now if (iii) is satisfied and α_n ≤ a e^{−bn} for some positive constants, then (ii) still holds
for some constant c only depending on a, b and M, with
σ² = c sup_n E X_n² [1 ∨ ln^w(1/E X_n²)] for any w > 1. That may be shown using Remark 3 of
§ 1.2.2 and Bulinskii & Doukhan (1987: Proposition 5; 1990: formula (19)).
A better result may be shown by a direct argument following the previous remark. For
this, optimize the previous bound with respect to δ and note that Σ_{n=0}^{∞} α_n^{δ/(2+δ)} ≤ const/δ. We
get the bound σ² = const · sup_n E X_n² [1 ∨ ln(1/E X_n²)].

Those assumptions naturally extend to the case of a random field (Xt)tE ;t'd, assumption
(ii) becomes ~ [E (L xl::; (j2. The previous remarks still hold. In the case of a p-mixing
A

random field, we set (j2 = [L Cd n d - 1 pen; 1, 1)] SUPt [EX~ < 00. In the case of an a-
n=O

mixing random field set (i = [I, cd n d- 1 aO/ (2+0)(n;


1, 1)] SUPt (IEX~)1/(2+0) < 00 since
n=O
the covariance inequalities in § 1.2.2 may still be used, here cd denotes a constant (4)
depending on the norm chosen on Zd.
Using alternative mixing assumptions, Nahapetian (1991) gives the asymptotic behavior
for the variance of sums.
The case of a geometric decay rate for a(n; 1, 1) may also be considered. The bound
- IE~
I, cd n d- 1 a0!(2+0\n; 1, 1) ::; const. yields (ii) with ci = const SUPn n 1 d'
n=O &I [1 v In -2]
IEX n
In this case we recall that if the strong mixing coefficients satisfy for any u, v and n
a(n; u, v) ::; c n for some sequence c n with limn~_ c n = 0, the result of Bradley in § 1.2.1
implies that, under the additional assumption of strict stationarity, the p-mixing assumption
holds if d ;t: 1.

Using now the same tools as Bosq (1991), we get, using the classical Bernstein
inequality, the following Bernstein inequality in which only a loss of ln n is observed in the
geometric mixing case.

Theorem 4. Assume the sequence satisfies the β-mixing condition and (i), (ii), (iii); then
for any ε > 0, setting θ = ε²/4, we have for any nonnegative real number q ≤ n/(1+θ):

P(|Σ_{t=1}^{n} X_t| ≥ x) ≤ 4 exp{ − (1−ε) x² / (2 (n σ² + q M x/3)) } + 2 (n/q) β_{[qθ]−1}.

In the case of a random field, set for integers (n1,···,nd) 1] = mini,j -0



]
N = n1 ... nd and
£2
(} = 4 then, assuming the same assumptions, we have for any integer Q !> Td N and for
£ > 0 little enough
nz nd

!P(II, ... I, Xil"'.,i) ~ x) !>


iJ=l id=l

!>(2+2 d)exp{- (1- £)x 2 }+ 2d!i{3([81] Q1Id}_1;[2dQ},N).


2(n c?+2 d Q M x13) Q

A first reduction for the proofs. Set x_t = X_{[t]+1}; the continuous time process x = (x_t) thus
obtained satisfies the following versions of the assumptions, for s and t nonnegative real
numbers:
(i') |x_t| ≤ 1, (ii') E x_t = 0, (iii') (1/t) E( ∫_s^{s+t} x_u du )² ≤ σ², and for any dependence
measure c, c_{x,t} ≤ c_{X,[t]−1}. We shall denote by T(u, v) the integral T(u, v) = ∫_u^v x_t dt. The
alternative assumption to (iii), (iv) E|X_t|^r ≤ 1 for some r > 2, becomes (iv') E|x_t|^r ≤ 1. We
shall write a = 1 under assumption (iii) and a = n^{1/r} under assumption (iv). Moreover a
rescaling allows us to consider only the case M = 1.

4 In fact c_d ≥ Max_{n>0} Card{z ∈ Z^d ; n < ||z|| < n+1} / n^{d−1}; for instance c_d = d 2^{d−1} is a convenient
choice when ||z|| = max_{1≤j≤d} |z_j| for z = (z_1, ..., z_d).

This transformation will allow us to use blocking techniques without divisibility precautions.

Let S_n = ∫_0^n x_u du; define, for any integer l and any real number θ, q = n/(l(1+θ)); we set

U_i = ∫_{(i−1)q(1+θ)}^{(i−1)q(1+θ)+q} x_u du   and   V_i = ∫_{(i−1)q(1+θ)+q}^{iq(1+θ)} x_u du,   for 1 ≤ i ≤ l.

Then S_n = Σ_{i=1}^{l} (U_i + V_i). Set A_n = Σ_{i=1}^{l} U_i, B_n = Σ_{i=1}^{l} V_i; then for any u, v > 0

P(|S_n| ≥ u + v) ≤ P(|A_n| ≥ u) + P(|B_n| ≥ v).
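The blocking step can be mimicked in discrete time; the following sketch (ours, not from the text) splits a sample of length n into alternating long blocks of size q and short blocks of size about qθ and returns the two partial sums playing the roles of A_n and B_n.

```python
import numpy as np

def block_sums(x, q, theta):
    """Split x[0:n] into alternating blocks of lengths q and ceil(q*theta);
    return (A, B) = (sum over long blocks, sum over short blocks)."""
    n, gap = len(x), int(np.ceil(q * theta))
    A = B = 0.0
    i = 0
    while i < n:
        A += x[i:i + q].sum()            # long block, analogue of U_i
        B += x[i + q:i + q + gap].sum()  # short block, analogue of V_i
        i += q + gap
    return A, B

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10_000)
print(block_sums(x, q=100, theta=0.1))
```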

We now use the reconstruction techniques in § 1.2.1.

Proof of Theorem 4. In the ~-mixing case, apply the reconstruction Theorem 2.1.1 to get
independent random sequences (U;)l$i$I' (V;)l$i$1 such that !P(U; "# U i) ::; ~[eqj-I' U; has
the same distribution as Ui (resp. !P(V; "# Vi) ::; ~[eqj-l) and V; has the same distribution as
I I
Vi (5). Bernstein's inequality applied to those sequences yields with;\; = L U;, B: = LV;
i=l i=l
2
!P(i;\;1 ;?: u) ::; 2 exp { - 2u )}, and analogously
na 1
2(--+-qau)
1+9 3
!P{lBn* l ;?: v) ::; 2exp {
- V 2
2}.
n9a 1
2( 1+9 +3 q9av )

Thus setting X = v'"


-'IIe
~
= u [1 _ ... I 9 ], we obtain
''I 9+1
!P{I;\; + B:I ;?: x) ::; 4 exp { - A x; }. Here A = (-{l;e - ~9)2 is close to 1 if 9
2(na2 + 3 qax)

is small so that A = 1 - e for 1::; e::; 2~e. Now set r = l$i$l


u {U; "# Ui}u{ V; "# Vi},

5 For this, use the induction technique and apply theorem 1.2.1. at each step r, using the random variables X,
Y defined as
X = (U',
1 U'»l<'<
J _I_f
and

n n 1~
then 1P(f') ~ 2 q ~[9ql-I' Hence 1P(ISnl ~ x) ~ - ~[9ql-1 + 4exp{- 2 } for the
q 2(ncr2 +3"qx)
case of bounded random variables.•

Inverse the function inside the exponential leads now to

1P(ISn l ~~ i n cr 2 + ~ t q) ~ 2 ~ ~9q-1 + 4e-t,


assume that qn t ~ n is the first number such' that _n_ is an integer and with
• q(I+6)

~ ~[9ql-1 ~ ne- t, we get 1P(ISn l ~ ~ i t n cr 2 + ~ t qn.t) ~ 6 e-t .


In the geometric mixing case ~q_1 ~ a e-bq it is easy to check that qn.t ~ t;l~ n (1 + 0(1»

and in the arithmetic mixing case ~q_1 ~ a q- , qn.t ~ 6 b


b [a ent ]lI(b+l) (1 + 0(1» ( see (6».

The case of a random field follows analogously. Here the mixing coefficient is a function of the
nl nd
cardinality such that ~(n-I; a, b) :::; (aAb)r cn' Set Sn = .L. ....L.
Xil' .... id for
11=1 Id=1

n = (n I'"'' nd)' Using the same arguments as before this may be rewritten
nl nd 2d
Sn = J... J xSI" ",sd ds I'" dSd and now Sn = ? Aj where the Aj's are integrals of xt on
j=1
, 6 6 p (j)
the union of ld-rectangles with an area less or equal to d N for j > 1 (= -d- N for
2 2
1 ~ p(j) ~ d) 6~ N for j = 1 for N = nl ... nd; moreover the previous
and equal to (1 -
2
rectang!es are separated at least 6q (for j fixed); thus there are (Ui.j)ISjS2d.ISiSld where
I
A·j = k'" U··
I.j and for instance
i=1
(i l -I)(I+EI)ql+ql (i d-l)(I+EI)qd+qd

Ui •1 = f .. ··..
(i l -I)(I+EI)ql
f
(id-I)(I+EI)qd
Xs dS

ifi -7 (i l , ... , id) is a bijection {I, ... ,ld} -7 {I, ... , l}d. That means that the reconstruction
techniques still enable us to provide the representation of Aj as Aj =Aj outside of a set of
N
probability 2d _Q ~([6q]-I; [2dQ], N) with qi =~, Q = ql ... qd and q = mini qi'
(1+6)1

6 the 0(.) terms come from the fact that I =(1 :a)q E IN.

[P(IAjI2 u)::O; 2exp{ _---2~u=---2---}


an Ci 1
2(--+-Qau)
(1+8)d 3

where a = 0 for j = 1 and a = 8 else. Set Ie = «1+8)dl2 - 2 d-{e)2 and x = u'"


-\JI ~
(8+1)

if j:f. 1, x = u [1 - 2 d ...
-\JI ~],
(8+1)
if j = 1. Hence Ie = 1 - £ for 0::0; £ ::0; ~e if

The following result comes essentially from Lin (1989). The loss with respect to
independence is now no longer a function of n but only a logarithm of σ in the case of a
geometric decay of the mixing sequence.

Theorem 5. Assume the sequence satisfies the φ-mixing condition and (i), (ii), (iii); then
there are constants a, b > 0 with

a) P(|Σ_{t=1}^{n} X_t| ≥ x √n) ≤ a exp{−b x²} if the sequence n φ_n is bounded and,

b) P(|Σ_{t=1}^{n} X_t| ≥ x √n) ≤ a exp{ −b x² / (σ² + x M ψ(σ²)/√n) } if lim_{n→∞} n φ_n = 0, where

ψ(t) = Inf{p ∈ N; p φ_p ≤ t}.

Note that if φ_n ≤ u v^n for 0 ≤ v < 1 and u > 0 then ψ(σ²) = C ln σ^{−1}, and if φ_n ≤ u n^{−v}
for v > 0 then ψ(σ²) = C σ^{2/(1−v)} for some constant C only depending on u and v. In
inequality b), the constants involved do not depend on M and d.
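A direct way to read condition b) is to compute ψ(t) = Inf{p : p φ_p ≤ t} numerically for a given bound on the mixing rate; a tiny sketch (ours), with geometric and arithmetic rates as in the note above and arbitrary constants.

```python
def psi(t, phi, p_max=10**6):
    """psi(t) = inf{p >= 1 : p * phi(p) <= t}; phi is a decreasing rate function."""
    for p in range(1, p_max + 1):
        if p * phi(p) <= t:
            return p
    raise ValueError("no p found below p_max")

geometric = lambda p: 0.5 * 0.8 ** p      # phi_p <= u v^p   -> psi(t) of order ln(1/t)
arithmetic = lambda p: 0.5 * p ** -3.0    # phi_p <= u p^-v  -> psi(t) of order t^{1/(1-v)}

for t in (1e-1, 1e-2, 1e-3):
    print(t, psi(t, geometric), psi(t, arithmetic))
```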

Proof of Theorem 5 (7). Let x in [0, 1] and q = [~] we first group the random variables
{XI"'" Xn} in 21 blocks with 1 the integer such that 21 - 1 < ~::o; 21.
2iq+q
Set Ui = I, X k (see 8), Tn = I, U2i and T~ = I, U2i + l , it is clear that Sn = Tn + T~.
k=2iq+! i i

Now, In [Eexp{ x Sn} ::0; 21 (In [Eexp{ 2x Tn} + In [Eexp{ 2x T~}). We bound one of these
terms, since bounds will be analogous for both (note that stationarity is not assumed and thus

7 The present proof is due to Emmanuel Rio.


8 By convention, empty sums are set equal to O.

m
the terms are not equal). Set ~ = L, V 2i and Lm =In [exp{2xZm }, the inequality (3) for
;=1 .
covariances in Theorem 3 in § 1.2.2. yields, with p = 1 and q = 00
Lm+l :,; Lm + In (24)q e 2qx + SUPt>O [exp{2x V t }) and thus 1:,; ~:,; nx so that,
In [exp{2xT o }:'; n x In (24) q e 2qx + SUPt>O [exp{2xV t }).

Now IUtl:'; q, thus [exp{2xV t } :,; 1 + 2 x V t + 5 x 2 V~yields

In [exp{2xT o }:'; nxln (1+15 4>q+5qcr 2 x 2):,; 15nx4>q + 5nx 2 cr 2.

Grouping the previous inequalities yields

The first point follows from a rough bound In [exp {x So} :,; c n x 2 for 0:'; x :,; 1 and
Markov's inequality. For the second point assume first that x 'l'(cr 2) :2: 1 then
In [exp{xS o } :,; 20nx 2 cr2. Else, we obtain In [exp{xS o } :,; 20nx'l'(cr2) .•

Remark 9. Introducing a disequilibrium between even and odd blocks leads to an inequality
of the same form, where b is as close as we want to 1/2, as in the case of Theorem 4. Li has
achieved this in a paper written in Chinese quoted in Lin (1989).

Remark 10. The interest of such sharp inequalities is to get a Bounded Law of the Iterated
Logarithm: limsup_{n→∞} |S_n| / √(2 n ln(ln n) σ²) ≤ C a.s. The Law of the Iterated Logarithm may be
found in Nahapetian (1991).

1.4.3. Maximal inequalities


We intend in this section to extend as much as possible the maximal inequalities given
for independent sequences in Petrov (1975) and Hall & Heyde (1980) for non independent
sequences.

Maximal inequalities are the basic tool to get almost sure asymptotic results, such as Laws of
Large Numbers, Laws of the Iterated Logarithm and Strong Invariance Principles. Since
martingale maximal inequalities do not work in the weakly dependent case, we present suitable
partial inequalities. We first recall the general results in Moricz, Serfling, Stout (1982) for the
case of sequences and in Moricz (1983) for the case of random fields. They yield maximal
inequalities from suitable non maximal inequalities in § 1.4.1 and 1.4.2. Their application in
the mixing case is easy. After that, we derive maximal mixing inequalities for mixing
processes, generalizing Ottaviani's inequality in Reznick (1968) and proved by Massart (1987).

Theorem 6. Moricz, Serfling, Stout (1982)

Let X_1, ..., X_n be real random variables; define the sums S(i, j) = X_i + ... + X_j and maximum
sums M(i, j) = Max{|S(i, k)|, i ≤ k ≤ j} for 1 ≤ i ≤ j ≤ n. Let g be quasi-subadditive,
in the sense that for some 1 ≤ Q < 2
g(i, j) ≥ 0, g(i, j) ≤ g(i, j+1), for 1 ≤ i ≤ j ≤ n and
g(i, j) + g(j+1, k) ≤ Q g(i, k), for 1 ≤ i ≤ j ≤ k ≤ n.
If, moreover, the partial sums S(i, j) satisfy

a) for some a, τ ≥ 1, E|S(i, j)|^τ ≤ g^a(i, j), then

E M^τ(1, n) ≤ A(n) g^a(1, n)

for the constant A(n) = (1 − Q^{a/τ} 2^{−(a−1)/τ})^{−τ} (independent of n) if a > 1, and
A(n) = (1 + log_2 n)^τ if a = Q = 1.

b) For some functions φ and χ with sup_{t>t_0} φ(Ct)/φ(t) = χ(C) for C > 1 and
lim_{C→1} χ(C) = 1, and some constants K ≥ 1, t_0 ≥ 0: E e^{t|S(i,j)|} ≤ K e^{φ(t) g(i,j)} for
t > t_0, then

E e^{t M(1,n)} ≤ A K e^{B φ(t) g(1,n)}

for t > t_0 and constants A and B only depending on Q and χ if a > 1.

c) For functions φ and χ with sup_{t>t_0} φ(Ct)/φ(t) = χ(C) for 0 < C < 1, lim_{C→1} χ(C) = 1
and some constants K ≥ 1, t_0 > 0: P(|S(i, j)| ≥ t) ≤ K e^{−φ(t)/g(i,j)}, then

P(M(1, n) ≥ t) ≤ A K e^{−B φ(t)/g(1,n)}

for t > t_0 and constants A and B only depending on Q and χ if a > 1.

Proof of Theorem 6. We set for convenience g(i, j) = 0 if i > j.


a) By recurrence over N, assume that the result holds for any m < N, the following is a
consequence of Minkowski's inequality
1: 1:)111:
IIM(1, N))II1:::;; IIS(I, m))I1: + ( IIM(1, m-l)lI1: + IIM(m, N)II1: .
Now choose m with
g(1, m-l) ::;;,~ g(l, N)::;; g(1, m) and thus g(m+l, N)::;; ~ g(1, N),

we get the result with the constant A(n) =(1 - QaJ1: 2-<a-l)/1:f1: if a > 1 and if a =Q = 1
the result is shown by recurrence over v for n = 2v using the first inequality and
In 2 n
A(n) =(1 + lil2)'

b) Choose q> 1 with B =X(q) ::;; ~ and set A =2P for ~ + ~ = 1, the result holds
for n = 1, assuming that it holds for n < N we prove it for n = N.

F or 1 <
_m< _ N ,we h ave IE e tM(I, N) < _ IE e tM(I, m-I) + IE e tIS(l, m)1 e tM(m+l, N) , an d
we choose m as in a). HOlder's inequality implies with the induction hypothesis
IE etM(l, m-l)::;; (IE etqM(I, m-l»l/q,
IE etM(I, m-I)::;; (AKeB«!l(qt)g(l, m-I»lIq ::;; A IIq KeBQ«!l(qt)g(l, N)/(2q),
IE etM(I, m-I) A IIq KeB«!l(t)g(1, N)I2, using the definition of q, and analogously,
IE etIS(I, m)1 etM(m+I, N)::;; (IE etpIS(I, m)I)lIp (1E etqM(m+I, N»lIq,
IE etIS(I, m)1 etM(m+l, N)::;; (Ke«!l(pt)g(l, m»l/P(AKeB«!l(qt)g(m+l, N»lIq,
IE e tIS(I, m)1 tM(m+I, N) < AlIqK Bg(l, N)[«!l(pt)/(pB)+2«!l(qt)/(qQ)]
e _ e ,
IE e tIS(I, m)1 e tM(m+I, N) <
_ AlIqKe B«!l(t)g(1, N)[X(P)/(pB)+X(q)/(qQ)]
,

IE etlS(l, m)1 etM(m+l, N) S; 2A l/QKeB$(t)g(I, N).


We now collect the previous inequalities to finish the proof.

c) The proof is similar and may be found in Moricz, Serfling, Stout (1982) .•

The following multidimensional analogue of Theorem 7-a is proved in Moricz (1983).


For k, m E ~ d we set k S; m if k j S; mj for i = 1, ... , d and k = (k 1 , ... , k d ),
m = (ml' ... ' md). Moreover define for b = (b 1 , ... , b d ) the rectangle

R = R(b, m) = .~ ]bj, bj + mj].


~
We set S(R) =
bR
L
X k if (Xk)ke Zd denotes a random

field and M(R) = Maxl~ IS(R(b, k»1. We say that a function f defmed on rectangles is
subadditive, if f(R) ~ 0 and f(R 1) + f(R 2) S; f(R) for rectangles R, Rl and R2
with Rl = R(b, m') and R2 = R(b', m") if R = R(b, m) and where we define b', m' and
m" as b'=(b 1 , ••• , b j _1, bj+Pj' b j + 1, ... , b d), m'=(ml, ... ,mj_l,Pj,mj+l, ... ,md) and
m"=(ml, ... ,mj_l,mrPj,mj+l, ... ,md) for IS;Pj <mj' 1 S;j S;d.

Theorem 7. Let (Xkhe:tf denote a real randomfield,fbe a subadditivefunction and h(t, m)


be afunction defined for t E fR+. mE Zd. increasing with respect to each coordinate (i.e.
h(t. m) ~ h(t'. m') for 0 ~ t ~ t' and 1 ~ m ~ m') satisfy for some 'f ~ 1
[£ IS(R)I'Z' ~f(R) (h(f(R). m)/ for any rectangle R = R(b. m) then

[£ [M(R)/ ~ 3d ('Z'-l) f(R) (h'(f(R). m)/ and [£ [M(R)]'Z' ~ (~/ feR) (h'(f(R). m)/ with
ln2 m] ln2 md

h ,(t. m) = "" ... £..J


£..J
k]=O
"" h( k +t +k' (-k'
kJ=O
m 1 ...• -k:!.l).
2 1 ••• 1 2 1
md
2 d'

Applications to S.L.L.N.
Remark 12. If Y is a stationary and centered random sequence, then the SLLN holds with the
rate n^{−τ/2} for processes with (τ+ε)-order finite moments.

Remark 13. If X is a stationary random sequence, and k is compactly supported with


integral 1 on the real line then Yn(x) = ~(n)(x-Xn) - 1E~(n)(x-Xn) is such that, under the

previous mixing assumptions, fn(x) = ~ i


r=l
~(r)(x-~) estimates the density function of the
marginal probability distribution of the process X, the bias of such estimates converges to 0,
n
uniformly over compact subsets of fR. Then the previous results show that Sn(x) = L Yr(x)
r=1
has't-th moments with order n m't(l-II('t+E»(n).

Loss with respect to the independent case is small for big 't and small E.

The following result analogous to Ottaviani's inequality is essentially due to Reznick


(1968), was improved in Doukhan, Leon (1989) and the present version comes from Massart
(1987). Note that Nahapetian (1991) proves maximal inequalities under alternative mixing

assumptions.

Theorem 8 (maximal inequality) Let (Xn)n?l be stationary random sequence centered at


expectation and such that LEIXl I2+ O < 00 for some 8> 0 and let Sn = In (Xl + ... + Xn)

then if either the sequence is rfJ-mixing or it is a-mixing with an = O(n- r) for some
r > 2 + 8, there exist positive constants C, c with
(~- a) fP(maxlsksn {lSkl} > x) ~ /P(ISnl > x - 2y{Ti) + Cn- t, for x> 0, y> c and

. I
respectIve y t ="28.In the ,/, "
'1rmlXlng case an
d 0 r(2+8) l' h ..
< t < 2(2+o+r) - In t e a-mlxmg case,

a = maxlsksn (fP(IS n - Ski> y{Ti)}.

Proof of Theorem 8. In the <I>-mixing case let m = min{n; <1>0 ::;~} and
2m2+1i 21i 0-2t l+t
C = 4( - ) [E IX,I + and in the a-mixing case let ~ = - - , s = - (s < r) and
c 2(2+0) ~

then set m = [nl3] and C = 4 Max{(~)2+1i[EIXlI2+1i, a on 13S }.


Set Ak= {IS,I::;x, ... , ISk_11::;x,ISkl>x}, :'f<.={kE {1, ... ,n};!P(A k»Cn-(1+t)},
then
o
!P(max1$kSn {ISkl} > x) = L !P(A k)::; Cn-
I + L
!P(A k ), and
k=l ke%
!P (ISnl > x - 2y-{ii) 2 L!P ({ ISol > x - 2y-{ii} (')A k).
ke%
Now following Reznick (1968), we set p~(zlk) = !P (IS u - Syl > z-{ii I A k)
IP (ISnl > x - 2y-{ii) ;::: (1 - Max ke % Pk! 1(2ylk» L!P (A k)·
ke%
ItonlyremainstoprovethatMaxke!:KPk~1(2ylk)::; a +~.
It may be seen that Pk+l(2ylk) ::; p'r:i-1(ylk) + Pk~m(ylk) but
[E IS k+m-l - S k12+1i < 2
- m +1i [E 12X l
12+1i
' thus we now get for k in :'f<. ,
pk::rl(ylk)::; (2;)2+0 [Elxl+ 1i C-1 nl-O/2::;~.
Finally by the definition of mixing coefficients we get depending on the mixing property used
1 a l+t-~s 1
n (Ik)
Pk+m y < J..
- a + 'I'm <
- a+ n ( y Ik) <
4 and Pk+m - a + !P(Am) <
- a + n- 4 - <- a + 4'
k
Collecting the previous inequalities ends the proof. +

Survey of the literature

Doob (1953), Ibragimov and Yokoyama (1980), and Yoshihara (1976) proved inequalities for
arbitrary moments of sums of <1>, p or a-mixing sequences in a form where constants are not
explicit or not useful to work with triangular arrays. Billingsley (1968) obtained the Rosenthal
bound for the fourth order moment of a <I>-mixing sequence. The same technique led Doukhan
& Portal (1983 a, 1987) and independently Uteev (1985) to obtain the Rosenthal inequality for
strong mixing sequences. This was extended to random fields in Doukhan, Leon & Portal

(1984) and Orlicz space analogues are proved in Bulinskii & Doukhan (1987). The case of the
variance of sums in random fields is worked out in Neaderhouser (1978).

The first exponential inequality was probably given in Blum, Hanson and Koopmans (1963).
They proved a Hoeffding inequality for ψ-mixing random variables. Bosq (1975), who proposed
exponential inequalities through a direct proof, obtained a first φ-mixing exponential inequality.
It was improved in Collomb (1984) and in Carbon (1983). Györfi et al. (1989) use them
extensively for non-parametric curve estimation. After that, Bosq (1991) used the
reconstruction results to get a sharp exponential inequality. The same trick is used here to get
β-mixing Bernstein inequalities. In every previously recalled work, the exponential inequalities
are proved using grouping arguments. The Rosenthal moment inequalities led Doukhan &
Portal (1983) to a uniform exponential inequality presented in Proposition 1; this inequality is
extended to the case of random fields in Doukhan, León & Portal (1984) and improved in the
strong mixing case in Massart (1987). Analogous results are given in Uteev (1985). Now Lin
Zhengyan (1989) seems to give the best exponential inequalities known for strong mixing or
φ-mixing sequences.

General maximal inequalities are given in Moricz (1983), Moricz, Serfling & Stout (1982) and
Serfling (1968). Ottaviani's inequalities for mixing sequences proved in Reznick (1968) may be
found under weaker forms in Rosenblatt (1956), Wolkonski & Rozanov (1959), Ibragimov (1962),
or in Billingsley (1968), Iosifescu & Teodorescu (1969), Mac Leish (1975), Yoshihara
(1978), Hall & Heyde (1980), Yokoyama (1980) and Uteev (1984). They are improved in
Doukhan & León (1987) and Massart (1987). Recall also that Nahapetian (1991) proves
various limit results which develop the ones proposed here.

1.5. Central limit theorems


The central limit theorem is a fundamental tool in statistics. In this section, various
central limit theorems for mixing random variables are presented in order to point out the real
differences with the independent case. This chapter is divided into three parts. In the first one we
recall sufficient conditions for the CLT to hold. In the second we recall results concerning
convergence rates in the CLT. In the last part we prove a CLT with a rate involving explicitly the
dimension of the underlying space. This kind of result, first proved in Yurinskii (1977) for
independent and identically distributed random variables, yields weak invariance principles.
Dehling (1983), Doukhan, León & Portal (1987) and Doukhan & Portal (1987) prove similar
results for dependent random variables.

1.5.1 Sufficient conditions


Let X = (X_t)_{t∈Z} be a real valued, stationary, centered at expectation and mixing random
process. The CLT problem is to provide explicit sufficient conditions on X for a central limit
theorem to hold. Set S_n = X_1 + ... + X_n and σ_n² = E|S_n|². First recall that convergence of σ_n²/n
holds if

(1) the sequence X is ψ-mixing, Σ_{n=0}^{∞} ψ_n^{1/2} < ∞ and E|X_1|² < ∞, or

(2) the sequence X is φ-mixing, Σ_{n=0}^{∞} φ_n^{(δ+1)/(2+δ)} < ∞ and E|X_1|^{2+δ} < ∞ for some δ > 0, or

(3) the sequence X is α-mixing, Σ_{n=0}^{∞} α_n^{δ/(2+δ)} < ∞ and E|X_1|^{2+δ} < ∞ for some δ > 0.

Moreover σ_n² = n σ² + o(n) if σ² = E|X_1|² + 2 Σ_{k>1} E X_1 X_k. Those results follow from
the expression σ_n² = n E|X_1|² + 2 Σ_{k>1} (n − k) E X_1 X_k and from the use of the covariance
inequalities of § 1.2.1.

Assume that σ² > 0; then Ibragimov (1962) shows that the CLT holds under condition (3).
Moreover Ibragimov (1975) proves the CLT if σ_n² = n L(n) for some slowly varying function
L and either Σ_{n=0}^{∞} ρ_{2^n} < ∞ or E|X_1|^{2+δ} < ∞ for some δ > 0; he conjectures that the CLT still
holds if the ρ-mixing assumption replaces the last one. Billingsley (1968) shows that the CLT
holds under condition (1). Mac Leish (1975 b) shows that an invariance principle holds under
condition (2). We also recall the results by Serfling (1968) based on martingale techniques. Let
X = (X_t)_{t∈Z} be a process centered at expectation and such that E|X_{a+1} + ... + X_{a+n}|² behaves
like n A for big n and uniformly with respect to a. Then the mixing assumption implies the CLT.
Remark 1. Note that Bradley (1981 b) provides a sufficient condition for such linear growth
of partial sums of a stationary process. Let (X_t) be a stationary real valued process; set as usual
S_n = Σ_{t=1}^{n} X_t. Set now r_k = sup_{Y,Z} |Corr(Y, Z)| for the maximal correlation of random
variables Y = Σ_{t≤T} a_t X_t and Z = Σ_{t≥T+k} b_t X_t defined with the help of real valued sequences a_t
and b_t with a finite number of nonzero elements. Then a sufficient condition for n^{−1} Var S_n to
converge to a constant σ² is that Var S_n converges to infinity and Σ_{n=0}^{∞} r(2^n) < ∞. Note that the
last assumption is easily provided for ρ-mixing processes since r_k ≤ ρ_k (see § 1.3).

A result due to Oodaira and Yoshihara (1972) (see the bibliographical comments for the
paternity of Theorem 1) relaxes the previous conditions. It makes use of the deep ergodic
theoretic argument of Gordin (1969 a).

Theorem 1. Let the stationary sequence X be centered at expectation and α-mixing. If
Σ_{n=0}^{∞} α_n^{δ/(2+δ)} < ∞, E|X_1|^{2+δ} < ∞ for some δ > 0, and σ² = E|X_1|² + 2 Σ_{k=1}^{∞} E X_1 X_{k+1} > 0,
then the sequence of processes {S_{[nt]}/(σ√n); t ∈ [0, 1]} converges in the Skorohod topology to
a standard Brownian motion W on [0, 1].
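For illustration (ours, not from the text), the invariance principle can be visualised by simulating the rescaled partial-sum path t ↦ S_{[nt]}/(σ√n) of an α-mixing AR(1) sequence; here σ² is the long-run variance of the AR(1), known in closed form, and all names are ours.

```python
import numpy as np

def partial_sum_path(n=10_000, a=0.5, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = eps[0] / np.sqrt(1 - a * a)
    for t in range(1, n):
        x[t] = a * x[t - 1] + eps[t]
    sigma = 1.0 / (1.0 - a)                 # long-run standard deviation of the AR(1)
    S = np.concatenate(([0.0], np.cumsum(x)))
    t_grid = np.linspace(0.0, 1.0, 11)
    return t_grid, S[(n * t_grid).astype(int)] / (sigma * np.sqrt(n))

t, W = partial_sum_path()
print(list(zip(t.round(1), W.round(3))))    # one approximate Brownian path on [0, 1]
```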

Recall that the Donsker CLT holds for any independent and identically distributed
sequence if the random variables have finite variance, see Petrov (1975). Note that no strong
mixing condition (I) implies CLT without an assumption like IEIX/+1i < 00 for some 0> O.
That is quite natural in view of the inhomogeneous covariance inequalities in § 1.2.2. and
Herrndorf (1983) gives an example of a strongly mixing stationary sequence with arbitrary fast
decay of the mixing sequence, IE IX 112 < 00, and such that CLT does not hold.

It is still possible to generalize the moment assumption IEIX/+1i < 00 to IE'P(IXII) < 00 for
convex functions with 9(x) = x- 2 'P(x) increasing and limlxl~oo 9(x) = 00. Under the

mixing condition L un 'P-I(U n) < 00, CLT holds (2). If 'P(x) = x2 Ina(Max{x, b}) (with
n=O
b > b(a) big enough), then CLT holds if a > 1 under the assumption of a geometric decay of
the mixing sequence. Now if the function 'P increases very fast at infinity, the condition

L un 'P- I(Un) < 00 is as close as wanted to the mixing condition for bounded random
n=O

variables L un < 00 since 'P- I has a very slow behavior at 0 ( e.g. 'P(x) = e ax yields
n=O

'P-I(u) = InaU ). As a final consideration on this topic, a necessary and sufficient condition

1 The sense of a strong mixing condition is the one defined in § 1.3.2. This is Ibragimov's conjecture.
2 Use the result in Herrndorf (1983) and the covariance inequality in Orlicz spaces from Bulinskii (1987,
1989).

for CLT to hold for a strong mixing process is given by Jakubowski & Szewczak (1990). It is
based on Gordin's argument. We do not recall it since its assumptions cannot be checked using
only the existence of moments and the decay of the mixing sequence.

In Doukhan, Massart & Rio (1994), we give an optimal result [see (3)] under the strong
mixing assumption, at least for an arithmetic decay of the mixing sequence. The CLT problem
is described under alternative mixing assumptions in Nahapetian (1991). For instance, it holds
under 'Jf-mixing under the additional assumption ~ ;::: en.

The case of a p-mixing sequence is of interest since very weak assumptions are assumed for
CLTto hold.

Theorem 2 (Ibragimov (1962), (1975)). If the sequence X is stationary, then the
following results hold.

a) If lim_{n→∞} ρ_n = 0 then either sup_n σ_n² < ∞ or σ_n² = n h(n) for some slowly varying
function h. If lim_{n→∞} σ_n = ∞ and E|X_1|^{2+δ} < ∞ for some δ > 0 then the sequence
S_n/σ_n converges to a standard Gaussian random variable.

b) If Σ_{n=1}^{∞} ρ_n < ∞, then X has a continuous spectral density f, σ_n² = 2πn f(0)(1+o(1)), and if
f(0) ≠ 0, then the sequence S_n/σ_n converges to a standard Gaussian random variable.

If X = (Xt)tE Zd is a real valued stationary centered at expectation and mixing random field,
things do not work the same way. Let An be an increasing sequence of finite subsets in ;Zd. Set

Sn = I, X t' now (J;;- = [E ISnl2 behaves like IAnl if the regularity condition
tEAn
la~1
li~~ I~I = 0 holds [see (4)] as well as a condition analogue to (2)

. laAnl
(4) ltmn~ ~I = 0,

the field X is a-mixing an,Q,b = o(n-d ). for some l5 > 0 LfIX112 + 0 < 00 and for a + b 54

3 Set Q(u) = Inf{t; [P(lXol > t) ~ ul, and a-I for the inverse function of the mixing sequence, then the
I
condition f a-I(u) Q2(u) du < 00 implies the conclusion of Theorem I. Under the moment assumption of
o
Theorem I, the Donsker CLT holds if:i: n2/ O an < 00. This assumption is weaker than the previous one. If
I

IE'I'(IXII) < 00 then the functional CLT holds as long as L 1;(n) an < 00 where 1; denotes the inverse
I
function of '1".
4 It means that the sequence An does not increase in only one direction and dAn denotes the border of An'

~ d-J
£.., n an,a,b < 00.

n=O
Set

Theorem 3 (Bolthausen (1982)). The sequence S_n/σ_n converges to a standard Gaussian
random variable if the assumptions (4) hold and if σ² > 0.

Remarks 2. The case of random fields is considered in Gorodetskii (1984) for various mixing
conditions. Bradley (1992) proves a CLT for a centered and weakly stationary random field X
without any mixing rate assumption. Assume that px(n;=,oo~..:;!, 0 and the spectral density f
of the random field satisfies, then LX/II L Xtll 2 is asymptotically normal. A sufficient
[l,njd [l,njd
condition for the second assumption to hold is r(l) < 1 and r(n~..:;!, 0 where we denote (see
Remark 1) r(n) = SUPy,Z ICorr(Y, Z)I with Y = L at X t and Z = L b t X t for finite
U V
subsets U and V in Z. d distant at least k.

The assumption of stationarity has mainly a technical origin and it seems that the previous
results may be be extended to the non stationary situation. For instance Guyon (1992) extends
the Theorem 3 to the non stationary case. The following result is also a consequence of the
theorem 9.6. in Bulinskii (1990), p. 84.

Theorem 3bis• Assume that there is a 0> 0 such that the following conditions hold

~ d-J M2+1i}
£.., n an J J < 00, and
n=O ' ,

~ d-J
for a + b:::; 4, £.., n an,a,b < 00,

n=O
then limsuPn~~ IAnl- J Lie ov (Xi' Xj ) I < 00.

i,jEAn
If moreover, liminfn~~ IAnrJ ~ > 0, then the sequence S,/Gn converges to a standard
Gaussian random variable.

1.5.2. Convergence rates


Let X = (X_t)_{t∈N} be a real valued, stationary, centered at expectation and mixing random
process. Set N_σ(t) = (1/(σ√(2π))) ∫_{−∞}^{t} e^{−y²/(2σ²)} dy and Δ_n = sup_t |P(S_n ≤ t√n) − N_σ(t)|.

Theorem 4 (Tikhomirov (1980)). Let X = (X_t)_{t∈N} be a real valued stationary centered
at expectation mixing random process such that E|X_1|^{2+δ} < ∞ for some δ > 0. Assume that
0 < σ² < ∞.
If α_n = O(n^{−β(2+δ)(1+δ)/δ²}) for some β > 0 then Δ_n = O(n^{−δ(β−1)/(2(β+1))}).
If α_n = O(e^{−βn}) for some β > 0 then Δ_n = O(ln^{1+δ} n · n^{−δ/2}).
If ρ_n = O(e^{−βn}) for some β > 0 then Δ_n = O(ln^{1+δ/2} n · n^{−(δ∧1)/2}).
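As a purely empirical companion (ours, not from the text), a normalized version of the distance Δ_n can be approximated by Monte Carlo for an AR(1) sequence, comparing the law of S_n/(σ√n) with the standard normal; this is only a sanity check, not a verification of the rates above, and all names are ours.

```python
import numpy as np
from math import erf, sqrt

def delta_n(n=200, reps=20_000, a=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(reps) / np.sqrt(1 - a * a)   # stationary start
    S = np.zeros(reps)
    for _ in range(n):
        x = a * x + rng.standard_normal(reps)
        S += x
    sigma = 1.0 / (1.0 - a)                      # long-run standard deviation of the AR(1)
    z = np.sort(S / (sigma * np.sqrt(n)))
    ecdf = np.arange(1, reps + 1) / reps
    Phi = np.array([0.5 * (1 + erf(t / sqrt(2))) for t in z])
    return float(np.max(np.abs(ecdf - Phi)))     # empirical Kolmogorov distance to N(0, 1)

print(delta_n())
```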

This is similar to the independent and identically distributed case up to a logarithmic


factor. In the independent and identically distributed case Petrov (1975) proves that
6n = D(-in-) if [EXT S(Xi ) < 00 for some function S increasing to infinity. Also recall
S(n )
that Dasgupta (1988) gives a non-uniform bound, extending the classical independent and
identically distributed results in Petrov (1975). The proof uses Stein's (1973) method and will
not be detailed here. It is based on the differential equation satisfied by the empirical
characteristic function. It was extended by various authors to the case of random fields (Guyon
& Richardson (1984) and Bulinskii (1987,1989) or Bulinskii & Doukhan (1990».

If the mixing coefficients satisfy a(n; u, v) = D(e- an ) uniformly with respect to u, v and
[E IX t I2 +O< 00, then Guyon & Richardson (1984, theorem 2) prove that 6n' defined as
previously, satisfies

If the strong mixing coefficients satisfy a(n; u, v) = D(n- a) uniformly with respect to u, v
and [EIX t I2+1l < 00, then Guyon & Richardson (1984, theorem 3) prove that
6n = D(a n-X) with X = (oA1) 2(b-1) and b a 0 (oAl)
2b+oAI 2d(2+o)«6+1)A2)

Bulinskii (1987) proves that, in the case where the mixing sequence a(n; u, v) depends in a
polynomial way of u and v, the results are analogous. This extension includes the case of a
linear growth with respect to u and v proved in Takahata (1983).

In Bulinskii & Doukhan (1990), we investigate the case of low moment assumptions of the
kind [EIXlln o (IX t lv1) < 00. For geometrically mixing stationary random fields with
a(n; u, v) ::;; (u+v)b e- an we get 6n = D(ln- O/2 an). In the \jI-mixing case with
\jI(n; u, v) ::;; (u+v)b e- an , the convergence rate obtained is the one given in the independent
and identically distributed case by Petrov (1975) ~ = D(ln-oan). Note that such convergence
rates are really useful. Indeed as remarked in Reznick (1968), the rate 6n = D(ln-(1+E) n)
yields the LIL for E > O. This idea is also used for instance in Dehling (1983) and in
Doukhan, Le6n (1989). Finally the results essentially still extend to the non stationary case as
shown independently in Bulinskii (1990) and Guyon (1992).

1.5.3. Dimension dependent rates


In order to prove functional limit theorems with convergence rates, one needs
multidimensional central limit results involving explicitly the dimension of the space.

In this section we consider a generic case. Let Z(n) = (Z~>tET be a sequence of [Rd valued
50 Mixing

random processes, then their finite dimensional distributions take the form
Z(n){t l , ... , t h } = (~~, ... , Z~). We shall make use of the Prohorov metric (5), 1t, between
two probability distributions, which is compatible with the convergence in Probability. The
classical proofs of functional limit theorems (in distribution) always have two steps (6). The
first one is the convergence of finite dimensional marginal distributions and the second is the
tightness of the sequence of processes considered. The problem is analogue if one wants to get
a rate of convergence. T is a Polish space, assume that for any fmite subset U in T, Z(n){U}
converges to the finite dimensional distributions Z{U} of some continuous process Z. The
distance in the metric of uniform convergence between Z(n) and Z is bounded by the oscillation
of both processes on balls of T with radius £ > 0, and the sup bound of the distance of their
finite dimensional marginal distributions over subsets U with cardinal h = h(£) such that h
balls with radius £ cover the set T
1t(Z(n), Z)::;; supu 1t(Z(n){U}, Z{U}) + 2~

where IP (sup {IZ(n)(t) - Z(n)(s)l}, d(s, t) < £} 2: ~) ::;; ~ and


IP (sup {IZ(t) - Z(s)l}, d(s, t) < £} 2: ~) ::;; ~

Let Z ::; (Zt)te T be an [R d valued and strong mixing random process. The finite dimensional
distributions of Z take the form X = (XI' ... ' xh) = (Ztl' ... ' Zth)' where [Elx/+ l3 ::;; M2+l3 if
for any t in T, [EIZ/+l3 ::;; M2+l3. Let F = (ff)j~ be a filtration such that for some C > 0 and
o :; a < 1
(1) an = Sup a(ff~, ff j:n) ::;; C an,

where ff{ denotes the a-algebra generated by ffi' ff i+ I , ... , ff j • Now X~ is an [Rk-valued
(equipped with its Euclidean norm) and ffrmeasurable random vector such that the process
(X~)j;::1 is stationary for any k and such that for some ~ > 0
k
(2) IIXj 112+l3 < Ak where Ak 2: ao > O.

This implies that for some constant C, only depending on the mixing coefficients, and any finite
subset ofll'l [See (1)]
~ k 2+,,( 1+"(12 2+,,(
(3) [Elk Xii ::;;C{Card(l)} Ak if~>'Y.
ie I

Set S~ = p-l12 fj=1


Xf and ~~ = Cov S;, assumptions (1) and (2) imply that ~~ ~ ~k as

5 The Prohorov distance between two probability distributions P, Q on a Polish space E metrized by d if
defined. if AE = {x e E; d(x. A)!5: el. by n(P. Q) =Inf{e; "V A. p(AE)!5: Q(A) + el. Borel subsets A
may restricted to the closed ones.
6 Excepted for Hungarian constructions; see Komlos et al. (1975).

7 In view of Theorem 1.4.2. an assumption less restrictive than (I) for (3) to hold is I. r2
r=l
IX ~(4+£) < 00 for
some e < l3 - "(.
Properties: Central Limit Theorems 51

k~ 00, we shall assume that for any k the kxk-matrix Lk satisfies


(4) Lk is positive definite.

Denote by 1tk the Prohorov metric between two probability distributions in [Rk equipped with its
euclidian norm. In this section we bound 1t(n) = 1tk(n)(2J (S~(n))), ciY' k(n)(O, Lk(n))).
ciY' k(O, L) denotes the k dimensional Gaussian probability distribution with covariance L.

Define two sequences converging to infinity, p = pen), q = q(n) and 1= [p~q]. We shall
omit the index n, writing k for ken), S = s~(n) and A for Ak(n) and c will denote a positive
constant which may change from one inequality to another. Also 1tk(2J (A), 2J (B)) will
simply be denoted by 1tk(A, B) for two random variables A and B with distributions 2J(A) and
2J(B).

We may group the random variable p-112 X~ in such a way that S = U + V where V is the
I
sum of less than lq+q such terms and U = L Uj is the sum of q-distant random variables
j=1
such that U. has the same distribution as U 1 = S~. Using the Berbee & Bradley reconstruction
results of §1.2.1., we consider the sum of independent and identically distributed random
I
variable U* = L Uj where Uj has the same distribution as U I and if'Y = 8 - £. [see (8)],
j=1
2 (1EIU.1 2+Y)I/(2+y) (2+y)/(2y+5)
[p (lUj - Uj I ~ e) : :; 18 (u q J e ) .
If we balance both terms' we get
* < (2 2+y 1/(2+y))(2+Y)l{(2Y+5)(3Y+7)}
1tk(U j ,Uj )_c uq(IEIU} ) .
Now 1tk(U*, U) :::; 11t k(Uj, Uj ) thus according to (3) (Resp. see (9))
U 4 A2)(2+Y)/{(2Y+5)(3Y+7)}
(5) 1t k (U*, U):::; c (i T .
2+y 1/(3+y)
In the same way 1tk(S, U) :::; (IE IVI) :
(6) 1t k(S, U) :::; c [A 2(~ + t)](2+y)/{2(3+yn .

Let W, W', W" be respectively ciY' k(O, L\ ciY' k(O, L~), ciY' k(O, \¥- L:) random variables.
Following Doukhan, Le6n & Portal (1985) yields 1tk(W, W"):::;(i - ~) IIL~III + IILk - L~III
and
(7)

8 Resp. !P(vj * Vj) :5; Pq in the p-mixing case.


9 Resp. 1tk (V*, VJ:5; I Pq in the p-mixing case.
52 Mixing

Assuming that ~ A 2 is bounded and p2 = o(qn), we see that the term (6) is neglectible with
respect to the term (7). A truncated version of Yurinskii's result, given in Massart (1987),
yields 1tk(U*, W") ~ c [In 1/2 n + In 112 (lIE IV 11 2+Y](k lIE IV 11 2+Y) 1/4 if 0 < Y ~ 1
(8) 1tk(U*, W") ~ c In 112 n kl/4 n- y/B A (2+y)/4.
Thus
(9) 1t(n) ~c {ln l12 n (Q/ BA (2+y)/4 + (~)(2+Y)/(2(3+Y))} + 1tk(U*, U).
kl/4
n p
[ y -2 2(2+y) (3+')')/(8+7,),+,),2)] . .
Set q = [Q In n] and p = {n k A· } then If Q bIg enough the term
(5) is neglectible and
(10) 1t(n) ~ c In 1/2 n {n -y/2 k A 2(1 +Y)/(3+Y)} (2+')')/(8+7,),+,),2).

We thus obtain the following result

. -u -2(3+y! 2(2+y)(3+y!
Theorem 5. Assume (1), (2), and (4). If llmn --7=n k A = 0 for some
u > (8+r-r)12 then (10) holds. For r= I if limn --7=n- k- A
u 2 6 =
0 for some u> I then
7r(n) ~ c ln1l2 n n- 3132 k 31 / 6 A31/6.

Remark 3. The control of 1t(n) is effective if A + 2(I+y)J.t < (3+y)12 when ken) = [nA.]
and Ak(n) = [ni-!]. If y = 1 this condition is rewritten A + 6/-1 < 3.

If one gets a coordinatewise control of X~ = (~L I'"'' ~~,k)j;::I' that is lI~f,hIl2+0 < M for
h = 1, ... , k, then Ak < kll2 M and the last condition is rewritten A < (3+y)/[2(2+y)]. If
y= 1, A< 112 the dimension must increase more slowly than {n.

In the independent case Yurinskii (1977) obtains 1t(n) ~ c In l/2 n k 51B n- IIB while the result
. 112 9116 -3/32
wrItes here 1t(n) ~ c In n k n .

We now present generalizations of Theorem 5 which extend also the results in Doukhan, Le6n
& Portal (1985).

Arithmetic mixing cases. According to the footnote before inequality (3) we have to
• ~ 2 £/(4+£)
assume the convergence of the senes £... r a r < 00 for some e < () - y.
r=1

a-mixing : If qa a q is a bounded sequence, set q = [nh], h = (3y+ 7)(2 y+5), then a factor
4a(2+y)
n h(2+y)1 (2(3+y)) appears in inequality (10), that is for y = 1 a factor n35112a. If the random
variable are bounded, () = 00, we also have to assume a> 3.

The loss is reduced in the following case.


~-mixing : If qb ~q is a bounded sequence, use the same p as in formula (10) and set
q = [nl/b], then a factor n(2+y)/(2b(3+y)) appears in inequality (10), that is for y= 1 a factor
Properties: Central Limit Theorems 53

n 3/8b •
If the random variable are bounded (0 = 00) we also have to assume b > 3.

Survey of the literature


The case of CLT for independent sequences is reviewed in Petrov (1975) and CLT for general
sequences is given in Serfling (1968). The first CLT for mixing random sequences were
proved in Rosenblatt (1956) for the a-mixing case and in Ibragimov (1962) for the q,-mixing
case. Before this, Doob (1953) proved the CLT for Doeblin recurrent Markov chains. Theorem
1 itself was obtained by Ibragimov & Linnik (1971) for bounded random variables using
Gordin's (1969 a) method.lbragimov (1975) conjectured that the functional CLT holds under
the q,-mixing assumption (1), Peligrad (1990) partially solves this problem. He also conjectured
that CLT holds under p-mixing if the behaviour of the variance of partial sums is regular; the 41-
mixing case is investigated by Dehling, Denker and Philipp (1986). Davydov (1968) proves
Theorem 1 for random variables with (2+0)-th order moments but using a stronger mixing
assumption. Oodaira and Yoshihara (1972) prove Theorem 1. Gordin (1973) proves the CLT
without a second moment assumption. This result is first reported in Gordin (1969 b) with a
misprint. Bradley (1988) details this historic misprint. Herrndorf (1984) proves the functional
CLT under 'I'-moment conditions for non stationary sequences under an additional condition on
the variance of sums. Bulinskii & Doukhan (1987) show that this condition holds for stationary
sequences. Further references are Wolkonski & Rozanov (1959), Reznick (1968), Billingsley
(1968), Mac Leish (1975) and Ibragimov & Rozanov (1978). Ibragimov (1962) and
Kolmogorov & Rozanov (1960) studied the p-mixing case (see also Bryc & Smolenski
(1993». Uteev (1984), Takahata (1986), Yokoyama (1980) and Yoshihara (1978) as well as
Gastwirth & Rubin (1975) prove related results. Nahapetian (1991, chapter 4) describes in
details the various techniques for proving CLT results in the mixing cases. Davydov (1973),
Bradley (1983, 1985, 1986, 1987) and Herrndorf (1983) obtained very precise CLT inverse
results for mixing sequences. Now Gorodetskii (1984), Bolthausen (1982), Bulinskii (1987,
1989), Nahapetian (1980), Neaderhouser (1978), Hegerfeldt & Nappi (1977), Eberlein &
Csenki (1979) prove results for mixing random fields. Let us also recall the results of Samur
(1984). He sets necessary and sufficient conditions for q,-mixing triangular arrays to satisfy
CLT. Finally Doukhan, Massart & Rio (1994) seem to improve the results in the strong mixing
for the case of random sequences. We do not give the precise results but a large amount of the
literature gives necessary conditions for CLT to hold using a mixed assumption: a-mixing and

L Pk < 00. A good bibliography is given in Hall & Heyde (1980) and Peligrad (1986). Rates
k=O
of convergence in the CLT are given by Tikhornirov (1980), Guyon & Richardson (1984),
Bulinskii (1989) and Bulinskii & Doukhan (1990). Note that invariance principles are reported
in Philipp (1985). Such results obviously yield convergence rates in the functional CLT.
Unfortunately, up to now, they are far from being optimal. In the independent and identically
distributed case, recall that Kornlos, Major & Tusnady (1975) explicit the strong rate for the
invariance principle; it is close to n-112.

The case ofCLTwith explicit dependence in the dimension is originated in Yurinskii (1977) for
the independent and identically distributed case) and was studied by Dehling (1983), Doukhan
& Portal (1983, 1987), Doukhan, Leon & Portal (1984, 1985) and Massart (1987). It gives
rise to invariance principles for the empirical measure. See Philipp (1986) for a bibliography
and, for instance, Doukhan & Leon (1989), Doukhan, Leon & Portal (1987), or Massart
(1987) for additional convergence rates. We do not recall here the functional CLT results
obtained this way for the empirical measure.
55

2. Examples

Our aim is in this part to make clear the mixing properties of random processes and
fields classically used in probability and statistics. Reviews concerning examples of mixing
sequences and fields are given in Bradley (1986) and in Iosifescu (1980) or in Roussas &
Ioannides (1987). Unnatural examples and counterexamples will not be considered. As far as
possible, we present explicit conditions on the parameters of the proposed models for a given
decay of the mixing coefficients to hold. In the discrete time Markov case, for instance, we
shall not make use of the properties of potential kernels because they are practically impossible
to verify. In order to avoid repetitions, we shall consider in the same section the random
processes and fields of the same kind.

Section 2.1 contains an extension of the results in Wolkonski and Rozanov (1959, 1961)
communicated by Ibragimov and published in Doukhan & Guyon (1991). It concerns with
Gaussian random fields.

Section 2.2 contains part of the results in Guyon (1986), Georgii (1988) from Dobrushin
(1970), Kiinsch (1982), and Follmer (1988). We present there the case of mixing Gibbs
random fields.

Section 2.3 presents in detail an extension of Gorodetskii (1977),s result published in Doukhan
& Guyon (1991) and concerning the mixing properties of linear random fields.

Section 2.4 contains properties Markov chains with general state spaces. Davydov (1973)
characterized the mixing properties of Markov processes. After a quick review of general
results, we give the sufficient conditions for geometric <II-mixing deduced from Doob (1953)
[see(l)] and geometric~-mixing in Mokkadem (1985-1990) [see (2)]. As it is shown in
subsequent subsections those results lead to results concerning the most familiar times series
see in Mokkadem (1986, 1987), as well as their natural generalizations, see in Doukhan &
Tsybakov (1993) and in Ango Nze (1992).

Section 2.5 gives some properties of continuous time processes. We mainly consider Markov
processes and diffusions. After this, we point out the properties of hypermixing - Chiyonobu &
Kusuoka (1988). Those properties lead to a satisfactory large deviation principle. They are
linked to the mixing properties of Gaussian processes and to more analytic properties of the
Markov processes as Hypercontractivity, Ultracontractivity, Sobolev's logarithmic inequality
and the spectral gap. In this case, the results in Bakry & Emery (1985) yield simple and explicit
hypermixing sufficient conditions.

We consider, in this part, a random field X, that is a family ofrandom variables X =(Xt)teT indexed by
some metric space T and defined on a probability space (n, JI/:., IP), taking values on a measurable set (E, (;)
(E is caHed the state space), usuaHy E =IR and (; = ffi (IR). We shaH denote by ~ (n, JI/:.) the set of
probability measures on (n, JI/:. ) and by J'4, (n, JI/:. ) the set of measures on (n, JI/:.). If T =IR, IN or Z then
X is a process, otherwise it is a general random field.
1 Doob provides sufficient conditions for Doeblin recurrence of a Markov chain. ell-mixing was not defined
there but in Ibragimov (1962).
2 General theory ofR-recurrence for Markov chains from Tweedie (1974) led Nummelin & Tuominen (1982)
to sufficient geometric ergodicity conditions involving additional irreducibility assumptions. Mokkadem
obtain sufficient conditions for those assumptions. Markov chains are thus shown to be geometric ~mixing.
Examples: Gaussian Fields 57

2.1. Gaussian random fields

After some general results relating mixing properties of a Gaussian random field, we
propose an explicit bound of the mixing coefficients of such a random field based on the
approximation properties of its spectral density in § 2.1.1. In § 2.1.2. more precise results
characterize the decay of such coefficients for Gaussian processes. In this chapter,
X = (Xt)teT denotes a stationary Gaussian random field indexed by some metric group T.

The first result proves that ljl-mixing condition is in this case highly restrictive.

Proposition 1 (1bragimov, Linnik (1971)). The ¢-mixing condition implies m-


dependence for stationary Gaussian random fields.
Proof. Let (X, Y) be a zero mean, bidimensional random vector with Cov(X, Y) = r > 0
and Var X = Var Y = 1. Set A = {X >~} and B = {O:::; Y:::; I}. A direct computation
shows that HP(AnB) -!P(A)!P(B)I2': ao !P(A) for some constant ao independent of r. Setting
1 t 2
<1> the gaussian repartition function, <1>(t) = . M:; f exp {- x2 } dx, yields
v21t ~
~l 2 2 ~l 2 2
!P(AnB) - !P(A)!P(B) = f f exp{ - x -2rxy+y} dx dy _ f f exp{ _ ~} dx dy
2/r 0 2(1-~) 21t{17 21r 0 2 21t
1 ~ x2 l-rx - rx
= .M:; f exp{ - :r} [<1>(~ ) - <1>(~ ) - <1>(1) + <1>(0)] dx.
v21t 2/r 1-~ l_r2
Bounding the quantity inside brackets leads to the result with ao = 2 <1>( 1) - <1>(2) - <1>(0).
This yields the result because a ljl-mixing Gaussian random field has only a finite
number of non zero correlations .•

The p-mixing condition is related to a-mixing by the following inequality proved by


Kolmogorov and Rozanov (1960)

Theorem 1. For A, BeT the following holds


ux(A, B) ::; Px(A, B) ::; 2 1C ux(A, B)
Thus, for a, b, and k > 0 ux(k; a, b) ::; Px(k; a, b) ::; 2 1C ux(k; a, b).

Proof. We only need to prove the right member inequality for ax(A, B) :::; i- Let 10 > O.
There exist normalized Gaussian random variables x and y measurable with respect to X A and
X B such that r = IExy 2': Px(A, B) - e. For this use Lemma 2.1.1, proved independently
below. Set U = {x> O} and V = {y > O}. A direct computation yields
!P(UnV) _1 + Arcsin rand !P(U) !p(V)_l
- 4 2 1t - 4'
58 Mixing

The inequality Ar~si~ r ::::; cx X (A, B) follows. Hence we have proved that
Px(A, B) - E::::; r::::; sin 21tcx x (A, B)::::; 21tcx x (A, B) .•

2.1.1. An explicit bound


We now give a sufficient strong mixing condition for fields indexed by T = Zd. The
following result is a consequence of Rosenblatt (1985). Rosenblatt (1985) states that strong
mixing holds under the assumptions of the following Theorem 2. It was communicated by
Ibragimov (see also Ibragimov (1962)). This yields explicit bounds for the mixing sequence
associated to stationary Gaussian random fields. This result is well known in the one
dimensional case (d = 1). It was proved first by Kolmogorov & Rozanov (1960) and may be
found, for instance in Ibragimov and Rozanov (1978).

Theorem 2. Let X = (Xt)tE Zd, be a stationary Gaussian random field centered at


expectation,with aspectraldensityf(A) =
LfEXOXtei!..1 such thatf(A)~a>Ofor A in
tE:izd
[0, 27(1. Then the maximal correlation satisfies
I
Px(k) ~ t1 k (f), a
where t1 k (f) = Inffllf - PII=} is the best uniform approximation off by (k -l}-th degree
trigonometric polynomials, prAY = L c t ei!..I.
Itkk

Using the bound of Proposition 1.1.1., cxx(k) ::::; Px(k), yields

Corollary 1 (Ibragimov, Rozanov (1978)). Under the assumptions of Theorem 2,


I
ax(k) ~ t1 k (f)· a
d
Example 1. Write zt = II zt! for z = (zl , ... , zd)' and t = (t l , ... , t d). Let g be a
i=1 1

continuous function which does not vanish on the torus If d = [0, 21t]d, with Fourier
expansion g(z) = L gt zt. Let Z = (Zt)te ;Z d be a Gaussian white noise. Consider
tEZ d
d, where X t = '"
£... gt-s Zs' Then f(A,) = 1 '"
2
X = (Xt)te ;Z £... gt e iA.tI .
IS clearly bounded
SEZd tEZ d
below by some a> 0 over lfd. Consider the Fourier expansion of f, f(A,) = L c t e iA .t with
tE Zd
ct = L gtgt-s' Noting that Llk(f) = Inf{ IIf - PII",,} ~ IIf - Pk""" ::::; L Ictl where
SE Zd Itl>k
Pk(A,) = L ct eiA.t, Theorem 2 and Corollary 1 yield
Itl";k
cxx(k) ::::; Px(k) ::::; ~ L 1 L gt gt-s i : : ; ~ SUPtlgtl L L Igtl.
Isl;;,k tE Zd Isl;;'k Itl;;'lsl/2
Examples: Gaussian Fields 59

This bound yields the following interesting classical result.

Corollary 2. Assume that X = (Xt)tE;Zd is a stationary Gaussian random field such that
Cov(Xa, Xs) = O(lsr A) for some A > d and the spectral density of X is bounded below,
then ax(k) = O(kd -A).

Another result close to this one is concerned by absolute regularity coefficients.

Theorem 3. (Wolkonski & Rozanov 1961) Let X = (Xt)tE;Z be a gaussian stationary


process with spectral density bounded below by a. Then its f3-mixing coefficients satisfy with
ben) = L L ICov(X o, Xu)l,
s~ u~s

l. f3X.n < 1
lmn~oo ben) - a'

This result is not proved here.

Remark In order to compare Theorem 3 with Corollary 2, let X = (Xt>tE 2? be a gaussian


stationary process with spectral density bounded below such that Cov(X o, Xt) = O(rA) then
/3 n = 0(n 2 - A) and an = O(n I-A). This shows that the /3-mixing condition is really more
restrictive than a-mixing, even for stationary gaussian sequence. For the case of stationary
random fields with index set ;Zd for d > 1 this is stated by Theorem 1.3.2. which proves that
the /3-mixing condition implies m-dependence. More precision are provided for the case of
gaussian sequences in Section 2.1.2.

Proof of Theorem 2. First note that it is enough to prove the following bound for any
k-separated subsets A and Bin ;Zd,
Px(A, B) ~ c(f) dk(f)·

a) Reduction of the problem

Lemma 1. Let X be a Gaussian process. The maximal correlation between


L2( X B), Px(A, B), satisfies
Px(A, B) = Sup{ IfE¢A¢B1; ¢lA E NA(X), ¢lB E NB(X)},
where NA(X) is the set of ¢lA in the Gaussian space generated by (Xt)tEA with fE¢lA = 0, and
2
fE¢lA = 1.

Proof of Lemma 1. We propose here a different proof of this classical result shown in
Kolmogorov and Rozanov (1960). Consider first finite subsets A and B with
IAI = n ~ m = IBI. Up to a linear transformation we can assume the random vectors X A and
X B to be normalized, with CovX A = In' CovX B = 1m' Cov(XA, XB ) = (Dn' 0n,m-n)' Dn
is a diagonal (n, n)-matrix with PI 2 ... 2 Pn 2 0 entries. On m-n is the zero matrix with order
(n, m-n). Canonical analysis results yield such a decomposition. Set A= {i l ,··, in},
B = (jl, .. ,jm}' Let gA(XA)E L 2 (X A), then Hi denoting the i-th degree Hermite
polynomial
60 Mixing

gA (X A) = L
a i Hi(x A), with H/x A) = Hi/xk) where i = (i k; k E A). II
iE!N A kEA
Now [gA(X A) = 0 is rewritten as ao = 0 and

L i- = 1, with i! = kEA
2
[g1(X A) = 1 is rewritten as II k!,
iE!NA1.
2
Analogously gB(XB) E L (X B ) may be represented as
gB(x B) = L bj Hj(x B) = gB(x B) + g~(xB)' with gB(xB) = L bj H/x B)·
jE!N B j=(i,O)
The last summation extends to such j = (i, 0) E [NB'X{O} depending only on {Xjk' k:::; nl,
B' = {h ,... , jn} and to} is an element of [NB\B'. Hence
, ,pi i
[gA(X A) gB(X B) = [gA(X A) gB(X B) = ~ ai b i 7f, where p =
ik
Pk'
II
iE!N 1. kEA
The previous inequality yields
'" la· b·1
l[gA(X A) gB(XB)I:::; PI ..t.... ~:::; PI
A 1.
= [Xi xJ..• 1 1
iE!N
b) Let hEN A (X), <I>s E N B ( X)
Provide L2 with the following scalar products and induced norms

«\>, \If> = f <\> \jf fdA,


112
(<\>, 'II) = f <\> \jf dA,
112
11<\>lI f = <<\>, <\» , 11<\>112 = (<\>, <\» .

,112 2 2
Then 11<\>11 2 :::; a 1I<\>lIf. Note that [<\>A = II<\>Allfand
<<\>A' 'liB> = f[ LYt ei1 '][P(A)+(f(A)-P(A»]dA = f[ LYt ei1t][f(A)-P(A)]dA,
tEA-B tEA-B
for any (k - l)-th degree polynomial P. Indeed, It I ;::: k for tEA - B [set of differences
between elements in A and in B]. Thus, since II<\> AII2 :::; a'I!2I1<\> Allf:::; a'll2,
<<\> A' 'liB> = (<\> A[f - P], 'liB) :::; IIf - PIUI<\> AI12 II'IIBII2:::; a1 L\k(f)·.
Note, also that px(A, B) may also be estimated without the minorization condition on f. Let
c(A, f) = Sup{ II<\> A112; II<\> Allf = 1} = Sup{ c t2 ; L L
C s ct rt-s = 1} < 00, the previous
tEA s,tEA
proof shows that if the set {f > O} has nonzero Lebesgue measure then
Px(A, B) :::; c(A, f) c(B, f) L\k(f). Unfortunately we do not know general explicit bounds
for c(A, f). Such bounds could avoid the minorization condition on f.
Examples: Gaussian Fields 61

2.1.2. Mixing rates


Turning back now to the case of processes, we will give precise results. Yaglom (1963)
proposed the problem of characterizing the mixing Gaussian stationary processes in terms of
the properties of its spectral measure.

The problem was solved by Helson & Sarason (1967) for discrete time processes and by
Hayashi (1981) and Dominguez (1989) for continuous time processes. Let ~ denote the spectral
measure of a stationary Gaussian process (X t); f..l is defined on [0, 2it] if t E Z and ~ is
defined on [R if t E [R. If ~ is not absolutely continuous, then the process (X t) is not regular
and thus it is not mixing. Lemma 1 and the proof of Theorem 2 imply that if ~ absolutely
continuous with a density w(x)
PX,t = Sup If fl (x) f2(x) e itx w(x) dxl.
The supremum is considered over fi' i = 1,2 which are respectively polynomials in e isx for
s ;e: 0 and in e isx for s :5: 0 with f Ifi(x)1 2 w(x) dx:5: 1. The problem is to characterize the
set W of weight functions w such that limH~ PX,t = O.

In the case of functions on [0, 2it], introduce the set Woof weight functions w such
that for any 10 > 0 they are functions r, s, t in L2([0, 2it], dx) with r continuous on the torus,
"s,,~ < 10, "tll~ < 10 and such that In w = r + s + t. We denote by s the harmonic conjugate
of s if s is square integrable ( 1).

Theorem 4. (Helson-Sarason 1967, Hayashi 1981 and Dominguez 1989)


In the discrete time case, W is the the set offunctions w = IPI 2 Wo where P is a polynomial in
e ix and Wo E Woo
In the continuous time case, W is the set of measures j1 such that for any £ > 0 they exist
functions r, s, tin L2(fR, dx) and a> 0 with r in HI, IIsll= + II til = < £ such that
j1(dx) = Ir(x) I et(x) dx and sex) = Arg(r(x) e- iax ).

Note that any positive or negative power of such a function Wo is integrable, this may allow to
detect non mixing processes.
Ibragimov & Rozanov (1978) prove that for a stationary Gaussian discrete time

process, ~-mixing is equivalent to the fact that f(x) = IP(eix )12 exp{ L aj e ijx } for some
.i=

polynomial P and some real sequence (aj) with L Ijl la/ < =.
j=
The previous results are optimal. Ibragimov & Solev (1969) give an example of a
stationary strongly mixing Gaussian process which is not absolutely regular. Ibragimov &
Rozanov (1978) also give such an example. It is associated with the spectral density

1 For s in L2 there is a unique harmonic function h with h(O) = s(O) such that Re(h) = s; now s = Im(h).
62 Mixing

f(x) = exp{ f
j=o
2- ljl ei22jx)}.

~ 1 ..
The example f(x) =exp{ .£.. J' In . eIJX } in Ibragimov & Rozanov (1978) shows that
J
.r-oo
discontinuity of the spectral density at the origin does not hold only for long range dependent
sequences (see Rosenblatt (1985» since the corresponding gaussian process is strongly
mixing.

Characterizing the spectral measure for a specified decay rate of the mixing sequence is
another problem proposed by Ibragimov and solved by Dominguez (1989) using the extension
techniques initiated by M. Cotlar (Caracas, Venezuela).

Let vt be a junction decreasing to zero at infinity and JL be the spectral measure of the stationary
Gaussian process (Xt)tE fR; in order that Pt = O(vt) for t ~ 00 it is necessary and sufficient
that there exists ao ~O such that for all a ~ao' there is ra E HI and ta such that
JL{dx) = Irix)1 etix) dx and IItalioo =O(va), IIsalioo =O(v~) for six) =Arg(rix) e-iax).
Dominguez (1990) and Cheng (1992) prove multivariate analogues of this result. Moreover
Cheng does not directly consider gaussian random fields but only second order stationary
random fields. Indeed if one considers linear correlation coefficients all the previous results
extend to this framework and cosines of the angle of vector spaces spanned by a random field
may be attained by using spectral properties.

Survey of the literature


Wolkonski & Rozanov (1959), Kolmogorov & Rozanov (1960) and Ibragimov (1962)
described the properties of Gaussian sequences that may be found in Ibragimov & Linnik
(1971) or in Ibragimov & Rozanov (1978). The case of random fields is investigated in
Rosenblatt (1985), Guyon (1986) and Doukhan & Guyon (1991).

Yaglom (1963) proposed the characterization problem solved by Helson & Sarason (1967) for
discrete time processes and by Hayashi (1981) for continuous time processes. Dominguez
(1989) characterized the decay of the mixing coefficients for the continuous time processes and
Dominguez (1990) (see also Cheng (1992» proves a matricial extension of the Helson-Sarason
theorem and a characterization of some multivariate linearly completely regular processes.
Examples: Gibbs Fields 63

2.2. Gibbs fields

We begin in § 2.2.1. with Dobrushin theory for random fields defined through
conditional marginal distributions. The comparison result of § 2.2.1.1. yields Dobrushin's
uniqueness condition (§ 2.2.1.2.) and then a mixing condition arises in § 2.2.1.3. The
fundamental example of such random fields is the Markov field case (§ 2.2.2.), it is described
in terms of potentials in § 2.2.2.1. The non compact case is evocated in § 2.2.3. with the
examples of point processes in § 2.2.3.1 and diffusion based random fields in § 2.2.3.2.

We follow here the presentation proposed in Georgii (1988) and in Kiinsch (1982) for the
results of Dobrushin (1970). A simplified version of those results is provided in Guyon
(1992). Let T be some denumerable set - T is called the parameter space, e.g. T = ;Zd.

2.2.1 Dobrushin theory

The canonical version of X is defined on the product space (n, .1/:,) given by n = ET,
I8iT . .
.1/:, = (; , X t( ro) = rot for ro = (rot)tE T' E IS a PolIsh space called the state space. A
probability measure m on (n, .1/:,) defines the distribution of the random field (X t).

Set 0/ = {V c T; 0 < IVI < oo} . .1/:, is the smallest a-algebra on n containing the cylinder
events
I8iV
.1/:,V={{X V EV};VE{; }, V E 0/.

Let (R, ~), (S, 8) and (V, 'lJ) be measurable spaces. A kernel from (S, 8) to
(R, ~) is a function x: ~ x S ~ [R +, such that x(.! s) is a measure on (R, ~) if s E S
and x(VI.) is 8-measurable if V E ~. Ifx(RI.) = 1, x is called a probability kernel from
~ to 8. If A is a kernel from (V, 'lJ) to (S, 8), then A x is the kernel from (V, 'lJ) to

(R, ~) defined by Ax(ZI r) = JA(dsl r) x(ZI s), Z E 'lJ. II.IIVar is the norm in variation
».
of signed measures (see (1 The probability kernel x is said to be proper if R = S, ~ c 8
and x(VI .) = llv. The measures m on (S, 8) are mapped on measures Jl on (R, ~) by the
relation Jlx(V) = JJl(ds) x(VI s). The conditional probability Jl(VI 8') is the conditional
expectation of lEJl(ll v l 8').

Let x = (XV)VE 0/ be a family of probability kernels such that Xv is a probability measure


from (n, .1/:,v) to (n, .1/:,). A random field Jl is said to exhibit a dependence specified by x if

j1-a.s.

1 For instance 11911 = ~ L 19(x)1 if E is finite or denumerable.


XEQ
64 Mixing

A specification p with parameter space T and state space (E, C;) is a family of proper
probability kernels p = (PY)YE'V' where Py(A, x) is defined for A E 054. y , x E ETW. We
also define following Hillmer (1975) the associated kernels 1ty (A, x) for A E 054., x E Q,
by the relations

7rV<A, .) is d1\V measurable.

7rV<., x) is a probability measure on (Q, d) and equals Ox on d1\V and pV<., x1\V) on d v .

7r v (7r wf) = 7r vf for We V E 0/, fE C(Q).

The last condition is a consistency condition Let ceQ) denote the space of continuous and
bounded function on Q equipped with the uniform norm. The specification will be assumed to
be continuous, that is 1tyf E ceQ) if f E ceQ) with 1tyf(x) = f 1ty(dy, x) fey).
A Gibbs state /l for a specification p is a probability measure on (Q, 054.) with
/l(AI 054. T\c) = Pc(AI.) for A E 054. and C E qr - that is /l(1tyf) = /l(f) for f E C(Q).
The set of Gibbs states,
tl (p)= {mE ~(Q,054.);m(UI054.TW)=py(UI.)m-a.s. V AE 054., VVE qr),
is convex. If the state space E is compact, Dobrushin (1970) proves the existence of Gibbs
states (2), and that tl (p) is compact. If W E qr and x E E W, Y E E T\W, we define (xy) as
the element of Q with corresponding coordinates x and y. Extremal points in tl (p) are the
measures with a trivial tail field. In the case T = Zd, extremal points in the set tl s(p) of
stationary Gibbs distributions are exactly the ergodic stationary measures in tl (p).

In the case where tl (p) has exactly one element, this measure is called the Gibbs state of the
system. We say that there is no phase transition.

2.2.1.1. Comparison between specifications

Let (pi)i=I,2 be two continuous specifications. Define, for x, y in Q with Xu = Yu if u *- t,


the expressions
1 s,t = 2:1 SUPx,y,i=1,2I1Ps(·1
i i
x) - ps(·1 y)lI yar , for s *- t E T and 1s,s = O.

We define the matrix r = (1s,t)s,t and its restriction to V, r y, as well as

£..J
y = '
Xs,t " (r yn )s,t and
n=O

2 Indeed any sequence ltv n(.I x) has a weakly convergent subsequence; choosing a sequence of finite Yn
increasing to T leads to the result. The compactness assumption may be relaxed: see Georgii (\ 988).
Examples: Gibbs Fields 65

Set 13 s = ~ SuPx IIp~(.1 x) - p;(.1 x)IIVar and Pa(f) =SUPx,y;ta If(x) - f(y)1 for x E ET\{a}

and f E C(Q). For any x, y in Q and fin C(Q) we have If(x) - f(y)1 $; L Pt(f).
tET

Theorem 1. Assume that the specifications pi, i = 1, 2 are continuous and

\.i t E T, L
seT
rs,1 s a < 1, and fii E §(pi).

Then for fE C(a), lfi](f) - fi2(f)1 sLs,t f3 s Ys,IPlf)·


We need for this result the following fundamental Lemma. Define a point a = (at)teT E [RT
to be an estimate for 111 and 112 if for any f E C(Q).
1111(f) - 112(f)1 $; L at pt(f).
t

Lemma 1. Set, for any s fued in T, a( s) = (at< s» E [RT with arCs) = at if s 7: t and
a is) =as + L au rU,s then a(s) is an estimate for fi] and fi2'
u;t,s

Sketch of the Proof. Indeed the fact that 111 and 112 are specifications implies
1111(f) - 112(f)1 $; 1111(1t}f) - 111 (1tJf)1 + 1111 (1tif) - 112(1tJfl
$; 13 s pt(f) + Ls
as P s(1tif) $; 13 s pt(f) + L
s;ft
as Ps(f) + L s;ft
as "(s,t Pt(f)·

The second inequality follows from p t(1tsf) ::;; pt(f) + "(s,t pt(f) if s 7: t and = 0 else.

Sketch of Theorem I's Proof. Define now a by '\ = 1, as =0 for s if:. t and apply the
Lemma 1 for various well chosen values of t. A contradiction can be exhibited allowing to
complete the proof (see Kiinsch 1982, Theorem 2.1.).•

2.2.1.2. Dobrushin's condition

Define the influence matrix of p by r = ("{s.t)s,t with "(s.t = i SUPx,y IIPs(·1 x) - ps(·1 y)II Var '
where x and y in Q are subject to the restriction Xu = Yu for u if:. t and s, t E T. The
following Dobrushin's condition gives the uniqueness of the Gibbs measure.

Theorem 2. Uniqueness hold if a = SUPt L rs,t < 1.


SET

Proof. Consider only two elements 111 and 112 of ~ (p) in Theorem 1 to prove Theorem 2.•

The Gibbs underlying measure will be written!! £Y(p) = I!!}]. Dobrushin's uniqueness
condition is a sufficient condition for mixing properties to hold. This gives an idea of the
limitation of the mixing techniques: e.g. the physical phenomena which exhibit a phase
transition property are not considered.
66 Mixing

2.2.1.3. Mixing condition


Another important corollary of Theorem 1 is the following inequality

\t V E 0/, \t x E n,
S,tEVUf!'V uf!'V
for a continuous specification p and a corresponding Gibbs measure ~ (see (3».

Let V, W be disjoint subsets in T, with V finite then cp(V, W) =sup{I~(A I B) - ~(A)I} for
A E ffb y , B E ffb w may now be bounded. Then the following inequality holds

IJl(AI B) - Jl(A)1 ~ LLL Yu,tX;'tPl/).


SEVtEW uf!'V

Consider the infimum of this expression with respect to f and use the inequality i1. I S; Xs,1 to
get.

Theorem 3. The uniform mixing coefficients satisfy, assuming Dobrushin's uniqueness


condition
t/J(V, W) ~ LLL
SEV uf!'VtEW
Yu t Xs t·
• •

Proof. The dominated convergence theorem implies a bound for the mixing coefficients. For
fixed t, L 'YU,I converges to 0 when V increases to T. Hence the following inequality gives the
Uf!'V

result, L L 'Yu,1 XS,I S; a L XS,I S; _a_.•


SEV Uf!'V s 1- a

Remark 1. It may be shown that the cp-mixing assumption implies uniqueness of the Gibbs
measure.

3 Indeed, fix V, pv is a Gibbs stale on Q' = EV for the specification (pW)wcv thus Theorem 1 gives

In v f(x) - n v f(y)l::; f J
f(ux)(p v (dulx) - pv (duly)) + (f(ux) - f(uy)) p V(duly)

: ; s,IEV
L UEVL Yu,t Xs~t PI(f) + ueV
L pif)·
Examples: Gibbs Fields 67

2.2.2 Markov fields


T is equipped with the metric d. The random field is said to be k-Markovian if the distribution
of the random field on V conditioned by its values on T\W for some V c W is the same as if it
was conditioned by its values on dk(T\W) = {t ~ W; d(t, V) ~ k}. If the random field is
k-Markovian then 'Ys,t =0 for des, t) ;::: k.

Hence (rn)s I = 0 if des, t) ;::: nk. It yields L (rn)s I ~ an if des, t) > r k. This and
SET
rk
Xs,1 = L (rn)s,I' together imply L X s,1 ~ L L (rn)s,1 ~ a
n>kr s,d(s,t)<!rk SET n>kr I-a
Let now B cAE qJ', and r be an integer with deB, Ac) > r k, the triangle inequality
shows that <p(A, B) ~ L L L 'Yu s Xs t and
. IEB SE1\{t} d(s,B»(r-llk' ,

"'(A
'I' ,
B) <
_a IBI S UPteB ~
£.. < _l-IBI a dCA, B)/k .•
Xs,t-
SE A,d(s,Bl>(r-llk I-a

Theorem 4. Assume the Dobrushin's uniqueness condition. If the random field is k-


Markovian, there exist some positive constants C, c such that
l{J(k; a, b) ~ C (al\b) e- ck .

2.2.2.1. Potentials

Let v be an a priori single spin measure, that is a measure on E, we may define for any finite
subset A of T the measure on EA

A potential function Iw: n -7 lR is a function on n defined for W E qJ'. Iw is called


admissible if for any x of n, Zv(x) = f exp{ - L Iw(Yx)} vv(dy) < 00. Define for

V finite in T,
Pv(dyl x) = Z~x) exp{-
WeT,
L VnW;o'~
Iw(yx)j vV(dy),

where Iw is admissible and Iw(Yx) is defined if W E qJ' for x E E w, Y E E T\W is such


that L IVI SUPxe n IIv(x)1 < 00. One can show that p is a specification to a Markov field.
V3a
As previously, (xy) is an element of n. If E is denumerable we assume that V is the counting
measure on E.
If Iw summable, that is IIIIIa = L
SUPxe nIIv(x)1 < 00 for any a in n, then the potential.is
V3a
admissible if and only if veE) < 00. It is the case if Iw = 0 for diameter(W) ;::: k and then the
random field is k-Markovian.
68 Mixing

IP (U (") V) . IP (U (") V)
Remark 2. Set \jI 0 (V , 'If) = sup IP IP I/mf IP IP I where extrema are
(U) (V) (U) (V)
defined for U E V and V E 'If such that IP (U) IP (V);c O. Bryc (1992) theorem 6.1.
provides a \jIo-mixing criterion for stationary Gibbs Markov fields in terms of interaction
potential (4). These coefficients are clearly related to \jI-mixing coefficients. Since it does not
take the same form as previous ones we do not present it in details but it gives rise to large
deviations results.

Example 1. If X is a k-Markovian discrete valued random field such that each point may be
visited, that is ~({x}) > 0, then Hammersley & Clifford prove that there is such a potential
function (see in Besag (1974) or in Guyon (1992».

eh dJ.l
Lemma 2. Let J.lo be an arbitrary measure on n, define dJ.lh = 0 for h E C( n)
Jeh dJ.lo
then II J.lg - J.lhIlVar.s;' IIg - hll oo'

This Lemma (see (5» yields IIPs(.1 x) - ps(.1 y)IIVar ~ L IIIvlloo' s, t E T, where Xu =Yu
V3s,t
for U;C t. Simon (1979) proves that Dobrushin's uniqueness condition holds here if
IX = Max t L (lVI - 1)IIIvlloo < 1.
V3t
This condition may be weakened to SUPtL (lVI- I)(Max Iv - Min Iv) < 1, or replacing
V3t
(Iv) by (Iv + H), for an arbitrary constant H. Now Theorem 4 asserts the geometric decay of
the associated cp-mixing sequence.

Example 2. Set N(t) = {s E T; 0 < d(s, t) ~ I}, N(t) is called the set of neighbours of
t in T. Assume that Iv = 0, for IVI > 2 with I{t}(x) = at Xt' I{s.t}(x) = b s•t Xs x t for S;c t.
Uniqueness holds as soon as
SUPt L Ibs.tl IIxlI,,! ~ IX < 1.
seN(t)
If E c [a, b] and lal ~ Ibl this means
IIxll;, ~ b 2 _ a2 •

Assume now that the distribution ~ is k-Markovian on T and has a density Q with respect to
some positive measure v on ET. A finite subset C of T is said to be a clique if either ICI = 1 or
any couple s. t in C satisfies d(s, t) ~ k. Hammersley & Clifford, and Besag (1974) have

4 Distributions invariant by shifts operators.

Jvt«q - vtq)f) dt for f e


I
5 Indeed set v t = J.1 tg +(1-t)h and q = h - g then lJ.1 g - J.1hl ::;; C(O) with

IIfll.. = 1 because ~Vt(f) =vt(fq) - vt(f) Vt(q). Schwarz inequality now leads to the desired result.
Examples: Gibbs Fields 69

proved that if the distribution P does not vanish on the set of configurations ET then

Q(x) = L. IA(xA), where the sum is considered over cliques A in T.


AcT

Example 3. Let T = ;Z 2.
(i) Triangular grid. Random Markov fields with 6 nearest neighbours. Here IN(s)1 = 6 and
cliques have the form
C 1 ={{S};SE T}, C 2 ={{s,t};d(s,t)=1,s , tE T} or
C 3 = {{s, t, u}; des, t) = 1, des, u) = 1, d(u, t) = 1 s, t, u E T} .
A Markov field is constructed by setting for neighbours s, t , u
I{s} = f(x s)' Ils,t} = g(x s' x t), Ils,t,u} = h(x s' xI' xu),
An isotropic and statIonary field may be constructed by setting for neighbours s, t , u
lis} = a x s' Ils,t} = b Xs xI' Ils,t,u} = c Xs x t xu' Uniqueness holds if 3 Ibl + 61cl ~ l.

Triangular grid
Figure 2.2.1.

(ii) Hexagonal grid. Random Markov fields with 3 nearest neighbours. IN(s)1 = 3 and cliques
have the form {s} and {s, t} and a Markov field is constructed setting for neighbours s, t
li s} = f(x s)' Ils,t} = g(x s' Xt) ·
70 Mixing

Hexagonal grid
Figure 2.2.2.

(iii) Rectangular grids. Random Markov fields with 4 and 8 nearest neighbours on a regular
grid (k = I, T = Z 2) .

r
r: .J
-
I tt:'\
I.,;
"\ .r
-
.
r-- \:17

IN(t)1 =9 Rectangular grids IN(t)1 = 5

Figure 2.2.3.

Various other situations may be considered by a change of the underlying metric.

2.2.3 Non compact case


We now make a digression concerning the non compact case and its application to the study of
point processes. If E is not compact the previous results may be extended replacing continuous
functions on n by the set L of Lipschitz continuous ones. The norm of total variation on signed
measures, is replaced by the Vasershtein metric, R, between probabilities. It is induced by a
semimetric r on E.

Set R(S, 8') = InfQ I rex, y) Q(dx, dy), the lnf bound is taken over distributions Q with
marginals S and S'. One can prove that

,
R(S, S ) = Sup[
II f(x) S(dx) - I f(x) S'(dx)1 WIth. O(f) = Sup x y
If(x) - f(Y)1
( ).
O(f) , r x, y
Examples: Gibbs Fields 71

Usually r is chosen as the discrete metric rex, y) =0 if x;J:. y and rex, x) = 1. Dobrushin's
uniqueness condition still holds when IIPs(.1 x) - pS<.1 y)IIVar is replaced by

-
r~y
1() R(ps(.I x), Ps(.I y)) in the definition of 'Ys ,t and analogously for the Ws (see Dobrushin
(1970), Kiinsch (1982) and Follmer (1982)). Moreover Theorem 3 extends replacing q,-mixing
coefficients by a-mixing coefficients. Set r = ('Ys t) and A = (X s t) with Xs t = I (rn\ t.
, , 'n~O'

The assumption a = SUPt L 'Ys,t < 1 implies the geometric strong mixing condition
seT
analogue to that of Theorem 3 since 'Ys,t is here a geometric series.

An important case is the case of translation invariant Gibbs measures on ;;ld. In this case the
specifications are invariant under parameter translations. Set 'Ys-t ='Ys,t and Xs-t =Xs,t then
a(A, B) ::;; a2 L L f
Xs-t· with a 2 = SUPte T Il(dx) [infyeE r 2 (x t, y) pt(dxtl x)].f
seA teB
Assuming a decay Xs =O(lsr c-d-e) for some £ > 0 yields a(k; a, b) ::;; const a 2 [aAb] m-c .

2.2.3.1. Point processes

We now recall the definition of a point process on [Rd. [Rd is the set where particles live. The
extension to arbitrary state space is given in Preston (1976, § 6) and the following presentation
is the one in Jensen (1990). Set fa the Borel a-field on [Rd and C c fa the set of bounded
Borel sets. A point process is a random variable with values in the set S of locally finite integer

valued measures that is finite on C. Elements of S take the form L


mn 0X n where mn E il'I
n=1
and {xn; n:?! 1 }nA is finite for A in C. This measure is the configuration corresponding to IDn
particles at point xO' The set SeA) of finite configurations on A in C is the trace of S in A. Then

choose h > 0 and set Ai = [hO - ~), h(i + ~)[ for i E ;;l d if 11 = (1, ... , 1). Then

[R d = U
Zd
Ai and S is isomorphic to n
ieZ d
S(Ai) equipped with the a-field
. ieZd
n
fa (Ai) and thus

to [S(Ao)]Zd. For A in C a metric r is defined for x, y in S(Ao) by


rex, y) = r'(x, y) + r"(x, y)
with r"(x, y) = 1 if x ;J:. y and r"(x, x) = 0,
o
r'(x, y) = r'(y, x) = n - m + min 7t I Ix(j) - y(1t(j))1.
j=1
where m = Iyl ::;; n = Ixl stand for the number of particles (x(I), ... , x(n)) and
(y(1), ... , y(m)) in the configurations x and y. 1t runs over the one to one applications
n
{1, ... , n} ~ {1, ... , m}. The finite integer-valued measures x and y are written x = L 0xU)
j=1
72 Mixing

m
and y = L ByG ) with perhaps repetitions in sequences x(.) and y(.). The Vasershtein metric is
j=l
defined with the help of r(., .). For A in ffi we also define x(A) = {xU) E A; j ;::: I} the set
of particles defmed by x lying in A and xA the restriction of x.

For example, point processes may be defined in terms of a pair-potential cp: IRd -7 IR with

cp(O) = 0, cp( -z) =cp(z) and


I n
for any n, and zl'···' zn: 2" L cp(Zj - Zj) ;::: - n B for some B ;::: O.
j,j=l

The interaction of particles is defined by VA (x A I y Ac) = L


cp(z I - z2) where the sum is
extended to Zj E x(A)vy(A c) such that zl or z2 lies in x(A). Assume the admissibility
condition. If v A is the Poisson measure on SeA) with Lebesgue intensity measure define a
measure 1tA(. I YAc) with density with respect to V A'

This definition is tied up with the previous ones if we set Uj = x A . and


"* i) = 1tA'<. I YA~). We refer the reader to Jensen (1990) for more explicit mixing
1
1tj(.1 xJ., j
1 1
conditions.

2.2.3.2. Diffusions
We only recall some of the ideas in Deuschel (1986) following Follmer (1988). Let P be the
distribution of an infmite dimensional diffusion process

dX~ = biX~) du + dW~, t E T and U E [0, 1].

P is a distribution probability on (qo, I])T, the state space is here E = qo, 1] and the
metric r considered is the uniform metric.

If Y = (Xt\"s' then the local specification of the previous random field is given by

where p* denotes the Wiener measure and Ps the s-marginal of the distribution P of X.

Let b s have the specific form bs(x) = V's Is(x) where Is(x) = is(x s) + L jst(x s' Xt) for
to's

smooth functions is and jst with jst = 0 if t e: N(s), then


Examples: Gibbs Fields 73

1 exp{fs(X s, Y)} dP * (dX s),


1t s( d X sl Y) = Zs(Y)
I
where fs(X) = I s (X(1)) - Is(X(O» - Jgs(X(u» du is dermed with the help of
gs(x) = L bst(x) °V t Is(x) + V~ Is(x) + ~V t Is(x»2.
tEN(s)u{s)

Note that lIalS<X)1I ~ 211V t Isllco + IIVt gsllco if at denotes the partial Frechet derivative· of the
N2(s)
function fs on (qo, 1]) .

In order to apply the Dobrushin contraction technique, Deuschel (1986) shows that

R(1t s(·1 ro), 1t s(·1 ro') ~ L 'Yst llro(t) - ro'(t)1I and 'Yst ~ crt lIasft(X)II,
N2(s)

where ~= SUpy InfxE qo,!] JIIXs - xll1ts(dX I Y). S

Now, the non compact extension of Theorem 4 yields the cp-mixing condition under the
assumption
a = SUPt L 'Ys,t < l.
seT

Survey of the literature

The initial results concerning mixing properties of Gibbs fields are due to Dobrushin
(1968, 1970).

They are improved iq Kiinsch (1982) and conditions on the potentials come from Simon
(1979). The present presentation of specification comes from Follmer (1975). Georgii's (1988)
book is very complete concerning Gibbs states and proposes an extensive description of mixing
properties. See also Sinai (1982) and Prum (1986) for related results. Simplified results are
provided in Guyon (1986, 1992). Follmer (1988) proposes an approach to Gibbs random
fields with applications to infinite dimensional diffusion processes.

The examples of point processes and diffusions come from Preston (1976), Jensen (1990) and
Follmer (1988) and the example of Potentials comes from Simon (1969) and Guyon (1986, and
1992).

The mixing properties of lattice systems are also given in related works by Eberlein & Csenki
(1979), Hegerfeldt & Nappi (1977), Nahapetian (1980) and Neaderhouser (1978). Bowen
(1975) proves 'JI-mixing sufficient conditions for Ising models. Nahapetian (1991) proposes an
extensive discussion of most of the results in the present chapter.
Examples: Linear Fields 75

2.3. Linear fields

This chapter proposes sufficient strong nuxmg conditions for random fields
X =(Xt)te ;t'd defined linearly by Xt = L dgt' s Zs' The results proposed are announced in
se ;t'
Doukhan & Guyon (1991) and concern cases where Z is either an independent (§ 2.3.1) or a
mixing random field (§ 2.3.2). Usual results impose an invertibility assumption which may be
omitted using a change of the innovation process Z. Moreover the random field may be non
causal. We provide explicit bounds for the mixing coefficients which usually depend on the
cardinal of the subsets considered, at least for d 2: 2. We compare the results with the result in
(§ 2.1) for Gaussian random fields. The results are proved through lemmas of independent
interest in § 2.3.3. Finally § 2.3.4. proposes a motivation for the study of such random fields,
extensions as well as some miscellany results.

Let Z = (Zt)te ;t'd, be a real valued random field, where Zd is equipped with the uniform norm
defined as Itl = Max {Itll,,,, Itdl}, for t = (t 1,,,, td). d(t, C) denotes the distance between t and
the subset C of Zd.

Define a linear random field X = (Xt)te;t'd by the infinite sum

(1) Xt = L gt s Zs'
seZ d '

We assume the existence of constants 0> 0, M > 0 such that for tin Zd

(2) LIIZl ::; M and, if 8 ;? 1, LIZ t = O.

Set p = min { 1, o}. Convergence of the series (1) in the L I)-sense involves the following
uniformity condition on the coefficients (gt,s)t,se ;t'd,

(3)

Lemma 1. The distribution of the random field X in (1) is well defined under the assumptions
(2) and (3). Moreover the finite distributions of the random field X are limits in La of those of
the moving average random fields X"' defined by X"! = L g I,s ZS'
Isl9n
To quantify the assumption (3) we introduce the following notations for any constants m 2: 0,
II > 0, t E Z d and C c Z d : am,~ = SUPt L
Igt,slll, AC,t,~ = L
Igt,slll. Note that if
Is-tl>m sl1'C
d(t, C) 2: m, then ACm,t,~::; am,~ where Cm = {t E Z d; d(t, C)::; m} denotes the m-
order neighbourhood of a subset C c Zd.

Remark 1. Assume that the infinite dimensional matrix (gt,s) is almost diagonal, say there
exists some integer m 2: 0 such that Is - tl > m implies gt,s =0; the series is locally finite.
76 Mixing

Let cz(k; a, b) denote any of the mixing coefficient sequence associated to the random field Z,
then cx(k; a, b) :5: cz(k-m; a, b); see (1). For this, only note the inclusion of the a-fields
XA C ':itAm. Our aim is to extend this kind of result to arbitrary operators (gt,s)'
In view of (2) assume now that
(4) Sup d LIg t siP
tEL sEit4 •
=r < 00

Let (; c IR Zd be the space of bounded and real sequences on ~ d equipped with the norm
IIxlioo = Sup{lxtl; t E ~d} for x = (xt)te Zd. The linear mapping G defmed on (; by
(Gx)t = Ldgt,s Xs
SEZ

is continuous with norm '1 =Sup d L


Ig t sl.
tE Z SEZd '
Remark 2. A continuous linear operator on r; which commutes with the shift operators takes
this form but this latter assumption is essential [see (2)].

We denote by I the identity operator on (; and IRA is considered indifferently as the usual
d d
product space as well as the subspace of IR Z defined by {x E IR Z ; x t =0, t e: A}.

=
Counterexample 1. Let X (Xt)te Z be the process defined by X (1 - B)P Z, where B =
=
is the shift operator and Z (Zt)te Z is an independent and identically distributed sequence
(3). Then X is not mixing if p > 4 is not an integer.

According to the previous counterexample in Gorodetskii (1977) we introduce an invertibility


condition on the linear field X. Assume that there is a continuous and linear operator K on (;,
defined by (Kx)t =: L dkt,s x S' with
SEZ
~ GK=~

Example 1. The linear operator U on (;, defined for t E ~ d by (Ux)t = L ut s x S' where
seZ d '
L IUt,sl < 1 and I~,tl = 1, is invertible and bicontinuous. Its inverse takes the form
Sup tE Z d
S;<t

(U-Ix)t = L v t s Xs with Sup d L IV t sl < The operator H, defined on (; by


00.

seZd ' te Z seZd '

1 Such a random field is only a finite moving average of Z. Mixing holds more generally if gt is a family of
measurable functions defined on the space of sequences on Zd such that llt(z) only depends on Zs for Is-t1~.

2 If we assume that a continuous linear operator G on e


commute with the shifts then it takes the form
(Gx)t = L gt-s Xs for some (gt) with L Igsl < 00.
S S
At the contrary, let L be the linear form defined on the subspace of convergent sequences ff of e by
L(x) = limltl--+~ xI' Hahn-Banach theorem provides a continuous extension of L which has not this form.
~ ~

3 Let ~be the formal series defined by (1 - z)p = L at zt. The process X is defined by ~ = L a t_s Zs'
t=O s=o
Examples: Linear Fields 77

(Hx)t = h t xI' with 0 < a:::; Ihtl :::; A < 00 for t E Z d has the same properties. Let G be a
product of operators with the form of H of U and of its inverse, then condition (5) still sholds.

Example 2. Let g(z) = L


gt zt be a continuously differentiable function on the torus
tE Zd
lfd = {z = (Zj, ... , zd) E (Cd, IZjl =... = IZdl = I}, its multidimensional Fourier's series
converges uniformly. For any C k function g on the torus, the series L gt is absolutely
tE£d
convergent as soon as k > ~ ; see (4). It is moreover clear that, if L Itl k Igtl < 00, then g is
tE Zd
for some k > ~ and g does not vanish on the torus then
k ~ k
C on the torus. If £..J It I Igtl < 00
tE Zd
the inverse k =1, written k(z) =L ktzt, satisfies L
Iktl < 00. The operators G and K
g tEZ d tEZ d
defined by (Gx)t = L
gt-s Xs and (Kx)t = L
k s_t Xs satisfy (4) and (5). In this case, if
SEZ d SEZ d
Z is stationary, then the random field X is stationary.

We shall begin with the assumption that Z is an independent random field; after this we shall
consider a mixing assumption.

2.3.1. Independent innovations


In this section we assume that the innovation random field Z is independent. The
following shows that the previous assumptions do not ensure a mixing condition for the
random field X defined by the relation (1).

Counterexample 2 (Rosenblatt (1980), Andrews (1984)) Let X = (Xt)tE £ be the process


defined by (1 - r B) X =Z , where B is the shift operator and Z = (Zt\E £ is an independent
and identically distributed Bernoulli with parameter p :::; ~ and Irl < 1. The process X is not
strongly mixing and its marginal distribution may be absolutely continuous (5).

We also assume the existence of a density Pt of the marginal Zt of Z and a constant c > 0 with
(6) f Ip/z + x) - p/z)1 dz ::; c lxi, \f x E fR.
fR
Note that this condition (condition (i) in Gorodetskii, 1977) holds if the random field Z is
independent and identically distributed and its marginal distribution is of bounded variation over
the real line (6).

4 To prove this, use Schwartz inequality and the summability of series L Itl 2k Ig tl2, L Itl- 2k .
tE Zd tE Zd
5 The integral part of Xtr- t is equal to XO' If r =~, ~'s marginal distribution is Lebesgue measure on [0, I].
6 Recall that a function p is of bounded variation over the real line if and only if there exists some constant
p
C < = such that any sequence Xo < Xj < ... < xp satisfies L Ip(x i) - p(xi_j)1 $ C.
i=j
78 Mixing

Set now L(u) = ~ u[1 vlln ul], and

11(1 +5) 11(1 +11)


(7) m.t(~, C) = AC,t,1I vL(AC,t,2) and N(m,~) = am,lI vL(am,2)'
11(1+11) 11(1+11)
If ~ ~ 2 the previous expressions become m.t(~, C) =AC,t,1I and N(m, ~) =am,lI .

Theorem 1. Assume that conditions (2)-(6) holdfor the linear random field X given in (1) in
terms of the independent random field Z. Then assume that k is big enough, the mixing
coefficients sequences (ax) relative to X =(Xt)tE Ld satisfy the following for some constant (;
only depending on 0 and y.
i) For any finite subsets A and B ofT,
ax(A, B) ~ (; (L
fR.-lo, Ak )1I(l+6) + fR.-lo, Bk )1I(l+6)j. L
tEA tEB
ii) For any a, b > 0
ax(2k; a, b) ~ (; (a + b) N(k, 0).

The dependence on cardinals a and b in the bound ii) of Theorem 1 may be weakened for
hypercubes A and B with the form.~ lUi' vi] of ~ d.
1=1

Let H(.) be a monotone nondecreasing and nonnegative function, and C, m be respectively a


subset of ~d and an integer then setting k(t) = d(t, (Cmt) we obtain
~ (C, m, j.L) = L H(Aem,l,J.L) ~ L H(ak(t).J.L)·
tee tee
m+v-u
Assume first that d = 1 and C = [u, v], then ~ (C, m, j.L) ~ 2 L H(ak.J.L)' Use this
k=m
=
result for the functions H(x) Xl/(1M) + L(x) implies that under the assumptions of Theorem
I, a sharp bound may be obtained in ii) if d = 1. Here there is no dependence on cardinalities,
according to Gorodetskii (1977), and the corresponding bound is thus mixing coefficient (XX,k'
We introduce the following notation with the same convention as in (7)

11(1+11)
L am,lI L L(am,2)}'
00 00

(7') W(k, 0) = { }v{


m=k m=k

Corollary 1. Let d = 1. Consider the random process coefficient sequence of the process X.
Assume that Z is independent and that assumptions in Theorem 1 hold. If k is large enough.
there exists a positive constant (; with
aX,2k ~ (; W(k. 0).

Let now d> 1 and C =.~


I-I
lUi' vd then we write again
tee
~ (C,
H(ak(t).J.L)' m, j.L) ~ L
Define CrG) be the subset of the points t in C such that j = d(t, (Cm)c) and this distance is
d
attained for the i-th coordinate, hence L H(ak(t).J.L) = L L L H(aj,J.L)' Setting for
tee i=1 j=m tee~G)
Examples: Linear Fields 79

n
such hypercubes, C, K(C) = ~ .I1. [vrUj] we obtain 3'e (C, m, f.l) ::;; K(C) L.
H(a m.Il ).
1=1 J;Ct m=k
Dependence with respect to the cardinality is weakened, indeed in this setting K(C) = ICI I - lld if
C = [u, v]d.

The bound ux(A, B) ::;; 1; [K(A) + K(B)][ { L. a m.o


00 11(1 +0)
}v{ L. L(am.2)}] follows
00

m=k m=k
11(1+0)
from the choice H(x) = x vL(x) for such 2k-distant hypercubes A and B.

Examples 3. If Igt.sl ::;; g(lt-sl) for some nonnegative and nonincreasing function g, then

a m. ll ::;;
m-1
f
x d- I gll(x) dx. Assume for instance that for x ~ 1, g(x)::;; c x- P rX for some

constants c, p > 0 and 0::;; r < 1. Direct calculations imply am.1l = O(m d- IlP r llm ),
N(k, 0) = O(g(k) ~ k(d+1) In k) and W(k, 0) = O(g(k) ~k(d+3)l2ln k).

If X t = L.
gt-s Zs for a Gaussian white noise Z = (Zt)tE Zd and if g, g(z) = L.
gt zt, is a
SEZd tEZ d
non vanishing continuous function on the torus IT d, we have seen in Example 2.1.1 that if
a = InfzE lrd Ig(z)1 2 then we get similar bounds from Theorems 2.3.1. or 2.1.2. except for the
cardinality (IAI+IBI) factor. Let for example Igt.sl ::;; const. r lt-sl . Use Theorem 2.1.2., then the
inequality ux(k)::;; ~ SUPtlgtl. L. L.Igtl yields ux(n; a, b) = O(n d+ 1 rnl2) while
Isl:2:k Itl:2:lsll2
ux(n; a, b) = O«a + b) n(d+1)12 rnl2) comes from Theorem 1.

A simple modification of the proof of Theorem I yields the case of L2-linear random fields.

Proposition 1. IfZ is independent and if the conditions (2)-(6) hold with 8 = p = 2 then
there exists a constant (> 0 with
1/3
a x (2k; a, b)::; ((a+b) ak ,2'

The following result proves that one may expect more than strong mixing from the case
of causal processes (see Pham & Tran (1985».

Theorem 2. Assume that d=l and conditions (2)-(6) hold for some independent random
sequence Z. Let moreover X be a causal process (gt,s = 0 for s > t). Then if k is big enough,
the mixing coefficients sequences (fix) relative to X = (Xt)tE Zd satisfy for some constant (
only depending 8 and r
fiX;k ::; (W(k, 8).

Note, that, no absolute regularity sufficient condition has been proved (up to our knowledge)
for the more general case of non causal processes.
80 Mixing

2.3.2. Dependent innovations


Assume that Z is a-mixing, az(k; a, b) -7 0 for k -700 and any a, b;;:: 0 and moreover,

3 -. E JO, 0[, 3 c E 2 fl'I, c:2:-.,

L (r+lycd-d-l [az/r; u, v)/O-'f)I(c+8-'f) < 00.


co

(8)
r=l
If 0 < 0::; 1, the convergent series assumption will be omitted and we set 't = O.

In this case, assume that the fInite marginal distributions of Z are absolutely continuous. Let C
be a fInite subset of Zd. The density ofZc conditional to ~CC' Pc(z I~cc), satisfIes for some
constant c > 0 independent of C and for any x = (xt; t E C) E IR c
(9)
IR c
f ess-sup $5 CC
Ipe(z+x /$5 e c) - Pe(z /$5ec)1 dz::; c L Ixtl.
tee

Example 4. Assumption (9) holds if for t e C, the conditional density of Zt knowing that
Zc = y, PI,C' satisfIes
(9')
,
r Sup C IPt c<z + x, y) - Pt c(z, y)1 dz ::; c lxi, 'if x
ye~' ,
E IR.

If Z is specifIed by its conditional marginal distributions, this condition is natural (see § 2.2.),
the underlying measure Amay be chosen different of the Lebesgue measure. For a Markov fIeld
Z, it is enough to write (9') for a fIxed neighbourhood C of t, and suffIcient mixing conditions
based on conditional marginal distributions are known. Dobrushin and Simon's weak
dependence condition described in § 2.2. provides such properties. For a Gibbsian model, the
zh(y)
,for an underlying measure A. If the
,"
conditional density Pt cis Pt c(z, y) =
f e
exh(y) A(dx)
random fIeld Z is independent this is only condition (6).

In the mixing innovation setting, it is natural in view of Theorem 1.4.1 to replace the notations
(7) by
~12 11(1 H) ~/2(1 H)
(10) S't('t,C) = AC,t,'tvAc, 1,2' M(m,'t)=am,~ vam,2 •
11(1 H)
If 0::; 2, we set S't('t, C) = AC,t,'t and M(m, 't) = am,~ .

A result extending Theorem 1 may now be written.

Theorem 3. Assume that conditions (2)-(5), (8), (9) hold for the random field Z. Then ifm is
big enough, the mixing coefficients sequences (aX) relative to X = (Xt)te Zd satisfy the
following for some constant' only depending on 0 and r.
i) For any finite subsets A and B ofT,
ax(A, B)::;, fL L
$/-., Am)II(1H) + $/-., Bm)II(1H)} + az(A m, Bm).
teA teB
ii) We have, for any a, b > 0 if k :2: 2m,
ax(k; a, b)::;, (a+b) M(m, -.) + a z (k-2m; a(2m+l)d, b(2m+l)d).

The same remarks as after Theorem 1 still apply and analogues of Corollary 1 are provided in
Examples: Linear Fields 81

the same way. The following simple corollaries of Theorem 3 may be useful.

Corollary 2.lfZ is p-dependent and the assumptions in Theorem 3 hold. Then let m be the
k-TJ
least integer greater tha~, we have
ax(A, B) ::; t;{ L /klr, A m)+ L /ktr, Bm)} and ax(k;a,b)::; I;(a+b) N(m, r).
tEA tEB

Corollary 3. If M(m, r) ::; const. e- JIm and az(p; u, v) ::; con st. (u+v Y e- ap for some
fl, a, r > 0 and the assumptions in Theorem 3 hold, then there exist a constant I; > 0 with
ax(k; a, b)::; I; (a+b) kwdl(2a+JI) e- JIakl(2a+J1J,

Remark 3. Corollary 2 allows to consider moving average parts in X without using the
invertibility condition (8). For instance, if Zt = lOt - lOt-v' for v fixed in Zd, and some white
noise £, we write X = G Z = G R £ for some non invertible operator R. Dependence of the
field Z allows to weaken the invertibility assumption (5). Corollary 3 may be used in the case
of a Gibbs field Z.

2.3.3. Proofs
Proof of Lemma 1. Convergence of the series defining X holds in the L sense. L Ois a °
complete metric space normed by IIUII/i = [[ IUIO] I/o if 0 ~ I, and for 0 < I, it is only
metrized with the distance do(U, V) = [IU - Vlo invariant by translations. Now X~ is a
Cauchy sequence in L 0, indeed we have, by assumptions (2) and (3),

IIX~ - X~+PIl/i :::; Milo L Igt,sl m-=too 0 for 0 ~ I,


Isl>m
do(X~, X~+P):::; L Igt,i m-=too 0 for 0 < 1.
Isl>m
What was done for a fixed t can be extended to finite-dimensional distributions Xc of X. The
distribution of the spatial random field X is thus defined using Kolmogorov consistency
theorem.•

Proof of Theorems 1 & 3. The points ii) follows clearly from i) and from the inequality
AcID,q! :::; am,~ valid if t E C. From this inequality we obtain

L 9t- t(O, C ID )II(I+OJ:::; ICI N(m, 0), L S'lt, C ID )II(1HJ:::; ICI M(m, 't).
~c ~c

Let E c [R A, F c [R B be any Borel sets; in order to prove the results we have to precise
bounds for the expressions
I.l = IP(X A E E, X B E F) - IP(X A E E) IP(X B E F).

The proof is based on the following decomposition of the random field X as a sum of a moving
average random field and an small remainder term with respect to the total variation. Let C be a
finite subset of Zd, we define W t and ~ (1) as

7 An additional index C should be added but we shall write for simplicity WI = W~ and We = (W~ltE C'
82 Mixing

Wt= L gt,s Zs' Rt = L gt,s Zs, if t E C.


SEem seem

We are in position to state the two following Lemmas. Their use is fundamental for proving the
bounds of mixing coefficient sequences.

Lemma 2. Let e be afinite subset of ;Zd. Assume that conditions (3), (4), (5) and (9) hold.
Let S be a measurable subset of [Rc. For m big enough there exists a constant /( such that for
any rC in [RC the following relation holds
ess-suP:it me Ip(W C E S - rc 1::b(Cm;c) -peW C E S 1::b(Cm;c) I ~ /( L Irtl.
(C ) teC

Note that under the assumption of independence of the random field Z, conditioning is useless
in Lemm& 2 and the assumption (9) reduces to the assumption (6).

Proof of Lemma 2. Relation G K = I yields L dgs' u ku , t = 1, if s = t E C, and = 0 if


UEd'

s *" t E C. G rn = (gt,S)SE em,tE e is a linear operator IR em --7 IR c and K rn = (kt,S)SE e,tE em


is a linear operator IR e --7 IR em. They satisfy G rn K rn = Ie + urn for some Urn: IR e --7 IR e
with
IIUrnll ~ IIKII SUPtEe L Igt,sl ~ IIKII SUPt L
Igt,sl.
secm Is-tl>m
For m big enough, this expression is bounded by!, the linear operator Ie + Urn is thus
invertible. Moreover G rn K"m = Ie with f(rn = K rn (Ie + Urnfl and lIf(rnll ~ 2 IIKII. The
" rn : IR em --7 IR e leads to define ~ by 't
linear, continuous operator K " rn re, it satisfies
K
G rn ~ = re' hence
1[p(W C E S - re /~(ern)c) - [P(W e E S /~(ern)c)1 =
1[p(Grn(Zem + f) E S /~(ern)") - [P(GrnZem E S /~(ern)c)1 ~
~ r ess-sup ~ (Cm)c Ipem(z + f t /~(ern)c) - Pem(z /~(ern)c)1 dz.
[R~m
"
Using the assumption (9), this expression is bounded by c IIKrnll L Irtl..
tEe

Lemma 3. Let e be a finite subset of ;Zd. There exists a constant /( such that for any
measurable subsets U and S of.Q and [Rc the following relations hold.
i) Assume the assumptions in Theorem 1, then

(*) IP((XC E S)nU) - P((WC E S)nU)1 ~ /( L In 1:(1+0)(8, em).


teC
ii) Assume the assumptions in Theorem 3, then
(**) IP((X c E S)nU) - P((W C E S)nU)1 ~ /( L g1:(1H)('r, em).
teC

Proof of Lemma 3. Set 1; = [P «Xc E S)n U) - [P «WeE S)n U). For any family of
Examples: Linear Fields 83

positive real numbers (Tlt)teC we set H = {r E IRC; Irtl:5 Tl t }, then


lSi :5 IP (Rc ~ H) + IP ([(W c + Rc E S)~(W C E S)]n[Rc E H)n UJ),

lSi :51P(Rc ~ H) + J IIP(WC E S - r I RC = r) -IP(WC E S I Rc = r)IIPRc(dr),

In order to use Lemma 2, we just note that


lIP (W c E S - r I Rc = r) - IP (W c E S I RC = r)1 :5
ess-sup~(Cm)C lIP (W c E S - r I ~ (Cm)c) - IP (W C E S I ~ (Cm)c)1.

Hence ISI:5 IP (RC ~ H) + K L Tll' it only lasts to bound the fIrst term of this inequality.
C

i) The first term in this inequality is bounded under the assumptions of Theorem 1, using
Nagaev-Fuk inequality (Nagaev-Fuk (1971), Corollary 4),
1P(IRtl 2: Tl t):5 Tl-~ mt(o, Cm).
ii) Under the assumptions of Theorem 3, the fIrst term in this inequality is bounded using
Theorem 1.4.1 on the finite set C yields
1P(IRtl 2: Tl t ):5 Tl-( 8lt, Cm).
Now Lemma 2 implies respectively that ISI:5 K L [Tl t + Tl -~ mt(o, Cm)] for some constant
teC
K > 0, or lSi :5 K L [Tl t + Tl-( 8lt, Cm)]. Equilibrating both terms yields (*) and (**) for
teC
some constant K .•

End of the Proof of Theorems 1 & 3. We may replace C by A or B in the previous


Lemmas. X t is written as the sum of some moving average random field and a remainder term
which is shown to be neglectible as the order of the moving average term increases.

Then X A == W A where W A is ~ Am-measurable and X B == W B where W B is ~Bm ­


measurable. Moreover the a-fIelds ~Am and ~m are almost independent.

Figure 2.3.1.

Write IL = ILl + 1L2 + 1L3 + 1L4 + 1L5 where,

ILI=IP(XAE E,XBE F)-IP(WAE E,XBE F),


1L2 = IP(W A E E, X B E F) - IP(W A E E, W B E F),
1L3 = IP(W A E E, W B E F) - IP(W A E E) IP(W B E F),
84 Mixing

!l4 = [!P(W A E E) -!P(X A E E)]!P(W B E F),


!ls = !P(X A E E)[!P(W B E F) - !P(X B E F)].

Apart from!l3 which may be bounded by uz(A m, Bm), any of those terms takes the form
S= !P«Xc E S)nU) - !P«W c E S)nU) for some event U. Use repeat idly (**) in Lemma
3 yields Theorem 3. Theorem 1 follows using (*) instead (**), here !l3 = 0 .•

Note that if one introduces two parameters m and m' instead of m, Theorem 3 may be
strenghtened in the case when uz{k; a, b) only depends on /lAb.

Proof of Proposition 1. The only modification with respect to the proof of Theorem 1 is
the simpler control
!P(lRtl ~ T]t) ~ T]-t IER~,
It yields the result with IER~ ~ a m,2' and T]t = ar::.~ .•

Proof of Corollaries 1 & 2. Nagaev Fuk inequality is used for p-dependent fields and a
direct optimisation yields Corollary 2 .•

Proof of Theorem 2. Let us note a modification of Theorem 1 for the case of random
processes (d=I). The process is causal, i.e. gt,s = 0 if s > t. In this case we may consider, by
a time shift, A = £. - and B = [k, +00[' in this case XB = W B + RB with
Wt =Lgt,sZs, R t = Lgt,sZs iftE B.
s$m+k s>m+k
This implies that W B is independent of XA and RB, hence
!l = !leE, F) = f [!P(W B E F - r) - !P(W B E F)]!P x A'
Ex~B B
R (dx, dr).

Use the same technique as in Lemma 3 yields


!leE, F) ~ !P(XA E E, RB ~ H) + !P(XA E E) SUPrEH I!P(W B E F -r) - !P(W B E F)1.
Note that F ~ !leE, F) is a signed measure with total mass zero, hence its total variation of is
less than 2 Supp !leE, F) where the supremum is considered over all the Borel subsets F in
IRB. Let now (Ej) (resp. (Fj)) be finite and Borel measurable partitions of IRA (resp. IRB) then
L !l(E j, Fj ) ~ 2 {!P(R B ~ H) + SUPF,rEH I!P(W B E F - r) - !P(W B E F)I}.
j,j
The same method as the one used for Corollary 1 finishes the proof.•

2.3.4. Miscellany
Example 2 motivates the previous linear representation ofthe random field X. We now
recall some general features concerning second order random fields given in Rozanov (1967)
and Guyon (1992). A second order stationary random field (Xt)tE;z>d is a random field with
IEX t = m and Cov(X O' Xt) = Cov(Xs' X s+!) for any s, t in £. d. Let <I> be the family of finite
subsets of £. d and H(X, F) be the linear subspace of L2(Q) spanned by (Xt)tE F'

The random field is said to be <I>-regular ifn H(X, F) = {OJ and <I>-singular if
FE <I>
there is Fo in <I> with n H(X, F) = H(X, FO)' It may be proved that any second order
FE <I>
Examples: Linear Fields 85

stationary random field (Xt)te Zd may be decomposed in an unique orthogonal way as the sum
X = X(r) + Xes) of a 4>-regular random field X(r) and 4>-singular random field X(s).
Let 4> = {F E ~d; pC is finite}; X is a 4>-regular random field if and only if its
spectral measure is absolutely continuous and its spectral density f is such that

f IP~~~2 dx < 00 for some trigonometric polynomial P. The random field X may then be
represented as X t = L gt-s Zs for some coloured noise Z (this representation is non causal if
d=I). Let d=1 and 4> = {F = [a, +00[; a E ~}, the same representation holds for some
white noise and is causal if the process X is 4>-regular; moreover 4>-regularity is equivalent to
the absolute continuity of the spectral measure and the condition f In f(x) dx > -00.
The fact that nongaussian linear processes are easier to identify than Gaussian ones (see
Rosenblatt (1985), Theorem 5, p. 46) gives also more interest to the present results.
Note that the previous mixing results may be extended in the following
multidimensional way. Assume that random variables ~ are vector valued say in a Banach
space [8, the coefficients gt,s define now a continuous and linear operator over [8 and all of the
assumptions over the linear sequence (gt,s) are obviously rewritten in terms of the underlying
norm of [8. The interesting case seems to be [8 = IR d, however changing X to
Y = (Yt)te Zd+l, with Yt+kll = Xt,k and 11 = (0, ... , 0, 1), the k-th coordinate of Xt leads to
analogous multidimensional results.

Following Pham, Tran (1985) the previous results may be adapted to vector valued
random fields in IRd. For this, it is enough to consider elements gt,s as dxd matrices and to
replace modulus by n,orms of operators on dxd matrices. However considering infinite
dimensional vector spaces is not a simple adaptation because in that case Lebesgue measure
cannot be used in the proof; see the use of condition (5) in the proof. The case of function
spaces has a special interest. Strong mixing can also be replaced by absolute regularity, see
Pham & Tran (1985).

The ~-mixing condition of Z = (Zn)ne Z is related in Denker & Keller (1986) to the
one of the process X = (Xn)ne Z if X t = f(Zt, Zt_I"") is a stationary random process
generated by some function such that for some constants C > 0 and 0 ~ a < 1,
If(xl"'" x n' YI' Y2"") - f(xl"'" x n' zl' z2, ... )1 ~ Can.
Survey of the literature
Because of their particular structure allowing a direct approach for most of the limit results,
works concerning the mixing properties of linear random fields came relatively late.
Withers (1981) studied a CLT for this particular class of processes defining a mixing condition
adapted to linear processes. Gorodetskii (1977) corrected an incomplete proof in Chanda
(1974) giving the strong mixing properties for linear and causal sequences. The same processes
are also shown to be absolutely regular in Pham & Tran (1985) and the results are extended
there to vector valued processes (see also Athreya & Pantula (1986». The results presented
here are the ones in Doukhan & Guyon (1991), causality as well as independence of the
innovation are omitted. Gorodetskii (1977) and Rosenblatt (1980) proposed the
counterexamples 1 and 2; they show the importance of the invertibility and absolute continuity
86 Mixing

assumptions.

Among the important literature concerning the classical time series, Kesten & O'Brien's (1976)
propose a review of mixing processes such as linear ones. Mixing conditions for infinite
memory systems are proposed by Denker & Keller (1986). Mokkadem (1990) proves the
mixing properties of ARMA processes which are an important class of linear processes. Basic
textbooks for the general theory of random fields are Rozanov (1967) and Guyon (1992).
Examples: Markov Processes 87

2.4. ~arkov processes

This chapter is divided in three main parts. We first present results for general Markov
processes; some consequences are provided in § 2.4.0.1. for a class of nonlinear processes,
dynamical systems are explored in § 2.4.0.2. and a class of nonhomogeneous processes is
considered in § 2.4.0.3. The main consequences are provided in two sections devoted to
Polynomial processes (§ 2.4.1) and explicit examples of nonlinear processes (§ 2.4.2).

We first recall that Q is called a probability kernel on the countably generated measured space
(E, &') if

x ---) Q(x, F) is a measurable function on the space (E, rf:), for F E rf:.

Q(x, .) is a probability measure on the space (E, rf:), for x E E.

We use again the notations used in § 2.2. Let Q, R be two probability kernels, fbe a mesurable
function and y be a signed measure, we define QR, Qf, and yO as the probability kernel, the
mesurable function and the signed measure defined, when it makes sense, for x in E and F in &'
by
f Q(x, dy) R(y, F),
QR(x, F) =

Qf(x) = f Q(x, dy)j(y), rQ(F) = f y(dy) Q(y, F).

A transition probability kernel P~(x, dy) indexed by If = ;Z or IR on the measured space


(E, &') is defined as a family of probability kernels (P~)S$tE 1f such that
~ = P: p ~ for s::; u ::; t E 7J (Chapman-Kolmogorovequation).

A Markov process X = (X UIE 1f is defined by a measure space (E, &') and some probability
space (Q, 8t., [p), via a transition probability kernel P~(x, dy) such that P~(x, F) is a regular
version of the conditional probability [P (X t E FI Xs = x) for any F in &'. Therefore,
according to the usual definition, conditional probability with respect to the past is equal to
conditional probability with respect to the immediate paste.

The process is said to be homogeneous if P~(x, F) only depends on (t - s), in this case
we shall write ph = p:+h. We consider from now on a discrete time, homogeneous Markov
chain X = (Xt)IE:i" The n-step transition kernel is defined by
pn(x, F) = lP (Xn E FI Xo = x) - set P = pl. Let x be a point of E we set IE x for the
expectation operator conditionally to Xo = x.

Davydov (1973) proves the following fundamental explicit bounds (I) for the mixing
coefficients ~n and <\In'

I (I), (2) are the transcriptions in tenns of transition operators of the relations, valid for Markov chains:
Pn = [ SUPB IIP(B I Xk = x) -IP(B)I, <\In = SUPk ess-sup SUPB IIP(B I Xk = x) - IP(B)I, BE cr(Xk+n)·
88 Mixing

(1) f3 n = SUPk f vi dx ) Ilpn(x, .) - vn+k"Var'

(2) ~ I/J n ~ SUPk ess-SuP vk Ilpn(x, .) - vn+k"Var ~ I/Jw


here vn = vaPn denotes the marginal distribution ojXn-
Assume that 1t is an invariant probability measure, i.e. 1tP =1t. If the initial distribution of the
process (X t) is Vo = 1t, then the process (Xt)~o is stationary and its marginal distribution is 1t;
moreover the previous relations become simpler
(1') ~n = f 1t(dx) IIpn(i,.) -1tll yaf '

(2') i <l>n ~ ess-SuPn Ilpn(x, .) - 1tIIYaf ~ <l>n'

We first propose a sufficient condition for geometric decrease of the <I>-mixing coefficients of
the Markov chain (Xn) with transition probabilities P.

Theorem 1 (Ueno (1960), Davydov (1973)). Let (Xn ) be a Markov chain and )1 be
some nonnegative measure with non-zero mass )10' Assume that there exists some integer r
such that jar any x in E, pr(x, A) ~ )1(A) then
\1' x, x' E E, Ilpr(x, .) - pr(x', .)II Var ~ 2( 1-)10)'
\1'n E [1'/, \1' x, x' E E, IIpn(x, .) - pn(x', .)IIVar ~ 2(1_)10)nlr-1 = 2 rt.
There exists a probability measure n: on E such that

\1'nE[1'/, \1'xEE, Ilpn(x,.)-n:IIVar~21t, n:(A) = f n:(d8)P(8,A).

Whatever is its initial distribution, (Xn) is I/J-mixing and I/Jn ~ 21{

In this case we shall say that X is Doeblin recurrent.

Sketch of Theorem l's proof. The measure pf(X, .) -110 is nonnegative and up to a
factor 2 its mass equals its variation norm, so the first point follows. The other points are
deduced classically using the multiplicative properties of transition kernels and inequality (2).
See Orey (1971) or Doob (1953) .•

Remarks 1. For the case of non homogeneous Markov chains, replacing the assumption
pf(X, A) ~ Il(A) by P~(x, A) ~ Il(A) yields <I>-mixing and <l>n ~ 211 n whatever the initial
distribution is.
A necessary and sufficient condition is given in Blum, Hanson & Koopmans (1963) for a
stationary Markov process (Xn)n~O to be 'JI-mixing. (Xn)n~O is 'JI-mixing if and only if the
probability transition kernel is absolutely continuous w.r.t. the stationary marginal distribution
1t and there are an integer m and a real number a, 0 < a < 1 such that the density Pm(x, y) of
the kernel pm(x,.) satisfies 1t®1t({(x, y) E E2; 1Pm(x, y) - 11 > aD = O. This determins
*-mixing sequences which are not <I>-mixing (see § 1.3.). An example is the integer-valued
Markov chain given by the transition matrix P = (Pi} with Pi,j = (i)j + (oi,j - 0i+ I} (~)i+j.

Rosenblatt (1971) proposes a functional approach to mixing. Assuming that 1t is an invariant


Examples: Markov Processes 89

probability measure. Let La = {f E LP(E, It);f It(dx) f(x) = OJ, the Markov chain X is
sald L (E, It)-mlXlng If ap(n) = SUPfE Lo T
. p ... IIpnfll .
~2· In thls case, ap(n) decreases
p
geometrically to zero because the norm of operator based on the LP(E, It) is a norm of algebra.
IIp nfll 1
Moreover b n = SUPfEL ~ has the same decay as an and an ::; b n. An LP(E, It)-mixing
o 00

Markov chain is strongly mixing with a geometric decay. Doeblin condition and consequently
geometric <»-mixing, holds if aoo(n) -7 O. Moreover L 2 (E, It)-mixing is equivalent to p-
n....=
mixing, thus a p-mixing Markov process is geometrically p-mixing.

The Markov process is called recurrent if it is recurrent in the sense given in § 1.3.3. with the
same a-finite nonnegative measure It. X is said positive recurrent if It is bounded and null
recurrent else. The measure It is invariant and unique up to a positive multiplicative constant.

Assume now that there exists an unique invariant probability measure It. The chain will be
called aperiodic positive recurrent if IIpn(x, .) -ltll Yar -7 0 It-a.s. We shall name indifferently
ergodic an aperiodic positive recurrent chain in view of the ergodic theorem provided by the
definition, the terminology of stationary distribution is justified by the fact that if the initial
distribution of the process is It, the process X is stationary. Orey (1971), Nummelin &
Tuominen (1982) prove sufficient conditions for the following relations to hold It-a.s. for some
function A(x) > 0, with f A(x) It(dx) < 00,

(3) Ilpn(x, .)-ltllyar::;A(x) TIn, for some O<TI < 1,


(4) Ilpn(x, .)-ltll yar ::; A(x) n K, for some 1C> O.

Relations (3) and (4) are known as geometric ergodicity and Riemanniann recurrence with order
1C if 1C is an integer. With the identity (1) they imply respectively 13 n =O(Tl n) and 13 n = O(nK)
if the chain is stationary. However these relations do not hold necessarily whatever is the initial
distribution of the process. More precisely this condition will depend, in a nontrivial sense, on
the degree of relationship of the initial distribution and the stationary one (2).

A Coset is a set C in (t such that for some a-finite nonnegative measure m and some integer r,
pr(x, F) ~ m(F) for any x in C and F in (t (3).

Let 'tc be the hitting time of the C-set C, 'tc = Inf{ t > 0, X t E Cj, we set the relations
(3') SuP [f ea'tc < 00
XEC x
(4') SuP [f 'tK+l < 00
XEC X C

The relations (3') and (4') are proved to imply (3) and (4) respectively in Nummelin (1984) and

2 It is the case if IIv n - ltllYar = O(pn).


e.g. this holds if the initial distribution Vo is absolutely continuous wrt It. dv O = h dlt and satisfies
ess-suP lt h(x) <00.

3 In tbe terminology of Orey (1971). the terminologies small set or petit set (tbe French translation of the
previous one) are also often used.
90 Mixing

Duflo (1990), using Orey (1971). Moreover SuP [Ex'tc < 00 implies aperiodic positive
XEC
recurrence. Now sufficient conditions for those conditions to hold for some Coset are stated in
terms of Lyapounov functions. We fIrst recall the following defInitions.

(I) X is m-irreducible, for some a-finite positive measure m on (E, rf), if


It FE rf, m(F) > 0 ~ It x E E, :3 n ~ 0, pn(x, F) > O.
(H) X is Harris recurrent if there is a a-finite positive measure m on (E, rf) with,
It FE rf, m(F) > 0 ~ It x E E, fPC u
rXn E F}/ Xo = x) = I.
n=]

(d) Period of irreducible Markov chains: Orey (1971) proves, under the irreducibility
assumption, that there is a measurable partition rC1, ... , Cd' F} of n, with m(F) = 0,
P(x, Ci+ 1) = 1, It x E Ci - the index i is considered modulo d.
If d = 1, the Markov chain X is said to be aperiodic.

We are now in position to state the following Lyapounov function test criterion - see
also Nummelin (1984).

Theorem 2 (Tweedie (1975), Nummelin, Tuominen (1982)). Assume that a Markov


process with transition probabilities P is m-irreducible and aperiodic. Let C E rf, g be a
nonnegative measurable function on (E, rf) and £ > 0 be a real number.
i) If C is a C-set with sug
XE
f P(x, dy) g(y) <
CC
00 and Pg(x) ~ g(x) - £, X re C, then

the process is ergodic.


ii)If some real number r> I satisfies Pg(x) ~ r- 1 g(x) - £, x re C, then the following
relations hold
fExr7:c ~ £-1 g(x), x re C, fExr7:c ~ r £-1 f P(x, dy) g(y), x E C.
CC
Moreover, if C is a C-set with SUTJ
XEt::
f P(x, dy) g(y) <
~
00 then relations (3') and (3) hold.

The forthcoming Theorem 3 from Mokkadem (1990) provides simple sufficient conditions for
m-irreducibility and aperiodicity in this result. It is extensively used in § 2.4.0.2., § 2.4.1,
§ 2.4.2.1, § 2.4.2.3.

A Harris recurrent Markov chain admits a unique invariant nonnegative measure 1t and m is
absolutely continuous with respect to 1t. Davydov (1973) proves (see also, Athreya & Pantula
(1986 b)) that an aperiodic and Harris recurrent Markov chain is absolutely regular if 1t is a
fInite measure. More precisely absolute regularity (or /3-mixing) holds for an Harris recurrent
Markov chain with a fInite invariant measure if the distribution of Vn has support in F and no
more than one of the Cj's. This holds whatever the initial distribution is.

Examples 1. (Davydov (1973))


a) f3-mixing does not imply cp-mixing for Markov chains. Let E =[Nu{a, b}; consider the
one step transition probability defined by Paa = Pbb = Pab = Pba = ~,
Pj,a = pj,b = ~(l - Uj) and pj,j+l = Uj for i E [N. If (Uj) is a nondecreasing sequence with
Examples: Markov Processes 91

II Uj = 0 the chain is recurrent with invariant probability 7t such that 7ta = 7tb =~. If
j=1
moreover the initial distribution is concentrated on {a, b} the chain is "mixing
by the previous remarks. If the initial distribution is concentrated on {I} the chain is ~-mixing;
it is "'-mixing iff SUPj Uj < 1. This provides an example of Markov ~-mixing chain which is
not "'-mixing:
b) tfJ-mixing does not imply the recurrence of a Markov chain. Let E = [N; consider the one

(112 112)
step transition probability given by P =
~ooOJgg:::)
0 U 0 0 ...
U 00 .. .
...............
with U = 112 1/2 then the

chain is "'-mixing but it is not recurrent since ~ ~ +00.

Let us now state assumptions useful in the following result. Let E be a separable topological
space endowed with its Borel a-field. Assume that for some subspace S of E and some
nonnegative and a-finite measure Jl with Jl(S) > 0 we have IP C-v n E [N, Xn E S) = I; this
means that the process ~ lives on S and set

(HI) For any compact subset K of E and any Borel subset N of S such that Jl(N) =0, there
is an integerr with 'v' XE KnS, pr(x, N) = O.
(H2) For any compact subset K of E and any Borel subset A of S such that Jl(A) ::t: 0, there
s
is an integer s with InfxE Kr.S p (x, A) > o.
(H3) There are a nonnegative and measurable function g (the Lyapounov function) on S, a
compact subset K of E 'and positive constants A, E > 0 and 0 < p < 1 of S with
Jl(KnS) > 0,
'v' xE KnS, [£(g(X n+ 1 ) I Xn = x) ~ pg(x) - E,
'v' XE KnS, [£(g(X n+ 1 ) I Xn = x) ~ A.

The following weaker assumption is also of interest

(H4) There are a nonnegative and measurable function h on S, a compact subset K' of E and
a positive constants A with
Jl(K'nS) > 0,
'v' U: K'nS, IE (h(X n+ l ) I Xn = x) ~ h(x) - 1.
'v' x E K'nS, [£ (h(X n+ 1) I Xn = x) ~ A.

Theorem 3 (Mokkadem (1990».


a) Assume that the Markov chain (Xn) satisfies (H]), (H2 ) and (H4 ) then the process (Xn) is
ergodic Harris recurrent and its invariant probability 11: is equivalent to the restriction JLs of JL to
s.
b) Assume that the Markov chain (Xn) satisfies (H]), (H2 ) and (H3 ) then the process (Xn) is
geometrically ergodic Harris recurrent and its invariant probability 11: is equivalent to the
92 Mixing

restriction Ils of 11 to S. Moreover if (Xn) is stationary, it is geometrically absolutely regular


and fEg(X n) = 1 g(x) 7r(dx) < 00.

Proof. It will be given in four steps.

a) Assumption (H2) implies both Ils -irreducibility and aperiodicity. Ils-irreducibility is


immediate. let a measurable partition {C I •...• Cd' F} of S with Il(F) =o. P(x. Ci + l ) = 1.
'if x E C i . If d> 1 we may assume that Il(C I »O. say. Let a. b in C I • and C 2
respectively. set K = {a. b} in (H2 ). A contradiction finishes the proof since the integer s is
multiple of d and simultaneously it is not. in view of (d).

b) Assumptions (H2 ) and (H4) imply positive recurrence. By a result in Jain & Jamison (1967)
there is an excessive measure 7r » Ils. Consider a compact K with Ils(K) > O. apply (H2 )
to K and A with Ils(A) > 0 and 1t(A) < 00. then 1t(Kr"lS) < 00. indeed
1t(A);:: J1t(dx) pS(x. A) ;:: 1t(KnS) InfxE
Kr1S
KnS pS(x. A).

Hence subsets Kr"IS with K compact and Il(KnS) > 0 are test sets for Tweedie ergodicity
criterion (H4 ).

c) Assumption (H]). Ils-irreducibility and positive recurrence imply that 7r is equivalent to Ils
and Harris positive recurrence. We only have to prove that 1t « Ils. Let K be an arbitrary
compact in E. and A with Ils(N) =O. then there is an integer r such that pr(x. N) =0 on

KnS. Hence 1t(N) =sl1t(dX) pS(x. N):$; 1t(S\K) is arbitrarily little as shows the regularity

of the measure 1t. Notice thus that subsets Kr"IS with K compact and Il(KnS) > 0 are C-sets
in the sense of Nummelin & Tuominen (1982) since the measure 1t and Ils are equivalent.
Using theorems 2.3 and 2.5 in Revuz (1984). chapter 3. we see that there is some nulset N
with Ils(N) =0 and such that S\N is absorbing thus F(x. A) = 1 for x in N. Ils(A) > 0 and
~

F(x. A) = P(x. u (X n E A». We still have to show that F(x. A) = 1 for x in S\N. and
n=1
for this use (HI) with K ={x} and the relation
F(x. A) ;:: P r (x. A) + JP (x. dy)
r F(y. A).
S\A

d) Assumptions (H]). (H2). (H3) and positive recurrence imply the conclusions of the Theorem
3. Now use the ergodicity criterion in Nummelin. Tuominen (1982) and equality (1) to
conclude.•

Remarks 2. Using Meyn & Tweedie (1992),s theorem 8.1. proves with the identity (1') in
§ 1.1 that an = O(11 n) for some 0 < 11 < 1 under the assumptions of Theorem 3 if the initial
distribution is a Dirac mass at some point. If the initial distribution v 0 of is absolutely
J
continuous with respect to 1t and satisfies g dv o < 00. then ~n = O(11 n) (use theorem 6.3.
ofMeyn & Tweedie (1992) which proves that A(x) = O(g(x» in inequality (3) and the related
Examples: Markov Processes 93

2.4.0.1. A class of non linear models

We follow here the presentation in Mokkadem (1987). Let <Xn) the [Rd-valued Markov chain

Xn+1 = f(X n) + en+1(X n), n = 1, 2, ...

Here f is a measurable function on [Rd while en+1 is a sequence of independent and identically
distributed random fields. Such processes are in the class of Markov systems defmed e.g. in
Meyn & Tweedie (1992).

ConSider the assumptions

(IA) 3 r E il'l : pr(x,.) '" A. and Inf K pr(x, A) > 0 for any Borel set A with A.(A) > O.
xe
This condition implies irreducibility and aperiodicity. Suppose that en is absolutely continuous
and set gtCx) for the density of en(t). Here, the transition density of the process writes
p(x, y) = gx(Y - f(x». Hence condition (IA) follows from the following assumption.

(lA') f is locally bounded and, if tn is a sequence converging to t then some subsequence of


gtn converges a.s. to a strictly positive density gt.

Set now the assumptions

(U') IE len(t)I S :5: C.


(U") lEexp (a len(t)l) :5: C.

Proposition 1. (i) If (IA) and (U') hold and !f(t)1 ~r It I if ItI > M for some M> 0 and
o~ r < 1 then the conclusions of Theorem 3 hold.
(ii) Without assuming (U') but only the uniform integrability of (leit)IS)t and the condition
(flf( t) + eit) IS ~ r ItiS for some M > 0 and 0 ~ r < 1 then the conclusions of Theorem 3
hold.
(iii) If (IA) and (un) hold and lfft)1 ~ ItI - A if It I > M for some M> 0 and A > InaC then
the conclusions of Theorem 3 hold.

2.4.0.2. Dynamical systems approach


This section is an exposition of the results in Chan & Tong (1985) and Tong (1990). We
follow the presentation in Tong (1990), Appendix 1. Let h: [Rdx[R -7 [Rd be a function with
the particular form hex, e) = f(x) + sex, e) and a sequence (En) of independent and
identically distributed random variables. We consider the Markov process defined by the
recurrence relation
(CT) n = 0, 1, ...

This model is a random perturbation of the Dynamical System

CDS) n=0,1, ...


94 Mixing

Consider a norm 1.1 on [Rd. Let us introduce the set of assumptions

(CT I) 0 is an equilibrium point of the system (DS) which is exponentially asymptotically


stable in the large i. e.
f(O) = 0 and::3 K, c > 0, V n E !J'j, IXnl ::;; K e- cn Ixol.

(CT2) V x E [R d, V r > 0, fP (Is(x, cn)1 < r) > O.

(CT3) The distribution of c n is absolutely continuous with respect to Lebesgue measure and
its density is positive on some interval Hi, 0[.

(CT4 ) The function h is differentiable at the ongIn. Its partial derivatives


b = Deh(O, 0) E [R d, and the dxd matrix A = Dxh(O, 0) satisfy that
{b, Ab, ... , Ad-1b} are linearly independent vectors of [Rd.

(CT5) The function f is Lipschitz continuous over /Rd, i.e.


:3 M> 0, V x, yE [R d, If(x) - f(y)1 ::;; Mix - yl.

(CT6) :3 't > 0, V XE [Rd, [Is(x, cn)1 < 'to

(CT7) 0 is an equilibrium point of the system (DS) which is exponentially asymptotically


unstable in the large i.e.
f(O) = 0 and,:3 K, c > 0, V n E !J'j, IXnl ~ K e cn Ixol.

(CT 8) s(O, cn) is not identically equal to 0 a.s.

Theorem 4 (Tong 1990). Assume that f is continuous over /Rd and continuously
differentiable in a neighbourhood of the origin. Suppose that conditions (CT]), (CT2), (CT3 ),
(CT4 ), (CT5 ), and (CT6) hold. The Markov process defined by (CT) is then geometrically
ergodic.

For this, use Theorem 2 with the Lyapounov function g(x) = SUPn~O {eCn IfI(x)I} in order to
prove that the assumption (H 3) holds with S = [Rd. The rest of the proof uses the results in
Feigin and Tweedie (1985). The main difference with Theorem 3, is that no localization ofthe
Markov chain on a subset may be considered. However, as quoted in Tong (1990), Theorem 4
may be used to prove the geometric ergodicity of various classes of Markov process instead of
Theorem 3. A partial converse to this result is also provided in Tong (1990).

Proposition 2 (Tong 1990). Suppose that conditions (CT2 ), (CT5 ), (CT6 ), (CT7 ) and
(CT8) hold. For any initial distribution of XO' the process defined by CT satisfies IXnl -+ 00

with positive probability.

The philosophy of the previous results justifies them. The stability properties of the dynamical
system (DS) are essentially the same as the ones of the noisy system (CT). The point is
however that only very strong contractivity conditions are considered in these results.

2.4 . 0 . 3. Annealing

We recall here some basic facts in the case of nonhomogeneous discrete valued Markov
Examples: Markov Processes 95

chains. Such chains are used in annealing techniques for Gibbs fields, see § 2.2. We follow
here the lines of FoIlmer (1988) and Guyon (1992).

Simulation of a Markovian distribution probability Il with a potential (Iv) on the product


space n = nEt
tET
is essential in Gibbs fields theory. Another problem of interest is the

determination of maxima of the potential function.

Annealing techniques have thus been introduced for computational reasons. The finite
case Inl = w < 00 is the only one interesting to investigate. Here T is finite as well as E("

The density of the distribution probability Il with respect to to the product discrete counting
measure v, takes the form
dll 1
-(co) =-exp{-I(co)}.
dv Z(co)

We set c(P) = ~ SUPi,j t IP(i,k) - PU,k)1 = ~ SUPi,j IIP(i,.) - PU,.)IIVar for a Markov

operator P on the finite state space n. We indicate some elementary properties

c(P)::; I - Ow for w = IQI and 0 = Infi,) P(i,j).


IIf.lJP - f.l2PI IYar ::; c(P) IIf.lJ - f.l2l1YaJor any couple of probability distributions (f.lJ' f.l2)·
Thus c(PQ) ::; c(P) c(Q) (see formula § 2.4 (1)).

Consider now a nonstationary Markov chain defined by a sequence P n of Markov


kernels. The Markov chain is weakly ergodic if, li~-7"" SUPIlI'1l2111l!Q~ - 1l2Q~IIVar = 0,
for any n ~ 1 where Q~ = P n P n+! ... Pm' In view of formula (1), this implies 13-mixing. It
holds equivalently with li~-7"" c(Q~) = 0, for any n ~ 1.

The chain is said to be strongly ergodic if there is a distribution Il"" such that, for any
n ~ 1, li~-7"" SUPIlIIIlQ~ -1l""lI var = O. It holds if for any n ~ 1, n c(Q~) = O.
m~n

Assume that Iln Pn = Iln, then weak ergodicity implies strong ergodicity if moreover
""
L IIll n+I-ll nll var < 00. The behaviour of ~L
n
f(X i) is that of f f(co) Il",,(dco) if the
~ ~ n
following conditions hold. Let cn = max!::;i::;n c(Pi), then the weak law of the large numbers

holds if Iim n -7"" n(1 - c n) = 00, the strong law of the large numbers holds if

L n-2(1 - c nr2 < 00 and the central limit theorem holds if limn-7"" n 1/3(1 - c n) = 00.

n=!

Simulation algorithms for the distribution Il follow from such results for a Markov chain with
Il P n = Il· Since strong ergodicity implies li~-7"" !P (X n = x I Xo = xo) = Il(x), we only
96 Mixing

have to detennine Pn'

The ftrst example is the dynamic of Metropolis. Choose any symmetric matrix Q on n.

Set P n = P with if i"* j, !LU) ~ !L(i), P(i,j) = Q(i,j) !LU.), and otherwise P(i,j) = Q(i,j) if
!L(I)
i"* j, P(i, i) = 1 -I, P(i, k) if i =j. Rewrite this as P(i,j) = Q(i,j) exp{ (lU) - I(i))+}.
bi

The other fundamental example is the Gibbs sampler. It is defmed for a sequence sn in
T such that each element of T is visited infinitely often (e.g. sn = n - N [N] if
T = (l, ... ,N}). In this case, set Pn(i,j) = 11i j1t Us I i(n)) where we have set
(n) (n) sn n
=J'

i = (i I' ... , ir) E nand i(n) = (it)t;tsn and analogously for j. In this case
limn~~ [p (X n = x I Xo = xO) = !L(x) if !L(x) > O.

2.4.1. Polynomial AR processes


This section is devoted to polynomial AR processes, the examples considered are
Bilinear processes (§ 2.4.1.1) and ARMA processes (§ 2.4.1.2). We follow the presentation
in Mokkadem (1990).

Consider the general polynomial AR process (Zn) with values in a smooth algebraic variety (4)
E; it is defined by an independent and identically distributed sequence (en)nE 2'. with values in a
smooth algebraic variety F, a polynomial function <P: ExF -7 E and by the recurrence relation

(PAR) n = 1, 2, ...

The marginal distribution of sequence (en)nE 2'. is assumed to be absolutely continuous with
respect to a Lebesguian measure !LF (5) and its support M = {f > O} is deftned by its density
f. Let <pe(z) = <p(z, e), S the semigroup of polynomial applications generated by <Pe and Sz the
orbit of a point z in E by this semigroup (Sz = (<Pe\o ... o<Pek(z); k E [N, ei E F}). We shall
assume (H3) and
(AI) ::3 TEE, ::3 a E F, \;/ X E E: T E cl(Sx), T = <peT, a), T is said to be an
attraction point for the chain.

(~) The sequence Rk(x) = <Peno ... o<Pen_k(x) converges in mean of order s to a limit
independent of x for some real number s > O.

(AI) holds true if<p~ = <pao ... o<Pa (k times) is Lipschitzian with order (strictly) less than 1. Let
Zn be the limit in distribution of Rk(x) under assumption (A 2), then the process (Zn)nE Z is
stationary, satisfies the recurrence relation Zn+\ =<p(Zn' e n+\), and its marginal distribution 1t
4 Recall that an algebraic subset E in IRd is a subset defined for polynomials FI"'" Fr in [XI"'" Xd1 by
E = Ix E IRd; FI(x) = 0, ... , Fr(x) = OJ. A smooth algebraic variety E is now an analytic manifold which is
an algebraic set and which is not the union of two proper algebraic subsets.
5 If F is a submanifold of IR d it is the induced area measure.
Examples: Markov Processes 97

is invariant by the Markov chain (Zn). (H3) and (H4) imply (~).

Let us set some definitions before stating the result. The vector space spanned by ST is called
=
the Euclidean space of the process ("n)ne z. Let now cpk(z, e 1, ... , ek) CPe,o ... oCPe/z) be the
iterated polynomial function we set Dk = cpk(z, Fk). Note that ST = 'i Dk and (H2) implies
that the process (Xn)ne Z is in the closure of ST, that is 1t(cl(ST)) = 1. Let W k be the closure
in Zariski's topology (6) ofDk; the sequence W k is increasing thus (1) it is constant, W k = W,
for k ~ leo; W is called the algebraic variety of the states of the process (Xn)ne z.

Theorem 5. Under the assumptions (A 1), (A 2) and (H3 ), the equation Zn+I qJ(Zn' en+I ) =
defines on the orbit ST ofT a geometrically ergodic Harris recurrent chain and its invariant
probability 1C is equivalent to the restriction J.ls of J.l to ST.

Moreover, if the process (Zn) is stationary then it is geometrically absolutely regular and
(fg(Zn) = jg(X) 1C(dx) < 00.

If M is open, then the Markov chain is Harris recurrent and geometrically ergodic.
The proof of this result is based on heavy algebraic geometry arguments and may be found in
Mokkadem (1990). We are here much more interested in the applications of this result to
tractable models.

Until the end of this subsection we shall only consider affine models defined on
E = [RP by Zn+l =A(e n+1) Zn + b(en+1), n = 1, 2, ... where A is a polynomial function
with pxp matrix values and b is an [RP-valued polynomial function. Let 0 be a point in F, we
shall make the assumptions

(A'I) The eigenvalueS of A(O) are inside the open unit disk.

(Ai) The series u m =


m
n A(e n+1_j ) b(e n+1_j ) converge a.s. and the sequence of matrices
j=O

=n A(en+1_j ) converges a.s. to the 0 matrix.


m
Vm
j=O
(AZ) [Ellb(en)II S < 00, [EIIA(en)II S < 00, for some matrix norm 11.11.

Corollary 1. Under assumptions (A~), (A~) and (H3), the conclusions of Theorem 5 hold.
Moreover the assumption (A;) implies (A~) and (H3 ) with g(x) = Ixls + 1.

6 Zariski's topology is generated by elementary closed sets which are the zero sets of polynomials.
7 An increasing sequence of algebraic varieties in IRd is stationary.
98 Mixing

2.4.1.1. Bilinear models


Consider the stationary processes satisfying for some independent and identically distributed
sequence (a;) with a finite variance the bilinear recurrence relation

(BIL)

A Markov representation of such process is provided in Pham (1986). There is a stationary


process Z, [Rf-valued, with r = Max{p, P + q, P + Q} and some index m with

Y t = H Zt+rn_I' Zt = (A + B c t) Zt_1 + c c t + d c~ + f, for matrices A, B, Hand


vectors c, d and f.

Corollary 2. Assume that the density of et is positive in a neighbourhood of 0, the


eigenvalues of A have modulus less than 1 and LEIE/ s < 00, LEilA + BEt ll 2s < 00, then
geometric absolute regularity holds and LEI Yl < 00.

Remark 3. In the simpler case Zt = (A + B c t) Zt_1 + c cp note that 0 is an attraction


point of the model. Let <A,B ;c> be the space spanned by c and invariant by A and B (8), and
<A;c> be the space spanned by c and invariant by A (9). If W is the algebraic variety of its
states, we have the following result. If <A;c> = <A,B;c> then a dimensionality argument
implies that W = <A,B;c>.

Example 2. if A = ( 0 01 0
000
0) , B= ( 0) (0 0 0
010
, c= 01 )
0
then (c, Ac, B c)

generates <A,B ;c> thus <A,B ;c> = [R 3. Moreover


SuPx rank D<p2(0, x) = SUPx rank D<p3(0, x) = 2. This implies that dim W = 2 and
W = W 2. Thus we only have to specify the range of <p 2 (0, .), but
<p2(0, e I' e2) = Ace 1 + B c e 1 e2 + c e 2 thus W = <p2(0, [R 2) is the algebraic variety with
the equation Z = X Y in the basis (c, A c, B c), it is not the whole space [R 3.

In this case Theorem 4 cannot be used to prove geometric ergodicity, since the process does not
fill the whole space.

2.4.1. 2. ARMA models

Assume that the [Rl-valued process (Yt)tE Z satisfies the recurrence relation

(ARMA) ±
i=O
Bi Y t-i = :f
j=O
Aj Ct_j'

for some independent and identically distributed [Rf-valued sequence (ct) of centered random
variables, Bi is a lxl real matrix for i = 0, ... , p, Aj is a lxr real matrix for j = 0, ... , q and

8 That is Span{A PIBP2 ... AP,.IBP, c; f> 0, Pi ~ OJ.


9 That is Span{APc; P ~ OJ.
Examples: Markov Processes 99

BO is the identity lxl matrix. Define for z E a:: the matrices P(z) = fBi zi, Q(z) = ~ Aj zj
i=1 J=1
and assume that

(S) The zeros of the polynomial PI (z) =det P(z) have modulus bigger than 1.
If (S) holds then the equation (ARMA) has a unique solution which is stationary and takes the

form Y t = L C k tt_k' A first way to get mixing sufficient conditions is the use of the results
k=1
in section 2.3, we shall prefer the simpler one concerning white noise (Et ) given in Mokkadem
(1990). It is also based on the previous Theorem 5.

Theorem 6. Under the stationarity assumption (S) the vectorial ARMA(p, q) model (ARMA)
is geometrically absolutely regular if the marginal distribution of (ct) is centered at expectation
and dominated by a Lebesguian measure on a smooth algebraic subvariety V of fRr containing
O.

Sketch of the proof. Think of a Lebesguian measure as the Lebesgue measure on a linear
submanifold of [Rd or more generally as its representations on the maps defining a Riemannian
manifold. We shall only sketch the proof of this result based on the Markovian representation
of Y for some X t in [Rk with k = max{p, q+ I}: X t = FX t_1 + GEl' Y t = HXI' here F, G,
H are real matrices such that the non zero eigenvalues of F are the inverse of the roots of P I' X
is stationary and Et and {X t _l , X t _2 , ... } are independent. It enough to prove the result for the
process X, the algebraic variety considered is the subspace S = n FP G([R k). A suitable
function g is given with the help of a Jordan decomposition of th/r~~triction of F to S, with
FVj,1 = A.j v j ,I' FVj,i = A. j Vj,i + Vj,i_l for i> 1, and i = 1, ... , Ij , j = 1, ... , J.

Now set g(x) = (1 + L. ~~ IA.Pj-i)b x· .)S where x =


J J,1
L. ~
~ v·· x· . and b satisfies
J,1 J,1
J 1 J 1

lA./ + lA.jl < r < 1 for j = 1, ... , J. It follows that IE (g(X t + 1)/ X t = x) :5 r S g(x) + A .•

Ango Nze (1992) proves the following robustness result. It is stated as a Corollary since the
technique used in the proof is the same.

Corollary 3. Assume that (ct ) is a sequence of independent and identically distributed real
valued random variables centered at expectation with common distribution equivalent to A.
P
Assume also that the polynomial P(x) = xf - L ai xf- i has its roots inside the open unit disk
i=l
of (f. Let g be a measurable function g: fRP ~ fR, such that Ig(x)I5'lxl for Ixl > A, and
bounded on the set {Ixl 5' A} for some A ~ O. Then there exists some ao > 0 such that the

process (Xt) satisfying the recurrence relation X t = LP a i X t_i + a g(Xt _l , ... , X t _p ) + ct is


i=l
geometrically ergodic for lal 5' alJ'

Theorem 6 states this result for a = 0 if the distribution of (Et ) is absolutely continuous with
100 Mixing

respect to A.. Unfortunately the value of 110 is not directly related with the distance of the zeros
of P to the unit circle; note however that this result holds for any a in the case of a bounded
function g.

2.4.2. Nonlinear processes


We develop in this subsection results concerning nonlinear processes. The models
considered generalize ARX(k, q) models (§ 2.4.2.1) and financial ARCH or GARCH models
(§ 2.4.2.3). The special case of AR(l) processes is considered in § 2.4.2.2..

2.4.2.1. ARX(k, q) nonlinear processes


We consider in this section the general ARX(k, q) nonlinear process defined by Yo' ... 'Yl-k
(Yi E [Rd) and the recurrence relation
(ARX) Yn = f(Yn_l' ... ' Yn-k' x n, ... , xn_q+l) + ~n' n = 1, 2, ...

where f: [RkdxlMq ~ [Rd, d, k and q~ 1 are integers and {x n}, {~n} are independent
sequences of independent and identically distributed random variables with distribution L and
M valued in [R d and a Banach separable space (1M, J>{, ) respectively. Let 1.1, 1.101 denote
respectively norms on [Rd and 1M. On the state space of the process [RdkxlM q, we set the norm 1.1
defmed by a=Max{lutl, ... ,lukl,lvtlD1' ... ,lvqI01} if a = (uP ... ,uk,vp ... ,Vq)E [RkdxlMq.

The aim here is to state the mixing properties of such models under assumptions that can be
checked in terms of a priori information about the problem (i.e. in terms of the properties of
distributions L and M of {xn} and (~n}and the function f). We study probabilistic properties
of ARX Markov chains, as irreducibility, aperiodicity, and ergodicity. From § 2.4. general
results we obtain a-mixing and $-mixing sufficient conditions if the initial distribution of the
process is a Dirac mass and ~-mixing and $-mixing under a stationarity assumption. Note that
the initial condition now involves the values ofXo,·.·, xl_q.

Let Xn = (Yn-p···, Yn-k' x n, ... , xn_q+t ), n ~ 1. The sequence {Xn} is a [RdkxlMq-valued


Markov process and its transition operator p(a, A) = IP(Xn+t E AI Xn = a) is defined by

p(a, A) = L(Ucf(a» 0u/ut) ... OUk(uk_t ) M(V t ) 0Y2(v t ) ... Oy q(vq_t )


kd q
for a= (up ... , Uk' vp ... , Vq) E [R xlM and
Chk d Chq
A=Utx ... xUkxVtx ... XVqE w ([R )xw (1M).

This measure is very singular and thus it is clear that not much can be expected directly from
the one step transition measure. Thus we compute the r-th power transition kernel pr(a, A).

Define a[ui, ... ,U;,vi, ... ,v;1 = (ui, ... ,u;,up ... ,uk_r,vi, ... ,v;,vp ... ,Vq_r) for r::;;min{k,q},
a[ui,···,u;,vi,···,v;1 = (ui,···,uk,vi,···,v;, v t , ... , v q-r) for k ::;; r ::;; q and
a [ui , ... , u;, vi, ... , v;1 is defined similarly for the other cases, we have

Lemma 1. Using the previous notations and setting Uj = fRd if j > k and "i = 1M if j > q,
Examples: Markov Processes 101

we have
pr(e,A)= f L(duj) f L(dui)··· f L(dur:j)x
Ur+j((J) Ur_1+j(d!u;, v;]) U2+j((J[u ;, ... , u:_2 ' v;, .. , v:_2 ])
x f M(dvj) ... f M(dv;) M(V}) L(Ud(X[uj,., u;_I' vj,., v;_IJ))·
Vr v2
.
Proof. Use recurrence and the relatIOn P m+l (8, A) = f P(8, d8') P m
(8', A). Here
8' = 8 [u; , v;J and the integral only depends on the random variables U; , v; except for Dirac
masses. +

Let us consider some consequences of Lemma 1. We directly obtain the following sufficient
condition for (H 2).

Lemma 2. If L > > 1 then


(A) The Markov chain {Xn} is irreducible with respect to the measure 1 ®k ®M(j9q.

Proposition 3. Under the irreducibility assumption (A), the Markov chain {Xn} is aperiodic
(i.e. d = 1 in relation (d)).

Proof. Let P be a period in condition (d) and S denotes the support of the distribution M, then
the form of the transition probabilities kernel P yields
d
8=(ul""u k , v!' ... , v q )IR x{ul'···' uk_l }xSx{v 1 '···' vq_l} C A p + 1'
E Ai =?

Using now the recurrence device we deduce easily that Ai = IRkdxS q, thus P = 1 and the
chain is aperiodic, indeed IR kdxSq is the support of ~'s distribution. +

Geometric ergodicity· is still considered from two different ways. The fIrst way is a direct
application of Lemma 1, leading to Doeblin's condition which is a strong assumption. Denote
by B(a, r) the closed ball centered at a and with radius r in IRd.

Proposition 4. Assume that f is bounded and L;;:: a 11 B(O, a+llflJ~) 1, for some a, a > O.
Then if r;;:: max{k, q}
pre e, A) ;;:: J1.{A) for some non-negative measure 11 with nonzero mass 110.

Proof. The result is proved for product Borel sets using Lemma 1 and the inequality follows
with ~(A) =ak
ME
f A(dul) ... A(duk)M(dvl) ... M(dvq)' which holds ifB=(B(O,a»kxlMr.+
Remarks 4. The result still holds if (x n) takes values in an arbitrary Polish space. The
process (Yn) defined by Yn = fn(Yn-l'···' Yn-k' x n,···, x n_q+ 1) + ~n is geometrically <1>-
mixing for any initial condition {Yo, ... , Yl-k} if (ft ) is an uniformly bounded family of
measurable functions.

Irreducibility and aperiodicity assumptions follow from Proposition 3 under condition (A).
We are thus in position to prove ergodicity criteria using Theorem 3. E.g. this result shows that
ergodicity and geometric ergodicity follow from the existence of a nonnegative and measurable
Lyapounov function g locally bounded such that for some Xo > 0, c > 0, 0 < p < 1 and any
102 Mixing

e with lei> xo'


IE {g(Xn+I) I Xn = e} ~ g(e) - c, for some c > 0,
IE {g(Xn+ 1) I Xn = e} ~ p g(e) - c, for some real constant c.

In order to establish these results we introduce the following assumption on the function f.

We assume that there exist nonnegative constants a I , ... , ak' a locally bounded and
measurable function h: 1M ~ IR+, and positive constants Xc, c, such that

(ff) If(e)1 ~f ai lujl + t h(vj) - c if lei> xo,


i=I j=I
and SUPI91~xo If(e)1 < co.

Theorem 7. Assume that assumptions (A) and (fF) hold and fEh(xl) + fEl~11 < co. Assume
that the unique nonnegative real zero of the polynomial P(z) = 1- al I-I -... - ak satisfies
p ~ 1.
i) Then the process X is ergodic if fEl~11 + q fEh(xl) < c.
ii) The process X is geometrically ergodic if p < 1. Hence if the process X is stationary then
y is a geometrically {3-mixing process.

Recall that if the distribution of ~1 is equivalent to Lebesgue measure then condition (A) trivially
holds.

Proof. The polynomial P has only a positive zero p as it is shown in Polya & Szego (1972),
§ III-1-2, p. 106. Assumption (A) in Lemma 2 follows from Proposition 1 so that the only non
trivial point to prove in Theorem 3 is the existence of a locally bounded and nonnegative
function g and suitable constants E and 0 < p ~ 1 and Xc > 0 with
(L) IE {g(Xn+I) I Xn = e} ~ p g(e) - E, if lei> xo·

Set g(e) =
i=I
f
ai IUil +
j=I
t ~j
h(vj) for positive constants a i and to be precised later and~j
al = 1. Let lei> xo, note that if Xn = e = (uI' ... ' Uk' VI' ... ' v q ) is fixed then
Xn+I = (Yn' UI'···' Uk_I' xn+I' VI'···' vq_I) for Yn = f(e) + ~n. Hence

IE {g(Xn+I) I Xn = e} ~f ai Iujl+
h(vj) + IE + t ~;
IEh(xn+I) - c, I~nl ~I
i=I
j=I
where 14' = ai + ai+I for 1 ~i<k, ak = ak and~} = 1 + ~j+I for I ~j <q, ~~ = l.
Let us show that there exist some {ai' ~j} with
ai = p ai' ~} ~ p ~ j for 1 ~ i ~ k and 1 ~ j ~ q.
The first equalities yield iteratively P(p) = 0 as well as an appropriate choice of coefficients
a i > 0 for i < k. Now choose ~ j = p ~ j for j < q and ~ ~ ~ p ~ Ii or
equivalently 1 ~ P ~q and p ~j = 1 + ~j+I for j < q. We get ~j = pH ~I - (1+ ... +pj-2)
fior J· < q. Those reI·
ations h0 Id I·f PI
R > 1+ ... + pq-I S·
_ q . mce p < 1 thi s IS
. the case I·f we set
p
Examples: Markov Processes 103

~ _ .9-
1- pq.
Thus we have detennined coefficients {(Xi' ~j} such that Lyapounov inequality (L) holds. The
two cases considered in Theorem 7 have now to be considered separately.
Assume ftrst that p < 1 then (L) holds for some real number E. Replacing p by some greater
value minor than 1 and increase Xc yields condition (L).
If now p = 1 then ~l = q is a suitable choice hence (L) holds for some E > 0 and p if
IEI~nl + q IEh(xn+l) < c .•
Remarks S. If the distributions L and M are absolutely continuous, then it is also the case
for 1t and pn(8, .). Moreover if /.1 and /.1n are the densities of 1t and ~ then the second part of
J
Theorem 7 asserts that 1/.1(8) - /.1 n(8)1 d8 = O(ll n).

Remarks 6. Polya & Szego (1972, § I1I-1-2, p. 106) propose the bound M ~ P for the
positive zero of P. This bound.is defined by constants c i ~ 0 such that c t + ... + c k ~ I

setting M = maxtsisk [t.]IIi. E.g.


1
M =maxISiSk [i ai /i or M =2 maxISiSk [ai /i are
suitable bounds of p. Hence geometric ergodicity of the process {Xn} follows from Theorem 7
if one of the following assumptions holds

for 1 ~i ~ k,

for 1 ~i ~ k.

Remarks 7. For q = 0 ARX process are simply nonlinear AR(k) processes. They are
defined by
Yn = f(Yn_I'···' Yn-k) + ~n·
If k = 1, the condition in part two of Theorem 7 writes with limsup If(lxlX)1 = P < 1.
Ixl~
k
In the AR(k) process Yn = L b i Yn-i + ~n' the best possible bound in condition (ff) is
i=I
k
If(8)1 ~ L Ibilluil. The roots of the polynomial Q(z) = zk - b l zk-I + ... - b k have a
i=I
modulus less or equal to the only positive zero p of P(z) = zk - Ibll zk-t - ... - Ibkl.
Mokkadem (1990) Theorem 6 proves geometric ergodicity of the process if the roots of Q lie
inside the unit disk. In fact Polya & Szego (1972, § I1I-1-2) show that (2 11k - l)p ~ /.1 ~ P
if is /.1 the largest modulus of a zero of Q. The previous loss between assumption conceming
the zeros of Q and R seems to be essential in view of the form of majorization chosen; this also
explain the interest of Corollary 3.

Example 3. Let (Xn) be the real valued Markov process deftned for some independent and
identically distributed sequence (~) with an absolutely continuous distribution with 1E1~11 < 00
by the recurrence relation,
(TAR(l)) Xn+l = [(i>I 1 {X :!>O} + <1>2 1{X >O}]X n + ~n' n = 1,2, ...
n-p n-p
Petrucelli & Woolford (1984) prove for p =0 that a necessary and sufftcient condition for
104 Mixing

geometric ergodicity to hold is q,1 < 1, q,2 < 1 and q,1 q,2 < 1. Chen & Tsay (1991) extend
> 1, the condition is transformed adding q,~ q,~ < 1 and q,~ q,1 < 1 for
(10) this result for p
some explicit integers s, t depending only on p. It is interesting to note that Theorem 7 yields
the sufficient weaker condition 1q,11 < 1, 1q,21 < 1.

Remarks 7 and Example 3 show the limitations of such general results.

2.4.2.2. AR(l) nonlinear processes

In the more general situation of a multiplicative (11) locally a-compact group (E, .) the
particular case of AR(I) nonlinear process is investigated more precisely. The presentation in
Doukhan & Ghindes (1980) is followed here.

The AR(l) nonlinear process is defined by Xo' (Xo e E) and the recurrence relation
Xn = f(Xn_I)'~n' n = 1, 2, ...
where f: E -7 E is measurable with respect to the Borel a-field $(E) on E and {~n} is a
sequence of independent and identically distributed random variables with marginal distribution
L. The reference measure is the left hand Haar one, A. Additional probabilistic properties of the
Markov chain - as irreducibility, aperiodicity and ergodicity - are given below. The transition
probabilities are
=
P(x, A) L([f(x)r l A) for xe E, Ae $(E).

Let C(E) (resp. M(E» be the set of bounded and continuous (resp. measurable) real
functions on the topological space E, X has the Feller (resp. strong Feller) property if
P(C(E» c C(E) (resp. P(M(E» c C(E». Note that if f is continuous then the Markov chain
is aperiodic and has the Feller property, if moreover L« A then it has the strong Feller
property.

Consider for 0 ~ z < 1 the kernel Gz<x, A) = L zn pn(x, A) and for any Borel set
n=O
A, the set of the points that can be reached from A, I(A) = {x e E, Gz(x, A) > O}.

X is said irreducible in the open sets (10, for short) if 1(0) = E for any nonempty open set in
E. Write S for the support of the measure L and say that L has ACP if L has a non trivial
absolutely continuous part with respect to A.

Lemma 3. Assume that f is continuous, onto and


(*) \i A E $(E): f(A).S c A => fA = ~ or E}.
If X is 10, and L has an ACP f or L has an ACP, and L« .t, or E is a metric space}, then
the process X is A.-irreducible.

Proof. Note first that there are some Borel set B with L(A);::= b A(AnB) for some constant
b > 0 and any Borel set A.

10 The sufficient condition of the previous result is proved using Theorem 2 for the iterated Markov process
(Xnd)~' The converse is proved directly proving that else the process is explosive.
II The multiplicative notation means that no commutativity assumption is assumed. It will thus include the
use of classical groups as the orthogonal one in the space of square matrices. We denote by I the unit of E.
Examples: Markov Processes 105

a) Assume that A E ffi (E) satisfies A(A) 7; 0 and 0 is a nonempty open set in E then
Gz<x, A);::: f Gz(x, dy) Gz(Y, A);::: JGz(x, dy) A«[f(y)rl.A)nB).
The facts that f is continuous and onto and the continuity of the function 11. A* 11. B yield the
result.
b) Note that 1(0) is an open set, then considering J = I(O)c yields a contradiction because
P(x, J) = I for x in J and f(J).S c J.•

Examples 4. Condition (*) holds if S = E or if E is non-compact and S contains the


complement of a compact set and f is continuous and onto. It also holds for a metric space E if
for some c > b > 0, B(1, c) c Sand B(x, r-b) c f(B(x, r» for r;::: c and x in E] or if
[the sequence fn(x) has a limit point for any x in E and d(f(x), f(y» ;::: d(x, y) - b for
d(x, y) ;::: c].

X is said recurrent in the open sets (RO, for short) if G 1(x, 0) = 00 for any nonempty open
set in E and any x in E.

X is called I-recurrent if the taboo probability ('2) satisfies U(x, A) = I for any Borel
set A and any x outside of a A-nulset, N(A) (,3). From now on we assume in this subsection
that E is a metric space with a distance d invariant under translations. Usual Markov tools yield

Lemma 4. Assume that there is a compact set K in E such that G j(x, K) = 00 for any x in E.
Iffis continuous and onto, L has an ACP and (*) holds then X is A-recurrent.

Proof. Use the results in Revuz (1984, chapter 2.7) to prove that a I-recurrent and
A-irreducible process is A-recurrent. In the present framework these notions are equivalent as it
is shown using
U(x, A) = P(x, A) +
AC
f P(x, dy) U(y, A) ;:::1 - [p (.n
J=I
{Xj E N(A)} / Xo = x).

Now it is natural to set Rx = {y E E; G1(x, B(y, e) = 00, V 10 > O}. Let 10 > 0,
aE Rx' SE S. The precompactness of K determines a point in Rx as the intersection of balls
with radius tending to O. By continuity of f there exists 10' ;::: 10 with
G 1(x, B(f(a).s, 210'» ;::: f G 1(x, dy) L(B([f(y)r1.f(a).s, He'».
B(a,E)
Hence G,(x, B(f(a).s, 210'»;::: G1(x, B(s, 10» L(B(s, 10» since we get by the triangular
inequality B(s, e) c B([f(y)r1.f(a).s, He'). The assumptions on f imply f(Rx)'S c Rx and
the proof runs classically .•
Using renewal theory techniques yields the following result in the case E = [R.

Proposition 5. Assume that f is continuous and satisfies


:3 a> 0, f([-a, aJ) c [-a, a] and \1 x E fR, Ixl > a ==> If(x) I ~ Ixl.
If L > > A is a symmetric distribution on fR concave on fR+ and with finite variance then X is
aperiodic A-irreducible and A-recurrent.

12 U(x,A)=1P(3n~1, XnE AIXo=x).

13 That is, coming from any point x outside of N(A) the process visits infinitely often any Borel set A.
106 Mixing

Ergodicity is considered in the same way as recurrence. Let Ex be the ergodicity set, coming
from the point x defmed by
1 n
Ex = {ye E; limsup -
k-+~ n k=1
L
pk(x, B(y, e» > 0, "if e > OJ.

It is easy to prove that Ex satisfies f(Ex).S c Ex hence

Lemma 5. Iff is continuous and onto, L has an ACP and (*) holds then X is ergodic if
moreover there is a compact subset K of E such that for any x in E
I
(**) G l(x, K) = 00 and limsup -
k-+~ n
Ln
k=l
pk(x, K) > o.

The following Lemma yields condition (**) ~ order to derive explicit ergodicity results. The
first result is elementary while the second one uses renewal theory.

Lemma 6. If one of the following assumptions holds, there is some compact K and some
integer nO with InA~no pk( 1, K) > O.

a) There exist two sequences of compact sets (Hn) and (Kn) with L L(H~) < 00, and
n=l

b) The closed balls Bc( x, r) in E are compact and there are constants a, T > 0 with
f(B/1, r» c B/1, r-a) for r> T. The distribution of d(l, Xl) is not arithmetic (14) and
satisfies fE d(l, xl) < 00.

Remark 8. Results of § 2.4.2.1. easily extend to the present framework, but the
corresponding results are not recalled in details in this subsection. We only mention that
=
geometric ergodicity - Theorem 7 - holds if E [Rd and If(x)1 $; k Ixl for some k < 1 and if
Ixl is big enough. Doeblin condition holds assuming that L has an ACP and f(E) is relatively
compact - see Theorem 1.

2.4.2.3. Financial nonlinear processes


Non parametric tests have been proposed by Diebolt (1985) for the ARCH model
Xn+ I = f(X n) + g(X n) e n+I and mixing conditions are proved in Diebolt & Guegan (1990,
1991) for special classes of such models. We recall in this subsection the more general results
proved in Ango Nze (1992).

The vectorial autoregressive model with heteroscedastic errors is defined by the recurrence
relation
(ARCH) Xt+1 = f(X t) + g(X t) e t+ 1
where Xo e [Rd, g: [Rd -7 GLd([R) (15), f: [Rd -7 [Rd are measurable functions and (et) is an
independent and identically distributed sequence with distribution L. We assume that
L(r 1(Z» =0 where Z = {x e [Rd; det g(x) =OJ. If L is equivalent to Lebesgue measure

14 A probability distribution on IR is called aritlunetic if it is supported by a tl. for some real number a.
15 The space GLd(lR) of invertible real dxd matrices is equipped with some norm of algebra [written 11.11]
deriving of a norm on IRd [written 1.1], this means IIAII = Sup{IAxI; Ixl :,; I}.
Examples: Markov Processes 107

on a neighbourhood of the origin in [Rd, then the process (~) is an aperiodic and A.-irreducible
Markov chain.

Proposition 6. The non linear ARCH process (Xt ) is geometrically ergodic if


L(fl(Z)) =0, L is a absolutely continuous distribution with respect to A, the restriction of the
°
distribution L to some neighbourhood of is equivalent to the restricted Lebesgue measure, and
if(i) or (ii) holds.
s [{(x)1 + IIg(x)1I (lElelylls
i) :1 s ~ 1, lElEtl < 00, and limsuPlxl-+oo { Ixt } < 1.
ii) :1 a> 0, \i x E fRd, Xx) = lEexp{a Ilg(x)11 IEtl} < 00 and
.
1zmsuPlxl-+oo ([((x) I + a-lIn Xx)} 1
Ixt < .

Remark 9. The regularity assumption on the distribution L holds if L is equivalent to A and


A(f"I(Z» = O.

An alternative condition for i) can be given in the Euclidean case. If the process (~) is centered
at expectation and its covariance matrix V satisfies
1· (!f(x)12 + tr[Vg'(x)g(x)]} 1
zmsuPlxl-+oo IxP < ,
then the process (Xt) is geometrically ergodic. For this set g(x) = 1 + Ixl2 in Mokkadem

criterion (Theorem 3).1.1 denotes here the Euclidean norm of [Rd: Ixl = ~ xI + ... + xl.

Proof. Use the method in Theorem 3 with hI (x) = (1 + Ixl)S, h2 (x) = e a1xl yields
respectively the following inequalities valid for Ixl big enough and some 0:5: P < 1
lE(h 1(Xt+1)IXt =x)= IE (1 + If(x)+ g(x)~+II)s:5:(1 + If(x)l+ IIg(x)III1E l lls)s:5:p (1 + Ixl)s,
IE (h2(Xt+l)1 X t = x) = lEexp(aIf(x)+ g(x) Et+II) :5: exp(alf(x)l+ln y(x»:5: p h 2(x) .•

Note that here again, Theorem 4 does not apply. Moreover, ergodicity criteria also result
from Theorem 3 by relaxing strict inequalities in weak ones.

Examples 5.
a) Engle (1982) introduced the ARCH linear univariate model given by

Xt+l=~a+bX~Et+l' a>O, l>b>O.


It is geometrically ergodic for a normalized independent and identically distributed sequence (~)
with distribution equivalent to Lebesgue measure.

b) T ARCH models where introduced in Zakoian (1990) who considers the recursive equation
Xt+l = (a+ bX t 11 {Xt>Oj -cXt 11 {Xt<Oj)Et+1 with a, b, c > 0,
and a second order independent and identically distributed sequence (Et). The process is
geometrically ergodic if 0 < bvc < I.

e) Bivariate Garch-M processes are defmed as


Xt+l = f(X t, Y t) + g(Zt) Et +l·
108 Mixing

=
Y t + 1 h(Y t) + 11t+l·
where the independent and identically distributed sequences (£t) and (11 t) are independent, and
the functions f: [R x [R -? [R, h: [R -? [R, g: [R -? [R are measurable and g satisfies
InfyelR Ig(y)1 > 0 (see Nelson (1990».

Set Z = ( X) and F(z) =


(f(Z») . Assume that the process Wt = (£t,11 t) is centered at
y h(z)
expectation and has identity covariance matrix. G(z) =(g~y) ~) leads to the previous case
considering the model
Zt+ 1 = F(Zt) + G(Zt) W t+ 1·

The process (Zt) is geometrically ergodic if there is K, 0 < K < 1, with


If2(x, y) + g2(x) + h 2 (x, y)1 ::;; K (x 2 + y2) for x 2 + y2 big enough.

Survey of the literature

The ~-mixing and «I>-mixing coefficients were linked to the properties of transition kernels in
Davydov (1973). Rosenblatt (1972) studied the behaviour of the a-mixing coefficients.
Rosenblatt (1971, 1972) proposed an approach to mixing using operator theory.

The first mixing properties of general space state Markov chains were stated for Doeblin
recurrent Markov chains in Doob (1953) and Deno (1960) (Theorem 1). Following this
approach, Iosifescu & Teodorescu (1969) proved a simple «I>-mixing condition for learning
systems.

Foster (1953) and Pitman (1974) studied convergence rates to the invariant measure for Markov
chains with values in a discrete state space and respectively proposed propose necessary
conditions for (3') and (4') to hold.
The general probabilistic properties of recurrent Markov chains are studied in Doob (1953),
Jain & Jamison (1967), Orey (1971), and Revuz (1984). The approach of fundamental
interest here, is the one in Tweedie (1974, 1975, 1983) and in Nummelin & Tuominen
(1982) (Theorem 2). These authors proved explicit sufficient mixing conditions - rates of
decay for the mixing sequences (see also Nummelin (1984) and the more general results in
Meyn & Tweedie (1992». Those results are widely used in Mokkadem (1985, 1986, 1987)
for models of statistical interest (§ 2.4.1.). Mokkadem (1990) (Theorem 3) proves explicit
sufficient conditions for ergodity and geometric ergodicity based on simple assumptions on
the transition probabilities. Duflo (1990) and Meyn & Tweedie (1992) give conditions for the
assumption of stability - weaker than ergodicity - in the case of nonlinear models using
Lyapounov functions; she also gives ergodicity criteria based on the Lyapounov functions
technique. Borovkov (1989, 1990) proposes a Lyapounov functions approach to the
ergodicity properties of Markov chain.
The nonlinear AR processes are studied in Doukhan & Ghindes (1980) and in Doukhan &
Tsybakov (1993); those papers are the source of the results in § 2.4.2.2 and § 2.4.2.1. The
results for nonlinear financial processes studied in § 2.4.2.3. come from Ango Nze (1992).
Related processes, called doubly stochastic processes (16) are studied with the same point of

16 The simplest doubly stochastic models are described for a Markov process (~) and an independent
inovation et as the solution of a coupled system
Xt+1 = Zt+1 X t + et"
The process (XI' Zt) is still Markovian and the first authors cited provide sufficient geometric ergodicity
condition in the case when (~) is an AR( 1) or an MA( I) process. The second reference provides ergodicity and
stationarity sufficient conditions in a more general setting.
Examples: Markov Processes 109

view in Meyn & Guo (1993), see also Tjostheim (1990).

Moreover Doukhan, Ghindes (1980, 1981) and Chan and Tong (1985) propose an approach
linking as much as possible the properties of the deterministic system to those of non linear
AR( 1) processes. Asymptotic relations between a dynamical system and the fIrst order
nonlinear autoregressive processes associated are related in Doukhan & Ghindes (1980); the
asymptotic is there with repeet to the fact that the distribution of the noise converges to the Dirac
mass at point O. No decisive result seem to be known in this case (17). See also Tong (1990),
and the papers by Cheng & Tong (1992) or Nychka et al. (1992). The latter authors determine
the dependence of the behaviour of the process on the initial conditions. Related results are in
Pham (1986), Tuan (1986), Tong (1990) and Diebolt & Guegan (1990,1991). We did not use
the results in Tong (1990) to prove the geometric ergodicity properties of various classes of
models since it is restricted to real valued excitating noise and it does not yield results for every
class of models considered here. However those result are really nice since one may really
expect to relate the properties of a dynamical system to the nonlinear autoregressive processes
naturally associated to it. Unfortunately in order to get a general result, Chan & Tong (1985) or
Tong (1990) are led to work with a set of restrictive assumptions. Several proofs may be
simplified in this Chapter (for particular cases) using this result.

Denker & Keller (1986) investigate the case of infInite memory systems. Reviews are given in
Roussas & Ioannides (1987), Athreya & Pantula (1986), as well as in Hernandez-Lenna et al.
(1991).

17 One may expect that the invariant measure, 7tcr of the Markov process Xn+ I = f(X n) + CJ En converges
to some invariant measure 7t for the associated dynamical system i.e. 7t(f l (B» = 7t(B); for definiteness,
suppose those processes to be ergodic for any cr). If f is continuous, the limit points of 7tcr are invariant by f
but no convergence result has been proved in general.
Examples: Continuous Time Processes 111

2.5. Continuous time processes

The notions of mixing still work here, considering the time dependence structure of
continuous time processes. We flrst present here some results concerning continuous time
Markov processes in § 2.5.1. while § 2.5.2. is devoted to their description in terms of
operators. General diffusion processes are described in § 2.5.3. We recall the strong notion of
hypermixing in § 2.5.4. Sufficient conditions for it to hold are the fundamental Bakry &
Emery's hypercontractivity criterion (§ 2.5.5.) and the ultracontractivity condition (§ 2.5.6.).
We detail explicit examples of processes in those sections. The last subsection, § 2.5.7.,
indicates some references concerning general stochastic differential equations (SDE). This
section doe obviously not describe every kind of continuous time mixing processes, e.g.
§ 2.1.2 and § 2.2.3 present other classes of such random processes.

2.5.1. Markov processes


The general results in § 2.4. hold. See for instance in Ueno (1960), Doob
(1953), Rosenblatt (1971) and Pham & Tran (1985). The relations between the norms of
transition operators and the geometric decay of mixing coefficients shown in Rosenblatt (1971)
hold true - see § 2.4.

Let X = (XJtElR'" an E-valued homogeneous Markov process with a regular version of the
conditional probability IP(X t E AI Xs = x) = Pt_s(x, A) for some transition probability

function on the Polish space (E, al (E». We assume the classical continuity assumption
Pt(x,.) => Bx to hold if t ~ O. The process is said to satisfy Dynkin's regularity assumption
if for any compact set in IRd
limHo SUPxe K IP (IX t - xl > £1 Xo = x) = O.
This is a continuity assUmption meaning that li~~o P t =I.

2.5.2. Operators
We give more details on the presentation of Markov processes with the help of
operators. Assume that X is stationary with marginal invariant distribution IJ., then the Hilbert
space L2(1J.) - with scalar product (.,.) and norm 11.11- is separable and contains some dense
subspace D(L). D(L) is called the domain of the inflnitesimal operator L of the Markov process.
Set Px(A) = IP(A I Xo = x).

L is a linear unbounded and non-negative operator defmed by the fact that, for fin D(L), the
t
process Mf =(MDQ() is a local Px-martingale where Mi = f (X t) - JLf(Xs) ds. It is deflned
o
P -I
as L = limt~o+ - \ - .

Assume the additional conditions


(i) L is self-adjoint and onto L: D(L) -+ L2(fl).
112 Mixing

(ii) The spectrum of L is discrete.


(iii) 0 is a single eigenvalue of L associated to the constant eigenfunction 1.

Let (em)m~ be an orthonormal reduction basis ofL2(~) with eo = 1 and

'r:/ f E D(L), Lf = LAm (f, em) em'


m=O
where AO = 0 < Al ::;; A2 ::;; ... are the elements of the spectrum of L.
L being unbounded we have liIDm~oo Am = 00. The semigroup Pt associated to the Markov
process defined by (Pl, g) = IE f(X t) g(Xo) for f, gEL2(~) may be rewritten:

L e- tAm (f, em) em'


~

Ptf =
m=O
in the vocabulary of operators, it means that Pt = e-Lt. Assumption (iii) implies the ergodicity
of the process X, see Bhattacharya (1982, proposition 2.2). Now IIPlll ::;; e- tA \ 11ft I if
(f, 1) = 0, thus Rosenblatt (1971) results (see § 2.4.) imply

Proposition 1. The process X is mixing with at::; e-t}"l if assumptions (i), (ii) and (iii)
hold.

We also defme the Green operator G by Gf = f Ptf dt on the subspace of centered functions,
o
00

1.L={fE L2(~);(f,I)=0}. We get Gf=LA~I(f,em)em' This operator is


m=O
continuous on L2(~), its range is D(L)(')I.L and LG = I - S where I is the identity operator on
L2(~) and Sf = (f, 1) 1 is the orthogonal projection on constants. A uniform law of iterated
logarithm is given in Doukhan & LeOn (1986) in this setting.
Example 1. Brownian motion. Baxter & Brosamler (1976) propose an example of such
Markov <I>-mixing process: the Brownian motion on a d-dimensional compact riemannian and
homogeneous manifold E. ~ is the uniform measure on E. Let L = -/l be the Laplace-Beltrami
A
operator on E.liIDm~oo ~d = c > 0 is proved in Minakshisundaram & Pleijel (1943).
ill

Example 2. The birth-death process investigated in Van Dom (1985) is an example of


O'I-valued continuous time Markov process with precise geometric ergodicity properties. Set
E = {-I, 0, 1, 2, ... }, a birth-death process on E is represented by its transition matrix
P t = (Pij(t»ijeE if for i, j in E
L Pi,/t) ::;; 1, Pi,j(t) ~ 0, Pi,/O) = 0ij' Pi,j(t+s) = L Pi,k(s)Pklt), and
j k
pijt) = L ai,kPk,P) = L Pi,k(t) ak,j for an array (ai,j) with a_l,j = 0, ai,i-l = ~i'
k k
ai,i = - Ai - ~i' ai,i+l = Ai and aij = 0 otherwise; Ai and ~i are the birth and death rates,
Examples: Continuous Time Processes 113

they determine the distribution of the process if


i=O
vi f { + O"i
v i) -1 } = 00 for Vi = A.QA.I ... A.i -1.
J.lQP-I" ,P-i-I
Let 1t = (1ti) be the invariant measure of this process then pdt) - 1ti = O(e- at) for some
a> 0 if and only if cr > O. cr is described in Van Dorn (1985) : it satisfies

cr E [liminfn~oo an' liminfn~oo b n) with an = A. n + P-n -..J A.n_I P-n -..J A.nP-n+I and
1 n _,-
b n = -n L [A.i + P-i - 2-'1 A.i-IP-J .
i=O

2.5.3. Diffusion processes


We present a general formulation of the diffusion processes following Meyer (1982 b)
and Bakry & Emery (1985). E is an abstract state space. Let ~ be an algebra of real valued
functions defined on E. Let L: ~ ~ ~ be a linear application, the energy form associated to
L is defmed as the bilinear form

r: ~x~ ~ ~ st 'if f, g E ~: r(f, g) = "21 [L(f g) - f Lg - g Lf).

Note that if E = !R d , ~ = Coo and L = d, then r(f, g) = Vf.Vg.

A bilinear application r: ~x~ ~ ~ is a derivation on ~ if

'if f, g, h E ~ : r(fg, h) = f r(g, h) + g r(f, h).

Now L is the generator of a diffusion on E if, for any x in E, there exist (Xt, n, ff, ff t) for
some probability space (n, ff) and some process (Xt) on E with XQ = x, adapted to the
filtration (fft) such that for any f in ~, (MP
is a local martingale where

M{ = f(X t) - f(x) - Jt
Lf(X s) ds.

In that case the process (Xt) is said continuous if [t ~ f(~)) is a continuous real function for
fin~. Moreover~ <Mf, Mg>t = 2 r(f, g)(X t).
It is proved in Bakry & Emery (1985) that

Proposition 2. A Markov process with energy form r, associated to its generator L is


continuous if and only if r is a derivation.

In order to prove the converse, they first prove an interesting localization lemma. If for any fin
~, r(f2, g) = 2 f r(f, f) then for any f and g in ~, f= 0 on the set {g O} implies *"
Lf = 0 on the set {g O} . *"
The second energy form is defined as r 2: ~x~ ~ ~ such that
1
'if f, g E ~: r 2(f, g) = "2 [r(Lf, Lg) - r(f, Lg) - r(Lf, g)).
114 Mixing

We now describe Markov diffusions. Let E be a connected C~ manifold, we consider an


algebra of bounded functions including the C~ bounded ones as the constant 1. P t and L are
d
=
operators on .if,. Now at Ptf(x) LPtf(x) PtLf(x). =
Markov property yields ref, t) = £[~Ptf2 - (ptt)2)]t=O;::: O. Assume, from now on, that
there is a probability distribution 11 on E such that .if, c L 2(11), the process is said to be
stationary. If L is a selfadjoint operator of L 2(1l), the process (X t) is said to be reversible
- i.e. (Lf, g) =(f, Lg), where (.,.) denotes the inner product of L 2(1l). In this case P t is also

J
selfadjoint (1), indeed

E
f ref, g) dll = Ef f Lg dll and Ll =0 yield r 2(f, g) dll = f Lf Lg dll·
E
If Lf = 0 only for constant functions f (assumption (iii) in 2.5.2), the process X is said to be
ergodic. In this case Ptf converges Il-a.s. to f f dll·
Examples 3 presented below are taken from Bakry & Emery (1985).

Examples 3.a) If E = IR, .if, = Coo, and Lf = a f" + b f, then ref, g) = a (f)2. If L
is the generator of a diffusion process then a;::: 0 and L = H2 + P H for Hf = a f',
a = -Va and P = ~, and r 2(f, t) = (H2f)2 - Hb (Hf)2.
For the particular d-dimensional case, following the presentation in Karatzas, Shreves (1988),
let cr(x) = (crij(x)i~d.j~r and b(x) = (bi(x))i~d be respectively the dispersion and the drift
matrices of the stochastic differential equations. The functions cri/x) and bi(x) are assumed to
be locally Holder continuous from IR d into IR, for 1:::; i :::; d, 1:::; j :::; r. W = (W t) is an
r-dimensional Brownian motion.
(D) dX t = b(X t) dt + cr(X t ) dW t •

There is a continuous process satisfying (D) if there is a constant K such that for (x, y) in
IRdxIR d
IIcr(x) -cr(y)1I + IIb(x)- b(y)1I :::; Klx-yl and Iicr(x)112 + IIb(x)1I 2 :::; K(l + IIxIl 2).

The dxd matrix defined by a(x) = cr(x) crT(x) is called the diffusion matrix. Set
~ ~ a 2f(x) ~ af(x)
"i f E C 2 (IR d), Lf(x) =.t...J .t...J aij(x) ax.ax. + .t...J bi(x) ax.'
i=l j=l I J i=l I
t
A solution of (D) satisfies that M~ = f(X t) - f(X o ) - f Lf(Xs) ds is a continuous local
o
martingale. It is a martingale if cr is bounded on the support of f.
Examples: Continuous Time Processes 115

Before we describe the hypercontractivity properties of diffusion processes we first recall some
simple results yielding directly mixing properties.

Example 4. Let (Wt)t:<:O be a Brownian motion on [Rd, the process X = (Xt)t~O is solution
of the equation dX t = b(X t) dt + dW t. Recall results in Doukhan, Le6n (1986) yielding
properties (i), (ii) and (iii) for such processes. Let C~ be the space of k-times continuously
d . af af
differentiable functions on [R with a compact support and V'f = (-a' ... , -a). For
Xl xd
2
Jl(dx) = e (x) dx define the bilinear form Bon CJ f
by B(f, g) = V'f.V'g dJl, its domain
D(V') is a Hilbert space with the norm IIfllV = (B(f, f) + IIflI2)l!2.

If the form B is closed there is a self adjoint operator, L, on D(V') with B(f, g) = (Lf, g). If
V'e E Ll~c([Rd) then C6 c D(V') and Lf= -At - b.V'f for fE C6, b = 2 e- I V'e and
Hf= -~f+c with c=e- I ~e.

The operators Land H have the same spectrum, their domains satisfy D(L) c L 2(Jl),
D(H) c L 2([Rd) via the transformation L(e- I f) = e- I Hf for f in CO', it is discrete if
limlxl-7oo c(x) = 00. Grouping these considerations yields

Proposition 3. If /1( dx) = e2 (x) dx is a probability distribution equivalent to Lebesgue


measure such that b = 2 I e- ve
E Ll~J/1), there is a unique solution X t to the stochastic
differential equation dXt = b(Xt) dt + dWr
Assume that div bEL l~PRd) satisfies div b(x) ~ - (a + a' Ix12) for constants a, a' ~ 0,
c E L l;J [Rd) and b E L,!J ~), then the process X is Markovian with invariant distribution /1
and geometrically strongly mixing if limlxH= c(x) = 00.

This result applies for multidimensional Omstein-Uhlenbeck processes dXt + CXt dt = dWI'
in this case c(x) = 'Y Ixl2 with 'Y> O.

A unidimensional diffusion process dX t = a(X t) dt + b(X t) dW t satisfies this condition if


a(.) is continuously differentiable and,
x y
limX -7+oo f exp{- f2 ~(z) dz} dy = ±oo,
- 0 0 b (z)

m(dx) = +b (x)
x
exp{j2 ~(y) dy} dx is a finite measure and then Jl(dx) = m(dx) .
b (y) m(dy) f
a2 a
1
Assume now that E = [Rd and let L =:2 Ld aij(x) ax.ax. + L bi(x) ax. be an elliptic
d

i,j=1 I J i=1 I

operator on [Rd, i.e.

The matrix A(x) = (ai}{x)) l~i,j5d is symmetric positive definite for x in [Rd and continuous as a
function of x, the functions b/x) are Borel measurable and bounded on compact subsets of [Rd.
116 Mixing

If moreover the coefficients bj(x) are bounded, for each x in IR d, Bhattacharya (1978) shows
that there exist an unique probability measure Px on n with Px(Xo = x) = 1 and that for any
function f twice continuously differentiable on IR d the process W = (Mf>~ is a Px-martingale,
t
where M{ = f(X t ) - JLf(X s) ds. The process X is a diffusion with generator L. This
o
process satisfies Dynkin regularity assumption. Defme now for y = x - z and ro > 0
d y. y. d d
Az(x) = L ajj(x) ~, B(x) = tr A(x) = L
ajj(x), Cz(x) = 2 Yj bj(x), L
jj=l Iyl j=l j=l
B(x) - Az<x) + Cz<x) - B(x) - Az(x) + Cix)
~ir) = Inf1yl=r ~(x) , ~Z<r) = SUPlyl=r ~(x) ,

gz(r) = Inf1yl=r Aix),


r
and l z(r) = f I!~(S) ds,
ro
For f(x) = F(lx - zl) it is easy to check that
F'(lx - zl)
Lf(x) = Az(x) F"(lx - zl) + Ix _ zl (B(x) - Az(x) + Cz(x»,
Bhattacharya (1978) deduces the following recurrence criterion generalizing Khas'minskii
(1960).
Theorem 1. Assume (E) holds, iffor some ro> 0 and z,

f exp[-lir)] dr = 00, the diffusion with generator L is recurrent,


ro

f exp[ -lir)] dr < 00, the diffusion with generator L is transient.


ro

Examples 3.b) If E=lRd,.sIt. = Coo. and Lf=df. we already have noticed that
d d
r(f. g) = Vf.Vg = L (D j f) (D j g) and r 2(f. g) = L (DjD j f) (DPj g) where Dj
j=l jj=l
denotes the partial derivation operator Dj = k. 1

Examples 3.c) More generally. let E be an abstract state space equipped with an algebra of
functions .sIt., a linear application D: .sit. ~ .sit. is called a derivation if the indentity
D(fg) = fDg + g Df holds on .sIt.. Consider now (k+ 1) linear derivations Do, ...• Dk , a
linear operator L may be defmed as
k
L = Do + L DT·
j=l
k
Then we write. analogously to b). nf. g) = L (Dj f) (Dj g). If the Lie brackets satisfy
j=l
Examples: Continuous Time Processes 117

[L, Di J = LDi - DiL = a Di for some a E 8t, independent of i, an easy calculation yields
k
r 2(f, g) = L
(DPj f) (DPj g). In the previous case a = O. Other nontrivial cases follow
i,j=!
in the same way. A linear derivation will be a fIrst order differential operator in the examples
proposed.

c-(i) Sd c [Rd+l, then Rij = Xi D j - Xj Di is a linear derivation which belongs to the


d+!
tangent bundle T(Sd) of the submanifold Sd. Let L = L
Rt be defined on 8t, = C~(!Rd+!),
ij=!
then L is the spherical Laplace operator, it satisfIes [L, RijJ = 0, because it commutes with
any rotation and thus also with ~j which is the generator of a rotation.

c-(ii) General Ornstein-Uhlenbeck processes (Meyer (1982 a)) In this case


8t, = E!\ 8t,n is a graded algebra such that Di(8t,n+!) c 8t,n and Lf= -nf and Ptf= e-ntf
for f E 8t, n'
d2 d
If E = [R then k = 1, L = (iX! - x dx and 8t, n is the span of the n-th Hermite
polynomial and the ergodic distribution 11 is the standard Gaussian. Unfortunately such an
algebra does not satisfy the previous assumptions. Element of 8t, will thus be chosen as the
sum of a constant and a Schwartz function (i.e. a rapidly decreasing function). This example
may be generalized to Malliavin-Ornstein-Uhlenbeck processes with values in E = C([R+).

d) C~ manifolds. Assume that the state space is a C~ manifold, 8t, = C~ and the operator L
d d d
writes locally as L = LLgij Dij + L
Xi Di - with Einstein's notation, it is rewritten
i=! j=! i=!
L = gij Dij + Xi Di and '(gij) denotes the inverse matrix of (gij). (gij) defines a Riemannian
metric 2 as soon as (tj(x)) is a symmetric positive definite matrix for any point x E E. For
this Riemannian structure, the Laplace-Beltrami operator takes the form
M = gij(Di/ - rfj Dkf) (3), where crt) denotes the Christoffels symbol (4). It yields
L = 6 + X for some linear derivation X.

In this case ref, g) = (grad f I grad g) if (. I .) denotes the associated inner Riemannian

product. Now if X = grad h (it is the case with h = In dll with 11 the invariant measure of
dv

k
2 The metric which assigns the distance L gjj(x) dXj dXj to the points x and x + dx.
i,j=1
d d n
3 That is M = I L gij(O 1).. f - k=1
L ~I) 0kf).
1=1 j=1
k ! d kl ag' j ag il agio
4 r ..
I)
= - > g [..::.J,!
2 i=i ax'
+ - . - ~ J.
ax) ax
118 Mixing

the process and v the Riemann measure in the Markov case) then if Ric = (Ricij) denotes the
Ricci curvature (5) linear operator and Hess f denotes the operator (6) determined locally as
d
(Hess f)ij = glJ(Di/
.. '" k
- k r ij Dkf)·
k=l
r 2(f, g) = (Hess f I Hess g) + (Ric - (Hess h))(grad f, grad g).

2.5.4. Hypermixing
The strong notion of mixing called hypermixing for discrete or continuous time
processes arose from Large Deviations Theory (see Chiyonobu & Kusuoka (1988)).

This paragraph contains some of the results in Deuschel & Stroock (1989) as well as
their notations. Let E be a Polish and n a space of left limited and right continuous functions E-
valued trajectories w(.) on IR (usually named cadlag functions). Thus n
equipped with the
Skorohod topology is still a Polish space. We set fF (I) for the a-algebra generated in by n
(w(s), s e I} for closed sets I. Write 1< J for closed intervals I = [a, b] and J = [c, d] if
a < b < c < d. J,{,s(n) is the space of stationary measures on n; for t in IR and w in n we
define wt(s) =w(s) if lsi ~ t and wt (s+2t) =wt(s) for s e IR the 2t-periodically extended
function and 9tw(s) =w(s+t) the translation map on n.
P E ,)y6lD) is said to be hypermixing if
(H.I) There is a decreasing function p(t) > 1 defined on [c, +oo[ for some c > such that: °
lim H "" t(p(t) - 1) =° and I!fI .. JnIlI.:s;"I!flllp(t)" .. I!fnllp(t)
where ~ is fF(Ij)-measurable and I j are t-distant intervals with I j < Ij+ l'
(H.2) There are decreasing functions it) > 1 and c(t) defined on [c, +oo[ such that
limH""t('"f(t) -1) = 0, limH""c(t) = 0, rI(t) + I)I(t) = 1, and
llfE/2 fEljlIO(t).:s;" c(t) I!f1l J(t) for intervals 11,12 distant at least t and any function f with fEPf = °
and fE/is the conditional expectation offwith respect to P given fF(I).

The previous definition is the one given in Chiyonobu & Kusuoka (1988). In fact Deuschel &
Stroock (1989) give an alternative defmition replacing (H.2) by (H.2')

(H.2') There are decreasing functions it) > 1 and c(t) defined on [c, +oo[ with
limH""t('"f(t) -1) = 0, limH""c'(t) = 0, and ICov(fJ,h)ll.:s;"c'(t) I!fIlIJ(t) 1!f211X't)
where~ is fF(Ij)-measurable and (Ij)j=I,2 are t-distant intervals.

Both defmitions are equivalent (1). An interesting simple consequence is the following

5 Rici\ f RjI where R/I is defined for any vector field by


=~ [VjVJ. - VJ.Vjlyk = i R/I yi.
~
6 Both operators are undertood as bilinear fonns in the forthcoming fonnula.
7 Indeed, use Holder inequality to see that (H.2) implies (H.2') with c'(t) = c(t). The converse holds with
, . ICov(fI'IEI1t)1
c(t) =c (t) because by dualIty IlIEI flIO(t) =Sup{ IIf II }.
1 1 ')'(t)
Examples: Continuous Time Processes 119

Proposition 4. An hypermixing process is geometrically strongly mixing.

More precisely considering fj = llA' -1P(Aj) in relation (H.2') leads for t-distant measurable
1
subsets Al and A2 to
IIP(A I f1A 2) -1P(A I )IP(A 2)1::;; c'(t) IP 1I")'(t)(A I ) IP II")'(t)(A2).

Taking sup bounds implies that at ::;; c'(t) and also

Remark 1. The previous proof also implies that a c'(t)-hypermixing process is also aa,b-
mixing (see § 1.1.) for any a, bE [0,1[, with the same decay rate.

Remark 2. A 'II-mixing process is hypermixing process.

Chiyonobu & Kusuoka (1988) give examples of such hypermixing processes. The ftrst one is a
stationary vector valued Gaussian process such that li~~oo t.p(t) =O. The second one is the
e-Markov case. Hypercontractivity and hypermixing are closely related in this case. We only
present the Markov case.

2.5.5. Hypercontractivity
Let Pt be the transition probability of a stationary Markov process with stationary
distribution J.l. and denote IIfllp = <f IfiP dJ.l.)lIp, then the process is hypermixing iff
IIPTfII 4
IIPT II 2,4 =Sup ~ = 1 for some T> 1, it is called J.l.-hypercontractive. Moreover,
c(t) =O(e-ct), yet) = 1 + O(e-'Yl) and pet) = 1 + O(e-pt) for some constants c, y, p > O.
Proposition 5. Assume· that some s, t > 0 satisfy
Pix, dy) = pix, y) J1.(dy) and Plx, dy) = plx, y) J1.(dy).
The strong mixing property holds with a geometric decay rate of the mixing coefficients if
f (f p;(x, y) J1.(dy)/J1.(dx) < 00, Hypermixing holds if moreover plx, y) ~a > 0, for
some a > 0 and any x, y E E.

The example of symmetric diffusions is widely investigated in Deuschel, Stroock (1989) and
the decay of c(t) may be checked using the results in Korzeniowski (1987). The second part of
this result is proved in Deuschel & Stroock (1989), the same lines of proof yield the ftrst part of
this result (sufftcient strong mixing condition).

Example 5. Ornstein-Uhlenbeck processes satisftes the S.D.E. dXt = - X t dt + dW t


and Pt(x, dy) = g _ley - x e- t12 ) dy where gt denotes the Gaussian cAP (0, t) density
I-e
gt()
x =..f21tt
1 e-x 2/2t . Th'IS process IS
'h "
ypefffilxmg because the prevIous
' cond'Ilion
, may be

checked. This implies that an hypermixing process is not necessarily I\>-mixing. E.g. Ornstein-
120 Mixing

Uhlenbeck process is stationary and Gaussian thus Proposition 2.1.1 shows that <j>-mixing
implies m-dependence, yielding a contradiction. That means also that o.a,o-mixing condition
may hold for a arbitrarily close to 1 without 0. 1 o-mixing - that is <j>-mixing - holds. Neveu
(1976) presents an alternative proof of this results.'

We present below the powerful Bakry & Emery (1985) r 2-criterion. This criterion
yields simple mixing sufficient conditions. $t. + denotes the subset of functions in $t. with
InfE f> 0 or equivalently the functions f> 0 with In f E $t..

The following fundamental lemma will yield simple hypercontractivity criterions. Set for this
IIPtfll q liPfllq . .
IIPtllp,q = SUPfELP(fl) IIfll = SUPfE ~ + IIfll . The Il.lIp are consIdered WIth respect to the
p p
spaces LP(!!). U will denote the convex function U(x) = x In x.

Lemma 1. For A > 0, fixed, the six following properties are equivalent
(1) \1 p > 1, \1 t :? 0, [1 ~ q ~ I+(p_l)eAtj => [lIPtllp,q ~ 1].
(2) :3p>I, \1t:?O, [l~q~I+(p-l)eAtj=>[IIPIIIp.q~1].
(3) \1 t :? 0, \1 q E [1, eAtj, \1 fEd +, lI(exp Pt)(lnf)lI q ~I!flil'
(4) \1 t :? 0, \1 f E d +, f U OP if) dJl ~ e,AI f U(f) dJl + (J-e- At) U (f f dJl).

(5) \1 p :? I, \1 fEd +, f jP In IlfIfl dJl ~.e..f jP,2 ref, f) dJl.


p A
(6) :3 p :? I, \1 fEd +, f jP In t-1fI1 dJl ~.e..f .f- 2 ref, f) dJl.
p A
In any case we shall say that hypercontractivity holds with the constant A. It implies a strong
mixing condition with 0.1 ~ e- /A.

The following results hold as as consequence of this result.

A
Theorem 2. If the process is ergodic, and \1 fEd: r 2 (f, f) :? "2 ref, f) then
hypercontractivity holds with the constant A.

Theorem 3. If the process is ergodic, and there are constants a > 0, b:? 0 such that
\1 fE d,
a) b < 1 and rif, f) :? a rif, f) + b (Lf)2 then hypercontractivity holds with the constant
2a
A = I-b'
b) 1 < b < 4 and rif, f) ~ a ref, f) + b (Lf)2 then hypercontractivity holds with the
2a
constant A = b-I .

Note that the restriction b < 4, is according to Bakry & Emery, a technical one.

Theorem 2 yields directly hypercontractivity with A =2 for Omstein-Uhlenbeck sernigroups.


Examples: Continuous Time Processes 121

Indeed, r 2(f, f) - ref, f) = L (DPjf)2 ~ o.


i,j
r
If E is a d-dimensional sphere with radius r, Ricij = d;;1 gij thus the expression of 2 writes

here r 2(f, f) ~ d/ ref, f), yielding A = 2 d;21 . if L =,1.. However Mueller & Weissler

(1982) showed that A = 22d is the suitable hypercontractivity constant. However Theorem 3
r
gives the right constant with the inequality IIHess f1l2;e: ~ (,1.f)2. If E is ad-dimensional
Riemannian manifold with a positive curvature, say that the eigenvalues of Ric are greater or
equal to c, the same computations yield A = 1~I~d . Rothaus (1981) proves hypercontractivity
for the Brownian motion on general Riemannian manifolds (non necessarily with positive
curvature). The explicit bound set here allows to consider more general diffusion processes
with the ,1. + X, where the first order differential operator X satisfies a suitable Lipschitz
condition related to theassumption in Theorem 3 a) by
X®X:S: (t- d)(Ricc(L) - a g) and b d:S: 1.
1 ..... .
In local coordinates, this means that [(5 - d)(Ricc(L)IJ - a glJ - Xl XJhj is a nonnegative
matrix. Other examples may be found in Bakry & Emery (1985).

2.5.6. Ultracontractivity
We give in this section some of the results in Davies (1988). Assume now that
Pt = e-Lt is a symmetric Markov seroigroup on L2(E, f.L(dx» for some Borel measure f.L on the
second countable locally compact space E.

P is said to be ultracon.tractive if ct = IIP tll oo ,2 < 00 for all t > O. Let K(t, x y) be the kernel
associated to the seroigroup: e- Lt f(x) = f K(t, x, y) fey) f.L(dy).
Davies shows that K(t, x, y) :S: Cr.72 and conversely if O:S: K(t, x, y) :S: at then P is
ai
ultracontractive with ct :S: 12.
In the particular case of the heat kernel: L = -,1. on L 2([Rd, dx), K(t, x, y) = ~(x - y)
with ~(x) = (41ttr d12 e-x2/4t. Here ~ = IIKtii oo = (41ttrd/2 and ct = IIKtll 2 = (81ttr d/4 , thus
the previous bounds are sharp.

The Sobolev's logarithmic constant 13(£) is defined as the infimum of I3's such that for each
f E Dom(L)nL2(E)
f f2 In f dx :S: £ (Lf, f) + 13(£) Ilfll~ + Ilfll~ In IIfII~,
or equivalently, f f2(x) In f(xl dx :S: £ (Lf, f) + 13(£) IIfll~.
IItl~

Gross (1975) proved (see also Deuschel & Stroock (1988» the following
122 Mixing

Proposition 6. Sobolev's logarithmic inequality holds with f3( £) if and only if


IIPTllp,q ~ exp{f3(£)(~ - ~}, for 1 < p ~q < 00, T> 0 and e4T/e;:::: : ~ ~.

Sobolev's logarithmic inequality holds with ~(t) such that c t = "Pt"~ 2 = e i3 (t) if the
semigroup is ultracontractive. Reciprocally, if Sobolev's logarithmic inequltlity holds for a
continuous and decreasing function ~(e) then the semi group e-Lt is ultracontractive with
t

ct = "Pt"~ 2 = eM(t)
,
and M(t) = it
0
f ~(e) de. Recall that hypercontractivity holds with
~(e) = O.
If L is strictly elliptic on E c [R d then 0 ~ K(t, x, y) ~ c C d12 and ~(e) ~ c' - ~ In e.

Consider an elliptic operator L, Lf= -I.i,j !xfa jj


!
g!) + Vf, and the associated quadratic
J
form

Q(f) = f (~aij
~ af at' 2
ax. ax. + v If I ) dx.
!,J ! J
Assume that there is some C 2 function <\> on E with L<\> ~ O. The operator Uq, defined by
[U q, f = <\> f] is unitary and defined L2(E, <\>2(x) dx) ~ L2(E, dx). It allows transportations
of the Markov structure on L 2(E, dx), replacing the operator L on L2(E, dx) by some operator
H on L2(E, <\>2(x) dx). Results in the previous subparagraphs may now be used to get
ultracontractivity and thus hypermixing.

2.5.7. General SDE


Various authors give the asymptotic behaviour for general SDE with the form

dZ(t)
CIt = e F(t, Z(t), co) for e ~ O.

Under some suitable regularity assumptions such processes converge to a diffusion process.

The techniques used there are based on the strong mixing properties of such equations. See e.g.
Papanicolaou & Varadhan (1973), Cogburn & Hersh (1973), Kesten & Papanicolaou (1979) or
Gerencser (1989). The previous authors give various conditions under which the random
function has such mixing properties. This approach explains the link between deterministic
dynamical systems and a related noisy system.

Survey of the literature


The probabilistic properties of Markov processes are studied in Doob (1953) and
Rosenblatt (1971) who gives a functional approach to their mixing properties. See also
Iosifescu & Teodorescu (1969) and Meyn & Tweedie (1993).

Khas'minskii (1960) and Bhattacharya (1978, 1982) prove recurrence properties of diffusion
processes. Carmona (1979) and Molcanov (1981) study the spectrum of Schrodinger
operators. Chen Mufa (1990) is interested by the spectrum of diffusion processes. Baxter &
Examples: Continuous Time Processes 123

Brosamler (1976) and Bolthausen (1982) studied the particular case of the Brownian motion on
a Riemannian manifold.

Chiyonobu & Kusuoka (1988) defme the hypermixing properties in view of a theorem of large
deviations; they also present examples.

Bakry & Emery (1985) prove fundamental explicit necessary hypercontractivity conditions for
very general diffusion processes - those results are extended in Bakry (1991) : the r 2-criterion.

Deuschel & Stroock (1989) present other hypermixing sufficient conditions. Davies (1989)
studies this property from the operators viewpoint, linking the infimum of the spectrum, the
Sobolev logarithmic inequality and the notion of ultracontractivity in the case of Schrodinger
operators.

An interesting alternative viewpoint is in Prakasa Rao (1990); he consider a class of stopping


times and defme mixing conditionally to them. This notion seems very close to that introduced
in Veijanen (1989) for partially observed processes. A continuous time process cannot
physically be observed continuously.

Papanicolaou & Varadhan (1973), Cogburn & Hersh (1973), Kesten & Papanicolaou (1979),
Rosenblatt (1987) and Gerencser (1989) prove limit theorems for SDE related to mixing
techniques.
125

Bibliography

After each reference we indicate the chapter of this volume with which it is related in
underlined characters, e.g. § 1. § 2.3. § 2.5. means that the corresponding reference is
related together with each paragraph in chapter 1 and to the subsections 2.3 and 2.5. The title of
monographs is written in italic characters. The title of reviews is classically abreviated.

Aaronson J., Denker M. (1991) On the FLIL for certain 'If-mixing processes
and infinite measure preserving transformations. C. R. Acad. Sci. Paris, Serie 1,
313, 471-475 .....§.ll

Andrews D. (1984) Non strong mixing autoregressive processes. J. Appl.


Probab. 21, 930-934. § 1.3. § 2.3. § 2.4.

Ango Nze P. (1992) Criteres d'ergodicite de quelques modeles it representation


markovienne. C. R. Acad. Sci. Paris., Serie 1,315, 1301-1304 ....§.2.A..

Athreya K.B., Pantula S.G. (1986 a) A note on strong mixing of ARM A


processes. Statist. Probab. Lett. 4, 187-190. § 1.3. § 2.3. § 2.4.

Athreya K.B., Pantula S.G. (1986 b) Mixing properties of Harris chains and
autoregressive processes. J. Appl. Probab. 23, 880-892. § 1.3. § 2.4.

Bakry D. (1991) Inegalites de Sobolev faibIes : un critere r 2. In Sem. de


Probabilites XXV, L.N.M., Springer-Verlag, Berlin.~

Bakry D., Emery M. (1984) Hypercontractivite de semi-groupes de diffusion.


C. R. Acad. Sci. Paris, Serie I, 299, 775-778.~

Bakry D., Emery M. (1985) Diffusions hypercontractives. in Sem. de


Probabilites XIX, L.N.M. 1123, Springer-Verlag, Berlin.~

Baxter J. R., Brosamler G. A. (1976) Energy and the law of the iterated
logarithm. Math. Scand. 38, 115-136....§..bi"

Berbee H. C. P. (1979) Random walks with stationary increments and renewal


theory. Math. Centre Tracts, Amsterdam. § 1.1. § 1.2. § 1.3.

Berkes I., Phillip W. (1977) An almost sure invariance principle for the
empirical distribution of mixing random variables. Z. Wahrsch. Verw. Gebiete
41,115-137.~

Berkes I., Phillip W. (1979) Approximation theorems for independent and


weakly dependent random vectors. Ann. Probab. 7, 29-54•....§....Lb

Besag J. (1974) Spatial interaction and the statistical analysis of lattice


systems. J. Royal Stat. Soc. series B, 36, 192-236.~

Bhattacharya R. N. (1978) Criteria for recurrence and existence of invariant


126 Mixing

measures for multidimensional diffusions. Ann. Probab. 6-4, 541-553.~

Bhattacharya R. N. (1982 a) On the functional central limit theorem and the law
of iterated logarithm for Markov processes. Z. Wahrsch. Verw. Gebiete 60, 185-
201.~

Bhattacharya R. N. ( 19 8 2 b) On classical limit theorems for diffusions. Sankhya


44, series A, 1, 47-71.~

Billingsley P. (1968) Convergence of probability measures. Wiley, N.Y .....§..l

Blum J. R., Hanson D. L., Koopmans L. H. (1963) On the strong law of


large numbers for a class of stochastic processes. Z. Wahrsch. Verw. Gebiete 2, I-
ll. § 1.1, § 1.2, § 1.3, § 1.4.

Bolthausen E. (1982) On the central limit theorem for stationary mixing random
fields. Ann. Probab. 10-4, 1047-1050. § 2.5.

Borovkov A.A. (1989) Ergodicity and Stability of Multidimensional Markov


Chains. Theory Probab. Appl. 35 (3), 542-546 .....§...2A"

Borovkov A.A. (1990) Lyapounov Functions and Ergodicity and Stability of


Multidimensional Markov Chains. Theory Probab. Appl. 36 (1), 1-18 .....§...2A"

Bosq D. (1975) Inegalite de Bernstein pour un processus melangeant. C.


R. Acad. Sci. Paris, Serie A, 275, 1095-1O98.~

Bosq D. (1991) Inegalite de Bernstein pour un processus melangeant a


temps discret ou continuo tech. report 136, L.S.T.A., Jussieu, Paris. § 1.2, § 1.4.

Bowen R. (1975) Equilibrium states and the ergodic theory of Anosov


diffeomorphisms. L. N. M. 477, Springer-Verlag, Berlin.~
Bradley R. C. (1980) On the strong mixing and weak Bernoulli conditions. Z.
Wahrsch. Verw. Gebiete 51, 49-54.-t.ll

Bradley R. C. (1981 a) Central limit theorems under weak dependence. J.


Multivar. Anal. 11, 1-16. § 1.3, § 1.5.
Bradley R. C. (1981 b) A sufficient condition for linear growth of variances in a
stationary random sequence. Proc. Amer. Math. Soc., 83, 3, 586-589 . .§....li
Bradley R. C. (1983 a) Equivalent measures of dependence. J. Multivar. Anal.
13, 167-176. § 1.1, § 1.3.

Bradley R. C. (1983 b) Approximation theorems for strongly mixing random


variables. Michigan Math. J. 30, 69-81. .§...L2."

Bradley R. C. (1981 b) A sufficient condition for linear growth of variances in a


stationary random sequence. Proc. Amer. Math. Soc., 83, 3, 586-589 . .§....li

Bradley R. C. (1985) On the central limit question under absolute regUlarity.


An. of Probab. 13,4, 1314-1325 ...§....U.,
Bradley R. c. (1986) Basic properties of strong mixing conditions in
Dependence in probability and statistics, a survey of recent results. Oberwolfach,
Mixing 127

1985, Birkhauser. § 1.1. § 1.3. § 2.

Bradley R. C. (1987) Idendical mixing rates. Probab. Th. ReI. Fields 74, 497-
503 ......§..l.J.,

Bradley R. C. (1988) On Some Results of M.1. Gordin: A Clarification of a


Misunderstanding. J. Theor. Probab. 1, (2), 115-119.~

Bradley R. C. (1989) A caution on mixing conditions for random fields. Statist.


and Probab. Letters, 8, 489-491.~

Bradley R. C. (1990) On p-mixing exept on small sets. Pac. J. of Math., 146,


2, 217-226.~

Bradley R. C. (1991 a) Equivalent mixing conditions for random fields. Preprint.


~

Bradley R. C. (1991 b) Some examples of mixing random fields. Preprint.~

Bradley R. C. (1992) On the spectral density ofand asymptotic normality of


weakly dependent random fields. Journ. Theor. Probab. 5, 2355-373.~

Bradley R. C., Bryc W. (1985) Multilinear forms and measures of


dependence between random variables. J. Multivar. Anal. 16-3, 335-367.~
~

Bryc W. (1982) On the approximation theorem of I. Berkes and W. Philipp. Demonstr.


Math. 15, 3, l-lO ....§.....l.,k

Bryc W. (1992) On the large deviation principle for stationary weakly dependent random
fields. Ann. Probab. 20,2, l004-1030 ...,§,ll

Bryc W. (1992) On the large deviation principle for stationary weakly dependent random
fields. Ann. Probab. 20, 2, 1004-1O30...,§,ll

Bryc W., Smolenski W. (1993) Moment conditions for almost sure


convergence of weakly correlated random variables. Proc. Amer. Math. Soc.~
1.5. 2.1.

Bulinskii A. V. (1987) Limit theorems under weak dependence conditions.


Fourth Int. Con! on Prob. Th. and Math. Stat., V.N.U. Sci. Press, Utrecht, The
Nederland, 307-326. § 1.3. § 1.5.

Bulinskii A. V. (1989 a) On various conditions of mixing and asymptotic normality


of random fields. Soviet Math. Dokl. 37 (4),443-447. § 1.1. § 1.3, § 1.5.

Bulinskii A V. (1989 b) Limit theorems under weak dependence conditions.


Moscow University Press, in Russian. § 1.1. § 1.2, § 1.3, § 1.5.

Bulinskii A. V., Doukhan P. (1987) Inegalites de melange fort utilisant des


normes d'Orlicz. C. R. Acad. Sci. Paris, Serie 1, 305, 827-830. § 1.2, § 1.4.

Bulinskii A. V., Doukhan P. (1990) Vitesse dans Ie theoreme de limite centrale


pour des champs melangeants satisfaisant des hypotheses de moments faibles. C. R.
Acad. Sci. Paris, Serie 1, 311, 801-805.~
128 Mixing

Bulinskii A. V., Zhurbenko I.G. (1976) A central limit theorem for additive
random functions. Theory Probab. AppI. 21,4,687-697. § 1.3, § 2.4,

Carbon M. (1983) Inegalite de Bernstein pour les processus fortement


melangeants non necessairement stationnaires. C. R. Acad. Sci. Paris, Serie A, 297,
303-306,~

Carmona R. (1979) Operateur de SchrOdinger It resolvante compacte; in


Seminaire de Strasbourg 1977-1978, L.N,M. 721, 570-573,~

Chan K. S., Tong H. (1985) On the use of the deterministic Lyapounov function
for the ergodicity of stochastic difference equations, Adv. in AppI. Probab, 17, 666-
678.~

Chanda K. C. (1974) Strong mixing properties of linear stochastic processes, J,


AppI. Probab. 11, 401-408. § 1.3, § 2.3.
Chen Rong, Tsay Ruey S. (1991) On the ergodicity of TAR(l) processes,
Ann. Appl. Probab. 1 (4), 613-634 .~

Chen Mufa (1990) Exponential L 2 -convergence and L 2 -spectral gap for


Markov processes. Acta Math. Sinica, New series 7 (1), 19-37.~
Cheng R.(1992) A second order mixing condition for second-order stationary random
fields, Studia. Math, 101 (2), 139-153,...§.1,L

Cheng B., Tong H. (1992) On consistent Nonpararnetric Order Determination


and Chaos. J. R, Statist. Soc. B 54 (2), 427-449,~
Chiyonobu T., Kusuoka S. (1988) The large deviation principle for
hypermixing processes. Probab. Th, ReI. Fields 78, 627-649, § 2,1. § 2.4, 2.5.
Cogburn R., Hersh R. (1973) Two limit theorems for random equations. Indiana
Univ. Math. J. 22, 1067-189.~
Collomb G. (1984) Proprietes de convergence presque complete du predicteur
It noyau. Z. Wahrsch. Verw. Gebiete 66, 441-460.....§...lA"

Collomb G., Doukhan P. (1983) Estimation non parametrique de la fonction


d'autoregression d'un processus stationnaire et q,-melangeant. C. R. Acad. Sci.
Paris, Serie 1, 296, 859-863 ....i..1A.
Csaki P., Fischer J. (1963) On the general notion of maximal correlation.
Magyar Tud. Akad. Mat. Kutato Int. KozI. 8, 27-51.!..LL
Dasgupta R. (1988) Non-uniform rates of convergence to normality for strong
mixing processes. Sankbya Series A, 50 (3), 436-451. § 1.4, § 1.5.
Davies E. B. (1989) Heat kernels and spectral theory. Cambridge University
Press.~

Davydov Yu A. (1968) Convergence of distributions generated by stationary


stochastic processes. Theory Probab. AppI. 13,691-696.....§..ll
Davydov Yu A. (1970) The invariance principle for stationary processes. Theory
Mixing 129

Probab. Appl. 15, 487-498. § 1.2. § 1.5.

Davydov Yu A. (1973) Mixing conditions for Markov chains. Theory Probab.


Appl. 18, 313-328.~
Dehling H. (1983) Limit theorems for sums of weakly dependent random
variables. Z. Wahrsch. Verw. Gebiete 63,393-432. § 1.2. § 1.5.
Dehling H., Denker M., Philipp W. (1986) Central limit theorems for mixing
sequences of random variables under minimal conditions. Ann. Probab. 14 (4),
1359-1370.
Denker M., Keller G. (1986) Rigorous statistical procedures for data from
dynamical systems. J. Statist. Phys. 44, 67-93.§ 2.3.
Deuschel J. D. (1986) Non-linear smoothing of infinite dimensional Diffusion
Processes. Stochastics 19, 237-261.
Deuschel J. D., Stroock D. W. (1989) Large deviations. Academic Press,
Boston.

Diebolt J. (1985) Deux tests non parametriques pour un modele


autoregressif non lineaire. C. R. Acad. Sci. Paris, Serie I, 301 (12), 649-652.

Diebolt J., Guegan D. (1990) Probabilistic properties of the general non-linear


Markovian process of order one and application to time series modelling. technical
report 125, L.S.T.A., Jussieu, Paris.
Diebolt J., Guegan D. (1991) Le modele de serie chronologique autoregressive
β-ARCH. C. R. Acad. Sci. Paris, Serie 1, 312, 625-630. § 2.4.

Dobrushin R. L. (1956) Central limit theorem for non stationary Markov


chains I, II. Theory Probab. Appl. 1, 65-80 & 4, 329-383.
Dobrushin R. L. (1968) The description of a random field by means of its
conditional probabilities and conditions of its regularity. Theory Probab. Appl. 13
(2), 197-224.

Dobrushin R. L. (1970) Prescribing a system of random variables by conditional


distributions. Theory Probab. Appl. 15 (3), 458-486.
Dominguez M. (1989) Rate of convergence of the maximal correlation coefficient
in the continuous cases. Rev. Brasileira de Prob. e Estatistica, 3, 111-124. § 1.1.
Dominguez M. (1990) A matricial extension of the Helson-Sarason theorem and
a characterization of some multivariate linearly completely regular processes. J. of
Mult. An. 31 (2), 289-310.

Doob J. L. (1953) Stochastic Processes. Wiley, New-York. § 1. § 2.4.


Doukhan P. (1980) Etude probabiliste de la chaine de Markov:
Xn+1 = f(Xn) + εn. Thesis, Orsay.

Doukhan P. (1986) Polynomes d'Hermite et statistique des processus



melangeants. Collective work, Bruxelles 1985. Revue du C.E.R.O., Bruxelles 28,


99-115.
Doukhan P. (1988) Non parametric estimation of a regression function in a
mixing framework. Collective work, Caracas 1987. Acta Sc. Ven. § 1.4.

Doukhan P. (1992) Consistency of 6-estimates for a regression or a density in


a dependent framework; in Seminaire d'Orsay 1989-1990: Estimation fonctionnelle.
Preprint Orsay, France.

Doukhan P., Ghindes M. (1980) Etude du processus Xn+1 = f(Xn) + εn. C.


R. Acad. Sci. Paris, Serie A, 290, 921-923.
Doukhan P., Ghindes M. (1983) Estimation de la transition de probabilite
d'une chaine de Markov Doeblin recurrente. Stochastic Process. Appl. 15, 271-293.

Doukhan P., Guyon X. (1991) Mixing for linear random fields. C. R. Acad. Sci.
Paris, Serie 1, 313, 465-470.
Doukhan P., Leon J. (1986) Invariance principles for the empirical measure of a
mixing sequence and for the local time of a Markov process; in Strasbourg,
Conference of Probability in Banach Spaces 1985; L.N.M. 1193, 4-21. Springer-
Verlag, Berlin.
Doukhan P., Leon J. (1989) Cumulants for mixing sequences and applications
to empirical spectral density. Probab. Math. Stat., 10.1, 11-26. § 1.4. § 1.5.
Doukhan P., Leon J., Portal F. (1984) Vitesse de convergence dans le
theoreme central limite pour des variables aleatoires melangeantes a valeurs dans un
espace de Hilbert. C. R. Acad. Sci. Paris, Serie 1, 298, 305-308. § 1.4. § 1.5.
Doukhan P., Leon J., Portal F. (1985) Calcul de la vitesse de convergence
dans le theoreme central limite vis a vis des distances de Dudley, Levy et Prokhorov.
Probab. Math. Stat. 6.2, 19-27. § 1.5.
Doukhan P., Leon J., Portal F. (1987) Principe d'invariance faible pour la
mesure empirique d'une suite de variables aleatoires dependantes. Probab. Th. Rel.
Fields 76, 51-70.
Doukhan P., Massart P., Rio E. (1994) The functional central limit theorem
for strongly mixing processes. To appear in Ann. I.H.P. § 1.4. § 1.5.
Doukhan P., Portal F. (1983 a) Moments de variables aleatoires melangeantes. C.
R. Acad. Sci. Paris, Serie I, 297, 129-132.
Doukhan P., Portal F. (1983 b) Principe d'invariance faible pour un processus
empirique dans un cadre multidimensionnel et melangeant. C. R. Acad. Sci. Paris,
Serie I, 297, 505-508.
Doukhan P., Portal F. (1987) Principe d'invariance faible pour la fonction de
repartition empirique dans un cadre multidimensionnel et melangeant. Probab. Math.
Statist. 8.2, 117-132. § 1.4. § 1.5.
Doukhan P., Tsybakov A. (1993) Estimation in non parametric A.R.X.

models. To appear in Problems of Trans. of Inform.


Duflo M. (1990) Methodes recursives aleatoires. Masson, Paris.

Eberlein E., Csenki A. (1979) A note on strongly mixing lattices of random


variables. Z. Wahrsch. Verw. Gebiete 50, 135-136.
Engle R. F. (1982) Autoregressive conditional heteroscedasticity with estimates of the
variance of U.K. inflation. Econometrica 50 (4), 987-1007.
Feigin P. D., Tweedie R. L. (1985) Random coefficient autoregressive
processes-a Markov chain analysis of stationarity and finiteness of moments. J. Time
Ser. Anal. 6, 1-14.
Follmer H. (1975) Phase transition and Martin boundary. In Sem. Prob. IX,
L.N.M. 465, 305-318, Springer-Verlag, Berlin.
Follmer H. (1982) A covariance estimate for Gibbs measures. J. Funct. Anal.
46, 387-395.
Follmer H. (1988) Random Fields and diffusion processes. In Ecole d'ete de
Saint Flour XV-XVII, 1985-1987, L.N.M. 1362, 101-203, Springer-Verlag,
Berlin. § 2.2. § 2.5.
Foster F.G. (1953) On the stochastic matrices associated with certain queueing
processes. Ann. Math. Statist. 24, 355-360.
Gastwirth J. L., Rubin H. (1975) The asymptotic distribution theory of the
empiric c.d.f. for mixing stochastic processes. Ann. Statist. 3, 809-824.
Gebelein H. (1941) Das Statistische Problem der Korrelation als Variations-
und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung. Z.
Angew. Math. Mech. 21, 364-379. § 1.1.
Georgii H. O. (1988) Gibbs measures and phase transition. Studies in
Mathematics, n° 9, De Gruyter.
Gerencser L. (1989) On a class of mixing processes. Stochastics 26, 165-197.

Gordin M. I. (1969 a) On the central limit theorem for stationary processes.


Soviet Math. Dokl. 10, 1174-1176.
Gordin M. I. (1969 b) Abstract, Communications of the First Soviet-Japanese
Symposium on Probability.
Gordin M. I. (1973) Central limit theorem for stationary processes without the
assumption of a finite variance, Abstracts of Communications T1:A-K, 173-174.
International Conference in Probability Theory and Mathematical Statistics, June 25-
30, 1973, Vilnius (in Russian).
Gorodetskii V. V. (1977) On the strong mixing property for linear
sequences. Theory Probab. Appl. 22, 411-413.
Gorodetskii V. V. (1984) The central limit theorem and an invariance

principle for weakly dependent random fields. Soviet Math. Dokl. 29, 3, 529-532.
Gross L. (1975) Logarithmic Sobolev Inequalities. Amer. J. of Math.97,
1061-1083.

Guyon X. (1986) Estimation d'un champ par pseudo-vraisemblance


conditionnelle: etude asymptotique et application au cas markovien, in Spatial
processes and spatial time series. Proc. of 6-th. Franco-Belgian meeting of Stat.,
Nov. 1985, Pub. St. Louis Univ., Bruxelles. § 1.5. § 2.2. § 2.3.

Guyon X. (1992) Champs aleatoires sur un reseau. Modelisations statistique


et applications. Masson, Paris. § 1.5. § 2.2. § 2.3.

Guyon X., Richardson S. (1984) Vitesse de convergence du theoreme de la


limite centrale pour des champs faiblement dependants. Z. Wahrsch. Verw. Gebiete
66, 297-314.
Györfi L., Härdle W., Sarda P., Vieu P. (1989) Nonparametric curve
estimation from time series. Lecture Notes in Statistics, 60, Springer-Verlag, Berlin.
Hall P., Heyde C. C. (1980) Martingale limit theory and its applications.
Academic Press. § 1.3. § 1.4. § 1.5.
Hayashi E. (1981) The spectral density of a strongly mixing stationary
Gaussian sequence. Pacific J. of Math. 96, 343-359.
Hegerfeldt G. C., Nappi C. R. (1977) Mixing properties in lattice systems.
Comm. Math. Phys. 53, 1-7.§ 2.2.
Helson H., Sarason D. (1967) Past and future. Math. Scand. 21, 5-16.
Hernandez-Lerma 0., Montes-De-Oca R., Cavazos-Cadena R. (1991)
Recurrence conditions for Markov decision processes with Borel state space: a
survey. Ann. Op. Research 28, 29-46.
Herrndorf N. (1983) Stationary strongly mixing sequences not satisfying the
central limit theorem. Ann. Probab. 11 (3), 809-813.
Herrndorf N. (1984) A functional central limit theorem for strongly mixing
sequences of random variables. Ann. Probab. 12, 141-153.
Hirschfeld H. O. (1935) A connection between correlation and contingency. Math.
Proc. Camb. Phil. Soc. 31, 520-524.
Holley R. (1985) Rapid convergence to equilibrium in one dimensional
stochastic Ising model. Ann. Probab. 13 (1), 72-89. § 2.2.
Holley R. A., Stroock D. W. (1976) L2 Theory for the stochastic Ising model.
Z. Wahrsch. Verw. Gebiete 35, 87-101. § 2.2.
Ibragimov I. A. (1962) Some limit theorems for stationary processes. Theory
Probab. Appl. 7, 349-382. § 1.2. § 1.3. § 1.5.
Ibragimov I. A. (1965) On the spectrum of stationary Gaussian sequences
satisfying the strong mixing condition I: Necessary conditions. Theory Probab.

Appl. 10 (1), 85-106.

Ibragimov I. A. (1970) On the spectrum of stationary Gaussian sequences


satisfying the strong mixing condition II: Sufficient conditions, mixing rate. Theory
Probab. Appl. 15 (1), 23-36.
Ibragimov I. A. (1975) A note on the central limit theorem for dependent random
variables. Theory Probab. Appl. 20, 135-141. § 1.3, § 1.5.

Ibragimov I. A., Linnik Y. V. (1971) Independent and stationary


sequences of random variables. Wolters-Noordhoff pub., Groningen.
Ibragimov I. A., Rozanov Y. A. (1978) Gaussian random processes.
Springer-Verlag, Berlin. § 1. § 2.1.

Ibragimov I. A., Solev V. N. (1969) A condition for regularity of a Gaussian


stationary process. Soviet Math. Dokl. 10, 371-375.

Iosifescu M. (1980) Recent advances in mixing sequences of random variables;


in Third international Summer School on probability theory and mathematical
statistics, Varna. Pub. Bulgarian Ac. of Sc. Sofia, 111-138.
Iosifescu M., Teodorescu R. (1969) Random processes and learning. Springer-
Verlag, Berlin.

Jain N., Jamison B. (1967) Contributions to Doeblin's theory of Markov


processes. Z. Wahrsch. Verw. Gebiete 8, 19-40.
Jakubowski A., Szewczak Z. S. (1990) A normal convergence criterion for
strongly mixing stationary sequences, in Limit theorems in Probability and Statistics,
Pecs 1989, Coll. Math. Soc. J. Bolyai, 57, 281-292.

Jensen J. L. (1990) Asymptotic normality of estimates in spatial point


processes. Research Report 210, Dept. of Theor. Statist., Aarhus University. § 2.2.

Karatzas I., Shreve S. (1988) Brownian motion and stochastic calculus.


Springer-Verlag, Berlin.
Kesten H., O'Brien G. L. (1976) Examples of mixing sequences. Duke
Math. J. 43-2, 405-415. § 2.3, § 2.4.

Kesten H., Papanicolaou G. C. (1972) A limit theorem for turbulent


diffusion. Comm. Math. Phys. 65, 97-128. § 2.1, § 2.5.

Khas'minskii R. Z. (1960) Ergodic properties of recurrent diffusion processes


and stabilization of the solution to the Cauchy problem for parabolic equations;
Theory Probab. Appl. 5 (2), 179-196.
Kolmogorov A. N., Rozanov Y. A. (1960) On the strong mixing conditions for
stationary Gaussian sequences. Theory Probab. Appl. 5, 204-207.
Komlos J., Major P., Tusnady G. (1975) An approximation for partial sums
of independent R.V.'s - and the sample D.F. Z. Wahrsch. Verw. Gebiete 24, 321-
332.

Korzeniowski A. (1987) On logarithmic Sobolev constant for diffusion



semigroups. J. Funct. Analysis 71, 363-370.

Künsch H. (1982) Decay of correlations under Dobrushin's uniqueness


condition and its applications. Comm. Math. Phys. 84, 207-222.

Liggett T. M. (1989) Exponential L2 convergence of attractive reversible nearest


particle systems. Ann. Probab. 17 (2), 403-432. § 2.2. § 2.4.

Lin Zhengyan (1989) The increments of partial sums of a dependent sequence


when the moment generating function does not exist. Acta Math. Sinica, new series,
5 (4), 289-296.

McLeish D. L. (1975 a) A generalization of martingales and mixing sequences.


Adv. in Appl. Probab. 7-2, 247-258. § 1.3. § 1.5.

McLeish D. L. (1975 b) Invariance principles and mixing random variables. Z.


Wahrsch. Verw. Gebiete 32, 165-178.

Massart P. (1987) Invariance principles for empirical processes, the weakly


dependent case. In "Quelques problemes de vitesse de convergence pour des mesures
empiriques", These d'Etat, Orsay. § 1.4. § 1.5.

Meyer P. A. (1982 a) Note sur les processus d'Ornstein-Uhlenbeck. in Sem.


Prob. Strasbourg XVI, L.N.M. 920, Springer-Verlag, Berlin.

Meyer P. A. (1982 b) Geometrie differentielle stochastique (bis). in Sem. Prob.


Strasbourg XVI B, L.N.M. 921, Springer-Verlag, Berlin.

Meyn S.P., Caines P.E. (1993) Geometric ergodicity of a doubly stochastic


time series model. J. Time Series Anal. 14, 1, 93-108. § 2.4.

Meyn S.P., Guo L. (1991) Asymptotic properties of stochastic systems


possessing Markovian realizations. SIAM J. Control Optim. 29, 533-561. § 2.4.

Meyn S.P., Tweedie R.L. (1992) Stability of Markovian Processes I: Criteria


for Discrete-Time Chains. Adv. Appl. Prob. 24, 542-574. § 2.4.

Meyn S.P., Tweedie R.L. (1993) Stability of Markovian Processes II:


Continuous time processes and sampled chains; III: Foster-Lyapounov criteria for
continuous time systems. Adv. Appl. Prob.

Minakshisundaram S., Pleijel A. (1943) Some properties of eigenfunctions


of the Laplace operator on Riemannian manifolds. Can. J. of Math. 1, 242-256.

Mokkadem A. (1985 a) Le modele non lineaire AR(1) general, ergodicite et


ergodicite geometrique. C. R. Acad. Sci. Paris, Serie 1, 301, 889-892. § 2.4.
Mokkadem A. (1985 b) Conditions suffisante d'existence et d'ergodicite
geometrique des modeles bilineaires. C. R. Acad. Sci. Paris, Serie 1,301,375-377.

Mokkadem A. (1986) Sur le melange d'un processus ARMA vectoriel. C. R.


Acad. Sci. Paris, Serie 1, 303, 519-521. § 2.4.

Mokkadem A. (1987) Sur un modele autoregressif non lineaire, ergodicite et



ergodicite geometrique. J.T.S.A., n° 2, 195-204.

Mokkadem A. (1990) Proprietes de melange des modeles autoregressifs


polynomiaux. Ann. I.H.P. 26, 2, 219-260.

Molcanov S. A. (1981) The local structure of the spectrum of the one-dimensional


Schrödinger operator. Comm. Math. Phys. 78, 429-446.
Moricz F. A. (1983) A general moment inequality for the maximum of the
rectangular partial sums of multiple series. Acta Math. Hung. 41 (3/4), 337-346.

Moricz F. A., Serfling R. J., Stout W. F. (1982) Moment and probability


bounds with quasi-superadditive structure for the maximum partial sum. Ann.
Probab. 10 (4), 1032-1040.
Mueller C., Weissler F. (1973) Hypercontractivity for the heat semigroup for
Ultraspherical Polynomials and on the n-Sphere. J. Funct. Anal. 48.
Nagaev S. V., Fuk A. Kh. (1971) Probability inequalities for sums of
independent random variables. Theory Probab. Appl. 16, 643-660.§ 2.3.
Nahapetian B. (1980) The central limit theorem for random fields; in
Multicomponent random systems, Ed. Dobrushin-Sinai; 531-542. § 1.5. § 2.2.
Nahapetian B. (1991) Limit Theorems and Some Applications in Statistical
Physics. Teubner-Texte zur Mathematik. Stuttgart, Leipzig. § 1.3. § 1.4. § 1.5.

Neaderhouser C. C. (1978 a) Limit theorems for multiply indexed mixing


random variables with applications to Gibbs random fields. Ann. Probab. 6-2, 207-
215. § 1.5. § 2.2.

Neaderhouser C. C. (1978 b) Some limit theorems for random fields. Comm.
Math. Phys. 61, 293-305. § 1.5. § 2.2.
Nelson D. (1990) Stationarity and persistence in the GARCH(1,1) model.
Preprint, University of Chicago.
Nelson E. (1973) The free Markov Field. J. Funct. An.

Neveu J. (1970) Bases Mathematiques du Calcul des Probabilites. Masson,


Paris. § 1. § 2.4.
Neveu J. (1976) Sur l'esperance conditionnelle par rapport a un
mouvement brownien. An. I.H.P. XII, 2, 105-109.
Nychka D., Ellner, Gallant and McCaffrey (1992) Finding Chaos in Noisy
Systems. J. R. Statist. Soc. B 54 (2), 399-426.
Nummelin E. (1984) General irreducible Markov chains and non-negative
operators. Cambridge University Press.
Nummelin E., Tuominen P. (1982) Geometric ergodicity of Harris recurrent
Markov chains. Stochastic Process. Appl. 3, 187-202. § 2.4.

Orey S. (1971) Limit theorems for Markov chain transition probabilities. Van Nostrand,
London.

Papanicolaou G. C., Hersh R. (1972) Some limit theorems for stochastic


equations and applications. Indiana Univ. Math. J. 21 (9), 815-840.

Papanicolaou G. C., Varadhan S. R. S. (1973) A limit theorem with strong


mixing in Banach space and two applications to stochastic differential equations.
Comm. Pure Appl. Math. 26, 497-524.

Peligrad M. (1983) A note on two measures of dependence and mixing


sequences. Adv. in Appl. Probab. 15,461-464. § 1.1. § 1.3.

Peligrad M. (1986) Recent advances in the central limit theorem and its weak
invariance principle for mixing sequences of random variables in Dependence in
probability and statistics, a survey of recent results. Oberwolfach, 1985, Birkhäuser.
§ 1.1. § 1.3. § 1.5.

Peligrad M. (1990) On Ibragimov-Iosifescu conjecture for φ-mixing


sequences. Stochastic Process. Appl. 35, 293-308.

Petrucelli J., Woolford S.W. (1984) A threshold AR(l) process. J. Appl.


Probab. 21, 270-286.
Petrov V. V. (1975) Sums of independent random variables. Springer-Verlag,
Berlin. § 1.4. § 1.5.

Pham T. D. (1986) Bilinear Markovian representation and bilinear models.


Stochastic Process. Appl. 20, 295-306.

Pham T. D., Tran L. T. (1985) Some mixing properties of time series models.
Stochastic Process. Appl. 19, 297-303.

Philipp W. (1970) Some metrical theorems in number theory II. Duke Math.
J. 37, 447-458.

Philipp W. (1986) Invariance principles for independent and mixing


sequences of random variables. In Dependence in probability and statistics, a survey
of recent results. Oberwolfach, 1985, Birkhäuser.

Philipp W., Webb G. R. (1973) An invariance principle for mixing


sequences of random variables. Z. Wahrsch. Verw. Gebiete 25, 223-237. § 1.5.

Pitman J.W. (1974) Uniform rates of convergence for Markov chain transition
probabilities. Z. Wahrsch. Verw. Gebiete 29, 193-227.

Pollard D. (1984) Convergence of stochastic processes. Springer-Verlag,


Berlin. § 1.4.

Polya G., Szego G. (1972) Problems and theorems in Analysis I. Springer-


Verlag, Berlin.

Preston C. (1976) Random fields. L.N.M. 534. Springer-Verlag, Berlin.

Prakasa Rao B. L. S. (1990) On mixing for flows of σ-algebras. Sankhya,



series A, 52 (1), 1-15.

Prum B. (1986) Processus sur un reseau et mesures de Gibbs;


Applications; Masson, Paris. § 2.2.
Revuz D. (1984) Markov chains. North Holland, Amsterdam. § 2.4.
Reznick M. Kh. (1968) The law of the iterated logarithm for some classes of
stationary processes. Theory Probab. Appl. 13, 606-621.§ 1.3. § 1.4. § 1.5.

Rio E. (1994) A new covariance inequality for strongly mixing processes. To appear in
Ann. I.H.P.

Rosenblatt M. (1956) A central limit theorem and a strong mixing condition.


Proc. Nat. Ac. Sc. U.S.A., 42, 43-47.§ 1.1. § 1.2. § 1.3. § 2.4.

Rosenblatt M. (1971) Markov processes. Structure and asymptotic behaviour.


Springer-Verlag, Berlin. § 2.4. § 2.5.

Rosenblatt M. (1972) Uniform ergodicity and strong mixing. Z. Wahrsch.


Verw. Gebiete 24, 79-84. § 2.4. § 2.5.

Rosenblatt M. (1980) Linear processes and bispectra. J. Appl. Probab.17, 265-


270. § 2.3. § 2.4.

Rosenblatt M. (1985) Stationary sequences and random fields. Birkhäuser.


§ 1.5. § 2.1. § 2.3.

Rosenblatt M. (1987) Scale renormalization and random solutions of Burgers'


equation. J. Appl. Probab. 24, 328-338. § 2.5.

Rothaus O. S. (1981) Diffusion on Compact Riemannian Manifolds and


Logarithmic Sobolev Inequalities. J. Funct. An. 42, 102-109.

Roussas G. G., Ioannides D. (1987) Moment inequalities for mixing sequences


of random variables. Stoch. An. Appl. 5 (1), 61-120. § 1.1. § 1.2. § 1.3. § 2.4.

Rozanov Y. A. (1967) On Gaussian fields with given conditional distributions.


Theory Probab. Appl. 22 (3), 381-391. § 2.1. § 2.3.

Samur J. D. (1984) Convergence of sums of mixing triangular arrays of


random vectors with stationary rows. Ann. Probab. 12, 390-426.

Schwartz G. (1980) Finitely determined processes. An indiscrete approach. J.


Math. An. Appl. 76, 146-158.

Serfling R. J. (1968) Contribution to central limit theory for dependent random


variables. Ann. Math. Statist. 39 (4), 1158-1175.

Simon B. (1979) A remark on Dobrushin's uniqueness condition. Comm.


Math. Phys., 183-185.

Sinai Ya. G. (1982) Theory of phase transitions: rigorous results. Pergamon
Press, N.Y.

Szarek S. (1976) On the best constants in the Khinchin inequality. Studia



Math. 58, 197-208.


Statuljavichus V. A. (1983) On a condition of almost Markov regularity.
Theory Probab. Appl. 28 (2), 379-383.

Statuljavichus V. A., Yackimavicius D. A. (1989) Theorems on large


deviations for dependent random variables. Soviet Math. Dokl. 38 (2), 442-445. § 1.4.

Stein Ch. (1973) A bound for the error in the normal approximation of a sum of dependent
random variables. Proc. 6th Berkeley Symp. Math. Stat. and Probab. 2, 583-602.

Takahata H. (1983) On the rates in the central limit theorem for weakly
dependent random fields. Z. Wahrsch. Verw. Gebiete 64, 445-456.

Takahata H. (1986) L∞-bounds for asymptotic normality of weakly dependent


summands using Stein's method. Ann. Probab.9, 676-683. § 1.3. § 1.5.
Tikhomirov A. N. (1980) On the convergence rate in the central limit
theorem for dependent random variables. Theory Probab. Appl. 25 (4), 790-809.

Tjostheim D. (1986) Some doubly stochastic time series models. J. Time Series
Anal. 7, 51-72.

Tong H. (1983) Threshold models in non-linear time series analysis.


Lecture Notes in Statistics 21, Springer-Verlag, Berlin. § 2.4.
Tong H. (1990) Non-linear Time Series: A Dynamical System Approach.
Clarendon Press, Oxford.
Tuan P. D. (1986) The mixing properties of bilinear and random coefficient
autoregressive models. Z. Wahrsch. Verw. Gebiete 64, 291-300.
Tweedie R. L. (1974) R-theory for Markov chains on a general state space. I:
solidarity properties and R-recurrent chains. Ann. Probab. 2 (5), 840-864. § 2.4.

Tweedie R. L. (1975) Sufficient conditions for ergodicity and recurrence of


Markov chains on a general state space. Stochastic Process. Appl. 3, 385-403.
Tweedie R. L. (1983) The existence of moments for stationary Markov chains.
J. Appl. Probab. 20, 191-196.

Ueno T. (1960) On recurrent Markov processes. Kodai Math. J. Sem.


Rep. 12, 109-142.

Utev S. (1984) Inequalities and estimates of the convergence rate for the
weakly dependent case. Adv. in Probab. Th. 1985, Novosibirsk. § 1.4. § 1.5.
Van Doorn E. A. (1985) Conditions for exponential ergodicity and bounds for the
decay parameter of a birth-death process. Adv. in Appl. Probab. 17, 514-530.

Veijanen A. (1989) On estimation of parameters of partially observed random



fields and mixing processes. Statistical studies 9. Finnish Stat. Soc., Helsinki.
§ 1.4. § 1.5.

Withers C. S. (1981 a) Central limit theorem for dependent variables. Z.


Wahrsch. Verw. Gebiete 57, 509-534. § 1.3. § 1.5.
Withers C. S. (1981 b) Conditions for linear processes to be strongly mixing. Z.
Wahrsch. Verw. Gebiete 57, 477-480.
Wolkonski V. A., Rozanov Y. A. (1959) Some limit theorems for random
functions, Part I. Theory Probab. Appl. 4, 178-197.
Wolkonski V. A., Rozanov Y. A. (1961) Some limit theorems for random
functions, Part II. Theory Probab. Appl. 6, 186-198. § 2.1. § 2.4.
Yaglom A. M. (1963) Stationary Gaussian processes satisfying the strong
mixing condition and best predictable functionals. Univ. of California Berkeley,
Proc. Int. Research seminar of Statistics Lab., 241-252, N.Y. Springer-Verlag,
Berlin.
Yokoyama R. (1980) Moment bounds for stationary mixing sequences. Z.
Wahrsch. Verw. Gebiete 42, 45-57. § 1.4.
Yoshihara K. (1976) Limiting behaviour of U-statistics for stationary absolutely
regular sequences. Z. Wahrsch. Verw. Gebiete 35, 67-77. § 1.4. § 1.5.
Yoshihara K. (1978) Moment inequalities for mixing sequences. Kodai Math.
J. 1, 316-328. § 1.4.
Yurinskii V. V. (1977) On the error of the Gaussian approximation for
convolutions. Theory Probab. Appl. 22 (1), 236-247.
Zakoian J. M. (1990) Modeles autoregressifs a seuils de series chronologiques.
Thesis, Universite de Paris, Dauphine. § 2.4.

Index

This index is divided into a specific index for mixing coefficients and a general index.

Mixing coefficients

c-mixing 16; 18
*-mixing 3; 88
α-mixing 3; 17; 32; 45; 57; 80
αa,b-mixing 5; 119
β-mixing 3; 17; 36; 59; 90
φ-mixing 3; 19; 32; 39; 45; 57; 68; 88; 112
ρ-mixing 17; 19; 35; 47; 57; 89
r-mixing 15
ψ-mixing 3; 19; 88

General index

I-recurrent 105
absolute regularity 3; 7; 90
admissible 67
affine models 97
algebraic variety 96
annealing 95
aperiodic 90; 104
aperiodic positive recurrent 89
AR(1) nonlinear process 104
ARCH 100
arithmetic 106
ARMA 99
ARX nonlinear process 100
Bernstein inequality 33; 36
bilinear model 98
birth-death 112
Brownian motion 112
C-set 89
causal 79; 85
central limit theorem 45
Chapman-Kolmogorov 87
clique 69
configuration 71
covariance inequalities 10
dependence 63
derivation 116
diffusion 116
diffusion process 72; 115
Dobrushin's condition 65
Doeblin recurrent 88
dynamical system 93
elliptic operator 115
equilibrium 94
ergodic 19; 21; 64; 89; 104; 114
Feller 104
GARCH 100
Gaussian random field 57
generator 116
geometric ergodicity 89; 104
Gibbs state 64
harmonic 61
Harris recurrent 90
heteroscedastic 107
hitting time 89
Hoeffding's inequality 33
homogeneous 87
hypercontractivity 119
hypermixing 118
infinitesimal operator 111
interpolation lemma 27
invariant 88
invertibility condition 76
irreducibility 104
k-Markovian 67
kernel 63
linear random field 75
Luxemburg norm 10
Lyapounov 94; 102
m-dependent 17; 57
m-irreducibility 90
Markov field 67; 80
Markov process 87; 112
Markov representation 98
maximal inequalities 40
measures of dependence 3
mixing 21
moving average 81
nearest neighbours 69
nonhomogeneous 95
null recurrent 22; 89
Orlicz 10; 46
Ornstein-Uhlenbeck 115; 120
Ottaviani 42
particle 71
period 90
petit set 89
phase transition 64
point process 71
polynomial AR process 96
positive recurrent 22; 89
potential 67
probability kernel 63; 87
proper 63
quasi-subadditive 40
reconstruction 7
recurrent 21; 89
regular 21; 85
renewal theory 106
reversible 114
Riemannian recurrence 89
Rosenthal inequality 27
second order 84
simulation 96
singular 85
small set 89
Sobolev 122
specification 64
spectral density 58
stable 94
stationary 114
stationary distribution 22; 89
strictly elliptic 122
strong Feller 104
strong mixing 3; 26; 75; 120
subadditive 42
taboo 105
TAR(1) 104
transition probability kernel 87
ultracontractive 121
uniform mixing 3; 66
uniformly ergodic 21
unstable 94
Zariski 97