
Marius Iosifescu and Cor Kraaikamp

Contents

Preface
Frequently Used Notation

1 Basic properties of the continued fraction expansion
1.1 A generalization of Euclid's algorithm
1.1.1 The continued fraction transformation τ
1.1.2 Continuants and convergents
1.1.3 Some special continued fraction expansions
1.2 Basic metric properties
1.2.1 Defining random variables of interest
1.2.2 Gauss' problem and measure
1.2.3 Fundamental intervals, and applications
1.3 The natural extension of τ
1.3.1 Definition and basic properties
1.3.2 Approximation coefficients
1.3.3 Extended random variables
1.3.4 The conditional probability measures
1.3.5 Paul Lévy's solution to Gauss' problem
1.3.6 Mixing properties

2 Solving Gauss' problem
2.0 Banach space preliminaries
2.0.1 A few classical Banach spaces
2.0.2 Bounded essential variation
2.1 The Perron–Frobenius operator
2.1.1 Definition and basic properties
2.1.2 Asymptotic behaviour
2.1.3 Restricting the domain of the Perron–Frobenius operator
2.1.4 A solution to Gauss' problem for probability measures with densities
2.1.5 Computing variances of certain sums
2.2 Wirsing's solution to Gauss' problem
2.2.1 Elementary considerations
2.2.2 A functional-theoretic approach
2.2.3 The case of Lipschitz densities
2.3 Babenko's solution to Gauss' problem
2.3.1 Preliminaries
2.3.2 A symmetric linear operator
2.3.3 An 'exact' Gauss–Kuzmin–Lévy theorem
2.3.4 ψ-mixing revisited
2.4 Extending Babenko's and Wirsing's work
2.4.1 The Mayer–Roepstorff Hilbert space approach
2.4.2 The Mayer–Roepstorff Banach space approach
2.4.3 Mayer–Ruelle operators
2.5 The Markov chain associated with the continued fraction expansion
2.5.1 The Perron–Frobenius operator on BV(I)
2.5.2 An upper bound
2.5.3 Two asymptotic distributions
2.5.4 A generalization of a result of A. Denjoy

3 Limit theorems
3.0 Preliminaries
3.1 The Poisson law
3.1.1 The case of incomplete quotients
3.1.2 The case of associated random variables
3.1.3 Some extreme value theory
3.2 Normal convergence
3.2.1 Two general invariance principles
3.2.2 The case of incomplete quotients
3.2.3 The case of associated random variables
3.3 Convergence to non-normal stable laws
3.3.1 The case of incomplete quotients
3.3.2 Sums of incomplete quotients
3.3.3 The case of associated random variables
3.4 Fluctuation results
3.4.1 The case of incomplete quotients
3.4.2 The case of associated random variables

4 Ergodic theory of continued fractions
4.0 Ergodic theory preliminaries
4.0.1 A few general concepts
4.0.2 The special case of the transformations τ and τ̄
4.1 Classical results and generalizations
4.1.1 The case of incomplete quotients
4.1.2 Empirical evidence, and normal continued fraction numbers
4.1.3 The case of associated and extended random variables
4.2 Other continued fraction expansions
4.2.1 Preliminaries
4.2.2 Semi-regular continued fraction expansions
4.2.3 The singularization process
4.2.4 S-expansions
4.2.5 Ergodic properties of S-expansions
4.3 Examples of S-expansions
4.3.1 Nakada's α-expansions
4.3.2 Minkowski's diagonal continued fraction expansion
4.3.3 Bosma's optimal continued fraction expansion
4.4 Continued fraction expansions with σ-finite, infinite invariant measure
4.4.1 The insertion process
4.4.2 The Lehner and Farey continued fraction expansions
4.4.3 The backward continued fraction expansion

Appendix 1: Spaces, functions, and measures (A1.1–A1.6)
Appendix 2: Regularly varying functions (A2.1–A2.3)
Appendix 3: Limit theorems for mixing random variables (A3.1–A3.3)
Notes and Comments
References
Index

Preface

This monograph is intended to be a complete treatment of the metrical theory of the (regular) continued fraction expansion and related representations of real numbers. We have attempted to give the best results known so far, with the simplest and most direct proofs. The book has had a long gestation period: we first decided to write it in March 1994. This gave us the opportunity to improve substantially the initial versions of many parts of it. Even though the two authors differ in style and approach, every effort has been made to hide the differences.

Let Ω denote the set of irrationals in I = [0, 1]. Define the (regular) continued fraction transformation τ by

τ(ω) = fractional part of 1/ω,    ω ∈ Ω.

Write τ^n for the nth iterate of τ, n ∈ N = {0, 1, ···}, with τ^0 = identity map. The positive integers

a_n(ω) = a_1(τ^{n−1}(ω)),    n ∈ N_+ = {1, 2, ···},

where a_1(ω) = integer part of 1/ω, ω ∈ Ω, are called the (regular continued fraction) digits of ω. Writing

[x_1] = 1/x_1,    [x_1, ···, x_n] = 1/(x_1 + [x_2, ···, x_n]),    n ≥ 2,

for arbitrary indeterminates x_i, 1 ≤ i ≤ n, we have

ω = lim_{n→∞} [a_1(ω), ···, a_n(ω)],    ω ∈ Ω,

thus explaining the name of τ. The above equation will also be written as

ω = [a_1(ω), a_2(ω), ···],    ω ∈ Ω.

The a_n, n ∈ N_+, to be called incomplete quotients, are clearly positive integer-valued random variables which are defined almost surely on (I, B_I) with respect to any probability measure assigning probability 0 to the set I \ Ω of rationals in I. (Here B_I denotes the σ-algebra of Borel subsets of I.) The metrical theory of the (regular) continued fraction expansion is about the sequence (a_n)_{n∈N_+} of its incomplete quotients, and related sequences.
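The digits a_n(ω) = a_1(τ^{n−1}(ω)) can be produced by simply iterating τ. A minimal sketch in Python (ours, not part of the original text; floating-point iteration of τ is chaotic, so only small n are trustworthy):

```python
import math

def digits_float(omega, n):
    """First n continued fraction digits of omega, obtained by iterating
    tau(w) = 1/w - floor(1/w); floating point, so only small n are reliable."""
    ds, w = [], omega
    for _ in range(n):
        a = math.floor(1 / w)
        ds.append(a)
        w = 1 / w - a          # w = tau(w)
    return ds

# omega = sqrt(2) - 1 has the purely periodic expansion [2, 2, 2, ...]
print(digits_float(math.sqrt(2) - 1, 8))   # -> [2, 2, 2, 2, 2, 2, 2, 2]
```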

C.F. Gauss stated in 1812 that, in current notation,

lim_{n→∞} λ(τ^{−n}([0, x))) = γ([0, x]),    x ∈ I,

where λ denotes Lebesgue measure and γ is what we now call Gauss' measure, defined by

γ(A) = (1/log 2) ∫_A dx/(x + 1),    A ∈ B_I.

Gauss asked for an estimate of the convergence rate in the above limiting relation, and this was in fact the first problem of the metrical theory of continued fractions. Ramifications of this problem, which was first solved only in 1928, still pervade current developments. Chapter 2 contains a detailed treatment of Gauss' problem, by an elementary approach as well as by functional-theoretic methods. The latter are applied to the Perron–Frobenius operator associated with τ, considered as acting on various Banach spaces, including that of functions of bounded variation on I.

Gauss' measure is important since it is preserved by τ, that is, γ(τ^{−1}(A)) = γ(A) for any A ∈ B_I. This implies that, by its very definition, the sequence (a_n)_{n∈N_+} is strictly stationary under γ. As such, there should exist a doubly infinite version of it, say (ā_ℓ)_{ℓ∈Z}, Z = {···, −1, 0, 1, ···}, defined on a richer probability space. It appears that this doubly infinite version can be effectively constructed on (I², B_I², γ̄), where γ̄ is the so-called extended Gauss measure defined by

γ̄(B) = (1/log 2) ∫∫_B dx dy/(xy + 1)²,    B ∈ B_I².

Put ā_{−n}(ω, θ) = a_{n+1}(θ), ā_0(ω, θ) = a_1(θ), and ā_n(ω, θ) = a_n(ω) for any n ∈ N_+ and (ω, θ) ∈ Ω². Then whatever ℓ ∈ Z, k ∈ N, and n ∈ N_+, the probability distribution of the random vector (ā_ℓ, ···, ā_{ℓ+k}) under γ̄ is identical with that of the random vector (a_n, ···, a_{n+k}) under γ; that is, (ā_ℓ)_{ℓ∈Z} under γ̄ is a doubly infinite version of (a_n)_{n∈N_+} under γ. A distinctive feature of our treatment is the consistent use of the extended incomplete quotients ā_ℓ, ℓ ∈ Z. It appears that

γ̄([0, x] × I | ā_0, ā_{−1}, ···) = (a + 1)x/(ax + 1)    γ̄-a.s.

for any x ∈ I, where a = [ā_0, ā_{−1}, ···], which in turn implies that

γ̄(ā_{ℓ+1} = i | ā_ℓ, ā_{ℓ−1}, ···) = (a + 1)/((a + i)(a + i + 1))    γ̄-a.s.

for any i ∈ N_+ and ℓ ∈ Z, with a = [ā_ℓ, ā_{ℓ−1}, ···]. The last equation emphasizes a 'chain of infinite order' structure of the incomplete quotients when properly defined on a richer probability space. This idea goes back to W. Doeblin (1940) and, hopefully, is fully clarified by our treatment. The considerations above also motivate the introduction of the family (γ_a)_{a∈I} of probability measures on B_I defined by their distribution functions

γ_a([0, x]) = (a + 1)x/(ax + 1),    x ∈ I.

In particular, γ_0 = λ. Besides γ, these probability measures, which we call conditional, are the most natural ones associated with the regular continued fraction expansion. It appears that (ā_ℓ)_{ℓ∈Z} is ψ-mixing under γ̄, while (a_n)_{n∈N_+} is ψ-mixing under γ and any γ_a, a ∈ I, and that the ψ-mixing coefficients of the latter under γ (which are equal to the corresponding ones of the former under γ̄) can in principle be calculated exactly. The facts just described are part of our Chapter 1.

Chapter 3 is devoted to limit theorems for incomplete quotients, related random variables, and their extended versions. These include weak convergence to the Poisson, normal, and non-normal stable laws, as well as the law of the iterated logarithm, in both classical and functional approaches, and are essentially based on the ψ-mixing property of both (ā_ℓ)_{ℓ∈Z} and (a_n)_{n∈N_+}.

The ergodic properties of the regular continued fraction expansion, leading to strong laws of large numbers, are deferred to Chapter 4. The reason is that whilst these properties are inherited by the continued fraction expansions which can be derived from the regular one by the procedures called singularization and insertion, the limit properties in Chapter 3 do not transfer automatically to continued fraction expansions so derived. We give applications of the ergodic properties of the continued fraction transformation τ and its natural extension τ̄. After an introduction in which several general ergodic-theoretical concepts and results, such as Birkhoff's ergodic theorem, are described, various classical results and important recent results based on the natural extension are derived. It is then shown that, via singularization and insertion, the ergodic properties of very many other continued fraction expansions can easily be obtained. In particular, the ergodic properties of the so called S-expansions are described in detail.
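Gauss' limiting relation and the conditional digit distribution quoted above can both be sanity-checked numerically. A rough sketch (sample sizes, seeds, and helper names are ours, not from the text): a Monte Carlo estimate of λ(τ^{−n}([0, x))) is compared with γ([0, x]) = log(1 + x)/log 2, and the probabilities (a + 1)/((a + i)(a + i + 1)) are summed exactly, the series telescoping to 1.

```python
import math, random
from fractions import Fraction

def tau(w):
    """Gauss map: fractional part of 1/w."""
    return 1 / w - math.floor(1 / w)

def lebesgue_mass(n, x, samples=200_000, seed=1):
    """Monte Carlo estimate of lambda(w in (0,1) : tau^n(w) < x)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        w = rng.random()
        for _ in range(n):
            if w == 0:
                break
            w = tau(w)
        hits += (w < x)
    return hits / samples

x = 0.5
gauss = math.log(1 + x) / math.log(2)            # gamma([0, x]) = log(3/2)/log 2
print(abs(lebesgue_mass(5, x) - gauss) < 0.02)   # True: already very close at n = 5

# The conditional probabilities sum to 1 over i: the series telescopes, since
# (a+1)/((a+i)(a+i+1)) = (a+1) * (1/(a+i) - 1/(a+i+1)).
a = Fraction(1, 3)                               # any a in [0, 1], rational for exactness
partial = sum(Fraction(a + 1, 1) / ((a + i) * (a + i + 1)) for i in range(1, 1001))
print(partial == 1 - (a + 1) / (a + 1001))       # True (exact rational arithmetic)
```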
Several examples of S-expansions are studied, such as Nakada's α-expansions, Minkowski's diagonal continued fraction expansion, and Bosma's optimal continued fraction expansion. Also, the connection between the regular continued fraction expansion and continued fraction expansions with σ-finite, infinite invariant measures, such as the backward continued fraction expansion and Lehner's continued fraction expansion, is explained.

To make the book as reasonably self-contained as possible, we have included three appendices containing less known notions and results from measure theory, regularly varying functions, and limit theorems for mixing sequences of random variables, which we use frequently, especially in Chapter 3. We urge the reader to become familiar with the appendices early on, so as to be aware of what can be found there as needed. We also warn the reader that Chapter 3 and some subsections of Chapter 2 are more involved or more abstract, and thus make for more difficult reading.

The concluding notes and comments aim at giving credit, pointing out results not included in the main text, and tracing historical developments. The reference list greatly exceeds the number of works quoted in the course of the book. It should be consulted with the purpose of discovering historical sources, parallel research, and starting points for new investigations. For what our work is not, the reader is referred to the books by Brezinski (1991) and von Plato (1994) for the history of continued fractions, and to Jones and Thron (1980), Lorentzen and Waadeland (1992), Olds (1963), Perron (1954, 1957), Rockett and Szüsz (1992), Schmidt (1980), Sprindžuk (1979), Sudan (1959), and Wall (1948) for various, mainly non-metric, aspects of the theory of continued fractions.

Acknowledgements

Much of our original work included in this book has been carried out in the framework of our association with the Bucharest 'Gheorghe Mihoc' Centre for Mathematical Statistics of the Romanian Academy, and the Department of Probability and Statistics (CROSS), Faculty ITS, of the Delft University of Technology. Many institutions and persons have helped us in various ways.
The first of us wishes to acknowledge the hospitality of Université René Descartes – Paris 5, Université des Sciences et des Technologies de Lille, and Université Victor Segalen – Bordeaux 2. He is grateful to Bui Trong Lieu, Michel Schreiber (both of Paris 5), George Haiman (Lille), and Jean-Marc Deshouillers (Bordeaux 2) for their kind invitations to these institutions, where his stays in the period 1996–1999 were very helpful in completing parts of the book. He is also grateful to the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO), the Dutch organization for scientific research, for two one-month research grants in the years 2000 and 2001, and to the Department of Probability and Statistics (CROSS) for invitations allowing several short stays in Delft during which much of the joint work on the book was done. A short stay in the spring of 2000 at the Department of Mathematics of Uppsala University, for which he is grateful to Allan Gut, was very beneficial for gathering recent literature on the subject. Last, but not least, he gratefully acknowledges generous financial support in the years 2000 and 2001 from a French–Romanian CNRS International Project of Scientific Cooperation (PICS) directed by Haïm Brezis and Doina Ciorănescu (both of Université Pierre et Marie Curie – Paris 6). This allowed him to spend more time in Delft, which was decisive for completing the book. Finally, he wishes to acknowledge the technical help he has received from Adriana Grădinaru, who changed his handwritten, hardly legible drafts into a camera-ready copy.

The second author would also like to thank the Romanian Academy for its support during his visits to Bucharest.

Adriana Berechet read several versions of the typescript, and with her penetrating mind detected some inaccuracies and slips. Expressing our indebtedness to her, we wish to make it clear that any remaining errors are our own. Finally, we must thank all the people at Kluwer Academic Publishers who helped during the development and production of this book project.

Delft, November 2001

M.I. C.K.


Frequently Used Notation

Abbreviations

a.e. = almost everywhere (with respect to Lebesgue measure)
a.s. = almost surely (with respect to any other measure)
Cov = covariance
g.c.d. = greatest common divisor
i.i.d. = independent identically distributed
i.o. = infinitely often
log = natural logarithm
p.m. = probability measure
s.i. = strongly infinitesimal
r.v. = random variable
var = total variation
Var = variance
2 = end of example, proof, or remark

Symbols

N = {0, 1, 2, ···}, N_+ = {1, 2, ···}, −N = {···, −2, −1, 0}
Z = (−N) ∪ N_+ = {···, −1, 0, 1, ···}
Q = the set of rational numbers
R = the set of real numbers
⌊a⌋ = integer part of a ∈ R
{a} = fractional part of a ∈ R
R_+ = {x ∈ R : x ≥ 0}, R_++ = {x ∈ R : x > 0}
I = [0, 1] = the unit interval of R
Ω = I \ Q = the set of irrationals in I
C = the set of complex numbers
i = √−1 (imaginary unit)

z* = complex conjugate of z ∈ C
R^n = real n-vector space, or Euclidean n-space, n ∈ N_+; R^1 = R
B^n = σ-algebra of Borel sets in R^n; B^1 = B
B_M = B^n ∩ M := {B ∩ M : B ∈ B^n}, M ∈ B^n, n ∈ N_+
B_I = B ∩ I = σ-algebra of Borel sets in I
B_I² = B_{I²} = σ-algebra of Borel sets in I²
A^c = complementary set of the set A
I_A = indicator function of the set A
∂A = boundary of the Borel set A
δ_x = p.m. concentrated at the point x
λ = Lebesgue measure on B
λ² = Lebesgue measure on B²
N(0, 1) = standard normal distribution
Φ = standard normal distribution function
P(θ) = Poisson distribution with parameter θ
Pf⁻¹ = P-distribution of r.v. f
∗ = convolution of measures
⊗ = product of σ-algebras or measures
C = 0.577 215 ··· (Euler's constant)
F_n = nth Fibonacci number: F_0 = F_1 = 1, F_{n+1} = F_n + F_{n−1}, n ∈ N_+
g = (√5 − 1)/2, G = g + 1 ('golden ratios')
K_0 = 2.685 452 ··· (Khinchin's constant)
K_{−1} = 1.745 405 ··· (Khinchin's constant)
λ_0 = 0.303 663 002 898 732 568 ··· (Wirsing's constant)
ζ(2) = Σ_{i∈N_+} i⁻² = π²/6


Chapter 1

Basic properties of the continued fraction expansion

In this chapter the (regular) continued fraction expansion is introduced and notation fixed. Some basic properties to be used in subsequent chapters are also derived.

1.1 A generalization of Euclid's algorithm

1.1.1 The continued fraction transformation τ

In Proposition 2 of Book VII, Euclid gave an algorithm, now bearing his name, for finding the greatest common divisor (g.c.d.) of two given integers: let a, b ∈ Z and assume for convenience that a > b > 0. Put v_0 := a, v_1 := b, and determine a_1 ∈ N_+, v_2 ∈ N such that

v_0 = a_1 v_1 + v_2,

where 0 ≤ v_2 < v_1. If v_2 ≠ 0 then we repeat this procedure and obtain v_1 = a_2 v_2 + v_3, where 0 ≤ v_3 < v_2. In general, if v_m ≠ 0 for some m ≥ 2, then we obtain

v_{m−1} = a_m v_m + v_{m+1},    (1.1.1)

where 0 ≤ v_{m+1} < v_m. Clearly, the procedure stops after finitely many steps: there exists n ∈ N_+ such that v_n ≠ 0 and v_{n+1} = 0. Then, as is well known, we have v_n = g.c.d.(a, b).

Remark. The running time of Euclid's algorithm depends on the number of division steps required to get the g.c.d. of the given positive integers v_0 > v_1. An 1844 paper of the French mathematician Gabriel Lamé essentially shows that (i) given n ∈ N_+, if Euclid's algorithm applied to v_0 and v_1 requires exactly n division steps and v_0 is as small as possible satisfying this condition, then v_0 = F_{n+1} and v_1 = F_n; (ii) if v_1 < v_0 < m ∈ N_+, then the number of division steps required by Euclid's algorithm when applied to v_0 and v_1 is at most

⌊log(√5 m)/log((√5 + 1)/2)⌋ − 2 ≈ 2.078 log m + 1.672 − 2,

where ⌊·⌋ : R → Z is the greatest integer function, that is, ⌊x⌋ = greatest integer not exceeding x ∈ R. For historical details we refer the reader to Shallit (1994), and for recent developments to Knuth (1981, Section 4.5.3) and Hensley (1994). It should be noted that the latter are based on results to be proved in this and later chapters. 2

To consider Euclid's algorithm more closely we define the so-called continued fraction transformation τ : I → I by

τ(x) = 1/x − ⌊1/x⌋ if x ≠ 0,    τ(0) = 0.
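Euclid's algorithm and Lamé's bound on its number of division steps can be illustrated directly (a sketch; helper names and the chosen inputs are ours):

```python
import math

def euclid(v0, v1):
    """Return (g.c.d., number of division steps) for v0 > v1 > 0."""
    steps = 0
    while v1 != 0:
        v0, v1 = v1, v0 % v1   # one step of v_{m-1} = a_m v_m + v_{m+1}
        steps += 1
    return v0, steps

# Fibonacci numbers in the convention F_0 = F_1 = 1 used here.
F = [1, 1]
while len(F) < 15:
    F.append(F[-1] + F[-2])

g, n = euclid(F[11], F[10])   # the worst case (i): v0 = F_{n+1}, v1 = F_n
m = F[11] + 1                 # any m with v1 < v0 < m, for the bound (ii)
bound = math.floor(math.log(math.sqrt(5) * m) / math.log((math.sqrt(5) + 1) / 2)) - 2
print(g, n, bound)            # -> 1 10 10: exactly n steps, meeting Lame's bound
```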

Then putting x = b/a we obviously have a_1 = a_1(x) = ⌊v_0/v_1⌋, ···, a_n = a_n(x) = ⌊v_{n−1}/v_n⌋, and

v_m/v_{m−1} = τ^{m−1}(x),  1 ≤ m ≤ n,    τ^n(x) = 0,

where τ^0 = identity map and τ^ℓ, ℓ ∈ N_+, is the composition of τ with itself ℓ times. Note that

a_m(x) = a_1(τ^{m−1}(x)),    1 ≤ m ≤ n.    (1.1.2)

As v_{m−1} = a_m v_m + v_{m+1}, we have

1/τ^{m−1}(x) = a_m + τ^m(x),    1 ≤ m ≤ n.

If for arbitrary indeterminates x_i, 1 ≤ i ≤ n, n ∈ N_+, we write

[x_1] = 1/x_1,    [x_1, ···, x_n] = 1/(x_1 + [x_2, ···, x_n]),    n ≥ 2,

then it follows that

x = [a_1 + τ(x)] = [a_1, ···, a_{m−1}, a_m + τ^m(x)] = [a_1, ···, a_n]    (1.1.3)

for 1 < m ≤ n. An expression as on the right hand side of (1.1.3) is called a finite (regular) continued fraction (RCF for short). It follows from Euclid's algorithm that each rational number x ∉ Z can be written as

x = a_0 + [a_1, ···, a_n],    (1.1.4)

where a_0 = ⌊x⌋. (Note that for any x ∈ R, x ∉ Z, the fractional part x − ⌊x⌋ of x is a number in the open interval (0, 1)!) The right hand side of (1.1.4) will be denoted by [a_0; a_1, ···, a_n]. Euclid's algorithm yields a_n ≥ 2. Hence each rational number x ∉ Z has two continued fraction expansions, namely,

[a_0; a_1, ···, a_n] = [a_0; a_1, ···, a_n − 1, 1].

Of course, there is no reason whatsoever to stick to rationals. Let x ∈ R \ Q and, as in the case of rationals, put a_0 = ⌊x⌋. It follows from the very definition of τ that

τ^n(x − a_0) ∈ Ω = I \ Q,    n ∈ N.

Let us define

a_n = a_n(x) = ⌊1/τ^{n−1}(x − a_0)⌋,    n ∈ N_+,

so that, similarly to (1.1.2),

a_n(x) = a_1(τ^{n−1}(x − a_0)),    n ∈ N_+.    (1.1.2′)

Hence

x = [a_0; a_1 + τ(x − a_0)] = ··· = [a_0; a_1, ···, a_{n−1}, a_n + τ^n(x − a_0)]    (1.1.5)

for any n ≥ 2. The two cases x ∈ Q and x ∈ R \ Q can be treated in a unitary manner if we define a_1(0) = ∞, the symbol ∞ being subject to the rules 1/∞ = 0, 1/0 = ∞. Equations (1.1.5) are then valid for any x ∈ R. Clearly, for any x ∈ Q there exists n = n(x) ∈ N_+ such that a_m(x) = ∞ for any m ≥ n. The integers a_1(x), a_2(x), ··· will be called the (continued fraction) digits of x ∈ R, whilst the functions x → a_i(x) ∈ N_+ ∪ {∞}, x ∈ R, i ∈ N_+, will be called the incomplete (or partial) quotients of the continued fraction expansion. Euclid's algorithm implies that x ∈ R has finitely many finite continued fraction digits if and only if x ∈ Q.
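For a rational x the digits can be obtained exactly with integer arithmetic, and the two expansions [a_0; a_1, ···, a_n] = [a_0; a_1, ···, a_n − 1, 1] can be verified. A sketch (helper names and the sample value are ours):

```python
from fractions import Fraction
import math

def rcf_digits(x):
    """Digits a0; a1, ..., an of a rational x (Euclid's form, so an >= 2)."""
    ds = [math.floor(x)]
    x = x - ds[0]
    while x != 0:
        x = 1 / x
        a = math.floor(x)
        ds.append(a)
        x -= a
    return ds

def rcf_value(ds):
    """Evaluate [a0; a1, ..., an] from the inside out."""
    v = Fraction(ds[-1])
    for a in reversed(ds[:-1]):
        v = a + 1 / v
    return v

x = Fraction(67, 29)
ds = rcf_digits(x)
print(ds)                                    # -> [2, 3, 4, 2]
print(rcf_value(ds[:-1] + [ds[-1] - 1, 1]))  # -> 67/29: the second expansion
```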

1.1.2 Continuants and convergents

Throughout the first three chapters, without express mention to the contrary, we will assume that x ∈ [0, 1), which implies that a_0 = 0, and write

[0; a_1, ···, a_n] = [a_1, ···, a_n],    n ∈ N_+.

We will usually drop the dependence on x in the notation. Define

ω_0 = 0,    ω_n = ω_n(x) = [a_1, ···, a_n],    x ∈ [0, 1), n ∈ N_+.

Clearly, ω_n ∈ Q, say

ω_n = p_n/q_n,    n ∈ N_+,

where p_n, q_n ∈ N_+ and g.c.d.(p_n, q_n) = 1. The number ω_n = ω_n(x) is called the nth (regular continued fraction) (RCF) convergent of x, n ∈ N. As a rule, in the first three chapters the specification RCF will be dropped. Clearly, for any x ∈ Q there exists n = n(x) ∈ N such that ω_m(x) = x for any m ≥ n. We shall show that for any irrational ω ∈ Ω := I \ Q we have

lim_{n→∞} ω_n(ω) = ω.

For that we need some preparation. Define recursively polynomials Q_n of n variables, n ∈ N, by

Q_n(x_1, ···, x_n) = { 1 if n = 0;  x_1 if n = 1;  x_1 Q_{n−1}(x_2, ···, x_n) + Q_{n−2}(x_3, ···, x_n) if n ≥ 2 }.

Thus

Q_2(x_1, x_2) = x_1x_2 + 1,
Q_3(x_1, x_2, x_3) = x_1x_2x_3 + x_1 + x_3,
Q_4(x_1, x_2, x_3, x_4) = x_1x_2x_3x_4 + x_1x_2 + x_1x_4 + x_3x_4 + 1,

etc. In general, as noted by Leonhard Euler, for any n ∈ N_+, Q_n(x_1, ···, x_n) is the sum of all terms which can be obtained starting from x_1 ··· x_n and deleting zero or more non-overlapping pairs (x_i, x_{i+1}) of consecutive variables. There are F_n such terms. (Prove it!) The polynomials Q_n, n ∈ N, are called continuants, and their basic property is that

[x_1, ···, x_n] = Q_{n−1}(x_2, ···, x_n)/Q_n(x_1, ···, x_n),    n ∈ N_+.    (1.1.6)

The proof by induction is immediate and is left to the reader. The continuants enjoy the symmetry property

Q_n(x_1, ···, x_n) = Q_n(x_n, ···, x_1),    n ∈ N_+.    (1.1.7)

This follows from Euler's remark above. Hence

Q_n(x_1, ···, x_n) = x_n Q_{n−1}(x_1, ···, x_{n−1}) + Q_{n−2}(x_1, ···, x_{n−2})    (1.1.8)

for any n ≥ 2. The continuants also satisfy the equation

Q_n(x_1, ···, x_n) Q_n(x_2, ···, x_{n+1}) − Q_{n+1}(x_1, ···, x_{n+1}) Q_{n−1}(x_2, ···, x_n) = (−1)^n,    n ∈ N_+.    (1.1.9)

The proof is immediate. For n = 1 equation (1.1.9) is true. By the very definition of Q_n, for any n ≥ 2 we have

Q_n(x_1, ···, x_n) Q_n(x_2, ···, x_{n+1}) − Q_{n+1}(x_1, ···, x_{n+1}) Q_{n−1}(x_2, ···, x_n)
= (x_1 Q_{n−1}(x_2, ···, x_n) + Q_{n−2}(x_3, ···, x_n)) Q_n(x_2, ···, x_{n+1}) − (x_1 Q_n(x_2, ···, x_{n+1}) + Q_{n−1}(x_3, ···, x_{n+1})) Q_{n−1}(x_2, ···, x_n)
= (−1)(Q_{n−1}(x_2, ···, x_n) Q_{n−1}(x_3, ···, x_{n+1}) − Q_n(x_2, ···, x_{n+1}) Q_{n−2}(x_3, ···, x_n))
= ··· = (−1)^{n−1}(Q_1(x_n) Q_1(x_{n+1}) − Q_2(x_n, x_{n+1})) = (−1)^n.
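The continuant recursion and the identities (1.1.6), (1.1.7), and (1.1.9) hold for arbitrary arguments, so they can be checked symbolically on random positive integers (a sketch; function names are ours):

```python
from fractions import Fraction
import random

def Q(xs):
    """Continuant: Q() = 1, Q(x1) = x1,
    Q(x1..xn) = x1 * Q(x2..xn) + Q(x3..xn)."""
    if len(xs) == 0:
        return 1
    if len(xs) == 1:
        return xs[0]
    return xs[0] * Q(xs[1:]) + Q(xs[2:])

def cf(xs):
    """[x1, ..., xn] = 1/(x1 + [x2, ..., xn]), exactly."""
    v = Fraction(0)
    for x in reversed(xs):
        v = 1 / Fraction(x + v)
    return v

rng = random.Random(0)
xs = [rng.randint(1, 9) for _ in range(7)]
assert cf(xs) == Fraction(Q(xs[1:]), Q(xs))                        # (1.1.6)
assert Q(xs) == Q(xs[::-1])                                        # (1.1.7)
n = len(xs) - 1                                                    # variables x_1..x_{n+1}
assert Q(xs[:-1]) * Q(xs[1:]) - Q(xs) * Q(xs[1:-1]) == (-1) ** n   # (1.1.9)
print("continuant identities verified on", xs)
```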

Now, let ω ∈ Ω = I \ Q have digits a_1(ω), a_2(ω), ···. It follows from (1.1.6) and (1.1.9) that

ω_n(ω) = Q_{n−1}(a_2, ···, a_n)/Q_n(a_1, ···, a_n),  so  p_n = Q_{n−1}(a_2, ···, a_n), q_n = Q_n(a_1, ···, a_n),    n ∈ N_+.    (1.1.10)

Hence p_n(ω) = q_{n−1}(τ(ω)), n ∈ N_+, ω ∈ Ω, and using (1.1.8) we obtain

q_n = a_n q_{n−1} + q_{n−2}, n ≥ 2,    p_n = a_n p_{n−1} + p_{n−2}, n ≥ 3,    (1.1.11)

with q_0 = 1, q_1 = a_1, p_1 = 1, p_2 = a_2. If we define p_0 = q_{−1} = 0, p_{−1} = 1, then equations (1.1.11) hold for any n ∈ N_+. It follows from (1.1.9) and (1.1.10) that

p_n q_{n−1} − p_{n−1} q_n = (−1)^{n+1},    n ∈ N.    (1.1.12)

Clearly, either (1.1.10) or (1.1.11) implies that

p_{n+1} ≥ F_n,    q_n ≥ F_n,    n ∈ N.    (1.1.13)
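The recursions (1.1.11) translate directly into a few lines of code; for digits all equal to 1 the p_n and q_n are exactly the Fibonacci numbers, which also illustrates (1.1.13) with equality (a sketch, names ours):

```python
def convergents(digits):
    """(p_n, q_n) via (1.1.11), with p_0 = q_{-1} = 0, p_{-1} = q_0 = 1."""
    p_prev, p = 1, 0          # p_{-1}, p_0
    q_prev, q = 0, 1          # q_{-1}, q_0
    out = []
    for a in digits:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out

cs = convergents([1] * 8)     # all digits 1: Fibonacci numerators/denominators
print(cs)   # -> [(1, 1), (1, 2), (2, 3), (3, 5), (5, 8), (8, 13), (13, 21), (21, 34)]

# (1.1.12): p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}; here cs[k] = (p_{k+1}, q_{k+1})
for k in range(1, len(cs)):
    (pp, qp), (pc, qc) = cs[k - 1], cs[k]
    assert pc * qp - pp * qc == (-1) ** k   # = (-1)^{(k+1)+1}
```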

Notice that by (1.1.5), (1.1.6), (1.1.7), (1.1.10), and (1.1.11) we also have

ω = [a_1 + τ(ω)] = 1/(a_1 + τ(ω)) = (p_1 + τ(ω)p_0)/(q_1 + τ(ω)q_0),

ω = [a_1, a_2 + τ²(ω)] = (a_2 + τ²(ω))/(a_1a_2 + 1 + a_1τ²(ω)) = (p_2 + τ²(ω)p_1)/(q_2 + τ²(ω)q_1),

and for n ≥ 3,

ω = [a_1, ···, a_{n−1}, a_n + τ^n(ω)]
= Q_{n−1}(a_n + τ^n(ω), a_{n−1}, ···, a_2)/Q_n(a_n + τ^n(ω), a_{n−1}, ···, a_1)
= ((a_n + τ^n(ω)) Q_{n−2}(a_2, ···, a_{n−1}) + Q_{n−3}(a_2, ···, a_{n−2}))/((a_n + τ^n(ω)) Q_{n−1}(a_1, ···, a_{n−1}) + Q_{n−2}(a_1, ···, a_{n−2}))
= (a_n p_{n−1} + p_{n−2} + τ^n(ω)p_{n−1})/(a_n q_{n−1} + q_{n−2} + τ^n(ω)q_{n−1})
= (p_n + τ^n(ω)p_{n−1})/(q_n + τ^n(ω)q_{n−1}).

Therefore we can assert that

ω = (p_n + τ^n(ω)p_{n−1})/(q_n + τ^n(ω)q_{n−1}),    ω ∈ Ω, n ∈ N,    (1.1.14)

and remark that (1.1.14) also holds for any rational ω in [0, 1).

Remark. A matrix approach to equations (1.1.12) and (1.1.14) is as follows. Consider the matrices (written row by row, rows separated by semicolons)

M_n = (p_{n−1} p_n; q_{n−1} q_n),    n ∈ N,

so that M_0 = identity matrix, and define

M_{−1} = (0 1; 1 0).

Then equations (1.1.11) imply that

M_n = M_{n−1} A_n,    n ∈ N,

where

A_n = (0 1; 1 a_n),    n ∈ N,

with a_0 = 0. Hence

M_n = (0 1; 1 0) ∏_{i=0}^{n} (0 1; 1 a_i),    n ∈ N,

and (1.1.12) is nothing but the equation det M_n = (−1)^n, n ∈ N.
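The matrix identities above can be checked with plain integer arithmetic. The sketch below (helper names ours) multiplies out M_n for the digits 2, 3, 4, 2 and confirms det M_n = (−1)^n:

```python
def matmul(A, B):
    """Product of 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def M(digits):
    """M_n = (0 1; 1 0) A_0 A_1 ... A_n with A_i = (0 1; 1 a_i), a_0 = 0."""
    result = ((0, 1), (1, 0))
    for a in [0] + list(digits):
        result = matmul(result, ((0, 1), (1, a)))
    return result

Mn = M([2, 3, 4, 2])          # columns are (p_{n-1}, q_{n-1}) and (p_n, q_n)
print(Mn)                     # -> ((13, 29), (30, 67)), i.e. [2, 3, 4, 2] = 29/67
(a, b), (c, d) = Mn
det = a * d - b * c
print(det)                    # -> 1 = (-1)^n with n = 4
```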

Clearly, M_{−1}, M_n, A_n ∈ SL(2, Z), n ∈ N, that is, the entries of these 2 × 2 matrices belong to Z and their determinants are equal to either 1 or −1. Recall that any matrix

M = (a b; c d) ∈ SL(2, Z)

can be viewed as a Möbius transformation, denoted by the same letter, of the compactified complex plane C*, defined by

M(z) = (a b; c d)(z) := (az + b)/(cz + d),    z ∈ C*.

With T denoting transpose we also have

M(z) = ((1, 0)M(z, 1)^T)/((0, 1)M(z, 1)^T),    z ∈ C*,

which implies at once that M′M″(z) = M′(M″(z)), z ∈ C*, for any M′, M″ ∈ SL(2, Z). Next, for any z ∈ C and n ∈ N we have

(p_n + zp_{n−1}, q_n + zq_{n−1})^T = M_n(z, 1)^T = M_{n−1}A_n(z, 1)^T = M_{n−1}(1, a_n + z)^T.

In particular, for z = 0 we have

(p_n, q_n)^T = M_n(0, 1)^T = M_{n−1}(1, a_n)^T,    n ∈ N,

whence

M_n(0) = ((1, 0)M_{n−1}(1, a_n)^T)/((0, 1)M_{n−1}(1, a_n)^T) = p_n/q_n = [a_1, ···, a_n] if n ∈ N_+, and 0 if n = 0.    (1.1.10′)

It follows that

M_n(z) = (p_n + zp_{n−1})/(q_n + zq_{n−1}) = [a_1, ···, a_{n−1}, a_n + z],    n ≥ 2,

for any z ∈ C, z ≠ −q_n/q_{n−1}, and

M_1(z) = 1/(a_1 + z) = (p_1 + zp_0)/(q_1 + zq_0)

for any z ∈ C, z ≠ −a_1. Now, (1.1.14) follows from the last two equations by taking z = τ^n(ω), n ≥ 2, respectively z = τ(ω), ω ∈ Ω. Finally, it is obvious by (1.1.10′) that p_n and q_n, n ∈ N_+, can actually be defined by

(p_n, q_n)^T = (0 1; 1 a_1) ··· (0 1; 1 a_n)(0, 1)^T.

It is worth mentioning that any irrational number ω = [a_0; a_1, a_2, ···] ∈ R can be represented in terms of only two elements of SL(2, Z), namely

Q = (0 1; −1 0) and R = (1 1; 0 1),

so that Q(z) = −1/z, R(z) = z + 1, z ∈ C. It is not hard to check that Q and R generate SL(2, Z) and that

ω = lim_{n→∞} R^{a_0}QR^{−a_1}QR^{a_2}Q ··· R^{−a_{2n−1}}QR^{a_{2n}}(z_0)

for any z0 ∈ C. This simple remark is the starting point for understanding by the use of elementary results about continued fractions the behaviour of the geodesic ﬂow on a certain Riemann surface. For details see Series (1982, 1991). See also Adler (1991), Faivre (1993), and Nakada (1995). For another representation of irrationals ω ∈ R in terms of matrices R and L = (P Q)2 Q see Raney (1973). 2 We can now prove the result announced before deﬁning the continuants. Proposition 1.1.1 For any x ∈ [0, 1) we have x − ωn (x) = For any ω ∈ Ω we have 1 1 < |ω − ωn (ω)| < , qn (qn+1 + qn ) qn qn+1 and

n→∞

(−1)n τ n (x) , qn (qn + τ n (x) qn−1 )

n ∈ N.

(1.1.15)

n ∈ N,

(1.1.16)

lim ωn (ω) = ω.

(1.1.17)

Proof. Equation (1.1.15) follows from (1.1.12) and (1.1.14). Next, since
$$\frac{1}{\tau^n(\omega)} = a_{n+1} + \tau^{n+1}(\omega), \qquad n \in \mathbb{N}, \ \omega \in \Omega,$$
by (1.1.11) we have
$$\frac{\tau^n(\omega)}{q_n \left( q_n + \tau^n(\omega)\, q_{n-1} \right)} = \frac{1}{q_n \left( q_n \left( a_{n+1} + \tau^{n+1}(\omega) \right) + q_{n-1} \right)} = \frac{1}{q_n \left( q_{n+1} + q_n \tau^{n+1}(\omega) \right)},$$
and (1.1.16) follows. Finally, (1.1.17) follows from (1.1.16) and (1.1.13). □

Remark. It is easy to see that (1.1.15) implies
$$|x - \omega_n(x)| \leq \frac{1}{q_n q_{n+1}}, \qquad n \in \mathbb{N},$$
for any $x \in [0,1)$. Of course, for a rational $x$ the inequality above is meaningful just for finitely many values of $n \in \mathbb{N}$. □

Notice that (1.1.12) implies that
$$\omega_n - \omega_{n-1} = \frac{(-1)^{n+1}}{q_n q_{n-1}}, \qquad n \in \mathbb{N}_+, \ \omega \in \Omega, \tag{1.1.18}$$

which in conjunction with (1.1.15) yields
$$0 = \omega_0 < \omega_2 < \omega_4 < \cdots < \omega < \cdots < \omega_5 < \omega_3 < \omega_1 \leq 1 \tag{1.1.19}$$

for any ω ∈ Ω. Clearly, the above inequalities also hold for any rational ω ∈ [0, 1) with some inequality signs ‘<’ replaced by ‘≤’. In what follows we shall write ω = [a1 , a2 , · · · ] , ω ∈ Ω,

to mean precisely equation (1.1.17). The next result shows that the continued fraction expansion of an irrational number is unique in a certain sense.

Proposition 1.1.2 Let $(i_n)_{n \in \mathbb{N}_+}$ be a sequence of positive integers. Define the rational numbers
$$\omega_n = [i_1, \cdots, i_n], \qquad n \in \mathbb{N}_+.$$
Then the limit
$$\lim_{n \to \infty} \omega_n = \omega$$
exists, where $\omega \in \Omega$ and, moreover, the $i_n$, $n \in \mathbb{N}_+$, are the continued fraction digits of $\omega$.

Proof. Writing $\omega_n = p_n/q_n$, $n \in \mathbb{N}_+$, $\omega_0 = 0$, where $p_n, q_n \in \mathbb{N}_+$ and g.c.d.$(p_n, q_n) = 1$, it follows from (1.1.18) that
$$\omega_n = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{q_{k-1} q_k}, \qquad n \in \mathbb{N}_+.$$

As $q_k$ increases with $k$, Leibniz's alternating series test ensures the existence of $\lim_{n \to \infty} \omega_n = \omega$, say, and (1.1.19) shows that $0 < \omega < 1$. It remains to show that $a_n(\omega) = i_n$, $n \in \mathbb{N}_+$. This will also prove that $\omega \in \Omega$, since if $\omega \in \mathbb{Q}$ then we should have $a_m(\omega) = a_{m+1}(\omega) = \cdots = \infty$ for some $m \in \mathbb{N}_+$. As
$$\omega_n = \frac{1}{i_1 + [i_2, \cdots, i_n]}, \qquad n \geq 2, \tag{1.1.20}$$
it is sufficient to show that $a_1(\omega) = \lfloor 1/\omega \rfloor = i_1$. This follows from (1.1.20) letting $n \to \infty$ and noting that $\lim_{n \to \infty} [i_2, \cdots, i_n]$ exists and lies in the open interval $(0,1)$. □
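The recursions above lend themselves to a quick numerical check. The following sketch (our own illustration, not from the book; the function names are ours) computes continued fraction digits and convergents for $\omega = \sqrt{2} - 1 = [2, 2, 2, \cdots]$ and verifies the two-sided bound (1.1.16) and the interlacing (1.1.19):

```python
import math
from fractions import Fraction

def cf_digits(x, n):
    """First n continued fraction digits a_1, a_2, ... of an irrational x in (0, 1)."""
    out = []
    for _ in range(n):
        y = 1 / x
        a = int(y)          # a_k = floor(1 / tau^(k-1)(x))
        out.append(a)
        x = y - a           # apply the transformation tau
    return out

def convergents(digits):
    """Convergents p_k/q_k via q_k = a_k q_{k-1} + q_{k-2} (and likewise for p_k)."""
    p_prev, q_prev, p, q = 1, 0, 0, 1       # (p_{-1}, q_{-1}) and (p_0, q_0)
    out = []
    for a in digits:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        out.append(Fraction(p, q))
    return out

omega = math.sqrt(2) - 1                    # digits 2, 2, 2, ...
digits = cf_digits(omega, 12)
assert digits == [2] * 12

cs = convergents(digits)
# (1.1.16): 1/(q_n(q_{n+1} + q_n)) < |omega - omega_n| < 1/(q_n q_{n+1})
for k in range(10):
    qn, qn1 = cs[k].denominator, cs[k + 1].denominator
    err = abs(omega - float(cs[k]))
    assert 1 / (qn * (qn1 + qn)) < err < 1 / (qn * qn1)
# (1.1.19): omega_2 < omega_4 < omega < omega_3 < omega_1
assert float(cs[1]) < float(cs[3]) < omega < float(cs[2]) < float(cs[0])
```

Floating point suffices here because only a short prefix of digits is extracted; for longer expansions one would work with exact arithmetic.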

1.1.3 Some special continued fraction expansions

The continued fraction expansion of a real number is a fundamental representation of it through its connection with the Euclidean algorithm and with ‘best’ rational approximations [see, e.g., Hardy and Wright (1979, Ch. 11)]. At the same time very little is known about the explicit continued fraction expansions of some interesting numbers. We already know that these expansions are finite (i.e., terminating) exactly for rational numbers. Also, by a well known theorem of J.-L. Lagrange [for all classical non-metric results the basic reference is Perron (1954, 1957)], the sequence of digits of an irrational number $x$ is eventually periodic if and only if $x$ is a quadratic irrationality. Here ‘eventually periodic’ means that if $x = [a_0; a_1, a_2, \cdots]$, then there exist $k \in \mathbb{N}$ and $\ell \in \mathbb{N}_+$ such that $a_n = a_{n+\ell}$ for any $n \geq k$, and we use the notation
$$x = \begin{cases} [\,\overline{a_0; a_1, \cdots, a_{\ell-1}}\,] & \text{if } k = 0, \\ [a_0; \overline{a_1, \cdots, a_\ell}\,] & \text{if } k = 1, \\ [a_0; a_1, \cdots, a_{k-1}, \overline{a_k, \cdots, a_{k+\ell-1}}\,] & \text{if } k \geq 2 \end{cases}$$
as a convenient abbreviation. The smallest such $\ell \in \mathbb{N}_+$ is called the period length of $x$. If we can take $k = 0$, then $x$ is called purely periodic. Next, a quadratic irrationality is a number of the form
$$x = \frac{a + \sqrt{b}}{c},$$

where $b \in \mathbb{N}_+$ is not a perfect square, and $a, c \in \mathbb{Z}$, $c \neq 0$. Then $\bar{x} = (a - \sqrt{b})/c$ is called the algebraic conjugate of $x$. A purely periodic quadratic irrationality $x$ is characterized by the inequalities $x > 1$, $-1 < \bar{x} < 0$. We have, for example,
$$\frac{1 + \sqrt{7}}{2} = [\,\overline{1; 1, 4, 1}\,] \quad \text{and} \quad \frac{1 + \sqrt{2}}{3} = [1, \overline{4, 8}\,].$$
The first quadratic irrationality above is purely periodic and has period length 4 while the second one has period length 2 but is not purely periodic.

Apart from that, the continued fraction expansion of even a single additional algebraic number is not explicitly known. We do not even know whether the sequence of digits is unbounded for such a number. [In connection with this matter see, however, Brjuno (1964) and Richtmyer (1975).] For transcendental numbers of interest it is not clear when to expect a continued fraction expansion with a good ‘pattern’. For example, in a paper titled De Fractionibus Continuis, published in 1737, Leonhard Euler gave a nice continued fraction expansion for $e = \sum_{n \in \mathbb{N}} 1/n!$, namely
$$e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, \cdots, 1, 2n, 1, \cdots].$$
In this expansion the digits are eventually comprised of a meshing of two arithmetic progressions, one of which has zero common difference while the other has difference two. Generalizing the above result, Euler showed (the overline in the notation indicating infinite arithmetic progressions) that
$$e^{1/n} = [1; \overline{n - 1 + 2in,\ 1,\ 1}\,]_{i \in \mathbb{N}} = [1; n-1, 1, 1, 3n-1, 1, 1, 5n-1, 1, \cdots]$$
for any $1 < n \in \mathbb{N}_+$, and
$$e^{2/n} = [1; \overline{(n-1)/2 + 3in,\ 6n + 12in,\ (5n-1)/2 + 3in,\ 1,\ 1}\,]_{i \in \mathbb{N}}$$
$$= [1; (n-1)/2, 6n, (5n-1)/2, 1, 1, (7n-1)/2, 18n, (11n-1)/2, 1, \cdots]$$
for any odd $n \in \mathbb{N}_+$ greater than 1.

Recently, Clemens et al. (1995) have given explicit formulae relating continued fraction expansions with almost periodic or almost symmetric patterns in their digits, and series whose terms satisfy certain recurrence relations. The method developed by these authors ties together as a single

phenomenon previous results by Davison and Shallit (1991), Köhler (1980), Pethő (1982), Shallit (1979, 1982 a,b), van der Poorten and Shallit (1992), and Tamura (1991), who have found continued fraction expansions for numbers expressed by certain types of series.

On the other hand, nobody has made any sense out of the pattern in the continued fraction expansion for $\pi$:
$$\pi = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, \cdots].$$
The digits of $\pi$ do not appear to follow any pattern and are widely suspected to be in some sense random. There is a vague folklore statement [cf. Thakur (1996)] that the nice patterns come from the connection with hypergeometric functions and the representation of the latter by certain generalized continued fraction expansions. For more on that see Chudnovsky and Chudnovsky (1991, 1993).

Remark. Using the continued fraction expansion for $e$, Alzer (1998) proved that
$$\min_{p,q \in \mathbb{N}_+,\, q \geq 3} \frac{q^2 \log q}{\log\log q} \left| e - \frac{p}{q} \right|$$
exists and is only attained at the 19th convergent of $e$,
$$\frac{p_{19}}{q_{19}} = \frac{28\,245\,729}{10\,391\,013},$$
thus it is equal to
$$\frac{(10\,391\,013)^2 \log 10\,391\,013}{\log\log 10\,391\,013} \left| e - \frac{28\,245\,729}{10\,391\,013} \right| = 0.386\,249\,199\,819\cdots.$$
Further, the inequality
$$\frac{q^2 \log q}{\log\log q} \left| e - \frac{p}{q} \right| < c$$
has infinitely many solutions in integers $p, q \in \mathbb{N}_+$ if and only if $c \geq 1/2$. For further developments see Elsner (1999). □
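The patterns just discussed are easy to observe numerically. The sketch below (our own illustration, not from the book) recovers the start of Euler's digit pattern for $e$ from a high-precision rational truncation of $\sum_k 1/k!$, and exhibits the period $(1, 1, 4, 1)$ of $(1+\sqrt{7})/2$:

```python
import math
from fractions import Fraction

def cf_digits_rational(x, n):
    """Digits a_0; a_1, a_2, ... of a positive rational x (regular continued fraction)."""
    out = []
    for _ in range(n):
        a = x.numerator // x.denominator
        out.append(a)
        if x == a:
            break
        x = 1 / (x - a)
    return out

def cf_digits_float(x, n):
    """Digits of a real x > 0 computed in floating point (fine for short prefixes)."""
    out = []
    for _ in range(n):
        a = math.floor(x)
        out.append(a)
        x = 1 / (x - a)
    return out

# e = sum 1/k!, truncated far enough that the first digits agree with those of e
e_approx = sum(Fraction(1, math.factorial(k)) for k in range(30))
assert cf_digits_rational(e_approx, 13) == [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1]

# (1 + sqrt 7)/2 is purely periodic with period (1, 1, 4, 1)
assert cf_digits_float((1 + math.sqrt(7)) / 2, 8) == [1, 1, 4, 1] * 2
```

The truncation at 30 terms is arbitrary but more than sufficient: the truncation error $\sim 1/31!$ is far below the resolution $1/(q_n q_{n+1})$ of the first dozen convergents.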

1.2 Basic metric properties

1.2.1 Defining random variables of interest

By (1.1.2$'$) the incomplete quotients $a_n$, $n \in \mathbb{N}_+$, of the irrationals in $I$ are defined by
$$a_1(\omega) = \lfloor 1/\omega \rfloor, \qquad a_n(\omega) = a_1\!\left( \tau^{n-1}(\omega) \right), \qquad \omega \in \Omega, \ n \in \mathbb{N}_+.$$

If we define $a_1(0) = \infty$ then the above equations also define the incomplete quotients for the rational numbers in $[0,1)$. As we have noted in Subsection 1.1.1, for any rational $x \in [0,1)$ there exists $n = n(x) \in \mathbb{N}_+$ such that $a_m(x) = \infty$ for any $m \geq n$. The metric point of view in studying the sequence $(a_n)_{n \in \mathbb{N}_+}$ is to consider that the $a_n$, $n \in \mathbb{N}_+$, are $\mathbb{N}_+$-valued random variables on $(I, \mathcal{B}_I)$ which are defined $\mu$-a.s. in $I$ for any probability measure $\mu$ on $\mathcal{B}_I$ assigning measure 0 to the rationals in $I$. (Such a $\mu$ is clearly Lebesgue measure $\lambda$.) Alternatively, we can look at the $a_n$, $n \in \mathbb{N}_+$, as $\mathbb{N}_+ \cup \{\infty\}$-valued random variables which are defined everywhere in $[0,1)$. It is clear, for example, that
$$a_1(0) = \infty, \qquad a_1(x) = 1 \text{ for } x \in \left( \tfrac12, 1 \right), \qquad a_1(x) = i \text{ for } x \in \left( \tfrac{1}{i+1}, \tfrac{1}{i} \right], \ i \geq 2,$$
$$a_2(0) = a_2\!\left( \tfrac{1}{i} \right) = \infty, \ i \in \mathbb{N}_+, \qquad a_2(x) = 1 \text{ for } x \in \bigcup_{i \in \mathbb{N}_+} \left( \tfrac{1}{i+1}, \tfrac{1}{i+1/2} \right),$$
$$a_2(x) = j \text{ for } x \in \bigcup_{i \in \mathbb{N}_+} \left[ \tfrac{1}{i+1/j}, \tfrac{1}{i+1/(j+1)} \right), \ j \geq 2.$$

The distinction between the two cases is nevertheless immaterial as we shall only consider probability measures on $\mathcal{B}_I$ assigning measure 0 to the rationals in $I$. The probability structure of $(a_n)_{n \in \mathbb{N}_+}$ under $\lambda$ will be given later. See Proposition 1.2.7.

Let us define some related random variables. For any $n \in \mathbb{N}_+$ put
$$r_n = \frac{1}{\tau^{n-1}} = [a_n; a_{n+1}, a_{n+2}, \cdots], \tag{1.2.1}$$
$$s_n = \frac{q_{n-1}}{q_n}, \qquad y_n = \frac{q_n}{q_{n-1}} = \frac{1}{s_n}, \tag{1.2.2}$$
$$u_n(\omega) = q_{n-1}^{-2} \left| \omega - \frac{p_{n-1}}{q_{n-1}} \right|^{-1}, \qquad \omega \in \Omega, \tag{1.2.3}$$
where, as usual, $p_n/q_n = [a_1, \cdots, a_n]$, $n \in \mathbb{N}_+$, is the $n$th convergent, $p_0 = 0$, $q_0 = 1$. Note that $q_n = y_1 \cdots y_n = (s_1 \cdots s_n)^{-1}$, $n \in \mathbb{N}_+$. Next, it follows from the first equation (1.1.11) that
$$\frac{1}{s_n} = a_n + s_{n-1}, \qquad n \in \mathbb{N}_+,$$
with $s_0 = 0$. Hence
$$s_n = [a_n, \cdots, a_1], \qquad n \in \mathbb{N}_+. \tag{1.2.2$'$}$$
Finally, using (1.1.15) it is easy to see that
$$u_n = s_{n-1} + r_n, \qquad n \in \mathbb{N}_+. \tag{1.2.3$'$}$$

In what follows we shall refer to the qn , rn , sn , un , yn , n ∈ N+ , as associated (with (an )n∈N+ ) random variables. It is clear that 0 < sn < 1 whilst rn , un , yn > 1, n ∈ N+ . We defer to Subsection 1.2.3 the study of distributional properties under λ of the associated random variables.
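A short numerical experiment (ours, not the book's) illustrates these identities for $\omega = \sqrt{3} - 1$, whose digits are $1, 2, 1, 2, \cdots$; it checks both $s_n = [a_n, \cdots, a_1]$ from (1.2.2$'$) and $u_n = s_{n-1} + r_n$ from (1.2.3$'$):

```python
import math

omega = math.sqrt(3) - 1          # digits 1, 2, 1, 2, ...
N = 8
a, tails, x = [], [], omega
for _ in range(N):
    tails.append(x)               # tails[n-1] = tau^(n-1)(omega)
    d = int(1 / x)
    a.append(d)
    x = 1 / x - d

P, Q = [1, 0], [0, 1]             # P[k+1]/Q[k+1] = p_k/q_k, starting at k = -1
for d in a:
    P.append(d * P[-1] + P[-2])
    Q.append(d * Q[-1] + Q[-2])

s_prev = 0.0                      # s_0 = 0
for n in range(1, N + 1):
    s_n = Q[n] / Q[n + 1]                               # (1.2.2): q_{n-1}/q_n
    r_n = 1 / tails[n - 1]                              # (1.2.1): 1/tau^(n-1)
    u_n = 1 / (Q[n] ** 2 * abs(omega - P[n] / Q[n]))    # (1.2.3)
    # (1.2.2'): s_n = [a_n, ..., a_1], via s_k = 1/(a_k + s_{k-1})
    back = 0.0
    for d in a[:n]:
        back = 1 / (d + back)
    assert math.isclose(s_n, back, rel_tol=1e-12)
    # (1.2.3'): u_n = s_{n-1} + r_n
    assert math.isclose(u_n, s_prev + r_n, rel_tol=1e-9)
    s_prev = s_n
```

Note that for $n = 1$ the identity reduces to $u_1 = r_1 = 1/\omega$, since $s_0 = 0$, $p_0 = 0$ and $q_0 = 1$.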

1.2.2 Gauss' problem and measure

Of paramount importance for the metric theory of the continued fraction expansion, actually its first basic result, is the asymptotic behaviour as $n \to \infty$ of the distribution function
$$F_n(x) = \lambda(\tau^n < x) = \lambda\!\left( \tau^{-n}([0,x)) \right), \qquad x \in I,$$
of $\tau^n$. C.F. Gauss wrote on 25th October 1800 in his diary that (in modern notation)
$$\lim_{n \to \infty} F_n(x) = \frac{\log(x+1)}{\log 2}, \qquad x \in I.$$
Gauss' proof has never been found. Later, in a letter dated 30th January 1812, Gauss asked Laplace what we now call:

Gauss' Problem. Estimate the error
$$e_n(x) := F_n(x) - \frac{\log(x+1)}{\log 2}, \qquad n \in \mathbb{N}, \ x \in I.$$
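The convergence Gauss recorded can be seen in a crude Monte Carlo experiment (our own sketch, for illustration only; the sample sizes are arbitrary): draw points uniformly, i.e., under $\lambda$, apply $\tau$ a few times, and compare the empirical distribution function of $\tau^n$ with $\log(x+1)/\log 2$:

```python
import math, random

def tau(x):
    """The continued fraction map tau(x) = 1/x - floor(1/x), with tau(0) = 0."""
    if x == 0.0:
        return 0.0
    y = 1.0 / x
    return y - math.floor(y)

random.seed(1)
n_iter, n_samples, x0 = 10, 200_000, 0.25
hits = 0
for _ in range(n_samples):
    w = random.random()
    for _ in range(n_iter):
        w = tau(w)
    if w < x0:
        hits += 1
F_n = hits / n_samples
gauss_limit = math.log(1 + x0) / math.log(2)   # = 0.3219...
assert abs(F_n - gauss_limit) < 0.01
```

Already for $n = 10$ the deviation $e_n(x_0)$ is far below the Monte Carlo noise, consistent with the exponential rates discussed below.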

Gauss' letter has been published on pages 371–372 of his Werke, Volume 1, Section 1, Teubner, Leipzig, 1917. Almost the whole letter is reproduced on pages 396–397 of J.V. Uspensky's Introduction to Mathematical Probability, McGraw-Hill, New York, 1937. See also Gray (1984, p. 123) for other historical details about Gauss' problem.

The first one to give a solution to Gauss' problem (implicitly proving Gauss' 1800 assertion) was R.O. Kuzmin, who showed in 1928 [see Kuzmin (1928, 1932)] that $e_n(x) = O(q^{\sqrt{n}})$ as $n \to \infty$, with $0 < q < 1$, uniformly in $x \in I$. Kuzmin's proof is reproduced in Khintchine (1956, 1963, 1964). Independently, Paul Lévy showed one year later [see Lévy (1929) and also Lévy (1954, Ch. IX)] that $|e_n(x)| \leq q^n$, $n \in \mathbb{N}_+$, $x \in I$, with $q = 3.5 - 2\sqrt{2} = 0.67157\cdots$. We present a slightly improved version of Lévy's solution in Subsection 1.3.5. Using Kuzmin's approach, Szűsz (1961) claimed to have lowered the Lévy estimate for $q$ to 0.4. Actually, Szűsz's argument yields just 0.485 rather than 0.4. The optimal value of $q$ was determined by Wirsing (1974), who found that it is equal to $0.303\,663\,002\cdots$. Chapter 2 is devoted to a thorough treatment of Gauss' problem. In particular, Corollary 2.3.6 provides a complete solution to a generalization of it, where the interval $[0,x)$, $x \in I$, is replaced by an arbitrary set $A \in \mathcal{B}_I$.

The limiting distribution function $\log(x+1)/\log 2$, $x \in I$, occurring in Gauss' problem motivates the introduction of what we now call Gauss' measure $\gamma$, which is defined on $\mathcal{B}_I$ by
$$\gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{x+1}, \qquad A \in \mathcal{B}_I.$$

Then clearly $\gamma([0,x]) = \log(x+1)/\log 2$, $x \in I$. We are going to prove that $\gamma$ and $\tau$ enjoy an important property.

First, we note that $\tau$ does not preserve $\lambda$. This means that we do not have $\lambda(\tau^{-1}(A)) = \lambda(A)$ for any $A \in \mathcal{B}_I$. Indeed, for, e.g., $A = (1/2, 1)$ we have
$$\tau^{-1}(A) = \bigcup_{i \in \mathbb{N}_+} \left( \frac{1}{i+1}, \frac{1}{i+1/2} \right)$$
and
$$\lambda\!\left( \tau^{-1}(A) \right) = \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i+1/2} - \frac{1}{i+1} \right) = 2 \sum_{i \in \mathbb{N}_+} \left( \frac{1}{2i+1} - \frac{1}{2i+2} \right) = 2\left( \log 2 - 1 + \frac12 \right) = 2\log 2 - 1,$$
while $\lambda(A) = 1/2$.

Instead, $\tau$ does preserve $\gamma$, and we state this result formally; it is a basic one in the metric theory of the RCF expansion.

Theorem 1.2.1 Gauss' measure $\gamma$ is preserved by $\tau$, and the sequence $(a_n)_{n \in \mathbb{N}_+}$ is strictly stationary under $\gamma$.

Proof. We should show that
$$\gamma\!\left( \tau^{-1}(A) \right) = \gamma(A), \qquad A \in \mathcal{B}_I.$$
For this it is enough to show that the above equation holds for any interval $A = (0,u]$, $0 < u \leq 1$. As
$$\tau^{-1}((0,u]) = \bigcup_{i \in \mathbb{N}_+} \left[ \frac{1}{u+i}, \frac{1}{i} \right),$$
we only need to verify that
$$\int_0^u \frac{dx}{x+1} = \sum_{i \in \mathbb{N}_+} \int_{1/(u+i)}^{1/i} \frac{dx}{x+1},$$
which is an easy exercise. Since $a_n = a_1 \circ \tau^{n-1}$, $n \in \mathbb{N}_+$, the second assertion is obvious. □

Remark. The expectation of $a_1$ under $\gamma$ is infinite. Indeed,
$$\frac{1}{\log 2} \int_0^1 \frac{a_1(x)}{x+1}\, dx = \frac{1}{\log 2} \sum_{i \in \mathbb{N}_+} i \int_{1/(i+1)}^{1/i} \frac{dx}{x+1} = \infty. \qquad □$$
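Both facts, that $\gamma$ is $\tau$-invariant while $\lambda$ is not, can be checked numerically by summing over the inverse branches $x \mapsto 1/(x+i)$ of $\tau$ (a sketch of ours; the truncation points are arbitrary):

```python
import math

LOG2 = math.log(2)

def gamma(a, b):
    """Gauss measure of the interval (a, b)."""
    return math.log((1 + b) / (1 + a)) / LOG2

u = 0.37
# gamma(tau^{-1}((0, u])) as a sum over the branch intervals [1/(u+i), 1/i)
inv_gamma = sum(gamma(1 / (u + i), 1 / i) for i in range(1, 200_000))
assert abs(inv_gamma - gamma(0.0, u)) < 1e-4          # Theorem 1.2.1

# Lebesgue measure is not preserved: lambda(tau^{-1}((1/2, 1))) = 2 log 2 - 1
inv_lambda = sum(1 / (i + 0.5) - 1 / (i + 1) for i in range(1, 200_000))
assert abs(inv_lambda - (2 * LOG2 - 1)) < 1e-4
assert abs(inv_lambda - 0.5) > 0.1                    # != lambda((1/2, 1)) = 1/2
```

The neglected tails decay like $i^{-2}$, so truncating the sums at $2 \times 10^5$ terms leaves an error well below the tolerances used.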

1.2.3 Fundamental intervals, and applications

For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ define
$$I(i^{(n)}) = \{ \omega \in \Omega : a_k(\omega) = i_k, \ 1 \leq k \leq n \}.$$
For example, for any $i \in \mathbb{N}_+$ we have
$$I(i) = \{ \omega \in \Omega : a_1(\omega) = i \} = \Omega \cap \left( \frac{1}{i+1}, \frac{1}{i} \right).$$
We are going to prove that any $I(i^{(n)})$ is the set of irrationals from a certain open interval with rational endpoints. The sets $I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, are

called fundamental intervals of rank $n$. Let us make the convention that $I(i^{(0)}) = \Omega$.

Theorem 1.2.2 For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ let
$$\frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}], \qquad \frac{p_n}{q_n} = [i_1, \cdots, i_n]$$
with g.c.d.$(p_{n-1}, q_{n-1}) = $ g.c.d.$(p_n, q_n) = 1$, $p_0 = 0$, $q_0 = 1$. Then
$$I(i^{(n)}) = \Omega \cap \left( u(i^{(n)}), v(i^{(n)}) \right), \tag{1.2.4}$$
where
$$u(i^{(n)}) = \begin{cases} \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is odd}, \\[1.5ex] \dfrac{p_n}{q_n} & \text{if } n \text{ is even}, \end{cases} \qquad v(i^{(n)}) = \begin{cases} \dfrac{p_n}{q_n} & \text{if } n \text{ is odd}, \\[1.5ex] \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is even}. \end{cases}$$
We have
$$\frac{p_n + p_{n-1}}{q_n + q_{n-1}} = \begin{cases} [i_1 + 1] & \text{if } n = 1, \\ [i_1, \cdots, i_{n-1}, i_n + 1] & \text{if } n > 1, \end{cases}$$
$$\lambda(I(i^{(n)})) = \frac{1}{q_n (q_n + q_{n-1})}, \tag{1.2.5}$$
and
$$\max_{i^{(n)} \in \mathbb{N}_+^n} \lambda(I(i^{(n)})) = \lambda(I(1(n))) = \frac{1}{F_n F_{n+1}}, \qquad n \in \mathbb{N}_+, \tag{1.2.6}$$
with $1(n) = (i_1, \cdots, i_n)$, where $i_1 = \cdots = i_n = 1$.

Proof. Since $[i_1, \cdots, i_{n-1}, i_n + \omega] \in I(i^{(n)})$, $n \geq 2$, and $[i_1 + \omega] \in I(i_1)$ for any $\omega \in \Omega$, we have $\tau^n(I(i^{(n)})) = \Omega$ for any $n \in \mathbb{N}_+$ and $i^{(n)} \in \mathbb{N}_+^n$. In conjunction with (1.1.14) this proves (1.2.4). It thus appears that $I(i^{(n)})$ is the image of $\Omega$ under the map
$$\omega \mapsto \frac{p_n + \omega p_{n-1}}{q_n + \omega q_{n-1}}, \qquad \omega \in \Omega.$$

Next, (1.2.5) follows from (1.2.4) and (1.1.12). Finally, (1.2.6) is an immediate consequence of (1.2.5), as the minimum of $q_n$ is attained for $i_1 = \cdots = i_n = 1$ [cf. (1.1.13)]. □

Remark. When denoting by $p_n$ and $q_n$, $n \in \mathbb{N}_+$, quantities seemingly different from those already defined in Subsection 1.1.2, we clearly abused the notation. However, it should be noted that, according to the context, $p_n$ and $q_n$ will appear to be functions either of $\omega \in \Omega$ or of $i^{(n)} \in \mathbb{N}_+^n$. Actually, $p_n(i^{(n)})$ ($q_n(i^{(n)})$) is the common value of $p_n$ ($q_n$) as defined in Subsection 1.1.2 at all points $\omega \in I(i^{(n)})$, $n \in \mathbb{N}_+$. □

Corollary 1.2.3 For $p, q \in \mathbb{N}_+$ with $p < q$ and g.c.d.$(p,q) = 1$ let
$$\frac{p}{q} = [i_1, \cdots, i_n] = [i_1, \cdots, i_{n-1}, i_n - 1, 1]$$
for some $n = n(p/q) \in \mathbb{N}_+$, where $i_n \geq 2$. Define
$$\frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}], \qquad \frac{p_n^-}{q_n^-} = [i_1, \cdots, i_{n-1}, i_n - 1]$$
with g.c.d.$(p_{n-1}, q_{n-1}) = $ g.c.d.$(p_n^-, q_n^-) = 1$, $p_0 = 0$, $q_0 = 1$, and
$$I_{p/q} = \left\{ \omega \in \Omega : \frac{p}{q} \text{ is a convergent of } \omega \right\}.$$
Then

$$I_{p/q} = I(i_1, \cdots, i_n) \cup I(i_1, \cdots, i_{n-1}, i_n - 1, 1) = \begin{cases} \Omega \cap \left( \dfrac{p + p_{n-1}}{q + q_{n-1}}, \dfrac{p + p_n^-}{q + q_n^-} \right) & \text{if } n \text{ is odd}, \\[1.5ex] \Omega \cap \left( \dfrac{p + p_n^-}{q + q_n^-}, \dfrac{p + p_{n-1}}{q + q_{n-1}} \right) & \text{if } n \text{ is even}, \end{cases} \tag{1.2.7}$$
and
$$\lambda(I_{p/q}) = \frac{3}{(q + q_{n-1})(q + q_n^-)}.$$
We have
$$\max_{\{p,q \in \mathbb{N}_+ :\, n(p/q) = n\}} \lambda(I_{p/q}) = \lambda(I_{F_n/F_{n+1}}) = \frac{3}{(F_{n-1} + F_{n+1}) F_{n+2}}, \qquad n \in \mathbb{N}_+.$$
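As an illustration (our own sketch, not the book's): take $p/q = 2/5 = [2,2]$, so $n = 2$, $q_{n-1} = 2$ (from $[2] = 1/2$) and $q_n^- = 3$ (from $[2,1] = 1/3$); the corollary gives $\lambda(I_{2/5}) = 3/((5+2)(5+3)) = 3/56$, which a Monte Carlo frequency count reproduces:

```python
import math, random
from fractions import Fraction

def convergents(x, n):
    """First n convergents of x in (0, 1), computed in floating point."""
    out, p_prev, q_prev, p, q = [], 1, 0, 0, 1
    for _ in range(n):
        y = 1 / x
        a = int(y)
        x = y - a
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        out.append(Fraction(p, q))
        if x == 0:
            break
    return out

random.seed(7)
target, trials, hits = Fraction(2, 5), 100_000, 0
for _ in range(trials):
    w = random.random()
    if w == 0.0:
        continue
    if target in convergents(w, 6):
        hits += 1
assert abs(hits / trials - 3 / 56) < 0.005      # 3/56 = 0.0536...
```

Six convergents suffice here because $2/5$ can occur only as a second or third convergent, corresponding to the digit prefixes $(2,2)$ and $(2,1,1)$.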

Proof. By (1.1.11) we have
$$p = p_n^- + p_{n-1}, \qquad q = q_n^- + q_{n-1}.$$
It then follows from (1.2.4) that
$$\Omega \cap I(i_1, \cdots, i_{n-1}, i_n - 1, 1) = \begin{cases} \Omega \cap \left( \dfrac{p}{q}, \dfrac{p + p_n^-}{q + q_n^-} \right) & \text{if } n \text{ is odd}, \\[1.5ex] \Omega \cap \left( \dfrac{p + p_n^-}{q + q_n^-}, \dfrac{p}{q} \right) & \text{if } n \text{ is even}, \end{cases}$$
while, by (1.2.4) again,
$$\Omega \cap I(i_1, \cdots, i_n) = \begin{cases} \Omega \cap \left( \dfrac{p + p_{n-1}}{q + q_{n-1}}, \dfrac{p}{q} \right) & \text{if } n \text{ is odd}, \\[1.5ex] \Omega \cap \left( \dfrac{p}{q}, \dfrac{p + p_{n-1}}{q + q_{n-1}} \right) & \text{if } n \text{ is even}. \end{cases}$$
The last two equations show that (1.2.7) holds. To compute $\lambda(I_{p/q})$ we have to use (1.1.12) three times. Finally, we should note that the maximum of $\lambda(I_{p/q})$ is obtained for $i_1 = \cdots = i_{n-1} = 1$, $i_n = 2$. □

Corollary 1.2.4 (Legendre's theorem) For $\omega \in \Omega$ and $p, q \in \mathbb{N}_+$ with $p < q$ and g.c.d.$(p,q) = 1$ let
$$\frac{p}{q} = [i_1, \cdots, i_n], \qquad \frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}]$$
with $p_0 = 0$, $q_0 = 1$, where the length $n = n(p/q) \in \mathbb{N}_+$ of the continued fraction expansion of $p/q$ is chosen in such a way that it is even if $p/q < \omega$ and odd otherwise. Define
$$\Theta = q^2 \left| \omega - \frac{p}{q} \right|.$$
Then
$$\Theta < \frac{q}{q + q_{n-1}} \quad \text{if and only if} \quad \frac{p}{q} \text{ is a convergent of } \omega.$$
In particular, if $\Theta \leq 1/2$ then $p/q$ is a convergent of $\omega$.

Proof. If $p/q$ is a convergent of $\omega$, then by (1.1.15) we have
$$\Theta = q^2 \left| \omega - \frac{p}{q} \right| = \frac{q\, \tau^n(\omega)}{q + \tau^n(\omega)\, q_{n-1}} < \frac{q}{q + q_{n-1}}.$$
Conversely, if $\Theta < q/(q + q_{n-1})$ then
$$\left| \omega - \frac{p}{q} \right| < \frac{1}{q (q + q_{n-1})}. \tag{1.2.8}$$
Assuming that $p/q < \omega$, that is, $n$ is even, from (1.2.8) we obtain
$$\frac{p}{q} < \omega < \frac{p}{q} + \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}}$$
[by (1.1.12)]. Similarly, assuming that $p/q > \omega$, that is, $n$ is odd, we obtain
$$\frac{p}{q} > \omega > \frac{p}{q} - \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}}$$
[by (1.1.12) again]. In both cases we thus have $\omega \in I(i_1, \cdots, i_n)$. Hence $p/q = [i_1, \cdots, i_n]$ is a convergent of $\omega$. The special case follows from the inequality $q/(q + q_{n-1}) > 1/2$, which holds since $q > q_{n-1}$. □

Corollary 1.2.5 For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ we have
$$\gamma(a_k = i_1, \cdots, a_{k+n-1} = i_n) = \frac{1}{\log 2} \log \frac{1 + v(i^{(n)})}{1 + u(i^{(n)})}, \qquad k \in \mathbb{N}_+. \tag{1.2.9}$$
In particular,
$$\gamma(a_k = i) = \frac{1}{\log 2} \log \frac{(i+1)^2}{i(i+2)} = \frac{1}{\log 2} \log \left( 1 + \frac{1}{i(i+2)} \right)$$
for any $k, i \in \mathbb{N}_+$.

Proof. Theorem 1.2.1 and equation (1.2.4). □

Corollary 1.2.6 (Brodén–Borel–Lévy formula) For any $n \in \mathbb{N}_+$ we have
$$\lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{(s_n + 1)\, x}{s_n x + 1}, \qquad x \in I, \tag{1.2.10}$$
where $s_n$ is defined by (1.2.2) or (1.2.2$'$).

Proof. Clearly, for any $n \in \mathbb{N}_+$ and $x \in I$,
$$\lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{\lambda((\tau^n < x) \cap I(a_1, \cdots, a_n))}{\lambda(I(a_1, \cdots, a_n))}.$$

Assuming that p/q < ω, that is, n is even, from (1.2.8) we obtain p p 1 p + pn−1 <ω< + = q q q (q + qn−1 ) q + qn−1 [by (1.1.12)]. Similarly, assuming that p/q > ω, that is, n is odd, we obtain p p 1 p + pn−1 >ω> − = q q q (q + qn−1 ) q + qn−1 [by (1.1.12) again]. In both cases we thus have ω ∈ I (i1 , · · · , in ). Hence p/q = [i1 , · · · , in ] is a convergent of ω. The special case follows from the inequality q / (q + qn−1 ) > 1/2 which holds since q > qn−1 . 2 Corollary 1.2.5 For any n ∈ N+ and i(n) = (i1 , · · · , in ) ∈ N+ we have γ (ak = i1 , · · · , ak+n−1 = in ) = In particular, γ (ak = i) = for any k, i ∈ N+ . Proof. Theorem 1.2.1 and equation (1.2.4). 2 Corollary 1.2.6 (Brod´n–Borel–L´vy formula) For any n ∈ N+ we e e have (sn + 1) x λ (τ n < x | a1 , · · · , an ) = , x ∈ I, (1.2.10) sn x + 1 where sn is deﬁned by (1.2.2) or (1.2.2 ). Proof. Clearly, for any n ∈ N+ and x ∈ I, λ (τ n < x | a1 , · · · , an ) = λ ((τ n < x) ∩ I (a1 , · · · , an )) . λ (I (a1 , · · · , an )) 1 (i + 1)2 1 1 log = log 1 + log 2 i (i + 2) log 2 i (i + 2) (1.2.9) 1 1 + v(i(n) ) log , log 2 1 + u(i(n) ) k ∈ N+ .

22 By (1.1.14) and (1.2.4) we have (τ n < x) ∩ I (a1 , · · · , an ) = pn pn + xpn−1 <ω< qn + xqn−1 qn pn pn + xpn−1 <ω< qn qn + xqn−1

Chapter 1

ω∈Ω:

if n is odd,

ω∈Ω:

if n is even.

Hence, using (1.2.5) and (1.1.12), λ (τ n < x | a1 , · · · , an ) = qn (qn + qn−1 ) x (sn + 1) x = qn (qn + xqn−1 ) sn x + 1 2

for any n ∈ N+ and x ∈ I, and the proof is complete.

Remark. For x ∈ N+ equation (1.2.10) has been obtained by the Swedish mathematician T. Brod´n as early as 1900 [see Brod´n (1900, p. 246)], nine e e ´ years before E. Borel [see Borel (1909)]. L´vy (1929) also obtained and e used (1.2.10). This equation was called the Borel-L´vy formula by Doeblin e (1940). A generalization of (1.2.10) will be given in Proposition 1.3.8. 2 The Brod´n–Borel–L´vy formula (1.2.10) allows us to determine the e e probability structure of (an )n∈N+ under λ. Proposition 1.2.7 For any i, n ∈ N+ we have λ (a1 = i) = 1 , i (i + 1) (1.2.11)

λ (an+1 = i | a1 , · · · , an ) = Pi (sn ) , where Pi (x) = x+1 , (x + i) (x + i + 1) x ∈ I.

(1.2.12) (1.2.13)

Proof. As we have already noted, ( ω ∈ Ω : a1 (ω ) = i ) = Ω ∩ and (1.2.11) follows at once. 1 1 , , i+1 i i ∈ N+ ,

Basic properties Since τ n (ω) = [an+1 (ω) , an+2 (ω) , · · · ] , n ∈ N+ , ω ∈ Ω, we have ( ω ∈ Ω : an+1 (ω) = i ) = for any n, i ∈ N+ so that λ (an+1 = i | a1 , · · · , an ) = λ τ n ∈ and (1.2.12) follows from (1.2.10). 1 1 , i+1 i a1 , · · · , an , ω ∈ Ω : τ n (ω) ∈ 1 1 , i+1 i

23

2

Remark. Proposition 1.2.7 is the starting point of an approach to the metrical theory of the continued fraction expansion via dependence with complete connections. See Iosifescu and Grigorescu (1990, Section 5.2). 2 Corollary 1.2.8 The sequence (sn )n∈N+ with s0 = 0 is a Q ∩ I-valued Markov chain on (I, BI , λ) with the following transition mechanism: from state s ∈ Q ∩ I the possible transitions are to any state 1/ (s + i) with corresponding transition probability Pi (s), i ∈ N+ . We conclude this subsection by considering the random variables rn and un , n ∈ N+ , introduced in Subsection 1.2.1. Proposition 1.2.9 For any n ∈ N+ and x ≥ 1 we have λ (r1 < x) = λ (u1 < x) = 1 − λ (rn+1 < x | a1 , · · · , an ) = 1 − λ (un+1 < x | a1 , · · · , an ) = 1 , x (1.2.14) (1.2.15)

sn + 1 , sn + x

0 if x ≤ sn + 1, (1.2.16) sn + 1 if x > sn + 1. 1− x Proof. Equations (1.2.14) are obvious since r1 = u1 = 1/τ 0 . Then for any n ∈ N+ and x ≥ 1 we have λ (rn+1 < x | a1 , · · · , an ) = λ τ n > and λ (un+1 < x | a1 , · · · , an ) = λ (rn+1 < x − sn | a1 , · · · , an ) = λ τn > 1 x − sn a1 , · · · , a n . 1 x a 1 , · · · , an

24

Chapter 1 2

To obtain equations (1.2.15) and (1.2.16) it remains to use (1.2.10).

Corollary 1.2.10 For any n ∈ N+ let Gn (s) = λ(sn < s), s ∈ R, G0 (s) = 0 or 1 according as s ≤ 0 or s > 0. For any n ∈ N+ and x ≥ 1 we have 1 x−1 λ (rn < x) = dGn−1 (s) 0 s+x (1.2.17) 1 Gn−1 (s) ds 1 , = (x − 1) + x+1 (s + x)2 0 x−1 s+1 dGn−1 (s) if 1 ≤ x ≤ 2, 1− 0 x λ (un < x) = (1.2.18) 1 s+1 dGn−1 (s) if x > 2 1− x 0 1 x−1 Gn−1 (s) ds if 1 ≤ x ≤ 2, x 0 = 1 1 2 1− + Gn−1 (s) ds if x > 2 x x 0 1 x

x−1 0 1 0

=

Gn−1 (s) ds, (s + 1) dGn−1 (s) (s + x)2

1 0

d λ (rn < x) = dx =

(1.2.19)

2 + (x + 1)2 Also, for any n ∈ N+ we have

(s − x + 2) Gn−1 (s) ds . (s + x)3

d 1 1 λ (un < x) = Gn−1 (x − 1) − 2 dx x x 1 Gn−1 (x − 1) − 1 x x2 = 1 2 x

1 x−1 0

x−1 0

Gn−1 (s) ds

(1.2.20)

Gn−1 (s) ds if 1 ≤ x ≤ 2,

2−

0

Gn−1 (s) ds

if x > 2

Basic properties a.e. in [1, ∞).

25

Proof. The ﬁrst equality in (1.2.17) follows at once from (1.2.15). To obtain the second one we integrate by parts noting that Gn (0) = 0 and Gn (1) = 1 for any n ∈ N. Similarly, the ﬁrst equality in (1.2.18) follows at once from (1.2.16). To obtain the second and third ones we integrate by parts and then note that Gn (s) = 1 for any n ∈ N and s ≥ 1. Finally, equations (1.2.19) and (1.2.20) follow immediately from (1.2.17) and (1.2.18), respectively. 2

1.3

1.3.1

**The natural extension of τ
**

Deﬁnition and basic properties

The incomplete quotients an , n ∈ N+ , are expressed in terms of a1 and the powers of the continued fraction transformation τ . Such a thing is not possible for the variables sn or un , n ∈ N+ . To rule out this inconvenience we consider the so called natural extension τ of τ which is a transformation of (0, 1) × I deﬁned by τ (ω, θ) = τ (ω) , 1 a1 (ω) + θ , (ω, θ) ∈ (0, 1) × I. (1.3.1)

This is a one-to-one transformation of Ω2 with inverse τ −1 (ω, θ) = 1 , τ (θ) , a1 (θ) + ω (ω, θ) ∈ Ω2 . (1.3.2)

It is easy to see that for any n ≥ 2 we have τ n (ω, θ) = (τ n (ω) , [an (ω) , · · · , a2 (ω) , a1 (ω) + θ]) whatever (ω, θ) ∈ Ω × I, and τ −n (ω, θ) = ([an (θ) , · · · , a2 (θ) , a1 (θ) + ω], τ n (θ)) whatever (ω, θ) ∈ Ω2 . Equations (1.3.1) and (1.3.1 ) imply that τ n (ω, 0) = (τ n (ω) , sn (ω)) , n ∈ N+ , (1.3.3) (1.3.2 ) (1.3.1 )

26

Chapter 1

for any ω ∈ Ω. Note that the above equation also hold for n = 0 if we deﬁne τ 0 =identity map. 2 Now, deﬁne the extended Gauss measure γ on BI by γ (B) = Note that γ (A × I) = γ (I × A) = γ (A) (1.3.4) for any A ∈ BI . The result below shows that γ plays with respect to τ the part played by γ with respect to τ (cf. Theorem 1.2.1). Theorem 1.3.1 The extended Gauss measure γ is preserved by τ .

2 Proof. We should show that γ τ −1 (B) = γ (B) for any B ∈ BI or, equivalently, since τ is invertible on Ω2 , that γ (τ (B)) = γ (B) for any B ∈ 2 BI . As the set of Cartesian products I(i(m) ) × I(j (n) ), i(m) ∈ Nm , j (n) ∈ + 2 Nn , m, n ∈ N, generates the σ-algebra BI , it is enough to show that +

1 log 2

B

dxdy , (xy + 1)2

2 B ∈ BI .

γ(τ (I(i(m) ) × I(j (n) ))) = γ(I(i(m) ) × I(j (n) )) i(m) Nm , + j (n) Nn , +

(1.3.5)

for any ∈ ∈ m, n ∈ N. It follows from (1.3.4) and Theorem 1.2.1 that (1.3.5) holds for m = 0 and n ∈ N. If m ∈ N+ then it is easy to see that τ (I(i(m) ) × I(j (n) )) = I (i2 , · · · , im ) × I (i1 , j1 , · · · , jn ) , n ∈ N+ , where I (i2 , · · · , im ) equals Ω for m = 1. Also, if I(i(m) ) = Ω ∩ (a, b) and I(j (n) ) = Ω ∩ (c, d), with a, b, c, d ∈ Q ∩ I, then I (i2 , · · · , im ) = Ω ∩ b−1 − i1 , a−1 − i1 and I (i1 , j1 , · · · , jn ) = Ω ∩ ((d + i1 )−1 , (c + i1 )−1 ). A simple computation yields γ((a, b) × (c, d)) = (bd + 1) (ac + 1) 1 log , log 2 (bc + 1) (ad + 1)

and then γ( b−1 − i1 , a−1 − i1 × ((d + i1 )−1 , (c + i1 )−1 )) = = 1 ((a−1 − i1 )(c + i1 )−1 + 1)((b−1 − i1 )(d + i1 )−1 + 1) log log 2 ((a−1 − i1 )(d + i1 )−1 + 1)((b−1 − i1 )(c + i1 )−1 + 1) 1 (bd + 1) (ac + 1) log , log 2 (bc + 1) (ad + 1) 2

that is, (1.3.5) holds.

For more details on natural extensions we refer the reader to Subsection 4.0.1.

Basic properties

27

1.3.2

Approximation coeﬃcients

**On account of Legendre’s theorem (see Corollary 1.2.4), for any ω ∈ Ω we deﬁne the approximation coeﬃcients Θn = Θn (ω) as
**

2 Θn = Θn (ω) = qn ω −

pn , qn

n ∈ N.

Clearly, Θ0 (ω) = ω, ω ∈ Ω, and by (1.2.3) we have Θn = u−1 , n+1 Hence 0 < Θn < 1, n ∈ N. It is rather easy to obtain more information about Θn , n ∈ N. It follows from (1.2.3 ) and (1.2.1) that Θn = τn 1 = , sn + rn+1 sn τ n + 1 n ∈ N. n ∈ N. (1.3.6)

−1 Moreover, as s−1 = an + sn−1 and rn = an + rn+1 , n ∈ N+ , we also have n

Θn−1 = = Thus it appears that

1 1 = −1 sn−1 + rn sn−1 + an + rn+1 sn , n ∈ N+ . sn τ n + 1

(Θn−1 , Θn ) = Ψ (τ n , sn ) , the function Ψ : I 2 → R2 being deﬁned by + Ψ (x, y) = y x , xy + 1 xy + 1 ,

n ∈ N+ ,

(1.3.7)

(x, y) ∈ I 2 .

Clearly, Ψ is a C 1 -diﬀeomorphism between the interior of I 2 and the interior of the triangle ∆ with vertices (0, 0) , (1, 0) and (0, 1). It then follows from (1.3.7) that Θn−1 + Θn < 1, whence 1 min (Θn−1 , Θn ) < , 2 n ∈ N+ , n ∈ N+ ,

28 a well known result due to Vahlen (1895). The inverse Ψ−1 of Ψ is given by Ψ−1 (α, β) = For i ∈ N+ put Vi = I (i) × Ω Hi = Ω × I (i) . It follows from the deﬁnition of τ that τ (Vi ) = Hi , Vi = τ −1 (Hi ) , and that for any i ∈ N+ we have τ n ∈ Vi if and only if an+1 = i, τ n ∈ Hi if and only if an = i, n ∈ N, n ∈ N+ . i ∈ N+ , 2β 2α √ √ , 1 + 1 − 4αβ 1 + 1 − 4αβ ,

Chapter 1

(α, β) ∈ ∆.

(1.3.8) (1.3.9)

Furthermore, the set Vi∗ = ΨVi , is a quadrangle with vertices 0, 1 i , i 1 , i+1 i+1 , i+1 1 , i+2 i+2 and 0, 1 i+1 ,

and notice that its symmetrical with respect to the diagonal α = β is Hi∗ = ΨHi , i ∈ N+ . (For i = 1 both quadrangles are in fact triangles.) Deﬁne the mapping F : ∆ → ∆ as F = Ψτ Ψ−1 . It is easy to check that for any i ∈ N+ we have (α, β) ∈ Vi∗ ⇒ F (α, β) = β, α + i Now, by (1.3.7) we have Ψ−1 (Θn−1 , Θn ) = (τ n , sn ) , whence τ Ψ−1 (Θn−1 , Θn ) = τ n+1 , sn+1 , Therefore, by (1.3.7) again, F (Θn−1 , Θn ) = Ψ τ Ψ−1 (Θn−1 , Θn ) (1.3.11) = Ψ τ n+1 , s

n+1

1 − 4αβ − i2 β .

(1.3.10)

n ∈ N+ .

= (Θn , Θn+1 ),

n ∈ N+ .

Basic properties Hence, by (1.3.3), (1.3.8), and (1.3.10), Θn+1 = Θn−1 + an+1 1 − 4Θn−1 Θn − a2 Θn , n+1 n ∈ N+ .

29

(1.3.12)

Similarly, for any i ∈ N+ we have (α, β) ∈ Hi∗ ⇒ F −1 (α, β) = β + i 1 − 4αβ − i2 α, α . As by (1.3.3), (1.3.9), and (1.3.13) we have F −1 (Θn , Θn+1 ) = (Θn−1 , Θn ) , we obtain Θn−1 = Θn+1 + an+1 1 − 4Θn Θn+1 − a2 Θn , n+1 n ∈ N+ . (1.3.12 ) n ∈ N+ , (1.3.13)

We note that both (1.3.12) and (1.3.12 ) can be established by direct computation using the relationships between Θn , rn , sn , and an , n ∈ N+ . We are now able to derive some classical results in Diophantine approximation. Put fi (α, β) = α + i 1 − 4αβ − i2 β, i ∈ N+ ,

so that (1.3.10) can be rewritten as (α, β) ∈ Vi∗ ⇒ F (α, β) = (β, fi (α, β)) . It is easy to check that ∂fi (α, β) < 0, ∂α ∂fi (α, β) < 0, ∂β (α, β) ∈ Vi∗ , i ∈ N+ . (1.3.14)

The only ﬁxed point of τ in Vi is (ξi , ξi ), where √ −i + i2 + 4 ξi = [i, i, i, · · · ] = , 2

i ∈ N+ ,

∗ ∗ while the only ﬁxed point of F in Vi∗ = ΨVi is (ξi , ξi ), where ∗ ∗ (ξi , ξi ) = Ψ (ξi , ξi ) =

1 1 √ ,√ , 2+4 2+4 i i

i ∈ N+ .

(1.3.15)

Note that by (1.3.11) we have (Θn−1 , Θn , Θn+1 ) = (Θn−1 , F (Θn−1 , Θn )) , n ∈ N+ . Hence, for any i, n ∈ N+ , (Θn−1 , Θn , Θn+1 ) = (Θn−1 , Θn , fi (Θn−1 , Θn ))

30

Chapter 1

if and only if (Θn−1 , Θn ) ∈ Vi∗ , that is, by (1.3.7), if and only if an+1 = i. Finally, note that ∗ ∗ Θn−1 (ξi ) = Θn (ξi ) (1.3.16) for any i, n ∈ N+ . Now, on account of (1.3.14) through (1.3.16) we can state the following result. Theorem 1.3.2 For any ω ∈ Ω and n ∈ N+ we have min (Θn−1 , Θn , Θn+1 ) < and max (Θn−1 , Θn , Θn+1 ) > 1 a2 n+1 1 a2 + 4 n+1 +4 (1.3.17)

.

(1.3.18)

Inequality (1.3.17) generalizes a result of Borel (1903) according to which 1 min (Θn−1 , Θn , Θn+1 ) < √ , 5 n ∈ N+ . (1.3.11)

A great number of people independently found (1.3.17). See, e.g., Bagemihl and McLaughlin (1966), Obrechkoﬀ (1951), Sendov (1959/60). Inequality (1.3.18) is due to Tong (1983). Actually, the method sketched above yields easy proofs of generalizations of a great number of classical results by M. Fujiwara, B. Segre, J. LeVeque, P. Sz˝sz, and others. We will u mention here a generalization of a result of B. Segre. For other results the reader is referred to Jager and Kraaikamp (1989) and Kraaikamp (1991). Theorem 1.3.3 Let ρ ≥ 0 and n ∈ N+ . Then of the three inequalities Θ2n−1 < ρ a2 2n+1 + 4ρ , Θ2n < 1 a2 2n+1 + 4ρ , Θ2n+1 < ρ a2 2n+1 + 4ρ

at least one is satisﬁed and at least one is not satisﬁed. Corollary 1.3.4 [Segre (1945)] Let ρ ≥ 0 and ω ∈ Ω. Then there are inﬁnitely many rational numbers p/q with p < q and g.c.d. (p, q) = 1 satisfying the inequalities ρ p 1 1 1 −√ <ω− < √ . q 1 + 4ρ q 2 1 + 4ρ q 2

Basic properties

31

Remark. Tong (1994) proved the optimal version of Theorem 1.3.2 by showing that for any ω ∈ Ω and n ∈ N+ we have min (Θn−1 , Θn , Θn+1 ) < and max (Θn−1 , Θn , Θn+1 ) > 1 (an+1 + |τ n+1 − sn |)2 + 4 1 (an+1 − |τ n+1 − sn |)2 + 4 . 2

1.3.3

Extended random variables

It is well known [see, e.g., Doob (1953, p. 456)] that a doubly inﬁnite version of (an )n∈N+ under γ (i.e., when the process is a strictly stationary one, see Theorem 1.2.1) should exist on a richer probability space. It is possible to construct it eﬀectively by using the natural extension τ as follows. Deﬁne extended incomplete quotients a , ∈ Z, on Ω2 by a with a1 (ω, θ) = a1 (ω) , (ω, θ) ∈ Ω2 . Clearly, by (1.3.1 ) and (1.3.2 ) we have an (ω, θ) = an (ω) , a0 (ω, θ) = a1 (θ) , a−n (ω, θ) = an+1 (θ) , n ∈ N+ , (ω, θ) ∈ Ω2 . Similarly to the interpretation of the an , n ∈ N+ , in Subsection 1.2.1, 2 we can consider the a , ∈ Z, as N+ -valued random variables on I 2 , BI 2 for any probability measure µ on B 2 assigning which are deﬁned µ-a.s. in I I measure 0 to I 2 \Ω2 . (Such a µ is clearly γ.) Alternatively, we can look at the a , ∈ Z, as N+ ∪ {∞}-valued random variables which are deﬁned everywhere in [0, 1)2 , as the an , n ∈ N+ , can be deﬁned everywhere in [0, 1) (cf. Subsection 1.2.1). In the latter case a typical trajectory of (a ) ∈Z is either — a doubly inﬁnite sequence of natural numbers; — a doubly inﬁnite sequence of elements of N+ ∪ {∞} in which the natural numbers appear ﬁnitely many times in consecutive positions; — a doubly inﬁnite sequence of elements of N+ ∪ {∞} in which the natural numbers appear in consecutive positions from a certain rank on or up to a certain rank.

+1 (ω, θ)

= a1 τ (ω, θ) ,

∈ Z,

32

Chapter 1

The distinction between the two cases is again immaterial. Since τ preserves γ, the doubly inﬁnite sequence (a ) ∈Z is strictly stationary under γ. It is indeed a doubly inﬁnite version of (an )n∈N+ under γ, that is, the distribution of (ah , · · · , ah+m ) under γ and that of (ak , · · · , ak+m ) under γ are identical for any h ∈ Z, m ∈ N, and k ∈ N+ . The probability structure of (a ) ∈Z under γ is described by Corollary 1.3.6 to Theorem 1.3.5 below. The latter also brings to light an important family of probability measures on BI , to be called conditional, which we shall consider in some detail in the next subsection. Theorem 1.3.5 For any x ∈ I we have γ ([0, x] × I | a0 , a−1 , · · · ) = where a = [a0 , a−1 , · · · ]. Proof. As is well known, γ ([0, x] × I | a0 , a−1 , · · · ) = lim γ ([0, x] × I | a0 , · · · , a−n ) γ -a.s.. ¯

n→∞

(a + 1) x γ-a.s., ax + 1

For typographical convenience let us denote by I_n the fundamental interval I(ā_0, ..., ā_{−n}) for arbitrarily fixed values of the ā_i, i = 0, −1, ..., −n. Then we have

γ̄([0, x] × I | ā_0, ..., ā_{−n}) = γ̄([0, x] × I_n)/γ̄(I × I_n)

= (log 2)^{−1} ∫_{I_n} dy ∫_0^x du/(uy + 1)² / γ̄(I × I_n)

= ∫_{I_n} (x(y + 1)/(xy + 1)) γ(dy) / γ(I_n)

= x(y_n + 1)/(xy_n + 1)

for some y_n ∈ I_n, by the mean value theorem. Since

lim_{n→∞} y_n = [ā_0, ā_{−1}, ...] = a,

the proof is complete. □

Corollary 1.3.6. For any i ∈ N_+ we have

γ̄(ā_1 = i | ā_0, ā_{−1}, ...) = P_i(a)  γ̄-a.s.,

where a = [ā_0, ā_{−1}, ...] and the functions P_i, i ∈ N_+, are defined by (1.2.13).

Proof. We have

(ā_1 = i) = (1/2, 1) × [0, 1) if i = 1,  (1/(i + 1), 1/i] × [0, 1) if i ≥ 2.

Hence by Theorem 1.3.5 the conditional probability in the statement is γ̄-a.s. equal to

((a + 1)/i)/(1 + a/i) − ((a + 1)/(i + 1))/(1 + a/(i + 1)) = P_i(a). □

Remarks. 1. The strict stationarity of (ā_ℓ)_{ℓ ∈ Z} under γ̄ implies that the conditional probability

γ̄(ā_{ℓ+1} = i | ā_ℓ, ā_{ℓ−1}, ...), i ∈ N_+,

does not depend on ℓ ∈ Z and is γ̄-a.s. equal to P_i(ā), where ā = [ā_ℓ, ā_{ℓ−1}, ...]. Thus Proposition 1.2.7 and Corollary 1.3.6 provide interpretations of P_i(x) for all x ∈ [0, 1).

2. The process (ā_ℓ)_{ℓ ∈ Z} is an example of what is called an infinite-order chain in the theory of dependence with complete connections; see Section 5.5 in Iosifescu and Grigorescu (1990). The existence of such chains is not obvious: to ensure it, several restrictions should be imposed. See, e.g., Theorems 5.5.1 and 5.5.2 in Iosifescu and Grigorescu (op. cit.). The latter refers to N_+-valued infinite-order chains and makes explicit use of the continued fraction expansion. The simple effective construction of (ā_ℓ)_{ℓ ∈ Z} on the probability space (I², B_{I²}, γ̄) fully clarifies an idea of Wolfgang Doeblin [see Doeblin (1940)], who was the first to use dependence with complete connections in the metric theory of the continued fraction expansion. □

Note that by its very construction (ā_ℓ)_{ℓ ∈ Z} is a reversible process, that is, the finite-dimensional distributions under γ̄ of (ā_ℓ)_{ℓ ∈ Z} and (ā_{−ℓ})_{ℓ ∈ Z} are identical. A similar property holds for (a_n)_{n ∈ N_+} under γ, as is shown by the following result.
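As a small numerical aside (added here, not in the book): with the explicit form P_i(x) = (x + 1)/((x + i)(x + i + 1)) of the functions from (1.2.13) — an assumption of this sketch — successive terms telescope, so the P_i(x) are indeed conditional probabilities summing to 1 over i ∈ N_+:

```python
def P(i, x):
    # P_i(x) = (x+1)/((x+i)(x+i+1)); terms telescope as (x+1)(1/(x+i) - 1/(x+i+1))
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def partial_sum(x, K):
    # equals 1 - (x+1)/(x+K+1) exactly, by telescoping
    return sum(P(i, x) for i in range(1, K + 1))

checks = [abs(partial_sum(x, 1000) - (1.0 - (x + 1.0) / (x + 1001.0)))
          for x in (0.0, 0.3, 0.7, 1.0)]
```

The partial sums therefore converge to 1 at rate O(1/K), uniformly in x ∈ I.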


Proposition 1.3.7. The random sequence (a_n)_{n ∈ N_+} on (I, B_I, γ) is reversible, i.e., the distributions of (a_ℓ : m ≤ ℓ ≤ n) and (a_{m+n−ℓ} : m ≤ ℓ ≤ n) are identical for any m, n ∈ N_+, m ≤ n.

Proof. By the strict stationarity under γ̄ of (ā_ℓ)_{ℓ ∈ Z}, the distribution of (ā_ℓ : m ≤ ℓ ≤ n) is identical with the distribution of (ā_{ℓ−m−n+1} : m ≤ ℓ ≤ n) (both under γ̄). But by the very definition of (ā_ℓ)_{ℓ ∈ Z} the first distribution is identical with that of (a_ℓ : m ≤ ℓ ≤ n) while the second one is identical with that of (a_{m+n−ℓ} : m ≤ ℓ ≤ n) (both under γ). □

Remark. The result stated in Proposition 1.3.7 amounts to the fact that the γ-measures of the fundamental intervals I(i_1, ..., i_n) and I(i_n, ..., i_1) are equal for any n ∈ N_+ and i_1, ..., i_n ∈ N_+. This can also be proved by direct computation using results from Subsection 1.2.3. See Philipp (1967) and Dürner (1992). □

Define extended associated random variables s̄_ℓ, ȳ_ℓ, r̄_ℓ and ū_ℓ, ℓ ∈ Z, as

s̄_ℓ = [ā_ℓ, ā_{ℓ−1}, ...],  ȳ_ℓ = 1/s̄_ℓ,
r̄_ℓ = [ā_ℓ; ā_{ℓ+1}, ā_{ℓ+2}, ...],  ū_ℓ = s̄_{ℓ−1} + r̄_ℓ, ℓ ∈ Z.

Clearly,

s̄_ℓ = s_0 ∘ τ̄^ℓ, ȳ_ℓ = y_0 ∘ τ̄^ℓ, r̄_ℓ = r_0 ∘ τ̄^ℓ, ū_ℓ = s_0 ∘ τ̄^{ℓ−1} + r_1 ∘ τ̄^{ℓ−1}, ℓ ∈ Z.

It follows from the above equations, Theorem 1.3.1, and Corollary 1.3.6 that (s̄_ℓ)_{ℓ ∈ Z} is a strictly stationary Ω-valued Markov process on (I², B_{I²}, γ̄) with the following transition mechanism: from state s ∈ Ω the possible transitions are to any state 1/(s + i) with corresponding transition probability P_i(s), i ∈ N_+. Clearly, for any ℓ ∈ Z we have

γ̄(s̄_ℓ < x) = γ̄(s̄_0 < x) = γ̄(I × [0, x]) = γ([0, x]), x ∈ I.
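Not from the book, but instructive: one can simulate the transition mechanism s → 1/(s + i) with probabilities P_i(s) (taking P_i(x) = (x + 1)/((x + i)(x + i + 1)), as in (1.2.13)) and check that the empirical law of the chain matches its stationary law, Gauss' measure γ:

```python
import math
import random

def P(i, s):
    return (s + 1.0) / ((s + i) * (s + i + 1.0))

def step(s, rng):
    # sample i with probability P_i(s) by inversion, then move to 1/(s+i)
    u, i, acc = rng.random(), 0, 0.0
    while acc <= u:
        i += 1
        acc += P(i, s)
    return 1.0 / (s + i)

rng = random.Random(12345)
s, hits, N = 0.5, 0, 200_000
for _ in range(N):
    s = step(s, rng)
    hits += s < 0.5
freq = hits / N                 # empirical frequency of {s < 1/2}
target = math.log(1.5, 2.0)     # gamma([0, 1/2]) = log_2(3/2)
```

The starting point 0.5 and the seed are arbitrary choices; the chain mixes fast, so 2·10^5 steps give the Gauss value to about two decimals.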

Similar considerations can be made about the process (ȳ_ℓ)_{ℓ ∈ Z}. This is a strictly stationary Ω̃-valued Markov process on (I², B_{I²}, γ̄), where Ω̃ = the set of irrationals in [1, ∞). The transition mechanism of (ȳ_ℓ)_{ℓ ∈ Z} is as follows: from state y ∈ Ω̃ the only possible transitions are to any state y^{−1} + i with corresponding transition probability P_i(1/y), i ∈ N_+. For any ℓ ∈ Z we have

γ̄(ȳ_ℓ < x) = γ̄(ȳ_0 < x) = γ([x^{−1}, 1]) = γ̃([1, x]), x ∈ [1, ∞),

where γ̃ is the probability measure on B_{[1,∞)} defined by

γ̃(A) = (1/log 2) ∫_A dy/(y(y + 1)), A ∈ B_{[1,∞)}.

Next, the process (r̄_ℓ)_{ℓ ∈ Z} is a strictly stationary Ω̃-valued 'deterministic' Markov process on (I², B_{I²}, γ̄) in which state r ∈ Ω̃ is followed by state 1/(r − ⌊r⌋). Obviously, for any ℓ ∈ Z we have

γ̄(r̄_ℓ < x) = γ̄(r̄_1 < x) = γ(r_1 < x) = γ̃([1, x)), x ∈ [1, ∞).

Note that by the reversibility of (ā_ℓ)_{ℓ ∈ Z} the finite-dimensional distributions under γ̄ of (s̄_ℓ)_{ℓ ∈ Z} and (1/r̄_{−ℓ})_{ℓ ∈ Z} are identical.

Finally, the process (s̄_{ℓ−1}, 1/r̄_ℓ)_{ℓ ∈ Z} is a strictly stationary Ω²-valued 'deterministic' Markov process on (I², B_{I²}, γ̄) in which state (s, ω) ∈ Ω² is followed by state

τ̄^{−1}(s, ω) = (1/(s + ⌊ω^{−1}⌋), ω^{−1} − ⌊ω^{−1}⌋).

For any ℓ ∈ Z we have

γ̄(s̄_{ℓ−1} < x, 1/r̄_ℓ < y) = γ̄(s̄_0 < x, 1/r̄_1 < y) = γ̄([0, y] × [0, x])
= (1/log 2) ∫_0^y ∫_0^x du dv/(uv + 1)² = (1/log 2) log(xy + 1), x, y ∈ I.

The process (ū_ℓ)_{ℓ ∈ Z}, which is a functional of (s̄_{ℓ−1}, 1/r̄_ℓ)_{ℓ ∈ Z} (note that ū_ℓ = s̄_{ℓ−1} + r̄_ℓ, ℓ ∈ Z), is no longer Markovian but is still strictly stationary. For any ℓ ∈ Z we have

γ̄(ū_ℓ < x) = γ̄(ū_1 < x) = γ̄(s̄_0 + r̄_1 < x) = (1/log 2) ∫∫_D du dv/(uv + 1)², x ∈ [1, ∞),

where D = {(u, v) ∈ I² : u + v^{−1} < x}. Hence

γ̄(ū_ℓ < x) = (1/log 2)(log x − (x − 1)/x) if 1 ≤ x ≤ 2,

γ̄(ū_ℓ < x) = 1 − 1/(x log 2) if x ≥ 2.
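The closed form just obtained for γ̄(ū_ℓ < x) can be checked numerically (an illustration added here, not in the book) against a direct two-dimensional quadrature of the defining integral over D:

```python
import math

def u_cdf(x):
    # the closed form derived above
    if x <= 2.0:
        return (math.log(x) - (x - 1.0) / x) / math.log(2.0)
    return 1.0 - 1.0 / (x * math.log(2.0))

def u_cdf_quad(x, N=600):
    # midpoint rule for (1/log 2) * integral of 1/(uv+1)^2 over
    # D = {(u, v) in I^2 : u + 1/v < x}
    h = 1.0 / N
    total = 0.0
    for i in range(N):
        u = (i + 0.5) * h
        for j in range(N):
            v = (j + 0.5) * h
            if u + 1.0 / v < x:
                total += h * h / (u * v + 1.0) ** 2
    return total / math.log(2.0)
```

The two branches agree at x = 2 (both equal 1 − 1/(2 log 2)), and the quadrature matches the closed form to the accuracy the boundary cells allow.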


1.3.4 The conditional probability measures

Motivated by Theorem 1.3.5 we shall consider the family of (conditional) probability measures (γ_a)_{a ∈ I} on B_I defined by their distribution functions

γ_a([0, x]) = (a + 1)x/(ax + 1), x ∈ I, a ∈ I.

In particular, γ_0 = λ. The density h_a of γ_a is

h_a(x) = (a + 1)/(ax + 1)², x ∈ I, a ∈ I,

and [see, e.g., Billingsley (1968, p. 224)] we then have

sup_{A ∈ B_I} |γ_a(A) − γ_b(A)| = (1/2) ∫_I |h_a(x) − h_b(x)| dx
= (1/2)|b − a| ∫_I |(ab + a + b)x² + 2x − 1| dx/((ax + 1)²(bx + 1)²)
= (1/2)|b − a| (∫_0^α (1 − 2x − (ab + a + b)x²) dx/((ax + 1)²(bx + 1)²) + ∫_α^1 ((ab + a + b)x² + 2x − 1) dx/((ax + 1)²(bx + 1)²))
= α(1 − α)|b − a|/((αa + 1)(αb + 1)),

where α = (1 + ((a + 1)(b + 1))^{1/2})^{−1}, a, b ∈ I. Hence

sup_{A ∈ B_I} |γ_a(A) − γ_b(A)| ≤ (1/4)|b − a|, a, b ∈ I.  (1.3.19)

It is easy to see that we also have

sup_{x ∈ I} |γ_a([0, x]) − γ_b([0, x])| = α(1 − α)|b − a|/((1 + αa)(1 + αb)), a, b ∈ I.

For any a ∈ I put s_0^a = a and

s_n^a = 1/(s_{n−1}^a + a_n), n ∈ N_+.  (1.3.20)

It follows from the properties just described of the process (s̄_ℓ)_{ℓ ∈ Z} that the sequence (s_n^a)_{n ∈ N} is an I-valued Markov chain on (I, B_I, γ_a) which starts at s_0^a = a and has the following transition mechanism: from state s ∈ I the possible transitions are to any state 1/(s + i) with corresponding transition probability P_i(s), i ∈ N_+. [Strictly speaking, this only holds for any a ∈ E ⊂ Ω, for some E ∈ B_I with λ(E) = 1, as (s_n^a)_{n ∈ N} under γ_a is a version of (s̄_n)_{n ∈ N} under γ̄(· | s̄_0 = a), a ∈ E. The validity of the above assertion for the remaining a ∈ I \ E follows by continuity on account of (1.3.19).]

Proposition 1.3.8 (Generalized Brodén–Borel–Lévy formula). For any a ∈ I and n ∈ N_+ we have

γ_a(τ^n < x | a_1, ..., a_n) = (s_n^a + 1)x/(s_n^a x + 1), x ∈ I.  (1.3.21)

Proof. For any n ∈ N_+ and x ∈ I consider the conditional probability

γ̄(τ̄^{−n}([0, x] × I) | ā_n, ..., ā_1, ā_0, ā_{−1}, ...).  (1.3.22)

Put a = [ā_0, ā_{−1}, ...] — actually, a(ω, θ) = θ, (ω, θ) ∈ Ω² — and note that

[ā_n, ..., ā_1, ā_0, ā_{−1}, ...] = s_n^a.

On the one hand, it follows from Theorems 1.3.1 and 1.3.5 (see also Remark 1 after Corollary 1.3.6) that the conditional probability (1.3.22) is γ̄-a.s. equal to

(s_n^a + 1)x/(s_n^a x + 1).

On the other hand, putting

γ̄_a(·) = γ̄(· | ā_0, ā_{−1}, ...),

it is clear that (1.3.22) is γ̄-a.s. equal to

γ̄_a(τ̄^{−n}([0, x] × I) ∩ (I(a_1, ..., a_n) × I)) / γ̄_a(I(a_1, ..., a_n) × I).  (1.3.23)

Since τ̄^{−n}([0, x] × I) = τ^{−n}([0, x]) × I and γ̄_a(A × I) = γ_a(A), A ∈ B_I, the fraction in (1.3.23) is equal to

γ_a(τ^{−n}([0, x]) | I(a_1, ..., a_n)) = γ_a(τ^n < x | a_1, ..., a_n).

Therefore (1.3.21) holds for any a ∈ E ⊂ Ω, for some E ∈ B_I with λ(E) = 1, hence by continuity [use (1.3.19)] for the remaining a ∈ I \ E. □
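The exact expression for the distance between two conditional measures, and the bound (1.3.19), are easy to verify numerically (an added illustration, not in the book):

```python
import math

def gamma_cdf(a, x):
    # distribution function of gamma_a
    return (a + 1.0) * x / (a * x + 1.0)

a, b = 0.2, 0.8
alpha = 1.0 / (1.0 + math.sqrt((a + 1.0) * (b + 1.0)))
exact = alpha * (1.0 - alpha) * abs(b - a) / ((1.0 + alpha * a) * (1.0 + alpha * b))

# brute-force sup over a fine grid of x in I
grid_sup = max(abs(gamma_cdf(a, k / 100000.0) - gamma_cdf(b, k / 100000.0))
               for k in range(100001))
```

The grid supremum reproduces the closed-form value, which in turn is below |b − a|/4.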


Remark. Equation (1.3.21) can also be proved by direct computation (cf. the proof of Corollary 1.2.6). □

Corollary 1.3.9. For any a ∈ I and n ∈ N_+ we have

γ_a(A | a_1, ..., a_n) = γ_{s_n^a}(τ^n(A))  (1.3.24)

whatever the set A belonging to the σ-algebra generated by the random variables a_{n+1}, a_{n+2}, ..., that is, τ^{−n}(B_I).

We now give a generalization of Proposition 1.2.9, where Lebesgue measure λ (= γ_0) is replaced by γ_a, a ∈ I. Define first the random variables u_n^a as

u_n^a = s_{n−1}^a + r_n, n ∈ N_+, a ∈ I.

Proposition 1.3.10. For any a ∈ I, n ∈ N_+, and x ≥ 1 we have

γ_a(r_1 < x) = 1 − (a + 1)/(x + a),

γ_a(u_1^a < x) = 0 if x ≤ a + 1,  1 − (a + 1)/x if x > a + 1,

γ_a(r_{n+1} < x | a_1, ..., a_n) = 1 − (s_n^a + 1)/(x + s_n^a),

γ_a(u_{n+1}^a < x | a_1, ..., a_n) = 0 if x ≤ s_n^a + 1,  1 − (s_n^a + 1)/x if x > s_n^a + 1.

The proof is entirely similar to that of Proposition 1.2.9. □

Corollary 1.3.11. For any a ∈ I and n ∈ N_+ let G_n^a(s) = γ_a(s_n^a < s), s ∈ R, with G_0^a(s) = 0 or 1 according as s ≤ a or s > a. For any a ∈ I, n ∈ N_+, and x ≥ 1 we have

γ_a(r_n < x) = ∫_0^1 ((x − 1)/(s + x)) dG_{n−1}^a(s)
= (x − 1)(1/(x + 1) + ∫_0^1 G_{n−1}^a(s) ds/(s + x)²),

γ_a(u_n^a < x) = ∫_0^{x−1} (1 − (s + 1)/x) dG_{n−1}^a(s) if 1 ≤ x ≤ 2,
γ_a(u_n^a < x) = 1 − ∫_0^1 ((s + 1)/x) dG_{n−1}^a(s) if x > 2,

that is,

γ_a(u_n^a < x) = (1/x) ∫_0^{x−1} G_{n−1}^a(s) ds

with the convention G_{n−1}^a(s) = 1 for s > 1. Equations similar to (1.2.19) and (1.2.20) hold, too.

1.3.5 Paul Lévy's solution to Gauss' problem

We now present the elegant solution given by Lévy (1929) to Gauss' problem. Actually, as Lévy has done in the case a = 0, we shall obtain estimates for both 'errors' F_n^a − G and G_n^a − G, a ∈ I, n ∈ N, where

F_n^a(x) = γ_a(τ^n < x), x ∈ I,  G_n^a(s) = γ_a(s_n^a < s), s ∈ R,

and G(s) = 0, γ([0, s]), or 1 according as s < 0, s ∈ I, or s > 1. It follows from Corollary 1.3.11 that

F_n^a(x) = ∫_0^1 (x(s + 1)/(xs + 1)) dG_n^a(s)  (1.3.25)

for any a, x ∈ I and n ∈ N. It is easy to check that

G(x) = ∫_0^1 (x(s + 1)/(xs + 1)) dG(s), x ∈ I,  (1.3.26)

and

G_{n+1}^a(1/m) = F_n^a(1/m), m, n ∈ N_+, a ∈ I.  (1.3.27)

The last equation is still valid for n = 0 and a ≠ 0, while

G_1^0(1/m) = F_0^0(1/(m + 1)) = 1/(m + 1), m ∈ N_+.  (1.3.27′)

Since (sa )n∈N is a Markov chain on (I, BI , γa )—see the preceding subsection— n for any m, n ∈ N+ , a ∈ I, and θ ∈ [0, 1) we have 1 1 Ga − Ga n+1 n+1 m m+θ = γa 1 1 ≤ sa < n+1 m+θ m 1 1 a ≤ sa < s n+1 m+θ m n (1.3.28)

= E γa

θ

=

0

Pm (s) dGa (s) n

while Ga 1 1 m − Ga 1 1 m+θ = γa

θ

1 1 1 ≤ < m+θ a1 + a m (1.3.28 ) Pm (s) dGa (s), 0

=

0+

**that is, (1.3.28) also holds for n = 0 if a = 0. It is easy to check that
**

θ 0

Pm (s)dG(s) = G

1 m

−G

1 m+θ

(1.3.29)

**for any m ∈ N+ and θ ∈ [0, 1). Now, by (1.3.25) and (1.3.26) we have
**

a Fn (x) − G(x) =

x(s + 1) d(Ga (s) − G(s)) n xs + 1 0 1 ∂ x(s + 1) = − (Ga (s) − G(s)) n ∂s xs + 1 0

1

ds

**for any a, x ∈ I and n ∈ N. Setting
**

a αn = sup |Ga (s) − G(s)| , n s∈I

a ∈ I, n ∈ N,

**Basic properties we obtain
**

a a |Fn (x) − G(x)| ≤ αn 1 0

41

x(1 − x) a x(1 − x) ds = α , 2 (xs + 1) x+1 n (1.3.30)

hence

√ a a |Fn (x) − G(x)| ≤ (3 − 2 2)αn

a α0 = max (G(a), 1 − G(a)),

for any a, x ∈ I and n ∈ N. Let us note that a ∈ I.

Theorem 1.3.12 For any n ∈ N+ and a ∈ I we have √ √ 1 a sup |Fn (x) − G(x)| ≤ (3 − 2 2)(3.5 − 2 2)n−1 , 2 x∈I √ 1 sup |Ga (x) − G(x)| ≤ (3.5 − 2 2)n−1 . n 2 x∈I Proof. By (1.3.27) through (1.3.30), for any m, n ∈ N+ , a ∈ I, and θ ∈ [0, 1)—also for n = 0 and any m ∈ N+ , a ∈ (0, 1], and θ ∈ [0, 1)—we have Ga n+1 1 m+θ ≤ −G 1 m 1 m 1 m+θ −G 1 m 1 m+θ

θ 0

Ga n+1 + Ga n+1

− Ga n+1 1 m +

−G

1 m

+G

1 m+θ

= ≤

a Fn

1 −G m √ a 3 − 2 2 αn

θ 0

Pm (s) d (Ga (s) − G(s)) n

(G(s) − Ga (s)) dPm (s) + Pm (θ)(Ga (θ) − G(θ)) n n √ a ≤ (3 − 2 2 + β(m, θ))αn , + where β(m, θ) =

0 θ

dPm (s) ds + Pm (θ). ds

**42 It is easy to check that β(m, θ) ≤ 1/2 for Actually, 1/2 4/(3 + θ) − 2/(2 + θ) − 1/6 β(m, θ) = 6 − 4√2 − 1/6 2Pm (θ) − 1/m(m + 1) Hence
**

a αn+1 =

Chapter 1 any m ∈ N+ and θ ∈ [0, 1). if m = 1, if m = 2 and θ ≤ if m = 2 and θ ≥ if m ≥ 3. √ 2 − 1, √ 2 − 1,

sup

m∈N+ , θ∈[0,1)

Ga n+1

1 m+θ

−G

1 m+θ

(1.3.31)

**√ a ≤ (3.5 − 2 2)αn for any a ∈ I and n ∈ N+ . Finally, by (1.3.27), (1.3.27 ), and (1.3.28 ), G0 1 and Ga 1 1 m+θ = Ga 1 1 m
**

θ

1 m+θ

= G0 1

1 m

=

1 m+1

−

0

Pm (s)dGa (s) 0

a F0 = a F 0

1 m 1 m

− Pm (a) if 0 ≤ θ ≤ a, if θ > a if 0 ≤ θ ≤ a, if θ > a

=

a+1 a+m+1 a+1 a+m

**for any a ∈ (0, 1], θ ∈ [0, 1), and m ∈ N+ . It is easy to see that
**

a α1 =

sup

m∈N+ , θ∈[0,1)

Ga 1

1 m+θ

−G

1 m+θ

1 ≤ , 2

a ∈ I.

(1.3.32)

Basic properties It follows from (1.3.31) and (1.3.32) that √ 1 a αn ≤ (3.5 − 2 2)n−1 , 2 By (1.3.30) the proof is complete. n ∈ N+ , a ∈ I.

43

2

a Theorem 1.3.12 shows that both Fn and Ga converge very fast to Gauss’ n distribution function G. Actually, the convergence is even considerably faster. See Corollary 2.3.6 and Theorem 2.5.5.

1.3.6

Mixing properties

We conclude this section by studying the ψ-mixing coeﬃcients of (an )n∈N+ under either γa , a ∈ I, or γ. Theorem 1.3.12 plays here an important part. k ∞ For any k ∈ N+ let B1 = σ (a1 , · · · , ak ) and Bk = σ (ak , ak+1 , · · · ) denote the σ-algebras generated by the random variables a1 , · · · , ak , respeck tively, ak , ak+1 , · · · . Clearly, B1 is the σ-algebra generated by the closures ∞ of the fundamental intervals of rank k while Bk = τ −k+1 (BI ), k ∈ N+ . For any µ ∈ pr (BI ) consider the ψ-mixing coeﬃcients (cf. Section A3.1) ψµ (n) = sup µ (A ∩ B) −1 , µ (A) µ (B) n ∈ N+ ,

k ∞ where the supremum is taken over all A ∈ B1 and B ∈ Bk+n such that µ (A) µ (B) = 0, and k ∈ N+ . Deﬁne γa (B) − 1 , n ∈ N+ , εn = sup γ (B) ∞ where the supremum is taken over all a ∈ I and B ∈ Bn with γ (B) > 0. ∞ ∞ Note that the sequence (εn )n∈N+ is non-increasing since Bn+1 ⊂ Bn for any a , a ∈ I, n ∈ N+ . We shall show that εn can be expressed in terms of Fn−1 and G, namely, εn = εn with

εn = sup

a,x∈I

a dFn−1 (x) /dx −1 , g (x)

n ∈ N+ ,

**where g (x) = G (x) = (log 2)−1 / (x + 1) , x ∈ I. Indeed, by the very deﬁnition of εn , for any a, x ∈ I we have εn g (x) ≥
**

a dFn−1 (x) − g (x) . dx

44

∞ By integrating the above inequality over B ∈ Bn we obtain a dFn−1 (x) − g (x) dx dx a dFn−1 (x) −

Chapter 1

γ (B) εn ≥ ≥

B

B

B

g (x) dx = |γa (B) − γ (B)|

∞ for any B ∈ Bn , n ∈ N+ , and a ∈ I. Hence εn ≥ εn , n ∈ N+ . On the other + ∞ hand, for any arbitrarily given n ∈ N+ let Bx,h = (x ≤ τ n−1 < x + h) ∈ Bn , − ∞ with x ∈ [0, 1), h > 0, x + h ∈ I, and Bx,h = (x − h ≤ τ n−1 < x) ∈ Bn , with x ∈ (0, 1], h > 0, x − h ∈ I. Clearly, + γa (Bx,h ) + γ(Bx,h ) − γa (Bx,h ) − γ(Bx,h )

εn ≥ max

−1 ,

−1

for any a ∈ I and suitable x ∈ I and h > 0. Letting h → 0 we get εn ≥ εn , n ∈ N+ . Therefore εn = εn , n ∈ N+ . a It is easy to compute ε1 = ε1 and ε2 = ε2 . Since F0 (x) = γa τ 0 < x = γa ([0, x]) , a, x ∈ I, we have ε1 = sup As 1≤ it follows that ε1 = 2 log 2 − 1 = 0.38629 · · · . Next, as γa (sa = 1/(a + i)) = Pi (a), a ∈ I, i ∈ N+ , by Proposition 1.3.8 1 we have

a F1 (x) = i∈N+ a dF0 (x) /dx (a + 1) (x + 1) − 1 = sup log 2 − 1 . g (x) (ax + 1)2 a,x∈I

a,x∈I

(a + 1) (x + 1) ≤ 2, (ax + 1)2

a, x ∈ I,

(a + i + 1)x a+1 x + a + i (a + i)(a + i + 1) (a + 1)x , (x + a + i)(a + i) a, x ∈ I.

=

i∈N+

**Basic properties Then ε2 = = sup
**

a,x∈I a dF1 (x)/dx −1 g(x)

45

sup (log 2)(a + 1)(x + 1)

a,x∈I i∈N+

1 −1 . (x + a + i)2

**It is not diﬃcult to check that 2(ζ(2) − 1) ≤ (a + 1)(x + 1)
**

i∈N+

1 ≤ ζ(2), (x + a + i)2

a, x ∈ I.

Hence

ε2 = max(ζ(2) log 2 − 1, 1 − 2(ζ(2) − 1) log 2) = ζ(2) log 2 − 1 = 0.14018 · · · .

For n ≥ 3 the computation of εn becomes forbidding. Instead, Theorem 1.3.12 can be used to derive good upper bounds for εn whatever n ∈ N+ . Proposition 1.3.13 We have ε1 < log 2 and 1 εn ≤ (log 2)cn−2 , 2 n ≥ 2,

√ where c = 3.5 − 2 2 = 0.67157 · · · .

**Proof. It follows from (1.3.25) and (1.3.26) that
**

a dFn (x) = dx 1 0 1

s+1 dGa (s) (xs + 1)2 n

and g(x) =

0

s+1 dG(s) (xs + 1)2

**for any a, x ∈ I and n ∈ N. Using the last two equations, integration by parts yields
**

a dFn (x) − g(x) dx

=

s+1 d(Ga (s) − G(s)) n (xs + 1)2 0 1 ∂ s+1 ((Ga (s) − G(s)) = ds n ∂s (xs + 1)2 0 1 |x(s + 2) − 1| ≤ sup | Ga (s) − G(s)| ds. n (xs + 1)3 s∈I 0

1

**46 But |x(s + 2) − 1| ds (xs + 1)3 0 1 1 − x(s + 2) ds 0 (xs + 1)3 (1−2x)/x 1 − x(s + 2)
**

0 1 0 1

Chapter 1

if 0 ≤ x ≤ 1 , 3

1

=

(xs + 1)3 x(s + 2) − 1 ds (xs + 1)3

ds −

(1−2x)/x

1 − x(s + 2) ds if (xs + 1)3 if

1 3

≤ x ≤ 1, 2

1 2

≤x≤1

if 0 ≤ x ≤ 1 , 2(x + 1)−2 − 1 3 −2(x + 1)−2 − 1 + (2x(1 − x))−1 if 1 ≤ x ≤ 1 , = 3 2 1 − 2(x + 1)−2 if 1 ≤ x ≤ 1 2 and (x + 1)

0 1

|x(s + 2) − 1| ds = (xs + 1)3

if 0 ≤ x ≤ 1 2(x + 1)−1 − (x + 1) 3 1 −2(x + 1)−1 − (x + 1) + (x + 1)(2x(1 − x))−1 if 3 ≤ x ≤ 1 = 2 1 x + 1 − 2(x + 1)−1 if 2 ≤ x ≤ 1 ≤ 1. Therefore sup

a,x∈I a dFn (x)/dx − 1 ≤ (log 2) sup |Ga (s) − G(s)| , n g(x) a,s∈I

n ∈ N.

Then ε1 = ε1 ≤ log 2 and, by Theorem 1.3.12, 1 εn+1 = εn+1 ≤ (log 2)cn−1 , 2 n ∈ N+ .

Basic properties

47 2

Theorem 1.3.14 For any a ∈ I we have ψγa (n) ≤ Also, ψγ (n) = εn , n ∈ N+ . (1.3.34) εn + εn+1 , 1 − εn+1 n ∈ N+ . (1.3.33)

Proof. It follows from (1.3.24) that for any a ∈ I we have εn = sup γa B|I(i(k) ) −1 , γ(B) n ∈ N+ , (1.3.35)

∞ where the supremum is taken over all B ∈ Bk+n with γ(B) > 0, i(k) ∈ Nk , + and k ∈ N. For arbitrarily given k, , n ∈ N+ , i(k) ∈ Nk , and j ( ) ∈ N+ + put A = I(i(k) ), B = ((ak+n , · · · , ak+n+ −1 ) = j ( ) ))

and note that γa (A) γa (B) = 0 for any a ∈ I. By (1.3.35) we have |γa (B|A) − γ (B)| ≤ εn γ (B) and |γa (B) − γ (B)| ≤ εn+k γ (B) . It follows from (1.3.36) and (1.3.37) that |γa (B|A) − γa (B)| ≤ (εn + εn+k ) γ (B) , whence |γa (A ∩ B) − γa (A) γa (B)| ≤ (εn + εn+k ) γa (A) γ (B) . Finally, note that (1.3.37) yields γ (B) ≤ γa (B) . 1 − εn+k (1.3.37) (1.3.36)

Since the sequence (εn )n∈N+ is non-increasing, we have εn + εn+1 εn + εn+k ≤ , 1 − εn+k 1 − εn+1 k, n ∈ N+ ,

48

Chapter 1

which completes the proof of (1.3.33). To prove (1.3.34) we ﬁrst note that putting A = I(i(k) ) for any given k ∈ N+ and i(k) ∈ Nk , by (1.3.35) we have + |γa (A ∩ B) − γa (A) γ (B)| ≤ εn γa (A) γ (B)

∞ for any a ∈ I, B ∈ Bk+n , and n ∈ N+ . By integrating the above inequality over a ∈ I with respect to γ and taking into account that

I

γa (E) γ(da) = γ (E) ,

E ∈ BI ,

we obtain ψγ (n) ≤ εn , n ∈ N+ . To prove the converse inequality remark that the ψ-mixing coeﬃcients under the extended Gauss measure γ of the doubly inﬁnite sequence (¯ ) ∈Z ¯ a of extended incomplete quotients, are equal to the corresponding ψ-mixing coeﬃcients under γ of (an )n∈N+ . This is obvious by the very deﬁnitions of (¯ ) ∈Z and ψ-mixing coeﬃcients. See Subsection 1.3.3 and Section A3.1. a As (¯ ) ∈Z is strictly stationary under γ , we have a ¯ ψγ (n) = ψγ (n) = sup ¯ γ (A ∩ B) ¯ −1 , γ (A) γ (B) ¯ ¯ n ∈ N+ ,

¯ ¯ where the upper bound is taken over all A = σ(¯n , an+1 , · · · ) and B ∈ a ¯ ¯ σ(¯0 , a−1 , · · · ) for which γ (A) γ (B) = 0. Clearly, A = A × I and B = I × B, a ¯ ¯ ∞ with A ∈ Bn = τ −n+1 (BI ) and B ∈ BI . Then ψγ (n) = sup

A ∈ τ −n+1 (BI ), B ∈ BI γ(A)γ(B) = 0

γ (A × B) ¯ −1 , γ(A) γ(B)

n ∈ N+ .

(1.3.38)

**Now, it is easy to check that γ (A × B) = ¯
**

A

γ(da)γa (B) =

B

γ(db)γb (A)

**for any A, B ∈ BI . It then follows from (1.3.38) and the very deﬁnition of εn that ψγ (n) ≥ sup
**

b ∈ I, A ∈ τ −n+1 (BI ) γ(A) = 0

γb (A) − 1 = εn , γ(A)

n ∈ N+ .

Basic properties This completes the proof of (1.3.34).

49 2

Corollary 1.3.15 The sequence (an )n∈N+ is ψ-mixing under γ and any γa , a ∈ I. For any a ∈ I we have ψγa (1) ≤ (ε1 + ε2 )/(1 − ε2 ) = 0.61231 · · · and (log 2)cn−2 (1 + c) ψγa (n) ≤ , n ≥ 2. 2 − (log 2)cn−1 Also, ψγ (1) = 2 log 2 − 1 = 0.38629 · · · , ψγ (2) = ζ(2) log 2 − 1 = 0.14018 · · · and 1 ψγ (n) ≤ (log 2)cn−2 , n ≥ 3. 2 The doubly inﬁnite sequence (¯ ) ∈Z of extended incomplete quotients is a ψ-mixing under the extended Gauss measure γ , and its ψ-mixing coeﬃcients ¯ are equal to the corresponding ψ-mixing coeﬃcients under γ of (an )n∈N+ . The proof follows from Proposition 1.3.13 and Theorem 1.3.14. As already noted, the last assertion is obvious by the very deﬁnitions of (¯ ) ∈Z a and ψ-mixing coeﬃcients. 2 Remark. The above result will be improved in Chapter 2. See Proposition 2.3.7. 2 Proposition 1.3.16 (F. Bernstein’s theorem) Let (cn )n∈N+ be a sequence of positive numbers. The random event (an ≥ cn ) occurs inﬁnitely often with γ-probability 0 or 1, according as the series n∈N+ 1/cn converges or diverges. In other words, γ(an ≥ cn i.o.) is either 0 or 1 according as the series n∈N+ 1/cn converges or diverges. Proof. We can clearly assume that cn ≥ 1, n ∈ N+ . Let En = (an ≥ cn ), n ∈ N+ . By (1.2.9) we have γ(En ) = γ(an ≥ cn ) = γ (a1 ≥ cn ) = γ(a1 ≥ cn ) = 1 1 log 1 + log 2 cn ,

where either cn = cn + 1 or cn = cn . Hence 1 2 ≤ γ(En ) ≤ , 2cn cn log 2 n ∈ N+ ,

since x log 2 ≤ log(1 + x) ≤ x for any x ∈ I. Thus if n∈N+ 1/cn converges, then the result stated follows from the Borel–Cantelli lemma.

50

Chapter 1

Assume now that n∈N+ 1/cn diverges. It follows from Theorem 1.3.14 that for any k, n ∈ N+ such that k ≤ n we have

c c c c |γ (Ek ∩ · · · ∩ En ∩ En+1 ) − γ (Ek ∩ · · · ∩ En ) γ (En+1 )| c c ≤ ε1 γ (Ek ∩ · · · ∩ En ) γ (En+1 ) ,

**where ε1 = 2 log 2 − 1 = 0.38629 · · · . Hence
**

c c γ ( En+1 | Ek ∩ · · · ∩ En ) ≥ (1 − ε1 )γ(En+1 ) ≥

1 − ε1 , 2cn+1

therefore

c c c γ En+1 Ek ∩ · · · ∩ En ≤ 1 −

1 − ε1 2cn+1

**for any k, n ∈ N+ such that k ≤ n. It follows that for any k, m ∈ N+ we have
**

m c c γ Ek ∩ · · · ∩ Ek+m ≤ i=0

1−

1 − ε1 2ck+i 1 − ε1 2ck+i

,

whence γ

c Ek

m

∩

c Ek+1

∩ · · · ≤ lim

m→∞

1−

i=0

=0

**since n∈N+ 1/cn diverges. Finally, γ (an ≥ cn i.o.) = γ(∩k∈N+ ∪ i≥k Ei ) =
**

c lim γ(∪ i≥k Ei ) = lim γ((∩i≥k Ei )c ) k→∞

k→∞

c c = 1 − lim γ Ek ∩ Ek+1 ∩ · · · = 1. k→∞

2 In Chapter 3 we shall need the following result. Corollary 1.3.17 Let bn , n ∈ N+ , be real-valued random variables on (I, BI ) such that an ≤ bn ≤ an + c, n ∈ N+ , for some c ∈ R+ . Let (cn )n∈N+ be a sequence of positive numbers. Then γ (bn ≥ cn i.o.) is either 0 or 1 according as the series n∈N+ 1/cn converges or diverges.

**Basic properties Proof. Clearly, (an ≥ cn i.o.) ⊂ (bn ≥ cn i.o.) ⊂ (an ≥ max(1, cn − c) i.o.), and the series or divergent.
**

n∈N+

51

1/cn and

n∈N+

1/ max(1, cn −c) are both convergent 2

52

Chapter 1

Chapter 2

**Solving Gauss’ problem
**

In this chapter a generalization of Gauss’ problem stated in Subsection 1.2.1 is solved. Several applications are also given.

**2.0 Banach space preliminaries
**

2.0.1 A few classical Banach spaces

In this subsection we describe some Banach spaces which are often mentioned throughout the book. We consider just functions deﬁned on I, but almost all considerations below can be easily extended to more general cases. We denote by B (I) the collection of all bounded measurable functions f : I → C. This is a commutative Banach algebra with unit under the supremum norm || f || = sup |f (x)| , f ∈ B (I) .

x∈I

We denote by C (I) the collection of all continuous functions f : I → C . This is a commutative Banach algebra with unit under the supremum norm. We denote by C 1 (I) the collection of all functions f : I → C which have a continuous derivative. This is a commutative Banach algebra with unit under the norm || f || 1 = || f || + || f || , f ∈ C 1 (I) . We denote by L (I) the collection of all Lipschitz functions f : I → C, that is, those for which s (f ) := sup

x =x

|f (x ) − f (x ) | < ∞· |x − x | 53

54 This is a commutative Banach algebra with unit under the norm || f || L = || f || + s (f ) , Clearly, C 1 (I) ⊂ L (I) ⊂ C (I) ⊂ B (I) . f ∈ L (I) .

Chapter 2

**The variation varA f over A ⊂ I of a function f : I → C is deﬁned as
**

k−1

sup

i=1

|f (ti ) − f (ti−1 )| ,

the supremum being taken over t1 < · · · < tk , ti ∈ A, 1 ≤ i ≤ k, and k ≥ 2. We write simply var f for varI f . If var f < ∞ then f is called a function of bounded variation. The collection BV (I) of all functions f : I → C of bounded variation is a commutative Banach algebra with unit under the norm || f || v = || f || + var f, f ∈ BV (I) . Clearly, L (I) ⊂ BV (I) ⊂ B (I) . Let µ be a measure on BI . Two measurable functions f : I → C and g : I → C are said to be µ-indistinguishable, or to be µ-versions of each other, if and only if µ (f = g) = 0. Let us partition the collection of all measurable complex-valued functions deﬁned on I into (equivalence) classes of µ-indistinguishable functions. For any real number p ≥ 1 we denote by Lp (I, BI , µ) = Lp the collection of all such classes of µ-indistinguishable µ functions f : I → C for which I |f |p dµ < ∞. Clearly, Lp ⊂ Lp if p ≥ p ≥ µ µ 1. Next, Lp is a Banach space under the norm µ ||f ||p,µ = |f |p dµ

I 1/p

,

f ∈ Lp . µ

(Note that the value of the integral is the same for all functions in an equivalence class.) To deﬁne L∞ we should ﬁrst deﬁne the µ-essential supremum. For a µ measurable function f : I → R, its µ-essential supremum, which is denoted µ-ess sup f , is deﬁned as inf {a ∈ R : µ (f > a) = 0} .

Solving Gauss’ problem

55

A measurable function f : I → C is said to be µ-essentially bounded if and only if µ-ess sup|f | < ∞. Note that µ-ess sup|f | = inf || f || , where the lower bound is taken over all µ-versions f or f . We denote by L∞ (I, BI , µ) = L∞ the collection of all classes of µ-essentially bounded µ complex-valued µ-indistinguishable functions deﬁned on I ; L∞ is a comµ mutative Banach algebra with unit under the norm ||f ||∞,µ = µ-ess sup |f |, f ∈ L∞ . µ

(Note that the value of the essential supremum is the same for all functions in an equivalence class.) Clearly, L∞ ⊂ Lp for any p ≥ 1. µ µ The special case p = 2 is an important one: L2 can be also considered µ as a Hilbert space with inner product (·, ·)µ deﬁned by (f, g)µ = f g ∗ dµ,

I

f, g ∈ L2 . µ

In the case where µ = λ we simply write Lp , ||f ||p , L∞ , ||f ||∞ , and ess sup f instead of Lp , ||f ||p,λ , L∞ , ||f ||∞,λ , and λ-ess sup f , respectively. λ λ

**2.0.2 Bounded essential variation
**

A variation v (f ) for f ∈ L∞ is deﬁned as v (f ) = inf var f, the inﬁmum being taken over all λ-versions f of f . If v (f ) < ∞ then f ∈ L∞ is called a function of bounded essential variation. It can be shown that v (f ) = 1 0<a→0 a lim

1

|f (u + a) − f (u) |du,

0

where for x > 1 we deﬁne f (x) = f (1). Clearly, if f ∈ BV (I) then, in general, v (f ) ≤ var f . This is a special instance of the following more general result due to Stadje (1985). If v (f ) < ∞ then the limit f (t) = lim 1 0<a→0 a

t+a

f (u) du

t

exists for any t ∈ I, the function f is a right-continuous λ-version of f , and var f = v (f ). The collection BEV (I) of all functions f ∈ L∞ of bounded

56

Chapter 2

essential variation is a commutative Banach algebra with unit under any of the norms ||f ||v,µ = v (f ) + ||f ||1,µ , f ∈ BEV (I) , with µ ∈ pr (BI ) such that µ ≡ λ. See R˘utu and Zb˘ganu (1989). In the a ¸ a case where µ = λ we simply write ||f ||v instead of ||f ||v,λ . Proposition 2.0.1 (i) Let µ ∈ pr (BI ). If f ∈ BV (I) then || f || ≤ var f +

I

f dµ .

(2.0.1)

**(ii) Let µ ∈ pr (BI ) with µ ≡ λ. If f ∈ BEV (I) then µ-ess sup |f | ≤ v (f ) +
**

I

f dµ .

(2.0.2)

**Proof. (i) For any x ∈ I we can write |f (x)| −
**

I

f dµ ≤ f (x) −

I

f dµ =

I

(f (x) − f (u)) µ (du) ≤ var f,

**from which (2.0.1) follows at once. (ii) (2.0.2) follows from (2.0.1) since µ-ess sup |f | = inf || f || , v (f ) = inf var f ,
**

e f e f

**the inﬁmum being taken over all µ-versions f of f , and f dµ =
**

I I

f dµ 2

for such an f .

2.1

2.1.1

**The Perron–Frobenius operator
**

Deﬁnition and basic properties

Let µ ∈ pr (BI ) such that µ τ −1 (A) = 0 whenever µ (A) = 0, A ∈ BI , (2.1.1)

where τ is the continued fraction transformation deﬁned in Subsection 1.1.1.

Solving Gauss’ problem

57

In particular, this condition is satisﬁed if τ is µ-preserving, that is, = µ, to mean µ τ −1 (A) = µ (A) for any A ∈ BI . In general, assuming that µ λ and putting h = dµ/dλ, it is easy to check that (2.1.1) holds if and only if λ (E) = 0, where E = (x ∈ I : h (x) = 0). The Perron–Frobenius operator Pµ of τ under µ is deﬁned as the bounded linear operator on L1 which takes f ∈ L1 into Pµ f ∈ L1 with µ µ µ µτ −1 Pµ f dµ = f dµ ,

τ −1 (A)

A

A ∈ BI ,

or, equivalently,

I

gPµ f dµ =

(g ◦ τ ) f dµ

I

(2.1.2)

for any f ∈ L1 and g ∈ L∞ . The existence of Pµ f is ensured by the Radon– µ µ Nikodym theorem on account of (2.1.1). Actually, Pµ so deﬁned takes Lp µ into itself for any p ≥ 1 and p = ∞. So, (2.1.2) holds for any f ∈ Lp µ and g ∈ Lq , with p > 1 and q = p/ (p − 1). In particular, (2.1.2) holds for µ any f, g ∈ L2 . µ The probabilistic interpretation of Pµ is immediate : if an I-valued random variable ξ on I has µ-density h, that is, µ (ξ ∈ A) = A hdµ, A ∈ BI , with h ≥ 0 and I hdµ = 1, then τ ◦ ξ has µ-density Pµ h. In the special case µ = λ we obviously have Pλ f (x) = d dx

τ −1 ([0,x])

f dλ a.e. in I.

**Proposition 2.1.1 The following properties hold : (i) Pµ is positive, that is, Pµ f ≥ 0 if f ≥ 0; (ii) Pµ preserves integrals, that is, Pµ f dµ = f dµ,
**

I

I

f ∈ L1 ; µ

(iii) Pµ p,µ := sup (||Pµ f ||p,µ : f ∈ Lp , ||f ||p,µ = 1) ≤ 1 for any p ≥ 1 µ and p = ∞; n (iv) for any n ∈ N+ the nth power Pµ of Pµ is the Perron–Frobenius n of τ under µ ; operator of the nth iterate τ (v) (Pµ f )∗ = Pµ f ∗ for any f ∈ L1 ; µ (vi) Pµ ((g ◦ τ ) f ) = gPµ f for any f ∈ L1 and g ∈ L∞ and for any f ∈ µ µ Lp and g ∈ Lq with p > 1 and q = p/ (p − 1); µ µ

58

Chapter 2

(vii) Pµ f = f if and only if τ is ν-preserving, where ν is deﬁned by ν (A) = A f dµ, A ∈ BI . In particular, Pµ 1 = 1 if and only if τ is µpreserving. For the proof see Boyarski and G´ra (1997, Ch. 4), Lasota and Mackey o (1985, Ch. 3) or Mackey (1992, Ch. 4). 2 Remark. The above considerations on the Perron–Frobenius operator of the continued fraction transformation τ under diﬀerent probability measures on BI apply mutatis mutandis to the general case of a transformation of an arbitrary probability space. For example, in the case of the natural extension τ of τ (see Subsection 1.3.1) we should start by considering measures µ ∈ 2 pr BI such that µ τ −1 (B) = 0 whenever

2 µ (B) = 0, B ∈ BI .

(2.1.1 )

Assuming that µ λ2 (two-dimensional Lebesgue measure) and putting h = dµ/dλ2 , it is easy to check that (2.1.1 ) holds if and only if λ2 E = 0, where E = (x, y) ∈ I 2 : h (x, y) = 0 . The Perron–Frobenius operator P µ of τ under µ is the bounded linear operator on L1 I 2 which takes f ∈ L1 I 2 into P µ f ∈ L1 I 2 with µ µ µ P µ f dµ = f dµ,

τ −1 (B) ¯ 2 B ∈ BI .

B

**It is also quite easy to check that if µ ≤ λ2 and h = dµ/dλ2 > 0 a.e. in I 2 , then h ◦ τ −1 (x, y) f ◦ τ −1 (x, y) P µ f (x, y) = y 2 (x + 1/y )2 h (x, y) a.e. in I 2 . Alternatively, P µ f (x, y) = sx (y) 1 τ 0 (y)
**

2

h ◦ τ −1 (x, y) h (x, y)

f ◦ τ −1 (x, y)

a.e. in I 2 . In particular, for µ = γ when h (x, y) = 1 1 , log 2 (xy + 1)2 x, y ∈ I 2 ,

**we have P γ f = f ◦ τ −1 a.e. in I 2 . Hence P µ f (x, y) =
**

n

sx (y) · · · sx (y) n 1 τ 0 (y) · · · τ n−1 (y)

2

h ◦ τ −n (x, y) h (x, y)

f ◦ τ −n (x, y) ,

**Solving Gauss’ problem P γ f = f ◦ τ −n
**

n

59

a.e. in I 2 for any n ∈ N+ . We should, however, note that the Perron–Frobenius operator of an invertible transformation, like τ , is not of great value for deriving asymptotic ¯ properties of its nth power as n → ∞. For an interesting discussion of the Perron–Frobenius operator of τ in connection with the time evolution of ¯ certain spatially homogeneous cosmologies (‘mixmaster universe’), we refer the reader to Mayer (1987). 2 Proposition 2.1.2 The Perron–Frobenius operator Pγ := U of τ under γ is given a.e. in I by the equation U f (x) =

i∈N+

Pi (x) f

1 x+i

,

f ∈ L1 . γ

(2.1.3)

Proof. Let $\tau_i : I_i \to I$ denote the restriction of $\tau$ to the interval $I_i = (1/(i+1), 1/i]$, $i \in \mathbb{N}_+$, that is,

$$\tau_i(u) = \frac{1}{u} - i, \qquad u \in I_i.$$

For any $f \in L^1_\gamma$ and any $A \in \mathcal{B}_I$ we have

$$\int_{\tau^{-1}(A)} f \, d\gamma = \sum_{i \in \mathbb{N}_+} \int_{\tau^{-1}(A) \cap I_i} f \, d\gamma = \sum_{i \in \mathbb{N}_+} \int_{\tau_i^{-1}(A)} f \, d\gamma. \tag{2.1.4}$$

For any $i \in \mathbb{N}_+$, by the change of variable $x = \tau_i^{-1}(y) = (y+i)^{-1}$ we successively obtain

$$\begin{aligned}
\int_{\tau_i^{-1}(A)} f \, d\gamma &= \frac{1}{\log 2} \int_{\tau_i^{-1}(A)} \frac{f(x)}{x+1} \, dx \\
&= \frac{1}{\log 2} \int_A f\!\left(\frac{1}{y+i}\right) \frac{1}{(y+i)^{-1} + 1} \, \frac{dy}{(y+i)^2} \\
&= \frac{1}{\log 2} \int_A P_i(y) \, f\!\left(\frac{1}{y+i}\right) \frac{dy}{y+1} \tag{2.1.5} \\
&= \int_A P_i(y) \, f\!\left(\frac{1}{y+i}\right) \gamma(dy).
\end{aligned}$$

Now, (2.1.3) follows from (2.1.4) and (2.1.5). $\Box$
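In computational terms, (2.1.3) can be checked directly. The sketch below (Python; the truncation level, the tail approximation by $f(0)$, and the Simpson quadrature are our own devices, not part of the text) verifies that $U1 = 1$, as it must be since $\tau$ is $\gamma$-preserving, and that $\int_I Uf\,d\gamma = \int_I f\,d\gamma$, which is the defining duality property with $B = I$.

```python
import math

LOG2 = math.log(2.0)

def P(i, x):
    # P_i(x) = (x + 1) / ((x + i)(x + i + 1)), the weights in (2.1.3)
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def U(f, x, N=1000):
    # Truncated series (2.1.3).  The exact tail weight is
    # sum_{i > N} P_i(x) = (x + 1)/(x + N + 1), so we approximate the
    # tail contribution by f(0) times that weight.
    s = sum(P(i, x) * f(1.0 / (x + i)) for i in range(1, N + 1))
    return s + f(0.0) * (x + 1.0) / (x + N + 1.0)

def gauss_integral(g, a, b, panels=200):
    # \int_a^b g dgamma = (1/log 2) \int_a^b g(t)/(t+1) dt, composite Simpson
    h = (b - a) / (2 * panels)
    tot = g(a) / (a + 1.0) + g(b) / (b + 1.0)
    for k in range(1, 2 * panels):
        t = a + k * h
        tot += (4 if k % 2 else 2) * g(t) / (t + 1.0)
    return tot * h / (3.0 * LOG2)

# U preserves constants (tau is gamma-preserving) ...
c = U(lambda t: 1.0, 0.3)
# ... and satisfies the duality \int U f dgamma = \int f dgamma (case B = I)
f = lambda t: t * t
lhs = gauss_integral(lambda t: U(f, t), 0.0, 1.0)
rhs = gauss_integral(f, 0.0, 1.0)
print(c, lhs, rhs)
```

Replacing $B = I$ by smaller intervals requires summing over the inverse branches $\tau_i^{-1}(B)$, exactly as in (2.1.4).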

Proposition 2.1.3 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$. Assume that $\mu \ll \lambda$ and $h = d\mu/d\lambda > 0$ a.e. in $I$. Then the Perron–Frobenius operator $P_\mu$ of $\tau$ under $\mu$ is given a.e. in $I$ by the equation

$$P_\mu f(x) = \frac{1}{h(x)} \sum_{i \in \mathbb{N}_+} \frac{h\left((x+i)^{-1}\right)}{(x+i)^2} \, f\!\left(\frac{1}{x+i}\right) = \frac{U g(x)}{(x+1)\, h(x)}, \qquad f \in L^1_\mu, \tag{2.1.6}$$

where $g(x) = (x+1)\, h(x)\, f(x)$, $x \in I$. The powers of $P_\mu$ are given a.e. in $I$ by the equation

$$P^n_\mu f(x) = \frac{U^n g(x)}{(x+1)\, h(x)}, \qquad f \in L^1_\mu, \ n \in \mathbb{N}_+. \tag{2.1.7}$$

Proof. The proof of (2.1.6) is entirely similar to that of (2.1.3), and is left to the reader. Note that $f \in L^1_\mu$ entails $g \in L^1_\gamma$.

To prove (2.1.7) note that it holds for $n = 1$. Assuming that (2.1.7) holds for some $n \in \mathbb{N}_+$, we have

$$\begin{aligned}
P^{n+1}_\mu f(x) &= P_\mu\left(P^n_\mu f\right)(x) = P_\mu\left(\frac{U^n g}{(\cdot + 1)\, h}\right)(x) \\
&= \frac{1}{h(x)} \sum_{i \in \mathbb{N}_+} \frac{h\left((x+i)^{-1}\right)}{(x+i)^2} \, U^n g\!\left(\frac{1}{x+i}\right) \Big/ \left[\left(\frac{1}{x+i} + 1\right) h\!\left(\frac{1}{x+i}\right)\right] \\
&= \frac{1}{(x+1)\, h(x)} \sum_{i \in \mathbb{N}_+} P_i(x) \, U^n g\!\left(\frac{1}{x+i}\right) \\
&= \frac{U^{n+1} g(x)}{(x+1)\, h(x)} \qquad \text{a.e. in } I,
\end{aligned}$$

and the proof is complete. $\Box$

Corollary 2.1.4 The Perron–Frobenius operator $P_\lambda$ of $\tau$ under $\lambda$ is given a.e. in $I$ by the equation

$$P_\lambda f(x) = \sum_{i \in \mathbb{N}_+} \frac{1}{(x+i)^2} \, f\!\left(\frac{1}{x+i}\right), \qquad f \in L^1.$$

The powers of $P_\lambda$ are given a.e. in $I$ by the equation

$$P^n_\lambda f(x) = \frac{U^n g(x)}{x+1}, \qquad f \in L^1, \ n \in \mathbb{N}_+,$$

where $g(x) = (x+1) f(x)$, $x \in I$.

Proposition 2.1.5 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$. Assume that $\mu \ll \lambda$ and let $h = d\mu/d\lambda$. Then

$$\mu\left(\tau^{-n}(A)\right) = \int_A \frac{U^n f(x)}{x+1} \, dx \tag{2.1.8}$$

for any $n \in \mathbb{N}$ and $A \in \mathcal{B}_I$, where $f(x) = (x+1)\, h(x)$, $x \in I$.

Proof. For $n = 0$ equation (2.1.8) reduces to

$$\mu(A) = \int_A h(x) \, dx, \qquad A \in \mathcal{B}_I,$$

which is obviously true. Assume that (2.1.8) holds for some $n \in \mathbb{N}$. Then

$$\mu\left(\tau^{-(n+1)}(A)\right) = \mu\left(\tau^{-n}\left(\tau^{-1}(A)\right)\right) = \int_{\tau^{-1}(A)} \frac{U^n f(x)}{x+1} \, dx = (\log 2) \int_{\tau^{-1}(A)} U^n f \, d\gamma.$$

By the very definition of the Perron–Frobenius operator $U = P_\gamma$ we have

$$\int_{\tau^{-1}(A)} U^n f \, d\gamma = \int_A U^{n+1} f \, d\gamma.$$

Therefore

$$\mu\left(\tau^{-(n+1)}(A)\right) = (\log 2) \int_A U^{n+1} f \, d\gamma = \int_A \frac{U^{n+1} f(x)}{x+1} \, dx,$$

and the proof is complete. $\Box$

Remark. It should be noted that (2.1.8) holds without assuming that $h > 0$ a.e. Since

$$\mu(\tau^n \in A) = \mu\left(\tau^{-n}(A)\right) = \int_A P^n_\mu 1 \, d\mu, \qquad n \in \mathbb{N}, \ A \in \mathcal{B}_I,$$

it is also possible to derive (2.1.8) from Proposition 2.1.3, but only under the assumption $h > 0$ a.e., which clearly restricts the generality of the result. $\Box$
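Identity (2.1.8) can be illustrated in the simplest case $\mu = \lambda$, where $h = 1$ and $f(x) = x+1$. For $n = 1$ and $A = (0, 1/2)$ the left-hand side is available in closed form, $\lambda\left(\tau^{-1}((0, 1/2))\right) = \sum_{i \in \mathbb{N}_+} \left(1/i - 2/(2i+1)\right) = 2(1 - \log 2)$, and can be compared with a numerical evaluation of the right-hand side (truncation and quadrature parameters below are our own, not part of the text):

```python
import math

def P(i, x):
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def Uf(x, N=2000):
    # U f for f(u) = u + 1, i.e. the case mu = lambda (h = 1); truncated
    # series plus the exact tail weight (x+1)/(x+N+1) times f(0) = 1.
    s = sum(P(i, x) * (1.0 / (x + i) + 1.0) for i in range(1, N + 1))
    return s + (x + 1.0) / (x + N + 1.0)

def rhs(a=0.0, b=0.5, panels=300):
    # \int_a^b U f(x)/(x+1) dx by composite Simpson
    h = (b - a) / (2 * panels)
    tot = Uf(a) / (a + 1.0) + Uf(b) / (b + 1.0)
    for k in range(1, 2 * panels):
        x = a + k * h
        tot += (4 if k % 2 else 2) * Uf(x) / (x + 1.0)
    return tot * h / 3.0

# lambda(tau^{-1}((0,1/2))) = sum_i 1/(i(2i+1)) = 2(1 - log 2), exactly
lhs = 2.0 * (1.0 - math.log(2.0))
val = rhs()
print(lhs, val)
```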

2.1.2 Asymptotic behaviour

It is easy to check that the function $x \mapsto 1/(x+1)$ is an eigenfunction of $P_\lambda$ corresponding to the eigenvalue 1. Define on $L^1$ the linear operators $\Pi_1$ and $T_0$ by

$$\Pi_1 f(x) = \frac{(\log 2)^{-1}}{x+1} \int_I f \, d\lambda, \qquad f \in L^1, \ x \in I,$$

and $T_0 = P_\lambda - \Pi_1$. Hence

$$\Pi_1^2 = \Pi_1, \qquad P_\lambda \Pi_1 = \Pi_1 P_\lambda = \Pi_1, \qquad T_0 \Pi_1 = \Pi_1 T_0 = 0. \tag{2.1.9}$$

It follows from the last equation (2.1.9) that

$$P^n_\lambda = \Pi_1 + T_0^n, \qquad n \in \mathbb{N}_+. \tag{2.1.10}$$

Theorem 2.1.6 The only eigenvalue of modulus 1 of $P_\lambda : L^1 \to L^1$ is 1, and this eigenvalue is simple. The operator $T_0$ has the following properties:

(i) $T_0(BEV(I)) \subset BEV(I)$;

(ii) there exists $0 < q < 1$ such that $\|T_0^n\|_v = O(q^n)$ as $n \to \infty$ (equivalently, the spectral radius of $T_0$ in $BEV(I)$ under $\|\cdot\|_v$ is less than 1);

(iii) $\sup_{n \in \mathbb{N}_+} \|T_0^n\|_1 < \infty$ and $\lim_{n \to \infty} \|T_0^n h\|_1 = 0$ for any $h \in L^1$.

Proof. This is a special case of Theorem 5.3.12 in Iosifescu and Grigorescu (1990). $\Box$

The result just stated concerning the asymptotic behaviour of $T_0^n$ as $n \to \infty$ can be used to derive the asymptotic behaviour of $U^n$ as $n \to \infty$. It follows from Corollary 2.1.4 and equation (2.1.10) that

$$U^n g(x) = U^\infty g + (x+1) \, T_0^n\!\left(\frac{g}{\cdot + 1}\right)(x) \tag{2.1.11}$$

a.e. in $I$ for any $g \in L^1_\gamma$, where

$$U^\infty g = \int_I g \, d\gamma.$$

It is obvious that $U^\infty U^\infty = U U^\infty = U^\infty$. Using the last equation (2.1.9) it is easy to check that

$$U^\infty U = U^\infty. \tag{2.1.12}$$

Now, defining the linear operator $T : L^1_\gamma \to L^1_\gamma$ by

$$T g(x) = (x+1) \, T_0\!\left(\frac{g}{\cdot + 1}\right)(x), \qquad g \in L^1_\gamma,$$

a.e. in $I$, it is easy to check that

$$T^n g(x) = (x+1) \, T_0^n\!\left(\frac{g}{\cdot + 1}\right)(x), \qquad g \in L^1_\gamma, \tag{2.1.13}$$

a.e. in $I$ for any $n \in \mathbb{N}_+$, and $T U^\infty = U^\infty T = 0$. It follows from (2.1.11) and (2.1.13) that

$$U^n = U^\infty + T^n, \qquad n \in \mathbb{N}_+. \tag{2.1.14}$$

Proposition 2.1.7 The only eigenvalue of modulus 1 of $U : L^1_\gamma \to L^1_\gamma$ is 1, and this eigenvalue is simple. The corresponding eigenspace consists of the a.e. constant functions on $I$. The linear operator $T : L^1_\gamma \to L^1_\gamma$ has the following properties:

(i) $T(BEV(I)) \subset BEV(I)$;

(ii) there exists $0 < q < 1$ such that $\|T^n\|_{v,\gamma} = O(q^n)$ as $n \to \infty$ (equivalently, the spectral radius of $T$ in $BEV(I)$ under $\|\cdot\|_{v,\gamma}$ is less than 1);

(iii) $\sup_{n \in \mathbb{N}_+} \|T^n\|_{1,\gamma} < \infty$ and $\lim_{n \to \infty} \|T^n h\|_{1,\gamma} = 0$ for any $h \in L^1_\gamma$.

Proof. By (2.1.11) and (2.1.13), all the conclusions are immediate consequences of the corresponding conclusions of Theorem 2.1.6. In checking (ii) we have to use Proposition 2.0.1(ii). $\Box$

Remark. Since

$$\frac{\lambda(A)}{2 \log 2} \leq \gamma(A) \leq \frac{\lambda(A)}{\log 2}, \qquad A \in \mathcal{B}_I,$$

the domains of the operators $U$, $U^\infty$, and $T$ can as well be taken to be $L^1$, and then in (ii) and (iii) the norms $\|\cdot\|_{v,\gamma}$ and $\|\cdot\|_{1,\gamma}$ should be replaced by the norms $\|\cdot\|_v$ and $\|\cdot\|_1$, respectively. $\Box$
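The geometric convergence asserted in Proposition 2.1.7(ii) can be observed numerically by iterating $U$ on a grid. The sketch below (grid size, series truncation, and linear interpolation are our own discretization, not part of the text) starts from $f_0(x) = x+1$, the case $\mu = \lambda$, and tracks $\sup_x |U^n f_0 - U^\infty f_0|$; the contraction factor that emerges, near $0.3036$, is consistent with the constants obtained later in this chapter, while the proposition itself only guarantees some $q < 1$.

```python
import math

LOG2 = math.log(2.0)
M, N = 600, 400   # grid points, series truncation (our discretization)

def P(i, x):
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def interp(vals, t):
    # piecewise-linear interpolation of grid values on [0, 1]
    s = t * (M - 1)
    k = min(int(s), M - 2)
    w = s - k
    return vals[k] * (1.0 - w) + vals[k + 1] * w

def apply_U(vals):
    out = []
    for j in range(M):
        x = j / (M - 1)
        s = sum(P(i, x) * interp(vals, 1.0 / (x + i)) for i in range(1, N + 1))
        s += interp(vals, 0.0) * (x + 1.0) / (x + N + 1.0)   # tail weight
        out.append(s)
    return out

# f0(x) = x + 1 (mu = lambda);  U^infty f0 = \int f0 dgamma = 1/log 2
vals = [j / (M - 1) + 1.0 for j in range(M)]
target = 1.0 / LOG2
errs = []
for _ in range(7):
    vals = apply_U(vals)
    errs.append(max(abs(v - target) for v in vals))
ratios = [errs[k + 1] / errs[k] for k in range(6)]
print(errs)
print(ratios)
```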

Corollary 2.1.8 For any $h \in L^1$ we have

$$\lim_{n \to \infty} \int_I |U^n h - U^\infty h| \, d\gamma = \lim_{n \to \infty} \int_I |U^n h - U^\infty h| \, d\lambda = 0.$$

Hence, for any $h \in L^1$,

$$\lim_{n \to \infty} \int_A U^n h \, d\mu = \mu(A) \, U^\infty h \tag{2.1.15}$$

uniformly with respect to $A \in \mathcal{B}_I$, where $\mu$ stands for either $\lambda$ or $\gamma$.

Proof. For any $A \in \mathcal{B}_I$ we have

$$\left| \int_A U^n h \, d\mu - \mu(A) \, U^\infty h \right| = \left| \int_A \left(U^n h - U^\infty h\right) d\mu \right| \leq \int_A |U^n h - U^\infty h| \, d\mu \leq \int_I |U^n h - U^\infty h| \, d\mu \to 0$$

as $n \to \infty$, and the proof is complete. $\Box$

Remark. It is not possible to show that $U^n h \to U^\infty h$ a.e. as $n \to \infty$ by using (2.1.15). It is an open problem whether this is actually true. Cf. Petek (1989) and Iosifescu (1992, p. 912). $\Box$

2.1.3 Restricting the domain of the Perron–Frobenius operator

The asymptotic properties of the Perron–Frobenius operator $U : L^1_\gamma \to L^1_\gamma$, as described by Proposition 2.1.7, are not strong enough to lead to a satisfactory solution to Gauss' problem, whilst when restricting $U$ to $BEV(I)$ they are substantially better. See further Proposition 2.1.17. In the next sections the domain of $U$ will be successively restricted to various Banach spaces. In this subsection we show that $U$, defined by

$$U f(x) = \sum_{i \in \mathbb{N}_+} P_i(x) \, f\!\left(\frac{1}{x+i}\right) \tag{2.1.16}$$

for any $x \in I$, is a bounded linear operator on any of the Banach spaces $B(I)$, $C(I)$, $BV(I)$, $L(I)$, and $C^1(I)$.

Proposition 2.1.9 The operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on both $B(I)$ and $C(I)$.

Proof. It is obvious that if $f \in B(I)$ then $U f \in B(I)$ and $\|U f\| \leq \|f\|$. Next, if $f \in C(I)$ then $U f \in C(I)$ since the series defining $U f$ is uniformly convergent, it being dominated by a convergent series of positive constants. We also have $\|U f\| \leq \|f\|$, $f \in C(I) \subset B(I)$, as a consequence of the validity of this inequality for $f \in B(I)$. In both cases $\|U\| = 1$ since $U$ preserves the constant functions. $\Box$

A different interpretation is available for the operator $U : B(I) \to B(I)$.

Proposition 2.1.10 The operator $U : B(I) \to B(I)$ is the transition operator of both the Markov chain $(s^a_n)_{n \in \mathbb{N}}$ on $(I, \mathcal{B}_I, \gamma_a)$, for any $a \in I$, and the Markov chain $(s_\ell)_{\ell \in \mathbb{Z}}$ on $(I^2, \mathcal{B}_{I^2}, \bar\gamma)$.

Proof. As noted in Subsection 1.3.4, for any $a \in I$ the sequence $(s^a_n)_{n \in \mathbb{N}}$ is an $I$-valued Markov chain with the following transition mechanism: from state $s \in I$ the possible transitions are to any state $1/(s+i)$ with corresponding transition probability $P_i(s)$, $i \in \mathbb{N}_+$. Then the transition operator of $(s^a_n)_{n \in \mathbb{N}}$ takes $f \in B(I)$ to the function defined by

$$\mathrm{E}\left(f\left(s^a_{n+1}\right) \mid s^a_n = s\right) = \sum_{i \in \mathbb{N}_+} P_i(s) \, f\!\left(\frac{1}{s+i}\right) = U f(s), \qquad s \in I,$$
that is, it coincides with the operator $U$ whatever $a \in I$.

A similar reasoning is valid for the case of the Markov chain $(s_\ell)_{\ell \in \mathbb{Z}}$, whose transition mechanism is identical with that of $(s^a_n)_{n \in \mathbb{N}}$. (See Subsection 1.3.3.) $\Box$

To prove a result similar to Proposition 2.1.9 for the Banach spaces $BV(I)$, $L(I)$, and $C^1(I)$ we need some preparation. We first prove that the operator $U : B(I) \to B(I)$ preserves monotonicity (reversing its direction).

Proposition 2.1.11 If $f \in B(I)$ is non-decreasing (non-increasing), then $U f$ is non-increasing (non-decreasing).

Proof. To make a choice assume that $f$ is non-decreasing. Let $y > x$, $x, y \in I$. We have

$$U f(y) - U f(x) = S_1 + S_2,$$

where

$$S_1 = \sum_{i \in \mathbb{N}_+} P_i(y) \left( f\!\left(\frac{1}{y+i}\right) - f\!\left(\frac{1}{x+i}\right) \right), \qquad S_2 = \sum_{i \in \mathbb{N}_+} \left( P_i(y) - P_i(x) \right) f\!\left(\frac{1}{x+i}\right).$$

Clearly, $S_1 \leq 0$. We shall prove that $S_2 \leq 0$, too. Since

$$\sum_{i \in \mathbb{N}_+} P_i(u) = 1, \qquad u \in I,$$

we can write

$$S_2 = -\sum_{i \in \mathbb{N}_+} \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \right) \left( P_i(y) - P_i(x) \right).$$

As is easy to see, the function $P_1$ is decreasing while the functions $P_i$, $i \geq 3$, are all increasing. Note also that

$$f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \geq f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \geq 0, \qquad i \geq 2.$$

Therefore

$$\begin{aligned}
S_2 &= -\sum_{i \geq 2} \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \right) \left( P_i(y) - P_i(x) \right) \\
&\leq -\left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \right) \sum_{i \geq 2} \left( P_i(y) - P_i(x) \right) \\
&= \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \right) \left( P_1(y) - P_1(x) \right) \leq 0,
\end{aligned}$$

as claimed. [For the middle inequality note that for $i \geq 3$ we have $P_i(y) - P_i(x) \geq 0$ and $f(1/(x+1)) - f(1/(x+i)) \geq f(1/(x+1)) - f(1/(x+2))$, while the $i = 2$ terms on the two sides coincide.] Thus $U f(y) - U f(x) \leq 0$, and the proof is complete. $\Box$

Remark. It is possible to show more generally that if $f \in L^1$ is non-decreasing (non-increasing), then $U f$ is non-increasing (non-decreasing). The proof, along the same lines as above, is left to the reader. $\Box$

Proposition 2.1.12 If $f \in B(I)$ is monotone, then

$$\operatorname{var} U f \leq \frac{1}{2} \operatorname{var} f.$$

The constant $1/2$ cannot be lowered.

Proof. Assume, with no loss of generality, that $f$ is non-decreasing. [Note that if $f$ is non-increasing, then $-f$ is non-decreasing while $\operatorname{var} U(-f) = \operatorname{var} U f$ and $\operatorname{var}(-f) = \operatorname{var} f$.] Then by Proposition 2.1.11 we have

$$\operatorname{var} U f = U f(0) - U f(1) = \sum_{i \in \mathbb{N}_+} \left( P_i(0) \, f\!\left(\frac{1}{i}\right) - P_i(1) \, f\!\left(\frac{1}{i+1}\right) \right).$$

Since $P_i(1) = 2 P_{i+1}(0)$, $i \in \mathbb{N}_+$, it follows that

$$\operatorname{var} U f = P_1(0) f(1) - \sum_{i \in \mathbb{N}_+} P_{i+1}(0) \, f\!\left(\frac{1}{i+1}\right).$$

As

$$P_1(0) = \sum_{i \in \mathbb{N}_+} P_{i+1}(0) = \frac{1}{2}$$

and

$$f\!\left(\frac{1}{i+1}\right) \geq f(0), \qquad i \in \mathbb{N}_+,$$

we finally obtain

$$\operatorname{var} U f \leq \frac{1}{2} \left( f(1) - f(0) \right) = \frac{1}{2} \operatorname{var} f.$$

Since for $f$ defined by $f(x) = 0$, $0 \leq x < 1$, and $f(1) = 1$ we have $\operatorname{var} U f = (\operatorname{var} f)/2$, it follows that the constant $1/2$ cannot be lowered. $\Box$

Corollary 2.1.13 If $f \in BV(I)$ is real-valued, then

$$\operatorname{var} U f \leq \frac{1}{2} \operatorname{var} f.$$

The constant $1/2$ cannot be lowered.

Proof. By Hahn's decomposition of a signed measure, for any $f \in BV(I)$ there exist monotone functions $f_1, f_2 \in B(I)$ such that $f = f_1 - f_2$ and $\operatorname{var} f = \operatorname{var} f_1 + \operatorname{var} f_2$. [To obtain this consider the signed measure $\mu$ on $\mathcal{B}_I$ defined by $\mu((a,b]) = f(b) - f(a)$, $a < b$, $a, b \in I$.] Then by Proposition 2.1.12 we have

$$\operatorname{var} U f = \operatorname{var}(U f_1 - U f_2) \leq \operatorname{var} U f_1 + \operatorname{var} U f_2 \leq \frac{1}{2} \left( \operatorname{var} f_1 + \operatorname{var} f_2 \right) = \frac{1}{2} \operatorname{var} f.$$

The optimality of the constant $1/2$ follows from Proposition 2.1.12. $\Box$

Proposition 2.1.14 We have

$$s(U f) \leq \left( 2\zeta(3) - \zeta(2) \right) s(f) \tag{2.1.17}$$

for any $f \in L(I)$. The constant $\theta = 2\zeta(3) - \zeta(2) = 0.7591797\cdots$ cannot be lowered.

Proof. For $x \neq y$, $x, y \in I$, we have

$$\frac{U f(y) - U f(x)}{y - x} = \sum_{i \in \mathbb{N}_+} \frac{P_i(y) - P_i(x)}{y - x} \, f\!\left(\frac{1}{x+i}\right) - \sum_{i \in \mathbb{N}_+} P_i(y) \, \frac{f\!\left(\frac{1}{y+i}\right) - f\!\left(\frac{1}{x+i}\right)}{\frac{1}{y+i} - \frac{1}{x+i}} \cdot \frac{1}{(x+i)(y+i)}. \tag{2.1.18}$$

Next, remark that

$$P_i(x) = \frac{i}{x+i+1} - \frac{i-1}{x+i}, \qquad i \in \mathbb{N}_+,$$

and then

$$\frac{P_i(y) - P_i(x)}{y - x} = \frac{i-1}{(x+i)(y+i)} - \frac{i}{(x+i+1)(y+i+1)}, \qquad i \in \mathbb{N}_+. \tag{2.1.19}$$

Hence

$$\sum_{i \in \mathbb{N}_+} \frac{P_i(y) - P_i(x)}{y - x} \, f\!\left(\frac{1}{x+i}\right) = \sum_{i \in \mathbb{N}_+} \frac{i}{(x+i+1)(y+i+1)} \left( f\!\left(\frac{1}{x+i+1}\right) - f\!\left(\frac{1}{x+i}\right) \right).$$

Assume that $x > y$. It then follows from (2.1.18) and (2.1.19) that

$$\frac{U f(y) - U f(x)}{y - x} \leq s(f) \sum_{i \in \mathbb{N}_+} \left( \frac{P_i(y)}{(y+i)^2} + \frac{i}{(y+i)(y+i+1)^3} \right).$$

Now, the function $g$ defined by

$$g(y) = \sum_{i \in \mathbb{N}_+} \frac{P_i(y)}{(y+i)^2}, \qquad y \in I,$$

is precisely $U h$ for $h(y) = y^2$, $y \in I$. Since $h$ is increasing, $g$ is decreasing by Proposition 2.1.11. Therefore for any $y \in I$ we have

$$\sum_{i \in \mathbb{N}_+} \left( \frac{P_i(y)}{(y+i)^2} + \frac{i}{(y+i)(y+i+1)^3} \right) \leq \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i^3(i+1)} + \frac{1}{(i+1)^3} \right)$$
$$= \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i^3} - \frac{1}{i^2} + \frac{1}{i} - \frac{1}{i+1} + \frac{1}{(i+1)^3} \right) = \zeta(3) - \zeta(2) + 1 + \zeta(3) - 1 = 2\zeta(3) - \zeta(2).$$

As clearly

$$\sup_{x, y \in I, \ x > y} \frac{U f(y) - U f(x)}{y - x} = s(U f),$$

we obtain (2.1.17). Finally, it is easy to check that for $f(x) = x$, $x \in I$, we have $s(f) = 1$ and $s(U f) = 2\zeta(3) - \zeta(2)$. The proof is complete. $\Box$

Proposition 2.1.15 We have

$$\|(U f)'\| \leq \left( 2\zeta(3) - \zeta(2) \right) \|f'\| \tag{2.1.20}$$

for any $f \in C^1(I)$. The constant $\theta = 2\zeta(3) - \zeta(2) = 0.7591797\cdots$ cannot be lowered.

Proof. Equations (2.1.19) and (2.1.18) show that for $f \in C^1(I)$ the series defining $U f$ can be differentiated term by term, since the series of the derivatives is uniformly convergent, it being dominated by a convergent series of positive constants (cf. further Subsection 2.2.1). Then (2.1.20) follows from (2.1.17) since for any $f \in C^1(I)$ we have $s(f) = \|f'\|$. $\Box$

Now we can state the result announced.

Proposition 2.1.16 The operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on any of the Banach spaces $BV(I)$, $L(I)$, and $C^1(I)$.

Proof. The result follows from Corollary 2.1.13 and Propositions 2.1.14 and 2.1.15, having in view that $U$ preserves the constant functions. In the case of $BV(I)$ we should note that for a complex-valued $f \in BV(I)$ we have

$$\max\left( \operatorname{var} \operatorname{Re} f, \operatorname{var} \operatorname{Im} f \right) \leq \operatorname{var} f \leq \operatorname{var} \operatorname{Re} f + \operatorname{var} \operatorname{Im} f.$$

Hence by Corollary 2.1.13 we have $\operatorname{var} U f \leq \operatorname{var} f$ for such an $f$. $\Box$
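The constant $\theta = 2\zeta(3) - \zeta(2)$ is attained for $f(x) = x$, for which $Uf$ is decreasing with steepest slope at $x = 0$; the extremal function of Proposition 2.1.12 gives $\operatorname{var} Uf = P_1(0) = 1/2$. Both facts admit a quick numerical check (step size and truncation below are our own choices):

```python
import math

def P(i, x):
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def U_id(x, N=4000):
    # U f for f(u) = u: sum_i P_i(x)/(x+i); the tail is O(1/N^2) since f(0) = 0
    return sum(P(i, x) / (x + i) for i in range(1, N + 1))

# theta = 2 zeta(3) - zeta(2), with zeta(3) summed directly (tail-corrected)
zeta3 = sum(1.0 / k**3 for k in range(1, 20001)) + 1.0 / (2.0 * 20000**2)
theta = 2.0 * zeta3 - math.pi**2 / 6.0

# slope of U f at x = 0 by a one-sided finite difference
h = 1e-5
slope = (U_id(0.0) - U_id(h)) / h
print(slope, theta)

# extremal function of Proposition 2.1.12: var U f = P_1(0) = 1/2
half = P(1, 0.0)
```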

2.1.4 A solution to Gauss' problem for probability measures with densities

Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. By Proposition 2.1.5, for any $n \in \mathbb{N}$ we have

$$\mu\left(\tau^{-n}(A)\right) = \int_A \frac{U^n f_0(x)}{x+1} \, dx, \qquad A \in \mathcal{B}_I, \tag{2.1.21}$$

with $f_0(x) = (x+1) F_0(x)$, $x \in I$, where $F_0 = d\mu/d\lambda$. We shall consider Gauss' problem in a more general form, namely, that of the asymptotic behaviour of $\mu(\tau^{-n}(A))$ as $n \to \infty$ for any $A \in \mathcal{B}_I$. Equation (2.1.21) shows that solving this more general Gauss problem for a given $\mu \in \mathrm{pr}(\mathcal{B}_I)$ amounts to studying the behaviour of the $n$th power of the Perron–Frobenius operator $U$ on a suitable Banach space. On account of the results obtained in Subsection 2.1.2 we can state the following result.

Proposition 2.1.17 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. We have

$$\lim_{n \to \infty} \sup_{A \in \mathcal{B}_I} \left| \mu\left(\tau^{-n}(A)\right) - \gamma(A) \right| = 0. \tag{2.1.22}$$

If $F_0 = d\mu/d\lambda \in BEV(I)$ then there exists a constant $C \in \mathbb{R}_+$ such that

$$\left| \mu\left(\tau^{-n}(A)\right) - \gamma(A) \right| \leq C q^n \gamma(A) \tag{2.1.23}$$

for any $n \in \mathbb{N}_+$ and $A \in \mathcal{B}_I$. Here $0 < q < 1$ is the constant occurring in Proposition 2.1.7(ii).

Proof. We have

$$\mu\left(\tau^{-n}(A)\right) - \gamma(A) = \int_A \frac{U^n f_0(x) - U^\infty f_0}{x+1} \, dx, \tag{2.1.24}$$

since

$$U^\infty f_0 = \int_I f_0 \, d\gamma = \frac{1}{\log 2} \int_I F_0 \, d\lambda = \frac{1}{\log 2},$$

and equation (2.1.22) follows by (2.1.15). If $F_0 \in BEV(I)$ then for some $C_0 \in \mathbb{R}_+$, by Proposition 2.1.7(ii) we have

$$\|U^n f_0 - U^\infty f_0\|_v \leq C_0 q^n \|f_0\|_v, \qquad n \in \mathbb{N}_+.$$

It then follows from Proposition 2.0.1(ii) that

$$\operatorname{ess\,sup} |U^n f_0 - U^\infty f_0| \leq C_0 q^n \|f_0\|_v, \qquad n \in \mathbb{N}_+. \tag{2.1.25}$$

Now, (2.1.23) follows from (2.1.24) and (2.1.25). $\Box$

Remark. As for $q$, we conjecture that its (optimal) value is $g^2 = (3 - \sqrt{5})/2 = 0.38196\cdots$, as in a further related result, namely, Corollary 2.5.7. $\Box$

In the next three sections we will take up Gauss' problem assuming that $F_0 = d\mu/d\lambda$ belongs to Banach spaces 'smaller' than $BEV(I)$.

2.1.5 Computing variances of certain sums

In this subsection, using properties of the Perron–Frobenius operator $U$ on $BEV(I)$, we give some results concerning the variances of certain sums of random variables constructed starting from either the $\bar a_\ell$, $\ell \in \mathbb{Z}$, or the $a_n$, $n \in \mathbb{N}_+$. These results will be used in Chapter 3.

Let $H$ be a real-valued function on $\mathbb{N}_+^{\mathbb{Z}}$. Set

$$H_\ell = H_1 \circ \bar\tau^{\,\ell-1}, \quad \ell \in \mathbb{Z}, \qquad \text{where } H_1 = H(\cdots, \bar a_{-2}, \bar a_{-1}, \bar a_0, \bar a_1, \bar a_2, \cdots).$$

Clearly, $(H_\ell)_{\ell \in \mathbb{Z}}$ is a strictly stationary process on $(I^2, \mathcal{B}_{I^2}, \bar\gamma)$. Set $S_0 = 0$, $S_n = \sum_{i=1}^n H_i$, $n \in \mathbb{N}_+$. We start with some well known results.

Theorem 2.1.18 If $\mathrm{E}_{\bar\gamma} H_1^2 < \infty$, $\mathrm{E}_{\bar\gamma} H_1 = 0$, and $\lim_{n \to \infty} \mathrm{E}_{\bar\gamma} H_1 H_n = 0$, then the finite or infinite limit $\lim_{n \to \infty} \mathrm{E}_{\bar\gamma} S_n^2$ exists. We have $\lim_{n \to \infty} \mathrm{E}_{\bar\gamma} S_n^2 < \infty$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.

This is a special case of Theorem 18.2.2 in Ibragimov and Linnik (1971). $\Box$

Proposition 2.1.19 If $\mathrm{E}_{\bar\gamma} H_1^2 < \infty$, $\mathrm{E}_{\bar\gamma} H_1 = 0$, and the series

$$\sigma^2 = \mathrm{E}_{\bar\gamma} H_1^2 + 2 \sum_{n \in \mathbb{N}_+} \mathrm{E}_{\bar\gamma} H_1 H_{n+1} \tag{2.1.26}$$

converges absolutely, then $\sigma^2 \geq 0$ and

$$\frac{1}{n}\,\mathrm{E}_{\bar\gamma} S_n^2 = \sigma^2 + o(1) \tag{2.1.27}$$

as $n \to \infty$. If the stronger assumption $\sum_{n \in \mathbb{N}_+} n\,|\mathrm{E}_{\bar\gamma} H_1 H_{n+1}| < \infty$ holds, then

$$\frac{1}{n}\,\mathrm{E}_{\bar\gamma} S_n^2 = \sigma^2 + O(n^{-1}) \tag{2.1.28}$$

as $n \to \infty$.

Proof. By strict stationarity, for any $n > 1$ we have

$$\mathrm{E}_{\bar\gamma} S_n^2 = \sum_{i,j=1}^n \mathrm{E}_{\bar\gamma} H_i H_j = n\,\mathrm{E}_{\bar\gamma} H_1^2 + 2 \sum_{j=1}^{n-1} (n-j)\,\mathrm{E}_{\bar\gamma} H_1 H_{j+1}.$$

Therefore

$$\left| \frac{1}{n}\,\mathrm{E}_{\bar\gamma} S_n^2 - \sigma^2 \right| \leq \frac{2}{n} \sum_{j=1}^{n-1} j\,|\mathrm{E}_{\bar\gamma} H_1 H_{j+1}| + 2 \sum_{j \geq n} |\mathrm{E}_{\bar\gamma} H_1 H_{j+1}|,$$

and the right-hand side is $o(1)$ as $n \to \infty$ when $\sum_{n \in \mathbb{N}_+} |\mathrm{E}_{\bar\gamma} H_1 H_{n+1}| < \infty$ (note that $\sum_{n \in \mathbb{N}_+} |u_n| < \infty$ implies $\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^n j\,|u_j| = 0$), so that (2.1.27) holds. Finally, since

$$\frac{1}{n} \sum_{j=1}^{n-1} j\,|\mathrm{E}_{\bar\gamma} H_1 H_{j+1}| + \sum_{j \geq n} |\mathrm{E}_{\bar\gamma} H_1 H_{j+1}| \leq \frac{1}{n} \sum_{j \in \mathbb{N}_+} j\,|\mathrm{E}_{\bar\gamma} H_1 H_{j+1}|,$$

equation (2.1.28) holds, too, under our stronger assumption. $\Box$

Corollary 2.1.20 Assume that $\mathrm{E}_{\bar\gamma} H_1^2 < \infty$, $\mathrm{E}_{\bar\gamma} H_1 = 0$, and

$$\sum_{n \in \mathbb{N}_+} n\,|\mathrm{E}_{\bar\gamma} H_1 H_{n+1}| < \infty.$$

Then $\sigma = 0$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.

Proposition 2.1.21 If $\mathrm{E}_{\bar\gamma} H_1^2 < \infty$, $\mathrm{E}_{\bar\gamma} H_1 = 0$, and

$$\sum_{n \in \mathbb{N}_+} \left( \mathrm{E}_{\bar\gamma} \left[ H_1 - \mathrm{E}_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n) \right]^2 \right)^{1/2} < \infty, \tag{2.1.29}$$

then series (2.1.26) converges absolutely.

On account of Corollary 1.3.15, this is a transcription of part of Theorem 18.6.1 in Ibragimov and Linnik (1971) for the special case of the doubly infinite sequence $(\bar a_\ell)_{\ell \in \mathbb{Z}}$. $\Box$

Note that both the conditional mean value occurring in (2.1.29) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I^2, \mathcal{B}_{I^2})$ defined on $\Omega^2$ (thus a.e. in $I^2$) by

$$h([i_1, i_2, \cdots], [i_0, i_{-1}, \cdots]) = H(\cdots, i_{-1}, i_0, i_1, \cdots)$$

for any $(i_\ell)_{\ell \in \mathbb{Z}} \in \mathbb{N}_+^{\mathbb{Z}}$. Clearly,

$$\mathrm{E}_{\bar\gamma} H_1 = \int_{I^2} h \, d\bar\gamma, \qquad \mathrm{E}_{\bar\gamma} H_1^2 = \int_{I^2} h^2 \, d\bar\gamma,$$

$$\mathrm{E}_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n)(\omega, \theta) = \frac{1}{\bar\gamma\left(I^2(i_{-n}, \cdots, i_n)\right)} \int_{I^2(i_{-n}, \cdots, i_n)} h \, d\bar\gamma$$

for $(\omega, \theta) \in I^2(i_{-n}, \cdots, i_n)$, where

$$I^2(i_{-n}, \cdots, i_n) = I(i_1, \cdots, i_n) \times I(i_0, i_{-1}, \cdots, i_{-n})$$

for any $i_k \in \mathbb{N}_+$, $-n \leq k \leq n$, $n \in \mathbb{N}_+$, and

$$\sigma^2 = \int_{I^2} h^2 \, d\bar\gamma + 2 \sum_{n \in \mathbb{N}_+} \int_{I^2} h \, (h \circ \bar\tau^n) \, d\bar\gamma.$$

Condition (2.1.29) is fulfilled for a large class of functions $h$, as shown by the following result.

Proposition 2.1.22 Put

$$c_n = \sup \left| h(\omega, \theta) - h(\omega', \theta') \right|, \qquad n \in \mathbb{N}_+,$$

where the upper bound is taken over all $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ and $i_k \in \mathbb{N}_+$, $-n \leq k \leq n$. Assume that $\mathrm{E}_{\bar\gamma} H_1^2 = \int_{I^2} h^2 \, d\bar\gamma < \infty$ and $\sum_{n \in \mathbb{N}_+} c_n < \infty$. Then (2.1.29) holds.

Proof. For any $n \in \mathbb{N}_+$ we have

$$\begin{aligned}
&\mathrm{E}_{\bar\gamma} \left[ H_1 - \mathrm{E}_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n) \right]^2 \\
&= \sum_{i_{-n}, \cdots, i_n \in \mathbb{N}_+} \int_{I^2(i_{-n}, \cdots, i_n)} \left( \frac{1}{\bar\gamma\left(I^2(i_{-n}, \cdots, i_n)\right)} \int_{I^2(i_{-n}, \cdots, i_n)} \left( h(\omega, \theta) - h(\omega', \theta') \right) \bar\gamma(d\omega', d\theta') \right)^2 \bar\gamma(d\omega, d\theta) \\
&\leq \sum_{i_{-n}, \cdots, i_n \in \mathbb{N}_+} \bar\gamma\left(I^2(i_{-n}, \cdots, i_n)\right) c_n^2 = c_n^2.
\end{aligned}$$

Hence the series occurring in (2.1.29) is dominated by the convergent series $\sum_{n \in \mathbb{N}_+} c_n$, which completes the proof. $\Box$
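The role of the coboundary alternative in Theorem 2.1.18 and Corollary 2.1.20 can be illustrated on a toy strictly stationary sequence, unrelated to continued fractions (the example and all parameters are ours): for $H_i = \xi_i - \xi_{i+1}$ with $(\xi_i)$ i.i.d. Rademacher signs, series (2.1.26) gives $\sigma^2 = 0$, $H_1$ is the coboundary $g \circ T - g$ with $g = -\xi_1$ and $T$ the shift, and $\mathrm{E} S_n^2$ stays bounded instead of growing linearly in $n$.

```python
import random

random.seed(1)

# H_i = xi_i - xi_{i+1}:  E H_1^2 = 2, E H_1 H_2 = -1, E H_1 H_{n+1} = 0
# for n >= 2, so (2.1.26) gives sigma^2 = 2 + 2 * (-1) = 0.
sigma2 = 2.0 + 2.0 * (-1.0)

n, trials = 50, 20000
total = 0.0
for _ in range(trials):
    xs = [random.choice((-1.0, 1.0)) for _ in range(n + 1)]
    S = xs[0] - xs[n]            # S_n telescopes to xi_1 - xi_{n+1}
    total += S * S
mean_Sn2 = total / trials        # stays near 2, not near n * const
print(sigma2, mean_Sn2)
```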

Remark. If for some positive constants $c$ and $\varepsilon$ we have

$$\left| h(\omega, \theta) - h(\omega', \theta') \right| \leq c \left( \left| \frac{1}{\omega} - \frac{1}{\omega'} \right| + \left| \frac{1}{\theta} - \frac{1}{\theta'} \right| \right)^{\varepsilon} \tag{2.1.30}$$

for any $(\omega, \theta), (\omega', \theta') \in \Omega^2$, then the assumption of Proposition 2.1.22 holds. Indeed, for $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ we have

$$\left| \frac{1}{\theta} - \frac{1}{\theta'} \right| \leq \sup_{i_{-1}, \cdots, i_{-n} \in \mathbb{N}_+} \lambda\left( I(i_{-1}, \cdots, i_{-n}) \right) = (F_n F_{n+1})^{-1}$$

and similarly

$$\left| \frac{1}{\omega} - \frac{1}{\omega'} \right| \leq (F_{n-1} F_n)^{-1}.$$

Hence

$$\left| h(\omega, \theta) - h(\omega', \theta') \right| \leq c\, 2^{\varepsilon} (F_{n-1} F_n)^{-\varepsilon}, \qquad n \in \mathbb{N}_+,$$

for any $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$, $i_k \in \mathbb{N}_+$, $-n \leq k \leq n$, and clearly the series $\sum_{n \in \mathbb{N}_+} (F_{n-1} F_n)^{-\varepsilon}$ is convergent. In particular, (2.1.30) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,

$$\sup_{(\omega,\theta),(\omega',\theta') \in \Omega^2} \frac{|h(\omega, \theta) - h(\omega', \theta')|}{\left( |\omega - \omega'| + |\theta - \theta'| \right)^{\varepsilon}} < \infty. \qquad \Box$$

The results above clearly apply to the special case where $H$ is a real-valued function on $\mathbb{N}_+^{\mathbb{N}_+}$. In this case we set

$$H_n = H(a_n, a_{n+1}, \cdots) = H_1 \circ \tau^{n-1}, \qquad n \in \mathbb{N}_+.$$

Then $(H_n)_{n \in \mathbb{N}_+}$ is a strictly stationary sequence on $(I, \mathcal{B}_I, \gamma)$. Theorem 2.1.18, Proposition 2.1.19, Corollary 2.1.20, and Proposition 2.1.21 hold in the present case if in their statements we replace $\bar\gamma$ by $\gamma$, $I^2$ by $I$, $\bar\tau$ by $\tau$, and inequality (2.1.29) by

$$\sum_{n \in \mathbb{N}_+} \left( \mathrm{E}_\gamma \left[ H_1 - \mathrm{E}_\gamma(H_1 \mid a_1, \cdots, a_n) \right]^2 \right)^{1/2} < \infty. \tag{2.1.31}$$

In the present case the conditional mean value occurring in (2.1.31) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I, \mathcal{B}_I)$ defined on $\Omega$ (thus a.e. in $I$) by

$$h([i_1, i_2, \cdots]) = H(i_1, i_2, \cdots)$$

for any $(i_\ell)_{\ell \in \mathbb{N}_+} \in \mathbb{N}_+^{\mathbb{N}_+}$. Clearly,

$$\mathrm{E}_\gamma H_1 = \int_I h \, d\gamma, \qquad \mathrm{E}_\gamma H_1^2 = \int_I h^2 \, d\gamma,$$

$$\mathrm{E}_\gamma(H_1 \mid a_1, \cdots, a_n)(\omega) = \frac{1}{\gamma\left(I(i^{(n)})\right)} \int_{I(i^{(n)})} h \, d\gamma$$

for any $\omega \in I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, $n \in \mathbb{N}_+$, and

$$\sigma^2 = \int_I h^2 \, d\gamma + 2 \sum_{n \in \mathbb{N}_+} \int_I h \, (h \circ \tau^n) \, d\gamma = \int_I h^2 \, d\gamma + 2 \sum_{n \in \mathbb{N}_+} \int_I h \, U^n h \, d\gamma$$

[the last equation is a consequence of (2.1.2)].

It follows from Proposition 2.1.22 that condition (2.1.31) is fulfilled if we assume that $\int_I h^2 \, d\gamma < \infty$ and $\sum_{n \in \mathbb{N}_+} c_n < \infty$, where

$$c_n = \sup_{i^{(n)} \in \mathbb{N}_+^n} \ \sup_{\omega, \omega' \in I(i^{(n)})} \left| h(\omega) - h(\omega') \right|, \qquad n \in \mathbb{N}_+.$$

In turn, the second assumption holds if for some positive constants $c$ and $\varepsilon$ we have

$$\left| h(\omega) - h(\omega') \right| \leq c \left| \frac{1}{\omega} - \frac{1}{\omega'} \right|^{\varepsilon}, \qquad \omega, \omega' \in \Omega. \tag{2.1.32}$$

In particular, (2.1.32) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,

$$\sup_{\omega, \omega' \in \Omega} \frac{|h(\omega) - h(\omega')|}{|\omega - \omega'|^{\varepsilon}} < \infty.$$

To indicate another class of functions $h$ for which (2.1.31) holds, let us recall that a function $h : I \to \mathbb{C}$ is said to be of bounded $p$-variation, $p \geq 1$, on $A \subset I$ if and only if

$$\operatorname{var}^{(p)}_A h := \sup \sum_{i=1}^{k-1} \left| h(t_{i+1}) - h(t_i) \right|^p < \infty,$$

the supremum being taken over $t_1 < \cdots < t_k$, $t_i \in A$, $1 \leq i \leq k$, and $k \geq 2$. We write simply $\operatorname{var}^{(p)} h$ for $\operatorname{var}^{(p)}_I h$. If $\operatorname{var}^{(p)} h < \infty$ then $h$ is called a function of bounded $p$-variation. Clearly, $\operatorname{var}^{(1)} h = \operatorname{var} h$, and a function of bounded variation is also a function of bounded $p$-variation for any $p > 1$. (The converse of this assertion is in general not true.) More generally, a function of bounded $p$-variation, $p \geq 1$, is also a function of bounded $p'$-variation for any $p' > p$.

Proposition 2.1.23 If $h$ is a function of bounded $p$-variation on $\Omega$, then (2.1.31) holds.

Proof. Without any loss of generality, on account of the last assertion above we can assume that $p \geq 2$. It is obvious that

$$\left| h(\omega) - h(\omega') \right| \leq \left( \operatorname{var}^{(p)}_A h \right)^{1/p}$$

for any $A \subset \Omega$ and $\omega, \omega' \in A$. Then

$$\begin{aligned}
\left( \mathrm{E}_\gamma \left[ H_1 - \mathrm{E}_\gamma(H_1 \mid a_1, \cdots, a_n) \right]^2 \right)^{1/2} &\leq \left( \mathrm{E}_\gamma \left| H_1 - \mathrm{E}_\gamma(H_1 \mid a_1, \cdots, a_n) \right|^p \right)^{1/p} \\
&= \left( \sum_{i^{(n)} \in \mathbb{N}_+^n} \int_{I(i^{(n)})} \left| h(\omega) - \frac{1}{\gamma\left(I(i^{(n)})\right)} \int_{I(i^{(n)})} h(\omega') \, \gamma(d\omega') \right|^p \gamma(d\omega) \right)^{1/p} \\
&\leq \left( \sum_{i^{(n)} \in \mathbb{N}_+^n} \gamma\left(I(i^{(n)})\right) \operatorname{var}^{(p)}_{I(i^{(n)})} h \right)^{1/p} \\
&\leq \left( \max_{i^{(n)} \in \mathbb{N}_+^n} \gamma\left(I(i^{(n)})\right) \right)^{1/p} \left( \operatorname{var}^{(p)}_\Omega h \right)^{1/p} \\
&\leq \left( \frac{(F_n F_{n+1})^{-1}}{\log 2} \right)^{1/p} \left( \operatorname{var}^{(p)}_\Omega h \right)^{1/p}.
\end{aligned}$$

Hence the series occurring in (2.1.31) is dominated by

$$\left( \frac{\operatorname{var}^{(p)}_\Omega h}{\log 2} \right)^{1/p} \sum_{n \in \mathbb{N}_+} (F_n F_{n+1})^{-1/p},$$

and clearly the last series is convergent. $\Box$

It is important to know when $\sigma^2$, defined in terms of $H$ or, equivalently, in terms of $h$, is non-zero. In the result below the function $h$, which is only defined on $\Omega$, is considered as the representative of a class of $\lambda$-indistinguishable

functions on $I$, after having been extended in an arbitrary manner to the whole of $I$.

Proposition 2.1.24 Assume that $h \in L^2_\gamma(I)$, $\int_I h \, d\gamma = 0$, and $U h \in BEV(I)$. Then the series

$$\sigma^2 = \int_I h^2 \, d\gamma + 2 \sum_{n \in \mathbb{N}_+} \int_I h \, U^n h \, d\gamma \tag{2.1.33}$$

converges absolutely, and we have $\sigma = 0$ if and only if there exists $b \in L^2_\gamma(I)$ such that $h = b \circ \tau - b$ a.e. in $I$. In particular, if $h$ is essentially unbounded then $\sigma \neq 0$.

Proof. By (2.0.2) and Proposition 2.1.7(ii) we have

$$\operatorname{ess\,sup} |U^n h| \leq \|U^n h\|_v \leq q^{n-1} \|U h\|_v, \qquad n \in \mathbb{N}_+, \tag{2.1.34}$$

for some positive $q < 1$. This clearly entails the absolute convergence of both series (2.1.33) and $\sum_{n \in \mathbb{N}_+} n \left| \int_I h \, U^n h \, d\gamma \right|$. Then Corollary 2.1.20 completes the proof of the first two assertions concerning $\sigma$.

Without appealing to Corollary 2.1.20, the characterization of the case $\sigma = 0$ can be given a direct proof as follows. Put $h_1 = \sum_{n \in \mathbb{N}_+} U^n h$. By (2.1.34) this series converges in $BEV(I)$, and we have $h_1 = U h + U h_1 = U(h + h_1)$. Writing $g = h + h_1$ we note that $U g \in BEV(I)$ and

$$\sigma^2 = \int_I \left( h^2 + 2 h h_1 \right) d\gamma = \int_I \left( g^2 - (U g)^2 \right) d\gamma.$$

By (2.1.2) we have

$$\int_I (U g)^2 \, d\gamma = \int_I \left( (U g) \circ \tau \right) g \, d\gamma \qquad \text{and} \qquad \int_I (U g)^2 \, d\gamma = \int_I \left( (U g) \circ \tau \right)^2 d\gamma.$$

[Note that (2.1.2) implies in general that $\int_I f \, d\gamma = \int_I f \circ \tau \, d\gamma$, $f \in L^1_\gamma$, which also follows from the fact that $\tau$ is $\gamma$-preserving.] Consequently, we can write

$$\sigma^2 = \int_I g^2 \, d\gamma - 2 \int_I \left( (U g) \circ \tau \right) g \, d\gamma + \int_I \left( (U g) \circ \tau \right)^2 d\gamma = \int_I \left( g - (U g) \circ \tau \right)^2 d\gamma.$$

Now, if $\sigma = 0$ then $g = (U g) \circ \tau$ a.e. in $I$. Hence

$$h = (U g) \circ \tau - U g \quad \text{a.e. in } I, \tag{2.1.35}$$

that is, we can take $b = U g$. Conversely, if $h = b \circ \tau - b$ a.e. in $I$ then $S_n = b \circ \tau^n - b$ a.e. in $I$ for any $n \in \mathbb{N}_+$. Hence

$$\frac{1}{n}\,\mathrm{E}_\gamma S_n^2 \leq \frac{4}{n} \int_I b^2 \, d\gamma \to 0 \quad \text{as } n \to \infty,$$

that is, $\sigma = 0$. Finally, since $U g \in BEV(I)$ as shown above, equation (2.1.35) cannot hold in the case where $h$ is essentially unbounded, that is, we cannot have $\sigma = 0$. $\Box$

Corollary 2.1.25 Let $f : \mathbb{N}_+ \to \mathbb{R}$ such that $\mathrm{E}_\gamma f^2(a_1) < \infty$, $\mathrm{E}_\gamma f(a_1) = 0$. Put

$$\sigma^2 = \mathrm{E}_\gamma f^2(a_1) + 2 \sum_{n \in \mathbb{N}_+} \mathrm{E}_\gamma f(a_1) f(a_{n+1}). \tag{2.1.36}$$

Then $\sigma = 0$ if and only if $f = 0$.

Proof. As a special case of (2.1.26), with (2.1.31) trivially satisfied, series (2.1.36) is absolutely convergent. Moreover, in the present case $h$ is defined by $h(\omega) = f(\lfloor 1/\omega \rfloor)$, $\omega \in \Omega$, and by hypothesis $h \in L^2_\gamma(I)$ and $\int_I h \, d\gamma = 0$. We then have

$$U h(\omega) = \sum_{i \in \mathbb{N}_+} P_i(\omega) f(i), \qquad \omega \in \Omega,$$

and

$$v(U h) \leq \sum_{i \in \mathbb{N}_+} |f(i)| \operatorname{var} P_i \leq C \sum_{i \in \mathbb{N}_+} \frac{|f(i)|}{i^2}$$

for some $C > 0$. The last series is convergent since $\mathrm{E}_\gamma |f(a_1)| < \infty$, so that $U h \in BEV(I)$. Then by Proposition 2.1.24 we have $\sigma = 0$ if and only if there exists $b \in L^2_\gamma(I)$ such that $h = b \circ \tau - b$ a.e. in $I$, and we have to show that this happens if and only if $f = 0$. Clearly, if $f = 0$ then $\sigma = 0$. To prove the converse we first note that

$$U h = U(b \circ \tau) - U b = b - U b \quad \text{a.e. in } I.$$

This equation holds for $b$ equal to $h_1 = \sum_{n \in \mathbb{N}_+} U^n h \in BEV(I)$. Putting $b = b_1 + h_1$ we get $b_1 = U b_1$. But by Proposition 2.1.7 the last equation
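For a concrete instance of Corollary 2.1.25, take $f(i) = 1_{\{i=1\}} - p$ with $p = \gamma(a_1 = 1) = \log(4/3)/\log 2$, so that $h = 1_{(1/2,1]} - p$ and $Uh(x) = P_1(x) - p$. The sketch below evaluates series (2.1.33) by iterating $U$ on a grid (all discretization parameters are our own assumptions) and finds $\sigma^2 > 0$, as the corollary requires for $f \neq 0$; the first correlation term $\int_I h\,Uh\,d\gamma$ has the closed form $\left[(1-p)\log(10/9) - p\log(6/5)\right]/\log 2 \approx -0.02025$, which serves as a cross-check.

```python
import math

LOG2 = math.log(2.0)
p = math.log(4.0 / 3.0) / LOG2        # gamma(a_1 = 1)
M, N = 400, 300                        # grid size, series truncation (ours)

def P(i, x):
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def interp(vals, t):
    s = t * (M - 1)
    k = min(int(s), M - 2)
    w = s - k
    return vals[k] * (1.0 - w) + vals[k + 1] * w

def apply_U(vals):
    out = []
    for j in range(M):
        x = j / (M - 1)
        s = sum(P(i, x) * interp(vals, 1.0 / (x + i)) for i in range(1, N + 1))
        s += interp(vals, 0.0) * (x + 1.0) / (x + N + 1.0)
        out.append(s)
    return out

def int_h_times(vals):
    # \int h g dgamma with h = 1_{(1/2,1]} - p and g given on the grid
    def simpson(a, b, coef, panels=200):
        hstep = (b - a) / (2 * panels)
        tot = interp(vals, a) / (a + 1.0) + interp(vals, b) / (b + 1.0)
        for k in range(1, 2 * panels):
            t = a + k * hstep
            tot += (4 if k % 2 else 2) * interp(vals, t) / (t + 1.0)
        return coef * tot * hstep / (3.0 * LOG2)
    return simpson(0.0, 0.5, -p) + simpson(0.5, 1.0, 1.0 - p)

# h = 1_{a_1 = 1} - p, so U h(x) = P_1(x) - p is already smooth
g = [P(1, j / (M - 1)) - p for j in range(M)]
sigma2 = p * (1.0 - p)                 # \int h^2 dgamma = p(1 - p)
terms = []
for _ in range(20):
    c = int_h_times(g)                 # \int h U^n h dgamma
    terms.append(c)
    sigma2 += 2.0 * c
    g = apply_U(g)
print(sigma2, terms[0])
```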

2.2 Wirsing's solution to Gauss' problem

2.2.1 Elementary considerations

Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any $n \in \mathbb{N}$ put

$$F_n(x) = \mu(\tau^n < x), \qquad x \in I,$$

with $\tau^0 = $ the identity map. As $(\tau^n < x) = \tau^{-n}((0, x))$, by Proposition 2.1.5 we have

$$F_n(x) = \int_0^x \frac{U^n f_0(u)}{u+1} \, du, \qquad n \in \mathbb{N}, \ x \in I, \tag{2.2.1}$$

with $f_0(x) = (x+1) F_0(x)$, $x \in I$, where $F_0 = d\mu/d\lambda$. [Clearly, (2.2.1) is a special case of (2.1.21).] In this subsection we will assume that $F_0 \in C^1(I)$. In other words, we study the behaviour of $U^n$ as $n \to \infty$, assuming that the domain of $U$ is $C^1(I)$.

Let $f \in C^1(I)$. Then

$$U f(x) = \sum_{i \in \mathbb{N}_+} P_i(x) \, f\!\left(\frac{1}{x+i}\right) = \sum_{i \in \mathbb{N}_+} \left( \frac{i}{x+i+1} - \frac{i-1}{x+i} \right) f\!\left(\frac{1}{x+i}\right), \qquad x \in I,$$

can be differentiated term by term to give

$$\begin{aligned}
(U f)'(x) &= -\sum_{i \in \mathbb{N}_+} \left( \frac{i}{(x+i+1)^2} - \frac{i-1}{(x+i)^2} \right) f\!\left(\frac{1}{x+i}\right) - \sum_{i \in \mathbb{N}_+} \left( \frac{i}{x+i+1} - \frac{i-1}{x+i} \right) \frac{1}{(x+i)^2} \, f'\!\left(\frac{1}{x+i}\right) \\
&= -\sum_{i \in \mathbb{N}_+} \left( \frac{i}{(x+i+1)^2} \left( f\!\left(\frac{1}{x+i}\right) - f\!\left(\frac{1}{x+i+1}\right) \right) + \frac{x+1}{(x+i)^3 (x+i+1)} \, f'\!\left(\frac{1}{x+i}\right) \right), \qquad x \in I,
\end{aligned}$$

since the series of derivatives is uniformly convergent, it being dominated by a convergent series of positive constants. Hence

$$(U f)' = -V f', \qquad f \in C^1(I), \tag{2.2.2}$$

where $V : C(I) \to C(I)$ is defined by

$$V g(x) = \sum_{i \in \mathbb{N}_+} \left( \frac{i}{(x+i+1)^2} \int_{1/(x+i+1)}^{1/(x+i)} g(u) \, du + \frac{x+1}{(x+i)^3 (x+i+1)} \, g\!\left(\frac{1}{x+i}\right) \right), \qquad g \in C(I), \ x \in I.$$

Clearly,

$$(U^n f)' = (-1)^n V^n f', \qquad n \in \mathbb{N}_+, \ f \in C^1(I). \tag{2.2.3}$$

We are going to show that $V^n$ takes certain functions into functions with very small values when $n \in \mathbb{N}_+$ is large.

Proposition 2.2.1 There are positive constants $v > 0.29017$ and $w < 0.30796$, and a real-valued function $\varphi \in C(I)$, such that

$$v \varphi \leq V \varphi \leq w \varphi.$$

Proof. Let $h : \mathbb{R}_+ \to \mathbb{R}$ be a continuous bounded function such that $\lim_{x \to \infty} h(x)/x = 0$. We look for a function $g : (0, 1] \to \mathbb{R}$ such that $U g = h$, assuming that the equation

$$U g(x) = \sum_{i \in \mathbb{N}_+} P_i(x) \, g\!\left(\frac{1}{x+i}\right) = h(x) \tag{2.2.4}$$

holds for $x \in \mathbb{R}_+$. Then (2.2.4) yields

$$\frac{h(x)}{x+1} - \frac{h(x+1)}{x+2} = \frac{1}{(x+1)(x+2)} \, g\!\left(\frac{1}{x+1}\right), \qquad x \in \mathbb{R}_+.$$

Hence

$$g(u) = \left( \frac{1}{u} + 1 \right) h\!\left( \frac{1}{u} - 1 \right) - \frac{1}{u} \, h\!\left( \frac{1}{u} \right), \qquad u \in (0, 1],$$

and we indeed have $U g = h$, since

$$U g(x) = (x+1) \sum_{i \in \mathbb{N}_+} \left( \frac{h(x+i-1)}{x+i} - \frac{h(x+i)}{x+i+1} \right) = (x+1) \left( \frac{h(x)}{x+1} - \lim_{i \to \infty} \frac{h(x+i)}{x+i+1} \right) = h(x), \qquad x \in \mathbb{R}_+.$$

In particular, for any fixed $a \in I$ we consider the function $h_a : \mathbb{R}_+ \to \mathbb{R}$ defined by

$$h_a(x) = \frac{1}{x+a+1}, \qquad x \in \mathbb{R}_+.$$

We have just seen that the function $g_a : (0, 1] \to \mathbb{R}$ defined by

$$g_a(x) = \left( \frac{1}{x} + 1 \right) h_a\!\left( \frac{1}{x} - 1 \right) - \frac{1}{x} \, h_a\!\left( \frac{1}{x} \right) = \frac{x+1}{ax+1} - \frac{1}{(a+1)x+1}, \qquad x \in (0, 1],$$

satisfies $U g_a(x) = h_a(x)$, $x \in I$. We come to $V$ via (2.2.2). Setting

$$\varphi_a(x) = g_a'(x) = \frac{1-a}{(ax+1)^2} + \frac{a+1}{((a+1)x+1)^2}, \qquad x \in I,$$

we have

$$V \varphi_a(x) = -(U g_a)'(x) = -h_a'(x) = \frac{1}{(x+a+1)^2}, \qquad x \in I.$$

Let us choose $a$ by asking that

$$\frac{\varphi_a}{V \varphi_a}(0) = \frac{\varphi_a}{V \varphi_a}(1).$$

This amounts to

$$(a+1)^3 (2a+1) + (a-1)(a+2)^2 = 0, \quad \text{that is,} \quad 2(a+1)^4 - 3(a+1) - 2 = 0,$$

which yields as unique acceptable solution $a = 0.3126597\cdots$. For this value of $a$ the function $\varphi_a / V \varphi_a$ attains its maximum, equal to $2(a+1)^2 = 3.44615\cdots$, at $x = 0$ and at $x = 1$, and has a minimum equal to

$$m(a) = \left( (a+1) + (\delta - 1)\left(1 - a - a^2\right) \right)^2 \left( (1-a) + \frac{a+1}{\delta^2} \right) = 3.247229\cdots$$

at $x = (\delta - 1)/\left(1 - a(\delta - 1)\right) = 0.3655\cdots$, where

$$\delta = \left( \frac{a(a+1)(a+2)}{(1-a)\left(1-a-a^2\right)} \right)^{1/3} = 1.328024\cdots.$$

It follows that for $\varphi = \varphi_a$ with $a = 0.3126597\cdots$ we have

$$\frac{\varphi}{2(a+1)^2} \leq V \varphi \leq \frac{\varphi}{m(a)},$$

that is, $v \varphi \leq V \varphi \leq w \varphi$, where

$$v = \frac{1}{2(a+1)^2} > 0.29017, \qquad w = \frac{1}{m(a)} < 0.30796. \qquad \Box$$
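The numerical constants in the proof of Proposition 2.2.1 can be reproduced directly; the following verification sketch (bisection and grid search are our own devices) solves $2(a+1)^4 - 3(a+1) - 2 = 0$ and evaluates $\varphi_a / V\varphi_a = \varphi_a(x)(x+a+1)^2$ over $I$.

```python
# Solve 2(a+1)^4 - 3(a+1) - 2 = 0 on (0, 1) by bisection
def q(a):
    return 2.0 * (a + 1.0)**4 - 3.0 * (a + 1.0) - 2.0

lo, hi = 0.0, 1.0            # q(0) = -3 < 0 < 24 = q(1), q increasing
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if q(mid) > 0.0:
        hi = mid
    else:
        lo = mid
a = 0.5 * (lo + hi)

def phi(x):
    # phi_a(x) = (1-a)/(ax+1)^2 + (a+1)/((a+1)x+1)^2
    return (1.0 - a) / (a * x + 1.0)**2 + (a + 1.0) / ((a + 1.0) * x + 1.0)**2

def ratio(x):
    # phi_a / V phi_a, since V phi_a(x) = 1/(x+a+1)^2
    return phi(x) * (x + a + 1.0)**2

m_a = min(ratio(k / 100000.0) for k in range(100001))
v = 1.0 / (2.0 * (a + 1.0)**2)
w = 1.0 / m_a
print(a, v, w, m_a)
```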

Remark. As noted by Wirsing (1974, p. 513), a better choice of $\varphi$ is $\varphi = 8\varphi_{a_1} - 7\varphi_{a_2}$ with $a_1 = 0.6247$ and $a_2 = 0.7$, which yields $v = 0.3020$, $w = 0.3043$. $\Box$

Corollary 2.2.2 Let $f_0 \in C^1(I)$ such that $f_0' > 0$. Put

$$\alpha = \min_{x \in I} \frac{\varphi(x)}{f_0'(x)}, \qquad \beta = \max_{x \in I} \frac{\varphi(x)}{f_0'(x)}.$$

Then

$$\frac{\alpha}{\beta} \, v^n f_0' \leq V^n f_0' \leq \frac{\beta}{\alpha} \, w^n f_0', \qquad n \in \mathbb{N}_+. \tag{2.2.4$'$}$$

Proof. Since $V$ is a positive operator (that is, it takes non-negative functions into non-negative functions) we have

$$v^n \varphi \leq V^n \varphi \leq w^n \varphi, \qquad n \in \mathbb{N}_+.$$

Noting that $\alpha f_0' \leq \varphi \leq \beta f_0'$, we then can write

$$\frac{\alpha}{\beta} \, v^n f_0' \leq \frac{1}{\beta} \, v^n \varphi \leq \frac{1}{\beta} \, V^n \varphi \leq V^n f_0' \leq \frac{1}{\alpha} \, V^n \varphi \leq \frac{1}{\alpha} \, w^n \varphi \leq \frac{\beta}{\alpha} \, w^n f_0', \qquad n \in \mathbb{N}_+,$$

which shows that (2.2.4$'$) holds. $\Box$

Remark. A similar result holds if $f_0 \in C^1(I)$ and $f_0' < 0$. $\Box$

Theorem 2.2.3 (Near-optimal solution to Gauss' problem) Let $f_0 \in C^1(I)$ such that $f_0' > 0$. For any $n \in \mathbb{N}_+$ and $x \in I$ we have

$$\frac{(\log 2)^2 \, \alpha \, \min_{x \in I} f_0'(x)}{2\beta} \, v^n \, G(x)(1 - G(x)) \leq \left| \mu(\tau^n < x) - G(x) \right| \leq \frac{(\log 2)^2 \, \beta \, \max_{x \in I} f_0'(x)}{\alpha} \, w^n \, G(x)(1 - G(x)),$$

where $\alpha$, $\beta$, $v$, and $w$ are defined in Proposition 2.2.1 and Corollary 2.2.2. In particular, for any $n \in \mathbb{N}_+$ and $x \in I$ we have

$$0.07739 \, v^n \, G(x)(1 - G(x)) \leq \left| \lambda(\tau^n < x) - G(x) \right| \leq 1.49132 \, w^n \, G(x)(1 - G(x)).$$

Proof. For any n ∈ N and y ∈ I set dn(y) = µ(τ^n < e^{y log 2} − 1) − y, so that dn(G(x)) = µ(τ^n < x) − G(x), x ∈ I. Then by (2.2.1) we have

dn(G(x)) = ∫_0^x (U^n f0)(u)/(u + 1) du − G(x).

Differentiating with respect to x yields

dn′(G(x)) · 1/((x + 1) log 2) = (U^n f0)(x)/(x + 1) − 1/((x + 1) log 2),

that is, dn′(G(x)) = (log 2) U^n f0(x) − 1. Differentiating once more and using (2.2.3) we obtain

dn″(G(x)) = (log 2)^2 (x + 1)(U^n f0)′(x) = (−1)^n (log 2)^2 (x + 1) V^n f0′(x),   n ∈ N, x ∈ I.

Since dn(0) = dn(1) = 0, it follows from a well known interpolation formula that

dn(y) = −(y(1 − y)/2) dn″(θ),   n ∈ N, y ∈ I,

for a suitable θ = θ(n, y) ∈ I. Therefore

µ(τ^n < x) − G(x) = (−1)^{n+1} (log 2)^2 ((θ + 1)/2) V^n f0′(θ) G(x)(1 − G(x))

for any n ∈ N and x ∈ I, with another suitable θ = θ(n, x) ∈ I. The stated result now follows from Corollary 2.2.2.

In the special case µ = λ we have f0(x) = x + 1, x ∈ I. Then with a = 0.3126597 ··· we have

α = min_{x∈I} ϕ(x)/f0′(x) = 0.644333 ···,   β = max_{x∈I} ϕ(x)/f0′(x) = 2,

so that

(log 2)^2 α/(2β) = 0.07739 ···,   (log 2)^2 β/α = 1.49131 ···.

The proof is complete. 2

Remark. It follows from the above proof that for any n ∈ N the difference µ(τ^n < x) − G(x) has constant sign, equal to that of (−1)^{n+1}, whatever 0 < x < 1. 2

2.2.2 A functional-theoretic approach

The question naturally arises whether the operator V has an eigenvalue λ0 such that v ≤ λ0 ≤ w (see Theorem 2.2.3). This will indeed follow from the result below. Let B be a collection of bounded real-valued functions defined on a set X, with the following properties: (i) B is a linear space over R; (ii) B is complete with respect to the supremum norm || · ||; and (iii) B contains the constant functions.

Theorem 2.2.4 Let V : B → B be a positive bounded linear operator and F : B → R a positive bounded linear functional such that V ≥ F. Assume that there exists ϕ ∈ B with

m(ϕ) = inf_{x∈X} ϕ(x) > 0   (2.2.5)

and two positive numbers v and w, v ≤ w, such that

v ≤ Vϕ(x)/ϕ(x) ≤ w,   x ∈ X,   (2.2.6)

and

F(ϕ) > (1 − v/w) ||Vϕ||.   (2.2.7)

Then V has an eigenvalue λ0 ∈ [v, w] with corresponding positive eigenfunction ψ ∈ B such that

ψ ≥ ϕ ≥ m(ϕ) > 0,

0 < w F(ϕ)/||Vϕ|| − (w − v) ≤ F(ψ)/||ψ|| ≤ λ0,

and for any n ∈ N and f ∈ B we have

V^n f = G(f) λ0^n ψ + osc(f/ψ) (λ0 − F(ψ)/||ψ||)^n θn ψ,   (2.2.8)


where G : B → R is a positive bounded linear functional with ||G|| ≤ 1/m(ϕ), and θn : X → R is a function satisfying |θn| ≤ 1.

Proof. Define ϕn = V^n ϕ, n ∈ N, ϕ0 = ϕ. Since V is positive, from (2.2.6) we get

v ϕn ≤ ϕn+1 ≤ w ϕn,   n ∈ N.

It follows that inf_{x∈X} ϕn(x) > 0, n ∈ N. Set v0 = v, w0 = w, and

vn = inf ϕn+1/ϕn,   wn = sup ϕn+1/ϕn,   n ∈ N+.

Then

vn ϕn ≤ ϕn+1 ≤ wn ϕn,   n ∈ N,   (2.2.9)

whence vn Vϕn ≤ Vϕn+1 ≤ wn Vϕn, that is, vn ϕn+1 ≤ ϕn+2 ≤ wn ϕn+1. Therefore vn+1 ≥ vn and wn+1 ≤ wn, n ∈ N. We are going to improve these inequalities. It follows from (2.2.5) and (2.2.9) that

ϕn+2 − vn ϕn+1 = V(ϕn+1 − vn ϕn) ≥ F(ϕn+1 − vn ϕn) ≥ (ϕn+1/||ϕn+1||) F(ϕn+1 − vn ϕn),

whence

vn+1 ≥ vn + F(ϕn+1 − vn ϕn)/||ϕn+1||,   n ∈ N.   (2.2.10)

Similarly,

wn ϕn+1 − ϕn+2 = V(wn ϕn − ϕn+1) ≥ F(wn ϕn − ϕn+1) ≥ (ϕn+1/||ϕn+1||) F(wn ϕn − ϕn+1),

whence

wn+1 ≤ wn − F(wn ϕn − ϕn+1)/||ϕn+1||,   n ∈ N.   (2.2.10′)

Putting dn = wn − vn and en = F(ϕn)/||ϕn+1||, n ∈ N, it follows from (2.2.10) and (2.2.10′) that

dn+1 ≤ dn (1 − en),   n ∈ N,   (2.2.11)

which shows that en ≤ 1, n ∈ N. Now, note that (2.2.9) implies

F(ϕn+1) ≥ vn F(ϕn) and ||ϕn+2|| ≤ wn+1 ||ϕn+1||,   n ∈ N.

Hence

en+1 ≥ (vn/wn+1) en,   n ∈ N.   (2.2.12)

In conjunction with (2.2.11) and (2.2.12), assumption (2.2.7), which can be written as e0 − d0/w0 > 0, ensures exponential decrease of the dn, n ∈ N, since

wn+1 en+1 − dn+1 ≥ vn en − dn(1 − en) = wn en − dn,

whence

wn en − dn ≥ w0 e0 − d0,   n ∈ N,

1 ≥ en ≥ (1/wn)(w0 e0 − d0) ≥ e0 − d0/w0 > 0,   (2.2.13)

and

dn ≤ d0 (1 − e0 + d0/w0)^n,   n ∈ N.   (2.2.14)

Put λ0 = lim_{n→∞} vn = lim_{n→∞} wn, and define

ϕ̃0 = ϕ0 = ϕ,   ϕ̃n = ϕn (v0 ··· vn−1)^{−1},   n ∈ N+.

Then (2.2.9) amounts to

ϕ̃n ≤ ϕ̃n+1 ≤ (wn/vn) ϕ̃n = (1 + dn/vn) ϕ̃n ≤ (1 + dn/v0) ϕ̃n,   n ∈ N,   (2.2.15)

and (2.2.14) implies that A = ∏_{n∈N} (1 + dn/v0) < ∞. Hence

ϕ̃n ≤ ∏_{i=0}^{n−1} (1 + di/v0) ϕ̃0 ≤ A ϕ0,   n ∈ N+.   (2.2.16)

It follows from (2.2.15) and (2.2.16) that

0 ≤ ϕ̃n+1 − ϕ̃n ≤ (dn/v0) ϕ̃n ≤ (dn A/v0) ϕ0,   n ∈ N.
Therefore by (2.2.14) the series | | n∈N | ϕn+1 − ϕn | converges. By the completeness of B the limit ψ = limn→∞ ϕn exists. Letting n → ∞ in vn ϕn ≤ V ϕn ≤ wn ϕn , n ∈ N, yields V ψ = λ0 ψ. Since ϕn+1 ≥ ϕn ≥ · · · ≥ ϕ0 = ϕ, we have ψ ≥ ϕ. As 1 ≥ en = F (ϕn ) /| ϕn+1 || = F (ϕn ) /| V ϕn || , n ∈ N, letting n → ∞ yields 1 ≥ | | F (ψ) /λ0 || ψ || . Finally, by (2.2.13) we have λ0 F (ψ) F (ϕ) F (ψ) = = lim wn en ≥ w0 e0 − d0 = w − w + v > 0. n→∞ || ψ || || V ψ || || V ϕ || To prove (2.2.8) let f ∈ B and deﬁne fn = V n f, n ∈ N, f0 = f, vn = inf Hence fn+1 − vn λn+1 ψ = V (fn − vn λn ψ) 0 0 ≥ F (fn − vn λn ψ) ≥ 0 which yields vn+1 ≥ vn + Similarly, wn+1 ≤ wn − Therefore wn+1 − vn+1 ≤ (wn − vn ) 1 − F (ψ) , λ0 || ψ || n ∈ N, 1 F(wn λn ψ 0 n+1 λ0 || ψ || − fn ) ≤ wn , n ∈ N. F λn+1|| ψ || 0 1 (fn − vn λn ψ) ≥ vn , 0 n ∈ N. ψ F(fn − vn λn ψ), 0 || ψ || fn , λn ψ 0 wn = sup fn , λn ψ 0 n ∈ N.

**Solving Gauss’ problem whence wn − vn ≤ osc since w0 − v0 = sup
**

n

89

f ψ

1−

F (ψ) λ0 || ψ ||

,

n ∈ N,

f f f − inf = osc . ψ ψ ψ

If we denote by G (f ) the common limit of vn and wn as n → ∞, then we have F (ψ) n f 1− , n ∈ N, vn , wn = G (f ) + θn osc ψ λ0|| ψ || with a suitable θn ∈ R satisfying θn ≤ 1. Hence, by the very deﬁnition of the vn and wn , n ∈ N, equation (2.2.8) should hold. Since |G(f )| ≤ max (|v0 | , |w0 |) ≤ it follows that || G || = sup

f ∈B

|| f || , inf ψ

f ∈ B,

|G (f )| 1 ≤ . || f || inf ψ

The fact that G is a positive linear functional is an immediate consequence of equation (2.2.8). 2 Let us show that Theorem 2.2.4 applies to Gauss’ problem as considered in Subsection 2.2.1. The space B is Cr (I), the collection of all real-valued functions in C (I) , and the operator V the one denoted there by the same letter. As function ϕ we could use the function ϕa constructed in Subsection 2.2.1 with a = 0.3126597 · · · . Nevertheless, it is more convenient to use V ϕa instead, for which the same values of v and w apply. Thus we take ϕ (x) = 1 , (x + a + 1)2 x ∈ I,

with a = 0.3126597 · · · . Finally, the functional F can be constructed as follows. Let f ∈ Cr (I) , f ≥ 0. [Note that actually the considerations below hold for any non-negative f ∈ B(I).] Then V f (x) ≥

i∈N+ 1

i (x + i + 1)2

1/(x+i)

f (y) dy

1/(x+i+1)

=

0

k (x, y) f (y) dy,

x ∈ I,

90 where k (x, 0) = 0, k (x, y) = x ∈ I, y −1 − x

Chapter 2

(x + y −1 − x + 1)2

,

x ∈ I, y ∈ (0, 1].

If 0 < y ≤ 1/3 then ⌊y^{−1} − x⌋ ≥ 2, and since t → t/(t + x + 1)^2, t ≥ 2, is a decreasing function, we have

k(x, y) ≥ (y^{−1} − x)/(y^{−1} + 1)^2 ≥ (y^{−1} − 1)/(y^{−1} + 1)^2 = y(1 − y)/(y + 1)^2

for x ∈ I, 0 < y ≤ 1/3. If 1/3 < y ≤ 1/2 then either k(x, y) = (2 + x)^{−2} or k(x, y) = 2(3 + x)^{−2}. Hence k(x, y) ≥ 1/9 for x ∈ I, 1/3 < y ≤ 1/2. Thus we have Vf ≥ F(f), where

F(f) = ∫_0^{1/3} (y(1 − y)/(y + 1)^2) f(y) dy + (1/9) ∫_{1/3}^{1/2} f(y) dy.

Elementary calculations yield

F(ϕ) = ∫_0^{1/3} y(1 − y) dy / ((y + 1)^2 (y + a + 1)^2) + (1/9) ∫_{1/3}^{1/2} dy/(y + a + 1)^2,

which can be evaluated in closed form by partial fractions. As Vϕ ≤ wϕ, we have

w F(ϕ)/||Vϕ|| ≥ F(ϕ)/||ϕ|| = (a + 1)^2 F(ϕ) > 0.033184.   (2.2.17)

Since w − v < 0.01779, inequality (2.2.7) holds. Thus Theorem 2.2.4 applies and we have

F(ψ)/||ψ|| ≥ (a + 1)^2 F(ϕ) − (w − v) > 0.01539.   (2.2.18)


To state the result corresponding to Theorem 2.2.3 we should first introduce some notation. Let

Ψ(x) = ∫_0^x ψ(u) du  and  ψ̂(x) = ∫_0^x (Ψ(u) − U^∞Ψ)/(u + 1) du,   x ∈ I.

It is easy to check that ((x + 1) ψ̂′(x))′ = ψ(x), x ∈ I, and ψ̂(0) = ψ̂(1) = 0.

Remarks. 1. As noted by Wirsing (1974, p. 521), using as function ϕ the function V(8ϕ_a − 7ϕ_{a′}) with a = 0.6247 and a′ = 0.7 one can improve (2.2.18) to F(ψ)/||ψ|| ≥ 0.031.

2. Wirsing (1974, § 5) proved that the functions ψ and ψ̂ are analytic. Their analytic continuations are holomorphic in the whole complex plane with a cut along the negative real axis from −∞ to −1, which is the natural boundary of these functions. 2

Theorem 2.2.5 Let f0 ∈ C^1(I) (equivalently, dµ/dλ = F0 ∈ C^1(I)). For any n ∈ N and x ∈ I we have

|µ(τ^n < x) − G(x) − (−λ0)^n G(f0′) ψ̂(x)| ≤ ||ψ|| osc(f0′/ψ) (log 2)^2 (λ0 − 0.01539)^n G(x)(1 − G(x)),

where

λ0 = 0.303 663 002 898 732 658 ···,

1/(x + a + 1)^2 ≤ ψ(x) ≤ 3.41/(x + a + 1)^2,   x ∈ I,

with a = 0.3126597, and G is a positive bounded functional on Cr(I) such that

||G|| ≤ 1/inf ψ ≤ (a + 2)^2 = 5.34839 ···.

In particular, for any n ∈ N and x ∈ I we have

|λ(τ^n < x) − G(x) − (−λ0)^n G(1) ψ̂(x)| ≤ 4.605 (λ0 − 0.01539)^n G(x)(1 − G(x)).   (2.2.19)

Proof. We use the same trick as in the proof of Theorem 2.2.3. For n ∈ N and y ∈ I set

dn(y) = µ(τ^n < e^{y log 2} − 1) − y − (−λ0)^n G(f0′) ψ̂(e^{y log 2} − 1),

so that

dn(G(x)) = µ(τ^n < x) − G(x) − (−λ0)^n G(f0′) ψ̂(x),   x ∈ I.

Differentiating twice with respect to x yields

dn″(G(x)) / ((log 2)^2 (x + 1)) = (U^n f0)′(x) − (−λ0)^n G(f0′)((x + 1) ψ̂′(x))′ = (−1)^n (V^n f0′(x) − λ0^n G(f0′) ψ(x)).

Hence, by Theorem 2.2.4 and (2.2.18),

|dn″(G(x))| ≤ 2 ||ψ|| osc(f0′/ψ) (log 2)^2 (λ0 − 0.01539)^n,   n ∈ N, x ∈ I.

Since dn(0) = dn(1) = 0, the first inequality in the statement follows (cf. the proof of Theorem 2.2.3).

In principle, Theorem 2.2.4 provides the means for computing λ0 to any accuracy. It follows from that theorem that for any real-valued f ∈ C^1(I) and n ∈ N we have

U^n f(1) − U^n f(0) = (−1)^n λ0^n G(f′) ∫_0^1 ψ dλ + (λ0 − 0.01539)^n osc(f′/ψ) ∫_0^1 θn ψ dλ

with a suitable θn : I → R satisfying |θn| ≤ 1. Therefore if f′ > 0 then

(U^n f(1) − U^n f(0)) / (U^{n−1} f(1) − U^{n−1} f(0)) = −λ0 + O(((λ0 − 0.01539)/λ0)^n)

as n → ∞. Using this equation Wirsing (1974) obtained the value given in the statement. Note that in Knuth (1981, p. 350) the first 20 (RCF) digits of λ0 are given as 3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1. The 20th convergent equals 227 769 828/750 074 345, which yields 14 exact significant digits of λ0.
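The convergent quoted from Knuth can be reproduced from the listed digits by the standard recursion p_n = a_n p_{n−1} + p_{n−2}, q_n = a_n q_{n−1} + q_{n−2} (exact integer arithmetic, so the only inputs are the 20 digits above):

```python
digits = [3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1]

# Convergents of [0; a1, a2, ...]: p_n = a_n p_{n-1} + p_{n-2}, same for q_n,
# starting from (p_{-1}, q_{-1}) = (1, 0) and (p_0, q_0) = (0, 1).
p_prev, q_prev, p_cur, q_cur = 1, 0, 0, 1
for a in digits:
    p_prev, q_prev, p_cur, q_cur = p_cur, q_cur, a * p_cur + p_prev, a * q_cur + q_prev

lam0 = 0.303663002898732658     # Wirsing's constant, as given in the text
```

Since the 20th convergent approximates λ0 to within 1/(q_19 q_20), it agrees with λ0 well beyond 14 significant digits.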


Now, we refer to the proof of Theorem 2.2.4. It is shown there that ϕ ≤ ψ ≤ Aϕ, with

A = ∏_{n∈N} (1 + dn/v),   dn ≤ (w − v)(1 − e0 + (w − v)/w)^n,   n ∈ N,

where in the present case v > 0.29017 and w < 0.30796. Then since by (2.2.17) we have

w e0 = w F(ϕ)/||Vϕ|| ≥ (a + 1)^2 F(ϕ) ≥ 0.033184,

it follows that

A ≤ exp(∑_{n∈N} dn/v) ≤ exp(w(w − v)/(v(w e0 − (w − v)))) ≤ 3.409 ···.

In the special case µ = λ we have

osc(f0′/ψ) = osc(1/ψ) = 1/inf ψ − 1/sup ψ ≤ (a + 2)^2 − (a + 1)^2/3.41 = 4.843094 ···,

and (2.2.19) follows.

Theorem 2.2.6 Let f ∈ C^1(I) be real-valued. For any n ∈ N we have

||U^n f − U^∞ f|| ≤ (λ0^n G(f′) + osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I γ(dx) ∫_0^x ψ dλ   (2.2.20)

and

||U^n f − U^∞ f|| ≥ (λ0^n G(f′) − osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I γ(dx) ∫_0^x ψ dλ.   (2.2.21)

Here G is a positive bounded linear functional on Cr(I) with ||G|| ≤ 5.34839 ···, and the last inequality is meaningful for n ∈ N+ large enough.

Proof. It follows from (2.2.3) and (2.2.8) that

U^n f(x) − U^n f(y) = (−1)^n G(f′) λ0^n ∫_y^x ψ dλ + osc(f′/ψ)(λ0 − 0.01539)^n ∫_y^x θn ψ dλ

for any n ∈ N and x, y ∈ I, with a suitable θn : I → R satisfying |θn| < 1. Integrating over y ∈ I with respect to γ, on account of (2.1.12) we obtain

U^n f(x) − U^∞ f = (−1)^n G(f′) λ0^n ∫_I γ(dy) ∫_y^x ψ dλ + osc(f′/ψ)(λ0 − 0.01539)^n ∫_I γ(dy) ∫_y^x θn ψ dλ   (2.2.22)

for any n ∈ N and x ∈ I. Hence (2.2.20) and (2.2.21) follow at once. For the lower bound (2.2.21) we should note that ||U^n f − U^∞ f|| ≥ |U^n f(0) − U^∞ f|. 2

Remarks. 1. Equation (2.2.22) shows that whatever f ∈ C^1(I), the exact rate of convergence of U^n f(x) − U^∞ f to 0 as n → ∞ is O(λ0^n) for any x ∉ E, where

E = { x ∈ I : ∫_I γ(dy) ∫_x^y ψ dλ = 0 }.

Clearly, E is not empty since

∫_I γ(dy) ∫_0^y ψ dλ > 0  and  ∫_I γ(dy) ∫_1^y ψ dλ < 0.

2. By (2.1.12) and Proposition 2.0.1(i) with µ = γ, for any f ∈ C^1(I) we have ||U^n f − U^∞ f|| ≤ var U^n f, n ∈ N. Next, since U^n f(1) − U^n f(0) = (U^n f(1) − U^∞ f) − (U^n f(0) − U^∞ f), we have |U^n f(1) − U^n f(0)| ≤ 2 ||U^n f − U^∞ f||. Finally, noting that by (2.2.3) we have

var U^n f = ∫_I |(U^n f)′| dλ = ∫_I |V^n f′| dλ,   |U^n f(1) − U^n f(0)| = |∫_I (U^n f)′ dλ| = |∫_I V^n f′ dλ|,

from (2.2.8) we obtain

||U^n f − U^∞ f|| ≤ (λ0^n G(f′) + osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I ψ dλ

and

||U^n f − U^∞ f|| ≥ (1/2)(λ0^n G(f′) − osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I ψ dλ

for any n ∈ N and any real-valued f ∈ C^1(I). Since

∫_I γ(dx) ∫_0^x ψ dλ < ∫_I ψ dλ,

the upper bound for ||U^n f − U^∞ f|| just derived is slightly worse than that given in Theorem 2.2.6. The comparison of the lower bounds for ||U^n f − U^∞ f||, here and in Theorem 2.2.6, amounts to a comparison of (1/2) ∫_I ψ dλ and ∫_I γ(dx) ∫_0^x ψ dλ, a question we cannot answer. 2

Corollary 2.2.7 The spectral radius of the operator U − U^∞ in C^1(I) is equal to λ0.

Proof. We should show that

lim_{n→∞} ||(U − U^∞)^n||_1^{1/n} = lim_{n→∞} (sup_{0≠f∈C^1(I)} ||U^n f − U^∞ f||_1 / ||f||_1)^{1/n} = λ0.

This follows easily using Theorem 2.2.6 and equations (2.2.3) and (2.2.8). The details are left to the reader. 2
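The λ0-computation sketched after Theorem 2.2.5 can be imitated numerically. The sketch below is ours (grid resolution, series truncation, and the choice f(x) = x are ad-hoc, not from the text); it iterates the Perron–Frobenius operator of τ under the Gauss measure, Uf(x) = ∑_{i≥1} (x+1)/((x+i)(x+i+1)) f(1/(x+i)), on a grid and watches the ratios (U^n f(1) − U^n f(0))/(U^{n−1} f(1) − U^{n−1} f(0)) approach −λ0:

```python
N = 200            # grid resolution on [0, 1]
K = 1000           # truncation of the series over i
xs = [j / N for j in range(N + 1)]

def interp(vals, x):
    """Piecewise-linear interpolation of grid values vals at x in [0, 1]."""
    t = x * N
    j = min(int(t), N - 1)
    return vals[j] + (t - j) * (vals[j + 1] - vals[j])

def U(vals):
    out = []
    for x in xs:
        s = sum((x + 1) / ((x + i) * (x + i + 1)) * interp(vals, 1.0 / (x + i))
                for i in range(1, K + 1))
        # tail i > K: the weights sum to (x+1)/(x+K+1) and the argument 1/(x+i)
        # is close to 0, so approximate f there by its value at 0
        s += (x + 1) / (x + K + 1) * vals[0]
        out.append(s)
    return out

f = xs[:]          # f(x) = x, so f' = 1 > 0
prev_diff = None
ratios = []
for n in range(1, 7):
    f = U(f)
    diff = f[-1] - f[0]            # U^n f(1) - U^n f(0)
    if prev_diff is not None:
        ratios.append(diff / prev_diff)
    prev_diff = diff
```

After a handful of iterations the ratio settles near −0.30366, in line with Wirsing's value.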

2.2.3 The case of Lipschitz densities

Theorem 2.2.4 can also be used to solve Gauss' problem in the case where F0 = dµ/dλ ∈ L(I). In other words, Theorem 2.2.4 enables us to study the behaviour of U^n as n → ∞ assuming that the domain of U is L(I). Let f ∈ L(I). Then the derivative f′ exists a.e. in I and is bounded by s(f). Abusing the notation, we will also denote by f′ the extension to I of the derivative of f, which is obtained by assigning the value 0 at the points where f is not differentiable.

It is obvious that the operator V : C(I) → C(I) introduced in Subsection 2.2.1 can be extended to B(I), with Vg, g ∈ B(I), defined by the same formula as in the case of a continuous g. The point is that, as is easy to see, equations (2.2.2) and (2.2.3) hold now a.e. in I, that is,

(U^n f)′ = (−1)^n V^n f′,   f ∈ L(I), n ∈ N+,   (2.2.23)

a.e. in I, with the null set of exempted points depending on f and n.

Let us now apply Theorem 2.2.4 to our V in the case where B is Br(I), the collection of all real-valued functions in B(I), with the same function ϕ and functional F as in the case where B = Cr(I) ⊂ Br(I), which has been considered in Subsection 2.2.2. It follows that the operator V : Br(I) → Br(I) has an eigenvalue λ0 = 0.303 663 002 898 732 658 ··· with corresponding positive eigenfunction ψ ∈ C(I) satisfying

1/(x + a + 1)^2 ≤ ψ(x) ≤ 3.41/(x + a + 1)^2,   x ∈ I,

where a = 0.3126597 ···, and

V^n g = G(g) λ0^n ψ + osc(g/ψ)(λ0 − 0.01539)^n θn ψ   (2.2.24)

for any n ∈ N and g ∈ Br(I). Here G : Br(I) → R is a positive bounded linear functional with ||G|| ≤ (a + 2)^2 and θn : I → R is a function satisfying |θn| ≤ 1.

Theorem 2.2.8 Let f ∈ L(I) be real-valued. For any n ∈ N+ we have

||U^n f − U^∞ f|| ≤ (λ0^n G(f′) + osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I γ(dx) ∫_0^x ψ dλ

and

||U^n f − U^∞ f|| ≥ (λ0^n G(f′) − osc(f′/ψ)(λ0 − 0.01539)^n) ∫_I γ(dx) ∫_0^x ψ dλ.

Here G is a positive bounded functional on Br(I) with ||G|| < 5.34839 ···, and the last inequality is meaningful for n ∈ N+ large enough.

The proof is identical with that of Theorem 2.2.6; instead of (2.2.3) and (2.2.8) we should use (2.2.23) and (2.2.24). In particular, equation (2.2.22) holds for f ∈ L(I), too. 2

Remark. The contents of Remarks 1 and 2 following the proof of Theorem 2.2.6 apply mutatis mutandis to the present L(I) framework. 2


Corollary 2.2.9 Let f0 ∈ L(I) (equivalently, dµ/dλ = F0 ∈ L(I)). For any n ∈ N and A ∈ B_I we have

|µ(τ^{−n}(A)) − γ(A)| ≤ (1 − log 2)(λ0^n G(f0′) + osc(f0′/ψ)(λ0 − 0.01539)^n) ||ψ|| min(γ(A), 1 − γ(A)).   (2.2.25)

Proof. By Proposition 2.1.5, for any n ∈ N and A ∈ B_I we have

µ(τ^{−n}(A)) − γ(A) = ∫_A (U^n f0(x) − U^∞ f0)/(x + 1) dx,   (2.2.26)

since

U^∞ f0 = ∫_I f0 dγ = (1/log 2) ∫_I F0 dλ = 1/log 2.

Note that

∫_I γ(dx) ∫_0^x ψ dλ ≤ (||ψ||/log 2) ∫_0^1 x dx/(x + 1) = ||ψ|| (1/log 2 − 1)   (2.2.27)

and

µ(τ^{−n}(A)) − γ(A) = γ(A^c) − µ(τ^{−n}(A^c))   (2.2.28)

for any n ∈ N and A ∈ B_I. Now, (2.2.25) follows from (2.2.26) through (2.2.28) and Theorem 2.2.8. 2

Corollary 2.2.10 The spectral radius of the operator U − U^∞ in L(I) equals λ0.

Proof. Obvious by Theorem 2.2.8. 2

As an application of Theorem 2.2.8 we shall derive the asymptotic behaviour of γ_a(u^a_{n+1} < x), x ≥ 1, as n → ∞ for any a ∈ I. While it is natural to think that for any a ∈ I the limit distribution function lim_{n→∞} γ_a(u^a_n < x) is the common distribution function γ̄(ū_ℓ < x), x ≥ 1, of the extended random variables ū_ℓ, ℓ ∈ Z (cf. the last paragraph of Subsection 1.3.3), it

is somewhat surprising to ﬁnd out that the (exact) convergence rate is O(λn ) 0 for most a ∈ I. Theorem 2.2.11 For any n ∈ N+ and x ≥ 1 we have sup γa (ua < x) − H(x) n+1

a∈I

(2.2.29)

≤ 3.2228 where H(x) = 1 log 2 1 log 2

I(1,∞) (x) n λ0 (1 + (0.94932)n ), x x−1 x 1 x if 1 ≤ x ≤ 2,

log x −

log 2 −

if

x ≥ 2.

In (2.2.29), λ0 cannot be replaced by a smaller constant, and the exact convergence rate to 0 of the left hand side of (2.2.29) is O(λn ). 0 Proof. By Proposition 1.3.10, for any a ∈ I, x ≥ 1, and n ∈ N+ we have γa (ua < x|a1 , . . . , an ) = n+1 Hence γa ua ≥ n+1 1 a1 , . . . , an t = 1 − (1 − t(sa + 1))I(sa +1,∞) n n = min(1, t(sa + 1)) = ft (sa ) n n for any a ∈ I, t ∈ (0, 1], and n ∈ N+ , with ft (y) = min(1, t(y + 1)), Therefore, by Proposition 2.1.10, γa ua ≥ n+1 1 t = E γa u1 ≥ n+1 1 a1 , . . . , an t = U n ft (a), (2.2.30) y ∈ I. 1 t 1− sa + 1 n x I(sa +1,∞) (x). n

for any a ∈ I, t ∈ (0, 1], and n ∈ N+ . It is easy to check that (2.2.30) holds for n = 0, too. Clearly, ft ∈ L(I) for any t ∈ (0, 1], and t if 0 < t ≤ 1/2, log 2 U ∞ ft = ft (y)γ(dy) = 1 I (1 − t + log(2t)) if 1/2 ≤ t ≤ 1. log 2

**Solving Gauss’ problem Next, 0 ≤ ft (y) ≤ tI(0,1) (t), t ∈ (0, 1], y ∈ I. Hence osc and G(ft ) ≤ || G || || ft || ≤ 5.348396 tI(0,1) (t) for any t ∈ (0, 1]. Finally,
**

x

99

ft ≤ 5.348396 tI(0,1) (t) ψ

γ(dx)

I 0

ψdλ ≤ =

1 3.41 1 dx − log 2 I 1.312659 x + 1.312659 x + 1 1 3.41 2.312659 1 log − 0.312659 log 2 1.312659 1.312659

**≤ 0.60256. Consequently, Theorem 2.2.8 yields sup γa ua ≥ n+1
**

a∈I

1 t

− U ∞ ft ≤ 3.2228 t I(0,1) (t)λn (1 + (0.94932)n ) 0

for any n ∈ N and t ∈ (0, 1]. Hence, by putting 1/t = x, (2.2.29) follows. Finally, the assertion concerning the optimality of λ0 also follows from Theorem 2.2.8. 2 Remarks. 1. The convergence of λ(un < x) to H(x), x ≥ 1, as n → ∞ was ﬁrst sketchy proved by Doeblin (1940, p. 365) with an unspeciﬁed convergence rate. A detailed proof following Doeblin’s suggestions was given by Samur (1989, Lemma 4.5) together with a slower convergence rate than that occurring in Theorem 2.2.11. 2. Theorem 2.2.8 shows that the convergence rate to 0 as n → ∞ of sup sup γa (ua < x) − H(x) n+1

a∈I x≥1

**is O(λn ). It is possible for some a ∈ I that the convergence rate to 0 as 0 n → ∞ of sup γa (ua < x) − H(x) n+1
**

x≥1

is O(αn ) with 0 < α < λ0 . It follows from equation (2.2.22), which is valid for f ∈ L(I) too, that this happens if and only if a ∈ E, with E deﬁned in

100

Chapter 2

**Remark 1 following Theorem 2.2.6. In particular, 0 and 1 do not belong to E, thus sup |λ(un+1 < x) − H(x)| = O(λn ) 0
**

x≥1

and sup γ1 (u1 < x) − H(x) = O(λn ) n+1 0

x≥1

as n → ∞. It would be interesting to eﬀectively determine elements of E.2 The asymptotic behaviour as n → ∞ of the probability density of n ∈ N+ , a ∈ I, which exists a.e. by Corollary 1.3.11, can be established using a result to be proved later in Subsection 2.5.3. Set x − 1 if 1 ≤ x ≤ 2, 2 dH (x) x log 2 h (x) = = dx 1 if x ≥ 2. 2 log 2 x 0 log(x + 1) G (x) = log 2 1

x−1

ua , n

Recalling that

if x ≤ 0, if 0 ≤ x ≤ 1, if x > 1,

**it is easy to check that H (x) = Corollary 1.3.11 then yields γa (ua < x) − H (x) = n 1 x
**

x−1 0

1 x

G (s) ds,

0

x ≥ 1.

Ga (s) − G (s) ds n−1

(2.2.31)

for any a ∈ I, n ∈ N+ , and x ≥ 1. Letting Dx γa (ua < x) denote anyone n of the four (two for x = 1) unilateral derivatives of γa (ua < x) at x, we can n state the following result. Proposition 2.2.12 For any n ∈ N+ , a ∈ I, and x ≥ 1 we have |Dx γa (ua < x) − h (x)| ≤ n k0 [min(x − 1, 1) + x I(1,2] (x)] 1 2 x Fn−1 Fn

Solving Gauss’ problem where k0 is a constant not exceeding 14.8.

101

The proof follows from (2.2.31) and Theorem 2.5.5. The details are left to the reader. 2 Remark. √ The upper bound in Proposition 2.2.12 is O(g2n ) as n → ∞ √ with g = 5 − 1 /2, g2 = 3 − 5 /2 = 0.38196 · · · . It is an open problem whether this yields the optimal convergence rate. 2 Theorem 2.2.11 and Proposition 2.2.12 can be restated in terms of the approximation coeﬃcients deﬁned in Subsection 1.3.2. Indeed, by (1.3.6) we have un+1 = u0 = Θ−1 , n ∈ N, and the results below are easily checked. n n+1 Theorem 2.2.13 For any n ∈ N+ and t ∈ I we have ˜ |λ(Θn ≤ t) − H(t)| ≤ 3.2228 tI(0,1) (t)λn (1 + (0.94932)n ) 0 and ˜ |Dt λ(Θn ≤ t) − h(t)| ≤ k0 where ˜ H(t) = and t log 2 [min(t−1 − 1, 1) + t−1 I[1/2,1) (t)] , Fn Fn+1 if 0 ≤ t ≤ 1/2,

1 (1 − t + log(2t)) if 1/2 ≤ t ≤ 1 log 2 1 log 2 1 log 2 1 −1 t if 0 ≤ t ≤ 1/2,

˜ dH ˜ h(t) = = dt

if 1/2 ≤ t ≤ 1.

Remark. The ﬁrst result above improves on the convergence rate obtained by Faivre (1998a) while the second one on that obtained by Knuth (1984). 2

2.3

2.3.1

**Babenko’s solution to Gauss’ problem
**

Preliminaries

Let H−1/2 = H denote the collection of all complex-valued functions f which are holomorphic in the half-plane Re z > −1/2, bounded in every half-plane

**102 Re z > −1/2 + ε, ε > 0, and which satisfy f
**

R

Chapter 2

1 − + iy 2

2

dy < ∞.

Note that H is known [see Duren (1970)] as the ordinary Hardy space of functions holomorphic in the half-plane Re z > −1/2, which is a Hilbert space with inner product (·, ·)H deﬁned by (f, g)H = 1 2π 1 1 f ∗ − + iy g − + iy dy, 2 2 R f, g ∈ H,

**therefore a Banach space under the norm || · || H deﬁned by || f || H = 1 2π f
**

R

1 − + iy 2

2

1/2

dy

,

f ∈ H.

**Let L2 (R+ , BR+ , λ) = L2 (R+ ) denote the Hilbert space of square λintegrable functions ϕ : R+ → C with the usual scalar product (ϕ, ψ) =
**

R+

ϕψ ∗ dλ,

ϕ, ψ ∈ L2 (R+ ) ,

and norm

||ϕ||2 = (ϕ, ϕ)1/2 ,

ϕ ∈ L2 (R+ ) .

A Paley–Wiener theorem holds, giving a simple characterization of the elements of H [see Duren (1970)]: f ∈ H if and only if there exists ϕ ∈ L2 (R+ ) such that f (z) =

R+

e−zs−s/2 ϕ (s) ds,

Re z > −1/2;

**the function ϕ is unique (in the L2 -sense) and || f || H = || ϕ ||2 . In other words, the linear operator M : L2 (R+ ) → H deﬁned by M ϕ (z) =
**

R+

(2.3.1)

e−zs−s/2 ϕ (s) ds,

ϕ ∈ L2 (R+ ) , Re z > −1/2,

is an isometry and the image under M of L2 (R+ ) is H.

Solving Gauss’ problem

103

Notice that in Babenko (1978) an equivalent deﬁnition of H is considered. We follow here Mayer (1991). See also Hensley [(1992, p. 344) and (1994, p. 145)]. It is easy to check that the Perron–Frobenius operator Pλ of τ under λ takes H into itself. Obviously, for f ∈ H we deﬁne Pλ f by Pλ f (z) =

i∈N+

1 f (z + i)2

1 , z+i

Re z > −1/2.

2.3.2

A symmetric linear operator

**Consider the linear operator S : L2 (R+ ) → L2 (R+ ) deﬁned by Sϕ (s) = 1 − e−s s
**

1/2

ϕ (s) ,

ϕ ∈ L2 (R+ ) , s ∈ R+ .

**Clearly, S is invertible and S −1 ϕ (s) = s 1 − e−s
**

1/2

ϕ (s) ,

ϕ ∈ S L2 (R+ ) , s ∈ R+ .

Consider also the linear operator A = SM −1 : H → L2 (R+ ) with inverse A−1 = M S −1 : S L2 (R+ ) → H. Proposition 2.3.1 Deﬁne the symmetric linear operator K : L2 (R+ ) → by Kϕ (s) =

R+

L2 (R+ )

k (s, t) ϕ (t) dt , ϕ ∈ L2 (R+ ) , √ J1 2 st ((es − 1) (et − 1))1/2

s ∈ R+ ,

where k (s, t) =

,

s, t ∈ R+ ,

**and J1 is the Bessel function of order 1 deﬁned by J1 (s) = s 2 (−1)k s k! (k + 1)! 2
**

2k

,

s ∈ R+ .

k∈N

104 Then Pλ = A−1 K A.

Chapter 2

(2.3.2)

**Proof. Note ﬁrst that the range of K is included in S L2 (R+ ) . Let ϕ ∈ L2 (R+ ) and put f = M ϕ ∈ H. We have A−1 K A f = M S −1 K S ϕ. But S
**

−1

KSϕ (s) = =

s 1 − e−s s t

1/2

k (s, t)

R+

1 − e−t t

1/2

ϕ (t) dt s ∈ R+ ,

1/2

e

s−t 2

R+

√ J1 2 st ϕ (t) dt, es − 1

whence MS

−1

KSϕ (z) =

R2 +

t −zs − 2 e

k∈N+

=

R+

√ J1 2 st ϕ (t) dsdt es − 1 1 t t − ϕ (t) dt, exp − z+k 2 (z + k)2 s t

1/2

**for Re z > −1/2, on account of the identity 1 t 2 exp − z + k (z + k) =
**

R+

k∈N+

s t

1/2

e

−zs J1

√ 2 st ds es − 1

**which is valid for t ∈ R+ and Re z > −1 [see Watson (1944, formula 7.13.9)]. It remains to note that ϕ (t) exp −
**

R+

t t − z+k 2

dt = (M ϕ)

1 z+k

=f

1 z+k

**for any k ∈ N+ and Re z > −1, to obtain A−1 KAf (z) =
**

k∈N+

1 f (z + k)2

1 z+k

= (Pλ f ) (z),

Re z > −1/2. 2

Solving Gauss’ problem

105

As an integral symmetric linear operator with continuous kernel, K is a compact operator on L2 (R+ ) with only real eigenvalues λj , j ∈ N+ , satisfying lim |λj | = 0.

j→∞

See, e.g., Kanwal (1997, Ch.7). Note that 0 cannot be an eigenvalue since Kϕ = 0 implies that ϕ = 0 by the invertibility of the Hankel transform. See, e.g., Magnus et al. (1966, Ch. 11). As usual, we order the eigenvalues according to their absolute values, that is, |λ1 | ≥ |λ2 | ≥ ... , where we list each eigenvalue according to its multiplicity. We then have Kϕ =

j∈N+

λj (ϕ, ϕj ) ϕj ,

ϕ ∈ L2 (R+ ) ,

(2.3.3)

where ϕj is a (real-valued) eigenfunction corresponding to λj , that is Kϕj = λj ϕj , j ∈ N+ , and the ϕj , j ∈ N+ , deﬁne an orthonormal system in L2 (R+ ). Note that this system is complete since 0 is not an eigenvalue of K. We actually can prove more about K. For that we recall that a linear operator L on a Banach space B of norm || · || is called nuclear of order 0 (or of trace class) if and only if it can be written as Lx =

i∈I

yi (x)xi ,

x ∈ B,

with (||yi || ||xi ||)r < ∞

i∈I

for any r > 0. Here I is a countable set while xi ∈ B and yi ∈ B ∗ = the dual Banach space of B (consisting of all bounded linear functional on B) for any i ∈ I. Such operators have been introduced and studied by Grothendieck (1955, 1956). They are compact and thus have discrete spectra. Moreover, most of matrix algebra can be extended to them. In particular, one can deﬁne the trace of such an operator as Tr L =

i∈I

yi (xi ) =

j∈N+

λj ,

(2.3.4)

where λj , j ∈ N+ , are the eigenvalues of L, each of them counted with its multiplicity. The traces of the powers Ln , n ≥ 2, are also well deﬁned. The analog of the characteristic polynomial of a matrix for a nuclear operator of

106

Chapter 2

**order 0, is known as the Fredholm determinant, which is an entire function of z ∈ C given by the formula det (Id − zL) =
**

j∈N+

(1 − λj z).

**Then the equation det(Id − zL) = exp(−Tr log(Id − zL)) = exp −
**

k∈N+

zk k TrLk

**holds for |z| < 1. Hence Tr Ln =
**

j∈N+

λn , j

n ∈ N+ .

**Moreover, generalized traces deﬁned as |λj |ε
**

j∈N+

exist for any ε > 0. Let us ﬁnally note that in some Banach spaces every bounded linear operator is nuclear of order 0. A typical example of such a Banach space is A∞ (D1 ), to be deﬁned in Subsection 2.4.3. Proposition 2.3.2 K is a nuclear operator of trace class. Hence |λj |ε < ∞

j∈N+

**for any ε > 0. We have Tr K =
**

j∈N+

λj =

k (s, s) ds =

R+ R+

J1 (2s) ds = 0.7711255237 · · · , es − 1

Tr K 2 =

j∈N+

λ2 = j

R2 +

k (s, t) k (t, s) ds dt (2.3.5)

=

√ 2 J1 2 st ds dt = 1.103839654 · · · . s t R2 (e − 1) (e − 1) +

**Solving Gauss’ problem Proof. Consider the Laguerre polynomials
**

n

107

L1 (s) n We have

= (n + 1)!

m=0

(−1)m

sm , (m + 1)!m! (n − m)!

n ∈ N, s ∈ R+ .

R+

se−s L1 (s) n

2

ds = n + 1,

n ∈ N,

R+

se−s L1 (s) L1 (s) ds = 0, m n

m, n ∈ N, m = n.

**√ √ See, e.g., Magnus et al. (1966, Ch. 5). We expand J1 2 st / st, s, t ∈ R+ , in terms of the L1 (s) , n ∈ N, to obtain n √ J1 2 st √ = L1 (s) Cn (t), n st n∈N where Cn (t) = 1 n+1
**

n

s, t ∈ R+ ,

R+

se−s L1 (s) n

√ J1 2 st √ ds st

= n!

m=0 k∈N

(−1)m+k (m + k + 1)!tk k! (k + 1)!m! (m + 1)! (n − m)! n ∈ N, t ∈ R+ .

= It follows that

e−t tn , (n + 1)!

Kϕ =

n∈N

(ϕ, βn ) αn ,

ϕ ∈ L2 (R+ ) ,

(2.3.6)

where αn , βn ∈ L2 (R+ ) are given by αn (s) = tn+1/2 e−t , βn (t) = , (es − 1)1/2 (et − 1)1/2 (n + 1)! s1/2 L1 (s) n s, t ∈ R+ .

**To prove the ﬁrst assertion we should show that (||αn ||2 ||βn ||2 )r < ∞
**

n∈N

108

Chapter 2

**for any r > 0. Since (es − 1)−1 = k∈N+ e−ks , s ∈ R++ , the computation of ||αn ||2 reduces to that of a standard integral: ||αn ||2 = 2
**

k∈N+ R+

se−ks L1 (s) n

n p=0

2

ds n (k − 1)2p , p

=

k∈N+

n+1 k 2n+2

n+1 p

and since

n+1 p

≤ 2n+1 , 0 ≤ p ≤ n, we obtain (k − 1)2 + 1

k∈N+ n

||αn ||2 ≤ 2n+1 (n + 1) 2 Next, as

R+

k 2n+2

≤ 2n+1 (n + 1) ζ (2) .

**sm e−s ds = m!, m ∈ N, we have 1 ((n + 1)!)2 (2n + 1)! ((n + 1)!)2 s2n+1 e−ks ds
**

k≥3 R+

||βn ||2 = 2 = Since

1

k≥3

k 2n+2

=

2n+1 n+1

n+1

k≥3

1 , k 2n+2

n ∈ N.

1

k≥3

2

k 2n+2

=

j=0 ∈N+

1 (3 + j)2n+2

≤ 3

∈N+

1 = 3−2n−1 ζ (2n + 2) (3 )2n+2

and

2n + 1 n+1

≤ 22n+1 , ζ (2) n+1

ζ (2n + 2) ≤ ζ(2), 2 3

2n+1

n ∈ N,

we obtain ||βn ||2 ≤ 2

,

n ∈ N.

**Finally, for any r > 0 we have (||αn ||2 ||βn ||2 ) ≤
**

n∈N r

2 √ ζ (2) 3

r n∈N

√ 2 2 3

r

n

< ∞.

Solving Gauss’ problem

109

**The formulae for Tr K and Tr K 2 in the statement follow from (2.3.4) and (2.3.6) which as easily checked yield Tr K = (αn , βn ) =
**

n∈N

k(s, s)ds,

R+ R2 +

Tr K 2 =

(αm , βn )(αn , βm ) =

m,n∈N

k(s, t)k(t, s)dsdt.

Concerning the numerical values of Tr K and Tr K 2 we refer the reader to Mayer and Roepstorﬀ (1987, Section 3). 2 Remark. There is an interesting relationship between Tr K n and the non-zero ﬁxed points of τ n for any n ∈ N+ . It can be shown [see Mayer and Roepstorﬀ (1987, Section 3) and (1988, Section 3)] that

n −1

Tr K =

i1 ,... ,in ∈N+

n

x−2 n i1 ···i

k=2

x−2 n i1 ···ik−1 ik ···i

− (−1)

n

,

with 1 k=2 = 1, where xi1 ···in = i1 , . . . , in , i1 , . . . , in ∈ N+ . (For notation see Subsection 1.1.3.) Clearly, these quadratic irrationalities are all non-zero solutions of the equation τ n x = x. Hence xi1 ···in = 1 2qn−1 pn−1 − qn + (pn−1 + qn )2 + 4(−1)n−1

1/2

**for any n ∈ N+ and i1 , . . . , in ∈ N+ . Here, as usual, pn = [i1 , . . . , in ] , g.c.d.(pn , qn ) = 1, qn with p0 = 0, q0 = 1. In particular, xi xij = = i2 +1 4 j2 j + 4 i
**

1/2

n ∈ N+ ,

i − , i ∈ N+ , 2 1/2 j − , i, j ∈ N+ . 2

**It is asserted in Babenko (1978, p. 140) that for any n ∈ N+ , in our notation, we have Tr K n = (−1)n−1 2 1 −
**

i1 ,... ,in ∈N+

pn−1 + qn (pn−1 + qn ) +

2

4(−1)n−1

1/2

.

110

Chapter 2

For n = 1 and n = 2 this is in agreement with the Mayer–Roepstorﬀ formula, as easily checked. Clearly, Babenko’s formula is much simpler than Mayer– Roepstorﬀ’s. It can be shown that it is true for any n ∈ N+ . See Subsection 2.4.3. Let us ﬁnally note that by the above we have Tr K = 1 2 i 1− √ 2+4 i

i∈N+

and Tr K 2 = 1 2 1 2 ij + 2

i,j∈N+

ij (ij + 4) k+2

−1

=

k∈N+

k(k + 4)

− 1 t(k),

where t(k) is the number of divisors of k, equal to α (nα + 1) if 1 < k = nα 2 α pα is the factorization of k into distinct primes, and t(1) = 1. Corollary 2.3.3 The dominant eigenvalue λ1 of K is simple and is equal to 1. The corresponding eigenfunction ϕ1 is deﬁned by ϕ1 (s) = 1 (log 2)1/2 1 − e−s s

1/2

e−s/2 ,

s ∈ R+ .

Proof. Since

R+

**sk e−s ds = k!, k ∈ N, we have 1 (log 2)1/2 (es − 1)1/2 s1/2 (log 2)
**

1/2 R+

Kϕ1 (s) = =

**√ J1 2 st t−1/2 e−t dt (−1)k sk k! (k + 1)! tk e−t dt
**

R+

(es

− 1)

1/2 k∈N

=

s1/2 (1 − e−s ) (log 2)1/2 (es − 1)1/2 s

= ϕ1 (s) ,

s ∈ R+ ,

**Solving Gauss’ problem and ||ϕ1 ||2 = 2 = 1 log 2 1 log 2 1 log 2
**

R+

111

**(1 − e−s ) e−s ds s (−1)k+1 k! sk−1 e−s ds
**

R+

k∈N+

=

k∈N+

(−1)k+1 = 1. k

Thus 1 is an eigenvalue of K with corresponding eigenfunction ϕ1 . It should be the dominant eigenvalue since λn = 1 implies Tr K 2 ≥ n, which contradicts (2.3.5) unless n = 1. It should also be simple since λ1 = λ2 implies Tr K 2 ≥ 2, which contradicts again (2.3.5). 2 Concerning the remaining eigenvalues λn , n ≥ 2, we ﬁrst have λ2 = −λ0 = −0.30366 30028 98732 65859 · · · (this follows from Theorem 2.2.5 and Theorem 2.3.5 below). Next, extensive computations [cf. Daud´ et al. (1997, Section 6) and MacLeod (1993)] yield e λ3 λ4 λ5 λ6 λ7 λ8 λ9 = 0.10088 45092 93104 07530 = 0.01284 37903 62440 26481 = 0.00174 86751 24305 51191 = 0.00024 41314 65524 51581 ··· , ··· , ··· , ··· , = −0.03549 61590 21659 84540 · · · , = −0.00471 77775 11571 03107 · · · , = −0.00065 20208 58320 50290 · · · ,

λ10 = −0.00009 16890 83768 59330 · · · . It has been conjectured in Babenko (1978) that all eigenvalues λj , j ∈ N+ , are simple. Another conjecture [Mayer and Roepstorﬀ (1988)] is that (−1)j+1 λj > 0, j ∈ N+ .

2.3.3

An 'exact' Gauss–Kuzmin–Lévy theorem

Let us define the functions ψj ∈ H, j ∈ N+, by
\[
\psi_j(z) = A^{-1}\varphi_j(z) = \int_{\mathbb{R}_+}e^{-zs-s/2}\Bigl(\frac{s}{1-e^{-s}}\Bigr)^{1/2}\varphi_j(s)\,ds, \qquad \operatorname{Re} z > -1/2.

Note that since λjψj = Kφj implies |φj(s)| ≤ Cj s^{1/2} e^{−s/2}, s ∈ R+, for some suitable Cj ∈ R+, it follows that ψj is regular in the half-plane Re z > −1. It is possible to show that actually the ψj, j ∈ N+, are regular outside a cut along the negative real axis from −1 to ∞, which is their natural boundary. In particular,
\[
\psi_1(z) = \frac{1}{(\log 2)^{1/2}}\int_{\mathbb{R}_+}e^{-zs-s}\,ds
= -\frac{1}{(\log 2)^{1/2}}\,\frac{e^{-(z+1)s}}{z+1}\Bigr|_0^\infty
= \frac{1}{(\log 2)^{1/2}(z+1)}, \qquad \operatorname{Re} z > -1. \tag{2.3.7}

Proposition 2.3.4. We have
\[
\sum_{j\in\mathbb{N}_+}|\psi_j(z)|^2 = \sum_{j\in\mathbb{N}_+}\frac{1}{(2\operatorname{Re}z+j)^2}, \qquad \operatorname{Re} z > -1/2, \tag{2.3.8}
\]
\[
\max_{x\in I}|\psi_j(x)| \le \Bigl(\frac{\pi^2}{6}-\frac{1}{4\log 2}\Bigr)^{1/2} = 1.13325\,20931\,5\cdots, \qquad j \ge 2. \tag{2.3.9}

Proof. For any fixed z with Re z > −1/2 consider the function
\[
\varphi(s) = e^{-zs-s/2}\Bigl(\frac{s}{1-e^{-s}}\Bigr)^{1/2}, \qquad s \in \mathbb{R}_+,
\]
which clearly belongs to L²(R+). On account of the completeness of the system (φj)_{j∈N+}, whose properties are described in the lines following equation (2.3.3), we can write
\[
\varphi = \sum_{j\in\mathbb{N}_+}e_j\varphi_j, \qquad\text{where } e_j = (\varphi, \varphi_j) = \psi_j(z),\quad j \in \mathbb{N}_+.
\]
Parseval's equation then yields \(\sum_{j\in\mathbb{N}_+}|e_j|^2 = \|\varphi\|_2^2\). But
\[
\|\varphi\|_2^2 = \int_{\mathbb{R}_+}\bigl|e^{-zs-s/2}\bigr|^2\,\frac{s}{1-e^{-s}}\,ds
= \int_{\mathbb{R}_+}e^{-2s\operatorname{Re}z}\,\frac{s}{e^s-1}\,ds
= \sum_{j\in\mathbb{N}_+}\int_{\mathbb{R}_+}e^{-(2\operatorname{Re}z+j)s}\,s\,ds
\]
\[
= \sum_{j\in\mathbb{N}_+}\Bigl[-e^{-(2\operatorname{Re}z+j)s}\Bigl(\frac{s}{2\operatorname{Re}z+j}+\frac{1}{(2\operatorname{Re}z+j)^2}\Bigr)\Bigr]_0^\infty
= \sum_{j\in\mathbb{N}_+}\frac{1}{(2\operatorname{Re}z+j)^2}, \qquad \operatorname{Re}z > -1/2,
\]
and (2.3.8) follows. Finally, (2.3.9) follows from (2.3.7) and (2.3.8) since
\[
\min_{x\in I}\psi_1(x) = \frac{1}{2(\log 2)^{1/2}}. \qquad\square

Remarks. 1. It is conjectured in Babenko (1978, p. 140) that ψj(0) ≠ 0 and |ψj(0)| = max_{x∈I}|ψj(x)|, j ≥ 2. Note that ψ2(0) ≠ 0 is implicit in Wirsing (1974).

2. If ψj(0) ≠ 0 for some j ≥ 2, then
\[
\psi_j\bigl(-i-[i_1,\dots,i_n]+z\bigr) = \frac{(-1)^{n+1}\psi_j(0)}{(1-\lambda_j)\,\lambda_j^{n+2}\,z} + O(1)
\]
as z → 0, for any n ∈ N+ and i, i1, …, in ∈ N+ with in ≥ 2, uniformly for ε < |arg z| < π − ε whatever ε > 0. This was proved by Wirsing (1974) for j = 2, thus establishing the cut along the negative real axis from −1 to ∞ as the natural boundary of the functions ψ and Ψ in Subsection 2.2.2. (See Remark 2 before Theorem 2.2.5.) It is asserted in Babenko and Jur'ev (1978) that Wirsing's reasoning also works for any j ≥ 3. □

We are now able to prove an 'exact' Gauss–Kuzmin–Lévy theorem for the measures γa, a ∈ I (cf. Subsection 1.3.4).

Theorem 2.3.5. For any a ∈ I, A ∈ B_I, and n ∈ N+ we have
\[
\gamma_a(\tau^{-n}(A)) - \gamma(A) = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda. \tag{2.3.10}

Next, \(\int_I\psi_j\,d\lambda = 0\), j ≥ 2, and
\[
\Bigl|\gamma_a(\tau^{-n}(A)) - \gamma(A) - (a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda\Bigr|
\le \Bigl(\frac{\pi^2\log 2}{6}-1\Bigr)|\lambda_\ell|^{n-1}\min\bigl(\gamma(A),\,1-\gamma(A)\bigr)
\]
for any a ∈ I, A ∈ B_I, ℓ ≥ 2, and n ∈ N+. (Clearly, \(\sum_{j=2}^{1} = 0\).)
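The numerical constants appearing in the bounds above are elementary to confirm; the following trivial check is added here for convenience and is not part of the original text.

```python
import math

# constant of the Theorem 2.3.5 error bound: (pi^2 log 2)/6 - 1
eps2 = math.pi ** 2 * math.log(2) / 6 - 1
print(eps2)   # about 0.14018

# constant of inequality (2.3.9): (pi^2/6 - 1/(4 log 2))^(1/2)
c239 = math.sqrt(math.pi ** 2 / 6 - 1 / (4 * math.log(2)))
print(c239)   # about 1.1332520931
```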

Proof. For any a ∈ I consider the function ha defined by
\[
h_a(z) = \frac{a+1}{(az+1)^2}, \qquad \operatorname{Re} z > -1/2.
\]
Note that h0 does not belong to H. Instead, the function
\[
P_\lambda h_a(z) = (a+1)\sum_{i\in\mathbb{N}_+}\frac{1}{(z+a+i)^2}, \qquad \operatorname{Re} z > -1/2,
\]
does belong to H for any a ∈ I. By (2.3.2) and (2.3.3), for any g ∈ H and n ∈ N we have
\[
P_\lambda^n g = A^{-1}K^n A\,g = A^{-1}\sum_{j\in\mathbb{N}_+}\lambda_j^n(Ag,\varphi_j)\,\varphi_j = \sum_{j\in\mathbb{N}_+}\lambda_j^n(Ag,\varphi_j)\,\psi_j.
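The closed form of Pλha can be checked termwise: each summand (x+i)^{−2} ha(1/(x+i)) of the defining series of the Perron–Frobenius operator collapses algebraically to (a+1)/(x+a+i)². A small numerical confirmation (an illustrative check only, with arbitrary sample points):

```python
def h(a, z):
    # h_a(z) = (a+1)/(az+1)^2
    return (a + 1) / (a * z + 1) ** 2

# termwise identity: (x+i)^(-2) * h_a(1/(x+i)) == (a+1)/(x+a+i)^2
for a in (0.0, 0.3, 0.7, 1.0):
    for x in (0.0, 0.25, 0.9):
        for i in range(1, 50):
            lhs = h(a, 1.0 / (x + i)) / (x + i) ** 2
            rhs = (a + 1) / (x + a + i) ** 2
            assert abs(lhs - rhs) < 1e-12
print("termwise identity holds")
```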

Hence, for any n ∈ N+ and a ∈ I,
\[
P_\lambda^n h_a = P_\lambda^{n-1}(P_\lambda h_a) = \sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}(AP_\lambda h_a,\varphi_j)\,\psi_j. \tag{2.3.11}
\]
We assert that for any a ∈ I we have
\[
(AP_\lambda h_a)(s) = (a+1)\,e^{-s/2-as}\Bigl(\frac{s}{1-e^{-s}}\Bigr)^{1/2}, \qquad s \in \mathbb{R}_+. \tag{2.3.12}
\]
This can be checked as follows. Since Pλha = M S^{−1}(APλha), we have to prove that this last equation holds with APλha given by (2.3.12). We have
\[
S^{-1}(AP_\lambda h_a)(s) = (a+1)\,\frac{s\,e^{-s/2-as}}{1-e^{-s}}, \qquad s \in \mathbb{R}_+,
\]
and
\[
M S^{-1}(AP_\lambda h_a)(z) = (a+1)\int_{\mathbb{R}_+}\frac{s\,e^{-s}}{1-e^{-s}}\,e^{-(z+a)s}\,ds
= (a+1)\sum_{j\in\mathbb{N}_+}\int_{\mathbb{R}_+}s\,e^{-(z+j+a)s}\,ds
= (a+1)\sum_{j\in\mathbb{N}_+}\frac{1}{(z+j+a)^2} = P_\lambda h_a(z), \qquad \operatorname{Re} z > -1/2.
\]
Thus (2.3.12) holds, and we then have
\[
(AP_\lambda h_a,\varphi_j) = (a+1)\,\psi_j(a), \qquad a \in I,\ j \in \mathbb{N}_+. \tag{2.3.13}

Therefore (2.3.11) and (2.3.13) imply that
\[
P_\lambda^n h_a = (a+1)\sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}\psi_j(a)\,\psi_j, \qquad a \in I,\ n \in \mathbb{N}_+.
\]
The last equation holds in H. By (2.3.9), Proposition 2.3.2, and Corollary 2.3.3, the series \(\sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}\psi_j(a)\,\psi_j\) is uniformly and absolutely convergent in I for any a ∈ I and n ∈ N+. Hence, whatever a ∈ I and n ∈ N+, by (2.3.7) we have
\[
P_\lambda^n h_a(x) - \frac{1}{(x+1)\log 2} = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x), \qquad x \in I. \tag{2.3.14}

Equation (2.3.10) follows by integrating the last equation over A ∈ B_I since, by the very definition of the Perron–Frobenius operator, we can write
\[
\int_A P_\lambda^n h_a\,d\lambda = \int_{\tau^{-n}(A)}h_a\,d\lambda = \int_{\tau^{-n}(A)}d\gamma_a = \gamma_a(\tau^{-n}(A)), \qquad n \in \mathbb{N}.
\]
Since
\[
\int_I\gamma(da)\,\gamma_a(\tau^{-n}(A)) = \gamma(\tau^{-n}(A)) = \gamma(A), \qquad n \in \mathbb{N},\ A \in B_I,

if we divide equation (2.3.10) by (a+1)(log 2) and integrate the equation obtained over a ∈ I, then we obtain
\[
0 = \sum_{j\ge 2}\lambda_j^{n-1}\int_I\psi_j\,d\lambda\int_A\psi_j\,d\lambda, \qquad n \in \mathbb{N}_+,\ A \in B_I.
\]
Taking A = I and n = 1 we deduce that \(\int_I\psi_j\,d\lambda = 0\), j ≥ 2.

Finally, for a ∈ I, A ∈ B_I, ℓ ≥ 2, and n ∈ N+ set
\[
D_{a,\ell,n}(A) = D(A) = \Bigl|\gamma_a(\tau^{-n}(A)) - \gamma(A) - (a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda\Bigr|
\]
and note that D(A) = D(I∖A). It follows from (2.3.10) that
\[
D(A) \le (a+1)\,|\lambda_\ell|^{n-1}\int_A\sum_{j\ge\ell}|\psi_j(a)|\,|\psi_j(x)|\,dx
\le (a+1)\,|\lambda_\ell|^{n-1}\Bigl(\sum_{j\ge\ell}\psi_j^2(a)\Bigr)^{1/2}\int_A\Bigl(\sum_{j\ge\ell}\psi_j^2(x)\Bigr)^{1/2}dx
\]
\[
= (\log 2)\,|\lambda_\ell|^{n-1}\int_A\Bigl((a+1)^2\sum_{j\ge\ell}\psi_j^2(a)\Bigr)^{1/2}\Bigl((x+1)^2\sum_{j\ge\ell}\psi_j^2(x)\Bigr)^{1/2}\gamma(dx).
\]
Now, equation (2.3.8) implies
\[
(a+1)^2\sum_{j\ge\ell}\psi_j^2(a) \le (a+1)^2\sum_{j\in\mathbb{N}_+}\frac{1}{(2a+j)^2} - \frac{1}{\log 2} \le \zeta(2) - \frac{1}{\log 2} \tag{2.3.15}
\]
for any a ∈ I and ℓ ≥ 2. (The last inequality can be easily checked.) We therefore obtain
\[
D(A) \le \Bigl(\frac{\pi^2\log 2}{6} - 1\Bigr)|\lambda_\ell|^{n-1}\,\gamma(A).

Since D(A) = D(I∖A) we conclude that
\[
D(A) \le \Bigl(\frac{\pi^2\log 2}{6} - 1\Bigr)|\lambda_\ell|^{n-1}\min\bigl(\gamma(A),\,1-\gamma(A)\bigr). \qquad\square
\]
Note that
\[
\frac{\pi^2\log 2}{6} - 1 = 0.14018\cdots = \varepsilon_2
\]
(cf. Subsection 1.3.6).

Corollary 2.3.6. For any a, x ∈ I, n ∈ N+, and ℓ ≥ 2 we have
\[
\gamma_a(\tau^n < x) - \gamma([0,x]) = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda,
\]
\[
\frac{d}{dx}\,\gamma_a(\tau^n < x) - \frac{1}{(x+1)\log 2} = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x),
\]
\[
\Bigl|\gamma_a(\tau^n < x) - \gamma([0,x]) - (a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda\Bigr|
\le \Bigl(\frac{\pi^2\log 2}{6}-1\Bigr)|\lambda_\ell|^{n-1}\Bigl(\frac{1}{2}-\Bigl|\frac{1}{2}-\gamma([0,x])\Bigr|\Bigr),
\]
\[
\Bigl|\frac{d}{dx}\,\gamma_a(\tau^n < x) - \frac{1}{(x+1)\log 2} - (a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x)\Bigr|
\le \Bigl(\frac{\pi^2}{6}-\frac{1}{\log 2}\Bigr)\frac{|\lambda_\ell|^{n-1}}{x+1}.
\]
Next (cf. Corollary 1.2.5), for any a ∈ I, n, k ∈ N+, and i^{(k)} ∈ N_+^k we have
\[
\Bigl|\frac{\gamma_a\bigl((a_{n+1},\cdots,a_{n+k}) = i^{(k)}\bigr)}{\gamma\bigl([u(i^{(k)}),\,v(i^{(k)})]\bigr)} - 1\Bigr|
\le \Bigl(\frac{\pi^2\log 2}{6}-1\Bigr)\lambda_0^{n-1},

which for k = 1 reduces to
\[
\Bigl|\frac{\gamma_a(a_{n+1} = i)}{(\log 2)^{-1}\log\bigl(1 + 1/i(i+2)\bigr)} - 1\Bigr| \le \Bigl(\frac{\pi^2\log 2}{6}-1\Bigr)\lambda_0^{n-1}
\]
for any a ∈ I and i, n ∈ N+.

Proof. The first equation is (2.3.10) for A = [0, x), x ∈ I, while the second one is simply (2.3.14). (Clearly, the latter can be obtained from the former by differentiation.) The first inequality is that occurring in Theorem 2.3.5 for A = [0, x), x ∈ I, while the second one is easily obtained using (2.3.15). Finally, the last inequality (the general case) is that occurring in Theorem 2.3.5 for A = [u(i^{(k)}), v(i^{(k)})] and ℓ = 2. □

It is interesting to compare Theorem 2.2.5 (with μ = γa, a ∈ I) and Corollary 2.3.6. It is easy to see that for any a, x ∈ I we have
\[
-\lambda_0\,G(f_a)\,\psi(x) = \psi_2(a)\int_0^x\psi_2\,d\lambda, \tag{2.3.16}
\]
where
\[
f_a(x) = \frac{x+1}{(ax+1)^2}, \qquad a, x \in I.

Differentiating (2.3.16) with respect to x and then putting x = a yields
\[
\psi_2^2(a) = -\lambda_0\,G(f_a)\,\psi'(a), \qquad a \in I.
\]
In particular, ψ2²(0) = −λ0 G(1)ψ′(0) = λ0 G(1)U^∞Ψ ≠ 0 (since G(1) > 0). Now, it follows from (2.3.16) that for any x ∈ I such that ψ′(x) ≠ 0 the ratio ψ2(x)/ψ′(x) has a constant value equal to −(sgn ψ2(0))(λ0G(1)/U^∞Ψ)^{1/2}, and that for any a ∈ I such that ψ2(a) ≠ 0 the ratio G(fa)/ψ2(a) has a constant value equal to G(1)/ψ2(0). Then
\[
\psi(x) = -(\operatorname{sgn}\psi_2(0))\Bigl(\frac{U^\infty\Psi}{\lambda_0 G(1)}\Bigr)^{1/2}\int_0^x\psi_2\,d\lambda
\]
and
\[
\psi_2(x) = -(\operatorname{sgn}\psi_2(0))\Bigl(\frac{\lambda_0 G(1)}{U^\infty\Psi}\Bigr)^{1/2}\psi'(x)
\]
for any x ∈ I.

Remark. It follows from Corollary 2.3.6 that the exact convergence rate to 0 as n → ∞ of
\[
\sup_{x\in I}\bigl|\gamma_a(\tau^n < x) - \gamma([0,x])\bigr|, \qquad a \in I, \tag{2.3.17}
\]
is O(λ0^n) as long as ψ2(a) ≠ 0. In particular this holds for a = 0 since, as we have just shown, ψ2(0) ≠ 0. If ψ2(a) = ⋯ = ψ_{j−1}(a) = 0 and ψj(a) ≠ 0 for some j ≥ 3, then the exact convergence rate to 0 as n → ∞ of (2.3.17) is O(λj^n). The high accuracy computations of MacLeod (1993) show, however, that the only possible value of j is j = 3, since there exists a unique a ∈ I, very close to 0.4, with ψ2(a) = 0 while ψ3(a) ≠ 0. □

2.3.4

ψ-mixing revisited

Theorem 2.3.5 allows for an important improvement of Corollary 1.3.15. With the notation of Subsection 1.3.6, it follows from Theorem 2.3.5 that
\[
\varepsilon_{n+1} \le \Bigl(\frac{\pi^2\log 2}{6}-1\Bigr)\lambda_0^{n-1}, \qquad n \in \mathbb{N}_+. \tag{2.3.18}
\]
It is easy to check that for n = 1 we actually have equality in (2.3.18), that is,
\[
\varepsilon_2 = \frac{\pi^2\log 2}{6} - 1 = 0.14018\cdots,
\]
in accordance with the result obtained in Subsection 1.3.6. We can thus reformulate Corollary 1.3.15 as follows.

Proposition 2.3.7. The sequence (a_n)_{n∈N+} is ψ-mixing under γ and any γa, a ∈ I. For any a ∈ I we have ψ_{γa}(1) ≤ 0.61231⋯ and
\[
\psi_{\gamma_a}(n) \le \frac{\varepsilon_2\,\lambda_0^{n-2}(1+\lambda_0)}{1-\varepsilon_2\,\lambda_0^{n-1}}, \qquad n \ge 2.
\]
In particular ψ_{γa}(2) ≤ ε2(1+λ0)/(1−ε2λ0) = 0.19087⋯ for any a ∈ I. Also, ψγ(1) = ε1 = 2 log 2 − 1 = 0.38629⋯, ψγ(2) = ε2 = 0.14018⋯, and ψγ(n) ≤ ε2 λ0^{n−2}, n ≥ 3.

The doubly infinite sequence (ā_ℓ)_{ℓ∈Z} of extended incomplete quotients is ψ-mixing under the extended Gauss measure γ̄, and its ψ-mixing coefficients are equal to the corresponding ψ-mixing coefficients under γ of (a_n)_{n∈N+}.

Remark. From Theorem 2.3.5 we can also obtain a formula expressing the ψ-mixing coefficients ψγ(n), n ≥ 2, in terms of the eigenvalues λj and functions ψj, j ≥ 2, as
\[
\psi_\gamma(n+1) = (\log 2)\sup_{a,b\in I}\Bigl|(a+1)(b+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(b)\Bigr|, \qquad n \in \mathbb{N}_+.
\]
It is not difficult to check that the above formula yields ψγ(2) = ε2. Otherwise it seems to be of little value. □
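The numerical values quoted in Proposition 2.3.7 follow directly from ε2 = π²log2/6 − 1 and λ0 = 0.3036630029⋯; a quick check (illustrative only, not part of the original text):

```python
import math

lam0 = 0.3036630029                                # Gauss-Kuzmin-Wirsing constant
eps2 = math.pi ** 2 * math.log(2) / 6 - 1          # about 0.14018
eps1 = 2 * math.log(2) - 1                         # about 0.38629
bound2 = eps2 * (1 + lam0) / (1 - eps2 * lam0)     # upper bound for psi_{gamma_a}(2)
print(eps1, eps2, bound2)
```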

2.4

Extending Babenko's and Wirsing's work

2.4.1

The Mayer–Roepstorff Hilbert space approach

In this subsection we describe the setting devised by Mayer and Roepstorff (1987) for Babenko's work, which is thus simplified and extended. Proofs are in general not given; for them the reader is referred to the original paper.

Let m denote the measure on B_{R+} with density
\[
\frac{dm}{dt}(t) = \frac{t}{e^t-1}, \qquad t \in \mathbb{R}_+.
\]
Note that
\[
m(\mathbb{R}_+) = \int_{\mathbb{R}_+}\frac{t}{e^t-1}\,dt = \sum_{k\in\mathbb{N}_+}\int_{\mathbb{R}_+}t\,e^{-kt}\,dt = \sum_{k\in\mathbb{N}_+}\frac{1}{k^2} = \zeta(2).
\]
Consider the Hilbert space L²(R+, B_{R+}, m) = L²_m(R+) of m-square integrable functions f: R+ → C with inner product (·,·)_m defined by
\[
(\varphi,\psi)_m = \int_{\mathbb{R}_+}\varphi\,\psi^*\,dm, \qquad \varphi,\psi \in L^2_m(\mathbb{R}_+),
\]
and norm
\[
\|\varphi\|_{2,m} = \Bigl(\int_{\mathbb{R}_+}|\varphi|^2\,dm\Bigr)^{1/2}, \qquad \varphi \in L^2_m(\mathbb{R}_+).
\]
Let D denote the half-plane Re z > −1/2 and consider the measure ν on B_D with density
\[
\frac{d\nu}{dx\,dy}(x+iy) =
\begin{cases}
\dfrac{1}{\pi\bigl((x+1)^2+y^2\bigr)} & \text{if } -\tfrac12 < x < 0,\ y \in \mathbb{R},\\[2mm]
0 & \text{otherwise.}
\end{cases}
\]
Note that
\[
\nu(D) = \frac{1}{\pi}\int_{-1/2}^0 dx\int_{\mathbb{R}}\frac{dy}{(x+1)^2+y^2} = \int_{-1/2}^0\frac{dx}{x+1} = \log 2.
Solving Gauss’ problem

121

**Consider the Hilbert space H 2 (ν) of functions f holomorphic in D such that (z + 1)−1 f (z) is bounded in every half-plane Re z > −1/2 + ε, ε > 0, and f
**

2,ν

=

D

|f |2 dν

1/2

< ∞,

**with inner product (·, ·)ν deﬁned by (f, g)ν = f g ∗ dν,
**

D 2,ν .

f, g ∈ H 2 (ν) .

**Thus H 2 (ν) is a Banach space under the norm · Let f denote the restriction of f ∈ U ∞f =
**

I

H 2 (ν)

to I. Then (2.4.1)

f dγ =

(f, 1)ν log 2 .

and f

2,γ

≤ f

2,ν

(2.4.2)

**Next, the linear mapping M : L2 (R+ ) → H 2 (ν) deﬁned by m M ϕ (z) = (z + 1)
**

R+

e−zt ϕ (t) m(dt),

ϕ ∈ L2 (R+ ) , z ∈ D, m

is an isometry and the image under M of L2 (R+ ) is H 2 (ν). m The Perron–Frobenius operator U takes H 2 (ν) into itself. Obviously, for f ∈ H 2 (ν) we deﬁne U f by U f (z) =

i∈N+

Pi (z) f

1 z+i

,

z ∈ D.

The mapping K : ϕ → Kϕ, where Kϕ (s) = √ ϕ (t) J1 2 st √ m (dt) , st R+ ϕ ∈ L2 (R+ ) , s ∈ R+ , m

deﬁnes on L2 (R+ ) an integral symmetric linear operator with continuous m kernel √ J1 2 st (−1)n √ k (s, t) = (st)n , s, t ∈ R+ . = n! (n + 1)! st n∈N

K̃ has infinite-dimensional range, is nuclear (of trace class) and, therefore, compact. The spectra of the operators K̃ and K (the latter introduced in Subsection 2.3.2) coincide. Thus, with the notation of Subsection 2.3.2 for the eigenvalues of K, we have
\[
\tilde K\varphi = \sum_{k\in\mathbb{N}_+}\lambda_k\,(\varphi,\tilde\varphi_k)_m\,\tilde\varphi_k, \qquad \varphi \in L^2_m(\mathbb{R}_+), \tag{2.4.3}
\]
where φ̃k is an eigenfunction corresponding to λk, that is, K̃φ̃k = λkφ̃k, k ∈ N+, and the φ̃k, k ∈ N+, define an orthonormal basis of L²_m(R+). Actually,
\[
\tilde\varphi_k(t) = \Bigl(\frac{e^t-1}{t}\Bigr)^{1/2}\varphi_k(t), \qquad k \in \mathbb{N}_+,\ t \in \mathbb{R}_+,
\]
where the φk, k ∈ N+, are those introduced in Subsection 2.3.2.

The operators M, K̃ and U are connected by the equation U = MK̃M^{−1}. Hence
\[
U^n = M\tilde K^n M^{-1}, \qquad n \in \mathbb{N}_+. \tag{2.4.4}
\]
From (2.4.3) we have
\[
\tilde K^n\varphi = \sum_{k\in\mathbb{N}_+}\lambda_k^n\,(\varphi,\tilde\varphi_k)_m\,\tilde\varphi_k, \qquad n \in \mathbb{N}_+,\ \varphi \in L^2_m(\mathbb{R}_+). \tag{2.4.5}

It then follows from (2.4.4) and (2.4.5) that
\[
U^n g = \sum_{k\in\mathbb{N}_+}\lambda_k^n\,(M^{-1}g,\tilde\varphi_k)_m\,M\tilde\varphi_k, \qquad n \in \mathbb{N}_+,\ g \in H^2(\nu).
\]
Alternatively,
\[
U^n g = \sum_{k\in\mathbb{N}_+}\lambda_k^n\,(g,M\tilde\varphi_k)_\nu\,M\tilde\varphi_k, \qquad n \in \mathbb{N}_+,\ g \in H^2(\nu).
\]
For k = 1 we have λ1 = 1 and
\[
\tilde\varphi_1(t) = \frac{1}{(\log 2)^{1/2}}\,\frac{e^t-1}{t}\,e^{-t}, \qquad t \in \mathbb{R}_+.
\]
Therefore
\[
M\tilde\varphi_1(z) = (z+1)\int_{\mathbb{R}_+}e^{-zt}\tilde\varphi_1(t)\,m(dt)
= \frac{z+1}{(\log 2)^{1/2}}\int_{\mathbb{R}_+}e^{-(z+1)t}\,dt = \frac{1}{(\log 2)^{1/2}}, \qquad z \in D,
\]
and, by (2.4.1),
\[
(g,M\tilde\varphi_1)_\nu\,M\tilde\varphi_1 = \frac{1}{\log 2}\,(g,1)_\nu = U^\infty\tilde g, \qquad g \in H^2(\nu).
\]
As 0 is not an eigenvalue of K̃, we also have
\[
M^{-1}g = \sum_{k\in\mathbb{N}_+}(M^{-1}g,\tilde\varphi_k)_m\,\tilde\varphi_k, \qquad g \in H^2(\nu),
\]
or, alternatively,
\[
g = \sum_{k\in\mathbb{N}_+}(g,M\tilde\varphi_k)_\nu\,M\tilde\varphi_k, \qquad g \in H^2(\nu).
\]
Then
\[
\|M^{-1}g\|_{2,m}^2 = \|g\|_{2,\nu}^2 = \sum_{k\in\mathbb{N}_+}\bigl|(M^{-1}g,\tilde\varphi_k)_m\bigr|^2 = \sum_{k\in\mathbb{N}_+}\bigl|(g,M\tilde\varphi_k)_\nu\bigr|^2
for any g ∈ H²(ν). Therefore
\[
\|U^n g - U^\infty\tilde g\|_{2,\nu}^2 = \sum_{k\ge 2}|\lambda_k|^{2n}\bigl|(g,M\tilde\varphi_k)_\nu\bigr|^2
\le \bigl(\|g\|_{2,\nu}^2 - |U^\infty\tilde g|^2\log 2\bigr)\,|\lambda_2|^{2n} \tag{2.4.6}
\]
for any n ∈ N+ and g ∈ H²(ν). Inequalities (2.4.2) and (2.4.6) imply the following result.

Proposition 2.4.1. Let g ∈ H²(ν). Then for any n ∈ N+ we have
\[
\|U^n\tilde g - U^\infty\tilde g\|_{2,\gamma} \le \bigl(\|g\|_{2,\nu}^2 - |U^\infty\tilde g|^2\log 2\bigr)^{1/2}\,|\lambda_2|^n.
\]

Corollary 2.4.2 (L²-version of the Gauss–Kuzmin–Lévy theorem). Let h: D → C be such that the function z ↦ (z+1)h(z), z ∈ D, belongs to H²(ν), and the restriction of h to I is the Radon–Nikodym derivative with respect to λ of a probability measure μ on B_I. Then
\[
\bigl|\mu(\tau^{-n}(A)) - \gamma(A)\bigr| \le (\log 2)\,\gamma^{1/2}(A)\Bigl(\frac{1}{\pi}\int_D|h(x+iy)|^2\,dx\,dy - \frac{1}{\log 2}\Bigr)^{1/2}|\lambda_2|^n \tag{2.4.7}

for any n ∈ N+ and A ∈ B_I.

Proof. Let g(z) = (z+1)h(z), z ∈ D. For any A ∈ B_I and n ∈ N+ we have
\[
\bigl|(I_A,\,U^n\tilde g - U^\infty\tilde g)_\gamma\bigr| \le \Bigl(\int_I I_A^2\,d\gamma\Bigr)^{1/2}\,\|U^n\tilde g - U^\infty\tilde g\|_{2,\gamma}. \tag{2.4.8}
\]
But
\[
(I_A,\,U^n\tilde g - U^\infty\tilde g)_\gamma = \frac{1}{\log 2}\int_A\frac{U^n\tilde g(x) - U^\infty\tilde g}{x+1}\,dx
\]
and, by Proposition 2.1.5,
\[
\int_A\frac{U^n\tilde g(x) - U^\infty\tilde g}{x+1}\,dx = \mu(\tau^{-n}(A)) - \gamma(A)
\]
since
\[
U^\infty\tilde g = \frac{1}{\log 2}\int_I\frac{(x+1)h(x)}{x+1}\,dx = \frac{1}{\log 2}.
\]
Therefore (2.4.8) amounts to
\[
\bigl|\mu(\tau^{-n}(A)) - \gamma(A)\bigr| \le (\log 2)\,\gamma^{1/2}(A)\,\|U^n\tilde g - U^\infty\tilde g\|_{2,\gamma} \tag{2.4.9}
\]
for any n ∈ N+ and A ∈ B_I. Now, (2.4.7) follows from (2.4.9) and Proposition 2.4.1. □

Remark. Inequality (2.4.6) can be obviously generalized as follows. For any n, ℓ ∈ N+ and g ∈ H²(ν) we have
\[
\Bigl\|U^n g - U^\infty\tilde g - \sum_{2\le k\le\ell}\lambda_k^n\,(g,M\tilde\varphi_k)_\nu\,M\tilde\varphi_k\Bigr\|_{2,\nu}^2
\le \Bigl(\|g\|_{2,\nu}^2 - |U^\infty\tilde g|^2\log 2 - \sum_{2\le k\le\ell}\bigl|(g,M\tilde\varphi_k)_\nu\bigr|^2\Bigr)|\lambda_{\ell+1}|^{2n},
\]
with the usual convention which assigns the value 0 to a sum over the empty set. Proposition 2.4.1 and Corollary 2.4.2 can be accordingly generalized. □

We can again derive the 'exact' Gauss–Kuzmin–Lévy Theorem 2.3.5. First, we clearly have
\[
\tilde\psi_k(z) := M\tilde\varphi_k(z) = (z+1)\int_{\mathbb{R}_+}e^{-zt}\tilde\varphi_k(t)\,m(dt)
= (z+1)\int_{\mathbb{R}_+}e^{-zt}\,\frac{t^{1/2}}{(e^t-1)^{1/2}}\,\varphi_k(t)\,dt
= (z+1)\,\psi_k(z), \qquad k \in \mathbb{N}_+,\ z \in D. \tag{2.4.10}
\]
Second, the function ga, a ∈ I, defined by
\[
g_a(z) = \frac{(a+1)(z+1)}{(az+1)^2}, \qquad z \in D,
\]
does not belong to H²(ν) for a = 0. Instead, the function
\[
Ug_a(z) = (a+1)(z+1)\sum_{j\in\mathbb{N}_+}\frac{1}{(z+a+j)^2}, \qquad z \in D,
\]
does belong to H²(ν) for any a ∈ I. Then
\[
U^n g_a = U^{n-1}(Ug_a) = \sum_{k\in\mathbb{N}_+}\lambda_k^{n-1}\,(M^{-1}Ug_a,\tilde\varphi_k)_m\,\tilde\psi_k
\]
for any a ∈ I and n ∈ N+. Now, it is easy to check that
\[
M^{-1}Ug_a(t) = (a+1)\,e^{-at}, \qquad a \in I,\ t \in \mathbb{R}_+. \tag{2.4.11}
\]
Hence
\[
(M^{-1}Ug_a,\tilde\varphi_k)_m = (a+1)\int_{\mathbb{R}_+}e^{-at}\tilde\varphi_k(t)\,m(dt) = (a+1)\,\psi_k(a), \qquad a \in I,\ k \in \mathbb{N}_+.
\]
Therefore
\[
U^n g_a = (a+1)\sum_{k\in\mathbb{N}_+}\lambda_k^{n-1}\,\psi_k(a)\,\tilde\psi_k, \qquad n \in \mathbb{N}_+,\ a \in I, \tag{2.4.12}
\]
which by (2.4.10) is identical with (2.3.14).

Note that by (2.4.11), for any a ∈ I we have
\[
\|Ug_a\|_{2,\nu}^2 = \|M^{-1}Ug_a\|_{2,m}^2 = (a+1)^2\int_{\mathbb{R}_+}e^{-2at}\,\frac{t}{e^t-1}\,dt
= (a+1)^2\sum_{k\in\mathbb{N}_+}\int_{\mathbb{R}_+}t\,e^{-(2a+k)t}\,dt
= (a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}, \tag{2.4.13}
\]
that is, by Proposition 2.3.4,
\[
\|Ug_a\|_{2,\nu}^2 = (a+1)^2\sum_{k\in\mathbb{N}_+}|\psi_k(a)|^2.
\]
This result is not at all surprising. It can be derived immediately from (2.4.12) with n = 1, on account of the fact that (ψ̃k)_{k∈N+} is an orthonormal basis of H²(ν). (Remark that the ψk, k ∈ N+, are not pairwise orthogonal in H!) Next,
\[
U^\infty\widetilde{Ug_a} = U^\infty\tilde g_a = \frac{1}{\log 2}, \qquad a \in I. \tag{2.4.14}
\]
It then follows from Proposition 2.4.1 that for any n ∈ N+ we have
\[
\|U^n\tilde g_a - U^\infty\tilde g_a\|_{2,\gamma} = \bigl\|U^{n-1}\bigl(\widetilde{Ug_a}\bigr) - U^\infty\tilde g_a\bigr\|_{2,\gamma}
\le \bigl(\|Ug_a\|_{2,\nu}^2 - |U^\infty\tilde g_a|^2\log 2\bigr)^{1/2}\,|\lambda_2|^{n-1}. \tag{2.4.15}
Proposition 2.4.3. For any a ∈ I, n ∈ N+, and A ∈ B_I we have
\[
\bigl|\gamma_a(\tau^{-n}(A)) - \gamma(A)\bigr| \le (\log 2)\,\gamma^{1/2}(A)\Bigl((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2} - \frac{1}{\log 2}\Bigr)^{1/2}|\lambda_2|^{n-1}. \tag{2.4.16}
\]
Proof. The function
\[
h_a(x) = \frac{\tilde g_a(x)}{x+1} = \frac{a+1}{(ax+1)^2}, \qquad x \in I,
\]
is just the Radon–Nikodym derivative dγa/dλ. Now, (2.4.16) follows from (2.4.9) and (2.4.13) through (2.4.15). □

Remarks. 1. On account of the remark following Corollary 2.4.2, inequality (2.4.16) can be generalized as follows. For any a ∈ I, ℓ, n ∈ N+, and A ∈ B_I we have
\[
\Bigl|\gamma_a(\tau^{-n}(A)) - \gamma(A) - (\log 2)\sum_{2\le k\le\ell}\lambda_k^{n-1}\,\tilde\psi_k(a)\int_A\tilde\psi_k\,d\gamma\Bigr|
\le (\log 2)\,\gamma^{1/2}(A)\Bigl((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2} - \sum_{1\le k\le\ell}\tilde\psi_k^2(a)\Bigr)^{1/2}|\lambda_{\ell+1}|^{n-1}. \tag{2.4.17}
\]
2. It is instructive to compare the inequality in Theorem 2.3.5 with (2.4.17). The difference between them reflects the difference between the Hilbert spaces H and H²(ν). □

2.4.2

The Mayer–Roepstorff Banach space approach

In this subsection we give a summary of the work of Mayer and Roepstorff (1988) on the u0-positivity of the Perron–Frobenius operators Pλ and U = Pγ on a suitable Banach space.

Let us first recall a few concepts concerning positive operators with respect to a cone in a real Banach space B. A closed convex subset C of B is called a cone if and only if (i) x ∈ C and a ∈ R+ imply ax ∈ C, and (ii) x ∈ C and −x ∈ C imply x = 0. A cone C induces a partial order ≤_C (≤ for short): x ≤ y if and only if y − x ∈ C. A cone C is said to be reproducing if and only if B = C − C, that is, any z ∈ B can be written as z = x − y with x, y ∈ C. A linear operator T: B → B is said to be positive with respect to a cone C if and only if TC ⊂ C. Let C be a cone and 0 ≠ u0 ∈ C. An operator T which is positive with respect to C is said to be u0-positive if and only if for any 0 ≠ x ∈ C there exist p ∈ N+ and α, β ∈ R++ such that αu0 ≤ T^p x ≤ βu0.

Compact operators on the complexification of B which are positive with respect to a reproducing cone C ⊂ B and u0-positive for some 0 ≠ u0 ∈ C enjoy properties similar to those of finite positive matrices: they obey a generalization of the Perron–Frobenius theorem for such matrices. For details the reader is referred to Krasnoselskii (1964).

Coming back to our problem, let D1 = {z ∈ C: |z − 1| < 3/2} and consider the collection A(D1) of all functions holomorphic in D1 which, together with their first derivatives, are continuous in the closure of D1; A(D1) is a Banach space under the norm
\[
\|f\| = \max\Bigl(\sup_{z\in D_1}|f(z)|,\ \sup_{z\in D_1}|f'(z)|\Bigr), \qquad f \in A(D_1).
\]
Both operators Pλ and U take A(D1) into itself; for f ∈ A(D1) we define Pλf and Uf by
\[
P_\lambda f(z) = \sum_{i\in\mathbb{N}_+}\frac{1}{(z+i)^2}\,f\Bigl(\frac{1}{z+i}\Bigr), \qquad z \in D_1,
\]
and
\[
Uf(z) = \sum_{i\in\mathbb{N}_+}P_i(z)\,f\Bigl(\frac{1}{z+i}\Bigr), \qquad z \in D_1,
\]
respectively. Both Pλ and U are nuclear operators of trace class on A(D1).

Let us write (compare with Subsection 2.1.2) Pλ = Π1 + T0, where
\[
\Pi_1 f(z) = f_1(z)\int_I f\,d\lambda, \qquad f \in A(D_1),\ z \in D_1,
\]
and
\[
f_1(z) = \frac{(\log 2)^{-1}}{z+1}, \qquad z \in D_1.
Since Pλ (f1 f ) = f1 U f, f ∈ A (D1 ), the spectra of the operators Pλ and U on A (D1 ) are identical, algebraic multiplicities of the eigenvalues included. Theorem 2.4.4 The spectra of U on A (D1 ) and on H 2 (ν) (see Subsection 2.4.1) are identical, algebraic multiplicities of the eigenvalues included. Consider the subspaces A⊥ (D1 ) = f ∈ A (D1 ) : U ∞ f = f dγ = 0

I

**Solving Gauss’ problem and A⊥ (D1 ) = f ∈ A(D1 ) : f dλ = 0
**

I

129

of A (D1 ) and the real subspaces A⊥ (D1 ) A⊥ (D1 ) r r

of A⊥ (D1 )

A⊥ (D1 )

consisting of functions that take real values on R ∩ D1 = [−1/2, 5/2]. Note that by Proposition 2.1.1(ii) U leaves invariant both subspaces A⊥ (D1 ) and A⊥ (D1 ) while Pλ leaves invariant both subspaces A⊥ (D1 ) and A⊥ (D1 ). r r ⊥ (D ) is just A⊥ (D ) ⊥ (D ) . ⊥ (D ) Ar A The complexiﬁcation of Ar 1 1 1 1 Also, the spectrum of T0 on A⊥ (D1 ) is identical with the spectrum of U on A⊥ (D1 ). The set C = f ∈ A⊥ (D1 ) : f ≥ 0 on [−1/2, 5/2] r is a reproducing cone in A⊥ (D1 ) . Deﬁne u0 ∈ A(D1 ) by r u0 (z) = z + 1 − Clearly, u0 ∈ C. Theorem 2.4.5 The operator −U on A⊥ (D1 ) is positive with respect r to the cone C . Moreover, −U is u0 -positive. Hence the operator − U + U ∞ on A (D1 ) has a simple positive dominant eigenvalue equal to λ0 (cf. Theorem 2.2.5) with eigenfunction f2 in the interior C o of C. There is no other eigenfunction in C. Corollary 2.4.6 The operator −T0 on A⊥ (D1 ) is positive with respect to r the (reproducing) cone f1 C = (f1 f : f ∈ C). Moreover, −T0 is f1 u0 -positive. Hence the operator −T0 on A (D1 ) has a simple positive dominant eigenvalue equal to λ0 with eigenfunction f2 = f1 f2 . There is no other eigenfunction in f1 C. Note that a minimax principle for −λ0 holds. We namely have

f ∈C

1 , log 2

z ∈ D1 .

min o

(U f ) (x) (U f ) (x) = −λ0 = max . min o −1/2≤x≤5/2 f ∈C f (x) −1/2≤x≤5/2 f (x) max (U f ) (x) (U f ) (x) ≤ −λ0 ≤ max f (x) f (x) −1/2≤x≤5/2 −1/2≤x≤5/2 min z+1 − c, z + 1.14617

Hence

for any f ∈ C o . For example, taking f (z) = z ∈ D1 ,

130 with c chosen such that f ∈ A⊥ (D1 ), we obtain 0.2995 ≤ λ0 ≤ 0.3038, that is, an approximation which is good enough.

Chapter 2

2.4.3

Mayer–Ruelle operators

Statistical mechanics problems motivated the consideration of a class of operators including as a special case the Perron–Frobenius operator Pλ of τ under λ. This class has been thoroughly studied by Mayer (1990, 1991). Nowadays, these operators are named after him and D. Ruelle. Let D1 = (z ∈ C : |z − 1| < 3/2) and consider the collection A∞ (D1 ) of all holomorphic functions in D1 which are continuous in D1 ; A∞ (D1 ) is a Banach space under the supremum norm || f || = sup |f (z)| ,

z∈D1

f ∈ A∞ (D1 ).

**For any β ∈ C with Re β > 1 and f ∈ A∞ (D1 ) deﬁne Gβ f (z) =
**

i∈N+

1 f (z + i)β

1 z+i

,

z ∈ D1 .

It is easy to check that Gβ is a bounded linear operator on A∞ (D1 ). Hence, as mentioned when discussing nuclear operators in Subsection 2.3.2, Gβ is nuclear of order 0 and thus has a discrete spectrum. For β = 2, Gβ has the same analytical expression as Pλ . In what follows we give without proofs the most important properties of the Mayer–Ruelle operator Gβ for Re β > 1, which generalize those of Pλ . For proofs we refer the reader to Mayer (1990, 1991). See also Daud´ et al. (1997), Faivre e (1992), Flajolet and Vall´e (1998, 2000), and Vall´e (1997). e e Theorem 2.4.7 Let β be real, strictly greater than 1. (i) The operator Gβ : A∞ (D1 ) → A∞ (D1 ) has a positive dominant eigenvalue λ(β) which is simple and strictly greater in absolute value than all other eigenvalues. The corresponding eigenfunction gβ ∈ A∞ (D1 ) is strictly positive on D1 ∩ R = [−1/2, 5/2]. (ii) The map β → λ(β) deﬁnes on (1, ∞) a strictly decreasing and logconcave function with √ log λ(β) 5−1 = log . lim λ(β) = ∞, λ(2) = 1, lim β↓1 β→∞ β 2

Solving Gauss’ problem Moreover, λ(β + u) ≤ √ 5−1 2

131

u

λ(β),

u ∈ R+ .

(iii) There exists a linear functional β on A∞ (D1 ) with β (gβ ) = 1 and (f ) > 0 for any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 (here f |[−1/2,5/2] β denotes the restriction of f to [−1/2, 5/2]). If Π1β denotes the projection deﬁned as Π1β f = β (f )gβ , f ∈ A∞ (D1 ), then Gβ = λ(β)Π1β + T0β with Π1β T0β = T0β Π1β = 0. Hence

n Gn = λn (β)Π1β + T0β , β

n ∈ N+ .

(iv) The spectral radius ρ(β) of the linear operator T0β : A∞ (D1 ) → A∞ (D1 ) is strictly smaller than λ(β), and for any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 we have Gn f (z) β λn (β) β (f )gβ (z) =1+O ρ(β) λ(β)

n

as n → ∞, where the constant implied in O is independent of z ∈ D1 (but dependent on f and β). (v) There exists ε = ε(β) > 0 such that for any α ∈ C satisfying |α − β| ≤ ε the dominant spectral properties of Gβ : A∞ (D1 ) → A∞ (D1 ) transfer to Gα : A∞ (D1 ) → A∞ (D1 ) : quantities λ(α), ρ(α), gα , α (thus Π1α ) and T0α can be deﬁned to represent the dominant spectral objects associated with Gα , and all of them are analytical with respect to α. Moreover, let a ∈ (ρ(β)/λ(β), 1) . For any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 we have Gn f (z) α = 1 + O(an ) λn (α) α (f )gα (z) as n → ∞, where the constant implied in O is independent of z ∈ D1 and α satisfying |α − β| ≤ ε, but depends on a, f , and β. Finally, ρ(β + it) < ρ(β) for t ∈ [−ε, ε] , t = 0. The proof is the same Perron–Frobenius type of argument used in the case β = 2, which has been sketched in the preceding subsection. There the

132

Chapter 2

existence of a dominant simple real (in fact, negative) eigenvalue of T02 = T0 followed by considering the subspace A(D1 ) ⊂ A∞ (D1 ). 2 As in the special case β = 2, the Mayer–Ruelle operators enjoy better properties when deﬁned on suitable Hilbert spaces. Let Re β > 1. Consider the collection H (β) of functions f which are holomorphic in the half plane Re z > −1/2, bounded in any half-plane Re z > −1/2 + ε, ε > 0, and can be represented in the form f (z) =

R+

e−zs ϕ(s)(β−1)/2 m (ds),

Re z > −1/2,

(2.4.18)

where m is the measure on BR+ with density 1 if s > 0, s dm e −1 = ds 0 if s = 0, for some ϕ ∈ L2 (R+ ), the Hilbert space of m -square integrable functions m ϕ : R+ → C with inner product (·, ·)m deﬁned by (ϕ, ψ)m = and norm ||ϕ||2,m = |ϕ|2 dm

R+ 1/2

ϕψ ∗ dm ,

R+

ϕ, ψ ∈ L2 (R+ ) m

,

ϕ ∈ L2 (R+ ). m

Introducing the inner product (f1 , f2 )(β) = (ϕ1 , ϕ2 )m , where ϕi is associated with fi , i = 1, 2, by (2.4.18), H (β) is made a Hilbert space with norm || f ||(β) = ||ϕ||2,m , f ∈ H (β) , where f and ϕ are again associated by (2.4.18). Theorem 2.4.8 Let Re β > 1. (i) The linear operator Gβ takes boundedly H (β) into itself. (ii) For any f ∈ H (β) we have Gβ f (z) = e−zs Kβ ϕ(s)s(β−1)/2 m (ds), Re z > −1/2,

R+

Solving Gauss’ problem

133

where Kβ : L2 (R+ ) → L2 (R+ ) is a symmetric integral operator deﬁned m m by Kβ ϕ(s) = √ Jβ−1 2 st ϕ(t)m (dt), ϕ ∈ L2 (R+ ), s ∈ R+ . m

R+

**Here Jβ−1 is the Bessel function of order β − 1 deﬁned by Jβ−1 (u) = u 2
**

β−1 k∈N

u (−1)k k! Γ(k + β) 2

2k

,

u ∈ R+ .

Hence Gβ : H (β) → H (β) can be diagonalized in an orthonormal basis of H (β) . Moreover, if β ∈ R then Gβ is self-adjoint and its spectrum is real. (iii) The spectra of the operators Gβ : A∞ (D1 ) → A∞ (D1 ), Gβ : H (β) → H (β) and Kβ : L2 (R+ ) → L2 (R+ ) are identical. Hence for any real β > 1 m m these spectra are all real. Let us note in particular that for β = 2 the symmetric operator K2 from Theorem 2.4.8 is diﬀerent from the symmetric operator K from Proposition 2.3.1. They are related by the simple relation K2 = SKS −1 , where S : L2 (R+ ) → L2 (R+ ) is an invertible linear operator deﬁned by m S ϕ(s) = (es − 1)1/2 ϕ(s), s ∈ R+ .

Hence the spectra of K and K2 are identical. As for K, formulae for the trace of Kβ and its powers are available. Denoting by λi (β), i ∈ N+ , the eigenvalues of Kβ taken in order of decreasing moduli and counting their multiplicity, we have Tr Kβ =

i∈N+

λi (β) =

i∈N+

1

β−2 2 yi (yi

+ 1)

,

where yi = i +

n Tr Kβ =

√ i2 + 4 /2, i ∈ N+ , and, in general, λn (β) = i

i∈N+ i1 ,··· ,in ∈N+

1

β−2 yi1 ···in 2 yi1 ···in

+ (−1)n−1

,

where yi1 ···in =

pn−1 + qn +

(pn−1 + qn )2 + 4(−1)n−1 2 p0 = 0,

with, as usual, pn = [i1 , · · · , in ] , qn

g.c.d. (pn , qn ) = 1,

134

Chapter 2

for any n ∈ N+ and i1 , · · · , in ∈ N+ . Let us note that for β = 2 we recover Babenko’s formula for Tr K n , n ∈ N+ . See the remark following the proof of Proposition 2.3.2. In particular [see Daud´ et al. (1997)], we have e Tr K4 = 7 1 7 2 −√ −√ + 2 5 2 2 (−1)i

i≥2

i−1 i+1

2i i

ζ(2i) − 1 −

1 22i

**= 0.14446 23962 46160 81588 · · · ,
**

2 Tr K4 = 0.04647 18256 42727 93983 · · · ,

and

λ1 (4) = 0.19945 88183 43767 26019 λ3 (4) = 0.02856 64037 69818 52783 λ5 (4) = 0.00407 09406 93426 42144

··· , ··· , ··· .

λ2 (4) = −0.07573 95140 84360 60892 · · · , λ4 (4) = −0.01077 74165 76612 69829 · · · , To conclude this brief discussion of Mayer–Ruelle operators we mention two generalizations of them. a. For any subset M of N+ deﬁne GM,β f (z) =

i∈M

1 f (z + i)β

1 z+i

,

z ∈ D1 ,

whatever β ∈ C with Re β > 1 and f ∈ A∞ (D1 ). Clearly, GM,β is a bounded linear operator on A∞ (D1 ), hence a nuclear one of trace class, which coincides with Gβ when M = N+ . Now, for an arbitrarily ﬁxed k ∈ N+ , let Mi , 1 ≤ i ≤ k, be subsets of N+ and write M = (M1 , . . . , Mk ). Consider the linear operator GM,β : A∞ (D1 ) → A∞ (D1 ) deﬁned as GM,β = GMk ,β ◦ · · · ◦ GM1 ,β , which is nuclear of trace class, too. The operators GM,β for various M control the dynamics of continued fraction expansions of irrationals subject to periodical constraints. Their spectral properties are entirely similar to those of Gβ . For details see Vall´e e (1998), who considered systematically such operators. See, however, Fluch (1986, 1992) for special cases.

Solving Gauss’ problem

135

b. The second generalization has been motivated by the study of the transformation 1 1 z → − Re , 0 = z ∈ C, z z which extends to the complex domain the continued fraction transformation τ . Let 5 D2 = z : |z − 1| < , 4 and consider the collection B∞ (D2 ) of all functions F which are holomorphic 2 2 in D2 and continuous in D2 . Under the supremum norm || F || = sup

2 (z,w)∈D2

|F (z, w)| ,

**B∞ (D2 ) is a Banach space. Then for any (α, β) ∈ C2 with Re (α + β) > 1 a linear bounded operator Gα,β : B∞ (D2 ) → B∞ (D2 ) is deﬁned by Gα,β F (z, w) =
**

i∈N+

1 (z + i)α (w + i)β

F

1 1 , z+i w+i

2 for any F ∈ B∞ (D2 ) and (z, w) ∈ D2 . The spectral properties of Gα,β , which is positive and nuclear of trace class, are strongly related to those of Gα+β+2 , ∈ N. For details see Vall´e (1997). e

2.5

2.5.1

**The Markov chain associated with the continued fraction expansion
**

The Perron–Frobenius operator on BV (I)

In this section we study the Perron–Frobenius operator U on BV (I). This is motivated by Proposition 2.1.10 which establishes U as the transition operator of certain Markov chains. Throughout, except for Corollary 2.5.7, we consider just real-valued functions in BV (I). By Proposition 2.1.16, the operator U deﬁned by (2.1.16) is a bounded linear operator of norm 1 on BV (I). Moreover, by Corollary 2.1.13 we have 1 var U f ≤ var f 2

136 for any f ∈ BV (I), the constant 1/2 being optimal. Hence var U n f ≤ 2−n var f

Chapter 2

for any f ∈ BV (I) and n ∈ N+ . As might be expected, we shall see that the constant 2−n is not optimal for n > 1. A natural problem thus arises: what is the upper bound of var U n f /var f over non-constant f ∈ BV (I)? A satisfactory answer to this problem will be given in Theorem 2.5.3 and Corollary 2.5.6. It is easy to check by induction with respect to n ∈ N+ that U n f (x) =

i1 ,··· ,in ∈N+

Pi1 ···in (x)f (uin ···i1 (x)),

x ∈ I,

(2.5.1)

where uin ···i1 = uin ◦ · · · ◦ ui1 , (2.5.2) Pi1 ···in (x) = Pi1 (x)Pi2 (ui1 (x)) · · · Pin (uin−1 ···i1 (x)), and the functions ui and Pi , i ∈ N+ are deﬁned by ui (x) = 1 , x+i Pi (x) = x+1 , (x + i)(x + i + 1) x ∈ I. n ≥ 2,

Note that by Proposition 2.1.10 we have
\[
U^n f(x) = \mathrm{E}_x\bigl(f(s^x_n)\bigr)
\]
for any $n \in \mathbf{N}$, $f \in B(I)$, and $x \in I$ (remember that $s^x_0 = x$, $x \in I$), where $\mathrm{E}_x$ denotes the mean value operator with respect to the probability measure $\gamma_x$. As
\[
s^x_n = u_{a_n\cdots a_1}(x), \quad x \in I, \ n \in \mathbf{N}_+,
\]
we thus have
\[
U^n f(x) = \sum_{i^{(n)} \in \mathbf{N}_+^n} \gamma_x\bigl((a_1,\dots,a_n) = i^{(n)}\bigr)\, f(u_{i_n\cdots i_1}(x)) \qquad (2.5.3)
\]
for any $n \in \mathbf{N}_+$, $f \in B(I)$, and $x \in I$. Hence
\[
P_{i_1\cdots i_n}(x) = \gamma_x(I(i^{(n)})) \qquad (2.5.4)
\]
for any $x \in I$, $n \in \mathbf{N}_+$, and $(i_1,\dots,i_n) = i^{(n)} \in \mathbf{N}_+^n$. Of course, equation (2.5.4) could also be obtained by direct computation.

Now, by (1.2.4), $I(i^{(n)})$ is the set of irrationals in the interval with endpoints $p_n/q_n$ and $(p_n+p_{n-1})/(q_n+q_{n-1})$. Since
\[
\frac{p_n}{q_n} = [i_1,\dots,i_n] =
\begin{cases}
\dfrac{1}{i_1} & \text{if } n=1,\\[6pt]
\dfrac{1}{i_1 + p_{n-1}(i_2,\dots,i_n)/q_{n-1}(i_2,\dots,i_n)} & \text{if } n>1,
\end{cases}
\]
and
\[
\frac{p_n+p_{n-1}}{q_n+q_{n-1}} = [i_1,\dots,i_{n-1},i_n+1] =
\begin{cases}
\dfrac{1}{i_1+1} & \text{if } n=1,\\[6pt]
\dfrac{1}{i_1 + p_n(i_2,\dots,i_n,1)/q_n(i_2,\dots,i_n,1)} & \text{if } n>1,
\end{cases}
\]
we can write
\[
P_{i_1\cdots i_n}(x) = (x+1)
\times \frac{1}{q_{n-1}(i_2,\dots,i_n)(x+i_1) + p_{n-1}(i_2,\dots,i_n)}
\times \frac{1}{q_n(i_2,\dots,i_n,1)(x+i_1) + p_n(i_2,\dots,i_n,1)} \qquad (2.5.5)
\]
for any $n \ge 2$, $i^{(n)} \in \mathbf{N}_+^n$, and $x \in I$.

A useful alternative representation of $U^n f$, $n \in \mathbf{N}_+$, when $f \in BV(I)$ is available.

Proposition 2.5.1. If $f \in BV(I)$ then for any $n \in \mathbf{N}_+$ and $x \in I$ we have
\[
U^n f(x) = \int_{[0,1)} U^n I_{(a,1]}(x)\,df(a) + f(0),
\]
with $\int_{[0,x)} df = f(x) - f(0)$, $x \in I$.

Proof. Since $f$ can be represented as the difference of two non-decreasing functions, we may and shall assume that $f$ is non-decreasing. Then for any $x \in I$ we have
\[
f(x) - f(0) = \int_{[0,1)} I_{(a,1]}(x)\,df(a).
\]
By (2.5.1), using the above equation and Fubini's theorem we obtain
\begin{align*}
U^n f(x) &= \sum_{i_1,\dots,i_n \in \mathbf{N}_+} P_{i_1\cdots i_n}(x)\, f(u_{i_n\cdots i_1}(x)) \\
&= \sum_{i_1,\dots,i_n \in \mathbf{N}_+} P_{i_1\cdots i_n}(x) \int_{[0,1)} I_{(a,1]}(u_{i_n\cdots i_1}(x))\,df(a) + f(0) \\
&= \int_{[0,1)} \sum_{i_1,\dots,i_n \in \mathbf{N}_+} P_{i_1\cdots i_n}(x)\, I_{(a,1]}(u_{i_n\cdots i_1}(x))\,df(a) + f(0) \\
&= \int_{[0,1)} U^n I_{(a,1]}(x)\,df(a) + f(0)
\end{align*}
for any $n \in \mathbf{N}_+$ and $x \in I$. $\Box$

Corollary 2.5.2. For any $n \in \mathbf{N}_+$ we have
\[
\sup_{f \in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f}
= \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}
= \sup_{f \in B(I),\, f\downarrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}
= \sup_{a \in [0,1)} \operatorname{var} U^n I_{(a,1]},
\]
where the first three upper bounds are taken over non-constant functions $f$, and $f\uparrow$ ($f\downarrow$) means that $f$ is non-decreasing (non-increasing).

Proof. It is clear that
\[
\sup_{f \in B(I),\, f\downarrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}
\quad \text{since} \quad
\frac{\operatorname{var} U^n(-f)}{\operatorname{var}(-f)} = \frac{\operatorname{var} U^n f}{\operatorname{var} f}.
\]
Next, let
\[
v_n = \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}, \quad n \in \mathbf{N}_+.
\]
Then (cf. the proof of Corollary 2.1.13) for any non-constant $f \in BV(I)$ there exist two non-decreasing functions $f_1$ and $f_2$ such that $f = f_1 - f_2$ and $\operatorname{var} f = \operatorname{var} f_1 + \operatorname{var} f_2$. Therefore
\[
\operatorname{var} U^n f \le \operatorname{var} U^n f_1 + \operatorname{var} U^n f_2 \le v_n(\operatorname{var} f_1 + \operatorname{var} f_2) = v_n \operatorname{var} f, \quad n \in \mathbf{N}_+.
\]
Hence
\[
\sup_{f \in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le v_n,
\]
and since
\[
\sup_{f \in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \ge \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = v_n,
\]
the first equation should hold.

To derive the last equation let $f \in B(I)$ be non-decreasing. Then $U^n f$ is a monotone function by Proposition 2.1.11, and Proposition 2.5.1 implies that
\[
U^n f(1) - U^n f(0) = \int_{[0,1)} \bigl(U^n I_{(a,1]}(1) - U^n I_{(a,1]}(0)\bigr)\,df(a)
\]
for any $n \in \mathbf{N}_+$. Noting that $I_{(a,1]}: I \to I$ is also a non-decreasing function for any $a \in [0,1)$, we obtain
\[
\operatorname{var} U^n f \le \Bigl(\sup_{a \in [0,1)} \operatorname{var} U^n I_{(a,1]}\Bigr) \operatorname{var} f.
\]
Hence, for any $a \in [0,1)$ and $n \in \mathbf{N}_+$,
\[
\operatorname{var} U^n I_{(a,1]} \le \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le \sup_{a \in [0,1)} \operatorname{var} U^n I_{(a,1]},
\]
and the proof is complete. $\Box$
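The closed form (2.5.5) for $P_{i_1\cdots i_n}(x)$ in terms of continuants can be checked against the defining product (2.5.2). The sketch below (our own helper names; the continuant recursion $p_k = i_k p_{k-1} + p_{k-2}$, $q_k = i_k q_{k-1} + q_{k-2}$ is standard) compares the two numerically:

```python
def u(i, x):
    return 1 / (x + i)

def P(i, x):
    return (x + 1) / ((x + i) * (x + i + 1))

def P_prod(digits, x):
    # product formula (2.5.2): P_{i1}(x) P_{i2}(u_{i1}(x)) ...
    val, y = 1.0, x
    for i in digits:
        val *= P(i, y)
        y = u(i, y)
    return val

def pq(digits):
    # last convergent p/q of [d1, ..., dm]; (p_{-1}, q_{-1}) = (1, 0), (p_0, q_0) = (0, 1)
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for d in digits:
        p_prev, q_prev, p, q = p, q, d * p + p_prev, d * q + q_prev
    return p, q

def P_closed(digits, x):
    # closed form (2.5.5)
    i1, rest = digits[0], list(digits[1:])
    p_a, q_a = pq(rest)          # p_{n-1}(i2,...,in), q_{n-1}(i2,...,in)
    p_b, q_b = pq(rest + [1])    # p_n(i2,...,in,1),  q_n(i2,...,in,1)
    return (x + 1) / ((q_a * (x + i1) + p_a) * (q_b * (x + i1) + p_b))

for digits in [(2, 3), (1, 1, 2), (4, 1, 5, 2)]:
    for x in (0.0, 0.3, 1.0):
        assert abs(P_prod(digits, x) - P_closed(digits, x)) < 1e-12
```

For instance, for $(i_1,i_2,i_3)=(1,1,2)$ and $x=0$ both expressions give $1/35$.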

2.5.2 An upper bound

On account of Corollary 2.5.2, our guess for the upper bound of $\operatorname{var} U^n f/\operatorname{var} f$ over non-constant $f \in BV(I)$ is given in the conjecture below.

UB Conjecture. For any $n \in \mathbf{N}_+$ we have
\[
v_n = \sup_{a \in [0,1)} \operatorname{var} U^n I_{(a,1]} = \operatorname{var} U^n I_{(g,1]},
\]
where $g = [1,1,1,\dots] = (\sqrt5-1)/2 = 0.6180339\cdots$.

Without any loss of generality, throughout this subsection we assume that $f \in BV(I)$ is non-decreasing. To simplify the writing put
\[
P_{i_1\cdots i_n}(0) = \alpha_{i_1\cdots i_n}, \qquad u_{i_1\cdots i_n}(0) = \beta_{i_1\cdots i_n}, \quad i_1,\dots,i_n \in \mathbf{N}_+.
\]

If $n$ is odd then by Proposition 2.1.11 and equations (2.5.1), (2.5.2), and (2.5.5) we have
\begin{align*}
\operatorname{var} U^n f &= U^n f(0) - U^n f(1) \qquad (2.5.6)\\
&= \sum_{i_1,\dots,i_n \in \mathbf{N}_+} \bigl[P_{i_1\cdots i_n}(0)\, f(u_{i_n\cdots i_1}(0)) - P_{i_1\cdots i_n}(1)\, f(u_{i_n\cdots i_1}(1))\bigr]\\
&= \sum_{i_1,\dots,i_n \in \mathbf{N}_+} \bigl[P_{i_1\cdots i_n}(0)\, f(u_{i_n\cdots i_1}(0)) - 2P_{(i_1+1) i_2\cdots i_n}(0)\, f(u_{i_n\cdots i_2 (i_1+1)}(0))\bigr]\\
&= \sum_{i_2,\dots,i_n \in \mathbf{N}_+} \Bigl(\alpha_{1 i_2\cdots i_n}\, f(\beta_{i_n\cdots i_2 1}) - \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2\cdots i_n}\, f(\beta_{i_n\cdots i_2 (i_1+1)})\Bigr).
\end{align*}
Similarly, if $n$ is even then we have
\[
\operatorname{var} U^n f = U^n f(1) - U^n f(0)
= \sum_{i_2,\dots,i_n \in \mathbf{N}_+} \Bigl(\sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2\cdots i_n}\, f(\beta_{i_n\cdots i_2 (i_1+1)}) - \alpha_{1 i_2\cdots i_n}\, f(\beta_{i_n\cdots i_2 1})\Bigr). \qquad (2.5.7)
\]
It is easy to see that if $n$ is odd then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for
\[
a \in \begin{cases}
\bigl[\tfrac{1}{j_1+1}, \tfrac{1}{j_1}\bigr) & \text{if } n=1,\\[4pt]
\bigl[\,[j_1,\dots,j_{n-1},j_n+1],\ [j_1,\dots,j_n]\,\bigr) & \text{if } n>1,
\end{cases}
\]
while if $n$ is even then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for
\[
a \in \bigl[\,[j_1,\dots,j_n],\ [j_1,\dots,j_{n-1},j_n+1]\,\bigr),
\]
that is, in both cases, on the closure without the right endpoint of any fundamental interval $I(j^{(n)})$, $j^{(n)} = (j_1,\dots,j_n) \in \mathbf{N}_+^n$. Write $1(n)$ for $(j_1,\dots,j_n)$ with $j_k = 1$, $1 \le k \le n$, $n \in \mathbf{N}_+$. Then in particular for
\[
a \in \bigl[\,[1(2m+2)],\ [1(2m+1)]\,\bigr), \quad m \in \mathbf{N},
\]
that is,
\[
a \in \Bigl[\frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m}}{F_{2m+1}}\Bigr), \quad m \in \mathbf{N}, \qquad (2.5.8)
\]

we have
\[
\tilde v_1 := \operatorname{var} U I_{(a,1]} = 1/2,
\]
\[
\tilde v_3 := \operatorname{var} U^3 I_{(a,1]} = \sum_{i_2 \in \mathbf{N}_+}\Bigl(\alpha_{1 i_2 1} - \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2 1}\Bigr) + \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1)11},
\]
and, for $m \ge 2$,
\begin{align*}
\tilde v_{2m+1} &:= \operatorname{var} U^{2m+1} I_{(a,1]}\\
&= \sum_{q=0}^{m-2}\ \sum_{i_2,\dots,i_{2m-2q} \in \mathbf{N}_+} \Bigl(\alpha_{1 i_2 i_3 \cdots i_{2m-2q-1} (i_{2m-2q}+1) 1\cdots 1} - \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_{2m-2q-1} (i_{2m-2q}+1) 1\cdots 1}\Bigr)\\
&\quad + \sum_{i_2 \in \mathbf{N}_+}\Bigl(\alpha_{1 i_2 1\cdots 1} - \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2 1\cdots 1}\Bigr) + \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) 1\cdots 1}.
\end{align*}
(In the last equation the number of subscripts of the $\alpha$'s is $2m+1$.)

Similarly, for $a \in \bigl[\,[1(2m+2)],\ [1(2m+3)]\,\bigr)$, $m \in \mathbf{N}$, that is,
\[
a \in \Bigl[\frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m+2}}{F_{2m+3}}\Bigr), \quad m \in \mathbf{N}, \qquad (2.5.9)
\]
we have
\[
\tilde v_2 := \operatorname{var} U^2 I_{(a,1]} = \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1)1}
\]
and, for $m \in \mathbf{N}_+$,
\begin{align*}
\tilde v_{2m+2} &:= \operatorname{var} U^{2m+2} I_{(a,1]}\\
&= \sum_{q=0}^{m-1}\ \sum_{i_2,\dots,i_{2m-2q+1} \in \mathbf{N}_+} \Bigl(\sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_{2m-2q} (i_{2m-2q+1}+1) 1\cdots 1} - \alpha_{1 i_2 i_3 \cdots i_{2m-2q} (i_{2m-2q+1}+1) 1\cdots 1}\Bigr)\\
&\quad + \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1) 1\cdots 1}.
\end{align*}
(In the last equation the number of subscripts of the $\alpha$'s is $2m+2$.)

Since $g$ belongs to all the intervals (2.5.8) and (2.5.9), the UB Conjecture amounts to $v_n = \tilde v_n$, $n \in \mathbf{N}_+$.

The case $n = 1$. This case was dealt with in Proposition 2.1.12. Actually, writing $i$ for $i_1$, equation (2.5.6) yields
\[
\operatorname{var} U f = \alpha_1 f(\beta_1) - \sum_{i \in \mathbf{N}_+} \alpha_{i+1} f(\beta_{i+1}).
\]
Hence
\[
\operatorname{var} U I_{(a,1]} = \frac{1}{i+1} \quad \text{for } a \in \Bigl[\frac{1}{i+1}, \frac{1}{i}\Bigr),\ i \in \mathbf{N}_+,
\]
and
\[
v_1 = \sup_{a \in [0,1)} \operatorname{var} U I_{(a,1]} = \frac12 = \operatorname{var} U I_{(g,1]} = \tilde v_1
\]
as $g \in [1/2, 1)$. Thus in this case the UB Conjecture holds.

The case $n = 2$. Write $i$ for $i_1$ and $j$ for $i_2$. Then we have
\[
\alpha_{ij} = \frac{1}{(ij+1)(i(j+1)+1)}, \quad i,j \in \mathbf{N}_+,
\]
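The conclusion for $n = 1$ and the formula for $\alpha_{ij}$ lend themselves to direct computation. The following sketch (added by us, not part of the text) evaluates $U I_{(g,1]}$ at the endpoints and compares $\alpha_{ij}$ with the product (2.5.2):

```python
def u(i, x):
    return 1 / (x + i)

def P(i, x):
    return (x + 1) / ((x + i) * (x + i + 1))

g = (5 ** 0.5 - 1) / 2

def U_indicator(a, x, terms=1000):
    # (U I_{(a,1]})(x) = sum_j P_j(x) * 1{ u_j(x) > a }
    return sum(P(j, x) for j in range(1, terms) if u(j, x) > a)

# For a = g only j = 1 contributes at x = 0 (u_1(0) = 1 > g), and no digit
# contributes at x = 1 (u_j(1) <= 1/2 < g): var U I_{(g,1]} = P_1(0) = 1/2.
assert abs(U_indicator(g, 0.0) - 0.5) < 1e-12
assert U_indicator(g, 1.0) == 0.0

# alpha_{ij} = P_{ij}(0): the closed form agrees with the product (2.5.2)
for i in range(1, 8):
    for j in range(1, 8):
        prod = P(i, 0.0) * P(j, u(i, 0.0))
        closed = 1 / ((i * j + 1) * (i * (j + 1) + 1))
        assert abs(prod - closed) < 1e-15
```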

and equation (2.5.7) yields
\begin{align*}
\operatorname{var} U^2 f &= \sum_{j \in \mathbf{N}_+}\Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)j}\, f(\beta_{j(i+1)}) - \alpha_{1j}\, f(\beta_{j1})\Bigr) \qquad (2.5.10)\\
&= \sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1}\, f(\beta_{1(i+1)}) + \sum_{j \in \mathbf{N}_+}\Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)(j+1)}\, f(\beta_{(j+1)(i+1)}) - \alpha_{1j}\, f(\beta_{j1})\Bigr).
\end{align*}
Clearly, $\beta_{(j+1)(i+1)} < \beta_{j1}$ for any $i,j \in \mathbf{N}_+$. Hence
\[
\operatorname{var} U^2 f \le f(1) \sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1} + \sum_{j \in \mathbf{N}_+} f(\beta_{j1})\Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)(j+1)} - \alpha_{1j}\Bigr).
\]
But
\[
\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)(j+1)} = \sum_{i \in \mathbf{N}_+} \frac{1}{\bigl((i+1)(j+1)+1\bigr)\bigl((i+1)(j+2)+1\bigr)}
\le \frac{1}{(j+1)(j+2)} \sum_{i \in \mathbf{N}_+} \frac{1}{(i+1)^2} = (\zeta(2)-1)\,\alpha_{1j} < \alpha_{1j} \qquad (2.5.11)
\]
for any $j \in \mathbf{N}_+$. Since $f(\beta_{j1}) \ge f(0)$, $j \in \mathbf{N}_+$, and
\[
\sum_{j \in \mathbf{N}_+}\Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)(j+1)} - \alpha_{1j}\Bigr) = -\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1},
\]
(2.5.10) and (2.5.11) imply that
\[
\operatorname{var} U^2 f \le \Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1}\Bigr)\bigl(f(1) - f(0)\bigr) = \Bigl(\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1}\Bigr)\operatorname{var} f \qquad (2.5.12)
\]
for any non-decreasing $f \in B(I)$. Now, note that for $f = I_{(a,1]}$ with $a \in [1/2, 2/3)$, in particular for $a = g$, we have
\[
\operatorname{var} U^2 I_{(a,1]} = \sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1},
\]
that is, the constant
\begin{align*}
\sum_{i \in \mathbf{N}_+} \alpha_{(i+1)1} &= \sum_{i \in \mathbf{N}_+} \frac{1}{(i+2)(2i+3)} = 2\sum_{i \in \mathbf{N}_+}\Bigl(\frac{1}{2i+3} - \frac{1}{2i+4}\Bigr)\\
&= 2\Bigl(\log 2 - 1 + \frac12 - \frac13 + \frac14\Bigr) = \log 4 - \frac76 = 0.21962\cdots
\end{align*}
occurring in (2.5.12) cannot be lowered. Therefore for $n = 2$ we have
\[
v_2 = \log 4 - \frac76 = 0.21962\cdots,
\]
and the UB Conjecture holds in this case.

The case $n \ge 3$. We could try to treat this case similarly to the case $n = 2$. Using (2.5.5) it is not difficult to generalize inequality (2.5.11) to
\[
\sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1)(i_2+1) i_3\cdots i_n} \le (\zeta(2)-1)\,\alpha_{1 i_2\cdots i_n} < \alpha_{1 i_2\cdots i_n} \qquad (2.5.13)
\]
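Both the value of the constant in (2.5.12) and inequality (2.5.11) are directly computable. As an illustrative sketch (ours, not the authors'):

```python
import math

# v2 = sum_{i>=1} alpha_{(i+1)1} = sum_{i>=1} 1/((i+2)(2i+3)) = log 4 - 7/6
s = sum(1.0 / ((i + 2) * (2 * i + 3)) for i in range(1, 10 ** 6))
assert abs(s - (math.log(4) - 7.0 / 6.0)) < 1e-5  # 0.21962...

# inequality (2.5.11): sum_i alpha_{(i+1)(j+1)} <= (zeta(2) - 1) alpha_{1j}
def alpha(i, j):
    # alpha_{ij} = 1/((ij+1)(i(j+1)+1))
    return 1.0 / ((i * j + 1) * (i * (j + 1) + 1))

zeta2_minus_1 = math.pi ** 2 / 6 - 1   # zeta(2) = pi^2/6
for j in range(1, 20):
    lhs = sum(alpha(i + 1, j + 1) for i in range(1, 20000))
    assert lhs <= zeta2_minus_1 * alpha(1, j)
```

The truncated left-hand side underestimates the full series, so the assertion is conservative.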

for any $n \ge 3$ and $i_2,\dots,i_n \in \mathbf{N}_+$. Next, to make a choice let us assume that $n$ is odd. Then it is easy to see that
\[
\beta_{i_n\cdots i_3 (i_2+1)(i_1+1)} > \beta_{i_n\cdots i_3 i_2 1}, \qquad
\beta_{i_n\cdots i_3 1 (i_1+1)} > \beta_{i_n\cdots i_3 1}, \qquad
\beta_{i_n\cdots i_3 i_2 1} < \beta_{i_n\cdots i_3}
\]
for any $i_1,\dots,i_n \in \mathbf{N}_+$. Then by (2.5.6) and (2.5.13) we have
\begin{align*}
\operatorname{var} U^n f &\le \sum_{i_3,\dots,i_n \in \mathbf{N}_+}\Bigl[\sum_{i_2 \in \mathbf{N}_+}\Bigl(\alpha_{1 i_2 i_3\cdots i_n} - \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1)(i_2+1) i_3\cdots i_n}\Bigr) f(\beta_{i_n\cdots i_3 i_2 1})
- \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) 1 i_3\cdots i_n}\, f(\beta_{i_n\cdots i_3 1 (i_1+1)})\Bigr]\\
&\le \sum_{i_3,\dots,i_n \in \mathbf{N}_+}\Bigl[\sum_{i_2 \in \mathbf{N}_+}\Bigl(\alpha_{1 i_2 i_3\cdots i_n} - \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) i_2 i_3\cdots i_n}\Bigr) f(\beta_{i_n\cdots i_3})
+ \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) 1 i_3\cdots i_n}\bigl(f(\beta_{i_n\cdots i_3}) - f(\beta_{i_n\cdots i_3 1})\bigr)\Bigr]. \qquad (2.5.14)
\end{align*}
For an even $n$ the corresponding inequality is
\[
\operatorname{var} U^n f \le \sum_{i_3,\dots,i_n \in \mathbf{N}_+}\Bigl[\sum_{i_2 \in \mathbf{N}_+}\Bigl(\sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) i_2 i_3\cdots i_n} - \alpha_{1 i_2 i_3\cdots i_n}\Bigr) f(\beta_{i_n\cdots i_3})
+ \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) 1 i_3\cdots i_n}\bigl(f(\beta_{i_n\cdots i_3 1}) - f(\beta_{i_n\cdots i_3})\bigr)\Bigr]. \qquad (2.5.15)
\]

Put
\[
\delta_{i_3\cdots i_n} = (-1)^{n-1} \sum_{i_2 \in \mathbf{N}_+}\Bigl(\alpha_{1 i_2 i_3\cdots i_n} - \sum_{i_1 \in \mathbf{N}_+}\alpha_{(i_1+1) i_2 i_3\cdots i_n}\Bigr)
\]
for any $i_3,\dots,i_n \in \mathbf{N}_+$. Note that
\[
\sum_{i_3,\dots,i_n \in \mathbf{N}_+} \delta_{i_3\cdots i_n} = (-1)^{n-1}\Bigl(\alpha_1 - \sum_{i_1 \in \mathbf{N}_+}\alpha_{i_1+1}\Bigr) = 0. \qquad (2.5.16)
\]
Using (2.5.5), which implies
\[
P_{i_1\cdots i_n}(0) = (-1)^n \Biggl(\cfrac{1}{\,i_1 + \dfrac{p_n(i_2,\dots,i_n,1)}{q_n(i_2,\dots,i_n,1)}\,} - \cfrac{1}{\,i_1 + \dfrac{p_{n-1}(i_2,\dots,i_n)}{q_{n-1}(i_2,\dots,i_n)}\,}\Biggr)
\]
for any $n \ge 2$ and $i_1,\dots,i_n \in \mathbf{N}_+$, it is easy to see that $\delta_{i_3\cdots i_n}$ can be expressed in terms of the digamma function $\psi$ as
\[
\delta_{i_3\cdots i_n} = \psi\Bigl(2 + \frac{p_{n-2}+p_{n-3}}{q_{n-2}+q_{n-3}}\Bigr) - \psi\Bigl(2 + \frac{p_{n-2}}{q_{n-2}}\Bigr)
+ \sum_{i \in \mathbf{N}_+}\Biggl(\psi\Bigl(2 + \cfrac{1}{\,i + \dfrac{p_{n-2}}{q_{n-2}}\,}\Bigr) - \psi\Bigl(2 + \cfrac{1}{\,i + \dfrac{p_{n-2}+p_{n-3}}{q_{n-2}+q_{n-3}}\,}\Bigr)\Biggr),
\]
where $p_m = p_m(i_3,\dots,i_{m+2})$, $q_m = q_m(i_3,\dots,i_{m+2})$, $m \in \mathbf{N}_+$, and $p_0 = 0$, $q_0 = 1$.

Let us recall that the digamma function can be expressed by the convergent series
\[
\psi(z) = -\mathbf{C} + \sum_{j \in \mathbf{N}_+}\Bigl(\frac{1}{j} - \frac{1}{j+z-1}\Bigr) = -\mathbf{C} + \sum_{j \in \mathbf{N}_+} \frac{z-1}{j(j+z-1)}
\]
for $z \ne 0, -1, -2, \dots$, where $\mathbf{C} = 0.57721\cdots$ is the Euler constant. As is well known, $\psi$ satisfies the equation
\[
\psi(z+1) = \psi(z) + \frac{1}{z}
\]
for $z \ne 0, -1, -2, \dots$. Tables for $\psi$ can be found in Abramowitz and Stegun (1964). Putting
\[
\delta^{(n)}(f) = \sum_{i_3,\dots,i_n \in \mathbf{N}_+} \delta_{i_3\cdots i_n}\, f(\beta_{i_n\cdots i_3}),
\]
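The series representation of $\psi$ and its recurrence can be checked directly, since the difference of two truncated series telescopes. An illustrative sketch (our own code, not part of the text):

```python
C = 0.5772156649015329  # Euler's constant

def psi(z, terms=200000):
    # psi(z) = -C + sum_{j>=1} (1/j - 1/(j + z - 1)), for z != 0, -1, -2, ...
    return -C + sum(1.0 / j - 1.0 / (j + z - 1) for j in range(1, terms))

# special values psi(1) = -C and psi(2) = 1 - C
assert abs(psi(1) - (-C)) < 1e-4
assert abs(psi(2) - (1 - C)) < 1e-4

# recurrence psi(z + 1) = psi(z) + 1/z
for z in (0.5, 1.7, 3.2):
    assert abs(psi(z + 1) - psi(z) - 1.0 / z) < 1e-4
```

The truncation error of the series is of order $(z-1)/N$, hence the tolerances.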

inequalities (2.5.14) and (2.5.15) imply that
\[
\operatorname{var} U^n f \le \delta^{(n)}(f) + \Bigl(\sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1)1\cdots 1}\Bigr)\operatorname{var} f \qquad (2.5.17)
\]
for any $n \ge 3$. Here we used the fact that $\alpha_{(i_1+1)1 i_3\cdots i_n} < \alpha_{(i_1+1)11\cdots 1}$ for any $n \ge 3$ and $(i_3,\dots,i_n) \ne 1(n-2)$, which follows at once from (2.5.5).

First, note that by (2.5.16) we have
\[
\delta^{(n)}(f) \le \frac12 \Bigl(\sum_{i_3,\dots,i_n \in \mathbf{N}_+} |\delta_{i_3\cdots i_n}|\Bigr)\bigl(f(1) - f(0)\bigr). \qquad (2.5.18)
\]
Since
\[
\frac12 \sum_{i_3,\dots,i_n \in \mathbf{N}_+} |\delta_{i_3\cdots i_n}| = \sup_{A} \sum_{(i_3,\dots,i_n) \in A} \delta_{i_3\cdots i_n},
\]
where the supremum is taken over all $A \subset \mathbf{N}_+^{n-2}$, it follows that
\[
\frac12 \sum_{i_3,\dots,i_n \in \mathbf{N}_+} |\delta_{i_3\cdots i_n}| \ge \frac12 \sum_{i \in \mathbf{N}_+} |\delta_i|.
\]
Hence the right hand side of (2.5.17) does not tend to 0 as $n \to \infty$, and (2.5.18) is useless for $n \ge 3$. As a matter of fact, it is a general result which does not take into account that $f$ is non-decreasing.

If for some given $n \ge 3$ the inequality
\[
\delta^{(n)}(I_{(a,1]}) \le \delta^{(n)}(I_{(g,1]}) \qquad (2.5.19)
\]
holds for any $a \in [0,1)$, then by (2.5.17) we have
\[
\operatorname{var} U^n I_{(a,1]} \le \delta^{(n)}(I_{(g,1]}) + \sum_{i_1 \in \mathbf{N}_+} \alpha_{(i_1+1)1\cdots 1} \qquad (2.5.20)
\]
for any $a \in [0,1)$. It is easy to see that the right hand side of (2.5.20) is equal to $\tilde v_n$. Since whatever $n \in \mathbf{N}_+$ we have $\operatorname{var} U^n I_{(g,1]} = \tilde v_n$, it follows from (2.5.20) that $v_n = \tilde v_n$. Thus if (2.5.19) holds then for the given $n$ the UB Conjecture holds, too.

In particular for $n = 3$, writing $i, j, k$ for $i_1, i_2, i_3$, respectively, we have
\[
\alpha_{ijk} = \frac{1}{\bigl(i(jk+1)+k\bigr)\bigl(i(j(k+1)+1)+k+1\bigr)}, \quad i,j,k \in \mathbf{N}_+.
\]
It has been proved in Iosifescu (1994) that
\[
\delta_k = \sum_{j \in \mathbf{N}_+}\Bigl(\alpha_{1jk} - \sum_{i \in \mathbf{N}_+}\alpha_{(i+1)jk}\Bigr)
\]
is positive for $k = 1$ and negative for $k > 1$. Then (2.5.19) clearly holds in this case. Hence the UB Conjecture holds for $n = 3$ and
\[
v_3 = \delta_1 + \sum_{i \in \mathbf{N}_+} \alpha_{(i+1)11},
\]
which can be evaluated in closed form in terms of the digamma function $\psi$ [see Iosifescu (1994)]. We have [see Iosifescu (1994, p. 115)]
\[
0.09104 < v_3 < 0.09759,
\]
while a computation using MATHEMATICA yields
\[
0.09436 < v_3 < 0.09445.
\]

Returning to the general case, a good upper bound for $v_n$, $n \in \mathbf{N}_+$, is available. For a lower bound see further Corollary 2.5.6.

Theorem 2.5.3. We have
\[
v_n \le \frac{k_0}{F_n F_{n+1}} \qquad (2.5.21)
\]
for any $n \in \mathbf{N}_+$. Here and throughout the remainder of this section, $k_0$ is a constant not exceeding 14.8.

Proof. Clearly, (2.5.21) holds for $n = 1, 2, 3$ as was shown before. By Corollary 2.5.2 and on account of the constancy of the function $a \mapsto \operatorname{var} U^n I_{(a,1]}$ on any fundamental interval of order $n$, we have
\[
v_n = \sup_{a \in \Omega} \operatorname{var} U^n I_{(a,1]}, \quad n \in \mathbf{N}_+.
\]

If to make a choice we assume that n ∈ N+ is odd, then by Proposition 2.1.11 and equation (2.5.3) for any a ∈ I we have var U n I(a,1] = U n I(a,1] (0) − U n I(a,1] (1) =

i(n) ∈Nn +

γ0 (I(i(n) ) − γ1 (I(i(n) )) I(a,1] (uin ···i1 (1))

+

i(n) ∈Nn +

γ0 (I(i(n) ))(I(a,1] (uin ···i1 (0)) − I(a,1] (uin ···i1 (1)). (2.5.22)

Note that if a ∈ Ω then just one of the diﬀerences I(a,1] (uin ···i1 (0)) − I(a,1] (uin ···i1 (1)), i(n) ∈ Nn , +

is = 0 (and equal to 1). Also, for an arbitrarily given a = [j1 , j2 , · · · ] ∈ Ω the set i(n) ∈ Nn : uin ···i1 (1) > a + consists of the i(n) = (i1 , . . . , in ) ∈ Nn satisfying + (i1 < j1 ), if n = 1; if n = 3;

(i3 < j1 ) ∪ (i3 = j1 , i2 > j2 ) ∪ (i3 = j1 , i2 = j2 , i1 < j3 ),

(in < j1 ) ∪ (in = j1 , in−1 > j2 ) ∪ (in = j1 , in−1 = j2 , in−2 < j3 ) ∪ · · · ∪(in = j1 , · · · , i3 = jn−2 , i2 > jn−1 ) ∪ (in = j1 , · · · , i2 = jn−1 , i1 < jn ), if n ≥ 5. Therefore, putting µ = γ0 − γ1 , it follows from (2.5.22) that for

Solving Gauss’ problem a = [j1 , j2 , . . . ] ∈ Ω and any odd n ≥ 5 we have var U n I(a,1] ≤ |µ(an < j1 )| + |µ(an = j1 , an−1 > j2 )| + |µ(an = j1 , an−1 = j2 , an−2 < j3 )| + · · · + |µ(an = j1 , · · · , a3 = jn−2 , a2 > jn−1 | + |µ(an = j1 , · · · , a2 = jn−1 , a1 < jn | + maxi(n) ∈Nn γ0 (I(i(n) )).

+

149

(2.5.23)

**We shall use the inequalities |γ0 (A) − γ1 (A)| ≤ (log 2)γ(A), (2.5.24) |γa (τ −n (A)) − γ(A)| ≤ (ζ (2) log 2 −
**

n−1 1)λ0 γ(A),

which are valid for any a ∈ I, A ∈ BI , and n ∈ N+ , with λ0 = 0.303663 · · · (Wirsing’s constant). The ﬁrst inequality follows by integrating over A the double inequality 1 2 1 − ≤1− , x ∈ I, ≤ 2 x+1 (x + 1) x+1 while the second one is the inequality in Theorem 2.3.5 for = 2. Note that (an < j1 ) = τ −n+1 (a1 < j1 ), (an = j1 , an−1 > j2 ) = τ −n+2 (a2 = j1 , a1 > j2 ), (an = j1 , an−1 = j2 , an−2 < j3 ) = τ −n+3 (a3 = j1 , a2 = j2 , a1 < j3 ), ···························································· (an = j1 , · · · , a3 = jn−2 , a2 > jn−1 ) = τ −1 (an−1 = j1 , · · · , a2 = jn−2 , a1 > jn−1 ) and (a2 = j1 , a1 > j2 ) ⊂ (a2 = j1 ) (a3 = j1 , a2 = j2 , a1 < j3 ) ⊂ (a3 = j2 , a2 = j2 ) ···························································· (an−1 = j1 , · · · , a2 = jn−2 , a1 > jn−1 ) ⊂ (an−1 = j1 , · · · , a2 = jn−2 ) (an = j1 , · · · , a2 = jn−1 , a1 < jn ) ⊂ (an = j1 , · · · , a2 = jn−1 ).

**150 Next, by Theorem 1.2.2 we have
**

i(n) ∈Nn +

Chapter 2

max γ0 (I(i(n) )) =

1 (:= σ(n)), Fn Fn+1 2 σ(n), 3 log 2

n ∈ N+ ,

and then

i(n) ∈Nn +

max γ(I(i(n) )) ≤

n ∈ N+ .

(2.5.26)

Now, by (2.5.24) through (2.5.26), with k1 = log 2 = 0.69315 · · · , k2 √ √ 2 = ζ(2) log 2 − 1 = 0.14018, · · · , θ = g2 = 5 − 1 /4 = 3 − 5 /2 = 0.38196 · · · , it follows from (2.5.23) that var U n I(a,1] ≤ = 4k2 3 log 2 λn−2 σ (0) + λn−3 σ (1) + · · · + σ (n − 2) + 0 0 k1 σ (n − 1) + σ (n) 2k2

4k2 σ (0) n−3 σ (1) σ (n − 1) λn−2 + λ0 + ··· 0 3 log 2 σ (n − 1) σ (n − 1) + σ(n − 2) k1 + σ(n − 1) 2k2 + σ(n).

Since

σ (k) 1 ≤ θk−n−1 , σ (n) 2 8 σ (n − 1) ≤ , σ (n) 3

k, n ∈ N,

and

n ≥ 3,

we ﬁnally obtain var U n I(a,1] ≤ We have 1+ 16k1 16k2 16 + =1+ 9 log 2 9θ (θ − λ0 ) log 2 9 log 2 32 9 log 2 k1 + k2 θ (θ − λ0 ) 1+ 16k1 16k2 + 9 log 2 9θ (θ − λ0 ) log 2 σ (n) .

= 1+

log 2 ζ (2) log 2 − 1 √ √ + 2 7 − 3 5 − 3 − 5 λ0

= 14.780 · · · < 14.8

Solving Gauss’ problem

151

and the proof is complete for any odd n. The case of an even n can be treated similarly. 2 Corollary 2.5.4 Let f ∈ BV (I). For any n ∈ N we have || U n f − U ∞ f || ≤ k0 var f . Fn Fn+1

Proof. By (2.1.12) and Proposition 2.0.1 (i) we have || U n f − U ∞ f || ≤ var U n f, n ∈ N,

and the result stated is implied by Theorem 2.5.3 for n ∈ N+ . The case n = 0 can be checked directly. 2 Remark. It was claimed in Iosifescu (1997, p.76) that Theorem 2.5.3 holds with k0 = 1/ log 2 for all n ∈ N large enough. (This is clearly true for n = 1, 2, or 3.) A ﬂaw detected by Adriana Berechet in the method of proof in that paper invalidates the conclusion. We conjecture, however, that both Theorem 2.5.3 and Corollary 2.5.4 hold with k0 = 1/ log 2 for any n ∈ N. 2

2.5.3

Two asymptotic distributions

We are now able to derive the asymptotic behaviour of γa (sa ≤ x) as n → ∞ n for any a, x ∈ I. Theorem 2.5.5 For any a ∈ I and n ∈ N we have k0 a+1 ≤ sup |γa (sa ≤ x) − G(x)| ≤ . n 2(Fn + aFn−1 )(Fn+1 + aFn ) Fn Fn+1 x∈I Proof. (i) The upper bound. We have already used in Subsection 2.5.1 the property of U of being the transition operator of the Markov chain (sa )n∈N for any a ∈ I. Therefore in particular n U n I[0,x] (a) = Ea I[0,x] (sa ) = γa (sa ≤ x) n n for any a, x ∈ I and n ∈ N. As U ∞ I[0,x] = I[0,x] dγ = γ([0, x]) = G(x), x ∈ I,

I

152

Chapter 2

Corollary 2.5.4 yields the upper bound announced . (ii) The lower bound. We start with two simple remarks. First, using the continuity of G and the equations limh↓0 γa (sa ≤ x − h) = γa (sa < x) n n and limh↓0 γa (sa < x + h) = γa (sa ≤ x), x ∈ I, it is easy to see that n n sup |γa (sa ≤ x) − G(x)| = sup |γa (sa < x) − G(x)| n n

x∈I x∈I

**for any a ∈ I and n ∈ N. Second, for any s ∈ I we have γa (sa = s) = γa (sa ≤ s) − G(s) − (γa (sa < s) − G(s)) n n n ≤ sup |γa (sa ≤ x) − G(x)| + sup |γa (sa < x) − G(x)| n n
**

x∈I x∈I

= 2 sup |γa (sa ≤ x) − G(x)| . n

x∈I

Hence sup |γa (sa ≤ x) − G(x)| ≥ n

x∈I

1 sup γa (sa = s) n 2 s∈I

(2.5.27)

for any a ∈ I and n ∈ N. Next, recall (see Subsection 2.5.1) that γa (sa = [in , . . . , i2 , i1 + a]) = γa (I(i(n) )) = Pi1 ···in (a), n γa sa = 1 1 i1 + a = γa (I(i1 )) = Pi1 (a) n ≥ 2,

**for any a ∈ I and (i1 , · · · , in ) = i(n) ∈ Nn . By (2.5.5) and (2.5.27) we then + have sup γa (sa = s) = P1(n) (a), a ∈ I, n ∈ N+ , (2.5.28) n
**

s∈I

where we write 1(n) for (i1 , · · · , in ) with i1 = · · · = in = 1, n ∈ N+ . With the convention F−1 = 0, by equation (2.5.5) again, P1(n) (a) = = a+1 ((a + 1)Fn−1 + Fn−2 )((a + 1)Fn + Fn−1 ) a+1 , (Fn + aFn−1 )(Fn+1 + aFn ) a ∈ I, n ∈ N+ .

(2.5.29)

The lower bound announced now follows from (2.5.27) through (2.5.29). The case n = 0 can be checked directly. 2

Solving Gauss’ problem

153

Remarks. 1. It is easy to see that P1(n) (·) is a decreasing function. Hence P1(n) (a) ≥ P1(n) (1) = 2 Fn+1 Fn+2 (2.5.30)

for any a ∈ I and n ∈ N+ . 2. Both√ lower and upper bounds in Theorem 2.5.5 are O(g2n ) as n → ∞ √ with g = ( 5 − 1)/2, g2 = (3 − 5)/2 = 0.38196 · · · . Thus the optimal convergence rate has been obtained. 2 Corollary 2.5.6 For any n ∈ N+ we have vn ≥ 2 . Fn+1 Fn+2

Proof. As noted in the proof of Theorem 2.5.5, we have γa (sa ≤ x) = U n I[0,x] (a), G(x) = U ∞ I[0,x] n for any a, x ∈ I and n ∈ N. Then Theorem 2.5.5, inequality (2.5.30), and the argument used in the proof of Corollary 2.5.4 yield 2 ≤ sup || U n I[0,x] − U ∞ I[0,x] || ≤ sup var U n I[0,x] Fn+1 Fn+2 x∈I x∈I for any n ∈ N. By Corollary 2.5.2 the proof is complete. Remark. Theorem 2.5.3 and Corollary 2.5.6 show that vn = n → ∞, and this convergence rate is optimal. O(g2n ) (2.5.31) 2 as 2

**Corollary 2.5.7 The spectral radius of the operator U − U ∞ in BV (I) √ equals g2 = (3 − 5)/2 = 0.38196 · · · . Proof. We should show that lim || U −
**

n

n→∞

U ∞ ||1/n v

= lim

n→∞

|| U n f − U ∞ f ||v sup || f ||v 0=f ∈BV (I)

1/n

= g2 .

The argument used in the proof of Corollary 2.5.4, and Theorem 2.5.3 yield || U n f − U ∞ f ||v = || U n f − U ∞ f || + var U n f ≤ 2 var U n f ≤ 4k0 4k0 var f ≤ || f ||v Fn Fn+1 Fn Fn+1

154

Chapter 2

for any n ∈ N and f ∈ BV (I). (We took into account that, as mentioned at the beginning of this section, here f is complex-valued. See the proof of Proposition 2.1.16.) Hence lim || U n − U ∞ ||1/n ≤ g2 . v

n→∞

The converse inequality follows by taking f = I[0,x] , x ∈ I, and using (2.5.31). 2 Theorem 2.5.5 allows a quick derivation of the asymptotic behaviour of γa (τ n ≤ x, sa ≤ y) n as n → ∞ for any a, x, y ∈ I, and of the (optimal) convergence rate, the same as above. Theorem 2.5.8 For any a ∈ I and n ∈ N we have a+1 2(Fn + aFn−1 )(Fn+1 + aFn ) log(xy + 1) log 2

≤ ≤

x,y∈I

sup γa (τ n ≤ x, sa ≤ y) − n

k0 . Fn Fn+1

a Proof. Set Ga (y) = γa (sa ≤ y), Hn (y) = Ga (y) − G(y), a, y ∈ I, n ∈ N. n n n Theorem 2.5.5 yields

a |Hn (y) | ≤

k0 , Fn Fn+1

a, y ∈ I, n ∈ N.

(2.5.32)

By the generalized Brod´n–Borel–L´vy formula (1.3.21), for any a, x, y ∈ I e e

**Solving Gauss’ problem and n ∈ N we have γa (τ n ≤ x, sa ≤ y) = n =
**

0 y 0 y

155

**γa (τ n ≤ x|sa = z) dGa (z) n n (z + 1)x a dGn (z) zx + 1
**

y 0

= =

1 log 2

(z + 1)x dz + zx + 1 z + 1

y 0

(z + 1)x a dHn (z) zx + 1

z=y z=0

log(xy + 1) (z + 1)x a + H (z) log 2 zx + 1 n

y

−

0

x − x2 H a (z)dz. (zx + 1)2 n

[When applying formula (1.3.21) we used the fact that the σ-algebras generated by (a1 , · · · , an ) and by sa are identical for any a ∈ I and n ∈ N+ .] n Hence, by (2.5.32), γa (τ n ≤ x, sa ≤ y) − n ≤ k0 Fn Fn+1 log(xy + 1) log 2 ≤ k0 Fn Fn+1

(y + 1)x (x − x2 )y + xy + 1 xy + 1

for any a, x, y ∈ I and n ∈ N, so that the upper bound holds. To get the lower bound we note that by Theorem 2.5.5 for any a ∈ I and n ∈ N we have

x,y∈I

sup γa (τ n ≤ x, sa ≤ y) − n

log(xy + 1) log 2 log(y + 1) log 2 a+1 . 2(Fn + aFn−1 )(Fn+1 + aFn ) 2

≥ sup γa (τ n ≤ 1, sa ≤ y) − n

y∈I

= sup |γa (sa ≤ y) − G(y)| ≥ n

y∈I

Remarks. 1. We can replace γa (τ n ≤ x, sa ≤ y) by λ(τ n ≤ x, sa ≤ y) in n n the statement of Theorem 2.5.8 since it is possible to relate these quantities by noticing that sa − s0 ≤ 1/F2 , n ∈ N, a ∈ I. The new upper and lower n n n bounds are of order O(g2n ) as n → ∞, too.

156

Chapter 2

2. As noted at the end of Subsection 1.3.3, log(xy + 1)/ log 2, x, y ∈ I, is the joint distribution function under γ of the extended random variables ¯ τ n and sn . ¯ ¯ 2

2.5.4

A generalization of a result of A. Denjoy

Sixty ﬁve years ago, A. Denjoy published a Comptes Rendus Note [see Denjoy (1936 b)] in which he sketched a proof of the fact that (in our notation) lim λ([a1 , · · · , an ] ≤ x, s0 ≤ y) = n x log(y + 1) log 2 (2.5.33)

n→∞

uniformly with respect to x, y ∈ I. Of course, for x = 1 this follows at once from Theorem 2.5.5. In this subsection we prove that (2.5.33) holds with λ replaced by any probability measure µ on BI absolutely continuous with respect to λ, in particular with λ replaced by any γa , a ∈ (0, 1]. An estimate of the convergence rate is also given . These will follow from Theorem 2.5.9 below. Since |[a1 , · · · , an ] − τ 0 | ≤ (Fn Fn+1 )−1 , n ∈ N+ , it is easy to see that for any probability measure µ on BI absolutely continuous with respect to λ, we have µ([a1 , · · · , an ] ≤ x, s0 ≤ y) − µ(τ 0 ≤ x, s0 ≤ y) n n ≤ max(µ(x − (Fn Fn+1 )−1 < τ 0 ≤ x), µ(x < τ 0 ≤ x + (Fn Fn+1 )−1 )) → 0 uniformly with respect to x, y ∈ I as n → ∞. This allows us to replace [a1 , · · · , an ] by τ 0 in (2.5.33) and its generalizations. Fix a ∈ I arbitrarily. Let f be a λ-integrable complex-valued function on I. Since γa is equivalent to λ for any a ∈ I, f is γa -integrable, too. Denote by Ek , k ∈ N, the set consisting of the endpoints of all fundamental intervals of rank , 0 ≤ ≤ k. For any n ∈ N we associate with f a a function fn which has a constant value on each fundamental interval of rank a n. Speciﬁcally, f0 = I f dγa and

a fn (x) =

1 γa (I(i(n) ))

I(i(n) )

f dγa ,

x ∈ I(i(n) ), i(n) ∈ Nn , +

for n ∈ N+ . Clearly,

a fn dγa =

I

I

f dγa ,

n ∈ N.

(2.5.34)

Solving Gauss’ problem

157

**Since for any n ∈ N+ and x ∈ I \ En there is a unique i(n) ∈ Nn such that + x ∈ I(i(n) ) and since max γa (I(i(n) )) → 0
**

i(n) ∈Nn +

as n → ∞, by a well known property of the Lebesgue integral we have lim f a (x) n→∞ n = f (x) (2.5.35)

**a.e. in I. It follows from (2.5.34) and (2.5.35) that lim
**

a |f − fn |dγa = 0.

n→∞ I

(2.5.36)

**By (2.5.36) the right hand side of (2.5.37) below converges to 0 as n → ∞.
**

a Remark. It is easy to check that (fn )n∈N is a martingale on (I, BI , γa ) whatever a ∈ I. 2

Theorem 2.5.9 Let f be a λ-integrable complex valued function on I and let h ∈ BV (I) be real-valued. Then f (h ◦ sa ) dγa − n f dγa hdγ

I

I

I

(2.5.37)

≤ inf

0≤k≤n

k0 a var h |f |dγa || h || |f − fk | dγa + Fn−k Fn−k+1 I I

**for any a ∈ I and n ∈ N. Proof. For any a ∈ I and k, n ∈ N+ , k ≤ n, we have f (h ◦ sa )dγa n (2.5.38) =
**

i(k) ∈Nk + I(i(k) )

I

(f −

a fk )(h

◦

sa )dγa n

+

I(i(k) )

a fk (h

◦

sa )dγa n

.

Clearly,

a (f − fk )(h ◦ sa )dγa ≤ || h || n a |f − fk |dγa .

(2.5.39)

i(k) ∈Nk +

I(i(k) )

I

**158 Next, for any ﬁxed i(k) ∈ Nk we can write +
**

a fk (h ◦ sa )dγa = n

Chapter 2

I(i(k) )

1 γa (I(i(k) )

I(i(k) )

f dγa

I(i(k) )

(h ◦ sa )dγa . (2.5.40) n

It is easy to check that γa (I(i(k) )) = where a+1 , (qk + apk )(qk + qk−1 + a(pk + pk−1 )) g.c.d. (pk , qk ) = 1, k ∈ N+ ,

pk = [i1 , . . . , ik ], qk

**and p0 = 0, q0 = 1. With the change of variable u= noting that sa (u) = sa (t) n n−k for t ∈ Ω, where a = = we obtain h(sa (u))γa (du) = (a + 1) n = (a + 1)
**

I

pk + t pk−1 , qk + t qn−1

t ∈ I,

[ik , . . . , i2 , i1 + a] if k > 1, 1/(i1 + a) if k = 1 qk−1 + apk−1 , qk + apk

I(i(k) )

I(i(k) )

h(sa (u))du n (au + 1)2

h(sa (t))dt n−k . (t(qk−1 + apk−1 ) + qk + apk )2

Hence 1 γa (I(i(k) )) =

I

I(i(k) )

h(sa (u))γa (du) = (a + 1) n h(v) dGa (v), n−k

h(sa (t))dt n−k (a t + 1)2 I

(2.5.41)

(h ◦ sa )dγa = n−k

I

**Solving Gauss’ problem where Ga (v) = γa (sa < v), m ∈ N, v ∈ I. By Theorem 2.5.5 we have m m |Ga (v) − G(v)| ≤ m for any a, v ∈ I and m ∈ N. Then h(v)dGa (v) − n−k hdγ
**

I

159

k0 Fm Fm+1

I

(2.5.42) G(v)dh(v) ≤

I

=

I

Ga (v)dh(v) − n−k

k0 var h . Fn−k Fn−k+1

**It follows from (2.5.40) through (2.5.42) that
**

a fk (h ◦ sa )dγa − n

i(k) ∈Nk +

I(i(k) )

I

f dγa

hdγ

I

(2.5.43)

≤

k0 var h Fn−k Fn−k+1

I

|f |dγa .

**Finally, (2.5.38), (2.5.39), and (2.5.43) for k = 0 and n ∈ N should be replaced by f (h ◦ sa )dγa = n
**

a a (f − f0 )(h ◦ sa )dγa + f0 n

I

I

I

(h ◦ sa )dγa , n

(2.5.38 ) (2.5.39 )

I

a (f − f0 )(h ◦ sa )dγa ≤ || h || n

I

a |f − f0 | dγa ,

and

a f0

I

(h ◦ sa )dγa − n

I

f dγa

hdγ ≤

I

k0 var h Fn Fn+1

I

|f |dγa ,

(2.5.43 )

**respectively. Now, (2.5.37) follows from (2.5.38), (2.5.38 ) (2.5.39), (2.5.39 ), (2.5.43), and (2.5.43 ). 2 Corollary 2.5.10 For any a, x, y ∈ I and n ∈ N we have γa (τ 0 ≤ x, sa ≤ y) − γa ([0, x])G(y) n ≤ inf
**

a δk (x) +

0≤k≤n

k0 γa ([0, x]) Fn−k Fn−k+1

(2.5.44)

160 where 0 a 2(a + 1)(x − ak )(bk − x) δk (x) = (bk − ak )(ax + 1)2

Chapter 2

if x ∈ Ek , if x ∈ (ak , bk ),

and [ak , bk ] is the closure of the (unique) fundamental interval of order k ∈ N containing x ∈ I \ Ek . Proof. Clearly, γa (τ 0 ≤ x, sa ≤ y) = n I[0,x] (I[0,y] ◦ sa ) dγa n

I

for any a, x, y ∈ I and n ∈ N. Theorem 2.5.9 applies with f = I[0,x] and h = I[0,y] , x, y ∈ I, yielding (2.5.44) since as is easy to see, in the present case

I a a |f − fk |dγa = δk (x),

k ∈ N, a, x ∈ I. 2

**Corollary 2.5.11 For any a ∈ I and n ∈ N we have a+1 ≤ sup γa (τ 0 ≤ x, sa ≤ y) − γa ([0, x])G(y) n 2(Fn + aFn−1 )(Fn+1 + aFn ) x,y∈I ≤ a+1 + k0 2 1 F n/2 .
**

+1

F

(2.5.45)

n/2

**Proof. We clearly have
**

a δk (x) ≤

a+1 a+1 max λ(I(i(k) )) = , (k) 2 i 2Fk Fk+1

k ∈ N, a, x ∈ I.

(2.5.46)

The upper bound from (2.5.45) follows by using (2.5.46) and taking k = n/2 . Next, as in the proof of Theorem 2.5.8, we get supx,y∈I γa (τ 0 ≤ x, sa ≤ y) − γa ([0, x])G(y) n ≥ supy∈I |γa (sa ≤ y) − G(y)| ≥ n a+1 2(Fn + aFn−1 )(Fn+1 + aFn ) 2

for any a ∈ I and n ∈ N, and so the lower bound holds, too.

Solving Gauss’ problem

161

Remark. The upper bound in Corollary 2.5.11 is O(gn ) as n → ∞, with √ g = ( 5 − 1)/2. The lower bound is O(g2n ) as n → ∞ so that the problem of the exact rate of convergence is unsettled. 2 Corollary 2.5.12 Let µ ∈ pr (BI ) such that µ dµ/dγa , a ∈ I. Then we have µ(τ 0 ≤ x, sa ≤ y) − µ([0, x])G(y) n ≤ inf ga I[0,x] − (ga I[0,x] )a k k0 dγa + µ([0, x]) Fn−k Fn−k+1 (2.5.47) λ and let ga =

0≤k≤n

I

**for any a, x, y ∈ I and n ∈ N. In particular, if ga has a version ga of bounded variation, then
**

I

**ga I[0,x] − (ga I[0,x] )a dγa k (a + 1)var[0,x] ga (Fk + aFk−1 )(Fk+1 + aFk ) ≤ (a + 1)var[0,x] ga +2 (Fk + aFk−1 )(Fk+1 + aFk )
**

x ak

(2.5.48) if x ∈ Ek

ga (t)γa (dt) if x ∈ (ak , bk ),

**where [ak , bk ] is the closure of the (unique) fundamental interval of order k ∈ N containing x ∈ I \ Ek . Proof. We have µ(τ 0 ≤ x, sa ≤ y) = n
**

I

I[0,x] (I[0,y] ◦ sa )ga dγa n

for any a, x, y ∈ I and n ∈ N. Theorem 2.5.9 applies with f = ga I[0,x] and h = I[0,y] , x, y ∈ I, yielding (2.5.47). Next, (2.5.48) can be obtained noting that (i) for a typical fundamental interval I(i(k) ) of order k ∈ N contained in [0, x] we have

I(i(k) )

ga I[0,x] − (ga I[0,x] )a dγa k 1 γa (I(i(k) ))

=

I(i(k) )

ga (t) −

I(i(k) )

ga (s)γa (ds) γa (dt)

=

1 γa (I(i(k) ))

I(i(k) )

I(i(k) )

(ga (t) − ga (s)) γa (ds) γa (dt)

≤ γa (I(i(k) ))varI (i(k) ) ga ,

**162 and (ii) for x ∈ (ak , bk ) we have
**

bk ak

Chapter 2

ga I[0,x] − (ga I[0,x] )a dγa k

x

=

ak

ga (t) −

1 γa ([ak , bk ])

bk x x ak

x ak

ga (s)γa (ds) γa (dt)

+ ≤

1 γa ([ak , bk ])

x

ga (s)γa (ds) γa (dt)

bk ak x ak

ak

ga (t)γa (dt) +

x

1 γa ([ak , bk ])

ga (s)γa (ds) γa (dt)

=

2

ak

ga (t)γa (dt). 2 λ and let ga = k0 Fn−k Fn−k+1

**The proof is complete. Corollary 2.5.13 Let µ ∈ pr(BI ) such that µ dµ/dγa , a ∈ I. Then we have |µ(sa ≤ x) − G(x)| ≤ n
**

0≤k≤n

inf

I

|ga − (ga )a | dγa + k

for any a, x ∈ I and n ∈ N. If ga has a version of bounded variation, then the right hand side of the above inequality is O(gn ) as n → ∞ uniformly with respect to a, x ∈ I. Proof. Take x = 1 in (2.5.47), and then x = 1 and k = (2.5.48). n/2 in 2

Remark. Corollary 2.5.13 shows that the limiting distribution as n → ∞ of sa under a probability measure on BI absolutely continuous with respect n to λ is always Gauss’ γ for any a ∈ I. The problem of the exact rate of convergence, which should normally depend on ga , remains unsettled. 2 Other special cases of Theorem 2.5.9 and its corollaries can be considered. For example, we can check that

n→∞

lim γ(τ 0 ≤ x, sa ≤ y) = G(x)G(y), n

a, x, y ∈ I.

(2.5.49)

It is interesting to note that (2.5.49) points to the asymptotic independence of τ 0 and sa under γ as n → ∞. n

Solving Gauss’ problem

163

As already noted at the beginning of this subsection, we can easily obtain the results corresponding to Corollaries 2.5.10 through 2.5.12 in the case considered by A. Denjoy, where τ 0 is replaced by [a1 , · · · , an ]. A deﬁnite diﬀerence occurs just in the convergence rates while the limiting probabilities are not altered.

164

Chapter 2

Chapter 3

Limit theorems

This chapter is devoted to functional versions of central limit and other weak theorems, and of the law of the iterated logarithm for the incomplete quotients and associated random variables. The reader should keep in mind throughout that the sequence (an )n∈N+ of incomplete quotients is ψ-mixing under diﬀerent probability measures on BI (see Subsections 1.3.6 and 2.3.4), while frequent reference is made to the three appendices at the end of the book.

3.0 Preliminaries

As in Subsection 2.5.4, let g be a λ-integrable complex-valued function on I. We particularize here the framework considered there by taking a = 0 and, accordingly, γ_0 = λ. Denote by E_k, k ∈ N, the set consisting of the endpoints of all fundamental intervals of rank ℓ, 0 ≤ ℓ ≤ k. For any n ∈ N_+ we associate with g a function g_n which has a constant value on each fundamental interval I(i^(n)), i^(n) ∈ N_+^n, of rank n. Specifically,

    g_n(x) = (1/λ(I(i^(n)))) ∫_{I(i^(n))} g dλ,   x ∈ I(i^(n)), i^(n) ∈ N_+^n, n ∈ N_+.   (3.0.1)

Then

    ∫_I g_n dλ = ∫_I g dλ,   n ∈ N_+,   (3.0.2)

and

    lim_{n→∞} g_n(x) = g(x)   a.e. in I.   (3.0.3)
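The averaging construction (3.0.1)–(3.0.3) can be illustrated numerically. The sketch below is not from the book; it only assumes the standard convergent recursions p_k = i_k p_{k−1} + p_{k−2}, q_k = i_k q_{k−1} + q_{k−2}, under which I(i^(n)) has endpoints p_n/q_n and (p_n + p_{n−1})/(q_n + q_{n−1}). Taking g(x) = 2x, the cell averages are exactly computable, the averaged mass over low-digit rank-2 cells nearly exhausts ∫_I g dλ = 1, and g_n converges pointwise at x = √2 − 1, whose continued fraction digits are all equal to 2.

```python
from fractions import Fraction

def fundamental_interval(digits):
    # endpoints of I(i_1,...,i_n): p_n/q_n and (p_n + p_{n-1})/(q_n + q_{n-1})
    pm1, qm1, p, q = 1, 0, 0, 1
    for d in digits:
        pm1, qm1, p, q = p, q, d * p + pm1, d * q + qm1
    a, b = Fraction(p, q), Fraction(p + pm1, q + qm1)
    return (a, b) if a < b else (b, a)

def g_n_at(digits):
    # average of g(x) = 2x over the cell containing x: exactly a + b
    a, b = fundamental_interval(digits)
    return a + b

# (3.0.2), truncated: the averaged mass over rank-2 cells with digits <= K
# recovers almost all of the total mass, which is 1 for g(x) = 2x
K = 100
total = sum(b * b - a * a
            for i in range(1, K + 1) for j in range(1, K + 1)
            for a, b in [fundamental_interval([i, j])])
print(float(total))                 # close to 1; cells with digits > K are missing

# (3.0.3): at x = sqrt(2) - 1 = [2, 2, 2, ...] the averages approach g(x) = 2x
x = 2 ** 0.5 - 1
for n in (2, 6, 12):
    print(n, float(g_n_at([2] * n)) - 2 * x)
```

The exact rational arithmetic avoids any rounding issue in the very small rank-12 cell, whose length is of order q_12^{−2} ≈ 10^{−9}.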

It follows from (3.0.2) and (3.0.3) that lim_{n→∞} ∫_I |g − g_n| dλ = 0. Hence

    ω_{g,A}(n) := ∫_A |g − g_n| dλ → 0   (3.0.4)

uniformly with respect to A ∈ B_I as n → ∞.

We shall now prove a result which, in a sense, is dual to Theorem 2.5.9.

Lemma 3.0.1. Let µ ∈ pr(B_I) be such that µ ≪ λ, and let g = dµ/dλ. For any n ∈ N_+ and A ∈ B_n^∞ = τ^{−n+1}(B_I) we have

    |µ(A) − γ(A)| ≤ inf_{1≤s<n} (γ(A^c) ω_{g,A}(s) + γ(A) ω_{g,A^c}(s) + γ(A) ε_{n−s}),

with ε_n, n ∈ N_+, defined as in Subsection 1.3.6. Hence

    lim_{n→∞} sup_{A∈B_n^∞} |µ(A) − γ(A)| = 0.

Proof. Put h = I_A − γ(A), A ∈ B_n^∞. Then

    µ(A) − γ(A) = ∫_I g h dλ

and

    |∫_I g h dλ| ≤ ∫_I |g_s − g| |h| dλ + |∫_I g_s h dλ|,

where g_s is defined by (3.0.1) and s < n, s ∈ N_+, is arbitrary. Since |h| = 1 − γ(A) = γ(A^c) on A and |h| = γ(A) on A^c, we have

    ∫_I |g_s − g| |h| dλ ≤ γ(A^c) ω_{g,A}(s) + γ(A) ω_{g,A^c}(s).   (3.0.5)

Next,

    ∫_I g_s h dλ = Σ_{i^(s)∈N_+^s} ∫_{I(i^(s))} g_s h dλ
                 = Σ_{i^(s)∈N_+^s} (1/λ(I(i^(s)))) ∫_{I(i^(s))} g dλ ∫_{I(i^(s))} h dλ
                 = Σ_{i^(s)∈N_+^s} (µ(I(i^(s)))/λ(I(i^(s)))) (λ(I(i^(s)) ∩ A) − λ(I(i^(s))) γ(A)).

It then follows from equation (1.3.35) that

    |∫_I g_s h dλ| ≤ γ(A) ε_{n−s}.   (3.0.6)

Now, the result stated follows from (3.0.5), (3.0.6), and (3.0.4). □

Let f_n : N_+ → R, n ∈ N_+, and define

    X_nj = f_n(a_j),   1 ≤ j ≤ n,
    S_n0 = 0,   S_nk = Σ_{j=1}^k X_nj,   1 ≤ k ≤ n,   S_nn = S_n,   n ∈ N_+.

For any n ∈ N_+ define the process ξ_n = (ξ_n(t))_{t∈I} by ξ_n(t) = S_{n⌊nt⌋}, t ∈ I.

Lemma 3.0.2. Let µ ∈ pr(B_I) be such that µ ≪ λ. Assume that the array X = {X_nj, 1 ≤ j ≤ n, n ∈ N_+} is s.i. under γ.

(i) If either (γξ_n^{−1})_{n∈N_+} or (µξ_n^{−1})_{n∈N_+} converges weakly in B_D, then both sequences converge weakly in B_D and have the same limit.

(ii) If either (γS_n^{−1})_{n∈N_+} or (µS_n^{−1})_{n∈N_+} converges weakly in B, then both sequences converge weakly in B and have the same limit.

Proof. Clearly, (ii) is an immediate consequence of (i). Let us therefore prove the latter. Take a sequence (k_n)_{n∈N_+} such that k_n ≤ n, lim_{n→∞} k_n/n = 0, and lim_{n→∞} k_n = ∞. As X is s.i. under γ, we have

    lim_{n→∞} γ(|S_{nk_n}| > ε) = 0   (3.0.7)

for any ε > 0. Let us first show that

    lim_{n→∞} γ(max_{1≤k≤k_n} |S_nk| > ε) = 0   (3.0.8)

for any ε > 0. It follows from Proposition A3.5 (see also Section A1.4) that whatever ε > 0 we have

    d_P(γS_nk^{−1}, δ_0) ≤ ε/4,   1 ≤ k ≤ k_n,

for any n large enough (≥ n_ε). Therefore for some θ ≤ ε/4 we have

    δ_0(A) < γS_nk^{−1}(A^θ) + θ

for any n ≥ n_ε, 1 ≤ k ≤ k_n, and A ∈ B. Hence, with A = (−θ, θ), for which A^θ = (−2θ, 2θ), we obtain

    γS_nk^{−1}((−ε/2, ε/2)) > γS_nk^{−1}(A^θ) > 1 − θ ≥ 1 − ε/4

for any n ≥ n_ε and 1 ≤ k ≤ k_n. Equivalently,

    min_{1≤k≤k_n} γ(|S_nk| < ε/2) > 1 − ε/4,   n ≥ n_ε.

If ε is small enough so that

    1 − ε/4 > ϕ_γ(1),

then by an Ottaviani-type inequality [see Lemma 1.1.6 in Iosifescu and Theodorescu (1969)] we can write

    γ(max_{1≤k≤k_n} |S_nk| > ε) ≤ γ(|S_{nk_n}| ≥ ε/2) / (1 − ε/4 − ϕ_γ(1))

for any n ≥ n_ε. Hence (3.0.8) holds on account of (3.0.7).

Next, for any n ∈ N_+ consider the process ξ̃_n = (ξ̃_n(t))_{t∈I} defined by

    ξ̃_n(t) = S_{n⌊nt⌋} − S_{n min(⌊nt⌋, k_n)},   t ∈ I.

Note that ξ̃_n is B_{k_n+1}^∞-measurable, and then by Lemma 3.0.1 and Lemma 2.1.1 in Iosifescu and Grigorescu (1990) we have

    lim_{n→∞} |∫_D h d(γξ̃_n^{−1}) − ∫_D h d(µξ̃_n^{−1})| = 0   (3.0.9)

for any bounded continuous real-valued function h on D. On the other hand (see Section A1.6), for any n ∈ N_+ we have

    d_0(ξ_n, ξ̃_n) ≤ sup_{t∈I} |ξ_n(t) − ξ̃_n(t)| ≤ max_{1≤k≤k_n} |S_nk|.

It then follows from (3.0.8) that

    d_0(ξ_n, ξ̃_n) converges to 0 in γ-probability as n → ∞.   (3.0.10)

Hence, as µ ≪ γ, we also have that

    d_0(ξ_n, ξ̃_n) converges to 0 in µ-probability as n → ∞.   (3.0.11)


We can now conclude the proof using (3.0.9) through (3.0.11). If, for example, γξ_n^{−1} →^w ν for some ν ∈ pr(B_D), then it follows from (3.0.10) that γξ̃_n^{−1} →^w ν, too. Next, (3.0.9) implies that µξ̃_n^{−1} →^w ν, which in conjunction with (3.0.11) yields µξ_n^{−1} →^w ν. □

Remark. Lemma 3.0.2 still holds when the process ξ_n is replaced by the process ξ_n^C = (ξ_n^C(t))_{t∈I} defined by

    ξ_n^C(t) = S_{n⌊nt⌋} + (nt − ⌊nt⌋)(S_{n(⌊nt⌋+1)} − S_{n⌊nt⌋}),   t ∈ I,

with the convention S_n0 = 0, n ∈ N_+. □

3.1 The Poisson law

3.1.1 The case of incomplete quotients

Let θ ∈ R_++ and α ∈ R be arbitrarily given. Consider the array X = {X_nj, 1 ≤ j ≤ n, n ∈ N_+}, where

    X_nj = (a_j/n)^α I_(a_j>θn).   (3.1.1)

For this array we have

    S_nk = n^{−α} Σ_{j=1}^k a_j^α I_(a_j>θn),   1 ≤ k ≤ n,   S_n = S_nn,   n ∈ N_+.   (3.1.2)

Proposition 3.1.1. The array (3.1.1) is s.i. under γ.

Proof. We only consider the case α ∈ R_++; the other cases can be treated similarly. We have

    γ(|S_nk| > ε) ≤ Σ_{j=1}^k γ(|X_nj| > ε/k) = k γ(|X_n1| > ε/k)
                 = k γ(a_1 > n max(θ, (ε/k)^{1/α})) ≤ k γ(a_1 > nθ),   1 ≤ k ≤ n.

Hence X_n1 converges in γ-probability to 0 as n → ∞, and for any 0 < a < 1 we have

    lim sup_{n→∞} max_{1≤k≤⌊an⌋} γ(|S_nk| > ε) ≤ lim_{n→∞} ⌊an⌋ γ(a_1 > nθ) = a/(θ log 2),

which is less than 1 if we choose 0 < a < min(1, θ log 2). On account of Proposition A3.6 the proof is complete. □

Theorem 3.1.2. We have

    γS_n^{−1} →^w ν in B,   (3.1.3)

where:

(i) if α ∈ R_++ then ν = Pois ρ with

    (dρ/dλ)(x) = δ_x((θ^α, ∞)) x^{−1−1/α}/(α log 2),   x ∈ R;

(ii) if −α ∈ R_++ then ν = Pois ρ with

    (dρ/dλ)(x) = −δ_x((0, θ^α)) x^{−1−1/α}/(α log 2),   x ∈ R;

(iii) if α = 0 then ν = Pois((θ log 2)^{−1} δ_1), that is, ν is the Poisson distribution P((θ log 2)^{−1}) with parameter (θ log 2)^{−1}.

Proof. We only prove (i), the proofs of (ii) and (iii) being completely similar. Consider the measures µ_n on B defined by

    µ_n(A) = γ((a_1/n)^α ∈ A, a_1 > θn),   A ∈ B, n ∈ N_+.

Clearly, µ_n(R) = γ(a_1 > θn) ≤ 1, µ_n([−θ^α, θ^α]) = 0, and

    γ(X_n1 ∈ A) = γ(a_1 ≤ θn) δ_0(A) + µ_n(A),   A ∈ B, n ∈ N_+.

Also, for any x ∈ R we have

    lim_{n→∞} n µ_n((x, ∞)) = lim_{n→∞} n γ(a_1 > n (max(x, θ^α))^{1/α})
        = (1/log 2) lim_{n→∞} n log(1 + 1/(n (max(x, θ^α))^{1/α} + 1))
        = (1/log 2) (max(x, θ^α))^{−1/α} = ρ((x, ∞)).

Finally,

    lim_{n→∞} n µ_n(R) = lim_{n→∞} n γ(a_1 > nθ) = 1/(θ log 2) = ρ(R).

Therefore all hypotheses of Theorem A3.10 are fulfilled, and (3.1.3) holds. □

Now, on account of Proposition 3.1.1, Theorem 3.1.2, Lemma 3.0.2, and Theorem A3.7 we can state the following result. (See Section A3.3 for notation.)

Corollary 3.1.3. Let µ ∈ pr(B_I) be such that µ ≪ λ. Then µξ_n^{−1} →^w Q_ν in B_D, hence µS_n^{−1} →^w ν in B, where ξ_n = (S_{n⌊nt⌋})_{t∈I}, with the convention S_n0 = 0, n ∈ N_+.
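For α = 0, Theorem 3.1.2(iii) and Corollary 3.1.3 say that the number of incomplete quotients among a_1, ..., a_n exceeding θn is asymptotically Poisson with parameter (θ log 2)^{−1} under any µ ≪ λ. A rough Monte Carlo check of the mean is sketched below (not from the book): digits are extracted by the Euclidean algorithm applied to high-precision random rationals p/2^256, which is exact for the first 50 digits; the sample mean should be near 1/log 2 ≈ 1.4427 for θ = 1, though at n = 50 it still sits slightly below the limit.

```python
import math, random

def cf_digits(p, q, n):
    # first n continued fraction digits of p/q in (0, 1), by Euclid's algorithm
    digits = []
    for _ in range(n):
        if p == 0:
            break                      # rational endpoint reached (very rare here)
        a, r = divmod(q, p)
        digits.append(a)
        p, q = r, p
    return digits

random.seed(7)
n, theta, trials, bits = 50, 1.0, 3000, 256
counts = []
for _ in range(trials):
    p = random.getrandbits(bits) | 1   # x = p / 2^bits, roughly uniform on (0, 1)
    counts.append(sum(1 for a in cf_digits(p, 1 << bits, n) if a > theta * n))
mean = sum(counts) / trials
print(mean, 1 / math.log(2))           # sample mean vs limiting Poisson parameter
```

With 256 bits of precision a random rational has far more than 50 continued fraction digits with overwhelming probability, so truncation plays no role in the estimate.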

3.1.2 The case of associated random variables

We shall now show that both Theorem 3.1.2 and Corollary 3.1.3 still hold when a_j is replaced by either y_j, r_j, or u_j, 1 ≤ j ≤ n, in (3.1.1) and (3.1.2). This will follow from the result below.

Lemma 3.1.4. Let b_n, n ∈ N_+, be real-valued random variables on (I, B_I) such that a_n ≤ b_n ≤ a_n + c, n ∈ N_+, for some c ∈ R_+. For any n ∈ N_+ consider the stochastic processes ξ_n = (S_{n⌊nt⌋})_{t∈I} and ξ̄_n = (S̄_{n⌊nt⌋})_{t∈I}, where S_nk, 1 ≤ k ≤ n, is defined by (3.1.2) and

    S̄_nk = n^{−α} Σ_{j=1}^k b_j^α I_(b_j>θn),   1 ≤ k ≤ n,

with the convention S̄_n0 = 0. Then d_0(ξ_n, ξ̄_n) converges to 0 in γ-probability as n → ∞.

Proof. For any n ∈ N_+ we have

    d_0(ξ_n, ξ̄_n) ≤ sup_{t∈I} |S_{n⌊nt⌋} − S̄_{n⌊nt⌋}| ≤ Σ_{j=1}^n |δ_nj|,

where

    δ_nj = n^{−α} (b_j^α I_(b_j>θn) − a_j^α I_(a_j>θn)),   1 ≤ j ≤ n.

Notice that (a_j > θn) ⊂ (b_j > θn), 1 ≤ j ≤ n, and put

    δ'_n = n^{−α} Σ_{j=1}^n b_j^α (I_(b_j>θn) − I_(a_j>θn)) = n^{−α} Σ_{j=1}^n b_j^α I_(b_j>θn, a_j≤θn),
    δ''_n = n^{−α} Σ_{j=1}^n |b_j^α − a_j^α| I_(a_j>θn).

Then Σ_{j=1}^n |δ_nj| ≤ δ'_n + δ''_n, and we are going to prove that δ'_n and δ''_n both converge to 0 in γ-probability as n → ∞. We have

    γ(δ'_n > 0) ≤ n γ(θn − c < a_1 ≤ θn) → 0 as n → ∞,

while

    δ''_n ≤ c_α n^{−1} · n^{−(α−1)} Σ_{j=1}^n a_j^{α−1} I_(a_j>θn),

where

    c_α = cα(1 + c)^{α−1} if α ≥ 1,   c_α = c|α| if α < 1.

[We have used the inequality (1 + a)^α − 1 ≤ aα max(1, (1 + a)^{α−1}), valid for non-negative a and α, which implies 1 − (1 + a)^{−α} ≤ aα.] By Theorem 3.1.2, δ''_n converges to 0 in γ-probability as n → ∞. It follows that d_0(ξ_n, ξ̄_n) is dominated by the sum of two non-negative random variables both converging in γ-probability to 0 as n → ∞. The proof is complete. □

Corollary 3.1.5. Let b_n denote either y_n, r_n, or u_n, n ∈ N_+. Put

[We have used the inequality (1+a)α −1 ≤ a {α} + α (1 + a)α−1 , valid for non-negative a and α, which implies 1 − (1 + a)−α ≤ aα.] By Theorem 3.1.2, δn converges to 0 in γ-probability as n → ∞. It follows that d0 (ξn , ξn ) is dominated by the sum of two non-negative random variables both converging in γ-probability to 0 as n → ∞. The proof is complete. 2 Corollary 3.1.5 Let bn denote either yn , rn , or un , n ∈ N+ . Put

k

Snk = n

−α j=1

bα I(bj >θn) , j

1 ≤ k ≤ n,

Limit theorems and for any n ∈ N+ consider the stochastic process ξn = (Sn convention Sn0 = 0. Let µ ∈ pr(BI ) such that µ −1 w BD , hence µSnn −→ ν in B.

nt

**173 )t∈I , with the
**

w

−1 λ. Then µξn → Qν in

Proof. Lemma 3.1.4 applies with c = 1 in the case of yn and rn and with c = 2 in the case of un . Since µ γ, the distance d0 (ξn , ξn ) converges to 0 in µ-probability, too, as n → ∞. This property and Corollary 3.1.3 imply the result stated. 2 Let bn denote either an , yn , rn or un , n ∈ N+ , and consider the special case α = 0. By Corollaries 3.1.3 and 3.1.5, under any µ ∈ pr(BI ) such that µ λ, the random variable

    S_n = Σ_{j=1}^n I_(b_j>θn)

is asymptotically P((θ log 2)^{−1}) as n → ∞. It is possible to estimate the rate of convergence of γ(S_n = k), k ∈ N, to its Poisson limit. The following result holds.

Theorem 3.1.6. Let k ∈ N and 0 < δ < 1 be fixed. We have

    |γ(S_n = k) − e^{−θ} θ^k/k!| ≤ c exp(−(log n)^δ),   n ∈ N_+,

for θ = O(n^a), 0 ≤ a < 1, where c only depends, perhaps, on δ, a, and k.

The proof for the case b_n = a_n, n ∈ N_+, k = 0, can be found in Philipp (1976, p. 382), where the proviso θ = O(n^a), 0 ≤ a ≤ 1, does not appear. Cf. Galambos (1972) and Iosifescu (1978, p. 35).

3.1.3 Some extreme value theory

Throughout this subsection let again b_n denote either a_n, y_n, r_n or u_n, n ∈ N_+. For 1 ≤ k ≤ n let M_n^(k) be the kth largest of b_1, ···, b_n. Clearly, M_n^(1) = M_n is the maximum of b_1, ···, b_n. The asymptotic distribution of M_n^(k) as n → ∞ for any fixed k can be easily obtained from previous results, as shown below.

Proposition 3.1.7. Let µ ∈ pr(B_I) be such that µ ≪ λ. For any fixed k ∈ N_+ we have

    lim_{n→∞} µ(M_n^(k) log 2 / n ≤ x) = e^{−1/x} Σ_{j=0}^{k−1} x^{−j}/j!,   x ∈ R_++.   (3.1.4)

In particular,

    lim_{n→∞} µ(M_n log 2 / n ≤ x) = e^{−1/x},   x ∈ R_++.

Proof. Let 1 ≤ k ≤ n. It is easy to see that S_n = Σ_{j=1}^n I_(b_j>θn) is less than k if and only if M_n^(k) does not exceed θn, that is,

    (M_n^(k) ≤ θn) = (S_n < k)   (3.1.5)

for any θ ∈ R_++ and n ∈ N_+. Hence, by Corollaries 3.1.3 and 3.1.5,

    µ(M_n^(k) ≤ θn) = µ(S_n < k) = Σ_{j=0}^{k−1} µ(S_n = j)
        → e^{−(θ log 2)^{−1}} Σ_{j=0}^{k−1} (1/j!) (θ log 2)^{−j}

as n → ∞ for any fixed k ∈ N_+. Putting x = θ log 2 we obtain the result stated. □

Remark. The limit distribution for the special case k = 1 is known as the Type II extreme value distribution for sequences of i.i.d. random variables. See, e.g., de Haan (1970). The same result can also be obtained from general results of Loynes (1965) for mixing strictly stationary sequences. □

In what follows we give some almost sure asymptotic properties of M_n due to Philipp (1976), which improve upon results of Galambos (1974). We start with an F. Bernstein type theorem (see Proposition 1.3.16).

Proposition 3.1.8. Let (c_n)_{n∈N_+} be a non-decreasing sequence of positive numbers. Then γ(M_n ≥ c_n i.o.) is either 0 or 1 according as the series Σ_{n∈N_+} 1/c_n converges or diverges.

Proof. We have (b_n ≥ c_n i.o.) ⊂ (M_n ≥ c_n i.o.) since b_n(ω) ≥ c_n for some n ∈ N_+ and ω ∈ Ω implies M_n(ω) ≥ c_n. Conversely, if M_n(ω) ≥ c_n for some n ∈ N_+ and ω ∈ Ω, then there exists n' ≤ n such that M_n(ω) = b_{n'}(ω) ≥ c_n ≥ c_{n'}. Hence (M_n ≥ c_n i.o.) ⊂ (b_n ≥ c_n i.o.). Therefore

    (M_n ≥ c_n i.o.) = (b_n ≥ c_n i.o.),

and the conclusion follows from Corollary 1.3.17. □
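The limiting law of Proposition 3.1.7 can also be spot-checked by simulation (a sketch, not from the book; digits are obtained by exact Euclidean division on high-precision random rationals). By (3.1.5) with k = 1, M_n ≤ θn means that no digit among a_1, ..., a_n exceeds θn; taking θn = n/log 2, i.e. x = θ log 2 = 1, the proportion of samples with M_n log 2/n ≤ 1 should be near e^{−1} ≈ 0.368.

```python
import math, random

def max_digit(p, q, n):
    # maximum of the first n continued fraction digits of p/q in (0, 1)
    m = 0
    for _ in range(n):
        if p == 0:
            break
        a, r = divmod(q, p)
        m = max(m, a)
        p, q = r, p
    return m

random.seed(11)
n, trials, bits = 50, 3000, 256
threshold = n / math.log(2)            # corresponds to x = theta * log 2 = 1
hits = sum(max_digit(random.getrandbits(bits) | 1, 1 << bits, n) <= threshold
           for _ in range(trials))
print(hits / trials, math.exp(-1))     # empirical vs limiting probability
```

At n = 50 the finite-n probability is slightly above e^{−1}, so only a loose agreement is expected.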


Corollary 3.1.9. Let (c_n)_{n∈N_+} be as in Proposition 3.1.8. Then either

    lim_{n→∞} M_n/c_n = 0 a.e.   (3.1.6)

or

    lim sup_{n→∞} M_n/c_n = ∞ a.e.   (3.1.7)

according as the series Σ_{n∈N_+} 1/c_n converges or diverges.

Proof. First, assume that s = Σ_{n∈N_+} 1/c_n < ∞. Choose positive numbers d_n, n ∈ N_+, with lim_{n→∞} d_n = ∞ such that Σ_{n∈N_+} d_n/c_n < ∞. This is always possible. Indeed, put s_n = Σ_{i=1}^n 1/c_i, n ∈ N_+, and define

    E_1 = {j ∈ N_+ : s_j ≤ 3s/4},
    E_n = {j ∈ N_+ : 3s Σ_{i=1}^{n−1} 4^{−i} < s_j ≤ 3s Σ_{i=1}^n 4^{−i}},   n ≥ 2.

Consider the increasing sequence (n_k)_{k∈N_+} of indices n for which E_n ≠ ∅, and take d_j = 2^{n_{k−1}} if j ∈ E_{n_k}, k ∈ N_+, with n_0 = 0. Then we have

    Σ_{j∈E_{n_k}} 1/c_j ≤ 3s (4^{−n_k} + 4^{−(n_k−1)} + ··· + 4^{−n_{k−1}}) ≤ 4^{−n_{k−1}+1} s,   k ∈ N_+,

hence

    Σ_{n∈N_+} d_n/c_n = Σ_{k∈N_+} Σ_{j∈E_{n_k}} d_j/c_j ≤ 4s Σ_{k∈N_+} 2^{−n_{k−1}} ≤ 8s.

By Proposition 3.1.8 we have

    γ(M_n/c_n ≥ 1/d_n i.o.) = 0,

which is equivalent to (3.1.6).

Second, assume that Σ_{n∈N_+} 1/c_n = ∞. Choose positive numbers d_n, n ∈ N_+, with lim_{n→∞} d_n = 0 such that Σ_{n∈N_+} d_n/c_n = ∞. This is again always possible. Indeed, put s_n = Σ_{i=1}^n 1/c_i, n ∈ N_+, and define

    E_1 = {j ∈ N_+ : s_j ≤ 4},   E_n = {j ∈ N_+ : 4^{n−1} < s_j ≤ 4^n},   n ≥ 2.

Consider the increasing sequence (n_k)_{k∈N_+} of indices n for which E_n ≠ ∅, and take d_j = 2^{−n_{k−1}} if j ∈ E_{n_k} ∪ E_{n_{k+1}}, k = 1, 3, ···, with n_0 = 0. Then

    Σ_{j∈E_{n_k}∪E_{n_{k+1}}} d_j/c_j ≥ 2^{−n_{k−1}} Σ_{j∈E_{n_k}∪E_{n_{k+1}}} 1/c_j ≥ 2^{−n_{k−1}} (4^{n_k} − 4^{n_{k−1}}) ≥ 3 · 2^{n_{k−1}},   k = 1, 3, ···.

Clearly, this implies Σ_{n∈N_+} d_n/c_n = ∞. By Proposition 3.1.8 we have

    γ(M_n/c_n ≥ 1/d_n i.o.) = 1,

which is equivalent to (3.1.7). □

Theorem 3.1.10. Let (c_n)_{n∈N_+} be a non-decreasing sequence of positive numbers such that the sequence (n/c_n)_{n∈N_+} is non-decreasing. Then

    γ(M_n ≤ n/(c_n log 2) i.o.)

is either 0 or 1 according as the series

    Σ_{n∈N_+} (log log n)/(n exp c_n)

converges or diverges.

The proof is completely similar to that given for the i.i.d. case in Barndorff-Nielsen (1961). Theorem 3.1.6 plays an essential part in the present case. For details in the case b_n = a_n, n ∈ N_+, see Philipp (1976, pp. 384–385). □

Corollary 3.1.11. We have

    lim sup_{n→∞} (log M_n − log n)/log log n = 1 a.e.  and  lim inf_{n→∞} (log M_n − log n)/log log n = 0 a.e.,

whence

    lim_{n→∞} log M_n / log n = 1 a.e.

Proof. For the lim sup case we should show that for any ε > 0 we have

    γ((log M_n − log n)/log log n ≥ 1 + ε i.o.) = 0

and

    γ((log M_n − log n)/log log n ≥ 1 − ε i.o.) = 1

or, equivalently,

    γ(M_n ≥ n (log n)^{1+ε} i.o.) = 0  and  γ(M_n ≥ n (log n)^{1−ε} i.o.) = 1.

These equations clearly hold by Proposition 3.1.8. For the lim inf case we should show that for any ε > 0 we have

    γ((log M_n − log n)/log log n ≤ ε i.o.) = 1

and

    γ((log M_n − log n)/log log n ≤ −ε i.o.) = 0

or, equivalently,

    γ(M_n ≤ n (log n)^ε i.o.) = 1  and  γ(M_n ≤ n (log n)^{−ε} i.o.) = 0.

It is easy to check that these equations hold by Theorem 3.1.10. □

Corollary 3.1.12. We have

    lim inf_{n→∞} M_n log log n / n = 1/log 2 a.e.

Proof. We should show that for any ε > 0 we have

    γ(M_n log log n / n − 1/log 2 ≤ ε i.o.) = 1

and

    γ(M_n log log n / n − 1/log 2 ≤ −ε i.o.) = 0

or, equivalently,

    γ(M_n ≤ n(1 + ε')/((log log n)(log 2)) i.o.) = 1

and

    γ(M_n ≤ n(1 − ε')/((log log n)(log 2)) i.o.) = 0,

where ε' = ε log 2. This follows immediately from Theorem 3.1.10. □


To conclude this subsection we consider the kth smallest m_n^(k) of b_1, ···, b_n, 1 ≤ k ≤ n, n ∈ N_+. Clearly, m_n^(1) = M_n^(n). In general, we have m_n^(k) = M_n^{(n−k+1)}, 1 ≤ k ≤ n. Then by (3.1.5) we have

    (m_n^(k) ≤ θn) = (S_n < n − k + 1)

for any θ ∈ R_++ and n ∈ N_+. Hence, for any µ ∈ pr(B_I) such that µ ≪ λ,

    µ(m_n^(k) ≤ θn) = µ(S_n < n − k + 1) = Σ_{j=0}^{n−k} µ(S_n = j) = 1 − Σ_{j=n−k+1}^n µ(S_n = j).

Since n^{−1} S_n converges to 0 in µ-probability as n → ∞ by Corollaries 3.1.3 and 3.1.5, we have

    lim_{n→∞} µ(S_n = n − m) = 0

for any fixed m ∈ N. Consequently,

    lim_{n→∞} µ(m_n^(k) ≤ θn) = 1   (3.1.8)

for any fixed k ∈ N_+. This result is not at all surprising. Indeed, by Proposition 4.1.1 we have

    lim_{n→∞} a_n^(k) = 1 a.e.

for any fixed k ∈ N_+, where a_n^(k) denotes the kth smallest of a_1, ···, a_n. As m_n^(k) ≤ a_n^(k) + 2, n ∈ N_+, 1 ≤ k ≤ n, it follows that

    lim_{n→∞} m_n^(k)/n = 0 a.e.

for any fixed k ∈ N_+, which clearly entails (3.1.8).

Remark. It is proved in Iosifescu (1977) that if (η_n)_{n∈N_+} is a strictly stationary ψ-mixing sequence of positive random variables on a probability space (Ω', K, P) such that for some real-valued function g on N_+ there exists the positive finite limit

    lim_{n→∞} n P(η_n < g(n)) = θ,

say, then P(η_k < g(n) for p values of k, 1 ≤ k ≤ n) → e^{−θ} θ^p/p! as n → ∞ for any fixed p ∈ N. In particular this result applies to a sequence (η_n)_{n∈N_+} for which P(η_1 ≥ x) = log(1 + 1/x)/log 2, x ≥ 1, with

    g(n) = 1 + 2θ log 2/n,   n ∈ N_+.

For such a sequence, similarly to (3.1.4) we can write

    lim_{n→∞} P(n(η_n^(k) − 1)/(2 log 2) ≥ x) = e^{−x} Σ_{j=0}^{k−1} x^j/j!,   x ∈ R_++,   (3.1.9)

for any fixed k ∈ N_+, where η_n^(k) denotes the kth smallest of η_1, ···, η_n, 1 ≤ k ≤ n. We cannot assert that (3.1.9) is true for η_n = a_n, n ∈ N_+, since the equation γ(a_1 ≥ x) = log(1 + 1/x)/log 2 holds just for x ∈ N_+. It is conjectured in Iosifescu (1978) that (3.1.9) holds true for η_n = r_n, n ∈ N_+, under any P ≪ λ. [Notice that γ(r_1 ≥ x) = log(1 + 1/x)/log 2 for any x ≥ 1, but the sequence (r_n)_{n∈N_+} is not ψ-mixing under γ.] □

3.2 Normal convergence

3.2.1 Two general invariance principles

Assume the framework of Subsection 2.1.5. Thus let H be a real-valued function on N_+^Z. Set H̄_l = H̄_1 ∘ τ̄^{l−1}, l ∈ Z, where

    H̄_1 = H(···, ā_{−2}, ā_{−1}, ā_0, ā_1, ā_2, ···).

Then (H̄_l)_{l∈Z} is a strictly stationary process on (I², B_{I²}, γ̄). Set S̄_0 = 0, S̄_n = Σ_{i=1}^n H̄_i − n E_γ̄ H̄_1, n ∈ N_+, assuming that the mean value E_γ̄ H̄_1 exists and is finite. For any n ∈ N_+ let us define the stochastic processes ξ_n^C = (ξ_n^C(t))_{t∈I} and ξ_n^D = (ξ_n^D(t))_{t∈I} by

    ξ_n^C(t) = (1/(σ√n)) (S̄_⌊nt⌋ + (nt − ⌊nt⌋)(H̄_{⌊nt⌋+1} − E_γ̄ H̄_1)),
    ξ_n^D(t) = (1/(σ√n)) S̄_⌊nt⌋,   t ∈ I,

where σ = σ(H) is a positive number which will be specified later. We start with a weak invariance principle.

Theorem 3.2.1. Assume that E_γ̄ H̄_1² < ∞ and

    Σ_{n∈N_+} (E_γ̄ [H̄_1 − E_γ̄(H̄_1 | ā_{−n}, ···, ā_n)]²)^{1/2} < ∞,   (3.2.1)

so that by Propositions 2.1.19 and 2.1.21

    lim_{n→∞} (1/n) E_γ̄ S̄_n² = σ² ≥ 0

exists finitely and is given by the absolutely convergent series

    σ² = E_γ̄ H̄_1² − (E_γ̄ H̄_1)² + 2 Σ_{n∈N_+} (E_γ̄ H̄_1 H̄_{n+1} − (E_γ̄ H̄_1)²).   (3.2.2)

If σ > 0 then γ̄ξ_n^{−1} →^w W in both C and D, where ξ_n stands for either ξ_n^C or ξ_n^D. The last conclusion still holds when γ̄ is replaced by any µ ∈ pr(B_{I²}) such that µ ≪ λ².

Proof. This is a transcription of Theorem 21.1 in Billingsley (1968), with an improvement by Popescu (1978) (concerning the possibility of replacing γ̄ by µ), for the special case of the doubly infinite sequence (ā_l)_{l∈Z}. Note that in Proposition 2.1.22 a class of functions H is indicated for which (3.2.1) holds. □

Next, we state a strong invariance principle.

Theorem 3.2.2. Assume that there exist constants 0 < δ ≤ 2 and c > 0 such that E_γ̄ |H̄_1|^{2+δ} < ∞ and

    (E_γ̄ |H̄_1 − E_γ̄(H̄_1 | ā_{−n}, ···, ā_n)|^{2+δ})^{1/(2+δ)} ≤ c n^{−(2+7/δ)},   n ∈ N_+,   (3.2.3)

so that (3.2.1) holds and

    lim_{n→∞} (1/n) E_γ̄ S̄_n² = σ² ≥ 0

exists finitely and is given by the absolutely convergent series (3.2.2). If σ > 0 then the strong invariance principle holds for the stochastic processes ξ_n^C and ξ_n^D, n ∈ N_+. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))_{t∈I} such that

    sup_{t∈I} |ξ_n(t) − w(t)| = O(n^{−a}) a.s.

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξ_n stands for either ξ_n^C or ξ_n^D.

Proof. This is a transcription of Theorem 7.1.1 in Philipp and Stout (1975) for the special case of the doubly infinite sequence (ā_l)_{l∈Z}. □

For further reference we also consider the special case where H only depends on the coordinates with positive indices of a current point in N_+^Z, i.e., H is a real-valued function on N_+^{N_+}. (Completely similar considerations can be made in the case where H only depends on the coordinates with non-positive indices of a current point in N_+^Z, i.e., H is a real-valued function on N_+^{(−N)}.) In this case we set H_n = H_1 ∘ τ^{n−1}, n ∈ N_+, where H_1 = H(a_1, a_2, ···), and we have a strictly stationary sequence (H_n)_{n∈N_+} on (I, B_I, γ). With the same definitions as before for S_n, ξ_n^C and ξ_n^D, n ∈ N_+, where E_γ̄ H̄_1 is replaced by E_γ H_1, we can state the following special cases of Theorems 3.2.1 and 3.2.2.

Theorem 3.2.1′. Assume that E_γ H_1² < ∞ and

    Σ_{n∈N_+} (E_γ [H_1 − E_γ(H_1 | a_1, ···, a_n)]²)^{1/2} < ∞,   (3.2.1′)

so that

    lim_{n→∞} (1/n) E_γ S_n² = σ² ≥ 0

exists finitely and is given by the absolutely convergent series

    σ² = E_γ H_1² − (E_γ H_1)² + 2 Σ_{n∈N_+} (E_γ H_1 H_{n+1} − (E_γ H_1)²).   (3.2.2′)

If σ > 0 then γξ_n^{−1} →^w W in both C and D, where ξ_n stands for either ξ_n^C or ξ_n^D. The last conclusion still holds when γ is replaced by any µ ∈ pr(B_I) such that µ ≪ λ.

Note that inequality (2.1.32) and Proposition 2.1.23 describe two classes of functions H for which (3.2.1′) holds.

Theorem 3.2.2′. Assume that there exist constants 0 < δ ≤ 2 and c > 0 such that E_γ |H_1|^{2+δ} < ∞ and

    (E_γ |H_1 − E_γ(H_1 | a_1, ···, a_n)|^{2+δ})^{1/(2+δ)} ≤ c n^{−(2+7/δ)},   n ∈ N_+,   (3.2.3′)

so that (3.2.1′) holds and

    lim_{n→∞} (1/n) E_γ S_n² = σ² ≥ 0

exists finitely and is given by the absolutely convergent series (3.2.2′). If σ > 0 then the strong invariance principle holds for the stochastic processes ξ_n^C and ξ_n^D, n ∈ N_+. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))_{t∈I} such that

    sup_{t∈I} |ξ_n(t) − w(t)| = O(n^{−a}) a.s.

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξ_n stands for either ξ_n^C or ξ_n^D.

3.2.2 The case of incomplete quotients

An important special case of Theorem 3.2.1′ is obtained when the function H only depends on finitely many coordinates of a current point of N_+^{N_+}, i.e., when H is a real-valued function on N_+^k for a given k ∈ N_+. In this case H_n = H(a_n, ..., a_{n+k−1}), n ∈ N_+, assumption (3.2.1′) is trivially satisfied, and by Corollary 1.2.5 we have

    E_γ H_1^r = (1/log 2) Σ_{i^(k)∈N_+^k} H^r(i^(k)) log((1 + v(i^(k)))/(1 + u(i^(k))))

with r = 1 or 2, and

    σ² = E_γ H_1² − (E_γ H_1)²
         + 2 Σ_{n∈N_+} (Σ_{i^{(n+k)}∈N_+^{n+k}} (H(i^(k)) H(i_{n+1}, ···, i_{n+k})/log 2) log((1 + v(i^{(n+k)}))/(1 + u(i^{(n+k)}))) − (E_γ H_1)²).   (3.2.2″)

Note that in the case k = 1, by either Corollary 2.1.25 or Proposition A3.4, we have σ = 0 if and only if H = const. It is an open problem to find necessary and sufficient conditions in terms of H, in the case k > 1, for σ = 0 to hold. The special framework assumed allows for an estimate of the convergence rate in the classical central limit theorem. Thus we have the following result.

Theorem 3.2.3. If σ > 0 and

    E_γ |H_1|^{2+δ} = (1/log 2) Σ_{i^(k)∈N_+^k} |H(i^(k))|^{2+δ} log((1 + v(i^(k)))/(1 + u(i^(k)))) < ∞

for some δ > 0, then there exist two positive constants a < 1 and c such that

    |γ((Σ_{j=1}^n H_j − n E_γ H_1)/(σ√n) < x) − Φ(x)| ≤ c n^{−a}

for any x ∈ R and n ∈ N_+.

Proof. This is a transcription of Theorem 1 in Iosifescu (1968) for the special case of the sequence (a_n)_{n∈N_+} of incomplete quotients. □

Remark. It is an open problem to determine the optimal value of a in Theorem 3.2.3. We conjecture that a = δ/2, that is, the same value as in the case of i.i.d. random variables with finite (2 + δ)-absolute moment. □

In what follows, by restricting the class of functions H, we give more precise results in the case k = 1. To emphasize this special framework we change the notation by using the letter f instead of H.

Theorem 3.2.4. Let f : N_+ → R, A_n ∈ R, B_n ∈ R_++, n ∈ N_+, with lim_{n→∞} B_n = ∞, and define

    X_nj = B_n^{−1}(f(a_j) − A_n),   1 ≤ j ≤ n, n ∈ N_+,
    S_n0 = 0,   S_nk = Σ_{j=1}^k X_nj,   1 ≤ k ≤ n,   S_nn = S_n,
    F̃(x) = (1/log 2) Σ_{k:|f(k)|≤x} f²(k) k^{−2},
    F(x) = E_γ f²(a_1) I_(|f(a_1)|≤x) = (1/log 2) Σ_{k:|f(k)|≤x} f²(k) log(1 + 1/(k(k+2))),   x ∈ R_+.

(i) The following assertions are equivalent.

(I) The stochastic process ξ_n = ξ_n^D = (ξ_n(t))_{t∈I}, defined for any n ∈ N_+ by ξ_n(t) = S_{n⌊nt⌋}, t ∈ I, satisfies

    γξ_n^{−1} →^w W_D in B_D,

where W_D is the Wiener measure on B_D.

(II) γS_n^{−1} →^w N(0, 1), and the array X = {X_nj, 1 ≤ j ≤ n, n ∈ N_+} is s.i. under γ.

(ii) When lim_{x→∞} F(x) = E_γ f²(a_1) = ∞, assertion (I) above holds with a bounded sequence (A_n)_{n∈N_+} if and only if

    lim_{x→∞} x² Σ_{k:|f(k)|>x} k^{−2} / Σ_{k:|f(k)|≤x} f²(k) k^{−2} = 0   (3.2.4)

or, equivalently (see Theorem A2.5), if and only if F̃ is slowly varying. If this is the case, then we can take A_n = E_γ f(a_1), n ∈ N_+, and any sequence (B_n)_{n∈N_+} such that lim_{n→∞} n B_n^{−2} F(B_n) = 1.

When E_γ f²(a_1) < ∞, assertion (I) holds with a bounded sequence (A_n)_{n∈N_+} if and only if f is not constant. If this is the case, then we can take A_n = E_γ f(a_1) and B_n = √n σ_(0) (E_γ f²(a_1))^{1/2}, n ∈ N_+, for some σ_(0) > 0.

(iii) If either (I) or (II) holds, then γ can be replaced in (i) by any µ ∈ pr(B_I) such that µ ≪ λ.

Proof. (i) and (iii) follow from Theorem A3.7 and Lemma 3.0.2, respectively. We thus only have to prove (ii).

First, since

    lim_{k→∞} log(1 + 1/(k(k+2))) / k^{−2} = 1,

either F and F̃ both tend to ∞ as x → ∞ and lim_{x→∞} F(x)/F̃(x) = 1, or both have finite limits as x → ∞. Consequently, F is slowly varying if and only if F̃ is.

Assume that (3.2.4) holds. Note that this does always happen when

    0 < E_γ f²(a_1) = lim_{x→∞} F(x) < ∞.

Then Theorem A3.12 applies with X_n = f(a_n), n ∈ N_+, and

    m²(X_1) = (E_γ f(a_1))²/E_γ f²(a_1)  if E_γ f²(a_1) < ∞,
    m²(X_1) = 0  if E_γ f²(a_1) = ∞,

    ϕ_1^(0) = 1,
    ϕ_n^(0) = E_γ f(a_1) f(a_n) / E_γ f²(a_1)  if E_γ f²(a_1) < ∞,
    ϕ_n^(0) = 0  if E_γ f²(a_1) = ∞,

for n ≥ 2 [use Proposition A3.1 and equation (A3.2)], and σ_(0)² equals either

    (E_γ f²(a_1) − (E_γ f(a_1))² + 2 Σ_{n∈N_+} (E_γ f(a_1) f(a_{n+1}) − (E_γ f(a_1))²)) / E_γ f²(a_1)

or 1 according as E_γ f²(a_1) < ∞ or E_γ f²(a_1) = ∞. Noting that when E_γ f²(a_1) < ∞, by Corollary 2.1.25 we have σ_(0) = 0 if and only if f = const., we conclude that with A_n and B_n, n ∈ N_+, as indicated we have γξ_n^{−1} →^w W_D, that is, (I) holds with a bounded sequence (A_n)_{n∈N_+}.

Next, assume that (I) or, equivalently, (II) holds with a bounded sequence (A_n)_{n∈N_+}. Clearly, this cannot happen if f = const. It thus remains to show that F is slowly varying when

    lim_{x→∞} F(x) = ∞.   (3.2.5)

Fix δ ∈ (0, 1) and put X_njδ = X_nj I_(|X_nj|≤δ) − E_γ X_nj I_(|X_nj|≤δ) for any 1 ≤ j ≤ n, n ∈ N_+. As γS_n^{−1} →^w N(0, 1) by (II), it follows from Theorem A3.11(i) that

    lim_{n→∞} E_γ (Σ_{j=1}^n X_njδ)² = 1.   (3.2.6)

On the other hand, it follows from Corollary A3.2 that

    E_γ (Σ_{j=1}^n X_njδ)² ≤ (1 + 2 Σ_{k∈N_+} ψ(k)) n E_γ X_n1² I_(|X_n1|≤δ),   n ∈ N_+.   (3.2.7)

Now, note that |f(i) − A_n| ≤ δB_n entails

    |f(i)| ≤ |A_n| + δB_n = B_n(|A_n| B_n^{−1} + δ) ≤ B_n

for any n large enough, since δ ∈ (0, 1), (A_n)_{n∈N_+} is bounded, and lim_{n→∞} B_n = ∞. Then for such an n we have

    E_γ X_n1² I_(|X_n1|≤δ) ≤ B_n^{−2} E_γ (f(a_1) − A_n)² I_(|f(a_1)|≤B_n) ≤ 2 B_n^{−2} (F(B_n) + A_n²),

whence, by (3.2.5),

    E_γ X_n1² I_(|X_n1|≤δ) ≤ 4 B_n^{−2} F(B_n)   (3.2.8)

for any n large enough. It follows from (3.2.6) through (3.2.8) that there exist c > 0 and n_0 ∈ N_+ such that

    n B_n^{−2} F(B_n) ≥ c,   n ≥ n_0.   (3.2.9)

Finally, by Theorem A3.11 we also have

    lim_{n→∞} n γ(|X_n1| > ε) = 0

for any ε > 0. Since (|X_n1| > ε) = (|f(a_1) − A_n| > εB_n) ⊃ (|f(a_1)| > |A_n| + εB_n) and lim_{n→∞} (|A_n| + εB_n)/εB_n = 1, we then have

    lim_{n→∞} n γ(|f(a_1)| > B_n) = 0.   (3.2.10)

It follows from (3.2.9) and (3.2.10) that

    lim_{n→∞} B_n² γ(|f(a_1)| > B_n) / E_γ f²(a_1) I_(|f(a_1)|≤B_n) = 0.

Noting that lim_{n→∞} B_{n+1}/B_n = 1 (this follows from, e.g., Theorem A3.9, but a direct proof can also easily be given), the last equation implies

    lim_{x→∞} x² γ(|f(a_1)| > x) / E_γ f²(a_1) I_(|f(a_1)|≤x) = 0,

which shows by Theorem A2.5 that F is slowly varying. □

Remarks. 1. Theorem 3.2.4 still holds if we replace D by C, W_D by W_C, and the stochastic process ξ_n^D by the stochastic process ξ_n^C defined by

    ξ_n^C(t) = S_{n⌊nt⌋} + (nt − ⌊nt⌋)(S_{n(⌊nt⌋+1)} − S_{n⌊nt⌋}),   t ∈ I, n ∈ N_+.

This follows from Theorem A3.8.

2. For the many consequences of Theorem 3.2.4 (as well as of other similar further results) concerning, e.g., the asymptotic behaviour as n → ∞ of random variables such as min_{0≤k≤n} S_nk, max_{0≤k≤n} S_nk, max_{0≤k≤n} |S_nk|, and U_n = the number of indices k, 1 ≤ k ≤ n, for which S_nk > 0, we refer the reader to Billingsley (1968, § 11). In particular, in the last case we have an arc-sine law

    lim_{n→∞} µ(U_n/n < a) = (2/π) arcsin √a,   0 ≤ a ≤ 1,

for any µ ∈ pr(B_I) such that µ ≪ λ. □

Example 3.2.5. Let f(n) = n^{a+1/2}, n ∈ N_+, with a ∈ R. Clearly, for a < 0 we have E_γ f²(a_1) < ∞. For a = 0 we have E_γ f²(a_1) = ∞, F(x) ∼ 2 log x/log 2, and x² Σ_{k:|f(k)|>x} k^{−2} = O(1) as x → ∞. Thus (3.2.4) holds and we can take

    A_n = E_γ a_1^{1/2} = (1/log 2) Σ_{k∈N_+} k^{1/2} log(1 + 1/(k(k+2)))

and B_n = (n log n/log 2)^{1/2}, n ∈ N_+. It is easy to check that ζ(3/2)/(6 log 2) < A_n < ζ(3/2)/log 2 and that we can also write

    A_n = (1/log 2) Σ_{k≥2} (2√(k−1) − √k − √(k−2)) log k,   n ∈ N_+.

Finally, for a > 0 we have F(x) ∼ x^{4a/(2a+1)}/(2a log 2) and x² Σ_{k:|f(k)|>x} k^{−2} ∼ x^{4a/(2a+1)} as x → ∞, that is, (3.2.4) does not hold. □
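The two expressions for A_n = E_γ a_1^{1/2} in Example 3.2.5 can be compared numerically (a sketch, not from the book; the 1/log 2 normalization of the rearranged series, obtained by two summations by parts, is part of the reconstruction). Both partial sums converge slowly, at rate about k^{−1/2}, so only rough agreement is asserted.

```python
import math

K = 200000
log2 = math.log(2)
# direct form: A = E_gamma a_1^(1/2)
s1 = sum(math.sqrt(k) * math.log1p(1 / (k * (k + 2))) for k in range(1, K)) / log2
# rearranged form (second differences of sqrt against log k)
s2 = sum((2 * math.sqrt(k - 1) - math.sqrt(k) - math.sqrt(k - 2)) * math.log(k)
         for k in range(2, K)) / log2
zeta32 = sum(k ** -1.5 for k in range(1, K))     # partial sum for zeta(3/2)
print(s1, s2)
print(zeta32 / (6 * log2), zeta32 / log2)        # the stated enclosing bounds
```

Both sums land near 2.07, comfortably inside the bounds ζ(3/2)/(6 log 2) ≈ 0.63 and ζ(3/2)/log 2 ≈ 3.77.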

As a special case of Theorem 3.2.2′ we note the following result.

Proposition 3.2.6. Let f : N_+ → R be a non-constant function. Assume that there exists a constant δ > 0 such that E_γ |f(a_1)|^{2+δ} < ∞. Put S_0 = 0, S_n = Σ_{i=1}^n f(a_i) − n E_γ f(a_1), n ∈ N_+. Let

    σ² = E_γ f²(a_1) − (E_γ f(a_1))² + 2 Σ_{n∈N_+} (E_γ f(a_1) f(a_{n+1}) − (E_γ f(a_1))²),

which by Corollary 2.1.25 is positive. Then the strong invariance principle holds for the stochastic processes ξ_n^C and ξ_n^D, n ∈ N_+. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))_{t∈I} such that

    sup_{t∈I} |ξ_n(t) − w(t)| = O(n^{−a}) a.s.   (3.2.11)

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξ_n stands for either ξ_n^C or ξ_n^D.

Remark. It follows from a general result of Heyde and Scott (1973) that if we only assume E_γ f²(a_1) < ∞, then instead of (3.2.11) we can only assert that

    sup_{t∈I} |ξ_n(t) − w(t)| = o((log log n)^{1/2}) a.s.

as n → ∞, with a random constant implied in o. □
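The normal convergence asserted by Theorem 3.2.4 and Proposition 3.2.6 can be sanity-checked by simulating S_n for the bounded non-constant function f = I_(a=1), i.e. the count of digit 1 among a_1, ..., a_n (a sketch, not from the book; σ is not computed here, so the samples are standardized empirically, and digits are extracted exactly from high-precision random rationals via Euclid's algorithm).

```python
import math, random

def digit_one_count(p, q, n):
    # number of 1's among the first n continued fraction digits of p/q in (0, 1)
    c = 0
    for _ in range(n):
        if p == 0:
            break
        a, r = divmod(q, p)
        c += (a == 1)
        p, q = r, p
    return c

random.seed(3)
n, trials, bits = 100, 2000, 512
freq = math.log(4 / 3) / math.log(2)     # gamma(a_1 = 1) = 0.4150...
samples = [(digit_one_count(random.getrandbits(bits) | 1, 1 << bits, n)
            - n * freq) / math.sqrt(n) for _ in range(trials)]
m = sum(samples) / trials
sd = math.sqrt(sum((s - m) ** 2 for s in samples) / trials)
z = [(s - m) / sd for s in samples]
below = sum(v < 0 for v in z) / trials
within = sum(abs(v) < 1 for v in z) / trials
print(below, within)   # for a normal limit: about 0.5 and about 0.68
```

The empirical proportions of standardized values below 0 and within one standard deviation should be close to the Gaussian values 0.5 and 0.683, up to discreteness and Monte Carlo noise.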

3.2.3 The case of associated random variables

Write b_n for either y_n, r_n or u_n, n ∈ N_+, and respectively b̄_l for either ȳ_l, r̄_l or ū_l, l ∈ Z. We now give a partial extension of Theorem 3.2.4 to the sequence (b_n)_{n∈N_+} in the case of infinite variance.

Theorem 3.2.7. Assume f : [1, ∞) → R_+ is regularly varying of index 1/2, E_γ f²(a_1) = ∞, and f(x) = x^{1/2} L(x), where L(x) = c exp(∫_1^x ε(t) t^{−1} dt), x ≥ 1, with c > 0, ε : [1, ∞) → R_+ continuous, and lim_{t→∞} ε(t) = 0. For any n ∈ N_+ define the stochastic process ξ_n = (ξ_n(t))_{t∈I} by

    ξ_n(t) = (1/B_n) Σ_{j≤⌊nt⌋} (f(b_j) − E_γ̄ f(b̄_0)),   t ∈ I,

with the usual convention which assigns the value 0 to a sum over the empty set, where (B_n)_{n∈N_+} is any sequence satisfying lim_{n→∞} n B_n^{−2} F(B_n) = 1, with F defined as in Theorem 3.2.4, and E_γ̄ f(b̄_0) is equal to

    E_γ̄ f(ȳ_0) = (1/log 2) ∫_1^∞ f(x) dx/(x(x+1)),
    E_γ̄ f(r̄_0) = E_γ f(r_1) = (1/log 2) ∫_1^∞ f(x) dx/(x(x+1)),

or

    E_γ̄ f(ū_0) = (1/log 2) (∫_1^2 (x−1) f(x) dx/x² + ∫_2^∞ f(x) dx/x²)

according as b_n denotes y_n, r_n or u_n, n ∈ N_+. Then µξ_n^{−1} →^w W_D in B_D for any µ ∈ pr(B_I) such that µ ≪ λ.

The proof of Theorem 3.2.7 for the cases where bn = rn or bn = un , n ∈ N+ , can be found in Samur (1989, pp. 75–77). The case where bn = yn , n ∈ N+ , can be treated in a similar manner. 2


We note that the hypothesis of a slowly varying F occurring in Theorem 3.2.4 is replaced here by stronger hypotheses. [By Corollary A2.7(ii) the assumptions on f imply that F is slowly varying.] And even the Karamata representation of f is assumed to present special features (compare with Theorem A2.1).

Example 3.2.8. Let f(x) = x^{1/2}, x ∈ [1, ∞) (cf. Example 3.2.5). Theorem 3.2.7 holds with B_n = (n log n/log 2)^{1/2}, n ∈ N_+, and

    E_γ̄ f(ȳ_0) = E_γ f(r_1) = (1/log 2) ∫_1^∞ dx/(√x (x+1)) = π/(2 log 2),
    E_γ̄ f(ū_0) = (1/log 2) (∫_1^2 (x−1) dx/x^{3/2} + ∫_2^∞ dx/x^{3/2}) = 4(√2 − 1)/log 2.   □
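The closed-form constants in Example 3.2.8 can be verified by quadrature (a sketch, not from the book). The substitution x = 1/t² maps [1, ∞) to (0, 1] and turns √x dx/(x(x+1)) into 2 dt/(1+t²), while the tail ∫_2^∞ x^{−3/2} dx equals √2 exactly, leaving only smooth integrals on finite intervals.

```python
import math

def simpson(f, a, b, m=2000):
    # composite Simpson rule with m (even) subintervals
    h = (b - a) / m
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, m))
    return s * h / 3

log2 = math.log(2)
# E f(y_0) = E f(r_1): after x = 1/t^2 the integrand becomes 2/(1 + t^2) on [0, 1]
Iy = simpson(lambda t: 2 / (1 + t * t), 0.0, 1.0) / log2
# E f(u_0): finite piece on [1, 2] plus the exact tail sqrt(2)
Iu = (simpson(lambda x: (x - 1) * x ** -1.5, 1.0, 2.0) + math.sqrt(2)) / log2
print(Iy, math.pi / (2 * log2))
print(Iu, 4 * (math.sqrt(2) - 1) / log2)
```

Simpson's rule on these smooth integrands is accurate far beyond the tolerances of interest.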

The next result covers the case of finite variance.

Theorem 3.2.9. Let f : [1, ∞) → R. Assume that either

(i) f satisfies a Lipschitz condition of order 0 < ε ≤ 1, that is,

    sup_{x≠y, x,y≥1} |f(x) − f(y)|/|x − y|^ε =: s_ε(f) < ∞,

and ∫_1^∞ |f(x)|^{2+δ} x^{−2} dx < ∞ for some δ ≥ 0, or

(ii) f = I_(b,∞) for some b > 1.

Put S_0 = 0, S_n = Σ_{i=1}^n (f(b_i) − E_γ̄ f(b̄_0)), n ∈ N_+, and for any n ∈ N_+ define the stochastic processes ξ_n^C = (ξ_n^C(t))_{t∈I} and ξ_n^D = (ξ_n^D(t))_{t∈I} on (I, B_I, γ) by

    ξ_n^C(t) = (1/(σ(f)√n)) (S_⌊nt⌋ + (nt − ⌊nt⌋)(f(b_{⌊nt⌋+1}) − E_γ̄ f(b̄_0))),
    ξ_n^D(t) = S_⌊nt⌋/(σ(f)√n),   t ∈ I,

**where σ(f ) is a positive number which is deﬁned by (3.2.12) below. Then 1 lim Eγ n→∞ n
**

n 2

f (bi ) − Eγ f (b0 )

i=1

= σ 2 (f ) ≥ 0

(3.2.12)

**190 exists ﬁnitely. If σ(f ) > 0 then (a) assuming that δ = 0, for any µ ∈ pr(BI ) such that µ
**

−1 µξn → W in both BC and BD , w

Chapter 3

λ we have

C D where ξn stands for either ξn or ξn ; (b) assuming that δ > 0, the strong invariance principle holds for the C D stochastic processes ξn and ξn , n ∈ N+ . That is, without changing their distributions we can redeﬁne these processes on a richer common probability space together with a standard Brownian motion process (w(t))t∈I such that

sup ξn (t) − w(t) = O(n−a ) a.s.

t∈I

as n → ∞, with a random constant implied in O, for each a > 0 small C D enough, depending on δ. Here ξn stands for either ξn or ξn . Proof. We shall show that (a) and (b) follow from Theorems 3.2.1 and 3.2.2, respectively. We use the notation of Subsection 2.1.5 . Deﬁne H ((il )l∈Z ) = f b1 ([i1 , i2 , · · · ], [i0 , i−1 , · · · ]) , H1 = H((al )l∈Z ), Hm = H1 ◦ τ m−1 , Hence h(ω, θ) = (il )l∈Z ∈ NZ , + m ∈ N+ . (3.2.13)

f (1/θ) f (1/ω)

in the case where bl = y l , l ∈ Z, in the case where bl = rl , l ∈ Z, bl = ul , l ∈ Z

f (θ + 1/ω) in the case where

**for (ω, θ) ∈ Ω2 . Also, as in the proof of Proposition 2.1.22 we easily obtain Eγ |H1 − Eγ (H1 | a−n , · · · , an )|2+δ =
**

i−n ,··· ,in ∈N+

1 γ 2+δ (I 2 (i ¯

−n , · · ·

, in ))

γ (dω , dθ ) ¯

I 2 (i−n ,··· ,in ) 2+δ

(3.2.14) .

×

I 2 (i−n ,··· ,in )

(h(ω , θ ) − h(ω, θ))¯ (dω, dθ) γ

Now, under (i) it is easy to check that h satisﬁes an inequality of the form (2.1.30), which yields cn ≤ crn , n ∈ N+ , for some c > 0 and 0 < r < 1,

Limit theorems

191

**with cn , n ∈ N+ , deﬁned as in Proposition 2.1.22. It follows from (3.2.14) that Eγ
**

1/(2+δ)

|H1 − Eγ (H1 | a−n , · · · , an )|2+δ ≤ crn ,

n ∈ N+ .

Hence (3.2.3) clearly holds. Next, we are going to show that under (ii) condition (3.2.3) also holds. In the case where bl = y l , l ∈ Z, for any given n ∈ N+ there is at most one fundamental interval I(i0 , i−1 , ..., i−n ) such that 1/b ∈ I (i0 , i−1 , ..., i−n ). Similarly, in the case where bl = rl , l ∈ Z, for any given n ∈ N+ , there is at most one fundamental interval I(i1 , ..., in ) such that 1/b ∈ I (i1 , ..., in ). Therefore by (3.2.14) in both these cases Eγ |H1 − Eγ (H1 |a−n , ..., an )|2+δ does not exceed (Fn Fn+1 log 2)−1 for all n ∈ N+ , hence (3.2.3) holds. In the case where bl = ul , l ∈ Z, the last integral in (3.2.14) may be diﬀerent from 0 only for those rectangles I 2 (i−n , ..., in ) which are intersected by the hyperbola y + 1/x = 1/b. It is easy to see that for n large enough the total Euclidean area of them does not exceed (Fn Fn+1 )−1 so that (3.2.3) holds in this case, too. To prove (a) note that for δ = 0 by Theorem 3.2.1 we have

−1 µξn −→ W in both BC and BD w

(3.2.15)

2 C D for any µ ∈ pr(BI ) such that µ λ2 , where ξn stands for either ξn or ξn deﬁned as in Section 3.2.1, for our special H given by (3.2.13) and with σ(f ) = σ(H) deﬁned by (3.2.12). But

bn (ω) − bn (ω, θ) ≤ (Fn−1 Fn )−1 ,

n ∈ N+ , (ω, θ) ∈ Ω2 .

**[In the case where bn = rn , n ∈ N+ , we even have bn (ω) = bn (ω, θ), n ∈ N+ , (ω, θ) ∈ Ω2 .] Thus under (i) we have sup ξn (t, ω) − ξn (t, (ω, θ))
**

t∈I

≤ ≤ ≤

1 √ max S (ω) − Si (ω, θ) σ(f ) n 1≤i≤n i 1 √ σ(f ) n sε (f ) √ σ(f ) n

n

f (bi (ω)) − f bi (ω, θ)

i=1

bi (ω) − bi (ω, θ)

i=1

ε

= O n−1/2

192

Chapter 3

**as n → ∞, with a non-random constant independent of (ω, θ) ∈ Ω2 implied in O, while under (ii) it is easy to see that sup ξn (t, ω) − ξn (t, (ω, θ))
**

t∈I

≤ ≤

1 √ σ(f ) n

n

I(b,∞) (bi (ω)) − I(b,∞) (bi (ω, θ))

i=1

O(1) √ = O n−1/2 σ(f ) n

γ-a.s.

**with a random constant implied in O. Therefore in both cases sup ξn (t, ω) − ξn (t, (ω, θ)) = O n−1/2
**

t∈I

µ-a.s.

(3.2.16)

2 for any µ ∈ pr(BI ) such that µ λ2 . Now, (3.2.15) and (3.2.16) imply at once that w µξn−1 −→ W in both BC and BD

**for any µ ∈ pr(BI ) such that µ λ. To prove (b) note that for δ > 0 by Theorem 3.2.2 we have sup |ξn (t) − w(t)| = O(n−a ) a.s.
**

t∈I

as n → ∞. By (3.2.16) it is obvious that the strong invariance principle C D holds as stated for the stochastic processes ξn or ξn , n ∈ N+ . 2 In the case where bn = rn , n ∈ N+ , under diﬀerent assumptions on f , we can derive from Theorems 3.2.1 and 3.2.2 the following result. Theorem 3.2.10 Let f : [1, ∞) → R and deﬁne the function g by g(u) = f (1/u) , u ∈ (0, 1]. Assume that g is a function of bounded pvariation, p ≥ 1. Put

n

S0 = 0, Sn =

i=1

f (ri ) − nEγ f (r1 ),

n ∈ N+ .

**Then the series σ 2 (f ) =
**

I

g 2 dγ −

I

2

gdγ

+2

n∈N+ I

g U n gdγ −

I

2

gdγ

converges absolutely. If σ(f ) = 0 then both the weak and strong invariance principles hold as described in Theorems 3.2.1 and 3.2.2 for the stochastic

Limit theorems

193

C D processes ξn and ξn , n ∈ N+ , deﬁned as in Theorem 3.2.9 with bn = rn , n ∈ N+ .

**Proof. In this case the function H considered in Theorems 3.2.1 and 3.2.2 is deﬁned by H (i1 , i2 , ...) = g ([i1 , i2 , ...]) , (in )n∈N+ ∈ N+ + .
**

N

It follows from Proposition 2.1.23 and its proof that both (3.2.1 ) and (3.2.3 ) hold in our special case, hence the present statement. 2 Remark. Convergence rates in the central limit theorem are available for the sequence ( n f (ri ) − nEγ f (r1 ))n∈N+ . Hofbauer and Keller (1982, p. i=1 133) proved that sup γ

x∈R n i=1 f (ri )

− nEγ f (r1 ) √ < x − Φ(x) = O(n−a ) σ(f ) n

as n → ∞ for some 0 < a ≤ 1/2. Rousseau-Eg`le (1983) showed that in the e case p = 1 we can take a = 1/2. See also Iosifescu and Grigorescu (1990, pp. 212–213) and Miseviˇius (1971). c 2 Example 3.2.11 Let f (x) = log x, x ∈ [1, ∞). This is clearly a Lipschitz function since f (x) = 1/x ≤ 1 for any x ∈ [1, ∞). Also, it is easy to α see that Eγ f (b0 ) < ∞ for any α ∈ R+ . In the cases where bn = yn or bn = rn , n ∈ N+ , Theorem 3.2.9 holds with Eγ f (b0 ) = = = 1 log 2 1 log 2 1 log 2 1 log 2

∞ 1

log x dx x(x + 1) x+1 x

∞ 1 ∞ ∞

− log x log

+

1 1

1 1 log 1 + x x

dx

k∈N+

(−1)k+1 k (−1)k+1 k2

dx xk+1

=

k∈N+

=

π2 12 log 2

while the corresponding σ(f ) = σ < ∞ is non-zero. This can be shown as follows. By the reversibility of (¯ ) ∈Z —see Subsection 1.3.3—the ﬁnite a

**194 dimensional distributions under γ of (¯ ) ¯ y σ2 1 = lim n Eγ n→∞ 1 = lim n Eγ n→∞ 1 = lim n Eγ n→∞
**

n i=1 n i=1 n i=1 ∈Z

Chapter 3 and (¯ ) r

∈Z

**are identical. Then
**

2

2 log y i − π 12 log 2

2 log ri − π 12 log 2

2

2 log ri − π 12 log 2

2

.

So, σ 2 coincides with (2.1.33) in the case where the function h is deﬁned by h(ω) = log π2 1 − , ω 12 log 2 ω ∈ Ω.

It is easy to check that U h ∈ BV (I) while h is essentially unbounded. Hence σ = 0 by Proposition 2.1.24. It is worth mentioning that Mayer (1990) showed that −π 2 /12 log 2 is the value at β = 2 of the ﬁrst derivative of the dominant eigenvalue λ(β) of the Mayer–Ruelle operator Gβ . See Theorem 2.4.7. Also, Hensley (1994) showed that σ 2 = λ (2) − (λ (2))2 > 1/6. Note that in the case where bn = yn , n ∈ N+ , we have

n

Sn =

i=1

log yi −

nπ 2 nπ 2 = log qn − , 12 log 2 12 log 2

n ∈ N+ .

In this case convergence rates in the central limit theorem are available. Miseviˇius (1981) proved that c sup λ

x∈R

log qn − nπ 2 /12 log 2 √ < x − Φ(x) = O σ n

log n √ n

(3.2.17)

as n → ∞. Vall´e (1997) was able to obtain the optimal convergence rate e in (3.2.17) using Mayer–Ruelle operators. She proved that for µ ∈ pr(BI ) such that µ λ and the Radon–Nikodym derivative dµ/dλ is analytic and strictly positive in I, we have sup µ

x∈R

log qn − nπ 2 /12 log 2 √ < x − Φ(x) = O σ n

1 √ n

(3.2.18)

Limit theorems

195

as n → ∞. The same result for µ = λ had been also obtained by Morita (1994). For further results on the sequence (log qn )n∈N+ see Miseviˇius c (1992) and Vall´e (1997). See also Example 3.4.6. e From (3.2.18), using the double inequality 1

2 2qn+1 (ω)

≤ ω−

pn (ω) 1 ≤ 2 , qn (ω) qn (ω)

ω ∈ Ω, n ∈ N+ ,

we can derive the corresponding result for the random variable zn deﬁned by pn (ω) zn (ω) = ω − , ω ∈ Ω, n ∈ N+ . qn (ω) We have µ log zn + nπ 2 /6 log 2 √ < x − Φ(x) = O 2σ n 1 √ n

as n → ∞. The details are left to the reader. In the case where bn = un , n ∈ N+ , Theorem 3.2.9 should hold with Eγ f (b0 ) = = 1 log 2 1 log 2

2 1

(x − 1) log x dx + x2

∞ 2

log x dx x2 =1+ 1 log 2 2 2

1 1 1 (log x − 1) |2 + (log x)2 |2 − (log x − 1) |∞ 1 1 2 x 2 x

while we conjecture that σ(f ) is non-zero.

Example 3.2.12 Let f (x) = 1/x, x ∈ [1, ∞). This is also a Lipschitz function since | f (x) | = 1/x2 ≤ 1 for all x ∈ [1, ∞) while g(ω) = f (1/ω), ω ∈ Ω, is a function of bounded variation. Both Theorems 3.2.9, in the case where bn = rn , n ∈ N+ , and 3.2.10 hold with Eγ f (r1 ) = Eγ f (r0 ) = 1 log 2

∞ 1

x2 (x

dx 1 = −1 + 1) log 2

while the corresponding σ(f ) = σ is non-zero. Indeed, σ 2 coincides with (2.1.33) in the case where the function h is deﬁned by h(ω) = ω − and Proposition 2.1.26 applies. 1 + 1, log 2 ω ∈ Ω, 2

196

Chapter 3

3.3

3.3.1

**Convergence to non-normal stable laws
**

The case of incomplete quotients

**We start with a result which parallels Theorem 3.2.4. Theorem 3.3.1 Let f : N+ → R, An ∈ R, Bn ∈ R++ , n ∈ N+ , with limn→∞ Bn = ∞, and deﬁne
**

−1 Xnj = Bn (f (aj ) − An ) , k

1 ≤ j ≤ n,

Sn0 = 0,

Snk =

j=1

Xnj ,

1 ≤ k ≤ n,

Snn = Sn ,

n ∈ N+ .

**Let k1 , k2 ≥ 0, k1 + k2 > 0, α ∈ (0, 2), and denote by ν = ν(k1 , k2 , α) the stable p.m. c1 Pois µ(k1 , k2 , α) (see Section A1.5). (i) The following assertions are equivalent.
**

D (I) The stochastic process ξn = ξn = (ξn (t))t∈I deﬁned for any n ∈ N+ by ξn (t) = Sn nt , t ∈ I, satisﬁes −1 γξn → Qν in BD , w

**where the p.m. Qν is deﬁned as in Section A3.3.
**

−1 (II) γSn → ν, under γ. w

and the array X = {Xnj , 1 ≤ j ≤ n, n ∈ N+ } is s.i.

**(ii) Assertion (I) above holds if and only if F (x) =
**

{k:|f (k)|>x}

k −2 , x ∈ R+ , is regularly varying of index − α (3.3.1)

and

x→∞

lim

1 F (x) {k:f (k)>x} 1

k −2 =

k1 , k1 + k2 (3.3.2)

x→∞

lim

F (x) {k:f (k)<−x}

k −2

k2 = k1 + k2

**or, equivalently (see Theorem A2.5), if and only if F (x) = (log 2)−1
**

{k:|f (k)|≤x}

f 2 (k)k −2 ,

x ∈ R+ ,

Limit theorems

197

is regularly varying of index 2 − α and (3.3.2) holds or, equivalently, if and only if x2 F (x) 2−α lim = log 2 x→∞ F (x) α and (3.3.2) holds. If this is the case, then we can take An = Eγ f (a1 )I(|f (a1 )|≤Bn ) , and any sequence (Bn )n∈N+ such that

n→∞ −2 lim nBn F (Bn ) = (k1 + k2 )/(2 − α).

n ∈ N+ ,

(iii) If either (I) or (II) above holds, then γ can be replaced in (i) by any µ ∈ pr (BI ) such that µ λ. Proof. (i) and (iii) follows from Theorem A3.7 and Lemma 3.0.2, respectively. The proof of (ii) is entirely similar to that working in the case of i.i.d. random variables. See Samur (1989, p. 62) and Araujo and Gin´ (1980, pp. e 81, 84–85, 87–88). 2 Remark. In principle, from Theorem 3.3.1 we might derive the asymptotic behaviour as n → ∞ of random variables as, e.g.,

0≤k≤n

min Snk ,

0≤k≤n

max Snk ,

or

0≤k≤n

max |Snk |.

**This depends on the possibility of determining the distribution of the random vector inf ξν (t), sup ξν (t), ξν (1) ,
**

t∈I t∈I

where ξν = (ξν (t))t∈I is a stochastic process with stationary independent increments, ξν (0) = 0 a.s., trajectories in D, and ξν (1) having probability distribution ν (see Section A3.3). Note that this problem could be solved in the case of normal convergence, when ν is the standard normal distribution and ξν is the standard Brownian motion process—see Remark 2 following Theorem 3.2.4. 2 Corollary 3.3.2 Let k1 , k2 , α, and ν = ν(k1 , k2 , α) be as in Theorem 3.3.1. (i) Let f ∈ F (see Section A2.3). Then (3.3.1) and (3.3.2) hold if and only if f is regularly varying of index 1/α.

198

Chapter 3

(ii) Assume f : [1, ∞) → R++ is bounded on ﬁnite intervals and regularly varying of index 1/α. Let α δα/(1−α) log 2 ∗ ν , 0, α if α = 1, log 2 να = 1 ν , 0, 1 if α = 1, log 2 and for any n ∈ N+ deﬁne the stochastic process ηn = (ηn (t))t∈I by f (aj ) if α < 1, j≤ nt 1 f (aj ) − Eγ f (a1 )I(f (a1 )≤f (n)) if α = 1, ηn (t) = × f (n) j≤ nt (f (aj ) − Eγ f (a1 )) if α > 1,

j≤ nt

with the usual convention which assigns value 0 to a sum over the empty set. Then −1 w µηn → Qνα in BD for any µ ∈ pr(BI ) such that µ λ. Proof. (i) By Lemma A2.6(iii) it is suﬃcient to show that k −2 ∼ (f1 (x))−1 as x → ∞.

{k:f (k)>x}

(3.3.3)

**For any x ≥ 1 by the deﬁnition of f1 and f2 (see Section A2.3) we have {k : k > f2 (x)} ⊂ {k : f (k) > x} ⊂ {k : k ≥ f1 (x)}. Hence 1≤
**

{k:f (k)>x}

(3.3.4)

k −2 k

k>f2 (x) −2

k −2 ≤1+

f1 (x)≤k≤f2 (x)

k −2

k>f2 (x)

(3.3.5)

**for any x ≥ 1. But k −2 ≤ (f1 (x) − 1)−1 − (f2 (x))−1 ,
**

f1 (x)≤k≤f2 (x)

(3.3.6)

Limit theorems k −2 ≥ (f2 (x) + 1)−1

k>f2 (x)

199 (3.3.7)

**for any x ≥ 1, and (f1 (x))−1 ∼ (f2 (x))−1 ∼
**

k>f2 (x)

k −2 as x → ∞.

(3.3.8)

Now, (3.3.3) follows from (3.3.5) through (3.3.8). (ii) By Lemma A2.6(ii) we have f ∈ F. It follows from (i) above and Theorem 3.3.1 that −1 w µξn → Qνα in BD for any µ ∈ pr(BI ) such that µ ξn = (ξn (t))t∈I is deﬁned by ξn (t) = 1 Bn λ, where for any n ∈ N+ the process

f (aj ) − Eγ f (a1 )I(f (a1 )≤Bn ) , t ∈ I,

j≤ nt

with Bn satisfying

n→∞ −2 lim n Bn F (Bn ) =

k1 + k2 . 2−α

(3.3.9)

**It is therefore suﬃcient to prove that in (3.3.9) we can take Bn = f (n), n ∈ N+ , k1 = α/ log 2, k2 = 0, and that
**

n→∞

lim Eγ (ηn (1) − ξn (1)) Eγ f (a1 )I(f (a1 )≤f (n)) n = lim × n→∞ f (n) −Eγ f (a1 )I(f (a1 )>f (n)) = α . (1 − α) log 2 if α < 1, if α > 1 (3.3.10)

To proceed notice ﬁrst that by the very deﬁnition of f1 and f2 we have f1 (f (n) − 1) ≤ n ≤ f2 (f (n)) , n ∈ N+ .

Since f1 is regularly varying, by Corollary A2.2(i) we have f1 (f (n) − 1) ∼ f1 (f (n)) as n → ∞.

200 As f1 ∼ f2 , it follows that fi (f (n)) ∼ n as n → ∞, i = 1, 2.

Chapter 3

(3.3.11)

**Taking up (3.3.9) we begin by noting that (3.3.4) implies that f 2 (k)k −2
**

k<f1 (x)

f 2 (k)k −2 ≤

{k:f (k)<x}

f (k)k

k≤f2 (x)

2

−2

f 2 (k)k −2

k≤f2 (x)

≤1

(3.3.12)

for all x ≥ 1. Next, we use Theorem A2.3 taking L(x) = x−2/α f 2 ( x ) ( x + 1) / x , which is a slowly varying function. We easily obtain lim x

2 −2 k≤x f (k)k f 2 (x)

x ≥ 1,

x→∞

=

α . 2−α

(3.3.13)

Clearly, (3.3.13) also holds when k≤x is replaced by k<x . Because f1 ∼ f2 and f is regularly varying, it follows from (3.3.13) that the ﬁrst fraction in (3.3.12) tends to 1 as x → ∞. Then by (3.3.13) again and (3.3.11) we obtain n n F (f (n)) ∼ f 2 (k)k −2 2 (n) 2 (n) log 2 f f

k≤f2 (f (n))

∼ ∼

1 n f 2 (f2 (f (n))) α log 2 f2 (f (n)) f 2 (n) 2−α α as n → ∞, (2 − α) log 2

(3.3.14)

**that is, (3.3.9) is satisﬁed as stated. Now, coming to (3.3.10) assume ﬁrst α < 1. Then since lim log 1 +
**

1 k(k+2) k −2

k→∞

=1

(3.3.15)

and

k∈N+

f (k)k −2 = ∞, we have 1 log 2 f (k)k −2 as n → ∞.

{k:f (k)≤f (n)}

Eγ f (a1 )I(f (a1 )≤f (n)) ∼

Limit theorems Therefore the asymptotic behaviour of n Eγ f (a1 )I(f (a1 )≤f (n)) f (n)

201

as n → ∞ can be obtained from (3.3.14) by replacing f 2 by f, thus α by 2α (note that while f 2 is regularly varying of index 2/α, f is regularly varying of index 1/α). Thus α n Eγ f (a1 )I(f (a1 )≤f (n)) ∼ as n → ∞, f (n) (1 − α) log 2 that is, (3.3.10) holds when α < 1. Finally, let α > 1. We now use Theorem A2.4 taking L(x) = x−1/α f ( x ) ( x + 1) / x , which is a slowly varying function. We easily obtain lim x

k≥x f (k)k −2

x ≥ 1,

x→∞

f (x)

k≥x

=

α . α−1

k>x .

(3.3.16) By (3.3.4),

Clearly, (3.3.16) also holds when similarly to (3.3.12) we have

is replaced by

Eγ f (a1 )I(a1 >f2 (f (n))) Eγ f (a1 )I(f (a1 )>f (n)) ≤ ≤ 1, Eγ f (a1 )I(a1 ≥f1 (f (n))) Eγ f (a1 )I(a1 ≥f1 (f (n)))

n ∈ N+ .

(3.3.17)

It follows from (3.3.16) that the ﬁrst fraction in (3.3.17) tends to 1 as n → ∞. Notice then that since k∈N+ f (k)k −2 < ∞, by (3.3.15 ) we have Eγ f (a1 )I(a1 ≥f1 (f (n))) ∼ 1 log 2 f (k)k −2 as n → ∞.

k≥f1 (f (n))

**Using (3.3.16) again we thus obtain n Eγ f (a1 )I(f (a1 )>f (n)) ∼ f (n) ∼ ∼ n f (n) log 2 f (k)k −2
**

k≥f1 (f (n))

1 n f (f1 (f (n))) α log 2 f1 (f (n)) f (n) α−1 α as n → ∞, (α − 1) log 2

202 that is, (3.3.10) holds when α > 1, too.

Chapter 3 2

To complete the remark following Theorem 3.3.1 we note that Corollary 3.3.2 allows to derive in some cases the asymptotic behaviour as n → ∞ of the random variable Un = number of indices k, 1 ≤ k ≤ n, for which Snk > 0. Proposition 3.3.3 Assume f is bounded on ﬁnite intervals and regularly varying of index 1/α with 1 < α < 2. Then lim µ Un <x n = lim µ

n→∞

n→∞

(3.3.18)

k j=1 f (aj )

card 1 ≤ k ≤ n :

> kEγ f (a1 )

< x

n

x 0

=

sin(π/α) π

t1−1/α (1

dt , − t)1/α λ.

0 ≤ x ≤ 1,

for any µ ∈ pr(BI ) such that µ

Proof. It is easy to check that να deﬁned in Corollary 3.3.2 is a strictly stable probability and να ((0, ∞)) = 1/α for any 1 < α < 2. Then (3.3.18) is an immediate consequence of Theorem 5.1 in de Acosta (1982). 2 Remarks. 1. Proposition 3.3.3 holds for α = 2, too. In this case the limiting distribution is the classical arc-sine law mentioned in Remark 2 following Theorem 3.2.4. However, the assumption on f in Proposition 3.3.3 is slightly stronger [cf. Corollary A2.7(ii)] than the assumption on f in Theorem 3.2.4, under which the arc-sine law holds. 2. It follows from Proposition 3.3.3 [cf. Theorem 5.2 in de Acosta (1982)] that µ (λ (t ∈ I : ξνα (t) > 0) < x) = sin(π/α) π

x 0

t1−1/α (1

dt , − t)1/α

0 ≤ x ≤ 1,

for any 1 < α < 2. This generalizes P. L´vy’s arc-sine law for Brownian e motion. 2

3.3.2

Sums of incomplete quotients

From Corollary 3.3.2 we can derive results for the sums tn = n aj , n ∈ j=1 N+ , of incomplete coeﬃcients by taking f (x) = x, x ∈ [1, ∞). In this case

**Limit theorems we have An = Eγ a1 I(a1 ≤n) = Hence An = 1 log 2 1 = log 2
**

n

203

j log

j=1

(j + 1)2 j(j + 2) n+2 n+1 , n ∈ N+ .

log(n + 2) − (n + 1) log

1 (log n − 1 + o(1)) log 2

(3.3.19)

**as n → ∞. For any µ ∈ pr(BI ) such that µ λ by Corollary 3.3.2(ii) we have w µ (ηn (1))−1 → ν1 , (3.3.20) where ηn (1) = 1 n
**

n

(aj − An ) ,

j=1

n ∈ N+ .

**It follows from (3.3.19) and (3.3.20) that µ (ζn (1))−1 → δ(C−1)/ log 2 ∗ ν1 := ν , where 1 ζn (1) = n
**

n w

(3.3.21)

aj +

j=1

C − log n log 2

,

n ∈ N+ ,

and C = 0.57722 · · · is Euler’s constant. Note that the ch.f. of ν is ν (t) = exp − π 2 log 2 2 1 + i sgn t log |t| |t| , π t ∈ R,

see Section A1.5. Hence ν is strictly stable. A convergence rate in (3.3.21) is available in the special case where µ = γ. Heinrich (1987) proved that there exists c0 ∈ R++ such that γ (ζn (1) < x) − ν ((−∞, x)) ≤ c0 (log n)2 n (3.3.22)

**for any n ∈ N+ and x ∈ R. To conclude let us note that (3.3.21) is a special case of
**

−1 µζn −→ Qν in BD , w

**204 where for any n ∈ N+ the process ζn = (ζn (t))t∈I is deﬁned by ζn (t) = 1 n aj +
**

j≤ nt

Chapter 3

C − log n log 2

,

t ∈ I.

**As a consequence (compare with Remark 2 following Proposition 3.3.3) we have
**

n→∞

lim µ

card{1 ≤ k ≤ n :

k j=1 aj

> k(log n − C)/ log 2}

n = µ (λ(t ∈ I : ξν (t) > 0) < x) , 0 ≤ x ≤ 1.

<x

An explicit expression of the last distribution function is not known. Immediate consequences of (3.3.21) and (3.3.22) are that (i) for any µ ∈ pr(BI ) such that µ λ we have tn 1 −→ n log n log 2 in µ-probability as n → ∞, (3.3.23)

and (ii) for any ε > 0 and n ∈ N+ we have γ 1 tn − ≤ε n log n log 2 ≥ν −ε log n + C C , ε log n + log 2 log 2 − 2c0 (log n)2 . n

Khintchine (1934/35) proved using (3.3.23) that the series n∈N+ 1/tn is divergent a.e. in I. A stronger result is Theorem 3.3.4 below. This was stated by Doeblin (1940), but his proof is incorrect. We reproduce here the proof of Iosifescu (1996). Theorem 3.3.4 The series 1 log 2 − tn n log n

n≥2

is absolutely convergent a.e. in I . Proof. In what follows, the letter c with diﬀerent indices will denote suitable positive constants. Let h : N+ → N+ be a function such that limn→∞ h(n) = ∞. For any n ∈ N+ put

n

tn (h) =

i=1

ai I(ai ≤h(n)) .

Limit theorems

205

It follows from (3.3.19) and the strict stationarity of (an )n∈N+ under γ that Eγ tn (h) = n (log h(n) − 1 + o(1)) log 2 (3.3.24)

**as n → ∞. Next, for any n ∈ N+ we have Eγ a2 I(a1 ≤n) = 1 and Corollary A3.2 yields Eγ (tn (h) − Eγ tn (h))2 ≤ c2 nh(n), n ∈ N+ . (3.3.25) 1 log 2
**

n

j 2 log 1 +

j=1

1 j(j + 2)

≤ c1 n,

¯ Now, write tn = tn (h) for h(n) = n log4/3 n + 1 and tn = tn (h) for h(n) = n, n ∈ N+ . For any n ≥ 3 by (3.3.24) we have log 2 log log n 1 . − ≤ c3 Eγ tn n log n n log2 n

2 Since the series n≥3 (log log n)/n log n is convergent, it is suﬃcient to prove that the series 1 1 − (3.3.26) tn Eγ tn n≥2

**is absolutely convergent a.e. in I. For any n ≥ 2 consider the random events A1 (n) = A1 = tn > 3 Eγ tn , 2 A3 (n) = A3 = A4 (n) = A4 =
**

1 2 Eγ tn 1 2 Eγ tn

A2 (n) = A2 = tn < 1 Eγ tn , 2

≤ tn ≤ 3 Eγ tn ∩ tn = tn , 2 ≤ tn ≤ 3 Eγ tn ∩ tn = tn . 2

Let us ﬁnd upper bounds for the γ-probabilities of A1 , A2 , and A3 .We have A1 = tn − Eγ tn > 1 Eγ tn ⊂ 2 tn − Eγ tn > 1 Eγ tn . 2

**By (3.3.24) and (3.3.25) the Bienaym´–Chebyshev inequality implies e γ(A1 ) ≤ 4c2 n2 log4/3 n + 1 / Eγ tn
**

2

≤ c4 (log n)−2/3 .

(3.3.27)

206

Chapter 3

Since tn ≤ tn , n ∈ N+ and Eγ tn /2 − Eγ tn < 0 for n large enough, for such an n we have A2 = tn < 1 Eγ tn 2 ⊂ ⊂ tn < 1 Eγ tn = tn − Eγ tn < 1 Eγ tn − Eγ tn 2 2 tn − Eγ tn > Eγ tn − 1 Eγ tn . 2

**Again by (3.3.24) and (3.3.25), the Bienaym´–Chebyshev inequality implies e γ(A2 ) ≤ Noting that (tn = tn ) =
**

i=1

c2 n2 Eγ tn − Eγ tn /2

n 2

≤ c5 (log n)−2 .

(3.3.28)

ai > n( log4/3 n + 1) ,

**whence γ(tn = tn ) ≤ nγ a1 > n we obviously have γ(A3 ) ≤ c6 (log n)−4/3 . Next, let us ﬁnd an upper bound for Eγ where Ii (n) =
**

Ai

log4/3 n + 1

≤ c6 (log n)−4/3 ,

(3.3.29)

(3.3.30)

1 1 − = tn Eγ tn

4

Ii (n),

i=1

1 1 − dγ, tn Eγ tn

1 ≤ i ≤ 4.

Since tn ≤ tn , n ∈ N+ , on A1 we have 1 1 2 ≤ < . tn tn 3Eγ tn It follows from (3.3.24), (3.3.27), and (3.3.31) that I1 (n) ≤ c7 n−1 (log n)−5/3 . (3.3.32) (3.3.31)

Since tn ≥ n, n ∈ N+ , by (3.3.24), (3.3.28), and (3.3.30) we have I2 (n) ≤ c8 n−1 (log n)−2 , I3 (n) ≤ c9 n−1 (log n)−4/3 . (3.3.33)

**Limit theorems Finally, set wn = (tn − Eγ tn )/Eγ tn and note that by (3.3.24) and (3.3.25) we have
**

1/2 2 Eγ |wn | ≤ Eγ wn ≤ c10 (log n)−1/3 .

207

Since on A4 we have tn = tn and 2/3 ≤ 1/(1 + wn ) ≤ 2, it follows that I4 (n) = ≤ 1 1 dγ = − tn Eγ tn |wn | dγ (1 + wn )Eγ tn

A4

A4

(3.3.34)

2 E |w | ≤ c11 n−1 (log n)−4/3 . ¯ γ n Eγ tn

Therefore by (3.3.32) through (3.3.34) we have Eγ 1 1 − = O n−1 (log n)−4/3 tn Eγ tn

−1 −4/3 is convergent, by Beppo as n → ∞. As the series n≥2 n (log n) Levy’s theorem series (3.3.26) is absolutely convergent a.e. in I. The proof is complete. 2

Corollary 3.3.5 We have

n→∞

lim

n i=1 1/ti

log log n

= log 2

a.e..

Proof. This follows immediately from Theorem 3.3.4 since, as is well known, n 1 lim − log log n n→∞ i log i

i=1

exists and is ﬁnite. 2 For further results on the sums tn , n ∈ N+ , see Theorem 4.1.9 and its corollaries.

3.3.3

The case of associated random variables

We shall now show that Corollary 3.3.2 still holds in the case where α < 1 when aj is replaced by either yj , rj , or uj , j ∈ N+ . This will follow from the result below (compare with Lemma 3.1.4).

208

Chapter 3

Lemma 3.3.6 Let bn , n ∈ N+ , be real-valued random variables on (I, BI ) such that an ≤ bn ≤ an + c, n ∈ N+ , for some c ∈ R+ . For any n ∈ N+ consider the stochastic processes ηn = (ηn (t))t∈I and ηn = (ηn (t))t∈I deﬁned by ηn (t) = 1 f (n) f (aj ),

j≤ nt

ηn (t) =

1 f (n)

f (bj ),

j≤ nt

t ∈ I,

with the usual convention which assigns value 0 to a sum over the empty set, where f : [1, ∞) → R++ is bounded on ﬁnite intervals and regularly varying of index β > 1. Then d0 (ηn , ηn ) converges to 0 in γ-probability as n → ∞. Proof. Write f (x) = xβ L(x), x ∈ [1, ∞), where L is slowly varying. For any n ∈ N+ we have d0 (ηn , ηn ) ≤ sup |ηn (t) − ηn (t)|

t∈I

≤ where δn = 1 f (n)

n j=1

1 |f (aj ) − f (bj )| ≤ δn + δn , f (n) j=1 1 f (n)

n j=1

n

(3.3.35)

bβ − aβ L(aj ), j j

δn =

bβ |L(bj ) − L(aj )| . j

Using the inequality (1 + a)α − 1 ≤ a {α} + α (1 + a)α−1 , valid for nonnegative a and α, we obtain bβ − aβ ≤ cβ(1 + c)β−1 aβ−1 , j j j whence δn ≤ cβ(1 + c)β−1 Writing a−1 f (aj ) = a−1 f (aj )I(aj ≤M ) + a−1 f (aj )I(aj >M ) , j j j for an arbitrarily given M ≥ 1, we easily obtain n f (i) 1 1 δn ≤ cβ(1 + c)β−1 max + f (n) 1≤i≤M i M f (n) 1 ≤ j ≤ n, f (aj ) .

j=1

1 ≤ j ≤ n, a−1 f (aj ). j

1 f (n)

n j=1

n

**Limit theorems Then for any ε > 0 by Corollary 3.3.2(ii) we have lim sup γ δn > cβ(1 + c)β−1 ε
**

n→∞

209

≤ lim sup γ (ηn (1) > M ε/2)

n→∞

≤ ν1/β

Mε ,∞ 2

−→ 0 as M → ∞.

**Hence δn converges to 0 in γ-probability as n → ∞. Next, for any ﬁxed M ≥ 1 we can write n bj β 1 δn ≤ f (aj ) I(aj ≤M ) f (bj ) + f (n) aj
**

j=1 n

bj aj

β

+

j=1

f (aj )

L(bj ) − 1 I(aj >M ) L(aj ) n f (n)

n

≤

1 + (1 + c)β (1 + c)β + f (n)

sup

1≤x≤M +c

f (x)

sup

0≤s≤c, x>M

L(x + s) −1 L(x)

f (aj ).

j=1

**Given η > 0, choose M ≥ 1 such that sup
**

0≤s≤c

L(x + s) −1 ≤η L(x)

for x > M, which is possible by the Karamata representation of L (see Theorem A2.1). Then for any ε > 0 by Corollary 3.3.2(ii) again we have lim sup γ(δn > ε) ≤ lim sup γ ηn (1) > η −1 (1 + c)−β ε/2

n→∞ n→∞

≤ ν1/β

η −1 (1 + c)−β ε ,∞ 2

−→ 0 as η → 0.

Hence δn converges to 0 in γ-probability as n → ∞.

210 By (3.3.35) the proof is complete.

Chapter 3 2

**Corollary 3.3.7 Let bn denote either yn , rn or un , n ∈ N+ . For any n ∈ N+ consider the stochastic process 1 ηn = f (bj ) f (n)
**

j≤ nt t∈I

with the usual convention which assigns value 0 to a sum over the empty set, where f : [1, ∞) → R++ is bounded on ﬁnite intervals and regularly varying of index 1/α, 0 < α < 1. Let µ ∈ pr(BI ) such that µ λ. Then

−1 µηn → Qνα in BD . w

Proof. Lemma 3.3.6 applies with c = 1 in the case of yn and rn and with c = 2 in the case of un . Since µ λ, the distance d0 (ηn , ηn ) converges to 0 in µ-probability, too, as n → ∞. This property and Corollary 3.3.2(ii) imply the result stated. 2 In the case where α ≥ 1 we have results which complement Theorem 3.2.7. Write b0 for either y 0 , r0 or u0 . Theorem 3.3.8 Let bn denote either yn , rn or un . Assume f : [1, ∞) → R++ is regularly varying of index 1/α, α ∈ [1, 2), Eγ f 2 (a1 ) = ∞, and f (x) = x1/α L(x), where L(x) = c exp c > 0, ε : [1, ∞) → R continuous, and deﬁne the process ηn = (¯n (t))t∈I by ¯ η

x

ε(t)t−1 dt , x ≥ 1, with = 0. For any n ∈ N+

1 limt→∞ ε(t)

f (bj ) − m(f, b0 ) − Eγ f (a1 )I(f (a1 )≤f (n))

j≤ nt

if α = 1,

ηn (t) = ¯

1 × f (n)

f (bj ) − Eγ f (b0 )

j≤ nt

if α > 1

with the usual convention which assigns value 0 to a sum over the empty set, where m(f, b0 ) and Eγ f (b0 ) are equal to m(f, y 0 ) = m(f, r0 ) = Eγ (f (r0 ) − f (a0 )) = Eγ (f (r1 ) − f (a1 )) = 1 log 2

∞ 1

(f (x) − f ( x )) dx , x(x + 1)

**Limit theorems m(f, u0 ) = Eγ (f (u0 ) − f (a0 )) = = 1 log 2 1 log 2
**

∞ ∞ 1 2 1 1 ∞

211

f

x+

1 y

− f ( x ) (xy + 1)−2 dxdy

(f (x) − f (1)) (x − 1) dx x2

+

2

f (x) − ( x − x + 1)f ( x − 1 ) − (x − x )f ( x ) dx , x2 1 log 2

∞ 1 ∞ 2

Eγ f (y 0 ) = Eγ f (r0 ) = Eγ f (r1 ) = Eγ f (u0 ) = 1 log 2

2 1

f (x)dx , x(x + 1) f (x)dx x2 ,

f (x)(x − 1)dx + x2

**according as bn denotes yn , rn or un , n ∈ N+ . Then
**

−1 µη n −→ Qνα in BD w

for any µ ∈ pr(BI ) such that µ 3.3.2(ii).

λ, where να is deﬁned as in Corollary

The proof of Theorem 3.3.8 for the cases bn = rn or bn = un , n ∈ N+ , can be found in Samur (1989, pp. 75–77). The case where bn = yn , n ∈ N+ , can be treated in a similar manner. 2 Example 3.3.9 Let f (x) = x1/α , x ∈ [1, ∞), where α ∈ (1, 2). (For the case α = 2 see Example 3.2.8.) Theorem 3.3.8 holds with Eγ f (y 0 ) = Eγ f (r0 ) = Eγ f (r1 ) = = 1 log 2 1 log 2

∞ 1

x1/α dx 1 = x(x + 1) log 2

1 0

v −1/α dv v+1

j∈N+

1 (2j − 1 − 1/α)(2j − 1/α) 1 2α −ψ 1 1 − 2 2α ,

=

1 2 log 2

ψ 1−

**212 where ψ is the digamma function—see p. 145—and Eγ f (u0 ) = 1 log 2
**

2 1

Chapter 3

(x − 1)dx + x2−1/α

∞ 2

dx x2−1/α

=

α2 (21/α − 1) . (α − 1) log 2

**2 Example 3.3.10 Let f (x) = x, x ∈ [1, ∞). Theorem 3.3.8 holds with m(f, y 0 ) = m(f, r0 ) = = 1 log 2
**

∞ 1

1 log 2

∞ 1

(x − x ) dx x(x + 1)

dx = (log 2)−1 − 1, x2 (x + 1)

m(f, u0 ) = Eγ (r0 − a0 + y −1 ) = m(f, r0 ) + Eγ (y −1 ) 0 0 = 2 log 2

∞ 1

x2 (x

dx = 2 (log 2)−1 − 1 . + 1)

by

**It follows that if for any n ∈ N+ the process ζn = (ζn (t))t∈I is deﬁned ζn (t) = 1 n bj +
**

j≤ nt

C − log n log 2

,

t ∈ I,

where bn denotes either yn , rn or un , n ∈ N+ , then for any µ ∈ pr(BI ) such that µ λ we have w µζn−1 −→ Qν in BD in the cases where bn = yn or bn = rn , n ∈ N+ , with ν = δC/ log 2−1 ∗ ν1 , and w µζn−1 −→ Qν in BD in the case where bn = un , n ∈ N+ , with ν = δ(C+1)/ log 2−2 ∗ ν1 . As a consequence (compare with the similar result for the incomplete quotients an , n ∈ N+ , in Subsection 3.3.2) we have lim µ card{1 ≤ k ≤ n :

k j=1 yj

> k(log n − C)/ log 2}

n→∞

n card{1 ≤ k ≤ n :

k j=1 rj

<x

= lim µ

n→∞

> k(log n − C)/ log 2}

n 0 ≤ x ≤ 1,

<x

= µ (λ(t ∈ I : ξν (t) > 0) < x) ,

**Limit theorems and lim µ card{1 ≤ k ≤ n :
**

k j=1 uj

213

> k(log n − C)/ log 2}

n→∞

n = µ λ(t ∈ I : ξν (t) > 0) < x , 0 ≤ x ≤ 1.

<x

2

3.4

3.4.1

Fluctuation results

The case of incomplete quotients

We start with a direct consequence of Theorem 3.2.2 . Let K ⊂ C be the collection of all absolutely continuous functions x ∈ C 1 for which x(0) = 0 and 0 [x (t)]2 dt ≤ 1. Here x stands for the derivative of x which exists a.e. in I. N Let H be a real-valued function on N+ + . Set Hn = H (an , an+1 , · · · ) , n ∈ 2 N+ , and assume that Eγ H1 < ∞ and (3.2.1 ) holds. Denoting Sn = n 2 i=1 Hn − nEγ H1 , n ∈ N+ , and assuming that σ deﬁned by (3.2.2 ) is non-zero, for any n ≥ 3 put θn (t) = = 1 √ S σ 2n log log n 1 √ ξC , 2n log log n n

nt

+ (nt − nt ) H

nt +1

− Eγ H1

t ∈ I.

Theorem 3.4.1 (Strassen’s law of the iterated logarithm). Assume that Eγ |H1 |2+δ < ∞ for some constant δ > 0, (3.2.3 ) holds, and σ 2 deﬁned by (3.2.2 ) is non-zero. Then the sequence (θn )n≥3 , viewed as a subset of C , is a relatively compact set whose derived set coincides a.e. with K. Proof. The result follows from Strassen’s law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)] and Theorem 3.2.2 . 2 Corollary 3.4.2 (Classical law of the iterated logarithm). Under the assumptions of Theorem 3.4.1 the set of accumulation points of the sequence Sn /σ 2n log log n

n≥3

214 coincides a.e. with the segment [−1, 1].

Chapter 3

In the special case where H only depends on ﬁnitely many coordinates N of a current point of N+ + , i.e., when H is a real-valued function on Nk + for a given k ∈ N+ , certain assumptions in Theorem 3.4.1 are no longer necessary. In this case Hn = H (an , · · · , an+k−1 ), n ∈ N+ , and (3.2.3 ) is trivially satisﬁed. Also, σ 2 reduces to (3.2.2 ) and when k = 1 by Corollary 2.1.25 we have σ 2 = 0 if and only if H = const. Finally, it is enough to 2 assume that Eγ H1 < ∞. This follows from the work of Heyde and Scott (1973). Cf. the remark following Proposition 3.2.6. We state a most striking result. Proposition 3.4.3 Let f : N+ → R be a nonconstant function. Assume that Eγ f 2 (a1 ) < ∞ and put Sn = n f (ai ) − nEγ f (a1 ) , n ∈ N+ . Let i=1

\[
\sigma^2 = E_\gamma f^2(a_1) - \bigl(E_\gamma f(a_1)\bigr)^2 + 2\sum_{n\in\mathbb{N}_+}\Bigl(E_\gamma f(a_1)f(a_{n+1}) - \bigl(E_\gamma f(a_1)\bigr)^2\Bigr),
\]
which by Corollary 2.1.25 is non-zero. For any $n \ge 3$ put
\[
\theta_n(t) = \frac{1}{\sigma\sqrt{2n\log\log n}}\Bigl(S_{\lfloor nt\rfloor} + (nt - \lfloor nt\rfloor)\bigl(f(a_{\lfloor nt\rfloor+1}) - E_\gamma f(a_1)\bigr)\Bigr), \quad t \in I.
\]

Then the sequence $(\theta_n)_{n\ge3}$, viewed as a subset of $C$, is a relatively compact set whose derived set coincides a.e. with $K$. In particular, the set of accumulation points of the sequence $\bigl(S_n/\sigma\sqrt{2n\log\log n}\bigr)_{n\ge3}$ coincides a.e. with the segment $[-1, 1]$.

The almost sure invariance principle is instrumental in establishing integral tests which characterize the asymptotic growth rates of partial sums and maximum absolute partial sums.

Proposition 3.4.4 Let $\theta : [1, \infty) \to \mathbb{R}_{++}$ be non-decreasing. Then under the assumptions of Theorem 3.4.1 the following assertions hold:

(i) $\gamma\bigl(S_n > \sigma\sqrt{n}\,\theta(n) \text{ i.o.}\bigr) = 0$ or $1$ according as
\[
\int_1^\infty \frac{\theta(t)}{t}\exp\Bigl(-\frac{\theta^2(t)}{2}\Bigr)dt
\]
converges or diverges.

(ii) $\gamma\bigl(\max_{1\le i\le n}|S_i| < \sigma\sqrt{n}/\theta(n) \text{ i.o.}\bigr) = 0$ or $1$ according as
\[
\int_1^\infty \frac{\theta^2(t)}{t}\exp\Bigl(-\frac{\pi^2\theta^2(t)}{8}\Bigr)dt
\]
converges or diverges.

Proof. These results follow from Theorem 3.2.2 and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. □
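To see the dichotomy in Proposition 3.4.4(i) concretely, take $\theta(t) = c\sqrt{2\log\log t}$: then $\exp(-\theta^2(t)/2) = (\log t)^{-c^2}$, and the integral diverges for $c < 1$ but converges for $c > 1$, which is exactly how (3.4.1) below is obtained. The following numerical sketch (an illustration, not part of the text) compares partial integrals for a sub-critical and a super-critical $c$:

```python
import math

def integrand(t, c):
    # theta(t) = c * sqrt(2 log log t), so exp(-theta^2/2) = (log t)^(-c^2)
    theta = c * math.sqrt(2.0 * math.log(math.log(t)))
    return (theta / t) * math.exp(-theta * theta / 2.0)

def partial_integral(c, t0, t1, steps=20000):
    # midpoint rule on a geometric grid, adequate for this slowly varying integrand
    total = 0.0
    r = (t1 / t0) ** (1.0 / steps)
    t = t0
    for _ in range(steps):
        t_next = t * r
        total += integrand(math.sqrt(t * t_next), c) * (t_next - t)
        t = t_next
    return total

grow_sub = partial_integral(0.8, 1e3, 1e9)    # c < 1: integral keeps growing (divergence)
grow_super = partial_integral(1.2, 1e3, 1e9)  # c > 1: increments are already small (convergence)
print(grow_sub, grow_super)
```

The increment over $[10^3, 10^9]$ is several times larger in the divergent case, in line with the $0$–$1$ dichotomy.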

Except for the sufficiency of the moment assumption $E_\gamma H_1^2 < \infty$ in the case considered there, the considerations on Theorem 3.4.1 following Corollary 3.4.2 are valid for Proposition 3.4.4, too. We note that Proposition 3.4.4(i) implies the classical law of the iterated logarithm
\[
\gamma\Bigl(\limsup_{n\to\infty}\frac{S_n}{\sigma\sqrt{2n\log\log n}} = 1\Bigr) = 1. \tag{3.4.1}
\]
To obtain (3.4.1) we should take successively $\theta(n) = (1+\varepsilon)\sqrt{2\log\log n}$ and $\theta(n) = (1-\varepsilon)\sqrt{2\log\log n}$, $0 < \varepsilon < 1$, $n \in \mathbb{N}_+$. Also, Proposition 3.4.4(ii) implies Chung's law of the iterated logarithm for maximum absolute partial sums
\[
\gamma\Bigl(\liminf_{n\to\infty}\frac{\max_{1\le i\le n}|S_i|}{\sigma\sqrt{n/\log\log n}} = \frac{\pi}{\sqrt8}\Bigr) = 1. \tag{3.4.2}
\]
To obtain (3.4.2) we should take successively $\theta(n) = (\sqrt8/\pi)(1+\varepsilon)\sqrt{\log\log n}$ and $\theta(n) = (\sqrt8/\pi)(1-\varepsilon)\sqrt{\log\log n}$, $0 < \varepsilon < 1$, $n \in \mathbb{N}_+$.

We conjecture that in the special case where $H$ only depends on finitely many coordinates of a current point in $\mathbb{N}_+^{\mathbb{N}_+}$, Chung's law of the iterated logarithm (3.4.2) holds only assuming that $E_\gamma H_1^2 < \infty$ [as (3.4.1) does]. See Jain and Pruitt (1975) for the i.i.d. case.

3.4.2 The case of associated random variables

Write $b_n$ for either $y_n$, $r_n$ or $u_n$, $n \in \mathbb{N}_+$, respectively $b_0$ for either $y_0$, $r_0$ or $u_0$.

Theorem 3.4.5 Let $f : [1, \infty) \to \mathbb{R}$ satisfy either (i) or (ii) of Theorem 3.2.9. With the notation of that theorem assume that $\sigma(f) > 0$ and put
\[
\theta_n(t) = \frac{1}{\sqrt{2n\log\log n}}\,\xi_n^C(t), \quad n \ge 3, \; t \in I.
\]

If $\delta > 0$ then the sequence $(\theta_n)_{n\ge3}$, viewed as a subset of $C$, is a relatively compact set whose derived set coincides a.e. with $K$. In particular, the set of accumulation points of the sequence $\bigl(S_n/\sigma(f)\sqrt{2n\log\log n}\bigr)_{n\ge3}$ coincides a.e. with the segment $[-1, 1]$.

Proof. The results follow at once from Theorem 3.2.9(b) and Strassen's law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)]. □

Note that in the present context we cannot make considerations similar to those following Corollary 3.4.2.

Example 3.4.6 Let $f(x) = \log x$, $x \in [1, \infty)$. As we have seen in Example 3.2.11, in the cases where $b_n = y_n$ or $b_n = r_n$, $n \in \mathbb{N}_+$, we have
\[
E_\gamma f(b_0) = \frac{\pi^2}{12\log 2}
\]
and $\sigma(f) = \sigma < \infty$ is non-zero. It follows that Strassen's law of the iterated logarithm holds for the corresponding processes $\theta_n$, $n \in \mathbb{N}_+$. In particular, the classical law of the iterated logarithm
\[
\gamma\Bigl(\limsup_{n\to\infty}\frac{\log q_n - n\pi^2/(12\log 2)}{\sigma\sqrt{2n\log\log n}} = 1\Bigr) = 1
\]
holds. This had been proved by Gordin and Reznik (1970) and Philipp and Stackelberg (1969). □

A result similar to Proposition 3.4.4 holds.

Proposition 3.4.7 Let $\theta : [1, \infty) \to \mathbb{R}_{++}$ be non-decreasing. Then under the assumptions of Theorem 3.2.9 the following assertions hold:

(i) $\gamma\bigl(S_n > \sigma(f)\sqrt{n}\,\theta(n) \text{ i.o.}\bigr) = 0$ or $1$ according as
\[
\int_1^\infty \frac{\theta(t)}{t}\exp\Bigl(-\frac{\theta^2(t)}{2}\Bigr)dt
\]
converges or diverges.

(ii) $\gamma\bigl(\max_{1\le i\le n}|S_i| < \sigma(f)\sqrt{n}/\theta(n) \text{ i.o.}\bigr) = 0$ or $1$ according as
\[
\int_1^\infty \frac{\theta^2(t)}{t}\exp\Bigl(-\frac{\pi^2\theta^2(t)}{8}\Bigr)dt
\]
converges or diverges.

Proof. These results follow from Theorem 3.2.9 and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. □

The remarks following Proposition 3.4.4 concerning the classical and Chung's laws of the iterated logarithm apply mutatis mutandis in the present context, too.
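The centring constant in Example 3.4.6 — Lévy's a.e. limit $\lim_n (\log q_n)/n = \pi^2/(12\log 2) \approx 1.1866$ — is easy to observe numerically. The sketch below (an illustration, not part of the text) expands a random rational with a large denominator by Euclid's algorithm, rebuilds the convergent denominators by $q_n = a_n q_{n-1} + q_{n-2}$, and compares $(\log q_n)/n$ with Lévy's constant:

```python
import math
import random

def cf_digits(p, q):
    """Continued fraction digits of p/q (0 < p < q) via Euclid's algorithm."""
    digits = []
    while p:
        a, r = divmod(q, p)
        digits.append(a)
        p, q = r, p
    return digits

random.seed(1)
den = 10 ** 200
digits = cf_digits(random.randrange(1, den), den)[:300]

q_prev, q_cur = 1, digits[0]
for a in digits[1:]:
    q_prev, q_cur = q_cur, a * q_cur + q_prev

levy = math.pi ** 2 / (12 * math.log(2))  # 1.18656...
print(math.log(q_cur) / len(digits), levy)
```

The two printed values agree to within the natural $O(n^{-1/2})$ fluctuation predicted by the central limit theorem for $\log q_n$.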

It is obvious that all the results stated in this section still hold when $\gamma$ is replaced by any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.

Chapter 4

Ergodic theory of continued fractions

In this chapter applications of the ergodic properties of the continued fraction transformation $\tau$ and its natural extension $\bar\tau$ are given. Next, two operations ('singularization' and 'insertion') on incomplete quotients are introduced, which allow one to obtain most of the continued fraction expansions related to the RCF expansion. Ergodic properties of these expansions are also derived.

4.0 Ergodic theory preliminaries

4.0.1 A few general concepts

Let $(X, \mathcal{X}, \mu)$ be a probability space. An $X$-valued random variable on $X$, i.e., an $(\mathcal{X}, \mathcal{X})$-measurable map from $X$ into itself (see Section A1.2), is called a transformation of $X$. A transformation $T$ of $X$ is said to be $\mu$-non-singular if and only if $\mu(T^{-1}(A)) = 0$ for any $A \in \mathcal{X}$ for which $\mu(A) = 0$; it is said to be measure preserving if and only if $\mu T^{-1} = \mu$, i.e., $\mu(T^{-1}(A)) = \mu(A)$ for any $A \in \mathcal{X}$ – see Section A1.3. (When the probability $\mu$ should be emphasized we shall say that $T$ is $\mu$-preserving.) Clearly, any $\mu$-preserving transformation of $X$ is $\mu$-non-singular. A pair $(T, \mu)$, where $T$ is a $\mu$-preserving transformation of $X$, is called an endomorphism of $X$. An endomorphism $(T, \mu)$ of $X$ is called an automorphism if and only if $T$ is bijective [that is, $T(X) = X$ and $T^{-1}$ exists] and $T^{-1}$ is $(\mathcal{X}, \mathcal{X})$-measurable. A quadruple $(X, \mathcal{X}, T, \mu)$, where $(T, \mu)$ is an endomorphism of $X$, is called a (measurable) dynamical system.


A transformation $T$ of $X$ is said to be ergodic (or metrically transitive, or indecomposable) under $\mu$ if and only if the sets $A \in \mathcal{X}$ with $T^{-1}(A) = A$, which are called $T$-invariant, satisfy either $\mu(A) = 0$ or $\mu(A) = 1$. An equivalent definition, even if seemingly more general, is that
\[
\mu\bigl((T^{-1}(A)\setminus A) \cup (A\setminus T^{-1}(A))\bigr) = 0
\]
for $A \in \mathcal{X}$ if and only if either $\mu(A) = 0$ or $\mu(A) = 1$. Finally, in terms of functions this is equivalent to: $f = f\circ T$ $\mu$-a.s. for an $X$-valued random variable $f$ on $X$ if and only if $f$ is constant $\mu$-a.s. In particular, $T$ is ergodic under $\mu$ if it is strongly mixing under $\mu$, that is,
\[
\lim_{n\to\infty}\mu(T^{-n}(A)\cap B) = \mu(A)\mu(B)
\]
for any sets $A, B \in \mathcal{X}$. This is equivalent to
\[
\lim_{n\to\infty}\int_X (f\circ T^n)\,g\,d\mu = \int_X f\,d\mu \int_X g\,d\mu
\]
for any $f \in L^\infty(X, \mathcal{X}, \mu)$ and $g \in L^1(X, \mathcal{X}, \mu)$.

Proposition 4.0.1 Let $T$ be a $\mu$-non-singular transformation of $X$. If $T$ is ergodic under $\mu$, then there exists at most one probability measure $\nu$ on $\mathcal{X}$ such that $\nu \ll \mu$ and $(T, \nu)$ is an endomorphism of $X$. Conversely, if there exists a unique measure $\nu$ on $\mathcal{X}$ with $\nu \ll \mu$ and $d\nu/d\mu > 0$ $\mu$-a.s. such that $(T, \nu)$ is an endomorphism of $X$, then $T$ is ergodic under $\mu$.

The proof of Proposition 4.0.1, which entails the concept of the Perron–Frobenius operator of $T$ (cf. Section 2.1), can be found in Lasota and Mackey (1985). □

An endomorphism $(T, \mu)$ of $X$ is said to be exact if and only if, putting
\[
\mathcal{X}_n = \{T^{-n}(A) : A \in \mathcal{X}\}, \quad n \in \mathbb{N},
\]
where $T^0$ is the identity map, the tail σ-algebra $\bigcap_{n\in\mathbb{N}}\mathcal{X}_n$ is $\mu$-trivial, i.e., it contains only sets $A$ for which either $\mu(A) = 0$ or $\mu(A) = 1$. If an endomorphism $(T, \mu)$ of $X$ is exact, then $T$ is ergodic under $\mu$; also, for any $A \in \mathcal{X}$ for which $\mu(A) > 0$ and $T^n(A) \in \mathcal{X}$, $n \in \mathbb{N}_+$, we have
\[
\lim_{n\to\infty}\mu(T^n(A)) = 1.
\]

Proposition 4.0.2 Let $T$ be a $\mu$-preserving transformation of $X$ for which $T(A) \in \mathcal{X}$ for any $A \in \mathcal{X}$. Then the endomorphism $(T, \mu)$ is exact if and only if
\[
\lim_{n\to\infty}\Bigl\|P^n f - \int_X f\,d\mu\Bigr\|_{1,\mu} = 0
\]
for any non-negative $f \in L^1(X, \mathcal{X}, \mu)$, where $P$ is the Perron–Frobenius operator of $T$ under $\mu$ (cf. Section 2.1).

For the proof see Boyarski and Góra (1997, p. 82). □

Theorem 4.0.3 (Birkhoff's individual ergodic theorem) Let $T$ be a $\mu$-preserving transformation of $X$. Then for any $f \in L^1(X, \mathcal{X}, \mu)$ there exists $\tilde f \in L^1(X, \mathcal{X}, \mu)$ such that
\[
\lim_{n\to\infty}\frac1n\sum_{k=0}^{n-1} f(T^k(x)) = \tilde f(x) \quad \mu\text{-a.s.}
\]
and
\[
\tilde f\circ T = \tilde f \quad \mu\text{-a.s.}
\]
Moreover, $\int_X \tilde f\,d\mu = \int_X f\,d\mu$ and if, in addition, $T$ is ergodic under $\mu$, then $\tilde f$ is $\mu$-a.s. a constant equal to $\int_X f\,d\mu$.

A proof of the ergodic theorem can be found in, e.g., Billingsley (1965), Walters (1982), Petersen (1983) or Cornfeld et al. (1982). In particular, in Keane (1991) a short proof, essentially based on an idea of Kamae (1982), is outlined. See also Katznelson and Weiss (1982). □

Under suitable assumptions it is possible to refine Birkhoff's theorem by giving an estimate of the rate of convergence to the limit $\tilde f$. The result stated below is a special case of Theorem 3 of Gál and Koksma (1950).

Proposition 4.0.4 Let $T$ be a $\mu$-preserving transformation of $X$ which is ergodic under $\mu$. Assume that
\[
\int_X\Bigl(\sum_{\kappa=0}^{n-1} f\circ T^\kappa - n\int_X f\,d\mu\Bigr)^2 d\mu = O(\Psi(n))
\]
as $n \to \infty$, where $\Psi : \mathbb{N}_+ \to \mathbb{R}$ is a function such that the sequence $(\Psi(n)/n)_{n\in\mathbb{N}_+}$ is non-decreasing. Then whatever $\varepsilon > 0$ we have
\[
\sum_{\kappa=0}^{n-1} f(T^\kappa(x)) = n\int_X f\,d\mu + o\bigl(\Psi^{1/2}(n)\log^{(3+\varepsilon)/2} n\bigr) \quad \mu\text{-a.s.}
\]

as $n \to \infty$. Here the constant implied in $o$ depends on $\varepsilon$ and the current point $x \in X$.

Given a transformation $T$ of $X$ we can define its so-called natural extension $\bar T$ as follows. Let
\[
X_T = \bigl\{(x_i)_{i\in\mathbb{N}} \in X^{\mathbb{N}} : x_i = T(x_{i+1}),\; i \in \mathbb{N}\bigr\}
\]
and define $\bar T : X_T \to X_T$ by
\[
\bar T\bigl((x_i)_{i\in\mathbb{N}}\bigr) = (T(x_0), x_0, x_1, \cdots)
\]
for any $(x_i)_{i\in\mathbb{N}} = (x_0, x_1, \cdots) \in X_T$. It is easy to check that $\bar T$ is bijective. If $T$ is $\mu$-preserving, then we can also define a measure $\bar\mu$ on the σ-algebra $\mathcal{X}_T \subset \mathcal{X}^{\mathbb{N}}$ generated by the cylinder sets
\[
C(A_0, \ldots, A_n) = \bigl\{(x_i)_{i\in\mathbb{N}} \in X_T : x_j \in A_j,\; 0 \le j \le n\bigr\},
\]
where $A_j \in \mathcal{X}$, $0 \le j \le n$, $n \in \mathbb{N}$, by setting
\[
\bar\mu\bigl(C(A_0, \ldots, A_n)\bigr) = \mu\Bigl(\bigcap_{0\le j\le n} T^{-n+j}(A_j)\Bigr), \quad n \in \mathbb{N}.
\]

Proposition 4.0.5 If $T$ is $\mu$-preserving, then $\bar T$ is $\bar\mu$-preserving; $T$ is ergodic (strongly mixing) under $\mu$ if and only if $\bar T$ is ergodic (strongly mixing) under $\bar\mu$.

Clearly, if $(T, \mu)$ is an endomorphism of $X$, then $(\bar T, \bar\mu)$ is an automorphism of $X_T$.

Remarks. 1. The definition just given of the natural extension $\bar T$ of $T$ is a constructive one. More generally, starting from a transformation $T$ of $X$ which is $\mu$-preserving ($\mu T^{-1} = \mu$), a bijective transformation $\bar T : \bar X \to \bar X$ is called a natural extension of $T$ if and only if (i) there exists a measurable space $(\bar X, \bar{\mathcal{X}})$ and a probability measure $\bar\mu$ on $\bar{\mathcal{X}}$ such that $\bar T$ is $\bar\mu$-preserving, and (ii) there exists a random variable $f : \bar X \to X$ such that the σ-algebra generated by $\bigcup_{n\in\mathbb{N}} \bar T^n f^{-1}(\mathcal{X})$—see Section A1.1—coincides with $\bar{\mathcal{X}}$ up to sets of $\bar\mu$-probability 0, $f\circ\bar T = T\circ f$ $\bar\mu$-a.s., and $\bar\mu f^{-1} = \mu$.

The natural extension is unique up to isomorphism. By this we mean that if $\bar T_i : \bar X_i \to \bar X_i$, $i = 1, 2$, are natural extensions of $T : X \to X$, with $\bar T_i$ being $\bar\mu_i$-preserving for a probability measure $\bar\mu_i$ on $\bar{\mathcal{X}}_i$ (the σ-algebra in $\bar X_i$), $i = 1, 2$, then there exist $E_i \in \bar{\mathcal{X}}_i$ with $\bar\mu_i(E_i) = 0$, $i = 1, 2$, and a one-to-one random variable $g : \bar X_1\setminus E_1 \to \bar X_2\setminus E_2$ such that $g\bar T_1 = \bar T_2 g$ on $\bar X_1\setminus E_1$ and $\bar\mu_1(g^{-1}(E)) = \bar\mu_2(E)$ for any set $E$ in $\bar{\mathcal{X}}_2$ which is included in $\bar X_2\setminus E_2$. In the case of the constructive definition we clearly have $\bar X = X_T$ while $f$ is defined by
\[
f\bigl((x_i)_{i\in\mathbb{N}}\bigr) = x_0, \quad (x_i)_{i\in\mathbb{N}} \in X_T.
\]

Note that the definition of isomorphism of two natural extensions of a given endomorphism also applies to the case of two arbitrary endomorphisms or dynamical systems.

2. Unlike ergodicity or strong mixing, exactness does not transfer from an endomorphism $(T, \mu)$ to its natural extension $(\bar T, \bar\mu)$. As $\bar T$ is invertible, $(\bar T, \bar\mu)$ cannot be exact since
\[
\bar\mu\bigl(\bar T(A)\bigr) = \bar\mu\bigl(\bar T^{-1}(\bar T(A))\bigr) = \bar\mu(A),
\]
hence $\bar\mu\bigl(\bar T^n(A)\bigr) = \bar\mu(A)$ for any $n \in \mathbb{N}_+$ and $A \in \bar{\mathcal{X}}$. Instead, $(\bar T, \bar\mu)$ always is a K-automorphism, which means that there exists a σ-algebra $\mathcal{A} \subset \bar{\mathcal{X}}$ such that $\bar T^{-1}(\mathcal{A}) \subset \mathcal{A}$, $\bigcup_{n\in\mathbb{N}_+} \bar T^n(\mathcal{A})$ generates $\bar{\mathcal{X}}$, and the tail σ-algebra $\bigcap_{n\in\mathbb{N}_+} \bar T^{-n}(\mathcal{A})$ is $\bar\mu$-trivial. Cf. Petersen (1983, Section 2.5). □

Finally, let us consider, together with the probability space $(X, \mathcal{X}, \mu)$ and a transformation $T : X \to X$, a family of probability spaces $((Y, \mathcal{Y}, \nu_x))_{x\in X}$ and a family $(T_x)_{x\in X}$ of transformations of $Y$ such that the map $(x, y) \in X\times Y \mapsto T_x(y) \in Y$ is a $Y$-valued random variable on $X\times Y$. The map $S : X\times Y \to X\times Y$ defined by
\[
S(x, y) = \bigl(T(x), T_x(y)\bigr), \quad (x, y) \in X\times Y,
\]
is called a skew product of $T$ and $(T_x)_{x\in X}$. In many cases natural extensions are constructed as skew products. Several examples can be found in the next sections. Assuming that $T$ is $\mu$-preserving and $T_x$ is $\nu_x$-preserving for any $x \in X$, we might expect the skew product $S$ to be $\nu$-preserving, where $\nu$ is the probability measure on $\mathcal{X}\otimes\mathcal{Y}$ defined by
\[
\nu(A\times B) = \int_A \nu_x(B)\,\mu(dx), \quad A \in \mathcal{X},\; B \in \mathcal{Y}.
\]
Unfortunately, such a result does not hold, even if it is claimed in Boyarski and Góra (1997, p. 64). It is contradicted, e.g., by the case of the natural extension $\bar\tau$ of $\tau$. Cf. the next subsection.

4.0.2 The special case of the transformations $\tau$ and $\bar\tau$
It is possible to give a direct proof of the ergodicity under γ of the continued fraction transformation τ . See, e.g., Billingsley (1965, pp. 44–45). Results proved in Chapter 2 allow us to assert that actually τ is strongly mixing under γ and any γa , a ∈ I, thus in particular under γ0 = λ. This is a direct consequence of Corollary 1.3.15. Therefore τ is also ergodic under γ and any γa , a ∈ I. Moreover, the endomorphism (τ, γ) is exact by Corollary 2.1.8 and Proposition 4.0.2. It follows from Proposition 4.0.1 that any ν λ for which τ is ν-preserving should coincide with γ. As for τ , we shall show that it can be viewed as the natural extension of τ in the meaning of the constructive deﬁnition given in the preceding subsection. Indeed, in our case XT from the preceding subsection is Ωτ = {(ωi )i∈N ∈ ΩN : ωi = τ (ωi+1 ), i ∈ N}, and the natural extension of τ appears to be—we are bound to change notation—the transformation given by τe ((ωi )i∈N ) = (τ (ω0 ), ω0 , ω1 , · · · ) for any (ωi )i∈N = (ω0 , ω1 , · · · ) ∈ Ωτ . Let us remark that by the very deﬁnition of Ωτ we have ωi+1 = 1/(κi + ωi ) for some κi ∈ N+ whatever i ∈ N. Hence Ωτ can be viewed as the Cartesian product Ω × N+ + or, equivalently, Ω × Ω = Ω2 . More precisely, there is a one-to-one correspondence between Ωτ and Ω2 given by

−1 −1 (ωi )i∈N ∈ Ωτ ↔ (ω0 , [ ω1 , ω2 , · · · ] ) ∈ Ω2 . N

**Then there also is a one-to-one correspondence between τe ((ωi )i∈N ) = (τ (ω0 ), ω0 , ω1 , · · · ) ∈ Ωτ and τ (ω0 ),
**

−1 ω0

1 +[

−1 ω1 −1 , ω2 , · · · ]

∈ Ω2 .

These considerations show that we can identify τe : Ωτ → Ωτ and τ : Ω2 → Ω2 deﬁned as in Subsection 1.3.1 by τ (ω, θ) = τ (ω), 1 , a1 (ω) + θ (ω, θ) ∈ Ω2 .

Ergodic theory of continued fractions

225

It follows from Proposition 4.0.5 that τ is strongly mixing (thus ergodic) ¯ under γ . Also, (¯, γ ) is a K-automorphism. Clearly, τ can be viewed as a ¯ τ ¯ ¯ skew product.
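A quick sanity check of the identification above: $\bar\tau(\omega,\theta) = (\tau(\omega), 1/(a_1(\omega)+\theta))$ is invertible, with $\bar\tau^{-1}(\omega',\theta') = \bigl(1/(\lfloor 1/\theta'\rfloor + \omega'),\; 1/\theta' - \lfloor 1/\theta'\rfloor\bigr)$, since the second coordinate stores the most recent digit. The sketch below (illustrative only) iterates $\bar\tau$ on a sample point and then undoes every step:

```python
import math

def tau(w):
    # the continued fraction (Gauss) map on (0, 1)
    return 1.0 / w - math.floor(1.0 / w)

def tau_bar(w, t):
    # natural extension on (0,1)^2: second coordinate collects past digits
    a1 = math.floor(1.0 / w)
    return tau(w), 1.0 / (a1 + t)

def tau_bar_inv(w, t):
    a1 = math.floor(1.0 / t)
    return 1.0 / (a1 + w), 1.0 / t - a1

w, t = math.sqrt(2) - 1, 0.25
orbit = [(w, t)]
for _ in range(10):
    w, t = tau_bar(w, t)
    orbit.append((w, t))
for k in range(10, 0, -1):
    w, t = tau_bar_inv(w, t)
    # up to floating point error we recover the earlier point of the orbit
    assert abs(w - orbit[k - 1][0]) < 1e-6 and abs(t - orbit[k - 1][1]) < 1e-6
print("round trip ok")
```

This bijectivity is precisely what an automorphism requires, in contrast with the non-invertible $\tau$ itself.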

4.1 Classical results and generalizations

4.1.1 The case of incomplete quotients

Since $\tau$ is $\gamma$-preserving and ergodic under $\gamma$, it follows from Theorem 4.0.3 that
\[
\lim_{n\to\infty}\frac1n\sum_{\kappa=0}^{n-1} f\circ\tau^\kappa = \frac{1}{\log 2}\int_0^1\frac{f(x)}{x+1}\,dx \quad\text{a.e.} \tag{4.1.1}
\]
for any measurable function $f : I \to \mathbb{R}$ such that $\int_I |f|\,d\lambda < \infty$. It is clear that under suitable further assumptions on $f$, Proposition 4.0.4 should lead to estimates of convergence rates in (4.1.1).

We now state several classical results which can be derived from (4.1.1) by specializing $f$, together with the corresponding estimates of the convergence rates, when available. Let us note that throughout this subsection the constants implied in $o$ will depend on $\varepsilon$, the current point in $\Omega$, and the other variables involved.

Proposition 4.1.1 [Asymptotic relative digit frequencies – Lévy (1929)] For any $i \in \mathbb{N}_+$ we have
\[
\lim_{n\to\infty}\frac1n\,\mathrm{card}\{\kappa : a_\kappa = i,\; 1\le\kappa\le n\} = \frac{1}{\log 2}\log\Bigl(1+\frac{1}{i(i+2)}\Bigr) \quad\text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i \in \mathbb{N}_+$ we have
\[
\frac{\mathrm{card}\{\kappa : a_\kappa = i,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\Bigl(1+\frac{1}{i(i+2)}\Bigr) + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$.

Proof. The first equation in the above statement follows from (4.1.1) by taking $f = I_{(a_1=i)}$, hence $f\circ\tau^\kappa = I_{(a_1\circ\tau^\kappa=i)} = I_{(a_{\kappa+1}=i)}$, $\kappa \in \mathbb{N}$. The second equation follows from Proposition 4.0.4 on account of Corollaries 1.3.15 and A3.3 which yield $\Psi(n) = n$, $n \in \mathbb{N}_+$. □
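Lévy's frequencies are easy to observe empirically. The sketch below (an illustration, not part of the text) runs Euclid's algorithm on a random rational with a huge denominator—whose digit statistics mimic those of a $\lambda$-random irrational—and compares the frequency of the digit 1 with $\log_2(4/3) \approx 0.41504$:

```python
import math
import random

def cf_digits(p, q):
    """Continued fraction digits of p/q (0 < p < q) via Euclid's algorithm."""
    digits = []
    while p:
        a, r = divmod(q, p)
        digits.append(a)
        p, q = r, p
    return digits

random.seed(0)
den = 10 ** 1000
digits = cf_digits(random.randrange(1, den), den)

freq1 = digits.count(1) / len(digits)
levy1 = math.log2(1 + 1 / (1 * 3))  # = log2(4/3), about 0.41504
print(freq1, levy1)
```

With roughly 1900 digits the empirical frequency typically lands within a few percent of the predicted value.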


A more general result yielding the asymptotic relative $m$-digit block frequencies is also easily obtained.

Proposition 4.1.2 Whatever $\varepsilon > 0$, for any $m \in \mathbb{N}_+$ and $i^{(m)} = (i_1, \cdots, i_m) \in \mathbb{N}_+^m$ we have
\[
\frac{\mathrm{card}\{\kappa : (a_\kappa, \cdots, a_{\kappa+m-1}) = i^{(m)},\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{1+v(i^{(m)})}{1+u(i^{(m)})} + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$.

The proof is quite similar to that of the preceding proposition. In (4.1.1) we should take $f = I_{((a_1,\cdots,a_m)=i^{(m)})}$. □

It is important to note that the asymptotic relative digit frequencies as well as the asymptotic relative $m$-digit block frequencies, $m \ge 2$, constitute probability distributions on $\mathbb{N}_+$ respectively $\mathbb{N}_+^m$. This is quite easily checked in the first case and not so easily in the second one (induction on $m$!). Actually, this follows from (4.1.1) on account of the countable additivity of the integral there with respect to the integrand.

We now give other results related to asymptotic relative digit frequencies.

Corollary 4.1.3 (Asymptotic relative frequencies of digits between two given values) For any $i, j \in \mathbb{N}_+$ such that $i \le j$ we have
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : i\le a_\kappa\le j,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{(i+1)(j+1)}{i(j+2)} \quad\text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i, j \in \mathbb{N}_+$ such that $i \le j$ we have
\[
\frac{\mathrm{card}\{\kappa : i\le a_\kappa\le j,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{(i+1)(j+1)}{i(j+2)} + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$. This is a direct consequence of Proposition 4.1.1, which can be also obtained from (4.1.1) by taking $f = I_{(i\le a_1\le j)}$.
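The closed form in Corollary 4.1.3 is just the telescoping sum of the single-digit frequencies of Proposition 4.1.1, since $(k+1)^2/(k(k+2))$ telescopes over $k = i, \ldots, j$. A quick check (illustrative):

```python
import math

def digit_freq(k):
    # Proposition 4.1.1: a.e. frequency of the digit k
    return math.log2(1 + 1 / (k * (k + 2)))

def range_freq(i, j):
    # Corollary 4.1.3: a.e. frequency of digits between i and j
    return math.log2((i + 1) * (j + 1) / (i * (j + 2)))

for i, j in [(1, 1), (2, 5), (3, 100)]:
    summed = sum(digit_freq(k) for k in range(i, j + 1))
    assert abs(summed - range_freq(i, j)) < 1e-12
print("telescoping identity verified")
```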


Proposition 4.1.4 (Asymptotic relative frequencies of digits exceeding a given value) For any $i \in \mathbb{N}_+$ we have
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : a_\kappa\ge i,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{i+1}{i} \quad\text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i \in \mathbb{N}_+$ we have
\[
\frac{\mathrm{card}\{\kappa : a_\kappa\ge i,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{i+1}{i} + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$.

The proof is quite similar to that of Proposition 4.1.1. In (4.1.1) we should take $f = I_{(a_1\ge i)}$. □

Let us note that on account of the complete additivity of the asymptotic relative digit frequencies, the first half of Proposition 4.1.4 is a direct consequence of the first half of Proposition 4.1.1.

Now, let $m \in \mathbb{N}_+$ such that $m \ge 2$, and fix arbitrarily an $\ell \in \mathbb{N}_+$ not exceeding $m$. It then follows from Proposition 4.1.1 that
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : a_\kappa\equiv\ell \bmod m,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\sum_{p=0}^{\infty}\log\frac{(\ell+pm+1)^2}{(\ell+pm)(\ell+pm+2)} \quad\text{a.e.}
\]

[By taking $f = I_{(a_1\equiv\ell \bmod m)}$ in (4.1.1), an estimate of the convergence rate can be also obtained.] It has been shown that the sum of the series above can be expressed in terms of Euler's Gamma function. To be precise, the following result holds.

Proposition 4.1.5 [Nolte (1990)] We have
\[
\frac{1}{\log 2}\sum_{p=0}^{\infty}\log\frac{(\ell+pm+1)^2}{(\ell+pm)(\ell+pm+2)} = \frac{1}{\log 2}\log\frac{\Gamma\bigl(\frac{\ell}{m}\bigr)\Gamma\bigl(\frac{\ell+2}{m}\bigr)}{\Gamma^2\bigl(\frac{\ell+1}{m}\bigr)}.
\]

The proof rests on a special case of a result from Whittaker and Watson (1927, Section 12.13), which reads as follows. Let $\alpha_i, \beta_i \in \mathbb{C}\setminus\mathbb{N}_+$, $1\le i\le r$, for a given $r \in \mathbb{N}_+$. Then the infinite product
\[
\prod_{n\in\mathbb{N}_+}\frac{(n-\alpha_1)(n-\alpha_2)\cdots(n-\alpha_r)}{(n-\beta_1)(n-\beta_2)\cdots(n-\beta_r)}
\]
converges if and only if $\sum_{i=1}^r\alpha_i = \sum_{i=1}^r\beta_i$. If this condition is fulfilled, then
\[
\prod_{n\in\mathbb{N}_+}\frac{(n-\alpha_1)(n-\alpha_2)\cdots(n-\alpha_r)}{(n-\beta_1)(n-\beta_2)\cdots(n-\beta_r)} = \prod_{i=1}^r\frac{\Gamma(1-\beta_i)}{\Gamma(1-\alpha_i)}. \tag{4.1.2}
\]
□
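Proposition 4.1.5 is readily checked numerically with `math.lgamma`; for $m = 2$, $\ell = 1$ the right-hand side equals $\log_2\pi - 1$, used in the example that follows. A short illustrative sketch:

```python
import math

def series(l, m, terms=100000):
    # left-hand side of Proposition 4.1.5 (partial sum)
    s = 0.0
    for p in range(terms):
        k = l + p * m
        s += math.log((k + 1) ** 2 / (k * (k + 2)))
    return s / math.log(2)

def gamma_form(l, m):
    # right-hand side of Proposition 4.1.5, via log-Gamma
    lg = math.lgamma
    return (lg(l / m) + lg((l + 2) / m) - 2 * lg((l + 1) / m)) / math.log(2)

for l, m in [(1, 2), (1, 4), (2, 3)]:
    assert abs(series(l, m) - gamma_form(l, m)) < 1e-4
print("Nolte's Gamma identity verified")
```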

For example, using the well known relations $\Gamma(z)\Gamma(1-z) = \pi/\sin\pi z$, $z\notin\mathbb{Z}$, and $\Gamma(z+1) = z\Gamma(z)$, $z\notin-\mathbb{N}$, if we take $m = 2$ and $\ell = 1$ then we find that
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : a_\kappa\equiv1 \bmod 2,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{\Gamma(1/2)\Gamma(3/2)}{\Gamma^2(1)} = \frac{\log\pi}{\log 2} - 1 = 0.6514\cdots \quad\text{a.e.,}
\]
i.e., about 65% of the occurring digits are odd a.e. Next, using the same relations for the function $\Gamma$, for $m = 4$ and $\ell = 1$ we find that
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : a_\kappa\equiv1 \bmod 4,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{\Gamma(1/4)\Gamma(3/4)}{\Gamma^2(1/2)} = \frac12 \quad\text{a.e.,}
\]
i.e., about half of the occurring digits are $\equiv 1 \bmod 4$ a.e.

Similar considerations can be made about 2-digit blocks. For example, we have
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : (a_\kappa, a_{\kappa+1})\equiv(0,0) \bmod 2,\; 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\sum_{i\in\mathbb{N}_+}\sum_{j\in\mathbb{N}_+}\log\frac{(4ij+1)(4ij+2i+2j+2)}{(4ij+2i+1)(4ij+2j+1)} \quad\text{a.e.,}
\]
which by (4.1.2) is equal to
\[
\frac{1}{\log 2}\sum_{i\in\mathbb{N}_+}\log\frac{\Gamma\bigl(1+\frac{2i+1}{4i}\bigr)\Gamma\bigl(1+\frac{1}{4i+2}\bigr)}{\Gamma\bigl(1+\frac{1}{4i}\bigr)\Gamma\bigl(1+\frac{i+1}{2i+1}\bigr)}.
\]
Nolte (op. cit.) proved that the last quantity can be expressed as
\[
\alpha + \frac{1}{\log 2}\sum_{n\ge2}(-1)^n\,\frac{\zeta(n)-1}{n}\Bigl((2^{2-n}-2^{2-2n}-1)(\zeta(n)-1) + \frac{2^{n-1}-1}{2^{2n-2}}\Bigr),
\]
where $\alpha = 0.08167\cdots$ is a constant given in closed form by Nolte in terms of $\log 2$, $\sqrt2$, $2\pi$ and $\Gamma(1/4)$. Setting $y = 2 - \log\pi/\log 2 = 0.3485\ldots$, Nolte's computations show that
\[
\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : (a_\kappa, a_{\kappa+1})\equiv(a,b) \bmod 2,\; 1\le\kappa\le n\}}{n}
\]
is a.e. equal to
\[
\begin{cases} z = 0.11694\cdots & \text{for } (a,b) = (0,0); \\ y - z = 0.23156\cdots & \text{for } (a,b) = (0,1) \text{ or } (1,0); \\ 1-2y+z = 0.41993\cdots & \text{for } (a,b) = (1,1). \end{cases}
\]
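The double series for $z$ converges quickly enough to check directly: its $(i,j)$ term is $\log\bigl(1 + 1/((4ij+2i+1)(4ij+2j+1))\bigr)$, of order $1/(16 i^2 j^2)$, so a modest truncation suffices. A short numerical check (illustrative):

```python
import math

# z = a.e. frequency of consecutive digit pairs that are both even
N = 1000
z = 0.0
for i in range(1, N + 1):
    for j in range(1, N + 1):
        num = (4 * i * j + 1) * (4 * i * j + 2 * i + 2 * j + 2)
        den = (4 * i * j + 2 * i + 1) * (4 * i * j + 2 * j + 1)
        z += math.log(num / den)
z /= math.log(2)
print(z)  # close to Nolte's value 0.11694...
```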

Actually, all the results we have proved so far are special cases of the following result.

Proposition 4.1.6 Given $m \in \mathbb{N}_+$, let $H : \mathbb{N}_+^m \to \mathbb{R}$ be such that
\[
\sum_{i^{(m)}\in\mathbb{N}_+^m} |H(i^{(m)})|\bigl(v(i^{(m)}) - u(i^{(m)})\bigr) < \infty
\]
[which is equivalent to $E_\gamma|H(a_1, \cdots, a_m)| < \infty$]. Then we have
\[
\lim_{n\to\infty}\frac1n\sum_{\kappa=0}^{n-1} H(a_{\kappa+1}, \cdots, a_{\kappa+m}) = \alpha_m \quad\text{a.e.,}
\]
where
\[
\alpha_m = \frac{1}{\log 2}\sum_{i^{(m)}\in\mathbb{N}_+^m} H(i^{(m)})\log\frac{1+v(i^{(m)})}{1+u(i^{(m)})}.
\]
If, in addition,
\[
E_\lambda H^2(a_1, \cdots, a_m) = \sum_{i^{(m)}\in\mathbb{N}_+^m} H^2(i^{(m)})\bigl(v(i^{(m)}) - u(i^{(m)})\bigr) < \infty
\]
[which is equivalent to $E_\gamma H^2(a_1, \cdots, a_m) < \infty$], then whatever $\varepsilon > 0$ we have
\[
\frac1n\sum_{\kappa=0}^{n-1} H(a_{\kappa+1}, \cdots, a_{\kappa+m}) = \alpha_m + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$. For the proof this time the choice of $f$ in (4.1.1) is
\[
f(\omega) = H(a_1(\omega), \cdots, a_m(\omega)), \quad \omega \in \Omega,
\]
while Corollaries 1.3.15 and A3.3 should be also invoked. □

Remark. A generalization of the second half of Proposition 4.1.6 was given by Philipp (1967). It allows the integer $m$ to vary with $n$, and reads as follows.

Proposition 4.1.7 Let $H : \bigcup_{m\in\mathbb{N}_+}\mathbb{N}_+^m \to \mathbb{R}$ be such that $E_\lambda H^2(a_1, \cdots, a_m) < \infty$ for any $m \in \mathbb{N}_+$. Whatever $\varepsilon > 0$, if $2^m \le n < 2^{m+1}$ then
\[
\frac1n\sum_{\kappa=0}^{n-1} H(a_{\kappa+1}, \cdots, a_{\kappa+m}) = \alpha_m + o\bigl(n^{-\frac12}\alpha_m\log^{2+\varepsilon} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$. □

We shall now consider other important special cases of Proposition 4.1.6. With $m = 1$ and
\[
H(i) = H_p(i) = \begin{cases} i^p & \text{if } p < 1,\; p \ne 0, \\ \log i & \text{if } p = 0 \end{cases}
\]
for $i \in \mathbb{N}_+$, we obtain the following results.

Proposition 4.1.8 We have
\[
\lim_{n\to\infty}(a_1\cdots a_n)^{1/n} = K_0 \quad\text{a.e.}
\]
and
\[
\lim_{n\to\infty}\Bigl(\frac{a_1^p+\cdots+a_n^p}{n}\Bigr)^{1/p} = K_p \quad\text{a.e.}
\]
for any $p < 1$, $p \ne 0$, where
\[
K_0 = \prod_{i\in\mathbb{N}_+}\Bigl(1+\frac{1}{i(i+2)}\Bigr)^{\log i/\log 2} = \exp\Bigl(\frac{1}{\log 2}\int_0^1\frac{\log\lfloor 1/t\rfloor}{1+t}\,dt\Bigr) = 2.685452\cdots
\]
and
\[
K_p = \Bigl(\frac{1}{\log 2}\sum_{i\in\mathbb{N}_+} i^p\log\Bigl(1+\frac{1}{i(i+2)}\Bigr)\Bigr)^{1/p} = \Bigl(\frac{1}{\log 2}\int_0^1\frac{\lfloor 1/t\rfloor^p}{1+t}\,dt\Bigr)^{1/p}.
\]
In particular,
\[
\begin{array}{lll}
K_{-1} = 1.745405\cdots, & K_{-2} = 1.450340\cdots, & K_{-3} = 1.313507\cdots, \\
K_{-4} = 1.236961\cdots, & K_{-5} = 1.189003\cdots, & K_{-6} = 1.156552\cdots, \\
K_{-7} = 1.133323\cdots, & K_{-8} = 1.115964\cdots, & K_{-9} = 1.102543\cdots, \\
K_{-10} = 1.091877\cdots.
\end{array}
\]
More precisely, whatever $\varepsilon > 0$ we have
\[
(a_1\cdots a_n)^{1/n} = K_0 + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
as $n \to \infty$, and
\[
\Bigl(\frac{a_1^p+\cdots+a_n^p}{n}\Bigr)^{1/p} = K_p + o\bigl(n^{-\frac12}\log^{(3+\varepsilon)/2} n\bigr) \quad\text{a.e.}
\]
for any $p < 1/2$, $p \ne 0$, as $n \to \infty$.

The cases $p = 0$ and $p = -1$, leading to the asymptotic a.e. values $K_0$ and $K_{-1}$ of the geometric, respectively harmonic, mean of the first $n$ incomplete quotients as $n \to \infty$, were studied by Khintchine (1934/35). Ever since its discovery much effort has been put into the numerical evaluation of $K_0$. See Lehmer (1939), Pedersen (1959), Shanks and Wrench, Jr. (1959), Wrench, Jr. (1960). In the last reference $K_0$ has been evaluated to 155 decimal places. Recently, using work by Wrench, Jr. and Shanks (1996), Bailey et al. (1997) have presented rapidly converging series for any $K_p$, $p < 1$, allowing them to evaluate $K_0$ and $K_{-1}$ to 7,350 decimal places and $K_p$ for $p = -2, -3, \cdots, -10$ to 50 decimal places. Setting
\[
\zeta(s, n) = \zeta(s) - \sum_{i=1}^{n} i^{-s}, \quad s > 1,\; n \in \mathbb{N}_+,
\]

the following identities hold:

(i) for any $n \in \mathbb{N}_+$ we have
\[
\log K_0 = \frac{1}{\log 2}\Bigl(\sum_{i\in\mathbb{N}_+}\frac{A_i}{i}\,\zeta(2i, n) - \sum_{2\le i\le n}\log\Bigl(1-\frac1i\Bigr)\log\Bigl(1+\frac1i\Bigr)\Bigr),
\]
where
\[
A_i = \sum_{\kappa=1}^{2i-1}\frac{(-1)^{\kappa-1}}{\kappa}, \quad i \in \mathbb{N}_+;
\]

(ii) whatever the negative integer $p$, for any $n \in \mathbb{N}_+$ we have
\[
K_p^p\log 2 = \sum_{i\in\mathbb{N}_+}\frac1i\sum_{j\in\mathbb{N}}\binom{j-p-1}{-p-1}\zeta(2i+j-p, n) - \sum_{2\le i\le n}(i-1)^p\log\Bigl(1-\frac{1}{i^2}\Bigr);
\]

(iii) in particular, for any $n \in \mathbb{N}_+$ we have
\[
\frac{1}{K_{-1}} = \frac{1}{\log 2}\Bigl(\sum_{i\in\mathbb{N}_+}\frac1i\Bigl(n^{-1} - \sum_{j=2}^{2i}\zeta(j, n)\Bigr) - \sum_{2\le i\le n}\frac{\log(1-i^{-2})}{i-1}\Bigr).
\]
Clearly, for $n = 1$ the sums $\sum_{2\le i\le n}$ occurring above are empty, thus zero, so that both $K_p^p\log 2$, whatever the negative integer $p$, and $(\log K_0)(\log 2)$ can be cast in terms of series involving values of the Riemann zeta function and rationals. From (i) above, the elegant integral representation
\[
\log K_0 = -\frac{1}{\log 2}\int_0^1\frac{\log[\sin(\pi t)/(\pi t)]}{t(t+1)}\,dt
\]
can be derived. Let us note that we also have
\[
\log K_0 = \log 2 + \frac{1}{\log 2}\int_0^1\frac{\log[\pi t(1-t^2)/\sin\pi t]}{t(t+1)}\,dt,
\]
as shown in Shanks and Wrench, Jr. (1959). Actually, the second equation for $\log K_0$ follows from the first one since
\[
\int_0^1\frac{\log(1-t^2)}{t(t+1)}\,dt = -\log^2 2.
\]
See Bailey et al. (op. cit., p. 419).
Remarks. 1. Whatever $p \in \mathbb{R}$ the series $\sum_{i\in\mathbb{N}_+} a_i^p$ is divergent a.e. For $p < 0$ the assertion follows immediately from Proposition 4.1.8 while for $p \ge 0$ it is obvious since in this case clearly $\sum_{i=1}^n a_i^p \ge n$, $n \in \mathbb{N}_+$. For $p < 0$ arbitrarily large in absolute value this might seem strange at first sight. Actually, things are quite natural since by Proposition 4.1.1 any digit $i \in \mathbb{N}_+$ occurs a.e. infinitely often (and thus there is no need to invoke Proposition 4.1.8).

2. It has been proved by Šalát (1969, 1984) that from a topological standpoint the sets of probability 1 in Propositions 4.1.1 and 4.1.8 (for $p = 0$) are only of the first Baire category, i.e., they are countable unions of nowhere dense subsets of $I$.

3. A set which is 'small' in the measure theoretical sense can be quite 'large' from the point of view of topology. Consider, for example, the set $E_2$ of all numbers in $[0, 1)$ whose RCF digits are 1 or 2. It is a trivial consequence of Proposition 4.1.1 that $\lambda(E_2) = \gamma(E_2) = 0$. On the other hand, it is also clear that $E_2$ has the power of the continuum. To express the 'topological size' of sets like $E_2$ the concepts of Hausdorff measure and Hausdorff dimension are suitable. We first recall their formal definitions and then outline two applications of these concepts to continued fractions.

Given a subset $E$ of $\mathbb{R}^n$, for any $\varepsilon, \delta > 0$ put
\[
H_\varepsilon^\delta(E) = \inf_{\mathcal{U}}\sum_i \mathrm{diam}(U_i)^\delta,
\]
where the infimum is taken over all open coverings $\mathcal{U} = \{U_i\}_i$ of $E$ such that $\mathrm{diam}(U_i) \le \varepsilon$. The Hausdorff measure $H^\delta(E)$ and the Hausdorff dimension $\dim_H(E)$ of $E$ are then defined as
\[
H^\delta(E) = \lim_{\varepsilon\to0} H_\varepsilon^\delta(E), \quad \dim_H(E) = \inf\{\delta : H^\delta(E) = 0\}.
\]

See Falconer (1986, 1990), Harman (1998), and Rogers (1998).

It follows from Proposition 1.1.1—see also Corollary 4.1.30—that for any $\omega \in \Omega$ the inequality
\[
\Bigl|\omega - \frac{p}{q}\Bigr| < \frac{1}{q^2}
\]
has infinitely many solutions in integers $p, q \in \mathbb{N}_+$ with g.c.d.$(p, q) = 1$. Let then $M_c$ denote the set of all $x \in [0, 1)$ satisfying
\[
\Bigl|x - \frac{p}{q}\Bigr| < \frac{1}{q^c}
\]
for infinitely many pairs $(p, q)$ of positive integers. Clearly, if $c \le 2$ then $M_c = [0, 1)$, but what happens when $c > 2$? It is fairly easy to show that $\lambda(M_c) = 0$ for $c > 2$. On the other hand, V. Jarník proved in 1929 that $\dim_H(M_c) = 2/c$ for any $c > 2$. A simplified proof of this result can be found in Falconer (1990, p. 142).

Using iterated function systems (IFS)—which is another name for dependence with complete connections—it is possible to calculate the Hausdorff dimension of sets defined by number-theoretic properties. For instance, the set $E_2$ just defined is the attractor of the IFS consisting of the two (nonlinear) contractions
\[
u_1(x) = \frac{1}{1+x} \quad\text{and}\quad u_2(x) = \frac{1}{2+x}.
\]
It was ﬁrst shown by Jarn´ that ık Pollicott (2001) found that

≤ dimH (E2 ) ≤ 2 , but Jenkinson and 3

dimH (E2 ) = 0.53128 05062 77205 14162 44686 · · · , an approximation accurate to 25 decimal places, which improves earlier estimates of Hensley (1996). A striking feature of Jenkinson and Pollicott’s method is that successive approximations of dimH (E2 ) converge at a superexponential rate. Their method can be also used to eﬃciently compute the Hausdorﬀ dimension of other sets consisting of numbers whose RCF digits are constrained to belong to any given ﬁnite subset of N+ . 2 The case p = 1 is not settled by Proposition 4.1.8. For H(i) = i, i ∈ N+ , the series |H(i)|(v(i) − u(i)) =

i∈N+ i∈N+

i = i(i + 1)

i∈N+

1 i+1

**is divergent. In this case Eγ H(a1 ) = ∞ but, however, we have
**

n→∞

lim

a1 + · · · + an = ∞ a.e.. n

Before proving this (see Corollary 4.1.10 and Remark 1 following it) let us recall that in Subsection 3.3.2 we noted that, writing $t_n = a_1+\cdots+a_n$, $n \in \mathbb{N}_+$, $t_n/n\log n$ converges in $\mu$-probability to $1/\log 2$ as $n \to \infty$ for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. It follows that $t_{n_\kappa}/n_\kappa\log n_\kappa$ converges a.e. to $1/\log 2$ as $\kappa \to \infty$, where $(n_\kappa)_{\kappa\in\mathbb{N}_+}$ is some sequence of positive integers with $\lim_{\kappa\to\infty} n_\kappa = \infty$. Hence $t_{n_\kappa}/n_\kappa$ converges a.e. to $\infty$ as $\kappa \to \infty$.


Thus lim supn→∞ tn /n = ∞ a.e. and it remains to show that lim sup can be replaced by lim. Actually, we shall prove much more. Theorem 4.1.9 [Diamond and Vaaler (1986)] We have tn = 1 + o(1) n log n + θn max ai 1≤i≤n log 2 a.e.

**as n → ∞, where θn is an I-valued random variable for any n ∈ N+ . Proof. Given ε > 0 and n ∈ N+ set ai = ai I(ai ≤h(n)) , 1 ≤ i ≤ n,
**

1

**where h : N+ → R is deﬁned by h(n) = n log 2 +ε n, and tn = a1 + · · · + an . Then Eγ tn = n log 2 n log 2
**

h(n)

j log 1 +

j=1 h(n) j=1

1 j(j + 2)

=

1 (1 + o(1)) = n log h(n) (1 + o(1))/ log 2 j

**as n → ∞. By Corollaries 1.3.15 and A3.2 we have Varγ tn = O(nVarγ t1 ) = O(nEγ (t1 )2 ) as n → ∞. But Eγ (t1 )2 = 1 log 2
**

h(n)

j 2 log 1 +

j=1

1 j(j + 2)

= h(n) (1 + o(1))/ log 2

as n → ∞. Therefore Varγ tn = O(n h(n) ) as n → ∞. Now, consider the sequence (nκ )κ∈N+ deﬁned as nκ = exp κ1−ε , Note that nκ−1 = 1 + O(κ−ε ) nκ as κ → ∞ so that nκ−1 /nκ and h(nκ−1 )/h(nκ ) both converge to 1 as κ → ∞. By the choice of the nκ it is obvious that the series with general term Eγ (tnκ − Eγ tnκ )2 , nκ h(nκ )κ1+ε κ ∈ N+ , κ ∈ N+ .

236

Chapter 4

is convergent. Hence by Beppo Levi’s theorem the random series with general term (tnκ − Eγ tnκ )2 , κ ∈ N+ , nκ h(nκ )κ1+ε is convergent a.e. Therefore |tnκ − Eγ tnκ | = o nκ κ(1+ε)/2 log(1+2ε)/4 nκ as κ → ∞. Now, it is easy to check that nκ κ(1+ε)/2 log(1+2ε)/4 nκ = O Eγ tnκ logε/3 nκ = o Eγ tnκ a.e. a.e.

as κ → ∞ provided that ε < 0.126. Thus tnκ = (1 + o(1))Eγ tnκ a.e.

as κ → ∞. Next, for any n ∈ N+ satisfying nκ−1 < n ≤ nκ for some κ ∈ N+ we clearly have tnκ−1 ≤ tn ≤ tnκ , so that (1 + o(1))Eγ tnκ−1 ≤ tn ≤ (1 + o(1))Eγ tnκ a.e. as k → ∞. On account of the properties already noted of the sequence (nκ )κ∈N+ we easily obtain tn = (1 + o(1))Eγ tn as n → ∞, and since n log h(n) − n log n = o(n log n) as n → ∞, we can also write tn = (1 + o(1)) n log n log 2 a.e. (4.1.3) a.e.

as n → ∞. To complete the proof we shall show that a.e. there exist at most ﬁnitely many integers n ∈ N+ for which the inequalities ai > h(n), aj > h(n)

Ergodic theory of continued fractions

237

hold for two distinct indices i, j ≤ n. To proceed fix i < j. It follows from Corollary 1.3.15 that
\[ \gamma(a_i > h(n),\, a_j > h(n)) = O\bigl( \gamma(a_i > h(n))\, \gamma(a_j > h(n)) \bigr) = O\bigl( \gamma^2(a_1 > h(n)) \bigr) = O\bigl( (h(n))^{-2} \bigr) = O\bigl( n^{-2} (\log n)^{-1-2\varepsilon} \bigr) \]
as n → ∞. Hence the probability of the random event (a_i > h(n), a_j > h(n) for two distinct indices i, j ≤ 2n) is of order at most (log n)^{−1−2ε}. For κ ∈ N_+ let
\[ E_\kappa = \bigcup_{\ell \ge \kappa} \bigl( a_i > h(2^\ell),\ a_j > h(2^\ell) \text{ for two distinct indices } i, j \le 2^{\ell+1} \bigr) . \]
Then γ(E_κ) = O(Σ_{ℓ≥κ} ℓ^{−1−2ε}) → 0 as κ → ∞. It is now clear that for ω ∉ E_κ and n > 2^{κ+1} there exists at most one index i ≤ n for which a_i(ω) > h(n). Consequently, we can assert that
\[ 0 \le \tilde t_n - t_n \le \max_{1 \le i \le n} a_i \quad \text{a.e.} \tag{4.1.4} \]
for all sufficiently large n. By (4.1.3) and (4.1.4) the proof is complete. □
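The trimmed-sum asymptotics just proved are easy to probe numerically. The following sketch (an illustration only, not part of the proof; the sampling scheme, precision, and tolerance are choices made here) draws a "typical" point as a fraction with a large random denominator, extracts its continued fraction digits by Euclid's algorithm, and compares the digit sum with the single largest digit removed against n log n/log 2.

```python
import math
import random
from fractions import Fraction

def cf_digits(x: Fraction):
    """Continued fraction digits of a rational x in (0, 1) via Euclid's algorithm."""
    digits = []
    p, q = x.numerator, x.denominator
    while p:                      # the expansion of a rational terminates
        digits.append(q // p)
        p, q = q % p, p
    return digits

random.seed(12345)
# A random rational with a huge denominator mimics a typical irrational
# for its first few thousand continued fraction digits.
x = Fraction(random.getrandbits(20000), 2**20000)
d = cf_digits(x)[:5000]
n = len(d)
trimmed = sum(d) - max(d)          # digit sum with the largest digit removed
ratio = trimmed / (n * math.log(n) / math.log(2))
print(n, ratio)                    # the ratio should be of order 1
```

The full digit sum divided by n log n fluctuates wildly (a single huge digit can dominate), while the trimmed sum stays close to its almost-sure asymptote, which is exactly the point of the theorem.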

Remarks. 1. It is now clear from the above theorem and Proposition 3.1.7 why t_n/n log n converges in probability, rather than a.e., to 1/log 2 as n → ∞. The obstacle to a.e. convergence is the occurrence of a single large value of the digits. At the same time, a.e. convergence can be obtained by excluding at most one summand.

2. It is interesting to compare Theorems 3.3.4 and 4.1.9 (see also Corollary 3.1.11). □

Corollary 4.1.10 Whatever 0 ≤ ε < 1 we have
\[ \lim_{n\to\infty} \frac{a_1 + \cdots + a_n}{n (\log n)^{\varepsilon}} = \infty \quad \text{a.e.} \]

Remarks. 1. The equation
\[ \lim_{n\to\infty} \frac{a_1 + \cdots + a_n}{n} = \infty \quad \text{a.e.} \]


can be also derived from a slight generalization of equation (4.1.1). Hartman (1951) proved that if f : I → R_+ is measurable and ∫_I f dλ = ∞, then the limit in (4.1.1) exists and is equal to ∞ a.e. The equation above then follows by taking f(ω) = a_1(ω), ω ∈ Ω. It is interesting to note that if we take f(ω) = a_2(ω)/a_1(ω) or f(ω) = a_1(ω)/a_2(ω), ω ∈ Ω, then we obtain
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{1 \le i \le n} \frac{a_{i+1}}{a_i} = \lim_{n\to\infty} \frac{1}{n} \sum_{1 \le i \le n} \frac{a_i}{a_{i+1}} = \infty \quad \text{a.e.} \]

2. Salem (1943) proved that the celebrated Minkowski function ? can be expressed in terms of the t_n, n ∈ N_+, as
\[ ?(x) = \sum_{i \in \mathbb{N}_+} (-1)^{i-1}\, 2^{1 - t_i(x)} \]
for any x ∈ I, if we consider that a_i(x) = ∞ for any large enough i ∈ N_+ when x ∈ I \ Ω. It is known that ? is a strictly increasing singular function, that is, ?′(x) = 0 a.e. in I. Recently, Viader et al. (1998) have shown that
\[ \Bigl\{ x \in I : \lim_{n\to\infty} \frac{t_n(x)}{n} = \infty \Bigr\} \cap \bigl\{ x \in I : ?'(x) \text{ exists finitely} \bigr\} \subset \bigl\{ x \in I : ?'(x) = 0 \bigr\}, \]
thus making more precise the set where the derivative of ? vanishes. Note that the sequence (a_n)_{n∈N_+} is i.i.d. with common distribution (2^{−m} : m ∈ N_+) under the probability measure μ induced by ? on B_I. Cf. Lagarias (1992, p. 45).

3. Vardi (1995, 1997) discussed an interesting relationship between the St. Petersburg game [see, e.g., Feller (1968, X.4)] and the sequence (a_n)_{n∈N_+}, on account of the properties of the sequence (t_n)_{n∈N_+}. That game is a well known example of a sequence of independent identically distributed random variables with infinite mean value, and was considered a paradox since no 'fair' entry fee exists. It appears that (a_n)_{n∈N_+} makes a reasonable choice of entry fees for the St. Petersburg game. □

Corollary 4.1.11 Let (c_n)_{n∈N_+} be a non-decreasing sequence of positive numbers satisfying Σ_{n∈N_+} c_n^{−1} < ∞. Then
\[ t_n = (1 + o(1))\, \frac{n \log n}{\log 2} + \theta_n c_n \quad \text{a.e.} \]

as n → ∞, where θn is an I-valued random variable for any n ∈ N+ .


Proof. This is an immediate consequence of Theorem 4.1.9 and Proposition 1.3.16 (F. Bernstein's theorem). □

Corollary 4.1.12 Set
\[ d_n = \exp(\kappa \log^2 \kappa)\, \kappa \log^2 \kappa \quad \text{for } \exp((\kappa-1)\log^2(\kappa-1)) < n \le \exp(\kappa \log^2 \kappa), \quad \kappa \ge 2 . \tag{4.1.5} \]
Then
\[ \limsup_{n\to\infty} \frac{a_1 + \cdots + a_n}{d_n} = \frac{1}{\log 2} \quad \text{a.e.} \]

Proof. In Corollary 4.1.11 set c_n = d_n/(log log 10κ) for n in the range (4.1.5). It is easy to check that Σ_{n∈N_+} c_n^{−1} < ∞ and that (4.1.5) implies n log n ≤ d_n, n ∈ N_+. Then by Corollary 4.1.11 we have
\[ t_n \le \frac{(1 + o(1))\, d_n}{\log 2} + \frac{d_n}{\log\log 10\kappa} \quad \text{a.e.} \]
as κ → ∞, so lim sup_{n→∞} t_n/d_n ≤ 1/log 2 a.e. To complete the proof we note that setting n_κ = exp((κ+1) log²(κ+1)) we have d_{n_κ} = n_κ log n_κ, κ ∈ N_+, and lim_{κ→∞} t_{n_κ}/d_{n_κ} = 1/log 2. □

Remarks. 1. Philipp (1988, Theorem 1) proved that (i) for any sequence (c_n)_{n∈N_+} of positive numbers such that Σ_{n∈N_+} c_n^{−1} < ∞, we have lim sup_{n→∞} t_n/c_n = 0 a.e., and (ii) for any sequence (c_n)_{n∈N_+} of positive numbers such that the sequence (c_n/n)_{n∈N_+} is non-decreasing and Σ_{n∈N_+} c_n^{−1} = ∞, we have lim sup_{n→∞} t_n/c_n = ∞ a.e. Corollary 4.1.11 shows that the condition on the sequence (c_n/n)_{n∈N_+} in (ii) cannot be dispensed with.

2. It is easy to show, see Diamond and Vaaler (op. cit., pp. 81–82), that if (c_n)_{n∈N_+} is as in Corollary 4.1.11, then setting S = {n ∈ N_+ : c_n < n log n}, we have
\[ \lim_{x\to\infty} \frac{1}{\log x} \sum_{n \le x,\; n \in S} \frac{1}{n} = 0 , \]


that is, S has logarithmic density zero. It then follows from Corollary 4.1.11 that a_1 + ⋯ + a_n = O(c_n) as n → ∞ for all integers n outside a set of logarithmic density 0. See also Corollary 3.1.9.

3. Theorem 4.1.9 can be easily generalized for a function H : N_+ → R_{++} satisfying
\[ \sum_{1 \le i \le n} \frac{H^2(i)}{i^2} \Bigm/ \Bigl( \sum_{1 \le i \le n} \frac{H(i)}{i^2} \Bigr)^{2} = O\bigl( n \log^{-3/2-\varepsilon} n \bigr) \]
as n → ∞ for some ε > 0. [Clearly, H(i) = i, i ∈ N_+, satisfies the condition above.] For such a function H we have
\[ \sum_{i=1}^{n} H(a_i) = \frac{(1 + o(1))\, n}{\log 2} \sum_{1 \le i \le n} H(i) \log\Bigl( 1 + \frac{1}{i(i+2)} \Bigr) + \theta_n \max_{1 \le i \le n} H(a_i) \quad \text{a.e.,} \]
where θ_n is an I-valued random variable for any n ∈ N_+. The proof can be found in Diamond and Vaaler (op. cit.). □

4.1.2 Empirical evidence, and normal continued fraction numbers

We shall now discuss the considerable amount of empirical evidence already accumulated on continued fraction expansions of certain real numbers. The interest of such computations lies in comparing statistics of such expansions with known theoretical limiting distributions. It is clear that, for instance, contained in the exceptional set in Proposition 4.1.8 are all quadratic irrationalities and the number e − 2. See Subsection 1.1.3. Clearly, all the numbers just mentioned are also contained in the exceptional set in Proposition 4.1.1. As we have already mentioned in Subsection 1.1.3, in the opposite direction seems to lie π − 3, whose continued fraction expansion is
\[ \pi - 3 = [\,7, 15, 1, 292, 1, 1, 1, 2, 1, 3, \cdots\,] . \]


In Bailey et al. (1997, p. 423) it is asserted that, based on the first 17,001,303 continued fraction digits of π − 3, the geometric mean is 2.68639 and the harmonic mean is 1.745882, which are reasonably close to K_0 and K_{−1}—see Proposition 4.1.8. Clearly, no conclusion can be drawn beyond this. For computations concerning the continued fraction digits of various irrationals in I we refer the reader to Alexandrov (1978), Brjuno (1964), Choong, Daykin and Rathbone (1971) (see nevertheless D. Shanks' review [MR 52 # 7073] of this paper), Lang and Trotter (1972), Richtmyer (1975), Shiu (1995), and J.O. Shallit's review [MR 96b: 11165] of this last paper. Presenting an algorithm for computing the continued fraction expansion of numbers which are zeroes of differentiable functions, Shiu (1995) obtained statistics of the first 10000 digits of irrationals in I such as 2^{1/3} − 1, π − 3, π² − 9, log 2, 2^{√2} − 2. Table 1 below is compiled from his Table 1. The last column contains the (theoretical) asymptotic relative digit frequencies

\[ \frac{1}{\log 2} \log\Bigl( 1 + \frac{1}{i(i+2)} \Bigr), \quad 1 \le i \le 10, \]
in the first 10 lines, the asymptotic relative frequency
\[ \frac{1}{\log 2} \log \frac{12 \times 101}{11 \times 102} \]
of the digits in the range [11, 100] in the 11th line, and the asymptotic relative frequency
\[ \frac{1}{\log 2} \log \frac{102}{101} \]
of the digits exceeding 100 in the last line. Cf. Propositions 4.1.1, 4.1.3, and 4.1.4.
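These theoretical values are easy to reproduce; the following sketch recomputes the entries of the last column of Table 1 and checks the two pooled values against the telescoping closed forms quoted above (the product Π(1 + 1/(i(i+2))) over a ≤ i ≤ b telescopes to ((a+1)/a)((b+1)/(b+2))).

```python
import math

def freq(i):
    # asymptotic relative frequency of digit i under the Gauss measure
    return math.log(1 + 1 / (i * (i + 2))) / math.log(2)

single = [freq(i) for i in range(1, 11)]
mid = sum(freq(i) for i in range(11, 101))       # digits 11..100
tail = 1 - sum(freq(i) for i in range(1, 101))   # digits >= 101

# telescoping closed forms quoted in the text
mid_closed = math.log(12 * 101 / (11 * 102)) / math.log(2)
tail_closed = math.log(102 / 101) / math.log(2)
print(round(single[0], 9), round(mid, 9), round(tail, 9))
```

In particular the frequency of digit 1 comes out as 0.415037499⋯ and the [11, 100] pooled frequency as 0.111317022⋯, in agreement with the table.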


Frequency of occurrence of i in 10000 digits of

Digit i   2^{1/3}−1   π−3    π²−9   log 2   2^{√2}−2   Theoretical asymptotic relative frequency
1         4173        4206   4134   4149    4192       0.415037499 ⋯
2         1675        1672   1706   1666    1639       0.169925001 ⋯
3         946         882    948    905     933        0.093109404 ⋯
4         636         597    581    600     616        0.058893689 ⋯
5         421         443    401    390     390        0.040641984 ⋯
6         295         282    302    334     278        0.029747343 ⋯
7         240         224    232    226     213        0.022720076 ⋯
8         163         186    185    187     190        0.017921908 ⋯
9         122         143    138    142     135        0.014499569 ⋯
10        118         123    117    137     135        0.011972641 ⋯
11−100    1060        1113   1111   1113    1130       0.111317022 ⋯
≥ 101     151         129    145    151     149        0.014213859 ⋯

Table 1

It is also interesting to note that setting M_{10000}(ω) = max_{1≤κ≤10000} a_κ(ω) (cf. Subsection 3.1.3) we have

M_{10000}(2^{1/3} − 1) = a_{1990}(2^{1/3} − 1) = 12737,
M_{10000}(π − 3) = a_{431}(π − 3) = 20776,
M_{10000}(π² − 9) = a_{1234}(π² − 9) = 12013,
M_{10000}(log 2) = a_{9168}(log 2) = 963664,
M_{10000}(2^{√2} − 2) = a_{6342}(2^{√2} − 2) = 44122,

and that in all cases just considered there exist digits not exceeding 100 which do not appear, viz.

74, 86, 91, 96, 97, 99, and 100      for 2^{1/3} − 1;
90, 91, and 96                       for π − 3;
91 and 92                            for π² − 9;
55, 73, 76, 96, and 97               for log 2;
79, 80, 81, 82, 91, 94, 97, and 99   for 2^{√2} − 2.
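As a sanity check on Table 1, the sketch below (with the counts transcribed directly from the table) verifies that each column sums to 10000 and compares the pooled frequency of digit 1 across all five numbers with the theoretical value 0.41504⋯.

```python
import math

counts = {  # occurrences of digits 1..10, 11-100, >=101 in 10000 digits
    "2^(1/3)-1": [4173, 1675, 946, 636, 421, 295, 240, 163, 122, 118, 1060, 151],
    "pi-3":      [4206, 1672, 882, 597, 443, 282, 224, 186, 143, 123, 1113, 129],
    "pi^2-9":    [4134, 1706, 948, 581, 401, 302, 232, 185, 138, 117, 1111, 145],
    "log 2":     [4149, 1666, 905, 600, 390, 334, 226, 187, 142, 137, 1113, 151],
    "2^sqrt2-2": [4192, 1639, 933, 616, 390, 278, 213, 190, 135, 135, 1130, 149],
}
for name, col in counts.items():
    assert sum(col) == 10000, name        # each column accounts for all digits

theory1 = math.log(4 / 3) / math.log(2)   # theoretical frequency of digit 1
pooled1 = sum(col[0] for col in counts.values()) / 50000
print(round(pooled1, 5), round(theory1, 5))
```

The pooled empirical frequency differs from the theoretical one only in the third decimal, consistent with fluctuations of order 1/√10000 per column.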

Concerning Khinchin's constant K_0, computations of
\[ K_0(\omega, n) = (a_1(\omega) \cdots a_n(\omega))^{1/n} \]
for n ≤ 10000 and various ω ∈ Ω, including those considered above, suggest that, e.g., π − 3 is not in the exceptional set. However, it should be pointed out that even if there is convergence, the rate must be very slow. It was found that K_0(π − 3, 10000) differs from K_0 by more than K_0(π − 3, 100) does!

The existence of the asymptotic relative digit and, more generally, m-digit block frequencies (Propositions 4.1.1 and 4.1.2) naturally raises the question of normality for the continued fraction expansion.

The idea of normality, first introduced by É. Borel in 1909, is an attempt to formalize the notion of a real number being random. A real number x ∈ I is said to be normal in base b, b ∈ N_+, b ≥ 2, if and only if in its representation in base b all digits 0, 1, ⋯, b−1 appear asymptotically equally often, i.e., with asymptotic relative frequencies all equal to 1/b. In addition, for each m ∈ N_+ the b^m different m-digit blocks must occur equally often. In other words, for any m ∈ N_+ we should have

\[ \lim_{n\to\infty} \frac{1}{n} \bigl( \text{number of occurrences of a given } m\text{-digit block in the first } n + m - 1 \text{ base-}b \text{ digits of } x \bigr) = b^{-m} \]

whatever the given m-digit block. Actually, the above equation holds for all x ∈ I except for a set of Lebesgue measure zero. This can easily be seen by applying Birkhoff's ergodic theorem to the transformation Tx = bx mod 1 of I. A number that is normal in all bases b ∈ N_+, b ≥ 2, is called normal. However, even if there are lots of normal numbers, when we are given a 'concrete' number x ∈ I the existence result just mentioned does not help to decide whether x is normal or not. Such a problem cannot be handled by methods known today. (Will it ever be solved?) For instance, it is not known whether π − 3, e − 2, or any irrational algebraic number is normal or not. The first example of a number normal in base 10 was given by Champernowne (1933). His number is
\[ x = 0.1\,2\,3\,4\,5\,6\,7\,8\,9\,10\,11\,12\,13\,14 \cdots, \]
obtained by concatenating the decimal representations of the successive positive integers, but an explicit example of a normal number is still lacking. Clearly, a similar problem can be considered for the continued fraction expansion (which has the advantage of not being related to any base). An irrational ω ∈ I is said to be a normal continued fraction number if and only


if all its asymptotic relative m-digit block frequencies exist and are equal to those occurring in Proposition 4.1.2 for any m ∈ N_+. In other words, ω is a normal continued fraction number if it does not belong to the exceptional sets of λ-measure zero excluded in Proposition 4.1.2 for any m ∈ N_+. For instance, the quadratic irrationalities are not normal since they have eventually periodic expansions, and neither is e − 2.

A construction of the Champernowne type for a normal continued fraction number was given by Adler, Keane, and Smorodinsky (1981). Their example is as follows. Let (r_n)_{n∈N_+} be the sequence of rationals in (0,1) obtained by first writing r_1 = 1/2, then r_2 = 1/3 and r_3 = 2/3, then r_4 = 1/4, r_5 = 2/4, r_6 = 3/4, etc., at each stage m ∈ N_+ writing all quotients with denominator m + 1 in increasing order. Let r_i = [a_{i,1}, a_{i,2}, …, a_{i,n_i}] be the continued fraction expansion of r_i, with a_{i,n_i} ≠ 1, i ∈ N_+. The irrational ω with continued fraction expansion
\[ [\,a_{1,1}, a_{2,1}, a_{3,1}, a_{3,2}, a_{4,1}, a_{5,1}, a_{6,1}, a_{6,2}, a_{7,1}, a_{8,1}, a_{8,2}, a_{9,1}, a_{9,2}, a_{9,3}, \cdots\,], \]
which is obtained by concatenating the expansions of r_1, r_2, ⋯ in the given order, is a normal continued fraction number. The first 14 digits of ω are 2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2. Another example, of a different nature, was given by Postnikov (1960).

We should emphasize that even if the empirical evidence pleads in favour of normality for the continued fraction expansion of algebraic irrationals of degree exceeding 2, or of π − 3, π² − 9 etc., the only mathematical results proved so far are the examples of normal continued fraction numbers just discussed.

Finally, a few words about the empirical evidence concerning Theorem 4.1.9. Von Neumann and Tuckerman (1955) computed t_n(2^{1/3} − 1) and n log n/log 2 for n = 100(100)2000. It appears that t_n(2^{1/3} − 1) log 2/n log n is most of the time greater than 1 and often nearly 2.
Since t_n log 2/(n log n) converges only in probability to 1 as n → ∞, these deviations cannot be regarded as significant.
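The Adler–Keane–Smorodinsky construction is easy to carry out mechanically. The sketch below generates the digits by concatenating the canonical (last digit ≥ 2) continued fraction expansions of 1/2; 1/3, 2/3; 1/4, 2/4, 3/4; …, and reproduces the first 14 digits quoted above.

```python
from fractions import Fraction

def cf_digits(x: Fraction):
    """Continued fraction digits of a rational x in (0, 1); last digit >= 2."""
    digits, p, q = [], x.numerator, x.denominator
    while p:
        digits.append(q // p)
        p, q = q % p, p
    return digits

def aks_digits(n):
    """First n digits of the Adler-Keane-Smorodinsky normal CF number."""
    out, den = [], 2
    while len(out) < n:
        for num in range(1, den):          # 1/den, 2/den, ..., (den-1)/den
            out.extend(cf_digits(Fraction(num, den)))
        den += 1
    return out[:n]

print(aks_digits(14))   # → [2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2]
```

Note that `Fraction(2, 4)` reduces to 1/2 automatically, so 2/4 contributes the single digit 2, exactly as in the construction.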

4.1.3 The case of associated and extended random variables

Since τ̄ is γ̄-preserving and ergodic under γ̄ (see Subsection 4.0.2), it follows again from Theorem 4.0.3 that
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar f \circ \bar\tau^k = \frac{1}{\log 2} \int_0^1 \! dx \int_0^1 \frac{\bar f(x,y)}{(xy+1)^2}\, dy \quad \text{a.e. in } I^2 \tag{4.1.6} \]


for any measurable function f̄ : I² → R such that ∫_{I²} |f̄| dλ² < ∞. As in Subsection 4.1.1, for suitable choices of f̄, Proposition 4.0.4 will lead to estimates of convergence rates in (4.1.6). We now give several results which can be derived from (4.1.6).

Proposition 4.1.13 For any B ∈ B_{I²} we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, \bar s_k) = \frac{1}{\log 2} \int_B \frac{dx\, dy}{(xy+1)^2} \quad \text{a.e. in } I^2 . \]

Proof. The equation above follows from (4.1.6) by taking f̄ = I_B, B ∈ B_{I²}, and noting that, by the very definition of the extended incomplete quotients (see Subsection 1.3.3), equations (1.3.1) and (1.3.1′) can be written as
\[ \bar\tau^n(\omega, \theta) = (\tau^n(\omega), \bar s_n(\omega, \theta)), \quad (\omega, \theta) \in \Omega \times I, \]
for any n ∈ N_+. (The last equation holds for n = 0, too.) □

Corollary 4.1.14 For any A ∈ B_I we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_A(\tau^k) = \gamma(A) \quad \text{a.e. in } I, \]
and
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_A(\bar s_k) = \gamma(A) \quad \text{a.e. in } I^2 . \]

Proof. The first equation follows by taking B = A × I. [It might be also derived from equation (4.1.1).] The second equation follows by taking B = I × A. □

It follows by dominated convergence from Proposition 4.1.13 that for any μ̄ ∈ pr(B_{I²}) we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu\bigl( \bar\tau^{-k}(B) \bigr) = \bar\gamma(B), \quad B \in \mathcal{B}_{I^2} . \tag{4.1.7} \]
In particular,
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu\bigl( \bar\tau^{-k}(I \times A) \bigr) = \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu(\bar s_k \in A) = \gamma(A), \quad A \in \mathcal{B}_I . \tag{4.1.8} \]


We are going to show under suitable assumptions that in (4.1.7) actual convergence holds instead of Cesàro convergence, while in (4.1.8) the extended random variable s̄_k can be replaced by s^a_k, k ∈ N, for a fixed a ∈ I.

Proposition 4.1.15 Let μ̄ ∈ pr(B_{I²}) such that μ̄ ≪ λ². Then
\[ \lim_{n\to\infty} \bar\mu\bigl( \bar\tau^{-n}(B) \bigr) = \bar\gamma(B) \tag{4.1.9} \]
for any B ∈ B_{I²}.

Proof. Let h̄ = dμ̄/dλ². Then for any B ∈ B_{I²} we have
\[ \bar\mu(\bar\tau^{-n}(B)) = \int_{I^2} I_B \circ \bar\tau^n \, d\bar\mu = \int_{I^2} (I_B \circ \bar\tau^n)(\bar h/\bar g)\, d\bar\gamma , \]
where ḡ = dγ̄/dλ², that is,
\[ \bar g(x, y) = \frac{1}{\log 2}\, \frac{1}{(xy+1)^2}, \quad (x, y) \in I^2 . \]
Now, since τ̄ is strongly mixing (see Subsections 4.0.1 and 4.0.2), the last integral in the equations above converges to
\[ \int_{I^2} I_B \, d\bar\gamma \int_{I^2} (\bar h/\bar g)\, d\bar\gamma = \bar\gamma(B)\, \bar\mu(I^2) = \bar\gamma(B) \]
as n → ∞. □

Remarks. 1. Proposition 2.1.5 shows that the measures μτ^{−n}, n ∈ N, can be expressed in terms of the Perron–Frobenius operator P_γ = U of τ with respect to γ. A similar representation holds for the case of a measure μ̄ as in Proposition 4.1.15. It is easy to check that we have
\[ \bar\mu(\bar\tau^{-n}(B)) = \int_B \bar P^{\,n}_{\bar\gamma} \bar f \, d\bar\gamma, \quad B \in \mathcal{B}_{I^2}, \]
where f̄ = h̄/ḡ and P̄_γ̄ is the Perron–Frobenius operator of τ̄ under γ̄. See the Remark following Proposition 2.1.1.

If the endomorphism (τ̄, γ̄) were exact, then from Proposition 4.0.2 we might have deduced that convergence in (4.1.9) is uniform with respect to B ∈ B_{I²}. Since (τ̄, γ̄) is not exact, such a conclusion cannot be reached this way. It is an open problem whether this is really true.

2. Proposition 4.1.15 is a first step towards the solution of what can be called Gauss' problem for the natural extension τ̄ of τ. □

Theorem 4.1.16 Let μ ∈ pr(B_I) such that μ ≪ λ. For any B ∈ B_{I²} such that λ²(∂B) = 0 we have
(i)
\[ \lim_{n\to\infty} \mu\bigl( \bar\tau^n(\,\cdot\,, a) \in B \bigr) = \bar\gamma(B) \]
uniformly with respect to a ∈ I;
(ii)
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, s^a_k) = \bar\gamma(B) \quad \text{a.e. in } I \]
uniformly with respect to a ∈ I.
**

Proof. (i) For any θ ∈ I and B ∈ B_{I²} set
\[ h_n(\theta, B) = \mu\bigl( \bar\tau^n(\,\cdot\,, \theta) \in B \bigr), \quad n \in \mathbb{N}_+ . \]
By Fubini's theorem we have
\[ (\mu \otimes \lambda)\bigl( \bar\tau^{-n}(B) \bigr) = \int_{I^2} I_B(\bar\tau^n(\omega, \theta))\, \mu(d\omega)\, d\theta = \int_0^1 d\theta \int_0^1 I_B(\bar\tau^n(\omega, \theta))\, \mu(d\omega) = \int_0^1 \mu\bigl( \bar\tau^n(\,\cdot\,, \theta) \in B \bigr)\, d\theta = \int_0^1 h_n(\theta, B)\, d\theta . \]

Since μ ⊗ λ ≪ λ², it follows from Proposition 4.1.15 that
\[ \lim_{n\to\infty} \int_0^1 h_n(\theta, B)\, d\theta = \bar\gamma(B) \tag{4.1.10} \]

for any B ∈ B_{I²}. Now, note that—letting d denote the Euclidean distance in I²—by Theorem 1.2.2 we have
\[ d\bigl( \bar\tau^n(\omega, \theta), \bar\tau^n(\omega, a) \bigr) \le \max_{i^{(n)} \in \mathbb{N}^n_+} \lambda\bigl( I(i^{(n)}) \bigr) = \frac{1}{F_n F_{n+1}}, \quad n \in \mathbb{N}_+ , \tag{4.1.11} \]

for any θ, a ∈ I. Given ε > 0, let
\[ B^+_\varepsilon = \bigcup_{(x,y) \in B} D_\varepsilon(x, y), \]
where D_ε(x, y) is the open disk of radius ε centered at (x, y) ∈ I², and
\[ B^-_\varepsilon = \bigl\{ (x, y) \in B : D_\varepsilon(x, y) \subset B \bigr\} . \]

By (4.1.11), for n ≥ n_0(ε) large enough and any θ, a ∈ I we have
\[ \bigl( \omega : \bar\tau^n(\omega, \theta) \in B^-_\varepsilon \bigr) \subset \bigl( \omega : \bar\tau^n(\omega, a) \in B \bigr) \subset \bigl( \omega : \bar\tau^n(\omega, \theta) \in B^+_\varepsilon \bigr) . \tag{4.1.12} \]
On the other hand, for any n ∈ N and θ ∈ I we trivially have
\[ \bigl( \omega : \bar\tau^n(\omega, \theta) \in B^-_\varepsilon \bigr) \subset \bigl( \omega : \bar\tau^n(\omega, \theta) \in B \bigr) \subset \bigl( \omega : \bar\tau^n(\omega, \theta) \in B^+_\varepsilon \bigr) . \tag{4.1.13} \]
Hence
\[ -h_n\bigl( \theta, B^+_\varepsilon \setminus B^-_\varepsilon \bigr) \le h_n(\theta, B) - h_n(a, B) \le h_n\bigl( \theta, B^+_\varepsilon \setminus B^-_\varepsilon \bigr) \]

for any n ≥ n_0(ε) and θ, a ∈ I. Integrating the double inequality above over θ ∈ I yields
\[ \Bigl| \int_0^1 h_n(\theta, B)\, d\theta - h_n(a, B) \Bigr| \le \int_0^1 h_n\bigl( \theta, B^+_\varepsilon \setminus B^-_\varepsilon \bigr)\, d\theta \]
for any n ≥ n_0(ε) whatever a ∈ I. Finally, let first n → ∞ then ε → 0 in the last inequality. By (4.1.10) we obtain
\[ \limsup_{n\to\infty}\, \sup_{a \in I} \bigl| \bar\gamma(B) - h_n(a, B) \bigr| \le \lim_{\varepsilon \to 0} \bar\gamma\bigl( B^+_\varepsilon \setminus B^-_\varepsilon \bigr) = \bar\gamma(\partial B) = 0 \]

since λ²(∂B) = 0, and the proof of (i) is complete.

(ii) It is easy to check that (4.1.12) and (4.1.13) imply the inequalities
\[ I_{B^-_\varepsilon}(\tau^k, \bar s_k) \le I_B(\tau^k, s^a_k) \le I_{B^+_\varepsilon}(\tau^k, \bar s_k) \]
for any a ∈ I, (ω, θ) ∈ Ω × I, and any k ≥ n_0(ε) large enough. Also, we trivially have
\[ I_{B^-_\varepsilon}(\tau^k, \bar s_k) \le I_B(\tau^k, \bar s_k) \le I_{B^+_\varepsilon}(\tau^k, \bar s_k) \]
for any k ∈ N and (ω, θ) ∈ Ω × I. Hence
\[ \bigl| I_B(\tau^k, \bar s_k) - I_B(\tau^k, s^a_k) \bigr| \le I_{B^+_\varepsilon \setminus B^-_\varepsilon}(\tau^k, \bar s_k) \tag{4.1.14} \]
for any k ≥ n_0(ε), a ∈ I, and (ω, θ) ∈ Ω × I. By Proposition 4.1.13 we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, \bar s_k) = \bar\gamma(B) \]
and
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_{B^+_\varepsilon \setminus B^-_\varepsilon}(\tau^k, \bar s_k) = \bar\gamma\bigl( B^+_\varepsilon \setminus B^-_\varepsilon \bigr) \quad \text{a.e. in } I^2 . \]
Since λ²(∂B) = 0, we have
\[ \lim_{\varepsilon \to 0} \bar\gamma\bigl( B^+_\varepsilon \setminus B^-_\varepsilon \bigr) = \bar\gamma(\partial B) = 0 . \]
It is now easy to see that (4.1.14) and the last three equations imply the result stated. □

Remark. Theorem 4.1.16(i) has been proved by Barbolosi and Faivre (1995) while (ii) is implicit (or implicitly used) in many papers by Dutch authors. See, e.g., Bosma et al. (1983) or Jager (1986). □

Theorem 4.1.16 has a host of consequences. We state some of them.

Corollary 4.1.17 Let μ ∈ pr(B_I) such that μ ≪ λ. For any B ∈ B_{I²} such that λ²(∂B) = 0 we have
\[ \lim_{n\to\infty} \mu\bigl( (\tau^n, s^a_n) \in B \bigr) = \bar\gamma(B) \tag{4.1.15} \]
uniformly with respect to a ∈ I.

Proof. This is just a transcription of the result stated in Theorem 4.1.16(i), as
\[ \bar\tau^n(\omega, a) = (\tau^n(\omega), \bar s_n(\omega, a)) = (\tau^n(\omega), s^a_n(\omega)), \quad (\omega, a) \in \Omega \times I, \]
for any n ∈ N. □

Let us note that in Theorem 2.5.8 the (optimal) convergence rate in (4.1.15) has been obtained in the case where μ = γ_a for the class of rectangles B = [0, x] × [0, y], x, y ∈ I. Using this result we can prove

Proposition 4.1.18 Let B be a simply connected subset of I² such that ∂B = ∪_{i=1}^{m} ℓ_i for some m ∈ N_+, where either
\[ \ell_i := \bigl( (x, f_i(x)) : a_i \le x \le b_i \bigr) \]
with 0 ≤ a_i < b_i ≤ 1 and f_i : [a_i, b_i] → I continuous and monotone, or
\[ \ell_i := \bigl( (c_i, y) : a_i \le y \le b_i \bigr) \]
with c_i ∈ I and 0 ≤ a_i < b_i ≤ 1. Then
\[ \gamma_a\bigl( (\tau^n, s^a_n) \in B \bigr) = \bar\gamma(B) + O(g^n) \]
as n → ∞, where the constant implied in O depends on m and the quantities defining the ℓ_i, 1 ≤ i ≤ m. The proof in the case a = 0 can be found in Dajani and Kraaikamp (1994). □

By particularizing the set B in Corollary 4.1.17 and Proposition 4.1.18 we obtain results originally derived by ad hoc methods. We shall state some of them below, leaving the calculation details to the reader.

Corollary 4.1.19 For any μ ∈ pr(B_I) such that μ ≪ λ and any t ∈ I we have
\[ \lim_{n\to\infty} \mu(\Theta_n \le t) = H(t), \]

where H has been defined in Theorem 2.2.13. For μ = λ the convergence rate in the equation above is O(g^n) as n → ∞.

Proof. This follows from Corollary 4.1.17 with a = 0 and
\[ B = \Bigl\{ (x, y) \in I^2 : \frac{x}{xy+1} \le t \Bigr\}, \quad t \in I, \]
and Proposition 4.1.18, as Θ_n = τ^n/(s_n τ^n + 1), n ∈ N, by equation (1.3.7). Note that, however, Theorem 2.2.13 yields a better convergence rate! □

Corollary 4.1.20 For any μ ∈ pr(B_I) such that μ ≪ λ and any (t₁, t₂) ∈ I² we have
\[ \lim_{n\to\infty} \mu(\Theta_{n-1} \le t_1,\, \Theta_n \le t_2) = H(t_1, t_2), \]
where H is the distribution function with density
\[ h(t_1, t_2) = \begin{cases} \dfrac{1}{\log 2\, \sqrt{1 - 4 t_1 t_2}} & \text{if } t_1 \ge 0,\ t_2 \ge 0,\ t_1 + t_2 < 1, \\[1ex] 0 & \text{elsewhere.} \end{cases} \]
For μ = λ the convergence rate in the equation above is O(g^n) as n → ∞.

Proof. This follows from Corollary 4.1.17 with a = 0 and
\[ B = \Bigl\{ (x, y) \in I^2 : \frac{y}{xy+1} \le t_1,\ \frac{x}{xy+1} \le t_2 \Bigr\}, \quad (t_1, t_2) \in I^2, \]
and Proposition 4.1.18, as
\[ \Theta_{n-1} = \frac{s_n}{s_n \tau^n + 1}, \quad \Theta_n = \frac{\tau^n}{s_n \tau^n + 1}, \quad n \in \mathbb{N}, \]
by equation (1.3.7). □

Let us define random variables ρ_n and Θ̃_n by
\[ \rho_n(\omega) = \Bigl| \omega - \frac{p_{n+1}}{q_{n+1}} \Bigr| \Bigm/ \Bigl| \omega - \frac{p_n}{q_n} \Bigr|, \qquad \tilde\Theta_n(\omega) = q_n q_{n+1} \Bigl| \omega - \frac{p_n}{q_n} \Bigr|, \quad \omega \in \Omega,\ n \in \mathbb{N} . \]
It is easy to see that ρ_n = s_{n+1} τ^{n+1} and Θ̃_n = 1/(s_{n+1} τ^{n+1} + 1), so that Θ̃_n = 1/(ρ_n + 1), n ∈ N.

Corollary 4.1.21 For any μ ∈ pr(B_I) such that μ ≪ λ we have
\[ \lim_{n\to\infty} \mu(\rho_n \le t) = \frac{1}{\log 2} \Bigl( \log(t+1) - \frac{t \log t}{t+1} \Bigr), \quad t \in I, \]
and
\[ \lim_{n\to\infty} \mu(\tilde\Theta_n \le t) = \begin{cases} 0 & \text{if } 0 \le t \le 1/2, \\[1ex] \dfrac{\log\bigl( 2\, t^t (1-t)^{1-t} \bigr)}{\log 2} & \text{if } 1/2 \le t \le 1 . \end{cases} \]
For μ = λ the convergence rate in the equations above is O(g^n) as n → ∞. The proof is left to the reader. □
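Two quick numerical consistency checks on the limit laws just stated (sanity checks added here, not part of the original text): the density in Corollary 4.1.20 must integrate to 1 over {t₁, t₂ ≥ 0, t₁ + t₂ < 1} — the inner t₂-integral can be done in closed form, since 1 − 4t₁(1 − t₁) = (1 − 2t₁)² — and the distribution functions in Corollary 4.1.21 must take the values 0 and 1 at the endpoints of their supports.

```python
import math

def inner(t1):
    # ∫_0^{1-t1} dt2 / sqrt(1 - 4 t1 t2), in closed form; max() guards
    # against a tiny negative argument caused by floating-point rounding
    return (1 - math.sqrt(max(0.0, 1 - 4 * t1 * (1 - t1)))) / (2 * t1)

N = 100_000   # midpoint rule; grid size is an arbitrary choice
mass = sum(inner((k + 0.5) / N) for k in range(N)) / (N * math.log(2))
print(round(mass, 6))                       # total mass of the density

F_rho = lambda t: (math.log(t + 1) - t * math.log(t) / (t + 1)) / math.log(2)
F_theta = lambda t: math.log(2 * t**t * (1 - t)**(1 - t)) / math.log(2)
print(round(F_rho(1.0), 12), round(F_theta(0.5), 12), round(F_theta(1.0), 12))
```

The closed-form inner integral equals 1 for t₁ < 1/2 and (1 − t₁)/t₁ for t₁ > 1/2, so the total mass is (1/log 2)(1/2 + log 2 − 1/2) = 1, which the numerical quadrature confirms.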

For other results of the same type, which can be derived as before, we refer the reader to Bosma et al. (1983), Jager (1986), and Kraaikamp (1994).

Corollary 4.1.22 For any t, t₁, t₂ ∈ I the limits
\[ \lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{ k : \Theta_k \le t,\ 0 \le k \le n-1 \}, \]
\[ \lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{ k : \Theta_k \le t_1,\ \Theta_{k+1} \le t_2,\ 0 \le k \le n-1 \}, \]
\[ \lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{ k : \rho_k \le t,\ 0 \le k \le n-1 \}, \]
and
\[ \lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{ k : \tilde\Theta_k \le t,\ 0 \le k \le n-1 \} \]
all exist a.e. in I and are equal to the corresponding values of the limiting distribution functions occurring in Corollaries 4.1.19, 4.1.20, and 4.1.21, respectively.


The proof is immediate on account of Theorem 4.1.16(ii) and the corollaries referred to in the statement. □

Remarks. 1. It has been proved by Hensley (1998) that if (k_n)_{n∈N_+} is a strictly increasing sequence of positive integers, then for any t ∈ I we have
\[ \lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{ j : \Theta_{k_j} \le t,\ 0 \le j \le n-1 \} = H(t) \quad \text{a.e. in } I, \tag{4.1.16} \]
where H has been defined in Theorem 2.2.13. Corollary 4.1.22 only covers the case k_n = n, n ∈ N_+.

2. In the case k_n = n, n ∈ N_+, equation (4.1.16) was conjectured by H.W. Lenstra Jr. Actually, this conjecture is implicit in Doeblin (1940), which enables us to name it after both Doeblin and Lenstra. The Doeblin–Lenstra conjecture was proved by Bosma et al. (1983) by using, even if not explicitly, Theorem 4.1.16(ii) in a special case. □

Corollary 4.1.23 The equations
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k = \frac{1}{4 \log 2} = 0.36067\cdots, \]
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k \Theta_{k+1} = \frac{1}{6} \Bigl( 1 - \frac{1}{4 \log 2} \Bigr) = 0.10655\cdots, \]
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \rho_k = \frac{\pi^2}{12 \log 2} - 1 = 0.18656\cdots, \]
and
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \tilde\Theta_k = \frac{1}{2} + \frac{1}{4 \log 2} = 0.86067\cdots \]
all hold a.e. in I.

Proof. We consider just the first equation, leaving the calculation details to the reader, as the same idea underlies the proofs in the other cases. By Corollary 4.1.22 we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k) = H(t) \]
a.e. in I for any t ∈ I ∩ Q. Hence for any fixed ω ∈ Ω not belonging to the exceptional set the distribution function
\[ F_n(t) := \frac{1}{n} \sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k), \quad t \in I, \]
converges weakly to H as n → ∞. Consequently,
\[ \int_I t \, dF_n(t) = \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k \]
should converge to
\[ \int_I t \, dH(t) = \frac{1}{4 \log 2} \]

as n → ∞ for any ω ∈ Ω not belonging to the exceptional set, thus a.e. in I. While for the last two equations the reasoning is quite similar, in the case of the second equation we should consider two-dimensional distribution functions, and the value of the limit equals ∫_{I²} t₁t₂ dH(t₁, t₂). □

We turn now to limit properties of certain associated random variables. It follows from (4.1.6) that for any measurable real-valued function f on I such that ∫_I |f| dλ < ∞ we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\bar s_k) = \int_I f \, d\gamma \quad \text{a.e. in } I^2 . \tag{4.1.17} \]
From (4.1.17) we can derive a weaker result for the sequences (s^a_n)_{n∈N}, a ∈ I.

Theorem 4.1.24 Let f : I → R be continuous. Then for any a ∈ I we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} f(s^a_k) = \int_I f \, d\gamma \quad \text{a.e. in } I . \]

Proof. We have |s̄_k − s^a_k| ≤ (F_k F_{k+1})^{−1} for any k ∈ N, (ω, θ) ∈ Ω × I, and a ∈ I. The result then follows from (4.1.17) and the uniform continuity of f on I. □

Remarks. 1. The above result also follows from a theorem of Breiman (1960) on account of the Markov property of the sequences (s^a_n)_{n∈N}, a ∈ I.

2. The corresponding result for y^a_n = 1/s^a_n, n ∈ N_+, a ∈ I, can be easily stated. In this form it can be found in Elton (1987) and Grigorescu and Popescu (1989). □

Corollary 4.1.25 For any m ∈ N_+ and a ∈ I we have
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} (s^a_k)^m = \frac{1}{\log 2} \sum_{i \in \mathbb{N}_+} \frac{(-1)^{i-1}}{m+i} \quad \text{a.e. in } I . \]

In particular, for m = 1 the value of the limit is (1/log 2) − 1. The proof amounts to computing the integral
\[ \frac{1}{\log 2} \int_0^1 \frac{t^m}{t+1}\, dt , \]
which yields the result stated. □

Taking f(x) = log x, x ∈ I, in (4.1.17) and noting that
\[ \int_I \log x \, \gamma(dx) = \frac{1}{\log 2} \int_0^1 \frac{\log x}{x+1}\, dx = \frac{1}{\log 2} \Bigl( \log(x+1) \log x \Big|_0^1 - \int_0^1 \frac{\log(x+1)}{x}\, dx \Bigr) = -\frac{1}{\log 2} \sum_{k \in \mathbb{N}} \frac{(-1)^k}{k+1} \int_0^1 x^k \, dx = -\frac{1}{\log 2} \sum_{k \in \mathbb{N}} \frac{(-1)^k}{(k+1)^2} = -\frac{1}{\log 2} \Bigl( \zeta(2) - \frac{\zeta(2)}{2} \Bigr) = -\frac{\pi^2}{12 \log 2} , \]
we obtain
\[ \lim_{n\to\infty} \frac{1}{n} \log(\bar s_0 \bar s_1 \cdots \bar s_{n-1}) = -\frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega^2 \]
or, equivalently,
\[ \lim_{n\to\infty} \frac{1}{n} \log(\bar y_0 \bar y_1 \cdots \bar y_{n-1}) = \frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega^2 . \]
In the last equation we can give an estimate of the convergence rate. We have shown in Example 3.2.11 that
\[ \lim_{n\to\infty} \frac{1}{n}\, E_{\bar\gamma} \Bigl( \sum_{i=0}^{n-1} \Bigl( \log \bar y_i - \frac{\pi^2}{12 \log 2} \Bigr) \Bigr)^{2} > 0 .
\]

Then for any ε > 0 by Theorem 4.0.4 we obtain
\[ \frac{1}{n} \sum_{k=0}^{n-1} \log \bar y_k = -\frac{1}{n} \sum_{k=0}^{n-1} \log \bar s_k = \frac{\pi^2}{12 \log 2} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega^2 \tag{4.1.18} \]
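The alternating series appearing in the computation above is η(2) = Σ_{k≥0}(−1)^k/(k+1)² = π²/12, which fixes the value of ∫₀¹ log x/(x+1) dx. This is easy to confirm numerically (a pure sanity check, added here):

```python
import math

# partial sum of eta(2) = sum_{k>=0} (-1)^k/(k+1)^2; for an alternating
# series the error is bounded by the first omitted term, here < 2.5e-11
eta2 = sum((-1) ** k / (k + 1) ** 2 for k in range(200_000))
print(eta2, math.pi ** 2 / 12)
# hence ∫_0^1 log x/(x+1) dx = -pi^2/12, and E_gamma log = -pi^2/(12 log 2)
```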


as n → ∞, where the constant implied in o depends on ε and the current point (ω, θ) ∈ Ω². While we cannot take f(x) = log x, x ∈ I, in Theorem 4.1.24 since this is not a continuous function on I, we can however replace s̄_k by s^a_k, k ∈ N, a ∈ I, in (4.1.18) as shown below.

Theorem 4.1.26 For any a ∈ I we have
\[ \lim_{n\to\infty} \frac{1}{n} \log(s^a_1 s^a_2 \cdots s^a_n) = -\frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega . \]

More precisely, whatever ε > 0, for any a ∈ I we have
\[ \frac{1}{n} \log(s^a_1 s^a_2 \cdots s^a_n) = -\frac{\pi^2}{12 \log 2} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega \]
as n → ∞, where the constant implied in o depends on both ε and the current point ω ∈ Ω. In particular, for a = 0 the above equations amount to
\[ \lim_{n\to\infty} \sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} \quad \text{a.e. in } \Omega \tag{4.1.19} \]
and
\[ \sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega \tag{4.1.20} \]
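Equation (4.1.19) — Lévy's classical theorem — can be probed numerically. The sketch below (an illustration; the sampling scheme, seed, and tolerance are choices made here) extracts the continued fraction digits of a random high-precision fraction, rebuilds q_n by the standard recurrence q_n = a_n q_{n−1} + q_{n−2}, and compares q_n^{1/n} with e^{π²/(12 log 2)} ≈ 3.27582.

```python
import math
import random
from fractions import Fraction

def cf_digits(x: Fraction):
    digits, p, q = [], x.numerator, x.denominator
    while p:
        digits.append(q // p)
        p, q = q % p, p
    return digits

random.seed(7)
x = Fraction(random.getrandbits(4000), 2**4000)
d = cf_digits(x)[:1500]
n = len(d)

q_prev, q = 0, 1                 # q_{-1} = 0, q_0 = 1
for a in d:
    q_prev, q = q, a * q + q_prev

levy = math.exp(math.pi ** 2 / (12 * math.log(2)))     # ≈ 3.27582
val = math.exp(math.log(q) / n)  # q is huge, so take log first
print(val, levy)
```

Since the error term in (4.1.20) is only of order n^{−1/2} (up to logarithms), agreement to a few percent at n = 1500 is all one should expect.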

as n → ∞, respectively.

Proof. By the mean value theorem we have
\[ |\log x - \log y| \le \frac{|x - y|}{\min(x, y)} \]
for any 0 < x, y ≤ 1, x ≠ y. Next, note that
\[ 0 < v(i^{(k)}) - u(i^{(k)}) \le \max\Bigl( \frac{1}{F_{k-1} F_{k+1}}, \frac{1}{F_k^2} \Bigr) \]
for any fundamental interval I(i^{(k)}) = Ω ∩ (u(i^{(k)}), v(i^{(k)})), i^{(k)} ∈ N^k_+, k ∈ N_+. This follows easily from (1.1.12), (1.1.13), and Theorem 1.1.2. Consequently, for any k ∈ N_+ and a ∈ I we have
\[ |\log \bar s_k - \log s^a_k| \le \max\Bigl( \frac{1}{F_{k-1} F_{k+1}}, \frac{1}{F_k^2} \Bigr) = O(g^{2k}) \tag{4.1.21} \]


as k → ∞, whatever the current point (ω, θ) ∈ Ω × I. Clearly, by (4.1.18) and (4.1.21) the proof is complete for any a ∈ I. In the special case a = 0 we only should note that s^0_k = q_{k−1}/q_k, k ∈ N_+. □

Remark. The convergence rate in Theorem 4.1.26 with a = 0 is slightly better than that derived by Philipp (1967, p. 122). Equation (4.1.19) was first derived by Lévy (1929) using a different method. □

Corollary 4.1.27 We have
\[ \lim_{n\to\infty} \frac{1}{n} \log \Bigl| \omega - \frac{p_n}{q_n} \Bigr| = -\frac{\pi^2}{6 \log 2} \quad \text{a.e. in } \Omega \]
and, for any ε > 0,
\[ \frac{1}{n} \log \Bigl| \omega - \frac{p_n}{q_n} \Bigr| = -\frac{\pi^2}{6 \log 2} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega \]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.

Proof. It follows from (1.1.16) that for any ω ∈ Ω and n ∈ N we have
\[ \frac{1}{2 q_{n+1}^2} < \Bigl| \omega - \frac{p_n}{q_n} \Bigr| < \frac{1}{q_n^2} . \]
Then the results stated are immediate consequences of equations (4.1.19) and (4.1.20). □

Corollary 4.1.28 We have
\[ \lim_{n\to\infty} \frac{1}{n} \log \lambda(I(a_1, \cdots, a_n)) = -\frac{\pi^2}{6 \log 2} \quad \text{a.e. in } \Omega \]
and, for any ε > 0,
\[ \frac{1}{n} \log \lambda(I(a_1, \cdots, a_n)) = -\frac{\pi^2}{6 \log 2} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega \]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.

Proof. By (1.2.2) and (1.2.5) we have
\[ \log \lambda(I(a_1, \cdots, a_n)) = -2 \log q_n - \log(s_n + 1), \quad n \in \mathbb{N}_+ . \]


Since s_n ∈ I, the results stated are again immediate consequences of equations (4.1.19) and (4.1.20). □

Remark. The result above implies that the entropy H(τ) of the continued fraction transformation τ is equal to π²/6 log 2. See, e.g., Billingsley (1965, p. 134). □

Corollary 4.1.29 For any ε > 0 we have
\[ \sqrt[n]{p_n(\omega)} = \omega^{1/n}\, e^{\pi^2/(12 \log 2)} + o\bigl( n^{-1/2} \log^{(3+\varepsilon)/2} n \bigr) \quad \text{a.e. in } \Omega \]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω. The proof follows from the inequality
\[ \Bigl| \sqrt[n]{p_n(\omega)} - \sqrt[n]{\omega\, q_n(\omega)} \Bigr| \le \Bigl( \frac{1}{F_{n+1} F_n} \Bigr)^{(n-1)/n}, \quad \omega \in \Omega,\ n \in \mathbb{N}_+ , \]
which can be easily checked. □

Corollary 4.1.30 (Khinchin's fundamental theorem of Diophantine approximation) Let f : N_+ → R_{++}.
(i) If Σ_{i∈N_+} f(i) = ∞ and if i f(i) ≥ (i+1) f(i+1), i ∈ N_+, then a.e. in Ω the inequality
\[ \Bigl| \omega - \frac{p}{q} \Bigr| < \frac{f(q)}{q} \]
has infinitely many solutions in integers p, q ∈ N_+ with g.c.d.(p, q) = 1.
(ii) If Σ_{i∈N_+} f(i) < ∞, then a.e. in Ω the above inequality has at most finitely many solutions in integers p, q ∈ N_+ with g.c.d.(p, q) = 1.
The proof follows from Theorem 4.1.26 with a = 0 and F. Bernstein's theorem (Proposition 1.3.16). See, e.g., Billingsley (1965, p. 48). □
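For the divergent case (i), the choice f(i) = 1/i (so that Σ f(i) = ∞ and i f(i) = 1 is non-increasing) gives the classical statement that |ω − p/q| < 1/q² has infinitely many coprime solutions; indeed every RCF convergent satisfies it, since |ω − p_n/q_n| < 1/(q_n q_{n+1}) < 1/q_n². The sketch below checks this for the convergents of a high-precision rational approximation of √2 − 1 (the test point and the depth 50 are choices made here).

```python
from fractions import Fraction
import math

# high-precision rational approximation of sqrt(2) - 1
prec = 2000
s = math.isqrt(2 << (2 * prec))          # floor(sqrt(2) * 2^prec)
x = Fraction(s, 1 << prec) - 1

digits, p, q = [], x.numerator, x.denominator
while p and len(digits) < 50:
    digits.append(q // p)
    p, q = q % p, p

pm1, pn, qm1, qn = 1, 0, 0, 1            # p_{-1}=1, p_0=0, q_{-1}=0, q_0=1
for a in digits:
    pm1, pn = pn, a * pn + pm1
    qm1, qn = qn, a * qn + qm1
    # every convergent approximates better than 1/q^2
    assert abs(x - Fraction(pn, qn)) < Fraction(1, qn * qn)
print(digits[:8])
```

As expected for √2 − 1 = [0; 2, 2, 2, ⋯], the extracted digits are all equal to 2 at this precision.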

4.2 Other continued fraction expansions

4.2.1 Preliminaries

In this section we study a large class of continued fraction expansions which can be derived from the RCF expansion. Before deﬁning them formally let us brieﬂy describe the underlying idea.


The following rather old and well known remark is fundamental. For a ∈ Z, b ∈ N_+ and x ∈ [0, 1) we have
\[ a + \cfrac{1}{1 + \cfrac{1}{b + x}} = a + 1 + \frac{-1}{b + 1 + x} . \]

This operation is called a singularization. We have singularized the digit 1 in [ · ; ⋯, a, 1, b, ⋯ ]. The effect of a singularization is that a new and shorter continued fraction expansion is obtained. Moreover, we will see that the sequence of convergents associated with the 'new' continued fraction expansion is a subsequence of the sequence of convergents of the 'old' one. For example, given n ∈ N_+, if we singularize the digit a_{n+1}(ω) = 1 in the RCF expansion of some ω ∈ Ω, then the sequence of convergents of the 'new' continued fraction expansion is obtained by deleting the nth term from the sequence of RCF convergents of ω. Obviously, the 'new' continued fraction expansion is no longer an RCF expansion! Starting from the RCF expansion of a given x ∈ [0, 1) it is not possible (i) to singularize two consecutive digits equal to 1, or (ii) to singularize digits other than 1. It is also important to note that once we have singled out digits equal to 1 to be singularized, the order in which they are singularized has no impact on the final result. Of course, just one singularization does not make the new expansion 'really faster' than the old one. However, many algorithms can be devised such that for almost all x ∈ [0, 1) infinitely many convergents are skipped. Before considering such algorithms, let us fix notation. Let x ∈ [0, 1) with RCF expansion x = [a_1, a_2, ⋯]. Any finite or infinite string of consecutive digits
\[ a_k(x) = 1, \quad a_{k+1}(x) = 1, \quad \cdots, \quad a_{k+n-1}(x) = 1, \qquad k \in \mathbb{N}_+,\ n \in \mathbb{N}_+ \cup \{\infty\}, \]
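The singularization identity is a finite rational identity, so it can be verified exactly with fractions; the sketch below checks it on a small grid of sample values (the grid itself is an arbitrary choice).

```python
from fractions import Fraction

def lhs(a, b, x):
    # a + 1/(1 + 1/(b + x))
    return a + 1 / (1 + 1 / (b + x))

def rhs(a, b, x):
    # a + 1 + (-1)/(b + 1 + x)
    return a + 1 - 1 / (b + 1 + x)

for a in range(-3, 4):
    for b in range(1, 6):
        for x in (Fraction(0), Fraction(1, 3), Fraction(9, 10)):
            assert lhs(a, b, x) == rhs(a, b, x)   # exact rational equality
print("singularization identity verified")
```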

is called a 1-block if either $k = 1$ and $a_{k+n}(x) \neq 1$ (if $n$ is finite) or $k > 1$ and $a_{k-1}(x) \neq 1$, $a_{k+n}(x) \neq 1$ (if $n$ is finite). The first algorithm we consider is:

A For any $x \in [0, 1)$ singularize the first, third, fifth, etc., components in any 1-block.


Applying algorithm A to a (finite or infinite) RCF expansion $[a_1, a_2, \cdots]$ yields a (finite or infinite) continued fraction of the form
$$b_0 + \cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots}} \tag{4.2.1}$$

or $[b_0; e_1/b_1, e_2/b_2, \cdots]$, for short. In (4.2.1) we have $b_0 \in \{0, 1\}$, $b_n \in \mathbb{N}_+$, $e_n \in \{-1, 1\}$, and $b_n + e_{n+1} \ge 2$, $n \in \mathbb{N}_+$.

Example 4.2.1 Let $x = (-3 + \sqrt{17})/2 = 0.56155\cdots$. As a quadratic irrationality $x$ should have a periodic RCF expansion (see Subsection 1.1.3). We easily find that $x = [0; 1, 1, 3, 1, 1, 3, \cdots] = [0; \overline{1, 1, 3}]$. Applying algorithm A to the RCF expansion of $x$ yields $x = [1; -1/2, 1/4, -1/2, 1/4, \cdots]$ or $x = [1; \overline{-1/2, 1/4}]$, for short. □

By the very construction, the convergents
$$\frac{p^e_n}{q^e_n} := b_0 + \cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots + \cfrac{e_n}{b_n}}}, \qquad n = 1, 2, \cdots,$$

of (4.2.1) are a subset of the convergents of $[a_1, a_2, \cdots]$. Therefore in the case of an infinite RCF expansion we have
$$\lim_{n\to\infty} \frac{p^e_n}{q^e_n} = [a_1, a_2, \cdots].$$
Several questions naturally arise:

(i) Are there other algorithms yielding continued fraction expansions with the property above?

(ii) Does algorithm A always yield fastest continued fraction expansions? Closest expansions? (The precise meaning of these terms will be explained later. See Subsection 4.3.3. Informally, one would like the denominators $q^e_n$, $n \in \mathbb{N}_+$, to grow as fast as possible while the approximation coefficients associated with the new expansion are as small as possible.)

(iii) Is there an underlying ergodic transformation?

We can easily answer the first question. The second algorithm we consider is:

B For any $x \in [0, 1)$ singularize the last, third from last, fifth from last, etc., components in any 1-block.

Example 4.2.2 Let $x$ be as in Example 4.2.1. Applying algorithm B to the RCF expansion of $x$ yields $x = [0; 1/2, -1/4, 1/2, -1/4, \cdots]$, or $x = [0; \overline{1/2, -1/4}]$, for short. □
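The effect of algorithms A and B on Examples 4.2.1 and 4.2.2 can be reproduced mechanically. The sketch below (helper names are ours) encodes a CF as $a_0$ plus a list of pairs $(e_k, a_k)$ and applies the singularization rules made precise in Subsection 4.2.3 — add $e$ to the left neighbour, flip the sign of the following numerator, add 1 to the right neighbour — to a finite truncation of the RCF expansion of $x = (-3+\sqrt{17})/2$; the truncation makes the last singularized period incomplete, which is harmless here.

```python
from math import sqrt

def cf_value(a0, terms):
    """Value of a0 + e1/(a1 + e2/(a2 + ...)) for a finite CF."""
    v = 0.0
    for e, a in reversed(terms):
        v = e / (a + v)
    return a0 + v

def singularize(a0, terms, j):
    """Singularize digit a_j = 1 (1-indexed) in (a0, terms),
    where terms[k-1] = (e_k, a_k); requires e_{j+1} = +1."""
    e_j, a_j = terms[j - 1]
    assert a_j == 1 and terms[j][0] == 1
    if j == 1:
        a0 += e_j                        # a_0 <- a_0 + e_1
    else:
        e, a = terms[j - 2]
        terms[j - 2] = (e, a + e_j)      # a_{j-1} <- a_{j-1} + e_j
    _, a_next = terms[j]
    terms[j] = (-e_j, a_next + 1)        # e_{j+1} <- -e_j, a_{j+1} <- a_{j+1} + 1
    del terms[j - 1]
    return a0, terms

x = (-3 + sqrt(17)) / 2                  # RCF [0; 1, 1, 3, 1, 1, 3, ...]
rcf = [(1, a) for a in (1, 1, 3, 1, 1, 3)]

# Algorithm A: first member of each 1-block (positions 1 and 4),
# singularized right to left so that list indices stay valid.
a0, terms = 0, list(rcf)
for j in (4, 1):
    a0, terms = singularize(a0, terms, j)
print(a0, terms)                         # 1 [(-1, 2), (1, 4), (-1, 2), (1, 3)]
assert abs(cf_value(a0, terms) - cf_value(0, rcf)) < 1e-12

# Algorithm B: last member of each 1-block (positions 2 and 5).
b0, terms_b = 0, list(rcf)
for j in (5, 2):
    b0, terms_b = singularize(b0, terms_b, j)
print(b0, terms_b)                       # 0 [(1, 2), (-1, 4), (1, 2), (-1, 4)]
assert abs(cf_value(b0, terms_b) - cf_value(0, rcf)) < 1e-12
```

Note that both singularized expansions evaluate to exactly the same number as the truncated RCF, since singularization is an algebraic identity.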

Clearly, in general, algorithms A and B yield different results. Actually it is possible to show that, in a sense, one cannot do better than either of these algorithms. Since one can singularize just digits equal to 1, and since two consecutive 1's cannot both be singularized, it is not possible to go faster than either algorithm A or B. Slower algorithms are trivially at hand. Here is an example of such an algorithm:

C For any $x \in [0, 1)$ singularize all digits $a_{n+1}(x) = 1$ for which $\Theta_n(x) \ge 1/2$ (see Subsection 1.3.2), whatever $n \in \mathbb{N}$.

In Subsection 4.3.2 it is shown that algorithm C is well defined, that is, not in conflict with the requirements of the singularization procedure.

Example 4.2.3 Let $x$ be as in Example 4.2.1. A simple calculation shows that the first four digits equal to 1 in the RCF expansion of $x$ should not be singularized if we apply algorithm C to it. □

From this example it is clear that, in general, algorithm C does not yield expansions which are fastest. In Subsection 4.3.3 we will discuss an algorithm which yields both fastest and closest expansions. This algorithm was introduced by Selenius (1960) and—independently—by Bosma (1987), and is called the optimal continued fraction (OCF) expansion. Finally, in Subsection 4.2.5 we will answer question (iii) above.

4.2.2 Semi-regular continued fraction expansions

Apart from the RCF expansion there exist many so called semi-regular continued fraction expansions. To deﬁne the latter we start by deﬁning a continued fraction (CF) as a pair of two sets e = (ek )k∈M and (ak )k∈{0}∪M of


integers with $e_k \in \{-1, 1\}$ and $a_0 \in \mathbb{Z}$, $a_k \in \mathbb{N}_+$, $k \in M$, where either $M = \{k : 1 \le k \le n\}$ for some $n \in \mathbb{N}_+$ or $M = \mathbb{N}_+$. Next, for arbitrary indeterminates $x_i, y_i$, $1 \le i \le n$, $n \in \mathbb{N}_+$, write
$$[y_1/x_1] = \frac{y_1}{x_1}, \qquad [y_1/x_1, \cdots, y_n/x_n] = \frac{y_1}{x_1 + [y_2/x_2, \cdots, y_n/x_n]}, \quad n \ge 2.$$

If card $M = n \in \mathbb{N}_+$ then we say that the CF considered has length $n$ and assign it the value
$$[a_0; e_1/a_1, \cdots, e_n/a_n] := a_0 + [e_1/a_1, \cdots, e_n/a_n] = a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \ddots + \cfrac{e_n}{a_n}}} \in \mathbb{R} \cup \{-\infty, \infty\}.$$

If $M = \mathbb{N}_+$ then we say that the CF considered is infinite and look at it as the sequence $((e_k)_{1\le k\le n}, (a_k)_{0\le k\le n})_{n\in\mathbb{N}_+}$ of all finite CF's which are obtained by finite truncation. In both cases we can associate with a CF its convergents
$$\frac{p^e_0}{q^e_0} := a_0, \qquad \frac{p^e_k}{q^e_k} := [a_0; e_1/a_1, \cdots, e_k/a_k], \quad 1 \le k \le n,$$

for either some $n \in \mathbb{N}_+$ or any $n \in \mathbb{N}_+$, with $p^e_0 = a_0$, $q^e_0 = 1$, $p^e_k \in \mathbb{Z}$, $q^e_k \in \mathbb{N}_+$, g.c.d.$(|p^e_k|, q^e_k) = 1$, $1 \le k \le n$.

To ensure the convergence of the sequence of convergents of an infinite CF, which would enable us to speak of a CF expansion, additional conditions should be imposed on the $e_k$ and $a_k$, $k \in \mathbb{N}_+$. One possibility, yielding the so called semi-regular continued fraction (SRCF) expansion, is to ask that $e_{i+1} + a_i \ge 1$, $i \in \mathbb{N}_+$, and $e_{i+1} + a_i \ge 2$ infinitely often (in the infinite case). It can be shown that the sequence of convergents of an infinite SRCF expansion converges to an irrational number. See Tietze (1913) [cf. Perron (1954, §37)]. This will be written as

$$\lim_{k\to\infty} \frac{p^e_k}{q^e_k} := [a_0; e_1/a_1, e_2/a_2, \cdots].$$
As in the RCF expansion case a matrix theory is associated with an SRCF expansion (or, more generally, with a CF). Consider (cf. Remark

preceding Proposition 1.1.1) the matrices
$$A^e_0 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & a_0 \end{pmatrix} = \begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}, \qquad A^e_n := \begin{pmatrix} 0 & e_n \\ 1 & a_n \end{pmatrix}, \quad n \in \mathbb{N}_+,$$
and
$$M^e_n := A^e_0 \cdots A^e_n, \quad n \in \mathbb{N}.$$
Clearly,
$$\det M^e_0 = 1, \qquad \det M^e_n = (-1)^n e_1 \cdots e_n, \quad n \in \mathbb{N}_+. \tag{4.2.2}$$

One can prove that
$$M^e_n = \begin{pmatrix} p^e_{n-1} & p^e_n \\ q^e_{n-1} & q^e_n \end{pmatrix}, \quad n \in \mathbb{N}, \tag{4.2.3}$$
with $p^e_{-1} = 1$, $q^e_{-1} = 0$, which implies that the sequences $(p^e_n)_{n\in\mathbb{N}}$ and $(q^e_n)_{n\in\mathbb{N}}$ satisfy the recurrence relations
$$p^e_n = a_n p^e_{n-1} + e_n p^e_{n-2}, \qquad q^e_n = a_n q^e_{n-1} + e_n q^e_{n-2}, \quad n \in \mathbb{N}_+.$$
The second equation above implies at once that
$$s^e_n := \frac{q^e_{n-1}}{q^e_n} = [1/a_n, e_n/a_{n-1}, \cdots, e_2/a_1], \quad n > 1, \tag{4.2.4}$$
and clearly $s^e_1 := q^e_0/q^e_1 = 1/a_1$. It follows from (4.2.2) and (4.2.3) that
$$p^e_{-1} q^e_0 - p^e_0 q^e_{-1} = 1, \qquad p^e_{n-1} q^e_n - p^e_n q^e_{n-1} = (-1)^n e_1 \cdots e_n, \quad n \in \mathbb{N}_+,$$
showing that indeed g.c.d.$(|p^e_n|, q^e_n) = 1$, $n \in \mathbb{N}$.

Next (see again the RCF expansion case), looking at $M^e_n$ as a Möbius transformation one can show that
$$M^e_n(0) = \frac{p^e_n}{q^e_n}, \quad n \in \mathbb{N}.$$
More generally,
$$M^e_n(z) = \frac{p^e_n + z\, p^e_{n-1}}{q^e_n + z\, q^e_{n-1}} = [a_0; e_1/a_1, \cdots, e_{n-1}/a_{n-1}, e_n/(a_n + z)], \quad n \ge 2,$$
for any $z \in \mathbb{C}$, $z \neq -1/s^e_n$, and
$$M^e_1(z) = a_0 + \frac{e_1}{a_1 + z} = \frac{p^e_1 + z\, p^e_0}{q^e_1 + z\, q^e_0}$$


for any $z \in \mathbb{C}$, $z \neq -1/s^e_1$. It follows that putting $t^e_n = [e_{n+1}/a_{n+1}, \cdots]$, $n \in \mathbb{N}$, we have
$$a_0 + t^e_0 = \frac{p^e_n + t^e_n\, p^e_{n-1}}{q^e_n + t^e_n\, q^e_{n-1}}, \quad n \in \mathbb{N}.$$
Finally, defining

$$\Theta^e_n(a_0 + t^e_0) = (q^e_n)^2 \left| a_0 + t^e_0 - \frac{p^e_n}{q^e_n} \right|, \quad n \in \mathbb{N},$$
it is easy to check that
$$\Theta^e_n(a_0 + t^e_0) = \frac{e_{n+1}\, t^e_n}{s^e_n t^e_n + 1} = \frac{|t^e_n|}{s^e_n t^e_n + 1}, \quad n \in \mathbb{N}. \tag{4.2.5}$$
Since $(t^e_n)^{-1} = e_{n+1}(a_{n+1} + t^e_{n+1})$ and $(s^e_{n+1})^{-1} = e_{n+1} s^e_n + a_{n+1}$, $n \in \mathbb{N}$, we also have
$$\Theta^e_n(a_0 + t^e_0) = \frac{s^e_{n+1}}{s^e_{n+1} t^e_{n+1} + 1}, \quad n \in \mathbb{N}. \tag{4.2.6}$$

The numbers $\Theta^e_n$, $n \in \mathbb{N}$, associated with a (finite or infinite) SRCF expansion are called its approximation coefficients. Compare with the RCF expansion case in Subsection 1.3.2. We conclude this subsection with a few examples of well known SRCF expansions.

1. The RCF expansion: this is the SRCF expansion for which $e_n = 1$ for any $n \in \mathbb{N}_+$.

2. Nakada's α-expansions for $\alpha \in [1/2, 1]$: see Subsection 4.3.1.

3. The nearest integer continued fraction (NICF) expansion: this is the SRCF expansion for which $e_{n+1} + a_n \ge 2$ for any $n \in \mathbb{N}_+$. It was introduced by Minnigerode (1873) and studied by Hurwitz (1889). Actually, the NICF expansion is the 1/2-expansion, and is obtained by applying algorithm A defined in Subsection 4.2.1 to the RCF expansion.


4. The singular continued fraction (SCF) expansion: this is the SRCF expansion for which $e_n + a_n \ge 2$, $n \in \mathbb{N}_+$. It was introduced by Hurwitz (1889). Actually, the SCF expansion is the g-expansion with $g = (\sqrt{5} - 1)/2$, the golden ratio, and is obtained by applying algorithm B defined in Subsection 4.2.1 to the RCF expansion.

5. Minkowski's diagonal continued fraction (DCF) expansion: this is the SRCF expansion which is obtained by applying algorithm C defined in Subsection 4.2.1 to the RCF expansion. See Subsection 4.3.2.

6. The continued fraction with odd incomplete quotients (Odd CF) expansion: this is the SRCF expansion for which $e_1 = 1$, $a_n \equiv 1 \bmod 2$, $e_{n+1} + a_n \ge 2$, $n \in \mathbb{N}_+$. It was introduced by Rieger (1981a) [see also Barbolosi (1990), Hartono and Kraaikamp (2002), and Schweiger (1995, Ch. 3)].

7. The continued fraction with even incomplete quotients (Even CF) expansion: this is the SRCF expansion for which $e_1 = 1$, $a_n \equiv 0 \bmod 2$, $e_{n+1} + a_n \ge 2$, $n \in \mathbb{N}_+$. See also Kraaikamp and Lopes (1996) and Schweiger (1995, Ch. 3).
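The recurrences recalled above ($p^e_n = a_n p^e_{n-1} + e_n p^e_{n-2}$, and likewise for $q^e_n$) are easy to exercise numerically. The sketch below (helper names are ours) computes the convergents of the NICF expansion from Example 4.2.1 and checks the determinant identity following (4.2.3), the coprimality of $p^e_n$ and $q^e_n$, and convergence to $x$.

```python
from math import gcd, sqrt

def srcf_convergents(a0, terms):
    """Convergents p_n/q_n of a0 + e1/(a1 + e2/(a2 + ...)), computed by
    p_n = a_n p_{n-1} + e_n p_{n-2},  q_n = a_n q_{n-1} + e_n q_{n-2}."""
    p = [1, a0]   # p_{-1}, p_0
    q = [0, 1]    # q_{-1}, q_0
    for e, a in terms:
        p.append(a * p[-1] + e * p[-2])
        q.append(a * q[-1] + e * q[-2])
    return p, q

x = (-3 + sqrt(17)) / 2
# NICF expansion of x from Example 4.2.1: [1; -1/2, 1/4, -1/2, 1/4, ...]
terms = [(-1, 2), (1, 4)] * 4
p, q = srcf_convergents(1, terms)

det = 1                                   # will hold (-1)^n e_1 ... e_n
for n, (e, a) in enumerate(terms, start=1):
    det *= -e
    # p_{n-1} q_n - p_n q_{n-1} = (-1)^n e_1 ... e_n, cf. (4.2.2)-(4.2.3)
    assert p[n] * q[n + 1] - p[n + 1] * q[n] == det
    assert gcd(abs(p[n + 1]), q[n + 1]) == 1
    theta = q[n + 1] ** 2 * abs(x - p[n + 1] / q[n + 1])
    assert theta < 1                      # approximation coefficients stay small
assert abs(x - p[-1] / q[-1]) < 1e-6      # the convergents approach x
```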

4.2.3 The singularization process

The following two easily checked identities are fundamental for the theory which we develop in this section:
$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & c \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & b \end{pmatrix} = \begin{pmatrix} 1 & a+c \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & -c \\ 1 & b+1 \end{pmatrix}, \tag{4.2.7}$$
$$\begin{pmatrix} 0 & c \\ 1 & a \end{pmatrix}\begin{pmatrix} 0 & d \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & b \end{pmatrix} = \begin{pmatrix} 0 & c \\ 1 & a+d \end{pmatrix}\begin{pmatrix} 0 & -d \\ 1 & b+1 \end{pmatrix}, \tag{4.2.8}$$

where $a$, $b$, $c$ and $d$ are arbitrary real or complex numbers.

Let
$$(e_k)_{k \in M}, \quad (a_k)_{k \in \{0\} \cup M} \tag{4.2.9}$$
be a (finite or infinite) CF with $a_{\ell+1} = 1$, $e_{\ell+2} = 1$ for some $\ell \in \mathbb{N}$ for which $\ell + 2 \in M$. The transformation $\sigma$ which takes (4.2.9) into the CF
$$(e'_k)_{k \in M \setminus \{\ell+1\}}, \quad (a'_k)_{k \in \{0\} \cup (M \setminus \{\ell+1\})} \tag{4.2.10}$$

with
$$e'_k = e_k,\ k \in M,\ k < \ell+1 \text{ or } k \ge \ell+3, \qquad e'_{\ell+2} = -e_{\ell+1},$$
$$a'_k = a_k,\ k \in \{0\} \cup M,\ k < \ell \text{ or } k \ge \ell+3, \qquad a'_\ell = a_\ell + e_{\ell+1}, \qquad a'_{\ell+2} = a_{\ell+2} + 1,$$
is called a singularization of the pair $(a_{\ell+1}, e_{\ell+2})$.

Let $(p^e_k/q^e_k)_{k \in \{0\} \cup M}$ and $(p^{e'}_k/q^{e'}_k)_{k \in \{0\} \cup (M \setminus \{\ell+1\})}$ be the sets of convergents associated with (4.2.9) and (4.2.10), respectively. We are going to derive the relationship between these sets. Let $(M^e_k)_{k \in \{0\} \cup M}$ and $(M^{e'}_k)_{k \in \{0\} \cup (M \setminus \{\ell+1\})}$ be the sets of matrices defined in the preceding subsection, associated with (4.2.9) and (4.2.10), respectively. We have
$$\begin{pmatrix} p^{e'}_k \\ q^{e'}_k \end{pmatrix} = M^{e'}_k \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad k \in \{0\} \cup (M \setminus \{\ell + 1\}).$$

Clearly, $M^{e'}_k = M^e_k$ for $k < \ell$ and, moreover, by (4.2.7) and (4.2.8) we have $M^{e'}_k = M^e_{k+1}$ for $k \ge \ell+1$. The matrix $M^{e'}_\ell$ will then be given by
$$M^{e'}_\ell = M^{e'}_{\ell-1}\begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell + e_{\ell+1} \end{pmatrix} = M^e_{\ell-1}\begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell + e_{\ell+1} \end{pmatrix}$$
with $M^e_{-1} := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $e_0 = 1$. Hence
$$M^{e'}_\ell\begin{pmatrix} 0 \\ 1 \end{pmatrix} = M^e_{\ell-1}\begin{pmatrix} e_\ell \\ a_\ell + e_{\ell+1} \end{pmatrix} = M^e_{\ell-1}\begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell \end{pmatrix}\begin{pmatrix} 0 & e_{\ell+1} \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = M^e_{\ell+1}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Therefore
$$\begin{pmatrix} p^{e'}_\ell \\ q^{e'}_\ell \end{pmatrix} = \begin{pmatrix} p^e_{\ell+1} \\ q^e_{\ell+1} \end{pmatrix},$$

and we can state the following result.

Proposition 4.2.4 Let $\ell \in \mathbb{N}$ such that $\ell + 2 \in M$. The set of convergents $(p^{e'}_k/q^{e'}_k)_{k \in \{0\} \cup (M \setminus \{\ell+1\})}$ resulting after the singularization $\sigma$ of the pair $(a_{\ell+1}, e_{\ell+2}) = (1, 1)$ is obtained by deleting $p^e_\ell/q^e_\ell$ from the set $(p^e_k/q^e_k)_{k \in \{0\} \cup M}$.

In what follows a singularization process will consist of a set S of continued fractions and a rule which determines in an unambiguous way the pairs $a_{\ell+1} = 1$, $e_{\ell+2} = 1$ that should be singularized for any member of S.


Remark. For an inﬁnite CF the sequence of convergents of the ‘new’ CF obtained after singularization, is a subsequence of the sequence of convergents of the ‘old’ one. Therefore if the ‘old’ CF converged to x, so does the ‘new’ one, and it converges faster. In particular, this holds for any SRCF expansion to be singularized.

4.2.4 S-expansions

From now on we will concentrate on one special singularization process. The set S of continued fraction expansions to be singularized is the set of all (finite or infinite) RCF expansions. Since in this case all the $e$'s are +1, we will speak of singularizing $a_{\ell+1} = 1$ instead of singularizing the pair $a_{\ell+1} = 1$, $e_{\ell+2} = 1$. Before describing the general rule (as we should according to the definition just given) remark that Example 4.2.1 actually describes a singularization process: S plus algorithm A yield the NICF expansion! Now, notice that algorithm A is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_A$, $\ell \in \mathbb{N}$,

where (cf. Subsection 1.3) $\tau^\ell = [a_{\ell+1}, a_{\ell+2}, \cdots]$, $s_\ell = [a_\ell, \cdots, a_1]$, with $s_0 = 0$, and
$$S_A = [1/2, 1) \times [0, g] \subset I^2.$$
We recall that the golden ratios $g$ and $G$ are defined as
$$g = \frac{\sqrt{5} - 1}{2}, \qquad G = g + 1.$$

Similarly, we can verify that algorithm B—leading to Hurwitz' SCF expansion—is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_B$, $\ell \in \mathbb{N}$,

where
$$S_B = [g, 1) \times I \subset I^2.$$
Finally, using properties of the approximation coefficients $\Theta_n$, $n \in \mathbb{N}$, defined in Subsection 1.3.2, we can also show that algorithm C—leading to Minkowski's DCF expansion—is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_C$, $\ell \in \mathbb{N}$,

where
$$S_C = \left\{ (x, y) \in I^2 : \frac{x}{xy + 1} \ge \frac{1}{2} \right\}.$$

These three examples lead to the idea of prescribing by a subset $S \subset I^2$ which digits $1 = a_{\ell+1}$ are to be singularized in the RCF expansion, in the form of the condition $(\tau^\ell, s_\ell) \in S$, $\ell \in \mathbb{N}$. Such an S cannot be just any set but must satisfy the conditions
$$S \subset [1/2, 1) \times I,$$
since otherwise $a_{\ell+1}$ would not be equal to 1, and
$$S \cap \bar\tau(S) \subset \{(g, g)\},$$
since otherwise one would be forced to singularize two consecutive digits both equal to 1, which is impossible. Thus we are led—in a natural way—to the following definition which exactly describes all S-expansions.

Definition 4.2.5 A subset S of $I^2$ is said to be a singularization area if and only if

(i) $S \in \mathcal{B}_{I^2}$ and $\bar\gamma(\partial S) = 0$;

(ii) $S \subset [1/2, 1) \times I$;

(iii) $S \cap \bar\tau(S) \subset \{(g, g)\}$.

If S is a singularization area, then the S-expansion of $\omega \in \Omega$ is defined as the SRCF expansion converging to ω which is obtained from the RCF expansion of ω by singularizing a digit $1 = a_{\ell+1} = a_{\ell+1}(\omega)$ if and only if $(\tau^\ell, s_\ell) \in S$, whatever $\ell \in \mathbb{N}$.

Remarks. 1. We need the continuity condition $\bar\gamma(\partial S) = 0$ in order to be able to draw the following conclusion. Let $A(S, n)$ be the random variable defined as
$$A(S, n) = \operatorname{card}\{j : (\tau^j, s_j) \in S,\ 1 \le j \le n\}, \quad n \in \mathbb{N}_+.$$
By Theorem 4.1.16(ii) we then have
$$\lim_{n\to\infty} \frac{A(S, n)}{n} = \bar\gamma(S) \quad \text{a.e.}$$


2. Actually, the sets $S_A$ and $S_B$ do not satisfy condition (iii). Indeed, in both cases, $S \cap \bar\tau(S)$ is a line segment. Of course, this can be easily repaired by taking
$$S^*_A = ([1/2, g] \times [0, g]) \cup ((g, 1) \times [0, g))$$
and
$$S^*_B = ([g, 1) \times [0, g]) \cup ((g, 1) \times (g, 1])$$
instead of $S_A$ and $S_B$, respectively.

3. Since
$$\gamma([1/2, 1) \times I) = (\log 2)^{-1} \log \frac{4}{3} = 0.41503\cdots,$$
a singularization area S never can have γ-measure greater than 0.41503···. But condition (iii) forces the maximal possible γ-measure of a singularization area S to be essentially smaller than 0.41503···, as shown below.

Proposition 4.2.6 For any singularization area S we have
$$\gamma(S) \le 1 - \frac{\log G}{\log 2},$$
where the bound is sharp.

Proof. Define $M_1 = S^*_A$ with $S^*_A$ as before and $M_2 = ([0, g) \times (g, 1]) \cup ([g, 1) \times [g, 1])$. It is easy to check that $M_2 = \bar\tau(M_1)$ and
$$\gamma(M_1) = \gamma(M_2) = 1 - \frac{\log G}{\log 2} = 0.30575\cdots.$$
Next, put $S_1 = S \cap M_1$ and $S_2 = S \cap M_2$. Clearly,
$$\bar\tau(S_1) \cup S_2 \subset M_2$$
and by Definition 4.2.5(iii) we have
$$\bar\tau(S_1) \cap S_2 \subset \{(g, g)\},$$
see also Figure 4.1. We now see that
$$\gamma(S) = \gamma(S_1) + \gamma(S_2) = \gamma(\bar\tau(S_1)) + \gamma(S_2) = \gamma(\bar\tau(S_1) \cup S_2) \le \gamma(M_2) = 1 - \frac{\log G}{\log 2}.$$
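The bound of Proposition 4.2.6 can be checked numerically. For a rectangle $[a,b] \times [c,d]$, integrating the density $1/((xy+1)^2 \log 2)$ gives the closed form used in the sketch below (our helper names); with it one verifies both the measure of the strip in Remark 3 and that $S^*_A$ attains the bound $1 - \log G/\log 2$.

```python
from math import log, sqrt

def gauss_bar(a, b, c, d):
    """Extended Gauss measure of [a,b] x [c,d]: the integral of
    1/((xy+1)^2 log 2) over the rectangle, in closed form."""
    return log((b * d + 1) * (a * c + 1) / ((b * c + 1) * (a * d + 1))) / log(2)

g = (sqrt(5) - 1) / 2
G = g + 1

# Remark 3: the strip [1/2, 1) x I has measure 0.41503...
assert abs(gauss_bar(0.5, 1, 0, 1) - 0.41503) < 1e-4

# Up to a null boundary, S_A^* is the rectangle [1/2, 1] x [0, g]; its
# measure attains the bound 1 - log G / log 2 = 0.30575... of Proposition 4.2.6
# (analytically this reduces to the golden-ratio identity G^2 = G + 1).
bound = 1 - log(G) / log(2)
assert abs(gauss_bar(0.5, 1, 0, g) - bound) < 1e-9
```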

Figure 4.1: $S = S_1 \cup S_2$ and $\bar\tau(S_1)$.

That a singularization area actually can have $\bar\gamma$-measure $1 - (\log 2)^{-1} \log G$ is shown by the cases of $S^*_A$ and $S^*_B$. □

On account of Proposition 4.2.6, a singularization area S will be called maximal if
$$\gamma(S) = 1 - \frac{\log G}{\log 2}.$$
Given a singularization area S, let $B_S$ be a subset of $I^2$ such that whatever $\omega = [a_1, a_2, \cdots] \in \Omega$ any digit $1 = a_{\ell+1} = a_{\ell+1}(\omega)$ is unchanged by S-singularization if and only if $(\tau^\ell, s_\ell) \in B_S$, $\ell \in \mathbb{N}$. Clearly, such a set—which determines the occurrence of digits equal to 1 in the S-expansion—should have the following properties:

(1) $B_S \subset [1/2, 1) \times I$ since $a_{\ell+1} = 1$;

(2) $B_S \cap S = \emptyset$ since $a_{\ell+1} = 1$ is not singularized;

(3) $\bar\tau^{-1}(B_S) \cap S = \emptyset$ since $a_\ell$ is not singularized;

(4) $\bar\tau(B_S) \cap S = \emptyset$ since $a_{\ell+2}$ is not singularized.

On account of the considerations above, the subset $B_S$ of $I^2$ defined as
$$B_S = ([1/2, 1) \times I) \setminus (S \cup \bar\tau^{-1}(S) \cup \bar\tau(S))$$
is called the preservation area of 1's. We have the following result.


Proposition 4.2.7 If S is maximal, then $\gamma(B_S) = 0$. In general, the converse of this statement does not hold.

Proof. Let $M_1$, $M_2$, $S_1$ and $S_2$ be as in the proof of Proposition 4.2.6. Put moreover $B_1 = B_S \cap M_1$, $B_2 = B_S \cap M_2$. It is now easy to see that
$$\bar\tau(B_1) \cap (\bar\tau(S_1) \cup S_2) = \emptyset, \qquad \bar\tau(B_1) \cup \bar\tau(S_1) \cup S_2 \subset M_2,$$
$$B_2 \cap (\bar\tau(S_1) \cup S_2) = \emptyset, \qquad B_2 \cup \bar\tau(S_1) \cup S_2 \subset M_2.$$
Hence, since S is maximal,
$$\bar\gamma(B_2) = 0, \qquad \bar\gamma(B_1) = \bar\gamma(\bar\tau(B_1)) = 0,$$
which completes the proof. (The reader is invited to give an example where the converse does not hold.) □

We conclude this subsection by deriving a number of results, which are obtained as easy spin-off. Let S be a singularization area and $\omega \in \Omega$. As the sequence $(p^e_k/q^e_k)_{k\in\mathbb{N}_+}$ of S-convergents of ω is a subsequence of the sequence $(p_n/q_n)_{n\in\mathbb{N}_+}$ of its RCF convergents, there exists an increasing random function $n_S : \mathbb{N}_+ \to \mathbb{N}_+$ such that
$$\frac{p^e_k}{q^e_k} = \frac{p_{n_S(k)}}{q_{n_S(k)}}, \quad k \in \mathbb{N}_+.$$

Theorem 4.2.8 Let S be a singularization area. Then
$$\lim_{k\to\infty} \frac{n_S(k)}{k} = \frac{1}{1 - \gamma(S)} \quad \text{a.e.}$$

Proof. It follows from the definition of $n_S$ that
$$n_S(k) = k + \sum_{j=1}^{n_S(k)} I_S(\tau^j, s_j).$$
Since $\bar\gamma(\partial S) = 0$, by Theorem 4.1.16(ii) we have
$$1 = \lim_{k\to\infty} \frac{k}{n_S(k)} + \lim_{k\to\infty} \frac{1}{n_S(k)} \sum_{j=1}^{n_S(k)} I_S(\tau^j, s_j) = \lim_{k\to\infty} \frac{k}{n_S(k)} + \bar\gamma(S) \quad \text{a.e.,}$$

whence the result stated. □

Remark. Theorem 4.2.8 implies that
$$\lim_{k\to\infty} \frac{n_S(k)}{k} \le \frac{\log 2}{\log G} = 1.4404\cdots \quad \text{a.e.,}$$
the upper bound being attained if and only if S is maximal. In words: sparsest sequences of S-convergents are given by maximal singularization areas. As the singularization area $S^*_A$ which yields the NICF is maximal, we have thus re-proved a theorem of Adams (1979); see also Jager (1982) and Nakada (1981). □

The following corollary gives the S-expansion analogues of two classical results of P. Lévy in Subsection 4.1.3.

Corollary 4.2.9 Let S be a singularization area and let $(p^e_k/q^e_k)_{k\in\mathbb{N}_+}$ be the corresponding sequence of S-convergents. Then
$$\lim_{k\to\infty} \frac{1}{k} \log q^e_k = \frac{1}{1 - \bar\gamma(S)} \frac{\pi^2}{12 \log 2} \quad \text{a.e.,}$$
$$\lim_{k\to\infty} \frac{1}{k} \log \left| \omega - \frac{p^e_k}{q^e_k} \right| = \frac{-\pi^2}{(1 - \bar\gamma(S))\, 6 \log 2} \quad \text{a.e.}$$

Proof. This is an immediate consequence of Theorems 4.1.26 and 4.2.8. We have
$$\lim_{k\to\infty} \frac{1}{k} \log q^e_k = \lim_{k\to\infty} \frac{n_S(k)}{k} \cdot \frac{1}{n_S(k)} \log q_{n_S(k)} = \frac{1}{1 - \bar\gamma(S)} \frac{\pi^2}{12 \log 2} \quad \text{a.e.,}$$
and similarly for the second equation. □
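The a.e. limit $n_S(k)/k \to \log 2/\log G$ from the Remark following Theorem 4.2.8 can be observed empirically. The sketch below (our helper names) iterates the Gauss map in floating point — which is only statistically faithful to the RCF digits, but that suffices for a frequency count — and counts the digits that algorithm A would singularize.

```python
import random
from math import log, sqrt

def rcf_digits(x, n):
    """First (at most) n RCF digits of x, iterated in floating point."""
    digits = []
    for _ in range(n):
        if x == 0:
            break
        x = 1 / x
        a = int(x)
        digits.append(a)
        x -= a
    return digits

random.seed(1)
digits = rcf_digits(random.random(), 100_000)
N = len(digits)

# Algorithm A singularizes the 1st, 3rd, 5th, ... members of each 1-block,
# i.e. ceil(m/2) digits out of a 1-block of length m.
skipped, run = 0, 0
for a in digits:
    if a == 1:
        run += 1
    else:
        skipped += (run + 1) // 2
        run = 0
skipped += (run + 1) // 2

ratio = N / (N - skipped)                 # empirical n_S(k)/k for the NICF
bound = log(2) / log((1 + sqrt(5)) / 2)   # log 2 / log G = 1.4404...
print(round(ratio, 3), round(bound, 4))   # the two values nearly agree
assert abs(ratio - bound) < 0.02
```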

By the mechanism of singularization the collection of RCF convergents that are deleted to obtain the S-convergents has the same cardinality as the set of the $e_\ell$, $\ell \in \mathbb{N}_+$, which are equal to −1. It is easy to see that
$$n_S(k) - k = \frac{1}{2}\left( k - \sum_{\ell=1}^{k} e_\ell \right).$$
Therefore we can state the following result.

Corollary 4.2.10 We have
$$\lim_{k\to\infty} \frac{1}{k} \sum_{\ell=1}^{k} e_\ell = \frac{1 - 3\gamma(S)}{1 - \gamma(S)} \quad \text{a.e.}$$


The minimum of the limit above is attained if and only if S is maximal, and is equal to
$$\frac{1}{\log G} \log \frac{G^3}{4} = 0.11915\cdots.$$
We conclude this subsection by giving the S-expansion analogue of Legendre's theorem—see Corollary 1.2.4.

Theorem 4.2.11 Let
$$A(t) = \left\{ (x, y) \in I^2 : \frac{x}{xy + 1} < t,\ y \in \mathbb{Q} \right\}, \quad 0 < t \le 1,$$
and define
$$c_S = \sup\{ t \in (0, 1] : A(t) \cap S = \emptyset \}.$$
Put $L_S = \min(c_S, 1/2)$. Let $\omega \in \Omega$ and $p, q \in \mathbb{N}_+$ with g.c.d.$(p, q) = 1$, $p < q$. If
$$\Theta = \Theta(\omega, p/q) = q^2 \left| \omega - \frac{p}{q} \right| < L_S,$$
then $p/q$ is an S-convergent of ω. The constant $L_S$ is best possible.

Proof. Suppose that $\Theta(\omega, p/q) < L_S$ and that $p/q$ is not an S-convergent of ω. Since $L_S \le 1/2$, $p/q$ is an RCF convergent of ω by Corollary 1.2.4, i.e., there exists $n \in \mathbb{N}_+$ such that $p/q = p_n/q_n$. Now, since $p_n/q_n$ is not an S-convergent, by the very definition of an S-expansion we have $(\tau^n, s_n) \in S$. The definition of $L_S$ then implies
$$\frac{\tau^n}{s_n \tau^n + 1} \ge c_S \ge L_S,$$
which by the definition of the approximation coefficients in Subsection 1.3.2 yields $\Theta(\omega, p/q) = \Theta_n \ge L_S$, contrary to the hypothesis. Finally, it follows from the definition of $L_S$ and Corollary 1.2.4 that $L_S$ is best possible. □

Remarks. 1. Rieger (1979) and Adams (1979) gave a proof of Corollary 4.2.10 for the special case of the NICF expansion, using a formula of Spence and Abel for the dilogarithm. We see that these transcendent techniques can be avoided, which was also observed by Jager (1982).


2. An easy calculation shows that for $S = S^*_A$ (the singularization area yielding the NICF expansion) we have
$$L_S = g^2 = 0.38196\cdots.$$
This value was also found by Ito (1987) and by Jager and Kraaikamp (1989). Their methods are different. Ito (op. cit.) developed a theory for determining the Legendre constants for a class of continued fractions larger than the class of S-expansions. Unfortunately, his method is rather complicated.

4.2.5 Ergodic properties of S-expansions

In this subsection we show that for any S-expansion there exists an 'underlying' two-dimensional ergodic dynamical system. These systems will be obtained via an induced transformation from $(I^2, \mathcal{B}_{I^2}, \bar\tau, \bar\gamma)$, the two-dimensional ergodic dynamical system underlying the RCF expansion. Using the ergodic dynamical systems thus obtained we will then deduce more metric and arithmetic properties of S-expansions.

Let S be a singularization area and let $x = [a_0; a_1, a_2, \cdots] = a_0 + [a_1, a_2, \cdots]$, $a_0 \in \mathbb{Z}$, $[a_1, a_2, \cdots] \in \Omega$. Denote by $[a_0; e_1/a_1, e_2/a_2, \cdots]$ the S-expansion of $x$ (cf. Subsection 4.2.3). Recall that this is an SRCF expansion satisfying $e_{n+1} + a_n \ge 1$, $n \in \mathbb{N}_+$. As before let
$$\tau^n = [a_{n+1}, a_{n+2}, \cdots], \quad n \in \mathbb{N}, \qquad s_n = [a_n, \cdots, a_1], \quad n \in \mathbb{N}_+,\ s_0 = 0,$$
and put
$$t^e_n = [e_{n+1}/a_{n+1}, \cdots], \quad n \in \mathbb{N}, \qquad s^e_n = \begin{cases} 0 & \text{if } n = 0, \\ 1/a_1 & \text{if } n = 1, \\ [1/a_n, e_n/a_{n-1}, \cdots, e_2/a_1] & \text{if } n > 1. \end{cases}$$
By equations (1.2.2) and (4.2.4) we have
$$s_n = q_{n-1}/q_n, \qquad s^e_n = q^e_{n-1}/q^e_n, \quad n \in \mathbb{N},$$

where $(p_n/q_n)_{n\in\mathbb{N}}$ and $(p^e_n/q^e_n)_{n\in\mathbb{N}}$ are the sequences of RCF convergents and S-convergents of $x$, respectively. Also,
$$x = \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{p^e_k + t^e_k p^e_{k-1}}{q^e_k + t^e_k q^e_{k-1}} \tag{4.2.11}$$
for any $k, n \in \mathbb{N}$, with $p_{-1} = p^e_{-1} = 1$ and $q_{-1} = q^e_{-1} = 0$. Finally, put
$$\Delta := I^2 \setminus S, \qquad \Delta^- = \bar\tau(S), \qquad \Delta^+ = \Delta \setminus \Delta^-.$$

Theorem 4.2.12 For any $n \in \mathbb{N}_+$ the following assertions hold:

(i) $(\tau^n, s_n) \in S$ if and only if $p_n/q_n$ is not an S-convergent;

(ii) if $p_n/q_n$ is not an S-convergent, then both $p_{n-1}/q_{n-1}$ and $p_{n+1}/q_{n+1}$ are S-convergents;

(iii) $(\tau^n, s_n) \in \Delta^+$ is equivalent to the existence of $k = k(n) \in \mathbb{N}$ such that
$$p^e_{k-1} = p_{n-1}, \quad q^e_{k-1} = q_{n-1}, \quad p^e_k = p_n, \quad q^e_k = q_n,$$
and
$$t^e_k = \tau^n \ (\Rightarrow e_{k+1} = +1), \qquad s^e_k = s_n;$$

(iv) $(\tau^n, s_n) \in \Delta^-$ is equivalent to the existence of $k = k(n) \in \mathbb{N}$ such that
$$p^e_{k-1} = p_{n-2}, \quad q^e_{k-1} = q_{n-2}, \quad p^e_k = p_n, \quad q^e_k = q_n,$$
and
$$t^e_k = -\tau^n/(\tau^n + 1) \ (\Rightarrow e_{k+1} = -1), \qquad s^e_k = 1 - s_n.$$

Proof. (i) This follows directly from Definition 4.2.5 and Proposition 4.2.4.

(ii) This follows from the fact that in the sequence of RCF convergents we cannot remove two or more consecutive convergents and still have a sequence of convergents of some SRCF.

(iii) If $(\tau^n, s_n) \in \Delta^+$ then the very definition of $\Delta^+$ implies that
$$(\tau^{n-1}, s_{n-1}) \notin S \quad \text{and} \quad (\tau^n, s_n) \notin S.$$


Hence neither $a_n$ nor $a_{n+1}$ is singularized and therefore both $p_{n-1}/q_{n-1}$ and $p_n/q_n$ are S-convergents. But then there exists $k \in \mathbb{N}_+$ such that
$$\frac{p^e_{k-1}}{q^e_{k-1}} = \frac{p_{n-1}}{q_{n-1}}, \qquad \frac{p^e_k}{q^e_k} = \frac{p_n}{q_n}.$$
Since all the fractions are in their lowest terms and their denominators are positive we should have
$$p^e_{k-1} = p_{n-1}, \quad q^e_{k-1} = q_{n-1}, \quad p^e_k = p_n, \quad q^e_k = q_n.$$
Then (4.2.11) implies that
$$\frac{p_n + t^e_k p_{n-1}}{q_n + t^e_k q_{n-1}} = \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}},$$
hence $t^e_k = \tau^n$. Finally, we have
$$s^e_k = \frac{q^e_{k-1}}{q^e_k} = \frac{q_{n-1}}{q_n} = s_n.$$

The converse is obvious.

(iv) If $(\tau^n, s_n) \in \Delta^-$ then the very definition of $\Delta^-$ implies that
$$(\tau^{n-1}, s_{n-1}) \in S \quad \text{and} \quad (\tau^n, s_n) \notin S.$$
Hence $a_n = 1$, and it should be singularized according to Definition 4.2.5. Then $p_{n-2}/q_{n-2}$ and $p_n/q_n$ are consecutive S-convergents by (ii). Again, there exists $k \in \mathbb{N}_+$ such that
$$p^e_{k-1} = p_{n-2}, \quad q^e_{k-1} = q_{n-2}, \quad p^e_k = p_n, \quad q^e_k = q_n.$$
Since
$$p_n = a_n p_{n-1} + p_{n-2} = p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2} = q_{n-1} + q_{n-2}, \tag{4.2.12}$$
we have
$$s^e_k = \frac{q_{n-2}}{q_n} = \frac{q_n - q_{n-1}}{q_n} = 1 - s_n.$$

Next, from (4.2.11) we have
$$\frac{p_n + t^e_k p_{n-2}}{q_n + t^e_k q_{n-2}} = \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}},$$
and using equations (4.2.12) and (1.1.12) we obtain
$$t^e_k + t^e_k \tau^n + \tau^n = 0,$$
whence
$$t^e_k = -\frac{\tau^n}{\tau^n + 1}.$$
The converse is obvious. □

Now, define the transformation $\bar\tau_\Delta : \Delta \to \Delta$ as
$$\bar\tau_\Delta(x, y) = \begin{cases} \bar\tau(x, y) & \text{if } \bar\tau(x, y) \notin S, \\ \bar\tau^2(x, y) & \text{if } \bar\tau(x, y) \in S \end{cases}$$
for any $(x, y) \in \Delta = I^2 \setminus S$. This is a very simple instance of an induced transformation. Cf., e.g., Petersen (1983, Sections 2.3 and 2.4). According to the general theory, it follows that $(\Delta, \mathcal{B}_\Delta, \bar\tau_\Delta, \bar\gamma_\Delta)$ is an ergodic dynamical system. Here $\bar\gamma_\Delta$ is the probability measure on $\mathcal{B}_\Delta$ with density
$$\frac{1}{\bar\gamma(\Delta) \log 2} \frac{1}{(xy + 1)^2}, \quad (x, y) \in \Delta.$$

Next, Theorem 4.2.12 leads us naturally to consider the map $M : \Delta \to \mathbb{R}^2$ defined by
$$M(x, y) = \begin{cases} (x, y), & (x, y) \in \Delta^+, \\ (-x/(x+1),\ 1-y), & (x, y) \in \Delta^-. \end{cases}$$
Set $A_S = M(\Delta)$. Clearly, $A_S$ consists of $\Delta^+ = I^2 \setminus (S \cup \bar\tau(S))$ and the image $M(\bar\tau(S))$ of $\Delta^- = \bar\tau(S)$ under $M$, which lies in the second quadrant of the plane. Also, $M : \Delta \to A_S$ is one-to-one. We can then define the transformation $\bar\tau_S : A_S \to A_S$ as $\bar\tau_S = M \bar\tau_\Delta M^{-1}$, and Theorem 4.2.12 implies that
$$(t^e_{k+1}, s^e_{k+1}) = \bar\tau_S(t^e_k, s^e_k), \quad k \in \mathbb{N}. \tag{4.2.13}$$


It is immediate that the determinant of the Jacobian $J$ of $M|_{\Delta^-}$ is equal to $1/(x+1)^2 > 0$. For $(x, y) \in \Delta^-$ we have
$$|J|^{-1}\, \frac{1}{(xy+1)^2} = \left( \frac{x+1}{xy+1} \right)^2 = \frac{1}{(st+1)^2},$$
where $t = -x/(x+1)$ and $s = 1 - y$. This shows that
$$\frac{1}{\log 2} \int_{M(\Delta^-)} \frac{ds\, dt}{(st+1)^2} = \frac{1}{\log 2} \int_{\Delta^-} |J|\, |J|^{-1}\, \frac{dx\, dy}{(xy+1)^2} = \bar\gamma(\bar\tau(S)) = \bar\gamma(S). \tag{4.2.14}$$
Note also that
$$\bar\gamma(\Delta^+) = 1 - \bar\gamma(S) - \bar\gamma(\bar\tau(S)) = 1 - 2\bar\gamma(S). \tag{4.2.15}$$

Theorem 4.2.13 Let ρ be the probability measure on $\mathcal{B}_{A_S}$ with density
$$\frac{1}{(1 - \bar\gamma(S)) \log 2} \frac{1}{(xy+1)^2}, \quad (x, y) \in A_S.$$
Then $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$ is an ergodic dynamical system which underlies the corresponding S-expansion.

Proof. The conclusion follows on account of equations (4.2.13) through (4.2.15) noting that the dynamical systems $(\Delta, \mathcal{B}_\Delta, \bar\tau_\Delta, \bar\gamma_\Delta)$ and $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$ are isomorphic by the very definition of the latter. See Remark 1 following Proposition 4.0.5 and Petersen (1983, Sections 1.3 and 2.3). □

Remark. The entropy of the maps $\bar\tau_\Delta$ and $\bar\tau_S$ can be easily obtained using Abramov's formula [see e.g. Petersen (1983, p. 257)]. Since $H(\tau) = \pi^2/6\log 2$ (see Remark following Corollary 4.1.28), we have
$$H(\bar\tau_\Delta) = \frac{1}{\bar\gamma(\Delta)} H(\tau) = \frac{1}{1 - \bar\gamma(S)} \frac{\pi^2}{6 \log 2} = H(\bar\tau_S),$$
which shows that entropy is maximal, $\pi^2/6\log G$, for maximal singularization areas. □

At first sight the dynamical system $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$ looks very intricate. However, it is quite helpful. We have the following result.


Theorem 4.2.14 Let the map $f : A_S \to \mathbb{R} \cup \{\infty\}$ be defined by
$$f(x, y) = |x^{-1}| - \bar\tau_S^{(1)}(x, y), \quad (x, y) \in A_S,$$
where $\bar\tau_S^{(1)}(x, y)$ is the first coordinate of $\bar\tau_S(x, y)$. Let $a : [0, 1) \to \mathbb{N}_+ \cup \{\infty\}$ be defined as in Chapter 1, that is,
$$a(t) = \begin{cases} \lfloor t^{-1} \rfloor & \text{if } t \in (0, 1), \\ \infty & \text{if } t = 0. \end{cases}$$
We have
$$f(x, y) = \begin{cases} a(x) & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \notin S, \\ a(x) + 1 & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \in S, \\ a(-x/(x+1)) + 1 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \notin S, \\ a(-x/(x+1)) + 2 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \in S, \end{cases}$$
and
$$\bar\tau_S(x, y) = \left( |x^{-1}| - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1} \right), \quad (x, y) \in A_S.$$

Proof. We should distinguish four cases, of which only two will be considered here. The other two cases can be treated similarly. Cf. Kraaikamp (1991, p. 26).

1. Let $(x, y) \in \Delta^+$ and $\bar\tau(x, y) \in S$. Then $\operatorname{sgn} x = 1$ and
$$\bar\tau_\Delta(M^{-1}(x, y)) = \bar\tau^2(x, y) = \bar\tau\left( \frac{1}{x} - a(x),\ \frac{1}{a(x)+y} \right) = \left( \frac{x - 1 + x a(x)}{1 - x a(x)},\ \frac{a(x)+y}{a(x)+y+1} \right) \in \Delta^-.$$
Therefore
$$\bar\tau_S(x, y) = M(\bar\tau_\Delta(M^{-1}(x, y))) = \left( \frac{1}{x} - (a(x)+1),\ \frac{1}{a(x)+y+1} \right).$$
Thus we see that
$$\bar\tau_S(x, y) = \left( x^{-1} - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1} \right),$$
where $f(x, y) = a(x) + 1$.

2. Let $(x, y) \in M(\Delta^-)$ and $\bar\tau(M^{-1}(x, y)) \notin S$. Then $\operatorname{sgn} x = -1$ and we have
$$\bar\tau_S(x, y) = M \bar\tau_\Delta M^{-1}(x, y) = \bar\tau(M^{-1}(x, y)) = \bar\tau\left( -\frac{x}{x+1},\ 1-y \right) = \left( \left| \frac{1}{x} \right| - 1 - a\!\left( -\frac{x}{x+1} \right),\ \frac{1}{a(-x/(x+1)) + 1 - y} \right).$$
Thus we see that
$$\bar\tau_S(x, y) = \left( |x^{-1}| - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1} \right),$$
where $f(x, y) = a(-x/(x+1)) + 1$. □

Corollary 4.2.15 We have

(i) $f(x, y) \in \mathbb{N}_+$ for $(x, y) \in A_S$, $x \neq 0$;

(ii) $a_{k+1} = f(t^e_k, s^e_k)$, $k \in \mathbb{N}$, with $(t^e_0, s^e_0) = (x - a_0, 0)$.
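Theorem 4.2.14 makes $\bar\tau_S$ directly computable. The sketch below (our helper names; the boundary conventions of $S^*_A$ are handled loosely, which is harmless for generic points) implements $\bar\tau_S$ and $f$ for the NICF singularization area $S = S^*_A$, and — via Corollary 4.2.15(ii) — recovers the NICF digits of $x = (-3+\sqrt{17})/2$ from Example 4.2.1.

```python
from math import floor, sqrt

g = (sqrt(5) - 1) / 2

def in_S(t, s):
    """Membership in the NICF singularization area S_A^* (up to null boundary)."""
    return 0.5 <= t < 1 and 0 <= s <= g

def tau_bar(t, s):
    """Natural extension of the RCF transformation."""
    a = floor(1 / t)
    return 1 / t - a, 1 / (a + s)

def tau_S(t, s):
    """One step of tau_S on A_S via Theorem 4.2.14; returns new point and digit f."""
    if t > 0:
        f = floor(1 / t) + (1 if in_S(*tau_bar(t, s)) else 0)
        sgn = 1
    else:
        u = -t / (t + 1)              # first coordinate of M^{-1}(t, s)
        f = floor(1 / u) + (1 if not in_S(*tau_bar(u, 1 - s)) else 2)
        sgn = -1
    return (abs(1 / t) - f, 1 / (f + sgn * s)), f

x = (-3 + sqrt(17)) / 2               # NICF: [1; -1/2, 1/4, -1/2, 1/4, ...]
point = (x - 1, 0.0)                  # (t_0^e, s_0^e) = (x - a_0, 0) with a_0 = 1
digits, signs = [], []
for _ in range(6):
    signs.append(1 if point[0] > 0 else -1)   # e_{k+1} = sgn(t_k^e)
    point, f = tau_S(*point)
    digits.append(f)
print(digits, signs)                  # [2, 4, 2, 4, 2, 4] [-1, 1, -1, 1, -1, 1]
```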


Let $A^i_S$, $i = 1, 2$, be the projections of $A_S$ onto the two axes and let $\lambda_{A^i_S}$ denote the probability measure defined by
$$\lambda_{A^i_S}(A) = \frac{\lambda(A \cap A^i_S)}{\lambda(A^i_S)}, \quad A \in \mathcal{B}_{A^i_S},\ i = 1, 2.$$

Proposition 4.2.16 Let $\mu \in \mathrm{pr}(\mathcal{B}_{A^1_S})$ such that $\mu \ll \lambda_{A^1_S}$. For any $B \in \mathcal{B}_{A_S}$ such that $\lambda_{A^1_S} \otimes \lambda_{A^2_S}(\partial B) = 0$ we have
$$\lim_{n\to\infty} \mu\left( (t^e_n, s^e_n) \in B \right) = \rho_S(B),$$
$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(t^e_k, s^e_k) = \rho_S(B) \quad \text{a.e. in } A^1_S.$$

Proof. This is the result corresponding to Theorem 4.1.16 and Corollary 4.1.17 for the ergodic dynamical system $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$. It is easy to see that the proof of Theorem 4.1.16 for the case of the ergodic dynamical system $(I^2, \mathcal{B}_{I^2}, \bar\tau, \bar\gamma)$ carries over to the present case. □

Corollary 4.2.17 Consider the approximation coefficients
$$\Theta^e_n = (q^e_n)^2 \left| a_0 + t^e_0 - \frac{p^e_n}{q^e_n} \right|, \quad n \in \mathbb{N}.$$
For any $\mu \in \mathrm{pr}(\mathcal{B}_{A^1_S})$ such that $\mu \ll \lambda_{A^1_S}$ and any $(t_1, t_2) \in I^2$ we have
$$\lim_{n\to\infty} \mu\left( \Theta^e_n \le t_1,\ \Theta^e_{n-1} \le t_2 \right) = \rho(B),$$
$$\lim_{n\to\infty} \frac{1}{n} \operatorname{card}\{ k : \Theta^e_k \le t_1,\ \Theta^e_{k+1} \le t_2,\ 0 \le k \le n-1 \} = \rho(B) \quad \text{a.e. in } A^1_S,$$
where
$$B = \left\{ (x, y) \in A_S : \frac{|x|}{xy+1} \le t_1,\ \frac{y}{xy+1} \le t_2 \right\}.$$

Proof. The results stated follow from Proposition 4.2.16 on account of equations (4.2.5) and (4.2.6). □


4.3 Examples of S-expansions

4.3.1 Nakada's α-expansions

Let $I_\alpha = [\alpha - 1, \alpha]$, $\alpha \in \mathbb{R}$, so that $I_1 = I$. In this subsection we will consider transformations $N_\alpha : I_\alpha \to I_\alpha$ defined by
$$N_\alpha(x) = \begin{cases} |x^{-1}| - \lfloor |x^{-1}| + 1 - \alpha \rfloor & \text{if } x \neq 0, \\ 0 & \text{if } x = 0, \end{cases}$$
for $x \in I_\alpha$, with $\alpha \in [1/2, 1]$. Any irrational number $x \in I_\alpha$ has an infinite SRCF expansion, called its α-expansion, of the form
$$\cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots}} := [e_1/b_1, e_2/b_2, \cdots],$$
where
$$(e_n, b_n) = (e_n(x), b_n(x)) = \left( e_1(N_\alpha^{n-1}(x)),\ b_1(N_\alpha^{n-1}(x)) \right), \quad n \in \mathbb{N}_+,$$
with
$$(e_1(x), b_1(x)) = \left( \operatorname{sgn} x,\ \lfloor |x^{-1}| + 1 - \alpha \rfloor \right), \quad x \in I_\alpha.$$
**

Here $N_\alpha^n$ denotes the composition of $N_\alpha$ with itself $n$ times while $N_\alpha^0$ is the identity map. The theory of α-expansions can be developed by parallelling that of the RCF expansion. This has been done by Nakada (1981), Nakada et al. (1977), Bosma et al. (1983), and Popescu (2000). Originally, these expansions were defined by McKinney (1907). Our approach here consists in putting any α-expansion in the framework of the S-expansion theory by giving a suitable singularization area $S_\alpha$, $\alpha \in [1/2, 1]$. This will allow us to retrieve results derived by Nakada and coworkers (op. cit.) using different methods. We should distinguish two cases: (i) $\alpha \in [1/2, g]$ and (ii) $\alpha \in (g, 1]$.
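The transformation $N_\alpha$ and its digit pairs are straightforward to implement. The sketch below (our helper names) checks that $\alpha = 1$ reproduces the RCF digits of the number $x$ of Example 4.2.1, while $\alpha = 1/2$ applied to $x - 1 \in I_{1/2}$ yields the NICF digits obtained there by singularization.

```python
from math import floor, sqrt

def alpha_digits(x, alpha, n):
    """First n digit pairs (e_k, b_k) of Nakada's alpha-expansion of x."""
    out = []
    for _ in range(n):
        e = 1 if x > 0 else -1
        b = floor(abs(1 / x) + 1 - alpha)
        out.append((e, b))
        x = abs(1 / x) - b            # N_alpha(x)
    return out

x = (-3 + sqrt(17)) / 2
# alpha = 1 gives the RCF digits of x = [0; 1, 1, 3, 1, 1, 3, ...]
assert alpha_digits(x, 1.0, 6) == [(1, 1), (1, 1), (1, 3), (1, 1), (1, 1), (1, 3)]
# alpha = 1/2 gives the NICF: x - 1 = [-1/2, 1/4, -1/2, 1/4, ...]
assert alpha_digits(x - 1, 0.5, 4) == [(-1, 2), (1, 4), (-1, 2), (1, 4)]
```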

Case (i). Before giving the singularization areas $S_\alpha$, $\alpha \in [1/2, g]$, we first return to the special case $\alpha = 1/2$ which yields the NICF expansion. Recall that the NICF expansion of an irrational number can be obtained from its RCF expansion by applying algorithm A from Subsection 4.2.1 to the latter. We noticed in Subsection 4.2.4 that this is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_A$, $\ell \in \mathbb{N}$,

where $S_A = [1/2, 1) \times [0, g]$. For $\alpha \in (1/2, g]$, notice that
$$\bar\tau([1/2, \alpha) \times [0, g]) = ((1-\alpha)/\alpha, 1] \times [g, 1].$$
In particular, for $\alpha = g$ we have
$$(S_A \setminus ([1/2, \alpha) \times [0, g])) \cup (((1-\alpha)/\alpha, 1] \times [g, 1]) = (S_A \setminus ([1/2, g) \times [0, g])) \cup ((g, 1] \times [g, 1]) = ([g, 1) \times [0, g]) \cup ((g, 1] \times [g, 1]),$$

Chapter 4

∗ which only slightly diﬀers from the singularizaton area SB of Hurwitz’s SCF expansion, which coincides with the g-expansion. See Remark 2 following Deﬁnition 4.2.5. It therefore seems natural to try as singularization areas Sα for α ∈ [1/2, g] the sets

Sα = ([α, g) × [0, g)) ∪ ([g, (1 − α)/α] × [0, g]) (4.3.1) ∪ ((1 − α)/α, 1] × I) . Hence τ (Sα ) = ([0, (2α − 1)/(1 − α)) × [1/2, 1]) ¯ ∪ ([(2α − 1)/(1 − α), g] × [g, 1]) ∪ ((g, (1 − α)/α] × (g, 1]) . It is easy to check that Sα is indeed a singularization area: obviously, γ (∂Sα ) = 0, Sα ⊂ [1/2, 1] × I, and clearly Sα ∩ τ (Sα ) = {(g, g)}. Also, ¯ ¯ γ (Sα ) = 1 − ¯ log G , log 2

hence Sα is maximal for any α ∈ [1/2, g]. Notice that with M deﬁned as in Subsection 4.2.5 we have M (¯(Sα )) = ([α − 1, g − 1) × [0, 1 − g)) τ ∪ ([g − 1, (1 − 2α)/α] × [0, 1 − g]) ∪ ((1 − 2α)/α, 0] × [0, 1/2]) .

Ergodic theory of continued fractions

283

Writing Aα for A_{Sα}—see again the general case in Subsection 4.2.5—we take

Aα = (I² \ (Sα ∪ τ̄(Sα))) ∪ (M(τ̄(Sα)) \ ({0} × [0, 1/2]))
   = ([α − 1, g − 1) × [0, 1 − g)) ∪ ([g − 1, (1 − 2α)/α] × [0, 1 − g]) ∪ (((1 − 2α)/α, 0) × [0, 1/2]) ∪ ([0, (2α − 1)/(1 − α)] × [0, 1/2)) ∪ (((2α − 1)/(1 − α), α) × [0, g)).

If we denote by fα : Aα → R ∪ {∞} the function corresponding to the function f in Theorem 4.2.14, then it is easy to see that fα actually maps Aα into N+ and that |x^{−1}| − fα(x, y) ∈ [α − 1, α) for x ∈ [α − 1, α) \ {0}. Since there exists only one n ∈ N+ such that |x^{−1}| − n ∈ [α − 1, α), we deduce that fα(x, y) does not depend on y and that we should have

fα(x, y) = ⌊|x^{−1}| + 1 − α⌋,  (x, y) ∈ Aα, x ≠ 0.

Hence x ↦ |x^{−1}| − fα(x, y) is Nakada's transformation Nα. On account of Theorem 4.2.14 we can therefore state the main result for the case α ∈ [1/2, g].

Theorem 4.3.1 [Nakada (1981)] Let 1/2 ≤ α ≤ g. Consider the probability measure γ̄α on B_{Aα} with density

(1/log G) · 1/(xy + 1)²,  (x, y) ∈ Aα,

and the transformation N̄α : Aα → Aα defined by

N̄α(x, y) = ( |x^{−1}| − ⌊|x^{−1}| + 1 − α⌋, ( ⌊|x^{−1}| + 1 − α⌋ + y sgn x )^{−1} ),

where (x, y) ∈ Aα. Then (Aα, B_{Aα}, N̄α, γ̄α) is an ergodic dynamical system underlying the corresponding α-expansion.

Taking the projection onto the first axis, we deduce from Theorem 4.3.1 the following result.

Corollary 4.3.2 Let 1/2 ≤ α ≤ g. Consider the probability measure µα on B_{Iα} with density

(1/log G) × 1/(x + G + 1)  if x ∈ [α − 1, (1 − 2α)/α],
(1/log G) × 1/(x + 2)      if x ∈ ((1 − 2α)/α, (2α − 1)/(1 − α)),
(1/log G) × 1/(x + G)      if x ∈ [(2α − 1)/(1 − α), α].

Then (Iα, B_{Iα}, Nα, µα) is an ergodic dynamical system.

Remark. For α = 1/2 we obtain the NICF expansion; the corresponding result was derived independently by Rieger (1979) and Rockett (1980). □

Figure 4.2: Sα for 1/2 ≤ α ≤ g

From Figure 4.2 it is clear that the vertices (α, g) and ((1 − α)/α, 1) of Sα determine the value of the Legendre constant Lα := L_{Sα}; see Theorem 4.2.11. More precisely, we have the following result.

Theorem 4.3.3 Let 1/2 ≤ α ≤ g. Then

Lα = min( α/(1 + αg), 1 − α ).

Remark. Notice that for the values of α ∈ [1/2, g] under consideration we have τ̄([1/2, α) × [0, g)) ⊂ Sα.


It follows at once from this and (4.3.1) that B_{Sα} = ∅, which is consistent with Proposition 4.2.7. □

Case (ii). Let α ∈ (g, 1]. Put

Sα = [α, 1] × I.   (4.3.2)

Hence τ̄(Sα) = [0, (1 − α)/α] × [1/2, 1], and Sα ∩ τ̄(Sα) = ∅, since for α ∈ (g, 1] we have (1 − α)/α < α. It is then easy to check that Sα is indeed a singularization area. However, a simple calculation shows that

γ̄(Sα) = 1 − log(1 + α)/log 2,

so for no value of α under consideration here is the singularization area Sα maximal. Next, with M defined as in Subsection 4.2.5 we have

M(τ̄(Sα)) = [α − 1, 0] × [0, 1/2].

Define Aα exactly as in case (i), and denote by fα : Aα → R ∪ {∞} the function corresponding to the function f in Theorem 4.2.14. The expression of Aα is now simpler, namely

Aα = ([α − 1, 0) × [0, 1/2]) ∪ ([0, (1 − α)/α] × [0, 1/2)) ∪ (((1 − α)/α, α) × I);

see Figure 4.3. Similarly to case (i) we find that fα(x, y) is independent of y and that in fact we again have

fα(x, y) = ⌊|x^{−1}| + 1 − α⌋,  (x, y) ∈ Aα, x ≠ 0.

Thus we can state the main result for the case α ∈ (g, 1].

Theorem 4.3.4 [Nakada (1981)] Let g < α ≤ 1. Consider the probability measure γ̄α on B_{Aα} with density

(1/log(1 + α)) · 1/(xy + 1)²,  (x, y) ∈ Aα,

and the transformation N̄α : Aα → Aα defined as in Theorem 4.3.1. Then (Aα, B_{Aα}, N̄α, γ̄α) is an ergodic dynamical system.


Figure 4.3: Sα for g < α ≤ 1

Taking again the projection onto the first axis, we deduce from Theorem 4.3.4 the following result.

Corollary 4.3.5 Let g < α ≤ 1. Consider the probability measure µα on B_{Iα} with density

(1/log(1 + α)) × 1/(x + 2)  if x ∈ [α − 1, (1 − α)/α],
(1/log(1 + α)) × 1/(x + 1)  if x ∈ ((1 − α)/α, α].

Then (Iα, B_{Iα}, Nα, µα) is an ergodic dynamical system.

We conclude the discussion of case (ii) with some results from Kraaikamp (1991). It is obvious that the vertex (α, 1) of Sα determines the value of the Legendre constant Lα := L_{Sα}. As min(α/(α + 1), 1/2) = α/(α + 1) in case (ii), we have the following result; see again Theorem 4.2.11.

Theorem 4.3.6 Let g < α ≤ 1. Then

Lα = α/(α + 1).
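As a quick sanity check, the invariant density of Corollary 4.3.5 integrates to 1 over Iα; a numerical sketch (the helper name `total_mass` is ours):

```python
import math

def total_mass(alpha, n=200000):
    """Midpoint-rule integral over I_alpha = [alpha-1, alpha] of the
    density in Corollary 4.3.5 (case g < alpha <= 1); should be ~1."""
    lo, hi = alpha - 1.0, alpha
    split = (1.0 - alpha) / alpha      # branch point of the density
    h = (hi - lo) / n
    s = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        s += h / (x + 2.0) if x <= split else h / (x + 1.0)
    return s / math.log(1.0 + alpha)

assert abs(total_mass(0.8) - 1.0) < 1e-4
assert abs(total_mass(1.0) - 1.0) < 1e-4   # alpha = 1 recovers the Gauss density 1/(x+1)
```

In closed form the two pieces contribute log((1+α)/α(1+α)) + log(α(1+α)) = log(1+α), which is exactly the normalizing constant.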


Next, it is easy to check that

τ̄^{−1}(Sα) ∩ ([1/2, 1] × I) = [1/2, 1/(1 + α)] × I.

Since for our values of α we have (1 − α)/α < 1/(1 + α), we find that the set Bα := B_{Sα} from Proposition 4.2.7 is (1/(1 + α), α) × I. Then

γ̄α(Bα) = 2 − log(2 + α)/log(1 + α),

and we can state the following result.

Theorem 4.3.7 Let g < α ≤ 1. For the α-expansion [e1/a1, e2/a2, ···] = [e1/b1, e2/b2, ···] of irrationals in Iα we have

lim_{n→∞} (1/n) card{k : a_k = 1, 1 ≤ k ≤ n} = 2 − log(2 + α)/log(1 + α)  a.e.

Remarks. 1. The case α = 1 gives the classical result from Proposition 4.1.1.

2. For α ∈ [g, 1] the limit 2 − log(2 + α)/log(1 + α) increases monotonically from 0 to 2 − log 3/log 2 = 0.4150···, the asymptotic relative frequency of the digit 1 in the RCF expansion. At α = 0.76292··· we have already lost half of the original 1's.

3. It follows from Corollary 4.2.10 that for the α-expansion with α ∈ (g, 1] we have

lim_{n→∞} (1/n) Σ_{k=1}^{n} e_k = 3 − log 4/log(1 + α)  a.e.
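The half-loss point quoted in Remark 2 can be recovered by bisection, since the frequency is increasing in α (a sketch; `freq1` is our name for the limit of Theorem 4.3.7):

```python
import math

def freq1(alpha):
    """A.e. frequency of digit 1 in the alpha-expansion, g < alpha <= 1."""
    return 2.0 - math.log(2.0 + alpha) / math.log(1.0 + alpha)

g = (math.sqrt(5) - 1) / 2
assert abs(freq1(g)) < 1e-12                                      # no 1's survive at alpha = g
assert abs(freq1(1.0) - (2 - math.log(3) / math.log(2))) < 1e-12  # RCF value 0.4150...

target = (2 - math.log(3) / math.log(2)) / 2   # half the RCF frequency of 1's
lo, hi = g, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if freq1(mid) < target else (lo, mid)
assert abs(lo - 0.76292) < 1e-4
```

Note that log(2 + g) = 2 log G, which is why the frequency vanishes exactly at α = g.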

We conclude this subsection by giving the analogue of Vahlen's theorem—see Subsection 1.3.2—for α-expansions with α ∈ [1/2, 1]. For the NICF and Hurwitz's SCF expansions this analogue was given independently by Kurosu (1924) and Sendov (1959/60). Kraaikamp (1990) proved the Kurosu–Sendov results by exhibiting a domain in R² in which the point (Θ^e_{n−1}, Θ^e_n) always lies. For the two expansions just mentioned, that is, for α = 1/2 and α = g, we have

min(Θ^e_{n−1}, Θ^e_n) < 2g³ = 0.4721···,

and the constant 2g³ is best possible.


However, one might ask whether there are values of α for which still smaller values can be obtained for the corresponding approximation coefficients Θ^e_n(α) = Θ^e_n, n ∈ N. It is clear beforehand that a value smaller than 1/√5 = 0.447··· can never be found, by a classical result of A. Hurwitz [see Perron (1954, p. 49)], according to which for every θ < 1/√5 there exist irrational numbers x such that the inequality q²|x − (p/q)| < θ is verified for only finitely many p/q ∈ Q.

The above-mentioned method from Kraaikamp (1990) can easily be adapted to S-expansions. As an example we mention here the case of α-expansions, for which the first result below is due to Bosma et al. (1983).

Theorem 4.3.8 Let α ∈ [1/2, 1]. For any irrational number in Iα and any n ∈ N+ we have

Θ^e_n < c(α)   and   min(Θ^e_{n−1}, Θ^e_n) < V(α),

where the functions c, V : [1/2, 1] → R are defined by

c(α) = max( G(1 − α)/(gα + 1), α ),  1/2 ≤ α ≤ 1,

and

V(α) = max( g/(1 + gα), 4α − 2 )            if 1/2 ≤ α ≤ g,
V(α) = max( 2(1 − α)/(α + 1), α/(α² + 1) )  if g ≤ α ≤ 1.

The bounds c(α) and V(α) are best possible. For the proof see Kraaikamp (1991).

Remark. A simple calculation yields min_α c(α) = c(α0) = α0, with

α0 = (1/2)( −2 − √5 + √(6√5 + 15) ) = 0.5473···.

Moreover, we have min_α V(α) = V(α1) = 0.4484···, a constant slightly larger than 1/√5, where

α1 = ( 1 − 3g + √(10 − 11g) )/(4g²) = 0.6121··· < g. □
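The two minimizers can be checked against our reading of the (garbled) formulas for c and V: the branches of c cross exactly at α0, and the branches of V (on [1/2, g]) cross exactly at α1 — a numerical sketch:

```python
import math

g = (math.sqrt(5) - 1) / 2
G = g + 1

def c(a):     # bound on Theta^e_n, 1/2 <= a <= 1 (as reconstructed above)
    return max(G * (1 - a) / (g * a + 1), a)

def V(a):     # bound on min(Theta^e_{n-1}, Theta^e_n), 1/2 <= a <= g
    return max(g / (1 + g * a), 4 * a - 2)

a0 = 0.5 * (-2 - math.sqrt(5) + math.sqrt(6 * math.sqrt(5) + 15))
a1 = (1 - 3 * g + math.sqrt(10 - 11 * g)) / (4 * g * g)

assert abs(a0 - 0.5473) < 1e-4
assert abs(c(a0) - a0) < 1e-9            # both branches of c meet at a0
assert abs(a1 - 0.6121) < 1e-4
assert abs(V(a1) - (4 * a1 - 2)) < 1e-9  # both branches of V meet at a1
assert abs(V(a1) - 0.4484) < 1e-4
assert V(a1) > 1 / math.sqrt(5)          # slightly larger than 1/sqrt(5)
```

Also V(1/2) = g/(1 + g/2) = 2g³ = 0.4721···, recovering the Kurosu–Sendov constant for the NICF.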


4.3.2 Minkowski's diagonal continued fraction expansion

Let x ∈ R be such that neither x nor 2x is an integer. Consider the sequence σ of all irreducible fractions p/q ∈ Q with q ∈ N+ satisfying

|x − p/q| < 1/(2q²),

ordered in such a way that their denominators form an increasing sequence. It can be shown [see, e.g., Perron (1954, §45)] that there exists a unique SRCF expansion whose sequence of convergents coincides with σ. Legendre's theorem (see Corollary 1.2.4) implies that we take precisely those RCF convergents for which Θn < 1/2. By (4.2.5) this SRCF expansion—which is called Minkowski's diagonal continued fraction (DCF) expansion—is an S-expansion with singularization area

S = S_DCF := { (x, y) ∈ I² : x/(xy + 1) ≥ 1/2 }.

Since min(Θn, Θn+1) < 1/2—cf. Subsection 1.3.2—the DCF expansion picks at least one out of every two consecutive RCF convergents. Since

γ̄(S_DCF) = 1 − 1/(2 log 2),

the singularization area S_DCF is not maximal. Also, by Theorem 4.2.8 we have

lim_{k→∞} n_{S_DCF}(k)/k = 2 log 2 = 1.3862···  a.e.

It can be shown [cf. Kraaikamp (1989, p. 210)] that the DCF expansion of any ω ∈ Ω can be obtained from its RCF expansion [a1, a2, ···] by singularizing a digit a_{k+1}(ω) = 1 if and only if one of the following four conditions is fulfilled:

(i) k = 0, that is, a1 = 1;

(ii) a_k ≠ 1, a_{k+2} ≠ 1, k ∈ N+;

(iii) a_k ≠ 1, a_{k+2} = 1, and [a_{k+3}, a_{k+4}, ···] > [a_k − 1, ···, a1], k ∈ N+, with the convention that the value of [a_k − 1, ···, a1] for k = 1 is [a1 − 1];

(iv) a_k = 1, a_{k+2} ≠ 1, and [a_{k−1}, ···, a1] > [a_{k+2} − 1, a_{k+3}, ···], k ≥ 2.
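The Legendre-style selection Θn < 1/2 can be run directly: compute RCF convergents and keep those with q²|x − p/q| < 1/2 (floating point, so only the early convergents of π are used):

```python
import math

def rcf_convergents(x, n):
    """First n RCF convergents p/q of x > 0 (float arithmetic: small n only)."""
    p0, q0, p1, q1 = 0, 1, 1, 0        # p_{-2}/q_{-2}, p_{-1}/q_{-1}
    out = []
    for _ in range(n):
        a = int(x)
        p0, q0, p1, q1 = p1, q1, a * p1 + p0, a * q1 + q0
        out.append((p1, q1))
        x = 1.0 / (x - a)
    return out

x = math.pi
convs = rcf_convergents(x, 6)          # 3/1, 22/7, 333/106, 355/113, ...
dcf = [(p, q) for p, q in convs if q * q * abs(x - p / q) < 0.5]

assert (22, 7) in dcf and (355, 113) in dcf   # Theta < 1/2: kept
assert (333, 106) not in dcf                  # Theta = 0.935...: singularized away
assert (103993, 33102) not in dcf             # Theta = 0.633...: also dropped
```

So the DCF expansion of π skips 333/106 and 103993/33102, illustrating that at most one of two consecutive convergents can be lost.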


It is also interesting to note that the DCF expansion of a quadratic irrationality is periodic. The general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results; for detailed proofs the reader is referred to Kraaikamp (op. cit.). With the notation of Subsection 4.2.5, in the DCF case we have

∆+_DCF = { (x, y) ∈ R²₊₊ : x/(xy + 1) < 1/2, y/(xy + 1) < 1/2 },

M(τ̄(S_DCF)) = { (x, y) ∈ R² : (x + 1)(1 − y)/(xy + 1) ≤ 1/2, −1/2 ≤ x ≤ 0, y ≥ 0 },

A_DCF := A_{S_DCF} = ∆+_DCF ∪ M(τ̄(S_DCF));

see also Figure 4.4.

Figure 4.4: S_DCF

Furthermore, writing f_DCF for f_{S_DCF} and τ̄_DCF for τ̄_{S_DCF} we have

f_DCF(x, y) = ⌊ |x^{−1}| + ( ⌊|x^{−1}|⌋ + y sgn x − 1 )/( 2(⌊|x^{−1}|⌋ + y sgn x) − 1 ) ⌋

and

τ̄_DCF(x, y) = ( |x^{−1}| − f_DCF(x, y), (f_DCF(x, y) + y sgn x)^{−1} )


for (x, y) ∈ A_DCF.

Proposition 4.3.9 Let ρ_DCF be the probability measure on B_{A_DCF} with density

2/(xy + 1)²,  (x, y) ∈ A_DCF.

Then (A_DCF, B_{A_DCF}, τ̄_DCF, ρ_DCF) is an ergodic dynamical system which underlies the DCF expansion.

Proposition 4.3.10 For any µ ∈ pr(B_{[−1/2,1]}) such that µ ≪ λ and any (t1, t2) ∈ I² we have

lim_{n→∞} µ(Θ^e_{n−1} ≤ t1, Θ^e_n ≤ t2) = H(t1, t2).

Here H is the distribution function with density d1 + d2, where

d1(x, y) = 2 I_{B1}(x, y)/√(1 − 4xy),   d2(x, y) = 2 I_{B2}(x, y)/√(1 + 4xy),

with

B1 = [0, 1/2] × [0, 1/2],   B2 = B1 ∩ { (x, y) : 0 ≤ (x − y)² + x + y ≤ 3/4 }.

The result above can also be stated in an equivalent form concerning the existence, for any (t1, t2) ∈ I², of the limit a.e. equal to H(t1, t2) of

(1/n) card{ k : Θ^e_k ≤ t1, Θ^e_{k+1} ≤ t2, 0 ≤ k ≤ n − 1 }

as n → ∞. It then follows, e.g., that

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} Θ^e_k = 1/4  a.e.

We also note the following results.

Proposition 4.3.11 An RCF digit a_{k+1} equal to 1 does not disappear in the DCF expansion if and only if

(τ^k, s_k) ∈ B := { (x, y) ∈ I² : y < (1 − 2x)/(3x − 2), y > (2x − 1)/x, y < 1/(2 − x) },

whatever k ∈ N. Note that γ̄(B) is equal to

(1/log 2) [ ∫_{1/2}^{1} dt ∫_{(2t−1)/t}^{1/(2−t)} du/(tu + 1)² − ∫_{1/2}^{2−√2} dt ∫_{(2t−1)/(2−3t)}^{1/(2−t)} du/(tu + 1)² ]

= (1/log 2) ( √2 − 1/2 + log(√2 − 1) ) = 0.0473···.

Corollary 4.3.12 Let [a^e_0; a^e_1, a^e_2, ···] be the DCF expansion of an irrational number. Then

lim_{n→∞} (1/n) card{k : a^e_k = 1, 1 ≤ k ≤ n} = ρ_DCF(B) = γ̄(B)/(1 − γ̄(S_DCF)) = 2( √2 − 1/2 + log(√2 − 1) ) = 0.0656···  a.e.

This asymptotic relative frequency (6.56···%) should be compared with the asymptotic relative frequency of the digit 1 in the RCF expansion (2 − log 3/log 2 = 41.50···%). See Proposition 4.1.1 and Subsection 4.1.2.
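The closed form for γ̄(B) can be checked numerically: the inner integral is elementary, the outer one is done by the midpoint rule (a sketch; region limits as in Proposition 4.3.11, with the closed form as reconstructed above):

```python
import math

def inner(t, a, b):
    # exact inner integral: int_a^b du/(tu+1)^2 = (1/t)(1/(ta+1) - 1/(tb+1))
    return (1.0 / (t * a + 1.0) - 1.0 / (t * b + 1.0)) / t

def outer(f, lo, hi, n=20000):
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

I1 = outer(lambda t: inner(t, (2 * t - 1) / t, 1 / (2 - t)), 0.5, 1.0)
I2 = outer(lambda t: inner(t, (2 * t - 1) / (2 - 3 * t), 1 / (2 - t)),
           0.5, 2 - math.sqrt(2))

gamma_B = (I1 - I2) / math.log(2)
closed = (math.sqrt(2) - 0.5 + math.log(math.sqrt(2) - 1)) / math.log(2)

assert abs(gamma_B - closed) < 1e-6
assert abs(closed - 0.0473) < 1e-4
assert abs(2 * math.log(2) * closed - 0.0656) < 1e-4   # = rho_DCF(B), Corollary 4.3.12
```

Since 1 − γ̄(S_DCF) = 1/(2 log 2), the frequency in Corollary 4.3.12 is simply 2 log 2 · γ̄(B).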

4.3.3 Bosma's optimal continued fraction expansion

A remarkable geometric interpretation of the RCF expansion of an irrational number was given by Klein (1895). The idea behind it is to represent any irreducible p/q ∈ Q ∩ I by an integer-valued vector in R²₊, namely by the point (q, p) ∈ R²₊, and to represent an irrational number ω ∈ Ω by a half-line L with slope ω. The approximation of ω by its RCF convergents then amounts to systematically finding integer-valued vectors close to L. More precisely, starting from V_{−1} = (0, 1) and V_0 = (1, 0) we define V_n recursively by

V_n = a_n V_{n−1} + V_{n−2},  n ∈ N+,

where a_n ∈ N+ is maximal with respect to the property that V_n is on the same side of L as V_{n−2}. It then appears that the positive integers a1, a2, ··· are in fact the RCF digits of ω, that is, ω = [a1, a2, ···].
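Klein's rule can be run directly: grow Vn = aVn−1 + Vn−2 with a maximal while Vn stays on the same side of L as Vn−2; this recovers the RCF digits (a floating-point sketch, `klein_digits` is our name):

```python
import math

def klein_digits(omega, n):
    """RCF digits of omega in (0,1) via Klein's geometric algorithm."""
    Vpp, Vp = (0, 1), (1, 0)               # V_{-1}, V_0 stored as (q, p)
    side = lambda v: v[1] - omega * v[0]   # sign says on which side of L the point lies
    digits = []
    for _ in range(n):
        a = 1
        while True:
            cand = ((a + 1) * Vp[0] + Vpp[0], (a + 1) * Vp[1] + Vpp[1])
            if side(cand) * side(Vpp) > 0:
                a += 1                      # still on the same side: keep growing
            else:
                break
        digits.append(a)
        Vpp, Vp = Vp, (a * Vp[0] + Vpp[0], a * Vp[1] + Vpp[1])
    return digits

assert klein_digits(math.sqrt(2) - 1, 6) == [2] * 6
assert klein_digits((math.sqrt(5) - 1) / 2, 8) == [1] * 8
```

The second coordinate of each Vn is the numerator pn and the first the denominator qn of the corresponding convergent.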

Ergodic theory of continued fractions

293

Bosma (1987) gave a similar interpretation of α-expansions and, inspired by this, presented a very interesting SRCF expansion formally deﬁned as follows. pe −1 Deﬁnition 4.3.13 Let −1/2 < x < 1/2. Put ae = 0, te = x, e1 = sgn te , 0 0 0 e e = 1, q−1 = 0, pe = 0, q0 = 1, se = 0, and deﬁne recursively 0 0 e e −1 + ee e −1 tk k+1 sk e tk , ak+1 = + −1 e e e +1 2 tk + ek+1 sk te = te k+1 k

−1

− ae , k+1

ek+2 = sgn te , k+1

e e e pe = ae pe + ek+1 pe , qk+1 = ae qk + ek+1 qk−1 , k+1 k+1 k k−1 k+1 e e se = qk /qk+1 , k ∈ N. k+1

The optimal continued fraction (OCF ) expansion of x, denoted OCF(x), is the SRCF expansion [e1 /ae , e2 /ae , · · · ]. For an irrational x ∈ R such that 1 2 2x ∈ Z, OCF(x)=[ae ; e1 /ae , e2 /ae , · · · ] is deﬁned as ae + [e1 /ae , e2 /ae , · · · ], 0 1 2 0 1 2 where ae ∈ Z is such that −1/2 < x − ae < 1/2, and [e1 /ae , e2 /ae , · · · ] = 0 0 1 2 OCF(x − ae ). 0 is, It is not diﬃcult to see that the te and se have the usual meaning, that k k te = [ek+1 /ae , · · · ], k k+1 k ∈ N,

se k

if k = 0, 0 1/ae if k = 1, = 1 [1/ae , ek /ae , · · · , e2 /ae ] if k ≥ 2 1 k k−1

e and pe /qk , k ∈ N, are the OCF convergents of x. k e Next, the sequence of OCF convergents (pe /qk )k∈N is a subsequence of k the sequence (pn /qn )n∈N of RCF convergents. If we deﬁne n(k) in such a e way that pe /qk = pn(k) /qn(k) , k ∈ N+ , then k n(k) + 1 if ek+2 = 1, n(k + 1) = n(k) + 2 if ek+2 = −1

294 with n(0) = 0 if x > 0, 1 if x < 0.

Chapter 4

Finally, it appears that the OCF expansion gives approximation coeﬃe e cients Θe = (qn )2 |x − (pe /qn )| < 1/2 for any n ∈ N and, at the same time, n n it is a fastest expansion. Fastest SRCF expansions for which all convergents are RCF convergents can be deﬁned as those in which always the maximal number of RCF convergents is skipped, meaning that whenever a 1-block of length m ∈ N+ occurs in the RCF expansion, exactly (m + 1)/2 out of the m 1’s are skipped. (Note that this implies that for fastest SRCF expansions only a choice is left in deciding which RCF convergents will be skipped when m is even.) A still more precise deﬁnition of ‘fastest’ is as follows. Writing nα (k) := nSα (k), k ∈ N+ , α ∈ [1/2, 1], by Theorem 4.2.8 we have a.e. nα (k) = lim k→∞ k log 2 = 1.44092 · · · log G log 2 log(α + 1) if 1/2 ≤ α ≤ g, if g < α ≤ 1.

Then an (arbitrary) SRCF expansion is said to be fastest if and only if nSRCF (k) = n1/2 (k) for inﬁnitely many k ∈ N+ . Here the non-decreasing function nSRCF : N+ → N+ is deﬁned by

e qnSRCF (k) ≤ qk < qnSRCF (k)+1 ,

k ∈ N+ ,

e where the qi and qi , i ∈ N+ , are associated with the RCF expansion and the SRCF expansion considered, respectively. Cf. Bosma (1987, p. 364).

The next result [cf. Bosma and Kraaikamp (1990)] places OCF expansions in the context of the S-expansion theory. More precisely, it shows how singularizing appropriately the RCF expansion yields the OCF expansion. (Note that it is for this reason that we have anticipated notation by denoting the OCF expansion as an S-expansion.) Lemma 4.3.14 Let ω ∈ Ω have RCF expansion [a1 , a2 , · · · ], RCF convergents pn /qn , and RCF approximation coeﬃcients Θn , n ∈ N. Consider the set 2x − 1 SOCF = (x, y) ∈ I 2 ; y < min x, . 1−x Then for any n ∈ N+ the following three assertions are equivalent:

Ergodic theory of continued fractions (i) pn /qn is not an OCF convergent of ω; (ii) an+1 = 1 , Θn−1 < Θn and Θn > Θn+1 ; (iii) (τ n , sn ) ∈ SOCF .

295

Proof. For the proof of the equivalence of (i) and (ii) we refer the reader to Corollary (4.20) of Bosma (1987). Here we show that (ii) and (iii) are equivalent. Since sn τn Θn−1 = , Θn = , n ∈ N+ , (4.3.2) sn τ n + 1 sn τ n + 1 we have |qn ω − pn | Θn qn−1 = = τ n < 1, |qn−1 ω − pn−1 | Θn−1 qn Θn−1 < Θn if and only if ω ∈ Ω. (4.3.3)

Also τ n > sn . (4.3.4) Furthermore, if an+1 = 1 then pn+1 = pn + pn−1 and qn+1 = qn + qn−1 , and by (4.3.3) we have Θn+1 = qn+1 |qn+1 ω − pn+1 | = (qn + qn−1 )|(qn + qn−1 )ω − (pn + pn−1 )| = (qn + qn−1 )|(qn−1 ω − pn−1 ) + (qn ω − pn )| = (qn + qn−1 )(|qn−1 ω − pn−1 | − |qn ω − pn |) since qn ω − pn and qn−1 ω − pn−1 have diﬀerent signs, as shown by equation (1.1.18). Thus Θn+1 = Θn−1 1 + It follows from (4.3.3) that an+1 = 1 and Θn+1 < Θn if and only if sn < 2τ n − 1 . 1 − τn (4.3.5) qn qn−1 − Θn 1 + qn−1 qn .

Combining (4.3.4) and (4.3.5) with the deﬁnition of SOCF completes the proof. 2

296 Remarks. 1. It is easy to check that γ (SOCF ) = 1 − ¯ log G , log 2

Chapter 4

so SOCF is a maximal singularization area. See Figure 4.5. Notice that SOCF contains SDCF , hence any sequence of OCF convergents is a subsequence of the corresponding sequence of DCF convergents. Since τ (SOCF ) ⊂ I 2 \SOCF , ¯ the set BSOCF of the OCF preservation area of 1’s is empty. Hence any OCF incomplete quotient (or digit) is greater than or equal to 2. 2. It now appears that the function n : N+ → N+ considered above is in fact nSOCF . It then follows from Theorem 4.2.8 that n(k) log 2 = = 1.4404 · · · a.e.. k→∞ k log G lim 2 As in the DCF expansion case, the general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results. For detailed proofs the reader is referred to Bosma and Kraaikamp (1990, 1991). With the notation in Subsection 4.2.5, for the OCF case we have ∆OCF = I 2 \ SOCF = (x, y) ∈ I 2 : y ≥ min x, 2x − 1 1−x ,

2 ∆− ¯ OCF = τ (SOCF ) = (x, y) ∈ I : (y, x) ∈ SOCF ,

that is, reﬂecting SOCF in the diagonal y = x yields ∆− , and OCF AOCF := ASOCF = M (∆OCF ) = (x, y) ∈ (−1/2, g) × [0, g] : y ≤ min and y ≥ max 0, 2x − 1 1−x , 2x + 1 x + 1 , x+1 x+2

see Figure 4.5. Furthermore, writing fOCF for fSOCF and τOCF for τSOCF we have ¯ ¯ fOCF (x, y) = τOCF (x, y) = ¯ x−1 + |x−1 | + y sgn x 2( |x−1 | + y sgn x) + 1 ,

x−1 − fOCF (x, y), (fOCF (x, y) + y sgn x)−1

**Ergodic theory of continued fractions
**

.. ... ... ... ... ... ... .. ... ... .. .. ... ... ... ... ... ... ... ... ... ... .. .. ... ... ... ... ... ... . .. ... .. .. ....... . ....... .......... . .......... . .......... .......... . . ......... . ......... . ........ ........ . . ... .... ........ . . . ....... ....... . . ...... ....... . .. ... ...... . . . ...... ...... . .... ..... . . ..... ..... . . ..... . ..... . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

297

1

τ (SOCF ) ¯

1/2

SOCF

M (¯(SOCF )) τ

−1/2

0

1/2 g

1

Figure 4.5: SOCF for (x, y) ∈ AOCF . Theorem 4.3.15 Let ρOCF be the probability measure on BAOCF with density 1 1 , (x, y) ∈ AOCF . log G (xy + 1)2 Then (AOCF , BAOCF , τOCF , ρOCF ) is an ergodic dynamical system which un¯ derlies the OCF expansion. Remark. For both DCF and OCF expansions the two-dimensional sets ADCF and AOCF have curved boundaries. This implies that the functions fDCF and fOCF depend on both their arguments x and y, and not only on x as in the case of α-expansions, α ∈ [1/2, 1]. As a result, no one-dimensional ergodic dynamical system exists for either DCF or OCF expansion. 2 Proposition 4.3.16 For any µ ∈ pr B[−1/2,g] such that µ any (t1 , t2 ) ∈ I 2 we have

n→∞

λ and

lim µ Θe ≤ t1 , Θe ≤ t2 n−1 n

= H(t1 , t2 ).

Here H is the distribution function with density 1 1 1 + if (x, y) ∈ Π, log G 1 − 4xy 1 + 4xy 0 elsewhere,

298 where Π = (x, y) ∈ R2 : 4x2 + y 2 < 1, x2 + 4y 2 < 1 . ++

Chapter 4

The result above can be also stated in an equivalent form concerning the existence for any (t1 , t2 ) ∈ I 2 of the limit a.e. equal to H(t1 , t2 ) of 1 card{k : Θe ≤ t1 , Θe ≤ t2 , 0 ≤ k ≤ n − 1} k k+1 n as n → ∞. It then follows, e.g., that 1 lim n→∞ n

n

Θe = k

k=1

arctan 1 2 = 0.24087 · · · 4 log G

a.e..

(4.3.6)

Other consequences are that for any irrational number we have (i) 0 < Θe < 1/2, n ∈ N+ ; n √ √ (ii) 0 < Θe + Θe < 2/ 5, hence min (Θe , Θe ) < 1/ 5, n ∈ N+ . n n n−1 n−1 √ In connection with (ii) above, it should be noted that the constant 1/ 5 in the second inequality is ‘best possible’ by A. Hurwitz’s result mentioned just before Theorem 4.3.8. Remark. The a.e. asymptotic arithmetic mean (4.3.6) should be compared with the corresponding values 1 = 0.36067 · · · 4 log 2 1 = 0.25 4 √ 5−2 = 0.24528 · · · 2 log G √ 8G + 6 − 2G − 1 = 0.24195 · · · log G for the RCF expansion,

for the DCF expansion,

for the NICF and SCF expansions,

for the α0 -expansion,

where α0 = 0.55821 · · · . See Corollary 4.1.23 and Proposition 4.3.10 for the ﬁrst two values, and Bosma et al. (1983) for the last two ones. Note how close the value in (4.3.6) is to 1 − γ (SOCF ) = ¯ log G = 0.24061 · · · . 2

Ergodic theory of continued fractions

299

The latter gives an a priori bound for the a.e. asymptotic arithmetic mean of the approximation coeﬃcients. It can be shown that the value in (4.3.6) is in fact ‘the best one can get’ for any irrational number. More precisely, we have the following result. Theorem 4.3.17 [Bosma and Kraaikamp (1991)] Whatever the SRCF e expansion with convergents pe /qn and approximation coeﬃcients Θe , n ∈ N, n n we have m n 1 1 Θ e , n ∈ N+ , Θe ≥ k k m n

k=1 k=1 e e for any irrational number, where m = card{k : qk < qn+1 , k ∈ N+ } and e and Θe , n ∈ N , are associated with the OCF expansion. qn + n

4.4

4.4.1

**Continued fraction expansions with σ-ﬁnite, inﬁnite invariant measure
**

The insertion process

We have seen in previous subsections how the concept of singularization leads to a class of SRCF expansions for which the underlying ergodic theory can be developed. The idea of adding a convergent instead of removing one (as singularization does) leads to the concept of insertion, to some extent the opposite of that of singularization. Now, the fundamental identity is a+ 1 = a+1 + b+x −1 1+ 1 b−1+x ,

where a ∈ Z, b ∈ N+ , b > 1, x ∈ [0, 1). Let (cf. Subsection 4.2.2) (ek )k∈M , (ak )k∈{0}∪M (4.4.1)

be a (ﬁnite or inﬁnite) CF with a +1 > 1, e +1 = 1 for some ∈ N for which + 1 ∈ M . The transformation ι which takes (4.4.1) into the CF (ek )k∈M , f (ak )k∈{0}∪M , f (4.4.2)

where M = M if M = N+ and M = {k : 1 ≤ k ≤ n + 1} if M = {k : 1 ≤ k ≤ n}, n ∈ N+ , with ek = ek , k ∈ M , k ≤ , e +1 = −1,

300

Chapter 4

e +2 = 1, ek = ek−1 , k ∈ M , k ≥ + 3, ak = ak , k ∈ {0} ∪ M , k ≤ − 1, a = a + 1, a +1 = 1, a +2 = a +1 − 1, ak = ak−1 , k ≥ + 3, is called e an insertion of the pair (1, −1) before a +1 , e +1 . Let (pe /qk )k∈{0}∪M and k e (pe /qk )k∈{0}∪M be the sets associated with (4.4.1) and (4.4.2), respectively. f k The result corresponding to Proposition 4.2.4 can be stated as follows. Proposition 4.4.1 Let gents ∈ N such that

e (pe /qk )k∈{0}∪M f k

+ 1 ∈ M . The set of conver-

resulting after the insertion ι of the pair (1, −1) before a +1 (> 1), e +1 (= 1), is obtained by inserting the term (pe + pe−1 )/(q e + q e−1 ) in the set e e (pe /qk )k∈{0}∪M before the convergent pe /q e . As usual, here pe = 1, q−1 = −1 k 0. The proof is similar to that of Proposition 4.2.4 by using appropriate matrix identities. 2 Starting from the RCF expansion, by appropriate insertions we can obtain many classical SRCF expansions, and also continued fraction algorithms which are not SRCF expansions. Amongst the former we mention the Lehner continued fraction (LCF) expansion, and amongst the latter the Farey continued fraction (FCF) expansion. Both these expansions will be studied in the next subsection. In particular, we can obtain this way the OddCF and EvenCF expansions —see the examples of SRCF expansions at the end of Subsection 4.2.2—as well as the backward continued fraction (BCF) expansion that we will study in Subsection 4.4.3.

4.4.2

The Lehner and Farey continued fraction expansions

Lehner (1994) showed that any number x ∈ [1, 2) has a unique inﬁnite SRCF expansion of the form e1 b0 + := [ b0 ; e1 /b1 , e2 /b2 , · · · ] , (4.4.3) e2 b1 + . b2 + . . where (bn , en+1 ) is equal to either (1, 1) or (2, −1), n ∈ N. We shall call this expansion the Lehner continued fraction (LCF ) expansion. Dajani and Kraaikamp (2000) called it the Lehner fraction or the Lehner expansion, and showed that if we deﬁne the transformation L : [1, 2) → [1, 2) by L(x) = e(x) , x − b(x) x ∈ [1, 2),

**Ergodic theory of continued fractions where (b(x), e(x)) = then (bn (x), en+1 (x)) = (b(Ln (x)), e(Ln (x))) , x ∈ [1, 2), (2, −1) if 1 ≤ x < 3 , 2 (1, 1) if
**

3 2

301

≤ x < 2,

for any n ∈ N. Here Ln , n ∈ N+ , denotes the composition of L with itself n times while L0 is the identity map. Denoting as usual the RCF convergents of a real number x = [a0 ; a1 , a2 , · · · ] by (pn /qn )n∈N and deﬁning the mediant convergents of x by kpn + pn−1 , kqn + qn−1 1 ≤ k < an+1 , n = 1, 2, · · ·

(so that if an+1 = 1 then there is no mediant convergent), we will see that the set of LCF convergents of x is the union of the sets of RCF and mediant convergents of x. It is for this reason that the LCF expansion was called the mother of all SRCF expansions in Dajani and Kraaikamp (op. cit.). Proposition 4.4.2 Let x ∈ [1, 2) \ Q, with RCF expansion [ 1; a1 , a2 , · · · ]. Then the LCF expansion (4.4.3) of x is given by the following algorithm. (i) Let n be the smallest m ∈ N for which am+1 > 1. If n = 0, that is, a1 > 1 then we replace [1; a1 , a2 · · · ] by [ 2; −1/2, · · · , −1/2, −1/1, 1/1, 1/a2 , · · · ] .

(a1 −2) times

**If n ≥ 1 then we replace [ 1; 1, · · · , 1, an+1 , · · · ] by ιn+an+1 −1 ( · · · (ιn+1 (ιn ([ 1; 1, · · · , 1, an+1 , · · · ])) · · · ) = [ 1; 1/1, · · · , 1/1, , 1/2, −1/2, · · · , −1/2, −1/1, 1/1, 1/an+2 , · · · ] ,
**

(n−1) times (an+1 −2) times

where ιn is deﬁned as in Subsection 4.4.1. Denote the SRCF expansion of x thus obtained by

302

Chapter 4

[ b0 ; e1 /b1 , e2 /b2 , · · · ].

(4.4.4)

(ii) Let n > n be the smallest integer m > n for which em +1 = 1 and bm +1 > 1. Apply to (4.4.4) the procedure from (i) to bn +1 . The proof is easy and left to the reader. 2 Remark. It follows from the very insertion mechanism that any RCF or mediant convergent is an LCF convergent. Conversely, the sequence of LCF convergents is obtained after all mediant convergents have been inserted into the sequence of RCF convergents. Another immediate consequence is that the LCF expansion of a quadratic irrationality is (eventually) periodic. 2 Note that the transformation L [which is implicit in Lehner (1994)] is isomorphic to the transformation I : [0, 1) → [0, 1) deﬁned by x 1 − x if 0 ≤ x < 1/2, I(x) = 1−x if 1/2 ≤ x < 1, x which was used by Ito (1989) to generate the RCF and mediant convergents of any x ∈ [0, 1). More precisely, we have L(x) = I(x − 1) + 1, We also have L(x) = 1 , I (h(x − 1)) x ∈ [1, 2).

x ∈ [1, 2),

where the bijective function h : [0, 1) → [1/3, 2/3) is deﬁned by 1 2−x h(x) = x x+1 if 0 ≤ x < 1/2, if 1/2 ≤ x < 1.

Ito (op. cit.) showed that I is ν-preserving, where ν is the σ-ﬁnite, inﬁnite measure on B[0,1) with density x−1 , x ∈ (0, 1), and that [0, 1), B[0,1) , I, ν is an ergodic dynamical system. This implies that L is µ-preserving, where µ is the σ-ﬁnite, inﬁnite measure on B[1,2) with density (x − 1)−1 , x ∈ (1, 2), and that [1, 2), B[1,2) , L, µ , is an ergodic dynamical system underlying the LCF expansion.

Ergodic theory of continued fractions

303

We will now exhibit the relationship between the LCF expansion and an algorithm yielding the so called Farey continued fraction (FCF ) expansion. The latter is an inﬁnite CF expansion of any x ∈ [−1, 0) ∪ (0, ∞) of the form f1 d1 + f2 . d2 + . . := [ f1 /d1 , f2 /d2 , · · · ] , (4.4.5)

where (dn , fn ) is equal to either (1, 1) or (2, −1), n ∈ N+ . Formally, as shown by Dajani and Kraaikamp (op. cit.), if we deﬁne the transformation F : [−1, ∞) → [−1, ∞) by f (x) − d(x) if x = 0, x F(x) = 0 if x = 0, where (d(x), f (x)) = then (dn (x), fn (x)) = d(Fn−1 (x)), f (Fn−1 (x)) , x ∈ [−1, ∞), (2, −1) if − 1 ≤ x < 0, (1, 1) if x ≥ 0,

for any n ∈ N+ . Here Fn , n ∈ N+ , denotes the composition of F with itself n times while F0 is the identity map. By its very deﬁnition the FCF expansion is not an SRCF expansion since the condition fn+1 + dn ≥ 1, n ∈ N+ , is violated. ¯ Put D = [1, 2) × [−1, ∞), and deﬁne the transformation L : D → D by ¯ L(x, y) = L(x), e(x) b(x) + y , (x, y) ∈ D.

¯ It is easy to check that L is a one-to-one transformation of D := [1, 2) × ([−1, 0) ∪ (0, ∞)) with inverse ¯ L−1 (x, y) = Also, for any n ≥ 2 we have ¯ Ln (x, y) = (Ln (x), [en (x)/bn−1 (x), · · · , e2 (x)/b1 (x), e1 (x)/(b0 (x) + y)]) f (y) + d(y), F(y) , x (x, y) ∈ D .

304 whatever (x, y) ∈ D, and

Chapter 4

¯ L−n (x, y) = ([dn (y); fn (y)/dn−1 (y), · · · , f2 (y)/d1 (y), f1 (y)/x], Fn (y)) whatever (x, y) ∈ D . Remark. It is interesting to compare the last two equations above with (1.3.1 ) and (1.3.2 ). This might suggests developments similar to those in Section 1.3. 2 ¯ ¯ Theorem 4.4.3 The quadruple D, BD , L, µ is an ergodic dynamical system which is a natural extension of the dynamical system [1, 2), B[1,2) , L, µ . Here µ is the σ-ﬁnite, inﬁnite measure on BD with density (x+y)−2 , (x, y) ∈ ¯ D = [1, 2) × [−1, ∞). Proof. Let π1 : [1, 2) × [−1, ∞) → [1, 2) denote the projection onto the ﬁrst axis. Cf. Remark 1 after Proposition 4.0.5. Then it is easy to check ¯ that π1 ◦ L = L ◦ π1 , and that µ π1 (A) = µ(A), ¯ −1 A ∈ B[1,2) .

**¯ ¯ We should next show that L is µ-preserving and, ﬁnally, that the σ-algebra generated by ¯ −1 Ln π1 B[1,2)
**

n∈N

coincides with BD . We leave the details to the reader, who can ﬁnd them in Dajani and Kraaikamp (op. cit.). 2 Let us denote by φ the σ-ﬁnite, inﬁnite measure on B[−1,∞) with density (x + 1)−1 − (x + 2)−1 , x ∈ (−1, ∞). It is easy to check that F is φ-preserving. Theorem 4.4.4 The map ξ : [−1, 0) ∪ (0, ∞) → [1, 2) deﬁned by ξ(x) = [ d1 ; f1 /d2 , f2 /d3 , · · · ] , if x ∈ [−1, 0) ∪ (0, ∞) has FCF expansion x = [ f1 /d1 , f2 /d2 , · · · ] is an isomorphism from [−1, ∞), B[−1,∞) , F, φ to [1, 2), B[1,2) , L, µ .

Proof. It is clear that ξ is bijective. Since

    L(ξ(x)) = L([ d_1; f_1/d_2, f_2/d_3, · · · ]) = [ d_2; f_2/d_3, f_3/d_4, · · · ]
            = ξ([ f_2/d_2, f_3/d_3, · · · ]) = ξ(F(x)),

we only need to show that ξ is measurable and that µ(A) = φ(ξ^{−1}(A)) for any A ∈ B_[1,2). Whilst measurability is obvious, the equation above can be easily checked. The details can be found in Dajani and Kraaikamp (op. cit.). 2

An immediate consequence of Theorems 4.4.3 and 4.4.4 is that ([−1, ∞), B_[−1,∞), F, φ) is an ergodic dynamical system underlying the FCF expansion.

Remark. Corollary 4.1.10 in conjunction with the insertion concept provides a heuristic argument why the dynamical system ([1, 2), B_[1,2), L, µ) should be ergodic, where L is µ-preserving for a σ-finite, infinite measure µ. After all, an insertion before a digit > 1 simply builds a tower over the RCF cylinder corresponding to that digit. Since the LCF expansion is obtained by using insertion as many times as possible in order to 'shrink away' any RCF digit > 1, it follows that the system thus obtained should be ergodic (it includes the RCF dynamical system as an induced system), but by Corollary 4.1.10 it should have infinite mass. 2

The next result corresponds to Proposition 4.1.8 for the values p = −1, 0, 1 there.

Theorem 4.4.5. Let x ∈ [1, 2) \ Q with LCF expansion [ b_0; e_1/b_1, e_2/b_2, · · · ]. Then

    lim_{n→∞} n / (1/b_1 + · · · + 1/b_n) = 2   a.e.,
    lim_{n→∞} (b_1 · · · b_n)^{1/n} = 2   a.e.,
    lim_{n→∞} (b_1 + · · · + b_n)/n = 2   a.e.

Proof. Let [1; a_1, a_2, · · · ] be the RCF expansion of x. For any given sufficiently large m ∈ N+ there (uniquely) exist integers k ∈ N+ and j ∈ N such that

    m = a_1 + · · · + a_k + j,   0 ≤ j < a_{k+1}.

By Proposition 4.4.2 the LCF expansion is obtained by replacing any RCF digit ℓ by a block of LCF digits of length ℓ consisting of (ℓ − 1) 2's followed by one 1. Then

    1/b_1 + · · · + 1/b_m = k + (1/2) Σ_{i=1}^k (a_i − 1) + j/2 = (m + k)/2.

This implies that

    m / (1/b_1 + · · · + 1/b_m) = 2 / (1 + k/m).

Since 0 ≤ j < a_{k+1}, we have

    k/m ≤ k/(a_1 + · · · + a_k) = ( (1/k) Σ_{i=1}^k a_i )^{−1},

which converges a.e. to 0 by Corollary 4.1.10. Hence

    lim_{m→∞} m / (1/b_1 + · · · + 1/b_m) = 2.

Since any b_n, n ∈ N+, is equal to either 1 or 2, recalling the classical inequalities

    m / (1/b_1 + · · · + 1/b_m) ≤ (b_1 · · · b_m)^{1/m} ≤ (b_1 + · · · + b_m)/m (≤ 2),

the result follows. 2

Corollary 4.4.6. Let x ∈ [−1, ∞) \ Q with FCF expansion [ f_1/d_1, f_2/d_2, · · · ]. Then

    lim_{n→∞} n / (1/d_1 + · · · + 1/d_n) = 2   a.e.,
    lim_{n→∞} (d_1 · · · d_n)^{1/n} = 2   a.e.,
    lim_{n→∞} (d_1 + · · · + d_n)/n = 2   a.e.

The proof follows from Theorems 4.4.4 and 4.4.5. 2
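For readers who want to experiment, the block structure used in the proof of Theorem 4.4.5 (each RCF digit ℓ becomes (ℓ − 1) 2's followed by one 1) can be checked in a few lines of Python. The digit sequence below is an arbitrary illustration of our own, not an expansion taken from the text, and we take j = 0 so that m = a_1 + · · · + a_k:

```python
from fractions import Fraction

def lcf_digits(rcf_digits):
    """Each RCF digit a becomes (a - 1) twos followed by one 1 (cf. Prop. 4.4.2)."""
    b = []
    for a in rcf_digits:
        b.extend([2] * (a - 1))
        b.append(1)
    return b

a = [3, 1, 4, 1, 5, 9, 2, 6]      # hypothetical RCF digits, for illustration only
k = len(a)
b = lcf_digits(a)
m = len(b)                         # m = a_1 + ... + a_k  (the case j = 0)

harmonic_sum = sum(Fraction(1, bi) for bi in b)
assert harmonic_sum == Fraction(m + k, 2)   # 1/b_1 + ... + 1/b_m = (m + k)/2
assert sum(b) == 2 * m - k                  # so (b_1 + ... + b_m)/m = 2 - k/m
print(m, k, harmonic_sum)                   # 31 8 39/2
```

Since k/m → 0 a.e., both exact identities make it plausible that all three means in Theorem 4.4.5 converge to 2.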

4.4.3 The backward continued fraction expansion

Until now we have used only the insertion mechanism in this section. As an example of combining singularization and insertion we discuss here the backward continued fraction (BCF) expansion. Any irrational number ω ∈ I has an infinite CF expansion of the form

    ω = 1 − 1/(c_1 − 1/(c_2 − ⋱)) := [ 1; −1/c_1, −1/c_2, · · · ],        (4.4.6)

where 2 ≤ c_n = c_n(ω) ∈ N+, so that (4.4.6) is an SRCF expansion. There is a transformation β : I → I naturally associated with the RCF transformation τ, which is defined by

    β(x) = ⌈(x − 1)^{−1}⌉ − (x − 1)^{−1}   if x ∈ [0, 1),   β(1) = 0.

The graph of β can be obtained from that of τ by reflecting the latter in the line x = 1/2. It is for this reason that (4.4.6) has been called 'backward'. Note also that β(x) = −N_0(x − 1), x ∈ I, where N_0 is defined in Subsection 4.3.1. In terms of β, the incomplete BCF quotients are given by

    c_n = c_1( β^{n−1}(ω) ),   n ∈ N+,   with c_1 = ⌈(1 − ω)^{−1}⌉,   ω ∈ Ω.

Here β^n, n ∈ N+, denotes the composition of β with itself n times while β^0 is the identity map.

Rényi (1957) showed that β is ν-preserving, where ν is Ito's σ-finite, infinite measure with density x^{−1}, x ∈ (0, 1), which has been considered in Subsection 4.4.2, and that the dynamical system (I, B_I, β, ν) is ergodic. See also Adler and Flatto (1984).
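The map β can be iterated exactly on rationals, since β(x) is just the fractional part of 1/(1 − x) and c = ⌈1/(1 − x)⌉. The following sketch (our own illustration, not from the text) uses a Pell-recurrence rational approximation of √2 to compute the BCF digits of ω ≈ √2 − 1; they come out eventually periodic, as expected for a quadratic irrational:

```python
from fractions import Fraction

def bcf_digits(x, n):
    """First n BCF digits via c = ceil(1/(1-x)) and beta(x) = frac(1/(1-x))."""
    out = []
    for _ in range(n):
        y = 1 / (1 - x)                # exact Fraction arithmetic
        f = y.numerator // y.denominator
        out.append(f + 1)              # c = ceil(y); y is non-integral here
        x = y - f                      # beta(x) = y - floor(y)
    return out

# rational approximation of sqrt(2) via the Pell recurrence (p, q) -> (p+2q, p+q)
p, q = 1, 1
for _ in range(40):
    p, q = p + 2 * q, p + q
omega = Fraction(p, q) - 1             # approximates sqrt(2) - 1 extremely closely

print(bcf_digits(omega, 20))           # alternating digits 2, 4, 2, 4, ...
```

The approximation error (about q^{−2}) is far below the precision lost in 20 iterations, so the printed digits agree with those of √2 − 1 itself.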

As with Proposition 4.4.2 we leave to the reader the proof of the following result.

Proposition 4.4.7. Let ω ∈ Ω with RCF expansion [ a_1, a_2, · · · ]. Then the BCF expansion (4.4.6) of ω is given by the following algorithm.

(i) If a_1 = 1 then singularize a_1 to arrive at [ 1; −1/(a_2 + 1), 1/a_3, · · · ] as a new SRCF expansion of ω. If a_1 > 1 then insert (a_1 − 1) times −1/1 before a_1 to arrive at

    [ 1; −1/2, · · · , −1/2, −1/1, 1/1, 1/a_2, · · · ]

(with −1/2 occurring (a_1 − 2) times) as a new SRCF expansion of ω, and then singularize the digit 1 appearing before 1/a_2 in this expansion of ω. In either case we obtain as SRCF expansion of ω

    [ 1; (−1/2)^{a_1−1}, −1/(a_2 + 1), 1/a_3, · · · ],        (4.4.7)

where (−1/2)^{a_1−1} abbreviates −1/2, · · · , −1/2 repeated (a_1 − 1) times.

(ii) Let n be the smallest integer m ∈ N+ for which e_m = 1 in (4.4.7). Apply to the latter expansion the procedure from (i) to a_n.

Remarks. 1. The above insertion/singularization mechanism implies that ω has a BCF expansion

    [ 1; (−1/2)^{a_1−1}, −1/(a_2 + 2), (−1/2)^{a_3−1}, −1/(a_4 + 2), · · · ].        (4.4.8)

See also Zagier (1981, Aufgabe 3, p. 131). It also follows easily from (4.4.8) that every quadratic irrationality has an (eventually) periodic BCF expansion.

2. Again, as for the LCF expansion, it heuristically follows from Corollary 4.1.10 and the insertion mechanism that the BCF transformation β should be ergodic, with invariant σ-finite, infinite measure. 2

For the LCF expansion it was intuitively clear that (b_1 · · · b_n)^{1/n} → 2 a.e. as n → ∞ since the only digits are 1 and 2, and 'there are very few 1's against

the 2's' (by Corollary 4.1.10). For the BCF expansion such an argument clearly does not work. However, we have the following result.

Theorem 4.4.8. Let ω ∈ Ω with BCF expansion (4.4.6). Then

    lim_{n→∞} (c_1 · · · c_n)^{1/n} = 2   a.e.

and

    lim_{n→∞} n / (1/c_1 + · · · + 1/c_n) = 2   a.e.

Proof. Let [ a_1, a_2, · · · ] be the RCF expansion of ω. For any given sufficiently large m ∈ N+ there (uniquely) exist integers k ∈ N+ and j ∈ N such that

    m = a_1 + a_3 + · · · + a_{2k−1} + j,   0 ≤ j < a_{2k+1}.

It follows from (4.4.8) that

    c_1 · · · c_m = 2^{Σ_{i=1}^k (a_{2i−1} − 1) + j − 1} ∏_{i=1}^k (a_{2i} + 2),

and therefore

    (1/m) Σ_{i=1}^m log c_i
        = (log 2 / m) ( Σ_{i=1}^k a_{2i−1} − k + j − 1 ) + (1/m) Σ_{i=1}^k log(a_{2i} + 2)
        = (log 2) ( 1 − (k + 1) / ( Σ_{i=1}^k a_{2i−1} + j ) ) + ( Σ_{i=1}^k log(a_{2i} + 2) ) / ( Σ_{i=1}^k a_{2i−1} + j ).

Since

    (k + 1) / ( Σ_{i=1}^k a_{2i−1} + j ) = ( (1/(k+1)) Σ_{i=1}^k a_{2i−1} + j/(k+1) )^{−1} → 0   a.e.

as m → ∞, and

    ( Σ_{i=1}^k log(a_{2i} + 2) ) / ( Σ_{i=1}^k a_{2i−1} + j ) → 0   a.e.

as m → ∞, we deduce that

    (c_1 · · · c_m)^{1/m} → 2   a.e.

as m → ∞. Next, since c_n ≥ 2, n ∈ N+, we have

    m / (1/c_1 + · · · + 1/c_m) ≥ 2.

Using the same inequalities as in the proof of Theorem 4.4.5 we therefore obtain

    2 ≤ lim_{m→∞} m / (1/c_1 + · · · + 1/c_m) ≤ lim_{m→∞} (c_1 · c_2 · · · · · c_m)^{1/m} = 2,

that is,

    lim_{m→∞} m / (1/c_1 + · · · + 1/c_m) = 2   a.e.   2

Remark. The asymptotic behaviour of the arithmetic mean (c_1 + · · · + c_m)/m as m → ∞ was posed as an open problem in Dajani and Kraaikamp (2000). If we write m as before, then an easy calculation yields

    (c_1 + · · · + c_m)/m = 2 + ( Σ_{i=1}^k a_{2i} ) / ( j + Σ_{i=1}^k a_{2i−1} ),

with 0 ≤ j < a_{2k+1}. Thus we need to study the behaviour of

    ( Σ_{i=1}^k a_{2i} ) / ( Σ_{i=1}^k a_{2i−1} )        (4.4.9)

as k → ∞. The asymptotic behaviour of the numerator in (4.4.9) is the same as that of the denominator, and Aaronson (1986) showed that the fraction converges to 1 in probability. However, one expects that infinitely often the denominator is much larger than the numerator, and vice versa. Thus Dajani and Kraaikamp (op. cit.) conjectured that the lim inf and lim sup of (4.4.9) are a.e. equal to 0 and +∞, respectively. Recently, Aaronson and Nakada (2001) have proved this conjecture. 2
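The digit pattern (4.4.8) that drives the proof of Theorem 4.4.8 can be cross-checked numerically. In the sketch below (our own illustration), ω = (√15 − 3)/2 has purely periodic RCF expansion [0; 2, 3, 2, 3, · · · ], so (4.4.8) predicts BCF digits 2, 5, 2, 5, · · · ; iterating β exactly on a high-precision rational approximation (the Newton iteration for √15 is our own device) gives the same digits:

```python
from fractions import Fraction

def bcf_digits(x, n):
    """First n BCF digits via c = ceil(1/(1-x)) and beta(x) = frac(1/(1-x))."""
    out = []
    for _ in range(n):
        y = 1 / (1 - x)
        f = y.numerator // y.denominator
        out.append(f + 1)
        x = y - f
    return out

def bcf_from_rcf(rcf_pairs):
    """Digits predicted by (4.4.8): each pair (a_odd, a_even) contributes
    (a_odd - 1) digits 2 followed by the single digit a_even + 2."""
    out = []
    for a_odd, a_even in rcf_pairs:
        out.extend([2] * (a_odd - 1))
        out.append(a_even + 2)
    return out

r = Fraction(4)
for _ in range(6):                     # Newton iteration for sqrt(15), exact
    r = (r + 15 / r) / 2
omega = (r - 3) / 2                    # ~ (sqrt(15) - 3)/2 = [0; 2, 3, 2, 3, ...]

assert bcf_digits(omega, 12) == bcf_from_rcf([(2, 3)] * 6)   # [2, 5, 2, 5, ...]
```

Six Newton steps leave an error far smaller than the precision lost in twelve β-iterations, so the comparison is reliable.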


Appendix 1: Spaces, functions, and measures

A1.1

Let X be an arbitrary non-empty set. A non-empty collection X of subsets of X is said to be a σ-algebra (in X) if and only if it is closed under the formation of complements and countable unions. Clearly, ∅ and X both belong to X , and X is also closed under the formation of countable intersections. For any non-empty collection C of subsets of X the σ-algebra generated by C, denoted σ(C), is deﬁned as the smallest σ-algebra in X which contains C. Clearly, σ(C) is the intersection of all σ-algebras in X which contain C. A pair (X, X ) consisting of a non-empty set X and a σ-algebra X in X is called a measurable space. In the special case where X is a denumerable set the usual σ-algebra in X is P(X), the collection of all subsets of X. Clearly, P(X) is generated by the elements of X : P(X) = σ ({x} : x ∈ X). The product of two measurable spaces (X, X ) and (Y, Y) is the measurable space (X × Y, X ⊗ Y), where the product σ-algebra X ⊗ Y is deﬁned as σ(C) with C = (A × B : A ∈ X , B ∈ Y).

A1.2

Let (X, X ) and (Y, Y) be two measurable spaces. A map f : X → Y from X into Y is said to be (X , Y)-measurable or a Y-valued random variable (r.v.) on X if and only if the inverse image f^{−1}(A) = (x ∈ X : f(x) ∈ A) of every set A ∈ Y is in X . Setting f^{−1}(Y) = (f^{−1}(A) : A ∈ Y), the above condition can be compactly written as f^{−1}(Y) ⊂ X . [Note that f^{−1}(Y) is always a σ-algebra in X whatever f : X → Y!]

Let (X, X ) be a measurable space, let ((Y_i, Y_i))_{i∈I} be a family of measurable spaces, and for any i ∈ I let f_i be a Y_i-valued r.v. on X. Then the σ-algebra σ( ∪_{i∈I} f_i^{−1}(Y_i) ) is called the σ-algebra generated by the family (f_i)_{i∈I} and is denoted σ((f_i)_{i∈I}). Clearly, this is the smallest σ-algebra S ⊂ X having the property that f_i is (S, Y_i)-measurable for any i ∈ I.

A1.3

Let (X, X ) be a measurable space. A function µ : X → R_+ is said to be a (finite) measure on X if and only if it is completely additive, that is, for any sequence (A_i)_{i∈N+} of pairwise disjoint elements of X we have µ( ∪_{i∈N+} A_i ) = Σ_{i∈N+} µ(A_i). Complete additivity is equivalent to finite additivity [that is, for any finite collection A_1, . . . , A_n of pairwise disjoint elements of X we have µ( ∪_{i=1}^n A_i ) = Σ_{i=1}^n µ(A_i)] in conjunction with continuity at ∅ (that is, for any decreasing sequence A_1 ⊃ A_2 ⊃ . . . of elements of X with ∩_{i∈N+} A_i = ∅ we have lim_{n→∞} µ(A_n) = 0). Clearly, finite additivity implies µ(∅) = 0. In the special case where X is a denumerable set a measure µ on P(X) is defined by simply giving the values µ({x}) for the elements x ∈ X. A probability on X is a measure P on X satisfying P(X) = 1. An important example of a probability on X is that of the probability δ_x concentrated at x for any given x ∈ X, which is defined by δ_x(A) = I_A(x), A ∈ X . The collection of all measures (probabilities) on X will be denoted m(X ) (pr(X )). A triple (X, X , P) consisting of a measurable space (X, X ) and a probability P on X is called a probability space. [The traditional notation for a probability space is (Ω, K, P). The points ω ∈ Ω are interpreted as the possible outcomes (elementary events) of a random experiment, and the sets A ∈ K as the (random) events associated with it; these are the subsets of Ω arising as the truth sets of certain statements concerning the experiment.] We say that A ∈ X occurs P-almost surely, and write A P-a.s., if and only if P(A) = 1. Let (Y, Y) be a measurable space and let f be a Y-valued r.v. on X. The P-distribution of f is the probability P f^{−1} on Y defined by P f^{−1}(A) = P(f^{−1}(A)), A ∈ Y. Let (X, X ) and (Y, Y) be two measurable spaces. The product measure of µ ∈ m(X ) and ν ∈ m(Y) is the (unique) measure µ ⊗ ν ∈ m(X ⊗ Y) satisfying the equation µ ⊗ ν(A × B) = µ(A)ν(B) for any A ∈ X and B ∈ Y.

A1.4

Let X be a metric space with metric d. The usual σ-algebra in X, denoted B_X, is that of Borel subsets of X, that is, the σ-algebra generated by the collection of all open subsets of X. In the special case where X = R^n (n-dimensional Euclidean space) we write B^n for B_{R^n}, n ∈ N+, and B = B^1. Further, if X is a Borel subset M of R^n, then B_M = B^n ∩ M = (A ∩ M : A ∈ B^n), n ∈ N+.

A sequence (µ_n)_{n∈N+} of measures on B_X is said to converge weakly to a measure µ on B_X, and we write µ_n →w µ, if and only if

    lim_{n→∞} ∫_X h dµ_n = ∫_X h dµ

for any h ∈ C_r(X) = the set of all real-valued bounded continuous functions on (X, d). An equivalent definition is obtained by asking that

    lim_{n→∞} µ_n(A) = µ(A)        (A1.1)

for any A ∈ B_X for which µ(∂A) = 0, where ∂A is the boundary of A, defined as the closure of A minus the interior of A. In the special case where X = R, putting F_n(x) = µ_n((−∞, x]) and F(x) = µ((−∞, x]), x ∈ R, equation (A1.1) holds if and only if lim_{n→∞} µ_n(R) = µ(R) and lim_{n→∞} F_n(x) = F(x) for any point of continuity x of F.

The Prokhorov metric d_P on pr(B_X) is defined by

    d_P(P, Q) = inf( ε > 0 : P(A) ≤ Q(A^ε) + ε, A ⊂ X, A closed ),   P, Q ∈ pr(B_X),

where A^ε = (x : d(x, A) < ε) and d(x, A) = inf(d(x, y) : y ∈ A). If the metric space (X, d) is separable, then for P, P_n ∈ pr(B_X), n ∈ N+, the weak convergence of P_n to P is equivalent to lim_{n→∞} d_P(P_n, P) = 0.

Let (X, d) and (Y, d′) be two metric spaces. Consider a Y-valued r.v. f on X. The set D_f of all discontinuity points of f belongs to B_X since it can be written as ∪_ε ∩_δ A_{ε,δ}, where ε and δ vary over the positive rational numbers, and A_{ε,δ} is the (open) set of all points x ∈ X for which there exist x′, x″ ∈ X such that d(x, x′) < δ, d(x, x″) < δ and d′(f(x′), f(x″)) ≥ ε.

Proposition A1.1. If P_n, P ∈ pr(B_X), P_n →w P, and P(D_f) = 0, then P_n f^{−1} →w P f^{−1}.

In particular, the above result holds for a continuous f, for which clearly D_f = ∅. For a characterization via weak convergence of almost everywhere continuous functions f, that is, such that P(D_f) = 0, see Mazzone (1995/96).

A1.5

In this section (X, d) is the real line with the usual Euclidean distance.

The characteristic function (ch.f.) or Fourier transform of a measure µ ∈ m(B) is the complex-valued function µ̂ defined on R by

    µ̂(t) = ∫_R e^{itx} µ(dx),   t ∈ R.

If µ̂ = ν̂ for two measures µ, ν ∈ m(B), then µ = ν.

Proposition A1.2 (Lévy–Cramér continuity theorem). Let P, P_n ∈ pr(B), n ∈ N+.
(i) P_n →w P ∈ pr(B) implies lim_{n→∞} P̂_n = P̂ pointwise, and the convergence of ch.f.s is uniform on compact subsets of R.
(ii) If lim_{n→∞} P̂_n = h pointwise and h is continuous at 0, then h is the ch.f. of a probability P ∈ pr(B) and P_n →w P.

Let µ, ν ∈ m(B). The convolution µ ∗ ν is the measure on B defined by

    µ ∗ ν(A) = ∫_R µ(A − x) ν(dx),   A ∈ B,

where A − x := (y − x : y ∈ A), x ∈ R. The convolution operator ∗ is associative and commutative. We have (µ ∗ ν)̂ = µ̂ ν̂, µ, ν ∈ m(B).

For any n ∈ N+ let f_i, 1 ≤ i ≤ n, be real-valued r.v.s on a probability space (Ω, K, P). The f_i are said to be independent if and only if the σ-algebras f_i^{−1}(B), 1 ≤ i ≤ n, are P-independent, that is,

    P( ∩_{i=1}^n A_i ) = ∏_{i=1}^n P(A_i)

for any A_i ∈ f_i^{−1}(B), 1 ≤ i ≤ n. For independent real-valued r.v.s f_i, 1 ≤ i ≤ n, the ch.f. of the P-distribution P( Σ_{i=1}^n f_i )^{−1} of the sum Σ_{i=1}^n f_i is equal to the product of the ch.f.s of the P-distributions P f_i^{−1} of the summands, 1 ≤ i ≤ n. Also, P( Σ_{i=1}^n f_i )^{−1} is the convolution of the P f_i^{−1}, 1 ≤ i ≤ n.

Let µ ∈ m(B). For any n ∈ N+ the nth convolution µ^{∗n} of µ with itself is defined recursively by µ^{∗1} = µ and µ^{∗n} = µ^{∗(n−1)} ∗ µ for n ≥ 2. Define also µ^{∗0} as δ_0.
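The product rule for ch.f.s of independent summands is easy to verify numerically for discrete distributions. A minimal sketch (our own example with two fair dice) compares the ch.f. of the convolution with the product of the ch.f.s:

```python
import cmath

def chf(pmf, t):
    """Characteristic function phi(t) = sum_x e^{itx} p(x) of a discrete law."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def convolve(p, q):
    """PMF of the sum of two independent discrete r.v.s."""
    r = {}
    for x, px in p.items():
        for y, qy in q.items():
            r[x + y] = r.get(x + y, 0.0) + px * qy
    return r

die = {k: 1 / 6 for k in range(1, 7)}
s = convolve(die, die)                           # law of the sum of two dice
for t in (0.3, 1.0, 2.7):
    assert abs(chf(s, t) - chf(die, t) ** 2) < 1e-12
```

Up to floating-point round-off, the two quantities agree exactly, as the identity above demands.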

Let µ ∈ m(B). The Poisson probability Pois µ associated with µ is defined as

    Pois µ = e^{−µ(R)} Σ_{n∈N} µ^{∗n}/n! = e^{µ − µ(R)}.

Its ch.f. is exp(µ̂ − µ̂(0)). The classical Poisson distribution P(θ) with parameter θ > 0 is Pois(θδ_1).

A measure on B is said to be a Lévy measure if and only if it integrates the function min(1, x²) on the whole of R. Given a Lévy measure µ, the τ-centered Poisson probability c_τ Pois µ, τ > 0, is defined as the probability with characteristic function

    exp( ∫_R ( e^{itx} − 1 − itx I_{[−τ,τ]}(x) ) µ(dx) ).

We have c_τ Pois µ = (Pois µ) ∗ δ_{b(τ)}, where

    b(τ) = − ∫_{−τ}^{τ} x µ(dx).

A probability P ∈ pr(B) is said to be infinitely divisible if and only if for any n ∈ N+ there exists P_n ∈ pr(B) such that P_n^{∗n} = P.

Proposition A1.3 (Lévy–Khinchin representation). P ∈ pr(B) is infinitely divisible if and only if there exist σ ≥ 0 and a Lévy measure ν, and for any τ > 0 there exists a_τ ∈ R, such that

    P̂(t) = exp( i t a_τ − σ²t²/2 + ∫_R ( e^{itx} − 1 − itx I_{[−τ,τ]}(x) ) ν(dx) ),   t ∈ R.

It follows from Proposition A1.3 that an infinitely divisible probability is the convolution of a normal distribution N(a_τ, σ²) and a τ-centered Poisson probability c_τ Pois ν. Either of the two terms can be degenerate, that is, the cases σ = 0 and ν ≡ 0 are allowed.

An important special class of infinitely divisible probabilities on B is that of stable probabilities. A probability P ∈ pr(B) is said to be stable if and only if for any n ∈ N+ there exist A_n ∈ R_{++} and B_n ∈ R such that

    P^{∗n} = P f_n^{−1},        (A1.2)

where f_n is the affine function on R defined by f_n(x) = A_n x + B_n, x ∈ R. If B_n = 0 for any n ∈ N+, then P is said to be strictly stable. It appears that the only constants A_n allowed in (A1.2) are A_n = n^{1/α}, n ∈ N+, with α ∈ (0, 2], and then α is called the order of P. A probability P ∈ pr(B) is stable of order α if and only if its ch.f. P̂ has the form

    P̂(t) = exp[ i a t − c|t|^α (1 − i b sgn t · σ(t, α)) ],   t ∈ R,

where a, b, c ∈ R with |b| ≤ 1 and c ≥ 0, and

    σ(t, α) = tg(πα/2) if α ≠ 1,   σ(t, 1) = (2/π) log|t|.

In particular, a stable probability has order 2 if and only if it is normal.

An important example of a stable probability is that of the 1-centered Poisson probability c_1 Pois µ_{k_1,k_2,α}, 0 < α < 2, k_1, k_2 ≥ 0, k_1 + k_2 > 0, whose Lévy measure has density

    µ_{k_1,k_2,α}(dx)/dx = ( k_2 I_{(−∞,0)}(x) + k_1 I_{(0,∞)}(x) ) |x|^{−1−α},   x ≠ 0.

The ch.f. h_{k_1,k_2,α} of c_1 Pois µ_{k_1,k_2,α} is

    h_{k_1,k_2,α}(t) = exp( k_2 ∫_{−∞}^0 ( e^{itx} − 1 − itx I_{[−1,0)}(x) ) |x|^{−1−α} dx
                          + k_1 ∫_0^∞ ( e^{itx} − 1 − itx I_{(0,1]}(x) ) x^{−1−α} dx ),   t ∈ R,

which can be expressed in terms of elementary functions as follows. We have

    h_{k_1,k_2,1}(t) = exp( i(k_2 − k_1)(C − 1)t − (π(k_1 + k_2)/2) ( 1 + i sgn t · (2/π) ((k_1 − k_2)/(k_1 + k_2)) log|t| ) |t| ),

where C = 0.57721... is Euler's constant, while for α ≠ 1, 0 < α < 2,

    h_{k_1,k_2,α}(t) = exp( i(k_2 − k_1)t/(1 − α) + (k_1 + k_2) ( Γ(2 − α)/(α(α − 1)) ) cos(πα/2) ( 1 + i sgn t · ((k_1 − k_2)/(k_1 + k_2)) tg(πα/2) ) |t|^α ),

where Γ is the classical gamma function. Actually, any stable probability of order α ≠ 2 has the form δ_a ∗ c_1 Pois µ_{k_1,k_2,α} with a ∈ R, k_1, k_2 ≥ 0, k_1 + k_2 > 0.

A1.6

Let C = C_r(I) be the metric space of real-valued continuous functions on I = [0, 1] with the uniform metric

    d(x, y) = sup_{t∈I} |x(t) − y(t)|,   x, y ∈ C.

The space C is complete and separable. The σ-algebra B_C of Borel sets in (C, d) coincides with the σ-algebra B^I ∩ C. Here B^I denotes the σ-algebra in R^I generated by the collection of its subsets of the form Π_{t∈I} A_t, where A_t ∈ B, t ∈ I, and A_t = R for all but finitely many t ∈ I. Of paramount importance is the probability W on B_C known as the Wiener measure, for which W(x : x(0) = 0) = 1 and

    W(x : x(t_i) − x(t_{i−1}) ≤ a_i, 1 ≤ i ≤ k) = ∏_{i=1}^k ( 2π(t_i − t_{i−1}) )^{−1/2} ∫_{−∞}^{a_i} e^{−u²/2(t_i − t_{i−1})} du

for any k ∈ N+, 0 ≤ t_0 < t_1 < · · · < t_k ≤ 1, a_i ∈ R, 1 ≤ i ≤ k.

Let D = D(I) (⊃ C_r(I)) be the metric space of real-valued functions on I which are right continuous and have left limits, with the Skorohod metric d_0 to be defined below. Clearly, we can also consider the uniform metric d in D, which is defined similarly to that in C, that is, d(x, y) = sup_{t∈I} |x(t) − y(t)|, x, y ∈ D. Let L denote the set of all strictly increasing continuous functions ℓ : I → I with ℓ(0) = 0, ℓ(1) = 1, and put

    s_0(ℓ) = sup_{s≠t} | log[ (ℓ(t) − ℓ(s)) / (t − s) ] |

for any ℓ ∈ L. The distance d_0(x, y) (≤ d(x, y)) for x, y ∈ D is defined as the infimum of all ε > 0 for which there exists ℓ ∈ L such that s_0(ℓ) ≤ ε and sup_{t∈I} |x(t) − y(ℓ(t))| ≤ ε. The metrics d_0 and d generate the same topology in D. Nevertheless, while D is complete and separable under d_0, separability does not hold under d. The σ-algebra B_D of Borel sets in (D, d_0) coincides with the σ-algebra B^I ∩ D.

Wiener measure W can be immediately extended from B_C to B_D as the topologies induced in D by the metrics d_0 and d are identical. Hence A ∩ C ∈ B_C for any A ∈ B_D. This allows us to define W(A) = W(A ∩ C), A ∈ B_D. Clearly, C is the support of W in D, that is, the smallest closed subset of D whose W-measure equals 1.

General references: Araujo and Giné (1980), Billingsley (1968), Halmos (1950), Hoffmann-Jørgensen (1994), Samorodnitsky and Taqqu (1994).

Appendix 2: Regularly varying functions

A2.1

A measurable function R : [r, ∞) → R_+, where r ∈ R_+, is said to be regularly varying (at ∞) of index α ∈ R if and only if there exists x_0 ≥ r such that R([x_0, ∞)) ⊂ R_{++} and

    lim_{x→∞} R(tx)/R(x) = t^α

for any t ∈ R_{++}. A regularly varying function of index 0 is called a slowly varying function. It is obvious that R is regularly varying of index α if and only if it can be written in the form

    R(x) = x^α L(x),   x ∈ (r, ∞),

where L is a slowly varying function. The general form of a slowly varying function is described by the celebrated Karamata theorem below [cf. Seneta (1976, Theorem 1.2 and its Corollary)].

Theorem A2.1 (Representation theorem). Let r ∈ R_+. A function L : [r, ∞) → R_+ is slowly varying if and only if

    L(x) = c(x) exp( ∫_{x_0}^x (ε(t)/t) dt ),   x ≥ x_0,

for some x_0 ≥ r, where the function c : [x_0, ∞) → R_+ is bounded and measurable with lim_{x→∞} c(x) = c > 0, while the function ε : [x_0, ∞) → R is continuous and lim_{x→∞} ε(x) = 0.

Corollary A2.2. If L is a slowly varying function, then

(i) lim_{x→∞} L(x + y)/L(x) = 1 for any y ∈ R_{++};
(ii) lim_{x→∞} x^ε L(x) = ∞ and lim_{x→∞} x^{−ε} L(x) = 0 for any ε > 0;
(iii) L is bounded on finite intervals in [x_0, ∞) if x_0 ≥ r is large enough.

There exist necessary or sufficient integral conditions for slow variation which are easy to check and use for theoretical and practical purposes. Here are two such results. See, e.g., Seneta (1976, pp. 53-56 and 86-88).

Theorem A2.3. Let r ∈ R_+. If L : [r, ∞) → R_+ is a slowly varying function and x_0 ≥ r is so large that L is bounded on finite intervals in [x_0, ∞), then for any α ≥ −1 we have

    lim_{x→∞} x^{α+1} L(x) / ∫_{x_0}^x y^α L(y) dy = α + 1        (A2.1)

while the function x ↦ ∫_{x_0}^x y^α L(y) dy, x > x_0, is regularly varying of index α + 1. Conversely, if L : [r, ∞) → R_+ is measurable and bounded on finite intervals in [x_0, ∞) for some x_0 ≥ r and (A2.1) holds for some α > −1, then L is a slowly varying function while the function x ↦ ∫_{x_0}^x y^α L(y) dy, x > x_0, is regularly varying of index α + 1. The last assertion also holds for α = −1.

Theorem A2.4. Let r ∈ R_+. If L : [r, ∞) → R_+ is a slowly varying function, then

    ∫_x^∞ y^α L(y) dy < ∞ (for x large enough)        (A2.2)

for any α < −1. If ∫_x^∞ y^{−1} L(y) dy < ∞, then for any α ≤ −1 we have

    lim_{x→∞} x^{α+1} L(x) / ∫_x^∞ y^α L(y) dy = −(α + 1)        (A2.3)

while the function x ↦ ∫_x^∞ y^α L(y) dy, for x large enough, is regularly varying of index α + 1. Conversely, if L : [r, ∞) → R_+ is measurable, satisfies (A2.2), and (A2.3) holds for some α < −1, then L is a slowly varying function while the function x ↦ ∫_x^∞ y^α L(y) dy, for x large enough, is regularly varying of index α + 1.
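Relation (A2.1) is easy to observe numerically. Here is a minimal sketch of our own, with L = log and α = 1, using the closed-form antiderivative of y log y:

```python
import math

def integral(x, x0=2.0):
    """Closed form of the Karamata integral int_{x0}^x y*log(y) dy."""
    F = lambda y: y * y * (2.0 * math.log(y) - 1.0) / 4.0   # antiderivative
    return F(x) - F(x0)

alpha = 1
for x in (1e3, 1e6, 1e9):
    ratio = x ** (alpha + 1) * math.log(x) / integral(x)
    print(x, ratio)          # decreases towards alpha + 1 = 2
```

Up to the x_0 boundary term, the ratio equals 2 + 2/(2 log x − 1), so the convergence is logarithmically slow, as is typical for slowly varying functions.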

A2.2

An important class of pairs of regularly varying functions is defined as follows. Let ξ be a non-degenerate real-valued random variable on a probability space (Ω, K, P), and define real-valued functions F and F̄ on [0, ∞) by

    F(x) = E( ξ² I_{(|ξ|≤x)} ),   F̄(x) = P(|ξ| > x),   x ∈ R_+.

Clearly, F is non-decreasing and F̄ non-increasing. It is easy to check that

    F(x) = − ∫_0^x u² dF̄(u),   F̄(x) = ∫_x^∞ u^{−2} dF(u),   x ∈ R_+,

whence by integrating by parts we obtain

    F(x) + x² F̄(x) = 2 ∫_0^x u F̄(u) du,        (A2.4)

    x² F̄(x) + F(x) = 2x² ∫_x^∞ u^{−3} F(u) du,   x ∈ R_+.        (A2.5)

Theorem A2.5. If either F or F̄ varies regularly, then the limit

    lim_{x→∞} x² F̄(x) / F(x) = c        (A2.6)

exists and 0 ≤ c ≤ ∞. Conversely, if (A2.6) holds with 0 < c < ∞, then

    F(x) ∼ x^{2 − 2/(1+c)} L(x),   F̄(x) ∼ c x^{−2/(1+c)} L(x)

as x → ∞, where L is a slowly varying function. Finally, (A2.6) holds with c = 0 if and only if F is slowly varying, while (A2.6) holds with c = ∞ if and only if F̄ is slowly varying.

The proof follows immediately from equations (A2.4) and (A2.5) by using Theorems A2.3 and A2.4. 2
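A concrete instance of Theorem A2.5 (our own illustration): for a Pareto-type variable with P(|ξ| > x) = x^{−α}, x ≥ 1, and 0 < α < 2, both F and F̄ are exactly computable, the ratio x²F̄(x)/F(x) converges to c = (2 − α)/α, and the exponent 2/(1 + c) appearing in the theorem recovers α:

```python
alpha = 1.2                                   # Pareto tail index, 0 < alpha < 2
Fbar = lambda x: x ** (-alpha)                # P(|xi| > x), x >= 1
Ftr = lambda x: alpha / (2 - alpha) * (x ** (2 - alpha) - 1)  # E(xi^2 1_{|xi|<=x})

c = (2 - alpha) / alpha                       # limit in (A2.6)
for x in (1e2, 1e4, 1e8):
    print(x, x * x * Fbar(x) / Ftr(x))        # tends to c = 2/3

assert abs(2 / (1 + c) - alpha) < 1e-12       # exponent relation of Theorem A2.5
```

The ratio differs from c by c/(x^{2−α} − 1), so the convergence here is polynomially fast, unlike in the slowly varying example above.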


A2.3

Let f : [1, ∞) → R_{++} be a measurable function which is bounded on finite intervals and such that lim_{x→∞} f(x) = ∞. For any y ∈ [f(1), ∞) define

    f_0(y) = inf{x ≥ 1 : f(x) ≥ y},
    f_1(y) = inf{x ≥ 1 : f(x) > y},
    f_2(y) = sup{x ≥ 1 : f(x) ≤ y}.

Clearly, the functions f_i : [f(1), ∞) → [1, ∞), i = 0, 1, 2, are well defined, any of them is non-decreasing, 1 ≤ f_0 ≤ f_1 ≤ f_2, and lim_{y→∞} f_i(y) = ∞, i = 0, 1, 2. We say that f ∈ F if and only if

    lim_{y→∞} f_1(y)/f_2(y) = 1.

Lemma A2.6 [Samur (1989, Lemma 2.11)].
(i) If f : [1, ∞) → R_{++} is non-decreasing and lim_{x→∞} f(x) = ∞, then f ∈ F.
(ii) If f : [1, ∞) → R_{++} is bounded on finite intervals and regularly varying of index α > 0, then f ∈ F. Moreover, lim_{y→∞} f_0(y)/f_2(y) = 1, and f_i is regularly varying of index 1/α, i = 0, 1, 2.
(iii) If f ∈ F and f_1 is regularly varying of index 1/α for some α > 0, then f is regularly varying of index α.

Corollary A2.7. Let f ∈ F, and define a real-valued function F on R_+ by

    F(x) = (log 2)^{−1} Σ_{k∈N+ : |f(k)|≤x} f²(k) k^{−2},   x ∈ R_+.

(i) F is slowly varying if and only if

    lim_{x→∞} f²(x) / ( x Σ_{k∈N+ : k≤x} f²(k) k^{−2} ) = 0.        (A2.7)

(ii) If f ∈ F is regularly varying of index 1/2, then (A2.7) holds, that is, F is slowly varying.

Appendix 3: Limit theorems for mixing random variables

A3.1

Let (Ω, K, P) be a probability space. For any two σ-algebras K_1 and K_2 included in the σ-algebra K define the dependence coefficients

    α(K_1, K_2) = sup( |P(A_1 ∩ A_2) − P(A_1)P(A_2)| : A_i ∈ K_i, i = 1, 2 ),
    ϕ(K_1, K_2) = sup( |P(A_2|A_1) − P(A_2)| : A_i ∈ K_i, i = 1, 2, P(A_1) > 0 ),
    ψ(K_1, K_2) = sup( |P(A_2|A_1)/P(A_2) − 1| : A_i ∈ K_i, P(A_i) > 0, i = 1, 2 ).

Clearly, α(K_1, K_2) ≤ ϕ(K_1, K_2) ≤ ψ(K_1, K_2) and

    0 ≤ α(K_1, K_2), ϕ(K_1, K_2) ≤ 1,   0 ≤ ψ(K_1, K_2) ≤ ∞.
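For finite σ-algebras the three coefficients can be computed by brute force over all events. In the sketch below the joint law of (U, V) is an arbitrary illustrative choice of ours; the computation confirms α(K_1, K_2) ≤ ϕ(K_1, K_2) ≤ ψ(K_1, K_2):

```python
from itertools import product

# an arbitrary joint law of (U, V) on {0,1} x {0,1}, chosen for illustration
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def prob(event):                     # event: a set of (u, v) pairs
    return sum(joint[p] for p in event)

def events(coord):
    """Nonempty events of the sigma-algebra generated by one coordinate."""
    return [{p for p in joint if p[coord] in keep}
            for keep in ([0], [1], [0, 1])]

alpha = phi = psi = 0.0
for A1, A2 in product(events(0), events(1)):
    p1, p2, p12 = prob(A1), prob(A2), prob(A1 & A2)
    alpha = max(alpha, abs(p12 - p1 * p2))           # alpha(K1, K2)
    phi = max(phi, abs(p12 / p1 - p2))               # phi(K1, K2)
    psi = max(psi, abs(p12 / (p1 * p2) - 1))         # psi(K1, K2)

print(alpha, phi, psi)               # roughly 0.1, 0.2, 0.5 for this law
```

The chain of inequalities holds because |P(A_1 ∩ A_2) − P(A_1)P(A_2)| = P(A_1)|P(A_2|A_1) − P(A_2)| and |P(A_2|A_1) − P(A_2)| = P(A_2)|P(A_2|A_1)/P(A_2) − 1|.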

Let (X, X ) be a measurable space and consider an array

    X = {X_{nj}, 1 ≤ j ≤ j_n, j_n ∈ N+, n ∈ N+}        (A3.1)

of X-valued r.v.s defined on (Ω, K, P). [An infinite sequence (X_n)_{n∈N+} of X-valued r.v.s can be seen as the (triangular) array {X_{nj} ≡ X_j, 1 ≤ j ≤ n, n ∈ N+}.] For such an array define the dependence coefficients

    δ(k) = sup_{n∈N+^{(k)}} max_{1≤h≤j_n−k} δ( σ(X_{nj}, 1 ≤ j ≤ h), σ(X_{nj}, h + k ≤ j ≤ j_n) ),

where N+^{(k)} = {n ∈ N+ : j_n > k}, k ∈ N+, and δ stands for either α, ϕ or ψ. Clearly, in the case of an infinite sequence (X_n)_{n∈N+} we can write

    δ(k) = sup_{h,ℓ∈N+} δ( σ(X_j, 1 ≤ j ≤ h), σ(X_j, h + k ≤ j ≤ h + k + ℓ) ).

It is obvious that the sequence (δ(k))_{k∈N+} is non-increasing. An array (resp. sequence) of r.v.s is said to be δ-mixing if and only if lim_{k→∞} δ(k) = 0. It can be shown [Bradley (1986, p. 184)] that ϕ(1) < 1 whenever ψ(1) < ∞.

A finite collection (X_i)_{1≤i≤n}, n ≥ 2, of X-valued r.v.s is said to be strictly stationary if and only if the probability distribution of (X_{k+1}, · · · , X_{k+h}), 0 ≤ k ≤ n − h, does not depend on k whatever 1 ≤ h < n. A sequence (X_n)_{n∈N+} of X-valued r.v.s is said to be strictly stationary if and only if the probability distribution of (X_{k+1}, · · · , X_{k+h}) does not depend on k ∈ N whatever h ∈ N+. An array of X-valued r.v.s is said to be strictly stationary if and only if any row of it is strictly stationary.

Proposition A3.1. Let (A3.1) be a ψ-mixing array of X-valued r.v.s. Let ξ and η be real-valued random variables which are σ(X_{nj}, 1 ≤ j ≤ h)- and σ(X_{nj}, h + k ≤ j ≤ j_n)-measurable, respectively, for some h, k, n ∈ N+. Assume that E|ξ|, E|η| < ∞ and ψ(k) < ∞. Then Cov(ξ, η) exists and

    |Cov(ξ, η)| ≤ ψ(k) E|ξ| E|η|.

In particular, if Eξ² < ∞ and Eη² < ∞ then |Cov(ξ, η)| ≤ ψ(k) Var^{1/2} ξ Var^{1/2} η.

Corollary A3.2. Let (A3.1) be a ψ-mixing strictly stationary array of real-valued r.v.s with ψ(1) < ∞. Assume that E X_{n1}² < ∞ for some n ∈ N+. Then

    Var( Σ_{j=1}^k X_{nj} ) < k ( 1 + 2 Σ_{j=1}^k ψ(j) ) Var X_{n1},   1 ≤ k ≤ j_n.

Corollary A3.3. Let (X_n)_{n∈N+} be a ψ-mixing strictly stationary sequence of X-valued r.v.s. Assume that Σ_{n∈N+} ψ(n) < ∞. Let f be a real-valued r.v. on (X, X ), and assume that Ef²(X_1) < ∞. Then the series

    σ² = Ef²(X_1) − E²f(X_1) + 2 Σ_{n∈N+} E( f(X_1) − Ef(X_1) )( f(X_{n+1}) − Ef(X_1) )

is absolutely convergent and σ² ≥ 0. We have

    Var( Σ_{j=1}^n f(X_j) ) = n(σ² + o(1))

as n → ∞.

The above results are already folklore. See, e.g., Doukhan (1994, Ch. 1).

Proposition A3.4 [Gordin (1971, Remark 3)]. In addition to the hypotheses of Corollary A3.3 assume that ψ(1) < 1. Then σ = 0 if and only if f = const.

A3.2

For an array (A3.1) of real-valued r.v.s on (Ω, K, P) set

    S_{nk} = Σ_{j=1}^k X_{nj},   1 ≤ k ≤ j_n,   S_{nj_n} = S_n,   n ∈ N+.

Then such an array is said to be strongly infinitesimal (s.i. for short) if and only if it is strictly stationary and for any sequence (k_n)_{n∈N+} of natural integers such that k_n ≤ j_n, n ∈ N+, and lim_{n→∞} k_n/j_n = 0, the sum S_{nk_n} converges in P-probability to 0 as n → ∞.

All results given below were proved by J. D. Samur, as indicated at appropriate places, in the more general case of Banach valued random variables.

Proposition A3.5. If (A3.1) is a ϕ-mixing s.i. array of real-valued r.v.s, then

    lim_{n→∞} max_{1≤k≤k_n} d_P( P S_{nk}^{−1}, δ_0 ) = 0

for any sequence (k_n)_{n∈N+} of natural integers such that k_n ≤ j_n, n ∈ N+, and lim_{n→∞} k_n/j_n = 0. This is a consequence of a more general result [Samur (1984, Theorem 3.3)].

Proposition A3.6 [Samur (1987, § 3.4.3.2)]. Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that P S_n^{−1} converges weakly to some probability measure on B. Then the array (A3.1) is s.i. if and only if X_{n1} converges in P-probability to 0 as n → ∞, and for any ε > 0 there exists 0 < a = a(ε) < 1 such that

    lim sup_{n→∞} max_{1≤k≤a j_n} P(|S_{nk}| > ε) < 1.

A3.3

Let ν be an inﬁnitely divisible probability on B. We denote by Qν the distribution (on BD ) of a stochastic process ξν = (ξν (t))t∈I with stationary independent increments, ξν (0) = 0 a.s., trajectories in D, and ξν (1) having probability distribution ν. When ν is Gaussian the process ξν can be taken with trajectories in C. In this case the distribution of ξν is concentrated on BC , and we shall denote it by Qν . Given an array (A3.1) of real-valued r.v.s., for any n ∈ N+ deﬁne the D D C C stochastic processes ξn = (ξn (t))t∈I and ξn = (ξn (t))t∈I by

D ξn (t) = Sn C ξn (t) = Sn jn t jn t

, + (jn t − jn t ) (Sn(

jn t +1)

− Sn

jn t

),

t ∈ I,

with the convention Sn0 = 0, n ∈ N+ . Clearly, for any n ∈ N+ the D C trajectories of ξn and ξn are in D and C, respectively. Theorem A3.7 [Samur (1987, Theorem 3.2 and Corollary 3.3)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that ψ(1) < ∞. Let ν be a probability measure on B. Then the following statements are equivalent: −1 w I. P Sn → ν and the array (A3.1) is s.i. w D −1 → Q in B . II. ν is inﬁnitely divisible and P ξn ν D Remark. If the assumption ψ(1) < ∞ does not hold, then Theorem A3.7 still holds with statement I replaced by

−1 I. P Sn → ν, the array (A3.1) is s.i., and w

n∈N+

**sup jn P (|Xn1 | > ε) < ∞, lim jn P (|Xn1 | > ε, |Xnj | > ε) = 0
**

n→∞

for any ε > 0 and any integer j ≥ 2.

2

Limit theorems

329

Theorem A3.8 [Samur (1987, Corollary 3.5 and § 3.6.4)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s. Let ν be a probability measure on B. Then the following statements are equivalent:

I. P S_n^{-1} →^w ν, the array (A3.1) is s.i., and lim_{n→∞} j_n P(|X_{n1}| > ε) = 0 for any ε > 0.

II. ν is Gaussian and P (ξ_n^D)^{-1} →^w Q_ν in B_D.

III. ν is Gaussian and P (ξ_n^C)^{-1} →^w Q_ν^C in B_C.

IV. ν is Gaussian, and on a common probability space (Ω′, K′, P′) there exist an array X′ = {X′_{nj}, 1 ≤ j ≤ j_n, j_n ∈ N_+, n ∈ N_+} of real-valued r.v.s and a stochastic process ζ = (ζ(t))_{t∈I} with trajectories in C which satisfy

    P (X_{n1}, ···, X_{nj_n})^{-1} = P′ (X′_{n1}, ···, X′_{nj_n})^{-1},  n ∈ N_+,    P′ ζ^{-1} = Q_ν^C,

and

    max_{1≤k≤j_n} | Σ_{j=1}^{k} X′_{nj} − ζ(k/j_n) | → 0  P′-a.s. as n → ∞.

Remark. If ϕ(1) < 1 and ν is Gaussian, then statement I above can be replaced by

I. P S_n^{-1} →^w ν, and the array (A3.1) is s.i. □

Theorem A3.9 [Samur (1987, § 3.4.3.1)] Let (X_n)_{n∈N_+} be a ϕ-mixing strictly stationary sequence of real-valued r.v.s. Let (B_n)_{n∈N_+} be a sequence of positive numbers such that lim_{n→∞} B_n = ∞, and let (A_n)_{n∈N_+} be a sequence of real numbers. Assume that

    P ( (1/B_n) Σ_{j=1}^{n} (X_j − A_n) )^{-1} →^w ν,


Appendix 3

where ν is a non-degenerate probability measure on B. Then ν is stable. Let α ∈ (0, 2] be the order of ν and write

    X_{nj} = (1/B_n)(X_j − A_n),  1 ≤ j ≤ n, n ∈ N_+.

The array X = {X_{nj}, 1 ≤ j ≤ n, n ∈ N_+} is s.i. if and only if: (i) B_n = n^{1/α} L(n), n ∈ N_+, for some slowly varying function L : R_+ → R_{++} integrable over finite intervals, and (ii) for any sequence (r_n)_{n∈N_+} of natural integers such that r_n ≤ n and lim_{n→∞} r_n/n = 0 we have

    lim_{n→∞} r_n (A_{r_n} − A_n) / B_n = 0.

Theorem A3.10 [Samur (1984, Theorem 5.6)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that ψ(1) < ∞. Assume there exist positive measures µ_n on B, n ∈ N_+, such that µ_n(R) ≤ 1 and µ_n([−t, t]) = 0, n ∈ N_+, for some t ∈ R_{++}. If P X_{n1}^{-1} = (1 − µ_n(R)) δ_0 + µ_n and j_n µ_n converges weakly to a finite measure µ on B, then P S_n^{-1} →^w Pois µ.

Theorem A3.11 [Samur (1984, Theorems 4.1 and 4.2)] Let (A3.1) be a ϕ-mixing strictly stationary s.i. array of real-valued r.v.s such that ϕ(1) < 1. Assume that P S_n^{-1} converges weakly to a probability measure ν on B. Then ν is Gaussian if and only if

    lim_{n→∞} j_n P(|X_{n1}| > ε) = 0

for any ε > 0. If ν = N(m, σ²) then for any ε > 0 we have

(i) lim_{n→∞} E ( Σ_{j=1}^{j_n} (X_{nj} I_{(|X_{nj}|≤ε)} − E X_{nj} I_{(|X_{nj}|≤ε)}) )² = σ², and

(ii) lim_{n→∞} E Σ_{j=1}^{j_n} X_{nj} I_{(|X_{nj}|≤ε)} = m.

For any real-valued r.v. η put

    m²(η) = E²η / Eη²  if 0 < Eη² < ∞,    m²(η) = 0  if Eη² = ∞.

It can be proved that if Eη² = ∞ then

    lim_{x→∞} E²(η I_{(|η|≤x)}) / E(η² I_{(|η|≤x)}) = 0.        (A3.2)

See, e.g., Araujo and Giné (1980, p. 80).
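The limit (A3.2) can be seen concretely by taking for η a Pareto-type r.v. with infinite second moment, for which the truncated moments have closed forms. This is only an illustrative sketch (the density and the helper name `ratio` are our own choices, not part of the text):

```python
# Illustration of (A3.2): eta with density 1.5 * t**(-2.5) on [1, inf),
# so E(eta^2) = infinity, while
#   E(eta  I_{|eta|<=x}) = 3 * (1 - x**-0.5),
#   E(eta^2 I_{|eta|<=x}) = 3 * (x**0.5 - 1).
def ratio(x):
    num = (3 * (1 - x ** -0.5)) ** 2      # E^2(eta I_{|eta|<=x})
    den = 3 * (x ** 0.5 - 1)              # E(eta^2 I_{|eta|<=x})
    return num / den

for x in (1e2, 1e4, 1e6):
    print(x, ratio(x))                    # decreases towards 0, roughly like 3/sqrt(x)
```

The ratio decays like 3/√x here, in agreement with (A3.2).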

Theorem A3.12 [Samur (1985, Corollary 3.4)] Let (X_n)_{n∈N_+} be a ϕ-mixing strictly stationary sequence of real-valued r.v.s for which

    Σ_{n∈N_+} ϕ^{1/2}(n) < ∞.

Assume that

    0 < EX_1² ≤ ∞,    lim_{x→∞} x² P(|X_1| > x) / E(X_1² I_{(|X_1|≤x)}) = 0,

and the limits

    ϕ_n^{(0)} := lim_{x→∞} E(X_1 X_n I_{(|X_1|≤x, |X_n|≤x)}) / E(X_1² I_{(|X_1|≤x)}),  n ∈ N_+,

exist and are all finite. Put S_n = Σ_{i=1}^{n} X_i, n ∈ N_+, S_0 = 0. Then the following assertions hold:

(i) E|X_1| < ∞.

(ii) The series

    σ²_{(0)} = ϕ_1^{(0)} − m²(X_1) + 2 Σ_{n≥2} (ϕ_n^{(0)} − m²(X_1))

converges absolutely and its sum is non-negative.

(iii) If σ_{(0)} ≠ 0 then for any sequence (B_n)_{n∈N_+} of positive numbers with lim_{n→∞} B_n = ∞ satisfying

    lim_{n→∞} n B_n^{−2} E(X_1² I_{(|X_1|≤B_n)}) = 1

we have P ξ̄_n^{−1} →^w W_D in B_D, where

    ξ̄_n(t) = (S_{⌊nt⌋} − ⌊nt⌋ EX_1) / (σ_{(0)} B_n),  t ∈ I, n ∈ N_+.

When a² = EX_1² < ∞ we can take B_n = |a| n^{1/2}, n ∈ N_+.


Notes and Comments

1.1

As we have noted, the basic reference for classical non-metric results on different types of continued fraction expansions is Perron (1954, 1957).

There exist several metrical results about Euclid's algorithm. Let b, n ∈ N_+ with 1 ≤ b < n. Then b/n = [a_1, ···, a_{τ(b,n)}] with a_{τ(b,n)} ≥ 2, and τ(b, n) ∈ N_+ is the number of division steps occurring when b and n are input to the algorithm. Since Euclid's algorithm applied to b and n behaves essentially the same as when applied to b/g.c.d.(b, n) and n/g.c.d.(b, n), it is convenient to consider the average number τ_n of division steps when b is relatively prime to n and chosen at random, that is, probability 1/ϕ(n) is given to any integer in the range [1, n] which is prime to n. Here ϕ is Euler's ϕ-function defined by

    ϕ(n) = n ∏_{p|n} (1 − 1/p),  n ≥ 2,

and ϕ(1) = 1, where the product is taken over all prime numbers p which divide n. Clearly,

    τ_n = (1/ϕ(n)) Σ_{1≤k≤n, g.c.d.(k,n)=1} τ(k, n).

Porter (1975) and Knuth (1976) showed that

    τ_n = (12 log 2/π²) log n + c + O(n^{−1/6+ε})

as n → ∞ for any ε > 0, with

    c = (6 log 2/π²) (3 log 2 + 4C − 24π^{−2} ζ′(2) − 2) − 1/2 = 1.467078... .


The leading coefficient (12 log 2)/π² = 0.84276... was independently derived by Dixon (1970, 1971) and Heilbronn (1969). A very interesting discussion of this topic can be found in Knuth (1981, Section 4.5.3). See also Lochs (1961), Szüsz (1980), and Tonkov (1974). For recent generalizations of Dixon's and Heilbronn's results, see Hensley (1994).

The largest quotient

    max_{1≤k≤τ(b,n)} a_k

occurring in Euclid's algorithm when b and n are input to the algorithm has been studied by Hensley (1991).

The continued fraction transformation τ underlies a chaotic discrete dynamical system which exhibits in an accessible manner all the common features of such systems. See, e.g., Corless (1992).

1.2

Whole sections or chapters on the metrical theory of continued fractions can be found in the books by Billingsley (1965), Ibragimov and Linnik (1971), Iosifescu and Grigorescu (1990), Kac (1959), Khin(t)chin(e) (1956, 1963, 1964), Knuth (1981), Koksma (1936), Lévy (1954), Rockett and Szüsz (1992), Sinai (1994), and Urban (1923).

1.3

The natural extension τ̄ of τ has been introduced in a more general context by Nakada (1981) in order to derive ergodic properties of associated random variables. See Sections 4.0 and 4.1. The extended incomplete quotients were first introduced by Faivre (1996) and, in general, the extended random variables by Iosifescu (1997), who proved Theorem 1.3.5, which motivates the consideration of the conditional probability measures γ_a, a ∈ I. Proposition 1.3.8 and Corollary 1.3.9 can also be found in the latter reference. Subsections 1.3.5 and 1.3.6 rely on the work of Iosifescu (1989, 2000b). It is worth mentioning that to our knowledge this is the first time that mixing coefficients have been computed exactly. A first estimate, ψ(n) ≤ (0.8)^n, n ∈ N_+, of the ψ-mixing coefficients is due to Philipp (1988). As to other types of mixing, it seems possible to prove a kind of α-mixing for (r̄_ℓ)_{ℓ∈Z} using the Markovian structure of (s̄_ℓ)_{ℓ∈Z} and the reversibility of (ā_ℓ)_{ℓ∈Z}.


It is the appropriate place to mention that the sequence (a_n)_{n∈N_+} enjoys another mixing property known as the almost Markov property, a concept introduced by the Lithuanian school—see especially the references to the papers by V.A. Statulevičius and B. Riauba in Heinrich (1987) and Misevičius (1971). See also Saulis and Statulevičius (1991). Let µ ∈ pr(B_I) and for k, n ∈ N_+ define the random variable

    α_{k,n}(µ) = sup |µ(B | σ(a_1, ···, a_{k+n−1})) − µ(B | σ(a_{k+1}, ···, a_{k+n−1}))|,

where the supremum is taken over all B ∈ σ(a_{k+n}, a_{k+n+1}, ···). Put

    χ_µ(n) = sup_{k∈N_+} ess sup α_{k,n}(µ).

Then as shown in Heinrich (op. cit.)—for a slightly weaker form of this result see Misevičius (1981)—assuming that µ ≪ λ and that f = dµ/dλ ∈ L(I) and is bounded away from 0, we have

    χ_µ(n) ≤ 2^{−n+1} (24 + s(f)/inf_{x∈I} f(x)),  n ∈ N_+.

Finally, note that it has not been usual to prove F. Bernstein’s theorem (Proposition 1.3.16) as an application of ψ-mixing of the sequence of incomplete quotients.

2.1

Theorem 2.1.6 and Proposition 2.1.7 are in fact corollaries of the ergodic theorem of Ionescu Tulcea and Marinescu (1950) [see also Hennion (1993)], which is a deep generalization of an ergodic theorem of Doeblin and Fortet (1937). Cf. Iosifescu (1993b). As noted by Iosifescu (1993a), it is hard to understand how Doeblin (1940) missed a geometric rate solution to Gauss' problem, which could have been obtained by using the latter theorem.

Subsection 2.1.3 relies on the work of Iosifescu (1992, 1993, 1994). In particular, Propositions 2.1.11 and 2.1.12 have allowed for the simplest solution known to date to Gauss' problem, which is included in the first two references just quoted. Proposition 2.1.11 has also been proved by Szüsz (1961) for f ∈ C¹(I).

In connection with Proposition 2.1.17 we note that in the case of a singular µ ∈ pr(B_I) the solution to the corresponding Gauss' problem has not yet been systematically studied. See Remark 2 following Corollary 4.1.10 for a case where the limit clearly differs from Gauss' measure.


2.2

Subsections 2.2.1 and 2.2.2 contain a very detailed presentation of E. Wirsing's celebrated 1974 paper. This also includes the effective computation of numerical constants occurring there. Subsection 2.2.3 relies on the work of Iosifescu (2000a, c). That Theorem 2.2.6 holds for f ∈ L(I), that is, that Theorem 2.2.8 holds, had been announced in Iosifescu (1992) and subsequently used by Faivre (1998a). We stress again the importance of a study of the set E defined in Remark 1 following Theorem 2.2.6. (See also Remark 2 following Theorem 2.2.11.)

2.3

This section contains a detailed presentation of K.I. Babenko’s work on Gauss’ problem, with some improvements and generalizations. Information about the life and work of K.I. Babenko (1919–1987) can be found in Russian Math. Surveys 35 (1980), no. 2, 265–275, and 43 (1988), no. 2, 138–151. Proposition 2.3.2 and its proof are due to Mayer and Roepstorﬀ (1987). For a = 0, that is, under Lebesgue measure λ = γ0 the exact Gauss–Kuzmin– L´vy Theorem 2.3.5 has been proved by Babenko (1978). The general case e a ∈ I has been announced by Iosifescu (2000 b). Note that equation (2.3.14) is equivalent to equation (3.6) in Hensley (1992). We stress the fact that for some a ∈ I the exact convergence rate in Gauss’ problem under γa is faster than Wirsing’s optimal rate O(λn ) as 0 n → ∞. See the Remark after the proof of Corollary 2.3.6. It should be noted that by Proposition 2.1.17 for any i(k) ∈ Nk the + limit of µ[(an+1 , . . . , an+k ) = i(k) ] as n → ∞ exists and is equal to γ(I(i(k) ) whatever µ ∈ pr(BI ) such that µ λ. Corollary 2.3.6 shows that in the case where µ = γa , a ∈ I, a good convergence rate also holds. A note of historical nature is in order concerning the equation lim λ(an = k) = 1 1 log 1 + log 2 k(k + 2) , k ∈ N+ ,

n→∞

which is a weaker form of a result given in Corollary 2.3.6. This formula was ﬁrst obtained as early as 1900. Two papers of the Swedish astronomer Hugo Gyld´n, whose understanding of the approximate computation of planetary e motions led him in 1888 to study the asymptotic of λ(an = k), k ∈ N+ , as n → ∞, were taken up for revision by his fellow-countrymen Torsten Brod´n e and Anders Wiman, both mathematicians associated with Lund University.


Wiman (1900) finally got the correct result after Sisyphean computations. Two subsequent papers of Brodén and Wiman, both published in 1901, were then considered by Émile Borel as the first ones to notice the applicability of measure theory in probability. The reader will find precise references and all the necessary details in von Plato (1994, Ch. 2). This book is a fascinating account of the emergence of measure-theoretic probability in the first third of the 20th century (until the publication of A.N. Kolmogorov's Grundbegriffe der Wahrscheinlichkeitsrechnung in 1933). It is convincingly argued there that the theory of the continued fraction expansion should be counted among the fields that brought infinitary events and the idea of measure 0 into probability.
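The 1900 digit law is easy to test empirically: under λ (i.e., for x uniform on (0, 1)) the distribution of a_n is already very close to the limit for moderate n. A quick Monte Carlo sketch (our own code; floating point is accurate enough for the tenth digit):

```python
import random
from math import log

def cf_digit(x, n):
    """n-th RCF digit of x in (0, 1] (floating point; fine for small n)."""
    a = 0
    for _ in range(n):
        a = int(1 / x)
        x = 1 / x - a
        if x == 0:           # dyadic rationals have finite expansions
            break
    return a

random.seed(1)
N, n = 100_000, 10           # sample size and digit index
counts = {}
for _ in range(N):
    k = cf_digit(1 - random.random(), n)   # uniform on (0, 1]
    counts[k] = counts.get(k, 0) + 1

for k in (1, 2, 3):
    limit = log(1 + 1 / (k * (k + 2))) / log(2)
    print(k, counts[k] / N, limit)   # empirical frequency vs. the 1900 formula
```

For k = 1 the limit is log(4/3)/log 2 ≈ 0.41504, and the simulated frequency matches it to two decimals.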

2.5

This section relies on the work of Iosifescu (1994, 1997, 1999). For a = 0, that is, under Lebesgue measure λ = γ_0, the optimal convergence rate O(g^{2n}) in Theorem 2.5.5 (without explicit lower and upper bounds) has first been shown by Dürner (1992) using a different approach. For a = 0, too, Theorem 2.2.8 with just an upper bound O(g^n) [instead of the optimal one O(g^{2n})] has been proved by a different method by Dajani and Kraaikamp (1994). The proof given here emphasizes the importance of the generalized Brodén–Borel–Lévy formula (1.3.21).

It is hard to understand why A. Denjoy's 1936 Comptes Rendus Notes went unnoticed for so many years. The method of proving and generalizing Denjoy's results here is quite different from that suggested by him.

3.0

The idea underlying Lemma 3.0.1 goes back to Philipp (1970). Lemma 3.0.2 is a special case of a result of Samur (1989, Lemma 2.3).

3.1

Except for Theorem 3.1.6, the results in Subsections 3.1.1 and 3.1.2 have been proved by Samur (1989). The classical Poisson law [Theorem 3.1.2 (iii)] under any µ ≪ λ has first been given a complete proof by Iosifescu (1977), who filled a gap in an incomplete proof by Doeblin (1940, p. 358).


3.2 & 3.3

Subsections 3.2.2 and 3.2.3 mainly rely on the work of Samur (1989, 1996), who applied his earlier results for different mixing random variables to the special case of random variables occurring in the metrical theory of continued fractions. The presentation here is more transparent due to the consistent use of the extended random variables, which only appear in an implicit manner in Samur's treatment. For the first versions of most of the results in these sections credit should be given to Doeblin (1940). An extensive analysis of Doeblin's paper has been made by Iosifescu (1990, 1993a, b), where the reader can find a comprehensive evaluation of Doeblin's important contributions to the metrical theory of continued fractions as compared with subsequent work in the field.

It should be noted that Samur (1989) has also dealt with more general partial sums S_n defined as follows. Let (f_n)_{n∈N_+} be a sequence of H-valued functions on N_+, where H is a separable Hilbert space, and put S_n = Σ_{i=1}^{n} f_n(a_i), n ∈ N_+. He derived sufficient conditions for the laws of certain random functions associated with the S_n, n ∈ N_+, to converge weakly (in the Skorohod space of H-valued functions on I) to an infinitely divisible probability measure on H.

Another generalization of the case considered in Theorem 3.2.4 is that of partial sums

    S_n = Σ_{i=1}^{n} f_i(a_i),

where (f_n)_{n∈N_+} is a sequence of real-valued functions on N_+. A very special case has been taken up by Doeblin (1940, p. 360), with f_n(j) = 1 or 0 according as j ≥ c_n or j < c_n, n, j ∈ N_+. Here (c_n)_{n∈N_+} is a sequence of positive numbers. In this case S_n is the number of occurrences of the random events (a_i ≥ c_i), 1 ≤ i ≤ n. By F. Bernstein's theorem—see Corollary 1.3.16—lim_{n→∞} S_n < ∞ or = ∞ a.e. in I according as the series Σ_{n∈N_+} 1/c_n converges or diverges. Doeblin gave valid hints for a proof that if Σ_{n∈N_+} 1/c_n = ∞ then (S_n)_{n∈N_+} obeys the central limit theorem under λ. More precisely, (S_n − A_n)/√A_n is asymptotically N(0, 1) under λ as n → ∞, with

    A_n = (1/log 2) Σ_{i=1}^{n} log(1 + 1/c_i),  n ∈ N_+.

A complete proof with an estimate of the convergence rate under any µ ≪ λ has been given by Philipp (1970). This result has been improved by Zuparov (1981). The functional version of this central limit theorem was proved by Philipp and Webb (1973).
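The centering A_n can be compared with a simulated mean of S_n. A Monte Carlo sketch for c_i = i (our own code; exact rational arithmetic keeps the digits reliable, and the agreement is only approximate since λ(a_i ≥ c_i) merely approaches γ(a_1 ≥ c_i)):

```python
import random
from fractions import Fraction
from math import log

def count_events(x, n):
    """S_n = #{ i <= n : a_i >= c_i } with c_i = i, digits computed exactly."""
    s = 0
    for i in range(1, n + 1):
        a = int(1 / x)
        x = 1 / x - a
        if a >= i:
            s += 1
        if x == 0:            # rational x has a finite expansion
            break
    return s

rng = random.Random(42)
n, trials = 20, 4000
mean_S = sum(count_events(Fraction(rng.getrandbits(150), 2**150), n)
             for _ in range(trials)) / trials
A_n = sum(log(1 + 1 / i) for i in range(1, n + 1)) / log(2)   # = log(n+1)/log 2
print(mean_S, A_n)            # the two values are close
```

Here Σ 1/c_i diverges, so S_n → ∞ a.e. by Bernstein's theorem, and the simulated mean tracks A_n = log(n + 1)/log 2.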

3.4

We only mention here a result not covered by those given in this section. It is about Doeblin's sequence (S_n)_{n∈N_+} just discussed. Doeblin (1940, p. 361) asserted the validity of the law of the iterated logarithm

    λ( lim sup_{n→∞} (S_n − A_n)/√(2 A_n log log A_n) = 1 ) = 1.

A complete proof was again given by Philipp (1970). The functional version of this law of the iterated logarithm might follow from a more general result in Szüsz and Volkmann (1982, p. 458).

4.0

Most of the results stated for probability measures are still valid for ﬁnite measures and even for σ-ﬁnite, inﬁnite measures. See, e.g., Aaronson (1997).

4.1

Khin(t)chin(e) [1934/35, 1936; 1963 (or 1964), Ch. 3] proved the a.e. convergence of the arithmetic means Σ_{i=1}^{n} f(a_i, ···, a_{i+k−1})/n, n ∈ N_+, for some fixed k ∈ N, under an unnecessarily strong assumption on the function f : N_+^k → R. His proofs are quite intricate since he made no use of the Birkhoff–Khinchin (!) ergodic theorem which, as we have seen, provides short and elegant proofs. (This should certainly be associated with the fact that ergodic theory at the time was restricted to invertible transformations. But even so a way out could perhaps have been found.) Unlike Khinchin, Doeblin (1940, p. 366) did make use of the ergodic theorem. He proved that the continued fraction transformation τ is ergodic under λ [a different proof had been given earlier by Knopp (1926), see also Martin (1934)]. Since τ is γ-preserving, this enabled him to derive (in an equivalent form) equation (4.1.1), and thus to retrieve Khinchin's results under weaker assumptions in a straightforward manner. It is the appropriate place to note that, in spite of the fact that, e.g., Billingsley (1965, p. 49) fully credits Doeblin for the idea leading to (4.1.1), many authors assert that this idea is due to


Ryll-Nardzewski (1951). Actually, the only real advance made after 1940 in using ergodic theorems in the metric theory of the RCF expansion originated with Nakada (1981) who, as already mentioned, introduced the natural extension τ̄ of τ, allowing one to derive equation (4.1.6). It is again really surprising that Doeblin (1940, p. 365) asserts that his version of Theorem 2.2.11—see Remark 1 following that theorem—implies that

    lim_{n→∞} (1/n) card{k : Θ_k^{−1} < x, 1 ≤ k ≤ n} = H(x),  x ≥ 1,

and that n^{−1} Σ_{i=1}^{n} Θ_i converges a.e. as n → ∞ to a constant (not indicated). Now, Doeblin's first assertion above is equivalent to the first case considered in Corollary 4.1.22, while the second one is the first equation in Corollary 4.1.23 without the value of the limit. How did Doeblin guess these results, whose proofs involve the use of τ̄?

It should be noted that special cases of the Khinchin–Doeblin results have been known before. For example, as already noted, Proposition 4.1.1 and its consequences were first proved (without convergence rates) by Lévy (1929). The application of the Gál–Koksma theorem to the RCF expansion, yielding the convergence rates indicated, is due to de Vroedt (1962, 1964).

Let us finally mention that in Philipp (1967) a more general problem is considered. Given an arbitrary sequence (I_n)_{n∈N_+} of intervals contained in I, it is shown there that for any ε > 0 the random variable card{k : τ^k ∈ I_k, 1 ≤ k ≤ n}, n ∈ N_+, is equal to

    Σ_{k=1}^{n} γ(I_k) + O( (Σ_{k=1}^{n} γ(I_k))^{1/2} log^{(3+ε)/2} n )  a.e.

as n → ∞, where the constant implied in O depends on both ε and the current point ω ∈ Ω. Moeckel (1982), then Jager and Liardet (1988), using quite different methods, showed—amongst other things—that if we consider modulo 2 the sequence (q_n)_{n∈N_+} of the denominators of the RCF convergents of any given ω ∈ Ω, then the asymptotic relative frequencies of the digit blocks 01, 10, and 11 are all a.e. equal to 1/3. [Note that the digit block 00 cannot occur since |p_{n−1} q_n − p_n q_{n−1}| = 1, n ∈ N_+.] Jager and Liardet (op. cit.) showed


that results of this kind can be easily derived from the ergodicity of a certain skew product. To define it we need some notation. For any integer m ≥ 2 let G(m) denote the finite group of 2 × 2 matrices with entries from Z/mZ (the classes of remainders modulo m) and determinant equal to ±1, that is,

    G(m) = { (a b; c d) : a, b, c, d ∈ Z/mZ, ad − bc = ±1 }.

It is known that the cardinality of G(m) is given by the formula

    card G(m) = 2J(2) = 6  if m = 2,    card G(m) = 2mJ(m)  if m ≥ 3,

where J is Jordan's arithmetical totient function defined by

    J(m) = m² ∏_{p|m} (1 − 1/p²),  m ≥ 2.

Here the product is taken over all prime numbers p which divide m. Jager and Liardet's skew product T_m : Ω × G(m) → Ω × G(m) is then defined by

    T_m(ω, A) = ( τ(ω), A (0 1; 1 a_1(ω)) mod m ),  (ω, A) ∈ Ω × G(m).

These authors showed that T_m is γ ⊗ h_m-preserving, where h_m is the Haar measure on G(m), that is, the uniform one assigning measure 1/card G(m) to any element of G(m), and that (T_m, γ ⊗ h_m) is an ergodic endomorphism. Hence they deduced, e.g., that given integers m ≥ 2 and a, b ∈ N_+ with g.c.d.(a, b, m) = 1, we have

    lim_{n→∞} (1/n) card{k : p_k ≡ a, q_k ≡ b mod m, 1 ≤ k ≤ n} = 1/J(m)  a.e.,

a result also obtained by Moeckel (1982). Subsequently, Nolte (1990) gave other interesting applications of Jager and Liardet's endomorphism.

A natural extension T̄_m of T_m was obtained and studied by Dajani and Kraaikamp (1998). It appears that we can take T̄_m : Ω² × G(m) → Ω² × G(m) defined by

    T̄_m((ω, θ), A) = ( τ̄(ω, θ), A (0 1; 1 a_1(ω)) mod m )

for ((ω, θ), A) ∈ Ω² × G(m). Then T̄_m is γ̄ ⊗ h_m-preserving, and (T̄_m, γ̄ ⊗ h_m) is an ergodic automorphism. Hence Dajani and Kraaikamp (op. cit.) deduced, e.g., that for any integers m ≥ 2, 0 ≤ a, b ≤ m − 1, with g.c.d.(a, b, m) = 1 and for any (t_1, t_2) ∈ I² we have

    lim_{n→∞} (1/n) card{k : Θ_{k−1} < t_1, Θ_k < t_2, p_k ≡ a, q_k ≡ b mod m, 1 ≤ k ≤ n} = H(t_1, t_2)/J(m)  a.e.,

where the distribution function H has been defined in Corollary 4.1.20. Their paper contains a host of other results. They also showed that these results can be extended to S-expansions (cf. Sections 4.2 and 4.3). It is interesting to note that the sequences of numerators and denominators of the S-convergents have—mod m—the same asymptotic behaviour as that just indicated for the sequences of numerators and denominators of the RCF convergents.

It may seem difficult to compare, e.g., the decimal expansion with the RCF expansion, since their dynamics are different. However, Lochs (1964) obtained a then surprising result that had to serve as a prototype for further results of the same kind. Let ω ∈ Ω and consider the rational number x_n = x_n(ω) := ⌊10^n ω⌋/10^n, which yields the first n decimal digits of ω, and y_n = x_n + 10^{−n}, n ∈ N_+. Clearly, for n large enough we have y_n < 1. Next, let ω = [a_1, a_2, ···], x_n = [b_1, ···, b_k], and y_n = [c_1, ···, c_ℓ] be the RCF expansions of ω, x_n, and y_n, respectively, and for n ∈ N_+ large enough put

    m_n = m_n(ω) = max{i ≤ max(k, ℓ) : b_j = c_j, 1 ≤ j ≤ i}.

n→∞

    lim_{n→∞} m_n/n = (6 log 2 log 10)/π² = 0.97027014···  a.e.
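The worked example with ω = 2^{1/3} − 1 can be reproduced with exact rational arithmetic (a sketch with our own helper names):

```python
from fractions import Fraction

def rcf(x):
    """RCF digits of a positive Fraction < 1 (finite, since x is rational)."""
    digits = []
    while x:
        a = int(1 / x)        # next partial quotient (floor of 1/x)
        digits.append(a)
        x = 1 / x - a
    return digits

def common_prefix_len(u, v):
    k = 0
    while k < min(len(u), len(v)) and u[k] == v[k]:
        k += 1
    return k

# omega = 2**(1/3) - 1 = 0.259921...; its first 5 decimal digits give:
x5 = Fraction(25992, 10**5)
y5 = x5 + Fraction(1, 10**5)
m5 = common_prefix_len(rcf(x5), rcf(y5))
print(rcf(x5))    # [3, 1, 5, 1, 1, 4, 2, 5, 1, 3]
print(rcf(y5))    # [3, 1, 5, 1, 1, 5, 5, 1, 2, 1, 4, 3]
print(m5)         # 5
```

Both expansions agree exactly with those quoted in the text, and the common prefix has length m_5 = 5.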

This means that, roughly speaking, usually around 97% of the RCF digits are determined by the decimal digits. Using an early mainframe computer, by way of example, Lochs (1963) calculated that the first 1000 decimal digits of π determine 968 RCF digits of it!

Lochs' result was generalized to a wider class of transformations of I by Bosma et al. (1999). Their results are based on the Shannon–McMillan–Breiman theorem in information theory [see Billingsley (1965, p. 129)], while Lochs' limit appears in fact to be the ratio of the entropies of the transformations S : I → I defined as Sx = 10x mod 1, x ∈ I, underlying the decimal expansion, and τ. Finally, Dajani and Fieldsteel (2001) gave wider applications and simpler proofs of results describing the rate at which the digits of one number-theoretical expansion determine those of another. Their proofs are based on general measure-theoretic covering arguments and not on the dynamics of specific maps.

We mention that Lochs' problem was also considered by Faivre (1997, 1998b), who showed that (i) for any ε > 0 there exist positive constants a < 1 and A such that

    λ( |m_n/n − (6 log 2 log 10)/π²| ≥ ε ) ≤ A a^n,  n ∈ N_+,

and (ii) the random variable (m_n − 6(log 2)(log 10) n/π²)/√n is asymptotically N(0, σ) for some σ > 0 (which is related to the constant denoted by the same letter in Example 3.2.11). Clearly, Lochs' result is implied by (i) via the Borel–Cantelli lemma.

Cassels (1959) showed that there exist numbers x which are normal in base 3 but non-normal in any base that is not a power of 3. This result was generalized by Schmidt (1960) as follows. Let the notation r ∼ s stand for r, s ∈ N_+ being powers of the same integer. It is fairly obvious that if r ∼ s then normality of x in base r and normality of x in base s imply each other. If r ≁ s then this implication does not hold. In fact, Schmidt (op. cit.) showed that in the latter case there is a set of the power of the continuum of numbers x which are normal in base r but not even simply normal in base s. (Simple normality means that each single digit occurs with the proper frequency.) Motivated by this, Schweiger (1969) defined two number-theoretical transformations T and S on I (or I^d, the d-dimensional unit cube, d ∈ N_+) to be equivalent (T ∼ S) if there exist positive integers m, n ∈ N_+ such that T^m = S^n. Schweiger then showed that T ∼ S implies that every T-normal number is S-normal, and conjectured that T ≁ S implies the opposite conclusion. Surprisingly, Kraaikamp and Nakada (2000) proved that the RCF and NICF expansions share the same set of normal numbers. Clearly, in itself


this is not a counter-example to Schweiger's conjecture, since the RCF transformation τ and the NICF transformation N_{1/2} 'live' on different intervals. However, in Kraaikamp and Nakada (2001) two counter-examples are given.

4.2 & 4.3

Section 4.2 fully relies on the work of Kraaikamp (1991); see also his 1989 paper. There exists a host of CF expansions which would have deserved to be discussed here. Two such expansions are the Rosen continued fraction expansions and the α-expansions of Tanaka and Ito (1981). We will briefly discuss both of them.

Although Rosen (1954) introduced his CF expansions in the mid-1950s, it is only very recently that there has been any investigation of their metric properties—see Burton et al. (2000), Gröchenig and Haas (1996), Nakada (1995), Sebe (2002), and Schmidt (1993).

The groups which underlie the Rosen continued fraction expansions are Fuchsian groups of the first kind—discrete subgroups of PSL(2, R) acting upon the Poincaré upper half-plane by Möbius (fractional linear) transformations, with all of R as their limit sets. Let λ = λ_q = 2 cos(π/q) for q ∈ {3, 4, ...}, and put

    A = (1 λ; 0 1),    B = (0 −1; 1 0).

Then the group G_q generated by A and B is called the Hecke (triangle) group of index q. Rosen (op. cit.) defined a CF expansion related to G_q, q ≥ 4. (Note that for q = 3 we have the modular group.) Fix some such q and let J_q = [−λ/2, λ/2]. Then the transformation f_q : J_q → J_q defined by

    f_q(x) = sgn(x)/x − λ ⌊ sgn(x)/(λx) + 1/2 ⌋,  x ∈ J_q \ {0},  f_q(0) = 0,

leads to a CF expansion of the form

    x = e_1/(b_1 λ + e_2/(b_2 λ + ⋱)),

where e_i is equal to either 1 or −1 and b_i ∈ N, i ∈ N_+. We call this the Rosen, or λ-continued fraction (λ-CF), expansion of x ∈ J_q \ {0}.
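A numerical sketch of the λ-CF for q = 5 (our own code, assuming the map f_q as displayed above): the pairs (e_i, b_i) are read off by iterating f_q, and truncating the expansion reconstructs x.

```python
import math

Q = 5
LAM = 2 * math.cos(math.pi / Q)      # lambda_q = 2 cos(pi/q)

def rosen_digits(x, n):
    """First n pairs (e_i, b_i) of the Rosen lambda-CF of x in J_q."""
    pairs = []
    for _ in range(n):
        e = 1 if x > 0 else -1
        y = 1 / abs(x)
        b = math.floor(y / LAM + 0.5)    # nearest integer to 1/(lambda |x|)
        pairs.append((e, b))
        x = y - b * LAM                  # f_q(x), lies in [-lambda/2, lambda/2]
        if x == 0:
            break
    return pairs

def rosen_value(pairs):
    """x = e1/(b1*lam + e2/(b2*lam + ...)), truncated."""
    v = 0.0
    for e, b in reversed(pairs):
        v = e / (b * LAM + v)
    return v

x0 = 0.3
pairs = rosen_digits(x0, 15)
print(abs(rosen_value(pairs) - x0))      # small: the lambda-CF converges to x0
```

Fifteen digits already reproduce x₀ to well within rounding noise.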


In Burton et al. (op. cit.) the natural extension of the ergodic dynamical system underlying the λ-CF expansion was obtained for any q ≥ 3—the case q = 3 is in fact the NICF expansion. [Previously, Nakada (op. cit.) had obtained a similar result for any even q.] From this a large number of results similar to those holding for the RCF expansion were obtained for the λ-CF expansion.

At first sight Nakada's α-expansions and those of Tanaka and Ito (1981) bear a close resemblance. Let α ∈ [1/2, 1], I_α = [α − 1, α], and define the transformation T_α : I_α → I_α by

    T_α(x) = −1/x − ⌊−1/x + 1 − α⌋  if x ∈ I_α \ {0},    T_α(x) = 0  if x = 0.

It yields a unique Tanaka–Ito α-expansion of the form

    x = −1/(b_1 + (−1)/(b_2 + ⋱)),  x ∈ I_α \ {0},

which is finite if and only if x is rational, and where b_i ∈ Z \ {0}, i ∈ N_+. In spite of the similarities, it is much harder to obtain results for the Tanaka–Ito α-expansions than for the Nakada α-expansions discussed in Subsection 4.3.1. E.g., Tanaka and Ito (op. cit.) were only able to give the explicit form of the density of the invariant measure for 1/2 ≤ α ≤ g. For these values of α they were also able to derive the entropy of T_α. It is interesting to note that the latter is independent of α ∈ [1/2, g], and is equal to π²/(6 log g), which is the value corresponding to an S-expansion with maximal singularization area.

It should be noted that limit properties such as those in Chapter 3, for CF expansions other than the RCF expansion, need the corresponding Gauss–Kuzmin–Lévy theorems (implying ψ-mixing of the sequence of their incomplete quotients). In this respect we mention the papers of Dajani and Kraaikamp (1999), Iosifescu and Kalpazidou (1993), Kalpazidou (1985a, c, 1986d, e, 1987b), Popescu (1997a, b, 1999, 2000), Rieger (1978, 1979), Rockett (1980), and Sebe (2000a, b, 2001a, b, 2002). It appears, as noted in the Preface, that for any single CF expansion a specific approach is required, which has to more or less mimic the one working for the RCF expansion.

We conclude by briefly discussing a generalization of the RCF expansion known as f-expansions (which, in general, are not CF expansions). Let f be


a continuous strictly decreasing (increasing) real-valued function defined on [1, β], where either 2 < β ∈ N_+ or β = ∞ (on [0, β], where either 1 < β ∈ N_+ or β = ∞), such that f(1) = 1 and f(β) = 0 (f(0) = 0 and f(β) = 1), with the convention f(β) = lim_{x→β} f(x) for β = ∞. Denote by f^{−1} the inverse function of f, which is defined on I. Such a function f can be used to represent most real numbers t ∈ I as

    t = f(a_1(t) + f(a_2(t) + ···)) := lim_{n→∞} f_n(a_1(t), ···, a_n(t)),

where f_n is defined recursively by f_1(x_1) = f(x_1), f_2(x_1, x_2) = f_1(x_1 + f(x_2)), and

    f_{n+1}(x_1, ···, x_{n+1}) = f_n(x_1, ···, x_{n−1}, x_n + f(x_{n+1})),  n ≥ 2.

Here the 'incomplete quotients' a_n(t) are defined recursively as

    a_n(t) = ⌊f^{−1}({r_{n−1}(t)})⌋  with  r_0(t) = t,  r_n(t) = f^{−1}({r_{n−1}(t)}),  n ∈ N_+.

Note that

    r_n(t) = a_n(t) + f(a_{n+1}(t) + f(a_{n+2}(t) + ···)),  n ∈ N_+.

The above representation of t is called its f-expansion. Clearly, the RCF expansion is obtained for f(x) = 1/x, x ≥ 1, and the part of the continued fraction transformation τ is now played by the f-expansion transformation τ_f of I defined by τ_f(t) = {f^{−1}(t)}, t ∈ I. [Some caution is necessary in the case where β = ∞, when either τ_f(0) or τ_f(1) should be given the value 0.] Also, the natural extension τ̄_f of τ_f is defined by

    τ̄_f(t, u) = (τ_f(t), f(a_1(t) + u))

for the points (t, u) of a suitable subset of I² of Lebesgue measure 1.

The f-expansions were first considered by Kakeya (1924), who proved that if f^{−1} is absolutely continuous and (f^{−1})′ > 1 a.e. in I then, save possibly for a countable subset of I, any other t ∈ I has an f-expansion. A metrical theory of f-expansions parallelling that of the RCF expansion is available. See, e.g., Iosifescu and Grigorescu (1990, Section 5.4) and the references therein. Finally, if β does not belong to N_+ ∪ {∞}, then the corresponding f leads to a so-called f-expansion with dependent digits. For recent results on such f-expansions, see Barrionuevo et al. (1994), Dajani and Kraaikamp (1996, 2001), and Dajani et al. (1994).
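The recursion for the digits a_n(t) and remainders r_n(t) above can be sketched in a few lines; for f(x) = 1/x it reproduces the RCF digits (the function name and the test value are our own choices):

```python
import math

def f_expansion_digits(t, f_inv, n):
    """First n digits of the f-expansion of t in (0, 1).
    f_inv is the inverse function of f; a_k = floor(f_inv(r_{k-1})),
    and the next remainder is the fractional part {f_inv(r_{k-1})}."""
    digits = []
    r = t
    for _ in range(n):
        y = f_inv(r)
        a = math.floor(y)
        digits.append(a)
        r = y - a              # tau_f applied to the remainder
    return digits

# f(x) = 1/x gives the RCF expansion; sqrt(2) - 1 = [2, 2, 2, ...]:
t = math.sqrt(2) - 1
print(f_expansion_digits(t, lambda r: 1 / r, 5))   # [2, 2, 2, 2, 2]
```

Swapping in another admissible f^{−1} (say, for a β-type expansion) changes only the `f_inv` argument.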

References

Aaronson, J. (1986) Random f-expansions. Ann. Probab. 14, 1037–1057.

Aaronson, J. (1997) An Introduction to Infinite Ergodic Theory. Mathematical Surveys and Monographs 50. Amer. Math. Soc., Providence, RI.

Aaronson, J. and Nakada, H. (2001) Sums without maxima. Preprint.

Abramov, L.M. (1959) Entropy of induced automorphisms. Dokl. Akad. Nauk SSSR 128, 647–650. (Russian)

Abramowitz, M. and Stegun, I.A. (Eds.) (1964) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards, Washington, D.C.

de Acosta, A. (1982) Invariance principles in probability for triangular arrays of B-valued random vectors and some applications. Ann. Probab. 10, 346–373.

Adams, W.W. (1979) On a relationship between the convergents of the nearest integer and regular continued fractions. Math. Comp. 33, 1321–1331.

Adler, R.L. (1991) Geodesic flows, interval maps, and symbolic dynamics. In: Bedford, T. et al. (Eds.) (1991), 93–123.

Adler, R.L. and Flatto, L. (1984) The backward continued fraction map and geodesic flow. Ergodic Theory and Dynamical Systems 4, 487–492.

Adler, R., Keane, M., and Smorodinsky, M. (1981) A construction of a normal number for the continued fraction transformation. J. Number Theory 13, 95–105.


Alexandrov, A.G. (1978) Computer investigation of continued fractions. Algorithmic Studies in Combinatorics, 142–161. Nauka, Moscow. (Russian)
Aliev, I., Kanemitsu, S., and Schinzel, A. (1998) On the metric theory of continued fractions. Colloq. Math. 77, 141–146.
Alzer, H. (1998) On rational approximation to e. J. Number Theory 68, 57–62.
Araujo, A. and Giné, E. (1980) The Central Limit Theorem for Real and Banach Valued Random Variables. Wiley, New York.
Babenko, K.I. (1978) On a problem of Gauss. Soviet Math. Dokl. 19, 136–140.
Babenko, K.I. and Jur'ev, S.P. (1978) On the discretization of a problem of Gauss. Soviet Math. Dokl. 19, 731–735.
Bagemihl, F. and McLaughlin, J.R. (1966) Generalization of some classical theorems concerning triples of consecutive convergents to simple continued fractions. J. Reine Angew. Math. 221, 146–149.
Bailey, D.H., Borwein, J.M., and Crandall, R.E. (1997) On the Khintchine constant. Math. Comp. 66, 417–431.
Baladi, V. and Keller, G. (1990) Zeta functions and transfer operators for piecewise monotonic transformations. Comm. Math. Phys. 127, 459–477.
Barbolosi, D. (1990) Sur le développement en fractions continues à quotients partiels impairs. Monatsh. Math. 109, 25–37.
Barbolosi, D. (1993) Automates et fractions continues. J. Théor. Nombres Bordeaux 5, 1–22.
Barbolosi, D. (1997) Une application du théorème ergodique sous-additif à la théorie métrique des fractions continues. J. Number Theory 66, 172–182.
Barbolosi, D. (1999) Sur l'ordre de grandeur des quotients partiels du développement en fractions continues régulières. Monatsh. Math. 128, 189–200.


Barbolosi, D. and Faivre, C. (1995) Metrical properties of some random variables connected with the continued fraction expansion. Indag. Math. (N.S.) 6, 257–265.
Barndorff-Nielsen, O. (1961) On the rate of growth of the partial maxima of a sequence of independent identically distributed random variables. Math. Scand. 9, 383–394.
Barrionuevo, J., Burton, R.M., Dajani, K., and Kraaikamp, C. (1996) Ergodic properties of generalized Lüroth series. Acta Arith. 74, 311–327.
Bedford, T., Keane, M., and Series, C. (Eds.) (1991) Ergodic Theory, Symbolic Dynamics and Hyperbolic Spaces. Oxford University Press, Oxford.
Berechet, A. (2001a) A Kuzmin-type theorem with exponential convergence for a class of fibred systems. Ergodic Theory and Dynamical Systems 21, 673–688.
Berechet, A. (2001b) Perron–Frobenius operators acting on BV(I) as contractors. Ergodic Theory and Dynamical Systems 21, 1609–1624.
Bernstein, F. (1911) Über eine Anwendung der Mengenlehre auf ein aus der Theorie der säkularen Störungen herrührendes Problem. Math. Ann. 71, 417–439.
Billingsley, P. (1965) Ergodic Theory and Information. Wiley, New York.
Billingsley, P. (1968) Convergence of Probability Measures. Wiley, New York.
Borel, É. (1903) Contribution à l'analyse arithmétique du continu. J. Math. Pures Appl. (5) 9, 329–375.
Borel, É. (1909) Les probabilités dénombrables et leurs applications arithmétiques. Rend. Circ. Mat. Palermo 27, 247–271.
Bosma, W. (1987) Optimal continued fractions. Indag. Math. 49, 353–379.
Bosma, W. and Kraaikamp, C. (1990) Metrical theory for optimal continued fractions. J. Number Theory 34, 251–270.


Bosma, W. and Kraaikamp, C. (1991) Optimal approximation by continued fractions. J. Austral. Math. Soc. Ser. A 50, 481–504.
Bosma, W., Dajani, K., and Kraaikamp, C. (1999) Entropy and counting correct digits. Report No. 9925 (June), Univ. Nijmegen, Dept. of Math., Nijmegen (The Netherlands).
Bosma, W., Jager, H., and Wiedijk, F. (1983) Some metrical observations on the approximation by continued fractions. Indag. Math. 45, 281–299.
Bowman, K.O. and Shenton, L.R. (1989) Continued Fractions in Statistical Applications. Marcel Dekker, New York.
Boyarsky, A. and Góra, P. (1997) Laws of Chaos: Invariant Measures and Dynamical Systems in One Dimension. Birkhäuser, Boston.
Bradley, R.C. (1986) Basic properties of strong mixing conditions. In: Eberlein, E. and Taqqu, M.S. (Eds.) Dependence in Probability and Statistics, 165–192. Birkhäuser, Boston.
Breiman, L. (1960) A strong law of large numbers for a class of Markov chains. Ann. Math. Statist. 31, 801–803.
Brezinski, C. (1991) History of Continued Fractions and Padé Approximants. Springer–Verlag, Berlin.
Brjuno, A.D. (1964) The expansion of algebraic numbers into continued fractions. Ž. Vyčisl. Mat. i Mat. Fiz. 4, 211–221. (Russian)
Brodén, T. (1900) Wahrscheinlichkeitsbestimmungen bei der gewöhnlichen Kettenbruchentwickelung reeller Zahlen. Öfversigt af Kongl. Svenska Vetenskaps-Akademiens Förhandlingar 57, 239–266.
Brown, G. and Yin, Q. (1996) Metrical theory for Farey continued fractions. Osaka J. Math. 33, 951–970.
Bruckheimer, M. and Arcavi, A. (1995) Farey series and Pick's area theorem. Math. Intelligencer 17, no. 4, 64–67.
de Bruijn, N.G. and Post, K.A. (1968) A remark on uniformly distributed sequences and Riemann integrability. Indag. Math. 30, 149–150.


Bunimovich, L.A. (1996) Continued fractions and geometrical optics. Amer. Math. Soc. Transl. (2) 171, 45–55.
Burton, R.M., Kraaikamp, C., and Schmidt, T.A. (2000) Natural extensions for the Rosen fractions. Trans. Amer. Math. Soc. 352, 1277–1298.
Cassels, J.W.S. (1959) On a problem of Steinhaus about normal numbers. Colloq. Math. 7, 95–101.
Chaitin, G.J. (1998) The Limits of Mathematics: A Course on Information Theory and the Limits of Formal Reasoning. Springer–Verlag Singapore, Singapore.
Champernowne, D.G. (1933) The construction of decimals normal in the scale of ten. J. London Math. Soc. 8, 254–260.
Chatterji, S.D. (1966) Masse, die von regelmässigen Kettenbrüchen induziert sind. Math. Ann. 164, 113–117.
Choong, K.Y., Daykin, D.E., and Rathbone, C.R. (1971) Rational approximations to π. Math. Comp. 25, 387–392.
Chudnovsky, D.V. and Chudnovsky, G.V. (1991) Classical constants and functions: computations and continued fraction expansions. In: Chudnovsky, D.V. et al. (Eds.) Number Theory (New York, 1989/1990), 13–74. Springer–Verlag, New York.
Chudnovsky, D.V. and Chudnovsky, G.V. (1993) Hypergeometric and modular function identities, and new rational approximations to and continued fraction expansions of classical constants and functions. In: Knopp, M. and Sheingorn, M. (Eds.) (1993), 117–162.
Clemens, L.E., Merrill, K.D., and Roeder, D.W. (1995) Continued fractions and series. J. Number Theory 54, 309–317.
Cohn, H. (Ed.) (1993) Doeblin and Modern Probability (Blaubeuren, Germany, 1991). Contemporary Mathematics 149. Amer. Math. Soc., Providence, RI.
Corless, R.M. (1992) Continued fractions and chaos. Amer. Math. Monthly 99, 203–215.
Cornfeld, I.P., Fomin, S.V., and Sinai, Ya.G. (1982) Ergodic Theory. Springer–Verlag, Berlin.


Dajani, K. and Fieldsteel, A. (2001) Equipartition of interval partitions and an application to number theory. Proc. Amer. Math. Soc. 129, 3453–3460.
Dajani, K. and Kraaikamp, C. (1994) Generalization of a theorem of Kusmin. Monatsh. Math. 118, 55–73.
Dajani, K. and Kraaikamp, C. (1996) On approximation by Lüroth series. J. Théor. Nombres Bordeaux 8, 331–346.
Dajani, K. and Kraaikamp, C. (1998) A note on the approximation by continued fractions under an extra condition. New York J. Math. 3A, 69–80.
Dajani, K. and Kraaikamp, C. (1999) A Gauss–Kusmin theorem for optimal continued fractions. Trans. Amer. Math. Soc. 351, 2055–2079.
Dajani, K. and Kraaikamp, C. (2000) 'The mother of all continued fractions'. Colloq. Math. 84/85, 109–123.
Dajani, K. and Kraaikamp, C. (2001) From greedy to lazy expansions and their driving dynamics. Preprint No. 1186, Utrecht Univ., Dept. of Math., Utrecht.
Dajani, K., Kraaikamp, C., and Solomyak, B. (1996) The natural extension of the β-transformation. Acta Math. Hungar. 73, 97–109.
Daudé, H., Flajolet, P., and Vallée, B. (1997) An average-case analysis of the Gaussian algorithm for lattice reduction. Combinatorics, Probability and Computing 6, 397–433.
Davenport, H. (1999) The Higher Arithmetic: An Introduction to the Theory of Numbers, 7th Edition. Cambridge Univ. Press, Cambridge.
Davison, J.L. and Shallit, J.O. (1991) Continued fractions for some alternating series. Monatsh. Math. 111, 119–126.
Delmer, F. and Deshouillers, J-M. (1993) On a generalization of Farey sequences, I. In: Knopp, M. and Sheingorn, M. (Eds.) (1993), 243–246.
Delmer, F. and Deshouillers, J-M. (1995) On a generalization of Farey sequences, II. J. Number Theory 55, 60–67.


Denker, M. and Jakubowski, A. (1989) Stable limit distributions for strongly mixing sequences. Statist. Probab. Lett. 8, 477–483.
Denjoy, A. (1936a) Sur les fractions continues. C.R. Acad. Sci. Paris 202, 371–374.
Denjoy, A. (1936b) Sur une formule de Gauss. C.R. Acad. Sci. Paris 202, 537–540.
Diamond, H.G. and Vaaler, J.D. (1986) Estimates for partial sums of continued fraction partial quotients. Pacific J. Math. 122, 73–82.
Dixon, J.D. (1970) The number of steps in the Euclidean algorithm. J. Number Theory 2, 414–422.
Dixon, J.D. (1971) A simple estimate for the number of steps in the Euclidean algorithm. Amer. Math. Monthly 78, 374–376.
Doeblin, W. (1940) Remarques sur la théorie métrique des fractions continues. Compositio Math. 7, 353–371.
Doeblin, W. and Fortet, R. (1937) Sur des chaînes à liaisons complètes. Bull. Soc. Math. France 65, 132–148.
Doob, J.L. (1953) Stochastic Processes. Wiley, New York.
Doukhan, P. (1994) Mixing: Properties and Examples. Lecture Notes in Statist. 85. Springer–Verlag, New York.
Duren, P.L. (1970) Theory of H^p Spaces. Academic Press, New York.
Dörner, A. (1992) On a theorem of Gauss–Kuzmin–Lévy. Arch. Math. (Basel) 58, 251–256.
Elsner, C. (1999) On arithmetic properties of the convergents of Euler's number. Colloq. Math. 79, 133–145.
Elton, H.J. (1987) An ergodic theorem for iterated maps. Ergodic Theory and Dynamical Systems 7, 481–488.
Faivre, C. (1992) Distribution of Lévy constants for quadratic numbers. Acta Arith. 61, 13–34.
Faivre, C. (1993) Sur la mesure invariante de l'extension naturelle de la transformation des fractions continues. J. Théor. Nombres Bordeaux 5, 323–332.


Faivre, C. (1996) On the central limit theorem for random variables related to the continued fraction expansion. Colloq. Math. 71, 153–159.
Faivre, C. (1997) On decimal and continued fraction expansions of a real number. Acta Arith. 82, 119–128.
Faivre, C. (1998a) The rate of convergence of approximations of a continued fraction. J. Number Theory 68, 21–28.
Faivre, C. (1998b) A central limit theorem related to decimal and continued fraction expansions. Arch. Math. (Basel) 70, 455–463.
Falconer, K.J. (1986) The Geometry of Fractal Sets. Cambridge Univ. Press, Cambridge.
Falconer, K. (1990) Fractal Geometry: Mathematical Foundations and Applications. Wiley, Chichester.
Feller, W. (1968) An Introduction to Probability Theory and Its Applications, Vol. I, 3rd Edition. Wiley, New York.
Finch, S. (1995) Favorite Mathematical Constants. Available at: http://www.mathsoft.com/asolve/constant/constant.html
Flajolet, P. and Vallée, B. (1998) Continued fractions algorithms, functional operators, and structure constants. Theoret. Comput. Sci. 194, 1–34.
Flajolet, P. and Vallée, B. (2000) Continued fractions, comparison algorithms, and fine structure constants. Constructive, Experimental, and Nonlinear Analysis (Limoges, 1999), 53–82. Amer. Math. Soc., Providence, RI.
Fluch, W. (1986) Eine Verallgemeinerung des Kuz'min-Theorems. Anz. Österreich. Akad. Wiss. Math.-Natur. Kl. Sitzungsber. II 195, 325–339.
Fluch, W. (1992) Ein Operator der Kettenbruchtheorie. Anz. Österreich. Akad. Wiss. Math.-Natur. Kl. 129, 39–49.
Gál, I.S. and Koksma, J.F. (1950) Sur l'ordre de grandeur des fonctions sommables. Indag. Math. 12, 638–653.


Galambos, J. (1972) The distribution of the largest coefficient in continued fraction expansions. Quart. J. Math. Oxford Ser. (2) 23, 147–151.
Galambos, J. (1973) The largest coefficient in continued fractions and related problems. In: Osgood, Ch. (Ed.) Diophantine Approximation and its Applications (Proc. Conf., Washington, D.C., 1972), 101–109. Academic Press, New York.
Galambos, J. (1974) An iterated logarithm type theorem for the largest coefficient in continued fractions. Acta Arith. 25, 359–364.
Gologan, R.-N. (1989) Applications of Ergodic Theory. Technical Publishing House, Bucharest. (Romanian)
Gordin, M.I. (1971) On the behavior of the variances of sums of random variables forming a stationary process. Theory Probab. Appl. 16, 474–484.
Gordin, M.I. and Reznik, M.H. (1970) The law of the iterated logarithm for the denominators of continued fractions. Vestnik Leningrad. Univ. 25, no. 13, 28–33. (Russian)
Gray, J.J. (1984) A commentary on Gauss' mathematical diary, 1796–1814, with an English translation. Exposition. Math. 2, 97–130.
Grigorescu, S. and Popescu, G. (1989) Random systems with complete connections as a framework for fractals. Stud. Cerc. Mat. 41, 481–489.
Gröchenig, K. and Haas, A. (1996) Backward continued fractions and their invariant measures. Canad. Math. Bull. 39, 186–198.
Grothendieck, A. (1955) Produits tensoriels topologiques et espaces nucléaires. Mem. Amer. Math. Soc. 16. Amer. Math. Soc., Providence, RI.
Grothendieck, A. (1956) La théorie de Fredholm. Bull. Soc. Math. France 84, 319–384.
de Haan, L. (1970) On Regular Variation and its Application to the Weak Convergence of Sample Extremes. Math. Centre Tracts 32. Math. Centrum, Amsterdam.
Halmos, P.R. (1950) Measure Theory. Van Nostrand, New York. (Reprinted 1974 by Springer–Verlag, New York)


Hardy, G.H. and Wright, E. (1979) An Introduction to the Theory of Numbers, 5th Edition. Clarendon Press, Oxford. [Reprinted (with corrections) 1983]
Harman, G. (1998) Metric Number Theory. Oxford University Press, New York.
Harman, G. and Wong, K.C. (2000) A note on the metrical theory of continued fractions. Amer. Math. Monthly 107, 834–837.
Hartman, S. (1951) Quelques propriétés ergodiques des fractions continues. Studia Math. 12, 271–278.
Hartono, Y. and Kraaikamp, C. (2002) On continued fractions with odd partial quotients. Rev. Roumaine Math. Pures Appl. 47, no. 1.
Heilbronn, H. (1969) On the average length of a class of finite continued fractions. Number Theory and Analysis (Papers in Honor of Edmund Landau), 87–96. Plenum, New York.
Heinrich, H. (1987) Rates of convergence in stable limit theorems for sums of exponentially ψ-mixing random variables with an application to metric theory of continued fractions. Math. Nachr. 131, 149–165.
Hennion, H. (1993) Sur un théorème spectral et son application aux noyaux lipschitziens. Proc. Amer. Math. Soc. 118, 627–634.
Hensley, D. (1988) A truncated Gauss–Kuzmin law. Trans. Amer. Math. Soc. 306, 307–327.
Hensley, D. (1991) The largest digit in the continued fraction expansion of a rational number. Pacific J. Math. 151, 237–255.
Hensley, D. (1992) Continued fraction Cantor sets, Hausdorff dimension, and functional analysis. J. Number Theory 40, 336–358.
Hensley, D. (1994) The number of steps in the Euclidean algorithm. J. Number Theory 49, 142–182.
Hensley, D. (1996) A polynomial time algorithm for the Hausdorff dimension of continued fraction Cantor sets. J. Number Theory 58, 9–45.
Hensley, D. (1998) Metric Diophantine approximation and probability. New York J. Math. 4, 249–257.


Hensley, D. (2000) The statistics of the continued fraction digit sum. Pacific J. Math. 192, 103–120.
Heyde, C.C. and Scott, D.J. (1973) Invariance principles for the law of the iterated logarithm for martingales and processes with stationary increments. Ann. Probab. 1, 428–436.
Hofbauer, F. and Keller, G. (1982) Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, 119–140.
Hoffmann-Jørgensen, J. (1994) Probability with a View toward Statistics, Vols. I and II. Chapman & Hall, New York.
Hurwitz, A. (1889) Über eine besondere Art der Kettenbruch-Entwicklung reeller Grössen. Acta Math. 12, 367–405.
Ibragimov, I.A. and Linnik, Yu.V. (1971) Independent and Stationary Sequences of Random Variables. Wolters–Noordhoff, Groningen.
Ionescu Tulcea, C.T. and Marinescu, G. (1950) Théorie ergodique pour des classes d'opérations non complètement continues. Ann. of Math. (2) 52, 140–147.
Iosifescu, M. (1968) The law of the iterated logarithm for a class of dependent random variables. Theory Probab. Appl. 13, 304–313. Addendum, ibid. 15 (1970), 160.
Iosifescu, M. (1972) On Strassen's version of the loglog law for some classes of dependent random variables. Z. Wahrsch. Verw. Gebiete 24, 155–158.
Iosifescu, M. (1977) A Poisson law for φ-mixing sequences establishing the truth of a Doeblin statement. Rev. Roumaine Math. Pures Appl. 22, 1441–1447.
Iosifescu, M. (1978) Recent advances in the metric theory of continued fractions. Trans. Eighth Prague Conf. on Information Theory, Statistical Decision Functions, Random Processes (Prague, 1978), Vol. A, 27–40. Reidel, Dordrecht.
Iosifescu, M. (1989) On mixing coefficients for the continued fraction expansion. Stud. Cerc. Mat. 41, 491–499.


Iosifescu, M. (1990) A survey of the metric theory of continued fractions, fifty years after Doeblin's 1940 paper. In: Grigelionis, B. et al. (Eds.) Probability Theory and Mathematical Statistics (Proc. Fifth Vilnius Conference, 1989), Vol. I, 550–572. Mokslas, Vilnius & VSP, Utrecht.
Iosifescu, M. (1992) A very simple proof of a generalization of the Gauss–Kuzmin–Lévy theorem on continued fractions, and related questions. Rev. Roumaine Math. Pures Appl. 37, 901–914.
Iosifescu, M. (1993a) Doeblin and the metric theory of continued fractions: a functional theoretical approach to Gauss' 1812 problem. In: Cohn, H. (Ed.) (1993), 97–110.
Iosifescu, M. (1993b) A basic tool in mathematical chaos theory: Doeblin and Fortet's ergodic theorem and Ionescu Tulcea and Marinescu's generalization. In: Cohn, H. (Ed.) (1993), 111–124.
Iosifescu, M. (1994) On the Gauss–Kuzmin–Lévy theorem, I. Rev. Roumaine Math. Pures Appl. 39, 97–117.
Iosifescu, M. (1995) On the Gauss–Kuzmin–Lévy theorem, II. Rev. Roumaine Math. Pures Appl. 40, 91–105.
Iosifescu, M. (1996) On some series involving sums of incomplete quotients of continued fractions. Stud. Cerc. Mat. 48, 31–36. Corrigendum, ibid. 48, 146.
Iosifescu, M. (1997a) On the Gauss–Kuzmin–Lévy theorem, III. Rev. Roumaine Math. Pures Appl. 42, 71–88.
Iosifescu, M. (1997b) A reversible random sequence arising in the metric theory of the continued fraction expansion. Rev. Anal. Numér. Théor. Approx. 26, 91–93.
Iosifescu, M. (1999) On a 1936 paper of Arnaud Denjoy on the metrical theory of the continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 777–792.
Iosifescu, M. (2000a) An exact convergence rate result with application to Gauss' 1812 problem. Proc. Romanian Acad. Ser. A 1, 11–13.
Iosifescu, M. (2000b) Exact values of ψ-mixing coefficients of the sequence of incomplete quotients of the continued fraction expansion. Proc. Romanian Acad. Ser. A 1, 67–69.


Iosifescu, M. (2000c) On the distribution of continued fraction approximations: optimal rates. Proc. Romanian Acad. Ser. A 1, 143–145.
Iosifescu, M. and Grigorescu, S. (1990) Dependence with Complete Connections and its Applications. Cambridge Univ. Press, Cambridge.
Iosifescu, M. and Kalpazidou, S. (1993) The nearest integer continued fraction expansion: an approach in the spirit of Doeblin. In: Cohn, H. (Ed.) (1993), 125–137.
Iosifescu, M. and Kraaikamp, C. (2001) On Denjoy's canonical continued fraction expansion. Submitted.
Iosifescu, M. and Theodorescu, R. (1969) Random Processes and Learning. Springer–Verlag, Berlin.
Ito, Sh. (1987) On Legendre's theorem related to Diophantine approximations. Séminaire de Théorie des Nombres, 1987–1988 (Talence, 1987–1988), Exp. No. 44, 19 pp.
Ito, Sh. (1989) Algorithms with mediant convergents and their metrical theory. Osaka J. Math. 26, 557–578.
Jager, H. (1982) On the speed of convergence of the nearest integer continued fraction. Math. Comp. 39, 555–558.
Jager, H. (1985) Metrical results for the nearest integer continued fraction. Indag. Math. 47, 417–427.
Jager, H. (1986a) The distribution of certain sequences connected with the continued fraction. Indag. Math. 48, 61–69.
Jager, H. (1986b) Continued fractions and ergodic theory. Transcendental Number Theory and Related Topics, 55–59. RIMS Kokyuroku 599. Kyoto Univ., Kyoto.
Jager, H. and Kraaikamp, C. (1989) On the approximation by continued fractions. Indag. Math. 51, 289–307.
Jager, H. and Liardet, P. (1988) Distributions arithmétiques des dénominateurs de convergents de fractions continues. Indag. Math. 50, 181–197.
Jain, N.C. and Pruitt, W.E. (1975) The other law of the iterated logarithm. Ann. Probab. 3, 1046–1049.


References Jain, N.C. and Taylor, S.J. (1973) Local asymptotic laws for Brownian motion. Ann. Probab. 1, 527–549. Jenkinson, O. and Pollicott, M. (2001) Computing the dimension of dynamically deﬁned sets: E2 and bounded continued fractions. Ergodic Theory and Dynamical Systems 21, 1429–1445. Jain, N.C., Jodgeo, K., and Stout, W.F. (1975) Upper and lower functions for martingales and mixing processes. Ann. Probab. 3, 119–145. Jones, W.B. and Thron, W.J. (1980) Continued Fractions: Analytic Theory and Applications. Addison-Wesley, Reading, Mass. Kac, M. (1959) Statistical Independence in Probability and Statistics. Wiley, New York. Kaijser, T. (1983) A note on random continued fractions. Probability and Mathematical Statistics : Essays in Honour of Carl-Gustav Esseen, 74–84. Uppsala Univ., Dept. of Math., Uppsala. Kakeya, S. (1924) On a generalized scale of notations. Japan J. Math. 1, 95-108. Kalpazidou, S. (1985a) On a random system with complete connections associated with the continued fraction to the nearer integer expansion. Rev. Roumaine Math. Pures Appl. 30, 527–537. Kalpazidou, S. (1985b) On some bidimensional denumerable chains of inﬁnite order. Stochastic Process. Appl. 19, 341–357. Kalpazidou, S. (1985c) Denumerable chains of inﬁnite order and Hurwitz expansion. Selected Papers Presented at the 16th European Meeting of Statisticians (Marburg, 1994). Statist. Decisions, Suppl. Issue no. 2, 83–87. Kalpazidou, S. (1986a) A class of Markov chains arising in the metrical theory of the continued fraction to the nearer integer expansion. Rev. Roumaine Math. Pures Appl. 31, 877–890. Kalpazidou, S. (1986b) Some asymptotic results on digits of the nearest integer continued fraction. J. Number Theory 22, 271–279. Kalpazidou, S. (1986c) On nearest continued fractions with stochastically independent and identically distributed digits. J. Number Theory 24, 114–125.


Kalpazidou, S. (1986d) On a problem of Gauss–Kuzmin type for continued fractions with odd partial quotients. Pacific J. Math. 123, 103–114.
Kalpazidou, S. (1986e) A Gaussian measure for certain continued fractions. Proc. Amer. Math. Soc. 96, 629–635.
Kalpazidou, S. (1987a) On the entropy of the expansion with odd partial quotients. In: Grigelionis, B. et al. (Eds.) Probability Theory and Mathematical Statistics (Proc. Fourth Vilnius Conf., 1985), Vol. II, 55–62. VNU Science Press, Utrecht.
Kalpazidou, S. (1987b) On the application of dependence with complete connections to the metrical theory of G-continued fractions. Lithuanian Math. J. 27, no. 1, 32–40.
Kamae, T. (1982) A simple proof of the ergodic theorem using nonstandard analysis. Israel J. Math. 42, 284–290.
Kanwal, R.P. (1997) Linear Integral Equations: Theory and Technique, 2nd Edition. Birkhäuser, Boston.
Kargaev, P. and Zhigljavsky, A. (1997) Asymptotic distribution of the distance function to the Farey points. J. Number Theory 65, 130–149.
Katznelson, Y. and Weiss, B. (1982) A simple proof of some ergodic theorems. Israel J. Math. 42, 291–296.
Keane, M.S. (1991) Ergodic theory and subshifts of finite type. In: Bedford, T. et al. (Eds.) (1991), 35–70.
Keller, G. (1984) On the rate of convergence to equilibrium in one-dimensional systems. Comm. Math. Phys. 96, 181–193.
Khintchine, A. (1934/35) Metrische Kettenbruchprobleme. Compositio Math. 1, 361–382.
Khintchine, A. (1936) Zur metrischen Kettenbruchtheorie. Compositio Math. 3, 276–285.
Khintchine, A.J. (1956) Kettenbrüche. Teubner, Leipzig. [Translation of the 2nd (1949) Russian Edition; 1st Russian Edition 1935]
Khintchine, A.Ya. (1963) Continued Fractions. Noordhoff, Groningen. [Translation of the 3rd (1961) Russian Edition]


Khinchin, A.Ya. (1964) Continued Fractions. Univ. Chicago Press, Chicago. [Translation of the 3rd (1961) Russian Edition]
Klein, F. (1895) Über eine geometrische Auffassung der gewöhnlichen Kettenbruchentwicklung. Nachr. König. Gesellsch. Wiss. Göttingen Math.-Phys. Kl. 45, 357–359. [French version (1896) Sur une représentation géométrique du développement en fraction continue ordinaire. Nouvelles Ann. Math. (3) 15, 327–331]
Knopp, K. (1926) Mengentheoretische Behandlung einiger Probleme der diophantischen Approximationen und der transfiniten Wahrscheinlichkeiten. Math. Ann. 95, 409–426.
Knopp, M. and Sheingorn, M. (Eds.) (1993) A Tribute to Emil Grosswald: Number Theory and Related Analysis. Contemporary Mathematics 143. Amer. Math. Soc., Providence, RI.
Knuth, D.E. (1976) Evaluation of Porter's constant. Comput. Math. Appl. 2, 137–139.
Knuth, D.E. (1981) The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 2nd Edition. Addison-Wesley, Reading, Mass.
Knuth, D.E. (1984) The distribution of continued fraction approximations. J. Number Theory 19, 443–448.
Köhler, G. (1980) Some more predictable continued fractions. Monatsh. Math. 89, 95–100.
Koksma, J.F. (1936) Diophantische Approximationen. J. Springer, Berlin.
Kraaikamp, C. (1987) The distribution of some sequences connected with the nearest integer continued fraction. Indag. Math. 49, 177–191.
Kraaikamp, C. (1989) Statistic and ergodic properties of Minkowski's diagonal continued fraction. Theoret. Comput. Sci. 65, 197–212.
Kraaikamp, C. (1990) On the approximation by continued fractions, II. Indag. Math. (N.S.) 1, 63–75.
Kraaikamp, C. (1991) A new class of continued fractions. Acta Arith. 57, 1–39.


Kraaikamp, C. (1993) Maximal S-expansions are Bernoulli shifts. Bull. Soc. Math. France 121, 117–131.
Kraaikamp, C. (1994) On symmetric and asymmetric Diophantine approximation by continued fractions. J. Number Theory 46, 137–157.
Kraaikamp, C. and Liardet, P. (1991) Good approximations and continued fractions. Proc. Amer. Math. Soc. 112, 303–309.
Kraaikamp, C. and Lopes, A. (1996) The theta group and the continued fraction expansion with even partial quotients. Geometriae Dedicata 59, 293–333.
Kraaikamp, C. and Meester, R. (1998) Convergence of continued fraction type algorithms and generators. Monatsh. Math. 125, 1–14.
Kraaikamp, C. and Nakada, H. (2000) On normal numbers for continued fractions. Ergodic Theory and Dynamical Systems 20, 1405–1421.
Kraaikamp, C. and Nakada, H. (2001) On a problem of Schweiger concerning normal numbers. J. Number Theory 86, 330–340.
Krasnoselskii, M. (1964) Positive Solutions of Operator Equations. Noordhoff, Groningen.
Krengel, U. (1985) Ergodic Theorems (with a Supplement by Antoine Brunel). W. de Gruyter, Berlin.
Kuipers, L. and Niederreiter, H. (1974) Uniform Distribution of Sequences. Wiley, New York.
Kurosu, K. (1924) Notes on some points in the theory of continued fractions. Japan J. Math. 1, 17–21. Corrigendum, ibid. 2 (1926), 64.
Kuzmin, R.O. (1928) On a problem of Gauss. Dokl. Akad. Nauk SSSR Ser. A, 375–380. [Russian; French version in Atti Congr. Internaz. Mat. (Bologna, 1928), Tomo VI, 83–89. Zanichelli, Bologna, 1932]
Lagarias, J.C. (1992) Number theory and dynamical systems. In: Burr, S.A. (Ed.) The Unreasonable Effectiveness of Number Theory, 35–72. Proc. Sympos. Appl. Math. 46. Amer. Math. Soc., Providence, RI.


Lang, S. and Trotter, H. (1972) Continued fractions for some algebraic numbers. J. Reine Angew. Math. 255, 112–134. Addendum, ibid. 267 (1974), 219–220.
Lasota, A. and Mackey, M.C. (1985) Probabilistic Properties of Deterministic Systems. Cambridge Univ. Press, Cambridge. [2nd Edition (1994) Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics. Applied Mathematical Sciences 97. Springer–Verlag, New York]
Legendre, A.M. (1798) Essai sur la théorie des nombres. Duprat, Paris. [2ème édition (1808), Courcier, Paris; 3ème édition (1830), Didot, Paris; reprinted (1955), Blanchard, Paris]
Lehmer, D. (1939) Note on an absolute constant of Khintchine. Amer. Math. Monthly 46, 148–152.
Lehner, J. (1994) Semiregular continued fractions whose partial denominators are 1 or 2. In: Abikoff, W. et al. (Eds.) The Mathematical Legacy of Wilhelm Magnus: Groups, Geometry and Special Functions (Brooklyn, NY, 1992), 407–410. Contemporary Mathematics 169. Amer. Math. Soc., Providence, RI.
Lévy, P. (1929) Sur les lois de probabilité dont dépendent les quotients complets et incomplets d'une fraction continue. Bull. Soc. Math. France 57, 178–194.
Lévy, P. (1936) Sur le développement en fraction continue d'un nombre choisi au hasard. Compositio Math. 3, 286–303.
Lévy, P. (1952) Fractions continues aléatoires. Rend. Circ. Mat. Palermo (2) 1, 170–208.
Lévy, P. (1954) Théorie de l'addition des variables aléatoires, 2ème édition. Gauthier-Villars, Paris. (1ère édition 1937)
Liardet, P. and Stambul, P. (2000) Séries de Engel et fractions continues. J. Théor. Nombres Bordeaux 12, 37–68.
Lin, M. (1978) Quasi-compactness and uniform ergodicity of positive operators. Israel J. Math. 29, 309–311.
Lochs, G. (1961) Statistik der Teilnenner der zu den echten Brüchen gehörigen regelmässigen Kettenbrüche. Monatsh. Math. 65, 27–52.


Lochs, G. (1963) Die ersten 968 Kettenbruchnenner von π. Monatsh. Math. 67, 311–316.
Lochs, G. (1964) Vergleich der Genauigkeit von Dezimalbruch und Kettenbruch. Abh. Math. Sem. Hamburg 27, 142–144.
Lorentzen, L. and Waadeland, H. (1992) Continued Fractions with Applications. North-Holland, Amsterdam.
Loynes, R.M. (1965) Extreme values in uniformly mixing stationary stochastic processes. Ann. Math. Statist. 36, 993–999.
Lyons, R. (2000) Singularity of some random continued fractions. J. Theoret. Probab. 13, 535–545.
Mackey, M.C. (1992) Time's Arrow: The Origins of Thermodynamic Behavior. Springer–Verlag, New York.
MacLeod, A.J. (1993) High-accuracy numerical values in the Gauss–Kuzmin continued fraction problem. Comput. Math. Appl. 26, 37–44.
Magnus, W., Oberhettinger, F., and Soni, R.P. (1966) Formulas and Theorems for the Special Functions of Mathematical Physics, 3rd Edition. Springer–Verlag, Berlin.
Marcus, S. (1961) Les approximations diophantiennes et la catégorie de Baire. Math. Z. 76, 42–45.
Marques Henriques, J. (1966) On probability measures generated by regular continued fractions. Gaz. Mat. (Lisboa) 27, no. 103–104, 16–22.
Martin, M.H. (1934) Metrically transitive point transformations. Bull. Amer. Math. Soc. 40, 606–612.
Mayer, D.H. (1987) Relaxation properties of the mixmaster universe. Physics Lett. A 122, 390–394.
Mayer, D. (1990) On the thermodynamic formalism for the Gauss map. Comm. Math. Phys. 130, 311–333.
Mayer, D. (1991) Continued fractions and related transformations. In: Bedford, T. et al. (Eds.) (1991), 175–222.


Mayer, D. and Roepstorff, G. (1987) On the relaxation time of Gauss' continued-fraction map. I. The Hilbert space approach (Koopmanism). J. Statist. Phys. 47, 149–171.
Mayer, D. and Roepstorff, G. (1988) On the relaxation time of Gauss' continued-fraction map. II. The Banach space approach (transfer operator method). J. Statist. Phys. 50, 331–344.
Mazzone, F. (1995/96) A characterization of almost everywhere continuous functions. Real Anal. Exchange 21, no. 1, 317–319.
McKinney, T.E. (1907) Concerning a certain type of continued fractions depending on a variable parameter. Amer. J. Math. 29, 213–278.
Minkowski, H. (1900) Über die Annäherung an eine reelle Grösse durch rationale Zahlen. Math. Ann. 54, 91–124.
Minnigerode, B. (1873) Über eine neue Methode, die Pell'sche Gleichung aufzulösen. Nachr. König. Gesellsch. Wiss. Göttingen Math.-Phys. Kl. 23, 619–652.
Misevičius, G. (1971) Asymptotic expansions for the distribution functions of sums of the form Σ_{j=0}^{n−1} f(T^j t). Ann. Univ. Sci. Budapest Eötvös Sect. Math. 14, 77–92. (Russian)
Misevičius, G. (1981) Estimate of the remainder term in the limit theorem for the denominators of continued fractions. Lithuanian Math. J. 21, 245–253.
Misevičius, G. (1992) The optimal zone for large deviations of the denominators of continued fractions. New Trends in Probability and Statistics (Palanga, 1991), Vol. 2, 83–90. VSP, Utrecht.
Moeckel, R. (1982) Geodesics on modular surfaces and continued fractions. Ergodic Theory and Dynamical Systems 2, 69–83.
Mollin, R.A. (1999) Continued fraction gems. Nieuw Arch. Wiskunde (4) 17, 383–405.
Morita, T. (1994) Local limit theorem and distribution of periodic orbits of Lasota–Yorke transformations with infinite Markov partitions. J. Math. Soc. Japan 46, 309–343. Errata, ibid. 47 (1995), 191–192.


Nakada, H. (1981) Metrical theory for a class of continued fraction transformations and their natural extensions. Tokyo J. Math. 7, 399–426.
Nakada, H. (1990) The metrical theory of complex continued fractions. Acta Arith. 56, 279–289.
Nakada, H. (1995) Continued fractions, geodesic flows and Ford circles. In: Takahashi, Y. (Ed.), Algorithms, Fractals and Dynamics, 179–191. Plenum, New York.
Nakada, H., Ito, Sh., and Tanaka, S. (1977) On the invariant measure for the transformations associated with some real continued fraction. Keio Engrg. Rep. 30, 159–175.
von Neumann, J. and Tuckerman, B. (1955) Continued fraction expansion of 2^{1/3}. Math. Tables Aids Comput. 9, 23–24.
Nolte, V.N. (1990) Some probabilistic results on the convergents of continued fractions. Indag. Math. (N.S.) 1, 381–389.
Obrechkoff, N. (1951) Sur l'approximation des nombres irrationnels par des nombres rationnels. C.R. Acad. Bulgare Sci. 3, no. 1, 1–4.
Olds, C.D. (1963) Continued Fractions. Random House, Toronto.
Pedersen, P. (1959) On the expansion of π in a regular continued fraction. II. Nordisk Mat. Tidskr. 7, 165–168.
Perron, O. (1954, 1957) Die Lehre von den Kettenbrüchen. Band I: Elementare Kettenbrüche; Band II: Analytisch-funktionentheoretische Kettenbrüche. Teubner, Stuttgart. (1st Edition 1913; 2nd Edition 1929)
Petek, P. (1989) The continued fraction of a random variable. Exposition. Math. 7, 369–378.
Petersen, K. (1983) Ergodic Theory. Cambridge Univ. Press, Cambridge.
Pethő, A. (1982) Simple continued fractions for the Fredholm numbers. J. Number Theory 14, 232–236.
Philipp, W. (1967) Some metrical theorems in number theory. Pacific J. Math. 20, 109–127.


Philipp, W. (1970) Some metrical theorems in number theory II. Duke Math. J. 37, 447–458. Errata, ibid. 37, 788.
Philipp, W. (1976) A conjecture of Erdős on continued fractions. Acta Arith. 28, 379–386.
Philipp, W. (1988) Limit theorems for sums of partial quotients of continued fractions. Monatsh. Math. 105, 195–206.
Philipp, W. and Stackelberg, O.P. (1969) Zwei Grenzwertsätze für Kettenbrüche. Math. Ann. 181, 152–156.
Philipp, W. and Stout, W. (1975) Almost Sure Invariance Principles for Partial Sums of Weakly Dependent Random Variables. Mem. Amer. Math. Soc. 161. Amer. Math. Soc., Providence, RI.
Philipp, W. and Webb, G.R. (1973) An invariance principle for mixing sequences of random variables. Z. Wahrsch. Verw. Gebiete 25, 223–237.
von Plato, J. (1994) Creating Modern Probability: Its Mathematics, Physics and Philosophy in Historical Perspective. Cambridge Univ. Press, Cambridge.
van der Poorten, A. and Shallit, J. (1992) Folded continued fractions. J. Number Theory 40, 237–250.
Popescu, C. (1997a) Continued fractions with odd partial quotients: an approach in the spirit of Doeblin. Stud. Cerc. Mat. 49, 107–117.
Popescu, C. (1997b) On the rate of convergence in Gauss' problem for the continued fraction expansion with odd partial quotients. Stud. Cerc. Mat. 49, 231–244.
Popescu, C. (1999) On the rate of convergence in Gauss' problem for the nearest integer continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 257–267.
Popescu, C. (2000) On a Gauss–Kuzmin problem for the α-continued fractions. Rev. Roumaine Math. Pures Appl. 45, 993–1004.
Popescu, G. (1978) Asymptotic behaviour of random systems with complete connections, I, II. Stud. Cerc. Mat. 30, 37–68, 181–215. (Romanian)


Porter, J.W. (1975) On a theorem of Heilbronn. Mathematika 22, 20–28.
Postnikov, A.G. (1960) Arithmetic Modeling of Random Processes. Trudy Mat. Inst. Steklov. 57. Nauka, Moscow. [Russian; English translation Selected Transl. in Math. Statist. and Probab. 13 (1973), 41–122]
Raney, G.N. (1973) On continued fractions and finite automata. Math. Ann. 206, 265–283.
Răuţu, G. and Zbăganu, G. (1989) Some Banach algebras of functions of bounded variation. Stud. Cerc. Mat. 41, 513–519.
Rényi, A. (1957) Representations for real numbers and their ergodic properties. Acta Math. Acad. Sci. Hungar. 8, 477–493.
Richtmyer, R.D. (1975) Continued fraction expansion of algebraic numbers. Adv. in Math. 16, 362–367.
Rieger, G.J. (1977) Die metrische Theorie der Kettenbrüche seit Gauss. Abh. Braunschweig. Wiss. Gesellsch. 27, 103–117.
Rieger, G.J. (1978) Ein Gauss–Kusmin–Lévy–Satz für Kettenbrüche nach nächsten Ganzen. Manuscripta Math. 24, 437–448.
Rieger, G.J. (1979) Mischung und Ergodizität bei Kettenbrüchen nach nächsten Ganzen. J. Reine Angew. Math. 310, 171–181.
Rieger, G.J. (1981a) Ein Heilbronn–Satz für Kettenbrüche mit ungeraden Teilnennern. Math. Nachr. 101, 295–307.
Rieger, G.J. (1981b) Über die Länge von Kettenbrüchen mit ungeraden Teilnennern. Abh. Braunschweig. Wiss. Gesellsch. 32, 61–69.
Rieger, G.J. (1984) On the metrical theory of the continued fractions with odd partial quotients. Topics in Classical Number Theory (Budapest, 1981), Vol. II, 1371–1418. Colloq. Math. Soc. János Bolyai 34. North-Holland, Amsterdam.
Rivat, J. (1999) On the metric theory of continued fractions. Colloq. Math. 79, 9–15.
Rockett, A.M. (1980) The metrical theory of continued fractions to the nearer integer. Acta Arith. 38, 97–103.


Rockett, A.M. and Szüsz, P. (1992) Continued Fractions. World Scientific, Singapore.
Rogers, C.A. (1998) Hausdorff Measures, 2nd Printing, with a Foreword by K. Falconer. Cambridge Univ. Press, Cambridge.
Rosen, D. (1954) A class of continued fractions associated with certain properly discontinuous groups. Duke Math. J. 21, 549–563.
Rousseau-Egèle, J. (1983) Un théorème de la limite locale pour une classe de transformations dilatantes et monotones par morceaux. Ann. Probab. 11, 772–788.
Ruelle, D. (1978) Thermodynamic Formalism. The Mathematical Structures of Classical Equilibrium Statistical Mechanics. Addison-Wesley, Reading, Mass.
Ryll–Nardzewski, C. (1951) On the ergodic theorems. II. Ergodic theory of continued fractions. Studia Math. 12, 74–79.
Šalát, T. (1967) Remarks on the ergodic theory of the continued fractions. Mat. Časopis Sloven. Akad. Vied 17, 121–130.
Šalát, T. (1969) Bemerkung zu einem Satz von P. Lévy in der metrischen Theorie der Kettenbrüche. Math. Nachr. 41, 91–94.
Šalát, T. (1984) On a metric result in the theory of continued fractions. Acta Math. Univ. Comenian. 44–45, 49–53.
Salem, R. (1943) On some singular monotonic functions which are strictly increasing. Trans. Amer. Math. Soc. 53, 427–439.
Samorodnitsky, G. and Taqqu, M.S. (1994) Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman & Hall, New York.
Samur, J.D. (1984) Convergence of sums of mixing triangular arrays of random vectors with stationary rows. Ann. Probab. 12, 390–426.
Samur, J.D. (1985) A note on the convergence to Gaussian laws of sums of stationary ϕ-mixing triangular arrays. Probability in Banach Spaces V (Proceedings, Medford, 1984), 387–399. Lecture Notes in Math. 1153. Springer–Verlag, Berlin.


Samur, J.D. (1987) On the invariance principle for stationary ϕ-mixing triangular arrays with infinitely divisible limits. Probab. Theory Related Fields 75, 245–259.
Samur, J.D. (1989) On some limit theorems for continued fractions. Trans. Amer. Math. Soc. 316, 53–79.
Samur, J.D. (1991) A functional central limit theorem in Diophantine approximation. Proc. Amer. Math. Soc. 111, 901–911.
Samur, J.D. (1996) Some remarks on a probability limit theorem for continued fractions. Trans. Amer. Math. Soc. 348, 1411–1428.
Saulis, L. and Statulevičius, V. (1991) Limit Theorems for Large Deviations. Kluwer, Dordrecht.
Schmidt, A.L. (1975) Diophantine approximation of complex numbers. Acta Math. 134, 1–85.
Schmidt, A.L. (1983) Ergodic theory for complex continued fractions. Monatsh. Math. 93, 39–62.
Schmidt, T.A. (1993) Remarks on the Rosen λ-continued fractions. In: Pollington, A. and Moran, W. (Eds.), Number Theory with an Emphasis on the Markoff Spectrum, 227–238. Marcel Dekker, New York.
Schmidt, W.M. (1960) On normal numbers. Pacific J. Math. 10, 661–672.
Schmidt, W.M. (1980) Diophantine Approximation. Lecture Notes in Math. 785. Springer–Verlag, Berlin.
Schweiger, F. (1969) Eine Bemerkung zu einer Arbeit von S.D. Chatterji. Mat. Časopis Sloven. Akad. Vied 19, 89–91.
Schweiger, F. (1995) Ergodic Theory of Fibred Systems and Metric Number Theory. Clarendon Press, Oxford.
Schweiger, F. (2000a) Kuzmin's theorem revisited. Ergodic Theory and Dynamical Systems 20, 557–565.
Schweiger, F. (2000b) Multidimensional Continued Fractions. Oxford Univ. Press, Oxford.


References Sebe, G.I. (1999) Spectral analysis of the Ruelle operator associated with the topological inﬁnite order chain of the continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 277–291. Sebe, G.I. (2000a) The Gauss–Kuzmin theorem for Hurwitz’s singular continued fraction expansion. Rev. Roumaine Math. Pures Appl. 45, 495–514. Sebe, G.I. (2000b) A two-dimensional Gauss–Kuzmin theorem for singular continued fractions. Indag. Math. (N.S.) 11, 593–605. Sebe, G.I. (2001a) On convergence rate in the Gauss–Kuzmin problem for the grotesque continued fractions. Monatsh. Math. 133, 241–254. Sebe, G.I. (2001b) Gauss’ problem for the continued fraction expansion with odd partial quotients revisited. Rev. Roumaine Math. Pures Appl. 46, 839–852. Sebe, G.I. (2002) A Gauss–Kuzmin theorem for the Rosen fractions. J. Th´or. Nombres Bordeaux 14. e Segre, B. (1945) Lattice points in inﬁnite domains, and asymmetric Diophantine approximation. Duke J. Math. 12, 337–365. Selenius, C.-O. (1960) Konstruktion und Theorie halbregelm¨ssiger a Kettenbr¨che mit idealer relativer Approximationen. Acta Acad. Abo. u Math. Phys. 22, no. 2, 1–75. Sendov, B. (1959/60) Der Vahlensatz uber die singul¨ren Kettenbr¨che ¨ a u und die Kettenbr¨che nach n¨chsten Ganzen. Annuaire Univ. Soﬁa u a Fac. Sci. Phys. Math. Livre 1 Math. 54, 251–258. Seneta, E. (1976) Regularly Varying Functions. Math. 508. Springer–Verlag, Berlin. Lecture Notes in

Series, C. (1982) Non-Euclidean geometry, continued fractions, and ergodic theory. Math. Intelligencer 4, no. 1, 24–31.
Series, C. (1991) Geometrical methods of symbolic coding. In: Bedford, T. et al. (Eds.) (1991), 125–151.
Shallit, J. (1979) Simple continued fractions for some irrational numbers. J. Number Theory 11, 209–217.


Shallit, J.O. (1982a) Simple continued fractions for some irrational numbers, II. J. Number Theory 14, 228–231.
Shallit, J.O. (1982b) Explicit descriptions of some continued fractions. Fibonacci Quart. 20, 77–81.
Shallit, J. (1994) Origins of the analysis of the Euclidean algorithm. Historia Math. 21, 401–419.
Shanks, D. and Wrench, J.W., Jr. (1959) Khintchine's constant. Amer. Math. Monthly 66, 276–279.
Shiu, P. (1995) Computation of continued fractions without input values. Math. Comp. 64, 1307–1317.
Sinai, Ya.G. (1994) Topics in Ergodic Theory. Princeton Univ. Press, Princeton, NJ.
Sloane, N.J.A. and Plouffe, S. (1995) The Encyclopedia of Integer Sequences. Academic Press, San Diego.
Sprindžuk, V.G. (1979) Metric Theory of Diophantine Approximations. Wiley, New York.
Stadje, W. (1985) Bemerkung zu einem Satz von Akcoglu und Krengel. Studia Math. 81, 307–310.
Strassen, V. (1964) An invariance principle for the law of the iterated logarithm. Z. Wahrsch. Verw. Gebiete 3, 211–226.
Sudan, G. (1959) The Geometry of Continued Fractions. Technical Publishing House, Bucharest. (Romanian)
Szüsz, P. (1961) Über einen Kusminschen Satz. Acta Math. Acad. Sci. Hungar. 12, 447–453.
Szüsz, P. (1962) Verallgemeinerung und Anwendungen eines Kusminschen Satzes. Acta Arith. 7, 149–160.
Szüsz, P. (1980) On the length of continued fractions representing a rational number with given denominator. Acta Arith. 37, 55–59.
Szüsz, P. and Volkmann, B. (1982) On Strassen's law of the iterated logarithm. Z. Wahrsch. Verw. Gebiete 61, 453–458.


References Tamura, J. (1991) Symmetric continued fractions related to certain series. J. Number Theory 38, 251–264. Tanaka, S. and Ito, Sh. (1981) On a family of continued-fraction transformations and their ergodic properties. Tokyo J. Math. 4, 153– 175. Thakur, D.S. (1996) Exponential and continued fractions. J. Number Theory 59, 248–261. ¨ Tietze, H. (1913) Uber die raschesten Kettenbruchentwicklungen reeller Zahlen. Monatsh. Math. Phys. 24, 209–242. Tong, J. (1983) The conjugate property of the Borel theorem on Diophantine approximation. Math. Z. 184, 151–153. Tong, J. (1994) The best approximation function to irrational numbers. J. Number Theory 49, 89–94. Tonkov, T. (1974) On the average length of ﬁnite continued fractions. Acta Arith. 26, 47–57. Urban, F.M. (1923) Grundlagen der Wahrscheinlichkeitsrechnung und der Theorie der Beobachtungsfehler. Teubner, Leipzig. Urba´ski, M. (2001) Porosity in conformal inﬁnite iterated function n systems. J. Number Theory 88, 283–312. ¨ Vahlen, K.T. (1895) Uber N¨herungswerthe und Kettenbr¨che. J. Reine a u Angew. Math. 115, 221–233. Vajda, S. (1989) Fibonacci and Lucas Numbers, and the Golden Section: Theory and Applications. E. Horwood, Chichester. Vall´e, B. (1997) Op´rateurs de Ruelle–Mayer g´n´ralis´s et analyse e e e e e des algorithmes d’Euclide et de Gauss. Acta Arith. 81, 101–144. Vall´e, B. (1998) Dynamique des fractions continues ` contraintes e a p´riodiques. J. Number Theory 72, 183–235. e Vall´e, B. (2000) Digits and continuants in Euclidean algorithms. Ere godic versus Tauberian theorems. J. Th´or. Nombres Bordeaux 12, e 531–570.


Vardi, I. (1995) The limiting distribution of the St. Petersburg game. Proc. Amer. Math. Soc. 123, 2875–2882.
Vardi, I. (1997) The St. Petersburg game and continued fractions. C.R. Acad. Sci. Paris Ser. I Math. 324, 913–918.
Veech, V.A. (1982) Gauss measures for transformations on the space of interval exchange maps. Ann. of Math. (2) 115, 201–242.
Vershik, A.M. and Sidorov, N.A. (1993) Arithmetic expansions associated with the rotation of a circle. Algebra i Analiz 5, no. 6, 97–115. (Russian)
Viader, P., Paradis, J., and Bibiloni, L. (1998) A new light on Minkowski's ?(x)-function. J. Number Theory 73, 212–227.
Viswanath, D. (2000) Random Fibonacci sequences and the number 1.13198824… . Math. Comp. 69, 1131–1155.
de Vroedt, C. (1962) Measure-theoretical investigations concerning continued fractions. Indag. Math. 24, 583–591.
de Vroedt, C. (1964) Metrical problems concerning continued fractions. Compositio Math. 16, 191–195.
Wall, H.S. (1948) Analytic Theory of Continued Fractions. Van Nostrand, New York.
Walters, P. (1982) An Introduction to Ergodic Theory. Graduate Texts in Mathematics 79. Springer–Verlag, New York.
Watson, G.N. (1944) A Treatise on the Theory of Bessel Functions, 2nd Edition. Cambridge Univ. Press, Cambridge.
Whittaker, E.T. and Watson, G.N. (1927) A Course of Modern Analysis. Cambridge Univ. Press, Cambridge.
Wiman, A. (1900) Über eine Wahrscheinlichkeitsaufgabe bei Kettenbruchentwickelungen. Öfversicht af Kongl. Svenska Vetenskaps-Akademiens Förhandlingar 57, 829–841.
Wirsing, E. (1974) On the theorem of Gauss–Kusmin–Lévy and a Frobenius type theorem for function spaces. Acta Arith. 24, 507–528.


Wrench, J.W., Jr. (1960) Further evaluation of Khintchine's constant. Math. Comp. 14, 370–371.
Wrench, J.W., Jr. and Shanks, D. (1966) Questions concerning Khintchine's constant and the efficient computation of regular continued fractions. Math. Comp. 20, 444–448.
Zagier, D.B. (1981) Zetafunktionen und quadratische Körper. Eine Einführung in die höhere Zahlentheorie. Springer–Verlag, Berlin-New York.
Zuparov, T.M. (1981) On a theorem from the metric theory of continued fractions. Izv. Akad. UzSSR Ser. Fiz.-Mat. Nauk no. 6, 9–12. (Russian)

Index

Aaronson, J., 311, 339 Abramov's formula, 277 Acosta, A. de, 202 Adams, W.W., 271, 272 Adler, R.L., 9, 244, 307 α-expansion, 281, 344, 345 Alexandrov, A.G., 241 algorithm A, 259 algorithm B, 260 algorithm C, 260 almost Markov property, 335 Alzer, H., 13 approximation coefficient, 27, 263 Araujo, A., 197, 320, 331 arc-sine law, 187; generalization of, 202 array, 325; strictly stationary, 326; strongly infinitesimal (s.i.), 327 associated random variables, 15; extended, 34 automorphism, 219 Babenko, K.I., 103, 109, 111, 113, 336 backward continued fraction (BCF) expansion, 307 Bagemihl, F., 30 Bailey, D.H., 231, 233, 241 Barbolosi, D., 249, 264 Barndorff–Nielsen, O., 176 Barrionuevo, J., 346 Berechet, A., xiii, 151 Bernstein, F.; F. Bernstein's theorem, 49, 174 Bibiloni, L., 238 Billingsley, P., 36, 180, 187, 221, 224, 257, 320, 334, 343 Birkhoff's individual ergodic theorem, 221 Borel sets, 314 Borel, É., 22, 30, 243, 337 Borwein, J.M., 231, 233, 241 Bosma, W., 249, 251, 252, 260, 281, 288, 293–296, 298, 299, 343 boundary, 315 bounded essential variation, 55 bounded p-variation, 75 Boyarski, A., 58, 221, 223 Bradley, R.C., 326 Breiman, L., 253 Brezinski, C., xii Brjuno, A.D., 12, 241 Brodén, T., 22, 336, 337 Brodén–Borel–Lévy formula, 21; generalized, 37 Burton, R.M., 344, 345 Cassels, J.W.S., 343 Champernowne, D.G., 243 characteristic function, 316 Choong, K.Y., 241 Chudnovsky, D.V., 13

Chudnovsky, G.V., 13 Clemens, L.E., 12 conditional probability measures, 36 continuant, 5 continued fraction (CF), 260 continued fraction digits, 4 continued fraction expansion, 4 continued fraction expansion for e, 12 continued fraction expansion for π, 13 continued fraction transformation, 2; natural extension of, 25 continued fraction with even incomplete quotients (Even CF) expansion, 264 continued fraction with odd incomplete quotients (Odd CF) expansion, 264 convolution, 316 Corless, R.M., 334 Cornfeld, I.P., 221 Crandall, R.E., 231, 233, 241 Dajani, K., 250, 300, 303–305, 310, 311, 337, 341, 343, 345, 346 Daudé, H., 111, 130, 134 Davison, J.L., 13 Daykin, D.E., 241 Denjoy, A., 156, 163, 337 dependence coefficients, 325 dependence with complete connections, 23, 234 diagonal continued fraction (DCF) expansion, 289 Diamond, H.G., 235, 239, 240 digamma function ψ, 145

Diophantine approximation, 29; fundamental theorem of, 257 Dixon, J.D., 334 δ-mixing, 326 Doeblin, W., xi, 22, 33, 99, 204, 252, 335, 337–340 Doeblin–Lenstra conjecture, 252 Doob, J.L., 31 Doukhan, P., 327 Duren, P.L., 102 Dürner, A., 34, 337 dynamical system, 219 Elsner, C., 13 Elton, H.J., 253 endomorphism, 219 entropy, 257, 277 Euclid's algorithm, 1, 2 Euler, L., 5, 12 Faivre, C., 9, 101, 130, 249, 334, 336, 343 Falconer, K.J., 233, 234 Farey continued fraction (FCF) expansion, 303 Feller, W., 238 f-expansion, 346; with dependent digits, 346 Fieldsteel, A., 343 Flajolet, P., 111, 130, 134 Flatto, L., 307 Fluch, W., 134 Fortet, R., 335 Fourier transform, 316 Fujiwara, M., 30 fundamental interval, 18 Gál, I.S., 221, 340 Góra, P., 58, 221, 223 Galambos, J., 173, 174 Gauss, C.F., x, 15

Gauss–Kusmin–Lévy theorem: 'exact', 111, 125; L²-version, 123 Gauss' measure, 16; extended, 26 Gauss' Problem, 15; Babenko's solution to, 101f; Paul Lévy's solution to, 39f; Wirsing's solution to, 79f Gauss' problem for τ̄, 246 geodesic flow, 9 Giné, E., 197, 320, 331 Gordin, M.I., 216, 327 Gröchenig, K., 344 Gray, J.J., 16 Grigorescu, S., 23, 33, 62, 168, 193, 253, 334, 346 Grothendieck, A., 105 Gyldén, H., 336 Haan, L. de, 174 Haas, A., 344 Halmos, P.R., 320 Hardy, G.H., 11 Harman, G., 233 Hartman, S., 238 Hartono, Y., 264 Hausdorff dimension, 233 Hausdorff measure, 233 Heilbronn, H., 334 Heinrich, H., 203, 335 Hennion, H., 335 Hensley, D., 2, 103, 194, 234, 252, 334, 336 Heyde, C.C., 188, 214 Hofbauer, F., 193 Hoffmann–Jørgensen, J., 320 Hurwitz, A., 263, 264, 288, 298 Ibragimov, I.A., 71, 72, 334

infinite-order chain, 33 insertion, 300 Ionescu Tulcea, C.T., 335 Iosifescu, M., 23, 33, 62, 64, 147, 151, 168, 173, 178, 179, 183, 193, 204, 334–337, 345, 346 isomorphism, 222 iterated function systems, 234 Ito, Sh., 273, 281, 302, 344, 345 Jager, H., 30, 249, 251, 252, 271–273, 281, 288, 298, 340, 341 Jain, N.C., 215, 216 Jarník, V., 234 Jenkinson, O., 234 Jogdeo, K., 215, 216 Jones, W.B., xii Jur'ev, S.P., 113 Kac, M., 334 Kakeya, S., 346 Kalpazidou, S., 345 Kamae, T., 221 Kanwal, R.P., 105 Karamata theorem, 321 Katznelson, Y., 221 K-automorphism, 223 Keane, M.S., 221, 244 Keller, G., 193 Khin(t)chin(e), A.Ya., 16, 204, 231, 257, 334, 339, 340 Knopp, K., 339 Knuth, D.E., 2, 92, 101, 333, 334 Köhler, G., 13 Koksma, J.F., 221, 334, 340 Kolmogorov, A.N., 337 Kraaikamp, C., 30, 250, 251, 264, 273, 278, 286–290, 294, 296,

299, 300, 303–305, 310, 311, 337, 341–346 Krasnoselskii, M., 128 Kurosu, K., 287 Kuzmin, R.O., 16 Lagarias, J.C., 238 Lagrange, J.-L., 11 Lamé, G., 2 Lang, S., 241 Laplace, P.S., 15 Lasota, A., 58, 220 λ-continued fraction (λ-CF) expansion, 344 Law of the iterated logarithm: Chung's, 215; classical, 213; Strassen's, 213, 216 Legendre constants, 273 Legendre's theorem, 20 Lehmer, D., 231 Lehner continued fraction (LCF) expansion, 300 Lehner, J., 300, 302 Lenstra, H.W., 252 LeVeque, J., 30 Lévy–Cramér continuity theorem, 316 Lévy–Khinchin representation, 317 Lévy measure, 317 Lévy, Paul, 16, 22, 39, 256, 271, 334, 340, 342 Liardet, P., 340, 341 Linnik, Yu.V., 71, 72, 334 Lochs, G., 334, 342 Lopes, A., 264 Lorenzen, L., xii Loynes, R.M., 174 Mackey, M.C., 58, 220

MacLeod, A.J., 111, 119 Magnus, W., 105, 107 Marinescu, G., 335 Martin, M.H., 339 matrix approach, 7 Mayer, D.H., 59, 103, 109, 111, 120, 127, 130, 194, 336 Mazzone, F., 315 McKinney, T.E., 281 McLaughlin, J.R., 30 measurable space, 313 measure, 314 mediant convergents, 301 Merrill, K.D., 12 Minnigerode, B., 263 Misevičius, G., 193, 194, 335 Möbius transformation, 7 Moeckel, R., 340, 341 Morita, T., 195 'Mother of all SRCF expansions', 301 Nakada, H., 9, 271, 281, 283, 285, 311, 334, 340, 343–345 nearest integer continued fraction (NICF) expansion, 263 Neumann, J. von, 244 Nolte, V.N., 227, 229, 341 normal continued fraction number, 243 normal number, 243 number normal in base b, 243 Oberhettinger, F., 105, 107 Obrechkoff, N., 30 Olds, C.D., xii 1–block, 258 Operator: Mayer–Ruelle, 130; generalization of, 134

nuclear of order 0 (of trace class), 105; trace of, 105; Perron–Frobenius, 57, 58; transition, 65 optimal continued fraction (OCF) expansion, 293 Paradis, J., 238 Pedersen, P., 231 Perron, O., xii, 11, 261, 288, 289, 333 Petek, P., 64 Petersen, K., 221, 223, 276, 277 Pethő, A., 13 Philipp, W., 34, 173, 174, 176, 181, 215, 216, 230, 239, 256, 334, 337–340 Plato, J. von, xii, 337 Poisson probability, 317; τ-centered, 317 Pollicott, M., 234 Poorten, A. van der, 13 Popescu, C., 180, 281, 345 Popescu, G., 253 Porter, J.W., 333 Postnikov, A.G., 244 preservation area, 269 probability, 314; infinitely divisible, 317; stable, 317; order of, 318; strictly stable, 318 probability space, 314 Prokhorov metric, 315 Pruitt, W.E., 215 ψ-mixing coefficient, 43 quadratic irrationality, 11 random variable (r.v.), 313

independent, 316; P-distribution of, 314 Raney, G.N., 9 Rathbone, C.R., 241 Răuţu, G., 56 (regular) continued fraction (RCF), 3, 4; convergents of [= (RCF) convergents], 4; digits of, 4; asymptotic relative digit frequencies, 225; asymptotic relative frequencies of digits between two given values, 226; asymptotic relative frequencies of digits exceeding a given value, 227; asymptotic relative m-digit block frequencies, 226; incomplete (partial) quotients of, 4; extended, 31 regularly varying function, 321; index of, 321 Reznik, M.H., 216 Riauba, R., 335 Richtmyer, R.D., 12, 241 Rieger, G.J., 264, 272, 284, 345 Rockett, A.M., xii, 284, 334, 345 Roeder, D.W., 12 Roepstorff, G., 109, 111, 120, 127, 336 Rogers, C.A., 233 Rosen continued fraction expansion, 344 Rosen, D., 344 Rousseau-Egèle, J., 193 Ruelle, D., 130 Ryll–Nardzewski, C., 340

Šalát, T., 233 Salem, R., 238 σ-algebra, 313 Samorodnitsky, G., 320 Samur, J.D., 79, 99, 188, 197, 211, 324, 327–331, 337, 338 Saulis, L., 335 Schmidt, T.A., 344 Schmidt, W.M., xii, 343 Schweiger, F., 264, 343 S-convergent, 270 Scott, D.J., 188, 214 Sebe, G.I., 344, 345 Segre, B., 30 Selenius, C.-O., 260 semi-regular continued fraction (SRCF) expansion, 261; closest, 259; fastest, 259, 294 Sendov, B., 30, 287 Seneta, E., 321, 322 Series, C., 9 S-expansion, 267 Shallit, J.O., 2, 13, 241 Shanks, D., 231, 232, 241 Shiu, P., 241 Sinai, Ya.G., 334 singular continued fraction (SCF) expansion, 264 singularization, 258, 265 singularization area, 267; maximal, 269 singularization process, 265 skew product, 223; Jager and Liardet's, 341 Skorohod metric d0, 319 slowly varying function, 321; representation theorem, 321 Smorodinsky, M., 244 Soni, R.P., 105, 107

spectral radius, 95 Sprindžuk, V.G., xii St. Petersburg game, 238 Stackelberg, O.P., 216 Stadje, W., 55 Statulevičius, V.A., 335 Stout, W.F., 181, 215, 216 Strassen, V., 213, 216 Sudan, G., xii Szüsz, P., xii, 16, 30, 334, 335, 339 Tamura, J., 13 Tanaka, S., 281, 344, 345 Taqqu, M.S., 320 Taylor, S.J., 215, 216 Thakur, D.S., 13 Theodorescu, R., 168 Thron, W.J., xii Tietze, H., 261 Tong, J., 30, 31 Tonkov, T., 334 transformation, 219; ergodic, 220; exact, 220; measure preserving, 219; natural extension of, 222; non-singular, 219; strongly mixing, 220 Trotter, H., 241 Tuckerman, B., 244 UB Conjecture, 139 Urban, F.M., 334 Uspensky, J.V., 16 Vaaler, J.D., 235, 239, 240 Vahlen, K.T., 28 Vallée, B., 111, 130, 134, 135, 194 Vardi, I., 238 Viader, P., 238 Volkmann, B., 339

Vroedt, C. de, 340 Waadeland, H., xii Wall, H.S., xii Walters, P., 221 Watson, G.N., 104, 227 weak convergence, 315 Webb, G.R., 339 Weiss, B., 221 Whittaker, E.T., 227 Wiedijk, F., 249, 251, 252, 281, 288, 298 Wiener measure, 319 Wiman, A., 336, 337 Wirsing, E., 16, 83, 91, 92, 113, 336 Wrench, J.W., 231, 232 Wright, E., 11 Zagier, D.B., 308 Zbăganu, G., 56 Zuparov, T.M., 338

