
The conservative matrix field

Ofir David
arXiv:2303.09318v1 [math.GM] 15 Mar 2023

Abstract
We provide a more accessible approach to Apéry’s proof that the Riemann zeta function at 3 is
irrational. To achieve this, we introduce a new structure called the conservative matrix field, which
facilitates the proof and can be applied to other mathematical constants, such as e, π, ln (2), in order to
study their properties. The results obtained in this paper not only offer a more accessible proof of Apéry’s
theorem, but also pave the way for further research and discovery in this field and relate it to other fields
in number theory.

1 Introduction
The Riemann zeta function $\zeta(s)$ is a complex-valued function that plays a crucial role in mathematics. It is
defined as $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$ for complex numbers $s$ with $\mathrm{Re}(s) > 1$ and can be extended analytically to all of
$\mathbb{C}$ with a simple pole at $s = 1$. In particular, the values $\zeta(d)$ for integers $d \geq 2$ have significant implications
in a number of areas, e.g. $\frac{1}{\zeta(d)}$ is the probability that a random integer is not divisible by $m^d$
for any integer $m \geq 2$. While the even evaluations $\zeta(2d)$ are well understood, with $\zeta(2) = \frac{\pi^2}{6}$ and more
generally $\frac{\zeta(2d)}{\pi^{2d}}$ being rational numbers, the behavior of the odd evaluations $\zeta(2d+1)$ remains largely unknown.
One of the main results about these odd evaluations came in 1978, when Apéry showed that ζ (3) is irrational
[1]. While Apéry’s proof was complicated, subsequent attempts were made to explain and simplify it, see
for example van der Poorten [13], and others tried to reprove it altogether, e.g. Beukers in [3]. Subsequent
research [12, 14] has also shown that there are infinitely many odd integers for which ζ (2d + 1) is irrational,
and in particular at least one of ζ (5) , ζ (7) , ζ (9), and ζ (11) is irrational.
The aim of this paper is to provide a clearer proof of Apéry’s theorem through the creation of a novel
mathematical structure referred to as the conservative matrix field. This structure will allow us to under-
stand Apéry’s original proof and provide a framework for studying other natural constants such as ζ (2) , π,
and e, with the potential to uncover new relationships and properties among them. Moreover, while we will
mainly work over the integers, this structure seems to have natural generalizations for general metric fields.

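The conservative matrix field structure is based on generalized continued fractions, which are number
presentations of the form
\[
\mathbb{K}_{1}^{\infty} \frac{b_k}{a_k} := \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \ddots}}}, \qquad a_i, b_i \in \mathbb{C},
\]
namely, the limit, if it exists, of the convergents defined by
\[
\frac{p_n}{q_n} = \mathbb{K}_{1}^{n-1} \frac{b_k}{a_k} = \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots + \cfrac{b_{n-1}}{a_{n-1} + 0}}}.
\]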

Their far better-known cousins, the simple continued fractions where bk = 1 and ak ≥ 1 are integers,
have been studied extensively and are connected to many research areas in mathematics and beyond. In
particular, the original goal of these continued fractions was to find the “best” rational approximations for a
given irrational number, which are given by the convergents defined above.

While irrational numbers have a unique simple continued fraction expansion (and rationals have two
expansions), there can be many presentations in the generalized version (more details in Section 2). The
uniqueness in the simple continued fraction expansion allows us to extract a lot of information from the
expansion, and while we lose this property, what we gain is the option to find “nice” generalized continued
fractions which are easier to work with. In particular, we are interested in polynomial continued fractions
where ak = a (k) , bk = b (k) with a, b ∈ Z [x].
For example, in the $\zeta(3)$ case, the simple continued fraction is
\[
\zeta(3) = [1; 4, 1, 18, 1, 1, 1, 4, 1, 9, \ldots] = 1 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{18 + \cfrac{1}{1 + \ddots}}}},
\]
where the coefficients 1, 4, 1, 18, 1, ... don’t seem to have any usable pattern. However, it has a much simpler
generalized continued fraction form
\[
\zeta(3) = \cfrac{1}{1 + \mathbb{K}_{1}^{\infty} \frac{-i^6}{i^3 + (1+i)^3}} = \cfrac{1}{1 - \cfrac{1^6}{1^3 + 2^3 - \cfrac{2^6}{2^3 + 3^3 - \cfrac{3^6}{3^3 + 4^3 - \ddots}}}},
\]
where the convergents in this expansion are the standard approximations $\sum_{k=1}^{n} \frac{1}{k^3}$ for $\zeta(3)$. Moreover, this
abundance of presentations allows us to find many presentations for $\zeta(3)$ which can be combined together to
find a “good enough” presentation where the convergents converge fast enough to prove that $\zeta(3)$ is irrational.
In particular, in Apéry’s original proof, and in ours, we eventually show that
\[
\zeta(3) = \cfrac{6}{5 + \mathbb{K}_{1}^{\infty} \frac{-k^6}{17\left(k^3 + (1+k)^3\right) - 12\left(k + (1+k)\right)}}.
\]
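To get a feeling for how quickly this last expansion converges, the following minimal sketch evaluates the truncations $6 / (5 + \mathbb{K}_1^n \frac{b(k)}{a(k)})$ with $b(k) = -k^6$ and $a(k) = 17(k^3 + (1+k)^3) - 12(k + (1+k))$ in exact rational arithmetic. The function name `apery_convergent` and the double-precision reference value `ZETA3` are assumptions made for this illustration and are not part of the paper.

```python
from fractions import Fraction

# Reference value of zeta(3) in double precision (assumed known, used only for comparison).
ZETA3 = 1.2020569031595942

def apery_convergent(n):
    """Evaluate 6 / (5 + K_1^n b(k)/a(k)) exactly, from the innermost level outward."""
    tail = Fraction(0)
    for k in range(n, 0, -1):
        a_k = 17 * (k**3 + (k + 1)**3) - 12 * (k + (k + 1))
        tail = Fraction(-k**6) / (a_k + tail)
    return Fraction(6) / (5 + tail)

if __name__ == "__main__":
    for n in range(5):
        approx = apery_convergent(n)
        # Print the truncation level, the exact fraction and its distance to ZETA3.
        print(n, approx, abs(float(approx) - ZETA3))
```

Each additional level should shrink the printed error by roughly three orders of magnitude, in contrast with the partial sums $\sum 1/k^3$, whose error only decays polynomially.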

The irrationality proof uses a very elementary argument (see Section 3) that shows that if $\frac{p_n}{q_n} \to L$
where $\frac{p_n}{q_n}$ are reduced rational numbers with $|q_n| \to \infty$, and $\left|L - \frac{p_n}{q_n}\right| = o\left(\frac{1}{|q_n|}\right)$, then $L$ must be irrational.
Moreover, we can measure how irrational $L$ is by looking for $\delta > 0$ such that $\left|L - \frac{p_n}{q_n}\right| \sim \frac{1}{|q_n|^{1+\delta}}$. The main
object of this study, the conservative matrix field defined in Section 5, is an algebraic object that collects
infinitely many related such approximations $\frac{p_{n,m}}{q_{n,m}}$ arranged on the integer lattice in the positive quadrant.

Computing $\delta_{n,m}$ for each approximation, namely
\[
\delta_{n,m} = -1 - \frac{\ln \left|L - \frac{p_{n,m}}{q_{n,m}}\right|}{\ln |q_{n,m}|}
\]
where the rational $\frac{p_{n,m}}{q_{n,m}}$ is reduced, and plotting them as a heat map we get the following

Figure 1: (Figure by Rotem Elimelech) The gradient color from red→white→blue corresponds to
$\delta_{n,m} = -1 - \log_{|q_{n,m}|} \left|L - \frac{p_{n,m}}{q_{n,m}}\right|$ going from positive→zero→negative.

As we shall see, the X-axis and Y-axis correspond more or less to the standard approximations of $\zeta(3)$,
namely $\sum_{k=1}^{n} \frac{1}{k^3}$, which do not converge fast enough to show irrationality, while on the diagonal we get the
expansion mentioned above used by Apéry to prove the irrationality.

The conservative matrix field structure is not only a way to understand Apéry’s original proof, but seems
to have a much broader range of applications. Generalized continued fractions, and in particular polynomial
continued fractions, have been studied in many works (see for example [2, 6, 8, 7]). This paper originated in the
Ramanujan machine project [10] which aimed to find simple polynomial continued fraction presentations of
interesting mathematical constants using computer automation. With the goal of trying to prove many of
the conjectures discovered by the computer, and along the way understand Apéry’s proof, this conservative
matrix field structure was found. These computer conjectures suggest that there is still much to be explored
in this field and that this new structure is just a step towards a deeper understanding of mathematical
constants and their relations.

1.1 Structure of the paper


This paper is divided into two parts. In Part I we mainly go over basic and elementary results about generalized
continued fractions and their Möbius transformation generalizations. Many of the results there are
either known or generalizations and reformulations of known results in the context of polynomial continued
fractions. This part is mainly here to make this paper self-contained, to introduce the setting of the
wider world in which the conservative matrix field lives, and to take a slightly deeper look into the tools of this world.
In Part II we define the conservative matrix field and show how to use it to prove the irrationality of ζ (3).

More specifically, as generalized continued fraction expansions are much less known than their simple
versions (where the denominators are positive integers and the numerators are 1), we begin in Section 2 by going
over the definitions, notations and some properties of these generalized continued fractions. In particular,
while some of the results about simple continued fractions do not hold for their generalized versions, one of
the main tools that we do gain is the possibility to move from infinite sums to generalized continued fractions
and back, via Euler’s conversion, which we describe in Section 2.2.
From this point on, since the main focus of this paper is the generalized continued fractions (and even
more generalized versions of them), we will simply call them continued fractions, and we will always add the
“simple” adjective when referring to simple continued fractions.
As with the simple continued fractions, our new continued fraction presentation is also closely connected
to rational approximations, and in Section 3 we show how these approximations can be used to show that a
given number is irrational. However, not every such presentation is enough, even if the number is irrational,
and in order to find better and better presentations, we move from these generalized continued fractions to
an even more generalized form. It is well known that we can use Möbius transformations to represent and
study simple continued fractions, and as we shall see the same holds for generalized continued fractions. As
these Möbius transformations are described by products of 2 × 2 matrices, we are naturally led to study general
products of 2 × 2 matrices, and the corresponding Möbius transformations. This is done in Section 4, where
two of the main goals are to understand how two such presentations relate to one another, and in particular
what happens when one presentation arises from continued fractions.

In Section 5 we collect many continued fractions into the single object of conservative matrix field, and
study its properties. In particular, we describe how to construct many examples of such matrix fields, related
to interesting mathematical constants like ζ (2) , ζ (3) , e, π etc. Finally in Section 6 we apply the results found
so far to reprove that ζ (3) is irrational.
We then end the paper in Section 7 with a discussion about several directions which can generalize this
matrix field structure, and possibly connect it to many other research areas.

Part I
Introduction to generalized continued fraction
2 Definitions and examples
2.1 The definitions
We start with a generalization of the simple continued fractions, which are, unsurprisingly, called generalized
continued fractions. These can be defined over any topological field, though here we focus on the complex
field with its standard Euclidean metric, and more specifically on the case where the numerators and denominators are
integers.
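Definition 1. Let $a_n, b_n$ be sequences of complex numbers. We will write
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} := \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \ddots + \cfrac{b_n}{a_n + 0}}}},
\]
which are in $\mathbb{C} \cup \{\infty\}$, and denote the limit, if it exists, as
\[
\mathbb{K}_{1}^{\infty} \frac{b_i}{a_i} := \lim_{n \to \infty} \mathbb{K}_{1}^{n} \frac{b_i}{a_i}.
\]
If $\alpha = \mathbb{K}_{1}^{\infty} \frac{b_i}{a_i}$, then we will say that $\mathbb{K}_{1}^{\infty} \frac{b_i}{a_i}$ is a continued fraction presentation of $\alpha$.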

In the simple continued fraction presentations the sequence $b_i$ is the constant 1 sequence, while the
$a_i$ are positive integers, in which case we usually write
\[
[a_0; a_1, a_2, \ldots, a_n] = a_0 + \mathbb{K}_{1}^{n} \frac{1}{a_i}, \qquad
[a_0; a_1, a_2, \ldots] := a_0 + \mathbb{K}_{1}^{\infty} \frac{1}{a_i}.
\]
This simple continued fraction presentation of numbers enjoys several interesting properties. In particular,
the limits above always converge, and the continued fraction is rational if and only if its expansion is finite.
Moreover, every real number can be written as a simple continued fraction, where irrational numbers have a unique
presentation, and every rational has exactly two presentations (this is because $\frac{1}{n} = \cfrac{1}{(n-1) + \frac{1}{1}}$). These $a_i$ can
be retrieved by applying the Euclidean division algorithm, as can be seen in the example below:

Algorithm 1 Finding the simple continued fraction of $\frac{15}{11}$ using the Euclidean division algorithm
15 = 1 · 11 + 4
11 = 2 · 4 + 3
4 = 1 · 3 + 1
3 = 3 · 1 + 0
\[
\frac{15}{11} = 1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3}}} = [1; 2, 1, 3]
\]
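The division steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under the assumption that the input is a rational $p/q$ with $q > 0$; the function names `simple_continued_fraction` and `from_simple_cf` are hypothetical and chosen for this example.

```python
from fractions import Fraction

def simple_continued_fraction(p, q):
    """Return [a_0, a_1, ...] for p/q (q > 0) via repeated Euclidean division."""
    coeffs = []
    while q != 0:
        a, r = divmod(p, q)   # p = a*q + r with 0 <= r < q
        coeffs.append(a)
        p, q = q, r
    return coeffs

def from_simple_cf(coeffs):
    """Rebuild the rational number [a_0; a_1, ..., a_n] from its coefficients."""
    value = Fraction(coeffs[-1])
    for a in reversed(coeffs[:-1]):
        value = a + 1 / value
    return value

if __name__ == "__main__":
    print(simple_continued_fraction(15, 11))   # coefficients of 15/11
    print(from_simple_cf([1, 2, 1, 3]))        # back to the rational number
    print(from_simple_cf([1, 2, 1, 2, 1]))     # the second presentation of the same rational
```

The last call illustrates the remark that every rational has exactly two simple presentations: replacing the final coefficient 3 by 2, 1 does not change the value.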

However, while we have an algorithm to find the (almost) unique sequence ai , in general they can be very
complicated without any known patterns, even for “nice” numbers, for example:

π = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, 1, 15, 3, 13, ...].


When moving to generalized continued fractions, even when we assume that both $a_i$ and $b_i$ are integers,
we lose the uniqueness property, and the rational if and only if finite property. What we gain in return are
more presentations for each number, where some of them can be much simpler to use. For example, $\pi$ can
be written as
\[
\pi = 3 + \mathbb{K}_{1}^{\infty} \frac{(2n-1)^2}{6}.
\]
We want to study these presentations, and (hopefully) use them to show interesting properties, e.g. prove
irrationality for certain numbers.
Definition 2. Let $a_n, b_n$ be sequences of integers. In this case the $\mathbb{K}_{1}^{n} \frac{b_i}{a_i}$ are rational numbers (when they
are defined, and not infinity). If they converge, then we call them the convergents for that generalized
continued fraction.
One of the main tools used to study simple continued fractions is Möbius transformations. Recall
that given a 2 × 2 invertible matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ over the complex numbers and a complex number $z$, the
Möbius action is defined by
\[
M(z) = \frac{az + b}{cz + d}.
\]
In other words, we apply the standard matrix multiplication $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} z \\ 1 \end{pmatrix} = \begin{pmatrix} az + b \\ cz + d \end{pmatrix}$ and project it onto $\mathbb{P}^1\mathbb{C}$ by
dividing the x-coordinate by the y-coordinate.


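By this definition, it is easy to see that
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots + \cfrac{b_n}{a_n + 0}}}
= \begin{pmatrix} 0 & b_1 \\ 1 & a_1 \end{pmatrix} \begin{pmatrix} 0 & b_2 \\ 1 & a_2 \end{pmatrix} \cdots \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix} (0).
\]
This presentation allows us to show an interesting recurrence relation on the numerators and denominators
of the convergents, which generalizes the well known recurrence on simple continued fractions.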
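Lemma 3. Let $a_n, b_n$ be sequences of integers. Define $M_n = \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix}$ and set $\begin{pmatrix} p_n \\ q_n \end{pmatrix} = \left(\prod_{i=1}^{n-1} M_i\right) \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then
$\frac{p_n}{q_n} = \left(\prod_{i=1}^{n-1} M_i\right)(0) = \mathbb{K}_{1}^{n-1} \frac{b_i}{a_i}$ are the convergents of the generalized continued fraction presentation.
Moreover, we have that $\begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix} = \prod_{i=1}^{n-1} M_i$, implying the same recurrence relation on $p_n$ and $q_n$ given by
\[
p_{n+1} = a_n p_n + b_n p_{n-1}, \qquad
q_{n+1} = a_n q_n + b_n q_{n-1},
\]
with starting condition $p_0 = 1, p_1 = 0$ and $q_0 = 0, q_1 = 1$.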


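Proof. Our definition of $\begin{pmatrix} p_n \\ q_n \end{pmatrix} = \left(\prod_{i=1}^{n-1} M_i\right) \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ simply gives us the right column of $\prod_{i=1}^{n-1} M_i$. Since $M_i \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we see that
\[
\left(\prod_{i=1}^{n-1} M_i\right) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \left(\prod_{i=1}^{n-2} M_i\right) \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} p_{n-1} \\ q_{n-1} \end{pmatrix} \qquad \forall n \geq 2,
\]
and for $n = 1$ the product is empty, so that $\underbrace{\left(\prod_{i=1}^{0} M_i\right)}_{= \mathrm{Id}} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} p_0 \\ q_0 \end{pmatrix}$. Together we have that
\[
\prod_{i=1}^{n-1} M_i = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix} \qquad \forall n \geq 1.
\]
From this equation we get the matrix recurrence $\begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix} M_n = \begin{pmatrix} p_n & p_{n+1} \\ q_n & q_{n+1} \end{pmatrix}$, implying the same recurrence
on $p_n, q_n$.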
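The recurrence of Lemma 3 is easy to test numerically. The sketch below, written under the assumption that the coefficients are given as Python callables `a(k)` and `b(k)`, compares the recurrence with a direct evaluation of the finite continued fraction from its innermost level outward. The names `convergents` and `evaluate_bottom_up` are hypothetical, and the example uses the presentation $\pi = 3 + \mathbb{K}_1^\infty \frac{(2n-1)^2}{6}$ quoted earlier.

```python
import math
from fractions import Fraction

def convergents(a, b, n):
    """Return [(p_1, q_1), ..., (p_n, q_n)] using p_{k+1} = a_k p_k + b_k p_{k-1}."""
    p_prev, p_cur = 1, 0   # p_0, p_1
    q_prev, q_cur = 0, 1   # q_0, q_1
    out = [(p_cur, q_cur)]
    for k in range(1, n):
        p_prev, p_cur = p_cur, a(k) * p_cur + b(k) * p_prev
        q_prev, q_cur = q_cur, a(k) * q_cur + b(k) * q_prev
        out.append((p_cur, q_cur))
    return out

def evaluate_bottom_up(a, b, n):
    """Evaluate K_1^n b_i/a_i directly, starting from the innermost fraction."""
    value = Fraction(0)
    for i in range(n, 0, -1):
        value = Fraction(b(i)) / (a(i) + value)
    return value

# Coefficients of the presentation of pi - 3 (illustrative choice).
a_pi = lambda k: 6
b_pi = lambda k: (2 * k - 1) ** 2

if __name__ == "__main__":
    for n, (p, q) in enumerate(convergents(a_pi, b_pi, 8), start=1):
        print(n, p, q, 3 + p / q)
    print(math.pi)
```

Note the index shift in the lemma: the pair $(p_{n+1}, q_{n+1})$ corresponds to the truncation $\mathbb{K}_1^n$.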

2.2 Euler’s formula
For any number, the Euclidean division algorithm can be used to find its simple continued fraction expansion.
With generalized continued fractions, we don’t have a unique presentation anymore, so there isn’t a single
algorithm to find an expansion. However, this allows us to look for a suitable presentation which
is easier to work with, and maybe move between such presentations (which will be one of our main tools
when trying to reprove Apéry’s theorem).
One of the most elementary and useful continued fraction presentations was introduced by Euler, who found
a way to convert standard finite sums (and their infinite sum limits) to generalized continued fractions.
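Theorem 4 (Euler’s formula). Let $r_i \in \mathbb{C}$ for $i \geq 1$. Then
\[
1 + r_1 + r_1 r_2 + \cdots + r_1 \cdots r_n = \sum_{k=0}^{n} \left(\prod_{i=1}^{k} r_i\right) = \cfrac{1}{1 + \mathbb{K}_{1}^{n} \frac{-r_i}{1 + r_i}}.
\]
By taking the limit (if it exists), we have that
\[
\mathbb{K}_{1}^{\infty} \frac{-r_i}{1 + r_i} = \frac{1}{\sum_{k=0}^{\infty} \prod_{i=1}^{k} r_i} - 1.
\]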

Proof. Standard induction.


For more details and applications of this formula, the reader is referred to [6].
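As a sanity check of Euler’s formula, the following sketch compares the finite sum $1 + r_1 + r_1 r_2 + \cdots + r_1 \cdots r_n$ with $1 / (1 + \mathbb{K}_1^n \frac{-r_i}{1+r_i})$ in exact arithmetic. The choice $r_i = 1/i$ (which produces the partial sums of $\sum 1/k!$) and the function names are assumptions made only for this example.

```python
from fractions import Fraction

def euler_sum(r, n):
    """1 + r_1 + r_1 r_2 + ... + r_1 ... r_n as an exact fraction."""
    total, product = Fraction(1), Fraction(1)
    for i in range(1, n + 1):
        product *= r(i)
        total += product
    return total

def euler_cf(r, n):
    """1 / (1 + K_1^n (-r_i)/(1 + r_i)), evaluated from the innermost level outward."""
    tail = Fraction(0)
    for i in range(n, 0, -1):
        tail = -r(i) / (1 + r(i) + tail)
    return 1 / (1 + tail)

# Illustrative choice of the sequence r_i.
r_example = lambda i: Fraction(1, i)

if __name__ == "__main__":
    for n in range(6):
        print(n, euler_sum(r_example, n), euler_cf(r_example, n))
```

Both columns should agree for every `n`, which is exactly the finite statement of the theorem.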
Euler’s formula also implies that whenever $a_i + b_i = 1$ we can go back from generalized continued fractions
$\mathbb{K}_{1}^{\infty} \frac{b_i}{a_i}$ to infinite sums, where we have many more tools at our disposal. However, in general this condition
doesn’t hold, but fortunately there is a trick to move to equivalent presentations where it might.
Lemma 5 (The equivalence transformation). Let $a_i, b_i \in \mathbb{C}$ be two sequences and $0 \neq c_i \in \mathbb{C}$ another
sequence with nonzero elements. Then
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \frac{1}{c_0} \mathbb{K}_{1}^{n} \frac{c_{i-1} c_i b_i}{c_i a_i}.
\]

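Proof. Intuitively, this lemma follows from the fact that $\frac{b_i}{a_i + x} = \frac{c_i b_i}{c_i a_i + c_i x}$ plus induction. For example
\[
\frac{4}{11} = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3}}}
= \frac{1}{c_0} \cdot \cfrac{c_0}{2 + \cfrac{1}{1 + \cfrac{1}{3}}}
= \frac{1}{c_0} \cdot \cfrac{c_0 c_1}{2 c_1 + \cfrac{c_1}{1 + \cfrac{1}{3}}}
= \frac{1}{c_0} \cdot \cfrac{c_0 c_1}{2 c_1 + \cfrac{c_1 c_2}{c_2 + \cfrac{c_2}{3}}}
= \frac{1}{c_0} \cdot \cfrac{c_0 c_1}{2 c_1 + \cfrac{c_1 c_2}{c_2 + \cfrac{c_2 c_3}{3 c_3}}}.
\]
More precisely, recall that $\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \left[\prod_{i=1}^{n} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}\right](0)$. As Möbius transformations defined by scalar matrices
are the identity, we get that
\begin{align*}
\left[\prod_{i=1}^{n} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}\right](0)
&= \left[\prod_{i=1}^{n} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix} \cdot c_i I\right](0)
= \left[\prod_{i=1}^{n} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & c_i \end{pmatrix} \begin{pmatrix} c_i & 0 \\ 0 & 1 \end{pmatrix}\right](0) \\
&= \begin{pmatrix} c_0^{-1} & 0 \\ 0 & 1 \end{pmatrix} \left[\prod_{i=1}^{n} \begin{pmatrix} c_{i-1} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & c_i \end{pmatrix}\right] \begin{pmatrix} c_n & 0 \\ 0 & 1 \end{pmatrix}(0) \\
&= \begin{pmatrix} c_0^{-1} & 0 \\ 0 & 1 \end{pmatrix} \left[\prod_{i=1}^{n} \begin{pmatrix} 0 & c_{i-1} c_i b_i \\ 1 & c_i a_i \end{pmatrix}\right] \begin{pmatrix} c_n & 0 \\ 0 & 1 \end{pmatrix}(0).
\end{align*}
Since $\begin{pmatrix} c_n & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we conclude that
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \frac{1}{c_0} \mathbb{K}_{1}^{n} \frac{c_{i-1} c_i b_i}{c_i a_i}.
\]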
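Remark 6. In the last lemma we basically moved from the matrices $M_n = \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix}$ to $\left(c_{n-1} U_{n-1}^{-1}\right) M_n U_n$ with
$U_n = \begin{pmatrix} 1 & 0 \\ 0 & c_n \end{pmatrix}$. This type of equivalence can be generalized, as we shall see later in Section 4.1.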


Combining this lemma and Euler’s formula, we are led to look for cn satisfying

cn an + cn−1 cn bn = 1.

If we can find such cn , then we have the following.


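Corollary 7. Let $a_i, b_i \in \mathbb{C}$ be any sequences and suppose that we can find a solution to
\[
c_i a_i + c_{i-1} c_i b_i = 1
\]
with nonzero $c_i$. Then
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \frac{1}{c_0} \mathbb{K}_{1}^{n} \frac{c_{i-1} c_i b_i}{c_i a_i} = \frac{1}{c_0} \left( \frac{1}{\sum_{k=0}^{n} (-1)^k \prod_{i=1}^{k} (c_{i-1} c_i b_i)} - 1 \right),
\]
or equivalently
\[
\sum_{k=0}^{n} (-1)^k \prod_{i=1}^{k} (c_{i-1} c_i b_i) = \cfrac{1}{1 + c_0 \cdot \mathbb{K}_{1}^{n} \frac{b_i}{a_i}} = \begin{pmatrix} 0 & 1 \\ c_0 & 1 \end{pmatrix} \left( \mathbb{K}_{1}^{n} \frac{b_i}{a_i} \right).
\]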

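Example 8 (The exponential function). Given some $x \in \mathbb{R}$, we start with the standard Taylor expansion
for $e^x$:
\[
e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!} = \sum_{n=0}^{\infty} \prod_{i=1}^{n} \left( \frac{x}{i} \right).
\]
Taking $r_i = \frac{x}{i}$ in Euler’s formula we get that
\[
e^x = \cfrac{1}{1 + \mathbb{K}_{1}^{\infty} \frac{-x/i}{1 + x/i}}.
\]
We would like to use the equivalence transformation with $c_i = i$ so as to remove the division in the numerators
and denominators, however since $c_0 = 0$ we cannot directly do it. Instead, we will apply it starting from the
second index, namely
\[
\mathbb{K}_{i=1}^{\infty} \frac{-x/i}{1 + x/i} = \cfrac{-x}{1 + x + \mathbb{K}_{i=2}^{\infty} \frac{-x/i}{1 + x/i}} = \cfrac{-x}{1 + x + \frac{1}{1} \mathbb{K}_{2}^{\infty} \frac{-x(i-1)}{i + x}} = \cfrac{-x}{1 + x + \mathbb{K}_{1}^{\infty} \frac{-xi}{1 + i + x}},
\]
so that
\[
e^x = \cfrac{1}{1 - \cfrac{x}{1 + x - \cfrac{x}{2 + x - \cfrac{2x}{3 + x - \cfrac{3x}{4 + x - \cfrac{4x}{\ddots}}}}}}.
\]
In particular, for $x = 1$ and $x = -1$ we get that
\[
\frac{e - 2}{1 - e} = \mathbb{K}_{1}^{\infty} \frac{-i}{2 + i}, \qquad \frac{1}{e - 1} = \mathbb{K}_{1}^{\infty} \frac{i}{i}.
\]
Similar computations can be done for other functions like $\sin(x), \cos(x), \ln(1 + x)$ etc.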
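The continued fraction for $e^x$ obtained in Example 8 can be tested numerically. The sketch below uses floating point arithmetic and truncates the expansion at a finite depth `n`; the function names `cf_tail` and `exp_via_cf` are assumptions made for this illustration.

```python
import math

def cf_tail(x, n):
    """K_1^n (-x*i)/(1 + i + x), evaluated from the innermost level outward."""
    tail = 0.0
    for i in range(n, 0, -1):
        tail = (-x * i) / (1 + i + x + tail)
    return tail

def exp_via_cf(x, n):
    """Approximate e^x by 1 / (1 - x / (1 + x + K_1^n (-x*i)/(1 + i + x)))."""
    return 1.0 / (1.0 - x / (1.0 + x + cf_tail(x, n)))

if __name__ == "__main__":
    for x in (1.0, 0.5, -1.0):
        print(x, exp_via_cf(x, 30), math.exp(x))
    # The two special cases x = 1 and x = -1 from the example.
    print(cf_tail(1.0, 30), (math.e - 2) / (1 - math.e))
    print(cf_tail(-1.0, 30), 1 / (math.e - 1))
```

Since the truncations correspond to partial sums of the Taylor series, a modest depth such as 30 is already expected to reach the limits of double precision for these values of `x`.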

Finding $c_n$ which satisfy the relation in Corollary 7 above is equivalent to solving the recurrence
\[
c_i := \frac{1}{a_i + c_{i-1} b_i} = \begin{pmatrix} 0 & 1 \\ b_i & a_i \end{pmatrix} (c_{i-1}).
\]

Once we choose $c_0$, the rest of the $c_i$ are determined by the recurrence relation, and as long as we don’t
divide by zero anywhere, we can transform the generalized continued fraction into an infinite sum. Of course,
the hard part is not to find some sequence $c_i$, but a “nice enough” such sequence for which we can compute
$\sum_{k=0}^{n} (-1)^k \prod_{i=1}^{k} (c_{i-1} c_i b_i)$.
The fact that we got a recurrence relation with the transpose of our standard matrix $\begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}$ is not a
coincidence. To give another way of viewing this transformation, we first linearize the recurrence by setting
$c_i = \frac{F_i}{F_{i+1}}$ so that the recurrence becomes
\[
F_i a_i + F_{i-1} b_i = F_{i+1},
\]
which we can also write as
\[
\begin{pmatrix} F_{i-1} & F_i \end{pmatrix} \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix} = \begin{pmatrix} F_i & F_{i+1} \end{pmatrix}.
\]
This is exactly the recurrence satisfied by $p_n$ and $q_n$ we saw in Lemma 3 (so that both $c_i = \frac{p_i}{p_{i+1}}$ and $c_i = \frac{q_i}{q_{i+1}}$
solve the recurrence above). In a sense, if we find one “nice enough” solution to the recurrence, then we can
understand both of $p_n$ and $q_n$. Moreover, the starting conditions of $p_n$ and $q_n$ are independent (namely
$\begin{pmatrix} p_0 & p_1 \\ q_0 & q_1 \end{pmatrix} = \mathrm{Id}$ is invertible), so any sequence satisfying the recurrence above is a linear combination of these
two sequences. Thus understanding one “nice” solution gives us a lot of information about all the solutions.
Next, we try to give simple conditions on $a_i, b_i$ where we can find a “nice” solution for the recurrence. A
good starting point is when $a_i, b_i$ are fixed $a_i \equiv a$, $b_i \equiv b$, so that our matrix is $M_i = M = \begin{pmatrix} 0 & b \\ 1 & a \end{pmatrix}$, and a
solution $F_n$ can be found by looking at $M^n$. This is a standard recursion where $F_n$ will be a combination of a
polynomial $f(n)$ and exponential $h^n$, where $h$ is one of the two roots $h_1, h_2$ of the characteristic polynomial
$x^2 - ax - b$, namely $h_1 \cdot h_2 = -b$ and $h_1 + h_2 = a$.
More generally, our recurrence depends on $i$, and the “right” way to think about exponential is more like
factorial, so we should look for $F_n$ of the form $f(n) \cdot \prod_{k=1}^{n} h(k)$ for some polynomials $f, h$, or in the $c_n$ notation
we have $c_n = \frac{F_n}{F_{n+1}} = \frac{f(n)}{f(n+1) h(n+1)}$.
With this quadratic intuition in mind, we have the following family of continued fractions which have
this sort of solution to their corresponding recurrence relation. Special cases of these continued fractions
appear in many places (in particular, see for example [4]), though we did not see this exact formulation in
the literature.
Theorem 9. Let $h_1, h_2, f : \mathbb{C} \to \mathbb{C}$ be any functions, and define $a, b : \mathbb{C} \to \mathbb{C}$ such that
\begin{align*}
b(x) &= -h_1(x) h_2(x) \\
f(x) a(x) &= f(x-1) h_1(x) + f(x+1) h_2(x+1).
\end{align*}
Then taking $F_n = f(n) \cdot \prod_{k=1}^{n} h_2(k)$ solves the recurrence relation
\[
F_n a(n) + F_{n-1} b(n) = F_{n+1},
\]
and we get that
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \frac{f(1) h_2(1)}{f(0)} \left( \frac{1}{\sum_{k=0}^{n} \frac{f(0) f(1)}{f(k) f(k+1)} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}} - 1 \right).
\]

Proof. Simply putting the definition of $a, b, F$ in the recurrence gives us
\[
F_n a(n) + F_{n-1} b(n) - F_{n+1} = \left[ \prod_{k=1}^{n} h_2(k) \right] \left( f(n) a(n) - f(n-1) h_1(n) - f(n+1) h_2(n+1) \right) = 0.
\]
Using Corollary 7 and taking $c_n = \frac{F_n}{F_{n+1}}$ we get that
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = \frac{F_1}{F_0} \left( \frac{1}{\sum_{k=0}^{n} \prod_{i=1}^{k} \frac{F_{i-1}}{F_{i+1}} h_1(i) h_2(i)} - 1 \right) = \frac{f(1) h_2(1)}{f(0)} \left( \frac{1}{\sum_{k=0}^{n} \frac{f(0) f(1)}{f(k) f(k+1)} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}} - 1 \right).
\]

Remark 10. We leave it as an exercise to show that if we start with polynomials $a, b \in \mathbb{C}[x]$ and we look for
solutions to $c_n a_n + c_{n-1} c_n b_n = 1$ where $a_n = a(n)$ and $b_n = b(n)$ are evaluations, and where $c_n = c(n)$ for
some rational function $c \in \mathbb{C}(x)$, then $a, b$ must have the form as in the theorem above where $f, h_1, h_2$ are all
polynomials themselves and $c(x) = \frac{f(x)}{f(x+1) h_2(x+1)}$.
While the theorem above is applicable to any functions $h_1, h_2$ and $f$, it is probably most useful when they
are polynomials over $\mathbb{Z}$, in which case the product $\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}$ can become much simpler.

Example 11 (The trivial Euler family¹). Consider the family from the theorem above where $f(x) = 1$, so
that
\begin{align*}
b(x) &= -h_1(x) h_2(x) \\
a(x) &= h_1(x) + h_2(x+1),
\end{align*}
in which case we have
\[
\mathbb{K}_{1}^{n} \frac{b_i}{a_i} = h_2(1) \left( \frac{1}{\sum_{k=0}^{n} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}} - 1 \right),
\]
or alternatively
\[
\sum_{k=0}^{n} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)} = \left( \frac{1}{h_2(1)} \mathbb{K}_{1}^{n} \frac{b_i}{a_i} + 1 \right)^{-1}.
\]
1. Let $b(x) = -1 \times x$ and $a(x) = 1 + (x+1) = x + 2$. We then have that $\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)} = \prod_{i=1}^{k} \frac{1}{i+1} = \frac{1}{(k+1)!}$,
so that
\[
\mathbb{K}_{1}^{\infty} \frac{-n}{n+2} = \frac{1}{\sum_{k=0}^{\infty} \frac{1}{(k+1)!}} - 1 = \frac{1}{e - 1} - 1,
\]
which we already saw in Example 8.
2. For some $d \geq 2$ let $b(x) = -x^d \times x^d$ and $a(x) = x^d + (x+1)^d$. Then $\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)} = \prod_{i=1}^{k} \frac{i^d}{(i+1)^d} = \frac{1}{(k+1)^d}$, so that
\[
\mathbb{K}_{1}^{\infty} \frac{b_n}{a_n} = \frac{1}{\sum_{k=0}^{\infty} \frac{1}{(k+1)^d}} - 1 = \frac{1}{\zeta(d)} - 1.
\]
3. For some $d \geq 4$ let $b(x) = -x^{d-1}(x+2) \times x^d$ and $a(x) = x^{d-1}(x+2) + (x+1)^d$. We then have that
\[
\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)} = \prod_{i=1}^{k} \frac{i^{d-1}}{(i+1)^{d-1}} \cdot \prod_{i=1}^{k} \frac{i+2}{i+1} = \frac{1}{(k+1)^{d-1}} \cdot \frac{k+2}{2} = \frac{1}{2} \left[ \frac{1}{(k+1)^{d-1}} + \frac{1}{(k+1)^{d-2}} \right].
\]
Summing up over $k$ gives us
\[
\sum_{k=0}^{\infty} \frac{1}{2} \left[ \frac{1}{(k+1)^{d-1}} + \frac{1}{(k+1)^{d-2}} \right] = \frac{1}{2} \left( \zeta(d-1) + \zeta(d-2) \right).
\]
A similar computation can be done for $b(x) = -x^{d-1}(x+m) \times x^d$, $a(x) = x^{d-1}(x+m) + (x+1)^d$
where $m \geq 0$. Note that for $m = 0$ we get the continued fraction from part (2) above and for $m = 1$ we
can use the equivalence transformation Lemma 5 to cancel $(x+1)$ in $a(x)$ and $x(x+1)$ in $b(x)$ and
get
\[
\mathbb{K}_{1}^{\infty} \frac{b(x)}{a(x)} = \mathbb{K}_{1}^{\infty} \frac{-x^{2(d-1)}}{x^{d-1} + (1+x)^{d-1}} = \frac{1}{\zeta(d-1)} - 1.
\]
¹ It has come to our attention that Euler doesn’t have enough mathematical objects named after him.

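4. For some $d \geq 2$ let $b(x) = -x^d \times x^{d-1}(x+1)$ and $a(x) = x^d + (x+1)^{d-1}(x+2)$. We then have that
\[
\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)} = \prod_{i=1}^{k} \frac{i^{d-1}}{(i+1)^{d-1}} \cdot \prod_{i=1}^{k} \frac{i}{i+2} = \frac{1}{(k+1)^{d-1}} \cdot \frac{1 \cdot 2}{(k+1)(k+2)} = \frac{1}{(k+1)^d} \cdot \frac{2}{k+2}.
\]
Since $\frac{1}{j+1} \cdot \frac{1}{j+2} = \frac{1}{j+1} - \frac{1}{j+2}$ for all $j \geq 0$, we get by induction that
\[
\frac{1}{(k+1)^d} \cdot \frac{1}{k+2} = \sum_{\ell=2}^{d} \frac{(-1)^{d-\ell}}{(k+1)^{\ell}} + (-1)^{d-1} \left( \frac{1}{k+1} - \frac{1}{k+2} \right).
\]
Summing over this expression we get
\[
\sum_{k=0}^{\infty} \left[ \sum_{\ell=2}^{d} \frac{(-1)^{d-\ell}}{(k+1)^{\ell}} + (-1)^{d-1} \left( \frac{1}{k+1} - \frac{1}{k+2} \right) \right] = \sum_{\ell=2}^{d} (-1)^{d-\ell} \zeta(\ell) + (-1)^{d-1}.
\]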
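The finite version of part (2) of Example 11 says that $\sum_{k=0}^{n} \frac{1}{(k+1)^d} = \left(1 + \mathbb{K}_1^n \frac{b_i}{a_i}\right)^{-1}$ with $b(x) = -x^{2d}$ and $a(x) = x^d + (x+1)^d$ (here $h_2(1) = 1$). The following sketch checks this identity in exact arithmetic; the function names are hypothetical and the range of `d` and `n` is an arbitrary choice for the illustration.

```python
from fractions import Fraction

def trivial_family_cf(d, n):
    """K_1^n b(i)/a(i) with b(x) = -x^(2d) and a(x) = x^d + (x+1)^d."""
    tail = Fraction(0)
    for i in range(n, 0, -1):
        tail = Fraction(-i ** (2 * d)) / (i ** d + (i + 1) ** d + tail)
    return tail

def zeta_partial_sum(d, n):
    """sum_{k=0}^{n} 1/(k+1)^d as an exact fraction."""
    return sum(Fraction(1, (k + 1) ** d) for k in range(n + 1))

if __name__ == "__main__":
    for n in range(5):
        lhs = zeta_partial_sum(3, n)
        rhs = 1 / (1 + trivial_family_cf(3, n))
        print(n, lhs, rhs)
```

For $d = 3$ this is the presentation of the standard approximations of $\zeta(3)$ that appears in the introduction.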

Remark 12. In general, the ideas appearing in the examples above can help compute many of the sums of
the form $\sum_{k=0}^{\infty} \frac{f(0) f(1)}{f(k) f(k+1)} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}$.
First, if $\deg(h_1) > \deg(h_2)$ there is no convergence, so that we may assume that $\deg(h_1) \leq \deg(h_2)$.
Next, if $\deg(h_1) = \deg(h_2)$ and all the roots of $h_1, h_2$ are integers, then most of the elements in
$\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}$ will be canceled out and $\frac{f(0) f(1)}{f(k) f(k+1)} \prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}$ will be a rational function. Using the
standard decomposition of rational functions, we can write it as a linear combination of elements of the form
$\frac{1}{(n-\alpha)^d}$ with integer $\alpha$. These always converge if $d \geq 2$ to values of the zeta function. The elements for
which $d = 1$ should be put together to find out their limits (e.g. $\sum_{k=0}^{\infty} \left( \frac{1}{k+1} - \frac{1}{k+2} \right) = 1$). This can be
slightly generalized if the roots of $h_2$ and $h_1$ are the same modulo the integers, in which case still most of
the elements in $\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}$ are canceled.
Finally, if $\deg(h_2) > \deg(h_1)$, we expect to see all sorts of factorials appearing in the denominators of
the summands, suggesting to look for Taylor expansions to find the limit.

In the examples above we only looked at “trivial” solutions where $f \equiv 1$. In general, there are solutions
where $f \not\equiv 1$ (and we shall see many of them later), however, they are all part of the trivial family in disguise.
Indeed, if
\begin{align*}
b(x) &= -h_1(x) h_2(x) \\
f(x) a(x) &= f(x-1) h_1(x) + f(x+1) h_2(x+1)
\end{align*}
as in the theorem, then using the equivalence transformation from Lemma 5 with $c_n = f(n)$, we get that
\[
\mathbb{K}_{1}^{n} \frac{b(n)}{a(n)} = \frac{1}{f(0)} \mathbb{K}_{1}^{n} \frac{\tilde{b}(n)}{\tilde{a}(n)},
\]
where this new continued fraction is in the trivial Euler family:
\begin{align*}
\tilde{h}_1(x) &= h_1(x) f(x-1) ; \qquad \tilde{h}_2(x) = h_2(x) f(x) \\
\tilde{b}(x) &= f(x-1) f(x) b(x) = -\tilde{h}_1(x) \times \tilde{h}_2(x) \\
\tilde{a}(x) &= f(x) a(x) = \tilde{h}_1(x) + \tilde{h}_2(x+1).
\end{align*}

As mentioned before, the structure of the trivial family should not be too surprising. If we just wanted
to solve a simple quadratic equation $S(x) = x^2 + ax + b = 0$, then we would look for two solutions $\lambda_1, \lambda_2$
such that $S(\lambda_1) = S(\lambda_2) = 0$. This is equivalent to solving the two equations $b = \lambda_1 \lambda_2$ and $a = -(\lambda_1 + \lambda_2)$,
which is more or less what we look for when trying to present a generalized continued fraction as part of the
trivial Euler family. While the sign choice is simply for convenience, the main difference is that in our case
the solutions depend on $n$, and instead of $a(n) = \lambda_1(n) + \lambda_2(n)$, we “half advance” the index and look for
$a(n) = \lambda_1(n) + \lambda_2(n+1)$.
In general, starting with a standard quadratic equation $x^2 + ax + b = 0$ over the integers, we do not expect
to have solutions over the integers. The same applies here: if we are given polynomials $a(x), b(x)$ where
$\mathbb{K}_{1}^{\infty} \frac{b(n)}{a(n)}$ converges, there need not be a solution as in Theorem 9 with integral polynomials. Assuming
that we know how to decompose $b(x)$, we can go over all possible options for $a(x) = h_1(x) + h_2(x+1)$. In the
more general form, where $f(x) a(x) = f(x-1) h_1(x) + f(x+1) h_2(x+1)$, this process is less trivial,
but still not that hard, and we describe an algorithm to find all such solutions in Appendix A.

3 Irrationality testing
Any real number $\alpha$ can be approximated by rational numbers. Moreover, if $q \in \mathbb{N}$ is the denominator, then
we can always find $p \in \mathbb{Z}$ such that $\left| \alpha - \frac{p}{q} \right| \leq \frac{1}{q}$. Looking for approximations where the error is much smaller
than $\frac{1}{q}$ is the starting point of the study of Diophantine approximations. Let us describe one of the well
known ways to use such approximations to show that a given number is irrational.
Take your favorite constant $L$ (e.g. $e, \pi, \ln(7), \zeta(k)$ etc) and let $\frac{p_n}{q_n}$ be a sequence of rationals converging
to $L$. If $L = \frac{p}{q}$ is rational and $L \neq \frac{p_n}{q_n}$, then
\[
\left| L - \frac{p_n}{q_n} \right| = \left| \frac{p \cdot q_n - q \cdot p_n}{q \cdot q_n} \right| \geq \frac{1}{|q \cdot q_n|}.
\]

Since $q$ is constant, we immediately get that:

Corollary 13. Suppose that $p_n, q_n$ are coprime with $|q_n| \to \infty$. If $\left| L - \frac{p_n}{q_n} \right| = o\left( \frac{1}{|q_n|} \right)$, then $L$ is irrational.

In other words, we can always have $< \frac{1}{|q_n|}$ error, and if we can do better, then the number is irrational.
Note that once we know that a number is irrational, Dirichlet’s theorem tells us that there are reduced
approximations $\frac{p_n}{q_n}$ with $\left| L - \frac{p_n}{q_n} \right| = O\left( \frac{1}{q_n^2} \right)$. The importance of rational approximations derived from generalized
continued fractions is that they give us an upper bound on the error, which we can use to prove irrationality.
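Claim 14. Let $a_n, b_n, p_n, q_n$ be sequences of integers satisfying the recurrence from Lemma 3, namely
\[
p_{n+1} = a_n p_n + b_n p_{n-1}, \qquad
q_{n+1} = a_n q_n + b_n q_{n-1}.
\]
1. If $\frac{p_n}{q_n} \to L$, then
\[
\left| L - \frac{p_n}{q_n} \right| = \left| \sum_{k=n}^{\infty} \frac{\prod_{i=1}^{k} (-b_i)}{q_k q_{k+1}} \right| \leq \sum_{k=n}^{\infty} \frac{\prod_{i=1}^{k} |b_i|}{|q_k q_{k+1}|}.
\]
2. Suppose in addition that the $b_n$ are nonzero. If for $\tilde{q}_n = \frac{q_n}{\gcd(p_n, q_n)}$ we have $\left| L - \frac{p_n}{q_n} \right| = o\left( \frac{1}{|\tilde{q}_n|} \right)$, then
$L$ is irrational.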
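Proof. 1. For all $m \geq n$ we have
\[
L - \frac{p_n}{q_n} = L - \frac{p_{m+1}}{q_{m+1}} + \sum_{k=n}^{m} \left( \frac{p_{k+1}}{q_{k+1}} - \frac{p_k}{q_k} \right) = L - \frac{p_{m+1}}{q_{m+1}} - \sum_{k=n}^{m} \frac{\det \begin{pmatrix} p_k & p_{k+1} \\ q_k & q_{k+1} \end{pmatrix}}{q_k q_{k+1}}.
\]
Under the assumption that $L - \frac{p_{m+1}}{q_{m+1}} \to 0$ as $m \to \infty$, we conclude that
\[
\left| L - \frac{p_n}{q_n} \right| \leq \sum_{k=n}^{\infty} \frac{\left| \det \begin{pmatrix} p_k & p_{k+1} \\ q_k & q_{k+1} \end{pmatrix} \right|}{|q_k q_{k+1}|} = \sum_{k=n}^{\infty} \frac{\prod_{i=1}^{k} |\det(M_i)|}{|q_k q_{k+1}|} = \sum_{k=n}^{\infty} \frac{\prod_{i=1}^{k} |b_i|}{|q_k q_{k+1}|},
\]
and we are done.

2. For the second part, setting $\tilde{p}_n = \frac{p_n}{\gcd(p_n, q_n)}$ we get that $\left| L - \frac{\tilde{p}_n}{\tilde{q}_n} \right| = \left| L - \frac{p_n}{q_n} \right| = o\left( \frac{1}{|\tilde{q}_n|} \right)$. In order to
use Corollary 13, we only need to show that $\tilde{q}_n$ has a subsequence going to infinity. Otherwise, assume
that $\tilde{q}_n$ is bounded, then since $\left| L - \frac{\tilde{p}_n}{\tilde{q}_n} \right| \to 0$, the $\tilde{p}_n$ and $\tilde{q}_n$ must be constant for all $n$ large enough
(since $\gcd(\tilde{p}_n, \tilde{q}_n) = 1$) and $\frac{\tilde{p}_n}{\tilde{q}_n} = L$. However, we also have that
\[
\left| \frac{\tilde{p}_{n+1}}{\tilde{q}_{n+1}} - \frac{\tilde{p}_n}{\tilde{q}_n} \right| = \left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| = \frac{\left| \det \begin{pmatrix} p_n & p_{n+1} \\ q_n & q_{n+1} \end{pmatrix} \right|}{|q_n q_{n+1}|} = \frac{\prod_{k=1}^{n} |b_k|}{|q_n q_{n+1}|} \neq 0,
\]
which leads to a contradiction. Hence, the $\tilde{q}_n$ cannot be bounded as required.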
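The key identity behind Claim 14 is that $\det \begin{pmatrix} p_k & p_{k+1} \\ q_k & q_{k+1} \end{pmatrix} = \prod_{i=1}^{k} \det(M_i) = \prod_{i=1}^{k} (-b_i)$, so that consecutive convergents differ by exactly this product divided by $q_k q_{k+1}$, up to sign. A minimal sketch of a numerical check follows, assuming integer-valued callables `a(k)`, `b(k)` with nonzero $q_k$ for $k \geq 1$; the function names and the test sequence (the presentation of $\pi - 3$ with $a_k = 6$, $b_k = (2k-1)^2$) are choices made for this illustration.

```python
from fractions import Fraction

def convergent_pairs(a, b, n):
    """Return [(p_0, q_0), ..., (p_n, q_n)] from the recurrence of Lemma 3."""
    pairs = [(1, 0), (0, 1)]
    for k in range(1, n):
        (p_prev, q_prev), (p_cur, q_cur) = pairs[-2], pairs[-1]
        pairs.append((a(k) * p_cur + b(k) * p_prev,
                      a(k) * q_cur + b(k) * q_prev))
    return pairs

def check_determinants(a, b, n):
    """Check det [[p_k, p_{k+1}], [q_k, q_{k+1}]] = prod_{i<=k} (-b_i) for 1 <= k < n."""
    pairs = convergent_pairs(a, b, n)
    product = 1
    for k in range(1, n):
        product *= -b(k)   # determinant of M_1 * ... * M_k
        (p_k, q_k), (p_next, q_next) = pairs[k], pairs[k + 1]
        if p_k * q_next - p_next * q_k != product:
            return False
        # The difference of consecutive convergents is minus the determinant over q_k q_{k+1}.
        step = Fraction(p_next, q_next) - Fraction(p_k, q_k)
        if step != Fraction(-product, q_k * q_next):
            return False
    return True

if __name__ == "__main__":
    print(check_determinants(lambda k: 6, lambda k: (2 * k - 1) ** 2, 10))
```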

The recursion relation of the $q_i$ suggests that the larger the $|b_i|$ are, the faster the growth of $|q_k|$ is, and in
general we expect it to be fast enough so that $\sum_{k=1}^{\infty} \frac{\prod_{i=1}^{k} |b_i|}{|q_k q_{k+1}|}$ will converge. However, its tail might still not be
in $o\left( \frac{1}{|q_n|} \right)$. Hopefully, if the $\gcd(p_n, q_n)$ is large enough, then it is in $o\left( \frac{1}{|\tilde{q}_n|} \right)$, which is enough to prove
irrationality.

3.1 Failed irrationality test for ζ (3)


Consider $\zeta(3)$ with its standard rational approximation as infinite sum:
\[
\frac{p_n}{q_n} := \sum_{k=1}^{n} \frac{1}{k^3} \xrightarrow{n \to \infty} \zeta(3).
\]
Multiplying the denominators, we can write $q_n = (n!)^3$ and $p_n = \sum_{k=1}^{n} \left( \frac{n!}{k} \right)^3$, both of which are in $\mathbb{Z}$. Trying
to apply the irrationality test from the previous section, we get that
\[
\left| \zeta(3) - \frac{p_n}{q_n} \right| = \sum_{k=n+1}^{\infty} \frac{1}{k^3} = \Theta\left( \frac{1}{n^2} \right).
\]
This is, of course, far from what we need to prove irrationality, since $\frac{1}{n^2}$ is much larger than $\frac{1}{q_n} = \frac{1}{(n!)^3}$. While
the rational approximation above is easy to use in general, it is not good enough
to show irrationality.
Moreover, fixing $q_n = (n!)^3$, we can always choose some $p'_n$ such that $\left| \zeta(3) - \frac{p'_n}{q_n} \right| \leq \frac{1}{2 q_n}$, showing how bad
the approximation above is.
One way to improve the approximation, as in part (2) of Claim 14, is by moving to a reduced form of $\frac{p_n}{q_n}$.
Indeed, we can take instead the common denominator, namely $\tilde{q}_n = \mathrm{lcm}[n]^3$, where $\mathrm{lcm}[n] := \mathrm{lcm}\{1, 2, \ldots, n\}$
and then $\tilde{p}_n = \sum_{k=1}^{n} \left( \frac{\mathrm{lcm}[n]}{k} \right)^3$. It is well known that $\ln(\mathrm{lcm}[n]) = n + o(n)$ (it follows from the prime number
theorem, see [2]), so that $\mathrm{lcm}[n] = e^{n + o(n)}$ is much smaller than $n!$. However, we still get that the error is too
big, $\frac{1}{n^2} \geq \frac{1}{e^n}$, so even with this improvement, it is still not enough.
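The failure of this test can be observed numerically. The sketch below computes the partial sums $\sum_{k=1}^{n} \frac{1}{k^3}$ exactly, confirms that their reduced denominators divide $\mathrm{lcm}[n]^3$, and evaluates the exponent $\delta = -1 - \frac{\ln|L - p/q|}{\ln|q|}$ from the introduction for the reduced fraction. The function names, the double-precision reference value `ZETA3`, and the use of `math.lcm` (available from Python 3.9) are assumptions of this illustration.

```python
import math
from fractions import Fraction

# Reference value of zeta(3) in double precision (assumed known, used only for comparison).
ZETA3 = 1.2020569031595942

def lcm_upto(n):
    """lcm{1, 2, ..., n} for n >= 1."""
    return math.lcm(*range(1, n + 1))

def standard_approximation(n):
    """The reduced fraction sum_{k=1}^{n} 1/k^3."""
    return sum(Fraction(1, k ** 3) for k in range(1, n + 1))

def delta(n):
    """delta = -1 - ln|zeta(3) - p/q| / ln(q) for the reduced n-th partial sum (n >= 2)."""
    approx = standard_approximation(n)
    error = abs(ZETA3 - float(approx))
    return -1 - math.log(error) / math.log(approx.denominator)

if __name__ == "__main__":
    for n in (10, 20, 50):
        print(n, lcm_upto(n), delta(n))
```

A negative value of `delta(n)` means that the error is larger than the reciprocal of the denominator, so these approximations cannot be used in Corollary 13.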

Trying to solve this problem, we define a new sequence


P of rational
 approximations to ζ (3). Since the
n−1 1
previous sequence is Θ n2 away, we instead use qn =
1 pn
k3 + 2n2 . As before, while simply taking
1

1
3
the product of the denominators we have ((n − 1)!) 2n2 , taking instead their least common multiple allow
3
us to assume that qn ∼ lcm [n] . On the other hand, computing the error we get
∞ !

! Z



ζ (3) − pn X 1 1 X 1 1 1
= − 2 = − dx ≤ 3 .

qn n k 3 2n n k 3 x 3 n
n

So while the denominator of the n-th approximation hasn’t changed too much, the error itself becomes much
smaller, namely decreases from $\frac{1}{n^2}$ to $\frac{1}{n^3}$. This is still not enough for our purpose, but if we could keep
finding many such approximations for ζ (3), which can be “easy” to describe and work with, then we might
eventually find one where the approximation is good enough to conclude that ζ (3) is irrational.
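As a small numerical illustration of the improved error (a sketch only; the function name `corrected` and the double-precision value `ZETA3` are assumptions made for this example), one can compare the corrected approximations with $\zeta(3)$ and check the bound $\frac{1}{n^3}$:

```python
from fractions import Fraction

ZETA3 = 1.2020569031595942  # assumed reference value of zeta(3) in double precision

def corrected(n):
    """Exact value of sum_{k=1}^{n-1} 1/k^3 + 1/(2 n^2)."""
    return sum(Fraction(1, k**3) for k in range(1, n)) + Fraction(1, 2 * n * n)

def corrected_error(n):
    """|zeta(3) - corrected(n)| as a float."""
    return abs(ZETA3 - float(corrected(n)))

for n in (5, 10, 20, 40):
    # the error is bounded by 1/n^3, as in the integral comparison above
    assert corrected_error(n) <= 1 / n**3
    print(n, corrected_error(n) * n**3)
```

The printed values of $n^3$ times the error remain bounded, in line with the estimate above.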
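The main idea in this paper is to start with a continued fraction, and then “improve” it. For example,
we already saw in Example 11 that the standard approximation is
$$\sum_{k=0}^{n-1} \frac{1}{(k+1)^3} = \frac{1}{1 + \mathbb{K}_1^{n-1} \frac{-k^6}{k^3 + (1+k)^3}} = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \left[\prod_{k=1}^{n-1} \begin{pmatrix} 0 & -k^6 \\ 1 & k^3 + (1+k)^3 \end{pmatrix}\right] (0).$$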

The “improvement” will be done by adding matrices to the multiplication from the left, or from the right,
while trying to conserve the continued fraction form, and hopefully allowing us to end up with a nice enough
presentation and to prove irrationality. In any case, this will require a bit more general approach to this
subject, where we consider general product of Mobius transformations, though hopefully keeping them simple
enough in order to work with them.
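The identification of the partial sums of $\zeta(3)$ with a product of $2 \times 2$ matrices acting as Mobius maps can be checked exactly with rational arithmetic. The sketch below is illustrative only: `matmul`, `mobius` and `euler_convergent` are hypothetical helper names, and the index range of the product (from $1$ to $n-1$ for the $n$-th partial sum) is the convention assumed here.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def mobius(A, z):
    """Mobius action of the matrix A = (a b; c d) on z."""
    (a, b), (c, d) = A
    return (a*z + b) / (c*z + d)

def euler_convergent(n):
    """(0 1; 1 1) [prod_{k=1}^{n-1} (0 -k^6; 1 k^3 + (k+1)^3)] (0)."""
    P = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for k in range(1, n):
        P = matmul(P, ((0, -k**6), (1, k**3 + (k + 1)**3)))
    return mobius(matmul(((0, 1), (1, 1)), P), Fraction(0))

for n in range(1, 8):
    # the n-th value equals the n-th partial sum of zeta(3)
    assert euler_convergent(n) == sum(Fraction(1, k**3) for k in range(1, n + 1))
```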

4 The most general of generalized continued fractions
Our goal, continuing on, will be to start with a given continued fraction expansion for some constant (e.g.
$\zeta(3) = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \left(\mathbb{K}_1^{\infty} \frac{-n^6}{n^3 + (1+n)^3}\right)$), and create new continued fractions, in which the approximation error goes to
zero quicker than the denominator goes to infinity.
As we are moving between different continued fraction presentations, we are led to consider a larger family
of “continued fraction expansions” namely - any sequence of matrices.
Definition 15. Let $M(i)$ be a sequence of $2 \times 2$ matrices. For $z \in \mathbb{C}$ we will denote
$$\left[\prod_{i=1}^{\infty} M(i)\right](z) = \lim_{n\to\infty} \left[\prod_{i=1}^{n} M(i)\right](z).$$

In the definition above, it is possible that the limit converges for one $z$ and diverges for another. For
example, if we take $M(i) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, then $\left[\prod_1^{\infty} M(i)\right](0) = 0$, while $\left[\prod_1^{n} M(i)\right](1) = (-1)^n$ doesn’t converge.
However, in many “natural” sequences we have a very strong convergence behavior.
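Example 16. 1. If $M_i = \begin{pmatrix} 1 & a_i \\ 0 & 1 \end{pmatrix}$ are upper triangular, then
$$\left(\prod_{i=1}^{n} M_i\right)(z) = \begin{pmatrix} 1 & \sum_1^n a_i \\ 0 & 1 \end{pmatrix}(z) = z + \sum_{i=1}^{n} a_i.$$
If $\sum_1^{\infty} a_i = \infty$, then $\left(\prod_1^n M_i\right)(z) \to \infty$ for all $z$. If we also add an $M_0$ matrix which takes $\infty \mapsto w \in \mathbb{C}$,
then $\left(\prod_0^n M_i\right)(z) \to w$ for all $z$.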
2. If $M_i = \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}$ has the continued fraction form, then $\left(\prod_1^n M_i\right)(\infty) = \left(\prod_1^{n-1} M_i\right)(0)$ since $M_i(\infty) = 0$.
It follows that we have convergence in $0$ if and only if we have convergence in $\infty$. Writing $\prod_1^{n-1} M_i = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}$, this limit will simply be $\lim_{n\to\infty} \frac{p_n}{q_n}$.
For $x \in (0, \infty)$ we have that
$$\left[\prod_{i=1}^{n-1} M_i\right](x) = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}(x) = \frac{p_{n-1} x + p_n}{q_{n-1} x + q_n},$$
and in case the $p_n$ and $q_n$ sequences are positive, it is an exercise to show that $\frac{p_{n-1} x + p_n}{q_{n-1} x + q_n}$ is in the
segment defined by the endpoints $\left\{\frac{p_n}{q_n}, \frac{p_{n-1}}{q_{n-1}}\right\}$. The best way to see it is to consider the vectors $(p_n, q_n)$
and $(p_{n-1}, q_{n-1})$ in the positive quadrant and see that $(p_n, q_n) + x (p_{n-1}, q_{n-1})$ is between them, so its
projection to the projective line is between their projections. It follows that when applying $\left[\prod_1^{n-1} M_i\right]$
to any element in $[0, \infty]$, the sequence converges and to the same limit.
The condition on the $p_n, q_n$ is true, for example, if the $a_i, b_i$ are all positive, which is the case for simple
continued fractions.

4.1 Some words about cocycles and coboundaries
In our previous discussion about continued fractions, where $M_i = \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}$, we were especially interested in
$P_n := \prod_1^n M_i = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}$, which contained the numerators and denominators of the convergents. These
products can be defined for any sequence Mi of matrices, and we think of them as potential matrices
where we move from the potential at point i to the potential at point i + 1 via the matrix Mi , or more
formally Pi Mi = Pi+1 . We can also restrict them to row vectors, instead of full 2 × 2 matrices. In particular,
for the continued fraction matrices, each such vector sequence will satisfy (Fi−1 , Fi ) Mi = (Fi , Fi+1 ), where
Fi solves the recurrence relation
Fi+1 = Fi ai + Fi−1 bi
that we already encountered before.
The main goal of this section is to look for natural ways to move between such potential matrices and
vectors, and eventually to find natural connections between different continued fractions. This type of
question is usually asked in cohomology theory (see for example chapter 4 in [5]), where such sequences Mi
of matrices should and can be called “cocycles”. We will not go too much into this theory here, since on the
one hand this cocycle structure is in a sense trivial, and on the other hand, the theory is usually much more
geared into commutative rings, unlike our noncommutative matrix setting. However, the question about
natural conversion between the potentials exists in this theory and is called coboundary equivalence, and this
will come up a lot in our study. More specifically we want to move from one potential $P_i^{(1)}$ to the second
$P_i^{(2)}$ using a nice transformation $P_i^{(1)} U_i = P_i^{(2)}$, producing for us this commutative diagram:

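$$\begin{array}{ccccccccc}
P_1^{(1)} & \xrightarrow{M_1^{(1)}} & P_2^{(1)} & \xrightarrow{M_2^{(1)}} & P_3^{(1)} & \xrightarrow{M_3^{(1)}} & \cdots & \xrightarrow{M_{n-1}^{(1)}} & P_n^{(1)} \\
\downarrow {\scriptstyle U_1} & & \downarrow {\scriptstyle U_2} & & \downarrow {\scriptstyle U_3} & & & & \downarrow {\scriptstyle U_n} \\
P_1^{(2)} & \xrightarrow{M_1^{(2)}} & P_2^{(2)} & \xrightarrow{M_2^{(2)}} & P_3^{(2)} & \xrightarrow{M_3^{(2)}} & \cdots & \xrightarrow{M_{n-1}^{(2)}} & P_n^{(2)}
\end{array}$$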

More formally, we have the following.


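Definition 17. Two matrix sequences $M_i^{(1)}, M_i^{(2)}$ are called $U_i$-coboundary equivalent for a sequence of
invertible matrices $U_i$ if $M_i^{(1)} U_{i+1} = U_i M_i^{(2)}$ for all $i$, or in a commutative diagram form:
$$\begin{array}{ccc}
(*) & \xrightarrow{M_i^{(2)}} & (*) \\
\uparrow {\scriptstyle U_i} & & \uparrow {\scriptstyle U_{i+1}} \\
(*) & \xrightarrow{M_i^{(1)}} & (*)
\end{array}$$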

Remark 18. In the world of standard, non indexed matrices, this coboundary equivalence is simply matrix
conjugation, and as we shall see some of the results for this coboundary equivalence are just “indexed” version
of what we expect from matrix conjugacy.
The commutativity condition in the coboundary definition can be extended to products of the $M_i$, and
in particular for our potential matrices. Indeed, a simple induction shows that with the notations as in the
definition, for all $m \le n$ we have
$$U_m \left[\prod_{i=m}^{n} M_i^{(2)}\right] = \left[\prod_{i=m}^{n} M_i^{(1)}\right] U_{n+1}.$$

Every two matrix sequences $M_i^{(1)}, M_i^{(2)}$ (invertible) are coboundary equivalent for some $U_i$. Indeed, once
we choose $U_1$, we can recursively define $U_{i+1} = \left(M_i^{(1)}\right)^{-1} U_i M_i^{(2)}$. However, what will matter to us later on
is that Ui is simple enough to work with. For example, it can be defined over Z, triangular, diagonal, etc. In

particular, we want to work with the Mobius maps induced by the matrices, and the upper triangular (resp.
lower triangular) are exactly the matrices which take infinity to itself (resp. zero to itself).
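The recursive construction of the $U_i$ and the product form of the coboundary relation can be tested on random data. The Python sketch below is only an illustration under assumptions made here: small random integer matrices, a fixed seed, 0-based list indices, and hypothetical helper names (`matmul`, `inverse`, `random_invertible`, `product`).

```python
from fractions import Fraction
import random

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def inverse(A):
    (a, b), (c, d) = A
    det = a*d - b*c
    return ((d/det, -b/det), (-c/det, a/det))

def random_invertible(rng):
    """Random 2x2 matrix with small integer entries (as Fractions), det != 0."""
    while True:
        A = tuple(tuple(Fraction(rng.randint(-5, 5)) for _ in range(2))
                  for _ in range(2))
        if A[0][0]*A[1][1] - A[0][1]*A[1][0] != 0:
            return A

def product(Ms):
    P = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for M in Ms:
        P = matmul(P, M)
    return P

rng = random.Random(0)  # fixed seed, purely illustrative
N = 8
M1 = [random_invertible(rng) for _ in range(N)]  # plays the role of M_i^{(1)}
M2 = [random_invertible(rng) for _ in range(N)]  # plays the role of M_i^{(2)}

# U_{i+1} = (M_i^{(1)})^{-1} U_i M_i^{(2)}, starting from an arbitrary U_1
U = [random_invertible(rng)]
for i in range(N):
    U.append(matmul(matmul(inverse(M1[i]), U[i]), M2[i]))

# product form: U_m prod_{i=m}^{n} M_i^{(2)} = prod_{i=m}^{n} M_i^{(1)} U_{n+1}
for m in range(N):
    for n in range(m, N):
        assert (matmul(U[m], product(M2[m:n + 1]))
                == matmul(product(M1[m:n + 1]), U[n + 1]))
```

As the text stresses, such $U_i$ always exist; the sketch says nothing about whether they are simple, which is the property that matters later.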
This idea of coboundary equivalent sequences is very useful, and we have already seen one such important
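example. In the “equivalence transformation” for continued fractions in Lemma 5 we used
$$M_n^{(1)} = \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix}, \quad M_n^{(2)} = \begin{pmatrix} 0 & c_{n-1} c_n b_n \\ 1 & c_n a_n \end{pmatrix}, \quad U_n = \begin{pmatrix} 1 & 0 \\ 0 & c_{n-1} \end{pmatrix},$$
so that $c_{n-1} M_n^{(1)} U_{n+1} = U_n M_n^{(2)}$, and since we deal with Mobius transformations, where scalar matrices act
as the identity, we have $M_n^{(1)} U_{n+1} \equiv U_n M_n^{(2)}$.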

Other than this important example, we have two more - one to move to upper triangular matrices, and one
to continued fraction matrices, both of which are helpful when we need to take product of many such matrices.
In the upper triangular case, the diagonal is just a product of the diagonals and the only complicated part
is in the upper right corner. In the continued fraction form, as we already saw, there is a natural recursion
relation, which will be very helpful once we start to do the actual computations, and we will start with it.
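Theorem 19. [11] Let
$$M_n = \begin{pmatrix} a_n & b_n \\ c_n & d_n \end{pmatrix}, \quad U_n = \begin{pmatrix} 1 & a_n \\ 0 & c_n \end{pmatrix}, \quad a_n, b_n, c_n, d_n \in \mathbb{C},$$
such that $c_n \ne 0$ for all $n \ge 1$ (so that $U_n$ is invertible). Then $M_n$ is $U_n$-coboundary equivalent to the
continued fraction matrix
$$U_n^{-1} M_n U_{n+1} = \begin{pmatrix} 0 & -\frac{c_{n+1}}{c_n} \det(M_n) \\ 1 & a_{n+1} + d_n \frac{c_{n+1}}{c_n} \end{pmatrix}.$$
In particular, setting $\begin{pmatrix} p_n \\ q_n \end{pmatrix} = M_1 M_2 \cdots M_n \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, both of the $p_n$ and $q_n$ satisfy the same recurrence relation
$$q_{n+1} = q_n \left(a_{n+1} + \frac{c_{n+1}}{c_n} d_n\right) - q_{n-1} \frac{c_{n+1}}{c_n} \det(M_n), \quad q_0 = 0, \; q_1 = c_1,$$
$$p_{n+1} = p_n \left(a_{n+1} + \frac{c_{n+1}}{c_n} d_n\right) - p_{n-1} \frac{c_{n+1}}{c_n} \det(M_n), \quad p_0 = 1, \; p_1 = a_1.$$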

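Proof. The computation of $U_n^{-1} M_n U_{n+1}$ is straightforward:
$$U_n^{-1} M_n U_{n+1} = \frac{1}{c_n} \begin{pmatrix} c_n & -a_n \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a_n & b_n \\ c_n & d_n \end{pmatrix} \begin{pmatrix} 1 & a_{n+1} \\ 0 & c_{n+1} \end{pmatrix} = \frac{1}{c_n} \begin{pmatrix} 0 & -\det(M_n) \\ c_n & d_n \end{pmatrix} \begin{pmatrix} 1 & a_{n+1} \\ 0 & c_{n+1} \end{pmatrix}$$
$$= \frac{1}{c_n} \begin{pmatrix} 0 & -c_{n+1} \det(M_n) \\ c_n & c_n a_{n+1} + d_n c_{n+1} \end{pmatrix} = \begin{pmatrix} 0 & -\frac{c_{n+1}}{c_n} \det(M_n) \\ 1 & a_{n+1} + d_n \frac{c_{n+1}}{c_n} \end{pmatrix}.$$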

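Writing $\tilde M_n := U_n^{-1} M_n U_{n+1}$, and noting that $U_{n+1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, the coboundary equivalence gives us
$$\begin{pmatrix} p_n \\ q_n \end{pmatrix} = \left(\prod_{k=1}^{n} M_k\right) U_{n+1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = U_1 \left(\prod_{k=1}^{n} \tilde M_k\right) \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
Since $\tilde M_{n+1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we also get that
$$U_1 \left[\prod_{k=1}^{n} \tilde M_k\right] \begin{pmatrix} 0 \\ 1 \end{pmatrix} = U_1 \left[\prod_{k=1}^{n} \tilde M_k\right] \tilde M_{n+1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} p_{n+1} \\ q_{n+1} \end{pmatrix}.$$
In other words, we got that
$$U_1 \prod_{k=1}^{n} \tilde M_k = \begin{pmatrix} p_n & p_{n+1} \\ q_n & q_{n+1} \end{pmatrix}.$$
This implies the matrix recurrence relation
$$\begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix} \tilde M_n = \begin{pmatrix} p_n & p_{n+1} \\ q_n & q_{n+1} \end{pmatrix},$$
which translates to the recurrence relations
$$q_{n+1} = q_n \left(a_{n+1} + \frac{c_{n+1}}{c_n} d_n\right) - q_{n-1} \frac{c_{n+1}}{c_n} \det(M_n),$$
$$p_{n+1} = p_n \left(a_{n+1} + \frac{c_{n+1}}{c_n} d_n\right) - p_{n-1} \frac{c_{n+1}}{c_n} \det(M_n),$$
and starting conditions $p_0 = 1$, $p_1 = a_1$, $q_0 = 0$, $q_1 = c_1$.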
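The recurrence of Theorem 19 can be compared against the direct matrix product. The following Python sketch is illustrative only; the random integer matrices, the fixed seed and the helper names `direct_pq` and `recurrence_pq` are assumptions of this example (list index $0$ stores $M_1$).

```python
from fractions import Fraction
import random

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def direct_pq(Ms, n):
    """(p_n, q_n) = first column of M_1 M_2 ... M_n."""
    P = ((1, 0), (0, 1))
    for M in Ms[:n]:
        P = matmul(P, M)
    return P[0][0], P[1][0]

def recurrence_pq(Ms):
    """p_0..p_N and q_0..q_N computed from the recurrence of Theorem 19."""
    p = [Fraction(1), Fraction(Ms[0][0][0])]   # p_0 = 1, p_1 = a_1
    q = [Fraction(0), Fraction(Ms[0][1][0])]   # q_0 = 0, q_1 = c_1
    for n in range(1, len(Ms)):                # this step produces index n+1
        (an, bn), (cn, dn) = Ms[n - 1]         # entries of M_n
        (an1, _), (cn1, _) = Ms[n]             # entries of M_{n+1}
        r = Fraction(cn1, cn)
        det = an*dn - bn*cn
        p.append(p[n] * (an1 + r*dn) - p[n - 1] * r * det)
        q.append(q[n] * (an1 + r*dn) - q[n - 1] * r * det)
    return p, q

rng = random.Random(1)  # fixed seed, purely illustrative
# integer matrices (a b; c d) with c != 0, so that every U_n is invertible
Ms = [((rng.randint(-4, 4), rng.randint(-4, 4)),
       (rng.choice([-3, -2, -1, 1, 2, 3]), rng.randint(-4, 4)))
      for _ in range(10)]
p, q = recurrence_pq(Ms)
for n in range(len(Ms) + 1):
    assert (p[n], q[n]) == direct_pq(Ms, n)
```

Although the coefficients of the recurrence are rational, the values agree with the integer entries of the matrix product.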
Remark 20. In the theorem above, if the entries of Mn are integers but cn is not constant, then in general
the coefficients in the recurrence will not be integers. However, the pn , qn solutions will still be integers, since
we used the product of the Mi integral matrices to define them.
We will mainly be interested in the case where the entries of the matrices are polynomials evaluated at
the integer points, so for example in the previous case $c_n = c(n)$ where $c \in \mathbb{C}[x]$. In particular, unless $c \equiv 0$,
in which case the $M_n$ are upper triangular, we can always apply this transformation for all $n$ large enough.
However, we would actually prefer to work with upper triangular matrices, since it is easy to multiply them,
and in particular $\prod_{i=1}^{n} \begin{pmatrix} 1 & \alpha_i \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & \sum_1^n \alpha_i \\ 0 & 1 \end{pmatrix}$.
Our next goal is to show when we can transform a sequence of matrices into upper triangular. Recall
that a standard matrix is conjugated to an upper triangular matrix if and only if it has a nonzero eigenvector
$v^{tr} M = \lambda v^{tr}$. Here we also have the indexed analogue, which while at first glance seems a bit trivial, when
we add the right restrictions, will become quite helpful.
Definition 21. Let $M_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix}$ be a sequence of matrices. We say that a sequence $v(i) = (F_i, G_i)$ of
nonzero vectors is a (left) eigenvector with eigenvalue $\lambda(i)$ if
$$v(i) M_i = \lambda(i)\, v(i+1).$$
We similarly define right eigenvector and right eigenvalue by the formula
$$M_i\, u(i+1) = \alpha(i)\, u(i).$$

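$$\begin{array}{ccccccc}
 & v(i-1) & & \lambda(i)\, v(i) := v(i-1) M_{i-1} & & \lambda(i+1)\, v(i+1) := v(i) M_i & \\
\cdots \longrightarrow & (i-1) & \xrightarrow{M_{i-1}} & (i) & \xrightarrow{M_i} & (i+1) & \xrightarrow{M_{i+1}} \cdots
\end{array}$$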

Figure 2: We should think of left eigenvectors v (i) as being at the i-th position, and multiplying by Mi from
the right “moves” them to the i + 1 position. Right eigenvector goes similarly but from i + 1 to i with Mi
multiplying from the left.

Unlike standard eigenvectors, in our case it is very easy to find eigenvectors by simply defining $v(i+1) = \frac{1}{\lambda(i)} v(i) M_i$ recursively. However, the problem is finding an eigenvector which is easy to work with. We
already saw one such example when our $M_i$ had continued fraction form, in which case a 1-left eigenvector
is simply a solution to
$$(F_{i-1}, F_i) \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix} = (F_i, F_{i+1}),$$
or alternatively $F_{i+1} = a_i F_i + b_i F_{i-1}$. If $F_i$ is just any sequence, then it would be very hard to work with,
however if $b_i, a_i$ are polynomial in $i$, we might find $F_i$ which is polynomial or exponential in $i$.
 
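Example 22. Consider a generalized continued fraction from the trivial Euler family $M_i = \begin{pmatrix} 0 & -h_1(i) h_2(i) \\ 1 & h_1(i) + h_2(i+1) \end{pmatrix}$
(see section 2.2). Then it has both an $h_2(i)$-left and an $h_1(i)$-right eigenvector:
$$(1, h_2(i)) \cdot \begin{pmatrix} 0 & -h_1(i) h_2(i) \\ 1 & h_1(i) + h_2(i+1) \end{pmatrix} = h_2(i) \cdot (1, h_2(i+1)),$$
$$\begin{pmatrix} 0 & -h_1(i) h_2(i) \\ 1 & h_1(i) + h_2(i+1) \end{pmatrix} \cdot \begin{pmatrix} h_2(i+1) \\ -1 \end{pmatrix} = h_1(i) \cdot \begin{pmatrix} h_2(i) \\ -1 \end{pmatrix}.$$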

In particular if h1 , h2 ∈ Z [x], then both the eigenvalues and eigenvectors are integral.

Note that in the example above, the left and right eigenvectors at the $i$-position are perpendicular:
$$(1, h_2(i)) \cdot \begin{pmatrix} h_2(i) \\ -1 \end{pmatrix} = 0.$$

This is not a coincidence, and it happens for any sequence of matrices, and we can use it to triangularize it.
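Lemma 23. Let $M_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix}$ be any sequence of matrices. Then $(G_i, F_i)$ is a left $\lambda_i$-eigenvector if and only
if $\begin{pmatrix} F_i \\ -G_i \end{pmatrix}$ is a right $\alpha_i$-eigenvector. Moreover, if the $F_i \ne 0$, then setting $U_i = \begin{pmatrix} F_i^{-1} & 0 \\ G_i & F_i \end{pmatrix}$ we get that
$$U_i M_i U_{i+1}^{-1} = \begin{pmatrix} \alpha_i & \frac{b_i}{F_i F_{i+1}} \\ 0 & \lambda_i \end{pmatrix},$$
so in particular $\alpha_i \lambda_i = \det(M_i)$.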


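Proof. Suppose that $M_i$ has a $\lambda_i$-left eigenvector $(G_i, F_i)$. Then
$$(G_i, F_i)\, M_i \begin{pmatrix} F_{i+1} \\ -G_{i+1} \end{pmatrix} = \lambda_i\, (G_{i+1}, F_{i+1}) \begin{pmatrix} F_{i+1} \\ -G_{i+1} \end{pmatrix} = 0,$$
implying that $M_i \begin{pmatrix} F_{i+1} \\ -G_{i+1} \end{pmatrix} \perp (G_i, F_i)$. Since we are in dimension 2, the perpendicular of a nonzero vector is
a 1-dimensional space, so that $M_i \begin{pmatrix} F_{i+1} \\ -G_{i+1} \end{pmatrix} = \alpha_i \begin{pmatrix} F_i \\ -G_i \end{pmatrix}$ for some scalar $\alpha_i$, namely it is a right eigenvector.
In our choice of $U_i$ the row $e_2^{tr} U_i$ is the left eigenvector, and since $\det(U_{i+1}) = 1$, we get that $U_{i+1}^{-1} = \begin{pmatrix} F_{i+1} & 0 \\ -G_{i+1} & F_{i+1}^{-1} \end{pmatrix}$, so that $U_{i+1}^{-1} e_1$ is the right eigenvector. We now get that
$$e_2^{tr}\, U_i M_i U_{i+1}^{-1} = \lambda_i\, e_2^{tr}\, U_{i+1} U_{i+1}^{-1} = \lambda_i\, e_2^{tr},$$
$$U_i M_i U_{i+1}^{-1}\, e_1 = \alpha_i\, U_i U_i^{-1} e_1 = \alpha_i\, e_1,$$
$$e_1^{tr}\, U_i M_i U_{i+1}^{-1}\, e_2 = \left(F_i^{-1} e_1^{tr}\right) M_i \left(F_{i+1}^{-1} e_2\right) = \frac{b_i}{F_i F_{i+1}}.$$
Hence, all together we get that
$$U_i M_i U_{i+1}^{-1} = \begin{pmatrix} \alpha_i & \frac{b_i}{F_i F_{i+1}} \\ 0 & \lambda_i \end{pmatrix},$$
which completes the proof.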
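The triangularization of Lemma 23 can be checked on the Euler family of Example 22, where the left eigenvector is $(G_i, F_i) = (1, h_2(i))$. The Python sketch below is illustrative only: the concrete choices of `h1` and `h2` are hypothetical sample functions, and the helper names are introduced just for this example.

```python
from fractions import Fraction

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def inverse(A):
    (a, b), (c, d) = A
    det = a*d - b*c
    return ((d/det, -b/det), (-c/det, a/det))

# hypothetical sample choices; any h1, h2 with h2(i) != 0 would do
def h1(i):
    return Fraction(i * i + 1)

def h2(i):
    return Fraction(2 * i + 3)

def M(i):
    """Euler family matrix (0 -h1 h2; 1 h1(i) + h2(i+1))."""
    return ((Fraction(0), -h1(i) * h2(i)), (Fraction(1), h1(i) + h2(i + 1)))

def U(i):
    """U_i = (1/F_i 0; G_i F_i) for the left eigenvector (G_i, F_i) = (1, h2(i))."""
    F, G = h2(i), Fraction(1)
    return ((1 / F, Fraction(0)), (G, F))

for i in range(1, 10):
    T = matmul(matmul(U(i), M(i)), inverse(U(i + 1)))
    # upper triangular, with alpha_i = h1(i) and lambda_i = h2(i) on the diagonal
    assert T == ((h1(i), -h1(i) / h2(i + 1)), (Fraction(0), h2(i)))
    # alpha_i * lambda_i = det(M_i)
    assert T[0][0] * T[1][1] == h1(i) * h2(i)
```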


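To fully utilize this triangularization let’s recall the formula for computing a product of triangular matrices.
Claim 24. For sequences $\alpha_i, \beta_i, \gamma_i$ with $\alpha_i, \gamma_i \ne 0$ we have that
$$\prod_{i=1}^{n-1} \begin{pmatrix} \alpha_i & \beta_i \\ 0 & \gamma_i \end{pmatrix} = \begin{pmatrix} \prod_1^{n-1} \alpha_i & c_n \\ 0 & \prod_1^{n-1} \gamma_i \end{pmatrix}, \qquad c_n = \sum_{k=1}^{n-1} \left(\prod_{i=1}^{k-1} \alpha_i\right) \beta_k \left(\prod_{i=k+1}^{n-1} \gamma_i\right).$$
In particular, as a Mobius map we get
$$\left[\prod_{i=1}^{n-1} \begin{pmatrix} \alpha_i & \beta_i \\ 0 & \gamma_i \end{pmatrix}\right](0) = \sum_{k=1}^{n-1} \frac{\beta_k}{\gamma_k} \prod_{i=1}^{k-1} \frac{\alpha_i}{\gamma_i}.$$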
Proof. A simple induction.
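For readers who prefer to see the induction confirmed on an example, the following Python sketch compares the matrix product with the closed formula for the corner entry and with the Mobius sum. It is only an illustration; the random data, the fixed seed and the helper names are assumptions of this example.

```python
from fractions import Fraction
import random

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def triangular_product(alpha, beta, gamma):
    """Product of the matrices (alpha_i beta_i; 0 gamma_i), in order."""
    P = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for a, b, g in zip(alpha, beta, gamma):
        P = matmul(P, ((a, b), (Fraction(0), g)))
    return P

def corner(alpha, beta, gamma):
    """sum_k (prod_{i<k} alpha_i) * beta_k * (prod_{i>k} gamma_i)."""
    total = Fraction(0)
    for k in range(len(alpha)):
        term = beta[k]
        for i in range(k):
            term *= alpha[i]
        for i in range(k + 1, len(alpha)):
            term *= gamma[i]
        total += term
    return total

def mobius_sum(alpha, beta, gamma):
    """sum_k (beta_k / gamma_k) * prod_{i<k} (alpha_i / gamma_i)."""
    total, ratio = Fraction(0), Fraction(1)
    for a, b, g in zip(alpha, beta, gamma):
        total += ratio * b / g
        ratio *= a / g
    return total

rng = random.Random(2)  # fixed seed, purely illustrative
nonzero = [-3, -2, -1, 1, 2, 3]
alpha = [Fraction(rng.choice(nonzero)) for _ in range(7)]
gamma = [Fraction(rng.choice(nonzero)) for _ in range(7)]
beta = [Fraction(rng.randint(-3, 3)) for _ in range(7)]

P = triangular_product(alpha, beta, gamma)
assert P[1][0] == 0 and P[0][1] == corner(alpha, beta, gamma)
assert P[0][1] / P[1][1] == mobius_sum(alpha, beta, gamma)
```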

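Example 25. The last two results can be combined together, for example, to reprove the conversion from
continued fractions in the Euler family from Example 11 to infinite sum. Given two functions $h_1, h_2 : \mathbb{Z} \to \mathbb{C}$
set
$$b_i = -h_1(i) h_2(i), \qquad a_i = h_1(i) + h_2(i+1),$$
and let $M_i = \begin{pmatrix} 0 & b_i \\ 1 & a_i \end{pmatrix}$. We already saw in Example 22 that this sequence has the $h_2(i)$-left eigenvector $(1, h_2(i))$,
so Lemma 23 implies that for $U_i = \begin{pmatrix} h_2(i)^{-1} & 0 \\ 1 & h_2(i) \end{pmatrix}$ we have
$$U_i M_i U_{i+1}^{-1} = \begin{pmatrix} h_1(i) & -\frac{h_1(i)}{h_2(i+1)} \\ 0 & h_2(i) \end{pmatrix}.$$
Since the product of these matrices telescopes to $U_1 \left[\prod_1^{n-1} M_i\right] U_n^{-1}$, Claim 24 shows that
$$\left(U_1 \left[\prod_{i=1}^{n-1} M_i\right] U_n^{-1}\right)(0) = -\frac{1}{h_2(1)} \cdot \sum_{k=1}^{n-1} \left(\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}\right).$$
Set $\alpha = \sum_{k=0}^{n-1} \left(\prod_{i=1}^{k} \frac{h_1(i)}{h_2(i+1)}\right)$, so the expression above is $\frac{1-\alpha}{h_2(1)}$.
Since $U_n(0) = 0$, we conclude that
$$\mathbb{K}_1^{n-1} \frac{b_i}{a_i} = \left[\prod_{i=1}^{n-1} M_i\right](0) = U_1^{-1}\left(\frac{1-\alpha}{h_2(1)}\right) = \begin{pmatrix} h_2(1) & 0 \\ -1 & h_2(1)^{-1} \end{pmatrix}\left(\frac{1-\alpha}{h_2(1)}\right) = \frac{1-\alpha}{\frac{1}{h_2(1)}(\alpha - 1) + \frac{1}{h_2(1)}} = h_2(1)\left(\alpha^{-1} - 1\right),$$
which is exactly what we got in Example 11.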


Other than the triangularization mentioned above, there are other more direct applications for these eigenvectors. For example, we can use them to show that the numerators and denominators in a given continued
fraction have a very large common divisor.
 
Example 26. For example, let’s consider the $\zeta(d)$ case with the matrices $M_i = \begin{pmatrix} 0 & -i^{2d} \\ 1 & i^d + (1+i)^d \end{pmatrix}$ (see Example 11),
where there is a right $i^d$-eigenvector $\begin{pmatrix} (1+i)^d \\ -1 \end{pmatrix}$, namely $M_i \begin{pmatrix} (1+i)^d \\ -1 \end{pmatrix} = i^d \begin{pmatrix} i^d \\ -1 \end{pmatrix}$. Any 1-left eigenvector is simply a sequence $F_i$ satisfying the
recurrence
$$(F_{i-1}, F_i)\, M_i = (F_i, F_{i+1}),$$
or equivalently
$$F_{i+1} = \left(i^d + (1+i)^d\right) F_i - i^{2d} F_{i-1}.$$
In particular, the $p_n$ and $q_n$ sequences satisfy this relation, where
$$\prod_{i=1}^{n-1} M_i = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix},$$
namely they are the eigenvectors which start with $(p_0, p_1) = (1, 0)$ and $(q_0, q_1) = (0, 1)$ respectively.
Using the process above we get that
$$(F_n, F_{n+1}) \cdot \begin{pmatrix} (1+n)^d \\ -1 \end{pmatrix} = (F_0, F_1) \left[\prod_{i=1}^{n} M_i\right] \begin{pmatrix} (1+n)^d \\ -1 \end{pmatrix} = (F_0, F_1) \begin{pmatrix} 1 \\ -1 \end{pmatrix} \prod_{i=1}^{n} i^d.$$
If $F_0, F_1$ are integers (e.g. the starting points for $p_n$ or $q_n$), then $(1+n)^d F_n - F_{n+1} = (n!)^d \cdot C$ for some
constant integer $C$. We claim that this implies that $\left(\frac{n!}{\mathrm{lcm}[n]}\right)^d$ divides $F_n$. This is clearly true for $n = 1, 2, 3$,
where $\left(\frac{n!}{\mathrm{lcm}[n]}\right)^d = 1$. Proving the rest by induction, assume that this is true for $n$ and we prove for $n + 1$.
We then have that $F_{n+1} = (1+n)^d F_n - (n!)^d C$, so it is enough to prove the claim for each summand. For
the first we have
$$\left(\frac{(n+1)!}{\mathrm{lcm}[n+1]}\right)^d \;\Big|\; (1+n)^d \left(\frac{n!}{\mathrm{lcm}[n]}\right)^d \;\Big|\; (1+n)^d F_n.$$
For the second, we want to show that there is some integer $m$ such that $\left(\frac{(n+1)!}{\mathrm{lcm}[n+1]}\right)^d m = (n!)^d C$, which is
equivalent to $(n+1)^d \cdot m = C \left(\mathrm{lcm}[n+1]\right)^d$. Since $(n+1) \mid \mathrm{lcm}[n+1]$, we can find such $m$ and we are done.
In other words, we have shown that no matter what the starting conditions are, we always get that
$\left(\frac{n!}{\mathrm{lcm}[n]}\right)^d \mid F_n$, so in particular $\left(\frac{n!}{\mathrm{lcm}[n]}\right)^d \mid \gcd(p_n, q_n)$.
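The divisibility claim is easy to test by running the recurrence with exact integers. The Python sketch below is an illustration only (the function names are hypothetical, and `math.lcm` with several arguments needs Python 3.9 or later); it checks both the identity $(1+n)^d F_n - F_{n+1} = (n!)^d (F_0 - F_1)$ and the divisibility by $\left(\frac{n!}{\mathrm{lcm}[n]}\right)^d$ for the two standard starting conditions.

```python
from math import factorial, lcm

def lcm_upto(n):
    """lcm[n] = lcm{1, 2, ..., n}."""
    return lcm(*range(1, n + 1))

def sequence(F0, F1, d, N):
    """F_0, ..., F_N with F_{i+1} = (i^d + (1+i)^d) F_i - i^(2d) F_{i-1}."""
    F = [F0, F1]
    for i in range(1, N):
        F.append((i**d + (1 + i)**d) * F[i] - i**(2 * d) * F[i - 1])
    return F

for d in (2, 3):
    for F0, F1 in ((1, 0), (0, 1)):   # starting conditions of p_n and q_n
        F = sequence(F0, F1, d, 30)
        for n in range(1, 30):
            # pairing with the right eigenvector picks up the factor (n!)^d
            assert (1 + n)**d * F[n] - F[n + 1] == factorial(n)**d * (F0 - F1)
            # (n! / lcm[n])^d divides F_n
            assert F[n] % (factorial(n) // lcm_upto(n))**d == 0
```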

Part II
The conservative matrix field
5 Definition and properties
Up until now we mainly looked at a single continued fraction $\mathbb{K}_1^{\infty} \frac{b_i}{a_i}$, and in particular where $a_i = a(i)$, $b_i = b(i)$ with $a, b \in \mathbb{Z}[x]$. In this section we define the conservative matrix field, which is a collection of such
continued fractions with interesting connections between them.
Definition 27. A pair of matrices MX (x, y) , MY (x, y) is called a conservative matrix field (or just
matrix field for simplicity), if

1. The entries of MX (x, y) , MY (x, y) are polynomial in x, y ,


2. The matrices satisfy the coboundary equivalence relation

MX (x, y) MY (x + 1, y) = MY (x, y) MX (x, y + 1) ∀x, y.

Remark 28. For the reason for the name “conservative matrix field” consider the coboundary equivalence
relation as the following commutative diagram

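$$\begin{array}{ccc}
(x, y+1) & \xrightarrow{M_X(x,\, y+1)} & (x+1, y+1) \\
\uparrow {\scriptstyle M_Y(x,\, y)} & & \uparrow {\scriptstyle M_Y(x+1,\, y)} \\
(x, y) & \xrightarrow{M_X(x,\, y)} & (x+1, y)
\end{array}$$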

As this is true for any (x, y), when we think about them as points in the plane, the intuition should be that
traveling along the bottom and then right edge or traveling along the left and then top edge should result in
the same product. This is very similar to what we expect from the standard conservative vector fields (and
indeed, both are 1-cocycles with the appropriate groups), and in order to keep this intuition in mind, it got
the name conservative matrix field.

In these matrix fields, we will in particular be interested in the case where for each fixed $y = y_0$, the
sequence $\{M_X(n, y_0)\}_{n=1}^{\infty}$ has the continued fraction form. Then, the continued fractions on the integer rows
$y_0 \in \mathbb{Z}$ are coboundary equivalent continued fractions via nice polynomial matrices $M_Y$. In general, if we
manage to construct such a matrix field where one of its horizontal lines is a continued fraction presentation
for some constant $\alpha$, we can look at the rest of the matrix field for other properties of $\alpha$. In particular, in
Section 6 we will construct such a matrix field for $\zeta(3)$, where its Euler continued fraction (from Example 11)
is on the $Y = 0$ line, and see for example that the $Y = m$ lines converge to $\sum_{k=m}^{\infty} \frac{1}{k^3}$, and the diagonal line
$X = Y$ can be used to define another continued fraction presentation where the convergents converge to $\zeta(3)$
fast enough to show that it is irrational.
With this intuition in mind, we start with a construction for specific matrix fields with many interesting
properties in section 5.1. Then in section 5.2 we find out how every such matrix field comes with its dual,
which is in a sense a reflection through the x = y line. Once we have this dual matrix field, we study the
numerators and denominators of the continued fractions in that matrix field, and in particular find their
greatest common divisors. Finally we show how to put everything together in Section 6 to show that ζ (3) is
irrational.

5.1 The construction
Definition 29. Let $f(x, y), \bar f(x, y) \in \mathbb{C}[x, y]$ be two polynomials. We say these polynomials are conjugate
if they satisfy the following two conditions:

1. Linear condition: The polynomials satisfy
$$f(x, y) - f(x+1, y-1) = \bar f(x+1, y) - \bar f(x, y-1).$$
Given such polynomials, $a_{f, \bar f} := a(x, y)$ is well defined by
$$a(x, y) = f(x+1, y-1) - \bar f(x, y-1) = f(x, y) - \bar f(x+1, y).$$

2. Quadratic condition:
$$(f \bar f)(x, y) + (f \bar f)(0, 0) = (f \bar f)(x, 0) + (f \bar f)(0, y).$$
In other words, all the monomials appearing in $(f \bar f)(x, y)$ are $x^n$ and $y^m$ for $n, m \in \mathbb{N}$.
We denote by $b_{f, \bar f} := b(x)$ the polynomial
$$b(x) = (f \bar f)(x, y) - (f \bar f)(0, y) = (f \bar f)(x, 0) - (f \bar f)(0, 0),$$
which only depends on $x$. We will usually also have that $(f \bar f)(0, 0) = 0$, so that $b(x) = (f \bar f)(x, 0)$.

Given two such conjugate polynomials, we define
$$M_X^{cf}(x, y) = \begin{pmatrix} 0 & b(x) \\ 1 & a(x, y) \end{pmatrix}, \qquad M_Y^{cf}(x, y) = \begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix}.$$
The cf superscript is to indicate that $M_X^{cf}$ is in a continued fraction form. We will shortly change it a
little bit and remove these cf.

Remark 30. If $(f \bar f)(0, 0) = 0$, then the $y = 0$ line is the continued fraction with $b_i = (f \bar f)(i, 0)$ and $a_i = f(i, 0) - \bar f(i+1, 0)$, which is in the trivial Euler family defined in Example 11. Indeed, just take $h_1(x) = f(x, 0)$
and $h_2(x) = -\bar f(x, 0)$. Using the second presentation of $a(x, y)$, the $y = 1$ line is a continued fraction with
$b_i = (f \bar f)(i, 0)$ and $a_i = f(i+1, 0) - \bar f(i, 0)$, which is again in the trivial Euler family, this time with the
switched roles $h_1(x) = -\bar f(x, 0)$ and $h_2(x) = f(x, 0)$.
Also, recall that finding an “Euler” presentation for a continued fraction is in a sense a generalization of
solving a quadratic equation $x^2 + ax + b = 0$, where the roots $\lambda_1, \lambda_2$ satisfy $\lambda_1 \lambda_2 = b$ and $-(\lambda_1 + \lambda_2) = a$.
The structure defined above should be considered as an even further generalization of this concept. Indeed,
starting with b (x) and a (x, y), we look for f, f¯ such that

b (x) = f (x, 0) f¯ (x, 0)


a (x, y) = f (x, y) − f¯ (x + 1, y) .

With this point of view, the term “conjugates” should be more natural, since in a way f, f¯ are roots of a
quadratic equation (though with polynomials coefficients).

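Example 31 (The $\zeta(3)$ matrix field). The main example that we should have in mind is a matrix field for
$\zeta(3)$ defined by
$$f(x, y) = x^3 + 2x^2 y + 2x y^2 + y^3 = (y + x)\, \frac{y^3 - x^3}{y - x},$$
$$\bar f(x, y) = -x^3 + 2x^2 y - 2x y^2 + y^3 = (y - x)\, \frac{y^3 + x^3}{y + x} = f(-x, y),$$
$$(f \bar f)(x, y) = y^6 - x^6,$$
$$b(x) = -x^6,$$
$$a(x, y) = x^3 + (1 + x)^3 + 2y(y - 1)(2x + 1).$$
In particular, for $y = 0, 1$ we have the continued fraction $b(n) = -n^6$ and $a(n, 0) = n^3 + (1 + n)^3$, which
is exactly the Euler continued fraction which converges to $\frac{1}{\zeta(3)} - 1$, as we saw in Example 11. We will see
in Section 6 that for any fixed integer $y = m \ge 1$, the continued fraction with $b_n = b(n)$ and $a_n = a(n, m)$
converges to $\frac{1}{\sum_{k=m}^{\infty} \frac{1}{k^3}} - a(0, m)$, where $a(0, m) = 1 + 2m(m-1)$; for $m = 1$ this is again $\frac{1}{\zeta(3)} - 1$.
The polynomial matrices in this matrix field are
$$M_X^{cf}(x, y) = \begin{pmatrix} 0 & b(x) \\ 1 & a(x, y) \end{pmatrix} = \begin{pmatrix} 0 & -x^6 \\ 1 & x^3 + (1+x)^3 + 2y(y-1)(2x+1) \end{pmatrix},$$
$$M_Y^{cf}(x, y) = \begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix} = \begin{pmatrix} \frac{y^3 + x^3}{y + x}(y - x) & -x^6 \\ 1 & \frac{y^3 - x^3}{y - x}(y + x) \end{pmatrix},$$
and the first few of them are listed below. In the grid of points $(x, y)$, the horizontal arrow $(x, y) \to (x+1, y)$ is labeled by $M_X^{cf}(x, y)$ and the vertical arrow $(x, y) \to (x, y+1)$ by $M_Y^{cf}(x, y)$.
$$\begin{array}{c|ccc}
M_X^{cf}(x, y) & x = 1 & x = 2 & x = 3 \\ \hline
y = 3 & \left(\begin{smallmatrix} 0 & -1 \\ 1 & 45 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -2^6 \\ 1 & 95 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -3^6 \\ 1 & 175 \end{smallmatrix}\right) \\
y = 2 & \left(\begin{smallmatrix} 0 & -1 \\ 1 & 21 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -2^6 \\ 1 & 55 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -3^6 \\ 1 & 119 \end{smallmatrix}\right) \\
y = 1 & \left(\begin{smallmatrix} 0 & -1 \\ 1 & 9 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -2^6 \\ 1 & 35 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -3^6 \\ 1 & 91 \end{smallmatrix}\right)
\end{array}$$
$$\begin{array}{c|ccc}
M_Y^{cf}(x, y) & x = 1 & x = 2 & x = 3 \\ \hline
y = 3 & \left(\begin{smallmatrix} 14 & -1 \\ 1 & 52 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 7 & -2^6 \\ 1 & 95 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -3^6 \\ 1 & 162 \end{smallmatrix}\right) \\
y = 2 & \left(\begin{smallmatrix} 3 & -1 \\ 1 & 21 \end{smallmatrix}\right) & \left(\begin{smallmatrix} 0 & -2^6 \\ 1 & 48 \end{smallmatrix}\right) & \left(\begin{smallmatrix} -7 & -3^6 \\ 1 & 95 \end{smallmatrix}\right) \\
y = 1 & \left(\begin{smallmatrix} 0 & -1 \\ 1 & 6 \end{smallmatrix}\right) & \left(\begin{smallmatrix} -3 & -2^6 \\ 1 & 21 \end{smallmatrix}\right) & \left(\begin{smallmatrix} -14 & -3^6 \\ 1 & 52 \end{smallmatrix}\right)
\end{array}$$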
We continue to show that this general construction produces a conservative matrix field.
Theorem 32. Given polynomials $f, \bar f, a, b$ as in Definition 29 and the matrices
$$M_X^{cf}(x, y) = \begin{pmatrix} 0 & b(x) \\ 1 & a(x, y) \end{pmatrix}, \qquad M_Y^{cf}(x, y) = \begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix},$$
then the following hold:

1. The matrices satisfy the coboundary equivalence condition
$$M_X^{cf}(x, y)\, M_Y^{cf}(x+1, y) = M_Y^{cf}(x, y)\, M_X^{cf}(x, y+1) \quad \forall x, y.$$

2. The determinants of $M_X^{cf}(x, y), M_Y^{cf}(x, y)$ are only functions of $x, y$ respectively, and more specifically:
$$\det\left(M_X^{cf}(x, y)\right) = -b(x),$$
$$\det\left(M_Y^{cf}(x, y)\right) = (f \cdot \bar f)(0, y).$$


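Proof. 1. From the assumption on our functions we know that for all $x, y$ we have
$$a(x, y) = f(x, y) - \bar f(x+1, y) \qquad (1)$$
$$a(x, y+1) = f(x+1, y) - \bar f(x, y) \qquad (2)$$
$$b(x+1) - b(x) = f(x, y)\, a(x, y+1) - a(x, y)\, f(x+1, y) \qquad (3)$$
Using conditions (1) and (2) we get that
$$M_X^{cf}(x, y)\, M_Y^{cf}(x+1, y) = \begin{pmatrix} 0 & b(x) \\ 1 & a(x, y) \end{pmatrix} \begin{pmatrix} \bar f(x+1, y) & b(x+1) \\ 1 & f(x+1, y) \end{pmatrix} \overset{(1)}{=} \begin{pmatrix} b(x) & b(x) f(x+1, y) \\ f(x, y) & b(x+1) + a(x, y) f(x+1, y) \end{pmatrix},$$
$$M_Y^{cf}(x, y)\, M_X^{cf}(x, y+1) = \begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix} \begin{pmatrix} 0 & b(x) \\ 1 & a(x, y+1) \end{pmatrix} \overset{(2)}{=} \begin{pmatrix} b(x) & b(x) f(x+1, y) \\ f(x, y) & b(x) + f(x, y)\, a(x, y+1) \end{pmatrix}.$$
The two matrices are the same using (3) from above.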
2. Simple computation.
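Both parts of the theorem can be spot-checked at integer points. In the Python sketch below, which is only an illustration, `cf_field` is a hypothetical helper that builds $M_X^{cf}$ and $M_Y^{cf}$ from a candidate pair $f, \bar f$ using $a(x,y) = f(x,y) - \bar f(x+1,y)$ and $b(x) = (f\bar f)(x,0) - (f\bar f)(0,0)$; the pair labeled `broken` is an assumed negative control that violates the linear condition.

```python
def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def det(A):
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def cf_field(f, fbar):
    """Build M_X^cf, M_Y^cf (and f*fbar, b) from a candidate pair f, fbar."""
    ffbar = lambda x, y: f(x, y) * fbar(x, y)
    a = lambda x, y: f(x, y) - fbar(x + 1, y)
    b = lambda x: ffbar(x, 0) - ffbar(0, 0)
    MX = lambda x, y: ((0, b(x)), (1, a(x, y)))
    MY = lambda x, y: ((fbar(x, y), b(x)), (1, f(x, y)))
    return MX, MY, ffbar, b

def is_conservative(MX, MY, grid):
    """Check M_X(x,y) M_Y(x+1,y) == M_Y(x,y) M_X(x,y+1) on a grid of points."""
    return all(matmul(MX(x, y), MY(x + 1, y)) == matmul(MY(x, y), MX(x, y + 1))
               for x in grid for y in grid)

zeta3 = cf_field(lambda x, y: x**3 + 2*x**2*y + 2*x*y**2 + y**3,
                 lambda x, y: -x**3 + 2*x**2*y - 2*x*y**2 + y**3)
linear = cf_field(lambda x, y: x + y, lambda x, y: x - y)
# a pair that violates the linear condition, used as a negative control
broken = cf_field(lambda x, y: x + y, lambda x, y: x + y)

grid = range(1, 6)
print(is_conservative(*zeta3[:2], grid),
      is_conservative(*linear[:2], grid),
      is_conservative(*broken[:2], grid))
```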

Example 33. There are many examples of conservative matrix fields, and we give some of them below.
For each pair $f, \bar f$, we also add the $b(x), a(x, y)$ appearing as the continued fraction on the horizontal
lines. In particular, as we saw in Remark 30, when $(f \bar f)(0, 0) = 0$, the $y = 1$ line is in the Euler family from
Example 11, namely $b(n) = -h_1(n) \times h_2(n)$ and $a(n) = h_1(n) + h_2(n+1)$. In these cases we can convert
it to an infinite sum and hopefully use it to compute the value of the continued fraction, which we add in the
examples below (up to a Mobius map). Furthermore, in many cases we think of $\bar f$ as an image under some
nice linear map $g \mapsto \bar g$ of $f$, and when this is the case, we will give this linear map instead of $\bar f$.
1. When both f, f¯ are linear themselves, then solving the linear and quadratic conditions in Definition 29
is elementary (which we show in Appendix B). There are two families of solutions

f (x, y) = A (x + y) + C
f¯ (x, y) = Ā (x − y) + C̄

or

f (x, y) = Ax + By + C
f¯ (x, y) = −Ax + By + C̄

where A, B, C, Ā, C̄ above are the parameters of the families.

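(a) Taking $f(x, y) = x + y$ and $\bar f(x, y) = x - y$, we get $b(x) = x^2$ and $a(x, y) = 2y - 1$. In $y = 1$ we
get the continued fraction
$$\mathbb{K}_1^{\infty} \frac{n^2}{1} = \mathbb{K}_1^{\infty} \frac{-(-n) \times n}{(-n) + (n+1)} = \frac{1}{\sum_{k=0}^{\infty} \prod_{i=1}^{k} \frac{-i}{i+1}} - 1 = \frac{1}{\sum_{k=0}^{\infty} \frac{(-1)^k}{k+1}} - 1 = \frac{1 - \ln(2)}{\ln(2)}.$$
Taking $\bar f(x, y) = y - x$ instead, we get $b(x) = -x^2$ and $a(x, y) = 2x + 1$. Since $a$ is independent of
$y$, all the horizontal lines in the matrix field are the same, so in a sense it is degenerate. Moreover,
trying to compute the continued fraction produces
$$\mathbb{K}_1^{\infty} \frac{-n^2}{2n + 1} = \mathbb{K}_1^{\infty} \frac{-n \times n}{n + (n+1)} = \frac{1}{\sum_{k=0}^{\infty} \prod_{i=1}^{k} \frac{i}{i+1}} - 1 = \frac{1}{\sum_{k=0}^{\infty} \frac{1}{k+1}} - 1 = -1,$$
since the harmonic sum $\sum_{k=0}^{\infty} \frac{1}{k+1}$ diverges to infinity.
(b) For $f(x, y) = x + y$ and $\bar f(x, y) = 1$ (which we can think of as $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \bar f$), we get $b(x) = x$, $a(x, y) = x + y - 1$, and in the $y = 1$ case we get
$$\mathbb{K}_1^{\infty} \frac{n}{n} = \mathbb{K}_1^{\infty} \frac{-((-1) \times n)}{(-1) + (n+1)} = \frac{1}{\sum_{k=0}^{\infty} \prod_{i=1}^{k} \frac{-1}{i+1}} - 1 = \frac{1}{e - 1},$$
which we already saw in Example 8.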


2. When $f, \bar f$ have degree at most 2, then we have the following families of examples (as functions of $C$):
$$\begin{array}{l|l|l|l|l}
\text{operation} & f(x, y) & a(x, y) & b(x) & \text{Euler family } (a(x, 1)) \\ \hline
\bar g(x, y) = -g(-x, y) & x^2 + xy + \frac{y^2}{2} + C(x + y) & (x+1)^2 + x^2 + y(y-1) + C(2y - 1) & -x^2 (x^2 - C^2) & (x+1)(x+1+C) + x(x - C) \\
\bar g(x, y) = g(-x, y) & x^2 + 2xy + 2y^2 + C(2y + x) & (2x+1)(2y - 1 + C) & x^2 (x^2 - C^2) & (x+1)(x+1+C) - x(x - C) \\
\bar g(x, y) = g(x, -y) & x^2 + 2xy + 2y^2 + C(x + y) & (2x + 1 + C)(2y - 1) & x^2 (x + C)^2 & (x+1)(x+C+1) - x(x + C) \\
\bar g(x, y) = -g(x, -y) & \frac{2x^2 + 2xy + y^2 + C(2x + y)}{2} & C(2x+1) + x^2 + (x+1)^2 + y(y-1) & -x^2 (x + C)^2 & (x+1)(x+C+1) + x(x + C)
\end{array}$$
In particular, when taking $C = 0$, the $y = 1$ line is either $b(x) = -x^4$ and $a(x) = x^2 + (1 + x)^2$, or
$b(x) = x^4$ and $a(x) = (x+1)^2 - x^2$. The continued fraction will eventually be transformed (after the
right Mobius action) to the sums $\sum_{n=1}^{\infty} \frac{1}{n^2}$ and $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}$, which are $\zeta(2)$ and $\frac{1}{2} \zeta(2)$ respectively.
3. For the operation $\bar g(x, y) = -g(x - y, y)$ we have
$$f(x, y) = x^2 + xy + \frac{y^2}{2} + x + \frac{y}{2},$$
$$b(x) = -x^2 (x+1)^2,$$
$$a(x, y) = 2(x+1)^2 + y(y - 1).$$
For $y = 1$ we get the continued fraction with $b(x) = -x^2 (x+1)^2$ and $a(x, 1) = 2(x+1)^2$. The
equivalence transformation in Lemma 5 allows us to cancel $x(x+1)$ in $b$ and $(x+1)$ in $a$ twice and get
$$\mathbb{K}_1^{\infty} \frac{-n^2 (n+1)^2}{2(n+1)^2} = \mathbb{K}_1^{\infty} \frac{-1}{2} = -1.$$

4. For degree at most 3, with the action $\bar g(x, y) = g(-x, y)$, we have the family
$$f(x, y) = x^3 + 2x^2 (y - C) + 2x (y - C)^2 + (y - C)^3 - (x + y - C)\, C^2,$$
$$b(x) = -x^2 (x - C)^2 (x + C)^2,$$
$$a(x, y) = x (x - C)^2 + (x+1)(x + 1 + C)^2 + (1 + 2x)(y - 1 - 2C)\, 2y.$$
When $y = 1$ we get a continued fraction in the Euler family with $h_1(x) = x(x + C)^2$ and $h_2(x) = x(x - C)^2$
(on the $y = 0$ line the roles of the two are switched). In particular, in the case where $C = 0$ we simply get the matrix field for $\zeta(3)$ mentioned
in Example 31.

Remark 34. Once we have a pair of conjugate polynomials $f, \bar f$, there are several ways to generate more such
pairs. The simplest way is just to take $cf, c\bar f$ for some $0 \ne c \in \mathbb{C}$. Another less trivial way is to look at the
pair $f(y, x), -\bar f(y, x)$. We shall see in section 5.2 how this new pair is hidden in the same conservative
matrix field.
Right now, while the $M_X^{cf}$ matrix has the known continued fraction form, the $M_Y^{cf}$ matrices have this new
unknown form $\begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix}$. However, we already saw in Theorem 19 how to convert matrices to continued
fraction form, which works best when the bottom left coordinate is constant, like it is in our case. Moreover,
we will show an even stronger result that the continued fraction in $M_X^{cf}$ and the hidden continued fraction
in $M_Y^{cf}$ are both defined very similarly. For that, we use the following notations.
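Notation 35. We define:
$$U_\alpha = \begin{pmatrix} 1 & \alpha \\ 0 & 1 \end{pmatrix}, \qquad D_\alpha = \begin{pmatrix} \alpha & 0 \\ 0 & 1 \end{pmatrix}, \qquad \tau = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
For any matrix $M$, we will write the isomorphism $M \mapsto M^\tau = \tau M \tau^{-1}$ (and note that $\tau^2 = Id$, so that
$\tau^{-1} = \tau$). More specifically, we have that $\begin{pmatrix} a & b \\ c & d \end{pmatrix}^\tau = \begin{pmatrix} d & c \\ b & a \end{pmatrix}$ is just switching the rows and switching the
columns, and in particular $U_\alpha^\tau = U_\alpha^{tr}$.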
With these notations we get:
$$M_X^{cf}(x, y) = \begin{pmatrix} 0 & b(x) \\ 1 & f(x, y) - \bar f(x+1, y) \end{pmatrix} = D_{b(x)}\, \tau\, U_{f(x, y)}\, U_{-\bar f(x+1, y)},$$
$$M_Y^{cf}(x, y) = \begin{pmatrix} \bar f(x, y) & b(x) \\ 1 & f(x, y) \end{pmatrix} = U_{\bar f(x, y)}\, D_{-(f \bar f)(0, y)}\, \tau\, U_{f(x, y)},$$
so that $M_X^{cf}$ and $M_Y^{cf}$ are “almost” the same. There is some “cyclic permutation” and after it they have a
similar structure, with related parameters, and in particular the $M_Y^{cf}$ is also a continued fraction sequence,
after a simple coboundary equivalence (via the matrices $U_{\bar f(x, y)}$).
As mentioned in Remark 30, if $(f \bar f)(0, 0) = 0$, then the $Y = 0$ and $Y = 1$ lines are in the trivial Euler
family. Similarly, on the $X = 0$ line we have
$$M_Y^{cf}(0, y) = \begin{pmatrix} \bar f(0, y) & 0 \\ 1 & f(0, y) \end{pmatrix},$$
which is even simpler to work with (recall that the continued fractions in the trivial Euler family are in essence
upper triangular in disguise). However, on the $X = 0$ line we have that $M_X^{cf}(0, y) = \begin{pmatrix} 0 & 0 \\ 1 & f(0, y) - \bar f(1, y) \end{pmatrix}$ are
not invertible. With this in mind, we do a slight change of parameters, which will solve this problem, and
which, as we will see, is more natural.
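Definition 36. Let $f, \bar f$ be conjugate polynomials and $M_X^{cf}, M_Y^{cf}$ as in Definition 29. Define
$$M_X(x, y) := D_{b(x)}^{-1}\, M_X^{cf}(x, y)\, D_{b(x+1)} = \tau\, U_{f(x, y)}\, U_{-\bar f(x+1, y)}\, D_{b(x+1)} = \begin{pmatrix} 0 & 1 \\ b(x+1) & f(x, y) - \bar f(x+1, y) \end{pmatrix},$$
$$M_Y(x, y) := D_{b(x)}^{-1}\, M_Y^{cf}(x, y)\, D_{b(x)} = U_{f(x, y)}^{\tau}\, \tau\, D_{-(f \bar f)(0, y)}\, U_{\bar f(x, y)}^{\tau} = \begin{pmatrix} \bar f(x, y) & 1 \\ b(x) & f(x, y) \end{pmatrix}.$$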

There are three main reasons why this is a bit better way to view our matrix fields.
1. If we start the continued fraction on the $y$ line at $x = 0$ with the previous $M_X^{cf}$ matrices, we get
$$M_X^{cf}(0, y)\, M_X^{cf}(1, y)\, M_X^{cf}(2, y) \cdots = \left[D_{b(0)}\, \tau\, U_{f(0, y)}\, U_{-\bar f(0+1, y)}\right] \left[D_{b(1)}\, \tau\, U_{f(1, y)}\, U_{-\bar f(1+1, y)}\right] \cdots$$
where as we mentioned before $D_{b(0)} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is singular, which can cause problems. This means that we
have to start with $x = 1$, and therefore “lose” the information from $\tau\, U_{f(0, y)}\, U_{-\bar f(0+1, y)}$. With our new
matrices $M_X$ we instead get
$$M_X(0, y)\, M_X(1, y)\, M_X(2, y) \cdots = \left[\tau\, U_{f(0, y)}\, U_{-\bar f(0+1, y)}\, D_{b(1)}\right] \left[\tau\, U_{f(1, y)}\, U_{-\bar f(1+1, y)}\, D_{b(2)}\right] \cdots$$
so we start exactly after the problematic matrix $D_{b(0)}$.

2. With this new definition, where we start at $x = 0$, we get that $M_Y(0, y)$ is upper triangular, since
$$M_Y(0, y) = \begin{pmatrix} \bar f(0, y) & 1 \\ b(0) & f(0, y) \end{pmatrix} = \begin{pmatrix} \bar f(0, y) & 1 \\ 0 & f(0, y) \end{pmatrix}.$$

3. Finally, as we shall see below, the limits for each horizontal line are more natural, namely
$$\lim_{N \to \infty} \left[\prod_{n=0}^{N} M_X(n, m)\right](0) = \tau\, U_{a(0, m)} \left(\lim_{N \to \infty} \left[\prod_{n=1}^{N} M_X^{cf}(n, m)\right](0)\right) = \left(a(0, m) + \mathbb{K}_1^{\infty} \frac{b(n)}{a(n, m)}\right)^{-1}.$$
In particular, in the new matrix field for our $\zeta(3)$ example, we will get the limits $\sum_{k=m}^{\infty} \frac{1}{k^3}$. This will
simplify the arguments when trying to find the denominators and numerators of the convergents.
With this in mind we rewrite Theorem 32 and expand it with these new matrices.

Theorem 37. Let $f, \bar f, a, b$ be polynomials as in Definition 29. We set
$$M_X(x, y) := \tau\, U_{f(x, y)}\, U_{-\bar f(x+1, y)}\, D_{b(x+1)} = \begin{pmatrix} 0 & 1 \\ b(x+1) & f(x, y) - \bar f(x+1, y) \end{pmatrix},$$
$$M_Y(x, y) := U_{f(x, y)}^{\tau}\, \tau\, D_{-(f \bar f)(0, y)}\, U_{\bar f(x, y)}^{\tau} = \begin{pmatrix} \bar f(x, y) & 1 \\ b(x) & f(x, y) \end{pmatrix}.$$
Then

1. The matrices form a conservative matrix field, namely
$$M_X(x, y)\, M_Y(x+1, y) = M_Y(x, y)\, M_X(x, y+1).$$

2. The determinants of $M_X(x, y), M_Y(x, y)$ are only functions of $x, y$ respectively, and more specifically:
$$\det(M_X(x, y)) = -b(x+1) = (f \cdot \bar f)(0, 0) - (f \cdot \bar f)(x+1, 0),$$
$$\det(M_Y(x, y)) = (f \cdot \bar f)(x, y) - b(x) = (f \cdot \bar f)(0, y).$$

3. For $x = 0$, the matrices $M_Y(0, y)$ are upper triangular:
$$M_Y(0, y) = \begin{pmatrix} \bar f(0, y) & 1 \\ 0 & f(0, y) \end{pmatrix}.$$

Proof. This follows directly from Theorem 32 .
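The identities in Theorem 37 can be spot-checked numerically. The following is a minimal sketch, assuming the $\zeta(3)$ polynomials $f, \bar f, b$ quoted later in Section 6; the helper names (`matmul`, `MX`, `MY`, `det`, `is_conservative`) are hypothetical and only serve this illustration. It tests the conservative relation on a small grid of integer points, which is a sanity check and not a proof.

```python
# zeta(3) conjugate polynomials as stated in Section 6 of the paper.
def f(x, y):
    return y**3 + 2 * y**2 * x + 2 * y * x**2 + x**3

def fbar(x, y):
    return y**3 - 2 * y**2 * x + 2 * y * x**2 - x**3

def b(x):
    return -x**6

def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def MX(x, y):
    # M_X(x, y) = [[0, 1], [b(x+1), f(x,y) - fbar(x+1,y)]]
    return ((0, 1), (b(x + 1), f(x, y) - fbar(x + 1, y)))

def MY(x, y):
    # M_Y(x, y) = [[fbar(x,y), 1], [b(x), f(x,y)]]
    return ((fbar(x, y), 1), (b(x), f(x, y)))

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def is_conservative(size):
    """Check M_X(x,y) M_Y(x+1,y) == M_Y(x,y) M_X(x,y+1) for 0 <= x, y < size."""
    return all(
        matmul(MX(x, y), MY(x + 1, y)) == matmul(MY(x, y), MX(x, y + 1))
        for x in range(size)
        for y in range(size)
    )

print(is_conservative(8))
```

Exact integer arithmetic is used on purpose, so equality of the two matrix products is tested without rounding.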


Next, we use the fact that the Y = 1 line in the original conservative matrix field is in the trivial Euler
family, to find a simple presentation for the Y = 1 line in our new matrix field.
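Lemma 38. Suppose that $(f\bar f)(0,0) = 0$. Then
\[
U^{\tau}_{\bar f(0,0)} \left[ \prod_{k=0}^{n-1} M_X(k,1) \right] U^{\tau}_{-\bar f(n,0)} = \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & c_n \\ 0 & \prod_{k=1}^{n} f(k,0) \end{pmatrix},
\]
\[
c_n = \sum_{k=1}^{n} (-1)^{k-1} \left( \prod_{i=1}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right).
\]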

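Proof. Assuming that $(f\bar f)(0,0) = 0$, on the line $Y = 1$ we have
\[
M_X(x,1) = \begin{pmatrix} 0 & 1 \\ (f\bar f)(x+1,0) & f(x,1)-\bar f(x+1,1) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ (f\bar f)(x+1,0) & f(x+1,0)-\bar f(x,0) \end{pmatrix}.
\]
Setting $v(x) = \left( \bar f(x,0),\, 1 \right)$ we get $v(x) M_X(x,1) = f(x+1,0)\, v(x+1)$, namely these are eigenvectors with
eigenvalue $f(x+1,0)$. Similarly we have the right $-\bar f(x+1,0)$-eigenvectors $\begin{pmatrix} 1 \\ -\bar f(x+1,0) \end{pmatrix}$. By Lemma 23,
setting $U_x = \begin{pmatrix} 1 & 0 \\ \bar f(x,0) & 1 \end{pmatrix} = U^{\tau}_{\bar f(x,0)}$ we get
\[
U^{\tau}_{\bar f(x,0)}\, M_X(x,1)\, U^{\tau}_{-\bar f(x+1,0)} = \begin{pmatrix} -\bar f(x+1,0) & 1 \\ 0 & f(x+1,0) \end{pmatrix}.
\]
It follows that
\[
U^{\tau}_{\bar f(0,0)} \left[ \prod_{k=0}^{n-1} M_X(k,1) \right] U^{\tau}_{-\bar f(n,0)} = \prod_{k=1}^{n} \begin{pmatrix} -\bar f(k,0) & 1 \\ 0 & f(k,0) \end{pmatrix},
\]
which by Claim 24 is equal to
\[
\begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & c_n \\ 0 & \prod_{k=1}^{n} f(k,0) \end{pmatrix}, \qquad c_n = \sum_{k=1}^{n} (-1)^{k-1} \left( \prod_{i=1}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right).
\]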

Remark 39. Note in particular that when $\bar f(0,0) = 0$ in the lemma above, then $U^{\tau}_{\bar f(0,0)} = I$ is simply the
identity matrix.
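As an illustration of Lemma 38, the sketch below compares both sides of the identity for small $n$. It assumes the $\zeta(3)$ polynomials from Section 6 (for which $\bar f(0,0) = 0$) and takes $U^{\tau}_{\alpha}$ to be the lower triangular matrix with $\alpha$ below the diagonal, as in the proof; the function names `lemma38_lhs` and `lemma38_rhs` are hypothetical.

```python
from math import prod

def f(x, y):
    return y**3 + 2 * y**2 * x + 2 * y * x**2 + x**3

def fbar(x, y):
    return y**3 - 2 * y**2 * x + 2 * y * x**2 - x**3

def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def MX(x, y):
    # b(x+1) = -(x+1)^6 in the zeta(3) matrix field.
    return ((0, 1), (-(x + 1)**6, f(x, y) - fbar(x + 1, y)))

def U_tau(alpha):
    """Lower triangular unipotent matrix [[1, 0], [alpha, 1]]."""
    return ((1, 0), (alpha, 1))

def lemma38_lhs(n):
    """U^tau_{fbar(0,0)} * prod_{k=0}^{n-1} M_X(k, 1) * U^tau_{-fbar(n,0)}."""
    M = U_tau(fbar(0, 0))
    for k in range(n):
        M = matmul(M, MX(k, 1))
    return matmul(M, U_tau(-fbar(n, 0)))

def lemma38_rhs(n):
    """Closed form: upper triangular matrix with corner entry c_n."""
    c_n = sum(
        (-1)**(k - 1)
        * prod(fbar(i, 0) for i in range(1, k))
        * prod(f(i, 0) for i in range(k + 1, n + 1))
        for k in range(1, n + 1)
    )
    top_left = (-1)**n * prod(fbar(k, 0) for k in range(1, n + 1))
    bottom_right = prod(f(k, 0) for k in range(1, n + 1))
    return ((top_left, c_n), (0, bottom_right))

print(all(lemma38_lhs(n) == lemma38_rhs(n) for n in range(7)))
```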

5.2 The dual conservative matrix field


With Theorem 37 and Lemma 38 in the previous section, we see that we understand quite well both the
$Y = 1$ and $X = 0$ lines. Moreover, we already saw that both the horizontal and the vertical lines in the
matrix field are more or less continued fractions, namely
\[
M_X(x,y) := \tau U_{f(x,y)} U_{-\bar f(x+1,y)} D_{b(x+1)} = \begin{pmatrix} 0 & 1 \\ b(x+1) & f(x,y)-\bar f(x+1,y) \end{pmatrix},
\]
\[
M_Y(x,y) := U^{\tau}_{f(x,y)} \tau D_{-(f\bar f)(0,y)} U^{\tau}_{\bar f(x,y)} = \begin{pmatrix} \bar f(x,y) & 1 \\ b(x) & f(x,y) \end{pmatrix}.
\]
The next goal is to use this near symmetry, with the hope of eventually saying something about the diagonal
line $X = Y$.
Definition 40 (The dual matrix field). Let $f(x,y), \bar f(x,y)$ be conjugate polynomials, and let $M_X, M_Y$ be
as above. We define the dual matrix field to be
\[
\hat M_Y(y,x) = U^{\tau}_{\bar f(x-1,y)}\, M_X(x-1,y+1)\, U^{\tau}_{-\bar f(x,y)} = U^{\tau}_{f(x,y)}\, \tau D_{b(x)}\, U^{\tau}_{-\bar f(x,y)},
\]
\[
\hat M_X(y,x) = U^{\tau}_{\bar f(x-1,y)}\, M_Y(x-1,y+1)\, U^{\tau}_{-\bar f(x-1,y+1)} = \tau U_{\bar f(x-1,y)} U_{f(x-1,y+1)} D_{-b_Y(y+1)}.
\]
This new matrix field corresponds to the conjugate polynomials
\[
\hat f(x,y) = f(y,x),
\]
\[
\hat{\bar f}(x,y) = -\bar f(y,x),
\]
\[
\hat a(x,y) = \hat f(x,y) - \hat{\bar f}(x+1,y) = f(y,x) + \bar f(y,x+1),
\]
\[
\hat b(x) = \left( \hat f \hat{\bar f} \right)(x,0) = -\left( f \bar f \right)(0,x).
\]

Example 41. In the $\zeta(3)$ matrix field mentioned in Example 31 we have a special case where
\[
f(x,y) = \frac{y^3 - x^3}{y - x}(y + x) ; \qquad \bar f(x,y) = \frac{y^3 + x^3}{y + x}(y - x),
\]
satisfy $f(x,y) = f(y,x)$ and $\bar f(x,y) = -\bar f(y,x)$, so that $\hat f = f$ and $\hat{\bar f} = \bar f$.
In the $\zeta(2)$ matrix field from Example 33, we have
\[
f(x,y) = 2x^2 + 2xy + y^2 ; \qquad \bar f(x,y) = -2x^2 + 2xy - y^2
\]
so that
\[
\hat f(x,y) = x^2 + 2xy + 2y^2 ; \qquad \hat{\bar f}(x,y) = x^2 - 2xy + 2y^2
\]
and therefore
\[
\hat a(x,y) = \left( x^2 + 2xy + 2y^2 \right) - \left( (x+1)^2 - 2(x+1)y + 2y^2 \right) = (2y-1)(2x+1),
\]
\[
\hat b_X(x) = x^4.
\]

This dual matrix field construction not only gives us, free of charge, another conservative matrix field for
every one that we find, but the two are also closely related. In the matrix field with $M_X, M_Y$, the horizontal
lines are (almost) polynomial continued fractions, and we wish to study how the numerators and denominators
behave there. By definition, the horizontal lines of the dual matrix field correspond to vertical lines in the
original matrix field, so understanding the full matrix field is equivalent to understanding these continued
fractions. More precisely, since
\[
M_Y(x,y) = U^{\tau}_{-\bar f(x,y-1)}\, \hat M_X(y-1,x+1)\, U^{\tau}_{\bar f(x,y)},
\]
we get that
\[
\prod_{k=1}^{n} M_Y(y,k) = U^{\tau}_{-\bar f(y,0)} \left[ \prod_{k=0}^{n-1} \hat M_X(k,y+1) \right] U^{\tau}_{\bar f(y,n)}. \tag{4}
\]
With this dual structure we turn to study the rational approximation given by the different points on
the matrix field, and more concretely how far the standard rational presentation is from being a reduced
rational presentation.
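Definition 42. For every $n \ge 0$ define the polynomial vectors
\[
\begin{pmatrix} p_n(y) \\ q_n(y) \end{pmatrix} = \left[ \prod_{k=0}^{n-1} M_X(k,y) \right] e_2,
\]
\[
\begin{pmatrix} \hat p_n(y) \\ \hat q_n(y) \end{pmatrix} = \left[ \prod_{k=0}^{n-1} \hat M_X(k,y) \right] e_2.
\]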

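For example, the first few values of $p_n(m), q_n(m)$ are arranged as:
\[
\begin{array}{ccccccc}
(*) & & (*) & & (*) & & (*) \\
\uparrow {\scriptstyle M_Y(0,3)} & & \uparrow {\scriptstyle M_Y(1,3)} & & \uparrow {\scriptstyle M_Y(2,3)} & & \\
\begin{pmatrix} p_0(3) \\ q_0(3) \end{pmatrix} & \xrightarrow{M_X(0,3)} & \begin{pmatrix} p_1(3) \\ q_1(3) \end{pmatrix} & \xrightarrow{M_X(1,3)} & \begin{pmatrix} p_2(3) \\ q_2(3) \end{pmatrix} & \xrightarrow{M_X(2,3)} & (*) \\
\uparrow {\scriptstyle M_Y(0,2)} & & \uparrow {\scriptstyle M_Y(1,2)} & & \uparrow {\scriptstyle M_Y(2,2)} & & \\
\begin{pmatrix} p_0(2) \\ q_0(2) \end{pmatrix} & \xrightarrow{M_X(0,2)} & \begin{pmatrix} p_1(2) \\ q_1(2) \end{pmatrix} & \xrightarrow{M_X(1,2)} & \begin{pmatrix} p_2(2) \\ q_2(2) \end{pmatrix} & \xrightarrow{M_X(2,2)} & (*) \\
\uparrow {\scriptstyle M_Y(0,1)} & & \uparrow {\scriptstyle M_Y(1,1)} & & \uparrow {\scriptstyle M_Y(2,1)} & & \\
\begin{pmatrix} p_0(1) \\ q_0(1) \end{pmatrix} & \xrightarrow{M_X(0,1)} & \begin{pmatrix} p_1(1) \\ q_1(1) \end{pmatrix} & \xrightarrow{M_X(1,1)} & \begin{pmatrix} p_2(1) \\ q_2(1) \end{pmatrix} & \xrightarrow{M_X(2,1)} & (*)
\end{array}
\]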

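Remark 43. Note that since $M_X(k,y)\, e_1 = b(k+1)\, e_2$ and $\hat M_X(k,y)\, e_1 = -b_Y(k+1)\, e_2$, we have for $n \ge 1$
\[
\begin{pmatrix} p_{n-1}(y) & p_n(y) \\ q_{n-1}(y) & q_n(y) \end{pmatrix} D_{b_X(n)} = \prod_{k=0}^{n-1} M_X(k,y),
\]
\[
\begin{pmatrix} \hat p_{n-1}(y) & \hat p_n(y) \\ \hat q_{n-1}(y) & \hat q_n(y) \end{pmatrix} D_{-b_Y(n)} = \prod_{k=0}^{n-1} \hat M_X(k,y).
\]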

To study these polynomials $p_n(m)$ and $q_n(m)$, we use the conservative matrix field structure to see what
happens when we increase $n$ or increase $m$, and also what the connections are between them and their duals
$\hat p_m(n)$ and $\hat q_m(n)$.
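Claim 44. Let $f, \bar f \in \mathbb{Z}[x,y]$ be conjugate polynomials such that $(f\bar f)(0,0) = 0$ and let $p_n, q_n, \hat p_m, \hat q_m$ be as in
Definition 42 above. Then

1. Evaluating the polynomials $p_n, q_n$ at $m = 1$ we have
\[
p_n(1) = \sum_{k=1}^{n} (-1)^{k-1} \left( \prod_{i=1}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right),
\]
\[
q_n(1) = \prod_{k=1}^{n} f(k,0) - \bar f(0,0)\, p_n(1).
\]

2. When increasing $n$ we get that
\[
\begin{pmatrix} p_{n+1}(y) \\ q_{n+1}(y) \end{pmatrix} = \begin{pmatrix} p_{n-1}(y) & p_n(y) \\ q_{n-1}(y) & q_n(y) \end{pmatrix} \begin{pmatrix} b(n) \\ a(n,y) \end{pmatrix}.
\]

3. When $(f\bar f)(0,m) \ne 0$, increasing $m$ follows the recurrence
\[
\begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \frac{1}{(f\bar f)(0,m)} \begin{pmatrix} f(0,m) & -1 \\ 0 & \bar f(0,m) \end{pmatrix} \begin{pmatrix} p_{n-1}(m) & p_n(m) \\ q_{n-1}(m) & q_n(m) \end{pmatrix} \begin{pmatrix} (f\bar f)(n,0) \\ f(n,m) \end{pmatrix}.
\]

4. Suppose that $\bar f(0,0) = 0$. Then the polynomials $p_n, q_n$ and $\hat p_m, \hat q_m$ are connected by the following
equation
\[
\begin{pmatrix} \prod_{k=1}^{m} \bar f(0,k) & \hat p_m(1) \\ 0 & \hat q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & p_n(1) \\ 0 & q_n(1) \end{pmatrix} \begin{pmatrix} \hat p_m(n+1) \\ \hat q_m(n+1) \end{pmatrix},
\]
and in particular we have that $\hat q_m(1)\, q_n(m+1) = q_n(1)\, \hat q_m(n+1)$.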


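Proof. 1. Applying Lemma 38 we get
\[
\begin{pmatrix} p_n(1) \\ q_n(1) \end{pmatrix} = \left[ \prod_{k=0}^{n-1} M_X(k,1) \right] e_2 = U^{\tau}_{-\bar f(0,0)} \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & c_n \\ 0 & \prod_{k=1}^{n} f(k,0) \end{pmatrix} U^{\tau}_{\bar f(n,0)}\, e_2
\]
\[
= U^{\tau}_{-\bar f(0,0)} \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & c_n \\ 0 & \prod_{k=1}^{n} f(k,0) \end{pmatrix} e_2 = U^{\tau}_{-\bar f(0,0)} \begin{pmatrix} c_n \\ \prod_{k=1}^{n} f(k,0) \end{pmatrix}
\]
where
\[
c_n = \sum_{k=1}^{n} (-1)^{k-1} \left( \prod_{i=1}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right).
\]
It follows that
\[
p_n(1) = e_1^{tr} \begin{pmatrix} p_n(1) \\ q_n(1) \end{pmatrix} = e_1^{tr}\, U^{\tau}_{-\bar f(0,0)} \begin{pmatrix} c_n \\ \prod_{k=1}^{n} f(k,0) \end{pmatrix} = c_n,
\]
\[
q_n(1) = e_2^{tr} \begin{pmatrix} p_n(1) \\ q_n(1) \end{pmatrix} = \left( -\bar f(0,0),\, 1 \right) \begin{pmatrix} c_n \\ \prod_{k=1}^{n} f(k,0) \end{pmatrix} = \prod_{k=1}^{n} f(k,0) - \bar f(0,0)\, c_n
\]
\[
= \sum_{k=0}^{n} (-1)^{k} \left( \prod_{i=0}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right).
\]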

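2. This is the standard recursion for continued fractions, and it follows from
\[
\begin{pmatrix} p_{n+1}(y) \\ q_{n+1}(y) \end{pmatrix} = \left[ \prod_{k=0}^{n} M_X(k,y) \right] e_2 = \left[ \prod_{k=0}^{n-1} M_X(k,y) \right] M_X(n,y)\, e_2
\]
\[
= \begin{pmatrix} p_{n-1}(y) & p_n(y) \\ q_{n-1}(y) & q_n(y) \end{pmatrix} D_{b_X(n)}\, M_X(n,y)\, e_2 = \begin{pmatrix} p_{n-1}(y) & p_n(y) \\ q_{n-1}(y) & q_n(y) \end{pmatrix} \begin{pmatrix} b_X(n) \\ a(n,y) \end{pmatrix}.
\]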

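3. This follows from the coboundary condition of the conservative matrix field structure
\[
\begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \left[ \prod_{k=0}^{n-1} M_X(k,m+1) \right] e_2 = M_Y(0,m)^{-1} \left[ \prod_{k=0}^{n-1} M_X(k,m) \right] M_Y(n,m)\, e_2
\]
\[
= \begin{pmatrix} \bar f(0,m) & 1 \\ 0 & f(0,m) \end{pmatrix}^{-1} \begin{pmatrix} p_{n-1}(m) & p_n(m) \\ q_{n-1}(m) & q_n(m) \end{pmatrix} D_{b(n)} \cdot \begin{pmatrix} 1 \\ f(n,m) \end{pmatrix}
\]
\[
= \frac{1}{(f\bar f)(0,m)} \begin{pmatrix} f(0,m) & -1 \\ 0 & \bar f(0,m) \end{pmatrix} \begin{pmatrix} p_{n-1}(m) & p_n(m) \\ q_{n-1}(m) & q_n(m) \end{pmatrix} \begin{pmatrix} (f\bar f)(n,0) \\ f(n,m) \end{pmatrix}.
\]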

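4. We compute the matrix in the $(n, m+1)$ position in the matrix field in two different ways: first by
moving along the $X = 0$ line and then the $Y = m+1$ line, and second by moving along the $Y = 1$ line and
then the $X = n$ line, namely
\[
\prod_{k=1}^{m} M_Y(0,k) \prod_{k=0}^{n-1} M_X(k,m+1) = \prod_{k=0}^{n-1} M_X(k,1) \prod_{k=1}^{m} M_Y(n,k).
\]
Converting the $M_Y$ into $\hat M_X$ via equation (4) we have
\[
U^{\tau}_{-\bar f(0,0)} \left[ \prod_{k=0}^{m-1} \hat M_X(k,1) \right] U^{\tau}_{\bar f(0,m)} \prod_{k=0}^{n-1} M_X(k,m+1) = \prod_{k=0}^{n-1} M_X(k,1)\; U^{\tau}_{-\bar f(n,0)} \left[ \prod_{k=0}^{m-1} \hat M_X(k,n+1) \right] U^{\tau}_{\bar f(n,m)}.
\]
Multiplying both sides by $e_2$, we get
\[
U^{\tau}_{-\bar f(0,0)} \left[ \prod_{k=0}^{m-1} \hat M_X(k,1) \right] U^{\tau}_{\bar f(0,m)} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \left[ \prod_{k=0}^{n-1} M_X(k,1) \right] U^{\tau}_{-\bar f(n,0)} \begin{pmatrix} \hat p_m(n+1) \\ \hat q_m(n+1) \end{pmatrix}.
\]
Next, we use Lemma 38 as in the previous part, and the fact that $U^{\tau}_{\bar f(0,0)} = \mathrm{Id}$, to get that
\[
\left[ \prod_{k=0}^{n-1} M_X(k,1) \right] U^{\tau}_{-\bar f(n,0)} = \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & p_n(1) \\ 0 & q_n(1) \end{pmatrix},
\]
\[
\left[ \prod_{k=0}^{m-1} \hat M_X(k,1) \right] U^{\tau}_{\bar f(0,m)} = \begin{pmatrix} \prod_{k=1}^{m} \bar f(0,k) & \hat p_m(1) \\ 0 & \hat q_m(1) \end{pmatrix}.
\]
Putting everything together, we get
\[
\begin{pmatrix} \prod_{k=1}^{m} \bar f(0,k) & \hat p_m(1) \\ 0 & \hat q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & p_n(1) \\ 0 & q_n(1) \end{pmatrix} \begin{pmatrix} \hat p_m(n+1) \\ \hat q_m(n+1) \end{pmatrix},
\]
which is what we wanted to show.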

If $\bar f(0,0) = 0$ as in the last part of the claim above, then $\frac{q_n(m+1)}{q_n(1)} = \frac{\hat q_m(n+1)}{\hat q_m(1)}$. Fixing $m$ and letting $n \to \infty$,
the numbers $q_n(m)$ are simply the denominators for the continued fraction in the matrix field appearing on
the $m$'th row. As we already know how to compute $q_n(1)$, in order to understand these denominators, we
would need to understand $\frac{\hat q_m(n+1)}{\hat q_m(1)}$. Since $m$ is fixed, the function $n \mapsto \hat q_m(n+1)$ is just a polynomial in $n$,
and we divide it by the constant $\hat q_m(1)$. This already tells us a lot about these denominators.
Eventually we would want to find the greatest common divisor of $q_n(m+1)$ and $p_n(m+1)$ and show
that it is large. In particular, it would be helpful to know if $q_n(1) \mid q_n(m+1)$ for all $n$, which we just saw
is equivalent to $\frac{\hat q_m(n+1)}{\hat q_m(1)}$ always being an integer. Hence, we are left with the problem of checking if all the
evaluations of a polynomial $\hat q_m$ at integer points are divisible by the same number $\hat q_m(1)$. The solution to
this type of question is well known, and it is not hard to show that this holds exactly when $\hat q_m(1) \mid \hat q_m(n+1)$
for $\deg(\hat q_m) + 1$ consecutive integers (see Appendix C). This suggests an induction-like process to show that
this holds for all of the denominators, and in particular we will show it for the matrix field of $\zeta(3)$.
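The divisibility criterion used here (checking only $\deg + 1$ consecutive integers) can be illustrated with a small sketch. The function name `divides_on_window` and the two example polynomials are hypothetical and chosen only for illustration; the actual statement and its proof are the ones referred to in Appendix C.

```python
def divides_on_window(poly, d, degree, start=0):
    """Return True if d divides poly(n) for degree + 1 consecutive integers,
    beginning at `start`. By the criterion quoted above, for an integer
    valued polynomial of the given degree this already decides
    divisibility at every integer point."""
    return all(poly(n) % d == 0 for n in range(start, start + degree + 1))

def triple_product(n):
    # n (n+1) (n+2) has degree 3 and is always divisible by 6.
    return n * (n + 1) * (n + 2)

def double_product(n):
    # n (n+1) has degree 2 and is always even, but not always divisible by 4.
    return n * (n + 1)

print(divides_on_window(triple_product, 6, degree=3))  # window test passes
print(divides_on_window(double_product, 4, degree=2))  # window test fails
```

In the first example the window test succeeds and divisibility by 6 indeed holds at every integer; in the second the window test already detects the failure at $n = 1$.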

6 The ζ (3) case
We now apply the dual matrix field identities for the ζ (3) matrix field. Recall that in this case we have that

\[
f(x,y) = \frac{y^3 - x^3}{y - x}(y + x) = y^3 + 2y^2 x + 2y x^2 + x^3,
\]
\[
\bar f(x,y) = \frac{y^3 + x^3}{y + x}(y - x) = y^3 - 2y^2 x + 2y x^2 - x^3,
\]
\[
a(x,y) = x^3 + (1+x)^3 + 2y(y-1)(2x+1),
\]
\[
b(x) = -x^6.
\]

Our first goal is showing that gcd (qn (m) , pn (m)) is large as n and m increase, and we will use the
results from Claim 44. Once we understand these polynomials and their gcd, which are defined for each row
separately, we will combine them together to understand the general numerators and denominators appearing
in any route on the matrix field, starting at the bottom left corner. In particular, investigating the diagonal
route, we will show that both the approximations converge fast enough, and the gcd grows fast enough to
conclude at the end that ζ (3) is irrational.
This ζ (3) matrix field has several properties making it much easier to work with, which will come into
play later:
Fact 45. 1. The matrix field is its own dual, since $f(y,x) = f(x,y)$ and $-\bar f(y,x) = \bar f(x,y)$. In particular
we get that $p = \hat p$ and $q = \hat q$.
2. We have that $f(0,0) = \bar f(0,0) = 0$.
3. All the $f(n,0), f(0,n), \bar f(n,0), \bar f(0,n)$ are the same up to a sign (and therefore also $\hat f$ and $\hat{\bar f}$), namely
these are $n^3$. Furthermore, they all divide $f(n,n) = \hat f(n,n) = 6n^3$.
4. The polynomial $a(x,y)$ can be written as $a(x,y) = A_1(x) + y(y-1) A_2(x)$, so in particular $a(x, 1-y) = a(x,y)$.
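The four properties in Fact 45 are simple polynomial identities, so they can be checked on a grid of integer points. The sketch below is only a sanity check under that assumption; the function name `fact45_holds` is hypothetical.

```python
def f(x, y):
    return y**3 + 2 * y**2 * x + 2 * y * x**2 + x**3

def fbar(x, y):
    return y**3 - 2 * y**2 * x + 2 * y * x**2 - x**3

def a(x, y):
    return x**3 + (1 + x)**3 + 2 * y * (y - 1) * (2 * x + 1)

def fact45_holds(size):
    """Check the four items of Fact 45 for integers -size <= x, y, n <= size."""
    pts = range(-size, size + 1)
    # 1. self duality: f is symmetric and fbar is antisymmetric
    self_dual = all(
        f(y, x) == f(x, y) and -fbar(y, x) == fbar(x, y) for x in pts for y in pts
    )
    # 2. both polynomials vanish at the origin
    vanish = f(0, 0) == 0 and fbar(0, 0) == 0
    # 3. boundary values are n^3 up to sign, and f(n, n) = 6 n^3
    cubes = all(
        {abs(f(n, 0)), abs(f(0, n)), abs(fbar(n, 0)), abs(fbar(0, n))} == {abs(n)**3}
        and f(n, n) == 6 * n**3
        for n in pts
    )
    # 4. a(x, y) is symmetric under y -> 1 - y
    symmetric = all(a(x, 1 - y) == a(x, y) for x in pts for y in pts)
    # a really is f(x, y) - fbar(x + 1, y), as in the definition of the field
    consistent = all(a(x, y) == f(x, y) - fbar(x + 1, y) for x in pts for y in pts)
    return self_dual and vanish and cubes and symmetric and consistent

print(fact45_holds(6))
```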
With the goal of finding out how big $\gcd(q_n(m), p_n(m))$ is, we show that both the numerators and
denominators are almost divisible by $(n!)^3$. We already know that fixing $m$ and only increasing $n$, namely
running on horizontal lines in the matrix field, we get "nice" continued fractions which should have factorial
reduction. The next lemma shows that these factorial reductions are in a sense synchronized between the
different horizontal lines.
In the following, we will use $\mathrm{lcm}[n]$ for $\mathrm{lcm}\{1, 2, \ldots, n\}$ where $n \ge 1$ and also set $\mathrm{lcm}[0] = 1$.
Lemma 46. For all $n \ge 0$ and $m \in \mathbb{Z}$ we have $(n!)^3 \mid q_n(m)$ (with equality for $m = 1$) and $\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid p_n(m)$.
In particular we have that $\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid \gcd(p_n(m), q_n(m))$.
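Before going through the proof, the statement of Lemma 46 can be tested on small cases. The sketch below assumes the $\zeta(3)$ polynomials $a, b$ above and computes $(p_n(m), q_n(m))$ directly from Definition 42 by applying the matrices $M_X(k,m)$ to $e_2$ from right to left; the helper names `pq`, `lcm_upto` and `lemma46_holds` are hypothetical.

```python
from functools import reduce
from math import factorial, gcd

def a(x, y):
    return x**3 + (x + 1)**3 + 2 * y * (y - 1) * (2 * x + 1)

def b(x):
    return -x**6

def lcm_upto(n):
    """lcm{1, ..., n}, with the convention lcm_upto(0) == 1."""
    return reduce(lambda u, v: u * v // gcd(u, v), range(1, n + 1), 1)

def pq(n, m):
    """(p_n(m), q_n(m)) = [prod_{k=0}^{n-1} M_X(k, m)] e_2.

    M_X(k, m) = [[0, 1], [b(k+1), a(k, m)]] maps a column (p, q) to
    (q, b(k+1) p + a(k, m) q); the factors are applied right to left.
    """
    p, q = 0, 1
    for k in reversed(range(n)):
        p, q = q, b(k + 1) * p + a(k, m) * q
    return p, q

def lemma46_holds(n, m):
    """Check (n!)^3 | q_n(m) and (n!/lcm[n])^3 | p_n(m)."""
    p, q = pq(n, m)
    n_fact = factorial(n)
    return q % n_fact**3 == 0 and p % (n_fact // lcm_upto(n))**3 == 0

print(pq(2, 1))  # numerator and denominator of 1 + 1/8 over the denominator (2!)^3
print(all(lemma46_holds(n, m) for n in range(7) for m in range(-6, 8)))
```

Applying the matrices to a vector, instead of multiplying full matrices, keeps the computation short and avoids the case $n = 0$ of the three-term recurrence, where $b(0) = 0$.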

Proof. We will prove this claim by induction on $n$, but before that, we first consider the case where $m = 1$
and $n$ is arbitrary (the bottom horizontal line). By part 1 in Claim 44 we get that
\[
p_n(1) = \sum_{k=1}^{n} (-1)^{k-1} \left( \prod_{i=1}^{k-1} \bar f(i,0) \right) \left( \prod_{i=k+1}^{n} f(i,0) \right) = \sum_{k=1}^{n} \left( \frac{n!}{k} \right)^3,
\]
\[
q_n(1) = \prod_{k=1}^{n} f(k,0) = (n!)^3.
\]
These are exactly the numerator and denominator of $\sum_{k=1}^{n} \frac{1}{k^3}$ if we just take the new denominator as the
product of the denominators in the summands. Since we can also instead take the least common multiple
of the denominators, we see that $\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid p_n(1)$ as required. Of course $(n!)^3 \mid q_n(1)$ is trivial
since $(n!)^3 = q_n(1)$, but moreover it allows us to think of the general conditions as $q_n(1) \mid q_n(m)$ and
$q_n(1) \mid \mathrm{lcm}[n]^3\, p_n(m)$.

We prove the rest of this lemma using induction on n. The induction hypothesis will go as follows -
assuming that the claim is true for (n − 1, m) for a given n and all m ∈ Z, we show:
1. From part 4 in Claim 44, we show that the claim is true for (n, m) with 1 ≤ m ≤ n.
2. From part 3 in Claim 44, if the claim is true for (n − 1, n) and (n, n), then it is true for (n, n + 1).

3. Our polynomials satisfy qn (y) = qn (1 − y) and pn (y) = pn (1 − y), so the claim is true for (n, m) with
−n ≤ m ≤ n + 1, which are 2 (n + 1) consecutive integers.
4. These polynomials have degree ≤ 2n + 1, so this is enough to show the claim for (n, m) for all m.
When $n = 0$ we have $q_0(m) \equiv 1$ and $p_0(m) \equiv 0$, which are divisible by $(0!)^3 = 1$ and $\left( \frac{0!}{\mathrm{lcm}[0]} \right)^3 = 1$ respectively.
Suppose now that the claim is true for (k, m) with k ≤ n − 1 and all m and we prove for (n, m) and all
m. We prove first for the denominators, which is easier.

Denominators:
By using identity 4 from Claim 44, together with the facts in 45 about the matrix field we get
\[
\begin{pmatrix} q_m(1) & p_m(1) \\ 0 & q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \begin{pmatrix} q_n(1) & p_n(1) \\ 0 & q_n(1) \end{pmatrix} \begin{pmatrix} p_m(n+1) \\ q_m(n+1) \end{pmatrix}.
\]
For the denominators, this implies that
\[
\frac{q_n(m+1)}{q_n(1)} = \frac{q_m(n+1)}{q_m(1)}.
\]
By the induction hypothesis, for $0 \le m \le n-1$ the right hand side of this equation is an integer, so that
$q_n(1) \mid q_n(m+1)$. Using part 3 in Claim 44 with $n = m$ we have
\[
\begin{pmatrix} p_n(n+1) \\ q_n(n+1) \end{pmatrix} = \frac{1}{(f\bar f)(0,n)} \begin{pmatrix} f(0,n) & -1 \\ 0 & \bar f(0,n) \end{pmatrix} \begin{pmatrix} p_{n-1}(n) & p_n(n) \\ q_{n-1}(n) & q_n(n) \end{pmatrix} \begin{pmatrix} (f\bar f)(n,0) \\ f(n,n) \end{pmatrix}
\]
so for the denominators we get
\[
q_n(n+1) = q_{n-1}(n) \frac{f(n,0)\, \bar f(n,0)}{f(0,n)} + q_n(n) \frac{f(n,n)}{f(0,n)} = q_{n-1}(n)\, n^3 (-1) + q_n(n) \cdot 6.
\]
By the induction hypothesis $(n-1)!^3 \mid q_{n-1}(n)$ and from the argument above $n!^3 \mid q_n(n)$, so we conclude
that $(n!)^3 \mid q_n(n+1)$. At this point, we know the claim for $(n,m)$ with
$1 \le m \le n+1$.
Using the fact that $a(x,y)$ can be written as $A_1(x) + y(y-1) A_2(x)$, we get that $a(x,y) = a(x,1-y)$.
Since
\[
q_n(y) = e_2^{tr} \left[ \prod_{k=0}^{n-1} M_X(k,y) \right] e_2 = e_2^{tr} \left[ \prod_{k=0}^{n-1} \begin{pmatrix} 0 & 1 \\ b(k+1) & a(k,y) \end{pmatrix} \right] e_2
\]
we also get that $q_n(1-y) = q_n(y)$ and $\deg_y(q_n) \le 2n$. From this we conclude that $q_n(1) \mid q_n(m)$ for all
$-n \le m \le n+1$, which is a total of $2n+2 \ge \deg_y(q_n)+1$ consecutive integers. Finally, using Lemma 58 from
Appendix C we conclude that $q_n(1) \mid q_n(m)$ for all $m$, thus proving the induction step for the denominators.

Numerators:
The proof for the numerators is similar, but needs a few more computations. Part 4 from Claim 44
\[
q_m(1)\, p_n(m+1) + p_m(1)\, q_n(m+1) = q_n(1)\, p_m(n+1) + p_n(1)\, q_m(n+1)
\]
can be rewritten as
\[
\frac{\mathrm{lcm}[n]^3\, p_n(m+1)}{q_n(1)} = \overbrace{\frac{\mathrm{lcm}[n]^3\, p_m(n+1)}{q_m(1)}}^{(1)} + \overbrace{\frac{\mathrm{lcm}[n]^3\, p_n(1)}{q_n(1)}}^{(2)} \cdot \overbrace{\frac{q_m(n+1)}{q_m(1)}}^{(3)} - \overbrace{\frac{\mathrm{lcm}[n]^3\, p_m(1)}{q_m(1)}}^{(4)} \cdot \overbrace{\frac{q_n(m+1)}{q_n(1)}}^{(5)}.
\]
To show that the expression on the left is an integer, it is enough to show that (1)-(5) on the right are
integers.
• Expression (2) is on the first row of the matrix field, and we saw in the beginning of the proof that it
is an integer.
• Expressions (3) and (5) follow from the claim about the denominators (which is independent of this
proof about the numerators).
• Expressions (1) and (4) are integers if $0 \le m \le n-1$, using the induction hypothesis and the fact that
$\mathrm{lcm}[m] \mid \mathrm{lcm}[n]$ in that case.
To conclude, we just saw that the claim is true for $(n,m)$ when $1 \le m \le n$.
Using part 3 in Claim 44 with $n = m$ for the numerators we get
\[
p_n(n+1) = \frac{1}{(f\bar f)(0,n)} \left[ f(0,n) \left( (f\bar f)(n,0)\, p_{n-1}(n) + f(n,n)\, p_n(n) \right) - \left( (f\bar f)(n,0)\, q_{n-1}(n) + f(n,n)\, q_n(n) \right) \right]
\]
\[
= -n^3 p_{n-1}(n) + 6 p_n(n) + \left( q_{n-1}(n) - \frac{6}{n^3} q_n(n) \right).
\]
Since the claim is true for $(n-1,n)$ and $(n,n)$, we get that
\[
\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid -n^3 p_{n-1}(n) + 6 p_n(n),
\]
\[
(n-1)!^3 \mid q_{n-1}(n) - \frac{6}{n^3} q_n(n).
\]
Since $n \mid \mathrm{lcm}[n]$, it follows that $\frac{n!}{\mathrm{lcm}[n]} \mid (n-1)!$, so everything together shows that
\[
\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid p_n(n+1).
\]
At this point we know that $\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \mid p_n(m)$ for $1 \le m \le n+1$. The same trick as with the denominators
shows that $p_n(m) = p_n(1-m)$, so the claim is true for $-n \le m \le n+1$, and using Lemma 58 again we
conclude that it is true for all $m$, thus finishing the proof for the induction step, and therefore the original
claim.
Up until now we looked at each row separately. We now move to the whole matrix field.
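Definition 47. Given $n \ge 0$ and $m \ge 1$, define
\[
\begin{pmatrix} P(n,m) \\ Q(n,m) \end{pmatrix} := \left[ \prod_{k=1}^{m-1} M_Y(0,k) \right] \left[ \prod_{k=0}^{n-1} M_X(k,m) \right] e_2.
\]
In particular, as Möbius transformations we get that
\[
\left[ \prod_{k=1}^{m-1} M_Y(0,k) \right] \left[ \prod_{k=0}^{n-1} M_X(k,m) \right] (0) = \frac{P(n,m)}{Q(n,m)}.
\]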

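Remark 48. Note that for the general matrix field with $\bar f(0,0) = 0$, we can use 4 in Claim 44 to get
\[
\begin{pmatrix} P(n,m) \\ Q(n,m) \end{pmatrix} = \begin{pmatrix} \prod_{k=1}^{m} \bar f(0,k) & \hat p_m(1) \\ 0 & \hat q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \begin{pmatrix} (-1)^n \prod_{k=1}^{n} \bar f(k,0) & p_n(1) \\ 0 & q_n(1) \end{pmatrix} \begin{pmatrix} \hat p_m(n+1) \\ \hat q_m(n+1) \end{pmatrix}.
\]
In particular, in our $\zeta(3)$ case we have
\[
\begin{pmatrix} P(n,m) \\ Q(n,m) \end{pmatrix} = \begin{pmatrix} q_m(1) & p_m(1) \\ 0 & q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix} = \begin{pmatrix} q_n(1) & p_n(1) \\ 0 & q_n(1) \end{pmatrix} \begin{pmatrix} p_m(n+1) \\ q_m(n+1) \end{pmatrix}.
\]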

With this new notation, we have the new factorial reduction for these numerators and denominators.

Corollary 49. For all $n, m \ge 0$ we have that
\[
q_m(1)\, q_n(1) \mid Q(n, m+1),
\]
\[
\frac{q_m(1)\, q_n(1)}{\mathrm{lcm}[\max(m,n)]^3} \mid \gcd\left( P(n,m+1),\, Q(n,m+1) \right).
\]
In particular for $n = m$ we get that
\[
\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \cdot (n!)^3 = \frac{(q_n(1))^2}{\mathrm{lcm}[n]^3} \mid \gcd\left( P(n,n+1),\, Q(n,n+1) \right).
\]
Proof. Using the presentation from the remark above
\[
\begin{pmatrix} P(n,m+1) \\ Q(n,m+1) \end{pmatrix} = \begin{pmatrix} q_m(1) & p_m(1) \\ 0 & q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix},
\]
and Lemma 46 we get that
\[
q_m(1)\, q_n(1) \mid q_m(1)\, q_n(m+1) = Q(n,m+1),
\]
\[
\frac{q_m(1)\, q_n(1)}{\mathrm{lcm}[\max(m,n)]^3} \mid q_m(1)\, p_n(m+1) + p_m(1)\, q_n(m+1) = P(n,m+1).
\]
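The factorial reduction of Corollary 49 can also be observed numerically. The following sketch assumes the $\zeta(3)$ polynomials and computes $P(n,m), Q(n,m)$ directly from the matrix product in Definition 47, using that $M_Y(0,k)$ has diagonal entries $k^3$ and upper right entry $1$ in this case; the helper names `pq`, `PQ`, `lcm_upto` and `corollary49_holds` are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import factorial, gcd

def a(x, y):
    return x**3 + (x + 1)**3 + 2 * y * (y - 1) * (2 * x + 1)

def b(x):
    return -x**6

def lcm_upto(n):
    """lcm{1, ..., n}, with the convention lcm_upto(0) == 1."""
    return reduce(lambda u, v: u * v // gcd(u, v), range(1, n + 1), 1)

def pq(n, m):
    """(p_n(m), q_n(m)) = [prod_{k=0}^{n-1} M_X(k, m)] e_2, factors applied right to left."""
    p, q = 0, 1
    for k in reversed(range(n)):
        p, q = q, b(k + 1) * p + a(k, m) * q
    return p, q

def PQ(n, m):
    """(P(n, m), Q(n, m)) = [prod_{k=1}^{m-1} M_Y(0, k)] [prod_{k=0}^{n-1} M_X(k, m)] e_2.

    For zeta(3), M_Y(0, k) = [[k^3, 1], [0, k^3]].
    """
    P, Q = pq(n, m)
    for k in reversed(range(1, m)):
        P, Q = k**3 * P + Q, k**3 * Q
    return P, Q

def corollary49_holds(n, m):
    P, Q = PQ(n, m + 1)
    full = (factorial(n) * factorial(m))**3            # q_m(1) q_n(1)
    reduced = full // lcm_upto(max(m, n))**3           # exact, since lcm[k] divides k!
    return Q % full == 0 and gcd(P, Q) % reduced == 0

print(PQ(2, 3))
print(all(corollary49_holds(n, m) for n in range(6) for m in range(6)))
print(float(Fraction(*PQ(5, 6))))  # diagonal point, close to zeta(3)
```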

This factorial reduction property will help us in the end to show that $\zeta(3)$ is irrational, but it can also
be used to show more general properties of the matrix field, as follows.
Theorem 50. Let $n_i, m_i \ge 1$ be any sequence such that $\max(n_i, m_i) \to \infty$. Then $\frac{P(n_i,m_i)}{Q(n_i,m_i)} \to \zeta(3)$.

Before we prove this theorem, here is an interesting corollary obtained by using this theorem for fixed $m$.
Corollary 51. For any $m \ge 1$, the limit for the $Y = m$ line is $\lim_{n\to\infty} \frac{p_n(m)}{q_n(m)} = \sum_{n=m}^{\infty} \frac{1}{n^3}$.
Proof. Using the notation $\begin{pmatrix} P(n,m-1) \\ Q(n,m-1) \end{pmatrix} = \begin{pmatrix} q_{m-1}(1) & p_{m-1}(1) \\ 0 & q_{m-1}(1) \end{pmatrix} \begin{pmatrix} p_n(m) \\ q_n(m) \end{pmatrix}$, we get that
\[
\frac{P(n,m-1)}{Q(n,m-1)} = \frac{p_n(m)}{q_n(m)} + \frac{p_{m-1}(1)}{q_{m-1}(1)}.
\]
By Theorem 50 we know that $\lim_{n\to\infty} \frac{P(n,m-1)}{Q(n,m-1)} = \zeta(3)$, and we have already seen that $\frac{p_{m-1}(1)}{q_{m-1}(1)} = \sum_{n=1}^{m-1} \frac{1}{n^3}$,
so together we get that $\lim_{n\to\infty} \frac{p_n(m)}{q_n(m)} = \sum_{n=m}^{\infty} \frac{1}{n^3}$.
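Corollary 51 lends itself to a quick numerical experiment: on the $m$'th row, the ratio $p_n(m)/q_n(m)$ should approach the tail $\sum_{k \ge m} 1/k^3$. The sketch below assumes the $\zeta(3)$ polynomials; the helper names `pq` and `row_value`, the hard-coded decimal value of $\zeta(3)$, and the chosen row length $n = 60$ are assumptions made for this illustration.

```python
from fractions import Fraction

ZETA3 = 1.2020569031595942  # decimal approximation of zeta(3)

def a(x, y):
    return x**3 + (x + 1)**3 + 2 * y * (y - 1) * (2 * x + 1)

def b(x):
    return -x**6

def pq(n, m):
    """(p_n(m), q_n(m)) = [prod_{k=0}^{n-1} M_X(k, m)] e_2, factors applied right to left."""
    p, q = 0, 1
    for k in reversed(range(n)):
        p, q = q, b(k + 1) * p + a(k, m) * q
    return p, q

def row_value(n, m):
    """The n'th convergent p_n(m) / q_n(m) on the horizontal line Y = m."""
    return Fraction(*pq(n, m))

for m in (1, 2, 3):
    tail = ZETA3 - sum(1 / k**3 for k in range(1, m))  # sum over k >= m of 1/k^3
    print(m, float(row_value(60, m)), tail)
```

On the bottom row the convergents are exactly the partial sums of $\sum 1/k^3$, which gives an exact check for small $n$.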

And now for the proof of the theorem.

Proof of Theorem 50. The $m_i$ bounded case: Suppose first that $m_i$ is bounded, and by splitting the
sequence into finitely many subsequences, we may assume that $m_i = m$ is constant. We use the presentation
\[
\begin{pmatrix} P(n_i,m) \\ Q(n_i,m) \end{pmatrix} := \begin{pmatrix} q_{n_i}(1) & p_{n_i}(1) \\ 0 & q_{n_i}(1) \end{pmatrix} \begin{pmatrix} \hat p_m(n_i+1) \\ \hat q_m(n_i+1) \end{pmatrix}
\]
so that
\[
\frac{P(n_i,m)}{Q(n_i,m)} = \frac{p_m(n_i+1)}{q_m(n_i+1)} + \frac{p_{n_i}(1)}{q_{n_i}(1)}.
\]
We already know that $\lim_{n\to\infty} \frac{p_n(1)}{q_n(1)} = \zeta(3)$, so if we can show that $\deg(q_m) > \deg(p_m)$, then $\lim_{n\to\infty} \frac{p_m(n)}{q_m(n)} = 0$.
Indeed, recall that
\[
\begin{pmatrix} p_m(y) \\ q_m(y) \end{pmatrix} = \left[ \prod_{k=0}^{m-1} M_X(k,y) \right] e_2
\]
where $M_X(x,y) = \begin{pmatrix} 0 & 1 \\ b(x+1) & a(x,y) \end{pmatrix}$, and
\[
a(x,y) = x^3 + (1+x)^3 + 2y(y-1)(2x+1).
\]
This means that for every $k \ge 0$ we have that $\deg_y(a(k,y)) = 2$, and by induction $\deg(p_m) = 2m-2$ while
$\deg(q_m) = 2m$. Hence $\deg(q_m) > \deg(p_m)$ and we are done.

The $m_i$ unbounded case: Here we will use the second presentation of $P$ and $Q$, namely
\[
\begin{pmatrix} P(n,m) \\ Q(n,m) \end{pmatrix} = \begin{pmatrix} q_m(1) & p_m(1) \\ 0 & q_m(1) \end{pmatrix} \begin{pmatrix} p_n(m+1) \\ q_n(m+1) \end{pmatrix}
\]
and therefore
\[
\frac{P(n,m)}{Q(n,m)} = \frac{p_n(m+1)}{q_n(m+1)} + \frac{p_m(1)}{q_m(1)}.
\]
Fix some $\varepsilon > 0$. Since $\lim_{m\to\infty} \frac{p_m(1)}{q_m(1)} = \zeta(3)$, for all $m$ large enough we have $\left| \frac{p_m(1)}{q_m(1)} - \zeta(3) \right| \le \frac{\varepsilon}{2}$. As we
shall see below, for all $m$ large enough we also have that $\left| \frac{p_n(m+1)}{q_n(m+1)} \right| \le \frac{\varepsilon}{2}$ independent of $n$. Hence, we can find
$M = M_\varepsilon$ so that for $m \ge M_\varepsilon$ we have $\left| \frac{P(n,m)}{Q(n,m)} - \zeta(3) \right| \le \varepsilon$. Thus, if we look at the two subsequences of $(n_i, m_i)$,
where $m_i \ge M_\varepsilon$ and where $m_i \le M_\varepsilon$, we get that $\left| \frac{P(n_i,m_i)}{Q(n_i,m_i)} - \zeta(3) \right| \le \varepsilon$ on the first subsequence, and from
the previous case, if the second subsequence is infinite, so that $n_i \to \infty$, we have $\left| \frac{P(n_i,m_i)}{Q(n_i,m_i)} - \zeta(3) \right| \le \varepsilon$ for
all $i$ large enough.
We are left to show that $\left| \frac{p_n(m)}{q_n(m)} \right| \le \frac{\varepsilon}{2}$ for all $m$ large enough (independent of $n \ge 0$).
Note that the result in the corollary above that $\lim_{n\to\infty} \frac{p_n(m)}{q_n(m)} = \sum_{n=m}^{\infty} \frac{1}{n^3}$ only depends on the $m$ bounded
case, so we already fully proved it. This is a tail of a convergent series, so it will be small for all large enough
$m$. With this motivation (without the result itself from the corollary) we show that as $m$ increases, $\left| \frac{p_n(m)}{q_n(m)} \right|$
becomes bounded by a similar tail and therefore is as small as we want.
Recall that
\[
\begin{pmatrix} p_{n-1}(y) & p_n(y) \\ q_{n-1}(y) & q_n(y) \end{pmatrix} D_{b_X(n)} = \prod_{k=0}^{n-1} M_X(k,y),
\]
so taking the determinant, we get that
\[
\left( p_{n-1}(y)\, q_n(y) - p_n(y)\, q_{n-1}(y) \right) b(n) = \prod_{k=1}^{n} \left( -b(k) \right),
\]
which we can rewrite as
\[
\frac{p_n(y)}{q_n(y)} = \frac{p_{n-1}(y)}{q_{n-1}(y)} - \frac{(-1)^n \prod_{k=1}^{n-1} b(k)}{q_{n-1}(y)\, q_n(y)}.
\]

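Using the fact that $p_0(y) = 0$, $q_0(y) = 1$ and $\prod_{k=1}^{j-1} |b(k)| = ((j-1)!)^6 = |q_{j-1}(1)|^2$, we get that
\[
\left| \frac{p_n(y)}{q_n(y)} \right| = \left| \sum_{j=1}^{n} \frac{(-1)^j \prod_{k=1}^{j-1} b(k)}{q_{j-1}(y)\, q_j(y)} \right| \le \sum_{j=1}^{\infty} \left| \frac{\prod_{k=1}^{j-1} b(k)}{q_{j-1}(y)\, q_j(y)} \right|.
\]
By Lemma 46 we have that $\left| \prod_{k=1}^{n-1} b(k) \right| = (n-1)!^6 = q_{n-1}(1)^2$, and also $q_{n-1}(1) \mid q_{n-1}(m)$ and $q_{n-1}(1)\, n^3 =
q_n(1) \mid q_n(m)$ so that
\[
\left| \frac{\prod_{k=1}^{j-1} b(k)}{q_{j-1}(m)\, q_j(m)} \right| = \left| \frac{q_{j-1}(1)}{q_{j-1}(m)} \right| \cdot \left| \frac{q_{j-1}(1)\, j^3}{q_j(m)} \right| \cdot \frac{1}{j^3} \le \frac{1}{j^3}.
\]
This already shows that $\left| \frac{p_n(m)}{q_n(m)} \right| \le \sum_{j=1}^{\infty} \frac{1}{j^3}$, which is of course not enough, as instead of the tail, we got the
full sum. To solve this, we note that each one of the $q_j(y)$ for fixed $j \ge 1$ is a nonconstant polynomial in
$y$ (of degree $2j$), so that $\lim_{y\to\infty} \frac{q_{j-1}(1)}{q_{j-1}(y)} \cdot \frac{q_{j-1}(1)}{q_j(y)} = 0$. Fixing $\varepsilon > 0$ and $N > 0$, we can find $M = M_{\varepsilon,N}$ large
enough such that $\left| \frac{q_{j-1}(1)}{q_{j-1}(y)} \cdot \frac{q_{j-1}(1)}{q_j(y)} \right| < \frac{\varepsilon}{N}$ for all $y > M$ and $1 \le j < N$. In particular, for any such $m > M$ we
have that
\[
\left| \frac{p_n(m)}{q_n(m)} \right| \le \sum_{j=1}^{\infty} \left| \frac{q_{j-1}(1)}{q_{j-1}(m)} \cdot \frac{q_{j-1}(1)}{q_j(m)} \right| \le N \frac{\varepsilon}{N} + \sum_{j=N}^{\infty} \left| \frac{q_{j-1}(1)}{q_{j-1}(m)} \cdot \frac{q_{j-1}(1)}{q_j(m)} \right| \le \varepsilon + \sum_{j=N}^{\infty} \frac{1}{j^3}.
\]
Since $\sum_{j=1}^{\infty} \frac{1}{j^3} < \infty$ converges, we can find $N$ large enough so that $\sum_{j=N}^{\infty} \frac{1}{j^3} \le \varepsilon$ also, so together we get that
for all $y$ big enough (independent of $n$) we have $\left| \frac{p_n(y)}{q_n(y)} \right| \le 2\varepsilon$, which is what we wanted to prove.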

Finally, we combine all of the results to show that ζ (3) is irrational.


Theorem 52. The number ζ (3) is irrational.
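Proof. Consider the diagonal direction on the $\zeta(3)$ matrix field where $m = n+1$. From Theorem 50 we have
that
\[
\lim_{n\to\infty} \frac{P(n,n+1)}{Q(n,n+1)} = \zeta(3).
\]

The main idea:

Let us denote $Q_n = Q(n,n+1)$, $P_n = P(n,n+1)$ and $\tilde Q_n = \frac{Q_n}{\gcd(Q_n,P_n)}$, $\tilde P_n = \frac{P_n}{\gcd(Q_n,P_n)}$, so that
$\lim_{n\to\infty} \frac{P_n}{Q_n} = \lim_{n\to\infty} \frac{\tilde P_n}{\tilde Q_n} = \zeta(3)$. Recall that in Corollary 13 we showed that if $\frac{p_n}{q_n} \to L$ is any convergent rational
sequence, and $\left| L - \frac{p_n}{q_n} \right| |q_n| = o(1)$, then $L$ is irrational. Hence, our goal is to show that $\frac{\tilde P_n}{\tilde Q_n}$ converges fast
enough to $\zeta(3)$ to apply this result, and conclude that $\zeta(3)$ is irrational.
More specifically, setting $\lambda_+ = \left( 1 + \sqrt 2 \right)^4$, we will first show that given any $\varepsilon > 0$ the approximation error
is
\[
\left| \zeta(3) - \frac{P_n}{Q_n} \right| = O\left( \frac{1}{(\lambda_+ - \varepsilon)^{2n}} \right).
\]
On the other hand, we will show that $Q_n = O\left( (n!)^6 (\lambda_+ + \varepsilon)^n \right)$, and by Corollary 49 we know that
\[
\left( \frac{n!}{\mathrm{lcm}[n]} \right)^3 \cdot (n!)^3 \mid \gcd\left( P(n,n+1),\, Q(n,n+1) \right),
\]
so that $\tilde Q_n = O\left( \mathrm{lcm}[n]^3 (\lambda_+ + \varepsilon)^n \right)$. It is well known that $\mathrm{lcm}[n] = O\left( (e + \varepsilon)^n \right)$ (it follows from the prime
number theorem, see [2]), so together we get that
\[
\left| \zeta(3) - \frac{\tilde P_n}{\tilde Q_n} \right| \tilde Q_n = O\left( \left( \frac{(e+\varepsilon)^3 (\lambda_+ + \varepsilon)}{(\lambda_+ - \varepsilon)^2} \right)^n \right).
\]
Since $20.08 \sim e^3 < \lambda_+ = \left( 1 + \sqrt 2 \right)^4 \sim 33.97$, for all $\varepsilon > 0$ small enough we get that $\frac{(e+\varepsilon)^3 (\lambda_+ + \varepsilon)}{(\lambda_+ - \varepsilon)^2} < 1$. Hence
$\left| \zeta(3) - \frac{\tilde P_n}{\tilde Q_n} \right| \tilde Q_n = o(1)$, thus proving that $\zeta(3)$ is irrational.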

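Step 1: Find a recursion relation for $Q_n$:

With this main idea, we are left to find the growth rate of $Q_n$ and how fast $\left| \zeta(3) - \frac{P_n}{Q_n} \right|$ goes to zero.
Using the coboundary condition on the matrix field, we get that
\[
\begin{pmatrix} P(n,n+1) \\ Q(n,n+1) \end{pmatrix} = \left[ \prod_{k=1}^{n} M_Y(0,k) \right] \left[ \prod_{k=0}^{n-1} M_X(k,n+1) \right] e_2 = \left[ \prod_{k=1}^{n} M_X(k-1,k)\, M_Y(k,k) \right] e_2,
\]
where
\[
M_X(k-1,k)\, M_Y(k,k) = \begin{pmatrix} 0 & 1 \\ b(k) & f(k-1,k) - \bar f(k,k) \end{pmatrix} \begin{pmatrix} \bar f(k,k) & 1 \\ b(k) & f(k,k) \end{pmatrix}
\]
\[
= \begin{pmatrix} b(k) & f(k,k) \\ f(k-1,k)\, b(k) & (f\bar f)(k,0) + f(k,k) f(k-1,k) - (f\bar f)(k,k) \end{pmatrix}
\]
\[
= \begin{pmatrix} -k^6 & 6k^3 \\ -f(k-1,k)\, k^6 & 6k^3 f(k-1,k) - k^6 \end{pmatrix} = k^3 \begin{pmatrix} -k^3 & 6 \\ -f(k-1,k)\, k^3 & 6 f(k-1,k) - k^3 \end{pmatrix}.
\]
We want to use Theorem 19 to find the recurrence that $P_n$ and $Q_n$ satisfy, however in that theorem we
looked at a product of matrices applied to $e_1$ and not $e_2$. To fix this, recall that $\tau = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is the row/column
switching matrix, so that
\[
\begin{pmatrix} Q_n \\ P_n \end{pmatrix} = \tau \begin{pmatrix} P_n \\ Q_n \end{pmatrix} = \tau \left[ \prod_{k=1}^{n} M_X(k-1,k)\, M_Y(k,k) \right] \tau\, \tau e_2 = (n!)^3 \left[ \prod_{k=1}^{n} M(k) \right] e_1
\]
where
\[
M(k) = \begin{pmatrix} 6 f(k-1,k) - k^3 & -f(k-1,k)\, k^3 \\ 6 & -k^3 \end{pmatrix}, \qquad \det(M(k)) = k^6.
\]
Applying now Theorem 19 we get that $u_n = \frac{Q_n}{(n!)^3}$ satisfy the relation
\[
u_{n+1} = u_n \left( 6 f(n,n+1) - (n+1)^3 - n^3 \right) - u_{n-1}\, n^6,
\]
where $u_0 = \frac{Q_0}{0!^3} = 1$ and $u_1 = \frac{Q_1}{1!^3} = 6 f(0,1) - 1 = 5$. Denote
\[
F(n) = 6 f(n,n+1) - (n+1)^3 - n^3 = 34n^3 + 51n^2 + 27n + 5,
\]
so the recurrence can be written as $u_{n+1} = F(n)\, u_n - n^6 u_{n-1}$.

It is also interesting to note that $\frac{P_n}{(n!)^3}$ satisfies the same recurrence and $\frac{P_0}{(0!)^3} = 0$, $\frac{P_1}{(1!)^3} = 6$, so that
\[
\zeta(3) = \lim_{n\to\infty} \frac{P_n}{Q_n} = \begin{pmatrix} 0 & 6 \\ 1 & 5 \end{pmatrix} \left[ \prod_{k=1}^{\infty} \begin{pmatrix} 0 & -k^6 \\ 1 & F(k) \end{pmatrix} \right] (0) = \begin{pmatrix} 0 & 6 \\ 1 & 5 \end{pmatrix} K_1^{\infty} \frac{-k^6}{F(k)},
\]
or alternatively
\[
\frac{6}{\zeta(3)} - 5 = K_1^{\infty} \frac{-k^6}{F(k)}.
\]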
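The recurrence of Step 1 is easy to run. The sketch below iterates $u_{n+1} = F(n) u_n - n^6 u_{n-1}$ with the two sets of initial values stated above and prints the resulting convergents; the function name `apery_convergents` and the hard-coded decimal value of $\zeta(3)$ are assumptions made for this illustration.

```python
from fractions import Fraction

ZETA3 = 1.2020569031595942  # decimal approximation of zeta(3)

def F(n):
    return 34 * n**3 + 51 * n**2 + 27 * n + 5

def apery_convergents(count):
    """Return [(P_n / (n!)^3, Q_n / (n!)^3)] for n = 0, ..., count - 1.

    Both sequences satisfy x_{n+1} = F(n) x_n - n^6 x_{n-1};
    the numerators start with 0, 6 and the denominators with 1, 5.
    """
    nums, dens = [0, 6], [1, 5]
    for n in range(1, count - 1):
        nums.append(F(n) * nums[n] - n**6 * nums[n - 1])
        dens.append(F(n) * dens[n] - n**6 * dens[n - 1])
    return list(zip(nums, dens))[:count]

for n, (num, den) in enumerate(apery_convergents(6)):
    print(n, float(Fraction(num, den)))
```

Already the first few convergents agree with $\zeta(3)$ to many digits, in line with the exponential error rate discussed in the proof.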

Step 2: Analyze the recurrence to find the growth rate of $Q_n$:
By Corollary 49 we know that $(n!)^6 \mid Q_n$, so that $v_n = \frac{Q_n}{(n!)^6} = \frac{u_n}{(n!)^3}$ are integers which satisfy
\[
v_{n+1} (n+1)^3 = F(n)\, v_n - n^3 v_{n-1},
\]
where $v_0 = \frac{Q_0}{0!^6} = 1$ and $v_1 = \frac{Q_1}{1!^6} = 5$. Equivalently, we can write
\[
v_{n+1} = \frac{F(n)}{(1+n)^3} v_n - \frac{n^3}{(1+n)^3} v_{n-1}.
\]
Taking the limit only for the coefficients, we get the recurrence $v_{n+1} = 34 v_n - v_{n-1}$. This corresponds to the
quadratic equation $x^2 - 34x + 1 = 0$ with the roots
\[
\lambda_\pm = \frac{34 \pm \sqrt{1156 - 4}}{2} = \frac{34 \pm 24\sqrt 2}{2} = 17 \pm 12\sqrt 2 = \left( 1 \pm \sqrt 2 \right)^4,
\]
so a standard computation shows that $\frac{v_n}{v_{n-1}} \to \lambda_+ = \left( 1 + \sqrt 2 \right)^4$. In the original recurrence with the
nonconstant coefficients, the same holds, but needs a bit more explanation. As with the standard recurrence
with constant coefficients, we expect the general solution to behave like $v_n \sim \lambda_+^n$, though there is a specific
starting position for which $v_n \sim \lambda_-^n$. Since $\lambda_- = \left( 1 - \sqrt 2 \right)^4 \sim 0.03$, this is highly unlikely to happen, since
we deal with integer values. More specifically, the first few elements in $v_i$ are $1, 5, 73, 1445, 33001, \ldots$ which is
an increasing sequence of positive integers, and since $\frac{F(n)}{(1+n)^3} \ge 11$ for $n \ge 3$, it is not hard to show by induction
that
\[
v_{n+1} = \frac{F(n)}{(1+n)^3} v_n - \frac{n^3}{(1+n)^3} v_{n-1} \ge 11 v_n - v_{n-1} \ge 10 v_n,
\]
so $v_n$ grows at least like $10^n$, which is much faster than the very special case of $\lambda_-^n$. This is enough to show
that for every $\varepsilon > 0$ and for any $n$ large enough, we have
\[
(\lambda_+ - \varepsilon)^n \le v_n \le (\lambda_+ + \varepsilon)^n.
\]
For the reader's convenience, we add a full proof in Appendix D.
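The growth claims of Step 2 can be observed on actual values. The sketch below computes $v_n = u_n/(n!)^3$ from the recurrence for $u_n$, checks that each division is exact, and compares the ratio of consecutive terms with $\lambda_+ = (1+\sqrt 2)^4$. The function name `v_sequence` and the cut-off $n = 40$ are assumptions made for this illustration, and the slow approach of the ratio to $\lambda_+$ is only indicative.

```python
from math import e, factorial, sqrt

def F(n):
    return 34 * n**3 + 51 * n**2 + 27 * n + 5

def v_sequence(count):
    """Return [v_0, ..., v_{count-1}] with v_n = u_n / (n!)^3,
    where u_0 = 1, u_1 = 5 and u_{n+1} = F(n) u_n - n^6 u_{n-1}."""
    u = [1, 5]
    for n in range(1, count - 1):
        u.append(F(n) * u[n] - n**6 * u[n - 1])
    values = []
    for n, u_n in enumerate(u[:count]):
        quotient, remainder = divmod(u_n, factorial(n)**3)
        if remainder != 0:
            raise ValueError(f"u_{n} is not divisible by (n!)^3")
        values.append(quotient)
    return values

lambda_plus = (1 + sqrt(2))**4
v = v_sequence(41)
print(v[:5])                      # first integer values of v_n
print(v[40] / v[39], lambda_plus)  # ratio of consecutive terms vs lambda_+
print(e**3 < lambda_plus)          # the comparison used in the main idea
```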

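Step 3: Analyze the approximation error:

The sequences $\frac{P_n}{(n!)^3}$, $u_n = \frac{Q_n}{(n!)^3}$ are the numerators and denominators of the continued fraction $K_1^{\infty} \frac{-n^6}{F(n)}$.
Using Claim 14 we get that for all $n$ large enough
\[
\left| \zeta(3) - \frac{P_n}{Q_n} \right| \le \sum_{k=n}^{\infty} \frac{(k!)^6}{|u_k u_{k+1}|} = \sum_{k=n}^{\infty} \frac{1}{(k+1)^3 |v_k v_{k+1}|} \le \sum_{k=n}^{\infty} \frac{1}{(\lambda_+ - \varepsilon)^{2k+1}} = O\left( \frac{1}{(\lambda_+ - \varepsilon)^{2n}} \right).
\]
These are the growth rate for $Q_n = (n!)^6 v_n$ and the error for $\left| \zeta(3) - \frac{P_n}{Q_n} \right|$ that we needed in the beginning,
thus completing the proof.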

7 On future fractions
The main goal of this paper was to introduce this new mathematical object of conservative matrix field, and
as an application use it to reprove Apéry's result about the irrationality of ζ (3). As can be seen in Section 6,
the final proof as it is right now is very specific to the matrix field of ζ (3), which has several nice properties, and
doesn’t hold for other examples of matrix fields. However it might be possible that some of the results hold
in a more general setting.
While this irrationality result is already interesting by itself, the conservative matrix field object also seems
to have many interesting properties. Among others, it is a natural generalization of quadratic equations, and
it involves a bit of noncommutative cohomology theory in the form of cocycles and coboundaries.
So far, the conservative matrix fields that we managed to find where f, f¯ are polynomials of degree 4
or more seem to always be degenerate, namely a (x, y) = f (x, y) − f¯ (x + 1, y) doesn’t depend on y. This
might be related to the fact that we work over 2 × 2 matrices, which might bound the possible matrix fields.
Whether this is the case or not, this leads to several possible interesting generalizations of this theory, which
are standard in the theory of continued fractions.
1. While many of the results mentioned in this paper are true for general continued fractions over C (and
even other fields), the irrationality of ζ (3) relied heavily on the fact that the defining polynomials f, f¯
were in Z [x, y]. This leads naturally to the question of what happens when we use other integer rings in
algebraic extensions, e.g. $\mathbb{Z}[i]$ in $\mathbb{Q}[i]$. In both the $\zeta(2)$ and $\zeta(3)$ matrix fields we can find in the
background algebraic numbers of degree 2 (namely $1 + i$ and $\zeta_3 = e^{\frac{2\pi}{3} i}$ respectively). This type of field
extension, with the right definition of generalized continued fraction might add many more interesting
examples.

2. In the proof of the irrationality of ζ (3) we had two main results that we needed to show. One was
to find the error rate and how fast it converges to zero, and the second was to find gcd (Pn , Qn ) and
hope that it grows to infinity fast enough. As it is usually the case in number theoretic problems, the
first result lives in the standard Euclidean geometry, where we needed to show that some sequence
goes to zero in the |·|∞ norm, and the second result can be seen as showing that the p-adic norms of
|gcd (Pn , Qn )|p all go to zero as well. This suggests a more general approach where the matrix field lives
over the Adeles, and the convergence in the real and p-adic places together prove irrationality.
3. All the results in this paper were for 2 × 2 matrices, and a natural generalization would be to go
to higher-dimensional matrices. There are many suggestions for what should be the generalization of
continued fractions to higher dimensions, however probably one of the best approaches is to change the
language all together from continued fractions to lattices in Rd . The subject of lattices is well studied
in the literature with many connections to other subjects. With this approach, the question should be
what is the right way to formulate the results about general continued fraction as results on lattices,
and what can we say in higher dimension.
These three types of generalization (changing the field, the norm, or the dimension) can also be combined.
Of course, there are more tools already available for “standard” 2 × 2 matrices over the integers to study
polynomial continued fractions. However, it seems that the conservative matrix field holds some interesting
structure which might turn out to be very useful, not only for proving results about continued fractions, but
for other subjects as well.

Part III
Appendix
A Identifying polynomial continued fractions in the Euler family
Recall from Example 11 that a continued fraction $\mathbb{K}_1^\infty \frac{b(i)}{a(i)}$ is in the trivial Euler family if it has the form
$$b(x) = -h_1(x)\,h_2(x), \qquad a(x) = h_1(x) + h_2(x+1),$$
in which case we have that
$$\mathbb{K}_1^n \frac{b(i)}{a(i)} = h_2(1)\left(\frac{1}{\sum_{k=0}^{n}\prod_{i=1}^{k}\frac{h_1(i)}{h_2(i+1)}} - 1\right).$$
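The closed form above can be checked numerically with exact rational arithmetic. The following is a minimal sketch, not part of the original text: the function names `convergent` and `euler_closed_form` and the choice $h_1(x) = h_2(x) = x^2$ are assumptions made only for illustration.

```python
from fractions import Fraction

def convergent(a, b, n):
    """Evaluate K_1^n b(i)/a(i) = b(1)/(a(1) + b(2)/(a(2) + ...)) from the inside out."""
    value = Fraction(0)
    for i in range(n, 0, -1):
        value = Fraction(b(i)) / (a(i) + value)
    return value

def euler_closed_form(h1, h2, n):
    """h2(1) * (1 / sum_{k=0}^{n} prod_{i=1}^{k} h1(i)/h2(i+1) - 1)."""
    total, product = Fraction(0), Fraction(1)
    for k in range(n + 1):
        if k > 0:
            product *= Fraction(h1(k), h2(k + 1))  # extend the product by the factor i = k
        total += product
    return h2(1) * (1 / total - 1)

# Illustrative choice (an assumption of this sketch): h1(x) = h2(x) = x^2.
def h1(x): return x ** 2
def h2(x): return x ** 2
def a(x): return h1(x) + h2(x + 1)
def b(x): return -h1(x) * h2(x)

# Compare the direct evaluation with the closed form for the first few depths.
checks = [convergent(a, b, n) == euler_closed_form(h1, h2, n) for n in range(1, 8)]
print(all(checks))
```

Using `Fraction` avoids floating point error, so the comparison is an exact equality test rather than an approximate one.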

If $a, b \in \mathbb{C}[x]$, then in order to find polynomial solutions $h_1, h_2 \in \mathbb{C}[x]$, we only need to know how to
decompose $b(x)$ into a product of polynomials. In the more general case, we had
$$b(x) = -h_1(x)\,h_2(x), \qquad f(x)\,a(x) = f(x-1)\,h_1(x) + f(x+1)\,h_2(x+1),$$
where the values of the convergents are given by
$$\mathbb{K}_1^n \frac{b_i}{a_i} = \frac{f(1)\,h_2(1)}{f(0)}\left(\frac{1}{\sum_{k=0}^{n}\frac{f(0)f(1)}{f(k)f(k+1)}\prod_{i=1}^{k}\frac{h_1(i)}{h_2(i+1)}} - 1\right).$$
In this case, it is not enough to decompose $b(x)$, which is not a simple task by itself; we also need to guess
the polynomial $f$. With this in mind we have the following results, which can be used to construct
an algorithm which finds $f(x)$ (if such a polynomial exists).
Lemma 53. Suppose that there is a solution for an equation of the form
$$f(x+1)\,\beta_{(1)}(x) + f(x)\,\beta_{(0)}(x) + f(x-1)\,\beta_{(-1)}(x) = 0, \tag{5}$$
where $f, \beta_{(i)} \in \mathbb{C}[x]$ are nonzero polynomials. Let $d_f = \deg(f)$, $d = \max\{\deg \beta_{(i)} \mid i = -1, 0, 1\}$ and
write
$$\beta_{(i)}(x) = \sum_{j=0}^{d} \beta_{(i)}^{(j)} x^j$$
where the coefficients $\beta_{(i)}^{(j)} \in \mathbb{C}$ are scalars (and we use the convention of $\beta_{(i)}^{(j)} = 0$ for negative $j$). Then
1. The sum $\beta_{(-1)}^{(d)} + \beta_{(0)}^{(d)} + \beta_{(1)}^{(d)} = 0$. In particular, at least two of the $\beta_{(i)}$ have the max degree $d$.
2. If $\beta_{(-1)}^{(d)} - \beta_{(1)}^{(d)} \neq 0$, then the degree of $f$ must be
$$d_f = \frac{\beta_{(-1)}^{(d-1)} + \beta_{(0)}^{(d-1)} + \beta_{(1)}^{(d-1)}}{\beta_{(-1)}^{(d)} - \beta_{(1)}^{(d)}}.$$
In particular, this expression must be well defined and an integer.
3. If $\beta_{(-1)}^{(d)} - \beta_{(1)}^{(d)} = 0$, then $\beta_{(-1)}^{(d)} + \beta_{(1)}^{(d)} \neq 0$ and
$$\left(\beta_{(-1)}^{(d-2)} + \beta_{(0)}^{(d-2)} + \beta_{(1)}^{(d-2)}\right) + d_f\left(-\beta_{(-1)}^{(d-1)} + \beta_{(1)}^{(d-1)}\right) + \binom{d_f}{2}\left(\beta_{(-1)}^{(d)} + \beta_{(1)}^{(d)}\right) = 0$$
is a nontrivial quadratic equation in $d_f$.

Proof. In general, the coefficients of a product of polynomials are given by a convolution of the coefficients of the given
polynomials. In order to use this, we first want to find the coefficients of $f_{(k)}(x) = f(x+k)$ for $k = -1, 0, 1$, so
that
$$\sum_{k=-1}^{1} f_{(k)}(x)\,\beta_{(k)}(x) = 0.$$
Writing $f(x) = \sum_{i=0}^{d_f} f^{(i)} x^i$ where $f^{(i)} \in \mathbb{C}$, we get that
$$f_{(k)}(x) = \sum_{i=0}^{d_f} f^{(i)} (x+k)^i = \sum_{i=0}^{d_f} f^{(i)} \sum_{j=0}^{i} \binom{i}{j} x^j k^{i-j} = \sum_{j=0}^{d_f} x^j \sum_{i=j}^{d_f} f^{(i)} \binom{i}{j} k^{i-j}.$$
For $i < j$, we can write $\binom{i}{j} = 0$, so that the coefficient of $x^j$ in $f_{(k)}$ is
$$f_{(k)}^{(j)} = \sum_{i=0}^{d_f} f^{(i)} \binom{i}{j} k^{i-j}.$$
The coefficient of $x^{d_f + d - \ell}$ in $\sum_{k=-1}^{1} f_{(k)}(x)\,\beta_{(k)}(x)$ is
$$\sum_{k=-1}^{1}\sum_{j=0}^{\ell} f_{(k)}^{(d_f - j)}\,\beta_{(k)}^{(d+j-\ell)} = \sum_{k=-1}^{1}\sum_{j=0}^{\ell}\sum_{i=0}^{d_f} f^{(i)} \binom{i}{d_f - j} k^{i+j-d_f}\,\beta_{(k)}^{(d+j-\ell)} = \sum_{i=0}^{d_f} f^{(i)} \left[\sum_{j=d_f-i}^{\ell} \binom{i}{d_f - j} \sum_{k=-1}^{1} k^{i+j-d_f}\,\beta_{(k)}^{(d+j-\ell)}\right].$$

1. We first look at the leading coefficient, namely the coefficient of $x^{d_f + d}$, which should be zero. This
means that $\ell = 0$, implying that from all the sums we are left with $j = 0$ and $i = d_f$, so that
$$0 = f^{(d_f)} \left[\sum_{k=-1}^{1} \beta_{(k)}^{(d)}\right].$$
The leading coefficient of $f$ is nonzero, so we are left with
$$0 = \beta_{(-1)}^{(d)} + \beta_{(0)}^{(d)} + \beta_{(1)}^{(d)}.$$
Since this sum is zero, and at least one of the summands is nonzero (since $d = \max\{\deg \beta_{(k)} \mid k = -1, 0, 1\}$),
at least two of them are nonzero.


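2. Next, taking $\ell = 1$ and equating the coefficient to 0, we get that
$$0 = \sum_{i=0}^{d_f} f^{(i)} \left[\sum_{j=d_f-i}^{1} \binom{i}{d_f - j} \sum_{k=-1}^{1} k^{i+j-d_f}\,\beta_{(k)}^{(d+j-1)}\right] = \overbrace{f^{(d_f)} \left[\sum_{j=0}^{1} \binom{d_f}{d_f - j} \sum_{k=-1}^{1} k^{j}\,\beta_{(k)}^{(d+j-1)}\right]}^{i = d_f} + \overbrace{f^{(d_f - 1)} \left[\sum_{k=-1}^{1} \beta_{(k)}^{(d)}\right]}^{i = d_f - 1}.$$
From part (1) we know that $\sum_{k=-1}^{1} \beta_{(k)}^{(d)} = 0$. Using again the fact that $f^{(d_f)} \neq 0$, we get that
$$0 = \sum_{k=-1}^{1} \beta_{(k)}^{(d-1)} + d_f \sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)}.$$
In particular, if $\beta_{(1)}^{(d)} - \beta_{(-1)}^{(d)} = \sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)} \neq 0$, then
$$d_f = -\frac{\sum_{k=-1}^{1} \beta_{(k)}^{(d-1)}}{\sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)}}.$$
Otherwise, we get that $\sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)} = \sum_{k=-1}^{1} \beta_{(k)}^{(d-1)} = 0$.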

3. Finally, letting $\ell = 2$, we get
$$0 = \sum_{i=0}^{d_f} f^{(i)} \left[\sum_{j=d_f-i}^{2} \binom{i}{d_f - j} \sum_{k=-1}^{1} k^{i+j-d_f}\,\beta_{(k)}^{(d+j-2)}\right]$$
$$= f^{(d_f)} \left[\sum_{j=0}^{2} \binom{d_f}{d_f - j} \sum_{k=-1}^{1} k^{j}\,\beta_{(k)}^{(d+j-2)}\right] + f^{(d_f-1)} \left[\sum_{j=1}^{2} \binom{d_f - 1}{d_f - j} \sum_{k=-1}^{1} k^{j-1}\,\beta_{(k)}^{(d+j-2)}\right] + f^{(d_f-2)} \left[\sum_{k=-1}^{1} \beta_{(k)}^{(d)}\right].$$
Once again, we know that $\sum_{k=-1}^{1} \beta_{(k)}^{(d)} = 0$, which removes the last summand.
If $\sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)} = \sum_{k=-1}^{1} \beta_{(k)}^{(d-1)} = 0$, then the second summand is zero, and dividing by the nonzero
coefficient $f^{(d_f)}$, we are left with
$$0 = \sum_{k=-1}^{1} \beta_{(k)}^{(d-2)} + d_f \sum_{k=-1}^{1} k\,\beta_{(k)}^{(d-1)} + \binom{d_f}{2} \sum_{k=-1}^{1} k^2\,\beta_{(k)}^{(d)}.$$
We already have that $\sum_{k=-1}^{1} \beta_{(k)}^{(d)} = 0$ and assumed that $\sum_{k=-1}^{1} k\,\beta_{(k)}^{(d)} = 0$. If $\sum_{k=-1}^{1} k^2\,\beta_{(k)}^{(d)} = 0$ as well, then we
must have that $\beta_{(k)}^{(d)} = 0$ for $k = -1, 0, 1$, but $d$ was chosen as the max degree of the $\beta_{(k)}$, so at least one
of the $\beta_{(k)}^{(d)}$ cannot be zero. Thus under our assumption we get that $\sum_{k=-1}^{1} k^2\,\beta_{(k)}^{(d)} = \beta_{(-1)}^{(d)} + \beta_{(1)}^{(d)} \neq 0$,
so that the quadratic equation above is not trivial.

Note that once we know the β(k) and the degree of f , equation (5) in the lemma is a linear system in the
coefficients of f , which can easily be solved using standard methods.
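The degree conditions of Lemma 53 translate directly into a small search routine. The sketch below is one possible reading of that algorithm, not code from the paper: the name `candidate_degrees`, the coefficient-list convention (lowest degree first, nonzero leading entry), and the use of the Cauchy root bound to enumerate integer roots of the quadratic are all assumptions of this illustration.

```python
from fractions import Fraction

def coeff(p, j):
    """Coefficient of x^j in p (list, lowest degree first); zero outside the list."""
    return Fraction(p[j]) if 0 <= j < len(p) else Fraction(0)

def candidate_degrees(beta_m1, beta_0, beta_1):
    """Nonnegative integers d_f = deg(f) that Lemma 53 allows for equation (5)."""
    betas = {-1: beta_m1, 0: beta_0, 1: beta_1}
    d = max(len(p) - 1 for p in betas.values())

    def s(m, t):
        # sum over k of k^t times the coefficient of x^(d-m) in beta_(k)
        return sum(k ** t * coeff(p, d - m) for k, p in betas.items())

    if s(0, 0) != 0:                 # part 1 is violated: no solution at all
        return []
    if s(0, 1) != 0:                 # part 2: 0 = s(1,0) + d_f * s(0,1)
        root = -s(1, 0) / s(0, 1)
        return [int(root)] if root.denominator == 1 and root >= 0 else []
    if s(1, 0) != 0:                 # forced to vanish by the proof of part 2
        return []
    # part 3: 0 = s(2,0) + n*s(1,1) + n(n-1)/2 * s(0,2), where s(0,2) != 0
    A, B, C = s(0, 2) / 2, s(1, 1) - s(0, 2) / 2, s(2, 0)
    bound = int(1 + max(abs(B / A), abs(C / A)))   # Cauchy bound on the roots
    return [n for n in range(bound + 1) if A * n * n + B * n + C == 0]

# Equation (6) with h1 = h2 = x^3 and a = x^3 + (x+1)^3 + 4(2x+1):
# beta_(-1) = -h1(x), beta_(0) = a(x), beta_(1) = -h2(x+1).
print(candidate_degrees([0, 0, 0, -1], [5, 11, 3, 2], [-1, -3, -3, -1]))
```

Once a candidate degree is known, the coefficients of $f$ can be found from the linear system given by equation (5), for example by Gaussian elimination over the rationals.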
Applying the previous lemma to our case, we get the following:
Corollary 54. Suppose that $f, a, h_1, h_2 \in \mathbb{C}[x]$ are polynomials satisfying
$$f(x)\,a(x) - f(x-1)\,h_1(x) - f(x+1)\,h_2(x+1) = 0. \tag{6}$$
Let $d = \max\{\deg(a), \deg(h_1), \deg(h_2)\}$, and write
$$a(x) = \sum_{i=0}^{d} a^{(i)} x^i, \qquad h_1(x) = \sum_{i=0}^{d} h_1^{(i)} x^i, \qquad h_2(x) = \sum_{i=0}^{d} h_2^{(i)} x^i.$$
Then
1. We have $a^{(d)} = h_1^{(d)} + h_2^{(d)}$ and at least two of the $h_1^{(d)}, h_2^{(d)}, a^{(d)}$ are nonzero.
2. If $h_1^{(d)} \neq h_2^{(d)}$, then the degree of $f$ must be
$$d_f = \frac{a^{(d-1)} - h_1^{(d-1)} - h_2^{(d-1)} - d\,h_2^{(d)}}{h_2^{(d)} - h_1^{(d)}}.$$
In particular, this expression must be well defined and an integer.
3. If $h_1^{(d)} = h_2^{(d)}$, then $a^{(d-1)} = h_1^{(d-1)} + h_2^{(d-1)} + d\,h_2^{(d)}$ and
$$a^{(d-2)} - h_1^{(d-2)} - \left(h_2^{(d-2)} + (d-1)\,h_2^{(d-1)} + \binom{d}{2} h_2^{(d)}\right) + d_f\left(h_1^{(d-1)} - \left(h_2^{(d-1)} + d\,h_2^{(d)}\right)\right) - \binom{d_f}{2}\left(h_1^{(d)} + h_2^{(d)}\right) = 0$$
is a nontrivial quadratic equation in $d_f$ (namely $h_1^{(d)} + h_2^{(d)} \neq 0$).

Example 55. 1. Suppose that we start with a polynomial continued fraction in the trivial Euler family,
namely
$$b(x) = -h_1(x)\,h_2(x), \qquad a(x) = h_1(x) + h_2(x+1).$$
Then the corollary above should show that $d_f = 0$, namely we can take $f \equiv 1$ constant. Let’s see three
examples:
(a) If $b(x) = -x^d \times x^d$ and $a(x) = x^d + (1+x)^d$, then
$$\begin{array}{c|ccc} j = & d & d-1 & d-2 \\ \hline a^{(j)} & 2 & d & \binom{d}{2} \\ h_1^{(j)} & 1 & 0 & 0 \\ h_2^{(j)} & 1 & 0 & 0 \end{array}$$
Part (1) in the corollary above of course holds. Since $h_1^{(d)} = h_2^{(d)}$ we would have to use part (3) to
find $d_f$, where we would get the equation
$$0 = \binom{d}{2} - \binom{d}{2} - d \cdot d_f - 2\binom{d_f}{2} = -(d + d_f - 1)\,d_f.$$
Hence either $d_f = 1 - d \leq 0$ or $d_f = 0$, so either way we know to look for a constant solution $f$.
(b) If $b(x) = -\left(-x^d\right) \times x^d$ and $a(x) = (1+x)^d - x^d$, then
$$\begin{array}{c|ccc} j = & d & d-1 & d-2 \\ \hline a^{(j)} & 0 & d & \binom{d}{2} \\ h_1^{(j)} & -1 & 0 & 0 \\ h_2^{(j)} & 1 & 0 & 0 \end{array}$$
This time $h_1^{(d)} \neq h_2^{(d)}$, so we can use part (2) to get
$$d_f = \frac{a^{(d-1)} - h_1^{(d-1)} - h_2^{(d-1)} - d\,h_2^{(d)}}{h_2^{(d)} - h_1^{(d)}} = \frac{d - d}{1 - (-1)} = 0.$$
(c) We can also look at the case $\deg(h_1) \neq \deg(h_2)$, for example $b(x) = -1 \times x$ and $a(x) = 1 + (x+1)$,
so that $d = 1$. Here the coefficients of $x^{-1}$ are considered as zero and we get
$$\begin{array}{c|ccc} j = & d & d-1 & d-2 \\ \hline a^{(j)} & 1 & 2 & 0 \\ h_1^{(j)} & 0 & 1 & 0 \\ h_2^{(j)} & 1 & 0 & 0 \end{array}$$
Since $h_1^{(d)} \neq h_2^{(d)}$ we have that
$$d_f = \frac{a^{(d-1)} - h_1^{(d-1)} - h_2^{(d-1)} - d\,h_2^{(d)}}{h_2^{(d)} - h_1^{(d)}} = \frac{2 - 1 - 0 - 1}{1 - 0} = 0.$$

2. Take $b(x) = -x^3 \times x^3$ and $a(x) = x^3 + (1+x)^3 + 4 \cdot (2x+1)$, which is the continued fraction on the
second line in the ζ (3) matrix field discussed in Section 6. After choosing the decomposition of $-b(x)$
with $h_1(x) = h_2(x) = x^3$, so that $d = 3$, we have
$$\begin{array}{c|ccc} j = & d & d-1 & d-2 \\ \hline a^{(j)} & 2 & 3 & 11 = \binom{3}{2} + 8 \\ h_1^{(j)} & 1 & 0 & 0 \\ h_2^{(j)} & 1 & 0 & 0 \end{array}$$
Since $h_1^{(d)} = h_2^{(d)}$, we can use part (3) in the corollary to get
$$0 = 8 - 3d_f - d_f(d_f - 1) = 8 - 2d_f - d_f^2 = (4 + d_f)(2 - d_f).$$
Since $d_f$ needs to be nonnegative, we only need to check $d_f = 2$. Solving the linear system will produce
$f(x) = x^2 + x + \frac{1}{2}$.
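The claimed solution can be verified with exact arithmetic. The sketch below is an illustration and not part of the original text; the helper names `residual`, `convergent` and `closed_form` are hypothetical. It checks that $f(x) = x^2 + x + \frac12$ satisfies equation (6) for $h_1 = h_2 = x^3$, and that the general convergent formula from the beginning of this appendix agrees with a direct evaluation of the continued fraction.

```python
from fractions import Fraction

def f(x): return x * x + x + Fraction(1, 2)
def h1(x): return x ** 3
def h2(x): return x ** 3
def a(x): return x ** 3 + (x + 1) ** 3 + 4 * (2 * x + 1)
def b(x): return -h1(x) * h2(x)

def residual(x):
    """Left-hand side of equation (6) evaluated at x."""
    return f(x) * a(x) - f(x - 1) * h1(x) - f(x + 1) * h2(x + 1)

# The residual is a polynomial of degree at most 5, so vanishing at
# 21 distinct points shows that it is identically zero.
identity_holds = all(residual(x) == 0 for x in range(-10, 11))

def convergent(n):
    """Direct evaluation of K_1^n b(i)/a(i), from the inside out."""
    value = Fraction(0)
    for i in range(n, 0, -1):
        value = Fraction(b(i)) / (a(i) + value)
    return value

def closed_form(n):
    """The general Euler-family formula with the extra polynomial f."""
    total, product = Fraction(0), Fraction(1)
    for k in range(n + 1):
        if k > 0:
            product *= Fraction(h1(k), h2(k + 1))
        total += f(0) * f(1) / (f(k) * f(k + 1)) * product
    return f(1) * h2(1) / f(0) * (1 / total - 1)

formula_holds = all(convergent(n) == closed_form(n) for n in range(1, 8))
print(identity_holds, formula_holds)
```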

3. Take $b(x) = -x^6$ and $a(x) = 34x^3 + 51x^2 + 27x + 5$, which is the polynomial continued fraction we
got on the diagonal of the ζ (3) matrix field in Theorem 52. Let us show that in this case there is no
solution to equation (6) in the corollary.
Assume for contradiction that there is a solution. In any decomposition $b(x) = -h_1(x)\,h_2(x)$ we have
that $\deg(h_1) + \deg(h_2) = 6$. Since $\deg(a) = 3$, in order for part (1) in the corollary to hold, we
must have that $\deg(h_1) = \deg(h_2) = 3$, so that $h_1(x) = c\,x^3$ and $h_2(x) = \frac{1}{c} x^3$. We also need that
$c + \frac{1}{c} = h_1^{(d)} + h_2^{(d)} = a^{(d)} = 34$, so that $c^2 - 34c + 1 = 0$, and in particular $c \neq \pm 1$.
From this we conclude that $h_1^{(d)} = c \neq \frac{1}{c} = h_2^{(d)}$, so we may use part (2) to get
$$d_f = \frac{a^{(d-1)} - h_1^{(d-1)} - h_2^{(d-1)} - d\,h_2^{(d)}}{h_2^{(d)} - h_1^{(d)}} = \frac{51 - 0 - 0 - \frac{3}{c}}{\frac{1}{c} - c} = \frac{51c - 3}{1 - c^2},$$
where $d_f \geq 0$ is an integer. It follows that $c$ also satisfies the quadratic equation $d_f c^2 + 51c - (d_f + 3) = 0$.
Combining the two quadratic equations we get
$$0 = \left[d_f c^2 + 51c - (d_f + 3)\right] - d_f\left[c^2 - 34c + 1\right] = (51 + 34 \cdot d_f)\,c - (2d_f + 3),$$
so $c$ must be a rational number. However, if $c^2 - 34c + 1$ has a rational root, then its denominator and
numerator must divide 1, namely the root must be $\pm 1$, which is a contradiction.
We conclude that the polynomial continued fraction presentation $\mathbb{K}_1^\infty \frac{-n^6}{34n^3 + 51n^2 + 27n + 5}$ cannot be written
as in equation (6).
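The last step of this argument relies on the rational root theorem: any rational root $p/q$ in lowest terms of an integer polynomial has $p$ dividing the constant coefficient and $q$ dividing the leading one. A minimal sketch of that check is given below; the function `rational_roots` is a hypothetical helper written for this illustration, and it assumes integer coefficients listed from lowest degree with nonzero first and last entries.

```python
from fractions import Fraction

def rational_roots(coeffs):
    """Rational roots of an integer polynomial, coefficients lowest degree first."""
    def divisors(n):
        n = abs(n)
        return [q for q in range(1, n + 1) if n % q == 0]

    # Rational root theorem: numerator divides coeffs[0], denominator divides coeffs[-1].
    candidates = {Fraction(sign * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r ** j for j, c in enumerate(coeffs)) == 0)

# c^2 - 34c + 1: the only candidates are +1 and -1, and neither is a root.
print(rational_roots([1, -34, 1]))
```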

B Conservative matrix field of degree 1
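Here we give a full solution to the problem of finding
$$f(x, y) = ax + by + c, \qquad \bar{f}(x, y) = \bar{a}x + \bar{b}y + \bar{c}$$
which satisfy the conditions in Definition 29, namely
$$f(x, y) - f(x+1, y-1) = \bar{f}(x+1, y) - \bar{f}(x, y-1),$$
$$\left(f\bar{f}\right)(x, y) + \left(f\bar{f}\right)(0, 0) = \left(f\bar{f}\right)(x, 0) + \left(f\bar{f}\right)(0, y).$$
Solving the linear condition gives us
$$(ax + by + c) - (a(x+1) + b(y-1) + c) = \left(\bar{a}(x+1) + \bar{b}y + \bar{c}\right) - \left(\bar{a}x + \bar{b}(y-1) + \bar{c}\right),$$
$$b - a = \bar{a} + \bar{b}.$$
For the quadratic condition we have
$$\left(f\bar{f}\right)(x, y) = (ax + by + c)\left(\bar{a}x + \bar{b}y + \bar{c}\right) = a\bar{a}x^2 + (a\bar{c} + c\bar{a})\,x + b\bar{b}y^2 + \left(b\bar{c} + c\bar{b}\right) y + c\bar{c} + \left(a\bar{b} + b\bar{a}\right) xy.$$
Since the quadratic condition simply says that there are no monomials with mixed $x$ and $y$, in this case we
simply get that $a\bar{b} + b\bar{a} = 0$.

To solve these two conditions, write the quadratic condition as $\det\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix} = 0$, and the linear condition
is just $(1, 1)\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = 0$. A $2 \times 2$ matrix has determinant 0 if and only if it has rank at most 1, so
that it has the form $v \cdot u^{tr}$ where $v, u$ are column vectors. Hence, we get that the quadratic condition is
$\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix} = v \cdot u^{tr}$ and the linear condition is then
$$0 = (1, 1)\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \left((1, 1)\,v\right) \cdot \left(u^{tr}\begin{pmatrix} -1 \\ 1 \end{pmatrix}\right).$$
This is now a product of two scalars which is zero, so that either $(1, 1)\,v = 0$ or $u^{tr}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = 0$. These imply that
$(1, 1)\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix} = (0, 0)$ or $\begin{pmatrix} -a & \bar{a} \\ b & \bar{b} \end{pmatrix}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ respectively.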
To summarize, our pair of polynomials are either of the form

f (x, y) = a (x + y) + c
f¯ (x, y) = ā (x − y) + c̄

or

f (x, y) = ax + by + c
f¯ (x, y) = −ax + by + c̄
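Both families can be checked directly against the two conditions stated at the start of this section. The following sketch tests the conditions on a small integer grid; the function name `is_conservative_degree_one`, the grid size, and the sample coefficients are assumptions of this illustration. Since the defects of both conditions are polynomials of low degree in $x$ and $y$, vanishing on the whole grid is enough to conclude that they vanish identically.

```python
def is_conservative_degree_one(a, b, c, abar, bbar, cbar, radius=4):
    """Check the linear and quadratic conditions for f = ax+by+c, fbar = abar*x+bbar*y+cbar."""
    def f(x, y): return a * x + b * y + c
    def fbar(x, y): return abar * x + bbar * y + cbar
    def ff(x, y): return f(x, y) * fbar(x, y)

    for x in range(-radius, radius + 1):
        for y in range(-radius, radius + 1):
            linear = f(x, y) - f(x + 1, y - 1) == fbar(x + 1, y) - fbar(x, y - 1)
            quadratic = ff(x, y) + ff(0, 0) == ff(x, 0) + ff(0, y)
            if not (linear and quadratic):
                return False
    return True

# First family: f = a(x + y) + c, fbar = abar(x - y) + cbar.
print(is_conservative_degree_one(3, 3, 5, 7, -7, 2))
# Second family: f = ax + by + c, fbar = -ax + by + cbar.
print(is_conservative_degree_one(3, 4, 5, -3, 4, -1))
# Satisfies the linear condition (b - a = abar + bbar) but not the quadratic one.
print(is_conservative_degree_one(1, 2, 0, 1, 0, 0))
```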

C The algebra of integer valued polynomials

In the previous section we answered the question of how, given a polynomial $g(x)$ and some number $d$, to check
whether $d \mid g(n)$ for all integers $n$; equivalently, whether $d \mid \gcd\{g(n) \mid n \in \mathbb{Z}\}$. Of course, if we can write $g(x) = d\,\tilde{g}(x)$
with $\tilde{g} \in \mathbb{Z}[x]$, then this condition holds, but the other direction is not true. For example, the polynomial
$x(x+1)$ is not divisible by 2, but for every integer $n$ either $n$ or $n+1$ is even, so $n(n+1)$ is always divisible
by 2. We can even go further to polynomials with rational coefficients, like $g(x) = \frac{x(x+1)(x+2)}{3}$, such that for
any $n \in \mathbb{Z}$ we still have that $g(n)$ is an even integer. This motivates us to define the following.
Definition 56. For $n \in \mathbb{N}$ we define the polynomial $\binom{x}{n} = \prod_{i=0}^{n-1} \frac{x - i}{i + 1} = \frac{x \cdot (x-1) \cdots (x-n+1)}{n!}$. This is a
polynomial of degree $n$ in $\mathbb{Q}[x]$, so that $\left\{\binom{x}{n}\right\}_{n=0}^{\infty}$ is a $\mathbb{Q}$-basis for $\mathbb{Q}[x]$. In particular, for a nonnegative
integer $x$, we simply get the binomial coefficients.

The polynomials $f$ in $\mathbb{Q}[x]$ satisfying $f(\mathbb{Z}) \subseteq \mathbb{Z}$ are called integer valued polynomials. This class was
fully described by Pólya in [9], and it was shown to contain exactly the integer combinations of the $\binom{x}{n}$ above.
For the ease of the reader, we add the proof for this result here.
Lemma 57. For every integer $m$, we have that $\binom{m}{n} \in \mathbb{Z}$.

Proof. We first note that Pascal’s identity holds for the polynomial. Indeed, taking
$$p(x) = \binom{x}{n} - \binom{x-1}{n-1} - \binom{x-1}{n},$$
we get a finite degree polynomial where $p(m) = 0$ for all $m \geq n$, so that $p(x) \equiv 0$ as a polynomial.
In order to prove that $\binom{m}{n} \in \mathbb{Z}$ for all $n \in \mathbb{N}$ and $m \in \mathbb{Z}$, we use induction, first on $n$ and then on $m$.
For $n = 0$ we simply get that $\binom{m}{0} = 1$ and for $n = 1$ we get $\binom{m}{1} = m$ for all $m$, so we are done.
Assume now the claim for $n - 1$ and we prove for $n \geq 2$. By Pascal’s identity we have $\binom{m}{n} = \binom{m-1}{n-1} + \binom{m-1}{n}$
and since $\binom{m-1}{n-1}$ is always an integer by the induction hypothesis, then $\binom{m}{n}$ is an integer if and only if $\binom{m-1}{n}$
is an integer. In other words, we only need to show this for a single $m$. Taking $m = 0$ we get $\binom{0}{n} = 0$ and we
are done.
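As a quick sanity check of Lemma 57 and of Pascal's identity for negative arguments, the polynomial $\binom{x}{n}$ can be evaluated exactly. This is a minimal sketch; the function name `binom_poly` and the tested ranges are assumptions of this illustration.

```python
from fractions import Fraction

def binom_poly(x, n):
    """Evaluate the polynomial C(x, n) = x(x-1)...(x-n+1)/n! at x."""
    value = Fraction(1)
    for i in range(n):
        value *= Fraction(x - i, i + 1)
    return value

# Lemma 57: integer values at every integer, including negative ones.
integer_valued = all(binom_poly(m, n).denominator == 1
                     for n in range(0, 7) for m in range(-10, 11))

# Pascal's identity C(m, n) = C(m-1, n-1) + C(m-1, n) for n >= 1.
pascal_holds = all(binom_poly(m, n) == binom_poly(m - 1, n - 1) + binom_poly(m - 1, n)
                   for n in range(1, 7) for m in range(-10, 11))

print(integer_valued, pascal_holds, binom_poly(-3, 2))
```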
In the following, when we write d | q for d ∈ Z and q ∈ Q, we mean that q has to be an integer and is
divisible by d.
Lemma 58. Given a general polynomial $f(x) = \sum_{n=0}^{d} a_n \binom{x}{n} \in \mathbb{Q}[x]$ and an integer $k$, the following are
equivalent:
1. For all $0 \leq n \leq d$ we have $k \mid a_n$,
2. For all $m \in \mathbb{Z}$ we have $k \mid f(m)$,
3. For $m = 0, 1, ..., d$, we have $k \mid f(m)$, and
4. There exists $m_0$ such that $k \mid f(m_0 + m)$ for $m = 0, 1, ..., d$.

Proof. Note first that by considering the polynomial $\frac{f(x)}{k}$ instead, it is enough to prove the lemma for $k = 1$.
Namely, we just need to show that the coefficients and evaluations are integers.
• (1) ⇒ (2): follows from the fact that $\binom{m}{n}$ are integers for all $m$.

• (2) ⇒ (3): is trivial.

• (3) ⇒ (1): Since $\binom{n}{n} = 1$ and $\binom{m}{n} = 0$ when $0 \leq m < n$, it follows that for $0 \leq m \leq d$ we have
$$f(m) = a_m + \sum_{n=0}^{m-1} a_n \binom{m}{n},$$
which we can also write as
$$a_m = f(m) - \sum_{n=0}^{m-1} a_n \binom{m}{n}.$$
So if $a_n \in \mathbb{Z}$ for $0 \leq n < m$, then since $f(m) \in \mathbb{Z}$ by assumption and $\binom{m}{n} \in \mathbb{Z}$, we conclude that
$a_m \in \mathbb{Z}$. Thus, by induction we get that $a_n \in \mathbb{Z}$ for all $0 \leq n \leq d$, namely we get (1).

• (2) ⇒ (4): is trivial.
• (4) ⇒ (2): If $f(m_0 + m) \in \mathbb{Z}$ for $m = 0, ..., d$, then setting $g(m) = f(m_0 + m)$ we see that $g(m) \in \mathbb{Z}$ for
$m = 0, ..., d$. By the (3) ⇒ (2) direction for the degree $d$ polynomial $g$ we get that $g(m) = f(m_0 + m) \in \mathbb{Z}$
for all $m$, which is exactly condition (2) for the polynomial $f$ and we are done.
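The recursion in the (3) ⇒ (1) step gives a direct way to compute the binomial-basis coefficients from the values $f(0), ..., f(d)$, and by the lemma their gcd is the largest integer dividing every value $f(m)$. The sketch below illustrates this; the function names are hypothetical, and it assumes the caller supplies a degree bound $d$ and a function returning exact rational values.

```python
from fractions import Fraction
from functools import reduce
from math import comb, gcd

def binomial_basis_coefficients(g, d):
    """Coefficients a_0..a_d with g(x) = sum a_n C(x, n), computed from g(0), ..., g(d)."""
    coeffs = []
    for m in range(d + 1):
        # a_m = g(m) - sum_{n < m} a_n C(m, n), as in the proof of (3) => (1)
        a_m = Fraction(g(m)) - sum(a_n * comb(m, n) for n, a_n in enumerate(coeffs))
        coeffs.append(a_m)
    return coeffs

def common_divisor_of_values(g, d):
    """Largest k dividing g(m) for all integers m, or None if g is not integer valued."""
    coeffs = binomial_basis_coefficients(g, d)
    if any(a.denominator != 1 for a in coeffs):
        return None
    return reduce(gcd, (int(a) for a in coeffs), 0)

def g(x):
    return Fraction(x * (x + 1) * (x + 2), 3)

print(binomial_basis_coefficients(g, 3))
print(common_divisor_of_values(g, 3))
```

For $g(x) = \frac{x(x+1)(x+2)}{3}$ the coefficients are all even integers, matching the earlier remark that $g(n)$ is always an even integer.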

D Asymptotics of recurrence with convergent coefficients
In this section we fix a recurrence relation over $\mathbb{R}$:
$$v_{n+1} = a_n v_n + b_n v_{n-1}, \qquad \lim_{n \to \infty} a_n = a, \qquad \lim_{n \to \infty} b_n = b. \tag{7}$$
Denote by $x^2 = ax + b$ the quadratic polynomial corresponding to the limit and assume that its two roots $\lambda_\pm$
are distinct and satisfy $0 < |\lambda_-| < \lambda_+$. It is well known that in the limit recursion $v_{n+1} = a v_n + b v_{n-1}$, unless
the starting position is $(v_0, v_1) = c\,(1, \lambda_-)$ for some constant $c$, we have $v_n \sim \lambda_+^n$. In this section we want to show
that a similar claim holds for the recurrence relation with the convergent coefficients. In this case, it is not
enough to have a condition on the starting position. Indeed, we might even have $a_n = b_n = a_{n+1} = b_{n+1} = 0$
for some $n$, which leads to $v_k = 0$ for all $k \geq n + 2$. Instead, our condition will be that if $v_n$ grows at least
slightly better than $|\lambda_-|^n$, then it will behave like $\lambda_+^n$.

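As usual, the first step is to move to matrix multiplication by rewriting the recurrence as
$$(v_{n-1}\ \ v_n)\begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix} = (v_n\ \ v_{n+1}).$$
Letting $M_n = \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix}$, we basically want to find the asymptotics of $(v_0\ \ v_1)\left(\prod_{1}^{k-1} M_n\right)$, where we know
that (1) $M_n \to M_\infty = \begin{pmatrix} 0 & b \\ 1 & a \end{pmatrix}$ and (2) $M = M_\infty$ is diagonalizable with eigenvalues $\lambda_\pm$. This diagonalization lets us
simplify the notation a bit. Letting $P \in \mathrm{GL}_2(\mathbb{R})$ such that $D = P M P^{-1} = \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix}$, write $D_n = P M_n P^{-1}$
so that $D_n \to D$ and $\prod_{1}^{k} D_n = P \left(\prod_{1}^{k} M_n\right) P^{-1}$. We expect the asymptotics of the corresponding sequence
$(\alpha_k, \beta_k) = (\alpha_1, \beta_1) \prod_{1}^{k-1} D_n$ to behave like $\alpha_k \sim \lambda_+^k$ and $\beta_k \sim \lambda_-^k$, so in particular $\frac{\beta_k}{\alpha_k} \sim \left(\frac{\lambda_-}{\lambda_+}\right)^k \to 0$. This is
indeed true, under the right condition, and will eventually give us the required result about the recurrence.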
 
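Lemma 59. Suppose that $D_n \to D$ where $D = \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix}$ and $0 \leq |\lambda_-| < \lambda_+$ and set $(\alpha_k, \beta_k) = (\alpha_1, \beta_1) \prod_{1}^{k-1} D_n$
for some initial position $(\alpha_1, \beta_1)$. If $\left|\frac{\beta_k}{\alpha_k}\right|$ has a bounded subsequence, then $\frac{\beta_k}{\alpha_k} \to 0$.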


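Proof. By assumption, there is $M \geq 1$ and a bounded subsequence $\left|\frac{\beta_{k_i}}{\alpha_{k_i}}\right| \leq M$ for all $i$. We fix some
$0 < \varepsilon < 1$, and since $0 < |\lambda_-| < \lambda_+$, for all small enough such choice we have (1) $(1 + M)\,\varepsilon \leq \sqrt{\varepsilon}$ and (2)
$\eta_\varepsilon := \frac{|\lambda_-| + 2\sqrt{\varepsilon}}{\lambda_+ - \sqrt{\varepsilon}} < 1$.

We shall show below that when $\|D_k - D\|_\infty < \varepsilon$ and $\left|\frac{\beta_k}{\alpha_k}\right| \leq M$, we get
$$\left|\frac{\beta_{k+1}}{\alpha_{k+1}}\right| \leq \begin{cases} \eta_\varepsilon \left|\frac{\beta_k}{\alpha_k}\right| & \text{if } \sqrt{\varepsilon} \leq \left|\frac{\beta_k}{\alpha_k}\right| \\ \sqrt{\varepsilon} & \text{if } \left|\frac{\beta_k}{\alpha_k}\right| < \sqrt{\varepsilon} \end{cases}. \tag{8}$$
The fact that $\|D_k - D\|_\infty \to 0$ implies that for all $k$ big enough the condition $\|D_k - D\|_\infty < \varepsilon$ holds. For
these $k$’s, once we have a single $k_0$ for which $\left|\frac{\beta_k}{\alpha_k}\right| \leq M$, the sequence $\left|\frac{\beta_k}{\alpha_k}\right|$ will decrease by a factor of $\eta_\varepsilon < 1$,
until it becomes smaller than $\sqrt{\varepsilon}$, and then it will remain as such. As $\varepsilon > 0$ can be arbitrarily small, we
conclude that $\frac{\beta_k}{\alpha_k} \to 0$ as required.

To prove equation (8), we use the fact that $(\alpha_{k+1}, \beta_{k+1}) = (\alpha_k, \beta_k)\,D_k$ where $D_k = D + \begin{pmatrix} \varepsilon_{1,1} & \varepsilon_{1,2} \\ \varepsilon_{2,1} & \varepsilon_{2,2} \end{pmatrix}$ to get
that
$$\left|\frac{\beta_{k+1}}{\alpha_{k+1}}\right| = \left|\frac{\alpha_k \varepsilon_{1,2} + \beta_k (\lambda_- + \varepsilon_{2,2})}{\alpha_k (\lambda_+ + \varepsilon_{1,1}) + \beta_k \varepsilon_{2,1}}\right| \leq \frac{|\alpha_k|\,\varepsilon + |\beta_k|\,(|\lambda_-| + \varepsilon)}{|\alpha_k|\,(\lambda_+ - \varepsilon) - |\beta_k|\,\varepsilon} \leq \frac{|\alpha_k|\,\varepsilon + |\beta_k|\,(|\lambda_-| + \varepsilon)}{|\alpha_k|\,(\lambda_+ - \varepsilon\,(1 + M))}$$
$$\leq \frac{|\alpha_k|\,\varepsilon + |\beta_k|\,(|\lambda_-| + \varepsilon)}{|\alpha_k|\,(\lambda_+ - \sqrt{\varepsilon})} = (*).$$
Note that the denominator in $(*)$ is positive since $\sqrt{\varepsilon} < 1 < \lambda_+$. In the $\left|\frac{\beta_k}{\alpha_k}\right| \geq \sqrt{\varepsilon}$ case, we get
$$(*) \leq \frac{|\beta_k|\,(\sqrt{\varepsilon} + |\lambda_-| + \varepsilon)}{|\alpha_k|\,(\lambda_+ - \sqrt{\varepsilon})} \leq \frac{|\lambda_-| + 2\sqrt{\varepsilon}}{\lambda_+ - \sqrt{\varepsilon}} \cdot \frac{|\beta_k|}{|\alpha_k|} = \eta_\varepsilon \frac{|\beta_k|}{|\alpha_k|}.$$
On the other hand, if $\left|\frac{\beta_k}{\alpha_k}\right| < \sqrt{\varepsilon}$, then
$$(*) \leq \frac{|\alpha_k|\,\varepsilon + |\alpha_k|\,\sqrt{\varepsilon}\,(|\lambda_-| + \varepsilon)}{|\alpha_k|\,(\lambda_+ - \sqrt{\varepsilon})} = \sqrt{\varepsilon}\,\frac{|\lambda_-| + \varepsilon + \sqrt{\varepsilon}}{\lambda_+ - \sqrt{\varepsilon}} \leq \eta_\varepsilon \sqrt{\varepsilon} < \sqrt{\varepsilon},$$
which completes the proof.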
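The behaviour described in Lemma 59 is easy to observe numerically. The sketch below is only an illustration under assumed data: it takes $\lambda_+ = 2$, $\lambda_- = 1$ and the hypothetical perturbation $D_n = D + \frac{1}{n}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, which converges to $D$, and tracks the ratio $\beta_k / \alpha_k$ while rescaling the vector at every step to avoid overflow (rescaling does not change the ratio).

```python
def ratio_after(steps, alpha=1.0, beta=1.0, lam_plus=2.0, lam_minus=1.0):
    """Return beta_k / alpha_k after multiplying by D_1, ..., D_steps on the right."""
    for n in range(1, steps + 1):
        eps = 1.0 / n
        # Row vector times D_n = [[lam_plus + eps, eps], [eps, lam_minus + eps]].
        alpha, beta = (alpha * (lam_plus + eps) + beta * eps,
                       alpha * eps + beta * (lam_minus + eps))
        scale = max(abs(alpha), abs(beta))
        alpha, beta = alpha / scale, beta / scale
    return beta / alpha

for steps in (10, 100, 1000, 5000):
    print(steps, ratio_after(steps))
```

In this example the ratio shrinks roughly like $1/k$, since the perturbation itself only decays like $1/n$; the lemma guarantees convergence to zero but not a rate.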


Returning to the recursion, we get the following.
Theorem 60. Suppose that we have a solution to the recurrence $v_{n+1} = a_n v_n + b_n v_{n-1}$, where $a_n \to a$, $b_n \to b$
and $\lambda_\pm$ are the roots of $x^2 = ax + b$ with $0 < |\lambda_-| < \lambda_+$. Assume further that there are some $R, r > 0$ and a
subsequence $\frac{|v_{n_i}|}{|v_{n_i - 1}|} \in [|\lambda_-| + r, R]$. Then for any $\varepsilon > 0$ we have $(\lambda_+ - \varepsilon)^k \leq v_k \leq (\lambda_+ + \varepsilon)^k$ for all $k$ large
enough.
 
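Proof. Set $M_n = \begin{pmatrix} 0 & b_n \\ 1 & a_n \end{pmatrix}$ and $M = \begin{pmatrix} 0 & b \\ 1 & a \end{pmatrix}$ as in the beginning of this section. With $P = \begin{pmatrix} 1 & \lambda_+ \\ 1 & \lambda_- \end{pmatrix}$ and
$P^{-1} = \frac{1}{\lambda_- - \lambda_+}\begin{pmatrix} \lambda_- & -\lambda_+ \\ -1 & 1 \end{pmatrix}$ we have that $D = P M P^{-1} = \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix}$. It follows that
$$(v_{k-1}\ \ v_k) := (v_0\ \ v_1)\left(\prod_{1}^{k-1} M_n\right) = (v_0\ \ v_1)\,P^{-1}\left(\prod_{1}^{k-1} D_n\right) P.$$
Writing
$$(\alpha_k, \beta_k) = (v_0\ \ v_1)\,P^{-1}\left(\prod_{1}^{k-1} D_n\right) = (v_{k-1}, v_k)\,P^{-1},$$
and using the assumption on the subsequence $\frac{|v_{n_i}|}{|v_{n_i - 1}|}$ we get the upper bound
$$\left|\frac{\beta_{n_i}}{\alpha_{n_i}}\right| = \left|\frac{-\lambda_+ v_{n_i - 1} + v_{n_i}}{\lambda_- v_{n_i - 1} - v_{n_i}}\right| \leq \frac{(R + \lambda_+)\,|v_{n_i - 1}|}{r\,|v_{n_i - 1}|} = \frac{R + \lambda_+}{r}.$$
Using Lemma 59, we conclude that $\frac{\beta_k}{\alpha_k} \to 0$. Going back via $(v_{k-1}, v_k) = (\alpha_k, \beta_k)\,P$, we get that
$$\frac{v_k}{v_{k-1}} = \frac{\lambda_+ \alpha_k + \lambda_- \beta_k}{\alpha_k + \beta_k} = \lambda_+ \cdot \frac{1 + \frac{\lambda_-}{\lambda_+}\frac{\beta_k}{\alpha_k}}{1 + \frac{\beta_k}{\alpha_k}} \to \lambda_+.$$
Hence, for any $\varepsilon > 0$, we get that $\left|\frac{v_k}{v_{k-1}} - \lambda_+\right| < \frac{\varepsilon}{2}$ for all $k$ large enough, implying that $(\lambda_+ - \varepsilon)^k \leq v_k \leq
(\lambda_+ + \varepsilon)^k$ for all $k$ large enough.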
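The statement of Theorem 60 can be illustrated on a toy recurrence. The coefficients below are an assumption chosen for this sketch and do not come from the paper: $a_n = 3 + \frac{1}{n}$ and $b_n = -2 + \frac{1}{n^2}$, so that the limit polynomial $x^2 = 3x - 2$ has roots $\lambda_- = 1$ and $\lambda_+ = 2$. The hypothetical helper `growth` rescales the pair $(v_{n-1}, v_n)$ at each step and keeps the accumulated logarithm of the scale, so that both $v_k / v_{k-1}$ and $|v_k|^{1/k}$ can be reported without overflow. It assumes the pair never becomes $(0, 0)$.

```python
import math

def growth(a_n, b_n, v0, v1, steps):
    """Return (v_k / v_{k-1}, |v_k|^(1/k)) for k = steps + 1."""
    prev, cur, log_scale = float(v0), float(v1), 0.0
    for n in range(1, steps + 1):
        # v_{n+1} = a_n v_n + b_n v_{n-1}
        prev, cur = cur, a_n(n) * cur + b_n(n) * prev
        scale = max(abs(prev), abs(cur))
        prev, cur, log_scale = prev / scale, cur / scale, log_scale + math.log(scale)
    k = steps + 1
    return cur / prev, math.exp((log_scale + math.log(abs(cur))) / k)

def a_n(n): return 3 + 1 / n
def b_n(n): return -2 + 1 / n ** 2

ratio, root = growth(a_n, b_n, 1, 1, 5000)
print(ratio, root)
```

Both printed numbers should be close to $\lambda_+ = 2$, in line with $\frac{v_k}{v_{k-1}} \to \lambda_+$ and the bounds $(\lambda_+ - \varepsilon)^k \leq v_k \leq (\lambda_+ + \varepsilon)^k$.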

References
[1] Roger Apéry. Irrationalité de ζ(2) et ζ(3). Astérisque, 61(11-13):1, 1979.
[2] Tom M. Apostol. Introduction to analytic number theory. Springer Science & Business Media, 1998.
[3] F. Beukers. A note on the irrationality of ζ(2) and ζ(3). Pi: A Source Book, 11:434, 2013. Publisher:
Springer Science & Business Media.

[4] Eric Brier, David Naccache, and Ofer Yifrach-Stav. A Note on the Ramanujan Machine. arXiv preprint
arXiv:2211.01058, 2022.
[5] Kenneth S. Brown. Cohomology of groups, volume 87. Springer Science & Business Media, 2012.
[6] William B. Jones and Wolfgang J. Thron. Continued fractions: Analytic theory and applications, vol-
ume 11. Addison-Wesley Publishing Company, 1980.

[7] James Mc Laughlin and Nancy J. Wyshinski. Real numbers with polynomial continued fraction expan-
sions. arXiv preprint math/0402462, 2004.
[8] Salvatore Pincherle. Delle funzioni ipergeometriche e di varie questioni ad esse attinenti. Giorn. Mat.
Battaglini, 32:209–291, 1894.

[9] Georg Pólya. Über ganzwertige ganze Funktionen. Rendiconti del Circolo Matematico di Palermo (1884-
1940), 40(1):1–16, 1915. Publisher: Springer.
[10] Gal Raayoni, Shahar Gottlieb, Yahel Manor, George Pisha, Yoav Harris, Uri Mendlovic, Doron Haviv,
Yaron Hadad, and Ido Kaminer. Generating conjectures on fundamental constants with the Ramanujan
Machine. Nature, 590(7844):67–73, 2021. Publisher: Nature Publishing Group UK London.

[11] Ofir Razon, Yoav Harris, Shahar Gottlieb, Dan Carmon, Ofir David, and Ido Kaminer. Automated
Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences. arXiv preprint
arXiv:2212.09470, 2022.
[12] Tanguy Rivoal. La fonction zêta de Riemann prend une infinité de valeurs irrationnelles aux entiers
impairs. Comptes Rendus de l’Académie des Sciences-Series I-Mathematics, 331(4):267–270, 2000. Pub-
lisher: Elsevier.
[13] Alf Van der Poorten. A proof that Euler missed. Math. Intelligencer, 1(4):195–203, 1979.
[14] Wadim Zudilin. One of the numbers ζ (5), ζ (7), ζ (9), ζ (11) is irrational. Uspekhi Mat. Nauk,
56(4):149–150, 2001.
