You are on page 1of 11

Solving the Pell Equation

H. W. Lenstra Jr.

Pells Equation found in Eulers Algebra [4, Abschn. 2, Cap. 7].


The Pell equation is the equation Modern textbooks usually give a formulation in
terms of continued fractions, which is also due to
x2 = dy 2 + 1, Euler (see for example [9, Chap. 7]). Euler, as well
to be solved in positive integers x , y for a given as his Indian and English predecessors, appears to
nonzero integer d . For example, for d = 5 one can take it for granted that the method always produces
take x = 9 , y = 4 . We shall always assume that d is a solution. That is true, but it is not obviousall
positive but not a square, since otherwise there are that is obvious is that if there is a solution, the
clearly no solutions. method will find one. Fermat was probably in pos-
The English mathematician John Pell (1610 session of a proof that there is a solution for every
d (see [13, Chap. II, XIII]), and the first to publish
1685) has nothing to do with the equation. Euler
such a proof was Lagrange (17361813) in 1768 (see
(17071783) mistakenly attributed to Pell a solu-
Figure 1).
tion method that had in fact been found by another
One may rewrite Pells equation as
English mathematician, William Brouncker  
(16201684), in response to a challenge by Fermat (x + y d ) (x y d ) = 1,
(16011665); but attempts to change the termi-
nology introduced by Euler have always proved so that finding a solution comes down to finding
futile. a nontrivial unit
of the ring Z[ d] of norm 1; here
Pells equation has an extraordinarily rich history, the norm Z[ d] Z = {1} between unit
to which Weils book [13] is the best guide; see also groups multiplies each unit by its conjugate, and
[3, Chap. XII]. Brounckers method is in substance the units 1 of Z[ d] are considered trivial. This
identical to a method that was known to Indian reformulation implies that once one knows a so-
mathematicians at least six centuries earlier. As lution to Pells equation, one can find infinitely
we shall see, the equation also occurred in Greek many. More precisely, if the solutions are ordered
mathematics, but no convincing evidence that the by magnitude, then the nth solution xn, yn can be
Greeks could solve the equation has ever emerged. expressed in terms of the first one, x1 , y1 , by
A particularly lucid exposition of the Indian  
or English method of solving the Pell equation is xn + yn d = (x1 + y1 d )n .

Accordingly, the first solution x1 , y1 is called the


H. W. Lenstra Jr. is professor of mathematics at the Uni-
fundamental solution to the Pell equation, and
versity of California, Berkeley, and at the Mathematisch
Instituut, Universiteit Leiden, The Netherlands. His e-mail solving the Pell equation means finding x1 , y1 for
addresses are: hwl@math.berkeley.edu and hwl@math. given d
. By abuse of language, we shall also refer
leidenuniv.nl. to x + y d instead of the pair x , y as a solution to

182 NOTICES OF THE AMS VOLUME 49, NUMBER 2


Figures 1 and 2.
Title pages of two
publications from
1773. The first (far
left) contains
Lagranges proof of
the solvability of
Pells equation,
already written and
submitted in 1768.
The second
contains Lessings
discovery of the
cattle problem of
Archimedes.

n xn yn
Pells equation and call x1 + y1 d the fundamen-
tal solution. 1 15 4
One may view the solvability of Pells equation 2 449 120
as a special case of Dirichlets unit theorem from 3 13455 3596
4 403201 107760
algebraic number theory, which describes the
5 12082575 3229204
structure of the group of units of a general ring of
6 362074049 96768360
algebraic integers; for the ring Z[ d] , it is the prod-
uct of {1} and an infinite cyclic group. The shape of the table reflects the exponential
As an example, consider d = 14 . One has growth of xn and yn with n.
For
general
d , the continued fraction expansion
 1 of [ d] + d is again purely periodic, and the period
14 = 3 + ,
1 displays a symmetry similar to the one visible for
1+ d = 14 . If the period length is even, one proceeds as
1
2+ above; if the period length is odd, one truncates at
1 the end of the second period.
1+ 
3 + 14
The cattle problem
so the continued fraction expansion of 3 + 14 is An interesting example of the Pell equation, both
purely periodic with period length 4 . Truncating the from a computational and from a historical per-
expansion at the end of the first period, one finds spective, is furnished by the cattle problem of
that the fraction Archimedes (287212 B.C.). A manuscript con-
taining this problem was discovered by Lessing
1 15 (17291781) in the Wolffenbttel library, and
3+ =
1 4 published by him in 1773 (see Figure 2). It is now
1+ generally credited to Archimedes (see [5, 13]). In
1
2+ twenty-two Greek elegiac distichs, the problem
1
asks for the number of white, black, dappled, and
1 brown bulls and cows belonging to the Sun god,
subject to several arithmetical restrictions. A ver-
is a fair approximation to 14 . The numerator
sion in English heroic couplets, taken from [1], is
and denominator of this fraction yield the funda-
shown in Figure 3. In modern mathematical nota-
mental solution x1 = 15 , y1 = 4 ; indeed one has tion the problem is no less elegant. Writing x , y ,
152 = 14 42 + 1 . Furthermore, one computes z , t for the numbers of white, black, dappled, and

(15 + 4 14)2 = 449 + 120 14 , so x2 = 449 , y2 = brown bulls, respectively, one reads in lines 816
120 ; and so on. One finds: the restrictions

FEBRUARY 2002 NOTICES OF THE AMS 183


thy care
ttle, friend, apply x = ( 12 + 13 )y + t,
The Sun god's ca wi sdom's share.
mber, ha st tho u
to count their nu floor y = ( 14 + 15 )z + t,
on the Thrinacian
They grazed of old
herded into four, z = ( 16 + 17 )x + t.
of Sic'ly's island, cream,
r: one herd white as
colour by colou on gleam, Next, for the numbers x
, y
, z
, t
of cows of the
glowing wi th eb
the next in coats d with spots the las
t.
same respective colors, the poet requires in lines
third, an d sta ine
brown-skinned the rpassed , 1726
lls in power unsu
Each herd saw bu -hu ed,
eb on
in ratios these: cou
nt ha lf the x
= ( 13 + 14 )(y + y
), z
= ( 15 + 16 )(t + t
),
the n all the bro wn include;
re,
add one third mo bulls' number tel
l. y
= ( 14 + 15 )(z + z
), t
= ( 16 + 17 )(x + x
).
t thou the white
thus, friend, cans well,
brown exceed as Whoever can solve the problem thus far is called
The ebon did the the stained.
and fifth pa rt of
now by a fourth merely competent by Archimedes; to win the prize
bu lls tha t rem ained
edall for supreme wisdom, one should also meet the
To know the spott unite
brown bulls, and conditions formulated in lines 3340 that x + y be
reckon again the the white.
an d sev en th of a square and that z + t be a triangular number.
these with a sixth air ed
the tale of silver-h The first part of the problem is just linear algebra,
Among the cows, bla ck compared,
bulls and cows of and there is indeed a solution in positive integers.
was, when with r.
ee plus one in fou The general solution to the first three equations is
exactly one in thr once more,
s cou nted one in four given by (x, y, z, t) = m (2226, 1602, 1580, 891) ,
The black cow bre ed
of the besp eck led where m is a positive integer. The next four equa-
plus now a fifth, t to feed.
l, they wa nd ere d ou tions turn out to be solvable if and only if m is divisible
when, bulls witha d sixth by 4657 ; with m = 4657 k one has (x
, y
, z
, t
)
s tallied a fifth an
The speckled cow ales mixed. = k (7206360 , 4893246 , 3515820 , 5439213) .
ired, males and fem
of all the brown-ha lf a third The true challenge is now to choose k such that
cows numbered ha
Lastly, the brown x + y = 4657 3828 k is a square and z + t
of the silver herd.
and one in seven head = 46572471 k is a triangular number. From the
fai lingly how many
Tell'st thou un lls well-fed prime factorization 4657 3828 = 22 3 11
bo th bu
d, o friend,
the Sun possesse wi ll 29 4657 one sees that the first condition is equiv-
colour no -on e
and cows of ev'ry an d skill, alent to k = al 2 , where a = 3 11 29 4657 and l
st numb ers ' ar t
deny that thou ha on g the wise. is an integer. Since z + t is a triangular number if
st tho u ra nk am
though not yet do e.
and only if 8(z + t) + 1 is a square, we are led to the
foll'wing recognis equation h2 = 8(z + t) +1= 8 4657 2471 al 2 +1 ,
But come! also the ck,
lls joined the bla
n d's white bu
go which is the Pell equation h2 = dl 2 + 1 for
Whene'er the Su pa ck
uld gather in a
their multitude wo arely throng
an d breadth, and squ d = 2 3 7 11 29 353 (2 4657)2
of equal length g.
ory broad and lon
Thrinacia's territ wi th the flecked,
= 410 286423 278424.
wn bulls mi ng led
But when the bro
they collect, Thus, by Lagranges theorem, the cattle problem
from one would
in rows growing 'er admits infinitely many solutions.
triangle, with ne
forming a perfect to spare. In 1867 the otherwise unknown German math-
ed bu ll, an d no ne
a diff'rent-colour thy mind, ematician C. F. Meyer set out to solve the equation
u analyse this in
Friend, canst tho find, by the continued fraction method [3, p. 344]. After
s all the meas es ur
and of these masse 240 steps in the continued fraction expansion for
! be assured all dee m
go forth in glory eme! d he had still not detected the period, and he gave
m in thi s discipline supr
thy wisdo up. He may have been a little impatient; it was
later discovered that the period length equals
203254. The first to solve the cattle problem in a
satisfactory way was A. Amthor in 1880 (see [6]).
Amthor did not directly apply the continued frac-
tion method; what he did do we shall discuss below.
Nor did he spell out the decimal digits of the
fundamental solution to the Pell equation or the
corresponding solution of the cattle problem. He
did show that, in the smallest solution to the
Figure 3. Problem that Archimedes conceived in verse and cattle problem, the total number of cattle is given
posed to the specialists at Alexandria in a letter to by a number of 206545 digits; of the four leading
Eratosthenes of Cyrene. digits 7766 that he gave, the fourth was wrong,
due to the use of insufficiently precise logarithms.

184 NOTICES OF THE AMS VOLUME 49, NUMBER 2


The full number occupies forty-seven pages of that Rd is very close to log(2x1 ) and to
This shows
computer printout, reproduced in reduced size on log(2y1 d ) . That is, if x1 and y1 are to be repre-
twelve pages of the Journal of Recreational Math- sented in binary or in decimal, then Rd is approx-
ematics [8]. In abbreviated form, it reads imately proportional to the length of the output of
any algorithm solving the Pell equation. Since the
77602714 . . . 237983357 . . . 55081800, time required for spelling out the output is a lower
each of the six dots representing 34420 omitted bound for the total running time, we may con-
digits. clude: there exists c1 such that any algorithm for
Several nineteenth century German scholars solving the Pell equation takes time at least c1 Rd .
were worried that so many bulls and cows might Here c1 denotes, just as do c2 , c3 , below, a pos-
not fit on the island of Sicily, contradicting lines 3 itive real number that does not depend on d .
and 4 of the poem; but, as Lessing remarked, the The continued fraction method almost meets
Sun god, to whom the cattle belonged, will have this lower bound. Let l be the period length
of the
coped with it. continued fraction expansion of [ d] + d if that
The story of the cattle problem demonstrates length is even and twice that length if it is odd. Then
that the continued fraction method is not the last one has
word on the Pell equation.
log 2 log(4d)
l < Rd < l
Efficiency 2 2
We are interested in the efficiency of solution meth- (see [7, eq. (11.4)]); so Rd and l are approximately
ods for the Pell equation. Thus, how much time does proportional. Using this, one estimates easily that
a given algorithm for solving the Pell equation the time taken by a straightforward implementa-
take? Here time is to be measured in a realistic way, tion of the continued fraction method is at most
which reflects, for example, that large positive in- Rd2 (1 + log d)c2 for suitable c2 ; and a more refined
tegers are more time-consuming to operate with implementation, which depends on the fast Fourier
than small ones; technically, one counts bit oper- transform, reduces this to Rd (1 + log d)c3 for suit-
ations. The input to the algorithm is d , and the run- able c3 . We conclude that the latter version of the
ning time estimates are accordingly expressed as continued fraction method is optimal, apart from
functions of d . If one supposes that d is specified a logarithmic factor.
in binary or in decimal, then the length of the input In view of these results it is natural to ask how
is approximately proportional to log d . An algo- the regulator grows as a function of d . It turns out
rithm is said to run in polynomial time if there is that it fluctuates wildly. One has
a positive real number c0 such that for all d the run-  
ning time is at most (1 + log d)c0 , in other words, if log(2 d ) < Rd < d (log(4d) + 2),
the time that it takes the algorithm to solve the Pell the lower bound because of the inequality
equation is not much greater than the time re- y1 < eRd /(2 d ) above and the upper bound by a
quired to write down the equation. theorem of Hua. The gap between the two bounds
How fast is the continued fraction method? Can is very large, but it cannot be helped: if d ranges
the Pell equation be solved in polynomial time? The over numbers of the form k2 1 , for which one has
central quantity that one needs to consider in order x1 = k and y1 = 1 , then Rd log(2 d ) tends to 0 ;
to answer such questions is the regulator Rd, which and one can show that there exist an infinite set D
a constant c4 such that all d D have
is defined by of d s and

Rd = log(x1 + y1 d ), Rd = c4 d . In fact, if d0, d1 are integers greater
than 1 and d0 is not a square, then there exists
where x1 + y1 d denotes, as before, the funda- a positive integer m = m(d0 , d1 ) such that D =
mental solution to Pells equation. The regulator {d0 d12n : n Z , n m} has this property for some
coincides with what in algebraic number theory c4 = c4 (d0 , d1 ) .
would be called the regulator of the kernel of It is believed that for most d the upper bound
the normmap Z[ d] Z . From x1 y1 d = is closer to the truth. More precisely, a folklore con-
1/(x1 +y1 d ) one deduces that 0 < x1 y1 d jecture asserts that there is a set D of nonsquare
< 1/(2 d ) , and combining this with x1 + y1 d positive integers that has density 1 in the sense that
= eRd, one finds that limx #{d D : d x}/x = 1 , and that satisfies

eRd eRd 1 log Rd


< x1 < + , lim = 1.
2 2 4 d dD log d
eRd 1 eRd This conjecture, however, is wide open. The same
< y1 < .
2 d 4d 2 d is true for the much weaker conjecture that

FEBRUARY 2002 NOTICES OF THE AMS 185



lim supd (log Rd )/ log d , with d ranging over the deduced from his theory that the least value for n
squarefree integers > 1 , is positive. divides p + 1 = 4658 ; had he been a little more care-
If the folklore conjecture is true, then for most ful, he would have found that it must divide
d the factor Rd entering the running time is about (p + 1)/2 = 2329 = 17 137 . In any case, trying a

d , which is an exponential function of the length few divisors, one discovers that the least value
log d of the input. for n is actually equal to 2329 . One has Rd =
.
Combining the results above, one concludes 2329 Rd
= 237794.586710 .
that the
continued fraction method takes time at The conclusion is that the fundamental solution
most d (1 + log d)c5 ; that conjecturally it is ex- to thePell equation for d itself is given by
ponentially slow for most values of d ; and that any x1 + y1 d = u2329 , with u as just defined. Amthor
method for solving the Pell equation that spells out failed to put everything together, but I did this for
x1 and y1 in full is exponentially slow for infinitely the convenience of the reader in Figure 4: for the
many d , and will therefore fail to run in polyno- first time in history, all infinitely many solutions to
mial time. the cattle problem are displayed in a handy little
If we want to improve upon the continued frac- table! It does, naturally, not contain the full decimal
tion method, then we need a way of representing expansion of any of the numbers asked for, but what
x1 and y1 that is more compact than the decimal it does contain should be considered more enlight-
or binary notation. ening. For example, it enables the reader not only
to verify easily that the total number of cattle in
Amthors solution the smallest solution has 206545 decimal digits
Amthors solution to the cattle problem depended and equals 77602714 . . . 55081800 , but also to dis-
on the observation that the number d = cover that the number of dappled bulls in the
410 286423 278424 can be written as 1494 195300th solution equals 111111 . . . 000000 ,
(2 4657)2 d
, where d
= 4 729494 is squarefree. a number of 308 619694 367813 digits. (Finding the
Hence, if x , y solves the Pell equation for d , then middle digits is probably much harder.) Archimedes
x , 2 4657 y solves the Pell equation for d
and had an interest in the representation of large
will therefore for some n be equal to the nth so- numbers, and there is little doubt that the solution
lution x
n, yn
(say) of that equation: in Figure 4 would have pleased and satisfied him.
 
x + 2 4657 y d
= (x
1 + y1
d
)n . Power products
This reduces the cattle problem to two easier Suppose one wishes to solve the Pell equation
problems: first, solving the Pell equation for d
; and x2 = dy 2 + 1 for a given value of d . From Amthors
second, finding the least value of n for which yn
is approach to the cattle problem we learn that for
divisible by 2 4657 . two reasons it may be wise to find the smallest
Since d
is much smaller than d , Amthor could divisor d
of d for which d/d
is a square: it
use the continued fraction algorithm for d
. In a saves time when performing the continued fraction
computation that could be summarized in three algorithm, and it saves both time and space when
he found the period length to be 92
pages (see [6]), expressing the final answer. There is no known
and x
1 + y1
d
to be given by algorithm for finding d
from d that is essentially
faster than factoring d . In addition, if we want
u = 109 931986 732829 734979 866232 821433 543901 088049 to determine which power of the fundamental

+ 50549 485234 315033 074477 819735 540408 986340 d
. solution for d
yields the fundamental solution
for d that is, the number n from the previous
In order to save space, one can write sectionwe  also need to know the prime factor-
  izationof
 d/d
, as well as the prime factorization

d
u = 300 426607 914281 713365 609 of p p for each prime p dividing d/d
. Thus,
 2 if one wants to solve the Pell equation, one may as
+ 84 129507 677858 393258 7766 . well start by factoring d . Known factoring algo-
rithms may not be very fast for large d , but for most
This is derived from the identity x + y d =
 values of d they are still expected to be orders of
  2
(x 1)/2 + (x + 1)/2 , which holds whenever magnitudes faster than any known method for
. solving the Pell equation.
x = dy 2 + 1 . The regulator is found to be Rd
=
2
Let it nowbe assumed that d is squarefree, and
102.101583 .
write x1 + y1 d for the fundamental solution of the
In order to determine the least feasible value for
Pell equation,
which is a unit of Z[ d] . Then
n, Amthor developed a little theory, which one
x1 + y1 d may still be aproper power in the field
would nowadays cast in the language of finite fields
Q ( d ) of fractions of Z[ d] . For example, the least
and rings. Using that p = 4657  is aprime number
d
d with y1 > 6 is d = 13 , for which one has x1 = 649,
for which the Legendre symbol p equals 1 , he y1 = 180, and

186 NOTICES OF THE AMS VOLUME 49, NUMBER 2


 6
 3+ 13 c > 0 . However, the number of bits of a , b , c will
649 + 180 13 = . typically grow linearly with the exponents |ni |
2
Also in the case d = 109, which Fermat posed as a themselves rather than with their logarithms. So one
challenge problem in 1657, the fundamental solu- avoids using the latter representation and works
tion is a sixth power: directly with the power products instead.
 Several fundamental issues are raised by the
158 070671 986249 + 15 140424 455100 109 representation of elements as power products. For
 
261 + 25 109 6 example, can we recognize whether two power
= .
2 products represent the same element of Q ( d ) by
means of a polynomial time algorithm? Here poly-
However, this is as far as it goes: it is an elemen-
nomial time means, as before, that the run time
tary exercise in algebraic number theory to show is bounded by a polynomial function of the length
that if n is a positiveinteger for which x1 + y1 d
has an nth root in Q ( d ) , then n = 1, 2 , 3 , or 6 , the of the input, which in this case equals the sum of
case n = 2 being possible only for d 1 , 2 , or the lengths of the two given power products. Sim-
5 mod 8 , and the cases n = 3 and 6 only for ilarly, can we decide in polynomial time whether a
d 5 mod 8 . Thus, for large squarefree d one can- given power product represents an element of the

not expect to save much space by writing x1 + y1 d form
a + b d with a , b Z , that is, an element of
as a power. This is also true when one allows the Z[ d] ? If it does, can we decide whether one has
root to lie in a composite of quadratic fields, as we a2 db2 = 1 and a , b > 0 , so that we have a solu-
did for the cattle problem. tion to Pells equation, and can we compute the
Let d again be an arbitrary positive integer that residue classes of a and b modulo a given positive
is not a square. Instead of powers, we consider integer m , all in polynomial time?
power products in Q ( d ) , that is, expressions of All questions just raised have affirmative an-
the form swers, even in the context of general algebraic
t 
(ai + bi d )ni number fields. Algorithms proving this were ex-
i=1 hibited recently by Guoqiang Ge. In particular, one
can efficiently decide whether a given power prod-
where t is a nonnegative integer, ai , bi , ni are
integers, ni = 0 , and for each i at least one of ai uct represents a solution to Pells equation, and if
and bi is nonzero. We define the length of such an it does, one can efficiently compute any desired
expression to be number of least significant decimal digits of that
t  solution; taking the logarithm of the power prod-
  
log |ni | + log(|ai | + |bi | d ) . uct, one can do the same for the leading digits, and
i=1 for the number of decimal digits, except possibly
This is roughly proportional to the amount of bits in the probably very rare cases that a or b is ex-
needed to specify the numbers ai , bi, and ni . Each cessively close to a power of 10. There is no known
power
product represents a nonzero element of polynomial time algorithm for deciding whether a
Q ( d ), and
that element can be expressed uniquely given power product represents the fundamental
as (a + b d )/c , with a , b , c Z, gcd(a, b, c) = 1 , solution to Pells equation.

Figure 4.

FEBRUARY 2002 NOTICES OF THE AMS 187



Infrastructure to Z[ d] to the usual class group of invertible
Suppose now that, given d , we are not ideals. By means of Gausss reduced binary qua-
asking for
the fundamental solution x1 + y1 d to dratic forms one can do explicit computations in
Pells
equation, but for a power product in Q ( d ) that Pic0 Z[ d] and in its circle subgroup. For a fuller
represents it. The following theorem summarizes explanation of these notions and their algorithmic
essentially all that is rigorously known about the use we refer to the literature [2, 7, 10, 11, 14].
smallest length of such a power product and about The equivalence stated in part (b) of the theo-
algorithms for finding one. rem has an interesting feature that is not com-
monly encountered in the context of equivalences.
Theorem. There are positive real numbers c6 and Namely, one may achieve an improvement by going
c7 with the following properties. back-and-forth. Thus, starting from a power prod-
(a) For each positive integer d that is not a square uct representing the fundamental solution, one
can first use it to compute R d and next use R d to
there exists a power product that represents the
fundamental solution to Pells equation and that find a second power product, possibly of smaller
has length at most c6 (log d)2 . length than the initial one. And conversely, start-
ing from any rough approximation to Rd , one can
(b) The problem of computing a power product compute a power product and use it to compute
representing the fundamental solution to Pells equa- Rd to any desired accuracy.
tion is polynomial time equivalent to the problem The algorithm referred to in part (c) is the
of computing an integer R d with |Rd R
d | < 1 .
fastest rigorously proven algorithm for computing
(c) There is an algorithm that given d computes a power product as desired. Its run time is roughly
a power product representing the fundamental the square root of the run time of the continued
solution to Pells equation in time at most fraction algorithm. It again makes use of the
1/2
Rd (1 + log d)c7 . infrastructure just discussed, combining it with
a search technique that is known as the baby
Part (a) of the theorem, which is taken from [2],
stepgiant step method. The power product com-
implies that the question we are asking does
ing out of the algorithm may not have a very small
admit a brief answer, so that there is no obvious
length, but one can easily do something about this
obstruction to the existence of a polynomial time
by using the algorithms of part (b). Our estimates
algorithm for finding such an answer.
for Rd show that the run time is at most
Part (b), which is not formulated too rigorously,
d 1/4 (1 + log d)c8 for some c8 ; here the exponent
asserts the existence of two polynomial time
1/4 can be improved to 1/5 if one is willing to
algorithms.
The first takes as input a power
assume certain generalized Riemann hypotheses
product i (ai + bi d )ni representing the funda-
(see [10]). Recent work of Buchmann and Vollmer
mental solution to the Pell equation and gives as
shows that part (c) is valid with c7 = 1 +  for all
output an integer approximation to the regulator.  > 0 and all d exceeding a bound depending on .
There
is no surprise here,
one just uses the formula Mathematically the infrastructure methods have
Rd = i ni log |ai + bi d| and applies a polynomial great interest. Algorithmically one conjectures that
time algorithm for approximating logarithms. The something faster is available. But as we shall see,
second algorithm takes as input the number d as the final victory may belong to the infrastructure.
well as an integer approximation R d to Rd , and
it computes a power product representing the Smooth numbers
fundamental solution to Pells equation. Since the The algorithms for solving Pells equation that we
algorithm runs in polynomial time, the length of saw so far have an exponential run time as a func-
the output is polynomially bounded, and this is in tion of log d . One prefers to have an algorithm
fact the way part (a) of the theorem is proved. whose run time is polynomial in log d . The method
The key notion underlying the second algorithm that we shall now discuss is believed to have a run
is that of infrastructure, a word coined by time that is halfway between exponential and
Shanks (see [11]) to describe a certain multiplica- polynomial. Like many subexponential algorithms
tive structure that he detected within the period of in number theory, it makes use of smooth numbers,
the continued fraction expansion of d . It was that is, nonzero integers that up to sign are built
subsequently shown (see [7]) that this period can up from small prime factors. Smooth numbers
be embedded in a circle group of circumfer- have been used with great success in the design of
ence Rd , the embedding preserving the cyclical algorithms for factoring integers and for comput-
structure. In the modern terminology of Arakelov ing discrete logarithms in multiplicative groups of
theory, one may describe that circle group as the rings. Here we shall see how they can be used for

kernel of the natural map Pic0 Z[ d] PicZ[ d] the solution of Pells equation as well.
from the group of metrized line bundles of Instead of giving a formal description, we illus-
degree 0 on the arithmetic curve corresponding trate the algorithm on the case d = 4 729494 =

188 NOTICES OF THE AMS VOLUME 49, NUMBER 2


2 3 7 11 29 353 derived from the cattle 43992 22 d = 52 13 31 43,
problem. The computation is less laborious and 65142 32 d = 2 53 13 41,
more entertaining than the expansion of d in a
continued fraction performed by Amthor. We shall 65242 32 d = 2 5 7 41,
explain the method on an intuitive level only; read- 65382 32 d = 2 7 13 23 43,
ers desirous to see its formal justification should
acquaint themselves with the basic theorems of 86992 42 d = 17 41.
algebraic number theory. The prime numbers occurring in these sixteen fac-
The smooth numbers that the algorithm operates torizations are the small prime factors 2 , 3 , 7 , 11,
with are not ordinary integers, but elements of the  of d , as well as the prime numbers p 43 with
29
ring Z[ d] , with d as just chosen. There is a natural d
p = 1 . It is only the latter primes that matter, and
way of extending the notion of smoothness
to such there are seven of them: 5 , 13, 17, 23, 31, 41, and
numbers. Namely, for = a + b d Q ( d ) , with 43. It is important that the number of smooth ex-
a, b Q , write
= a b d . Then 
yields an pressions a2 db2 exceeds the number of those
automorphism of the field Q() and the ring Z[] , primes, which is indeed the case: 16 > 7 . If one
and the norm map N: Q ( d ) Q defined by uses only the prime numbers up to 31 and the
N() =
= a2 db2 respects multiplication. It is
eight factorizations that do not contain 41 or 43,
now natural to expect that an element of Z[ d] is there is still a good margin: 8 > 5 . Thus, one
smooth if and only if
is smooth; so one may as decides to work with the smoothness bound 31.
well pass to their product N() , which is an ordinary The next step is to write down the prime ideal
integer, and define to be smooth if |N()| is built factorizations
of the eight numbers (a + b d )/
(a b d ) . Consider, for example, the case a =
up from prime numbers that lie below a certain 2 d contains a factor 13,
2162, b = 1 . Since 2162

bound. The size of this bound depends on the cir-
the element 2162 + d has a prime ideal factor of
cumstances; in the present computation we choose
norm 13, and from 2162 4 mod 13 one sees that
it empirically. this is the prime ideal p13 = (13, 4 + d ) ; it is the
The first step in the algorithm is to finda good kernel ofthe ring homomorphism Z[ d] Z/13Z
supply of smooth numbers a + b d in Z[ d] , or, sending d to 4 (mod 13). The conjugate prime

equivalently, pairs of integers a , b for which ideal q13 = (13, 4 d ) then occurs in 2162 d .
a2 db2 is smooth. One does this by trying b = 1 , Likewise, 2162 + d is divisible by the cube of
2 , 3 , in succession,
and trying integers a in the the prime ideal p 5 = (5,
2 + d ) and by p17 =
neighborhood of b d ; then |a2 db2 | is fairly (17, 3 + d ) , and 2162 d by q35 q17 , where q5 =

small, which increases its chance to be smooth. For 2 d ) and q17 = (17, 3 d ) .Finally, 2162
(5,
example,
. with b = 1 one finds for a near + d has the prime ideal factor (2, d ) , but since
b d = 2174.74 the following smooth values of 2 divides d , this prime ideal equals its own conju-

a2 d : gate, so it cancels when one divides 2162 + d by
its conjugate. Altogether one finds the prime ideal
factorization
21562 d = 2 7 11 17 31,    
21622 d = 2 53 13 17, (2162 + d )/(2162 d )

21752 d = 3 13 29, = (p5 /q5 )3 (p13 /q13 ) (p17 /q17 ).

21782 d = 2 3 5 11 43, As a second example, consider a = 4351 , b = 2 .


2 2 We have 43512 22 d = 52 232 , and from
2184 d = 2 3 7 31 , 4351/2 2 mod 5 one sees that 4351 + 2 d
21872 d = 3 52 23 31. belongs to q5 rather than p5 . Similarly,
4351/2 2 mod 23 implies that it belongs to
For b = 2 , 3 , 4 , one finds, restricting to values of a p23 = (23, 2 + d ) . Writing q23 = (23, 2 d ) , one
that are coprime to b : obtains
   
(4351 + 2 d )/(4351 2 d )
43292 22 d = 3 5 172 41, = (p5 /q5 )2 (p23 /q23 )2 .
2 2 3
4341 2 d = 3 5 17 ,
Doing this for all eight pairs a , b , one arrives at
43512 22 d = 52 232 , the table in Figure 5. The first row lists the prime
numbers p we are using. The first column lists the
43632 22 d = 132 17 41,
eight expressions = a + b d . In the th row
43892 22 d = 3 5 7 11 13 23, and the pth column, one finds the exponent of

FEBRUARY 2002 NOTICES OF THE AMS 189



pp /qp in the prime ideal factorization of /
; (2156 + d ) (2162 + d )
=
here pp , qp are as above, with p31 = (31, 14 + d ) (2156 d ) (2162 d )

and q31 = (31, 14 d ) . Thus, each gives rise to (2187 d ) (4389 2 d )

an exponent vector that belongs to Z5 . (2187 + d ) (4389 + 2 d )

The third step in the algorithm is finding linear 32 232 (2156 + d )2 (2162 + d )2
= 2 .
relations with integer coefficients between the eight 2 172 (2187 + d )2 (4389 + 2 d )2
exponent vectors. The set of such relations forms In the second representation,  is visibly a square,
a free abelian group of rank 3 , which is 8 minus or, equivalently, N() is a square; this is a bad
the rank of the 8 5 matrix formed by the eight sign, since it is certain to happen when  = 1, in
which case one has Q , N() = 2 , x = 1 , and
vectors. A set of three independent generators for
y = 0 . That is indeed what occurs here. (Likewise,
the relation group is given by the last three columns it would have been a bad sign if  were visibly d
of Figure 5; in general, one can find such a set by times a square; this is certain to happen if  = 1 .)
applying techniques of linear algebra over Z . In the present case, the numbers are small enough
In the final step of the algorithm one inspects that one can directly verify that  = 1. For larger
power products, one can decide whether  equals
the relations one by one. Consider for example the 1 by computing log || to a suitable precision
first relation. It expresses that the sum of the ex- and proving that the logarithm of a positive unit

ponent vectors corresponding to 2156 + d and of Z[ d] cannot be close to 0 without being equal
to 0 .
2162 + d equals the sum of the exponent vectors
Thus, the first relation disappointingly gives
for 2187 + d and 4389 + 2 d . In other words, if
rise to a trivial solution to the Pell equation. The
we put reader may check that the unit

(2156 + d ) (2162 + d )
= , 292 (4351 + 2 d )2 (4389 + 2 d )4
(2187 + d ) (4389 + 2 d )
54 72 112 234 (2175 + d )4
then the element  = /
has all exponents in its
obtained from the second relation is also equal
prime ideal factorization equal to 0 . This is the to 1 . The third relation yields the unit

same as saying that  is a unit x + y d of the ring

Z[ d] ; also, the norm 
= x2 dy 2 of this unit 24 514 (2175 + d )18 (2184 + d )10
=
equals N()/N(
) = 1 , so we obtain an integral 327 75 299 3120

solution to Pells equation x2 dy 2 = 1 , except (2187 + d )20 (4341 + 2 d )6
.
that it is uncertain whether x and y are positive. (2162 + d )18 (4351 + 2 d )10
We can write  = /
= 2 /N() , where the prime
Since this is not visibly a square, we can be certain
factorization of N() is available from the factor-
that it is not 1 . Since it is positive,
it is not 1
izations of a2 db2 that we started with; one finds either. So is of the form x + y d , where x , y Z
in this manner the following two power product satisfy x2 dy 2 = 1 and y = 0 ; thus, |x| , |y| solve
representations of : Pells equation. From the power product, one com-
putes the logarithm of the unit to be
about 102.101583 . This implies that
> 1 , so that is the largest of the
four numbers ,
=1/ , , and
;
in other words, x + y d is the largest of
the four numbers x y d , which is
equivalent to x and y being positive. In
general one can achieve this by first re-
placing by if is negative and next
by
if < 1 .
We conclude that the power product
defining does represent a solution to
Pells equation. The next question is
whether it is the fundamental solution.
In the present case we can easily con-
firm this, since from Amthors compu-
.
Figure 5. tation we know that Rd = 102.101583,

190 NOTICES OF THE AMS VOLUME 49, NUMBER 2


.
and the logarithm of any nonfundamental solution 3/ 8 = 1.06066 . One of the bottlenecks is the
would be at least 2 Rd . Therefore, is equal to time spent on solving a large sparse linear system
the solution u found by Amthor, and it is indeed over Z . If one is very optimistic about developing
fundamental. In particular, the numbers a better algorithm for doing this,
. . it may be possi-
log = 102.101583 and log u = 102.101583 are ble to achieve 1 instead of 3/ 8.
exactly equal, not just to a precision of six decimals. The smooth numbers method needs to be sup-
The power product representation we found plemented with an additional technique if one
for is a little more compact than the standard wishes to be reasonably confident that the unit it
representation we gave for u . Indeed, its length, as produces is the fundamental solution to Pells
defined earlier, is about 93.099810 , as compared equation. We forgo a discussion of this technique,
.
to Rd = 102.101583 for u . The power product since there is no satisfactory method for testing
whether it achieves its purpose. More precisely,
(2175 + d )18 (2184 + d )10 (2187 + d )20
there is currently no known way of verifying in
(2175 d )18 (2184 d )10 (2187 d )20 subexponential time that a solution to the Pell

(4341 + 2 d )6 (2162 d )18 (4351 2 d )10 equation that is given by means of a power prod-
,
(4341 2 d )6 (2162 + d )18 (4351 + 2 d )10 uct is the fundamental one. The most promising
technique for doing this employs the analytic class
which also represents u , has length about number formula, but its effectiveness depends on
125.337907 . the truth of the generalized Riemann hypothesis.
The latter hypothesis, abbreviated GRH, asserts
Performance that there does not exist an algebraic number field
The smooth numbers method for solving Pells whose associated zeta function has a complex zero
equation exemplified in the previous section can with real part greater than 12. The GRH can also be
be extended to any value of d . There is unfortu- used to corroborate the heuristic run time analy-
nately not much one can currently prove either sis, albeit in a probabilistic setting. This leads to
about the run time or about the correctness of the the following theorem.
method. Regarding the run time, however, one can
make a reasonable conjecture. Theorem. There is a probabilistic algorithm that for
For x > e , write some positive real number c10 has the following
  properties.
L(x) = exp (log x) log log x .
(a) Given any positive integer d that is not a
The conjecture is that, for some positive real square, the algorithm computes a positive integer
number c9 and all d > 2 , the smooth numbers R that differs by less than 1 from some positive in-
method runs in time at most L(d)c9 . This is, at a teger multiple m Rd of Rd .
doubly logarithmic level, the exact average of
xc9 = exp(c9 log x) and (log x)c9 = exp(c9 log log x) ; (b) If the GRH is true, then (a) is valid with m = 1 .
so conjecturally, the run time of the smooth
(c) If the GRH is true, then for each d > 2 the ex-
numbers method is in a sense halfway between
pected run time of the algorithm is at most L(d)c10.
exponential time and polynomial time.
The main ingredient of the heuristic reasoning The algorithm referred to in the theorem is prob-
leading to the conjecture is the following proven abilistic in the sense that it employs a random
theorem: for fixed positive real numbers c , c
, and number generator; every time the random number
x , the probability for a random positive generator is called, it draws, in unit time, a random

integer xc (drawn from a uniform distribution) bit from the uniform distribution, independently
to have all its prime factors L(x)c equals of previously drawn bits. The run time and the

1/L(x)c /(2c)+o(1) . This theorem explains the output of a probabilistic algorithm depend not
importance of the function L in the analysis of only on the input, but also on the random bits that
algorithms depending on smooth numbers. are drawn; so given the input, they may be viewed
Other ingredients of the heuristic run time as random variables. In the current case, the ex-
analysis are the belief that the expressions pectation of the run time for fixed d is considered
a2 db2 that one hopes to be smooth are so in part (c) of the theorem, and (a) and (b) describe
with the same probability as if they were what we know about the output. In particular, the
random numbers, and the belief that the units algorithm always terminates, and if GRH is true,
produced by the algorithm have a substantial then it is guaranteed to compute an integer ap-
probability of being different from 1 . These proximation to the regulator.
beliefs appear to be borne out in practice.
The theorem just stated represents the efforts
Probably one can take c9 = 3/ 8 +  in the of several people, an up-to-date list of references
conjecture just formulated, for any  > 0 and all being given by Ulrich Vollmer [12]. According to a
d exceeding a bound depending on  ; one has recent unpublished result of Ulrich Vollmer, one

FEBRUARY 2002 NOTICES OF THE AMS 191



may take c10 = 3/ 8 +  for any  > 0 and all d ex- Lecture Note Ser. 56, Cambridge University Press,
Cambridge, 1982, pp. 123150.
a bound depending on ; this improves the
ceeding
[8] H. L. NELSON, A solution to Archimedes cattle prob-
value 2 +  found in [12].
lem, J. Recreational Math. 13 (3) (198081), 162176.
The last word on algorithms for solving Pells
[9] I. NIVEN, H. S. ZUCKERMAN, H. L. MONTGOMERY, An Intro-
equation has not been spoken yet. Very recently, duction to the Theory of Numbers, 5th ed., John Wiley
Sean Hallgren exhibited a quantum algorithm that & Sons, New York, 1991.
computes, in polynomial time, a power product rep- [10] R. J. SCHOOF, Quadratic fields and factorization, H. W.
resenting the fundamental solution. His algorithm Lenstra Jr., R. Tijdeman (eds.), Computational Meth-
depends on infrastructure, but not on smooth num- ods in Number Theory, Part II, Math. Centre Tracts
bers. For practical purposes, the smooth numbers 155, Math. Centrum, Amsterdam, 1982, pp. 235286.
method will remain preferable until quantum com- [11] D. SHANKS, The infrastructure of a real quadratic field
and its applications, Proceedings of the Number The-
puters become available.
ory Conference (Univ. Colorado, Boulder, Colo., 1972),
Univ. Colorado, Boulder, pp. 217224.
Acknowledgements
[12] U. VOLLMER, Asymptotically fast discrete logarithms
This paper was written while I held the 20002001 in quadratic number fields, W. Bosma (ed.), Algo-
HP-MSRI Visiting Research Professorship. I thank rithmic Number Theory (ANTS-IV), Lecture Notes
Sean Hallgren, Mike Jacobson Jr., and Ulrich Vollmer in Comput. Sci. 1838, Springer, Berlin, 2000, pp.
for answering my questions, John Voight for 581594.
writing a first draft, Bart de Smit for numerical as- [13] A. WEIL, Number Theory, an Approach through His-
sistance, and the Universiteitsbibliotheek Leiden for tory, Birkhuser, Boston, 1984.
providing Figure 1. A special word of thanks is [14] H. C. WILLIAMS, Solving the Pell equation, Proc. Mil-
lennial Conference on Number Theory, A. K. Peters,
due to Hugh Williams, whose version [14] of the
to appear.
same story contains many details omitted in mine.

Bibliographic note About the Cover


The present paper will, with additional references, This months cover is what the graphics expert
also appear in Algorithmic number theory, the Edward Tufte would likely call a confection. It
proceedings of an introductory workshop held at tries very hard to portray all of the cattle and all
MSRI (Berkeley) in August 2000, to be published by of the digits involved in the minimal solution to
Cambridge University Press. Readers interested in Archimedes cattle problem described in Hen-
acquainting themselves with number theoretic drik Lenstras article. It also illustrates a well
algorithms are encouraged to consult those pro- known geometrical version of the algorithm for
ceedings. An extensive and up-to-date bibliography constructing a continued fraction expansion (in
on the Pell equation and on methods for solving it this case, of the ratio of dimensions of the image
can be found in Hugh Williamss paper [14]. itself).
The picture of the animals has been extracted
References from a famous painting by the seventeenth cen-
[1] ARCHIMEDES, The cattle problem, in English verse tury artist Paulus Potter entitled (in translation
by S. J. P. Hillion & H. W. Lenstra Jr., Mercator, from the original Dutch) Four cows in a
Santpoort, 1999.
meadow, even though the cows seem to be
[2] J. BUCHMANN, C. THIEL, H. C. WILLIAMS, Short represen-
steers. It is reproduced here by permission of the
tation of quadratic integers, Computational Algebra
and Number Theory (Sydney, 1992), Math. Appl., Rijksmuseum in Amsterdam, where the original
vol. 325, Kluwer, Dordrecht, pp. 159185. is located.
[3] L. E. DICKSON, History of the theory of numbers, vol. The 206,545 digits to be displayed down to
II, Diophantine analysis, Carnegie Institution of Wash- subatomic level were supplied by Lenstra and Bart
ington, Washington, 1920. de Smit.
[4] L. EULER, Vollstndige Anleitung zur Algebra, Zweyter Bill Casselman (covers@ams.org)
Theil, Kays. Acad. der Wissenschaften, St. Peters-
burg, 1770; Opera mathematica, ser. I, vol. 1, B. G.
Teubner, Leipzig, 1911. English translation: Elements
of Algebra, Springer, New York, 1984.
[5] P. M. FRASER, Ptolemaic Alexandria, Oxford University
Press, Oxford, 1972.
[6] B. KRUMBIEGEL, A. AMTHOR, Das Problema Bovinum des
Archimedes, Historisch-literarische Abteilung der
Zeitschrift fr Mathematik und Physik 25 (1880),
121136, 153171.
[7] H. W. LENSTRA JR., On the calculation of regulators and
class numbers of quadratic fields, J. Armitage (ed.),
Journes Arithmtiques 1980, London Math. Soc.

192 NOTICES OF THE AMS VOLUME 49, NUMBER 2

You might also like