
Progress in Mathematics

Volume 70

Series Editors
J. Oesterle
A. Weinstein
Analytic Number Theory
and Diophantine Problems
Proceedings of a Conference at
Oklahoma State University, 1984

Edited by
A.C. Adolphson
J.B. Conrey
A. Ghosh
R.I. Yager

1987 Birkhäuser
Boston · Basel · Stuttgart

A.C. Adolphson
J.B. Conrey
A. Ghosh
Department of Mathematics
Oklahoma State University
Stillwater, OK 74078
U.S.A.

R.I. Yager
Macquarie University
New South Wales 2113
Australia

Library of Congress Cataloging-in-Publication Data


Analytic number theory and diophantine problems.
(Progress in mathematics: v. 70)
Includes bibliographies.
1. Numbers, Theory of-Congresses. I. Adolphson, A.C.
II. Series: Progress in mathematics (Boston, Mass.) ;
vol. 70
QA241.A487 1987 512'.73 87-14635

CIP-Kurztitelaufnahme der Deutschen Bibliothek


Analytic number theory and diophantine problems:
proceedings of a conference at Oklahoma State Univ.,
1984 / ed. by A.C. Adolphson ... - Boston;
Basel; Stuttgart: Birkhäuser, 1987.
(Progress in mathematics: Vol. 70)

NE: Adolphson, A.C. [Hrsg.]; Oklahoma State University [Stillwater, Okla.]; GT

© Birkhäuser Boston, 1987

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,
without prior permission of the copyright owner.
Permission to photocopy for internal or personal use, or the internal or personal use of specific
clients, is granted by Birkhäuser Boston, Inc., for libraries and other users registered with the Copyright
Clearance Center (CCC), provided that the base fee of $0.00 per copy, plus $0.20 per page is paid
directly to CCC, 21 Congress Street, Salem, MA 01970, U.S.A. Special requests should be addressed
directly to Birkhäuser Boston, Inc., 675 Massachusetts Avenue, Cambridge, MA 02139, U.S.A.
3361-8/87 $0.00 + .20

ISBN-13: 978-1-4612-9173-2 e-ISBN-13: 978-1-4612-4816-3

DOI: 10.1007/978-1-4612-4816-3

Text prepared by the editors in camera-ready form.

9 8 7 6 5 4 3 2 1
PREFACE

A conference on Analytic Number Theory and Diophantine Problems
was held from June 24 to July 3, 1984 at the Oklahoma State
University in Stillwater. The conference was funded by the National
Science Foundation, the College of Arts and Sciences and the
Department of Mathematics at Oklahoma State University.

The papers in this volume represent only a portion of the many
talks given at the conference. The principal speakers were
Professors E. Bombieri, P. X. Gallagher, D. Goldfeld, S. Graham,
R. Greenberg, H. Halberstam, C. Hooley, H. Iwaniec, D. J. Lewis,
D. W. Masser, H. L. Montgomery, A. Selberg, and R. C. Vaughan. Of
these, Professors Bombieri, Goldfeld, Masser, and Vaughan gave three
lectures each, while Professor Hooley gave two. Special sessions
were also held and most participants gave talks of at least twenty
minutes each. Prof. P. Sarnak was unable to attend but a paper
based on his intended talk is included in this volume.

We take this opportunity to thank all participants for their
(enthusiastic) support for the conference. Judging from the
response, it was deemed a success.

As for this volume, I take responsibility for any typographical
errors that may occur in the final print. I also apologize for the
delay (which was due to the many problems incurred while retyping
all the papers).

A special thanks to Dollee Walker for retyping the papers and
to Prof. W. H. Jaco for his support, encouragement and hard work in
bringing the idea of the conference to fruition.

A. Ghosh
(on behalf of the Editors).
TABLE OF CONTENTS

K. ALLADI, P. ERDŐS and J. D. VAALER ..... 1
Multiplicative functions and small divisors.

E. BOMBIERI ..... 15
Lectures on the Thue Principle.

E. BOMBIERI and J. D. VAALER ..... 53
Polynomials with low height and prescribed vanishing.

W. W. L. CHEN ..... 75
On the irregularities of distribution and
approximate evaluation of certain functions II.

J. B. CONREY, A. GHOSH and S. M. GONEK ..... 87
Simple zeros of the zeta-function of a quadratic
number field II.

H. DIAMOND, H. HALBERSTAM and H.-E. RICHERT ..... 115
Differential difference equations associated with
sieves.

J. FRIEDLANDER ..... 125
Primes in arithmetic progressions and related
topics.

P. X. GALLAGHER ..... 135
Applications of Guinand's formula.

D. GOLDFELD (appendix by S. FRIEDBERG) ..... 159
Analytic number theory on GL(r,R).

D. A. GOLDSTON and H. L. MONTGOMERY ..... 183
Pair correlation and primes in short
intervals.

S. W. GRAHAM and G. KOLESNIK ..... 205
One and two dimensional exponential sums.

R. GREENBERG ..... 223
Non-vanishing of certain values of L-functions.

G. HARMAN ..... 237
On averages of exponential sums over primes.

D. HENSLEY ..... 247
The distribution of $\Omega(n)$ among numbers with
no large prime factors.

T. KANO ..... 283
On the size of $\sum_{n \le x} d(n)e(nx)$.

D. W. MASSER and G. WÜSTHOLZ ..... 291
Another note on Baker's Theorem.

M. B. NATHANSON ..... 305
Sums of polygonal numbers.

A. D. POLLINGTON ..... 317
On the density of $B_2$-bases.

P. SARNAK ..... 321
Statistical properties of eigenvalues of
the Hecke operators.

H.-B. SIEBURG ..... 333
Transcendence theory over non-local fields.
PARTICIPANTS

Adolphson, A. Kano, T.
Alladi, K. Kennedy, R. E.
Bateman, P. Kolesnik, G.
Beukers, F. Kueh, Ka-Lam.
Bombieri, E. Lewis, D. J.
Brownawell, D. Maier, H.
Chakravarty, S. Masser, D. W.
Chen, W. W. L. McCurley, K.
Cisneros, J. Montgomery, H. L.
Conrey, J. B. Mueller, J.
Cooper, C. Myerson, J.
Diamond, H. G. Nathanson, M.
Friedlander, J. Ng, E.
Gallagher, P. X. Pollington, A.
Ghosh, A. Schumer, P.
Goldfeld, D. Selberg, A.
Goldston, D. A. Shiokawa, I.
Gonek, S. M. Sieburg, H. B.
Graham, S. Skarda, V.
Greenberg, R. Spiro, C.
Gupta, R. Vaaler, J.
Halberstam, H. Vaughan, R. C.
Harman, G. Vaughn, J.
Hensley, D. Woods, D.
Hildebrand, A. Yildirim, C. Y.
Hooley, C. Youngerman, D.
Iwaniec, H. Yager, R.
Jaco, W.
MULTIPLICATIVE FUNCTIONS AND SMALL DIVISORS

K. Alladi¹, P. Erdős and J.D. Vaaler²

1. Introduction³

Let $S$ be a set of positive integers and $g$ be a multiplicative
function. Consider the problem of estimating the sum

$$S(x,g) = \sum_{\substack{n \le x \\ n \in S}} g(n). \tag{1.1}$$

A natural way to start is to write

$$g(n) = \sum_{d \mid n} h(d) \tag{1.2}$$

and reverse the order of summation. This in turn leads to the
estimation of the contribution arising from the large divisors $d$ of
$n$, where $n \in S$, which often presents difficulties. In this paper we
shall characterize in various ways the following idea:

"Large divisors of a square-free integer have
more prime divisors than the small ones." (1.3)

When the multiplicative function $h$ is small in size, (1.3) will be
useful in several situations to show that the principal contribution
is due to the small divisors. The terms 'large' and 'small' will be
made precise in the sequel.
An application to Probabilistic Number Theory is discussed in
Sec. 4; indeed, it was this application which motivated the present
paper (see [1], [3]). Our discussion in the next two sections is
quite general: in Sec. 2 the principal result is derived for sets
rather than for divisors only and in Sec. 3 the main inequality is
for submultiplicative functions. This is done in the hope that our
elementary methods may have other applications as well, perhaps even
outside of Number Theory.

¹ On leave of absence from MATSCIENCE, Institute of Mathematical
Sciences, Madras, India.
² The research of the third author was supported by a grant from the
National Science Foundation.
³ As this paper evolved we had several useful discussions with
Amit Ghosh, Roger Heath-Brown and Michael Vose.

2. A mapping for sets.

If $n$ is not a square, it is trivial to note that half its divisors
are less than $\sqrt{n}$. If $n$ is square-free there is also an interesting
one-to-one correspondence, namely: there is a bijective mapping $m$
between the divisors $d$ of $n$ which are less than $\sqrt{n}$ and the divisors
$d'$ of $n$ which are greater than $\sqrt{n}$ such that

$$m(d) = d' \equiv 0 \pmod{d} \tag{2.1}$$

(of course the mapping $m$ depends on $n$). In fact, this mapping is a
special case of a rather general one-to-one correspondence that can
be set up between subsets of a finite set, as we shall presently
see.

Let $S$ be a finite set and $\lambda$ a finite measure on the set of all
subsets of $S$. For each $t \ge 0$ define

$$A(t,S) = \{E \subseteq S : \lambda(E) \le t\}.$$

We then have

Theorem 1. For each $t \ge 0$ there is a permutation

$$\pi_{t,S} : A(t,S) \to A(t,S)$$

such that for all $E \in A(t,S)$ we have $\pi_{t,S}(E) \cap E = \emptyset$.

Remark. There are trivial cases here. If $\lambda(S) \le t$ then $A(t,S)$ is
the power set of $S$ and so the permutation $E \to S - E$ has the desired
property. If $t = 0$ then $A(0,S)$ is the power set of $S(0)$ where
$S(0) = \{s \in S : \lambda(s) = 0\}$. Here $E \to S(0) - E$ is an appropriate
permutation. So in the proof that follows we assume that
$0 < t < \lambda(S)$.

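Proof. If $S$ has cardinality $|S| = 1$ the result is trivial. We
proceed by induction on $|S|$.
Let $|S| = N \ge 2$ and assume the result is true for sets with
$N - 1$ elements. Pick $x$ in $S$ with $\lambda(\{x\}) \le t$. (If such an $x$ does
not exist the result is trivially true because $A(t,S) = \{\emptyset\}$.) Let
$T = S - \{x\}$ and note that $|T| = N - 1$. By our inductive hypothesis
there is for each $\tau \ge 0$, a permutation $\pi_{\tau,T}$ of $A(\tau,T)$ such that

$$\pi_{\tau,T}(E) \cap E = \emptyset \quad \text{for all } E \in A(\tau,T).$$

We partition $A(t,S)$ into three disjoint subsets as follows:

$$A_1(t,S) = \{E \in A(t,S) : x \in E\},$$

$$A_2(t,S) = \{E \in A(t,S) : x \notin E,\ t - \lambda(\{x\}) < \lambda(E) \le t\},$$

$$A_3(t,S) = \{E \in A(t,S) : x \notin E,\ \lambda(E) \le t - \lambda(\{x\})\}.$$

Next, define

$$\varphi : A_1(t,S) \cup A_2(t,S) \to A(t,T)$$

by $\varphi(E) = E - \{x\}$ and

$$\psi : A(t - \lambda(\{x\}),T) \to A_1(t,S)$$

by $\psi(E) = E \cup \{x\}$. Clearly both $\varphi$ and $\psi$ are bijective. Also

$$A_2(t,S) \cup A_3(t,S) = A(t,T)$$

and

$$A_3(t,S) = A(t - \lambda(\{x\}),T).$$

We define $\pi_{t,S}$ as follows:

$$\pi_{t,S}(E) = \begin{cases} \pi_{t,T}(\varphi(E)) & \text{if } E \in A_1(t,S) \cup A_2(t,S), \\ \psi(\pi_{t-\lambda(\{x\}),T}(E)) & \text{if } E \in A_3(t,S). \end{cases}$$

It is easy to check that $\pi_{t,S}$ has the desired properties and this
proves Theorem 1.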
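
The induction in the proof can be traced on a small example. The sketch below is only an illustration under stated assumptions: the function names `family` and `disjoint_permutation` are hypothetical, the measure is given by non-negative integer weights so that all comparisons are exact, and the recursion follows the two maps described in the proof (remove $x$ and apply the permutation for $T$ on sets in $A_1 \cup A_2$; apply the permutation for the smaller threshold and then add $x$ on sets in $A_3$).

```python
from itertools import combinations


def family(t, S, w):
    """A(t, S): all subsets E of S whose total weight w(E) is at most t."""
    items = sorted(S)
    return {frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)
            if sum(w[s] for s in c) <= t}


def disjoint_permutation(t, S, w):
    """Build a dict pi on A(t, S) with pi[E] disjoint from E, by induction on |S|.

    Assumes t >= 0 and exact non-negative weights (int or Fraction).
    """
    S = frozenset(S)
    light = sorted(x for x in S if w[x] <= t)
    if not light:
        # No singleton fits under t, so A(t, S) contains only the empty set.
        return {frozenset(): frozenset()}
    x = light[0]
    T = S - {x}
    pi_T = disjoint_permutation(t, T, w)            # permutation of A(t, T)
    pi_low = disjoint_permutation(t - w[x], T, w)   # permutation of A(t - w(x), T)
    pi = {}
    for F, image in pi_T.items():
        # F plays the role of phi(E): E is F with x added when that still
        # fits under t (E in A1), and E = F otherwise (E in A2).
        E = F | {x} if F in pi_low else F
        pi[E] = image
    for E, image in pi_low.items():
        # E in A3: apply the permutation for the smaller threshold, then add x.
        pi[E] = image | {x}
    return pi


S = {1, 2, 3, 4, 5}
weights = {s: s for s in S}   # illustrative measure: the weight of s is s
pi = disjoint_permutation(6, S, weights)
print(len(pi), all(E.isdisjoint(image) for E, image in pi.items()))
```

Comparing the keys of `pi` with the brute-force `family(6, S, weights)` gives an independent check that the recursion enumerates exactly $A(t,S)$.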

Corollary. Let $S$, $\lambda$ be as above. Define

$$B(t,S) = \{E \subseteq S : \lambda(S) - t \le \lambda(E)\};$$

then there is a bijection $\sigma_{t,S} : A(t,S) \to B(t,S)$ such that
$E \subseteq \sigma_{t,S}(E)$ for all $E \in A(t,S)$.

Proof. Define $\sigma_{t,S}(E) = S - \pi_{t,S}(E)$ and use Theorem 1.

Let $n = p_1 \cdots p_r$ be square-free and $S = \{p_1, p_2, \ldots, p_r\}$ with
$\lambda(p_i) = \log p_i$, $i = 1, 2, \ldots, r$. We apply the Corollary with this
choice of $S$ and $\lambda$ (and with $t$ replaced by $\log t$) to obtain the
following result, which, in view of its number theoretic form, is
given the status of a theorem.

Theorem 2. Let $n$ be square-free and $t > 1$. Then there is a
one-to-one mapping $m_t$ between the divisors $d$ of $n$ which are less than or
equal to $t$ and those divisors $d'$ of $n$ which are greater than or
equal to $n/t$, such that

$$m_t(d) = d' \equiv 0 \pmod{d}.$$

Remarks.

1.) In Theorem 2 the parameter $t$ could be greater than $\sqrt{n}$, but
only $t \le \sqrt{n}$ is of interest here. If $t > \sqrt{n}$ then $\tau = n/t
< \sqrt{n}$. In this case $m_\tau$ produces a correspondence between
$d \le \tau$ and $d' \ge t$. The divisors between $\tau$ and $t$ can be
mapped onto themselves and $m_t$ for $t > \sqrt{n}$ can be easily
constructed from $m_\tau$, where $\tau < \sqrt{n}$.

2.) The case $t = \sqrt{n}$ is of special interest because it shows
that for a multiplicative function $h$ satisfying $0 \le h \le 1$
we have

$$\sum_{d \mid n} h(d) \le 2 \sum_{\substack{d \mid n \\ d \le \sqrt{n}}} h(d), \quad \text{for all square-free } n. \tag{2.2}$$

Note that (2.2) is an immediate consequence of (2.1) (which
is Theorem 2 with $t = \sqrt{n}$) because $h(d') \le h(d)$.
Inequality (2.2) can be proved directly without use of
(2.1) as was pointed out by Heath-Brown. For this direct
proof and applications see [3], [1].
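
Inequality (2.2) is easy to test by brute force on small square-free $n$. The following is a minimal sketch, not the authors' method: the helper name `sums_for_2_2` is hypothetical, $h$ is taken to be a multiplicative function determined by randomly chosen values $h(p) \in [0,1]$, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import prod
import random


def sums_for_2_2(primes, h_at_prime):
    """Return (sum of h(d) over d | n, sum of h(d) over d | n with d <= sqrt(n)).

    n is the product of the distinct primes given, and h is the multiplicative
    function with the prescribed values at those primes.
    """
    n = prod(primes)
    total = small = Fraction(0)
    for size in range(len(primes) + 1):
        for chosen in combinations(primes, size):
            d = prod(chosen)
            h_d = prod((h_at_prime[p] for p in chosen), start=Fraction(1))
            total += h_d
            if d * d <= n:          # d <= sqrt(n), tested without square roots
                small += h_d
    return total, small


random.seed(0)
primes = [2, 3, 5, 7, 11, 13]
h_values = {p: Fraction(random.randint(0, 10), 10) for p in primes}
total, small = sums_for_2_2(primes, h_values)
print(total <= 2 * small)
```

Taking $h(p) = 1$ for every prime gives equality in (2.2), which matches the statement in Remark 4 that the constant 2 cannot be lowered.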

3.) In a private correspondence to one of us (K.A.) R.R. Hall
reported that Woodall had arrived at the mapping (2.1) a
few years ago. Nevertheless, applications of such
mappings or inequalities to Probabilistic Number Theory in
[1], [3], appear to be new.

4.) When $h \ge 1$, clearly (2.2) is false. In fact, in this case
(2.2) does not even hold if 2 is replaced by an arbitrarily
large constant. Note that the constant 2 is best possible
in (2.2) by taking $h = 1$.

3. A useful inequality.
In view of (2.2) we may ask as to what sort of conditions one
should impose upon $h$ so that for all square-free $n$,

$$\sum_{d \mid n} h(d) \ll_k \sum_{\substack{d \mid n \\ d \le n^{1/k}}} h(d), \tag{3.1}$$

where $k \ge 2$. Because of (1.3) we may expect (3.1) to hold provided
$h(p)$ is quite small.

To get an idea concerning the size of such $h$ we consider the
special multiplicative function with $h(p) = c > 0$. Let $r$ be a large
integer and $p_1, p_2, \ldots, p_r$ primes such that $p_1 \sim p_2 \sim p_3 \sim \cdots \sim p_r$.
Let $n = p_1 p_2 \cdots p_r$. In this situation a divisor $d$ of $n$ satisfies
$d \le n^{1/k}$ provided $d$ has (asymptotically) $\le r/k$ prime factors. Thus,

$$\Big\{\sum_{d \mid n} h(d)\Big\}\Big\{\sum_{\substack{d \mid n \\ d \le n^{1/k}}} h(d)\Big\}^{-1} \sim (1+c)^r \Big\{\sum_{0 \le \ell \le r/k} \binom{r}{\ell} c^{\ell}\Big\}^{-1}. \tag{3.2}$$

The maximum value of $\binom{r}{\ell} c^{\ell}$ occurs when $\ell \sim rc/(c+1)$, as $r \to \infty$. So
the left hand side of (3.2) is unbounded if $c/(c+1) > 1/k$, i.e.,
if $c > 1/(k-1)$. On the other hand if $c < 1/(k-1)$ then the
expressions in (3.2) are $\sim 1$ as $r \to \infty$. This example led one of us
(K.A.) to make the following conjecture, part (i) of which appeared
as problem 407 in the West Coast Number Theory Conference, Asilomar
(1983):

Conjecture.
(i) For each $k \ge 2$, there exists a constant $c_k$ such that (3.1)
holds for all multiplicative functions $h$ satisfying
$0 \le h(p) \le c_k$, for all $p$.

(ii) In part (i) $c_k = 1/(k-1)$ is admissible.
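
The threshold $c = 1/(k-1)$ in the example leading to the Conjecture can be seen numerically. The sketch below evaluates only the model quantity on the right of (3.2), namely $(1+c)^r$ divided by $\sum_{0 \le \ell \le r/k} \binom{r}{\ell} c^{\ell}$; the function name `ratio_3_2` and the sample parameters are illustrative choices, and floating-point arithmetic is assumed to be accurate enough for these sizes.

```python
from math import comb


def ratio_3_2(r, c, k):
    """(1 + c)^r divided by the sum of C(r, l) * c^l over 0 <= l <= r/k."""
    full = (1.0 + c) ** r
    small = sum(comb(r, l) * c ** l for l in range(r // k + 1))
    return full / small


k = 3   # the threshold 1/(k-1) is 0.5
for r in (60, 120, 240):
    # below the threshold the ratio stays near 1; above it the ratio grows with r
    print(r, ratio_3_2(r, 0.2, k), ratio_3_2(r, 1.0, k))
```

The dichotomy reflects where the terms $\binom{r}{\ell} c^{\ell}$ peak, near $\ell = rc/(c+1)$: the truncated sum captures almost everything when that peak lies below $r/k$ and almost nothing when it lies above.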

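To this end we now prove an inequality for certain submultiplicative
functions $h$, namely, those $h$ for which $h(mn) \le h(m)h(n)$, if
$(m,n) = 1$.

Theorem 3. Let $h \ge 0$ be submultiplicative and satisfy $0 \le h(p) \le c
< 1/(k-1)$ for all primes $p$. Then for all square-free $n$ we have

$$\sum_{d \mid n} h(d) \le \Big\{1 - \frac{kc}{1+c}\Big\}^{-1} \sum_{\substack{d \mid n \\ d \le n^{1/k}}} h(d).$$

Proof. We begin with the familiar decomposition

$$\sum_{d \mid n} h(d) = \sum_{d \mid n/p} h(d) + \sum_{d \mid n/p} h(pd),$$

where $p$ is any prime divisor of $n$. Submultiplicativity yields

$$h(p) \sum_{d \mid n} h(d) = h(p) \sum_{d \mid n/p} h(d) + h(p) \sum_{d \mid n/p} h(pd) \tag{3.3}$$

$$\ge \sum_{d \mid n/p} h(pd) + h(p) \sum_{d \mid n/p} h(pd)$$

$$= \{1 + h(p)\} \sum_{d \mid n/p} h(pd).$$

Next, observe that

$$\sum_{\substack{d \mid n \\ d > n^{1/k}}} h(d) \le \frac{k}{\log n} \sum_{d \mid n} h(d) \log d. \tag{3.4}$$

In addition

$$\sum_{d \mid n} h(d) \log d = \sum_{d \mid n} h(d) \sum_{p \mid d} \log p = \sum_{p \mid n} \log p \sum_{d \mid n/p} h(pd)$$

$$\le \Big\{\sum_{d \mid n} h(d)\Big\}\Big\{\sum_{p \mid n} \frac{(\log p)\, h(p)}{1 + h(p)}\Big\} \tag{3.5}$$

because of (3.3). Since $0 \le h(p) \le c$, we have $h(p)/(1 + h(p))
\le c/(1 + c)$. By combining (3.4) and (3.5) we obtain Theorem 3.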
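
As a sanity check, the inequality of Theorem 3 can be tested exhaustively on small square-free $n$. This is a sketch under assumptions: the names `divisor_sums` and `theorem3_holds` are hypothetical, $h$ is taken to be multiplicative (hence submultiplicative) with prescribed values at the primes, $c$ is taken to be the largest of those values, and the condition $d \le n^{1/k}$ is tested as $d^k \le n$ in exact integer arithmetic.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def divisor_sums(primes, h_at_prime, k):
    """Sums of h(d) over all d | n and over d | n with d^k <= n, n = prod(primes)."""
    n = prod(primes)
    total = small = Fraction(0)
    for size in range(len(primes) + 1):
        for chosen in combinations(primes, size):
            d = prod(chosen)
            h_d = prod((h_at_prime[p] for p in chosen), start=Fraction(1))
            total += h_d
            if d ** k <= n:
                small += h_d
    return total, small


def theorem3_holds(primes, h_at_prime, k):
    """Check total * (1 - k*c/(1+c)) <= small, with c = max h(p) < 1/(k-1)."""
    c = max(h_at_prime[p] for p in primes)
    if not c * (k - 1) < 1:
        raise ValueError("Theorem 3 requires c < 1/(k-1)")
    total, small = divisor_sums(primes, h_at_prime, k)
    return total * (1 - Fraction(k) * c / (1 + c)) <= small


primes = [2, 3, 5, 7, 11, 13]
h_values = {p: Fraction(2, 5) for p in primes}   # c = 2/5 < 1/2, so k = 3 is allowed
print(theorem3_holds(primes, h_values, 3))
```

The inequality is rearranged to avoid division, so the check stays exact even when $1 - kc/(1+c)$ is close to zero.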

Remarks.

1.) Theorem 3 proves Conjecture (i) for any $c_k < 1/(k-1)$. The
case $c_k = 1/(k-1)$ (part (ii)) is still open when $k > 2$
(for $k = 2$ this is (2.2)). The analysis underlying (3.2)
shows that $c_k > 1/(k-1)$ is not possible.

2.) It would be of interest to see if the constant
$\{1 - \frac{kc}{1+c}\}^{-1}$ can be improved. An attempt to deal with the
case $c_k = 1/(k-1)$ may throw some light on this question.

3.) R. Balasubramaniam and S. Srinivasan (personal communication
to one of us, K.A.) have obtained slightly weaker
versions of Theorem 3 in response to our conference query
in the course of proving Conjecture (i) for
$c_k < 1/(k-1)$.

4.) If $h$ is submultiplicative, then so is $h_T(n)$ which is equal
to $h(n)$ when $n \le T$ and is zero for $n > T$. The proof of
Theorem 3 shows

$$\sum_{\substack{d \mid n \\ d \le T}} h(d) \le \Big\{\sum_{\substack{d \mid n \\ d \le t}} h(d)\Big\}\Big\{1 - \frac{1}{\log t} \sum_{\substack{p \mid n \\ p \le T}} \frac{h(p) \log p}{1 + h(p)}\Big\}^{-1}$$

holds uniformly for all square-free $n$, $0 \le t \le T$ and
submultiplicative $h$ satisfying $h \ge 0$ and
$0 \le h(p) < (\log t)/(\log n/t)$.

5.) Let $h$ be super-multiplicative, that is, $h(mn) \ge h(m)h(n)$,
for $(m,n) = 1$. Suppose $h(p) \ge c > 1/(k-1)$ for all primes
$p$. Then the proof of Theorem 3 can be modified to yield
the dual inequality

$$\sum_{d \mid n} h(d) \ge \frac{(1 + c)(k - 1)}{k} \sum_{\substack{d \mid n \\ d \le n^{1/k}}} h(d)$$

for all square-free $n$. Here also the situation regarding
$c = 1/(k-1)$ is open.

4. An application.
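Let $S$ be an infinite set of positive integers. Define

$$S_d(x) = \sum_{\substack{s \le x,\ s \in S \\ s \equiv 0 \pmod{d}}} 1,$$

and set $X = S_1(x)$. In addition, let

$$S_d(x) = \frac{X \omega(d)}{d} + R_d(x),$$

where $\omega$ is multiplicative. First we assume that $R_d$ satisfies the
following condition:

(C-1) There exists $\delta > 0$ such that uniformly in $x$

$$|R_d(x)| \ll \frac{X \omega(d)}{d} \quad \Big(\text{equivalently } S_d(x) \ll \frac{X \omega(d)}{d}\Big) \quad \text{for } 1 \le d \le X^{\delta}.$$

We also require $R_d$ to satisfy at least one of the following two
conditions: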

(C-3) There exists $\beta > 0$ such that to each $U > 0$ there is $V > 0$
satisfying

Furthermore, we also require that there exists $c > 0$
such that

$c^{\nu(d)}$, $1 \le d \le x$,

where $\nu(n)$

Examples of sets S satisfying these conditions include

(E-1) $S = \{Q(n) \mid n = 1, 2, 3, \ldots\}$, where $Q(x)$ is a polynomial
with positive integer coefficients. Here $\omega(d) = \rho(d)$, the
number of solutions of $Q(x) \equiv 0 \pmod{d}$ and $|R_d| \le \rho(d)$,
so (C-2) holds. We may take $\delta = 1/(\deg Q)$ in (C-1).

(E-2) $S = \{p + a \mid p = \text{prime}\}$, where $a$ is a fixed positive
integer. Here $\omega(d) = d/\varphi(d)$ where $\varphi$ is Euler's
function. By the Brun-Titchmarsh inequality (see
Halberstam-Richert [6], p. 107) we can take any $\delta \in (0,1)$
in (C-1). By Bombieri's theorem (see [6], p. 111), we see
that (C-3) holds with $\beta = 1/2$.

Let $f$ be a (complex valued) strongly additive function, namely,
one that satisfies

$$f(n) = \sum_{p \mid n} f(p).$$

The quantities

$$A(x) = \sum_{p \le x} \frac{f(p)\omega(p)}{p} \quad \text{and} \quad B(x) = \sum_{p \le x} \frac{|f^2(p)|\,\omega(p)}{p}$$

act like the 'mean' and 'variance' of $f(n)$, for $n \in S$, $n \le x$. Our
problem is to obtain a bound for

$$\sum_{\substack{n \le x \\ n \in S}} |f(n) - A(x)|^{\ell}, \quad \ell = 1, 2, 3, \ldots$$

in terms of $B(x)$. In the special case where $S$ is the set of all
positive integers, Elliott [4] has solved this problem elegantly.
Recently one of us (K.A.) has improved Elliott's method in order to
make it applicable to subsets. In [2] sets $S$ with $\delta = 1$ in (C-1)
are treated whereas in [3] the situation concerning $S$ in (E-2) is
investigated. It is this improved method which we shall employ
here; we sketch only the main ideas since details may be found in
[2], [3].

We start by introducing a simplification: We may assume that
$f \ge 0$. This is because the inequality

(4.1)

is valid for all complex numbers $a$ and $b$. So a complex function
could be decomposed into its real and imaginary parts. If $f$ is real
valued we can write $f = f^+ - f^-$, where $f^+$, $f^-$ are non-negative
strongly additive functions generated by

$$f^+(p) = \max(0, f(p)), \quad f^-(p) = -\min(0, f(p)).$$

For convenience we introduce the distribution function

$$F_x(v) = \frac{1}{X} \sum_{\substack{n \le x,\ n \in S \\ f(n) - A(x) < v\sqrt{B(x)}}} 1.$$

We note that for even $\ell$

$$\sum_{\substack{n \le x \\ n \in S}} (f(n) - A(x))^{\ell} = X\, B(x)^{\ell/2} \int_{-\infty}^{\infty} v^{\ell}\, dF_x(v). \tag{4.2}$$

Our aim is to show that the moments of $F_x$ are bounded (uniformly
in $x$).
To accomplish this we consider the bilateral Laplace transform

$$T_u(x) = \int_{-\infty}^{\infty} e^{uv}\, dF_x(v).$$

If there is $R > 0$ for which $T_u(x) \ll 1$ when $|u| \le R$, then it follows
that the integral in (4.2) is bounded. Note that

$$T_u(x) = \frac{1}{X} \sum_{\substack{n \le x \\ n \in S}} e^{u(f(n) - A(x))/\sqrt{B(x)}} = \frac{e^{-uA(x)/\sqrt{B(x)}}}{X} \sum_{\substack{n \le x \\ n \in S}} g(n),$$

where

$$g(n) = e^{uf(n)/\sqrt{B(x)}}. \tag{4.3}$$

Of course $g$ is strongly multiplicative (that is $g(n) = \prod_{p \mid n} g(p)$),
because $f$ is strongly additive. Our goal therefore is to bound
$S(x,g)$ (see (1.1)) suitably. We have two cases.

Case 1: $u \le 0 \Rightarrow 0 \le g \le 1$.

In a recent paper [2] it was shown by using a sieve method,
that in Case 1, for the sets $S$ satisfying either (C-2) or (C-3), we
have

$$S(x,g) \ll X \prod_{p \le x} \Big(1 + \frac{(g(p) - 1)\omega(p)}{p}\Big). \tag{4.4}$$

Case 2: $u > 0 \Rightarrow g \ge 1$.

Here we let $\delta = 1/k$ (in (C-1)) and assume that $f$ satisfies

$$\Big\{\max_{p \le x} f(p)\Big\} \Big/ \sqrt{B(x)} \ll 1. \tag{4.5}$$

Then we can choose $R > 0$ (sufficiently small) such that

$$1 \le g(p) \le 1 + \frac{1}{2(k-1)}.$$

With $h$ as in (1.2) we note that $0 \le h(p) = g(p) - 1 \le \frac{1}{2(k-1)}$.
Also $h(p^e) = 0$ for all $p$, $e \ge 2$, because $g$ is strongly multiplicative.
So by Theorem 3

$$S(x,g) = \sum_{\substack{n \le x \\ n \in S}} \sum_{d \mid n} h(d) \ll \sum_{\substack{n \le x \\ n \in S}} \sum_{\substack{d \mid n \\ d \le n^{\delta}}} h(d) \le \sum_{d \le x^{\delta}} h(d) S_d(x).$$

By (C-1) we obtain

$$S(x,g) \ll X \sum_{d \le x^{\delta}} \frac{h(d)\omega(d)}{d} \le X \prod_{p \le x} \Big(1 + \frac{h(p)\omega(p)}{p}\Big). \tag{4.6}$$

Inequalities (4.4) and (4.6) combine with (4.3) and (4.5) to yield
$T_u(x) \ll 1$ for $|u| \le R$. For details relating to such calculations
see [2], Sec. 7. Therefore by means of this method we obtain the
following extension of a result of Elliott [4],

Theorem 4. Let $f$ be as above and $|f|$ satisfy (4.5). Then

Remarks.

1.) Although our discussion was for even $\ell$, Theorem 4 is
stated for all $\ell > 0$. This is because one can pass from
even $\ell$ to all positive real numbers by a suitable
application of the Hölder-Minkowski inequality.

2.) If $f$ satisfies certain additional conditions then one can
use the above method more carefully to obtain asymptotic
estimates for the moments. In these cases the weak limit of
$F_x(v)$ would exist. Such asymptotic estimates are obtained
in [2] for $S$ with $\delta = 1$, and in [3] for $S$ in (E-2). For
these sets the full strength of Theorem 3 is not required.
The inequality (2.2) (which follows from Theorem 2)
suffices.

3.) There are certain open problems concerning the behavior of
additive functions in polynomial sequences (see Elliott [5],
Vol. 2, p. 335). Part of the difficulty in such questions
is because we do not fully understand the moments of
additive functions in these sequences. Theorem 4 is derived
in the hope that it might shed some light on these
questions.

4.) We restrict our attention to strongly additive functions
for the sake of simplicity. From here the transition to
general additive functions is not difficult. This procedure
for the case $\delta = 1$ is illustrated in [2], Sec. 10.

References.

1. K. Alladi, Moments of additive functions and sieve methods, New
York Number Theory Seminar, Springer Lecture Notes 1052 (1982),
1-25.

2. K. Alladi, A study of the moments of additive functions using
Laplace transforms and sieve methods, Proceedings Fourth
Matscience Conference on Number Theory, Ootacamund, India
(1984), Springer Lecture Notes (to appear).

3. K. Alladi, Moments of additive functions and the sequence of
shifted primes, Pacific Journal of Math., Ernst Straus Memorial
Vol., June (1985) (to appear).

4. P.D.T.A. Elliott, High power analogues of the Turán-Kubilius
inequality and an application to number theory, Can. Jour. of
Math. 32 (1980), 893-907.

5. P.D.T.A. Elliott, Probabilistic Number Theory, Vol. 1 and 2,
Grundlehren 239-240, Springer-Verlag, Berlin, New York, 1980.

6. H. Halberstam and H.-E. Richert, Sieve Methods, Academic Press,
London, New York, 1974.

K. Alladi
University of Hawaii,
Honolulu, Hawaii 96822

P. Erdős
Hungarian Academy of Sciences,
Budapest, Hungary.

J.D. Vaaler
University of Texas,
Austin, Texas 78712, U.S.A.
LECTURES ON THE THUE PRINCIPLE

Enrico Bombieri

I. Introduction.
The aim of these lectures is to give an account of results
obtained from the application of Thue's idea of comparing two
rational approximations to algebraic numbers in order to show that
algebraic numbers cannot be approximated too well by rational
numbers. In particular we will give special attention to the
problem of obtaining effective measures of irrationality, or types,
for various classes of algebraic numbers.

1.1 Notation.
In what follows we shall adhere to the following notation.
$k$ is a number field and $K$ denotes an extension of $k$ of degree
$r = [K:k]$, with $r \ge 2$.
For each place $v$ of $k$ we have an absolute value $|\ |_v$, uniquely
defined up to a power. In order to fix this power, let us consider
the inclusion of complete fields $\mathbb{Q}_v \subset k_v$ arising from the inclusion
$\mathbb{Q} \subset k$; then if $v$ lies over the rational prime $p$, which we write as
$v \mid p$, we want

$$|p|_v = p^{-[k_v:\mathbb{Q}_p]/[k:\mathbb{Q}]}$$

while if $v$ is archimedean we want

$$|x|_v = |x|^{[k_v:\mathbb{R}]/[k:\mathbb{Q}]}$$

where $|x|$ denotes the usual euclidean absolute value in $\mathbb{R}$ or $\mathbb{C}$. We
also write
e:
v if vi'"
e: i f v is finite.
v
If $\alpha \in k$, $\alpha \ne 0$ and if we consider $\alpha \in K$ by means of the
inclusion $k \subset K$ then we have

$$\log |\alpha|_v = \sum_{w \mid v} \log |\alpha|_w \tag{1}$$

where $w$ runs over the places of $K$ lying over the place $v$ of $k$. We
also write

$$\|\ \|_v = |\ |_v^{[k:\mathbb{Q}]/[k_v:\mathbb{Q}_v]}.$$

Fundamental for us is the product formula in $k$, which we write
as

Product Formula. If $\alpha \in k$, $\alpha \ne 0$ then

$$\sum_v \log |\alpha|_v = 0.$$
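
For $k = \mathbb{Q}$ the product formula can be checked directly, since the places are the primes together with the usual absolute value and all local degrees equal 1. The sketch below is an illustration only: the helper names are hypothetical, and it verifies the multiplicative form $\prod_v |a|_v = 1$ in exact rational arithmetic, which is equivalent to $\sum_v \log |a|_v = 0$.

```python
from fractions import Fraction


def p_adic_abs(a, p):
    """|a|_p = p^(-ord_p(a)) for a nonzero rational a and a prime p."""
    num, den, value = a.numerator, a.denominator, Fraction(1)
    while num % p == 0:
        num //= p
        value /= p
    while den % p == 0:
        den //= p
        value *= p
    return value


def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def product_of_absolute_values(a):
    """Product of |a|_v over all places v of Q; only finitely many differ from 1."""
    result = abs(a)   # the archimedean place
    for p in prime_factors(abs(a.numerator)) | prime_factors(a.denominator):
        result *= p_adic_abs(a, p)
    return result


print(product_of_absolute_values(Fraction(-84, 55)))
```

Primes dividing neither numerator nor denominator contribute a factor 1, which is why the loop only visits the primes that occur.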

1.2 Heights.
Let us abbreviate $\log^+ t = \log t$ if $t > 1$, $\log^+ t = 0$ if
$0 < t \le 1$. As an immediate consequence of the product formula we
have

Fundamental Inequality. Let $\alpha \in k$, $\alpha \ne 0$ and let $S$ be any set
of places of $k$. Then

This leads to the definition of height: the absolute height of
$\alpha \in k$, $\alpha \ne 0$, denoted by $h(\alpha)$, is defined by

$$\log h(\alpha) = \sum_v \log^+ |\alpha|_v \tag{2}$$

where $\sum_v$ runs over all places of $k$. The height $h(\alpha)$ has the
following properties.

(a) invariance: $h(\alpha)$ does not depend on the field $k$ with
$\alpha \in k$ used in the definition (2)

(b) $h(\alpha) = h(\alpha^{-1})$

(c) $h(\alpha\beta) \le h(\alpha)h(\beta)$

Of these, (a) follows from (1); (b) follows from the product
formula; (c) follows from $\log^+(ab) \le \log^+ a + \log^+ b$; (d) follows from

max log
+
Ja. Iv
~
ifv%oo
i
and

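Let $\alpha \in k$, $\alpha \ne 0$ and let

$$f(x) = a_0 x^d + a_1 x^{d-1} + \cdots + a_d = 0 \tag{3}$$

be an irreducible equation for $\alpha$ in $\mathbb{Z}[x]$, with $\mathrm{GCD}(a_0, \ldots, a_d) = 1$.
The classical height $H(\alpha)$ of $\alpha$ is given by

$$H(\alpha) = \max_i |a_i| \tag{4}$$

and the Mahler measure $M(\alpha)$ of $\alpha$ is defined by

$$M(\alpha) = \exp\Big(\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{i\theta})|\, d\theta\Big). \tag{5}$$

One proves easily, by Jensen's formula or directly, that

$$M(\alpha) = |a_0| \prod_i \max(1, |\alpha_i|) \tag{6}$$

where $\alpha_1, \ldots, \alpha_d$ are the roots of $f(x)$. If $k = \mathbb{Q}(\alpha)$ we get

$$\frac{1}{d} \sum_i \log^+ |\alpha_i| = \sum_{v \mid \infty} \log^+ |\alpha|_v$$

and

$$\frac{1}{d} \log |a_0| = \sum_{v \nmid \infty} \log^+ |\alpha|_v;$$

hence

$$M(\alpha) = h(\alpha)^d \tag{7}$$

where $d = \deg \alpha$.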

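Also by (5) we have

$$M(\alpha) \le \Big(\frac{1}{2\pi} \int_0^{2\pi} |f(e^{i\theta})|^p\, d\theta\Big)^{1/p}$$

for every $p > 0$. The special case $p = 2$ yields

$$M(\alpha) \le \Big(\sum_i |a_i|^2\Big)^{1/2} \le (d+1)^{1/2} H(\alpha). \tag{8}$$

In the opposite direction, by symmetric functions we have

$$|a_0| + \cdots + |a_d| \le |a_0| \prod_i (1 + |\alpha_i|) \le 2^d M(\alpha), \tag{9}$$

so that (8) and (9) prove that $M(\alpha)$ and $H(\alpha)$ have the same order of
magnitude.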
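
The agreement between the integral (5) and the root product (6), and the two-sided comparison (8), (9), can be checked numerically on a small example. The following is a minimal sketch: the function names are hypothetical, the integral is replaced by an equally spaced Riemann sum (which assumes the polynomial has no roots on the unit circle), and the sample polynomial $x^2 - x - 3$ is an illustrative choice with roots $(1 \pm \sqrt{13})/2$.

```python
import cmath
import math


def mahler_from_integral(coeffs, samples=4096):
    """exp of the average of log|f(e^{i theta})|, a Riemann-sum version of (5).

    coeffs = [a_0, ..., a_d] with the leading coefficient first.
    """
    total = 0.0
    for j in range(samples):
        z = cmath.exp(2j * math.pi * j / samples)
        value = 0j
        for a in coeffs:              # Horner evaluation of f(z)
            value = value * z + a
        total += math.log(abs(value))
    return math.exp(total / samples)


def mahler_from_roots(leading, roots):
    """|a_0| times the product of max(1, |root|), as in (6)."""
    return abs(leading) * math.prod(max(1.0, abs(r)) for r in roots)


coeffs = [1, -1, -3]                                  # f(x) = x^2 - x - 3
roots = [(1 + cmath.sqrt(13)) / 2, (1 - cmath.sqrt(13)) / 2]
d = len(coeffs) - 1
M = mahler_from_roots(coeffs[0], roots)
H = max(abs(a) for a in coeffs)
print(M, mahler_from_integral(coeffs))
print(M <= math.sqrt(sum(a * a for a in coeffs)) <= math.sqrt(d + 1) * H)   # (8)
print(sum(abs(a) for a in coeffs) <= 2 ** d * M)                            # (9)
```

Both roots of the sample polynomial lie outside the unit circle, so here the Mahler measure is simply the absolute value of the constant term.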

We may consider $h(\alpha)$ as an intrinsic height on the algebraic
group $G_m$. If $P$ is the point on $G_m$ corresponding to $\alpha \in k^*$
then

$$\log h(\alpha) = \frac{1}{d} \lim_{m \to \infty} \frac{1}{m} \log H(mP) \tag{10}$$

where $mP = \alpha^m$ is the "sum" of $P$ with itself (for the operation in
$G_m$) $m$ times. Formula (10) shows the analogy of $\log h(\alpha)$ with the
Tate height on elliptic curves; everything is of course much simpler
here.

The definition of height can be carried through in other settings
too; of importance to us is the projective height, defined as
follows.

Let $x = (x_0, x_1, \ldots, x_N)$ be homogeneous coordinates of a
$k$-rational point in projective space $\mathbb{P}^N$. The projective height
of $x$ is defined by

$$\log h(x) = \sum_v \log |x|_v \tag{11}$$

where

$$|x|_v = \max_i |x_i|_v. \tag{12}$$

By the product formula, $h(x) = h(\lambda x)$ whenever $\lambda \in k^*$, thus the
height $h(x)$ is well-defined on $\mathbb{P}^N(k)$; it is also independent of
field extensions. We note that the projective height is compatible
with tensor (Kronecker) products:

$$h(x \otimes y) = h(x)h(y). \tag{13}$$

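Examples.

(i) $k = \mathbb{Q}$, $\alpha = p/q \in \mathbb{Q}^*$ (in lowest terms).

In this case

$$h(\alpha) = H(\alpha) = \max(|p|, |q|).$$

(ii) $\alpha = \sqrt{2} - 1$, $k = \mathbb{Q}(\sqrt{2})$.

Here $\alpha$ is integral, thus $|\alpha|_v = 1$ if $v \nmid \infty$. At $\infty$ we have two
inequivalent absolute values $v$, for which $k_v = \mathbb{R}$; the inclusion
$k \subset k_v = \mathbb{R}$ is such that $\sqrt{2}$ is positive in one embedding and
negative (the other determination, $-\sqrt{2}$) in the other. Let us
call $v_+$, $v_-$ the corresponding places. Now

$$|\alpha|_{v_+} = (\sqrt{2} - 1)^{1/2}, \quad |\alpha|_{v_-} = (\sqrt{2} + 1)^{1/2}$$

and

$$h(\alpha) = \sqrt{\sqrt{2} + 1} = 1.55377\ldots$$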
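
The value in example (ii) can be reproduced with a few lines of arithmetic. The sketch below rests on an assumption that holds in this example but not in general: $\alpha$ is an algebraic integer, so the finite places contribute nothing to (2) and $h(\alpha)$ is the product of $\max(1, |\sigma(\alpha)|)^{1/d}$ over the $d$ conjugates $\sigma(\alpha)$. The function name is hypothetical.

```python
import math


def height_of_algebraic_integer(conjugates):
    """Absolute height from the full list of conjugates of an algebraic integer.

    Each embedding contributes max(1, |sigma(alpha)|) raised to 1/d,
    where d is the number of conjugates (the degree).
    """
    degree = len(conjugates)
    return math.prod(max(1.0, abs(s)) ** (1.0 / degree) for s in conjugates)


# The two conjugates of sqrt(2) - 1 under sqrt(2) -> +sqrt(2) and -sqrt(2).
conjugates = [math.sqrt(2) - 1, -math.sqrt(2) - 1]
h = height_of_algebraic_integer(conjugates)
print(h, h ** 2, math.sqrt(2) + 1)
```

The second and third printed numbers agree, which is relation (7) for the minimal polynomial $x^2 + 2x - 1$, whose Mahler measure is $\sqrt{2} + 1$.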

(iii) $\alpha^r - m\alpha^{r-1} + 1 = 0$, $m > 2$.
Let $k = \mathbb{Q}(\alpha)$. Now $[k:\mathbb{Q}] = r$ and $\alpha$ is a unit, thus $|\alpha|_v = 1$
if $v \nmid \infty$. At $\infty$ we have

(a) one absolute value Vo with k. v R and a close to m, in


o
fact
m - _1_
r-l
<a <m
m

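for the embedding $k \subset k_{v_0} = \mathbb{R}$;

(b) if $r$ is even, one absolute value $v_+$ with $k_{v_+} = \mathbb{R}$ and such
that

$$\alpha \sim m^{-1/(r-1)}$$

for the embedding $k \subset k_{v_+} = \mathbb{R}$, and $\frac{r}{2} - 1$ absolute values $v_j$, $j = 1,
\ldots, \frac{r}{2} - 1$ with $k_{v_j} = \mathbb{C}$ and such that

$$\alpha \sim m^{-1/(r-1)} \zeta$$

with $\zeta^{r-1} = 1$, $\zeta \ne 1$ for the embedding $k \subset k_{v_j} = \mathbb{C}$ (the conjugate
embedding determines the same $v_j$);

(c) if $r$ is odd, we have a result similar to (b) but with two
absolute values $v_+$, $v_-$ with $k_{v_+} = \mathbb{R}$, $k_{v_-} = \mathbb{R}$ and $\alpha \sim \pm m^{-1/(r-1)}$.

If $v \mid \infty$ and $v \ne v_0$ then $|\alpha|_v < 1$ hence $\log^+ |\alpha|_v = 0$ and
thus

$$\log h(\alpha) = \log^+ |\alpha|_{v_0} = \log(|\alpha|^{1/r}) < \frac{1}{r} \log m.$$

Thus $h(\alpha) < m^{1/r}$ and in fact $h(\alpha)$ is extremely close to $m^{1/r}$.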
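
How close $h(\alpha)$ is to $m^{1/r}$ in example (iii) can be seen numerically. The sketch below is an illustration under assumptions: it locates the real root of $x^r - m x^{r-1} + 1$ in the interval $(m-1, m)$ by bisection (the polynomial is negative at $m-1$ and equals 1 at $m$ when $m > 2$), and it takes from the text that every other conjugate has absolute value below 1, so that $h(\alpha) = \alpha^{1/r}$. The function name and the sample values $m = 3$, $r = 5$ are illustrative.

```python
def large_real_root(m, r, iterations=200):
    """Bisection for the root of x^r - m*x^(r-1) + 1 lying in (m - 1, m)."""
    def f(x):
        return x ** r - m * x ** (r - 1) + 1

    lo, hi = m - 1.0, float(m)    # f(lo) < 0 < f(hi) for m > 2 and r >= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


m, r = 3, 5
alpha = large_real_root(m, r)
print(alpha, alpha ** (1.0 / r), m ** (1.0 / r))
```

The gap between the last two printed numbers shrinks quickly as $r$ grows, because the root approaches $m$ at a rate governed by $m^{-(r-1)}$.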

1.3 General heights.


The above discussion on heights can be extended by introducing
different types of local heights. This turns out to be useful in
obtaining refined results on roots of special type (for example,
roots of unity) of polynomials. Before considering a general const-
ruction let us reexamine the height introduced before in the light
of different considerations.

Let $k$ be a number field and let $\alpha \in k$. For each place $v$ of $k$
let $k_v$ denote the completion of $k$ with respect to the absolute value
$|\ |_v$ determined by $v$ and let $\Omega_v$ be the completion of an algebraic
closure of $k_v$ with respect to an absolute value, again denoted by
$|\ |_v$, extending the absolute value on $k_v$.

Lemma 1. For every $v$ we have

$$\int_{|z|_v = 1} \log |z - \alpha|_v\, d_v z = \log^+ |\alpha|_v$$

where $d_v z$ is the normalized Haar measure on the units $\{z \in \Omega_v :
|z|_v = 1\}$ of $\Omega_v$.

Proof. If $v \mid \infty$ this reduces to Jensen's formula

$$\frac{1}{2\pi} \int_0^{2\pi} \log |e^{i\theta} - \alpha|\, d\theta = \log^+ |\alpha|.$$

If instead $v \nmid \infty$ we have

$$|z - \alpha|_v = \max\{1, |\alpha|_v\}$$

almost everywhere in $\Omega_v$; this is clear if either $|\alpha|_v < 1$ or
$|\alpha|_v > 1$ and for $|\alpha|_v = 1$ it reduces to the case in which $\alpha = 1$,
where it follows from the fact that the subgroup of units of $\Omega_v$
congruent to 1 modulo the maximal ideal $\{|z|_v < 1\}$ of the ring
$R_v = \{z \in \Omega_v : |z|_v \le 1\}$ has infinite index in the group of all
units of $\Omega_v$.

Corollary. Let $f \in \mathbb{Q}[x]$ and let $k$ be an algebraic number field
containing the coefficients of $f$ and all roots of $f$. We have

$$\sum_\alpha (\mathrm{ord}_\alpha f) \log h(\alpha) = \sum_v \int_{|z|_v = 1} \log |f(z)|_v\, d_v z,$$

where $\sum_v$ runs over all normalized absolute values of $k$ and $d_v z$ is
the normalized Haar measure on the group of units $|z|_v = 1$ of $\Omega_v$.

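Let $f(z) = a_0 z^d + \cdots + a_d \in k[z]$ and let us define the local
height $H_v(f)$ by means of

$$H_v(f) = \max_i |a_i|_v \quad \text{if } v \nmid \infty \tag{14}$$

and

$$\log H_v(f) = \frac{\varepsilon_v}{2} \log\Big(\sum_i \|a_i\|_v^2\Big) \quad \text{if } v \mid \infty. \tag{15}$$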

Lemma 2. For every $v$ we have

$$\int_{|z|_v = 1} \log |f(z)|_v\, d_v z \le \log H_v(f).$$

Proof. We have

$$\int_{|z|_v = 1} \log |f(z)|_v\, d_v z \le \frac{[k_v:\mathbb{R}]}{2[k:\mathbb{Q}]} \log\Big(\int_{|z|_v = 1} \|f(z)\|_v^2\, d_v z\Big).$$

Since

$$\int_{|z|_v = 1} \|f(z)\|_v^2\, d_v z = \sum_i \|a_i\|_v^2$$

the first part of Lemma 2 follows from the definition of $H_v(f)$.
In order to prove the second part one may note that the statement
is true if $f$ has degree 1 (by Lemma 1) and proceed by induction
on $\deg f$ using

Gauss' Lemma. If $v \nmid \infty$ then

$$H_v(fg) = H_v(f)H_v(g).$$

Theorem 1. For every $f \in \mathbb{Q}[x]$ we have

$$\log H(f) - (\log 2)(\deg f) \le \sum_\alpha (\mathrm{ord}_\alpha f) \log h(\alpha) \le \log H(f)$$

where $\sum_\alpha$ runs over all roots of $f$.

Proof. The right-hand side inequality is immediate from Lemma 1,
Corollary and Lemma 2. The left-hand side inequality can be proved
as follows. We may suppose that $f$ is monic, hence

$$f(z) = \prod_\alpha (z - \alpha)^{\mathrm{ord}_\alpha f}.$$

Now if $v \nmid \infty$ we have

$$\log H_v(f) = \sum_\alpha (\mathrm{ord}_\alpha f) \log^+ |\alpha|_v$$

by Gauss' Lemma. If instead $v \mid \infty$ then

r
L
s=O

~
[. lIa i II
2 ••• 11 a ,,2v
i
l~il< ••• <is~r 1 v s

r 2
~ 2r TT (1 + "ai" v )
i=1

r 2
~ 4r TT max(1, lail ).
i=1 v

If we apply this inequality to the roots of the polynomial f we find

and the left-hand side inequality of Theorem 1 is obtained by


summing these local estimates for all v.
For later use we also need bounds for the heights of derivat-
ives of polynomials.

Lemma 3.
operator

where I

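$$\log H(\partial_I f) \le \log H(f) + \sum_{\nu=1}^{N} \tau\Big(\frac{i_\nu}{\deg_{x_\nu} f}\Big) \deg_{x_\nu} f$$

where

$$\tau(t) = t \log \frac{1}{t} + (1-t) \log \frac{1}{1-t}.$$

Proof. Clear, because

$$\binom{d}{m} \le \exp\Big(d\, \tau\Big(\frac{m}{d}\Big)\Big)$$

for every $m$, $d$; this last inequality is most easily proved by noting
that

$$\binom{d}{m} \le \Big(1 + \frac{1}{u}\Big)^m (1 + u)^{d-m}$$

and choosing $u = \frac{m}{d-m}$.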
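
The binomial estimate used in this proof is easy to confirm numerically. The sketch below assumes the bound has the form $\binom{d}{m} \le \exp(d\,\tau(m/d))$ with $\tau(t) = t\log(1/t) + (1-t)\log(1/(1-t))$, which is what the choice $u = m/(d-m)$ produces from $\binom{d}{m} \le (1 + 1/u)^m (1+u)^{d-m}$; the function names are hypothetical, and $\tau$ is extended by $\tau(0) = \tau(1) = 0$.

```python
from math import comb, exp, log


def tau(t):
    """t*log(1/t) + (1-t)*log(1/(1-t)), extended by 0 at the endpoints."""
    if t in (0.0, 1.0):
        return 0.0
    return t * log(1.0 / t) + (1.0 - t) * log(1.0 / (1.0 - t))


def binomial_bound(d, m):
    """exp(d * tau(m/d)), which equals (d/m)^m * (d/(d-m))^(d-m) for 0 < m < d."""
    return exp(d * tau(m / d))


for d, m in [(10, 3), (40, 20), (60, 1)]:
    print(d, m, comb(d, m), binomial_bound(d, m))
```

The bound follows from the binomial theorem, since $\binom{d}{m} u^m$ is a single term of $(1+u)^d$ for any $u > 0$; the choice $u = m/(d-m)$ minimizes the resulting expression.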
Now we consider general heights. Let lJ v be a positive measure
on ~v with total mass lJ v (~v ) - 1 and let us define a height
h(a,l!) by

log h(a,l!) - Lf loglz-al v dlJ.


v
(16 )
v

It is clear that

$$\sum_\alpha (\mathrm{ord}_\alpha f)\log h(\alpha,\mu) = \sum_v \int \log|f(z)|_v \, d\mu_v. \tag{17}$$

As a special case, suppose that $\mu_v$ has support in $|z|_v \le 1$ for every $v$. Then

$$|f(z)|_v \le \max_i |a_i|_v \tag{18}$$

and
(19)

Hence

Theorem 2. If each $\mu_v$ has support in $R_v = \{z \in \Omega_v : |z|_v \le 1\}$ then

$$\sum_\alpha (\mathrm{ord}_\alpha f)\log h(\alpha,\mu) \le \log H(f) + \log(\deg f + 1).$$

Quite often, one uses Theorem 2 for its consequence

$$\mathrm{ord}_\alpha f \le \frac{\log H(f) + \log(\deg f + 1)}{\log h(\alpha,\mu)} \tag{20}$$

which we have whenever $h(\alpha,\mu) \ge 1$ for all $\alpha$. In what follows we shall describe one non-trivial application of Theorem 2.

If we use (20) with the height $h(\alpha)$ studied so far we get no result whatsoever in the case in which $\alpha$ is a root of unity, since then $\log h(\alpha)$ vanishes. It is an interesting question in itself to study what is the maximum multiplicity of a root of unity in a polynomial of given degree and given height.

Let $p$ be a rational prime and let us choose

$$\mu_v(z) = \frac{1}{p-1}\sum_\zeta \delta_\zeta(z) \quad \text{if } v \mid \infty$$

where $\delta_\zeta$ is a Dirac measure at $\zeta$ and where $\sum_\zeta$ runs over the primitive $p$-th roots of unity; if $v \nmid \infty$ we choose instead $\mu_v$ Haar measure on $\{z \in \Omega_v : |z|_v = 1\}$.

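We note that if $v \nmid \infty$ then $\log^+|\alpha|_v = \log^+|\alpha-\zeta|_v$ if $\alpha$ is not a primitive $p$-th root of unity, and it follows that

$$\log h(\alpha,\mu) = \sum_{v \nmid \infty} \log^+|\alpha|_v + \frac{1}{p-1}\sum_{v \mid \infty}\sum_\zeta \log|\alpha-\zeta|_v$$

$$= \frac{1}{p-1}\sum_\zeta \Big(\sum_{v \nmid \infty} \log^+|\alpha-\zeta|_v + \sum_{v \mid \infty} \log|\alpha-\zeta|_v\Big) \ge 0$$

by the product formula. Also

$$\log h(1,\mu) = \frac{\log p}{p-1};$$

by Theorem 2 we obtain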

Theorem 3. If $f(\zeta) \ne 0$ whenever $\zeta$ is a primitive $p$-th root of unity, $p$ prime, then

$$\frac{\log p}{p-1}\,\mathrm{ord}_1(f) \le \log H(f) + \log(\deg f + 1).$$
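Theorem 3 can be tried out on the simplest polynomial with a high-order zero at 1. The Python sketch below takes $f(x) = (x-1)^n$, which does not vanish at any primitive $p$-th root of unity, and compares both sides of the bound. It assumes that for $f \in \mathbb{Z}[x]$ the height $H(f)$ reduces to the Euclidean norm of the coefficient vector divided by the content, which is how definitions (14) and (15) read over $\mathbb{Q}$; the helper names are illustrative.

```python
import math
from functools import reduce


def log_height(coeffs):
    """log H(f) for f in Z[x] (assumed normalization over Q):
    log of the Euclidean norm of the coefficients minus log of their gcd."""
    content = reduce(math.gcd, (abs(c) for c in coeffs))
    return 0.5 * math.log(sum(c * c for c in coeffs)) - math.log(content)


def theorem3_sides(n: int, p: int):
    """Both sides of the multiplicity bound for f(x) = (x - 1)^n and a prime p."""
    coeffs = [(-1) ** (n - k) * math.comb(n, k) for k in range(n + 1)]
    lhs = math.log(p) / (p - 1) * n               # ord_1(f) = n
    rhs = log_height(coeffs) + math.log(n + 1)    # deg f = n
    return lhs, rhs
```

The case $p = 2$ is the tightest one here, since the sum of the squared coefficients equals $\binom{2n}{n}$ and the two sides then differ only by a term of size $O(\log n)$.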

As a final remark for this section we note that the fundamental theorem of algebra

$$\sum_\alpha \mathrm{ord}_\alpha f = \deg f$$

may be considered a limiting case of our considerations, if $\mu_v$ becomes a point mass at $\infty$, for every $v$.

II. Thue's method.

2.1 As a first application of the estimates of the preceding section we prove the basic Liouville lower bound for the distance of two algebraic numbers.
Let $K$ be an extension of $k$, of degree $r = [K:k]$, let $v$ be a place of $k$ with an extension $\tilde v$ to $K$, with associated absolute values $|\ |_{\tilde v}$ and $|\ |_v$. We have

$$|\xi|_v = |\xi|_{\tilde v}^{\,r/[K_{\tilde v}:k_v]} \quad \text{if } \xi \in k \tag{21}$$

and thus we can use (21) to extend the absolute value $|\ |_v$, originally defined in $k$, to the field $K$. We can now state

Liouville Bound. Let $\alpha \in K$, $\beta \in k$, $\alpha \ne \beta$. Then

with $\delta = [K_{\tilde v}:k_v]$. In particular, we have

Proof. By the Fundamental Inequality we have

$$\log|\alpha-\beta|_{\tilde v} \ge -\log h(\alpha-\beta)$$

and

$$h(\alpha-\beta) \le 2h(\alpha)h(\beta)$$

by property d) of the height. Since $|\alpha-\beta|_{\tilde v} = |\alpha-\beta|_v^{\delta/r}$, the result follows.
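Combining the two inequalities of the proof with the relation between $|\ |_{\tilde v}$ and $|\ |_v$ suggests the explicit form $|\alpha-\beta|_v \ge (2h(\alpha)h(\beta))^{-r/\delta}$. The Python sketch below tests this reading, which is an assumption, on $\alpha = \sqrt 2$ over $k = \mathbb{Q}$ at the real place, so that $r = 2$ and $\delta = 1$. It also assumes $h(\sqrt 2) = \sqrt 2$ and $h(p/q) = \max(|p|, q)$ for a fraction in lowest terms; all names are illustrative.

```python
import math
from fractions import Fraction

SQRT2 = math.sqrt(2.0)
H_ALPHA = math.sqrt(2.0)   # assumed absolute height of sqrt(2)
R, DELTA = 2, 1            # r = [Q(sqrt 2):Q]; delta = 1 at the real place


def height_rational(beta: Fraction) -> int:
    """Height of a rational number in lowest terms: max(|p|, q)."""
    return max(abs(beta.numerator), beta.denominator)


def liouville_margin(beta: Fraction) -> float:
    """|sqrt(2) - beta| * (2 h(alpha) h(beta))^(r/delta); the assumed bound says this is >= 1."""
    scale = (2.0 * H_ALPHA * height_rational(beta)) ** (R / DELTA)
    return abs(SQRT2 - float(beta)) * scale


def worst_margin(max_q: int) -> float:
    """Smallest margin over all p/q with 1 <= q <= max_q and -2q <= p <= 3q."""
    return min(
        liouville_margin(Fraction(p, q))
        for q in range(1, max_q + 1)
        for p in range(-2 * q, 3 * q + 1)
    )
```

A margin that stays above 1 over the whole search range is consistent with the bound; the search range itself is an arbitrary choice.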

Definition. $\mu$ is a type of irrationality for $\alpha$ over $k$ with respect to $v$ if

for all $\beta \in k$, $\beta \ne \alpha$.
(irrationality, or type, for $\alpha/k$, relative to $v$.

It is clear that it suffices to consider lower bounds for $\alpha-\beta$ only if $h(\beta)$ is larger than a prescribed bound, simply by changing the constant $c(\alpha)$, that is we need to prove

Definition. $\mu$ is an effective type for $\alpha$ over $k$ with respect to $v$ if $c_1(\alpha)$, $c_2(\alpha)$ above can be determined effectively. One then writes

$$\mu_{\mathrm{eff}}(\alpha;k,v) = \inf \mu$$

where the infimum is taken over all admissible $\mu$'s for which effectively calculable constants $c_1(\alpha)$, $c_2(\alpha)$ can be found.

It is clear that the Liouville bound implies

(22)

In the other direction, it is known (see [Schmidt 1980]) that if $\delta = 1$, $\alpha \in K$ and $\alpha \notin k$ then

for an effectively computable $c_3(\alpha) > 0$ and infinitely many $\beta \in k$. Thus, if $\delta = 1$,

(23)

for every $\alpha \in K$, $\alpha \notin k$. The gap between (22) and (23) is considerable and it was only after Baker's work on linear forms in logarithms that the first improvement on (22) was obtained, namely: if $\delta = 1$ then

(24)

for some very small $\eta(\alpha) > 0$ [Feldman 1971]. Further work showed that $\eta(\alpha)$ can be made to depend only on the field $K$, and generalized (24) to arbitrary extensions $K/k$ and absolute values $v$. All these results, although of great theoretical importance, are far away from the celebrated theorem of Roth:

If $\delta = 1$ then $\mu(\alpha;k,v) = 2$.

On the other hand, Roth's theorem is ineffective and this limits to some extent the range of its applications. In what follows, we shall describe in some detail Thue's method, which is at the origin of all ineffective results such as Roth's, together with some recent effective developments and new applications.

We may summarize the essence of Thue's method in three steps. Let $\alpha_1, \alpha_2 \in K$ and suppose that $\beta_1, \beta_2 \in k$ are approximations to $\alpha_1, \alpha_2$, for the absolute value $|\ |_v$. For simplicity, we consider the case $k = \mathbb{Q}$ and write $\beta_1 = p_1/q_1$, $\beta_2 = p_2/q_2$; we also write $r = [K:\mathbb{Q}]$.

Step 1. One constructs a polynomial $P(x_1,x_2)$ with rational integral coefficients, vanishing at $(\alpha_1,\alpha_2)$, together with all partial derivatives of order $(i_1,i_2)$, with $\frac{i_1}{d_1} + \frac{i_2}{d_2} < t$, where $d_i = \deg_{x_i} P$, and where $t$ is sufficiently small. The number of coefficients of $P$ at our disposal is asymptotic to $d_1 d_2$, while the number of equations is asymptotic to $(r t^2/2)\,d_1 d_2$. If $t < \sqrt{2/r}$ we can solve the corresponding linear system for the coefficients of $P$, with a height

for a suitable $C(\alpha_1,\alpha_2)$. Of course, the construction guarantees that the polynomial $P$ is not identically 0.
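The parameter count in Step 1 can be checked by direct enumeration. The Python sketch below counts the index pairs $(i_1,i_2)$ with $i_1/d_1 + i_2/d_2 < t$ and compares $r$ times that number with the number $(d_1+1)(d_2+1)$ of available coefficients. The function names are illustrative, and exact rational arithmetic is used only to avoid rounding at the boundary of the triangle.

```python
from fractions import Fraction


def count_conditions(d1: int, d2: int, t: Fraction) -> int:
    """Number of pairs (i1, i2), 0 <= i1 <= d1, 0 <= i2 <= d2, with i1/d1 + i2/d2 < t."""
    return sum(
        1
        for i1 in range(d1 + 1)
        for i2 in range(d2 + 1)
        if Fraction(i1, d1) + Fraction(i2, d2) < t
    )


def parameter_count(d1: int, d2: int, t: Fraction, r: int):
    """Return (number of unknown coefficients, r times the number of vanishing conditions)."""
    return (d1 + 1) * (d2 + 1), r * count_conditions(d1, d2, t)
```

For $t \le 1$ the count is close to the area $t^2 d_1 d_2 / 2$ of the triangle, so the system has more unknowns than equations as soon as $r t^2/2 < 1$ and the degrees are large.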

Step 2. By modifying $P$ if necessary, and perhaps by imposing a condition of type "$q_2$ is much larger than $q_1$", one shows that

Step 3. By looking at denominators one has the lower bound

Finally one compares this lower bound with an upper bound obtained by using a Taylor series expansion of $P$ at $(\alpha_1,\alpha_2)$, noting that $P$ vanishes to high order at this point:

$$\Big|P\Big(\frac{p_1}{q_1},\frac{p_2}{q_2}\Big)\Big| < C_1^{d_1+d_2}\Big(\Big|\alpha_1 - \frac{p_1}{q_1}\Big|^{t d_1} + \Big|\alpha_2 - \frac{p_2}{q_2}\Big|^{t d_2}\Big),$$

with a suitable $C_1 = C_1(\alpha_1,\alpha_2,t)$.

Now suppose that the approximations to $\alpha_i$ satisfy

$i = 1, 2$;

then from the preceding bounds we obtain

The degrees $d_1$ and $d_2$ are still at our disposal and we choose them so that $q_1^{d_1}$ and $q_2^{d_2}$ are about of the same magnitude. If $q_1$ and $q_2$ are sufficiently large, this implies that $t\eta < 2 + \varepsilon$ for any positive $\varepsilon$. Since any $t < \sqrt{2/r}$ is allowed, one deduces

$$\eta < \sqrt{2r} + \varepsilon,$$

which is the Thue-Siegel-Dyson theorem.

It is clear that the preceding argument requires two approximations $p_1/q_1$ and $p_2/q_2$, with $q_1$ and $q_2$ large (otherwise the presence of the constant $C_1$ in the estimates becomes too important), while it may very well be that such approximations do not exist. Moreover, all existing arguments for Step 2 require that $q_2$ be much larger than $q_1$. This means that if we seek for a bound $Q$ for which

$$\Big|\alpha_2 - \frac{p}{q}\Big| > q^{-\sqrt{2r}-\varepsilon} \quad \text{for } q > Q,$$

then the preceding argument will obtain $Q$ as a function

but only provided $p_1/q_1$ is a sufficiently good approximation to $\alpha_1$ and provided $q_1$ is sufficiently large as a function of $\alpha_1$, $\varepsilon$ and the approximation. Two cases now may occur:
Case 1. $\alpha_1$ does not admit such a good approximation. In this case, we conclude an effective type of irrationality for $\alpha_1$.

Case 2. There is at least one good approximation to $\alpha_1$. In this case, we conclude a type of irrationality for elements $\alpha_2$ of the field $K$, which depends on the denominator $q_1$ of the good approximation to $\alpha_1$.

No procedure is given to decide between Case 1 and Case 2, and in Case 2 we have no information on the location $q_1$ of the approximation. The ineffectivity of the method depends on the fact that the statement of Case 2 is an existence statement whose truth is not determined in the course of our arguments.

Until recently, no instance of Case 2 was known. However a refinement of the notion of good approximation led to the first explicit examples in which Case 2 would hold, thus leading to new non-trivial types of approximation for a class of algebraic numbers [Bombieri 1982]. In what follows, we shall carry out the steps in the preceding program, with sufficient accuracy to obtain effective approximation results. We shall proceed using invariant techniques.

Let $P(x_1,x_2) \in k[x_1,x_2]$ denote a polynomial of degree $d_1$ in $x_1$ and $d_2$ in $x_2$; the totality of such polynomials is a $k$-vector space $V(d_1,d_2)$, of dimension $(d_1+1)(d_2+1)$. Let $\theta$ be positive and let $G(t)$ be the set of pairs $(i_1,i_2)$ such that

Let $\alpha_1, \alpha_2 \in K$ where $[K:k] = r \ge 2$. We want to find $P \in V(d_1,d_2)$ such that

for $I = (i_1,i_2) \in G(t)$ and where

If we write

this means solving the linear system of equations

$$\sum_{j_1=0}^{d_1}\sum_{j_2=0}^{d_2} a_{j_1 j_2}\binom{j_1}{i_1}\binom{j_2}{i_2}\alpha_1^{j_1-i_1}\alpha_2^{j_2-i_2} = 0$$

for $(i_1,i_2) \in G(t)$, with $a_{j_1 j_2} \in k$ not all zero.

Siegel's Lemma.

Axel Thue was the first to use Dirichlet's Box Principle in order to construct $P$. This was made explicit by Siegel ([Siegel 1929]), who proved:

Lemma. Let

$$a_{11}x_1 + \cdots + a_{1N}x_N = 0$$
$$a_{21}x_1 + \cdots + a_{2N}x_N = 0$$
$$\vdots$$
$$a_{M1}x_1 + \cdots + a_{MN}x_N = 0$$

be a linear system of equations with rational integral coefficients not all 0 and with $M < N$. Then there is a rational integral solution $(x_1,\ldots,x_N)$ with not all $x_i$'s equal to 0, with

$$\max_i |x_i| \le \Big(N \max_{i,j} |a_{ij}|\Big)^{M/(N-M)}.$$

Statements of this type are now called Siegel's Lemma. It is a curious fact that the name Siegel's Lemma became associated to weaker statements, replacing the bound given above by

$$c_1\Big(c_2 N \max_{ij} |a_{ij}|\Big)^{M/(N-M)}$$

for unspecified constants $c_1, c_2$, so that we find in the literature "versions of Siegel's Lemma" which are distinctly worse than Siegel's!
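For very small systems the existence of a small solution can be confirmed by exhaustive search. The Python sketch below assumes the bound in the form $\max_i |x_i| \le (N \max |a_{ij}|)^{M/(N-M)}$ and searches the corresponding box; the brute-force strategy and the function names are illustrative only and have nothing to do with the pigeonhole argument that proves the lemma.

```python
import itertools
import math


def siegel_bound(rows) -> int:
    """Integer part of (N * max|a_ij|)^(M / (N - M)) for an M x N integer system."""
    m, n = len(rows), len(rows[0])
    a = max(abs(c) for row in rows for c in row)
    return math.floor((n * a) ** (m / (n - m)) + 1e-9)


def is_solution(rows, x) -> bool:
    """True if x satisfies every equation of the homogeneous system."""
    return all(sum(c * xi for c, xi in zip(row, x)) == 0 for row in rows)


def small_solution(rows):
    """First nonzero integer solution found inside the box |x_i| <= siegel_bound(rows)."""
    n = len(rows[0])
    b = siegel_bound(rows)
    for x in itertools.product(range(-b, b + 1), repeat=n):
        if any(x) and is_solution(rows, x):
            return x
    return None
```

The search space has $(2B+1)^N$ points, so this is only practical for tiny $N$ and small coefficients.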

The preceding result is sufficient for most applications but for our purposes we need a more sophisticated result. So let us consider more closely the problem of finding solutions in $k^N$ of the linear system

$$Ax = 0,$$

where $A$ is an $M \times N$ matrix with entries in $k$. The following remarks are useful.

Remark 1. We are dealing with a homogeneous problem, i.e., a problem in projective space. Thus it appears that integrality of coordinates, which is a property in affine space, should be totally irrelevant.
In other words: it is a bad procedure to mix projective and affine points of view.

Remark 2. The system may be supposed of maximal rank. It defines a projective subspace $\Pi \subset \mathbb{P}^{N-1}$ of codimension $M$. Thus $\Pi$ is a point defined over $k$ of $\mathrm{Grass}(N-1,N-1-M)$, the Grassmannian of $(N-1-M)$-planes in $(N-1)$-space. We should regard this point as our basic object and not the individual linear equations defining our system.
In other words: the linear system $Ax = 0$ is not intrinsically defined and therefore it should be replaced by an invariant treatment.

Remark 3. Solutions defined over $k$ correspond to elements of $\Pi(k)$, the points of $\Pi$ defined over $k$. Thus we may want to study a basis of solutions, rather than one solution at a time.

We proceed as follows. Let $M \le N$ and let

$$X = (x_{ij}), \quad i = 1, \ldots, M;\ j = 1, \ldots, N$$

be an $M \times N$ matrix with elements in $k$. For $J \subset \{1, \ldots, N\}$ with $|J| = M$ let $X_J$ denote the $M \times M$ matrix

$$X_J = (x_{ij}), \quad i = 1, \ldots, M,\ j \in J.$$

We assume that at least one matrix $X_J$ is non-singular, that is $X$ is of maximal rank. Then for each place $v$ of $k$ we define a local height by

$$H_v(X) = \max_J |\det X_J|_v \quad \text{if } v \nmid \infty$$

and

$$H_v(X) = |\det(XX^*)|_v^{1/2} \quad \text{if } v \mid \infty,$$

and a global height $H(X)$ by

$$\log H(X) = \sum_v \log H_v(X).$$

The height $H(X)$ so defined does not depend on a field of definition $k$ for $X$ and it is invariant:

$$H(CX) = H(X)$$

for $C \in GL(M,k)$. We also have the useful property that if $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ then

for all places $v$. This is easily seen if $v \nmid \infty$ by using Laplace's expansion, while if $v \mid \infty$ it is a generalization of Hadamard's inequality due to Fischer in 1908.
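Over $k = \mathbb{Q}$ the matrix height can be computed exactly for an integer matrix of maximal rank. The Python sketch below assumes that in this case the finite places contribute $1/\gcd$ of the maximal minors and the infinite place contributes $\det(XX^T)^{1/2}$, which by the Cauchy-Binet formula is the square root of the sum of the squared maximal minors. It returns $H(X)^2$ as a rational number so that the invariance $H(CX) = H(X)$ can be tested without rounding; all names are illustrative.

```python
from fractions import Fraction
from functools import reduce
from itertools import combinations
from math import gcd


def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def maximal_minors(x):
    """Determinants of all M x M submatrices of an M x N matrix."""
    rows, cols = len(x), len(x[0])
    return [det([[row[j] for j in pick] for row in x]) for pick in combinations(range(cols), rows)]


def height_squared(x):
    """H(X)^2 for an integer matrix of maximal rank over Q (assumed normalization)."""
    minors = maximal_minors(x)
    g = reduce(gcd, (abs(d) for d in minors))
    return Fraction(sum(d * d for d in minors), g * g)


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))] for i in range(len(a))]
```

Multiplying $X$ on the left by an invertible $C$ multiplies every maximal minor by $\det C$, which cancels between the archimedean and non-archimedean contributions.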

The following result is due to Bombieri and Vaaler.

Theorem. Let $Ax = 0$ be a linear system of $M$ equations in $N$ unknowns, defined over $k$ and of maximal rank. There exist $N-M$ linearly independent vector solutions $x_1, \ldots, x_{N-M}$ such that

where $\Delta_k$ is the absolute discriminant of $k$ and where $d = [k:\mathbb{Q}]$.

If $A$ is defined over a field $K$ with $[K:k] = r$ let $\sigma_i(K)$, $i = 1, \ldots, r$ be the conjugate fields of $K/k$. Let us suppose that $rM < N$ and let

$$\mathcal{A} = \begin{pmatrix} \sigma_1(A) \\ \sigma_2(A) \\ \vdots \\ \sigma_r(A) \end{pmatrix}.$$

Assume that $\mathcal{A}$ is of rank $rM$. Then there are $x_1, \ldots, x_{N-rM} \in k^N$ such that

Moreover

Analogous statements hold for

$$A = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \end{pmatrix}$$

where $A_\ell$ is an $M_\ell \times N$ matrix over a field $K_\ell$ of degree $r_\ell = [K_\ell:k]$ over $k$. One defines $\mathcal{A}$ accordingly and replaces $rM$ by $\sum_\ell r_\ell M_\ell$, with the same conclusion.

Suppose

is a matrix with $rM$ rows indexed by $(\sigma,i_1,i_2)$ and $N$ columns indexed by $(j_1,j_2)$, where: $\sigma$ denotes conjugation of $K$ over $k$ (there are $r$ such conjugate fields), $(i_1,i_2) \in G$, and $j_1 \le d_1$, $j_2 \le d_2$. Let us assume for simplicity that $A$ is of maximal rank $rM = r|G|$. Then the preceding results on Siegel's Lemma show that there are polynomials $P_\ell(x_1,x_2) \in k[x_1,x_2]$, not identically 0, of degree at most $d_i$ in $x_i$, such that

(i) $\Delta^I P_\ell(\alpha_1,\alpha_2) = 0$ for $I \in G$;

(ii) $P_1, P_2, \ldots, P_{N-rM}$ are linearly independent over $\mathbb{Q}$;

(iii)

In evaluating $H(A)$ we have to consider the maximal minors of $A$. A typical determinant is a polynomial of degree $\le r d_1^2 d_2$ in the variables $\alpha_{\sigma 1}$, and of degree $\le r d_1 d_2^2$ in the variables $\alpha_{\sigma 2}$. Since $N \sim d_1 d_2$, we expect an estimate, as $d_1$, $d_2$ tend to infinity:

where $A_1$ and $A_2$ are bounded functions of $\alpha_1$, $\alpha_2$. An important but rather difficult problem is the determination of $A_1$, $A_2$ as functions of $\alpha_1$, $\alpha_2$ and $r$, $\theta$, $t$ and $\delta = d_2/d_1$ (the quantities $\theta$, $t$ appear in the description of $G$). If $t \le \theta \le t^{-1}$, which we shall suppose from now on, we have $N \sim d_1 d_2$, $M \sim \frac{1}{2} t^2 d_1 d_2$. With $\delta = d_2/d_1$ we now get

so that if $h(P_1) \le h(P_2) \le \cdots$ we obtain

The Thue Principle.

with rows indexed by $(\sigma,I)$, $I \in G$ and columns indexed by $(j_1,j_2)$. Let $P_1, \ldots, P_L$, $L = N - rM$, be the polynomials constructed in the preceding section and let $(\beta_1,\beta_2)$ be an approximation in $k$ to $(\alpha_1,\alpha_2)$, relative to an absolute value $v$. By this we mean

for $i = 1, 2$. Let $I_\ell^* = (i_{1\ell}^*, i_{2\ell}^*)$ be an index such that

and let

and let $\tau$ be a real number with $\tau \ge \tau^*$. Let $P_\ell$

$P_\ell(\beta_1,\beta_2) \ne 0$, the product formula in $k$ yields

$$\sum_w \log|P_\ell(\beta_1,\beta_2)|_w = 0.$$

For simplicity of notation we now drop $\ell$ and set $\tau^* = 0$. Thus

$$\sum_w \log|P(\beta_1,\beta_2)|_w = 0.$$

We estimate separately each $|P(\beta_1,\beta_2)|_w$.


Case (i). $w \ne v$.

In this case

$$|P(\beta_1,\beta_2)|_w \le \max(1, |(d_1+1)(d_2+1)|_w)\,|P|_w \max(1,|\beta_1|_w)^{d_1}\max(1,|\beta_2|_w)^{d_2},$$

hence

Case (ii). $w = v$.

In this case

and now $\Delta^I P(\alpha_1,\alpha_2) = 0$ if

We write $t$ for $t - \tau_\ell^*$. Let

$$f(x) = x\log\frac{1}{x} + (1-x)\log\frac{1}{1-x},$$

so that

for every $i$, $d$. Let us write

Subcase 1.
a b-l
We have l+a .; 0t, l+b'; 0 t. Now

Ip(f3 1 ,f3 2 )l v '; max(l, (ld 1+ l)(d 2+ 1)1)2

dl d2 i 1 i2
• Ipl v (!) I( i 1 )( i 2 )(f3 1 - a l ) (13 2 - a 2 ) Iw

+ e: max {d 1 (e:f(x) + x log a) + d 2 (e:f(y) + y log b)j


v
-1
8 x+8y;>t
where $\varepsilon = 1$ if $v \mid \infty$ and $\varepsilon = 0$ otherwise, and $\varepsilon_v = [k_v:\mathbb{Q}_v]/[k:\mathbb{Q}]$.
Say $\varepsilon = 1$. The absolute maximum of $f(x) + x\log a$ occurs at $x = a/(1+a)$ and it is $\log(1+a)$. Thus the hypothesis of subcase I implies that the maximum occurs on the line $\theta^{-1}x + \theta y = t$, and thus $x \le \theta t$, $y \le \theta^{-1}t$. Thus the maximum is not more than

+ E -I max (d l x log a + d 2y log b)


v a x+ay=t
d a-I log 1
- t min( dla log
la l - 8I v I' 2 Ia2 82 1v
)

+ E E(f * (at) d l + f*(a-It) d 2 ),


v

where $f^*(x) = f(x)$ if $0 \le x \le 1/2$ and $f^*(x) = \log 2$ if $1/2 \le x \le 1$.
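The claim about the absolute maximum of $f(x) + x\log a$, with $f(x) = x\log(1/x) + (1-x)\log(1/(1-x))$, can be confirmed numerically. The Python sketch below evaluates the expression on a uniform grid and compares the result with the closed form $\log(1+a)$ attained at $x = a/(1+a)$; the grid size and the function names are arbitrary choices made for illustration.

```python
import math


def f(x: float) -> float:
    """x*log(1/x) + (1-x)*log(1/(1-x)), extended by 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return x * math.log(1.0 / x) + (1.0 - x) * math.log(1.0 / (1.0 - x))


def objective(x: float, a: float) -> float:
    """The quantity f(x) + x*log(a) that is maximized in the text."""
    return f(x) + x * math.log(a)


def grid_maximum(a: float, steps: int = 10000):
    """Return (argmax, max) of the objective over a uniform grid of [0, 1]."""
    best_x = max((i / steps for i in range(steps + 1)), key=lambda x: objective(x, a))
    return best_x, objective(best_x, a)
```

Since the objective is concave in $x$, the grid maximum approximates the true maximum to second order in the grid spacing.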


If we put together the information obtained so far we deduce

Subcase II. $\theta^{-1}\frac{a}{1+a} + \theta\frac{b}{1+b} \ge t$. In this case

Suppose subcase I holds. We put together all the estimates for $\log|P(\beta_1,\beta_2)|_w$ with the product formula, and find

$+ \log h(P)$.

If instead subcase II holds, the above formula still holds.

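Finally we replace $P$ by $P^m$ in the above calculation and let $m \to \infty$. We obtain

$$\Delta^I P(\alpha_1,\alpha_2) = 0$$

for

$$\theta^{-1}\frac{i_1}{d_1} + \theta\frac{i_2}{d_2} \le t,$$

for some $\theta$, $t \le \theta \le t^{-1}$

and

Then we have

$$\le \log h(P) + d_1\big(\varepsilon_v\varepsilon f^*(\theta t) + \log h(\beta_1)\big) + d_2\big(\varepsilon_v\varepsilon f^*(\theta^{-1}t) + \log h(\beta_2)\big)$$

where $\varepsilon_v\varepsilon = 0$ if $v \nmid \infty$, $\varepsilon_v\varepsilon = \frac{1}{[k:\mathbb{Q}]}$ if $k_v = \mathbb{R}$, $\varepsilon_v\varepsilon = \frac{2}{[k:\mathbb{Q}]}$ if $k_v = \mathbb{C}$.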

As was remarked before, the condition $P(\beta_1,\beta_2) \ne 0$ is the hardest part to verify and usually one replaces $P$ by $\Delta^{I^*}P$ for some $I^*$. So if $\tau$ is real with

and if $\tau < t$ then we can apply the preceding theorem to $\Delta^{I^*}P$. This means replacing $t$ by $t - \tau$ and using Lemma 3 to estimate $h(\Delta^{I^*}P)$. Then we get

We now choose

and $\theta$ such that

$d_1\theta\log\frac{1}{|\alpha_1-\beta_1|_v}$

and let $D \to \infty$. We have proved

Thue-Siegel Principle. Let

$i = 1, 2$.

Then we have

Two more steps are needed: the estimation of $A_i$ and that of $\tau$.

Application of Dyson's Lemma.

Let us assume that $t \le \theta \le t^{-1}$ and let $P(x,y) \in k[x,y]$ be a polynomial of bidegree $d_1, d_2$ such that

$$\Delta^I P(\alpha_1,\alpha_2) = 0$$

for $\theta^{-1}\frac{i_1}{d_1} + \theta\frac{i_2}{d_2} < t$ and

$$\Delta^I P(\beta_1,\beta_2) = 0$$

for $\theta^{-1}\frac{i_1}{d_1} + \theta\frac{i_2}{d_2} < \tau$. Suppose that $\alpha_1, \alpha_2 \in K$ have degree $r$ over $k$. We have

Dyson's Lemma.
$$\frac{1}{2}rt^2 + \frac{1}{2}\tau^2 \le 1 + \frac{r-1}{2}\,\frac{d_2}{d_1}.$$

In terms of $\delta = d_2/d_1$, this yields

as $\delta \to 0$. If $\sqrt{2/(r+1)} < t < \sqrt{2/r}$ then $\tau < t$, which is what we need.
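The interplay between $t$ and $\tau$ can be made concrete with a few lines of Python. The sketch below assumes Dyson's Lemma in the form $\frac12 r t^2 + \frac12 \tau^2 \le 1 + \frac{r-1}{2}\delta$ with $\delta = d_2/d_1$, solves it for the largest admissible $\tau$, and returns the window of $t$ in which the limiting bound ($\delta \to 0$) forces $\tau < t$ while $t < \sqrt{2/r}$ still holds. The function names are illustrative.

```python
import math


def dyson_tau_bound(r: int, t: float, delta: float) -> float:
    """Largest tau allowed by (1/2) r t^2 + (1/2) tau^2 <= 1 + (r - 1)/2 * delta."""
    slack = 2.0 + (r - 1) * delta - r * t * t
    return math.sqrt(slack) if slack > 0 else 0.0


def admissible_window(r: int):
    """Interval (sqrt(2/(r+1)), sqrt(2/r)) of values of t, taken in the limit delta -> 0."""
    return math.sqrt(2.0 / (r + 1)), math.sqrt(2.0 / r)
```

At the left endpoint $t = \sqrt{2/(r+1)}$ the bound on $\tau$ equals $t$ exactly, which is where the window begins.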

Let $P_1, P_2, \ldots, P_L$, $L = N - r|G|$, be the set of linearly independent polynomials constructed in the section on Siegel's Lemma. Let $1 \le \ell \le L$. Since $P_1, P_2, \ldots, P_\ell$ are linearly independent we can find a linear combination of $P_1, \ldots, P_\ell$ which vanishes at $(\alpha_1,\alpha_2)$ to order $t_\ell$, namely

for

where $t_\ell$ is the largest for which

By Dyson's Lemma, this linear combination will not vanish at $(\beta_1,\beta_2)$ more than

for

with
1 2 1 2 r-I d 2
2 Tt .; 1 - 2 rt t + --2-- ~

d
= 1 1 2 __t _ + .E::.!. ...l. + o(L)
- 2" rt dId 2 2 dId2

Since a linear combination of $P_1, \ldots, P_\ell$ does not vanish at $(\beta_1,\beta_2)$ more than calculated before we see that one of them, say $P_{\ell'}$, does not vanish at $(\beta_1,\beta_2)$ in the same way. By considering either $P_\ell$ or $P_\ell + P_{\ell'}$, and replacing $P_\ell$ by $P_\ell + P_{\ell'}$, we see that we may suppose that $P_\ell$ itself does not vanish at $(\beta_1,\beta_2)$ more than stated before. In doing so, we may have to increase the height of $P_\ell$ by a factor of 2, or less. In conclusion we find that the polynomials $P_1, \ldots, P_\ell$ satisfy:

(a) $\Delta^I P_\ell(\alpha_1,\alpha_2) = 0$ for $\theta^{-1}\frac{i_1}{d_1} + \theta\frac{i_2}{d_2} \le t$;

(b) for each $\ell$ there is $I_\ell^*$ such that

$$\Delta^{I_\ell^*} P_\ell(\beta_1,\beta_2) \ne 0$$

and

with

(c) if the $P_\ell$ are the successive minima for (a) then

for some bounded $C$ (independent of $d_1, d_2$). In particular,

as $d_1, d_2 \to \infty$, $d_1 \ge d_2$.

We apply the preceding result to the case in which $d_1 \to \infty$, $d_2 \to \infty$, $d_2/d_1 \to 0$, $\log h(\beta_2) \to \infty$. We also note that in Dyson's Lemma the condition that $\alpha_2$ be of degree $r$ over $k$ can be removed and replaced by $\alpha_2$ of degree $s \ge 2$ over $k$, $\alpha_2 \in k(\alpha_1)$, and

[Viola 1984]. We obtain

Main Theorem. Let $K = k(\alpha_1)$ be of degree $r$ over $k$ and let $v$ be a place of $k$, extended to $K$. Let $\beta_1 \in k$ be an approximation to $\alpha_1$, in the sense that

Let $\alpha_2 \in k(\alpha_1)$, $\alpha_2 \notin k$. Then the effective type of irrationality for $\alpha_2$ over $k$ satisfies

Here $\tau = \sqrt{2 - rt^2}$, and $\log\gamma = \theta t\log\frac{1}{\theta t} + (1-\theta t)\log\frac{1}{1-\theta t}$ and $A_1$ is given by

$A_1 = \lim_{d_2 \to \infty,\ d_2/d_1 \to 0}$

with $A$ the matrix

indexed by rows $(\sigma,i_1,i_2)$ and columns $(j_1,j_2)$, with $\sigma$ ranging over conjugation of $K/k$, and

Proof. By the preceding theorem we have

$$(t - \tau_\ell)\min\Big(d_1\theta\log\frac{1}{|\alpha_1-\beta_1|_v},\ d_2\theta^{-1}\log\frac{1}{|\alpha_2-\beta_2|_v}\Big) \le \log h(P_\ell) + d_1\log(\gamma h(\beta_1)) + d_2\log(4h(\beta_2)).$$

If we take the average of this relation with respect to $\ell$ we get the result, because

$$\int_0^1 \sqrt{1-x}\,dx = \frac{2}{3}.$$

Applications.

If $\alpha_1 = \sqrt[r]{\xi}$, $\xi \in k$ then we can bound $A_1$ with some precision. There are several ways of doing it and the best one yields

we refer to [B-M 1983] for similar explicit estimates. For example, if $\alpha_1 = \sqrt[r]{a/b}$, $b > |a|$ and

$$\lambda = \frac{\log|b-a|}{\log b}$$

then $\alpha_1$ has an effective type

$$\mu_{\mathrm{eff}}(\alpha_1;\mathbb{Q}) \le \frac{2}{1-\lambda} + O\Big(\frac{1}{(\log b)^{1/3}}\Big)$$

[Bombieri-Mueller, 1983]. If $b$ is large and $\lambda < 1 - \frac{2}{r}$ this represents an improvement over the Liouville bound. Previous exponents, obtained with the Padé technique yielded

(Thue, Baker)
and

s + s + 0(_1_) (Chudnovsky) for s 1,2, ..• , r-l.


s - (s+l)A log b

In this case, one chooses $\beta_1 = 1$ and $|\alpha_1 - \beta_1|$ is of order $\frac{|b-a|}{b}$. Now $h(\alpha_1) = b^{1/r}$; hence

Since $h(\beta_1) = 1$ we see that $\eta_1$ is determined by $(\gamma e^{A_1})^{\eta_1} = h(\alpha_1)^{r(1-\lambda)}/c(r)$, hence

and

Choosing $\tau = (\log h(\alpha_1))^{-1/3}$ we obtain

as asserted.

At the other extreme of calculations of this type we have situations in which $h(\beta_1)$ is large. A typical example is the following. Let $\alpha_1$ be the root $\sim \frac{1}{m}$ of the equation

$$x^r - mx + 1 = 0,$$

where $m$ is a large integer; here we choose $k = \mathbb{Q}$, $v$ the infinite place, so that $|\ |_v$ is the ordinary absolute value. We have already computed the height of $\alpha_1$, with the bound

We choose $\beta_1 = \frac{1}{m}$, hence $h(\beta_1) = m$ and note that

is $\sim m^{-r-1} \sim (h(\alpha_1)h(\beta_1))^{-r}$; thus the pair $\alpha_1$, $\beta_1$ nearly satisfies the Liouville bound.
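The size of $|\alpha_1 - \beta_1|$ in this example is easy to confirm numerically. The Python sketch below assumes the equation is $x^r - mx + 1 = 0$ and that $\beta_1 = 1/m$ (both are readings of a damaged display), finds the small root by iterating $x \mapsto (1 + x^r)/m$, and rescales the distance to $1/m$ by $m^{r+1}$. The function names are illustrative.

```python
def small_root(r: int, m: int, iterations: int = 60) -> float:
    """Root of x^r - m*x + 1 = 0 near 1/m, via the fixed-point map x -> (1 + x^r)/m."""
    x = 1.0 / m
    for _ in range(iterations):
        x = (1.0 + x ** r) / m
    return x


def liouville_ratio(r: int, m: int) -> float:
    """(alpha - 1/m) * m^(r+1), computed as alpha^r * m^r to avoid cancellation.

    The identity alpha - 1/m = alpha^r / m follows directly from the equation."""
    alpha = small_root(r, m)
    return (alpha ** r / m) * float(m) ** (r + 1)
```

A ratio close to 1 means that $|\alpha_1 - 1/m|$ is of exact order $m^{-r-1}$, in line with the statement that the pair nearly satisfies the Liouville bound.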

It remains to estimate $H(A)$. This is a difficult problem. If one uses a Laplace expansion and uses the fact that no term in a row or column may appear twice, then one can prove
r 2
- '4 t
r t
__ 2 log h(a 1 ) + o( r 2)·
1 - - t
2 2

The Main Theorem now yields

For large r, this does not exceed

min ~ 2 (1 + ~) < 13.209446


a (1 - 3" a) 2a

with $a = .5674$. Thus if $r \ge r_0$ and $m \ge m(r)$ we have

It is easy to generalize the last example to equations of the sort

where $f(x) = x^s + a_1x^{s-1} + \cdots + a_s$ is a polynomial with bounded coefficients. What appears however of more interest is the fact that for every algebraic $\alpha$ we can find $k$ such that $\mu_{\mathrm{eff}}(\alpha;k,v)$ is small. The following is proved in [Bombieri-Mueller, 1986].

Theorem. Let $\alpha$ be a real algebraic number of degree $r \ge 3$ and let $\eta > 0$ be any positive constant. Then one can find infinitely many real algebraic number fields $k$ of degree $r - 1$ such that

In order to apply the Thue-Siegel Principle to such a situation we use Wirsing's result that real numbers admit very good approximations by algebraic numbers of fixed degree.

Proposition. Let $\alpha$, $|\alpha| \le 1/2$, be real algebraic of degree $r$ and height $H(\alpha)$. For every $X \ge 2$ there is $\beta$, algebraic of degree at most $r - 1$, such that

(r-1)2/ r
H(S) , 2 r (r(r+I)H(a») X

and

la - sl ' r!(r-I)
Xr

If we take $k = \mathbb{Q}(\beta)$ and $K = k(\alpha)$ one can then show that

$$\mu_{\mathrm{eff}}(\alpha;k,\infty) \le 2 + O\big((\log X)^{-1/3}\big).$$

Another type of application relates to Thue equations. The Thue-Siegel Principle can be used to obtain bounds for the number of solutions of equations $F(x,y) = c$, since every sufficiently large solution is an anchor pair with a root of $F(x,1) = 0$. This allows one to count in an efficient way the solutions to a Thue equation exceeding a certain bound. Coupled with a counting of the remaining "small" solutions, Bombieri and Schmidt proved

Theorem. Let $F(x,y)$ be an irreducible form over $\mathbb{Z}$ of degree $r \ge 3$. Then the number of solutions $\pm(x,y)$ to $|F(x,y)| = 1$ does not exceed $cr$, for some absolute constant $c$. If $r$ is large, one can take $215r$ for such a bound.

This result has been generalized to the so-called Thue-Mahler equation.
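The "small" solutions of a Thue equation can be listed by a direct search, which gives a feel for how few there are compared with a bound of the shape $cr$. The Python sketch below enumerates the solutions of $|F(x,y)| = 1$ in a box, identifying $(x,y)$ with $(-x,-y)$. The form $x^3 - 2y^3$ is an illustrative choice that does not come from the text, and a finite search of this kind says nothing about solutions outside the box.

```python
def thue_solutions(form, bound: int):
    """Integer pairs (x, y) with |x|, |y| <= bound and |form(x, y)| = 1,
    keeping one representative of each pair +-(x, y)."""
    found = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if abs(form(x, y)) == 1:
                # Lexicographically larger of (x, y) and (-x, -y) as the representative.
                found.add(max((x, y), (-x, -y)))
    return sorted(found)


def cubic_form(x: int, y: int) -> int:
    """Illustrative irreducible cubic form x^3 - 2*y^3."""
    return x ** 3 - 2 * y ** 3
```

Calling `thue_solutions(cubic_form, 40)` returns the representatives found in the box $|x|, |y| \le 40$.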

Further Applications.

We have not touched in these lectures upon the problem of proving the non-vanishing of the auxiliary polynomial at $\beta_1, \beta_2$, i.e. Dyson's Lemma or Roth's Lemma.

The classical argument goes by induction on the number of variables of $P$ and is roughly as follows. Suppose we know that

for $I = (i_1,i_2,\ldots,i_n) \in G$ and want to show that $P$ cannot vanish too much at some other point $(\beta_1,\ldots,\beta_n)$. If $n = 1$, we have discussed the situation in great detail:

a) The fundamental theorem of algebra
b) Gauss' Lemma
c) vanishing at or roots of unity.

Let us decompose $P$ as

$$P = \sum_j f_j(x')\,g_j(x_n)$$

where $x' = (x_1,\ldots,x_{n-1})$; we may assume that the $f_j$ are linearly independent, and so are the $g_j$. Because of linear independence, some generalized Wronskian of the $f_j$ is not identically 0, and so is some Wronskian of the $g_j$, say $W(f)$ and $W(g)$. But now this means that some generalized Wronskian of $P$, say $W(P)$, is non-zero and factorizes as a polynomial in $x'$ and a polynomial in $x_n$:

$$W(P) = W(f)W(g),$$

as one sees by

Now the vanishing of $P$ determines the vanishing of $W(P)$, which in turn determines the vanishing of $W(f)$ and $W(g)$, which are polynomials in a lower number of variables. Thus, by induction, we obtain a control on the amount of vanishing of $W(f)$, $W(g)$, hence of $W(P)$ and finally $P$ itself. The final result now depends on how one wants to control the start of the induction, namely a) or b). The technique in b) leads to the famous Roth's Lemma, which shows that if the heights of $\beta_1, \ldots, \beta_n$ go to $\infty$ sufficiently rapidly then $P$ has very limited vanishing at $\beta_1, \ldots, \beta_n$. The technique in a) leads for the case $n = 2$ to Dyson's Lemma ([Dyson 1947], [Bombieri 1982]). The main advantage in a) is the fact that no conditions on the height of $\beta_1, \beta_2$ are needed (recall that in applications, such as $\alpha_1 = \sqrt[r]{a/b}$, one may want to take $\beta_1 = 1$). On the other hand, the result one obtains in the case $n \ge 3$ is much weaker and it is not directly usable in applications; in particular, one could not obtain a new proof of Roth's theorem using Dyson's technique.

The new ideas needed in this direction were provided independently by C. Viola [Viola 1984] and H. Esnault and E. Viehweg [Esnault and Viehweg 1984], using methods from algebraic geometry. Viola's idea, so far carried out completely only in the case $n = 2$, relates the multiplicity of zeros to local contributions to the calculation of invariants of the curve $P = 0$, such as the genus. A global control of the genus (for example, genus $\ge 0$ if the curve is irreducible) yields an inequality which implies a sharp form of Dyson's Lemma. A nice feature of Viola's result is that it allows the case in which $\alpha_2$, which is an element of $k(\alpha_1)$, may have degree over $k$ strictly less than the degree of $\alpha_1$ over $k$. This means that the measure of effectivity obtained is valid for all elements of $k(\alpha_1)$ not in $k$ and not just for generators of $k(\alpha_1)$ over $k$. It is conceivable that examples may be found in which an irrationality type for $\alpha_2$ is obtained by constructing $\alpha_1$ with $\alpha_2 \in k(\alpha_1)$ and with exceptionally good approximations $\beta_1 \in \mathbb{Q}$; the degree of $\alpha_1$ could very well be much larger than the degree of $\alpha_2$ over $\mathbb{Q}$.

The approach of Esnault and Viehweg is based on algebraic geometry and it works in any dimension. It is not possible to describe here their technique, which uses very deep results such as variation of Hodge structures and Kawamata's vanishing theorems. Their result however is easily described. Let us say that $P$ has a zero of type $(a,t)$ at $\zeta$ if

whenever

$\sum_{\nu=1}^{n}$

Let $I(d,a,t)$ be the set

$I(d,a,t) = \{(\xi_\nu) \in I^n :$

where $I^n$ is the unit cube $0 \le \xi_\nu \le 1$. We have [Esnault and Viehweg, 1984]:

Dyson's Lemma. Assume that

a) $\zeta_\mu = (\zeta_{\mu,1},\ldots,\zeta_{\mu,n})$, $\mu = 1,\ldots,M$ are $M$ points in $\mathbb{C}^n$ such that $\zeta_{\mu,\nu} \ne \zeta_{\gamma,\nu}$ for $\mu \ne \gamma$ and $\nu = 1,\ldots,n$;

b) $a = (a_1,\ldots,a_n)$ has $a_i \ge 0$ and $t_\mu \ge 0$ for $\mu = 1,\ldots,M$;

c) $d_1 \ge d_2 \ge \cdots \ge d_n$.

Then if $P$ is a polynomial in $\mathbb{C}[x_1,\ldots,x_n]$ of multidegree $d_1,\ldots,d_n$, not identically 0, and if $P$ has a zero of type $(a,t_\mu)$ at $\zeta_\mu$ for $\mu = 1,\ldots,M$ we have

$$\sum_{\mu=1}^{M}\mathrm{Vol}(I(d,a,t_\mu)) \le \prod_{j=1}^{n}\Big(1 + (M'-2)\sum_{i=j+1}^{n}\frac{d_i}{d_j}\Big)$$

with $M' = \max(M,2)$.

Roughly speaking, this result shows that if $\frac{d_{i+1}}{d_i} \to 0$ for $i = 1,\ldots,n-1$ then the vanishing of $P$ at different points implies "almost independent" conditions on the coefficients of $P$, as long as we require vanishing of the same "type" and the technical condition a). As proved by Esnault and Viehweg, this implies the Roth theorem. There is another application of this result, which is worth mentioning here. Let $\alpha$ be real algebraic of degree $\ge 3$ and let us consider the problem of solutions to

$$\Big|\alpha - \frac{p}{q}\Big| < q^{-2-\varepsilon(q)}$$

where $\varepsilon(q) \to 0$ as $q \to \infty$. The so-called Cugiani-Mahler theorem asserts that if $\varepsilon(q) = c_1(\alpha)(\log\log\log q)^{-1/2}$, for a suitable $c_1(\alpha)$, then the sequence $\{q_i\}$ of solutions to the above inequality satisfies

$$\limsup \frac{\log q_{i+1}}{\log q_i} = \infty.$$

It is now possible to show that using Dyson's Lemma in place of Roth's Lemma in the proof of the Cugiani-Mahler theorem one reaches the same conclusion with the better value

$$\varepsilon(q) = c_2(\alpha)\Big(\frac{\log\log q}{\log\log\log q}\Big)^{-1/4};$$

the gain is thus almost of one logarithm, from a triple log decay to a double log.

It is conceivable that the several variable generalization of Dyson's Lemma can be used to improve our knowledge about effective approximations to algebraic numbers. Here the main obstacle appears to be that of estimating in an efficient way the height of the polynomials in the auxiliary construction. It would be of great interest to produce new examples, say from a three variables construction, which could not be treated equally well with the two variable construction of Thue and Siegel.

References.

E. Bombieri, On the Thue-Siegel-Dyson theorem, Acta Mathematica 148 (1982), 255-296.

E. Bombieri and J. Mueller, On effective measures of irrationality for $\sqrt[r]{a/b}$ and related numbers, J. Reine Angew. Math. 342 (1983), 173-196.

E. Bombieri and J. Mueller, Remarks on the approximation to an algebraic number by algebraic numbers, Michigan Math. J. 33 (1986), 83-93.

E. Bombieri and J. Vaaler, On Siegel's Lemma, Invent. Math. 73 (1983), 11-32. Addendum to "On Siegel's Lemma", Invent. Math. 75 (1984), 377.

F. Dyson, The approximation to algebraic numbers by rationals, Acta Mathematica 79 (1947), 225-240.

H. Esnault and E. Viehweg, Dyson's Lemma for polynomials in several variables (and the Theorem of Roth), Invent. Math. 78 (1984), 445-490.

N. I. Feldman, An effective refinement of the exponent in Liouville's theorem (Russian), Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971), 973-990; Math. USSR Izv. 5 (1971), 985-1002.

W. M. Schmidt, Diophantine Approximation, Lecture Notes in Math., 785, Springer, Berlin 1980.

C. L. Siegel, Über einige Anwendungen diophantischer Approximationen, Abh. der Preuss. Akad. der Wissenschaften. Phys.-Math. Kl. 1929, Nr. 1 (= Ges. Abh. I, 209-266).

C. Viola, On Dyson's Lemma, Ann. Scuola Norm. Sup. Pisa, 12 (1985), 105-135.

Enrico Bombieri,
Institute for Advanced Study,
Princeton, N.J., 08540, U.S.A.
POLYNOMIALS WITH LOW HEIGHT AND PRESCRIBED VANISHING

Enrico Bombieri and Jeffrey D. Vaaler*

1. Introduction.

In a recent paper [2] we obtained an improved formulation of Siegel's classical result ([9], Bd. I, p. 213, Hilfssatz) on small solutions of systems of linear equations. Our purpose here is to illustrate the use of this new version of Siegel's lemma in the problem of constructing a simple type of auxiliary polynomial. More precisely, let $k$ be an algebraic number field, $O_k$ its ring of integers, $\alpha_1, \alpha_2, \ldots, \alpha_J$ distinct, nonzero algebraic numbers (which are not necessarily in $k$), and $m_1, m_2, \ldots, m_J$ positive integers. We will be interested in determining nontrivial polynomials $P(X)$ in $O_k[X]$ which have degree less than $N$, vanish at each $\alpha_j$ with multiplicity at least $m_j$ and have low height. In particular, the height of such polynomials will be bounded from above by a simple function of the degrees and heights of the algebraic numbers $\alpha_j$ and the remaining data in the problem: $m_1, m_2, \ldots, m_J$, $N$ and the field constants associated with $k$.
This type of construction has been used recently by Mignotte [6], [7] (see also [8, pp. 281-288]) and by Dobrowolski [4]. Our bounds provide a simpler and somewhat sharper form of their results. In section 5 we consider the special case of polynomials in $\mathbb{Z}[X]$ which vanish at 1 with high multiplicity and yet have relatively low height.

If $N$ is sufficiently large, the set $S$ of polynomials in $k[X]$ which have degree less than $N$ and vanish at each $\alpha_j$ with multiplicity at least $m_j$, forms a vector space over $k$ of positive dimension $L$. An interesting feature of the Siegel lemma obtained in [2] is that it allows us to determine $L$ polynomials $P_1(X), P_2(X), \ldots, P_L(X)$ in $O_k[X] \cap S$ which form a basis for $S$ and are such that the average height of these polynomials is small.

Acknowledgement. We wish to thank Dr. D. Bump for calling our attention to references [5] and [10] on Schur polynomials.

* The research of the second author was supported by a grant from the National Science Foundation.

2. Statement of results.

We summarize, briefly, our notation for heights of algebraic numbers, vectors and matrices. This is identical to the notation used in [2]. We suppose that the number field $k$ has degree $d$ over $\mathbb{Q}$ and write $v$ for a place of $k$. Then $k_v$ is the completion of $k$ at $v$ and $[k_v : \mathbb{Q}_v] = d_v$ is the local degree. At each place $v$ we normalize an absolute value $|\ |_v$ as follows. If $v \mid \infty$ we set $|x|_v = |x|^{d_v/d}$, where $|\ |$ is the ordinary absolute value on $\mathbb{R}$ or $\mathbb{C}$. If $v$ is a finite place then $v \mid p$ for some rational prime $p$. In this case we require that $|p|_v = p^{-d_v/d}$. Because of our normalizations the product formula

$$\prod_v |\alpha|_v = 1$$

holds for $\alpha \in k$ and $\alpha \neq 0$. Also, it will be convenient to use a second normalized absolute value $\|\ \|_v$ at each place $v$. These are related by $\|\ \|_v^{d_v/d} = |\ |_v$.

We extend the definition of $|\ |_v$ to (column) vectors $x$ in $k^N$ with coordinates $x_n$ by $|x|_v = \max_n |x_n|_v$. The homogeneous height of a vector $x$ in $k^N$ is given by

$$h(x) = \prod_v |x|_v .$$

In view of the product formula we have $h(ax) = h(x)$ for all scalars $a \neq 0$. Thus $h$ is a height on the projective space $\mathbb{P}^{N-1}_k$. If $f(X)$ is a polynomial in $k[X]$ we write $h(f)$ for the height of the vector of coefficients of $f$.
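As a small illustration of the homogeneous height in the simplest case $k = \mathbb{Q}$, the following Python sketch computes $h(x)$ for a rational vector. It assumes the standard fact that, over $\mathbb{Q}$, the product over all places reduces to the largest ordinary absolute value of the entries once the vector has been scaled to have coprime integer coordinates. The function name `height_over_Q` is ours and does not appear in [2].

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def height_over_Q(x):
    """Homogeneous height h(x) of a nonzero rational vector, for k = Q only."""
    fracs = [Fraction(c) for c in x]
    if all(c == 0 for c in fracs):
        raise ValueError("the height is defined only for nonzero vectors")
    # Clear denominators: multiply by the lcm of all denominators.
    denom = reduce(lambda a, b: a * b // gcd(a, b),
                   (c.denominator for c in fracs), 1)
    ints = [int(c * denom) for c in fracs]
    # Remove the common factor, so that every finite place contributes 1.
    content = reduce(gcd, ints, 0)
    # Only the infinite place is left: the largest ordinary absolute value.
    return max(abs(n) for n in ints) // content
```

Scaling the vector by a nonzero rational number leaves the result unchanged, which is the identity $h(ax) = h(x)$ in this special case.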

Let $Y = (y_{mn})$ be an $M \times N$ matrix over $k$ with $\operatorname{rank}(Y) = M < N$. For each subset $I \subset \{1, 2, \dots, N\}$ of cardinality $|I| = M$ we write

$$Y_I = (y_{mn}), \quad m \in \{1, 2, \dots, M\},\ n \in I,$$

for the corresponding $M \times M$ submatrix. At each place $v$ we define a local height $H_v$ on matrices by

$$H_v(Y) = \Big( \sum_{|I| = M} \| \det Y_I \|_v^2 \Big)^{d_v/2d} \quad \text{if } v \mid \infty,$$

$$H_v(Y) = \max_{|I| = M} | \det Y_I |_v \quad \text{if } v \nmid \infty.$$

We then obtain a global height by setting

$$H(Y) = \prod_v H_v(Y).$$

For elements $\gamma$ in $k$ we use the inhomogeneous height

$$h_1(\gamma) = \prod_v \max\{1, |\gamma|_v\}.$$

If $\gamma$ has degree $d$ over $\mathbb{Q}$ then the quantity $(h_1(\gamma))^d$ is also the Mahler measure of the algebraic number $\gamma$, as defined, for example, in [8]. A basic property of each of our heights is that they do not depend on the field containing $\gamma$ or the entries in $x$ or $Y$.

As before we assume that $\alpha_1, \alpha_2, \dots, \alpha_J$ are distinct nonzero algebraic numbers with degrees $r_j = [k(\alpha_j) : k]$ over $k$. We also assume that for $i \neq j$ the minimal polynomials for $\alpha_i$ and $\alpha_j$ over $k$ have no common zeros. This allows us to avoid some trivial complications. We write

$$M = \sum_{j=1}^{J} m_j r_j$$

and let $N$ be a positive integer such that $N - M = L$ is positive. It follows easily that the vector space

$$S = \{ P(X) \in k[X] : \deg P < N,\ P^{(\mu)}(\alpha_j) = 0 \text{ for } \mu = 0, 1, \dots, m_j - 1 \text{ and } j = 1, 2, \dots, J \} \tag{2.1}$$

has dimension $L$ over $k$. If $q_j(X)$ is the minimal polynomial of $\alpha_j$ over $k$ and

$$Q(X) = \prod_{j=1}^{J} \{q_j(X)\}^{m_j},$$

then the polynomials $F_\ell(X) = X^{\ell-1} Q(X)$, $\ell = 1, 2, \dots, L$, clearly form a basis for $S$. Now by a basic result on heights ([1], section 2) we have

$$\sum_{\ell=1}^{L} \log h(F_\ell) = L \log h(Q) \le (N - M) \sum_{j=1}^{J} m_j r_j \log h_1(\alpha_j) + N^2 t\Big(\frac{M}{N}\Big), \tag{2.2}$$

where $t(\theta) = \theta(1 - \theta) \log 2$ for $0 \le \theta \le 1$. If the ratio $M/N$ is close to 1 we cannot expect to do much better than (2.2). On the other hand, when $M/N$ is near zero it is possible to determine $L$ polynomials in $S$ for which this bound can be substantially improved.

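Let $u(\theta)$ be defined for $0 \le \theta \le 1$ by $u(0) = u(1) = 0$ and, if $0 < \theta < 1$, by

$$u(\theta) = \frac{1}{4}\theta^2 \log\Big(\frac{1 - \theta^2}{16\theta^2}\Big) + \frac{1}{2}\theta \log\Big(\frac{1 + \theta}{1 - \theta}\Big) + \frac{1}{4}\log(1 - \theta^2).$$

The function $u(\theta)$ is continuous and satisfies the inequalities

$$u(\theta) < t(\theta) \tag{2.3}$$

and

$$u(\theta) < \frac{1}{2}\theta^2 \log\Big(\frac{1}{4\theta}\Big) + \frac{3}{4}\theta^2, \tag{2.4}$$

for $0 < \theta < 1$. To establish (2.3) on the interval $0 < \theta \le 2^{-1/2}$ we note that $t(0) - u(0) = 0$, $t(2^{-1/2}) - u(2^{-1/2}) > 0$, and

$$t''(\theta) - u''(\theta) = -\frac{1}{2}\log\Big(\frac{1 - \theta^2}{\theta^2}\Big) \le 0.$$

On the remaining interval $2^{-1/2} < \theta < 1$ we have $t''(\theta) - u''(\theta) > 0$. Now we use $t(1) - u(1) = t'(1-) - u'(1-) = 0$ to show that

$$t(\theta) - u(\theta) = \int_\theta^1 (\xi - \theta)\{t''(\xi) - u''(\xi)\}\, d\xi.$$

This proves (2.3). The second inequality follows from the series expansion

$$u(\theta) = \frac{1}{2}\theta^2 \log\Big(\frac{1}{4\theta}\Big) + \frac{3}{4}\theta^2 - \sum_{n=2}^{\infty} \{4(2n - 1)(n^2 - n)\}^{-1}\theta^{2n}.$$

Of course (2.3) is sharp when $\theta$ is near one while (2.4) is most useful for values of $\theta$ near zero.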
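The inequalities (2.3) and (2.4) are easy to check numerically. The sketch below evaluates $u$ from its closed form, compares it with a midpoint-rule approximation of $\frac{1}{2}\int_{-\theta}^{\theta}(\theta - |y|)\log\frac{1+y}{\theta+y}\,dy$ (the integral appearing in the proof of Lemma 3, rescaled by $N$), and then with the two upper bounds. The helper names and the step count are our own choices, and floating-point agreement is of course not a proof.

```python
import math


def t(theta):
    """t(theta) = theta (1 - theta) log 2, as in (2.2)."""
    return theta * (1 - theta) * math.log(2)


def u(theta):
    """Closed form of u(theta) for 0 <= theta <= 1."""
    if theta in (0.0, 1.0):
        return 0.0
    th2 = theta * theta
    return (0.25 * th2 * math.log((1 - th2) / (16 * th2))
            + 0.5 * theta * math.log((1 + theta) / (1 - theta))
            + 0.25 * math.log(1 - th2))


def u_integral(theta, steps=20000):
    """Midpoint rule for (1/2) * int_{-theta}^{theta} (theta - |y|) log((1+y)/(theta+y)) dy."""
    width = 2 * theta / steps
    total = 0.0
    for i in range(steps):
        y = -theta + (i + 0.5) * width
        total += (theta - abs(y)) * math.log((1 + y) / (theta + y))
    return 0.5 * total * width


def bound_24(theta):
    """Right-hand side of (2.4)."""
    return 0.5 * theta ** 2 * math.log(1 / (4 * theta)) + 0.75 * theta ** 2


for theta in (0.1, 0.5, 0.9):
    print(theta, u(theta), u_integral(theta), t(theta), bound_24(theta))
```

In each printed row the first two values should agree to several decimal places, and both should lie below the last two.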

Theorem 1. There exist polynomials $P_1(X), \dots, P_L(X)$ in $O_k[X]$ which form a basis for the space $S$, defined by (2.1), and satisfy

$$\sum_{\ell=1}^{L} \log h(P_\ell) \le (N - M)\sum_{j=1}^{J} m_j r_j \log h_1(\alpha_j) + N^2 u\Big(\frac{M}{N}\Big) + (N - M)\log c_k. \tag{2.5}$$

Here $c_k = (2/\pi)^{s/d}|\Delta_k|^{1/2d}$, where $s$ is the number of complex places of $k$ and $\Delta_k$ is the discriminant of $k$.

There are alternative bounds which can be obtained from our method and in some situations these may be sharper than (2.5):

$$\sum_{\ell=1}^{L} \log h(P_\ell) \le \sum_{j=1}^{J} (N - m_j r_j)\, m_j r_j \log h_1(\alpha_j) + N^2 \sum_{j=1}^{J} u\Big(\frac{m_j r_j}{N}\Big) + (N - M)\log c_k, \tag{2.6}$$

and

$$\sum_{\ell=1}^{L} \log h(P_\ell) \le \sum_{j=1}^{J} (N - m_j)\, m_j r_j \log h_1(\alpha_j) + N^2 \sum_{j=1}^{J} r_j\, u\Big(\frac{m_j}{N}\Big) + (N - M)\log c_k. \tag{2.7}$$

For example, $u(\theta)$ is convex on $[0, (17)^{-1/2}]$ and $u(0) = 0$. It follows that

$$\sum_{j=1}^{J} r_j\, u\Big(\frac{m_j}{N}\Big) \le \sum_{j=1}^{J} u\Big(\frac{m_j r_j}{N}\Big) \le u\Big(\frac{M}{N}\Big)$$

whenever $M \le (17)^{-1/2} N$. Thus the bounds (2.6) and (2.7) are most useful when $\log h_1(\alpha_j)$ is small on average. In particular, if each $\alpha_j$ is a root of unity then (2.7) is clearly sharper than (2.5).

3. Preliminary lemmas.

Let $I \subset \{0, 1, 2, \dots, N-1\}$ with $|I| = M$ and $I = \{n_1 < n_2 < \dots < n_M\}$. We define polynomials $Q_I(x)$ and $P_I(x)$ in $M$ variables as follows. We set

$$Q_I(x) = \det\big( x_i^{n_j} \big),$$

where $i = 1, 2, \dots, M$ indexes rows and $j = 1, 2, \dots, M$ indexes columns, and

$$Q_I(x) = V(x) P_I(x), \tag{3.1}$$

where

$$V(x) = \prod_{1 \le i < j \le M} (x_j - x_i)$$

is the Vandermonde determinant. The polynomials $P_I(x)$ are the Schur polynomials (or S-functions) whose basic properties are given in Macdonald [5] and Stanley [10]. Clearly $Q_I$ and $P_I$ each have integer coefficients. In fact, $P_I$ has nonnegative integer coefficients. This will be useful for our purposes and is contained in [5, p. 42, equation (5.12)] and [10, p. 181, Theorem 10.1]. If we evaluate $P_I$ at the vector all of whose coordinates are 1 we find that

$$(1!)(2!) \cdots ((M-1)!)\, P_I(1, 1, \dots, 1) = \prod_{1 \le i < j \le M} (n_j - n_i) \tag{3.2}$$

([5, pp. 27-28]).


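Lemma 3. If $M$ and $N$ are integers, $1 \le M \le N$, then

$$\log\Big\{ \sum_{|I| = M} P_I(1, 1, \dots, 1)^2 \Big\} = \sum_{m=-(M-1)}^{M-1} (M - |m|)\log\Big(\frac{N+m}{M+m}\Big) \le 2N^2 u\Big(\frac{M}{N}\Big). \tag{3.3}$$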

N-l
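Proof. Let $\sigma(\mu) = \sum_{n=0}^{N-1} n^\mu$ (with $\sigma(0) = N$) and let

$$D_{m-1} = \det\{\sigma(\mu + \nu)\},$$

$\mu = 0, 1, 2, \dots, m-1$, and $\nu = 0, 1, 2, \dots, m-1$, be the corresponding Hankel determinant. Here we assume that $0 \le m \le N-1$ with $D_{-1} = 1$. If $A$ denotes the $N \times M$ matrix

$$A = \big( (i-1)^{j-1} \big), \quad i = 1, 2, \dots, N,\ j = 1, 2, \dots, M,$$

then $\det\{A^T A\} = D_{M-1}$. When we expand $\det\{A^T A\}$ using the Cauchy-Binet formula we find that

$$D_{M-1} = \sum_{|I| = M} \prod_{1 \le i < j \le M} (n_j - n_i)^2 = \{(1!)(2!) \cdots ((M-1)!)\}^2 \sum_{|I| = M} P_I(1, 1, \dots, 1)^2. \tag{3.4}$$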

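The determinants $D_{m-1}$ occur in the construction of orthonormal polynomials on the set $\{0, 1, 2, \dots, N-1\}$. Specifically, the polynomials

$$p_m(x) = (D_{m-1} D_m)^{-1/2} \det\begin{pmatrix} \sigma(0) & \sigma(1) & \cdots & \sigma(m) \\ \sigma(1) & \sigma(2) & \cdots & \sigma(m+1) \\ \vdots & \vdots & & \vdots \\ \sigma(m-1) & \sigma(m) & \cdots & \sigma(2m-1) \\ 1 & x & \cdots & x^m \end{pmatrix} \tag{3.5}$$

have degree $m$, where $m = 0, 1, 2, \dots, N-1$, and are orthonormal on $\{0, 1, 2, \dots, N-1\}$ (see Szegő [11, p. 27, equation (2.2.6)]). That is,

$$\sum_{\xi=0}^{N-1} p_m(\xi) p_n(\xi) = 0 \tag{3.6}$$

if $0 \le m < n \le N-1$, and

$$\sum_{\xi=0}^{N-1} \{p_m(\xi)\}^2 = 1. \tag{3.7}$$

From (3.5) we see that

$$p_m(x) = \Big(\frac{D_{m-1}}{D_m}\Big)^{1/2} x^m + \cdots \tag{3.8}$$

for each $m$.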

A second representation for the polynomials $p_m(x)$, $m = 0, 1, \dots, N-1$, occurs in a paper of Chebyshev [3]. More precisely, Chebyshev showed that the polynomials

$$t_m(x) = m!\, \Delta^m \binom{x}{m}\binom{x-N}{m}, \tag{3.9}$$

where $\Delta$ is the finite difference operator, satisfy (3.6) and

$$\sum_{\xi=0}^{N-1} \{t_m(\xi)\}^2 = (2m+1)^{-1} \prod_{i=-m}^{m} (N+i), \tag{3.10}$$

(see [3, p. 552, equation 10] or [11, p. 34, equation (2.3.4)]). Let $T(m,N)$ denote the function on the right of (3.10). Polynomials $p_m$ having degree $m$, positive leading coefficient and satisfying (3.6) and (3.7) are unique. It follows that $p_m(x) = T(m,N)^{-1/2}\, t_m(x)$ for each $m = 0, 1, 2, \dots, N-1$. From (3.9) we have

$$t_m(x) = \binom{2m}{m} x^m + \cdots,$$

and therefore

$$\log D_m - \log D_{m-1} = \log\Big\{ T(m,N) \binom{2m}{m}^{-2} \Big\}. \tag{3.11}$$

Finally, we sum (3.11) over $m$ in the set $\{0, 1, 2, \dots, M-1\}$ to obtain

$$\log D_{M-1} = \sum_{m=0}^{M-1} \log\Big\{ T(m,N) \binom{2m}{m}^{-2} \Big\}. \tag{3.12}$$

Of course the right hand side of (3.12) is known, and when combined with (3.4) leads to the identity in (3.3).
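The chain (3.2), (3.4), (3.12) can be checked by exact arithmetic for small $M$ and $N$. The sketch below assumes the evaluation $(1!)(2!)\cdots((M-1)!)\,P_I(1,\dots,1) = \prod_{i<j}(n_j - n_i)$, so that no Schur polynomial has to be expanded. All function names are ours.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial, prod


def sum_sq_vandermonde(N, M):
    """Sum over I = {n_1 < ... < n_M} in {0..N-1} of prod_{i<j} (n_j - n_i)^2."""
    total = 0
    for I in combinations(range(N), M):
        v = prod(I[j] - I[i] for i in range(M) for j in range(i + 1, M))
        total += v * v
    return total


def hankel_via_chebyshev(N, M):
    """D_{M-1} computed as the product of T(m,N) * binom(2m,m)^(-2), m < M."""
    result = Fraction(1)
    for m in range(M):
        T = Fraction(prod(N + i for i in range(-m, m + 1)), 2 * m + 1)
        result *= T / comb(2 * m, m) ** 2
    return result


def sum_sq_schur_at_ones(N, M):
    """Sum over I of P_I(1,...,1)^2, using the evaluation assumed above."""
    superfactorial = prod(factorial(k) for k in range(1, M))
    return Fraction(sum_sq_vandermonde(N, M), superfactorial ** 2)


def rhs_of_identity(N, M):
    """exp of the middle expression in (3.3), as an exact rational number."""
    result = Fraction(1)
    for m in range(-(M - 1), M):
        result *= Fraction(N + m, M + m) ** (M - abs(m))
    return result


print(sum_sq_schur_at_ones(4, 3), rhs_of_identity(4, 3))
```

The brute-force sum grows like $\binom{N}{M}$, so this is only practical for small parameters.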
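To establish the upper bound in (3.3) we set

$$F(x) = \int_{-x}^{x} \log\Big(\frac{N+y}{M+y}\Big)\, dy$$

and note that $\log\big(\frac{N+y}{M+y}\big)$ is positive, decreasing and convex for $-M < y < M$. For $0 \le m \le M-1$ it follows that

$$\sum_{\ell=-m}^{m} \log\Big(\frac{N+\ell}{M+\ell}\Big) \le \sum_{\ell=-m}^{m} \int_{\ell - 1/2}^{\ell + 1/2} \log\Big(\frac{N+y}{M+y}\Big)\, dy = F\big(m + \tfrac{1}{2}\big) \le \int_m^{m+1} F(x)\, dx.$$

Therefore we have

$$\sum_{m=-(M-1)}^{M-1} (M - |m|)\log\Big(\frac{N+m}{M+m}\Big) = \sum_{m=0}^{M-1} \sum_{\ell=-m}^{m} \log\Big(\frac{N+\ell}{M+\ell}\Big) \le \sum_{m=0}^{M-1} \int_m^{m+1} F(x)\, dx = \int_{-M}^{M} (M - |x|)\log\Big(\frac{N+x}{M+x}\Big)\, dx,$$

and the last integral is equal to $2N^2 u(M/N)$.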

Next we suppose that $\beta_1, \beta_2, \dots, \beta_J$ are distinct nonzero algebraic numbers and $m_1, m_2, \dots, m_J$ are positive integers with $M = \sum_{j=1}^{J} m_j$. Throughout the remainder of this section we work in the number field $K = \mathbb{Q}(\beta_1, \beta_2, \dots, \beta_J)$. We associate an $m_j \times N$ matrix $B_j$ with each $\beta_j$ by setting

$$B_j = \Big( \binom{n}{\mu} \beta_j^{\,n-\mu} \Big),$$

where $\mu = 0, 1, 2, \dots, m_j - 1$ indexes rows and $n = 0, 1, 2, \dots, N-1$ indexes columns. Then we assemble these into an $M \times N$ matrix

$$B = \begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_J \end{pmatrix}.$$

Theorem 4. If $1 \le M < N$ then the matrix $B$ has rank $M$ and satisfies

$$\log H(B) \le (N - M)\sum_{j=1}^{J} m_j \log h_1(\beta_j) + N^2 u\Big(\frac{M}{N}\Big). \tag{3.13}$$

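Proof. If $\nu$ is a lattice point in $\mathbb{Z}^M$ with nonnegative coordinates $\nu_m$, $m = 1, 2, \dots, M$, we define the partial differential operator $D^\nu$ by

$$D^\nu = \prod_{m=1}^{M} \frac{1}{\nu_m!}\Big(\frac{\partial}{\partial x_m}\Big)^{\nu_m}.$$

Then we fix a lattice point $\lambda$ in $\mathbb{Z}^M$ by setting

$$\lambda^T = (0, 1, 2, \dots, m_1 - 1,\ 0, 1, 2, \dots, m_2 - 1,\ \dots,\ 0, 1, 2, \dots, m_J - 1).$$

For each subset $I \subset \{0, 1, 2, \dots, N-1\}$ with $I = \{n_1 < n_2 < \dots < n_M\}$ we find that

$$D^\lambda Q_I(x) = \det\Big\{ \binom{n_j}{\lambda_i} x_i^{\,n_j - \lambda_i} \Big\}. \tag{3.14}$$

Now let $b$ denote the vector

$$b^T = (\beta_1, \dots, \beta_1,\ \beta_2, \dots, \beta_2,\ \dots,\ \beta_J, \dots, \beta_J). \tag{3.15}$$

In (3.15) there are $m_1$ coordinates equal to $\beta_1$, followed by $m_2$ coordinates equal to $\beta_2$, and so forth, ending with $m_J$ coordinates equal to $\beta_J$. When (3.14) is evaluated at $x = b$ we obtain

$$D^\lambda Q_I(b) = \det B_I.$$

By using (3.1) and the product rule for derivatives we have

$$D^\lambda Q_I(x) = \sum_{0 \le \nu \le \lambda} \{D^\nu V(x)\}\{D^{\lambda - \nu} P_I(x)\}. \tag{3.16}$$

When the right side of (3.16) is evaluated at $x = b$, each term in the sum with $0 \le \nu < \lambda$ is zero. This can be seen as follows. Let $\nu$ be fixed with $0 \le \nu < \lambda$ and let $s_j = \sum_{\ell=1}^{j} m_\ell$, with $s_0 = 0$. Then for some integer $j$, $1 \le j \le J$, we have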
Thus there must be two distinct integers $t_1$ and $t_2$ such that $s_{j-1} + 1 \le t_1 < t_2 \le s_j$ and $\nu_{t_1} = \nu_{t_2}$. It follows that

$$D^\nu V(b) = 0,$$

since the determinant has two identical rows. This establishes our assertion and the identity (the confluent case of (3.1))

$$D^\lambda Q_I(b) = \{D^\lambda V(b)\}\, P_I(b). \tag{3.17}$$

The first factor on the right of (3.17) can be explicitly given as

$$D^\lambda V(x)\big|_{x=b} = \prod_{1 \le i < j \le J} (\beta_j - \beta_i)^{m_i m_j} = \gamma.$$

Clearly $\gamma \neq 0$ and in particular $\det(B_I) \neq 0$ when $I = \{0, 1, 2, \dots, M-1\}$. This shows that $B$ has rank $M$.
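Next we write

$$\tilde{P}_I(y_1, \dots, y_J) = P_I(y_1, \dots, y_1,\ y_2, \dots, y_2,\ \dots,\ y_J, \dots, y_J),$$

where each $y_j$ is repeated $m_j$ times, and note that from (3.1), $\tilde{P}_I$ has degree at most $m_j(N - M)$ in the variable $y_j$. By the product formula

$$H(B) = \prod_{v \nmid \infty} \Big\{ \max_I |\tilde{P}_I(\beta_1, \dots, \beta_J)|_v \Big\} \prod_{v \mid \infty} \Big\{ \sum_I \|\tilde{P}_I(\beta_1, \dots, \beta_J)\|_v^2 \Big\}^{d_v/2d}, \tag{3.18}$$

where $d = [K : \mathbb{Q}]$. If $v \nmid \infty$ then

$$\max_I |\tilde{P}_I(\beta_1, \dots, \beta_J)|_v \le \prod_{j=1}^{J} \big( \max\{1, |\beta_j|_v\} \big)^{m_j(N-M)}.$$

If $v \mid \infty$ we use the fact that $P_I$, and hence $\tilde{P}_I$, has nonnegative coefficients, so that

$$\|\tilde{P}_I(\beta_1, \dots, \beta_J)\|_v \le \tilde{P}_I(1, 1, \dots, 1) \prod_{j=1}^{J} \big( \max\{1, \|\beta_j\|_v\} \big)^{m_j(N-M)}.$$

Combining these estimates we find that

$$\log H(B) \le \sum_{j=1}^{J} m_j (N - M) \sum_{v \nmid \infty} \log^+ |\beta_j|_v + \sum_{j=1}^{J} m_j(N - M) \sum_{v \mid \infty} \frac{d_v}{d} \log^+ \|\beta_j\|_v + \sum_{v \mid \infty} \frac{d_v}{2d} \log\Big\{ \sum_I \tilde{P}_I(1, 1, \dots, 1)^2 \Big\}$$

$$= (N - M) \sum_{j=1}^{J} m_j \log h_1(\beta_j) + \frac{1}{2} \log\Big\{ \sum_I \tilde{P}_I(1, 1, \dots, 1)^2 \Big\}.$$

Of course $\tilde{P}_I(1, 1, \dots, 1) = P_I(1, 1, \dots, 1)$ and therefore the proof is completed by appealing to Lemma 3.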

4. Proof of Theorem 1.

With each algebraic number $\alpha_j$ we associate an $m_j \times N$ matrix $A_j$ defined by

$$A_j = \Big( \binom{n}{\mu} \alpha_j^{\,n-\mu} \Big),$$

where $\mu = 0, 1, 2, \dots, m_j - 1$ indexes rows and $n = 0, 1, 2, \dots, N-1$ indexes columns. Now let $F$ be a number field which is a Galois extension of $k$ and a Galois extension of each of the fields $k(\alpha_j)$, $j = 1, 2, \dots, J$. If $G(F/k)$ is the Galois group of $F$ over $k$, and $G(F/k(\alpha_j))$ is the Galois group of $F$ over $k(\alpha_j)$, then $G(F/k(\alpha_j))$ is a
subgroup of $G(F/k)$ having index $[k(\alpha_j) : k] = r_j$. Let $\sigma_1^{(j)}, \sigma_2^{(j)}, \dots, \sigma_{r_j}^{(j)}$ be a set of distinct representatives of the cosets of $G(F/k(\alpha_j))$ in $G(F/k)$. For each $j$ we write $Z_j$ for the $m_j r_j \times N$ matrix formed by stacking the conjugate matrices $\sigma_i^{(j)}(A_j)$, $i = 1, 2, \dots, r_j$, where $\sigma_i^{(j)}$ is applied to each entry of $A_j$.

Finally, we assemble the matrices $Z_j$, $j = 1, 2, \dots, J$, into an $M \times N$ matrix $Z$ (where $M = \sum_{j=1}^{J} m_j r_j$).

Now suppose that $x$ is a nonzero vector in $(O_k)^N$ such that

$$A_j x = 0, \quad j = 1, 2, \dots, J. \tag{4.1}$$

Then form the polynomial

$$P(Y) = \sum_{n=0}^{N-1} x_n Y^n$$

having $x$ as its vector of coefficients. Of course the equations (4.1) are equivalent to the vanishing conditions which we wish to impose on $P(Y)$, namely,

$$P^{(\mu)}(\alpha_j) = 0, \quad \mu = 0, 1, 2, \dots, m_j - 1,$$

for each $j$, $j = 1, 2, \dots, J$. Therefore we apply the general form of Siegel's Lemma given in [2] as Theorem 14. By that result there exist $N - M$ linearly independent vectors $x_1, x_2, \dots, x_{N-M}$ in $(O_k)^N$ which satisfy (4.1) and

$$\sum_{\ell=1}^{N-M} \log h(x_\ell) \le (N - M) \log c_k + \log H(Z). \tag{4.2}$$

The matrix $Z$ has precisely the same structure as the matrix $B$ of Theorem 4, but now the set $\{\beta_1, \dots, \beta_J\}$ used to construct $B$ consists of the $\sum_{j=1}^{J} r_j$ distinct nonzero algebraic numbers in the set

$$\{\sigma_i^{(j)}(\alpha_j) : i = 1, 2, \dots, r_j \text{ and } j = 1, 2, \dots, J\}.$$

In the matrix $Z$ the integers $m_j$ correspond to $\sigma_i^{(j)}(\alpha_j)$ for each value of the index $i$, $i = 1, 2, \dots, r_j$, and so $M = \sum_{j=1}^{J} m_j r_j$. Thus by Theorem 4 we have

$$\log H(Z) \le (N - M) \sum_{j=1}^{J} m_j r_j \log h_1(\alpha_j) + N^2 u\Big(\frac{M}{N}\Big).$$

This completes our proof of Theorem 1.

To establish Corollary 2 we use the inequality

$$\log H(Z) \le \sum_{j=1}^{J} \log H(Z_j) \le \sum_{j=1}^{J} r_j \log H(A_j), \tag{4.3}$$

where $Z_j$ denotes the block of $Z$ formed from the conjugates of $A_j$. This follows from [2, equation (2.6)]. We apply Theorem 4 to obtain an upper bound for $\log H(Z_j)$. When this upper bound is combined with (4.2) and (4.3) we find that the inequality (2.6) holds. In a similar manner we deduce (2.7) from (4.3) by using Theorem 4 to bound $\log H(A_j)$.

5. Polynomials which vanish at 1.

We apply Theorem 1 in the special case $k = \mathbb{Q}$, $J = 1$, $\alpha_1 = 1$ and so $r_1 = 1$. It follows that for $1 \le m_1 < N$ there exist $L = N - m_1$ linearly independent polynomials $P_1, P_2, \dots, P_L$ in $\mathbb{Z}[X]$ having degree less than $N$, vanishing at 1 with multiplicity at least $m_1$ and satisfying

$$\sum_{\ell=1}^{L} \log h(P_\ell) \le N^2 u\Big(\frac{m_1}{N}\Big).$$

If we arrange the polynomials $P_\ell$ in order of increasing height we find that

$$(N - m_1) \log h(P_1) \le N^2 u\Big(\frac{m_1}{N}\Big). \tag{5.1}$$

It will be convenient to combine (5.1) and (2.4) as follows.


Corollary 5. Let $m_1$ and $N$ be integers with $1 \le m_1 < N$. Then there exists a nontrivial polynomial $P_1(X)$ in $\mathbb{Z}[X]$ having degree less than $N$, vanishing at 1 with multiplicity at least $m_1$ and satisfying

$$N^{-1}\log h(P_1) \le \frac{1}{2}\Big(\frac{m_1}{N}\Big)^2 \log\Big(\frac{N}{m_1}\Big)(1 + o(1)). \tag{5.2}$$

Here $o(1)$ denotes a function of $m_1$ and $N$ which tends to zero as $N \to \infty$ in such a way that $m_1/N \to 0$.

We note that the left and right hand sides of (3.13) are asymptotically equal as $N \to \infty$ in the special case $k = \mathbb{Q}$, $J = 1$, and $\beta_1 = 1$, which leads to Corollary 5. For this reason we expect the bound (5.2) to be rather sharp. In fact, under somewhat more restrictive conditions, we will show that a polynomial of low height cannot vanish at 1 with too high a multiplicity.
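To get a feeling for how height and multiplicity at 1 interact, the following sketch compares two explicit polynomials in $\mathbb{Z}[X]$. Neither is taken from the constructions above: $(X-1)^m$ is the obvious choice, and $\prod_{i=0}^{m-1}(X^{2^i}-1)$ is a classical example all of whose coefficients are $\pm 1$. The helper functions are our own, and coefficient lists are in ascending order of degree.

```python
def poly_mul(a, b):
    """Multiply two integer polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def multiplicity_at_one(coeffs):
    """Order of vanishing at X = 1 of a nonzero polynomial, found by
    repeated synthetic division by (X - 1)."""
    desc = list(reversed(coeffs))          # leading coefficient first
    mult = 0
    while len(desc) > 1:
        partial, acc = [], 0
        for c in desc:                     # Horner scheme at X = 1
            acc += c
            partial.append(acc)
        if partial[-1] != 0:               # the remainder is the value at 1
            break
        desc = partial[:-1]                # quotient by (X - 1)
        mult += 1
    return mult


def power_of_x_minus_one(m):
    """(X - 1)^m."""
    p = [1]
    for _ in range(m):
        p = poly_mul(p, [-1, 1])
    return p


def product_of_binomials(m):
    """prod_{i=0}^{m-1} (X^(2^i) - 1), of degree 2^m - 1."""
    p = [1]
    for i in range(m):
        p = poly_mul(p, [-1] + [0] * (2 ** i - 1) + [1])
    return p


def max_coefficient(coeffs):
    """Largest absolute value of a coefficient."""
    return max(abs(c) for c in coeffs)


for name, poly in (("(X-1)^6", power_of_x_minus_one(6)),
                   ("product", product_of_binomials(6))):
    print(name, len(poly) - 1, max_coefficient(poly), multiplicity_at_one(poly))
```

Both polynomials vanish at 1 to order 6. The first has degree 6 and largest coefficient 20, the second has degree 63 and largest coefficient 1, so the lower height is bought at the price of a larger degree. This is the kind of trade-off that (5.2) and (5.4) quantify.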

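Theorem 6. Let $G(X)$ be a nontrivial polynomial in $\mathbb{Q}[X]$ having degree less than $N$ and vanishing at 1 with multiplicity $e_1$. If $N \to \infty$ and $e_1 \to \infty$ in such a way that

$$\frac{e_1}{N} \to 0 \quad\text{and}\quad \frac{(N \log N)^{1/2}}{e_1} \to 0, \tag{5.3}$$

then

$$\frac{1}{4}\Big(\frac{e_1}{N}\Big)^2 (1 + o(1)) \le N^{-1}\log h(G). \tag{5.4}$$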

Proof. We will work over the field $k = \mathbb{Q}$. Let $F_m$ denote the $m$-th cyclotomic polynomial. If $g(X)$ is a nontrivial polynomial in $\mathbb{Q}[X]$ with $F_m \nmid g$ we define

$$L_m(g) = \sum_{v \nmid \infty} \log|g|_v + \varphi(m)^{-1}\log|\mathrm{Res}\{F_m, g\}|_\infty.$$

Here $\mathrm{Res}\{F_m, g\}$ is the resultant of $F_m$ and $g$, $\varphi$ is Euler's $\varphi$-function, and $|g|_v$ is the absolute value $|\ |_v$ applied to the vector of coefficients of $g$. It is clear from the product formula that

$$L_m(\beta g) = L_m(g)$$

for each $\beta \neq 0$ in $\mathbb{Q}$. Also, we have

$$L_m(g_1 g_2) = L_m(g_1) + L_m(g_2). \tag{5.5}$$

If $g \in \mathbb{Z}[X]$ is irreducible in $\mathbb{Z}[X]$ then

$$\sum_{v \nmid \infty} \log|g|_v = 0,$$

the resultant of $F_m$ and $g$ is a nonzero integer, and therefore

$$L_m(g) \ge 0. \tag{5.6}$$

Since an arbitrary polynomial $g(X)$ in $\mathbb{Q}[X]$ can be factored into a rational number times a finite product of irreducible polynomials in $\mathbb{Z}[X]$ it follows that (5.6) holds generally.

Now suppose that $G(X)$ is a nontrivial polynomial in $\mathbb{Q}[X]$ having degree less than $N$ and vanishing at 1 with multiplicity $e_1$. Then we may write

$$G(X) = \prod_{n=1}^{\infty} \{F_n(X)\}^{e_n}\, Q(X) \tag{5.7}$$

where each $e_n$ is a nonnegative integer and $Q$ is not divisible by any cyclotomic polynomials. Let

$$G_m(X) = \frac{1}{e_m!}\Big(\frac{d}{dX}\Big)^{e_m} G(X),$$

so that $F_m \nmid G_m$ and $G_m$ is divisible by $(X - 1)$ with multiplicity greater than or equal to $(e_1 - e_m)^+$. Applying (5.5) and (5.6) we find that

$$L_m(G_m) \ge (e_1 - e_m)^+ \varphi(m)^{-1} \log |F_m(1)|.$$

For $m \ge 2$ we have

$$F_m(1) = \lim_{x \to 1} \prod_{d \mid m} (x^d - 1)^{\mu(m/d)} = \lim_{x \to 1} \prod_{d \mid m} \Big(\frac{x^d - 1}{x - 1}\Big)^{\mu(m/d)} = \prod_{d \mid m} d^{\,\mu(m/d)} = \exp\{\Lambda(m)\}.$$

In this way we obtain the lower bound

$$L_m(G_m) \ge (e_1 - e_m)^+ \varphi(m)^{-1} \Lambda(m) \tag{5.8}$$

for $m \ge 2$.
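The evaluation $F_m(1) = \exp\{\Lambda(m)\}$ for $m \ge 2$, where $\Lambda$ is von Mangoldt's function, can be confirmed directly for small $m$. The sketch below builds $F_m$ from $X^m - 1 = \prod_{d \mid m} F_d(X)$ by exact division. The function names are ours.

```python
from functools import lru_cache


def poly_div_exact(num, den):
    """Exact quotient num / den for integer polynomials (ascending
    coefficients), assuming den is monic and divides num."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for k in range(len(quotient) - 1, -1, -1):
        coef = num[k + len(den) - 1]       # den is monic, so no division needed
        quotient[k] = coef
        for i, d in enumerate(den):
            num[k + i] -= coef * d
    if any(num):
        raise ValueError("division was not exact")
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of the m-th cyclotomic polynomial, ascending order."""
    num = [-1] + [0] * (m - 1) + [1]       # X^m - 1
    for d in range(1, m):
        if m % d == 0:
            num = poly_div_exact(num, cyclotomic(d))
    return tuple(num)


def exp_von_mangoldt(m):
    """exp(Lambda(m)) for m >= 2: p if m is a power of the prime p, else 1."""
    for p in range(2, m + 1):
        if m % p == 0:                     # p is the smallest prime factor
            while m % p == 0:
                m //= p
            return p if m == 1 else 1
    return 1


for m in range(2, 13):
    print(m, sum(cyclotomic(m)), exp_von_mangoldt(m))
```

The value at 1 is simply the sum of the coefficients, so the last two printed columns should coincide.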
Let $G(X) = \sum_{n=0}^{N-1} a_n X^n$ so that

$$G_m(X) = \sum_{n=e_m}^{N-1} a_n \binom{n}{e_m} X^{n - e_m}.$$

If $v \nmid \infty$ then it follows that

$$|G_m|_v \le |G|_v. \tag{5.9}$$

At the infinite place (extended to $\mathbb{C}$) we have

$$|G_m(\zeta_m)|_\infty \le \max_n |a_n|_\infty \sum_{n=e_m}^{N-1} \binom{n}{e_m} = \max_n |a_n|_\infty \binom{N}{e_m + 1}, \tag{5.10}$$

where $\zeta_m$ is a primitive $m$-th root of unity. Combining (5.8), (5.9) and (5.10), we obtain the inequality

$$e_1 \Lambda(m) \le \varphi(m) \log h(G) + \varphi(m) \log\binom{N}{e_m + 1} + e_m \Lambda(m). \tag{5.11}$$

Finally, we set $m$ equal to the prime number $p$ and use the bound

$$\binom{N}{M} \le \exp\{N \psi(M/N)\} \tag{5.12}$$

for binomial coefficients, where $\psi(\theta) = -\theta \log \theta - (1 - \theta)\log(1 - \theta)$, $0 < \theta < 1$. The inequality (5.12) is most easily proved by noting that

$$\binom{N}{M} = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{(1 + z)^N}{z^{M+1}}\, dz$$

and then choosing $\rho = M/(N - M)$. Thus the polynomial $G(X)$ having the form (5.7) must satisfy

$$e_1 \log p \le (p - 1)\log h(G) + (p - 1) N \psi\Big(\frac{e_p + 1}{N}\Big) + e_p \log p \tag{5.13}$$

for each prime number $p$.
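The bound (5.12), in the form $\binom{N}{M} \le \exp\{N\psi(M/N)\}$ for $0 < M < N$, can be sanity-checked numerically, together with the value $(1+\rho)^N \rho^{-M}$ of the trivial estimate for the Cauchy integral at $\rho = M/(N-M)$. This is only a floating-point sketch, with names of our choosing.

```python
import math


def psi(theta):
    """Entropy function psi(theta) = -theta log theta - (1-theta) log(1-theta)."""
    return -theta * math.log(theta) - (1 - theta) * math.log(1 - theta)


def log_binomial(N, M):
    """log of the binomial coefficient binom(N, M)."""
    return math.log(math.comb(N, M))


def log_cauchy_estimate(N, M):
    """log of (1 + rho)^N / rho^M at rho = M / (N - M)."""
    rho = M / (N - M)
    return N * math.log(1 + rho) - M * math.log(rho)


N, M = 40, 7
print(log_binomial(N, M), N * psi(M / N), log_cauchy_estimate(N, M))
```

The second and third printed values should agree up to rounding, and both should exceed the first.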


To complete the proof we set

$$x = 4\Big(\frac{N}{e_1}\Big)\log\Big(\frac{N}{e_1}\Big),$$

$$\theta(x) = \sum_{p \le x} \log p,$$

$$s(x) = \sum_{p \le x} (p - 1).$$

We then sum both sides of (5.13) over the set of primes $p$ less than or equal to $x$. We also use the fact that $\psi$ is concave and increasing on $(0, 1/2]$, and the obvious inequality

$$\sum_p e_p (p - 1) \le N.$$

It follows that

$$e_1 \theta(x) \le s(x) \log h(G) + N \sum_{p \le x} (p - 1)\psi\Big(\frac{e_p + 1}{N}\Big) + N$$

$$\le s(x) \log h(G) + N s(x)\, \psi\Big\{ (N s(x))^{-1} \sum_{p \le x} e_p (p - 1) + N^{-1} \Big\} + N$$

$$\le s(x) \log h(G) + N s(x)\, \psi\{ s(x)^{-1} + N^{-1} \} + N,$$

and therefore

$$\Big(\frac{e_1}{N}\Big)\Big(\frac{\theta(x)}{s(x)}\Big) \le N^{-1} \log h(G) + \psi\{ s(x)^{-1} + N^{-1} \} + s(x)^{-1}. \tag{5.14}$$

By the prime number theorem we have

$$\frac{\theta(x)}{x} \to 1 \quad\text{and}\quad \frac{2 s(x) \log x}{x^2} \to 1$$

as $x \to \infty$. It follows easily that

$$\frac{\theta(x)}{s(x)} = \frac{1}{2}\Big(\frac{e_1}{N}\Big)(1 + o(1)) \tag{5.15}$$

and

$$s(x) = 8\Big(\frac{N}{e_1}\Big)^2 \log\Big(\frac{N}{e_1}\Big)(1 + o(1)) \tag{5.16}$$

as $e_1/N \to 0$. We also find that

$$\frac{s(x)}{N} \to 0 \quad\text{as}\quad \frac{(N \log N)^{1/2}}{e_1} \to 0. \tag{5.17}$$

Using (5.16) and (5.17) we conclude that

$$\psi\{ s(x)^{-1} + N^{-1} \} + s(x)^{-1} = \frac{1}{4}\Big(\frac{e_1}{N}\Big)^2 (1 + o(1)). \tag{5.18}$$

When (5.14), (5.15) and (5.18) are combined we obtain exactly the statement of the Theorem.
The argument used to prove Theorem 6 suggests that a polynomial of low height which vanishes at 1 with high multiplicity must also vanish at primitive $p$-th roots of unity, at least for primes $p$ which are not too large. This type of automatic vanishing can be made explicit in various ways. Here we provide a simple result which follows easily from (5.11).

Theorem 7. Let $m_1$ and $N$ be integers with $1 \le m_1 < N$ and let $P_1(X)$ be a nontrivial polynomial in $\mathbb{Z}[X]$ which satisfies the conclusion of Corollary 5. If $N \to \infty$ and $m_1 \to \infty$ in such a way that

$$\frac{m_1}{N} \to 0 \quad\text{and}\quad \frac{(N \log N)^{1/2}}{m_1} \to 0, \tag{5.19}$$

then $F_p \mid P_1$ for all primes $p$ such that

$$p \le \Big(\frac{2N}{m_1}\Big)(1 + o(1)). \tag{5.20}$$

Proof. Let $P_1$ vanish at 1 with exactly the multiplicity $e_1$, so that $m_1 \le e_1$. If $F_p \nmid P_1$ then we may apply (5.11) with $P_1 = G$, $p = m$ and $e_p = 0$. It follows that

$$\Big(\frac{m_1}{N}\Big)\Big(\frac{\log p}{p - 1}\Big) \le N^{-1} \log h(P_1) + N^{-1} \log N.$$

Using the hypothesis on the right of (5.19) and (5.2) we find that

$$\Big(\frac{m_1}{N}\Big)\Big(\frac{\log p}{p - 1}\Big) \le \frac{1}{2}\Big(\frac{m_1}{N}\Big)^2 \log\Big(\frac{N}{m_1}\Big)(1 + o(1)). \tag{5.21}$$

But (5.21) implies that

$$p \ge \Big(\frac{2N}{m_1}\Big)(1 + o(1)).$$

Hence we must have $F_p \mid P_1$ for those primes $p$ satisfying (5.20). This proves the Theorem.

References.

[1] E. Bombieri, Lectures on the Thue Principle, these proceedings.

[2] E. Bombieri and J.D. Vaaler, On Siegel's Lemma, Invent. Math. 73 (1983), 11-32.

[3] P.L. Chebyshev, Sur l'interpolation, Zapiski Akademii Nauk, vol. 4, Supplement no. 5, (1864). Oeuvres, vol. 1, pp. 539-560.

[4] E. Dobrowolski, On a question of Lehmer and the number of irreducible factors of a polynomial, Acta Arith. 34 (1979), 391-401.

[5] I.G. Macdonald, Symmetric Functions and Hall Polynomials, Oxford U. Press, 1979.

[6] M. Mignotte, Approximation des nombres algébriques par des nombres algébriques de grand degré, Ann. Fac. Sci. Toulouse Math. (5) 1 (1979), no. 2, 165-170.

[7] M. Mignotte, Estimations élémentaires effectives sur les nombres algébriques, Journées Arithmétiques, 1980 (ed. J.V. Armitage), London Math. Soc. Lecture Note Ser. 56, Cambridge U. Press, 1982.

[8] W.M. Schmidt, Diophantine Approximation, Lecture Notes in Math. 785, Springer-Verlag, New York, 1980.

[9] C.L. Siegel, Über einige Anwendungen diophantischer Approximationen, Abh. der Preuss. Akad. der Wissenschaften, Phys.-Math. Kl. (1929), Nr. 1 (= Ges. Abh., I, pp. 209-266).

[10] R.P. Stanley, Theory and application of plane partitions I, II, Studies Appl. Math. 50 (1971), 167-188 and 259-279.

[11] G. Szegő, Orthogonal Polynomials, AMS Colloq. Pub. 23, 4th ed., Providence, 1975.

E. Bombieri, Institute for Advanced Study, Princeton, N.J. 08540, U.S.A.
J. D. Vaaler, University of Texas, Austin, TX 78712, U.S.A.
ON IRREGULARITIES OF DISTRIBUTION AND
APPROXIMATE EVALUATION OF CERTAIN FUNCTIONS II

W.W.L.Chen

1. Introduction.

Let $U = [0,1]$. Suppose that $g$ is a Lebesgue-integrable function, not necessarily bounded, in $U^2$, and that $h$ is any function in $U^2$. Let $P = P(N)$ be a distribution of $N$ points in $U^2$ such that $h(y)$ is finite for every $y \in P$. For $x = (x_1, x_2)$ in $U^2$, let $B(x)$ denote the rectangle consisting of all $y = (y_1, y_2)$ in $U^2$ satisfying $0 \le y_1 < x_1$ and $0 \le y_2 < x_2$, and write

$$Z[P; h; B(x)] = \sum_{y \in P \cap B(x)} h(y). \tag{1}$$

Let $\mu$ denote the Lebesgue measure in $U^2$, and write

$$D[P; h; g; B(x)] = Z[P; h; B(x)] - N \int_{B(x)} g(y)\, d\mu. \tag{2}$$
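For readers who prefer to see (1) and (2) in concrete terms, here is a minimal sketch for the special case $g \equiv 1$, where the integral of $g$ over $B(x)$ is simply $x_1 x_2$. The function names and the sample point set are illustrative assumptions, not part of the paper.

```python
def Z(points, h, x):
    """Sum of h(y) over the points y of P lying in B(x) = [0, x1) x [0, x2)."""
    x1, x2 = x
    return sum(h(y) for y in points if 0 <= y[0] < x1 and 0 <= y[1] < x2)


def D_unit_density(points, h, x):
    """D[P; h; g; B(x)] in the special case where g is identically 1."""
    x1, x2 = x
    return Z(points, h, x) - len(points) * x1 * x2


# Hypothetical two-point distribution with weight h = 1 at every point.
P = [(0.25, 0.25), (0.75, 0.75)]
one = lambda y: 1.0

print(D_unit_density(P, one, (0.5, 0.5)))
print(D_unit_density(P, one, (1.0, 1.0)))
```

For a general integrable $g$ the product `x1 * x2` would have to be replaced by a numerical or exact value of the integral of $g$ over $B(x)$.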

The aim of this paper is to use a variation of the ideas in Chen [1] on Halász's method in [2] to prove

Theorem 1. Suppose that $g$ is a Lebesgue-integrable function in $U^2$. Suppose further that there exists a measurable subset $S$ of $U^2$ such that $\mu(S) > 0$ and $g(y) \neq 0$ for every $y \in S$. Then there exists a positive constant $c_1 = c_1(g)$ such that for every distribution $P$ of $N$ points in $U^2$ and for every function $h$ bounded in $U^2$,

$$\sup_{x \in U^2} |D[P; h; g; B(x)]| > c_1(g)(\log N).$$

Note that this is an improvement of the case $K = 2$ of Corollary 1 of Chen [1] as well as a generalization of Theorem 2 of Schmidt [4] and Theorem 2 of Halász [2].
As an application of Theorem 1, we shall consider functions in $U^2$ of the following type.

Definition. We denote by $F$ the class of all functions of type

$$C + \int_{B(x)} g(y)\, d\mu$$

in $U^2$, where $C$ is a real constant, and where $g$ satisfies the hypotheses of Theorem 1.

We can show that functions in $F$ cannot be approximated very well by certain simple functions.

Definition. By an $M$-simple function in $U^2$, we mean a function $\varphi$, defined by

$$\varphi(x) = \sum_{i=1}^{M} m_i \chi_{B_i}(x)$$

for all $x \in U^2$, where, for each $i = 1, \dots, M$, $B_i$ denotes a rectangle in $U^2$ of the type $(u_1^{(i)}, u_1^{(i)} + v_1^{(i)}] \times (u_2^{(i)}, u_2^{(i)} + v_2^{(i)}]$, $\chi_{B_i}$ denotes the characteristic function of the rectangle $B_i$, and the coefficients $m_i$ are real.

As in Sec. 2 of [1], the following is an easy consequence of Theorem 1.

Theorem 2. Suppose that $f \in F$. Then there exists a positive constant $c_2 = c_2(f)$ such that for every $M$-simple function $\varphi$ in $U^2$,

$$\sup_{x \in U^2} |\varphi(x) - f(x)| > c_2(f) M^{-1} (\log M).$$

2. An outline of the method of Halász.

Following the method of Halász [2], corresponding to every function of the type $D(x) = D[P; h; g; B(x)]$, where $P$ is a distribution of $N$ points in $U^2$ ($N$ being sufficiently large), we construct an auxiliary function $F(x) = F[P; h; g; x]$ such that

$$\int_{U^2} |F(x)|\, d\mu \le 2. \tag{3}$$

Also, there exists a positive constant $c_3 = c_3(g)$ such that

$$\int_{U^2} F(x) D(x)\, d\mu > c_3(g)(\log N). \tag{4}$$

Theorem 1 follows, on combining (3) and (4) and noting that

$$\int_{U^2} F(x) D(x)\, d\mu \le \sup_{x \in U^2} |D(x)| \int_{U^2} |F(x)|\, d\mu.$$

It remains to establish the existence of such a constant $c_3(g)$ and function $F[P; h; g; x]$.

Some difficulty arises, as in [1], from the assumption that $g$ can take different signs in any region. We therefore have to look for regions in $U^2$ where $g$ is "predominantly positive" or "predominantly negative". We deal with the remaining "undesirable" regions by letting $F$ vanish there. On the other hand, the function $F$ is more complicated than the one used by Roth in [3]. For Halász's method to succeed, we also need to make sure that in the regions we have chosen, the value of $g$ is not "too large" "too often". We discuss this in the next section.

3. Preparation for the proof of Theorem 1.

Let $g$ be a Lebesgue-integrable function in $U^2$. Suppose that $S$ is a measurable subset of $U^2$ satisfying $\mu(S) > 0$ and $g(y) \neq 0$ for every $y \in S$. Then, replacing $g$ by $-g$ if necessary, we may assume, without loss of generality, that there exist three positive constants $c_4 = c_4(g)$, $c_5 = c_5(g)$ and $c_6 = c_6(g)$ and a subset $S_1 \subset S$ such that

(5)

and

for every $y \in S_1$. (6)

Consider the function

$$g^-(y) = \max\{-g(y), 0\}. \tag{7}$$

Then $g^-$ is Lebesgue-integrable in $U^2$. Let

$$c_7(g) = 2^{-5} c_4(g) c_5(g). \tag{8}$$

Consider also the function $|g|$. Then $|g|$ is Lebesgue-integrable in $U^2$. Let

$$c_8(g) = 2^{-2} c_6(g). \tag{9}$$

Then there exists a positive constant $c_9 = c_9(g)$ such that for every measurable set $E \subset U^2$ satisfying

(10)

we have

(11)

and

$$\int_E |g(y)|\, d\mu \le c_8(g). \tag{12}$$

By an elementary box in $U^2$, we mean a set in $U^2$ of the type

where $m_1$, $m_2$, $t_1$, $t_2$ are integers.

Consider the set $S_1$. Since $S_1$ is measurable, there exists a finite union $\Gamma^*$ of elementary boxes in $U^2$ such that

where $\Gamma^* \,\Delta\, S_1$ denotes the symmetric difference of $\Gamma^*$ and $S_1$. Hence if

$$E = \Gamma^* \setminus S_1, \tag{14}$$

then

(15)

Also, noting (10), we have that

Since $\Gamma^*$ is a finite union of elementary boxes of the type (13), there is one such elementary box with maximal $t_1$, and one with maximal $t_2$. Let $T_1$ and $T_2$ denote these maximal values of $t_1$ and $t_2$ respectively, and let

$$T = T_1 + T_2. \tag{17}$$

We can now introduce the auxiliary function $F[P; h; g; x]$.

Any $x \in [0, 1)$ can be written in the form

$$x = \sum_{i=0}^{\infty} \beta_i(x) 2^{-i-1},$$

where $\beta_i(x) = 0$ or $1$ such that the sequence $\beta_i(x)$ does not end with $1, 1, \dots$. For $r = 0, 1, 2, \dots$, let

Definition. By an $r$-interval, we mean an interval of the form $[m 2^{-r}, (m+1) 2^{-r})$, where the integer $m$ satisfies $0 \le m < 2^r$.

Suppose that $r = (r_1, r_2)$ is an ordered pair of non-negative integers. Let

$$|r| = r_1 + r_2.$$

For any $x \in [0, 1)^2$, let

$$R_r(x) = R_{r_1}(x_1) R_{r_2}(x_2).$$

Definition. By an $r$-box in $U^2$, we mean a set of the form $I_1 \times I_2$, where $I_1$ is an $r_1$-interval and $I_2$ is an $r_2$-interval.
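We shall consider a function of the type

$$F(x) = \prod_{\substack{|r| = n \\ r_1 \ge T_1 \\ r_2 \ge T_2}} (1 + \alpha f_r(x)) - 1, \tag{18}$$

where $n$ is chosen in terms of $N$ and where $\alpha = \alpha(g) < 1/2$ is a suitably chosen positive constant. In any $r$-box $B$, the function $f_r$ is defined by

$$f_r(x) = \begin{cases} 0 & (B \not\subset \Gamma^* \text{ or } B \cap P \neq \emptyset); \\ R_r(x) & (B \subset \Gamma^* \text{ and } B \cap P = \emptyset). \end{cases} \tag{19}$$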

It is not difficult to prove

Lemma 1. Suppose, for $j = 1, \dots, k$, that $r^{(j)} = (r_1^{(j)}, r_2^{(j)})$ satisfies $|r^{(j)}| = n$, $r_1^{(j)} \ge T_1$ and $r_2^{(j)} \ge T_2$. Suppose further that $r^{(1)}, \dots, r^{(k)}$ are all different. Then if $s = (s_1, s_2)$, where

$$s_1 = \max_{1 \le j \le k} r_1^{(j)} \quad\text{and}\quad s_2 = \max_{1 \le j \le k} r_2^{(j)}, \tag{20}$$

then for any $s$-box $B$, exactly one of the following three conditions holds:

(i) $f_{r^{(1)}} \cdots f_{r^{(k)}} = R_s$;

(ii) $f_{r^{(1)}} \cdots f_{r^{(k)}} = -R_s$;

(iii) $f_{r^{(1)}} \cdots f_{r^{(k)}} = 0$.

Furthermore, (iii) holds in any $s$-box $B$ where $B \cap P \neq \emptyset$.

4. Completion of the proof of Theorem 1.

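We shall only prove (3) and (4) for

(21)

Let $N$ satisfying (21) be given. Let $n$ be a positive integer such that

(22)

Then we have, in particular, that

$$n \ge 2T. \tag{23}$$

Note, first of all, that

$$\prod_{\substack{|r| = n \\ r_1 \ge T_1 \\ r_2 \ge T_2}} (1 + \alpha f_r(x)) = 1 + \alpha F_1(x) + \sum_{k=2}^{n+1} \alpha^k F_k(x), \tag{24}$$

where

$$F_1(x) = \sum_{\substack{|r| = n \\ r_1 \ge T_1 \\ r_2 \ge T_2}} f_r(x), \tag{25}$$

and where, for $k = 2, \dots, n+1$,

$$F_k(x) = \sum f_{r^{(1)}}(x) \cdots f_{r^{(k)}}(x), \tag{26}$$

the sum being over the choices of $k$ different $r^{(1)}, \dots, r^{(k)}$ occurring in the product (24). In view of Lemma 1, for each $k = 2, \dots, n+1$,

$$\int_{U^2} F_k(x)\, d\mu = 0. \tag{27}$$

Furthermore, for each $f_r$ in (19),

$$\int_{U^2} f_r(x)\, d\mu = 0. \tag{28}$$

Since

$$|F(x)| \le \prod_{\substack{|r| = n \\ r_1 \ge T_1 \\ r_2 \ge T_2}} (1 + \alpha f_r(x)) + 1
$$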
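for all $x \in U^2$, (3) follows easily from (24), (25), (26), (27), and (28).

On the other hand, from (18) and (24), we have

$$\int_{U^2} F(x) D(x)\, d\mu = \alpha \int_{U^2} F_1(x) D(x)\, d\mu + \sum_{k=2}^{n+1} \alpha^k \int_{U^2} F_k(x) D(x)\, d\mu. \tag{29}$$

(30)

Lemma 3. We have, for $k = 2, \dots, n+1$, that

$$\Big| \int_{U^2} F_k(x) D(x)\, d\mu \Big| \le \sum_{r=0}^{n-k+1} \sum_{h=1}^{n-r} c_6(g)\, 2^{-n-h-3} N \binom{h-1}{k-2}. \tag{31}$$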

We can deduce (4) from Lemmas 2 and 3 as follows. There are exactly $(n - T + 1)$ choices of $r$ satisfying the hypotheses of Lemma 2. It follows, from (25), (30) and (23), that

(32)

On the other hand,

$$\sum_{k=2}^{n+1} \alpha^k \Big| \int_{U^2} F_k(x) D(x)\, d\mu \Big| \le \sum_{k=2}^{n+1} \sum_{r=0}^{n-k+1} \sum_{h=1}^{n-r} \alpha^k c_6(g)\, 2^{-n-h-3} N \binom{h-1}{k-2}$$

$$= \sum_{r=0}^{n-1} \sum_{h=1}^{n-r} \sum_{k=2}^{h+1} \alpha^k c_6(g)\, 2^{-n-h-3} N \binom{h-1}{k-2}$$

$$\le \alpha^2 c_6(g) N n \sum_{h=1}^{\infty} 2^{-n-h-3} \sum_{k=0}^{h-1} \binom{h-1}{k} \alpha^k$$

$$\le \alpha^2 c_6(g) N n\, 2^{-n-3} \sum_{h=0}^{\infty} \Big(\frac{1 + \alpha}{2}\Big)^h$$

$$\le \alpha^2\, 2^{-n-1} c_6(g) N n. \tag{33}$$

Let

$$\alpha = \frac{2^{-9} c_4(g) c_5(g)}{c_6(g)}.$$

Then clearly $\alpha < 1/2$. By (29), (31), (32) and (33),

where $c_{10}(g)$ is a positive constant. This proves (4), in view of (22).

It remains to prove Lemmas 2 and 3, the proofs of which are based on

Lemma 4. Suppose that $B$ is an $s$-box in $U^2$. If $B \cap P = \emptyset$, then

(34)

we have

The proof is essentially a slight modification of part of the proof of Lemma 2 of [1]. Let

Then

$$\int_{B'} \sum_{a_1=0}^{1} \sum_{a_2=0}^{1} (-1)^{a_1+a_2} D(x_1 + a_1 2^{-s_1-1},\, x_2 + a_2 2^{-s_2-1})\, d\mu. \tag{37}$$

In view of (1), the sum

$$\sum_{y \in P \cap B^*(x)} h(y), \tag{38}$$

where $B^*(x) = [x_1, x_1 + 2^{-s_1-1}) \times [x_2, x_2 + 2^{-s_2-1}) \subset B$ for every $x \in B'$. Hence the sum (38) vanishes, and so, in view of (2), we have that (37) is equal to

$$-N \int_{B'} \Big( \sum_{a_1=0}^{1} \sum_{a_2=0}^{1} (-1)^{a_1+a_2} \int_0^{x_1 + a_1 2^{-s_1-1}} \int_0^{x_2 + a_2 2^{-s_2-1}} g(y)\, dy \Big) d\mu$$

$$= -N \int_{B'} \Big( \int_{x_1}^{x_1 + 2^{-s_1-1}} \int_{x_2}^{x_2 + 2^{-s_2-1}} g(y)\, dy \Big) d\mu$$

$$= -N \int_B K_B(y) g(y)\, d\mu$$

on interchanging the order of integration. This completes the proof of Lemma 4.

Proof of Lemma 2. We decompose the integral (30) into integrals over $r$-boxes. We shall say that an $r$-box $B$ is "good" if it is contained in $\Gamma^*$ and does not contain any point of $P$. By (19), $f_r = 0$ in any $r$-box that is not "good". Hence by (34),

$$\int_{U^2} f_r(x) D(x)\, d\mu = \sum_{B \text{ "good"}} \int_B R_r(x) D(x)\, d\mu = N \sum_{B \text{ "good"}} \int_B K_B(y) g(y)\, d\mu.$$

For any $r$ satisfying the hypotheses of Lemma 2, there are at least $(\tfrac{1}{2} c_4(g) 2^n - N)$ "good" $r$-boxes. It follows, by (6), (14), (7), (11), (22), (8) and (10), that

$$\int_{U^2} f_r(x) D(x)\, d\mu > N c_5(g) \sum_{B \text{ "good"}} \int_B K_B(y)\, d\mu - N 2^{-n-2} c_5(g) \mu(E) - N 2^{-n-2} \int_E g^-(y)\, d\mu.$$

This completes the proof of Lemma 2.

Proof of Lemma 3. Consider

$$\int_{U^2} f_{r^{(1)}}(x) \cdots f_{r^{(k)}}(x) D(x)\, d\mu.$$

Let $s = (s_1, s_2)$ be defined by (20). Let $B$ be an $s$-box in $U^2$. Then by Lemmas 1 and 4,

Hence by (19), (14), (6), (35), (36), (12), (9) and noting that there are $2^{|s|}$ $s$-boxes in $U^2$,

$$\Big| \int_{U^2} f_{r^{(1)}}(x) \cdots f_{r^{(k)}}(x) D(x)\, d\mu \Big| \le N c_6(g) \sum_{B \subset \Gamma^*} \int_B K_B(y)\, d\mu + N 2^{-|s|-2} \int_E |g(y)|\, d\mu$$

$$\le N c_6(g) 2^{-|s|-4} + N c_8(g) 2^{-|s|-2} = N c_6(g) 2^{-|s|-3}.$$

If we use the convention $r_1^{(1)} < \dots < r_1^{(k)}$, we have, writing $h = r_1^{(k)} - r_1^{(1)}$,

Now (31) follows on noting that once $r_1^{(1)}$ and $h$ are chosen, there are exactly $\binom{h-1}{k-2}$ ways of choosing $(k-2)$ integers in the interval $(r_1^{(1)}, r_1^{(1)} + h)$. This completes the proof of Lemma 3.

References.

[1] W. W. L. Chen, On irregularities of distribution and approximate evaluation of certain functions, to appear in Quarterly Journal of Mathematics (Oxford) 1985.

[2] G. Halász, On Roth's method in the theory of irregularities of point distributions, Recent progress in analytic number theory, vol. 2, pp. 79-94 (Academic Press, London, 1981).

[3] K. F. Roth, On irregularities of distribution, Mathematika, 1 (1954), 73-79.

[4] W. M. Schmidt, Irregularities of distribution VII, Acta Arith., 21 (1972), 45-50.

W. Chen
Huxley Building,
Imperial College,
London SW7, U.K.
SIMPLE ZEROS OF 'IHE ZETA-FUNCTION
OF A QUADRATIC NUMBER FIELD, II

J.B. Conrey, A. Ghosh and S.M. Gonek

1. Introduction.

Let K be a fixed quadratic extension of $\mathbb{Q}$ and write $\zeta_K(s)$ for
the Dedekind zeta-function of K, where $s = \sigma + it$. It is well-
known, and easy to prove, that the number $N_K(T)$ of zeros of $\zeta_K(s)$ in
the region $0 < \sigma < 1$, $0 < t \le T$ satisfies

$$N_K(T) \sim \frac{T}{\pi}\log T \qquad (1.1)$$

as $T \to \infty$. On the other hand, not much is known about the number of
these zeros, $N_K^*(T)$, that are simple. Indeed, it was only recently
that the authors [2] showed that

and, if the Lindelöf hypothesis is true, that

for any $\varepsilon > 0$. Before this, it was not even known whether $\zeta_K(s)$ has
infinitely many simple zeros in $0 < \sigma < 1$. In this paper we shall
prove that if the Riemann hypothesis (RH) is true for $\zeta(s)$, the
Riemann zeta-function, then a positive proportion of the zeros of
$\zeta_K(s)$ are simple. More precisely we have

Theorem 1. Assume that RH is true for $\zeta(s)$. Then

$$N_K^*(T) \ge \left(\tfrac{1}{54} + o(1)\right)N_K(T)$$

Research supported in part by NSF grants.


as $T \to \infty$.

Remark. As we shall see below, if we assume the Riemann hypothesis
for $\zeta_K(s)$, the constant 1/54 can be replaced by 1/27.

In the case of the Riemann zeta-function, there are three known


methods for proving that a positive proportion of the zeros are
simple. They are the pair correlation method of Montgomery [12],
the modification of Levinson's method (due to Heath-Brown and
Selberg) and the method of Conrey, Ghosh and Gonek [1].
An application of Montgomery's method shows that on GRH a
positive proportion of the zeros of $\zeta_K(s)$ have multiplicity less
than or equal to two but does not furnish any information on simple
zeros (in fact, this statement also holds for L-functions associated
with certain cusp-forms on the modular group, if one assumes the
appropriate Riemann hypothesis).

The method of Levinson (which is unconditional) may work if one
had mean-value theorems of "mollified" L-functions, on the critical
line, with mollifiers of long length. Such results as are available
at present do not suffice.
The present method (which is a variation of that in C-G-G [1])
overcomes these difficulties by exploiting the factorization

$$\zeta_K(s) = \zeta(s)L(s,\chi); \qquad (1.2)$$

here $\chi$ is the quadratic (Kronecker) character of the field K and
$L(s,\chi)$ is the associated Dirichlet L-function. Unfortunately, our
approach has the drawback that it will not apply to functions like
the Dirichlet series associated with cusp-forms, for although these
functions also have a $\Gamma(s)$ term in their functional equations, they
do not factor as a product of two "natural" Dirichlet series.

To establish Theorem 1 we shall require the following result


which is of interest in its own right.

Theorem 2. Assume RH for $\zeta(s)$ and let $\rho = 1/2 + i\gamma$ denote the typical
nontrivial zero of $\zeta(s)$. Then if $\chi$ is any nonprincipal character
(not necessarily quadratic), we have

Theorem 3. On RH, if a = 1/3, 2/3, 1/4, 3/4, 1/6 or 5/6, then at
least one-third of the zeros of $\zeta(s,a)$ lie off the line $\sigma = 1/2$.

A proof and discussion of a result of this kind may be found in


Gonek [7].

2. Preamble to the proof of Theorem 2.

Throughout, T is large, $L = \log T$, and $\varepsilon$ is an arbitrarily
small positive number though not necessarily the same one at each
occurrence. Estimates depending implicitly on $\varepsilon$ will be denoted by
$O_\varepsilon$ or $\ll_\varepsilon$.
It suffices to prove Theorem 2 for a primitive character $\chi$ and
its modulus q will be fixed from now on. Consequently, the
constants implied by the symbols O and $\ll$ may depend on q and $\chi$. Let

$$A(s,\chi) = \sum_{k \le y} a(k)k^{-s},$$

where
$$a(k) = \mu(k)\chi(k)\Big(1 - \frac{\log k}{\log y}\Big) \quad\text{and}\quad y = T^{\eta},$$

with $\mu$ the Möbius function and $0 < \eta < 1/2$ to be selected later in the
proof. Let

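$$N = \sum_{0 < \gamma \le T} L(1/2 + i\gamma,\chi)A(1/2 + i\gamma,\chi) \qquad (2.1)$$

and

$$V = \sum_{0 < \gamma \le T} \big|L(1/2 + i\gamma,\chi)A(1/2 + i\gamma,\chi)\big|^2, \qquad (2.2)$$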

with $\gamma$ running through the ordinates of the zeros of $\zeta(s)$. Then, by
the Cauchy-Schwarz inequality we have

$$\big|\{0 < \gamma \le T : L(1/2 + i\gamma,\chi) \ne 0\}\big| \ge \frac{|N|^2}{V}. \qquad (2.3)$$
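Inequality (2.3) is the general fact that a finite sequence with sum N and sum of squared moduli V has at least $|N|^2/V$ nonzero terms. The following minimal sketch checks this on arbitrary data; the function names and the random test sequence are illustrative assumptions and have nothing to do with actual values of $L(s,\chi)A(s,\chi)$.

```python
import random


def nonvanishing_lower_bound(values):
    """Return |sum a_n|^2 / sum |a_n|^2, a lower bound for #{n : a_n != 0}."""
    total = sum(values)
    energy = sum(abs(v) ** 2 for v in values)
    return abs(total) ** 2 / energy if energy > 0 else 0.0


def count_nonzero(values):
    """Number of nonzero terms of the sequence."""
    return sum(1 for v in values if v != 0)


if __name__ == "__main__":
    random.seed(1)
    # A sparse complex sequence: roughly 40% of the entries are nonzero.
    seq = [complex(random.uniform(0, 1), random.uniform(-1, 1))
           if random.random() < 0.4 else 0 for _ in range(500)]
    print(nonvanishing_lower_bound(seq), "<=", count_nonzero(seq))
```

The bound is sharp when all nonzero terms are equal, which is the reason a mollifier that makes the terms nearly constant sharpens the inequality.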

The purpose of $A(s,\chi)$ here is to mollify $L(s,\chi)$ and thereby
sharpen the inequality. The remainder of the paper is concerned
with the evaluation of N and V. We shall show that on RH, if

$$\big|\{0 < \gamma < T : L(\rho,\chi) = 0\}\big| \le (2/3 + o(1))N(T)$$

as $T \to \infty$, where N(T) is the number of zeros of $\zeta(s)$ with $0 < \gamma < T$.
That is, at most two-thirds of the zeros of $\zeta(s)$ are also zeros of
$L(s,\chi)$.

With a lot more work, we could actually show that any two
L-functions with inequivalent characters have at most two-thirds of
their zeros in common, provided the Riemann hypothesis holds for one
of them. A result of this type has also been given by A. Fujii [5]
using a method different from ours. While his result is unconditional,
his constant (which was not evaluated) is presumably quite
small and would therefore not serve to prove Theorem 1.

To prove Theorem 1 we first observe from (1.2) that a zero of
$\zeta_K(s)$ is simple if and only if it is either

(i) a simple zero of $\zeta(s)$ and not a zero of $L(s,\chi)$
or
(ii) a simple zero of $L(s,\chi)$ and not a zero of $\zeta(s)$.

Furthermore, these two conditions are mutually exclusive. Now it is
known that, on RH, at least 19/27 of the zeros of $\zeta(s)$ are simple
(see C-G-G [1]). Then, by Theorem 2, the number of zeros satisfying
(i) is at least $(19/27 - 2/3 + o(1))N(T) = (1/27 + o(1))N(T)$. But
as is well-known $N(T) \sim \frac{1}{2}N_K(T)$. Hence, Theorem 1 follows.
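The arithmetic behind the constants 1/54 and 1/27 can be traced with exact fractions. This is a small bookkeeping sketch; the function names are illustrative assumptions, and the inputs 19/27, 2/3 and the factor 1/2 are the ones quoted in the argument above.

```python
from fractions import Fraction


def type_i_proportion(simple_zeta=Fraction(19, 27), shared_max=Fraction(2, 3)):
    """Lower bound, in units of N(T), for simple zeros of zeta that are not zeros of L."""
    return simple_zeta - shared_max


def theorem1_constant():
    """Convert from units of N(T) to units of N_K(T), using N(T) ~ N_K(T)/2."""
    return type_i_proportion() * Fraction(1, 2)


def remark_constant():
    """Counting zeros of type (ii) as well doubles the constant."""
    return 2 * theorem1_constant()


if __name__ == "__main__":
    print(type_i_proportion(), theorem1_constant(), remark_constant())
```

The margin 19/27 - 2/3 is positive but small, so any weaker simple-zero proportion for $\zeta(s)$ must still exceed 2/3 for the argument to give a positive constant.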

We could have appealed to the result of Montgomery and Taylor
[11] where 19/27 is replaced by 0.6725 with some loss in the
constant in Theorem 1. Also notice that we have assumed RH only for
$\zeta(s)$ and not for $L(s,\chi)$. If one assumes it for both functions (or,
equivalently, for $\zeta_K(s)$), it can be shown by the method in [1]
that 19/27ths of the zeros of $L(s,\chi)$ are simple, and by the method
in this paper that at most 2/3rds of the zeros of $L(s,\chi)$ are zeros
of $\zeta(s)$. In this way one can count the simple zeros of $\zeta_K(s)$ of
type (ii) above, thereby doubling the constant in Theorem 1.

Theorem 2 also has an application to the Hurwitz zeta-function
$\zeta(s,a)$, namely

$y = T^{1/2-\varepsilon}$, then

and $V \sim \frac{3}{2\pi}TL$ as $T \to \infty$. (2.4)

Combined with (2.1) and (1.2), these estimates imply the result.
The first step in treating N and V is to express them as
contour integrals by Cauchy's residue theorem. To this end let $\mathcal{T}$
denote a sequence of numbers $T_n$ such that

(n = 3, 4, ...)
and

$$\frac{\zeta'}{\zeta}(\sigma + iT_n) \ll (\log T_n)^2 \qquad (2.5)$$

uniformly for $-1 \le \sigma \le 2$ (see Davenport [4; p.108]). In
particular, $T_n$ is not the ordinate of any zero of $\zeta(s)$.

Until the very end of the paper, we shall always assume that $T \in \mathcal{T}$.

Next set (once and for all)

$$a = 1 + L^{-1}.$$

Let R be a positively oriented rectangle with vertices at
a + i, a + iT, 1-a + iT, and 1-a + i. Then, on RH, we have

$$N = \frac{1}{2\pi i}\int_R \frac{\zeta'}{\zeta}(s)L(s,\chi)A(s,\chi)\,ds \qquad (2.6)$$

and

$$V = \frac{1}{2\pi i}\int_R \frac{\zeta'}{\zeta}(s)L(s,\chi)L(1-s,\bar\chi)A(s,\chi)A(1-s,\bar\chi)\,ds. \qquad (2.7)$$

Let us consider N first. As it happens, it is easier to work
with

$$N = \frac{-1}{2\pi i}\int_R \frac{\zeta'}{\zeta}(1-s)L(s,\chi)A(s,\chi)\,ds.$$

This is equivalent to (2.6) because $\frac{\zeta'}{\zeta}(s)$ and $-\frac{\zeta'}{\zeta}(1-s)$ have the
same poles and residues inside R. Now for s inside or on R,

(2.8)
and

( 2.9)

These bounds and (2.5) imply that the top and bottom edges of R
contribute $O(yT^{1/2+\varepsilon})$ to N.
For the left edge of R we replace s by 1-s and find that

$$\frac{-1}{2\pi i}\int_{1-a+iT}^{1-a+i}\frac{\zeta'}{\zeta}(1-s)L(s,\chi)A(s,\chi)\,ds
= \frac{-1}{2\pi i}\int_{a-i}^{a-iT}\frac{\zeta'}{\zeta}(s)L(1-s,\chi)A(1-s,\chi)\,ds
= \overline{\frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(1-s,\bar\chi)A(1-s,\bar\chi)\,ds}.$$

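For the right-hand side of R we use the identities

$$-\frac{\zeta'}{\zeta}(1-s) = \frac{\zeta'}{\zeta}(s) - \frac{X'}{X}(1-s) \qquad (2.10)$$

$$\zeta(1-s) = X(1-s)\zeta(s) \qquad (2.11)$$

where
$$X(1-s) = \pi^{1/2-s}\,\Gamma(s/2)\big/\Gamma\big(\tfrac{1-s}{2}\big). \qquad (2.12)$$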

Thus, on substitution, we may write

(2.13)
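with

$$N_1 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(1-s,\bar\chi)A(1-s,\bar\chi)\,ds, \qquad (2.14)$$

$$N_2 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(s,\chi)A(s,\chi)\,ds \qquad (2.15)$$

and

$$N_3 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{X'}{X}(1-s)L(s,\chi)A(s,\chi)\,ds. \qquad (2.16)$$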
We now come to V.
The top and bottom edges of R contribute $O_\varepsilon(yT^{1/2+\varepsilon})$ to V by
(2.5), (2.8), and (2.9). Replacing s by 1-s and using (2.10), we
find that the contribution of the left edge of R equals

$$\frac{1}{2\pi i}\int_{a-i}^{a-iT}\frac{\zeta'}{\zeta}(1-s)L(1-s,\chi)L(s,\bar\chi)A(1-s,\chi)A(s,\bar\chi)\,ds
= \frac{1}{2\pi i}\int_{a-i}^{a-iT}\Big(-\frac{\zeta'}{\zeta}(s) + \frac{X'}{X}(1-s)\Big)L(1-s,\chi)L(s,\bar\chi)A(1-s,\chi)A(s,\bar\chi)\,ds.$$

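We will write

$$V_1 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(s,\chi)L(1-s,\bar\chi)A(s,\chi)A(1-s,\bar\chi)\,ds \qquad (2.17)$$

and

$$V_2 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{X'}{X}(1-s)L(s,\chi)L(1-s,\bar\chi)A(s,\chi)A(1-s,\bar\chi)\,ds, \qquad (2.18)$$

so that the integral above equals $\overline{V_1} - \overline{V_2}$.

Notice that $V_1$ is also the contribution of the right-hand side
of R to V. Hence, on combining these results, we obtain

$$V = 2\,\mathrm{Re}\,V_1 - \overline{V_2} + O_\varepsilon(yT^{1/2+\varepsilon}). \qquad (2.19)$$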

We conclude this section by introducing some useful notation
and formulae. As usual we write e(x) in place of $\exp(2\pi i x)$.
We let $\sum_{m=1}^{r}{}'$ denote a sum with (m,r)=1. Ramanujan's sum
is

$$c_r(a) = \sum_{m=1}^{r}{}'\, e\Big(\frac{ma}{r}\Big). \qquad (2.20)$$
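For readers who want to experiment with (2.20), here is a small numerical sketch. It compares the definition of Ramanujan's sum with the classical evaluation $c_r(a) = \sum_{d \mid (r,a)} d\,\mu(r/d)$; the function names are illustrative assumptions, and the trial-division Möbius function is only meant for small arguments.

```python
import cmath
from math import gcd


def e(x):
    """e(x) = exp(2*pi*i*x), as in the text."""
    return cmath.exp(2j * cmath.pi * x)


def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def ramanujan_sum(r, a):
    """c_r(a) from the definition: sum of e(m*a/r) over 1 <= m <= r with (m, r) = 1."""
    return sum(e(m * a / r) for m in range(1, r + 1) if gcd(m, r) == 1)


def ramanujan_sum_divisor_form(r, a):
    """Classical evaluation: sum of d * mu(r/d) over the divisors d of (r, a)."""
    g = gcd(r, a)
    return sum(d * mobius(r // d) for d in range(1, g + 1) if g % d == 0)


if __name__ == "__main__":
    for r, a in [(6, 1), (6, 2), (6, 3), (12, 4), (10, 0)]:
        print(r, a, ramanujan_sum(r, a), ramanujan_sum_divisor_form(r, a))
```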

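Similarly, we define

$$c_\chi(a) = \sum_{m=1}^{q}\chi(m)\,e\Big(\frac{ma}{q}\Big).$$

It is known (see [12; p.358]) that

$$c_\chi(a) = \begin{cases}\bar\chi(a)\tau(\chi) & \text{if } (a,q) = 1,\\ 0 & \text{if } (a,q) > 1,\end{cases} \qquad (2.21)$$

where $\tau(\chi) = c_\chi(1)$ is the Gauss sum.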
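Formula (2.21) is easy to test numerically for a specific primitive character. The sketch below uses, purely as an example, the character mod 5 with $\chi(2) = i$; this choice, the dictionary encoding and the function names are assumptions made for illustration.

```python
import cmath
from math import gcd

Q = 5
# A primitive character mod 5 of order 4, determined by chi(2) = i (illustrative choice).
CHI = {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def chi(n):
    return CHI[n % Q]


def c_chi(a):
    """c_chi(a) = sum over m = 1..q of chi(m) e(m*a/q)."""
    return sum(chi(m) * e(m * a / Q) for m in range(1, Q + 1))


def gauss_sum():
    """tau(chi) = c_chi(1)."""
    return c_chi(1)


def predicted(a):
    """Right-hand side of (2.21)."""
    if gcd(a, Q) > 1:
        return 0
    return complex(chi(a)).conjugate() * gauss_sum()


if __name__ == "__main__":
    for a in range(0, 7):
        print(a, c_chi(a), predicted(a))
```

As a consistency check, $|\tau(\chi)|^2 = q$ for a primitive character, which the test below also confirms for this example.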

We shall write the functional equation for $L(s,\chi)$ in the form

$$L(1-s,\bar\chi) = X(1-s,\chi)L(s,\chi), \qquad (2.22)$$

where

X( l-s, X) (2.23)

with
$$\mathfrak{a} = \begin{cases} 0 & \text{if } \chi(-1) = 1,\\ 1 & \text{if } \chi(-1) = -1.\end{cases}$$

Observe that if q = 1, $\chi$ is principal and $X(1-s,\chi) = X(1-s)$, where
$X(1-s)$ is the factor in the functional equation for $\zeta(s)$ (see
(2.11) and (2.12)).
Finally, define

$$F_n(s) = \prod_{p \mid n}(1 - p^{-s}); \qquad F_n(s,\chi) = \prod_{p \mid n}(1 - \chi(p)p^{-s}).$$

3. Auxiliary lemmas.

Lemma 1. Let r be a positive real number and suppose that $X(1-s,\chi)$
is given by (2.23). Then for $a = 1 + L^{-1}$ and T large, we have

a+!T
f X(I-s,X) r- s ds
a+!
a
.ll::.!2. -r
+ ~ E(r/q , T) ifr<!l!
.(x) e(-q) .( X) - 211 '

~a
.( X) E( r / q, T) ifr>!l!
211 '
where

E(r/q,T)

Proof. When q=1, $X(1-s,\chi) = X(1-s)$ and, except for minor modifications,
a proof can be found in Gonek [8]. If q > 1, then as is
easily shown,

$$X(1-s,\chi) = \frac{\chi(-1)}{\tau(\chi)}\,q^{s}\,X(1-s)\big(1 + O(e^{-\pi t})\big). \qquad (3.1)$$

Using this and the case q = 1 of the lemma, we obtain the result.

Lemma 2. Let $\alpha(n)$, $\beta(n)$ be arithmetic functions such that $\alpha(n) = O(1)$
and $\beta(n) = O(d_r(n)\log^{\ell} n)$, where $d_r(n)$ is the coefficient of
$n^{-s}$ in $\zeta^r(s)$ and $\ell$ is a non-negative integer. Also let $a = 1 + L^{-1}$.
Then if $1 < x \le T$,

a+iT
21Ti f X(l-s, X)( L a(k)k s - 1 )( L fl(j) j -s) ds
a+i k<x j=1

X(-1) a(k)
= TfX) -k-

Proof. This follows from Lemma 1 and Lemma 2 of C-G-G [1].

Lemma 3. Let $\chi$ mod q be a primitive character and set

$$D\big(s,\chi,\tfrac{-H}{qK}\big) = \sum_{n=1}^{\infty} d(n)\chi(n)\,e\Big(\frac{-nH}{qK}\Big)n^{-s} \qquad (\sigma > 1).$$

If K is squarefree and (H,K) = (K,q) = 1, then D has an analytic
continuation to the whole plane except for a pole at s=1. At this
point it has the same principal part as

Proof. This is a straightforward generalization of a well-known
result of Estermann.

$\Lambda(s,a,k)$ has the same principal part at s=1 as
$-\delta((a,k))\,\frac{1}{\varphi(k)}\,\frac{\zeta'}{\zeta}(s)$, where $\delta(\cdot)$ is the Dirac delta-function.
Thus, if we call the expression in (3.2) $P(s,\chi,\tfrac{-H}{qK})$, then by (3.5)

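$$-\sum_{d \mid K}\sum_{b=1}^{qK/d}{}'\Big(D\big(s,\chi,\tfrac{-bH}{qK/d}\big) - P\big(s,\chi,\tfrac{-bH}{qK/d}\big)\Big)\Big(\Lambda(s,bd,qK) + \frac{\delta(d)}{\varphi(qK)}\frac{\zeta'}{\zeta}(s)\Big)$$

$$-\sum_{d \mid K}\sum_{b=1}^{qK/d}{}' P\big(s,\chi,\tfrac{-bH}{qK/d}\big)\Lambda(s,bd,qK)$$

$$+\sum_{d \mid K}\sum_{b=1}^{qK/d}{}' \frac{\delta(d)}{\varphi(qK)}\,D\big(s,\chi,\tfrac{-bH}{qK/d}\big)\frac{\zeta'}{\zeta}(s)$$

$$-\sum_{d \mid K}\sum_{b=1}^{qK/d}{}' \frac{\delta(d)}{\varphi(qK)}\,P\big(s,\chi,\tfrac{-bH}{qK/d}\big)\frac{\zeta'}{\zeta}(s)$$

$$= -\Sigma_1 - \Sigma_2 + \Sigma_3 - \Sigma_4,$$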

say, with $\Sigma_1$ regular at s=1. The principal part of $\Sigma_2$ is the same
as that of

- -5 2 K 5( K 1-5 -5
X(H)-r(x)(qK) I;; (5) I X('d) d Fq (s)(1 + X(-1)('d) ) - ~(q)q )
dlK
qK/d ,
I x(b)A(s,bd,qK).
b=1

The sum over b equals

qK/d
d-s I x(n)A(nd)
I L 5
b=1 n:::b(mod qK/d) n=1 n
(n,K/d)=1

and the last sum is zero unless d=1 or p (recall that K is square-
free). In any case we may write it as

~s,x»
L' - Fi. -
6(d)(- -L (s,X) - + A(d)
K dS - X(d)

Thus,
I-s -s
x (F q (s)(1 + X(-I)K ) - ~(q)q )

- s
log p X(p)p
+ L
plK ps_ X(p)

Next,

for a> 1. Since (H,qK) =1, the sum over b equals

qK ,
L e(an)
a=1 qK

Thus by (2.20) we have

LL . x(mn) L dll(7)
m,n=1 (mn)s dl(qK,mn)

after some simplifications. Since this function is regular at s=1,
$\Sigma_3$ has the same principal part as

Evidently we may restrict the first sum to one over dl K. Also,


since (q,K) =1 and K is square-free, the double sum equals

$$\mu(qK)\sum_{d \mid K}\mu(d)\chi(d)\prod_{p \mid d}\big(1 + F_p(1,\chi)\big) = \mu(qK)g(K).$$
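The identity between the divisor sum and the product defining g(K) in (3.3) can be verified with exact arithmetic. The sketch below assumes, for illustration only, the real character mod 3 and a squarefree K given by its list of prime factors; the function names are not from the paper.

```python
from fractions import Fraction
from itertools import combinations


def chi_mod3(n):
    """The nonprincipal (real, primitive) character mod 3, used as an example."""
    r = n % 3
    return 0 if r == 0 else (1 if r == 1 else -1)


def F_p_at_1(p, chi):
    """F_p(1, chi) = 1 - chi(p)/p, from the definition of F_n(s, chi)."""
    return 1 - Fraction(chi(p), p)


def g_product(primes, chi):
    """g(K) as the product over p | K of 1 - 2*chi(p) + chi(p)^2 / p."""
    out = Fraction(1)
    for p in primes:
        out *= 1 - 2 * chi(p) + Fraction(chi(p) ** 2, p)
    return out


def g_divisor_sum(primes, chi):
    """Sum over d | K of mu(d) chi(d) prod_{p | d} (1 + F_p(1, chi)), K squarefree."""
    total = Fraction(0)
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            term = Fraction((-1) ** size)  # mu(d) for d = product of the subset
            for p in subset:
                term *= chi(p) * (1 + F_p_at_1(p, chi))
            total += term
    return total


if __name__ == "__main__":
    K_primes = [2, 5, 7]  # K = 70, coprime to q = 3
    print(g_product(K_primes, chi_mod3), g_divisor_sum(K_primes, chi_mod3))
```

Both sides factor over the primes dividing K, which is why squarefreeness of K is all that is needed.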

Thus, the principal part of $\Sigma_3$ is identical to that of

$$\frac{\mu(qK)g(K)}{\varphi(qK)}\,L(1,\chi)^2\,\frac{\zeta'}{\zeta}(s).$$

Lemma 4. Let $\chi$ mod q be a primitive character and set

$$Q\big(s,\chi,\tfrac{-H}{qK}\big) = -\sum_{m,n=1}^{\infty}\frac{\Lambda(m)d(n)\chi(n)}{(mn)^{s}}\,e\Big(\frac{-mnH}{qK}\Big) \qquad (\sigma > 1).$$

Then if H, K and q are pairwise coprime and K is squarefree, Q has a
meromorphic continuation to the whole plane. The only pole for
$\sigma \ge 1$ is at s=1 where it has a pole whose principal part is the same
as that of

$$\bar\chi(H)\chi(K)\tau(\chi)(qK)^{-s}\zeta^2(s)G_K(s,\chi) + \frac{\mu(qK)g(K)}{\varphi(qK)}\,L^2(1,\chi)\,\frac{\zeta'}{\zeta}(s),$$

where
$$g(K) = \prod_{p \mid K}\Big(1 - 2\chi(p) + \frac{\chi^2(p)}{p}\Big) \qquad (3.3)$$

and

F'
( LL' (s,x)
- K - (
+ ~ (s,X») F (s)(1 + x(-I)K
I-s-s
) - <p(q)q )
K q

(3.4)

Proof. For $\sigma > 1$ we have


qK
L
a=1
(3.5)

qK/d,
-bH
=- n
dlK b=1
l D(S,X'qK/d)A(s,bd,qK),

where $D(s,\chi,\cdot)$ is as in Lemma 3, and

$$\Lambda(s,a,k) = \sum_{n \equiv a\ (\mathrm{mod}\ k)}\Lambda(n)n^{-s} \qquad (\sigma > 1).$$

It is well known that $\Lambda(s,a,k)$ has a meromorphic continuation
to the whole plane with a simple pole at s=1 if and only if
(a,k) = 1. Also by Lemma 3, $D(s,\chi,\cdot)$ is regular everywhere except
for a possible double pole at s=1. Thus, $Q(s,\chi,\cdot)$ is meromorphic in
the complex plane and has no poles in $\sigma \ge 1$ except possibly at s=1.

To find the principal part at this point, first note that

Finally,
1 s..~ - -s
<j>(qK) I; (s) X(H)x(K)-r( X)(qK) 1;(S)
2

qK ~
l-s -s
x (F q (s)(1 + X(-l)K ) - <j>(q)q ) L
b=1

o.
Collecting these results, we find that $Q(s,\chi,\tfrac{-H}{qK})$ has the same
principal part as

$$\bar\chi(H)\chi(K)\tau(\chi)(qK)^{-s}\zeta(s)^2G_K(s,\chi) + \frac{\mu(qK)g(K)}{\varphi(qK)}\,L(1,\chi)^2\,\frac{\zeta'}{\zeta}(s);$$

this completes the proof.

Lemma 5. Let $\chi$ mod q be a primitive character and suppose that
(H,K) = (K,q) = 1. Set

($\sigma > 1$)

Then L has an analytic continuation to the whole plane except for a
possible pole at s=1. At this point it has the same principal part
as

where $\delta(K) = 1$ if K=1 and is zero otherwise.

Proof. This follows on relating the Dirichlet series to a Hurwitz
zeta-function in an obvious manner.

Lemma 6. Let $\chi$ mod q be a primitive character and write

$$R\big(s,\chi,\tfrac{-1}{qK}\big) = -\sum_{m,n=1}^{\infty}\frac{\Lambda(m)\chi(n)}{(mn)^{s}}\,e\Big(\frac{-mn}{qK}\Big) \qquad (\sigma > 1).$$

If (K,q) = 1 and K is squarefree, then R has a meromorphic
continuation to the entire complex plane. Its only pole in $\sigma \ge 1$ is
at s=1 where it has a pole with the same principal part as

lI(qK)L(l )F (0 )1;'() X(-~i'(X)(J:(K)~L'(l,-x) _A~(~K~)_ _ )1;(s) ,


~(qK) ,X K ,X ~ s + u
- X(K)/K
where $\delta(K) = 1$ if K=1 and is 0 otherwise.

Proof. This is similar to the proof of Lemma 4.

Lemma 7. Suppose that

$$c_1(j) = -\sum_{mn=j}\Lambda(m)\chi(n),$$

$$b_1(j) = -\sum_{h \le y}\ \sum_{hmn=j} a(h)\Lambda(m)\chi(n)d(n),$$

and
$$b_2(j) = \sum_{h \le y}\ \sum_{hn=j} a(h)\chi(n)d(n).$$

Then if $y = T^{\eta}$ with $\eta \le 1/2$,

L c 1 (j )e(~)
j(qKT/21r

-H
"a(k) , b (j)e(.=.1.) = "" a(h)a(k) res. Q(s,X'g'K) (9 KT)s)
I. k I. 1 qk I. I. k s=1 ( s 211
kSy jSqKT/211 h,kSy

+ 0 e; ( y 1/2 T3 / 4 +e; + TL- 1 ) , (3.7)

and
-H
= "" a(h)a(k) res. D(s,X'QiK) (9 K,)s)
I. I. k s=l ( s 21TH
h,k~y

+ 0 ( y1/2 T3 / 4 +e; + TL- 1 ), (3.8)

where $\tau \le T$ in (3.8) and R, Q and D are as in Lemmas 3, 4 and 6.

Proof. All three formulae are proved by the method used to estimate
the sum M2 in Conrey, Ghosh and Gonek [1; Sec.5]. Since the method
is rather complicated and lengthy, we shall only indicate the idea
of the proof of (3.7) here; the interested reader is referred to
sections 5-7 of the aforementioned paper for details. Had we
assumed GRH, the lemma could be established with considerably less
work; the reader may wish to consult Lemma 6 in [3] for the proof of
a similar result.
First we set

(a> 1).

Then the sum on the left in (3.7) is

(3.9)

where c depends on T and c > 1. Now by the definitions of $b_1(j)$
and $Q(s,\chi,\cdot)$, we see that

-H
L a(h)Q(s,X'qK)' (3.10)
h~y

where H = h/(h,k) and K = k/(h,k). From this and Lemma 4 it
follows that $B(s,\tfrac{-1}{qk})$ is a meromorphic function whose only pole
in $\sigma \ge 1$ is at s=1. Inserting (3.10) into (3.9), we see that this
pole should give rise to the main term

a(h)a(k)
LL k
(3.11)
h,k~y

To prove that this is the case we need to replace the exponential
(additive character) in $B(s,\tfrac{-1}{qk})$ by a character sum. We may then
proceed as in the proofs of the Bombieri-Vinogradov theorem given by
Vaughan [15] and Gallagher [6].
By (5.12) in [1] we find that

e(.:i)
qk L L ~(t)o(q',qk,d,~),
q'lqk dl (qk,j)

where
]J(qk(d,q~/q'»
q
o(q' ,qk,d,~)
Clearly, we may suppose that (k,q) = 1 (otherwise a(k) = 0 in
(3.7)). Hence, the divisors q' and d split as $q' = q_1q_2$ and $d = d_1d_2$
with $q_1 \mid q$, $q_2 \mid k$, $d_1 \mid q$, $d_2 \mid k$, and $(q_1,q_2) = (d_1,d_2) = 1$. Also,
since $\psi$ mod $q_1q_2$ is primitive, there is a unique pair of primitive
characters $\psi_1$ mod $q_1$, $\psi_2$ mod $q_2$ such that $\psi = \psi_1\psi_2$. From this and
the coprimality of $q_1$ and $q_2$ it is easy to show that

Using these factorizations for q', d, $\psi$ and $\tau(\bar\psi)$, we may now write

e(.:.t) = ~ ~ ~ * T(~1 ) ~ L * T(~2) ~1(q2) ~2(ql)


qk
ql lq d 1 1q ljil mod ql q2 1k lji2 mod q2

ljillji2(rr) o(ql q 2' qk, d 1 d 2 ,ljillji2)'


dtk 1 2
dId 21 j

Substituting this in the definition of $B(s,\tfrac{-1}{qk})$ and using the result
in (3.9), we find after rearranging the sums that the right-hand
side of (3.9) equals

1
(W) f ( L
(c) m=1

The expression inside the brackets is analogous to E2 in (5.15) of
[1] and is treated in precisely the same way. That is, we distinguish
between the cases $q_2 \le L^A$ for some A > 0, and $L^A < q_2 \le y/k$.
The integrand above has a pole at s=1 if and only if $q_1 = q_2 = 1$, so
the contribution of this term must be identical to (3.11). For
$q_2 \le L^A$ we move the contour to the left and use Siegel's theorem as
in the proof of the prime number theorem for arithmetic progressions.
For the remaining cases we use a Vaughan-type identity and
the large sieve. If we assumed GRH, it is this last part that could
be dispensed with (and so the analysis is much easier).

We now state some elementary lemmas, the proofs of which we
omit.

Then

and 60Jt x )

(11) clog x + O( 1) c =

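Lemma 9. For a fixed character $\chi$ mod q, m a positive integer, and
$x \ge 1$, we have

$$\text{(i)}\quad \sum_{p \mid m}\frac{(\log p)^j}{p} \ll \begin{cases}\log\log\log 30m & \text{if } j = 0,\\ (\log\log 3m)^j & \text{if } j = 1, 2,\end{cases}$$

and

$$\text{(ii)}\quad \sum_{\substack{p < x\\ p \nmid m}}\frac{\chi(p)(\log p)^j}{p} \ll \begin{cases}\log\log\log 30m & \text{if } j = 0,\\ (\log\log 3m)^j & \text{if } j = 1, 2.\end{cases}$$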

Lemma 10. Let $G_k(s,\chi)$ be as in (3.4) with $\chi$ mod q being a fixed
character and k a positive integer. Then

$$G_k(1,\chi) = -\chi(-1)\frac{\varphi(q)}{q}\sum_{p \mid k}\chi(p)\log p + O(\log\log 3k)$$

and

$$G_k'(1,\chi) = \chi(-1)\frac{\varphi(q)}{q}\sum_{p \mid k}\chi(p)\log p\,\log\frac{k}{p} + O(\log 2k\,\log\log 3k).$$

Lemma 11. For $x \ge 1$ and q fixed,

i£.9.2..
q
log x + 0(1)

Lemma 12. Let y, a(h) and $F_h(s,\chi)$ be as in Sec.2 and let g(h) be as
in (3.3). Then

a(mh)
(i) L -h- < °- l;<m),
2
h~y/m

ll(h);(h)Fh (0, x)
-1
(ii) L < L ,
4>(h)
h~y/m

(iii) L
h~y/m

Lemma 13. Let y and a(h) be as before. Then


°- l/(m)
2
(i) L a(mh~X(h) gmll(m) x(m) + O( )
4>(q)4>(m)log y log y log4 2y/m
h~y/m

and
_ 0_ 1/2(m)10g L
(ii) L a(mh)x~h)log h = - gmll(m)x(m)log y/m + O( 1 )
h~y/m 4>(q)4>(m)1og y og y

Proof. We may base a proof on the formula (see Graham [9])

$$\sum_{\substack{k \le x\\ (k,r) = 1}}\frac{\mu(k)}{k}\log\frac{x}{k} = \frac{r}{\varphi(r)} + O\big(\sigma_{-1/2}(r)\log^{-4}2x\big).$$
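Graham's formula can be observed numerically. The sketch below evaluates the left-hand side directly and compares it with $r/\varphi(r)$; the function names, the sieve and the choice x = 20000 are illustrative assumptions, and a finite computation of this kind says nothing about the true size of the error term.

```python
from math import gcd, log


def mobius_table(n):
    """Return a list mu[0..n] computed with a linear sieve."""
    mu = [1] * (n + 1)
    mu[0] = 0
    composite = [False] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if not composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > n:
                break
            composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0  # p^2 divides i*p
                break
            mu[i * p] = -mu[i]
    return mu


def graham_sum(x, r):
    """Sum over k <= x with (k, r) = 1 of mu(k)/k * log(x/k)."""
    mu = mobius_table(x)
    return sum(mu[k] / k * log(x / k) for k in range(1, x + 1) if gcd(k, r) == 1)


def r_over_phi(r):
    """r / phi(r), with phi computed by direct counting (small r only)."""
    phi = sum(1 for j in range(1, r + 1) if gcd(j, r) == 1)
    return r / phi


if __name__ == "__main__":
    for r in (1, 6, 30):
        print(r, graham_sum(20000, r), r_over_phi(r))
```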
For (i) we have

a(mh)x(h) ll(m)x(m)
L lli.hl. log ...:J..
h log y h mh
h(y/m
(h~mq)=l

lJ(m)x(m)mg 0_1/2(m q )
4>(mq)log y + O(log Y log4 2y/m)'

The original sum vanishes if (m,q) > 1 so the result follows from
the multiplicativity of $\varphi$ and $\sigma_{-1/2}$.
In a similar way (ii) follows on noting that h is squarefree
and so $\log h = \sum_{p \mid h}\log p$.

4. The estimation of N.
Recall from (2.13) that

(4.1)

where the $N_i$ are given by (2.14)-(2.16).

We first consider $N_1$. Using the functional equation (2.22) in
(2.14), we have

$$N_1 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(s,\chi)A(1-s,\bar\chi)X(1-s,\chi)\,ds.$$

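Setting
$$c_1(j) = -\sum_{mn=j}\Lambda(m)\chi(n)$$

and using Lemma 2, we obtain

$$N_1 = \frac{\chi(-1)}{\tau(\chi)}\sum_{k \le y}\frac{\bar a(k)}{k}\sum_{j \le qkT/2\pi} c_1(j)\,e\Big(\frac{-j}{qk}\Big) + O_\varepsilon\big(yT^{1/2+\varepsilon}\big).$$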

Now by (3.6) we find that


-1
-a(k) res. ( R(s'X'-k)
q
k s=1 s

Here the sum may be taken over squarefree k coprime to q (otherwise
a(k) = 0), hence the residue may be computed by means of Lemma 6.
The result is, after simplification,

- I ;(p)log p ) + 0(i/2T3/4+E ) + 0(TL- 1 ).


pSy p-x(p)

By Lemma 12(ii) the sum over k is bounded by $L^{-1}$, while the sum over
p equals

-1 \' X(p) log P log yIp + O( \' log2 P ).


log y L P L P
P~ P~
The error term is clearly bounded and, by Lemma 9(ii), with m=1, so
is the first term. Hence

(4.2)

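Next, by (2.15), we have

$$N_2 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L(s,\chi)A(s,\chi)\,ds
= \sum_{n=2}^{\infty} c_2(n)\,n^{-a}\Big(\frac{1}{2\pi}\int_1^T n^{-it}\,dt\Big) \ll L^3 \qquad (4.3)$$

since
$$c_2(n) = -\sum_{hjk=n}\Lambda(h)\chi(j)a(k) \ll d_3(n)\log n.$$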

Finally we come to $N_3$. Taking the logarithmic derivative of
(2.12) it is easily shown that

$$\frac{X'}{X}(1-s) = -\log\frac{t}{2\pi} + O\Big(\frac{1}{t}\Big) \qquad (4.4)$$

for $t \ge 1$, $0 \le \sigma \le 2$, say. Inserting this into (2.16), we obtain

$$N_3 = \frac{-1}{2\pi}\int_1^T L(a+it,\chi)A(a+it,\chi)\log(t/2\pi)\,dt + O(L^3)$$

since
$$L(a+it,\chi)A(a+it,\chi) \ll \zeta^2(a) \ll L^2.$$

The main term can be written as

$$-\sum_{n=1}^{\infty} c_3(n)\,n^{-a}\Big(\frac{1}{2\pi}\int_1^T n^{-it}\log(t/2\pi)\,dt\Big),$$

with
$$c_3(n) = \sum_{hk=n}\chi(h)a(k) \ll d(n).$$

The term n=1 contributes $-TL/2\pi + O(T)$ to $N_3$, while the remaining
terms contribute an amount of $O(L^3)$, so that

$$N_3 = -\frac{T}{2\pi}L + O(T).$$
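The size of the n=1 term can be made explicit from the antiderivative $t\log(t/2\pi) - t$. The following sketch is an illustration under that elementary calculus identity only; the function names are assumptions. It shows that the difference between the n=1 term and $-TL/2\pi$ is a bounded multiple of T.

```python
import math


def n3_leading_integral(T):
    """-(1/(2*pi)) * integral from 1 to T of log(t/(2*pi)) dt, via t*log(t/(2*pi)) - t."""
    def antiderivative(t):
        return t * math.log(t / (2 * math.pi)) - t
    return -(antiderivative(T) - antiderivative(1.0)) / (2 * math.pi)


def main_term(T):
    """-T*L/(2*pi) with L = log T."""
    return -T * math.log(T) / (2 * math.pi)


def normalized_remainder(T):
    """(n=1 term minus main term) / T; it tends to (log(2*pi) + 1) / (2*pi)."""
    return (n3_leading_integral(T) - main_term(T)) / T


if __name__ == "__main__":
    for T in (1e2, 1e4, 1e6):
        print(T, normalized_remainder(T))
```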

Combining this with (4.1)-(4.3) we see that

(4.5)

5. The estimation of $V_1$.
We now turn to the first term $V_1$ in the denominator V (see
(2.7), (2.17), and (2.19)). By the functional equation (2.22) we
have

$$V_1 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{\zeta'}{\zeta}(s)L^2(s,\chi)A(s,\chi)A(1-s,\bar\chi)X(1-s,\chi)\,ds,$$

where $a = 1 + L^{-1}$. We define

L (cr > 1).


j=1

Then by Lemma 2,

;i(k)
k

To evaluate this we use (3.7) of Lemma 7 and find that

where
$$H = \frac{h}{(h,k)} \quad\text{and}\quad K = \frac{k}{(h,k)}.$$

Observe that in the sum above we may suppose that both H and K
are squarefree and that (H,q) = (K,q) = 1. Therefore, Lemma 4 is
applicable and we may write the residue as

If we use the expansion $\zeta(s) = \frac{1}{s-1} + \gamma + \cdots$ near s=1 to evaluate
these residues and insert the result into $V_1$, we obtain

- - 2y-1
VI x(-I)-I \ \ a(h)a(k)x(h)x(k)(h,k)(G (1 )1 ~ +G'(1 »
21r ~,~~y hk K ,x og 21TH K ,x

a(h)a(k)
hk

We next apply the Möbius inversion formula in the form

$$f((h,k)) = \sum_{\substack{m \mid h\\ m \mid k}}\ \sum_{n \mid m}\mu(n)f\Big(\frac{m}{n}\Big).$$
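This form of Möbius inversion removes the dependence on the greatest common divisor (h,k) at the cost of two extra divisor sums. A brute-force check, written as a sketch with illustrative function names and an arbitrary test function f, is given below together with the companion identity $\sum_{n \mid m}\mu(n)/n = \varphi(m)/m$ used later in the section.

```python
from fractions import Fraction
from math import gcd


def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def f_of_gcd_via_inversion(f, h, k):
    """Sum over common divisors m of h and k of sum_{n | m} mu(n) f(m/n)."""
    total = 0
    for m in divisors(gcd(h, k)):
        total += sum(mobius(n) * f(m // n) for n in divisors(m))
    return total


def mu_over_n_sum(m):
    """Sum over n | m of mu(n)/n, as an exact fraction."""
    return sum(Fraction(mobius(n), n) for n in divisors(m))


def euler_phi(m):
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)


if __name__ == "__main__":
    f = lambda g: g * g + 3  # arbitrary test function of the gcd
    print(f_of_gcd_via_inversion(f, 12, 18), f(gcd(12, 18)))
    print(mu_over_n_sum(30), Fraction(euler_phi(30), 30))
```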
On applying this to $V_1$ and simplifying, we find that

a(mh);(mk)X(h)X(k)
nlm
L.I!i!!l
n
LL hk
h,k~y/m

- x(-I)v(q)p(q) L 2 (I,x)2!
T( X)q m_<y
L ~
h m
t
v(n) L L a(mh)a(mk)v(nk)g(nk)
h k<l hk<j>( nk)
, m

+ O(TL- I ),
or, say

We first treat $V_{11}$. By Lemma 10, the expression in brackets is

Te 2y- I
Gk (l,X) log 21Thn + Gk(l,X) + O(L loglog 3kn)

<h(n)
~)
T
-X(-I) ~
q
(log -
h
I x(p)log p - I x(p)log p log
plk prk p

+ O(L loglog 3kn) + O(log 2k log 2n).

Since neither n nor k is greater than y, the error terms here are
O(L log 2nL). Hence, using the identity $\sum_{n \mid m}\frac{\mu(n)}{n} = \frac{\varphi(m)}{m}$ we
have
v11 = - ~ ....!
211
\' p(m
2) \' \' a(mh)a(mk)X(h)X(k)
q L L L hk
m~y m h,k(y/m

Tp
X(p) log p log hk

+ O( TL I
m~y

Notice that in each sum over m we may assume that m is squarefree.
With this in mind, we see from Lemma 13(i) and Lemma 8 that
the O-term is

112(n)02 I;(m)
_ __ -....:2- log 2nL < T log L •
m n

We may therefore rewrite $V_{11}$ as

Jill! I ~ I 12&....E. (L I a(mh)X(h) I


a(mpR.)x(R.)
2q1l m p h R.
m(y p(y/m h(y/m R.5.y/mp
- (p~q)=1 -

I a(mh)X(~)log h a(mpR.h( R.)


R.
k5y/m

a(mh)x(h) a(mpR.)x(R.)log R. )
I + O(T log L).
h R.
h~y/m

Using Lemma 13 to estimate the sums over h and $\ell$ and noting that we
may suppose that (p,m) = (q,m) = 1, we find that the expression in
parentheses is

- u2(m)m 2g 2
~2(m)~2(q)log2y
~
~(p)
(L + log y/m + log y/mp)

0'_1/2(m)2 0'_1/2(m)2 log L


+ 0 ( log y log '+ 2y/mp ) + O( log2y ) .
The first O-term contributes

0'_1/2(m)2 y/m -It


«_T_
log y I m f log 2y/mu du/u
m~y 1

by the prime number theorem. The integral is easily seen to be
O(1). So by Lemma 8(ii) the contribution is O(T).
The second error term is

T log L 0'_1/ 2(m) 2


« I « T log L.
log y m
m~y

Finally, by Lemmas 9 and 11, the main term contributes an amount

«lOgT 2y I u2(m) (L logloglog 30m + loglog 3m) « T loglog L •


m~y Hm)

Thus,
$$V_{11} \ll T\log L. \qquad (5.2)$$

We now turn to $V_{12}$. We have

u 2(n)g(n)
~(n)

;(mk)u(k)g(k)
k~(k)

since we may obviously assume that (n,k)=1. By Lemma 12(i) and
(iii) and Lemma 8(ii), this is, on interchanging orders of summation

u2 (n)lg(n)1
I ~(n)
nlm

u2(n)0'~1/2(n)lg(n)1
n~(n)
Now for squarefree n,

$$|g(n)| = \prod_{p \mid n}\Big|1 - 2\chi(p) + \frac{\chi^2(p)}{p}\Big| \le d(n)^2$$

and

where $\omega(n)$ is the number of prime factors of n. Hence,

$$V_{12} \ll T\sum_{n \le y}\frac{\mu^2(n)d(n)^4}{n\varphi(n)} \ll T.$$

It follows from this, (5.1) and (5.2) that

$$V_1 \ll T\log L. \qquad (5.3)$$

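6. The estimation of $V_2$.
We shall see in this section that the main term in V is from
$V_2$. Recall from (2.18) that

$$V_2 = \frac{1}{2\pi i}\int_{a+i}^{a+iT}\frac{X'}{X}(1-s)L(s,\chi)L(1-s,\bar\chi)A(s,\chi)A(1-s,\bar\chi)\,ds.$$

Moving the line of integration to $\sigma = 1/2$ and using (2.8), (2.9) and
(4.4), we obtain

$$V_2 = -\frac{1}{2\pi}\int_1^T \big|L(1/2+it,\chi)A(1/2+it,\chi)\big|^2\log\frac{t}{2\pi}\,dt \qquad (6.1)$$
$$\qquad + O\Big(\int_1^T \big|L(1/2+it,\chi)A(1/2+it,\chi)\big|^2\,\frac{dt}{t}\Big) + O_\varepsilon\big(yT^{1/2+\varepsilon}\big).$$

The mean-values are evaluated using the techniques indicated in
Sec.5 to give us

$$V_2 = -\frac{TL}{2\pi}\Big(1 + \frac{L}{\log y}\Big) + O_\varepsilon\big(y^{1/2}T^{3/4+\varepsilon}\big) + O(T\log L). \qquad (6.2)$$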
7. Completion of the proof.


By (4.5) we see that

N (7.1)

Also, from (5.3) we have

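and from (6.2) that

$$V_2 = -\frac{TL}{2\pi}\Big(1 + \frac{L}{\log y}\Big) + O_\varepsilon\big(y^{1/2}T^{3/4+\varepsilon}\big) + O(T\log L).$$

Thus, by (2.19) it follows that

$$V = \frac{TL}{2\pi}\Big(1 + \frac{L}{\log y}\Big) + O(T\log L). \qquad (7.2)$$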

We now take $y = T^{1/2-2\varepsilon}$ in (7.1) and (7.2) and find that

and $V \sim (3 + O(\varepsilon))\frac{TL}{2\pi}$. (7.3)
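The constant 3 in (7.3) and the resulting proportion in Theorem 2 follow from (7.2) with $\log y = \eta L$. The sketch below is hedged bookkeeping only: it assumes that the main term of N has absolute value $TL/(2\pi)$, as the estimate for $N_3$ in Sec.4 indicates, and the function names are illustrative.

```python
from fractions import Fraction


def v_constant(eta):
    """Coefficient of TL/(2*pi) in (7.2) when log y = eta * L, namely 1 + 1/eta."""
    return 1 + 1 / Fraction(eta)


def nonvanishing_proportion(eta):
    """|N|^2 / V in units of N(T) ~ TL/(2*pi), ASSUMING |N| ~ TL/(2*pi)."""
    return 1 / v_constant(eta)


if __name__ == "__main__":
    for eta in (Fraction(1, 2), Fraction(49, 100), Fraction(1, 4)):
        # Proportion of zeros of zeta at which L(rho, chi) is nonzero, and its complement.
        print(eta, v_constant(eta), nonvanishing_proportion(eta),
              1 - nonvanishing_proportion(eta))
```

At the limiting value $\eta = 1/2$ the complement is exactly 2/3, the constant of Theorem 2; shorter mollifiers give weaker constants.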

This establishes (2.4) and (2.5) and therefore Theorem 2, provided
that T is in the sequence $\mathcal{T}$ defined in Sec.2 (preceding (2.6)). To
remove this restriction first note that every positive T is within
O(1) of some element of $\mathcal{T}$ and that increasing T by O(1) in (2.2)
introduces at most O(L) new terms into the sum. However, by (2.9)
and (2.10) each of these terms is

if $y = T^{1/2-2\varepsilon}$. Thus (7.3) is valid for all large T. Similarly,
increasing T by O(1) introduces at most O(L) new terms into the sum
for V in (2.3). Each of these is

< T1 - £/2 - 2£2


£

so (7.3) is also valid for all large T. This completes the proof of
Theorem 2.

References

1. J.B. Conrey, A. Ghosh, and S.M. Gonek, Simple zeros of the Riemann zeta-function, submitted.

2. J.B. Conrey, A. Ghosh, and S.M. Gonek, Simple zeros of the zeta-function of a quadratic number field I, Invent. Math. 86 (1986), 563-576.

3. J.B. Conrey, A. Ghosh, and S.M. Gonek, Large gaps between zeros of the zeta-function, to appear, Mathematika.

4. H. Davenport, Multiplicative Number Theory, Graduate Texts in Mathematics, v.74, Springer Verlag, New York, 1980.

5. A. Fujii, On the zeros of Dirichlet L-functions (V), Acta Arith. 28 (1976), 395-403.

6. P.X. Gallagher, Bombieri's mean value theorem, Mathematika 15 (1968), 1-6.

7. S.M. Gonek, The zeros of Hurwitz's zeta-function on $\sigma = 1/2$, Analytic Number Theory (Phil. Pa. 1980) 129-140, Springer Verlag Lecture Notes 899, 1981.

8. S.M. Gonek, Mean values of the Riemann zeta-function and its derivatives, Invent. Math. 75 (1984), 123-141.

9. S.W. Graham, An asymptotic estimate related to Selberg's sieve, J. Number Theory 10 (1978), No.1, 83-94.

10. H.L. Montgomery, The pair correlation of zeros of the zeta-function, Proc. Symp. Pure Math. 24 (1973), 181-193.

11. H.L. Montgomery, Distribution of the zeros of the Riemann zeta-function, Proc. Int. Cong. Math., Vancouver 1974, 379-381.

12. H.L. Montgomery and R.C. Vaughan, The exceptional set in Goldbach's problem, Acta Arith. XXVII (1975), 353-370.

13. E.C. Titchmarsh, The Theory of the Riemann Zeta-Function, Oxford, Clarendon Press, 1951.

14. R.C. Vaughan, Mean value theorems in prime number theory, J. London Math. Soc. (2) 10 (1975), 153-162.

J.B. Conrey and A. Ghosh
Oklahoma State University
Stillwater, OK 74078-0613
U.S.A.

S.M. Gonek
University of Rochester
Rochester, NY 14627
U.S.A.
DIFFERENTIAL DIFFERENCE EQUATIONS ASSOCIATED
WITH SIEVES

H. Diamond, H. Halberstam and H.-E. Richert*

1. Our aim in this note is to analyse the differential difference
equations underlying sieves of dimension $\kappa > 1$. A heuristic version
of such an analysis together with some valuable numerical information
was given by Iwaniec, van de Lune and te Riele [5] (see also
te Riele [7]) and what we seek to do here, in effect, is to justify
the conclusions of [5]. It has been shown elsewhere (in [2]) how to
construct sieves of dimension $\kappa > 1$ on the basis of such information.
In this connection we acknowledge also our indebtedness to
the important thesis of Rawsthorne [6].
Let $\sigma(u) = \sigma_\kappa(u)$ be the continuous solution of the Ankeny-
Onishi differential-difference equation (cf [1], or Chapter 7 of
[3])

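$$u^{-\kappa}\sigma(u) = C^{-1}\ (0 < u \le 2), \qquad (u^{-\kappa}\sigma(u))' = -\kappa u^{-\kappa-1}\sigma(u-2)\ (u > 2), \qquad (1.1)$$

where $C = (2e^{\gamma})^{\kappa}\Gamma(\kappa + 1)$ and $\gamma$ denotes Euler's constant. We shall
indicate how to prove that there exist numbers $\alpha_\kappa > 1$, $\beta_\kappa > 1$ and
continuous functions $F_\kappa$, $f_\kappa$ that satisfy the simultaneous
differential-difference equations with retarded argument

$$F_\kappa(u) = 1/\sigma_\kappa(u)\ (0 < u \le \alpha_\kappa), \qquad (u^{\kappa}F_\kappa(u))' = \kappa u^{\kappa-1}f_\kappa(u-1)\ (u > \alpha_\kappa), \qquad (1.2)$$

$$f_\kappa(u) = 0\ (0 < u \le \beta_\kappa), \qquad (u^{\kappa}f_\kappa(u))' = \kappa u^{\kappa-1}F_\kappa(u-1)\ (u > \beta_\kappa), \qquad (1.3)$$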

as well as the additional conditions

*All three authors acknowledge with gratitude support from the
National Science Foundation.

$$F_\kappa(u) = 1 + O(e^{-u}), \qquad f_\kappa(u) = 1 + O(e^{-u}) \qquad \text{as } u \to \infty \qquad (1.4)$$

and
(1.5)
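The Ankeny-Onishi function of (1.1) is easy to tabulate, which gives a feel for how quickly $\sigma_\kappa(u)$ approaches 1. The sketch below is a rough numerical illustration, not the authors' method: it integrates $w(u) = u^{-\kappa}\sigma(u)$ with the trapezoidal rule on a uniform grid, and the function name, step size and grid layout are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329


def ankeny_onishi_sigma(kappa, u_max, h=0.01):
    """Tabulate sigma_kappa(u) at u = 0, h, 2h, ..., u_max (h must divide 2)."""
    delay = round(2 / h)  # grid offset of the retarded argument u - 2
    n = round(u_max / h)
    C = (2 * math.exp(EULER_GAMMA)) ** kappa * math.gamma(kappa + 1)
    sigma = [0.0] * (n + 1)
    for i in range(min(delay, n) + 1):
        sigma[i] = (i * h) ** kappa / C  # initial segment 0 <= u <= 2
    if n <= delay:
        return sigma

    def rhs(i):
        """(u^-kappa sigma(u))' at u = i*h for u >= 2, from (1.1)."""
        u = i * h
        return -kappa * u ** (-kappa - 1) * sigma[i - delay]

    w = 1.0 / C  # w(u) = u^-kappa sigma(u) equals 1/C at u = 2
    for i in range(delay, n):
        w += 0.5 * h * (rhs(i) + rhs(i + 1))  # trapezoidal step
        sigma[i + 1] = ((i + 1) * h) ** kappa * w
    return sigma


if __name__ == "__main__":
    table = ankeny_onishi_sigma(2.0, 20.0)
    for u in (2, 4, 6, 10, 20):
        print(u, table[round(u / 0.01)])
```

Since the delay 2 is a multiple of the step, the retarded values are always available on the grid and no interpolation is needed.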

Once we have such a pair of numbers $\alpha_\kappa$, $\beta_\kappa$ and a pair of
functions $F_\kappa$, $f_\kappa$, we can derive with relative ease (by the method
sketched in [2]) the following:

Theorem. Let A be a finite integer sequence whose elements are not
necessarily positive or distinct, let P be a set of primes and $z \ge 2$
a real number. Write

$$P(z) = \prod_{\substack{p < z\\ p \in P}} p \quad\text{and}\quad A_d = \{a \in A : a \equiv 0 \bmod d\}.$$

Suppose there exist an approximation X to the cardinality |A| of A,
and a non-negative multiplicative function $\omega(d)$ on the squarefree
integers (with $0 \le \omega(p) < p$ if $p \in P$ and $\omega(p) = 0$ if
$p \notin P$), so that

$$R_d := |A_d| - \frac{\omega(d)}{d}X \qquad (d \mid P(z))$$

are in the nature of remainders. Define

$$S(A,P,z) = |\{a \in A : (a,P(z)) = 1\}|$$

and
$$V(z) = \prod_{p < z}\Big(1 - \frac{\omega(p)}{p}\Big).$$

If there exists a constant $A \ge 2$ such that

$$V(w_1)/V(w) \le \Big(\frac{\log w}{\log w_1}\Big)^{\kappa}\Big(1 + \frac{A}{\log w_1}\Big) \quad\text{whenever } 2 \le w_1 < w,$$

then, for any number $y \ge z$,

$$S(A,P,z) \le XV(z)\Big\{F_\kappa\Big(\frac{\log y}{\log z}\Big) + O\Big(\frac{\log\log y}{(\log y)^{\nu}}\Big)\Big\} + \sum_{\substack{m \mid P(z)\\ m < y}} c^{+}(m)R_m \qquad (1.6)$$

and

$$S(A,P,z) \ge XV(z)\Big\{f_\kappa\Big(\frac{\log y}{\log z}\Big) + O\Big(\frac{\log\log y}{(\log y)^{\nu}}\Big)\Big\} - \sum_{\substack{m \mid P(z)\\ m < y}} c^{-}(m)R_m \qquad (1.7)$$

where $\nu = 1/(2\kappa + 2)$ and, in the remainder sums, $c^{\pm}(m) \ll 4^{\Omega(m)}$,
with $\Omega(m)$ denoting the number of prime factors of m.
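To make the quantities in the theorem concrete, the sketch below computes S(A,P,z) and XV(z) for a toy two-dimensional example: the sequence A = {n(n+2) : 1 <= n <= 30}, with the density function $\omega(2) = 1$ and $\omega(p) = 2$ for odd p. The example, the function names and the choice X = 30 are assumptions made for illustration; the code does not implement the sieve bounds (1.6) and (1.7) themselves.

```python
from fractions import Fraction
from math import gcd


def primes_below(z):
    """All primes p < z by trial division (small z only)."""
    return [p for p in range(2, z) if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def sifting_function(A, z):
    """S(A, P, z) with P the set of all primes: count a in A coprime to P(z)."""
    Pz = 1
    for p in primes_below(z):
        Pz *= p
    return sum(1 for a in A if gcd(a, Pz) == 1)


def omega_twin(p):
    """Density function for n(n+2): one residue class mod 2, two mod each odd prime."""
    return 1 if p == 2 else 2


def V(z, omega):
    """V(z) = product over p < z of (1 - omega(p)/p), as an exact fraction."""
    out = Fraction(1)
    for p in primes_below(z):
        out *= 1 - Fraction(omega(p), p)
    return out


if __name__ == "__main__":
    N, z = 30, 7
    A = [n * (n + 2) for n in range(1, N + 1)]
    print("S(A,P,z) =", sifting_function(A, z))
    print("X V(z)   =", N * V(z, omega_twin))
```

Here N is a multiple of 2*3*5, so every remainder $R_d$ with $d \mid P(7)$ vanishes and the two printed numbers coincide; in general they differ, and the theorem quantifies by how much.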

Inequality (1.6) coincides for $\log y/\log z \le \alpha_\kappa$ with the upper
bound from the Ankeny-Onishi theory [1], and (1.7) is, of course,
non-trivial only if $\log y/\log z > \beta_\kappa$. Our theorem is the natural
refinement of [1]: we begin with the Ankeny-Onishi upper sieve up
to $\alpha_\kappa$, and from there on proceed to improve on [1] by a combinatorial
device that has the same effect as infinitely many iterations
of Buchstab's identities. The Rosser-Iwaniec theory [4] uses no
'start-up' sieve; but what is an advantage when $\kappa \le 1$ turns out to
be a defect when $\kappa > 1$. Nevertheless, while the theorem is superior
for $\kappa > 1$ to both [1] and [4], it should be said that the gains
relative to [1], especially for larger $\kappa$, are only modest.

Our theorem may be used to construct a weighted sieve, as is
shown in the first part of Chapter 10 of [3]. The theorem itself,
with $\kappa = 2$, may be applied to show (on the basis of [8]) that the
maximal number N(n) of pairwise orthogonal Latin squares of order n
satisfies, for all sufficiently large n, the inequality
$$N(n) > n^{1/14.8}.$$
The details of the work described below will appear elsewhere
in due course.

2. From now on we shall use $\alpha$, $\beta$, F and f without the suffix $\kappa$ when
it is clear that we are working with a particular $\kappa > 1$. It is easy
to check that if F, f are solutions of (1.2), (1.3) and (1.4) such
that
$$Q(u) := F(u) - f(u) > 0 \quad\text{for } u > 0, \qquad (2.1)$$

then F(u) decreases and f(u) increases, each towards 1, as $u \to \infty$, so
that we may replace (1.5) by (2.1). Introduce also

$$P(u) := F(u) + f(u), \quad u > 0,$$

so that (1.4) is equivalent to

$$P(u) = 2 + O(e^{-u}) \quad\text{and}\quad Q(u) = O(e^{-u}) \quad\text{as } u \to \infty \qquad (2.2)$$

and (1.2), (1.3) together imply that

$$uP'(u) = -\kappa P(u) + \kappa P(u-1) \qquad (2.3)$$
and
$$uQ'(u) = -\kappa Q(u) - \kappa Q(u-1) \qquad (2.4)$$

for $u > \max(\alpha,\beta)$.

From here on we proceed as far as we can by the method of 'adjoint'
equations due to Iwaniec [4], and take full advantage of the
analytic tools he fashioned here. Thus the adjoint equations of
(2.3) and (2.4) are

$$(up(u))' = \kappa p(u) - \kappa p(u+1) \qquad (2.5)$$
and
$$(uq(u))' = \kappa q(u) + \kappa q(u+1), \qquad (2.6)$$

and these have solutions, regular in the half-plane Re u > 0,
normalized to satisfy

$$p(u) \sim u^{-1}, \quad q(u) \sim u^{2\kappa-1} \quad\text{for u real and } u \to \infty. \qquad (2.7)$$

The adjoint functions p and q derive importance from the fact that the 'inner products'

$$\langle P, p\rangle(u) := uP(u)p(u) + \kappa \int_{u-1}^{u} p(x+1)P(x)\,dx \tag{2.8}$$

and

$$\langle Q, q\rangle(u) := uq(u)Q(u) - \kappa \int_{u-1}^{u} q(x+1)Q(x)\,dx \tag{2.9}$$

are constant from $\max(\alpha,\beta)$ onward. Indeed, by (2.2) and (2.7) we have

$$\langle P, p\rangle(u) = 2 \quad\text{and}\quad \langle Q, q\rangle(u) = 0 \quad\text{if } u \ge \max(\alpha,\beta) \tag{2.10}$$

and, conversely, (2.10) and (2.7) together imply (2.2). The functions p and q are representable as Laplace transforms, having rather complicated expressions (see section 5 of [4]), and will not be given here. When $2\kappa \in \mathbb{N}$, q is in fact a polynomial of degree $2\kappa - 1$. For each $\kappa > 1$, q(u) possesses finitely many positive zeros and the largest of these, to be denoted by $\rho = \rho_\kappa$, plays a central role in all subsequent calculations. (One might expect this from the Rosser-Iwaniec theory for $\kappa \le 1$, where $\beta_\kappa = \rho_\kappa + 1$.) For any one $\kappa$, $\rho_\kappa$ has to be computed numerically, but it can be shown that*

$$\begin{aligned}
2\kappa - 1 &< \rho_\kappa \le \kappa + \sqrt{\kappa(\kappa-1)}, & 1 < \kappa \le 1.5,\\
\kappa + \sqrt{\kappa(\kappa-1)} &< \rho_\kappa \le \kappa + 1 + \sqrt{\kappa(\kappa - 3/2)}, & 1.5 < \kappa < 2,\\
2.843\kappa - 2 &< \rho_\kappa < D\kappa, & 2 \le \kappa,
\end{aligned} \tag{2.11}$$

where D = 3.59112... is the solution of D(log D - 1) = 1. In fact, Iwaniec [5] has proved that $\rho_\kappa \sim D\kappa$ ($\kappa \to \infty$), and we conjecture that $\rho_\kappa/\kappa$ is strictly increasing towards D as $\kappa \to \infty$. Some of the inequalities (2.11) can be sharpened if necessary.
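
As a numerical aside (a sketch only, not the authors' computation), the constant D and the largest zero of q can be checked directly when $2\kappa$ is an integer. The sketch assumes q is the monic polynomial of degree $2\kappa - 1$ solving the adjoint equation $(uq(u))' = \kappa q(u) + \kappa q(u+1)$; comparing coefficients of $u^i$ gives $(i + 1 - 2\kappa)a_i = \kappa\sum_{j>i}\binom{j}{i}a_j$. The function names `solve_D`, `adjoint_q_coeffs` and `largest_zero` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction
from math import comb, log


def solve_D(lo=3.0, hi=4.0, tol=1e-12):
    """Bisection for the root of D*(log D - 1) = 1 in [lo, hi]."""
    g = lambda d: d * (log(d) - 1.0) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def adjoint_q_coeffs(two_kappa):
    """Coefficients a_0..a_n (n = 2*kappa - 1 >= 1) of the monic polynomial q
    satisfying (u q(u))' = kappa*q(u) + kappa*q(u+1).  Assumes 2*kappa >= 2."""
    kappa = Fraction(two_kappa, 2)
    n = two_kappa - 1
    a = [Fraction(0)] * (n + 1)
    a[n] = Fraction(1)  # normalization q(u) ~ u^(2*kappa - 1)
    for i in range(n - 1, -1, -1):
        s = sum(comb(j, i) * a[j] for j in range(i + 1, n + 1))
        a[i] = kappa * s / (i + 1 - two_kappa)
    return a


def largest_zero(coeffs, step=1e-3):
    """Largest real zero of a monic polynomial: scan down from the Cauchy
    bound until the sign changes, then bisect."""
    q = lambda u: sum(float(c) * u ** j for j, c in enumerate(coeffs))
    hi = 1.0 + max(abs(float(c)) for c in coeffs[:-1])  # q > 0 here
    while hi > 0 and q(hi - step) > 0:
        hi -= step
    lo = hi - step
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if q(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    D = solve_D()
    print("D =", D)
    for two_kappa in (2, 3, 4, 5, 6):
        rho = largest_zero(adjoint_q_coeffs(two_kappa))
        print("kappa =", two_kappa / 2, "rho =", rho)
```

For $\kappa = 1$ this gives q(u) = u - 1, and for $\kappa = 3/2$ it gives $q(u) = u^2 - 3u + 3/2$, whose larger zero equals $\kappa + \sqrt{\kappa(\kappa-1)}$, the boundary case of (2.11).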

Our method is complicated but, from a technical point of view, rather simple. For the most part we rely heavily on interplay between the differential difference equations satisfied by the various functions in play and use of convexity and Taylor's theorem. One might say that most arguments come down to verification of inequalities linking p(u) and $\sigma(u)$ at values of u having the form $\rho_\kappa + a$ and $b\kappa + c$. Complications are to be expected since p(u) and $\sigma(u)$ are strangers to one another, and neither knows about $\rho$! In particular, we have to study the properties of $\sigma(u)$ more deeply than was done in [1] and to find out more about $\rho$ than is to be found in [1]; nevertheless we make extensive use of both these pioneering studies and also of [6].

(*) The value of $\rho_\kappa$ has been thoroughly investigated by Dr. F. Grupp in his Habilitationsschrift, Ulm, January 1986. We are grateful to him for making some of his results available to us earlier.


Our procedure is first to show that $\alpha \le \beta$ cannot occur (for $\kappa > 1$; for $\kappa = 1$ one obtains $\alpha = \beta = 2$). We do so in two steps: we show first that

$\alpha \le \beta - 1$ is impossible.

Otherwise, necessarily,

whereas, on the contrary, we are able to show that

$$\frac{\rho\, p(\rho)}{\sigma(\rho)} < 2. \tag{2.12}$$

Here is a typical instance of the kind of inequality mentioned earlier. We know that $up(u) \uparrow 1$ as $u \to \infty$, and that $\sigma(u) \uparrow 1$ as $u \to \infty$, so that certainly $up(u)/\sigma(u) < 2$ if u is large enough. Indeed, since $\rho p(\rho) < 1$ trivially, it would suffice to show that $\sigma(\rho) > \tfrac12$ or, for $\kappa \ge 2$ at least, that $\sigma(2.843\kappa - 2) > \tfrac12$. The tables suggest that $\sigma(2\kappa - \tfrac{2}{10}) > \tfrac12$ for all $\kappa \ge 1$. We can prove the first of these inequalities for $\kappa$ sufficiently large; and so far we have proved only that $\sigma(2\kappa) > 0.4$ for $\kappa \ge 8$. Fortunately the behaviour of $(a\kappa + b)p(a\kappa + b)$, $a, b \ge 0$, comes to our aid: this quantity decreases as a function of $\kappa$ and is almost constant (close to $\frac{a}{a+1}$) for long ranges of values of $\kappa$. Hence we may get away with weaker information about $\sigma$. Roughly speaking, we have least difficulty with $\kappa$ beyond 5 or 6, just where numerical computation gets rapidly out of hand; and find the small $\kappa$'s hardest to deal with theoretically, although here numerical computation of the highest precision is available.

Continuing with our story, we next rule out case

$$\beta - 1 < \alpha \le \beta.$$

This is harder, but here we end up with a necessary condition for this case to exist that is violated if we can find a $u_0$ between $\rho$ and $\beta$ such that

$$\frac{(u_0 - 1)\,p(u_0 - 1)}{\sigma(u_0)} < 2. \tag{2.13}$$

We can prove in this case that $\beta > \rho + \tfrac12$ -- and slightly better inequalities even -- so we have candidates for the role of $u_0$ and, once again, we are back to the kind of scenario I've described in connection with (2.12). (A useful observation here is that the expression on the left of (2.13) is strictly decreasing as a function of $u_0$.)

Subject to verification of (2.12) and (2.13) we have established now that for each $\kappa > 1$

$$\alpha > \beta.$$

We distinguish next two cases:

I. $\beta < \alpha < \beta + 1$,

II. $\beta + 1 \le \alpha$.

3. Case I. From the inner product relations we obtain

$$\text{(I)}\qquad
\begin{aligned}
\frac{\alpha p(\alpha)}{\sigma(\alpha)} + \kappa \int_{\beta-1}^{\alpha} \frac{p(x+1)}{\sigma(x)}\,dx &= 2,\\
\frac{\alpha q(\alpha)}{\sigma(\alpha)} - \kappa \int_{\beta-1}^{\alpha} \frac{q(x+1)}{\sigma(x)}\,dx &= 0,
\end{aligned}$$

and from these we are able to deduce that, necessarily,

$$\max(2,\rho) < \beta < \rho + 1. \tag{3.1}$$

Inequalities (3.1) tell us, when combined with earlier information about $\rho$, that $\alpha$ and $\beta$ are small when $\kappa$ is small; and numerical evidence tells us that Case I actually corresponds (cf. [5]) to the range

$$1 < \kappa < 1.8344323$$

with the right-hand limit corresponding to $\alpha = \beta + 1 = 4.8819016$.


We are able to show in this case that the equations (I) have a unique solution pair $\alpha$, $\beta$ -- with $\beta$ satisfying (3.1). Moreover, it is surprisingly simple to deduce from the fact that $\beta > \rho$, that Q(u) > 0 when u > 0.

4. Case II: $\alpha \ge \beta + 1$.

Here the inner product relations

$$\text{(II)}\qquad
\begin{aligned}
\frac{\alpha p(\alpha)}{\sigma(\alpha)} + \kappa \int_{\alpha-2}^{\alpha} \frac{p(x+1)}{\sigma(x)}\,dx + (\alpha-1)p(\alpha-1)f(\alpha-1) &= 2,\\
\frac{\alpha q(\alpha)}{\sigma(\alpha)} - \kappa \int_{\alpha-2}^{\alpha} \frac{q(x+1)}{\sigma(x)}\,dx - (\alpha-1)q(\alpha-1)f(\alpha-1) &= 0
\end{aligned}$$

lead us to the equation

$$\frac{\alpha}{\sigma(\alpha)}\{p(\alpha)q(\alpha-1) + q(\alpha)p(\alpha-1)\} + \kappa q(\alpha-1)\int_{\alpha-2}^{\alpha}\frac{p(x+1)}{\sigma(x)}\,dx - \kappa p(\alpha-1)\int_{\alpha-2}^{\alpha}\frac{q(x+1)}{\sigma(x)}\,dx - 2q(\alpha-1) = 0 \tag{4.1}$$

for $\alpha$, and if we can show that this equation has a solution $\alpha > \rho + 1$ we can show, surprisingly easily again, that Q(u) > 0 for u > 0. Since $\beta$ is then uniquely determined from

$$(\alpha-1)^{\kappa} f(\alpha-1) = \kappa \int_{\beta}^{\alpha-1} \frac{x^{\kappa-1}}{\sigma(x-1)}\,dx,$$

with $f(\alpha-1)$ given by

$$(\alpha-1)q(\alpha-1)f(\alpha-1) = \frac{\alpha q(\alpha)}{\sigma(\alpha)} - \kappa \int_{\alpha-2}^{\alpha} \frac{q(x+1)}{\sigma(x)}\,dx,$$

we are finished.

To show that (4.1) has a solution greater than $\rho + 1$ is rather a complicated business: the function of $\alpha$ on the left of (4.1) is negative at $\alpha = \rho + 1$ -- this part fairly straightforward -- but then to show that at some stage between $\rho + 1$ and $4\kappa$ this function becomes positive requires good information about the finer distribution of the values assumed by $\sigma(u)$ and its derivatives.

5. Final discussion.

The critical numbers $\beta_\kappa$ -- the sifting limits in sieve language -- have to be computed. From present evidence it appears to be true that

and possibly that

On the other hand, Iwaniec conjectures that

$$\frac{\beta_\kappa}{\kappa} \to 2.44518586\ldots \quad\text{as } \kappa \to \infty,$$

and so converges to the same limit as the Ankeny-Onishi sifting limit. In other words, for large $\kappa$ our sieve is not significantly better than the one-step A.-O. sieve.
As Iwaniec remarks, this demonstrates the power of the Selberg upper sieve. The question remains: is there a 'start-up' sieve better than Selberg's when $\kappa > 1$?

References.

1. N. C. Ankeny and H. Onishi, The general sieve, Acta Arith. 10 (1964/65), 31-62.

2. H. Diamond and H. Halberstam, The Combinatorial Sieve, to appear in the Proceedings of the Math. Science Conference on Number Theory 1983, Springer Lecture Notes 1985.

3. H. Halberstam and H.-E. Richert, Sieve Methods, Academic Press, 1974.

4. H. Iwaniec, Rosser's Sieve, Acta Arith. 36 (1980), 171-202.

5. H. Iwaniec, J. van de Lune and H. J. J. te Riele, The limits of Buchstab's iteration sieve, Indag. Math. Proc. A 83(4), (1980).

6. D. Rawsthorne, Improvements in the small sieve estimate of Selberg by iteration, Ph.D. thesis, University of Illinois, 1980.

7. H. J. J. te Riele, Numerical solution of two coupled nonlinear equations related to the limits of Buchstab's iteration sieve, Afdeling Numerieke Wiskunde, 86. Math. Centrum, Amsterdam, 1980, 15 pp.

8. R. N. Wilson, Concerning the number of mutually orthogonal latin squares, Discrete Math. 9 (1974), 181-198.

H.G. Diamond and H. Halberstam
University of Illinois
1409 West Green Street
Urbana, Illinois 61801, U.S.A.

H.-E. Richert
Universität Ulm (MNH)
Abt. für Mathematik III
Oberer Eselsberg
7900 Ulm (Donau)
West Germany
PRIMES IN ARITHMETIC PROGRESSIONS
AND RELATED TOPICS

John Friedlander

0. Introduction.

This paper (talk) has a dual purpose. The first is to report


without proof some of the results of recent collaborative work on a
number of multiplicative topics. These topics are connected by a
thread which we shall follow in the reverse order so that in fact
the work in each section was to a greater or lesser extent motivated
by the work in the subsequent sections.

The second purpose is to publicize Iwaniec's recent (version of


the) proof of Burgess' estimate [4] for character sums. Although
this proof uses essentially the same ingredients as the earlier
ones, it seems to this author to be much simpler. I am grateful to
my friend Henryk Iwaniec for allowing me to include his proof here,
and for his comments on the first draft of this paper.

I should like to dedicate this paper to Keith who spent his


third birthday without his father who was giving this talk at
exactly that time.

1. Primes in Arithmetic Progressions.

The results in this section represent work done jointly by the


author with E. Bombieri and H. Iwaniec (to appear in [3] to which we
shall refer as B-F-I) as well as related recent results of Fouvry
[7,8]. These are concerned with estimates of the type

$$\sum_{q \le Q}\ \max_{y \le X}\ \max_{(a,q)=1} \Big|\psi(y;q,a) - \frac{y}{\varphi(q)}\Big| \ll_A X(\log X)^{-A}$$

for arbitrary A > 0. The famous Bombieri-Vinogradov theorem [2,19] gives the above for $Q = X^{1/2-\varepsilon}$, while the conjecture of Elliott-Halberstam predicts that it even holds for $Q = X^{1-\varepsilon}$.


In attempting to prove results with exponent beyond 1/2, we are first of all led to drop the expression $\max_y$. This is not a serious restriction since it is known (see for example [12, lemma 1]) that the resultant weakening of the inequality is only apparent. A second concession we make is to drop the expression $\max_a$. This is a more serious deficiency which is necessitated by the methods at our disposal, but nevertheless is not a hindrance for most applications.

Problem. We want to show that, for arbitrary weights $\gamma_q$ not too large (say bounded by a power of d(q)), and for some fixed $\delta > 0$, $Q = X^{1/2+\delta}$, we have

The requirement of (*) for arbitrary $\gamma_q$ includes the case of absolute values (take $\gamma_q = \mathrm{sgn}(\psi(X;q,a) - X/\varphi(q))$) and, by Cauchy's inequality, is no more difficult than this special case. Although, in this generality, the above goal has not yet been reached, there have been a number of successes in proving (*) for certain special classes of $\gamma_q$. The first such results are due to Fouvry and Iwaniec [9] and then to Fouvry [6].
Results for even the simplest of weights $\gamma_q$ have interesting applications. Thus B-F-I and Fouvry [7] independently proved

Theorem. If $\gamma_q$ is identically one then (*) holds with any $Q \le X^{1-\varepsilon}$.

Corollary. (Titchmarsh divisor problem). For $a \ne 0$, A > 0, we have

$$\sum_{|a| < n \le X} \Lambda(n)\,d(n+a) = c_1(a)X\log X + c_2(a)X + O(X\log^{-A}X).$$

Previous proofs of this asymptotic formula were not strong enough to give the second main term, giving only an error of order X log log X.

Definition. We say that the weights $\lambda$ are well-factorable of level Q if for every decomposition $Q = Q_1Q_2$, $1 \le Q_1$, $1 \le Q_2$, there exists a decomposition $\lambda = \lambda_1 * \lambda_2$ (Dirichlet convolution) with $\lambda_j$ having support on $[1, Q_j]$ and $|\lambda_j| \le 1$.

Improving previous results of [9,6] B-F-I shows

Theorem. (*) holds for any weights $\{\gamma_q\}$ well-factorable of level $Q = X^{4/7-\varepsilon}$.

The importance of the well-factorable weights is due to their


appearance in the Iwaniec error term [16] in the linear sieve. This
now gives

Corollary. For $X > X_0(\varepsilon)$ the number of pairs of twin primes up to X is no more than $(7/2 + \varepsilon)$ times the expected number.

The basic problem described above is attacked by the dispersion


method. A combinatorial identity (such as that due to Heath-Brown
[13]) is used to replace sums over primes by bilinear forms. These
are estimated by a variety of methods appealing mainly to the work
of Deshouillers-Iwaniec [5]. In this estimation the degree of
flexibility of the weights becomes significant. For the
extremely flexible special weights above the situation is rather
favorable. In fact, the bulk of B-F-I is devoted to the extension
of (*) to classes of weights far less flexible.

We conclude this section by mentioning some spectacular recent


work of Fouvry [8] which makes heavy use of B-F-I (and requires much else besides).

Theorem. (Fouvry). There exists $\delta > 0$ such that for a positive proportion of $p \le X$ the greatest prime factor of p-1 exceeds $p^{2/3+\delta}$. (In [8] Fouvry gives $2/3 + \delta = 0.6687$.)

The above problem has been studied extensively and the improvement here although quantitatively small was pursued with strong motivation. In fact Fouvry is able to prove that the same result holds even when one restricts p to the arithmetic progression $p \equiv 2 \pmod 3$. Combining this with a generalization, due to Adleman and Heath-Brown, of the Sophie Germain criterion one gets

Corollary. (Adleman, Fouvry, Heath-Brown). For infinitely many primes p the first case of Fermat's last theorem is true; that is

2. Divisor Problems.

The work in this section was done jointly with Iwaniec. We


were concerned with the problem of proving the expected asymptotic
formulae

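$$\sum_{\substack{n \le X \\ n \equiv a\,(q)}} d_r(n) \sim \frac{X}{\varphi(q)}\,P_r(\log X)$$

(where $P_r$ is a certain polynomial of degree r-1) uniformly for $q < X^{\theta_r - \varepsilon}$, for all $\varepsilon > 0$, with the object of making $\theta_r$ as large as we could.

With the exception of $\theta_2$ and $\theta_4$ we were able to improve the known results as shown below.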

  r       old θ_r       due to                      new θ_r

  2       2/3           Hooley, Linnik, Selberg     no change
  3       1/2                                       1/2 + 1/230
  4       1/2           Linnik                      no change
  5       8/(3r+4)      Lavrik                      9/20
  6       8/(3r+4)      Lavrik                      5/12
  ≥ 7     8/(3r+4)      Lavrik                      8/(3r)

Here the proof for $r \ge 5$ (which will appear in [11]) depends on a result of Iwaniec [17] which in turn rests on the Burgess estimate for character sums and the Halasz-Montgomery method as refined in [14].

The proof for r = 3 is completely different and of greater novelty. The fact that $\tfrac12 + \tfrac{1}{230} > \tfrac12$ provided some of the motivation for the work in Sec. 1 (although it eventually disappeared from the proof), and in turn the work on $\theta_3$ resulted in consequence of ...

3. Kloosterman Sums.

The results of this section also represent work done jointly


with Iwaniec. The details of proof, including the application to $\theta_3$, are given in [10].

We let q be prime (although results of the same "essential" strength hold for composite q). The simplest variant of our results here is the following estimate.

Let $1 \le A < N$, $AN < q$, $n\bar{n} \equiv 1 \pmod q$. Then

$$\sum_{1 \le a \le A}\ \Big|\sum_{\substack{M < n \le M+N \\ (n,q)=1}} e\Big(\frac{a\bar{n}}{q}\Big)\Big|$$

As an illustration of the strength of this result let us take $N = q^{1/2}$. Here an application of Weil's estimate to the inner sum does not improve the trivial estimate whereas a simple computation shows that the above estimate is non-trivial for $q^{\varepsilon} < A < q^{1/2-\varepsilon}$. The above result is proved by modifying the ideas used in the Burgess estimate for character sums and (as does that estimate) appeals to Weil's "Riemann hypothesis for curves".

For application to $\theta_3$ it was necessary to develop a non-trivial


estimate for the sum

~
1(m(M
(m,q)=1 (n,q)=1

Given the presence of the extra variable it is not surprising


that the proof here was based on a modification of the Burgess ideas
which then appealed to Deligne's "Riemann hypothesis for
varieties". For the two varieties considered here the question of
the applicability of the Deligne theory was far from straightforward, following in the one case from a result of Hooley [15] and in the other from a result of Birch and Bombieri [1].

4. Character Sums.

We now proceed to Iwaniec's elegant proof of the Burgess estimate. As already mentioned this proof utilizes essentially the same tools and in particular draws its strength from the same main lemma.

Lemma. (Burgess). For $\chi$ a non-principal character modulo the prime q and k a positive integer we have

For simplicity we restrict to prime modulus q. We seek to estimate the sum

$$S = \sum_{N < n \le N+H} \chi(n).$$

Employing an idea used by I.M. Vinogradov and by A.A. Karatsuba we translate the interval by a product

$$S = \sum_{N < n \le N+H} \chi(n + ab) + T(a,b)$$

where a, b are integers and

$$T(a,b) = \sum_{N < n \le N+ab} \chi(n) - \sum_{N+H < n \le N+H+ab} \chi(n).$$

If (a,q) = 1 we have

$$S = \chi(a) \sum_{N < n \le N+H} \chi(\bar{a}n + b) + T(a,b).$$
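
The translation step can be checked numerically. The following is a minimal sketch, assuming $\chi$ is the Legendre symbol modulo an odd prime q (one particular non-principal character, chosen here only for illustration); the helper names `legendre`, `char_sum` and `shifted_decomposition` are hypothetical.

```python
def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    n %= q
    if n == 0:
        return 0
    return 1 if pow(n, (q - 1) // 2, q) == 1 else -1


def char_sum(q, start, length):
    """Sum of chi(n) over start < n <= start + length."""
    return sum(legendre(n, q) for n in range(start + 1, start + length + 1))


def shifted_decomposition(q, N, H, a, b):
    """Return (main, T) with S = main + T, where
    main = chi(a) * sum_{N < n <= N+H} chi(a_inv * n + b) and
    T    = sum_{N < n <= N+ab} chi(n) - sum_{N+H < n <= N+H+ab} chi(n)."""
    a_inv = pow(a, -1, q)  # inverse of a modulo q (Python 3.8+)
    main = legendre(a, q) * sum(
        legendre(a_inv * n + b, q) for n in range(N + 1, N + H + 1)
    )
    T = char_sum(q, N, a * b) - char_sum(q, N + H, a * b)
    return main, T


if __name__ == "__main__":
    q, N, H = 101, 17, 40
    S = char_sum(q, N, H)
    for a in range(1, 6):
        for b in range(1, 5):
            main, T = shifted_decomposition(q, N, H, a, b)
            print(a, b, S == main + T)
```

The identity holds for every pair (a, b) with (a, q) = 1; the proof only needs ab < H so that T(a,b) is a pair of shorter sums accessible to induction.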

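Here T(a,b) consists of two sums of length ab. We think of ab as being less than H and we shall attempt to prove some result by induction. We sum the last identity over a, b with $1 \le a \le A$, $1 \le b \le B$, (a,q) = 1. The number J of such pairs is $\gg AB$ and

$$J|S| \le \sum_a \sum_n \Big|\sum_b \chi(\bar{a}n + b)\Big| + \sum_{a,b} |T(a,b)|.$$

In the first sum we make a single "longer" variable $y = \bar{a}n$ getting

$$\sum_a \sum_n \Big|\sum_b\Big| = \sum_{y \,(\mathrm{mod}\ q)} \nu(y)\, \Big|\sum_{1 \le b \le B} \chi(y+b)\Big|$$

where

$$\nu(y) = \#\{(a,n) : 1 \le a \le A,\ (a,q) = 1,\ N < n \le N+H,\ \bar{a}n \equiv y\ (q)\}.$$

By Hölder's inequality

$$\sum_a \sum_n \Big|\sum_b\Big| \le \Big\{\sum_{y(q)} \nu(y)\Big\}^{1-\frac1k} \Big\{\sum_{y(q)} \nu(y)^2\Big\}^{\frac{1}{2k}} \Big\{\sum_{y(q)} \Big|\sum_{1 \le b \le B} \chi(y+b)\Big|^{2k}\Big\}^{\frac{1}{2k}}.$$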
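
The three-factor form of Hölder's inequality used here (exponents 1 - 1/k, 1/2k, 1/2k, which sum to 1) can be sanity-checked on random data. This is only an illustrative sketch: the arrays below are arbitrary non-negative numbers standing in for $\nu(y)$ and for the inner sums, and `holder_sides` is a hypothetical helper name.

```python
import random


def holder_sides(nu, s, k):
    """Return (lhs, rhs) of
    sum nu*s <= (sum nu)^(1-1/k) * (sum nu^2)^(1/2k) * (sum s^(2k))^(1/2k)
    for non-negative sequences nu and s of equal length."""
    lhs = sum(v * x for v, x in zip(nu, s))
    rhs = (
        sum(nu) ** (1 - 1 / k)
        * sum(v * v for v in nu) ** (1 / (2 * k))
        * sum(x ** (2 * k) for x in s) ** (1 / (2 * k))
    )
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (1, 2, 3, 5):
        nu = [rng.randint(0, 7) for _ in range(200)]   # stand-in for nu(y)
        s = [rng.uniform(0, 10) for _ in range(200)]   # stand-in for |sum_b chi(y+b)|
        lhs, rhs = holder_sides(nu, s, k)
        print(k, lhs <= rhs)
```

For k = 1 the first factor has exponent 0 and the inequality reduces to Cauchy's inequality.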

The latter factor may be estimated at once by the main lemma. To estimate the former we note that

$$\sum_{y(q)} \nu(y)^2 = \#\{(a_1, a_2, n_1, n_2) : \bar{a}_1 n_1 \equiv \bar{a}_2 n_2 \bmod q\}.$$

There are no more than 2AH choices of the pair $(a_2 - a_1, n_1)$. Each such choice determines $(a_2 - a_1)n_1$ and hence it determines $a_1(n_2 - n_1)$ modulo q. We assume that AH < q/2, so then $a_1(n_2 - n_1)$ is determined. (It is easily checked that with the choice of A we shall later make, in case $AH \ge q/2$ the result follows from the Polya-Vinogradov inequality.) We thus have

(A more careful estimate would allow $q^{\varepsilon}$ to be replaced by log q.)


Combining our estimates we have, for some positive ck = c(k,£),

£
q
(**)
-1 \'
+ J L. IT(a,b)1 •
a,b

Assume for the moment that we can ignore (by induction) the
last sum and fix attention on the rest. Since A and B occur with
negative exponents we should like them large; for the induction we
are constrained to AB < H, say AB = H/2. A little thought shows
that $B = q^{1/2k}$ is optimal and this determines A as well.
Substituting these values we see that we can do no better than
obtain an inequality

and using (**) and induction we do just that.


The induction is begun by noting that, for $H \le 2q^{1/4}$, the result is trivial. To deduce it for H, we assume it up to H/2, choosing A and B as above. Substitution in (**) shows that we require

Provided that $k \ge 2$ and $A_k$ is sufficiently large in terms of $c_k$, this is clearly possible.

References.

[1] B. J. Birch and E. Bombieri, On some exponential sums, appendix to [10], Annals of Math. 121 (1985), 345-350.

[2] E. Bombieri, On the large sieve, Mathematika 12 (1965), 201-225.

[3] E. Bombieri, J. B. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli, to appear in Acta Math.

[4] D. A. Burgess, On character sums and L-series II, Proc. London Math. Soc. (3) 13 (1963), 524-536.

[5] J.-M. Deshouillers and H. Iwaniec, Kloosterman sums and Fourier coefficients of cusp forms, Invent. Math. 70 (1982), 219-288.

[6] E. Fouvry, Autour du theoreme de Bombieri-Vinogradov, Acta Math. 152 (1984), 219-244.

[7] E. Fouvry, Sur le probleme des diviseurs de Titchmarsh, preprint (1984).

[8] E. Fouvry, Theoreme de Brun-Titchmarsh, application au theoreme de Fermat, Invent. Math. 79 (1985), 383-407.

[9] E. Fouvry and H. Iwaniec, Primes in arithmetic progressions, Acta Arith. 42 (1983), 197-218.

[10] J. B. Friedlander and H. Iwaniec, Incomplete Kloosterman sums and a divisor problem, Annals of Math. 121 (1985), 319-350.

[11] J. B. Friedlander and H. Iwaniec, The divisor problem for arithmetic progressions, Acta Arith. XLV (1985), 273-277.

[12] D. R. Heath-Brown, Primes in 'almost all' short intervals, J. London Math. Soc. (2) 26 (1982), 385-396.

[13] D. R. Heath-Brown, Prime numbers in short intervals and a generalized Vaughan identity, Can. J. Math. 34 (1982), 1365-1377.

[14] D. R. Heath-Brown and H. Iwaniec, On the difference between consecutive primes, Invent. Math. 55 (1979), 49-69.

[15] C. Hooley, On exponential sums and certain of their applications, Journees Arith. 1980, Armitage, J. V. ed., Cambridge (1982), pp. 92-122.

[16] H. Iwaniec, A new form of the error term in the linear sieve, Acta Arith. 37 (1980), 307-320.

[17] H. Iwaniec, On the Brun-Titchmarsh theorem, J. Math. Soc. Japan 34 (1982), 95-123.

[18] A. F. Lavrik, A functional equation for Dirichlet L-series and the problem of divisors in arithmetic progressions, Izv. Akad. Nauk SSSR Ser. Mat. 30 (1966), 433-448 (= Transl. A.M.S. (2) 82 (1969), 47-65).

[19] A. I. Vinogradov, On the density hypothesis for Dirichlet L-functions, Izv. Akad. Nauk SSSR Ser. Mat. 29 (1965), 903-934; correction ibid. 30 (1966), 719-720.

J. Friedlander
Scarborough College
University of Toronto
Scarborough, M1C 1A4 Canada

APPLICATIONS OF GUINAND'S FORMULA

P. X. Gallagher

The explicit formula of Weil [21] connects quite general sums over primes with corresponding sums over the critical zeros of the Riemann zeta function (or more general L-functions). In the earlier version of Guinand [8], there is on the Riemann hypothesis 1) a kind of Fourier duality between the differentials of the remainder terms in the prime number theorem (suitably renormalized) and in the formula counting critical zeros of the Riemann zeta function.

According to Weil [22], analytic number theory, which deals with inequalities and asymptotic formulas, is not number theory but analysis. Nowhere is this more true than in our first topic, which is the relation between and bounds for these two remainder terms. It is convenient to begin in a general context, consisting of a function Z = Z(s) meromorphic on the s-plane ($s = \sigma + it$), and satisfying for some positive integer k the conditions

($n_k$) In $\sigma > 0$, Z has only finitely many zeros; in each vertical strip, Z has only finitely many poles and is of order < k.

and

(p) In a right half plane $\sigma > \sigma_1$, the logarithmic derivative of Z is given by an absolutely convergent Dirichlet series,

$$Z'/Z(s) = \sum_{\nu} c(\nu)e^{-\nu s}$$

with arbitrary complex coefficients, $\nu$ running over a sequence of positive numbers bounded away from 0.

1) In compensation for the extra hypothesis, while in Weil's formula the function which is summed over zeros must be holomorphic in a strip containing the critical strip, Guinand can sum certain functions with compact support on the critical line.

Research supported in part by NSF Grant DMS 82-02633

We denote by $\rho = \beta + i\gamma$ a typical zero or pole of Z, by $m(\rho)$ the order of Z at $\rho$ ($\pm$ the multiplicity of the zero or pole at $\rho$), and put N(0) = 0 and

$$N(T) = \pm\Big(\sum_{\beta = 0} m(\rho) + 2\sum_{\beta > 0} m(\rho)\Big) \quad\text{for } T \gtrless 0$$

where the sum is over the $\rho$ with $\gamma$ between 0 and T, the terms with $\gamma = 0$ or T weighted by a factor 1/2. It follows from the argument principle that

N(T) = M(T) + 8(T) - 8(0) (1)


with
T
1
M(T)
11 I Re Z'/Z(it) dt
0
and

8(T) = 1 I 1m Z'/Z(o + iT) do •


11 0

In fact, since Z' /Z(s) + 0 exponentially as 0 + 00 , N(T) is finite


and
N(T) = 2!i IC Z'/Z(s) ds,

where C goes in straight lines from 00 + iT to iT to 0 to 00 and a


Cauchy principal value is taken at each p on C. On taking the real
part (1) follows. The integrands in the formula for M and 8 are
undefined at the p but these singularities may be removed, and then
the integrands are real-analytic. It follows that M is real-
analytic.
For real U, let

P(U) c(v)

where the dash indicates that only half of the possible term with
v = U is taken, and let

Q(U) e iyU _1
L 2 - - )m(p) ,
iy
e>o

where s* -a + it for s a + it. We define R by

p(U) Q(U) + R(U) (U ) 0) (2)

and R(-U) = - R (U) for U < o.


For our applications it will suffice to have a Guinand formula
with weight functions f = f(u) defined on R and satisfying the
condition

(Wk ) f •••• • f(k-2) ~ continuous. f(k-l) and f(k)


piecewise continuous 2 ) and f ..... f(k) ~

For such f. the function

J e Su f(u)du

is holomorphic in lal < a2 and is O(ltl-k ) in each closed


substrip. In particular. the Fourier transform g(t) = f(it) is real
analytic.

Theorem 1. 16 z -6 at-i6 Me-6 (n k ) and (p) and f -6 at-i6 M-u (Wk ) and ha.l.l
no d-iA co n-t.inuLt.iu at -the ± \I • -then

J f(u) dR(u) = J g(t) dS(t). (3)

The right side of (3) is

where SO(O) =0 and SO(t) = S(t) - S(O±) for t ~ O. Thus with

2) A function is piecewise continuous i f it has only finitely many


discontinuities at each of which its value is the average of left
and right limits.

Ro(u) = R(u) - (S(O+) + S(O-»)u, we have (under the same hypotheses


as above)

If also f(O) 0, then

J f(u)dRO = J g(t)dSOO(t)
-00

where SOO(t) = SO(t) + M'(O)t. At their origins, Ro and So vanish


to first order, while Rand SOO vanish to second order. We have

and
p(u) = QO(u) + Ro(u)

where explicitly NO differs from N by the omission of the real non-


negative zeros and poles.

The proof of (3) given in Section 1 follows Wei! [21] with a


simplification arising from the hypothesis of only finitely many
zeros and poles in a > O. Another real part argument replaces the
use of the functional equation in [21] and thus allows us to defer
the definition of the gamma factor to Section 2.

Strengthening conditions (nk ) and (p), we now assume

(Nk ) z ~.£!. order <k + 1. Z has only zeros


.2!!. a = 0 ; ~ a f. 0, Z has only real
~ and poles 3 ): only ~~ only
poles .2!!. a > 0; .2!!. a < 0, with only finitely
many exceptions, only ~.!i. k :: 1 ~ 2
(mod 4) and only poles .!i. k :: 0 ~ 3 (mod 4);
and either

3)For k=l, this could be weakened to read: in a < 0, the zeros and
poles of Z have bounded imaginary part.

(P) the coefficients c(v) .!!!.. (p) .!!!. real and


all positive ~ all negative according ~

Z has only ~~ only poles ~ (J > 0,


or
(plf) ~..!!..!. function zlf satisfying (Nk ) and
(P) for which Ic(v)1 « Iclf(v)l.

In Section 2 we show by the usual gamma factor arguments that (Nk )


and (p) imply (nk)' and also derive a trivial bound for S and some
qualitative properities of M'. In Section 3, we use these facts,
together with (300) and (3), to give rather parallel proofs of
"dual" bounds for Ro and S, each in terms of the two functions M'
and Q':

Theorea 2. 16 Z ~at~6ie6 (Nk ) and (P), then 6o~ T ) 2 and U ) 2,

RO(U) < IQ'(u)1 + 1 + fT IM'(t)1 + t k - 1 (5)


T t dt,
1

and k-l U
Set) < IM'(T)I + T + f IQ'(u)1 + 1 duo (6)
U 1 u

16 Z ~at~6ie6 (Nk ) and (plf), then the ¢ame boun~ hold wah IQ'I
and IMI ~eptaced by IQ'I + IQIf'1 and IM'I + IM"I.

In the proof, first (5) is derived, using Theorem 1, from the


trivial bound on S mentioned above. Then the analogous trivial
special case of (5) is used with Theorem 1, in an analogous way, to
get (6).

If there are no positive fl, so that Q' - 0, then the optimal


choices for T and U in (5) and (6) are T =1 and U = IM'(T)I + Tk - 1 ,
giving
Ro(U) <1

Set) < log (IM'(T)I + 2Tk - 1).

If there are positive fl and b is the largest of these, so that


Q'(u) < e bu , then suitable choices of T and U in (5) and (6) give
e.g.

{U2(1
if M'(T) ~ log T and k I',
< - l/k)bU
e i f WeT) < Tk - 1 •

(For k > 1, the logs in (6 0 ) and (6 b ) are = log T.)

The simplest example of a function Z satisfying the hypothesis ($N_1$) and (P) is given by $Z(s) = 1 - e^{-s}$. Here Theorem 1 gives the Poisson sum formula. In this example, the sawtooth functions $R_0$ and S are bounded, but do not tend to zero. Here Q' = 0 and M' is constant, so ($5_0$) and ($6_0$) are best possible in this case.
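
For orientation, the Poisson sum formula itself is easy to test numerically. The sketch below does not implement Theorem 1; it only checks the classical special case of Poisson summation for a Gaussian weight, $\sum_n e^{-\pi n^2 t} = t^{-1/2}\sum_n e^{-\pi n^2/t}$, with the hypothetical helper name `theta` and a truncation parameter chosen for illustration.

```python
from math import exp, pi, sqrt


def theta(t, terms=50):
    """Truncated sum of exp(-pi * n^2 * t) over -terms <= n <= terms."""
    return sum(exp(-pi * n * n * t) for n in range(-terms, terms + 1))


if __name__ == "__main__":
    for t in (0.3, 1.0, 2.5):
        left = theta(t)
        right = theta(1.0 / t) / sqrt(t)
        # Both sides agree up to rounding: Poisson summation applied to
        # the Gaussian, whose Fourier transform is again a Gaussian.
        print(t, left, right)
```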

A second example with k = 1 is given by $Z(s) = \zeta(s + 1/2)$ where $\zeta$ is, on a Riemann hypothesis, the Riemann zeta function, or the
zeta function of an algebraic number field, or an ordinary primitive
Hecke L-function, or a Hecke L-function with grossencharacter.
Artin L-functions, as quotients of products of Hecke L-functions,
are then indirectly covered, directly on Artin's conjecture. In
these examples,

p(u) I A(d)x(d)(Nd)_lh
Nd;; e U
and

Q(u) -4 sinh 2u + 2u or 0

according as X is principal or nonprincipal. Here Theorem 1 is


Guinand's formula (for weights satisfying (WI». In these cases
M'(t) = log t for t + "" , so Theorem 2 gives the standard R.H.
estimates R(u) < u 2 of von Koch [11] and Set) < (log t)/loglog t
of Littlewood [12]; for the latter, there is also a proof due to
Selberg [18], using his approximate formula for S.

A third example, for k = 2, is given by Z(s) Zr(s + 1/2 ) where


Zr is the Selberg zeta function attached to a compact Riemann
surface of genus ) 2. Here Theorem 1 is a version of the Selberg
trace formula 4 ). In this case Q'(t) ~ 2e n / 2 and M'(t) =t for

4 )This does not give a new proof of the trace formula, which
logically preceeds the definition of Zr.

t + .. , so (5 b ) and (6 b ) give Ro(U) < e U/ 4 and S(T) < T/log T. The


bound on Ro is due to Randol [15], improving by a factor of U liz an
earlier estimate of Huber [10]. Randol [16] and Hejhal [9] have
given proofs, analogous to those of Littlewood and Selberg mentioned
in the previous example, for the bound on S.

More generally, for k > 2, we may take Z(s) = Zr(s+po'X) where


Zr is the Selberg zeta function attached to a compact space form of
a k-dimensional symmetric space of rank 1; here Po and X are as in
Gangolli [4]. For compact hyperbolic space forms, there are only
finitely many negative zeros or poles if k is odd, while there are
:: t k of them on [-t,O] for large t i f k is even. By (14) and (15)
below, this gives M'(t) <: t k - 1 in both cases. The corresponding
bound for Ro(U) in (5 b ) is due to Randol [17] (whose proof suggested
our proof of (5»; the bound S(T) < Tk- 1/log T from (6 b ) was
proved, in even greater generality, by Berard [I].

Guinand derived (3) (in the case Z(s) = I;;(s + 1/2 ) on R.H.) for
a wider class of weights starting with a special case of (3) which
he reformulated to show that RO(u)/u and SO(t)/t are connected by a
unitary operator closely related to the Fourier transform. In
Section 3 we use a similar operator to give a correspondence between
second moments in the distribution of zeros and primes in corres-
ponding short intervals:

Theon. 3.
then 6o~ po~~t~ve E + 0

(7)

Supposing in addition that M'(t) - clog It I as Itl+oo, we show


by a method of Mueller [14] that the integral on the right is
~ CE log 2 ~ , with asymptotic equality if and only if

m(iy)
I m(iy') - I y
O<y(T O<y(T (8)
1 2
(- 2" clog T)

for T + a> , ET + O. In the case Z(s) = r,;(s + 112 ) on R.H., it


follows from Fujii's second moment estimates for zeros in short
intervals [2] that the left side of (8), and therefore also (7), is
2 1
< E log ; . In this case, condition (8) is a consequence of
Mueller's "essential simplicity" condition for which she finds an
arithmetic equivalent, and which in turn is a consequence of
Montgomery's pair correlation conjecture [13]. For other second
moment correspondences, with various normalizations, see Goldston
and Montgomery [6] and the papers cited there and in [3]. An
unachieved goal of analytic number theory is to find some pair of
equivalent second moment asymptotic evaluations which can be proved
in the case Z(s) = r,;(s + 1;2) on R.H. making use of the arithmetic
nature of the coefficients. This is of course part of the more
general goal of explicating duality in this part of analytic number
theory: translating completely a definition of prime number into an
understanding of the critical zeros of the Riemann zeta function.

1. Guinand's formula
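It suffices to prove Theorem 1 for real $g$, i.e. for $f(-u) = \overline{f(u)}$. Since $R(-u) = -R(u)$, this gives

$$\int_{-\infty}^{\infty} f(u)\,dR(u) = 2\,\mathrm{Re}\int_0^{\infty} f(u)\,dR(u). \tag{9}$$

Following Weil [21], we next write, for $\sigma_1 < \sigma < \sigma_2$,

$$\int_0^{\infty} f(u)\,dP(u) = \lim_{T\to\infty} \frac{1}{2\pi i}\int_{\sigma - iT}^{\sigma + iT} \tilde f(s)\,Z'/Z(s)\,ds. \tag{10}$$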

This follows by Fourier inversion, at u = 0, from the formula

$$\tilde f(s)\,Z'/Z(s) = \int_{-\infty}^{\infty} e^{su} \sum_v c(v) f(u+v)\,du,$$

which is gotten from the product of the absolutely convergent
integral for $\tilde f(s)$ by the absolutely convergent series for $Z'/Z(s)$
by making changes of variable and reversing the order of summation
and integration. To justify the Fourier inversion, it suffices to
observe that both

and its derivative belong to $L^1(\mathbf{R})$ and are piecewise-continuous near
$u = 0$. In fact, the series converges uniformly, as does the series
with f replaced by f' since

Next, the order condition on $Z$ in vertical strips together with
the fact that $Z'/Z(s) \to 0$ exponentially as $\sigma \to \infty$ implies that for
$|T| \to \infty$

$$N(T+1) - N(T) + |S(T)| + \int_T^{T+1}\!\!\int_0^{\infty} |Z'/Z(\sigma + it)|\,d\sigma\,dt = o(T^k). \tag{11}$$

In fact, these bounds follow easily from the following standard
partial fraction approximation:

Lemma 1. (Jensen, Landau): If $h = h(z)$ is analytic and $|h(z)/h(0)| \le B$
in $|z| \le r$, then $h$ has $\ll_\lambda \log B$ zeros $z_i$ in $|z| \le \lambda r$, for each
$\lambda < 1$, and

The bound $\tilde f(s) \ll |t|^{-k}$ together with (11) justifies moving the
line of integration in (10) to $\sigma = 0$, giving

$$\int_0^{\infty} f(u)\,dP(u) = \lim_{T\to\infty}\Bigl\{ \sum_{\substack{\beta \ge 0 \\ |\gamma| \le T}} \tilde f(\rho)\,m(\rho) + \frac{1}{2\pi}\int_{-T}^{T} g(t)\,Z'/Z(it)\,dt \Bigr\}, \tag{12}$$

where each term with $\beta = 0$ is weighted with the factor $1/2$ and at each
pole of $Z'/Z$ on $\sigma = 0$ a principal value is taken. By the reflection

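principle, $\overline{\tilde f(\rho)} = \tilde f(\rho^*)$, so

$$2\,\mathrm{Re}\int_0^{\infty} f(u)\,dP(u) = \sum_{\beta > 0} \bigl( \tilde f(\rho) + \tilde f(\rho^*) - 2g(\gamma) \bigr)\,m(\rho) + \lim_{T\to\infty}\Bigl\{ {\sum_{\substack{\beta \ge 0 \\ |\gamma| \le T}}}' g(\gamma)\,m(\rho) - \int_{-T}^{T} g(t)\,dM(t) \Bigr\}, \tag{13}$$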

the dash indicating that in the last sum terms with $\beta > 0$ are
multiplied by two. The first sum is

$$\int_{-\infty}^{\infty} f(u)\,dQ(u) = 2\,\mathrm{Re}\int_0^{\infty} f(u)\,dQ(u),$$

since $Q(-u) = -Q(u)$ via $\rho^* = -\bar\rho$. Using $dN - dM = dS$ and combining
(13) with (9) gives

$$\int_{-\infty}^{\infty} f(u)\,dR(u) = \int_{-\infty}^{\infty} g(t)\,dS(t),$$

both integrals existing as symmetric limits.

2. Consequences of $(N_k)$ and (P) for $M'$ and $S$.

For each nonzero function $Z$ meromorphic of order $< k + 1$,
there is a gamma factor, i.e. a function $G$ meromorphic of order
$< k + 1$ with all zeros and poles in $\sigma < 0$ for which $X = GZ$ is real
on $\sigma = 0$. With the normalizations

$$X'/X(s) = \frac{m(0)}{s} + O(s^k), \qquad X(s) \sim (is)^{m(0)}$$

for $s \to 0$, $G$ is uniquely determined. In fact $X$ is $(is)^{m(0)}$ times the
standard genus $k$ Weierstrass product over the $\rho \ne 0$ with $\beta \ge 0$ and
the $\rho^*$ corresponding to $\rho$ with $\beta > 0$. On $\sigma = 0$, the factors with
$\beta = 0$ are real, and the factors corresponding to $\rho$ and $\rho^*$ for $\beta > 0$
are conjugate. Explicitly,5)

5) In certain cases of the Selberg zeta function of order 2, Vigneras
[20] has written $G$ explicitly as a finite product of Barnes double
gamma functions.

where $p$ is a polynomial of degree $k$ (essentially the negative of the
analogous polynomial in the corresponding expression for $Z$), and
$E_k(z)$ is the standard genus $k$ Weierstrass factor. From $E_k'/E_k(z) = z^k/(z-1)$ it follows that

$$G'/G(s) = p'(s) - \sum_{\beta < 0} \frac{(s/\rho)^k}{s - \rho}\,m(\rho) + \sum_{\beta > 0} \frac{(s/\rho^*)^k}{s - \rho^*}\,m(\rho).$$
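
The logarithmic derivative identity for the Weierstrass factor used above can be checked in two lines. The following is only a worked verification, assuming the usual definition $E_k(z) = (1-z)\exp(z + z^2/2 + \cdots + z^k/k)$ of the genus $k$ factor (this definition is not spelled out in the text):

```latex
\begin{align*}
\frac{E_k'}{E_k}(z)
  &= -\frac{1}{1-z} + \bigl(1 + z + \cdots + z^{k-1}\bigr)
   = \frac{-1 + (1 - z^k)}{1-z}
   = \frac{z^k}{z-1}, \\
\frac{d}{ds}\log E_k(s/\rho)
  &= \frac{1}{\rho}\,\frac{(s/\rho)^k}{(s/\rho) - 1}
   = \frac{(s/\rho)^k}{s-\rho}.
\end{align*}
```

The second line is the shape of each term in the sums for $G'/G(s)$.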

At this point we invoke hypothesis $(N_k)$ and get

(14)

where

a2 2 k-l m(8) (k even);


(
" + t )8

L(t) (15)

i k +1 ,
-I (k odd).
1f 8<0

In (15) the dash indicates that the sum is either over real negative
zeros or real negative poles. The contribution of any exceptional
poles or zeros, terms with $\beta > 0$, and $p'$ has been put in the o-term
in (14).

The function $L$ is even and non-negative in all cases. Since in
each sum on the right in (15) all terms have the same sign, it
follows that for $t > 0$

$L(t)/t^{k-2}$ increases and $L(t)/t^{k}$ decreases ($k$ even);

$L(t)/t^{k-1}$ increases and $L(t)/t^{k+1}$ decreases ($k$ odd).

In particular, in both cases $L$ changes slowly, i.e. changes only by
a bounded factor when $t$ changes by a bounded factor. Since $G$ has
$\ll r^{k+1-\delta}$ zeros and poles in $|s| \le r$, (15) also gives

(16)

in both cases.

We next show that in each vertical strip $|\sigma| \le \sigma_2$,

$$\mathrm{Re}\,G'/G(s) \ll L(t) + |t|^{k-1} \quad (t \text{ large}). \tag{17}$$

Since $p'$ has degree $k-1$, it suffices to observe that

$$\mathrm{Re}\,\frac{s^k}{(s - \beta)\beta^k}$$

which for $|\sigma| \le \sigma_2$ and $\beta \le -2\sigma_2$ is

according as $k$ is even or odd. The exceptional terms and other
terms with $\beta > -2\sigma_2$ contribute $\ll |t|^{k-1}$ in both cases.

Next we conclude from (17), using also (p), that

(t large). (18)

We may suppose that $\sigma_2 > \sigma_1$, so by (P) $Z$ is bounded on $\sigma = \sigma_2$. By
the reflection principle, $Z$ satisfies $GZ(s^*) = \overline{GZ(s)}$ (functional
equation!), so

$$Z(-\sigma_2 + it) \ll \exp\Bigl( \int_{-\sigma_2}^{\sigma_2} \mathrm{Re}\,G'/G(\sigma + it)\,d\sigma \Bigr).$$

It follows from (17) that (18) holds also for $\sigma = -\sigma_2$. For
$|\sigma| < \sigma_2$, put $F_s(w) = Z(w)\exp(-(w-s)^{4k})$. On the horizontal sides
of the rectangle (in the $w = u + iv$ plane) bounded by $u = \pm\sigma_2$ and
$v - t = \pm\tfrac{1}{2}t$ this function is bounded for large $t$ since $Z$ has
order $< k + 1$ and $\mathrm{Re}\,(w-s)^{4k} \gg t^{4k}$. On the vertical sides it is
bounded by the right side of (18) by what we have shown and the fact
that $L$ changes slowly. Since $Z(s) = F_s(s)$, the maximum principle
now gives (18) for $|\sigma| < \sigma_2$.
In particular, (18) and (16) show that $Z$ has order $< k$ in
vertical strips. Thus $(N_k)$ and (P) imply $(n_k)$.

Using (18), the proof of (11) now gives (for large T)

T+I
N(T+I) - N(T) + IS(T)I + f f Iz'/Z(o + it)1 dodt (19)
T 0

It follows that

$$\int_T^{T+1} |dS_{00}(t)| \ll L(T) + |T|^{k-1} \quad (\text{for large } T). \tag{20}$$

In fact, with $M_{00}(t) = M(t) - M'(0)t$, the integral is

$$\le \int_T^{T+1} |dN_0(t)| + \int_T^{T+1} |dM_{00}(t)| \ll |\Delta_T^{T+1} N_0| + |\Delta_T^{T+1} M| + |T|^{k-1};$$

here we have used the monotonicity of $N_0$ and the positivity of $L$.

3. Bounds for $R_0$ and $S$.


Beginning with $R_0$ in case (P), we have for $U > 1$, supposing all
$c(v) \ge 0$,

$$P(U) - P(1) = \int \varphi_{1,U}(u)\,dP(u) \lessgtr \int f^{\pm}(u)\,dP(u)$$

for any compactly supported continuous majorant/minorant $f^{\pm}$ of the
characteristic function $\varphi_{1,U}$ of $[1,U]$. On subtracting

from this and using $(3_{00})$, we get, provided $f^{\pm}$ is sufficiently
differentiable and $f^{\pm}(0) = 0$,

$$R_0(U) - R_0(1) \lessgtr \int_{-\infty}^{\infty} \bigl( f^{\pm}(u) - \varphi_{1,U}(u) \bigr)\,dQ_0(u) + \int_{-\infty}^{\infty} g^{\pm}(t)\,dS_{00}(t) \tag{21}$$

where $g^{\pm}$ is the Fourier transform of $f^{\pm}$. (If all $c(v) \le 0$, the
inequalities are reversed).

For $U > 2$ and $T > 2$ we take for $f^{\pm}$ in (21) the characteristic
function of the interval $[1 \mp T^{-1}, U \pm T^{-1}]$ convolved with $Tf(Tu)$,
where $f$ is any nonnegative $C^{\infty}$ function supported in $(-1,1)$ and
satisfying $\int f(a)\,da = 1$. The first integral in (21) is then in


modulus at most
I+T- 1
I I-T -1
The Fourier transform of $\varphi_{1,U}$ is $\ll |t|^{-1}$, and the Fourier
transform of $f$ satisfies $g(\theta) \ll (1 + |\theta|)^{-(k+2)}$, from which

Thus the second integral in (21) is

$$\ll \int_{-T}^{T} \frac{|dS_{00}(t)|}{|t|} + T^{k+2}\int_{|t| > T} \frac{|dS_{00}(t)|}{|t|^{k+3}}.$$

Because of the double zero of $S_{00}$ at $t = 0$, the interval $[-1,1]$
contributes $O(1)$ to the first integral. Using (20), we thus get the
bound

$$\ll \int_1^T \frac{L(t) + t^{k-1}}{t}\,dt + T^{k+2}\int_T^{\infty} \frac{L(t) + t^{k-1}}{t^{k+3}}\,dt.$$

The $t^{k-1}$ terms here contribute $\ll T^{k-1} + \log T$; since $L(t)/t^{k+1}$ is
decreasing and $L$ changes slowly, we have

$$T^{k+2}\int_T^{\infty} \frac{L(t)}{t^{k+3}}\,dt \le L(T) \ll \int_1^T \frac{L(t)}{t}\,dt.$$

Using $L(t) \ll |M'(t)| + t^{k-1} + 1$, this gives

$$R_0(U) \ll \frac{|Q'(U)|}{T} + 1 + \int_1^T \frac{|M'(t)| + t^{k-1}}{t}\,dt. \tag{22}$$

For example, the choice $T = 2$ gives the trivial bound

(23)

which will play the same role in getting a refined bound for $S$ as
(19) did in getting the bound (22) for $R_0$. First, it follows from
(23) that

$$\int_U^{U+1} |dR(u)| \ll |Q'(U)| + 1 \quad (\text{for large } U) \tag{24}$$

since the integral here is

$$\le \int_U^{U+1} |dP(u)| + \int_U^{U+1} |dQ(u)| = |\Delta_U^{U+1} P| + |\Delta_U^{U+1} Q| \ll |\Delta_U^{U+1} Q| + |\Delta_U^{U+1} R_0| + 1;$$

here we have made use of the monotonicity of $P$ and $Q$, which follows
from the assumptions (P) and (N).

Now we bound $S$. For each $T > 0$,

where this time $g^{\pm}$ is a majorant/minorant of the characteristic
function $\psi_T$ of $[0,T]$ and is the Fourier transform of a function
$f^{\pm}$ in $(W_k)$. On subtracting

from this and using Theorem 1 in reverse, we get

$$S(T) - S(0) \lessgtr \int_{-\infty}^{\infty} \bigl( g^{\pm}(t) - \psi_T(t) \bigr)\,dM(t) + \int_{-\infty}^{\infty} f^{\pm}(u)\,dR(u). \tag{25}$$

For the functions $g^{\pm}$ we use the following construction, which
was suggested by Goldston's use [5] of Selberg's kernels6) [7], [19]
for a related purpose in the case of the Riemann zeta function on
R.H.:

Lemma 2. Let $k$ be a positive integer. For each $L > 0$ there are
real functions $f_L^{\pm}$ supported on $(-1,1)$ with $k$ continuous
derivatives, whose Fourier transforms $g_L^{\pm}$ satisfy $g_L^{-} \le \psi_L \le g_L^{+}$ where $\psi_L$
is the characteristic function of $[0,L]$, and for
which $f_L^{\pm}(\eta) \ll |\eta|^{-1}$ and

the implied constants depending only on $k$.

6) For $k = 1$ we could use Selberg's kernels.

Proof. For each positive integer $k$, we put

with dk chosen so that

Thus ok is even and is the Fourier transform of a function in Ck - 1


supported in (-1,1). For 8 > 0, we have

foo
8 k+2
° (b)db < (1 + i8i)-(k+l)

Since

8-L
1 - (f + f ) 0k+2(b)db,
8

it follows that

For odd k, we have

It follows that for sufficiently large ck ' the functions

have all the required properties. This completes the proof, since
we may suppose k odd.
In (25), we take $f^{\pm}(u) = U^{-1} f_{TU}^{\pm}(uU^{-1})$. Thus $f^{\pm} \in C^k(-U, U)$
and $f^{\pm}(u) \ll |u|^{-1}$. Using (24), it follows that the second integral in
(25) is

$$\ll \int_{-U}^{U} \frac{|dR(u)|}{|u|} \ll \int_1^U \frac{|Q'(u)| + 1}{u}\,du,$$

the interval $[-1,1]$ contributing $\ll 1$.


From $g^{\pm}(t) = g_{TU}^{\pm}(tU)$, it follows that $g^{\pm}(t) \gtrless \psi_T(t)$, and

Since $M'(t) \ll L(t) + |t|^{k-1} + 1$, the first integral in (25) is

$$\ll \int_{-\infty}^{\infty} \frac{L(t) + |t|^{k-1} + 1}{(1 + |t|U)^{k+1}}\,dt + \int_{-\infty}^{\infty} \frac{L(t) + |t|^{k-1} + 1}{(1 + |t - T|U)^{k+1}}\,dt.$$

The $|t|^{k-1} + 1$ parts here contribute $\ll T^{k-1}/U$. Since $L$ changes
slowly,

$$\int_{T/2}^{2T} \frac{L(t)}{(1 + |t - T|U)^{k+1}}\,dt \ll \frac{L(T)}{U}.$$

The rest of the $L$ part of the second integral above is bounded by
the $L$ part of the first integral, which is

$$\ll \int_0^{\infty} \frac{L(t)}{(1 + tU)^{k+1}}\,dt \ll \int_0^{\infty} \frac{t^{k-\delta}}{(1 + tU)^{k+1}}\,dt \ll \frac{1}{U}.$$

This gives

$$S(T) \ll \frac{|M'(T)| + T^{k-1}}{U} + \int_1^U \frac{|Q'(u)| + 1}{u}\,du. \tag{26}$$

In case $(P^{\#})$ in the argument bounding $R_0$, the same choice
of $f^{\pm}$ gives

$$P(U) - P(1) = \int f^{\pm}(u)\,dP(u) + O\Bigl( \int_{1-T^{-1}}^{1+T^{-1}} |dP^{\#}(u)| + \int_{U-T^{-1}}^{U+T^{-1}} |dP^{\#}(u)| \Bigr).$$

Since

we get on using the bound corresponding to (22) for $R^{\#}$ that (22)
also holds for $R$, with $|Q'|$ replaced by $|Q'| + |Q^{\#\prime}|$ and $|M'|$ by
$|M'| + |M^{\#\prime}|$. In particular, we get the correspondingly modified
(23). To get the correspondingly modified (24), we use

$$\int_U^{U+1} |dP(u)| \le |\Delta_U^{U+1} P^{\#}|$$

in the displayed line below (24). From this point, the argument
proceeds as with case (P) to get the correspondingly modified (26).

4. A second moment correspondence


For $k = 1$, we may take for $f$ in $(3_0)$ the characteristic
function of the interval $[0,u]$ ($u > 0$) renormalized to take the
value $1/2$ at $0$ and $u$. This gives the special formula

$$R_0(u) = \lim_{T\to\infty} \int_{-T}^{T} \frac{e^{itu} - 1}{it}\,dS_0(t),$$

which is valid for all $u \ne 0$ by our definition of $R(u)$ for
$u < 0$. Following Guinand [8], this may be reformulated as

$$\frac{R_0(u)}{u} = -\lim_{T\to\infty} \int_{-T}^{T} \frac{S_0(t)}{t}\,h(tu)\,dt, \tag{27}$$

with

$$h(\theta) = \theta\,\frac{d}{d\theta}\,\frac{e^{i\theta} - 1}{i\theta} = e^{i\theta} - \frac{e^{i\theta} - 1}{i\theta}. \tag{28}$$

In fact, after integrating by parts and dividing by $u$, we get

$$\frac{R_0(u)}{u} = -\lim_{T\to\infty} \int_{-T}^{T} S_0(t)\,\frac{d}{dt}\,\frac{e^{itu} - 1}{itu}\,dt,$$

which is (27), the $\pm T$ terms vanishing in the limit since
$S_0(t) = o(t)$ for large $|t|$.
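
The two expressions for the kernel in (28) can be compared numerically. The sketch below is only an illustration: the helper names `F`, `h_closed` and `h_numeric` are hypothetical, and the derivative is approximated by a central difference rather than computed exactly.

```python
import cmath

def F(theta: float) -> complex:
    """(e^{i theta} - 1) / (i theta), the function differentiated in (28)."""
    return (cmath.exp(1j * theta) - 1) / (1j * theta)

def h_closed(theta: float) -> complex:
    """Right-hand form of (28): e^{i theta} - (e^{i theta} - 1)/(i theta)."""
    return cmath.exp(1j * theta) - F(theta)

def h_numeric(theta: float, eps: float = 1e-6) -> complex:
    """Left-hand form of (28): theta * d/d(theta) F, via a central difference."""
    return theta * (F(theta + eps) - F(theta - eps)) / (2 * eps)

if __name__ == "__main__":
    for theta in (0.5, 1.0, -2.3, 7.0):
        print(theta, abs(h_closed(theta) - h_numeric(theta)))
```

The printed differences should be at the level of the finite-difference error only.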

Lemma 3. The kernel $h$ defined in (28) gives a linear isometry

$$H : \varphi(t) \mapsto \lim_{T\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-T}^{T} \varphi(t)\,h(tu)\,dt$$

of $L^2(\mathbf{R})$ (with $H^{-1} = H^*$, i.e. unitary, but we don't need this).

Proof. It suffices to show that for $U \to \infty$,

$$\int_{-U}^{U} |H\varphi(u)|^2\,du \to \int_{-\infty}^{\infty} |\varphi(t)|^2\,dt \tag{29}$$

for $\varphi \in C_0^{\infty}(\mathbf{R})$, since $H$ then extends by continuity from this dense
subset to a linear isometry on all of $L^2(\mathbf{R})$. From (28) we get the
identity

from which it follows that

$$\int_{-U}^{U} |H\varphi(u)|^2\,du = \int_{-U}^{U} |F\varphi(u)|^2\,du - \bigl( |G\varphi(U)|^2 + |G\varphi(-U)|^2 \bigr)/U,$$

where $F$ is the Fourier transform and $G = H - F$. For $\varphi \in C_0^{\infty}(\mathbf{R})$, $G\varphi$
is bounded, so (29) follows from the corresponding (Plancherel)
formula for $F$.

Supposing now that, besides satisfying $(N_1)$ and (P), $Z$ has order
$< 3/2$, we have $L(t) \ll |t|^{1/2-\delta}$, so by (19) $S(t) \ll |t|^{1/2-\delta}$ for large
$|t|$. It follows that $S_0(t)/t \in L^2(\mathbf{R})$, from which (27) and the lemma
give $R_0(u)/u \in L^2(\mathbf{R})$.
For each $\lambda > 0$, (27) gives

$$\frac{R_0(\lambda u)}{\lambda u} = -\lim_{T\to\infty} \int_{-T}^{T} \frac{S_0(t/\lambda)}{t}\,h(tu)\,dt.$$

Combining this with (27) and the lemma gives

In this equation, $R_0$ and $S_0$ could be replaced by $R$ and $S$.

Next we will work towards an asymptotic evaluation of the
integral on the right as $\lambda \to 1+$. For convenience we replace $t$ by
$\lambda t$ and put $\lambda = 1 + \varepsilon$.

Supposing that $M'(t) \to \infty$ as $t \to \infty$, Theorem 2 gives $S(t) \ll M'(t)/\log M'(t)$, from which for $\lambda \asymp 1$

$$\int_T^{\infty} \Bigl( \frac{S(\lambda t) - S(t)}{t} \Bigr)^2 dt \ll \int_T^{\infty} \Bigl( \frac{M'(t)}{t \log M'(t)} \Bigr)^2 dt. \tag{31}$$

Next, we have

In fact, the integral on the left is

from which (32) follows.


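Finally, we have

$$J(T) \le \int_0^T \Bigl( \frac{N(\lambda t) - N(t)}{t} \Bigr)^2 dt \le J(\lambda T), \tag{33}$$

with

$$J(T) = {\sum_{0 < \gamma, \gamma' \le T}}' m(i\gamma)\,m(i\gamma') \int_{\max(\gamma,\gamma')/\lambda}^{\min(\gamma,\gamma')} dt/t^2,$$

the dash indicating that the sum is over all pairs $\gamma, \gamma'$ for which
$\max(\gamma,\gamma') \le \lambda \min(\gamma,\gamma')$. Thus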

J(T) a e; L
O<y(T

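On the hypothesis $M'(t) \to \infty$, we have

$$\sum_{0 < \gamma \le T} \frac{m(i\gamma)}{\gamma} \sim \int_1^T \frac{M'(t)}{t}\,dt \quad (T \to \infty). \tag{34}$$

In fact, the two sides differ by

$$\int_1^T \frac{dS(t)}{t} + O(1) = \frac{S(T)}{T} + \int_1^T \frac{S(t)}{t^2}\,dt + O(1),$$

and the bound $S(t) \ll M'(t)/\log M'(t)$ shows that this is of smaller
order than the right side of (34). It follows that

$$J(T) \gtrsim \varepsilon \int_1^T \frac{M'(t)}{t}\,dt \quad (T \to \infty,\ \varepsilon T \to 0) \tag{35}$$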

with asymptotic equality if and only if (for T + m , £T + 0)

(A) L m(iy') ~ L
y(y'(I+£)y O<y(T

Combined with (31), (32), (33), this gives

$$\int_0^{\infty} \Bigl( \frac{S(\lambda t) - S(t)}{t} \Bigr)^2 dt \gtrsim \varepsilon \int_1^T \frac{M'(t)}{t}\,dt \quad (T \to \infty,\ \varepsilon T \to 0) \tag{36}$$

with asymptotic equality if and only if (A) holds, provided also

JTm( t log
M'( t»)2 (JT...lCtll dt)
M'(t) dt ~ 0 £ 1 t

and
a o( / M'~t) dt).
1

For $M'(t) \sim c \log t$, both of these provisos are satisfied if only
$\varepsilon T \to 0$ sufficiently slowly, which we may suppose.

References

[1] Berard, P.H., On the wave equation on a compact Riemannian
manifold without conjugate points, Math. Z. 155 (1977), 249-276.

[2] Fujii, A., On the zeros of Dirichlet L-function I, T.A.M.S.
196 (1977), 249-276.

[3] Gallagher, P.X., Pair correlation of the zeros of the zeta
function, J. Reine Angew. Math. 362 (1985), 72-86.

[4] Gangolli, R., Zeta functions of Selberg's type for compact
space forms of symmetric spaces of rank 1, Illinois J. Math.
21 (1977), 1-41.

[5] Goldston, D.A., Lecture at the 1984 Stillwater Conference on
Analytic Number Theory and Diophantine Problems.

[6] Goldston, D.A. and Montgomery, H.L., Pair correlation of
zeros and primes in short intervals, Proc. 1984 Stillwater
Conference on Analytic Number Theory and Diophantine
Problems, Birkhauser Verlag (this volume).

[7] Graham, S.W. and Vaaler, J.D., A class of extremal functions
for the Fourier transform, T.A.M.S. 265 (1981), 283-302.

[8] Guinand, A.P., A summation formula in the theory of prime
numbers, Proc. London Math. Soc. (2) 50 (1948), 107-119.

[9] Hejhal, D.A., The Selberg trace formula for PSL(2,R), Vol.
1, Lecture Notes in Mathematics, 584 (1976).

[10] Huber, H., Zur analytischen Theorie hyperbolischer Raumformen
und Bewegungsgruppen II, Math. Ann. 142 (1961), 385-398 and
143 (1961), 463-464.

[11] von Koch, H., Sur la distribution des nombres premiers, Acta
Math. 24 (1901), 159-182.

[12] Littlewood, J.E., On the zeros of the Riemann zeta function,
Proc. London Math. Soc. (2) 24 (1924), 295-318.

[13] Montgomery, H.L., The pair correlation of zeros of the zeta
function, Analytic Number Theory (Proc. Sympos. Pure Math.
24, St. Louis Univ., St. Louis, MO, 1972), A.M.S., Providence,
R.I., 1973.

[14] Mueller, J.H., Arithmetic equivalent of essential simplicity
of zeta zeros, T.A.M.S. 275 (1983), 175-183.

[15] Randol, B., On the asymptotic distribution of closed
geodesics on compact Riemann surfaces, T.A.M.S. 233 (1977),
241-247.

[16] Randol, B., The Riemann hypothesis for Selberg's zeta
function and the asymptotic behavior of eigenvalues of the
Laplace operator, T.A.M.S. 236 (1978), 209-223.

[17] Randol, B., The Selberg trace formula, in Eigenvalues in
Riemannian geometry, by Isaac Chavel (to appear).

[18] Selberg, A., On the remainder term for N(T), Avhandlinger
Norske Vid. Akad. Oslo (1944) No. 1.

[19] Vaaler, J.D., Some extremal functions in Fourier analysis,
B.A.M.S. 12 (1985), 183-212.

[20] Vigneras, M.F., L'equation fonctionnelle de la fonction zeta
de Selberg du groupe modulaire PSL(2,Z), Asterisque 61
(1979), 235-249.

[21] Weil, A., Sur les "formules explicites" de la theorie des
nombres premiers, Comm. Sem. Math. Univ. Lund [Medd. Lunds
Univ. Mat. Sem.] Tome Supplementaire 1952, 252-265.

[22] Weil, A., Two lectures on number theory, past and present,
Enseignement Math. (2) 20 (1974), 87-110.

P.X.Gallagher
Department of Mathematics
Columbia University
New York, N.Y. 10027
ANALYTIC NUMBER THEORY ON GL(r,R)

Dorian Goldfeld*

with an appendix by Solomon Friedberg

1. Introduction.

There has been much progress in recent years on some classical


questions in analytic number theory. This has been due in large
part to the fusion of harmonic analysis on GL(2,R) with the
techniques of analytic number theory, a method inspired by A.
Selberg [17]. A lot of impetus has been gained by the trace formula
of Kuznetsov [11], [12], which relates Kloosterman sums with
eigenfunctions of the Laplacian on GL(2,R) modulo a discrete
subgroup. We cite some of the most striking applications.

* The author gratefully acknowledges the generous support of the
Vaughn Foundation

Letting

$$S(m,n;c) = \sum_{\substack{a = 1 \\ (a,c) = 1 \\ a\bar a \equiv 1 \bmod c}}^{c} e^{2\pi i (am + \bar a n)/c}$$

denote the classical Kloosterman sum, Kuznetsov [12] has shown that

$$\sum_{c \le x} \frac{S(m,n;c)}{c} = O\bigl( x^{1/6} (\log x)^{1/3} \bigr),$$

where the O-constant depends at most on $m$ and $n$. This is the first
result of its kind showing a cancellation between Kloosterman
sums. A simpler proof of this, with a higher power of $(\log x)$, is
given in Goldfeld-Sarnak [5]. It is based on the study of the zeta
function

$$\sum_{c=1}^{\infty} \frac{S(m,n;c)}{c^{2s}}$$

as initiated by Selberg [19].
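
For small moduli the Kloosterman sum defined above can be evaluated by direct summation. The following is a minimal sketch, not an efficient algorithm; the function names are hypothetical, and `pow(a, -1, c)` assumes Python 3.8 or later. The printed comparison uses Weil's bound in the form $|S(m,n;c)| \le d(c)\,(m,n,c)^{1/2} c^{1/2}$, with $d(c)$ the number of divisors of $c$.

```python
import cmath
from math import gcd

def kloosterman(m: int, n: int, c: int) -> complex:
    """Classical Kloosterman sum S(m, n; c), summed directly over units a mod c."""
    total = 0j
    for a in range(1, c + 1):
        if gcd(a, c) != 1:
            continue
        # a_bar is the inverse of a modulo c (taken to be 0 when c == 1).
        a_bar = pow(a, -1, c) if c > 1 else 0
        total += cmath.exp(2j * cmath.pi * (a * m + a_bar * n) / c)
    return total

def divisor_count(c: int) -> int:
    """Number of positive divisors of c."""
    return sum(1 for d in range(1, c + 1) if c % d == 0)

if __name__ == "__main__":
    for c in (5, 12, 30):
        s = kloosterman(1, 1, c)
        print(c, round(s.real, 6), "Weil bound:", round(divisor_count(c) * c ** 0.5, 6))
```

The sum is real because replacing $a$ by $-a$ conjugates each term, so only the real part is printed.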

If $p_n$ denotes the $n$th prime, then Iwaniec and Pintz [8] have
proved

$$p_{n+1} - p_n = O\bigl( p_n^{1/2 + 1/21 + \varepsilon} \bigr).$$

Also, Fouvry [4] has shown that there exist infinitely many
primes $p$ such that $p-1$ has a prime factor greater than $p^{2/3}$. This,
together with some unpublished results of L.M. Adleman and R. Heath-Brown
(extensions of Sophie Germain's criterion) enables one to show
that

$$x^p + y^p = z^p \qquad (p \nmid xyz)$$

is impossible for positive integers $x, y, z$ for infinitely many primes
$p$.

The excellent survey article of Iwaniec [7] lists many more


applications of harmonic analysis on GL(2,R) to analytic number
theory. In view of these advances, it is natural, therefore, to
ask if the fusion of harmonic analysis on GL(r,R) ($r > 2$) with
analytic number theory will yield further results and improvements.
We believe that this is the case.

The object of these lectures is to provide a brief and elementary
introduction to harmonic analysis on GL(r,R) with $r \ge 2$. Stress
has been laid on those aspects of the theory which are particularly
useful to analytic number theory; namely, Fourier expansions,
L-functions, Eisenstein and Poincare series, and arithmetic sums such
as Kloosterman sums. We have followed the elegant classical exposition
of Jacquet [9] that was further developed by Bump [1] (for the
special case of GL(3,R)), which we believe is particularly suited to
the types of explicit calculations that arise in analytic number
theory.

The author would like to thank D. Bump and S. Friedberg for


many helpful discussions.

2. Iwasawa decomposition.

The Iwasawa decomposition for GL(2,R) states that every
$g \in GL(2,\mathbf{R})$ can be written in the form

(2.1)

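where $y > 0$, $x, d \in \mathbf{R}$, and

$$\begin{pmatrix} * & * \\ * & * \end{pmatrix} \in O(2)$$

where

$$O(r) = \{ g \in GL(r,\mathbf{R}) \mid g\,{}^t g = I \} \tag{2.2}$$

is the orthogonal group. Setting

$$Z_r = \left\{ \begin{pmatrix} d & & \\ & \ddots & \\ & & d \end{pmatrix} \in GL(r,\mathbf{R}) \right\} \tag{2.3}$$

to be the group of scalar matrices, we can then identify the upper
half plane

$$h = \{ x + iy ;\ x \in \mathbf{R},\ y > 0 \}$$

as the group of 2 by 2 matrices of type

$$\left\{ \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix} ;\ y > 0,\ x \in \mathbf{R} \right\},$$

or by the isomorphism

$$h \cong GL(2,\mathbf{R})/O(2)Z_2.$$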

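We seek to generalize the decomposition (2.1) to the group
GL(r,R) for $r \ge 2$. To this end, we define the generalized upper
half space $H^r$ to be the set of all matrices

$$\tau = \begin{pmatrix} 1 & x_{1,2} & \cdots & x_{1,r} \\ & 1 & \ddots & \vdots \\ & & \ddots & x_{r-1,r} \\ & & & 1 \end{pmatrix} \begin{pmatrix} y_1 y_2 \cdots y_{r-1} & & & \\ & \ddots & & \\ & & y_1 & \\ & & & 1 \end{pmatrix} \tag{2.4}$$

where $x_{i,j} \in \mathbf{R}$ for $1 \le i < j \le r$ and $y_i > 0$ for $1 \le i \le r-1$.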

Proposition 2.1 (Iwasawa decomposition)

Proof. Let $g \in GL(r,\mathbf{R})$. Then $g\,{}^t g$ is a positive definite symmetric
non-singular matrix. It is not difficult to show that there exist
$u$ and $\ell$ in GL(r,R) where $u$ is upper triangular with ones on the
diagonal, $\ell$ is lower triangular with ones on the diagonal, such that

$$u\,g\,{}^t g = \ell d \tag{2.5}$$
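Hence $u^{-1}\ell d = {}^t(\ell d)\,({}^t u)^{-1}$ or

$$\ell d\,{}^t u = u\,{}^t(\ell d) = d.$$

Consequently $\ell d = d({}^t u)^{-1}$. Substituting into (2.5) gives $u g\,{}^t g\,{}^t u = d = a^{-1}({}^t a)^{-1}$ for

$$a = d^{-1/2},$$

so that $aug \in O(r)$. Consequently,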

Q.E.D.
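
The decomposition can also be computed numerically by Gram-Schmidt orthogonalization of the rows of $g$, starting from the last row. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it writes $g = \tau\,k\,(d \cdot I)$ with $\tau$ upper triangular with positive diagonal and last entry 1, $k \in O(r)$ and $d > 0$; the function names are hypothetical and plain Python lists are used for matrices.

```python
from math import sqrt

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def iwasawa(g):
    """Return (tau, k, d) with g = tau * k * (d * I).

    tau is upper triangular with positive diagonal and tau[r-1][r-1] == 1,
    k has orthonormal rows, and d > 0 is a scalar.
    """
    r = len(g)
    q = [None] * r                      # orthonormal rows of k
    R = [[0.0] * r for _ in range(r)]   # upper triangular factor, g = R * k
    for i in range(r - 1, -1, -1):
        v = [float(x) for x in g[i]]
        for j in range(i + 1, r):
            R[i][j] = sum(a * b for a, b in zip(g[i], q[j]))
            v = [vi - R[i][j] * qj for vi, qj in zip(v, q[j])]
        R[i][i] = sqrt(sum(x * x for x in v))
        q[i] = [x / R[i][i] for x in v]
    d = R[r - 1][r - 1]
    tau = [[x / d for x in row] for row in R]
    return tau, q, d

if __name__ == "__main__":
    g = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
    tau, k, d = iwasawa(g)
    for row in tau:
        print([round(x, 6) for x in row])
```

For $r = 3$ the entries of `tau` can be compared with closed formulas in terms of the rows of $g$; for instance `tau[1][2]` is the ratio of the inner product of the last two rows to the squared length of the last row.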

To illustrate the Iwasawa decomposition, we consider an
arbitrary matrix

$$g = \begin{pmatrix} A & B & C \\ \alpha & \beta & \gamma \\ a & b & c \end{pmatrix} \in GL(3,\mathbf{R}).$$
Then

where

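$$x_1 = \frac{a\alpha + b\beta + c\gamma}{a^2 + b^2 + c^2} = \frac{\sum a\alpha}{\sum a^2},$$

$$x_2 = \frac{\sum A\alpha \sum a^2 - \sum Aa \sum \alpha a}{\sum \alpha^2 \sum a^2 - (\sum a\alpha)^2},$$

$$y_1 = \frac{\bigl[ \sum \alpha^2 \sum a^2 - (\sum a\alpha)^2 \bigr]^{1/2}}{\sum a^2},$$

$$y_2 = \frac{(\sum a^2)^{1/2}\,S^{1/2}}{\sum \alpha^2 \sum a^2 - (\sum a\alpha)^2},$$

with

$$S = \sum A^2 \sum \alpha^2 \sum a^2 - \sum A^2 \bigl(\sum \alpha a\bigr)^2 - \sum a^2 \bigl(\sum A\alpha\bigr)^2 - \sum \alpha^2 \bigl(\sum Aa\bigr)^2 + 2 \sum Aa \sum \alpha a \sum A\alpha.$$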
3. Automorphic forms.

Let $I_r$ denote the identity matrix in GL(r,R). For a positive
integer $M$, we let

$$\Gamma_r(M) = \{ \gamma \in SL(r,\mathbf{Z}) \mid \gamma \equiv I_r \pmod{M} \}$$

denote the principal congruence subgroup (mod $M$) of SL(r,Z). This
will be a discrete subgroup of GL(r,R), and it acts on the generalized
upper half space $H^r$ by left multiplication. That is, for
$\gamma \in \Gamma_r(M)$, $\tau \in H^r$, we let $\gamma\tau = \tau^*$ where $\tau^* \in H^r$ is uniquely chosen
so that $\gamma\tau = \tau^*$ (mod $O(r)Z_r$).
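
Membership in the principal congruence subgroup is a finite check on an integer matrix. The following sketch is only illustrative (the function names are hypothetical); it uses a cofactor expansion for the determinant, which is adequate for small $r$.

```python
def det(a):
    """Determinant of a square integer matrix by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
        for j in range(len(a))
    )

def in_principal_congruence_subgroup(gamma, M: int) -> bool:
    """True if gamma is an integer matrix of determinant 1 congruent to the identity mod M."""
    r = len(gamma)
    if det(gamma) != 1:
        return False
    return all(
        (gamma[i][j] - (1 if i == j else 0)) % M == 0
        for i in range(r)
        for j in range(r)
    )

if __name__ == "__main__":
    print(in_principal_congruence_subgroup([[1, 2], [0, 1]], 2))
    print(in_principal_congruence_subgroup([[1, 2], [0, 1]], 3))
```

With $M = 1$ the check reduces to membership in SL(r,Z).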

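Let $\nu_1, \ldots, \nu_{r-1}$ be complex numbers. For $\tau \in H^r$ given by

$$\tau = \begin{pmatrix} 1 & x_{1,2} & \cdots & x_{1,r} \\ & 1 & \ddots & \vdots \\ & & \ddots & x_{r-1,r} \\ & & & 1 \end{pmatrix} \begin{pmatrix} y_1 y_2 \cdots y_{r-1} & & & & \\ & y_1 y_2 \cdots y_{r-2} & & & \\ & & \ddots & & \\ & & & y_1 & \\ & & & & 1 \end{pmatrix} \tag{3.1}$$

let us define

$$I_{\nu_1,\ldots,\nu_{r-1}}(\tau) = \prod_{i=1}^{r-1} \prod_{j=1}^{r-1} y_i^{c_{ij}\nu_j}$$

where

$$c_{ij} = \begin{cases} (r-i)j & 1 \le j \le i \\ (r-j)i & i \le j \le r-1. \end{cases}$$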
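
The exponents $c_{ij}$ and the resulting power function are easy to tabulate. The sketch below is a small illustration (function names are hypothetical); it takes the $y$-coordinates of $\tau$ as positive floats and the $\nu_j$ as real or complex numbers.

```python
def exponent_matrix(r: int):
    """Matrix (c_ij) for 1 <= i, j <= r-1: (r-i)j if j <= i, else (r-j)i."""
    return [
        [(r - i) * j if j <= i else (r - j) * i for j in range(1, r)]
        for i in range(1, r)
    ]

def power_function(ys, nus):
    """Product over i, j of y_i ** (c_ij * nu_j), with r = len(ys) + 1."""
    r = len(ys) + 1
    c = exponent_matrix(r)
    result = 1.0
    for i in range(r - 1):
        for j in range(r - 1):
            result *= ys[i] ** (c[i][j] * nus[j])
    return result

if __name__ == "__main__":
    print(exponent_matrix(2))   # the single exponent for r = 2
    print(exponent_matrix(3))
    print(power_function([2.0], [3]))
```

For $r = 2$ the function reduces to $y^{\nu}$, which matches the eigenvalue $\nu(1-\nu)$ of the Laplacian quoted in the text.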

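If $\mathcal{D}$ denotes the algebra of $G_r$-invariant differential operators on
$H^r$, then $I_{\nu_1,\ldots,\nu_{r-1}}$ is an eigenfunction of $\mathcal{D}$, and hence
determines a character $\lambda_{\nu_1,\ldots,\nu_{r-1}}$ on $\mathcal{D}$ by the formula

$$D\,I_{\nu_1,\ldots,\nu_{r-1}} = \lambda_{\nu_1,\ldots,\nu_{r-1}}(D)\,I_{\nu_1,\ldots,\nu_{r-1}} \qquad (D \in \mathcal{D}).$$

For example, when $r = 2$, $\mathcal{D}$ is generated by the Laplacian

$$\Delta = -y^2 \Bigl( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \Bigr) \quad \text{and} \quad \lambda_\nu(\Delta) = \nu(1-\nu).$$

When $r = 3$, $\mathcal{D}$ is generated by two elements $\Delta_1, \Delta_2$ (see [1], pp. 33-34) where

Here $\Delta_1$ is the Laplacian and $\Delta_2$ is a third order operator. We have

$$\lambda_{\nu_1,\nu_2}(\Delta_1) = 3(\nu_1^2 + \nu_1\nu_2 + \nu_2^2 - \nu_1 - \nu_2)$$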

We now define the notion of an automorphic form for the
principal congruence subgroup $\Gamma_r(M)$.

Definition 3.1 Fix complex numbers $\nu_1, \ldots, \nu_{r-1}$. A smooth function
$\phi : H^r \to \mathbf{C}$ is called an automorphic form of type $(\nu_1, \ldots, \nu_{r-1})$
for $\Gamma_r(M)$ if

(i) $\phi(\gamma\tau) = \phi(\tau)$ for $\gamma \in \Gamma_r(M)$, $\tau \in H^r$.

(ii) $D\phi = \lambda_{\nu_1,\ldots,\nu_{r-1}}(D)\,\phi$ for $D \in \mathcal{D}$.

(iii) $\phi(\rho\tau)$ has polynomial growth in $y_1, \ldots, y_{r-1}$ on the
region $\{ \tau \mid y_i \ge 1\ (i = 1, 2, \ldots, r-1) \}$, for every $\rho$ in
$\Gamma_r(M)\backslash SL(r,\mathbf{R})$.

Further, $\phi$ is called a cusp form if
it satisfies the additional condition

(iv) $\int_{\Gamma_r(M) \cap U \backslash U} \phi(\rho u \tau)\,du = 0$

for every $\rho \in \Gamma_r(M)\backslash SL(r,\mathbf{R})$ and every group $U$ of the form

$$U = \left\{ \begin{pmatrix} I_{r_1} & & * \\ & \ddots & \\ & & I_{r_s} \end{pmatrix} \in GL(r,\mathbf{R}) \right\}$$
Generalized Ramanujan conjecture: If $\phi$ is a cusp form of type
$(\nu_1, \ldots, \nu_{r-1})$ for $\Gamma_r(M)$, then

$$\mathrm{Re}(\nu_1) = \cdots = \mathrm{Re}(\nu_{r-1}) = \frac{1}{r}.$$

This conjecture was first explicitly stated by Selberg [19] for
the case $r = 2$. Using Weil's [22] estimates for Kloosterman sums,
Selberg [19] obtained

$$\frac{1}{4} \le \mathrm{Re}(\nu) \le \frac{3}{4}$$

for the case $r = 2$. It is known [1] that

for $r = 3$. By developing a GL(2,R) generalization of the "large
sieve", Deshouillers and Iwaniec [3] have shown that the generalized
Ramanujan conjecture is true on the average (over $M$) for the case
$r = 2$. Very little has been done when $r \ge 3$.

4. Fourier expansions of automorphic forms.

Let $\phi$ be an automorphic form for $\Gamma_r(M)$. In view of the
non-commutativity of the situation, it is remarkable that $\phi$ has a
Fourier expansion. These expansions were first found independently
by I. Piatetski-Shapiro [15] and J. Shalika [20]. We follow,
however, the more classical approach given in [9] for the special
case of an automorphic form $\phi$ for GL(r,Z). A proof of these
expansions for the principal congruence subgroup $\Gamma_r(M)$ is given in
the appendix by S. Friedberg.

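Let $N_r \subset SL(r,\mathbf{R})$ denote the group of upper triangular matrices
of type

$$x = \begin{pmatrix} 1 & x_{1,2} & \cdots & x_{1,r} \\ & 1 & \ddots & \vdots \\ & & \ddots & x_{r-1,r} \\ & & & 1 \end{pmatrix} \tag{4.1}$$

For integers $n_1, \ldots, n_{r-1}$, let $\theta$ denote a character of $N_r$ defined by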

(4.2)

Proposition 4.1 Let $\phi$ be an automorphic cusp form for $\Gamma_r = GL(r,\mathbf{Z})$. Then
co co
cp(-r) L L cp «( yO) ,)
n =1 y E Nr - 1 n r \ r n1 •••• • n r - 1 0 1
r-1 r-1 r-1
where

J cp(u,)e(u)du
N
r
nr r \N r
with $\theta$ given by (4.2).

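Now, if $\phi$ is of type $(\nu_1, \ldots, \nu_{r-1})$, then it follows that
$\phi_{n_1,\ldots,n_{r-1}}$ is also of the same type. Consequently, $\phi_{n_1,\ldots,n_{r-1}}$
must satisfy the two properties

$$D\phi_{n_1,\ldots,n_{r-1}} = \lambda_{\nu_1,\ldots,\nu_{r-1}}(D)\,\phi_{n_1,\ldots,n_{r-1}}, \qquad D \in \mathcal{D} \tag{4.3a}$$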

(4.3b)

for every $x$ of type (4.1). Furthermore, in view of Definition (3.1)
(iii), we see that $\phi_{n_1,\ldots,n_{r-1}}(\tau)$ must have polynomial growth in
$y_1, \ldots, y_{r-1}$ on the region $\{ \tau \mid y_i \ge 1\ (i = 1, 2, \ldots, r-1) \}$. The
multiplicity one theorem of Shalika [20] states that up to a
constant multiple, there is a unique function $W_{n_1,\ldots,n_{r-1}}(\tau)$
satisfying conditions (4.3a), (4.3b) and having polynomial growth at
the cusp $y_1, \ldots, y_{r-1} \to \infty$. Moreover,

$$W_{n_1,\ldots,n_{r-1}}(\tau) = c((n))\,W_{1,\ldots,1}((n)\tau)$$

where

(n)
·n (4.4)
r-1

and $c((n))$ is a constant depending on $(n)$ and $\nu_1, \ldots, \nu_{r-1}$. We set

(4.5)

The function $W(\tau)$ is called a Whittaker function. This is due to
the fact that in the case $r = 2$, $W(\tau)$ satisfies the classical
Whittaker equation. We have now shown

Proposition 4.2 Let $\phi$ be an automorphic cusp form of type
$(\nu_1, \ldots, \nu_{r-1})$ for GL(r,Z). Then there exist constants $a_{n_1,\ldots,n_{r-1}}$
such that

L
n =1
r

x W«n)( 6' ~ )T) (4.6)

where $B_{r-1} = N_{r-1} \cap \Gamma_{r-1}\backslash\Gamma_{r-1}$ and $(n)$ is given by (4.4).

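As an example, we take $r = 2$. It is known that the unique
Whittaker function is given by

$$W(\tau) = 2\sqrt{y}\,K_{\nu - 1/2}(2\pi y)\,e^{2\pi i x}$$

where

$$\tau = \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix}$$

and

$$K_s(y) = \frac{1}{2} \int_0^{\infty} e^{-\frac{1}{2} y (t + \frac{1}{t})}\,t^{s-1}\,dt.$$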

Proposition (4.2) says that any cusp form $\phi$ of type $\nu$ for SL(2,Z)
has the Fourier expansion

Let
-1
o
w =
r
-1 o

For $\tau \in H^r$, this induces an involution

which has the effect of interchanging the $y_i$ ($i = 1, \ldots, r-1$) and
the $x_{i,i+1}$ ($i = 1, \ldots, r-1$) if $\tau$ is given by (3.1). Hence, if we
denote

where $\phi$ is a cusp form of type $(\nu_1, \ldots, \nu_{r-1})$ then it is easily seen
that $\tilde\phi$ is a cusp form of type $(\nu_{r-1}, \nu_{r-2}, \ldots, \nu_1)$. Moreover

where $a_{n_1,\ldots,n_{r-1}}$ denotes the Fourier coefficient in the
expansion (4.6) for $\phi$.

Now, associated to $\phi$, we have an L-series

$$L_\phi(s) = \sum_{n=1}^{\infty} a_n n^{-s}$$

where

$$a_n = a_{n,1,\ldots,1}.$$

As shown in [15], [16], [20], this has a functional equation
$s \to 1 - s$, $\phi \to \tilde\phi$ when multiplied by suitable gamma factors.

Generalized Ramanujan conjecture: If $\phi$ is a cusp form of type
$(\nu_1, \ldots, \nu_{r-1})$ for GL(r,Z), then for every $\varepsilon > 0$

where the O-constant is independent of $n$.

If in addition $\phi$ is an eigenform for the Hecke algebra, then
the generalized Ramanujan conjecture combined with the
multiplicative properties of $a_n$ actually imply

for every prime $p$.

While the functional equation for $L_\phi(s)$ is somewhat tricky to
prove, we can associate with $\phi$ a different object

J ... (4.7)
no
where

and

(y)

Splitting the integral on the dexter side of (4.7) into two


J
integrals defined by the regions

1 .;; Y,

and using the identity

~«Y»

where

we see that

s-1 ~ -s-1
Z~(s) = f ... f [~«Y»Y + ~«Y»Y 1 dYl ••• dYr-l
nl

from which it follows that

ZJ-s)
~

5. Eisenstein and Poincare series.

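Let $\tau \in H^r$ be given by (3.1). Recall that for fixed complex
numbers $\nu_1, \ldots, \nu_{r-1}$,

$$I_{\nu_1,\ldots,\nu_{r-1}}(\tau) = \prod_{i=1}^{r-1} \prod_{j=1}^{r-1} y_i^{c_{ij}\nu_j}$$

where

$$c_{ij} = \begin{cases} (r-i)j & 1 \le j \le i \\ (r-j)i & i \le j \le r-1. \end{cases}$$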

The minimal parabolic Eisenstein series for $\Gamma = SL(r,\mathbf{Z})$ is
defined as

(5.1)

where $N_r$ is given by (4.1). The series on the dexter side of (5.1)
converges absolutely and uniformly on compact subsets of $H^r$ if
$\mathrm{Re}(\nu_i) > 2/r$, $i = 1, 2, \ldots, r-1$. General methods for obtaining the
meromorphic continuation and functional equations of Eisenstein
series were first given by Selberg [17], [18]. More detailed proofs
appeared in [14], [10]. Langlands [13] obtained, for the first
time, proofs of these results for the case of an arbitrary reductive
group.

Now, we consider the Hilbert space $L^2(\Gamma\backslash H^r)$ with inner product
given by

$$\langle \phi, \psi \rangle = \int_{\Gamma\backslash H^r} \phi(\tau)\,\overline{\psi(\tau)}\,d^*\tau \tag{5.2}$$

for any two square-integrable automorphic forms for $\Gamma$, and where

$$d^*\tau = \prod_{1 \le i < j \le r} dx_{i,j} \prod_{i=1}^{r-1} y_i^{-i(r-i)-1}\,dy_i \tag{5.3}$$

is the GL(r,R) invariant measure. It can be shown that
$E(\tau, \nu_1, \ldots, \nu_{r-1})$ is an automorphic form of type $(\nu_1, \ldots, \nu_{r-1})$ which
is not square-integrable, but lies in the continuous spectrum of $\mathcal{D}$.


The Fourier expansion

$$\zeta^*(2\nu)\,E(\tau,\nu) = \zeta^*(2\nu)\,y^{\nu} + \zeta^*(2\nu - 1)\,y^{1-\nu} + 2\sqrt{y} \sum_{n=1}^{\infty} n^{\nu - 1/2}\,\sigma_{1-2\nu}(n)\,K_{\nu - 1/2}(2\pi n y)\cos(2\pi n x)$$

where

$$\zeta^*(\nu) = \pi^{-\nu/2}\,\Gamma\Bigl(\frac{\nu}{2}\Bigr)\,\zeta(\nu), \qquad \sigma_\nu(n) = \sum_{d \mid n} d^{\nu},$$

$$K_\nu(y) = \int_0^{\infty} e^{-y \cosh u} \cosh(\nu u)\,du \quad (y > 0), \qquad \tau = \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix}$$

for the case $r = 2$ is classical. Fourier expansions of Eisenstein
series for GL(3,Z) were given in [1], [6], [21], and recently [23]
has obtained the Fourier expansion of Eisenstein series for GL(r,Z),
$r > 2$. The arithmetic part of the Fourier coefficient involves
certain general divisor or Ramanujan sums.
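
For orientation, the classical one-variable versions of these arithmetic sums can be computed directly. The sketch below covers only the divisor sum $\sigma_s(n)$ and the classical Ramanujan sum $c_q(n) = \sum_{(a,q)=1} e^{2\pi i a n/q}$, not the more general sums that occur for GL(r,Z); the function names are hypothetical.

```python
import cmath
from math import gcd

def sigma(s, n: int):
    """Divisor sum: the sum of d**s over the positive divisors d of n."""
    return sum(d ** s for d in range(1, n + 1) if n % d == 0)

def ramanujan_sum(q: int, n: int) -> int:
    """Classical Ramanujan sum c_q(n), summed over residues a mod q coprime to q."""
    total = sum(
        cmath.exp(2j * cmath.pi * a * n / q)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )
    # The sum is always an integer; rounding removes floating-point noise.
    return round(total.real)

if __name__ == "__main__":
    print([sigma(1, n) for n in range(1, 7)])
    print([ramanujan_sum(6, n) for n in range(1, 7)])
```

The Ramanujan sum is the degenerate Kloosterman sum $S(n, 0; q)$, which is one way to see the two families of sums as related.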

We now consider a generalization of Eisenstein series, namely


Poincare series. To this end, it is first of all necessary to
define the notion of an E-function.

For fixed integers $n_1, \ldots, n_{r-1}$, let $\theta$ denote the character of
$N_r$ given by (4.2). An E-function is a smooth function $E : H^r \to \mathbf{C}$
satisfying

$$E(x\tau) = \theta(x)\,E(\tau) \quad \text{for } x \in N_r,\ \tau \in H^r \tag{5.4}$$

$$E(\tau) = O(1) \quad \text{for } \tau \in H^r. \tag{5.5}$$

By abuse of notation, we have not specified the dependence of $E$ on
$n_1, \ldots, n_{r-1}$.

Now, let $\nu_1, \ldots, \nu_{r-1}$ be complex parameters. Let $n_1, \ldots, n_{r-1}$ be
integers. We define the Poincare series $P_{n_1,\ldots,n_{r-1}}(\tau; \nu_1, \ldots, \nu_{r-1})$
by the infinite series

$$P_{n_1,\ldots,n_{r-1}}(\tau; \nu_1, \ldots, \nu_{r-1}) = \sum_{\gamma \in N_r \cap \Gamma\backslash\Gamma} I_{\nu_1,\ldots,\nu_{r-1}}(\gamma\tau) \times E_{n_1,\ldots,n_{r-1}}(\gamma\tau) \tag{5.6}$$

where $E_{n_1,\ldots,n_{r-1}}$ is an E-function satisfying (5.4) and (5.5).
Again, by (5.5), the series on the dexter side of (5.6) converges
absolutely and uniformly on compact subsets of $H^r$ if $\mathrm{Re}(\nu_i) > 2/r$
for $i = 1, 2, \ldots, r-1$.
In order to obtain the Fourier expansion of the Poincare series
(5.6), it is necessary to introduce the Bruhat decomposition. Let $W$
denote the Weyl group of GL(r,R), which is simply the group of $r \times r$
matrices with exactly one 1 in each row and column, and zeros
everywhere else (i.e. the regular representation of the symmetric group
on $r$ symbols). We also let $N_r$ be the group of upper triangular
matrices with ones on the diagonal, and we let $D_r \subset GL(r,\mathbf{R})$ be the
group of diagonal matrices. For $w \in W$, let

$$G_w = N_r\,w\,D_r\,N_r$$

so that

$$GL(r,\mathbf{R}) = \bigcup_{w \in W} G_w.$$

Similarly,

$$\Gamma = \bigcup_{w \in W} G_w \cap \Gamma.$$

The sets $G_w \cap \Gamma$ are called Bruhat cells. The cell corresponding to

-1
w
r -1

is called the big cell. We can now break up the Poincare series
into pieces corresponding to Bruhat cells, namely

$$P_{n_1,\ldots,n_{r-1}}(\tau; \nu_1, \ldots, \nu_{r-1}) = \sum_{w \in W} \sum_{\substack{\gamma \in N_r \cap \Gamma\backslash\Gamma \\ \gamma \in G_w}} I_{\nu_1,\ldots,\nu_{r-1}}(\gamma\tau) \times E_{n_1,\ldots,n_{r-1}}(\gamma\tau).$$

The Fourier expansion of $P_{n_1,\ldots,n_{r-1}}(\tau; \nu_1, \ldots, \nu_{r-1})$ will also
break up into pieces corresponding to Bruhat cells. The Fourier
coefficients corresponding to a cell will be infinite sums of
SL(r,Z) Kloosterman sums weighted by certain integrals which are
higher dimensional generalizations of hypergeometric functions. We
now describe the Kloosterman sums associated to the big cell for $\Gamma$.
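Fix integers $n_1, \ldots, n_{r-1}$ and $m_1, \ldots, m_{r-1}$. For $x$ in $N_r$ given
by

$$x = \begin{pmatrix} 1 & x_{1,2} & \cdots & x_{1,r} \\ & 1 & \ddots & \vdots \\ & & \ddots & x_{r-1,r} \\ & & & 1 \end{pmatrix}$$

let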

(5.7)

Then for $d \in D_r$, we define the big cell Kloosterman sum

Sw (mI, ••• ,mr- l;nl, ••• ,n r- lid)


r

x a (b )
nl, ••• ,n r _ l 2

Similarly, there will be Kloosterman sums for all the other
Bruhat cells. The Kloosterman sums will have multiplicative
properties in the $d$ aspect. For $d$ of the form

$$d = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_r \end{pmatrix}$$

the $d_i$ ($i = 1, \ldots, r$) will be rational numbers. If we assume each $d_i$
is a positive or negative integer power of a fixed prime $p$, then the
Kloosterman sum $S_w(m_1, \ldots, m_{r-1}; n_1, \ldots, n_{r-1}; d)$ will be associated to
a certain algebraic variety over $F_p$. The complete determination of
these varieties has only been effected for $r = 2, 3$ (see [22], [2]).

If the E-function defined by (5.4) and (5.5) has exponential
decay in $y_i \to \infty$ ($i = 1, \ldots, r-1$) then the Poincare series (5.6) will
be square-integrable. It will not be an eigenfunction for $\mathcal{D}$,
however. We will now show that the inner product of a cusp form
with $P_{n_1,\ldots,n_{r-1}}(\tau; \nu_1, \ldots, \nu_{r-1})$ picks off a certain transform of
the $(n_1, \ldots, n_{r-1})$th Fourier coefficient of the cusp form.

Proposition 5.1 Let $\phi \in L^2(\Gamma\backslash H^r)$. Then

r-l
4>
n 1 ,···,nr _ 1
(y) I
v 1 ,···,v r _ 1
(Y) E
n 1 ,···,nr-l
(Y) TT
i=1

where

and
1
and $\theta_{n_1,\ldots,n_{r-1}}$ is given by (5.7).

Proof: By the Rankin-Selberg unfolding method



To complete the proof, we note that

J J
y =0 N nr\N
r-l r r
and that

for T = xY with x given by (4.1).

Let $\phi$ be a cusp form of type $(\lambda_1, \lambda_2, \ldots, \lambda_{r-1})$. Then
Proposition (5.1) shows that the inner product of $\phi$ with the Poincare
series has a meromorphic continuation in $\nu_1, \nu_2, \ldots, \nu_{r-1}$ with polar
divisors depending on $\lambda_1, \lambda_2, \ldots, \lambda_{r-1}$. In [2], we show how this
can be applied to the generalized Ramanujan conjecture.

Appendix

The Fourier expansion on a congruence subgroup of SL(r,Z)

Solomon Friedberg

Let M be a positive integer. We give here the Fourier
expansion of a function $\phi : H^r \to C$ invariant under the congruence
subgroup

~) (mod M)} •

This Fourier expansion was first developed in an adelic setting by


Piatetski-Shapiro [15] and Shalika [20] and this is simply a
translation of one of their results into a non-adelic framework.

Let R be a fixed set of coset representatives for

Nr - 1 n SL(r-l ,Z)\SL(r-l ,Z) •

For each $\gamma$ in $SL(r-1,Z)$ denote by $P_\gamma$ an element of $SL(r-1,Z)$ such
that $P_\gamma \gamma$ is in $\Gamma_{r-1}(M)$ (the coset of $P_\gamma$ in $\Gamma_{r-1}(M)\backslash SL(r-1,Z)$ is thus
uniquely determined by $\gamma$). Given integers $n_1,\ldots,n_{r-1}$, let
e denote the character of Nr given by
n1 nr - 1
M , ••• , M

and choose the Haar measure $du$ on $N_r$ such that the measure of
$(N_r \cap \Gamma_r(M)\backslash N_r)$ is 1.

Theorem (A.1) Suppose $\phi : H^r \to C$ is $\Gamma_r(M)$ invariant. Then

I
L (~r-l)
n =-00 n --00 n1 nr-2
1 r-2 M'··· '-M-' M

+ L
YER n =-00 n
L
=-a> n
L (4) PY)n
=1 1
n
r-1
((ri ~)t)
1 r-2 r-1 M'··· '-M-
where

p
f 4>( ( Y o )
1
U t) an i nr _1
(u) du
N
r
nr r (M)\Nr 0
M,··it-,~

Proof: Denote by $u(a_1,\ldots,a_{r-1})$ the element

of $N_r$. First, since $\phi$ is invariant under the subgroup

of $\Gamma_r(M)$, we may write

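\[ \phi(\tau) = \sum_{n_1,\ldots,n_{r-1} \in Z} \phi_{\frac{n_1}{M},\ldots,\frac{n_{r-1}}{M}}(\tau) \]

with

\[ \phi_{\frac{n_1}{M},\ldots,\frac{n_{r-1}}{M}}(\tau) = \int_{(R/MZ)^{r-1}} \phi(u(a_1,\ldots,a_{r-1})\tau)\, e(-(n_1 a_1 + \cdots + n_{r-1}a_{r-1})/M)\, da_1 \cdots da_{r-1} \]

and

\[ e(x) = \exp(2\pi i x) . \]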

Next, suppose $\gamma \in SL(r-1,Z)$ has bottom row $(\gamma_1 \cdots \gamma_{r-1})$, and $m$ is
an integer. Note that such a $\gamma$ is determined modulo the $SL(r-1,Z)$
maximal parabolic

(+H) r-2
1

Then we claim that

cj> my (-r) (A.l)


1 mYr - 1
M'···'-M-

p
cj> y (T) To see this, observe that since

01 ) is in rr(M), the left hand side equals

(R!~Z)r-l cj>«( ~Y ~) u(a i ,... ,a;_I)( 6' ~)T) x

x e(-m(Ylal+ ••• Yr_lar_l)!M) da 1 ••• da r _ 1

with

Further

Thus changing variables gives (A.l).


Now, iterating these steps, replacing u successively by

1
o
o

• 1

for $i = 1,2,\ldots,r-1$ completes the proof (for example, when $i = 2$ the $\gamma$
to be used run over

Pr-2 *)
( -o:........::~r---- \p r-l ).

Bibliography

[1] D. Bump, Automorphic Forms on GL(3,R), Lecture Notes in
Math. 1083, Springer, (1984).

[2] D. Bump, S. Friedberg, D. Goldfeld, Poincare series and
Kloosterman sums for SL(3,Z), to appear in Acta Arithmetica.

[3] J. M. Deshouillers, H. Iwaniec, Kloosterman sums and Fourier
coefficients of cusp forms, Invent. Math., 70 (1982), 219-288.

[4] E. Fouvry, Brun-Titchmarsh theorem on average, to appear.

[5] D. Goldfeld, P. Sarnak, Sums of Kloosterman sums, Invent.
Math., 71 (1983), 243-250.

[6] K. Imai, A. Terras, The Fourier expansions of Eisenstein
series for GL(3,Z), Trans. A.M.S. 273 (1982), #2, 679-694.

[7] H. Iwaniec, Non-holomorphic modular forms and their
applications, Modular Forms (R. Rankin, Ed.), Ellis Horwood,
West Sussex, (1984), 197-156.

[8] H. Iwaniec, J. Pintz, Primes in short intervals, Mathematics
Institute of the Hungarian Academy of Sciences, preprint no.
37, (1983).

[9] H. Jacquet, Dirichlet series for the group GL(n), Automorphic
Forms, Representation Theory and Arithmetic, Springer-Verlag,
(1981), 155-164.

[10] T. Kubota, Elementary Theory of Eisenstein series, New York,
John Wiley and Sons (1973).

[11] N. V. Kuznetsov, The arithmetic form of Selberg's trace
formula and the distribution of the norms of the primitive
hyperbolic classes of the modular group (in Russian), Preprint,
Khabarovsk (1978).

[12] N. V. Kuznetsov, Petersson's conjecture for cusp forms of
weight zero and Linnik's conjecture; sums of Kloosterman sums
(in Russian), Mat. Sb. (N.S.), 39 (1981), 299-342.

[13] R. Langlands, On the Functional Equations Satisfied by
Eisenstein Series, Springer Verlag, Lecture Notes in Math.
#544 (1976).

[14] H. Maass, Siegel's Modular Forms and Dirichlet Series,
Springer Verlag, Lecture Notes in Math. #216 (1971).

[15] I. I. Piatetski-Shapiro, Euler subgroups, Lie Groups and their
Representations, John Wiley and Sons, (1975), 597-620.

[16] I. I. Piatetski-Shapiro, Multiplicity one theorems,
Automorphic Forms, Representations, and L-Functions, Proc.
Symp. in Pure Math. XXXII, (A. Borel, Ed.), Part II, 209-212.

[17] A. Selberg, Harmonic analysis and discontinuous groups in
weakly symmetric Riemannian spaces with applications to
Dirichlet's series, J. Indian Math. Soc., 20, (1956), 47-87.

[18] A. Selberg, Discontinuous groups and harmonic analysis, Proc.
Internat. Congr. Math., Stockholm, (1962), 177-189.

[19] A. Selberg, On the estimation of Fourier coefficients of
modular forms, Proc. Symp. Pure Math. VII, A.M.S., Providence,
R.I., (1965), 1-15.

[20] J. Shalika, The multiplicity one theorem for GL(n), Annals of
Math. 100, (1974), 171-193.

[21] L. Takhtadzhyan, I. Vinogradov, Theory of Eisenstein series
for the group SL(3,R), and its application to a binary
problem, J. Soviet Math. 18 (1982), #3, 293-324.

[22] A. Weil, On some exponential sums, Proc. Nat. Acad. Sci.
U.S.A., 34 (1948), 204-207.

[23] A. Yukie, Ph.D. Thesis, Harvard (1985).

D.Goldfeld S. Friedberg
Harvard University Harvard University
Cambridge, Mass.02138 Cambridge, Mass.02138

University of Texas at Austin


Austin, Texas 78712

Columbia University
New York, N.Y. 10027 U.S.A.
PAIR CORRELATION OF ZEROS AND PRIMES
IN SHORT INTERVALS

Daniel A. Goldston and Hugh L. Montgomery*

1. Statement of results.
In 1943, A. Selberg [15] deduced from the Riemann Hypothesis
(RH) that

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-2}\,dx \ll \delta(\log X)^2 \tag{1} \]

for $X^{-1} \le \delta \le X^{-1/4}$, $X \ge 2$. Selberg was concerned with small
values of $\delta$, and the constraint $\delta \le X^{-1/4}$ was imposed more for
convenience than out of necessity. For larger $\delta$ we have the
following result.

Theorem 1. Assume RH. Then

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-2}\,dx \ll \delta(\log X)(\log 2/\delta) \tag{2} \]

for $0 < \delta \le 1$, $X \ge 2$.
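
The integrand in (1) and (2) can be explored numerically. The following is a minimal sketch, not part of the original argument: it tabulates the Chebyshev function $\psi(x) = \sum_{n \le x} \Lambda(n)$ with a sieve and evaluates $\psi((1+\delta)x) - \psi(x) - \delta x$ at an integer $x$. The function names `von_mangoldt_table`, `psi_table` and `short_interval_error` are illustrative assumptions.

```python
import math


def von_mangoldt_table(n):
    """Return lam[0..n] with lam[m] = log p if m is a power of a prime p, else 0."""
    lam = [0.0] * (n + 1)
    is_composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            # p is prime: mark its multiples and record log p on its powers.
            for m in range(p * p, n + 1, p):
                is_composite[m] = True
            q = p
            while q <= n:
                lam[q] = math.log(p)
                q *= p
    return lam


def psi_table(n):
    """Return psi[0..n], where psi[m] is the sum of Lambda(k) for k <= m."""
    lam = von_mangoldt_table(n)
    psi = [0.0] * (n + 1)
    for m in range(1, n + 1):
        psi[m] = psi[m - 1] + lam[m]
    return psi


def short_interval_error(psi, x, delta):
    """Return psi((1+delta)x) - psi(x) - delta*x for an integer x."""
    upper = math.floor((1 + delta) * x)
    return psi[upper] - psi[x] - delta * x
```

For instance, `short_interval_error(psi_table(200), 100, 0.5)` measures how far the weighted prime-power count in (100, 150] is from its expected value 50.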

In this estimate, the error term for the number of primes in
the interval $(x, (1+\delta)x]$ is damped by the factor $x^{-2}$, and the
length of the interval, $\delta x$, varies with x. Saffari and Vaughan [14]
considered the undamped integral, and derived from RH the estimates

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2\,dx \ll \delta X^2 (\log 2/\delta)^2 \tag{3} \]

for $0 < \delta \le 1$, and

\[ \int_1^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^2\,dx \ll hX(\log 2X/h)^2 \tag{4} \]

*Research supported in part by NSF Grant MCS82-01602.



for $0 < h \le X$. It may be similarly shown that RH gives the
estimate

\[ \int_1^X (\psi(x) - x)^2\,dx \ll X^2 . \tag{5} \]

Gallagher and Mueller [5] showed that if one assumes not only RH but
also the pair correlation conjecture

\[ \#\{(\gamma,\gamma') : 0 < \gamma \le T,\ 0 < \gamma - \gamma' \le 2\pi\alpha/\log T\}
   = \Bigl( \frac{1}{2\pi} \int_0^\alpha 1 - \Bigl(\frac{\sin \pi u}{\pi u}\Bigr)^2 du + o(1) \Bigr) T \log T \tag{6} \]

then it can be deduced that

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-2}\,dx \sim \delta(\log 1/\delta)(\log X/\delta) \tag{7} \]

for $X^{-1} \le \delta \le X^{-\varepsilon}$. Here $\gamma$ denotes the ordinate of a non-trivial
zero of the Riemann zeta function. Thus it seems likely that the
estimate of Theorem 1 is best possible.

In the course of formulating the conjecture (6), Montgomery
[13] also proposed a more precise estimate, namely that

\[ F(X,T) \sim \frac{1}{2\pi} T \log T \tag{8} \]

uniformly for $T \le X \le T^A$, for any fixed $A > 1$, where

\[ F(X,T) = \sum_{0 < \gamma,\gamma' \le T} X^{i(\gamma-\gamma')} w(\gamma - \gamma') \tag{9} \]

and $w(u) = 4/(4 + u^2)$. We now relate this conjecture to the size of
the integral in (3).

Theorem 2. Assume RH. If $0 < B_1 \le B_2 \le 1$, then

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2\,dx \sim \frac{1}{2}\,\delta X^2 \log 1/\delta \tag{10} \]

uniformly for $X^{-B_2} \le \delta \le X^{-B_1}$, provided that (8) holds uniformly
for

\[ X^{B_1}(\log X)^{-3} \le T \le X^{B_2}(\log X)^{3} . \tag{11} \]

Conversely, if $1 \le A_1 \le A_2 < \infty$, then (8) holds uniformly for
$T^{A_1} \le X \le T^{A_2}$, provided that (10) holds uniformly for

(12)

Previously Mueller [12] derived (10) from RH and a strong
quantitative form of (8). Heath-Brown and Goldston [11] showed that
RH and (8) for $T^a \le X \le T^b$, $a < 2 < b$, imply

This estimate follows easily from Theorem 2 by taking
$\delta = \varepsilon X^{-1/2} (\log X)^{1/2}$ in (10). In deriving (10) from (8) we also use
the weaker estimate (3). In the case of very small $\delta$, say
$\delta \asymp (\log X)/X$, we can do better by appealing instead to the bound

\[ \int_1^X \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2\,dx \ll \delta X^2 \log X + \delta^2 X^3 \tag{13} \]

which follows from sieve estimates (see the proof of Lemma 7). In
this way we could show that

\[ \int_1^X \bigl(\pi(x+h) - \pi(x) - h/\log x\bigr)^2\,dx \sim hX/\log X \tag{14} \]

for $h \asymp \log X$, given RH and (8) for $T \le X \le f(T)\,T \log T$. Here $f(T)$
tends to infinity arbitrarily slowly with T. From this it follows
easily that

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0 . \]

Heath-Brown [10] derived this from a slightly stronger hypothesis.

In assessing the depth of the estimates (8) and (10), we note
that (10) is a logarithm sharper than (3), and that (8) is a
logarithm sharper than the trivial bound

\[ |F(X,T)| \le F(1,T) \sim \frac{1}{2\pi} T(\log T)^2 . \tag{15} \]

(See Lemma 8.) As in (4), we can relate (10) to primes in intervals
of constant length. In summary we have the following

Corollary. Assume RH. The following are equivalent:

(a) For every fixed $A > 1$, (8) holds uniformly for $T \le X \le T^A$.

(b) For every fixed $\varepsilon > 0$, (10) holds uniformly for
$X^{-1} \le \delta \le X^{-\varepsilon}$.

(c) For every fixed $\varepsilon > 0$,

\[ \int_0^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^2\,dx \sim hX \log X/h \tag{16} \]

holds uniformly for $1 \le h \le X^{1-\varepsilon}$.

It is not hard to show that either (b) or (c) implies RH.
Gallagher [4] has shown that a weak quantitative form of the prime
k-tuple hypothesis gives (16) when $h \asymp \log X$.
The path we take between (8) and (10) involves elementary
arguments of Abelian and Tauberian character; these are of two
sorts. First, we consider the connection between the assertion

\[ \int_{-\infty}^{+\infty} e^{-2|y|} f(Y+y)\,dy = 1 + o(1) \tag{17} \]

as $Y \to +\infty$, and the more general assertion

\[ \int_a^b R(y) f(Y+y)\,dy = \int_a^b R(y)\,dy + o(1) \tag{18} \]

as $Y \to +\infty$, where R is any Riemann-integrable function. (These two
statements are equivalent if f is bounded and non-negative.) This
interplay reflects the choice of the weighting function w(u) in the
definition (9) of F(X,T). Second, and more intrinsically, we
consider a question of Riemann summability $(R_2)$, namely the
connection between the two assertions

\[ \int_0^\infty \Bigl(\frac{\sin \kappa u}{u}\Bigr)^2 f(u)\,du = (\pi/2 + o(1))\,\kappa \log 1/\kappa \tag{19} \]

as $\kappa \to 0+$, and

\[ \int_0^U f(u)\,du = (1 + o(1))\,U \log U \tag{20} \]

as $U \to \infty$. Because of the intricacies of the $(R_2)$ method, neither
of these assertions implies the other, although they are equivalent
for non-negative functions f. The lemmas we formulate below are
complicated by the fact that we specify the relation between the
parameters $\kappa$ and U.

2. Lemmas of summability.

Lemma 1. If

\[ I(Y) = \int_{-\infty}^{+\infty} e^{-2|y|} f(Y+y)\,dy = 1 + \varepsilon(Y) , \]

and if $f(y) \ge 0$ for all y, then for any Riemann-integrable
function R(y),

\[ \int_a^b R(y) f(Y+y)\,dy = \Bigl( \int_a^b R(y)\,dy \Bigr)\bigl(1 + \varepsilon'(Y)\bigr) . \tag{21} \]

If R is fixed then $|\varepsilon'(Y)|$ is small provided that $|\varepsilon(y)|$ is small
uniformly for $Y + a - 1 \le y \le Y + b + 1$.

In terms of Wiener's general Tauberian theorem, the truth of
this lemma hinges on the fact that the Fourier transform of the
kernel $k(y) = e^{-2|y|}$, namely the function

\[ \hat{k}(t) = \int_{-\infty}^{+\infty} k(y)\, e(-ty)\,dy \qquad (e(u) = e^{2\pi i u}) , \]

never vanishes.
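
The non-vanishing can be seen from a closed form: a direct calculation gives $\hat{k}(t) = 1/(1 + \pi^2 t^2)$, which is strictly positive. The sketch below is an illustration only; it compares this closed form with a midpoint-rule approximation of the integral. The function names and the truncation parameters `cutoff` and `steps` are assumptions made for the example.

```python
import math


def k_hat_closed_form(t):
    """Closed form of the Fourier transform of exp(-2|y|) at frequency t."""
    return 1.0 / (1.0 + (math.pi * t) ** 2)


def k_hat_numeric(t, cutoff=20.0, steps=200_000):
    """Midpoint-rule value of 2 * integral_0^cutoff exp(-2y) cos(2 pi t y) dy.

    The kernel is even, so the transform reduces to a cosine integral
    over the half line; the tail beyond `cutoff` is negligible.
    """
    h = cutoff / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * h
        total += math.exp(-2.0 * y) * math.cos(2.0 * math.pi * t * y)
    return 2.0 * h * total
```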

Let $K_c(y) = \max(0, c - |y|)$. By comparing Fourier
transforms, or by direct calculation, we see that

\[ K_c(y) = \frac{1}{2} e^{-2|y|} - \frac{1}{4} e^{-2|y-c|} - \frac{1}{4} e^{-2|y+c|}
   + \int_{-c}^{c} (c - |z|)\, e^{-2|y-z|}\,dz . \]

Hence

\[ \int_{-c}^{c} K_c(y) f(Y+y)\,dy = \frac{1}{2} I(Y) - \frac{1}{4} I(Y+c) - \frac{1}{4} I(Y-c)
   + \int_{-c}^{c} (c - |z|)\, I(Y+z)\,dz = c^2 + \varepsilon_1(Y) \]

where $|\varepsilon_1|$ is small if $c > 0$ is fixed and if $|\varepsilon(y)|$ is small for
$Y - c \le y \le Y + c$. Since

\[ \frac{1}{\eta}\bigl(K_c(y) - K_{c-\eta}(y)\bigr) \le \chi_{[-c,c]}(y) \le \frac{1}{\eta}\bigl(K_{c+\eta}(y) - K_c(y)\bigr) , \]

and since $f \ge 0$, we deduce that (21) holds in the case of the step
function $R(y) = \chi_{[-c,c]}(y)$. Since the general R can be approximated
above and below by step functions, we obtain (21).
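
The representation of the tent function $K_c(y) = \max(0, c - |y|)$ as a combination of translates of $e^{-2|y|}$ plus a convolution with $e^{-2|y|}$ can be checked numerically. The following is a hedged sketch using a midpoint rule; the helper names and the step count are assumptions for illustration.

```python
import math


def tent(y, c):
    """The kernel K_c(y) = max(0, c - |y|)."""
    return max(0.0, c - abs(y))


def tent_via_exponentials(y, c, steps=20_000):
    """Right-hand side of the identity expressing K_c through exp(-2|.|).

    The convolution integral over [-c, c] is approximated by the
    midpoint rule with `steps` subintervals.
    """
    h = 2.0 * c / steps
    integral = 0.0
    for j in range(steps):
        z = -c + (j + 0.5) * h
        integral += (c - abs(z)) * math.exp(-2.0 * abs(y - z))
    integral *= h
    return (0.5 * math.exp(-2.0 * abs(y))
            - 0.25 * math.exp(-2.0 * abs(y - c))
            - 0.25 * math.exp(-2.0 * abs(y + c))
            + integral)
```

Both sides agree inside and outside the support, for example at y = 0.7 and y = 2 with c = 1.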

Lemma 2. Suppose that f(t) is a continuous non-negative function
defined for all $t \ge 0$, with $f(t) \ll \log^2(t+2)$. If

\[ J(T) = \int_0^T f(t)\,dt = (1 + \varepsilon(T))\,T \log T , \]

then

\[ \int_0^\infty \Bigl(\frac{\sin \kappa u}{u}\Bigr)^2 f(u)\,du = (\pi/2 + \varepsilon'(\kappa))\,\kappa \log 1/\kappa \tag{22} \]

where $|\varepsilon'(\kappa)|$ is small as $\kappa \to 0+$ if $|\varepsilon(T)|$ is small uniformly
for $\kappa^{-1}(\log \kappa)^{-2} \le T \le \kappa^{-1}(\log \kappa)^{2}$.

We divide the range of integration in (22) into four
subintervals: $0 \le u \le \kappa^{-1}(\log \kappa)^{-2} = U_1$, $U_1 \le u \le C\kappa^{-1} = U_2$,
$U_2 \le u \le \kappa^{-1}(\log \kappa)^{2} = U_3$, and $U_3 \le u < \infty$.
Since $f(t) \ll \log^2(t+2)$, we see that

\[ \int_0^{U_1} \ll \int_0^{U_1} \kappa^2 \log^2(u+2)\,du \ll \kappa^2 U_1 \log^2 U_1 \ll \kappa \]

and similarly that

\[ \int_{U_3}^{\infty} \ll \int_{U_3}^{\infty} u^{-2} \log^2 u\,du \ll U_3^{-1} \log^2 U_3 \ll \kappa . \]

By writing $f(u) = \log 1/\kappa + \log \kappa u + (f(u) - \log u)$, we express
the integral from $U_1$ to $U_2$ as a sum of three integrals. We note
that

U
f 2 ( sin KU)2 du
f
U
1
u o
-2
211 K(1 + O(log K) ),

and that

Put $r(u) = J(u) - u \log u + u$. Then by integrating by parts we see
that

\[ \int_{U_1}^{U_2} \Bigl(\frac{\sin \kappa u}{u}\Bigr)^2 (f(u) - \log u)\,du
   \ll \kappa \Bigl(1 + \Bigl(\log \frac{1}{\kappa}\Bigr) \max_{U_1 \le u \le U_2} |\varepsilon(u)|\Bigr) \log(C+2) . \]

As for the range $U_2 \le u \le U_3$, we see that if $\varepsilon(u) \ll 1$ then

We make this small by taking C large. Then the remaining error
terms are small if $\varepsilon(u)$ is small.

+00
~ 3. 16 K .u, e.ve.n, K" c.ont-<-ntlo/.L6, f
-3
K(x) + 0 at> x + +00, K' + 0 at> x + +00 , and -<-6 K"(x) <: x at>
x+ + 00 , the.n

=f
~

K(t) (23)
o
Proof. Integrate by parts twice.

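Lemma 4. If f is a non-negative function defined on $[0, +\infty)$,
$f(t) \ll \log^2(t+2)$, and if

\[ I(\kappa) = \int_0^\infty \Bigl(\frac{\sin \kappa t}{t}\Bigr)^2 f(t)\,dt = (\pi/2 + \varepsilon(\kappa))\,\kappa \log 1/\kappa \]

then

\[ J(T) = \int_0^T f(t)\,dt = (1 + \varepsilon')\,T \log T \]

where $|\varepsilon'|$ is small if $|\varepsilon(\kappa)|$ is small uniformly for
$T^{-1}(\log T)^{-1} \le \kappa \le T^{-1}(\log T)^{2}$.

Proof. Let K be a kernel with the properties specified in Lemma 3.
Replace t by t/T in (23), multiply by $f(t) - \log t$, and integrate
over $0 \le t < \infty$. Then we find that

\[ \int_0^\infty (f(t) - \log t)\, \hat{K}(t/T)\,dt = \pi^{-2} T^2 \int_0^\infty K''(x)\, R(\pi x/T)\,dx \]

where

\[ R(\kappa) = I(\kappa) - \frac{1}{2}\pi\kappa \log 1/\kappa + O(\kappa) . \]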

Since

I(K) < f min(K 2 , t- 2 )lOg2(t + 2)dt < K log2(2 + 11K)


o
for all K > 0, on taking Xl = (log T)-1 we see that

fXl K"R < f


Xl
xT
-1
log2 T/x dx <T
-1

o 0

On taking x 2 =% (log T)2 we find that

f K"R < f

-1
Assuming, as we may, that E ) (log T) ,we have
-1
R(T.K/T) <E xT log T for xl .; x .; xZ. Hence

x2 -1 -3 -1
f K"R <E T (log T) f min(I, x ) x dx <E T log T.
xl o
For n >0 take

K(x) = Kn (x) 2 ( sin ZT.K + sin Zn(1 +n)x)( Znx(1 - 4n x»)


z Z -1
,

so that
1 Hltl';l,
A

K(t) cos 2 ( n(ltl - l)/(Zn») if 1 .; It I .; 1 + n,


o if It I ) 1 + n •

Thus
00

fo f(t)K n (t/T)dt = (1
A

+ O(n»T log T + 0 (T) + 0 (E T log T) •


n n

Since f is non-negative, we see that

f f(t)K (1 + n)t/T)dt.; J(T).; f f(t)K (t/T)dt ,


o n o n
and we obtain the desired result by taking n small.

In this argument we have made free use of existing treatments


of Riemann summability. We note especially Hardy [8, pp. 301, 316,
3651 and Hardy and Rogosinski [9, Theorem 1111.

3. Lemmas of analytic number theory.

As is customary, we write $s = \sigma + it$, and we let $\rho = \beta + i\gamma$
be a typical non-trivial zero of the Riemann zeta function. We
first note a simple result of Gallagher [3]:

Lemma 5. Let $S(t) = \sum_{\mu \in M} c(\mu) e(\mu t)$ where M is a countable set of
real numbers and $\sum |c(\mu)| < \infty$. Then
T +00
f IS(t)12 dt < T2 f
-T

When a main term is desired, we use the following more
elaborate estimate.

Lemma 6. Let S(t) be as above. If $\delta \ge T^{-1}$ then

\[ \int_0^T |S(t)|^2\,dt = (T + O(\delta^{-1})) \sum_{\mu \in M} |c(\mu)|^2
   + O\Bigl( T \sum_{\substack{\mu,\nu \in M \\ 0 < |\mu - \nu| < \delta}} |c(\mu)c(\nu)| \Bigr) . \]

Proof. Selberg (see Vaaler [17]) has constructed functions $F_-(t)$ and

+00
and f Hence

\[ \int_0^T |S|^2 \le \int_{-\infty}^{+\infty} |S|^2 F_+ = \sum_{\mu,\nu} c(\mu)\overline{c(\nu)}\, \hat{F}_+(\nu - \mu) . \]

The terms $\mu = \nu$ contribute $(T + \delta^{-1}) \sum_\mu |c(\mu)|^2$. Since

\[ T + \delta^{-1} \le 2T , \]

the terms $\mu \ne \nu$ contribute at most

\[ 2T \sum_{0 < |\mu - \nu| < \delta} |c(\mu)c(\nu)| . \]

This gives an upper bound, and a corresponding lower bound is
derived similarly using $F_-$.

Lemma 7. Let $C(x) > 0$ be a continuous function such that
$C(x) \asymp C(y)$ whenever $x \asymp y$. If $|c(p)| \le C(p)$ for all primes p,
and if $\delta \ge T^{-1}$, then

T
f
o p p

+ o( oT f~-l C(u)2 u(log u)-2 du)


o
Proof. We appeal to the previous lemma. In the second error term,
the primes $p \in (X, 2X]$ contribute

\[ \ll T\, C(X)^2 \sum_{X < p \le 2X}\ \sum_{p < p' \le (1+2\delta)p} 1 \ll T\, C(X)^2 \sum_{1 \le k \le 4\delta X} \pi_2(2X, k) \]

where $\pi_2(x,k)$ denotes the number of primes $p \le x$ for which $p + k$ is
also prime. It is well-known (see Halberstam and Richert [7, p.117])
that

\[ \pi_2(x,k) \ll (k/\phi(k))\, x (\log x)^{-2} \]

uniformly for $x \ge 2$, $k \ne 0$. Since $\sum_{k \le K} k/\phi(k) \ll K$, it
follows that our upper bound is

\[ \ll T\, C(X)^2\, \delta X^2 (\log X)^{-2} \ll \delta T \int_X^{2X} C(u)^2\, u (\log u)^{-2}\,du . \]

We put $X = \delta^{-1} 2^r$ and sum over $r \ge 0$ to obtain the desired result.

We now present the main known properties of F(X,T).

Lemma 8. Assume RH, and let F(X,T) be as in (9). Then $F(X,T) \ge 0$,
$F(X,T) = F(1/X,T)$, and

\[ F(X,T) = T\bigl(X^{-2}(\log T)^2 + \log X\bigr)\Bigl(\frac{1}{2\pi} + O\bigl((\log T)^{-1/2}(\log\log T)^{1/2}\bigr)\Bigr) . \tag{24} \]

Proof. The first assertion is an immediate consequence of either of
the two identities

\[ F(X,T) = 2\pi \int_{-\infty}^{+\infty} e^{-4\pi|u|}\, \Bigl| \sum_{0 < \gamma \le T} X^{i\gamma} e(\gamma u) \Bigr|^2 du , \tag{25} \]

or

\[ F(X,T) = \frac{2}{\pi} \int_{-\infty}^{+\infty} \Bigl| \sum_{0 < \gamma \le T} \frac{X^{i\gamma}}{1 + (t-\gamma)^2} \Bigr|^2 dt . \tag{26} \]

The observation that F is non-negative has also been made by Mueller
(unpublished). The second assertion is obvious from the definition
of F. The estimate (24) is substantially due to Goldston [6, Lemma
B], and may be proved by substituting an appeal to Lemma 7 in the
argument of Montgomery [13].
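
The first two assertions of Lemma 8, together with the trivial bound $|F(X,T)| \le F(1,T)$ of (15), are easy to observe numerically from the definition (9). The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the ordinates passed in are whatever list of zero ordinates the caller supplies (the usage note uses standard approximate values of the first five). Since the terms for $(\gamma,\gamma')$ and $(\gamma',\gamma)$ are complex conjugates, the sum is real and only cosines are needed.

```python
import math


def pair_weight(u):
    """The weight w(u) = 4 / (4 + u^2) appearing in the definition of F."""
    return 4.0 / (4.0 + u * u)


def F(X, ordinates):
    """Sum of X^{i(g - g')} w(g - g') over all ordered pairs of ordinates.

    Imaginary parts cancel in conjugate pairs, so cos((g - g') log X)
    is summed in place of the complex exponential.
    """
    log_x = math.log(X)
    total = 0.0
    for g1 in ordinates:
        for g2 in ordinates:
            d = g1 - g2
            total += math.cos(d * log_x) * pair_weight(d)
    return total
```

With `gammas = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062]`, the values `F(7.5, gammas)` and `F(1/7.5, gammas)` agree, are non-negative, and do not exceed `F(1.0, gammas)`.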

Lemma 9. If $0 \le h \le T$ then

\[ \#\{(\gamma,\gamma') : 0 \le \gamma \le T,\ |\gamma - \gamma'| \le h\} \ll (1 + h \log T)\, T \log T . \tag{27} \]

Proof. We argue unconditionally, although if RH is assumed then the
above follows easily from Lemma 8 (see (6) of Montgomery [13]).
Let $N(T) = \#\{\gamma : 0 < \gamma \le T\}$. Following Selberg, Fujii [2] showed
that

\[ \int_0^T \Bigl(N(t+h) - N(t) - \frac{1}{2\pi} h \log t\Bigr)^2 dt \ll T \log(2 + h \log T) \]

for $0 \le h \le 1$. Hence

\[ \int_0^T \bigl(N(t+h) - N(t)\bigr)^2 dt \ll h^2\, T (\log T)^2 \]

for $(\log T)^{-1} \le h \le 1$. This gives (27) in this case. To derive
(27) when $0 \le h \le (\log T)^{-1}$, it suffices to consider $h = (\log T)^{-1}$.
As for the range $1 \le h \le T$, it suffices to use the bound

\[ N(T+1) - N(T) \ll \log T \tag{28} \]

(see Titchmarsh [16, p. 178]).


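Lemma 10. For $0 < \delta \le 1$ let

\[ a(s) = ((1+\delta)^s - 1)/s . \tag{29} \]

If $|c(\gamma)| \le 1$ for all $\gamma$ then

\[ \int_{-\infty}^{+\infty} |a(it)|^2\, \Bigl| \sum_{\gamma} \frac{c(\gamma)}{1 + (t-\gamma)^2} \Bigr|^2 dt
   = \int_{-\infty}^{+\infty} \Bigl| \sum_{|\gamma| \le Z} \frac{a(1/2 + i\gamma)\, c(\gamma)}{1 + (t-\gamma)^2} \Bigr|^2 dt
   + O\bigl(\delta^2 (\log 2/\delta)^3\bigr) + O\bigl(Z^{-1} (\log 2Z)^3\bigr) \tag{30} \]

provided that $Z \ge 1/\delta$.

Proof. By (28), the sum that occurs in the integral on the left is
$\ll \log(2 + |t|)$. Since

(31)

in the strip $|\sigma| \le 1/\delta$, it follows by Cauchy's formula or by
direct calculation that

\[ a'(s) \ll \min(\delta^2, \delta/|s|) \tag{32} \]

for $|\sigma| \le (2\delta)^{-1}$. Hence in particular,

\[ a(it) - a(1/2 + it) \ll \min(\delta^2, \delta/|t|) , \]

and consequently

Let I denote the integral on the left in (30), and J the corresponding
integral with $a(it)$ replaced by $a(1/2 + it)$. Then

\[ I - J \ll \int \min(\delta^3, \delta/t^2)\bigl(\log(2 + |t|)\bigr)^2 dt \ll \delta^2 (\log 2/\delta)^2 . \]

Write J in the form $J = \int |A|^2$. From (28) and (31) we see that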

\[ A \ll \min(\delta, |t|^{-1}) \log(2 + |t|) . \tag{33} \]

Now let K be the integral with $a(1/2 + it)$ replaced by
$a(1/2 + i\gamma)$, and write $K = \int |B|^2$. Then B also satisfies the
estimate (33). From (31) and (32) we see that

Thus

\[ A - B \ll \min(\delta^2, \delta/|t|)\bigl(\log(2/\delta + |t|)\bigr)^2 , \]

so that

and hence

\[ J - K \ll \delta^2 (\log 2/\delta)^3 . \]

Finally, let $L = \int |C|^2$ be the integral on the right in (30). We
note that C also satisfies the estimate (33). Since

\[ B - C \ll \min(Z^{-1}, |t|^{-1}) \log(2Z + |t|) , \]

we find that

Thus

\[ K - L \ll Z^{-1} (\log 2Z)^3 \]

and the proof is complete.

4. Proof of Theorem 1.
Although we arrange the technical details differently, the
ideas are entirely the same as in Selberg's paper. If $\delta X \le 1$ then
there is at most one prime power in the interval $(x, (1+\delta)x]$, so
that our integral is

\[ \ll \delta \sum_{n \le X} \Lambda(n)^2/n + \delta^2 X \ll \delta(\log X)^2 \]

which suffices. We now suppose that $\delta X > 1$. By the above argument
we see that

\[ \int_1^{1/\delta} \cdots \ll \delta(\log 2/\delta)^2 . \]

Thus it suffices to consider the range $1/\delta \le x \le X$. Here we apply
the explicit formula for $\psi(x)$ (see Davenport [1, §17]), which gives

\[ \psi((1+\delta)x) - \psi(x) - \delta x = - \sum_{|\gamma| \le Z} a(\rho) x^{\rho}
   + O\Bigl( (\log x) \min\Bigl(1, \frac{x}{Z\|x\|}\Bigr) \Bigr)
   + O\Bigl( (\log x) \min\Bigl(1, \frac{x}{Z\|(1+\delta)x\|}\Bigr) \Bigr) \tag{34} \]

where a(s) is given in (29), and $\|\theta\| = \min_n |\theta - n|$ is the
distance from $\theta$ to the nearest integer. The error terms contribute
a negligible amount if we take $Z = X(\log X)^2$. Writing $\rho = 1/2 + i\gamma$,
$x = e^y$, $Y = \log X$, we see that it remains to show that

\[ \int_{\log 1/\delta}^{Y} \Bigl| \sum_{|\gamma| \le Z} a(\rho) e^{i\gamma y} \Bigr|^2 dy \ll \delta Y \log 2/\delta . \tag{35} \]

By Lemma 5 we see that this integral is

\[ \ll Y^2 \int_{-\infty}^{+\infty} \Bigl| \sum_{\substack{|\gamma| \le Z \\ |\gamma - 2\pi u| \le 2/Y}} a(\rho) \Bigr|^2 du
   \ll Y \sum_{\substack{|\gamma| \le Z,\ |\gamma'| \le Z \\ |\gamma - \gamma'| \le 4/Y}} |a(\rho) a(\rho')| . \]

By (31) and Lemma 9 this gives (35), and the proof is complete.

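5. Proof of Theorem 2.
We first assume (8) as needed, and derive (10). Let

\[ J(T) = J(X,T) = 4 \int_0^T \Bigl| \sum_{\gamma} \frac{X^{i\gamma}}{1 + (t-\gamma)^2} \Bigr|^2 dt . \]

Montgomery [13] (see his (26), but beware of the changes in
notation) used (28) to show that

\[ J(X,T) = 2\pi F(X,T) + O((\log T)^3) . \]

Thus (8) is equivalent to

\[ J(X,T) = (1 + o(1))\, T \log T . \tag{36} \]

With a(s) defined in (29), we note that

\[ |a(it)|^2 = 4 \Bigl(\frac{\sin \kappa t}{t}\Bigr)^2 \]

where $\kappa = \frac{1}{2} \log(1+\delta)$. Then by Lemma 2 we deduce that

\[ \int_0^\infty |a(it)|^2\, \Bigl| \sum_{\gamma} \frac{X^{i\gamma}}{1 + (t-\gamma)^2} \Bigr|^2 dt
   = (\pi/2 + o(1))\,\kappa \log 1/\kappa = (\pi/4 + o(1))\,\delta \log 1/\delta . \tag{37} \]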
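
The link between Lemma 2 and the function a(s) of (29) rests on an elementary identity: with $\kappa = \frac{1}{2}\log(1+\delta)$ one has $|(1+\delta)^{it} - 1| = 2|\sin \kappa t|$, hence $|a(it)|^2 = 4(\sin \kappa t / t)^2$. The following is a small numerical sketch of that identity; the function names are assumptions for the example.

```python
import math


def a(s, delta):
    """The function a(s) = ((1 + delta)^s - 1) / s for a complex s != 0."""
    return ((1.0 + delta) ** s - 1.0) / s


def fejer_form(t, delta):
    """The expression 4 (sin(kappa t) / t)^2 with kappa = log(1 + delta) / 2."""
    kappa = 0.5 * math.log(1.0 + delta)
    return 4.0 * (math.sin(kappa * t) / t) ** 2
```

For example, `abs(a(2j, 0.5)) ** 2` and `fejer_form(2.0, 0.5)` agree up to rounding.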

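The values of T for which we have used (8) lie in the range

(38)

The integrand is even, so that the value is doubled if we integrate
over negative values of t as well. Then by Lemma 10

\[ \int_{-\infty}^{+\infty} \Bigl| \sum_{|\gamma| \le Z} \frac{a(\rho) X^{i\gamma}}{1 + (t-\gamma)^2} \Bigr|^2 dt = (\pi/2 + o(1))\,\delta \log 1/\delta \]

provided that $Z \ge \delta^{-1} (\log 1/\delta)^3$. Let S(t) denote the above sum
over $\gamma$. Its Fourier transform is

\[ \hat{S}(u) = \int_{-\infty}^{+\infty} S(t)\, e(-tu)\,dt = \pi \sum_{|\gamma| \le Z} a(\rho) X^{i\gamma} e(-\gamma u)\, e^{-2\pi|u|} . \]

Hence by Plancherel's identity the integral above is

\[ \pi^2 \int_{-\infty}^{+\infty} \Bigl| \sum_{|\gamma| \le Z} a(\rho) X^{i\gamma} e(-\gamma u) \Bigr|^2 e^{-4\pi|u|}\,du . \]

On writing $Y = \log X$, $-2\pi u = y$, we find that

\[ \int_{-\infty}^{+\infty} \Bigl| \sum_{|\gamma| \le Z} a(\rho) e^{i\gamma(Y+y)} \Bigr|^2 e^{-2|y|}\,dy = (1 + o(1))\,\delta \log 1/\delta . \tag{39} \]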
hlL.( z (39)
In Lemma 1 we take

\[ R(y) = \begin{cases} e^{2y} & \text{if } 0 \le y \le \log 2, \\ 0 & \text{otherwise.} \end{cases} \]

On making the change of variable $x = e^{Y+y}$ we deduce that

\[ \int_X^{2X} \Bigl| \sum_{|\gamma| \le Z} a(\rho) x^{\rho} \Bigr|^2 dx = (3/2 + o(1))\,\delta X^2 \log 1/\delta . \]

We replace X by $X2^{-k}$, sum over k, $1 \le k \le K$, and use the explicit
formula (34) with $Z = X(\log X)^3$ to see that

\[ \int_{X2^{-K}}^{X} \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 dx = \frac{1}{2}\bigl(1 - 2^{-2K} + o(1)\bigr)\,\delta X^2 \log 1/\delta . \]

We take $K = [\log\log X]$, and note that it suffices to have (8) in the
range (11). To bound the contribution of the range $1 \le x \le X2^{-K}$,
we appeal to (3) with X replaced by $X2^{-K}$. Thus we have (10).

We now deduce (8) from (10). By integrating (10) by parts from
$X_1$ to $X_2 = X_1 (\log X_1)^{2/3}$, we find that

\[ \int_{X_1}^{X_2} \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-4}\,dx = \Bigl(\frac{1}{2} + o(1)\Bigr)\,\delta (\log 1/\delta)\, X_1^{-2} . \]

From (3) we similarly deduce that

\[ \int_{X_2}^{\infty} \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-4}\,dx = o\bigl( \delta (\log 1/\delta)\, X_1^{-2} \bigr) . \]

We add these relations, and multiply through by $X_1^2$. By making a
further appeal to (10) with $X = X_1$ we deduce that

\[ \int_0^\infty \min(x^2/X_1^2,\ X_1^2/x^2)\, \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^2 x^{-2}\,dx = (1 + o(1))\,\delta \log 1/\delta . \]

We write X for $X_1$, put $Y = \log X$, $x = e^{Y+y}$, and appeal to the
explicit formula (34) with $Z = X(\log X)^3$, and we find that we have
(39). Retracing our steps, we find that we have (37). Then by
Lemma 4 we obtain (36), and hence (8). The values of $\delta$ and X for
which we have used (10) also satisfy (12).

6. Proof of the Corollary.


We note that Lemma 8 gives (8) when

-3
X(log X) (T ( X,

and that (10) is trivial when

Thus the equivalence of (a) and (b) follows immediately from


Theorem 2.

We now show that (b) implies (c). We suppress the converse
argument, which is similar. The method here is that of Saffari and
Vaughan [14]. Our first goal is to deduce from (b) that

\[ \int_0^H \int_0^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^2 dx\,dh \sim \frac{1}{2} H^2 X \log X/H \tag{40} \]

uniformly for

To this end it suffices to show that

\[ \int_{X/2}^{X} \int_0^H \bigl(\psi(x+h) - \psi(x) - h\bigr)^2 dh\,dx \sim \frac{1}{4} H^2 X \log X/H . \tag{41} \]

In this integral we replace h by $\delta = h/x$, and invert the order of
integration. Thus the left hand side above is

\[ \int_0^{H/X} \int_{X/2}^{X} f(x,\delta x)^2\, x\,dx\,d\delta + \int_{H/X}^{2H/X} \int_{X/2}^{H/\delta} f(x,\delta x)^2\, x\,dx\,d\delta \]

where $f(x,y) = \psi(x+y) - \psi(x) - y$. By integrating by parts, we see
from (b) that if $A \asymp B \asymp X$ then

\[ \int_A^B f(x,\delta x)^2\, x\,dx = \frac{1}{3}(B^3 - A^3)\,\delta \log 1/\delta + o(X^3\, \delta \log 1/\delta) . \]

This yields (41). Then (40) follows by replacing X by $X2^{-k}$ in
(41), summing over $0 \le k \le K = [2 \log\log X]$, and by appealing to
(4) with X replaced by $X2^{-K-1}$.

We now deduce (c) from (40). Suppose that $0 < \eta < 1$. By
differencing in (40) we see that

\[ \int_H^{(1+\eta)H} \int_0^X f(x,h)^2\,dx\,dh = \Bigl(\eta + \frac{1}{2}\eta^2 + o(1)\Bigr) X H^2 \log X/H . \]

Let $g(x,h) = f(x,H)$. From the identity

\[ f^2 - g^2 = 2f(f-g) - (f-g)^2 \]

and the Cauchy-Schwarz inequality we find that

But $f(x,h) - g(x,h) = f(x+H, h-H)$, so that

\[ \iint (f-g)^2 = \int_0^{\eta H} \int_H^{X+H} f(x,h)^2\,dx\,dh \ll \eta^2 H^2 X \log X/H \]

by (40). Hence we see that

\[ \eta H \int_0^X \bigl(\psi(x+H) - \psi(x) - H\bigr)^2 dx = \iint g^2
   = \iint f^2 + O(\eta^{3/2} X H^2 \log X/H)
   = \bigl(\eta + O(\eta^{3/2}) + o(1)\bigr) X H^2 \log X/H . \]

We now divide both sides by $\eta H$, and obtain the desired result by
letting $\eta \to 0+$ sufficiently slowly.

References.

1. H. Davenport, Multiplicative Number Theory, Second Edition,
Springer-Verlag, 1980.

2. A. Fujii, On the zeros of Dirichlet L-functions, I, Trans.
Amer. Math. Soc. 196 (1974), 225-235. (Corrections to this
paper are noted in Trans. Amer. Math. Soc. 267 (1981), pp. 38-39,
and in [5; pp. 219-220].)

3. P.X. Gallagher, A large sieve density estimate near $\sigma = 1$,
Invent. Math. 11 (1970), 329-339.

4. P.X. Gallagher, On the distribution of primes in short
intervals, Mathematika 23 (1976), 4-9.

5. P.X. Gallagher and Julia H. Mueller, Primes and zeros in short
intervals, J. Reine Angew. Math. 303/304 (1978), 205-220.

6. Daniel A. Goldston, Large differences between consecutive prime
numbers, Thesis, University of California Berkeley, 1981.

7. H. Halberstam and H.-E. Richert, Sieve Methods, Academic Press,
London, 1974.

8. G.H. Hardy, Divergent Series, Oxford University Press, 1963.

9. G.H. Hardy and W.W. Rogosinski, Notes on Fourier series (I):
On sine series with positive coefficients, J. London Math. Soc.
18 (1943), 50-57.

10. D.R. Heath-Brown, Gaps between primes, and the pair correlation
of zeros of the zeta-function, Acta Arith. 41 (1982), 85-99.

11. D.R. Heath-Brown and D.A. Goldston, A note on the difference
between consecutive primes, Math. Ann. 266 (1984), 317-320.

12. Julia Huang (=J.H. Mueller), Primes and zeros in short
intervals, Thesis, Columbia University, 1976.

13. H.L. Montgomery, The pair correlation of zeros of the zeta
function, Proc. Sympos. Pure Math. 24 (1973), 181-193.

14. B. Saffari and R.C. Vaughan, On the fractional parts of x/n and
related sequences II, Ann. Inst. Fourier (Grenoble) 27, no. 2,
(1977), 1-30.

15. A. Selberg, On the normal density of primes in small intervals,
and the difference between consecutive primes, Arch. Math.
Naturvid. 47, no. 6, (1943), 87-105.

16. E.C. Titchmarsh, The theory of the Riemann zeta-function,
Oxford University Press, 1951.

17. J.D. Vaaler, Some extremal functions in Fourier analysis, Bull.
Amer. Math. Soc., 12, No. 2, (1985), 183-216.

D. A. Goldston
San Jose State University,
San Jose, CA 95192,
U.S.A.

H. L. Montgomery
University of Michigan,
Ann Arbor, MI 48109,
U.S.A.
ONE AND TWO DIMENSIONAL EXPONENTIAL SUMS

S. W. Graham and G. Kolesnik

1. Introduction
In number theory, one often encounters sums of the form

(1)

where $\mathcal{D}$ is a bounded domain in $R^k$ and $e(w) = e^{2\pi i w}$. We shall refer
to the case k = 1 as the one-dimensional case, k = 2 as the
two-dimensional case, etc. Our objective here is to give an exposition
of van der Corput's method for estimating the sums in (1). The
one-dimensional case is well understood. Our knowledge of the
two-dimensional case is fragmentary, and dimensions higher than two are
terra incognita. We shall review the one-dimensional case in
Section 2. In Section 3 we will give an outline of what is known
and what is conjectured about the two-dimensional case.

2. The one-dimensional case


Let N be a large positive integer, I a subinterval of (N, 2N],
and $f : I \to R$. We wish to get an upper bound for
$S := \sum_{n \in I} e(f(n))$. Since $|e(f(n))| = 1$, we have the trivial upper
bound $|S| \le N$. Moreover, this upper bound is attained when f(n) =
an + b, a is an integer, and I = (N, 2N]. A non-trivial upper bound
thus requires some conditions on f. Usually these conditions are
hypotheses about the derivatives of f. One example is

Theorem 1. (Kusmin-Landau inequality)
Assume that f' is monotonic and that $\|f'\| \ge \lambda$ on I, where

\[ \|x\| := \min_{n \in Z} |x - n| . \]

Then

\[ \sum_{n \in I} e(f(n)) \ll \lambda^{-1} . \]

This inequality is implicit in Lemmas 4.8 and 4.2 of
Titchmarsh's book [20]. An elementary proof can be found in
Herzog-Piranian [6].
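
A quick numerical experiment shows how far below the trivial bound N the Kusmin-Landau estimate sits. The sketch below is an illustration under stated assumptions: the helper `exponential_sum` and the test phase $f(x) = 0.3x + 0.05x^2/N$ are choices made for this example. On (N, 2N] its derivative $f'(x) = 0.3 + 0.1x/N$ increases from 0.4 to 0.5, so $\|f'\| \ge \lambda = 0.4$ there.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def exponential_sum(f, first, last):
    """Return the sum of e(f(n)) over integers first <= n <= last."""
    return sum(e(f(n)) for n in range(first, last + 1))


N = 1000
LAMBDA = 0.4  # lower bound for the distance of f' from the integers


def phase(x):
    """Example phase whose derivative 0.3 + 0.1 x / N is monotonic on (N, 2N]."""
    return 0.3 * x + 0.05 * x * x / N


S = exponential_sum(phase, N + 1, 2 * N)
# abs(S) stays of size about 1/LAMBDA, far below the trivial bound N.
```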

The condition that $\|f'\| \ge \lambda$ is too restrictive for most
applications. Van der Corput's method applies to a much wider class
of functions. It depends upon two processes, which have become
known as the A-process and the B-process. The A-process may be
formulated as

Lemma 1. Let I and f be as before. Then

\[ \Bigl| \sum_{n \in I} e(f(n)) \Bigr|^2 \le \frac{|I| + Q}{Q} \sum_{|q| < Q} \Bigl(1 - \frac{|q|}{Q}\Bigr) \sum_{\substack{n \in I \\ n+q \in I}} e(f(n+q) - f(n)) . \]

The proof of Lemma 1 uses the Cauchy-Schwarz inequality on the
sum

\[ \sum_{q=1}^{Q} \sum_{\substack{n \in I \\ n+q \in I}} e(f(n+q)) ; \]

see Titchmarsh [20] for details.
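
The inequality of Lemma 1 can be tested directly on a small example. The following is a minimal sketch, assuming I is the set of integers n with a < n <= b and taking |I| to be the number of such integers; the function name `vdc_sides` is an assumption for the example.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def vdc_sides(f, a, b, Q):
    """Return (lhs, rhs) of the A-process inequality for integers a < n <= b.

    lhs is |sum e(f(n))|^2; rhs is ((|I| + Q) / Q) times the weighted sum
    over |q| < Q of the differenced sums of e(f(n+q) - f(n)).
    """
    ns = range(a + 1, b + 1)
    size = b - a
    lhs = abs(sum(e(f(n)) for n in ns)) ** 2
    rhs = 0.0
    for q in range(-Q + 1, Q):
        inner = sum(e(f(n + q) - f(n)) for n in ns if a < n + q <= b)
        rhs += (1 - abs(q) / Q) * inner
    rhs *= (size + Q) / Q
    # The terms for q and -q are complex conjugates, so rhs is real
    # up to rounding error.
    return lhs, rhs.real
```

For instance, `vdc_sides(lambda x: x ** 1.5, 100, 200, 10)` returns a pair whose first entry does not exceed the second.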


In most applications, the following variant of Lemma 1 is used.

Lemma 1A. Let I and f be as before, let

\[ S = \sum_{n \in I} e(f(n)) \quad\text{and}\quad S_q = \sum_{\substack{n \in I \\ n+q \in I}} e(f(n+q) - f(n)) . \]

If $Q \le |I|$ then

(2)

Of course, $|I|$ on the right-hand side of (2) may be replaced by
N. But there are occasions when one needs to use the fact that I
is short, and it is important to have $|I|$ in (2).

The B-process is a combination of the Poisson summation formula
and the saddle point method. One possible formulation is

Lemma 2. Let $I = [a, b] \subset [N, 2N]$. Assume f has four continuous
derivatives and that $f''(x) < 0$ on I. Assume further that

and that $m_3^2 = m_2 m_4$. Let $f'(b) = \alpha$, $f'(a) = \beta$, and let $n_\nu$ be such
that $f'(n_\nu) = \nu$ for $\alpha < \nu \le \beta$. Then

\[ \sum_{\alpha < \nu \le \beta} e\bigl(f(n_\nu) - \nu n_\nu - 1/8\bigr)\, |f''(n_\nu)|^{-1/2} \]

This is Lemma 3 of Phillips [13]. Heath-Brown [5] and Atkinson
[1] give other versions of this lemma in which f is assumed to be
analytic in some appropriate domain; this hypothesis naturally leads
to strong error terms.

The effects of the A and B processes can be explained
succinctly by the theory of exponent pairs. In this theory, we deal
with functions f satisfying the following conditions:

(3.1) f has infinitely many derivatives on I,

(3.2) there exists $y > 0$, $s > 0$, and d, $0 < d < 1/2$, such that
for all integers $p \ge 0$ and all $x \in I$,

\[ |f^{(p+1)}(x) - (-1)^p (s)_p\, y x^{-s-p}| < d\,(s)_p\, y x^{-s-p} , \]

(3.3) $z := y a^{-s} \ge 1/2$.

The symbol $(s)_p$ in condition (3.2) is defined by $(s)_0 = 1$ and
$(s)_p = s(s+1)\cdots(s+p-1)$ if $p \ge 1$. Condition (3.2) states that f is,
in an appropriate sense, well approximated by $y x^{-s}$. In condition
(3.3), z is effectively f'(a). The condition $z \ge 1/2$ is motivated
by the fact that we can apply the Kusmin-Landau inequality in the
contrary case.

Definition. The ordered pair (k,£) is an exponen~ pai~ if


o ( k ( 1/2 ( £ ( 1, and i f for all f satsifying (3.1)-(3.3), the
estimate
k £
L e(f(n)) ~ z N
n E I
holds.

The trivial estimate shows that (0,1) is an exponent pair. By
application of Lemma 1A, one can prove that if $(k,\ell)$ is an exponent
pair then
\[ A(k,\ell) = \left( \frac{k}{2k+2},\ \frac{k+\ell+1}{2k+2} \right) \]
is also an exponent pair. By application of Lemma 2, one can prove
that if $(k,\ell)$ is an exponent pair and if $k + 2\ell \ge 3/2$ then
\[ B(k,\ell) = (\ell - 1/2,\ k + 1/2) \]
is also an exponent pair. Proofs of these results can be found in
Phillips [13]. The restriction $k + 2\ell \ge 3/2$ in the B-process can be
removed by appealing to the stronger versions of Lemma 2 previously
mentioned. Moreover, this condition is satisfied by every exponent
pair that arises from the A-process. There is no point in applying
B to an exponent pair arising from the B-process since
$B^2(k,\ell) = (k,\ell)$.
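The two processes are easy to experiment with in exact rational arithmetic. The following is a minimal sketch, not part of the original paper; the function names `A` and `B` and the use of Python's `fractions` module are illustrative choices. It checks that A fixes (0,1), that B is an involution, and that every pair coming out of A satisfies the side condition $k + 2\ell \ge 3/2$ needed for B.

```python
from fractions import Fraction as Fr

def A(pair):
    """A-process: (k, l) -> (k/(2k+2), (k+l+1)/(2k+2))."""
    k, l = pair
    return (k / (2 * k + 2), (k + l + 1) / (2 * k + 2))

def B(pair):
    """B-process: (k, l) -> (l - 1/2, k + 1/2)."""
    k, l = pair
    return (l - Fr(1, 2), k + Fr(1, 2))

start = (Fr(0), Fr(1))      # the trivial exponent pair (0, 1)
ab = A(B(start))            # AB(0,1), expected to be (1/6, 2/3)

# Side condition k + 2l >= 3/2 for a pair produced by the A-process.
side_condition_ok = ab[0] + 2 * ab[1] >= Fr(3, 2)
```

Working with `Fraction` rather than floats keeps identities such as $B^2 = \mathrm{id}$ exact.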
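For computational purposes, it is convenient to think of A and
B as linear transformations on projective space. Let
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 2 & 0 & 2 \end{pmatrix}, \qquad
   B = \begin{pmatrix} 0 & 2 & -1 \\ 2 & 0 & 1 \\ 0 & 0 & 2 \end{pmatrix}. \]
Then
\[ A \begin{pmatrix} k \\ \ell \\ 1 \end{pmatrix} = \begin{pmatrix} k \\ k + \ell + 1 \\ 2k + 2 \end{pmatrix}. \]
In projective space this is equal to
\[ \begin{pmatrix} k/(2k+2) \\ (k + \ell + 1)/(2k+2) \\ 1 \end{pmatrix} = \begin{pmatrix} \kappa \\ \lambda \\ 1 \end{pmatrix}, \]
where $(\kappa,\lambda) = A(k,\ell)$. The B matrix has an analogous effect. We
are, of course, abusing notation by using the same letters in two
different senses, but the intended meaning will be clear from the
context.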
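The projective point of view can be checked with a few lines of code. This is a sketch under stated assumptions: the 3 x 3 integer matrices `A_MAT` and `B_MAT` below are one choice of representatives (any nonzero scalar multiple acts the same way), and the helper names `matmul` and `act` are hypothetical.

```python
from fractions import Fraction as Fr

# Assumed representatives acting on the column vector (k, l, 1).
A_MAT = [[1, 0, 0], [1, 1, 1], [2, 0, 2]]
B_MAT = [[0, 2, -1], [2, 0, 1], [0, 0, 2]]

def matmul(M, N):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(M[i][t] * N[t][j] for t in range(3)) for j in range(3)]
            for i in range(3)]

def act(M, pair):
    """Apply M to (k, l, 1) and normalise the last coordinate to 1."""
    v = (Fr(pair[0]), Fr(pair[1]), Fr(1))
    w = [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]
    return (w[0] / w[2], w[1] / w[2])

# Composition of functions corresponds to matrix multiplication.
ab_pair = act(matmul(A_MAT, B_MAT), (0, 1))
b_squared = matmul(B_MAT, B_MAT)   # a scalar matrix, so B is an involution
```

The point of the matrix form is that a long word in A and B collapses to a single matrix product before it is applied to (0,1).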

As we noted before, B is an involution. Moreover, A(0,1) =
(0,1). It follows that any exponent pair obtainable from the A and
B processes can be written either in the form

(4)

or in the form

(5)

where $q_1, \ldots, q_r$ are non-negative integers. (When A and B are
thought of as functions on $\mathbb{R}^2$, $A^{q_1} B A^{q_2} B \cdots A^{q_r} B$ is a composition of
functions. Thus AB(0,1) = A(B(0,1)) = A(1/2,1/2) = (1/6,2/3). When
A and B are thought of as matrices, $A^{q_1} B \cdots A^{q_r} B$ is a matrix
multiplication.)
We use $P$ to denote the set of all exponent pairs obtainable
from (0,1) by A and B. Exponent pairs of the form (4) are in the
set $AP$; those of the form (5) are in the set $BAP$. Note that (0,1)
$\in AP$ since A(0,1) = (0,1).

Exponent pairs enjoy a convexity property. From the inequality

                                        $(0 \le \alpha \le 1)$          (6)

we see that if $(k_1,\ell_1)$ and $(k_2,\ell_2)$ are exponent pairs, then so is
\[ (\alpha k_1 + (1-\alpha) k_2,\ \alpha \ell_1 + (1-\alpha) \ell_2) \]
for any $\alpha$, $0 \le \alpha \le 1$. Consequently, $\bar{P}$, the convex hull of $P$, is a
set of exponent pairs. In fact all known exponent pairs are in $\bar{P}$.
However, it is possible that there are other exponent pairs. For
example, it has been conjectured that $(\epsilon, 1/2 + \epsilon)$ is an exponent
pair for every $\epsilon > 0$.
In applications, it is usually desirable to minimize some

function on $P$. We illustrate this with the following examples. Let
$\zeta(s)$ denote Riemann's zeta function, let d(n) be the number of
divisors of n, and let r(n) be the number of ways of writing n as a
sum of two squares. Set
\[ \Delta(x) = \sum_{n \le x} d(n) - x(\log x + 2\gamma - 1), \]
and
\[ R(x) = \sum_{n \le x} r(n) - \pi x. \]
It can be proved that if $(k,\ell)$ is an exponent pair and if
$\theta(k,\ell) = k + \ell - 1/2$ then
\[ \Delta(x) \ll x^{\theta} \log x + x^{1/4} \log x, \]
\[ R(x) \ll x^{\theta} \log x + x^{1/4} \log x, \]
and
\[ \zeta(1/2 + it) \ll t^{\theta/2} \log t. \]
This motivates the problem of finding
\[ \inf_{(k,\ell) \in P} (k + \ell). \qquad (7) \]

In 1945, Rankin [14] found an algorithm for computing (7). His work
was published ten years later, but he did not give the details of
his method since they involved much heavy algebra. Recently, one of
us (Graham) has found an algorithm for computing
\[ \inf_{(k,\ell) \in P} \frac{ak + b\ell + c}{dk + e\ell + f}; \qquad (8) \]
the algebra can be considerably lightened by appealing to matrix
notation.

The algorithm yields a sequence of exponent pairs which provide
approximations to the desired infimum. The rth term in this
sequence has the form $A^{q_1} B A^{q_2} \cdots A^{q_r} B(0,1)$, where all the $q_i$'s are
non-negative integers, and only $q_1$ can be zero. The sequence
$(q_1, q_2, \ldots)$ is called the q-sequence. It is unusual to have $q_i \ge 10$,
so it is convenient to use baseball notation and write the
q-sequence as a string of digits. For example, in the
problem of finding (7) the optimal q-sequence is

13211 21122 12221 21122 11213 (9)

This means that the sequence of exponent pairs leading to the
infimum is
\[ AB(0,1) = (1/6,\ 2/3), \]
\[ ABA^3B(0,1) = (11/82,\ 57/82), \]
\[ ABA^3BA^2B(0,1) = (33/234,\ 161/234), \]
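The passage from a q-sequence to the corresponding exponent pairs can be sketched as follows. This is an illustrative script, not the authors' calculator program; the helper name `pair_from_q` is hypothetical, and the digit string is the one displayed in (9).

```python
from fractions import Fraction as Fr

def A(k, l):
    return (k / (2 * k + 2), (k + l + 1) / (2 * k + 2))

def B(k, l):
    return (l - Fr(1, 2), k + Fr(1, 2))

def pair_from_q(qs):
    """Evaluate A^{q_1} B A^{q_2} B ... A^{q_r} B (0,1), innermost factor first."""
    k, l = Fr(0), Fr(1)
    for q in reversed(qs):
        k, l = B(k, l)
        for _ in range(q):
            k, l = A(k, l)
    return (k, l)

# Digits of the q-sequence (9), spaces removed.
Q_DIGITS = [int(ch) for ch in "1321121122122212112211213"]

# k + l for the first few truncations of the q-sequence.
partial_sums = [sum(pair_from_q(Q_DIGITS[:r])) for r in range(1, 6)]
for r, value in enumerate(partial_sums, start=1):
    print(r, float(value))
```

Each truncation of the q-sequence gives an honest exponent pair, so every printed value is an upper bound for the infimum in (7).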

etc. Using a Casio FX-700P programmable calculator, we have carried
the sequence in (9) out to 100 terms. Glen Ierley and his IBM PC-XT
have shown that this gives $\inf(k + \ell)$ to 85 decimal places. To 30
places, the answer is
\[ \inf_{(k,\ell) \in P} (k + \ell) = 0.82902\ 13568\ 59133\ 59240\ 92397\ 77283. \]

The details of the above mentioned algorithm will appear later,
but we can give a short sketch of it here. Let
\[ \theta(k,\ell) = \frac{ak + b\ell + c}{dk + e\ell + f}. \]
It is necessary to assume that $dk + e\ell + f > 0$ for all $(k,\ell) \in P$.


In practice, this requires checking only the points (0,1), (1/2,1/2)
and (0,1/2), for P is contained inside the triangle determined by
these points.
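Since $dk + e\ell + f$ is linear in $(k,\ell)$, positivity at the three vertices gives positivity on the whole triangle. A minimal sketch of that check follows; the function name `denominator_positive` is an illustrative assumption.

```python
from fractions import Fraction as Fr

# Vertices of the triangle containing P.
VERTICES = [(Fr(0), Fr(1)), (Fr(1, 2), Fr(1, 2)), (Fr(0), Fr(1, 2))]

def denominator_positive(d, e, f):
    """True if d*k + e*l + f > 0 at every vertex, hence on the whole triangle."""
    return all(d * k + e * l + f > 0 for k, l in VERTICES)

# Problem (7): theta = k + l has denominator 0*k + 0*l + 1.
ok_for_sum = denominator_positive(0, 0, 1)
# A denominator that changes sign on the triangle is rejected.
bad = denominator_positive(1, -1, 0)
```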

We may also regard $\theta$ as a matrix, i.e.
\[ \theta = \begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix}. \]
Let u, v, and w denote the 2 x 2 sub-determinants of $\theta$, so that

The algorithm is based on

Lemma 3. If $(k,\ell) \in AP$, then $\theta B(k,\ell) - \theta(k,\ell)$ has the same sign as
\[ w(k + \ell) + v - u. \]

We then apply this lemma as follows. Let
\[ r = \inf_{(k,\ell) \in P} (k + \ell) = .82902\ 13568\ 59133 \ldots, \]
\[ Y = \max \{ w + v - u,\ wr + v - u \}, \]
and
\[ Z = \min \{ w + v - u,\ wr + v - u \}. \]
The analysis then breaks into three cases.

Case 1. $Z \ge 0$. Then $\theta B(k,\ell) \ge \theta(k,\ell)$ for all $(k,\ell)$ in $P$.
Consequently,
\[ \inf_{(k,\ell) \in P} \theta(k,\ell) = \inf_{(k,\ell) \in P} \theta A(k,\ell). \]
We let $\theta_1 = \theta A$, and we repeat the analysis.

Case 2. $Y \le 0$. Then $\theta B(k,\ell) \le \theta(k,\ell)$ for all $(k,\ell)$ in $P$.
Consequently,
\[ \inf_{(k,\ell) \in P} \theta(k,\ell) = \inf_{(k,\ell) \in P} \theta BA(k,\ell). \]
We let $\theta_1 = \theta BA$, and we repeat the analysis.

Case 3. Z < 0 < Y. In this case, the algorithm branches. We
pursue each branch until one of them can be shown to be superior.

3. Two dimensional sums.

Let $D \subset [X, 2X] \times [Y, 2Y]$ and let $f: D \to \mathbb{R}$. Define
\[ S = \sum_{(m,n) \in D} e(f(m,n)). \]
In analogy with the one-dimensional theory of exponent pairs, it is
appropriate to assume that

where
A is a non-zero real constant,
$\alpha < 1$, $\beta < 1$, $\alpha\beta \ne 0$, and
$\delta = \delta(X, Y) \to 0$ as $X \to \infty$ and $Y \to \infty$.

The primary tools for estimating S are two dimensional analogues of
the Poisson summation formula and the Weyl-van der Corput
inequality.

First, let us consider the Poisson summation formula. Recall
that in Lemma 2, terms of the form $|f''(n_\nu)|^{-1/2}$ appear, so that the
usefulness of that lemma is lessened when f" becomes small. In two
dimensional sums, the Hessian of f plays a similar role. The
Hessian of f is defined by
\[ Hf = \det \begin{pmatrix} D_{xx} f & D_{xy} f \\ D_{xy} f & D_{yy} f \end{pmatrix}. \]

A precise version of the two dimensional Poisson summation
formula is complicated to state; see [17], Lemma 4 or [9], Lemma 2.
We will mention only that under suitable conditions on f and D, we
have
\[ \sum_{(m,n) \in D} e(f(m,n)) \ll M^{-1/2} \Bigl| \sum_{(\mu,\nu) \in \mathcal{E}'} e\bigl(f(\xi,\eta) - \mu\xi - \nu\eta\bigr) \Bigr| + \text{Error terms}. \]
Here, it is understood that

(i) M satisfies $M \ll Hf \ll M$,
(ii) $\mathcal{E}$ is the image of D under $\mu = D_x f$, $\nu = D_y f$,
(iii) $\mathcal{E}'$ is some subset of $\mathcal{E}$,
(iv) $\xi = \xi(\mu,\nu)$ and $\eta = \eta(\mu,\nu)$ are defined by
     $D_x f(\xi,\eta) = \mu$ and $D_y f(\xi,\eta) = \nu$.

The two dimensional Weyl-van der Corput inequality can be
expressed as

Lemma 4. If $Q \le X$ and $R \le Y$ then
\[ |S|^2 \ll \frac{X^2 Y^2}{QR} + \frac{XY}{QR} \sum_{\substack{|q| < Q,\ |r| < R \\ (q,r) \ne (0,0)}} |S_1(q,r)|, \]
where
\[ S_1(q,r) = \sum_{(m,n) \in D_1(q,r)} e(f_1(m,n;q,r)), \]
with
\[ f_1(m,n;q,r) = f(m+q, n+r) - f(m,n) = \int_0^1 \frac{\partial}{\partial t} f(m+qt, n+rt)\, dt, \]
and
\[ D_1(q,r) = \{ (m,n) : (m+qt, n+rt) \in D \text{ for } t = 0, 1 \}. \]

In analogy with the one-dimensional case, we can hope to prove
an estimate of the form
\[ S \ll L_1^{k_1} X^{\ell_1} L_2^{k_2} Y^{\ell_2}, \qquad (10) \]
where $L_1 = |A| X^{-\alpha-1} Y^{-\beta}$ and $L_2 = |A| X^{-\alpha} Y^{-\beta-1}$. Note that $L_1 \approx D_x f$ and
$L_2 \approx D_y f$. If we can prove an estimate of the form (10) under
appropriate assumptions on f and D, we say that $(k_1,\ell_1; k_2,\ell_2)$ is an
exponent quadruple. Note that since
\[ |S| \le \sum_m \Bigl| \sum_n e(f(m,n)) \Bigr|, \qquad (11) \]
$(0,1; k,\ell)$ is an exponent quadruple whenever $(k,\ell)$ is an exponent
pair. Similarly, $(k,\ell; 0,1)$ is an exponent quadruple.
Unfortunately, the application of Lemma 4 and the Poisson
summation formula is not as straightforward as it is in the
one-dimensional case. To illustrate why this is so, we consider
\[ f(m,n) = A m^{-\alpha} n^{-\beta}. \]

After applying Lemma 4, we encounter functions of the form
\[ f_1(m,n;q,r) = \int_0^1 \frac{\partial}{\partial t} A (m+qt)^{-\alpha} (n+rt)^{-\beta}\, dt
   \approx -A m^{-\alpha} n^{-\beta} (\alpha q m^{-1} + \beta r n^{-1}). \]
If we then apply the Poisson summation formula, we must first
compute $Hf_1$. Now

where

For some values of the parameters, the expression P will vanish, or
it will be inconveniently small. The effect of this is that the
Poisson summation formula cannot be applied directly. Instead, we
subdivide D into a region where P is small and another region where
P is large. In the latter region, we can apply the Poisson summation
formula. In the former region, we use some other estimate such
as (11). There are considerable technical difficulties in carrying
this out, and the difficulties become even more pronounced when
Lemma 4 is used more than once. Here we shall ignore these
difficulties and argue heuristically. By Lemma 4,

Now
\[ S_1(q,r) = \sum_{(m,n) \in D_1(q,r)} e(f_1(m,n;q,r)), \]
and
\[ f_1(m,n;q,r) = \int_0^1 \frac{\partial}{\partial t} f(m+qt, n+rt)\, dt \approx \frac{qF}{X} + \frac{rF}{Y} \approx \rho F \]
where $F = |A| X^{-\alpha} Y^{-\beta}$ and $\rho = \max(|q| X^{-1}, |r| Y^{-1})$. If $(k_1,\ell_1; k_2,\ell_2)$
is an exponent quadruple, then

Now assume that Q and R are chosen so that $QX^{-1} = RY^{-1}$. If we set
$Z = Q^2 Y X^{-1} = R^2 X Y^{-1}$, then

(k 1+k )/2
1
L (...l.) 2
QR XY
Iql<Q Irt<R
It follows that

Choose Z so that the two terms on the right-hand side are equal.
Then

Thus we see heuristically that if $(k_1,\ell_1; k_2,\ell_2)$ is an exponent
quadruple then so is

Similarly, a heuristic argument with the Poisson summation formula


yields the exponent quadruple

One way of avoiding the difficulties implicit in Lemma 4 is to
apply it with Q = 1 or R = 1. Classical scholars will recall that
Titchmarsh [18] used this approach. In his notation, Lemma 4 is
Lemma ~, and Lemma 4 with R = 1 is Lemma ~'. By taking R = 1 and
arguing heuristically, we see that this approach should lead to the
exponent quadruple

Similarly, with Q = 1 one gets

We may use (6) and take the average of $A_1$ and $A_2$ to get the exponent
quadruple

The "s" here stands for Srinivasan, who used essentially this
operation in his method of exponent pairs [17]. We shall say more
about this later.

It is also possible to apply the Poisson summation formula to
one variable at a time and get the exponent quadruples

and

In some applications, the critical cases for estimating S occur
when $X \approx Y$. In such a case, it is desirable to have $k_1 = k_2$ and
$\ell_1 = \ell_2$. Note that
\[ A(k,\ell; k,\ell) = \left( \frac{k}{2k+2}, \frac{k+\ell+1}{2k+2};\ \frac{k}{2k+2}, \frac{k+\ell+1}{2k+2} \right) \]
and
\[ B(k,\ell; k,\ell) = (\ell - 1/2,\ k + 1/2;\ \ell - 1/2,\ k + 1/2). \]
We thus have the following

Conjecture. If $(k,\ell)$ is an exponent pair, then $(k,\ell; k,\ell)$ is an
exponent quadruple.

The conjecture is known in the following special cases.

1. f(x,y) = g(x) + h(y) and D is a rectangle. In this case,
\[ S = \sum_{(m,n) \in D} e(g(m) + h(n)) = \sum_m e(g(m)) \sum_n e(h(n)) \]
and the result follows immediately.

2. $(k,\ell) = (0,1)$. This is the trivial estimate.

3. $(k,\ell) = B(0,1) = (1/2,1/2)$. This has been proved by
several authors independently; see [3], [5], and [16].

4. $(k,\ell) = AB(0,1) = (1/6,2/3)$. See Theorem 1 of
Kolesnik [10].

5. $(k,\ell) = A^qB(0,1)$ for any q > 0. This is a result of the
authors which is in preparation.

Srinivasan [17] has used $A_s$ to develop a theory of exponent
quadruples. Roughly stated, his theory is as follows. Let $P_s$ be
the set of all pairs obtained from (0,1) by
\[ A_s(k,\ell) = \left( \frac{k}{4k+2},\ \frac{3k+\ell+1}{4k+2} \right) \]
and
\[ B(k,\ell) = (\ell - 1/2,\ k + 1/2). \]
If $(k,\ell) \in P_s$, then $(k,\ell; k,\ell)$ is an exponent quadruple.
It should be noted that Srinivasan's notation is different from
ours; he says that $(k,\ell)$ is a two-dimensional exponent pair if
The applications mentioned in Section 1 can be done with two
dimensional sums. Assume that $(k,\ell; k,\ell)$ is an exponent quadruple,
and let
\[ \theta = \theta(k,\ell) = \frac{2k + 2\ell - 1}{4\ell - 1}. \]

Then for some constant C > 0, we have

(12.1 )

(12.2)

(12.3)

Here is a historical survey of the results of this type that have
appeared in the literature.

1. $(k,\ell) = A_s^3 B(0,1)$; $\theta = 19/58$. This was done by Titchmarsh
[19] for $\zeta(1/2 + it)$.

2. $(k,\ell) = A_s^2 AB(0,1)$; $\theta = 15/46$. This was done by Titchmarsh
[18] for R(x), by Min [12] for $\zeta(1/2 + it)$, and by Richert [15] for
$\Delta(x)$.

3. $(k,\ell) = A_s A^2 B(0,1)$; $\theta = 13/40$. This was done by Hua [7]
for R(x).

4. $(k,\ell) = A^3 B(0,1)$; $\theta = 12/37$. This was done by Haneke [4]
for $\zeta(1/2 + it)$, by Chen [2] for R(x), and by Kolesnik [8] for $\Delta(x)$.

5. $(k,\ell) = A^3 B A_s^3 B(0,1)$; $\theta = 35/108$. This was done by Kolesnik
[11] for $\zeta(1/2 + it)$ and $\Delta(x)$.

Note that
\[ \frac{35}{108} = .324\ 074 \ldots. \]
If our conjecture is true for $(k,\ell) = A^3BA^3B(0,1)$, then we could
prove (12.1), (12.2), and (12.3) with
\[ \theta = \frac{23}{71} = .32394\ 36619 \ldots. \]
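The exponents in this survey can be recomputed from the words in $A$, $A_s$ and $B$. The sketch below is illustrative only: it assumes $\theta(k,\ell) = (2k + 2\ell - 1)/(4\ell - 1)$ and Srinivasan's $A_s(k,\ell) = (k/(4k+2), (3k+\ell+1)/(4k+2))$, encodes $A_s$ by the letter `S`, and reads item 5 of the survey as the word $A^3 B A_s^3 B$. The helper names are hypothetical.

```python
from fractions import Fraction as Fr

def A(k, l):
    return (k / (2 * k + 2), (k + l + 1) / (2 * k + 2))

def A_s(k, l):
    return (k / (4 * k + 2), (3 * k + l + 1) / (4 * k + 2))

def B(k, l):
    return (l - Fr(1, 2), k + Fr(1, 2))

OPS = {"A": A, "S": A_s, "B": B}

def apply_word(word):
    """Apply a word such as 'AAAB' to (0, 1); the rightmost letter acts first."""
    k, l = Fr(0), Fr(1)
    for op in reversed(word):
        k, l = OPS[op](k, l)
    return (k, l)

def theta(k, l):
    return (2 * k + 2 * l - 1) / (4 * l - 1)

# Words for items 1-5 of the survey, then the conjectural A^3 B A^3 B.
WORDS = ["SSSB", "SSAB", "SAAB", "AAAB", "AAABSSSB", "AAABAAAB"]
exponents = {w: theta(*apply_word(w)) for w in WORDS}
```

The values decrease from 19/58 to 23/71, which is the sense in which each result in the list improves on the previous one.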

If we assume the conjecture for all $(k,\ell)$ and apply the algorithm
mentioned in Section 1, we find that the optimal q-sequence is

32122 11121 21211 21121 11122 11111

and the limiting value for $\theta$ is

.32392 47503 76239 83494 00175 84916

We would like to mention two more applications. We can apply
Lemma 4 with Q = 1 to estimate sums of the form
\[ \sum_{(m,n) \in D} a(m)\, e(f(m,n)). \]
An example of this is given in Lemma 4 of [3]. By making some
slight modifications of that lemma, we can prove that if
$\sum_m |a(m)|^2 \ll X$ and $s > 0$ then
\[ \sum_{X < m \le 2X} \sum_{Y < n \le 2Y} a(m)\, e(x m^{-s} n^{-s})
   \ll F^{1/4} X^{3/4} Y^{1/2} + X^{5/6} Y^{5/6} + F^{-1/4} XY + XY^{1/2}, \]
where $F = x X^{-s} Y^{-s}$. The first term may be written as
\[ \left( \frac{F}{X} \right)^{1/8} X^{7/8} \left( \frac{F}{Y} \right)^{1/8} Y^{5/8}. \]
Note that $A_2 B(0,1; 0,1) = (1/8,7/8; 1/8,5/8)$.

In our final application, we let $d_3(n)$ be the number of ways of
writing n as a product of three factors, and we define
\[ \Delta_3(x) = \sum_{n \le x} d_3(n) - x f_3(\log x), \]
where $x f_3(\log x)$ is the residue of $\zeta^3(s) x^s / s$ at s = 1. An
examination of Kolesnik's arguments in [9] shows that if $(k_1,\ell_1; k_2,\ell_2)$ is
an exponent quadruple and if
\[ \theta = \frac{2k_1 + 12\ell_1 + 10k_2 + 4\ell_2 - 5}{6(4\ell_1 + 3k_2 + \ell_2 - 1)} \]
then
                                                            (13)

Kolesnik takes $(k_1,\ell_1; k_2,\ell_2) = ABA_1B(0,1; 0,1) = (1/20,15/20;
3/20,15/20)$ to get $\theta = 43/96 = .447916 \ldots$. If we take $k_1 = k_2 = k$
and $\ell_1 = \ell_2 = \ell$ then
\[ \theta = \frac{12k + 16\ell - 5}{18k + 30\ell - 6}. \]
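The arithmetic behind these two formulas is easy to confirm. The following sketch (function names `theta3` and `theta3_diag` are illustrative) evaluates the general exponent at Kolesnik's quadruple and checks that the diagonal specialisation agrees with the general formula.

```python
from fractions import Fraction as Fr

def theta3(k1, l1, k2, l2):
    """Exponent attached to an exponent quadruple (k1, l1; k2, l2)."""
    return (2 * k1 + 12 * l1 + 10 * k2 + 4 * l2 - 5) / (6 * (4 * l1 + 3 * k2 + l2 - 1))

def theta3_diag(k, l):
    """Specialisation k1 = k2 = k, l1 = l2 = l."""
    return (12 * k + 16 * l - 5) / (18 * k + 30 * l - 6)

# Kolesnik's quadruple (1/20, 15/20; 3/20, 15/20).
kolesnik = theta3(Fr(1, 20), Fr(15, 20), Fr(3, 20), Fr(15, 20))

# Diagonal check at the exponent pair AB(0,1) = (1/6, 2/3).
k, l = Fr(1, 6), Fr(2, 3)
diag_agrees = theta3(k, l, k, l) == theta3_diag(k, l)
```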

For this $\theta$, the optimal q-sequence is

11112 22121 21211 23321 11221 11111 ... .

Our conjecture would therefore imply (13) with
\[ \theta = .44607\ 41756\ 73843\ 37652. \]

Acknowledgements.
We had the opportunity to speak on this material at the
Mathematisches Forschungsinstitut Oberwolfach, at Oklahoma State
University, and the University of Michigan. We wish to thank those
institutions for their hospitality.

References.

1. F. V. Atkinson, The mean value of the Riemann zeta-function,
Acta Math. 81 (1949), 353-376.

2. Chen Jing-Run, The lattice points in a circle, Sci. Sinica 12
(1963), 633-649.

3. S. W. Graham, The distribution of squarefree numbers, J.
London Math. Soc. (2) 24 (1981), 54-64.

4. W. Haneke, Verschärfung der Abschätzung von $\zeta(1/2 + it)$, Acta
Arith. 8 (1963), 357-430.

5. D. R. Heath-Brown, The Pjateckii-Sapiro prime number theorem,
J. Number Theory 16 (1983), 242-266.

6. F. Herzog and G. Piranian, Sets of convergence of Taylor
Series I, Duke Math. J. 16 (1949), 529-534.

7. L. K. Hua, The lattice points in a circle, Quart. J. Math.
(Oxford) 12 (1941), 193-200.

8. G. Kolesnik, Improvement of remainder term for the divisors
problem, Mat. Zametki 6 (1969), 545-554.

9. __________, On the estimation of multiple exponential sums,
Recent Progress in Analytic Number Theory, Vol. 1 (eds. H.
Halberstam and C. Hooley, Academic Press, New York, 1981),
247-256.

10. __________, On the number of abelian groups of a given
order, J. Reine Angew. Math. 329 (1981), 164-175.

11. __________, On the order of $\zeta(1/2 + it)$ and $\Delta(R)$, Pacific J.
Math. 98 (1982), 107-122.

12. S. H. Min, On the order of $\zeta(1/2 + it)$, Trans. Amer. Math.
Soc. 65 (1949), 448-472.

13. E. Phillips, The zeta-function of Riemann; further developments
of van der Corput's method, Quart. J. Math. (Oxford)
4 (1933), 209-225.

14. R. A. Rankin, Van der Corput's method and the theory of
exponent pairs, Quart. J. Math. Oxford (2) 6 (1955), 147-153.

15. H. E. Richert, Verschärfung der Abschätzung beim
Dirichletschen Teilerproblem, Math. Z. 58 (1953), 204-218.

16. P. G. Schmidt, Zur Anzahl Abelscher Gruppen gegebener Ordnung
I, Acta Arith. 13 (1968), 405-417.

17. B. R. Srinivasan, The lattice point problem in many
dimensional hyperboloids, III, Math. Ann. 160 (1965), 280-311.

18. E. C. Titchmarsh, The lattice points in a circle, Proc.
London Math. Soc. (2) 38 (1934), 96-155; see also
"Corrigendum", op. cit. 55 (1935).

19. __________, On the order of $\zeta(1/2 + it)$, Quart. J. Math.
(Oxford) 13 (1942), 11-17.

20. __________, The theory of the Riemann zeta-function, Clarendon
Press, Oxford, 1951.

S. W. Graham
Michigan Technological University
Houghton, Michigan 49931 USA

G. Kolesnik
California State University - Los Angeles
Los Angeles, CA 90032 USA
NON-VANISHING OF CERTAIN VALUES OF L-FUNCTIONS*

Ralph Greenberg

1. Let K be an imaginary quadratic field. The L-functions that we
will consider are defined by
\[ L(\chi,s) = \sum_{\mathfrak{a}} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^s} \]
where the sum is over the nonzero ideals of the ring of integers $O_K$
of K. Here $\chi$ is a grossencharacter of K of type $A_0$. That is, $\chi$ is
a complex-valued multiplicative function on the ideals of $O_K$ such
that $\chi((\alpha)) = \alpha^n \bar{\alpha}^m$ for all $\alpha \in O_K$, $\alpha \equiv 1 \pmod{\mathfrak{f}_\chi}$, where $n, m \in \mathbf{Z}$
and $\mathfrak{f}_\chi$ is an ideal of $O_K$ (the conductor of $\chi$). We call (n,m) the
infinity type of $\chi$. The above series defines an analytic function
for Re(s) sufficiently large which can be analytically continued to
the entire complex plane and satisfies a functional equation. By
translating s or applying complex conjugation, we can clearly assume
that $\chi$ has infinity type (n,0) with $n = n_\chi \ge 0$, as we will from here
on. The functional equation is then as follows. Let
\[ \Lambda(\chi,s) = A^{-s}\, \Gamma(s)\, L(\chi,s) \]
where $A = 2\pi/\sqrt{N_\chi}$ and $N_\chi = |\mathrm{disc}(K)|\, N(\mathfrak{f}_\chi)$. Then
\[ \Lambda(\chi, n+1-s) = W_\chi\, \Lambda(\bar{\chi}, s). \]
Here the root number $W_\chi$ is a complex number of absolute value 1
which can be computed in terms of Gauss sums. Now

*Supported in part by a National Science Foundation grant.



\[ L(\bar{\chi},s) = \sum_{\mathfrak{a}} \frac{\bar{\chi} \circ c(\mathfrak{a})}{N(\mathfrak{a})^s} = L(\bar{\chi} \circ c, s), \]
since complex conjugation simply permutes the ideals of $O_K$. Here c
denotes complex conjugation (in Gal(K/Q)). The above functional
equation becomes
\[ \Lambda(\chi, n+1-s) = W_\chi\, \Lambda(\bar{\chi} \circ c, s). \]
Note that $\bar{\chi} \circ c$ and $\chi$ have the same infinity type. If $\bar{\chi} \circ c = \chi$, then
clearly $W_\chi = \pm 1$. In the case $W_\chi = -1$, the functional equation then
implies that $L(\chi, \frac{n+1}{2}) = 0$. We will assume from now on that n is
odd so that the point of symmetry $s = \frac{n+1}{2}$ in the functional equation
is an integer. If $\bar{\chi} \circ c = \chi$, $W_\chi = -1$, and n is odd, then we will
call the zero of $L(\chi,s)$ at s = (n+1)/2 a "trivial critical zero".
The following theorem concerns the cases where either $\bar{\chi} \circ c = \chi$ and
$W_\chi = +1$ or $\bar{\chi} \circ c \ne \chi$. It is proved in [2].

Theorem 1. Let B > 0. Excluding the trivial critical zeros,
$L(\chi, \frac{n+1}{2})$ vanishes for only finitely many grossencharacters $\chi$ such
that $N_\chi < B$.

As an example, consider an elliptic curve E defined over Q and
with complex multiplication by $O_K$. The Hasse-Weil L-function for E
over Q turns out to be $L(\psi,s)$ for a certain grossencharacter $\psi = \psi_E$
for K (proved by Deuring). The infinity type of $\psi$ is (1,0). The
assumption that E is defined over Q is equivalent to the equality
$\bar{\psi} \circ c = \psi$. (See [3].) The grossencharacters $\chi = \psi^{2k+1}$ for $k \ge 0$ have
infinity type (2k+1,0) and clearly satisfy $\bar{\chi} \circ c = \chi$. Also $N_\chi$ is
bounded (by $N_\psi$). It is not hard to compute $W_\chi$ (see [1]). If
$K \ne \mathbf{Q}(\sqrt{-1})$ or $\mathbf{Q}(\sqrt{-3})$, then $W_\chi = (-1)^k W_\psi$. Thus $L(\psi^{2k+1},s)$ has a
trivial critical zero at s = k+1 for half of the k's. According to
the above theorem, for the remaining k's, only finitely many of the
values $L(\psi^{2k+1},k+1)$ are zero. (Actually this special case of the
theorem was proved earlier, in [1].) There can in fact be zeros
among these remaining values. If the Mordell-Weil group E(Q) is
infinite and of even rank, then the Birch and Swinnerton-Dyer
conjecture would imply that $L(\psi,s)$ has an even order zero at s = 1
(so that $W_\psi = +1$) and this of course is true for many elliptic
curves E. Also, Nelson Stephens has found a number of examples
where $L(\psi^{2k+1},s)$ vanishes to even order at s = k+1 for small values
of k > 0.
Rohrlich has proved other non-vanishing results, which we
combine in the following theorem. Here $\psi = \psi_E$ is the
grossencharacter attached to an elliptic curve E as above.

Theorem 2. (Rohrlich) Let S be a finite set of primes. Let $\phi$ vary
over all Hecke characters of finite order for K such that $N(\mathfrak{f}_\phi)$ is
divisible only by primes in S and either (i) $\phi \circ c = \phi^{-1}$ and $W_{\psi\phi} = +1$
or (ii) $\phi \circ c = \phi$. Then $L(\psi\phi,1)$ vanishes for only finitely many
such $\phi$'s. If (iii) $\phi \circ c = \phi^{-1}$ and $W_{\psi\phi} = -1$ and if the conductor
of $\phi$ is restricted as above, then $L'(\psi\phi,1)$ vanishes for only
finitely many such $\phi$'s.

Cases (i) and (iii) in this theorem are proved in [6]. Note
that if $\chi = \psi\phi$ where $\phi$ is of finite order and satisfies $\phi \circ c = \phi^{-1}$
then $\bar{\chi} \circ c = (\bar{\psi} \circ c)(\phi^{-1} \circ c) = \psi\phi = \chi$. The infinity type of $\chi$ is
(1,0). One intriguing connection between the proof of Theorem 1 in
[2] and Rohrlich's arguments in [6] is that we both use Roth's
theorem on approximating algebraic numbers by rational numbers in a
crucial way. Although in [2] we use the archimedean version and in
[6] Rohrlich uses the nonarchimedean version, there is a certain
similarity to how Roth's theorem comes into the arguments which we
will explain later. Case (ii) of the above theorem is proved in
[7]. Actually Rohrlich considers the more general L-functions
attached to the twists by $\phi$ of the L-series for modular forms of
weight 2. If $\chi = \psi\phi$ and $\phi \circ c = \phi$, then $\bar{\chi} \circ c = \psi\phi^{-1}$. Except for the
finitely many $\phi$'s with $\phi = \phi^{-1}$ (and conductor restricted as above),
we have $\bar{\chi} \circ c \ne \chi$. Theorems 1 and 2 would obviously be consequences
of the following conjecture.

Conjecture 1. Let S be a finite set of primes. Let $\chi$ vary over all
grossencharacters of K such that $N(\mathfrak{f}_\chi)$ is divisible only by primes
in S (and of infinity type (n,0) with n positive, odd, but not
fixed). Excluding the trivial critical zeros, $L(\chi, \frac{n+1}{2})$ is
nonzero with at most finitely many exceptions. The trivial critical
zeros are simple with at most finitely many exceptions.

One could also consider the following more general questions.
Let f be a cusp form of weight k which is an eigenform for the Hecke
operators and a new form of level $N_f$. The corresponding L-function
satisfies the functional equation

where
\[ \Lambda(f,s) = A^{-s}\, \Gamma(s)\, L(f,s), \qquad A = 2\pi/\sqrt{N_f}. \]
Here $\bar{f}$ is obtained by applying complex conjugation to the
coefficients in the q-expansion of f. If $\bar{f} = f$ and $W_f = -1$, then clearly
L(f,k/2) = 0. If $N_f$ is divisible only by primes in some finite set
S but k is not restricted (except perhaps to be even), will these
zeros forced by the functional equation account for all but finitely
many of the values L(f,k/2) which vanish? Will the zeros forced by
the functional equation be simple with at most finitely many
exceptions? The L-function $L(\chi,s)$ attached to a grossencharacter $\chi$
of K corresponds to a modular form of weight $n_\chi + 1$ and level $N_\chi$.
The above condition on $N_\chi$ would limit K to finitely many imaginary
quadratic fields and would limit $N(\mathfrak{f}_\chi)$ to be divisible only by
primes in S.

2. We now want to discuss the connection of the above nonvanishing
results to the arithmetic of elliptic curves. As before, let E be
an elliptic curve defined over Q and with complex multiplication by
$O_K$. If p is any prime, we will consider towers of fields
$K = K_0 \subset K_1 \subset \cdots \subset K_n \subset \cdots$ with $K_n$ a cyclic extension of K of
degree $p^n$. The field $K_\infty = \bigcup_{n \ge 0} K_n$ is then a Galois extension of K
with $\mathrm{Gal}(K_\infty/K) = \varprojlim (\mathbf{Z}/p^n\mathbf{Z}) \cong \mathbf{Z}_p$, the additive group of p-adic
integers. $K_\infty$ is a so-called $\mathbf{Z}_p$-extension of K. The question of how
the rank of $E(K_n)$ behaves as $n \to \infty$ (and related questions) was first
discussed by Mazur (see [5]).

Now the Birch and Swinnerton-Dyer conjecture states that, if F
is any number field, then the rank of E(F) should equal the order of
vanishing of the Hasse-Weil L-function $L_F(E,s)$ for E over F at
s=1. If F is abelian over K, we have the following essentially
formal identity:
\[ L_F(E,s) = \prod_{\phi} L(\psi\phi, s)^2. \]
Here $\phi$ runs over the characters of Gal(F/K) (which can be identified
with Hecke characters of finite order for K by class field
theory). Also $\psi = \psi_E$ as before. Note that the fact that $L_F(E,s)$
has even order at s=1 agrees with the fact that E(F) is an $O_K$-module
and $\mathrm{rank}_{\mathbf{Z}}(E(F)) = 2\, \mathrm{rank}_{O_K}(E(F))$. Conjecturally, the behavior of
the rank (over Z) of $E(K_n)$ as $n \to \infty$ should be related to the
vanishing of $L(\psi\phi,s)$ at s=1 as $\phi$ varies over the characters
of $\mathrm{Gal}(K_\infty/K)$ of finite order (each of which factors through
$\mathrm{Gal}(K_n/K)$ for some n). It is easy to show that only primes of K
dividing p can ramify in a $\mathbf{Z}_p$-extension $K_\infty/K$. Hence the conductor
of the grossencharacter $\chi = \psi\phi$ will be divisible only by primes in
some finite set.
We will single out two special $\mathbf{Z}_p$-extensions $K_\infty^+$ and $K_\infty^-$ of K.
Both are Galois extensions of Q. The element c in Gal(K/Q) acts (as
an inner automorphism) on $\mathrm{Gal}(K_\infty^+/K)$ trivially and on $\mathrm{Gal}(K_\infty^-/K)$ by
multiplication by -1. Thus the n-th level $K_n^+$ of $K_\infty^+$ is abelian over
Q of degree $2p^n$. The n-th level $K_n^-$ of $K_\infty^-$ is a dihedral extension of
Q, also of degree $2p^n$. If $\phi$ is a character of $\mathrm{Gal}(K_n^+/K)$ or
$\mathrm{Gal}(K_n^-/K)$, then (identifying $\phi$ with a Hecke character for K) one
finds that $\phi \circ c = \phi$ or $\phi^{-1}$, respectively. The existence of these
$\mathbf{Z}_p$-extensions can be proven by class field theory. Actually $K_\infty^+$ is
easily described explicitly. It is a subfield of $K(\mu_{p^\infty})$ where
$\mu_{p^\infty}$ denotes the p-power roots of unity, and is called the cyclotomic
$\mathbf{Z}_p$-extension of K for that reason. $K_\infty^-$ is often called the
anticyclotomic $\mathbf{Z}_p$-extension of K. It could also be described
explicitly as a subfield of the field obtained by adjoining certain
values of the j-function to K. By class field theory, one can show
that every $\mathbf{Z}_p$-extension of K is contained in $\tilde{K} = K_\infty^+ K_\infty^-$.
Also $\mathrm{Gal}(\tilde{K}/K) \cong \mathbf{Z}_p^2$ and so obviously K has infinitely many distinct
$\mathbf{Z}_p$-extensions.

Consider first the anti-cyclotomic $\mathbf{Z}_p$-extension. If $\phi$ is a
character of $\mathrm{Gal}(K_\infty^-/K)$, then $\chi = \psi\phi$ satisfies $\bar{\chi} \circ c = \chi$. The root
numbers $W_\chi$ behave as follows (see [1]). We assume E has good
reduction at p. If p splits in K, then $W_{\psi\phi} = W_\psi$. In particular,
if $W_\psi = +1$ (i.e. if $L_{\mathbf{Q}}(E,s)$ has an even order zero), then Rohrlich's
theorem implies that $L(\psi\phi,1) \ne 0$ for all but finitely many such
$\phi$'s. Rubin's generalization of the Coates-Wiles theorem then shows
that the rank of $E(K_n^-)$ becomes constant for n sufficiently large.
If $W_\psi = -1$, then $L(\psi\phi,1) = 0$ for all $\phi$. If p remains prime in K,
then $W_{\psi\phi} = \pm W_\psi$ and both signs occur depending just on whether the
order of $\phi$ is an even or odd power of p. Thus $L(\psi\phi,1) = 0$ for
infinitely many $\phi$'s. But Rohrlich proves that these zeros are
mostly simple. This result together with a recent theorem of Gross
and Zagier (which connects the heights of certain "Heegner points"
on E with the values $L'(\chi,1)$) shows that $\mathrm{rank}(E(K_n^-)) \to \infty$ as
$n \to \infty$ if either p splits in K and $W_\psi = -1$ or if p remains prime in
K. In the first case, $\mathrm{rank}(E(K_n^-)) \ge 2p^n - e$ for all n, where e is
some constant. The Birch and Swinnerton-Dyer conjecture would imply
the more precise statement that $\mathrm{rank}(E(K_n^-)) - 2p^n$ becomes constant
for $n \gg 0$. In the case where p remains prime in K, the growth
of $E(K_n^-)$ is less regular. For $n \gg 0$, the rank of $E(K_n^-)$ increases
only for either the even or odd n's. We still have an inequality
$\mathrm{rank}(E(K_n^-)) \ge a p^n$ for $n \gg 0$, where a is some positive constant.
If $\phi$ factors through $\mathrm{Gal}(K_\infty^+/K)$, then for $\chi = \psi\phi$, we have
$\bar{\chi} \circ c \ne \chi$ (except if $\phi$ has order 2). Again Rohrlich's result
together with Rubin's theorem implies that $\mathrm{rank}(E(K_n^+))$ becomes
constant for large enough n. More generally, consider any
$\mathbf{Z}_p$-extension $K_\infty$ of K other than the anti-cyclotomic one. If $\phi$ is a
character of $\mathrm{Gal}(K_\infty/K)$, then one sees easily that $\phi \circ c \ne \phi^{-1}$ except
possibly for finitely many such $\phi$'s. Again, for $\chi = \psi\phi$, we will
usually have $\bar{\chi} \circ c \ne \chi$. The argument given in [7] can be adapted
(with some difficulties) to prove the following result (suggested by
the conjecture stated in Section 1).

Theorem 3. Let $K_\infty = \bigcup K_n$ be any $\mathbf{Z}_p$-extension of K, $K_\infty \ne K_\infty^-$. Then
$\mathrm{rank}(E(K_n))$ is bounded as $n \to \infty$.

A stronger result should be true. Conjecture 1 actually would
imply the following conjecture. Let F be any finite abelian
extension of K and let $F_\infty = F\tilde{K}$. Let $F_\infty^*$ be the largest subfield
of $F_\infty$ such that the characters $\phi$ of $\mathrm{Gal}(F_\infty^*/K)$ of finite order all
have the property that $\phi \circ c = \phi^{-1}$. The field $F_\infty^*$ is a finite
extension of $K_\infty^-$. If $F_\infty = \tilde{K}$ and if p is odd, the field $F_\infty^*$ is $K_\infty^-$.
For any field L, we let $\bar{E}(L) = E(L)/E(L)_{\mathrm{tors}}$.

It is tempting to speculate in a somewhat different direction.
Let $\mathcal{F}$ be a Galois extension of Q such that $G = \mathrm{Gal}(\mathcal{F}/\mathbf{Q}) \cong GL_2(\mathbf{Z}_p)$
for some prime p. Let E be any elliptic curve defined over Q.
Assume that Weil's conjecture is valid for E. That is, $L_{\mathbf{Q}}(E,s)
= L(f,s)$, where f is a modular form of weight 2. Let F be any
finite Galois extension of Q contained in $\mathcal{F}$. The Hasse-Weil
L-function $L_F(E,s)$ is formally a product of L-functions $L(f,\sigma,s)$,
where $\sigma$ is an irreducible character of Gal(F/Q). Each L-function
occurs $d_\sigma$ times in this product, where $d_\sigma$ is the degree of the
character. The function $L(f,\sigma,s)$ is defined (for Re(s) > 3/2) by
an Euler product whose factors are (mostly) of degree $2d_\sigma$ and which
are easily described from the Euler factors for L(f,s) and those for
the Artin L-function $L(\sigma,s)$. The properties of these L-functions
don't seem to be known in general, but it seems reasonable to
believe that they have analytic continuations with a functional
equation relating $L(f,\bar{\sigma},2-s)$ to $L(f,\sigma,s)$. (The modular form f here
would satisfy $\bar{f} = f$.) If $\bar{\sigma} = \sigma$ and if the root number $W_{f,\sigma}$
occurring in the functional equation is -1, then $L(f,\sigma,1)$ would be
forced to vanish. If $\bar{\sigma} \ne \sigma$, one might believe that $L(f,\sigma,1)$ should
be nonzero with at most finitely many exceptions as $\sigma$ varies over
all such irreducible characters of G. (Perhaps we should assume here
that only finitely many primes of Q are ramified in $\mathcal{F}$.)
Now it is easy to show that in the group $G^* = PGL_2(\mathbf{Z}_p) = G/\mathbf{Z}_p^\times$,
every element is conjugate to its inverse. Thus every character of
$G^*$ is real-valued. Also every real-valued irreducible character of
G factors through $G^{**} = G/(\mathbf{Z}_p^\times)^2$. Let $\mathcal{F}^*$ and $\mathcal{F}^{**}$ denote the
corresponding subfields of $\mathcal{F}$. Thus $\mathrm{Gal}(\mathcal{F}^*/\mathbf{Q}) \cong PGL_2(\mathbf{Z}_p)$ and $\mathcal{F}^{**}$ is
a finite (quadratic if $p \ne 2$) extension of $\mathcal{F}^*$. In analogy with
Conjecture 2, it may be reasonable to believe that $E(\mathcal{F})/E(\mathcal{F}^{**})$ is
finitely generated in general. Also, under certain assumptions,
M. Harris [4] has shown that an elliptic curve can have unbounded
rank in a $PGL_2(\mathbf{Z}_p)$-extension of some number field. A calculation of
what the root numbers $W_{f,\sigma}$ should be would give some idea of what to
expect in general. Such calculations can be done if the elliptic
curve E has good reduction at all primes ramified in $\mathcal{F}^*/\mathbf{Q}$. Assume
p > 2. There is a unique character $\varepsilon : G^* \to \pm 1$. Let $N_E$ denote the
conductor of E. If $\varepsilon(-N_E) = +1$, then all but finitely many of the
$W_{f,\sigma}$'s turn out to be +1 when $\sigma$ factors through $G^*$. Possibly E has
bounded rank in $\mathcal{F}^*$ (and even $\mathcal{F}$) in this case. If $\varepsilon(-N_E) = -1$, then
infinitely many of the $W_{f,\sigma}$'s are -1 (namely for those $\sigma$'s with
$\varepsilon$ as corresponding determinant). This suggests that the rank of E
should be unbounded in $\mathcal{F}^*$. There is a canonical tower of fields
$F_n^*$, $n \ge 1$, with $\mathrm{Gal}(F_n^*/\mathbf{Q}) \cong PGL_2(\mathbf{Z}/(p^n))$ such that $\mathcal{F}^* = \bigcup F_n^*$. We
have $[F_n^* : \mathbf{Q}] \sim c\,(p^n)^3$ for some constant c. If $\varepsilon(-N_E) = -1$, it
seems that $\mathrm{rank}(E(F_n^*))$ should be $\ge a(p^n)^2$ for some a > 0 when
$n \gg 0$. This rate of growth is the most one could find by just root
number calculations. A higher rate of growth would indicate that
many of the L-functions $L(f,\sigma,s)$ have high order zeros at s = 1.

Now let E be an elliptic curve without complex multiplication
and let $\mathcal{F}$ be the field generated by the coordinates of the p-power
division points on E. For all but finitely many p, we will have
$\mathrm{Gal}(\mathcal{F}/\mathbf{Q}) \cong GL_2(\mathbf{Z}_p)$. It is this case that seems closest to the
situation described earlier in this section. Although we haven't
calculated root numbers, we suspect that $W_{f,\sigma} = -1$ for infinitely
many characters $\sigma$ factoring through $G^*$ and hence that E has
unbounded rank in $\mathcal{F}^*$.

3. We want to say something about the proofs of Theorems 1 and 2.


Since they are already in print, we will be very sketchy. Mainly,
we will try to explain a certain similarity in how Roth's theorem
occurs in the arguments.

We will simplify our discussion of Theorem 1 by restricting
attention to the values $L(\psi^{2k+1}, k+1)$ for $k \ge 0$, where $\psi$ is the
grossencharacter for an elliptic curve E as in Section 1. The root
numbers $W_k = W(\psi^{2k+1})$ turn out to depend only on the residue class
of k modulo m, where m is the number of roots of unity in K. Let m'
be any multiple of m and let k' be a fixed integer such that
$W_{k'} = +1$. The essential part of our proof is to show that the Abel
average of the L-values over all $k \equiv k' \pmod{m'}$ is nonzero and so
infinitely many of these L-values are also nonzero. One improves
this to the statement that only finitely many of them vanish by
using the fact that these L-values are
(up to a factor) certain special values of p-adic L-functions
constructed by Katz. This role of p-adic L-functions in our
argument is the reason our result is limited to grossencharacters
with $n_\chi$ odd.

One can derive a convergent series for the L-values considered
here by using the same integral representation for $L(\psi^{2k+1}, s)$ which
gives the analytic continuation and functional equation. The
integrals can be evaluated when s = k + 1. The result is that
$L(\psi^{2k+1}, k+1) = (1 + W_k)\,G_k$, where

\[
G_k = \sum_{\mathfrak a} \frac{\psi^{2k+1}(\mathfrak a)}{N(\mathfrak a)^{k+1}}\, e^{-A N(\mathfrak a)} \sum_{j=0}^{k} \frac{(A N(\mathfrak a))^j}{j!}
    = \sum_{\mathfrak a} \varphi_0(\mathfrak a)^k\, \frac{\psi(\mathfrak a)}{N(\mathfrak a)}\, e^{-A N(\mathfrak a)} \sum_{j=0}^{k} \frac{(A N(\mathfrak a))^j}{j!} .
\]

Here $\varphi_0(\mathfrak a) = \psi(\mathfrak a)/\overline{\psi(\mathfrak a)}$ and A is the same constant which appears in
the functional equation. (A small difficulty occurs if $K = \mathbf{Q}(\sqrt{-3})$.
Then A might vary slightly with k and also the second series above
will be different. We assume here A is constant.) The Abel average

\[ \lim_{x \to 1} (1 - x) \sum_{k=0}^{\infty} G_k x^k \]

can be evaluated. The terms in $\sum_{\mathfrak a}$ for which $\varphi_0(\mathfrak a) = 1$ (or
equivalently $\mathfrak a = \bar{\mathfrak a}$) give a contribution of $\sum_{\mathfrak a = \bar{\mathfrak a}} \psi(\mathfrak a)/N(\mathfrak a)$ to this Abel
average. The conductor of $\psi$ is divisible by the ramified primes of
K and so one need consider only the ideals $\mathfrak a = (a)$, where $a \in \mathbf{Z}$.
Now $\psi((a)) = \xi(a)\,a$ for some Dirichlet character $\xi$. (It turns out
that $\xi$ is equivalent to the Dirichlet character for K, although
usually nonprimitive.) Thus the above sum is just $L(\xi, 1)$ and is
certainly nonzero.
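
As a small numerical illustration of the Abel average used here, the following Python sketch evaluates (1 - x) times the sum of g_k x^k for x close to 1. The helper name `abel_average`, the truncation length, and the two sample sequences (a constant sequence and the powers of a 4th root of unity) are assumptions chosen for illustration only; they are not the sequence G_k of the text.

```python
import cmath


def abel_average(term, x, n_terms=50_000):
    """Return (1 - x) * sum_{k < n_terms} term(k) * x**k for 0 < x < 1.

    The truncation is harmless when x**n_terms is negligible.
    """
    total = 0j
    power = 1.0
    for k in range(n_terms):
        total += term(k) * power
        power *= x
    return (1 - x) * total


# A 4th root of unity other than 1 (illustrative choice).
zeta = cmath.exp(2j * cmath.pi / 4)

x = 0.999
const_avg = abel_average(lambda k: 1.0, x)
# Reduce the exponent mod 4 so the powers stay numerically exact.
twisted_avg = abel_average(lambda k: zeta ** (k % 4), x)
```

For the constant sequence the value is close to 1, while the powers of a nontrivial root of unity average to nearly 0 as x approaches 1. This is in the same spirit as the vanishing of the twisted averages discussed in the text.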

The terms in $\sum_{\mathfrak a}$ for which $\varphi_0(\mathfrak a) \ne 1$ give a contribution of zero
to the Abel average. Also the Abel average of the sequence $G_k\zeta^k$
(where $\zeta$ is any m'-th root of unity, $\zeta \ne 1$) is zero. These facts
immediately give the result stated earlier about the Abel average of
our L-values over $k \equiv k' \pmod{m'}$. The estimates that are involved
here are the most troublesome for those terms where $\theta = \varphi_0(\mathfrak a)$ (or
$\theta = \varphi_0(\mathfrak a)\zeta$) is close to 1. One needs to show that $N(\mathfrak a)$ increases
rapidly for those terms. Now $\theta = \lambda/\bar\lambda$, where $\lambda = \psi(\mathfrak a)$ (or $\psi(\mathfrak a)\omega$
for some root of unity $\omega$). The $\lambda$'s which occur here belong to one
of finitely many lattices L in the complex plane consisting of
algebraic numbers. The most delicate estimates are needed for the
terms where $\mathrm{Im}(\lambda)$ is small. Let $L = \mathbf{Z}\omega_1 + \mathbf{Z}\omega_2$. If $\lambda = a\omega_1 + b\omega_2$ is
close to the real axis (and, say $b \ne 0$), then a/b is a good
rational approximation to the algebraic number $\mathrm{Im}(-\omega_2/\omega_1)$. Roth's
theorem enters at this point in order to show a or b and so
$\lambda\bar\lambda = N(\mathfrak a)$ is large. One in fact needs Roth's theorem with an
exponent $2 + \varepsilon$ for a rather small value of $\varepsilon$.
The values $L(\psi^{2k+1}, k+1)$ that we have considered can be written
as $L(\psi\varphi_0^k, 1)$. The grossencharacter $\varphi = \varphi_0^k$ satisfies $\varphi \circ c = \varphi^{-1}$,
although of course $\varphi$ is not of finite order if k > 0. Its infinity
type is (k,-k). In Rohrlich's theorem the analogous case is (1) and
it is this case (and also case (3)) where Roth's theorem (the
nonarchimedean version) plays a role. We will just consider case
(1) and will assume that S = {p}, where p is an odd prime. For
simplicity, we will restrict attention to Hecke characters $\varphi$ such
that the field $K_\varphi$ which corresponds to $\varphi$ by class field theory is a
subfield of the field $K_\infty$ defined in Section 2. The condition
$\varphi \circ c = \varphi^{-1}$ means that $K_\varphi = K_n^-$ for some n. Obviously the order
of $\varphi$ (denoted by $\mathrm{ord}(\varphi)$) is $p^n$. Let $\Gamma = \mathrm{Gal}(K_\infty/K)$. Then $\Gamma \cong \mathbf{Z}_p^2$
and c acts naturally on $\Gamma$ (as an inner automorphism in $\mathrm{Gal}(K_\infty/\mathbf{Q})$).
This gives us a decomposition $\Gamma = \Gamma^+ \times \Gamma^-$, where $\Gamma^+$ and $\Gamma^-$ can be
identified with $\mathrm{Gal}(K_\infty^+/K)$ and $\mathrm{Gal}(K_\infty^-/K)$, respectively.

Rohrlich also uses an averaging argument.
$\mathrm{Gal}(K(\text{values of } \varphi)/K)$ acts on a character $\varphi$, giving a set of
conjugate characters $\varphi_i$, $1 \le i \le e_\varphi$, say. Denote by $L(\psi\varphi_{av}, 1)$ the
Galois average

\[ L(\psi\varphi_{av}, 1) = e_\varphi^{-1} \sum_{i=1}^{e_\varphi} L(\psi\varphi_i, 1). \]

Rohrlich shows that $\lim L(\psi\varphi_{av}, 1)$ is nonzero as $\mathrm{ord}(\varphi) \to \infty$ and $\varphi$
varies as restricted above and such that $W(\psi\varphi) = +1$. Now $\psi$ has its
values in K and so the grossencharacters $\chi_i = \psi\varphi_i$ are all
conjugate. The root numbers $W(\chi_i)$ are all +1 and a theorem of
Shimura shows that either all or none of the values $L(\chi_i, 1)$ are
zero. Hence $L(\psi\varphi, 1)$ is nonzero if $\mathrm{ord}(\varphi)$ is sufficiently large and
$W(\psi\varphi) = +1$.

We have the following convergent series for L(1/I<j>,I)

(1 + w( 1/I<j»)

Here A<j> = 2rr/~ , which is unchanged when <j> is replaced by any of


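the $\varphi_i$'s. One difficulty in handling these series is that $A_\varphi \to 0$
as $\mathrm{ord}(\varphi) \to \infty$. We will assume that $p \nmid N(\mathfrak f_\psi)$ so that $\psi\varphi(\mathfrak a) = \psi(\mathfrak a)\varphi(\mathfrak a)$
for all ideals $\mathfrak a$. Then we can replace $\varphi$ by
$\varphi_{av} = e_\varphi^{-1} \sum \varphi_i$ in the above series, giving a convergent series for
$L(\psi\varphi_{av}, 1)$. When is $\varphi_{av}(\mathfrak a) \ne 0$? If $\omega$ is a $p^n$-th root of unity,
then the sum of the conjugates of $\omega$ will be zero unless $\omega^p = 1$.
If $\mathrm{ord}(\varphi) = p^n$, then we can regard $\varphi$ as a character of $\mathrm{Gal}(K_n^-/K)$.
Now $\varphi(\mathfrak a) = \varphi\bigl(\bigl(\frac{K_n^-/K}{\mathfrak a}\bigr)\bigr)$, and so $\varphi_{av}(\mathfrak a) \ne 0$ implies that the Artin
symbol $\bigl(\frac{K_n^-/K}{\mathfrak a}\bigr)$ has order 1 or p. It must then fix $K_{n-1}^-$ and so
$\bigl(\frac{K_{n-1}^-/K}{\mathfrak a}\bigr)$ must be trivial. As $\mathrm{ord}(\varphi) \to \infty$, the terms that survive in
the series for $L(\psi\varphi_{av}, 1)$ are those for which $\bigl(\frac{K_\infty^-/K}{\mathfrak a}\bigr)$ is trivial, i.e.
those that correspond to ideals $\mathfrak a$ such that $\mathfrak a = \bar{\mathfrak a}$. The contribution
of those terms to the limit in question is

\[ 2 \sum_{\mathfrak a = \bar{\mathfrak a},\ p \nmid \mathfrak a} \frac{\psi(\mathfrak a)}{N(\mathfrak a)}, \]

nonzero as before.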
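
The vanishing of the sum of conjugates of a $p^n$-th root of unity can be checked numerically. The sketch below is only an illustration under a simplifying assumption: it takes conjugates over Q (that is, it sums all primitive $p^k$-th roots of unity), whereas the text works with conjugates over a field containing K. The function name `conjugate_sum` and the choice p = 3 are hypothetical.

```python
import cmath
from math import gcd


def conjugate_sum(p, k):
    """Sum of the primitive p**k-th roots of unity.

    These are the Galois conjugates over Q of exp(2*pi*i / p**k).
    """
    n = p ** k
    return sum(
        cmath.exp(2j * cmath.pi * j / n)
        for j in range(1, n + 1)
        if gcd(j, n) == 1
    )


# Orders 1, 3, 9, 27: only the first two satisfy omega**3 == 1.
sums = {k: conjugate_sum(3, k) for k in range(4)}
```

The sums for k = 0 and k = 1 are 1 and -1, while those for k = 2 and k = 3 vanish up to rounding, in line with the statement that the sum is zero unless $\omega^p = 1$.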

Let $\sigma_{\mathfrak a} = \bigl(\frac{K_\infty/K}{\mathfrak a}\bigr)$. The condition $\mathfrak a = \bar{\mathfrak a}$ means that $\sigma_{\mathfrak a} \in \Gamma^+$. For
a given $\varphi$ such that $\mathrm{ord}(\varphi) = p^n$, the remaining nontrivial terms in
the series for $L(\psi\varphi_{av}, 1)$ are those for which $\mathfrak a \ne \bar{\mathfrak a}$ and
$\sigma_{\mathfrak a}|_{K_\infty^-} = \mathrm{proj}_{\Gamma^-}(\sigma_{\mathfrak a})$ is in $(\Gamma^-)^{p^{n-1}}$. If $\mathfrak a = (\alpha)$, we can translate
this into a statement about $\alpha$. Class field theory gives a canonical
isomorphism $U/U_{\mathrm{torsion}} \cong \Gamma$, where U is the group of units in
$O_p = O_K \otimes_{\mathbf{Z}} \mathbf{Z}_p$. (This ring is either the integers in the p-adic
completion of K or the direct product of two copies of $\mathbf{Z}_p$, depending
on whether p remains prime or splits in K.) The statement that
$\mathrm{proj}_{\Gamma^-}(\sigma_{\mathfrak a})$ is in a small subgroup of $\Gamma^-$ becomes equivalent to
stating that $\alpha/\bar\alpha$ is close to some element $\zeta$ of the finite group
$U_{\mathrm{torsion}}$. In fact, $\zeta$ must be a global root of unity. One can write
each $\zeta$ as $\zeta = \bar\omega/\omega$ where $\omega$ is the image of some algebraic number in
$O_p$. Thus $\lambda = \alpha\omega$ belongs to one of finitely many "lattices" $L = \omega O_K$
in $O_p$ consisting of algebraic numbers and $\lambda$ has the property that
$\lambda/\bar\lambda$ is close to 1, that is, $\lambda - \bar\lambda$ is small. As before, but this
time using a p-adic version of the theorem of Roth, one finds
that $\lambda\bar\lambda$ (in $\mathbf{R}$ here) and so $N(\mathfrak a) = N(\alpha)$ is large. In this way,
Rohrlich shows that the terms in the convergent series giving
$L(\psi\varphi_{av}, 1)$ for which $\mathfrak a \ne \bar{\mathfrak a}$ contribute zero to the limit.

References.

1. R. Greenberg, On the Birch and Swinnerton-Dyer conjecture. Invent. Math. 72, 241-265 (1982).

2. R. Greenberg, On the critical values of Hecke L-functions for imaginary quadratic fields. Invent. Math. 79, 79-94 (1985).

3. B. Gross, Arithmetic on Elliptic Curves with Complex Multiplication. Lecture Notes in Math. 776.

4. M. Harris, Systematic growth of Mordell-Weil groups of abelian varieties in towers of number fields. Invent. Math. 51, 123-141 (1979).

5. B. Mazur, Rational points of abelian varieties with values in towers of number fields. Invent. Math. 18, 183-226 (1972).

6. D. Rohrlich, On L-functions of elliptic curves and anticyclotomic towers. Invent. Math. 75, 383-408 (1984).

7. D. Rohrlich, On L-functions of elliptic curves and cyclotomic towers. Invent. Math. 75, 409-423 (1984).

R. Greenberg,
University of Washington,
Seattle, Washington 98195, U.S.A.
ON AVERAGES OF EXPONENTIAL SUMS OVER PRIMES

Glyn Harman

1. Introduction.

In this paper we shall be concerned with obtaining approximations
to and estimates for the sum

\[ S_N(\alpha) = \sum_{n \le N} e(n\alpha)\,\Lambda(n) \tag{1} \]

where $e(x) = \exp(2\pi i x)$, $\alpha$ is real, and $\Lambda(n)$ is the von Mangoldt
function. Although we are unable to establish the naturally
conjectured results for this sum, we shall show how the introduction
of averaging, in a form likely to occur in applications, can lead
to substantial improvements.
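
To make the definition (1) concrete, here is a minimal Python sketch that tabulates the von Mangoldt function with a sieve and evaluates the exponential sum directly. The function names `von_mangoldt_table` and `S` are illustrative assumptions; direct summation like this is only practical for small N and says nothing about the estimates discussed in the paper.

```python
import cmath
import math


def von_mangoldt_table(N):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        # p is prime: mark its multiples and record log p at prime powers.
        for multiple in range(2 * p, N + 1, p):
            is_composite[multiple] = True
        log_p = math.log(p)
        power = p
        while power <= N:
            lam[power] = log_p
            power *= p
    return lam


def S(N, alpha):
    """Direct evaluation of sum_{n <= N} Lambda(n) * e(n * alpha)."""
    lam = von_mangoldt_table(N)
    return sum(
        lam[n] * cmath.exp(2j * cmath.pi * n * alpha)
        for n in range(1, N + 1)
    )
```

At $\alpha = 0$ the sum reduces to the Chebyshev function, so `S(10, 0.0)` equals log 2520 = log lcm(1, ..., 10), which gives a quick sanity check.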

To analyse the behaviour of $S_N(\alpha)$ we first need some
information concerning diophantine approximations to $\alpha$. If we
suppose that

\[ \alpha = \frac{a}{q} + \beta \]

where $|\beta| < q^{-2}$ and (a,q) = 1, then one expects that

(2)

where $E(N,q,\beta)$ is some error which will be an increasing function of
N, q and $|\beta|$. For small values of q, (2) would provide a good
approximation to $S_N(\alpha)$ by a term which is $O(\min(N, |\beta|^{-1})/\varphi(q))$ for
certain values of the parameters. For some applications the exact
form of the approximation is necessary (e.g. on the major arcs of
the Hardy-Littlewood circle method, see [14]) and in other cases an
upper bound suffices (e.g. section 7 of [1]). The fact that (2)
holds on the Generalized Riemann Hypothesis is classical, with

This analysis was fundamental to Hardy and Littlewood's conditional proof of the
ternary Goldbach theorem [4] and the demonstration in [5] that the
exceptional set in the binary Goldbach problem is $O(X^{1/2+\varepsilon})$
(actually they used a more general hypothesis and gave results
depending on the width of the zero-free region). Ignoring powers of
(log N) we note that for large q, (2) gives a bound $N^{1/2}q^{1/2}$, while
for small q, if we only know $|\beta| < q^{-2}$, the upper estimate is
$Nq^{-1/2}$.

Without any hypothesis one can only establish (2), with the
current state of knowledge, for $q < (\log N)^A$ (any given A) and with

\[ E(N,q,\beta) \ll N \exp\bigl(-c(A)(\log N)^{1/2}\bigr)\,(1 + N|\beta|) \]

(see for example, the proof of Lemma 3.1 in [14]). Vinogradov,
however, proved the ternary Goldbach theorem unconditionally (see
chapter 10 of [15]) by establishing a result of the form

(3)

The bound (3) in this form is due to R.C. Vaughan [12]. We note
that for $q < N^{2/5}$ or $q > N^{3/5}$ and given only $|\beta| < q^{-2}$, this is only
weaker than the result obtained on the GRH by a power of (log N).
No stronger bounds are possible for small q when $|\beta|$ is substantially
smaller than $q^{-2}$, however, by the Vinogradov-Vaughan method.
Vaughan also established that

\[ \sum_{h \le H} |S_N(h\alpha)| \ll (\log N)^7 \bigl(N^{3/4}H + (NHq)^{1/2} + NHq^{-1/2} + N^{4/5+\varepsilon}H^{3/5}\bigr), \tag{4} \]

which quickly leads to the result that, for $\alpha$ irrational, $\beta$
arbitrary, there are infinitely many primes p such that

\[ \|\alpha p + \beta\| < c\,p^{-1/4}(\log p)^7 \tag{5} \]

where c is an absolute constant. By sieve methods one can deduce a
stronger result [6] but this sheds no light on $S_N(\alpha)$. On the GRH
the exponent in (5) can be increased to 1/3 (I have not been able to
locate this fact mentioned in the literature, but Prof. S. Graham
remarked to me that he had proved it in an unpublished manuscript).

The Bombieri-Vinogradov theorem (chapter 28 of [2]) shows that,
in some sense, the GRH is true on average. This leads one to hope
that one could prove (2) to be true on average. Montgomery and
Vaughan [9] effectively got such a result, drawing on some work of
Gallagher [3]. They proved that the integral

\[ \int_{\mathfrak M} S_N(\alpha)^2\, e(-n\alpha)\, d\alpha \]

where $\mathfrak M$ is the union of major arcs, equals the value expected with a
suitably small error plus some unpleasant terms coming from a
possible 'exceptional' character (one whose L-function has a zero
very close, in terms of n, to 1). In this use is being made of
averaging over both numerator and denominator and the latter can
take values up to a small power of n.

The following three theorems demonstrate other average results
on exponential sums over primes.

Theorem 1. Let $N \ge Q \ge 1$. Suppose that, for $Q \le q \le 2Q$ we have
$\alpha_q - a(q)/q = \beta_q$ with $|\beta_q| < q^{-2}$, $\beta \le |\beta_q| \le 2\beta$ and
$N^{-1} \le \beta \le N^{-3/5}$. Then we have

\[ \sum_{Q \le q \le 2Q} |S_N(\alpha_q)| \ll (\log N)^5 \bigl(N^{7/8}\beta^{-1/8} + Q^{3/4}N\beta^{1/4} + Q^{3/2}N\beta^{1/2}\bigr). \tag{6} \]

Theorem 2. Given the hypotheses of Theorem 1 but with
$0 < \beta < N^{-1}\exp((\log N)^{1/2})$ and $Q < N^{1/3}\exp(-2(\log N)^{1/2})$, then there
exists an absolute constant c such that

\[ \sum_{Q \le q \le 2Q} \Bigl| S_N(\alpha_q) - \frac{\mu(q)}{\varphi(q)}\, S_N(\beta_q) + X_q \Bigr| \ll N \exp\bigl(-c(\log N)^{1/2}\bigr), \tag{7} \]

where

I-a
n

this term occurring only if there is a modulus r dividing q with a
real primitive character $\chi$ whose L-function has a real zero $\sigma$ with
$(1 - \sigma) < (\log N)^{-1/2}$. (There can be at most one such r for a given
N).

Theorem 3. Suppose that (a,q) = 1, and $q, R, L, N \ge 1$. Let $\varepsilon > 0$
be given. Then we have that

(8)

2
+ NLq-1/2 + N9 / 10 (RL)1/2 + RN 4 / 5 + (NLRq(l + ~ »)1/2

Alternatively the exponents 2/3, 2/5, 9/10, 4/5 may be replaced by
7/10, 3/5, 7/8, 3/4 respectively.

The author does not know of any applications at present for the
first two theorems although they do imply a bound
$O(N^{7/8}(\log N)^5 \min(N, \beta^{-1})^{1/8}/q)$ on average over q. Theorem 3 is, however, a
stronger result than can be obtained by applying the GRH for each
modulus qr, when the parameters are in certain ranges. For example,
when $R = L = N^{1/3}$, $q = N^{1/2}$, a < N, the right hand side of (8) is
$O(N^{4/3+\varepsilon})$ whereas applying the GRH (and not making use of the
averaging over r) there is a term $(qLNR^3)^{1/2}$ which is of size
$N^{4/3+1/12}$. Professor P. X. Gallagher has remarked that this may
have some implications for the vertical distribution of zeros of
L-functions. Theorem 3 is applied in [7] to prove that there are
infinitely many solutions of $|\alpha p - P_3 + \beta| < p^{-1/300}$ where p is a
prime, $P_3$ a number having no more than three prime factors, $\alpha$ is
irrational, and $\beta$ is arbitrary. This improves upon a result of
Vaughan [11] who adapted his method in [10] which has a "GRH true on
average" strength. Several variations on the above results are
possible. We shall only briefly sketch the proofs of the results
here.
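
The exponent comparison in the example above is simple rational arithmetic, and can be checked as follows. This is only a sketch: the variable names are hypothetical, each quantity is represented by its exponent as a power of N, and only those terms of (8) that are legible in this text are included.

```python
from fractions import Fraction as F

# Exponents of N when R = L = N^(1/3) and q = N^(1/2).
R = L = F(1, 3)
q = F(1, 2)
N = F(1)

# The GRH term (q L N R^3)^(1/2), as an exponent of N.
grh_term = (q + L + N + 3 * R) / 2

# Terms of (8) that survive in the text, as exponents of N.
visible_terms = {
    "N L q^(-1/2)": N + L - q / 2,
    "N^(9/10) (R L)^(1/2)": F(9, 10) * N + (R + L) / 2,
    "R N^(4/5)": R + F(4, 5) * N,
}
```

Here `grh_term` comes out as 17/12 = 4/3 + 1/12, matching the size quoted for the GRH term, and each of the listed terms of (8) has exponent at most 4/3.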

2. Proofs of Theorems 1 and 2.

We shall adapt the argument of [10] to prove Theorem 1 and
appeal to Theorem 7 of [3] in addition to establish Theorem 2. We
denote by $\tau(\chi)$ the usual Gauss sum. We note the well-known (Chapter
9 of [2]) results:

$|\tau(\chi)| = q^{1/2}$ if $\chi$ is primitive mod q;

$|\tau(\chi)| \le |\tau(\chi_d)|$ if $\chi_d$ is the character mod d which
induces $\chi$;

$\tau(\chi) = \mu(q)$ if $\chi$ is the principal character mod q.
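
Two of these Gauss sum facts are easy to confirm numerically for small moduli. The sketch below is an illustration only: the helper names `gauss_sum`, `principal` and `legendre` are assumptions, and the quadratic character modulo an odd prime is used as one convenient example of a primitive character.

```python
import cmath
from math import gcd, sqrt


def gauss_sum(chi, q):
    """tau(chi) = sum over a mod q of chi(a) * e(a / q)."""
    return sum(
        chi(a) * cmath.exp(2j * cmath.pi * a / q)
        for a in range(1, q + 1)
    )


def principal(q):
    """Principal character mod q."""
    return lambda a: 1 if gcd(a, q) == 1 else 0


def legendre(p):
    """Quadratic character mod an odd prime p (a primitive character)."""
    def chi(a):
        a %= p
        if a == 0:
            return 0
        # Euler's criterion.
        return 1 if pow(a, (p - 1) // 2, p) == 1 else -1
    return chi


tau_principal_6 = gauss_sum(principal(6), 6)   # mu(6) = 1
tau_principal_4 = gauss_sum(principal(4), 4)   # mu(4) = 0
tau_principal_5 = gauss_sum(principal(5), 5)   # mu(5) = -1
tau_quadratic_7 = gauss_sum(legendre(7), 7)    # modulus should be sqrt(7)
```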

Let r(q) be the nearest integer to $\beta_q^{-1}$. Then
$\bigl|\,|\beta_q| - h/(h\,r(q) + 1)\bigr| < 3/r(q)^2$ for $h \ge 1$. It is elementary that
the smallest integer in the arithmetic progression h r(q) + 1 which
is coprime to q is $O(d(q)\,q\,r(q)/\varphi(q))$. Hence for each q there exist
integers t(q), k(q) with (t(q),k(q)) = 1, $1 \le t(q) \ll d(q)\,r(q)\,q/\varphi(q)$
and $|\beta_q - k(q)/t(q)| \ll \beta^2$. We write

\[ \psi(N,\chi,y) = \sum_{n \le N} \Lambda(n)\chi(n)e(ny) \quad\text{and}\quad \psi(y,\chi) = \psi(y,\chi,0). \]

Thus

I IT(X)II1/J{N,x,S )1
q
X
mod q
(9)

We first assess the contribution to the right hand side of (9)
arising from principal characters. In this case we use the bound
(3) and obtain

\[
\begin{aligned}
\psi(N,\chi,\beta_q) &= S_N(\beta_q) + O(\log Q) \\
 &\ll (N^{4/5} + N r(q)^{-1/2} + N^{1/2} r(q)^{1/2})(\log N)^{7/2} \\
 &\ll N^{7/8}\beta^{-1/8}(\log N)^{7/2}
\end{aligned}
\]

which is of a suitable size since $|\tau(\chi)| \le 1$.

Now we must convert the remainder of the sum to one involving


only primitive characters. Using * to denote summation over
primitive characters only, the sum is

3/2 5
I * IT(X) 111/I(N,X,S )1 + Q
q
(log N)
X mod M

1/2
< L ----<L- 1:* 11jI(N,X,yq) I (log Q) + Q3/2(log N)5, (10)
1<q~2Q ~(q) X mod q

since
_1_ <: l2lL.Q.
~(qm) ~(q)'

and we have written $\gamma_q$ for that one of $\beta_m$ (m = Q, Q + 1, ..., 2Q)
such that $q \mid m$ and $\sum_{\chi}^{*} |\psi(N,\chi,\beta_m)|$ is maximised. We also put u(q) =
t(m), v(q) = k(m). For the values of $q \le (N\beta)^{1/4}$ we take no account
of the averaging over q. Since $\beta < N^{-1/2}$ we have (writing u for
u(q)),

\[ \sum_{\chi}^{*} |\psi(N,\chi,\gamma_q)| \ll \sum_{\chi}^{*} \max_{y \le N} |\psi(y,\chi,v(q)/u(q))| \ll \frac{u^{1/2}}{\varphi(u)} \sum_{\chi \bmod uq} \max_{y \le N} |\psi(y,\chi)| \]

since $\chi_1\chi_2$ runs over all characters (mod uq) no more than once as
$\chi_1$, $\chi_2$ run over characters mod q and mod u respectively. An appeal
to Theorem 2 of [10] then furnishes the bound

It quickly follows that

This is a satisfactory estimate again.

To handle the remaining values of q we need to modify the
details of [10]. We must first divide up the range of summation
over q, so we now restrict q to lie between Z and 2Z. We write

\[ F(s,\chi) = \sum_{n \le u} \chi(n)\Lambda(n)n^{-s} \]

and

\[ G(s,\chi) = \sum_{n \le v} \chi(n)\mu(n)n^{-s} \]

where u, v (both not less than 1) will be chosen later in terms of
N, Z, and $\beta$. We also put $\theta = 1 + (\log N)^{-1}$ and $T = N^2$. We then
have (cf. Lemma 3 of [10]) that

6+iT L' s
1
21ri f (L (s,X) + F(S,x»); ds
6-iT

+ W(u,X) + O(log N),

for y (N. By partial integration we then obtain

N 1 6+iT L' s-l


-f e(y y) - . f (-L(s,X) + F(s,X») y dsdy
1 q 2nL 6-iT

+ O(N8log N + u).

The error term above contributes $\ll Z^{3/2}(\log N)^3 u$ to (6) which will
be satisfactory providing $u < N\beta^{1/2}$.

Writing

\[ h(s) = \int_1^N e(\gamma_q y)\, y^{s-1}\, dy \]

(suppressing, in the interests of clarity, the dependence of h on q)
we have that h(s) is an entire function of s and for $\sigma \ge 1/2$,

\[ h(s) \ll N \min(1, |t|^{-1/2}) \quad\text{for } t \le 4N\beta, \]

and

\[ h(s) \ll N \min(1, |t|^{-1}) \quad\text{for } t \ge 4N\beta, \]

where $s = \sigma + it$. This means that we can follow through all of
Vaughan's analysis with the factor h(s) included. Also we have
$q^{1/2}/\varphi(q)$ in place of his $q/\varphi(q)$. This gives, using Vaughan's
notation (cf. (20) of [10]),

\[ \sum_{\chi}^{*} \int_{-T'}^{T'} |H(\theta+it,\chi)\,h(\theta+it)|\,dt \ll Z^{-1/2}(\log N)^3\, N\,(1 + Z^2u^{-1})^{1/2}(1 + Z^2v^{-1}T')^{1/2} \]

for $T' \le 4N\beta$, and (cf. (24) of [10])

\[ \sum_{\chi}^{*} \int_{-T'}^{T'} |I(1/2+it,\chi)\,h(\theta+it)|\,dt \ll N^{1/2}(u^2 + Z^2)^{1/4}(v + T'Z^2)^{1/2}(\log N)^4. \]

For integrals with $4N\beta \le t \le T$ the same estimates hold without the
factor T' appearing on the right-hand side. The choice $u = 2\beta^{-1/2}Z^{-1}$
(which is less than $N\beta^{1/2}$ as required earlier) and v

gives an estimate

Since $(N\beta)^{1/4} \le Z < Q$ this completes the proof of Theorem 1.

The proof of Theorem 2 is similar to the above argument with
$(N\beta)^{1/4}$ replaced by $P = \exp(-(\log N)^{1/2}/2)$ and for values of q
smaller than this value the required bound quickly follows by
partial summation from Gallagher's result (the form given in [9] is
most convenient).

3. Proof of Theorem 3.

The following two results are essentially Lemmas 5 and 7 of


[7], the only alterations coming from a change in presentation
concerning the dependence of the results on the size of "a". We
remark that the definition of 8 in Lemma 6 of [7] should have been 8
= max(T/(Rq) ,qo ,1) and not with an "ao" as stated there, and the "J"

occurring in the hypothesis of Lemma 7 should have been an "L".

Lemma 1. Suppose that $\varepsilon > 0$, $N \ge R, J, M, q \ge 1$, (a,q) = 1. Then

a b e(ajmn)
L L L L n m qr
R(r<2R J(j<2j M(m<2M n(N/m

Lemma 2. Given the hypotheses of Lemma 1 and two sequences of
complex numbers $a_n$, $b_m \ll N^{\varepsilon/3}$. Then

a b e(ajmn)
L L L L n m qr
R(r<ZR J(j<ZJ M(m<2M n(N/m

The proof of Lemma 1 uses the fact that the inner sum is a geometric
series, while the proof of Lemma 2 is based on the large sieve and
counting the solutions of certain diophantine inequalities.

To prove Theorem 3 we appeal to Heath-Brown's generalized
Vaughan identity, whereby a sum of the form $\sum \Lambda(n)f(n)$ may be
decomposed into $O((\log N)^{10})$ double sums of the form

\[ \sum_{M \le m < 2M}\ \sum_{n \le N/m} a_n b_m f(mn) \]

with either

(I) $a_n = 1$ or $\log n$, $M \ll N^{2/3}$ ($N^{7/10}$), $b_m \ll N^{\varepsilon/6}$

or

(II) $b_m \ll N^{\varepsilon/6}$, $N^{1/5} \ll M \ll N^{1/3}$ ($N^{1/4}$; $N^{2/5}$)

(the values in brackets produce the alternative exponents). The
result of Theorem 3 quickly follows.

References.

[1] R. C. Baker and G. Harman, Diophantine approximation by prime numbers, J. London Math. Soc. (2) 25 (1982), 201-215.

[2] H. Davenport, Multiplicative number theory (ed. revised by H. L. Montgomery), Springer-Verlag: New York, 1980.

[3] P. X. Gallagher, A large sieve density estimate near $\sigma = 1$, Invent. Math. 11 (1970), 329-339.

[4] G. H. Hardy and J. E. Littlewood, Some problems of 'Partitio Numerorum': III. On the expression of a number as a sum of primes, Acta Math. 44 (1923), 1-70.

[5] G. H. Hardy and J. E. Littlewood, A further contribution to the study of Goldbach's problem, Proc. London Math. Soc. (2) 22 (1923), 46-56.

[6] G. Harman, On the distribution of $\alpha p$ modulo one, J. London Math. Soc. (2) 27 (1983), 9-18.

[7] G. Harman, Diophantine approximation with a prime and an almost-prime, J. London Math. Soc. (2) 29 (1984), 13-22.

[8] D. R. Heath-Brown, Prime numbers in short intervals and a generalized Vaughan identity, Canad. J. Math. 34 (1982), 1365-1377.

[9] H. L. Montgomery and R. C. Vaughan, The exceptional set in Goldbach's problem, Acta Arith. 27 (1975), 353-370.

[10] R. C. Vaughan, Mean value theorems in prime number theory, J. London Math. Soc. (2) 10 (1975), 153-62.

[11] R. C. Vaughan, Diophantine approximation by prime numbers III, Proc. London Math. Soc. (3) 33 (1976), 177-192.

[12] R. C. Vaughan, Sommes trigonométriques sur les nombres premiers, C.R. Acad. Sci. Paris, Sér. A, 258 (1977), 981-3.

[13] R. C. Vaughan, On the distribution of $\alpha p$ modulo 1, Mathematika 24 (1977), 135-141.

[14] R. C. Vaughan, The Hardy-Littlewood Method, Cambridge University Press: Cambridge, 1981.

[15] I. M. Vinogradov, The method of trigonometrical sums in the theory of numbers (Translated, revised and annotated by Davenport, A. and Roth, K. F.), Interscience: New York, 1954.

G. Harman
Department of Pure Mathematics,
University College,
P.O. Box 78,
Cardiff CF1 1XL, Wales, U.K.
THE DISTRIBUTION OF Ω(n) AMONG NUMBERS
WITH NO LARGE PRIME FACTORS

Douglas Hensley

0. Abstract
The main result concerns the distribution of $\Omega(n)$ within

\[ S(x,y) = \{\, n : 1 \le n \le x \text{ and } p \le y \text{ if } p \mid n \,\}. \]

There is an average value $k_0$ for $\Omega(n)$, and a dispersion parameter V,
such that for k not too far from $k_0$, and for large x, y with

\[ 2\log\log x + 1 \le \log y \le (\log x)^{3/4}, \]

the number of solutions n of $\Omega(n) = k$ in S(x,y) is roughly
$\exp(-V(k - k_0)^2)$ times the number of solutions n of $\Omega(n) = k_0$ in
S(x,y).
In the course of the proof, machinery is developed which
permits a sharpening in the same range of previous estimates for the
local behaviour of $\Psi(x,y)$ as a function of x.

1. Introduction.
The question of the distribution of $\nu(n)$ among natural numbers
$n \le x$ with no prime factors > y has received increasing attention in
recent years. Alladi's Turán-Kubilius inequality made a good start,
and there has been further progress (see [1,2]).

Here it is more natural to deal with $\Omega(n)$, and count prime
divisors of n according to their multiplicity. Our methods are best
suited to moderately large values of u := log x/log y, and for most
of this work we assume

\[ (\log y)^{1/3} \le u \le \sqrt{y}/(2\log y). \]


This is essentially the same as the region advertised in the
abstract, and is technically more convenient.
We adopt most of the standard notation of the subject: The
largest prime factor of n is p(n),

\[ S(x,y) = \{\, n : 1 \le n \le x \text{ and } p(n) \le y \,\}, \]

and

\[ \Psi(x,y) = \# S(x,y). \]

Our results have the distinction of giving good estimates for
the individual $\Psi_k(x,y)$, where

\[ \Psi_k(x,y) := \#\{\, n : 1 \le n \le x,\ p(n) \le y \text{ and } \Omega(n) = k \,\}, \]

when k is near the average (over n in S(x,y)) of $\Omega(n)$. This mean
is given to a close approximation by

where $\tau = \tau(x,y)$ is determined by

\[ \sum_{p \le y} p^{\tau-1} \log p = \log x. \]

Loosely, $\tau = (\log u + \log\log u)/\log y$, and $k_0 = u + u/\log u$. As k
departs from $k_0$, $\Psi_k(x,y)$ falls off in the typical Gaussian manner,
with variance $\approx u/(\log u)^2$, out to $> u^{1/14}$ standard deviations.
Very few n in S(x,y) have $\Omega(n)$ farther from $k_0$.
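
For very small x and y the quantities $\Psi(x,y)$ and $\Psi_k(x,y)$ can be tabulated by brute force, which may help in reading the definitions. The following Python sketch is illustrative only; the function name `omega_counts` is an assumption, and trial division of this kind is far outside the ranges of x and y studied in the paper.

```python
from collections import Counter


def omega_counts(x, y):
    """Return a Counter mapping k to Psi_k(x, y), computed by brute force.

    An integer n <= x is counted when all its prime factors are <= y;
    k is the number of prime factors of n counted with multiplicity.
    """
    counts = Counter()
    for n in range(1, x + 1):
        m, k = n, 0
        p = 2
        while p <= y and m > 1:
            # Composite p never divide m here, since their prime
            # factors have already been removed.
            while m % p == 0:
                m //= p
                k += 1
            p += 1
        if m == 1:  # every prime factor of n is <= y
            counts[k] += 1
    return counts


small = omega_counts(10, 3)
psi_10_3 = sum(small.values())
```

With x = 10 and y = 3 the members of S(x,y) are 1, 2, 3, 4, 6, 8, 9, so $\Psi(10,3) = 7$, split by $\Omega(n)$ as 1, 2, 3, 1 for k = 0, 1, 2, 3.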

In the course of the proof we develop considerable machinery


which can also be used to study the local behavior of ~(x,y) as a
function of x.

There are recent and striking results of Hildebrand [5] on this
subject. He shows that

$\Psi(cx,y)$

for essentially the entire interesting range of x and y, with $\alpha$
given by

This $\alpha$ and our $1 - \tau$ are nearly equal. In fact, $\alpha = 1 - \tau +
O(1/(u \log^2 y))$ in our range. Later, we will be working with a certain
$\theta$ defined as $\tau$ was except that the primes are "smeared out" a
little. The distinction is minor, as $\theta = \tau + O(y^{-5/3+\varepsilon})$ in our
range. It will be evident that the error terms in both theorems are
large enough that the results hold with $\tau$ in place of $\theta$, and
without the effect of this smearing on V. For simplicity of
exposition though, we do all the mathematics, and state the
theorems, in terms of the smeared parameter $\theta$ and its associated
quantities. In particular the $k_0$ defined previously, and the
subsequent $k_0$ defined by a smeared analog, are normally equal and at
worst differ by 1.

The sharpening promised permits us to replace Hildebrand's
$u^{-1/10}$ with $(\log u)^3/(\sqrt{u}\,\log y)$ in our narrower range. It may be
that the former error term could be improved to like or better
sharpness in this narrower range, but this is not obvious.

The starting point for our proofs is the identity

\[ \Psi_k(x,y) = \sum_{d \ge 1} Q(d)\, H_{k-\Omega(d)}(x/d, y) \tag{1.1} \]

where

and

\[ Q(d) = \prod_{p^a \| d} \Bigl(\sum_{j=0}^{a} (-1)^j/j!\Bigr) = \prod_{p^a \| d} q_a, \text{ say.} \]

Note that $q_0 = 1$, $q_1 = 0$, and $0 < q_a < 1$ for $a \ge 2$, with
$\lim q_a = 1/e$. Thus in (1.1) most $d \le x$ have Q(d) = 0, since most d
have a prime divisor of multiplicity 1.
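
The stated properties of the numbers $q_a$ and of Q(d) are easy to confirm with exact rational arithmetic. The sketch below is a minimal illustration; the function names `q` and `Q` mirror the notation of the text, but the implementation (trial division, `fractions.Fraction`) is an assumption of this example.

```python
from fractions import Fraction
from math import e, factorial


def q(a):
    """q_a = sum_{j=0}^{a} (-1)^j / j!, as an exact fraction."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(a + 1))


def Q(d):
    """Q(d) = product of q_a over the exact prime powers p^a dividing d."""
    result = Fraction(1)
    p = 2
    while d > 1:
        a = 0
        while d % p == 0:
            d //= p
            a += 1
        result *= q(a)  # q(0) = 1 when p does not divide d
        p += 1
    return result


first_values = [q(a) for a in range(5)]
```

The first values are 1, 0, 1/2, 1/3, 3/8, and Q(d) vanishes exactly when some prime divides d to the first power only, for example Q(6) = Q(12) = 0 while Q(36) = 1/4.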

The reason for putting things in terms of the $H_m(x/d,y)$ is that
there is a tie to probability. If $Y_1, Y_2, \dots, Y_j$ are independent,
identically distributed random variables on some probability space,
with

\[ \mathrm{Prob}(Y_i = \log p) = 1/\pi(y) \]

for each $p \le y$, then

(1.2)

This allows us to transfer the problem of counting $\Psi_k(x,y)$ to
the setting of sums of independent random variables. While the
concept is then fairly simple, many details must be hammered out.

In Sections 2 and 3 we develop some information about the
distribution of the random variables $Y_i$, and define quantities that
later appear in the main results. In Sections 4 and 5 we show that
various "exceptional numbers" are rare in S(x,y). In Sec. 6 we
return to the main line of argument and obtain sharp estimates of
the $H_m(x/d,y)$ for "unexceptional" m and d. In Sec. 7 we prove
Theorem 1, the sharper estimate of $\Psi(cx,y)/\Psi(x,y)$ in the range
under discussion. In Sec. 8 we prove Theorem 2, showing that the
distribution of $\Omega(n)$ in S(x,y) is Gaussian, and that every
reasonably central k has as many n in S(x,y) with $\Omega(n) = k$ as
expected, to within a factor of $1 + O(u^{-1/4})$.
The origin of the two constraints on u merits some discussion.
The lower limit $u = (\log y)^{1/3}$ could easily be relaxed to
$u = (\log y)^{\varepsilon}$, and probably to $u = (\log\log y)^{1+\varepsilon}$. But if $u \le \log\log y$,
the distribution of mass in

\[ \sum_{p \le y} p^{\tau-1} \]

shifts from being packed largely into $(\sqrt{y}, y)$ to being far more
spread out. The application of the Berry-Esseen theorem in Sec. 6
breaks down, and all the many calculations along the way are vastly
complicated. Happily, there are other ways to study the
distribution of $\Omega(n)$ in S(x,y) for smaller u, and Alladi [2] has
shown that here too it is Gaussian.

The upper limit seems to be an inherent defect of our method.
For $u = y^{1/T}$, the proportion of square-free numbers in S(x,y) is
asymptotically $1/\zeta(2(1 - 1/T))$, for T > 2. (This follows from
Hildebrand's local behavior result, or from our Theorem 1). As
$T \to 2^+$, $\zeta(2(1 - 1/T)) \to \infty$ and the proportion of square-free numbers
drops toward zero.

Since the identity (1.1) is designed to let us recover S(x,y)
in full from a weighted version in which only square free numbers
receive full weight, it cannot be expected to perform well when the
weighted version varies too strongly from the weight-one case.

2. A sense in which the distribution of log p (p < y) is smooth.

An important distinction in probability is that made between
discrete and continuous distributions. Now the distribution of our
$Y_i$, with mass $1/\pi(y)$ at each log p, $p \le y$, is of course discrete.
However, for large y the prime number theorem suggests that this
distribution is continuous, with density proportional to $s^{-1}e^s$ on
$1 \le s \le \log y$. The subsequent analysis would be simpler if it had
to do with such a density. This section gives rigorous content to
the metaphor above. We show that the distribution of a $Y_i$ is close
to a continuous distribution with a density that for most s is near
$Cs^{-1}e^s$.

If we would relax the standards of "close to" we could insist
on proportionality to $s^{-1}e^s$ for all large s. But there are stronger
results on the local smoothness of primes if a few exceptions are
allowed.

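Selberg showed that for all $\varepsilon_1 > 0$, all $\varepsilon_2 > 0$, there exists an
$x(\varepsilon_1,\varepsilon_2)$ such that if $x > x(\varepsilon_1,\varepsilon_2)$ then [7]

\[ \#\Bigl\{ n \le x : \bigl|\pi(n + n^{19/77+\varepsilon_1}) - \pi(n) - n^{19/77+\varepsilon_1}/\log n\bigr| > \varepsilon_2\, n^{19/77+\varepsilon_1}/\log n \Bigr\} < \varepsilon_2 x. \tag{2.1} \]

Disallowing exceptions in (2.1) would only permit an exponent
of 1/2, even on the Riemann hypothesis.

We now fix an $\varepsilon_1$, $0 < \varepsilon_1 < 1/100$, and let

\[ v = y^{\varepsilon_1 - 58/77} = v(y), \]

\[ \lambda_p(s) = v^{-1}\chi_{[\log p - v/2,\ \log p + v/2]}(s) \]

and

\[ \lambda(s) = \sum_{p \le y} \lambda_p(s). \]

Let m be Lebesgue measure.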

Lemma 1. For all $\varepsilon > 0$ and $\delta > 0$, if $U \subset \mathbf{R}$ is measurable and
$m(U) = \delta$, then there exist $k > \delta/\varepsilon$ and $u_1 < u_2 < \dots < u_k$ in U
such that $u_{j+1} - u_j > \varepsilon$ for $1 \le j \le k - 1$.

Proof. Clear.

Lemma 2. For all sufficiently large y, and for all t satisfying
$1 \le t \le \tfrac12 \varepsilon_1 \log y$,

\[ m\{\, s : \lambda(s) < \tfrac{99}{100}\, s^{-1}e^s \text{ and } \log y - t \le s \le \log y \,\} \le \frac{t}{100}. \]

Proof. Fix t and let $U_t = \{\, s : \lambda(s) < \tfrac{99}{100}\, s^{-1}e^s$ and $\log y - t \le
s \le \log y \,\}$. Assume $m(U_t) > t/100$. We derive a contradiction.
There must be some interval L of length 1 within [log y - t,
log y] which intersects $U_t$ in a set of measure greater than 1/100.
Let $L' = \{s_1, s_2, \dots, s_k\}$ be a set of $k > v^{-1}/100$ elements of $L \cap U_t$, with
$s_{j+1} - s_j > v$ for $1 \le j \le k - 1$. Such an L' exists by Lemma 1.

For $s_j \in L'$ consider the intervals $[\exp(s_j - v/2), \exp(s_j + v/2))$
= $[a_j, b_j)$, say. These are disjoint for distinct j, $1 \le j \le k - 1$.
Fix a particular j and drop the subscripts: [a,b). Temporarily,
let $\delta = \frac{19}{77} + \varepsilon_1/2$. For each integer m, $a \le m \le a + a^{\delta}$, consider
the sequence $\alpha_i^{(m)}$ determined by

(2.2)

These sequences are disjoint for distinct m, collectively they
include all but a vanishingly small fraction of the integers in
[a,b), and they satisfy

for a ( m, n ( a + a° (2.3)
a (m) > a (m-l) for a + 1 ( m ( a + a °•
i i

For each m and i, let

B(i,m) = (2.4)

\[ M(m) = \{\, i : B(i,m) \subset [a,b) \,\}, \]

and

\[ N(m) = \{\, i : B(i,m) \subset [a,b) \text{ and } B(i,m) \text{ contains fewer than } \tfrac{199}{200}E(i,m) \text{ primes} \,\}. \]

Then the number of primes in $[a_j, b_j)$ is at least

\[ \frac{199}{200} \sum_{i \in M(m),\ i \notin N(m)} E(i,m). \]

This lower bound holds for each m, a ( m ( a + aO. Since


I og ai(m) E: [ s j - v
/2, s j + v /2] and since [(m
a i + a i(m» 0] - a i(m)
~ eXP(osj)' there are thus at least

RemaJtk.. (This is not to say that we can sum over the various m and
get still more primes. But for each fixed m, this is correct).

Now for each m, a ( m <a + a , °


#M(m) ~ v exp(1 - o)s.)
J
> YEl/12 •

Thus there are at least



primes in [aj,b j ).

On the other hand since $s_j \in L'$ there are no more than
$\frac{99}{100}\, v\, s_j^{-1} e^{s_j}$ primes in $[a_j, b_j)$. Thus

3v -1 Sj -1 OSj
1000 Sj e (Sj e #N(m),

and so

3v (l-o)Sj.
#N(m) ) 1000 e (2.5)

Thus also

Now summing over j, we get that there are at least

distinct integers of the form $\alpha_i^{(m)}$, each $\le e^{1+s_1}$ and with fewer
than $\frac{199}{200}E(i,m)$ primes between $\alpha_i^{(m)}$ and $\alpha_{i+1}^{(m)}$. But according
to (2.1), with $\varepsilon_2 = 10^{-6}$, say, there cannot be this high a
proportion of integers $n \le e^{1+s_1}$ with so few primes between n and
$n + n^{\delta}$. This completes the proof of Lemma 2.

Corollary. For y sufficiently large, $r \ge 0$, and $1 \le t \le \tfrac12\varepsilon_1 \log y$,

\[ m\{\, s : \lambda(s)\exp((r-1)s) < s^{-1}(e^{rs} - \tfrac{1}{100}y^r) \text{ and } \log y - t \le s \le \log y \,\} \le t/100. \]

Proof. The set in question is contained in the set of Lemma 2.

3. Calculus and Statistics.

Here we work out estimates of various quantities related to
exponential centering and the Berry-Esseen theorem. Let

\[
\begin{aligned}
G(r) &= \sum_{p \le y} p^{r-1}, \\
I(r) &= \sum_{p \le y} p^{r-1} \log p, \\
J(r) &= \sum_{p \le y} p^{r-1} \log^2 p, \\
K(r) &= \sum_{p \le y} p^{r-1} \log^3 p.
\end{aligned} \tag{3.1}
\]

Let $\mu$ now be the probability measure

$$\mu = \frac{1}{\pi(y)} \sum_{p \le y} \delta_{\log p},$$

a sum of equal point masses at each $\log p$, $p \le y$.

Let $Y_1, Y_2, \ldots$ be independent, identically distributed random
variables with common measure $\mu$.

Let $\lambda$ be the probability density function

$$\lambda(s) = \frac{1}{\pi(y)} \sum_{p \le y} v^{-1} \chi_{[\log p - v/2,\ \log p + v/2]}(s),$$

as in Sec. 2. Let $Z_1, Z_2, \ldots$ be further random variables,
independent, and uniformly distributed on $[-v/2, v/2]$. Then
$\lambda(s)$ is the common density of the $(Y_i + Z_i)$'s.

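For $r \ge 0$ let

$$\tilde G(r) = \int_0^\infty e^{(r-1)s} \lambda(s)\,ds, \qquad \tilde I(r) = \int_0^\infty s\, e^{(r-1)s} \lambda(s)\,ds, \qquad \tilde J(r) = \int_0^\infty s^2 e^{(r-1)s} \lambda(s)\,ds, \tag{3.2}$$

and

$$\tilde K(r) = \int_0^\infty s^3 e^{(r-1)s} \lambda(s)\,ds.$$

Here $\lambda$, and thus $\tilde G$, $\tilde I$ etc., depends on $y$ implicitly. The defining
integrals are convergent for all $r$ since $\lambda(s)$ has bounded support.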

The next variation on $G, I, \ldots$ comes from replacing $\pi(t)$ with
$\mathrm{li}(t)$, the logarithmic integral, and truncating at $y$. Let

$$G(r) = \int_1^{\log y} s^{-1} e^{rs}\,ds, \qquad I(r) = \int_1^{\log y} e^{rs}\,ds, \qquad J(r) = \int_1^{\log y} s\, e^{rs}\,ds, \tag{3.3}$$

and

$$K(r) = \int_1^{\log y} s^2 e^{rs}\,ds.$$

As the notation is meant to suggest, $\hat G$, $\tilde G$ and $G$, etc. are nearly
equal. We use the prime number theorem:
For fixed $C > 0$ (see [7]),

$$\pi(t) = \mathrm{li}(t) + O\bigl(t\, e^{-C\sqrt{\log t}}\bigr). \tag{3.4}$$

Lemma 3. $\hat G(r) = G(r) + O\bigl(1 + (1 + \tfrac1r)\exp(r \log y - \sqrt{\log y})\bigr)$,
uniformly in $r > 0$. Further, $G$ may be replaced with $I$, $J$ or $K$.
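As a rough numerical illustration of Lemma 3 (not part of the proof), the following sketch compares the prime sum of (3.1) with the truncated integral of (3.3) for one choice of $y$ and $r$. The function names, the Simpson step count and the sample values are assumptions made for this example only.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def prime_sum(y, r):
    """Prime sum of (3.1): sum of p**(r-1) over primes p <= y."""
    return sum(p ** (r - 1.0) for p in primes_up_to(y))

def smooth_integral(y, r, steps=4000):
    """Integral of (3.3): int_1^{log y} exp(r*s)/s ds, by Simpson's rule (steps even)."""
    a, b = 1.0, math.log(y)
    h = (b - a) / steps
    f = lambda s: math.exp(r * s) / s
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

if __name__ == "__main__":
    y, r = 10 ** 5, 0.5  # illustrative values only
    print(prime_sum(y, r), smooth_integral(y, r))
```

For moderate $y$ the two quantities should agree to within a few percent; the lemma itself concerns the size of the difference as $y \to \infty$.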

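Proof.

$$\hat G(r) = \sum_{p \le y} p^{r-1} = t^{r-1}\pi(t)\Big|_{2^-}^{y} + \int_2^y (1-r)\,t^{r-2}\pi(t)\,dt$$

$$= y^{r-1}\bigl\{\mathrm{li}(y) + O\bigl(y\,e^{-C\sqrt{\log y}}\bigr)\bigr\} + \int_2^y (1-r)\,t^{r-2}\bigl(\mathrm{li}(t) + O\bigl(t\,e^{-C\sqrt{\log t}}\bigr)\bigr)\,dt$$

$$= y^{r-1}\,\mathrm{li}(y) + O\bigl(y^r e^{-C\sqrt{\log y}}\bigr) + \int_2^y (1-r)\,t^{r-2}\,\mathrm{li}(t)\,dt + O\Bigl(\int_2^y t^{r-1} e^{-C\sqrt{\log t}}\,dt\Bigr)$$

$$= \int_2^y t^{r-1}\frac{dt}{\log t} + O\bigl(y^r e^{-C\sqrt{\log y}}\bigr) + O\Bigl(\int_2^y t^{r-1} e^{-C\sqrt{\log t}}\,dt\Bigr),$$

so that

$$\hat G(r) = G(r) + O\bigl(y^r e^{-C\sqrt{\log y}}\bigr) + O\Bigl(\int_1^{\log y} e^{rs - C\sqrt{s}}\,ds\Bigr). \tag{3.5}$$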

In estimating with $I$, $J$, or $K$ in place of $G$, the powers of
$\log t$ that arise can be subsumed in $\exp(-C\sqrt{\log t})$ by reducing $C$.
Let us take $C$ originally so that after any such reductions, (3.5)
holds for $G$, $I$, $J$ and $K$ with $C = 2$. It remains to bound

$$\int_1^{\log y} e^{rs - C\sqrt{s}}\,ds.$$

We need a sublemma.

Lemma 4. Let $F(T,r) = \int_1^T e^{rt - \sqrt{t}}\,dt$. Then for $T \ge 1$ and $r \ge 0$

(a) $F(T,r) \le 32$

(b) $F(T,r) \le \dfrac{32}{r}\, e^{rT - \sqrt{T}}$ $\quad (T^{-1/2} \le r)$.

(The proof presents no special difficulty and is left to the
reader.)
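Now

$$\int_1^{\log y} e^{rs - 2\sqrt{s}}\,ds = \frac14 \int_4^{4\log y} e^{(r/4)s - \sqrt{s}}\,ds \le 8 \qquad \text{for } r \le \frac{3}{\sqrt{\log y}},$$

by (a).

For $r > 3/\sqrt{\log y}$, though,

$$\int_1^{\log y} e^{rs - 2\sqrt{s}}\,ds \le \int_1^{\log y} e^{rs - \sqrt{s}}\,ds \le \frac{32}{r}\, e^{r\log y - \sqrt{\log y}}$$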

by (b). In both cases,

(3.6)

The other error term in (3.5) adds to this to give the claimed error
bound of Lemma 3.

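Lemma 5. $\tilde G = \hat G(1 + O(v^2))$, and likewise for $I$, $J$ and $K$.

Proof. (For $G$).

$$|\tilde G - \hat G| \le \sum_{p \le y} \Bigl| p^{r-1} - v^{-1} \int_{\log p - v/2}^{\log p + v/2} e^{(r-1)s}\,ds \Bigr| = \sum_{p \le y} p^{r-1} \Bigl| 1 - v^{-1} \int_{-v/2}^{v/2} e^{(r-1)s}\,ds \Bigr| < v^2 \sum_{p \le y} p^{r-1}.$$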

Lemma 6. For each $C > 0$, the following holds uniformly in $0 < r < C$
and $y \ge \exp(1/r)$:

(1) $G(r) = \mathrm{li}(y^r) + \log(1/r) + O(1)$,

(2) $I(r) = \dfrac1r\,(y^r - 1) + O(1)$,

(3) $J(r) = \dfrac1r\, y^r\bigl(\log y - \dfrac1r\bigr) + r^{-2} + O(1)$,

(4) $K(r) = \dfrac1r\, y^r\bigl(\log^2 y - \dfrac{2\log y}{r} + 2r^{-2}\bigr) + 2r^{-3} + O(1)$,

(5) $(GJ - I^2)(r) = \dfrac{y^{2r}}{r^4 \log^2 y}\Bigl(1 + \dfrac{4}{r\log y} + O\Bigl(\dfrac{1}{r^2\log^2 y}\Bigr)\Bigr) + r^{-2}\log\bigl(\tfrac1r\bigr)\bigl(y^r(r\log y - 1) + 1\bigr) + O(\log r)$.

Proof. We have $G(r) = \int_1^{\log y} \frac{1}{rs}\, e^{rs}\, r\,ds = \mathrm{li}(y^r) + \int_r^1 t^{-1} e^t\,dt$.
The last integral is $\log r + O(1)$, uniformly in $0 < r \le C$. The rest
is also a simple calculus exercise.

From Lemmas 5 and 6 we have the

Corollary. Lemma 3 holds with $\tilde G$, $\tilde I$, $\tilde J$ and $\tilde K$ in place of $\hat G$, $\hat I$, $\hat J$
and $\hat K$ respectively, when $\dfrac{1}{\log y} \le r \le C$.
og y

Now from Lemmas 3, 5 and 6, we have uniformly in ___1__ ~ r


log y
~ C that

r r -/log y
GJ - 12 GJ - r2 + o(--I---
rlog y
+ Ye
r2
)

(3.7)

Let

a = IIG (3.8)
and

We plan later to modify the density function $\lambda(s)$ of the
$Y_i + Z_i$'s to $\tilde G^{-1} e^{(r-1)s} \lambda(s)$, which is also a probability density
function, and with mean $\alpha$, standard deviation $\sigma$ and absolute third
moment $\beta$. These three statistical parameters are needed to apply
the Berry-Esseen theorem (the central limit theorem with explicit
error estimates).
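The exponential centering described above can be tried out numerically. The sketch below is an illustration only: it tilts the discrete measure $\mu$ (point masses at $\log p$, weights proportional to $p^{r-1}$) rather than the smoothed density $\lambda$, and it checks that the mean is $I/G$ and the variance is $(GJ - I^2)/G^2$ for the corresponding prime sums. All function names and sample values are assumptions of this example.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def tilted_moments(y, r):
    """Moments of log p when the prime p <= y carries weight p**(r-1).

    Returns (G, I, J, mean, variance, beta), where G, I, J are the prime
    sums of (3.1) and beta is the absolute third central moment.
    """
    ps = primes_up_to(y)
    logs = [math.log(p) for p in ps]
    weights = [p ** (r - 1.0) for p in ps]
    G = sum(weights)
    I = sum(w * l for w, l in zip(weights, logs))
    J = sum(w * l * l for w, l in zip(weights, logs))
    mean = I / G
    variance = sum(w * (l - mean) ** 2 for w, l in zip(weights, logs)) / G
    beta = sum(w * abs(l - mean) ** 3 for w, l in zip(weights, logs)) / G
    return G, I, J, mean, variance, beta

if __name__ == "__main__":
    G, I, J, mean, variance, beta = tilted_moments(10 ** 4, 0.4)
    # The ratio beta / sigma**3 is the quantity that enters the Berry-Esseen bound.
    print(mean, variance, beta / variance ** 1.5)
```

The ratio printed last is the one that the section later bounds by $O(1)$; the sketch only evaluates it for a single sample pair $(y, r)$.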

Until now we have left r in a wide range.

In Lemma 6 (1), this splits naturally into two regions: $\mathrm{li}(y^r)$
predominant, and $\log(1/r)$ predominant. In the latter case, the
calculus becomes very involved. This case is also the one
associated with small $u = \log x/\log y$ where traditional methods have
worked so well. Accordingly, we shall here treat only the case of
large $u$ from now on. We assume

(3.9)

Defining ~ as usual be ~ > 0, e~ - u~, we now restrict attention


to r satisfying

Irlog y - ~I ~ 2. (3.10)

1
Then (3.10) is contained in (-1-- ~ r ~ C + 1), and for any r
og y
satisfying (3.10), all the previous results of this section are
valid. From now on, we assume (3.9) and (3.10).

Lemma 7. Given (3.9) and (3.10),

(1) $r = (\log u + \log\log u + O(1))/\log y$,

(2) $\tilde G(r) = \mathrm{li}(y^r) + \log\log y + O(\log\log u)$,

(3) $\tilde I(r) = \dfrac1r\,(y^r - 1) + O(1)$,

(4) $\tilde J(r) = \dfrac1r\, y^r\bigl(\log y - \dfrac1r\bigr) + r^{-2} + O(1)$,

(5) $\tilde K(r) = \dfrac1r\, y^r\bigl(\log^2 y - \dfrac{2\log y}{r} + 2r^{-2}\bigr) + 2r^{-3} + O(1)$,

(6) $\tilde G(r) = y^r\bigl((r\log y)^{-1} + (r\log y)^{-2} + O((r\log y)^{-3})\bigr)$,

(7) $(\tilde G\tilde J - \tilde I^2)(r)$, and $(GJ - I^2)(r)$, both equal

$$\frac{y^{2r}}{r^4\log^2 y}\Bigl\{1 + \frac{4}{r\log y} + O\Bigl(\frac{1}{r^2\log^2 y}\Bigr)\Bigr\}.$$
-2 4 1
(8) r (1 + ~ + O( 2 2»)
g y r log y
2
(~)(1 + O(log log u») and
10g2 u log u '

(9) log y - ex = 1.( 1 + 0(_1_») _ 12iLz (1 + O( 10glog u»).


r rlog y log u log u

Proof. A routine, if lengthy, calculation.

Now let $h(r,x,y) = \tilde G(r)(\log x)/\tilde I(r)$. Frequently we will
abbreviate this to $h(r)$, or just $h$. Then

$$\frac{dh}{dr} = -(\tilde G\tilde J - \tilde I^2)\log x/\tilde I^2 = -\sigma^2 \tilde G^2 \log x/\tilde I^2 = -\frac{\log x}{\log^2 u}\Bigl(1 + O\Bigl(\frac{\log\log u}{\log u}\Bigr)\Bigr). \tag{3.11}$$

Proof. Immediate from the definition and from Lemma 7.

Let $\theta$ be the (unique from (3.11)) $r$ such that $\tilde I(r) = \log x$.
Then $|\theta\log y - \xi| \le 1$. (See (3.13) below). Let $h_0 = \tilde G(\theta)$, and
$n_0 = [h_0]$.

Remark. In $S(x,y)$, the mean and median value of $\Omega(n)$ is quite
close to $h_0$, as we shall see.

Now, more calculus:

$$h\Bigl(\theta - \frac{1}{\log y}\Bigr) = h_0 + \frac{u}{\log^2 u}\Bigl(1 + O\Bigl(\frac{\log\log u}{\log u}\Bigr)\Bigr), \tag{3.12}$$

$$h\Bigl(\theta + \frac{1}{\log y}\Bigr) = h_0 - \frac{u}{\log^2 u}\Bigl(1 + O\Bigl(\frac{\log\log u}{\log u}\Bigr)\Bigr).$$

(Proof. Immediate from (3.11).)

a log y (3.13)

(P~oo6· Here I(r) = I(r)(l + O(e- /log y» from Lemma 3. Now

i(~/log y) = log x, so I(~/log y) = log x(l + O(e- /log y». Now


-/10g y
dI/dr = J:=:log x log y for r = ~/log y + O(e ), so a change

of 0(_1___ e- /log y) in r will bring I(r) to log x.) ;


log y

$$\beta\sigma^{-3} = O(1), \tag{3.14}$$

uniformly in $(r,y)$ satisfying (3.9) and (3.10).

Proof. Lemma 7 has an estimate of $\sigma$. To estimate $\beta$, we cut the
defining integral at $\tfrac12\log y$ and at $\alpha$. For $s < \tfrac12\log y$, the
integrand is $\ll \log^3 y\; e^{(r-1)s}\lambda(s)$. Using the definition of $\lambda(s)$ and
the prime number theorem,

1/2 log Y -1 rs
f s e ds
< ;y
1 r/2
1

For 1/2 log y < s < a, the integral in (3.8) is

r
a 1 ,3 rs d < 1 ra -4 < y
< f 1h log ylog y 's - a e s log y e r 4
r log y

from Lemma 7, (9), and



log y () log y 1
f (s - a)3 e r-1 sA(s)ds < r
-3
f s
- rs
e ds < 4
y
a 1/2 log y r log y

again by the prime number theorem. Together with $\tilde G \asymp y^r/(r\log y)$ and
$\sigma^2 \asymp 1/r^2$ from Lemma 7, this gives (3.14).

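Now let $r = r(h)$ be the inverse function of $h(r)$, and

$$\mathcal G(h) = h\log\tilde G(r(h)) - h\log h + h + (1 - r(h))\log x.$$

Then

$$d\mathcal G/dh = \log\tilde G - \log h, \tag{3.15}$$

$$d^2\mathcal G/dh^2 = -\frac{1}{\log x}\Bigl(\frac{\tilde I\tilde J}{\tilde G\tilde J - \tilde I^2}\Bigr) = -\frac{\log^2 u}{u}\Bigl(1 + O\Bigl(\frac{\log\log u}{\log u}\Bigr)\Bigr),$$

uniformly in $h_0 - \dfrac{u}{\log^2 u} \le h \le h_0 + \dfrac{u}{\log^2 u}$.

Further, if $V = -d^2\mathcal G/dh^2\,\big|_{h_0}$, and $\Delta h = h - h_0$, then

$$d^2\mathcal G/dh^2 = -V\bigl(1 + O(\Delta h\,(\log u)/u)\bigr) \qquad \text{for } |h - h_0| \le \frac{u}{\log^2 u}.$$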

PJtoofi. Only the last claim is at all difficult. We expand

G r( 1 + 1 2 6 1)
= Y rlog y 2 2 + 3 3 + 4 4 + O( 5 5)'
r log y r log y r log y r log y

and I, J and K to like accuracy. Then

2 3 3
~h log (Gi;I\ -I ( I K - J G ),
(GJ-I 2 )log x (GJ-I 2 )IJ

and 13K - J3G < y4r/ r 7log y. On the other hand, (GJ - I2)IJ •
y4r /r6 log y, so

Iddh log ( GJ -13 12 ) I < 1r Idrl


dh
< ~.
u

Now let 00 = 0(6), = o(r(h O». Then again uniformly in


h - _u_ .. h .. h +.-!!.,.-
o log 2u 0 logLu '

Iddh log (02)1 < ~


u (3.16)

and
a
2 2 (1 + O(~h log
00 u
u».
2 GJ-I 2
Pnoo6· a = ---2--. Now
G
d GJ-I 2 d GJ - 12 d IJ
dh log( -2-) = dh log( IJ ) + dh log( 2" ).
G G

But log (IJ/G 2 ) = 2 log (I/G) + log (J/I). Expanding as before and
simplifying now gives (3.16).
We make one last observation.
-1
Given (3.9) and (3.10), and moreover Ir - 91 < (log y) ,for
r < 2/3

_1_ = _1_ (1 + O(~h 10g2 u». (3.17)


r-l 1-9 u log y

Proof. Plug in Lemma 7 (1) and the given conditions.

4. Exclusion of numbers with many prime powers.

Here we show that in the identity

$$\Psi(x,y) = \sum_d Q(d)\,\Psi'(x/d,y) \tag{4.1}$$

the contribution due to terms with $d \ge K$ is small for large $K$ under
the hypothesis

$$(\log y)^{1/3} < u < \frac{\sqrt{y}}{2\log y}. \tag{4.2}$$

Remark. It is roughly at $u = \sqrt{y}$ that we turn a kind of corner. For
smaller $u$, the proportion of square-free numbers in $S(x,y)$ is
positive, while for larger $u$, it is asymptotically zero. Thus for
larger $u$ it is increasingly difficult to recover $\Psi(x,y)$ from the
weighted sum $\Psi'(x,y)$ which counts only square-free numbers with full
weight. In all our theorems, we assume (4.2).
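The identity (4.1) can be checked exactly on a small example. The sketch below rests on two assumptions that are inferred from (4.6) and from the probabilistic formula for $\Psi'$ rather than quoted from the text: that $\Psi'$ counts $n = \prod p_i^{a_i}$ with weight $1/\prod a_i!$, and that $Q$ is multiplicative with $Q(p^j) = q_j = \sum_{k \le j} (-1)^k/k!$, the coefficients of $e^{-z}/(1-z)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def smooth_numbers(x, primes):
    """Return {n: exponent tuple} for all n <= x built only from the given primes."""
    result = {1: (0,) * len(primes)}
    for idx, p in enumerate(primes):
        for n, exps in list(result.items()):
            m, a = n * p, 1
            while m <= x:
                new = list(exps)
                new[idx] = a
                result[m] = tuple(new)
                m, a = m * p, a + 1
    return result

def q(j):
    """Assumed coefficient q_j of z**j in exp(-z)/(1 - z)."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(j + 1))

def weight(exps):
    """Assumed weight 1/prod(a_i!) of n = prod p_i**a_i in the weighted count."""
    w = Fraction(1)
    for a in exps:
        w /= factorial(a)
    return w

def Q(exps):
    """Assumed multiplicative weight Q(d) = prod q(a_i)."""
    out = Fraction(1)
    for a in exps:
        out *= q(a)
    return out

def psi(x, primes):
    """Unweighted count of smooth numbers up to x."""
    return len(smooth_numbers(x, primes))

def psi_prime(x, primes):
    """Weighted count of smooth numbers up to x."""
    return sum(weight(e) for e in smooth_numbers(x, primes).values())

def psi_from_identity(x, primes):
    """Right side of (4.1): sum over smooth d <= x of Q(d) * psi_prime(x // d)."""
    return sum(Q(e) * psi_prime(x // d, primes)
               for d, e in smooth_numbers(x, primes).items())

if __name__ == "__main__":
    primes = [2, 3, 5, 7]
    print(psi(200, primes), psi_from_identity(200, primes))
```

Because $q_1 = 0$, only $d$ all of whose prime factors occur at least twice contribute, which matches the section's description of $d$ as carrying the prime powers.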

We first skip ahead to (6.8) and borrow a result:

$$\Psi'(x,y) \gg \frac{\log u}{\log x}\,\exp\bigl(h_0 + (1-\theta)\log x\bigr). \tag{4.3}$$

Lemma 8. Uniformly in $x$ and $y$ satisfying (4.2), and in $d$,
$1 \le d \le x$,

$$\Psi'(x/d,y) \ll (d^{\theta-1}\log x)\,\Psi'(x,y).$$

'" m m
Pltoo6. f'(x/d,y) = L ~ Prob( L Yi " log x - log d}. Now
m=O m. 1
m m
Prob (L Yi " log x/d) " Prob ( L Yi + Zi " log x - log d + m~ )
1 1

G(lla' m..
,,~ J(log x/d) + mv
e
(1) ( )
-a sf(s) m ds •
ll(y)m 0

Thus
1
'" ~ (l-a)(log x - log d + ~v)
f'(x/d,y)"
oL m.
f e (4.4)

= (x/d) I-a exp {v/2


e }
G(a).

But $e^{v/2} = 1 + O(y^{-5/6+\varepsilon_1})$ and $G(\theta) \asymp u$, so $e^{v/2}G(\theta) = G(\theta) + O(uy^{-4/5}) = G(\theta) + O(1)$, so that $\Psi'(x/d,y) \ll (x/d)^{1-\theta} e^{G(\theta)}$. But
$G(\theta) = h_0$, so from (4.3) we get $\Psi'(x/d,y) \ll d^{\theta-1}\,\Psi'(x,y)\log x$.
From Lemma 8, we have

$$\sum_{d \ge K} Q(d)\,\Psi'(x/d,y) \ll (\Psi'(x,y)\log x)\sum_{d \ge K} Q(d)\,d^{\theta-1}. \tag{4.5}$$

Lemma 9.

$$\sum_{d \ge K} Q(d)\,d^{\theta-1} \ll K^{\theta-1/2}\log y.$$

Remark. This lemma is of course useless if $\theta \ge 1/2$. That is why
we had to assume (4.2), which ensures $\theta < 1/2$, and a bit more:
$\theta < \dfrac12 - \dfrac{1.2}{\log y}$ for large $y$.

Proof of Lemma 9. We have

I Q(d)d s- 1 = TT (1 + I qJopj (S-l»). (4.6)


d=l p'y j=2

Let M = 1 + L.\' qjP j(S-l) , and let (J p )' p' j be independent


P
j=2 1 j(S-I)
random variables with I119.SS at j of M qjP • Then
P

(TT M) Prob( I J log p ) log K). (4.7)


p'y p p'y p

\'
L.
LM qJop j(S-I) 0j(s). Then
j=O p

Prob ( I J log p ) log K)


p'y P

N (s/log p) ds , fOO ey(s-log K) rr* N (s/log p) ds


p
o p'y p

where rr* denotes convolution, and y ) O.


This last integral is

With Y = 21 - S this last product is

< Tr (1 - .1 )-1 < log y.


P'Y P

From Lemma 9 and (4.5), we get

$$\sum_{d \ge K} Q(d)\,\Psi'(x/d,y) \ll \log x\,\log y\; K^{\theta-1/2}\,\Psi'(x,y), \tag{4.8}$$

uniformly in $d$, $x$, and $(x,y)$ satisfying (4.2).

5. Exclusion of atypical $\Omega(k)$.

Here we show that in $S(x,y)$, $\Omega(k)$ is close to $h_0$ most of the
time. For small $u$, we could simply refer to Alladi's Turán-Kubilius
inequality, but its range does not extend to $u$ as large as those
included in (4.2), which we assume.

L~ 10. (a) Fo~ 1 ( B( 21 -


lu/10g u.

l: 1 < (e -B2/3 10g2x)~(x.y)


k E S(x.1.)
n(k) ( hO - B/u/10g u

(b) For B;> 1.

-B2/12 - 41 (21 - -
e)B/u/10g u
l: 1 <( e + e )x
k E S(x.y)
n(k) ;> hO + B/U/10g u

P~oo6 (a). The sum on the left is equal to

l: Q(d) l: H(x/d.y)
d m ( h O-B/U/10g u - n(d) m

( L Q(d) H (x/d.y)
m
d

Let M .. rhO - B/U"/log u ], and r r(M). so I(r)/C(r) = log x/M •


t - 1~~\'
=

Then since e< and in view of (3.12), r < t- 1~;\ < t.


Now for m ( M,
(5.1)
Cm log x d (1) m Cm 1
A ( / ) A

H (x/d,y)
m
= -,
m.
f
0
e -r Sd(Prob(l: Y
1 i
= s») ( -, (x/d) -r.
m.

Since G = G(1 + O(y-3/2») from Sec.3 and the definition of v.


and since m ( u(1 + 0(1» here, and u < Iy ,

Thus m
H (x/d,y)
m
< f-
m!
(x/d)l-r

and

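$$\sum_d Q(d)\sum_{m<M} H_m(x/d,y) \ll x^{1-r}\sum_d Q(d)\,d^{r-1}\sum_{m<M} G^m/m!. \tag{5.2}$$

Since $M < h_0$, $r > \theta$ so $G > M$. Thus

$$\sum_{m<M} G^m/m! < M\,G^M/M!. \tag{5.3}$$

Now

$$\sum_d Q(d)\,d^{r-1} < \prod_{p \le y}\Bigl(1 + \sum_{j=2}^{\infty} q_j\,p^{-j/2}\Bigr) \le \exp\bigl(\tfrac12\log\log y + O(1)\bigr).$$

Hence

$$\sum_d Q(d)\,d^{r-1} \ll \sqrt{\log y}. \tag{5.4}$$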
1 -
Now recalling that M = M(B), we have for B < 2/u/log u ,

1-r M
Mx G 1M! < x/u- exp(Mlog G - Mlog M + M - rlog x). (5.5)

The quantity exponentiated in (5.5) simplifies to

1 2 -1 2
hO - Slog x - 2(1 + o(I»)(ho - M) u log u.

Thus

- (ho-Slog x) 1 2 -1 2
L < x/u e exp(- t<hO-M) u log u).
(5.6)
k E S(x,y)
n(k)<M

x
From Sec.4 we have '¥'(x,y) > log x exp (h O - Slog x), and (a)
follows.

1 Bru
PJtoo6 (b). Let K = exp(t; log)' The quantity on the left of (b)
is

\ '¥ (x,y) =
l - n
n>hO+B/u/log u (5.7)

9-1/2
L Q(d) L Hn_n(d)(x/d,y) + 0 (K log y log x '¥(x,y»),
d<K n>hO+Bru/log u

from (4.8).
For d (K, Q(d) ( log K/log 2, and for n ) hO + BI;/log u ,
1 -
n - Q(d) ) hO + 2Blu/log u. Now consider

l: Q(d)
d(K

This is larger than the double sum on the right of (5.7). For each
1 -
n ) hO + 2Blu/log u, the corresponding r = r(n) is less than S. As
in the proof of (a),

-3/2
Hn(x/d,y) < d
r-1 1
exp(2 ny )x exp(G(r»). (5.8)

But G(r) is concave and decreasing in h for h > hoe From


1 -
(3.15), with M now denoting rhO + 2Blu/log ul,

1 2
G(r(M» ( G(8) - 12 B •

Further,
dG/dh ( 1 B log u
- "3 lu

for h ) M. Thus

+ 1 -3/2) < _ (-B2/12)+h o-Slog x


exp(G(n)?y lu e (5.9)

and so

l: Q(d)d S- l
d

(5.10)

2 -B 2 /12
< log x 'I'(x,y)e •

6. Application of the Berry-Esseen theorem.

We now confine our attention to $n \in [h_0 - u/\log^2 u,\ h_0 + u/\log^2 u]$, and $1 \le d \le \exp\bigl(\tfrac12\sqrt{u}\,\log y/\log u\bigr)$, and estimate $H_n(x/d,y)$.
Under these circumstances, we have

Lemma 11.

$$\frac{H_n(x/d,y)}{H_n(x,y)} = d^{\,r-1}\Bigl(1 + O\Bigl(\frac{(1+\log d)\log u}{\sqrt{u}\,\log y}\Bigr)\Bigr).$$

(Here $d$ need not be an integer.)

P~oo6. Let fn(s) = n~y) e(r-l)sA(s), where r = r(n) and


G = G(r). Let Xl' XZ••• be independent random variables with
density fn(s). Then

(6.1)

(6.Z)

(6.3)

-3
From Sec.3, 80 <1 , so by the Berry-Esseen theorem, for any
a < b,

Prob ( t
n
Xi E [log x - b, log x - a]) =
b
H oIn } - Ho~n} + O(/n).
1

We take b = u-l/40/~ and a = -1. Then

so

[log x - b, log x + 1]) = __1__ u- 1/ 4 + O(u- 1 / Z).


IZn

1 r
Now from Sec.Z and 3, fn(S) is ) 3G(~) throughout

(a, log y) with the possible exception of a set of measure


o(log y/log u). Thus there is a "rectangular block", of width
(log y - a) and mass asymptotically equal to 1/3, and solid except
for a possible missing mass of 0(1). Under these circumstances, we
may apply the results of Sec.6 of [4] to fn(s) * fn(s), which has a

"block" with no exceptions. We conclude that


written as Ql(s) + Q2(s), such that

and
2
log u
2
u10g y

Thus from (6.2), we get

(6.4)
log u (1 + O(loglog u»).
l21ru log y log u

Since the Qi(s) depend on r, and since Ql(log x) will appear several
times, we introduce the notation Q(r) = Ql(log x).

Now consider

for -1 < c < b. From (6.3) and (6.4),

2
Ql(s) '" Q(r) + O( (10~ X-S~log u) (6.5)
log Y

uniformly in the range of nand r under consideration and in s,


log x - b ( S ( log x + 1. Thus

log x-c log x-c


J e(l-r)s f(n)(s) ds Q(r) J e(l-r)s ds (6.6)
log x-b n log x-b

+ O(.98)n e (l-r)(10 g X-C») + o(~I-re(r-l)c 10g2 u (1 + lei»),


u10g 2 y

__ ""_
O(r)x
.... 1- r e (1)
_ ..._~_ ru
r- c {I + 0 (U 1og Y e (1)(b)
r- -c) +
l-r log u

o(/U log y (.98)n) + o( log u (1 + Icl»)).


log u lu log y

1 / l/~ log y
If we now restrict c to - - " c ...
2 2 log u ' these error
terms reduce to o( (1 +c)1og u). Thus uniformly in that range of c,
lu log y
and in In - hoi" u/10g2 u,
(6.7)
log x-c (1 ) () _ Q(r)x 1- r e(r-l)c (1)
J e -r s f n (s) ds _ -- -
n l-r
{I + O( +c log u)).
lu log y
log x-b

10g2 u
In particular, with n = nO' r = 9 + 0 ( 1 ) from (3.11) so
og x
with c = 1,

log x-I
J e(l-r)s f(n)(s) ds
n
> log u
lu10g y x
l-r
log x-b

But

~n n n n
Hn(x,y) = ~ n.I Prob(I
1
Yi " log x) > ~ Prob(I Yi + Zi " log x-I)
n. 1

Gf_,n, log x-I (1 ) ()


>~ J e -r s f n (s) ds
n! log x-b n
n n l-r
> ~ x log u
nn u log y ,

G(r)nenx-r
by Stirling's formula. Now n = exp(G(n», and from
n 10g2 u
(3.15), since n = hO + 0(1), G(n) = G(h O) + o( u ) . Thus
Hn 0(x,y) > dog yu exp «h»
u10g GO'
But G(h O) = x-9 e ho , and u10g y =

log x, so

$$H_{n_0}(x,y) \gg x^{1-\theta}\,e^{h_0}\log u/\log x. \tag{6.8}$$

Since $\Psi'(x,y) \ge H_{n_0}(x,y)$ this proves (4.3).
We now return to a consideration of general nand c. Clearly

log x-b
J e(l-r)s f(n)(S) ds " x (l-r) e (r-l)b (6.9)
o n

< Q(r)x1-re(r-l)c(I+c)log u)
lu log y ,

so that in (6.7) the lower limit of integration could just as well


n n
be zero. Now I Yi =I (Y i + Zi) + O(uv), and a change in c of
(1-1)c 1 log u
O(uv) changes e by a factor of (easily) 1 + o( I 1 Thus
vu og y ).

n = G(rt Q(r)x 1- r e(r-1)c (1 + O(O+c)log u»)


Prob(I Yi " log x - c) ....~/~~..::.
1 11 ( y) n l-r v u log y ,

(6.10)
and so

H (xe-C,y) = ~ x 1- r e(r-l)c Q(r) (1 + O«I+c)log u»)


n n! l-r lu log y ,

(6.11)

Now with $c = \log d$, we get Lemma 11.

7. $\Psi(cx,y)$.

Now we narrow the range of c a bit, and assume

exp(- ~ log y/log u) " c " 1,


(7.1)
1/3 1 r-
(log y) " u " ~y/log y.

Given (7.1), we have uniformly in that range of $x$, $y$, and $c$,

Theorem 1.

'!'(cx,y)

Remark. This improves on both the range and accuracy of (11.5) of
[4] (which had a slightly different definition of $\theta$), where the
error factor was $1 + O(u^{-1/7})$. It is also stronger in its range of
validity than [5], which had $1 + O(u^{-1/10})$ over a wider range,
extending essentially to $u = y$. The present approach, dependent as
it is on the weighted sum $\Psi'(x,y)$, presents stubborn difficulties
when $u \ge y^{1/2}$, as then the proportion of square-free numbers in
$S(x,y)$ tends to zero. This makes it hard to recover $\Psi(x,y)$ from
$\Psi'(x,y)$.
To prove Theorem 1 we first exclude atypical cases. From Lemma
10 of Sec.5, we have

Now let K = min {e u/1og3u , e i~log y/31og U} • Weave


h

~(cx,y) = L L Q(d)H _n(d)(cx/d,y) (7.3)


In-h OI<'1:':'2'::"1
u d.;K n
og u

1-£
+ 0 (~(x,y)exp(-u ») + O( L Q(d)~'(x/d,y»).
£
d)K

But
L Q(d) L Hn-n(d) (cx/d,y) (7.4)
d.;K In-hoi .; u/1og2u

L Q(d) L H (cx/d,y)
n
d.;K In-hoi.; u1og2u

+ O( L
d.;K

1-£
This last error term is <£ ~(x,y) exp(-u ), from Lemma 10. Thus

~(cx,y) L Q(d) L H (cx/d,y) (7.5)


d.;K In-hoi .; u/1og2u n
1-£
+ O( L Q(d)~'(x/d,y») + 0 (~(x,y)exp(-u »).
d)K £

The error terms simplify to

1 3
~(x,y) x O{log x log y exp( -(2' - a)u/1og u) +

1 - 1-£
log x log y exp(-(2' - a)iu log y/1og u) + exp(-u )

and finally to

0(~(x,y)e-~/310g2 U).

That is,

Recall 6h = h - hO' or here, n - hO• For 16hl ( u/10g 2u ,


dr 10g 2u
-- • - from (3.11). Thus in this range,
dh log x

2
r = S + 0(6h log u/10g x). (7.7)

From Lemma 11 then, uniformly over the range of (7.1),

H (cx/d,y)
n = (c/d)l-S{l + o(log u(1+10g (d/c»)
H (x,y) lu log y
n
(7.8)
2
+ O( 6h10g u(1+10g (d/c»)} •
log x

Now the error term of (7.8) that involves 6h is smallest


precisely when Hn(x,y) is largest. So it will pay to consider
carefully how Hn(x,y) varies with n. From (6.11), we have

n
Hn(x,y) = ill.!:.2..::. x 1- r .Q.W.. { 1 + o( log u )}. (7.9)
n! l-r lu log y
1 -1/4
Now Q(r) = (1 + O(u »), and from (3.16), this is
/21Tn a(r)
constant In In - hoi ( u/10g 2u to within a factor of 1 + 0(u- 1/4 ).
Thus

(7.10)

Remark. (Foreshadowing the Erdős-Kac type result of the next
section. This is the corresponding result for $\Psi'(x,y)$.)

If we now sum the error due to the second "0" of (7.8) in


estimating L L Q(d)H (cx/d,y), it comes to, say,
d .. K In-hoi .. u/log 2u n
Error2' with
I-a I-a 3
Error2 < c x log u eho x (7.11 )
loglx

To estimate the inner sum, we go to a lemma.

L~ 12
(a) Ld Q(d)da-I(I + log (d/c») < (log y)3/2(1 + log (1/c»

(b) 16 6U4the~ u .. y1/3, then a .. 215 and the ~um 06 (a) i6


< (I + log (1/c».

P~006 (a) The sum to be estimated is

< (1 + log (1/c») ddr( TT (I + ~ q pj(r-I»))I


r
= •
a
p .. y j ~2 j

This derivative is

< TT (I + t p2( a- 1) ) L p
2(a-l) I
og p.
p .. y p .. y

Since a < 1/2, this is < log 3/2 y. And if u .. Y


1/3 , then
a .. log u/log y .. 2/5 so the derivative in question is 0(1). Thus

I-a I-a hO ·3
Error2 <
Ac x
2
e log u L 2 11Ihie -1/2 V(lIh)2 ,
log x In-h o I"u/log u

where

1 + 10g(1/c)
A =
3/2
(1 + 10g(1/c»)log y

The sum above is ~ I;/log u from (3.15). Thus the quantity in


(7.11) is

I-a I-a hO - 2
< Ac x e lulog u
2
log x

This, however, is small compared to ~'(x,y). If fact,

I-a hO
~'(x,y) > I~ x e log u
log u log x

from (6.8). So the ratio of Error 2 to ~'(x,y) is <


Ac
I-a log u/lu
- log y. In either case (u (y
>. 1/3
), this is

I-a 3u(l
< clog + 10g(1/c»/lu log y.
-

Therefore this error term is within the error allowed for in


Theorem 1.

We now consider the other error term in (7.8). Summed over d


in (7.5), for any fixed eligible n, this comes to, say, Error 1 ,

c 1- S (1 + 10g(1/c»)log u
Errorl < { lu log y L Q(d)d a- 1 } H (x,y). (7.12)
d n

But L Q(d)d a- 1 < 1


d
as in the proof of Lemma 12. Thus (7.12) simplifies to

Error l < (1 + 10g(l/c»)log


3/2
u c
I-a /(/u- log y). (7.13)

This is smaller than the other error, which proves Theorem 1.

8. The distribution of $\Omega(k)$ in $S(x,y)$.

TheQrem 2. Un~6o~mtif ~n (log y)I/3 ~ u ~ I/Y/IOg y, a6 Y+ 00 ,



1/8
(a) L ~n(x,y) < ~(x,y)exp(-u )
In-h OI>u4/7
4/7
(u ,

1 2
- ZV(n-h O) -1/4
~ (x,y) =e ~ (x,y)(l + O(u »).
n nO

Pltoo6. Part (a) Is simple. In Lemma 10 put B = u 1 / 14 log u. For


part (b), we put K = exp(u s / 12 ), and have

~ (x,y) = L Q(d)H _Q(d)(x/d,y) + o(exp(-u2/s»)~,(x.y) (8.1)


n d(K n

from (4.8).
Let ~ (x,y) = L Q(d)H _Q(d)(x/d,y). From (7.10),
n d(K n

~'(x,y) < l/~


og u
H (x,y).
nO
(8.2)

On the other hand, ~ (x,y) >H (x,y). Thus


nO nO

~no(x,y) > l~! u ~'(x,y). (8.3)

Thus to prove (b) of Theorem 2, in view of (8.1) and (8.3) we need


4/7
only show that for In - hoi (u ,
1 2
- ZV(n-h O) - -1/4
"W (x,y) e ~ (x,y)(l + O(u »). (8.4)
n nO

Now consider the component terms of ~ (x,y). For integer d,


n
<d ( K, from Lemma 11 we have

r(n-Q(d) )-1
Hn_Q(d)(x/d,y) = d Hn_Q(d)(x,y). (8.5)

From (3.11),

r(n - Q(d» = r(n) + O(lOg2u log d/log x).

For d ( K, log d ( Ilog x/log u, so



Thus
r(n)-l
Hn_n(d)(x/d,y) = d Hn_n(d)(X,y) x

x(l + O(log u log d) + O(lo g2 u lo g2 d»).


lu log y log x

Now we estimate Hn_n(d)(x,y)/Hn(x,y). We have ned) (


5/12 2
, and r(n-n(d» = r(n) + O( log u log d
r.--
flog x/log u and ( 2u
flog x). Now from (6.11), with r' = r(n-n(d» and r = r(n) for the
moment,
1-r n n
H (x,y) = _x_ G(r) e ill..!l (1 + O( log u») (8.6)
n 121Tn nn 1-r lu log y

while
1-r' , n-n(d) n-n(d) ,
( ) --;:~x==:::;:::;:;: G(r ) e ~ (1+0( log u »).
Hn-n(d) x,y - Ih(n-n(d»
: (n_n(d»n-n(d)
.. 1-r lu log y

Thus
Hn_n(d)(x,y) z ;; 1-r.2i.Ll. (1 + 0 log u )
H (x,y) In-n(d) 1-r' Q(r) (/u log y) (8.7)
n

x exp{G(n - n(d» - G(n)}.

The product of all but the last factor here is 1 + O(u- 1/ 4 ), from
(6.4) and (3.11). As for exp{G(n - ned»~ - G(n)},

since lfihl ( u 4/ 7 and IdG/dhl < lfihl log 2u/u from (3.15). Thus

IG(n - ned»~ - G(n)1 < n(d)u -3/7 log 2 u.

Together with (8.7), this gives for d ( K that

(8.8)

We now show

~ (x,y)
n
(1 + 0(u- l / 4 ») L Q(d)dr(n)-l Hn(x,y). (8.9)
d(K

Pltoo6. We have

Hn_n(d)(x/d,y) = H (x,y)dr(n)-l{l + 0(u- l / 4 ) + O(log u log d) +


.. n iu log y
2 2 2
O(log u log d) + O(log ~/~og d)}. (8.10)
log x u

If we sum the errors in (8.10) over d ( K, then, we get, aside


from the acceptable error due to the 0(u- l / 4 ),
2 2 2
Error < L Q(d)d r (n)-l{lo g u log d + log ~/~Og d}Hn(X,y). (8.11)
d(K log x u

As in Lemma 12,

I
d~l
Q(d)d r (n)-llog i d < 10gi u
d=l
L Q(d)dr(n)-l (8.12)

for i ·z 1 or 2. Thus the error bounded in (8.11) is

4 3
< (.!2&....!!.
log x + .!2&....!!.)
3/7 ~n (x ,y,
)
u

which gives (8.9). In view of (7.10), it remains only to show that

L Q(d)dr(n)-l ~ (1 + 0(u- l / 4 ») L Q(d)d S- l •


d d

(The sums will be more nearly equal if .truncated so we are just


taking the worst case).
More precisely, a simple induction argument shows that

L Q(d)d r - l / L Q(d)d S- l
d(K d(K

is monotone in K, for fixed rand e.


Now

L Q(d)d r - l = TT ( + I
j=2
q p (r-l)j
j
(8.13)
d-l p(y

so

{L Q(d)d r (n)-l/ L Q(d)d a- 1 } (8.14)


d=l d=l
(a-1)j ( (r-a)j
=1T( 1 + o{ L qjP P - l})
p(y j=2
-3/7 log 2u log 2( a-1)
-1T( + O(u I! I! ))
p(y log Y

2(a-1)1
Now we already observed that L P og P < log y, so this
p(y
equals
1 + O(u- 3/ 7 10g 2u) = 1 + O(u- 1/ 4 ).

To summarize, there is an $n_0 \approx \sum_{p \le y} p^{\theta-1}$ with $\theta$ determined
essentially by the condition $\sum_{p \le y} p^{\theta-1}\log p = \log x$. To a looser
approximation, $n_0 = u\bigl(1 + \dfrac{1}{\log u}\bigr)$.

In $S(x,y)$, the distribution of $\Omega(k)$ is roughly normal, with
mean $n_0$ and standard deviation $\approx 1/\sqrt{V}$, where $V$ is defined by (3.15),
so that the standard deviation is loosely $\sqrt{u}/\log u$. Out to a
distance of at least $u^{4/7}$ from $n_0$, that is, $\approx u^{1/14}\log u$ standard
deviations, the number of $k$ in $S(x,y)$ with $\Omega(k) = n$ is, to within
an error factor of $1 + O(u^{-1/4})$, given by

$$e^{-\frac12 V(n-h_0)^2}\,\Psi_{n_0}(x,y).$$
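The recipe in the summary can be followed numerically. The sketch below is an illustration under stated assumptions: it solves $\sum_{p \le y} p^{\theta-1}\log p = \log x$ for $\theta$ by bisection on $(0,1)$, then evaluates $\sum_{p \le y} p^{\theta-1}$ and the looser approximation $u(1 + 1/\log u)$. The function names, the bisection bracket and the sample values of $y$ and $u$ are choices made for this example, not taken from the text.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def weighted_log_sum(theta, logs):
    """sum over primes of p**(theta-1) * log p, given the list of log p."""
    return sum(math.exp((theta - 1.0) * l) * l for l in logs)

def solve_theta(log_x, y, tol=1e-12):
    """Bisection for theta in (0, 1) with weighted_log_sum(theta) = log x.

    Returns (theta, n0_estimate), where n0_estimate = sum of p**(theta-1).
    """
    logs = [math.log(p) for p in primes_up_to(y)]
    lo, hi = 0.0, 1.0
    if not (weighted_log_sum(lo, logs) < log_x < weighted_log_sum(hi, logs)):
        raise ValueError("log x is outside the range reachable with 0 < theta < 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weighted_log_sum(mid, logs) < log_x:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    n0_estimate = sum(math.exp((theta - 1.0) * l) for l in logs)
    return theta, n0_estimate

if __name__ == "__main__":
    y, u = 1000, 10.0  # illustrative values only
    theta, n0 = solve_theta(u * math.log(y), y)
    print(theta, n0, u * (1.0 + 1.0 / math.log(u)))
```

The bisection is valid because the left side of the defining condition is increasing in $\theta$; for such small $y$ the two estimates of $n_0$ are only expected to be of the same order.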

References.

1. K. Alladi, The Turán-Kubilius inequality for integers without
large prime factors, J. für die reine u. angew. Math. 335 (1982)
180-196.

2. K. Alladi, An Erdős-Kac theorem for integers without large
prime factors, Acta Arith. (to appear).

3. P. D. T. A. Elliott, Probabilistic Number Theory I, Grundlehren
der mathematischen Wissenschaften 239, Springer Verlag, NY 1979
(p. 74).

4. D. Hensley, A property of the counting function of integers
with no large prime factors, J. of Number Th. 22 (1986), 46-74.

5. A. Hildebrand, On the local behavior of $\Psi(x,y)$, Trans. Am.
Math. Soc. (1986), to appear.

6. K. Prachar, Primzahlverteilung, Grundlehren der mathematischen
Wissenschaften 41, Berlin 1957.

7. A. Selberg, On the normal density of primes in small intervals
and the difference between consecutive primes. Arch. Math.
Naturvid. 47, No. 6 (1943) 87-105.

D. Hensley
Texas A&M University,
College Station,
Texas 77843, U.S.A.
ON THE SIZE OF $\sum_{n \le x} d(n)e(nx)$

Takeshi Kano

1. In his famous Habilitationsschrift of 1854 on trigonometric


series and integration theory, Riemann gave the following
interesting example which shows his high ingenuity of analysis and
arithmetic as well.

Let us define first

$$D(x) = \begin{cases} x - [x] - \tfrac12, & x \notin \mathbf{Z}, \\ 0, & x \in \mathbf{Z}, \end{cases}$$

and

Now we consider the two series

$$\sum_{n=1}^{\infty} D(nx + \tfrac12)/n, \tag{1}$$

and

$$\frac{1}{\pi}\sum_{n=1}^{\infty} \frac{C_n}{n}\sin(2\pi nx). \tag{2}$$

Then Riemann states that the function which is defined by (1)
for all rational values of x can be expressed by the trigonometric
series (2), and it is unbounded in every fixed interval, hence it
follows that it is by no means integrable in his sense. This was
finally established by Chowla and Walfisz [3], and later Wintner
[12] made additional remarks. We combine their results in the
following

Theorem 1. Both of (1) and (2) converge to the same value for
almost all x including all algebraic numbers, while they diverge on
a dense set of transcendental numbers. The function thus defined by
(1) and (2) belongs to $L^p$ for any $p > 0$, but it is discontinuous
almost everywhere.

Here we mention that such exceptional set of transcendental
numbers x is defined by certain relations between convergents of the
continued fraction expansion of x.
Now one sees that (1) and (2) are linked with

$$\sum_{n=1}^{\infty} D(nx)/n \tag{3}$$

and

$$-\frac{1}{\pi}\sum_{n=1}^{\infty} \frac{d(n)}{n}\sin(2\pi nx), \tag{4}$$

respectively. Formally, (1) = (2) if and only if (3) = (4), and we
have the same assertion for (3) = (4) as Theorem 1.
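The formal link between (3) and (4) comes from the Fourier series of the sawtooth, $D(x) = -\frac{1}{\pi}\sum_{n \ge 1} \sin(2\pi nx)/n$, applied to each $D(nx)$ and rearranged. The short sketch below only checks that one-variable expansion numerically at a sample point; the function names and the number of terms are assumptions of this example.

```python
import math

def D(x):
    """Sawtooth: x - [x] - 1/2 off the integers, 0 at the integers."""
    if x == math.floor(x):
        return 0.0
    return x - math.floor(x) - 0.5

def sawtooth_fourier(x, terms):
    """Partial sum -(1/pi) * sum_{n <= terms} sin(2*pi*n*x)/n."""
    total = sum(math.sin(2.0 * math.pi * n * x) / n for n in range(1, terms + 1))
    return -total / math.pi

if __name__ == "__main__":
    x = 0.3  # illustrative sample point
    print(D(x), sawtooth_fourier(x, 20000))
```

Summing the expansion of $D(nx)/n$ over $n$ and collecting the coefficient of $\sin(2\pi kx)/k$ gives $\sum_{n \mid k} 1 = d(k)$, which is where the divisor function in (4) comes from; the rearrangement itself is only formal.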

Also it is known [2] that the complex series

$$\sum_{n=1}^{\infty} \frac{d(n)}{n}\, e(nx) \tag{5}$$

converges for all algebraic irrational values of x, while it
diverges on a dense set of transcendental numbers.
Next we shall show

Theorem 2. The series

$$\sum_{n=1}^{\infty} \frac{d(n)}{n^{1/2+\varepsilon}}\, e(nx), \qquad \varepsilon > 0, \tag{6}$$

converges for almost all x including all algebraic irrational
numbers, while it diverges on a dense set of transcendental numbers.

Proof. The last statement follows (trivially) from the corresponding
fact in (5). The second part is obvious from

$$\sum_{n \le N} d(n)\, e(nx) = o(N^{1/2+\varepsilon}), \qquad \varepsilon > 0, \tag{7}$$

which holds for all algebraic irrational numbers. This can be
achieved if we employ Roth's theorem instead of Liouville's in the
proof of Hilfssatz 32 of Walfisz [10].
The first assertion can be proved trivially if we appeal to the
deep $L^2$-theorem of L. Carleson [cf. 1] because

$$\sum_{n=1}^{\infty} \Bigl(\frac{d(n)}{n^{1/2+\varepsilon}}\Bigr)^2 < \infty,$$

which shows that (7) holds for almost all x. It is still possible
to deduce the first assertion from the following estimate due to
Erdős [5]:

$$\sum_{n \le N} d(n)\, e(nx) = O(\sqrt{N}\log N), \qquad \text{for almost all } x. \tag{8}$$
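To get a feel for the sizes involved in (7) and (8), one can compute the partial sums directly. The sketch below is a numerical experiment, not a proof: it evaluates $\sum_{n \le N} d(n)e(nx)$ at a single sample irrational $x$ and prints it next to $\sqrt{N}\log N$ and the trivial bound $\sum_{n \le N} d(n)$. The function names and the choice $x = \sqrt2$ are assumptions of this example.

```python
import cmath
import math

def divisor_counts(N):
    """Return a list d with d[n] = number of divisors of n for 1 <= n <= N (d[0] unused)."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for m in range(k, N + 1, k):
            d[m] += 1
    return d

def exponential_sum(N, x):
    """sum_{n <= N} d(n) * e(n x), where e(t) = exp(2*pi*i*t)."""
    d = divisor_counts(N)
    return sum(d[n] * cmath.exp(2j * math.pi * n * x) for n in range(1, N + 1))

if __name__ == "__main__":
    N, x = 5000, math.sqrt(2.0)  # illustrative values only
    size = abs(exponential_sum(N, x))
    print(size, math.sqrt(N) * math.log(N), sum(divisor_counts(N)[1:]))
```

A single value of $x$ says nothing about "almost all $x$", so the output is only meant to show the scale of cancellation compared with the trivial bound.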

We remark, at first, that Theorem 2 seems sharp in the sense
that it will likely be impossible to make $\varepsilon = 0$ in (6). As a matter
of fact, Walfisz [11] made a conjecture that

L d(n) e(nx) (9)


n(N

would hold for all irrational values of x. Obviously (9) implies
that

$$\sum_{n=2}^{\infty} \frac{d(n)}{\sqrt{n}}\, e(nx)$$

diverges for all irrational x.
Next we shall show that (6) is not summable by Abel's method.
In fact we can prove

Theorem 3. The series

$$\sum_{n=2}^{\infty} \frac{d(n)}{n\log n}\, e(nx) \tag{10}$$

is not summable by Abel's method for any x in a dense set of
transcendental numbers.

For the proof we apply the following known Tauberian theorem of
mean type.

Theorem 4. If the series

$$\sum_{n=1}^{\infty} c_n \tag{11}$$

is summable to s by Abel's method and satisfies the condition

$$\sum_{n \le N} n\,c_n = o(N),$$

then (11) is necessarily convergent to s.

Proof of Theorem 3. Chowla [2: Theorem 5] proved that

$$\sum_{n \le N} d(n)\, e(nx) = o(N\log N)$$

holds for all irrational x, which implies

$$\sum_{n=2}^{N} n\,\frac{d(n)\, e(nx)}{n\log n} = o(N).$$

Thus if (10) be Abel summable, then Theorem 4 shows that (10) is
necessarily convergent. But this is not always the case since
Chowla [2: Theorem 7] proved that

$$\sum_{n=2}^{\infty} \frac{d(n)}{n\log n}\cos(2\pi nx)$$

diverges on a dense set of transcendental x.

In view of this theorem and the following lemma, it is clear
that (6) is also non-summable by Abel's method on a dense set of
transcendental x.

~ . 16 (11) .w Abel .()ummable, :then 6M any mono:ton-icaUy

.w af..() 0 Abel .() ummable •

Apply partial summation to $\sum_{n=1}^{N} d_n c_n x^n$.
n=1 n n

Now we shall return to Theorem 2. Walfisz [11] showed that for


almost all x,

L d(n) e(nx) - n( IN log N (loglog N)3/2), (12)


n"N

which implies the following

Theorem 5. The ~eJUe-6

L d(n) e(nx)
n=3 In log n (loglog n)3/2

Thus, in view of Theorems 2 and 5, we may naturally ask the
following question:

Does $\displaystyle\sum_{n=2}^{\infty} \frac{d(n)}{\sqrt{n}\,\log n}\, e(nx)$ converge almost everywhere?

If the answer is "Yes", then we replace the $O$ in (8) by $o$, and if
the answer is "No", then we improve (12) up to

$$\sum_{n \le N} d(n)\, e(nx) = \Omega(\sqrt{N}\log N) \tag{13}$$

for almost all x, which shows that (8) is a correct estimate.

A. Oppenheim [8] pointed out that by the method of Hardy and
Littlewood he could show for all irrational x

$$\sum_{n \le N} r(n)\, e(nx) = \Omega(\sqrt{N}),$$

where as usual r(n) stands for the number of representations of n as
the sum of two integral squares. Also we remark that Erdős [5]
observes that for almost all x,

$$\sum_{n \le N} r(n)\, e(nx) = O(\sqrt{N}\log N).$$

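2. In this section we shall consider a certain generalization of
the equation (3) = (4). If we put

$$A_n = \sum_{d \mid n} a_d,$$

then we have the formal identity

$$\sum_{n=1}^{\infty} \frac{a_n}{n}\, D(nx) = -\frac{1}{\pi}\sum_{n=1}^{\infty} \frac{A_n}{n}\sin(2\pi nx), \tag{14}$$

which is shown to be true for all real x, by Davenport [4], for
special $a_n$ such that $a_n = \mu(n)$, $\lambda(n)$ (Liouville), $\Lambda(n)$ (von
Mangoldt). Actually he proved that for all irrational x

$$\sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, D(nx) = -\frac{1}{\pi}\sin(2\pi x), \tag{15}$$

$$\sum_{n=1}^{\infty} \frac{\lambda(n)}{n}\, D(nx) = -\frac{1}{\pi}\sum_{n=1}^{\infty} \frac{\sin(2\pi n^2 x)}{n^2}, \tag{16}$$

$$\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n}\, D(nx) = -\frac{1}{\pi}\sum_{n=1}^{\infty} \frac{\log n}{n}\sin(2\pi nx). \tag{17}$$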

His method of proof depends on the deep estimate such as

$$\sum_{n \le N} \mu(n)\, e(nx) = O\bigl(N(\log N)^{-K}\bigr), \qquad K > 1,$$

by virtue of Vinogradov's method. Segal [9] reinvestigated the
identity (14) through a different approach by using complex
analysis. He obtained

Theorem 6. If the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ converges absolutely
and uniformly for $\mathrm{Re}\, s > 1 + \varepsilon$ $(\varepsilon > 0)$, then (14) holds in the
sense that: for given x either both sides converge to the same value,
or both diverge.

However, unfortunately, this theorem does not tell us for what values
of x do both sides converge or diverge. In spite of this fact, we
can somewhat simplify the proof of Theorem 1 by virtue of it.

It will be worth observing that the series on the r.h.s. of


(16) is actually the one that Riemann is reputed to have given in
his lecture as an example of an "almost" everywhere non-differentiable
continuous function. Later Hardy [7] proved that it is non-differentiable
for all irrational values of x. On the other hand,
Gerver [6] found that it is in fact differentiable only at particular
rational points. Now we shall show

Theorem 6. The function defined by (17) is discontinuous only at integral points, and can be differentiated at non-integral points.

Proof. This is immediate from the following closed expression for the r.h.s. of (17), which is valid for 0 < x < 1:

$$-\frac{1}{\pi} \sum_{n=1}^{\infty} \frac{\log n}{n}\, \sin(2\pi n x) = -\Big\{ \log\Gamma(x) + \frac{1}{2}\log(\sin \pi x) + (\gamma + \log 2\pi)\,x \Big\} + \Big( \log \sqrt{2}\,\pi + \frac{\gamma}{2} \Big), \qquad (18)$$

where γ is Euler's constant. (18) is a consequence of the Fourier series expansion of log Γ(x), which was obtained by Kummer.
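The closed expression (18) can be sanity-checked numerically. The sketch below is only an illustration: it assumes the reading of (18) in which the left side is $-\frac{1}{\pi}\sum (\log n / n)\sin(2\pi n x)$ and the constant term is $\log(\sqrt{2}\,\pi) + \gamma/2$, and the names `series_side` and `closed_form` are hypothetical. The series converges slowly, so the comparison is made only to about three decimal places.

```python
from math import lgamma, log, sin, pi, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def series_side(x, terms=100000):
    """Partial sum of -(1/pi) * sum_{n>=1} (log n / n) * sin(2 pi n x)."""
    total = 0.0
    for n in range(1, terms + 1):
        total += log(n) / n * sin(2 * pi * n * x)
    return -total / pi

def closed_form(x):
    """Right side of (18) as read here (an assumption), valid for 0 < x < 1."""
    braces = lgamma(x) + 0.5 * log(sin(pi * x)) + (EULER_GAMMA + log(2 * pi)) * x
    return -braces + (log(sqrt(2) * pi) + EULER_GAMMA / 2)

if __name__ == "__main__":
    for x in (0.25, 0.5, 0.8):
        print(x, series_side(x), closed_form(x))
```

Both columns should agree to roughly three decimals; closer agreement needs more terms, since the tail of the series is only of order (log N)/N.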

References.

[1] L. Carleson, On convergence and growth of partial sums of Fourier series, Acta Math. 116 (1966), 135-157.

[2] S. Chowla, Some problems of diophantine approximation (I), Math. Z. 33 (1931), 544-563.

[3] S. Chowla and A. Walfisz, Ueber eine Riemannsche Identität, Acta Arith. 1 (1936), 87-112.

[4] H. Davenport, On some infinite series involving arithmetical functions, Quart. J. Math. (2) 8 (1937), 8-13.

[5] P. Erdős, J. Indian Math. Soc. 12 (1948), 67-74.

[6] J. Gerver, The differentiability of the Riemann function at certain rational multiples of π, Amer. J. Math. 92 (1970), 33-55.

[7] G. H. Hardy, Weierstrass's non-differentiable function, Trans. Amer. Math. Soc. 17 (1916), 301-325.

[8] A. Oppenheim, The approximate functional equation for the multiple theta-function and the trigonometric sums associated therewith, Proc. London Math. Soc. 28 (1928), 476-483.

[9] S. L. Segal, On an identity between infinite series of arithmetic functions, Acta Arith. 28 (1976), 345-348.

[10] A. Walfisz, Ueber einige trigonometrische Summen, Math. Z. 33 (1931), 564-601.

[11] A. Walfisz, Ueber einige trigonometrische Summen II, Math. Z. 35 (1932), 774-788.

[12] A. Wintner, On a trigonometrical series of Riemann, Amer. J. Math. 59 (1937), 629-634.

T. Kano
Okayama University,
Okayama, Japan.
ANOTHER NOTE ON BAKER'S THEOREM

D. W. Masser and G. Wüstholz

1. Introduction.
Recently G. Wüstholz [5], [6] proved a theorem in transcendence
which includes and greatly extends many classical results. In
particular it generalizes Baker's famous theorem [2] on linear forms
in logarithms, and places it within the context of arbitrary
commutative group varieties.

Now although Wüstholz's Theorem has a rather general setting,
the main innovations in his proof are primarily analytic and not
related specifically to the theory of group varieties. So they may
be well illustrated with particular examples. When the underlying
group variety is a product of multiplicative groups, the result
reduces simply to Baker's Theorem. Thus the aim of the present
article is to give a proof of Baker's Theorem using the methods of
Wüstholz, but without reference to group varieties. Our exposition
follows to a large part a course of lectures given by Wüstholz
himself at Ann Arbor in May 1984; as noted there, many of the
technical complications of [5] and [6] disappear altogether.

We shall prove the following version of Baker's Theorem.

Theorem. For n ≥ 2 let $\beta_1,\dots,\beta_{n-1}$ be algebraic numbers with $1,\beta_1,\dots,\beta_{n-1}$ linearly independent over the rational field Q, and let $\alpha_1,\dots,\alpha_{n-1}$ be non-zero algebraic numbers with logarithms $\ell_1,\dots,\ell_{n-1}$ not all zero. Then the number

$$\alpha_1^{\beta_1} \cdots \alpha_{n-1}^{\beta_{n-1}}$$

is transcendental.

As usual in transcendence, the proof proceeds by contradiction.
If $\alpha_1^{\beta_1} \cdots \alpha_{n-1}^{\beta_{n-1}}$ is algebraic, we construct from Siegel's Lemma an
auxiliary function $\varphi(z_1,\dots,z_{n-1})$, analytic in $z_1,\dots,z_{n-1}$, which
has many zeroes. We then use the Schwarz Lemma to deduce that
$\varphi(z_1,\dots,z_{n-1})$ has many more zeroes. Up to here Wüstholz's proof
follows exactly the classical lines, so we omit the details (see for
example [2]). The conclusion is as follows.

Lemma. For any C ≥ 1 the following holds for all sufficiently large integers D. There exists a non-zero polynomial P in $Z[x_1,\dots,x_n]$, of total degree at most D, such that the function

$$\varphi(z_1,\dots,z_{n-1}) = P\big(e^{z_1},\dots,e^{z_{n-1}},\, e^{\beta_1 z_1 + \dots + \beta_{n-1} z_{n-1}}\big)$$

satisfies

$$(\partial/\partial z_1)^{\tau_1} \cdots (\partial/\partial z_{n-1})^{\tau_{n-1}}\, \varphi(s\ell_1,\dots,s\ell_{n-1}) = 0$$

for all non-negative integers $\tau_1,\dots,\tau_{n-1}$, s with

$$\tau_1 + \dots + \tau_{n-1} \le D^{1 + 1/(2n-2)}, \qquad s \le C D^{1/2}.$$

The last step is to prove that $\varphi(z_1,\dots,z_{n-1})$ has too many
zeroes. For example in [2] this is done by means of generalized
Vandermonde determinants, and Kummer theory is used in some of the
later quantitative refinements. Wüstholz proceeds by proving a zero
estimate that is essentially algebraic in nature. To emphasize this
we formulate it over the polynomial ring

$$R = K[x_1,\dots,x_n],$$

where K is any algebraically closed field of zero characteristic.
We identify Q with the prime field of K, and we write $K^\times$ for the set
of non-zero elements of K. For elements $\beta_1,\dots,\beta_{n-1}$ of K we
introduce the fundamental operators

$$\Delta_i = x_i \frac{\partial}{\partial x_i} + \beta_i x_n \frac{\partial}{\partial x_n} \qquad (1 \le i \le n-1)$$

acting on R. It is easy to verify that these are commuting
derivations on R (and a better reason for this will be given
shortly). We then have

Proposition (Wüstholz). Suppose $1,\beta_1,\dots,\beta_{n-1}$ are linearly independent over Q. For an integer D ≥ 1 and real S ≥ 1, T ≥ 1 suppose P is a polynomial in R of total degree at most D and $(\xi_1^{(s)},\dots,\xi_n^{(s)})$ $(0 \le s \le S)$ are distinct points of $(K^\times)^n$ such that

$$\Delta_1^{\tau_1} \cdots \Delta_{n-1}^{\tau_{n-1}} P(\xi_1^{(s)},\dots,\xi_n^{(s)}) = 0$$

for all non-negative integers $\tau_1,\dots,\tau_{n-1}$, s with

$$\tau_1 + \dots + \tau_{n-1} \le T, \qquad s \le S.$$

Then if
n
2n Dn ,

the polynomial P is identically zero.
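The claim that the fundamental operators are commuting derivations can be checked mechanically on examples. The sketch below is illustrative only: it assumes the operators have the form $\Delta_i = x_i\,\partial/\partial x_i + \beta_i x_n\,\partial/\partial x_n$ (a reading inferred from the exponential substitution used later, not quoted from the source), represents a polynomial as a dictionary from exponent tuples to coefficients, and uses hypothetical helper names `delta`, `add` and `multiply`. The sample values of β are placeholders; linear independence over Q plays no role in this check.

```python
from fractions import Fraction

def delta(i, beta, poly):
    """Apply Delta_i = x_i d/dx_i + beta[i] * x_n d/dx_n (assumed form).

    poly maps exponent tuples (e_1, ..., e_n) to coefficients; i is 0-based.
    Each monomial is an eigenvector with eigenvalue e_i + beta[i] * e_n.
    """
    out = {}
    for exps, coeff in poly.items():
        value = coeff * (exps[i] + beta[i] * exps[-1])
        if value != 0:
            out[exps] = value
    return out

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for exps, coeff in q.items():
        out[exps] = out.get(exps, 0) + coeff
    return {e: c for e, c in out.items() if c != 0}

def multiply(p, q):
    """Product of two polynomials, dropping zero coefficients."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

# Example with n = 3 variables and placeholder values beta_1, beta_2.
BETA = [Fraction(1, 2), Fraction(2, 3)]
P = {(2, 0, 1): 3, (0, 1, 2): -1, (1, 1, 0): 5}
Q = {(1, 0, 0): 1, (0, 0, 3): 2}

if __name__ == "__main__":
    print(delta(0, BETA, delta(1, BETA, P)) == delta(1, BETA, delta(0, BETA, P)))
```

Because every monomial is an eigenvector of each operator, with an eigenvalue that is additive in the exponents, both commutativity and the Leibniz rule follow at once; the code merely confirms this on sample data.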

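The rest of this article is devoted to a proof of the
Proposition. We see here how it supplies the required contradiction
to the Lemma.

For this we note first the basic relation

$$\frac{\partial}{\partial z_i} P\big(e^{z_1},\dots,e^{z_{n-1}},X\big) = (\Delta_i P)\big(e^{z_1},\dots,e^{z_{n-1}},X\big) \qquad (1 \le i \le n-1)$$

for $X = \exp(\beta_1 z_1 + \dots + \beta_{n-1} z_{n-1})$ and any polynomial P; this is the
real reason why $\Delta_1,\dots,\Delta_{n-1}$ are commuting derivations. By iteration
we obtain

$$(\partial/\partial z_1)^{\tau_1} \cdots (\partial/\partial z_{n-1})^{\tau_{n-1}} P\big(e^{z_1},\dots,e^{z_{n-1}},X\big) = (\Delta_1^{\tau_1} \cdots \Delta_{n-1}^{\tau_{n-1}} P)\big(e^{z_1},\dots,e^{z_{n-1}},X\big).$$

Hence the polynomial P of the Lemma satisfies the vanishing
conditions of the Proposition at the points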
$$\big(e^{s\ell_1},\dots,e^{s\ell_{n-1}},\, e^{s(\beta_1\ell_1 + \dots + \beta_{n-1}\ell_{n-1})}\big) \qquad (0 \le s \le S) \qquad (1)$$

with

$$T = D^{1 + 1/(2n-2)}, \qquad S = C D^{1/2}.$$

It therefore suffices to take

2n 4(n-1)
c n o ) n

in the Lemma to obtain a contradiction. Note that the distinctness
of the points (1) is an immediate consequence of the linear
independence of $1,\beta_1,\dots,\beta_{n-1}$ and the fact that $\ell_1,\dots,\ell_{n-1}$ are not
all zero.

2. Jacobians.
Let $\mathcal{P}$ be a prime ideal of R, and regard R as embedded in the
corresponding local ring $R_{\mathcal{P}}$. We shall be considering matrices M
with entries in $R_{\mathcal{P}}$, and we write $\operatorname{rank}_{\mathcal{P}} M$ for the rank of M taken
modulo $\mathcal{P}$.
Let $D_1,\dots,D_k$ be commuting derivations on R. For an ideal I of
R we define the Jacobian $J_D(I)$ of I with respect to the system
$D = (D_1,\dots,D_k)$ as follows. It is the infinite matrix with k
columns whose rows are indexed by elements of I; for P in I and an
integer j with 1 ≤ j ≤ k the entry corresponding to P and j is
$D_j P$. In practice no ambiguity will arise from not specifying the
order of the rows.
We consider first the system $\Delta = (\Delta_1,\dots,\Delta_{n-1})$ defined in
Section 1 for $1,\beta_1,\dots,\beta_{n-1}$ linearly independent over Q. We say
that a prime ideal $\mathcal{P}$ of R is general if $x_1 \cdots x_n$ is not in $\mathcal{P}$;
equivalently, if the variety of $\mathcal{P}$ in $K^n$ contains a point in $(K^\times)^n$.

Jacobian Lemma. Suppose 1 ≤ r ≤ n and $\mathcal{P}$ is a general prime ideal of R of rank r. Then

$$\operatorname{rank}_{\mathcal{P}} J_\Delta(\mathcal{P}) = \min(r, n-1).$$

Proof. If r = n then $\mathcal{P}$ contains $x_1 - \xi_1,\dots,x_n - \xi_n$ for $\xi_1,\dots,\xi_n$
in $K^\times$. The corresponding finite submatrix B of $J_\Delta(\mathcal{P})$ has a square
minor of order n-1 whose determinant is $x_1 \cdots x_{n-1}$; and since this is
not in $\mathcal{P}$ we deduce that B, and hence also $J_\Delta(\mathcal{P})$, has rank n-1 modulo
$\mathcal{P}$ as desired.

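Henceforth we assume 1 ≤ r < n. Let $D = (D_1,\dots,D_n)$ be the
system formed from

$$D_i = \partial/\partial x_i \qquad (1 \le i \le n).$$

Then $J_D(\mathcal{P})$ is the usual Jacobian associated with $\mathcal{P}$ and it is well-known that

$$\operatorname{rank}_{\mathcal{P}} J_D(\mathcal{P}) = r. \qquad (2)$$

Consider the formal expression

$$L = \beta_1 \log x_1 + \dots + \beta_{n-1} \log x_{n-1} - \log x_n. \qquad (3)$$

Since the derivatives

$$D_1 L = \beta_1/x_1, \quad \dots, \quad D_{n-1} L = \beta_{n-1}/x_{n-1}, \quad D_n L = -1/x_n$$

are in $R_{\mathcal{P}}$, we can consider the matrix $J_D(L,\mathcal{P})$ obtained by adjoining
an initial row to $J_D(\mathcal{P})$. An easy (but crucial) calculation now
gives

$$J_D(L,\mathcal{P})\,B = J_\Delta(0,\mathcal{P}), \qquad (4)$$

where B is the matrix with n rows and n-1 columns expressing $\Delta_1,\dots,\Delta_{n-1}$ in terms of $D_1,\dots,D_n$, and $J_\Delta(0,\mathcal{P})$ is obtained from $J_\Delta(\mathcal{P})$ by adding an initial row of
zeroes.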

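Now in general we have

$$\operatorname{rank} C + \operatorname{rank} B - n \le \operatorname{rank} CB \le \operatorname{rank} C$$

if B has n rows and C has n columns. Applying this to (4), we find
that

$$\operatorname{rank}_{\mathcal{P}} J_D(L,\mathcal{P}) - 1 \le \operatorname{rank}_{\mathcal{P}} J_\Delta(\mathcal{P}). \qquad (5)$$

Assume the lemma is false. Then (5) implies

$$\operatorname{rank}_{\mathcal{P}} J_D(L,\mathcal{P}) \le r.$$

Comparing this with (2), we conclude that there exist finitely many
elements P of $\mathcal{P}$ and elements $\lambda$, $\lambda_P$ of R, with $\lambda$ not in $\mathcal{P}$, such that

$$\lambda\, D_i L \equiv \sum_P \lambda_P\, D_i P \pmod{\mathcal{P}} \qquad (1 \le i \le n). \qquad (6)$$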

We now interpret the expression (3) and the relations (6)
locally on the variety V of $\mathcal{P}$. We can find a smooth point
$\pi = (\xi_1,\dots,\xi_n)$ on V at which none of the polynomials $x_1,\dots,x_n,\lambda$
vanish. Then V can be parametrized near $\pi$ by means of equations

$$x_i = \xi_i F_i(t_1,\dots,t_r) \qquad (1 \le i \le n), \qquad (7)$$

where $F_1,\dots,F_n$ are power series in the variables $t_1,\dots,t_r$ which
converge for $t_1,\dots,t_r$ sufficiently small. The Jacobian matrix with
entries $\partial F_i/\partial t_s$ (1 ≤ i ≤ n, 1 ≤ s ≤ r) therefore has rank r.
Since the constant terms of $F_1,\dots,F_n$ are 1, we can define
convergent power series

$$y_i = \log F_i \qquad (1 \le i \le n)$$

with zero constant terms. Then

$$\partial y_i/\partial t_s = (1/F_i)\,\partial F_i/\partial t_s \qquad (1 \le i \le n,\ 1 \le s \le r),$$

and we deduce easily that the Jacobian matrix with entries $\partial y_i/\partial t_s$
(1 ≤ i ≤ n, 1 ≤ s ≤ r) also has rank r. In particular $y_1,\dots,y_n$
are not all zero, so the vector space they generate over Q has
dimension m satisfying 1 ≤ m ≤ n. Let $y_1,\dots,y_m$ be a basis
consisting of a subset of $y_1,\dots,y_n$. Then the Jacobian matrix with
entries $\partial y_j/\partial t_s$ (1 ≤ j ≤ m, 1 ≤ s ≤ r) also has rank r.

We can now apply Corollary 1 (p.253) of Ax's well-known paper
[1] on Schanuel's Conjecture for power series. We conclude that the
functions

$$y_1,\dots,y_m,\ e^{y_1},\dots,e^{y_m}$$

generate a field of transcendence degree at least m+r over K.
Since $e^{y_1},\dots,e^{y_m}$ are among $F_1,\dots,F_n$, which generate a field of
transcendence degree r over K, we deduce that $y_1,\dots,y_m$ are
algebraically independent over K.

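But now the relations (6) lead to a final contradiction as
follows. Write $\beta_n = -1$ and consider the function $\Lambda = \sum_{i=1}^{n} \beta_i y_i$.
Then

$$\partial\Lambda/\partial t_s = \sum_{i=1}^{n} (\beta_i/F_i)(\partial F_i/\partial t_s) \qquad (1 \le s \le r).$$

For P in R let $\tilde{P}$ be the function of $t_1,\dots,t_r$ obtained from the
substitution (7); clearly

$$\partial\tilde{P}/\partial t_s = \sum_{i=1}^{n} \xi_i\, \widetilde{D_i P}\,(\partial F_i/\partial t_s) \qquad (1 \le s \le r).$$

Making the substitution (7) in (6) gives

$$\tilde{\lambda}\,\beta_i/(\xi_i F_i) = \sum_P \tilde{\lambda}_P\, \widetilde{D_i P} \qquad (1 \le i \le n),$$

and on multiplying by $\xi_i\,\partial F_i/\partial t_s$ and summing over i we obtain

$$\tilde{\lambda}\,\partial\Lambda/\partial t_s = \sum_P \tilde{\lambda}_P\, \partial\tilde{P}/\partial t_s \qquad (1 \le s \le r).$$

Since each P is now in $\mathcal{P}$ we have $\tilde{P} = 0$ identically; consequently,
since

$$\tilde{\lambda}(0,\dots,0) = \lambda(\pi) \ne 0,$$

we deduce $\partial\Lambda/\partial t_s = 0$ for all s. Thus $\Lambda$ is the constant
$\Lambda(0,\dots,0) = 0$.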

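Finally there are rationals $q_{ij}$ such that

$$y_i = \sum_{j=1}^{m} q_{ij}\, y_j \qquad (1 \le i \le n)$$

and therefore

$$0 = \Lambda = \sum_{j=1}^{m} \Big( \sum_{i=1}^{n} \beta_i q_{ij} \Big) y_j.$$

Since $y_1,\dots,y_m$ are algebraically independent over K, we deduce

$$\sum_{i=1}^{n} \beta_i q_{ij} = 0 \qquad (1 \le j \le m).$$

Then since $\beta_1,\dots,\beta_n$ are linearly independent over Q, we conclude
that $q_{ij} = 0$ for all i, j, leading to $y_1 = \dots = y_n = 0$, the desired
contradiction. This completes the proof of the Jacobian Lemma.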

3. Integration.
Let $D = (D_1,\dots,D_k)$ be a system of commuting derivations on R,
and let T be a non-negative integer. For an ideal I of R we define
$\int I\,d^T D$ (the notation was suggested by a remark of D.J. Lewis) as
the ideal generated by the polynomials P for which all the
derivatives

$$D_1^{\tau_1} \cdots D_k^{\tau_k} P \qquad (\tau_1 + \dots + \tau_k \le T)$$

lie in I. Clearly

(8)

and it is easy to verify the inclusions

$$I^{T+1} \subseteq \int I\,d^T D \subseteq I. \qquad (9)$$

We shall also need the remark that with $D = (D_1,\dots,D_n)$ as in
Section 2 the equality

$$\int I\,d^T\Delta = \int I\,d^T D \qquad (10)$$

for T = 1 implies the same equality for all T ≥ 0. This is proved
by induction on T. For suppose T ≥ 1 and (10) holds with T replaced
by each t with 0 ≤ t ≤ T. In this case write $I_t$ for either side of
(10). Then a polynomial P lies in $\int I\,d^{T+1}\Delta = \int I_T\,d\Delta$ if and only if
P, $\Delta_j P$ are in $I_T = \int I_{T-1}\,dD$ for all j. This in turn holds if and
only if P, $D_i P$, $\Delta_j P$, $D_i\Delta_j P$ are in $I_{T-1}$ for all i, j. Now the
commutators $D_i\Delta_j - \Delta_j D_i$ are themselves derivations and therefore
linear combinations of $D_1,\dots,D_n$ with coefficients in R. Thus P
lies in $\int I\,d^{T+1}\Delta$ if and only if P, $\Delta_j P$, $D_i P$, $\Delta_j D_i P$ are in $I_{T-1}$ for
all i, j; and on retracing steps we see that this is equivalent to P
lying in

$$\int I\,d^{T+1} D.$$

This completes the proof of the remark.

The main lemma of this section concerns the system
$\Delta = (\Delta_1,\dots,\Delta_{n-1})$ defined in Section 1 for $1,\beta_1,\dots,\beta_{n-1}$ linearly
independent over Q.

Integration Lemma. Let 1 ≤ r ≤ n, and suppose $\mathcal{P}$ is a general prime ideal of R of rank r. Then $\int \mathcal{P}\,d^T\Delta$ is primary with radical $\mathcal{P}$, and its length is at least $\binom{T+\rho}{\rho}$, where $\rho = \min(r, n-1)$.

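Proof. Assume r ≠ n to begin with. We start by showing that

$$\int \mathcal{P}\,d\Delta = \int \mathcal{P}\,dD. \qquad (11)$$

In one direction, since $\Delta_1,\dots,\Delta_{n-1}$ are linear combinations of
$D_1,\dots,D_n$ with coefficients in R, it is clear that

$$\int \mathcal{P}\,d\Delta \supseteq \int \mathcal{P}\,dD. \qquad (12)$$

For the opposite inclusion we shall express $D_1,\dots,D_n$ back as
linear combinations of $\Delta_1,\dots,\Delta_{n-1}$ in a restricted sense. By the
Jacobian Lemma, the matrices $J_\Delta(\mathcal{P})$, $J_D(\mathcal{P})$ have equal ranks modulo
$\mathcal{P}$. It follows that the relation $J_D(\mathcal{P})B = J_\Delta(\mathcal{P})$ of Section 2 can be
inverted in the form $\lambda J_D(\mathcal{P}) \equiv J_\Delta(\mathcal{P})A \pmod{\mathcal{P}}$, where A is a matrix
with entries in R and $\lambda$ is in R but not in $\mathcal{P}$. Hence there exist
elements $\lambda_{ij}$ of R such that

$$\lambda D_i P \equiv \sum_{j=1}^{n-1} \lambda_{ij}\,\Delta_j P \pmod{\mathcal{P}} \qquad (1 \le i \le n) \qquad (13)$$

for all P in $\mathcal{P}$. It follows from this easily that

$$\int \mathcal{P}\,d\Delta \subseteq \int \mathcal{P}\,dD. \qquad (14)$$

Now (12) and (14) together give (11). From our opening remark
we deduce that in fact

$$\int \mathcal{P}\,d^T\Delta = \int \mathcal{P}\,d^T D.$$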
By the Corollary (p.164) in a recent paper of Seibt [4], the
integral $\int \mathcal{P}\,d^T D$ is simply the (T+1)-th symbolic power $\mathcal{P}^{(T+1)}$ of $\mathcal{P}$;
that is, the unique isolated primary component of the ordinary power
$\mathcal{P}^{T+1}$. And this is known to have length $\binom{T+r}{r}$. For, passing to the
localization $R_{\mathcal{P}}$, the length of $\mathcal{P}^{(T+1)}$ is the dimension of
$R_{\mathcal{P}}/\mathcal{P}^{(T+1)}R_{\mathcal{P}}$ as a vector space over $F = R_{\mathcal{P}}/\mathcal{P}R_{\mathcal{P}}$. But the former
quotient is the same as $R_{\mathcal{P}}/\mathcal{P}^{T+1}R_{\mathcal{P}}$, which by standard results (see,
e.g., [7], Theorem 25 (p.301) and the Remark (p.310)) is isomorphic
to the vector space over F, of dimension $\binom{T+r}{r}$, of all polynomials
in r variables of total degree at most T (these remarks are due to
M. Hochster). This completes the case r ≠ n.
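The dimension count used above is the standard fact that the monomials in r variables of total degree at most T number $\binom{T+r}{r}$. A minimal sketch that confirms this by direct enumeration for small parameters (the function name `count_monomials` is hypothetical):

```python
from itertools import product
from math import comb

def count_monomials(r, T):
    """Count exponent vectors (e_1, ..., e_r) with e_1 + ... + e_r <= T."""
    return sum(1 for exps in product(range(T + 1), repeat=r) if sum(exps) <= T)

if __name__ == "__main__":
    for r in range(1, 5):
        for T in range(0, 6):
            print(r, T, count_monomials(r, T), comb(T + r, r))
```

The two printed counts agree in every row, which is the dimension $\binom{T+r}{r}$ quoted for the space of polynomials of total degree at most T.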
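Finally suppose r = n. Then $\mathcal{P}$ is maximal, and we see at once
from (9) that $\mathcal{J} = \int \mathcal{P}\,d^T\Delta$ is primary with radical $\mathcal{P}$. We now descend
to the ring $R' = K[x_1,\dots,x_{n-1}]$, and we put $\mathcal{P}' = R' \cap \mathcal{P}$, $\mathcal{J}' = R' \cap \mathcal{J}$. Since the derivations

$$\Delta'_i = x_i\,\partial/\partial x_i \qquad (1 \le i \le n-1)$$

act like $\Delta_1,\dots,\Delta_{n-1}$ on R', it is clear that

$$\mathcal{J}' = \int \mathcal{P}'\,d^T\Delta'$$

for the system $\Delta' = (\Delta'_1,\dots,\Delta'_{n-1})$. But $\Delta'_1,\dots,\Delta'_{n-1}$ are linear
combinations of $D_1,\dots,D_{n-1}$ and moreover there are converse
relations of the form (13) (with, e.g., $\lambda = x_1 \cdots x_{n-1}$). It follows
that we can use the preceding arguments to prove that

$$\int \mathcal{P}'\,d^T\Delta' = \int \mathcal{P}'\,d^T D'$$

for the system $D' = (D_1,\dots,D_{n-1})$. The right-hand side is just the
symbolic power $\mathcal{P}'^{(T+1)}$, whose length is $\binom{T+n-1}{n-1}$. Also there is a
natural injection from $R'_{\mathcal{P}'}/\mathcal{J}'R'_{\mathcal{P}'}$ to $R_{\mathcal{P}}/\mathcal{J}R_{\mathcal{P}}$ as vector spaces over K.
It follows that the length of $\mathcal{J}$ is at least the length of $\mathcal{J}'$, which
is $\binom{T+n-1}{n-1}$. This completes the proof in the case r = n.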

4. Proof of Proposition.

This is by contradiction. We suppose there exists a non-zero
polynomial P of total degree D ≥ 1, and distinct points
$\pi_s = (\xi_1^{(s)},\dots,\xi_n^{(s)})$ $(0 \le s \le S)$ of $(K^\times)^n$ such that

$$\Delta_1^{\tau_1} \cdots \Delta_{n-1}^{\tau_{n-1}} P(\pi_s) = 0$$

for all non-negative integers $\tau_1,\dots,\tau_{n-1}$, s with

$$\tau_1 + \dots + \tau_{n-1} \le T, \qquad s \le S.$$

It suffices to assume S is an integer but that the weaker


inequalities

2nDn
n • (15)

hold; from these we shall deduce our contradiction.

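For any ideal I of R we define $I^*$ as the contracted extension
simultaneously with respect to the maximal ideals $\mathcal{P}_0,\dots,\mathcal{P}_S$
corresponding to the points $\pi_0,\dots,\pi_S$. We put

$$T' = [T/n]$$

and we let $I_r$ (1 ≤ r ≤ n+1) be the ideal generated by the polynomials

$$\Delta_1^{\tau_1} \cdots \Delta_{n-1}^{\tau_{n-1}} P \qquad (\tau_1 + \dots + \tau_{n-1} \le (r-1)T')$$

of total degrees at most D.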

We start by observing that since nT' ≤ T the generators of $I_{n+1}$
all vanish at the points $\pi_0,\dots,\pi_S$. Consequently all the ideals
$I^*_1,\dots,I^*_{n+1}$ are proper and non-zero.

Next, we prove that if 1 ≤ r < n and $I^*_r$ has rank m < n, then
$I^*_{r+1}$ has rank strictly larger than m. For this it will suffice to
deduce a contradiction if $I^*_{r+1}$ has rank m. But in this case let $\mathcal{P}$
be a prime component of $I^*_{r+1}$ of rank m. Evidently $\mathcal{P}$ is general,
and, since $I^*_r \subseteq I^*_{r+1} \subseteq \mathcal{P}$, it follows that $\mathcal{P}$ is also a prime
component of $I^*_r$; let $\mathcal{Q}$ be the corresponding primary component. It
is clear from the definitions that $I_r \subseteq \int I_{r+1}\,d^{T'}\Delta$, and since
$I_{r+1} \subseteq I^*_{r+1} \subseteq \mathcal{P}$, we get

$$I_r \subseteq \int \mathcal{P}\,d^{T'}\Delta. \qquad (16)$$

By the Integration Lemma the right-hand side is primary with radical
$\mathcal{P}$; hence localizing (16) at $\mathcal{P}$ yields

$$\mathcal{Q} \subseteq \int \mathcal{P}\,d^{T'}\Delta.$$

Comparing lengths and using once more the Integration Lemma, we find
that the length $\ell(\mathcal{Q})$ of $\mathcal{Q}$ satisfies

$$\ell(\mathcal{Q}) \ge \binom{T'+m}{m}. \qquad (17)$$

On the other hand, $\mathcal{Q}$ is an isolated primary component of rank m of
the ideal $I_r$ generated by polynomials of total degrees at most D; so
by the Corollary (p.419) of [3] we have the estimate

By (15) this contradicts (17).

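So the assertion about ranks is established; in other words,
the ranks of $I^*_1,\dots,I^*_{n+1}$ strictly increase until they reach n, and
then remain stationary. In particular $I^*_n$ and $I^*_{n+1}$ must have rank
n. We have already noted that $I^*_{n+1}$ has general prime components
$\mathcal{P}_0,\dots,\mathcal{P}_S$; hence these are all prime components of $I^*_n$ as well; let
$\mathcal{Q}_0,\dots,\mathcal{Q}_S$ be the corresponding primary components. As above we
find that

$$\mathcal{Q}_s \subseteq \int \mathcal{P}_s\,d^{T'}\Delta \qquad (0 \le s \le S),$$

and now the Integration Lemma yields

$$\ell(\mathcal{Q}_s) \ge \binom{T'+n-1}{n-1} \qquad (0 \le s \le S).$$

Thus

$$\sum_{s=0}^{S} \ell(\mathcal{Q}_s) \ge (S+1)\binom{T'+n-1}{n-1}.$$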

But once again the Corollary (p.419) of [3] gives

S
L
s=O

which by (15) is another contradiction. This completes the proof of


the Proposition.

References.

[1] J. Ax, On Schanuel's conjectures, Annals of Math. 93 (1971), 252-268.

[2] A. Baker, Transcendental Number Theory, Cambridge 1975.

[3] D.W. Masser and G. Wüstholz, Fields of large transcendence degree generated by values of elliptic functions, Invent. Math. 72 (1983), 407-464.

[4] P. Seibt, Differential filtrations and symbolic powers of regular primes, Math. Z. 166 (1979), 159-164.

[5] G. Wüstholz, Multiplicity estimates on group varieties, to appear.

[6] G. Wüstholz, The analytic subgroup theorem, to appear.

[7] O. Zariski and P. Samuel, Commutative algebra Vol. II, Springer, New York 1968.

D.W. Masser
Dept. of Mathematics,
University of Michigan,
Ann Arbor, MI 48109, U.S.A.

G. Wüstholz
Max-Planck-Institut für Mathematik,
Gottfried-Claren-Strasse 26,
5300 Bonn 3, Fed. Rep. of Germany.
SUMS OF POLYGONAL NUMBERS

Melvyn B. Nathanson

Let m ≥ 1. The k-th polygonal number of order m+2 is the sum of the
first k terms of the arithmetic progression 1, 1+m, 1+2m, 1+3m, ...
The polygonal numbers of orders 3 and 4 are the triangular numbers
and squares, respectively.

In his note to Book IV, Article 29, of Diophantus's
Arithmetica, Fermat [2] wrote, "Every number is either a triangular
number or the sum of two or three triangular numbers; every number
is a square or the sum of two, three, or four squares; every number
is a pentagonal number or the sum of two, three, four or five
pentagonal numbers; and so on ad infinitum".

Lagrange [4] proved that every number is the sum of four
squares. Gauss [3] showed that every number is the sum of three
triangular numbers, or, equivalently, that every non-negative
integer n ≡ 3 (mod 8) is the sum of three odd squares. Weil [8]
presented proofs of these theorems that use only techniques
available to Fermat.

Gauss [3] also proved that a positive integer n is the sum of
three squares if and only if n is not of the form $4^a(8k + 7)$.

For m ≥ 5, Cauchy [1] proved that every number is the sum of m
polygonal numbers of order m, and that at most four of the polygonal
summands are different from 0 or 1. Legendre [5] proved that, for
m ≡ 1, 2, 3 (mod 4), every sufficiently large integer is the sum of
four polygonal numbers of order m, and, for m ≡ 0 (mod 4), every
sufficiently large integer is the sum of five polygonal numbers of
order m, at least one of which is 0 or 1.

Uspensky and Heaslet [7, p.380] stated that "Cauchy showed that
other parts of the Fermat theorem can be derived in a comparatively
elementary but rather long way" from the triangular number
theorem. Recently, Weil [9, p.102] wrote that from the triangular
number theorem "one can derive (not quite easily, but at any rate
elementarily) all of Fermat's further assertions." The purpose of
this paper is to give short and easy proofs of the Fermat-Cauchy
theorem (Theorem 1), of Legendre's results (Theorems 2-5), and of
some further refinements of these results on sums of polygonal
numbers (Theorems 6-8).

Pepin [6] published tables of representations of all integers
n ≤ 120m as sums of m polygonal numbers of order m, at most four of
which are different from 0 or 1. (There are mistakes in these
tables, but they are easily corrected.) It suffices, therefore, to
prove Cauchy's theorem only for n ≥ 120m.

Denote the k-th polygonal number of order m + 2 by

$$p_m(k) = \frac{m}{2}(k^2 - k) + k.$$
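As a quick illustration of this definition, the closed form can be compared with the defining sum of the first k terms of the progression 1, 1+m, 1+2m, ... The sketch below uses the hypothetical names `polygonal` and `polygonal_by_sum`; integer division is exact because k² - k is always even.

```python
def polygonal(m, k):
    """Closed form p_m(k) = (m/2)(k^2 - k) + k for the order m+2."""
    return m * (k * k - k) // 2 + k

def polygonal_by_sum(m, k):
    """Sum of the first k terms of the progression 1, 1+m, 1+2m, ..."""
    return sum(1 + j * m for j in range(k))

if __name__ == "__main__":
    print([polygonal(1, k) for k in range(1, 6)])  # triangular numbers
    print([polygonal(2, k) for k in range(1, 6)])  # squares
    print([polygonal(3, k) for k in range(1, 6)])  # pentagonal numbers
```

With m = 1 and m = 2 this reproduces the triangular numbers and the squares, matching the orders 3 and 4 mentioned at the start of the paper.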

Lemma 1. Let L denote the length of the interval defined by the inequalities

$$\frac{1}{2} + \sqrt{6\Big(\frac{n}{m}\Big) - 3} < b \le \frac{2}{3} + \sqrt{8\Big(\frac{n}{m}\Big) - 8}. \qquad (1)$$

Then
(i) L ≥ 4 if n ≥ 108m,
(ii) L ≥ hm if n ≥ $7h^2m^3$.

Proof. A simple computation shows that

$$L = \sqrt{8\Big(\frac{n}{m}\Big) - 8} - \sqrt{6\Big(\frac{n}{m}\Big) - 3} + \frac{1}{6} \ge g$$

if

$$\frac{n}{m} \ge 7\Big(g - \frac{1}{6}\Big)^2 + 5. \qquad (2)$$

The right side of (2) is 107.86 for g = 4. This proves (i).

Let g ≥ 3. Then $7g^2 \ge 7(g - \frac{1}{6})^2 + 5$ and so L ≥ g for
n ≥ $7g^2m$. This yields (ii) for g = hm.
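The numerical claims in this proof are easy to confirm. The sketch below assumes the reading of (1) and (2) used here, namely $L = \sqrt{8y-8} - \sqrt{6y-3} + 1/6$ with y = n/m and the threshold $7(g - 1/6)^2 + 5$; the names `interval_length` and `threshold` are hypothetical.

```python
from math import sqrt

def interval_length(y):
    """Length L of the interval (1) as a function of y = n/m (assumed reading)."""
    return sqrt(8 * y - 8) - sqrt(6 * y - 3) + 1 / 6

def threshold(g):
    """Right side of (2): the value of n/m beyond which L >= g is claimed."""
    return 7 * (g - 1 / 6) ** 2 + 5

if __name__ == "__main__":
    print(round(threshold(4), 2))        # value quoted in the proof for g = 4
    print(interval_length(108))          # case (i): n = 108m
    for g in (3, 5, 10, 50):
        print(g, interval_length(7 * g * g))   # case (ii) with n = 7 g^2 m
```

Since L is increasing in y, checking the left endpoint of each range is enough for this illustration.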

Lemma 2. Let m ≥ 3 and n ≥ 2m. Let a, b, r be non-negative
integers such that 0 ≤ r < m and

$$n = \frac{m}{2}(a - b) + b + r. \qquad (3)$$

If

$$\frac{1}{2} + \sqrt{6\Big(\frac{n}{m}\Big) - 3} < b \le \frac{2}{3} + \sqrt{8\Big(\frac{n}{m}\Big) - 8},$$

then
(i) $b^2 \le 4a$,
(ii) $3a < b^2 + 2b + 4$.

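Proof. Equation (3) implies that

$$a = \Big(1 - \frac{2}{m}\Big) b + 2\Big(\frac{n-r}{m}\Big). \qquad (4)$$

Therefore,

$$b^2 - 4a = b^2 - 4\Big(1 - \frac{2}{m}\Big) b - 8\Big(\frac{n-r}{m}\Big) \le 0$$

if

$$0 \le b \le 2\Big(1 - \frac{2}{m}\Big) + \sqrt{4\Big(1 - \frac{2}{m}\Big)^2 + 8\Big(\frac{n-r}{m}\Big)}.$$

Since m ≥ 3 and 0 ≤ r/m < 1, it follows that $b^2 \le 4a$ if

$$0 \le b \le \frac{2}{3} + \sqrt{8\Big(\frac{n}{m}\Big) - 8}.$$

Similarly, using (4), we obtain

$$b^2 + 2b + 4 - 3a = b^2 - \Big(1 - \frac{6}{m}\Big) b - \Big(6\Big(\frac{n-r}{m}\Big) - 4\Big) > 0$$

if

$$b > \Big(\frac{1}{2} - \frac{3}{m}\Big) + \sqrt{\Big(\frac{1}{2} - \frac{3}{m}\Big)^2 + 6\Big(\frac{n-r}{m}\Big) - 4}.$$

Therefore, $3a < b^2 + 2b + 4$ if

$$b > \frac{1}{2} + \sqrt{6\Big(\frac{n}{m}\Big) - 3}.$$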

Lemma 3. Let a and b be non-negative integers. If

(i) $b^2 \le 4a$,

(ii) $3a < b^2 + 2b + 4$,

and if for some d ≥ 1 either

(iii) $a/d^2 \equiv b/d \equiv 1$ (mod 2), or

(iv) $a/d^2 \equiv 2$ (mod 4) and $b/d \equiv 0$ (mod 2),

then there exist non-negative integers s, t, u, v such that

$$a = s^2 + t^2 + u^2 + v^2, \qquad b = s + t + u + v.$$

Proof. Suppose that (iii) holds with d = 1. Then a and b are odd,
hence $4a - b^2 \equiv 3$ (mod 8). Since $4a - b^2 \ge 0$ by (i), Gauss's
theorem implies that there exist odd integers x ≥ y ≥ z > 0 such
that

$$4a - b^2 = x^2 + y^2 + z^2. \qquad (5)$$

The integer b + x + y + z is even. Choose ±z so that
b + x + y ± z ≡ 0 (mod 4). Define integers s, t, u, v as follows:

$$s = \frac{b + x + y \pm z}{4},$$

$$t = \frac{b + x}{2} - s = \frac{b + x - y \mp z}{4},$$

$$u = \frac{b + y}{2} - s = \frac{b - x + y \mp z}{4},$$

$$v = \frac{b \pm z}{2} - s = \frac{b - x - y \pm z}{4}.$$

Then

$$a = s^2 + t^2 + u^2 + v^2, \qquad b = s + t + u + v,$$

and

$$s \ge t \ge u \ge v.$$

To prove that s, t, u, v are non-negative, it is enough to show that
the integer v = (b - x - y ± z)/4 ≥ 0, or, equivalently,
(b - x - y ± z)/4 > -1. The worst case is (b - x - y - z)/4 > -1, or
x + y + z < b + 4. The maximum value of x + y + z subject to the
constraint (5) is $\sqrt{12a - 3b^2}$, and so it suffices to prove that
$\sqrt{12a - 3b^2} < b + 4$, or $3a < b^2 + 2b + 4$. This is precisely (ii),
and so s, t, u, v are non-negative integers.

Suppose (iv) holds with d = 1. Then $a - (b/2)^2 \equiv 1$ or 2
(mod 4). It follows from (i) that $a - (b/2)^2 = (4a - b^2)/4 \ge 0$. By
Gauss's theorem, there exist non-negative integers X ≥ Y ≥ Z such
that $a - (b/2)^2 = X^2 + Y^2 + Z^2$. Let x = 2X, y = 2Y, z = 2Z. Then
$4a - b^2 = x^2 + y^2 + z^2$. If k is an even integer, then $k^2 \equiv 2k$
(mod 8). Since a, b, x, y and z are even, it follows that

$$0 \equiv 4a = b^2 + x^2 + y^2 + z^2 \equiv 2(b + x + y + z) \pmod 8$$

and so b + x + y + z ≡ 0 (mod 4). Then s = (b + x + y + z)/4 is an
integer. Define t, u, v as above. The proof continues as in case
(iii) with d = 1.

Suppose that (iii) or (iv) holds with d ≥ 2. Let $A = a/d^2$ and
B = b/d. Then

$$B^2 = b^2/d^2 \le 4a/d^2 = 4A$$

and

$$B^2 + 2B + 4 = (b^2 + 2db + 4d^2)/d^2 \ge (b^2 + 2b + 4)/d^2 > 3a/d^2 = 3A.$$

It follows that there are non-negative integers S, T, U and V such
that

$$A = S^2 + T^2 + U^2 + V^2, \qquad B = S + T + U + V.$$

Let s = dS, t = dT, u = dU, v = dV. Then s, t, u, v are
non-negative integers satisfying $a = s^2 + t^2 + u^2 + v^2$ and b = s + t +
u + v. This concludes the proof.

Lemma 4. Let m ≥ 1. Then n is the sum of four polygonal numbers of
order m+2 if and only if n = (m(a - b)/2) + b, where $a = s^2 + t^2 +
u^2 + v^2$ and b = s + t + u + v for non-negative integers s, t, u, v.

Proof. This follows directly from the representation $p_m(k) =
(m(k^2 - k)/2) + k$.

Theorem 1. Let m ≥ 3 and n ≥ 108m. Then n is the sum of m+2
polygonal numbers of order m+2, of which at most four are different
from 0 or 1.

Proof. By Lemma 1, the interval (1) has length at least 4, and so
it contains at least two consecutive odd positive integers.
Therefore, the set S = {b + r}, where b is an odd positive integer
in the interval and r = 0, 1, 2, ..., m-2, contains a complete set of
residues modulo m. Choose b + r in the set S so that n ≡ b + r
(mod m). Define a by equation (4) of Lemma 2. Then a and b are odd
positive integers that satisfy the hypotheses of Lemma 3 with d = 1
in (iii). Apply Lemma 4 with n-r in place of n. Then n-r is a sum
of four polygonal numbers of order m+2. Since 0 ≤ r ≤ m-2, it
follows that n is a sum of r+4 ≤ m+2 polygonal numbers of order m+2.
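The proof of Theorem 1 is constructive, and the steps can be sketched in code. The following is only an illustration under stated assumptions: it follows Lemmas 1 to 4 as read here, finds the three odd squares of Gauss's theorem by brute force rather than by any method from the paper, and the function names `three_odd_squares` and `cauchy_representation` are hypothetical.

```python
from math import isqrt, sqrt

def polygonal(m, k):
    """p_m(k) = (m/2)(k^2 - k) + k, the k-th polygonal number of order m+2."""
    return m * (k * k - k) // 2 + k

def three_odd_squares(N):
    """Brute-force search for odd x, y, z > 0 with x^2 + y^2 + z^2 == N."""
    for x in range(1, isqrt(N) + 1, 2):
        for y in range(1, x + 1, 2):
            rest = N - x * x - y * y
            if rest < 1:
                break
            z = isqrt(rest)
            if z * z == rest and z % 2 == 1:
                return x, y, z
    return None

def cauchy_representation(m, n):
    """Return (ks, r) with sum(p_m(k) for k in ks) + r == n and 0 <= r <= m-2."""
    if m < 3 or n < 108 * m:
        raise ValueError("this sketch assumes m >= 3 and n >= 108 m")
    lo = 0.5 + sqrt(6 * n / m - 3)
    hi = 2 / 3 + sqrt(8 * n / m - 8)
    for b in range(int(lo) + 1, int(hi) + 1):    # integers in the interval (1)
        if b % 2 == 0:
            continue                             # the proof uses odd b
        for r in range(m - 1):                   # r = 0, 1, ..., m-2
            if (n - b - r) % m != 0:
                continue
            a = b + 2 * (n - b - r) // m         # equation (4)
            # Conditions (i) and (ii) of Lemma 3, checked in exact arithmetic.
            if not (b * b <= 4 * a and 3 * a < b * b + 2 * b + 4):
                continue
            x, y, z = three_odd_squares(4 * a - b * b)
            if (b + x + y + z) % 4 != 0:
                z = -z                           # choose the sign of z
            ks = [(b + x + y + z) // 4, (b + x - y - z) // 4,
                  (b - x + y - z) // 4, (b - x - y + z) // 4]
            return ks, r
    raise RuntimeError("no admissible b found")

if __name__ == "__main__":
    ks, r = cauchy_representation(3, 324)
    print(ks, r, sum(polygonal(3, k) for k in ks) + r)
```

The returned indices give four polygonal numbers of order m+2, and the remaining r summands are all equal to 1 = p_m(1), so at most four summands differ from 0 or 1.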

Theorem 2. Let m ≥ 3, m odd, and n ≥ $28m^3$. Then n is the sum of
four polygonal numbers of order m+2.

Proof. By Lemma 1, the interval (1) contains at least 2m
consecutive integers. Since m is odd, there is an odd integer b in
this interval such that n ≡ b (mod m). Let r = 0. Define a by
equation (4). Then a ≡ b ≡ 1 (mod 2) and the Theorem follows from
Lemmas 3(iii) and 4.

Theorem 3. Let m ≥ 3, m even, and n ≥ $7m^3$. If n is odd, then n is
the sum of four polygonal numbers of order m+2. If n is even, then
n is the sum of five polygonal numbers of order m+2, at least one of
which is 1.

Proof. By Lemma 1, the interval (1) contains at least m consecutive
integers. If n is odd, choose b in this interval so that n ≡ b
(mod m). Then b is odd since m is even. Let r = 0 and define a by
equation (4).

If n is even, choose b in the interval so that n ≡ b + 1
(mod m). Then b is odd. Let r = 1 and define a by equation (4).
In both cases, a ≡ b ≡ 1 (mod 2) and the Theorem follows from
Lemmas 3(iii) and 4.

Theorem 4. Let m ≡ 0 (mod 4), m ≥ 4, and n ≥ $28m^3$. If n is even,
then n is the sum of four polygonal numbers of order m+2.

Proof. By Lemma 1, the interval (1) contains at least 2m
consecutive numbers. Choose $b_1$ and $b_2$ in this interval such that
$b_2 - b_1 = m$ and $n \equiv b_1 \equiv b_2$ (mod m). Define $a_i = (2(n - b_i)/m) + b_i$
for i = 1, 2. Then $a_1, a_2, b_1, b_2$ are positive even integers, and

$$a_2 - a_1 = m - 2 \equiv 2 \pmod 4.$$

It follows that $a_i \equiv 2$ (mod 4) for i = 1 or 2. Choose i so that
$a_i \equiv 2$ (mod 4). Let $a = a_i$ and $b = b_i$. Let r = 0. Then
a ≡ 2 (mod 4) and b ≡ 0 (mod 2), and the Theorem follows from Lemmas
3(iv) and 4.

Theorem 5. Let m ≡ 2 (mod 4), m ≥ 6, and n ≥ $7m^3$. If n ≡ 2
(mod 4), then n is the sum of four polygonal numbers of order m+2.

Proof. By Lemma 1, the interval (1) contains at least m consecutive
integers. Choose b in this interval such that b ≡ n (mod m). Then
b ≡ 0 (mod 2). Let x = (n-b)/m and r = 0. Define a by equation
(4). Then

$$a = b + 2x = n - (m-2)x \equiv 2 \pmod 4.$$

The Theorem follows from Lemmas 3(iv) and 4.

Legendre's results (Theorems 2-5) show that every sufficiently
large integer n is a sum of four polygonal numbers of order m+2
unless m+2 ≡ n ≡ 0 (mod 4). The following propositions refine this
exceptional case. Corollary 1 of Theorem 6 is due to Legendre [5].

Theorem 6. Let m ≥ 3, k ≥ 1, and n ≥ $7 \cdot 4^k m^3$. If m ≡ 2 (mod $2^{k+2}$)
and n ≡ $2^{2k}$ (mod $2^{2k+1}$), or if m ≡ 2 + $2^{k+1}$ (mod $2^{k+2}$) and n ≡ 0
(mod $2^{2k+1}$), then n is a sum of four polygonal numbers of order m+2.

Proof. Let $m = 2^{k+1}m' + 2$ and $n = 2^{2k}n'$. Theorem 6 is equivalent
to the statement that n is the sum of four polygonal numbers of
order m+2 if m' ≢ n' (mod 2).

By Lemma 1, the interval (1) contains at least $2^k m$ consecutive
integers. Choose b in this interval such that n ≡ b (mod m) and
x = (n-b)/m ≡ $2^{k-1}$ (mod $2^k$). Let $x = 2^k x' + 2^{k-1}$. Apply Lemma 3
with $d = 2^k$ in (iii). Then

$$\frac{b}{d} = \frac{n - mx}{d} = 2^k n' - (2^k m' + 1)(2x' + 1) \equiv 1 \pmod 2.$$

Let a = n - (m-2)x. If m' ≢ n' (mod 2), then

$$\frac{a}{d^2} = n' - m'(2x' + 1) \equiv n' - m' \equiv 1 \pmod 2.$$

Let r = 0. Then a, b, r satisfy (3) of Lemma 2, and the Theorem
follows from Lemmas 3 and 4.

Corollary 6.1 Let m ≥ 10 and n ≥ $28m^3$. If m ≡ 2 (mod 8) and
n ≡ 4 (mod 8), or if m ≡ 6 (mod 8) and n ≡ 0 (mod 8), then n is a
sum of four polygonal numbers of order m+2.

Corollary 6.2 Let m ≥ 18 and n ≥ $112m^3$. If m ≡ 2 (mod 16) and
n ≡ 16 (mod 32), or if m ≡ 10 (mod 16) and n ≡ 0 (mod 32), then n is
a sum of four polygonal numbers of order m+2.

Theorem 7. Let m ≥ 3, k ≥ 1, 1 ≤ j ≤ k+1, and n ≥ $7 \cdot 4^{2k+1} m^3$. If
m ≡ 2 (mod 4) and n ≡ $2^{2k+1}$ (mod $2^{2k+2}$), or if m ≡ 2 + $2^j$ (mod $2^{j+1}$)
and n ≡ 0 (mod $2^{2k+2}$), then n is a sum of four polygonal numbers of
order m+2.

Proof. Let $m = 2^j m' + 2$ and $n = 2^{2k+1} n'$. Theorem 7 is equivalent
to the statement that n is a sum of four polygonal numbers of order
m+2 if m' ≢ n' (mod 2).

By Lemma 1, the interval (1) contains at least $2^{2k+1}m$
consecutive integers. Choose b in this interval such that
n ≡ b (mod m) and x = (n - b)/m ≡ $2^{2k+1-j}$ (mod $2^{2k+2-j}$). Let $x =
2^{2k+2-j}x' + 2^{2k+1-j}$. Let a = n - (m - 2)x. Apply Lemma 3 (iv) with
$d = 2^k$. Then

$$\frac{b}{d} = \frac{n - mx}{d} = 2^{k+1}n' - (2^{k+1}m' + 2^{k+2-j})(2x' + 1)$$

and so b/d ≡ 0 (mod 2). If m' ≢ n' (mod 2), then

$$a = n - (m - 2)x = 2^{2k+1}\big(n' - m'(2x' + 1)\big)$$

and so $a/d^2 \equiv 2$ (mod 4). The Theorem follows from Lemmas 3 and 4.

Corollary 7.1 Let m ≥ 3 and n ≥ $448m^3$. If m ≡ 2 (mod 4) and
n ≡ 8 (mod 16), or if m ≡ 6 (mod 8) and n ≡ 0 (mod 16), then n is
the sum of four polygonal numbers of order m+2.

Corollary 7.2 Let m ≥ 3 and n ≥ $7168m^3$. If m ≡ 2 (mod 4) and
n ≡ 32 (mod 64), or if m ≡ 6, 10, or 14 (mod 16) and n ≡ 0 (mod 64),
then n is a sum of four polygonal numbers of order m+2.

The Fermat-Cauchy theorem that every non-negative integer is
the sum of m+2 polygonal numbers of order m+2 is best possible in
the sense that there exist integers (for example, 2m+3 and 5m+6)
that cannot be represented as the sum of m+1 polygonal numbers of
order m+2. It is natural to define F(m+2) as the smallest number f
such that every sufficiently large integer is the sum of f polygonal
numbers of order m + 2. Clearly F(3) = 3 and F(4) = 4.
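The two examples 2m+3 and 5m+6 can be checked by computing, for small m, the least number of polygonal numbers of order m+2 needed to represent a given integer. The sketch below uses a simple dynamic programme; the function name `min_polygonal_terms` is hypothetical, and the range m ≥ 3 used in the check is an assumption (for m = 1 and m = 2 the number 5m+6 has shorter representations, for instance 16 = 4²).

```python
def polygonal(m, k):
    """p_m(k) = (m/2)(k^2 - k) + k."""
    return m * (k * k - k) // 2 + k

def min_polygonal_terms(m, n):
    """Least number of positive polygonal numbers of order m+2 summing to n."""
    polys = []
    k = 1
    while polygonal(m, k) <= n:
        polys.append(polygonal(m, k))
        k += 1
    best = [0] * (n + 1)
    for total in range(1, n + 1):
        # p_m(1) = 1 is always available, so the minimum below is well defined.
        best[total] = 1 + min(best[total - p] for p in polys if p <= total)
    return best[n]

if __name__ == "__main__":
    for m in range(3, 9):
        print(m, min_polygonal_terms(m, 2 * m + 3), min_polygonal_terms(m, 5 * m + 6))
```

For each m in the range tested, both integers need exactly m+2 summands, so m+1 polygonal numbers do not suffice.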

Theorem 8. Let m ≥ 3. If m+2 ≡ 1, 2, or 3 (mod 4), then
F(m+2) = 3 or 4. If m+2 ≡ 0 (mod 4), then F(m+2) = 4 or 5.

Proof. Let $P_m(x)$ denote the number of polygonal numbers of order
m+2 that do not exceed x. Since $p_m(k) = (m(k^2 - k)/2) + k$, it follows
that $P_m(x) = \sqrt{(2/m)x} + O(1)$. Let $Q_m(x)$ denote the number of
integers n not exceeding x such that n can be written as the sum of
two polygonal numbers of order m+2. If m ≥ 3, then

$$Q_m(x) \le P_m(x)^2 = (2/m)x + O(\sqrt{x}) \le (2/3)x + O(\sqrt{x})$$

and so there are infinitely many positive integers that are not sums
of two polygonal numbers of order m+2. Therefore, F(m+2) ≥ 3.

By Theorems 2, 3, and 4, if m is odd or m ≡ 0 (mod 4), then
F(m+2) ≤ 4. Therefore, F(m+2) = 3 or 4 for m+2 ≡ 1, 2, or 3
(mod 4).

If n is the sum of three polygonal numbers of order m+2, then


there exist nonnegative integers t, u, v such that

This is equivalent to

8mn + 3(m - 2)2 = (2mt - m + 2)2 + (2mu - m + 2)2 + (2mv - m + 2)2.

Let m+2 _ 0 (mod 4). Then m = 4m'+2 and

N (2m' + l)n + 3(m,)2

= (2m't + t - m,)2 + (2m'u + u - m,)2 + (2m'v + v - m,)2.

Since $2m' + 1$ is odd, hence relatively prime to 8, there exists an entire congruence class $r \pmod 8$ such that

$$N = (2m'+1)n + 3(m')^2 \equiv 7 \pmod 8$$

for $n \equiv r \pmod 8$. Then N is not a sum of three squares, and so n is not a sum of three polygonal numbers of order m+2 if $n > 0$ and $n \equiv r \pmod 8$. Therefore, $F(m+2) \ge 4$ if $m+2 \equiv 0 \pmod 4$.
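The congruence obstruction can be illustrated numerically. The sketch below is an illustration only (the helper names are hypothetical): for $m = 4m' + 2$ it computes the class r with $(2m'+1)r + 3(m')^2 \equiv 7 \pmod 8$, checks the identity relating n to a sum of three squares, and confirms by brute force that no positive $n \equiv r \pmod 8$ below a chosen bound is a sum of three polygonal numbers of order m+2. It relies on Python 3.8+ for the modular inverse via `pow`.

```python
def polygonal(m, k):
    """k-th polygonal number of order m+2."""
    return m * (k * k - k) // 2 + k


def sums_of_three(m, limit):
    """All integers <= limit that are sums of three polygonal numbers of order m+2."""
    vals = []
    k = 0
    while polygonal(m, k) <= limit:
        vals.append(polygonal(m, k))
        k += 1
    return {a + b + c for a in vals for b in vals for c in vals if a + b + c <= limit}


def excluded_class(m):
    """For m = 4m'+2, the residue r mod 8 with (2m'+1) r + 3 m'^2 = 7 (mod 8)."""
    assert m % 4 == 2
    mp = (m - 2) // 4
    inverse = pow(2 * mp + 1, -1, 8)  # 2m'+1 is odd, hence invertible mod 8
    return (7 - 3 * mp * mp) * inverse % 8


def identity_holds(m, t, u, v):
    """Check 8mn + 3(m-2)^2 = sum of the three squares (2mx - m + 2)^2."""
    n = polygonal(m, t) + polygonal(m, u) + polygonal(m, v)
    lhs = 8 * m * n + 3 * (m - 2) ** 2
    rhs = sum((2 * m * x - m + 2) ** 2 for x in (t, u, v))
    return lhs == rhs


if __name__ == "__main__":
    for m in (6, 10, 14):
        r = excluded_class(m)
        reps = sums_of_three(m, 2000)
        bad = [n for n in range(r, 2001, 8) if n > 0 and n in reps]
        print(m, r, bad)
```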

By Theorem 3, $F(m+2) \le 5$ for m even, and so $F(m+2) = 4$ or $5$ if $m+2 \equiv 0 \pmod 4$. This concludes the proof.

The exact value of F(m+2) is not known for any m > 3.

References.

[1] A. Cauchy, Démonstration du théorème général de Fermat sur les nombres polygones, Mém. Sc. Math. et Phys. de l'Institut de France, (1) 14 (1813-15), 177-220 = Oeuvres, (2) vol. 6, 320-353.

[2] P. Fermat, quoted in T. L. Heath, Diophantus of Alexandria, Dover: New York, 1964, p. 188.

[3] C. F. Gauss, Disquisitiones Arithmeticae, Yale University Press: New Haven and London, 1966.

[4] J. L. Lagrange, Démonstration d'un théorème d'arithmétique, Nouveaux Mémoires de l'Acad. royale des Sc. et Belles-L. de Berlin, 1770, pp. 123-133 = Oeuvres, vol. 3, pp. 189-201.

[5] A.-M. Legendre, Théorie des nombres, 3rd ed., vol. 2, 1830, pp. 331-356.

[6] T. Pepin, Démonstration du théorème de Fermat sur les nombres polygones, Atti Accad. Pont. Nuovi Lincei 46 (1892-3), 119-131.

[7] J. V. Uspensky and M. A. Heaslet, Elementary Number Theory, McGraw-Hill: New York and London, 1939.

[8] A. Weil, Sur les sommes de trois et quatre carrés, L'Enseignement Mathématique 20 (1974), 215-222.

[9] A. Weil, Number Theory, An Approach through History from Hammurabi to Legendre, Birkhäuser: Boston, 1983.

Note (added November, 1985). The following two articles are related to the subject of this paper.

L. E. Dickson, All positive integers are sums of values of a quadratic function of x, Bull. Amer. Math. Soc. 33 (1927), 713-720.

G. Pall, Large positive integers are sums of four or five values of a quadratic function, Amer. J. Math. 54 (1932), 66-78.

M. B. Nathanson
Rutgers University,
Newark, New Jersey 07102

Office of the Provost and Vice President


for Academic Affairs,
Lehman College (CUNY),
Bronx, New York 10468, U.S.A.
ON THE DENSITY OF B2-BASES

Andrew D. Pollington

A sequence A of positive integers is called a Sidon sequence or a B2-sequence if the pairwise sums are all distinct. If, in addition, every non-zero integer appears in the set of differences we call A a B2-basis.
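To make the two definitions concrete, here is a small sketch (an illustration, not the construction used in this note; all function names are hypothetical). It builds a Sidon sequence greedily, always taking the least integer that keeps all pairwise sums $a_i + a_j$, $i \le j$, distinct, and lists which positive differences an initial segment already realizes.

```python
def greedy_sidon(count):
    """First `count` terms of the greedy Sidon (B2) sequence starting from 1."""
    seq, sums = [], set()
    candidate = 1
    while len(seq) < count:
        # sums created by adding `candidate`; they are distinct among themselves
        new_sums = {candidate + a for a in seq} | {2 * candidate}
        if sums.isdisjoint(new_sums):
            seq.append(candidate)
            sums |= new_sums
        candidate += 1
    return seq


def is_sidon(seq):
    """True when all sums a_i + a_j with i <= j are distinct."""
    seen = set()
    for i, a in enumerate(seq):
        for b in seq[i:]:
            if a + b in seen:
                return False
            seen.add(a + b)
    return True


def positive_differences(seq):
    """Set of positive differences a - a' with a, a' in seq."""
    return {a - b for a in seq for b in seq if a > b}


if __name__ == "__main__":
    terms = greedy_sidon(10)
    diffs = positive_differences(terms)
    missing = [d for d in range(1, max(diffs) + 1) if d not in diffs]
    print(terms, missing)
```

A finite segment of a Sidon sequence generally misses some differences; a B2-basis is a Sidon sequence in which every such gap is eventually filled.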
Let A(n) denote the number of elements of A not exceeding n. Erdős, see [2], has shown that $\liminf n^{-1/2} A(n) = 0$ for every B2-sequence A, and that there is a B2-sequence A satisfying

$$\limsup n^{-1/2} A(n) \ge 1/2.$$

In 1981, Ajtai, Komlós and Szemerédi [1] gave a random construction of a Sidon sequence for which

$$A(n) > 10^{-3}(n \log n)^{1/3} \quad \text{for all } n > n_0.$$

It is the purpose of this note to show that the same results can be obtained for B2-bases.

Proof. Following Erdős, [2], p. 90, let $A_p$, p a prime, denote the set of numbers

$k = 1, 2, \ldots, p-1$ (1)

where $(k^2)_p$ is the least positive residue of $k^2$ mod p. Then $A_p$ is a B2-sequence. If $a, a' \in A_p$, $a \ne a'$, then

$$p < |a - a'| < 2p^2 - p. \tag{2}$$

Let P denote a sequence of primes PI < P2 < ••• , for which

(3)
Put
V~ (4)
where

m is the least positive integer which is not in $V_n^* - V_n^*$ and $b_n$ is the least positive integer for which neither $b_n$ nor $b_n + m$ is of the form $a_i + a_j - a_k$, $a_i, a_j, a_k \in V_n^*$. Then $b_n + m < 2|V_n^*|^3$. Since $|V_n^*| \le \sum_{i=1}^{n}(p_i + 1) < p_n^2$, we have $b_n + m < 2p_n^6$. Clearly if $V_n^*$ is a B2-sequence then so is $V_n$.

PJtoo6. We use induction an n. Since V1 -- .API ' Now


suppose that Vn- I is B2 • It suffices to show that

(5)

with aI' a2 > a3 > a4 cannot hold.

If (5) holds then ai C ~ and a4 C Vn-I'


n
If a3 c t),n' then

a i - a 2 .. 2Pn2 - Pn by (2)
and

violating (5).

by (2) and (3)

which again violates (5).


Put $A = \bigcup_{n=1}^{\infty} V_n$. Then A is B2, since $V_1 \subset V_2 \subset \cdots$. A is clearly a B2-basis. For each $p_n \in P$ there are at least $p_n - 1$ elements of A less than $4p_n^2 - p_n$. Hence

$$\limsup A(n)\, n^{-1/2} \ge 1/2.$$

$$A(n) > \frac{1}{10^3}(n \log n)^{1/3} \quad \text{for all } n > n_0.$$

Note. The greedy algorithm gives a B2-basis A, with $A(n) > cn^{1/3}$. Pollington and Vanden Eynden [3] have constructed a B2-basis $a_1 < a_2 < \cdots$ with $a_k \in [c(k-1)^3, ck^3]$ where c is a fixed constant.
Theorem 2 follows immediately from a slight adaptation of the random construction of a B2-sequence given by Ajtai, Komlós and Szemerédi [1]. If $x \le y$ are positive integers, then the triple (x, y, x + y) is called a general triangle. To obtain their B2-sequence, Ajtai et al. construct a sequence $B_i$ of sets of positive integers with the following properties:

i) Bi is a subset of the interval [2.10 i , 3.10 i )

ii) 1Bil= [1~0 i 1/ 3 10i/3]

iii) Bi is B2

iv) the set $A_i = \bigcup_{j \le i} B_j$ generates less than $10^{1.26i}$ general triangles

v) for no pair b, b'E Bi , b > b', is the difference b - b' in

We can use the same construction, except that infinitely often we choose to replace $B_i$ by a pair $\{b_i, b_i + m_i\}$, where, as in Theorem 1, $m_i$ is the least positive integer not in $A_{i-1} - A_{i-1}$ and $b_i$ is chosen so that $b_i \in [2 \cdot 10^i, 3 \cdot 10^i]$. If this change is made sufficiently infrequently we still have

$$A(n) > \frac{1}{10^3}(n \log n)^{1/3} \quad \text{for all } n > n_0,$$

but now $\bigcup_{i=1}^{\infty} A_i$ is a B2-basis.

References.

[1] Ajtai, Komlós, Szemerédi, A dense infinite Sidon sequence, Europ. J. Combinatorics (1981) 2, 1-11.

[2] Halberstam and Roth, Sequences, Oxford University Press, 1966.

[3] Pollington and Vanden Eynden, The integers as differences of a sequence, Canad. Math. Bull. Vol. 24 (4), 1981, 497-499.

A. D. Pollington
386 TMCB
Brigham Young University
Provo, Utah 84601
USA
STATISTICAL PROPERTIES OF EIGENVALUES OF THE HECKE OPERATORS

Peter Sarnak

0. Introduction.

Two basic questions concerning the Ramanujan τ-function concern the size and variation of these numbers:

(i) Ramanujan conjecture: $|\tau(p)| \le 2p^{11/2}$ for all primes p.

(ii) "Sato-Tate" conjecture: $\alpha_p = \tau(p)/p^{11/2}$ is equidistributed with respect to

$$d\mu(x) = \begin{cases} \dfrac{1}{2\pi}\sqrt{4 - x^2}\,dx & \text{if } |x| \le 2 \\ 0 & \text{otherwise} \end{cases}$$

as $p \to \infty$. We refer to the last as the semicircle distribution.


Concerning the above the following is known: (i) has been proved by Deligne [1]. However its generalization to a general GL(2) cusp form, as well as to more general groups, is far from being solved. (ii) This conjecture is motivated by related questions for L-functions of elliptic curves [8]. It is conjectured to be true for τ(p) as well as for "typical" cusp forms in GL(2). It certainly does not hold for all cusp forms and we will consider this again later. Our aim here is to outline results which prove averaged versions of (i) and (ii) in general.
I have benefited immeasurably from discussions with R. Phillips and I. Piatetski-Shapiro and some of the results quoted here are from joint work with them.

1. Classical Hecke Operators.


We begin by considering the simplest example of Hecke operators. Let $\Gamma = SL(2, \mathbb{Z})$ and $h = \{z \mid \operatorname{Im} z > 0\}$. Let H be the Hilbert space $L^2(\Gamma \backslash h)$, that is, of all Γ-invariant functions on h which are square summable over a fundamental domain F for Γ with respect to $d\omega(z) = \dfrac{dx\,dy}{y^2}$. The operators in question are then defined by

(1.1)

for n = 1, 2, ....
It is well known that $\{T_n\}$ forms a commutative family of self-adjoint operators. Furthermore H decomposes into Hecke invariant subspaces

$$H = \{1\} \oplus E \oplus \mathrm{Cusp}$$

where {1} spans the constant functions, E is spanned by Eisenstein series [3] and Cusp is orthogonal to these and consists of cuspidal functions. On Cusp we have a simultaneous orthonormal basis of $\{T_n\}$ which we denote by $u_j(z)$:

$$T_p u_j = \rho_j(p) u_j, \qquad \Delta u_j + \left(\tfrac{1}{4} + r_j^2\right) u_j = 0, \tag{1.2}$$

where, writing $\lambda_j = \tfrac{1}{4} + r_j^2$, we have $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$. Thus we use the λ's to order the $u_j$'s.
For these cusp forms $u_j$, very little is known about $\rho_j(p)$ or $r_j$. Very interesting computations of $\rho_1(p)$ for p < 1000 and $r_j$ for small j appear in Stark [10] and Hejhal [3]. For these, the Ramanujan conjecture takes the form

$$|\rho_j(p)| \le 2 \tag{1.3}$$

for all j and primes p.


We note that since the Ramanujan conjecture holds for the Eisenstein series $E(z, \tfrac{1}{2} + it)$, as one checks easily by a calculation, we can restate the Ramanujan conjecture purely in terms of the spectrum of $T_p$. Thus the following is equivalent to (1.3). For p a prime,

$$|\langle T_p f, f \rangle| \le 2\langle f, f \rangle \quad \text{for all } f \in L^2(\Gamma \backslash h) \text{ for which } \langle f, 1 \rangle = 0. \tag{1.3'}$$

Put another way $\sigma(T_p|_{\{1\}^{\perp}}) \subset [-2, 2]$. Here σ(T) is the spectrum of T. On the other hand $T_p 1 = (p^{1/2} + p^{-1/2})\,1$ and indeed

$$n(p) := \|T_p\| = p^{1/2} + p^{-1/2} > 2. \tag{1.4}$$
P

It is known that

$$|\rho_j(p)| \le 2(p^{1/5} + p^{-1/5}). \tag{1.5}$$

(This was communicated to the author in a letter from S.J. Patterson, 1981.)

Definition 1.6. Let X be a topological space. We say that a sequence $x_j$ in X is μ-equidistributed, where μ is a Radon measure on X, if for all $f \in C_c(X)$,

$$\frac{1}{N} \sum_{j \le N} f(x_j) \to \int_X f(x)\,d\mu(x) \quad \text{as } N \to \infty. \tag{1.6}$$

The Sato-Tate conjecture for the numbers $\rho_j(p)$ states that for fixed j, $\rho_j(p)$ is μ-equidistributed, where μ is the semicircle distribution.
Our approach here is to study these questions concerning $\rho_j(p)$ in both variables j and p. Thus we consider seriously the operator $T_p|_{\mathrm{Cusp}}$, i.e. the variation in j for fixed p. Our first result is a density result concerning the number of exceptions $T_p$ may have to the Ramanujan conjecture. We recall Weyl's law, see Selberg [9]:

$$N(K) = \#\{r_j \le K\} \sim \frac{1}{12} K^2. \tag{1.7}$$

For $\alpha \ge 2$ (and p fixed) we set

$$N(\alpha, K) = \#\{j \mid r_j \le K,\ |\rho_j(p)| \ge \alpha\}.$$

Theorem 1.1.

$$N(\alpha, K) \ll K^{2 - \frac{\log(\alpha/2)}{\log p}}.$$

In particular almost all $\rho_j(p)$ (in the sense of density in j) lie in [-2, 2].

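Concerning the variation of the $\rho_j(p)$ in j and p, let $x_j = (\rho_j(2), \rho_j(3), \rho_j(5), \ldots)$, so that $x_j \in X = \prod_p [-n(p), n(p)]$.

Theorem 1.2. $\{x_j\}$, j = 1, 2, ..., is μ equidistributed in X, where $\mu = \prod_p \mu_p$ and

$$d\mu_p(x) = \begin{cases} \dfrac{(p+1)\sqrt{4 - x^2}}{2\pi\left(n(p)^2 - x^2\right)}\,dx & \text{if } |x| < 2 \\ 0 & \text{otherwise.} \end{cases}$$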

The following Corollary was first proved by Phillips and Sarnak [7] by completely different methods. In that paper approximate eigenfunctions for $T_p$ were constructed directly.

Corollary 1.3. Let $\alpha_m, \beta_m$, m = 1, 2, ..., k be numbers satisfying $-2 \le \alpha_m < \beta_m \le 2$ and let $p_1, p_2, \ldots, p_k$ be k primes. Then

$$\lim_{K \to \infty} \frac{1}{K^2}\, \#\{r_j \le K \mid \rho_j(p_m) \in [\alpha_m, \beta_m],\ m = 1, \ldots, k\} > 0.$$

It follows that any given finite sequence of numbers satisfying the Ramanujan bound may be approximated by the eigenvalues of a cusp form.

In the above we study the behavior of $\rho_j(p)$ as a vector in p as $j \to \infty$. If, as expected, the Sato-Tate conjecture holds for each j, we might hope that the interchange of the two limits would agree. It is clear that

$$\lim_{p \to \infty} \mu_p = \text{the semicircle distribution!}$$

What this shows is that in this way of averaging the numbers $\rho_j(p)$, we do have equidistribution with respect to the semicircle. There are obvious advantages in averaging over j, since if for example we consider cusp forms for $\Gamma_0(N)$, N > 1, then there is a subset of the j's (the number of which with $r_j \le K$ is of order K) for which the Sato-Tate conjecture is false. These are cusp forms coming from the Maass-Hecke construction [4]. Of course these disappear in our averaging and indeed we still find that the generic cusp form has the semicircle behaviour. These Maass-Hecke cusp forms have their eigenvalues equidistributed with respect to $\mu_p$ above, with p = 1! The measures $\mu_p$ therefore interpolate between this distribution at p = 1, and the semicircle at $p = \infty$.

A final comment concerning the semicircle. As $p \to \infty$ the operators $T_p$ are presumably becoming random, at least that is what we are showing. For it is known that the eigenvalues of a random Hermitian matrix, whose size tends to ∞, become distributed according to the semicircle distribution. This is due to Wigner (see [6]) and is known as the Wigner semicircle law.
We will discuss the general case in Section 4. We first turn to a general phenomenon which is at the heart of the above considerations.

2. A Weyl Law.
In this section we describe an extension of the classical Weyl theorem on eigenvalues of the Laplacian to the case where we have a family of operators commuting with the Laplacian. Let M be a compact Riemannian manifold and $\tilde{M} = S$ its universal cover. Let G be the isometry group of S, so that $\Gamma = \pi_1(M)$ is a discrete subgroup of G. Δ will denote the Laplacian on M or S. Now suppose we are given a family of operators $T_1, T_2, \ldots$ on $L^2(M)$ for which the family $\Delta, T_1, T_2, \ldots$ is commutative. We take the $T_j$ to be bounded, with say $\|T_k\| = n_k$. We may then simultaneously diagonalize the family:

$$T_k u_j = \rho_j(k) u_j, \qquad -\Delta u_j = \lambda_j u_j, \tag{2.1}$$

where $\{u_j\}_{j=1,2,\ldots}$ is an orthonormal basis for $L^2(M)$, and the $u_j$ are ordered by increasing $\lambda_j$. The asymptotics of $\lambda_j$ is well known, this being Weyl's law

$$N(\lambda) = \#\{\lambda_j \le \lambda\} \sim C\lambda^{n/2} \tag{2.2}$$

where C is an appropriate non-zero constant and n = dim M. Let $B_k = \{z \in \mathbb{C} \mid |z| \le n_k\}$ and

$$X = \prod_k B_k. \tag{2.3}$$

For j = 1, 2, ... we obtain a point $x_j$ in X where $x_j = (\rho_j(1), \rho_j(2), \ldots)$.

The question is how do these $x_j$'s distribute themselves in X as $j \to \infty$? To obtain an answer we assume further that the $T_k$'s are "Hecke like" operators. So we assume $T_k$ to be selfadjoint (normal would suffice) and of the form

(2.4)

where $S_i^{(k)} \in G$. The important assumption is that $T_k : L^2(\Gamma \backslash S) \to L^2(\Gamma \backslash S)$, which can be arranged with appropriate $S_i^{(k)}$ if the commensurator of Γ in G is non-trivial [11]. For $\nu \in \mathbb{N}^r$ let

$$M(\nu_1, \ldots, \nu_r) = \text{the number of words of the type } w_1 w_2 \cdots w_r \equiv I \ (\mathrm{mod}\ \Gamma), \tag{2.5}$$

where $w_k$ is a word in $S_1^{(k)}, S_2^{(k)}, \ldots, S_{n(k)}^{(k)}$ of length $\nu_k$.

In this case, since we are assuming that the $T_k$'s are self-adjoint, our space X in (2.3) is a product of intervals.

Theorem 2.1. Let $T_k$ be as above, then the sequence $\{x_j\}_{j=1,2,\ldots} \in X$ is μ equidistributed, where μ is the measure given by the moments

Notice that since X is compact, one sees easily that μ exists and is unique. We now examine some simple instances of the above theorem.

Example 2.2. Suppose that the original manifold M admits a non-trivial isometry $S : M \to M$ of order k (k may be infinite). Let $T : L^2 \to L^2$ be the unitary operator given by

$$Tf(x) = f(Sx).$$

T commutes with Δ; let $u_j$ be as above with

$$T u_j = \omega_j u_j, \qquad j = 1, 2, \ldots$$

Clearly $|\omega_j| = 1$. The theorem then asserts that $\omega_j$ is μ-equidistributed on the circle where

(i) μ puts mass 1/k at the k-th roots of 1 if $k < \infty$.

(ii) μ is $d\theta/2\pi$ on the circle, if $k = \infty$.

Example 2.3. $M = S^1 = \mathbb{R}/\mathbb{Z}$, $\Delta = \dfrac{d^2}{dx^2}$, $u_j(x) = e^{2\pi i j x}$. Let $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$ and $T_k(x) = x + \alpha_k$. In this case $\rho_j(k) = e^{2\pi i j \alpha_k}$. The theorem thus asserts that the sequence $j(\alpha_1, \alpha_2, \ldots, \alpha_k)$, j = 1, 2, ..., is μ-equidistributed in the k-torus. Clearly $M(\nu_1, \ldots, \nu_k) = 0$ if $1, \alpha_1, \alpha_2, \ldots, \alpha_k$ are linearly independent over $\mathbb{Q}$, so that in this case the sequence is equidistributed with respect to Lebesgue measure. This is the well known result of Weyl [12].
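Weyl's criterion behind Example 2.3 can be observed numerically in the one-dimensional case k = 1. The sketch below is an illustration only (the function name and the choice $\alpha = \sqrt{2}$ are assumptions made for the example): it evaluates the normalized exponential sums $\frac{1}{N}\left|\sum_{j \le N} e^{2\pi i h j \alpha}\right|$, which tend to 0 for every integer $h \ne 0$ when α is irrational and stay near 1 when $h\alpha$ is an integer.

```python
import cmath
import math


def weyl_sum(alpha, h, N):
    """Normalized exponential sum (1/N) |sum_{j=1}^{N} exp(2 pi i h j alpha)|."""
    total = sum(cmath.exp(2j * math.pi * h * j * alpha) for j in range(1, N + 1))
    return abs(total) / N


if __name__ == "__main__":
    alpha = math.sqrt(2)  # irrational: the sums should be small for every h != 0
    for h in range(1, 6):
        print(h, weyl_sum(alpha, h, 20000))
    # rational alpha with h * alpha an integer: no cancellation at all
    print(weyl_sum(0.5, 2, 1000))
```

For irrational α the geometric series gives the bound $1/(N|\sin \pi h\alpha|)$, which is why the printed values shrink like 1/N.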

The main application of the theorem is however to the Hecke operators in symmetric spaces. In the case of $\Gamma = SL(2, \mathbb{Z})$ as in Section 1, there are added complications in the proof of the above type of theorem due to the noncompactness. We will outline the proof in that case in the next section. The proof of Theorem 2.1 in the general case combines the ideas outlined in the next section with the standard derivation of Weyl's law via differential equation methods, e.g. small time behavior of the fundamental solution to the wave equation on M.

In the $\Gamma = SL(2, \mathbb{Z})$ case of Section 1, if we ignore the difficulties coming from the Eisenstein series (which in this case are not difficult to overcome) we can compute the number M(ν) for $T_p$ quite easily from the well known identity

$$T_p T_{p^n} = T_{p^{n+1}} + T_{p^{n-1}}.$$

We find

$$M(\nu) = \begin{cases} 0 & \text{if } \nu \text{ is odd} \\ \displaystyle\sum_{j=0}^{n} \left( \binom{2n}{n-j} - \binom{2n}{n-j-1} \right) p^{-j} & \text{if } \nu = 2n. \end{cases}$$

The inverse moment problem is easily solved, giving the $\mu_p$'s in Theorem 1.2. The fact that μ is a product of the $\mu_p$'s follows from the multiplicative property of the Hecke operators.
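The relation between the word counts and the measures of Theorem 1.2 can be tested numerically. The sketch below is an illustration under stated assumptions: it takes the density of $\mu_p$ to be $(p+1)\sqrt{4-x^2}\,/\,(2\pi(n(p)^2 - x^2))$ on $|x| < 2$ with $n(p) = p^{1/2} + p^{-1/2}$, takes the moments to be the sums of differences of binomial coefficients weighted by $p^{-j}$, and compares the two; the function names are hypothetical. The substitution $x = 2\cos\theta$ turns each moment integral into the integral of a smooth periodic function, for which the midpoint rule is very accurate.

```python
import math


def mu_p_moment(p, nu, steps=4000):
    """nu-th moment of the assumed density of mu_p, via x = 2 cos(theta) and the midpoint rule."""
    n_p = math.sqrt(p) + 1 / math.sqrt(p)
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * math.pi / steps
        x = 2 * math.cos(theta)
        # density(x) |dx| = (p+1) * 4 sin^2(theta) / (2 pi (n_p^2 - x^2)) d(theta)
        weight = (p + 1) * 4 * math.sin(theta) ** 2 / (2 * math.pi * (n_p ** 2 - x * x))
        total += x ** nu * weight
    return total * math.pi / steps


def word_count_moment(p, nu):
    """Moment predicted by the word-count formula: zero for odd nu, a ballot-number sum for nu = 2n."""
    if nu % 2:
        return 0.0
    n = nu // 2

    def binom(a, b):
        return math.comb(a, b) if 0 <= b <= a else 0

    return sum((binom(2 * n, n - j) - binom(2 * n, n - j - 1)) / p ** j for j in range(n + 1))


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        for nu in range(0, 9):
            print(p, nu, mu_p_moment(p, nu), word_count_moment(p, nu))
```

The zeroth moment confirms that each $\mu_p$ has total mass 1, and as p grows the even moments approach the Catalan numbers, the moments of the semicircle distribution.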

3. Outline of Proofs.

We now outline proofs of the results in Section 1; details will appear elsewhere. The basic ingredient is the Selberg trace formula but it is not the full formula that is needed. Indeed such a formula cannot be used to prove Theorem 2.1. Basically what we need is the "singularity at 0" in the trace formula.

Consider the case of $\Gamma = SL(2, \mathbb{Z})$. Let $k(z, \zeta)$ be a point pair invariant [3], which we assume to have very small support. That is, $k(z, \zeta) = 0$ if $d(z, \zeta) > \varepsilon$, where $d(z, \zeta)$ is the non-Euclidean distance from z to ζ. Let

$$K(z, \zeta) = \sum_{\gamma \in \Gamma} k(z, \gamma\zeta). \tag{3.1}$$

We have the spectral expansion [3]

$$K(z, \zeta) = \sum_j h(r_j)\, u_j(z)\, \overline{u_j(\zeta)} + \frac{1}{4\pi} \int_{-\infty}^{\infty} h(t)\, E(z, \tfrac{1}{2} + it)\, \overline{E(\zeta, \tfrac{1}{2} + it)}\,dt. \tag{3.2}$$

For what follows we ignore the contribution from the Eisenstein series since in this case, as was mentioned before, they are known explicitly and may be dealt with easily. It follows that

(3.3)

and hence

(3.4)

However one can calculate $[T_p^{\nu} K(z, \zeta)]_{z = \zeta}$ asymptotically as $\varepsilon \to 0$, so that unless ζ is the fixed point of some $\gamma S_{i_1} S_{i_2} \cdots S_{i_\nu}$ the above is zero for ε small enough.
On integrating with respect to ζ one finds the main contribution comes from exactly those $S_{i_1} S_{i_2} \cdots S_{i_\nu} \equiv I \pmod{\Gamma}$. This, combined with (3.4), leads naturally to the asymptotics

(3.5)

Theorems 2.1 and 1.2 follow from this type of argument. If one is more careful in the analysis in the case $\Gamma = SL(2, \mathbb{Z})$, and keeps track of all contributions above, one finds: (i) that the contribution from the continuous spectrum is controlled by the constant term of the Eisenstein series, which is essentially the zeta function; (ii) that the number of terms $\gamma S_{i_1} S_{i_2} \cdots S_{i_\nu}$ with fixed points in F is easily majorized by elementary bounds for class numbers of binary quadratic forms. This leads to the inequality:

K > pk => I Ip j (p)1 2k ( 2kK2 + p2k2k (3.6)


Ir.1 (K
]

Theorem 1.1 is an immediate consequence.

4. General Case.
The results in this section are joint with I. Piatetski-Shapiro. The first thing to observe is that the measures $\mu_p$ are none other than the spherical Plancherel measures for $SL_2(\mathbb{Q}_p)$, see for example MacDonald [5]. He uses the variable θ where $x = 2\cos\theta$. One may also see that this is so by carrying out the above proof using the adelic trace formula for $GL_2(\mathbb{Q}) \backslash GL_2(\mathbb{A})$ [2].

The case of a compact quotient, such as that coming from a quaternion algebra and its generalizations, is particularly simple and an analogue of Theorem 1.2 may be proved in complete generality, i.e. for a reductive algebraic group defined over a number field. In this case the existence of a limiting distribution follows from Theorem 2.1 but the point is that one can avoid solving the inverse moment problem, since these limiting distributions are spherical Plancherel measures, which have been computed in complete generality, see MacDonald [5]. In the general noncompact case such as $G = SL(n, \mathbb{R})$, $\Gamma = SL(n, \mathbb{Z})$ there are technical problems coming from the continuous spectrum. We expect the same answer for the limiting distribution, but so far have not been able to verify it in general. For $GL(n, \mathbb{Z})$ the eigenvalues of the p-th Hecke operators on $u_j$ (cusp forms) may be parametrized by $\alpha_j^{(1)}(p), \ldots, \alpha_j^{(n)}(p)$ where $\alpha_j^{(1)} \cdots \alpha_j^{(n)} = 1$. The corresponding limiting distribution for these is the spherical Plancherel measure for $SL(n, \mathbb{Q}_p)$, and lives on the n-1 torus. As in Section 1, one takes the limit $p \to \infty$ of these measures and this turns out to be the measure

$$C_n \prod_{k < j} \left| e^{i\theta_k} - e^{i\theta_j} \right|^2 d\theta_1 \cdots d\theta_{n-1} \tag{4.1}$$

where k, j = 1, 2, ..., n and $\theta_1 + \theta_2 + \cdots + \theta_n = 0$.

This gives a natural generalization of the semicircle or Sato-Tate distribution. Indeed the above results prove this conjecture in the average over the cusp forms (in the sense of Section 1). There are other theoretical ways of arriving at the measure in (4.1); we note in particular that it is the measure obtained by projecting Haar measure on SU(n) to its maximal torus. If n = 2 then the measure (4.1) is $C_2 \sin^2\theta\,d\theta$, which is of course the semicircle distribution for the variable $\rho = 2\cos\theta$.

References.

[1]. Deligne, P., La conjecture de Weil I, Publ. Math. IHES, 43 (1974) 273-307.

[2]. Gelbart, S., Automorphic forms on Adele groups, Annals of Math. Studies, 83, 1975.

[3]. Hejhal, D., The Selberg trace formula for PSL(2,R), Vol. 2, S.L.N. 1001, 1983.

[4]. Maass, H., Über eine neue Art von nichtanalytischen automorphen Funktionen ..., Math. Ann. 121, 1949, pp. 141-183.

[5]. MacDonald, I.G., Spherical functions on groups of p-adic type, TATA Inst. Series, 1971.

[6]. Mehta, M.L., Random matrices, Academic Press, 1967.

[7]. Phillips, R. and Sarnak, P., Preprint.

[8]. Serre, J.P., Abelian ℓ-adic representations, Benjamin, 1968.

[9]. Selberg, A., Göttingen lectures, 1954.

[10]. Stark, H., Fourier coefficients of Maass wave forms, in Modular forms, ed. Rankin, Ellis Horwood, 1985.

[11]. Venkov, A.B., Spectral theory of automorphic forms, Proc. Steklov Inst. 1982, No. 4 (English translation).

[12]. Weyl, H., Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77, 1914, 313-352.

P.Sarnak,
Stanford University,
Stanford, CA, 94305 U.S.A.
TRANSCENDENCE THEORY OVER
NON-LOCAL FIELDS

Hans-Bernd Sieburg

1. Summary.

For any commutative ring R let Val(R) denote the set of all multiplicative real valuations. Let $\delta : \mathrm{Val}(R) \to \mathbb{R}$ denote the map given by $\varphi \mapsto \delta(\varphi) := \inf\{\varphi(a) : a \in R,\ a \ne 0\}$. Here $\mathbb{R}$ is the field of real numbers. In the first part of the present paper we show that for $\delta(\varphi) > 0$ the quotient field of R "is" either an algebraic extension of the field $\mathbb{Q}$ of rational numbers, if and only if φ is Archimedian, or an algebraic extension of a rational function field in arbitrarily many variables, if and only if φ is non-Archimedian. Local fields are contained in the class of rings (R, φ) with $\delta(\varphi) = 0$.

The second part of the paper is devoted to transcendence questions over groundfields k which are quotient fields of non-Archimedian valued rings (R, φ) with $\delta(\varphi) > 0$. Our results include axiomatic formulations of the methods of Schneider, Gelfond and Baker. We also derive transcendence measures for certain elements of the completion of k.

2. Classification of groundfields.

Let the notation be as above. The trivial valuation 1 given by $1(a) = 1$ for $a \ne 0$ and $1(0) = 0$ has $\delta(1) = 1$. To provide less trivial examples we consider $R = \mathbb{Z}$, the ring of rational integers. For a fixed prime number p let $|\ |_p$ denote the p-adic valuation. Then $\delta(|\ |_p) = 0$. If $|\ |$ denotes the ordinary absolute value on $\mathbb{Z}$ then $\delta(|\ |) = 1$. Furthermore, let $d \in \mathbb{Z}$, $d \ne 0$ and not a
square. Let I 11' I 12 denote the extensions of I to Z[ idle
Then, for i = I, 2, o( I Ii) = fci - 1 or 1 depending on d >0 or
$d < 0$. Finally, consider $R = A[X_1, \ldots, X_m]$, where $m \ge 1$ is an integer and A denotes an arbitrary integral domain. Let $|\ |_\infty$ be the discrete non-Archimedian valuation with $|P|_\infty := e^{\deg(P)}$, where deg(P), for $P \ne 0$, is the total degree of P and $\deg(0) := -\infty$. Then $\delta(|\ |_\infty) = 1$.
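The examples above can be made concrete with a short sketch. It is an illustration only, with hypothetical function names, and for simplicity it treats the degree valuation on polynomials in one variable with integer coefficients. It shows that p-adic absolute values of nonzero integers become arbitrarily small (so the infimum is 0), while the ordinary absolute value is bounded below by 1, and that the degree is additive under multiplication, which is what makes $e^{\deg(P)}$ multiplicative.

```python
from fractions import Fraction


def padic_abs(n, p):
    """p-adic absolute value |n|_p of an integer n, as an exact fraction."""
    if n == 0:
        return Fraction(0)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return Fraction(1, p ** v)


def degree(poly):
    """Degree of a polynomial given as a coefficient list [c0, c1, ...]; None stands for -infinity."""
    nonzero = [i for i, c in enumerate(poly) if c != 0]
    return nonzero[-1] if nonzero else None


def poly_mul(f, g):
    """Product of two polynomials given as non-empty coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


if __name__ == "__main__":
    # the smallest 3-adic absolute value among 1..3^k shrinks as k grows
    for k in range(1, 5):
        print(k, min(padic_abs(n, 3) for n in range(1, 3 ** k + 1)))
    # the ordinary absolute value of a nonzero integer is at least 1
    print(min(abs(n) for n in range(-50, 51) if n != 0))
    # deg(PQ) = deg(P) + deg(Q) over an integral domain
    print(degree(poly_mul([1, 2], [0, 0, 3])))
```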

Definition. Let D denote the class of all commutative rings R having a real multiplicative valuation φ such that $\delta(\varphi) > 0$ and $\varphi(b) > \dfrac{1}{\delta(\varphi)}$ for at least one $b \in R$.

These rings have the following properties.

Lemma 1. Let $(R, \varphi) \in D$. Then
(1) R is an infinite, non-trivially valued integral domain which is not a field,
(2) R is not complete under φ.

The proofs are simple and can therefore be omitted.

Let $\mathfrak{D}$ denote the class of all valued fields which are quotient fields of rings in D. The following result classifies the Archimedian and non-Archimedian members of $\mathfrak{D}$.

Proposition 1. Let $(k, \varphi) \in \mathfrak{D}$. φ is Archimedian if, and only if, $k|k'$ is a purely algebraic extension for every subfield k' of k. φ is non-Archimedian if, and only if, there exists a subfield k' of k such that the extension $k|k'$ is transcendental.

Proof. Obviously it suffices to prove the second assertion only. Suppose there exists a subfield k' of k and $z \in k$ such that z is transcendental over k'. The subfield k'(z) has only non-Archimedian valuations. Thus the restriction of φ to k'(z) is non-Archimedian and therefore so is φ itself.
Conversely, let φ be non-Archimedian. Let (R, φ) denote the ring of (k, φ). Since φ is non-Archimedian, there exist proper subfields F of k such that $\varphi(F^*) \subset [\delta(\varphi), \tfrac{1}{\delta(\varphi)}]$ (at least the prime field is such an F). Here $F^* := F - \{0\}$ and [a, b], with a and b in $\mathbb{R}$, denotes an interval. For fixed F suppose $k|F$ were purely algebraic. Then, for every α in k, there exists a positive integer n and $0 \ne P := \sum_{i=0}^{n} a_i X^i \in F[X]$, $a_n = 1$, such that P(α) = 0. Then

$$\varphi(\alpha)^n \le \max_{0 \le j \le n-1} \varphi(a_j)\,\varphi(\alpha)^j \le \frac{1}{\delta(\varphi)} \max_{0 \le j \le n-1} \varphi(\alpha)^j.$$

This shows $\varphi(\alpha) \le \dfrac{1}{\delta(\varphi)}$. Since α was arbitrary, we have the above inequality especially for all $\alpha \in R$, a contradiction.

Remark. Proposition 1 shows that $(k, \varphi) \in \mathfrak{D}$ with φ Archimedian, iff k is a purely algebraic extension of $\mathbb{Q}$, whereas φ is non-Archimedian, iff k is an algebraic extension of a rational function field in arbitrarily many variables over a field F with $\delta(\varphi) \le \varphi(a) \le \dfrac{1}{\delta(\varphi)}$ for all $a \in F$, $a \ne 0$.

Arguments analogous to those used in [Sie. 4, Sec. 2] immediately show

Lemma 2. Let $(k, \varphi) \in \mathfrak{D}$ and let $k_\varphi$ denote the completion of k under φ. Then $k_\varphi \notin \mathfrak{D}$, thus k cannot be complete.

Remark. This shows that all local fields are contained in the class of all rings R with real multiplicative valuation φ satisfying $\delta(\varphi) = 0$. $\mathfrak{D}$ contains all global fields.

3. Transcendence results.

Let $(k, \varphi) \in \mathfrak{D}$ be fixed. From the viewpoint of transcendence theory it is sufficient to consider as groundfields

$k = \mathbb{Q}$, iff φ is Archimedian,

$k = F(S)$, iff φ is non-Archimedian.

Here F denotes a subfield of k with $\delta(\varphi) \le \varphi(a) \le \dfrac{1}{\delta(\varphi)}$ for all $0 \ne a \in F$, and S is a non-empty transcendence basis of k over F (see Prop. 1, proof). The Archimedian case is classically well-known. Therefore we will consider non-Archimedian φ only. Hence, for the rest of this paper we can make the

GENERAL ASSUMPTIONS: $R = F[S]$, $k = F(S)$, $\varphi = |\ |_\infty$.

We need some additional notations. Let (K,~) denote an


algebraically closed, complete extension of k. Let R+ (N resp.)
denote the set of all positive real numbers (positive rational
integers) and ~et R+,o := R+ u {O} (No := N u {O}). Let
LK : = {L a i X1 : a. E K} denote the K[ Xl-module of formal Laurent
iE Z 1
series with coefficients in K and let K[[xll denote the integral
domain of formal power series over K. For t t R+,o let

K[ [xli iff t = 0

{f ELK: lim ~(ai)ti o) iff t F O.


lil+oo
and let PK
t := Lt n K[ [xlI.K K
Futhermore, for every f t L and fixed
t t R+,o define

iff t 0

i E Z} iff t F 0

Let A(K) (T(K) resp.) denote the set of all algebraic (transcendental) elements of K over k. For every $\alpha \in A(K)$ let $f_\alpha \in k[X]$ denote the minimal polynomial of α, $\deg(\alpha) := \deg(f_\alpha)$ denote the degree of α, and $D_\alpha := \{x \in R : \alpha x \in I(R, K)\}$ denote the denominator ideal of α, where I(R, K) is the integral closure of R in K. We have $D_\alpha = (d(\alpha))$ for $0 \ne d(\alpha) \in R$ uniquely determined up to units. The d(α) is called the denominator of α. Let $\alpha \in A(K)$ with $\nu := \deg(\alpha)$. Let $\alpha_1 := \alpha, \alpha_2, \ldots, \alpha_\nu$ denote the conjugates of α. Then

$$\overline{|\alpha|} := \max_{1 \le i \le \nu} \varphi(\alpha_i)$$

denotes the house of α, and

$$s(\alpha) := \begin{cases} 0 & \text{iff } \alpha = 0 \\ \max\{\log \overline{|\alpha|},\ \log \varphi(d(\alpha))\} & \text{iff } \alpha \ne 0 \end{cases}$$

denotes the size of α. We note that s(α) is invariant under changes of d(α).

Finally, for arbitrary field extensions $F_2 | F_1$ let $\operatorname{tdeg}_{F_1} F_2$ be the transcendence degree of $F_2$ over $F_1$.

We can now state the transcendence theorems mentioned above. Their complete proofs for char(k) = 0 can be found in Sie. [3]. It is not difficult to see that, after suitable technical adjustments, they also hold for char(k) > 0. In order to illustrate our approach we will outline the proof of Theorem 1 in Section 5.

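Theorem 1. (Schneider's method). Let $k'|k$ be a finite, separable extension. Let $\ell \in \mathbb{N}$, $r \in \varphi(K) \cap \mathbb{R}_+$ and $f_1, \ldots, f_\ell \in P_r^K$ be algebraically independent over K. Let $r' \in \mathbb{R}_+$ with $r' < r$. Let $(a_n)_{n \in \mathbb{N}}$ denote a sequence of distinct elements of

$$U'_{r'}(0) := \{z \in K : \varphi(z) \le r'\}$$

such that $f_i(a_n) \in k'$ for all $1 \le i \le \ell$ and $n \in \mathbb{N}$. Then

$$\sum_{i=1}^{\ell} \limsup_{T \in \mathbb{N}} \frac{1}{\log T} \log\Big( \max_{1 \le j \le T} s(f_i(a_j)) \Big) \ge \ell - 1.$$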

For technical reasons we will state the applications only for char(k) = 0. Then the exponential and logarithm functions are defined via the usual power series expansions for $z \in K$ such that $\varphi(z) < 1$ and $\varphi(z - 1) < 1$ respectively.

Corollary 1. (Theorem of Gelfond-Schneider). Let char(k) = 0. Let $\alpha, \beta \in K$ with $\beta \notin \mathbb{Q}$, $0 < \varphi(\alpha - 1) < 1$ and $\varphi(\beta)\varphi(\alpha - 1) < 1$. Let $\alpha^\beta := \exp(\beta \log(\alpha))$. Then $\operatorname{tdeg}_k k(\alpha, \beta, \alpha^\beta) \ge 1$.

Pltoo6. Let k' := k(a,l3,a 13 ), r' := max {I, ~(13)} and


>
~

r := (~(log a»)-I. Obviously r r'. Let r € ~(K) n[r' ,r) such


that f 1, f2 € p~, where fl := X, f2 := exp(Xlop;(a». It is not
difficult to see that f 1 , f2 are algebraically independent. Since 13
t Q the elements of the sequence

are all distinct and in U'r'(O).



Suppose that tdegkk' = O. Then k'ik is a finite extension such


that f 1 (].11 + ].128), f 2 (].11 + ].128) .ok' for all (].11,].12) EO N2 • For
T EO If let

:= l /].1max].1 /T s( III + 11 2 8)
... l' 2'"

and

(2)
xT :=

(1) (2)
We find x T .. c 1 and xT .. c2T with suitable constants depending
only on k'. This together with Theorem 1 shows

log (1) log ~2)


.. lim ~ + lim .. 1
T+~ 2 log T T+~ 2 log T 2

a cont radiction.

In a similar way one proves

Corollary 2. Let m, n EO If be ~u.c.h that mn > m+n. Let {ul' ••• ' ~},
{vI' A ••• ' Vn} den.ote Q:-tin.eMly .in.depen.den.t ~u.b~e.t6 06 K ~u.c.h
that ~(uiVj) <1 60~ all 1 .. i .. m, 1 .. j .. n. Then.

Rema4k. The simplest cases are (m,n) (2,3) and (3,2) ("Theorem of
six exponentials").

From Corollary 2 we deduce the following

Corollary 3. Let a EOA(K), 0 < ~(a-l) < 1. Let 8 EOK ~u.c.h that
8 EO A(K) .imptiu deg(8) > 3. Let ~(8vlog(a» < 1 60~ 0 .. v .. 3.
Then.

P~006. Take n = 3, m = 2, u1 = log(a), u2 = 8 log(a), v I I ,



v2 S, v3 = S2 in Corollary 2.

Our second main result is

Theorem 2. (Gelfond's method). LeA: k'ik be a MnUe, .6epMab.f.e


ex~en6~on. Le~ t ~ ., t ( 2, and r ~ ~(K) n R+. Le~ f 1 , ••• ,
ft ~ P~ .6uc.h ~hM M .f.ea.6~ two Me a.f.gebltMc.aU..y ~ndependen~ oveJt
K. Le~ r' ~ R+, r' < r. Le~ (an)n ~ N deno~e a .6equenc.e ~n U'r'(O)
who.6e e.f.emenu Me aU ~~~nc.~ and .6uc.h ~hM fi (an) ~ k' 60Jt aU..
1 ( i ( t and n ~ N. Le~

D( L "l. na Xn-l
n=O n=1 n

deno~e ~he .6~andMd de~vM~ve on K[ [X]].


D(k'[f 1 , ••• ,f t ]) c k'[f 1 , ••• ,f t ]. Then

1
(1) liminf
n
n E .N

16 D opeJtMu on ~e k'-vec.~oJt .6pac.e k' + k'f 1 + ••• + k'f t , ~hen

( 2) liminf max max


n
n E :N 1(j(t l(1.;n

Theorem 2 provides an alternative proof for Corollary 1. In


addition we have

Corollary 4. (Theorem of Hermite-Lindemann). Let char(k) = 0. Let $\alpha \in K$ be such that $0 < \varphi(\alpha) < 1$. Then $\operatorname{tdeg}_k k(\alpha, \exp(\alpha)) \ge 1$.

Proof. Let $k' := k(\alpha, \exp(\alpha))$, $r' := \varphi(\alpha)$. Let $r \in \varphi(K) \cap\, ]r', 1[$ be such that $f_1 := X$ and $f_2 := \exp(X)$ are in $P_r^K$. Then $f_1, f_2$ are algebraically independent over K. For all $n \in \mathbb{N}$, let $a_n := n\alpha \in U'_{r'}(0)$. D operates on $k' + k'f_1 + k'f_2$. We have $s(a_n) \le n\,s(\alpha)$ and $s(\exp(a_n)) \le n\,s(\exp(\alpha))$ for all $n \in \mathbb{N}$ and therefore there exists a $C \in \mathbb{R}_+$ such that

$$A := \limsup_{n \in \mathbb{N}} \frac{1}{n} \max\Big\{ \max_{1 \le j \le n} s(j\alpha),\ \max_{1 \le j \le n} s(\exp(j\alpha)) \Big\} \le C.$$

If $\operatorname{tdeg}_k k' = 0$ then by Theorem 2, $A = +\infty$, a contradiction.

For any $\alpha \in A(K)$ we call H(α) the height of α, which is defined as the maximum of the φ-values of the coefficients of the minimal polynomial of α over R.

Theorem 3. (Baker's method). Let char(k) = o. Let n € 110 , d € N,


A, B € t+, A ) 3, B ) 3. Let aI' ••• , an A(K) ~uch that $(a i -1) <

1, deg(a i ) .; d, H(a i ) .; A 60ft aLe. I .. i .; n. Thefte ex.u,~ an


e66ect-ive COn6tant C € R+, depencUng on1.1j on n and d, ~uch that
eaheft

~(BO + ~ Bo log(ao» > B-C(log(A»2n 2+Sn+8


j=1 J J

60ft aLe. BO' B1 , ••• , Bn € A(K) wah deg(B) .; d and H(B) .; B 60ft
o .; v .; n.

From this we deduce the following appftox-imat-ion me~ufte.

Corollary 5. Let char(k) = 0. Let $n, d \in \mathbb{N}$, $A \in \mathbb{R}_+$, $A \ge 3$. Let
$a_1, \ldots, a_n, B_0, B_1, \ldots, B_n \in A(K)$ with $0 < \varphi(a_i - 1) < 1$
and $\varphi(B_i)\,\varphi(a_i - 1) < 1$ for $1 \le i \le n$. Suppose either $0 < \varphi(B_0) < 1$,
or $1, B_1, \ldots, B_n$ are $\mathbb{Q}$-linearly independent. Let $e^{B_0} = \exp(B_0)$ and
$a_i^{B_i} := \exp(B_i \log(a_i))$ for $1 \le i \le n$. Then there exists an effective
positive real constant $C$, depending only on $n, d, B_0, B_1, \ldots, B_n,
a_1, \ldots, a_n$, such that for all $\eta \in A(K)$ with $\deg(\eta) \le d$ and $H(\eta) \le A$

$$\ldots\; e^{-C(\log(A))^{2n^2+9n+15}}$$

With the usual method, suitably adjusted for our purposes, one
obtains the following transcendence measures for arbitrary
characteristic.

Theorem 4. Let $b \in R$, $\varphi(b) > 1$, be fixed. Let $(c_n)_{n \in \mathbb{N}}$ denote a
sequence of integers such that $c_n \ne 0$ infinitely often. Let

$$a := c_0 + \sum_{n=1}^{\infty} c_n b^{-n!}.$$

Then, for every polynomial $P \in R[X]$ of degree $D \ge 1$ and height $H$ one
has

$$\log(\varphi(P(a))) \ge -51\,\bigl(D^{D-1} + DH\log^2(2H)\bigr).$$

Finally let us note that, using the same methods as in [1] and
[2], we can prove

Theorem 5. (Schanuel's conjecture). Let $(k, \varphi) \in D$, $\varphi$ non-archimedean,
char(k) = 0. Let $n \in \mathbb{N}$ and $a_1, \ldots, a_n \in K$ be $\mathbb{Q}$-linearly
independent. Let $\varphi(a_i) < 1$ for all $1 \le i \le n$. Then

4. Auxiliary results.

For the proof of Theorem 1 we will need the following lemmas.

Lemma 3. (Fundamental inequality). Let $a \in A(K)$, $a \ne 0$. Then

$$\log(\varphi(a)) \ge -2\deg(a)\,s(a).$$

Proof. See [3], Chapter 1.

Lemma 4. (Siegel's lemma). Let $k'|k$ be a finite, separable
extension. Let $m, n \in \mathbb{N}$, $m > n$. Let $a_{ij} \in I(R,k')$, $1 \le i \le n$,
$1 \le j \le m$. Let $S \in \mathbb{R}_+$, $S \ge 1$, be such that $\max_{i,j} |a_{ij}| \le S$. Then
the system

$$\ldots = 0, \quad 1 \le i \le n,$$

Proof. See [3, Chapter 1].
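
The conclusion of Lemma 4 did not survive in this copy. As a rough illustration of what a Siegel-type lemma provides, the following sketch uses the classical version over the rational integers, which is an assumption here and not the $I(R,k')$ version of the text: for integers $a_{ij}$ with $|a_{ij}| \le S$ and $m > n$, the system $\sum_j a_{ij} x_j = 0$ has a non-trivial integer solution with $|x_j| \le (mS)^{n/(m-n)}$. The function names `siegel_bound` and `small_solution` are hypothetical, and the search is plain brute force, suitable only for tiny systems.

```python
from itertools import product


def siegel_bound(n_equations, m_unknowns, S):
    """Largest integer B with B**(m-n) <= (m*S)**n, i.e. floor((m*S)**(n/(m-n))).

    Computed with integer arithmetic only, to avoid floating-point rounding.
    """
    target = (m_unknowns * S) ** n_equations
    d = m_unknowns - n_equations
    B = 0
    while (B + 1) ** d <= target:
        B += 1
    return B


def small_solution(a):
    """Brute-force search for a non-zero integer solution inside the classical box.

    `a` is a list of n rows with m integer entries each, m > n.
    Returns (x, B): a solution tuple x with max |x_j| <= B, and the bound B.
    """
    n, m = len(a), len(a[0])
    if m <= n:
        raise ValueError("need more unknowns than equations")
    S = max(1, max(abs(c) for row in a for c in row))
    B = siegel_bound(n, m, S)
    for x in product(range(-B, B + 1), repeat=m):
        if any(x) and all(sum(c * xj for c, xj in zip(row, x)) == 0 for row in a):
            return x, B
    return None, B  # not expected to happen for integer systems with m > n


if __name__ == "__main__":
    # One equation x1 + 2*x2 + 3*x3 = 0 in three unknowns: S = 3, bound floor(9**(1/2)) = 3.
    x, B = small_solution([[1, 2, 3]])
    print(x, B)
```

In the proof of Theorem 1 the lemma is applied with $n := T$ equations and $m := \prod (G_i + 1)$ unknowns; the sketch only illustrates the shape of the statement.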

From non-Archimedean analysis we need the following two results.

Lemma 5. Let $r \in \mathbb{R}_+$ and $f \in P^K_r$, $f \ne 0$, and put $f = \sum_{n=0}^{\infty} a_n X^n$. Then
$f$ has at most finitely many, namely

$$d^+(f,r) := \max\{\, j \in \mathbb{N}_0 : \varphi(a_j)\,r^j = \|f\|_r \,\}$$

zeros in $U'_r(0)$ (counted with multiplicities).
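
The zero count in Lemma 5 can be tried out in a concrete case. The sketch below assumes the familiar setting of a polynomial with rational coefficients, the $p$-adic absolute value in the role of the valuation, and a radius of the form $r = p^{-k}$; these choices, and the function names, are illustrative assumptions and not taken from the text. Since $|a_j|_p\, r^j = p^{-(v_p(a_j) + kj)}$, the largest index attaining the norm $\|f\|_r$ is the largest index minimizing $v_p(a_j) + kj$.

```python
from fractions import Fraction


def p_adic_valuation(x, p):
    """Exponent v such that x = p**v * u with u a p-adic unit; x must be non-zero."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def zeros_in_closed_disc(coeffs, p, k):
    """d^+(f, r) for f = sum coeffs[j] * X**j and r = p**(-k).

    Returns the largest j for which |a_j|_p * r**j equals the maximum over all j,
    i.e. the largest j minimizing v_p(a_j) + k*j.
    """
    exponents = {
        j: p_adic_valuation(a, p) + k * j
        for j, a in enumerate(coeffs)
        if a != 0
    }
    if not exponents:
        raise ValueError("f must be non-zero")
    best = min(exponents.values())
    return max(j for j, e in exponents.items() if e == best)


if __name__ == "__main__":
    # f = (X - 3)(X - 1) = X^2 - 4X + 3 over the 3-adic numbers.
    f = [3, -4, 1]
    print(zeros_in_closed_disc(f, 3, 0))  # radius 1: both zeros lie in the disc -> 2
    print(zeros_in_closed_disc(f, 3, 1))  # radius 1/3: only the zero 3 -> 1
    print(zeros_in_closed_disc(f, 3, 2))  # radius 1/9: no zeros -> 0
```

The three radii match the direct count, since $|3|_3 = 1/3$ and $|1|_3 = 1$.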

Lemma 6. Let $r, r' \in \mathbb{R}_+$, $r' \le r$. Let $0 \ne f \in P^K_r$ have $h$ zeros
in $U'_{r'}(0)$, $h \in \mathbb{N}_0$. For $\ell \in \mathbb{N}_0$ let $f^{(\ell)}$ denote the $\ell$-th formal
derivative of the power series $f$. Then

5. Proof of Theorem 1.

Let $T \in \mathbb{N}$ be sufficiently large. Put $L = [k':k]$. The
assertion is trivial if there exists $i_0 \in \{1, \ldots, \ell\}$ such that

$$\limsup_{T \in \mathbb{N}} \frac{1}{\log(T)} \log\Bigl(\max_{1 \le j \le T} s(f_{i_0}(a_j))\Bigr) = +\infty.$$

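If for all $1 \le i \le \ell$

$$\limsup_{T \in \mathbb{N}} \frac{1}{\log(T)} \log\Bigl(\max_{1 \le j \le T} s(f_i(a_j))\Bigr) < +\infty,$$

then we show: if $\rho_1, \ldots, \rho_\ell \in \mathbb{R}_+$ are such that

$$\max_{1 \le j \le T} s(f_i(a_j)) \le T^{\rho_i} \quad \text{for all } T \ge T_0,\ 1 \le i \le \ell,$$

then $\sum_{i=1}^{\ell} \rho_i \ge \ell - 1$. A simple argument shows that we can restrict
to the case $\max_{1 \le i \le \ell} \rho_i < \rho + \frac{1}{\ell}$, where $\rho := \frac{1}{\ell} \sum_{i=1}^{\ell} \rho_i$. Let

$$\varepsilon := \rho + \frac{1}{\ell}.$$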
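
A short exponent count, written out here as a sketch, shows why the last quantity is chosen this way. It assumes the reading of this passage in which $\rho$ is the mean $\frac{1}{\ell}\sum_i \rho_i$ and $\varepsilon = \rho + \frac{1}{\ell}$, and it uses the degrees $G_i = [2T^{\varepsilon-\rho_i}]$ introduced in Step 1, with $[x]$ taken to be the integer part of $x$, so that $G_i + 1 > 2T^{\varepsilon-\rho_i}$:

```latex
\[
\prod_{i=1}^{\ell} (G_i + 1)
  \;>\; \prod_{i=1}^{\ell} 2\,T^{\varepsilon-\rho_i}
  \;=\; 2^{\ell}\, T^{\ell\varepsilon - \sum_{i}\rho_i}
  \;=\; 2^{\ell}\, T^{(\ell\rho + 1) - \ell\rho}
  \;=\; 2^{\ell}\, T \;>\; T .
\]
```

So the number of unknown coefficients exceeds the number $T$ of vanishing conditions, which is the hypothesis $m > n$ needed for Siegel's lemma (Lemma 4).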

Step 1. We construct an auxiliary function.

Let $G_i = [2T^{\varepsilon - \rho_i}]$, $1 \le i \le \ell$. We show that there exists a
polynomial

P :=

not the zero-polynomial and with coefficients in $I(R,k')$, such that
all $|p(\lambda_1, \ldots, \lambda_\ell)| \le \exp(c_4 T^{\varepsilon})$ and such that $F := P(f_1, \ldots, f_\ell) \in P^K_r$
vanishes for all $a_j$, $1 \le j \le T$.
From the last condition we obtain a system of $T$ linear forms
with coefficients $(f_1(a_j))^{\lambda_1} \cdots (f_\ell(a_j))^{\lambda_\ell}$ in $k'$ in the
$\prod_{i=1}^{\ell} (G_i + 1) > 2^{\ell} T > T$ unknowns $p(\lambda)$. Let

$$\delta_{ij} := d(f_i(a_j)), \quad 1 \le i \le \ell,\ 1 \le j \le T,$$

$$E_{\lambda,j} := \prod_{i=1}^{\ell} \delta_{ij}^{G_i}\,(f_i(a_j))^{\lambda_i} \in I(R,k').$$

$$\ldots = 0, \quad 1 \le j \le T, \qquad (*)$$

is equivalent to the system

$$\sum_{\lambda} p(\lambda)\,E_{\lambda,j} = 0, \quad 1 \le j \le T, \qquad (**)$$

which has coefficients in $I(R,k')$ satisfying

$$\ldots \le \exp(4\ell T^{\varepsilon}) =: S.$$

Applying Lemma 4 with $m := \prod_{i=1}^{\ell} (G_i + 1)$, $n := T$ and $S$ as above,
we obtain $p(\lambda) \in I(R,k')$, not all zero, which solve $(*) \Leftrightarrow (**)$ and
are such that
Step 2. We construct a suitable non-zero element in $k'$.

Since $f_1, \ldots, f_\ell$ are $K$-algebraically independent, $F$ is not the
zero function. From Lemma 5 we know that $F$ has only finitely many
zeros in $U'_{r'}(0)$. Therefore there is at least one $j$ with $F(a_j) \ne 0$. Let
$T^* := \min\{\, j \in \mathbb{N} : F(a_j) \ne 0 \,\}$. Step 1 shows that $T^* \ge T+1$. Now
define $n_0 = F(a_{T^*})$. By construction $n_0$ is a non-zero element of $k'$.

Step 3. We estimate $n_0$ from below.

Using Lemma 3 we obtain $\varphi(n_0) \ge \exp(-2L\,s(n_0))$. The size of
$n_0$ can be estimated by

$$s(n_0) \le \log \overline{|P|} + \sum_{i=1}^{\ell} G_i\, s(f_i(a_{T^*})),$$

where $\overline{|P|}$ denotes the maximum of the houses of the coefficients of
$P$. We have
(from step 1)
and

Noting that $T^* \ge T+1$ we obtain

Thus, for a suitable constant $c_5 \in \mathbb{R}_+$

(1)

Step 4. We estimate $n_0$ from above.

Apply Lemma 6 with $f = F$, $h = T^* - 1$ and $\ell = 0$. We obtain

$$\varphi(n_0) \le \|F\|_{\varphi(a_{T^*})} \le \|F\|_{r'} \le (r^*)^{-T^*+1}\,\|F\|_r,$$

where $r^* := r/r' > 1$. Since $\|f_1\|_r, \ldots, \|f_\ell\|_r \in \mathbb{R}_+$ there exist
constants $c(i) \in \mathbb{R}_+$ depending only on $f_i$, $1 \le i \le \ell$, such that

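$$\Bigl\| \sum_{\lambda} \ldots \Bigr\|_r \le \max_{\lambda} \varphi(p(\lambda))\, \|f_1\|_r^{\lambda_1} \cdots \|f_\ell\|_r^{\lambda_\ell} \le \exp\Bigl(c_4 T^{\varepsilon} + \sum_{i=1}^{\ell} c(i)\, G_i\Bigr)$$

for some constant $c_6 \in \mathbb{R}_+$. Thus

(2)

for suitable positive real constants $c_7$, $c_8$.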


Now $T^*$ is large since $T$ is large. Thus the corresponding
inequalities (1) and (2) give $\varepsilon > 1$, from which our assertion
follows.

References.

[1] Ax, J. "On Schanuel's conjectures," Ann. of Math. 93 (1971), 252-268.

[2] Coleman, R. F. "On a stronger version of the Schanuel-Ax theorem," Am. J. Math. 102 (1979), 595-624.

[3] Sieburg, H. B. Transzendenz und algebraische Unabhängigkeit in einer Klasse nicht-Archimedisch bewerteter Körper der Charakteristik Null. Thesis, Köln 1983.

[4] Sieburg, H. B. "Algebraically independent values of Liouville-von Neumann series over QV-fields." To appear, Arch. Math. 1984.

H. B. Sieburg
Stanford University
Stanford, CA 94305
U.S.A.

and

The Salk Institute for Biological Studies,
P.O. Box 85800
San Diego, CA 92168
