Goldbach

a
r
X
i
v
:
1
2
0
5
.
5
2
5
2
v
2

[
m
a
t
h
.
N
T
]

1
3

M
a
y

2
0
1
3
MINOR ARCS FOR GOLDBACHS PROBLEM
H. A. HELFGOTT
Abstract. The ternary Goldbach conjecture states that every odd number
n 7 is the sum of three primes. The estimation of sums of the form
px
e(p), = a/q + O(1/q
2
), has been a central part of the main ap-
proach to the conjecture since (Vinogradov, 1937). Previous work required q
or x to be too large to make a proof of the conjecture for all n feasible.
The present paper gives new bounds on minor arcs and the tails of major
arcs. These bounds play a central role in a twin paper by the author proving
the ternary Goldbach conjecture.
The new bounds are due to several qualitative improvements. In particu-
lar, this paper presents a general method for reducing the cost of Vaughans
identity, as well as a way to exploit the tails of minor arcs in the context of
the large sieve.
Contents
1. Introduction 2
1.1. Results 2
1.2. History 3
1.3. Comparison to earlier work 4
1.4. Acknowledgments 5
2. Preliminaries 5
2.1. Notation 5
2.2. Fourier transforms and exponential sums 5
2.3. Smoothing functions 7
2.4. Bounds on sums of (m) and (n) 7
2.5. Basic setup 9
3. Type I 10
3.1. Trigonometric sums 10
3.2. Type I estimates 13
4. Type II 31
4.1. The sum S
1
: cancellation 32
4.2. The sum S
2
: the large sieve, primes and tails 44
5. Totals 50
5.1. Contributions of dierent types 50
5.2. Adjusting parameters. Calculations. 62
5.3. Conclusion 72
Appendix A. Norms of Fourier transforms 73
References 77
1
2 H. A. HELFGOTT
1. Introduction
The ternary Goldbach conjecture (or three-prime conjecture) states that every
odd number greater than 5 is the sum of three primes. I. M. Vinogradov [Vin37]
showed in 1937 that every odd integer larger than a very large constant C is
indeed the sum of three primes. His work was based on the study of exponential
sums
nN
(n)e(n)
and their use within the circle method.
Unfortunately, further work has so far reduced C only to e
3100
([LW02]; see
also [CW89]), which is still much too large for all odd integers up to C to be
checked numerically. The main problem has been that existing bounds for (1.1)
in the minor arc regime namely, = a/q +O(1/q
2
), gcd(a, q) = 1, q relatively
large have not been strong enough.
The present paper gives new bounds on smoothed exponential sums
(1.1) S
(, x) =
n
(n)e(n)(n/x).
These bounds are clearly stronger than those on smoothed or unsmoothed expo-
nential sums in the previous literature, including the bounds of [Tao]. (See also
work by Ramare [Ram10].)
In particular, on all arcs around a/q, q > 1.5 10
5
odd or q > 3 10
5
even,
the bounds are of the strength required for a full solution to the three-prime
conjecture. The same holds on the tails of arcs around a/q for smaller q.
(The remaining arcs namely, those around a/q, q small are the major arcs;
they are treated in the companion paper [Hela].)
The quality of the results here is due to several new ideas of general applica-
bility. In particular, 4.1 introduces a way to obtain cancellation from Vaughans
identity. Vaughans identity is a two-log gambit, in that it introduces two con-
volutions (each of them at a cost of log) and oers a great deal of exibility in
compensation. One of the ideas in the present paper is that at least one of two
logs can be successfully recovered after having been given away in the rst stage
of the proof. This reduces the cost of the use of this basic identity in this and,
presumably, many other problems.
We will also see how to exploit being on the tail of a major arc, whether in the
large sieve (Lemma 4.3, Prop. 4.6) or in other contexts.
There are also several technical improvements that make a qualitative dier-
ence; see the discussions at the beginning of 3 and 4. Considering smoothed
sums now a common idea also helps. (Smooth sums here go back to Hardy-
Littlewood [HL23] both in the general context of the circle method and in the
context of Goldbachs ternary problem. In recent work on the problem, they
reappear in [Tao].)
1.1. Results. The main bound we are about to see is essentially proportional to
((log q)/
_
(q)) x. The term
0
serves to improve the bound when we are on the
tail of an arc.
Main Theorem. Let x x
0
, x
0
= 2.16 10
20
. Let S
(, x) be as in (1.1), with
dened in (1.4). Let 2 = a/q + /x, q Q, gcd(a, q) = 1, |/x| 1/qQ, where
MINOR ARCS FOR GOLDBACHS PROBLEM 3
Q = (3/4)x
2/3
. If q x
1/3
/6, then
(1.2)
|S
(, x)|
R
x,
0
q
log
0
q + 0.5
_
0
(q)
x +
2.5x
0
q
+
2x
0
q
L
x,
0
,q
+ 3.2x
5/6
,
where
(1.3)
0
= max(2, ||/4), R
x,t
= 0.27125 log
_
1 +
log 4t
2 log
9x
1/3
2.004t
_
+ 0.41415
L
x,,q
= min
_
log
7
4
q
13
4
+
80
9
(q)/q
,
5
6
log x +
50
9
_
+ log q
80
9

16
9
+
111
5
.
If q > x
1/3
/6, then
|S
(, x)| 0.2727x
5/6
(log x)
3/2
+ 1218x
2/3
log x.
The factor R
x,t
is small in practice; for instance, for x = 10
25
and
0
q = 5 10
5
(typical dicult values), R
x,
0
q
equals 0.59648 . . . .
The classical choice
1
for in (1.1) is (t) = 1 for t 1, (t) = 0 for t > 1,
which, of course, is not smooth, or even continuous. We use
(1.4) (t) =
2
(t) = 4 max(log 2 | log 2t|, 0),
as in Tao [Tao], in part for purposes of comparison. (This is the multiplicative
convolution of the characteristic function of an interval with itself.) Nearly all
work should be applicable to any other suciently smooth function of fast
decay. It is important that decay at least quadratically.
1.2. History. The following notes are here to provide some background; no claim
to completeness is made.
Vinogradovs proof [Vin37] was based on his novel estimates for exponential
sums over primes. Most work on the problem since then, including essentially
all work with explicit constants, has been based on estimates for exponential
sums; there are some elegant proofs based on cancellation in other kinds of sums
([HB85], [IK04, 19]), but they have not been made to yield practical estimates.
The earliest explicit result is that of Vinogradovs student Borodzin (1939).
Vaughan [Vau77] greatly simplied the proof by introducing what is now called
Vaughans identity.
The current record is that of Liu and Wang [LW02]: the best previous result
was that of [CW89]. Other recent work falls into the following categories.
Conditional results. The ternary Goldbach conjecture has been proven under
the assumption of the generalized Riemann hypothesis [DEtRZ97].
Ineective results. An example is the bound given by Buttkewitz [But11]. The
issue is usually a reliance on the Siegel-Walsz theorem. In general, to obtain
eective bounds with good constants, it is best to avoid analytic results on L-
functions with large conductor. (The present paper implicitly uses known results
on the Riemann function, but uses nothing at all about other L-functions.)
Results based on Vaughans identity. Vaughans identity [Vau77] greatly sim-
plied matters; most textbook treatments are by now based on it. The minor-arc
treatment in [Tao] updates this approach to current technical standards (smooth-
ing), while taking advantage of its exibility (letting variable ranges depend on
q).
1
Or, more precisely, the choice made by Vinogradov and followed by most of the literature
since him. Hardy and Littlewood [HL23] worked with (t) = e
t
.
4 H. A. HELFGOTT
q
0
|S(a/q,x)|
x
, HH
|S(a/q,x)|
x
, Tao
10
5
0.04522 0.34475
1.5 10
5
0.03821 0.28836
2.5 10
5
0.03097 0.23194
5 10
5
0.02336 0.17416
7.5 10
5
0.01984 0.14775
10
6
0.01767 0.13159
10
7
0.00716 0.05251
Table 1. Worst-case upper bounds on x
1
|S
(a/2q, x)| for q

q
0
, || 8, x = 10
27
. The trivial bound is 1.
Results based on log-free identities. Using Vaughans identity implies losing a
factor of (log x)
2
(or (log q)
2
, as in [Tao]) in the rst step. It thus makes sense to
consider other identities that do not involve such a loss. Most approaches before
Vaughans identity involved larger losses, but already [Vin37, 9] is relatively
economical, at least for very large x. The work of Daboussi [Dab96] and Daboussi
and Rivat [DR01] explores other identities. (A reading of [DR01] gave part of the
initial inspiration for the present work.) Ramares work [Ram10] asymptotically
the best to date is based on the Diamond-Steinig inequality (for k large).
* * *
The authors work on the subject leading to the present paper was at rst
based on the (log-free) Bombieri-Selberg identity (k = 3), but has now been
redone with Vaughans identity in its foundations. This is feasible thanks to the
factor of log regained in 4.1.
1.3. Comparison to earlier work. Table 1 compares the bounds for the ratio
|S
(a/q, x)|/x given by this paper and by [Tao] for x = 10

27
and dierent values
of q. We are comparing worst cases: (q) as small as possible (q divisible by
2 3 5 ) in the result here, and q divisible by 4 (implying 4 a/(q/4)) in
Taos result. The main term in the result in this paper improves slowly with
increasing x; the results in [Tao] worsen slowly with increasing x.
The qualitative gain with respect to [Tao] is about log(q)
_
(q)/q, which is
log(q)/
_
e
(log log q) in the worst case.

The results in [DR01] are unfortunately worse than the trivial bound in this
range. Ramares results ([Ram10, Thm. 3], [Ramd, Thm. 6]) are not applicable
within the range, since neither of the conditions log q (1/50)(log x)
1/3
, q x
1/48
is satised. Ramares bound in [Ramd, Thm. 6] is
(1.5)
x<n2x
(n)e(an/q)
13000
q
(q)
x
for 20 q x
1/48
. We should underline that, while both the constant 13000
and the condition q x
1/48
keep (1.5) from being immediately useful in the
present context, (1.5) is asymptotically better than the results here as q .
(Indeed, qualitatively speaking, the form of (1.5) is the best one can expect
from results derived by the family of methods stemming from [Vin37].) There
is also unpublished work by Ramare (ca. 1993) with better constants for q
(log x/ log log x)
4
.
1.4. Acknowledgments. The author is very thankful to O. Ramare for his cru-
cial help and feedback, and to D. Platt for his prompt and helpful answers. He is
also much indebted to A. Booker, B. Green, H. Kadiri, T. Tao and M. Watkins
for many discussions on Goldbachs problem and related issues. Thanks are also
due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.
Travel and other expenses were funded in part by the Adams Prize and the
Philip Leverhulme Prize. The authors work on the problem started at the Univer-
site de Montreal (CRM) in 2006; he is grateful to both the Universite de Montreal
and the

Ecole Normale Superieure for providing pleasant working environments.
The present work would most likely not have been possible without free and
publicly available software: Maxima, PARI, Gnuplot, QEPCAD, SAGE, and, of
course, L
A
T
E
X, Emacs, the gcc compiler and GNU/Linux in general.
2. Preliminaries
2.1. Notation. Given positive integers m, n, we say m|n
if every prime divid-

ing m also divides n. We say a positive integer n is square-full if, for every prime
p dividing n, the square p
2
also divides n. (In particular, 1 is square-full.) We
say n is square-free if p
2
n for every prime p. For p prime, n a non-zero integer,
we dene v
p
(n) to be the largest non-negative integer such that p
|n.
When we write
n
, we mean
n=1
, unless the contrary is stated. As usual, ,
, and denote the Moebius function, the von Mangoldt function, the divisor
function and the sum-of-divisors function, respectively.
As is customary, we write e(x) for e
2ix
. We write |f|
r
for the L
r
norm of a
function f.
We write O
(R) to mean a quantity at most R in absolute value.

2.2. Fourier transforms and exponential sums. The Fourier transform on
R is normalized here as follows:
f(t) =
_

e(xt)f(x)dx.
If f is compactly supported (or of fast decay) and piecewise continuous,

f(t) =
(t)/(2it) by integration by parts. Iterating, we obtain that, if f is compactly

supported, continuous and piecewise C
1
, then
(2.1)

f(t) = O
_
|
(2t)
2
_
= O
_
|f
|
1
(2t)
2
_
,
and so

f decays at least quadratically.
The following bound is standard (see, e.g., [Tao, Lemma 3.1]): for R/Z
and f : R C compactly supported and piecewise continuous,
(2.2)
nZ
f(n)e(n)
min
_
|f|
1
+
1
2
|f
|
1
,
1
2
|f
|
1
| sin()|
_
.
(The rst bound follows from
nZ
|f(n)| |f|
1
+ (1/2)|f
|
1
, which, in turn is
a quick consequence of the fundamental theorem of calculus; the second bound
is proven by summation by parts.) The alternative bound (1/4)|f
|
1
/| sin()|
2
given in [Tao, Lemma 3.1] (for f continuous and piecewise C
1
) can usually be
improved by the following estimate.
6 H. A. HELFGOTT
Lemma 2.1. Let f : R C be compactly supported, continuous and piecewise
C
1
. Then
(2.3)
nZ
f(n)e(n)
1
4
|
(sin )
2
for every R.
As usual, the assumption of compact support could be easily relaxed to an
assumption of fast decay.
Proof. By the Poisson summation formula,
n=
f(n)e(n) =
n=
f(n ).
Since

f(t) =

f
(t)/(2it),
n=
f(n ) =
n=
(n )
2i(n )
=
n=
(n )
(2i(n ))
2
.
By Eulers formula cot s = 1/s +
n=1
(1/(n +s) 1/(n s)),
(2.4)
n=
1
(n +s)
2
= ( cot s)
=

2
(sin s)
2
.
Hence
n=
f(n )
n=
1
(2(n ))
2
= |

1
(2)
2

2
(sin )
2
.
The trivial bound |
|f
|
1
, applied to (2.3), recovers the bound in [Tao,
Lemma 3.1]. In order to do better, we will give a tighter bound for |
when
f =
2
in Appendix A.
Integrals of multiples of f
(in particular, |f
|
1
and
) can still be made sense

of when f
is undened at a nite number of points, provided f is understood as

a distribution (and f
has nite total variation). This is the case, in particular,

for f =
2
.
* * *
When we need to estimate
n
f(n) precisely, we will use the Poisson summa-
tion formula:
n
f(n) =
f(n).
We will not have to worry about convergence here, since we will apply the Pois-
son summation formula only to compactly supported functions f whose Fourier
transforms decay at least quadratically.
2.3. Smoothing functions. For the smoothing function
2
in (1.4),
(2.5) |
2
|
1
= 1, |
2
|
1
= 8 log 2, |
2
|
1
= 48,
as per [Tao, (5.9)(5.13)]. Similarly, for
2,
(t) = log(t)
2
(t), where 4,
(2.6)
|
2,
|
1
< log()|
2
|
1
= log()
|
2,
|
1
= 2
2,
(1/2) = 2 log(/2)
2
(1/2) < (8 log 2) log ,
|
2,
|
1
= 4 log(/4) +|2 log 4 log(/4)| +|4 log 2 4 log |
+| log 4 log 2| +| log | < 48 log .
(In the rst inequality, we are using the fact that log(t) is always positive (and
less than log()) when t is in the support of
2
.)
Write log
+
x for max(log x, 0).
2.4. Bounds on sums of (m) and (n). We will need explicit bounds on
nN
(n)/n and related sums involving . The situation here is less well-
developed than for sums involving . The main reason is that the complex-
analytic approach to estimating
nN
(n) would involve 1/(s) rather than
(s)/(s), and thus strong explicit bounds on the residues of 1/(s) would be
needed.
Fortunately all we need is a saving of (log n) or (log n)
2
on the trivial bound.
This is provided by the following.
(1) (Granville-Ramare [GR96], Lemma 10.2)
(2.7)
nx:gcd(n,q)=1
(n)
n
1
for all x, q 1,
(2) (Ramare [Ramc]; cf. El Marraki [EM95], [EM96])
(2.8)
nx
(n)
n
0.03
log x
for x 11815.
(3) (Ramare [Rama])
(2.9)
nx:gcd(n,q)=1
(n)
n
= O
_
1
log x/q

4
5
q
(q)
_
for all x and all q x;
(2.10)
nx:gcd(n,q)=1
(n)
n
log
x
n
= O
_
1.00303
q
(q)
_
for all x and all q.
Improvements on these bounds would lead to improvements on type I estimates,
but not in what are the worst terms overall at this point.
A computation carried out by the author has proven the following inequality
for all real x 10
12
:
(2.11)
nx
(n)
n
_
2
x
8 H. A. HELFGOTT
The computation was rigorous, in that it used D. Platts implementation [Pla11]
of double-precision interval arithmetic based on Lambovs [Lam08] ideas. For the
sake of verication, we record that
5.42625 10
8
n10
12
(n)
n
5.42898 10
8
.
Computations also show that the stronger bound
nx
(n)
n
1
2
x
holds for all 3 x 7727068587, but not for x = 7727068588 .
Earlier, numerical work carried out by Olivier Ramare [Ramb] had shown that
(2.11) holds for all x 10
10
.
We will make reference to various bounds on (n) in the literature. The
following bound can be easily derived from [RS62, (3.23)], supplemented by a
quick calculation of the contribution of powers of primes p < 32:
(2.12)
nx
(n)
n
log x.
We can derive a bound in the other direction from [RS62, (3.21)] (for x > 1000,
adding the contribution of all prime powers 1000) and a numerical verication
for x 1000:
(2.13)
nx
(n)
n
log x log
3
2
.
We also use the following older bounds:
(1) By the second table in [RR96, p. 423], supplemented by a computation
for 2 10
6
V 4 10
6
,
(2.14)
ny
(n) 1.0004y
for y 2 10
6
.
(2)
(2.15)
ny
(n) < 1.03883y
for every y > 0 [RS62, Thm. 12].
For all y > 663,
(2.16)
ny
(n)n < 1.03884
y
2
2
,
where we use (2.15) and partial summation for y > 200000, and a computation
for 663 < y 200000. Using instead the second table in [RR96, p. 423], together
with computations for small y < 10
7
and partial summation, we get that
(2.17)
ny
(n)n < 1.0008
y
2
2
for y > 1.6 10
6
.
Similarly,
(2.18)
ny
(n) < 2 1.0004
y
for all y 1.
It is also true that
(2.19)
y/2<py
(log p)
2
1
2
y(log y)
for y 117: this holds for y 2 758699 by [RS75, Cor. 2] (applied to x = y,
x = y/2 and x = 2y/3) and for 117 y < 2 758699 by direct computation.
2.5. Basic setup. We begin by applying Vaughans identity [Vau77]: for any
function : R R, any completely multiplicative function f : Z
+
C and any
x > 0, U, V 0,
(2.20)
n
(n)f(n)e(n)(n/x) = S
I,1
S
I,2
+S
II
+S
0,
,
where
(2.21)
S
I,1
=
mU
(m)f(m)
n
(log n)e(mn)f(n)(mn/x),
S
I,2
=
dV
(d)f(d)
mU
(m)f(m)
n
e(dmn)f(n)(dmn/x),
S
II
=
m>U
f(m)
_
_
_
_
d>U
d|m
(d)
_
_
_
_
n>V
(n)e(mn)f(n)(mn/x),
S
0,
=
nV
(n)e(n)f(n)(n/x).
The proof is essentially an application of the Mobius inversion formula; see, e.g.,
[IK04, 13.4]. In practice, we will use the function
(2.22) f(n) =
_
1 if gcd(n, v) = 1,
0 otherwise,
where v is a small, positive, square-free integer. (Our nal choice will be v = 2.)
Then
(2.23) S
(x, ) = S
I,1
S
I,2
+S
II
+S
0,
+S
0,w
,
where S
(x, ) is as in (1.1) and

S
0,v
=
n|v
(n)e(n)(n/x).
The sums S
I,1
, S
I,2
are called of type I (or linear), the sum S
II
is called of
type II (or bilinear). The sum S
0
is in general negligible; for our later choice of
V and , it will be in fact 0. The sum S
0,v
will be negligible as well.
10 H. A. HELFGOTT
3. Type I
There are here three main improvements in comparison to standard treatments:
(1) The terms with m divisible by q get taken out and treated separately
by analytic means. This all but eliminates what would otherwise be the
main term.
(2) For large m, the other terms get handled by improved estimates on
trigonometric sums.
(3) The error term /x = a/q is used to our advantage. This happens
both through the Poisson summation formula and through the use of two
successive approximations.
3.1. Trigonometric sums. The following lemmas on trigonometric sums im-
prove on the best Vinogradov-type lemmas in the literature. (By this, we mean
results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]). See, in par-
ticular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is
to switch between dierent types of approximation within the sum, rather than
just choosing between bounding all terms either trivially (by A) or non-trivially
(by C/| sin(n)|
2
). There will also
2
be improvements in our applications stem-
ming from the fact that Lemmas 3.1 and Lemma 3.2 take quadratic (| sin(n)|
2
)
rather than linear (| sin(n)|) inputs.
Lemma 3.1. Let = a/q +/qQ, gcd(a, q) = 1, || 1, q Q. Then, for any
A, C 0,
(3.1)
y<ny+q
min
_
A,
C
| sin(n)|
2
_
min
_
2A +
6q
2
2
C, 3A +
4q
AC
_
.
Proof. We start by letting m
0
= y +(q + 1)/2, j = n m
0
, so that j ranges
in the interval (q/2, q/2]. We write
n =
aj +c
q
+
1
(j) +
2
mod 1,
where |
1
(j)| and |
2
| are both 1/2q; we can assume
2
0. The variable
r = aj +c modq occupies each residue class modp exactly once.
One option is to bound the terms corresponding to r = 0, 1 by A each and
all the other terms by C/| sin(n)|
2
. The terms corresponding to r = k and
r = k 1 (2 k q/2) contribute at most
1
sin
2
q
(k
1
2
q
2
)
+
1
sin
2
q
(k
3
2
+q
2
)
1
sin
2
q
_
k
1
2
_ +
1
sin
2
q
_
k
3
2
_,
since x
1
(sin x)
2
is convex-up on (0, ). Hence the terms with r = 0, 1 contribute
at most
1
_
sin

2q
_
2
+ 2
2r
q
2
1
_
sin

q
(r 1/2)
_
2

1
_
sin

2q
_
2
+ 2
_
q/2
1
1
_
sin

q
x
_
2
,
2
This is a change with respect to the rst version of this papers preprint [Helb]. The version
of Lemma 3.1 there has, however, the advantage of being immediately comparable to results in
the literature.
where we use again the convexity of x 1/(sin x)
2
. (We can assume q > 2, as
otherwise we have no terms other than r = 0, 1.) Now
_
q/2
1
1
_
sin

q
x
_
2
dx =
q
_
2
q
1
(sin u)
2
du =
q
cot

q
.
Hence
y<ny+q
min
_
A,
C
(sin n)
2
_
2A +
C
_
sin

2q
_
2
+C
2q
cot

q
.
Now
2
/(sin )
2
+ cot 2 3/2 for all [0, /4] (by the rst terms of the Taylor
series for near 0, and by inspection otherwise). Hence, the left side of (3.1) is
at most
2A +
3
2
C
_
2q
_
2
= 2A +
6
2
Cq
2
.
The following is an alternative approach yielding the other estimate in (3.1).
We bound the terms corresponding to r = 0, r = 1, r = 1 by A each. We let
r = r
for r
ranging from 2 to q/2. We obtain that the sum is at most

(3.2)
3A+
2r
q/2
min
_
_
_
A,
C
_
sin

q
_
r
1
2
q
2
_
_
2
_
_
_
+
2r
q/2
min
_
_
_
A,
C
_
sin

q
_
r
1
2
+q
2
_
_
2
_
_
_
.
We bound a term min(A, C/ sin((/q)(r
1/2 q
2
))
2
) by A if and only if
C/ sin((/q)(r
1 q
2
))
2
A. The number of such terms is
max(0, (q/) arcsin(
_
C/A) q
2
),
and thus at most (2q/) arcsin(
_
C/A) in total. (Recall that q
2
1/2.) Each
other term gets bounded by the integral of C/ sin
2
(/q) from r
1 q
2
(
(q/) arcsin(
_
C/A)) to r
q
2
, by convexity. Thus (3.2) is at most
3A +
2q
Aarcsin
_
C
A
+ 2
_
q/2
q
arcsin
C
A
C
sin
2 t
q
dt
3A +
2q
Aarcsin
_
C
A
+
2q
C
_
A
C
1
We can easily show (taking derivatives) that arcsin x + x(1 x
2
) 2x for
0 x 1. Setting x = C/A, we see that this implies that
3A +
2q
Aarcsin
_
C
A
+
2q
C
_
A
C
1 3A+
4q
AC.
(If C/A > 1, then 3A+(4q/)
AC is greater than Aq, which is an obvious upper

bound for the left side of (3.1).)
12 H. A. HELFGOTT
Lemma 3.2. Let = a/q+/qQ, gcd(a, q) = 1, || 1, q Q. Let y
2
> y
1
0.
If y
2
y
1
q and y
2
Q/2, then, for any A, C 0,
(3.3)
y
1
<ny
2
qn
min
_
A,
C
| sin(n)|
2
_
min
_
20
3
2
Cq
2
, 2A +
4q
AC
_
.
Proof. Clearly, n equals an/q + (n/Q)/q; since y
2
Q/2, this means that
|nan/q| 1/2q for n y
2
; moreover, again for n y
2
, the sign of n an/q
remains constant. Hence the left side of (3.3) is at most
q/2
r=1
min
_
A,
C
(sin

q
(r 1/2))
2
_
+
q/2
r=1
min
_
A,
C
(sin

q
r)
2
_
.
Proceeding as in the proof of Lemma 3.1, we obtain a bound of at most
C
_
1
(sin

2q
)
2
+
1
(sin

q
)
2
+
q
cot

q
+
q
cot
3
2q
_
20
3
Cq
2
2
for q 2. (If q = 1, then the left-side of (3.3) is trivially zero.) Alternatively, we
can follow the second approach in the proof of Lemma 3.1, and obtain an upper
bound of 2A + (4q/)
AC.
The following bound will be useful when the constant A in an application of

Lemma 3.2 would be too large. (This tends to happen for n small.)
Lemma 3.3. Let = a/q+/qQ, gcd(a, q) = 1, || 1, q Q. Let y
2
> y
1
0.
If y
2
y
1
q and y
2
Q/2, then, for any B, C 0,
(3.4)
y
1
<ny
2
qn
min
_
B
| sin(n)|
,
C
| sin(n)|
2
_
2B
q
max
_
2, log
Ce
3
q
B
_
.
The upper bound (2Bq/) log(2e
2
q/) is also valid.
Proof. As in the proof of Lemma 3.2, we can bound the left side of (3.4) by
2
q/2
r=1
min
_
B
sin

q
_
r
1
2
_,
C
sin
2
q
_
r
1
2
_
_
.
Assume Bsin(/q) C B. By the convexity of 1/ sin(t) and 1/ sin(t)
2
for
t (0, /2],
q/2
r=1
min
_
B
sin

q
_
r
1
2
_,
C
sin
2
q
_
r
1
2
_
_
B
sin

2q
+
_ q
arcsin
C
B
1
B
sin

q
t
dt +
_
q/2
q
arcsin
C
B
1
sin
2
q
t
dt
B
sin

2q
+
q
_
B
_
log tan
_
1
2
arcsin
C
B
_
log tan

2q
_
+C cot arcsin
C
B
_
B
sin

2q
+
q
_
B
_
log cot

2q
log
C
B
B
2
C
2
_
+
_
B
2
C
2
_
.
Now, for all t (0, /2),
2
sin t
+
1
t
log cot t <
1
t
log
_
e
2
t
_
;
we can verify this by comparing series. Thus
B
sin

2q
+
q
Blog cot

2q
B
q
log
2e
2
q
for q 2. (If q = 1, the sum on the left of (3.4) is empty, and so the bound we
are trying to prove is trivial.) We also have
(3.5) t log(t
_
t
2
1) +
_
t
2
1 < t log 2t +t
for t 1 (as this is equivalent to log(2t
2
(1
1 t
2
)) < 1
1 t
2
, which
we check easily after changing variables to = 1
1 t
2
). Hence
B
sin

2q
+
q
_
B
_
log cot

2q
log
C
B
B
2
C
2
_
+
_
B
2
C
2
_
B
q
log
2e
2
q
+
q
_
B Blog
2B
C
_
B
q
log
Ce
3
q
B
for q 2.
Given any C, we can apply the above with C = B instead, as, for any t > 0,
min(B/t, C/t
2
) B/t min(B/t, B/t
2
). (We refrain from applying (3.5) so as
to avoid worsening a constant.) If C < Bsin /q (or even if C < (/q)B), we
relax the input to C = Bsin /q and go through the above.
3.2. Type I estimates. Our main type I estimate is the following.
3
One of the
main innovations is the manner in which the main term (m divisible by q) is
separated; we are able to keep error terms small thanks to the particular way in
which we switch between two dierent approximations.
(These are not necessarily successive approximations in the sense of continued
fractions; we do not want to assume that the approximation a/q we are given
arises from a continued fraction, and at any rate we need more control on the
denominator q
of the new approximation a
/q
than continued fractions would

furnish.)
Lemma 3.4. Let = a/q +/x, gcd(a, q) = 1, |/x| 1/qQ
0
, q Q
0
, Q
0
16.
Let be continuous, piecewise C
2
and compactly supported, with ||
1
= 1 and
L
1
. Let c
0
|
.
Let 1 D x. Then, if || 1/2c
2
, where c
2
= (3/5
c
0
)(1 +
_
13/3), the
absolute value of
(3.6)
mD
(m)
n
e(mn)
_
mn
x
_
is at most
(3.7)
x
q
min
_
1,
c
0
(2)
2
_
m
M
q
gcd(m,q)=1
(m)
m
+O
_
c
0
_
1
4

1
2
__
D
2
2xq
+
D
2x
__
3
The current version of Lemma 3.4 is an improvement over that included in the rst preprint
of this paper.
14 H. A. HELFGOTT
plus
(3.8)
2
c
0
c
1
D + 3c
1
x
q
log
+
D
c
2
x/q
+
c
0
c
1
q log
+
D
q/2
+
|
|
1
q max
_
2, log
c
0
e
3
q
2
4|
|
1
x
_
+
_
2
3c
0
c
1
+
3c
1
c
2
+
55c
0
c
2
12
2
_
q,
where c
1
= 1 +|
|
1
/(2x/D) and M [min(Q
0
/2, D), D]. The same bound holds
if || 1/2c
2
but D Q
0
/2.
In general, if || 1/2c
2
, the absolute value of (3.6) is at most (3.7) plus
(3.9)
2
c
0
c
1
_
D + (1 +) min
__
x
||q
_
+ 1, 2D
_
_
+
1
2
log
+
2D
x
||q
__
+ 3c
1
_
2 +
(1 +)
log
+
2D
x
||q
_
x
Q
0
+
35c
0
c
2
6
2
q,
for (0, 1] arbitrary, where
3 + 2 + ((1 +
_
13/3)/4 1)/(2(1 +)).
In (3.7), min(1, c
0
/(2)
2
) always equals 1 when || 1/2c
2
(since (3/5)(1 +
_
13/3) > 1).
Proof. Let Q = x/|q|. Then = a/q + O
(1/qQ) and q Q. (If = 0,

we let Q = and ignore the rest of the paragraph, since then we will never
need Q
or the alternative approximation a
/q
.) Let Q
= (1 + )Q Q + 1.
Then is not a/q +O
(1/qQ
), and so there must be a dierent approximation

a
/q
, gcd(a
, q
) = 1, q
such that = a
/q
+ O
(1/q
) (since such
an approximation always exists). Obviously, |a/q a
/q
| 1/qq
, yet, at the
same time, |a/q a
/q
| 1/qQ + 1/q
1/qQ + 1/((1 + )q
Q). Hence
q
/Q + q/((1 + )Q) 1, and so q
Q q/(1 + ) (/(1 + ))Q. (Note also

that (/(1 +))Q (2|q|/x) x/q > 1, and so q
2.)
Lemma 3.2 will enable us to treat separately the contribution from terms with
m divisible by q and m not divisible by q, provided that m Q/2. Let M =
min(Q/2, D). We start by considering all terms with m M divisible by q. Then
e(mn) equals e((m/x)n). By Poisson summation,
n
e(mn)(mn/x) =
f(n),
where f(u) = e((m/x)u)((m/x)u). Now
f(n) =
_
e(un)f(u)du =
x
m
_
e
__

xn
m
_
u
_
(u)du =
x
m

_
x
m
n
_
.
By assumption, m M Q/2 x/2|q|, and so |x/m| 2|q| 2. Thus, by
(2.1),
(3.10)
f(n) =
x
m
_
_
() +
n=0

_
nx
m

_
_
_
=
x
m
_
_
() +O
_
_
n=0
1
_
2
_
nx
m

__
2
_
_
_
_
=
x
m
() +
m
x
c
0
(2)
2
O
_
_
max
|r|
1
2
n=0
1
(n r)
2
_
_
.
Since x 1/x
2
is convex on R
+
,
max
|r|
1
2
n=0
1
(n r)
2
=
n=0
1
_
n
1
2
_
2
=
2
4.
Therefore, the sum of all terms with m M and q|m is
mM
q|m
x
m
() +
mM
q|m
m
x
c
0
(2)
2
(
2
4)
=
x(q)
q
()
m
M
q
gcd(m,q)=1
(m)
m
+O
_
(q)
2
c
0
_
1
4

1
2
__
D
2
2xq
+
D
2x
__
.
.
We bound | ()| by (2.1).
Let
T
m
() =
n
e(mn)
_
mn
x
_
.
Then, by (2.2) and Lemma 2.1,
(3.11) |T
m
()| min
_
x
m
+
1
2
|
|
1
,
1
2
|
|
1
| sin(m)|
,
m
x
c
0
4
1
(sin m)
2
_
.
For any y
2
> y
1
> 0 with y
2
y
1
q and y
2
Q/2, (3.11) gives us that
(3.12)
y
1
<my
2
qm
|T
m
()|
y
1
<my
2
qm
min
_
A,
C
(sin m)
2
_
for A = (x/y
1
)(1+|
|
1
/(2(x/y
1
))) and C = (c
0
/4)(y
2
/x). We must now estimate
the sum
(3.13)
mM
qm
|T
m
()| +
Q
2
<mD
|T
m
()|.
To bound the terms with m M, we can use Lemma 3.2. The question is then
which one is smaller: the rst or the second bound given by Lemma 3.2? A brief
calculation gives that the second bound is smaller (and hence preferrable) exactly
16 H. A. HELFGOTT
when
_
C/A > (3/10q)(1 +
_
13/3). Since
_
C/A (
c
0
/2)m/x, this means
that it is sensible to prefer the second bound in Lemma 3.2 when m > c
2
x/q,
where c
2
= (3/5
c
0
)(1 +
_
13/3).
It thus makes sense to ask: does Q/2 c
2
x/q (so that m M implies m
c
2
x/q)? This question divides our work into two basic cases.
Case (a). large: || 1/2c
2
, where c
2
= (3/5
c
0
)(1 +
_
13/3). Then
Q/2 c
2
x/q; this will induce us to bound the rst sum in (3.13) by the very rst
bound in Lemma 3.2.
Recall that M = min(Q/2, D), and so M c
2
x/q. By (3.12) and Lemma 3.2,
(3.14)
1mM
qm
|T
m
()|
j=0
jq<mmin((j+1)q,M)
qm
min
_
x
jq + 1
+
|
|
1
2
,
c
0
4
(j+1)q
x
(sin m)
2
_
20
3
2
c
0
q
3
4x
0j
M
q
(j + 1)
20
3
2
c
0
q
3
4x

_
1
2
M
2
q
2
+
3
2
c
2
x
q
2
+ 1
_
5c
0
c
2
6
2
M +
5c
0
q
3
2
_
3
2
c
2
+
q
2
x
_
5c
0
c
2
6
2
M +
35c
0
c
2
6
2
q,
where, to bound the smaller terms, we are using the inequality Q/2 c
2
x/q,
and where we are also using the observation that, since |/x| 1/qQ
0
, the
assumption || 1/2c
2
implies that q 2c
2
x/Q
0
; moreover, since q Q
0
, this
gives us that q
2
2c
2
x. In the main term, we are bounding qM
2
/x from above
by M qQ/2x M/2 c
2
M.
If D (Q + 1)/2, then M D and so (3.14) is all we need. Assume from
now on that D > (Q + 1)/2. The rst sum in (3.13) is then bounded by (3.14)
(with M = Q/2). To bound the second sum in (3.13), we use the approximation
a
/q
instead of a/q. By (3.12) (without the restriction q m) and Lemma 3.1,
Q/2<mD
|T
m
()|
j=0
jq
+
Q
2
<mmin((j+1)q
+Q/2,D)
|T
m
()|
D(Q+1)/2
q
j=0
_
3c
1
x
jq
+
Q+1
2
+
4q
c
1
c
0
4
x
jq
+ (Q+ 1)/2
(j + 1)q
+Q/2
x
_
D(Q+1)/2
q
j=0
_
3c
1
x
jq
+
Q+1
2
+
4q
c
1
c
0
4
_
1 +
q
jq
+ (Q + 1)/2
_
_
,
where we recall that c
1
= 1 +|
|
1
/(2x/D). Since q
(/(1 +))Q,
(3.15)
D(Q+1)/2
q
j=0
x
jq
+
Q+1
2
x
Q/2
+
x
q
_
D
Q+1
2
1
t
dt
2x
Q
+
(1 +)x
Q
log
+
D
Q+1
2
.
Recall now that q
(1 +)Q + 1 (1 +)(Q + 1). Therefore,

(3.16)
q
D(Q+1)/2
q
j=0
1 +
q
jq
+ (Q+ 1)/2
q
1 +
(1 +)Q + 1
(Q+ 1)/2
+
_
D
Q+1
2
_
1 +
q
t
dt
q
3 + 2 +
_
D
Q+ 1
2
_
+
q
2
log
+
D
Q+1
2
.
We conclude that
Q/2<mD
|T
m
()| is at most
(3.17)
2
c
0
c
1
_
D +
_
(1 +)
3 + 2
1
2
_
(Q+ 1) +
(1 +)Q+ 1
2
log
+
D
Q+1
2
_
+ 3c
1
_
2 +
(1 +)
log
+
D
Q+1
2
_
x
Q
We sum this to (3.14) (with M = Q/2), and obtain that (3.13) is at most
(3.18)
2
c
0
c
1
_
D + (1 +)(Q + 1)
_
+
1
2
log
+
D
Q+1
2
__
+ 3c
1
_
2 +
(1 +)
log
D
Q+1
2
_
x
Q
+
35c
0
c
2
6
2
q,
where we are bounding
(3.19)
5c
0
c
2
6
2
=
5c
0
6
2
3
5
c
0
_
1 +
_
13
3
_
=
c
0
2
_
1 +
_
13
3
_
c
0
c
1

1
4
_
1 +
_
13
3
_
and dening
(3.20)
3 + 2 +
_
1
4
_
1 +
_
13
3
_
1
_
1
2(1 +)
.
(Note that
<
3 for < 0.1741). A quick check against (3.14) shows that

(3.18) is valid also when D Q/2, even when Q+1 is replaced by min(Q+1, 2D).
We bound Q from above by x/||q and log
+
D/((Q+1)/2) by log
+
2D/(x/||q+1),
and obtain the result.
Case (b): || small: || 1/2c
2
or D Q
0
/2. Then min(c
2
x/q, D) Q/2.
We start by bounding the rst q/2 terms in (3.13) by (3.11) and Lemma 3.3:
(3.21)
mq/2
|T
m
()|
mq/2
min
_
1
2
|
|
1
| sin(m)|
,
c
0
q/8x
| sin(m)|
2
_
|
1
q max
_
2, log
c
0
e
3
q
2
4|
|
1
x
_
.
18 H. A. HELFGOTT
If q
2
< 2c
2
x, we estimate the terms with q/2 < m c
2
x/q by Lemma 3.2,
which is applicable because min(c
2
x/q, D) < Q/2:
(3.22)
q
2
<mD
qm
|T
m
()|
j=1
(j
1
2
)q<m(j+
1
2
)q
mmin
c
2
x
q
,D
qm
min
_
x
_
j
1
2
_
q
+
|
1
|
2
,
c
0
4
(j+1/2)q
x
(sin m)
2
_
20
3
2
c
0
q
3
4x
1j
D
q
+
1
2
_
j +
1
2
_
20
3
2
c
0
q
3
4x
_
c
2
x
2q
2
D
q
+
3
2
_
c
2
x
q
2
_
+
5
8
_
5c
0
6
2
_
c
2
D
+ 3c
2
q +
5
4
q
3
x
_
5c
0
c
2
6
2
_
D
+
11
2
q
_
,
where we write D
= min(c
2
x/q, D). If c
2
x/q D, we stop here. Assume that
c
2
x/q < D. Let R = max(c
2
x/q, q/2). The terms we have already estimated are
precisely those with m R. We bound the terms R < m D by the second
bound in Lemma 3.1:
(3.23)
R<mD
|T
m
()|
j=0
m>jq+R
mmin((j+1)q+R,D)
min
_
c
1
x
jq +R
,
c
0
4
(j+1)q+R
x
(sin m)
2
_
1
q
(DR)
j=0
3c
1
x
jq +R
+
4q
c
1
c
0
4
_
1 +
q
jq +R
_
(Note there is no need to use two successive approximations a/q, a
/q
as in case
(a). We are also including all terms with m divisible by q, as we may, since
|T
m
()| is non-negative.) Now, much as before,
(3.24)
1
q
(DR)
j=0
x
jq +R

x
R
+
x
q
_
D
R
1
t
dt min
_
q
c
2
,
2x
q
_
+
x
q
log
+
D
c
2
x/q
,
and
(3.25)
1
q
(DR)
j=0
_
1 +
q
jq +R

_
1 +
q
R
+
1
q
_
D
R
_
1 +
q
t
dt
3+
D R
q
+
1
2
log
+
D
q/2
.
We sum with (3.21) and (3.22), and we obtain that (3.13) is at most
(3.26)
2
c
0
c
1
3q +D +
q
2
log
+
D
q/2
_
+
_
3c
1
log
+
D
c
2
x/q
_
x
q
+ 3c
1
min
_
q
c
2
,
2x
q
_
+
55c
0
c
2
12
2
q +
|
|
1
q max
_
2, log
c
0
e
3
q
2
4|
|
1
x
_
,
where we are using the fact that 5c
0
c
2
/6
2
< 2
c
0
c
1
/. A quick check against
(3.22) shows that (because of the fact just stated) (3.26) is also valid when c
2
x/q
D.
We will need a version of Lemma 3.4 with m and n restricted to the odd
numbers. (We will barely be using the restriction of m, whereas the restriction
on n is both (a) slightly harder to deal with, (b) something that can be turned
to our advantage.)
Lemma 3.5. Let R/Z with 2 = a/q + /x, gcd(a, q) = 1, |/x| 1/qQ
0
,
q Q
0
, Q
0
16. Let be continuous, piecewise C
2
and compactly supported,
with ||
1
= 1 and
L
1
. Let c
0
|
.
Let 1 D x. Then, if || 1/2c
2
, where c
2
= 6/5
c
0
, the absolute value
of
(3.27)
mD
m odd
(m)
n odd
e(mn)
_
mn
x
_
is at most
(3.28)
x
2q
min
_
1,
c
0
()
2
_
m
M
q
gcd(m,2q)=1
(m)
m
+O
_
c
0
q
x
_
1
8

1
2
2
__
D
q
+ 1
_
2
_
plus
(3.29)
2
c
0
c
1
D +
3c
1
2
x
q
log
+
D
c
2
x/q
+
c
0
c
1
q log
+
D
q/2
+
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
+
_
2
3c
0
c
1
+
3c
1
2c
2
+
55c
0
c
2
6
2
_
q,
where c
1
= 1 + |
|
1
/(x/D) and M [min(Q
0
/2, D), D]. The same bound holds
if || 1/2c
2
but D Q
0
/2.
2
(3.30)
2
c
0
c
1
_
D + (1 +) min
__
x
||q
_
+ 1, 2D
_
_
3 + 2 +
1
2
log
+
2D
x
||q
__
+
3
2
c
1
_
2 +
(1 +)
log
+
2D
x
||q
_
x
Q
0
+
35c
0
c
2
3
2
q,
for (0, 1] arbitrary.
If q is even, the sum (3.28) can be replaced by 0.
Proof. The proof is almost exactly that of Lemma 3.4; we go over the dierences.
The parameters Q, Q
, a
, q
and M are dened just as before (with 2 wherever

we had ).
Let us rst consider m M odd and divisible by q. (Of course, this case arises
only if q is odd.) For n = 2r + 1,
e(mn) = e(m(2r + 1)) = e(2rm)e(m) = e
_
x
rm
_
e
__
a
2q
+

2x
+

2
_
m
_
= e
_
(2r + 1)
2x
m
_
e
_
a +q
2
m
q
_
=
e
_
(2r + 1)
2x
m
_
,
20 H. A. HELFGOTT
where {0, 1} and
= e((a + q)/2) {1, 1} are independent of m and n.

Hence, by Poisson summation,
(3.31)
n odd
e(mn)(mn/x) =
n odd
e((m/2x)n)(mn/x)
=

2
_
f(n)
f(n + 1/2)
_
,
where f(u) = e((m/2x)u)((m/x)u). Now
f(t) =
x
m

_
x
m
t

2
_
.
Just as before, |x/m| 2|q| 2. Thus
(3.32)
1
2
f(n)
f(n + 1/2)
x
m
_
_
1
2
2
_
+
1
2
n=0

_
x
m
n
2

2
_
_
_
=
x
m
_
_
1
2
2
_
+
1
2
O
_
_
n=0
1
_
_
nx
m

__
2
_
_
_
_
=
x
2m
2
_
+
m
x
c
0
2
2
(
2
4)x.
The contribution of the second term in the last line of (3.32) is
mM
m odd
q|m
m
x
c
0
2
2
(
2
4) =
q
x
c
0
2
2
(
2
4)
mM/q
m odd
m =
qc
0
x
_
1
8

1
2
2
__
M
q
+ 1
_
2
.
Hence, the absolute value of the sum of all terms with m M and q|m is given
by (3.28).
We dene T
m,
() by
(3.33) T
m,
() =
n odd
e(mn)
_
mn
x
_
.
Changing variables by n = 2r + 1, we see that
|T
m,
()| =
r
e(2 mr)(m(2r + 1)/x)
.
Hence, instead of (3.11), we get that
(3.34) |T
m,
()| min
_
x
2m
+
1
2
|
|
1
,
1
2
|
|
1
| sin(2m)|
,
m
x
c
0
2
1
(sin 2m)
2
_
.
We obtain (3.12), but with T
m,
instead of T
m
, A = (x/2y
1
)(1+|
|
1
/(x/y
1
)) and
C = (c
0
/2)(y
2
/x), and so c
1
= 1 +|
|
1
/(x/D).
The rest of the proof of Lemma 3.4 carries almost over word-by-word. (For
the sake of simplicity, we do not really try to take advantage of the odd support
of m here.) Since C has doubled, it would seem to make sense to reset the value
of c
2
to be c
2
= (3/5
2c
0
)(1 +
_
13/3); this would cause complications related
to fact that 5c
0
c
2
/3
2
would become larger than 2
c
0
/, and so we set c
2
to the
slightly smaller value c
2
= 6/5
c
0
instead. This implies
(3.35)
5c
0
c
2
3
2
=
2
c
0
.
The bound from (3.14) gets multiplied by 2 (but the value of c
2
has changed),
the second line in (3.17) gets halved, (3.19) gets replaced by (3.35), the second
term in the maximum in the second line of (3.21) gets doubled, the bound from
(3.22) gets doubled, and the bound from (3.24) gets halved.
We will also need a version of Lemma 3.4 (or rather Lemma 3.5; we will decide
to work with the restriction that n and m be odd) with a factor of (log n) within
the inner sum.
0
,
q Q
0
, Q
0
max(16, 2
x). Let be continuous, piecewise C

2
and compactly
supported, with ||
1
= 1 and
L
1
. Let c
0
|
. Assume that, for any

0
,
0
a constant, the function
()
(t) = log(t)(t) satises
(3.36) |
()
|
1
log()||
1
, |
()
|
1
log()|
|
1
, |
()
|
c
0
log()
Let
3 D min(x/
0
, x/e). Then, if || 1/2c
2
, where c
2
= 6/5
c
0
, the
absolute value of
(3.37)
mD
m odd
(m)
n
n odd
(log n)e(mn)
_
mn
x
_
is at most
(3.38)
x
q
min
_
1,
c
0
/
2
(2)
2
_
m
M
q
gcd(m,q)=1
(m)
m
log
x
mq
+
x
q
|
log ()|
m
M
q
gcd(m,q)=1
(m)
m
+O
_
c
0
_
1
2

2
2
_
_
D
2
4qx
log
e
1/2
x
D
+
1
e
__
plus
(3.39)
2
c
0
c
1
Dlog
ex
D
+
3c
1
2
x
q
log
+
D
c
2
x/q
log
q
c
2
+
_
2|
|
1
max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log x +
2
c
0
c
1
3 +
1
2
log
+
D
q/2
_
log
q
c
2
_
q
+
3c
1
2
_
2x
c
2
log
2x
c
2
+
20c
0
c
3/2
2
3
2
2xlog
2
ex
c
2
for c
1
= 1 +|
|
1
/(x/D). The same bound holds if || 1/2c
2
but D Q
0
/2.
22 H. A. HELFGOTT
2
, the absolute value of (3.37) is at most
(3.40)
2
c
0
c
1
Dlog
ex
D
+
2
c
0
c
1
(1 +)
_
x
||q
+ 1
_
_
3 + 2 log
+
2
e||q +
1
2
log
+
2D
x
||q
log
+
2||q
_
+
_
3c
1
4
_
2
5
+
1 +
2
log x
_
+
40
3
2c
0
c
3/2
2
_
x log x
for (0, 1].
Proof. Dene Q, Q
, M, a
and q
as in the proof of Lemma 3.4. The same

method of proof works as for Lemma 3.4; we go over the dierences. When
applying Poisson summation or (2.2), use
(x/m)
(t) = (log xt/m)(t) instead of
(t). Then use the bounds in (3.36) with = x/m; in particular,
|
(x/m)
|
c
0
log
x
m
.
For f(u) = e((m/2x)u)(log u)((m/x)u),
f(t) =
x
m

(x/m)
_
x
m
t

2
_
and so
1
2
f(n/2)

x
m
_
_
1
2

(x/m)
_
2
_
+
1
2
n=0

_
x
m
n
2

2
_
_
_
=
1
2
x
m
_
log
_
2
_
+ log
_
x
m
_

_
2
__
+
m
x
_
log
x
m
_
c
0
2
2
(
2
4).
The part of the main term involving log(x/m) becomes
x ()
2
mM
m odd
q|m
(m)
m
log
_
x
m
_
=
x(q)
q
()
mM/q
gcd(m,2q)=1
(m)
m
log
_
x
mq
_
for q odd. (We can see that this, like the rest of the main term, vanishes for m
even.)
In the term in front of
2
4, we nd the sum
mM
m odd
q|m
m
x
log
_
x
m
_
M
x
log
x
M
+
q
2
_
M/q
0
t log
x/q
t
dt =
M
x
log
x
M
+
M
2
4qx
log
e
1/2
x
M
,
where we use the fact that t t log(x/t) is increasing for t x/e. By the same
fact (and by M D), (M
2
/q) log(e
1/2
x/M) (D
2
/q) log(e
1/2
x/D). It is also
easy to see that (M/x) log(x/M) 1/e (since M D x).
The basic estimate for the rest of the proof (replacing (3.11)) is
T
m,
() =
n odd
e(mn)(log n)
_
mn
x
_
=
n odd
e(mn)
(x/m)
_
mn
x
_
= O
_
_
min
_
_
x
2m
|
(x/m)
|
1
+
|
(x/m)
|
1
2
,
1
2
|
(x/m)
|
1
| sin(2m)|
,
m
x
1
2
|
(x/m)
|
(sin 2m)
2
_
_
_
_
= O
_
log
x
m
min
_
x
2m
+
|
|
1
2
,
1
2
|
|
1
| sin(2m)|
,
m
x
c
0
2
1
(sin 2m)
2
__
.
We wish to bound
(3.41)
mM
qm
m odd
|T
m,
()| +
Q
2
<mD
|T
m,
()|.
Just as in the proofs of Lemmas 3.4 and 3.5, we give two bounds, one valid for
|| large (|| 1/2c
2
) and the other for small (|| 1/2c
2
). Again as in the
proof of Lemma 3.5, we ignore the condition that m is odd in (3.13).
Consider the case of || large rst. Instead of (3.14), we have
(3.42)
1mM
qm
|T
m
()|
40
3
2
c
0
q
3
2x
0j
M
q
(j + 1) log
x
jq + 1
.
Since
0j
M
q
(j + 1) log
x
jq + 1
log x +
M
q
log
x
M
+
1j
M
q
log
x
jq
+
1j
M
q
1
j log
x
jq
log x +
M
q
log
x
M
+
_ M
q
0
log
x
tq
dt +
_ M
q
1
t log
x
tq
dt
log x +
_
2M
q
+
M
2
2q
2
_
log
e
1/2
x
M
,
this means that
(3.43)
1mM
qm
|T
m
()|
40
3
2
c
0
q
3
4x
_
log x +
_
2M
q
+
M
2
2q
2
_
log
e
1/2
x
M
_
5c
0
c
2
3
2
M log
ex
M
+
40
3
2c
0
c
3/2
2
x log x,
where we are using the bounds M Q/2 c
2
x/q and q
2
2c
2
x (just as in
(3.14)). Instead of (3.15), we have
D(Q+1)/2
q
j=0
_
log
x
jq
+
Q+1
2
_
x
jq
+
Q+1
2
x
Q/2
log
2x
Q
+
x
q
_
D
Q+1
2
log
x
t
dt
t
2x
Q
log
2x
Q
+
x
q
log
2x
Q
log
+
2D
Q
;
24 H. A. HELFGOTT
recall that the coecient in front of this sum will be halved by the condition that
n is odd. Instead of (3.16), we obtain
q
D(Q+1)/2
q
j=0
1 +
q
jq
+ (Q+ 1)/2
_
log
x
jq
+
Q+1
2
_
q
3 + 2 log
2x
Q+ 1
+
_
D
Q+1
2
_
1 +
q
2t
_
_
log
x
t
_
dt
q
3 + 2 log
2x
Q+ 1
+Dlog
ex
D

Q+ 1
2
log
2ex
Q + 1
+
q
2
log
2x
Q+ 1
log
2D
Q+ 1
.
(The bound
_
b
a
log(x/t)dt/t log(x/a) log(b/a) will be more practical than the
exact expression for the integral.) Hence
Q/2<mD
|T
m
()| is at most
2
c
0
c
1
Dlog
ex
D
+
2
c
0
c
1
_
(1 +)
3 + 2 +
(1 +)
2
log
2D
Q+ 1
_
(Q+ 1) log
2x
Q+ 1
c
0
c
1

Q+ 1
2
log
2ex
Q+ 1
+
3c
1
2
_
2
5
+
1 +
log
+
D
Q/2
_
xlog
x
Summing this to (3.43) (with M = Q/2), and using (3.19) and (3.20) as before,
we obtain that (3.41) is at most
2
c
0
c
1
Dlog
ex
D
+
2
c
0
c
1
(1 +)(Q + 1)
_
3 + 2 log
+
2
ex
Q+ 1
+
1
2
log
+
2D
Q+ 1
log
+
2x
Q+ 1
_
+
3c
1
2
_
2
5
+
1 +
log
+
D
Q/2
_
xlog
x +
40
3
2c
0
c
3/2
2
xlog x.
Now we go over the case of || small (or D Q
0
/2). Instead of (3.21), we have
(3.44)
mq/2
|T
m,
()|
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log x.
Suppose q
2
< 2c
2
x. Instead of (3.22), we have
(3.45)
q
2
<mD
qm
|T
m,
()|
40
3
2
c
0
q
3
6x
1j
D
q
+
1
2
_
j +
1
2
_
log
x
_
j
1
2
_
q
10c
0
q
3
3
2
x
_
log
2x
q
+
1
q
_
D
0
log
x
t
dt +
1
q
_
D
0
t log
x
t
dt +
D
q
log
x
D
_
=
10c
0
q
3
3
2
x
_
log
2x
q
+
_
2D
q
+
(D
)
2
2q
2
_
log
ex
D
5c
0
c
2
3
2
_
4
2c
2
xlog
2x
q
+ 4
2c
2
xlog
ex
D
+D
log
ex
D
5c
0
c
2
3
2
_
D
log
ex
D
+ 4
2c
2
xlog
2
e
c
2
x
_
where D
= min(c
2
x/q, D). (We are using the bounds q
3
/x (2c
2
)
3/2
, D
q
2
/x
c
2
q < c
3/2
2
2x and D
q/x c
2
.) Instead of (3.23), we have
R<mD
|T
m,
()|
1
q
(DR)
j=0
_
3c
1
2
x
jq +R
+
4q
c
1
c
0
4
_
1 +
q
jq +R
_
_
log
x
jq +R
,
where R = max(c
2
x/q, q/2). We can simply reuse (3.24), multiplying it by
log x/R; we replace (3.25) by
q
1
q
(DR)
j=0
_
1 +
q
jq +R
log
x
jq +R
q
_
1 +
q
R
log
x
R
+
_
D
R
_
1 +
q
t
log
x
t
dt
3q log
q
c
2
+
_
Dlog
ex
D
Rlog
ex
R
_
+
q
2
log
q
c
2
log
+
D
R
.
We sum with (3.44) and (3.45), and obtain (3.40) as an upper bound for (3.41).
We will apply the following only for q relatively large.

0
,
q Q
0
, Q
0
max(2e, 2
x). Let be continuous, piecewise C

2
and compactly
supported, with ||
1
= 1 and
L
1
. Let c
0
|
. Let c
2
= 6/5
c
0
.
Assume that x e
2
c
2
/2.
Let U, V 1 satisfy UV + (19/18)Q
0
x/5.6. Then, if || 1/2c
2
, the
absolute value of
(3.46)
vV
v odd
(v)
uU
u odd
(u)
n
n odd
e(vun)(vun/x)
is at most
(3.47)
x
2q
min
_
1,
c
0
()
2
_
log V q
+O
_
1
4

1
2
_
c
0
_
D
2
log V
2qx
+
3c
4
2
UV
2
x
+
(U + 1)
2
V
2x
log q
_
plus
(3.48)
2
c
0
c
1
_
Dlog
D
e
+q
_
3 log
c
2
x
q
+
log D
2
log
+
D
q/2
__
+
3c
1
2
x
q
log Dlog
+
D
c
2
x/q
+
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log
q
2
+
3c
1
2
2c
2
xlog
c
2
x
2
+
25c
0
4
2
(2c
2
)
3/2
xlog x,
where D = UV and c
1
= 1 + |
|
1
/(2x/D) and c
4
= 1.03884. The same bound
holds if || 1/2c
2
but D Q
0
/2.
26 H. A. HELFGOTT
2
(3.49)
2
c
0
c
1
Dlog
D
e
+
2
c
0
c
1
(1 +)
_
x
||q
+ 1
_
_
(
3 + 2 1) log
x
||q
+ 1
2
+
1
2
log Dlog
+
e
2
D
x
||q
_
+
_
3c
1
2
_
1
2
+
3(1 +)
16
log x
_
+
20c
0
3
2
(2c
2
)
3/2
_
xlog x
for (0, 1].
Proof. We proceed essentially as in Lemma 3.4 and Lemma 3.5. Let Q, q
and
Q
be as in the proof of Lemma 3.5, that is, with 2 where Lemma 3.4 uses .
Let M = min(UV, Q/2). We rst consider the terms with uv M, u and v
odd, uv divisible by q. If q is even, there are no such terms. Assume q is odd.
Then, by (3.31) and (3.32), the absolute value of the contribution of these terms
is at most
(3.50)
aM
a odd
q|a
_
_
_
_
_
v|a
a/UvV
(v)(a/v)
_
_
_
_
_
_
x (/2)
2a
+O
_
a
x
|
2
2
(
2
4)
__
.
Now
aM
a odd
q|a
v|a
a/UvV
(v)(a/v)
a
=
vV
v odd
gcd(v,q)=1
(v)
v
umin(U,M/V )
u odd
q|u
(u)
u
+
V
p odd
p|q
(p
)
p
umin(U,M/V )
u odd
q
gcd(q,p
)
|u
(u)
u
=
(q)
q
vV
v odd
gcd(v,q)=1
(v)
v
umin(U/q,M/V q)
gcd(u,2q)=1
(u)
u
+
_
q
gcd(q,p
)
_
q
V
p odd
p|q
(p
)
p
/ gcd(q, p
umin
U
q/ gcd(q,p
)
,
M/V
q/ gcd(q,p
u odd
gcd
u,
q
gcd(q,p
=1
(u)
u
=
1
q
O
_
_
_
_
_
_
_
_
vV
gcd(v,2q)=1
(v)
v
+
V
p odd
p|q
log p
p
/ gcd(q, p
)
_
_
_
_
_
_
_
_
,
where we are using (2.7) to bound the sums on u by 1. We notice that
V
p odd
p|q
log p
p
/ gcd(q, p
p odd
p|q
(log p)
_
_
_
_
_
v
p
(q) +
>vp(q)
p
V
1
p
vp(q)
_
_
_
_
_
log q +
p odd
p|q
(log p)
>0
p
V
p
vp(q)
log p
p
log q +
vV
v odd
gcd(v,q)=1
(v)
v
,
and so
aM
a odd
q|a
v|a
a/UvV
(v)(a/v)
a
=
1
q
O
_
_
_
_
_
log q +
vV
gcd(v,2)=1
(v)
v
_
_
_
_
_
=
1
q
O
(log q + log V )
by (2.12). The absolute value of the sum of the terms with (/2) in (3.50) is
thus at most
x
q
(/2)
2
(log q + log V )
x
2q
min
_
1,
c
0
()
2
_
log V q,
where we are bounding (/2) by (2.1).
The other terms in (3.50) contribute at most
(3.51) (
2
4)
|
2
2
1
x
uU
vV
uv odd
uvM, q|uv
u sq-free
(v)uv.
For any R,
uR,u odd,q|u
R
2
/4q +3R/4. Using the estimates (2.12), (2.15)
and (2.16), we obtain that the double sum in (3.51) is at most
(3.52)
vV
gcd(v,2q)=1
(v)v
umin(U,M/v)
u odd
q|u
u +
V
p odd
p|q
(log p)p
uU
u odd
q
gcd(q,p
)
|u
u
vV
gcd(v,2q)=1
(v)v
_
(M/v)
2
4q
+
3M
4v
_
+
V
p odd
p|q
(log p)p
(U + 1)
2
4
M
2
log V
4q
+
3c
4
4
MV +
(U + 1)
2
4
V log q,
where c
4
= 1.03884.
28 H. A. HELFGOTT
From this point onwards, we use the easy bound
v|a
a/UvV
(v)(a/v)
log a.
What we must bound now is
(3.53)
mUV
m odd
q m or m > M
(log m)
n odd
e(mn)(mn/x).
The inner sum is the same as the sum T
m,
() in (3.33); we will be using the
bound (3.34). Much as before, we will be able to ignore the condition that m is
odd.
Let D = UV . What remains to do is similar to what we did in the proof of
Lemma 3.4 (or Lemma 3.5).
Case (a). large: || 1/2c
2
. Instead of (3.14), we have
1mM
qm
(log m)|T
m,
()|
40
3
2
c
0
q
3
4x
0j
M
q
(j + 1) log(j + 1)q,
and, since M min(c
2
x/q, D), q
2c
2
x (just as in the proof of Lemma 3.4)
and
0j
M
q
(j + 1) log(j + 1)q
M
q
log M +
_
M
q
+ 1
_
log(M + 1) +
1
q
2
_
M
0
t log t dt
_
2
M
q
+ 1
_
log x +
M
2
2q
2
log
M
e
,
we conclude that
(3.54)
1mM
qm
|T
m,
()|
5c
0
c
2
3
2
M log
M
e
+
20c
0
3
2
(2c
2
)
3/2
xlog x.
Instead of (3.15), we have
D(Q+1)/2
q
j=0
x
jq
+
Q+1
2
log
_
jq
+
Q+ 1
2
_
x
Q+1
2
log
Q+ 1
2
+
x
q
_
D
Q+1
2
log t
t
dt
2x
Q
log
Q
2
+
(1 +)x
2Q
_
(log D)
2
_
log
Q
2
_
2
_
.
Instead of (3.16), we estimate
q
D
Q+1
2
q
j=0
_
log
_
Q+ 1
2
+jq
__
1 +
q
jq
+
Q+1
2
q
_
log D + (
3 + 2 1) log
Q+ 1
2
_
+
_
D
Q+1
2
log t dt +
_
D
Q+1
2
q
log t
2t
dt
q
_
log D +
_
3 + 2 1
_
log
Q+ 1
2
_
+
_
Dlog
D
e

Q + 1
2
log
Q+ 1
2e
_
+
q
2
log Dlog
+
D
Q+1
2
.
We conclude that, when D Q/2, the sum
Q/2<mD
(log m)|T
m
()| is at most
2
c
0
c
1
_
Dlog
D
e
+ (Q + 1)
_
(1 +)(
3 + 2 1) log
Q+ 1
2

1
2
log
Q+ 1
2e
__
+
c
0
c
1
(Q+ 1)(1 +) log Dlog

+
e
2
D
Q+1
2
+
3c
1
2
_
2x
Q
log
Q
2
+
(1 +)x
2Q
_
(log D)
2
_
log
Q
2
_
2
__
.
We must now add this to (3.54). Since
(1 +)(
3 + 2 1) log
2
1
2
log 2e +
1 +
_
13/3
2
log 2
e > 0
and Q 2
x, we conclude that (3.53) is at most

(3.55)
2
c
0
c
1
Dlog
D
e
+
2
c
0
c
1
(1 +)(Q + 1)
_
(
3 + 2 1) log
Q+ 1
2
+
1
2
log Dlog
+
e
2
D
Q+1
2
_
+
_
3c
1
2
_
1
2
+
3(1 +)
16
log x
_
+
20c
0
3
2
(2c
2
)
3/2
_
xlog x.
Case (b). small: || 1/2c
2
or D Q
0
/2. The analogue of (3.21) is a
bound of
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log
q
2
for the terms with m q/2. If q
2
< 2c
2
x, then, much as in (3.22), we have
(3.56)
q
2
<mD
qm
|T
m,
()|(log m)
10
2
c
0
q
3
3x
1j
D
q
+
1
2
_
j +
1
2
_
log(j + 1/2)q
10
2
c
0
q
3x
_
D
+
3
2
q
q
xlog x dx.
30 H. A. HELFGOTT
Since
_
D
+
3
2
q
q
xlog x dx =
1
2
_
D
+
3
2
q
_
2
log
D
+
3
2
q
e

1
2
q
2
log
q
e
=
_
1
2
D
2
+
3
2
D
q
__
log
D
e
+
3
2
q
D
_
+
9
8
q
2
log
D
+
3
2
q
e

1
2
q
2
log
q
e
=
1
2
D
2
log
D
e
+
3
2
D
q log D
+
9
8
q
2
_
2
9
+
3
2
+ log
_
D
+
19
18
q
__
,
where D
= min(c
2
x/q, D), and since the assumption (UV +(19/18)Q
0
) x/5.6
implies that (2/9 + 3/2 + log(D
+ (19/18)q)) x, we conclude that

(3.57)
q
2
<mD
qm
|T
m,
()|(log m)
5c
0
c
2
3
2
D
log
D
e
+
10c
0
3
2
_
3
4
(2c
2
)
3/2
xlog x +
9
8
(2c
2
)
3/2
xlog x
_
5c
0
c
2
3
2
D
log
D
e
+
25c
0
4
2
(2c
2
)
3/2
xlog x.
Let R = max(c
2
x/q, q/2). We bound the terms R < m D as in (3.23), with a
factor of log(jq +R) inside the sum. The analogues of (3.24) and (3.25) are
(3.58)
1
q
(DR)
j=0
x
jq +R
log(jq +R)
x
R
log R +
x
q
_
D
R
log t
t
dt
_
2x
c
2
log
_
c
2
x
2
+
x
q
log Dlog
+
D
R
,
where we use the assumption that x e
2
c/2, and
(3.59)
1
q
(DR)
j=0
log(jq +R)
_
1 +
q
jq +R

3 log R
+
1
q
_
Dlog
D
e
Rlog
R
e
_
+
1
2
log Dlog
D
R
(or 0 if D < R). We sum with (3.57) and the terms with m q/2, and obtain,
for D
= c
2
x/q = R,
2
c
0
c
1
_
Dlog
D
e
+q
_
3 log
c
2
x
q
+
log D
2
log
+
D
q/2
__
+
3c
1
2
x
q
log Dlog
+
D
c
2
x/q
+
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log
q
2
+
3c
1
2
2c
2
xlog
c
2
x
2
+
25c
0
4
2
(2c
2
)
3/2
xlog x,
which, it is easy to check, is also valid even if D
= D (in which case (3.58) and

(3.59) do not appear) or R = q/2 (in which case (3.57) does not appear).
4. Type II
We must now consider the sum
(4.1) S
II
=
m>U
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
n>V
gcd(n,v)=1
(n)e(mn)(mn/x).
Here the main improvements over classical treatments are as follows:
(1) obtaining cancellation in the term
>U
1, leading to a gain of a factor
of log;
(2) using a large sieve for primes, getting rid of a further log;
(3) exploiting, via a non-conventional application of the principle of the large
sieve (Lemma 4.3), the fact that is in the tail of an interval (when that
is the case).
Some of the techniques developed for (1) should be applicable to other instances
of Vaughans identity in the literature.
It is technically helpful to express as the (multiplicative) convolution of two
functions of compact support preferrably the same function:
(4.2) (x) =
_

0
1
(t)
1
(x/t)
dt
t
.
For the smoothing function (t) =
2
(t) = 4 max(log 2 | log 2t|, 0), (4.2) holds
with
(4.3)
1
(t) =
_
2 if t (1/2, 1]
0 otherwise.
We will work with
1
(t) as in (4.3) for convenience, yet what follows should carry
over to other (non-negative) choices of
1
.
By (4.2), the sum (4.1) equals
(4.4)
4
_

0
m>U
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
n>V
gcd(n,v)=1
(n)e(mn)
1
(t)
1
_
mn/x
t
_
dt
t
= 4
_
x/U
V
max(
x
2W
,U)<m
x
W
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
max(V,
W
2
)<nW
gcd(n,v)=1
(n)e(mn)
dW
W
by the substitution t = (m/x)W. (We can assume V W x/U because
otherwise one of the sums in (4.5) is empty.)
We separate n prime and n non-prime. By Cauchy-Schwarz, the expres-
sion within the integral in (4.4) is then at most
_
S
1
(U, W) S
2
(U, V, W) +
32 H. A. HELFGOTT
_
S
1
(U, W) S
3
(W), where
(4.5)
S
1
(U, W) =
max(
x
2W
,U)<m
x
W
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
2
,
S
2
(U, V, W) =
max(
x
2W
,U)<m
x
W
gcd(m,v)=1
max(V,
W
2
)<pW
gcd(p,v)=1
(log p)e(mp)
2
.
and
(4.6)
S
3
(W) =
x
2W
<m
x
W
gcd(m,v)=1
nW
n non-prime
(n)
2
=
x
2W
<m
x
W
gcd(m,v)=1
_
1.42620W
1/2
_
2
1.0171x + 2.0341W
(by [RS62, Thm. 13]). We will assume V w; thus the condition gcd(p, v) = 1
will be fullled automatically and can be removed.
The contribution of S
3
(W) will be negligible. We must bound S
1
(U, W) and
S
2
(U, V, W) from above.
4.1. The sum S
1
: cancellation. We shall bound
S
1
(U, W) =
max(U,x/2W)<mx/W
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
2
.
There will be what is perhaps a surprising amount of cancellation: the expres-
sion within the sum will be bounded by a constant on average.
4.1.1. Reduction to a sum with . We can write
(4.7)
max(U,x/2W)<mx/W
gcd(m,v)=1
_
_
_
_
d>U
d|m
(d)
_
_
_
_
2
=
x
2W
<m
x
W
gcd(m,v)=1
d
1
,d
2
|m
(d
1
> U)(d
2
> U)
=
r
1
<x/WU
r
2
<x/WU
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
l
gcd(l,r
1
r
2
)=1
r
1
l,r
2
l>U
gcd(,v)=1
(r
1
l)(r
2
l)
x
2W
<m
x
W
r
1
r
2
l|m
gcd(m,v)=1
1,
where we write d
1
= r
1
l, d
2
= r
2
l, l = gcd(d
1
, d
2
). (The inequality r
1
< x/WU
comes from r
1
r
2
l|m, m x/W, r
2
l > U; r
2
< x/WU is proven in the same way.)
Now (4.7) equals
(4.8)
s<
x
WU
gcd(s,v)=1
r
1
<
x
WUs
r
2
<
x
WUs
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
max
U
min(r
1
,r
2
)
,
x/W
2r
1
r
2
s
<l
x/W
r
1
r
2
s
gcd(l,r
1
r
2
)=1,((l))
2
=1
gcd(,v)=1
1,
where we have set s = m/(r
1
r
2
l).
Lemma 4.1. Let z, y > 0. Then
(4.9)
r
1
<y
r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
min
z/y
min(r
1
,r
2
)
,
z
2r
1
r
2
<l
z
r
1
r
2
gcd(l,r
1
r
2
)=1,((l))
2
=1
gcd(,v)=1
1
equals
(4.10)
6z
2
v
(v)
r
1
<y
r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
_
1 max
_
1
2
,
r
1
y
,
r
2
y
__
+O
_
_
5.08
_
3
2
_
2
y
p|v
_
1 +
1
p
__
1
1
p
3/2
_
2
_
_
.
If v = 2, the error term in (4.10) can be replaced by
(4.11) O
_
1.27
_
3
2
_
2
y
z
_
1 +
1
2
__
1
1
2
3/2
_
2
_
.
Proof. By Mobius inversion, (4.9) equals
(4.12)
r
1
<y
r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
l
z
r
1
r
2
l>min
z/y
min(r
1
,r
2
)
,
z
2r
1
r
2
gcd(,v)=1
d
1
|r
1
,d
2
|r
2
d
1
d
2
|l
(d
1
)(d
2
)
d
3
|v
d
3
|l
(d
3
)
m
2
|l
gcd(m,r
1
r
2
v)=1
(m).
We can change the order of summation of r
i
and d
i
by dening s
i
= r
i
/d
i
, and
we can also use the obvious fact that the number of integers in an interval (a, b]
divisible by d is (b a)/d +O
(1). Thus (4.12) equals

(4.13)
d
1
,d
2
<y
gcd(d
1
,d
2
)=1
gcd(d
1
d
2
,v)=1
(d
1
)(d
2
)
s
1
<y/d
1
s
2
<y/d
2
gcd(d
1
s
1
,d
2
s
2
)=1
gcd(s
1
s
2
,v)=1
(d
1
s
1
)(d
2
s
2
)
d
3
|v
(d
3
)
z
d
2
1
s
1
d
2
2
s
2
d
3
gcd(m,d
1
s
1
d
2
s
2
v)=1
(m)
d
1
d
2
d
3
m
2
z
s
1
d
1
s
2
d
2
_
1 max
_
1
2
,
s
1
d
1
y
,
s
2
d
2
y
__
34 H. A. HELFGOTT
plus
(4.14) O
_
_
_
_
_
_
_
_
d
1
,d
2
<y
gcd(d
1
d
2
,v)=1
s
1
<y/d
1
s
2
<y/d
2
gcd(s
1
s
2
,v)=1
d
3
|v
z
d
2
1
s
1
d
2
2
s
2
d
3
m sq-free
1
_
_
_
_
_
_
_
_
.
If we complete the innermost sum in (4.13) by removing the condition m
_
z/(d
2
1
sd
2
2
s
2
), we obtain (reintroducing the variables r
i
= d
i
s
i
)
(4.15)
z
r
1
,r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
r
1
r
2
_
1 max
_
1
2
,
r
1
y
,
r
2
y
__
d
1
|r
1
d
2
|r
2
d
3
|v
m
gcd(m,r
1
r
2
v)=1
(d
1
)(d
2
)(m)(d
3
)
d
1
d
2
d
3
m
2
times z. Now (4.15) equals
r
1
,r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)z
r
1
r
2
_
1 max
_
1
2
,
r
1
y
,
r
2
y
__

p|r
1
r
2
v
_
1
1
p
_

pr
1
r
2
pv
_
1
1
p
2
_
=
6z
2
v
(v)
r
1
,r
2
<y
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
_
1 max
_
1
2
,
r
1
y
,
r
2
y
__
,
i.e., the main term in (4.10). It remains to estimate the terms used to complete
the sum; their total is, by denition, given exactly by (4.13) with the inequality
m
_
z/(d
2
1
sd
2
2
s
2
d
3
) changed to m >
_
z/(d
2
1
sd
2
2
s
2
d
3
). This is a total of size at
most
(4.16)
1
2
d
1
,d
2
<y
gcd(d
1
d
2
,v)=1
s
1
<y/d
1
s
2
<y/d
2
gcd(s
1
s
2
,v)=1
d
3
|v
m>
z
d
2
1
s
1
d
2
2
s
2
d
3
m sq-free
1
d
1
d
2
d
3
m
2
z
s
1
d
1
s
2
d
2
.
Adding this to (4.14), we obtain, as our total error term,
(4.17)
d
1
,d
2
<y
gcd(d
1
d
2
,v)=1
s
1
<y/d
1
s
2
<y/d
2
gcd(s
1
s
2
,v)=1
d
3
|v
f
_
_
z
d
2
1
s
1
d
2
2
s
2
d
3
_
,
where
f(x) :=
mx
m sq-free
1 +
1
2
m>x
m sq-free
x
2
m
2
.
It is easy to see that f(x)/x has a local maximum exactly when x is a square-free
(positive) integer. We can hence check that
f(x)
1
2
_
2 + 2
_
(2)
(4)
1.25
__
x = 1.26981 . . . x
for all x 0 by checking all integers smaller than a constant and using {m :
m sq-free} {m : 4 m} and 1.5 (3/4) < 1.26981 to bound f from below for x
larger than a constant. Therefore, (4.17) is at most
1.27
d
1
,d
2
<y
gcd(d
1
d
2
,v)=1
s
1
<y/d
1
s
2
<y/d
2
gcd(s
1
s
2
,v)=1
d
3
|v
_
z
d
2
1
s
1
d
2
2
s
2
d
3
= 1.27
p|v
_
1 +
1
p
_
_
_
_
_
_
d<y
gcd(d,v)=1
s<y/d
gcd(s,v)=1
1
d
s
_
_
_
_
_
2
.
We can bound the double sum simply by
d<y
gcd(d,v)=1
s<y/d
1
sd
2
d<y
_
y/d
d
2
y
_
3
2
_
p|v
_
1
1
p
3/2
_
.
Alternatively, if v = 2, we bound
s<y/d
gcd(s,v)=1
1
s
=
s<y/d
s odd
1
s
1 +
1
2
_
y/d
1
1
s
ds =
_
y/d
and thus
d<y
gcd(d,v)=1
s<y/d
gcd(s,v)=1
1
sd

d<y
gcd(d,2)=1
_
y/d
d

y
_
1
1
2
3/2
_
_
3
2
_
.
Applying Lemma 4.1 with y = S/s and z = x/Ws, where S = x/WU, we

obtain that (4.8) equals
(4.18)
6x
2
W
v
(v)
s<S
gcd(s,v)=1
1
s
r
1
<S/s
r
2
<S/s
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
_
1 max
_
1
2
,
r
1
S/s
,
r
2
S/s
__
+O
_
_
5.04
_
3
2
_
3
S
_
x
W
p|v
_
1 +
1
p
__
1
1
p
3/2
_
3
_
_
,
36 H. A. HELFGOTT
with 5.04 replaced by 1.27 if v = 2. The main term in (4.18) can be written as
(4.19)
6x
2
W
v
(v)
sS
gcd(s,v)=1
1
s
_
1
1/2
r
1
uS
s
r
2
uS
s
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
du.
From now on, we will focus on the cases v = 1 and v = 2 for simplicity. (Higher
values of v do not seem to be really protable in the last analysis.)
4.1.2. Explicit bounds for a sum with . We must estimate the expression within
parentheses in (4.19). It is not too hard to show that it tends to 0; the rst
part of the proof of Lemma 4.2 will reduce this to the fact that
n
(n)/n = 0.
Obtaining good bounds is a more delicate matter. For our purposes, we will need
the expression to converge to 0 at least as fast as 1/(log)
2
, with a good constant
in front. For this task, the bound (2.8) on
nx
(n)/n is enough.
Lemma 4.2. Let
g
v
(x) :=
r
1
x
r
2
x
gcd(r
1
,r
2
)=1
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
,
where v = 1 or v = 2. Then
|g
1
(x)|
_
_
1/x if 33 x 10
6
,
1
x
(111.536 + 55.768 log x) if 10
6
x < 10
10
,
0.0044325
(log x)
2
+
0.1079
x
if x 10
10
,
|g
2
(x)|
_
_
2.1/x if 1 x 10
6
,
1
x
(1634.34 + 817.168 log x) if 10
6
x < 10
10
,
0.038128
(log x)
2
+
0.2046
x
. if x 10
10
.
Tbe proof involves what may be called a version of Rankins trick, using Dirich-
let series and the behavior of (s) near s = 1. The statements for x 10
6
are
proven by direct computation.
Proof. Clearly
(4.20)
g(x) =
r
1
x
r
2
x
gcd(r
1
r
2
,v)=1
_
_

d| gcd(r
1
,r
2
)
(d)
_
_
(r
1
)(r
2
)
(r
1
)(r
2
)
=
dx
gcd(d,v)=1
(d)
r
1
x
r
2
x
d| gcd(r
1
,r
2
)
gcd(r
1
r
2
,v)=1
(r
1
)(r
2
)
(r
1
)(r
2
)
=
dx
gcd(d,v)=1
(d)
((d))
2
u
1
x/d
gcd(u
1
,dv)=1
u
2
x/d
gcd(u
2
,dv)=1
(u
1
)(u
2
)
(u
1
)(u
2
)
=
dx
gcd(d,v)=1
(d)
((d))
2
_
_
_
_
_
rx/d
gcd(r,dv)=1
(r)
(r)
_
_
_
_
_
2
.
Moreover,
rx/d
gcd(r,dv)=1
(r)
(r)
=
rx/d
gcd(r,dv)=1
(r)
r
|r
p|d
_
p
p + 1
1
_
=
x/d
(d
)
2
=1
gcd(d
,dv)=1
_
_
p|d
1
p + 1
_
_

rx/d
gcd(r,dv)=1
d
|r
(r)
r
=
x/d
(d
)
2
=1
gcd(d
,dv)=1
1
d
(d
rx/dd
gcd(r,dd
v)=1
(r)
r
and
rx/dd
gcd(r,dd
v)=1
(r)
r
=
x/dd
|(dd
v)
1
d
rx/dd
(r)
r
.
Hence
(4.21)
|g(x)|
dx
gcd(d,v)=1
((d))
2
((d))
2
_
_
_
_
_
_
_
_
x/d
(d
)
2
=1
gcd(d
,dv)=1
1
d
(d
x/dd
|(dd
v)
1
d
f(x/dd
)
_
_
_
_
_
_
_
_
2
,
where f(t) =
rt
(r)/r
.
We intend to bound the function f(t) by a linear combination of terms of the
form t
, [0, 1/2). Thus it makes sense now to estimate F

v
(s
1
, s
2
, x), dened
38 H. A. HELFGOTT
to be the quantity
d
gcd(d,v)=1
((d))
2
((d))
2
_
_
_
_
_
1
gcd(d
1
,dv)=1
(d
1
)
2
d
1
(d
1
)
1
|(dd
1
v)
1
d
1
(dd
1
d
1
)
1s
1
_
_
_
_
_
_
_
_
_
_
2
gcd(d
2
,dv)=1
(d
2
)
2
d
2
(d
2
)
2
|(dd
2
v)
1
d
2
(dd
2
d
2
)
1s
2
_
_
_
_
_
.
for s
1
, s
2
[1/2, 1]. This is equal to
d
gcd(d,v)=1
(d)
2
d
s
1
+s
2
p|d
1
(1 +p
1
)
2
(1 p
s
1
)
p|v
1
(1p
s
1)(1p
s
2)
(1 p
s
2
)
_
_
_
_
gcd(d
,dv)=1
(d
)
2
(d
)
s
1
+1
|d
1
(1 +p
1
) (1 p
s
1
)
_
_
_
_
_
_
_
_
gcd(d
,dv)=1
(d
)
2
(d
)
s
2
+1
|d
1
(1 +p
1
) (1 p
s
2
)
_
_
_
_
,
which in turn can easily be seen to equal
(4.22)
pv
_
1 +
p
s
1
p
s
2
(1 p
s
1
+p
1
)(1 p
s
2
+p
1
)
_
p|v
1
(1 p
s
1
)(1 p
s
2
)
pv
_
1 +
p
1
p
s
1
(1 +p
1
)(1 p
s
1
)
_
pv
_
1 +
p
1
p
s
2
(1 +p
1
)(1 p
s
2
)
_
Now, for any 0 < x y x
1/2
< 1,
(1+xy)(1xy)(1xy
2
)(1+x)(1y)(1x
3
) = (xy)(y
2
x)(xyx1)x 0,
and so
(4.23)
1 +
xy
(1 +x)(1 y)
=
(1 +x y)(1 xy)(1 xy
2
)
(1 +x)(1 y)(1 xy)(1 xy
2
)

(1 x
3
)
(1 xy)(1 xy
2
)
.
For any x y
1
, y
2
< 1 with y
2
1
x, y
2
2
x,
(4.24) 1 +
y
1
y
2
(1 y
1
+x)(1 y
2
+x)

(1 x
3
)
2
(1 x
4
)
(1 y
1
y
2
)(1 y
1
y
2
2
)(1 y
2
1
y
2
)
.
This can be checked as follows: multiplying by the denominators and changing
variables to x, s = y
1
+ y
2
and r = y
1
y
2
, we obtain an inequality where the left
side, quadratic on s with positive leading coecient, must be less than or equal
to the right side, which is linear on s. The left side minus the right side can be
maximal for given x, r only when s is maximal or minimal. This happens when
y
1
= y
2
or when either y
i
=

x or y
i
= x for at least one of i = 1, 2. In each of
these cases, we have reduced (4.24) to an inequality in two variables that can be
proven automatically
4
by a quantier-elimination program; the author has used
QEPCAD [HB11] to do this.
Hence F
v
(s
1
, s
2
, x) is at most
(4.25)
pv
(1 p
3
)
2
(1 p
4
)
(1 p
s
1
s
2
)(1 p
2s
1
s
2
)(1 p
s
1
2s
2
)

p|v
1
(1 p
s
1
)(1 p
s
2
)
pv
1 p
3
(1 +p
s
1
1
)(1 +p
2s
1
1
)
pv
1 p
3
(1 +p
s
2
1
)(1 +p
2s
2
1
)
= C
v,s
1
,s
2

(s
1
+ 1)(s
2
+ 1)(2s
1
+ 1)(2s
2
+ 1)
(3)
4
(4)((s
1
+s
2
)(2s
1
+s
2
)(s
1
+ 2s
2
))
1
,
where
C
v,s
1
,s
2
=
_
1 if v = 1,
(12
s
1
2s
2)(1+2
s
1
1
)(1+2
2s
1
1
)(1+2
s
2
1
)(1+2
2s
2
1
)
(12
s
1
+s
2)
1
(12
2s
1
s
2)
1
(12
s
1 )(12
s
2 )(12
3
)
4
(12
4
)
if v = 2.
For 1 t x, (2.8) and (2.11) imply
(4.26) f(t)
_
_
_
_
2
t
if x 10
10
_
2
t
+
0.03
log x
_
x
t
_
log log 10
10
log xlog 10
10
if x > 10
10
,
where we are using the fact that log x is convex-down. Note that, again by
convexity,
log log x log log 10
10
log x log 10
10
< (log t)
|
t=log 10
10 =
1
log 10
10
= 0.0434294 . . .
Obviously,
_
2/t in (4.26) can be replaced by (2/t)
1/2
for any 0.
By (4.21) and (4.26),
|g
v
(x)|
_
2
x
_
12
F
v
(1/2 +, 1/2 +, x)
for x 10
10
. We set = 1/ log x and obtain from (4.25) that
(4.27)
F
v
(1/2 +, 1/2 +, x) C
v,
1
2
+,
1
2
+
(1 + 2)(3/2)
4
(2)
2
(3)
4
(4)
55.768 C
v,
1
2
+,
1
2
+
_
1 +
log x
2
_
,
where we use the easy bound (s) < 1 + 1/(s 1) obtained by
n
s
< 1 +
_

1
t
s
dt.
(For sharper bounds, see [BR02].) Now
C
2,
1
2
+,
1
2
+

(1 2
3/2
)
2
(1 + 2
3/2
)
2
(1 + 2
2
)
2
(1 2
12
)
(1 2
1/2
)
2
(1 2
3
)
4
(1 2
4
)
14.652983,
4
In practice, the case yi =
x leads to a polynomial of high degree, and quantier elimination

increases sharply in complexity as the degree increases; a stronger inequality of lower degree
(with (1 3x
3
) instead of (1 x
3
)
2
(1 x
4
)) was given to QEPCAD to prove in this case.
40 H. A. HELFGOTT
whereas C
1,
1
2
+,
1
2
+
= 1. (We are assuming x 10
6
, and so 1/(log 10
6
).)
Hence
|g
v
(x)|
_
1
x
(111.536 + 55.768 log x) if v = 1,
1
x
(1634.34 + 817.168 log x) if v = 2.
for 10
6
x < 10
10
.
For general x, we must use the second bound in (4.26). Dene c = 1/(log 10
10
).
We see that, if x > 10
10
,
|g
v
(x)|
0.03
2
(log x)
2
F
1
(1 c, 1 c) C
v,1c,1c
+ 2
x
0.03
log x
F(1 c, 1/2) C
v,1c,1/2
+
1
x
(111.536 + 55.768 log x) C
v,
1
2
+,
1
2
+
.
For v = 1, this gives
|g
1
(x)|
0.0044325
(log x)
2
+
2.1626
xlog x
+
1
x
(111.536 + 55.768 log x)
0.0044325
(log x)
2
+
0.1079
x
;
for v = 2, we obtain
|g
2
(x)|
0.038128
(log x)
2
+
25.607
xlog x
+
1
x
(1634.34 + 817.168 log x)
0.038128
(log x)
2
+
0.2046
x
.
4.1.3. Estimating the triple sum. We will now be able to bound the triple sum
in (4.19), viz.,
(4.28)
sS
gcd(s,v)=1
1
s
_
1
1/2
g
v
(uS/s)du,
where g
v
is as in Lemma 4.2.
As we will soon see, Lemma 4.2 that (4.28) is bounded by a constant (essentially
because the integral
_
1/2
0
1/t(log t)
2
converges). We must give as good a constant
as we can, since it will aect the largest term in the nal result.
Clearly g
v
(R) = g
v
(R). The contribution of each g
v
(m), 1 m S, to
(4.28) is exactly g
v
(m) times
(4.29)
S
m+1
<s
S
m
1
s
gcd(s,v)=1
_
1
ms/S
du +
S
2m
<s
S
m+1
1
s
gcd(s,v)=1
_
(m+1)s/S
ms/S
du +
S
2(m+1)
<s
S
2m
1
s
gcd(s,v)=1
_
(m+1)s/S
1/2
du
=
S
m+1
<s
S
m
gcd(s,v)=1
_
1
s

m
S
_
+
S
2m
<s
S
m+1
gcd(s,v)=1
1
S
+
S
2(m+1)
<s
S
2m
gcd(s,v)=1
_
m+ 1
S

1
2s
_
.
Write f(t) = 1/S for S/2m < t S/(m + 1), f(t) = 0 for t > S/m or t <
S/2(m+1), f(t) = 1/tm/S for S/(m+1) < t S/m and f(t) = (m+1)/S1/2t
for S/2(m + 1) < t S/2m; then (4.29) equals
n:gcd(n,v)=1
f(n). By Euler-
Maclaurin (second order),
(4.30)
n
f(n) =
_

f(x)
1
2
B
2
({x})f
(x)dx =
_

f(x) +O
_
1
12
|f
(x)|
_
dx
=
_

f(x)dx +
1
6
O
_
3
2m
_
_
s
m+ 1
_
_
=
1
2
log
_
1 +
1
m
_
+
1
6
O
_
_
2m
s
_
2
+
_
m+ 1
s
_
2
_
.
Similarly,
n odd
f(n) =
_

f(2x + 1)
1
2
B
2
({x})
d
2
f(2x + 1)
dx
2
dx
=
1
2
_

f(x)dx 2
_

1
2
B
2
__
x 1
2
__
f
(x)dx
=
1
2
_

f(x)dx +
1
6
_

_
|f
(x)|
_
dx
=
1
4
log
_
1 +
1
m
_
+
1
3
O
_
_
2m
s
_
2
+
_
m+ 1
s
_
2
_
.
We use these expressions for m C
0
, where C
0
33 is a constant to be
computed later; they will give us the main term. For m > C
0
, we use the bounds
on |g(m)| that Lemma 4.2 gives us.
(Starting now and for the rest of the paper, we will focus on the cases v = 1,
v = 2 when giving explicit computational estimates. All of our procedures would
allow higher values of v as well, but, as will become clear much later, the gains
from higher values of v are oset by losses and complications elsewhere.)
Let us estimate (4.28). Let
c
v,0
=
_
1/6 if v = 1,
1/3 if v = 2,
c
v,1
=
_
1 if v = 1,
2.5 if v = 2,
c
v,2
=
_
55.768 . . . if v = 1,
817.168 . . . if v = 2,
c
v,3
=
_
111.536 . . . if v = 1,
1634.34 . . . if v = 2,
c
v,4
=
_
0.0044325 . . . if v = 1,
0.038128 . . . if v = 2,
c
v,5
=
_
0.1079 . . . if v = 1,
0.2046 . . . if v = 2.
42 H. A. HELFGOTT
Then (4.28) equals
mC
0
g
v
(m)
_
(v)
2v
log
_
1 +
1
m
_
+O
_
c
v,0
5m
2
+ 2m + 1
S
2
__
+
S/10
6
s<S/C
0
1
s
_
1
1/2
O
_
c
v,1
uS/s
_
du
+
S/10
10
s<S/10
6
1
s
_
1
1/2
O
_
c
v,2
log(uS/s) +c
v,3
uS/s
_
du
+
s<S/10
10
1
s
_
1
1/2
O
_
c
v,4
(log uS/s)
2
+
c
v,5
_
uS/s
_
du,
which is
mC
0
g
v
(m)
(v)
2v
log
_
1 +
1
m
_
+
mC
0
|g(m)| O
_
c
v,0
5m
2
+ 2m+ 1
S
2
_
+O
_
c
v,1
log 2
C
0
+
log 2
10
6
_
c
v,3
+c
v,2
(1 + log 10
6
)
_
+
2
2
10
10/2
c
v,5
_
+O
_
_

s<S/10
10
c
v,4
/2
s(log S/2s)
2
_
_
for S (C
0
+ 1). Note that
s<S/10
10
1
s(log S/2s)
2
=
_
2/10
10
0
1
t(log t)
2
dt.
Now
c
v,4
2
_
2/10
10
0
1
t(log t)
2
dt =
c
v,4
/2
log(10
10
/2)
=
_
0.00009923 . . . if v = 1
0.000853636 . . . if v = 2.
and
log 2
10
6
_
c
v,3
+c
v,2
(1 + log 10
6
)
_
+
2
2
10
5
c
v,5
=
_
0.0006506 . . . if v = 1
0.009525 . . . if v = 2.
For C
0
= 10000,
(v)
v
1
2
mC
0
g
v
(m) log
_
1 +
1
m
_
=
_
0.362482 . . . if v = 1,
0.360576 . . . if v = 2,
c
v,0
mC
0
|g
v
(m)|(5m
2
+ 2m+ 1)
_
6204066.5 . . . if v = 1,
15911340.1 . . . if v = 2,
and
c
v,1
(log 2)/C
0
=
_
0.00006931 . . . if v = 1,
0.00017328 . . . if v = 2.
Thus, for S 100000,
(4.31)
sS
1
s
_
1
1/2
g
v
(uS/s)du
_
0.36393 if v = 1,
0.37273 if v = 2.
For S < 100000, we proceed as above, but using the exact expression (4.29)
instead of (4.30). Note (4.29) is of the form f
s,m,1
(S) +f
s,m,2
(S)/S, where both
f
s,m,1
(S) and f
s,m,2
(S) depend only on S (and on s and m). Summing over
m S, we obtain a bound of the form
sS
1
s
_
1
1/2
g
v
(uS/s)du G
v
(S)
with
G
v
(S) = K
v,1
(|S|) +K
v,2
(|S|)/S,
where K
v,1
(n) and K
v,2
(n) can be computed explicitly for each integer n. (For
example, G
v
(S) = 1 1/S for 1 S < 2 and G
v
(S) = 0 for S < 1.)
It is easy to check numerically that this implies that (4.31) holds not just
for S 100000 but also for 40 S < 100000 (if v = 1) or 16 S < 100000 (if
v = 2). Using the fact that G
v
(S) is non-negative, we can compare
_
T
1
G
v
(S)dS/S
with log(T + 1/N) for each T [2, 40]
1
N
Z (N a large integer) to show, again
numerically, that
(4.32)
_
T
1
G
v
(S)
dS
S

_
0.3698 log T if v = 1,
0.37273 log T if v = 2.
(We use N = 100000 for v = 1; already N = 10 gives us the answer above for
v = 2. Indeed, computations suggest the better bound 0.358 instead of 0.37273;
we are committed to using 0.37273 because of (4.31).)
Multiplying by 6v/
2
(v), we conclude that
(4.33) S
1
(U, W) =
x
W
H
1
_
x
WU
_
+O
_
5.08(3/2)
3
x
3/2
W
3/2
U
_
if v = 1,
(4.34) S
1
(U, W) =
x
W
H
2
_
x
WU
_
+O
_
1.27(3/2)
3
x
3/2
W
3/2
U
_
if v = 2, where
(4.35)
H
1
(S) =
_
6
2
G
1
(S) if 1 S < 40,
0.22125 if S 40,
H
2
(s) =
_
4
2
G
2
(S) if 1 S < 16,
0.15107 if S 16.
Hence (by (4.32))
(4.36)
_
T
1
H
v
(S)
dS
S

_
0.22482 log T if v = 1,
0.15107 log T if v = 2;
moreover, H
1
(S) 3/
2
, H
2
(S) 2/
2
for all S.
* * *
Note. There is another way to obtain cancellation on , applicable when
(x/W) > Uq (as is unfortunately never the case in our main application). For
this alternative to be taken, one must either apply Cauchy-Schwarz on n rather
than m (resulting in exponential sums over m) or lump together all m near each
other and in the same congruence class modulo q before applying Cauchy-Schwarz
on m (one can indeed do this if is small). We could then write
mW
mr mod q
d|m
d>U
(d) =
mW
mr mod q
d|m
dU
(d) =
dU
(d)(W/qd +O(1))
44 H. A. HELFGOTT
and obtain cancellation on d. If Uq (x/W), however, the error term dominates.
4.2. The sum S
2
: the large sieve, primes and tails. We must now bound
(4.37) S
2
(U
, W
, W) =
<m
x
W
gcd(m,v)=1
<pW
(log p)e(mp)
2
.
for U
= max(U, x/2W), W
= max(V, W/2). (The condition gcd(p, v) = 1 will

be fullled automatically by the assumption V > v.)
From a modern perspective, this is clearly a case for a large sieve. It is also
clear that we ought to try to apply a large sieve for sequences of prime support.
What is subtler here is how to do things well for very large q (i.e., x/q small).
This is in some sense a dual problem to that of q small, but it poses additional
complications; for example, it is not obvious how to take advantage of prime
support for very large q.
As in type I, we avoid this entire issue by forbidding q large and then taking
advantage of the error term /x in the approximation =
a
q
+

x
. This is one
of the main innovations here. Note this alternative method will allow us to take
advantage of prime support.
A key situation to study is that of frequencies
i
clustering around given
rationals a/q while nevertheless keeping at a certain small distance from each
other.
Lemma 4.3. Let q 1. Let
1
,
2
, . . . ,
k
R/Z be of the form
i
= a
i
/q +
i
,
0 a
i
< q, where the elements
i
R all lie in an interval of length > 0, and
where a
i
= a
j
implies |
i

j
| > > 0. Assume + 1/q. Then, for any
W, W
1, W
W/2,
(4.38)
k
i=1
<pW
(log p)e(
i
p)
2
min
_
1,
2q
(q)
1
log ((q( +))
1
)
_
_
W W
+
1
_

W
<pW
(log p)
2
.
Proof. For any distinct i, j, the angles
i
,
j
are separated by at least (if
a
i
= a
j
) or at least 1/q |
i

j
| 1/q (if a
i
= a
j
). Hence we can
apply the large sieve (in the optimal N +
1
1 form due to Selberg [Sel91] and
Montgomery-Vaughan [MV74]) and obtain the bound in (4.38) with 1 instead of
min(1, . . . ) immediately.
We can also apply Montgomerys inequality ([Mon68], [Hux72]; see the expo-
sitions in [Mon71, pp. 2729] and [IK04, 7.4]). This gives us that the left side
of (4.38) is at most
(4.39)
_
_
_
_
_
rR
gcd(r,q)=1
((r))
2
(r)
_
_
_
_
_
1
rR
gcd(r,q)=1
mod r
gcd(a
,r)=1
k
i=1
<pW
(log p)e((
i
+a
/r)p)
2
If we add all possible fractions of the form a
/r, r R, gcd(r, q) = 1, to the

fractions a
i
/q, we obtain fractions that are separated by at least 1/qR
2
. If +
1/qR
2
, then the resulting angles
i
+a
/r are still separated by at least . Thus

we can apply the large sieve to (4.39); setting R = 1/
_
( +)q, we see that we
gain a factor of
(4.40)
rR
gcd(r,q)=1
((r))
2
(r)

(q)
q
rR
((r))
2
(r)

(q)
q
dR
1
d

(q)
2q
log
_
(q( +))
1
_
,
since
dR
1/d log(R) for all R 1 (integer or not).
Let us rst give a bound on sums of the type of S
2
(U, V, W) using prime
support but not the error terms (or Lemma 4.3).
Lemma 4.4. Let W 1, W
W/2. Let = a/q +O
(1/qQ), q Q. Then
(4.41)
A
0
<mA
1
<pW
(log p)e(mp)
_
A
1
A
0
min(q, Q/2)
_
(W W
+ 2q)
<pW
(log p)
2
.
If q < W/2 and Q 3.5W, the following bound also holds:
(4.42)
A
0
<mA
1
<pW
(log p)e(mp)
_
A
1
A
0
q
_
q
(q)
W
log(W/2q)

<pW
(log p)
2
.
If A
1
A
0
q and q Q, , [0, 1], the following bound also holds:
(4.43)
A
0
<mA
1
<pW
(log p)e(mp)
2
(W W
+q/(1 ))
<pW
(log p)
2
.
The inequality (4.42) can be stronger than (4.42) only when q < W/7.2638 . . .
(if q is odd) or q < W/92.514 . . . (if q is even).
Proof. Let k = min(q, Q/2) q/2. We split (A
0
, A
1
] into (A
1
A
0
)/k
blocks of at most k consecutive integers m
0
+1, m
0
+2, . . . . For m, m
in such a
block, m and m
are separated by a distance of at least

|{(a/q)(mm
)}| O
(k/qQ) = 1/q O
(1/2q) 1/2q.
By the large sieve
(4.44)
q
a=1
<pW
(log p)e((m
0
+a)p)
2
((W W
) + 2q)
<pW
(log p)
2
.
We obtain (4.41) by summing over all (A
1
A
0
)/k blocks.
If A
1
A
0
|q| and q Q, , [0, 1], we obtain (4.43) simply by applying
the large sieve without splitting the interval A
0
< m A
1
.
46 H. A. HELFGOTT
Let us now prove (4.42). We will use Montgomerys inequality, followed by
Montgomery and Vaughans large sieve with weights. An angle a/q + a
1
/r
1
is
separated from other angles a
/q +a
2
/r
2
(r
1
, r
2
R, gcd(a
i
, r
i
) = 1) by at least
1/qr
1
R, rather than just 1/qR
2
. We will choose R so that qR
2
< Q; this implies
1/Q < 1/qR
2
1/qr
1
R.
By Montgomerys inequality [IK04, Lemma 7.15], applied (for each 1 a q)
to S() =
n
a
n
e(n) with a
n
= log(n)e((m
0
+a)n) if n is prime and a
n
= 0
otherwise,
(4.45)
1
(r)
<pW
(log p)e((m
0
+a)p)
mod r
gcd(a
,r)=1
<pW
(log p)e
__
(m
0
+a) +
a
r
_
p
_
2
.
for each square-free r W
. We multiply both sides of (4.45) by (W/2 +

(3/2)(1/qrR 1/Q)
1
)
1
and sum over all a = 0, 1, . . . , q 1 and all square-
free r R coprime to q; we will later make sure that R W
. We obtain
that
(4.46)
rR
gcd(r,q)=1
_
W
2
+
3
2
_
1
qrR

1
Q
_
1
_
1
(r)
2
(r)
a=1
<pW
(log p)e((m
0
+a)p)
2
is at most
(4.47)
rR
gcd(r,q)=1
r sq-free
_
W
2
+
3
2
_
1
qrR

1
Q
_
1
_
1
q
a=1
mod r
gcd(a
,r)=1
<pW
(log p)e
__
(m
0
+a) +
a
r
_
p
_
2
We now apply the large sieve with weights [MV73, (1.6)], recalling that each
angle (m
0
+a) +a
/r is separated from the others by at least 1/qrR1/Q; we

obtain that (4.47) is at most
<pW
(log p)
2
. It remains to estimate the sum
in the rst line of (4.46). (We are following here a procedure analogous to that
used in [MV73] to prove the Brun-Titchmarsh theorem.)
Assume rst that q W/13.5. Set
(4.48) R =
_
W
q
_
1/2
,
where = 1/2e
20.25068
= 0.30285 . . . . It is clear that qR
2
< Q, q < W
and
R 2. Moreover, for r R,
1
Q

1
3.5W

3.5
1
W
=

3.5
1
qR
2

/3.5
qrR
.
Hence
W
2
+
3
2
_
1
qrR

1
Q
_
1
W
2
+
3
2
qrR
1 /3.5
=
W
2
+
3r
2
_
1

3.5
_
R
2
W
2
=
W
2
_
1 +
3
1 /3.5
rW
R
_
<
W
2
_
1 +
rW
R
_
and so
rR
gcd(r,q)=1
_
W
2
+
3
2
_
1
qrR

1
Q
_
1
_
1
(r)
2
(r)
2
W
rR
gcd(r,q)=1
(1 +rR
1
)
1
(r)
2
(r)

2
W
(q)
q
rR
(1 +rR
1
)
1
(r)
2
(r)
.
For R 2,
rR
(1 +rR
1
)
1
(r)
2
(r)
> log R + 0.25068;
this is true for R 100 by [MV73, Lemma 8] and easily veriable numerically
for 2 R < 100. (It suces to verify this for R integer with r < R instead of
r R, as that is the worst case.)
Now
log R =
1
2
_
log
W
2q
+ log 2
_
=
1
2
log
W
2q
0.25068.
Hence
rR
(1 +rR
1
)
1
(r)
2
(r)
>
1
2
log
W
2q
and the statement follows.
Now consider the case q > W/13.5. If q is even, then, in this range, inequality
(4.41) is always better than (4.42), and so we are done. Assume, then, that
W/13.5 < q W/2 and q is odd. We set R = 2; clearly qR
2
< W Q and
q < W/2 W
, and so this choice of R is valid. It remains to check that

1
W
2
+
3
2
_
1
2q

1
Q
_
1
+
1
W
2
+
3
2
_
1
4q

1
Q
_
1

1
W
log
W
2q
.
This follows because
1
1
2
+
3
2
_
t
2

1
3.5
_
1
+
1
1
2
+
3
2
_
t
4

1
3.5
_
1
log
t
2
for all 2 t 13.5.
We need a version of Lemma 4.4 with m restricted to the odd numbers.

48 H. A. HELFGOTT
Lemma 4.5. Let W 1, W
W/2. Let 2 = a/q +O
(1/qQ), q Q. Then
(4.49)
A
0
<mA
1
m odd
<pW
(log p)e(mp)
_
A
1
A
0
min(2q, Q)
_
(W W
+ 2q)
<pW
(log p)
2
.
If q < W/2 and Q 3.5W, the following bound also holds:
(4.50)
A
0
<mA
1
m odd
<pW
(log p)e(mp)
_
A
1
A
0
2q
_
q
(q)
W
log(W/2q)

<pW
(log p)
2
.
If A
1
A
0
2q and q Q, , [0, 1], the following bound also holds:
(4.51)
A
0
<mA
1
<pW
(log p)e(mp)
2
(W W
+q/(1 ))
<pW
(log p)
2
.
Proof. We follow the proof of Lemma 4.4, noting the dierences. Let k =
min(q, Q/2) q/2, just as before. We split (A
0
, A
1
] into (A
1
A
0
)/k
blocks of at most 2k consecutive integers; any such block contains at most k odd
numbers. For odd m, m
in such a block, m and m
are separated by a distance

of
|{(mm
)}| =
_
2
mm
2
_
= |{(a/q)k}| O
(k/qQ) 1/2q.
We obtain (4.49) and (4.51) just as we obtained (4.41) and (4.43) before. To
obtain (4.50), proceed again as before, noting that the angles we are working
with can be labelled as (m
0
+ 2a), 0 a < q.
The idea now (for large ) is that, if is not negligible, then, as m increases,
m loops around the circle R/Z roughly repeats itself every q steps but with
a slight displacement. This displacement gives rise to a conguration to which
Lemma 4.3 is applicable.
Proposition 4.6. Let x W 1, W
W/2, U
x/2W. Let Q 3.5W. Let

2 = a/q +/x, gcd(a, q) = 1, |/x| 1/qQ, q Q. Let S
2
(U
, W
, W) be as in
(4.37) with v = 2.
For q Q, where [0, 1],
(4.52)
S
2
(U
, W
, W)
_
max(1, 2)
_
x
8q
+
x
2W
_
+
W
2
+ 2q
_
<pW
(log p)
2
If q < W/2,
(4.53)
S
2
(U
, W
, W)
_
x
4(q)
1
log(W/2q)
+
q
(q)
W
log(W/2q)
_
<pW
(log p)
2
.
If W > x/4q, the following bound also holds:
(4.54) S
2
(U
, W
, W)
_
W
2
+
q
1 x/4Wq
_

W
<pW
(log p)
2
.
If = 0 and x/4W +q Q,
(4.55)
S
2
(U
, W
, W) min
_
_
1,
2q/(q)
log
_
x
|q|
_
q +
x
4W
_
1
_
_
_
_
x
|q|
+
W
2
_

W
<pW
(log p)
2
.
Lastly, if = 0 and q Q, where [0, 1),
(4.56)
S
2
(U
, W
, W)
_
x
|q|
+
W
2
+
x
8(1 )Q
+
x
4(1 )W
_

W
<pW
(log p)
2
.
The trivial bound would be in the order of
S
2
(U
, W
, W) = (x/2 log x)
<pW
(log p)
2
.
In practice, (4.54) gets applied when W x/q.
Proof. Let us rst prove statements (4.53) and (4.52), which do not involve .
Assume rst q W/2. Then, by (4.50) with A
0
= U
, A
1
= x/W,
S
2
(U
, W
, W)
_
x/W U
2q
+ 1
_
q
(q)
W
log(W/2q)
<pW
(log p)
2
.
Clearly (x/W U
)W (x/2W) W = x/2. Thus (4.53) holds.

Assume now that q Q. Apply (4.49) with A
0
= U
, A
1
= x/W. Then
S
2
(U
, W
, W)
_
x/W U
q min(2,
1
)
+ 1
_
(W W
+ 2q)
<pW
(log p)
2
.
Now
_
x/W U
q min(2,
1
)
+ 1
_
(W W
+ 2q)
_
x
W
U
_
W W
q min(2,
1
)
+ max(1, 2)
_
x
W
U
_
+W/2 + 2q
x/4
q min(2,
1
)
+ max(1, 2)
x
2W
+W/2 + 2q.
This implies (4.52).
If W > x/4q, apply (4.43) with = x/4Wq, = 1. This yields (4.54).
Assume now that = 0 and x/4W +q x/|q|. Let Q
= x/|q|. For any m

1
,
m
2
with x/2W < m
1
, m
2
x/W, we have |m
1
m
2
| x/2W 2(Q
q), and
so
(4.57)
m
1
m
2
2
/x +q/x
||/x =
1
q
.
50 H. A. HELFGOTT
The conditions of Lemma 4.3 are thus fullled with = (x/4W) ||/x and
= |q|/x. We obtain that S
2
(U
, W
, W) is at most
min
_
1,
2q
(q)
1
log ((q( +))
1
)
_
_
W W
+
1
_

W
<pW
(log p)
2
.
Here W W
+
1
= W W
+x/|q| W/2 +x/|q| and

(q( +))
1
=
_
q
||
x
_
1 _
q +
x
4W
_
1
.
Lastly, assume = 0 and q Q. We let Q
= x/|q| Q again, and we split

the range U
< m x/W into intervals of length 2(Q
q), so that (4.57) still

holds within each interval. We apply Lemma 4.3 with = (Q
q) ||/x and
= |q|/x. We obtain that S
2
(U
, W
, W) is at most
_
1 +
x/W U
2(Q
q)
_
_
W W
+
1
_

W
<pW
(log p)
2
.
Here W W
+
1
W/2 +x/q|| as before. Moreover,
_
W
2
+
x
q||
__
1 +
x/W U
2(Q
q)
_
_
W
2
+Q
__
1 +
x/2W
2(1 )Q
W
2
+Q
+
x
8(1 )Q
+
x
4W(1 )
x
|q|
+
W
2
+
x
8(1 )Q
+
x
4(1 )W
.
Hence (4.56) holds.
5. Totals
Let x be given. We will choose U, V , W later; assume from the start that
2 10
6
V < x/4 and UV x. Starting in section 5.2, we will also assume that
x x
0
= 10
25
.
Let R/Z be given. We choose an approximation 2 = a/q + /x,
gcd(a, q) = 1, q Q, |/x| 1/qQ. We assume Q max(16, 2
x) and
Q max(2U, x/U). Let S
I,1
, S
I,2
, S
II
, S
0
be as in (2.21), with the smooth-
ing function =
2
as in (1.4).
The term S
0
is 0 because V < x/4 and
2
is supported on [1/4, 1]. We set
v = 2.
5.1. Contributions of dierent types.
5.1.1. Type I terms: S
I,1
. The term S
I,1
can be handled directly by Lemma
3.6, with
0
= 4 and D = U. (Condition (3.36) is valid thanks to (2.6).) Since
U Q/2, the contribution of S
I,1
gets bounded by (3.38) and (3.39): the absolute
value of S
I,1
is at most
(5.1)
x
q
min
_
1,
c
0
/
2
(2)
2
_
m
U
q
gcd(m,q)=1
(m)
m
log
x
mq
+
x
q
|
log ()|
m
U
q
gcd(m,q)=1
(m)
m
+
2
c
0
c
1
_
U log
ex
U
+
3q log
q
c
2
+
q
2
log
q
c
2
log
+
2U
q
_
+
3c
1
2
x
q
log
q
c
2
log
+
U
c
2
x
q
+
3c
1
2
_
2x
c
2
log
2x
c
2
+
_
c
0
2

2c
0
2
_
_
U
2
4qx
log
e
1/2
x
U
+
1
e
_
+
2|
|
1
q max
_
1, log
c
0
e
3
q
2
4|
|
1
x
_
log x,
where c
0
= 31.521 (by Lemma A.5), c
1
= 1.0000028 > 1 + (8 log 2)/V 1 +
(8 log 2)/(x/U), c
2
= 6/5
c
0
= 0.67147 . . . . By (2.1), (A.16) and Lemma A.6,
|
log ()| min

_
2 log 4,
24 log 2
2
_
.
By (2.7), (2.9) and (2.10), the rst line of (5.1) is at most
x
q
min
_
1,
c
2
_
_
min
_
4
5
q/(q)
log
+ U
q
2
, 1
_
log
x
U
+ 1.00303
q
(q)
_
+
x
q
min
_
2 log 4,
c
2
_
min
_
4
5
q/(q)
log
+ U
q
2
, 1
_
,
where c
0
= 0.798437 > c
0
/(2)
2
, c
0
= 1.685532. Clearly c
0
/c
0
> 1 > 2 log 4.
Taking derivatives, we see that t (t/2) log(t/c
2
) log
+
2U/t takes its max-
imum (for t [1, 2U]) when log(t/c
2
) log
+
2U/t = log t/c
2
log
+
2U/t; since
t log t/c
2
log
+
2U/t is increasing on [1, 2U], we conclude that
q
2
log
q
c
2
log
+
2U
q
U log
2U
c
2
.
Similarly, t t log(x/t) log
+
(U/t) takes its maximum at a point t [0, U for
which log(x/t) log
+
(U/t) = log(x/t) + log
+
(U/t), and so
x
q
log
q
c
2
log
+
U
c
2
x
q
U
c
2
(log x + log U).
We conclude that
(5.2)
|S
I,1
|
x
q
min
_
1,
c
2
_
_
min
_
4q/(q)
5 log
+ U
q
2
, 1
_
_
log
x
U
+c
3,I
_
+c
4,I
q
(q)
_
+
_
c
7,I
log
q
c
2
+c
8,I
log xmax
_
1, log
c
11,I
q
2
x
__
q +c
10,I
U
2
4qx
log
e
1/2
x
U
+
_
c
5,I
log
2U
c
2
+c
6,I
log xU
_
U +c
9,I
xlog
2x
c
2
+
c
10,I
e
,
52 H. A. HELFGOTT
where c
2
and c
0
are as above, c
3,I
= 2.11104 > c
0
/c
0
, c
4,I
= 1.00303, c
5,I
=
3.57422 > 2
c
0
c
1
/, c
6,I
= 2.23389 > 3c
1
/2c
2
, c
7,I
= 6.19072 > 2
3c
0
c
1
/,
c
8,I
= 3.53017 > 2(8 log 2)/, c
9,I
= 2.58877 > 3
2c
1
/2
c
2
, c
10,I
= 9.37301 >
c
0
(1/2 2/
2
) and c
11,I
= 9.0857 > c
0
e
3
/(4 8 log 2).
5.1.2. Type I terms: S
I,2
. The case q Q/V .
If q Q/V , then, for v V ,
2v =
va
q
+O
_
v
Qq
_
=
va
q
+O
_
1
q
2
_
,
and so va/q is a valid approximation to 2v. (Here we are using v to label
an integer variable bounded above by v V ; we no longer need v to label the
quantity in (2.22), since that has been set equal to the constant 2.) Moreover,
for Q
v
= Q/v, we see that 2v = (va/q) + O
(1/qQ
v
). If = a/q + /x, then
v = va/q +/(x/v). Now
(5.3) S
I,2
=
vV
v odd
(v)
mU
m odd
(m)
n
n odd
e((v) mn)(mn/(x/v)).
We can thus estimate S
I,2
by applying Lemma 3.5 to each inner double sum in
(5.3). We obtain that, if || 1/2c
2
, where c
2
= 6/5
c
0
and c
0
= 31.521, then
|S
I,2
| is at most
(5.4)
vV
(v)
_
_
_
_
_
x/v
2q
v
min
_
1,
c
0
()
2
_
mMv/q
gcd(m,2q)=1
(m)
m
+
c
10,I
q
4x/v
_
U
q
v
+ 1
_
2
_
_
_
_
_
plus
(5.5)
vV
(v)
_
2
c
0
c
1
U +
3c
1
2
x
vq
v
log
+
U
c
2
x
vqv
+
c
0
c
1
q
v
log
+
U
q
v
/2
_
+
vV
(v)
_
c
8,I
max
_
log
c
11,I
q
2
v
x/v
, 1
_
q
v
+
_
2
3c
0
c
1
+
3c
1
2c
2
+
55c
0
c
2
6
2
_
q
v
_
,
where q
v
= q/ gcd(q, v), M
v
[min(Q/2v, U), U] and c
1
= 1.0000028; if ||
1/2c
2
, then |S
I,2
| is at most (5.4) plus
(5.6)
vV
(v)
_
_
c
0
c
1
/2
U +
3c
1
2
_
_
2 +
(1 +)
log
+
2U
x/v
||qv
_
_
x/v
Q/v
+
35c
0
c
2
3
2
q
v
_
_
+
vV
(v)
c
0
c
1
/2
(1 +) min
__
x/v
||q
v
_
+ 1, 2U
_
_
_
_
3 + 2 +
log
+ 2U
x/v
||qv
+1
2
_
_
_
Write S
V
=
vV
(v)/(vq
v
). By (2.12),
(5.7)
S
V

vV
(v)
vq
+
vV
gcd(v,q)>1
(v)
v
_
gcd(q, v)
q

1
q
_
log V
q
+
1
q
p|q
(log p)
_
_
_
_
_
v
p
(q) +
1
p
+vp(q)
V
1
p
1
p
V
1
p
_
_
_
_
_
log V
q
+
1
q
p|q
(log p)v
p
(q) =
log V q
q
.
This helps us to estimate (5.4). We could also use this to estimate the second
term in the rst line of (5.5), but, for that purpose, it will actually be wiser to
use the simpler bound
(5.8)
vV
(v)
x
vq
v
log
+
U
c
2
x
vqv
vV
(v)
U/c
2
e

1.0004
ec
2
UV
(by (2.14) and the fact that t log
+
A/t takes its maximum at t = A/e).
We bound the sum over m in (5.4) by (2.7) and (2.9). To bound the terms
involving (U/q
v
+ 1)
2
, we use
vV
(v)v 0.5004V
2
(by (2.17)),
vV
(v)v gcd(v, q)
j
vV
(v)v +V
vV
gcd(v,q)=1
(v) gcd(v, q)
j ,
vV
gcd(v,q)=1
(v) gcd(v, q)
p|q
(log p)
1log
p
V
p
vp(q)
p|q
(log p)
log V
log p
p
vp(q)
(log V )
p|q
p
vp(q)
q log V
and
vV
gcd(v,q)=1
(v) gcd(v, q)
2
p|q
(log p)
1log
p
V
p
vp(q)+
p|q
(log p) 2p
vp(q)
p
log
p
V
2qV log q.
Using (2.14) and (5.7) as well, we conclude that (5.4) is at most
x
2q
min
_
1,
c
0
()
2
_
min
_
_
4
5
q/(q)
log
+ min(Q/2V,U)
2q
, 1
_
_
log V q
+
c
10,I
4x
_
0.5004V
2
q
_
U
q
+ 1
_
2
+ 2UV q log V + 2U
2
V log V
_
.
54 H. A. HELFGOTT
Assume Q 2UV/e. Using (2.14), (5.8), (2.18) and the inequality vq V q
Q (which implies q/2 U/e), we see that (5.5) is at most
1.0004
__
2
c
0
c
1
+
3c
1
2ec
2
_
UV +
c
0
c
1
Qlog
U
q/2
_
+
_
c
5,I
2
max
_
log
c
11,I
q
2
x
, 2
_
+c
6,I
2
_
Q,
where c
5,I
2
= 3.53312 > 1.0004 c
8,I
and
c
6,I
2
=
2
3c
0
c
1
+
3c
1
2c
2
+
55c
0
c
2
6
2
.
The expressions in (5.6) get estimated similarly. In particular,
vV
(v) min
__
x/v
||q
v
_
+ 1, 2U
_
1
2
log
+
2U
_
x/v
||qv
_
+ 1
vV
(v) max
t>0
t log
+
U
t

vV
(v)
U
e
=
1.0004
e
UV,
but
vV
(v) min
__
x/v
||q
v
_
+ 1, 2U
_
v
x
2U||q
(v) 2U
+
x
2U||q
<vV
gcd(v,q)=1
(v)
x/||
vq
+
vV
(v) +
vV
gcd(v,q)=1
(v)
x/||
v
_
1
q
v
1
q
_
1.03883
x
||q
+
x
||q
max
_
log V log
x
2U||q
+ log
3
2
, 0
_
+V +
x
||
1
q
p|q
(log p)v
p
(q)
x
||q
_
1.03883 + log q + log
+
6UV ||q
2x
_
+ 1.0004V
by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (5.7).
If || 1/2c
2
, then, assuming Q 2UV/e, we conclude that |S
I,2
| is at most
(5.9)
x
2(q)
min
_
1,
c
0
()
2
_
min
_
4/5
log
+ Q
4V q
2
, 1
_
log V q
+c
8,I
2
x
q
_
UV
x
_
2 _
1 +
q
U
_
2
+
c
10,I
2
_
UV
x
q log V +
U
2
V
x
log V
_
plus
(5.10) (c
4,I
2
+c
9,I
2
)UV + (c
10,I
2
log
U
q
+c
5,I
2
max
_
log
c
11,I
q
2
x
, 2
_
+c
12,I
2
) Q,
where
c
4,I
2
= 3.57422 > 2
c
0
c
1
/,
c
5,I
2
= 3.53312 > 1.0004 c
8,I
,
c
8,I
2
= 1.17257 >
c
10,I
4
0.5004,
c
9,I
2
= 0.82214 > 3c
1
1.0004/2ec
2
,
c
10,I
2
= 1.78783 > 1.0004
c
0
c
1
/,
c
12,I
2
= 28.26771 > c
6,I
2
+c
10,I
2
log 2.
If || 1/2c
2
, then |S
I,2
| is at most (5.9) plus
(5.11)
(c
4,I
2
+ (1 +)c
13,I
2
)UV +c
_
c
14,I
2
_
log q + log
+
6UV ||q
2x
_
+c
15,I
2
_
x
||q
+c
16,I
2
_
2 +
1 +
log
+
2UV ||q
x
_
x
Q/V
+c
17,I
2
Q+c
c
18,I
2
V,
where
c
13,I
2
= 1.31541 >
2
c
0
c
1

1.0004
e
,
c
14,I
2
= 3.57422 >
2
c
0
c
1
,
c
15,I
2
= 3.71301 >
2
c
0
c
1
1.03883,
c
16,I
2
= 1.50061 > 1.0004 3c
1
/2
c
17,I
2
= 25.0295 > 1.0004
35c
0
c
2
3
2
,
c
18,I
2
= 3.57565 >
2
c
0
c
1
1.0004,
and c
= (1 + )
3 + 2. We recall that c
2
= 6/5
c
0
= 0.67147 . . . . We will
choose (0, 1) later.
The case q > Q/V . We use Lemma 3.7 in this case.
5.1.3. Type II terms. As we showed in (4.1)(4.6), S
II
(given in (4.1)) is at most
(5.12) 4
_
x/U
V
_
S
1
(U, W) S
2
(U, V, W)
dW
W
+ 4
_
x/U
V
_
S
1
(U, W) S
3
(W)
dW
W
,
where S
1
, S
2
and S
3
are as in (4.5) and (4.6). We bounded S
1
in (4.33) and
(4.34), S
2
in Prop. 4.6 and S
3
in (4.6).
We rst recall our estimate for S
1
. In the whole range [V, x/U] for W, we know
from (4.33) and (4.34) that S
1
(U, W) is at most
(5.13)
2
2
x
W
+
0
(3/2)
3
x
W
_
x/WU
U
,
where
0
= 1.27.
(We recall we are working with v = 2.)
We have better estimates for the constant in front in some parts of the range;
in what is usually the main part, (4.34) and (4.36) give us a constant of 0.15107
56 H. A. HELFGOTT
instead of 2/
2
. Note that 1.27(3/2)
3
= 22.6417 . . . . We should choose U, V so
that the rst term dominates. For the while being, assume only
(5.14) U 5 10
5
x
V U
;
then (5.13) gives
(5.15) S
1
(U, W)
1
x
W
,
where
1
=
2
2
+
22.6418
_
10
6
/2
0.2347.
This will suce for our cruder estimates.
The second integral in (5.12) is now easy to bound. By (4.6),
S
3
(W) 1.0171x + 2.0341W 1.0172x,
since W x/U x/5 10
5
. Hence
4
_
x/U
V
_
S
1
(U, W) S
3
(W)
dW
W
4
_
x/U
V
_
w,1
x
W
1.0172x
dW
W

9
x
V
,
where
9
= 8
1.0172
1
3.9086.
(We are using the easy bound
a +b +c

a +
b +
c.)
Let us now examine S
2
, which was bounded in Prop. 4.6. Recall W
=
max(V, W/2), U
= max(U, x/2W). Since W
W/2 and W V 117, we can

always bound
(5.16)
<pW
(log p)
2
1
2
W(log W).
by (2.19).
Bounding S
2
for arbitrary. We set
W
0
= min(max(2q, V ), x/U),
where e is a parameter that will be set later.
For V W < W
0
, we use the bound (4.52):
S
2
(U
, W
, W)
_
max(1, 2)
_
x
8q
+
x
2W
_
+
W
2
+ 2q
_
1
2
W(log W)
max
_
1
2
,
__
W
8q
+
1
2
_
xlog W +
W
2
log W
4
+qW log W,
where = q/Q.
If W
0
> V , the contribution of the terms with V W < W
0
to (5.12) is (by
5.15) bounded by
(5.17)
4
_
W
0
V
1
x
W
_
0
4
_
W
4q
+ 1
_
xlog W +
W
2
log W
4
+qW log W
_
dW
W

2
2
0
x
_
W
0
V
log W
W
3/2
dW +

2
2
x
_
W
0
V
log W
W
1/2
dW
+
2
0
x
2
16q
+qx
_
W
0
V
log W
W
dW
0
x
V
+
2
_
xW
0
_
_
log W
0
+
2
2
3
0
x
2
16q
+qx
_
(log W
0
)
3/2
(log V )
3/2
_
,
where
0
= max(1, 2) and
2
= 4
1
1.93768.
We now examine the terms with W W
0
. (If q > x/U, then W
0
= U/x, the
contribution of the case is nil, and the computations below can be ignored.)
We use (4.53):
S
2
(U
, W
, W)
_
x
4(q)
1
log(W/2q)
+
q
(q)
W
log(W/2q)
_
1
2
W log W.
By
a +b

a +
b, we can take out the q/(q) W/ log(W/2q) term and

estimate its contribution on its own; it is at most
(5.18)
4
_
x/U
W
0
1
x
W

q
(q)

1
2
W
2
log W
log W/2q
dW
W
=

2
2
_
q
(q)
_
x/U
W
0
xlog W
W log W/2q
dW
2
_
qx
(q)
_
x/U
W
0
1
W
_
1 +
log 2q
log W/2q
_
dW
Now
_
x/U
W
0
1
log 2q
log W/2q
dW
_
2q log 2q
_
x/2Uq
max(,V/2q)
1
t log t
dt.
We bound this last integral somewhat crudely: for T e,
(5.19)
_
T
e
1
t log t
dt 2.3
T
log T
58 H. A. HELFGOTT
(by numerical work for e T T
0
and by comparison of derivatives for T > T
0
,
where T
0
= e
(12/2.3)
1
= 2135.94 . . . ). Since e, this gives us that
_
x/U
W
0
1
W
_
1 +
log 2q
log W/2q
_
dW
2
_
x
U
+ 2.3
_
2q log 2q
x/2Uq
log x/2Uq
,
and so (5.18) is at most
2
2
_
q
(q)
_
1 + 1.15
log 2q
log x/2Uq
_
x
U
.
We are left with what will usually be the main term, viz.,
(5.20) 4
_
x/U
W
0
S
1
(U, W)
_
x
8(q)
log W
log W/2q
_
W
dW
W
,
which, by (4.34), is at most x/
_
(q) times the integral of
1
W
_
_
2H
2
_
x
WU
_
+

4
2
_
x/WU
U
_
log W
log W/2q
for W going from W
0
to x/U, where H
2
is as in (4.35) and
4
= 4
0
(3/2)
3
90.5671.
By the arithmetic/geometric mean inequality, the integrand is at most 1/W times
(5.21)
+
1
2H
2
(x/WU)
2
+

1
2
4
2
_
x/WU
U
+

2
log 2q
log W/2q
for any > 0. We will choose later.
The rst summand in (5.21) gives what we can think of as the main or worst
term in the whole paper; let us compute it rst. The integral is
(5.22)
_
x/U
W
0
+
1
2H
2
(x/WU)
2
dW
W
=
_
x/UW
0
1
+
1
2H
2
(s)
2
ds
s
2
+

6
4
_
log
x
UW
0
by (4.36), where
6
= 0.60428.
Thus the main term is simply
(5.23)
_
2
+

6
4
_
x
_
(q)
log
x
UW
0
.
The integral of the second summand is at most

4
4
x
U
_
x/U
V
dW
W
3/2

1

4
2
_
x/UV
U
.
By (5.14), this is at most
2
10
3

4

1
7
/2,
where
7
=
2
4
1000
0.1281.
Thus the contribution of the second summand is at most
7
2

x
_
(q)
.
The integral of the third summand in (5.21) is
(5.24)

2
_
x/U
W
0
log 2q
log W/2q
dW
W
.
If V < 2q x/U, this is
2
_
x/U
2q
log 2q
log W/2q
dW
W
=

2
log 2q
_
x/2Uq
1
log t
dt
t
=

2
log 2q
_
log log
x
2Uq
log log
_
.
If 2q > x/U, the integral is over an empty range and its contribution is hence 0.
If 2q V , (5.24) is
2
_
x/U
V
log 2q
log W/2q
dW
W
=
log 2q
2
_
x/2Uq
V/2q
1
log t
dt
t
=
log 2q
2
(log log
x
2Uq
log log V/2q)
=
log 2q
2
log
_
1 +
log x/UV
log V/2q
_
.
(Of course, log(1 + (log x/UV )/(log V/2q)) (log x/UV )/(log V/2q); this is
smaller than (log x/UV )/ log 2q when V/2q > 2q.)
The total bound for (5.20) is thus
(5.25)
x
_
(q)
_

_
1
2
log
x
UW
0
+

2
_
+
1
_
1
4
6
log
x
UW
0
+

7
2
__
,
where
(5.26) =
_
_
_
log 2q
_
log log
x
2Uq
log log
_
if V/2 < q < x/(2U).
log 2q log
_
1 +
log x/UV
log V/2q
_
if q V/2.
Choosing optimally, we obtain that (5.20) is at most
(5.27)
x
_
2(q)
_
log
x
UW
0
+
__
6
log
x
UW
0
+ 2
7
_
,
where is as in (5.26).
Bounding S
2
for || 8. Let us see how much a non-zero can help us. It
makes sense to apply (4.55) only when || 4; otherwise (4.53) is almost certainly
better. Now, by denition, ||/x 1/qQ, and so || 8 can happen only when
q x/8Q.
60 H. A. HELFGOTT
With this in mind, let us apply (4.55). Note rst that
x
|q|
_
q +
x
4W
_
1
1/|q|
q
x
+
1
4W
4/|q|
1
2Q
+
1
W
4W
||q

1
1 +
W
2Q
4W
||q

1
1 +
x/U
2Q
.
This is at least 2 min(2Q, W)/|q|. Thus we may apply (4.55)(4.56) when |q|
2 min(2Q, W). Since Q x/U, we know that min(2Q, W) = W for all W x/U,
and so it is enough to assume that |q| 2W.
Recalling also (5.16), we see that (4.55) gives us
(5.28)
S
2
(U
, W
, W) min
_
_
_
_
1,
2q/(q)
log
_
4W
||q

1
1+
x/U
2Q
_
_
_
_
_
_
x
|q|
+
W
2
_
1
2
W(log W).
Similarly to before, we dene W
0
= max(V, |q|), where 1 will be set later.
For W W
0
, we certainly have |q| 2W. Hence the part of (5.12) coming from
the range W
0
W < x/U is
(5.29)
4
_
x/U
W
0
_
S
1
(U, W) S
2
(U, V, W)
dW
W
4
_
q
(q)
_
x/U
W
0
_
S
1
(U, W)
log W
log
_
4W
||q

1
1+
x/U
2Q
_
_
Wx
|q|
+
W
2
2
_
dW
W
.
By (4.34), the contribution of the term Wx/|q| to (5.29) is at most
4x
_
||(q)
_
x/U
W
0
_
_
H
2
_
x
WU
_
+

4
4
_
x/WU
U
_
log W
log
_
4W
||q

1
1+
x/U
2Q
_
dW
W
Note that 1 + (x/U)/2Q 3/2. Proceeding as in (5.20)(5.27), we obtain that
this is at most
2x
_
||(q)
_
log
x
UW
0
+
__
6
log
x
UW
0
+ 2
7
_
,
where
(5.30) =
_
_
_
log
(1+
1
)|q|
4
log
_
1 +
log x/UV
log 4V/||(1+
1
)q
_
if |q| V/,
log
3|q|
8
_
log log
8x
3U|q|
log log
8
3
_
if V/ < |q| x/U,
where
1
= x/2UQ. This is what we think of as the main term.
By (5.15), the contribution of the term W
2
/2 to (5.29) is at most
(5.31) 4
_
q
(q)
_
x/U
V
_
1
2
x
dW
W
max
V W
x
U
_
log W
max
_
log
2W
|q|
, 2
_.
Since t (log t)/(log t/c) is decreasing for t > c, (5.31) is at most
4
2
1
_
q
(q)
x
_
log W
0
max
_
log
8W
0
3|q|
,
8
3
_.
If W
0
> V , we also have to consider the range V W < W
0
. The part of
(5.12) coming from this is
4
_
|q|
V
S
1
(U, W) (log W)
_
Wx
2|q|
+
W
2
4
+
Wx
16(1 )Q
+
x
8(1 )
_
dW
W
.
We have already counted the contribution of W
2
/4 in the above. The terms
Wx/2||q and Wx/(16(1 )Q) contribute at most
4
1
_
|q|
V
x
W
(log W)W
_
x
2|q|
+
x
16(1 )Q
_
dW
W
= 4
1
x
_
1
_
2||q
+
1
4
_
(1 )Q
_
_
|q|
V
_
log W
dW
W
2
2
3
x
_
1
_
2||q
+
1
4
_
(1 )Q
_
_
(log ||q)
3/2
(log V )
3/2
_
.
The term x/8(1 ) contributes
1
x
_
|q|
V
log W
W(1 )
dW
W

1
x
1
_

V
log W
W
3/2
dW

2
x
2
_
(1 )V
(
_
log V +
_
1/ log V ),
where we use the estimate
_

V
log W
W
3/2
dW =
1
V
_

1
log u + log V
u
3/2
du
V
_

1
log V
u
3/2
du +
1
V
_

1
1
2
log V
log u
u
3/2
du
= 2
log V
V
+
1
2
V log V
4
2
V
_
_
log V +
_
1/ log V
_
.
* * *
It is time to collect all type II terms. Let us start with the case of general .
We will set e later. If q V/2, then |S
II
| is at most
(5.32)
x
_
2(q)
_
log
x
UV
+ log 2q log
_
1 +
log x/UV
log V/2q
__
_
6
log
x
UV
+ 2
7
_
+
2
2
_
q
(q)
_
1 + 1.15
log 2q
log x/2Uq
_
x
U
+
9
x
V
.
62 H. A. HELFGOTT
If V/2 < q x/2U, then |S
II
| is at most
(5.33)
x
_
2(q)
_
log
x
U 2q
+ log 2q log
log x/2Uq
log
__
6
log
x
U 2q
+ 2
7
_
+
2
2
_
q
(q)
_
1 + 1.15
log 2q
log x/2Uq
_
x
U
+ (
2
_
log 2q +
9
)
x
V
+

2
6
_
(log 2q)
3/2
(log V )
3/2
_
x
q
+
2
_
_
2 log 2q +
2
3
((log 2q)
3/2
(log V )
3/2
)
_
qx,
where we use the fact that Q x/U (implying that
0
= max(1, 2q/Q) equals 1
for q x/2U). Finally, if q > x/2U,
(5.34)
|S
II
| (
2
_
2 log x/U +
9
)
x
V
+
2
_
log x/U
x
U
+
2
2
3
((log x/U)
3/2
(log V )
3/2
)
_
x
2
2q
+
qx
_
.
Now let us examine the alternative bounds for || 8. If |q| V/, then
|S
II
| is at most
(5.35)
2x
_
||(q)
_
log
x
UV
+ log
|q|(1 +
1
)
4
log
_
1 +
log x/UV
log
4V
||(1+
1
)q
_
6
log
x
UV
+ 2
7
+
2
2q
(q)

log V
log 2V/|q|

x
U
+
9
x
V
,
where
1
= x/2UQ. If |q| > V/, then |S
II
| is at most
(5.36)
2x
_
||(q)
_
_
log
x
U ||q
+ log
3|q|
8
log
log
8x
3U|q|
log 8/3
_
_
6
log
x
U |q|
+ 2
7
_
+
2
2
3
_
x
_
2|q|
+
x
4
Qq
_
_
(log |q|)
3/2
(log V )
3/2
_
+
_

2
_
2(1 )
_
_
log V +
_
1/ log V
_
+
9
_
x
V
+
2
_
q
(q)

_
log |q|
x
U
,
where = q/Q. (Note that || x/Qq implies x/4Q
2
, and so will be very
small and Qq will be very close to Q.)
5.2. Adjusting parameters. Calculations. We must bound the exponential
sum
n
(n)e(n)(n/x). By (2.20), it is enough to sum the bounds we obtained
in 5.1. We will now see how it will be best to set U, V and other parameters.
Usually, the largest terms will be
(5.37) C
0
UV,
where
(5.38)
C
0
=
_
c
4,I
2
+c
9,I
2
= 4.39636 if || 1/2c
2
0.74463,
c
4,I
2
+ (1 +)c
13,I
2
= 4.88963 + 1.31541 if || > 1/2c
2
(from (5.10) and (5.11), type I; (0, 1) will be set later) and
(5.39)
x
_
0
(q)
_
log
x
UV
+ (log
0
(1 +
1
)q) log
_
1 +
log
x
UV
log
V
0
(1+
1
)q
_
_
6
log
x
UV
+ 2
7
(from (5.32) and (5.35), type II; here
0
= max(2, ||/4), while
1
= x/2UQ for
|| > 8 and
1
= 0 for || < 8.
We set UV = x/
q
0
; we must choose > 0.
Let us rst optimise in the case || 4, so that
0
= 2 and
1
= 0. For
the purpose of choosing , we replace
_
(q) by

q/C
1
, where C
1
= 2.3536
510510/(510510), and also replace V by q
2
/c, c a constant. We use the approx-
imation
log
_
1 +
log
x
UV
log
V
|2q|
_
= log
_
1 +
log(
2q/)
log(q/2c)
_
= log
_
3
2
+
log 2
c/
log q/2c
_
log
3
2
+
2 log 2
c/
3 log q/2c
.
What we must minimize, then, is
(5.40)
C
0
2q
+
C
1
2q
_
_
log
2q
+ log 2q
_
log
3
2
+
2 log
2
3 log
q
2c
__
_
6
log
2q
+ 2
7
_
C
0
2q
+
C
1
2
6
_
1
log q
_
5
3
+
2
3
log 4c
log
q
2c
_
log +
1
log q 2
1
log +
4
6
+
1
log 2
C
0
2q
_
+
4
_
1
log q
__
5
6
+
1
_
+
1
3
log 4c
log
q
2c
_
log +
3
__
,
where
1
=
1
2
+ log
3
2
,
2
= log
2 + log 2 log
3
2
+
log 4c log 2q
3 log q/2c
,
3
=
1
2
_
2
+
4
6
+
1
log 2
_
=
log 4c
6
+
(log 4c)
2
6 log
q
2c
+
5
,
4
=
C
1
C
0
_

6
2
_
0.30925 if || 4
0.27805
1+0.26902
if || > 4,
5
=
1
2
(log
2 + log 2 log
3
2
+
4
6
+
1
log 2) 1.01152.
64 H. A. HELFGOTT
Taking derivatives, we see that the minimum is attained when
(5.41) =
_
5
6
+
1
+
1
3
log 4c
log
q
2c
_
4

_
1.7388 +
log 4c
3 log
q
2c
_
0.30925.
provided that || 4. (What we obtain for || > 4 is essentially the same, only
with log
0
q = log ||q/4 instead of log q, and 0.27805/(1 + 0.26902) in place of
0.30925.) For q = 5 10
5
, c = 2.5 and || 4 (typical values in the most delicate
range), we get that should be 0.55834 . . . , and the last line of (5.40) is then
0.02204 . . . ; for q = 10
6
, c = 10, || 4, we get that should be 0.57286 . . . , and
the last line of (5.40) is then 0.01656 . . . . If || > 4, ||q = 5 10
5
, c = 2.5 and
= 0.2 (say), then = 0.47637 . . . , and the last line of (5.40) is 0.02243 . . . ; if
|| > 4, ||q = 10
6
, c = 10 and = 0.2, then = 0.48877 . . . , and the last line of
(5.40) is 0.01684 . . . .
(A back-of-the-envelope calculation suggests that choosing w = 1 instead of
w = 2 would have given bounds worse by about 15 percent.)
We make the choices
= 1/2, and so UV =
1
2
q
0
for the sake of simplicity. (Unsurprisingly, (5.40) changes very slowly around its
minimum.)
Now we must decide how to choose U, V and Q, given our choice of UV . We
will actually make two sets of choices. First, we will use the S
I,2
estimates for
q Q/V to treat all of the form = a/q + O
(1/qQ), q y. (Here y is
a parameter satisfying y Q/V .) The remaining then get treated with the
(coarser) S
I,2
estimate for q > Q/V , with Q reset to a lower value (call it Q
).
If was not treated in the rst go (so that it must be dealt with the coarser
estimate) then = a
/q
/x, where either q
> y or
> x/Q. (Otherwise,

= a
/q
+O
(1/q
Q) would be a valid estimate with q
y.)
The value of Q
is set to be smaller than Q both because this is helpful (it

diminishes error terms that would be large for large q) and because this is now
harmless (since we are no longer assuming that q Q/V ).
5.2.1. First choice of parameters: q y. The largest items aected strongly by
our choices at this point are
(5.42)
c
16,I
2
_
2 +
1 +
log
+
2UV ||q
x
_
x
Q/V
+c
17,I
2
Q (from S
I,2
, || > 1/2c
2
),
_
c
10,I
2
log
U
q
+ 2c
5,I
2
+c
12,I
2
_
Q (from S
I,2
, || 1/2c
2
),
and
(5.43)
2
2q
(q)
_
1 + 1.15
log 2q
log x/2Uq
_
x
U
+
9
x
V
(from S
II
).
In addition, we have a relatively mild but important dependence on V in the
main term (5.39). We must also respect the condition q Q/V , the lower bound
on U given by (5.14) and the assumptions made at the beginning of section 5
(e.g. Q x/U, V 2 10
6
). Recall that UV = x/
q.
We set
Q =
x
8y
,
since we will then have not just q y but also q|| x/Q = 8y, and so q
0
4y.
We want q Q/V to be true whenever q y; this means that
q
Q
V
=
QU
UV
=
QU
x/2
q
0
=
U
q
0
4y
must be true when q y, and so it is enough to set U = 4y
2
/
q
0
. The following
choices make sense: we will work with the parameters
(5.44)
y =
x
1/3
6
, Q =
x
8y
=
3
4
x
2/3
, x/UV = 2
_
q
0
2
_
2y,
U =
4y
2
q
0
=
x
2/3
9
q
0
, V =
x
(x/UV ) U
=
x
8y
2
=
9x
1/3
2
,
where, as before,
0
= max(2, ||/4). Thus
1
x/2UQ 2
6/x
1/6
. Assuming
(5.45) x 2.16 10
20
,
we obtain that U/(x/UV ) (x
3/2
/9
q
0
)/(2
q
0
) = x
2/3
/18q
0
x
1/3
/6
5 10
5
, and so (5.14) holds. We also get that
1
0.002.
Since V = x/8y
2
= (9/2)x
1/3
, (5.45) also implies that V 2 10
6
(in fact,
V 27 10
6
). It is easy to check that
(5.46) V < x/4, UV x, Q
ex, Q max(U, x/U),

as stated at the beginning of section 5. Let = (3/2)
3
= 27/8. Then
(5.47)
V
2q
=
x/8y
2
2q

x
16y
3
=
x
54y
3
= 4 > 1,
V
|q|
=
x/8y
2
8y

x
64y
3
=
x
216y
3
= 1.
The rst type I bound is
(5.48)
|S
I,1
|
x
q
min
_
1,
c
2
_
_
_
_
min
_
_
_
4
5
q
(q)
log
+ x
2/3
9q
5
2
1
2
0
, 1
_
_
_
_
log 9x
1
3
_
q
0
+c
3,I
_
+
c
4,I
q
(q)
_
_
_
+
_
c
7,I
log
y
c
2
+c
8,I
log x
_
y +
c
10,I
x
1/3
3
4
2
2
q
3/2
1
2
0
(log 9x
1/3
_
eq
0
)
+
_
c
5,I
log
2x
2/3
9c
2
q
0
+c
6,I
log
x
5/3
9
q
0
_
x
2/3
9
q
0
+c
9,I
x log
2x
c
2
+
c
10,I
e
,
where the constants are as in 5.1.1. The function x (log cx)/(log x/R), c, R
1, attains its maximum on [R
, ], R
> R, at x = R
. Hence, for q
0
xed,
(5.49) min
_
_
_
4/5
log
+ 4x
2/3
9(
0
q)
5
2
, 1
_
_
_
_
log 9x
1
3
_
q
0
+c
3,I
_
66 H. A. HELFGOTT
attains its maximum at x = (27/8)e
6/5
(q
0
)
15/4
, and so
min
_
_
_
4/5
log
+ 4x
2/3
9(
0
q)
5
2
, 1
_
_
_
_
log 9x
1
3
_
q
0
+c
3,I
_
+c
4,I
log
27
2
e
2/5
(
0
q)
7/4
+c
3,I
+c
4,I

7
4
log
0
q + 6.11676.
Examining the other terms in (5.48) and using (5.45), we conclude that
(5.50)
|S
I,1
|
x
q
min
_
1,
c
2
_
min
_
q
(q)
_
7
4
log
0
q + 6.11676
_
,
1
2
log x + 5.65787
_
+
x
2/3
q
0
(0.67845 log x 1.20818) + 0.0507x
2/3
,
where we are using (5.45) to simplify the smaller error terms. (The bound
(1/2) log x + 5.65787 comes from a trivial bound on (5.49).) We recall that
c
0
= 0.798437 > c
0
/(2)
2
.
Let us now consider S
I,2
. The terms that appear both for || small and ||
large are given in (5.9). The second line in (5.9) equals
c
8,I
2
_
x
4q
2
0
+
2UV
2
x
+
qV
2
x
_
+
c
10,I
2
2
_
q
2
q
0
+
x
2/3
18q
0
_
log
9x
1/3
2
c
8,I
2
_
x
4q
2
0
+
9x
1/3
2
2
+
27
8
_
+
c
10,I
2
2
_
y
1/6
2
3/2
+
x
2/3
18q
0
_
_
1
3
log x + log
9
2
_
0.29315
x
q
2
0
+ (0.00828 log x + 0.03735)
x
2/3
q
0
+ 0.00153
x,
where we are using (5.45) to simplify. Now
(5.51) min
_
4/5
log
+ Q
4V q
2
, 1
_
log V q = min
_
4/5
log
+ y
4q
2
, 1
_
log
9x
1/3
q
2
can be bounded trivially by log(9x
1/3
q/2) (2/3) log x + log 3/4. We can also
bound (5.51) as we bounded (5.49) before, namely, by xing q and nding the
maximum for x variable. In this way, we obtain that (5.51) is maximal for
y = 4e
4/5
q
2
; since, by denition, x
1/3
/6 = y, (5.51) then equals
log
9(6 4e
4/5
q
2
)q
2
= 3 log q + log 108 +
4
5
3 log q + 5.48214.
If || 1/2c
2
, we must consider (5.10). This is at most
(c
4,I
2
+c
9,I
2
)
x
2
q
0
+ (c
10,I
2
log
x
2/3
9q
3/2
0
+ 2c
5,I
2
+c
12,I
2
)
3
4
x
2/3
2.19818x
q
0
+ (0.89392 log x + 23.0896)x
2/3
.
If || > 1/2c
2
, we must consider (5.11) instead. For = 0.07, that is at most
(c
4,I
2
+ (1 +)c
13,I
2
)
x
2
q
0
+ (3.30386 log q
3
+ 16.4137)
x
||q
+ (68.8137 log ||q + 36.7795)x
2/3
+ 29.7467x
1/3
= 2.49086
x
q
0
+ (3.30386 log q
3
+ 16.4137)
x
||q
+ (22.9379 log x + 56.576)x
2
3
.
Hence
(5.52)
|S
I,2
| 2.49086
x
q
0
+x min
_
1,
4c
2
_
min
_
3
2
log q + 2.74107
(q)
,
1
3
log x +
1
2
log
3
4
q
_
+ 0.29315
x
q
2
0
+ (22.9462 log x + 56.6134)x
2/3
plus a term (3.30386 log q
2
+16.4137) (x/||q) that appears if and only if ||
1/2c
2
.
For type II, we have to consider two cases: (a) || < 8, and (b) || 8.
Consider rst || < 8. Then
0
= 2. Recall that = 27/8. We have q V/2
and |q| V/ thanks to (5.47). We apply (5.32), and obtain that, for || < 8,
(5.53)
|S
II
|
x
_
2(q)
_
1
2
log 4q
0
+ log 2q log
_
1 +
1
2
log 4q
0
log
V
2q
_
_
0.30214 log 4q
0
+ 0.2562
+ 8.22088
_
q
(q)
_
_
_1 + 1.15
_
log 2q
log
9x
1/3
0
2
q
_
_
_(q
0
)
1/4
x
2/3
+ 1.84251x
5/6
x
_
2(q)
_
C
x,2q
log 2q +
log q
2

_
0.30214 log 2q + 0.67506
+ 16.404
_
q
(q)
x
3/4
+ 1.84251x
5/6
where we dene
C
x,t
:= log
_
1 +
log 4t
2 log
9x
1/3
2.004t
_
for 0 < t < 9x
1/3
/2. (We have 2.004 here instead of 2 because we want a constant
2(1 +
1
) in later occurences of C
x,t
, for reasons that will soon become clear.)
For purposes of later comparison, we remark that 16.404 1.5785x
3/44/5
for
x 2.16 10
20
.
68 H. A. HELFGOTT
Consider now case (b), namely, || 8. Then
0
= ||/4. By (5.47), |q| V/.
Hence, (5.35) gives us that
(5.54)
|S
II
|
2x
_
||(q)
_
1
2
log |q| + log
|q|(1 +
1
)
4
log
_
_
1 +
log ||q
2 log
18x
1/3
||(1+
1
)q
_
_
_
0.30214 log ||q + 0.2562
+ 8.22088
_
q
(q)
_
log
9x
1/3
2
log
12x
1/3
|q|
(q
0
)
1/4
x
2/3
+ 1.84251x
5/6
x
_
0
(q)
_
C
x,
0
q
log
0
(1 +
1
)q +
log 4
0
q
2
_
0.30214 log
0
q + 0.67506
+ 1.68038
_
q
(q)
x
4/5
+ 1.84251x
5/6
,
since
8.22088
_
log
9x
1/3
2
log
12x
1/3
|q|
(q
0
)
1/4
8.22088
log
9x
1/3
2
log 9
(x
1/3
/3)
1/4
1.68038x
4/52/3
for x 2.16 10
20
. Clearly
log
0
(1 +
1
)q log
0
q + log(1 +
1
) log
0
q +
1
.
Now note the fact ([RS62, Thm. 15]) that q/(q) < (q), where
(5.55) (q) = e
log log q +
2.50637
log log q
.
Moreover, q/(q) 3 for q < 30. Since (30) > 3 and (t) is increasing for
t 30, we conclude that, for any q and for any r max(q, 30), q/(q) < (r).
In particular, q/(q) (y) = (x
1/3
/6) (since, by (5.45), x 180
3
). It is easy
to check that x
_
(x
1/3
/6)x
4/55/6
is decreasing for x 180
3
. Using (5.45),
we conclude that 1.67718
_
q/(q)x
4/5
0.83574x
5/6
. This allows us to simplify
the last lines of (5.53) and (5.54).
It is time to sum up S
I,1
, S
I,2
and S
II
. The main terms come from the rst
lines of (5.53) and (5.54) and the rst term of (5.52). Lesser-order terms can be
dealt with roughly: we bound min(1, c
0
/
2
) and min(1, 4c
0
/
2
) from above by
2/
0
(somewhat brutally) and 1/q
2
0
by 1/q
0
(again, coarsely). For || 1/2c
2
,
1
||

4c
2
0
,
log ||
||

2
e log 2

log
0
0
;
we use this to bound the term in the comment after (5.52). The terms inversely
proportional to q, (q) or q
2
thus add up to at most
2x
0
min
_
7
4
log
0
q + 6.11676
(q)
,
1
2
log x + 5.65787
q
_
+
2x
0
min
_
3
2
log q + 2.74107
(q)
,
1
3
log x +
1
2
log
3
4
q
_
+ 0.29315
x
q
0
+
4c
2
x
q
0
(3.30386 log q
2
+ 16.4137) +
2x
(e log 2)q
0
3.30386 log
0
2x
0
min
_
log
7/4
0
q
13/4
+ 8.858
(q)
,
5
6
log x + 5.515
q
_
+
2x
0
q
(8.874 log q + 1.7535 log
0
+ 22.19)
2x
0
_
_
min
_
log
7/4
0
q
13/4
+ 8.858
(q)
,
5
6
log x + 5.515
q
_
+
log q
80
9

16
9
0
+ 22.19
q
_
_
.
As for the other terms we use (5.45) to bound x
2/3
and x
2/3
log x by a small
constant times x
5/6
. We bound x
2/3
/
q
0
by x
2/3
/
2 (in (5.50)).
The sums S
0,
and S
0,w
in (2.23) are 0 (by (5.45)). We conclude that, for
q y = x
1/3
/6, x 2.16 10
20
and =
2
as in (1.4),
(5.56)
|S
(x, )| |S
I,1
| +|S
I,2
| +|S
II
|
x
_
(q)
0
_
C
x,
0
q
(log
0
q + 0.002) +
log 4
0
q
2
_
0.30214 log
0
q + 0.67506
+
2.49086x
q
0
+
2x
0
min
_
_
log
7
4
0
q
13
4
+
80
9
(q)
,
5
6
log x +
50
9
q
_
_
+
2x
0
log q
80
9

16
9
0
+
111
5
q
+ 3.14624x
5/6
,
where
(5.57)
0
= max(2, ||/4), C
x,t
= log
_
1 +
log 4t
2 log
9x
1/3
2.004t
_
.
Since C
x,t
is an increasing function as a function of t (for x xed and t
9x
1/3
/2.004) and
0
q 2y, we see that C
x,t
C
x,2y
. It is clear that x C
x,t
(xed t) is decreasing function of x. For x = 2.16 10
20
, C
x,2y
= 1.39942 . . . .
Also, compare the value C
3.110
28
,210
6 = 0.64020 . . . given by (5.57) to the value
of 1.196 . . . 0.5 = 0.696 . . . for C
3.110
28
,210
6 in a previous version [Helb] of the
present paper. (The largest gains are elsewhere.)
5.2.2. Second choice of parameters. If, with the original choice of parameters, we
obtained q > y = x
1/3
/6, we now reset our parameters (Q, U and V ). Recall
that, while the value of q may now change (due to the change in Q), we will be
able to assume that either q > y or |q| > x/(x/8y) = 8y.
70 H. A. HELFGOTT
We want U/(x/UV ) 5 10
5
(this is (5.14)). We also want UV small. With
this in mind, we let
V =
x
1/3
3
, U = 500
6x
1/3
, Q =
x
U
=
x
2/3
500
6
.
Then (5.14) holds (as an equality). Since we are assuming (5.45), we have V
2 10
6
. It is easy to check that (5.45) also implies that U <

x and Q >

ex,
and so the inequalities in (5.46) hold.
Write 2 = a/q + /x for the new approximation; we must have either q > y
or || > 8y/q, since otherwise a/q would already be a valid approximation under
the rst choice of parameters. Thus, either (a) q > y, or both (b1) || > 8 and
(b2) ||q > 8y. Since now V = 2y, we have q > V/2 in case (a) and |q| > V/
in case (b) for any 1. We set = e
2
.
By (5.2),
|S
I,1
|
x
q
min
_
1,
c
2
__
log x
2/3
log 500
6 +c
3,I
+c
4,I
q
(q)
_
+
_
c
7,I
log
Q
c
2
+c
8,I
log xlog c
11,I
Q
2
x
_
Q+c
10,I
U
2
4x
log
e
1/2
x
2/3
500
6
+
c
10,I
e
+
_
c
5,I
log
1000
6x
1/3
c
2
+c
6,I
log 500
6x
4/3
_
500
6x
1/3
+c
9,I
xlog
2x
c
2
x
q
min
_
1,
c
2
__
2
3
log x 4.99944 + 1.00303
q
(q)
_
+
1.063
10000
x
2/3
(log x)
2
,
where we are bound log c
11,I
Q
2
/x by log x
1/3
. Just as before, we use the assump-
tion (5.45) when we have to bound a lower-order term (such as x
1/2
log x) by a
multiple of a higher-order term (such as x
2/3
(log x)
2
).
We have q/(q) (Q) (where is as in (5.55)) and we can check that
1.00303(Q) 0.0327 log x + 4.99944
for all x 10
6
. We have either q > y or q|| > 8y; if q|| > 8y but q y, then
|| 8, and so c
0
/
2
q < 1/8||q < 1/64y < 1/y. Hence
|S
I,1
| 4.1962x
2/3
log x + 0.090843x
2/3
+ 0.001063x
2/3
(log x)
2
4.1982x
2/3
log x + 0.001063x
2/3
(log x)
2
.
We bound |S
I,2
| using Lemma 3.7. First we bound (3.47): this is at most
x
2q
min
_
1,
4c
2
_
log
x
1/3
q
3
+c
0
_
1
4

1
2
_
_
(UV )
2
log
x
1/3
3
2x
+
3c
4
2
500
6
9
+
(500
6x
1/3
+ 1)
2
x
1/3
3x
_
,
where c
4
= 1.03884. We bound the second line of this using (5.45). As for the
rst line, we have either q y (and so the rst line is at most (x/2y)(log x
1/3
y/3))
or q < y and 4c
0
/
2
q < 1/16y < 1/y (and so the same bound applies). Hence
(3.47) is at most
3
2
x
2/3
_
2
3
log x log 9
_
+ 0.02017x
2/3
log x.
Now we bound (3.48), which comes up when || 1/2c
2
, where c
2
= 6/5
c
0
,
c
0
= 31.521 (and so c
2
= 0.6714769 . . . ). Since 1/2c
2
< 8, it follows that q > y
(the alternative q y, |q| > 2y is impossible). Then (3.48) is at most
(5.58)
2
c
0
c
1
_
UV log
UV
e
+Q
_
3 log
c
2
x
Q
+
log UV
2
log
UV
Q/2
__
+
3c
1
2
x
y
log UV log
UV
c
2
x/y
+
16 log 2
Qlog
c
0
e
3
Q
2
4 8 log 2 x
log
Q
2
+
3c
1
2
2c
2
xlog
c
2
x
2
+
25c
0
4
2
(3c
2
)
1/2
xlog x,
where c
1
= 1.0000028 > 1+(8 log 2)/V . Here log(c
0
e
3
Q
2
/(4 8 log 2 x)) log Q/2
is at most log x
1/3
log x
2/3
. Using this and (5.45), we get that (5.58) is at most
1177.617x
2/3
log x + 0.0006406x
2/3
(log x)
2
+ 29.5949x
1/2
log x
1177.64x
2/3
log x + 0.0006406x
2/3
(log x)
2
.
If || > 1/2c
2
, then we know that |q| > max(y/2c
2
, 2y) = y/2c
2
. Thus (3.49)
(with = 0.01) is at most
2
c
0
c
1
UV log
UV
e
+
2.02
c
0
c
1
_
x
y/2c
2
+ 1
_
_
(
3.02 1) log
x
y/2c
2
+ 1
2
+
1
2
log UV log
e
2
UV
x
y/2c
2
_
+
_
3c
1
2
_
1
2
+
3.03
0.16
log x
_
+
20c
0
3
2
(2c
2
)
3/2
_
xlog x.
Again by (5.45), this simplies to
1212.591x
2/3
log x + 29.131x
1/2
log x 1213.15x
2/3
(log x)
2
.
Hence, in total and for any ||,
|S
I,2
| 1213.15x
2/3
(log x) + 0.0006406x
2/3
(log x)
2
.
Now we must estimate S
II
. As we said before, either (a) q > y/4, or both (b1)
|| > 8 and (b2) ||q > 8y. Recall that = e
2
. In case (a), we use (5.33), and
obtain that, if y/4 < q x/2e
2
U, |S
II
| is at most
(5.59)
x
_
(q)
2q
_
log
x
U 2e
2
q
+ log 2q log
log x/(2Uq)
log e
2
__
6
log
x
U 2e
2
q
+ 2
7
_
+
2
2
_
_
x
2e
2
U
_
_
1 + 1.15
_
log x/e
2
U
2
_
x
U
+ (
2
_
log x/U +
9
)
x
V
+

2
6
_
(log(e
2
y/2))
3/2
(log y)
3/2
_
x
y
+
2
_
_
2e
2
log x/U +
2
3
((log x/U)
3/2
(log V )
3/2
)
_
x
2e
2
U
,
where is as in (5.55). It is easy to check that q (log 2q)(log log q)/q is
decreasing for q y (indeed for q 9), and so the rst line of (5.59) is minimal
for q = y. Asymptotically, the largest term in (5.59) comes from the last line (of
order x
5/6
(log x)
3/2
), even if the rst line is larger in practice (while being of order
72 H. A. HELFGOTT
x
5/6
(log x) log log x). The ratio of (5.59) (for q = y = x
1/3
/6) to x
5/6
(log x)
3/2
is
descending for x x
0
= 2.16 10
20
; its value at x = x
0
gives
(5.60) |S
II
| 0.272652x
5/6
(log x)
3/2
in case (a), for q x/2e
2
U.
If x/2e
2
U < q Q, we use (5.34). In this range, x/2
2q +

qx adopts its
maximum at q = Q (because x/2
2q for q = x/2e
2
U is smaller than

qx for
q = Q, by (5.45)). A brief calculation starting from (5.34) then gives that
|S
II
| 0.10198x
5/6
(log x)
3/2
,
where we use (5.45) yet again to simplify.
Finally, let us treat case (b), that is, || > 8 and ||q > 8y; we can also assume
q y, as otherwise we are in case (a), which has already been treated. Since
|/x| x/Q, we know that |q| x/Q = U. From (5.36), we obtain that |S
II
| is
at most
2x
_
(y)
8y
_
log
x
U e
2
8y
+ log 3y log
log x/3Uy
log 8e
2
/3
__
6
log
x
U e
2
2y
+ 2
7
_
+
2
2
3
_
x
16y
((log 8e
2
y)
3/2
(log y)
3/2
) +
x/4
Qy
((log e
2
U)
3/2
(log y)
3/2
)
_
+
_

2
_
2(1 y/Q)
_
_
log V +
_
1/ log V
_
+
9
_
x
V
+
2
_
2(y)
log e
2
U
log 8e
2
/3

x
U
,
We take the maximum of the ratio of this to x
5/6
(log x)
3/2
, and obtain
|S
II
| 0.24956x
5/6
(log x)
3/2
.
Thus (5.60) gives the worst case.
We now take totals, and obtain
(5.61)
S
(x, ) |S
I,1
| +|S
I,2
| +|S
II
|
(4.1982 + 1213.15)x
2/3
log x + (0.001063 + 0.0006406)x
2/3
(log x)
2
+ 0.272652x
5/6
(log x)
3/2
0.27266x
5/6
(log x)
3/2
+ 1217.35x
2/3
log x,
where we use (5.45) yet again.
5.3. Conclusion.
Proof of main theorem. We have shown that |S
(, x)| is at most (5.56) for q

x
1/3
/6 and at most (5.61) for q > x
1/3
/6. It remains to simplify (5.56) slightly.
Let
=
C
x
1
,2q
0
(log 2q
0
+ 0.002) +
log 8q
0
2
0.30214 log 2q
0
+ 0.67506
= 3.61407 . . . ,
where x
1
= 10
25
, q
0
= 2 10
5
. (We will be optimizing matters for x = x
1
,
0
q =
2q
0
, with very slight losses in nearby ranges.) By the geometric mean/arithmetic
mean inequality,
_
C
x
1
,
0
q
(log
0
q + 0.002) +
log 4
0
q
2
_
0.30214 log
0
q + 0.67506.
is at most
1
2
_
1
_
C
x
1
,
0
q
(log
0
q + 0.002) +
log 4
0
q
2
_
+
(0.30214 log
0
q + 0.67506)
_
C
x,
0
q
2
(log
0
q + 0.002) +
_
1
4
0.30214
2
_
log
0
q
+
1
2
_
log 2
2
0.67506
_
0.27125 log
_
1 +
log 4t
2 log
9x
1/3
2.004t
_
(log
0
q + 0.002) + 0.4141 log
0
q + 0.49911.
Now, for x x
0
= 2.16 10
20
,
C
x,t
log t

C
x
0
,t
log t
0.08659
for t 10
6
, and
C
x,t
log t

C
6t
3
,t
log t

1
log t
log
_
1 +
log 4t
2 log
27
1.002
_
0.08659
if 10
6
< t x
1/3
/6. Hence
0.27125 C
x,
0
q
0.002 0.000047 log
0
q.
We conclude that, for q x
1/3
/6,
|S
(, x)|
R
x,
0
q
log
0
q + 0.49911
_
(q)
0
x +
2.491x
q
0
+
2x
0
min
_
_
log
7
4
0
q
13
4
+
80
9
(q)
,
5
6
log x +
50
9
q
_
_
+
2x
0
log q
80
9

16
9
0
+
111
5
q
+ 3.2x
5/6
,
where
R
x,t
= 0.27125 log
_
1 +
log 4t
2 log
9x
1/3
2.004t
_
+ 0.41415.
Appendix A. Norms of Fourier transforms

Our aim here is to give upper bounds on |
2
|
, where
2
is as in (1.4). We
will do considerably better than the trivial bound |
|
1
.
Lemma A.1. For every t R,
(A.1) |4e(t/4) 4e(t/2) +e(t)| 7.87052.
We will describe an extremely simple, but rigorous, procedure to nd the
maximum. Since |g(t)|
2
is C
2
(in fact smooth), there are several more ecient
and equally rigourous algorithms for starters, the bisection method with error
bounded in terms of |(|g|
2
)
.
74 H. A. HELFGOTT
Proof. Let
(A.2) g(t) = 4e(t/4) 4e(t/2) +e(t).
For a t b,
g(t) = g(a) +
t a
b a
(g(b) g(a)) +
1
2
(b a)
2
O
( max
v[a,b]
|g
(v)|)
and so
(A.3) max
t[a,b]
|g(t)| max(|g(a)|, |g(b)|) +
1
2
(b a)
2
max
v[a,b]
|g
(v)|.
For any v R,
(A.4) |g
(v)|
_
2
_
2
4 +
2
4 + (2)
2
= 9
2
.
Clearly g(t) depends only on t mod4. Hence, by (A.3) and (A.4), to estimate
max
tR
|g(t)| with an error of at most , it is enough to subdivide [0, 4] into
intervals of length
_
2/9
2
each. We set = 10
6
and compute.
Lemma A.2. Let
2
: R
+
R be as in (1.4). Then
(A.5) |
2
|
31.521.
This should be compared with |
2
|
1
= 48.
Proof. We can write
(A.6)
2
(x) = 4(4
1/4
(x) 4
1/2
(x) +
1
(x)) +f(x),
where
x
0
is the point measure at x
0
of mass 1 (Dirac delta function) and
f(x) =
_
_
0 if x < 1/4 or x 1,
4x
2
if 1/4 x < 1/2,
4x
2
if 1/2 x < 1.
Thus

2
(t) = 4g(t) +

f(t), where g is as in (A.2). It is easy to see that |f
|
1
=
2 max
x
f(x) 2 min
x
f(x) = 160. Therefore,
(A.7)
f(t)
(t)/(2it)

|f
|
1
2|t|
=
80
|t|
.
Since 31.5214 7.87052 = 0.03892, we conclude that (A.5) follows from Lemma
A.1 and (A.7) for |t| 655 > 80/( 0.03892).
It remains to check the range t (655, 655); since 4g(t) +

f(t) is the
complex conjugate of 4g(t) +

f(t), it suces to consider t non-negative. We use
(A.3) (with 4g +

f instead of g) and obtain that, to estimate max
tR
|4g +

f(t)|
with an error of at most , it is enough to subdivide [0, 655) into intervals of
length
_
2/|(4g +

f)
each and check |4g +

f(t)| at the endpoints. Now,
for every t R,
f
_
(t)
(2i)
2
x
2
f(t)
= (2)
2
O
_
|x
2
f|
1
_
= 12
2
.
By this and (A.4), |(4g +
f)
48
2
. Thus, intervals of length
1
give an error
term of size at most 24
2
2
1
. We choose
1
= 0.001 and obtain an error term less
than 0.000237 for this stage.
To evaluate

f(t) (and hence 4g(t) +

f(t)) at a point, we use Simpsons rule
on subdivisions of the intervals [1/4, 1/2], [1/2, 1] into 200 max(1,
_
|t|) sub-
intervals each.
5
The largest value of

f(t) we nd is 31.52065 . . . , with an error
term of at most 4.5 10
5
.
Lemma A.3. Let
2
: R
+
R be as in (1.4). Let
y
(t) = log(yt)
2
(t), where
y 4. Then
(A.8) |
y
|
1
< (log y)|
2
|
1
.
This was sketched in [Helb, (2.4)].
Proof. Recall that supp(
2
) = (1/4, 1). For t (1/4, 1/2),
y
(t) = (4 log(yt) log 4t)
=
4 log 4t
t
+
4 log yt
t

8 log 4t
t
> 0,
whereas, for t (1/2, 1),
y
(t) = (4 log(yt) log t)
=
4 log yt
t

4 log t
t
=
4 log yt
2
t
< 0,
where we are using the fact that y 4. Hence
y
(t) is increasing on (1/4, 1/2) and
decreasing on (1/2, 1); it is also continuous at t = 1/2. Hence |
y
|
1
= 2|
y
(1/2)|.
We are done by
2|
y
(1/2)| = 2 log
y
2

2
(1/2) = log
y
2
8 log 2 < log y 8 log 2 = (log y)|
2
|
1
.
Lemma A.4. Let y 4. Let g(t) = 4e(t/4) 4e(t/2) + e(t) and k(t) =
2e(t/4) e(t/2). Then, for every t R,
(A.9) |g(t) log y k(t) 4 log 2| 7.87052 log y.
Proof. By Lemma A.1, |g(t)| 7.87052. Since y 4, k(t) (4 log 2)/ log y 6.
For any complex numbers z
1
, z
2
with |z
1
|, |z
2
| , we can have |z
1
z
2
| > only
if | arg(z
1
/z
2
)| > /3. It is easy to check that, for all t [2, 2],
arg
_
g(t) log y
4 log 2 k(t)
_
arg
_
g(t)
k(t)
_
< 0.7 <

3
.
(It is possible to bound maxima rigorously as in (A.3).) Hence (A.9) holds.
Lemma A.5. Let
2
: R
+
R be as in (1.4). Let
(y)
(t) = (log yt)
2
(t), where
y 4. Then
(A.10) |
(y)
|
< 31.521 log y.

Proof. Clearly
(y)
(x) =
2
(x)(log y) +
_
(log x)
2
(x) +
2
x
2
(x)
1
x
2
2
(x)
_
=
2
(x)(log y) + 4(log x)(4
1/4
(x) 4
1/2
(x) +
1
(x)) +h(x),
where
h(x) =
_
_
0 if x < 1/4 or x > 1,
4
x
2
(2 2 log 2x) if 1/4 x < 1/2,
4
x
2
(2 + 2 log x) if 1/2 x < 1.
5
The authors code was run twice, in two dierent versions once using D. Platts imple-
mentation [Pla11] of double-precision interval arithmetic (based on Lambovs [Lam08] ideas).
76 H. A. HELFGOTT
(Here we are using the expression (A.6) for
2
(x).) Hence
(A.11)

(y)
(t) = (4g(t) +

f(t))(log y) + (16 log 2 k(t) +
h(t)),
where k(t) = 2e(t/4) e(t/2). Just as in the proof of Lemma A.2,
(A.12) |
f(t)|
|f
|
1
2|t|

80
|t|
, |
h(t)|
160(1 + log 2)
|t|
.
Again as before, this implies that (A.10) holds for
|t|
1
0.03892
_
80 +
160(1 + log 2)
(log 4)
_
= 2252.51.
Note also that it is enough to check (A.10) for t 0, by symmetry. Our remaining
task is to prove (A.10) for 0 t 2252.21.
Let I = [0.3, 2252.21] \ [3.25, 3.65]. For t I, we will have
(A.13) arg
_
4g(t) +

f(t)
16 log 2 k(t) +
h(t)
_
3
,

3
_
.
(This is actually true for 0 t 0.3 as well, but we will use a dierent strategy
in that range in order to better control error terms.) Consequently, by Lemma
A.2 and log y log 4,
|
(y)
(t)| < max(|4g(t) +

f(t)| (log y), |16 log 2 k(t)
h(t)|)
< max(31.521(log y), |48 log 2 + 25|) = 31.521 log y,
where we bound
h(t) by (A.12) and by a numerical computation of the maximum

of |
h(t)| for 0 t 4 as in the proof of Lemma A.2.

It remains to check (A.13). Here, as in the proof of Lemma A.4, the allowable
error is relatively large (the expression on the left of (A.13) is actually contained
in (1, 1) for t I). We decide to evaluate the argument in (A.13) at all t
0.005Z I, computing

f(t) and

h(t) by numerical integration (Simpsons rule)
with a subdivision of [1/4, 1] into 5000 intervals. Proceeding as in the proof of
Lemma A.1, we see that the sampling induces an error of at most
(A.14)
1
2
0.005
2
max
vI
((4|g
(v)| +|(
f)
(t)|)
0.0001
8
48
2
< 0.00593
in the evaluation of 4g(t) +

f(t), and an error of at most
(A.15)
1
2
0.005
2
max
vI
((16 log 2 |k
(v)| +|(
h)
(t)|)
0.0001
8
(16 log 2 6
2
+ 24
2
(2 log 2)) < 0.0121
in the evaluation of 16 log 2 |k
(v)| +|(
h)
(t)|.
Running the numerical evaluation just described for t I, the estimates for
the left side of (A.13) at the sample points are at most 0.99134 in absolute value;
the absolute values of the estimates for 4g(t) +

f(t) are all at least 2.7783, and
the absolute values of the estimates for | 16 log 2 log k(t) +
h(t)| are all at least

2.1166. Numerical integration by Simpsons rule gives errors bounded by 0.17575
percent. Hence the absolute value of the left side of (A.13) is at most
0.99134 + arcsin
_
0.00593
2.7783
+ 0.0017575
_
+ arcsin
_
0.0121
2.1166
+ 0.0017575
_
1.00271 <

3
for t I.
Lastly, for t [0, 0.3][3.25, 3.65], a numerical computation (samples at 0.001Z;
interpolation as in Lemma A.2; integrals computed by Simpsons rule with a
subdivision into 1000 intervals) gives
max
t[0,0.3][3.25,3.65]
_
|(4g(t) +

f(t))| +
| 16 log 2 k(t) +
h(t)|
log 4
_
< 29.08,
and so max
t[0,0.3][3.25,3.65]
|
(y)
|
< 29.1 log y < 31.521 log y.

An easy integral gives us that the function log
2
satises
(A.16) | log
2
|
1
= 2 log 4
The following function will appear only in a lower-order term; thus, an
1
estimate
will do.
Lemma A.6. Let
2
: R
+
R be as in (1.4). Then
(A.17) |(log
2
)
|
1
= 96 log 2.
Proof. The function log (t) is 0 for t / [1/4, 1], is increasing and negative for
t (1/4, 1/2) and is decreasing and positive for t (1/2, 1). Hence
|(log
2
)
= 2
_
(log
2
)
_
1
2
_
(log
2
)
_
1
4
__
= 2(16 log 2 (32 log 2)) = 96 log 2.
References
[BR02] G. Bastien and M. Rogalski. Convexite, compl`ete monotonie et inegalites sur les
fonctions zeta et gamma, sur les fonctions des operateurs de Baskakov et sur des
fonctions arithmetiques. Canad. J. Math., 54(5):916944, 2002.
[But11] Y. Buttkewitz. Exponential sums over primes and the prime twin problem. Acta
Math. Hungar., 131(1-2):4658, 2011.
[CW89] J. R. Chen and T. Z. Wang. On the Goldbach problem. Acta Math. Sinica,
32(5):702718, 1989.
[Dab96] H. Daboussi. Eective estimates of exponential sums over primes. In Analytic num-
ber theory, Vol. 1 (Allerton Park, IL, 1995), volume 138 of Progr. Math., pages
231244. Birkhauser Boston, Boston, MA, 1996.
[DEtRZ97] J.-M. Deshouillers, G. Enger, H. te Riele, and D. Zinoviev. A complete Vinogradov
3-primes theorem under the Riemann hypothesis. Electron. Res. Announc. Amer.
Math. Soc., 3:99104, 1997.
[DR01] H. Daboussi and J. Rivat. Explicit upper bounds for exponential sums over primes.
Math. Comp., 70(233):431447 (electronic), 2001.
[EM95] M. El Marraki. Fonction sommatoire de la fonction de Mobius. III. Majorations
asymptotiques eectives fortes. J. Theor. Nombres Bordeaux, 7(2):407433, 1995.
[EM96] M. El Marraki. Majorations de la fonction sommatoire de la fonction
(n)
n
. Univ.
Bordeaux 1, preprint (96-8), 1996.
[GR96] A. Granville and O. Ramare. Explicit bounds on exponential sums and the scarcity
of squarefree binomial coecients. Mathematika, 43(1):73107, 1996.
78 H. A. HELFGOTT
[HB85] D. R. Heath-Brown. The ternary Goldbach problem. Rev. Mat. Iberoamericana,
1(1):4559, 1985.
[HB11] H. Hong and Ch. W. Brown. QEPCAD B Quantier elimination by partial cylin-
drical algebraic decomposition, May 2011. version 1.62.
[Hela] H. A. Helfgott. Major arcs for Goldbachs problem. Preprint.
[Helb] H. A. Helfgott. Minor arcs for Goldbachs problem. Preprint. Available as
arXiv:1205.5252.
[HL23] G. H. Hardy and J. E. Littlewood. Some problems of Partitio numerorum; III: On
the expression of a number as a sum of primes. Acta Math., 44(1):170, 1923.
[Hux72] M. N. Huxley. Irregularity in sifted sequences. J. Number Theory, 4:437454, 1972.
[IK04] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Math-
ematical Society Colloquium Publications. American Mathematical Society, Provi-
dence, RI, 2004.
[Lam08] B. Lambov. Interval arithmetic using SSE-2. In Reliable Implementation of Real
Number Algorithms: Theory and Practice. International Seminar Dagstuhl Castle,
Germany, January 8-13, 2006, volume 5045 of Lecture Notes in Computer Science,
pages 102113. Springer, Berlin, 2008.
[LW02] M.-Ch. Liu and T. Wang. On the Vinogradov bound in the three primes Goldbach
conjecture. Acta Arith., 105(2):133175, 2002.
[Mon68] H. L. Montgomery. A note on the large sieve. J. London Math. Soc., 43:9398, 1968.
[Mon71] H. L. Montgomery. Topics in multiplicative number theory. Lecture Notes in Math-
ematics, Vol. 227. Springer-Verlag, Berlin, 1971.
[MV73] H. L. Montgomery and R. C. Vaughan. The large sieve. Mathematika, 20:119134,
1973.
[MV74] H. L. Montgomery and R. C. Vaughan. Hilberts inequality. J. London Math. Soc.
(2), 8:7382, 1974.
[Pla11] D. Platt. Computing degree 1 L-functions rigorously. PhD thesis, Bristol University,
2011.
[Rama] O. Ramare. Explicit estimates on several summatory functions involving the Moe-
bius function. Preprint.
[Ramb] O. Ramare. Explicit estimates on the summatory functions of the moebius function
with coprimality restrictions. Preprint.
[Ramc] O. Ramare. From explicit estimates for the primes to explicit estimates for the
Moebius function. Preprint.
[Ramd] O. Ramare. A sharp bilinear form decomposition for primes and moebius function.
Preprint. To appear in Acta. Math. Sinica.
[Ram10] O. Ramare. On Bombieris asymptotic sieve. J. Number Theory, 130(5):11551189,
2010.
[RR96] O. Ramare and R. Rumely. Primes in arithmetic progressions. Math. Comp.,
65(213):397425, 1996.
[RS62] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime
numbers. Illinois J. Math., 6:6494, 1962.
[RS75] J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions (x)
and (x). Math. Comp., 29:243269, 1975. Collection of articles dedicated to Derrick
Henry Lehmer on the occasion of his seventieth birthday.
[Sel91] A. Selberg. Lectures on sieves. In Collected papers, vol. II, pages 66247. Springer
Berlin, 1991.
[Tao] T. Tao. Every odd number greater than 1 is the sum of at most ve primes. Preprint.
Available as arXiv:1201.6656.
[Vau77] R.-C. Vaughan. Sommes trigonometriques sur les nombres premiers. C. R. Acad.
Sci. Paris Ser. A-B, 285(16):A981A983, 1977.
[Vin37] I. M. Vinogradov. Representation of an odd number as a sum of three primes. Dokl.
Akad. Nauk. SSR, 15:291294, 1937.
[Vin04] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers.
Dover Publications Inc., Mineola, NY, 2004. Translated from the Russian, revised
and annotated by K. F. Roth and Anne Davenport, Reprint of the 1954 translation.
Harald Helfgott,

Ecole Normale Superieure, Departement de Mathematiques,
45 rue dUlm, F-75230 Paris, France
E-mail address: harald.helfgott@ens.fr

Goldbach

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Goldbach

Uploaded by

Copyright:

Available Formats

a

(a/2q, x)| for q

(a/q, x)|/x given by this paper and by [Tao] for x = 10

(log log q) in the worst case.

if every prime divid-

(R) to mean a quantity at most R in absolute value.

(t)/(2it) by integration by parts. Iterating, we obtain that, if f is compactly

The trivial bound |

) can still be made sense

is undened at a nite number of points, provided f is understood as

has nite total variation). This is the case, in particular,

(x, ) is as in (1.1) and

ranging from 2 to q/2. We obtain that the sum is at most

AC is greater than Aq, which is an obvious upper

The following bound will be useful when the constant A in an application of

of the new approximation a

than continued fractions would

(1/qQ) and q Q. (If = 0,

or the alternative approximation a

), and so there must be a dierent approximation

/Q + q/((1 + )Q) 1, and so q

Q q/(1 + ) (/(1 + ))Q. (Note also

instead of a/q. By (3.12) (without the restriction q m) and Lemma 3.1,

(1 +)Q + 1 (1 +)(Q + 1). Therefore,

3 for < 0.1741). A quick check against (3.14) shows that

and M are dened just as before (with 2 wherever

= e((a + q)/2) {1, 1} are independent of m and n.

x). Let be continuous, piecewise C

. Assume that, for any

as in the proof of Lemma 3.4. The same

We will apply the following only for q relatively large.

x). Let be continuous, piecewise C

(Q+ 1)(1 +) log Dlog

x, we conclude that (3.53) is at most

+ (19/18)q)) x, we conclude that

= D (in which case (3.58) and

(1). Thus (4.12) equals

Applying Lemma 4.1 with y = S/s and z = x/Ws, where S = x/WU, we

, [0, 1/2). Thus it makes sense now to estimate F

x leads to a polynomial of high degree, and quantier elimination

= max(V, W/2). (The condition gcd(p, v) = 1 will

/r, r R, gcd(r, q) = 1, to the

/r are still separated by at least . Thus

W/2. Let = a/q +O

are separated by a distance of at least

. We multiply both sides of (4.45) by (W/2 +

/r is separated from the others by at least 1/qrR1/Q; we

, and so this choice of R is valid. It remains to check that

We need a version of Lemma 4.4 with m restricted to the odd numbers.

W/2. Let 2 = a/q +O

in such a block, m and m

are separated by a distance

x/2W. Let Q 3.5W. Let

)W (x/2W) W = x/2. Thus (4.53) holds.

= x/|q|. For any m

+x/|q| W/2 +x/|q| and

= x/|q| Q again, and we split

< m x/W into intervals of length 2(Q

q), so that (4.57) still

log ()| min

= max(U, x/2W). Since W

W/2 and W V 117, we can

b, we can take out the q/(q) W/ log(W/2q) term and

/x, where either q