You are on page 1of 37

Murmurations

Nina Zubrilina

October 12, 2023

Abstract
We establish a case of the surprising correlation phenomenon observed in the recent works of He,
Lee, Oliver, Pozdnyakov, and Sutherland between Fourier coefficients of families of modular forms
and their root numbers.
arXiv:2310.07681v1 [math.NT] 11 Oct 2023

1 Introduction
In a recent paper, He, Lee, Oliver, and Pozdnyakov ([3]) discovered a remarkable oscillation pattern in the
averages of the Frobenius traces of elliptic curves of fixed rank and conductor in a bounded interval. This
discovery stemmed from the use of machine learning and computational techniques and did not explain
the mathematical source of this phenomenon, referred to as "murmurations" due to its visual similarity
to bird flight patterns:

(a) Rank bias observed in [3] (with authors’ permission) (b) Starlings murmurations. AlbertoGonzalez/Shutterstock.com

Later, Sutherland and the authors ([13], [4]) detected this bias in more general families of arithmetic
L functions, for instance, those associated to weight k holomorphic modular cusp forms for Γ0 (N ) with
conductor in a geometric interval range [M, cM ] and a fixed root number. Sutherland made a striking
observation that the average of af (P ) over this family for a single prime P ∼ M converges a continuous-
looking function of P/M :

Figure 2: Dyadic averages of a(p) with a fixed root number, courtesy of Sultherland

1
The goal of this paper is to establish this bias in families of modular forms of square-free level with
arbitrary fixed weight and root number. We show the following:

Theorem 1. Let H new (N ) be a Hecke basis for trivial character weight k cusp newforms for Γ0 (N ) with f ∈
H new (N ) normalized to have lead coefficient 1. Let ε(f ) be the root number of f , let af (p) be the p-th Fourier
coefficient of f , and let λf (p) := af (p)/p(k−1)/2 . Let X, Y, and P be parameters going to infinity with P prime;
assume further that that Y = (1 + o(1))X 1−δ2 and P ≪ X 1+δ1 for some δ2 , δ1 > 0 with 2δ1 < δ2 < 1. Let
y := P/X. Then:

P□ P √
f ∈H new (N ) λf (P ) P ε(f )
 
N ∈[X,X+Y ] √ X p r
P□ = Dk A y + (−1)k/2−1 Dk B 2
c(r) 4y − r Uk−2 √
2 y
P
N ∈[X,X+Y ] f ∈H new (N,k) 1 √
1≤r≤2 y
 
−δ ′ +ε 1
− Dk δk=2 πy + Oε X +
P

where Uk−2 is the Chebyshev polynomial given by

sin((n + 1)θ)
Un (cos θ) := ,
sin θ
δ ′ := min{δ2 /2 − δ1 , 1/9 + δ2 /9 − δ1 },
Y p

A := 1+ ,
p
(p + 1)2 (p − 1)
Y p4 − 2p2 − p + 1
B := ,
p
(p2 − 1)2
12
Dk := Q  ,
(k − 1)π p 1 − p21+p
Y p2

c(r) := 1+ 4 ,
p − 2p2 − p + 1
p|r
P□
and denotes a sum over square-free parameters. In particular, for any δ1 < 2/9, one can find δ2 for which

δ > 0.

We define
 
√ X p r
Mk (y) := Dk A y + (−1)k/2−1 Dk B c(r) 4y − r2 Uk−2 √ − Dk δk=2 πy
√ 2 y
1≤r≤2 y

to be the weight k murmuration density.


The formula above arises from an application of the Eichler-Selberg trace formula to the composition
of Hecke and Atkin-Lehner operators, which allows us to reinterpret the sum in terms of class numbers.
We then compute class number averages in short intervals by means of the class number formula.
We remark that the exponents in the statement are far from optimal, as our goal here is only to get
a power saving error term in a range up to X a for some a > 1. The restriction to square-free levels is a
technical one, as the trace formula simplifies greatly when the level is square-free. From the computations
of Sutherland, it appears that the resulting density functions are slightly perturbed when one considers
all levels, but they share key properties with the ones above.

2
Figure 3: Plots of M8 and M24 , courtesy of Sultherland ([14])

The murmuration densities Mk (y) in Theorem 1 have many interesting features. They are highly
oscillating, continuous, with derivative discontinuities at n2 /4 for n ∈ N. At the origin, the Mk ’s are

positive and grow like y. This positive root number bias for small P has been observed previously in
the works of Martin and Pharis (see [8], [9], [10]).
In the case of weight k = 2, we analyze the behavior this function as y → ∞ in more detail. In spite
of the appearing y term, it has a true growth rate of y 1/4 . The function M2 (y)/y 1/4 is an asymptotically

uniformly almost periodic function of y; it is a convergent sum of periodic functions with (increasing)
half integer periods. Its sign, i.e., the sign of the correlation bias, changes infinitely often. We prove the
following:

Theorem 2. Let
√ X p
M2 (y) = D2 A y + D2 B c(r) 4y − r2 − D2 πy

1≤r≤2 y

be the weight 2 murmuration density. Then as y → ∞,


  √ 
1/4 2BD2 X
1/2 2 y
M2 (y) = y · Q(d)d ζ −1/2, + O(1).
π √ d
1≤d≤2 y
d □-free

Here ζ is the Hurwitz zeta function,


Y p2
Q(d) := ≍ µ2 (d)/d2
p4 − 2p2 − p + 1
p|d

and {·} denotes the fractional part.

5 10 15 20 25
-1

Figure 4: The weight 2 density function

Integrating the murmuration density above produces the dyadic interval weight k averages observed
by Sutherland:

3
Theorem 3. Let P ≪ X 6/5 , let c > 1 be a constant, Z := cX, and y := P/X. Then as X → ∞,
P□ P √ Z c
N ∈[X,Z] f ∈H new (N,k) λf (P ) P ε(f ) 2
P□ = 2 uMk (y/u)du + oy (1),
(c − 1) 1
P
N ∈[X,Z] f ∈H new (N,k) 1

where Mk (y) is as in Theorem 1. In particular, for k = c = 2, the dyadic average


P□ P
N ∈[X,2X] f ∈H new (N,1) af (P )ε(f )
P□ P
N ∈[X,2X] f ∈H new (N,k) 1

converges to
 √
 α y − βy on [0, 1/4];
α√y − βy + γπy 2 − γ(1 − 2y)py − 1/4 − 2γy 2 arcsin(1/2y − 1)) on [1/4, 1/2];



 α y − βy + 2γy 2 (arcsin(1/y − 1) − arcsin(1/2y − 1))


 p
−γ(1 − 2y) y − 1/4 + 2γ(1 − y) 2y − 1 on [1/2, 1],

where
α ≈ 6.38936, β ≈ 11.3536, γ ≈ 2.6436.
We note this is the function from Figure 2.

Finally, we analyze the asymptotic behavior of the smoothed averages from Theorem 3:

Theorem 4. Let Φ : (0, ∞) → C be a compactly supported smooth weight function, and let
Z ∞  Z ∞
2 du du
MΦ (y) := M2 (y/u)Φ(u)u / Φ(u)u2 .
0 u 0 u

Then MΦ is continuous on (0, ∞), MΦ (0) = 0, and as y → ∞,

MΦ (y) = 1 + oy (1).

Asymptotic properties of the characteristic function averages as in Theorem 3 will be analyzed in


forthcoming work.
Murmurations are a feature of the the one-level density transition range. The Katz-Sarnak philosophy
([6]) predicts that averages as in Theorem 4 for P ∼ N a behave differently when a < 1 and a > 1,
and our statements for P ∼ N describe the phase transition between these. The unusual (for k > 2)
normalization of the coefficients above also arises naturally from this interpretation. For a more detailed
discussion of this connection, we point the reader to [11].
Computing similar averages weighted "harmonically" (i.e., by the value at 1 of the symmetric square
L-function) by means of the Petersson formula reveals that with weights, this bias becomes much less

pronounced: the resulting function grows like y, as opposed to y, at the origin.
Murmurations for elliptic curves over Q are not explained by these results, as they constitute a very
sparse subset of weight 2 modular forms. We point out that computational observations of the afore-
mentioned authors make a compelling case that this phenomenon is very sensitive to the ordering by
conductor, and disappears almost entirely when the curves are ordered by naive height, j-invariant, or
discriminant.

4
2 Trace Formula Setup
Given a square-free positive integer N and a prime P ∤ N , let H new (N, k) denote a basis of the space
S new (N, k) of weight k Hecke cusp newforms for Γ0 (N ), and let af (P ) = λf (P )P (k−1)/2 denote the
eigenvalue under the P -th Hecke operator Tp of f ∈ H new (N, k). Let ε(f ) denote the root number of f
(recall (−1)k/2 ε(f ) is equal to the eigenvalue of f under the Atkin-Lehner involution WN ). In order to
compute theP average of af (P )ε(f ) for eigenforms f ranging over square-free levels N in an interval, we
interpret f ∈H new (N,k) af (P )ε(f ) as the trace of the operator (−1) Tp ◦ WN on S new (N ) and apply
k/2

the corresponding trace formula.


Such a trace formula was first derived by Yamauchi in [16]; the result contained a computational error
which was later corrected by Skoruppa and Zagier ([12]). This formula (section 2, formula (7)) gives the
trace of Tp ◦ WN on the full space of cusp forms S(N ); as the authors point out in the discussion leading
to formula (5), oldforms coming from S(M ) contribute to the trace only when N/M is a square. Since
we are restricting ourselves to N square-free, we thus have the following result at our disposal:
Theorem (Skoruppa-Zagier ([12], section 2, formulas (5) and (7)). For N square-free and a prime P ∤ N ,
√ !
X √ H1 (−4P N ) r N X
P λf (P )ε(f ) = + (−1)k/2−1 Uk−2 √ H1 (r2 N 2 − 4P N )
2 2 P √
f ∈H new (N,k) 0<r≤2 P/N

− δk=2 (P + 1).

Here the Hurwitz class number H1 (−d) is the number of equivalence classes with respect to SL2 (Z)
of positive-definite binary quadratic forms of discriminant −d weighed by the number of automorphisms
(i.e., with forms corresponding to multiples of x2 + y 2 and x2 + xy + y 2 counted with multiplicities 1/2
and 1/3, accordingly). Hence, H1 can be expressed in terms of the Gauss class number h via:
X
H1 (−d) = h(−d/f 2 ) + O(1),
f ∈N:f 2 |d

with the error term disappearing if d ̸= 3 · □, 4 · □.


Assume from now on that P ̸= 2. The square factors of 4P N are 1 and 4, since by assumption P ∤ N .
For a prime q and r ≥ 1, the condition q 2 | N (r2 N − 4P ) can hold either if q 2 |r2 N − 4P or if q divides
both N and 4P , i.e., if q = 2 and N is even. However, if N = 2N ′ is even (with N ′ odd), then for any d
with 4d2 |(r2 N 2 − 4P N ), one has

(r2 N 2 − 4P N )/4d2 = (r2 N ′2 − 2P N ′ )/d2 ,

which is always 2 or 3 modulo 4, so the corresponding class number vanishes. Thus it suffices to consider
square divisors of r2 N 2 − 4P N for which d2 |r2 N − 4P . Consequently, for N square-free, and a prime
P ∤ 2N , the trace formula becomes:

X √ h(−4P N ) h(−P N )
P λf (P )ε(f ) = + − δk=2 P + O(1).
2 2
f ∈H new (N )
√ !
r N X X
+ (−1)k/2−1 Uk−2 √ h(N (r2 N − 4P )/d2 ). (1)
2 P √ P d2 |r2 N −4P
1≤r≤2 N

From this formula, one can already see that the trace is positively biased, at least for small P . Indeed,
the only negative term in this expression is −P ; on the other hand, Siegel’s bound dictates that the class

5
number terms should be of size (P N )1/2+ε , which dominates for small P , as was observed in ([8]) and
([9]), ([10]).
On the other hand, for P of size N 1+ε , the balance becomes more subtle, and as we will see, the trace
can be either positive or negative, even when averaged over short intervals in N .

3 Average Class Number in Short Intervals


Our interest in this section is to exploit the Dirichlet class number formula to understand sums of class
numbers in (1) as the square-free parameter N ranges over a short interval [X, X + Y ] for Y = o(X).
For such an interval, the square root term in the class number formula has fixed size, so these sums can be
understood by averaging Dirichlet characters coming from a truncated L function special value. Carrying
out this computation yields Theorem 1. We establish it via the following two propositions:

Proposition 3.1. Let P ̸= 2 be prime and let [X, X +Y ] be an interval of length Y = o(X). Then as X → ∞,
we have

P 7/12 Y P 1/2
 
ζ(2)π X□ h(−P N ) h(−4P N ) √ 1
+ = A y+Oε + + (XY P )ε .
XY 2 2 P 3/2 X 1/2 Y 5/6 X 5/12 X 3/2
N ∈[X,X+Y ]
P ∤N

Here the error term is o(y) as long as P → ∞, Y = o(X/ log(P X)), and for some ε > 0, X 7/10+ε P 1/10+ε =
o(Y ).
2
p P ̸= 2 be a prime, let r ∈ N, and let X > Y > 0 be such that r (X + Y ) < 4P for
Proposition 3.2. Let
each integer r < 2 P/X. Then:

ζ(2)π X X□ X p
H1 (r2 N 2 − 4P N ) = c(r) 4y − r2
YX √ √
r≤2 P/XN ∈[X,X+Y ] r≤2 P/X
P ∤N
 P 11/10 YP P Y 1/2 P P 
+ O 2/5 9/10 + 2 + 3/2
+ √ + 1/9
(XY P )ε .
Y X X X XY 13/18 XY

Subsections 3.1 proves Proposition 3.1. In subsection 3.2, we prove Proposition 3.2 by adapting the
same idea to the more complex square divisor structure of the arguments of the class numbers involved.

3.1 H1 (−4P N )
3.1.1 h(−P N )
By Dirichlet’s class number formula, for d > 4, h(−d) is 0 if −d = 2 or 3 mod 4, and otherwise

d
h(−d) = L(1, χd ),
π
where L(1, χd ) is the value at 1 of the Dirichlet series for the Kronecker symbol nd (a quadratic Dirichlet


character of modulus d or 4d). We evaluate the sum


1 X 1 X
(2)
p
√ h(−P N ) = N/XL(1, χ−P N ).
PX π
N ∈[X,X+Y ] N ∈[X,X+Y ]
P N =3 mod 4

6
by truncating the Dirichlet series of the L-function and splitting the corresponding characters into prim-
itive and non-primitive ones. For χ a non-principal Dirichlet character of modulus d, it follows from Abel
summation and Polya-Vinogradov that for a truncation parameter T ,
X χ(n) T √
X χ(n) 
L(1, χ) = = +O d log d/T (3)
n≥1
n n=1
n

(see, for example, [1], page 321, Theorem 5.2.) Since χ−P N is always a non-principal Dirichlet character
for square-free N with P ∤ N , we have
T −P N
 √ !
X p X p X
n Y P X log P X
N/XL(1, χ−P N ) = N/X +O
n=1
n T
N ∈[X,X+Y ] N ∈[X,X+Y ]
P N =3 mod 4 P N =3 mod 4
√ √ !
T T
p p
N/X −P N −P N
 
X X
m2
X X N/X n Y P X log P X
= 2
+ +O
m n T
N ∈[X,X+Y ] m=1 N ∈[X,X+Y ] n=1
P N =3 mod 4 P N =3 mod 4 n̸=□
√ !
Y P X log P X
=: Sq + NSq + O .
T
Since the sum Sq contains principal characters and NSq is going over non-principal ones, we expect Sq
to be the main term. Indeed,

T   r
X 1 X
2 −P N  N −X 
Sq = µ (N ) 1+ 1+ −1
m=1
m2 m2 X
N ∈[X,X+Y ]
P N =3 mod 4

T  
X 1 X
2 −P N  p 
= µ (N ) +O Y 1 + Y /X − 1
m=1
m2 m2
N ∈[X,X+Y ]
P N =3 mod 4
 
Y2
   
X 1  X
2 N χ1 (P N ) − χ2 (P N ) 
= µ (N ) +O
√ m2 m2 2 X
m≤ T N ∈[X,X+Y ]
(P,m)=1

where χ1,2 are the characters modulo principal. The character N


χ1 (N ) is principal modulo

4, χ 1 m2

2m, and mN2 χ2 (N ) is always non-principal modulo 4m. Applying Lemmas 6.7 and Lemma 6.5,


X Y η(2m)    2
1 1/5+ε 3/5+ε Y
Sq = 2
+ Oε 2
m X +O
√ ζ(2) 2m m X
m≤ T
(P,m)=1

Y2
 
4A Y Y
=Y + Oε + √ + X 3/5+ε
+ . (4)
11ζ(2) P2 T X
Next, we want to bound the term
T
p T
!
−P N −P N
Y2
  P
N/X n≤T (1/n)
X X X X
n n
NSq = = +O
n=1
n n X
N ∈[X,X+Y ] N ∈[X,X+Y ] n=1
P N =3 mod 4 n̸=□ P N =3 mod 4 n̸=□
T −P
 
Y 2 log T
   
X
n
X N χ1 (P N ) − χ2 (P N ) 
=  +O
n=1
n n 2 X
N ∈[X,X+Y ]
n̸=□

7
For n not a square, Nn is non-principal; moreover, N
is primitive modulo 8. Hence N
χ1,2 (N ) are
  
2 n
also non-principal, so applying Lemma 6.7 again,
T
X
(1/n)Oε n1/5+ε X 3/5+ε + O Y 2 log T /X ≪ε T 1/5+ε X 3/5+ε + Y 2 log T /X. (5)
 
NSq =
n=1
n̸=□

Combining (2), (4), and (5),

1 X 4A
√ h(−P N ) = Y + ErrY,X,P,T , (6)
PX 11ζ(2)π
N ∈[X,X+Y ]

where
√ !
Y Y 3/5+ε 1/5+ε Y 2 log T Y P X log P X
ErrY,X,P,T = Oε 2
+ √ + X T + + .
P T X T

In particular, setting T := Y 5/6 P 5/12 X −1/12 , we get an error term matching that of Proposition 3.1.

3.1.2 h(−4P N )
We handle this case the same way as in the previous section. Since −4P N is always 0 mod 4,
r
1 X 2 X N
√ h(−4P N ) = L(1, χ−4P N )
P X N ∈[X,X+Y ] π X
N ∈[X,X+Y ]
T √ !
N X −4P
r N

2 X
n Y P X log P X
= +O
π X n=1 n T
N ∈[X,X+Y ]
T −4P N
 √ 2
!
2 X X
n Y P X log P X Y log T
= +O + .
π n=1
n T X
N ∈[X,X+Y ]

Again, we can separate into principal and non-principal characters:



X T
X −4P N
 T
X X −P N

XT X −P N

n n m2
= +
n n=1 N ∈[X,X+Y ]
n m=1 N ∈[X,X+Y ]
m2
N ∈[X,X+Y ] n=1
n̸=□ m odd
n odd

Applying Lemma 6.7 and Lemma 6.5 as in the previous section, we conclude
 √ 
T
1 X 2Y  X η(m)  18A
√ h(−4P N ) = 2  Y + ErrY,X,P,T = Y + ErrY,X,P,T ,
πζ(2) m 11ζ(2)π

PX N ∈[X,X+Y ] m odd
(m,P )=1

which finishes the proof of Proposition 3.1 in combination with (6).

3.2 H1 (r2 N 2 − 4P N )
The aim of this section is to prove the following:

8
Proposition 3.3. Let P ̸= 2 be a prime, let r ∈ N, and let X > Y > 0 be such that r2 (X + Y ) < 4P. Then:
X□ Y Bc(r) √
H1 (r2 N 2 − 4P N ) = 4P X − r2 X 2
πζ(2)
N ∈[X,X+Y ]
P ∤N
√ !
 Y 2
P √ √ 
+ O Y P X)ε (Y P X)3/5 + √ + rY 3/2 X 1/2 + X P Y 5/18 + Y 8/9 P X .
X
2 2
For a divisor d2 |r2 N −4P such that r N d−4P
2
N
= 0/1 mod 4, we once again have by the class number
formula that  2 2  √
r N − 4P N 4P N − r2 N 2
h = L(1, χ r2 N 2 −4P N ).
d2 πd d2

Thus, for 1 ≤ r ≤ 2 P/(X + Y ),


p

X□ X X L(1, χ(r2 N 2 −4P N )/d2 ) √


H1 (r2 N 2 − 4P N ) = 4P N − r2 N 2 ,
πd
N ∈[X,X+Y ] d2 ≤4P N ∈[X,X+Y ]
P ∤N N ∈Aer,d

where we let

Aek,d := {N ∈ Z : N □ - free, P ∤ N, d2 |r2 N − 4P, and (r2 N 2 − 4P N )/d2 ≡ 0/1 mod 4.}

We begin by analyzing the set Aer,d .

3.2.1 Remainder analysis


2 2
Suppose first that r is odd. For N square-free, d2 |r2 N − 4P implies that d is odd, and r N d−4P2
N

0/1 mod 4 automatically holds. Thus, Ar,d is the set of square-free integer solutions to the congruence
e

r2 N = 4P mod d2 , P ∤ N.

For odd r and d, this has a solution if and only if P ∤ d and (r, d) = 1, yielding

Aer,d = {N ∈ Ar,d : N □ - free, P ∤ N },


where we let
(
n ∈ Z : n ≡ 4P k −2 mod d2 if (d, r) = (d, P ) = (d, 2) = 1;
Ar,d :=
∅ otherwise

Now assume r is even. From d2 |4P − r2 N < 4P and since r2 X < 4P , we know r, d < P , i.e.,
2 2
(P, k) = (P, d) = 1. Let r := 2l. Then r N d−4P
2
N
= 0/1 mod 4 is equivalent to

4l2 N = 4P + rd2 mod 4d2 , where rN = 0/1 mod 4. (7)

If d is odd, reducing (7) modulo 4 shows that r ≡ 0 mod 4, and (7) simplifies to

l2 N = P mod d2 .

This has 1 solution mod d2 for (l, d) = 1 and no solutions otherwise. Suppose now d = 2b, so (7)
becomes

l2 N = P + rb2 mod 4b2 , rN = 0/1 mod 4. (8)

9
Since we restrict to N square-free, we can disregard the case N ≡ 0 mod 4. If N = 2 mod 4, then r must
be even and (8) has no solutions mod 2. For N odd, rN = 0/1 mod 4 holds if and only if r = 0/N mod 4,
and we have an equivalence
"
N (l2 − b2 ) = P mod 4b2 , N odd
(8) & N is odd ⇐⇒
N l2 = P mod 4b2 , N odd.

If (l, b) > 1, this has no solutions since P ∤ (l, b). Otherwise, if (l, b) = 1, there are three cases:

• if l, b are odd, there is a solution N ≡ P l−2 mod 4b2 ;

• if l is even, b is odd, there is a solution N ≡ P (l2 − b2 )−1 mod 4b2 ;

• if l is odd, b is even, there’s a solution N ≡ P l−2 mod 4b2 and a solution N = P (l2 −b2 )−1 mod 4b2 ,
distinct.

Note N is automatically odd in the above three cases.


In summary, for any choice of r and d,

Aer,d := {N ∈ Ar,d : P ∤ N, N □ - free},

where the set Ar,d is given by a congruence condition modulo d2 ; namely,

Ar,d := {n ∈ Z : n mod d2 ∈ Rd,r , }

where Rd,r is a subset of residues mod d2 coprime to d that satisfies:




 1 if (d, r) = 1, 2 ∤ d, k, P ∤ d

1 if (d, r) = 1, 2|r



|Rd,r | = 1 if (d, r) = 2, 2||d;
2 if (d, r) = 2, 4|d;





∅ otherwise.

r2 N 2 −4P N
Furthermore, letting s := d2
for some N ∈ Ar,d , the above analysis also proves that:

• For (d, r) = 2, 2||d, 2||r, s is always even;

• For (d, r) = 2, 2||d, 4|r, s is always odd;

• For (d, r) = 2, 4|d, the two residues in Rr,d produce s of different parity.

We will call a pair (r, d) admissible if Rr,d is non-empty.

3.2.2 Truncation
We have established that

X□ X X□ L(1, χ r2 N 2 −4P N ) √
H1 (r2 N 2 − 4P N ) = d2
4P N − r2 N 2 . (9)
πd
N ∈[X,X+Y ] d2 ≤4P N ∈[X,X+Y ]
P ∤N N ∈Ar,d
P ∤N

Note that for N ∈ [X, X + Y ] (and assuming r2 (X + Y ) < 4P ),

10
!
√ √ √ √ √  √
r
r2 (N − X)
2 2 2 2 2
4P N − r N − 4P X − r X = 4P − r N N − X − 4P X − r2 X 2 1− 1−
4P − r2 X
! !
√ √ √
r r
Y r2 Y
≪ PX 1+ − 1 − X 4P − r2 X 1− 1−
X 4P − r2 X
√ √ √ √
≪ PY/ X + r X Y .

Combining this with Siegel’s bound,

X X□ L(1, χ r2 N 2 −4P N ) √
d2
4P X − r2 X 2 + O Y 2 P 1/2+ε X −1/2+ε + rY 3/2 X 1/2+ε P ε .

(9) =
πd
d2 ≤4PN ∈[X,X+Y ]
N ∈Ar,d
P ∤N

By (3), the main term can be truncated as



 2 2 
(r N −4P N )/d2
 
T
√ X 1 X□ X n √ X X P X log P X 
4P X − r2 X 2 + O PX ,
2
πd n=1
n 2
d2 T
d ≤4P N ∈[X,X+Y ] d ≤4P N ∈[X,X+Y ]
N ∈Ar,d
P ∤N

so


 
T (r2 N 2 −4P N )/d2
Y P X log P X Y 2 P 1/2+ε
 
X 4P X − r2 X 2 X□ X n
(9) = +O + + rY 3/2 X 1/2+ε P ε
πd n T X 1/2−ε
d2 ≤4P N ∈[X,X+Y ] n=1
N ∈Ar,d
P ∤N

Y P X log P X Y 2 P 1/2+ε
 
4P X − r2 X 2 X X Sd,n,r 3/2 1/2+ε ε
= +O + + rY X P ,
π 2 n≤T
nd T X 1/2−ε
d ≤4P
(10)

where we define Sd,n,r as follows:


Definition 3.4. Let d, r be positive integers, let X > Y > 0, and let P be a prime with that 4P > r2 (X + Y ).
Define   2
(r N − 4P )/d2

X
2 N
Sd,n,r := µ (N ) .
n n
N ∈[X,X+Y ]
N ∈Ar,d
P ∤N

Note that Sd,n,r = 0 unless (r, d) is an admissible pair.


To evaluate SPd,n,r , one cannot simply split these sums into a sum of trivial
P and non-trivial characters:
unlike the sums a mod m χ(m) considered previously, the shifted sum a mod m χ(m)χ(m + a) is not
necessarily zero for a non-principal χ, so we can’t expect cancellation. Our strategy is as follows:
• For d ≪ Y τ for some 0 < τ < 1 and n ≪ Y σ for some 0 < σ < 1, we compute Sd,n,r explicitly
by examining equidistribution of square-free numbers in residue classes mod n;

• for d ≪ Y τ for some 0 < τ < 1 and n ≫ Y σ , we upper bound Sd,n,r sum using Poisson
summation;

11
• for d ≫ Y τ , we use a crude upper bound coming from Siegel’s bound on the class number.

In the remaining subsection, we carry out these steps and prove the following:

Proposition 3.5. Let d, r be positive integers, let X > Y > 0, and let P be a prime with that 4P > r2 (X +Y ).
Then: √
X X Sd,n,r Y Bc(r) √ 
= +O Y T 1/4+ε + XY 5/18 + Y 8/9 (P X)ε .
2 n≤T
nd ζ(2)
d ≤4P

This implies Proposition 3.3 by choosing the appropriate value of T :


Proof of Proposition 3.3. Plugging Proposition 3.5 into identity (10), we get that

Y P X log P X Y 2 P 1/2+ε
 
X□
2 2 Y Bc(r) 4P X − r2 X 2 3/2 1/2+ε ε
H1 (r N − 4P N ) = +O + + rY X P
ζ(2) π T X 1/2−ε
N ∈[X,X+Y ]
P ∤N
√ √ √ 
+O P X Y T 1/4+ε + X P Y 5/18 + Y 8/9 (P X)1/2+ε .

Setting
T := (Y P X)2/5
gives the claimed error term.

3.2.3 Small d, small n.


First, we address the case when the parameters d and n are small. This is the main term of the sum.

Proposition 3.6. For any parameters 0 ≤ τ, σ ≤ 1 and integer r ≥ 1,


X X Sd,n,r Y Bc(r) √ 
σ/2 3σ/2+τ +ε 1−2τ 1−σ/5
= +O XY +Y +Y +Y .
n≤Y σ d<Y τ
nd ζ(2)

The function Sd,n,r can be approximated by multiplicative functions defined in Definition 6.1 as
follows.

Lemma 3.7. Let d, n, r ≥ 1 be integers with P ∤ n, such that (d, r) is an admissible pair. Then

Y η(d2 n) √ 
′ 3/2+ε
Sd,n,r = φ r,d (g)θr (n ) + O Xn/d + dn ,
ζ(2) φ(d2 n)
e

where g := (d∞ , n), and n′ := n/g.


 2 
)/d2
Proof. Let r ∈ Rr,d be a residue modulo d2 . The character (r x−4P
n
is a function of x mod f nd2 ,
where f = 4 if n is even and f = 1 if n is odd. Thus, we have:
  2
(r N − 4P )/d2
  2
(r N − 4P )/d2
 
X
2 N X X
2 N
µ (N ) = µ (N )
n n 2
n n
N ∈[X,X+Y ] a mod f d n N ∈[X,X+Y ]
N ≡r mod d2 a≡r mod d2 N ≡a mod f d2 n
P ∤N P ∤N
X  a   (r2 a − 4P )/d2  X
= µ2 (N ).
n n
a mod f d2 n N ∈[X,X+Y ]
a≡r mod d2 N ≡a mod f d2 n
P ∤N
(11)

12
Since P > X/4, we can replace the condition P ∤ N with an error term of O(n). Evidently, the terms
with (a, n) > 1 above vanish; moreover, for r above, we have (r, d) = (a, d) = 1. Thus, it suffices to
consider moduli a coprime to f d2 n. Thus we can apply Theorem 6.4, yielding

Y η(d2 n) X  a   (r2 a − 4P )/d2  √ 


1+ε 3/2+ε
(11) = +O Xn/d + d n .
ζ(2) f φ(d2 n) n n
a mod d2 f n
a≡r mod d2

Finally, Lemma 3.8 applied to m = f n completes the proof, since by Lemma 6.3, φ(f
e n) = f φ̃(n).
Lemma 3.8. Let m be an integer with P ∤ m and υ2 (m) ̸= 1, 2, and let (r, d) be an admissible pair. Then:
X  a   (r2 a − 4P )/d2 
=φ er,d (g)θr (m′ ),
2
m m
a mod d m
a∈Rr,d mod d2

where g := (d∞ , m), and m′ := m/g.


Proof. Let r be an integer that reduces to an element of Rd,r modulo d2 . Then:
X  a   (r2 a − 4P )/d2  X  a   (r2 a − 4P )/d2   a   (r2 a − 4P )/d2 
=
2
m m 2
m′ m′ g g
a mod d m a mod d m
a≡r mod d2 a≡r mod d2
X  b   r2 b − 4P  X   2
a (r a − 4P )/d2

=
b mod m′
m′ m′ g g
a mod d2 g
a≡r mod d2
  2
(r a − 4P )/d2


X a
= θr (m )
g g
a mod d2 g
a≡r mod d2

where we applied Lemma 6.2 and the assumption P ∤ m′ in the last step. It remains to sum the above
identity over r ∈ Rr,d , which is φ
ed,r (g) by definition.

Proof of Proposition 3.6. Since P > X/4, Y = o(X), and n ≪ Y σ , the condition (P, n) = 1 will always
hold asymptotically. Thus, by Lemma 3.7 and Lemma 6.6,
√ !
X Sd,n,r X Y η(d2 n) φ er,d (g)θr (n′ ) X Xn d1+ε n3/2+ε
= 2 n)
+ O 2
+
n≤Y σ
nd n≤Y σ
ζ(2) φ(d nd n≤Y σ
nd nd
d≤Y τ d≤Y τ d≤Y τ
(r,d) admissible

Y Bc(r) √ 
σ/2 3σ/2+τ +ε 1−2τ 1−σ/5
= +O XY +Y +Y +Y
ζ(2)
as aimed.

3.2.4 Small d, large n.


Proposition 3.9. Let P ̸= 2 be a prime, let (r, d) be an admissible pair of positive integers, and let X > Y > 0
be such that (X + Y )r2 < 4P . Let σ, τ be parameters with 0 < σ, τ < 1. Then as X → ∞,
X X Sd,n,r √ √
≪ log T log Y X + Y 1−σ/2+ε + Y T 1/4+ε log Y.
d<Y τ
nd
n∈[Y σ ,T ]

13
Lemma 3.10. Let π(x) : Z → S 1 ⊆ C be an m-periodic function, and let ℓ ∈ R+ . Then:
X √ Mℓ
max π(x) ≪ Mℓ + ,
I:|I|=ℓ
x∈I
m

where X
M := max e(bn/m)π(b) .
n mod m
b mod m

Proof. We exploit that for an m-periodic function π and a Shwartz function f , one has by Poisson sum-
mation that

X X X X π(b) X n
π(n)f (n) = π(b) f (ma + b) = e (bn/m) fˆ . (12)
n∈Z b mod m a∈Z b mod m
m n∈Z
m

We construct a suitable function f as follows. Let ψ(x) : R → R+ be smooth, compactly supported,


with ∥ψ∥L1 = 1 and supp(ψ) ⊆ [−1, 1]. The function

ψ(x/ε) c
ψε (x) := , ψε (t) = ψ(εt)
b
ε

then also satisfies ∥ψε ∥L1 = 1, has ∥ψˆε ∥L∞ = ∥ψ∥


ˆ
L∞ , and supp(ψ) ⊆ [−ε, ε]. For c ∈ R, let
 
t−c
fℓ,c,ε (t) := (χ[0,1] ∗ ψε ) ,

where χ[0,1] is the characteristic function of the interval [0, 1]. The function f = fℓ,c,ε is a smooth
function satisfying the following properties:

1. f (t) ∈ [0, 1];

2. supp(f ) ⊆ [c − εℓ, c + ℓ + εℓ];

3. f (t) = 1 for t ∈ [c + ℓε; c + ℓ − ℓε];

4. |fˆ(t)| = ℓ · χ
b[0,1] (tℓ) · ψ(εtℓ)
b


5. fˆ(t) ≪ ℓ, fˆ(t) ≪k,ψ for r ≥ 2.
(εℓt)k

Choosing c to be the starting point of I and ε = εℓ to be chosen later, properties 1, 2, and 3 imply that
X X
π(x) = π(n)f (n) + O (εℓ) .
x∈I n∈Z

From (12),
P
X 1 X ˆ n  X maxn mod m | b mod m e(bn/m)π(b)| X ˆ
π(n)f (n) = f e(bn/m)π(b) ≪ f (n/m)
n∈Z
m n∈Z m b mod m m n∈Z

14
Choosing an integer parameter Xc = max{1, m/εℓ} and using property 5, we thus get a bound
!
X X MX ˆ M X mk
π(x) = π(n)f (n) + O(ℓε) ≪ f (n/m) + O(εℓ) ≪ ℓXc + ℓ + O(ℓε)
x∈I n∈Z
m n∈Z m t≥Xc
εk nk ℓk
ℓmk
 
M M
≪r ℓXc + k k k−1 + O(ℓε) ≪ (ℓ + m/ε) + O(ℓε).
m ε ℓ Xc m

Finally, letting ε := M/ℓ yields the desired result.


p

x

Lemma 3.11. Let A, B, C, D be integers, let m ∈ N, and let χ(x) = m for some m be a Kronecker symbol of
′ ′ ′
modulus m , where m = 4m if m ≡ 2 mod 4, and m = m otherwise. Let π(x) := χ(Ax + B)χ(Cx + D).
Then: X
M := max ′ e(xn/m′ )π(x) ≪ m
n mod m
x mod m

where m is given as follows. Let h = (AD − BC, m), and let P be a set of primes given by

P := {p|m : 2 ∤ υp (m), p ∤ h, p ̸= 2, and (A, C, p) = 1}.

Then:
m
m := Q √ .
p∈P p/2

Proof. Since 2· = 8· , by replacing m with 4m when m ≡ 2 mod 4, it suffices to prove theQstatement


 

in the case m = m′ . We claim that for such m, M is multiplicative in m. Indeed, let m = p pαp , let
  αp
χp := p· be the Kronecker symbol, and let

πp (x) := χp (Ax + B)χp (Cx + D),

so π(x) = p πp (xp ). Let yp be the inverse of q̸=p q αq modulo pαp , and let y be an integer that reduces
Q Q
to yp mod pαp for all p (so in particular, (y, m) = 1). Then for any x,
XY
xy q αq ≡ x mod m,
p q̸=p
so,
Y X X   X  Y X
e(xp (ny)/pαp )πp (xp ) = e n y xp /pαp πp (xp ) = e(xn/m)π(x).
p xp mod pαp x mod m p p x mod m
(xp :=x mod pαp )

Since we could choose ny to have any set of simultaneous reductions modulo primes p|m, the maximum
over n for π will be the product of the corresponding maxima for the πi ’s.
Assume now that m = pα for some prime p. If α is even or p = 2 or p|h, we apply the trivial bound,
so assume α and p are odd and p ∤ h.
Case I: Suppose (AC, p) = 1. Then:

X X
e(xn/m)χ(Ax + B)χ(Cx + D) = e(xn/m)χ(x + BA−1 )χ(x + DC −1 )
x mod m x mod m
X
= e(xn/m)χ(x)χ(x + DC −1 − BA−1 )
x mod m

15
where the inverses are taken mod pα . Notice that for any s with (s, m) = 1 and ts = 1 mod m and for
any shift h,

 nx     
X X (nt)(sx) X (nt)x
χ(x)χ(x + h)e = χ(sx)χ(sx + sh)e = χ(x)χ(x + sh)e
x mod m
m x mod m
m x mod m
m

Thus, maxn ny
depends only on (m, h), and since we assumed (A, m) =
P 
y mod m χ(y)χ(y + α)e m
(C, m) = 1, we are evaluating
X
max e(xn/m)χ(x)χ(x + h)
n mod m
x mod m

where h = (BA−1 − DC −1 , m) = (AD − BC, m). The inner sum depends on the p-adic valuation of
n:

• When n ≡ 0 mod pα , the using p ∤ h and Lemma 6.2,

X X
e(xn/pα )χ(x)χ(x + h) = χ(x)χ(x + 1) = |θ(pα )| = pα−1 ;
n mod pα n mod pα

• When υp (n) = α − 1, n =: n′ pα−1 ,


X X X∗
e(nx/pα )χ(x)χ(x + 1) = pα−1 e(xn′ /p)χ(x)χ(x + 1) = pα−1 e(cx/p)χ(1 + x−1 )
x mod pα x mod p x mod p
X∗
= ±pα−1 e(x−1 /p)χ(x + 1) ≤ 2pα−1/2 ,
x mod p

P∗
(here denotes summation over coprime residues).

• Finally, when υp (n) < α − 1, the sum is 0 since the Legendre symbol is p periodic for p ̸= 2.

In summary, the maximum over n is 2pα−1/2 when p ∤ h and (2AC, p) = 1.

Case II: Next, suppose (A, p) = 1 but p|C. If p|D, the sum vanishes, so without loss of generality,
p ∤ D, and

X X
e(xn/m)χ(Ax + B)χ(Cx + D) = e(xn/m)χ(x) .
x mod m x mod m

• When n ≡ 0 mod pα , x nx x
P   P 
x mod m m
e m
= x mod m m
= 0.

• When υp (n) = α − 1, n =: n′ pα−1 ,

X X X
e(nx/pα )χ(x) = pα−1 e(xn′ /p)χ(x) = pα−1 e(x/p)χ(x) ≤ pα−1/2
x mod pα x mod p x mod p

(Gauss sum).

16
• When υp (n) < α − 1, the sum is 0.
In summary, the maximum over n is always at most pα−1/2 when p divides exactly one of A, C.

We need not consider the case p|A, C since then p|h, so this concludes the proof.

·

Lemma 3.12. Let Y ≤ X, let m, w, D ∈ N and q ∈ Z. Let χ = m
. Let r be an integer with wr = q mod D
and (r, D) = 1. Then:
X+Y √
X
2
√ m log Y mY
µ (N )χ(N )χ((wN − q)/D) ≪ X+ Y + √ ,
N =X
mD D
N =r mod D

where
m
m := Q √ ,
p∈P p/2
P := {p|m : 2 ∤ υp (m), p ∤ (q, m), p ̸= 2, and (D, w, p) = 1}.
Proof. Let l ∈ Z be such that wr = q + lD. Then:

X+Y   X+Y  
X
2 wN − q X
2 w(N − r)
µ (N )χ(N )χ = µ (N )χ(N )χ +l
N =X
D N =X
D
N =r mod D N =r mod D
w(sδ 2 − r)
X X  
2
= µ(δ) χ(sδ )χ +l
D
δ 2 ≤X+Y s∈[X/δ 2 ,(X+Y )/δ 2 ]
sδ 2 =r mod D
X X
≪ χ(Dt + r)χ(wt + l)

δ≤ 2X t∈[ X−r , (X+Y )−r ]
D D
(δ,D)=1 t=−rD−1 mod δ 2
(δ,m)=1

Here we restrict to (δ, m) = 1 since terms with (δ, m) > 1 clearly vanish, and to (D, δ) = 1 because
otherwise sδ 2 = r√mod D cannot hold (as we assumed (r, D) = 1).
For each δ ≤ 2X with (δ, m) = (δ, D) = 1, pick an integer solution tδ to tδ D = −r mod δ 2 , and
let
X −r tδ X + Y − r tδ
Iδ := [ 2 − 2, 2
− 2]
δ D δ δ D δ
be an interval of length√Y /(δ D). Changing variables again and applying the trivial bound for δ > R for
2

some parameter R ≤ 2X, we can rewrite the above by letting t = δ 2 x + tδ for x ∈ Iδ , which yields
 
X X X Y 
χ(D(δ 2 x + tδ ) + r)χ(w(δ 2 x + tδ ) + l) + O  1+
δ≤R x∈I
√ Dδ 2
δ δ∈[R, 2X]
(δ,D)=1
(δ,m)=1


 
X X
2 2 Y
≤ χ(Dδ x + (Dtδ + r))χ(wxδ + (wtδ + l)) + O X+ .
δ≤R x∈Iδ
DR
(δ,D)=1
(δ,m)=1

Note that the determinant

Dδ 2 · (wtδ + l) − (Dtδ + r)wδ 2 = δ 2 (Dl − rw) = −qδ 2

17
satisfies (qδ 2 , m) = (q, m) for all the δ in the above sum. Thus by Lemma 3.10 and Lemma 3.11 and using
the condition (δ, m) = 1,


X mY mY
χ(Dδ 2 x + (Dtδ + r))χ(wδ 2 x + (wtδ + l)) ≪ √ + 2
x∈Iδ
δ D δ Dm

Summing over δ, we conclude that


X+Y √
mY log R Y m √
 
X wN − q Y
µ2 (N )χ(N )χ ≪ √ + + X+
N =X
D D Dm DR
N =r mod D

or, taking R = Y ,
X+Y √
Ym √
 
X
2 wN − q mY log Y
µ (N )χ(N )χ ≪ √ + + X.
N =X
D D Dm
N =r mod D

Applying the lemma to D = d2 , w = r2 , and q = 4P , and using that the terms with P ∤ N
contribute an O(1) error term for P ≫ X, we get an immediate corollary:

Corollary 3.13. Let Y ≤ X, let (r, d) be an admissible pair of integers, let n ∈ N, and let P be a prime such
that r2 (X + Y ) < 4P.
Let
Pn = {p|n : 2 ∤ υp (n), and p ̸= 2, P },
and let
n
nn := Q √ .
p∈P p/2
Then:

√ nn log Y nn Y
Sd,n,r ≪ X + 2Y + .
nd d
Proof of Proposition 3.9. From Corollary 3.13,

√ √ √
X X Sd,n,r X X X nn Y log Y Y nn
≪ + 2 3+
d<Y τ
nd nd nd nd2
n∈[Y σ ,T ] m∈[Y σ ,T ] d<Y τ
√ √
X nn √ X nn
≪ log T log Y X +Y + Y log Y .
n2 n
n∈[Y σ ,T ] n∈[Y σ ,T ]

Every integer n is representable uniquely in the form

n = a2 b 2α P β ,

where b is square-free and (a, 2P ) = (b, 2P ) = 1. In terms of this representation, Pn = {p|b}, so with
the divisor bound,

nn ≤ a2 b1/2+ε 2α P β .

18
Thus
X √n X X X 1 X 1/4+ε
n
≪ ≪ (1/a) T /a2 ≪ T 1/4+ε .
n≤T
n α,β:
q T
2α/2 P β/2 a b3/4−ε √
a≤ T b≤ a≤ T
2α P β a2 2α P β
2α P β ≤T

Similarly,
X nn XX 1 X X 1 a X 1
2
≪ b−3/2+ε ≪ + ≪ Y −σ/2+ε .
n>Y σ
n 2 P β a2
α a2 Y σ/2−ε a 2
α,β a∈N b>Y σ /a2 a≤Y σ/2 σ/2
a>Y

To summarize,
X X Sd,n,r √ √
≪ log T log Y X + Y 1−σ/2+ε + Y T 1/4+ε log Y
d<Y τ
nd
n∈[Y σ ,T ]

as desired.

3.2.5 Large d and error terms



For P ≫ d ≫ Y τ , we use Siegel’s bound:

(P X)ε
 
X X Y
Sd,n,r /dn ≪ +1 ≪ (P X)ε (Y 1−2τ + log Y ).
√ √ d2 d1+ε
P ≫d≫Y τ ,n P ≫d≫Y τ

It remains to collect the error terms. The cumulative error term from Proposition 3.6, Proposition
3.9 and the bound above is
 √ √  √ 
log T log Y X + Y 1−σ/2+ε + Y T 1/4+ε log Y + XY σ/2 + Y 3σ/2+τ +ε + Y 1−2τ + Y 1−σ/5
+ (P X)ε (Y 1−2τ + log Y ).


Assuming P ε ≪ X, and since we will pick T with T ≫ Y , we can rewrite this as
√ √ √
log T log Y X + Y T 1/4+ε + XY σ/2 + Y 3σ/2+τ +ε + Y 1−2τ + Y 1−σ/5 + (P X)ε Y 1−2τ .

Finally, letting τ = 1/18, σ = 5/9, and assuming log T ≪ Y , this becomes


√ 1/4+ε √
YT + XY 5/18 + Y 8/9 (P X)ε

as aimed.

19
3.3 Large r, P -divisible levels and the P term.
We bound trivially the r’s not covered in Proposition 3.2, that is,

ζ(2)π X□ X
H1 (r2 N 2 − 4P N ).
YX √ √
N ∈[X,X+Y ] 2 P/(X+Y )≤r≤2 P/N
P ∤N

A combination of the divisor bound and Siegel’s bound implies that

H1 (r2 N 2 − 4P N ) ≪ (P N )1/2+ε .

and thus
H1 (r2 N 2 − 4P N ) ≪ (P N )1/2+ε .
Thus,
1 X□ X 1 √ √ p 
H1 (r2 N 2 − 4P N ) ≪ Y P (1/ X − 1/ (X + Y ))(P N )1/2+ε
YX √ √ XY
N ∈[X,X+Y ] 2 P/(X+Y )≤r≤2 P/N
P ∤N

≪ P 1+ε Y X −2+ε ,

which is bounded by X −1/2+ε for X, Y, and P chosen as in Theorem 1.


Next, we address the√levels N divisible by P , which have been excluded from consideration so far. At
ramified primes, λf (P ) P = ±1, so the trivial bound on these terms is X(Y /P + 1) pre-averaging,
which amounts to a contribution of O(1/P + 1/Y ) in Theorem 1.
Finally, we address the term P to finish the proof of Theorem 1. We use the well-known asymptotic
X Z √
µ2 (N ) = + O( Z), (13)
N ≤Z
ζ(2)

as well as the bound Uk−2 (x) ≪ k for x ∈ [−1, 1], Combined with Proposition 3.1, Proposition 3.2, and
(13) and expressing the error term in terms of δ2 and δ1 completes the proves the following:

 
ζ(2)π X□ X √ k/2−1
X p
2
r
λf (P ) P ε(f ) = A y + (−1) B c(r) 4y − r Uk−2 √
YX new
√ 2 y
N ∈[X,X+Y ] f ∈H (N ) 1≤r≤2 y
 ′

− δk=2 πy + Oε rX −δ +ε + 1/P := (∗) (14)

3.4 Dimension Formulas


It remains to count the number of forms that are being averaged in Theorem 1. From the work of Martin
([7]), for a square-free level N , the dimension of the space of cusp newforms for Γ0 (N ) is given by

(k − 1)φ(N )
dim S(N, k) = + O(N ε ), (15)
12

X□ X X k−1
1= φ(N )µ2 (N ) + O(kY X ε + kXY /P )
12
N ∈[X,X+Y ] f ∈H new (N,k) N ∈[X,X+Y ]
 
(k − 1)(XY ) Y 1
= 1− 2 + O(kY X ε + kX 8/5+ε + kY 2 )
12 · ζ(2) p
p +p

20
by Lemma 6.8.
Hence
Q  −1
1
1 12 · ζ(2) p 1− p2 +p

1 1 1

P□ = +O + + .
(k − 1)(XY ) kY X 2−ε kY 2 X 3/5 kX 2
P
N ∈[X,X+Y ] f ∈H new (N,k) 1

Now, (14) implies that the the numerator of the left-hand side of Theorem 1 is bounded by XY ky =
kP Y. Thus

P□ √ Q  1
−1
12 1 −
P
f ∈H new (N ) λf (P ) P ε(f )
 
N ∈[X,X+Y ] p p2 +p P P PY
P□ = (∗) · +O 2−ε
+ 3/5
+ 2
(k − 1)π X T X X
P
N ∈[X,X+Y ] new
f ∈H (N,k) 1

which completes the proof of Theorem 1.

4 Properties of the weight 2 density function


In this section, we analyze asymptotic of the function from Theorem 1 in the case of weight k = 2.

Lemma 4.1. Let


⌊1/t⌋
X √
A(t) := π/4 − t 1 − s2 t2
s=1

for 0 ≤ t ≤ 1. Then
t
A(t) = + t3/2 P (1/t) + O(t2.5 ),
2
where √
P (M ) = − 2ζ(−1/2, {M }).

Proof. We will argue geometrically.


Let t := 1/M, ⌊M ⌋ := N, M − N := δ, and let√s := δ/M . A(t) can be interpreted as the excess
area above the Darboux sum for the graph of f (t) = 1 − x2 with t spacing:


We let αL and αi , i ∈ {1, . . . , N } be the angles between lines connecting 0 and (kt, 1 − k 2 t2 ) for
k ∈ {1, . . . , N }, ordered in decreasing order of the x coordinate (see Picture (a)). We split the area we
want to compute into triangles (blue) and arcs (purple). We refer to them as A1 and A2 , accordingly.

21
(a) Angles α (b) Area splitting

Since the heights of the blue triangles add up to 1, the total area A1 is given by:

t s − tp t (t − s) s
A1 = + 2
1 − (1 − s) = − √ + O(t5/2 ).
2 2 2 2
Now we compute the purple area. For a sector with an angle α, Taylor approximation dictates that
the volume is given by:
α3
Vol(α) = + O(α5 ).
12
Using Taylor approximation again, note that
√ 3/2
2a
cos α = 1 − a =⇒ Vol(α) = + O(a5/2 ). (16)
6
Thus from the inner product, √
2s3/2
Vol(αL ) = + O(t5/2 ).
6
Finally, we address Vol(α1 ), . . . , Vol(αN ). From the inner product,
cos αn+1 = (1 − s − n/M )(1 − s − (n + 1)/M ) + (1 − (1 − s − n/M )2 )(1 − (1 − s − (n + 1)/M )2 ).
Setting sn := s + n/M , this can be rewritten as
r r 
√ √ √ sn sn+1
cos αn+1 = ( sn+1 − sn )2 + sn sn+1 + 2 sn sn+1 1− 1− −1 .
2 2
For n ∈ {0, . . . , N − 1}, let
√ √ 1 1 t
xn+1 := ( sn+1 − sn )2 = √ √ ≍ ,
M ( n + 1 + δ + n + δ)2 n+1
r r 
√ sn sn+1
yn := sn sn+1 + 2 sn sn+1 1− 1− −1 .
2 2
Note that for parameters c ∈ [0, 1] and ε, Taylor approximation yields
r r ! √ ! √
c c+ε c 2 + εc/4 2
2 c + εc
p 2
c(c+ε)+2 c(c + ε) 1− 1− − 1 = −ε √ −ε +O(ε3 ).
2 2 2
c + εc + c + ε/2 8(2 − c)

22
Hence yn = O(t2 ) and hence yn ≪ xn . From(16), the volume we are to compute is given by

N √ √ N √ N
X 2(xn + yn )3/2 2 X 2 X 3x2n yn + 3xn yn2 + yn3
+ O((xn )5/2 ) = xn3/2 + + O(t5/2 ).
n=1
6 6 n=1 6 n=1 x3/2
n + (xn + yn )
3/2

For parameters x ≪ y,

3x2 y + 3xy 2 + y 3
(x + y)3/2 = x3/2 + .
x3/2 + (x + y)3/2
Note that
1 1
xn − ≪ ,
4M n M n2
and that
1 1
yn − ≪ 3.
4M 2 (2 − n/M ) M
From this one can see that

3x2n yn + 3xn yn2 + yn3 3(x′n )2 yn′ + 3x′n (yn′ )2 + (yn′ )3


 
1 5/2
= +O k ,
3/2
xn + (xn + yn )3/2 (x′n )3/2 + (x′n + yn′ )3/2 M 5/2
where
1 1
x′n = , yn′ = .
4M n 4M 2 (2 − n/M )
From the Euler-Maclaurin formula, we can approximate the sum for x′n , yn′ by an integral

N Z 1
X 3(x′n )2 yn′ + 3x′n (yn′ )2 + (yn′ )3 1 z 2 − 6z + 12
′ 3/2 ′ ′ 3/2
= 2
√ 3 3/2
dz + O(t5/2 ).
n=1
(xn ) + (xn + yn ) 8M 0 z((2 − z) + (2(2 − z)) )

The function under the integral has an explicit antiderivative given by


√ √
2 2 2 2 − z(z − 1)
√ − √ .
z z(z − 1)

In conclusion,
N
X 3x2n yn + 3xn yn2 + yn3 1
3/2
= + O(t5/2 ).
n=1 xn + (xn + yn )3/2 4M 2
3/2
It remains to compute xn :
PN
n=1

N −1
X√ √ 3 X √ √
( sn+1 − sn ) = (sn+1 )3/2 − 3(sn + 1/M ) sn + 3(sn+1 − 1/M ) sn+1 − sn3/2
n n=0
N −1
3/2 3/2 √ √ 6 X√
= (−s0 ) + (sN ) − 3(s0 + 1/M ) s0 + 2(sN − 1/M ) sN − sn
M 1
−1
N
!
√ √ X √
= −δ 3/2 − 3(δ + 1) δ + (δ + N )3/2 + 3(δ + N + 1) δ + N − 6 δ+k .
n=1

23
Now,
N −1
3/2
√ X √
lim (δ + N ) + 3(δ + N + 1) δ + N − 6 δ + k = −6ζ(−1/2, 1 + δ),
N →∞
n=1

so ∞
X
3/2
√ X √ √
= −δ − 3(δ + 1) δ − 6ζ(−1/2, 1 + δ) − ( sn+1 − sn )3 .
n N

Finally,
∞ ∞
X √ √ 3 X 1 1 1
( sn+1 − sn ) = 3/2
√ √ = + O(t5/2 )
N N
M ( n + δ + n + 1 + δ)2 4M 2

by the the Euler-Maclaurin formula.


Collecting together all the terms concludes the proof.
Proof of Theorem 2. For convenience of notation, we analyze m(y) := M (y)/D2 .
The function m(y) can be re-expressed in terms of the coefficients Q(d), since

X p X X p
c(r) y − r2 /4 = Q(d) y − d2 n2 /4.
√ √ √
1≤r≤2 y 1≤d≤2 y 1≤n≤2 y/d
d □-free

For a fixed value of d, one has:



Z 2 y/d
X p
2 2
p πy
y − d n /4 ≤ y − d2 t2 /4dt = .
√ 0 2d
1≤n≤2 y/d

We let
X p πy
F (y, d) := − y − d2 n2 /4 + ≥0
√ 2d
1≤n≤2 y/d

be the error term in this approximation. Then:

 
√ X πyQ(d)
m(y) = A y − πy − 2B F (d, y)Q(d) −
√ 2d
1≤d≤2 y
d □-free
√ X Q(d) X
= A y − πy + y(Bπ) − 2B F (d, y)Q(d)
√ d √
1≤d≤2 y 1≤d≤2 y
d □-free d □-free

Clearly,
X Q(d)
≪ y −1 .
√ d
d>2 y

Moreover, one sees from the Euler product that


X Q(d)
Bπy = πy,
d
d □-free

24
and hence
√ X
m(y) = A y − 2B Q(d)F (d, y) + O(1).

1≤d≤2 y
d □-free

Changing variables, F (y, d) = y(1/t), t := d

2 y
, where A(t) is the function from Lemma 4.1. Applying
the Lemma,

d3/2 d5/2
  
√ X 2y d d
m(y) = A y − 2B Q(d) √ + 3/2 3/4 P ( √ ) + O + O(1)
√ d 4 y 2 y 2 y y 5/4
1≤d≤2 y
d □-free
√ √
 
√ √ X
1/4
X d
=A y−B y Q(d) − y 2B Q(d) dP √ + O(1).
√ √ 2 y
1≤d≤2 y 1≤d≤2 y
d □-free d □-free

Now,
X
Q(d) = A/B,
d □-free

and hence
√ √
 
1/4
X d
m(y) = −y 2B Q(d) dP √ ,
√ 2 y
1≤d≤2 y
d □-free

which concludes the proof.

5 Geometric Averaging
In this section we complete the proof of Theorem 3 and analyze the asymptotic behavior of the dyadic
average.
Proof of Theorem 3. Let and let δ2 be a parameter chosen depending on δ1 to satisfy the conditions of
Theorem 1 with a powersaving error term. Assume further that Y ∼ X 1−δ2 be a parameter chosen so
Y divides Z − X; let X = X1 , X2 , . . . , XG = Z − Y be given by Xg = X + (g − 1)Y, where
G := (Z − X)/Y . From (14),
X□ X √ X X□ X √
λf (P ) P ε(f ) = λf (P ) P ε(f )
N ∈[X,Z] f ∈H new (N,2) g N ∈[Xg ,Xg+1 ] f ∈H new (N,2)
 
X Xg Y P
+ X 2.

= Mk
g
Dk πζ(2) Xg

Since Theorem 1 gives a bound of O(k + kP/X) on M ,

X Z Z Z Z
Xg Y Mk (P/Xg ) − uMk (P/u)du ≪ umk (P/u) − (u + Y )mk (p/(u + Y ))u
g X X

≪ kXY (P/X + 1) = o(kX 2 ).

Next, Z Z Z c
2
uMk (p/u)du = X Mk (y/u)du.
X 1

25
Finally, from (15), we can again compute that
Q  −1
1
1 24 · ζ(2) p 1− p2 +p

1

P□ = +O .
(c2 − 1)(k − 1)X 2 kX 12/5−ε
P
N ∈[X,Z] f ∈H new (N,k) 1
Thus,

P□ √ Q  1
−1
24ζ(2) 1 −
P Z c
N ∈[X,Z] f ∈H new (N,2) λf (P ) P ε(f ) p p2 +p
P□ = Mk (y/u)du + oc (1)
πζ(2)Dk (c2 − 1)(k − 1) 1
P
N ∈[X,Z] f ∈H new (N,2) 1
Z c
2
= 2 Mk (y/u)du + oc (1)
(c − 1) 1

Finally, we analyze the asymptotic behavior of the function above.


√ √
Proof of Theorem 4. Let f (x) = δ0<x<1 1 − x/ x. The Mellin transform of f is expressible via the Beta
function as
Z ∞

Z 1  
dx 1 dx 3
f˜(s) = f (x)x s
= 1 − xx s−1/2
=B , s − 1/2
0 x 2 0 x 2

Γ(s − 1/2)Γ(3/2) π Γ(s − 1/2)
= =
Γ (s + 1) 2 Γ (s + 1)
By Stirling approximation,

f˜(σ + it) ∼ |t|−3/2 ,


so we can apply Mellin inversion, which yields
Z Z
X X 1 ˜ 1
F (x) := 2
rc(r)f (r /x) = rc(r) 2 s
f (s)r(x/r ) ds = L(2s−1)f˜(s)xs ds,
r≥1 r≥1
2πi Re(s)=3 2πi Re(s)=3

where
Y p2 p−s
X   
−s
L(s) := c(r)r = 1+ 1+ .
r p
p4 − 2p2 − p + 1 1 − p−s

For a function Φ : (0, ∞) → R of compact support, let


Z ∞   
x 2 du
FΦ (x) := F Φ(u)u /Φ̃(2).
0 u u
By definition,
! Z
Z ∞ ∞
X p du du
(D2 B) · FΦ (x) = c(r) (x/u) − r2 Φ(u)u2 / Φ(u)u2 .
0 1≤r≤x
u 0 u

Then by Mellin inversion,


Z
1
FΦ (x) = L(2s − 1)f˜(s)Φ̃(s + 2)xs ds. (17)
2πi Re(s)=3

26
Now, we can further simplify L as:

−1 + p + 2p2
 
ζ(s)ζ(s + 2) Y
L(s) = 1+ .
ζ(2s + 4) p (1 − p − 2p2 + p4 )(1 + p2+s )

The Euler product part in the expression above evaluated at 2s − 1 converges uniformly for s > σ
for any σ > −1, so it is analytic and uniformly bounded in this region. The Mellin transform Φ̃ is entire
because of the support assumption, and if Φ is smooth, the decay of Φ in the t aspect allows us to shift
the contour in (17) to the line Re(s) = −1/2. The poles at 1 and 1/2 cancel our exactly the contribution

of the πy and A y terms in MΦ (y). Finally, the residue at 0 is

π Γ −1

−1 + p + 2p2
 
2 ζ(−1) Y
1+ = 1/BD2 .
2 Γ(1) ζ(2) p (1 − p − 2p2 + p4 )(1 + p)

6 Arithmetic Functions
In this section, we collect auxiliary definitions, notation, and lemmas.

Definition 6.1. Let φ(m) be Euler’s function and ψ(m) be the Dedekind Psi function. Let
m Y p
η(m) := = .
ψ(m) p+1
p|m

For any r, m ∈ N and a prime P , let


X  a   ar2 − 4P 
θr (m) := .
a mod m
m m

We let   2
(r a − 4P )/d2

X a
Φ
e r,d (g) :=
g g
a mod d2 g
a mod d2 ∈Rr,d

for d, g, r ∈ N.

Lemma 6.2. For any modulus m and a prime P ̸= 2 with (m, P ) = 1, θr (m) is a multiplicative function of
m. For odd r, it is given as follows. For a prime p with (p, 2k) = 1,

θr (pα ) = −pα−1 for α odd; θr (pα ) = pα−1 (p − 2) for α even.


For p|r,
θr (pα ) = 0 for α odd; θr (pα ) = pα−1 (p − 1) for α even;
For p = 2,
θr (2α ) := (−1)α 2α−1 .
For even r,
θr (2α ) = 0
for any α ≥ 1; for an odd prime p, θr (pα ) = θr′ (pα ), where r′ is the odd part of r.

Proof.

27
Lemma 6.3. Let d, g, r ∈ N be such that g|d∞ . Then:


 φ(g)δg=□ if 2 ∤ d

φ(g)δg=□ if 2||d, 2 ∤ n



φ
er,d (g) = 0 if 2||d, 2|n, 2||r .

2φ(g)δg=□ if 2||d, 2|n, 4|r





2φ(g)δ
g=□ if 4|d

If (d, r) is non-admissible, φ
er,d (g) = 0, so assume it is an admissible pair. Let r be an integer that
reduces to an element
 of Rr,d modulo d , let s := (r r − 4P )/d ,2 and let g =: 2 h, where (h, 2) = 1.
2 2 2 α

Since h is odd, h is a character modulo rad h, and since rad h|d , we have
·

  2 g 
(r a − 4P )/d2 r + td2 s + tr2
 X  
X a
=
g g g g
a mod d2 g t=1
a≡r mod d2
X  r   s + tr2  X  r + td2   s + tr2 
=
t mod h
h h t mod 2α
2α 2α
X  r + td2   s + tr2 
= δh=□ φ(h) α α
.
t mod 2 α
2 2

where in the last step we used that (h, k) = 1 because (r, d) ≤ 2 by assumption. We compute
X  r + td2   s + tr2 
⋆ :=
t mod 2α
2α 2α

case by case.

1. If α = 0, i.e., 2 ∤ g, then ⋆ = 1, so φ(g)


e = 2δg=□ φ(g) if 4|d and 2δg=□ φ(g) otherwise;
If 2|g (and hence 2|d and 2|r from admissibility), then r is odd since (r, d) = 1 for all r ∈ Rd,r ,
and:

2. If 2||r, 2||d, then s is even and ⋆ = 0.

3. If 4|r and 2||d, Rr,d has one element and the corresponding s is odd, and so using that
 
x+4 x
=− ,
2 2
we see
X  z + 4t   z ′ 
⋆= α α
= δ2|α 2α ;
t mod 2α
2 2

4. if 4|d and 2||r, there’s exactly one choice of r ∈ Rr,d for which s is odd; for this choice of r, once
again ⋆ = δ2α =□ 2α ; for the other choice of r, s is even and ⋆ = 0.

Combining all the cases, we get the statement of the lemma.


Theorem 6.4 (Hooley, [5]). : Let (a, m) = 1.
X+Y
X Y η(m) p
µ2 (N ) = + O( X/m + m1/2+ε ).
N =X
ζ(2) φ(m)
N =a mod m

28
Lemma 6.5. Let K be a cut-off parameter and let P ̸= 2 be a prime. Let
Y p
A := (1 + );
p
(p + 1)2 (p − 1)

1.
K Y      
X 1 p 1 9A 1 1
η(m) = 1+ +O = +O + ;
m2 (p + 1)2 (p − 1) K 11 P2 K
m odd p̸=2,P
(P,m)=1

2.
K    
X 1 2 X −k Y p 8A 1 1
η(2m) = 2 1+ = +O + ;
m2 3 k≥0 p̸=2,P
(p + 1) 2 (p − 1) 11 P 2 K
m:(P,m)=1

Lemma 6.6. Let Tr ⊆ N3 denote the set of triples (m, d, g) such that (d, r) is admissible, (m, d) = 1, and
g|d∞ .
η(d2 mg) θr (m)φ(g)
e
Θr (m, d, g) := 2
.
φ(d mg) mgd
P
Then Tr Θr (m, d, g) is absolutely convergent and equal to
Y p4 − 2p2 − p + 1 Y  p2

B · c(r) := 2 − 1)2
1+ 4 2−p+1
.
p
(p p − 2p
p|r

Moreover,
X
Θr (m, d, g) − Bc(r) ≪ Z −2 + (Z ′ )−1/5 .
(m,d,g)∈Tr
mg≤Z ′ ,d≤Z

Q   Q   Q  
Proof. Define E := p 1 + (p2 −1)p
2 , E(k) := p|r 1 + p
(p2 −1)2
, G := p 1−
2p
(p2 −1)2 +p
, G(k) :=
   
p|r 1 − (p2 −1)2 +p , and F (k) := p|r 1 + (p2 −1)2 . It is easy to verify that
Q 2p Q p(p−1)

EGF (k)
Bc(r) = .
E(k)G(k)

Next, since g|d∞ and (m, d) = 1, we have {p|md2 g} = {p|d} ⊔ {p|m}, so


1 θ (m) φ(g)
Q r
e
Θr (m, d, g) = Q .
d3 2
p|d (1 − 1/p ) m
2 2
p|m (1 − 1/p ) g
2

We begin by establishing absolute convergence. For fixed m, d, the sum


Xg X φ(g)
e
:=

g2
g|d

over g’s appearing in Tr for these m and d is upper bounded by


!
X 2φ(g) Y Xp−1 1 Y 1
 Y Y
2
2
= 2 1 + 2k
= 2 1 + ≪ (1+1/p ) ≤ 1/(1−1/p2 ) = ζ(2),

g k≥1
p p p(p + 1) p p
g|d p|d p|d
g=□

29
so is uniformly bounded.
Now fix d. Any number r can be written uniquely as a2 · b · 2c , where b is square-free and a, b odd.
From the definition of θr ,
θr (a2 · b · 2c ) ≤ a2 · 2c .
In particular, using that p|m (1 − 1/p2 ) ≫ 1,
Q

! ! !
X θr (m) X
2
X
c
X
2
X
2
Q ≪ |θr (m)|/m ≪ 1/2 1/a 1/b ≪ 1.
m2 p|m (1 − 1/p2 )
(m,d)=1 m∈N c∈N a∈N b □ - free

Finally,
X 1
Q ≪ 1,
d∈N
d3 2
p|d (1 − 1/p )

so indeed, the series Tr Θr (m, d, g) converges absolutely.


P
Xg
Next, we compute case by case from the definition.

• When 2 ∤ d, φ(g)
e = δg=□ φ(g), and
Xg X φ(g) Y  1

= = 1+ ;

g2 p(p + 1)
g|d p|d
g=□

• When 2||d and 2||r, φ(g)


e = δg=□ φ(g) for odd g and 0 for even g, so
Xg X φ(g) Y  1

= = 1+

g2 p(p + 1)
g|d p|do
g=□
2∤g

where do is the odd part of d;


• When 2||d and 4|r, then φ(g)
e = δg=□ φ(g) for g odd and φ(g)
e = 2δg=□ φ(g) for g even, so
Xg X φ(g) X φ(g)  
4Y 1
= +2 = 1+ ;

g2 ∞
g2 3 p(p + 1)
g|d0 g|d p|do
g=□ g=□
2|g

• When 4|d and 2||r, then φ(g)


e = 2δg=□ φ(g), so
Xg Y 1

7Y

1

=2 1+ = 1+ ;
p(p + 1) 3 p(p + 1)
p|d p|do

We can now compute the desired sum by expressing it as an Euler product.


Suppose first that r is odd, so (r, d) is admissible if and only if d is odd and (r, d) = 1. Then
Q
p|d (1 + 1/p(p + 1)) θ (m)
X X X
Θr (m, d, g) = 3
Q 2 2
Q r 2
T r d odd (m,d)=1
d p|d (1 − 1/p ) m p|m (1 − 1/p )
(d,r)=1
X θ (m) 1 Y p(1 + p + p2 )
Q r
X
= 2 2
.
m
m p|m (1 − 1/p ) d3 (p − 1)(p + 1)2
(d,2mr)=1 p|d

30
The inner sum can be expressed as the Euler product
!
Y X 1 p(1 + p + p2 ) Y  p
 Y
p
 Y 
p

1+ = 1+ 2 = 1+ 2 / 1+ 2
k≥1
p3k (p − 1)(p + 1)2 (p − 1)2 p
(p − 1)2 (p − 1)2
p∤2mr p∤2mr p|2mr

so
X E X θr (m) Y
2 −1
Y
Θr (m, d, g) = · (1 − 1/p ) (1 + p/(p2 − 1)2 )−1 .
Tr
E(k) m m2
p|m p|2m,p∤k

This sum over m can itself be expressed as an Euler product. The p ̸= 2, p ∤ r part of the product is
!!
Y 1 X θr (p2α+1 ) X θ(p2α+2 )
1+ +
(1 − 1/p2 )(1 + p/(p2 − 1)2 ) α≥0 p4α+2 α≥0
p4α+4
p∤2k
!!
Y 1 X 1 X (p − 2)
= 1+ 2 − +
(p − 1)(1 + p/(p2 − 1)2 ) α≥0
p2α α≥0 p2α+1
p∤2k
Y 2p

G
= 1− 2 = .
(p − 1)2 + p G(2k)
p∤2k

The p|r factors are


!! !
Y 1 X θr (p2α+1 ) X θ(p2α+2 ) Y 1 Xp−1
1+ + = 1+ 2
1 − 1/p2 α≥0
p4α+2 α≥0
p4α+4 p − 1 α≥0 p2α+1
p|r p|r
Y p(p − 1)

= 1+ 2 = F (k).
(p − 1)2
p|r

The Euler factor at 2 is


−1 ! !
1 X 1 (−1)α 2α−1 9 14 X 7
1+ 1− 2 = 1+ (−1/2)α = = G(2).
1 + 2/(22 − 1)2 α≥1
2 22α 11 2 3 α≥1 11

All the Euler factors together add up to Bc(r).


Now we estimate the tail, namely, the contribution of terms with d > Z or mg > W . As noted
above, for d fixed,
X X θ (m) φ(g)
2
Q r 2 2

m
(m,d)=1 g|d ,g=□p|m (1 − 1/p ) g

is uniformly bounded; hence,

X X 1
Θr (m, d, g) ≪ Q ≪ Z −2 . (18)
d≥Z
d3 p|d (1 − 1/p2)
(m,d,g)∈Tr :d≥Z

For a given g, the contribution of summands with that g is

φ(g) X 1/d3 θr (m)/m2 φ(g) X 1/d2 |θr (m)|/m2 φ(g)


2
Q 2
Q 2
≤ 2
Q 2
Q 2
≪ 2 .
g m,d p|d (1 − 1/p ) p|m (1 − 1/p ) g m,d p|d (1 − 1/p ) p|m (1 − 1/p ) g
(m,d)=1 (m,d)=1
d odd d odd
g|d∞

31
2 2
Here we used that Q 1/d Q|θr (m)|/m 2 converges since these are terms of the original sum
P
m,d 2
p|d (1−1/p ) p|m (1−1/p )
(m,d)=1
d odd
corresponding to g = 1. From this,
X X 1
Θr (m, d, g) ≪ ≪ W −1/2 . (19)
√ t2
(m,d,g)∈Tr :g≥W t> W

Finally, the terms in the sum corresponding to elements of Tr with a fixed m are

θr (m)/m2 X φ(g) 1/d3 θr (m)/m2 X φ(g) 1/d3 θr (m)/m2


Q 2
Q ≤ Q Q ≪ Q ,
p|m (1 − 1/p ) g,d g2 2
p|d (1 − 1/p )
2
p|m (1 − 1/p ) g,d g2 2
p|d (1 − 1/p )
2
p|m (1 − 1/p )
(m,d)=1 d odd
d odd g|d∞
g|d∞

φ(g) Q 1/d3
where again, converges since these are the m = 1 terms in the original sum. Thus,
P
g,d g2 2
p|d (1−1/p )
d odd
g|d∞

X X
Θr (m, d, g) ≪ |θr (m)|/m2 .
(m,d,g)∈Tr :m≥V m≥V

Recall that for m = a2 · b · 2c for b square-free, a, b odd, one has θr (m)/m2 ≪ 1


a2 b2 2c
. Hence

X X 1 X X 1 X X 1 X X 1 
|θr (m)|/m2 ≤ |θr (m)|/m2 ≪ +
2c 2c a2 b 2 a2 b 2
m≥V c≥1 m=a2 b odd c≥1 1/3
a>(V /2c ) b b>(V /2c )1/3 a
m≥V /2c
X 1 2c/3
≪ c V 1/3
≪ V −1/3 . (20)
c
2

Finally, let Z ′ = V W , where V = Z 3/5 , V = Z 2/5 . Then gm > Z ′ implies that at least one of
g > W or m > V holds, so putting together (18), (19), and (20),
X
Θr (m, d, g) ≪ Z −2 + (Z ′ )−1/5 .
(m,d,g)∈Tr
mg≥Z ′ or d≥Z

Next, suppose r is even, let ro the odd part of r.


Q
p|do (1 + 1/p(p + 1)) θ (m)
X X X
Θr (m, d, g) = 3
Q 2 2
Q ko 2
T r d odd
d
(m,2d)=1p|d (1 − 1/p ) m p|m (1 − 1/p )
(d,ko )=1
Q
p|do (1 + 1/p(p + 1)) θko (m)
X X
+ Q Q
d3 p|d (1 − 1/p2 ) m2 2
p|m (1 − 1/p )
2||d (m,d)=1
(d,ko )=1
Q
7 X p|do (1 + 1/p(p + 1)) θko (m)
X
+ Q Q
3 d3 p|d (1 − 1/p2 ) m2 2
p|m (1 − 1/p )
4|d (m,d)=1
(d,ko )=1

=: I + II + (7/3)III.

We compute the three summands I, II, and III, separately.

32
Observe that for m odd and α ≥ 1,
θko (2α m) 1 1 θko (m) 2 · (−1)α θ (m)
2α 2
Q 2
= 2α
(−1) α α−1
2 2
Q 2
= α 2
Q ko 2
.
2 m p|2m (1 − 1/p ) 2 1 − 1/4 m p|m (1 − 1/p ) 3·2 m p|m (1 − 1/p )

Thus

Q !−1
p|d (1 + 1/p(p + 1))
X X θ (m) X 2 · (−1)α 9X
I= Q Q ko 1+ = Θko (m, d, g).
d3 p|d (1 − 1/p2 ) m2 2
p|m (1 − 1/p ) 3 · 2α 7 T
d odd (m,d)=1 α≥1 ko
(d,ko )=1

For II, we can rewrite


Q
1 p| (1 + 1/p(p + 1)) θko (m)
X X X 91 X
II = Q Q = Θko (m, d, g).
8(1 − 1/4) d3 p|d (1 − 1/p2 ) m2 p|m (1 − 1/p2 ) 76 T
do odd (m,2do )=1 d odd ko
(d,ko )=1 (d,ko )=1

Finally,
Q
1 p|d (1 + 1/p(p + 1)) θko (m) 9 1 X
X X X
III = Q Q = Θko (m, d, g).
α≥2 do odd
8α (1 − 1/4) d3 p|d (1 − 1/p2 ) m 2 2
p|m (1 − 1/p ) 76·7 T
(m,2d)=1 ko
(d,ko )=1

In summary,
θko (2α m)
 
11 1 1 7 X 11 X
2α 2
Q 2
= 1+ + Θko (m, d, g) = Θko (m, d, g).
2 m p|2m (1 − 1/p ) 9 6 6·73 T 7 T
ko ko

It remains to notice that c(r) = 11/7c(ko ).


Finally, we address the case 4|r:
Q
p|do (1 + 1/p(p + 1)) θ (m)
X X X
Θr (m, d, g) = 3
Q 2 2
Q ko 2
T r d odd
d
(m,2d)=1 p|d (1 − 1/p ) m p|m (1 − 1/p )
(d,ko )=1
Q
4 X p|do (1 + 1/p(p + 1)) θko (m)
X
+ Q Q .
3 d3 p|d (1 − 1/p2 ) m2 p|m (1 − 1/p2 )
2||d (m,d)=1
(d,ko )=1

Since the first summand matches I = 97 · c(ko ); the second is equal to 4


3
· II = 419
367
· c(ko ) = 27 c(ko ),
and we get the desired answer once again.
The error term analysis for r even matches that of odd r.

Lemma 6.7. Let m ∈ N, and let χ be a real character modulo m coming from a primitive character modulo m0 .
Let X
σχ (Z) := µ2 (N )χ(N ).
N ≤Z

Then:

Z Y p η(m)
+ Oε Z 3/5+ε mε = Z · + Oε Z 3/5+ε mε
 
σχ (Z) =
ζ(2) p+1 ζ(2)
p|m

if χ is the principal character, and


 
1/5+ε
σχ (Z) = Oε Z 3/5+ε mε m0
if χ is non-principal.

33
Proof. Consider the associated Dirichlet series
X Y L(s, χ)
µ2 (n)χ(n)n−s = 1 + χ(p)p−s =

Lχ (s) := ,
n p
L(2s, χ2 )

where L(s, χ) is the Dirichlet series associated to χ. By Perron’s formula (see [15], p. 70), choosing Z near
a half-integer, we have
Z 1+ε+iT
Zs
 1+ε 
X
2 1 Z
µ (N )χ(N ) = Lχ (s) ds + Oε
N ≤Z
2πi 1+ε−iT s T

Suppose first that χ is the principal character. Then we have

ζ(s) Y 1
Lχ (s) = ,
ζ(2s) 1 + p−s
p|m

and shifting the integral to the left of the line Re(s) = 1 picks up the pole of ζ(s) at s = 1. We have:
1 Y 1
Ress=1 Lχ (s) =
ζ(2) 1 + 1/p
p|m

Since there are no other poles to the right of the line 1/2, shifting the contour to the line Re(s) =
1/2 + ε,

Zs Z 1+ε
Z  
X
2 Z Y p 1
µ (N )χ(N ) = + Lχ (s) ds + Oε ,
N ≤Z
ζ(2) p + 1 2πi C s T
p|m

where the integration contour C as shown in Figure 6a. Similarly,


if χ is non-principal, there are no poles inside of the contour, and

Zs Z 1+ε
Z  
X
2 1
µ (N )χ(N ) = Lχ (s) ds + Oε .
N ≤Z
2πi C s T

It remains to bound the contour integral terms. (a) Integration contour C

From the Euler product expansion for L(s, χ) it is immediate that for s = σ + it, σ > 1, one has
|L(s, χ)| ≥ ζ(2σ)/ζ(σ), and thus |1/L(2s, χ2 )| ≪ε 1 everywhere on the above contour. Moreover,
from the above discussion, we also have Hχ (s) ≪ 1. Hence
Zs Zs
Z Z
1
Lχ (s) ds ≪ε |L(s, χ))| ds.
2πi C s C s
Next, let χ′ be the the real primitive character of modulus m0 < m that gives rise to χ. For σ > 0,
Y
L(s, χ) = L(s, χ′ ) (1 + χ′ (p)p−s )
p|m

and thus for s ∈ C,


Y
|L(s, χ)| ≤ |L(s, χ′ )| (1 + p−1/2−ε ) ≪ mε |L(s, χ′ )|
p|m

34
From result of Davenport ([2]) that for χ′ primitive and s = σ + it, σ ∈ [0, 1], we have

|L(s, χ′ )| ≪ ((|t| + 1)m0 )1/2−σ/2

Thus,
1/2+ε+iT T T
Zs t1/4−ε/2
Z Z Z
′ 1/4−ε/2
L(s, χ )) ds ≪ Z 1/2+ε m1/4−ε/2 dt ≪ Z 1/2+ε t−3/4−ε dt ≪ m0 Z 1/2+ε T 1/4−ε
1/2+ε−iT s −T |t| + 1 1

Next, let σ ≥ 1 and s = σ + it, t > 1, and let


X
A(u) := χ′ (n).
n≤u

If χ is principal (i.e., if L(s, χ′ ) = ζ(s)) one has |L(s, χ′ )| ≪ log |t|. Indeed, for an integer N = ⌊T ⌋,
Abel summation gives
Z ∞ Z ∞
X
−s su s{u} X
ζ(s) = n −N s−1
+ s+1
du − s+1
du = n−s + O(1) + O (T /N ) = O(log T ).
n≤N N u N u n≤N

On the other hand, if χ is non-principal, one has



|A(u)| ≪ m0 log m0

by the Polya-Vinogradov inequality; hence, by partial summation,

∞ √ ∞ √
√ |t| m0 log m
Z Z
X
−s ′ −s sA(u) m0 log m0 du
n χ (n) = −A(N )N + du ≪ +|t| m0 log m0 ≪ ,
n≥N N us+1 N N us+1 N

so setting N = m0 |t|, we see that

X √
|L(s, χ′ )| ≪ χ′ (n)n−s + |t| m0 log m0 /N ≪ log |m0 t|.
n<N

To summarize,
Z 1+ε+iT 1 Z 1+ε σ
Zs Zσ
Z
′ 1/2−σ/2 Z
L(s, χ )) ds ≪ (T m0 ) dσ + log(T m0 ) dσ
1/2+ε+iT s 1/2+ε T 1 T
√ σ
m0 1 εZ 1+ε log m0 T
Z 
Z
≪ √ √ dσ +
T 1/2+ε T m0 T
Z εZ 1+ε log T m0
≪ +
T T

1/5
and similarly for the other horizontal contour. Finally, taking T = Z 2/5 /m0 yields the desired error
term.

Lemma 6.8. One has

Z2 Y
 
X
2 1
µ (n)φ(n) = 1− 2 + O(Z 8/5+ε )
n≤Z
2ζ(2) p p +p

35
Proof. Consider the associated Dirichlet series
Y p−1

ζ(s − 1) Y

1

Lχ (s) := 1+ 2 = 1− s .
p
p ζ(2s − 2) p p +p

Applying Perron’s formula again,


2+ε+iT
Zs Z 1+ε
Z  
X
2 1
µ (N )χ(N ) = Lχ (s) ds + Oε
N ≤Z
2πi 2+ε−iT s T

Shifting the contour to the line Re(s) = 3/2 + ε picks up the pole of ζ(s) at s = 2 with residue agreeing
Q  
with the main term, and no other poles. In the region Re(s) ≥ 3/2 + ε, the function p 1 − ps +p 1

is bounded absolutely. Similarly, |1/ζ(2s − 2)| ≪ε 1 in this region. Hence it suffices to bound the
R 1/2+ε+iT R 1+ε±iT
contour integrals 1/2+ε−iT |ζ(s)Z s+1 /s|ds and 1/2+ε±iT |ζ(s)Z s+1 /s|ds. By the convexity bound we
have ζ(1/2 + ε + it) ≪ (1 + |t|)1/2−σ/2 ; hence, for the first integral
Z 1/2+ε+iT Z T
s+1 3/2+ε
ζ(s)Z /s ds ≪ Z t1/4−ε/2−1 dt = Z 3/2+ε T 1/4−ε/2 .
1/2+ε−iT 1

Similarly, on the vertical parts of the contour inside the critical strip, one has
Z 1+iT Z 1 1/2−t/2
Z 2 1/2 −t/2
Z
s+1 t+1 T
ζ(s)Z /s ds ≪ Z dt ≪ T ≪ Z 2 /T,
1/2+ε+iT 1/2+ε T T 0

and for σ ≥ 1, ζ(s) = O(log T ) gives


Z 1+ε+iT
ζ(s)Z s+1 /s ds ≪ εZ 2+ε /T 1−ε .
1+iT

Taking T = Z 2/5
concludes the calculation.

7 Acknowledgments
I am very grateful to Andrew Sutherland for introducing this question and to Peter Sarnak for suggesting
it as a project to me, as well as for many fruitful discussions. I would like to thank Jonathan Bober,
Andrew Booker, Andrew Granville, Yang-Hui He, Min Lee, Michael Lipnowski, Mayank Pandey, Michael
Rubinstein, and Will Sawin, for enlightening conversations in connection to this problem.
The research was supported by NSF GRFP and the Hertz Fellowship.

References
[1] Raymond Ayoub. An introduction to the analytic theory of numbers. Mathematical Surveys, No. 10.
American Mathematical Society, Providence, R.I., 1963.

[2] H. Davenport. On Dirichlet’s L-Functions. J. London Math. Soc., 6(3):198–202, 1931.

[3] Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver, and Alexey Pozdnyakov. Murmurations of elliptic
curves. 2022.

[4] Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver, Alexey Pozdnyakov, and Andrew V. Sutherland. In
preparation.

36
[5] C. Hooley. A note on square-free numbers in arithmetic progressions. Bull. London Math. Soc., 7:133–
138, 1975.

[6] Nicholas M. Katz and Peter Sarnak. Zeroes of zeta functions and symmetry. Bull. Amer. Math. Soc.
(N.S.), 36(1):1–26, 1999.

[7] Greg Martin. Dimensions of the spaces of cusp forms and newforms on Γ0 (N ) and Γ1 (N ). J. Number
Theory, 112(2):298–331, 2005.

[8] Kimball Martin. Root number bias for newforms. Proc. Amer. Math. Soc., 151(9):3721–3736, 2023.

[9] Kimball Martin and Thomas Pharis. Rank bias for elliptic curves mod p. Involve, 15(4):709–726,
2022.

[10] Kimball Martin and Thomas Pharis. Errata for rank bias for elliptic curves mod p. https://math.
ou.edu/~kmartin/papers/aprank-err.pdf, 2023.

[11] Peter Sarnak. Letter to Drew Sutherland and Nina Zubrilina on murmurations and root numbers.
https://publications.ias.edu/sarnak/paper/2726, 2023.

[12] Nils-Peter Skoruppa and Don Zagier. Jacobi forms and a certain space of modular forms. Invent.
Math., 94(1):113–146, 1988.

[13] Andrew V. Sutherland. Letter to Michael Rubinstein and Peter Sarnak. https://math.mit.edu/
~drew/RubinsteinSarnakLetter.pdf, 2022.

[14] Andrew V. Sutherland. Murmurations of weight k newforms for Γ0 (N ), comparison to theorem.


https://math.mit.edu/~drew/zub8.html, 2023.

[15] E. C. Titchmarsh. The theory of the Riemann zeta-function. The Clarendon Press, Oxford University
Press, New York, second edition, 1986. Edited and with a preface by D. R. Heath-Brown.

[16] Masatoshi Yamauchi. On the traces of Hecke operators for a normalizer of Γ0 (N ). J. Math. Kyoto
Univ., 13:403–411, 1973.

37

You might also like