
THE LINDENSTRAUSS MAXIMAL INEQUALITY

TERENCE TAO

Abstract. We give an abstract version of a maximal inequality of Lindenstrauss, which
improves upon the classical Hardy-Littlewood maximal inequality.

1. Introduction

Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces, let $R$ be a
totally ordered set which is at most countable, and for each $r \in R$ let
$K_r : X \times Y \to \mathbb{R}^+$ be a non-negative measurable kernel. We can then define the
operator $T_r$ on any non-negative measurable function $f : X \to \mathbb{R}^+$ by
\[ T_r f(y) := \int_X K_r(x,y) f(x)\, d\mu_X(x) \]
and the maximal function
\[ T_R f(y) := \sup_{r \in R} T_r f(y). \]

In this note we are interested in the problem of determining conditions on the kernels $K_r$
for which we have the weak-type $(1,1)$ inequality
\[ \mu_Y(\{ y \in Y : T_R f(y) > \lambda \}) \le \frac{C}{\lambda} \int_X f(x)\, d\mu_X(x) \tag{1} \]
for all non-negative measurable $f : X \to \mathbb{R}^+$ and $\lambda > 0$, and some constant $C > 0$.

The classical Hardy-Littlewood maximal inequality corresponds to the case when $X = Y$ is a
homogeneous space, $R \subset \mathbb{R}^+$ is a countable set of radii, and
$K_r(x,y) := 1_{B(y,r)}(x)/\mu(B(y,r))$. Then, as is well known, (1) holds for some constant $C$
which depends on the doubling constant of the homogeneous space. If one makes $K_r$ a little
smaller, namely $K_r(x,y) := 1_{B(y,r)}(x)/\mu(B(y,3r))$, then the standard proof of the
Hardy-Littlewood inequality shows that (1) now holds with a constant of 1. (We will reprove
this result shortly; in fact we can even replace $B(y,3r)$ with $B(y,2r)$ by being more careful.)

Observe in the above example that $K_r(x,y)$ is supported on a “narrow” subset of $X \times Y$,
with the support getting smaller as $r$ decreases. We shall now consider more general families
of kernels with narrow support. It is convenient to introduce the following “balls” associated
to any family $(K_r)_{r \in R}$ of kernels. Given any $y \in Y$ and $r \in R$, define
$B_r(y) := \{ x \in X : K_r(x,y) \neq 0 \}$, while for $x \in X$ and $r \in R$ define
$B_r^*(x) := \{ y \in Y : K_r(x,y) \neq 0 \}$. Observe that $B_r(y)$, $B_r^*(x)$ are measurable for
almost every $x, y$. We also define the dilated ball
\[ B_{\le r}^* B_r(y) := \bigcup_{x \in B_r(y)} \ \bigcup_{r' \in R : r' \le r} B_{r'}^*(x); \]
this is also measurable for almost every $y$.

We can then obtain the following abstract Hardy-Littlewood maximal inequality:

Theorem 1.1 (Hardy-Littlewood maximal inequality). Let $(K_r)_{r \in R}$ be a family of
non-negative kernels which obeys the bound
\[ K_r(x,y) \le \frac{1}{\mu_Y(B_{\le r}^* B_r(y))} \tag{2} \]
for almost every $x, y$ and all $r \in R$. (We adopt the usual conventions that $1/0 = +\infty$
and $1/{+\infty} = 0$.) Then (1) holds with constant $C = 1$.

Proof. We shall give a slight rearrangement of the usual Vitali-type covering lemma
argument, in order to motivate a variant of this theorem below.

By monotone convergence we may take $R$ to be finite, and $X$, $Y$ to have finite measure, and
by dividing $f$ by $\lambda$ we may normalise $\lambda = 1$. We now induct on the cardinality $|R|$
of $R$. The case $|R| = 0$ is vacuous, so suppose $|R| > 0$ and the claim has already been proven
for smaller values of $|R|$. Let $r$ be the largest value in $R$, and consider the set
\[ E := \{ y \in Y : T_r f(y) > 1 \}; \tag{3} \]
thus
\[ \{ y \in Y : T_R f(y) > 1 \} = E \cup \{ y \in Y \setminus E : T_{R \setminus \{r\}} f(y) > 1 \}. \]
By deleting a set of measure zero we may assume that $\mu_X(B_r(y)) > 0$ for all $y \in E$.
Then by the kernel bounds on $K_r$, we see that for every $y \in E$ we have
\[ \int_{B_r(y)} f(x)\, d\mu_X(x) \ge \mu_Y(B_{\le r}^* B_r(y)). \]
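
The passage from the kernel bound to this inequality can be spelled out as follows. This is a sketch of the omitted computation, using only (2), the definition (3) of $E$, and the fact that $K_r(\cdot,y)$ vanishes outside $B_r(y)$.

```latex
\begin{align*}
1 < T_r f(y)
  &= \int_{B_r(y)} K_r(x,y) f(x)\, d\mu_X(x)
     && \text{($y \in E$, and $K_r(x,y) = 0$ for $x \notin B_r(y)$)} \\
  &\le \frac{1}{\mu_Y(B_{\le r}^* B_r(y))} \int_{B_r(y)} f(x)\, d\mu_X(x)
     && \text{(by (2)).}
\end{align*}
```

Multiplying through by $\mu_Y(B_{\le r}^* B_r(y))$, which is finite since $Y$ has been reduced to finite measure, gives the stated bound; when this measure is zero the bound is trivial.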

Now let $\Sigma \subset E$ be a subset of $E$ such that the sets $B_r(y)$ for $y \in \Sigma$ are
disjoint, and such that $\Sigma$ is maximal with respect to set inclusion (such a set exists by
Zorn’s lemma); since all the sets $B_r(y)$ have positive measure we see that $\Sigma$ is at most
countable. Summing the previous inequality over all $y \in \Sigma$ we conclude
\[ \int_F f(x)\, d\mu_X(x) \ge \mu_Y(E') \tag{4} \]
where
\[ E' := \bigcup_{y \in \Sigma} B_{\le r}^* B_r(y) \]
and
\[ F := \bigcup_{y \in \Sigma} B_r(y). \]
By the maximality of $\Sigma$, we see that for every $y' \in E$ there is some $y \in \Sigma$ such
that $B_r(y)$ and $B_r(y')$ intersect. In particular this implies that
\[ E \subset E'. \]
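
The inclusion is a short unpacking of the definitions. The following sketch records it; the names $x$ and $x''$ for the auxiliary points are introduced here for illustration and do not appear in the argument above.

```latex
% Let y' \in E. By maximality of \Sigma (taking y = y' if y' \in \Sigma),
% there exist y \in \Sigma and a point x with
\[
  x \in B_r(y) \cap B_r(y').
\]
% Since x \in B_r(y') means K_r(x,y') \neq 0, we also have y' \in B_r^*(x), hence
\[
  y' \in B_r^*(x)
     \subset \bigcup_{x'' \in B_r(y)} \ \bigcup_{r' \in R : r' \le r} B_{r'}^*(x'')
     = B_{\le r}^* B_r(y) \subset E'.
\]
```

Note that this step uses the term $r' = r$ in the definition of the dilated ball.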

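Thus
\[ \mu_Y(\{ y \in Y : T_R f(y) > 1 \}) \le \mu_Y(E') + \mu_Y(\{ y \in Y \setminus E' : T_{R \setminus \{r\}} f(y) > 1 \}). \tag{5} \]
Now from the construction of $E'$ we observe that for $y \in Y \setminus E'$ we have
\[ T_{R \setminus \{r\}} f(y) = T_{R \setminus \{r\}} (f 1_{X \setminus F})(y) \]
and hence by the induction hypothesis
\[ \mu_Y(\{ y \in Y \setminus E' : T_{R \setminus \{r\}} f(y) > 1 \}) \le \int_{X \setminus F} f(x)\, d\mu_X(x). \]
Combining this with (4), (5) we conclude that
\[ \mu_Y(\{ y \in Y : T_R f(y) > 1 \}) \le \int_X f(x)\, d\mu_X(x), \]
closing the induction as desired.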
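
For convenience, the final combination can be written as a single chain. This is only a restatement of (4), (5) and the induction hypothesis, not an additional argument.

```latex
\begin{align*}
\mu_Y(\{ y \in Y : T_R f(y) > 1 \})
  &\le \mu_Y(E') + \mu_Y(\{ y \in Y \setminus E' : T_{R \setminus \{r\}} f(y) > 1 \})
     && \text{(by (5))} \\
  &\le \int_F f(x)\, d\mu_X(x) + \int_{X \setminus F} f(x)\, d\mu_X(x)
     && \text{(by (4) and the induction hypothesis)} \\
  &= \int_X f(x)\, d\mu_X(x).
\end{align*}
```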

For instance, if $X = Y$ is a metric measure space, $R \subset \mathbb{R}^+$, and
\[ K_r(x,y) := \frac{1}{\mu_X(\tilde B(x,r))} 1_{d(x,y) \le r} \]
where $B(x,r) \subset \tilde B(x,r) \subset B(x,2r)$ is the enlarged ball
\[ \tilde B(x,r) := \{ x' \in X : d(x,y), d(x',y) \le r \text{ for some } y \in X \}, \]
then the above theorem shows that the Hardy-Littlewood maximal operator
\[ M f(x) := \sup_{r \in R} \frac{1}{\mu_X(\tilde B(x,r))} \int_{B(x,r)} |f(y)|\, d\mu_X(y) \]
is of weak-type $(1,1)$ with operator norm at most 1. In the case that $d$ is an ultrametric
(so that $d(x,x') \le \max(d(x,y), d(x',y))$ for all $x, x', y$) we have $\tilde B(x,r) = B(x,r)$;
this for instance gives the weak-type $(1,1)$ bound for the standard dyadic maximal operator
on cubes, with the usual constant of 1.

In some situations this result is unsatisfactory because of failure of doubling, which may
cause $\tilde B(x,r)$ to be unacceptably large compared with $B(x,r)$. However, there is a
refinement of the Hardy-Littlewood maximal inequality due to Lindenstrauss which can remedy
this. Roughly speaking, it allows one to replace the dilated balls $B_{\le r}^* B_r(y)$ by the
smaller set
\[ B_{< r}^* B_r(y) := \bigcup_{x \in B_r(y)} \ \bigcup_{r' \in R : r' < r} B_{r'}^*(x). \]
In the case where the set of radii $R$ is lacunary, the replacement of the constraint
$r' \le r$ with $r' < r$ can be a significant saving, and in particular one can often
recover a doubling-type condition.

Theorem 1.2 (Lindenstrauss maximal inequality). Let $(K_r)_{r \in R}$ be a family of
non-negative kernels which obeys the bound
\[ K_r(x,y) \le \frac{1}{\mu_Y(B_{< r}^* B_r(y))} \tag{6} \]
for almost every $x, y$ and all $r \in R$, and the kernel bound
\[ \int_Y K_r(x,y)\, d\mu_Y(y) \le 1 \tag{7} \]
for almost every $x$ and all $r \in R$. Then (1) holds with constant $C = \frac{2}{1 - e^{-1}}$.

We remark that the hypothesis (7) is very natural, as it asserts that $T_r$ is of strong-type
$(1,1)$ with operator norm at most 1. It is not hard to see that both (7) and (6) are
consequences of (2), so the Lindenstrauss maximal inequality is stronger than the
Hardy-Littlewood maximal inequality except for the small loss in the constant $C$. One can
optimise the constant $\frac{2}{1 - e^{-1}}$ a little bit more but we will not attempt
to do so here.
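
The claim that (2) implies both (6) and (7) can be checked directly. The following is a sketch of one way to do so, assuming $0 < \mu_Y(B_r^*(x)) < \infty$ in the second part (the degenerate cases are handled by the conventions $1/0 = +\infty$, $1/{+\infty} = 0$).

```latex
% (2) implies (6): the smaller dilated ball is contained in the larger one.
\[
  B_{< r}^* B_r(y) \subset B_{\le r}^* B_r(y)
  \quad\Longrightarrow\quad
  K_r(x,y) \le \frac{1}{\mu_Y(B_{\le r}^* B_r(y))} \le \frac{1}{\mu_Y(B_{< r}^* B_r(y))}.
\]
% (2) implies (7): fix x. If y \in B_r^*(x) then x \in B_r(y), so
% B_r^*(x) \subset B_{\le r}^* B_r(y). Since K_r(x,\cdot) vanishes outside B_r^*(x),
\begin{align*}
\int_Y K_r(x,y)\, d\mu_Y(y)
  &= \int_{B_r^*(x)} K_r(x,y)\, d\mu_Y(y)
   \le \int_{B_r^*(x)} \frac{d\mu_Y(y)}{\mu_Y(B_{\le r}^* B_r(y))} \\
  &\le \int_{B_r^*(x)} \frac{d\mu_Y(y)}{\mu_Y(B_r^*(x))} = 1.
\end{align*}
```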

Proof. As before we can take $R$ to be finite, $X$, $Y$ to have finite measure, $f$ to be
bounded, and $\lambda = 1$. For technical reasons we shall need to assume the existence of
an $\varepsilon > 0$ such that $\mu_X(B_r(y)) > \varepsilon$ and $\mu_Y(B_r^*(x)) > \varepsilon$ for all
$x, y, r$; one can reduce to this case by adding a small amount of measure to $X$ and $Y$ and
increasing $K_r$ slightly, and then sending $\varepsilon \to 0$ and using monotone convergence; we
omit the details.

We again induct on $|R|$; thus $|R| \ge 1$ and we assume the claim has already been
proven for smaller $|R|$. We take $r$ to be the largest value of $R$, and define the set
$E$ by (3) as before. Thus
\[ \int_X K_r(x,y) f(x)\, d\mu_X(x) \ge 1 \tag{8} \]
for all $y \in E$. In particular this forces $B_r(y)$ to have positive measure.

We now deviate from the previous argument by selecting $\Sigma$ randomly rather than
greedily. Indeed, we apply a Poisson process with intensity $p(y)$ for every $y \in E$,
where $p : E \to \mathbb{R}^+$ is a bounded strictly positive density function to be chosen later.
This creates a random subset¹ $\Sigma \subset E$ with the property that for any non-negative
measurable $w : E \to \mathbb{R}^+$, the quantity $\sum_{y \in \Sigma} w(y)$ has expectation
$\int_E w(y) p(y)\, d\mu_Y(y)$, and is a Poisson random variable when $w$ is an indicator
function. In particular (setting $w = 1_E$) we see that $\Sigma$ is almost surely finite.

Now we define the random sets
\[ E' := \bigcup_{y \in \Sigma} B_{< r}^* B_r(y) \]
and
\[ F := \bigcup_{y \in \Sigma} B_r(y). \]
Observe that
\[ \mu_Y(\{ T_R f > 1 \}) \le \mu_Y(E) + \mu_Y(E') + \mu_Y(\{ y \notin E' : T_{R \setminus \{r\}} f(y) > 1 \}). \]

¹ If $E$ contains atoms, then $\Sigma$ may contain multiplicity, i.e. it is a multiset rather than
a set. One way to create $\Sigma$ is to let $N$ be a Poisson random variable with expectation
$P := \int_E p(y)\, d\mu_Y(y)$ and then let $\Sigma = \{ y_1, \ldots, y_N \}$ where $y_1, \ldots, y_N$
are iid elements of $E$ chosen using the probability distribution $p(y)\, d\mu_Y(y)/P$.

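But as before we have for $y \notin E'$
\[ T_{R \setminus \{r\}} f(y) = T_{R \setminus \{r\}} (f 1_{X \setminus F})(y) \]
so by the induction hypothesis
\[ \mu_Y(\{ y \notin E' : T_{R \setminus \{r\}} f(y) > 1 \}) \le C \int_{X \setminus F} f(x)\, d\mu_X(x), \]
where $C = \frac{2}{1 - e^{-1}}$. We take expectations (denoted $\mathbf{E}$) and conclude
\[ \mu_Y(\{ T_R f > 1 \}) \le \mu_Y(E) + \mathbf{E} \mu_Y(E') + C\, \mathbf{E} \int_{X \setminus F} f(x)\, d\mu_X(x). \]
To close the induction it will thus suffice to show that
\[ \mu_Y(E) + \mathbf{E} \mu_Y(E') \le \frac{2}{1 - e^{-1}}\, \mathbf{E} \int_F f(x)\, d\mu_X(x). \]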
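Now using the weight function $w(y) := 1/p(y)$ we see that
\[ \mu_Y(E) = \mathbf{E} \sum_{y \in \Sigma} \frac{1}{p(y)}, \]
while from the definition of $E'$ we have
\[ \mathbf{E} \mu_Y(E') \le \mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)). \]
It is now natural to set
\[ p(y) := \frac{1}{\mu_Y(B_{< r}^* B_r(y))} \]
(note that $p$ is bounded by $1/\varepsilon$) and so
\[ \mu_Y(E) + \mathbf{E} \mu_Y(E') \le 2\, \mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)). \]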

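Applying (8) followed by Fubini’s theorem we conclude
\begin{align*}
\mu_Y(E) + \mathbf{E} \mu_Y(E')
  &\le 2\, \mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)) \int_{B_r(y)} K_r(x,y) f(x)\, d\mu_X(x) \\
  &= 2 \int_X \Big[ \mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)) K_r(x,y) \Big] f(x)\, d\mu_X(x).
\end{align*}
Meanwhile, we have
\[ \mathbf{E} \int_F f(x)\, d\mu_X(x) = \int_X [\mathbf{E} 1_F(x)] f(x)\, d\mu_X(x). \]
Thus it will suffice to show that
\[ \mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)) K_r(x,y) \le \frac{1}{1 - e^{-1}}\, \mathbf{E} 1_F(x) \]
for almost every $x \in X$.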

Fix $x$. The left-hand side is just
\[ \int_E K_r(x,y)\, d\mu_Y(y). \]
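
This identity comes from the expectation formula for the Poisson process together with the choice of $p$ made above. As a sketch, writing $\mathbf{E}$ for expectation (to distinguish it from the set $E$) and taking the weight $w(y) := \mu_Y(B_{< r}^* B_r(y)) K_r(x,y)$:

```latex
\begin{align*}
\mathbf{E} \sum_{y \in \Sigma} \mu_Y(B_{< r}^* B_r(y)) K_r(x,y)
  &= \int_E \mu_Y(B_{< r}^* B_r(y)) K_r(x,y)\, p(y)\, d\mu_Y(y)
     && \text{(expectation of $\textstyle\sum_{y \in \Sigma} w(y)$)} \\
  &= \int_E K_r(x,y)\, d\mu_Y(y)
     && \text{(since $p(y) = 1/\mu_Y(B_{< r}^* B_r(y))$).}
\end{align*}
```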

Applying (6), (7), we can bound this by $\min(1, \alpha)$, where
\[ \alpha := \int_{E \cap B_r^*(x)} \frac{1}{\mu_Y(B_{< r}^* B_r(y))}\, d\mu_Y(y). \]
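
The two halves of the bound can be seen separately. The following sketch uses only (6), (7) and the fact that $K_r(x,\cdot)$ vanishes outside $B_r^*(x)$.

```latex
\begin{align*}
\int_E K_r(x,y)\, d\mu_Y(y)
  &\le \int_Y K_r(x,y)\, d\mu_Y(y) \le 1
     && \text{(by (7)),} \\
\int_E K_r(x,y)\, d\mu_Y(y)
  &= \int_{E \cap B_r^*(x)} K_r(x,y)\, d\mu_Y(y)
   \le \int_{E \cap B_r^*(x)} \frac{d\mu_Y(y)}{\mu_Y(B_{< r}^* B_r(y))} = \alpha
     && \text{(by (6)).}
\end{align*}
```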

On the other hand, observe that $x \in F$ if and only if $|\Sigma \cap E \cap B_r^*(x)| \ge 1$. But
$|\Sigma \cap B_r^*(x)|$ is a Poisson random variable with expectation $\alpha$. Thus
\[ \mathbf{E} 1_F(x) = 1 - e^{-\alpha}. \]
The claim now follows from the elementary inequality
\[ \min(1, \alpha) \le \frac{1 - e^{-\alpha}}{1 - e^{-1}} \]
for all $\alpha > 0$ (this follows from the concavity of $1 - e^{-x}$ for $0 \le x \le 1$, and the
monotonicity of $1 - e^{-x}$ for $x \ge 1$).
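
The elementary inequality can be verified in two cases. In the sketch below, $g$ is an auxiliary name introduced here for the function $g(x) := 1 - e^{-x}$, which is concave and increasing with $g(0) = 0$.

```latex
% Case 0 < \alpha \le 1: concavity of g on [0,1] with g(0) = 0 gives
\[
  g(\alpha) = g(\alpha \cdot 1 + (1 - \alpha) \cdot 0)
            \ge \alpha\, g(1) + (1 - \alpha)\, g(0) = \alpha (1 - e^{-1}),
  \qquad\text{so}\qquad
  \min(1,\alpha) = \alpha \le \frac{1 - e^{-\alpha}}{1 - e^{-1}}.
\]
% Case \alpha \ge 1: monotonicity of g gives g(\alpha) \ge g(1), so
\[
  \min(1,\alpha) = 1 \le \frac{1 - e^{-\alpha}}{1 - e^{-1}}.
\]
```

Equality holds at $\alpha = 1$, so the factor $\frac{1}{1 - e^{-1}}$ cannot be improved in this particular inequality.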

The above theorem is an abstraction of a slightly more concrete inequality of Lindenstrauss
[1], which is as follows. Let $G$ be a group with a bi-invariant Haar measure² $\mu$, and let
$F_1, \ldots, F_n$ be a finite tempered Følner sequence, in the sense that
\[ \mu\Big( \bigcup_{r' < r} F_{r'}^{-1} F_r \Big) \le A \mu(F_r) \]
for some $A \ge 1$ and all $1 \le r \le n$. Then by applying the above theorem to
$X = Y = G$, $R := \{1, \ldots, n\}$ and the kernels
\[ K_r(x,y) := \frac{1}{A \mu(F_r)} 1_{y^{-1} x \in F_r} \]
we conclude that the maximal operator
\[ M f(y) := \sup_{1 \le r \le n} \frac{1}{\mu(F_r)} \int_{F_r} f(gy)\, d\mu(g) \]
is of weak-type $(1,1)$ with norm at most $\frac{2}{1 - e^{-1}} A$.

The inequality also shows that on a metric measure space $X$, the maximal operator
\[ \tilde M f(x) := \sup_{r \in R} \frac{1}{V(x)} \int_{B(x,r)} f(y)\, d\mu(y) \]
is of weak-type $(1,1)$ with norm at most $\frac{2}{1 - e^{-1}}$, where $V(x)$ is the quantity
(depending also on $r$)
\[ V(x) := \max\Big( \mu(\{ x' \in X : d(x',y) \le r' \text{ for some } r' < r,\ y \in B(x,r) \}),\ \sup_{y \in B(x,r)} \mu(B(y,r)) \Big). \]

² It is also possible to tackle the non-unimodular case without difficulty.

One can gauge the strength of these inequalities by looking at the usual Euclidean
Hardy-Littlewood inequality (with $X = \mathbb{R}^d$) in the high-dimensional limit. The
usual Hardy-Littlewood argument gives a weak-type $(1,1)$ bound of $2^d$. If one
restricts the radii to powers of $d$, then the Lindenstrauss argument gives a weak-type
$(1,1)$ bound of $O(1)$; one can then refine this to powers of $d^{1/(d \log d)}$ to get a
bound of $O(d \log d)$. But all balls are comparable in measure to a ball of radius
equal to a power of $d^{1/(d \log d)}$. We thus see that the full Hardy-Littlewood maximal
operator is also of weak-type $(1,1)$ with a bound of $O(d \log d)$. This is not as strong
as the bound of $O(d)$ established by Stein and Strömberg [2] but is not too far off.
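
To illustrate the $O(1)$ claim for radii restricted to powers of $d$, here is a sketch of the volume computation. It rests on assumptions not spelled out in the text: $d \ge 2$, $R \subset \{ d^k : k \in \mathbb{Z} \}$, $\mu$ is Lebesgue measure on $\mathbb{R}^d$, and the kernel is normalised as $K_r(x,y) := 1_{B(y,r)}(x) / (e\, \mu(B(y,r)))$, a choice made here for illustration.

```latex
% With this kernel, B_r(y) = B(y,r) and B_{r'}^*(x) = B(x,r').
% Any r' \in R with r' < r satisfies r' \le r/d, so by the triangle inequality
\[
  B_{< r}^* B_r(y) \subset B\big(y, r + r/d\big),
  \qquad
  \mu_Y(B_{< r}^* B_r(y)) \le \Big(1 + \frac{1}{d}\Big)^{d} \mu(B(y,r)) \le e\, \mu(B(y,r)).
\]
% Hence (6) holds, and (7) holds because
\[
  \int_{\mathbb{R}^d} K_r(x,y)\, dy = \frac{\mu(B(x,r))}{e\, \mu(B(x,r))} = \frac{1}{e} \le 1.
\]
```

Under these assumptions Theorem 1.2 applies to $e^{-1}$ times the maximal averaging operator over radii in $R$, which gives a weak-type $(1,1)$ constant of at most $\frac{2e}{1 - e^{-1}}$, independent of $d$.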

The author thanks Zubin Gautam for explaining the argument of Lindenstrauss.

References

[1] E. Lindenstrauss, Pointwise Theorems for Amenable Groups, Invent. Math. 146 (2001), no.
2, 259–295.
[2] E. M. Stein, J. Strömberg, Behaviour of maximal functions in $\mathbb{R}^n$ for large $n$, Ark. Mat. 21
(1983), 259–269.

Department of Mathematics, UCLA, Los Angeles CA 90095-1555

E-mail address: tao@math.ucla.edu
