You are on page 1of 100

Analysis I

Piotr Hajlasz

1 Measure theory

1.1 σ-algebra.

Definition. Let X be a set. A collection M of subsets of X is σ-algebra if M has the


following properties

1. X ∈ M;

2. A ∈ M =⇒ X \ A ∈ M;
S∞
3. A1 , A2 , A3 , . . . ∈ M =⇒ i=1 Ai ∈ M.

The pair (X, M) is called measurable space and elements of M are called measurable
sets.
Exercise. Let (X, M) be a measurable space. Show that

1. ∅ ∈ M;

2. A, B ∈ M =⇒ A \ B ∈ M;

3. M is closed under finite unions, finite intersections and countable intersections.

Examples.

1. 2X , the family of all subsets of X is a σ-algebra.

2. M = {∅, X} is a σ-algebra.

3. If E ⊂ X is a fixed set, then M = {∅, X, E, X \ E} is a σ-algebra.

1
Proposition 1 If {Mi }i∈I is a family of σ-algebras, then
\
M= Mi
i∈I

is a σ-algebra.

A simple proof is left to the reader.


Let R be a family of subsets of X. By σ(R) we will denote the intersection of all
σ-algebras that contain R. Note that there is at least one σ-algebra that contains R,
namely 2X .

1. σ(R) is a σ-algebra that contains R


2. σ(R) is the smallest σ-algebra that contains R in the sense that if M is a σ-algebra
that contains R, then σ(R) ⊂ M.

We say that σ(R) is a σ-algebra generated by R.


Example. If R = {E}, then σ(R) = {∅, X, E, X \ E}.
In the sequel we will need the following result.

Proposition 2 The family M of subsets of a set X is a σ-algebra if and only if it


satisfies the following three properties

1. X ∈ M;
2. If A, B ∈ M, then A \ B ∈ M.
S∞
3. If A1 , A2 , A3 , . . . ∈ M are pairwise disjoint, then i=1 Ai ∈ M.

We leave the proof as an exercise cf. the proof of Theorem 3(d). 2

1.2 Borel sets.

Let X be a metric space (or more generally a topological space). By B(X) we will
denote the σ-algebra generated by the family of all open sets in X. Elements of B(X)
are called Borel sets and B(X) is called σ-algebra of Borel sets.
The following properties are obvious.
T∞
1. If U1 , U2 , U3 , . . . ⊂ X are open, then i=1 Ui is a Borel set;
2. All closed sets are Borel;
S∞
3. If F1 , F2 , F3 , . . . ⊂ X are closed, then i=1 Fi is a Borel set.

2
1.3 Measure.

Definition. Let (X, M) be a measurable space. A measure (called also positive


measure) is a function
µ : M → [0, ∞]
such that

1. µ(∅) = 0;

2. µ is countably additive i.e. if A1 , A2 , A3 , . . . ∈ M are pairwise disjoint, then



! ∞
[ X
µ Ai = µ(Ai ).
i=1 i=1

The triple (X, M, µ) is called measure space.


If µ(X) < ∞, then µ is called finite measure.
If µ(X) = 1, then µ is called probability or probability measure.
If we can write X = ∞
S
i=1 Ai , where Ai ∈ M and µ(Ai ) < ∞ for all i = 1, 2, 3, . . .,
then we say that µ is σ-finite.
Example. If X is an arbitrary set, and µ : 2X → [0, ∞] is defined by µ(E) = m if
E is finite and has m elements, µ(E) = ∞ if E is infinite, then µ is a measure. It is
called counting measure.

Theorem 3 (Elementary properties of measures) Let (X, M, µ) be a measure


space. Then

(a) If the sets A1 , A2 , . . . , An ∈ M are pairwise disjoint then µ(A1 ∪ . . . ∪ An ) =


µ(A1 ) + . . . + µ(An ).

(b) If A, B ∈ M, A ⊂ B and µ(B) < ∞, then µ(B \ A) = µ(B) − µ(A).

(c) If A, B ∈ M, A ⊂ B, then µ(A) ≤ µ(B).

(d) If A1 , A2 , A3 , . . . ∈ M, then

! ∞
[ X
µ Ai ≤ µ(Ai ).
i=1 i=1

(e) If A1 , A2 , A3 , . . . ∈ M, µ(Ai ) = 0, for i = 1, 2, 3, . . . then µ( ∞


S
i=1 Ai ) = 0.

3
(f ) If A1 , A2 , A3 , . . . ∈ M, A1 ⊂ A2 ⊂ A3 ⊂ . . . then

!
[
µ Ai = lim µ(Ai ).
i→∞
i=1

(g) If A1 , A2 , A3 , . . . ∈ M, A1 ⊃ A2 ⊃ A3 ⊃ . . . and µ(A1 ) < ∞, then



!
\
µ Ai = lim µ(Ai ).
i→∞
i=1

Proof.
(a) The sets A1 , A2 , . . . , An , ∅, ∅, ∅, . . . are pairwise disjoint. Hence

µ(A1 ∪ . . . ∪ An ) = µ(A1 ∪ . . . ∪ An ∪ ∅ ∪ ∅ ∪ ∅ ∪ . . .)
= µ(A1 ) + . . . + µ(An ) + µ(∅) + µ(∅) + µ(∅) + . . .
= µ(A1 ) + . . . + µ(An ).

(b) Since B = A ∪ (B \ A) and the sets A, B \ A are disjoint we have

µ(B) = µ(A) + µ(B \ A) (1)

and the claim easily follows.


(c) The claim follows from (1).
(d) We have

A = A 1 ∪ A 2 ∪ A3 ∪ . . .
= A1 ∪ (A2 \ A1 ) ∪ (A3 \ (A1 ∪ A2 )) ∪ (A4 \ (A1 ∪ A2 ∪ A3 )) ∪ . . .
|{z} | {z } | {z } | {z }
B1 B2 B3 B4

The sets Bi are pairwise disjoint and Bi ⊂ Ai . Hence



X ∞
X
µ(A) = µ(Bi ) ≤ µ(Ai ).
i=1 i=1

(e) It easily follows from (d).


(f ) Since Ai ⊂ Ai+1 , we have

A = A 1 ∪ A 2 ∪ A3 ∪ . . .
= A1 ∪ (A2 \ A1 ) ∪ (A3 \ A2 ) ∪ (A4 \ A3 ) ∪ . . .
|{z} | {z } | {z } | {z }
B1 B2 B3 B4

4
The sets Bi are pairwise disjoint and hence

X
µ(A) = µ(Bi ) = lim (µ(B1 ) + . . . + µ(Bi )) = lim µ(B1 ∪ . . . ∪ Bi ) = lim µ(Ai ).
i→∞ i→∞ i→∞
i=1

The limit exist because µ(Ai ) ≤ µ(Ai+1 ) (since Ai ⊂ Ai+1 ).


(g) It suffice to apply (f) to the sets A1 \ Ai .
The proof is complete. 2
Exercise. Provide a detailed proof of (g). Where do we use the assumption that
µ(A1 ) < ∞? Show an example that the conclusion of (g) need not be true without the
assumption that µ(A1 ) < ∞.

1.4 Outer measure and Carathéodory construction.

It is quite difficult to construct a measure with desired properties, but it is much


easier to construct so called outer measure which has less restrictive properties. The
Carathéodory theorem shows then how to extract a measure from an outer measure.
Definition. Let X be a set. A function

µ∗ : 2X → [0, ∞]

is called outer measure if

1. µ∗ (∅) = 0;
2. A ⊂ B =⇒ µ∗ (A) ≤ µ∗ (B);
3. for all sets A1 , A2 , A3 , . . . ⊂ X

! ∞
[ X

µ Ai ≤ µ∗ (Ai ).
i=1 i=1

We call a set E ⊂ X µ∗ -measurable if

∀A ⊂ X µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A \ E). (2)

Condition (2) is called Carathéodory condition. Since the inequality

µ∗ (A) ≤ µ∗ (A ∩ E) + µ∗ (A \ E)

is always satisfied, in order to verify measurability of a set E it suffices to show that

µ∗ (A) ≥ µ∗ (A ∩ E) + µ∗ (A \ E).

5
Proposition 4 All sets with µ∗ (E) = 0 are µ∗ measurable.

Proof. If µ∗ (E) = 0, then for an arbitrary set A ⊂ X we have

µ∗ (A) ≥ µ∗ (A \ E) = µ∗ (A \ E) + µ∗ (A ∩ E),
| {z }
0

because A ∩ E ⊂ E. 2
Let M∗ be the class of all µ∗ -measurable sets.

Theorem 5 (Carathéodory) M∗ is a σ-algebra and

µ∗ : M∗ → [0, ∞]

is a measure.

Proof. We will split the proof into several steps.


Step I: X ∈ M∗
This is obvious.
Step II: E, F ∈ M∗ =⇒ E ∪ F ∈ M∗ .
Let A ⊂ X. We have

µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A \ E) = µ∗ (A ∩ E) + µ∗ ((A \ E) ∩ F ) + µ∗ ((A \ E) \ F ).

Since A ∩ E = A ∩ (E ∪ F ) ∩ E and (A \ E) ∩ F = (A ∩ (E ∪ F )) \ E, we have

µ∗ (A) = µ∗ (A ∩ (E ∪ F ) ∩ E) + µ∗ (A ∩ (E ∪ F ) \ E) + µ∗ (A \ (E ∪ F ))
= µ∗ (A ∩ (E ∪ F )) + µ∗ (A \ (E ∪ F )).

Step III: E, F ∈ M∗ =⇒ E \ F ∈ M∗ .
It follows from the symmetry of the condition (2) that X \ E ∈ M∗ . Since E \ F =
X \ ((X \ E) ∪ F ) the claim follows.
Step IV: If the sets E1 , E2 , E3 , . . . ∈ M∗ are pairwise disjoint, then for every A ⊂ X

[ ∞
X

µ (A ∩ Ei ) = µ∗ (A ∩ Ei ).
i=1 i=1

If E, F ∈ M∗ are pairwise disjoint, the for every A ⊂ X

µ∗ (A ∩ (E ∪ F )) = µ∗ (A ∩ (E ∪ F ) ∩ E) + µ∗ (A ∩ (E ∪ F ) \ E) = µ∗ (A ∩ E) + µ∗ (A ∩ F ).

6
This, Step II and the induction argument implies that for every n
n
[ n
X

µ (A ∩ Ei ) = µ∗ (A ∩ Ei ). (3)
i=1 i=1

Hence ∞ n
[ X

µ (A ∩ Ei ) ≥ µ∗ (A ∩ Ei )
i=1 i=1
and thus in the limit ∞ ∞
[ X

µ (A ∩ Ei ) ≥ µ∗ (A ∩ Ei ).
i=1 i=1
Since the opposite inequality is a property of an outer measure, the claim follows.
Step V: If the sets E1 , E2 , E3 , . . . ∈ M∗ are pairwise disjoint, then ∞ ∗
S
i=1 Ei ∈ M .

Step II and the induction argument implies that for every n


n
[
Ei ∈ M∗ .
i=1

Hence (3) gives


n
! n
! n ∞
!
[ [ X [
∗ ∗ ∗ ∗ ∗
µ (A) = µ A∩ Ei +µ A\ Ei ≥ µ (A ∩ Ei ) + µ A\ Ei ,
i=1 i=1 i=1 i=1

and in the limit


∞ ∞
! ∞
! ∞
!
X [ [ [

µ (A) ≥ µ∗ (A ∩ Ei ) + µ∗ A\ Ei = µ∗ A∩ Ei + µ∗ A\ Ei .
i=1 i=1 i=1 i=1

Since the opposite inequality is a property of an outer measure the claim follows.
Step VI: Final step. It follows from Steps I, III, V and Proposition 2 that M∗ is a
σ-algebra and then Step IV with A = X implies that µ∗ restricted to M∗ is a measure.
The proof is complete. 2
We say that a measure µ : M → [0, ∞] is complete if every subset of a set of
measure zero is measurable (and hence has measure zero). Therefore it follows from
Proposition 4 that the measure described by the Carathéodory theorem is complete.
Let (X, d) be a metric space. For E, F ⊂ X we define
dist (E, F ) = inf d(x, y), diam E = sup d(x, y).
x∈E,y∈F x,y∈E

Definition. An outer measure µ∗ defined on subsets of a metric space is called metric


outer measure if
µ∗ (E ∪ F ) = µ∗ (E) + µ∗ (F ) whenever dist (E, F ) > 0.

7
Theorem 6 If µ∗ is a metric outer measure, then all Borel sets are µ∗ -measurable i.e.
B(X) ⊂ M∗ .

Proof. Since M∗ is a σ-algebra, it suffices to show that all open sets belong to M∗ . Let
G ⊂ X be open. It suffices to show that for every A ⊂ X

µ∗ (A) ≥ µ∗ (A ∩ G) + µ∗ (A \ G) (4)

(because the opposite inequality is obvious). We can assume that µ∗ (A) < ∞ as
otherwise (4) is obviously satisfied.
For each positive integer n define
1
Gn = {x ∈ G : dist (x, X \ G) > }.
n
Then
1
dist (Gn , X \ G) ≥ > 0. (5)
n
Let
1 1
Dn = Gn+1 \ Gn = {x ∈ G : < dist (x, X \ G) ≤ }.
n+1 n
Clearly

[
G \ Gn = Di (6)
i=n

and
1 1
dist (Di , Dj ) ≥ − > 0 provided i + 2 ≤ j.
i+1 j
Since the mutual distances between the sets D1 , D3 , D5 , . . . , D2n−1 are positive we have

µ∗ (A ∩ D1 ) + µ∗ (A ∩ D3 ) + . . . µ∗ (A ∩ D2n−1 ) = µ∗ (A ∩ (D1 ∪ D3 ∪ . . . ∪ D2n−1 )) ≤ µ∗ (A)

and similarly

µ∗ (A ∩ D2 ) + µ∗ (A ∩ D4 ) + . . . + µ∗ (A ∩ D2n ) ≤ µ∗ (A).

Hence ∞
X
µ∗ (A ∩ Di ) ≤ 2µ∗ (A) < ∞.
i=1

Now (6) yields



X
µ∗ (A ∩ (G \ Gn )) ≤ µ∗ (A ∩ Di )
i=n

and hence
µ∗ (A ∩ (G \ Gn )) → 0 as n → ∞.

8
Inequality (5) gives
µ∗ (A ∩ Gn ) + µ∗ (A \ G) = µ∗ ((A ∩ Gn ) ∪ (A \ G)) ≤ µ∗ (A)
and thus
µ∗ (A ∩ G) + µ∗ (A \ G) ≤ µ∗ (A ∩ Gn ) + µ∗ (A ∩ (G \ Gn )) + µ∗ (A \ G)
≤ µ∗ (A) + µ∗ (A ∩ (G \ Gn ))
and after passing to the limit
µ∗ (A ∩ G) + µ∗ (A \ G) ≤ µ∗ (A).
The proof is complete. 2
We will use outer measures to prove the following important result.

Theorem 7 Let X be a metric space and µ a measure in B(X). Suppose that X is a


union of countably many open sets of finite measure. Then
µ(E) = inf µ(U ) = sup µ(C). (7)
U ⊃E C⊂E
U −open C−closed

Before proving the theorem let’s discuss two of its corollaries. The first one shows that
in a situation described by the above theorem in order to prove that two measures are
equal it suffices to compare them on the class of open sets.

Corollary 8 If µ is as in Theorem 7 and ν is another measure on B(X) that satisfies


ν(U ) = µ(U ) for all open sets U ⊂ X
then
ν(E) = µ(E) for all E ⊂ B(X).

The second corollary describes an important class of measures satisfying assumptions


of theorem 7.
Definition. Let X be a metric space and µ a measure defined on the σ-algebra of
Borel sets. We say that µ is a Radon measure if µ(K) < ∞ for all compact sets and
µ(E) = inf µ(U ) = sup µ(K) for all E ∈ B(X). (8)
U ⊃E K⊂E
U −open K−compact

Corollary 9 If X is a locally compact and separable metric space1 and µ is a measure


in B(X) such that µ(K) < ∞ for every compact set K, then X is a union of countably
many open sets of finite measure and µ is a Radon measure.
1
Being locally compact means that ever point has a neighborhood whose closure is compact. IRn is
locally compact, but also IRn \ {0} is locally compact.

9
Proof. It follows from locally compactness of X that X is a union of a family of open
sets with compact closures. Since X is separable, every covering of X by open sets has
countable subcovering and hence we can write

[
X= Ui , Ui — compact.
i=1

This proves the first part of the corollary because µ(Ui ) ≤ µ(Ui ) < ∞. Now (8) follows
from (7) and Theorem 3(f) because
∞ n
!
[ [
C= C∩ Ui .
n=1 i=1
| {z }
compact

and hence
µ(C) = sup µ(K)
K⊂C
K−compact

for every closed set C. The proof is complete 2


Proof of Theorem 7. For E ⊂ X we set

µ∗ (E) = inf µ(U ).


U ⊃E
U −open

It is easy to see that µ∗ is a metric outer measure. Hence µ∗ restricted to the class of
Borel sets is a measure. Clearly

µ(U ) = µ∗ (U ) for all open sets U ⊂ X, (9)

and
µ(E) ≤ µ∗ (E) for all E ∈ B(X). (10)
We can assume that X is a union of an increasing sequence of open sets with finite
measure. Indeed, if X is the union of open sets Ui of finite measure, then the sets
Vn = U1 ∪ U2 ∪ . . . ∪ Un satisfy

[
X= Vn , Vn ⊂ Vn+1 , µ(Vn ) < ∞.
n=1

Now inequality (10) implies that for all E ∈ B(X)

µ(Vn \ E) ≤ µ∗ (Vn \ E) and µ(Vn ∩ E) ≤ µ∗ (Vn ∩ E), (11)

because both Vn \E and Vn ∩E are Borel. Actually we have equalities in both inequalities
of (11) because otherwise we would have a sharp inequality

µ(Vn ) = µ(Vn \ E) + µ(Vn ∩ E) < µ∗ (Vn \ E) + µ∗ (Vn ∩ E) = µ∗ (Vn ),

10
which contradicts (9). Hence in particular µ(Vn ∩ E) = µ∗ (Vn ∩ E). Since

µ(Vn ∩ E) → µ(E) and µ∗ (Vn ∩ E) → µ∗ (E)

as n → ∞ by Theorem 3(f) we conclude that µ(E) = µ∗ (E) and thus

µ(E) = inf µ(U ) (12)


U ⊃E
U −open

by the definition of µ∗ . To prove that the measure of a set E ∈ B(X) can be approxi-
mated by measures of closed subsets of E observe that Vn \ E has finite measure and
hence it follows from (12) that there is an open set Gn such that
ε
Vn \ E ⊂ Gn , µ(Gn \ (Vn \ E)) < .
2n
S∞
The set G = n=1 Gn is open and C = X \ G ⊂ E is closed. Now it suffices to observe
that ∞ ∞
[ [
E\C =E∩ Gn ⊂ Gn \ (Vn \ E)
n=1 n=1

and hence µ(E \ C) < ε. The proof is complete. 2

1.5 Hausdorff measure.

Let ωs = π s/2 /Γ(1 + 2s ), s ≥ 0. If s = n is a positive integer, then ωn is volume of the


unit ball in IRn .
Let X be a metric space. For ε > 0 and E ⊂ X we define

ωs X
Hεs (E) = inf s (diam Ai )s
2 i=1

where the infimum is taken over all possible coverings



[
E⊂ Ai with diam Ai < ε.
i=1

Since the function ε 7→ Hεs (E) is nonincreasing, the limit

Hs (E) = lim Hεs (E)


ε→0

exists. Hs is called Hausdorff measure.


It is easy to see that if s = 0, H0 is the counting measure.

Theorem 10 Hs is a metric outer measure.

11
Proof. Clearly
Hs : 2X → [0, ∞],
Hs (∅) = 0,
A ⊂ B =⇒ Hs (A) ≤ Hs (B).
In order to show that Hs is an outer measure it remains to show that

! ∞
[ X
s
H En ≤ Hs (En ).
n=1 n=1

We can assume that the right hand side is finite. Hence



X
Hεs (En ) < ∞ for every ε > 0.
n=1

Fix δ > 0. We can find a covering of each set En



[
En ⊂ Ani , diam Ani < ε
i=1

such that ∞
ωs X δ
Hεs (En ) ≥ s (diam Ani )s − n
2 i=1 2
Hence
∞ ∞ ∞
!
X ωs X [
Hεs (En )
≥ s s
(diam Ani ) − δ ≥ Hε s
En − δ,
n=1
s i,n=1 n=1

because {Ani }∞ forms a covering of ∞


S
i,n=1 n=1 En . Passing to the limit with δ → 0 yields

∞ ∞ ∞
!
X X [
Hs (En ) ≥ Hεs (En ) ≥ Hεs En .
n=1 n=1 n=1

Now passing to the limit with ε → 0 yields the result. We are left with the verification
of the metric condition. Let dist (E, F ) > 0. It suffices to show that

Hεs (E ∪ F ) = Hεs (E) + Hεs (F ) (13)

for all ε < dist (E, F ). Let



[
E∪F ⊂ Ai , diam Ai < ε
i=1

be a covering of E ∪ F . We can assume that

Ai ∩ (E ∪ F ) 6= ∅ (14)

12
for all i, as otherwise we could remove Ai from the covering. Since diam Ai < dist (E, F )
it follows from (14) that Ai has a nonempty intersection with exactly one set E or F .
Accordingly, the family {Ai }i splits into two disjoint subfamilies

{Bj }j = all the sets Ai such that Ai ∩ E 6= ∅;

{Cj }j = all the sets Ai such that Ai ∩ F 6= ∅.


Hence

ωs X s ωs X s ωs X
(diam A i ) = (diam Bj ) + (diam Cj )s
2s i=1 2s j 2s j
≥ Hεs (E) + Hεs (F ).

Since {Ai }i was an arbitrary covering of E ∪ F such that diam Ai < ε, we conclude
upon taking the infimum that

Hεs (E ∪ F ) ≥ Hεs (E) + Hεs (F ).

Since the opposite inequality is obvious (13) follows. 2


Exercise. Show that if

• Hs (E) < ∞, then Ht (E) = 0 for all t > s;

• Hs (E) > 0, then Ht (E) = ∞ for all 0 < t < s.

Definition. The Hausdorff dimension is defined as follows. If Hs (E) > 0 for all
s ≥ 0, then dimH (E) = ∞. Otherwise we define

dimH (E) = inf{s ≥ 0 : Hs (E) = 0}.

It follows from the exercise that there is s ∈ [0, ∞] such that Ht (E) = 0 for t > s
and Ht (E) = ∞ for 0 < t < s. Hausdorff dimension of E equals s.

1.6 Lebesgue measure.

For a closed interval


P = [a1 , b1 ] × . . . × [an , bn ] ⊂ IRn (15)
we define its volume by
|P | = (b1 − a1 ) · . . . · (bn − an )
and for A ⊂ IRn we define

X
L∗n (A) = inf |Pi |
i=1

13
where the infimum is taken over all coverings

[
A⊂ Pi
i=1

by intervals as in (15).
Arguments similar to those in the proof that the Hausdorff measure is an outer
metric measure give the following result

Theorem 11 L∗n is a metric outer measure.

Definition. L∗n is called outer Lebesgue measure. L∗n -measurable sets are called
Lebesgue measurable. L∗n restricted to Lebesgue measurable sets is called Lebesgue
measure and is denoted by Ln .

Corollary 12 All Borel sets are Lebesgue measurable.

All sets with L∗n (A) = 0 are Lebesgue measurable. Then Ln (A) = 0. A set has
Lebesgue measure zero if and only it it can be covered by a sequence of intervals with
whose sum of volumes can be arbitrarily small. Note that this is the same as the sets
of measure zero in the setting of Riemann integral.

Theorem 13 If P is a closed interval as in (15), then


Ln (P ) = |P |.

In what follows we will often write |A| to denote the Lebesgue measure of a Lebesgue
measurable set A.
Proof. PIn order to prove the theorem we have to show that if P ⊂ ∞
S
i=1 Pi , then

|P | ≤ i=1 |Pi |.
We will need the following quite obvious special case of this fact.

Lemma 14 If P ⊂ ki=1 Pi is a finite covering of a closed interval P by closed intervals


S

Pi , then |P | ≤ ki=1 |Pi |.


P

Proof. This lemma follows from the theory of Riemann integral.2 Indeed, volume of
Pk interval equals the integral of the characteristic function. Hence inequality χP ≤
the
i=1 χPi implies
Z Xk Z Xk
|P | = χP ≤ χPi = |Pi |.
IRn i=1 IRn i=1
2
This lemma can be proved in a more elementary way without using the Riemann integral, but the
proof would be longer.

14
The proof is complete. 2
Now we can complete the proof of the theorem. Observe that each interval Pi is
contained in a slightly bigger open interval Piε such that
ε
|Piε | = |Pi | + ,
2i
S∞
where Piε is the closure of Piε . Now from the covering P ⊂ i=1 Piε we can select a
finite subcovering
[k
P ⊂ Piεj
j=1

(compactness) and hence the lemma yields


k
X ∞
X ∞
X
|P | ≤ |Piεj | ≤ |Piε | = |Pi | + ε.
j=1 i=1 i=1

Since ε > 0 was arbitrary the theorem follows. 2

Proposition 15

• If P = (a1 , b1 ) × . . . (an , bn ) is a bounded interval, then

Ln (P ) = (b1 − a1 ) · . . . · (bn − an ).

• Ln (∂P ) = 0.

• If P is an unbounded interval in IRn and has nonempty interior (i.e. each side
has positive length), then Ln (P ) = ∞.

Proof. The first claim follows from the observation that P can be approximated from
inside and from outside by closed intervals that differs arbitrarily small in volume. The
second claim follows from the first one Ln (∂P ) = Ln (P ) − Ln (int P ) = 0, and the lats
claim follows from the fact that P contains closed intervals of arbitrarily large measure.
2
IRk ⊂ IRn , k < n is a subset generated by the first k coordinates.

Corollary 16 If k < n, then Ln (IRk ) = 0.

The next result shows that we can compute Lebesgue measure of an open set by
representing it as a union of cubes. We call a cube dyadic if its sidelength equals 2k for
some k ∈ ZZ.

15
Theorem 17 An arbitrary open set in IRn is a union of closed dyadic cubes with
pairwise disjoint interiors. Hence Lebesgue measure of the open set equals the sum of
the measures of these cubes.

Proof. Instead of a formal proof we will show a picture which explains how to represent
any open set as a union of cubes with sidelength 2−k , k = 0, 1, 2, 3, . . .

The proof is complete 2


Compare the following result with Theorem 7.

Theorem 18 For an arbitrary set E ⊂ IRn

L∗n (E) = inf Ln (U ),

where the infimum is taken over all open sets U such that E ⊂ U .

Proof. L∗n = inf i |Pi | and each interval Pi is contained in an open set (interval) of
P
slightly bigger measure. 2
Every Borel set B is Lebesgue measurable. Every set E with L∗ (E) = 0 is Lebesgue
measurable. Hence sets of the form B ∪ E are Lebesgue measurable. These are all
Lebesgue measurable sets. More precisely we have the following characterization.

16
Definition. Let X be a metric space. By a Gδ set we mean a set of the form
A = S∞
T
G
i=1 i , where the sets Gi ⊂ X are open and by Fσ we mean a set of the form

B = i=1 Fi , where the sets Fi ⊂ X are closed. Clearly all Gδ and Fσ sets are Borel.

Theorem 19 Let A ⊂ IRn . Then the following conditions are equivalent

1. A is Lebesgue measurable;
2. For every ε > 0 there is an open set G such that A ⊂ G and L∗n (G \ A) < ε;
3. There is a Gδ set H such that A ⊂ H and L∗n (H \ A) = 0;
4. For every ε > 0 there is a closed set F such that F ⊂ A and L∗n (A \ F ) < ε;
5. There is a Fσ set M such that M ⊂ A and L∗n (A \ M ) = 0;
6. For every ε > 0 there is an open set G and a closed set F such that F ⊂ A ⊂ G
and Ln (G \ F ) < ε.

Proof. (1)⇒(2)S Every measurable set can be represented as a union of sets with finite
measure A = ∞ i=1 Ai , Ln (Ai ) < ∞. It follows from Theorem 18 that for every Si∞there
i
is an open set Gi such that Ai ⊂ Gi and Ln (Gi \ Ai ) < ε/2 . Hence A ⊂ G = i=1 Gi ,
Ln (G \ A) < ε.
(2)⇒(3) We define H = ∞ ∗
T
i=1 Ui , where Ui are open sets such that A ⊂ Ui , Ln (Ui \A) <
1/i.
(3)⇒(1) This implication is obvious.
(1)⇔(4)⇔(5) this equivalence follows from the equivalence of the conditions (1), (2)
and (3) applied to the set IRn \ A.
(1)⇒(6) If A is Lebesgue measurable then the existence of the sets F and G follows
from the conditions (2) and (4).

T∞and open sets Fi ,∗ Gi such that Fi ⊂ A ⊂ Gi , Ln (Gi \ Fi ) < 1/i.


(6)⇒(1) Take closed
Then the set H = i=1 Gi is Gδ and Ln (H \ A) = 0. 2
The Lebesgue measure has an important property of being invariant under trans-
lations i.e. if A is a translation of a Lebesgue measurable set B, then A is Lebesgue
measurable and Ln (A) = Ln (B). This property follows immediately from the definition
of the Lebesgue measure. We will show not that this property implies that there is a
set which is not Lebesgue measurable.

Theorem 20 (Vitali) There is a set E ⊂ IRn which is not Lebesgue measurable.3


3
The reader will easily check that the same proof applies to any translation invariant measure µ
such that the measure of the unit cube if positive and finite; see also Theorem 21.

17
Proof. We will prove the theorem for n = 1 only, but the same argument works for
an arbitrary n. The idea of the proof is to find a countable family of pairwise disjoint
and isometric (by translation)sets whose union E contains (0, 1) and is contained in
(−1, 2). If the sets in the family are isometric to a set V , then V cannot be Lebesgue
measurable, because measurability of V would imply that a number between 1 and
3 (measure of E) be equal to infinite sum of equal numbers (equal to measure of V )
which is impossible. It should not be surprising that the construction of the set V has
to involve axiom of choice.
For x, y ∈ (0, 1) we write x ∼ y if x − y is a rational number. Clearly ∼ is an
equivalence relation and hence (0, 1) is the union of a family F of pairwise disjoint sets
[x] = {y ∈ (0, 1) : x ∼ y}.
It follows from the axiom of choice that there is a set V ⊂ (0, 1) which contains exactly
one element from each set in the family F. Let
[
E= Va , where Va = V + a = {x + a : x ∈ V }.
a∈Q∩(−1,1)

It is easy to see that

• Va ∩ Vb = ∅ if a, b ∈ Q, a 6= b;
• (0, 1) ⊂ E ⊂ (−1, 2).

Suppose that V is measurable. Then each Va is measurable, L1 (V ) = L1 (Va ) and E is


measurable. If L1 (V ) > 0, then L1 (E) = ∞ which is impossible and if L1 (V ) = 0, then
L1 (E) = 0 which is also impossible. That proves that the set V cannot be measurable.
2
The Lebesgue measure is translation invariant and Ln ([0, 1]n ) = 1. It turns out that
the above two properties uniquely determine Lebesgue measure. This proves that the
Lebesgue measure is in a sense the only natural way of measuring length, area, volume
of general sets in one, two, three,. . . dimensional Euclidean spaces.

Theorem 21 If µ is a measure on B(IRn ) such that µ(a + E) = µ(E) for all a ∈ IRn ,
E ∈ B(IRn ) and µ([0, 1]n ) = 1, then µ(E) = Ln (E) on B(IRn ).

Proof. According to Corollary 8 it suffices to prove that µ(U ) = Ln (U ) for all open
sets U .
Consider an open cube Qk = (0, 2−k )n , where k is a positive integer. There are 2kn
pairwise disjoint open cubes contained in the unit cube [0, 1]n , each being a translation
of Qk . Since the measure is invariant under translations we conclude that
µ(Qk ) ≤ 2−kn .

18
It is easy to see that we can cover the boundary ∂Q by translations of Qk in a way that
the sum of µ-measures of the cubes covering ∂Q goes to zero as k → ∞. This proves
that µ(∂Q) = 0.
Now [0, 1]n can be covered by 2kn cubes each being a translation of the closed cube
k
Q . Since the measure is invariant under translations we conclude that

µ(Qk ) ≥ 2−kn

and hence
µ(Qk ) = µ(Qk ) = 2−kn = Ln (Qk ). (16)
As we know (Theorem 17) an arbitrary open set U is a union of cubes with pairwise
disjoint interiors and sidelength 2−k (k depends on the cube). Cubes are not disjoint,
but meet along the boundary which has µ-measure zero. Hence the µ(U ) equals the
sum of the µ-measures of the cubes which according to (16) equals Ln (U ). The proof
is complete. 2

Theorem 22 Hn = Ln on B(IRn ).

Sketch of the proof. Observe that the Hausdorff measure Hn in IRn has the property
that
Hn (E + a) = Hn (E) for all E ∈ B(IRn ).
Therefore in order to prove that Hn = Ln on B(IRn ) it suffices to prove that

Hn ([0, 1]n ) = 1

This innocently looking result is however very difficult and we will not prove it here. 2
Exercise. Using Theorem 22 prove that Hn = L∗n on the class of all subsets of IRn .
If s < n, then Hs is the s-dimensional measure in IRn . Not that s is not required
to be an integer. If k < n is an integer and M ⊂ IRn is a k-dimensional manifold,
then one can prove that Hk coincides on M with the natural k-dimensional Lebesgue
measure on M .4
Examples. If C ⊂ [0, 1] is the standard ternary Cantor set, then one can prove that
dimH (C) = log 2/ log 3 = 0.6309 . . . Moreover Hs (C) = 1 for s = log 2/ log 3.
The Van Koch curve is defined as the limiting curve for the following construction
4
We have already seen how in the course of Advanced Calculus to measure subsets of M in the
setting of the Riemann integral. This construction can easily be generalized to the Lebesgue measure
on M . We will discuss it with more details later.

19
The Van Koch curve, denoted by K is homeomorphic to the circle S 1 , but dimH K =
log 4/ log 3 = 1, 261 . . .

Proposition 23 A set E ⊂ IRn has Lebesgue measure zero if and only if for every ε >
there is a family of balls B(xi , ri )}∞
i=1 such that

[ ∞
X
E⊂ B(xi , ri ), rin < ε.
i=1 i=1

Proof. ⇒ Each bass B(xi , ri ) is contained in a cube with sidelength


P∞ 2ri and hence E
can be covered by a sequence of cubes whose sum of volumes is i=1 (2ri ) < 2n ε.
n

⇐ If a set E ⊂ IRn has Lebesgue measure zero, then for every ε > 0 there is an open
set U such that E ⊂ U and Ln (U ) < ε. The set U can be represented as union of cubes
with pairwise disjoint interiors
√ (Theorem 17). each cube Q of sidelength ` is contained
in a ball of radius r = n` and hence rn P = nn/2 |Q|. Therefore E can be covered by a

sequence of balls {B(xi , ri )}∞
i=1 such that
n
i=1 ri < n
n/2
ε. 2
Definition. We say that a function between metric spaces f : (X, d) → (Y, ρ) is
Lipschitz continuous if there is a constant L > 0 such that
ρ(f (x), f (y)) ≤ Ld(x, y) for all x, y ∈ X.
In this case we say that f is L-Lipschitz continuous and call L Lipschitz constant if f .

Proposition 24 If f : IRn → IRn is Lipschitz continuous and E ⊂ IRn has Lebesgue


measure zero, then f (E) has Lebesgue measure zero too.

Proof. It follows from Proposition 23 and the fact that the L-Lipschitz image of a ball
of radius r is contained in a ball of radius Lr. 2
Exercise. Let f : IRn → IRn be a homeomorphism. Prove that f (A) is Borel if and
only if A is Borel.
Although the theory of Lebesgue measure seems to work very well something very
important is missing: we haven’t proved so far that if Q is a unit cube whose sides are
not parallel to coordinate axes, then its measure equals 1. Indeed we considered only
intervals with sides parallel to coordinate directions. Fortunately this important fact
is contained in a result that we will prove now.

20
Theorem 25 If L : IRn → IRn is a nondegenerate linear transformation with the
matrix A (det A 6= 0), then L(E) ⊂ IRn is a Borel (Lebesgue measurable) set if and
only if E ⊂ IRn is Borel (Lebesgue measurable) set. Moreover

Ln (L(E)) = | det A|Ln (E). (17)

Proof. Since L is a homeomorphism it preserves the class of Borel sets (see Exercise).
Since both L and L−1 are Lipschitz continuous L preserves the class of sets of measure
zero (Proposition 24). Therefore L preserves the class of Lebesgue measurable sets
(because each Lebesgue measurable set is the union of a Borel set and a set of measure
zero). It follows from Proposition 24 that the formula (17) is satisfied for sets of measure
zero. Therefore it remains to prove (24) for Borel sets.
Define a new measure µ(E) = Ln (L(E)). Clearly µ is a measure on B(IRn ) that is
translation invariant. Let µ([0, 1]n ) = a > 0. Hence the measure a−1 µ satisfies all the
assumptions of Theorem 21 and therefore a−1 µ(E) = Ln (E), so we have

Ln (L(E)) = aLn (E) for all E ∈ B(IRn ).

It remains to prove that a = | det A|. Clearly a = a(A) : GL (n) → (0, ∞) is a function
defined on the class of invertible matrices. We have

a(A1 A2 ) = a(A1 )a(A2 ). (18)

Indeed, if A1 , A2 are matrices representing linear transformations L1 and L2 , then

a(A1 A2 )Ln (E) = Ln (L1 (L2 (E)) = a(L1 )Ln (L2 (E)) = a(L1 )a(L2 )Ln (E).

Also
a(sn I) = sn , s > 0. (19)
Here I is the identity matrix. This property is quite obvious and we leave a formal
proof to the reader. Now it remains to prove the following fact.

Lemma 26 Let a : GL (n) → (0, ∞) be a function with properties (18) and (19). Then
a(A) = | det A|.

Proof. Let Ai (s) be the diagonal matrix with −s on the ith place and s on all other
places on the diagonal. Since Ai (s)2 = s2 I we have a(Ai (s))2 = a(Ai (s)2 ) = s2n and
hance a(Ai (s)) = |sn | = | det Ai (s)|.
For k 6= l let Bkl (s) = [aij ] be the matrix such that akl = s, aii = 1, i = 1, 2, . . . and
all other entries equal zero. Multiplication by the matrix Bkl (s) from the right (left) is
equivalent to adding kth column (`th row) multiplied by s to `th column (kth row).
It is well known from linear algebra (and easy to prove) that applying such opera-
tions to any nonsingular matrix A it can be transformed to a matrix of the form tI or

21
An (t). Since multiplication by Bkl (s) does not change determinant, t = | det A|1/n . It
remains to prove that a(Bkl (s)) = 1. Since Bkl (−s) = Ak (1)Bkl (s)Ak (1) we have
that a(Bkl (−s)) = a(Bkl (s)). On the other hand Bkl (s)Bkl (−s) = I and hence
a(Bkl (s))2 = 1, so a(Bkl (s)) = 1. The proof of the lemma and hence the proof of
the theorem is complete. 2

2 Integration

2.1 Measurable functions.

Definition. Let (X, M) be a measurable space and Y a metric space. We say that a
function f : X → Y is measurable if

f −1 (U ) ∈ M for every open set U ⊂ Y .

If E ∈ M is a measurable set, then we say that f : E → Y is measurable if f −1 (U ) ∈ M


for every open U ⊂ Y .
Clearly, f : E → Y is measurable if it can be extended to a measurable function
˜
f : X → Y , but there is also another point of view.

ME = {A ⊂ E : A ∈ M} = {B ∩ E : B ∈ M}

is a σ-algebra and measurability of f : E → Y is equivalent to measurability of f with


respect to ME .

Theorem 27 Let (X, M) be a measurable space, f : X → Y a measurable function


and g : Y → Z a continuous function. Then the function g ◦ f : X → Z is measurable.

Proof. For every open set U ⊂ Z we have

(g ◦ f )−1 (U ) = f −1 (g −1 (U ) ) ∈ M.
| {z }
open in Y

The proof is complete. 2

Theorem 28 Let (X, M) be a measurable space and Y a metric space. If the functions
u, v : X → IR are measurable and Φ : IR2 → Y is continuous, then the function

h(x) = Φ(u(x), v(x)) : X → Y

is measurable.

22
Proof. Put f (x) = (u(x), v(x)), f : X → IR2 . According to Theorem 27 it suffices to
prove that f is measurable. Let R = (a, b) × (c, d) be an open rectangle. Then

f −1 (R) = f −1 ((a, b) × (c, d)) = u−1 ((a, b)) ∩ u−1 ((c, d)) ∈ M.

Every open set U ⊂ IR2 is a countable union of open rectangles



[
U= Ri
i=1

and hence ∞
[
f −1 (U ) = f −1 (Ri ) ∈ M.
i=1

The proof is complete. 2


This theorem has many applications.

(a) If f = u + iv : X → C, where u, v : X → IR are measurable, then f is measurable.

Indeed, we take Φ(u, v) = u + iv.

(b) If f = u + iv : X → C is measurable, then the functions u, v, |f | : X → IR are


measurable.

Indeed, we take Φ1 (u + iv) = u, Φ2 (u + iv) = v, Φ3 (u + iv) = |u + iv|.

(c) If f, g : X → C are measurable, then the functions f +g, f g : X → C are measurable.

Indeed, we take Φ1 : C × C → C, Φ1 (z1 , z2 ) = z1 + z2 and Φ2 : C × C → C,


Φ2 (z1 , z2 ) = z1 z2 .

(d) A set E ⊂ X is measurable if and only if the characteristic function if E



1 if x ∈ E,
χE (x) =
0 if x 6∈ E,

is measurable.

Indeed, this follows from the fact that for any open set U ⊂ IR


 ∅ if 0 6∈ U , 1 6∈ U ,
E if 0 6∈ U , 1 ∈ U,

(χE )−1 (U ) =

 X \E if 0 ∈ U, 1 6∈ U ,
X if 0 ∈ U, 1 ∈ U,

23
(e) If f : X → C is a measurable function, then there is a measurable function α : X →
C such that |α| = 1 and f = α|f |.

The set E = {x : f (x) = 0} is measurable because



\  1 
E= f −1 B(0, ) .
i=1 | {z i }
disc of radius 1/i

Hence the function h(x) = f (x)+χE (x) is measurable and h(x) 6= 0 for all x. Therefore
h is a measurable function as a function h : X → C \ {0}. Since ϕ : C \ {0} → C,
ϕ(z) = z/|z| is continuous we conclude that

α(x) = ϕ(f (x) + χE (x))

is measurable. If x ∈ E, then α(x) = 1 and if x 6∈ E, then α(x) = f (x)/|f (x)|.


Therefore f (x) = α(x)|f (x)| for all x ∈ X.
Definition. Let X, Y be two metric spaces. We say that f : X → Y is a Borel
mapping if it is measurable with respect to the σ-algebra of Borel sets in X i.e.

f −1 (U ) ∈ B(X) for every open set U ⊂ Y .

Clearly every continuous mapping is Borel.

Theorem 29 Let (X, M) be a measurable space and Y a metric space. Let also f :
X → Y be a measurable mapping.

(a) If E ⊂ Y is a Borel set, then f −1 (E) ∈ M.

(b) If Z is another metric space and g : Y → Z is a Borel mapping, then g◦f : X → Z


is measurable.

Proof. (a) Let R = {A ⊂ Y | f −1 (A) ∈ M}. Clearly R contains all open sets. If we
prove that R is a σ-algebra, we will have that B(Y ) ⊂ R which is the claim.

• Y ∈R

Indeed, f −1 (Y ) = X ∈ M.

• If A ∈ R, then Y \ A ∈ R.

Indeed, f −1 (Y \ A) = X \ f −1 (A) ∈ M.

24
S∞
• If A1 , A2 , . . . ∈ R, then i=1 Ai ∈ R.

Indeed, f −1 ( ∞
S S∞ −1
i=1 A i ) = i=1 f (Ai ) ∈ R. The above three properties imply that R
is a σ-algebra.
(b) If U ⊂ Z is open, then g −1 (U ) ⊂ Y is Borel and hence (g◦f )−1 (U ) = f −1 (g −1 (U )) ∈
M by (a). 2
Definition. Let (X, M) be a measurable space. Let IR = IR ∪ {−∞, ∞}. We say
that a function f : X → IR is measurable if f −1 (U ) is measurable for every open set
U ⊂ IR and if the sets f −1 (+∞), f −1 (−∞) are measurable. Similarly for E ∈ M we
define measurable functions f : E → IR.
Exercise. Let (X, M) be a measurable space. Prove that a function f : X → IR is
measurable if and only if f −1 ((a, ∞]) ∈ M for all a ∈ IR.
Let {an } be a sequence in IR. For k = 1, 2, . . . put

bk = sup{ak , ak+1 , ak+2 , . . .} (20)

and
β = inf{b1 , b2 , . . .} . (21)
We call β upper limit of {an } and write

β = lim sup an
n→∞

Clearly {bk } is a decreasing sequence, so β = limk→∞ bk . It is easy to see that there


is a subsequence {ani } of {an } such that limi→∞ ani = β. Moreover β is the largest
possible value for limits of all subsequences of {an }.
The lower limit is defined by

lim inf an = − lim sup(−an ). (22)


n→∞ n→∞

Equivalently the lower limit can be defined by interchanging sup and inf in (20) and
(21). The lower limit equals to the lowest possible value for limits of subsequences of
{an }.
For a sequence of functions fn : X → IR we define new functions
supn fn , lim supn fn : X → IR by
 
sup fn (x) = sup fn (x) for every x ∈ X,
n n
 
lim sup fn (x) = lim sup fn (x) for every x ∈ X.
n n

25
Similarly we can define functions inf n fn and lim inf n→∞ fn .
If f (x) = limn→∞ fn (x) for every x ∈ X, then we say that f is a pointwise limit of
{fn }.

Theorem 30 Suppose that fn : X → IR is a sequence of measurable functions. Then


the functions supn fn , inf n fn , lim supn→∞ fn , lim inf n→∞ fn are measurable.

Proof. Since
 −1 ∞
[
sup fn ((a, ∞]) = fn−1 ((a, ∞]) ∈ M,
n
n=1

measurability of supn fn follows from the exercise. By a similar argument we prove


that the function inf n fn is measurable. The two facts imply measurability of lim sup fn
because
lim sup fn = inf {sup fi }.
n→∞ k≥1 i≥k

Now measurability of lim inf n fn is an obvious consequence of the equality

lim inf fn = − lim sup(−fn ).


n→∞ n→∞

The proof is complete. 2

Corollary 31 The limit of every pointwise convergent sequence of functions fn : X →


C (or fn : X → IR) is measurable.

Corollary 32 If f, g : X → IR are measurable functions, then the functions max{f, g},


min{f, g} are measurable. In particular the fuctions

f + = max{f, 0}, f − = − min{f, 0}

are measurable.

The functions f + and f − are called positive part and negative part of f . Note that
f = f + − f − , |f | = f + + f − .
Corollary 31 has the following generalization to the case of metric space valued
mappings.

Theorem 33 Let fn : X → Y be a sequence of measurable mappings with values in


a separable metric space. If fn (x) → f (x) for every x ∈ X, then f : X → Y is
measurable.

26
Proof. Let {yi }∞ i=1 be a countable dense set. Then the functions di : Y → IR, di (y) =
d(yi , y) are continuous. Hence di ◦ fn : X → IR are measurable. Since di ◦ fn → di ◦ f ,
the functions di ◦ f : X → IR are measurable too by Corollary 31. In particular the
sets
{x : d(yi , f (x)) ≥ d(yi , F )} = (di ◦ f )−1 ([d(yi , F ), ∞))
are measurable. To prove measurability of f it suffices to prove measurability of f −1 (F )
for every closed set F and it follows from the equality

\
−1
f (F ) = {x : d(yi , f (x)) ≥ d(yi , F )} . (23)
i=1

To prove this equality observe that if x ∈ f −1 (F ), then f (x) ∈ F and hence


d(yi , f (x)) ≥ d(yi , F ) for every i, so x belongs to the right hand side of (23). On
the other hand if x 6∈ f −1 (F ), then f (x) 6∈ F and hence there is i such that
d(yi , f (x)) < d(yi , F ) (because {yi }∞
i=1 is a dense subset of Y and so we can find yi
arbitrarily close to f (x)), therefore x cannot belong to the right hand side of (23). 2
Definition. Let (X, M) be a measurable space. A measurable function s : X → C is
called simple function if it has only a finite number of values. Simple functions can be
represented in the form
Xn
s= αi χAi ,
i=1

where αi 6= αj for i 6= j and Ai are disjoint measurable sets.

Theorem 34 If f : X → [0, ∞] is measurable, then there is a sequence of simple


functions sn on X such that

1. 0 ≤ s1 ≤ s2 ≤ . . . ≤ f ;

2. limn→∞ sn (x) = f (x) for every x ∈ X.

If in addition f is bounded, then sn converges to f uniformly.

Proof. It is easy to see that the sequence



n if f (x) ≥ n,
sn (x) = k k
2k
if 2n
≤ f (x) < k+1
2n
≤ n,

has all properties that we need. 2

27
2.2 Integral

In the integration theory we will assume that 0 · ∞ = 0.


Definition. If s : X → [0, ∞) is a simple function of the form
n
X
s= αi χAi ,
i=1

where αi 6= αj for i 6= j and Ai are pairwise disjoint measurable sets, then for any
E ∈ M we define Z Xn
s dµ = αi µ(Ai ∩ E) .
E i=1

According to our assumption 0 · ∞ = 0, so 0 · µ(Ai ∩ E) = 0, even if µ(Ai ∩ E) = ∞.


If f : X → [0, ∞] is measurable, and E ∈ M, we define the Lebesgue integral of f
over E by Z Z
f dµ = sup s dµ,
E E
where the supremum is over all simple functions s such that 0 ≤ s ≤ f .

R R
(a) If 0 ≤ f ≤ g, then f dµ ≤ E g dµ.
E
R R
(b) If A ⊂ B and f ≥ 0, then A f dµ ≤ B f dµ.
R R
(c) If f ≥ 0 and c is a constant 0 ≤ c < ∞, then E f dµ = c E f dµ.
R
(d) If f (x) = 0 for all x ∈ E, then E f dµ = 0, even if µ(E) = ∞.
R
(e) If µ(E) = 0, then E f dµ = 0, even if f (x) = ∞ for all x ∈ E.
R R
(f) If f ≥ 0, then E f dµ = X χE f dµ.

In the proof of the next two theorems we will need the following lemma.

Lemma 35 Let s and t be nonnegative measurable simple functions on X. Then the


function Z
ϕ(E) = s dµ, E ∈ M
E
defines a measure in M. Moreover
Z Z Z
(s + t) dµ = s dµ + t dµ . (24)
X X X

28
Proof. Since ϕ(∅) = 0, it remains to prove that ϕ is countably additive. If E1 , E2 , . . .
are disjoint measurable sets, then

! n ∞
! n ∞
[ X [ X X
ϕ Ek = αi µ Ek ∩ Ai = αi µ(Ek ∩ Ai )
k=1 i=1 k=1 i=1 k=1
X∞ Xn ∞
X
= αi µ(Ek ∩ Ai ) = ϕ(Ek ) .
k=1 i=1 k=1

This completes the proof of the first part. Let now


k
X m
X
s= αi χAi , t= βj χBj
i=1 j=1

be a representation of the simple functions as in the definition, then for Eij = Ai ∩ Bj


we have Z Z Z
s + t dµ = (αi + βj )µ(Eij ) = s dµ + t dµ .
Eij Eij Eij

Since X is a union of disjoint sets Eij we conclude (24) from the first part of the lemma.
2
The following theorem is one of the most important results of the theory of inte-
gration.

Theorem 36 (Lebesgue monotone convergence theorem) Let {fn } be a se-


quence of measurable functions on X such that:

1. 0 ≤ f1 ≤ f2 ≤ . . . ≤ ∞ for every x ∈ X;

2. limn→∞ fn (x) = f (x) for every x ∈ X.

Then f is measurable and Z Z


lim fn dµ = f dµ
n→∞ X X

R
Proof. Measurability of f follows from R Corollary 31. The sequence XRfn dµ is increasing
R
and hence it hasRa limit α = limn→∞ X fn dµ. Since fn ≤ f , we have X fn dµ ≤ X f dµ
and hence α ≤ X f dµ . Let 0 ≤ s ≤ f be a simple function. Fix a constant 0 < c < 1
and define
En = {x : fn (x) ≥ cs(x)}, n = 1, 2, 3, . . .
It is easy to see that E1 ⊂ E2 ⊂ . . . and ∞
S
n=1 En = X (because fn (x) → f (x) for every
x ∈ X.) Hence Z Z Z
fn dµ ≥ fn dµ ≥ c s dµ . (25)
X En En

29
R
Since ϕ(E) = E
s dµ is a measure and E1 ⊂ E2 ⊂ . . . we conclude (Theorem 3(f)) that

Z ! Z
[
s dµ = ϕ(En ) → ϕ En = ϕ(X) = s dµ .
En n=1 X
R
Thus after passing to the limit in (25) we obtain α ≥ c E
s dµ and hence
Z Z
f dµ ≥ α ≥ s dµ ,
X X

because 0 < c < 1 was arbitrary. Taking supremum over all simple functions 0 ≤ s ≤ f
yields Z Z
f dµ ≥ α ≥ f dµ ,
X X
R
i.e. X
f dµ = α and the theorem follows. 2

Theorem 37 If fn : X → [0, ∞] is measurable, for n = 1, 2, 3, . . . and



X
f (x) = fn (x) for all x ∈ X,
n=1

then f is measurable and


Z ∞ Z
X
f dµ = fn dµ .
X i=1 X

Proof. The function f is a limit of measurable functions, so it is measurable. Let {si }


and {ti } be increasing sequences of simple nonnegative function such that si → f1 ,
ti → f2 (Theorem 34). Then si + ti → f1 + f2 and hence (24) combined with the
monotone convergence theorem gives
Z Z Z
f1 + f2 dµ = f1 dµ + f2 dµ .
X X X

Applying an induction argument we obtain


Z N Z
X
(f1 + f2 + . . . + fN ) dµ = fn dµ .
X n=1 X

Since gN = f1 + . . . + fN is an increasing sequence of measurable functions and gN → f


the theorem follows from the monotone convergence theorem. 2

Theorem 38 (Fatou’s lemma) If fn : X → [0, ∞] is measurable, for n = 1, 2, 3, . . .,


then Z   Z
lim inf fn dµ ≤ lim inf fn dµ .
X n→∞ n→∞ X

30
Exercise. Construct a sequence of measurable functions fn : IR → [0, ∞] such that the
inequality in Fatou’s lemma is sharp.
Proof of the theorem. Let gn = inf{fn , fn+1 , fn+2 , . . .}. Then gn ≤ fn and hence
Z Z
gn dµ ≤ fn dµ . (26)
X X

Since 0 ≤ g1 ≤ g2 ≤ . . ., all the functions gn are measurable (Theorem 30) and


gn → lim inf k→∞ fn , inequality (26) combined with the monotone convergence theorem
yields the result. 2

Theorem 39 Let f : X → [0, ∞] be measurable. Then


Z
ϕ(E) = f dµ for E ∈ M
E

defines a measure in M. Moreover


Z Z
g dϕ = gf dµ (27)
X X

for every measurable function g : X → [0, ∞].

Proof. Clearly ϕ(∅) = 0. To prove that ϕ is a measure S it remains to prove countable


additivity. Let E1 , E2 , . . . ∈ M be disjoint sets and E = ∞n=1 En . Then obviously


X
χE f = χEn f
n=1

and
Z ∞
Z X
ϕ(E) = χE f dµ = χEn f dµ
X X n=1
∞ Z
X X∞
= χEn f dµ = ϕ(En ) ,
n=1 X n=1

by Theorem 37. To prove (27) observe that it is true if g = χE , E ∈ M. Hence it is also


true for simple functions and the general case follows from the monotone convergence
theorem combined with Theorem 34. 2
Definition. Let (X, M, µ) be a measure space. R The space L1 (µ) consists of all
measurable functions f : X → C for which kf k1 = X |f | dµ < ∞. Elements of L1 (µ)
are called Lebesgue integrable functions or summable functions.

31
If f = u + iv, then the functions u+ , u− , v + , v − are measurable and Lebesgue
integrable as bounded by |f |. For a set E ∈ M we define
Z Z Z Z Z 
+ − + −
f dµ = u dµ − u dµ + i v dµ − v dµ .
E E E E E

f − dµ is finite, then
R R
If f : X → IR is measurable and each of the integrals E f + dµ, E
we also define Z Z Z
f dµ = +
f dµ − f − dµ .
E E E

Theorem 40 Suppose that f, g ∈ L1 (µ). If α, β ∈ C, then αf + βg ∈ L1 (µ) and


Z Z Z
(αf + βg) dµ = α f dµ + β g dµ .
X X X

We leave the proof as a simple exercise.


R
This theorem says that L1 (µ) is a linear space and f 7→ X
f dµ is a linear functional
in L1 (µ).
Similarly we define L1 (µ, IRn ) Ras the class of all measurable mappings f =
(f1 , f2 , . . . , fn ) : X → IRn such that X |f | dµ < ∞ and then we set
Z Z Z 
f dµ = f1 dµ, . . . , fn dµ .
X X X

Theorem 41 If f ∈ L1 (µ, IRn ), then


Z Z

f dµ ≤ |f | dµ .

X X

Proof. Let α = (α1 , . . . , αn ) ∈ IRn be a unit vector such that |


R R
X
f dµ| = hα, X
f dµi.
Then
Z Z Xn Z

f dµ = hα, f dµi = αi fi dµ

X X i=1 X
n
Z X Z Z
= αi fi dµ = hα, f i dµ ≤ |f | dµ
X i=1 X X

by Schwarz inequality. 2

Theorem 42 (Lebesgue’s dominated convergence theorem) Suppose {fn } is a


sequence of complex valued functions on X such that the limit f (x) = limn→∞ fn (x)
exists for every x ∈ X. If there is a function g ∈ L1 (µ) such that
|fn (x)| ≤ g(x) for every x ∈ X and all n = 1, 2, . . .,

32
then f ∈ L1 (µ), Z
lim |f − fn | dµ = 0
n→∞ X
and Z Z
lim fn dµ = f dµ.
n→∞ X X

Proof. Clearly f is measurable and |f | ≤ g. Hence f ∈ L1 (µ). It is easy to see that


2g − |f − fn | ≥ 0 and hence Fatou’s lemma yields
Z Z
2g dµ ≤ lim inf (2g − |f − fn |) dµ
X n→∞ X
Z  Z 
= 2g + lim inf − |f − fn | dµ
X n→∞ X
Z Z
= 2g − lim sup |f − fn | dµ .
X n→∞ X
R
The integral X
2g dµ is finite and hence
Z
lim sup |fn − f | dµ ≤ 0 .
n→∞ X

This immediately implies Z


lim |fn − f | dµ = 0 .
n→∞ X
Now Z Z Z

fn dµ − f dµ ≤ |fn − f | dµ → 0 ,

X X X
so Z Z
lim fn dµ = f dµ .
n→∞ X X
The proof is complete. 2

2.3 Almost everywhere

Definition. If there is a set N ∈ M of measure zero, µ(N ) = 0 such that a property


P (x) is satisfied for all x ∈ X \ N , then we say that the property P (x) is satisfied
almost everywhere (a.e.)
For example fn → f a.e. means that there is a set N ∈ M, µ(N ) = 0 such that
fn (x) → f (x) for all x ∈ X \ N .

33
If f = g a.e., then we write f ∼ g. It is easy to see that ∼ is an equivalence relation
(f ∼ g, g ∼ h ⇒ f ∼ h follows from the fact that the union of two sets of measure zero
is a set of measure zero). Note that if f ∼ g, then
Z Z
f dµ = g dµ for every X ∈ M.
E E

Therefore from the point of view of the integration


R Rtheory, functions which are equal
a.e. cannot be distinguished. Moreover if E f dµ = E g dµ for all E ∈ M, then f = g
a.e. This follows from part (b) of Theorem 43 below applied to f − g. Thus two
functions f and g cannot be distinguished in the integration theory if and only if they
are equal a.e. Therefore it is natural to identify such functions.

Theorem 43
R
(a) Suppose f : X → [0, ∞] is measurable, E ∈ M and E f dµ = 0, then f = 0 a.e.
in E.
R
(b) Suppose f ∈ L1 (µ) and E f dµ = 0 for every E ∈ M. Then f = 0 a.e. in X.

Proof. (a) Suppose An = {x ∈ E : f (x) ≥ 1/n}. Then


Z Z
1
µ(An ) ≤ f dµ ≤ f dµ = 0.
n An E
S∞
Hence µ(An ) = 0 and therefore the set {x ∈ E : f (x) > 0} = n=1 An has measure
zero.
(b) Let = u + iv. Define E = {x ∈ X : u(x) ≥ 0}. Then
Z Z Z
0 = real part of f dµ = u dµ = u+ dµ,
E E E

and hence u+ = 0 a.e. by (a). Similarly u− , v + , v − = 0 a.e. Accordingly, f = 0 a.e. 2


Recall that a measure µ is complete if every subset of a set of measure zero is
measurable (and hence has measure zero). All the measures obtained through the
Carathéodory construction are complete (Proposition 4), e.g. the Hausforff measure
and the Lebesgue measure are complete. However, we can assume that any measure
is complete, because we can add all subsets of sets of measure zero to the σ-algebra.
More precisely one can prove.

Theorem 44 Let (X, M, µ) be a measure space, let M∗ be the collection of all E ⊂ X


for which there exist sets A, B ∈ M such that A ⊂ E ⊂ B and µ(B \ A) = 0 and define
µ(E) = µ(A) in this situation. Then M∗ is a σ-algebra and µ is a complete measure
on M∗ .

34
For a proof, see Rudin page 28.
Therefore, whenever needed, we can assume that the measure is complete.
Remark. If µ is a measure on B(IRn ) and there is an uncountable set E ∈ B(IRn )
such that µ(E) = 0, then µ cannot be complete, because one can prove that the
cardinality of all Borel sets in B(IRn ) is the same as the cardinality of real numbers
and the cardinality of all subsets of E is the same as the cardinality of all subsets of real
numbers which is bigger than the cardinality of real numbers. Hence in order to make
the measure µ complete we have to enlarge B(IRn ) to a bigger σ-algebra by including
also sets which are not Borel. This is the case of the Lebesgue measure. The σ-algebra
of Lebesgue measurable sets is larger, than the σ-algebra of Borel sets.
Definition. Suppose now that µ is a complete measure. If a function with values in
a metric space Y is defined only at almost all points of X, i.e.

f : X \ N → Y, µ(N ) = 0 ,

then we say that f is measurable if there is y0 ∈ Y such that



f (x) if x ∈ X \ N ,
f˜(x) =
y0 if x ∈ N ,

is measurable. In such a situation we say that f : X → Y is a measurable function


defined a.e.
Note that it follows from the assumption that µ is complete that f can be defined
in N in an arbitrary way and still the resulting function f˜˜ is measurable and f˜˜ ∼ f˜.
In particular functions in L1 (µ) need to be defined a.e. only. L1 (µ) is a linear space.
Since we identify functions that are equal a.e. we identify L1 (µ) with the quotient space
L1 (µ)/ ∼ (check that L1 (µ)/ ∼ is a linear space). Therefore
R f ∈ L1 (µ) satisfies f = 0 in
1
L (µ) if and only if f = 0 a.e., if and only if kf k1 = X |f | dµ = 0 (see Theorem 43(a)).
In all the previously discussed convergence theorems (Theorem 36, Theorem 37,
Theorem 38, Theorem 42) we can replace the requirement of everywhere convergence
by the a.e. convergence. For example the Lebesgue dominated convergence theorem
can be formulated as follows.

Theorem 45 (Lebesgue dominated convergence theorem) Suppose that {fn } is


a sequence of complex valued functino on X defined a.e. If the limit f (x) =
limn→∞ fn (x) exists a.e. and if there is a function g ∈ L1 (µ) such that

|fn (x)| ≤ g(x) a.e., for all n = 1, 2, 3, . . .

then f ∈ L1 (µ), Z
lim |f − fn | dµ = 0
n→∞ X

35
and Z Z
lim fn dµ = f dµ .
n→∞ X X

Not surprisingly, this theorem is an easy consequence of the previous version of the
Lebesgue dominated convergence theorem. We leave details to the reader.
Recall that if Y is a metric space, then a mapping f : IRn → Y is Lebesgue
measurable if f −1 (U ) is a Lebesgue measurable set for every open U ⊂ Y . Since the
σ-algebra of Lebesgue measurable sets is larger, than that of Borel sets, the mapping
f need not be Borel. I turns out, however, that f equals a.e. to a Borel mapping, at
least when Y is separable.

Theorem 46 If Y is a separable metric space and f : IRn → Y is Lebesgue measurable,


then there is a Borel mapping g : IRn → Y such that f = g a.e.

Proof. Since Y is separable, there is a countable family of open sets {Ui }∞


i=1 in Y such
that every open set in Y is a union of open sets from the family. Indeed, it suffices to
take the family of balls with rational length radii and centers in a countable dense set
in Y .
For i = 1, 2, . . . let Mi ⊂ IRn be a Fσ set such that

Mi ⊂ f −1 (Ui ), Ln (f −1 (Ui ) \ Mi ) = 0 ,

see Theorem 19. Clearly the set E = ∞ −1


S
i=1 (f (Ui ) \ Mi ) has Lebesgue measure zero
and hence there is a Gδ set H such that E ⊂ H, Ln (H) = 0 (Theorem 19). Fix y0 ∈ Y
and define 
f (x) if x 6∈ H,
g(x) =
y0 if x ∈ H.
Clearly f = g a.e. and it remains to prove that the mapping g is Borel. It is easy to
see that5 if y0 6∈ Ui , then g −1 (Ui ) = Mi \ H and if y0 ∈ Ui , then g −1 (Ui ) = Mi ∪ H.
−1
S∞case g (Ui ) is a Borel set. Since every open set U ⊂ Y can be represented as
In each
U = j=1 Uij , the set

[
−1
g (U ) = g −1 (Uij )
j=1

is Borel. 2
5
A good picture is useful here.

36
2.4 Theorems of Lusin and Egorov

Theorem 47 (Lusin) Let X be a metric space and µ a measure in B(X) such that
X is a union of countably many open sets of finite measure. If f : X → Y is a Borel
mapping with values in a separable metric space, then for every ε > 0 there is a closed
set F ⊂ X such that µ(X \ F ) < ε and f |F is continuous.

Proof. Let {Ui }∞


i=1 be a countable family of open sets such that every open set in Y is
a union of open sets from the family. It follows from Theorem 7 that for every i there
is an open set Vi ⊂ X such that6

f −1 (Ui ) ⊂ Vi , µ(Vi \ f −1 (Ui )) < ε2−i−1 .

Let ∞
[
E= Vi \ f −1 (Ui ).
i=1

Clearly µ(E) < ε/2. We will prove that the mapping g = f |X\E is continuous. Observe
that7
g −1 (Ui ) = Vi ∩ (X \ E) . (28)
Indeed, g −1 (Ui ) ⊂ Vi ∩ (X \ E), but on the other hand

Vi ∩ (X \ E) ⊂ Vi ∩ (X \ (Vi \ f −1 (Ui ))) = f −1 (Ui ) ,

so
Vi ∩ (X \ E) ⊂ f −1 (Ui ) ∩ (X \ E) = g −1 (Ui ) .

S set U ⊂ Y
In order to prove that g is continuous it suffices to prove that for every open
there is an open set V ⊂ X such that8 g −1 (U ) = V ∩ (X \ E). Let U = ∞ j=1 Uij . Then
−1
S∞
(28) gives g (U ) = ( j=1 Vij ) ∩ (X \ E) which proves continuity of g in X \ E. The set
E need not be closed, but it follows from Theorem 7 that there is an open set G ⊂ X
such that E ⊂ G and µ(G \ E) < ε/2. Now f |F is continuous, where F = X \ G, F is
closed and µ(X \ F ) = µ(G) = µ(E) + µ(G \ E) < ε. 2
As a corollary we will prove the following characterization of Lebesgue measurable
functions also known as the Lusin theorem.
Let E ⊂ IRn be a Lebesgue measurable set. Recall that f : E → IR is Lebesgue
measurable if f −1 (U ) is Lebesgue measurable for every open U ⊂ IR (f −1 (U ) is a subset
of E).
Actually to prove this fact we need first to split f −1 (Ui ) into sets of finite measure and then apply
6

Theorem 7 to each such set.


7
It is easy to see if you make an appropriate picture, but a formal proof is enclosed below.
8
Because the class of open sets in the metric space X \ E is exactly the class of sets of the form
V ∩ (X \ E), where V ⊂ X is open.

37
Theorem 48 (Lusin) Let E ⊂ IRn be Lebesgue measurable. Then a function f : E →
IR is Lebesgue measurable if and only if for every ε > 0 there is a closed set F ⊂ E
such that f |F is continuous and Ln (E \ F ) < ε.

Proof. Since f equals a.e. to a Borel function necessity follows from Theorem 47. To
prove sufficiency it suffices to show that for every a ∈ IR the set Ea = {x : f (x) ≥ a} is
Lebesgue measurable (this is a variant of the exercise from page 25.) Let F be a closed
set such that f |F is continuous and Ln (E \ F ) < ε. Then the set F 0 = {x : (f |F )(x) ≥
a} = Ea ∩ F is closed and L∗n (Ea \ F 0 ) ≤ Ln (E \ F ) < ε, hence measurability of Ea
follows from Theorem 19. 2
In the next theorem µ is a measure on an arbitrary set X (not necessarily a metric
space), but it is crucial that µ(X) < ∞.

Theorem 49 (Egorov) Let µ be a finite measure on a set X. Let fn : X → Y be a


sequence of measurable functions with values in a separable metric space. If fn → f
a.e., then f is measurable and for every ε > 0 there is a measurable set E ⊂ X such
that µ(X \ E) < ε and fn ⇒ f uniformly on E.

Proof. Measurability of f follows from Theorem 33. Let d be the metric in Y . Define

[
Ci,j = {x : d(fn (x), f (x)) ≥ 2−i } .
n=j

We will prove that this set is measurable. First observe that the mapping F = (fn , f ) :
X → Y × Y is measurable. Indeed, if {Ui }∞ i=1 is a collection of open sets in Y such that
every open set in Y is a countable union of open sets from the family, it easily follows
that every open set in Y × Y is a countable union of open sets of the form Ui × Uj .
Therefore to prove measurability of F it suffices to observe that each set

F −1 (Ui × Uj ) = fn−1 (Ui ) ∩ f −1 (Uj )

is measurable. Since the function d : Y × Y → IR is continuous, the composed map


d ◦ F : X → IR is measurable and hence

{x : d(fn (x), f (x)) ≥ 2−i } = F −1 {(y1 , y2 ) ∈ Y × Y : d(y1 , y2 ) ≥ 2−i }


= F −1 (d−1 ([2−i , ∞)) = (d ◦ F )−1 ([2−i , ∞))

is measurable. That proves measurability of the sets Ci,j . Now observe that Ci,j is a
decreasing sequence of sets with respect to j, µ(Ci,1 ) ≤ µ(X) < ∞ and hence

\
µ(Ci,j ) → µ( Ci,k ) = 0 as j → ∞ (29)
k=1

38
by Theorem 3(g). Indeed, this intersection has measure zero because if x ∈ ∞
T
k=1 Ci,k ,
then there is a subsequence fnj (x) such that d(fnj (x), f (x)) ≥ 2−i for all j, which can be
true only for x in a set of measure zero. Now it follows from (29) that for S∞every i there
is an integer J(i) such that µ(Ci,J(i) ) < ε2−i . It is easy
S∞to see that µ( i=1 Ci,J(i) ) < ε
and that fn converges uniformly to f on E = X \ i=1 Ci,J(i) , because for every i
d(fn (x), f (x)) < 2−i for all n ≥ J(i) and all x ∈ E. 2
Again if X is a metric space and µ is a finite measure of B(X) we can assume that
the set E in the above theorem is closed, see Theorem 7.
Exercise. Show an example that if µ(X) = ∞, then the claim of Egorov’s theorem
may be false.

2.5 Convergence in measure

Definition. We say that a sequence of measurable functions fn : X → IR converges


in measure to a measurable function f : X → IR if for every ε > 0

lim µ({x : |fn (x) − f (x)| ≥ ε}) = 0.


n→∞

µ
We denote convergence in measure by fn → f . As usual we assume that the functions
fn and f are defined a.e.

Theorem 50 (Lebesgue) If µ(X) < ∞ and a sequence of measurable functions fn :


µ
X → IR converges to f : X → IR a.e., then it converges in measure, fn → f .

Proof. Given ε > 0, let


Ei = {x : |fi (x) − f (x)| ≥ ε} .
S∞
The sequence of sets i=n Ei is a decreasing sequence of sets of finite measure and
hence

! ∞ [ ∞
!
[ \
µ Ei → µ Ei = 0.
i=n n=1 i=n

The last set has measure zero, because if x ∈ ∞


S
i=n Ei for every n, then x belongs to
infinitely many Ei ’s so there is a subsequence fni such that |fni (x) − f (x)| ≥ ε which
is true only on a set of measure zero. Hence

[
µ({x : |fn (x) − f (x)| ≥ ε}) = µ(En ) ≤ µ( Ei ) → 0,
i=n

which proves convergence in measure. 2

39
µ
Theorem 51 (Riesz) If fn → f , then there is a subsequence fni such that fni → f
a.e.

Remark. In this theorem measure of the space can be infinite, µ(X) = ∞.


Proof. For every i there is ni such that
1 1
µ({x : |fni (x) − f (x)| ≥ }) ≤ i .
i 2
We can assume that n1 < n2 < n3 . . . Let

[ 1
Fk = X \ {x : |fni (x) − f (x)| ≥ } .
i=k
i
P∞ −i
= S2−k+1 and hence µ(X \ ∞
S
Clearly µ(X \ Fk ) ≤ i=k 2 k=1 Fk ) = 0. We will

prove that fni → f pointwise on k=1 Fk . To this end it suffices to show that fni
converges to f pointwise on every set Fk , but we will actually show a stronger fact
S∞ fni ⇒ f uniformly on Fk . Indeed, for all x ∈ Fk , x does not belong to the set
that
i=k {x : |fni (x) − f (x)| ≥ 1/i}, so for every j ≥ k and all x ∈ Fk

1
|fnj (x) − f (x)| ≤
j
which proves uniform convergence of fnj to f on Fk 2
Remark. We actually proved not only convergence a.e. but also uniform convergence
on subsets of X whose complement has arbitrary small measure. Note that Lebesgue’s
theorem combined with this stronger conclusion of Riesz’ theorem implies Egorov’s
theorem for sequences of real valued functions.
Exercise. Generalize the Lebesgue and the Riesz theorems to the case of sequences of
measurable mappings with values into separable metric spaces.

2.6 Absolute continuity of the integral

The following result is known as the absolute continuity of the integral.


R
Theorem 52 Let f ∈ L1 (µ). Then for every ε > 0 there is δ > 0 such that E
|f | dµ <
ε whenever µ(E) < δ.

Proof.
R For if not, there would exist ε0 > 0 andSsets En ∈ M such that µ(En ) ≤ 2−n

and En |f | dµ ≥ ε0 . The sequence of sets Ak = n=k En is decreasing and

!
\
µ Ak = 0 , (30)
k=1

40
T∞ P∞ −n+1
because
R µ( k=1 A k ) ≤ µ(A n ) ≤ i=n µ(Ei ) ≤ 2 for
R every n. RSince ϕ(E) =
E
|f | dµ is a measure on M (Theorem 39) and ϕ(A 1 ) = A1
|f | dµ ≤ X |f | dµ < ∞,
Theorem 3(g) implies that

! Z
\
ϕ(Ak ) → ϕ Ak = T |f | dµ = 0

k=1 k=1 Ak

R
by (30). This is, however, a clear contradiction, because ϕ(Ak ) ≥ ϕ(Ek ) = Ek
|f | dµ ≥
ε0 . 2

2.7 Riesz representation theorem

Definition. If f : X → IR is a continuous function on a metric space X, then


supp f = {x ∈ X : f (x) 6= 0} is called the support of f and Cc (X) = Cc (X, IR) denotes
the linear space of continuous functions on X with compact support. We say that a
linear functional I : Cc (X) → IR is positive if

I(f ) ≥ 0 whenever f ≥ 0.

Let X be a locally compact metric space and µ a measure on B(X) such that
µ(X) < ∞ for every compact set K ⊂ X, i.e. µ is a Radon measure, see Corollary 9.
Then Z
I(f ) = f dµ
X

is a positive linear functional on Cc (X). Surprisingly, every positive linear functional


on Cc (X) can be represented as an integral with respect to a Radon measure.

Theorem 53 (Riesz representation theorem) Let X be a locally compact metric


space9 and let I : Cc (X) → IR be a positive linear functional. Then there is a unique
Radon measure µ such that Z
I(f ) = f dµ . (31)
X

Proof. For an open set G ⊂ X denote by Γ(G) the class of continuous functions
0 ≤ f ≤ 1 with compact support in G.

Lemma 54 Let G1 , G2 , . . . , Gk ⊂ X be open sets. If f ∈ Γ(G1 ∪ . . . ∪ Gk ), then there


exist functions fi ∈ Γ(Gi ), i = 1, 2, . . . such that f = f1 + . . . + fk .
9
This theorem is true for a more general class of locally compact Hausdorff topological spaces.

41
Proof. Since X is locally compact, there is a compact set E such that

supp f ⊂ int E, E ⊂ G1 ∪ . . . ∪ Gk .

The sets Hi = Gi ∩ E form an open covering of the compact space E. If we can find
continuous nonnegative functions ϕ1 , . . . , ϕk in E such that

ϕ = ϕ1 + . . . + ϕk > 0 in E, supp ϕi ⊂ Hi ,

then the functions 


f (x)ϕi (x)/ϕ(x) if x ∈ E,
fi (x) =
0 if x 6∈ E
will have desired properties. To construct functions ϕi observe that there is an open
covering of E by sets Di , i = 1, 2, . . . , k such that Di ⊂ Hi and then we define ϕi (x) =
dist (X, E \ Di ). 2
Let now I : Cc (X) → IR be a positive linear functional. Obviously

I(0) = 0 and I(f ) ≤ I(g) for f ≤ g

(because I(g − f ) ≥ 0).


Exercise. Prove that if ψ : [0, ∞) → [0, ∞) is an increasing function such that ψ(s +
t) = ψ(s) + ψ(t), then ψ(t) = tψ(1).
It follows from the exercise that

I(tf ) = tI(f ) for all t ≥ 0 and f ∈ Cc (X). (32)

Now for E ∈ B(X) we define

µ(E) = inf λ(G) , (33)


E⊂G−open

where
λ(G) = sup I(f ) .
f ∈Γ(G)

In order to prove that µ is a measure on B(X) it suffices to prove that for any open
sets G1 , G2 , . . .

[ ∞
X
λ( Gn ) ≤ λ(Gn ) (34)
n=1 n=1

and
λ(G1 ∪ G2 ) = λ(G1 ) + λ(G2 ) provided G1 ∩ G2 = ∅. (35)
Indeed, the two properties easily imply that the function µ∗ defined on the class of all
subsets of X by (33) is a metric outer measure.

42
Let {Gn }∞ be a sequence of open sets and L < λ( ∞
S
n=1
S∞ n=1 Gn ). Then thereSN
is a
function f ∈ Γ( n=1 Gn ) such that I(f ) > L. Since supp f is compact, f ∈ Γ( n=1 Gn )
for some finite N . According to the lemma f = f1 + . . . + fN , fi ∈ Γ(Gi ), so
N
X ∞
X
L < I(f ) = I(fn ) ≤ λ(Gn )
n=1 n=1

and (34) easily follows.


If f ∈ Γ(G1 ), g ∈ Γ(G2 ), G1 ∩ G2 = ∅, then f + g ∈ Γ(G1 ∪ G2 ). Indeed, the
condition G1 ∩ G2 = ∅ implies that f + g ≤ 1. Hence

I(f ) + I(g) = I(f + g) ≤ λ(G1 ∪ G2 )

and (35) follows upon taking the supremum over f and g.


We proved that µ is a measure in B(X). The measure µ is finite on compact sets.
Indeed, if K ⊂ G ⊂ G and G is compact, then there is a function h ∈ Cc (X) such that
h = 1 on G and therefore

µ(K) ≤ sup I(f ) ≤ I(h) < ∞ .


f ∈Γ(G)

Clearly
µ(G) = sup I(f ) for every open G ⊂ X
f ∈Γ(G)

and
µ(E) = inf µ(G) for every E ∈ B(X).
E⊂G−open

Let g ∈ Cc (X), g ≥ 0. Since Ig (f ) = I(f g) is a positive functional it follows from what


we have already proved that

µg (G) = sup I(f g) for every open G ⊂ X


f ∈Γ(G)

and
µg (E) = inf µg (G) for every E ∈ B(X)
E⊂G−open

is a measure in B(X) that is finite on compact sets.

Lemma 55 I(g) = µg (X).

Proof. For f ∈ Γ(X), f g ≤ g and hence I(f g) ≤ I(g). On the other hand if f ∈ Γ(X),
f = 1 on supp g, then f g = g and hence I(f g) = I(g). Therefore

µg (X) = sup I(f g) = I(g) .


f ∈Γ(X)

43
Lemma 56 (inf E g)µ(E) ≤ µg (E) ≤ (supE g)µ(E) for every E ∈ B(X).

Proof. If supp f ⊂ G, then it follows from (32) that

(inf g)I(f ) ≤ I(f g) ≤ (sup g)I(f ) .


G G

Taking the supremum over f ∈ Γ(G) yields

(inf g)µ(G) ≤ µg (G) ≤ (sup g)µ(G)


G G

and then the lemma follows after taking the infimum over all open sets G such that
E ⊂ G. This is because continuity of g implies

inf (inf g) = inf g and inf (sup g) = sup g .


E⊂G−open G E E⊂G−open G E

2
R
Lemma 57 µg (E) = E
g dµ for all E ∈ B(X).

Proof. The measure µ is finite on compact sets and hence every Borel set is a union of
sets of finite measure (Corollary 9). Therefore it suffices
P to prove the equality for sets
E of finite measure µ(E) < ∞. Given ε > 0 let s = nk=1 ak χEk be a simple function
10
Sn that s ≤ g ≤ s + ε (Theorem 34). Here the sets Ek are pairwise disjoint and
such
k=1 Ek = X.

Then
Z n
X n
X
s dµ = ak µ(Ek ∩ E) ≤ ( inf g)µ(Ek ∩ E)
E Ek ∩E
k=1 k=1
Xn
≤ µg (Ek ∩ E) = µg (E)
k=1

and
Z n
X n
X
(s + ε) dµ = (ak + ε)µ(Ek ∩ E) ≥ ( sup g)µ(Ek ∩ E)
E Ek ∩E
k=1 k=1
Xn
≥ µg (Ek ∩ E) = µg (E) .
k=1

We proved that
Z Z Z Z
(g − ε) dµ ≤ s dµ ≤ µg (E) ≤ (s + ε) dµ ≤ (g + ε) dµ .
E E E E
10
i.e. we include the set where ak = 0. For example if s = χE , then we write s = 1 · χE + 0 · χX\E .

44
Since ε > 0 was arbitrary and µ(E) < ∞, the lemma easily follows. 2
R
Comparing Lemma 55 and Lemma 57 we obtain that I(g) = X g dµ which proves
(31) for g ≥ 0. The general case follows from the fact that every function in Cc (X) is
a difference of two nonnegative functions.
We are left with the proof of uniqueness of the measure. Suppose that
Z Z
I(f ) = f dµ = f dν for all f ∈ Cc (X),
X X

where µ and ν are measures on X that are finite on compact sets. According to
Corollary 9 and Corollary 8 it suffices to prove that the measures µ and ν agree on the
class of open sets.
Let G be an open set and K ⊂ G a compact set. Let f ∈ Γ(G) be such that f = 1
on K. Then Z Z
µ(G) ≥ f dµ = I(f ) = f dν ≥ ν(K) .
G G
Taking supremum over compact sets K ⊂ G and applying Corollary 9 we obtain that
µ(G) ≥ ν(G). Similarly we prove the opposite inequality. The proof of the theorem is
complete. 2

3 Integration on product spaces


This should be a word for word copy of Chapter 8 from Rudin’s book. There is no need
to rewrite the whole chapter, so I will not include anything here.

4 Vitali covering lemma, doubling and Hausdorff


measures

4.1 Doubling measures

If σ > 0 and B is a ball in a metric space, then by σB we will denote a ball concentric
with B and with the radius σ times that of B.
Definition. Let X be a metric space and µ a measure on B(X). We say that µ is a
doubling measure if there is a constant Cd > 0 such that for every ball B, 0 < µ(B) < ∞
and µ(2B) ≤ Cd µ(B).
The Lebesgue measure is an example of a doubling measure with Cd = 2n . It turns
out that general doubling measures inherit many properties of the Lebesgue measure.

45
Definition. We say that a metric space X is a doubling-metric space if there is a
constant M > 0 such that every ball B can be covered by no more than M balls of half
the radius.
Examples. IRn is metric-doubling. If X is a metric-doubling space, then every subset
of X is metric-doubling. In particular every subset of IRn is metric-doubling. If X is
an infinite discrete space, then X is not metric-doubling.

Proposition 58 If a metric space is complete and metric-doubling, then it is locally


compact.

Proof. It suffices to prove that any closed ball is compact. It easily follows from the
metric-doubling condition that each closed ball is totally bounded. Since it is complete
(as a closed subset of a complete metric space) it is compact. One can also prove this
theorem more directly by mimicking the proof that the interval [0, 1] is compact. 2

Proposition 59 If µ is a doubling measure on a metric space X, then X is metric-


doubling.

Proof. We will prove that X satisfies the metric-doubling condition with M = Cd4 . Let
{xi }i∈I ⊂ B(x, r) be a maximal r/2-separated set, i.e.
r
∀i,j∈I d(xi , xj ) ≥
2
r
∀y∈B(x,r) ∃i∈I d(xi , y) < .
2
S
From the second condition we have B(x, r) ⊂ i∈I B(xi , r/2) and it suffices to show
that the number of elements in I is bounded by Cd4 , i.e. #I ≤ Cd4 . The balls B(xi , r/4)
are pairwise disjoint and B(xi , r/4) ⊂ B(x, 2r) ⊂ B(xi , 4r). We have
X X
µ(B(x, 2r)) ≥ µ(B(xi , r/4)) ≥ Cd−4 µ(B(xi , 4r)) ≥ Cd−4 µ(B(x, 2r))(#I)
i∈I i∈I

and hence #I ≤ Cd4 . 2

Theorem 60 (Volberg-Konyagin) If X is a complete metric-doubling metric space,


then there is doubling measure on X.

This theorem is very difficult and we will not prove it. Note that according to the pre-
vious proposition metric-doubling condition is necessary for the existence of a doubling
measure, so Volberg-Konyagin’s theorem is very sharp.
The Euclidean space is metric doubling and hence any subset of the Euclidean space
is metric-doubling. Therefore we have

46
Corollary 61 If X ⊂ IRn is closed, then there is a doubling measure on X.

Proposition 62 Let µ be a doubling measure on a metric space X. Then there is a


constant C > 0 such that  s
µ(B(x, r)) r
≥C
µ(B(x0 , r0 )) r0
whenever x ∈ B(x0 , r0 ), r ≤ r0 and s = log Cd / log 2.

Remark. For the Lebesgue measure in IRn , Cd = 2n , log Cd / log 2 = n and hence
Ln (B(x, r))/L(B(0, 1)) ≥ Crn which is the sharp lower bound estimate for the
Lebesgue measure of the ball. That shows that the exponent s in the above lemma
cannot be any smaller.
Proof. Let x ∈ B(x0 , r0 ), r ≤ r0 . Let k be the largest integer such that 2k−1 r ≤ 2r0 .
Hence 2k r > 2r0 and thus B(x0 , r0 ) ⊂ B(x, 2k r). We have

µ(B(x0 , r0 )) ≤ µ(B(x, 2k r)) ≤ Cdk µ(B(x, r)),

µ(B(x, r))
≥ Cd−k . (36)
µ(B(x0 , r0 )
Since 2k−1 r ≤ 2r0 , r/r0 ≤ 4−1 2−k and hence
 s  log Cd / log 2
r r
= ≤ (4−1 )log Cd / log 2 (2−k )log Cd / log 2 = ACd−k .
r0 r0 | {z }
A

This estimate together with (36) yields the claim. 2

4.2 Covering lemmas

Theorem 63 (5r-covering lemma) Let B be a family of balls in a metric space such


that sup{diam B : B ∈ B} < ∞. Then there is a subfamily of pairwise disjoint balls
B 0 ⊂ B such that [ [
B⊂ 5B .
B∈B B∈B0

If the metric space is separable, then the family B 0 is countable and we can arrange it
as a sequence B 0 = {Bi }∞
i=1 , so

[ ∞
[
B⊂ 5Bi .
B∈B i=1

47
Remark. Here B can be either a family of open balls or closed balls. In both cases
proof is the same.
Proof. Let sup{diam B : B ∈ B} = R < ∞. Divide the family B according to the
diameter of the balls
R R
Fj = {B ∈ B : j
< diam B ≤ j−1 } .
2 2
Clearly B = ∞
S
j=1 Fj . Define B1 ⊂ F1 to be the maximal family of pairwise disjoint
balls. Suppose the families B1 , . . . , Bj−1 are already defined. Then we define Bj to be
the maximal family of pairwise disjoint balls in
j−1
[
0 0
Fj ∩ {B : B ∩ B = ∅ for all B ∈ Bi } .
i=1

Next we define B 0 = ∞
S
j=1 Bj . Observe that every ball B ∈ Fj intersects with a ball in
Sj Sj
i=1 Bj . Suppose that B ∩ B1 6= ∅, B1 ∈ i=1 Bi . Then

R R
diam B ≤ =2· ≤ 2 diam B1
2j−1 2j
and hence B ⊂ 5B1 . The proof is complete. 2
Remark. We have actually proved more: every ball B ∈ B is contained in 5B1 for
some ball B1 ∈ B 0 such that B1 ∩ B 6= ∅. This fact will be used in the proof of the next
result.
Definition. We say that a family F of closed balls covers a set A in the Vitali sense
if for every a ∈ A there is a ball B ∈ F of arbitrarily small radius such that a ∈ B. In
other words
∀a∈A inf{diam B : a ∈ B ∈ F} = 0.

Theorem 64 (Vitali covering theorem) Let µ be a doubling measure on X and


A ⊂ X a measurable set. Let F be a family of closed balls that covers A in the Vitali
sense. Then there is a subfamily G ⊂ F of pairwise disjoint balls such that
[ 
µ A\ B = 0. (37)
B∈G

Proof. Let F 0 ⊂ F be a subfamily of all balls with diameters less than 1. Observe that
the family F 0 also covers A in the Vitali sense. According to the 5r-covering lemma
there is a countable subfamily G ⊂ F 0 of pairwise disjoint balls such that
[ [
A⊂ B⊂ 5B .
B∈F 0 B∈G

48
We will prove that the family G satisfies (37). To this end it suffices to prove that for
any ball Q of radius bigger than 1
[ 
µ Q ∩ (A \ B) = 0 .
B∈G

Observe that [  [ 
Q∩ A\ B =Q∩ A\ B .
B∈G B∈G
B⊂10Q

Indeed, if B is not contained in 10Q, then11 B ∩ (Q ∩ A) = ∅. The family of balls


B such that B ∈ G and B ⊂ 10Q is countable and hence we can name the balls by
B 1 , B 2 , B 3 , . . . We have

X ∞
X
µ(5B i ) ≤ C µ(B i ) ≤ Cµ(10Q) < ∞ .
i=1 i=1

The first inequality follows from the doubling condition and the second inequality from
the fact that the balls B i are pairwise disjoint and contained in 10Q. The above
inequality implies that

[  X∞
µ 5B i ≤ µ(5B i ) → 0 as N → ∞
i=N i=N

and it remains to prove that


 [  N
[ ∞
 [
Q∩ A\ B ⊂Q∩ A\ Bi ⊂ 5B i . (38)
B∈G i=1 i=N
B⊂10Q

The first inclusion is obvious. To prove the second one let a ∈ Q ∩ (A \ N


S
SN SN i=1 B i ). The
set i=1 B i is closed and a 6∈ i=1 B i , so there is a ball B a of sufficiently small radius
r < 1 such that a ∈ B a ∈ F 0 and
N
[
Ba ∩ Bi = ∅ . (39)
i=1

On the other hand it follows from the remark stated after the proof of the 5r-covering
lemma that there is B ∈ G such that B a ∩ B 6= ∅, B a ⊂ 5B. Since diam B < 1 and
a ∈ Q, it follows that B ⊂ 10Q and hence B = B j for some j. Now (39) implies that
j > N and hence

[
a ∈ 5B j ⊂ 5B i
i=N

which proves (38). 2


11
Because diam B < 1 and the radius of Q is bigger than 1.

49
Corollary 65 For an open set U ⊂ IRn and every ε > 0 there is a sequence of pairwise
disjoint closed balls B 1 , B 2 , B3 , . . . contained in U and of radii less than ε such that

[ 
Ln U \ Bi = 0 .
i=1

Exercise. Show a direct proof of the corollary without using the Vitali covering lemma.

4.3 Doubling measures and the Hausdorff measure

We will snow now how to use the covering lemmas to prove important results about
Hausdorff measures.

Theorem 66 For any Borel set A ⊂ IRn and any 0 < δ ≤ ∞

Ln (A) = Hδn (A) = Hn (A) .

Proof. Step 1. If Ln (Z) = 0, then Hδn (Z) = 0.


Indeed, for
P every ε > 0, Z 12
can be covered by cubes Qi , each of diameter less than
δ such that i |Qi | < ε. Since (diam Qi )n = C(n)|Qi | we have that
ωn X ωn X ωn
Hδn (Z) ≤ n
(diam Qi )n
= n
C(n) |Qi | < ε n C(n)
2 i 2 i
2

and hence Hδn (Z) = 0.


Step 2. Hδn (A) ≤ Ln (A).
Let A ⊂ ∞
S
i=1 Qi , where the cubes Qi have pairwise disjoint interiors and


X
|Qi | ≤ Ln (A) + ε .
i=1

Fix i. It follows from Corollary 65 that there are pairwise disjoint balls B j ⊂ Qi ,
j = 1, 2, 3, . . . such that

[
Ln (Qi \ Bj ) = 0 .
j=1

12
By C(n) we denote a constant which depends on n only.

50
Now applying Step 1 w have

[ ∞
X ∞
X
Hδn (Qi ) = Hδn ( Bi) ≤ Hδn (B i ) ≤ ωn rin
i=1 i=1 i=1

X ∞
[
= Ln (B i ) = Ln ( B i ) = Ln (Qi ) .
i=1 i=1

Therefore

[ ∞
X ∞
X
Hδn (A) ≤ Hδn ( Qi ) ≤ Hδn (Qi ) ≤ Ln (Qi ) ≤ Ln (A) + ε .
i=1 i=1 i=1

Since ε > 0 was arbitrary, the claim follows.


Step 3. Ln (A) ≤ Hδn (A).
We will need the following fact.

Lemma 67 (Isodiametric inequality) For any Lebesgue measurable set A ⊂ IRn


 n
diam A
Ln (A) ≤ ωn
2

We will prove this lemma in Section 6. Observe that if A is a ball, then we have the
equality. Therefore the lemma says that among all sets of fixed diameter, ball has the
largest volume.
Now we can complete the proof of Step 3. Let A ⊂ ∞
S
i=1 Ci , diam Ci < δ. Then
∞ ∞  n ∞
X X diam Ci ωn X
Ln (A) ≤ Ln (Ci ) ≤ ωn = n (diam Ci )n .
i=1 i=1
2 2 i=1

Taking the infimum over all such coverings we obtain


Ln (A) ≤ Hδn (A) .

Step 4. General case. We proved in steps 2 and 3 that Ln (A) = Hδn (A) for all δ.
Hence
Ln (A) = lim Hδn (A) = Hn (A) .
δ→0
The proof is complete. 2
Now we will compare doubling measures with Hausdorff measures in metric spaces.
Recall that in a given metric space there is 0 ≤ s ≤ ∞ such that
Ht (X) = ∞ for t < s and Ht (X) = 0 for t > s.
This unique number s is called Hausdorff dimension of X and is denoted by s = dimH X.

51
Theorem 68 Let µ be a doubling measure on a metric space X, µ(2B) ≤ Cd µ(B) for
every ball B. Let s = log Cd / log 2. Then

dimH X ≤ s .

Proof. It suffices to prove that Hs (B) < ∞ for every ball B. Indeed, then Ht (B) = 0
for t > s and hence Ht (X) = 0 for t > s. This, however, implies that dimH X ≤ s.
Let B = B(x0 , R). According to Proposition 62 there is a constant C > 0 such that
for every x ∈ B and r < R, µ(B(x, r)) ≥ Crs .
Fix 0 < δ < R. Let {xi }∞i=1 ⊂ B be a maximal δ/2-separated set, i.e. d(xi , xj ) ≥ δ/2
for i 6= j and for every x ∈ B there is i ∈ I such that d(x, x
Si ) < δ/2. Therefore the balls
B(xi , δ/4) are disjoint, contained in B(x, 2R) and B ⊂ i∈I B(xi , r/2). We estimate
the number of elements in I.
X X
µ(B(x, 2R)) ≥ µ(B(xi , δ/4)) ≥ C(δ/4)s = (#I)C(δ/4)s .
i∈I i∈I

Hence
#I ≤ µ(B(x, 2R))C −1 4s δ −s = C 0 δ −s .
| {z }
does not depend on δ
S
Now B ⊂ i∈I B(xi , δ/2) is a covering of the ball by sets of diameters less than or
equal to δ, and hence
ωs X ωs ωs ωs
Hδs (B) ≤ s (diam B(xi , δ/2))s ≤ s δ s (#I) ≤ s δ s C 0 δ −s = s C 0
2 i∈I 2 2 2

Hence
ωs 0
Hs (B) = lim Hδs (B) ≤ C <∞
δ→0 2s

Definition. We say that a Borel measure µ on a metric space X is s-regular, if there


is a constant C ≥ 1 such that

C −1 rs ≤ µ(B(x, r)) ≤ Crs

for all x ∈ X and all 0 < r < diam X.


Clearly, every s-regular measure is doubling.

Theorem 69 Let µ be an s-regular measure on X. Then there are constants C1 , C2 > 0


such that
C1 µ(E) ≤ Hs (E) ≤ C2 µ(E)
for every Borel set E ⊂ X. In particular Hs is s-regular.

52
Proof. First we will prove that there is a constant C1 such that C1 µ(E) ≤ Hs (E) for
every Borel set E. Obviously we can assume that Hs (E) < ∞. Since every bounded
s
set is contained in a ball of radius diam A, then µ(A) ≤ C(diam
S∞ A) by the s-regularity
of the measure. For every ε > 0 there is a covering E ⊂ i=1 Ei such that

ωs X
s
(diam Ei )s ≤ Hs (E) + ε .
2 i=1

We have

[  ∞
X ∞
X
µ(E) ≤ µ Ei ≤ µ(Ei ) ≤ C (diam Ei )s
i=1 i=1 i=1
s ∞
2 ωs X s 2s s
= C (diam Ei ) ≤ C (H (E) + ε) .
ωs 2s i=1 ωs

Since ε > 0 was arbitrary we conclude that C1 µ(E) ≤ Hs (E), where C1 = C −1 ωs /2s .
The proof of the opposite inequality is more difficult.

Lemma 70 If µ is an s-regular measure on a metric space X, then there is a constant


M > 0 such that for every ball B of radius R and any r < R, the ball B can be covered
by no more than M (R/r)s balls of radius r.

Proof. Let {xi }i∈I ⊂ B be a maximal r-separated set, i.e. d(xi , xj ) ≥Sr for i 6= j
and for every x ∈ B there is i ∈ I such that x ∈ B(xi , r). Hence B ⊂ i∈I B(xi , r)
and B(xi , r/2) ∩ B(xj , r/2) = ∅ for i 6= j. Since B(xi , r/2) ⊂ 2B and µ(B(xi , r/2)) ≥
C −1 (r/2)s we have
C(2R)s ≥ µ(2B) ≥ (#I)C −1 (r/2)s .
Hence #I ≤ C 2 4s (R/r)s and hence the lemma follows with M = C 2 4s . 2
The lemma implies that Hδs (B) ≤ C 0 µ(B) where B is a ball of radius R. Indeed,
we can cover B by M (R/r)s balls of radius r and we can take r so small that 2r < δ.
Hence  s
ωs R
s
Hδ (B) ≤ s M (2r)s = ωs M Rs ≤ C 0 µ(B) .
2 r
Since the constant C 0 is independent of δ we conclude that Hs (B) ≤ C 0 µ(B). We will
now prove a stronger inequality which says that for an arbitrary open set U ⊂ X

Hs (U ) ≤ C 00 µ(U ) .

This will imply that Hs (E) ≤ C 00 µ(E) for any Borel set E ⊂ X. Indeed, both measures
µ and Hs have the property that X is a union of a countably family of open sets of finite

53
measure. Therefore both measures have the property that the measure of any Borel
set equals infimum of measures of open sets that contain given Borel set (Theorem 7).
It follows from the proof of Lemma 70 that X is separable, soSaccording to the
5r-covering lemma we can find disjoint balls Bi ⊂ U such that U ⊂ ∞
i=1 5Bi . Hence

X ∞
X ∞
X
Hs (U ) ≤ Hs (5Bi ) ≤ C 0 µ(5Bi ) ≤ C 00 µ(Bi ) ≤ C 00 µ(U ).
i=1 i=1 i=1

The proof is complete. 2

5 Differentiation

5.1 Lebesgue differentiation theorem

Definition.
R We say that f ∈ L1loc (µ), if for every x ∈ X there is r > 0 such that
B(x,r)
|f | dµ < ∞.
The integral average of f over a measurable set E of finite measure will be denote
by Z Z
−1
fE = µ(E) f dµ = f dµ .
E E

Theorem 71 (Lebesgue differentiation theorem) If µ is a doubling measure on


a metric space X and f ∈ L1loc (µ), then
Z
lim f dµ = f (x) for µ-a.e. x ∈ X . (40)
r→0 B(x,r)

Proof. We can assume that f ≥ 0. Since X is separable, it is a countable union of


balls
R such that f is integrable on each such ball. Therefore, it suffices to prove that if
B
f dµ < ∞, then (40) is satisfied for µ-a.e. x ∈ B.
If for a measurable set A ⊂ B we have
Z
lim inf f dµ ≤ t for all x ∈ A ,
r→0 B(x,r)

then Z
f dµ ≤ tµ(A) .
A
Indeed, given ε > 0, let U be an open set such that A ⊂ U , µ(U ) ≤ µ(A) + ε. For
every x ∈ A there is arbitrarily small r > 0 such that B(x, r) ⊂ U and
Z
f dµ ≤ t + ε .
B(x,r)

54
Such closed balls form a covering of U in the Vitali sense and hence we can select
pairwise disjoint balls B 1 , B 2 , B 3 , . . . from the covering such that

[ Z
µ(U \ B i ) = 0, f dµ ≤ t + ε .
i=1 Bi

We have
Z Z ∞ Z
X ∞
X
f dµ ≤ f dµ = f dµ ≤ (t + ε) µ(B i )
A U i=1 Bi i=1
= (t + ε)µ(U ) ≤ (t + ε)(µ(A) + ε) .

Since ε > 0 was arbitrary, we conclude


Z
f dµ ≤ tµ(A)
A

Similarly, if A ⊂ B is a measurable set such that


Z
lim sup f dµ ≥ t for all x ∈ A ,
r→0 B(x,r)

then Z
f dµ ≥ tµ(A) . (41)
A
Let Z Z
As,t = {x ∈ B : lim inf f dµ ≤ s < t ≤ lim sup f dµ} .
r→0 B(x,r) r→0 B(x,r)

Then Z
tµ(As,t ) ≤ f dµ ≤ sµ(As,t )
As,t

and hence µ(As,t ) = 0. This easily implies that


Z Z
lim inf f dµ = lim sup f dµ for µ-a.e. x ∈ B ,
r→0 B(x,r) r→0 B(x,r)
R
i.e. the limit limr→0 B(x,r)
f dµ exists for µ-a.e. x ∈ B. Denote this limit by
Z
g(x) = lim f dµ
r→0 B(x,r)

whenever the limit exists. We have to prove that f (x) = g(x) for µ-a.e. x ∈ B.
Fix a measurable set F ⊂ B and ε > 0. Let

An = {x ∈ F : (1 + ε)n ≤ g(x) < (1 + ε)n+1 }, n ∈ ZZ .

55
Observe that
Z
lim sup f dµ = g(x) ≥ (1 + ε)n for all x ∈ An ,
r→0 B(x,r)
R
and hence An
f dµ ≥ (1 + ε)n µ(An ) by (41). We have
Z ∞ Z
X ∞
X
g dµ = g dµ ≤ (1 + ε)n+1 µ(An )
F n=−∞ An n=−∞
∞ Z
X Z
≤ (1 + ε) f dµ = (1 + ε) f dµ .
n=−∞ An F

Similarly we prove that Z Z


−1
g dµ ≥ (1 + ε) f dµ .
F F
Since ε > 0 was arbitrary we conclude that for every measurable set F ⊂ B
Z Z
f dµ = g dµ
F F

and hence f = g µ-a.e. 2


Among all representatives of f in the class of functions that are equal f a.e. we
select the following one Z
f (x) := lim sup f dµ . (42)
r→0 B(x,r)

It follows from the Lebesgue differentiation theorem that the right hand side of (42)
which is defined everywhere equals f a.e.
Definition. Let f ∈ L1loc (µ). We say that x ∈ X is a Lebesgue point of f if
Z
lim |f (y) − f (x)| dµ(y) = 0 ,
r→0 B(x,r)

where f (x) is defined by (42).

Theorem 72 Let µ be a doubling measure and f ∈ L1loc (µ). Then µ-a.e. point x ∈ X
is a Lebesgue point of f .

Proof. For every c ∈ IR


Z
lim |f (y) − c| dµ(y) = |f (x) − c| µ-a.e. (43)
r→0 B(x,r)

56
Let
E = {x ∈ X : (43) holds for all c ∈ Q} .
Since Q is countable, µ(X \ E) = 0. Let x ∈ E, be such that |f (x)| < ∞, and let
Q 3 cn → f (x). Then it follows from (43) that
Z
lim |f (y) − f (x)| dµ = |f (x) − f (x)| = 0 .
r→0 B(x,r)

2
Definition. We say that a family F of measurable sets in X is regular at x ∈ X if
there is a constant C > 0 such that for every S ∈ F there is a ball B(x, rS ) such that
S ⊂ B(x, rS ), µ(B(x, rS )) ≤ Cµ(S) ,
and for every ε > 0 there is a set S ∈ F with µ(S) < ε.

Theorem 73 Let µ be a doubling measure and f ∈ L1loc (µ). If x ∈ X is a Lebesgue


point of f and F is a regular family at x ∈ X, then
Z
lim f dµ = f (x) .
S∈F
µ(S)→0 S

Proof. For S ∈ F let rs be defined as above. Observe that if µ(S) → 0, then rs → 0


by Proposition 62. We have
Z Z Z
f dµ − f (x) ≤ |f (y) − f (x)| dµ(y) ≤ µ(S)−1

|f (y) − f (x)| dµ(y)
S S B(x,rS )
Z Z
µ(B(x, rS ))
= |f (y) − f (x)| dµ(y) ≤ C |f (y) − f (x)| dµ(y) → 0
µ(S) B(x,rS ) B(x,rS )

as µ(S) → 0. 2

Corollary 74 If f ∈ L1loc (IRn ) and x ∈ IRn is a Lebesgue point of f , then for any
sequence of cubes Qi such that x ∈ Qi , diam Qi → 0 we have
Z
lim f dµ = f (x) .
i→∞ Q
i

Rx
Corollary 75 Let F (x) = a
f (t) dt, where f ∈ L1 (a, b). Then F 0 (x) = f (x) for a.e.
x ∈ (a, b).

Proof. We have
x+h
F (x + h) − F (x)
Z
= f (t) dt → f (x) as h → 0,
h x

whenever x is a Lebesgue point of f . 2

57
Theorem 76 If A ⊂ IRn is a measurable set, then for almost all x ∈ A
|B(x, r) ∩ A|
lim =1
r→0 |B(x, r)|
and for almost all x ∈ IRn \ A
|B(x, r) ∩ A|
lim = 0.
r→0 |B(x, r)|

Proof. Applying the Lebesgue differentiation theorem to f = χA ∈ L1loc (IRn ) we have


that for a.e. x ∈ A
|B(x, r) ∩ A|
Z
= χA dLn → χA (x) = 1
|B(x, r)| B(x,r)

and for a.e. x ∈ IRn \ A


|B(x, r) ∩ A|
→ χA (x) = 0 .
|B(x, r)|
2
Definition. Let A ⊂ IRn be a measurable set. We say that x ∈ IRn is a density point
of A if
|A ∩ B(x, r)|
lim = 1.
r→0 |B(x, r)|
Thus the above theorem says that almost every point of a measurable set A ⊂ IRn is a
density point.

6 Brunn-Minkowski inequality
In this section we will prove the Brunn-Minkowski inequality, isodiametric inequality
and the isoperimetric inequality.

Theorem 77 (Brunn-Minkowski inequality) If A, B ⊂ IRn are compact sets, then

Ln (A + B)1/n ≥ Ln (A)1/n + Ln (B)1/n .

where A + B = {a + b : a ∈ A, b ∈ B}.

Proof. Observe that if we translate the sets, then the set A + B will also translate,
so the inequality will remain the same. In particular, we can locate the origin of the
coordinate system at any point.
We divide the proof into several steps.

58
Step 1. A and B are rectangular boxes with sides parallel to the coordinate axes.
Denote the edges of A and B by a1 ,. . . ,an and b1 ,. . . ,bn respectively. Then A + B
is also a rectangular box with edges a1 + b1 ,. . . ,an + bn . Hence the Brunn-Minkowski
inequality takes the form
Yn Yn Yn
1/n 1/n 1/n
(ai + bi ) ≥ ai + bi
i=1 i=1 i=1

which is equivalent to

7 Lp spaces
Definition. Let (X, µ) be a measure space. For 0 < p < ∞, L̃p (µ) denotes the class
of all complex valued measurable functions such that
Z 1/p
p
kf kp = |f | dµ <∞
X
p p
and we define L (µ) = L̃ (µ)/ ∼, where f ∼ g if f = g a.e.
For p = ∞ we define L̃∞ to be the class of essentially bounded measurable functions,
i.e. such that there is M > 0 with
|f (x)| ≤ M µ-a.e. (44)
It is easy to see that there is a smallest number M satisfying (44). We denote this
number by kf k∞ . Therefore
|f (x)| ≤ kf k∞ µ-a.e.
and for any ε > 0, µ({x : |f (x)| > kf k∞ − ε}) > 0. Finally we set L∞ (µ) = L̃(µ)/ ∼.
Example. Let µ be a counting measure on IN = {1, 2, 3, . . .}, i.e. for A ⊂ IN,
µ(A) = #A, the number of elements in the set A if A is finite and µ(A) = ∞ is A is
infinite. Clearly if f : IN → C, then
Z ∞
X
f dµ = f (n).
IN n=1

The functions f : IN → C can be identified with complex sequences and it is easy to


see that ∞
p ∞
X 1/p
L (µ) = {x = (xn )n=1 : kxkp = |xn |p < ∞} ,
n=1
and
L∞ (µ) = {x = (xn )∞
n=1 : kxk∞ = sup |xn | < ∞} .
n∈IN
p p
In this particular situation we denote L (µ) by ` for all 0 < p ≤ ∞.

59
Theorem 78 (Jensen’s inequality) Let (X, µ) be a measure space such that µ(X) =
1. If f ∈ L1 (µ) satisfies a < f (x) < b for a.e. x and ϕ is a convex function on (a, b),
then Z  Z
ϕ f dµ ≤ (ϕ ◦ f ) dµ .
X X

Remark. We do not exclude the cases a = −∞ and b = ∞.


Recall that ϕ is convex on (a, b) if

ϕ((1 − λ)x + λy) ≤ (1 − λ)ϕ(x) + λϕ(y) for all x, y ∈ (a, b) and 0 ≤ λ ≤ 1.

It is easy to see that this condition is equivalent to

ϕ(t) − ϕ(s) ϕ(u) − ϕ(t)


≤ for all s, t, u such that a < s < t < u < b (45)
t−s u−t

Lemma 79 Every convex function on (a, b) is continuous.

60
Proof. Fix a < s < t < b and take x, y such that a < s < x < y < t < b. Denote by
S, X, Y, T the corresponding points on the graph of ϕ.

−→
The point Y belongs to the angle with the vertex X formed by the half-lines SX
−−→
and XT . This easily implies that if y is decreasing to x, then Y → X, which proves the
right hand side continuity of ϕ. The proof of the left hand side continuity is similar. 2
R
Proof of Jensen’s inequality. For t = X f dµ we have a < t < b. Fix this value of t and
define
ϕ(t) − ϕ(s)
β = sup .
a<s<t t−s
It follow from (45) that

ϕ(u) − ϕ(t)
β≤ for all t < u < b
u−t
and hence
ϕ(u) ≥ ϕ(t) + β(u − t) for all t < u < b. (46)
On the other hand
ϕ(t) − ϕ(s)
≤β for all a < s < t
t−s
and hence
ϕ(s) ≥ ϕ(t) + β(s − t) for all a < s < t (47)
The inequalities (46) and (47) imply that

ϕ(s) ≥ ϕ(t) + β(s − t) for all a < s < b.

In particular
ϕ(f (x)) − ϕ(t) − β(f (x) − t) ≥ 0 a.e.

61
Integrating with respect to µ yields
Z Z   Z 
ϕ(f (x)) − ϕ f dµ −β f (x) − f dµ dµ ≥ 0 .
X X
| {z } | X {z }
constant constant

Since µ(X) = 1 we conclude


Z Z  Z Z 
(ϕ ◦ f ) dµ − ϕ f dµ − β f dµ − f dµ ≥ 0
X X
|X {z X }
0

and hence Z Z 
(ϕ ◦ f ) dµ ≥ ϕ f dµ .
X X
2
The numbers 1 < p, q < ∞ such that 1/p + 1/q = 1 or p = 1 and q = ∞ are
called Hölder conjugate or just conjugate exponents. Observe that if 1 < p < ∞, then
q = p/(p − 1).

Theorem 80 (Hölder’s inequality) Let (X, µ) be a measure space and 1 < p, q < ∞
be conjugate exponents. Then for any two complex valued measurable functions f and
g we have Z
|f g| dµ ≤ kf kp kgkq .
X

Proof. We will need the following lemma.

Lemma 81 (Young’s inequality) If a, b > 0, then


ap b q
ab ≤ + .
p q

Proof. We can assume that a, b > 0, otherwise the inequality is obvious. Then a = es/p ,
b = et/q for some s, t ∈ IR and the inequality is equivalent to

es/p+t/q ≤ es /p + et /q .

Now the lemma follows from the convexity of ex . 2


We can assume that 0 < kf kp < ∞ and 0 < kgkq < ∞, otherwise the Hölder
inequality is obvious. The lemma yields
 p  q
|f (x)g(x)| 1 |f (x)| 1 |g(x)|
≤ + .
kf kp kgkq p kf kp q kgkq

62
Integrating this inequality yields
Z Z Z
 1 p 1 
|f g| dµ ≤ kf kp kgkq |f | dµ + |g|q dµ
X pkf kpp X qkgkqq X
| {z }
=1/p+1/q=1

and the theorem follows. 2

Theorem 82 (Minkowski’s inequality) Let (X, µ) be a measure space and 1 ≤ p ≤


∞. Then for any two complex valued measurable functions f and g we have

kf + gkp ≤ kf kp + kgkp .

Proof. If p = 1, the inequality is obvious. If p = ∞, then

|f (x) + g(x)| ≤ |f (x)| + |g(x)| ≤ kf k∞ + kgk∞ a.e.

Hence kf + gk∞ ≤ kf k∞ + kgk∞ . If 1 < p < ∞, then we need to use the Hölder
inequality.
Z Z Z
p p−1
|f + g| dµ ≤ |f ||f + g| dµ + |g||f + g|p−1 dµ
X X X
Z 1/p Z 1/q Z 1/p Z 1/q
p (p−1)q p (p−1)q
≤ |f | dµ |f + g| dµ + |g| dµ |f + g| dµ
X X X X
Z 1/q
p
≤ (kf kp + kgkp ) |f + g| dµ
X

and hence Z 1−1/q


p
|f + g| dµ ≤ kf kp + kgkp .
X

Since 1 − 1/q = 1/p the theorem follows. 2


Example. In the case of space `p , the Hölder and the Minkowski inequality read as
follows !1/p ∞ !1/q
X∞ X∞ X
|xn yn | ≤ |xn |p |yn |q ,
n=1 n=1 n=1


!1/p ∞
!1/p ∞
!1/p
X X X
|xn + yn |p ≤ |xn |p + |yn |p .
n=1 n=1 n=1

Definition. Let K denote IR or C. The normed space is a pair (X, k · k), where X is
a linear space over K and
k · k : X → [0, ∞)
is a function such that:

63
(a) kx + yk ≤ kxk + kyk for all x, y ∈ X;

(b) kαxk + |α|kxk for all x ∈ X and α ∈ K;

(c) kxk = 0 iff x = 0.

The function k · k is called a norm.


Since kx − yk = k(x − z) + (z − y)k ≤ kx − zk + kz − yk we see that

d(x, y) = kx − yk

defines a metric in X. Normed spaces are always regarded as metric spaces with respect
to this metric.
Definition. We say that a normed space (X, k · k) is a Banach space if it is complete
with respect to the metric d(x, y) = kx − yk.
It follows from the Minkowski inequality that (Lp (µ), k · kp ) is a normed space for
all 1 ≤ p ≤ ∞.

Theorem 83 Lp (µ) is a Banach space for all 1 ≤ p ≤ ∞.

Proof. We will prove this theorem for 1 ≤ p < ∞ and leave the case p = ∞ as an
exercise. Let {fn } be a Cauchy sequence in Lp (µ). Let {fni } be a subsequence such
that
kfni+1 − fni kp < 2−i , i = 1, 2, 3, . . .
Let
k
X ∞
X
gk = |fni+1 − fni |, g= |fni+1 − fni | .
i=1 i=1

It follows from the Minkowski inequality that kgk kp < 1, and then Fatou’s lemma
applied to the sequence gkp yields
Z Z Z
p
p
g dµ = (lim inf gk ) dµ ≤ lim inf gkp dµ ≤ 1 ,
X X k→∞ k→∞ X

so kgkp ≤ 1. In particular g(x) < ∞ a.e. Thus the series



X
fn1 (x) + (fni+1 (x) − fni (x))
i=1

converges absolutely for a.e. x. Since


k
X
fn1 (x) + (fni+1 (x) − fni (x)) = fnk+1 (x)
i=1

64
we conclude that the sequence fnk (x) converges for a.e. x. Denote its limit by f (x),
i.e.
lim fnk (x) = f (x) a.e.
k→∞
p
It suffices to prove that f ∈ L and kf − fn kp → 0 as n → ∞.
For every ε > 0 there is N such that kfn − fm kp < ε for all n, m ≥ N . Applying
Fatou’s lemma to |fni − fm |p with m ≥ N fixed and i = 1, 2, 3, . . . we have
Z Z
p
|f − fm | dµ ≤ lim inf |fni − fm |p dµ ≤ εp .
X i→∞ X

Hence f − fm ∈ Lp and thus f ∈ Lp . Moreover kf − fm kp → 0 as m → ∞. 2


Exercise. Prove the above theorem for p = ∞.
The following fact follows from the proof of the above theorem.

Corollary 84 If fn → f in Lp (µ), then there is a subsequence fni which converges to


f a.e.

Lemma 85 Let S be the class of complex, measurable, simple functions s on X such


that
µ({x : s(x) 6= 0}) < ∞ .
If 1 ≤ p < ∞, then S is dense in Lp (µ).

Proof. Clearly S ⊂ Lp (µ). If f ∈ Lp (µ) and f ≥ 0, let sn be a sequence of simple


functions such that 0 ≤ sn ≤ f , sn → f pointwise. Since sn ∈ Lp it easily follows that
sn ∈ S. Now inequality 0 ≤ |f − sn |p ≤ f p and the dominated convergence theorem
implies that sn → f in Lp . In the general case we write f = (u+ − u− ) + i(v + − v − )
and apply the above argument to each of the functions u+ , u− , v + , v − separately. 2

Theorem 86 Let X be a locally compact metric space and µ a Radon measure on X.


Then the class of compactly supported continuous functions Cc (X) is dense in Lp (µ)
for all 1 ≤ p < ∞.

Proof. According to the lemma it suffices to prove that the characteristic function of
a set of finite measure can be approximated in Lp by compactly supported continuous
functions. Let E be a measurable set of finite measure. Given ε > 0 let K ⊂ E be a
compact set such that µ(E \ K) < (ε/2)p . Let U be an open set such that K ⊂ U , U
is compact and µ(U \ K) < (ε/2)p . Finally let ϕ ∈ Cc (X) be such that supp ϕ ⊂ U ,
0 ≤ ϕ ≤ 1, ϕ(x) = 1 for x ∈ K. We have
Z 1/p Z 1/p Z 1/p
p p p
kχE − ϕkp = |χE − ϕ| dµ ≤ |χE | dµ + |ϕ| dµ
X\K X\K X\K
1/p 1/p
≤ µ(E \ K) + µ(U \ K) < ε.

65
2

7.1 Convolution

Let us start with an observation that since the Lebesgue measure in invariant under
translations and under the mapping x 7→ −x, for any f ∈ L1 (IRn ) and any y ∈ IRn we
have Z Z Z
u(x) dx = u(x + y) dx = u(−x) dx .
IRn IRn IRn

Also since for every measurable set E and any t > 0 the set tE = {tx : x ∈ E}
has measure Ln (tE) = tn Ln (E) we easily conclude that the Lebesgue integral has the
following scaling property
Z Z
n
u(x/t) dx = t u(x) dx .
IRn IRn

Observe that in the case of the Riemann integral the above equalities are direct con-
sequences of the change of variables formula. We will prove a corresponding change
of variables formula for the Lebesgue integral later, but as for the proof of the above
equalities we do not have to refer to the general change of variables formula as they
follow directly from the properties of the Lebesgue measure mentioned above.
Definition. For measurable functions f and g on IRn we define the convolution by
Z
(f ∗ g)(x) = f (x − y)g(y) dy
IRn

and it is a question under what conditions the convolution is well defined.


If f ∈ L1loc (IRn ) and g is bounded, measurable and vanishes outside a bounded set,
then the function y 7→ f (x − y)g(y) is integrable, so (f ∗ g)(x) is well defined and finite
for every x ∈ IRn . If f, g ∈ L1 (IRn ), then it can happen that for a given x the function
y 7→ f (x − y)g(y) is not integrable and hence (f ∗ g)(x) is not defined. However as a
powerful application of the Fubini theorem we can prove the following surprising result.

Theorem 87 If f, g ∈ L1 (IRn ), then for a.e. x ∈ IRn the function y 7→ f (x − y)g(y)


is integrable and hence (f ∗ g)(x) exists. Moreover f ∗ g ∈ L1 (IRn ) and

kf ∗ gk1 ≤ kf k1 kgk1 .

Proof. The function |f (x − y)g(y)| as a function of a variable (x, y) ∈ IR2n is measur-

66
able.13 Hence Fubini’s theorem yields
Z Z Z Z 
|f (x − y)g(y)| dx dy = |f (x − y)||g(y)| dx dy
IRn IRn IRn IRn
Z Z 
= |f (x − y)| dx |g(y)| dy = kf k1 kgk1 .
IRn IRn

Therefore
Z Z  Z Z

kf k1 kgk1 = |f (x − y)g(y)| dy dx ≥ f (x − y)g(y) dy dx = kf ∗gk1 .
IRn IRn n n

IR IR

Theorem 88 If f, g, h ∈ L1 (IRn ) and α, β ∈ C, then

1. f ∗ g = g ∗ f ;

2. f ∗ (g ∗ h) = (f ∗ g) ∗ h;

3. f ∗ (αg + βh) = αf ∗ g + βf ∗ h.

Remark. This theorem says that (L1 , ∗) is a commutative algebra.


Proof. (1) We have
Z Z Z
(f ∗g)(x) = f (x−y)g(y) dy = f (−y)g(y+x) dy = f (y)g(x−y) dy = (g∗f )(x) .
IRn IRn IRn

(2) We have
Z Z Z 
f ∗ (g ∗ h)(x) = f (x − y)(g ∗ h)(y) dy = f (x − y) g(y − z)h(z) dz dy
IRn IRn IRn
Z Z  Z Z 
= h(z) f (x − y)g(y − z) dy dz = h(z) f (−y)g(y − z + x) dy dz
IRn IRn IRn IRn
Z Z  Z
= h(z) f (y)g((x − z) − y) dy dz = h(z)(f ∗ g)(x − z) dz = (f ∗ g) ∗ h(x) .
IRn IRn IRn

(3) It is easy and left to the reader. 2

Theorem 89 If f ∈ L1loc (IRn ) and g is continuous with compact support, then

1. f ∗ g is continuous on IRn .
13
Why?

67
2. If f ∈ C 1 (IRn ), then f ∗ g ∈ C 1 (IRn ) and
∂ ∂f 
(f ∗ g)(x) = ∗ g (x) .
∂xi ∂xi

3. If g ∈ C 1 (IRn ), then f ∗ g ∈ C 1 (IRn ) and


∂ ∂g 
(f ∗ g)(x) = f ∗ (x) .
∂xi ∂xi

4. If f ∈ C r (IRn ) or g ∈ C r (IRn ), then f ∗ g ∈ C r (IRn ).

Proof. (1) Since g is bounded and has compact support, (f ∗ g)(x) is well defined and
finite for all x ∈ IRn . Fix x ∈ IRn . The function y 7→ f (y)g(x − y) vanishes outside a
sufficiently large ball B (because g as compact support). Let xn → x. Then there is a
ball B (perhaps larger than the one above), so that all the functions

y 7→ f (y)g(xn − y)

vanish outside B. Hence

|f (y)g(xn − y)| ≤ kgk∞ |f (y)|χB (y) ∈ L1 (IRn )

and the dominated convergence theorem yields


Z Z
(f ∗ g)(xn ) = f (y)g(xn − y) dy → f (y)g(x − y) dy = (f ∗ g)(x)
IRn IRn

which proves continuity of f ∗ g.


(2) The function ∂f /∂xi is bounded on bounded subsets of IRn (because it is continu-
ous). Fix x ∈ IRn . For any r > 0 there is a constant M > 0 such that

∂f
∂xi ≤ M for all z ∈ B(x, r + 1).

Let tk → 0, |tk | < 1. We have


f (x + tk ei − y) − f (x − y) ∂f
g(y) → (x − y)g(y) .
tk ∂xi
The functions on the left hand side are bounded by M kgk∞ for all y with |y| < r (by
the mean value theorem). Take r so large that supp g ⊂ B(0, r). Then
(f ∗ g)(x + tk ei ) − (f ∗ g)(x)
tk
f (x + tk ei − y) − f (x − y)
Z Z
∂f ∂f 
= g(y) dy → (x − y)g(y) dy = ∗ g (x) .
B(0,r) tk B(0,r) ∂xi ∂xi

68
We could pass to the limit under the sign of the integral, because the functions were
uniformly bounded.
(3) The proof is similar to that in the case (2); details are left to the reader.
(4) This result follows from (2) and (3) by induction. 2

Let ϕ ∈ C0∞ (B(0, 1)) be such a function that ϕ ≥ 0 and IRn ϕ(x) dx = 1. A function
R

with the given properties can be constructed explicitly. Indeed,


(  
1
exp |x|2 −1 if |x| < 1,
ϕ̃(x) =
0 if |x| ≥ 1.
belongs to C0∞ (IRn ) and we define
ϕ̃(x)
ϕ(x) = R .
IRn ϕ(y) dy
For ε > 0 we set
ϕε (x) = ε−n ϕ(x/ε) .
Then supp ϕε ⊂ B(0, ε), ϕε ≥ 0 and IRn ϕε (x) dx = 1. For f ∈ L1loc (IRn ) we put
R

fε = f ∗ ϕε .
It follows from Theorem 89 that fε ∈ C ∞ .

Theorem 90 Let f be a function defined on IRn .

1. If f is continuous, then fε ⇒ f uniformly on compact sets as ε → 0.


2. If f ∈ C r , then Dα fε ⇒ Dα f uniformly on compact sets for all |α| ≤ r.

Proof. (1) Uniform continuity of f on compact sets implies that for any compact set K
sup sup |f (x) − f (x − y)| → 0 as ε → 0.
|y|<ε x∈K
R R
Since IRn
ϕε (y) dy = 1, we have IRn
f (x)ϕε (y) dy = f (x) and hence for any compact
set K
Z

sup |f (x) − fε (x)| = sup
(f (x) − f (x − y))ϕε (y) dy
x∈K x∈K n
Z IR
≤ sup |f (x) − f (x − y)|ϕε (y) dy → 0 as ε → 0 .
x∈K B(0,ε)

(2) It follows from Theorem 89 and the induction argument that Dα (f ∗ϕε ) = (Dα f )∗ϕε
for all |α| ≤ r and hence the theorem follows from the part (1). 2

69
Lemma 91 If f ∈ Lp , 1 ≤ p < ∞, then fε ∈ Lp (IRn ) and kfε kp ≤ kf kp .

Proof. If p = 1, then

kfε k1 = kf ∗ ϕε k1 ≤ kf k1 kϕε k1 = kf k1

by Theorem 87. Let then 1 < p < ∞. Hölder’s inequality yields


Z p Z p
p
1/p 1/q
|fε (x)| = f (x − y)ϕε (y) dy ≤ |f (x − y)|ϕε (y) ϕε (y) dy
IRn IRn
Z  Z p/q
p
≤ |f (x − y)| ϕε (y) dy ϕε (y) dy = |f |p ∗ ϕε (x) .
IRn IRn

Hence
Z
kfε kpp ≤ |f |p ∗ ϕε (x) dx = k |f |p ∗ ϕε k1 ≤ k |f |p k1 kϕε k1 = kf kpp .
IRn

2
The following result proves density of the class of smooth functions in Lp (IRn ).

Theorem 92 If f ∈ Lp (IRn ) and 1 ≤ p < ∞, then fε ∈ Lp (IRn ) and kf − fε kp → 0 as


ε → 0.

Proof. We will need the following result.

Lemma 93 If f ∈ Lp (IRn ), 1 ≤ p < ∞, then


Z
lim |f (x + y) − f (x)|p dx = 0 .
y→0 IRn

Proof. For y ∈ IRn let τy : Lp (IRn ) → Lp (IRn ) be the translation operator defined by

(τy f )(x) = f (x + y) for f ∈ Lp (IRn ).

The lemma says that τy is continuous, i.e. kτy f − f kp → 0 as y → 0. Given ε > 0


let g be a compactly supported continuous function such that kf − gkp < ε/3, see
Theorem 86. Then

kτy f −f kp ≤ kτy f −τy gkp +kτy g−gkp +kf −gkp = 2kf −gkp +kτy g−gkp < 2ε/3+kτy g−gkp .

Since τy g ⇒ g converges uniformly, there is δ > 0 such that kτy g − gkp < ε/3 for |y| < δ
and hence
kτy f − f kp < ε for |y| < δ,

70
which proves the lemma. 2
Now we can complete the proof of the theorem. Assume first that 1 < p < ∞.
Hölder’s inequality and Fubini’s theorem yield
Z Z p

kf − fε kp = (f (x) − f (x − y))ϕ ε (y) dy dx
IRn
n
IR
Z Z p
1/p 1/q
≤ |f (x) − f (x − y)|ϕε (y) ϕε (y) dy dx
n n
ZIR  ZIR Z
p
p/q 
≤ |f (x) − f (x − y)| ϕε (y) dy ϕε (y) dy dx
IRn IRn IRn
| {z }
1
Z Z
= |f (x) − f (x − y)|p ϕε (y) dy dx
n
ZIR B(0,ε)
Z 
= |f (x) − f (x − y)|p dx ϕε (y) dy → 0 as ε → 0.
B(0,ε) IRn

by Lemma 93. If p = 1, then the above argument simplifies, because we do not have
to use Hölder’s inequality. 2

Corollary 94 C0∞ is dense in Lp (IRn ) for all 1 ≤ p < ∞.

Proof. Since continuous functions with compact support are dense in Lp it suffices to
prove that every compactly supported continuous function can be approximated in Lp
by C0∞ . This, however, immediately follows from Theorem 92, because if f vanishes
outside a bounded set, then fε has compact support. 2

8 Functions of bounded variation


Definition. We say that a function f : [a, b] → IR has bounded variation if its
variation defined by
b
_ n−1
X
f = sup |f (xi+1 ) − f (xi )| < ∞
a i=0

is finite, where the supremum is taken over all n and all partitions a = x0 < x1 < . . . <
xn = b.
Clearly monotonic functions have bounded variation
b
_
f = |f (b) − f (a)| .
a

71
Wx
If a function f : [a, b] → IR has bounded variation,
Wx then the function x →
7 a f is
nondecreasing on [a, b]. Also the function x →
7 a f − f (x) is nondecreasing, because
f (y) − f (x) ≤ yx f for y > x. Therefore every function of bounded variation is a
W
difference of two nondecreasing functions.
x
_ x
_ 
f (x) = f− f − f (x) . (48)
a a

We proved

Theorem 95 (Jordan) A function f : [a, b] → IR has bounded variation if and only


if it is a difference of two nondecreasing functions.

Wxf : [a, b] → IR is a continuous function of bounded variation,


Exercise. Prove that if
then the function x 7→ a f is continuous.
The exercise and the formula (48) implies

Theorem 96 A continuous function has bounded variation if and only if it is a differ-


ence of two continuous nondecreasing functions.

Geometrically bounded variation is related to the length of a curve. Recall that if


ϕ = (ϕ1 , . . . , ϕk ) : [a, b] → IRk is a continuous curve, then its length is defined by

72
n−1
X
`(ϕ) = sup kϕ(xi+1 ) − ϕ(xi )k
i=0
n−1
X p
= sup (ϕ1 (xi+1 ) − ϕ1 (xi ))2 + . . . + (ϕk (xi+1 ) − ϕk (xi ))2
i=1

where the supremum is taken over all n and all partitions a = x0 < x1 < . . . < xn = b.
Definition. We say that a curve ϕ : [a, b] → IRk is rectifiable if it has finite length
`(ϕ) < ∞.
The following result easily follows from the definition of length and the definition
of variation.

Theorem 97 A continuous curve ϕ : [a, b] → IRk is rectifiable if and only if the


coordinate functions ϕi : [a, b] → IR are of bounded variation for i = 1, 2, . . . , k.

The following result is an easy exercise.

Theorem 98 If functions f, g : [a, b] → IR have bounded variation, then the functions


f ± g and f g have bounded variation. If in addition g ≥ c > 0, then the function f /g
has bounded variation.

Exercise. Prove the above theorem.

Proposition 99 Let f be a monotonic function on (a, b). Then the set of points of
(a, b) at which f is discontinuous is at most countable.

Proof. Suppose f is nondecreasing. If f is discontinuous at x ∈ (a, b), then

lim f (t) < lim+ f (t)


t→x− t→x

and hence there is a rational number r(x) such that

lim f (t) < r(x) < lim+ f (t) .


t→x− x→t

It is easy to see that if f is discontinuous at x1 and at x2 , x1 6= x2 , then r(x1 ) 6= r(x2 ).


Therefore there is a one-to-one correspondence between the set of points where f is
discontinuous and the following subset of Q

{r(x) : f is discontinuous at x}

which clearly is at most countable. 2

73
Corollary 100 If f : [a, b] → IR has bounded variation, then the set of points at which
f is discontinuous is at most countable.

We will prove now a deep and important result.

Theorem 101 Functions of bounded variation are differentiable a.e.

Proof. It suffices to prove that nondecreasing functions are differentiable a.e.

Lemma 102 If f : [a, b] → IR is nondecreasing, then


f (x + h) − f (x)
lim sup <∞ a.e. (49)
h→0+ h

Proof. Let E be the set of points where the upper limit (49) equals ∞. Fix an arbitrary
K > 0. For every x ∈ E there is an arbitrarily small interval [x, x + h] such that
f (x + h) − f (x)
>K,
h
i.e.
h < K −1 (f (x + h) − f (x)) . (50)
Such intervals cover the set E in S the Vitali sense. Accordingly, we can select disjoint
intervals I1 , I2 , . . . such that |E \ ∞
P∞
I
k=1 k | = 0, so |E| ≤ k=1 k | and hence
|I

X
|E| ≤ |Ik | < K −1 (f (b) − f (a))
k=1

by (50) and the fact that f is nondecreasing. Since the above inequality holds for any
K > 0 it follows that |E| = 0. 2
Definition. The Dini derivatives are defined as follows
f (x + h) − f (x) f (x + h) − f (x)
D+ f (x) = lim sup , D+ f (x) = lim inf ,
h→0+ h +
h→0 h

f (x + h) − f (x) f (x + h) − f (x)
D− f (x) = lim sup , D− f (x) = lim inf .
h→0− h −
h→0 h

The lemma says that D+ f < ∞ a.e. Similarly we prove that all other Dini deriva-
tives D+ f, D− f, D− f are finite a.e.
We need to prove that

D+ f = D+ f = D− f = D− f a.e. (51)

74
To this end it suffices to prove that

D− f ≥ D+ f a.e. and D+ f ≥ D− f a.e.

because then (51) will follow from the inequality

D+ f ≥ D+ f ≥ D− f ≥ D− f ≥ D+ f a.e.

We will prove only that D− f ≥ D+ f a.e. as the prof of the inequality D+ f ≥ D− f


a.e. is very similar.
We need to prove that the set {D− f < D+ f } has measure zero. We have
[
{D− f < D+ f } = {D− f < u < v < D+ f } .
0<u<v
| {z }
u,v∈Q E(u,v)

This is a countable union of sets E(u, v) and thus it suffices to prove that each set
E(u, v) has measure zero. Let U be an arbitrary open set such that E(u, v) ⊂ U . For
every x ∈ E(u, v) there is an arbitrarily small h > 0 such that [x − h, x] ⊂ U and
(f (x − h) − f (x))/(−h) < u, i.e.14
x
_
f < hu . (52)
x−h

Such intervals cover the set E(u, v) in the Vitali sense, so we can select pairwise disjoint
intervals I1 , I2 , . . . such that

[
|E(u, v) \ Ik | = 0 .
k=1

We have ∞ _ ∞
X X
f ≤u |Ik | ≤ u|U | .
k=1 Ik k=1

The first inequality follows from (52) and the second inequality from the fact that the
intervals I1 , I2 , . . . are disjoint and contained in U .
Denote15 G = E(u, v) ∩ ∞
S
S |E(u, v) \ G| = 0. For every x ∈ G there
k=1 int Ik . Clearly
is an arbitrarily short interval [x, x + h] ⊂ k int Ik such that (f (x + h) − f (x))/h > v
and hence
x+h
_
f > hv .
x

14
We take [x − h, x] because D− f < u is a condition for the limit as h → 0− .
15
int Ik denotes the interior of Ik .

75
These intervals cover S the set G in the Vitali
S sense, so we can select pairwise disjoint
intervals J1 , J2 , . . . ⊂ k Ik such that |G \ k Jk | = 0 and

X∞ _ ∞
X [∞
f >v |Jk | = v Jk .


k=1 Jk k=1 k=1

We have ∞
[ X ∞ _ ∞ _
X
v|E(u, v)| = v|G| ≤ v Jk < f≤ f ≤ u|U | .


k=1 k=1 Jk k=1 Ik

Since the measure of U can be arbitrarily close to the measure of E(u, v) we obtain

v|E(u, v)| ≤ v|E(u, v)|

and hence |E(u, v)| = 0, because v > u. The proof is complete. 2

Theorem 103 If a function f : [a, b] → IR has bounded variation, then f 0 ∈ L1 (a, b).
If in addition f is nondecreasing, then
Z b
f 0 (x) dx ≤ f (b) − f (a) .
a

Example. The Cantor Staircase Function. This is a continuous nondecreasing function


f : [0, 1] → [0, 1] such that f (0) = 0, f (1) = 1 and f is constant in each of the intervals
in the complement of the Cantor set. Since the Cantor set has measure zero we conclude
that f 0 (0) = 0 a.e.

Therefore the function f has bounded variation (as a nondecreasing function) and
Z 1
0= f 0 (x) dx < f (1) − f (0) = 1 .
0

76
Actually Sierpiński provided a more surprising example. He constructed a strictly
increasing function with the derivative equal to zero a.e.
Proof of the theorem. We can assume that f is nondecreasing. Rx Let aε > a and bε < b
be points of differentiability of the function F (x) = a f (t) dt such that |a − aε | < ε,
|b − bε | < ε and F 0 (aε ) = f (aε ), F 0 (bε ) = f (bε ). Since
f (x + h) − f (x)
lim = f 0 (x) a.e.
h→0+ h
Fatou’s lemma yields
Z bε bε Z bε +h aε +h 
f (x + h) − f (x)
Z Z
0
f (x) dx ≤ lim inf dx = lim inf f− f
aε h→0+ aε h h→0 bε aε
= f (bε ) − f (aε ) ≤ f (b) − f (a) .
Passing to the limit with ε → 0 yields the result. 2

8.1 Absolutely continuous functions

Definition. We say that a function f : [a, b] → IR is absolutely continuous if for every


ε > 0 there is δ > 0 such that if (x1 , x1 + h1 ), .P . . , (xk , xk + hk ) are pairwise disjoint
intervals in [a, b] of total length less than δ, i.e. ki=1 hi < δ, then
k
X
|f (xi + hi ) − f (xi )| < ε .
i=1

Theorem 104 Absolutely continuous functions have bounded variation.

We leave the proof as an exercise.

Theorem 105 If f, g : [a, b] → IR are absolutely continuous, then also the functions
f ± g and f g are absolutely continuous. If, in addition, g ≥ c > 0 on [a, b], then f /g
is absolutely continuous.

We leave the proof as an exercise.


The following result provides a complete characterization of absolutely continuous
functions as the functions for which the fundamental theorem of calculus holds.
Rx
Theorem 106 If f ∈ L1 ([a, b]), then the function F (x) = a f (t) dt is absolutely
continuous. On the other hand if F is absolutely continuous, then its derivative is
integrable F 0 ∈ L1 ([a, b]) and
Z x
F (x) = F (a) + F 0 (t) dt for all x ∈ [a, b].
a

77
Rx
Proof. Absolute continuity of the function F (x) = a
f (t) dt follows from the absolute
continuity of the integral, Theorem 52.
If F is absolutely continuous, then it has bounded variation and hence F 0 ∈ L1 ([a, b])
by Theorem 103. This implies that the function
Z x
x 7→ F 0 (t) dt
a

is absolutely continuous. Hence the function


Z x
ϕ(x) = F (x) − F (a) − F 0 (t) dt
a

is absolutely continuous, ϕ(a) = 0, and16 ϕ0 (x) = 0 for a.e. x. It suffices to prove that
ϕ(x) = 0 for all x. This will immediately follow from the next lemma.

Lemma 107 If ϕ : [a, b] → IR is absolutely continuous and ϕ0 (x) = 0 a.e., then ϕ is


constant.

Proof. Fix β ∈ (a, b) and ε > 0. For almost every x ∈ (a, β) there is an arbitrary small
interval [x, x + h] ⊂ (a, β) such that

ϕ(x + h) − ϕ(x)
< ε.
h

Such intervals form a Vitali covering of the set of points in (a, β) where the derivative of
ϕ equals 0 (almost all points in (a, β)). Hence we can select pairwise disjoint intervals
from this covering [x1 , x1 + h1 ], [x2 , x2 + h2 ], . . . ⊂ (a, β) that cover almost all points of
(a, β).
For a given ε > 0 we choose δ > 0 as in the S condition for the absolute continuity
of ϕ. For N sufficiently large, the set (a, β) \ N
i=1 [xi , xi + hi ] has measure less than
δ. This set is a union of N + 1 open intervals. Denote these intervals by (ξj , ξj0 ),
j = 1, 2, . . . , N + 1. We have
N
X N
X +1
|ϕ(β) − ϕ(a)| ≤ |ϕ(xi + hi ) − ϕ(xi )| + |ϕ(ξj0 ) − ϕ(ξj )| ≤ ε(β − a) + ε .
i=1 j=1

Since this inequality holds for any ε > 0 we conclude that ϕ(β) = ϕ(a), so the function
ϕ is constant. This proves the lemma and also completes the proof of the theorem. 2

Theorem 108 (Integration by parts) If the functions f, g : [a, b] → IR are abso-


lutely continuous, then Z b Z b
0 b
f g = f g|a − f 0g .
a a
Rx
16
Because d
dt a
F 0 (t) dt = F 0 (x) a.e. by the Lebesgue differentiation theorem.

78
Proof. Clearly f g 0 = (f g)0 − f 0 g a.e. and absolute continuity of f g implies
Z b Z b Z b Z b
0 0 0
fg = (f g) − fg= f g|ba − f 0g .
a a a a

Theorem 109 Every function of bounded variation is a sum of an absolutely contin-


uous function and a singular function, i.e. such a function that its derivative equals 0
a.e.

Proof. The theorem follows immediately from the equality


 Z x  Z x
0
f (x) = f (x) − f (t) dt + f 0 (t) dt .
a a

2
Example. Every Lipschitz function f : [a, b] → IR is absolutely continuous and hence
it is differentiable a.e. and satisfies the claim of the fundamental theorem of calculus.
Definition. For a function f : [a, b] → IR, the Banach indicatrix is defined by

N (f, y) = #f −1 (y) for y ∈ IR,

i.e. N (f, y) is the number of points in the preimage of y (it can be equal to 0 or to ∞
for some y).

Theorem 110 (Banach) For a continuous function f : [a, b] → IR its indicatrix is


measurable and
_b Z
f= N (f, y) dy . (53)
a IR

Remark. We do not assume that the function f has bounded variation, so if the
function f has unbounded variation, then the theorem says that the integral of the
function on the right hand side of (53) equals ∞.
Proof. Under construction.

Theorem 111 If f : [a, b] → IR is absolutely continuous, then


b
_ Z b
f= |f 0 (t)| dt .
a a

79
Proof. Under construction.
The above two theorems immediately imply the following result.

Corollary 112 If f : [a, b] → IR is absolutely continuous, then


Z b Z
0
|f (t)| dt = N (f, y) dy .
a IR

Theorem 113 (Rademacher) If f : Ω → IR is Lipschitz continuous, where Ω ⊂ IRn


is open, then f is differentiable a.e., i.e.
 
∂f ∂f
∇f (x) = (x), . . . , (x)
∂x1 ∂xn

exists a.e. and


f (y) − f (x) − ∇f (x) · (y − x)
lim = 0 for a.e. x ∈ Ω .
y→x ky − xk

Proof. Let ν ∈ S n−1 and


d
Dν f (x) =
f (x + tν)|t=0
dt
be the directional derivative whenever it exists. Since Dν f (x) exists precisely when the
Borel-measurable functions
f (x + tν) − f (x) f (x + tν) − f (x)
lim sup and lim inf
t→0 t t→0 t
coincide, the set Aν on which Dν f fails to exist is Borel-measurable. The function
t 7→ f (x+tν) is absolutely continuous and hence differentiable a.e. Thus the intersection
of the set Aν with any line parallel to ν has one dimensional measure zero and hence
Fubini’s theorem implies that Aν has measure zero. Accordingly, for every ν ∈ S n−1

Dν f (x) exists for a.e. x ∈ Ω

Take any ϕ ∈ C0∞ (Ω) and note that for each sufficiently small h > 0

f (x + hν) − f (x) ϕ(x − hν) − ϕ(x)


Z Z
ϕ(x) dx = − f (x) dx .
Ω h Ω −h
Indeed, invariance of the integral with respect to translations yields
Z Z
f (x + hν)ϕ(x) dx = f (x)ϕ(x − hν) dx
Ω Ω

80
and h has to be so small that x+hν ∈ Ω for all x ∈ supp ϕ. The dominated convergence
theorem yields Z Z
Dν f (x)ϕ(x) dx = − f (x)Dν ϕ(x) dx .
Ω Ω

This is true for any ν ∈ S n−1 and in particular we have


Z Z
∂f ∂ϕ
(x) ϕ(x) dx = − f (x) (x) dx for i = 1, 2, . . . , n.
Ω ∂xi Ω ∂xi
Now
Z Z Z
Dν f (x)ϕ(x) dx = − f (x)Dν ϕ(x) dx = − f (x) (∇ϕ(x) · ν) dx
Ω Ω Ω
n Z n Z
X ∂ϕ X ∂f
= − f (x) (x) νi dx = ϕ(x) (x) νi dx
i=1 Ω ∂xi i=1 Ω ∂xi
Z
= ϕ(x)(∇f (x) · ν) dx .

g(x)ϕ(x) dx = 0 for every ϕ ∈ C0∞ (Ω), then g = 0


R
Lemma 114 If g ∈ L1loc (Ω) and Ω
a.e.

We leave the proof of this lemma as an exercise. The lemma implies that

Dν f (x) = ∇f (x) · ν a.e.

Now let ν1 , ν2 , . . . be a countable dense subset of S n−1 and let

Ak = {x ∈ Ω : ∇f (x), Dνk f (x) exists and Dνk f (x) = ∇f (x) · νk } .

Let A = ∞
T
k=1 Ak . Clearly |Ω \ A| = 0 and

Dνk f (x) = ∇f (x) · νk for all x ∈ A and all k = 1, 2, . . .

We will prove that f is differentiable at each point of the set A. To this end, for any
x ∈ A, ν ∈ S n−1 and h > 0 define
f (x + hν) − f (x)
Q(x, ν, h) = − ∇f (x) · ν
h
and it suffices to prove that if x ∈ A, then for every ε > 0 there is δ > 0 such that

|Q(x, ν, h)| < ε whenever 0 < h < δ and ν ∈ S n−1 .

Let the function f be L-Lipschitz, i.e. |f (x) − f (y)| ≤ √ L|x − y| for all x, y ∈ Ω. This
implies that |∂f /∂xi | ≤ L a.e. and thus |∇f (x)| ≤ nL a.e. Hence for any x ∈ A,
ν, ν 0 ∈ S n−1 and h > 0

|Q(x, ν, h) − Q(x, ν 0 , h)| ≤ ( n + 1)L|ν − ν 0 | .

81
Given ε > 0 let p be so large that for each ν ∈ S n−1
ε
|ν − νk | < √ for some k = 1, 2, . . . , p.
2( n + 1)L

Since
lim Q(x, νi , h) = 0 for all x ∈ A and all i = 1, 2, . . .
h→0+

we see that for a given x ∈ A there is δ > 0 such that

|Q(x, νi , h)| < ε/2 whenever 0 < h < δ and i = 1, 2, . . . , p .

Now

|Q(x, ν, h)| ≤ |Q(x, νk , h)| + |Q(x, νk , h) − Q(x, ν, h)| < ε/2 + ( n + 1)L|νk − ν| < ε

whenever ν ∈ S n−1 and 0 < h < δ and the theorem follows. 2

Theorem 115 (McShane) If f : A → IR is an L-Lipschitz function defined on a


subset A ⊂ X of a metric space X, then there is an L-Lipschitz function f˜ : X → IR
such that f˜|A = f .

Proof. For x ∈ X we define

f˜(x) = inf (f (y) + Ld(x, y))


y∈A

and it a a routine exercise to check that f˜ is L-Lipschitz and satisfies f˜|A = f . 2

Theorem 116 (Stepanov) Let Ω ⊂ IRn be open. Then a measurable function f :


Ω → IR is differentiable a.e. in Ω if and only if

|f (f ) − f (x)|
lim sup <∞ for a.e. x ∈ Ω. (54)
y→x ky − xk

The implication ⇒ is obvious. To prove the implication ⇐ first we show that (54)
implies that f is Lipschitz continuous on a big set A in the sense that the measure of
Ω \ A is small. Then we extend f |A to a Lipschitz function on IRn using McShane’s
theorem. Now f˜ is differentiable a.e. by Rademacher theorem. In particular it is
differentiable at a.e. point of A and it remains to prove that f is differentiable at every
density point of A at which f˜ is differentiable. We leave details as an exercise. 2

82
9 Signed measures
Definition. Let M be a σ-algebra. A function µ : M → [−∞, ∞] is called signed
measure if

1. µ(∅) = 0;
2. µ attains at most one of the values ±∞;
3. µ is countably additive.

Examples.

1. µ = µ1 − µ2 , where µ1 and µ2 are (positive) measures and one of them is finite.


2. Z
µ(E) = f dν, f ∈ L1 (ν) .
E

Definition. A measurable set E ∈ M is called positive if µ(A) > 0 for every measur-
able subset A ⊂ E.
Example. If E = E1 ∪ E2 , E1 ∩ E2 = ∅, µ(E1 ) = 7, µ(E2 ) = −3, then µ(E) = 4 > 0,
but E is not positive.

Lemma 117 If µ is a signed measure, then every measurable set such that 0 < µ(E) <
∞ contains a positive set A of positive measure.

Proof. If E is positive, then we take A = E. Otherwise there is B ⊂ E, µ(B) < 0. Let


n1 be the smallest positive integer such that there is B1 ⊂ E with
1
µ(B1 ) ≤ − .
n1
If A1 = E \ B1 is positive, then we take A = A1 (observe that µ(A1 ) > 0 as otherwise
µ(E) < 0). If A1 is not positive, then let n2 be the smallest positive integer such that
there is B2 ⊂ E \ B1 with
1
µ(B2 ) ≤ − .
n2
If A2 = E \(B1 ∪B2 ) is positive, then we take A = A2 (observe that
Smµ(A2 ) > 0). If A2 is
not positive, then we define A3 as above and so on. If Am = E \ j=1 Bj is positive for
some m, then we take A = Am (µ(Am ) > 0). Otherwise we obtain an infinite sequence
Bj and we set
[∞
A=E\ Bj .
j=1

83
We will prove that A is positive and µ(A) > 0. We have

X
IR 3 µ(A) = µ(E) − µ(Bj )
j=1

and hence ∞ ∞
X X 1
0 < µ(E) = µ(A) + µ(Bj ) ≤ µ(A) −
j=1
n
j=1 j

X 1
< µ(A) . (55)
n
j=1 j

Note that µ(A) < ∞ as otherwise Pµ(E) = µ(A) + µ(E \ A) = ∞ + µ(E \ A) = ∞.


Hence (55) yields µ(A) > 0 and ∞ j=1 1/nj < ∞. In particular
S∞ nj → ∞ ad j → ∞.
It remains to proveSthat A is positive. If C ⊂ A = E \ j=1 Bj is measurable, the for
every m, C ⊂ E \ m j=1 Bj and the definition of nm+1 yields

1
µ(C) > − → 0 as m → ∞
nm+1 − 1
and hence µ(C) ≥ 0. This proves positivity of A. 2

Theorem 118 (Hahn’s decomposition theorem) If µ is a signed measure, then


there exist two disjoint measurable sets X + and X − such that

X = X + ∪ X −, X+ ∩ X− = ∅ ,

µ is positive on X + and µ is negative on X − (i.e. −µ is positive on X − ).

Therefore every signed measure is on the form

µ = µ+ − µ−

where µ+ and µ− are positive measures defined by µ+ (E) = µ(E ∩ X + ), µ− (E) =


−µ(E ∩ X − ). Moreover the measures µ+ and µ− are concentrated on disjoint sets X +
and X − .
Exercise. Prove that the decomposition X = X + ∪X − is unique up to sets of µ-measure
zero, i.e. if X = X̃ + ∪ X̃ − is another decomposition, then

µ(X + \ X̃ + ) = µ(X̃ + \ X + ) = µ(X − \ X̃ − ) = µ(X̃ − \ X − ) = 0 .

R
Example. Let µ(E) = E
f dν, where f ∈ L1 (ν). Then µ is a signed measure and
Z Z
µ(E) = +
f dν − f − dν .
E E

84
Hence we can take

X + = {x : f (x) ≥ 0}, X − = {x : f (x) < 0} ,

but we also can take

X̃ + = {x : f (x) > 0}, X̃ − = {x : f (x) ≤ 0} .

In general X + 6= X̃ + , X − 6= X̃ − , however, the sets differ by the set where f equals 0


and this set has measure zero
Z
µ({x : f (x) = 0}) = f dν = 0 .
{f =0}

Proof of Hahn’s theorem. Suppose µ does not attain value +∞. Let

M = sup{µ(A) : A ∈ M, A is positive} .

Then there is a sequence of positive sets

A1 ⊂ A 2 ⊂ A3 ⊂ . . . ; µ(Ai ) → M

and if A = ∞
S
i=1 Ai , then A is positive with µ(A) = M < ∞. It remains to prove that
X \ A is negative (then we take X + = A, X − = X \ A). For if not, there is E ⊂ X \ A
with 0 < µ(E) < ∞. Hence Lemma 117 implies that there is a positive subset C ⊂ E
of positive measure. Now A ∪ C is positive and

µ(A ∪ C) = µ(A) + µ(C) > M

which is an obvious contradiction. 2


As an application we will prove the Radon-Nikodym-Lebesgue theorem and then
we will classify all continuous linear functionals on Lp (µ) for 1 ≤ p < ∞.
Definition. Let µ be a (positive) measure in a σ-algebra M. If ν is another (positive)
measure in M such that

E ∈ M, µ(E) = 0 ⇒ ν(E) = 0 ,

then we say that ν is absolutely continuous with respect to µ and we write

ν  µ.

Example. If f ≥ 0 is measurable, and a measure ν is defined by


Z
ν(E) = f dµ ,
E

85
then ν  µ.
Definition. Let λ be a measure in M. We say that λ is concentrated on A ∈ M if
λ(E) = λ(A ∩ E) for all E ∈ M .
Equivalently λ is concentrated on A ∈ M if λ(X \ A) = 0.
Definition. Let λ1 , λ2 be two measures in M. If there are sets
A, B ∈ M, A∩B =∅
such that
λ1 is concentrated on A ,
λ2 is concentrated on B ,
then we say that the measures λ1 and λ2 are mutually singular and we write
λ1 ⊥ λ2 .

Lemma 119 Let λ1 , λ2 be two measures in M. If there is a set E ∈ M such that


λ1 (E) = 0 and λ2 is concentrated on E, then λ1 ⊥ λ2 .

Proof. It suffices to take A = X \ E and B = E. 2


Examples.

1. For a ∈ IRn the Dirac mass is a Borel measure defined by δa (E) = 1 if a ∈ E and
δa (E) = 0 if a 6∈ E. The lemma yields Ln ⊥ δa .
2. If M is a k-dimensional submanifold in IRn , k < n, and λ(E) = Hk (E ∩ M ) for
E ⊂ IRn , then Ln ⊥ λ.
3. if C is the standard ternary Cantor set and λ(E) = Hlog 2/ log 3 (E ∩ C) for E ⊂ IR,
then λ ⊥ L1 .

Theorem 120 (Radon-Nikodym-Lebesgue) Let µ and ν be two (positive), σ-finite


measures on M. Then

(a) There is a unique pair of measures λa , λs on M such that


λ = λa + λs , λa  µ, λs ⊥ µ .

(b) There is a M-measurable function h ≥ 0 on X such that


Z
λa (E) = h dµ for all E ∈ M.
E

If in addition λ is finite (i.e. λ(X) < ∞), then the measures λa , λs are finite and
h ∈ L1 (µ).

86
Remark. Part (a) is due to Lebesgue and part (b) due to Radon and Nikodym.
In particular if λ  µ, then
Z
λ(E) = h dµ (56)
E

for some h ≥ 0 and all E ∈ M which is a striking characterization of all absolutely


continuous measures. The function h is called Radon-Nikodym derivative and is often
denoted by h = dλ/dµ. With this notation (56) can be rewritten as
Z

λ(E) = dµ for all E ∈ M .
E dµ

Proposition 121 If λ is a finite measure, absolutely continuous with respect to the


Lebesgue measure, λ  Ln , then
dλ λ(B(x, r))
= lim+ for a.e. x ∈ IRn .
dLn r→0 |B(x, r)|

Proof. It follows from the Radon-Nikodym theorem that dλ/dLn ∈ L1 (IRn ) and
Z
λ(B(x, r)) dλ dλ
= dLn → (x) as r → 0+
|B(x, r)| B(x,r) dL n dL n

for a.e. x ∈ IRn by the Lebesgue differentiation theorem. 2

Proof of the Radon-Nikodym-Lebesgue theorem. First we will prove the second part of
the theorem which is due to Radon and Nikodym. We can state it as follows.

Theorem 122 (Radon-Nikodym) If λ and µ are σ-finite measures on a σ-algebra


M and λ  µ, then there is a nonnegative measurable function h : X → [0, ∞] such
that Z
λ(E) = h dµ for all E ∈ M .
E
If in addition λ is finite, then h ∈ L1 (µ).

Proof. We will prove the theorem under the additional assumption that µ(X) < ∞
and λ(X) < ∞. The general case of σ-finite measures easily follows from this special
case and the details are left to the reader. Let
Z
Φ = {ϕ : X → [0, ∞] : ϕ is measurable and ϕ dµ ≤ λ(E) for all E ∈ M} .
E

Clearly Φ 6= ∅, because ϕ ≡ 0 ∈ Φ. If ϕ1 , ϕ2 ∈ Φ, then max{ϕ1 , ϕ2 } ∈ Φ. Indeed,


Z Z Z
max{ϕ1 , ϕ2 } dµ = ϕ1 dµ + ϕ2 dµ
E E∩{ϕ1 ≥ϕ2 } E∩{ϕ1 <ϕ2 }
≤ λ(E ∩ {ϕ1 ≥ ϕ2 }) + λ(E ∩ {ϕ1 < ϕ2 }) = λ(E) .

87
Define Z
M = sup ϕ dµ ≤ λ(X) < ∞ .
ϕ∈Φ X

There is a sequence ϕn ∈ Φ such that


Z
ϕn dµ → M as n → ∞ .
X

Let hn = max{ϕ1 , ϕ2 , . . . , ϕn } ∈ Φ. Then hRn is an increasing sequence of functions


pointwise convergent to a function h. Since E Rhn dµ ≤ λ(E) for all E ∈ M it follows
from the monotone convergence theorem that E h dµ ≤ λ(E) for all E ∈ M. Hence
h ∈ Φ. Moreover Z Z
h dµ = lim hn dµ = M .
X n→∞ X
We will show that Z
h dµ = λ(E) for all E ∈ M .
E
Clearly Z
η(E) = λ(E) − h dµ
E
defines a positive measure in M and we need to show that η ≡ 0. For if not there would
exist a set A ∈ M such that η(A) > 0. This implies that λ(A) > 0 and hence µ(A) > 0
(because λ  µ). Thus there is ε > 0 such that η(A) − εµ(A) > 0. Define ξ = η − εµ.
Then ξ is a signed measure and we can assume that A is positive with respect to ξ by
Lemma 117. Hence if E ∈ M, ξ(A ∩ E) ≥ 0 and thus
Z
0 ≤ ξ(A ∩ E) = η(A ∩ E) − εµ(A ∩ E) = λ(A ∩ E) − h dµ − εµ(A ∩ E) ,
A∩E
Z
h dµ + εµ(A ∩ E) ≤ λ(A ∩ E) ,
A∩E
but also Z
h dµ ≤ λ(E \ A)
E\A

because h ∈ Φ. Adding up the two inequalities yields


Z Z
(h + εχA ) dµ = h dµ + εµ(A ∩ E) ≤ λ(E) .
E E

This implies that h + εχA ∈ Φ. Since


Z
(h + εχA ) dµ = M + εµ(A) > M ,
X

we get a contradiction. The proof is complete. 2


Now we can prove the first part of the theorem which is due to Lebesgue.

88
Theorem 123 (Lebesgue) If λ and µ are σ-finite measures, then there exist unique
measures λa and λs such that

λ = λa + λ s , λa  µ, λs ⊥ µ .

Proof. We leave the proof of the uniqueness as an exercise and so we are left with the
proof of the existence of λa and λs . Let ν = µ + λ. Then µ  ν and λ  ν. By the
Radon-Nikodym theorem
Z Z
µ(E) = f dν, λ(E) = g dν
E E

for some measurable functions f, g : X → [0, ∞] and all E ∈ M. Define

λ0 (E) = λ(E ∩ {f = 0}), λ1 (E) = λ(E ∩ {f > 0}) .

Since λ = λ0 + λ1 it suffices to prove that

λ0 ⊥ µ and λ1  µ .

The measure λ0 is concentrated on the set {f = 0}. Since µ({f = 0}) = 0, we conclude
that λ0 ⊥ µ. To prove that λ1  µ let µ(E) = 0. Then f = 0, ν-a.e. in the set E, i.e.
ν(E ∩ {f > 0}) = 0 and thus
Z
λ1 (E) = g dν = 0 .
E∩{f >0}

2
We will apply now the Radon-Nikodym theorem to classify all continuous linear
functionals on Lp (µ) for 1 ≤ p < ∞. Let us start first with some general facts about
continuous linear mappings between normed spaces.

Theorem 124 Let L : X → Y be a linear mapping between normed spaces. Then the
following conditinos are equivalent

(a) L is continuous;

(b) L is continuous at 0;

(c) L is bounded, i.e. there is C > 0 such that

kLxk ≤ Ckxk for all x ∈ X .

89
Remark. Formally we should use different symbols to denote norms in spaces X and
Y . Since it will always be clear in which space we take the norm it is not dangerous to
use the same symbol k · k to denote apparently different norms in X and Y .
Proof. The implication (a)⇒(b) is obvious.
(b)⇒(c). Suppose L is continuous at 0 but not bounded. Then there is a sequence
xn ∈ X such that
kLxn k ≥ nkxn k .
Hence xn
0 ← L ≥1

nkxn k
which is a contradiction17 .
(c)⇒(a). Let xn → x. Then

kLx − Lxn k = kL(x − xn )k ≤ Ckx − xn k → 0

and hence Lxn → Lx which proves continuity of L. 2


In particular if X is a normed space over IR (or C), then a linear functional

Λ : X → IR (or C)

is continuous if and only if it is bounded, i.e. there is C > 0 such that

|Λx| ≤ Ckxk for all x ∈ X. (57)

The number
kΛk = sup |Λx|
kxk≤1

is called the norm of Λ.

Lemma 125 The norm Λ is the smallest number C for which the inequality (57) is
satisfied.

Proof. Suppose |Λx| ≤ Ckxk for all x ∈ X. If kxk ≤ 1, then |Λx| ≤ Ckxk ≤ C and
hence
kΛk = sup |Λx| ≤ C .
kxk≤1

Thus the number C is (57) cannot be smaller than kΛk. It remains to prove that

|Λx| ≤ kΛk kxk for all x ∈ X .


17
Convergence to 0 follows from the continuity of L at 0.

90
If x = 0, then the inequality is obvious. If x 6= 0, then

x
|Λx| = Λ kxk ≤ kΛkkxk .
kxk

The last inequality follows from the fact that kx/kxkk = 1. 2


Example. If 1/p + 1/q = 1, 1 < p, q < ∞ are Hölder conjugate, and g ∈ Lq , then
Z
Λg f = f g dµ
X

defines a bounded linear functional on Lp (µ) and kΛg k ≤ kgkq . Indeed, Hölder’s
inequality yields Z

|Λg f | = f g dµ ≤ kgkq kf kp .
X

Similarly if g ∈ L (µ), then Z
Λg f = f g dµ
X

defines a bounded linear functional on L1 (µ) and kΛg k ≤ kgk∞ .


It turns out that under reasonable assumptions every functional on Lp (µ), 1 ≤ p <
∞ is on the form Λg for some g ∈ Lq (or g ∈ L∞ if p = 1).

Theorem 126 Suppose 1 ≤ p < ∞ and µ is a σ-finite measure. Assume that Λ is a


bounded linear functional on Lp (µ). Then there is a unique g ∈ Lq (µ), 1/p + 1/q = 1
such that Z
Λf = f g dµ for all f ∈ Lp (µ) .
X
Moreover
kΛk = kgkq .

Proof. Uniqueness. Suppose


Z Z
Λf = f g1 dµ = f g2 dµ
X X

for some g1 , g2 ∈ Lq (µ) and all f ∈ Lp (µ). We need to show that g1 = g2 a.e. For
E ∈ M with µ(E) < ∞ we have
Z Z
g1 dµ = Λ(χE ) = g2 dµ .
E E

Hence Z
(g1 − g2 ) dµ = 0
E

91
for every E ∈ M with µ(E) < ∞. This easily implies that g1 − g2 = 0 a.e. and hence
g1 = g2 µ-a.e.
We will prove the theorem under the additional assumption that µ(X) < ∞. The
general case of σ-finite measure can be reduced to the case of finite measure and we
leave details to the reader. Define

λ(E) = Λ(χE ) .

It is obvious that λ is finitely additive

λ(A ∪ B) = λ(A) + λ(B) for A, B ∈ M, A ∩ B = ∅ .

Indeed,

λ(A ∪ B) = Λ(χA∪B ) = Λ(χA + χB ) = Λ(χA ) + Λ(χB ) = λ(A) + λ(B) .

We will use this fact to prove that actually λ is conuntably additive. Let E = ∞
S
i=1 Ei ,
where Ei ∈ M and Ei ∩ Ej = ∅ for i 6= j. Let Ak = E1 ∪ . . . ∪ Ek . We have

kχE − χAk kp = µ(E \ Ak )1/p → 0 as k → ∞ ,

because the sets E \ Ak form a decreasing sequence of sets with empty intersection and
µ(E \ A1 ) ≤ µ(X) < ∞. Continuity of the functional λ implies

ΛχAk → ΛχE

and hence finite additivity of λ yields


k
X
λ(Ei ) = λ(Ak ) → λ(E) as k → ∞ .
i=1

Accordingly

X
λ(Ei ) = λ(E)
i=1

which is countable additivity of λ. Note that λ is not necessarily positive, so

λ : M → IR

is a signed measure. By the Hahn decomposition theorem λ = λ+ − λ− , where λ+ and


λ− are positive finite measures concentrated on disjoint sets. Clearly µ(E) = 0 implies
that λ(E) = 0 and also λ+ (E) = λ− (E) = 0. Hence λ+ , λ−  µ. The Radon-Nikodym
theorem yields the existence of 0 ≤ g ± ∈ L1 (µ) such that
Z
±
λ (E) = g ± dµ .
E

92
Now Z
λ(E) = g dµ where g = g + − g − ∈ L1 (µ).
E
We can write the last statement as
Z
Λ(χE ) = χE g dµ .
X

Since simple functions are linear combinations of characteristic functions


Z
Λf = f g dµ (58)
X

whenever f is a simple function. Now every f ∈ L∞ (µ) is a uniform limit of a sequence


of simple functions and hence (58) holds for all f ∈ L∞ (µ). It remains to prove the
following three statements

(a) g ∈ Lq (µ) (we know that g ∈ L1 (µ));

(b) kΛk = kgkq (we know that kΛk ≤ kgkq by Hölder’s inequality);

(c) Λf = X f g dµ for all f ∈ Lp (µ) (we know it for f ∈ L∞ (µ)) .


R

We will complete the proof in the case 1 < p < ∞. The case p = 1 is easier and left to
the reader. Let 
g(x)/|g(x)| if g(x) 6= 0,
α(x) =
0 if g(x) = 0,
and
En = {x : |g(x)| ≤ n} .
Define
f = |g|q/p αχEn .
Clearly f ∈ L∞ (µ) (because |f | ≤ nq/p ). Moreover
(a) (b)
|f |p = |g|q χEn = f g .

Hence
Z Z Z 1/p
q by (b) f ∈L∞ by (a) q
|g| dµ = f g dµ = Λf ≤ kΛkkf kp = kΛk |g| dµ .
En X En

This yields
Z 1−1/p
q
|g| dµ ≤ kΛk .
En

93
The left hand side converges to kgkq as n → ∞ and hence kgkq ≤ kΛk. Since we
already know the opposite inequality kΛk ≤ kgkq we conclude

kΛk = kgkq .

This yields the claims (a) and (b). Now (c) follows from the continuity of the functional.
Indeed, there is a sequence18 L∞ 3 fk → f in Lp and hence
Z Z Z

Λf − f g dµ ≤ |Λf − Λfk | + Λfk − f g dµ = |Λf − Λfk | + (fk − f )g dµ

X X X
≤ |Λf − Λfk | + kfk − f kp kgkq → 0 as k → ∞ .

which proves (c). 2


Definition. Let X be a locally compact metric space. The class of continuous func-
tions vanishing at infinity C0 (X) consists of all continuous functions u : X → IR such
that for every ε > 0 there is a compact set K ⊂ X such that |u(x)| < ε for all x ∈ X \K.
Exercise. Prove that u ∈ C0 (IRn ) if and only if u is continuous on IRn and
lim|x|→∞ u(x) = 0.
Example. The set of positive integers IN is a locally compact metric space with
respect to the metric |x − y|. Functions on IN can be identified with infinite sequences.
Indeed, a sequence (xi )∞
i=1 corresponds to a function f (i) = xi . Every function on IN
is continuous and compact sets in IN are exactly finite subsets. It easily follows that
C0 (IN) consists of sequences such that |xi | → 0 as i → ∞. This space is denoted by
c0 = C0 (IN). Thus

c0 = {(xi )∞
i=1 : xi ∈ IR, |xi | → 0 as i → ∞} .

It is easy to see that c0 is a Banach space with respect to the norm kxk∞ =
supi=1,2,... |xi |, where x = (xi )∞ ∞
i=1 . Therefore c0 is a closed subspace in ` .

Recall that Cc (X) consists of all continuous functions on X that have compact
support. Clearly Cc (X) ⊂ C0 (X).

Theorem 127 Let X be a locally compact metric space. Then C0 (X) is a Banach
space with the norm
kuk∞ = sup |u(x)| .
x∈X

Moreover Cc (X) is a dense subset of C0 (X).

We leave the proof as an exercise.


Observe that if µ is a measure on X, then C0 (X) is a closed subspace of L∞ (µ).
Hence C0 (X) is the closure of Cc (X) in the metric of L∞ (µ). If µ is a Radon measure on
18
Because simple functions are dense in Lp .

94
a locally compact metric space, then the closure of Cc (X) in the metric of Lp (µ) equals
Lp (µ) for all 1 ≤ p < ∞ (Theorem 86). However the closure of Cc (X) in the metric of
L∞ (µ) equals C0 (X) L∞ (µ). We managed to characterize bounded linear functional
on Lp (µ) for all 1 ≤ p < ∞ (Theorem 126). It is not so easy to characterize bounded
linear functionals on L∞ (µ). However Theorem 128 below provides a characterization
of bounded linear functionals on C0 (X).
Recall that if µ is a signed measure, then Hahn’s decomposition theorem provides a
unique representation µ = µ+ −µ− , where µ+ and µ− are positive measures concentrated
on disjoint sets. We define the measure |µ| by |µ| = µ+ +µ− and call the number |µ|(X)
the total variation of µ.
If a signed measure µ is given by
Z
µ(E) = f dλ for some f ∈ L1 (λ) ,
E

then Z
|µ|(E) = |f | dλ .
E

If µ is a signed measure and f ∈ L1 (|µ|), then we define


Z Z Z
f dµ = +
f dµ − f dµ− .
X X X

Theorem 128 (Riesz Representation Theorem) Let X be a locally compact met-


ric space and let Λ be a bounded linear functional on C0 (X). Then there is a unique
Borel signed measure µ of finite total variation such that
Z
Λf = f dµ for all f ∈ C0 (X) . (59)
X

Moreover
kΛk = |µ|(X) .

In the proof we will need the following abstract and general result.

Lemma 129 Let X0 ⊂ X be a dense linear subspace in a normed space X. If Λ :


X0 → IR is a bounded linear functional (with respect to the norm in X), then there is
a unique functional Λ̃ : X → IR (called extension of Λ) such that

Λ̃x = Λx for all x ∈ X0 and kΛ̃k = kΛk .

Remarks. Here
kΛ̃k = sup |Λ̃x| and kΛk = sup |Λx| .
kxk≤1 kxk≤1
x∈X x∈X0

95
Usually the extension is denoted by the same symbol Λ rather than Λ̃.
Example. Let X be a locally compact metric space and λ a Radon measure on X. If
Λ : Cc (X) → IR is a linear functional such that

|Λf | ≤ M kf kL1 (λ) for all f ∈ Cc (X) ,

then there is unique bounded linear functional Λ̃ : L1 (λ) → IR such that Λ̃f = Λf for
f ∈ Cc (X) and
|Λ̃f | ≤ M kf kL1 (λ) for all f ∈ L1 (λ) .
Indeed, Cc (X) is a linear and dense subspace of L1 (λ) by Theorem 86.

Proof of the lemma. Let X0 3 xn → x ∈ X. Then we define

Λ̃x = lim Λxn (60)


n→∞

and we have to prove that:

(a) the limit exists;

(b) the limit does not depend on the particular choice of a sequence xn ∈ X0 that
converges to x;

(c) Λ̃x = Λx for x ∈ X0 ;

(d) kΛ̃k = kΛk.

If xn → x, xn ∈ X0 , then

|Λxn − Λxm | = |Λ(xn − xm )| ≤ kΛkkxn − xm k → 0 as n, m → ∞ ,

which shows that Λxn is a Cauchy sequence of real numbers and hence it is convergent.
This proves (a). If xn → x, yn → x, xn , yn ∈ X0 , then

|Λxn − Λyn | = |Λ(xn − yn )| ≤ kΛkkxn − yn k → 0 as n → ∞

which proves (b). If x ∈ X0 , then taking xn = x we have xn → x and hence Λ̃x =


limn→∞ Λx = Λx which proves (c). Clearly

kΛk = sup |Λx| ≤ sup |Λ̃x| = kΛ̃k ,


kxk≤1 kxk≤1
x∈X0 x∈X

but we actually have equality because if x ∈ X, then there is a sequence xn ∈ X0 ,


xn → x and hence

|Λ̃x| = lim |Λxn | ≤ kΛk lim kxn k = kΛkkxk .


n→∞ n→∞

96
which proves that kΛ̃k ≤ kΛk (see Lemma 125). The above two inequalities prove (d).
2
Remark. In the proof we used the fact that every Cauchy sequence in IR is convergent,
i.e. we used the fact that IR is Banach space. Therefore the same proof gives a more
general result about extensions of bounded linear mappings L : X0 → Y defined on a
dense linear subspace X0 ⊂ X of a noremd space X into a Banach space Y . The reader
should find an appropriate statement.
Proof of Theorem 128. First we prove uniqueness of the measure µ. To this end R we need
to prove that if µ is a signed Borel measure of finite total variation such that X f dµ = 0
for all f ∈ C0 (X), then µ ≡ 0. Let µ = µ+ − µ− be the Hahn decomposition with
measures µ+ and µ− concentrated on disjoint sets X + and X − respectively. If

1 if x ∈ X + ,
h(x) =
−1 if x ∈ X − ,

then h2 = 1 and dµ = hd|µ|. Hence


Z Z
|µ|(X) = d|µ| = h2 d|µ| . (61)
X X

Let fn ∈ Cc (X) be a sequence that converges to h in L1 (|µ|) (Theorem 86). Note that
Z Z
0= fn dµ = fn h d|µ| . (62)
X X

Now (61) and (62) yield


Z Z Z
2
|µ|(X) = (h − fn h) d|µ| = (h − fn )h d|µ| ≤ |h − fn | d|µ| → 0 as n → ∞ .
X X X

Hence |µ|(X) = 0 and thus µ = 0.


Now let Λ be a bounded linear functional on C0 (X). We can assume without lost
of generality that19 kΛk = 1. We would like to apply the Riesz representation theorem
for positive functionals on Cc (X), but unfortunately the functional Λ is not necessarily
positive. So we want to construct a positive functional Φ such that

|Λf | ≤ Φ(|f |) ≤ kf k∞ for all f ∈ Cc (X) . (63)

Once the functional Φ is constructed we can finish the proof of the theorem as follows.
By the Riesz representation theorem there is a (positive) Radon measure λ such that
Z
Φ(f ) = f dλ for all f ∈ Cc (X) .
X
19
Otherwise we multiply Λ by kΛk−1 .

97
Observe that
λ(X) = sup{Φ(f ) : 0 ≤ f ≤ 1, f ∈ Cc (X)} . (64)
R
Indeed, for 0 ≤ f ≤ 1, f ∈ Cc (X) we have X f dλ ≤ λ(X). On the other hand for
every compact
R set K ⊂ X there is 0 ≤ f ≤ 1, f ∈ Cc (X) such that f = 1 on K and
hence X f dλ ≥ λ(K). Now (64) follows from the fact that

λ(X) = sup λ(K) .


K⊂X
K−compact

Observe that (64) implies


λ(X) ≤ 1. (65)
It follows from (63) that
Z
|Λ(f )| ≤ Φ(|f |) = |f | dλ = kf kL1 (λ) .
X

Thus Λ is a bounded linear functional defined on a linear subspace20 Cc (X) ⊂ L1 (λ).


Hence the functional can be uniquely extended to a bounded linear functional Λ̃ on
L1 (λ) such that Λ̃f = Λf for f ∈ Cc (X) and

|Λ̃f | ≤ kf kL1 (λ) for all f ∈ L1 (λ) .

According to Theorem 126 there is a unique function g ∈ L∞ (λ), kgk∞ ≤ 1 such that
Z
Λ̃f = f g dλ for all f ∈ L1 (λ).
X

If f ∈ C0 (X) and fn ∈ Cc (X), fn → f in the norm of C0 (X), then


Z
Λ̃f = lim fn g dλ = Λfn → Λf
n→∞ X

and hence Λ̃f = Λf for all f ∈ C0 (X). Thus


Z
Λf = f g dλ for all f ∈ C0 (X) .
X

Accordingly, we have the representation (59) with dµ = gdλ. Since kΛk = 1 and
d|µ| = |g| dλ it follows that
Z
|µ|(X) = |g| dλ ≥ sup{|Λf | : f ∈ C0 (X), kf k∞ ≤ 1} = kΛk = 1 .
X

On the other hand kgk∞ ≤ 1 and λ(X) ≤ 1 so


Z
|µ|(X) = |g| dλ ≤ 1 .
X
20
Bounded with respect to the norm of L1 (λ).

98
Thus
|µ|(X) = 1 = kΛk
and the proof is complete. Therefore we are left with the construction of the positive
functional Φ satisfying (63).
Denote by Cc+ (X) the class of nonnegative functions in Cc (X) and for f ∈ Cc+ (X)
define
Φ(f ) = sup{|Λh| : h ∈ Cc (X), |h| ≤ f } .
If f ∈ Cc (X), then f = f + − f − , f + , f − ∈ Cc+ (X) and we define

Φ(f ) = Φ(f + ) − Φ(f − ) .

Clearly Φ(f ) ≥ 0 for f ∈ Cc+ (X), so Φ is positive and Φ satisfies (63). Therefore we
only need to show that Φ is linear. Obviously Φ(cf ) = cΦ(f ) and hence we are left
with the proof that

Φ(f + g) = Φ(f ) + Φ(g) for all f, g ∈ Cc (X). (66)

Observe that it suffices to verify (66) for f, g ∈ Cc+ (X). Indeed, for f, g ∈ Cc (X) we
will have then21
(f + g)+ − (f + g)− = f + − f − + g + − g −
and hence
(f + g)+ + f − + g − = (f + g)− + f + + g + .
Now, the assumed linearity of Φ on Cc (X)+ yields

Φ((f + g)+ ) + Φ(f − ) + Φ(g − ) = Φ((f + g)− ) + Φ(f + ) + Φ(g + )

Φ((f + g)+ ) − Φ((f + g)− ) = Φ(f + ) − Φ(f − ) + Φ(g + ) − Φ(g − ) .


| {z } | {z } | {z }
Φ(f +g) Φ(f ) Φ(g)

Thus we are left with the proof of (66) for f, g ∈ Cc (X)+ . Fix f, g ∈ Cc+ (X). Given
ε > 0, there exist h1 , h2 ∈ Cc (X) such that |h1 | ≤ f , |h2 | ≤ g and

Φ(f ) ≤ |Λh1 | + ε/2, Φ(g) ≤ |Λh2 | + ε/2 .

Let c1 = ±1, c2 = ±1 be such that |Λh1 | = c1 Λh1 , |Λh2 | = c2 Λh2 . We have

Φ(f ) + Φ(g) ≤ |Λh1 | + |Λh2 | + ε = Λ(c1 h1 + c2 h2 ) + ε


≤ Φ(|h1 | + |h2 |) + ε ≤ Φ(f + g) + ε

and hence
Φ(f ) + Φ(g) ≤ Φ(f + g) .
21
Because both sides are equal f + g.

99
Now we need to prove the opposite inequality. Let h ∈ Cc (X) satisfy |h| ≤ f + g, let
V = {x : f (x) + g(x) > 0}, and define
( (
f (x)h(x) g(x)h(x)
f (x)+g(x)
if x ∈ V , f (x)+g(x)
if x ∈ V ,
h1 (x) = h2 (x) =
0 if x 6∈ V , 0 if x 6∈ V .
It is easy to see that the functions h1 and h2 are continuous, h = h1 + h2 , and |h1 | ≤ f ,
|h2 | ≤ g. We have
|Λh| = |Λh1 + Λh2 | ≤ |Λh1 | + |Λh2 | ≤ Φ(f ) + Φ(g)
and after taking the supremum over all functions h we have
Φ(f + g) ≤ Φ(f ) + Φ(g)
which completes the proof. 2

10 Maximal function
Definition. For a function u ∈ L1loc (IRn ) we define the Hardy-Littlewood maximal
function by Z
M u(x) = sup |u(y)| dy .
r>0 B(x,r)

Clearly M u : IRn → [0, ∞] is a measurable function.

Theorem 130 (Mardy-Littlewood maximal theorem) If u ∈ Lp (IRn ), 1 ≤ p ≤


∞, then M u < ∞ a.e. Moreover

(a) If u ∈ L1 (IRn ), then for every t > 0


Z
C(n)
|{x : M u(x) > t}| ≤ |u| . (67)
t IRn

(b) If u ∈ Lp (IRn ), 1 < p ≤ ∞, then M u ∈ Lp (IRn ) and


kM ukp ≤ C(n, p)kukp . (68)

Remarks. Let u = χ[0,1] . Then for x ≥ 1


Z 2x
1 1
M u(x) ≥ χ[0,1] (y) dy = 6∈ L1 .
2x 0 2x
Hence the inequality (68) does not hold for p = 1. If g ∈ L1 (IRn ), then
Z
t|{x : |g(x) > t}| =
IRn

100

You might also like