You are on page 1of 51

Basic Measure Theory

Heinrich v. Weizsäcker

Revised Translation

2004/2008
Fachbereich Mathematik
Technische Universität Kaiserslautern
2 Measure Theory (October 20, 2008)

Preface

The aim of this text is to give a short introduction into the basic concepts
of the Lebesgue theory of measure and integration on general spaces in order to
enable the reader to follow Master courses in Probability and in Functional anal-
ysis. The main advantage of Lebesgue’s theory over more elementary concepts
like the Riemann integral is its power in dealing with countable limit operations,
both for the measure (viz. generalized volume) of sets, and for the integral of
functions. These become essential features in higher real analysis and proba-
bility. The price is some additional work in constructing suitable domains of
definition for measure and integral.
There are two main approaches to this theory. The first approach starts with
measures and then constructs the integrals, in analogy to the classical passage
from volume to integration. Conversely, the second approach constructs in
several steps the space of integrable functions and then a set has finite measure
simply if the indicator function of this set is integrable. In this text we follow
the first line, even though it takes a little longer. The main reason for this choice
is that in Probability the measurable sets (’events’) have an important role by
themselves and this path of arguments gives additional information about them.
The common goal of both approaches are the limit theorems of chapter 4 (in
particular the dominated convergence theorem). A reader who knows these limit
theorems can start directly with chapter 5 and check the definitions and results
of the first four chapters only when needed.
Up to minor modifications this text is a translation of the original German
version [?]. Only chapter 2 has been revised more thoroughly. For most math-
ematical words and phrases we introduce both the English and German term.
Contents

1 Measures 5
1.1 σ-Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Null-Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 An example of a non-measurable set . . . . . . . . . . . . . . . . 9

2 Construction of measures 11
2.1 Set algebras and contents . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Outer measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Proof of the Measure Extension Theorem . . . . . . . . . . . . . 16
2.4 Monotone classes and more on uniqueness . . . . . . . . . . . . . 19

3 Measurable functions 21

4 The Integral 25
4.1 The elementary integral . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 The Integral for non-negative measurable and for integrable func-
tions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 The Limit Theorems . . . . . . . . . . . . . . . . . . . . . . . . . 30

5 Fubini’s Theorem 33

6 Convergence in measure 37
6.1 Comparison with other types of convergence . . . . . . . . . . . . 37
6.2 A Metric for Local Convergence in Measure . . . . . . . . . . . . 39

7 The Lp -spaces 43

8 The Radon-Nikodym Theorem 47


8.1 Absolute continuity of measures . . . . . . . . . . . . . . . . . . . 47
8.2 Lebesgue and Hahn-Jordan decomposition . . . . . . . . . . . . . 49

3
4 Measure Theory (October 20, 2008)
Chapter 1

Measures

1.1 σ-Algebras
A measure measures the size of sets. Not all sets can be measured in this sense
as we shall see in section 1.4. On the other hand the class of sets which are mea-
surable should be sufficiently rich. In particular we want to keep measurability
of sets if we perform simple operations like taking the complement or taking
(countable) unions and intersections. This leads to the following definition.

Definition 1.1 Let S be a set and let F be a collection of subsets of S. Then


F is called a σ-algebra over S, if the following holds:
a) S ∈ F,
b) If A ∈ F then Ac ∈ F. 1 S∞
c) For every sequence A1 , A2 , · · · of elements of F the union n=1 An is also a
member (i.e. an element) of F.

Remark: 1. Properties a) and b) imply that the empty set ∅ belongs also
to F, since ∅ = S c .
2. Properties b) and c) imply that for every sequence A1 , A2 , · · · of elements of
T∞
F the Sintersection n=1 An is in F, because first Acn ∈ F for each n due to b),
hence n Acn ∈ F due to c) and finally b) yields


à ∞
!c
\ [
An = Acn ∈ F.
n=1 n=1

If we choose An = ∅ for n ≥ 3 we see in particular that for every pair of sets


A, B ∈ F the sets A∪B and A∩B and hence also the difference set A\B = A∩B c
belong to F.
3. The largest σ-algebras over a set S is the power set (Potenzmenge) 2S of all
subsets of S. The smallest σ-algebra is the trivial σ-algebra which has only ∅
and S as its elements.

The remainder of this section shows the typical procedure by which most
σ-algebras are found. If one has several different σ-algebras over the same set
1 Ac denotes the complement of A in S.

5
6 Measure Theory (October 20, 2008)

S one can consider their intersection, i.e. the system of subsets of S which are
members of each of the given σ-algebras.

Theorem 1.1 LetT(Fi )i∈I be an arbitrary family of σ-algebras over S. Then


their intersection i∈I Fi is again a σ-algebra over S.

Proof. We should verify that the intersection has the three properties a)-c)
in Definition 1.1. We restrict
T ourselves to condition c). Let A1 , A2 , · · · be a
sequence of elements of S i∈I F i . Thus An ∈ Fi for all n ∈ N and all i ∈ I. For
fixed i ∈ I it follows
S that T n ∈ Fi since Fi is a σ-algebra. This applies to
n∈N A
each i, and hence n∈N An ∈ i∈I Fi .

As a consequence for every system E of subsets of S there is a smallest


σ-algebra which contains E.

Definition 1.2 Let E be any system of subsets of S. The intersection of the


family of all σ-algebras which contain E is called the σ-algebra generated by
E. It will be denoted by σ(E) and E is called a generating system (Erzeu-
gendensystem) of this σ-algebra.

Remarks 1. Thus a set A belongs to σ(E) if and only every σ-algebra which
contains all elements of E must also contain A. This concept is similar to the
concept of the linear hull of a set E of vectors in a vector space V , which is the
smallest vector space containing all vectors in E or, the intersection of all linear
subspaces of V which contain E as a subset. But while the elements of the linear
hull can be described explicitly as the linear combinations of the elements of E,
an analogous explicit representation of the elements of σ(E) by the elements of
E is not possible in general.

2. The definition shows: If two systems of sets E1 , E2 satisfy E1 ⊂ σ(E2 ) then


we have even σ(E1 ) ⊂ σ(E2 ), because the σ-algebra σ(E2 ) is by assumption one
of the σ-algebras whose intersection is σ(E1 ). In particular the larger a system
of sets, the larger the σ-algebra it generates.

Example 1.1 Let Q be the system of all compact boxes Q = [r1 , s1 ] × · · · ×


[rm , sm ] in Rm . Then every open set U ⊂ Rm is an element of the σ-algebra
σ(Q).

Proof. Each point x ∈ U is in a cube of the form Q = [r1 , s1 ]×· · ·×[rm , sm ]


with rational numbers ri , si , such that even the whole cube Q is a subset of U .
The set of all cubes with rational sides which are contained inSU is countable,

hence there is a sequence (Qn )n of compact cubes with U = n=1 Qn . Since
Qn ∈ Q ⊂ σ(Q) and σ(Q) is a σ-Algebra, we get U ∈ σ(Q).

Similarly we can consider the system R0 of (half open) rectangular sets


R = (a1 , b1 ] × · · · (ad , bd ] in Rd . Then each R is the countable union of the
compact boxes [a1 + n1 , b1 ] × · · · [ad + n1 , bd ] and conversely each compact box
[a1 , b1 ] × · · · [ad , bd ] is the countable intersection of the half-open sets R = (a1 −
1 1 d
n , b1 ] × · · · (ad − n , bd ] in R . This implies that the two systems Q and R0
generate the same σ-algebra. There are many other generating systems of that
σ-algebra.
Measure Theory (October 20, 2008) 7

Theorem 1.2 The following four systems of sets in Rd ,


the system Q of all compact boxes,
the system R0 of all half-open rectangular sets,
the system K of all compact subsets
the system O of all open subsets
the system A of all closed subsets
all generate the same σ-algebra, i.e. σ(Q) = σ(R0 ) = σ(K) = σ(O) = σ(A).

Proof. The inclusion Q ⊂ K ⊂ A implies σ(Q) ⊂ σ(K) ⊂ σ(A). Since every


closed set, being the complement of an open set, is in σ(O), we have A ⊂ σ(O)
and the example above shows O ⊂ σ(Q). These two observations yield the
remaining inclusions σ(A) ⊂ σ(O) ⊂ σ(Q).

Definition 1.3 Let (S, d) be a metric space. Let O(S) be the system of all open
subsets of S. The σ-algebra generated by O(S) is called the Borel σ-algebra
of S and is denoted by B(S). Its elements are called the Borel sets of S.

So starting from open or closed sets one can form at liberty countable unions,
intersections and complements without ever leaving the realm of Borel sets.

1.2 Measures
Definition 1.4 Let S be a set and let F be a σ-algebra on S. A map µ : F →
[0, ∞] is called a measure (Maß), if µ(∅) = 0 and if µ is σ-additive, i.e. for
every sequence of pairwise disjoint sets Fn ∈ F one has
Ã∞ ! ∞
[ X
µ Fn = µ(Fn ). (1.1)
n=1 n=1

The simplest example of a measure is the counting measure (Zählmaß).


It assigns to each finite set the cardinality of that set and to infinite sets the value
∞. Another example is a point mass or Dirac measure εx which is defined
by εx (A) := 1A (x). Here 1A is the indicator function of A, i.e. εx (A) = 1 if
x ∈ A and εx (A) = 0 otherwise. There are many more interesting measures, the
most important one being Lebesgue measure: The d-dimensional Lebesgue-
measure is a measure on a suitable σ-algebra over Rd which coincides with the
usual geometric volume for simple sets, e.g. for all boxes. But this fact that
usual volume really defines a measure, requires some work. This will be done
in the next chapter, in a more general setting.

Definition 1.5 If S is a set and F is a σ-algebra on S, then the pair (S, F) is


called a measurable space (meßbarer Raum). If moreover µ is a measure
on F, the triplet (S, F, µ) is a measure space (Maßraum). The elements of
F are called measurable sets (meßbare Mengen).

Let us collect some simple properties of measures.

Remarks 1. A measure is also finitely additive (endlich additiv): For


disjoint measurable sets A and B we have

µ(A ∪ B) = µ(A) + µ(B),


8 Measure Theory (October 20, 2008)

just take F1 = A, F2 = B and Fn = ∅ for n ≥ 3 in (1.1).


2. This implies that a measure is monotone, in fact if A ⊂ B then

µ(B) = µ(A) + µ(B \ A) ≥ µ(A)

and in particular
µ(B \ A) = µ(B) − µ(A).
3. A measure is sub-additive: For any two not necessarily disjoint sets
A, B in F one has
µ(A ∪ B) ≤ µ(A) + µ(B),
because µ(A ∪ B) = µ(A) + µ(B\A) ≤ µ(A) + µ(B).
Notation. We write An ↑ A if the sequence (An ) of sets is monotonically
increasing with union A; The notation An ↓ A is used similarly.

Theorem 1.3 a) A measure is σ-continuous from below: For An ∈ F the


condition An ↑ A implies µ(An ) → µ(A).
b) A measure is σ-continuous from above: The conditions Bn ↓ B and µ(B1 ) <
∞ imply µ(Bn ) → µ(B).
c) A measure is σ-sub-additive: For any sequence of (not necessarily disjoint)
sets An ∈ F Ã∞ !
[ X∞
µ An ≤ µ(An ).
n=1 n=1

Proof. a) Suppose An ↑ A. Consider the


Sn sets F1 = A1 and Fn = An \An−1
for n ≥ 2. The Fn are
Pn disjoint and A n = k=1 Fk for every n ∈ N, so by finite
additivity µ(An ) = k=1 µ(Fk ). Then
Ã∞ ! Ã∞ ! ∞
[ [ X
µ(A) = µ An = µ Fk = µ(Fk )
n=1 k=1 k=1
Xn
= lim µ(Fk ) = lim µ(An ).
n→∞ n→∞
k=1

b) Suppose Bn ↓ B and µ(B1 ) < ∞. Choose An = B1 \Bn and A = B1 \B.


Then An ↑ A and in B1 = An ∪ Bn = A ∪ B both unions are disjoint. Thus
µ(A) + µ(B) = µ(An ) + µ(Bn ) for each n. According to a) µ(A) = limn µ(An ).
Because of µ(B1 ) < ∞ and monotonicity all numbers are finite and we can
conclude µ(B) = limn µ(Bn ).
SN PN
c) Put CN = n=1 An . Sub-additivity gives µ(CN ) ≤ n=1 µ(An ). Part a),
applied to the CN , allows to pass to the limit as N → ∞.

1.3 Null-Sets
Null-sets are very useful because for most purposes one can ignore everything
that happens on them.

Definition 1.6 We call a set A ⊂ S null-set (Nullmenge) or, more precisely,


a µ-null-set if there is some N ∈ F with µ(N ) = 0 and A ⊂ N .
Measure Theory (October 20, 2008) 9

By σ-sub-additivity the union of countably many µ-null sets is again a µ-null


set.
Recall the operation of symmetric difference between two sets which is
defined by
A M B = (A \ B) ∪ (B \ A). (1.2)
It can also be written as

A M B = (A ∪ B) \ (A ∩ B). (1.3)

The symmetric difference consists of precisely those points at which the two
sets differ, i.e. which are in one of the two sets but not in the other. If the
symmetric difference A M B of two sets is a µ-null-set then A and B are called
µ-equivalent. If moreover both sets are measurable then µ(A∪B) = µ(A∩B) =
µ(A) = µ(B).

Definition 1.7 A measure µ, and the associated measure space (S, F, µ) are
complete (vollständig), if every µ-null set belongs to the σ-algebra F.

We note in passing that every measure can be completed.

Proposition 1.1 Let (S, F, µ) be a measure space. Let

Fµ = {A0 ⊂ S : A0 is µ − equivalent to some A ∈ F}.

Extend µ to Fµ by letting µ(A0 ) = µ(A) if A and A0 are µ-equivalent. Then


(S, Fµ , µ) is a complete measure space. It is called the completion of (S, F, µ),
and the σ-algebra Fµ is the µ-completion of F.

Proof. The discussion preceding definition 1.7 shows that the extension of
µ to Fµ is unambiguous. Two sets A0 and A ∈ F differ at exactly the same
points at which the complements A0c and Ac differ. This implies that Fµ is
stable under complements. If (A0n ) is a sequence of sets in Fµ then there are
sets An ∈ F such that A0n MSAn ⊂ Nn ∈ F where µ(N S n ) = 0 for each n. All
points at which the setSA0 = n A0n differs
S from A = n An ∈ F are contained
in the µ-null-set N = Nn and thus n A0n ∈ Fµ . So Fµ is a σ-algebra. If the
A0n are disjoint then the sets An ∩ N c = A0n ∩ N c are also disjoint and hence

X ∞
X ∞
X
µ(A0n ) = µ(An ) = µ(An ∩ N c ) = µ(A ∩ N c ) = µ(A) = µ(A0 ).
n=1 n=1 n=1

Thus the extension of µ to Fµ is a measure which clearly is complete.

1.4 An example of a non-measurable set


The concept of a σ-algebra serves primarily for having suitable domains of def-
inition for measures. Why not define a measure µ on all subsets A of the
underlying space? In this section we want to show that this is not possible for
’Lebesgue measure’ on the real line, more precisely for a finite measure on an
interval with the very natural additional condition of translation-invariance.
10 Measure Theory (October 20, 2008)

Theorem 1.4 (Vitali set) There is a set E ⊂ [0, 1] with the following prop-
erty: Let λ be a measure on a σ-algebra L over [0, 1] such that λ([0, 1]) = 1 and
for all F ∈ L we have F + x mod 1 ∈ L for x ∈ [0, 1] and

λ(F + x mod 1) = λ(F ) (1.4)


2
Then E ∈
/ L.

Proof. Call two numbers x, y ∈ [0, 1] similar if they differ by a rational


number. Clearly this is an equivalence relation and thus we get a partition of
the interval into disjoint equivalence classes. Now let E be a set which contains
exactly one element out of each equivalence class. In order to see that such a
set cannot be in L denote for all rational numbers r with 0 ≤ r < 1 by Er the
set E + r mod 1. Then the interval [0, 1] is the (countable) disjoint union of
the sets Er , and Er ∈ L provided that E ∈ L.
Assuming E ∈ L, we would get from translation-invariance
X X µ[ ¶
λ(E) = λ(Er ) = λ Er = λ([0, 1]) = 1.
r r r

The right-hand side is positive and finite. But on the left we have added in-
finitely many times the same value λ(E). Both for λ(E) = 0 and for λ(E) > 0
we get a contradiction.

Remark 1.1 In the above situation every L-measurable subset F of E must be


a null set.

Proof. The sets Fr = F + r mod 1 are disjoint and therefore as above


X X µ[ ¶
λ(F ) = λ(Fr ) = λ Fr ≤ λ([0, 1]) = 1
r r r

and hence λ(F ) = 0.

2 i.e. E is not ’Lebesgue-measurable’


Chapter 2

Construction of measures

Even though it is clear how to measure the size of simple sets by volume or
similar finitely additive functionals, it is not obvious that these definitions can
be extended to a measure on a whole σ-algebra of sets. The main result in this
chapter is Carathéodory’s extension theorem (theorem 2.2). As a special case
this contains the construction of Lebesgue measure in Euclidean space.
There are several approaches to this theorem. Our proof is based on an
approximation argument. It can easily be reframed to work also in the functional
analytic context.1

2.1 Set algebras and contents


The challenge in the construction of measures is their continuity when passing
to countable unions. We consider first preliminary forms of the concepts of
chapter 1 in which only finite set operations are allowed.

Definition 2.1 Let S be a set. Let A be a collection of subsets of S. Then A


is called a (set) algebra ((Mengen-)Algebra) on S, if the following hold:
a) S ∈ A,
b) A ∈ A implies Ac ∈ A.
c) A is ∪-stable: A, B ∈ A imply A ∪ B ∈ A.

Remark: Like in the case of σ-algebras these conditions automatically imply


that ∅ ∈ A and A is also ∩-stable. Every countable union of elements of A can
be converted to a countable disjoint union of elements of A via the formula

[ ∞
[ n−1
[
An = A1 ∪ (An ∩ ( Am )c ).
n=1 n=2 m=1

This shows that an algebra is a σ-algebra if and only if it is closed under forming
countable disjoint sequences.

1 Carathéodory’s original proof is useful in the construction of Hausdorff measures in the

theory of fractals, cf. e.g. the script [?] ’Fractal sets and Preparation to Geometric Measure
Theory’ (http://www.mathematik.uni-kl.de/ wwwstoch/2002w/skriptrev.pdf). A proof via
approximation from inside is also possible, see the script [?]

11
12 Measure Theory (October 20, 2008)

Example 2.1 Let A be the set algebra on R which is generated by all intervals
which are closed on the right and open on the left. Then A consists precisely of
all sets of the form
[n
A= Ji (2.1)
i=1

where each Ji is an interval of one of the three types (−∞, b], (a, b] or (a, ∞).
It is called the interval algebra over R.

Sometimes it is convenient not to require that the underlying set S itself


belong to the system of sets. The ’local’ version of a set algebra is called a
’ring’:

Definition 2.2 Let S be a set and let R be a non-empty system of subsets of


S. Then R is called a ring ((Mengen-)Ring) on S, if the following hold:
b0 ) For A, B ∈ R the difference set A \ B = A ∩ B c also belongs to R.
c0 ) For A, B ∈ A the union A ∪ B is in R.

A ring R is also stable under the operation M (cf. 1.2). Moreover (1.3) shows
that A ∩ B = (A ∪ B) \ (A M B) and hence a ring is stable under intersections. If
one takes M as addition and ∩ as multiplication then a ring in our sense becomes
a ’commutative ring’ in the algebraic sense.2

Example 2.2 An example of a ring is the interval ring R of all bounded sets
in the interval algebra A of Example 2.1.

Obviously a ring is a set algebra if and only if it has the set S as an element.

Definition 2.3 Let S be a set and let R be a ring of subsets of S. A map


m : A → [0, ∞] is a content (Inhalt), if m(∅) = 0 and m is additive, i.e.

m(A ∪ B) = m(A) + m(B) (2.2)

holds for any pair of disjoint sets A, B in A.

Remarks: 1. In analogy to measures contents are monotone and finitely


sub-additive.
2. For any two elements A, B of the ring R the number m(A M B) quantifies
how much the two sets differ. In the proof of the extension theorem 2.2 we shall
make use of this idea.
3. The formula (2.2) has a natural extension for not necessarily disjoint sets:

m(A ∪ B) + m(A ∩ B) = m(A) + m(B). (2.3)


In fact both sides are equal to m(A) + m(B \ A) + m(A ∩ B). A set function
with (2.3) is called modular.
2 Here is a short way to see this: identify a set A with its indicator function 1
A and note
that 1A∩B = 1A 1B and 1AMB ≡ 1A + 1B (mod 2). A set algebra becomes a ’commutative
ring with 1’ (the set S is the unit). In older literature set algebras are called ’fields’ (’Mengen-
Körper’). The associated algebraic structures are called Boolean rings resp. algebras.
Measure Theory (October 20, 2008) 13

Example 2.3 Let A (resp. R) be the above algebra (resp. ring) of unions of
intervals. One can assume that in the representation (2.1) the intervals Ji are
disjoint. If F : R → R is a non-decreasing function then we get a content mF
on A resp. R by the formula
n
[ n
X
mF ( Ji ) = F (bi ) − F (ai ), (2.4)
i=1 i=1

where for unbounded intervals we use the limit values F (−∞) = limx↓−∞ F (x)
and F (∞) = limx↑∞ F (x) in the obvious way. It is called Stieltjes content.
For F (x) = x we get the usual geometric size of a set based on the length of
intervals.
Clearly if a content can be extended to a measure it must be σ-additive for
all sets for which it is defined.
Definition 2.4 A content m on a ring R is called σ-additive if for every
disjoint sequence (An ) in R whose union is in R, one has

[ ∞
X
m( An ) = m(An ). (2.5)
n=1 n=1

So a measure is simply a σ-additive content on a σ-algebra. In view of the


extension theorem (theorem 2.2) below, sometimes a σ-additive content on a
ring is also called a pre-measure (Prämaß).
In most cases the actual verification of the σ-additivity implicitly depends
on a compactness argument like in part b) of the following criterion.
Theorem 2.1 A content m on a ring R of subsets of S is σ-additive if one of
the following conditions hold:
a) m takes only finite values and is continuous at ∅, i.e. if Rn ∈ R, and Rn ↓ ∅
then m(Rn ) ↓ 0.
b) S is a topological (e.g. metric) space and for each R ∈ R and every ε > 0
there are two sets B, C with B ⊂ C ⊂ R such that B ∈ R, C is compact and
m(R) ≤ m(B) + ε < ∞.
Proof. a) Let S
a disjoint sequence (An ) in R be given whose union A is in
n
R. Let Rn = A \ ( k=1 Ak ). Then Rn ↓ ∅ and hence
n
X n
[
m(A) − m(Ak ) = m(A) − m( Ak ) = m(Rn )
k=1 k=1

converges to 0 as n → ∞. This implies the σ-additivity (2.5).


b) Under the given assumptions we verify condition a). Let (Rn ) be a se-
quence of elements of R with Rn ↓ ∅. Let ε > 0. For each n choose the sets
Bn ∈ R, T Cn compact with Bn ⊂ Cn ⊂ Rn and m(Rn ) ≤ m(Bn ) + ε2−n .

Then also n=1 Cn = ∅ and by compactness there is already some N ∈ N with
TN TN
n=1 Cn = ∅. In particular n=1 Bn = ∅. Thus

N
[ N
[
RN = RN ∩ Bnc ⊂ Rn \ Bn
n=1 n=1
14 Measure Theory (October 20, 2008)

and hence for all k ≥ N


N
X N
X
m(Rk ) ≤ m(RN ) ≤ m(Rn ) − m(Bn ) ≤ ε2−n < ε.
n=1 n=1

This implies m(Rn ) → 0 as required.

Corollary 2.1 Let F : R → R be nondecreasing and right-continuous. Then


the associated Stieltjes content mF on the interval ring R is σ-additive.

Proof. We check condition b) in theorem 2.1. Let R ∈ R and ε > 0 be


given. First let R = (a, b]. By right-continuity of F we can choose points c, d
with a < c < d < b and F (a) ≤ F (d) < F (a) + ε. Let B = (d, b] and C = [c, b].
Then B ⊂ C ⊂ R, B ∈ R and C is compact. Moreover

mF (B) = mF ((d, b]) = F (b) − F (d) > F (b) − F (a) − ε = mF (R) − ε.

The additivity of mF allows to extend this argument to finite unions of intervals,


i.e. to all elements of R.

In an analogous way one easily verifies the σ-additivity of the n-dimensional


volume function on the ring of finite unions of rectangles of the form

R = (a1 , b1 ] × · · · (an , bn ].

Definition 2.5 A content m on a ring R over the set S is called σ-finite, if S


is the union of an increasing sequence (Ek ) of elements of R with m(Ek ) < ∞.

This is a very mild condition. For example the volumeSfunction on the ring
generated by rectangular sets in Rd is σ-finite since Rd = k (−k, k]d . We need
this condition in order to ensure uniqueness in the following theorem which is
the central result in this chapter.

Theorem 2.2 Let m be a σ-additive and σ-finite content on the ring R. Then
there is a unique extension of m to a measure on σ(R).

The remainder of this chapter is devoted to the proof of this theorem and
some additional comments. In a first round the reader may pass directly to the
next chapter.

2.2 Outer measures


We saw that typically measures cannot be defined on all subsets of the under-
lying space. If we relax the condition of σ-additivity, this difficulty disappears.

Definition 2.6 Let S be a set. A map µ∗ from the power set 2S to [0, ∞]
is called an outer measure (äußeres Maß) on S, if it has the following
properties:
1. µ∗ (∅) = 0,
2. A ⊂ B implies µ∗ (A) ≤ µ∗ (B) (monotonicity),
Measure Theory (October 20, 2008) 15

S∞ P∞
3. µ∗ ( n=1 En ) ≤ n=1 µ∗ (En ) (σ-sub-additivity).

Outer measures often are generated as follows.

Proposition 2.1 Let R be any system of subsets of S which contains ∅, and


let m : R → [0, ∞] be a set function with m(∅) = 0. Let

X [

µ (A) := inf{ m(Ri ) : Ri ∈ R, A ⊂ Ri } (2.6)
i=1 i

where µ∗ (A) = ∞ if there is no covering of A by a sequence of elements of R.


Then µ∗ is an outer measure and µ∗ (R) ≤ m(R) for all R ∈ R.

Proof. Obviously µ∗ (∅) = 0. In order to check monotonicity let A ⊂ B.


Then in (2.6) for A the infimum is taken over a larger class of numbers than
for B. Thus µ∗ (A) ≤ µ∗ (B). For σ-sub-additivity let now (En ) be a sequence
of sets and let εS> 0. For each n we choose a sequence (Rin ) of elements of G

satisfying En ⊂ i=1 Rin and

X
m(Rin ) ≤ µ∗ (En ) + ε2−n .
i=1
S∞ S∞
Then the set n=1 En is contained in the set i,n=1 Rin and hence

[ ∞
X ∞
X
µ∗ ( En ) ≤ m(Rin ) ≤ µ∗ (En ) + ε.
n=1 i,n=1 n=1

Thus all properties of an outer measure are verified. The final inequality in the
assertion is clear since every R ∈ R forms an admissible covering of itself on the
right hand side of (2.6).

In the special case of a σ-additive content on a ring R the restriction of the


outer measure µ∗ defined in (2.6) to R actually coincides with m. This will be
important below.

Lemma 2.1 Let R be a ring and let m be a σ-additive content. Then

µ∗|R = m.
S
Proof.
S Let A ∈ R be given. Suppose A ⊂ i Ri and Ri ∈ R. Then
A = i A ∩ Ri . As m is σ-sub-additive and monotone we get

X ∞
X
m(A) ≤ m(A ∩ Ri ) ≤ m(Ri ).
i=1 i=1

Taking the infimum over these coverings gives m(A) ≤ µ∗ (A). The converse
inequality was part of the proposition.

Finally every outer measure induces a distance (Abstand) between subsets


of S.
16 Measure Theory (October 20, 2008)

Lemma 2.2 Let µ∗ be an outer measure on S and define for A, B ⊂ S the


(possibly infinite) ’distance’ d(A, B) by

d(A, B) = µ∗ (A M B). (2.7)

Then the triangular inequality

d(A, C) ≤ d(A, B) + d(B, C). (2.8)

holds. Moreover the function µ∗ and the set operations ∪, ∩, \ are continuous
for d: If both d(An , A) → 0 and d(Bn , B) → 0 then µ∗ (An ) → µ∗ (A), d(An ∪
Bn , A ∪ B) → 0, and similarly for ∩ and \.

Proof. Obviously for three sets A, B, C one has A M C ⊂ (A M B) ∪ (B M


C). Since µ∗ is monotone and sub-additive we get the triangular inequality.
The triangular inequality yields in particular the inequality in

|µ∗ (A) − µ∗ (B)| = |d(∅, A) − d(∅, B)| ≤ d(A, B) = µ∗ (A M B) (2.9)

which in turn implies the continuity of µ∗ . Let now ◦ denote one of the above
set operations. Let A, B and A0 , B 0 be two pairs of sets. If a point x belongs to
A ◦ B but not to A0 ◦ B 0 or conversely, then x is at least in one of the symmetric
difference sets A M A0 or B M B 0 . More formally

(A ◦ B) M (A0 ◦ B 0 ) ⊂ (A M A0 ) ∪ (B M B 0 ).

Then again the fact that µ∗ is monotone and sub-additive implies

d(A ◦ B, A0 ◦ B 0 ) ≤ d(A, A0 ) + d(B, B 0 ).

This proves the continuity.

2.3 Proof of the Measure Extension Theorem


Actually the proof of theorem 2.2 yields more detailed information which some-
times is useful. The existence proof does not need σ-finiteness. It yields a
complete measure with a useful approximation property. The uniqueness part
of theorem 2.2 will be shown in remark 2.2 below. A much stronger uniqueness
result is established in the next (optional) section.

Theorem 2.3 Let m be a σ-additive content on the ring R. Let µ∗ be the


associated outer measure given by proposition 2.1. Let N = {N ⊂ S : µ∗ (N ) =
0} be the class of µ∗ -null-sets. Let M = σ(R ∪ N ) and let µ = µ∗|M be the
restriction of µ∗ to the σ-algebra M. Then
a) µ is a complete measure which extends m i.e. µ|R = m.
b) For every A ∈ M with µ(A) < ∞ there is a sequence (An ) in R with
µ(A M An ) → 0.

Proof. I 1. Let d denote the ’distance’ on the power set of S which is


associated to µ∗ , cf. (2.7). Let Rf = {R ∈ R : m(R) < ∞} and let Mf
be the system of all sets A ⊂ S which are in the closure of Rf with respect
to d, i.e. A ∈ Mf iff there is a sequence (Rn ) in Rf with d(Rn , A) → 0; it
Measure Theory (October 20, 2008) 17

is then clear that d is finite on Mf . Obviously Rf ⊂ Mf and by lemma 2.2


µ∗ (A) = lim m(Rn ) < ∞ for each A ∈ Mf . Moreover N ⊂ Mf since every µ∗
null-set N satisfies d(∅, N ) = µ∗ (N ) = 0.
2. Clearly Rf is a ring and as a consequence of lemma 2.2 Mf is a ring as
well: If two sets A and B can be approximated by elements of the ring Rf , then
the same is true for A ∪ B and A \ B.
3. µ∗|Mf is a content: Let A, B ∈ Mf such that A ∩ B = ∅. Choose in R
approximating sequences (An ) for A and (Bn ) for B. Then by lemmas 2.1 and
2.2 and the modularity (2.3) of m

µ∗ (A ∪ B) = lim m(An ∪ Bn )
= lim m(An ) + lim m(Bn ) − lim m(An ∩ Bn )
= lim µ∗ (An ) + lim µ∗ (Bn ) − lim µ∗ (An ∩ Bn )
= µ∗ (A) + µ∗ (B) + µ∗ (∅) = µ∗ (A) + µ∗ (B).

4. We claim that µ∗|Mf is σ-additive, and moreover for every disjoint se-
S
quence
P∞ (An ) in Mf the union A = An also belongs to Mf if and only if
∗ ∗
n=1 µ (An ) < ∞. Clearly if A ∈ Mf (or more generally if µ (A) < ∞) then
by step 2 the partial sums of the series are all dominated by µ∗ (A) and the
series converges. Conversely, assume that the series converges. Then
N
[ N
[ ∞
[ ∞
X
d(A, An ) = µ∗ (A M An ) = µ∗ ( An ) ≤ µ∗ (An ) → 0
n=1 n=1 n=N +1 n=N +1

as N → ∞. Therefore A can be approximated by elements of Mf . By the


triangular inequality and the definition of Mf the set A then can also be ap-
proximated by elements of Rf and as a result we get A ∈ Mf . Moreover
N
[ N
X ∞
X
µ∗ (A) = lim µ∗ ( An ) = lim µ∗ (An ) = µ∗ (An ).
N →∞ N →∞
n=1 n=1 n=1

6. We note in passing that in the case where R is an algebra and m(S) < ∞
3
the proof is now complete: The class Mf then is an algebra according to step
2. By step 3, the series in the statement of step 4 always are bounded and thus
Mf is closed under countable disjoint unions. Hence Mf is a σ-algebra which
contains both R = Rf and N , i.e. σ(R ∪ N ) ⊂ Mf . Further µ∗ is σ-additive
on Mf , in particular it induces a measure extension of m on σ(R) which is
complete because N is closed under taking subsets.
II. 7. The remainder of the proof is devoted to the extension to the un-
bounded case. Consider the system of sets

M̄ = {A ⊂ S : A ∩ B ∈ Mf f or all B ∈ Mf }. (2.10)

Then M̄ contains R. In fact let R ∈ R and let B ∈ Mf . By definition of


Mf there is an approximating sequence (Rn ) in Rf for B. Then (R ∩ Rn ) is
an approximating sequence for R ∩ B, i.e. R ∩ B ∈ Mf . Therefore R ∈ M̄.
Moreover N ⊂ M̄ since B ∩ N is again in N and hence in Mf for all B in Mf .
3 i.e. for most applications in probability
18 Measure Theory (October 20, 2008)

8. We claim that M̄ is a σ-algebra. Clearly S ∈ M̄. Also M̄ is a ring since


Mf is a ring. So M̄ is S
an algebra. Let now a disjoint sequence (An ) in M̄ be
given with union A = An . For every B ∈ Mf the additivity of µ∗ on Mf
(step 3) gives

X N
X N
[
µ∗ (An ∩ B) = lim µ∗ (An ∩ B) = lim µ∗ ( An ∩ B) ≤ µ∗ (B) < ∞
N N
n=1 n=1 n=1
S
and thus by step 4 A ∩ B = n An ∩ B ∈ Mf . This holds for every B ∈ Mf
and so A ∈ M̄. As an algebra which is closed under countable disjoint unions
M̄ is a σ-algebra.
9. We claim that every set A ∈ M̄ with µ∗ (A) < ∞S is in Mf . By
P definition
of µ∗ there is a sequence (Bi ) in R Ssuch that A ⊂ i Bi and i m(Bi ) ≤
µ∗ (A) + 1 < ∞. Replacing Bi by Bi \ j<i Bj if necessary weP may assume that

the Bi are disjoint. Then again the partial sums
S of the series i µ (A ∩ Bi ) are
bounded and hence step 4 implies that A = i A ∩ Bi ∈ Mf .
10. The restriction µ∗|M̄ is a measure. Let (An ) be as in 8. In the proof of
P∞
µ∗ (A) = n=1 µ∗ (An ) we only need to check ≥ since µ∗ is σ-sub-additive. Here
we may assume µ∗ (A) < ∞. According to step 9 A ∈ Mf and step 4 gives the
assertion.
11. Steps 7 and 8 imply that σ(R ∪ N ) ⊂ M̄. Step 10 thus shows that
the restriction µ of µ∗ to σ(R ∪ N ) is a measure which extends m according to
lemma 2.1. It is complete since N is contained in the σ-algebra which is the
domain of µ. Step 9 implies that every element of σ(R ∪ N ) of finite measure is
actually in Mf and hence it can be approximated in the sense of assertion b).
This completes the proof of the theorem.

In general measure extensions are not unique. We shall see an example in


chapter 5. However it is easy to see that any measure extension is dominated
by the outer measure µ∗ .
Remark 2.1 If ν is a measure extension of the content m to some σ-algebra A
then ν(A) ≤ µ∗ (A) for all A ∈ A.
PChoose any
Pcovering of A by a sequence (Ri )
of elements of R. Then ν(A) ≤ i ν(Ri ) = i m(Ri ). Passing to the infimum
of such sums we get the assertion.
Remark 2.2 Let µ be the extension of theorem 2.3. Then at least on {A ∈
σ(R ∪ N ) : µ(A) < ∞} all measure extensions of the content m agree with µ:
Let ν be such a measure extension and let A be in this class. According to step
9 of the above proof A belongs to Mf , i.e. there is an approximating sequence
(Rn ) in R with µ(Rn M A) → 0. Then the previous remark implies that a
fortiori ν(Rn M A) → 0 and thus by lemma 2.1 and lemma 2.2 applied both to
ν and µ∗ we get
ν(A) = lim ν(Rn ) = lim m(Rn ) = µ∗ (A).
n n

We can now give the uniqueness argument for theorem 2.2: Let m be σ-finite
and let (Ek ) be an increasing sequence in R of finite content whose union is the
whole set S. Let A ∈ σ(R). Then
ν(A) = lim ν(A ∩ Ek ) = lim µ(A ∩ Ek ) = µ(A).
k k
Measure Theory (October 20, 2008) 19

Example 2.4 (Lebesgue measure) Let m be the geometric volume function on


the ring R of finite unions of (half open) rectangular sets in Rd . The measure
extension constructed in theorem 2.3 is called the d-dimensional Lebesgue
measure. We shall denote it by λd . The σ-algebra σ(R) is the Borel-σ-algebra
B(Rd ) according to section 1.1. The class N is the class of Lebesgue null-
sets. By definition a set N ⊂ Rd belongs S to N iffP for each k there are countably
k k ∞ k 1
many
T∞ S∞ krectangles R i such that N ⊂ R
i i and i vol(Ri ) < k . Since N ⊂
d
k=1 i=1 Ri ∈ σ(R) = B(R ) the class N consists precisely of the null-sets in
the sense of definition 1.6 for the measure λd|B(Rd ) . The σ-algebra L = σ(R ∪ N )
is then easily seen to be the completion of the Borel sets for Lebesgue measure in
the sense of proposition 1.1. It is called the σ-algebra of Lebesgue measurable
sets. Every Lebesgue measurable set differs only by a Lebesgue null-set from a
Borel set.

Example 2.5 (Counting measure) Let R be the family of all finite subsets of
a set S. Let m assign to each finite set its cardinality. The corresponding outer
measure assigns to each infinite set the value +∞. The empty set ∅ is the only
null-set and σ(R) = σ(R ∪ N ) consists of all countable subsets of S and their
complements. Let us look into the proof of theorem 2.3 in this case. The system
Mf defined there agrees with the system of finite sets whereas the σ-algebra
M̄ defined in (2.10) coincides with the full power set. We see that the proof
actually sometimes produces a measure extension on a larger σ-algebra than
just on σ(R ∪ N ).

2.4 Monotone classes and more on uniqueness


This section can be read independently from the rest of this chapter. The
monotone class theorem below sometimes is quite convenient in order to extend
a statement from a certain class of sets to the σ-algebra generated by this class.
We shall use it e.g. in the proof of Fubini’s theorem (theorem 5.1).4
As a consequence of theorem 2.2 two measures on some σ-algebra F agree if
they agree on a ring R which generates F and on which they are σ-finite. Even
for finite measures one might ask: Is a finite measure on a σ-algebra F already
uniquely determined by its values on an arbitrary generating system E of F?

Example 2.6 Here is a counter-example even if the total mass of the measures
is fixed. Let S = {1, 2, 3, 4} and let the system E consist of the two sets {1, 2}
and {2, 3}. Then all four singletons {1}, {2}, {3}, {4} can easily be produced
via algebra operations from these two sets, and hence σ(E) is the power set of
S. The two weight assignments {1/4, 1/4, 1/4, 1/4} and {1/2, 0, 1/2, 0} clearly
induce two different measures µ and ν on the power set S with µ(S) = ν(S) = 1
and µ(A) = ν(A) = 1/2 for both sets A ∈ E.

We shall see in a minute that the essential requirement is that E be stable


under intersection. The proof uses the following concept.

Definition 2.7 A monotone class (monotone Klasse) over the set S is a


system D of subsets of S with the following three properties:
4 In chapter 5 we also sketch a way how to avoid theorem 2.4.
20 Measure Theory (October 20, 2008)

a) S ∈ D.
b) If A, B ∈ D and A ⊂ B then B \ A ∈ D.
c) If
S∞(An ) is an increasing sequence of elements of D, then their union
n=1 An belongs D.

Remark. Monotone classes and σ-algebras are built in a similar fashion. In


particular again the intersection of an arbitrary number of monotone classes is
again a monotone class. The essential difference to σ-algebras is the fact that in
c) the sets are assumed to be increasing. This makes this condition much easier
to verify. On the other hand we have

Theorem 2.4 (Monotone class theorem) Let E be a class of subsets of S which


is stable under (finite) intersections. If a monotone class D contains all elements
of E then it even contains all elements of σ(E).

Proof. Let D∗ be the intersection of all monotone classes which contain E,


i.e. D∗ is the monotone class generated by E. Then D∗ ⊂ D and we only need
to show that σ(E) ⊂ D∗ . Let us proceed step by step.
1. We claim: for every set E ∈ E and every A ∈ D∗ we have also A∩E ∈ D∗ :
Consider the class DE of all sets A satisfying A ∩ E ∈ D∗ . Then DE is itself
a monotone class because D∗ is one. Moreover DE contains all elements of E
since E is stable under intersections. By definition of D∗ we conclude D∗ ⊂ DE
and from this the assertion.
2. We claim: D∗ itself is stable under intersections: Let A, B ∈ D∗ . The
class DA of all B with A ∩ B ∈ D∗ is a monotone class which according to step
1 contains the system E. Again by definition of D∗ this shows D∗ ⊂ DA and
the assertion.
3. It remains to show that every monotone class D∗ which is stable under
intersections is a σ-algebra, because then D∗ is even a σ-algebra which contains
E and hence σ(E). Properties a) and b) imply that D∗ is stable under taking
complements. Then D∗ is not only stable under finite intersections but also
under finite unions. But together with property c) we conclude that D∗ is
stable under arbitrary countable unions. Thus D∗ is indeed a σ-algebra.

Theorem 2.5 (Uniqueness criterion for measures) Let the system E be stable
under intersections and generate the σ-algebra F. Let µ and ν be two measures
on F which agree on E. If the set S is the union of an increasing sequence of
sets Ek ∈ E with µ(Ek ) = ν(Ek ) < ∞ then µ = ν.

Proof. First of all it is easy to verify that for each k the system of sets
Dk = {A ∈ F : µ(A ∩ Ek ) = ν(A ∩ Ek ) is a monotone class which contains E.
Thus by the monotone class theorem µ(A ∩ Ek ) = ν(A ∩ Ek ) for all k and all
A ∈ F. This implies the assertion by continuity from below of both measures.

A typical application is

Corollary 2.2 A finite measure on the Borel subsets of a metric space is uniquely
determined by its values on open sets.
Chapter 3

Measurable functions

Definition 3.1 Let (S, F) and (Y, G) be measurable spaces. A map f : S −→ Y


is called F-G-measurable (meßbar) (or just: measurable), if

f −1 (G) ∈ F f or all G ∈ G.

In the special case when Y is a metric space a map f : S −→ Y is called F-


measurable or measurable, if f is F-B(Y )-measurable.

For functions with one-dimensional range often the values ±∞ occur in limit
procedures. Therefore it is convenient to consider also extended real valued
functions. The following notation is convenient.
Notation: Let f, g : S −→ [−∞, ∞] be two functions and let a ∈ R. Then
denote by {f ≤ a} the set {x ∈ S|f (x) ≤ a} = f −1 ((−∞; a]), by {f = g} the
set {x ∈ S|f (x) = g(x)} and introduce similarly the notations

{f < a}, {f ≥ a}, {f > a}, {f < g} etc.

Moreover we recall that the extended real line [−∞, ∞] is a metric space
when endowed with the metric
|x − y|
d : [−∞, ∞] × [−∞, ∞] → [0, 1], d(x, y) = .
1 + |x − y|

It is not hard to see that the mapping f : [−∞, ∞] → [−1, 1] given by f (x) =
x
1+|x| is homeomorphic, and hence the above metric defines the usual topology
on [−∞, ∞].
For extended real valued functions measurability has a simpler description.

Definition 3.2 An extended real-valued function f : S −→ [−∞, +∞] is called


F-measurable if {f ≤ α} ∈ F for all α ∈ R (or alternatively {f > α} ∈ F for
all α, etc.)

Remarks: 1. If f takes values in the extended real number system we have


now two definitions of measurability. We shall see below in lemma 3.2 that the
two definitions are compatible.

21
22 Measure Theory (October 20, 2008)

2. The (extended) real-valued measurable functions are - with certain re-


strictions to be discussed later - the natural candidates for the definition of the
integral.
The verification of measurability is often much simplified by the following
criterion.

Lemma 3.1 A map f : S → Y is F-G-measurable if f −1 (E) ∈ F for all E ∈ E


where E is any system of subsets of the space Y such that G = σ(E).

Proof. Consider the following class of subsets of Y

A = {A ⊆ Y |f −1 (A) ∈ F}.

According to our assumption E ⊆ A. Moreover since f −1 (Y ) = S ∈ F we


have Y ∈ A. Let A ∈ A. Then f −1 (Ac ) = S\f −1 (A) = (f −1 (A))c ∈ F, and
therefore Ac ∈ A. Finally let An ∈ A for all n ∈ N. Then
[ [
f −1 ( An ) = f −1 (An ) ∈ F
n∈N n∈N
S
and thus n∈N An ∈ A. So A is a σ-algebra which contains E. This implies that
G = σ(E) ⊆ A, i.e. our assertion.

Lemma 3.2 Let (S, F) be a measurable space. For f : S → [−∞, ∞] the


following are equivalent.
(1) f is measurable (in the sense of Definition 3.1)
(2) {f < a} ∈ F for all a ∈ R
(3) {f ≤ a} ∈ F for all a ∈ R
(4) {f > a} ∈ F for all a ∈ R
(5) {f ≥ a} ∈ F for all a ∈ R.

Proof. We recall that [−∞, ∞] is homeomorphic to [−1, 1]; hence by lemma


3.1 it suffices to show that

B([−1, 1]) = σ(E),

where E denotes the class of intervals of the form [−1, a), [−1, a], (a, 1], and
[a, 1] for a ∈ (−1, 1). The inclusion ⊇ is clear, and the reverse inclusion follows
from the fact that

(a, b] = [−1, b] \ [−1, a] for − 1 ≤ a ≤ b ≤ 1,

and similarly for the remaining systems of intervals.

Since the singleton sets {−∞} and {+∞} are Borel measurable, the previous
criterion admits an obvious analogue for extended real-valued functions.

Theorem 3.1 Let f, g be extended real valued functions. Then the sets {f ≤
g}, {f < g}, {f = g} are in F.
S
Proof. We have {f < g} = r∈Q {f ≤ r} ∩ {g > r} ∈ F, and thus also
{f ≤ g} = {g < f }c ∈ F and {f = g} = {g ≤ f } ∩ {f ≤ g} ∈ F.
Measure Theory (October 20, 2008) 23

Remark 3.1 1. If c ∈ R and f (x) = c for all x ∈ S then


½
∅ ∈ F for a ≤ c
{f < a} =
S ∈ F for a > c

So the constant functions are measurable.


2. If f : Rm −→ Rn is continuous, Then f −1 (U ) is open and hence Borel for all
open sets U . So by lemma 3.1 all continuous functions are measurable.

Lemma 3.3 Let (S, F ), (Y, G), (Z, H) be measurable spaces. Let f : Y → Z, g :
S → Y be measurable maps. Then the composition f ◦ g : S → Z is measurable.

Proof. For H ∈ H we get f −1 (H) ∈ G and (f ◦ g)−1 (H) = g −1 (f −1 (H)) ∈ F.

Lemma 3.4 A vector-valued function f = (f1 , . . . , fn ) : S → Rm is F-measurable


if and only if the components fi : S → R are F-measurable for all i ∈ {1, . . . , m}.

Proof. Let f be measurable. Then for each a ∈ R the set A = [−∞, a] ×


Rm−1 ⊆ Rm is closed, i.e. A ∈ B(Rm ). Thus {f1 ≤ a} = f −1 (A) ∈ F for all
a ∈ R. Therefore f1 is measurable by lemma 3.2. A similar argument works for
the other components.
Conversely assume that fi is measurable for all i. Let a box Q = [a1 , b1 ] × . . . ×
[an , bn ]. be given. Then
n
\
f −1 (Q) = fi−1 ([ai , bi ]) ∈ F
i=1

This holds for all boxes and lemma 3.1 together with theorem 1.2 implies the
measurability of f .

Corollary 3.1 If f1 , f2 : S −→ R are measurable then the same is true for


f1 + f2 , f1 − f2 , f1 · f2 , min(f1 , f2 ) and max(f1 , f2 ).

Proof. The maps 0 +0 ,0 −0 ,0 ·0 , min, max : R2 → R are continuous. By remark


3.12 they are measurable. The map f = (f1 , f2 ) : S −→ R2 is measurable
according to lemma 3.4. Since f1 +f2 = 0 +0 ◦f the function f1 +f2 is measurable
by lemma 3.3. The same arguments work also for difference and multiplication.

Theorem 3.2 Let (S, F) be a measurable space and let {fn }n∈N be a sequence of
measurable extended real valued functions on S. a) Then the (pointwise defined)
functions supn fn , inf n fn , lim supn→∞ fn , lim inf n→∞ fn are measurable.
b) The set F = {x : lim fn (x) exists} is in F and limn 1F fn is measurable.

Proof. For all a ∈ R the set {sup fn ≤ a} = ∩n∈N {fn ≤ a} is in F.


So sup fn is measurable. By symmetry inf fn is measurable. Therefore also the
functions lim supn→∞ fn = inf m (supn≥m fm ) and lim sup fn = supm (inf n≥m fn )
are measurable and the set F = {lim supn fn = lim inf n fn } is in F.

Simple examples of real-valued measurable functions are the indicator func-


tions 1A , A ∈ F. In fact the sets {1A > a} are in F because they are either
empty (if a ≥ 1), equal to A (if 0 ≤ a < 1), or equal to S (if a < 0).
24 Measure Theory (October 20, 2008)

Definition
Pn 3.3 Let (S, F) be a measurable space. A function of the form
f (x) = i=1 αi 1Ai (x) with αi ∈ R, Ai ∈ F is called an elementary func-
tion or measurable step function (meßbare Treppenfunktion).

Remark 3.2 Clearly the measurable step functions form a vector space. A
measurable step function takes only finitely many different values Sβ1 , . . . , βm .
The sets Bj = {f = βj } are disjoint P and measurable and S = Bj . So f
also admits the representation f (x) = βj 1Bj (x), where the sets Bj form a
(disjoint) partition of the set S.

Theorem 3.3 For every measurable function f : S −→ [0, ∞] there is a se-


quence (fn ) of non-negative elementary functions with fn ↑ f, i.e. f1 ≤ f2 ≤ . . .
and f = supn fn .

Proof. Let
n
n2
X −1
fn (x) = k2−n 1{k2−n ≤f <(k+1)2−n } + n1{f ≥n} , (3.1)
k=0

i.e. if f (x) < n, then fn (x) is the largest member of the grid 2−n N0 which is
situated on the left of f (x), and fn (x) = n if f (x) ≥ n. Because these grids
are successively contained in each other we get fn (x) ≤ fn+1 (x). Moreover
fn ≤ f . Finally f (x) − fn (x) ≤ 2−n , as soon as f (x) ≤ 2n · n · 2−n = n and
fn (x) = n ↑ ∞ = f (x) if f (x) = ∞. Therefore (fn ) converges point-wise to
f.

The next result is the only place in this section where a measure appears.

Corollary 3.2 Let µ be a measure on the σ-algebra F. Let f be extended real-


valued Fµ -measurable. Then f is µ-almost everywhere equal to an F-measurable
function g.
Pkn
Proof. Consider elementary functions fn = k=1 βkn 1Bnk with Bnk ∈
Fµ such that fn ↑ f + . Each of the sets Bnk is µ-equivalent to a set Bnk
0

0 +
F. Outside the countably many null-sets Bnk M Bnk f is equal to the F-
measurable function
Xkn
g + = sup βnk 1Bnk
0 .
n
k=0

Similarly f is µ-a.e. equal to an F-measurable function g − . The set N =


{g + = g − = ∞} is a null-set. Let g = g + − g − outside N and g = 0 on N .


Then f = g µ-a.e.

Sometimes functions are not defined at all points of the set S. Then the
following convention is helpful.

Definition 3.4 Let A be a subset of the set S. Let f be an extended real-valued


function which may not be defined outside of A. Then 1A f or f 1A denotes the
function on S which agrees with f on A and vanishes outside of A.
Chapter 4

The Integral

In this chapter (S, F, µ) is a measure space. We define the integral with respect
to µ and study some of its properties.

4.1 The elementary integral


The simplest form of the integral is for elementary functions. We use the con-
vention 0 · ∞ = 0.

Definition 4.1 Let (S, F, µ) be a measure space. Let f be a non-negative ele-


mentary function with the representation
n
X
f= αi 1Ai (4.1)
i=1

with disjoint sets Ai ∈ F. We define the integral of f with respect to µ by


Z Z n
X
f dµ := f (x) dµ(x) := αi µ(Ai ) (4.2)
i=1

The representation 4.1 is not unique, because the sets of the form {f = α}
can be split in many different ways into disjoint measurable parts. But

Lemma 4.1 The integral in Definition 4.1 does not depend on the choice of the
representation in (4.1).
Pm
Proof. Let f = j=1 βj 1Bj be a second representation of S f with disjoint
S
sets Bj ∈ F . Without loss of generality we may assume S = i Ai = j Bj
since the coefficient 0 is admitted. Moreover αi = βj for all pairs (i, j) of indices
with Ai ∩ Bj 6= ∅, because both coefficients agree with the common value which
the function f takes on this intersection. The additivity of µ shows
n
X n X
X m m X
X n m
X
αi µ(Ai ) = αi µ(Ai ∩ Bj ) = βj µ(Ai ∩ Bj ) = βj µ(Bj ).
i=1 i=1 j=1 j=1 i=1 j=1

25
26 Measure Theory (October 20, 2008)

For the sake of simplicity of this lemma we have assumed in Definition 4.1
that the sets Ai in the representation (4.1) are disjoint. Part a) of the following
theorem shows in particular that the formula (4.2) is also valid if the Ai are not
disjoint.

Theorem 4.1 The integral as a functional on the set of non-negative elemen-


tary functions is
R R R
a) additive and positively homogeneous, i.e. af + bg dµ = a f dµ + b g dµ
if a ∈ [0, ∞), R R
b) monotone increasing, i.e. f ≤ g implies f dµ ≤ g dµ,
c) σ-continuous
R R from below, i.e. if fn is [0, ∞]-valued and fn ↑ f point-wise
then fn dµ ↑ f dµ.

Proof. The proofs for a) and b) follow from the fact that for any two
elementary functions there is a common partition of S into disjoint measurable
subsets on which
Pm both functions are constant.
c) Let f = j=1 αj 1Aj , and assume without loss of generality that αj > 0 for
all j = 1, . . . , m. We consider the sets Bjn,k = {x ∈ Aj : fn (x) > αj − k1 }. For
fixed k we get Bjn,k ↑ Aj as n → ∞ because each x ∈ Aj is in Bjn,k for eventually
1
all n. Moreover, P if k is 1large enough to make αj − k > 0 for all j = 1, . . . , m,
we have fn ≥ j (αj − k ) 1B n,k by the positivity assumption, and thus
j

Z X Xm
1 1
lim fn dµ ≥ lim (αj − ) µ(Bjn,k ) = (αj − ) µ(Aj ).
n n k j=1
k
R P R
As k → ∞ we get limn fn dµ ≥ αj µ(Aj ) = f dx. The converse inequality
follows from the monotonicity of the integral.

4.2 The Integral for non-negative measurable and


for integrable functions
We extend the definition of the integral from elementary functions in two steps:
First for all non-negative measurable functions and then for those measurable
functions whose positive and negative parts have a finite integral.
R
Definition 4.2 For a measurable function f : S 7→ [0, ∞] we define f dµ by
Z Z
f dµ = sup{ g dµ : g ≤ f, g is an elementary f unction}.

Theorem 3.3 says that a measurable f ≥ 0 can be approximated from below


by an increasing sequence of elementary functions. One reaches the sup in the
definition by any such sequence.

LemmaR 4.2 LetR(fn ) be any sequence of elementary functions such that fn ↑ f .


Then fn dµ ↑ f dµ.
R R
R Proof. Obviously
R by definition f dµ ≥ fn dµ for each n and thus
f dµ ≥ limn fn dµ. For the converse inequality let g be an elementary
Measure Theory (October 20, 2008) 27

function with g ≤ f . Then the functions min(fn , g) are also elementary


R and
min(fRn , g) ↑ min(f, g) =Rg. By theorem 4.1 part c)
R we get limn f n dµ ≥
limn min(fn , g) dµ = g dµ. The definition of f dµ yields the desired
inequality.

From Definition 4.2 and lemma 4.2 it is straightforward to see that the
integral for non-negative measurable functions is again additive, monotone and
positively homogeneous.
We now admit functions which also take negative values. Every function
f : S → [−∞, ∞] has the decomposition f = f + − f − , where f + = max(f, 0)
and f − = max(−f, 0). Then |f | = f + + f − . All these functions are measurable
if f is measurable (cf. corollary 3.1).
R
Definition 4.3 a) A measurable function f : S → [−∞, ∞] with |f | dµ < ∞
is called µ-integrable (µ-integrierbar). For a µ-integrable function the inte-
gral is defined by Z Z Z
f dµ := f dµ − f − dµ.
+

The set of finite-valued µ-integrable functions is denoted by L1 (µ).


b) If A ∈ F and f is either
R measurable andR ≥ 0 or integrable the same is true
for 1A f and one writes A f dµ instead of 1A f dµ.
Example 4.1 As in Example 2.4 the letter L is a reference to the name Lebesgue.
In the special case where S = Rd and F is the σ-algebra of Lebesgue measurable
sets and µ = λd the elements
R of L1 (λd ) are called Lebesgue-integrable.
R One
often writes L (R ) and Rd f (x) dx instead of L1 (λd ) and f dλd . In Example
1 d

4.3 below we study the relation to the Riemann integral.


We collect some elementary facts
Proposition 4.1 Let f be measurable. Then
a) (Markov’s inequality). For all a > 0
Z
1
µ{|f | ≥ a} ≤ |f | dµ. (4.3)
a
R
b) |f |dµ = 0 if and only if µ{f 6= 0} = 0.
R R
c) If f is integrable then µ{|f | = ∞} = 0 and | f dµ| ≤ |f | dµ.
d) f is integrable if there is an integrable function h with |f | ≤ h.
Proof. a) Clearly |f | ≥ a1{|f |≥a} . This implies
Z Z
|f | dµ ≥ a1{|f |≥a} dµ = aµ{|f | ≥ a}.

b) First suppose that µ(N ) = 0 for N = {f 6= 0}. Then for every elementary
function
R g with g ≤ |f | we have also µ{g 6= 0}R = 0 and hence by Definition 4.1
gdµ = 0. Then according
R to Definition 4.2 |f |dµ = 0.
Conversely assume |f |dµ = 0. One has {|f | > 1/n} ↑ {|f | > 0} and therefore
by Markov’s inequality
Z
µ({|f | > 0}) = lim µ({|f | > 1/n}) ≤ lim n |f |dµ = 0.
n n
28 Measure Theory (October 20, 2008)

c) The first assertion follows by letting a tend to +∞ in (4.3). For the second
assertion note that
Z Z Z Z Z Z
−( f + dµ + f − dµ) ≤ f + dµ − f − dµ ≤ f + dµ + f − dµ.
R R
d) Clearly if |f | ≤ h and h is integrable then |f | dµ ≤ hdµ < ∞.

In extension to theorem 4.1 one has


Theorem 4.2 L1 (µ) is a linear space and the integral is a linear and monotone
functional on L1 (µ).
R R
Proof. 1. We have af dµ = a f dµ for all a ∈ R. If a ≥ 0 this follows
from the corresponding property of the elementary integral first for f + and f −
by Definition 4.2 and then for f . If a < 0 it suffices to consider a = −1. Then
Z Z Z Z Z
−f dµ = (−f )+ dµ − (−f )− dµ = f − dµ − f + dµ
Z Z Z
+ −
= −( f dµ − f dµ) = − f dµ.

2. If f, g ∈ L1 (µ), then (f + g)+ − (f + g)− = f + g = f + + g + − f − − g − , i.e.


(f + g)+ + f − + g − = (f + g)− + f + + g + . The additivity of the integral for
non-negative functions and subtraction give
Z Z Z
f + g dµ = f dµ + g dµ.
R R R R
3. If f ≤ g, then g dµ = f dµ + g − f dµ ≥ f dµ since g − f ≥ 0.
Example 4.2 Consider the counting measure ζ on the power set of N. Then
the integral is just a series:
Z ∞
X
f dζ = f (n). (4.4)
n=1

For elementary functions this is obvious. For nonnegative functions (sequences)


this follows easily by approximation from below. A sequence f = (f (n))n∈N is ζ-
integrable if and only if the series in (4.4) converges absolutely. So every theorem
about integrals for measures contains a result aboutP∞ absolutely converging series.
n−1 1
Like in example 4.1 however a series like in n=1 (−1) n = log(2) which
converges but does not converge absolutely cannot be treated by this method.
The point is that the Lebesgue theory does not pay attention to any order
structure in the set S. (This gives a new explanation for the elementary analysis
statement that an absolutely converging series can be arbitrarily rearranged
without changing its value.)
Let us now replace N by an arbitrary set S. Then the formula
Z X
f dζ = f (x). (4.5)
x∈S

allows to extend the definition of a (absolutely converging) series to functions


on S. If f is ζ-integrable then the Markov inequality implies that for each n ∈ N
the set {|f | > n1 } is finite. In particular the set {f 6= 0} is countable.
Measure Theory (October 20, 2008) 29

We take up the subject of null-sets from section 1.3. Let P be a property


which the points of S can have. Then P is said to be fulfilled µ-almost ev-
erywhere (fast überall), in short a.e., if the set of points which do not have
this property is a µ-null-set. For example we just proved that a µ-integrable
function
R f is µ-almost everywhere finite. Similarly a measurable function f with
|f | dµ = 0 vanishes µ-a.e.. In analogy to section 1.3 we call two functions f, g
µ-equivalent if they agree µ-a.e..

Proposition 4.2 Let the two functions f, g : S → [−∞, ∞] be µ-equivalent.


a) If f is F measurable then g is Fµ -measurable.
R R
b) If f is µ-integrable then so is g and f dµ = g dµ.

Proof. a) Let B ∈ B(R). Then the set {f ∈ B} is µ-equivalent to the set


{g ∈ B} ∈ F and hence an element of the completion Fµ (cf. proposition 1.1).
This proves the assertion.
b) Let N = {f 6= g}. Part a) and lemma 3.2 show that N ∈ Fµ andRµ(N ) = 0.
Moreover
R |g| ≤ |f | + ∞1N and so by part b) of proposition 4.1 |g| dµ ≤
|f | dµ + 0 < ∞, i.e. g is integrable. The identity
Z Z Z Z
g dµ = g1N c dµ = f 1N c dµ = f dµ

follows first for g ≥ 0 and then for general g.

Example 4.3 (Riemann integrals). Let [a, b] be a real interval and let f :
Pk
[a, b] → R be Riemann integrable. In the special case where f = i=1 αi 1(ai ,bi ]
Pk
is a step function clearly the Riemann sum i=1 αi (bi − ai ) coincides with the
elementary Lebesgue integral. In the general case there are two sequences (f n )
and (f n ) of step functions such that for f = inf n f n and f = supn f n we have
Z b Z b
f ≤ f ≤ f and sup f n (x)dx = inf f n (x)dx.
n a n a

The two functions f and f are -according to theorem 3.2- Borel measurable.
Since they are bounded and on a bounded interval the constants are integrable
they areRLebesgue-integrable
R by proposition
R 4.1 d). Monotonicity of the integral
implies f dλ1 = f dλ1 . Thus f − f dλ1 = 0 and by proposition 4.1 d)
f − f = 0 a.e.. Therefore f = f a.e. and proposition 4.2 shows that f is
Lebesgue-integrable and
Z Z Z b Z b
1 1
f dλ = f dλ = f (x) dx = f (x) dx.
a a
R∞
However a so-called improper Riemann integral like in the formula 0 sinx x dx =
R∞
π/2 typically is not a Lebesgue-integral, e.g. in this example 0 | sinx x | dx = ∞,
i.e. the integrand is not λ1 -integrable in the sense of Definition 4.3.

Definition 4.4 We denote by L1 (µ) the set of µ-equivalence of integrable func-


tions.

Part a) of the following lemma follows from proposition 4.1 c). Part b) is
obvious.
30 Measure Theory (October 20, 2008)

Lemma 4.3 a) Every integrable function f is µ-equivalent to the real-valued


function f 1{|f |<∞} ∈ L1 (µ).
b) Let f, f 0 , g, g 0 ∈ L1 (µ) be such that f = f 0 a.e. and g = g 0 a.e.. Then for all
a, b ∈ R one has af + bg = af 0 + bg 0 a.e..

Thus each equivalence class in L1 (µ) has a representative in L1 (µ). In view


of theorem 4.2 the space L1 (µ) inherits from L1 (µ) the vector space structure
and the natural ordering and the integral induces a linear monotone increasing
functional on L1 (µ). Often one does not really distinguish between an element
of L1 (µ) and its equivalence in L1 (µ).

4.3 The Limit Theorems


The limit results of this section are a crucial feature of the Lebesgue integral.
They all treat in one way or another the question whether the integral can be
exchanged with a limit.

Theorem 4.3 (Theorem of monotone convergence). Let (fn ) be a non-


decreasing sequence of non-negative measurable functions. Let f = supn fn .
Then Z Z
f dµ = lim fn dµ.
n→∞

Proof. Theorem 3.3 shows that for every n ∈ N there exists a sequence
(fnk )k∈N of elementary functions such that fnk ↑k→∞ fn . Let us consider the
elementary function gk = max(f1k , . . . , fkk ). For each k we have gk+1 ≥ gk and
f ≥ gk . Moreover supk gk ≥ supk≥n fnk = fn for each n. So f ≥ supk gk ≥
supn fn = f , i.e. gk ↑ f . Since the gn are elementary lemma 4.2 gives
Z Z Z
f dµ = lim gn dµ ≤ lim fn dµ.
n→∞ n→∞

R R
The converse inequality f dµ ≥ limn→∞ fn dµ follows from the monotonicity
of the integral.

Remark 4.1 The additivity of the integral and the theorem of monotone con-
vergence together show that every non-negative measurable function f gives rise
to a new measure f µ : F → [0, ∞], defined by
Z
f µ(A) = 1A f dµ.

In fact for every sequence (An ) of pairwise disjoint sets with union A
Z X ∞ ∞ Z
X ∞
X
f µ(A) = ( 1An )f dµ = 1An f dµ = f µ(An ).
n=1 n=1 n=1

In chapter 8 we shall characterize those measures which appear in this way.

For non-negative sequences of functions which do not converge one has


Measure Theory (October 20, 2008) 31

Theorem 4.4 (Fatou’s lemma). Let (fn ) be a sequence of non-negative mea-


surable functions. Then
Z Z
lim inf fn dµ ≤ lim inf fn dµ.
n→∞ n→∞

Proof. We consider the function gm = inf n≥m fn . It is measurable by


theorem 3.2. Then
R gm ≥R 0 and gm ↑ sup gm = lim inf fn . On the other hand
gm ≤ fn and gm dµ ≤ fn dµ for all n ≥ m. Together with the theorem of
monotone convergence one gets
Z Z Z
lim inf fn dµ = sup gm dµ = sup gm dµ
n→∞ m m
Z Z
≤ sup inf fn dµ = lim inf fn dµ.
m n≥m n→∞

Example 4.4 In general the inequality in Fatou’s lemma is strict even if the
functions converge point-wise and are in L1 (µ). We consider the standard
Lebesgue measure space ([0, 1], B([0, 1]), λ|B([0, 1])). Let f = 0 and fn = n1(0, n1 ] .
Then fn → f point-wise. On the other hand
Z 1 Z 1
1
lim fn (x) dx = lim n = 1 6= 0 = f (x)dx.
n 0 n n 0

Lebesgue’s result below shows that domination by an integrable function is


sufficient to avoid the effect in this example. Note that the function x 7→ x1
which is a natural upper bound for our sequence fails to be integrable.

The following theorem is perhaps the most useful single result in measure
theory.

Theorem 4.5 (Lebesgue’s dominated convergence theorem, Satz von


der majorisierten Konvergenz) Let (fn ) be a sequence of measurable func-
tions such that there is an integrable function h with |fn | ≤ h µ-a.e.. Let f be
such that f (x) = limn→∞ fn (x) µ-a.e.. Then f is integrable as well and
Z Z
f dµ = lim fn dµ
n→∞

and Z
lim |f − fn | dµ = 0.
n→∞

Proof. The union N of the countably many null-sets {fn 9 f }, {|fn | > h}
and {|h| = ∞} is a null-set. Without changing the integrals or the integrability
of the functions involved we can set all function to vanish on N . Let us consider
the auxiliary functions gn = 2h − (f − fn ). Then limn gn = 2h and gn ≥ 0, so
by Fatou’s lemma
Z Z Z
2h dµ = lim gn dµ ≤ lim inf gn dµ
n n
Z Z Z
¡ ¢
= 2h dµ − lim sup f dµ − fn dµ .
n
32 Measure Theory (October 20, 2008)

¡R R ¢
Therefore lim supn f dµ − fn dµ ≤ 0. Similarly we can apply Fatou’s
lemma to the non-negative
¡R Rfunctions
¢ 2h + f − fn , and get in the same way the
estimate lim inf n f dµ − fn dµ ≥ 0. Together this gives the first of the two
limit assertions. The second one follows if we apply the first one to the sequence
(|f − fn |) which is dominated by 2h and converges everywhere to 0.

As a corollary one gets the following rule for differentiation under the integral
sign.

Theorem 4.6 Let (a, b) an open interval and let f : (a, b)×S → R be a function
such that
i) for fixed t ∈ (a, b) is the function f (t, ·) is µ-integrable on S,
ii) for fixed x ∈ S the function f (·, x) is continuously differentiable on (a, b),

iii) there is a µ-integrable h : S → [0, ∞] such that | ∂t f (t, x)| ≤ h(x) for all
x ∈ S and all t ∈ (a, b).
R
Then the integral f (t, x)dµ(x) is continuously differentiable in t and
Z Z
∂ ∂
f (t, x) dµ(x) = f (t, x) dµ(x). (4.6)
∂t S S ∂t

Proof. Let (tn ) be sequence of points which converges to some t ∈ (a, b).
Consider the measurable functions gn defined by gn (x) = f (tn ,x)−f
tn −t
(t,x)
. By the

mean-value theorem for each x there is a point tn (x) ∈ (a, b) such that gn (x) =
∂ ∗
∂t f (tn (x), x). By hypothesis we can conclude |gn (x)| ≤ h(x). Furthermore

gn (x) → ∂t f (t, x) for all x ∈ S. The theorem of dominated convergence gives
R R Z Z
S
f (tn , x) dµ(x) − S f (t, x) dµ(x) ∂
= gn (x) dµ → f (t, x)dµ.
tn − t ∂t

i.e. the relation (4.6). Similarly the continuity of the right-hand side in (4.6)
follows because iii) implies
Z Z Z
∂ ∂ ∂
lim f (tn , x) dµ(dx) = lim f (tn , x) dµ(dx) = f (t, x) dµ(dx).
n ∂t n ∂t ∂t
Chapter 5

Fubini’s Theorem

Fubini’s theorem is the main tool for multiple integrals.

Theorem 5.1 (Fubini) Let (X, F, µ) and (Y, G, ν) two σ-finite measure spaces.
Let F ⊗ G be the σ-algebra on the cartesian product X × Y which is generated
by the ’measurable rectangles’ F × G with F ∈ F, G ∈ G. Then
a) There is a unique measure µ ⊗ ν on the σ-algebra F ⊗ G, such that

µ ⊗ ν(F × G) = µ(F )ν(G)

for all F ∈ F, G ∈ G.
b) Let f : X × Y → [0, ∞] be a F ⊗ G-measurable function. For every x ∈ X the
sectional
R function f (x, ·) : y 7→ f (x, y) is G-measurable and the function x 7→
Y
f (x, y) dν(y) is F-measurable. Similarly the R sectional function f (·, y) is F-
measurable for each y ∈ Y and the map y 7→ X f (x, y) dµ(x) is G-measurable.
Moreover
Z Z Z
f (x, y) dµ ⊗ ν(x, y) = f (x, y) dν(y) dµ(x) (5.1)
X×Y X Y
Z Z
= f (x, y) dµ(x) dν(y).
Y X

c) An F ⊗ G-measurable function f : X × Y → [−∞, ∞] is µ ⊗ ν-integrable if


and only if
Z Z
|f (x, y)|dν(y) dµ(x) < ∞.
X Y

In this case f (x, ·) ∈ L1 (ν) for µ-almost all x and f (·, y) ∈ L1 (µ) for ν-almost
all y and the formula in b) is valid in the sense that the inner integrals denote
an arbitrary measurable function on the respective null-sets where they are not
defined.

Definition 5.1 The measure µ ⊗ ν is called the product (measure) of µ and


ν. The σ-algebra F ⊗ G is called the product (σ-algebra) of F and G.

Before the proof of the theorem let us write down two important corollaries:

33
34 Measure Theory (October 20, 2008)

Corollary 5.1 (generalized Cavalieri principle) For every set A ∈ F ⊗ G


let Ax = {y ∈ Y : (x, y) ∈ A} and Ay = {x ∈ X : (x, y) ∈ A}. Then
Z Z
µ ⊗ ν(A) = ν(Ax ) dµ(x) = µ(Ay ) dν(y).

Corollary 5.2 Let N ⊂ X × Y . The following are equivalent:


a) N is a µ ⊗ ν-null-set,
b) for µ-almost all x ∈ X one has ν(Nx ) = 0.
c) for ν-almost all y ∈ Y one has µ(N y ) = 0.

Proof. 1. Suppose first that µ(X) < ∞ and ν(Y ) < ∞. Consider the
system D of all subsets A of X × Y for which i) the section sets Ax are in G for
all x ∈ X and ii) the map x 7→ ν(Ax ) is F-measurable. Clearly X × Y ∈ D.
The set operation A 7→ Ax commutes with countable unions and the difference
operator \, moreover ν is a finite measure. These two facts imply that D is a
monotone class in the sense of section 2.4. Obviously D contains the system E
of all ’measurable rectangles’ F × G with F ∈ F , G ∈ G. The intersection of
two such sets is again Rin E. So theorem 2.4 applies and gives F ⊗ G ⊂ D. So
the map µ ⊗ ν : A 7→ X ν(Ax ) dµ(x) is well-defined and the σ-additivity of ν
and the theorem of monotone convergence show that it is a measure on F ⊗ G.
Moreover
Z Z
µ ⊗ ν(F × G) = ν((F × G)x ) dµ(x) = ν(G) dµ(x) = µ(F )ν(G).
X F

2. In a Rcompletely symmetric fashion can define a measure ρ on F ⊗ G with


ρ(A) = Y µ(Ay ) dν. This measure ρ satisfies as well ρ(F × G) = µ(F )ν(G) for
all F × G ∈ E. So the uniqueness theorem 2.5 shows ρ = µ ⊗ ν. This proves
also corollary 5.1. The second corollary follows from the first if we use the fact
that a non-negative measurable function (here the functions x 7→ ν(Ax ) and
y 7→ µ(Ay )) vanishes a.e. if and only if its integral vanishes (proposition 4.1 b).
3. If f is an elementary F ⊗ G-measurable function then the statement in part
b) for f follows from the additivity of the integral. If f is a general non-negative
measurable function then it can be approximated by a nondecreasing sequence of
elementary functions (theorem 3.3) and part b) follows by a repeated application
of the theorem of monotone convergence.
4. Let now f be a F ⊗ G-measurable function. That f is µ ⊗ ν integrable iff
the double integral in c) converges is just part
R b) applied to |f |. Now let f be
µ ⊗ ν-integrable. Then the function x 7→ |f (x, y)|dν(y) is µ-integrable and
thus thereR is a µ-null-set N1 ∈ F such that this integral is finite and hence
the value Y f (x, y) dν(y) is well-definedR for x ∈ / N1 . Let g : X → R be any
F-measurable function. Let the symbol Y f (x, y) dν(y) denote the value g(x)
for x ∈ N1 . Then the formula
Z Z Z
f (x, y) dν = f + (x, y) dν − f − (x, y) dν

holds µ-a.e. and therefore the identity (5.1) carries over from f + and f − to f .
The second identity in b) follows by exchanging the roles of the variables x, y.
5. Finally we get rid of the finiteness condition on the measures. Let (Qn ) and
(Rn ) be two sequences in F resp. G, such that µ(Qn ) < ∞, ν(Rn ) < ∞ and
Measure Theory (October 20, 2008) 35

Qn ↑ X, Rn ↑ Y for n ∈ ∞. Let µn (A) = µ(A ∩ Qn ) and νn (B) = ν(B ∩ Rn ).


Then µn (X) < ∞ and νn (Y ) < ∞. We can apply the previous arguments for
each n and get unique measures µn ⊗ νn , for which the statements a) − c) hold
analogously. These measures are increasing and the set function µ ⊗ ν defined
by µ ⊗ ν(A) = lim µn ⊗ νn (A) is a measure. One easily shows via approximation
by elementary functions and the theorem of monotone convergence that
Z Z Z
f dµ ⊗ ν = lim f dµ ⊗ ν = lim f dµn ⊗ νn
n Qn ×Rn n

for all F ⊗ G-measurable f ≥ 0. Then the extension of the various assertions


are straightforward. The uniqueness statement in a) again is a consequence of
theorem 2.5.

Remark 5.1 We have used monotone classes. If one does not want to rely on
section 2.4 one can get almost the same result as follows. Let R be the ring
of finite disjoint unions of measurable rectangles. It is easy to extend the set
function F × G 7→ µ(F )ν(G) by additivity to a content m on R R and to see
that the function x 7→ ν(Rx ) is F-measurable and m(R) = X ν(Rx ) dµ for
all R ∈ R. This implies that m is σ-additive and σ-finite. Let µ ⊗ ν denote
the unique extension to F ⊗ G. Let now f ≥ 0 be F ⊗ G-measurable. The
only difficulty is that without
R monotone classes it is not so easy to prove the
F-measurability of x 7→ f (x, y) dν(y). However the Fµ -measurability can be
verified. For this let first A ∈ F ⊗ G. Then for each n there is a sequence
S (Rin )i
of (without loss ofPgenerality disjoint) sets in R such that A ⊂ i Rin =: An
n 1 1
and µ ⊗ ν(A) ≥ i m(Ri ) − n = µ ⊗ ν(An ) − n . The class D in part 1.
of the above proof contains the Rin and their disjoint union An . Similarly,
applying the same argument to Ac we find a set An ∈ D such that An ⊂ A
and µ ⊗ ν(An ) ≥ µ ⊗ ν(A) − n1 . Then using the same argument as for the
Lebesgue-integrability of Riemann integrable functions we see that x 7→ ν(Ax )
is Fµ -measurable and the first identity in corollary 5.1 holds. The rest of the
proof works essentially like before.

In the special case of Lebesgue-measure some additional comments are in


order.
Remarks 1. If we choose F = B(Rm ) and G = B(Rk ) we have B(Rm+k ) =
B(Rm ) ⊗ B(Rk ), because both σ-algebras are generated by the compact boxes of
the form Q×R. Moreover λm k
|B(Rm ) ⊗λ|B(Rk ) (B) = λ
m+k
(B) for all B ∈ B(Rm+k ).
This can be verified first on the system of these boxes and then by appealing to
theorem 2.5.

2. In some texts on analysis the multidimensional integral of continuous


functions of compact support is defined via successive integration. Fubini’s the-
orem and our remark 4.3 shows that this integral is actually the Lebegue integral
of these functions.

3. Let us denote by Ld the σ-algebra of Lebesgue measurable sets in Rd .


Note that Lm+k is strictly larger than Lm ⊗ Lk . For example let E ⊂ [0, 1] be
the Vitali set from section 1.4. Then A = {0} × E is a λ2 -null-set since it is
contained in the product measurable {0} × [0, 1] which has λ2 -measure 0 · 1 = 0.
Because λ2 is complete we get A ∈ L2 . If A were an element of L1 ⊗ L1 then
36 Measure Theory (October 20, 2008)

all sections Ax would be in L1 which is not true since A0 = E. Nevertheless, L2


being the λ2 -completion of the Borel sets, every Element von L2 differs only on
a λ2 -null-set from a suitable Borel set.

4. We give an example that the uniqueness of the product measure may get
lost if one of the measures is no longer σ-finite. In view of the arguments of
remark 5.1 it also illustrates the importance of σ-finiteness in the uniqueness
part of Carathéodory’s theorem 2.2. Let X = Y = [0, 1] and F = G = B([0, 1]).
Let λ be Lebesgue measure on F and let ζ be the counting measure on G. As
in the beginning of the proof of theorem 5 we can define two measures ρ1 and
ρ2 on F ⊗ G by Z 1
ρ1 (B) = ζ(B x ) dλ(x),
0
Z 1 X
ρ2 (B) = λ(By ) dζ(y) = λ(By ).
0 y

For every measurable rectangle

ρ1 (F × G) = λ(F )ζ(G) = ρ2 (F × G),

but for the diagonal D ⊂ [0, 1]2 we get


Z 1 Z 1
ρ1 (D) = ζ(Dx ) dλ(x) = 1 dx = 1,
0 0

whereas X X
ρ2 (D) = λ(Dy ) = 0 = 0.
y y
Chapter 6

Convergence in measure

Besides point-wise convergence or a.e. convergence there are several other ways
to express that a sequence of measurable functions converges. The Lp -spaces
of the next chapter will give a full scale of such concepts. Here we discuss
convergence in measure which is particularly useful in probability theory. The
reader who wants to pass directly to the next chapter will need the completeness
result of theorem 6.4 b) which can be understood independently of the other
parts of this section.

6.1 Comparison with other types of convergence


Definition 6.1 Let (S, F, µ) be a measure space. Let f, fn (n ∈ N) be extended
real valued measurable functions which are µ-a.e. finite. One says that the
sequence (fn ) converges to f
a) µ-almost everywhere (fast überall), if µ{x : fn (x) 9 f (x)} = 0,
b) in measure (nach Maß) if µ{x : |fn (x) − f (x)| ≥ ε} → 0 for each ε > 0,
R
c) in the mean (im Mittel), if |f − fn | dµ → 0 and all functions are
integrable.

Comment. Note that in a) and b) the sets in question are actually in F.


We see that in all three parts of Definition 6.1 only the µ-equivalence class of
the functions matters. Therefore one usually extends these concepts also to all
extended real valued functions which are µ-a.e. finite, i.e. which are equivalent
to a finite-valued measurable function.

For infinite measures there is a weaker form of convergence in measure which


in finite measure spaces is equivalent to convergence in measure.

Definition 6.2 One speaks of stochastic convergence or local convergence in


measure if fn 1A −→ f 1A in measure for every A ∈ F with µ(A) < ∞.

Let λ1 be Lebesgue measure on the real line. The indicator functions fn =


1[n,n+1] do not converge to 0 in measure but point-wise and hence locally in
measure by part b) of the following theorem.

37
38 Measure Theory (October 20, 2008)

Theorem 6.1 Between the three concepts in Definition 6.1 the following im-
plications hold:
a) If fn −→ f in the mean then fn −→ f in measure.
b) If fn −→ f µ a.e. then fn 1A −→ f 1A in measure for every A ∈ F with
µ(A) < ∞.
The converse implications are wrong even for finite measure spaces.

Proof. 1. Markov’s inequality (theorem 4.1 a) implies for all ε > 0


Z
1
µ({|fn − f | ≥ ε}) ≤ |fn − f | dµ.
ε
So convergence in mean implies convergence in measure.
2. Assume that fn −→ f µ-a.e. Let A ∈ F with µ(A) < ∞ be given. Let An =
∪m≥n {x ∈ A : |fm (x) − f (x)| ≥ ε}. Then A1 ⊃ A2 ⊃ . . . and µ(A1 ) ≤ µ(A) <
∞. So the continuity from above of µ (theorem 1.3) shows µ(An ) ↓ µ(A∞ )
where A∞ := ∩n An . On the set A∞ one has lim supn |fn − f | ≥ ε. Therefore
A∞ is a µ-null-set. Together we get

µ({x ∈ A : |fn − f | ≥ ε}) ≤ µ(An ) ↓ µ(A∞ ) = 0,

i.e. the convergence in measure of fn 1A to f 1A .


3. The sequence in example 4.4 does not converge in the mean but it converges
in measure according to part b) which we just proved.
4. fn −→ f in the mean does not imply fn −→ f µ-a.e.. Again we work
on the standard Lebesgue measure on [0, 1]. Let (Jn ), n = 1, 2, . . . be an
enumeration of all intervals of the form R[ k−1 k
m , m ) with k ≤ m; k, m ∈ N. Then
for fn = 1Jn , f = 0 one gets on one side |fn − f | dλ = λ(Jn ) → 0, and on the
other hand (fn ) does not converge to f a.e. because for all x

lim sup |fn (x) − f (x)| = lim sup 1Jn (x) = 1.


n→∞ n→∞

This completes the proof.

Part b) of the theorem has a partial converse.

Theorem 6.2 Let µ be σ-finite. A sequence (fn ) converges locally in measure


to f if and only if for every sub-sequence (fnk )k there is a sub-sub-sequence
(fnkl )l which converges µ-a.e. to f .

Proof. We may assume (for simplicity of notation) f = 0. 1. Suppose


(fn ) converges locally in measure to 0 and let (fnk )k be a sub-sequence. Let
(Am )m be an increasing sequence of sets of finite measure whose union is S.
Then for each l ∈ N, the numbers µ(Al ∩ {|fnk | > 1l }) converge to 0 as k → ∞.
So there is an increasing sequence of indices kl such that µ(Bl ) < 2−l where
Bl = Al ∩ {|fnkl | > 1l }, for all l.
We claim that fnkl → 0 µ-a.e.. Let N be the exceptional set of all points x
for which fnkl (x) does not converge to 0. Let x ∈ N ∩ Am . There is some
r ∈ N such that fnkl (x) > 1r for infinitely many l. In particular there is some
l > max(r, m) with this inequality. Then x ∈ Bl . Therefore for fixed m0 we get
Measure Theory (October 20, 2008) 39

S∞ P∞
N ∩ Am0 ⊂ N ∩ Am ⊂ l=m Bl and µ(N ∩ Am0 ) ≤ l=m 2−l = 2 · 2−m for all
m ≥ m0 . Hence µ(N ∩ Am0 ) = 0 for all m0 and finally µ(N ) = 0 as desired.
2. Conversely suppose the subsequence condition holds. Let A ∈ F be a set of
finite measure and ε > 0. Choose a sub-sequence such that

lim sup µ(A ∩ {|fn − f | > ε}) = lim µ(A ∩ {|fnk − f | > ε}).
n k→∞

Let (fnk l )l be a sub-sub-sequence such that fnkl → f µ-a.e.. Then by theorem


6.1 this sub-sub-sequence converges locally in measure to f , in particular the
above lim sup is equal to liml→∞ µ(A ∩ {|fnkl − f | > ε}) = 0. This proves that
the full sequence (fn ) also converges locally in measure to f .

The following corollary is a partial converse to part a) of theorem 6.1 and an


extension of the theorem of dominated convergence to sequences which converge
locally in measure.

Corollary 6.1 Let h ∈ L1 (µ) and let Mh be the set all measurable functions g
with |g| ≤ h. Then for sequences in Mh convergence in mean and local conver-
gence in measure agree.

Proof. a). Suppose fn ∈ Mh for all n and assume that fn converges


locally in measure. Then theorem 6.2 and the theorem of dominated convergence
together imply that
R for every sub-sequence (fnk )k there is aR sub-sub-sequence
(fnkl )l such that |fnkl − f | dµ → 0. But this implies even |fn − f | dµ → 0,
i.e. convergence in mean. The converse is theorem 6.1 a).

We conclude this section with a little theorem about uniform convergence


on large sets.

Theorem 6.3 (Egorov). Let µ(S) < ∞ and let fn → f µ-a.e.. Then for each
ε > 0 there is a set A ∈ F with µ(Ac ) < ε such that fn converges to f even
uniformly on A.
1
Proof. Let gn = supk≥n |fk − f |. Then gn ↓ 0 µ-a.e., i.e. µ({g
P∞n > m }) → 0
for all m ∈ N. Then we find a sequence of indices such that m=1 µ({gnm >
1 ∞ 1 c
m }) < ε. Let A = ∩m=1 {gnm ≤ m }. Then µ(A ) < ε. For the verification of
1
uniform convergence let η > 0. Choose m ≥ η . Then for all x ∈ A and n ≥ nm
1
one gets |fn (x) − f (x)| ≤ gnm (x) ≤ m ≤ η.

6.2 A Metric for Local Convergence in Measure


In analysis most convergence concepts correspond to a topological structure, e.g.
to a metric. At least for σ-finite measure spaces there are many ways to define
a metric between (equivalence classes of) measurable functions which describes
local convergence in measure. Theorem 6.5 will give one of the possibilities.
The following result shows that many metrics between equivalence classes of
measurable functions are complete. It implies in particular the completeness of
the Lp -spaces in the next chapter.1
1 From this section only theorem 6.4 will be needed later.
40 Measure Theory (October 20, 2008)

Theorem 6.4 Let ρ be a functional which assigns to every non-negative F-


measurable functions g a value ρ(g) ∈ [0, ∞] with the following properties:
ρ(g) = 0 if and only if g = 0 µ − a.e., (6.1)
0≤f ≤g =⇒ ρ(f ) ≤ ρ(g), (6.2)
X∞ X∞
ρ( gi ) ≤ ρ(gi ), (6.3)
i=1 i=1
ρ(g) < ∞ =⇒ g < ∞ µ − a.e.. (6.4)
Then the set Lρ of all real valued measurable functions f with ρ(|f |) < ∞ is
complete with respect to the pseudo-metric d(f, g) = ρ(|f − g|).
Proof. The triangular inequality d(f, h) ≤ d(f, g) + d(g, h) follows from
(6.2) and (6.3). Let (fn ) be a Cauchy sequence with respect to d, i.e. ρ(|fn −
fPm |) converges to 0 as m, n tend to ∞. Choose
P∞a subsequence (nk ) such that

k=1 ρ(|fnk+1 − fnk |) < ∞. Let A = {x : k=1 |fnk+1 (x) − fnk (x)| < ∞}.
Then A ∈ F and µ(Ac ) = 0 according to (6.4). Since
k−1
X
fnk (x) = fk0 (x) + (fnl+1 (x) − fnl (x)) for k > k0 ,
l=k0

the sequence (fnk )(x) converges for k → ∞ to a finite real number f (x) for all
x ∈ A. Outside of A let f (x) = 0. The measurability
P∞ of the fn and of A imply
the measurability of f . Moreover |f − fnk | ≤ l=k |fnl+1 − fnl | everywhere and
by (6.2) and (6.3)

X
ρ(|f − fnk |) ≤ ρ(|fnl+1 − fnl |) < ∞
l=k

for each k. In particular ρ(|fnk − f |) → 0 for k → ∞. The triangular inequality


gives ρ(|f |) < ∞ and finally also ρ(|f − fn |) ≤ ρ(|f − fnk |) + ρ(|fn − fnk |) → 0
for the full sequence. This proves the completeness.

In order to define a metric for local convergence in measure we use a strictly


positive integrable function. If µ(S) < ∞ one takes the constant function 1. If
µ = λ1 take e.g. h(x) = e−|x| .
Lemma 6.1 There exists an everywhere positive function h ∈ L1 (µ) if and only
if the measure µ is σ-finite.
S∞
Proof. If h ∈ L1 (µ) is everywhere positive, then S = n=1 {|h| > n1 }
and each of the sets in this union has finite µ-measure by Markov’s inequality.
So µ is σ-finite. Conversely if µ is σ-finite there is a sequence (An ) of sets
in F P with An ↑ S und µ(An ) < ∞P for all n. Choose constants cn R> 0 such

that
P∞ c n µ(A n ) < ∞. Then h = n=1 cn 1An is measurable and |h|dµ =
n=1 cn µ(A n ) < ∞ and of course h(x) > 0 for all x ∈ S.
Theorem 6.5 Let h ∈ L1 (µ) be everywhere positive. Define for finite measur-
able functions f, g the distance d(f, g) by
Z
d(f, g) = h ∧ |f − g| dµ.
S
Measure Theory (October 20, 2008) 41

Then d is a complete pseudo-metric such that limn→∞ d(f P n , f ) = 0 if and only


if (fn ) converges locally in measure to f . If moreover n d(fn , f ) < ∞ then
fn → f µ-a.e..
R
Proof. Let ρ(g) = h ∧ g dµ. Then ρ satisfies the conditions
P∞ (6.1)
P∞ - (6.4).
The first two are obvious. For the third one note that h∧ i=1 gi ≤ i=1 (h∧gi )
(distinguish the two cases gi ≤ h for all i or gi > h for some i). Thus

X Z ∞
X Z X
∞ ∞ Z
X ∞
X
ρ( gi ) = h∧ gi dµ ≤ (h ∧ gi ) dµ = h ∧ gi dµ = ρ(gi ).
i=1 i=1 i=1 i=1 i=1

For the proof of (6.4) P∞assume that in this calculation the last series converges.
Then the function P∞ i=1 (h ∧ gi ) is integrable and hence a.e. finite. Let x be a
point such that i=1 (h(x) ∧ gi (x)) converges. ThenP gi (x) < h(x) for eventually

all
P∞ indices because h(x) > 0 and therefore the series i=1 gi (x) converges. Thus
g
i=1 i is finite a.e.. Therefore d is a complete pseudo-metric.
P P
If n d(fn , f ) < ∞ then the series n fn (x) − f (x) converges absolutely µ-a.e.
because of (6.4) and in particular fn → f µ-a.e..
Assume that limn d(fn , f ) = 0. In order to prove local convergence in measure
we verify the sub-sequence criterion P of theorem 6.2. Let (fnk ) be a sub-sequence.
Then every sub-sub-sequence with l d(fnk l , f ) < ∞ converges to f µ-a.e..
Conversely assume local convergence in measure of fn to f or |fn −f | to 0. Then
the same is true for the functions |fn − f | ∧ h. By corollary
R 6.1 these functions
converge also in mean, i.e. by definition d(fn , f ) = fn − f | dµ → 0.
42 Measure Theory (October 20, 2008)
Chapter 7

The Lp-spaces

Definition 7.1 Let (S, F, µ) be a measure space and let f be an extended real-
valued F-measurable function. We define for p ∈ [1, ∞] the number kf kp by
Z
¡ ¢1/p
kf kp := |f |p dµ (1 ≤ p < ∞)

kf k∞ := inf{α > 0 : µ{|f | > α} = 0}.

Further let Lp (µ) = { f : f is F-measurable and kf kp < ∞}. Lp (µ) denotes the
space of all µ-equivalence classes of elements of Lp (µ).

If p = 1 this is in accordance with with Definition 4.3. For p = ∞ a


measurable function is in L∞ (µ) if and only if it is µ-equivalent to a bounded
function. In the case p = 2 theR functional k · k2 is √
the (semi-) norm induced by
the scalar product < f, g >= f g dµ via kf k2 = < f, f >. In particular one
has the Cauchy-Schwarz(-Bunyakowsky) inequality
Z
| f g dµ| = | < f, g > | ≤ kf k2 kgk2 . (7.1)
S

Note that an extended real-valued function is Fµ -measurable if and only if


it is µ-equivalent to an F-measurable function according to corollary 3.2 and
proposition 4.2 a). Thus the equivalence classes of Fµ -measurable functions and
of F-measurable functions agree and each measure has the same Lp -spaces as
its completion.
The following theorem gives an extension of (7.1) and it shows that k · kp is
really a norm on Lp (µ).

Theorem 7.1 For all λ ∈ R kλf kp = |λ|kf kp . Moreover


1 1
a) kf · gkr ≤ kf kp kgkq whenever r, p, q ∈ [1, ∞] satisfy r = p + 1q .
(Hölder’s inequality)
b) kf + gkp ≤ kf kp + kgkp for all p ∈ [1, ∞].
(Minkowski’s inequality)

Proof. The homogeneity is obvious.

43
44 Measure Theory (October 20, 2008)

a) For p = ∞ the condition on r, p, q implies r = q. Moreover |f g| ≤ kf k∞ |g|


µ-a.e. and hence

kf gkr ≤ kkf k∞ |g|kr = kf k∞ kgkr = kf k∞ kgkq .

For finite p, q we use the following


1 1
Lemma 7.1 Let p, q, r ∈ [1, ∞) satisfy the relation r = p + 1q . Then for all
α, β ≥ 0
r r r r
α p β q ≤ α + β.
p q
r
Proof. (of the lemma) The function ϕ(x) = x p is concave because of p > r.
Using the equation of the tangent at the point (1, 1) we get ϕ(x) ≤ pr (x−1)+1 =
r
r r α
px + q. In this proof we may assume β > 0. For x = β we get ( α
β) ≤
p rα
pβ + qr .
r r r r r
1− p
Since β( α
β) = α β
p p =α β p q the lemma is proved.

Now let two measurable functions f, g be given. We may assume f ≥ 0, g ≥ 0


p q
and 0 < kf kp , kgkq < ∞. Choosing α = fkf(x) kp
p
and β = gkgk(x)
q in the lemma and
q
integrating we get
Z Z Z q
(f (x)g(x))r r f p (x) g (x) r r
dµ ≤ dµ + dµ = + = 1,
kf krp kgkrq p kf kpp kgkqq p q

i.e. Hölder’s inequality.


b) (Minkowski) For p = 1 and p = ∞ the assertion is easy. Let now 1 < p < ∞.
Let 0 < kf kp + kgkp < ∞. For the proof of Minkowski’s inequality we may
assume f, g ≥ 0. Because of kf + gk ≤ 2p max(f, g)p ≤ 2p (f p + g p ) one gets
p
kf + gkp < ∞. Letting r = 1, q = p−1 we get from a)

kf + gkpp = kf (f + g)p−1 k1 + kg(f + g)p−1 k1


≤ kf kp k(f + g)p−1 kq + kgkp k(f + g)p−1 kq
= (kf kp + kgkp )kf + gkp−1
p ,

which implies the assertion. In the last step we have used the general relation
µZ ¶ q1 µZ ¶ p−1
p
p−1 (p−1)q p
kh kq = h dµ = h dµ = khkp−1
p . (7.2)

1 1
Corollary 7.1 Let µ(S) < ∞ and p < q. Then kf kp ≤ kf kq µ(S) p − q for all
measurable functions f and in particular Lq (µ) ⊂ Lq (µ).
1 1
Proof. Let s be the number for which s = p − 1q . The constant function
1
s
1 is in L (µ) with k1ks = µ(S) . Hölder’s inequality yields kf kp = kf · 1kp ≤
s

kf f kq k1ks , i.e. the assertion.

Theorem 7.2 For each p ∈ [1, ∞] the space Lp (µ) with the norm k · kp is a
Banach space.
Measure Theory (October 20, 2008) 45

Proof. It remains to prove the completeness. In the case p = ∞ the proof is


analogous to the better known case of the space of bounded continuous functions
with the sup-norm. In the case p < ∞ we let ρ(f ) = kfˆkp where f is any non-
negative measurable function and fˆ is the equivalence class of f . One can verify
the conditions (6.1)-(6.4): The first two conditions are trivial. Minkowski’s
inequality together with the theorem of monotone convergence imply (6.3). For
the proof of (6.4) let f satisfy ρ(f ) < ∞. Then |f |p is integrable and hence f is
finite µ-a.e. Thus theorem 6.4 implies the asserted completeness.

As a preparation of the remaining results we remark that for a measurable


set B ∈ F and 1 ≤ p < ∞ one has
Z Z
1 1 1
k1B kp = ( 1pB dµ) p = ( 1 dµ) p = µ(B) p
B

and hence µ(B) < ∞ is equivalent to 1B ∈ Lp (µ). Moreover the theorem 4.5 of
dominated convergence holds also in Lp : If a sequence of measurable functions
satisfies |fn | ≤ h ∈ Lp (µ) and fn → f point-wise, then kfn − f kp → 0. In fact
|fn − f |p ≤ p
R h and we can apply the original theorem 4.5 to these functions and
conclude |fn − f |p dµ → 0.

Proposition 7.1 Let E be a ∩-stable generating system of the σ-algebra F.


Assume that
S there is an increasing sequence Ek of sets in E with µ(Ek ) < ∞
and S = k Ek . Then the linear span of all indicator functions 1E , E ∈ E is
dense in Lp (µ) for 1 ≤ p < ∞.

Proof. Let L be the smallest closed linear subspace of Lp (µ) which contains
the indicator functions 1E , E ∈ E. We want to show L = Lp (µ). Fix a number
k ∈ N. We consider the class Dk of all sets F ∈ F such that 1F ∩Ek ∈ L. For
every E ∈ E we have E ∩ Ek ∈ E and hence 1E∩Ek ∈ L by assumption. So
E ⊂ Dk .
We claim that Dk is a monotone class. Since S ∩ Ek = Ek ∈ E we have S ∈
Dk . Let F, G ∈ Dk such that G ⊂ F . Then 1(F \G)∩Ek = 1F ∩Ek − 1G∩Ek ∈ L
since L is a linear space. Thus F \ G ∈ Dk . Let Fn be an increasing sequence
in Dk and Fn ↑ F . Then Fn ¡∩ Ek ↑ F ∩ Ek¢ as n → ∞ and µ(F ∩ Ek ) < ∞.
Thus k1F ∩Ek − 1Fn ∩Ek kpp = µ (F \ Fn ) ∩ Ek → 0 as n → ∞ and hence 1F ∈ L
since L is closed. Therefore F ∈ Dk . The monotone class theorem 2.4 shows
that F ⊂ Dk for all k. Let now F ∈ F be any measurable set with µ(F ) < ∞.
Then F ∩ Ek ↑ F and as above we see that 1F = limk 1F ∩Ek ∈ L.
Finally let f ∈ Lp (µ). By theorem 3.3 there is a sequence of elementary
functions (fn )n which increases to f + . Each fn is in L as a linear combination
of indicators of sets of finite measure. Then fn → f + in Lp (µ) by the above
extended dominated convergence and hence f + ∈ L and similarly f − ∈ L and
f ∈ L.

Theorem 7.3 Let 1 ≤ p < ∞. Let µ be a measure on B(Rd ) for which bounded
sets have finite measure. Then the continuous functions with compact support
are dense in Lp (µ). Similarly for a finite measure µ on the Borel sets of a
metric space the bounded continuous functions are dense in Lp (µ).

Proof. In the first situation let L be the closure of Cc (Rd ) in Lp (µ). Let
U be an open set of finite µ-measure. There is a sequence (fn ) in Cc (Rd ) such
46 Measure Theory (October 20, 2008)

that 0 ≤ fn ↑ 1U point-wise. Then by the remark above about dominated


convergence in Lp we see that 1U ∈ L. The Borel σ-algebra is generated by
these sets U and the result follows from part b) of the preceding proposition.
In the second situation we simply replace Cc (Rd ) by the space of all bounded
continuous functions.

Corollary 7.2 (Lusin’s Theorem) Let µ be as in the second part of the pre-
vious theorem. Let f be measurable and real-valued. Let A be a Borel set of
finite measure. Then for each ε > 0 there is a set K ⊂ A with µ(K) > µ(A) − ε
such that f is continuous on K.
S
Proof. We may assume that f vanishes outside A. Since A = n A ∩ {|f | <
n} we may assume that f is bounded and in particular f ∈ L1 (µ). There is a
sequence (fn ) of continuous functions which converges in L1 (µ) to f . According
to theorems 6.1 and 6.2 we can assume that fn → f µ-a.e.. By Egorov’s theorem
6.3 for each ε > 0 there is a measurable set K ⊂ A with µ(K) > µ(A) − ε such
that the fn converge uniformly to f on K. Since each fn is continuous the same
is true for the restriction of f to K.
Chapter 8

The Radon-Nikodym
Theorem

8.1 Absolute continuity of measures


In this section (S, F) is a measurable space and µ and ν are measures on F. As
a consequence of the theorem of monotone convergence we saw in remark 4.1
that for a measurable function f ≥ 0 the set function f µ : F → [0, ∞] defined
by Z
f µ : A 7→ f dµ (8.1)
A
is a measure. The theorem of Radon-Nikodym characterizes the measures which
have such a representation.
Definition 8.1 Let µ and ν be two measures on the σ-algebra F. We call a
function f a Radon-Nikodym-density (R-N-Dichte) or R-N-derivative
(R-N-Ableitung) of ν w.r.t. µ if
Z
ν(A) = f dµ. (8.2)
A

for all sets A ∈ F. We denote any such function f by the symbol dµ .


The µ-equivalence class of dµ is uniquely determined by ν:

Proposition 8.1 Let ν, ν 0 be two measures with the the Radon-Nikodym den-
0

sities f = dµ and f 0 = dν 0 0
dµ . Then ν ≤ ν if and only if f ≤ f µ-a.e.. Two
measurable functions are Radon-Nikodym densities of the same measure ν if and
only if they are µ-equivalent.
Proof. Suppose that ν ≤ ν 0 , yet µ({f > f 0 }) > 0. Then proposition 4.1
part b) applied to the function 1{f >f 0 } (f − f 0 ) would give the contradiction
Z Z
ν({f 0 > f }) = f dµ > f 0 dµ = ν 0 ({f 0 > f }).
{f >f 0 } {f >f 0 }

Therefore f ≤ f 0 µ-a.e.. Conversely clearly f ≤ f 0 µ-a.e. implies ν ≤ ν 0 . The


second statement easily follows from the first.

47
48 Measure Theory (October 20, 2008)

Definition 8.2 Let µ and ν be two measures on the σ-algebra F. Then ν is


called absolutely continuous (absolut stetig) with respect to µ, if every
µ-null-set is a ν-null-set. In this case we write ν ¿ µ.

Proposition 8.2 Let µ and ν be measures on the σ-algebra F, and suppose


that ν is finite. Then the following are equivalent.
a) ν ¿ µ, i.e. ν is absolutely continuous w.r.t. µ,
b) For every ε > 0 there is a δ > 0, such that for each set A ∈ F with µ(A) < δ
one has ν(A) < ε.

Proof. b) → a) is trivial.
a) → b). If b) does not hold then there are some ε > 0, a sequence (δn ) of positive
reals converging to 0 and a sequence (An ) of measurable sets withP µ(An ) < δn
and ν(An )T≥ εSfor all n. Without loss of generality we may
P assume n δn < ∞.
Let A = n m≥n S Am . Then µ(A) = 0 because of µ(An ) < ∞. On the
other
S hand ∞ > ν( m≥n Am ) ≥ ν(An ) ≥ ε for all n and thus ν(A) ≥ ε since
m≥n Am ↓ A as n → ∞. Thus a) fails as well.

Note that under condition a) every µ-a.e. converging sequence of measurable


functions converges also ν-a.e.. If both measures are finite theorem 6.2 implies
that then every sequence (fn ) which converges in measure to 0 with respect to
µ does the same with respect to ν. If we apply this to indicator functions this
is just condition b). This gives an alternative proof of the proposition if both
measures are finite.

Theorem 8.1 (Radon-Nikodym) Let ν and µ be σ-finite and suppose ν ¿ µ.



a) There is a Radon-Nikodym density dµ .
b) If either g ≥ 0 is measurable or g ∈ L1 (ν) then
Z Z

g dν = g dµ. (8.3)

Proof. 1. Note that b) holds for all pairs µ, ν for which a) holds. First
we get the equality in b) if g is a non-negative elementary function. Then by
monotone convergence one extends the equality to all non-negative functions g
and finally to all g ∈ L1 (ν).
2. For the existence proof assume first that ν and µ are finite measures with
0 ≤ ν ≤ µ. We use, following an idea of J. v. Neumann, the following basic tool
from functional analysis (theorem of Fischer-Riesz): Let H be a real Hilbert
space with the scalar product < ·, · >. Let ` : H → R be a linear functional for
which there is some positive constant c such that `(h) ≤ c · khk for all h ∈ H.
Then there is an element f of H such that `(h) =< h, f > for all h ∈ H.
To apply this we work in the space L2 (µ) and in the associated
R Hilbert space
2
L (µ). The norm is induced by the scalar product < f, h >= f h dµ. Then
by corollary 7.1 or by Cauchy-Schwarz we get
Z Z Z
1¡ ¢1 1¡ ¢1
| h dν| ≤ ν(S) 2 h2 dν 2 ≤ ν(S) 2 h2 dµ 2 = const · khk2,µ

for all h ∈ L2 (µ). Moreover two µ-equivalent


R functions are also ν-equivalent
and hence the value of the integral h dν depends only on the µ-equivalence
Measure Theory (October 20, 2008) 49

R
class of h. Thus by `(h) = h dν a linear functional on L2 (µ) is defined which
fulfils the assumption
R of the theorem
R of Fischer-Riesz. So there is a function
f ∈ L2 (µ) with h dν = `(h) = hf dµ for all h ∈ L2 (µ). Let A ∈ F . Since µ
is finite we have 1A ∈ L2 (µ) and so
Z
ν(A) = `(1A ) = f dµ,
A

i.e. a) is proved under the above additional assumption.


3. Let now ν and µ be finite measures with ν ¿ µ. Set ρ = µ + ν. Then ρ is

a finite measure and both µ ≤ ρ and ν ≤ ρ. Let f = dν dρ and g = dρ . Then
µ({g = 0}) = 0 and hence ν({g = 0}) = 0 and ρ({g = 0}) = 0, i.e. g > 0 ρ-a.e..
Part b) applied to the pair µ, ρ gives
Z Z Z
f f
ν(A) = f dρ = g dρ = dµ,
A A g A g


i.e. the RN-density dµ is the quotient fg of the RN-densities with respect to ρ.
4. Finally let µ and ν be σ-finite with ν ¿ µ. According to lemma 6.1 there are
two strictly positive measurable functions g ∈ L1 (µ) and h ∈ L1 (ν). Then the
two measures gµ and hν (cf. (8.1)) are finite and it is easy to check that they
have the same null-sets as the measures µ and ν, respectively. Thus hν ¿ gµ
and there is a RN-density f = dhνdgµ . Then using (8.3) again we get for all A ∈ F
Z Z Z Z
1 1 1 g
ν(A) = 1A h dν = 1A dhν = 1A f dgµ = f dµ,
h h h A h
g
i.e. hf is the desired Radon-Nikodym derivative of ν w.r.t. µ.

In order to illustrate the role of σ-finiteness we consider the Lebesgue mea-


sure λ on the unit interval and the measure µ with the same null-sets but

µ(A) = ∞ if λ(A) > 0. Then λ ¿ µ but dµ does not exist.

8.2 Lebesgue and Hahn-Jordan decomposition


Definition 8.3 Two measures µ and ν on F are called singular (singulär) or
orthogonal to each other if ’they live on disjoint sets’, i.e. there is a measurable
set D such that µ(D) = 0 and ν(Dc ) = 0. We write in this case µ ⊥ ν.

Theorem 8.2 (Lebesgue decomposition) Let µ and ν be σ-finite measures. Then


ν has a unique decomposition

ν = νac + νs (8.4)

where νac ¿ µ and νs ⊥ µ.

Proof. Again consider ρ = µ + ν. Then ρ is also σ-finite: If (En ) and (En0 )


are two increasing sequence with union S and µ(En ) < ∞ and ν(En0 ) < ∞ then
(En ∩ En0 ) is an increasing sequence with union S and ρ(En ∩ En0 ) = µ(En ∩
En0 ) + ν(En ∩ En0 ) ≤ µ(En ) + ν(En0 ) < ∞. Thus there are RN-densities f = dµdρ
and g = dνdρ . Let D = {f > 0}. Then ν = νac + νs where νac (A) = ν(A ∩ D) and
50 Measure Theory (October 20, 2008)

R
νs (A) = ν(A ∩ Dc ). Then νs ⊥ µ since µ(Dc ) = {f =0} f dρ = 0 and νs (D) = 0.
Finally
Z Z Z
g g
νac (A) = ν(A ∩ D) = g dρ = 1D f dρ = 1D dµ.
A∩D A f A f

Therefore νac has a Radon-Nikodym density with respect to µ, i.e. νac ¿ µ.

Sometimes it is useful to admit ’measures’ which can have also negative


values.

Definition 8.4 Let (S, F) be a measurable space. A function ν : F → R is


called a signed measure (signiertes Maß), if there are two finite measures
ρ1 , ρ2 with ν = ρ1 − ρ2 .

Remark. Actually, every real valued set function on F which is σ-additive


in the sense of equation (1.1) is a signed measure. This fact is interesting but
rarely used. So we omit the proof. Note also that contrary to common use of
language a signed measure in general is not a measure. The representation of
a signed measures as a difference of two (non-negative) measures is not unique.
The following result provides a ’minimal’ decomposition of this type.

Theorem 8.3 (Hahn-Jordan decomposition) Every signed measure ν on the σ-


algebra F has a unique decomposition ν = ν + − ν − where ν + and ν − are two
orthogonal measures. This decomposition is ’minimal’: If ν = κ1 − κ2 for two
measures then ν + ≤ κ1 and ν − ≤ κ2 .

Proof. By assumption ν = ρ1 − ρ2 for some finite measures ρi . Consider


the measure ρ = ρ1 + ρ2 . Let fi = dρdρ for i = 1, 2. The function f = f1 − f2
i

and the set D = {f ≥ 0} satisfy for all A ∈ F


Z Z Z
ν(A) = f1 dρ − f2 dρ = f dρ
ZA A
Z A

= f dρ + f dρ = ν(A ∩ D) + ν(A ∩ Dc )
A∩{f ≥0} A∩{f <0}

where clearly in the last two sums the first term is non-negative and the second
term is non-positive for all A. Thus ν + = ν(D ∩ ·) and ν − = −ν(Dc ∩ ·) provide
the indicated decomposition.
Every representation ν = ν + − ν − as difference of two singular measures is
minimal in the above sense: Suppose ν = κ1 − κ2 and let D be a measurable
set such that ν + (Dc ) = 0 = ν − (D). Then

ν + (A) = ν(A ∩ D) = κ1 (A ∩ D) − κ2 (A ∩ D) ≤ κ1 (A ∩ D) ≤ κ1 (A)

for all A ∈ F, i.e. ν + ≤ κ1 and hence also κ2 − ν − = (κ1 − ν) − (ν + − ν) =


κ1 − ν + ≥ 0. If moreover κ1 ⊥ κ2 we can argue by symmetry that κ1 ≤ ν + and
κ2 ≤ ν − . This proves the uniqueness.

Definition 8.5 The space of signed measures on the measurable space (S, F) is
denoted by M(S, F). For ν ∈ M(S, F) the measure |ν| = ν + + ν − is called the
Measure Theory (October 20, 2008) 51

modulus (Betrag) or total variation measure of ν. The space M(S, F)


is equipped with the norm of total variation (Totalvariation) k · k defined by

kνk = |ν(S)| = ν + (S) + ν − (S). (8.5)

The Hahn-Jordan decomposition allows to extend the Radon-Nikodym the-


orem to signed measures ν.

Corollary 8.1 (’signed’ Radon-Nikodym) Let µ be a measure. Let the signed


measure ν have the property that every µ-null-set is a ν-null-set. Then ν + and

ν − have the same property. There is a function dµ ∈ L1 (µ), uniquely determined
R dν
up to µ-equivalence, such that ν(A) = A dµ dµ for all A ∈ F. Moreover

dν + dν + dν dν − dν d|ν|
( ) = , ( )− = , | |= . (8.6)
dµ dµ dµ dµ dµ dµ

Proof. For the uniqueness of dµ note that the proof of proposition 8.1 carries
over to our situation. In order to show ν + ¿ µ and ν − ¿ µ let µ(N ) = 0.
Then µ(N ∩ D) = µ(N ∩ Dc ) = 0 and hence ν + (N ) = ν(N ∩ D) = 0 and
+
dν dν dν −
ν − (N ) = −ν(N ∩ Dc ) = 0. The function f = dµ defined by dµ = dνdµ − dµ
has the desired properties since according to the proof of the Hahn-Jordan
decomposition f ≥ 0 on D and f < 0 on Dc .

Remark 8.1 The last identity in (8.6) implies


Z
dν dν
kνk = |ν|(S) = | | dµ = k k1 .
S dµ dµ

In other words the map f 7→ f µ induces an isometric embedding of the space


(L1 (µ), k · k1 ) into the space (M(S, F), k · k).

You might also like