Humboldt Universität Zu Berlin - Mathematische Fakultät - Measure Theory - Script Holtz 2019
Sebastian Holtz
Humboldt-Universität zu Berlin
April 2, 2019
Preface
This script contains the topics that were discussed in the lecture 'Maßtheorie
für Statistiker' at Humboldt-Universität zu Berlin. It is based on the book
of the same name by Prof. Uwe Küchler and on the scripts of the preceding
lecturers of this course, Mathias Trabs and Martin Wahl. The current script
claims neither to be complete nor to be free of errors. If you find an error
or an unclear point, I would appreciate it if you contacted me.
Contents
I Prelude
2 Preliminaries
2.1 Set theory
2.2 The real numbers
2.3 Maps
2.4 Sequences & Countability
2.5 Product sets
2.6 Series
II Measure theory
3 Introduction
3.1 The ’problem of measure‘
3.2 Outline
4 Measurable spaces
4.1 σ-Algebras
4.2 Borel σ-Algebra
4.3 σ-algebras under maps
5 Measures
5.1 Definition & Properties
5.2 Construction Of Measures
5.3 The Lebesgue Measure
5.4 Probability measures
6 Measurable maps
6.1 Definition & first Properties
6.2 Induced measures
6.3 Simple functions
6.4 Approximation of Borel functions
7.3 Integrable functions
7.4 Convergence theorems
8 Product measures
8.1 Product σ-algebras
8.2 Product measures
8.3 Fubini’s theorem
Part I
Prelude
1 About measure and integration theory
What is measure theory? The aim of measure theory is the generalisation
of well-known geometric concepts such as lengths, areas or volumes.
Two important questions immediately arise: which objects do we measure,
and how do we measure them? For instance, if we consider the Cartesian
plane R^2 we could be interested in the area of squares, rectangles, etc.
In order to calculate those areas, formulas have been derived that
capture our geometric understanding of two-dimensional objects, e.g. the
area of a square equals the square of the side length, the area of a rectangle
equals the product of the side lengths, etc.
Abstractly speaking, we assign to an object a specific number that rep-
resents the mass of the object, i.e. we measure the object. The procedure
of measuring can be mathematically interpreted as a function whose inputs
are certain (geometric) objects and whose outputs are real numbers. In the
following we will call such a function a measure. One measure we are all
familiar with is the one described above, which assigns to a rectangle the
product of the lengths of its sides. In the following we will call this mea-
sure (more precisely: the measure induced by this formula) the Lebesgue
measure.
Besides the Lebesgue measure there are plenty of other measures. For
instance, consider a collection of rectangles with varying side lengths. In-
stead of the area of each rectangle according to the Lebesgue measure one
could be just interested in the quantity of rectangles, i.e. one would like
to count the number of rectangles. Then each rectangle counts as one no
matter how large it is. In other words, from a counting perspective, each
rectangle has the mass one. Since the process of counting is nothing but
assigning a number to a geometric object we have just created a second
candidate for a measure, the so-called counting measure.
Now we would like to link the concept of measures to Stochastics. For
this consider a third example involving rectangles. Suppose you draw a
rectangle on a blackboard and that you are trying to hit this rectangle with
a piece of chalk from a distance of five meters. Let us assume that you are
a good thrower (or that the rectangle is pretty large) and that the chances
of hitting the rectangle are given by 9/10 and of missing it by 1/10. In
other words, from the perspective of how likely a successful throw is, the
rectangle has the mass 9/10 and the area around has the mass 1/10 - even
though it might be of way larger size according to the Lebesgue measure.
Again, we have found an example for a measure. Since this measure tells us
the probabilities of hitting or missing the rectangle it is called a probability
measure.
Admittedly, these examples are of a very simple nature. However, it
will turn out that already the Lebesgue measure is not easy to define as it
is not possible to measure any object we like (at least not by maintaining
a few desirable properties). Thus one goal of measure theory is to classify
what the objects we would like to measure should ’offer‘ and what properties
characterise a ’meaningful‘ measure. Besides the derivation of the Lebesgue
measure this leads to a general theory which is the theoretical foundation
to investigate random phenomena mathematically. The general concept of a
measure provides a unified approach and toolbox that facilitates the
handling of stochastic objects, whether they are simple or complex,
one-dimensional or multi-dimensional, discrete or continuous.
2 Preliminaries
In this section we would like to introduce a collection of standard concepts
and statements of the mathematical branches linear algebra and analysis.
Another way to write down sets compactly is via rules that apply to the
contained elements:
• x ∈ A: x is an element of A.
• x ∉ A: x is not an element of A.
Example 2.2. Let Ω = {1, ..., 6} and let A = {1, 2, 3}. Then 1 ∈ A and
4 ∉ A. Moreover, we have A ⊆ Ω, or more precisely A ⊊ Ω, i.e. neither
Ω ⊆ A nor Ω = A holds.
• A\B := {x : x ∈ A and x ∉ B} is called the difference (Differenz) of
A and B.
A = {a_n : n = 1, ..., N},
Definition 2.5. Let Ω be a set. The power set (Potenzmenge) P(Ω) of Ω
is the set that contains all subsets of Ω, i.e.
P(Ω) = {A : A ⊆ Ω}.
Remark 2.6. Another notation for P(Ω) is 2^Ω. The reason is that for
|Ω| < ∞ we have |P(Ω)| = 2^{|Ω|}.
P(Ω) = {{1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3}, ∅}.
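The counting identity |P(Ω)| = 2^{|Ω|} can be checked for small finite sets. The following is a minimal Python sketch; the helper name power_set is our own choice for illustration and does not appear in the script.

```python
from itertools import chain, combinations

def power_set(omega):
    """Return all subsets of the finite set omega as a set of frozensets."""
    elements = list(omega)
    # Collect the subsets of size 0, 1, ..., |omega|.
    subsets = chain.from_iterable(
        combinations(elements, r) for r in range(len(elements) + 1)
    )
    return {frozenset(s) for s in subsets}

omega = {1, 2, 3}
P = power_set(omega)
print(sorted(sorted(s) for s in P))
print(len(P) == 2 ** len(omega))
```

Subsets are stored as frozensets because ordinary Python sets are not hashable and therefore cannot be elements of another set.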
Lemma 2.8. Let Ω be a non-empty set and let A, B, C ∈ P(Ω). Then the
following rules apply
A ∪ B = B ∪ A, A ∩ B = B ∩ A,
A ∪ (B ∪ C) = (A ∪ B) ∪ C, A ∩ (B ∩ C) = (A ∩ B) ∩ C,
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
A^c_Ω := Ω\A = {x ∈ Ω : x ∉ A}
Lemma 2.11. Let Ω be a non-empty set. For any pair of sets A, B ∈ P(Ω)
it holds that
(A^c)^c = A, A ∩ A^c = ∅ and A ∪ A^c = Ω,
as well as
A\B = A ∩ B^c.
Proof. Exercise.
Definition 2.12. Let I be some non-empty index set. If we are given for
each i ∈ I a subset A_i of Ω then we call (A_i)_{i∈I} = {A_i : i ∈ I} a family of
subsets of Ω (Familie/Menge von Mengen).
Lemma 2.15 (De Morgan's law). Let Ω be a non-empty set. For a family
(A_i)_{i∈I} of subsets of Ω it holds that
\[
\Big( \bigcap_{i \in I} A_i \Big)^c = \bigcup_{i \in I} A_i^c,
\qquad
\Big( \bigcup_{i \in I} A_i \Big)^c = \bigcap_{i \in I} A_i^c.
\]
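Both identities of De Morgan's law can be verified on a concrete finite family. The sketch below uses an arbitrarily chosen Ω and family (these particular sets are our assumption, picked only for illustration).

```python
omega = set(range(10))
family = [{0, 1, 2}, {2, 3, 4}, {2, 4, 6, 8}]   # hypothetical family (A_i)

def complement(a):
    """Complement of a with respect to omega."""
    return omega - a

intersection = set.intersection(*family)
union = set.union(*family)

# (intersection of A_i)^c  versus  union of the complements
lhs1 = complement(intersection)
rhs1 = set.union(*(complement(a) for a in family))

# (union of A_i)^c  versus  intersection of the complements
lhs2 = complement(union)
rhs2 = set.intersection(*(complement(a) for a in family))

print(lhs1 == rhs1, lhs2 == rhs2)
```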
Z := −N ∪ {0} ∪ N.
Q := {a/b : a, b ∈ Z, b ≠ 0}.
However, the rational numbers are not 'complete' or, more roughly said, 'have
gaps'. A prominent example is √2, which can be shown to be not representable
as a fraction of two integers. Therefore, √2 is called irrational.
Nevertheless, any irrational number can be approximated by rational numbers.
For instance, √2 = 1.41421356... can be approximated by the rational numbers
q_1 = 1.4 = 14/10,
q_2 = 1.41 = 141/100,
q_3 = 1.414 = 1414/1000,
q_4 = 1.4142 = 14142/10000,
...
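The truncated decimals q_n can be produced with exact rational arithmetic. The following sketch is only an illustration; the function name truncated_sqrt2 is hypothetical, and the integer square root is used so that no floating-point rounding enters.

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt2(n):
    """Rational number obtained by cutting sqrt(2) off after n decimal places."""
    scale = 10 ** n
    # isqrt(2 * scale^2) is the integer part of sqrt(2) * scale.
    return Fraction(isqrt(2 * scale * scale), scale)

for n in range(1, 5):
    q = truncated_sqrt2(n)
    # q^2 stays below 2, and the gap 2 - q^2 shrinks as n grows.
    print(n, q, float(2 - q * q))
```

Each q_n is rational, satisfies q_n^2 < 2, and lies within 10^{-n} of √2, which is the sense in which the rationals approximate √2 without ever reaching it.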
Definition 2.16. The set R that we obtain if we add to Q all those finite
numbers that can be approximated by elements of Q is called the (set of)
real numbers.
It is clear that one has the relation N ⊊ Z ⊊ Q ⊊ R. Moreover, for two
real numbers a and b exactly one of the relations a < b, a > b and a = b
applies. For a, b ∈ R satisfying a ≤ b (i.e. not a > b) we can define the
following sets:
[a, b] := {x ∈ R : a ≤ x ≤ b},
(a, b) := {x ∈ R : a < x < b},
[a, b) := {x ∈ R : a ≤ x < b},
(a, b] := {x ∈ R : a < x ≤ b},
and we call those sets closed, open, right half-open and left half-open
interval, respectively. Note that a ∉ (a, b]. The above intervals are called
finite intervals, i.e. they have finite length b − a. The infinite intervals
are given by
Definition 2.18. Let A ⊆ R. A number M is called the supremum (or
infimum) of A if M is the least upper (or greatest lower) bound of A. Here
M is called the least upper bound if
1. M is an upper bound of A,
2. if M' is another upper bound, then it holds that M ≤ M'.
The greatest lower bound of A is defined analogously.
M is called the maximum (or minimum) of A if M is the supremum (or
infimum) of A and M ∈ A.
Example 2.19. Let A = {1/n : n ∈ N} = {1, 1/2, 1/3, 1/4, ...}. The numbers 2, 100
and 3000 are examples of upper bounds of A. The numbers −4 and −50 are
examples of lower bounds of A. However, the supremum of A is 1 and the
infimum is 0. Moreover, 1 is the maximum of A but A has no minimum as
0 ∉ A. In particular, the supremum and infimum are not necessarily contained
in A, and a minimum or maximum need not exist.
Theorem 2.20. Every non-empty set A ⊆ R that is bounded from above
(or below) has a supremum (or infimum). We write sup A = sup_{a∈A} a (or
inf A).
2.3 Maps
Definition 2.21. Let E, F be two non-empty sets. A map f (Abbildung)
is a rule that associates each x ∈ E with exactly one y ∈ F. We write
f(x) = y and
f : E → F, x ↦ f(x).
The set E is called the domain (Definitionsbereich) of f. For a subset
A ⊆ E the set
f(A) := {y ∈ F : y = f(x) for some x ∈ A}
is called the image of A (under f) (Bild). The set f(E) is called the
image of f, sometimes written as Im(f).
For B ⊆ F the set
f^{-1}(B) := {x ∈ E : f(x) ∈ B}
is called the preimage of B (Urbild).
Example 2.22. A map f is given by
f : R → R, x ↦ x^2.
For this map, we write f(x) = x^2. The image of the set [0, 3] ⊊ R is given
by f([0, 3]) = [0, 9]. The image of f equals Im(f) = R^+_0 := {x ∈ R : x ≥ 0}.
For the set [4, ∞) the preimage is given by f^{-1}([4, ∞)) = (−∞, −2] ∪
[2, ∞) = R\(−2, 2).
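Image and preimage can be explored on a finite grid that stands in for R. The sketch below is an illustration under that assumption (a grid of integers cannot reproduce intervals exactly), and the helper names image and preimage are our own.

```python
def image(f, subset):
    """Image f(A) = {f(x) : x in A} of a finite set A."""
    return {f(x) for x in subset}

def preimage(f, domain, in_target):
    """Preimage {x in domain : f(x) in B}; in_target(y) decides whether y is in B."""
    return {x for x in domain if in_target(f(x))}

def square(x):
    return x * x

domain = range(-5, 6)                              # finite stand-in for R
img = image(square, {0, 1, 2, 3})                  # grid analogue of f([0, 3])
pre = preimage(square, domain, lambda y: y >= 4)   # grid analogue of f^{-1}([4, inf))
print(sorted(img), sorted(pre))
```

Note that the preimage is defined for every map, whether or not it is invertible: here it collects the points on both sides of the origin.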
Example 2.23. An important function in measure theory is the indicator
function (sometimes also called characteristic function). Let Ω be a non-
empty set and let A ⊆ Ω. Then the indicator function of the subset A is
given by
\[
\mathbf{1}_A : \Omega \to \{0, 1\}, \qquad
x \mapsto \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases}
\]
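Indicator functions translate set operations into arithmetic. The following sketch checks two standard identities, 1_{A∩B} = 1_A · 1_B and 1_{A^c} = 1 − 1_A, on a small example; these identities are not stated in this excerpt and the sets are chosen only for illustration.

```python
def indicator(A):
    """Return the indicator function 1_A of the set A."""
    return lambda x: 1 if x in A else 0

omega = set(range(1, 7))
A = {1, 2, 3}
B = {2, 4, 6}

one_A, one_B = indicator(A), indicator(B)
one_AB = indicator(A & B)
one_Ac = indicator(omega - A)

# Intersection corresponds to the product, complement to 1 minus the indicator.
product_rule = all(one_AB(x) == one_A(x) * one_B(x) for x in omega)
complement_rule = all(one_Ac(x) == 1 - one_A(x) for x in omega)
print(product_rule, complement_rule)
```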
Remark 2.24. Technically the above definition of a map coincides with
the one of a function. Although the notion ’function‘ is sometimes preferred
over the use of ’map‘ (and vice versa) we treat the two terminologies as
interchangeable in this lecture.
or shorter (a_n).
Definition 2.26. Let (a_n) be a sequence of real numbers. The sequence is
called
1. bounded (beschränkt), if there is a number M ≥ 0, such that |a_n| ≤ M
for all n ∈ N.
lim_{n→∞} a_n = a.
The result of the above example can be generalised to monotone se-
quences.
It is clear that finite sets are by definition countable (see Exercise). The
concept of countability is of more interest for infinite sets (we then speak
of countably infinite sets), as the following examples show.
Example 2.30. The sets N, Z and Q are all countable. The real numbers
R are not countable (see Exercise).
Examples 2.32.
(i) For d ≥ 1 the product set
R^d = R × ... × R (d times)
(iii) If A = {0, 1} and B = {2, 3} then A × B = {(0, 2), (0, 3), (1, 2), (1, 3)}.
(iv) For real numbers a_1 ≤ b_1 and a_2 ≤ b_2 let A = [a_1, b_1] and B = [a_2, b_2]
be two closed intervals. Then the set A × B = {(x, y) : a_1 ≤ x ≤ b_1, a_2 ≤
y ≤ b_2} can be considered as a rectangle in the Cartesian plane R^2.
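The finite product set from the examples above, and membership in a rectangle of type (iv), can be sketched as follows. The function in_rectangle and the sample rectangle are hypothetical names and values used only for illustration.

```python
from itertools import product

A = {0, 1}
B = {2, 3}
AxB = set(product(A, B))          # all ordered pairs (a, b) with a in A, b in B
print(sorted(AxB))

def in_rectangle(point, a1, b1, a2, b2):
    """Membership test for the closed rectangle [a1, b1] x [a2, b2]."""
    x, y = point
    return a1 <= x <= b1 and a2 <= y <= b2

inside = in_rectangle((0.5, 2.0), 0, 1, 1, 3)
outside = in_rectangle((1.5, 2.0), 0, 1, 1, 3)
print(inside, outside)
```

For finite sets the number of pairs is |A| · |B|, which is the discrete counterpart of the rectangle's area being the product of its side lengths.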
For the space R, intervals will be essential for the derivation of the (one-
dimensional) Lebesgue measure. Analogously, for the space R^2, rectangles
of type (iv) will be essential for the derivation of the (two-dimensional)
Lebesgue measure. The following definition generalises the concept of
intervals and rectangles to arbitrary (finite) dimension.
2.6 Series
For a sequence (a_n)_{n∈N} of real numbers or functions and a natural number
N we introduce the notation of a finite series (endliche Reihe) by
\[
\sum_{n=1}^{N} a_n = a_1 + a_2 + \dots + a_N
\]
Definition 2.35. A double sequence (Doppelfolge) of elements in a non-
empty set A is a map N × N → A, i.e. each pair (n, m) ∈ N × N is associated
with one a_{n,m} ∈ A.
Examples 2.36.
a_{n,m} = n/m gives all positive rational numbers
a_{n,m} = n^m gives all powers of natural numbers
a_{n,m} = f_n(b_m) for sequences of functions f_n : R → R and real numbers b_m.
Part II
Measure theory
3 Introduction
3.1 The ’problem of measure‘
In the following we consider the space R^d and the task of measuring its
subsets on the basis of our geometric understanding of hyperrectangles. The
corresponding measure µ should have the following desirable properties.
This is the higher-dimensional analogue of the formula for the length of
an interval (or the area of a rectangle).
Proof. (Sketch) The statement follows if it can be shown that already on
[0, 1]^d ⊊ R^d such a function does not exist. For this consider for x ∈
[0, 1]^d the set
A_x := {y ∈ [0, 1]^d : x − y ∈ Q^d}.
Now let I be a (non-unique) index set that yields the following partition
(cf. footnote 1)
\[
[0,1]^d = \bigcup_{x \in I} A_x,
\]
i.e. the family (A_x)_{x∈I} consists of pairwise disjoint sets. If we set
\[
B := \bigcup_{r \in [-1,1]^d \cap \mathbb{Q}^d} \big( I + \{r\} \big),
\]
It is clear that one has µ(I) > 0, since otherwise the above sum would equal
zero, which is impossible since it has to be at least one. On the other hand,
the volume µ(B) then equals an infinite sum over µ(I), which is infinite and
hence cannot be bounded by 3^d. Thus we get a contradiction and the assumption
that such a function exists cannot hold.
There are also other statements showing that the measure problem has no
solution. Prominent examples are Vitali's theorem, the Banach-Tarski
paradox and the Hausdorff paradox.
3.2 Outline
Theorem 3.1 implies that the set P(R^d) is too large to establish a 'nice'
measure theory on. We therefore consider in Section 4 collections of sets
that are as large as possible but still allow for a 'meaningful' definition of
measures in Section 5.
Footnote 1: This is possible since x − y ∈ Q^d defines a so-called equivalence
relation on [0, 1]^d.
4 Measurable spaces
As we have seen in Part I it is impossible to introduce the Lebesgue measure
on the entire power set P(R^d) = {A : A ⊆ R^d}. The aim of this chapter
is to characterise systems of subsets A ⊆ P(R^d) that are 'very large' and
still allow for the derivation of measures with desirable properties. These
systems will be introduced in general for an arbitrary underlying set Ω, so
that Ω = R^d is only a special case.
4.1 σ-Algebras
We begin with a first type of system of sets that will be important for the
construction of the Lebesgue measure.
Definition 4.1. Let Ω be a non-empty set. A collection A ⊆ P(Ω) of
subsets of Ω is called an algebra on Ω if the following properties are satisfied.
1. Ω ∈ A, i.e. the system A contains the underlying set Ω.
(ii) If Ω is infinite then A = {A ⊆ Ω : A finite or A^c finite} is an algebra
but not a σ-algebra, cf. Exercises.
1. ∅ ∈ A.
Lemma 4.7. Let Ω be a non-empty set and (A_i)_{i∈I} be a family of σ-algebras,
where I denotes a non-empty index set. Then
\[
\bigcap_{i \in I} \mathcal{A}_i = \{ A \subseteq \Omega : A \in \mathcal{A}_i \ \forall i \in I \}
\]
• We show: If A ∈ ⋂_{i∈I} A_i then also A^c ∈ ⋂_{i∈I} A_i. Let A ∈ ⋂_{i∈I} A_i,
i.e. A ∈ A_i for all i ∈ I. Since for any i ∈ I each A_i is a σ-algebra,
A_i must also contain A^c, i.e. A^c ∈ A_i for all i ∈ I. But this exactly
means A^c ∈ ⋂_{i∈I} A_i.
• We show: If (A_n) is a sequence in ⋂_{i∈I} A_i, then also ⋃_{n=1}^∞ A_n ∈
⋂_{i∈I} A_i. Let (A_n) be a sequence in ⋂_{i∈I} A_i, i.e. (A_n) is a sequence in
A_i for all i ∈ I. Since for any i ∈ I each A_i is a σ-algebra, A_i must
also contain ⋃_{n=1}^∞ A_n, i.e. ⋃_{n=1}^∞ A_n ∈ A_i for all i ∈ I. But this exactly
means ⋃_{n=1}^∞ A_n ∈ ⋂_{i∈I} A_i.
Definition 4.9. The system σ(S) is called the σ-algebra that is gener-
ated by S. The system S is called the generator of the σ-algebra σ(S).
Example 4.10. Let Ω = {1, 2, 3, 4} and let S = {{1}, {2}}. Then there are
the following possible σ-algebras containing S:
A_1 = P(Ω), A_2 = {{1}, {2}, {1, 2}, {3, 4}, {2, 3, 4}, {1, 3, 4}, {1, 2, 3, 4}, ∅}.
Thus we obtain by A_2 ⊆ A_1
σ(S) = A_1 ∩ A_2 = A_2.
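On a finite set the generated σ-algebra can also be computed directly, by repeatedly adding complements and unions until nothing new appears. The sketch below assumes a finite Ω, where closure under finite unions already gives closure under countable unions; the function name generated_sigma_algebra is hypothetical.

```python
def generated_sigma_algebra(omega, generator):
    """Smallest sigma-algebra on the finite set omega that contains generator."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(s) for s in generator}
    while True:
        # Add all complements and all pairwise unions of the current sets.
        new = {omega - a for a in sets} | {a | b for a in sets for b in sets}
        if new <= sets:
            return sets          # nothing new: the system is closed
        sets |= new

sigma_S = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(s) for s in sigma_S))
```

Every set that gets added is forced by the defining properties of a σ-algebra, so the result is contained in every σ-algebra containing the generator, in line with the intersection characterisation of σ(S).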
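Proof. Since σ(A) is the smallest σ-algebra that contains A the first state-
ment holds.
Let A be a σ-algebra that contains S. Then the σ-algebra A' = σ(A ∪ S')
contains A as well as S'. In other words: Any σ-algebra that contains S is
contained in some σ-algebra that also contains S'. Thus
\[
\sigma(\mathcal{S})
= \bigcap_{\mathcal{A} \text{ is } \sigma\text{-algebra with } \mathcal{S} \subseteq \mathcal{A}} \mathcal{A}
\ \subseteq \bigcap_{\mathcal{A}' \text{ is } \sigma\text{-algebra with } \mathcal{S}' \subseteq \mathcal{A}'} \mathcal{A}'
= \sigma(\mathcal{S}').
\]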
where a ≤ b are real numbers. The σ-algebra B(R) := σ(I_1) is called the
Borel σ-algebra. A subset A ∈ B(R) is called a Borel set.
Examples 4.15.
(i) Any one-element set {x} for x ∈ R is a Borel set. This can be seen
since {x} = ⋂_{n=1}^∞ (x − 1/n, x].
(ii) By (i) also any finite or countably infinite set B ⊆ R is a Borel set. In
particular, the rational numbers Q form a Borel set in R.
(a) S_1 = {(a, b] : a, b ∈ R, a ≤ b}
(b) S_2 = {(a, b) : a, b ∈ R, a ≤ b}
(c) S_3 = {[a, b] : a, b ∈ R, a ≤ b}
(d) S_4 = {(a, ∞) : a ∈ R}
(e) S_5 = {[a, ∞) : a ∈ R}
Proof. For (a) and (b), cf. exercises. The remaining systems (c), (d) & (e)
are left to the reader.
x − ε < x < x + ε.
x − ε ≤ a < x < b ≤ x + ε,
In particular, U is a countable union of sets of type (a, b), hence U ∈ σ(S_2).
This means σ(U) ⊆ σ(S_2) = B(R).
On the other hand, we know that (a, b) with a ≤ b is an open set², i.e.
B(R) = σ(S_2) ⊆ σ(U). This together with σ(U) ⊆ B(R) gives σ(U) =
B(R).
Finally, for any closed set V ∈ V we have that V^c is open, i.e. V^c ∈
σ(U). But σ(U) is a σ-algebra, therefore also V = (V^c)^c ∈ σ(U). This
gives σ(V) ⊆ σ(U). In the same way we get σ(U) ⊆ σ(V) and the claim
follows.
• Let A ∈ F_1, i.e. A = ⋃_{i=1}^m Q_i for some m ∈ N and some pairwise
disjoint Q_1, ..., Q_m in I_1. W.l.o.g. assume that Q_i ≠ ∅ for all i =
1, ..., m and that the sets are ordered, i.e. for x ∈ Q_i and y ∈ Q_{i+1} we
always have x < y for any i = 1, ..., m − 1. We consider the following
cases
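The distributive law C ∩ (D ∪ E) = (C ∩ D) ∪ (C ∩ E) implies
\[
\begin{aligned}
A^c &= (b_1, \infty) \cap \bigcap_{i=2}^{m} \big( (-\infty, a_i] \cup (b_i, \infty) \big) \\
&= \bigcap_{i=2}^{m} \Big( \big( (b_1, \infty) \cap (-\infty, a_i] \big) \cup \big( (b_1, \infty) \cap (b_i, \infty) \big) \Big) \\
&= \bigcap_{i=2}^{m} \big( (b_1, a_i] \cup (b_i, \infty) \big)
= \bigcup_{\substack{I, J \subseteq \{2, \dots, m\}, \\ I = J^c}} \Big( \bigcap_{i \in I} (b_1, a_i] \cap \bigcap_{j \in J} (b_j, \infty) \Big).
\end{aligned}
\]
and
\[
\bigcap_{j \in J} (b_j, \infty) =
\begin{cases} (b_{\max J}, \infty), & \text{if } J \neq \emptyset, \\ (-\infty, \infty), & \text{else}. \end{cases}
\]
We therefore get
\[
A^c = (b_1, a_2] \cup (b_m, \infty) \cup
\bigcup_{\substack{I, J \subseteq \{2, \dots, m\}, \\ I = J^c, \ I, J \neq \emptyset}}
\big( (b_1, a_{\min I}] \cap (b_{\max J}, \infty) \big)
\]
and in particular A^c ∈ F_1.
– The two other cases involving Q_m = (b_m, ∞) can be shown analogously.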
Definition 4.22. Denote by I_d ⊆ P(R^d) the system of sets that consists of
all left half-open hyperrectangles, i.e.
\[
\mathcal{I}_d = \Big\{ \prod_{j=1}^{d} (a_j, b_j] \cap \mathbb{R}^d : -\infty \le a_j \le b_j \le \infty, \ j = 1, \dots, d \Big\}.
\]
If A and B are subsets of F then it holds that
\[
f^{-1}(\mathcal{B}) = \{ f^{-1}(B) : B \in \mathcal{B} \}
\]
f^{-1}(F) = {y ∈ E : f(y) ∈ F} = {y : y ∈ E} = E.
2. From f^{-1}(S) ⊆ f^{-1}(σ(S)) it follows that σ(f^{-1}(S)) ⊆ f^{-1}(σ(S)).
To show the reverse inclusion set C := {C ⊆ F : f^{-1}(C) ∈ σ(f^{-1}(S))}.
The system C is a σ-algebra in F and, by definition of C, we have that
S ⊆ C. Thus we also have σ(S) ⊆ C and therefore f^{-1}(σ(S)) ⊆
f^{-1}(C) ⊆ σ(f^{-1}(S)).
The above theorem implies that we are always given a σ-algebra f^{-1}(B)
on the domain E which is 'inherited' from B via the map f. To determine
f^{-1}(B) it suffices to take the σ-algebra generated by the pre-images of a
generator S of B.
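The identity f^{-1}(σ(S)) = σ(f^{-1}(S)) can be checked on finite sets. The following sketch uses a hypothetical map f from E = {0, ..., 5} to F = {a, b, c} and a hypothetical helper generated_sigma_algebra, which computes σ(·) on a finite set by closing under complements and unions.

```python
def generated_sigma_algebra(omega, generator):
    """Smallest sigma-algebra on the finite set omega that contains generator."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(s) for s in generator}
    while True:
        new = {omega - a for a in sets} | {a | b for a in sets for b in sets}
        if new <= sets:
            return sets
        sets |= new

def preimage(f, domain, subset):
    """Preimage f^{-1}(B) = {x in domain : f(x) in B}."""
    return frozenset(x for x in domain if f(x) in subset)

E = frozenset(range(6))
F = frozenset("abc")

def f(x):
    return "a" if x < 2 else ("b" if x < 4 else "c")

S = [frozenset("a")]                                   # generator on F
lhs = {preimage(f, E, B) for B in generated_sigma_algebra(F, S)}
rhs = generated_sigma_algebra(E, [preimage(f, E, B) for B in S])
print(lhs == rhs)
```

The right-hand side only needs the pre-images of the generator, which is usually much cheaper than taking pre-images of every set in σ(S).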
5 Measures
5.1 Definition & Properties
Definition 5.1. Let (Ω, A) be a measurable space. A function µ : A → R
is called a measure on A if
1. µ(∅) = 0,
Remark 5.2. Note that µ is allowed to assign the value '∞' to a set.
Definition 5.3. Let (Ω, A) be a measurable space. If µ is a measure on A
then the triplet (Ω, A, µ) is called a measure space.
Definition 5.4. Let (Ω, A, µ) be a measure space.
1. If µ(Ω) < ∞ then µ is called a finite measure.
(i) Consider Ω = N and A = P(N). For A ∈ A let
\[
\mu(A) = \begin{cases} |A|, & \text{if } |A| < \infty, \\ \infty, & \text{else}, \end{cases}
\]
µ(A) ≤ µ(B).
2. For A ⊆ B we have that A and B\A are disjoint. By the previous
statement (additivity) we have
since µ(B\A) ≥ 0.
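2. Set C_n = A_1\A_n. Then we have C_n ⊆ C_{n+1}. By 1. we get
\[
\lim_{n \to \infty} \mu(C_n) = \mu\Big( \bigcup_{n=1}^{\infty} C_n \Big).
\]
as well as
\[
\mu\Big( \bigcup_{n=1}^{\infty} C_n \Big)
= \mu\Big( \bigcup_{n=1}^{\infty} (A_1 \setminus A_n) \Big)
= \mu\Big( A_1 \setminus \bigcap_{n=1}^{\infty} A_n \Big)
= \mu(A_1) - \mu\Big( \bigcap_{n=1}^{\infty} A_n \Big).
\]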
µ(B) = |A ∩ B| ≙ number of elements a ∈ A that are also elements of B.
Example 5.9. If Σ_{i∈I} p_i = 1 then µ is a probability measure, more pre-
cisely: a discrete probability measure or discrete probability distri-
bution. If the support A = {a_1, ..., a_n} is finite and if p_i = 1/n then
the corresponding probability measure is called the uniform distribution on
{a_1, ..., a_n}. Examples are given by rolling a die, where n = 6, p_i = 1/6, i =
1, ..., 6, or tossing a coin, where n = 2, p_i = 1/2, i = 1, 2.
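The die example can be sketched in a few lines. We assume here that the discrete measure has the form µ(B) = Σ_{i : a_i ∈ B} p_i, which is not shown in this excerpt; the function name discrete_measure is hypothetical.

```python
from fractions import Fraction

def discrete_measure(weights):
    """Build mu from a dict {a_i: p_i}: mu(B) is the sum of p_i over all a_i in B."""
    def mu(B):
        return sum((p for a, p in weights.items() if a in B), Fraction(0))
    return mu

# Uniform distribution on {1, ..., 6}: a fair die.
die = discrete_measure({k: Fraction(1, 6) for k in range(1, 7)})

print(die({1, 2, 3, 4, 5, 6}))   # total mass of the whole space
print(die({2, 4, 6}))            # probability of an even number
```

Exact fractions are used so that the total mass is exactly one and additivity can be checked without rounding error.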
So far, we have introduced the concept of measures on given σ-algebras.
However, measures are generally defined on rather small systems, i.e. sub-
systems of σ-algebras, and then extended to larger structures. We have just
seen how this works in the case of discrete measures, where the knowledge of
(a_i, p_i)_{i∈I} gives rise to a measure on the entire power set. In the following
we will present a general approach that allows us to define (continuous)
measures on (R, B(R)).
Definition 5.10. Let A be an algebra on some non-empty set Ω. A function
µ_0 : A → R is called a content on A if
1. µ_0(∅) = 0
where inf ∅ := 0. The function µ^* : P(Ω) → R is called the pre-measure
induced by µ_0. We further set
A_{µ^*} := {A ∈ P(Ω) : µ^*(B) = µ^*(A ∩ B) + µ^*(A^c ∩ B) for all B ⊆ Ω with µ^*(B) < ∞}.
1. A_{µ^*} is a σ-algebra,
• Let A ∈ F_1. It suffices to check the case where A contains no infinite
interval, i.e. A = ⋃_{k=1}^m (a_k, b_k] ∈ F_1 with (a_1, b_1], ..., (a_m, b_m] pairwise
disjoint and a_k ≤ b_k, k = 1, ..., m. Then b_k − a_k ≥ 0 for all k =
1, ..., m, hence λ_0(A) = Σ_{k=1}^m (b_k − a_k) ≥ 0.
B = ⋃_{i=m+1}^{m+n} Q_i, where Q_i = (a_i, b_i], i = 1, ..., m + n, are pairwise
disjoint sets. Then, by definition of λ_0,
\[
\begin{aligned}
\lambda_0(A \cup B) &= \lambda_0\Big( \bigcup_{i=1}^{m+n} Q_i \Big) = \sum_{i=1}^{m+n} (b_i - a_i) \\
&= \sum_{i=1}^{m} (b_i - a_i) + \sum_{i=m+1}^{m+n} (b_i - a_i) = \lambda_0(A) + \lambda_0(B).
\end{aligned}
\]
Proof. By R = ⋃_{n=1}^∞ (−n, n] and λ_0((−n, n]) = 2n, for all n ∈ N, we see
that λ_0 is σ-finite.
It is left to show σ-additivity. Assume that (A_n) is a sequence of pairwise
disjoint elements in F_1 such that ⋃_{n=1}^∞ A_n ∈ F_1. Here we only consider the
case in which (a, b] = ⋃_{n=1}^∞ A_n = ⋃_{n=1}^∞ (a_n, b_n] with (a_n, b_n] being pairwise
disjoint (the other cases follow similarly). Then we need to show that
\[
\lambda_0((a, b]) = b - a = \sum_{n=1}^{\infty} (b_n - a_n) = \sum_{n=1}^{\infty} \lambda_0((a_n, b_n]).
\]
We first show that the right-hand side is greater than or equal to the
left-hand side and then we show the reverse.
Let (a_1, b_1], ..., (a_m, b_m] be finitely many pairwise disjoint intervals such
that their union is equal to (a, b]. W.l.o.g. assume that the intervals are non-
empty and ordered (otherwise discard the empty ones and/or rearrange),
i.e.
a ≤ a_1 < b_1 ≤ a_2 < b_2 ≤ ... < b_{m−1} ≤ a_m < b_m ≤ b.
Then this implies that
\[
\lambda_0((a, b]) = b - a \ge b_m - a_1 \ge \sum_{i=1}^{m} (b_i - a_i) = \sum_{i=1}^{m} \lambda_0((a_i, b_i]).
\]
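Then, by the Heine-Borel theorem, there is a finite covering, i.e. there is a
finite set I ⊆ N such that
\[
[a, b] \subseteq \bigcup_{j \in I} (a_j - \varepsilon_j, b_j + \varepsilon_j).
\]
We therefore get
\[
\begin{aligned}
\lambda_0((a, b]) &\le \sum_{j \in I} \lambda_0((a_j - \varepsilon_j, b_j + \varepsilon_j]) \le \sum_{j \in I} (b_j - a_j + 2\varepsilon_j) \\
&\le \varepsilon + \sum_{j \in I} \lambda_0((a_j, b_j]) \le \varepsilon + \sum_{j=1}^{\infty} \lambda_0((a_j, b_j]).
\end{aligned}
\]
Since ε > 0 was arbitrary we get λ_0((a, b]) ≤ Σ_{i=1}^∞ λ_0((a_i, b_i]).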
2. λ((a, b]) = λ([a, b]) = λ((a, b)) = λ([a, b)) = b − a, for all a ≤ b.
More generally, it is possible to see that for hyperrectangles Q = ∏_{i=1}^d (a_i, b_i]
the function
\[
\lambda_0^d(Q) = \prod_{i=1}^{d} (b_i - a_i)
\]
defines a σ-finite pre-measure on the algebra S_1^d. Thus λ_0^d induces a measure
λ^d on (R^d, B(R^d)), the d-dimensional Lebesgue measure.
(iii) F is right-continuous, i.e. the limit lim_{y↓x} F(y) exists and equals
F(x) for all x ∈ R.
(iii) For a monotonically decreasing sequence (x_n) with x_n → x (as n → ∞)
it holds that ⋂_{n=1}^∞ (−∞, x_n] = (−∞, x]. Again, by σ-continuity,
\[
\lim_{n \to \infty} F(x_n) = \lim_{n \to \infty} P((-\infty, x_n])
= P\Big( \bigcap_{n=1}^{\infty} (-\infty, x_n] \Big)
= P((-\infty, x]) = F(x).
\]
Part III
Integration theory
6 Measurable maps
6.1 Definition & first Properties
Definition 6.1. Let (Ω, A) and (E, B) be measurable spaces. A map f :
Ω → E is called (A, B)-measurable (or for short: measurable) if f^{-1}(B) ∈
A for all B ∈ B. If (E, B) = (R, B(R)) then f is called Borel measurable.
(i) f is Borel measurable.
cf, f^2, f + g, fg, |f|
are Borel measurable provided that they are well-defined.
If f_n(x) converges for every x ∈ Ω then also the limit function f given
by f(x) := lim_{n→∞} f_n(x) is Borel measurable.
hold. But, by the preceding, sup_n f_n and inf_n f_n are measurable, hence also
lim inf_n f_n and lim sup_n f_n are measurable and thus also f.
6.3 Simple functions
Lemma 6.11. Let (Ω, A) be a measurable space. If α_1, ..., α_m ∈ R and
A_1, ..., A_m ∈ A then the function
\[
f = \sum_{j=1}^{m} \alpha_j \mathbf{1}_{A_j} \tag{6.1}
\]
is Borel measurable.
Definition 6.13. A family (A_i)_{i∈I} of pairwise disjoint sets in A, where I is
some non-empty index set, is called a partition of Ω if Ω = ⋃_{i∈I} A_i.
Remark 6.17. All the results that we have derived in this section can be
extended to numerical functions.
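Proof. Define a sequence of functions via
\[
f_n(x) := \begin{cases}
\dfrac{k}{2^n}, & \text{if } f(x) \in \big[ \tfrac{k}{2^n}, \tfrac{k+1}{2^n} \big), \ k = 0, \dots, n 2^n - 1, \\
n, & \text{if } f(x) \ge n.
\end{cases}
\]
\[
0 \le f(x) - f_n(x) \le \frac{k+1}{2^n} - \frac{k}{2^n} = \frac{1}{2^n}.
\]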
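The dyadic construction and the error bound can be tried out numerically. The sketch below is an illustration only: the example function f(x) = x^2 and the sample points are our own choice, picked so that f stays below every n used, which is the range where the bound 2^{-n} applies.

```python
import math

def dyadic_approximation(f, n):
    """Return f_n: f rounded down to the grid {k / 2^n}, capped at the value n."""
    def f_n(x):
        value = f(x)
        if value >= n:
            return float(n)
        # value lies in [k / 2^n, (k + 1) / 2^n) for k = floor(value * 2^n)
        return math.floor(value * 2 ** n) / 2 ** n
    return f_n

def f(x):
    return x * x                               # a non-negative example function

points = [i / 10 for i in range(0, 15)]        # sample points, here f < 2
errors = {
    n: max(f(x) - dyadic_approximation(f, n)(x) for x in points)
    for n in (2, 4, 8)
}
print(errors)
```

Doubling the resolution in each step refines the previous grid, so the approximations increase with n while the error is at most 2^{-n} wherever f(x) < n.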
Taking the limit for n → ∞ yields
The idea is to measure with µ the pre-image of each possible outcome of
f . As before we introduce this concept very generally such that the cases
µ = λ, i.e. when µ equals the Lebesgue measure, or µ(Ω) = 1, i.e. when µ
is a probability measure, are only particular examples of µ.
where α_1, ..., α_m are (not necessarily distinct) real numbers and (A_j)_{j=1,...,m}
are sets in A that form a partition of Ω. Since the image of a simple function
is finite (we have Im(f) = {α_1, ..., α_m}), this will lead to a sum over j.
Definition 7.1. Let f be a non-negative, simple function of type (7.1).
Then we set
\[
\int_\Omega f \, d\mu := \sum_{j=1}^{m} \alpha_j \mu(A_j),
\]
where we use the convention that 0 · ∞ = 0 and a + ∞ = ∞. The object
∫_Ω f dµ is called the integral of f over Ω w.r.t. µ.
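The definition can be sketched in code once the masses µ(A_j) are known. The function below takes a list of pairs (α_j, µ(A_j)); this input format and the name integral_simple are our own assumptions. The convention 0 · ∞ = 0 has to be handled explicitly, since floating-point arithmetic returns 'not a number' for that product.

```python
import math

def integral_simple(terms):
    """Integral of a non-negative simple function given as pairs (alpha_j, mu(A_j))."""
    total = 0.0
    for alpha, mass in terms:
        if alpha < 0 or mass < 0:
            raise ValueError("values and masses must be non-negative")
        if alpha == 0 or mass == 0:
            continue                 # convention: 0 * inf = 0
        total += alpha * mass        # a + inf = inf holds automatically
    return total

# Indicator of a set of measure zero whose complement has infinite measure.
zero_case = integral_simple([(1.0, 0.0), (0.0, math.inf)])
finite_case = integral_simple([(2.0, 0.5), (3.0, 1.0)])
infinite_case = integral_simple([(1.0, math.inf), (5.0, 2.0)])
print(zero_case, finite_case, infinite_case)
```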
Examples 7.2.
Consider the function
and the Dirichlet function satisfies
\[
\int_{\mathbb{R}} \mathbf{1}_{\mathbb{Q}} \, d\lambda = \lambda(\mathbb{Q}) = 0.
\]
Let f = 0 be the zero function. Then it obviously holds for any measure µ
that ∫_Ω f dµ = 0.
Proof. First note that A_j = ⋃_{k=1}^n (A_j ∩ B_k) is a disjoint union for any
j = 1, ..., m, since (B_k)_{k=1,...,n} is a partition. By the same argument we get
that B_k = ⋃_{j=1}^m (A_j ∩ B_k) is a disjoint union for any k = 1, ..., n.
If A_j ∩ B_k ≠ ∅ then α_j = f(x) = β_k, for all x ∈ A_j ∩ B_k. This gives
along with additivity of µ:
\[
\begin{aligned}
\sum_{j=1}^{m} \alpha_j \mu(A_j)
&= \sum_{j=1}^{m} \alpha_j \mu\Big( \bigcup_{k=1}^{n} (A_j \cap B_k) \Big)
= \sum_{j=1}^{m} \sum_{k=1}^{n} \alpha_j \mu(A_j \cap B_k) \\
&= \sum_{j=1}^{m} \sum_{k=1}^{n} \beta_k \mu(A_j \cap B_k)
= \sum_{k=1}^{n} \beta_k \mu\Big( \bigcup_{j=1}^{m} (A_j \cap B_k) \Big) \\
&= \sum_{k=1}^{n} \beta_k \mu(B_k).
\end{aligned}
\]
as well as
\[
\int_\Omega (f + g) \, d\mu = \int_\Omega f \, d\mu + \int_\Omega g \, d\mu.
\]
If it holds that f ≤ g, i.e. f(x) ≤ g(x) for all x ∈ Ω, then
\[
\int_\Omega f \, d\mu \le \int_\Omega g \, d\mu.
\]
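is again simple and we clearly have (by an argument similar to the one in the
proof of the previous Lemma)
\[
\begin{aligned}
\int_\Omega (f + g) \, d\mu
&= \sum_{j=1}^{m} \sum_{k=1}^{n} (\alpha_j + \beta_k) \mu(A_j \cap B_k) \\
&= \sum_{j=1}^{m} \sum_{k=1}^{n} \alpha_j \mu(A_j \cap B_k) + \sum_{j=1}^{m} \sum_{k=1}^{n} \beta_k \mu(A_j \cap B_k) \\
&= \sum_{j=1}^{m} \alpha_j \mu\Big( \bigcup_{k=1}^{n} (A_j \cap B_k) \Big) + \sum_{k=1}^{n} \beta_k \mu\Big( \bigcup_{j=1}^{m} (A_j \cap B_k) \Big) \\
&= \sum_{j=1}^{m} \alpha_j \mu(A_j) + \sum_{k=1}^{n} \beta_k \mu(B_k) = \int_\Omega f \, d\mu + \int_\Omega g \, d\mu.
\end{aligned}
\]
α_j = f(x) ≤ g(x) = β_k,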
7.2 The integral of non-negative functions
With the help of the integral of simple functions we now introduce the
integral for non-negative functions. Remember that we can approximate
these functions arbitrarily close from below by simple functions, cf. the
Approximation theorem. Therefore the following definition is meaningful.
Examples 7.6.
(i) If f is a simple function f = Σ_{j=1}^m α_j 1_{A_j} of type (7.1) then still ∫ f dµ =
Σ_{j=1}^m α_j µ(A_j) (that is why we can use the same notation).
(ii) Let Ω = N, A = P(N) and µ a probability measure on P(N). Then for
any function f : N → [0, ∞) we have that
\[
\int_{\mathbb{N}} f \, d\mu = \sum_{k=1}^{\infty} f(k) \mu(\{k\}).
\]
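Example (ii) turns the integral into a series, whose partial sums can be computed exactly. The sketch below assumes the hypothetical weights µ({k}) = 2^{-k}, which sum to one, and the example function f(k) = k; neither is taken from the script.

```python
from fractions import Fraction

def mu_point(k):
    """Hypothetical probability weights mu({k}) = 2^(-k) for k = 1, 2, ..."""
    return Fraction(1, 2 ** k)

def partial_integral(f, K):
    """Partial sum of f(k) * mu({k}) over k = 1, ..., K."""
    return sum((f(k) * mu_point(k) for k in range(1, K + 1)), Fraction(0))

for K in (1, 2, 5, 10, 20):
    print(K, float(partial_integral(lambda k: k, K)))
```

Since f is non-negative, the partial sums are increasing; for this choice they approach the value 2 of the series Σ k 2^{-k}.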
(ii) For f ∈ M^+(Ω, A) and A, B ∈ A with A ⊆ B we have that
\[
\int_A f \, d\mu \le \int_B f \, d\mu.
\]
B_n := {x ∈ Ω : f_n(x) ≥ ϕ(x)}.
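For simple ϕ = Σ_{j=1}^m α_j 1_{A_j}, σ-continuity and A_j = ⋃_{n=1}^∞ (A_j ∩ B_n) yield
\[
\begin{aligned}
\int_\Omega c\varphi \, d\mu
&= c \sum_{j=1}^{m} \alpha_j \mu(A_j) = c \sum_{j=1}^{m} \alpha_j \mu\Big( \bigcup_{n=1}^{\infty} (A_j \cap B_n) \Big) \\
&= c \sum_{j=1}^{m} \alpha_j \lim_{n \to \infty} \mu(A_j \cap B_n) = \lim_{n \to \infty} c \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) \\
&= \lim_{n \to \infty} c \int_\Omega \mathbf{1}_{B_n} \varphi \, d\mu,
\end{aligned}
\]
This means that we showed both directions and the claim follows.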
Proof. Follows immediately from the previous theorem and the fact that
non-negative, simple functions are in M+ (Ω, A).
and
\[
\int_\Omega (f + g) \, d\mu = \int_\Omega f \, d\mu + \int_\Omega g \, d\mu.
\]
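Proof. If c = 0 then the claim follows immediately. Let c > 0 and let (f_n)
be a sequence of non-negative, simple functions with f_n ↑ f (existence is
guaranteed by the Approximation theorem). Then also (cf_n) is a sequence
of non-negative functions with cf_n ↑ cf. In particular, the Monotone con-
vergence theorem can be applied twice and the linearity of the integral for
simple functions yields
\[
\begin{aligned}
\int_\Omega cf \, d\mu
&= \int_\Omega \lim_{n \to \infty} cf_n \, d\mu = \lim_{n \to \infty} \int_\Omega cf_n \, d\mu = \lim_{n \to \infty} c \int_\Omega f_n \, d\mu \\
&= c \lim_{n \to \infty} \int_\Omega f_n \, d\mu = c \int_\Omega \lim_{n \to \infty} f_n \, d\mu = c \int_\Omega f \, d\mu.
\end{aligned}
\]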
which clearly gives f(x) = f^+(x) − f^−(x), for all x ∈ R. Moreover, it is easy
to see that
and we call the quantity ∫_Ω f dµ the integral of f over Ω w.r.t. µ.
Equivalent notations are
\[
\int f \, d\mu, \qquad \int_\Omega f(x) \, \mu(dx), \qquad \int_\Omega f(x) \, d\mu(x),
\]
Remarks 7.14.
(i) For a Borel-measurable function f : Ω → R the following are equivalent:
(a) f ∈ L^1(Ω, A, µ)
(b) |f| ∈ L^1(Ω, A, µ).
This can be easily seen as the linearity of the integral for non-negative
functions gives
\[
\int_\Omega |f| \, d\mu = \int_\Omega (f^+ + f^-) \, d\mu = \int_\Omega f^+ \, d\mu + \int_\Omega f^- \, d\mu,
\]
Examples 7.16.
(i) Consider again the Dirichlet function f(x) = 1_Q(x) and the Lebesgue
measure λ. Then this function equals zero λ-almost everywhere, since
λ(Q) = 0.
(ii) Consider the functions f(x) = x^2, g(x) = 1 and the Dirac measure
δ_0(A) = 1_A(0) for A ∈ A. Then g is greater than f δ_0-almost everywhere,
since g(0) = 1 > 0 = f(0).
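Example (ii) can be made concrete by measuring the exceptional set, i.e. the set of points where g > f fails. In the sketch below a set is represented by its membership test; this representation and the function name dirac are our own assumptions.

```python
def dirac(point):
    """Dirac measure at `point`; a set is passed as a membership test."""
    return lambda contains: 1 if contains(point) else 0

delta0 = dirac(0.0)

def f(x):
    return x ** 2

def g(x):
    return 1.0

def exceptional(x):
    """Membership test for the set where 'g > f' fails."""
    return not (g(x) > f(x))

print(exceptional(2.0))        # the inequality fails at x = 2 ...
print(delta0(exceptional))     # ... but the exceptional set carries no mass
```

The exceptional set is far from empty (it contains every x with |x| ≥ 1), yet it is a null set for δ_0 because it does not contain the point 0.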
1. The functions cf and f + g are also in L^1(Ω, A, µ) and
\[
\int_\Omega cf \, d\mu = c \int_\Omega f \, d\mu, \qquad
\int_\Omega (f + g) \, d\mu = \int_\Omega f \, d\mu + \int_\Omega g \, d\mu.
\]
Proof. Ad 1.: For arbitrary c ∈ R we first show that cf ∈ L^1(Ω, A, µ). The
relation f ∈ L^1(Ω, A, µ) ⇔ |f| ∈ L^1(Ω, A, µ) yields
\[
\int_\Omega |cf| \, d\mu = \int_\Omega |c| |f| \, d\mu = |c| \int_\Omega |f| \, d\mu < \infty,
\]
where we have used the linearity of the integral for non-negative functions
(note: |f| ∈ M^+(Ω, A)). Now consider the case in which c = 0. Then it
obviously holds that
\[
\int_\Omega 0 \cdot f \, d\mu = 0 = 0 \cdot \int_\Omega f \, d\mu.
\]
The case c < 0 follows analogously.
Ad 2.: It is clear that f ≤ g implies 0 ≤ g − f, i.e. (g − f) is a non-negative
function. Since the zero function x ↦ 0, x ∈ Ω, is also non-negative, the
monotonicity of the integral for functions in M^+(Ω, A) along with statement
1. gives
\[
0 = \int_\Omega 0 \, d\mu \le \int_\Omega (g - f) \, d\mu = \int_\Omega g \, d\mu - \int_\Omega f \, d\mu,
\]
which is equivalent to the claim.
Next we show that for $f, g \in L^1(\Omega, \mathcal{A}, \mu)$ also $f + g \in L^1(\Omega, \mathcal{A}, \mu)$,
which can easily be seen from the triangle inequality $|f + g| \le |f| + |g|$:
\[
\int_\Omega |f + g|\,d\mu \le \int_\Omega (|f| + |g|)\,d\mu = \int_\Omega |f|\,d\mu + \int_\Omega |g|\,d\mu < \infty,
\]
where we used the derived properties of the integral for the non-negative
functions $|f|$, $|g|$ and $|f + g|$. Moreover, we clearly have that $f^+ + g^+$ and $f^- + g^-$
are functions in $M^+(\Omega, \mathcal{A})$ such that $f + g = (f^+ + g^+) - (f^- + g^-)$. This
gives (cf. Remark 7.14)
\[
\begin{aligned}
\int_\Omega (f + g)\,d\mu &= \int_\Omega (f^+ + g^+)\,d\mu - \int_\Omega (f^- + g^-)\,d\mu \\
&= \int_\Omega f^+\,d\mu + \int_\Omega g^+\,d\mu - \int_\Omega f^-\,d\mu - \int_\Omega g^-\,d\mu \\
&= \int_\Omega f\,d\mu + \int_\Omega g\,d\mu,
\end{aligned}
\]
where the second equality is obtained from the linearity of the integral of
functions in $M^+(\Omega, \mathcal{A})$. Note that in general we do not have
$(f + g)^+ = f^+ + g^+$!
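The warning about positive parts can be illustrated on a two-point set. The sketch below uses a hypothetical measure and hypothetical functions, chosen only so that $f$ and $g$ have opposite signs at each point; it shows that $(f+g)^+$ differs from $f^+ + g^+$ while the integral is still additive.

```python
# Hypothetical two-point measure space with mu({a}) = 1 and mu({b}) = 2.
weights = {"a": 1.0, "b": 2.0}
f = {"a": 1.0, "b": -3.0}
g = {"a": -1.0, "b": 5.0}


def pos(h):
    """Positive part h^+ = max(h, 0), taken pointwise."""
    return {x: max(v, 0.0) for x, v in h.items()}


def integral(h):
    """Integral w.r.t. the weighted counting measure."""
    return sum(h[x] * weights[x] for x in weights)


f_plus_g = {x: f[x] + g[x] for x in f}
pos_of_sum = pos(f_plus_g)                          # (f + g)^+
sum_of_pos = {x: pos(f)[x] + pos(g)[x] for x in f}  # f^+ + g^+

print(pos_of_sum, sum_of_pos)
print(integral(f_plus_g), integral(f) + integral(g))
```

The decomposition used in the proof, $f + g = (f^+ + g^+) - (f^- + g^-)$, remains valid even though it is not the decomposition into positive and negative part of $f + g$.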
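Ad 3.: By $|f| = f^+ + f^-$ and again by the triangle inequality it holds
that
\[
\begin{aligned}
\left| \int_\Omega f\,d\mu \right| &= \left| \int_\Omega f^+\,d\mu - \int_\Omega f^-\,d\mu \right| \le \left| \int_\Omega f^+\,d\mu \right| + \left| \int_\Omega f^-\,d\mu \right| \\
&= \int_\Omega f^+\,d\mu + \int_\Omega f^-\,d\mu = \int_\Omega (f^+ + f^-)\,d\mu = \int_\Omega |f|\,d\mu,
\end{aligned}
\]
where we once more used the linearity of the integral for non-negative
functions.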
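Ad 4.: For disjoint sets $A$ and $B$ we know that $1_{A \cup B} f = (1_A + 1_B) f = 1_A f + 1_B f$. This along
with 1. gives
\[
\int_{A \cup B} f\,d\mu = \int_\Omega 1_{A \cup B} f\,d\mu = \int_\Omega (1_A f + 1_B f)\,d\mu = \int_\Omega 1_A f\,d\mu + \int_\Omega 1_B f\,d\mu.
\]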
simple:
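\[
\int_A f\,d\mu = \int_\Omega 1_A f\,d\mu = \sum_{j=1}^m \alpha_j \int_\Omega 1_A 1_{A_j}\,d\mu = \sum_{j=1}^m \alpha_j\,\mu(A \cap A_j) = 0,
\]
\[
= \lim_{n\to\infty} \int_A f_n\,d\mu = \lim_{n\to\infty} 0 = 0,
\]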
since we verified the claim already for non-negative functions. This gives
the proof of statement 5.
Ad 6.: Assume first that $f = 0$ $\mu$-a.e., i.e. $\mu(A) = 0$ for the set
\[
\begin{aligned}
A :&= \{x \in \Omega : |f(x)| > 0\} = \{x \in \Omega : f(x) \ne 0\} \\
&= \{x \in \Omega : f(x) < 0\} \cup \{x \in \Omega : f(x) > 0\},
\end{aligned}
\]
which clearly belongs to $\mathcal{A}$ since $f$ is Borel-measurable. Since
$|f(x)| = 0$ for any $x \in A^c$, an application of 4. (with $\Omega = A \cup A^c$) and 5.
gives
\[
\int_\Omega |f|\,d\mu = \int_A |f|\,d\mu + \int_{A^c} |f|\,d\mu = 0 + 0 = 0,
\]
where we used that $1_{A^c}(x) f(x) = 0$.
Now we show the reverse. Assume that the set $A$ (defined as before) has
non-zero measure, i.e. $\mu(A) > 0$. It is not hard to see that
\[
A = \{x \in \Omega : |f(x)| > 0\} = \bigcup_{n=1}^{\infty} \{x \in \Omega : |f(x)| > \tfrac{1}{n}\}.
\]
By σ-continuity we get
\[
0 < \mu(A) = \mu\Big( \bigcup_{n=1}^{\infty} \{x \in \Omega : |f(x)| > \tfrac{1}{n}\} \Big) = \lim_{n\to\infty} \mu(\{x \in \Omega : |f(x)| > \tfrac{1}{n}\}).
\]
This means that there is some $n_0 \in \mathbb{N}$ such that $\mu(A_{n_0}) > 0$ for
$A_{n_0} = \{x \in \Omega : |f(x)| > \tfrac{1}{n_0}\}$. Note that $\tfrac{1}{n_0} 1_{A_{n_0}} \le |f| 1_{A_{n_0}}$. But this would imply
(applying statement 2. twice)
\[
\int_\Omega |f|\,d\mu \ge \int_\Omega |f| 1_{A_{n_0}}\,d\mu \ge \frac{1}{n_0} \int_\Omega 1_{A_{n_0}}\,d\mu = \frac{1}{n_0}\,\mu(A_{n_0}) > 0,
\]
which contradicts the assumption $\int_\Omega |f|\,d\mu = 0$. Therefore it cannot hold
that $\mu(A) > 0$, i.e. $f = 0$ $\mu$-a.e.
With the help of the above corollary it is easy to compute integrals when the
underlying measure is discrete.
We now calculate this integral with the help of approximating functions.
For this, set $A_n := \{a_i : i \in I, i \le n\}$ and define a sequence $(f_n)$ by
\[
f_n(x) = 1_{A_n}(x) f(x) = \sum_{i \in I:\, i \le n} 1_{\{a_i\}}(x) f(a_i).
\]
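The approximating functions $f_n$ can be tried out on a concrete discrete measure. The sketch below rests on illustrative assumptions that are not part of the script: the atoms are $a_i = i$ for $i \in \mathbb{N}$, the weights are $\mu(\{i\}) = 2^{-i}$ and $f(x) = x$. The integral of $f_n$ is then the finite sum of $f(a_i)\mu(\{a_i\})$ over $i \le n$.

```python
def weight(i):
    """Hypothetical weight mu({a_i}) = 2^(-i) of the atom a_i = i."""
    return 2.0 ** (-i)


def f(x):
    """Illustrative non-negative function on the atoms."""
    return float(x)


def integral_f_n(n):
    """Integral of f_n = 1_{A_n} f, i.e. the sum of f(a_i) * mu({a_i}) over i <= n."""
    return sum(f(i) * weight(i) for i in range(1, n + 1))


partial_integrals = [integral_f_n(n) for n in (1, 2, 5, 10, 50)]
print(partial_integrals)
```

The partial integrals increase with $n$; for this particular choice of weights they approach 2.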
For continuous measures the following general statement helps with the
calculation of integrals.
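see this, let $g = \sum_{j=1}^m \alpha_j 1_{A_j}$. Then
\[
\begin{aligned}
\int_F g(y)\,d\mu^f(y) &= \int_F \sum_{j=1}^m \alpha_j 1_{A_j}(y)\,d\mu^f(y) = \sum_{j=1}^m \alpha_j \int_F 1_{A_j}(y)\,d\mu^f(y) \\
&= \sum_{j=1}^m \alpha_j \int_\Omega 1_{A_j}(f(x))\,d\mu(x) = \int_\Omega \sum_{j=1}^m \alpha_j 1_{A_j}(f(x))\,d\mu(x) \\
&= \int_\Omega g(f(x))\,d\mu(x)
\end{aligned}
\]
\[
\begin{aligned}
&= \int_F g^+(y)\,d\mu^f(y) - \int_F g^-(y)\,d\mu^f(y) \\
&= \int_F g(y)\,d\mu^f(y).
\end{aligned}
\]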
The above theorem implies that for the calculation of $\int_\Omega (h \circ f)\,d\mu$ it is
only necessary to know the induced measure $\mu^f$ (and not $f$ or $\mu$). This is
of fundamental importance for Statistics where the underlying probability
space $(\Omega, \mathcal{A}, P)$ is often unknown and only derived quantities, the so-called
random variables $X : \Omega \to F$, are observed. The corresponding probability
distribution $P^X$ is then accessible via the observations of samples w.r.t. $P^X$.
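On a finite set the transformation formula $\int_\Omega (h \circ f)\,d\mu = \int_F h\,d\mu^f$ can be verified directly. The following sketch assumes a fair die as the underlying measure space and hypothetical maps `f` and `h`; none of these choices come from the script.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical underlying space: a fair die, mu({w}) = 1/6 for w = 1..6.
mu = {w: Fraction(1, 6) for w in range(1, 7)}


def f(w):
    """Illustrative measurable map from Omega into F = {0, 1, 2}."""
    return w % 3


def h(y):
    """Illustrative function on F."""
    return y ** 2


# Image measure mu^f({y}) = mu(f^{-1}({y})).
mu_f = defaultdict(Fraction)
for w, p in mu.items():
    mu_f[f(w)] += p

integral_on_omega = sum(h(f(w)) * p for w, p in mu.items())  # integral of h∘f w.r.t. mu
integral_on_f = sum(h(y) * q for y, q in mu_f.items())       # integral of h w.r.t. mu^f
print(integral_on_omega, integral_on_f)
```

The second computation uses only the image measure `mu_f`, which mirrors the situation in Statistics where only the distribution of the observed quantity is accessible.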
In the remainder we will often use the notation
\[
\int_{\mathbb{R}} f(x)\,dx := \int_{\mathbb{R}} f(x)\,d\lambda(x),
\]
Definition 7.22. Let $\mu$ be a σ-finite measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ and let $f : \mathbb{R} \to
\mathbb{R}_+$ be a non-negative function such that
\[
\mu((a, b]) = \int_{(a,b]} f(x)\,dx := \int_a^b f(x)\,dx, \qquad -\infty \le a < b \le \infty.
\]
Examples 7.23.
(i) The Lebesgue measure $\lambda$ restricted to an interval $[a, b]$, $a < b$, has the
density $f(x) = 1_{[a,b]}(x)$, $x \in \mathbb{R}$.
(ii) For fixed $m \in \mathbb{R}$ and $\sigma^2 > 0$ the density of the so-called Gaussian measure
$\mu$ is given by
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big( -\frac{(x - m)^2}{2\sigma^2} \Big), \qquad x \in \mathbb{R}.
\]
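The defining relation $\mu((a,b]) = \int_a^b f(x)\,dx$ of a density can be checked numerically for the Gaussian measure. The sketch below is an illustration under assumptions: the parameter values and the midpoint rule are arbitrary choices, and the reference value comes from the standard expression of the Gaussian distribution function through the error function `math.erf`.

```python
import math


def gaussian_density(x, m, sigma2):
    """Density of the Gaussian measure with mean m and variance sigma2."""
    return math.exp(-(x - m) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def midpoint_integral(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over (a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (k + 0.5) * width) for k in range(steps))


def gaussian_measure(a, b, m, sigma2):
    """mu((a, b]) computed from the distribution function via the error function."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - m) / math.sqrt(2.0 * sigma2)))
    return cdf(b) - cdf(a)


m, sigma2 = 1.0, 4.0  # illustrative parameters, so sigma = 2
approx = midpoint_integral(lambda x: gaussian_density(x, m, sigma2), -1.0, 3.0)
exact = gaussian_measure(-1.0, 3.0, m, sigma2)
print(approx, exact)
```

The interval $(-1, 3]$ is $m \pm \sigma$ for these parameters, so both values are close to the familiar 68 percent.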
Proof. Follows with measure theoretic induction, where the statement holds
for the case h = 1[a,b] by the definition of a density.
7.4 Convergence theorems
Definition 7.26. A sequence $(f_n)$ of Borel-measurable functions $f_n : \Omega \to
\mathbb{R}$, $n \in \mathbb{N}$, is called convergent $\mu$-almost everywhere to some Borel-measurable
function $f$ on $(\Omega, \mathcal{A}, \mu)$ if there is a $\mu$-null set $N \in \mathcal{A}$ (i.e. $N$ satisfies
$\mu(N) = 0$) such that
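Example 7.28. Consider the measure space $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ and the function
$f_n(x) = x^n 1_{[0,1]}(x)$. Then one has $f_n(x) \to 1_{\{1\}}(x)$ for every $x$, but at the same time
also $f_n \to 0$ $\lambda$-a.e., since $\lambda(\{1\}) = 0$.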
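Proof. Set $g_n(x) := \inf_{k \in \mathbb{N}:\, k \ge n} f_k(x)$. Then $g_n \le f_n$, $g_n \le g_{n+1}$ for all
$n \in \mathbb{N}$, and $\lim_{n\to\infty} g_n(x) = \liminf_{n\to\infty} f_n(x)$. Moreover, it is clear that
$g_n$ is measurable (since infima of sequences of measurable functions are measurable) and that $g_n$ is
integrable, which follows by
\[
\int_\Omega |g_n|\,d\mu = \int_\Omega g_n\,d\mu \le \int_\Omega f_n\,d\mu = \int_\Omega |f_n|\,d\mu < \infty.
\]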
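Example 7.31. On $(\Omega, \mathcal{A}, \mu) = (\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ the sequence $f_n = -\tfrac{1}{n} 1_{[n,2n]}$
violates the non-negativity condition and the statement of Fatou's Lemma
does not hold, since $f_n \to 0$ and $\int_{\mathbb{R}} f_n\,d\lambda = -1$, i.e.
\[
\int_{\mathbb{R}} \liminf_{n\to\infty} f_n\,d\lambda = 0 > -1 = \liminf_{n\to\infty} \int_{\mathbb{R}} f_n\,d\lambda.
\]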
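The two sides of this inequality can be computed explicitly. The sketch below assumes the indicator is taken over the interval $[n, 2n]$, which has length $n$ and is the choice that makes every integral equal to $-1$; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def f_n(n, x):
    """f_n(x) = -(1/n) * 1_[n, 2n](x), with the interval [n, 2n] taken as an assumption."""
    return -Fraction(1, n) if n <= x <= 2 * n else Fraction(0)


def integral_f_n(n):
    """Lebesgue integral of f_n: the constant height -1/n times the interval length n."""
    return -Fraction(1, n) * (2 * n - n)


# Every integral equals -1, so the liminf of the integrals is -1.
integrals = [integral_f_n(n) for n in range(1, 50)]

# At a fixed point x the values f_n(x) vanish as soon as n > x, so liminf f_n = 0.
tail_values_at_5 = [f_n(n, 5) for n in range(6, 50)]
print(set(integrals), set(tail_values_at_5))
```

The negative mass of $f_n$ moves off to infinity, which is exactly what the non-negativity condition of Fatou's Lemma rules out.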
(i) $f_n \to f$ $\mu$-a.e.,
Proof. First note that $|f_n| \le g$, for all $n \in \mathbb{N}$, and $f_n \to f$ implies that also
$|f| \le g$. Since $f$ is measurable (as limit of measurable functions) it is also
integrable by
\[
\int_\Omega |f|\,d\mu \le \int_\Omega g\,d\mu < \infty.
\]
Now we clearly have that
\[
|f - f_n| \le |f| + |f_n| \le 2g,
\]
i.e. |f − fn | is integrable as well, and it even holds that
hence
\[
\lim_{n\to\infty} \int_\Omega |f - f_n|\,d\mu = 0.
\]
Example 7.34. On the measure space $([0, 1], \mathcal{B}([0, 1]), \lambda_{[0,1]})$ consider the
sequence of functions $f_n(x) = n 1_{[0,1/n]}(x)$, $x \in [0, 1]$, $n \in \mathbb{N}$. Then $f_n \to
0$ $\lambda_{[0,1]}$-a.e. However, it holds that
\[
\lim_{n\to\infty} \int_0^1 f_n\,d\lambda_{[0,1]} = 1, \qquad \int_0^1 \lim_{n\to\infty} f_n\,d\lambda_{[0,1]} = 0,
\]
i.e. the dominated convergence theorem does not apply here. The reason
for this is that there is no integrable map $g$ such that $|f_n| \le g$ $\lambda_{[0,1]}$-a.e. for all $n \in \mathbb{N}$.
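The failure of domination can be made quantitative. The sketch below computes the integrals of $f_n$ exactly and, as an additional illustration that is not part of the script, integrates the pointwise supremum $\sup_n f_n$, which is the smallest candidate for a dominating function. The helper names are hypothetical.

```python
from fractions import Fraction


def f_n(n, x):
    """f_n(x) = n * 1_[0, 1/n](x)."""
    return n if 0 <= x <= Fraction(1, n) else 0


def integral_f_n(n):
    """Integral of f_n over [0, 1]: height n times interval length 1/n."""
    return n * Fraction(1, n)


def envelope_integral(big_n):
    """Integral of sup_n f_n over (1/(big_n + 1), 1].

    On (1/(k+1), 1/k] the supremum equals k, so each piece contributes
    k * (1/k - 1/(k+1)) = 1/(k+1).
    """
    return sum(k * (Fraction(1, k) - Fraction(1, k + 1)) for k in range(1, big_n + 1))


print([integral_f_n(n) for n in (1, 10, 100)])
print(float(envelope_integral(10)), float(envelope_integral(1000)))
```

The envelope integrals are partial sums of the harmonic series minus 1 and therefore grow without bound, so any $g$ with $|f_n| \le g$ for all $n$ cannot be integrable.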
8 Product measures
In order to model a sequence of either successive or simultaneous random
experiments, product σ-algebras and product measures are used.
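Definition 8.1. Let $(\Omega, \mathcal{A})$ and $(F, \mathcal{B})$ be measurable spaces. The system
\[
\mathcal{A} \otimes \mathcal{B} := \sigma(\{A \times B : A \in \mathcal{A}, B \in \mathcal{B}\})
\]
\[
\begin{aligned}
\Pi_\Omega &: \Omega \times F \to \Omega, & \Pi_\Omega(x, y) &= x, \\
\Pi_F &: \Omega \times F \to F, & \Pi_F(x, y) &= y.
\end{aligned}
\]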
Examples 8.3.
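(i) Let $\Omega = F = \{0, 1\}$ and consider the σ-algebras $\mathcal{A} = \{\emptyset, \{0, 1\}\}$ on $\Omega$
and $\mathcal{B} = \mathcal{P}(\{0, 1\}) = \{\{0\}, \{1\}, \emptyset, \{0, 1\}\}$ on $F$, respectively. Then (with
$A \times \emptyset = \emptyset = \emptyset \times A$ for any set $A$)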
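Lemma 8.4. Let $(\Omega, \mathcal{A})$ and $(F, \mathcal{B})$ be measurable spaces and let $\mathcal{S}$ and $\mathcal{T}$
be generators of $\mathcal{A}$ and $\mathcal{B}$, respectively. Then any of the following systems
is a generator of $\mathcal{A} \otimes \mathcal{B}$:
1. $\{A \times F : A \in \mathcal{A}\} \cup \{\Omega \times B : B \in \mathcal{B}\}$,
2. $\{S \times F : S \in \mathcal{S}\} \cup \{\Omega \times T : T \in \mathcal{T}\}$,
3. $\{S \times T : S \in \mathcal{S}, T \in \mathcal{T}\}$, if $\Omega \in \mathcal{S}$ and $F \in \mathcal{T}$.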
Remarks 8.5.
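(i) For measurable spaces $(\Omega_i, \mathcal{A}_i)$, $i = 1, \dots, n$, where each $\mathcal{A}_i$ has a generator
$\mathcal{S}_i$ with $\Omega_i \in \mathcal{S}_i$ we get by iteration
\[
\begin{aligned}
\mathcal{A}_1 \otimes \dots \otimes \mathcal{A}_n &= \sigma(\{A_1 \times \dots \times A_n : A_i \in \mathcal{A}_i\}) \\
&= \sigma(\{S_1 \times \dots \times S_n : S_i \in \mathcal{S}_i\}).
\end{aligned}
\]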
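The system $\mathcal{A}_1 \otimes \dots \otimes \mathcal{A}_n$ is called product σ-algebra on $\prod_{i=1}^n \Omega_i$. If we
consider the projections $\Pi_j : \prod_{i=1}^n \Omega_i \to \Omega_j$ given by $\Pi_j((x_1, \dots, x_n)) = x_j$ it
holds that
\[
\mathcal{A}_1 \otimes \dots \otimes \mathcal{A}_n = \sigma\Big( \bigcup_{i=1}^n \Pi_i^{-1}(\mathcal{A}_i) \Big).
\]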
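Example 8.6. Consider again $\Omega_1 = \Omega_2 = \{0, 1\}$ with $\mathcal{A}_1 = \{\emptyset, \{0, 1\}\}$ and
$\mathcal{A}_2 = \mathcal{P}(\{0, 1\})$. Then $\Pi_1^{-1}(\mathcal{A}_1) = \{\Pi_1^{-1}(A) : A \in \mathcal{A}_1\}$. Since $\mathcal{A}_1$ only
contains the empty set and $\Omega_1$ it suffices to consider the two pre-images
\[
\begin{aligned}
\Pi_1^{-1}(\emptyset) &= \{(x_1, x_2) \in \{0, 1\}^2 : x_1 \in \emptyset\} = \emptyset, \\
\Pi_1^{-1}(\Omega_1) &= \{(x_1, x_2) \in \{0, 1\}^2 : x_1 \in \Omega_1\} = \{0, 1\}^2.
\end{aligned}
\]
For $\Pi_2$ we get
\[
\Pi_2^{-1}(\{0\}) = \{(0, 0), (1, 0)\}, \qquad \Pi_2^{-1}(\{1\}) = \{(0, 1), (1, 1)\}
\]
and hence
\[
\Pi_2^{-1}(\mathcal{A}_2) = \{\{(0, 0), (1, 0)\}, \{(0, 1), (1, 1)\}, \emptyset, \{0, 1\}^2\}.
\]
Altogether,
\[
\begin{aligned}
\mathcal{A}_1 \otimes \mathcal{A}_2 &= \sigma(\Pi_1^{-1}(\mathcal{A}_1) \cup \Pi_2^{-1}(\mathcal{A}_2)) \\
&= \sigma(\{\emptyset, \{(0, 0), (1, 0)\}, \{(0, 1), (1, 1)\}, \{0, 1\}^2\}) \\
&= \{\emptyset, \{(0, 0), (1, 0)\}, \{(0, 1), (1, 1)\}, \{0, 1\}^2\}.
\end{aligned}
\]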
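On a finite set the generated σ-algebra can be computed by brute force, which gives a way to double-check this example. The following sketch closes the generator under complements and pairwise unions until nothing new appears; the function name `generated_sigma_algebra` is a hypothetical helper, and the approach only works for finite sets.

```python
from itertools import product


def generated_sigma_algebra(universe, generator):
    """Smallest system containing the generator that is closed under
    complements and (finite) unions; on a finite set this is sigma(generator)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(s) for s in generator}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            candidates = [universe - a] + [a | b for b in current]
            for c in candidates:
                if c not in sets:
                    sets.add(c)
                    changed = True
    return sets


omega = set(product([0, 1], repeat=2))                            # {0, 1}^2
preimages_1 = [set(), omega]                                      # pre-images under the first projection
preimages_2 = [{(0, 0), (1, 0)}, {(0, 1), (1, 1)}, set(), omega]  # pre-images under the second projection

sigma = generated_sigma_algebra(omega, preimages_1 + preimages_2)
print(len(sigma))
```

The result contains four sets, so events that depend on the first coordinate, such as the singleton containing only $(0, 0)$, are not measurable here.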
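Theorem 8.7. Let $(\Omega, \mathcal{A}, \mu)$ and $(F, \mathcal{B}, \nu)$ be measure spaces, where $\mu$ and
$\nu$ are σ-finite. Then there is a unique measure $\pi$ on $\mathcal{A} \otimes \mathcal{B}$ such that
\[
\pi(A \times B) = \mu(A)\,\nu(B)
\]
for all $A \in \mathcal{A}$, $B \in \mathcal{B}$.
Definition 8.8. The measure $\pi$ on $\mathcal{A} \otimes \mathcal{B}$ is called the product measure
of $\mu$ and $\nu$. We write $\pi = \mu \otimes \nu$.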
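with $\mu(A^c) = \nu(B^c) = 0$ and the functions $g_1$ and $g_2$ are integrable over $A$
and $B$, respectively. Moreover,
\[
\int_{\Omega \times F} f\,d(\mu \otimes \nu) = \int_A \Big( \int_F f(x, y)\,d\nu(y) \Big) d\mu(x) = \int_B \Big( \int_\Omega f(x, y)\,d\mu(x) \Big) d\nu(y).
\]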
is finite, then all of the above integrals are finite and they coincide. Moreover,
it holds that $f \in L^1(\Omega \times F, \mathcal{A} \otimes \mathcal{B}, \mu \otimes \nu)$ and the statement of Fubini's
theorem applies.
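For finite sets all integrals are finite sums, so the equality of the double integral and the two iterated integrals can be checked directly. The sketch below uses hypothetical measures `mu` and `nu` and an illustrative function `f` that also takes negative values; the product measure of a single point is the product of the two weights.

```python
from fractions import Fraction

# Hypothetical finite measures on Omega = {0, 1} and F = {0, 1, 2}.
mu = {0: Fraction(1, 2), 1: Fraction(3, 2)}
nu = {0: Fraction(1, 3), 1: Fraction(2, 3), 2: Fraction(1)}


def f(x, y):
    """Illustrative function on the product space."""
    return (x + 1) * (y - 1)


# Integral w.r.t. the product measure, using (mu ⊗ nu)({(x, y)}) = mu({x}) * nu({y}).
double_integral = sum(f(x, y) * mu[x] * nu[y] for x in mu for y in nu)

# Iterated integrals: inner integral over y first, then over x, and the other way round.
x_outer = sum(mu[x] * sum(f(x, y) * nu[y] for y in nu) for x in mu)
y_outer = sum(nu[y] * sum(f(x, y) * mu[x] for x in mu) for y in nu)

print(double_integral, x_outer, y_outer)
```

In the finite case the interchange is just a reordering of a finite sum; the integrability condition of the theorem is what justifies the same step in general.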