Jaroslav Lukeš, Jan Malý - Measure and Integral

Measure
and
Integral
Jaroslav Lukeš
Jan Malý
matfyzpress
PRAGUE 2005
All rights reserved, no part of this publication may be reproduced or transmitted in any
form or by any means, electronic, mechanical, photocopying or otherwise, without the
prior written permission of the publisher.
© Jaroslav Lukeš, Jan Malý, 2005

© MATFYZPRESS by publishing house of the Faculty of Mathematics and Physics
Charles University in Prague, 2005
ISBN 80-86732-68-1
ISBN 80-85863-06-5 (First edition)
Motto: Everybody writes and nobody reads
L. Fejér
Preface
This text is based on lectures in measure and integration theory given by the
authors during the past decade at Charles University, and on preliminary lecture
notes published in Czech.
It is impossible to thank individually all colleagues and students who assisted
in the preparation of this manuscript, but we will just mention Michal Kubeček
who helped with the translation and TEX processing.
The authors wish to express their deep gratitude to Professor Stylianos Negre-
pontis, who was the chief coordinator of TEMPUS project JEP–1980. Without
support from him and the Tempus programme the manuscript would never have
appeared.
The preparation of this manuscript was partially supported by the grant
No. 201/93/2174 of the Czech Grant Agency and by the grant No. 354 of the
Charles University.
Prague, 1994 Jaroslav Lukeš and Jan Malý
Preface to the second edition
We have carried out only minor corrections. We wish to thank all who con-
tributed by suggestions and comments.
Prague, 2005 Jaroslav Lukeš and Jan Malý

Contents
List of Basic Notations and Frequently Used Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
A. Measures and Measurable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1. The Lebesgue Measure
2. Abstract Measures
3. Measurable Functions
4. Construction of Measures from Outer Measures
5. Classes of Sets and Set Functions
6. Signed and Complex Measures
B. The Abstract Lebesgue Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
7. Integration on R
8. The Abstract Lebesgue Integral
9. Integrals Depending on a Parameter
10. The Lp Spaces
11. Product Measures and the Fubini Theorem
12. Sequences of Measurable Functions
13. The Radon-Nikodým Theorem and the Lebesgue Decomposition
C. Radon Integral and Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
14. Radon Integral
15. Radon Measures
16. Riesz Representation Theorem
17. Sequences of Measures
18. Luzin’s Theorem
19. Measures on Topological Groups
D. Integration on R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
20. Integral and Differentiation
21. Functions of Finite Variation and Absolutely Continuous Functions
22. Theorems on Almost Everywhere Differentiation
23. Indefinite Lebesgue Integral and Absolute Continuity
24. Radon Measures on R and Distribution Functions
25. Henstock–Kurzweil Integral
E. Integration on Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
26. Lebesgue Measure and Integral on Rn
27. Covering Theorems
28. Differentiation of Measures
29. Lebesgue Density Theorem and Approximately Continuous Functions
30. Lipschitz Functions
31. Approximation Theorems
32. Distributions
33. Fourier Transform
F. Change of Variable and k-dimensional Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
34. Change of Variable Theorem
35. The Degree of a Mapping
36. Hausdorff Measures
G. Surface and Curve Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
37. Integral Calculus in Vector Analysis
38. Integration of Differential Forms
39. Integration on Manifolds
H. Vector Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
41. Vector Measures
42. The Bochner Integral
43. The Dunford and Pettis Integrals
Appendix on Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
A Short Guide to the Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Subject Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
1
List of Basic Notations and Frequently Used Symbols
In this manuscript we use the standard notation.

In all what follows, N, Z, Q, R, C will denote the sets of all natural, integer,
rational, real and complex numbers, respectively.
Extended real number set R consists of R together with two symbols −∞ and
+∞ equipped with the usual algebraic structure and topology. Remind only that
0 · ±∞ and ±∞ · 0 are taken as 0.
If X is a set, P(X) denotes the collection of all its subsets.
Rn stands for the Euclidean n-dimensional space under the usual Euclidean
norm |·| and the metric |x − y|, where x = [x1 , . . . , xn ].
Remember that when multiplied by a matrix (from the left), the vector x =
[x1 , . . . , xn ] behaves like a “vertical vector”, i.e. like a matrix with one column
⎛ ⎞
x1
.
⎝ .. ⎠ .
xn
The horizontal notation is preferred for estetical and typographical reasons.

The standard (or canonical ) basis of the space Rn is denoted by {e1 , . . . , en },
the vector ei = [0, . . . , 0, 1, 0, . . . , 0] with 1 at the ith place. The inner product in
Rn is denoted by x · y.
By U (x, r) we denote the open ball in a metric space (P, ρ) of radius r round the
x. The closed ball is denoted by B(x, r). Thus U (x, r) = {y ∈ P : ρ(x, y) < r},
B(x, r) = {y ∈ P : ρ(x, y) ≤ r}. For the diameter of a set we use the symbol
”diam” and for the distance of a pair of sets the symbol ”dist”.
If nothing else is specified, a function on a set X is a mapping of X into R. If
we want to emphasise that a function does not attain the values −∞ and +∞,
we call it a real function.
Instead of the notation {x ∈ X : f (x) > a} we often use the abbreviated version
{f > a}.
The symbol cA denotes the indicator function of a set A ⊂ X, i.e. the function

1 for x ∈ A ,
cA (x) =
0 for x ∈ X \ A.
We use fj ⇒ f to denote the uniform convergence of a sequence of functions.

2 1. The Lebesgue Measure
A. Measures and Measurable Functions

1. The Lebesgue Measure
In the history, people were engaged in the problem of measuring lenghts, areas
and volumes. In mathematical formulation the task was, for a given set A, to
determine its size (”measure”) λA. It was required that the volume of a cube or
the area of a rectangle or a circle should agree with the well-known formulae. It
was also clear by intuition that this measure should be positive and additive, i.e.
it should satisfy the equality

λ Aj = λAj
provided {Aj } is a finite disjoint collection of sets. For a succesful development

of the theory a further condition was imposed: The above equality was claimed
to hold even for countable disjoint collections of sets. Moreover, the effort was
paid to assign a measure to as many sets as possible.
Now, we are going to show how to proceed on the real line. The same approach
will be used later in the Euclidean space Rn where the proofs will be given.
1.1. Outer Lebesgue Measure. For an arbitrary set A ⊂ R, define
∞
∞

λ∗ A := inf{ (bi − ai ) : (ai , bi ) ⊃ A}.
i=1 i=1
The value λ∗ A (which can also be +∞) is called the outer Lebesgue measure of a
set A.
1.2. Properties of the Outer Lebesgue Measure. One can see immediately
that λ∗ A ≤ λ∗ B if A ⊂ B and that the measure of a singleton is 0, and without
much effort it becomes clear that λ∗ I is the length of I in case of I interval of any
type (see Exercise 1.6). Then it is relatively easy to prove that the outer Lebesgue
measure is translation invariant: If A ⊂ R and x ∈ R, then λ∗ A = λ∗ (x + A).
Another important property is the σ-subadditivity:
∞
∞

λ∗ ( Aj ) ≤ λ∗ Aj .
j=1 j=1
In mathematical terminology, the prefix σ usually relates to countable unions and δ to countable
intersections.
The question of whether λ∗ is an additive set function has a negative answer:
There are disjoint sets A, B with
λ∗ (A ∪ B) < λ∗ A + λ∗ B
(cf. 1.8), and we need to find a family of sets (as large as possible) on which the
measure λ∗ is additive. This task will be solved later in Chapter 4 in a much more
general case. Now we just briefly indicate one of its possible solutions in case of
the Lebesgue measure.
A. Measures and Measurable Functions 3
1.3. Lebesgue Measurable Sets. Let A be a subset of a bounded interval I.

Defining the “inner measure” λ∗ A = λ∗ I − λ∗ (I − A), it is natural to investigate
the collection of sets for which λ∗ A = λ∗ A (cf. Exercise 1.7). This leads to the
following definition.
We say that a set A ⊂ R is (Lebesgue) measurable if λ∗ I = λ∗ (A∩I)+λ∗ (I \A)
for every bounded interval I ⊂ R. The collection of all measurable sets on
R will be denoted by M. Not every set is measurable as will be seen in 1.8.
The set function M → λ∗ M , M ∈ M is denoted by λ and called the Lebesgue
measure. Thus, on measurable sets, the set functions λ∗ and λ coincide but for
nonmeasurable ones only λ∗ is defined.
Another important property of the measure λ is contained in the following
theorem which is now presented without proof.
1.4.
Theorem.

(a) If M1 , M2 , . . . are elements of M, then also M1 \ M2 ,
Mn and Mn are elements of M. If, in addition, the sets Mn are pairwise
disjoint, then
λ Mn = λMn .
n n
(b) Intervals of any type are in M.

1.5. Remark. The ingenuity of Lebesgue’s approach to the measure consists in considering
the countable covers of a set A with intervals. If in the definition of λ∗ A we consider only finite
covers, we get the notion of so-called Jordan-Peano content. In modern analysis this notion is
far from being as important as the Lebesgue measure.
1.6. Exercise. If I ⊂ R is an interval (of any type), show that λ∗ I is its length.
Hint. It is sufficient to consider the case I = [a, b]. Clearly λ∗ [a, b] ≤ b − a (since [a, b] ⊂
S
∞
(a − ε, b + ε)). Suppose (ai , bi ) ⊃ [a, b]. A compactness argument yields the existence of an
i=1
S
n
index n satisfying (ai , bi ) ⊃ [a, b]. Using induction (with respect to n) it can be shown that
i=1
P
n
b−a≤ (bi − ai ).
i=1
1.7. Exercise. For every bounded set A ⊂ R, define
λ∗ A := λI − λ∗ (I \ A)
where I is a bounded interval containing A. Show that:

(a) the value of λ∗ A does not depend on the choice of I;
(b) a bounded set A ⊂ R is measurable if and only if λ∗ A = λ∗ A;
(c) a set M ⊂ R is measurable if and only if its intersection with each bounded interval is
measurable.
In the next part of this chapter we introduce some significant sets on the real
line.
1.8. A Nonmeasurable Set. Now we prove the existence of a nonmeasurable subset of R
and consequently prove that the outer Lebesgue measure cannot be additive.
Set x ∼ y if x − y is a rational number. It is easy to see that ∼ is an equivalence relation
on R. Therefore R splits into an uncountable collection of pairwise disjoint classes. A set V
belongs to this collection if and only if V = x + Q for some x ∈ R. By the axiom of choice,
there exists a set E ⊂ (0, 1) that shares exactly one point with each set V ∈ . We show that
E is not in M.
Let {qn } be a sequence containing all rational numbers from the interval (−1, +1). It is not
very difficult to show that the sets En := qn + E are pairwise disjoint and that
[
(0, 1) ⊂ En ⊂ (−1, 2).
n
S P
Assuming that E ∈ M, then also En ∈ M and Theorem 1.4 gives λ En = λEn . Distin-
n n
guishing two cases λE = 0 and λE > 0 we easily obtain the contradiction.
1.9. Remarks. 1. The proof of the existence of a nonmeasurable set is not a constructive
one (it uses the axiom of choice for an uncountable collection of sets). We return to the topic
of nonmeasurable sets in Notes 1.22.
2. By a simple argument, an even stronger proposition can be proved: Any measurable set
M ⊂S R of a positive measure contains a nonmeasurable subset. It is sufficient to realize that
M = q∈Q M ∩ (E + q) where E is the nonmeasurable set from 1.8 and that any measurable
subset of E is of zero (Lebesgue) measure.
3. Van Vleck [1908] “constructed” a set E ⊂ [0, 1] for which λ∗ E = 1 and λ∗ E = 0.
1.10. Exercise. Show that every countable set S is of measure zero.
S
∞
Hint. Consider covers (rj − ε2−j , rj + ε2−j ) where {rj } is a sequence of all elements of the
j=1
set S. The assertion also follows from Theorem 1.4 if you realize that singletons have measure
zero.
1.11. Examples of Sets of Measure Zero. (a) The set Q of all rational numbers is
countable, thus by Exercise 1.10 it has Lebesgue measure zero.
(b) It can be seen from the hint to the exercise that for every k ∈ N there is an open set
T
∞
Gk such that Q ⊂ Gk and λ∗ Gk ≤ 1/k. The set Gk has also Lebesgue measure zero, it is
k=1
dense and uncountable (even residual).
1.12. Cantor Ternary Set. Consider the sequence { n } of finite collections of intervals
defined in the following way: 0 = {[0, 1]}, 1 = {[0, 13 ], [ 23 , 1]}. In each step we construct n
from n−1 as the collection of all closed intervals which are the left or right third of an interval
from the collection n−1 (the middle thirds are omitted). Then n is a collection of 2n disjoint
−n
T 3 . Let Kn denote the union of the collection n . The
closed intervals, each of them of length
Cantor ternary set 1 C is defined as Kn . It is not difficult to verify that C consists precisely
n
P
∞
of points of the form ai 3−i where each ai is 0 or 2. Roughly speaking, in the Cantor set
i=1
there are exactly those points of the interval [0, 1] whose ternary expansions do not contain the
digit 1. The Cantor set has the following properties:
(a) C is a compact set without isolated points;
(b) C is a nowhere dense (and totally disconnected) set;
(c) C is an uncountable set;
(d) the Lebesgue measure of C is zero.
1.13 Discontinua of a Positive Measure. If we construct a set D ⊂ [0, 1] like the Cantor
set except that we always omit intervals of length ε3−n where ε ∈ (0, 1) (note that their centres
are not the same as those in the construction of the Cantor set), we get a closed nowhere dense
set, for which λD = 1 − ε. Sets having this property are called the discontinua of a positive
measure. Another construction: If G is an open subset of the interval (0, 1), containing all
rational points of this interval and λG = ε < 1 then [0, 1] \ G is a discontinuum of measure 1 − ε.
1 sometimes also called the Cantor discontinuum

1.14. Exercise. Prove that there exists a non-Borel subset of the Cantor set and realize that
this set is Lebesgue measurable.
Hint. The cardinality argument shows that the set of all Borel subsets of the Cantor set has
cardinality of the continuum while the set of all its subsets has greater cardinality.
Instead of this, the following idea can be used. Define
κ(t) := inf{x ∈ [0, 1] : f (x) = t}
where f is the Cantor singular function from 23.1. Show that κ is increasing on the interval [0, 1],
and therefore it is a Borel function. Suppose E is a nonmeasurable subset of [0, 1], B := κ(E).
Then B (as a subset of the Cantor set) is a measurable set. But since κ−1 (B) = E (and κ is a
Borel function), B cannot be a Borel set.
1.15. Lebesgue Measure on Rn . In the same way as for R, we introduce

the Lebesgue measure on Rn . Recall that by an interval in Rn we understand
an arbitrary Cartesian product of n one-dimensional intervals. If I := (a1 , b1 ) ×
· · · × (an , bn ) is an open interval, we define its volume as
vol I = (b1 − a1 ) · . . . · (bn − an ).
In the same way we define vol I for intervals of other types. Given an arbitrary
set A ⊂ Rn , define the outer Lebesgue measure of A as the quantity
∞
∞

λ∗ A = inf{ vol Ik : Ik ⊃ A, Ik is an open interval}.
k=1 k=1
We say that a set A ⊂ Rn is measurable if λ∗ T = λ∗ (A ∩ T ) + λ∗ (A \ T ) for

every set T ⊂ Rn . (By analogy with the one-dimensional case we should require
this equality to hold just for bounded intervals T . We have chosen the present
definition in order to apply the general approach of Chapter 4. Soon we show
that there is no difference between these two definitions.) The symbol M again
denotes the collection of all measurable subsets of Rn . For M ∈ M we denote by
λM := λ∗ M the n-dimensional Lebesgue measure of a set M .
1.16. Theorem. If {Aj } is a sequence of (arbitrary) sets of Rn , then

∞ ∞
λ∗ Aj ≤ λ∗ Aj .
j=1 j=1
Proof. The assertion follows from Theorem 4.3.

1.17.
Theorem. If M1 , M2 , . . . are elements of M, then also M1 \ M2 , Mn
and Mn are elements of M. If, in addition, the sets Mn are pairwise disjoint,
then
λ Mn = λMn .
n n
Proof. The assertion follows from general Theorem 4.5.

Compare the following theorem with Exercise 1.6.
1.18. Theorem. If I ⊂ Rn is a bounded interval, I ⊂ Qj where {Qj } is a

j
sequence of open intervals, then

vol I ≤ vol Qj .
j
Thus the n-dimensional Lebesgue measure λ∗ I is equal to the volume vol I.

Proof. Suppose J is a compact interval contained in I. There exists a p such that
the intervals {Q1 , . . . , Qp } cover J. The interval J can be now divided into a
finite number of non-overlapping n-dimensional intervals {Ji } (distinct elements
of {Ji } have disjoint interiors) in such a way that the interior of each interval Ji
is contained in some of the intervals Qj . Then

p ∞

vol J = vol Ji ≤ vol Qj ≤ vol Qj .
i j=1 j=1
Since the difference vol I − vol J can be arbitrarily small, the assertion follows.
1.19. Theorem. (a) Any open subset of Rn is measurable.
(b) If λ∗ A = 0, then A is measurable.
Proof. The proof of part (b) is obvious; we will prove (a). First we prove that each
interval H which is a halfspace (e.g. of the form (−∞, c) × Rn−1 ) is measurable.
Choose a “test” set T , λ∗ T < ∞, and ε > 0. There exist open intervals {Qj }
with
Qj ⊃ T and vol Ij < λ∗ T + ε.
j j
Now find open intervals Ij and Jj such that

Ij ∪ Jj = Qj , Qj ∩ H ⊂ Ij , Qj \ H ⊂ Jj and λ∗ Ij + λ∗ Jj < λ∗ Qj + ε2−j .
Then
λ∗ (T ∩ I) + λ∗ (T \ I) ≤ vol Ij + vol Jj ≤ λ∗ T + ε.
j j
We proved the measurability of all intervals H of the form of a halfspace. Now,

each open set can be expressed as a countable union of intervals and each interval
is a finite intersection of intervals which are halfspaces.
1.20. Theorem. If A ⊂ Rn , then
λ∗ A = inf {λG : G open, G ⊃ A}.
Proof. One inequality follows from the monotonicity of λ∗ . Now if λ∗ A < ∞ and
ε > 0, then there exist open intervals Ij ⊂ Rn such that

A⊂ Ij and λ Ij ≤ vol Ij < λ∗ A + ε.
j j j
The reader should compare the following theorem and Exercise 15.19.
1.21. Theorem. Given a set M ⊂ Rn , the following are equivalent:

(i) M is measurable;
(ii) for every bounded interval I, λ∗ I = λ∗ (I ∩ M ) + λ∗ (I \ M );
(iii) for every ε > 0 there exists an open set G ⊃ M with λ∗ (G \ M ) < ε;
(iv) there exists a Gδ -set D ⊃ M such that λ∗ (D \ M ) = 0;
(v) there exist an Fσ -set Bi and a Gδ -set Be such that Bi ⊂ M ⊂ Be and
λ∗ (Be \ Bi ) = 0.
Proof. The implication (i) =⇒ (ii) is trivial. Assuming (ii), fix ε > 0 and denote
Ik = (−k, k)n . By Theorem 1.20 we can find open sets Gk and Hk such that
Ik ∩M ⊂ Gk , Ik \M ⊂ Hk , λGk ≤ λ∗ (Ik ∩M )+2−k ε and λHk ≤ λ∗ (Ik \M )+2−k ε.
We can assume that Gk and Hk are subsets of Ik . Then we have Gk \M ⊂ Gk ∩Hk .
Using (ii) and the measurability of open sets we obtain
λIk +λ(Gk ∩Hk ) = λGk +λHk ≤ λ∗ (Ik ∩M )+λ∗ (Ik \M )+2−k+1 ε ≤ λIk +2−k+1 ε.

Set G = Gk . Then
k
∞

λ∗ (G \ M ) ≤ λ(Gk ∩ Hk ) ≤ 2ε
k=1
so that (iii) holds. That (iii) implies (iv) is evident. It is not very difficult to
prove the implication (iv) =⇒ (v). If M satisfies (v), then M = Bi ∪ (M \ Bi )
where the sets Bi and M \ Bi are measurable by Theorem 1.19 (each one for a
different reason), so that (v) =⇒ (i).
1.22. Notes. Originally, H. Lebesgue defined the outer measure on the real line using
countable covers formed by intervals, exactly as explained in the text. He defined measurability
as in Exercise 1.7.
At the end of the last century, various attempts to define the length or area of geometrical
figures appear; in the works of G. Peano [1887] and C. Jordan [1892] even the “measures” of
more complicated sets are considered.
The existence of a Lebesgue nonmeasurable set is very closely connected to the axiom of
choice (for uncountable collections of sets) and the assertion that such sets exist was first proved
by G. Vitali [*1905]. Solovay’s result [1970] says that there exist models of the set theory (of
course not satisfying the axiom of choice) in which every subset of real numbers is Lebesgue mea-
surable. The existence of a nonmeasurable set can be proved (assuming various set conditions)
in other ways as well. Constructions of Bernstein’s sets (still assuming the axiom of choice) as
examples of nonmeasurable sets are also interesting. Another construction of a nonmeasurable
set (the axiom of choice again) based on results of the graph theory comes from R. Thomas
[1985]. Using nonstandard methods, it is possible to prove the existence of a nonmeasurable set
assuming the existence of ultrafilters (a weaker form of the axiom of choice; cf. M. Davis [*1977]).
Recently, M. Foreman and F. Wehrung [1991] proved that the existence of a nonmeasurable set
follows from the Hahn-Banach Theorem (which is again a weaker assumption than the axiom
of choice).
Let us note that the Lebesgue measure can be extended to a “translation invariant” mea-
sure defined on a wider σ-algebra than is the collection of all Lebesgue measurable sets. The
construction can be found e.g. in S. Kakutani and J.C. Oxtoby [1950]. However, the Lebesgue
measure cannot be extended in a reasonable way to the collection of all subsets of Rn .
It is interesting that in R or R2 there exist finitely additive extensions of the Lebesgue
measure to the collection of all subsets which can also be invariant with respect to translations
8 2. Abstract Measures
and rotations. This was first proved by S. Banach [1923]. However, this result cannot be
transferred to spaces of higher dimensions as follows from the famous result of S. Banach and
A. Tarski [1924]:
If U and V are arbitrary (!) bounded and open sets in the space Rn , n ≥ 3, then there S exist
S
sets E1 , . . . , Ek and F1 , . . . , Fk such that Ei ∩ Ej = ∅ = Fi ∩ Fj for i
= j, U = Ei , V = Fi
and Ej are isometric copies of Fj .
In this theorem, which is known as the Banach-Tarski paradox, in general all the sets Ei and
Fi cannot be measurable; realize that U , V can be of different measures. More information is
contained in S. Wagon [*1985].
2. Abstract Measures
In this chapter we study an abstract notion of measure which stands as a basis
for modern integration theory. Also in probability theory, the notion of measure
(termed a “probability” there) plays a crucial role. Among many fields of analysis
which employ measures in an essential way, let us mention e.g. functional analysis,
theory of function spaces and theory of distributions, or mathematical modelling
of physical quantities.
So far we have the only nontrivial example of the Lebesgue measure. Further
important examples of measures will be introduced later.
Remember that the Lebesgue measure is not defined on the collection of all
subsets of R but on its subcollection which is closed under countable operations.
We start with the following definition.
2.1. σ-algebras. A collection S of subsets of a given set X is called a σ-algebra
if
(a) X ∈ S ;
(b) if A ∈ S , then X \ A ∈ S ;

∞
(c) if An ∈ S , then An ∈ S .
n=1
The pair (X, S ) is called a measurable space.
Clearly every σ-algebra is also closed under countable intersections, under dif-
ferences and contains the empty set.
Not every collection of sets is a σ-algebra. However, if T is an arbitrary family
of subsets of X, then there exists the smallest σ-algebra σ(T ) which contains T .
Such a σ-algebra is simply the intersection of all σ-algebras (in X) which contain
T . It surely exists, since there is at least one such a σ-algebra (the σ-algebra
P(X) of all subsets of X) and the intersection of any collection of σ-algebras is
again a σ-algebra. The collection σ(T ) is called the σ-algebra generated by T .
2.2. Examples. One of the most important examples is the σ-algebra M of all Lebesgue
measurable sets on the real line. Another important class of examples yields Borel σ-algebras
from 2.3. For illustration, we add a few simple examples. Suppose X is an arbitrary set. Then
(a) {∅, X} is a σ-algebra;
(b) the collection (X) of all subsets of X is a σ-algebra;
(c) {A ⊂ X : A is countable or X \ A is countable} is a σ-algebra.
2.3. Borel Sets. Let P be a topological space. The σ-algebra B(P ) generated
by the family of all open subsets of P is called the Borel σ-algebra of P ; its
elements are called Borel sets. The Borel σ-algebra B(P ) contains all closed
sets, all countable intersections of open sets (these sets are called Gδ sets), all
countable unions of closed sets (Fσ sets), all countable unions of Gδ sets (Gδσ
sets), all countable intersections of Fσ sets (Fσδ ) and so on. Let us note that for
the complete description of all possible “types” we would have to use (in nontrivial
cases) all countable ordinal numbers.
2.4. Measures. Let S be a collection of subsets of a set X. A nonnegative set
function μ : S → [0, ∞] is called a measure if
(a) S is a σ-algebra;
(b) μ∅ = 0;
each sequence {An } of pairwise disjoint sets from S , μ( An ) =

(c) for
μAn .
The triplet (X, S , μ) is termed a measure space.
From (b) and (c) it immediately follows that each measure is monotone (if
A, B ∈ S , A ⊂ B, then μA ≤ μB); the property (c) is also called the σ-additivity
of a measure.
We say that a measure is finite if μX < +∞, σ-finite if there exist sets Mn ∈ S

∞
such that μMn < +∞ and X = Mn . If μX = 1, then we say that μ is a
n=1
probability measure. A measure μ is said to be complete if whenewer B ∈ S is a
null set and A ⊂ B, then also A ∈ S (and μA = 0).
If μ is a measure on X and E ∈ S , we define μE A = μ(A ∩ E) for A ∈ S .
A related notion is the restriction μ|A of a measure μ to the set A ∈ S : If
we denote by SA the σ-algebra {M ∈ S : M ⊂ A} of subsets of A, we define
μ|A (M ) = μ(M ) for M ∈ SA . Finally, if T ⊂ S is a σ-algebra of subsets of X,
then the symbol μ|T denotes the measure E → μE, E ∈ T .
2.5. Examples. (a) The Lebesgue measure on the collection of all Lebesgue measurable sets
in Rn . This measure is complete, σ-finite, but not finite.
(b) Counting measure. Let X be an arbitrary set, = (X) the collection of all subsets
of X and (
the number of elements of A if A is finite,
μA =
+∞ if A is infinite.
The counting measure is complete; it is σ-finite if and only if X is countable, and finite just
when X is finite.
(c) The Dirac measure. Again, let X be an arbitrary set, x ∈ X, = (X). We define
(
1, if x ∈ A,
μA =
0, if x ∈ X \ A.
The measure μ is called the Dirac measure at x and it is denoted by εx . The Dirac measure is
a complete probability measure.
(d) Trivial measures. If is an arbitrary σ-algebra of subsets of X and μA = 0 for all
A ∈ , then μ is an example of a finite measure which is not complete provided
= (X).
2.6. Properties of Measures. Let (X, S , μ) be a measure space. Then the

the following propositions hold:

(a) if A1 , A2 , · · · ∈ S , A1 ⊂ A2 ⊂ . . . , then μ ( An ) = lim μAn ;

10 2. Abstract Measures

(b) if A1 , A2 , · · · ∈ S , A1 ⊃ A2 ⊃ . . . , and μA1 < ∞, then μ ( An ) =
lim μAn ;

(c) if A1 , A2 , · · · ∈ S , then μ ( An ) ≤ μAn .
Proof. (a) Since A1 , A2 \ A1 , A3 \ A2 , . . . are pairwise disjoint, we get

∞
∞
∞

μ An = μ A1 ∪ (An+1 \ An ) = μA1 + μ(An+1 \ An )
n=1 n=1 n=1

k−1
= lim μA1 + (μAn+1 \ An ) = lim μAk .
k k→∞
n=1
(b) By (a), we obtain

∞
∞
∞

μ An = μA1 − μ A1 \ An = μA1 − μ A1 \ An
n=1 n=1 n=1
= μA1 − lim μ(A1 \ An ) = lim μAn .
n→∞ n→∞
(c) It is sufficient to consider the sequence
B1 = A1 , B2 = A2 \ A1 , B3 = A3 \ (A1 ∪ A2 ), . . . ,

notice that Bn = An and that the sequence {Bn } is pairwise disjoint.

2.7. Completion of Measures. Now we return to the notion of completeness
of a measure. We show that every measure space can be “extended” to a complete
measure space.
Let (X, S , μ) be a measure space and let N denote the collection of all sets
A ⊂ X for which there is a B ∈ S such that μB = 0 and A ⊂ B. Further, let S
denote the collection of all sets of the form M ∪ N where M ∈ S and N ∈ N .
We define a set function μ on S by
μ(M ∪ N ) = μM, M ∈ S, N ∈ N .
Clearly the value of μE does not depend on the choice of M and N . The set
function μ on S is called the completion of μ.
2.8. Theorem. S is a σ-algebra containing S and μ is a complete measure
on S which coincides with μ on S .
Proof. Obviously, S ⊂ S . Since both S and N are closed under countable
unions, the same is true for S . If E ∈ S , there are M ∈ S , N ∈ N and B ∈ S
so that N ⊂ B, μB = 0 and E = M ∪N . Then X \E = (X \ (M ∪ B))∪(B \N ) ∈
S . It is easy to verify that μ is a complete measure on S and μ = μ on S .
2.9. Remark. Note that there is a variety of extensions of a measure to a complete measure.
The completion described above is uniquely determined and in some sense minimal (cf. Exercise
2.13).
2.10. Exercise. Let f be a nonnegative function on a set X and a σ-algebra on X. For

B∈ we define
X
μB := f (x).
x∈B
(Recall that by definition

8 9
X <X =
f (x) = sup f (x) : K ⊂ B, K finite .)
K : ;
x∈B x∈K
Show that μ is a measure on (so-called weighted counting measure).

(a) In a particular case = (X) and f = 1 on X we get the counting measure.
(b) If = (X), z ∈ X and f = c{z} , then μ is the Dirac measure at z.
2.11. Exercise. (a) The trivial measure on = {∅, R} is not complete. What is its
completion?
(b) The Lebesgue measure on R considered on the σ-algebra of Borel sets (see Exercise
1.14.a) is not complete.
(c) Suppose X = {1, 2, 3, 4}, = {∅, {1, 2}, {3, 4}, X}. Let ω be a measure on (X) such
that
ω{1} = ω{2} = 0, ω{3} = ω{4} = 1.
If μ = ω|S , then μ is not complete. If
= ∪ {{1}, {2}, {1, 3, 4}, {2, 3, 4}}, ν = ω|T ,
then ν is complete. What is the completion of μ?

2.12. Exercise. Prove that the completion of the Lebesgue measure considered on the
σ-algebra of all Borel sets in R is the Lebesgue measure (on the collection of all Lebesgue
measurable sets; cf. Theorem 26.1).
2.13. Exercise. Suppose (X, , ν) is the completion of (X, , μ). If μ1 is a complete measure
on a σ-algebra 1 such that ⊂ 1 and μ = μ1 |S , show that ⊂ 1 and ν = μ1 |T .
2.14. Exercise (Borel-Cantelli lemma). Let (X, , μ) be a measure space and An ∈ . If
P
∞ T ∞
∞ S
μAn < +∞, then μ(lim sup An ) = 0 (we define lim sup An := Ak ).
n=1 n=1 k=n
2.15. Exercise (Darboux property). Let (X, , μ) be a measure space. If μ does not have
an atom, show that {μA : A ∈ } = [0, μX]. (A set A ∈ is called an atom if μA > 0 and for
each set B ∈ , B ⊂ A, either μB = 0 or μ(A \ B) = 0.)
Hint. Fix 0 < α < μX and set = {A ∈ : μA ≥ α}. There exists a set C ∈ such that
μA = μC for all A ∈ , A ⊂ C. In a similar way find D ∈ , D ⊂ C, μD ≤ α with the
following property: μB = μD for each set B ∈ for which D ⊂ B ⊂ C and μB ≤ α. Because
C \ D cannot be an atom, it follows μC = μD = α.
2.16. Notes. In [1895] and [*1898], E. Borel extended the length of intervals to a set function
(measure) defined on the collection of all Borel sets. However, H. Lebesgue was the first who
created the theory of the integral with the help of this measure. J. Radon in [1913] defined a
general notion of (Borel) measures in Euclidean spaces. A.N. Kolmogorov introduced in [*1933]
the axiomatic theory of probability measures.
The Darboux property of real-valued measures is a special version of the general Lyapunov
theorem (A. Lyapunov [1940]) which says that the range of a finite-dimensional nonatomic vector
measure is a convex compact set.
12 3. Measurable Functions
As mentioned in the first chapter, there are serious reasons why the Lebesgue
measure is not defined on the collection all subsets of Rn . Likewise, one cannot
expect a reasonable theory of integration on the class of all functions; it will be
necessary to confine to some reasonable family of functions. This class of course
should contain the indicator functions of all measurable sets and should be closed
under all common (algebraic as well as limit) operations. As we will see, these
requirements are satisfied by the following natural definition.
In what follows, we assume that S is a σ-algebra of subsets of a given set X.
3.1. Measurable Functions. Suppose D ∈ S . A function f : D → R is said
to be S -measurable on D if {x ∈ D : f (x) > α} ∈ S for each α ∈ R.
A complex function on D is S -measurable if its real and imaginary parts are
S -measurable.
3.2. Examples. (a) If is a σ-algebra of all subsets of X, then every function on X is
-measurable.
(b) Only the constant functions are -measurable if = {∅, X}.
(c) If is the σ-algebra of all Borel sets of a topological space X, then -measurable
functions are called shortly Borel functions on X.
3.3. Remark. Since any σ-algebra is closed for complementation, a function f is -
measurable if and only if {f ≤ α} ∈ for each α ∈ R. Because
∞ j
\ ff
1
{f ≥ α} = f >α− ,
n=1
n
it is possible to replace the condition {f > α} ∈ by {f ≥ α} ∈ ) in the definition of

-measurability. Other equivalent conditions are stated in Exercise 3.6.
3.4. Theorem. Let f , g be S -measurable functions on X. Then

(a) {x ∈ X : f (x) < g(x)} ∈ S ;
(b) f −1 (+∞), f −1 (−∞) ∈ S .
Proof. Since

{f < g} = ({f < r} ∩ {g > r}),
r∈Q
∞

{x ∈ X : f (x) = +∞} = {f > n},
n=1
the assertion easily follows.

3.5. Properties of S -measurable Functions. Let f , g, fn be S -measurable
functions (with possibly different domains in S ), λ ∈ R and let ϕ be a continuous
function on an open set G ⊂ R. Then the following functions are S -measurable
where defined (and their definition domains are in S ):
(a) λf , f + g, max(f, g), min(f, g), |f |, f g, f /g;
(b) sup fn , inf fn , lim sup fn , lim inf fn and lim fn ;
(c) ϕ ◦ f .
Proof. We will just indicate the ideas of some of the proofs leaving the others as
an exercise for the reader. The assertion (c) is obvious.
(a) Suppose α ∈ R. Then {f + g > α} = {f > α − g} (and the function α − g
is S -measurable). The functions |f |, f 2 and 1/g are S -measurable by (c). Then
we can use the formulae
1
max(f, g) = (f + g + |f − g|),
2
1
fg =((f + g)2 − f 2 − g 2 ).
2
(b) As a hint let us just note that

{sup fn > α} = {fn > α}.
n
3.6. Exercise. (a) Show that a real-valued function f is -measurable if and only if
f −1 (G) ∈ for each open set G ⊂ R or, if and only if f −1 (B) ∈ for each Borel set B ⊂ R.
S
Hint. Since each open set G ⊂ R can be expressed in the form G = (an , bn ) (where (a, b) =
(−∞, b) ∩ (a, +∞)), we can see that a function f is -measurable if and only if f −1 (G) ∈
for each open set G ⊂ R (note that {f > α} = f −1 ((α, +∞))!). Then consider the collection
{B ⊂ R : B Borel, f −1 (B) ∈ },
and show that it is a σ-algebra containing all open sets.
(b) The characterization given in (a) cannot be used for functions having infinite values
unless we introduce the notions of open or Borel subsets of R.
3.7. Exercise. Let {fn } be a sequence of -measurable functions. Show that the sets
{x ∈ X : lim fn (x) exists } and {x ∈ X : lim fn (x) exists and is finite }
are in .
3.8. Simple Functions. By a simple function on X we understand a (finite)
linear combination of indicator functions of sets from S . In other words, a real-
(or complex-)valued function f is simple if f is S -measurable and f (X) is a finite
n
set. Thus every simple function is of the form λi cAi where λi are numbers and
i=1
Ai ∈ S . Note that this expression is not uniquely determined!
3.9. Theorem. Let f ≥ 0 be an S -measurable function on X. Then there
exists a sequence {fn } of nonnegative simple functions on X such that fn f .
Proof. For n ∈ N and k = 1, 2, . . . , n2n set

k−1 k
Fn,k = x ∈ X : ≤ f (x) < n
2n 2
and define k−1
2n , if x ∈ Fn,k ,
fn (x) =
n if x ∈ X \ Fn,k .
k
It is easily seen that fn are simple and fn f .
3.10. Exercise. Suppose f , fn have the same meaning as in the previous theorem. Show
that fn ⇒ f on every set on which f is bounded.
3.11. Exercise. If f is an -measurable function, then there exists a sequence of simple

functions {fn } such that |fn | ≤ |f | and fn → f on X.
Most frequently we meet the concept of measurability when additionally a
measure is in consideration on a given σ-algebra. So suppose in what follows
that (X, S , μ) is a measure space. The following notion is of great importance in
Lebesgue’s integration theory, since usually the null sets are negligible.
3.12. Almost Everywhere. We say that a function h is defined μ-almost
everywhere (briefly almost everywhere) on X if its domain D ∈ S satisfies μ(X \
D) = 0. Suppose f and g are functions defined almost everywhere on X. We say
that f (x) ≤ g(x) for μ-almost all x ∈ X, or that f ≤ g μ-almost everywhere if
there exists a set N such that μN = 0 and for all x ∈ X \ N we have f (x) ≤ g(x).
Similarly we understand the expressions “almost everywhere” and “almost all” in
other contexts, e.g. when speaking about the equality of functions or about the
convergence almost everywhere.
3.13. μ-measurable functions. We say that a function f defined on D ∈ S
is μ-measurable on X if μ(X \ D) = 0 and f is SD -measurable on D.
Let us emphasize that we distinguish strictly between “μ-measurable functions”
(defined in general only almost everywhere) and “S -measurable functions on X”
(defined everywhere on X).
3.14. Equality almost everywhere. The relation “f = g almost everywhere”
is clearly an equivalence relation on the set of all functions (or all μ-measurable
functions) on X. The following observations are quite useful.
(a) Let μ be a measure on (X, S ) and f be a μ-measurable function defined
on D ∈ S . Then there exists an S -measurable function g on X such that f = g
on D; in particular, we can take

f on D,
g=
0 on X \ D.
(b) If μ is a complete measure on (X, S ) and f is a μ-measurable function,

then every function g which is equal to f almost everywhere is μ-measurable.
3.15. Exercise. Find an example of a measure space on which there exists an -measurable
function f and an -nonmeasurable function g such that f = g almost everywhere.
3.16. Exercises. Suppose (X, , μ) is the completion of a measure space (X, , μ).
(a) Show that an everywhere defined function f is -measurable if and only if there exist
-measurable functions g, h such that g ≤ f ≤ h on X and g = h μ-almost everywhere in X.
Hint. The proof of one implication is easy. Let f be -measurable. By Exercise 3.11 find
a sequence of simple ( -measurable) functions {fn } such that fn → f on X. Then find -
measurable functions gn , hn such that gn ≤ fn ≤ hn and gn = hn μ-almost everywhere and
set
g := lim sup gn , h := lim inf hn .
(b) If f is an -measurable function on X, then there exists an -measurable function g

such that f = g μ-almost everywhere.
Hint. The assertion is obvious for indicator functions of sets from , and therefore also for
-measurable simple functions. Then use Exercise 3.11.
3.17. Exercise. Let {fn } be a sequence of -measurable functions on X which converges to

a function f μ-almost everywhere. Prove that
(a) if the measure μ is complete, then also f is -measurable;
(b) if f is defined on X and is -measurable, then there exists a sequence {fn∗ } of -
measurable functions such that fn = fn∗ μ-almost everywhere and fn∗ → f everywhere on X;
(c) if the measure μ is complete, then a function f is -measurable if and only if there exists
a sequence of simple functions which converges to f μ-almost everywhere.
3.18. Images and Preimages of Measurable Sets. We know that real-valued -
measurable functions are exactly those for which the preimages of Borel sets are in . To
illustrate the situation, let us present the following example for the Lebesgue measure on R.
(a) Let g(x) = 21 (x + f (x)) where f is the Cantor singular function from 23.1. Then g is
a continuous and increasing function mapping the interval [0, 1] on [0, 1]. If C is the Cantor
set and ϕ is the inverse function to g on [0, 1], then ϕ−1 (C) is a (Lebesgue) measurable set of
positive measure. As remarked in 1.9.2 there exists a nonmeasurable set E ⊂ ϕ−1 (C). Finally,
for M := ϕ(E) we have M ⊂ C, so that M is measurable while ϕ−1 (M ) = g(M ) = E is a
nonmeasurable set. Let us note that this M is not even a Borel set (compare also with Exercise
1.14.a).
(b) Without presenting a proof, we mention that the continuous image of a Borel set on R
is always measurable but not necessarilly Borel.
4. Construction of Measures from Outer Measures
In this chapter, we will construct measures from the so-called outer measures.
This is a very useful method used already before when we introduced the Lebesgue
measure in Rn . We also fulfil our promise to prove propositions from the first
chapter.
4.1. Outer Measure. By an outer measure on a set X we understand a set
function γ which assigns to every set A ⊂ X a nonnegative number γA (real or
+∞) such that the following conditions are satisfied:
(a) γ∅ = 0;

⊂ B, then
(b) if A γA ≤ γB;
(c) γ ( An ) ≤ γAn .
The property (c) is called the σ-subadditivity of an outer measure.
4.2. Examples of Outer Measures. (a) The outer Lebesgue measure in Rn .
(b) The Hausdorff outer measure from Chapter 36.
(c) The counting measure.
(d) If (X, , μ) is a measure space, then the set function μ∗ : A → inf{μM : M ∈ :M ∈
, M ⊃ A} is an outer measure. See also Exercise 4.8.
An important example of creating an outer measure is contained in the follow-

ing theorem.
4.3. Theorem. Let G be a collection of subsets of a set X containing ∅, and
ν : G → [0, +∞] be a set function on G with ν∅ = 0. For A ⊂ X set
∞
∞

ν̃A = inf{ νGn : Gn ∈ G , Gn ⊃ A}
n=1 n=1
(note that inf ∅ = +∞). Then ν̃ is an outer measure.

16 4. Construction of Measures from Outer Measures
Proof. We only have to verify that ν̃ is σ-subadditive. So suppose A ⊂ An ,

n
ν̃An < +∞. Fix ε > 0 and find Gjn ∈ G so that
ε
Gjn ⊃ An and νGjn < ν̃An + .
j j
2n
Then
Gjn ⊃ A and ν̃An ≥ ν̃A − ε.
n,j n
4.4. γ-measurable Sets. A set M ⊂ X is said to be γ-measurable (in the

sense of Carathéodory) if
γT = γ(T ∩ M ) + γ(T \ M )
for each “test” set T ⊂ X (in other words, if M splits additively each set in X).
The collection of all γ-measurable sets will be denoted by M(γ). To prove that a
set M is γ-measurable, it is sufficient to verify the inequality
γT ≥ γ(T ∩ M ) + γ(T \ M )
for any set T with γT < +∞.

4.5. Theorem. M(γ) is a σ-algebra and γ is a complete measure on M(γ).
Proof will be divided into a few steps.
(a) It is straightforward to check that X ∈ M(γ), that X \ M ∈ M(γ) provided
M ∈ M(γ), and that A ∈ M(γ) whenever γA = 0. Suppose M, N ∈ M(γ). We
would like to show that also M ∩ N ∈ M(γ). So choose a test set T ⊂ X. Then
γT = γ(T ∩ M ) + γ(T \ M )
and
γ(T ∩ M ) = γ(T ∩ M ∩ N ) + γ((T ∩ M ) \ N ).
Now we use the test set T \ (M ∩ N ) and γ-measurability of M to get
γ(T \ (M ∩ N )) = γ((T ∩ M ) \ N ) + γ(T \ M ).
Thanks to last three equalities, it follows that
γT = γ(T ∩ (M ∩ N )) + γ(T \ (M ∩ N )).
Since M(γ) is closed under complements and finite intersections, it is closed also
under finite unions.
(b) In order to show that γ is σ-additive on M(γ), choose Mn ∈ M(γ) pairwise

disjoint. Setting T = M1 ∪ M2 and using γ-measurability of M1 we obtain
γ(M1 ∪ M2 ) = γM1 + γM2 .
Thus γ is finitely additive. Further
∞
∞

k
k
γMn = lim γMn = lim γ Mn ≤γ Mn ,
k k
n=1 n=1 n=1 n=1
and since the reverse inequality always holds we reach the conclusion.

∞
(c) Let now Mn ∈ M(γ) be pairwise disjoint. Our aim is to show that Mn ∈
n=1
M(γ). Choosing a test set T ⊂ X, we have
∞

k k
k
γT = γ T \ Mn + γ T ∩ Mn ≥ γ T \ Mn + γ(T ∩ Mn )
n=1 n=1 n=1 n=1
for each k ∈ N. Since γ is σ-subadditive, it readily follows that

∞
∞

γT ≥ γ T \ Mn + γ T ∩ Mn ,
n=1 n=1
which is what we wanted.

4.6. Exercise. Let γ be a nonnegative function on (X), γ∅ = 0. Show that the collection
of all “γ-measurable sets” forms a σ-algebra and that γ is additive on it.
Hint. It only needs to examine where monotonicity and σ-subadditivity of the outer measure
in the proof of Theorem 4.5 is used.
4.7. Exercise. We say that an outer measure γ is regular if for each set A ⊂ X there exists
M ∈ M(γ) such that A ⊂ M and γA = γM .
(a) Let γ be a regular outer measure and A1 ⊂ A2 ⊂ A2 ⊂ . . . an increasing sequence of
S
sets. Show that γ ( An ) = lim γAn .
(b) Show that the outer Lebesgue measure is regular.
(c) There exists a decreasing sequence {Mn } of subsets of [0, 1] such that λ∗ Mn = 1 and
T
Mn = ∅ (compare with (a)).
n
4.8. Exercise. Let (X, , μ) be a measure space and A ⊂ X. Set
∗
μ A := inf{μM : M ∈ , A ⊂ M }, μ∗ A := sup{μM : M ∈ , M ⊂ A}.
(a) Suppose μ∗ A < ∞. Show that A ∈ M(μ∗ ) if and only if μ∗ A = μ∗ A.

(b) Show that ⊂ M(μ∗ ) and μ = μ∗ on .
4.9. Notes. Carathéodory’s characterization of measurable sets appears first in [*1918].
5. Classes of Sets and Set Functions
This chapter will be devoted to a study of various classes of sets and set func-
tions from the point of view of measure theory. We prove theorems on extensions
of set functions to larger families of sets as important steps in constructing mea-
sures.
18 5. Classes of Sets and Set Functions
5.1. Systems of sets. A family A of subsets of a set X containing ∅ is called:

(a) a semiring if
(a1) A ∩ B ∈ A for each A, B ∈ A ,
(a2) for A, B ∈ A , A ⊂ B, there exist pairwise disjoint sets C1 , . . . , Cn ∈ A

n
such that B \ A = Cj ;
j=1
(b) a ring if given A, B ∈ A , then A ∪ B, A \ B ∈ A (so that also A ∩ B ∈ A );

(c) an algebra if it is a ring and X ∈ A ;
(d) a Dynkin class if
(d1) X ∈ A ,
(d2) A \ B ∈ A for each A, B ∈ A , B ⊂ A,

∞
(d3) if An ∈ A are pairwise disjoint, then An ∈ A ;
n=1
(e) a π-system if A ∩ B ∈ A for each A, B ∈ A ;

(f) a δ-ring if A is a ring closed under countable intersections;
(g) a σ-ring if A is a ring closed under countable unions.
5.2. Premeasure. A nonnegative set function μ is called a premeasure on X if
μ is defined on a ring A of subsets of X and satisfies the following conditions:
(a) μ∅ = 0,
(b) if {A

j } ⊂ A is a sequence of pairwise disjoint sets and Aj ∈ A , then
μ ( Aj ) = μAj .
A premeasure is in fact a σ-additive set function defined on a ring of sets. We
say that a premeasure μ on X is σ-finite

if there exists a sequence Xj of sets from
A such that μXj < ∞ and X = Xj .
5.3. Examples. (a) Denote by the collection of all intervals on R including the degenerated
ones (i.e. the empty set and the singletons) and by b the collection of all bounded intervals
from . Further, let l be the collection of all intervals of the form [a, b) together with ∅, R
and the intervals of the form (−∞, b). Finally, let lb = l ∩ b .
(a1) The collections , b, b
l, l form semirings which is not true for the collection of all
open (or closed) intervals.
(a2) Finite unions of sets from b or lb form a ring. Finite unions of sets from or l
form even an algebra or l . The set function I → vol I can be (uniquely) extended to a
premeasure on or l .
(a3) Closing or l under countable unions, we do not get even a semiring: Substracting
a countable union of open intervals from [0,1] we get the Cantor set which is not a countable
union of intervals.
(b) If and are semirings on X, then the collection of all sets of the form A × B where
A∈ and B ∈ forms a semiring on X × X.
(c) Let be a semiring. Then the set of all finite disjoint unions of elements from is a
ring (in fact, the smallest ring containing ).
(d) Consider the family of all subsets of the set {1, 2, . . . , 20} with an even number of
elements and show that is a Dynkin class but not a semiring.
(e) The collection of all countable subsets of a set X is a σ-ring which for is not a σ-algebra
unless X is countable.
(f) The collection of all Lebesgue measurable sets of finite measure forms a δ-ring.
(g) Suppose that is the collection of all unions of a finite number of intervals (including
the degenerated ones) and = {A ∩ Q : A ∈ }. Then is an algebra of subsets of Q and
ν : A ∩ Q → λA is a finitely additive set function which is not a premeasure.
(h) Let be the algebra of all finite unions of intervals on (0, 1). Define a set function ν on
by the formula
(
1 if A contains some interval of the form (0, ε), ε > 0,
μ(A) =
0 in other cases.
Then ν is a finitely additive set function on which is not a premeasure.
5.4. Theorem. Let G be a ring of subsets of a set X and ν a premeasure on

G . If ν̃ is the outer measure constructed from G and ν as in Theorem 4.3, then
(a) ν̃ = ν on G ;
(b) G ⊂ M(ν̃).

Proof. Obviously ν̃ ≤ ν. Suppose G ∈ G , ν̃G < +∞ and {Gn } ⊂ G , Gn ⊃ G.

Then

νG = ν (Gn ∩ G) ≤ ν(Gn ∩ G) ≤ νGn ,
n n n
so that νG ≤ ν̃G.
Now suppose G ∈ G . Choose a test set T ⊂ X, ν̃T < +∞ and ε > 0. There exist
Gn ∈ G such that

T ⊂ Gn and νGn ≤ ν̃T + ε.
n n
Then

ν̃T + ε ≥ νGn = (ν(Gn ∩ G) + ν(Gn \ G)) ≥ ν̃(T ∩ G) + ν̃(T \ G),
n n
thus G ∈ M(ν̃).
5.5. Hopf ’s Extension Theorem. Let μ be a premeasure on a ring A of
subsets of a set X. Then there exists a measure μ̃ on σ(A ) which equals μ on A .
This extension of μ is unique provided μ is σ-finite.
Proof. The existence of a measure μ̃ is an immediate consequence of Theorems 4.3
and 5.4 (notice that σ(A ) ⊂ M(μ̃)). To prove uniqueness, under the σ-finiteness
assumptions, let ν be another measure on σ(A ), ν = μ on A . One can easily find
out (from the construction

of the outer measure μ̃) that ν ≤ μ̃ on σ(A ). Then
for Aj ∈ A , A = Aj we have
j
⎛ ⎞ ⎛ ⎞

n
n
μA = lim ν ⎝ Aj ⎠ = lim μ ⎝ Aj ⎠ = μ̃A.
n n
j=1 j=1
20 5. Classes of Sets and Set Functions
So if
E ∈ σ(A ), μ̃E < ∞, then for a given ε > 0 there exist sets Aj ∈ A ,
A = Aj such that E ⊂ A and μ̃A < μ̃E + ε. Hence
j
μ̃E ≤ μ̃A = νA = νE + ν(A \ E) ≤ νE + μ̃(A \ E) ≤ νE + ε,

thus μ̃E = νE. Finally suppose X = Xj , μXj < +∞; one can assume that Xj
j
are pairwise disjoint. If E ∈ σ(A ), then

μ̃E = μ̃(E ∩ Xj ) = ν(E ∩ Xj ) = νE.
j j
5.6. Exercise. Let be a family of subsets of a set X. Show that there exists the smallest
Dynkin class ( ) containing .
5.7. Exercise. Show that every σ-algebra is a Dynkin class. A Dynkin class is a σ-algebra if
and only if it is a π-system.
5.8. Exercise. Prove that if is a π-system, then ( ) = σ( ).
5.9. Exercise. The assumptions of Hopf’s extension theorem can be weakened. Indeed, show
that the assertion of Theorem 5.4 is still true if we only assume that μ is finitely additive and
σ-subadditive on the semiring and that μ∅ = 0.
Hint. First note that μ is monotone. Then proceed as in Theorem 5.4 and show that σ( ) ⊂
j
(μ̃). In the essential step use the existence of sets Cn ∈ with the property Gn \ (G ∩ Gn ) =
S j
Cn .
j
5.10. Exercise. Let be a π-system on X. Show that if μ1 , μ2 are probability measures on

σ( ) which agree on , then μ1 = μ2 on σ( ).
Hint. Let := {M ∈ σ( ) : μ1 (M ) = μ2 (M )}. Show that is a Dynkin class and use Exercise
5.7.
5.11. Exercise. Consider
:= {[a, b) \ C : a, b ∈ [0, 1]}, ν([a, b) \ C) := f (b) − f (a),
where C is the Cantor set and f the Cantor function from 23.1.
Show that is a semiring and that ν is finitely additive but not σ-additive on .
5.12. Exercise. (a) Let X be an arbitrary set and n ∈ N. Consider the set function ν which
is restriction of the counting measure to the collection
n := ∅ ∪ {A ⊂ X : A has exactly n elements }
and construct the outer measure ν̃ as in Theorem 4.3. Investigate the relationship between the
collections n and M(ν̃) and compare with Theorem 5.4.
(b) Investigate the same problem in case of X = (0, 1), = (X) and
X
k [
k
νA := inf{ (bi − ai ) : (ai , bi ) ⊃ A}
i=1 i=1
(νA is the Jordan-Peano content of a set A).

5.13. Exercise. Let μ∗ be an outer measure on X constructed from the premeasure μ on an

algebra by Theorem 4.3 such that μ∗ X < +∞. For A ⊂ X set
μ∗ A := μ∗ X − μ∗ (X \ A).
Show that A ∈ M(μ∗ ) if and only if μ∗ A = μ∗ A (compare also with Theorem 5.4).
5.14. Capacity on Compact Sets. Let X be a locally compact topological space. A
real-valued nonnegative function defined on the collection (X) of all compact subsets of X
is called a Choquet capacity if it satisfies the following conditions:
(a) if K1 ⊂ K2 , then (K1 ) ≤ (K T2 );
(b) if K1 ⊃ K2 ⊃ K3 . . . , then ( Kn ) = lim (Kn );
(c) (K1 ∩ K2 ) + (K1 ∪ K2 ) ≤ (K1 ) + (K2 ) whenever Kn ∈ (X) (so-called strong
subadditivity).
An important example is the Newtonian capacity in Rn , n ≥ 3. If we define the Newtonian
potential of a Radon measure μ on Rn by
Z
μ dμ(t)
(x) := n−2
,
Rn |x − t|
we can introduce the Newtonian capacity cap K of a compact set K ⊂ Rn as
cap K := sup{μK : supt μ ⊂ K, μ

≤ 1 on Rn }.
The proof that this capacity satisfies the conditions (a), (b), (c) can be found for instance in
[KNV].
5.15. Outer Capacity. Let X be a topological space. A mapping c : (X) → [0, +∞] is
called an outer capacity provided it satisfies:
(a) if A ⊂ B, then cA ≤ cB; S
(b) if A1 ⊂ A2 ⊂ A3 ⊂ . . . , then c ( An ) = sup cAn ; T
(c) if K1 ⊃ K2 ⊃ K3 ⊃ . . . and Kn are compact, then c( Kn ) = inf c Kn .
A set A ⊂ X is said to be c-capacitable if
cA = sup{cK : K ⊂ A, K compact}.
1. If is a Choquet capacity on a locally compact space (Exercise 5.14) and
cA := inf{sup{ (K) : K ⊂ G, K compact} : G ⊃ A, G open},
prove that c is an outer capacity.

2. Suppose c is simultaneously an outer capacity and an outer measure. Investigate the relation-
ship between the notions of c-capacitability and c-measurability in the sense of Carathéodory.
Consider the cases:
(a) X is a two-points set equipped with the discrete topology and cA = 1 if A
= ∅, c ∅ = 0;
(b) X is R with the Euclidean topology and c is the outer Lebesgue measure;
(c) X is R with the discrete topology and c is again the outer Lebesgue measure;
(d) c is the outer capacity derived from the Newtonian capacity. In this case a set A ⊂ Rn
is c-measurable in the sense of Carathéodory if and only if cA = 0 or if c(Rn \ A) = 0 (this is
not quite easy, see M.M. Rao [*1987]). On the other hand, Choquet’s deep result says that each
Borel set in Rn is c-capacitable.
5.16. Notes. “Hopf’s extension theorem” is usually attributed to H. Hahn or C. Carathéodory
but it is probably originated by M. Fréchet [1915]. The proof using Carathéodory’s theorem was
given independently by H. Hahn [*1924] and A. N. Kolmogorov [*1933].
The notions of a π-system or of a Dynkin class were investigated by E. B. Dynkin in 1959 as
tools for the probability theory. However, families of similar properties were studied already by
W. Sierpiński [1928].
22 6. Signed and Complex Measures
An investigation of the theory of capacities in the classical potential theory during the period
1920–1950 is connected with the names of Ch. de la Vallée Poussin, N. Wiener, O. Frostman or
M. Brelot among others. These authors studied mainly the Newtonian capacity and examined
the role of sets of a “small” capacity. One could say that the capacity theory is a “younger
sister” of Lebesgue measure theory. In the 50’s the general capacity theory was developed and
G. Choquet proved his famous capacitability theorem. An interested reader is referred to a nice
article written by Choquet himself [1986] or [1989].
6. Signed and Complex Measures
In this chapter, we will investigate “measures” assuming also negative (or even
complex) values.
6.1. Signed Measures. Let S be a σ-algebra of subsets of X. A set function
μ : S → R is said to be a signed measure on S if
(a) μ∅= 0;
∞
∞

(b) μ An = μAn whenever An ∈ S are pairwise disjoint.
n=1 n=1
6.2. Remarks. 1. A signed measure can assume at most one of the values +∞ and
−∞. Indeed, if μE = +∞ and μF = −∞ for E, F ∈ , then μ(E ∩ F ) is finite. Therefore
μ(E \ F ) = +∞, μ(F \ E) = −∞ and the equality (b) does not hold for (disjoint) sets E \ F
and F \ E.
S P
2. If μ( An ) is finite, then the series μAn in the condition (b) converges absolutely (indeed,
this series converges — to the same value — when its terms are rearranged arbitrarily, which is
equivalent to the absolute convergence).
3. Some of basic properties of positive measures hold even for signed measures. Show that the
following assertions are true:
(a) if An ∈ , An A , then μAn → μA,
(b) if An ∈ , An A, |μA1 | < +∞ , then μAn → μA.
4. A signed measure μ is not necessarily monotone: If A ⊂ B, we can no longer assert that
μA ≤ μB.
6.3. Hahn Decomposition. We say that a set P ∈ S is positive for a signed
measure μ if μE ≥ 0 for each set E ∈ S , E ⊂ P . Analogously we define negative
sets for μ.
An ordered pair of sets (P, N ) is called a Hahn decomposition of X for μ if
(a) P ∪ N = X, P ∩ N = ∅;
(b) P is positive and N is negative (for μ).
6.4. Examples. (a) The empty set is both positive and negative for any signed measure. The
pair (X, ∅) is a Hahn decomposition for μ if and only if μ is positive.
(b) The pairs (R, ∅) , (R \ Q, Q) , (R \ {5}, {5}) are Hahn decompositions of R for the
Lebesgue measure.
R
(c) If a measure μf has a density f with respect to μ (i.e. if μf A = A f dμ where f ∈ ∗ (μ)
— cf. 8.19), then the set P := {f ≥ 0} is positive for μf and ({f ≥ 0}, {f < 0}) is a Hahn
decomposition for μf .
6.5. Hahn Decomposition Theorem. For every signed measure μ there

exists a Hahn decomposition. This decomposition is unique in the following sense:
If (P1 , N1 ) and (P2 , N2 ) are two such decompositions, then
μ(P1 ∩ E) = μ(P2 ∩ E), μ(N1 ∩ E) = μ(N2 ∩ E)
for each set E ∈ S .
Proof. We start with proof of the uniqueness assertion. Suppose that (P1 , N1 )
and (P2 , N2 ) are Hahn decompositions of X for μ. Then evidently
μ(P1 ∩ E) = μ(E ∩ (P1 ∩ P2 )) = μ(E ∩ P2 )
and, in the same way, μ(N1 ∩ E) = μ(N2 ∩ E).

To prove the existence of the decomposition, assume that μS < +∞ for all
S ∈ S . We proceed via the following steps.
Step 1: If A ∈ S , μA > −∞ and ε > 0, then there exists a set B ⊂ A such that
μB ≥ μA and B is positive. The set B will be constructed as the intersection of
a finite or infinite nested sequence {An } of subsets of A. We start with A1 = A.
In the n-th step, if An is not positive, we can set
1
kn = min k ∈ N : there exists E ⊂ An with μ(E) ≤ −
k

∞
and find An+1 ⊂ An such that μ(An \ An+1 ) ≤ − k1n . Then we set B = An .
n=1
The sets An \ An+1 are pairwise disjoint and their union is A \ B. Thus
∞
∞
1
+∞ > μ(B) = μ(A) − μ(An \ An+1 ) ≥ μ(A) + .
n=1
k
n=1 n
It follows that μ(B) ≥ μ(A) and kn → ∞. If E ⊂ B is a measurable subset, then

for each n ∈ N we have E ⊂ An , and thus μ(E) > − k1n . Hence B is positive.
Step 2: To complete the proof, set s = sup{μA : A ∈ S }. Since ∅ ∈ S , we have
s ≥ 0. There exists a sequence {Pn } ⊂ S such that μPn → s. In light of the first
step we can assume

P1 ⊂ P2 ⊂ . . . (the union of two positive sets is a positive
set!). If P := Pn , then P ⊂ A and μP = lim μPn = s < +∞. Moreover, P is
positive. Indeed, if E ⊂ P , then E ∩ Pn E and μ(E ∩ Pn ) ≥ 0. Lastly, to show
that the set N := X \ P is negative we notice that μ(E ∪ P ) = μE + μP > s for
any set E ⊂ N with μE > 0.
6.6. Variation of a Measure. Let (P, N ) be a Hahn decomposition of X for

a signed measure μ. Define
μ+ E = μ(E ∩ P ), μ− E = −μ(E ∩ N ), |μ| (E) = μ+ E + μ− E
for every set E ∈ S . By the previous theorem, μ+ E and μ− E do not depend

on the choice of the Hahn decomposition (which is not unique!). It is simply
checked that the set functions μ+ , μ− and |μ| are positive measures on S . They
are called the positive, negative and total variations of a signed measure μ. Let
us summarize our results in the following theorem.
6.7. Theorem (The Jordan decomposition of a signed measure). Let μ be a

signed measure on (X, S ). The functions μ+ , μ− and |μ| are positive measures
on S and μ = μ+ − μ− . If also μ = ν1 − ν2 where ν1 , ν2 are positive measures,
then ν1 ≥ μ+ , ν2 ≥ μ− .
Proof. It is enough to prove the last part of the theorem. But for each E ∈ S ,
one has
μ+ E = μ(E ∩ P ) = ν1 (E ∩ P ) − ν2 (E ∩ P ) ≤ ν1 (E ∩ P ) ≤ ν1 E.
An important characterization of the total variation of a signed measure, often used as its
definition, is contained in the following theorem.
6.9. Theorem. Let μ be a signed measure on (X, S ) and E ∈ S . Then

n
n
|μ| (E) = sup |μAk | : Ak ∈ S pairwise disjoint, Ak = E .
k=1 k=1
Proof. It is straightforward to check that

+
|μAk | = μ+ Ak − μ− Ak ≤ (μ Ak + μ− Ak ) = |μ| (Ak ) = |μ| (E).
k k k k
On the other hand, if (P, N ) is a Hahn decomposition of X for μ, set A1 = E ∩ P ,

A2 = E ∩ N and find out that the supremum is even attained.
6.10. Complex Measures. A complex measure on (X, S ) is a σ-additive
S → C for which μ∅ = 0. By σ-additivity we understand that
μ: ∞
set function

∞
μ Ak = μAk whenever {Ak } is a sequence of pairwise disjoint sets from
k=1 k=1
S . Let us note that, as in Remark 6.2.2, the appearing series must converge
absolutely or definitely diverge.
Each complex measure μ can be expressed uniquely in the form μ = μr + iμi
where μr and μi are (finite!) signed measures on (X, S ). In particular, each finite
signed measure can be understood as a complex measure.
If μ is a complex measure, we define its total variation |μ| by

n
n
|μ| (E) = sup |μAk | : Ak ∈ S pairwise disjoint, Ak = E .
k=1 k=1
An appeal to Theorem 6.9 reveals that this definition agrees with the previous
one for signed measures.
6.11. Theorem. Let μ be a complex measure on (X, S ). Then the total
variation |μ| is a finite positive measure on (X, S ).
Proof. Apparently, |μ| (∅) = 0 and it is simply to verify that |μ| is finitely additive.
If μ = μ1 − μ2 + i(μ3 − μ4 ) where μ1 − μ2 is the Jordan decomposition of μr and
μ3 − μ4 is the Jordan decomposition of μi (i.e. μj are positive finite measures),
then
|μ| (A) ≤ μ1 A + μ2 A + μ3 A + μ4 A,
so that |μ| (A) < +∞. If An ∈ S , An ∅, then lim μj An = 0 for each
j = 1, 2, 3, 4, and therefore |μ| (An ) → 0. A routine argument now shows that μ
is σ-additive.
6.12. Remark. Note that the range of any complex measure is always a bounded subset of
the complex plane C.
6.13. Exercise. Let μ be a signed measure on (X, ) and E ∈ . Show that μ+ E =
sup{μB : B ∈ , B ⊂ E}.
6.14.
j ∞Exercise. Let μ be a complex measure on (X, ff ) and E ∈ . Prove that |μ| (E) =
P S
∞
sup |μAk | : Ak ∈ pairwise disjoint, Ak = E .
k=1 k=1
6.15. Exercise. Let μ be a complex measure on (X, ). Prove that |μ| is the smallest
positive measure ν satisfying νA ≥ |μ| A for all A ∈ .
q
6.16 Exercise. Investigate whether |μ| = |μr | + |μi |, or |μ| = |μr |2 + |μi |2 .
6.17. Exercise. We denote by M( ) the set of all finite signed or complex measures on
(X, ). If μ ∈ M( ), let μ = |μ| (X). Show that (M( ), ·) is a Banach space (i.e. M( )
is a linear space, · becomes a norm and M( ) is complete with respect to it).
Hint. Only the completeness requires a proof. But if μn is a Cauchy sequence in M( ), then
there exists lim μn A for each A ∈ . Defining μA = lim μn A, it is not hard to show that μ is
σ-additive and μn − μ → 0.
6.18. Exercise. Suppose that μ, μn ∈ M( ), μn − μ → 0. Prove that μn A → μA for

every set A ∈ .
6.19. Exercise. Show that the set M( ) of all signed measures on (X, ) equipped with
the order
μ≤ν if μA ≤ νA for each A ∈
is a lattice (i.e. for any μ, ν ∈ M( ) there exist sup(μ, ν) and inf(μ, ν) in M( )). Notice that
sup(μ, ν) does not necessarily mean the set function A → sup(μ(A), ν(A)) — such function, in
general, is not a measure!.
Hint. Show that 1
2
(μ + ν + |ν − μ|) is sup(μ, ν) and 1
2
(μ + ν − |ν − μ|) is inf(μ, ν).
6.20. Finitely Additive Measures — Charges. A real-valued set function ν on an algebra

of sets is called a charge if
(a) ν∅ = 0;
(b) ν(A ∪ B) = νA + νB whenever A, B ∈ , A ∩ B = ∅.
If ν is a charge, define ν + A = sup{νF : F ∈ , F ⊂ A}, ν − A = − inf{νF : F ∈ , f ⊂ A},
|ν| (A) = ν + A + ν − A.
6.21. Exercise. Show that the set functions ν + , ν − and |ν| are positive finitely additive
measures on .
6.22. Exercise. Suppose X = [0, 1). Let be the collection of all finite unions of intervals
[a, b) ⊂ [0, 1). Verify that is an algebra. Set
(
1
x
for x ∈ (0, 1),
f (x) =
0 for x = 0.
S
n P
If A = [ai , bi ), define νA = (f (bi ) − f (ai )). Show that neither the “Jordan decomposition
i=1
theorem”, nor the “Hahn decomposition theorem” do hold for ν. Proceed via the following
steps:
S
(a) the definition of νA does not depend on the particular representation of A as [ai , bi );
(b) ν is a charge on ;
(c) νA ≤ 0 for A ⊂ (0, 1);
(d) ν([0, 1)) = 1;
(e) ν + ([0, 1)) = ν − ([0, 1)) = +∞.
6.23. Exercise. Let ν be a charge on . Show that ν = ν + − ν − if and only if ν is bounded
(i.e. if sup{|νA| : A ∈ } < +∞).
6.24. Notes. Signed measures were investigated by H. Lebesgue already in [1910]; he was
concerned mostly with measures given by a density (see 8.19).
The existence of a Hahn decomposition of the space and the Jordan decomposition of signed
measures were first established in a full generality by Hahn [*1921].
The monograph by K.P.S. Bhaskar Rao and M. Bhaskar Rao [*1983] is devoted to the theory
of charges.
B. The Abstract Lebesgue Integral 27
B. The Abstract Lebesgue Integral

7. Integration on R
7.1. Newton Integral. Let f be a real-valued function on an interval (a, b).

A function F on (a, b) is called an antiderivative to f if F (x) = f (x) for all
x ∈ (a, b). A real number A is called the Newton integral of f over (a, b) if there
is an antiderivative F of f on (a, b) such that A = lim F (x) − lim F (x). The
x→b− x→a+
value of the Newton integral of f on (a, b) does not depend on the choice of F
because the difference between any two antiderivatives is a constant. The Newton
b
integral is denoted by N a f .
7.2. Riemann Integral. Let f be a bounded function on a bounded interval
(a, b). If a = ξ0 < ξ1 < · · · < ξm = b is a (finite) sequence of points in [a, b] (so
called partition of the interval [a, b]), and α1 , . . . , αm a sequence of real numbers,

m
then the value αj (ξj − ξj−1 ) is called an upper Riemann sum of f if αj ≥ f
j=1
on every interval (ξj−1 , ξj ), and a lower Riemann sum of f if αj ≤ f on every
(ξj−1 , ξj ). Set
R ∗ f = inf{A : A is an upper Riemann sum of f },
R∗ f = sup{A : A is a lower Riemann sum of f }.
A function f is Riemann integrable on (a, b) provided R ∗ f = R∗ f ; the common
value is then called the Riemann integral of f over (a, b) and it is denoted by
b
R a f.
7.3. Theorem. Let f be a continuous function on an interval [a, b]. Then both
the Newton integral and the Riemann integral of f on (a, b) exist and are equal.
x
Proof. Since f is uniformly continuous, we easily obtain that R a f exists for all
x
x ∈ (a, b). A short reflection shows that x → R a f is an antiderivative of f and
this implies the existence of the Newton integral and the required equality.
7.4. Remark. Each Riemann integrable function f is absolutely Riemann integrable, i.e. |f |
is Riemann integrable as well. This assertion is no longer true for the Newton integral (consider
the integration of the function x → sin x/x over (1, +∞); the change of variables t = 1/x yields
an example on a bounded interval).
7.5. Dirichlet Function. The Dirichlet function D is defined as the indicator function of
the set of all rational numbers. Observe that D = 0 almost everywhere. The Dirichlet function
is nowhere continuous, it has neither an antiderivative (it does not have the Darboux property)
nor the Riemann integral over (0, 1) ( ∗ D = 1, ∗ D = 0).
7.6. Riemann function. The Riemann function R is zero on the set of all irrational numbers.
If r = p/q is a rational number, p and q are relatively prime and q > 0, then R(r) is defined as
1/q (zero is supposed to be 0/1, so R(0) = 1). The Riemann function is continuous at x if and
only if x is irrational. The Riemann function serves as an example of a “pathological” function
which is Riemann but not Newton integrable (it has no antiderivative) on bounded intervals.
7.7. Various Examples. The indicator function of the Cantor ternary set is Riemann
integrable. The indicator function of any discontinuum of a positive measure fails to be Riemann
integrable. An unbounded Newton integrable function cannot be Riemann integrable. There
are examples of bounded Newton integrable functions which are not Riemann integrable (cf.
7.9.f).
28 8. The Abstract Lebesgue Integral
7.8. Chebyshev’s Inequality for the Riemann Integral. If f is a nonnegative bounded

function on (a, b), ε > ∗ f and τ > 0, then the set {f ≥ τ } can be covered by a union of a
finite number of intervals, the sum of whose lengths is less than ε/τ (so λ∗ {f ≥ τ } < ε/τ ).
Hint. Find a partition a = ξ0 < ξ1 < · · · < ξm = b and numbers cj such that cj ≥ f on
P
m
(ξj−1 , ξj ) and cj (ξj − ξj−1 ) < ε. Then the sum of lengths of those intervals (ξj−1 , ξj ) for
j=1
which cj ≥ τ is less than ε/τ .
7.9. Riemann Integrable Functions. Lebesgue measure theory allows a deeper study of
the class of Riemann integrable functions.
(a) For any bounded function f on an interval (a, b) we have
Z b
∗
f = inf{ g : g piecewise constant, g ≥ f },
a
Z b
∗f = sup{ g : g piecewise constant, g ≤ f }.
a
(A function f is said to be piecewise constant on [a, b] if there exist ξj such that a = ξ0 < ξ1 <
· · · < ξm = b and f is constant on each interval (ξj−1 , ξj ).)
(b) If f is a bounded function on an interval (a, b), then the function
f ∗ := inf{g : g ≥ f, g continuous}
is called the upper Baire function of f . The definition of the lower Baire function f∗ of f is
analogous.
Show that the function f ∗ is upper semi-continuous and that f is continuous at a point x if
and only if f ∗ (x) = f∗ (x).
(c) Show that
Z b
∗
f = inf{R g : g ≥ f, g continuous }.
a
(d) Let f be a bounded function on an interval (a, b). Show that the following conditions are
equivalent:
(i) f is Riemann integrable;
R
(ii) for each ε > 0 there exist continuous functions t, s such that t ≤ f ≤ s and R ab (s−t) <
ε;
(iii) f∗ = f ∗ almost everywhere;
(iv) there exist functions u, v; u upper semi-continuous, v lower semi-continuous such that
v ≤ f ≤ u and u = v almost everywhere;
(v) the set of all points where f is discontinuous is of Lebesgue measure zero.
Hint. Use (b) and (c). For (ii) =⇒ (iii), according to 7.8, λ∗ {f ∗ − f∗ ≥ 1/k} ≤ λ∗ {s − t ≥
1/k} ≤ kε for each k ∈ N and ε > 0.
(e) Show that any Riemann integrable function is measurable.
Hint. Any semi-continuous function is measurable. Now use (d) and Theorem 3.14.b.
(f) The above condition (v) permits to construct bounded Newton integrable functions which
are not Riemann integrable. An example of Volterra type functions constructed with the aid of
closed nowhere dense sets of positive measure (Example 1.13) can be found in A.M. Bruckner
[*1978].
7.10. Notes. The origins of the integral calculus are connected with the names of I. Newton
and G.W. Leibniz. The modern integration theory has been developed from the beginning of
the 19th century by A.L. Cauchy, L. Dirichlet, B. Riemann, C. Jordan, E. Borel, H. Lebesgue,
G. Vitali and others. Some historical treatments on integration are mentioned in 8.25.
8. The Abstract Lebesgue Integral
Elementary expositions of the integration theory in usual textbooks of calculus

employ Riemann’s or Newton’s constructions. However, the examples given in
Chapter 7 show that these integrals provide rather small classes of integrable
functions. In deeper applications, we need completeness of normed linear spaces
of integrable functions, which is achieved with the aid of Lebesgue’s integration.
A further anvantage of Lebesgue’s approach consists in the possibility to integrate
over more general domains than intervals.
In the sequel, (X, S , μ) will be a measure space.
The Riemann integral of the indicator function of any interval is its length.
Accordingly,
developing the notion of an integral on X, the requirement that
X
cA dμ = μA for any A ∈ S is quite natural. Further, it is reasonable to
impose conditions on additivity and monotonicity of the integral and to ask the
family of all integrable functions to be as large as possible.
We introduce the concept of the abstract Lebesgue integral in several steps.
First we define X f dμ in a natural way for nonnegative simple functions and then
we extend it to nonnegative μ-measurable functions using their approximation by
simple functions. The general case will be completed using the decomposition of
a function into its positive and negative parts. Building up this theory, we must
be careful in order to legitimate these steps. For instance, we have to show that
the definition of X s dμ in the case of simple functions does not depend on their
representation which is not unique. Likewise, we have to deal with problems when
infinite values or “indefinite expressions” appear.
The reader may find it instructive to remember the most important example
of the n-dimensional Lebesgue measure in Rn . For integration with respect to
the Lebesgue measure λ we use a traditional notation
b
f dx := f dλ , f dx := f dλ.
E E a (a,b)
8.1. Simple Functions. Recall that a simple function is a real S -measurable

function s on X having a finite range. Any simple function s can be expressed as

n
s= βj cBj ,
j=1
where B1 , . . . , Bn ∈ S are pairwise disjoint sets and β1 , . . . , βn ∈ R. Of course,

this representation of s is not unique.
Remember again that as a matter of convenience we set 0 · ∞ = ∞ · 0 = 0.
8.2. Lemma. Let A1 , . . . , Am ∈ S and B1 , . . . , Bn ∈ S be pairwise disjoint
m n
sets and let αi and βj be nonnegative real numbers. If αi cAi ≤ βj cBj ,
i=1 j=1
then

m
n
αi μAi ≤ βj μBj .
i=1 j=1
m
Proof. To simplify the proof define α0 = β0 = 0, A0 = X \ Ai and B0 =
i=1

n
X\ Bj (then the collections {Ai }, {Bj } form “partitions” of X). For i ∈
j=1
{0, . . . , m} and j ∈ {0, . . . , n} either Ai ∩ Bj = ∅ or αi ≤ βj . Thus

m
m
n
m
n
n
αi μAi ≤ αi μ(Ai ∩ Bj ) ≤ βj μ(Ai ∩ Bj ) ≤ βj μBj .
i=0 i=0 j=0 i=0 j=0 j=0
8.3. Abstract Lebesgue Integral. If D ∈ S and s is a nonnegative simple

n
function expressed as s = βj cBj , where Bj are pairwise disjoint sets and βj are
j=1

n
nonnegative coefficients, define D
s dμ := βj μ(D ∩ Bj ). The previous lemma
j=1
shows that this value does not depend on the particular representation of s. Next,
define
f dμ := sup s dμ : 0 ≤ s ≤ f on D, s simple
D D
if f ≥ 0 is a μ-measurable function on D ∈ S . Lemma 8.2 again ensures that the

“new” definition and the “old” one agree in case when f is a simple function.

For a SD -measurable function f on D we define D f dμ := D f + dμ −
D
f − dμ provided at least one of these integrals is finite. Remember that f + :=
max(f, 0), f − := max(−f, 0).
If f is a function on X and M ∈ S , then apparently

f dμ = f · cM dμ
M X

and M f dμ = M f dν where ν = μ|M is the restriction of μ to the σ-algebra
SM := {A ∈ S : A ⊂ M } of subsets of M .
Therefore, it is no loss of generality to restrict our attention to the integration
over the whole space X.
It is useful to define the abstract Lebesgue integral even for functions defined
only μ-almost everywhere. In this case, if f is defined on D ∈ S and μ(X \D) = 0,
set
f dμ := f dμ
X D
if the integral on the right side is defined. It is immediate that this value does
not depend on the choice of D.
The symbol L ∗ or L ∗ (μ) denotes the family of all μ-measurable functions
defined μ-almost everywhere on X for which the abstract Lebesgue integral is
defined.
Further denote

∗
L = L (μ) =
1 1
f ∈ L (μ) : f dμ ∈ R .
X
If f ∈ L (μ), we say that f is μ-integrable.

1
8.4. Lebesgue Integrable Functions on R. (a) Every bounded μ-measurable function on

a bounded interval (and so every Riemann integrable function) is Lebesgue integrable. Thus,
the functions from Examples 7.5 – 7.6 are Lebesgue integrable as well.
R R
(b) If f is a Riemann integrable function on [a, b], then ab f dλ ≤ ∗ f = ∗ f ≤ ab f dλ, so
that the Lebesgue and the Riemann integrals of f coincide.
(c) If a function has both the Newton and the Lebesgue integrals, they are equal. Proof of this
is not easy. We will prove it using the Henstock-Kurzweil integral in Chapter 25.
(d) If f is Newton integrable, then f is μ-measurable since it can be expressed as the limit of a
sequence {fn } of continuous functions, in essence, fn (x) = n(F (x + n1
) − F (x)), where F is an
antiderivative to f . It can happen that f is not Lebesgue integrable, but this is only the case if
f is not absolutely Newton integrable.
(e) Since nonmeasurable functions are rather rare, the question whether or not f is Lebesgue
R R
integrable can be reduced to the question whether the integrals ab f + dx and ab f − dx are finite
or infinite. For instance, the function f : x → x is integrable on (0, 1) if and only if p > −1.
p
The key result is the following monotone convergence theorem (sometimes

called Levi’s theorem or the Lebesgue monotone convergence theorem) for non-
negative functions.
8.5. Theorem.
If fn ≥ 0 are μ-measurable functions on X, fn f , then
X
fn dμ → X
f dμ.

Proof. It is clear that the sequence X fn dμ is nondecreasing, and therefore
there exists α := lim X fn dμ. Since fn ≤ f , we have α ≤ X f dμ and the
assertion is obvious for α = +∞. Assume α < +∞, fix a simple function s,
0 ≤ s ≤ f , and prove that X s dμ ≤ α. The proof will be given in several steps:
(a) Let τ ∈ (0, 1). Define En = {x ∈ X : fn (x) ≥ τ s(x)}. Then En ∈ S , En ⊆

∞
En+1 and En = X (if f (x) = 0, then x ∈ E1 ; if f (x) > 0, then τ s(x) <
n=1
lim fn (x) ). Thus μ(En ∩ A) → μA for every set A ∈ S .
k
(b) Let s = βj cAj , where Aj are pairwise disjoint. Then
j=1
k
k
fn dμ ≥ fn dμ ≥ τ s dμ = τ ( βj cAj ) dμ = τ βj μ(Aj ∩ En ).
X En En En j=1 j=1
(c) Passing to limits in (b) (using (a)), we get

k
α≥τ βj μAj = τ s dμ .
j=1 X

Since τ ∈ (0, 1) is arbitrary, we get α ≥ X
s dμ, as needed.
8.6. Theorem. Let g ∈ L ∗ and f be a μ-measurable function, f = g almost

everywhere. Then f ∈ L ∗ and X f dμ = X g dμ.
Proof. Obvious.
8.7. Remark. For any nonnegative μ-measurable function f on X there exists a sequence
R
of simple functions sn ≥ 0 such that sn f . From Theorem 8.5 we know that X f dμ =
R
lim X sn dμ. As is often the case, this observation becomes the basis for a definition of the
integral of nonnegative μ-measurable functions. Of course, starting with such a definition, it is
R
necessary to prove the independence of X f dμ on the choice of the sequence {sn }.
8.8. Theorem. If f1 , f2 are nonnegative μ-measurable functions, then

(f1 + f2 ) dμ = f1 dμ + f2 dμ .
X X X
Proof. It might be assumed that both f1 and f2 are defined everywhere on X.

First suppose that f1 and f2 are even simple. Following the same line of proof as
in Lemma 8.2, we can find pairwise disjoint S -measurable sets A0 , . . . , Am and
B0 , . . . , Bn and nonnegative real numbers α0 , . . . , αm and β0 , . . . , βn such that

m
n
m n
X= Ai = Bj , f1 = αi cAi and f2 = βj cBj . Then
i=0 j=0 i=0 j=0

m
n
m
n
(f1 + f2 ) dμ = (αi + βj )μ(Ai ∩ Bj ) = αi μAi + βj μBj
X i=0 j=0 i=0 j=0

= f1 dμ + f2 dμ.
X X

For the general case, find sequences s1n and s2n of simple functions with
sin fi and use the first part and Theorem 8.5.
8.9. Theorem (properties of L 1 (μ)). The following propositions hold:
(a) If f ∈ L 1 , then f is finite almost everywhere.
(b) If f, g ∈ L 1 , α, β ∈ R, then αf + βg ∈ L 1 and (αf + βg) dμ =
α X f dμ + β X g dμ (an appeal to (a) reveals that αf + βg is defined
almost everywhere).
(c) If f ∈ L 1 , then |f | ∈ L 1 and X f dμ ≤ X |f | dμ (i.e. the integral is
“absolutely convergent”).
(d) L 1 is a lattice (if f, g ∈ L 1 , then max(f, g), min(f, g) ∈ L 1 ).
(e) If f is μ-measurable, g ∈ L 1 , |f | ≤ g, then f ∈ L 1 .
Proof. (a) Suppose that f ∈ L 1 , A:= {x ∈ X : f (x) = ∞}. Then
A is measurable
and 0 ≤ ncA ≤ f + for every n ∈ N. Thus 0 ≤ X ncA dμ ≤ X f + dμ and we get
μA ≤ n1 X f + dμ < ∞ for every n ∈ N. Hence μA = 0.

(b) It is plain to see that X αf dμ = α X f dμ for f ∈ L 1 and α ∈ R. Let
f, g ∈ L 1 and write f = f + − f − , g = g + − g − and h = f + g (by (a), h is defined
almost everywhere). Then
h+ − h − = f + − f − + g + − g − ,
and in light of Theorem 8.8,

− −
+
h dμ + f dμ + g dμ = +
f dμ + +
g dμ + h− dμ .
X X X X X X

The assertion now follows if we observe that the integrals X
+
h dμ and X
h− dμ
are finite. But this is true as
0 ≤ h+ = (f + g)+ ≤ f + + g + .
(c) If f ∈ L 1 , then |f | = f + + f − ∈ L 1 by (b), and we have

f dμ = f + dμ − −
f dμ ≤ f dμ + f dμ =
+ −

X
X X X X
−
= +
f dμ + f dμ = |f | dμ.
X X X
(d) The assertion follows immediately from (b) and (c) using
1
max(f, g) = (f + g + |f + g|).
2
(e) If f is a μ-measurable function, f + is μ-measurable as well and

0≤ f + dμ ≤ g dμ < ∞
X X
(0 ≤ f ≤ |f | ≤ g), thus f
+ +
∈ L . Analogously f − ∈ L 1 , and finally f =
1
f + − f − ∈ L 1.
8.10. Remarks. 1. Proposition (b) of the previous theorem shows that 1 satisfies almost
all axioms for a linear space but it is not a true linear space. To get a linear space, the set of
all finite everywhere defined functions from 1 or the space L1 (which will be introduced in
Chapter 10) is to be considered.

P
2. The sum of an absolutely convergent series j aj is the integral of the function j → aj
over the set N with respect to the counting measure. On the other hand, the abstract integration
theory cannot describe sums of nonabsolutely convergent series. This observation should not be
understood as an insufficiency of Lebesgue’s theory. The abstract Lebesgue integral described
in 8.3 provides the best approach in the general framework of measure spaces. Let us consider
a simple experiment: A rearangement of N can change the sum of a nonabsolutely convergent
series while “measures of sets” remain invariant. The sum depends on an additional structure
on N, namely, on its ordering. Analogously, taking into account the ordering of the real line (in
addition to its measure properties), various kinds of nonabsolutely convergent integrals may be
introduced. Let us mention Newton integration on an elementary level and Perron or Henstock-
Kurzweil integration on an advanced level (see Section 25). It is instructive to compare the
expressions
Z ∞ Z b
sin x sin x
dx = lim dx
1 x b→∞ 1 x
(where the integral is understood as Newtonian) and
X∞ Xn
sin k sin k
= lim .
k=1
k n→∞
k=1
k
In the theory of Lebesgue integration, limit theorems play a crucial role. They
concern either monotone convergence or dominated convergence.
8.11. Levi’s theorem. Let {f n } be a sequence of μ-measurable

functions,

fn f almost everywhere and let X f1 dμ > −∞. Then X f dμ = lim X fn dμ.
Proof. We have already proved the theorem when fn are nonnegative and fn f
everywhere. The general case can easily be reduced to Theorem 8.5. First we
redefine f and fn on sets of measure zero in such a way that fn f everywhere.
It would be clearly sufficient to assume that fn ∈ L 1 for all n. If gn = fn + f1−
(notice that f1− ∈ L 1 ), then gn are nonnegative
μ-measurable
functions and
gn f + f1− . Now, Theorem 8.5 ensures that X gn dμ → X (f + f1− ) dμ. Since

f dμ = X gn dμ − X f1− dμ, the assertion easily follows.
X n
8.12. Levi’s theorem If fn are nonnegative μ-measurable func-
for series.
tions on X, then X fn dμ = f dμ.
X n
Proof. Use Theorems 8.5 and 8.8.
8.13. Lebesgue dominated convergence theorem. Let {fn } be a sequence
of μ-measurable functions, fn → f almost everywhere. If there exists a function
∈ L such that |fn | ≤ h almost everywhere for all n, then f ∈ L and
1 1
h
X
f dμ = lim X fn dμ.
Proof. The proof may be easily reduced to Levi’s theorem by considering f =
lim sup fn = lim inf fn . Set sn = sup{fn , fn+1 , . . . }, tn = inf{fn , fn+1 , . . . }.
Then −h ≤ tn ≤ fn ≤ sn ≤ h, sn f , tn f almost everywhere and

−∞ < −h dμ ≤ t1 dμ ≤ s1 dμ ≤ h dμ < +∞.
X X X X

By
Levi’s theorem,
X
f dμ = lim X
t n dμ = lim X
sn dμ. Since t dμ ≤
X n
f dμ ≤ X sn dμ, the assertion follows.
X n
8.14. Dominated convergence theorem for series. Let {hn } be a sequence
of μ-measurable functions on X and g ∈ L 1 . Suppose that
n

hj ≤ g almost everywhere
j=1

∞
for all n ∈ N and that the series hj converges almost everywhere. Then
j=1

∞ ∞

hj dμ = hj dμ.
X j=1 j=1 X
Proof. The theorem follows immediately from the previous one.

8.15. Fatou’s lemma. Let {fn } be a sequence of μ-measurable functions and
g ∈ L 1 . If fn ≥ g almost everywhere for all n ∈ N, then

lim inf fn dμ ≤ lim inf fn dμ.
X X
Proof. Set gk = inf{fn : k ≤ n}. Then gk lim inf fn and infer on Levi’s
theorem again.

8.16. Theorem. Let f ≥ 0 be a μ-measurable function. If X f dμ = 0, then
f = 0 almost everywhere.

∞
Proof. Denote An = {x ∈ X : g(x) ≥ n1 }. Then An = {x ∈ X : f (x) > 0}
n=1
and the inequality 0 ≤ cAn ≤ nf yields μ(An ) = 0.

8.17. Theorem. Let f ∈ L 1 . If E f dμ = 0 for any measurable set E, then
f = 0 almost everywhere.

Proof. If E := {f ≥ 0}, then E f dμ = E f + dμ = 0 and using the previous the-
orem we get f + = 0 almost everywhere. Analogously f − = 0 almost everywhere.
8.18. Corollary. Let f, g ∈ L 1 . If

f dμ ≤ g dμ
E E
for every measurable set E, then f ≤ g almost everywhere.

Proof. Set h = (f − g)+ . Then h ≥ 0 and 0 ≤ E h dμ = E∩{h>0} (f − g) dμ ≤ 0.
Thanks to Theorem 8.17, (f − g)+ = 0 almost everywhere.
8.19. Indefinite Lebesgue integral. Let f ∈ L ∗ . For E ∈ S set

μf (E) := f dμ .
E
The set function μf is called the indefinite Lebesgue integral of f .
8.20. Theorem. Let f ∈ L ∗ . Then μf is a signed measure on X. Moreover,
μf (E) = 0 for every set E ∈ S with μE = 0.
that μf is a measure
Proof. It is enough to prove
provided f is nonnegative. In
this case, notice that f cA = f cAn if A = An with Ai ∩ Aj = ∅ for i = j, and
n n
cite Theorem 8.12.
8.21. Remark. A question arises whether or not for any pair of measures on , say μ and
ν, there is always a nonnegative μ-measurable function f on X such that ν = μf , i.e.
Z
νE = f dμ
E
for all E ∈ . Of course, the answer is negative; if such a function exists, then necessarily
νE = 0 whenever μE = 0. But if μ and ν obey this condition (we say that ν is absolutely
continuous with respect to μ), then there is a function f with the ascribed property (at least
for σ-finite measures). This will be the subject of the chapter concerning the Radon-Nikodým
theorem.
8.22. Exercises. Let f ∈ 1. Prove that
(a) μf is a finite signed measure on ;
˛ ˛
(b) for every ε > 0 there is a δ > 0 such that ˛μf (E)˛ < ε whenever μE < δ.
P
Hint. (a) To verify the equality μf (A) = μf (An ) (An ∈ pairwise disjoint) use Theorem
8.14 for hn = f cAn .
R
(b) Suppose there exists an ε > 0 and a sequence {En } such that μEn < 2−n and En |f | ≥ ε.
T S
∞ ∞ R
Set E = Ek . Then μE = 0 and E f dμ ≥ ε, which is a contradiction.
n=1 k=n
36 9. Integrals Depending on a Parameter
8.23. Image of a Measure. Let (X, , μ) be a measure space, (Y, ) be a measurable space
and f : X → Y a measurable mapping (i.e. f −1 (E) ∈ whenever E ∈ ). The set function
E → μ(f −1 (E)), E∈ ,
is called the image of the measure μ under the mapping f and it is denoted by f (μ).
(a) Show that f (μ) is a measure on (Y, ).
(b) Let g : Y → R be a -measurable function. Show that g is f (μ)-integrable if and only
if g ◦ f ∈ 1 . In this case Z Z
g df (μ) = g ◦ f dμ.
Y X
Hint. First consider simple functions and then pass to limits.
8.24. Exercise. Let f be an increasing differentiable function on R, lim f (x) = ±∞. Let
x→±∞
μ be a measure on (R, (R)) defined by
Z
μE = f dx.
E
Show that f (μ) = λ.

8.25. Notes. The modern integration theory starts with Lebesgue’s doctoral thesis [1902]
following a short paper [1901]. His definition uses constructed Lebesgue measure on R. The
idea of using simple functions (not equal to those introduced here) when defining the integral
(still on the real line) comes from F. Riesz ([1912], [1920]).
Further details can be found in historical notes by T. Hawkins [*1970] or by F.A. Medvedev
[1975], R. Henstock [1988], T.H. Hildebrandt [1953]. It is clear that H. Lebesgue followed investi-
gations of his predecessors (B. Riemann, C. Jordan, G. Peano, E. Borel and others). Less known
is the fact that (roughly) at the same time G. Vitali and W.H. Young published similar results
independently.
Theorem 8.11 (concerning the monotone convergence) was proved by Beppo Levi [1906],
Lemma 8.15 by P.J.L. Fatou [1906] and Theorem 8.13. by H. Lebesgue [1910].
One more remark: H. Lebesgue developed his theory of integration mainly for the case of
“Lebesgue measures” on Rn . Later J. Radon [1913] considered more general measures in Rn
(nowadays called the Radon measures). A general theory of measures on arbitrary σ-algebras was
given by M. Fréchet [1915]. Since that time, many results on this subject have been published;
the monograph [*1950] by P.R. Halmos is one of the most quoted.
9. Integrals Depending on a Parameter
The Lebesgue dominated convergence theorem has simple but very important
consequences on continuity and differentiation of integrals depending on a param-
eter.
9.1. Theorem. Let (X, S , μ) be a measure space, P a metric space and U a
neighbourhood of a point a ∈ P . Suppose that a function F : U × X → R has the
following properties:
(a) there exists a set N ⊂ X of measure zero such that for each x ∈ X \ N
the function F (·, x) is continuous at a;
(b) for each t ∈ U , the function F (t, ·) is μ-measurable;
(c) there exists a function g ∈ L 1 (X) such that |F (t, ·)| ≤ g almost every-
where for all t ∈ U .
Then for each t ∈ U , F (t, ·) ∈ L 1 (X) and the function

f : t → F (t, ·) dμ
X
is continuous at a.
Proof. The proof that

lim F (t, ·) dμ = F (a, ·) dμ
t→a X X
is achieved by showing that

lim F (tj , ·) dμ = F (a, ·) dμ
j X X
for each sequence tj → a, tj ∈ U . To prove the last assertion it suffices to use the
Lebesgue dominated convergence theorem.
9.2. Theorem. Let (X, S , μ) be a measure space, N ⊂ X a set of measure
zero and I ⊂ R an open interval. Suppose that a function F : I × X → R has the
following properties:
(a) For each x ∈ X \ N , F (·, x) is differentiable on I;
(b) for each t ∈ I, F (t, ·) is μ-measurable;
d
(c) there exists a function g ∈ L (X) such that F (t, x) ≤ g(x) for each
1
dt
x ∈ X \ N and t ∈ I;
(d) there is a t0 ∈ I such that F (t0 , ·) ∈ L 1 (X).
Then F (t, ·) ∈ L 1 (X) for all t ∈ I, the function

f : t → F (t, ·) dμ
X
is differentiable on I and

d
f (t) = F (t, ·) dμ.
X dt
Proof. Suppose a, b ∈ I, b = a, x ∈ X \ N . By the mean value theorem there

exists a ξ between a and b such that

F (b, x) − F (a, x) d
= F (ξ, x) ≤ g(x).
b−a dt
For a = t0 it follows that the function
F (b, x) − F (a, x)
x →
b−a
38 10. The Lp Spaces
is integrable, thus F (b, ·) is integrable. Choose a ∈ I again. By the Lebesgue

dominated convergence theorem,

F (tj , ·) − F (a, ·) d
lim dμ = F (t, ·) dμ
j X tj − a X dt
for each sequence tj → a of points of I \ {a}, which yields

d F (t, ·) − F (a, ·) d
F (a, ·) dμ = lim dμ = F (a, ·) dμ.
dt X t→a X t−a X dt
9.3 Remarks. 1. Notice that in the proofs of the last theorems we made heavy use of the
Lebesgue dominated convergence theorem. Of course we could state analogous results based on
Levi’s theorem.
2. Since the notions of continuity and derivative are local, it suffices to verify the assumption (c)
only locally.
3. A number of exercises and clarifying examples can be found in [L-Př].
10. The Lp Spaces
The Lebesgue spaces Lp are important means linking measure theory and func-
tional analysis, and they are essential tools in differential equations theory, prob-
ability theory and other branches of modern analysis.
In this chapter, (X, S , μ) denotes a fixed measure space and p a number from
the interval [1, ∞].
10.1. The Set L p . (a) Suppose p < ∞. We denote by L p = L p (X, S , μ) the
set of all μ-measurable functions on X such that

p
|f | dμ < ∞.
X
The value
1/p
p
f p := |f | dμ
is called the Lp -norm of a function f ∈ L p . We will show soon that ·p has all
important properties of a norm.
(b) We denote by L ∞ = L ∞ (X, S , μ) the set of all μ-measurable functions
f on X such that |f | ≤ M μ-almost everywhere for some constant M . The least
constant M having this property is called the L∞ -norm of f and is denoted by
f ∞ . Roughly speaking, the difference between f ∞ and supX |f | is that f ∞
omits values of f on null sets.
Sometimes it is useful to emphasize the measure or the space X. In this case,
abbreviated symbols L p (X) or L p (μ) will be used as well.
10.2. Young’s inequality. Let p, q ∈ (1, ∞), p1 + 1q = 1. If a, b are nonnegative

numbers, then
ap bq
ab ≤ + .
p q
Proof. We can assume ab > 0. Making use of concavity of the function ln,

ap bq 1 1
ln + ≥ ln(ap ) + ln(bq ) = ln a + ln b = ln(ab)
p q p q
holds, and we easily establish the required inequality.
10.3. Hölder’s inequality. Suppose that f ∈ L p and g ∈ L q , where p, q ∈

(1, ∞), p1 + 1q = 1. Then f g ∈ L 1 and
1/p 1/q

f g dμ ≤ |f |
p
dμ |g|
q
dμ .

X X X
Proof. Denote
1/p 1/q
p q
s= |f | dμ , t= |g| dμ .
X X
We may assume that st > 0. Thanks to Young’s inequality (a = f (x)/s, b =

g(x)/t) we have
p q
f (x)g(x) |f (x)| |g(x)| |f (x)| |g(x)|
≤ ≤ +
st st psp qtq
for each x ∈ X. Thus

p q
1 |f | dμ |g| dμ 1 1
f g dμ ≤ X
+ X
≤ + =1
st X psp qt q p q
which is what we wanted to prove.
10.4. Minkowski’s inequality. Let p ∈ [1, ∞] and f, g ∈ L p . Then f + g ∈

L p and
f + gp ≤ f p + gp .
Proof. It is not hard to verify that f + g1 ≤ f 1 + g1 .

If p = ∞, then |f | ≤ s μ-almost everywhere and |g| ≤ t μ-almost everywhere,
which implies |f + g| ≤ s + t μ-almost everywhere. Therefore f + g∞ ≤ s + t,
and consequently
f + g∞ ≤ f ∞ + g∞ .
40 10. The Lp Spaces
With these trivial cases out of the way, there remains the case 1 < p < ∞.
Hölder’s inequality yields

p−1
|f | |f + g| dμ
X
1/p 1/q
p (p−1)q
≤ |f | dμ |f + g| dμ
X X
1/p 1−1/p
p p
= |f | dμ |f + g| dμ .
X X
Analogously
1/p 1−1/p
p−1 p p
|g| |f + g| dμ ≤ |g| dμ |f + g| dμ .
X X X
This entails

p p−1 p−1
|f + g| dμ ≤ |f | |f + g| dμ + |g| |f + g| dμ
X
X 1/p
X

1/p 1−1/p
p p p
≤ |f | + |g| |f + g| dμ .
X X X
10.5. The Lp Spaces. The behavior of the function f → f p on L p , where

p ∈ [1, ∞], resembles axioms of a norm. However, in general, L p is not a linear
space and a nonzero function may have zero norm. To apply the theory of normed
linear spaces, we identify functions which are equal almost everywhere. Formally,
we assign to every function f ∈ L p the class of functions
[f ] = {g ∈ L p : g = f μ-almost everywhere on X}
and define
Lp = Lp (X, S , μ) = {[f ] : f ∈ L p }.
Then Lp is a linear space equipped with operations
[f ] + [g] := [f + g], α[f ] := [αf ] (α ∈ R),
and with the (true) norm

[f ] p := f p .
It can easily be seen that these definitions do not depend on the choice of repre-
sentatives.
It is customary not to distinguish between functions and classes, often even
between the spaces L p and Lp . For instance, we say that {fj } is a Cauchy
sequence in Lp while the meaning is that fj are functions and {[fj ]} is a Cauchy
sequence in Lp .
10.6. Completeness of Lp . Let {fj } be a Cauchy sequence in Lp . Then {fj }

is convergent in Lp , i.e. there exists an f ∈ L p such that

p
|f − fj | dμ → 0.
X
Proof. The easy case p = ∞ is left to the reader. Suppose p < ∞. First we choose
a subsequence {gj } of {fi } such that
∞

S := gj − gj+1 p < ∞.
j=1

∞
Denote h = |gj − gj+1 |. Then by Levi’s Theorem 8.12 and Minkowski’s in-
j=1
equality
⎛ ⎞p ⎛ ⎞p
∞
k
hp dμ = ⎝ |gj − gj+1 |⎠ dμ = lim ⎝ |gj − gj+1 |⎠
X X k→∞ X
j=1 j=1
⎛ ⎞p
k
≤ lim ⎝ gj − gj+1 p ⎠ ≤ S p < ∞.
k→∞
j=1
By Theorem 8.9.a there exists a set M ⊂ X such that μ(X \ M ) = 0 and h < ∞
on M . We can also assume that |g1 | < ∞ on M . For each x ∈ M , {gj (x)} is a
Cauchy sequence in R, and therefore it is convergent. Thus we can define
f (x) = lim gj (x)
j→∞
μ-almost everywhere. We prove that f ∈ L p and f − fj p → 0. The sequence

p p
{|gj | } tends to {|f | } μ-almost everywhere and
p
|gj | ≤ (|gj − g1 | + |g1 |)p ≤ (h + |g1 |)p
for all k ∈ N. An appeal to the Lebesgue dominated convergence theorem 8.13
with dominating function (h + |g1 |)p yields

p p
|f | dμ = lim |gj | dx ≤ (h + g1 )p dμ < ∞.
X j→∞ X X
We see that f is an element of L p . In a similar way by Lebesgue’s theorem

(dominating function hp )

p p
f − gj p = |f − gj | dμ → 0.
X
Since the sequence {fj } is Cauchy in Lp , we get lim fj − gj p = 0. Thus

j→∞
lim fj − f p = 0 and f is a limit of the sequence {fj } in Lp .
j→∞
10.7. Remarks. 1. In the language of functional analysis we have just proved that Lp
are Banach spaces. The characterization of their duals, which is significant for the theory, is
postponed to Exercise 13.17.
42 11. Product Measures and the Fubini Theorem
2. For a counting measure μ on a set X we write lp (X) := Lp (X, (X), μ). In particular, for
X = N we get well known spaces of sequences:
„ «1/p
P
– the space p , 1 ≤ p < ∞ of all sequences x = {xn } such that xp := |xn |p <
n
∞,
– the space ∞ of all bounded sequences x = {xn } with the norm x∞ := supn |xn |.
10.8. Exercise. Suppose fn , f ∈ ∞ . Show that fn − f ∞ → 0 if and only if there exists
a set E ∈ , μE = 0, such that fn ⇒ f on X \ E.
10.9. Exercise. Let p ∈ [1, +∞) and {fn } be a Cauchy sequence in p . A close inspection of
the proof of 10.6 shows that there is a subsequence of {fn } that converges μ-almost everywhere
in X. Compare with Theorem 12.4.
10.10. Exercise. (a) Suppose that p ∈ [1, +∞). Show that the set
X
:= { βj cBj : Bj ∈ , μBj < ∞}
j
of simple functions is dense in p.
Hint. Apparently, ⊂ p . Choose f ∈ p . Since f is finite μ-almost everywhere, Exercise

3.11 provides a sequence {fn } of simple functions such that |fn | ≤ |f | and fn → f almost
everywhere. Clearly fn ∈ and by Lebesgue’s theorem with dominating function 2p |f |p ,
fn − f p → 0.
(b) Show that the set of all simple functions is dense in ∞.
10.11. Exercise. Consider p ∈ (0, 1). Define .p and the spaces p, Lp in the same way as
in the beginning of this chapter. Show that:
(a) Lp is a linear space.
(b) The triangle inequality for .p does not hold.
Hint. Employ the indicator functions of two disjoint sets of a positive measure.
R
(c) The distance function dp (f, g) := |f − g|p dμ is a metric on Lp and (Lp , dp ) is a
complete metric linear space.
10.12. Notes. The Hölder inequality was first proved by A.L. Cauchy [*1821] in the case of
p = 2 for finite sums (so, in fact, for the counting measure on a finite set). V.Y. Bunyakowski
[1859] proved it (still for p = 2) for Riemann integrable functions and the same result was
obtained by H.A. Schwarz [1885]. For general p but still for finite sums, the inequality was
proved by L.J. Rogers [1888] and by O. Hölder [1889]. The definitive form given in 10.3 comes
from F. Riesz [1910], where Minkowski’s inequality is proved as well. The original result for
finite sums is due to H. Minkowski [1907].
The theory of Lp -spaces was originated by F. Riesz [1906] who defined the L2 -metric and
E. Fischer [1907] who proved the completeness of L2 . F. Riesz then defined Lp for other values
of p as well and proved their completeness. The notion of a Banach space has its roots in papers
by S. Banach [1922] and N. Wiener [1922]. This period culminated when Banach’s fundamental
monograph [*1932] appeared.
11. Product Measures and the Fubini Theorem
In this chapter, we will consider the following problem: Given two measure
spaces (X, S , μ) and (Y, T , ν), we wish to define a product measure τ on an
appropriate σ-algebra U of subsets of the Cartesian product X ×Y . If M = S ×T
where S ∈ S , T ∈ T (such sets are called measurable rectangles), we require
τ M = μS · νT .
An important example is that of Lebesgue measure in R2 which should arise as
the product of one-dimensional Lebesgue measures.
11.1. Product σ-algebra. First we introduce the notion of the product σ-

algebra. If S and T are some σ-algebras, the product σ-algebra S ⊗ T is the
σ-algebra generated by the collection of all measurable rectangles. Thus, S ⊗ T
is the smallest σ-algebra which contains all sets of the form S × T where S ∈ S
and T ∈ T .
For M ⊂ X × Y , x ∈ X, let
M x := {y ∈ Y : [x, y] ∈ M } .
The set M x is called the x-section of M . Analogously, define the y-section
My = {x ∈ X : [x, y] ∈ M }
for y ∈ Y .
11.2. Lemma. If M ∈ S ⊗ T and x ∈ X, then M x ∈ T (and, of course,
similarly My ∈ S for y ∈ Y ).
Proof. Denote A := {E ∈ S ⊗ T : E x ∈ T }. A moment’s reflection shows
that A contains all measurable

rectangles.

A routine argument yields that
x
(X × Y \ E) = Y \ E x and ( Eα )x = Eαx . We see that A is a σ-algebra,
α α
and therefore A = S ⊗ T .
11.3 Monotone Classes. A family M of subsets of a set Z is a monotone class
if it obeys the following conditions:

(a) if M1 ⊂ M2 ⊂ M3 ⊂ . . . , Mn ∈ M , then Mn ∈ M ;

(b) if M1 ⊃ M2 ⊃ M3 ⊃ . . . , Mn ∈ M , then Mn ∈ M .
Every monotone class which is simultaneously an algebra forms a σ-algebra.
The following idea, used already in the proof of the previous lemma, will repeat
in the proofs of subsequent theorems:
If A is the family of sets from S ⊗ T for which a certain proposition holds
and if A contains all measurable rectangles, then A = S ⊗ T provided A is a
σ-algebra. It is not always so easy to verify that A a σ-algebra. Often it is easier
to prove that A is only a monotone class and an algebra (mostly using Levi’s
theorem). But even in this case A = S ⊗ T as follows from the next theorem.
11.4. Theorem. If R is an algebra of sets, then the smallest monotone class
containing R is σ (R).
Proof. Let M denote the smallest monotone class containing R (it does exist!).
It is sufficient to show that M is an algebra. To this end, it is enough to prove
that M ⊂ KE := {B : E \ B, B \ E, B ∪ E ∈ M } for each E ∈ M . We fix a set
F and proceed in the following steps:
(a) KF is a monotone class;
(b) R ⊂ KF for F ∈ R;
(c) M ⊂ KF for F ∈ R;
(d) if E ∈ M , then R ⊂ KE . (Let F ∈ R; by (c), E ∈ KF , and so F ∈ KE .)
Therefore R ⊂ M ⊂ KE by the definition of M .
The basis for a definition of the product measure is in the following proposition.
44 11. Product Measures and the Fubini Theorem
11.5. Lemma. Let (X, S , μ) and (Y, T , ν) be σ-finite measure spaces, E ∈

S ⊗ T . Then the function x → ν(E x ) is S -measurable (E x ∈ T by the previous
lemma), the function y → μ(Ey ) is T -measurable and

ν(E x ) dμ(x) = μ(Ey ) dν(y).
X Y
Proof. Let A be the collection of all sets E ∈ S ⊗ T such that all assertions
hold for E. If R denotes the family of all finite disjoint unions of measurable
rectangles, it is straightforward to check that R is an algebra. The assertion will
follow from the preceding theorem, provided we can prove that A is a monotone
class containing R. We merely outline the main steps and invite the reader to fill
in the details.
(a) A contains all measurable rectangles.

n
(b) R ⊂ A (it suffices to show that Ej ∈ A whenever Ej ∈ A are pairwise
j=1
disjoint).

(c) If En ∈ A , E1 ⊂ E2 ⊂ . . . , then En ∈ A (use Levi’s theorem and

properties of sections).

(d) If En ∈ A , E1 ⊃ E2 ⊃ . . . , then En ∈ A (consider first the case when μ
and ν are finite measures and proceed as in (c)).
11.6. Product Measure. Let (X, S , μ) and (Y, T , ν) be σ-finite measure
spaces. A measure τ on S ⊗ T is called a product measure of μ and ν (denoted
by μ ⊗ ν) if
τ (A × B) = μA · νB
whenever A ∈ S , B ∈ T . Based on the following main theorem, we can claim
that the product operation ⊗ is properly defined.
11.7. Theorem. Let (X, S , μ) and (Y, T , ν) be again σ-finite measure spaces.
Then there exists a unique measure τ on S ⊗ T such that
τ (A × B) = μA · νB
whenever A ∈ S and B ∈ T .
Proof. Uniqueness is almost obvious. Indeed, if τ1 and τ2 are measures on S ⊗ T
satisfying requirements of the theorem, then the family D = {E ∈ S ⊗T : τ1 E =
τ2 E} contains all measurable rectangles, is closed under finite disjoint unions and
is a monotone class. Therefore D = S ⊗ T and we see that τ1 = τ2 on S ⊗ T .
To prove the existence of a product measure, define

τE = νE x dμ
X

for E ∈ S ⊗ T . We know that τ E = Y μEy dν according to Lemma 11.5.
A routine argument shows that τ is a measure on S ⊗T and τ (A×B) = μA·νB
for all measurable rectangles A × B.
11.8. Lemma. Let (X, S , μ), (Y, T , ν) be σ-finite measure spaces and f a
S ⊗T -measurable function on X ×Y . Then the function f (x, ·) is T -measurable
on Y for each x ∈ X.
Proof. Select α ∈ R, x ∈ X and observe that
x
{y ∈ Y : f (x, y) > α} = {(t, y) ∈ x × Y : f (t, y) > α} .
11.9. Fubini’s Theorem. Let (X, S , μ), (Y, T , ν) be σ-finite measure spaces,
and h ∈ L ∗ (μ ⊗ ν). Then

h dμ ⊗ ν = h(x, y) dν) dμ = h(x, y) dμ dν .
X×Y X Y Y X
Proof. It is no restriction to assume that h ≥ 0, for we can use the decomposition

h = h+ − h− . The assertion holds whenever h = cA is the indicator function of
a set A ∈ S ⊗ T (see the proof of Theorem 11.7), and therefore for nonnegative
simple functions as well. If h ≥ 0 is an arbitrary S ⊗T -measurable function, there
is a sequence
{hn } of nonnegative
simple functions, hn h (cf. Theorem 3.9).
Then X×Y hn dμ ⊗ ν → X×Y h dμ ⊗ ν. On the other hand, hn (x, y) h(x, y)
and since the assertion holds for simple functions hn , by using Levi’s theorem
(twice) we easily finish the proof.
11.10. Remarks. 1. The Lebesgue measure in Rn+k is the completion of the product of
Lebesgue measures in Rn and Rk . We refer to Chapter 26 for a detailed study of this important
example.
2. Suppose τ is the product of σ-finite measures μ and ν. Then the measure τ is not necessarily
complete, even if μ and ν are complete. (Indeed, the set {(x, x) : x ∈ M } is measurable with
respect to the product of Lebesgue measures on R if and only if M ⊂ R is Lebesgue measurable.
On the other hand, all subsets of the diagonal {(x, x) : x ∈ R} are measurable with respect to
the completion of λ ⊗ λ.) However, for the completed product measure the following variant of
Fubini’s theorem holds.
11.11. Theorem. Let (X, S , μ), (Y, T , ν) be complete σ-finite measure spaces,
τ the product of μ and ν and τ the completion of τ . Let h ∈ L ∗ (τ ). Then the
function h(x, ·) is ν-measurable for μ-almost all x ∈ X, h(·, y) is μ-measurable
for ν-almost all y ∈ Y , and

h dτ = h(x, y) dν dμ = h(x, y) dμ dν .
X×Y X Y Y X
Proof. As usual, it is sufficient to prove the theorem for indicator functions. Let
M be a set of the form A ∪ N , where A ∈ S ⊗ T and N ⊂ B for B ∈ S ⊗ T ,
τ B = 0. By Fubini’s theorem

cB (x, ·) dν = 0
Y
46 12. Sequences of Measurable Functions
for μ-almost all x ∈ X. Therefore also

cN (x, ·) dν = 0
Y
for μ-almost all x ∈ X. Hence we obtain

cM dτ = cM (x, y) dν dμ.
X×Y X Y
11.12. Exercise. Show that every Dynkin class is a monotone class and find a counterexample
that the reverse is not true.
11.13. Notes. The product measures and the reduction of multi-dimensional integration to
one-dimensional one were originally studied in case of the Lebesgue measure in R2 and appear
already in H. Lebesgue [*1904]. G. Fubini [1907] claims that the order of integration may be
reversed. A complete proof was done by L. Tonelli [1909]. Approximately in the 30’s a number
of works concerning the abstract product measure theory appeared. The method explained in
this chapter is due to H. Hahn [1933].
Let us point out that there are theories concerning products of measures which are not
necessarily σ-finite. The theory of infinite products of measures (we understand the infinite
number of factors) is also very important.
12. Sequences of Measurable Functions
In this section we will study the relationship among various kinds of conver-
gences of sequences of measurable functions on a measure space (X, S , μ). We
will consider
(a) the convergence almost everywhere;
(b) the convergence in the norm of Lp spaces;
(c) the convergence in measure;
(d) the μ-uniform convergence.
Two last mentioned notions will now be defined.

12.1 Convergence in Measure. Let f , fn be μ-measurable, almost everywhere
finite functions on X. We say that a sequence {fn } converges in measure to f if
lim μ{x ∈ X : |f (x) − fn (x)| ≥ ε} = 0
for each ε > 0.

12.2. μ-uniform Convergence. Let f , fn be almost everywhere finite func-
tions on X. We say that {fn } converges to f μ-uniformly if for every ε > 0 there
is M ∈ S such that μ(X \ M ) < ε and the convergence fn → f is uniform on M .
The limit function of a sequence converging in measure (or μ-uniformly) is
unique except for a null set and is finite μ-almost everywhere.
12.3. Theorem. Let 1 ≤ p < ∞ and {fn } be a sequence of μ-measurable,

almost everywhere finite functions on X. If fn ∈ L p and fn − f p → 0, then
fn converge to f in measure.
Proof. Choose
ε > 0 and denote En = {x ∈ X : |f (x) − f (xn )| ≥ ε}. Then
p p
εp μEn ≤ En |f − fn | dμ ≤ f − fn p , which implies the proposition.
12.4. Riesz Theorem. Let {fn } be a sequence of μ-measurable, almost every-
where finite functions on X. If fn converge to f in measure, then there exists a
subsequence {fnk } such that fnk → f almost everywhere.
Proof. Suppose fn converge to f in measure. There is a sequence n1 < n2 < n3 <
. . . such that
1 1
μ{x ∈ X : f (x) − fnj (x) ≥ } < j .
j 2

∞ 1
∞
If Aj := {x ∈ X : |f (x) − fnk (x)| ≥ } and B := Aj , then A1 ⊃ A2 ⊃
k=j k j=1
1
A3 ⊃ . . . , μA1 ≤ < +∞, and consequently
2k
∞
1 1
μB = lim μAj ≤ lim = lim j−1 = 0
2k 2
k=j
(notice the connection with Borel-Cantelli’s lemma 2.14). Now it is sufficient

to prove that fnk (x) → f (x) for x ∈ X \ B. To this end let x ∈ X \ B and

∞
ε > 0 be given. Then there is a j = j(x) such that x ∈ X \ Aj = {y ∈ X :
k=j
1
|f (y) − fnk (y)| < }. If k0 > max(j(x), 1ε ), then |f (x) − fnk (x)| < 1
k ≤ ε for all
k
k ≥ k0 , as needed.
12.5. Egorov’s Theorem. Let μX < +∞ and f , fn be μ-measurable, almost
everywhere finite functions on X. Then fn → f almost everywhere if and only if
fn → f μ-uniformly.
Proof. Assume that fn → f almost everywhere. Denote
E := {x ∈ X : f (x), fn (x) are finite and fn (x) → f (x)}.
Obviously μ(X \E)
= 0 and it does no harm to assume that E = X. Choose ε > 0.

Writing Ek,m := {x ∈ X : |fn (x) − f (x)| ≥ k1 }, we have Ek,1 ⊃ Ek,2 ⊃ . . .
n≥m
and μEk,1 ≤ μX < +∞. For fixed k ∈ N, lim μEk,m = 0, and we easily find a

m
sequence {mk } such that μ Ek,mk < ε. Set
k

M =X\ Ek,mk .
k
Then we have |fn (x) − f (x)| < 1

k for each x ∈ M and n > mk , so that fn → f
uniformly on M .
48 12. Sequences of Measurable Functions
The reverse implication is obvious.

12.6. Remark. If f and fn are μ-measurable, almost everywhere finite functions on X
and fn → f μ-uniformly, then fn → f in measure. Indeed, a routine argument shows that
lim μ{|fn − f | ≥ ε} = 0 for any ε > 0. Combining aforementioned results we can state the
following Lebesgue’s theorem:
If μX < +∞, f , fn are μ-measurable, almost everywhere finite functions on X, and if
fn → f almost everywhere, then fn → f in measure.
12.7. Exercise. Suppose μX < ∞. If f , fn are μ-measurable, almost everywhere finite
functions on X, then fn → f in measure if and only if for any subsequence {fnk } there exists
its subsequence {fnk } which converges to f almost everywhere.
j
Hint. For one of the implications use Theorem 12.4. If {fn } satisfies the “subsequences condi-
tion” and does not converge in measure, then there exists an ε > 0 and a subsequence {gn } of
{fn } such that μ{|gn − f | > ε} > ε for all n and gn → f almost everywhere. By Lebesgue’s
theorem of 12.6 we obtain a contradiction.
12.8. Exercise. Let (X, , μ) be a measure space, ϕ an increasing, bounded and continuous
function on [0, ∞], ϕ(0) = 0, and ϕ(s + t) ≤ ϕ(s) + ϕ(t) for all s, t ≥ 0 (for instance the function
t → t/(1 + t)). Define
∞
X
ρ(f, g) = 2−j ϕ(μ{|f − g| ≥ 1/j}).
j=1
Show that
(a) ρ is a pseudometric on the space of all the μ-measurable functions on X;
(b) fn converge to f in measure if and only if ρ(f, fn ) → 0.
12.9. Exercise (Vitali’s theorem). Let 1 ≤ p < +∞. If fn ∈ p, fn → f almost everywhere
and fn p → f p , then fn − f p → 0.
Hint. Let ϕn = 2p−1 (|fn |p + |f |p ) − |fn − f |p . Then ϕn → 2p |f |p almost everywhere. Since
|fn − f |p ≤ (|fn | + |f |)p ≤ 2p−1 (|fn |p + |f |p ), we have ϕn ∈ 1 , ϕn ≥ 0. By Fatou’s lemma
Z Z Z Z
2p |f |p ≤ lim inf ϕ n = 2p |f |p − lim sup |fn − f |p .
X X X X
R
Hence lim sup X |fn − f |p = 0, finishing the proof.
12.10. Exercise. (a) (Egorov) Let μ be a σ-finite measure on (X, ), fn μ-measurable

functions on X and fn → f almost everywhere (fn , f almost everywhere finite). Show that
S
∞
there exist Ek ∈ of finite measure such that μ(X \ Ek ) = 0 and fn ⇒ f on every Ek .
k=1
(b) Show that the assumption of σ-finiteness of the measure μ in (a) cannot be dropped.
Hint. Let X be the set of all convergent sequences x = {xk } of real numbers and μ the counting
measure on X. Define a sequence {fn } of functions on X by fn (x) = xn . Choose a sequence
{Ej } of subsets of X on which {fn } converges uniformly and find a convergent sequence {yk }
which does not belong to any of the sets Ej (because it converges more slowly than all sequences
from Ej ).
(c) Show that the μ-uniform convergence implies convergence almost everywhere, or conver-
gence in measure, without the assumption that μX < ∞.
12.11. Exercise. Find a sequence {fn } of functions on [0, 1] which converges to zero in
(Lebesgue) measure whereas the sequence {fn (x)} fails to converge for any x ∈ [0, 1].
Hint. Consider the indicator functions of intervals [ k−1
m
, k
m
] ordered into a suitable sequence.
p
12.12. Weak Convergence in Lp . Let 1 ≤ p < +∞ and q = p−1 (q = +∞ if p = 1).
Suppose f , fj ∈R p
R that a sequence {fj } converges
. We say weakly to f in Lp (denoted
f = w-lim fj ) if X fj g dμ → X f g dμ for every function g ∈ q .
(a) If f = w-lim fj , g = w-lim fj , then g = f almost everywhere.
(b) If f − fj p → 0, then f = w-lim fj .
(c) Suppose f, fj ∈ p . Let be a set of p -functions, the linear span of which is dense
in Lq . Then the following assertions are equivalent:
(i) f = w-lim fj ; R R
(ii) the sequence {fj p } is bounded and X gfj dμ → X gf dμ for every g ∈ .
Hint. A typical application of the Banach-Steinhaus theorem from functional analysis.
(d) We can take as the set of all indicator functions of sets from in case p = 1, or the
set of all indicator functions of sets from of finite measure if p > 1; in the special case of the
Lebesgue measure, other possibilities are shown in Theorem 31.4.
(e) If p > 1, then any norm-bounded sequence of functions from p has a weakly convergent
subsequence.
Hint. If the space Lp is separable, proceed in a similar way as in the proof of Theorem 17.10.
The nonseparable case is more difficult. The idea is that for 1 < p < ∞ the space Lp is reflexive
and reflexive spaces are characterized by this property, cf. for instance L. Mišı́k [*1989], Theorem
33.4 or R.B. Holmes [*1975], p.149.
(f) Let p > 1. If fj → f almost everywhere and if {fj p } is a bounded sequence, then
f = w-lim fj . The proposition holds also if the assumption fj → f almost everywhere is
replaced by fj → f in measure.
(g) The Radon-Riesz theorem. Let p > 1, f = w-lim fj and f p = lim fj p . Then
f − fj p → 0. (What proposition do we obtain by sticking (f) and (g) together?)
Hint. The proof makes substantial use of uniform convexity of Lp spaces for 1 < p < ∞, see
M.M. Rao [*1987], Proposition 5.5.3.
(h) The sequence fj : x → sin jx converges weakly to zero for any p on each bounded interval.
None of its subsequences is convergent almost everywhere.
(i) Suppose fj = j 1/p on (0, 1/j) and fj (x) = 0 on (1/j, 1). The sequence {fj } is bounded
in the norm of Lp (0, 1) and tends to zero almost everywhere. If p = 1, then it is not weakly
convergent (nor any of its subsequences). If p > 1, then it converges weakly but not in norm.
12.13. Weak* Convergence. Assume μ is a σ-finite measure on X and f, fj ∈ L∞ . We say
R R
that a sequence {fj } converges weakly* to f (denoted f = w* -lim fj ) if X fj g dμ → X f g dμ
for each function g ∈ 1 .
(a) If f = w* -lim fj , g = w* -lim fj , then g = f almost everywhere.
(b) If f − fj ∞ → 0, then f = w* -lim fj .
(c) Suppose f, fj ∈ ∞ . Let be a set of 1 -functions, the linear span of which is dense
in L1 . Then the following are equivalent:
(i) f = w* -lim fj ; R R
(ii) the sequence {fj ∞ } is bounded and X gfj dμ → X gf dμ for each function g ∈ .
(d) If fj → f almost everywhere and if the sequence fj ∞ is bounded, then f = w* -lim f j.
(e) Any norm-bounded sequence {fn } of ∞ -functions contains a weakly* convergent sub-
sequence.
S
Hint. Write X = Xk where X1 ⊂ X2 ⊂ . . . and μXk < ∞. Suppose {fn } is a norm-bounded
sequence of functions from L∞ and
(
fn on Xk ,
fñ,k =
0 on X \ Xk .
50 13. The Radon-Nikodým Theorem and the Lebesgue Decomposition
Then {fñ,k }n is bounded even in the L2 -norm and there exists its subsequence {fñi ,k }n which
R R
converges weakly in L2 to a function f˜k . But this means that lim X fni u dμ = X fk u dμ for
i
each function u ∈ L2 vanishing on the complement of Xk . The diagonal R method (cf.
R 17.10)
provides a subsequence {hn } of {fn } and a limit function f such that X hn u dμ → X f u dμ
for each function u ∈ L2 vanishing on the complement of some of the sets Xk . Since the set of
all such functions u is dense in L1 , in light of (c) we conclude that f = w* -lim fn .
(f) The sequence fj : x → sin jx converges weakly* to zero (cite the Riemann-Lebesgue
lemma 31.10).
(g) Define fj = 0 on (0, 1/j) and fj (x) = 1 on (1/j, 1). The sequence {fj } is bounded in
L∞ -norm and converges both almost everywhere and weakly* to the constant function 1. We
have lim fj ∞ = lim fj ∞ = 1 but lim f − fj ∞
= 0.
12.14. Remarks. It may be worth while to take for the interested reader a slightly deeper
look from the point of view of functional analysis (see also [Zap]).
(1) Let X is a Banach space with (topological) dual X ∗ , xn , x ∈ X, Fn , F ∈ X ∗ . We say
that
(a) {xn } converges weakly to x if F (xn ) → F (x) for any functional F ∈ X ∗ ;
(b) {Fn } converges weakly* to F if Fn (z) → F (z) for any z ∈ X.
Taking into account 13.17, the previous definitions of the weak and weak* convergence in
the Lp spaces now are easier to understand (at least if 1 < p < ∞ or if μ is σ-finite).
2. We say that a Banach space is uniformly convex if for any ε > 0 there is a δ > 0 such that,
‚ ‚
‚1 ‚
if x, y ∈ X, x = y = 1 and ‚ ‚
‚ 2 (x + y)‚ > 1 − δ, then y − x < ε.
Each uniformly‚ convex Banach

‚ space is locally uniformly rotund which means that if xn =
x = 1 and ‚ 12 (xn + x)‚ → 1, then xn − x → 0. The Lp spaces are uniformly convex
for 1 < p < ∞ (it follows from the Clarkson’s inequalities, see, for instance, E. Hewitt and
K. Stromberg [*1965]).
3. A Banach space X is said to have the Kadec-Klee property if xn − x → 0, whenever a
sequence {xn } ⊂ X converges weakly to x and xn → x. All locally uniformly rotund
Banach spaces (in particular Lp spaces, 1 < p < ∞) have the Kadec-Klee property. Indeed,
if {xn } is a sequence which does not converge to a point x and for which xn = x = 1,
local uniform rotundity
‚ ensures‚ the existence of its subsequence, call it {xn } again, and an
ε ∈ (0, 1) such that ‚ 12 (xn + x)‚ < 1 − ε. Using the Hahn-Banach theorem find a F ∈ X ∗ with
F = 1 = F (x). Now, it is clear that the sequence {F (xn )} cannot converge to F (x).
4. In interesting cases, the L1 spaces are not (locally) uniformly convex, nor do they need satisfy
a proposition analogous to 12.12.f. Nevertheless, if fn ∈ L1 , fn → f almost everywhere or in
measure and fn → f , then fn − f → 0 (cf. 12.9).
12.15. Exercise. Let μX < ∞ and 1 ≤ p < r ≤ +∞. If {fj r }j is a bounded sequence and
fj → f almost everywhere (or, in measure), then f − fj p → 0. Show that the assertion does
not continue to hold for r = p.
12.16. Notes. The notion of convergence in measure was introduced by F. Riesz [1909a]
where also Theorem 12.4 was proved. Egorov’s theorem was proved for case of the Lebesgue
measure by H. Lebesgue [1903] and by D. Egorov [1911].
13. The Radon-Nikodým Theorem and the Lebesgue Decomposition
13.1. Radon-Nikodým Derivative and Absolute Continuity of Mea-

sures. Let (X, S , μ) be a measure space and f ∈ L 1 (μ) a nonnegative function
on X. In Chapter 8 we defined the measure μf on (X, S ) as

μf (E) = f dμ, E ∈ S.
E
Now, if μ and ν are measures on S , one might ask whether there is a nonnegative
function f ∈ L 1 (μ) such that ν = μf . Immediately we see that such a function in
general does not exist. Indeed, every measure of the form μf satisfies μf (E) = 0,
provided μ(E) = 0. However, we show that if νE = 0 whenever μE = 0, and
μ and ν are σ-finite, then a desired function f , called the density or the Radon-
Nikodým derivative of ν with respect to μ, can be found. Then the density of ν
with respect to μ is unique except for a null set and it is sometimes denoted by
dν
.
dμ
We say that a measure ν on S is absolutely continuous with respect to μ, and
write ν μ, if νE = 0 for every E ∈ S with μE = 0.
There are several methods how to prove the existence of Radon-Nikodým derivatives. Our
proof is based on the following “variational” approach. Assuming that such a function f exists
and is in 2 , consider the functional
Z
J0 : g → |g − f |2 dμ
X
on the space L2 (μ). The function f evidently represents the only element of the space L2 (μ) at
which J0 attains its minimum. For all g ∈ L2 (μ)
Z
J0 (g) = J(g) + f 2 dμ
X
where Z Z Z
J(g) = (g 2 − 2f g) dμ = g 2 dμ − 2 g dν.
X X X
Since the functionals J and J0 differ only by a constant, J attains its minimum at f . This is
also the idea of our proof of the following lemma.
13.2. Lemma. Let ν, σ be finite measures on (X, S ) such that νA ≤ σA for

each A ∈ S . Then there exists a nonnegative function f ∈ L 2 (σ) such that

νA = f dσ
A
for any A ∈ S .
Proof. For g ∈ L 2 (σ) denote

J(g) = g 2 dσ − 2 g dν .
X X
Let
c := inf{J(g) : g ∈ L2 (σ)}.
Since νA ≤ σA for each A ∈ S , we have

J(g) ≥ (g 2 − 2 |g|) dσ = (|g| − 1)2 dσ − σ(X) ≥ −σ(X),
X X
for g ∈ L 2 (σ), so that c ∈ R. Hence, there is a sequence {fj } of functions from

L 2 (σ) such that J(fj ) → c. Since for g, h ∈ L 2 (σ) the “parallelogram law”

1 1 2
J(g) + J(h) − 2J (g + h) = g − h2
2 2
holds, we have

1 2
|fj − fk | dσ ≤ J(fj ) + J(fk ) − 2c → 0
2 X
as j, k → ∞. Consequently, {fj } is a Cauchy sequence, and therefore there is a

limit f of this sequence in L2 (σ). Obviously, J(f ) = c. Choose A ∈ S . Since
J(f ) ≤ J(f + tcA )
for all t ∈ R, we get

0 ≤ 2t f cA dσ + t2 c2A dσ − 2t cA dν = 2t f dσ − νA + t2 σA.
X X X X
The last shows that

f dσ − νA ≤ 1 |t| · σA
2
X
(note that t can be positive as well as negative). By letting t → 0 we establish

the assertion.
13.3. Remark. Proof of the preceding lemma becomes more comprehensible if it is based on
the knowledge of the theory of Hilbert spaces. Considering the functional
Z
F (g) = g dν, g ∈ L2 (σ).
X
By the Riesz representation theorem on continuous linear functionals on Hilbert spaces there is
an element f ∈ L2 (σ) such that
Z
F (g) = f g dσ
X
for each g ∈ L2 (σ). Now it is enough to apply the last equality to g = cA , where A runs over
.
13.4. Radon-Nikodým Theorem. Let μ, ν be finite measures on (X, S ),

ν μ. Then there exists a nonnegative function h ∈ L 1 (μ) such that

νA = h dμ
A
for all A ∈ S . This function h is unique up to μ-almost everywhere equality.

Proof. The uniqueness can be achieved by Theorem 8.17, and we proceed to

establish the existence of h. Set σ = μ + ν. By the previous lemma there exists
a function f ∈ L 2 (σ) such that

νA = f dσ
A
for all A ∈ S . Since

0 ≤ νA = f dσ ≤ σA = 1 dσ ,
A A
an appeal to Corollary 8.18 reveals that 0 ≤ f ≤ 1 σ-almost everywhere. We

claim that f < 1 σ-almost everywhere. Let E := {f = 1}. Then

νE = f dσ = σE,
E
thus μE = 0. By assumptions we get νE = 0 and therefore σE = 0. Hence, it is

no restriction to assume that 0 ≤ f < 1 everywhere. For each set A ∈ S we have

cA f dμ = cA (1 − f ) dν.
X X
We see (for simple functions using linearity, then passing to limits) that

gf dμ = g(1 − f ) dν
X X
for every nonnegative S -measurable function g on X. For A ∈ S , put g = cA

1−f .
Then
f
· cA dμ = cA dν
X 1−f X
and the function

f
h :=
1−f

has all required properties. Notice that h ∈ L 1 (μ)) as X
h dμ = ν(X) < +∞.
13.5. Theorem. Let μ and ν be finite measures on (X, S ). Then the following
conditions are equivalent:
(i) ν μ;
(ii) there is a nonnegative function f ∈ L 1 (μ) such that

νE = f dμ
E
for all E ∈ S ;
(iii) for any ε > 0 there is a δ > 0 such that νE < ε whenever E ∈ S and
μE < δ.
Proof. The implication (i) =⇒ (ii) was established in Theorem 13.4, (ii) =⇒ (iii)
follows by Exercise 8.22.b and (iii) =⇒ (i) is obvious.
13.6. Remarks. 1. The Radon-Nikodým theorem can be generalized, its variants hold even
for signed or complex measures. Let us state the version for σ-finite measures:
Let μ, ν be σ-finite measures on (X, ), ν μ. Then there is a nonnegative -measurable
function h (not necessarily from 1 (μ)) such that
Z
νA = h dμ
A
for all A ∈ . This function is unique except on a null set.

Indeed, we
Sfind sequences
S {An }, {Bn } ⊂ of pairwise disjoint sets such that μAn < +∞,
νBn < +∞, An = Bn = X. Applying Theorem 13.4 for restrictions of μ and ν to An ∩ Bk
we obtain the assertion.
2. We proved the Radon-Nikodým theorem and the Hahn decomposition theorem independently.
The reader should find it easy to derive the Radon-Nikodým theorem from the Hahn theorem
or vice versa. Both possibilities will be indicated.
Let μ, ν be finite measures on (X, ), ν μ. We would like to prove the existence of a
Radon-Nikodým derivative. For α ∈ Q find a Hahn decomposition X = Pα ∪ Nα such that
ν − αμ is a nonnegative measure on Pα and −(ν − μα) is a nonnegative measure on Nα . For
α = 0 set P0 = X, N0 = ∅. Then we find dν/dμ in the form
f (x) = sup{α ∈ Q ∩ [0, +∞) : x ∈ Pα }.
dμ+
On the other hand, if μ is a signed measure, f = and P = {f = 1}, N = X \ P , then
d |μ|
(P, N ) is a Hahn decomposition of X with respect to μ.
13.7. Integration with Respect to Signed or Complex Measures. Suppose μ is a
signed or complex measure on (X, ). Then we define
Z Z
f dμ = af d |μ|
X X
where a = dμ/d |μ|. (Show that |a| = 1 almost everywhere.)

13.8. Singular and Absolutely Continuous Measures. Suppose that μ
and ν are positive, signed, or complex measures on the measurable space (X, S ).
We say that μ and ν are (mutually) singular , which is denoted by μ⊥ν, if
there is a M ∈ S with |μ| (M ) = |ν| (X \ M ) = 0. Obviously, this relation is
symmetric.
A measure ν is absolutely continuous with respect to μ if νE = 0, whenever
E ∈ S and |μ| (E) = 0.
13.9. Examples. (a) Let μ be a signed measure. Using a Hahn decomposition of X for μ,
we see immediately that μ+ and μ− are mutually singular.
(b) The Dirac measure at 0 and the Lebesgue measure are relatively singular on (R, (R)).
(c) If η is the Lebesgue-Stieltjes measure determined by the Cantor singular function (see
Example 23.1 and Exercise 24.8.) and λ the Lebesgue measure, then η⊥λ on (R, (R)).
13.10. Lebesgue Decomposition Theorem. Let μ be a (positive) measure

on (X, S ) and ν a σ-finite or complex measure on (X, S ). Then there exists a
unique decomposition ν = νa + νs where νa μ and νs ⊥μ.
Proof. (See also Exercise 13.12.) We first reduce the general case to the case
when ν is a finite positive measure. There are B
j ∈ S such that μBj = 0 and
lim νBj = sup{νB : B ∈ S , μB = 0}. If M = Bj , then μM = 0 and for the
measures
νa A := ν(A \ M ), νs A := ν(A ∩ M )
the equality ν = νa + νs holds. Clearly νs ⊥μ. If now μB = 0, then νa (B) =
ν(B \ M ) = 0. This is so because νB = 0 for each set B ∈ S , B ⊂
X \ M with
μB = 0. (Otherwise μ(M ∪ B ) = 0 and ν(B ∪ M ) > νM .) If X = Xk where
νXk < +∞, find sets Mk ⊂ Xk in the same way as in the first part of the proof
and set
M= Mk , νa A = ν(A \ M ), νs A = ν(A ∩ M ).
Having the proof for positive measures, the general case easily follows.
Uniqueness remains. Suppose that ν = νa + νs is another decomposition with
the required properties. Then there is a set M ∈ S for which μM = 0 and
νs (X \ M ) = 0. Fix A ∈ S and denote C = A ∩ (M ∪ M ), D = A \ (M ∪ M ).
We have μC ≤ μM + μM = 0, so νa C = νa C = 0 and νs C = νs C. Further,
νs D = νs D = 0 and therefore νa D = νa D. Since A = C ∪ D and C ∩ D = ∅, we
get νs A = νs A and νa A = νa A.
13.11. Lebesgue Decomposition. The decomposition ν = νa + νs is called
the Lebesgue decomposition of ν relatively to μ, and the measures νa and νs are
called the absolutely continuous part and the singular part of ν. Notice that there
exists a set M ∈ S such that νs = νM and νa = νX\M .
13.12. Exercise. Give the details of an alternative proof of the Lebesgue decomposition
theorem: Existence (assuming σ-finitness of both measures) using the Radon-Nikodým theorem
and uniqueness from the following exercise (part (c)).
Hint. (a) Existence: Suppose μ, ν are positive and set λ = μ + ν, f = dμ/dλ. Then the
Lebesgue decomposition has the form νs = νM , νa = νX\M , where M = {f = 0}.
(b) Uniqueness: If ν = νa + νs = νa + νs , then νa − νa = νs − νs is at the same time
absolutely continuous and singular.
13.13. Exercise. Let μ and ν be positive, signed or complex measures on a measurable space
(X, )).
(a) Prove that the following are equivalent:
(i) μ⊥ν;
(ii) |μ| ⊥ |ν|;
(iii) (μr )+ ⊥(νr )+ , (μr )− ⊥(νr )− ; (μi )+ ⊥(νi )+ , (μi )− ⊥(νi )− .
(b) If ν1 ⊥μ, ν2 ⊥μ, then ν1 + ν2 ⊥μ.
(c) If ν⊥μ, ν μ, then ν = 0.
P
13.14. Exercise. Suppose S ⊂ R is countable, f ≥ 0 and μ = f (s)εs (εs are the Dirac
s∈S
measures, cf. Exercise 2.10). Describe the Lebesgue decomposition of μ with respect to the
Lebesgue measure.
13.15. Exercise. Suppose μ ∈ M(X) (see Exercise 6.17), V = {ν ∈ M(X) : ν⊥μ}. Show
that V is a closed linear subspace of the Banach space M(X).
13.16. Exercise. Let μ, ν be finite (positive) measures on (X, ). Let sup and inf be defined
as in Exercise 6.19 and let σ = μ + ν.
(a) Show that the following statements are equivalent:

(i) μ⊥ν;
(ii) inf(μ, ν) = 0;
(iii) sup(μ, ν) = μ + ν.
d sup(μ, ν) dμ dν
(b) Prove that = max( , ).
dσ dσ dσ
13.17. Duality of Lp Spaces. Suppose p, q ∈ [1, +∞], p1 + 1q = 1. Consider spaces Lp and
Lq determined by a σ-finite measure μ on (X, ).
R
(a) If v ∈ Lq , then the mapping u → X uv dμ is a continuous linear functional on Lp .
R ∞ and f is a continuous linear functional on L . Then there exists a v ∈ L

(b) Suppose p < p q
such that f (u) = X uv dμ for all u ∈ Lp . Moreover vq = f (here, f = sup{f (u) : u ∈
Lp , up ≤ 1}).
Hint. ByRthe Radon-Nikodým theorem there exists a Rμ-measurable finite function v such that
f (cE ) = E v dμ for each set E ∈ . Thus, f (u) = X uv dμ for each u ∈ Lp . It remains to
show that v ∈ Lq and vq = f . There is an increasing sequence {Ej } of sets from of
S
finite measures such that v is bounded on each Ej and Ej is X. Set uj = |v|q−2 vcEj . Then
R R R
uj ∈ Lp and uj pp = X uj v dμ = f (uj ) ≤ f uj p and therefore E |v|q dμ = X |uj |p dμ ≤
“ ”q j
uj p−1
p ≤ f q . An appeal to Levi’s theorem reveals that v ∈ Lq and vq ≤ f . The
reverse inequality is as easy consequence of Hölder’s inequality.
13.18. Remark. The previous theorem which describes duals of Lp for 1 < p < ∞ was
proved using the Radon-Nikodým theorem, so that we had to confine ourselves to the case of
σ-finite measures. However, the theorem holds for arbitrary measures.
In a similar way, it can be proved that any element of L∞ represents a continuous linear
functional on L1 (by the same formula as in 13.17), but not every element of the dual space
to L1 is of this form. In order to be able to characterize elements of the dual space to L1 by
elements of L∞ , it is necessary to confine ourselves, e.g. to the case of σ-finite measures. One
possible description of the dual space of L1 (μ) in the case of arbitrary measure μ can be found
in J. Schwartz [1951].
The assumptions imposed on measures in the Radon-Nikodým theorem can be weakened to
the case of so called localizable measures. This notion, containing σ-finite case as well as the
case of Radon measures, was introduced by I.E. Segal in [1954]. For more information we refer
to M.M. Rao [*1987].
Let us remark that the dual spaces of L∞ can be described using (bounded) finitely additive
measures, cf. E. Hewitt and K. Stromberg [*1965].
13.19. Notes. The Radon-Nikodým theorem was first proved by H. Lebesgue [1910] for
measures which are absolutely continuous with respect to the Lebesgue measure. Later, it was
generalized by M.J. Radon [1913] to Radon measures and by O. Nikodým [1930] to measures
on abstract spaces. The Lebesgue decomposition in the case of arbitrary measures is in Saks’
monograph [*1937]. There are different proofs of the Radon-Nikodým theorem. One of them is
based on the Hahn decomposition and a newer one, using the Riesz theorem of representation
of functionals on Hilbert spaces, comes from J. von Neumann. Our proof uses the variational
principle and is near to von Neumann’s.
In the classical case of the Lebesgue measure, the characterization of dual spaces of Lp spaces
is due to F. Riesz [1910] for p > 1 and to H. Steinhaus [1919] for p = 1.
C. Radon Integral and Measure 57
C. Radon Integral and Measure
14. Radon Integral

14.1. Radon Integral. Throughout this chapter P , will denote a locally
compact topological space. The most important examples of locally compact
spaces are open and closed subsets of Rn .
The support of a function f is the closure of the set {x ∈ P : f (x) = 0}. It is
denoted by supt f .
If K ⊂ P is a compact set, the symbol CK (P ) stands for the linear space of
all continuous functions on P whose support is contained in K. By Cc (P ) we
denote the
set of all continuous functions on P whose support is compact, i.e.
Cc (P ) = {CK (P ) : K ⊂ P is compact }. If the space P is compact, then Cc (P )
and the space C (P ) of all continuous functions on P coincide.
A functional A on Cc (P ) is positive if Af ≥ 0 whenever f ≥ 0, and monotone
if Af ≤ Ag whenever f ≤ g. A linear functional is positive if and only if it is
monotone (for nonlinear functionals, these two notions are different).
A Radon integral on P is any positive linear functional on the space Cc (P ).
14.2. Examples. (a) Fix a point a ∈ P . It is simply to verify that the mapping
εa : f → f (a) , f ∈ c (P )
is a Radon integral on P . This functional is called the Dirac integral at a.

(b) The functional
Z b
Af := f
a
Rb
is a Radon integral on [a, b]. Since f is continuous, the integral a f always exists in Newton’s
or Riemann’s sense.
(c) The previous example can be generalized. If ϕ is a nondecreasing function on R and
[a, b] ⊂ R, we define
Z b X
n
Aϕ f = (RS) f dϕ := inf{ ci (ϕ(xi ) − ϕ(xi−1 )) :
a i=1
a = x0 < x1 < · · · < xn = b, ci ≥ f on [xi−1 , xi ]}
for each f ∈ ([a, b]). The functional Aϕ is again a Radon integral and it is called the Riemann-
Stieltjes integral. If ϕ(x) = x, it is the Riemann integral.
(d) The Riemann integral or the Riemann-Stieltjes integral can be defined for functions from
R R
c (R). Indeed, observe that ab f dϕ = cd f dϕ whenever f ∈ c (R) and supt f ⊂ [a, b] ∩ [c, d].
Rb
Thus, if f ∈ c (R), set Aϕ f = a f dϕ where [a, b] is an arbitrary interval containing the
support of f . The functional Aϕ is then a Radon integral on R.
(e) Another example of a Radon integral is the functional
Z
f → (f ◦ ψ) g ,
G
where g is a continuous nonnegative function on an open set G ⊂ Rk and ψ : G → P is a

continuous mapping. In this way, the following important examples (f) and (g) can be expressed
as well.
58 14. Radon Integral
(f) Suppose G ⊂ Rk is an open set and ψ : G → Rn is a diffeomorphism. Then the Radon

integral Z p
f → f ◦ ψ det((∇ψ)∗ ∇ψ)
G
is in fact the (k-dimensional) surface integral of f over ψ(G).

(g) Let P = {z ∈ R2 : |z| = 1}. If f is a continuous function on P and x = (r cos ω, r sin ω),
|x| < 1, set
Z 2π
1 (1 − r2 )f (cos t, sin t)
Af (x) = dt;
2π 0 1 − 2r cos(t − ω) + r2
this integral is called the Poisson integral. It is not too hard to see that the mapping
f → Af (x)
is a Radon integral on P .
The Poisson integral is used when solving the Laplace (partial differential) equation Δh = 0.
Indeed, given a continuous function f on P , the function h : x → Af (x) is harmonic in the unit
disc U := {x ∈ R2 : |x| < 1} (i.e. h is a solution to the Laplace equation) and
lim h(x) = f (z)

x∈U, x→z
for each z ∈ P .
An important property of Radon integrals is described in the following theorem.

14.3. Theorem (Daniell’s property). Let A be a Radon integral on P , fn ∈
Cc (P ), fn 0. Then Afn → 0.
Proof. There is a limit b := lim Afn ≥ 0. By Dini’s theorem fn ⇒ 0 on P
(consider the sequence {fn } on the support of f1 ), and therefore there exists a
sequence {nk } so that |fnk | ≤
k
1
2 for each k ∈ N. Thus the series fnk converges
uniformly on P and if f := fnk , then f ∈ Cc (P ). Then for each k,
k

k
bk ≤ Afni = A fni ≤ Af,
i=1 i=1
and hence b = 0.
14.4. Semicontinuous Functions. Remember that a function f : P → R is
said to be lower semicontinuous if {x ∈ P : f (x) > c} is open for each c ∈ R.
Upper semicontinuous functions are defined in a similar way.
Denote by Cc↑ (P ) the set of all lower semicontinuous functions on P which are
nonnegative outside a compact set and do not attain the value −∞. Analogously
we define Cc↓ (P ).
14.5. Extension of a Radon Integral. Let A be a Radon integral on P . We
extend A for larger classes of functions in the following steps.

1. For f ∈ Cc (P ) let f = Af .

2. If f ∈ Cc↑ (P ), then we define f = sup{ g : g ∈ Cc (P ), g ≤ f }.
3. Lastly, for an arbitrary function f on P we define upper and lower integrals as

∗ ∗
f = inf{ u : u ∈ Cc↑ (P ), u ≥ f }, f =− (−f ).
∗

∗
4. We say that a function f on P is A-integrable if ∗ f = f and this common
value, which is denoted by f , is finite.
14.6. Properties of Extended Radon Integrals. The procedure of extension
of Radon integrals requires to prove in each step that the “new integral” agrees
with the “old” one for functions which are already integrable with respect to
previous definitions. It is also needed to show that
∗
(a) ∗ f ≤ f for every function f on P ;

(b) the set of all finite A-integrable functions on P is a linear space and is a
nonnegative linear functional on it.
∗
14.7. Measures induced by Radon Integrals. The mapping μ∗A : E → cE
is an outer measure on P . Define M(A) as the σ-algebra of all μ∗A -measurable
subsets of P in the sense of Carathéodory (see 4.4.) and μA as the restriction of
μ∗A to M(A). Then μA is a measure on M(A). Furthermore, the set of all A-
integrable functions on P coincides with the set of all
μA -integrable functions on
P . For every such a function f the equality f = P f dμA holds. The measure
μA is complete and possesses the following properties:
(a) the domain M(A) of μA contains all Borel subsets of P ;
(b) μA K < ∞ for every compact K ⊂ P ;
(c) μA G = sup{μA K : K ⊂ G, K compact } for every open set G ⊂ P ;
(d) μA M = inf{μA G : G ⊃ M, G open } for every M ∈ M(A).
14.8. Examples. (a) If z ∈ P and A : f → f (z) is the Dirac integral at z, then μA is the
Dirac measure at z (note that the Dirac measure is defined on the σ-algebra of all subsets of P ).
(b) If Aϕ is the Riemann-Stieltjes integral determined on R by a nondecreasing function ϕ,
then the associated measure λϕ := μAϕ is called the Lebesgue-Stieltjes measure. Show that the
Lebesgue measure corresponds to ϕ(x) = x.
(c) The Radon measure corresponding to the Poisson integral (Example 14.2.g) is called the
harmonic measure at x.
14.9. Remark. The explanation in 14.5 – 14.7 would be much longer if all proclaimed
propositions were proved. In this manuscript, we choose another approach. We will prove
the existence of μA (the so called Riesz representation theorem) as directly as possible and
the extension
R of a Radon integral to the collection of all A-integrable functions we obtain as
f → P f dμA .
Although there are several approaches with different proofs, their most difficult parts are
based on similar ideas.
14.10. Signed and Complex Radon Integrals. A linear functional A on
the space Cc (P ) is called a signed Radon integral on P if for any compact set K
there exists a constant aK such that |A(f )| ≤ aK supK |f | whenever f ∈ CK (P ).
Analogously we define complex Radon integrals on P . (They are functionals on the
space Cc (P, C) of all complex valued continuous functions on P with a compact
support.)
Any difference of positive Radon integrals serves as an example of a signed
Radon integral. Soon we show that all signed Radon integrals are of this form.
60 14. Radon Integral
14.11. Variation of Signed and Complex Radon Integrals. Let A be

a signed or complex Radon integral on P . The variation of A is the (positive)
Radon integral |A| defined in the following steps:
1. If f ∈ Cc (P ) is a nonnegative function, define
|A| (f ) = sup{Ag : g ∈ Cc (P ), |g| ≤ f }.
Apparently, 0 ≤ |A| (f ) < +∞. If f1 , f2 ∈ Cc (P ) are nonnegative, then |A| (f1 +

f2 ) ≥ |A| (f1 ) + |A| (f2 ). To prove the reverse inequality we use the Riesz decom-
position lemma: Given g ∈ Cc (P ), |g| ≤ f1 + f2 there exist g1 , g2 ∈ Cc (P ) such
that g = g1 + g2 , |g|j ≤ fj for j = 1, 2. (It is not hard to see that the functions
gj := fj g(f1 + f2 )−1 on G := {f1 + f2 > 0} and zero on P \ G have all the desired
properties.) Thus |A| (f1 ) + |A| (f2 ) ≥ Ag1 + Ag2 , and taking sup over all g, we
obtain |A| (f1 + f2 ) ≤ |A| (f1 ) + |A| (f2 ).
2. If f ∈ Cc (P ) is arbitrary, define
|A| (f ) = |A| (f + ) − |A| (f − ).
To prove the additivity of |A|, note that f1+ +f2+ +(f1 +f2 )− = f1− +f2− +(f1 +f2 )+
and use the previous step. Since |A| (γf ) = γ |A| (f ) for every f ∈ Cc (P ) and
every γ ∈ R, the functional |A| is linear.
14.12. Decomposition of Signed and Complex Radon Integrals. Every
complex Radon integral can be decomposed into its real and imaginary part which
are signed Radon integrals. If A is a signed Radon integral, then A can be
expressed in the form A = A+ − A− where A+ := 12 (|A| + A) (the positive
variation) and A− := 12 (|A| − A) (the negative variation) are positive Radon
integrals. The representation of a signed Radon integral as a difference of positive
Radon integrals is not unique; however, the decomposition to the positive and
negative parts is “minimal” in a similar sense as the Jordan decomposition of a
measure.
14.13. Exercise (product of Radon integrals). Let A1 , A2 be Radon integrals on locally
compact spaces P1 , P2 . Show that there exists exactly one Radon integral A on P1 × P2 such
that
A f = A1 f1 · A2 f2
whenever f1 ∈ c (P1 ), f2 ∈ c (P2 ) and f (x1 , x2 ) = f1 (x1 )f2 (x2 ), x1 ∈ P1 , x2 ∈ P2 . Prove the
similar assertion also for signed and complex Radon integrals.
Hint. According to the Stone-Weierstrass theorem, the set of all linear combinations of functions
of the form f1 (x) · f2 (y), where fi ∈ c (Pi ), is dense in c (P1 × P2 ).
14.14. Notes. When building up the integration theory, some authors do not start from
the original notion of a measure; they consider linear functionals on spaces of functions defined
on an arbitrary set and then they extend these functionals to larger collections of functions.
More precisely, it is possible to start with a Riesz lattice of functions on a set X (which is
a linear space of functions closed under formations of finite maxima and minima) and with a
positive linear functional on which satisfies “Daniell’s condition”. As in 14.5, this functional
is extended, a measure is defined in a natural way and its properties are derived. This method
was worked out by P.J. Daniell [1918] (for the extension he used sequences of functions) and
M.H. Stone in the series of articles [1948] and [1949] (using generalized sequences). Let us note
that many authors use different forms of a similar approach (W.H. Young [1904], H.H. Golstine
[1941], J. Mařı́k [1952] and others).
The idea of extending the Radon integral is also due to many authors. However, it seems
that Bourbaki’s group was the first who worked it out for general locally compact spaces.
Also many authors confine themselves to the integration theory on locally compact spaces.
Kakutani’s representation theorem (S. Kakutani [1941]) says that any “abstract” functional A
on a Riesz lattice can be represented as a Radon integral AK on a locally compact space PK
so that the corresponding spaces of “integrable” functions are isomorphic. However, Kakutani’s
representation also has some disadvantages — when changing the original functional, the space P
changes as well and if the original space is locally compact, Kakutani’s representation can lead to
a different space. Supposing that the Riesz lattice satisfies Stone’s condition (min(1, f ) ∈
if f ∈ ), H. Bauer in [1957] constructed a different representation of (PB , AB ) which removes
mentioned disadvantages; the space PB does not depend on A and if is the space of all
continuous functions with a compact support on a locally compact space P , PB can be identified
with P , c (PB ) with and AB with A.
Of course, no such representation can save the topology on the original space if it is not
locally compact.
Many works are also concerned with integration theories in topological spaces without the
assumption of local compactness (for instance in separable metric spaces) but in this case the
correspondence between measure and the topological structure is not so fruitful.
15. Radon Measures
15.1. Radon Measures. Properties of the measure μA from the previous

chapter (14.7) lead to the following definition.
Let P be a locally compact space and B(P ) the σ-algebra of all Borel subsets
of P . We say that μ is a Radon measure on (P, S ) if
(a) S is a σ-algebra containing B(P );
(b) μK < ∞ for every compact set K ⊂ P ;
(c) μG = sup{μK : K ⊂ G, K compact } for every open set G ⊂ P ;
(d) μA = inf{μG : G ⊃ A, G open } for every A ∈ S .
15.2. Remarks. 1. There is a one-to-one correspondence between Radon integrals on P
and Radon measures on (P ). To each Radon integral, we can assign a Radon measure μA .
(See 14.7.; an alternative, more precisely developed approach will be used in 16.4.) Then we
can restrict μA to (P ). On the other hand, ifR μ is a Radon measure on (P ), then μ = μA
on (P ), where A is the Radon integral f → P f dμ. Therefore, to give examples of Radon
measures it is enough to list examples of Radon integrals.
2. In order to describe the structure of the system of all Radon measures on P , we define an
equivalence relation: Radon measures μ1 on (P, 1 ) and μ2 on (P, 2 ) are said to be equivalent
(at least for the purpose of this remark) if μ1 = μ2 on (P ). Then, of course, μ1 = μ2 on
1 ∩ 2 . In each equivalence class we can find two special representatives: the “minimal”
one which is defined on (P ) and the “maximal” one which is complete. However, not every
complete Radon measure is “maximal”. See also Exercise 15.20. The Borel σ-algebra plays
the unifying role, all “minimal” Radon measures are defined on it while domains of “maximal”
Radon measures can be different (for instance, there are Lebesgue nonmeasurable sets while
every set is measurable with respect to the complete Dirac measure mentioned in Example
14.8.a).
3. It is worthwhile to mention that definitions of Radon measures may vary from one author to
another. We have seen that to each Radon integral we can assign an outer Radon measure and
the whole scale of equivalent Radon measures. There is no general agreement which of these
objects should be called a Radon measure. Some authors even use the term Radon measure for a
62 15. Radon Measures
measure which is inner regular (i.e. μE = sup{μK : K ⊂ E, K compact } for every A ∈ ). In

more general spaces than the σ-compact ones this leads to a different concept than our definition
which requires outer regularity. Also the terms regular Borel measure or Borel regular measure
are frequently and ambiguously used for Radon measures or similar objects.
4. Suppose that P is separable, metrizable and that = (P ). Then the property (b) of 15.1
implies both (c) and (d). Indeed, as every open set is a countable union of compact sets, we get
(c). If is the collection of all Borel sets A satisfying
μ(A ∩ U ) = inf{μG : G ⊃ A ∩ U, G open }
for every open set U , then is a σ-algebra containing all open sets. Therefore (d) holds as
well.
Throughout, μ is a Radon measure on (P, S ).
15.3. Lemma. If E ∈ S , μE < ∞, then
μE = sup {μK : K ⊂ E, K compact }.
Proof. Given ε > 0, there exist an open set G ⊃ E and a compact set K ⊂ G
such that μ(G \ E) < ε and μK > μG − ε. Let V be an open set containing G \ E
with μV < ε. If H := K \ V , then H is compact,
E \ H ⊂ (E ∩ K \ H) ∪ (E \ K) ⊂ V ∪ (G \ K)
and μH > μE − 2ε.

15.4. Lemma. Let f ≥ 0 be a μ-integrable function on P . Then

(a) f dμ = inf{ g dμ : g ∈ Cc↑ , g ≥ f };
P P
(b) f dμ = sup{ h dμ : h ∈ Cc↓ , 0 ≤ h ≤ f }.
P P
Proof. (a) Choose ε > 0 and find a sequence {sk } of simple functions, 0 ≤ sk
∞
f . Since f = s1 + (sk − sk−1 ), there exist Aj ∈ S and αj ≥ 0 so that
k=2

∞
f = αj cAj . As f is μ-integrable, we have μAj < ∞ for all j and we can find
j=1

∞
open sets Gj ⊃ Aj so that μ(Gj \ Aj ) < 2−j ε. The function g := αj cGj is
j=1
nonnegative, lower semicontinuous and satisfies

g dμ ≤ f dμ + ε.
P P
(b) To prove the asssertion it suffices to show it for simple functions. Let
n
f = αj cMj be a μ-integrable function (α1 , . . . , αn ≥ 0, M1 , . . . , Mn ∈ S and
j=1
μMj < ∞). Choose again an ε > 0. By virtue of Lemma 15.3 there are compact
sets Kj (j = 1, . . . , n) such that Kj ⊂ Mj and
ε
μKj ≥ μMj − .
nαj

n
Then h := αj cKj is an upper semicontinuous function with a compact support,
j=1
0 ≤ h ≤ f and
h dμ ≥ f dμ − ε.
P P
15.5. Theorem. Let μ be a complete Radon measure on P . A function f on

P is μ-integrable if and only if for every ε > 0 there exist integrable functions
s ∈ Cc↑ and t ∈ Cc↓ such that t ≤ f ≤ s and

(s − t) dμ < ε.
P
Proof. First assume that f ∈ L (μ). Let ε > 0 be given. Cite Lemma 15.4 to
1
get nonnegative functions g ∈ Cc↑ , h ∈ Cc↓ such that g ≥ f + , h ≤ f − and

g dμ ≤ f + dμ + ε, h dμ ≥ f − dμ − ε.
P P P P
Then s := g − h ∈ Cc↑ , s ≥ f and

s dμ ≤ f dμ + 2ε.
P
In a similar way, we manufacture a t ∈ Cc↓ such that t ≤ f and

t dμ ≥ f dμ − 2ε.
P
This establishes the necessity.

For the converse, suppose that for every k ∈ N there exist integrable functions
sk ∈ Cc↑ and tk ∈ Cc↓ such that tk ≤ f ≤ sk and

1
(sk − tk ) dμ < .
P k
No generality is lost with the assumption that the sequence {sk } is nonincreasing,
for we may replace it by the sequence {s1 , min(s1 , s2 ), min(s1 , s2 , s3 ), . . . }, and
that the sequence {tk } is nondecreasing. By the Lebesgue dominated convergence
theorem
lim(sk − tk ) dμ = 0.
P
Thanks to Theorem 8.16, lim(sk − tk ) = 0 μ-almost everywhere. Thus lim sk = f
μ-almost everywhere and by the Lebesgue theorem with the dominating function
|s1 | + |t1 | we get that f is μ-integrable.
64 15. Radon Measures
15.6. Corollary. Let f be a μ-integrable function on P . Then

f dμ = inf{ s dμ : s ∈ Cc↑ , s ≥ f }.
P P
15.7. Exercise. Show that a finite positive measure μ on (P, ) is Radon if and only if for
every E ∈ and for every ε > 0 there exist a compact set K and an open set U such that
K ⊂ E ⊂ U and μ(U \ K) < ε.
15.8. Signed and Complex Radon Measures. A signed measure on (P, ) is said to be
Radon if its positive and negative variations are Radon measures. A complex measure on is
Radon if its real and imaginary parts are signed Radon measures.
Prove the following proposition: For a locally compact space P and a complex measure μ on
(P, ), the following assertions are equivalent:
(i) μ is Radon;
(ii) |μ| is Radon;
(iii) if E ∈ and ε > 0, then there exist a compact set K and an open set U so that
K ⊂ E ⊂ U and |μA| < ε for each -measurable set A ⊂ U \ K.
Hint. Use the inequality
|μ| (U \ K) ≤ 4 sup{μA : A ∈ , A ⊂ U \ K}.
To prove this inequality, use the decomposition as in the proof of Theorem 6.11.
15.9. Exercise. Let μ be a Radon measure on P . Prove that the union G of an arbitrary
collection of μ-null open sets is again a μ-null open set.
Hint. Suppose K ⊂ G is compact. Then K can be covered by a finite union of sets from ,
whence μK = 0. Now, the definition of a Radon measure (property (c)) shows that μG = 0.
15.10. Support of a Radon Measure. Let μ be a Radon measure on P . Define the support
of μ as [
supt μ = P \ {G : G open, μG = 0}.
In other words, supt μ is the smallest closed set whose complement si of measure zero. According
to the previous exercise, such a set always exists. If μ is a signed or complex measure, its support
is defined as the support of its total variation |μ|.
R
(a) Let μ be a positive measure. Show that z ∈ supt μ if and only if P f dμ > 0 for every
nonnegative function f ∈ c (P ) with f (z) > 0, which happens if and only if μU > 0 for every
open neighbourhood U of z.
(b) If μ1 , μ2 are Radon measures on P , then supt(μ1 + μ2 ) ⊂ supt μ1 ∪ supt μ2 with the
equality if μ1 and μ2 are positive.
15.11. Exercise. Let μ be a Radon measure and E a Borel set. If E is open or of finite
measure, then μE (see 2.4) is a Radon measure.
15.12. Exercise. Let μ be a Radon measure and f a nonnegative μ-measurable function on
P . Show that μf (see 8.19) is a Radon measure in the following cases:
(a) f is a Borel function and μf is finite on compact sets;
(b) f is continuous on P .
Hint. Use the part (a).
15.13. Exercise. Suppose that μ is a Radon measure and f ∈ L1 (μ). Show that μf is a
signed Radon measure.
15.14. Exercise. Let μ0 be a σ-finite Radon measure and μ = μa + μs the Lebesgue
decomposition of a finite (signed or complex) Radon measure μ with respect to μ0 . Show that
μa and μs are Radon measures.
Hint. Use Exercise 15.11.
15.15. Exercise. (a) Show that the assertion of Lemma 15.3 continues to hold, provided E
is a countable union of sets of finite measures.
(b) If a Radon measure μ is σ-finite, then
μE = sup{μK : K ⊂ E, K compact}
for every E ∈ , in particular for every Borel set E ⊂ P . (Recall that on σ-compact spaces
Radon measures are σ-finite. Every locally compact space with a countable base of open sets is
metrizable and σ-compact.)
(c) Let P be the Cartesian product R × Rd where Rd is R equipped with the discrete
topology. Show that P is a locally compact space which is not σ-compact.
P
For every open set G ⊂ P set μG = λGy (see the notation in 11.1) and extend μ to a
y∈Rd
Radon measure on P .
If B := {0} × Rd , then B is a Borel set, μB = ∞ and μK = 0 for every compact set K ⊂ P .
15.16. Exercise. (a) Let μ be a Radon measure on P and K ⊂ P compact. Show that
{x ∈ K : μ{x} > 0} is countable. In particular, if μ is a Radon measure on a σ-compact space
P , then {x ∈ P : μ{x} > 0} is countable.
(b) Let μ be a Radon measure on Rn . Then the set {r > 0 : μ{x ∈ Rn : |x| = r} > 0} is
countable.
15.17. Exercise. Let μ be a Radon measure on (P, ), p ∈ [1, ∞). Show that the set Cc (P )
is dense in p (μ).
Hint. With the help of Exercise 10.10.a, it is sufficient to approximate every set E ∈ of finite
measure by functions from K (P ) in the Lp -norm. If ε > 0, then there exist an open set G and
a compact set K (Lemma 15.7) so that K ⊂ E ⊂ G and μ(G \ K) < ε. Now, use Urysohn’s
lemma in order to find a function ϕ ∈ c (P ) with cK ≤ ϕ ≤ cG .
15.18. Exercise. A Radon measure μ on P is called discrete (often also atomic) if there
exists a set S ⊂ P such that μ(P \ S) = 0 and μ{x}
= 0 for all x ∈ S. A measure μ is called
continuous if μ{x} = 0 for each point x ∈ P .
(a) Show that a complex Radon measure μ on P P is discrete if and P
only if there exists a
sequence of numbers {cj } and points xj ∈ P so that |cj | < ∞ and μ = cj εxj . Cf. Exercise
j j
2.10.
(b) Show that every Radon measure μ can be uniquely expressed in the form μ = μd + μc
where μd is discrete and μc continuous.
Hint. Set S := {x ∈ P : μ{x} > 0}, μd A = μ(A ∩ S) and μc A = μ(A \ S) (use Exercise 15.16.a).
(c) Show that each Radon measure μ on Rn can be written uniquely in the form
μ = μd + μa + μs
where μd is discrete, μs is continuous, μa is absolutely continuous with respect to the Lebesgue

measure λ and μs and λ are mutually singular.
15.19. Exercise. Let μ be a σ-finite Radon measure on P and E ∈ .
(a) Show that for every ε > 0 there exist an open set G and a closed set F such that
F ⊂ E ⊂ G and μ(G \ F ) < ε.
S
Hint. Suppose E = Ej , where Ej are of finite measure. Find open sets Gj ⊃ Ej such that
j
S
μGj < μEj + ε2−j−1 and set G = Gj . In a similar way find F (considering the complements).
j
66 16. Riesz Representation Theorem
(b) Show that there exists an Fσ set S and a Gδ set D such that S ⊂ E ⊂ D and μ(D \S) = 0
(compare also with Theorem 1.21).
15.20. Exercise. Let μ be a Radon measure on (P, ) and consider the outer measure
μ∗ A := inf{μ(G) : G open, G ⊃ A}.
(a) If A ∈ , then A is μ∗ -measurable in the sense of 4.4. and μ = μ∗ on . Thus the

extension of μ to M(μ∗ ) is the “maximal” representative of μ in the sense of Remark 15.2.2.
(b) Let μ be as in Exercise 15.15.c. Show that the completion of μ is not the “maximal
representative”.
15.21. Notes. The important step from the Lebesgue measure to the study of more general
measures in Euclidean spaces was done by J. Radon [1913]. The notion of a Radon measure is
connected with his name, even if this term is not entirely common and some authors use other
synonyms instead.
16. Riesz Representation Theorem
In this chapter, let us direct our attention to the relation between Radon
measures and Radon integrals on a locally compact space P .
Suppose μ is a Radon measure on P . Then Cc (P ) ⊂ L 1 (μ) and the mapping

f → f dμ , f ∈ Cc (P )
P
is a Radon integral on P .
On the other hand, every Radon integral A on P can be understood as an
integral with respect to a Radon measure. In Chapter 14 we indicated a method
of proving this proposition, and now we propose to show another method and
provide complete proofs.
Symbols μA and μ∗A of this chapter will be used to denote the same objects as
previously in Chapter 14, but the definitions are revised.
In the sequel, A denotes a Radon integral on a locally compact space P .
16.1. Outer Radon Measure. If G is an open subset of P , we set
μ∗A (G) = sup {Af : f ∈ Cc (P ), 0 ≤ f ≤ 1, f = 0 on P \ G} .
Clearly μ∗A is monotone on open sets. We define
μ∗A (E) = inf {μ∗A (G) : G open, G ⊃ E}
for an arbitrary set E ⊂ P .

The set function μ∗A will be called the outer Radon measure (assigned to the
Radon integral A). In the next theorem we will prove that μ∗A (which will be
simply denoted by μ∗ ) is really an outer measure.
16.2. Properties of Outer Radon Measures. Let μ∗ be an outer Radon

measure. Then
(a) μ∗ K = inf {Ag : g ∈ Cc (P ), 0 ≤ g ≤ 1, g = 1 on K} for every compact
set K ⊂ P (in particular, μ∗ is finite on compact sets);
(b) μ∗ G = sup {μ∗ K : K compact, K ⊂ G} for every open set G ⊂ P ;
(c) μ∗ is an outer measure.
Proof. (a) Let K ⊂ P be a compact set and g ∈ Cc (P ), 0 ≤ g ≤ 1, g = 1 on
K. Fix ε ∈ (0, 1) and denote G = {x ∈ P : g(x) > 1 − ε}. Obviously K ⊂ G. If
f ∈ Cc (P ), 0 ≤ f ≤ 1, f = 0 on P \ G, then f ≤ 1−ε
1
g. Hence μ∗ G ≤ 1−ε
1
Ag and
Ag ≥ (1 − ε)μ∗ G ≥ (1 − ε)μ∗ K .
We see that Ag ≥ μ∗ K, and thus
μ∗ K ≤ inf {Ag : g ∈ Cc (P ), 0 ≤ g ≤ 1, g = 1 on K} .
To prove the reverse inequality, select an open set G ⊃ K. Urysohn’s lemma
provides a function g ∈ Cc (P ), 0 ≤ g ≤ 1, g = 0 on P \ G and g = 1 on K. Since
Ag ≤ μ∗ G, we are done.
(b) Suppose we are given an open set G ⊂ P and ε > 0. Let f ∈ Cc (P )
be a function such that 0 ≤ f ≤ 1, f = 0 on P \ G and choose ε > 0. Since
fk := min(f, k1 ) 0, an appeal to Daniell’s
property reveals
the existence of an
n ∈ N such that Afn < ε. If K := x ∈ P : f (x) ≥ n1 , then K is a compact
subset of P . According to (a), there exists g ∈ Cc (P ) such that 0 ≤ g ≤ 1, g = 1
on K and Ag ≤ μ∗ K + ε. Since f − fn ≤ g, we get
Af ≤ Afn + Ag ≤ μ∗ K + 2ε
and we have finished.
(c) Clearly μ∗ ∅ = 0, and μ∗ S ≤ μ∗ T whenever S ⊂ T . It remains to show
that μ∗ is σ-subadditive. To this end, let ε > 0 and K1 , K2 ⊂ P be given. By
Theorem 16.2.a we can find fj ∈ Cc (P ), j = 1, 2 such that fj = 1 on Kj and
Afj ≤ μ∗ Kj +ε. Then μ∗ (K1 ∪K2 ) ≤ A(f1 +f2 ) = Af1 +Af2 ≤ μ∗ K1 +μ∗ K2 +2ε
and we see that μ∗ is subadditive on compact sets.
Next consider open sets G1 , G2 ⊂ P and pick a compact set K ⊂ G1 ∪ G2 . For
any point x ∈ K there is its neighbourhood Vx whose closure is either in G1 or
in G2 . Thanks to the compactness of K, we obtain finite collections of open sets

{Vi1 }i , {Vi2 }i such that Vij ⊂ Gj and Vi1 ∪ Vi2 ⊃ K. Set Kj = K ∩ Vij ,
i i i
j = 1, 2. Then Kj are compact, Kj ⊂ Gj and K = K1 ∪ K2 . Thus μ∗ K ≤
μ∗ K1 + μ∗ K2 ≤ μ∗ G1 + μ∗ G2 and it readily follows that μ∗ is subadditive on
open sets.

∞
Now, let {Gn } be a sequence of open sets. Choose a compact set K ⊂ Gi .
i=1

n
Then K ⊂ Gi for some n ∈ N, whence
i=1
∞

n
n
∗ ∗
μ K≤μ Gi ≤ μ∗ Gi ≤ μ∗ Gi ,
i=1 i=1 i=1
∞
∗

∞
and (b) yields that μ Gi ≤ μ∗ Gi .
i=1 i=1
∗
Finally, let En ⊂ P be arbitrary. We wish to show that μ∗ ( En ) ≤ μ En .
It would be clearly sufficient to assume that μ∗ En < ∞ for all n. Given ε > 0,
we can find open sets Gn such that En ⊂ Gn and μ∗ Gn < μ∗ En + 2−n ε. Then

μ∗ En ≤ μ∗ Gn ≤ μ∗ En + ε .
As ε > 0 was arbitrary, μ∗ is σ-subadditive as needed.

16.3. Theorem. Every Borel subset of P is μ∗ -measurable (in the sense of
Carathéodory).
Proof. It is sufficient to prove measurability of open sets. Notice that by virtue of
Theorem 16.2.c we have μ∗ (G1 ∪G2 ) = μ∗ G1 +μ∗ G2 whenever G1 , G2 are disjoint
open sets. Now, given an open set G ⊂ P , a test set T ⊂ P such that μ∗ T < ∞
and ε > 0, there exist open sets V and H such that V ⊃ T , μ∗ V < μ∗ T + ε and
H ⊃ V \ G, μ∗ H < μ∗ (V \ G) + ε. We can also find a compact set K ⊂ V ∩ G
such that μ∗ K + ε > μ∗ (V ∩ G) and an open set W with a compact closure such
that K ⊂ W ⊂ W ⊂ V ∩ G. Set W0 = V ∩ H \ W . Then W0 is an open set,
W ∩ W0 = ∅, W ∪ W0 ⊂ V , V \ G ⊂ W0 and μ∗ W + ε > μ∗ (V ∩ G). Thus
μ∗ (T ∩ G) + μ∗ (T \ G) ≤ μ∗ (V ∩ G) + μ∗ (V \ G) < μ∗ W + ε + μ∗ W0
= μ∗ (W ∪ W0 ) + ε ≤ μ∗ V + ε < μ∗ T + 2ε.
(In fact, the idea of the proof is quite simple, T ∩ G and T \ G are approximated
by disjoint open sets.)
16.4. Measure μA . Any Radon integral A on a locally compact space P defines
a corresponding outer Radon measure μ∗A . Carathéodory’s construction yields the
σ-algebra MA on which μ∗A is a complete Radon measure. The restriction of μ∗A
to MA will be denoted by μA .
16.5. Riesz Representation Theorem. Let A be a Radon integral on P .
Then there exists a complete Radon measure μ on P such that Cc (P ) ⊂ L 1 (μ)
and Af = P f dμ for each f ∈ Cc (P ). The measure μ is unique on B(P ).
Proof of the existence. If μ = μA , then μ is a complete Radon measure. To
complete the proof we only have to show that

Af = f dμ, f ∈ Cc (P ).
P
The reader should find it easy to prove that CK (P ) ⊂ L 1 (μ).

Let f ∈ Cc (P ) be given. It is no restriction to assume that 0 ≤ f ≤ 1. For
n ∈ N and k = 0, 1, . . . , n denote
k k
fk := min f, , Gk := x ∈ P : f (x) > .
n n
Then using the definition of μ∗ G and the properties of integral it is clear that

1 1 1 1
μGk ≤ A(fk − fk−1 ) ≤ μGk−1 and μGk ≤ (fk − fk−1 ) dμ ≤ μGk−1
n n n P n
for each k = 0, 1, . . . , n. Thus

n
Af − f dμ = A(fk − fk−1 ) − (fk − fk−1 ) dμ

P k=1 P

n
1 1 1
≤ (μGk−1 − μGk ) = μG0 = μ{x ∈ P : f (x) > 0}.
n n n
k=1

Since μ{x ∈ P : f (x) > 0} < +∞, we get Af = P
f dμ and we are done.
Proof of the uniqueness. If ν is a complete Radon measure on P obeying the
assumptions of the theorem, we have

Af = f dν ≤ cG dν = νG
P P
for any open set G ⊂ P and f ∈ CK (P ) with 0 ≤ f ≤ 1, f = 0 on P \ G. Whence

μA G = μ∗A G ≤ νG. Taking into account Theorem 16.2.a, we have μA K ≥ νK for
each compact set and from regularity of ν and μA it immediately follows that ν
and μA coincide on Borel sets.
16.6. Remark. The previous theorem guarantees a one-to-one correspondence between Radon
integrals and classes of equivalence of Radon measures (cf. Remark 15.2).
16.7. Other Spaces of Continuous Functions. We denote by Cb (P ) the
Banach space of all bounded continuous functions on P equipped with the norm
f = sup |f (x)| .
x∈P
The space
C0 (P ) := {f ∈ C (P ) : for every ε > 0 there exists a compact set Kε ⊂ P

such that |f (x)| < ε for all x outside Kε }
of all continuous functions on P “vanishing at infinity” is the closure of Cc (P ) in

Cb (P ). If μ is a finite Radon measure on P , then

f → f dμ
P
is a positive linear functional on Cb (P ) and on C0 (P ) as well. The converse

proposition is also true for the space C0 (P ) as an easy consequence of the Riesz
Representation Theorem 16.5.
Let A be a positive linear functional on C0 (P ). Then there exists a unique

finite
complete Radon measure μ on P such that C0 (P ) ⊂ L 1 (μ) and Af = P f dμ
for every function f ∈ C0 (P ).
16.8. Exercises. Let P , Q be locally compact spaces, h a continuous mapping from P onto
Q and μ a Radon measure on (P, (P )).
(a) Show that f ◦ h ∈ 1 (μ) for every function f ∈ c (Q) provided μP < +∞ or h−1 (F ) is
compact for every compact F ⊂ Q.
R
(b) Show that the mapping A : f → P f ◦ h dμ, f ∈ c (Q), is a Radon integral on Q.
(c) By the Riesz Representation Theorem there exists a unique Radon measure μ on
(Q, (Q)) which represents A. Prove that μ = h(μ) (notation as in Exercise 8.23).
Hint. Show that μ (K) = f (μ)(K) for every compact set K.
16.9. Product of Radon Measures. Consider locally compact spaces P1 and P2 . Their
Cartesian product P1 × P2 is again a locally compact space. If P1 and P2 have countable bases,
then the product P1 × P2 has a countable base as well.
We are now interested in the product of two Radon measures. Two main problems arise:
1. In general, a Radon measure is not necessarily σ-finite and we cannot speak about a product
measure in the sense of Chapter 11.
2. Although (P1 ) ⊗ (P2 ) ⊂ (P1 × P2 ), the equality in general does not hold. Thus even if
the original Radon measures on P1 and P2 are σ-finite, their product is not necessarily defined
on (P1 × P2 ) and so it cannot be called a Radon measure.
In the sequel, we show a way how to define a product of Radon measures.
(a) Let P1 , P2 be locally compact spaces with countable bases (and so metrizable). Then
(P1 × P2 ) = (P1 ) ⊗ (P2 ) and if μ1 , μ2 are Radon mesures on (P1 ), (P1 )), (P2 , (P2 )),
then μ1 ⊗ μ2 is a Radon measure on (P1 × P2 , (P1 × P2 )).
Hint. It is not hard to verify that (P1 ) ⊗ (P2 ) = (P1 × P2 ). Hence μ1 ⊗ μ2 is defined on
(P1 × P2 ). Since P1 × P2 has a countable basis, it is sufficient to show that μ1 ⊗ μ2 is finite
on compact sets. Select a compact set K ⊂ P1 × P2 and denote Ki projections of K onto Pi .
Then K1 and K2 are compact sets (as continuous images of compact sets) and K ⊂ K1 × K2 .
Then
μ1 ⊗ μ2 (K) ≤ μ1 ⊗ μ2 (K1 × K2 ) = μ1 K1 · μ2 K2 < +∞.
(b) Consider now the general case when P1 and P2 have not necessarily countable bases. By
Exercise 14.13 there is a Radon integral A on P1 × P2 such that
Af = A1 f1 · A2 f2
whenever f1 ∈ c (P1 ), f2 ∈ c (P2 ) and f (x1 , x2 ) = f1 (x1 )f2 (x2 ), x1 ∈ P1 , x2 ∈ P2 . Show that
μA is a unique Radon measure μ on (P1 × P2 ) satisfying
μ(E1 × E2 ) = μ1 E1 · μ2 E2
whenever E1 ∈ (P1 ) and E2 ∈ (P2 ). If, in addition, μ1 , μ2 are σ-finite, then
μA (E) = (μ1 ⊗ μ2 )E
for every set E ∈ (P1 ) ⊗ (P2 ).
(c) Using results of 16.10, it is possible to define a product of complex measures as well.
16.10. Representations of Signed and Complex Radon Integrals. (a) Let ν be a
signed or complex Radon measure on P and a a bounded Borel function on (P ). Then the
mapping
Z
A : f → af dν
P
is a complex Radon integral. If, in addition, νP < +∞, then A is a continuous linear functional
on 0 (P ).
(b) Let ν be a finite signed or complex Radon measure on P . Then the mapping
Z
A : f → f dν
P
is a continuous linear functional on the space 0 (P ).

(c) Let A be a complex Radon integral on P (signed Radon integrals are particular cases)
and let |A|R be as in 14.11. By the Riesz representation theorem there is a Radon measure μ
such that f dμ = |A| f for each f ∈ c (P ). Show that there exists a bounded RBorel function
a (uniquely determined as an element of L∞ (P, μ)) such that the equality Af = P af dμ holds
for every function f ∈ c (P ). Moreover, |a| = 1 μ-almost everywhere.
Hint. The function a can be defined locally, so the problem can be reduced to the case when P
is compact. Our approach will be analogous to that in proving the Radon-Nikodým theorem.
The functional A is uniformly continuous on (P ) with respect to the norm of the Hilbert
space L2 (P, μ), thus it can be continuously extended to the entire space L2 and the Riesz
Representation Theorem on representation of continuous linear
R functionals on a Hilbert space
yields the existence of an element a ∈ L2 such that Au = au dμ for all u ∈ L2 . Proof of the
fact that |a| = 1 μ-almost everywhere is more difficult, see e.g. G.K. Pedersen [*1989].
(d) Let A be a continuous linear functional on 0 (P ). Then there exists a unique complete
signed (or complex,R depends on what field we consider) measure μ on P such that 0 (P ) ⊂
P f dμ for each f ∈ 0 (P ).
1 (|μ|) and Af =
16.11. Notes. The famous Riesz Representation Theorem 16.5 was proved for P = [0, 1] by
F. Riesz [1909b] but continuous linear functionals on ([0, 1]) were represented by functions of
bounded variation using Riemann-Stieltjes integrals. Other theorems of this type were proved by
J. Radon [1913] (for compact subsets of Rn ), S. Banach in Appendix to Saks’ monograph [*1937],
S. Saks [1938] (for compact metric spaces) and S. Kakutani [1941]. The Riesz Representation
Theorem for compact spaces and the method of construction of Radon measures directly from
Radon integrals is due to J. von Neumann [1934]. Concerning locally compact spaces, A. Weil
[1940] was aware about the result. However, the final version for locally compact spaces was
established by the Bourbaki group (A. Weil was also its member) in the 40’s and appeared in
[*1952].
17. Sequences of Measures
Throughout this chapter, P will be a locally compact space. We denote by

M (P ) the linear space of all signed Radon integrals on P and by M + (P ) the
set of all (positive) Radon integrals on P . If no confusion can result, we will not
distinguish between the Radon integral A and the Radon measure μA . Thus, for
μ ∈ M + (P ), we will write both μ(f ) (f ∈ Cc (P )) and μE (E ⊂ P ).
This chapter is for information, proofs will be given only to some propositions.
17.1. Strong and Weak Convergence. Let F be a linear subspace of C (P )
containing Cc (P ). We say that a sequence {μn } of Radon integrals on P converges
F -weakly to a Radon integral μ if

lim f dμn = f dμ
n→∞ P P
for each f ∈ F . Notice that the F -weak limit, if exists, is uniquely determined.
The most important case is the Cc (P )-weak convergence which is called the
v
vague convergence and denoted by μn → μ.
72 17. Sequences of Measures
From the point of view of functional analysis, the vague convergence is the weak* convergence.
In case of spaces of measures, it is common to omit the asterisk. Different spaces of measures
are duals to different spaces of continuous functions. If such a space is equipped with a norm
(or, more generally, with a locally convex topology), then it is possible to introduce a -weak
topology on its dual ∗ so that the -weak convergence of measures is in fact the convergence
in this ( -weak) topology. Notice that the space ∗ is not (except a few not very interesting
cases) metrizable. Hence the -weak topology cannot be described by convergence of sequences
and, in general, nets have to be used instead.
If P is a compact metric space, then the set + (P ) is metrizable in the (P )-weak topology.
In functional analysis, besides weak convergences we meet also strong conver-
gences. The most important example of a strong type convergence of measures is
the convergence μn − μ → 0 on the space Mb (P ) of all finite signed (or com-
plex) Radon measures. The norm μ defined as |μ| (P ) (like in Exercise 6.17)
is the dual norm to the norm of C0 (P ) and Mb (P ) with this norm is a Banach
space.
17.2. Comparison of Weak Convergences. Now we will touch the question
v
whether the vague convergence μn → μ implies the F -weak convergence for a
wider space of “test” functions. Proofs of next propositions require the Banach-
Steinhaus theorem of functional analysis.
v
(a) A sequence μn converges C0 (P )-weakly to μ if and only if μn → μ and the
sequence {μn } is bounded.
(b) As usually, by a weak convergence we understand the Cb (P )-weak conver-
gence where Cb (P ) denotes the set of all bounded continuous functions on P . A
sequence μn of complex measures from Mb (P ) converges weakly to a measure
v
μ ∈ Mb (P ) if and only in μn → μ and for every ε > 0 there exists a compact set
K ⊂ P so that |μn | (P \ K) < ε for all n.
A sequence μn of positive measures from Mb+ (P ) converges weakly to a measure
v
μ ∈ Mb+ (P ) if and only if μn → μ and μn → μ.
(c) Note that the weak convergence implies the C0 (P )-weak convergence and
this one implies the vague convergence. If the space P is compact, then Cb (P ) =
Cc (P ) = C (P ) and there is no difference between these convergences.
v
17.3. Examples. (a) If xn → x, then εxn → εx .
(b) Suppose that {xn } is a sequence having no convergent subsequence (for example, take
xn = n for P = R) and {αn } a sequence of real numbers. Then the sequence {αn εxn } converges
vaguely to the null measure. This sequence converges (to the null measure) 0 (P )-weakly if and
only if αn is a bounded sequence, and weakly if and only if αn → 0.
v
(c) Suppose xn → z, yn → z, xn
= yn . Then εxn − εyn → 0 but εxn − εyn → 2
= 0.
(d) If f is a continuous function on [0, 1], then
Z
1 X i
1 k
f (x) dx = lim f( ) .
0 k→+∞ k i=1 k
As if often the case, this equality can become the basis for a definition of the Riemann integral,
but it cannot be used to describe the set of Riemann integrable functions. In the language of
the weak convergence of measures it can be understood as
1X
k
v
ε i → λ[0,1] .
k i=1 k
The above examples show that the weak or vague convergence (unlike the
strong convergence, cf. Exercise 6.18) of μn to μ does not imply μn (A) → μ(A)
for all (Borel) sets. Nevertheless, we can state the following theorem for compact
spaces. Its modifications hold in locally compact spaces as well.
v
17.4. Theorem. Let P be a compact space and μn , μ ∈ M + (P ), μn → μ.
(a) For any lower semicontinuous lower bounded function u on P we have

u dμ ≤ lim inf u dμn .
P n→∞ P
(b) If f is a bounded Borel function on P which is continuous μ-almost ev-

erywhere, then
f dμ = lim f dμn .
P n→∞ P
(c) If G ⊂ P is an open set, then lim inf μn (G) ≥ μ(G).

(d) If K ⊂ P is a compact set, then lim sup μn (K) ≤ μ(K).
(e) If A is a Borel set such that μ(∂A) = 0, then μn (A) → μ(A).
Proof. (a) is an easy consequence of the definition.

(b) Define f ∗ , f∗ as in 7.9.b. If f ∗ = f∗ μ-almost everywhere, then f is μ-
measurable and

f dμ = f∗ dμ ≤ lim inf f∗ dμn ≤ lim inf f dμn
n→∞ n→∞
P P
P
P

∗ ∗
≤ lim sup f dμn ≤ lim sup f dμn ≤ f dμ = f dμ.
n→∞ P n→∞ P P
To prove (c), (d) and (e) we can apply (a) and (b) to indicator functions.
17.5. Remark. Compare Theorem 17.4 with the well-known result that a bounded function
f is Riemann integrable if and only if the set of all points of discontinuity of f is of measure zero
(see 7.9.d). It is possible to prove that a bounded Borel function f is Riemann integrable on
R R
[0, 1] if and only if [0,1] f dμn → 01 f dλ for every sequence μn of positive measures converging
weakly to λ on [0, 1].
v
17.6. Exercise. Suppose μn , μ ∈ + (P ). Show that μ → μ if and only lim sup μ (K) ≤
n n
μ(K) for each compact set K ⊂ P and lim inf μn (G) ≥ μ(G) for each open set G ⊂ P .
17.7. Molecular Measures. A Radon measure ν is called molecular if

k
k
ν = αi εxi where x1 , . . . , xk ∈ P and α1 , . . . , αk are positive and αi = 1.
i=1 i=1
As we have seen in Example 17.3.d, the definition of the Riemann integral is
closely related to an approximation of “continuous” Lebesgue measure by “dis-
crete” molecular measures. In a similar way, more general measures can be ap-
proximated as the next theorem shows.
17.8. Theorem. Let μ be a (positive) Radon measure on a compact metric
space P . Then there exists a sequence {μn } of molecular measures such that
v
μn → μ.
74 18. Luzin’s Theorem
Proof. We sketch the main idea of the proof. Thanks to the compactness of P ,
for every k ∈

N ithere is a finite collection {Mki } of pairwise disjoint nonempty
−k
sets so that Mk = P and diam Mk ≤ 2 . Choose points xik ∈ Mki and set
i
i

μk = μ(Mki )εxik .
i
17.9. Remark. The previous theorem can be proved as a nice application of the Krejn-
Milman theorem or of the bipolar theorem. However, both of them belong to deeper theorems
of functional analysis.
The following theorem has also its interpretation in the language of functional analysis: On
dual spaces bounded sets are relatively sequentially compact in the weak* topology.
17.10. Theorem. Let P be a metric compact space and {μn } a sequence of

complex Radon measures on P . If supn μn < +∞, then there exists a weakly
convergent subsequence of {μn }.
Proof. The method of the proof uses the Cantor diagonal selection process. Since
C (P ) is separable, there is a countable dense set {fk } ⊂ C (P ). Now step by
step construct a sequence of sequences of measures {{μkn }n }k so that μ0n = μn
and every {μkn }n (k ≥ 1) is a subsequence of {μk−1 n }n for which the sequence of
numbers {μkn (fk )}n is convergent. This can be done by the Bolzano-Weierstrass
theorem since the sequence {μk−1 n
n (fk )}n of real numbers is bounded. If νn := μn ,
then {νn } is a subsequence of {μn } and since for every k the sequence {νn }n≥k is a
subsequence of {μkn }n , the sequences {νn (fk )}n are convergent. Choose f ∈ C (P )
and ε > 0. Find g ∈ {fk } such that f − g < ε. There is an n0 such that
|νi (g) − νj (g)| < ε for all i, j ≥ n0 . Then by the triangle inequality
|νi (f ) − νj (f )| ≤ (1 + 2 sup μn )ε

n
for all i, j ≥ n0 . Hence {νn (f )} is a Cauchy sequence and therefore it is conver-

gent. Define a functional μ on C (P ) as
μ(f ) = lim νn (f ).
n→∞
It remains only to show that μ is a Radon integral on P and μ is a weak limit of

{νn }, which is easy.
v
17.11. Exercise. Let P be a compact metric space and μn → μ. Show that
μ ≤ lim inf μn
(i.e. the norm is a weakly lower semicontinuous function on (P )).

17.12. Notes. The theory of weak convergence of probability measures has its roots in the
probability theory and mathematical statistics (see e.g. P. Billingsley [*1968]) and led, of course,
to a study of weak topologies on various subspaces of Radon measures. A similar notion of the
vague convergence was probably studied first by Bourbakists (see the second edition of their
monograph). Vague convergence is again nothing else than the weak convergence on the space
of measures determined by the inductive topology on c (P ). Theorem 17.10. is a special case

of the Alaoglu-Bourbaki theorem.
18. Luzin’s Theorem
If a measure space is endowed with a metric or a topological structure, a

natural question arises whether there is a closer relation between measurability
and continuity. The following theorems show that for complete Radon measures
on locally compact topological spaces this is the case.
18.1. Luzin’s Theorem. For a complete Radon measure μ on a locally compact
space P and a μ-almost everywhere finite function f on P , the following conditions
are equivalent:
(i) f is μ-measurable;
(ii) for any ε > 0 and any compact set K ⊂ P there is an open set G so that
μG < ε and f |K\G is continuous;
(iii) for any ε > 0 and any compact set K ⊂ P there exists a continuous
function ϕ on P such that
μ{x ∈ K : f (x) = ϕ(x)} < ε;
(iv) for every compact set K ⊂ P there exists a sequence {ϕn } of continuous
functions on P such that
ϕn → f μ-almost everywhere on K.
Proof. (i) =⇒ (ii): Let {Uj } be a countable base for the topology on R (for
instance, a sequence of all intervals with rational endpoints). Fix an ε > 0 and
a compact set K ⊂ P . There exist open sets Gj ⊂ P and compact sets Fj ⊂ P
such that
Fj ⊂ K ∩ f −1 (Uj ) ⊂ Gj and μ(Gj \ Fj ) < 2−j ε.

Set G = (Gj \ Fj ). Clearly G is open and μG < ε. Denote Y = K \ G and fix

an index j. Then Y ∩ Gj = Y ∩ Fj . Thus Y ∩ f −1 (Uj ) = Y ∩ Gj is an open subset
of Y and we see that f |Y is continuous.
(ii) =⇒ (iii): Select again ε > 0. Find an open set G ⊂ P such that μG < ε and
f |K\G is continuous. By Tietze’s extension theorem, there exists a continuous
function ϕ on P such that f = ϕ on K \ G.
(iii) =⇒ (iv): Find continuous functions ϕk on P so that μEk < 2−k where
Ek = {x ∈ K : f (x) = ϕk (x)}. If
∞
∞
E := Ek (= lim sup Ek ),
n=1 k=n
then μE = 0 (cf. with the proof of the Borel-Cantelli lemma 2.14). If x ∈ K \ E,

then there exists n0 such that x ∈
/ En for n ≥ n0 and ϕn (x) → f (x).
The implication (iv) =⇒ (i) is obvious.
For locally compact spaces which can be expressed as countable unions of
compact sets we get the following corollary.
76 19. Measures on Topological Groups
18.2. Luzin’s Theorem. Let μ be a complete Radon measure on a σ-compact

locally compact space P and f a μ-almost everywhere finite function on P . Then
the following conditions are equivalent:
(i) f is μ-measurable;
(ii) for any ε > 0 there exists an open set G so that μG < ε and f |P \G is
continuous;
(iii) for any ε > 0 there exist a continuous function ϕ on P and an open set
G such that μG < ε and ϕ = f on P \ G;
(iv) there exists a sequence {ϕn } of continuous functions on P such that
ϕn → f μ-almost everywhere on P.
Proof. This theorem is an easy consequence of the previous one.

18.3. Remarks. 1. Only the equivalence (i) ⇐⇒ (ii) is usually called Luzin’s theorem.
2. Notice that a measurable function can be discontinuous at all points(!), consider e.g. the
Dirichlet function on R. Luzin’s theorem says that the restricted function is continuous when
omitting a “small” set. A characterization of functions which are continuous at almost all points
of the interval [0, 1] was described in 7.9 and 17.5.
3. In general, it is not true that a measurable function is continuous omitting a null set.
Consider, for example, the indicator function of a discontinuum of a positive measure (see 1.13).
4. Another interesting characterization of measurable functions for the case of the Lebesgue
measure in Rn is given by Denjoy’s theorem 29.9.
5. By Theorem 18.2, every Lebesgue measurable function on R is a limit of a sequence of
continuous functions in the sense of the convergence almost everywhere. The collection of
functions which are pointwise limits of continuous functions (so called functions of the Baire
class one) is not very wide. For example, the Dirichlet function (see 7.5) does not belong to it.
18.4. Exercise. Show that a μ-almost everywhere finite function f on P is μ-measurable if
compact set K ⊂ P there exist compact sets Kn ⊂ K and a μ-null set E
and only if for any S
such that K = E ∪ Kn and functions f |Kn are continuous.
18.5. Exercise. Give another proof of the assertion that p
c (P ) is dense in (cf. Exercise
15.17) with the help of Luzin’s theorem.
Hint. First approximate a function from p by a bounded measurable function with compact
support and then use Luzin’s theorem.
18.6. Notes. Luzin’s theorem for the case of the Lebesgue measure was proved by H. Lebesgue
[1903] and by N.N. Luzin [1912].
19. Measures on Topological Groups
19.1. Special Case. One of the fundamental properties of the Lebesgue

measure λ on the real line is its “translation invariance”: If x ∈ R and A ⊂ R
is measurable, then λ(x + A) = λA. The Lebesgue measure is in fact by this
property uniquely determined. Indeed, the following theorem is true (cf. Exercise
26.6):
Let μ be a Radon measure on R. If μ([0, 1]) = 1 and μ(x + A) = μA for every
x ∈ R and A ∈ B(R), then μ = λ on B(R).
Most of this chapter is devoted to a more general problem. Since the object
belongs to elements of harmonic analysis and the reader can consult many text-
books (e.g. G. Bachman [*1964] or K. Ross [*1963]), we merely outline the main
ideas and invite the reader to fill in the details.
19.2. Topological Group. Let us start with basic notions. By a topological
group we understand a group G together with a (Hausdorff) topology such that
the group operations (x, y) → xy and x → x−1 are continuous.
In the sequel, G stands for a topological group whose topology is locally com-
pact. The unit element of G will be denoted by e and the σ-algebra of all Borel
subsets of G by B(G). Finally, by S we will denote σ-algebras which appear as
domains of measures in consideration.
For x ∈ G and A ⊂ G, set
xA = {xy : y ∈ A}, Ax = {yx : y ∈ A}, A−1 = {x−1 : x ∈ A}.
19.3. Haar Measure. A left Haar measure on G is every nonzero Radon

measure μ on (G, S ) which satisfies μ(xA) = μA for each x ∈ G and A ∈ S . In
a similar way we define a right Haar measure. A measure which is simultaneously
both a left and a right Haar measure is called briefly a Haar measure.
19.4. Examples. (a) The Lebesgue measure is a (typical) example of a Haar measure on Rn
(the group operation is the addition).
(b) The counting measure is a Haar measure on every group equipped with the discrete
topology.
(c) Let G = (0, +∞) be the multiplicative group of positive real numbers endowed with the
Euclidean topology. If Z
dx
μA := , A Borel ,
A x
then μ is a Haar measure on G.
(d) Let G = C \ 0 be the multiplicative group of all nonzero complex numbers (with the
usual topology). If Z
1
μA = 2
dλ(z)
A |z|
for A ∈ (G), then μ is a Haar measure on G.

19.5. Example. Let G be the multiplicative group of all 2 × 2-matrices of type
„ «
a b
0 1
where a ∈ (0, +∞) and b ∈ R. There exists a one-to-one mapping of G onto (0, +∞) × R given
by „ «
a b
F: → (a, b).
0 1
Consider on G the (locally compact) topology determined by F from R2 . For A ∈ (G), set
Z Z
1 1
μA = 2
dx dy, νA = dx dy.
A x A x
(a) Show that μ is a left Haar measure and ν a right Haar measure on G.
(b) Show that νA = μ(A−1 ).
(c) Find a set A ∈ (G) such that μA < ∞ and νA = ∞.

(The group G can be viewed as the group of all affine transformations of R onto R of the
form t → at + b, a > 0, b ∈ R.)
19.6 Remark. If μ is a left Haar measure on G and μ̃ is defined as μ̃E := μ(E −1 ) (again
E −1 ∈ (G) if E ∈ (G)), then μ̃ is a right Haar measure on G. In the same way, any right
Haar measure determines a left one.
In what follows, we restrict ourselves to a study of left Haar measures. Fundamental prop-
erties are contained in the following theorem.
19.7. Theorem (existence and uniqueness of left Haar measure). (a) On any
locally compact group there exists a left Haar measure.
(b) If μ and ν are complete left Haar measures on G, then there exists c > 0
such that μ = cν.
Proof of this theorem is quite complicated and takes a lot of effort. There are var-
ious existence proofs, some of them as applications of deep theorems of functional
analysis. Here we outline a rough idea of an “elementary” existence proof.
Let V be an open neighbourhood of the unit element e of G. If E ⊂ G is a
compact set, denote by HV (E) the smallest n such that there exist x1 , . . . , xn ∈ G

n
so that E ⊂ xj V (existence of a finite cover follows from the compactness of
j=1
E). Let K be a fixed compact set with a nonempty interior. The required Haar
measure is then obtained by some limit procedure for “V → {e}” and by extending
set functions
HV (E)
E → , E compact.
HV (K)
A proper and complete description of the entire construction is not easy.
The proof of uniqueness will be given under an additional assumption that μ
is even a Haar measure. The general case is similar, but the technical details
are more difficult. First, there exists a nonnegative function h ∈ Cc (G) so that
h(e) = 0 and h(x) = h(x−1 ) for x ∈ G (if g ∈ Cc (G), g(e) = 0, g ≥ 0, set
h(x) = g(x) + g(x−1 )). Then G h dμ > 0 (see Exercise 19.14.b) and for an
arbitrary function f ∈ Cc (G) we get

h dν f dμ = h(y) f (xy) dμ(x) dν(y)
G G
G G
= h(y)f (xy) dμ⊗ν(x, y)

G×G

= h(x−1 z)f (z) dμ⊗ν(x, z)
G×G

= h(x−1 z)f (z) dν(z) dμ(x)
G G

= f (z) h(z −1 x) dμ(x) dν(z) = h dμ f dν.
G G G G
To give reasons for particular steps when using Fubini’s theorem notice that f
and h have compact supports and R that μ and ν are Radon measures. We can
h dμ
finish the proof by setting c := RG h dν .
G
Haar measures have many interesting properties. Let us state some of them.
19.8. Theorem. Let μ be a left Haar measure on a locally compact group G.
Then:
(a) μU > 0 for any nonempty open set U ⊂ G;
(b) μG < +∞ if and only if G is compact.
Proof. (a) Let U ⊂ G be a nonempty open set. We can

assume e ∈ U . There ex-
ists a compact set K ⊂ G with μK > 0. Then K ⊂ xU and the compactness
x∈K

n
of K yields the existence of x1 , . . . , xn ∈ K such that K ⊂ xi U . Then
i=1

n
0 ≤ μK ≤ μ(xi U ) = nμ(U ).
i=1
(b) Suppose μG < +∞. Let K ⊂ G be a compact set of a positive measure.

There exist x1 , . . . , xn ∈ G so that the sets xi K, i = 1, . . . , n are pairwise disjoint

n
but all (“remaining”) x ∈ G satisfy xK ∩ ( xi K) = ∅. (Indeed – consider finite
i=1
sequences x1 , . . . , xn for which x1 K, . . . , xn K are pairwise disjoint. Their number

n
is bounded for instance by μG/μK. If x ∈ G, then xK ∩ ( xi K) = ∅ and it
n i=1

−1
follows that G = xi K K is compact.
i=1
19.9. Modular Function. Let μ be a left Haar measure on a locally compact

group G. If x ∈ G and μx A := μ(Ax) for A ∈ B(G), then μx is obviously a left
Haar measure. By the uniqueness part of Theorem 19.7 there exists Δ(x) > 0
such that μx = Δ(x)μ. Again, by uniqueness, Δ(x) does not depend on the choice
of a left Haar measure on G. The function Δ : x → Δ(x) : G → (0, +∞) is called
the modular function of G.
19.10. Theorem. Let Δ be the modular function of G. The following condi-
tions are equivalent:
(i) every left Haar measure on G is also a right Haar measure;
(ii) Δ = 1 on G.
Proof is easy and is omitted.

19.11. Unimodular Group. A locally compact group G is said to be unimod-
ular if the modular function Δ = 1 on G. In other words, G is unimodular if the
classes of left and right Haar measures on G coincide. Every commutative group
is unimodular. However, there exist examples of noncommutative unimodular
groups as indicated in the next theorem.
19.12. Theorem. Any commutative, discrete or compact group is unimodular.

Proof. If G is discrete, then every left Haar measure is a multiple of the counting
measure. Hence, it is also “right invariant”. If μ is a left Haar measure on a
compact group G and x ∈ G, then G = Gx and μG = μ(Gx) = Δ(x)μG. Since
0 < μG < ∞, we obtain Δ(x) = 1.
If G is a compact group, then each left Haar measure on G is a Haar measure
and we get immediately the following theorem.
19.13. Theorem. On every compact topological group G there exists a unique
complete Haar measure μ satisfying μG = 1. Moreover, μE = μE −1 for every
E ∈ B(G).
19.14. Exercise. Let μ be a left Haar measure. Prove the following assertions.
R R
(a) If f ∈ c (G) and y ∈ G, then G f dμ = G f (yx) dμ(x). The last equality holds if f is a
nonnegative μ-measurable function on G or if f ∈ 1 (μ).
R
(b) If f ∈ c (G) is nonnegative and f (e) > 0, then G f dμ > 0.
(c) The measure μ is σ-finite if and only if G is σ-compact.
(d) The topology on G is discrete if and only if μ{x}
= 0 for some (and thus for all) x ∈ G.
19.15. Exercise. Calculate the modular function of the group from Example 19.5.
19.16. Exercise . Let Δ be a modular function on a locally compact group G. Show that:
(a) Δ is continuous;
(b) Δ(xy) = Δ(x)Δ(y) for all x, y ∈ G;
(c) Δ(e) = 1.
19.17. Exercise. Let μ be a left Haar measure
R on G. Show that
R the set function μ̃, where
μ̃E := μE −1 , is a right Haar measure and G f (x)Δ(x−1 ) dμ = G f dμ̃ for every f ∈ c (G).
dμ̃
In other words, the Radon-Nikodým derivative equals Δ(x−1 ).
dμ
R
Hint. Show that the function ν on (G) defined by νA := A Δ(x−1 ) dμ is a right Haar
measure. Thus ν = K μ̃ for some K. Now use the fact that Δ is a continuous function attaining
the value 1 at the unit element e.
19.18. Exercise. Show that the left and right Haar measures on G are mutually absolutely
continuous.
Hint. Use Exercise 19.17.
19.19. Convolution of Functions. In the following exercises, let χ be a

fixed right Haar measure on (G, B(G)). If f, g ∈ L 1 (χ) and x ∈ G, define the
convolution of f and g at x as

f ∗ g (x) = f (xy −1 )g(y) dχ(y)
G
provided the integral on the right-hand side exists.

19.20. Exercise. If f, g ∈ 1 (χ), then the convolution f ∗ g is defined χ-almost everywhere
f ∗ g ∈ 1 (χ) and f ∗ g1 ≤ f 1 g1 . Thus, L1 (χ) with convolution as multiplication is a
Banach algebra.
19.21. Exercise. Show that convolution is commutative if and only if G is commutative.
19.22. Exercise. Show that the Banach algebra (L1 (χ), ∗) has a multiplicative unit if and
only if G is endowed with the discrete topology. As an example consider the group Z of all
integers with the discrete topology and counting measure.
19.23. Involution. If f ∗ (x) := f (x−1 )(Δ(x))−1 , then the mappingR f ˛→ f ∗ of˛the space L1 (χ)
onto itself (which is called the involution) is an isometry. (Show that G ˛f (x−1 )˛ (Δ(x))−1 dχ =
R ∗
G |f (x)| dχ for f ∈ c (G); hence it follows that f = f if f ∈
1 (χ).)
19.24. Convolution of Measures. In the following, G is again a locally com-

pact topological group. By Mb (G) denote the set of all complex Radon measures
on (G, B(G)). According to 17.1, Mb (G) with the norm defined by μ := |μ| (G)
is a Banach space.
We define the convolution of measures μ, ν ∈ Mb (G) as
μ∗ν (E) := τ {(x, y) ∈ G × G : xy ∈ E}
for any Borel set E ⊂ G, where τ is the product of complex Radon measures
μ and ν, cf. 16.9.c. (Verify that {(x, y) ∈ G × G : xy ∈ E} is a Borel subset of
G × G!)
19.25. Exercise. Prove that μ ∗ ν is a complex measure on (G), μ ∗ ν ∈ b (G) and that
the inequality μ ∗ ν ≤ μ · ν holds.
19.26. Exercise. Show that
(a) the operation of convolution of measures is associative;
(b) the convolution of measures on G is commutative if and only if G is commutative.
19.27. Exercise. Prove that if e is the unit of G, then the Dirac measure εe is the unit of
( b (G), ∗).
19.28. Remark. The reader may find it interested to investigate a relation between convolu-
tion of measures and convolution of functions. If μ, ν ∈ b (G), then
Z
μ ∗ ν (E) = cE (xy) dμ⊗ν ,
G×G
hence Z Z
h dμ∗ν = h(xy) dμ⊗ν.
G G×G
for any bounded Borel function h on G. Therefore, for f, g ∈ 1 (χ) we have
d(χf ∗ χg )
f ∗g = ,
dχ
R
where χf is the complex Radon measure given by χf (E) = E f dχ. We see that the “new”
definition of convolution of functions agrees with the “former” one.
19.29. Exercise. Give a definition of a convolution of a (complex) function f ∈ 1 (χ) and a
complex Radon measure μ ∈ b (G). Prove that f ∗ μ ∈ 1 (χ) and that f ∗ μ1 ≤ f 1 μ.
In fact, L1 (χ) is not only a subalgebra of b (G) but even an ideal.
19.30. Notes. The translation invariant measures (or integrals) on compact Lie groups
were studied by F. Peter and H. Weyl [1927]. Remarkable progress was made by proving the
existence of a left Haar measure for separable locally compact groups by A. Haar [1933] and
by J. von Neumann in [1934] (the existence and uniqueness) for arbitrary compact groups, and
in [1936] (uniqueness) for separable locally compact groups and also by A. Weil especially in
[*1940]. Their proofs used the axiom of choice. H. Cartan [1940] and G.E. Bredon [1963] then
gave proofs without using the axiom of choice. A relatively short proof of the uniqueness result
can be found in S. Kakutani [1948]. More detailed historical notes are given by E. Hewitt and
K.A. Ross in [*1963]. It is said that Example 19.5. is due to J. von Neumann [1936]. Note that
Haar measures are special cases of invariant measures which can be studied in a more general
setting, cf. Banach’s appendix of Saks’ monograph [*1937] or H. Federer [*1969].
82
D. Integration on R
20. Integral and Differentiation
In the following chapters we will investigate properties of real functions involv-

ing the Lebesgue measure and Lebesgue integration on the real line.
In this chapter, our aim is to show an inequality between the measure of f (E)
and the integral of f over E. As a particular case, we obtain Sard’s lemma on
the real line.
Let K be a positive real number. We say that a real function f is a K-Lipschitz
function on a set E ⊂ R if the inequality
|f (x) − f (y)| ≤ K |x − y|
holds for all x, y ∈ E. If f is a K-Lipschitz function on E for some K, we say
simply that f is a Lipschitz function on E.
20.1. Lemma. Let f be a K-Lipschitz function on a set E ⊂ R. Then
λ∗ f (E) ≤ Kλ∗ E.
Proof. The assertion is obvious in the case λ∗ E = +∞. If λ∗ E <
+∞, select
ε > 0 and find a sequence of open intervals (aj , bj ) with E ⊂ (aj , bj ) and
j
(bj − aj ) ≤ λ∗ E + ε. By hypothesis there are intervals [αj , βj ] such that
j
f (E ∩ (aj , bj )) ⊂ [αj , βj ] and βj − αj ≤ K(bj − aj ). Thus f (E) ⊂ [αj , βj ] and

j

λ∗ f (E) ≤ (βj − αj ) ≤ K (bj − aj ) ≤ K(λ∗ E + ε).
j j
Since the last inequality holds for all ε > 0, we are done.
20.2. Lemma. Let f be a real function on an interval I and E ⊂ I. If |f | ≤ K
on E for some K > 0, then
λ∗ f (E) ≤ Kλ∗ E.
Proof. Choose K > K and denote
Ek = {x ∈ E : |f (x) − f (y)| ≤ K for all y ∈ E ∩ (x − k1 , x + k1 )}.

Then E1 ⊂ E2 ⊂ . . . and E = Ek . Let J be an interval with length less than k1 .

k
According to the previous lemma,
λ∗ f (J ∩ Ek ) ≤ K λ∗ (J ∩ Ek ).
Dividing I into such intervals we obtain λ∗ f (Ek ) ≤ K λ∗ (Ek ). An appeal to
Exercise 4.7 reveals that λ∗ f (E) ≤ K λ∗ (E). Since K > K was arbitrary, we
have the required inequality.
As a consequence of the preceeding lemma we get one-dimensional version of
well-known Sard’s lemma. Compare it with the next Theorem 20.4.
D. Integration on R 83
20.3. Corollary. Let f be a real function on an interval I ⊂ R and E ⊂ I. If

f = 0 on E, then λf (E) = 0.
20.4. Theorem. Let f be a real-valued function on an interval I and E ⊂ I a
measurable set. Suppose that a finite derivative f (x) exists at each point x ∈ E.
Then f is measurable on E and

λ∗ f (E) ≤ |f | .
E
Proof. We first prove the measurability of f on E. Choose c ∈ R. For each

k, m ∈ N, the set
Gm,k := {x ∈ I : there exist y, z ∈ I such that

1 1 1
x − < y < x < z < x + and f (z) − f (y) > (c + )(z − y)}
k k m
is open. Hence the set

{x ∈ E : f (x) > c} = E ∩ Gm,k
m k
is measurable.
To prove the required inequality, it is no restriction to assume that E is
bounded. For an ε > 0 denote
Ek = {x ∈ E : (k − 1)ε ≤ |f (x)| < kε}.

Then Ek are pairwise disjoint measurable sets, E = Ek and an appeal to Lemma

k
20.2 yields

λ∗ f (E) ≤ λ∗ f (Ek ) ≤ kε ≤ |f | + ελEk
k k k Ek

= |f | + ελE.
E
20.5. Exercise. Under the assumptions of Theorem 20.4 prove that f (E) is a measurable set.
S
Hint. Use the following Exercise 20.6 and the fact that E = N ∪ Kn where λN = 0 and Kn
n
are compact (Theorem 1.21).
20.6. Exercise. Let f be a real-valued function on an interval I ⊂ R having a finite derivative

at each point of a (Lebesgue) null set E ⊂ I. Show that λf (E) = 0.
S
Hint. Use either Theorem 20.4 or Lemma 20.2 realizing that E = {x ∈ E : |f (x)| < n}.
n
84 21. Functions of Finite Variation and Absolutely Continuous Functions
20.7. Luzin’s (N)-property. We have seen in 3.18 that a continuous image of a (Lebesgue)
measurable set does not need to be measurable. On the other hand, if f is differentiable and E
measurable, Exercise 20.5 yields that f (E) is measurable. We now consider briefly the question,
under what conditions images of measurable sets are again measurable.
We say that a real function f defined on an interval I ⊂ R has Luzin’s (N)-property if the
image f (N ) of any (Lebesgue) null set N ⊂ I is again a null set.
(a) (Rademacher) Let f be a continuous function on an interval I. Then the following
conditions are equivalent:
(i) f has Luzin’s (N)-property;
(ii) f (M ) is measurable whenever M ⊂ I is measurable.
Hint. In light of Theorem 1.21 a measurable set is a countable union of compact sets and a null
set. For the converse, use Remark 1.9.2 which says that every set of a positive measure contains
a nonmeasurable subset.
(b) Show that the assertion of (a) remains true if f is measurable only (use Luzin’s Theorem
18.2). Likewise, the definition of Luzin’s (N)-property and other ideas can be generalized to the
case when I is a measurable subset of Rn .
(c) Let f be a function having a finite derivative everywhere on an interval I. Then f has
Luzin’s (N)-property.
Hint. Exercise 20.6.
(d) Every Lipschitz (even locally Lipschitz) function on a measurable set M has Luzin’s
(N)-property.
20.8. Exercise. Let f be an arbitrary function on an interval I ⊂ R. If D is the set of points
at which f has a finite derivative, then D is a Borel set and the function x → f (x) is a Borel
function on D.
20.9. Notes. Luzin’s (N)-property was introduced by Luzin in [*1915]. Corollary 20.3 which
we called the one-dimensional version of Sard’s lemma is also often called Luzin’s theorem.
21. Functions of Finite Variation and Absolutely Continuous Functions
In this chapter we discuss two classes of functions without using measure the-
ory. Important results concerning these functions will be introduced in following
chapters.
21.1. Functions of Finite Variation. Let f be a real-valued function on an
interval I. For any interval [a, b] ⊂ I and any partition D : a = x0 < x1 < · · · <
xm = b of [a, b] denote
b
m
V(f, D) = |f (xj ) − f (xj−1 )| .
a
j=1
The extended real number

b b
V f := sup{V(f, D) : D is a partition of[a, b]}
a a
b
is called the variation of f over [a, b]. If V f < ∞ for every interval [a, b] ⊂ I,
a
then f is termed a function of finite variation. In this case there exists a function
v on I such that
b
v(b) − v(a) = V f
a
for any interval [a, b] ⊂ I. Such a function v is determined up to an additive

constant and it is called an (indefinite) variation of f .
One can easily see that the set of all functions of finite variation on an interval
I is a vector space.
If, in addition,
b
sup{V f : [a, b] ⊂ I} < +∞,
a
we say that f is of bounded variation on I. For a compact interval [a, b], notions
of finite and bounded variation agree and they can be characterized simple by
b
V f < ∞.
a
In case of an arbitrary interval I we could say that f is of “locally bounded
variation” instead of “finite variation”. Indeed, a function f is of finite variation
on I if and only if it has bounded variation on each compact subinterval of I.
In a similar way, we are going to “localize” notions of absolute continuity and
integrability.
21.2 Jordan Decomposition Theorem. A function f is of finite variation
on I if and only if f is a difference of two nondecreasing functions.
Proof. Monotone functions are of finite variation, and thus differences of mono-
tone functions are also of finite variation. For the converse, if v is an indefinite
variation of a function f of finite variation, then v and v − f are nondecreasing
and f is their difference.
21.3. Absolutely Continuous Functions. A real-valued function f is said
to be absolutely continuous on an interval I if given ε > 0, there exists δ > 0 that

m
|f (bj ) − f (aj )| < ε
j=1

m
whenever a1 <b1 ≤ a2 <b2 ≤ · · · ≤ am <bm are points of I with (bj − aj ) < δ.
j=1
The family of all absolutely continuous functions on an interval I is a vector space
which contains all Lipschitz functions.
We say that a function f is locally absolutely continuous on an interval I if f
is absolutely continuous on every compact subinterval of I.
Any locally absolutely continuous function is continuous and of bounded vari-
ation.
21.4. Theorem. Any absolutely continuous function f on I is the difference
of two nondecreasing absolutely continuous functions.
Proof. It is enough to show that the indefinite variation v of f is absolutely
continuous. To this end let an ε > 0 be given. Find δ > 0 such that

m
|f (bj ) − f (aj )| < ε
j=1
86 22. Theorems on Almost Everywhere Differentiation

m
whenever a1 <b1 ≤ a2 <b2 ≤ · · · ≤ am <bm are points of I with (bj − aj ) < δ.
j=1

p
Let A1 <B2 ≤ A2 <B2 ≤ · · · ≤ Ap <Bp be points of I with (Bj − Aj ) < δ. Find
j=1
partitions
mj
Aj = a0j < b0j = a1j < · · · < bj = Bj
of intervals [Aj , Bj ] for which

mj
i
v(Bj ) − v(Aj ) < f (bj ) − f (aij ) + 1 ε.
i=1
p
i
Since (bj − aij ) < δ, we have
j,i

|v(Bj ) − v(Aj )| < f (bij ) − f (aij ) + ε < 2ε ,
j j,i
as needed.
21.5 Exercise. Show that the product of two absolutely continuous functions (or functions of
finite variation) on a bounded interval is again an absolutely continuous function (or a function
of finite variation).
21.6. Exercise. Prove that every absolutely continuous function has Luzin’s (N)-property.
Hint. If N is a null set, find an open set M of a small measure containing N and then use the
definition of absolute continuity.
21.7. Notes. C. Jordan discovered functions of finite variation in [1881] and proved the
decomposition theorem 21.2.
22. Theorems on Almost Everywhere Differentiation
In this chapter we show that every function of finite variation (in particular,
every nondecreasing or Lipschitz function) has a finite derivative almost every-
where. The usual proof of this deep theorem uses Vitali’s covering theorem. Here
we present a rather elementary proof. Let start with a simple lemma which is due
to F. Riesz.
22.1. F. Riesz’s Rising Sun Lemma. Let h be a continuous function on an
interval [a, b]. Denote
E = {x ∈ (a, b) : there exists ξ ∈ (x, b)such that h(ξ) > h(x)}.
Then E is a union of a sequence of pairwise disjoint open intervals (aj , bj ) with

h(aj ) ≤ h(bj ).
Proof. It is clear that E is an open set. Hence, E is a union of a sequence of
pairwise disjoint maximal open intervals contained in E. Let (α, β) be any such
an interval and x ∈ (α, β). Denote
M = {ξ ∈ (x, β) : h(ξ) ≥ h(x)}.

Since β ∈
/ E, it follows that h ≤ h(β) in [β, b). By hypothesis M = ∅ and
sup M = β. Thus h(x) ≤ h(β) and letting x → α+, we complete the proof.
22.2. Remarks. 1. If (α, β) is a maximal open interval contained in E and α > a, then even
h(α) = h(β).
2. The “mirror” version of the lemma: Let h be a continuous function on an interval [a, b] and
denote
E = {x ∈ (a, b) : there exists a ξ ∈ (a, x) with h(ξ) > h(x)}.
Then E is the union of a sequence (aj , bj ) of pairwise disjoint open intervals with h(aj ) ≥ h(bj ).
22.3. Extreme Derivatives. Let f be a function defined on a neighborhood
of a point x ∈ R. Denote
1
D+ f (x) = lim sup (f (x + t) − f (x)),
t→0+ t
1
D+ f (x) = lim inf (f (x + t) − f (x))
t→0+ t
and analogously D− f (x) and D− f (x) for t → 0−. These extended real numbers
are called the Dini derivatives of f at x. A function f is differentiable at x if all
Dini derivatives of f agree at x.
Finally set
1
Df (x) = lim sup (f (x + t) − f (x))
t→0 t
and call the function Df the upper derivative of f . The lower derivative Df is
defined in a similar way.
22.4. Lemma. Every nondecreasing Lipschitz function f on an interval [a, b]
has a finite derivative almost everywhere on [a, b].
Proof. It is no restriction to assume that f is 1-Lipschitz. Lipschitz functions
cannot have infinite derivatives. To prove the assertion it suffices to show that
D+ f ≤ D− f almost everywhere. Then analogously D− f ≤ D+ f almost every-
where, and
0 ≤ D+ f ≤ D− f ≤ D− f ≤ D+ f ≤ D+ f ≤ 1
almost everywhere as needed. To this end let 0 < p < q < 1 be given and set
Mp,q := {x ∈ (a, b) : D− f (x) < p < q < D+ f (x)}.
By Remark 22.2.2 there is a sequence of pairwise disjoint intervals (aj , bj ) such

that

x ∈ (a, b) : D− f (x) < p ⊂ x ∈ (a, b) : there exists ξ ∈ (a, x)

with f (ξ) − pξ > f (x) − px = (aj , bj )
j
and
f (bj ) − pbj ≤ f (aj ) − paj .
88 22. Theorems on Almost Everywhere Differentiation
Now apply Lemma 22.1 to f (x) − qx on each interval [ak , bk ]. Again there are
pairwise disjoint intervals (ak,j , bk,j ) ⊂ (ak , bk ) such that

{x ∈ (a, b) : D+ f (x) > q} ⊂ {x ∈ (ak , bk ) : there exists ξ ∈ (ak , x)
k

with f (ξ) − qξ < f (x) − qx} = (ak,j , bk,j )
k,j
and
f (bk,j ) − qbk,j ≥ f (ak,j ) − qak,j .
Hence
1 1
(bk,j − ak,j ) ≤ (f (bk,j ) − f (ak,j )) ≤ (f (bk ) − f (ak ))
q q
k,j k,j k
p p
≤ (bk − ak ) ≤ (b − a).
q q
k
Thus, we can define by induction sequences of decreasing collections of intervals;

in the 2nth step we obtain a collection of intervals {(As , Bs )} with
n
p
(As , Bs ) ⊃ Mp,q and (Bs − As ) ≤ (b − a).
s s
q
It follows that Mp,q is a null set. The assertion now follows from the fact that

{x ∈ (a, b) : D− f (x) < D+ f (x)} ⊂ Mp,q .
p,q∈(0,1)∩Q
Now we show that also monotone functions are differentiable almost every-
where. The situation is more difficult since monotone functions can be discon-
tinuous. The heart of the proof lies in the observation that, for a nondecreasing
function f , the inverse function to x → x + f (x) (whose domain is not necessarily
connected) can be extended to a nondecreasing Lipschitz function on an interval.
22.5. Theorem (Lebesgue). Every monotone function f on an interval I has
a finite derivative at almost all points of I.
Proof. It would be clearly sufficient to assume that I = [a, b]. There exists an
interval [A, B] and a function g on [A, B] so that g is Lipschitz and nondecreasing,
and x + f (x) ∈ [A, B], g(x + f (x)) = x for all x ∈ [a, b]. (If f is continuous, then
g is simply the inverse function to x + f (x)). A moment’s thought will convince
the reader that
{x ∈ (a, b) : f (x) does not exist } ⊂ g(E) ∪ g(N )
where
E = {y ∈ (A, B) : g (x) does not exist } and N = {y ∈ (A, B) : g (x) = 0}.
By the previous lemma λE = 0. Hence λg(E) = 0 by Lemma 20.1. According to
Sard’s lemma (Corollary 20.3) also λg(N ) = 0 and the proof is complete.
22.6. Corollary. Every function of bounded variation has a finite derivative

almost everywhere.
We conclude this section by proving an important inequality.
22.7. Theorem. Let f be a nondecreasing function on an interval [a, b]. Then
f ∈ L 1 ([a, b]) and
b
f ≤ f (b) − f (a).
a
Proof. For x > b set f (x) = f (b) and define a sequence of functions

1
fk (x) = k f (x + ) − f (x)
k
for x ∈ [a, b]. Then fk are nonnegative measurable functions on [a, b] (f is measur-
able!) and lim fk = f almost everywhere. Thus f is measurable and nonnegative
almost everywhere. Fatou’s lemma 8.15 and a quick computation establish that
1
b b b+ k b
f ≤ lim inf fk = lim inf k f− f
1
a a a+ k a
1 1

b+ k a+ k
1 1
= lim inf k f− f ≤ lim inf k f (b) − f (a)
b a k k
= f (b) − f (a).
22.8. Notes. Theorem 22.5 on differentiability of monotone functions was proved by

H. Lebesgue in [*1904] under an additional assumption of continuity of differentiated function.
In a full generality, the theorem was proved independently by G. Faber [1918] and by G.C. Young
and W.H. Young [1911]. Rising sun lemma 22.1 and proof of 22.4 are due to F. Riesz [1930-32].
To prove Lebesgue’s theorem, he uses this lemma and the notion of null sets only. The idea of
the proof of 22.5 was used by J. Mignot [1976] and L. Zajı́ček [1983].
23. Indefinite Lebesgue Integral and Absolute Continuity
In this chapter we examine the formula

b
f (b) − f (a) = f
a
where the integral is understood as Lebesgue’s one and the derivative in the
“almost everywhere” sense. First, let us show that formula does not always hold
even if f is monotone by considering the following example.
23.1. Cantor Singular Function. We define the Cantor function f : [0, 1] → [0, 1] in the
following way: Let f (0) = 0, f (1) = 1. For x ∈ [ 13 , 23 ] set f (x) = 12 . Further set f (x) = 14 for
x ∈ [ 19 , 29 ] and f (x) = 34 for x ∈ [ 79 , 89 ]. Each succesive step is essentially the same. If (a, b) is a
maximal interval in which f is not yet defined, we subdivide it into thirds and define the value
of f in the closed middle third as the arithmetical mean of f (a) and f (b). Then f is defined
90 23. Indefinite Lebesgue Integral and Absolute Continuity
and uniformly continuous on a dense subset of [0, 1]. The last step of the definition consists in
the continuous extension of f to the whole interval [0, 1]. There is also an arithmetic definition
of the Cantor function. Suppose x ∈ [0, 1] is written in the form
∞
X
x= 3−j xj
j=1
where xj ∈ {0, 1, 2}. Then

∞
1 X −j
f (x) = 2 yj
2 j=1
where (
1 if there is i < j with xi = 1,
yj =
xj otherwise.
We see that f = 0 almost everywhere and

Z 1
f (1) − f (0) = 1
= 0 = f .
0
This situation cannot occur when f is absolutely continuous as shown by the

following theorem.
23.2 Theorem. Let f be an absolutely continuous function on an interval [a, b].
Then f ∈ L 1 ([a, b]) and
b
f (b) − f (a) = f .
a
Proof. As every absolutely continuous function is the difference of two monotone

absolutely continuous functions, no generality is lost with the assumption that f
b
is nondecreasing. Obviously f ≥ 0 almost everywhere and a f ≤ f (b) − f (a)
thanks to Theorem 22.7. Select ε > 0 and let δ > 0 be as furnished by the
definition of absolute continuity of f . There exists an open set G with λG < δ
containing all points of nondifferentiability of f . Let G be expressed as a union
of pairwise disjoint intervals (aj , bj ). Then the definition of absolute continuity of
k
f yields j=1 (f (bj ) − f (aj )) < ε for all k ∈ N. Thus λf (G) ≤ ε. On the other
hand, by Theorem 20.4 we have
b
∗
λ f ([a, b] \ G) ≤ f ≤ f .
[a,b]\G a
Whence
b
f (b) − f (a) = λf ([a, b]) ≤ λ∗ f ([a, b] \ G) + λf (G) ≤ f + ε.
a
23.3 Indefinite Lebesgue Integral. A function ϕ on an interval I ⊂ R is said

to be locally integrable if it is integrable on every compact subinterval of I, and
f is called an indefinite Lebesgue integral of a locally integrable function ϕ on I

provided
b
f (b) − f (a) = ϕ whenever [a, b] ⊂ I .
a
If c ∈ I and ⎧ x
⎪
⎪ x ≥ c,
⎨ ϕ,
Φ(x) = c
c
⎪
⎪
⎩ − ϕ, x < c,
x
then Φ is an indefinite Lebesgue integral of ϕ and any other indefinite integral

of ϕ differs from Φ by an additive constant. If, in addition, ϕ is nonnegative,
then Φ is nondecreasing. It follows that for a locally integrable function ϕ, the
indefinite integral of ϕ is a function of finite variation (it is the difference of
indefinite integrals of ϕ+ and ϕ− ) and by Corollary 22.6 it has a finite derivative
almost everywhere. Notice that according to the Lebesgue dominated convergence
theorem any indefinite Lebesgue integral is a continuous function. This assertion
can be sharpened considerably as the next theorem shows.
23.4. Theorem. Let ϕ ∈ L 1 (I) and f be an indefinite Lebesgue integral of ϕ
on I. Then f is absolutely continuous and f = ϕ almost everywhere.
y
Proof. Since |f (y) − f (x)| ≤ x |f | for [x, y] ⊂ I, absolute continuity of f follows
from Exercise 8.22.b. It remains to show that f = ϕ almost everywhere. Since
f is absolutely continuous, f exists almost everywhere, f ∈ L 1 (I) and for any
interval [a, b] ⊂ I
b b

f = f (b) − f (a) = ϕ.
a a
It readily follows that

f = ϕ
E E
for any open set E, and consequently for any measurable set E as well. Then
Theorem 8.17 gives the desired conclusion.
23.5. Corollary. For a real-valued function f on an interval [a, b], the following
properties are equivalent:
(i) f is absolutely continuous on [a, b]; x
(ii) there exists ϕ ∈ L 1 ([a, b]) such that f (x) = f (a) + a ϕ for all x ∈ [a, b];

f xis differentiable almost everywhere, f ∈ L ([a, b]) and f (x) = f (a) +
1
(iii)
a
f for all x ∈ [a, b].
23.6 Corollary. Suppose f is an absolutely continuous function on [a, b]. If

f = 0 almost everywhere, then f is constant.
23.7. Remark. Let ϕ be a real-valued function on an interval [a, b]. Recall that the
Newton integral of ϕ is the difference f (b) − f (a), where f is an antiderivative of ϕ on the
interval [a, b]. This definition can be generalized in many ways. We may modify the notion
of an “antiderivative” to require the equality f = ϕ up to a countable set of points. To
92 23. Indefinite Lebesgue Integral and Absolute Continuity
make this definition reasonable, we should add the assumption that f is continuous in order to
guarantee that all “antiderivatives” of f differ up a constant. An analogous problem appears
when assuming f = ϕ only almost everywhere. Now, the example of the Cantor function shows
that continuity of f does not guarantee that the increment f (b)−f (a) does not depend on choice
of the “generalized antiderivative”. However, it is possible to give an alternative definition of
the Lebesgue integral of a function ϕ as the increment of an absolutely continuous function f
on [a, b] with f = ϕ almost everywhere. (Notice that this definition does not lead to a true
generalization — some antiderivative are not absolutely continuous, see Example 25.1). The
definitions of various integrals based on the idea of (in some way) generalized antiderivatives
are called the descriptive ones. Let us note that these generalizations may consist in omitting
“small sets” or in a “generalized differentiation”. Furthermore, in Chapter 25 we mention
Perron’s method which is also included among descriptive approaches.
23.8. Lebesgue Points. Let I ⊂ R be an interval and x ∈ I. We say that x
is a Lebesgue point for a locally integrable function f if
h
1
lim |f (x + t) − f (x)| dt = 0.
h→0 2h −h
If F is an indefinite Lebesgue integral of f on an interval I and x ∈ I is a point

where F (x) = f (x), then
h
1
lim (f (x + t) − f (x)) dt = 0
h→0 2h −h
but x does not need to be a Lebesgue point for f . However, it is clear that F = f
at each Lebesgue point for f . The following theorem is thus a strengthening of
Theorem 23.4.
23.9. Lebesgue Differentiation Theorem. Let f be a locally integrable
function on an interval I. Then almost every point of I is a Lebesgue point for
f.
Proof. For a fixed r ∈ R, the function x → |f (x) − r| is locally integrable on I.
By virtue of Theorem 23.4 there exists a set Er ⊂ I of Lebesgue measure zero
such that h
1
lim |f (x + t) − r| dt = |f (x) − r|
h→0 2h −h

for every x ∈ I \ Er . If E := Er , then λE = 0. Now, if x ∈ I \ E and ε > 0,

r∈Q
then there exists r ∈ Q with |f (x) − r| < ε. Consequently,
|f (x + t) − f (x)| ≤ |f (x + t) − r| + ε
and h
1
lim sup |f (x + t) − f (x)| dt ≤ |f (x) − r| + ε ≤ 2ε.
h→0 2h −h
23.10. Remark. Every continuity point of f is a Lebesgue point for f . An interesting

relationship between Lebesgue points and points of approximate continuity will be given in
Exercise 29.11.
23.11. Banach-Zarecki Theorem. Let f be a real-valued function on an interval [a, b]. The
following assertions are equivalent:
(i) f is absolutely continuous on [a, b];
(iv) f is continuous on [a, b], f is of bounded variation and has Luzin’s (N)-property.
Hint. We already know that any absolutely continuous

R function f is continuous and of finite
variation. According to Theorem 23.2, λf (G) ≤ G |f | for any open set G ⊂ [a, b], and a
moment’s reflection shows that f has Luzin’s (N)-property. Conversely, suppose that (iv) holds
and let D be the set of all points where f has a finite derivative. By Exercise 20.8, D is
measurable. Since λf ([a, b] \ D) = 0, Theorem 20.4 yields
Z t ˛ ˛
|f (s) − f (t)| ≤ ˛f (ξ)˛ dξ
s
Rx
for every interval [s, t] ⊂ [a, b]. Now it is sufficient to notice that the function x → a |f | is
absolutely continuous.
23.12. A Characterizations of Lipschitz Functions. Show that for a real-valued function

f on an interval [a, b] the following conditions are equivalent:
(i) f is Lipschitz on [a, b];
Pevery finite collection of intervals [aj , bj ] ⊂
(ii) for any ε > 0 there exists δ > 0 so that for
P
[a, b] with (bj − aj ) < δ the inequality j |f (bj ) − f (aj )| < ε holds;
j
(iii) f is absolutely continuous on [a, b] and f is bounded on [a, b] \ N , where λN = 0.
(Compare with a similar characterization of absolutely continuous functions.)
23.13. Integration by Parts for Lebesgue Integral. Let f , g be absolutely continuous
functions on an interval [a, b]. Then
Z b Z b
f g = f (b)g(b) − f (a)g(a) − f g.
a a
23.14. Notes. Fundamental theorem (23.5) of “the calculus for Lebesgue integrals” was proved
by H. Lebesgue [*1904]. The implication (iv) =⇒ (i) of Theorem 23.11 is due to S. Banach [1925].
24. Radon Measures on R and Distribution Functions
24.1. Distribution Functions. A distribution function of a Radon mea-

sure ν on R is a nondecreasing and right continuous function F on R such that
F (b) − F (a) = ν(a, b] for every interval (a, b] ⊂ R. It is obvious that any other
distribution function of ν can differ from F by only a constant.
If ν is a probability measure, then a distribution function of ν, given as
Gν (x) := ν(−∞, x] ,
is normalized so that lim Gν (x) = 0 and lim Gν (x) = 1.

x→−∞ x→+∞
24.2. Theorem. (a) Let F be a nondecreasing and right continuous function

on R. Then there exists a unique Radon measure νF on B(R) such that F is the
distribution function of νF .
(b) Let ν be a Radon measure on R. Then there is a distribution function F
of ν.
94 24. Radon Measures on R and Distribution Functions
Proof. (a) As for the uniqueness, any two such measures agree on all open (or
compact) sets in R. The existence can be proved in various ways, depending on
what kind of general theorem we wish to use. For instance, we can use Hopf’s
extension theorem 5.5 or start from the covering collection {(a, b] : a < b, a, b ∈ R}
and the set function νF (a, b] := F (b) − F (a), create the outer measure and to
restrict it to measurable sets. In any case, we have to prove the following property:

∞
∞
If (a, b] ⊂ (an , bn ], then F (b) − F (a) ≤ (F (bn ) − F (an )). To see this, given
n=1 n=1
ε > 0, let δn > 0, δ > 0 be so that F (bn +δn ) < F (bn )+ε2−n , F (a+δ) < F (a)+ε.
Then consider the cover {(an , bn + δn )} of the interval [a + δ, b].
(b) Setting ⎧
⎪
⎨ ν(0, x] for x > 0,
Fν (x) := 0 for x = 0,
⎪
⎩
−ν(x, 0] for x < 0,
Fν is nondecreasing, right continuous, Fν (0) = 0 and ν(a, b] = Fν (b) − Fν (a) for
every interval (a, b] ⊂ R.
24.3. Theorem. The following conditions are equivalent for a measure ν
defined on B(R):
(i) ν is a Radon measure;
(ii) there exists a distribution function F such that ν(a, b] = F (b) − F (a) for
every bounded interval (a, b];
(iii) there exists a nondecreasing function ϕ on R such that ν is the Lebesgue-
Stieltjes measure λϕ of Example 14.8.b.
Proof. Use previous results and the uniqueness part of the Riesz representation
theorem 16.5.
24.4. Remarks. 1. According to Remark 15.2 any finite measure on (R) is a Radon
measure. However,Pthere exist σ-finite measures on (R) which are not Radon measures (for
example, the sum δ1/k of Dirac measures). Hence, these measures are not Lebesgue-Stieltjes
k
measures and have not distribution functions.
2. The function ϕ from the last theorem need not be right continuous. However, as a monotone
function it has a right limit at each point. If ϕ̃(x) := limt→x+ ϕ(t), then ϕ̃ is a distribution
function of ν and determines the same Lebesgue-Stieltjes measure as ϕ. The functions ϕ̃ and F
of Theorem 24.3.ii differ by a constant (cf. Exercise 24.6).
24.5. Exercise. Let F be a distribution function of a Radon measure μ on R and let μF
correspond to F as in Theorem 24.2. Show that μ = μF on (R).
24.6. Exercise. Show that any two distribution functions of the same measure differ by a
constant.
24.7. Exercise. Let F be a distribution function of a Radon measure μ. Show that:
(a) F is locally absolutely continuous on R if and only if μ is absolutely continuous with
respect to the Lebesgue measure λ. In this case, F is the Radon-Nikodým derivative dμ/ dλ.
(b) Measures μ and λ are mutually singular if and only if F = 0 almost everywhere.
(c) μ({x}) = F (x) − lim F (t) for every x ∈ R. In particular, F is continuous at x if and
t→x−
only if μ({x}) = 0.
(d) z ∈ supt μ if and only if F (z − ε) < F (z + ε) for every ε > 0.

(e) Let μ = μd + μa + μs be the decomposition of μ into discrete, absolutely continuous
and singular part (Exercise 15.18.c). If Fd , Fa , Fs are distribution functions of μd , μa , μs ,
respectively, then Fa is locally absolutely continuous, Fs is continuous, Fs = 0 almost everywhere
and Fd is called the saltus function of F . Moreover, F = Fd + Fa + Fs .
24.8. Exercise. (a) Let ϕ equal to the Cantor function (Example 23.1) on [0, 1] and ϕ = c(0,∞)
elsewhere on R. Determine λϕ C, where C is the Cantor set. Show that λϕ is a continuous
measure (see Exercise 15.18) and supt λϕ = C (see Exercise 15.10).
(b) Let ε ∈ [0, 1]. Does there exist a nondecreasing function ψ so that ψ(0) = 0, ψ(1) = 1
and λψ C = ε?
Hint. Set ψ(x) = εϕ(x) + (1 − ε)x, where ϕ is defined as in (a).
24.9. Exercise. Let F be a distribution function of a Radon measure μ. Assume that F is
locally absolutely continuous on R. Show that
Z Z
g dμ = gF
R R
for every g ∈ 1 (μ).
24.10. Fubini’s Lemma. Let P{fn } be a sequence of nondecreasing functions

on an interval
P
I ⊂ R. Assume that the series fn converges to a finite limit f on I. Then f = fn almost
everywhere on I.
P
Hint. First prove that λf = λfn (notation as in 14.8). To finish the proof, use the Lebesgue
decompositions of these measures to absolutely continuous and singular parts.
24.11. Notes. Distribution functions play an important role in probability theory and it is
very difficult to trace who introduced them. They were investigated in various forms by Jacob
Bernoulli, P.S. de Laplace and others. Lemma 24.10 is due to G. Fubini [1915].
25. Henstock–Kurzweil Integral
In some sense, the Lebesgue integral has the best properties among all integrals
which can be defined on any measure space. However, its universality can also be
a disadvantage. If we are engaged in integration of functions of one real variable
(or several variables, but in what follows we restrict to the real line in order to
simplify the explanation), then we can find wider collections of functions on which
a reasonable notion of an integral can be introduced. Let us present an illustrative
example.
25.1. Example. The function
8“ ”
<
x2 cos2 1
x2
for x
= 0,
f (x) =
:
0 for x = 0
1 1
is not Lebesgue integrable on [−1, 1]. For aj = q and bj = √ we have
1
(j + 2 )π jπ
Z bj Z bj 1
|f (x)| dx = f (x) dx = ,
aj aj πj
R
and therefore 01 |f (x)| dx = +∞. On the other hand, f has an antiderivative on R and the
R
Newton integral 01 f (x) dx exists.
96 25. Henstock–Kurzweil Integral
There are also simple examples of functions which are Lebesgue integrable but
not Newton integrable (for example, the function sign x on [−1, 1]). A generaliza-
tion of both Newton and Lebesgue integrals leads to a nonabsolutely convergent
integral which can be introduced in several ways. Here we use an approach due
to Henstock and Kurzweil; Perron’s one is outlined in 25.10.
25.2. Henstock–Kurzweil integral. For the sake of simplicity we will define
the integral for bounded intervals only.
If [a, b] is an interval, denote the set of all (strictly) positive functions on [a, b] by
Δ and label functions from Δ as gauges. A partition is a pair D = ([aj , bj ], ξj )m j=1 ,
where
a = a1 < b1 = a2 < b2 = a3 < · · · < bm−1 = am < bm = b and ξj ∈ [aj , bj ].
If δ ∈ Δ is a gauge, then a partition D is called δ-fine (or subordinated to δ ∈ Δ)

whenever bj − aj < δ(ξj ) for all j = 1, . . . , m. The definition of the Henstock–
Kurzweil integral is based on the following proposition whose proof is an easy
consequence of the compactness of [a, b].
Cousin’s Lemma. Let δ be a gauge from Δ. Then there exists a δ-fine parti-
tion. Moreover, if a collection B of non-overlapping intervals [ai , bi ] containing
points ξi with bi − ai < δ(ξi ) is given, then adding some intervals to B we get a
δ-fine partition.
Given a real-valued function f on [a, b] and a partition D = ([aj , bj ]), ξj )m
j=1 of
[a, b], set
m
s(f, D) = f (ξj )(bj − aj ).
j=1
We say that f is Henstock–Kurzweil integrable on [a, b] if there is a real number

K so that for any ε > 0 there is δ ∈ Δ such that
|K − s(f, D)| < ε
for each δ-fine partition D of [a, b]. The number K is then uniquely determined,
b
it is called the Henstock–Kurzweil integral of f and denoted by K a f . Where
no confusion can result, we will drop the prefix “K”. This definition is similar
to original Riemann’s one, the “only” difference being that instead of constants
δ in Riemann’s definition, a gauge function δ appears in Henstock–Kurzweil’s
definition.
The set of all Henstock–Kurzweil integrable functions is a linear space on which
the Henstock–Kurzweil integral is a monotone linear functional.
We are not going to study details of the theory of nonabsolutely convergent
integrals; we will concentrate on relations to the Newton and the Lebesgue integral
only.
25.3 Theorem. Let F be a continuous function on [a, b], F = f on (a, b). Then
the Henstock–Kurzweil integral of f on [a, b] exists and equals to F (b) − F (a).
Proof. Given an ε > 0 and x ∈ (a, b), there is δ(x) > 0 such that

F (y) − F (x)
− f (x)<ε
y−x
whenever y ∈ [a, b], 0 < |y − x| < δ(x). Next, we can find δ(a) > 0 and δ(b) > 0
such that |f (a)δ(a)| < ε, |f (b)δ(b)| < ε, |F (y) − F (a)| < ε for all y ∈ [a, a + δ(a))
and |F (y) − F (a)| < ε for all y ∈ (b − δ(b), b]. Let D = ([aj , bj ]), ξj )m
j=1 be a δ-fine
partition. Then
|F (bj ) − F (aj ) − f (ξj )(bj − aj )| ≤

|F (bj ) − F (ξj ) − f (ξj )(bj − ξj )| + |F (ξj ) − F (aj ) − f (ξj )(ξj − aj )|
< ε(|bj − ξj | + |ξj − aj |) = ε(bj − aj )
for every j ∈ {1, . . . , m}, ξj ∈ (a, b), and
|F (bj ) − F (aj ) − f (ξj )(bj − aj )| ≤ |F (bj ) − F (aj )| + |f (ξj )(bj − aj )| < 2ε
if ξj ∈ {a, b}. Summing over j, we obtain the estimate
|F (b) − F (a) − s(f, D)| < 4ε + (b − a)ε.
25.4. Theorem. If a function f has a finite Lebesgue integral L, then f also

b
has the Henstock–Kurzweil integral and K a f = L.
Proof. If ε > 0, then Theorem 15.5 ensures the existence of a lower semicontinuous
function s > f and an upper semicontinuous function t < f such that s, t ∈
L 1 ([a, b]) and
b
(s − t) < ε
a
(we get strict inequalities by adding small constants). For every x ∈ [a, b] find
δ(x) > 0 with t < f (x) < s on (x − δ(x), x + δ(x)) ∩ [a, b]. Let D = ([aj , bj ], ξj )m
j=1
be a δ-fine partition of [a, b]. Then
bj bj
t ≤ f (ξj )(bj − aj ) ≤ s
aj aj
for each j ∈ {1, . . . , m}. Summing we get

b b
t ≤ s(f, D) ≤ s.
a a
b b
Since a
t≤L≤ a
s, we have |s(f, D) − L| < 2ε, and we are done.
25.5 Indefinite Henstock–Kurzweil Integral. In the sequel, [a, b] will be

a fixed interval and Δ will denote the set of all positive functions on [a, b]. If a
function f has the Henstock–Kurzweil integral on [a, b] and [a , b ] ⊂ [a, b], then f
has the Henstock–Kurzweil integral on the interval [a , b ] as well. Moreover,
c3 c2 c3
f= f+ f
c1 c1 c2
whenever a ≤ c1 < c2 < c3 ≤ b.

A function F on [a, b] is called an indefinite Henstock–Kurzweil integral of f
if, for each interval [a , b ] ⊂ [a, b],
b

F (b ) − F (a ) = K f.
a
If F is an indefinite Henstock–Kurzweil integral of f , then any other indefinite

Henstock–Kurzweil integral of f can differ from F by only a constant.
25.6. Saks-Henstock’s Lemma. Let F be an indefinite Henstock–Kurzweil
integral of f on [a, b]. Then for any ε > 0 there exists a δ ∈ Δ such that

F (bj ) − F (aj ) − f (ξj )(bj − aj ) < ε
j∈M
j=1 is a δ-fine partition of [a, b] and M ⊂ {1, . . . , m}.

whenever D = ([aj , bj ], ξj )m
Proof. Choose an ε > 0 and find a gauge δ ∈ Δ such that
ε
|F (b) − F (a) − s(f, D)| <
2
whenever D = ([aj , bj ], ξj )m j=1 is a δ-fine partition of [a, b]. Fix now any such
a partition D and M ⊂ {1, . . . , m}. For each j = 1, . . . , m there is a partition
mj
Dj = ([aij , bij ], ξji )i=1 of [aj , bj ] subordinated to the restriction of δ to [aj , bj ] such
that
ε
|F (bj ) − F (aj ) − s(f, Dj )| < .
2m
Create a new partition D∗ = ([a∗k , b∗k ], ξk∗ )pk=1 of [a, b] in such way that every triple
(a∗k , b∗k , ξk∗ ) is either one of the triplets (aij , bij , ξji ), where j ∈
/ M , i ∈ {1, . . . , mj },
or one of the triplets (aj , bj , ξj ), where j ∈ M . This partition is δ-fine and thus
ε
|F (b) − F (a) − s(f, D∗ )| < .
2
Whence combining all inequalities, we get

F (bj ) − F (aj ) − f (ξj )(bj − aj ) < ε.
j∈M
25.7. Theorem. An indefinite Henstock–Kurzweil integral F of f on [a, b] is

continuous on [a, b].
Proof. Let z ∈ [a, b] and ε > 0 be fixed. By previous Saks–Henstock’s lemma find
a gauge δ ∈ Δ such that

(*) F (bj ) − F (aj ) − f (ξj )(bj − aj ) < ε
j∈M
j=1 is a δ-fine partition of [a, b] and M ⊂ {1, . . . , m}.

whenever D = ([aj , bj ], ξj )m
For x ∈ [a, b], |x − z| < δ(z), let D = ([aj , bj ], ξj )m
j=1 be a partition such that
ξk = z for some k ∈ {1, . . . , m} and such that the endpoints of the interval [ak , bk ]
are x and z. Applying (*) with M = {k}, we obtain
|F (x) − F (z)| < ε + |f (z)| |x − z| .
Based on Henstock–Kurzweil’s definition, the following theorem (even in higher

dimensions) was proved by J. Král [1985].
25.8. Theorem. Let F be an indefinite integral of a Henstock–Kurzweil in-
tegrable function f on [a, b]. Then f is a measurable function, the derivative F
exists at almost all points of (a, b) and F = f almost everywhere.
Proof. The continuity of F (Theorem 25.7) implies that DF is measurable (even
Borel) since
DF (x) = lim lim fn,k ,
n→∞ k→∞
where functions fn,k defined as

F (y) − F (x) 1 1
fn,k (x) = sup : y ∈ [a, b], ≤ |y − x| ≤
y−x n+k n
are continuous. Now we prove that DF = DF almost everywhere which estab-

lishes the existence of the derivative almost everywhere and its measurability.
We will be done once we show that, for every n ∈ N, the level sets
1
Ln := {x ∈ [a, b] : DF (x) < f (x) − } and
n
1
Un := {x ∈ [a, b] : DF (x) > f (x) + }
n
have measure zero. For this purpose we will fix n ∈ N and estimate the (outer)
measure of the set L := Ln . We are now in a position to invoke Saks–Henstock’s
lemma 25.6: Given ε > 0, there is a gauge function δ ∈ Δ such that

F (bj ) − F (aj ) − f (ξj )(bj − aj ) < ε
j∈M
whenever D = ([aj , bj ], ξj )m j=1 is a δ-fine partition of [a, b] and M ⊂ {1, . . . , m}.

Let V denote the family of all intervals [a , b ] for which there exists an x ∈
[a , b ] ∩ L such that b − a < δ(x) and
F (b ) − F (a ) 1
< f (x) − .
b − a n
Then V is a Vitali cover of L, and so, by Corollary 27.3, there exists a finite,
pairwise disjoint subfamily {[ai , bi ]}i of V with points ξi ∈ [ai , bi ] such that

λ∗ (L \ [ai , bi ]) < ε.
i
Now, use Cousin’s lemma in 25.2 to extend the collection ([ai , bi ], ξi )i to a fam-
ily ([a∗j , b∗j ], ξj∗ ) subordinated to δ. If M := {j : there exists an i with [a∗j , b∗j ] =
[ai , bi ]}, the property of gauge δ yields that

(f (ξi )(bi − ai ) − (F (bi ) − F (ai ))) < ε ,
i
and the definition of V that

1
(f (ξi )(bi − ai ) − (F (bi ) − F (ai ))) > (bi − ai ).
i
n i

Thus i (bi − ai ) < nε, which in turn implies that

λ∗ (L) ≤ nε + λ∗ (L \ [ai , bi ]) < (n + 1)ε.
i
The following proposition illustrates the relation between Henstock-Kurzweil

and Lebesgue integrals — their equivalence for nonnegative functions.
25.9. Theorem. If a nonnegative function f has the Henstock–Kurzweil inte-
b b
gral on [a, b], then f has also the Lebesgue integral on [a, b] and L a f = K a f .
Proof. According to Theorem 25.6, f is measurable. It remains to show that
b
L a f < ∞ and use Theorem 25.4. The proof may now be finished in a stroke:
Since f ≥ 0, the indefinite Henstock–Kurzweil integral F of f is nondecreasing on
b b
[a, b] and thus by Theorems 22.7 and 25.8 L a f = L a F ≤ F (b) − F (a) < +∞.
25.10. Perron Integral. There are also other ways how to define an integral which is
equivalent to the Henstock–Kurzweil integral (i.e. it provides the same class of “integrable”
functions on which it equals to the Henstock–Kurzweil integral). In the following, we outline
Perron’s approach to generalizing both the Newton and the Lebesgue integrals on intervals. Let
us start with two notes:
(a) A descriptive definition of the Newton integral of a function f requires the existence of
an antiderivative (a function F with F = f ). This is a very strong condition as it reduces
the class of Newton integrable functions. Among others, this condition implies that f shares
the Darboux property and is of the Baire class one. A descriptive definition of the Lebesgue
integral requires the existence of an absolutely continuous function F with F = f only almost
everywhere.
(b) The following particular case of Theorem 15.5 is known as the Vitali-Carathéodory theo-
rem: A function f is Lebesgue integrable on an interval I if for any ε > 0 there exists an upper
semicontinuous
R function g and a lower semicontinuous function h with g ≤ f ≤ h on I and
I (h − g) < ε.
Now we are in the position to introduce Perron’s integral. Let f be a function on an interval
[a, b]. A function M is termed to be a majorant of f if f (x) ≤ DM (x)
= −∞ for every x ∈ [a, b].
If f ≥ Dm
= +∞, we say that m is a minorant of f .
A function f is said to be Perron integrable on an interval [a, b] if for any ε > 0 there exists
a majorant M and a minorant m so that
M (b) − M (a) − (m(b) − m(a)) < ε.
Since D(M − m) ≥ DM − Dm ≥ 0 (and since every function with a nonnegative lower derivative
is nonnegative) we may define the Perron integral of f as
Z b
P f = inf{M (b) − M (a) : M is a majorant of f }
a
= sup{m(b) − m(a) : m is a minorant of f }.
We may also define upper and lower Perron integrals of f , and f will be Perron integrable when
they are equal and finite.
25.11 Remark. Lebesgue integrable functions may be characterized in the following way:
A function f on an interval [a, b] is Lebesgue integrable if and only if for any ε > 0 there is
a majorant M and a minorant m, both of them absolutely continuous, so that
M (b) − M (a) − (m(b) − m(a)) < ε.
To prove this proposition, one can use the Vitali-Carathéodory characterization (Theorem
15.5) again (the indefinite integrals of semicontinuous functions serve as majorants and mino-
rants).
It is quite interesting to notice that in the definition of the Perron integral we can restrict to
continuous majorants:
A function f on an interval [a, b] is Perron integrable if and only if for any ε > 0 there is a
majorant M and a minorant m, both of them continuous, so that
M (b) − M (a) − (m(b) − m(a)) < ε.
Finally, the Riemann integrable functions may be characterized in a similar way:

A function f on an interval [a, b] is Riemann integrable if and only if for any ε > 0 there is
a majorant M and a minorant m, both of them with continuous derivative, so that
M (b) − M (a) − (m(b) − m(a)) < ε.
25.12. Exercise. Find the gauge function δ from the definition of the Henstock–Kurzweil
integral if f is:
(a) Newton integrable;
(b) Lebesgue integrable;
(c) an indicator function of an open set;
(d) an indicator function of a null set.
25.13 Exercise. Give an example to show that the indefinite Henstock–Kurzweil integral
need not be an absolutely continuous function.
25.14. Historical Notes. H. Lebesgue already knew that his integral on the real line does not
integrate all derivatives. The task was, to find a wider class of “integrable functions” containing
both the Lebesgue and Newton integrable functions. The problem was solved by A. Denjoy [1912]
who used a constructive method based on a transfinite approach, and by N. N. Luzin [1912] using
a descriptive definition based on further generalization of the notion of absolute continuity. The
method of envelopes was developed by O. Perron [1914]; it was proved that his and Denjoy’s
integrals are equivalent. Nowadays, this integral is usually called the Denjoy-Perron integral or
the restricted Denjoy integral. Another approach was developed independently by R. Henstock
[1961] and J. Kurzweil [1957] who returned to a Riemann type definition. The integral is often
called the complete Riemann or the Henstock-Kurzweil integral. Simple definitions of Henstock’s
and Kurzweil’s approach and simple proofs allowed a deeper study of the Henstock–Kurzweil
integral and surprising applications. Among recent publications let us mention R. Henstock
[*1988], Peng Yee Lee [*1989], Š. Schwabik [*1992] and Š. Schwabik and Guoju Ye [*2005].
E. Integration on Rn 103
E. Integration on Rn
26. Lebesgue Measure and Integral on Rn
Recall that the outer Lebesgue measure λ∗n is defined as

λ∗n A := inf vol Ik : Ik ⊃ A, Ik are open intervals
and the Lebesgue measure λn is defined as the restriction of λ∗n to the σ-algebra
Mn = Mn (λ∗n ) of all Lebesgue measurable sets. Where no confusion can result,
the subscript “n” will be omitted.
26.1. Theorem. The Lebesgue measure λn is a complete Radon measure.
Proof. Use Theorem 1.21 and Remark 15.2.4 to prove that λn is a Radon measure.
Completeness is a consequence of Caratheodory’s construction, see Theorem 4.5.
26.2. Theorem. The Lebesgue measure λn+k on Mn+k is the completion of

the product measure λn ⊗ λk on Mn ⊗ Mk .
Proof. Recall that A ∈ Mn if and only if A is a union of a Borel set and a set
of measure zero. Indeed, such a property is possessed by any complete Radon
measure and then we may refer to Theorem 26.1. The proof is now divided into
three steps.
1. First we show that Mn ⊗ Mk ⊂ Mn+k . For this purpose it is enough to show
that any measurable rectangle A × B for A ∈ Mn , B ∈ Mk is in Mn+k . We write
A = DA ∪ NA and B = DB ∪ NB , where DA , DB are Borel sets and NA , NB
have measure zero. Then DA × DB is a Borel set and (A × B) \ (DA × DB ) =
(A × NB ) ∪ (NA × B) has measure zero.
2. Conversely, Bn+k = Bn ⊗ Bk ⊂ Mn ⊗ Mk . Since (Mn+k , λn+k ) is the
completion of (Bn+k , λn+k ), Mn+k ⊂ Mn ⊗ Mk .
3. Since λn ⊗ λk = λn+k on open intervals of Rn+k , using Hopf’s extension
theorem 5.5 and uniqueness of the completion we conclude that λn ⊗ λk = λn+k
on Mn+k .
26.3. Remark. It is not very hard to prove that n ⊗ k = n+k in Euclidean spaces. For
measurable sets, the situation is much more complicated. Indeed,
n+k ⊂ Mn ⊗ Mk ⊂ Mn+k and n+k

= Mn ⊗ Mk
= Mn+k .
For the first counterexample consider the Cartesian product H × H, where H is a measurable
non-Borel subset of R (Exercise 1.14.b) and use Lemma 11.2. The second one can be find in
Remark 11.10.
26.4. Exercise. Show that the Lebesgue measure is the only complete Radon measure on Rn
that assigns to each interval its volume.
26.5. Exercise. The Lebesgue measure is translation-invariant: If A ∈ Mn and x ∈ Rn , then
A + x ∈ Mn and λn (A + x) = λn A.
26.6. Exercise. Let ν be a nontrivial translation-invariant complete Radon measure on Rn .
Show that there is c > 0 such that ν = cλn .
Hint. Let c = ν(0, 1)n . Evaluate the ν-measure of arbitrary intervals and use Exercise 26.4 to
compare λn and 1c ν.
104 26. Lebesgue Measure and Integral on Rn
26.7. Lebesgue Measurable Functions. Since Mn contains the family

of all Borel sets, each Borel (in particular, each continuous) function (on Rn )
is measurable. A deeper and complete characterization of measurable functions
(e.g. on open or closed subsets of Rn ) is contained in Luzin’s Theorem 18.2.
Basic tools of the integral calculus of functions of several variables are substi-
tution theorem and Fubini’s theorem. The last one allow to reduce the multi-
dimensional integration to a succession of one-dimensional integrals.
26.8. Introduction to Fubini’s Theorem. If M ⊂ Rn+k is a measurable
∗
set, let L
(M ) denote the set of all functions f which have (finite
or infinite)
integral M f dλn+k (which will be traditionally denoted by M f (x, y) dx dy).
Given x ∈ Rn , recall that M x stands for the set {y ∈ Rk : [x, y] ∈ M }.
26.9. Fubini’s Theorem. Let M ⊂ Rn+k be a measurable set and f ∈
L ∗ (M ). Then the integral

g(x) := f (x, ·) dλk
Mx
exists for almost all x ∈ Rn , and

f dλn+k = g dλn .
M Rn
In other words,

f (x, y) dx dy = f (x, y) dy dx.
M Rn Mx
Proof. To prove the assertion it suffices to use Theorems 11.11. and 26.2.
26.10. Remarks. 1. The assumption f ∈ ∗ (M ) is satisfied if, for instance, f ∈ 1 (M )
(Tonelli’s theorem), or if f is a non-negative measurable function (Fubini’s theorem in a narrow
sense). When applying Fubini’s theorem, we usually start with Fubini’s theorem in a narrow
sense in order to show that |f | is integrable, and then we use Tonelli’s theorem.
2. Let πM denote the projection of M into Rn (i.e. πM is the set of all points x for which M x
is nonempty). Then Z Z „Z «
f (x, y) dx dy = f (x, y) dy dx
M πM Mx
provided πM is measurable. In general, the measurability of M does not imply the measurability
of πM . Observe that πM is measurable whenever M is open (then πM is also open), or compact
(then πM is compact), or a countable union of compact sets (e.g. a closed set). Projections of
Borel sets are also measurable but the proof is difficult. The study of projections of Borel sets
led to the notion of analytic sets (see D.L. Cohn [*1980]).
R
26.11. Example. We evaluate the integral B f (z) dz, where f is an integrable function on
the ball B := {z ∈ R : |z| < 1}. Let usq
3 write z in the form [x, y], where x = [z1 , z2 ] ∈ R2 and
y = z3 ∈ R. Then B x = {y ∈ R : |y| < 1 − |x|2 } and πB = {x ∈ R2 : |x| < 1}. Hence
Z Z Z √1−|x|2 !
f (z) dz = √ f (x, y) dy dx
B πB − 1−|x|2
0 q 0 q 1 1
Z 1 Z 1−z2 Z 1−z2 −z2
1 1 2
= @ q @ q f (z1 , z2 , z3 ) dz3 A dz2 A dz1 .
−1 − 2
1−z1 − 2 −z 2
1−z1 2
26.12. Jacobian and Derivative. If G ⊂ Rk is an open set and ϕ =

[ϕ1 , . . . , ϕn ] : G → R has all partial derivatives at a point z ∈ G, then the Jacobi
n
matrix ( ∂ϕ ∂xj (z))i=1,...,n of the mapping ϕ at z will be denoted by ∇ϕ(z).

i
j=1,...,k
We say that ϕ is (Fréchet) differentiable at z if there exists a linear mapping
L : Rk → Rn such that
ϕ(z + h) − ϕ(z) − L(h)

lim = 0.
h→0 |h|
In this case, L is represented by the matrix ∇ϕ(z), it is called the derivative

of ϕ at z and it is denoted by ϕ (z). If all partial derivatives of ϕ exist and are
continuous at z, then ϕ is differentiable at z.
A mapping which has continuous all partial derivatives of order ≤ k is called
a C k -mapping, or a mapping of class C k . A C ∞ -mappings, or mappings of class
C ∞ are those possessing continuous partial derivatives of all orders.
In case k = n, the Jacobi matrix ∇ϕ(z) is a square matrix and its determinant
Jϕ (z) is called the Jacobian of ϕ at z.
We state the following theorem without proof, later we provide a proof of a
more general Theorem 34.18.
26.13. Change of Variable Formula. Let G ⊂ Rn be an open set and
ϕ : G → Rn a one-to-one C 1 -mapping such that Jϕ (z) = 0 for all z ∈ G. Let f
be a function on ϕ(G) and E ⊂ ϕ(G) a measurable set. Then

f (x) dx = f (ϕ(t)) |Jϕ (t)| dt
E ϕ−1 (E)
provided either of these integrals exists.

26.14. Polar Coordinates. Consider the mapping ϕ : [r, t] → [r cos t, r sin t]. Then ϕ is of
class ∞ in R2 , „ «
cos t, −r sin t
ϕ (r, t) =
sin t, r cos t
and Jϕ (r, t) = r. The mapping ϕ is not one-to-one. For calculation of integrals, the substitution
x = ϕ(r, t) is often useful (usually G = (0, ∞) × (0, 2π)). In this situation, the hypothesis of the
change of variable formula in 26.13 are satisfied (notice that ϕ(G)
= R2 but λ(R2 \ ϕ(G)) = 0).
26.15. Lemniscate. Calculate the area of the set M surrounded by the lemniscate
M := {[x, y] ∈ R2 : (x2 + y 2 )2 < 2a2 (x2 − y 2 )} .
Hint. If ϕ denotes a mapping of 26.14 (polar coordinates) and if

√ π
L := {[r, t] ∈ R2 : r ∈ (0, 2a2 cos 2t), t ∈ (0, )} ,
4
then ϕ(L) = M ∩ {[x, y] : x > 0, y > 0}. By the change of variable formula 26.13 we get
Z Z π
4
λ2 (M ) = 4 r dr dt = 4a2 cos 2t dt = 2a2 .
L 0
26.16. Spherical Coordinates. Consider the mapping ϕ = [x, y, z], where

x(r, t, θ) = r cos θ cos t, y(r, t, θ) = r cos θ sin t, z(r, t, θ) = r sin θ
and [r, t, θ] ∈ G := (0, ∞) × (−π, π) × (− 21 π, 12 π). Then
0 1
cos θ cos t, −r sin t cos θ, −r cos t sin θ
ϕ (r, t, θ) = @ cos θ sin t, r cos t cos θ, −r sin t sin θ A .
sin θ, 0, r cos θ
Thus Jϕ (r, t, θ) = r2 cos θ. Again ϕ(G)
= R3 . However, the set R3 \ϕ(G) = {[x, y, z] ∈ R3 : y =
0 and x ≤ 0} is of measure zero.
26.17. Exercise. Let κn denote the volume of the n-dimensional unit ball.
(a) Let ϕ be a nonnegative measurable function on (0, R). Prove that
Z Z R
ϕ(|x|) dx = nκn rn−1 ϕ(r) dr.
B(0,R) 0
Hint. Consider first the case when ϕ is piecewise constant and then approximate ϕ by piecewise
constant functions.
n/2
(b) Show that κn = Γ(πn +1) .
2
R 2
Hint. Write In = Rn e−|x| dx. Using Fubini’s theorem we obtain In = I1n . By the previous
part (a) Z ∞
2 n n n
In = nκn rn−1 e−r dr = κn Γ( ) = κn Γ( + 1) .
0 2 2 2
It is easy to compute I2 = κ2 Γ(2) = π. Hence In = π n/2 and we are done.
26.18. Convolution of Functions and Measures. Let f , g be measurable

functions on Rn . The convolution of f and g is the function f ∗ g defined as

f ∗g (x) = f (x − y)g(y) dy
Rn
at those points x for which the integral exists.
Let f be a function on Rn and μ a (signed) Radon measure on Rn . If the
expression f ∗ μ(x) := Rn f (x − y) dμ(y) makes sense for almost all x ∈ Rn , then
the function f ∗ μ is called the convolution of f and μ.
26.19. Theorem. If f, g ∈ L 1 (Rn ), then the function y → f (x − y)g(y) is in
L 1 (Rn ) for almost all x ∈ Rn , f ∗ g ∈ L 1 and f ∗ g1 ≤ f 1 g1 .
Proof. First we have to show that the function [x, y] → f (x−y)g(y) is measurable
on R2n . Obviously, the function [x, y] → g(y) is measurable on R2n and we need
to prove the measurability of [x, y] → f (x − y). To this end it is enough to
show that the set M := {[x, y] ∈ Rn × Rn : x − y ∈ E} is measurable for every
measurable set E ⊂ Rn . But this is apparent since M is of the form E × Rn in
the coordinate system x = x − y, y = x + y. Next, we use Fubini’s Theorem
26.10 (in a narrow sense) to obtain

|f (x − y)g(y)| dx dy = g(y) |f (x − y)| dx dy
2n

R n nR R
= f 1 g1 < ∞.
Using Fubini’s theorem again (now Tonelli’s one) the desired conclusion follows.
26.20. Young’s Convolution Theorem. Let 1 ≤ p, q, r ≤ ∞, p1 + 1q = 1 + 1r .

If f ∈ L p , g ∈ L q , then f ∗ g = g ∗ f is defined almost everywhere, it is an
element of L r and f ∗ gr ≤ f p gq .
Proof. With trivial cases out of the way (cf. also the proof of Theorem 26.21)
p q
we assume p, q, r < ∞. Denoting f˜ = |f | , g̃ = |g| , f˜ and g̃ are in L 1 . By the
previous theorem the convolution f˜ ∗ g̃ is defined almost everywhere and it is in
L 1 . For a fixed x, the Hölder inequality yields

f (x − y)g(y) dy ≤
r−p r−q p q
|f (x − y)| r |g(y)| r |f (x − y)| r |g(y)| r

Rn Rn
r−p
pr
r−q
qr
r1
p q p q
≤ |f (x − y)| |g(y)| |f (x − y)| |g(y)|
Rn Rn Rn
r−p r−q 1
= f p r
gq r
((f˜ ∗ g̃)(x)) r .
By Fubini’s theorem we have the equality

f˜ ∗ g = f˜(x − y)g̃(y) dy dx = ˜
f (x − y)g̃(y) dx dy
Rn Rn Rn Rn Rn
p q
= f p gq ,
and consequently
r

f ∗ gr =
r f (x − y)g(y) dy dx ≤ f p gq
r−p r−q
(f˜ ∗ g̃)(x) dx

R n Rn Rn
r−p r−q p q r r
≤ f p gq f p gq = f p gq .
In fact, a precise proof should contain also a verification of measurability of the

function
x → f (x − y)g(y) dy.
Rn
This is clear if f and g are bounded and of bounded support, as f (x − y)g(y)

is then integrable on Rn × Rn , cf. 26.18. The general case can be obtained by
approximation fk → f , gk → g, where

f (x) if |x| ≤ k and |f (x)| ≤ k,
fk (x) =
0 otherwise
and gk ’s are defined analogously.

26.21. Theorem. If f ∈ L p (1 ≤ p ≤ ∞) and if μ is a finite signed Radon
measure on Rn , then f ∗ μ ∈ L p and
f ∗ μp ≤ f p |μ| (Rn ).

Proof. (a) Let 1 < p < +∞. The Hölder inequality gives (for a fixed x)
p1

p−1
p
n f (x − y) dμ(y) ≤ |f (x − y)| d |μ| (y) (|μ| (Rn )) p ,
R n R
whence (by Fubini’s theorem)

p n p−1 p
|f ∗ μ| dx ≤ (|μ| (R )) |f (x − y)| dx d |μ| (y)
Rn Rn Rn
p
= |μ| (R ) n p
f p .
(b) If p = 1, then Fubini’s theorem yields immediately

|f ∗ μ| dx ≤ |f (x − y)| dx d |μ| (y) = |μ| (Rn ) f 1 .
Rn Rn Rn
(c) If p = +∞, then

|f ∗ μ(x)| ≤ |f (x − y)| d |μ| (y) ≤ f ∞ |μ| (Rn )
Rn
for almost all x ∈ Rn .

26.22. Remark. Suppose 1 ≤ p ≤ ∞, p1 + 1q = 1, f ∈ p , g ∈ q . Then f ∗ g ∈ ∞ by
Young’s convolution theorem 26.20 and f ∗ g∞ ≤ f p gq . Even stronger assertion holds:
The convolution f ∗ g is uniformly continuous on Rn , and if 1 < p < ∞, then f ∗ g ∈ 0 (Rn ).
These properties can be proved using methods of Chapter 31 (see Exercise 31.9).
26.23. Exercise. Suppose f, g, h ∈ 1. Show that (f ∗ g) ∗ h = f ∗ (g ∗ h) almost everywhere.
26.24. Exercise. Now we indicate another way how to define the Jordan-Peano volume on
Euclidean spaces (compare with Exercise 5.12). Let m be a positive integer and i = [i1 , . . . , in ] ∈
Zn be a “multiindex”. Denote by Qm,i the closed cube [(i1 − 1)2−m , i1 2−m ] × · · · × [(in −
1)2−m , in 2−m ]. If m is fixed and i runs over Zn , the cubes Qm,i are non-overlapping (i.e. their
interiors are pairwise disjoint) and cover Rn . If E ⊂ Rn , let ∗ (E, m) denote 2−mn times the
number of elements of the set {i : Qm,i ∩ E

= ∅} and ∗ (E, m) denote 2−mn times the number
of elements of the set {i : Qm,i ⊂ E},
∗ ∗
E = inf (E, m) and ∗E = sup ∗ (E, m).
m m
Prove that:
(a) ∗E ≤ λ∗ E ≤ λ∗ E ≤ ∗E for any set E ⊂ Rn ;
(b) ∗ G = λG for any open set G;
(c) ∗ K = λK for every compact set K;
(d) for a bounded set E, ∗E = ∗ E if and only if λ(∂E) = 0.
26.25. Exercise. Let G ⊂ Rn be an open set. Show that there exists a disjoint collection of
balls in G whose union covers G except on a null Lebesgue set.
Hint. The assertion is an easy consequence of Vitali’s covering theorem 27.2. Here we indicate
an elementary proof. We can assume that G is of a finite measure. Set G0 = G. By virtue of
Exercise 26.24 there are closed cubes Q1 , . . . , Qp whose interiors are pairwise disjoint and whose
union has a measure greater than 12 λG. Set c = 2−n λB(0, 1/2). Then in the interior of every
cube having a measure α there is a closed ball of measure cα. Hence we get closed pairwise
disjoint balls Bj ⊂ Qj whose union F1 has a measure greater than 2c λG. Set G1 = G0 \ F1 .
Inductively, there are Fi+1 ⊂ Gi := Gi−1 \ Fi which are unions of pairwise disjoint closed balls
with λGi ≤ (1 − 2c )λGi−1 ≤ · · · ≤ (1 − 2c )i λG.
26.26. Exercise. Let E ⊂ Rn be an arbitrary set. Prove that

X [
λ∗ E = inf{ λU (xj , rj ) : U (xj , rj ) ⊃ E}.
j j
Hint. It may be supposed that λ∗ E < +∞. Choose an ε > 0 and find an open set G ⊃ E with
λG < λ∗ E + ε. Thanks to Exercise 26.25 there exists a countable disjoint collection {Vj } of
open balls so that {Vj } covers G except on a null set N and
X [
λVj = λ Vj ≤ λG + ε .
j j
S P
Find a countable family {Ij } of open intervals such that Ij ⊃ N and λIj ≤ ε. According
S mj P jm
to Exercise 26.24 there exist closed cubes Qj with Qj ⊃ Ij and
m λQj < 2λIj . For each
m m
cube Qm m
j find an open ball Wj with the same centre and with radius equal to the diameter
of this cube (so that Wjm contains Qm
j ). It is not hard to check that there exists a constant c
(depending only on the dimension n) such that λWjm ≤ cλQm j . Whence
X X X
λWjm ≤ c j ≤ 2c
λQm λIj ≤ 2cε.
j,m j,m j
To finish the proof, it is enough to consider the union of all balls from collections {Vj } and
{Wjm }.
27. Covering Theorems
Covering theorems provide an important tool for deriving deeper results in

measure and integration theory. We will present two Vitali’s type theorems. The
task is to find, for a given cover, a countable or finite disjoint subcover in such
way that the difference is a set of small measure.
27.1. Vitali Cover. We say that a collection V of closed balls is a Vitali cover
of a set A ⊂ Rn (or that V has the Vitali property) if for each x ∈ A and each
r > 0 there is B ∈ V so that x ∈ B ⊂ B(x, r).
First we state the classical theorem of Vitali.
27.2. Vitali’s Covering Theorem. Let V be a Vitali cover of a set A ⊂ Rn .
Then there exists a pairwise disjoint countable subcollection A ⊂ V such that

λ(A \ A ) = 0.
Proof. It is no restriction to assume that A ⊂ H ⊂ Rn , where H is an open set

having finite measure. Indeed, if we prove the theorem in this case, we can apply
it successively to H = U (0, 1), H = U (0, 2) \ B(0, 1), H = U (0, 3) \ B(0, 2), . . .
and to find a countable pairwise disjoint collection of the given family which
covers A \ {x ∈ Rn : |s| ∈ N} (notice that λ{x ∈ Rn : |x| ∈ N} = 0). It may
be also supposed that no finite union of pairwise disjoint balls of V covers A.
Denote W = {(x, r) : B(x, r) ∈ V }. Now we are going to construct inductively a
sequence {Bj } of pairwise disjoint balls of V . In the 0th step we have the empty
110 27. Covering Theorems
collection of balls. Assume that in the k th step (k = 1, 2, . . . ) the closed balls

B1 , . . . , Bk−1 ∈ V are selected and denote

!k−1 "
sk = sup r > 0 : B(x, r) ∈ V , B(x, r) ∩ Bj = ∅, B(x, r) ⊂ H .
j=1
The Vitali property and the additional assumptions ensure that sk > 0. Further
+∞ > s1 ≥ s2 ≥ . . . since H is of finite measure. Now choose a ball Bk =
B(xk , rk ) ∈ V with rk > sk /2 and Bk ∩ Bj = ∅ for j < k.
We get a sequence A := {Bk , k ∈ N} of pairwise disjoint sets from V . We
show that A covers
A except on a set of Lebesgue measure zero. To this end let
p ∈ N and x ∈ A \ A be given. We now invoke the Vitali property and find a
ball B = B(y, s) ∈ V with x ∈ B and B ∩ Bj = ∅ for j = 1, . . . , p − 1. Clearly
s ≤ sp . Since λH < ∞, we have rkn < ∞ and lim sk = lim rk = 0. Therefore
k
we can find q ≥ p with sq+1 < s ≤ sq . By definition of sq+1 there exists i ≤ q
with Bi ∩ B = ∅. If z denotes a point of Bi ∩ B, then
|x − xi | ≤ |x − y| + |y − z| + |z − xi | ≤ s + s + ri ≤ 2sq + ri ≤ 2si + ri ≤ 5ri ,
whence x ∈ B(xi , 5ri ). Since i ≥ p, we get
∞

A\ A ⊂ B(xi , 5ri ) ,
i=p
and thus
∞

λ(A \ A)≤ 5n λB(xi , ri ) .
i=p

Taking the limit as p → ∞ it follows that λ(A \ A ) = 0, and we have finished.
27.3. Corollary. Let A ⊂ H ⊂ Rn , where H is an open set of finite measure.

If a collection V of closed balls in H is a Vitali cover of A and ε > 0, then there
exists a finite collection of pairwise disjoint balls Aε ⊂ V such that

λ∗ (A \ Aε ) < ε.
Proof. Set Aε = {B1 , . . . , Bp }, where Bj are as in the proof of the previous

theorem and p is sufficiently large.
Although Theorem 27.2 is sufficiently strong for most of the applications, we
state also more powerful covering theorems which will be useful in Chapter 28.
27.4. Besicovitch Theorem. There exists a positive integer N (depending
only on the dimension n of the space Rn ) with the following property:
Whenever A ⊂ Rn is a set and Δ is a bounded positive function on A, then

there exist a finite family of countable sets D1 , . . . , DN ⊂ A such that for each
k ∈ {1, 2, . . . , N }, the collection {B(x, Δ(x)) : x ∈ Dk } is pairwise disjoint, and

N
A⊂ {U (x, Δ(x)) : x ∈ Dk } .
k=1
Proof. Thanks to the compactness of S =: {x ∈ Rn : |x| = 1}, there is a finite

set T ⊂ S such that
S⊂ 1
U (t, 40 ).
t∈T
Let ξ denote number of elements of T and N := 2ξ + 1. First let us assume that

A is bounded. A disjoint sequence {xj } of points of A will be chosen inductively
as follows: Denote
s1 = sup{Δ(x) : x ∈ A}
and find x1 ∈ A with Δ(x1 ) > 78 s1 . In the j th step (j > 1) denote

Mj = A \ U (xi , Δ(xi )).
i<j
If Mj = ∅, then we stop and set D = {xi : i = 1, . . . , j − 1}. Otherwise, let sj =

sup{Δ(x) : x ∈ Mj }, find
xj ∈ Mj with Δ(xj ) > 78 sj and set D = {xi : i ∈ N}.
We claim that A ⊂ Uj , where Uj := U (xj , Δ(xj )). This is clear if D is
j
finite. In the general case, since A is bounded, {xj } has a Cauchy subsequence
{xjk }. Observe that Δ(xi ) ≤ |xi − xj | (indeed, for i < j, xj lies outside the ball
U (xi , Δ(xi )) ), and therefore
inf{Δ(xi ) : i ∈ N} ≤ inf{|xi − xj | : i, j ∈ N, i = j} = 0.
Given now x ∈ A, there is j with Δ(xj

) < 78 Δ(x). It is then apparent that
Δ(x) > sj , so that x ∈
/ Mj . Whence x ∈ i<j Ui .
Next we define a function p. For every xj ∈ D, set
Pj = {i ∈ {1, . . . , j − 1} : Bi ∩ Bj = ∅} ,
where Bj := B(xj , Δ(xj ), and define inductively p(x1 ) = 1,
p(xj ) = min{k ∈ N : k = p(xi ) for all i ∈ Pj }.
Set Dk = {xj : p(xj ) = k}. Plainly the collection of balls {Bj : xj ∈ Dk } is

pairwise disjoint for all k ≤ N , and to complete the proof we only have to show
that p(xj ) ≤ N for each xj ∈ D. We will be done once we prove that, for each
xj ∈ D, number of balls Bi , i < j, intersecting Bj is less than N .
We shall suppose that the last assertion does not hold, and derive a contradic-
tion. Select z := xj ∈ D. If number of balls Bi , i < j intersecting Bj is ≥ N ,
then there exist z1 , z2 , z3 ∈ D and t ∈ S so that
zk = xjk for j1 < j2 < j3 < i
and
z − zk 1

|z − zk | − t < 40 for k = 1, 2, 3 .
For k = 1, 2, 3 denote
r = Δ(z), rk = Δ(zk ), Rk = |z − zk | .
We will show that r, rk and Rk satisfy the following system of 16 inequalities
(I) r > 0,
(IIk ) Rk ≤ rk + r,
8
(IIIk ) r < rk ,
7
8
(IV1 ) r3 < r2 ,
7
8
(IV2 ) r3 < r1 ,
7
8
(IV3 ) r2 < r1 ,
7
(Vk ) rk ≤ R k ,
1
(VI1 ) r2 ≤ |R2 − R3 | + r3 ,
8
1
(VI2 ) r1 ≤ |R1 − R3 | + r3 ,
8
1
(VI3 ) r1 ≤ |R1 − R2 | + r2 ,
8
which does not have any solution. Therewith we get the required contradiction.
Now, we will derive the system of inequalities (I) – (VI3 ). The inequality (IIk )
says that the ball B(zk , rk ) intersects B(z, r). The inequalities (III), (IV) follow
from the following inequalities
8
Δ(xj ) ≤ si < Δ(xi ) for i < j .
7
Since Δ(xi ) < |xj − xi | for i < j, we get (V) and (VI). For instance, to obtain
(VI1 ) we use the definition of N , (II3 ) and (III3 )) in order to show that

|z3 − z| |z3 − z|

r2 ≤ |z2 − z3 | ≤ z2 − z −
(z2 − z) + z3 − z −
|z2 − z| |z2 − z|

z3 − z z2 − z 1
≤ |R2 − R3 | + R3 − ≤ |R2 − R3 | + 2R3
|z3 − z| |z2 − z| 40
1 15 1
≤ |R2 − R3 | + (r3 + r) ≤ |R2 − R3 | + r3 ≤ |R2 − R3 | + r3 .
20 20 · 7 8
Now we show that the system of inequalities (I)–(VI) does not possess a solution.
First, note that rk and Rk are positive by (I), (III) and (V). By (II1 ), (II2 ), (V1 ),
(V2 ), (IV3 ) and (III2 ) we have
7 3
|R1 − R2 | ≤ r+|r1 − r2 | = r+r1 +r2 −2 min(r1 , r2 ) ≤ r+r1 +r2 − r2 < r1 + r2 .
4 7
On the other hand, (VI1 ), (VI2 ) and (IV1 ) yield
1 1 5
|R1 − R3 | + |R2 − R3 | ≥ r1 − r3 + r2 − r3 ≥ r1 + ,
8 8 7
whence
|R1 − R2 | < |R1 − R3 | + |R3 − R2 | .
Similarly we obtain
|R1 − R3 | < |R1 − R2 | + |R2 − R3 | ,
and
|R2 − R3 | < |R1 − R3 | + |R2 − R1 | .
We see that the system (I)–(VI3 ) has no solution and this contradiction yields
the required conclusion.
For the general case when A is unbounded, let K > 2 sup Δ(x), and denote
x∈A
Ai = {x ∈ A : (i − 1)K ≤ |x| < iK}.
By the above argument we find for each i ∈ N corresponding sets Dki ⊂ Ai ,

k = 1, . . . , N . Setting

i
{D : i even } for k ≤ N,
Dk =
ki
{Dk−N : i odd } for N < k ≤ 2N.
we see that number 2N obeys required properties.

27.5. Lemma. Let μ be a Radon measure on Rn and N as in Theorem 27.4.
Suppose further that A ⊂ H ⊂ Rn , where H is an open set with μH < +∞. Given
a positive bounded function Δ on A such that B(x, Δ(x)) ⊂ H for all x ∈ A, then
there exist an open set U ⊂ H and a finite set F ⊂ A such that A ⊂ U , the
collection of balls B(x, Δ(x))x∈F is pairwise disjoint and

1
μ U\ B(x, Δ(x)) < 1 − μU.
2N
x∈F
Proof. By the previous theorem we can find countable sets D1 , . . . , DN ⊂ A such

N
that for D := Dk and k ∈ {1, 2, . . . , N }, the collection of balls
k=1
{B(x, Δ(x)) : x ∈ Dk }
is pairwise disjoint and

A⊂ B(x, Δ(x)) .
x∈D
Denote
Ek = B(x, Δ(x)), U= U (x, Δ(x)).
x∈Dk x∈D
We may find and index q ∈ {1, . . . , } with μEq ≥ N1 μU and a finite set F ⊂ Dq
such that 1
μ B(x, Δ(x)) > μU.
2N
x∈F
Now, it is not hard to see that

1
μ U\ B(x, Δ(x)) < 1− μU.
2N
x∈F
27.6. Vitali’s Covering Theorem for Radon Measures. Let μ be a Radon

measure on Rn and A ⊂ H ⊂ Rn , where H is an open of finite μ-measure.
Let further V be a family of closed balls of Rn having the following Vitali’s type
property: Given x ∈ A, there exist rj 0 such that B(x, rj ) ∈ V . Then for any
ε > 0 there exists an open set G ⊂ Rn with μG < ε and a finite family A of
pairwise disjoints balls of V such that

A⊂G∪ A.
Proof. We will construct recursively sequences of sets {Gk } and {Ak } according
to the following rule: Set G0 = H, A0 = A. Suppose that in the k th step the open
sets G0 ⊃ · · · ⊃ Gk−1 and sets A0 ⊃ · · · ⊃ Ak−1 have been chosen. By Lemma
27.5 there is an open set Uk , Ak−1 ⊂ Uk ⊂ Gk−1 and a closed set Mk ⊂ Gk−1 so
that Mk is the union of a finite pairwise disjoint collection of balls of V , and

1
μ(Uk \ Mk ) < 1− μUk .
2N
Set
Ak = Ak−1 \ Mk , Gk = Uk \ Mk .
Our construction guarantees that A \ Gk is covered by the selected balls (which
constitute a finite pairwise disjoint subcollection of V whose union is M1 ∪· · ·∪Mk )
and k
1 1
μGk < 1 − μGk−1 ≤ · · · ≤ 1 − μG0 .
2N 2N
Taking sufficiently large k, we get the desired conclusion.
27.7. Corollary. Let μ be a complete Radon measure on Rn , A ⊂ Rn and

V be a collection of closed balls in Rn . If for any x ∈ A there exist a sequence
rj 0 with B(x, rj ) ∈ V , then there exists a countable pairwise disjoint collection
A ⊂ V such that
μ A \ A = 0.
Proof. The proposition is a straightforward consequence of Theorem 27.6 if A is

bounded. For the general case, we use a similar idea as in the proof of Theorem
27.2. Since it can happen that μ{x : |x| = r} = 0, we use Exercise 15.16.b in
order to find a sequence {r1 , r2 , . . . }, rj +∞ for which μ{x : |x| = rj } = 0 for
every j ∈ N.
27.8. Remark. Stating Theorem 27.6 and Corollary 27.7 we assumed that centers of given
balls lie in the covered set. Hence, Theorem 27.2 and its Corollary 27.3 are not direct conse-
quences of Theorem 27.6 and Corollary 27.7.
27.9. Exercise. Let (P, ρ) be a separable metric space, A ⊂ P and τ > 1. Suppose there is
W ⊂ P × (0, 1] such that the collection of balls {B(x, r) : (x, r) ∈ W} covers A. Show that there
exists a countable collection S ⊂ W such that the balls B(x, r), (x, r) ∈ S are pairwise disjoint
and [
A⊂ U (x, (1 + 2τ )r).
(x,r)∈S
Hint. Step by step, find for any k ∈ N a countable colection Sk ⊂ Wk , where

n [ o
Wk := (x, r) ∈ W: τ −k < r ≤ τ −k+1 , B(x, r) ∩ {B(y, s) : (y, s) ∈ S1 ∪ · · · ∪ Sk−1 } = ∅
in such a way that the balls B(x, r), (x, r) ∈ Sk are pairwise disjoint and
[
B(x, r) ∩ {B(y, x) : (y, s) ∈ Sk }
= ∅
for all (x, r) ∈ Wk .

(This can be done, for instance, in the following way: Let {qj } be a dense sequence of points
of P . We start from S0k = ∅ and go on recursively. Let j ≥ 1 be a positive integer. If there
exists (x, r) ∈ Wk so that qj ∈ B(x, r) and B(x, r) ∩ B(y, s) = ∅ for all (y, s) ∈ Sj−1
k , then set
Sjk = Sj−1 ∪ {(x, r)}. In the opposite case leave Sj
= S j−1
.)
k S k k
Now set S = ∞ k=1 Sk . The balls B(x, r), (x, r) ∈ S are plainly pairwise disjoint. Take
x ∈ A. We are going to prove that
[
x∈ U (u, (1 + 2τ )r).
(u,r)∈S
Since [
A⊂ B(u, r),
(u,r)∈W
there is (y, s) ∈ W with x ∈ B(y, s). Further find k ∈ N satisfying τ −k < s ≤ τ −k+1 . Then
either (y, s) ∈ Sk (and there is nothing to prove), or B(y, s) intersects some of the “previous”
balls. More precisely, there exists z ∈ B(u, r) ∩ B(y, s), where (u, r) ∈ S1 ∪ · · · ∪ Sk . Obviously
s < τ r, and therefore
ρ(x, u) ≤ ρ(x, y) + ρ(y, z) + ρ(z, u) ≤ 2s + r < (2τ + 1)r.
Hence x ∈ U (u, (2τ + 1)r).

116 28. Differentiation of Measures
S
27.10. Exercise. Let be a collection of open balls in Rn , M = , c < λM . Prove that
there exist disjoint balls U1 , . . . , Uk ∈ such that
X
k
c
λUj > .
j=1
3n
Hint. Find a compact set K ⊂ M and a finite subcollection ⊂ so that λK > c and
S ⊃ K. Select the ball U ∈ with the greatest radius. Again, select among the balls of
1
which do not intersect U1 the ball U2 with the greatest radius. Suppose the balls U1 , . . . , Uj−1
have been chosen and find among the balls of which do not intersect U1 , . . . , Uj−1 the ball
Uj with the greatest radius. After a finite number of steps, we get a sequence {U1 , . . . , Uk } such
that we cannot add another ball. If x ∈ K, then x ∈ U for some U ∈ and the construction
implies that U ∩ Ui
= ∅ for some i. Let i be the smallest of such indices. Then the radius R of
the ball U is not greater than the radius of Ui , so that x lies in the ball with the same center as
Ui and radius 3R. Hence
Xk
c < λK ≤ 3n λUj .
j=1
27.11. Exercise. Prove Vitali’s covering theorem 27.2 using Exercise 27.10 in a similar way
as Theorem 27.6 was derived from Lemma 27.5.
27.12. Notes. The famous covering theorem is due to G. Vitali [1908] who proved it for closed
intervals and the Lebesgue measure. Later on, Vitali’s covering theorem was generalized in var-
ious directions by many authors (H. Lebesgue [1910], S. Banach [1924], C. Carathéodory (second
edition of the book [*1918] in 1927)). Another step forward was made by A.S. Besicovitch [1945],
[1946] and by A.P. Morse [1947]. The origin of the simple covering theorem which appeared in
Exercise 27.10 is in Wiener’s article [1939].
28. Differentiation of Measures

28.1. Derivative of a Measure. In the following, μ will stand for a Radon
measure on Rn and λ will denote again the Lebesgue measure.
For any x ∈ Rn , define
μB(x, r) μB(x, r)
Dμ (x) := lim sup and Dμ (x) := lim inf .
r→0+ λB(x, r) r→0+ λB(x, r)
If Dμ (x) = Dμ (x), we call this common value the (symmetric) derivative of μ

(with respect to λ) at x and denoted it by Dμ (x).
n
π2
28.2. Remarks. 1. Recall that λB(x, r) = rn , see Exercise 26.17.b
Γ( n
+ 1)
2
2. The functions x → Dμ (x) and x → Dμ (x) are Borel-measurable. This assertion follows from
the fact that the function x → μB(x, r) is upper semicontinuous, x → λB(x, r) is continuous
and that when defining Dμ , we can restrict to r ∈ Q.
Compare with the following rather surprising theorem (O. Hájek [1957]): If F is an arbitrary
function on an interval I ⊂ R, then the function DF (see 22.3) is Borel on I. The proof for
continuous F is indicated in the proof of Theorem 25.8. An analogous theorem fails for Dini
derivatives, there are examples of functions for which all four Dini derivatives are nonmeasurable.
3. Let μ be a Radon measure on R and F its distribution function (see 24.1). Then F (x) =
Dμ (x) provided F (x) exists. Moreover, the following is true: If F (x) ≤ a for x ∈ A, then
μA ≤ aλA. Similarly, μB ≥ aλB provided F ≥ a on B. Compare this result with the next
lemma.
28.3. Lemma. Let A ⊂ Rn be a Borel set and a > 0.

(a) If Dμ (x) ≤ a for all x ∈ A, then μA ≤ aλA.
(b) If Dμ (x) ≥ a for all x ∈ A, then μA ≥ aλA.
Proof. With the trivial case λA = ∞ out of the way, assume λA < ∞ and fix
an ε > 0. There is an open set G ⊃ A with λG ≤ λA + ε. By Vitali’s covering
theorem 27.6 for Radon measures we can find disjoint closed balls B(xi , ri ) ⊂ G
such that
! "
μB(xi , ri ) ≤ (a + ε)λB(xi , ri ) and μ A \ B(xi , ri ) = 0.
i
(If μ is absolutely continuous with respect to λ, we can also use Vitali’s covering
theorem 27.2.) Then

μA ≤ μB(xi , ri ) ≤ (a + ε) λB(xi , ri ) ≤ (a + ε)λG ≤ (a + ε)(λA + ε) ,
i i
whence (a) follows.

The inequality in part (b) can be proved similarly using Vitali’s covering the-
orem 27.2 for the Lebesgue measure.
28.4. Theorem. Any Radon measure μ on Rn has a finite derivative Dμ
λ-almost everywhere on Rn .
Proof. Without loss of generality we can restrict to a compact interval I ⊂ Rn .
For k ∈ N and r, s ∈ Q+ , s < r, set
Ak = {x ∈ I : Dμ (x) ≥ k}, A(r, s) = {x ∈ I : Dμ (x) ≤ s < r ≤ Dμ (x)}.
By the previous lemma,
kλAk ≤ μAk ≤ μI < +∞ and rλA(r, s) ≤ μA(r, s) ≤ sλA(r, s) ≤ sλI < +∞.
! "
Hence λ Ak = limk λAk = 0 and λA(r, s) = 0 (realize that 0 ≤ s < r). The
k
desired conclusion readily follows observing that

0 ≤ Dμ (x) ≤ Dμ ≤ +∞, {x ∈ I : Dμ (x) = +∞} = Ak
k
and
{x ∈ I : Dμ (x) < Dμ (x)} = A(r, s).
r,s∈Q+
28.5. Remark. Consider now the simple case when F is the distribution function of a Radon
measure μ on R. We know that F has a (finite) derivative F almost everywhere (Lebesgue’s
R R
theorem 22.5), that ab F dλ ≤ F (b) − F (a) (Theorem 22.7), and that ab F dλ = F (b) − F (a)
for any interval [a, b] if and only if μ λ (Corollary 23.5 and Exercise 24.7). In the last case,
F agrees almost everywhere with the Radon-Nikodým derivative dμ dλ
. An analogous assertion
holds for differentiation of measures and it is contained in the next Theorem 28.6.
Proof of the following theorem is based on Lemma 28.3. A proof based on the
covering theorem of Exercise 27.10 can be found in W. Rudin [*1974].
118 28. Differentiation of Measures
28.6. Theorem. Let μ be a Radon measure on Rn and B ⊂ Rn a Borel set.

Then:

(a) B Dμ dλ ≤ μB (in particular, Dμ is locally λ-integrable);

(b) B Dμ dλ = μB, provided μ λ.
Proof. Suppose β > 1, k ∈ Z and set

Bk (β) = {x ∈ B : β k ≤ Dμ (x) < β k+1 }, M= Bk (β).
k
According to Theorem 28.4, λ(B \ M ) = 0. If μ λ, then also μ(B \ M ) = 0.

Using Lemma 28.3.b, we get
+∞

+∞
Dμ dλ = Dμ dλ = Dμ dλ ≤ β k+1 λBk (β)
B M k=−∞ Bk (β) k=−∞

+∞
≤β μBk (β) ≤ βμB ,
k=−∞
and by part (a) of Lemma 28.3,

+∞

+∞
Dμ dλ ≥ Dμ dλ ≥ μBk (β) = β −1 μM = β −1 μB.
B k=−∞ Bk (β) k=−∞
When taking the limit as β → 1+, we get both (a) and (b).
28.7. Theorem. The following statements about a Radon measure μ on Rn
are equivalent:
(i) μ λ;
(ii) μB = B Dμ dλ for every Borel set B ⊂ Rn ;
(iii) Dμ is the Radon-Nikodým derivative of μ with respect to λ;
(iv) Dμ < +∞ μ-almost everywhere on Rn .
Proof. The previous theorem says that (i) =⇒ (ii), thus the equivalence (i) ⇐⇒
(ii) ⇐⇒ (iii) is obvious. To show that (i) implies (iv), use Theorem 28.6. Assume
now (iv) and let A ⊂ Rn , λA = 0. Using Lemma 28.3.a, we get
μ{x ∈ A : Dμ (x) ≤ k} ≤ k λA = 0
for every k ∈ N, whence plainly μA = 0.

28.8. Remarks. 1. It is not difficult to state similar results for signed Radon measures.
2. If μ is a Radon measure on Rn and μ = μs + μa is its Lebesgue decomposition into the
singular and absolutely continuous part, then Dμ is the Radon-Nikodým derivative of μa with
respect to λ.
28.9. Exercise. Let G ⊂ Rn be an open set, f : G → Rn be a diffeomorphism. If B is a
Borel subset of G, set μB := λf (B). Show that μ is a Radon measure on G and
˛ ˛
(a) Dμ (x) = ˛Jf (x)˛ for every x ∈ G;
˛ ˛
(b) ˛Jf ˛ is the Radon-Nikodým derivative dμ .
dλ
28.10. Notes. We could define the (symmetric) derivative of a measure μ at x using cubes
with the center at x whose lenght of edges converges to zero.
Furthermore, we could consider more general sequences of sets “shrinking” in a suitable way
to x. We could consider, for instance, all sequences of intervals containing x whose diameters
converge to zero. Let us note that for these general “derivatives of measures” analogous theorems
to those of this paragraph fail to hold, and even analogies of Theorem 29.2 (which is their direct
consequence) do not hold. An example can be found in M. de Guzmán [*1975], Chapter V., §2.
On the topic of differentiation of measures the reader is invite to consult for instance,
W. Rudin [*1974], M. de Guzmán [*1975] or J. Lukeš, J. Malý and L. Zajı́ček [*1986].
29. Lebesgue Density Theorem and Approximately Continuous Functions
29.1. Density points. Let λ be again the Lebesgue measure on Rn , M ⊂ Rn

a measurable set and x ∈ Rn . We say that x is a point of density, or a density
point of M if
λ(M ∩ U (x, r))
lim = 1.
r→0+ λU (x, r)
29.2. Lebesgue Density Theorem. Almost every point of a measurable set
M is its density point.
Proof. If μA := λ(A ∩ M ) for every (Lebesgue) measurable set A ⊂ Rn , then
μ is a Radon
measure on Mn absolutely continuous with respect to λ. Since
μA = cM dλ for A ∈ Mn , cM is the Radon-Nikodým derivative of μ with
A
respect to λ. On the other hand, Theorem 28.7 tell us that Dμ is also the Radon-
Nikodým derivative of μ with respect to λ. Hence Dμ = cM almost everywhere
according to the Radon-Nikodým Theorem (Remark 13.6.1) . If we realize that
μU (x, r)
Dμ (x) = lim and cM (x) = 1 for x ∈ M , we obtain the assertion.
r→0+ λU (x, r)
29.3. Remark. When n = 1, the Lebesgue density theorem is an easy consequence of

Theorem 23.4 according to which the derivative of an indefinite Lebesgue integral of a (locally)
integrable function cM equals cM almost everywhere.
29.4. Density Topology. A measurable set M ⊂ Rn is called d-open if each
point of M is its point of density. For instance, the set of all irrational numbers,
or any open subset of Rn are d-open. We are going to show that the collection
d of all d-open sets on Rn forms a topology which will be labelled as the density
topology. Since d contains all open subsets of Rn , it is finer than the Euclidean
topology of Rn .
29.5. Theorem. The collection of all d-open sets forms a topology.
Proof. Plainly ∅ and Rn are d-open, and d is closed under the formation of
finite
intersections. If A is a collection of d-open sets, we have to prove that T := A ∈
d. If we show that T is measurable, then every point of T is apparently a density
point of T . We can assume that T ⊂ I, where I ⊂ Rn is a compact interval.
Denoting S = {S : there exists a countable collection A0 ⊂ A such that S =
A0 }, there exists S ∈ S with λS = sup{λM : M ∈ S }. Given x ∈ T , there

exists A ∈ A so that x is a point of density of A. Since λ(A ∪ S) = λS, we get
120 29. Lebesgue Density Theorem and Approximately Continuous Functions
λ(A \ S) = 0. Hence x is a point of density of A ∩ S and also a point of density

of S. Of course, x cannot be a point of density of I \ S. Since by the Lebesgue
density theorem almost every point of I \ S is a point of density of I \ S, it follows
that λ(T \ S) = 0. Hence T = S ∪ (T \ S) is measurable.
29.6. Remark. The density topology in Rn shares a lot of interesting properties. Let us note
that the density topology is not metrizable, it is not normal (it is completely regular), the only
d-compact sets are the finite ones, and that the Baire category theorem holds for d.
29.7. Approximately Continuous Functions. A function f defined on a
neighbourhood of a point z ∈ Rn is said to be approximately continuous at z if
there exists a measurable set M ⊂ Rn such that z is a density point of M and
lim f (x) = f (z). If f is approximately continuous at each point of a given
x→z, x∈M
set, we say that f is approximately continuous. It is clear that each continuous
function is approximately continuous.
29.8. Theorem. A function f is approximately continuous at z if and only if
f is d-continuous at z.
Proof. Suppose f is d-continuous at z and defined on a neighbourhood U =
U (z, r0 ) of z. Then z is a point of density of each of the sets Mj := {x ∈
U : |f (x) − f (z)| < 1j }. Find a decreasing sequence of radii rk > 0 satisfying
λ(U (z, r) \ Mj )
< 2−j−k
λU (z, r)

∞
for all j = 1, . . . , k and r ∈ (0, rk ). Set Aj = U (z, rj ) \ Mj and M = U \ j=1 Aj .
We have
λ(U (z, r) ∩ Aj )
≤ 2−j
λU (z, r)
for all r > 0. Choose k ∈ N and r ∈ (0, rk ). Then

λ(U (z, r) ∩ Aj ) 2−j−k for j ≤ k ,
≤
λU (z, r) 2−j for j > k .
Whence
λ(U (z, r) \ M )
≤ 2−k+1 ,
λU (z, r)
and z is a density point of M . If x ∈ M ∩ U (z, rk ), then x ∈ Mk and consequently
|f (x) − f (z)| < k1 . Therefore lim f (x) = f (z). The reverse implication is
x→z,x∈M
obvious.
29.9. Denjoy’s Theorem. A function f : Rn → R is Lebesgue measurable if
and only if f is approximately continuous at almost all points of Rn .
Proof. Let f be approximately continuous at almost all points. Denote by N
the set of all points of approximate discontinuity of f . Take any c ∈ R. We
will show that the set M := {x ∈ R : f (x) > c} is measurable. Since M is a
d-neighbourhood of any point x ∈ M \ N , it follows that M \ N is also a d-

neighbourhood of x. Hence M \ N is d-open, and thus measurable. Since N is a
null set, M is measurable as well.
Now assume that f is measurable and select an ε > 0. Luzin’s theorem 18.2
provides us with a continuous function g on Rn and an open set G so that μG < ε
and f = g in Rn \ G. By the Lebesgue density theorem almost every point of the
set Rn \ G is its point of density. Thus f is approximately continuous at almost
all points of Rn \G and we can easily conclude that f is approximately continuous
at almost all points of Rn .
29.10. Remark. Compare the last equivalence with the Lebesgue theorem 7.9 according
to which a bounded function f is Riemann integrable if and only if f is continuous almost
everywhere.
29.11. Exercise. Let f be a bounded function defined on a neighbourhood of a point z ∈ Rn .
Then f is approximately continuous at z if and only if z is a Lebesgue point for f (the definition
of Lebesgue points in Rn is analogous to the one-dimensional case in 23.8). Show that the
assumption of boundedness is essential for one of the implications.
29.12. Notes. The notion of approximate continuity was introduced by A. Denjoy [1915];
he proved also that every Lebesgue measurable function is approximately continuous at almost
all points. The converse assertion was proved by V. Stepanov in [1924]. The density theorem
29.2 is due to H. Lebesgue [*1904]. This theorem also holds for nonmeasurable sets if the outer
Lebesgue measure is used in the definition of density points. The notion of the density topology
was studied much later in the 1950’s. Many of its properties and generalizations and recent
applications can be found in the monograph by J. Lukeš, J. Malý and L. Zajı́ček [*1986].
30. Lipschitz Functions
30.1. Lipschitz Mappings. Let (P1 , ρ1 ), (P2 , ρ2 ) be metric spaces and β > 0.
Recall that a mapping f : P1 → P2 is said to be β-Lipschitz if
ρ2 (f (x), f (y)) ≤ βρ1 (x, y)
for every x, y ∈ P1 . We say that f is Lipschitz on P1 if there exists β > 0 for
which f is β-Lipschitz. A mapping f : P1 → P2 is called locally Lipschitz if for
any z ∈ P1 there exists its neighbourhood U such that f is Lipschitz on U (in
this case, the constant β can vary from point to point).
30.2. Lemma. Let f be a continuous function on an open set G ⊂ Rn and
i ∈ {1, . . . , n}. Then the set of those points at which ∂x
∂f
i
fails to exist is a Borel
set.
Proof. To simplify the proof, we can assume that G = Rn . Denote

f (x + tei ) − f (x) 1 1 1
um,k (x) = sup : t ∈ (− , ), |t| ≥ ,
t m m k

f (x + tei ) − f (x) 1 1 1
vm,k (x) = inf : t ∈ (− , ), |t| ≥ ,
t m m k
u(x) = inf sup um,k (x) , v(x) = sup inf vm,k (x).
m k>m m k>m
Then, for k > m, um,k and vm,k are continuous. Since the set of points where
∂xi (x) exists equals {x ∈ R : −∞ < u(x) = v(x) < +∞}, the assertion follows.
∂f n
122 30. Lipschitz Functions
30.3. Rademacher’s Theorem. Let f be a Lipschitz function on an open set

G ⊂ Rn . Then f is differentiable almost everywhere in G.
Proof. Assume that f is a β-Lipschitz function. Let E be the set of all points
where f fails to have some of the partial derivatives. Using Fubini’s theorem, the
one-dimensional theorem on differentiability of Lipschitz functions (Lemma 22.4)
easily implies that E is of measure zero (notice that E is measurable by Lemma
30.2).
Now, for p, q ∈ Qn and m ∈ N denote

f (x + tei ) − f (x)
Sp,q,m = x ∈ G \ E : pi < < qi for all i = 1, . . . , n
t
1 1
and for t ∈ − , \ {0} .
m m
Let S̃p,q,m be the set of all density points of Sp,q,m . By the Lebesgue density
theorem 29.2 λ(Sp,q,m \ S̃p,q,m ) = 0. If

N := (Sp,q,m \ S̃p,q,m ) ,
p,q,m
then N is of measure zero. We claim that f is differentiable at each point of

G \ (N ∪ E).
To this end, take x ∈ G \ (N ∪ E) and ε ∈ (0, 1). Pick p, q ∈ Qn so that
∂f
qi − ε < pi < (x) < qi , i = 1, . . . , n.
∂xi
Then there exists m ∈ N with x ∈ S := Sp,q,m . Since x is outside !of "N , x is even
n
a density point of S. Find δ ∈ (0, m 1
) such that λ(U (x, r) \ S) ≤ 2ε λ(U (x, r))
for all r ∈ (0, 2δ). In particular, notice that U (x, (1 + ε)τ ) \ S does not con-
tain any ball of radius ετ if τ ∈ (0, δ). Choose y ∈ U (x, δ) and denote y i =
(y1 , . . . , yi , xi+1 , . . . , xn ). For any i ∈ {1, . . . , n}, let Ui be the ball with center y i
and radius ε |y − x|. The choice τ = |y − x| yields that S ∩ Ui = ∅ for all i. If
z i ∈ S ∩ Ui and wi := z i−1 + (yi − xi )ei , then
i
w − y i = z i−1 − y i−1 ≤ ε |y − x| ,
f (wi ) − f (z i−1 ) ∂f
pi < < qi and pi < (x) < qi .
yi − xi ∂xi
Whence

f (wi ) − f (z i−1 ) − ∂f (x)(yi − xi ) ≤ (qi − pi )|yi − xi | ≤ ε |y − x| .
∂xi
Summarizing, we get

∂f

f (y) − f (x) − (x)(yi − xi )
∂x i
i

∂f
≤ f (w ) − f (z ) −
i i−1
(x)(yi − xi )
∂x i
i
! "
+ f (wi ) − f (y i ) + f (z i−1 ) − f (y i−1 )
i
≤ ε(n + 2βn) |y − x| .
Thus f (x) does exist.

30.4. Lemma. Let (fα )α be a family of β-Lipschitz functions on Rn . Then
the function supα fα is β-Lipschitz provided it is finite at least at one point.
Proof. It is quite obvious.
30.5. McShane’s Extension Theorem. Let f be a β-Lipschitz function on
a set E ⊂ Rn . Then there exists a β-Lipschitz extension f ∗ of f to all of Rn . If,
in addition, E is bounded, then we can find a β-Lipschitz extension f ∗∗ of f with
compact support.
Proof. Set
f ∗ (x) = sup {f (y) − β |y − x|}.
y∈E
By the previous lemma, f ∗ is a β-Lipschitz function and f ∗ = f on E. If E is

bounded, then there exists an α > 0 such that |f (x)| < α − β |x| for all x ∈ E
and we set
⎧ ∗
⎨ min(f (x), α − β |x|)
⎪ if |x| < α/β and f ∗ (x) ≥ 0;
f ∗∗ (x) = max(f ∗ (x), −α + β |x|) if |x| < α/β and f ∗ (x) < 0;
⎪
⎩
0 if |x| ≥ α/β.
30.6. Remark. Let f be a β-Lipschitz mapping from E ⊂ Rn into Rk . Then the coordinates
of f can be extended separately by the previous theorem and we get a kβ-Lipschitz extension
f ∗ : Rn → Rk of f . However, this extension fails to be β-Lipschitz. There exists a stronger
extension theorem for mappings due to Kirszbraun which guarantees the existence of a β-
Lipschitz extension. The proof of this assertion is more difficult.
30.7. Exercise. Suppose E ⊂ Rn and f : E → Rn is a β-Lipschitz mapping. Prove that
λ∗ f (E) ≤ β n λ∗ E.
Hint. We can assume that λ∗ E < +∞. Fix an ε > 0. By Lemma 26.26 find a sequence
S P
{U (xj , rj )} of open balls so that E ⊂ U (xj , rj ) and λU (xj , rj ) < λ∗ E + ε. Since
j
f (U (xj , rj )) ⊂ U (f (xj ), βrj ) for every j,
X X
λ∗ (f (E)) ≤ λU (f (xj ), βrj ) = β k λU (xj , rj ) ≤ β k (λ∗ E + ε).
j j
124 31. Approximation Theorems
30.8. Notes. Theorem 30.3 is due to H. Rademacher [1919]. An elementary proof of it

(different from ours) was given by A. Nekvinda and L. Zajı́ček [1984].
Extensions of Lipschitz functions in metric space were closely related to original proofs of
the Hahn-Banach theorem. Theorem 30.5 of extension of (nonlinear) Lipschitz functions was
proved independently by M.D. Kirszbraun [1934] and by E.J. McShane [1934]. Another material
can be found in G.J. Minty [1970].
31. Approximation Theorems
31.1. Space D(Ω). Let Ω ⊂ Rn be an open set. We denote by D(Ω) the linear
space of all infinitely differentiable functions on Ω with compact support in Ω.
Our aim is to prove that D(Ω) is often dense in other function spaces. The first
task is to construct a nontrivial infinitely smooth function on Rn with compact
support.
Set 1/(|x|2 −1)
αe if |x| < 1,
χ1 (x) =
0 if |x| ≥ 1 ,

where the constant α is chosen in such a way that Rn χ1 (x) dx = 1. Denote
χk (x) = k n χ1 (kx).

Then Rn χk (x ± y) dy = 1 for any k ∈ N and x ∈ Rn . A simple calculation
shows that the functions χk are infinitely differentiable (first reduce the proof to
a problem of differentiability of functions of one variable).
Now, if f is a locally integrable function, then Theorem 9.2 gives immediately
that χk ∗f are infinitely differentiable functions as well. Likewise, the convolutions
χk ∗ μ are infinitely differentiable if μ is a Radon measure.
31.2. Lemma. Suppose p ∈ [1, ∞), f ∈ L p (Rn ) and k ∈ N. Then χk ∗ f ∈
L p (Rn ) and χk ∗ f p ≤ f p .
Proof. According to generalized Young’s inequality 26.20,
χk ∗ f p ≤ f p · χk 1 ≤ f p
(notice that χk 1 = 1).

31.3. Theorem. Suppose p ∈ [1, ∞). For any f ∈ L p (Rn ), we have
f − χk ∗ f p → 0.
Proof. Denote Ej = {x ∈ Rn : |x| ≤ j and |f (x)| ≤ j} and Fj = f cEj . Since

(Rn \ Ej ) has measure zero, fj − f → 0 almost everywhere and by the Lebesgue
j
p
theorem (dominating function |f | ), f − fj p → 0. Fix an ε > 0 and find j ∈ N
with f − fj p < ε. Whence by Lemma 31.2
χk ∗ fj − χk ∗ f p = χk ∗ (fj − f )p < ε
for all k ∈ N. Now χk ∗ fj → fj (as k → ∞) almost everywhere. Indeed, it

is routine to verify that χk ∗ fj → fj at all points of approximate continuity of
fj and this set, according to Denjoy’s theorem 29.9, is of full measure. By the
Lebesgue convergence theorem with dominating function (2j)p cB(0,j+1) we have
lim fj − χk ∗ fj p = 0.
k→0
Hence,
f − χk ∗ f p ≤ f − fj p + fj − χk ∗ fj p + χk ∗ fj − χk ∗ f p < 3ε ,
provided k is large enough.
31.4. Theorem. Let Ω ⊂ Rn be an open set and p ∈ [1, ∞). Then D(Ω) is a
dense subset of L p (Ω).
Proof. Set Ωj = {x ∈ Ω : |x| < j and dist(x, ∂Ω) > 1j }. Take a function
f ∈ L p (Ω) and denote gj = f cΩj . As in the previous proof, it is clear that
f − gj p → 0. For all k > j we get χk ∗ gj ∈ D(Ω). Now the estimate
f − χk ∗ gj p ≤ f − gj p + gj − χk ∗ gj p
yields the assertion.
31.5. Theorem. Let f be a continuous function with compact support contained
in an open set Ω ⊂ Rn . Then χk ∗ f ⇒ f on Ω.
Proof. We can assume that Ω = Rn . Since f is uniformly continuous, given ε > 0
there is k0 ∈ N such that |f (x) − f (y)| < ε whenever x, y ∈ Rn , |x − y| < k10 . If
k > k0 and x ∈ Rn , then

|f (x) − χk ∗ f (x)| = (f (x) − f (y))χk (x − y) dy
R
n

= (f (x) − f (y))χk (x − y) dy
B(x,1/k)

≤ε χk (x − y) dy = ε .
Rn
31.6. Theorem. Let μ be a Radon measure on Ω. Then

(a) there exist sequences {μj } of measures and {fj } of functions from D(Rn )
w
such that fj = dμj / dλ and μj → μ;
(b) if, in addition, μ has compact support, then we can take measures μj with
densities χj ∗ μ.
Proof. Let us start with the case (b). Assume again that Ω = Rn and take
f ∈ Cc (Rn ). Then by the previous theorem f ∗ χj ⇒ f , and it readilly follows
that
f (x)(χj ∗ μ)(x) dx = f ∗ χj dμ → f dμ.
Rn Rn Rn
For the proof of (a), we first approximate μ by μK (see 2.4) where K is a suitable
compact set, and then use the convolution as above.
126 32. Distributions
31.7. Theorem. Let f be a β-Lipschitz function on Rn . Then χk ∗ f are

(infinitely differentiable) β-Lipschitz functions, |χk ∗ f (x) − f (x)| ≤ β/k for all
x ∈ Rn and (χk ∗ f ) → f almost everywhere in Rn .
Proof. Since
(χk ∗ f ) = χk ∗ f ,
χk ∗ f are β-Lipschitz functions. For any x ∈ Rn we have

|χk ∗ f (x) − f (x)| ≤ |f (x) − f (y)| χk (x − y) dy
1
B(x, k )

β β
≤ χk (x − y) dy = .
k 1
B(x, k ) k
We can see that (χk ∗ f )(x) → f (x) at all points x, where the derivative f (x)
exists and is approximately continuous. But the Rademacher theorem 29.9 and
Denjoy’s theorem 30.3 tell us that this happens almost everywhere.
31.8. Exercise. Give an alternative proof of the fact that (Ω) is dense in p (Ω) for
1 ≤ p < ∞.
Hint. If f ∈ c (Ω), then by Theorem 31.5 the functions χk ∗ f converge uniformly to f and
their supports are contained in a compact set (not depending on k). This yields that χk ∗ f
converge to f in the p -norm as well. Now it is sufficient to use the density of c (Ω) in p (Ω)
(Exercise 15.17).
31.9. Exercise. Use Theorem 31.4 to prove that the convolution f ∗ g is uniformly continuous
provided f ∈ p and g ∈ q , p1 + 1q = 1 and p, q > 1. (This will prove the proposition stated
in Remark 26.22.)
31.10. Riemann-Lebesgue Lemma. Suppose f ∈ 1 (Rn ). Then
Z
lim eix·t f (t) dt = 0.
|x|→∞ Rn
Hint. If g ∈ (Rn ), then integration by parts gives
Z Z Xn
∂2g
− |x|2 eix·t g(t) dt = eix·t (t) dt
Rn Rn j=1
∂t2j
whence ˛Z ˛
˛ ˛
˛ eix·t g(t) dt˛˛ ≤ |x|−2 g1 .
˛
Rn
Now we invoke the fact that (Rn ) is dense in 1 (Rn ): Given f ∈ 1 (Rn ) and ε > 0, by
Theorem 31.4 there is g ∈ (Rn ) (g depends on f and ε) such that f − g1 < ε. Then
˛Z ˛
˛ ˛
˛ eix·t (f (t) − g(t)) dt˛˛ ≤ ε ,
˛
Rn
whence ˛Z ˛
˛ ˛
˛ eix·t f (t) dt˛˛ ≤ |x|−2 g1 + f − g1 .
˛
Rn
32. Distributions
Since twenties, physicists started to work with “generalized functions”. These

“functions” were determined by their “average densities” in a neighbourhood
of each point. P.A.M. Dirac introduced the “δ-function” having the following
properties:

δ(x) = 0 for x = 0, δ(0) = ∞ and δ(x) = 1
U
for every open ball U containing the origin. Thus the “average density of the
δ-function” in the ball U (x, r) is
1
λU (x,r) if x ∈ U (0, r),
fr (x) :=
0 otherwise.
(Notice that x ∈ U (0, r) if and only if 0 ∈ U (x, r).) We can see that
δ(x) = lim fr (x).

r→0+
If now ϕ is a continuous function on Rn , denote

Zr (ϕ) = fr ϕ, δ(ϕ) = ϕ(0).
Rn
Using the definition of continuity, one can show that
lim Zr (ϕ) = δ(ϕ)

r→0+
for any continuous function ϕ on Rn . In other words, the δ-function is a “weak”

limit of the sequence {Zr } of functionals while “values” of the δ-function are
pointwise limits of the sequence {fr }. The great number of quotation-marks in
the last sentences indicates that we have to give exact definitions and thus also
interpretations of the notion used. Thus while the physicists succesfully used the
δ-function and many other “generalized functions”, the mathematical theory of
distributions (or, generalized functions) arose much later in the end of the 30’s
and it is connected with names of S.L. Sobolev, and mainly with L. Schwartz.
And one more note. We know that there exist continuous functions which
have a derivative at no point. One of the great advantages of the theory of
distributions is the fact that each of them does have a derivative. However, we
should be aware of the difference between classical derivatives and derivatives in
the sense of distributions. Among others, the derivative of a function in sense
of distributions is not necessarily a function (cf. Examples 32.5). In order that
the distributional derivative of a continuous function f to be again a function
(more exactly, a regular distribution, cf. 32.3.1), f has to be locally absolutely
continuous. In particular, f must have the classical derivative almost everywhere.
In what follows, we consider an open set Ω ⊂ Rn . If α = (α1 , α2 , . . . , αn ) is

a n-tuple of nonnegative integers (so-called multiindex), set |α| = α1 + · · · + αn
and denote the differential operator
∂ |α|
∂xα
1
1
. . . ∂xα
n
n
by the symbol Dα . Finally, recall that D(Ω) denotes the linear space of all
infinitely differentiable functions with compact support contained in Ω (see 31.1).
32.1. Notion of a Distribution. If {fk } is a sequence of functions from D(Ω),
D
we say that fk → 0 if there exists a compact set K ⊂ Ω such that supt fk ⊂ K
for every k and Dα fk ⇒ 0 on K for every multiindex α. Every linear functional
T on D(Ω) satisfying
D
T (fk ) → 0 whenever fk → 0
is called a distribution (on Ω). Thus every distribution is determined by its values
on D(Ω).
Let f be a locally integrable function on Ω (i.e. integrable on each compact
subset of Ω, in particular, f can be a continuous function). Set

Tf (ϕ) = f ϕ dλ
Ω
for ϕ ∈ D(Ω) (notice that the integral exists). It is not very difficult to prove that
D
Tf is a distribution: Suppose ϕk ∈ D(Ω) and ϕk → 0. If K is a compact set on
whose complement all ϕk ’s vanish, then

|Tf (ϕk )| ≤ max |ϕk (t)| |f | dλ
t∈K K
and ϕk ⇒ 0 on K. It follows that T (ϕk ) → 0.

Thus every locally integrable function defines a distribution. We say that T
is a regular distribution if there is a function f ∈ Lloc
1
(Ω) with T = Tf . In this
sense, we can identify regular distributions and locally integrable functions.
32.2. Remarks. 1. The space (Ω) is usually endowed with a locally convex topology in
D
such a way that fk → 0 in this topology if and only if fk → 0. So distributions are continuous
linear functionals on (Ω).
2. If a linear functional Z on (Ω) is a “pointwise” limit of a sequence of distributions {Zj } (i.e.
if Z(ϕ) = lim Zj (ϕ) for each ϕ ∈ (Ω)), then Z is also a distribution. This is a consequence of
the Banach-Steinhaus theorem of functional analysis, which is valid for (Ω).
3. If f, g are locally integrable functions and f = g almost everywhere, then apparently Tf = Tg .
The converse is also true: If the regular distributions Tf , Tg are equal, then f = g almost
everywhere.
For this purpose, it is sufficient to prove that f η = gη for every η ∈ (Ω). Suppose that
h = (f − g)η on Ω and h = 0 outside Ω. Then h ∈ L1 (Rn ) and Th = 0 in Rn . Thus χk ∗ h = 0
for every k and Theorem 31.3 (the notation is the same) yields h = 0 almost everywhere.
This assertion shows that we can really identify regular distributions and locally integrable
functions.
32.3. Examples. 1. In 32.1 we “embedded” locally integrable functions into the space
of distributions. Another class of objects contained in the space of distribution are Radon
measures. Let μ be a (positive) Radon measure on Ω. The functional Tμ defined as
Z
Tμ (ϕ) = μ(ϕ) (= ϕ dμ)
Ω
is again a distribution. In this sense, we can understand every Radon measure as a distribution.
For example, the Dirac measure (the “δ-function”) defined by Tδ (ϕ) = ϕ(0) is a distribution.
It is not regular. Indeed, if there were a locally integrable function f with Tf = Tδ , we could
easily deduce that f = 0 almost everywhere. But Tδ is not the zero distribution.
2. It is not true that every distribution is regular or is defined by some Radon measure. As
the example consider the functional ϕ → ϕ (0). However, there is a simple criterion which
enables to decide whether a distribution T is equal to Tμ . Namely, if μ is a Radon measure,
then the distribution Tμ is a positive functional. Conversely, if T is a distribution with T (ϕ) ≥ 0
whenever ϕ ∈ (Ω) is nonnegative, then there exists a Radon measure μ such that T = Tμ .
The last proposition is an easy consequence of the Riesz representation theorem 16.5.
3. For ε > 0, denote O(ε) = (−∞, −ε) ∪ (ε, +∞) and
Z
ϕ(x)
Rε (ϕ) = dx for ϕ ∈ (R)
x
O(ε)
(notice that the integral exists). It is not difficult to see that Rε is a distribution on R. Further,
for every ϕ ∈ (R) there is a finite limit lim Rε (ϕ). If we denote it by T 1 (ϕ), then T 1 is a
ε→0+ x x
distribution. The reader may find it instructive to prove this assertion directly, or using Remark
32.2.2. Notice also that the function x1 is not locally integrable on R. Thus, the distribution
T 1 which is often identified with the function x1 is not regular and also cannot be determined
x
by a Radon measure.
The set of all distributions is a linear space which is the topological dual to
D(Ω) (cf. Remark 32.2.1).
As mentioned above, every distribution is differentiable. To motivate the exact
definition, consider a function f with continuous derivative f on R (notice that
both f and f are locally integrable). In this case it is natural to require that Tf
should be the derivative of Tf . Since,

Tf (ϕ) = f ϕ = −Tf (ϕ )
R
for any ϕ ∈ D(R) we are led to the following general definition.

32.4. Differentiation of Distributions. Let T be a distribution on Ω. If α
is a multiindex, we define the αth (partial) derivative Dα of the distribution T as
Dα T (ϕ) = (−1)|α| T (Dα ϕ) , ϕ ∈ D(Ω).
In particular,
∂T ∂ϕ
(ϕ) = −T .
∂xj ∂xj
In the sense of identification of regular distributions and locally integrable func-
tions, we often speak about distributional derivatives of functions provided the
result of differentiation is a function.
If f is a C 1 - function, then its distributional derivative is the classical one:

(Tf ) = Tf . For the one-dimensional case see the next Remarks. The multi-
dimensional case involving partial derivatives is then an easy consequence of Fu-
bini’s theorem.
32.5. Remarks and Examples. 1. It remains to show that Dα T is again a distribution.
D D
But the linearity of Dα T is obvious, and supposing ϕk → 0, then also Dα ϕk → 0 and we get
Dα T (ϕk ) → 0 (we used the fact that the mapping ϕ → Dα ϕ is continuous on (Ω)).
2. Notice that any distribution has derivatives of all degrees. However, the derivative of a
“function” fails to be a “function”. As the next examples show, it can be, for example, a
“measure” . For instance, any continuous function which does not possess a derivative at any
point is infinitely differentiable — but only in the sense of distributions!
3. Assume now that u and f are locally integrable functions on an interval (a, b) and that the
distributional derivative of u is f . Then there exists a locally absolutely continuous function
v and a constant c such that u = v + c almost everywhere and v = f almost everywhere.
R
Indeed, consider a function η ∈ ((a, b)) with ab η = 1 and denote by v the indefinite Lebesgue
integral of f . We know that v is a locally absolutely continuous function. The integration
by parts 23.13 shows that the distributional derivative of w := u“− v is a zero function. ” If
R R R
c = ab wη, ϕ ∈ ((a, b)) is an arbitrary function and ψ(x) = ax ϕ(t) − η(t) ab ϕ(s) ds dt,
then ψ ∈ ((a, b)) and
Z b Z b Z b Z b Z b
(w − c)ϕ = wϕ − wη ϕ= wψ = 0.
a a a a a
Thus Tw−c = 0, whence w = c almost everywhere by 32.2.3.

4. Let f be a continuous function having the derivative f almost everywhere on an interval I.
Then f is a distributional derivative of f if and only if f is locally absolutely continuous on I.
5. Consider the function f (x) = max(0, x) and the so-called Heaviside function Y on R which is
defined as the indicator function of the interval (0, +∞). Then f is not “classically differentiable”
at zero. However, its distributional derivative is the Heaviside function Y . Integration by parts
shows easily that
Z
+∞
(Tf ) (ϕ) = (TY ) (ϕ) = − ϕ = ϕ(0) = Tδ (ϕ)

0
whenever ϕ ∈ (Ω). We see that the derivative of the distribution TY is the Dirac “δ-function”.
In a similar way we find that
(TY ) (ϕ) = −ϕ (0) for ϕ∈ (Ω).
Show that (TY ) is neither a “function” nor a “measure”.

6. Let f be a real function whose classical derivative is not locally Lebesgue integrable (cf.
Example 25.1). Then (Tf ) is not a regular distribution.
7. The distributional derivative of the Cantor function of Example 23.1 is not the zero function
but a measure “concentrated” on the Cantor set (see Exercise 24.8.a).
8. The function ln(|x|) is locally Lebesgue integrable. Its derivative in the sense of distributions
is the distribution T 1 (This gives one way how to prove that T 1 is a distribution.)
x x
9. For ϕ ∈ (R), set

∞
X
U (ϕ) = ϕ(k) (k) .
k=1
Show that U is a distribution on R which is neither regular nor determined by a Radon measure.
P
n ∂2
10. Let := 2
be the Laplace operator and u(x) := |x|2−n . Our aim is to find u in the
i=1 ∂xi
∂u xi
distributional sense. First, we have that = (2 − n) n is a locally integrable function. Now
∂xi |x|
∂2u
the distributional derivative is a complicated distribution containing an “integral average”
∂x2i
similar to the distribution of Example 32.3.3. But we have an easier task, namely to find out
P
n ∂2u
the sum 2
of these distributions. If κn denotes the volume of the unit ball in Rn (cf.
i=1 ∂xi
Exercise 26.17), then we prove that
Z Xn
xi ∂ϕ
n (x) = −nκn ϕ(0)
Rn i=1
|x| ∂xi
for every test function ϕ ∈ (Rn ). Hence
u = n(2 − n)κn δ0 .
To this end let γ be a nonincreasing infinitely smooth function on (0, ∞), γ(t) = 1 for t < 1,
γ(t) = 0 for t > 2, |γ (t)| ≤ 2 and ηr (x) = γ(x/r). We use the decomposition
ϕ = ϕ(0)ηr + (ϕ − ϕ(0))ηr + ϕ(1 − ηr ).
Then Z Xn
xi ∂ϕ
n (x) = Ar + Br + Cr + Dr ,
Rn i=1
|x| ∂xi
where
Z Xn
xi ∂ηr
Ar = ϕ(0) n (x) dx
Rn i=1
|x| ∂xi
Z Z 2r
= ϕ(0) |x|1−n r−1 γ (|x| /r) dx = ϕ(0)nκn r −1 γ (t/r) dt = −nκn ϕ(0)
Rn 0
(we used Exercise 26.17),

Z Xn
xi ∂ηr
Br = (ϕ(x) − ϕ(0)) (x) dx,
Rn i=1
|x|n ∂xi
Z Xn
xi ∂ϕ
Cr = ηr (x) (x) dx
Rn i=1
|x|n ∂xi
and
Z Xn
xi ∂ (ϕ(1 − ηr ))
Dr = n (x) dx.
Rn i=1
|x| ∂xi
Set ( xi
|x|n
ϕ(x)(1 − ηr (x)) if x
= 0;
ψi (x) :=
0 if x = 0.
An easy calculation shows that ψi ∈ (Rn ) and
Xn
∂ψi Xn
xi ∂ (ϕ(1 − ηr ))
(x) = n (x).
i=1
∂xi i=1
|x| ∂xi
Thus Z Xn
∂ψi
Dr = dx = 0.
Rn i=1
∂xi
The proof will be finished by estimating the integrals Br and Cr and taking the limit as r → 0.
Since ϕ ∈ (Rn ), there exists a constant K such that |∇ϕ(x)| ≤ K and |ϕ(x) − ϕ(0)| ≤ K |x|
for all x ∈ U (0, 1). Obviously, |∇ηr | ≤ r2 . We obtain
Z n ˛
X
˛ ˛ ˛ Z
˛ xi ˛ ˛ ∂ηr ˛
|Br | ≤ ˛ ˛ ˛ ˛ dx ≤ 4nKr |x|1−n dx → 0
˛ |x|n ˛ |(ϕ(x) − ϕ(0)| ˛ ∂x (x) ˛ r
B(0,2r) i=1 i B(0,2r)
and Z ˛ ˛ ˛ ˛ Z
X
n
˛ xi ˛ ˛ ∂ϕ ˛
|Cr | ≤ ηr (x) ˛˛ n ˛˛ ˛ ˛
˛ ∂x (x)˛ dx ≤ nK |x|1−n dx → 0.
B(0,2r) i=1 |x| i B(0,2r)
Another operation with distributions is the mutiplication. Since we prove the

Schwartz impossibility theorem which says that on the space of all distibutions
the multiplication cannot be defined in a reasonable way, we restrict to the mul-
tiplication by smooth functions. However, if h is a smooth function on R and f
is continuous, then
Thf (ϕ) = Tf (hϕ) for ϕ ∈ D(Ω)
and this equality leads to the following definition.

32.6. Multiplication of Distributions by Smooth Functions. Let T be a
distribution on Ω and h ∈ C ∞ (Ω). We define the distribution hT = T h by
hT (ϕ) = T (hϕ) , ϕ ∈ D(Ω).

32.7. Remarks and Examples. 1. It is easy to show that hT and T h are distributions.
2. We have
xTδ (ϕ) = Tδ (xϕ(x)) = 0,

Z
xT 1 (ϕ) = T 1 (xϕ(x)) = ϕ = T1 (ϕ)
x x R
and we see that the multiplication in the space of distributions is almost the same as the
“pointwise” multiplication.
32.8. Schwartz Impossibility Theorem. On the space of distributions on

R, a multiplication cannot be defined in such a way that it would be commutative
and associative and that xTδ = 0 and xT x1 = T1 .
Proof. The following shows that it is impossible to define a multiplication:
0 = 0 · T x1 = (xTδ )T x1 = Tδ (xT x1 ) = Tδ · 1 = Tδ .
32.9. Convergence of Sequences of Distributions. Let {Tk } be a sequence

of distributions on Ω. We say that Tk converge to a distribution T on Ω, denoted
by Tk → T , if Tk (ϕ) → T (ϕ) for every test function ϕ ∈ D(Ω). Thus the conver-
gence in the sense of distributions is nothing else than the pointwise convergence,
or the weak*-convergence on D (Ω).
32.10. Examples. (a) If Tk := Tsin kx on R, then {Tk } converges to zero (i.e. to the
distribution T0 ).
kx
(b) Let Tk be the regular distribution on R determined by the function sin πx
. Show that
Tk → Tδ (when keeping the convention of 32.1, the sequence of functions { sin
πx
kx
} converges to
the Dirac measure δ in the sense of distributions).
Hint. Choose ϕ ∈ (R). If ϕ = 0 outside the interval [−A, A], then
Z A Z A
(ϕ(x) − ϕ(0)) sin kx
πTk (ϕ) = sin kx dx + ϕ(0) dx.
−A x −A x
By the Riemann-Lebesgue Lemma 31.10, the limit of the first term is zero. Further it is known
that
ZkA
sin t
lim dt = π ,
k→∞ t
−kA
whence the assertion follows.

(c) Show that the sequence of functions {χk } (cf. 31.1) converges on Rn to the Dirac measure
δ in the sense of distributions.
Hint. The proposition is an easy consequence of Theorem 31.5.
R
(d) Suppose fk , f ∈ 1 (Ω). If lim
K |fk − f | → 0 for every compact set K ⊂ Ω, then
Tfk → Tf .
32.11. Theorem. Let α be a multiindex, T, Tk distributions. If Tk → T on Ω,

then Dα Tk → Dα T .
Proof. The proposition is an immediate consequence of definitions.
P P
32.12. Example. We define that Tn = T provided Tn (ϕ) = T (ϕ) for any test function
ϕ ∈ (Ω). The previous theorem tell us that every convergent series of distributions can be
differentiated term by term. Here there is an illustrating example:
Let f be a 2π-periodic function on R,
8 1
< 2 (π − x)
> for x ∈ (0, π],
f (x) =
> − 1 (π + x) for x ∈ [−π, 0),
: 2
0 for x = 0.
It follows from the theory of Fourier series that
X∞
sin kx
f (x) =
k=1
k
for every x ∈ R. Show that:

(a) This equality holds in the sense of distributions as well.
(b) The derivative of the distribution Tf is a 2π-periodic measure μ whose restriction to
the interval [−π, π] equals − 12 + πδ.
P
∞
On the other hand, Theorem 32.11 implies that (Tf ) = Tcos kx .
k=1
Therefore, in the sense of distributions we have the equality
∞
X
cos kx = μ.
k=1
P
∞
We see that we can assign to the divergent series cos kx a “sum” but this one is a distribution
k=1
and not a function.
134 33. Fourier Transform
32.13. Notes. The theory of distributions in the framework of duality between topological
vector spaces was created by L. Schwartz in the late 1930’s and was fully published in [1947]
(because of the scientific isolation during the second world war he published his theory after-
wards). He also introduced the fundamental space (Rn ) suitable for the Fourier transform of
distributions.
Some related considerations were implicitly contained in earlier works of other authors. In
fact, L. S. Sobolev [1936] also introduced basic notions of the theory of distributions. Since he
was led by a special question concerning the solution of the Cauchy problem he did not realize
the power of his own ideas.
Concerning the history of the theory of distributions, we suggest also Lützen’s monograph
[*1982]. Other material can be found in J. Horváth [1970] or J. Horváth [*1966].
33. Fourier Transform
Throughout this chapter, u · v denotes the inner product of n-dimensional

vectors u and v. By a function we understand a complex-valued function and in
this way we modify the definitions of function spaces like Lp .
The Fourier transform is an operator whose calculus provides a number of
formulae. Roughly speaking, the Fourier transform transforms differentiation with
respect to j th variable into multiplication by xj and convolution into product. Due
to these properties it has wide applications in function spaces theory, theory of
partial differential equations, operator calculus and other fields.
When reading this chapter, we suggest to notice the parallel between the theory
of Fourier series and the Fourier transform (cf. Remark 33.14).
33.1. Fourier Transform of L1 -Functions. Formally, we can write the
Fourier transform in the form

#
f (x) := (2π)−n/2
e−ix·y f (y) dy.
Rn
This formula can be understood as a pointwise definition provided f ∈ L 1 (Rn ).

Notice that the Lebesgue integral always exists.
We start with simple formulae.
33.2. Theorem. If f, g ∈ L 1 (Rn ), then
(a) f∗ g = (2π)n/2 f# #
g;

#
(b) Rn f g = Rn f g# (the so-called multiplication formula).
Proof. The proof will follow using Fubini’s theorem easily. It only needs to be
observed that f#g ∈ L 1 whenever f, g ∈ L 1 .
The following theorem plays a key role in applications of the Fourier transform
to the theory of partial differential equations.
33.3. Theorem. Suppose f ∈ L 1 (Rn ) and 1 ≤ j ≤ n.
(a) If xj f ∈ L 1 (Rn ), then
∂ f#
= (−ixj f ).
∂xj
(b) Suppose that f , ∂f

∂xj are continuous on Rn , ∂f
∂xj ∈ L 1 (Rn ) and
lim f (x) = 0.
|x|→∞
Then

∂f /∂xj (x) = ixj f#(x).
Proof. (a) The differentiation under the integral sign of

f#(x) = (2π)−n/2 e−ix·t f (t) dt
Rn
leads to
∂ f#
(x) = −(2π)−n/2 e−ix·t itj f (t) dt.
∂xj Rn
(b) Using Fubini’s theorem and integration by parts we obtain

−n/2 −ix·t ∂f −n/2 ∂ −ix·t
(2π) e (t) dt = −(2π) f (t) e dt
R n ∂t j R n ∂t j

= ixj (2π)−n/2 f (t)e−ix·t dt .
Rn
, then #
2
33.4. Lemma. If h(x) = e−|x| /2
h = h.
Proof. Assume first that n = 1. By the previous lemma both h and # h are solutions
to the differential equation
u + xu = 0 .
Since h(0) = #
h(0) (see Exercise 26.17.b), we get # h = h. Assuming we had already
proved the theorem for n = 1, we get

2 2 2
e−ix·t e−|t| /2 dt = e−ix1 t1 −t1 /2 . . . e−ixn tn −tn /2 dt1 . . . dtn
Rn Rn

2 2 2
−ix1 t1 −t1 /2
e dt1 · · · e−ixn tn −tn /2 (2π)n/2 e−|x| /2 .
R R
33.5. Fourier Transform of Distributions. We start with a motivation.

It would be quite natural to define the Fourier transform of a distribution in
such way that the Fourier transform of the regular distribution Tf would be the
distribution Tfˆ. But if f ∈ L 1 (Rn ) and ϕ ∈ D(Rn ), the multiplication formula
33.2.b gives
f#ϕ = #
f ϕ.
Rn Rn
and since the Fourier transform of a function from D(Rn ) fails to be in D(Rn )
we cannot proceed in this way. (Notice that the zero function ϕ = 0 is the only
function of D(Rn ) with ϕ# ∈ D(Rn ).)
Observe also that the Fourier transform of a function from L 1 needs not to be
an element of L 1 . If f is the indicator function of the interval [−1, 1], then
$
2 sin x
f#(x) = .
π x
So we search for a subset of L 1 (Rn ) which would be closed with respect

to the Fourier transform and to differentiation (and thus also with respect to
multiplication by polynomials, cf. Theorem 33.3), and in this way we are naturally
led to the notion of the Schwartz space S (Rn ).
We say that a C ∞ –function f on Rn belongs to the Schwartz space S (Rn ) if
p Dα f is a bounded function for any multiindex α and for any polynomial p on
Rn . Equivalently, it is easy to see that f ∈ S (Rn ) if and only if p Dα f ∈ L 1 (Rn )
whenever p is a polynomial and α a multiindex.
2
The space of test functions D(Rn ) is a subset of S (Rn ). The function e−|x|
provides an example of a function from S (Rn ) which does not have compact
support.
With the help of Theorem 33.3 and Lemma 33.6 it can be easily shown that
S (Rn ) is closed with respect to the Fourier transform, and that the mapping
ϕ → ϕ# is a linear isomorphism of S (Rn ) onto itself.
A linear functional S on S (Rn ) is called a tempered distribution if it is con-
tinuous in the following sense: If {ϕj } is a sequence of functions from S (Rn )
and pDα ϕj → 0 for any multiindex α and any polynomial p, then T (ϕj ) → 0.
We can easily verify that every tempered distribution (restricted to D(Rn )) is a
distribution.
The Fourier transform T# of a tempered distribution T is defined as
T#(ϕ) = T (ϕ̂) , ϕ ∈ S (Rn ).
Apparently T# is again a tempered distribution.

So far, the Fourier transform is defined for functions from L 1 (Rn ) and for
tempered distributions. Any locally integrable function f determines a (regular)
distribution, and many of them even tempered distributions. If f ∈ L p (Rn )
(1 ≤ p ≤ ∞), then
Sf : ϕ → fϕ , ϕ ∈ S (Rn )
Rn
is a tempered distribution. In the same way as in the previous chapter, we identify

the function f ∈ L p (Rn ) and the tempered distribution Sf . The multiplication
formula yields that S#f = Sfˆ and we see that the new definition of the Fourier
transform of tempered distributions is an extension of the original one for func-
tions from L 1 (Rn ).
Since the Fourier transform is now defined also for functions outside of L 1 (Rn ),
the question arises whether, given function f (say f ∈ L p ), a function g can be
found so that S#f = Sg (then we could say that the Fourier transform of f is g).
We know that in the case of L 1 the answer is positive as we can take g = f#.
On the other hand, Exercise 33.9 shows that the Fourier transform of a constant
function (which is a function of L ∞ (Rn )) is not a function.
Let us note that even in the case when the Fourier transform of a function f
is a function g (in the sense mentioned above), the function g is not determined
pointwise (as in the original definition of the Fourier transform for f ∈ L 1 (Rn ))
but only except a null set.
In what follows, we restrict our attention to the case p = 2 and we state the
main theorem which guarantees a good theory for the Hilbert space L2 (Rn ). First
we define the inverse Fourier transform by the formula

f˜(y) = (2π)−n/2 eix·y f (x) dx , f ∈ L 1 (Rn );
Rn
T̃ (ϕ) = T (ϕ̃) , ϕ ∈ S (Rn ), T is a tempered distribution
If f ∈ L2 (Rn ), then we define the Fourier transform f# as such g ∈ L2 (Rn )

for which S#f = Sg . The inverse Fourier transform f˜ of a function f ∈ L2 (Rn ) is
defined in a similar way. Before we verify that these definitions make sense, let
state a useful lemma.
33.6. Lemma. The Schwartz space S (Rn ) is a dense subset of L2 (Rn ) and
˜ ˆ % %
fˆ = f , f˜ = f , %f#% = f
2 2
for f ∈ S (Rn ).
Proof. In Theorem 31.4 we proved even that D(Rn ) is a dense subset of L2 (Rn ).
˜ 2
Now we are going to prove the formula fˆ = f . Denote h(x) = e−|x| /2 . Re-
member, according to Lemma 33.4, that #h = h. Using the multiplication formula
33.2.b we obtain
# t t
f (t)h( ) dt = f ( )#
h(t) dt.
Rn k Rn k
When taking the limit as k → ∞ it follows that

f#(t) dt = f (0)#
h(t) dt = (2π)n/2 f (0).
Rn Rn
We make the substitution z = x + y. The integrals then become

−n −ix·t
f (y) = (2π) e f (t + y) dt dx
Rn
R
n

= (2π)−n eiy·t e−iz·t f (z) dz dt
Rn Rn

˜
= (2π)−n/2 eiy·t f#(t) dt = fˆ(y).
Rn
ˆ
The formula f = f˜ can be proved in a similar way. Notice that for the Fourier
¯
transform of the complex conjugated function we have fˆ¯ = fˆ and thus the mul-
tiplication formula yields
2
2 ¯
˜ ˆ¯ ¯ ˆ
|f | dx = f f¯ dx = f fˆ dx = f fˆ dx = fˆfˆ dx = f dx.
Rn Rn Rn Rn Rn Rn
33.7. Plancherel Theorem. For any f ∈ L2 (Rn ) there exists g ∈ L2 (Rn )

such that S#f = Sg , S̃g = Sf and f 2 = g2 .
Proof. By virtue of Lemma 33.6 there are fj ∈ S (Rn ) such that fj → f in
L2 (Rn ). By the same lemma, {f#j } is a Cauchy sequence in L2 (Rn ) and thus
there exists g ∈ L
2 (Rn ) with
% %
% %
%g − f#j % → 0.
2
By passing to the limit we get
S#f = Sg , S̃g = Sf and f 2 = g2 .
33.8. Remark. The Plancherel theorem says that the (original) Fourier transform can be
uniquely extended from (Rn ) to L2 (Rn ). The result is a continuous linear isometry of the
space L2 (Rn ) onto itself. Thus is a unitary mapping and one can easily find out that also
the inner product in L2 (Rn ) is invariant with respect to it.
33.9. Exercise. Let δ be the Dirac δ-function. Prove that δb is the constant (2π)−n/2 and
that the Fourier transform of this constant is δ.
b is a solution to the equation
33.10. Exercise. Find a function u whose Fourier transform u
− b b=δ
u+u
in the sense of distribution (here is the Laplace operator, cf. 32.5.10).
33.11. Fourier Transform of a Function in L1 (Rn ). (a) Suppose f ∈ 1 (Rn ). Then fb
is a uniformly continuous function and lim fb(x) = 0 (in other words, fb ∈ 0 (R
n )).
|x|→∞
Hint. The assertion concerning the limit at infinity is a consequence of the Riemann-Lebesgue
lemma 31.10. As for the proof of uniform continuity it is no restriction to assume that f is from
the space (Rn ) since we know that (Rn ) is dense in L1 (Rn ) (Theorem 31.4). Suppose that
the support of f is contained in U (0, R). We use the inequality
˛ iy·t ˛
˛e − eix·t ˛ ≤ |t| |y − x|
to get
˛ ˛ ˛˛Z ˛
˛
˛b ˛
˛f (y) − fb(x)˛ = ˛˛ (eiy·t − eix·t )f (t) dt˛˛ ≤ R |y − x| f 1
Rn
from which the uniform continuity of fb follows.

(b) Remark that any function of n)
0 (R
does not need to be the Fourier transform of a func-
x
tion from L1 (Rn ). 0 (R)–function g : x → √
Consider the example of the . If
1 + x2 log(2
Ra+ x2 )
1 −iz
f ∈ 1 (R), then Fubini’s theorem together with boundedness of the function a → 1 z e dz
R x fb(t)
on (1, +∞) imply that 1 t dt is a bounded function on (1, +∞). Hence apparently f
= g. b
Note that, in contrast to the equality {fb: f ∈ L2 (Rn )} = L2 (Rn ), a satisfactory character-
ization of the range {fb: f ∈ 1 (R)} of the Fourier transform on L1 (R) is not known.
33.12. Exercise. For f ∈ L2 (Rn ) define

Z
e−itx − 1
f (x) = f (t) dt.
R −it
Show that f is an indefinite Lebesgue integral of the (locally integrable) function fb.
33.13. Remark. Define the “Fourier transform” F on 1 (Rn ) by the formula
Z
1
Ff = e−ibx·t f (t) dt
a Rn
and denote c := (2π)n |b|−n a−2 . Then (under certain assumptions)
F (f ∗ g) = a F f F g;
F (F f (x)) = c f (−x);
∂(F f )
(x) = −ibxj F f (x);
∂xj
F f 2 = c f 2 .
Browsing through different manuscripts, various choices of a and b appear (e.g. a = 1, a =

(2π)n/2 , b = ±1, b = ±2π).
33.14. Fourier Transform and Fourier Series. Let T be the interval [−π, π]. We assign
to any function u ∈ L1 (T ) the sequence {b
u(k)}k∈Z of its Fourier coefficients
Z
1
b(k) :=
u e−ikt u(t) dt.
2π T
Conversely, to a sequence {ck } ∈ l1 (Z) we assign the function

X
c̃(x) := ck eikx
k∈Z
(the sum of the convergent trigonometrical series with coefficients ck ). The (formal) series
X
b(k)eikx
u
k∈Z
is called the Fourier series of a function u.

We leave the theory of Fourier series out of this manuscript (the books by V. Jarnı́k [JII] or
W. Rudin [*1974] are suggested). However, we state here without proofs some results indicating
an analogy between the theories of Fourier series and Fourier transform.
The sequence of Fourier coefficients {b u(k)} is an analogy of the Fourier transform fb(x) and
the sum c̃(x) is an analogy of the inverse Fourier transform f˜(x). In some sense, the theory of
Fourier series is less complicated because of L2 (T ) ⊂ L1 (T ) and l1 (Z) ⊂ l2 (Z). On the other
hand, we lose the symmetry between the Fourier transform and the inverse Fourier transform.
(a) If u ∈ L
1 (T ), then b(k) = 0. The proposition is a consequence of the Riemann-
lim u
|k|→∞
Lebesgue lemma 31.10; the proof is similar to that for the case of Fourier transform (cf. 33.11.a).
Also in this case, not every sequence in c0 (Z) is a sequence of Fourier coefficients of a function
in L1 (T ).
(b) If {ck } ∈ l1 (Z), then c̃ is a continuous function on T with c̃(−π) = c̃(π) (compare with
33.11.a). Let us note again that not every continuous function can be expressed in this way.
(c) If u ∈ L2 (T ), then {b
u(k)} ∈ l2 (Z) (this follows from the Bessel inequality). Conversely, for
any sequence {ck } ∈ l2 (Z) there exists a unique function u ∈ L2 (T ) with ck = u b(k) (the Riesz-
Fischer theorem). This function can be obtained as the limit of the sequence {sk } ⊂ L2 (T ),
where
Xk
sk (x) = ck eikx , x ∈ T.
j=−k
√
Moreover, u2 = 2π {b
u(k)}2 (the Parseval identity). Compare with the Plancherel theorem
33.7.
(d) The mapping u → {b u(k)} and {ck } → c̃ can be also generalized to wider classes of
functions or sequences. We are not going to mention the details. Just notice that the mapping
{ck } → u, {ck } ∈ l2 (Z) given by the Riesz-Fischer theorem is in fact an extension of the mapping
{ck } → c̃, {ck } ∈ l1 (Z). However, in the case of l2 , the function u is determined only as an
element of L2 (T ), i.e. except of a null set.
(e) If u ∈ 1 (R) is a 2π-periodic function and v = u , then
b(k) = ikb
v u(k) , k ∈ Z.
Compare again with Theorem 33.3.

33.15. Notes. The theory of Fourier series and the Fourier transform has a very long
history. They were D. Bernoulli and L. Euler who first used what is now called Fourier series.
J. B. J. Fourier used trigonometric series in connection with problems on heat and wrote his
famous book [*1822]. He also studied the Fourier transform in the form
Z ∞
F (x) = f (x) cos tx dx
0
and formally derived the inversion formula.

Common features of the Fourier transform and the Fourier series (see the list of parallels in
33.14) indicated that there should exist a unifying theory covering both. This was established
by A. Weil [1940] who presented Fourier transform on locally compact Abelian groups. This
approach led further to harmonic analysis on groups, see E. Hewitt and K.A. Ross [*1970] or
G.B. Folland [*1975].
Note that M. Plancherel extended the definition of the Fourier transform for functions in
L2 (R) so that he obtained the unitary operator on this Hilbert space (cf. Remark 33.8).
Besides the Fourier transform, various integral transforms like the Laplace or Mellin trans-
form are also examined.
Roughly speaking, an integral transform is a one-to-one mapping between two function spaces
defined by a formula of the form
Z
b(x) =
u K(x, y) u(y) dν(y).
Y
Often an inverse formula can be found as

Z
u(y) = K (x, y) u
b(x) dμ(x).
X
The theory of Fourier series can be including as a particular case since series could be
considered as integrals with respect to the counting measure.
F. Change of Variable and k-dimensional Measures 141
F. Change of Variable and k-dimensional Measures

34. Change of Variable Theorem
Throughout this chapter Rn will be the Euclidean space and k ≤ n a non-

negative integer. Our aim is to prove the general change of variable theorem for
k-dimensional measures.
We want to “measure” various subsets of Rn whose dimension is k ≤ n. For
instance, we are interested in the problem how to calculate the length of curves
or the area of surfaces in R3 . In miscellaneous fields of analysis we encounter
various methods leading to this goal in different degrees of generality. If a set
A ⊂ Rn is an isometric copy of a set E ⊂ Rk , it is reasonable to introduce its
k-dimensional measure as the Lebesgue measure of its preimage E. Consequently,
it is not difficult to introduce the k-dimensional measure of “flat” sets. Now, if we
consider a set A on a curved surface, we divide it into “almost flat” parts Aj and
sum evaluating measures of flat sets Ej which are “close to Aj ”. In the limiting
case, we obtain the intuitive meaning of measure of A. Our task is therefore to
introduce a concept of k-dimensional measures which would be general enough,
easily describing and transparent. We will propose two independent solutions of
this problem in the following chapters.
The integral with respect to the k-dimensional measure is the so-called curve
(k = 1), or surface (k > 1) integral. We prove a change of variable formula for
this integral which will include Theorem 26.13 as a special case.
We start by recalling useful notions from linear algebra.
The inner product of vectors x and y will be denoted again by x · y and the
norm of an element x of Rn simply by |x|. By ej we denote the basic vector
[0, . . . , 1, . . . , 0] of Rn having jth term 1 and all other terms zero.
Further, Mn,k will be the set of all matrices with n rows and k columns. Each
such a matrix represents a linear mapping of Rk into Rn .
If L : Rk → Rn is a linear mapping, then there is a unique mapping L∗ : Rn →
R such that Lx · y = x · L∗ y for all x ∈ Rk and y ∈ Rn . The mapping L∗ is
k
called the adjoint to L. If L is represented by a matrix A, then L∗ is represented

by the transposed matrix to A and this one will be denoted by AT . The norm of
the mapping L is defined by the formula
L = sup{|Lx| : x ∈ Rk , |x| ≤ 1}.

% %
Further we denote %L−1 % = sup{|x| : x ∈ Rk , |Lx| ≤ 1}.
We suppose that the reader is familiar with the notion of a determinant. We
will relate it not only to square matrices but also to objects which are representable
by means of square matrices: to k-tuples of vectors from Rk , k-tuples of linear
forms over Rk , or to linear mappings of Rk into itself.
34.1. Properties of Determinants. Let A, B ∈ Mk,k . Then
T
(a) det A = det A ,
(b) det AB = det A det B.
142 34. Change of Variable Theorem
34.2. Isometric Mappings. Let V, W be k-dimensional linear spaces with

inner products. Remember that a linear mapping L : W → V is said to be an
isometry provided it preserves distances of points. Then, of course, it preserves
also the norm and the inner product (indeed, the inner product can be expressed
2 2
in terms of the norm according to the formula 4x · y = x + y − x − y ). The
k k
matrices of isometric mappings of R into R are termed unitary or orthogonal .
A matrix A is orthogonal if and only if its rows (columns) form an orthonormal
basis of Rk . The product of unitary matrices is a unitary matrix.
34.3. Lemma. Let W be a k-dimensional linear space with an inner product.
Then there exists a linear isometry Q : Rk → W.
Proof. Let u1 , . . . , uk be an orthonormal basis of W. The linear transformation
which sends ei → ui , i = 1, . . . , k, is an isometry.
34.4. Example.
Let W ⊂ R3 be a vector space generated by the vectors [1, 1, 1] and [0, 1, 2]. We are looking
for a linear isometry of R2 into W. We arrive ` at this finding´ an orthonormal basis u1 , u2 of
the space W. Solving the equation [1, 1, 1] · [0, 1, 2] + α[1, 1, 1] = 0 we obtain α = −1, so that
the vector [−1, 0, 1] = [0, 1, 2] − [1, 1, 1] is orthogonal to [1, 1, 1]. If we set u1 = 3−1/2 [1, 1, 1],
u2 = 2−1/2 [−1, 0, 1], then the vectors u1 , u2 are unit vectors. The matrix of the required linear
mapping has as columns u1 , u2 .
34.5. Lemma. Let Q ∈ Mk,k be a unitary matrix. Then |det Q| = 1.

34.6. Lemma. Let A ∈ Mk,k . Then there exist unitary matrices Q1 , Q2 ∈
Mk,k and a diagonal matrix D ∈ Mk,k such that A = Q1 DQ2 .
Proof. An elementary theorem of linear algebra tell us that there is a unitary
matrix U and a symmetric matrix R such that A = U R. Further, there exist a
unitary matrix Q and a diagonal matrix D such that R = QT DQ. Now it remains
to put Q1 = U QT , Q2 = Q.
Now, go back to measure theory. We denote, as usual, by λ the Lebesgue measure.
34.7. Lemma. Let L : Rk → Rk be a linear mapping. Then, for any measur-
able set E ⊂ Rk , the set L(E) is measurable and λ(L(E)) = |det L| λE.
Proof. The measurability of L(E) is obvious. By Lemma 34.6 there are isometric
linear mappings Q1 , Q2 and a diagonal mapping D = (di,j )ki,j=1 of the space Rk
into itself such that L = Q1 DQ2 . Let K = [a1 , b1 ] × · · · × [ak , bk ]. Then (to
simplify the notation we will assume that di,i ≥ 0, but the same result is obtained
also in the remaining cases) D(K) = [d1,1 a1 , d1,1 b1 ] × · · · × [dk,k ak , dk,k bk ], so
that
λ(D(K)) = |d1,1 (b1 − a1 ) . . . dk,k (bk − ak )| = |det D| λK.
Since isometric linear mappings are measure preserving and their composition
with another linear mapping M do not change the absolute value of the determi-
nant of M , it follows that for any measurable set E ⊂ Rn we have
λ(L(E)) = λ(DQ2 (E)) = |det D| λ(Q2 (E)) = |det D| λE = |det L| λE.

34.8. k-dimensional Measures. Let σ be a measure on a σ-algebra Σ

containing B(Rn ). Denote
σ ∗ E = {inf σS : S ⊃ E, S ∈ Σ}.
We say that σ is a k-dimensional measure on Rn if:

(a) σ ∗ I(E) = λ∗ E whenever I : Rk → Rn is an isometric mapping and
E ⊂ Rk ;
(b) σ ∗ ϕ(E) ≤ β k σ ∗ E for each β-Lipschitz mapping ϕ : E → Rn and E ⊂ Rn .
A k-dimensional measure is not uniquely determined by (a) and (b) (cf. Remark
34.31). Nevertheless, on reasonable sets all k-dimensional measures coincide. No-
tice also that for any k-dimensional measure σ, the property (a) implies that
σ([0, 1]k × {0}n−k ) = 1.
By Exercise 30.7, the Lebesgue measure satisfies the property (b) (and thus
also (a)) in the case k = n. Using (a), we see immediately that the Lebesgue
measure is the unique n-dimensional measure on Rn .
34.9. Existence of k-dimensional Measures. There exists a k-dimensional
measure on Rn .
Proof. For E ⊂ Rn , set
μ∗ E = inf{ λk Gj } ,
j
where the infimum is taken over all sequences of open

sets {Gj }, Gj ⊂ Rk , which
admit 1-Lipschitz functions ϕj on Gj such that E ⊂ ϕj (Gj ). If such a sequence
j
{Gj } does not exist, we set μ∗ E = ∞.
Obviously μ∗ is an outer measure on Rn .
Let Σ be the σ-algebra of all μ∗ -measurable sets (see 4.4). Similarly as in
Theorem 1.19 we prove that every Borel set belongs to Σ: It is sufficient to
observe that for any open halfspace H and any continuous mapping ϕ of an open
set G ⊂ Rk into Rn , the set G is a disjoint union of G ∩ ϕ−1 (H) (and this is
clearly open) and G \ ϕ−1 (H) (this is a union of a countable family of closed sets;
therefore a measurable set which can be approximated by an open superset with
arbitrarily close measure). Obviously
μ∗ E = inf{μF : F ∈ Σ, F ⊃ E}

as the sets of the form ϕj (Gj ) are measurable (Gj is a union of a sequence of
j
compact sets and each continuous image of a compact set is compact). According
to Exercise 30.7 we obtain that μ is a k-dimensional measure.
34.10. Volume. Let W be a k-dimensional linear space with an inner product
and L : Rk → W a linear mapping. Let Q : R → W be an isometry and
k
a := det(Q−1 L). Then λ(Q−1 L(E)) = a λE for any Borel set E ⊂ Rk . The
constant a is independent of the choice of the isometric mapping Q and has a
multiplier |det L| of Lemma 34.7 (the case n = k).

similar meaning to that of the
The expression det(Q−1 L) may be simplified: Using 34.1 we have
! "k
(det(Q−1 L))2 = det((Q−1 L)T Q−1 T )) = det Q−1 Lei · Q−1 Lej i,j=1
! "k
= det Lei · Lej i,j=1 .
The expression &

det(ui · uj )ki,j=1
is said to be the volume of a k-tuple of vectors (u1 , . . . , uk ) and will be denoted by

vol(u1 , . . . , uk ). It has a geometric interpretation as the “k-dimensional measure”
of the set
{c1 u1 + · · · + ck uk : (c1 , . . . , ck ) ∈ [0, 1]k }.
We associate the volume also to the linear mapping L by
vol L := vol(Le1 , . . . , Lek ).

k
Let us notice that the estimate vol L ≤ L is valid. We may summarize the
consequences for the k-dimensional measure in the following theorem.
34.11. Theorem. Let W be a k-dimensional linear subspace of Rn and σ be a
k-dimensional measure on Rn . Let L : Rk → W be a linear mapping. Then
σ(L(E)) = vol L λE
for any measurable set E ⊂ Rk .

34.12. Theorem. Let W be a k-dimensional linear space and L : Rk → W,
M : Rk → Rk be linear mappings. Then vol(LM ) = |det M | vol L.
Proof. Let Q : Rk → W be a linear isometry. By 34.1
(vol LM )2 = det((Q−1 LM )T Q−1 LM ) = det M T det((Q−1 L)T Q−1 L) det M

= (det M )2 (vol L)2 .
34.13. Cauchy-Binet formula. Now we derive a formula which makes the

calculation of the volume of a mapping L : Rk → Rn easier, at least for some
combinations of n and k. The elements of the set {1, . . . , n}k (i.e. the ordered k-
tuples of indices) will be called multiindices. We will denote I = I(k, n) the set of
all multiindices α ∈ {1, . . . , n}k which are increasing, i.e. α1 < · · · < αk . Consider
a matrices A = (am,j )m=1,...,n and B = (bm,j )m=1,...,n . For each multiindex α ∈ I
j=1,...,k j=1,...,k
write Aα = (aαi ,j ) i=1,...,k , B = (bαi ,j ) i=1,...,k . Now we show that
j=1,...,k j=1,...,k

det AT B = det ATα Bα .
α∈I
A routine calculation shows that

n
det AT B = det( am,i bm,j )ki,j=1 = det(aβi ,i , bβi ,j )ki,j=1 .
m=1 β∈{1,...,m}k
Since
det(aβi ,i , bβi ,j )ki,j=1 = 0
whenever any index in β is repeated, we obtain

det AT B = det(aβi ,i , bβi ,j )ki,j=1
β∈{1,...,m}k

= det(aβi ,i , bβi ,j )ki,j=1 = det(ATα Bα ).
α∈I β∈{α1 ,...,αk }k α∈I
Now the promised application to the volume follows: We set both A and B to be
the matrices of the mapping L. From the above computation we obtain

(vol L)2 = det AT A = det ATα Aα = (det Aα )2 .
α∈I α∈I
34.14. Example. Let

u1 = [0, 2, 1], u2 = [1, 0, 1],
so that 0 1
0 1
A = @2 0A.
1 1
Then we evaluate „ «
5 1
(vol(u1 , u2 ))2 = det(AT A) = det = 9,
1 2
or
„ « „ « „ «
` 0 1 ´2 ` 0 1 ´2 ` 2 0 ´2
(vol(u1 , u2 ))2 = det + det + det = (−2)2 + (−1)2 + 22 = 9.
2 0 1 1 1 1
34.15. Lemma. Let σ be a k-dimensional measure in Rn . Let L : Rk → Rn

be an injective linear mapping, G ⊂ Rk an open set, F ⊂ G a measurable set and
ε > 0. Further, let ϕ : G → Rn be an almost everywhere differentiable mapping
G for which
|ϕ(s) − ϕ(t) − L(s − t)| ≤ ε |s − t| , t, s ∈ F.
% % % %
Denote β = 1 + ε %L−1 %, α = (1 − ε %L−1 %)−1 .
Then the following assertions hold:
−1
% ϕ ◦ L is β-Lipschitz on L(F );
(a) the %mapping
(b) if ε %L−1 % < 1, then the mapping ϕ is one-to-one and the mapping L◦ϕ−1
is α-Lipschitz on ϕ(F );

% every t ∈ F we have ϕ (t) − L ≤ ε and vol ϕ (t) ≤ β vol L;
k
(c) for almost
%
(d) if ε %L−1 % < 1, then vol ϕ (t) ≥ α−k vol L for almost all t ∈ F .
(e) σ ∗ ϕ(E) ∗
% −1≤% β vol L λ E for any set E ⊂ F ;
k
%
(f) if ε L % < 1, then

β −k α−k vol ϕ (t) dt ≤ σ ∗ ϕ(E) ≤ β k αk vol ϕ (t) dt
E E
for any set E ⊂ F .
Proof. (a) and (b): Let t, s ∈ F . Then
|ϕ(s) − ϕ(t)| ≤ |ϕ(s) − ϕ(t) − L(s − t)| + |Ls − Lt| ≤ ε |s − t| + |Ls − Lt|
% %
≤ (ε %L−1 % + 1) |Ls − Lt|
(hence (a) follows) and
|Ls − Lt| ≤ |ϕ(s) − ϕ(t) − L(s − t)| + |ϕ(s) − ϕ(t)|

% %
≤ ε %L−1 % |Ls − Lt| + |ϕ(s) − ϕ(t)| .
The last inequality may be also simplified

% %
(1 − ε %L−1 %) |L(s) − L(t)| ≤ |ϕ(s) − ϕ(t)| ,
which yields (b).

(c) and (d): By the Lebesgue Density Theorem 29.2, almost every point of F is
its point of density. If, in addition, the derivative A := ϕ (t) exists at such a point
% −1 % A − L ≤ ε. Since
t, then obviously the mapping AL−1 is (as in (a)) β-Lipschitz,
we have %AL % ≤ β and % vol
%
−1
AL ≤ β k , or vol A ≤ β k vol L. Similarly we obtain
vol A ≥ α−k vol L if ε %L−1 % < 1.
(e) and (f): Using the definition of the k-dimensional measure and the preceding
considerations, it follows that
σ ∗ ϕ(E) ≤ β k σ ∗ L(E) = β k vol L λ∗ E.

% %
Assuming ε %L−1 % < 1 we obtain

σ ∗ ϕ(E) ≤ β k vol L λ∗ E ≤ β k αk vol ϕ (t) dt
E
and
∗ −k ∗ −k −k
σ ϕ(E) ≥ α vol L λ E ≥ β α vol ϕ (t) dt.
E
Now we will deal with locally Lipschitz mappings. We will frequently and
tacitly use the fact that (in view of the Rademacher Theorem 30.3) any Lipschitz
mapping ϕ has its derivative ϕ (t) almost everywhere, and that this is measurable
(as a function of t).
34.16. Lemma. Let G ⊂ Rk be an open set, E ⊂ G a measurable set and

ϕ : G → Rn a locally Lipschitz mapping. Let F be a closed or open subset of
Mn,k and ε a positive function

on F . Then there exists a countable disjoint
partition E ∩ {ϕ ∈ F } = Dj , where Dj are measurable and for each j there is
L ∈ F such that
|ϕ(t) − ϕ(s) − L(s − t)| < ε(L) |s − t|
for each s, t ∈ Dj .
Proof. First suppose that F is compact. The system of sets
{M ∈ F : M − L < ε(L)},
where L runs over F , is an open cover of F . We can thus find its countable
subcover and divide E ∩ {ϕ ∈ F } into a disjoint union of measurable sets E i ,
where for each i there is L ∈ F such

that for all t ∈ E i we have ϕ (t) − L <
i i
ε(L). Let i be fixed. We have E = Ep , where
p

Epi := t ∈ E i : for each s ∈ U (t, p2 ) we have

|ϕ(s) − ϕ(t) − L(s − t)| < ε(L) |s − t| .
If we divide Epi into measurable pairwise disjoint parts Ep,q

i
with diameter less
than 1/p, then Ep,q have all the required properties. If F is closed or open in
i
Mn,k , it is a countable union of compact sets and we use the preceding part.
34.17. Sard Theorem. Let σ be a complete k-dimensional measure on Rn .
Let G ⊂ Rk be an open set and ϕ : G → Rk a locally Lipschitz mapping. Let
Z := {t : vol ϕ (t) = 0}. Then σ(ϕ(Z)) = 0.
Proof. It is enough to show that σ(ϕ(E)) = 0 for any set E := Z ∩ {ϕ ≤ m}.
If K := {M ∈ Mn,k : vol M = 0 a M ≤ m}, then K is a closed subset
of Mn,k . Choose δ > 0. With each L ∈ K we associate ε(L) > 0 such that
k−1
2k L ε(L) < δ and |Lt| ≥ ε(L) |t| for every t perpendicular to Ker L := {s ∈
k
R : Ls = 0}. According to Lemma 34.16 there is a disjoint partition E into
a countable union of measurable sets Dj such that for each j there is L ∈ K
satisfying
|ϕ(t) − ϕ(s) − L(s − t)| < ε(L) |s − t|
for each s, t ∈ Dj . Let j be fixed. The dimension d of the space Ker L is less
than k. Let P be the orthogonal projection of Rk onto Ker L and W be a (k-d)-
dimensional subspace of Rn orthogonal to L(Rk ). We find an isometric mapping
Q of the space Ker L to W and set M t%= Lt %+ ε(L)QP t. It is easily verified that
for ε = ε(L) we have M − L ≤ ε, %M −1 % ≤ 1/ε and vol M = εk−d vol L ≤
εk−d md ≤ εmk−1 . By Lemma 34.15 we get
% %
σ ∗ ϕ(Dj ) ≤ (2ε %M −1 %)k vol M λD ≤ 2k mk−1 ε λDj ≤ δ λDj .
Taking the union and letting δ → 0 we obtain σ ∗ ϕ(E) = 0.

34.18. Change of Variable Theorem. Let σ be a complete k-dimensional

measure on Rn . Let G ⊂ Rk be an open set and ϕ : G → R n
a locally Lipschitz
mapping. If u is a measurable function on G and w(x) := {u(t) : t ∈ ϕ−1 (x)},
then
w(x) dσ(x) = u(t) vol ϕ (t) dt ,
ϕ(G) G
provided the integral on the right-hand side converges.

Proof. Apparently it is enough to prove the theorem when u is the indicator
function of a measurable set E ⊂ Rk . If E is of measure zero, it is mapped by a
locally Lipschitz mapping again to a set of measure zero. If E ⊂ {vol ϕ = 0}, then
according to the Sard Theorem 34.17 we have σ(ϕ(E)) = 0 and the integral on
the right-hand side is also zero. Hence we may assume that E ⊂ {vol f > 0}. If
F := {M ∈ Mn,k : vol M > 0}, then F is an open subset of% Mn,k % . Choose τ > 1.
To each matrix L ∈ Mn,k find ε(L) > 0 such that ε(L) %L−1 % < 1 − τ −(1/2k)
√

estimate β ≤ α < τ ). By
k k
(then the constants α, β of Lemma 34.15 satisfy the
Lemma 34.16 there is a disjoint partition E = Dj , where Dj are measurable
j
sets and for every j there is L ∈ K such that
|ϕ(t) − ϕ(s) − L(s − t)| < ε(L) |s − t|
for all s, t ∈ Dj . Then by virtue of Lemma 34.15, the mapping ϕ is one-to-one

on Dj . We first reduce the general case to the case when ϕ is one-to-one on E.
Thanks to Lemma 34.15 we have

τ −1 σ ∗ ϕ(Dj ) ≤ vol ϕ (t) dt ≤ τ σ ∗ ϕ(Dj ).
Dj
For each compact set K ⊂ Dj the set ϕ(K) is compact and thus measurable.
There is a
sequence {Ki } of compact subset Dj such that λKi → λDj . Then the
set Fj := ϕ(Ki ) is a σ-measurable subset ϕ(Dj ) and
i

vol ϕ (t) dt ≤ τ σFj .
Dj
Denote F = Fj . If we sum over j, we obtain

j

τ −1 σ ∗ ϕ(E) ≤ vol ϕ (t) dt ≤ τ σϕ(F ) .
E
Whence, passing to the limit as τ → 1 we get

σ ∗ ϕ(E) ≤ vol ϕ (t) dt ≤ σϕ(F ).
E
Since F ⊂ E, it follows
σϕ(E) = vol ϕ (t) dt ,
E
which is the desired formula in the case of the indicator function of the set E.
Now, if ϕ is not

one-to-one on E, the same procedure as above yields a disjoint
partition E = Ej , where Ej are measurable sets and ϕ is one-to-one on Ej .
j
According to the preceding part of the proof, it follows that

cEj dσ = σϕ(Ej ) = vol ϕ (t) dt.
ϕ(G) Ej
Summing over j we complete the assertion.

34.19. Corollary. Let σ be a complete k-dimensional measure on Rn . Let
G ⊂ Rk be an open set and ϕ : G → Rn a locally Lipschitz mapping. Let f be
a σ-measurable function on ϕ(G) and E ⊂ G a measurable set. If we define the
Banach indicatrix N (x, ϕ, E) as the cardinality of the set E ∩ ϕ−1 (x), then

N (x, ϕ, E) f (x) dσ(x) = f (ϕ(t)) vol ϕ (t) dt ,
ϕ(G) E
provided either integral exists.

In particular, if the mapping ϕ is, in addition, one-to-one, then

f (x) dσ(x) = f (ϕ(t)) vol ϕ (t) dt.
ϕ(G) G
Proof. The assertion is a direct consequence of Theorem 34.18. It remains only

to verify that f ◦ ϕ is measurable on {vol ϕ > 0} and N (x, ϕ, E) is σ-measurable.
If E ⊂ {vol ϕ = 0}, then an appeal to the Sard Theorem 34.17 reveals that
σϕ(E) = 0, so that N (x, ϕ, E) = 0 almost everywhere. Therefore, we may suppose
that E ⊂ {vol ϕ > 0} and similarly as in the proof of the preceding theorem we
may restrict to the case when ϕ is one-to-one on E. Further, it is no restriction to
assume that f is the indicator function of a σ-measurable set H ⊂ ϕ(G). In this
case there are Borel sets B, N ⊂ ϕ(G) such that B ⊂ H, H \ B ⊂ N and σN = 0.
By Theorem 34.18 we have λ(E ∩ϕ−1 (N )) = ! 0, and therefore λ(E ∩ϕ
"
−1
(H \B)) =
−1 −1 −1
0. We see that the set E ∩ ϕ (H) = E ∩ ϕ (B) ∪ ϕ (H \ B) is measurable.
34.20. Bilipschitz Mappings. Let (P1 , ρ1 ), (P2 , ρ2 ) be metric spaces, G ⊂ P1

and β ∈ [1, ∞) a constant. We say that a mapping ϕ : G → P2 is β-bilipschitz on
G if
β −1 ρ1 (x, y) ≤ ρ2 (ϕ(x), ϕ(y)) ≤ βρ1 (x, y)
for all x, y ∈ G. In other words, whenever both ϕ and ϕ−1 are β-Lipschitz
mappings. A mapping is termed bilipschitz if it is β-bilipschitz for some β.
A mapping ψ : G → P2 is said to be locally bilipschitz if each point x ∈ G has
a neighborhood Ux and βx such that the restriction of ψ to Ux is βx -bilipschitz.
34.21. Remarks. 1. Each bilipschitz mapping is one-to-one, it is even homeomorphic.

However, a locally bilipschitz mapping fails to be one-to-one; and even if it is one-to-one, it is
not necessarily homeomorphic.
2. Notice that 1-bilipschitz mappings are nothing else than isometries.
3. Let ψ be a bilipschitz mapping of an open subset of Rk to Rn . Then ψ is differentiable
almost everywhere (by the Rademacher Theorem 31.2). Moreover, if ψ (t) exists, then the rank
of ψ (t) is k.
34.22. Regular Mappings and Diffeomorphisms. Every regular mapping
(i.e. a C 1 -mapping of an open set G ⊂ Rk to Rk whose Jacobian is nowhere on G
vanishing) serves as an example of a locally bilipschitz mapping. Also any regular
mapping of an open set G ⊂ Rk into Rn (k ≤ n), (which is defined as a C 1 -
mapping whose Jacobi matrix has rank k at each point G) is a locally bilipschitz
mapping.
Any one-to-one regular mapping ϕ of an open set G ⊂ Rk into Rk is a diffeo-
morphism, i.e. ϕ is a homeomorphism for which both ϕ and ϕ−1 are C 1 -mappings.
34.23. Examples. The mappings given by polar or spherical coordinates of Chapter 26
are typical examples of regular mappings. If a ∈ U (0, 1) is a constant vector, then f : x →
|x| ( |x|
x
− a), x ∈ U (0, 1) may serve as as example of a mapping which is (1 − |a|)−1 -bilipschitz
but not of class 1 (as it is not differentiable at the origin).
34.24. Integration on k-dimensional Surfaces. In applications, k-dimen-
sional measures are mostly used on the so-called k-dimensional surfaces. These
objects admit local parametrizations by means of Lipschitz (or even bilipschitz)
mappings, and consequently the k-dimensional measure is uniquely determined
on subsets of k-dimensional surfaces by the Change of Variable Theorem.
Given a set Ω ⊂ Rn , we always consider it as a metric space whose topology
is inherit from those of Rn . A set Ω ⊂ Rn is called a k-dimensional surface
whenever for each point x ∈ Ω there exists a locally bilipschitz homeomorphic
mapping ϕ of an open set G ⊂ Rk into Ω such that x ∈ ϕ(G), and ϕ(G) is a
relatively open subset of Ω. The mapping ϕ is then called a parametrization of
ϕ(G).
If each point of Ω has a neighborhood parametrizable by mappings of class C
( ≥ 1), we say that the surface Ω itself is of class C .
Roughly speaking, k-dimensional surfaces are sets which locally look like a
bilipschitz deformation of an open part of Rk . Equivalently, we may describe this
property in terms of charts, which are defined as inverse mappings to parametriza-
tions. A k-dimensional chart is thus defined as a locally bilipschitz homeomorphic
mapping of an open subset U of Ω onto an open subset of Rk . A set Ω is a k-
dimensional surface if and only if for each point x ∈ Ω there is a k-dimensional
chart defined on a neighborhood of x (relative to Ω).
Because of the Lindelöf property of Rn each k-dimensional surface is a count-
able union of surfaces of the form ϕ(G), where ϕ is a parametrization.
Another method of characterizing k-dimensional surfaces of class C is to use
an implicit description: A set Ω ⊂ Rn is a k-dimensional surface of class C
if and only if each x ∈ Ω admits a neighborhood Wx in Rn and a mapping
g : Wx → Rn−k of class C such that the matrix g (x) has rank n−k and Wx ∩Ω =
Wx ∩ {g = 0}.
(It is an easy exercise to use the Implicit Function Theorem in order to get a parametrization.
Conversely, if ϕ : G → Ω is a parametrization of class
and x = ϕ(t), then there exists the
projection Π of the space Rn onto Rk such that (Π ◦ ϕ) (t)
= 0. Then by the Inverse Mapping
Theorem there exists (after an eventual restriction of the domain of ϕ) the inverse mapping
` ´
(Π ◦ ϕ)−1 . The equation g(x) = 0, where g(x) = x − ϕ (Π ◦ ϕ)−1 (Π(x)) , is the desired implicit
description of Ω on the neighborhood of x.)
Let Ωbe a k-dimensional surface and σ a k-dimensional measure on Rn . The
integral Ω f dσ is sometimes labelled as the curve or surface integral of f .
34.25. Example (helix). Let X = {[x, y, z] ∈ R3 : x = cos z, y = sin z}. We may parametrize
X by the mapping
ϕ(t) = [cos t, sin t, t] , t ∈ R.
Then
ϕ (t) = [− sin t, cos t, 1]
and q √
vol ϕ (t) = (− sin t)2 + cos2 t + 1 = 2.
34.26. Example (sphere). Let S = {[x, y, z] ∈ R3 : x2 + y 2 + z 2 = 1}. We introduce three

possibilities of a parametrization.
(a) Remember that the spherical coordinates are given by formulas ϕs = [x, y, z], where
x(t, a) = cos a cos t,

y(t, a) = cos a sin t,
z(t, a) = sin a ,
[t, a] ∈ G := (0, 2π) × (− 12 π, 12 π). Then ϕ(G) = X := S \ N , where N is the “meridian”

{[x, y, z] ∈ S : y = 0, x ∈ [0, 1]}. Obviously, two-dimensional measure of N is zero, so that the
difference between S and X may be neglected. We have
0 1
− sin t cos a, − cos t sin a
ϕs (t, a) = @ cos t cos a, − sin t sin a A
0, cos a
and p
vol ϕs (t, a) = cos2 a sin2 a + sin2 t cos4 a + cos2 t cos4 a = cos a.
(b) Let S, X, N be as in (a). For the parametrization of X we may also use the “cylindrical
coordinates” ϕc = [x, y, z], where
x(t, h) = r cos t,
p
y(t, h) = r sin t, (r denotes 1 − h2 ),
z(t, h) = h
on G := {[t, h] ∈ R2 : t ∈ (0, 2π), h ∈ (−1, 1)}. Then

0 1
−r sin t, −h cos t
B r C
ϕc (t, h) = @ r cos t, −h sin t A
r
0, 1
and q
vol ϕc (t, h) = h2 + (1 − h2 )(sin2 t + cos2 t) = 1.
(c) Let S+ be the hemisphere {[x, y, z] ∈ S : z > 0}. Then S+ can be parametrized by
its projection into the plane determined by the axes x and y. Consider the parametrization
ϕp = [x, y, z], where
x(s, t) = s,
y(s, t) = t,
p
z(s, t) = 1 − s2 − t2 ,
on {(s, t) ∈ R2 : s2 + t2 < 1}. Then
0 1
1, 0
B C
ϕp (s, t) = @ 0, 1 A
√ −s √ −t
,
1−s2 −t2 1−s2 −t2
and
1
vol ϕp (t) = √ .
1 − s2 − t2
34.27. Example. By means of the parametrization of Example 34.26.b we will compute
two-dimensional surface measure of the unit sphere. Let S, X, G have the same meaning as in
the quoted example. Then
Z Z 1 Z 2π
σ(S) = σ(X) = vol ϕ dt dh = ( dt)dh = 4π.
G −1 0
34.28.
p Example. Let σ be a two-dimensional measure on S− := {(x, y, z) ∈ R3 : z =
− 1 − x2 − y 2 }. The integral Z
z 2 dσ
S−
posseses (up to constants) the physical meaning of the force by which (under the unit gravitation
acceleration) a ball with the unit density half immersed in a liquid is lifted (computed by the
integration of the gravitation forces).
Use spherical coordinates on G = {(t, a) ∈ R2 : t ∈ (0, 2π), a ∈ (−π/2, 0)}. Then
Z Z Z 0
2
z 2 dσ = sin2 a cos a dt da = 2π sin2 a cos a da = π ,
S− G −π/2 3
which is the half of the volume of the unit sphere. Therefore, we have verified Archimedes
principle in our particular case.
34.29. Example (the length of the graph of a function). Let Γ be the graph of a Lipschitz
function f : [0, 1] → R and σ a 1-dimensional measure on R2 . Let G = (0, 1) and ϕ(t) = (t, f (t)),
t ∈ G. Then the assumptions of Theorem 34.18 are satisfied and
Z 1 Z 1q
σ(Γ) = vol ϕ (t) dt = 1 + (f (t))2 dt.
0 0
34.30. Exercise. Derive the formula
Z b q
σ(Ω) = 2π f (t) 1 + (f (t))2 dt
a
in the case of two-dimensional measure of the surface Ω drawn in R3 by the rotation of the graph
of a function f around the z-axis. Suppose that the function f : (a, b) → (0, +∞) is Lipschitz
and set p
Ω = {[x, y, z] ∈ R3 : z ∈ (a, b), x2 + y 2 = f (z)}.
34.31. Rectifiable Sets. A set H ⊂ Rn is said to be k-rectifiable if there is a Lipschitz
mapping of a bounded measurable set E ⊂ Rk to H. It is a consequence of the Change of
Variable Formula 34.19 that on the σ-algebra generated by k-rectifiable sets all k-dimensional
measures coincide. On the other hand, there exist closed sets on which k-dimensional measures
may differ (for example, the k-dimensional measure of 34.9 may be different from the Hausdorff
measure of Chapter 36). The constructions of such sets are rather complicated.
Notice that n-rectificable sets in Rn are just bounded Lebesgue measurable sets.
34.32. Notes. The concept of k-dimensional measures comes from A. Kolmogorov [1932]. Our
exposition follows the monograph by H. Federer [*1969] (see also L. C. Evans and R. E. Gariepy
[*1992]) and ideas due to D. Preiss. The classical Sard theorem (or, Morse-Sard theorem) for
1 –mappings was proved by A.P. Morse [1939] and A. Sard [1942].
35. The Degree of a Mapping
Let G ⊂ Rk be an open set and ϕ : G → Rk a locally Lipschitz mapping. Then

|Jϕ (t)| dt = N (x, ϕ, G) dx,
G ϕ(G)
where N (x, ϕ, G) is the number of points of the set ϕ−1 (x) (the Banach indica-
trix ). It is natural to ask what is the meaning of the integral

Jϕ (t) dt.
G
This problem leads to the notion of the degree of a mapping which turns out
to have many applications. Let us mention, for instance, its importance for the
topology of Euclidean spaces and solvability problems for nonlinear “algebraic”
equations.
First we need to prepare several auxiliary computational results.
35.1. Lemma. Let G ⊂ Rk be an open set and ψ1 , . . . , ψk twice continuously
differentiable functions on G. Then we have
(a)
k
∂ ! ∂ψi "
(−1)q+1 det i=2,...,k = 0.
q=1
∂tq ∂tj j=1,...,q−1,q+1,...,k
(b)
∂ ! ∂ψi "
k
(−1)q+1 ψ1 det i=2,...,k = det(∇ψ1 , . . . , ∇ψk ).
q=1
∂tq ∂tj j=1,...,q−1,q+1,...,k
Proof. (a) The left-hand side is equal to the sum

k
k
(−1)q+1 det(ai,j,p,q ) i=2,...,k ,
q=1 p=1 j=1,...,q−1,q+1,...,k
where ⎧
⎪
⎪
∂ψi
if j = p = q ,
⎨ ∂tj
2
ai,j,p,q =
⎪
∂ ψi
if j = p = q ,
⎪
⎩
∂tq ∂tj
0 if p = q.
154 35. The Degree of a Mapping
Now it remains to realize that using the interchange of the order of differentiation
we get
(−1)q+1 det(ai,j,p,q ) i=2,...,k = (−1)p det(ai,j,q,p ) i=2,...,k
j=1,...,q−1,q+1,...,k j=1,...,p−1,p+1,...,k
for all p, q.
(b) If we apply the chain rule to the left-hand side of the equality, we obtain
∂ ! ∂ψi "
k
(−1)q+1 ψ1 det i=2,...,k
q=1
∂tq ∂tj j=1,...,q−1,q+1,...,k

k
∂ψ1 ! ∂ψi "
= (−1)q+1 det i=2,...,k
q=1
∂tq ∂tj j=1,...,q−1,q+1,...,k

k
∂ ! ∂ψi "
+ ψ1 (−1)q+1 det i=2,...,k .
q=1
∂tq ∂tj j=1,...,q−1,q+1,...,k
The first term on the right-hand side is just the expansion of the determinant
det(∇ψ1 , . . . , ∇ψk ) by the first row, the second one vanishes by the part (a).
35.2. Lemma. Let ψ1 , . . . , ψk be Lipschitz functions on Rk vanishing outside
a compact subset of Rk . Then

det(∇ψ1 , . . . , ∇ψk ) dt = 0.
Rk
Proof. We proceed in two steps. First, we assume that the functions ψ1 . . . ψk are
of class C 2 . By Lemma 35.1.b there are functions η1 , . . . , ηk vanishing outside
compact subsets of Rk such that

k
∂ηj
det(∇ψ1 , . . . , ∇ψk ) = .
j=1
∂tj
For each j = 1, . . . , k and [t1 , . . . , tj−1 , tj+1 , . . . , tk ] ∈ Rk−1 we have

∂ηj
(t1 , . . . , tj−1 , ξ, tj+1 , . . . , tk ) dξ = 0.
R ∂tj
Fubini’s theorem yields

∂ηj
(t) dt = 0.
Rk ∂tj
Summing up these equalities over j = 1, . . . , k we get the required formula.
In the second step we show that the formula is valid without smoothness as-
sumptions. Suppose that all functions ψ1 , . . . , ψk are β-Lipschitz. Then

det(∇χq ∗ ψ1 , . . . , ∇χq ∗ ψk ) dt = 0
Rk
(notation as in 31.1). Letting q → ∞, by the Lebesgue dominated convergence

theorem with constant dominating function we get the desired equality.
35.3. Corollary. Let G ⊂ Rk be a bounded open set and ϕ1 , . . . , ϕk , ψ1 , . . . , ψk

Lipschitz functions on G. If ϕi = ψi on ∂G, i = 1, . . . , k, then

det(∇ϕ1 , . . . , ∇ϕk ) dt = det(∇ψ1 , . . . , ∇ψk ) dt.
G G
Proof. By McShane’s theorem 30.5 the functions ϕi , ψi can be extended to the

whole of Rk to be Lipschitz functions of compact support. Further, we redefine
ψi on Rk \ G to coincide there with ϕi . Thanks to the preceding lemma it follows
that

det(∇ϕ1 , . . . , ∇ϕk ) dt = − det(∇ϕ1 , . . . , ∇ϕk ) dt
G Rk \G

=− det(∇ψ1 , . . . , ∇ψk ) dt
Rk \G

= det(∇ψ1 , . . . , ∇ψk ) dt.
G
35.4. Lemma. Let G ⊂ Rk be a bounded open set, f an integrable function on

Rk and ϕ, ψ Lipschitz mappings of G into Rk . If ϕ = ψ on ∂G, then

f (ϕ(t)) Jϕ (t) dt = f (ψ(t)) Jψ (t) dt.
G G
Proof. Let ϕ = [ϕ1 , . . . , ϕk ], ψ = [ψ1 , . . . , ψk ]. No generality is lost with the

assumption that f ∈ D(Rk ), for D(Rk ) is dense in L1 (Rk ). Find a C 1 -function
∂g
g on Rk so that ∂x1
= f . Then by Corollary 35.3

det(∇(g ◦ ϕ), ∇ϕ2 , . . . , ∇ϕk ) dt = det(∇(g ◦ ψ), ∇ψ2 , . . . , ∇ψk ) dt.
G G
Further,
k
∂g
det(∇(g ◦ ϕ), ∇ϕ2 , . . . , ∇ϕk ) dt = ( ◦ ϕ) det(∇ϕi , ∇ϕ2 , . . . , ∇ϕk ) dt.
G i=1 G ∂xi
Among the terms of the sum on the right, only the first one may differ from
zero and it equals G (f ◦ ϕ) Jϕ dt. If we combine this result with an analogous
computation for ψi , we obtain the desired equality.
35.5. Lemma. Let G ⊂ Rk be a bounded open set and ϕ, ψ : G → Rk Lipschitz
mappings. Let B(y, r) be a closed ball of Rk which does not intersect any of the
segments joining ϕ(t) and ψ(t), t ∈ ∂G. Let f be an integrable function on Rk
with support in U (y, r). Then

f (ϕ(t)) Jϕ (t) dt = f (ψ(t)) Jψ (t) dt.
G G
Proof. Denote
K = {t ∈ G : the segment joining ϕ(t) and ψ(t) intersects B(y, r)} .
Then K is a compact set contained in G. The function η which is one on K and

zero outside G is Lipschitz on K ∪ (Rk \ G). According to McShane’s extension
theorem 30.5, η can be extended (under the same notation) as a Lipschitz function
on Rk . Set ζ = ϕ + η(ψ − ϕ). Then ζ = ϕ on ∂G and f (ζ(t)) = f (ψ(t)) for all
t ∈ G. and by Lemma 35.4 we obtain

f (ϕ(t)) Jϕ (t) dt = f (ζ(t)) Jζ (t) dt = f (ψ(t)) Jψ (t) dt.
G G G
35.6. Lemma. Let G ⊂ Rk be a bounded open set and ϕ : G → Rk a Lipschitz

mapping. If U ⊂ Rk is a connected open set disjoint with ϕ(∂G), then there exists
a ∈ R such that
f (ϕ(t)) Jϕ (t) dt = a f (x) dx
G Rk
for every function f ∈ L 1 (Rk ) with support in U .

Proof. To prove the assertion it suffices to show it when U is an open cube, and it is
enough to verify the equality for indicator functions of cubes K(z, ρ) := [−ρ, ρ]k +z
contained in U , where z ∈ U and ρ is a rational number. Select ρ and set
Uρ := {z ∈ U : K(z, ρ) ⊂ U }. Then for z, y ∈ Uρ , we have

cK(y,ρ) (ϕ(t)) Jϕ (t) dt = cK(z,ρ) (ψ(t)) Jψ (t) dt ,
G G
where ψ(t) = ϕ(t) + y − z. For y close enough to z the functions ϕ, ψ satisfy the
hypotheses of Lemma 35.5 and, in view of connectedness of Uρ , it follows that the
function
y → cK(y,ρ) (ϕ(t)) Jϕ (t) dt
G
is constant on Uρ . Hence

cK(y,ρ) (ϕ(t)) Jϕ (t) dt = a(ρ)λK(y, ρ).
G
Since the Jacobian of a Lipschitz mapping is a bounded function and G is a

bounded set, the constant a(ρ) is finite. If ρ1 is a rational number and ρ2 is an
integer multiple of ρ1 , then the cube with edges 2ρ2 may be filled by cubes of
edge 2ρ1 , so that a(ρ2 ) = a(ρ1 ). Therefore we can conclude that a(ρ) does not
depend on ρ.
35.7. Degree of a Mapping. Let G ⊂ Rk be a bounded open set and ϕ : G →
Rk a continuous mapping. By Tietze’s extension theorem, ϕ is continuously
extendable as a continuous function of compact support to the whole of Rk (with
the same notation). By Theorem 31.5 there exist (even C ∞ ) functions ϕj on Rk

such that ϕj ⇒ ϕ. Let y ∈ Rk \ ϕ(∂G). We find a neighborhood U of the point
y whose closure does not intersect ϕ(∂G). We may assume that U ∩ ϕj (∂G) = ∅
for all j. Using the preceding lemmas, there are finite constants aj such that

f (ϕj (t)) Jϕj (t) dt = aj f (x) dx
G Rk
for each integrable function f on Rn with support in U . Further, an appeal to

Lemma 35.5 makes it clear that the sequence {aj } is constant for j ≥ j0 , and
that its limit a is independent of the choice of the sequence {ϕj }. We will soon
be able to show that a is an integer and we call a to be the degree of the mapping
ϕ at y on G. We denote it by deg(y, ϕ, G).
35.8. Change of Variable Formula. Let G ⊂ Rk be a bounded open set and
ϕ : G → Rk a Lipschitz mapping. Let f be a measurable function on Rk , f = 0
almost everywhere on ϕ(∂G). Then

f (ϕ(t)) Jϕ (t) dt = deg(x, ϕ, G) f (x) dx ,
G Rk
provided the integral on the left-hand side converges.

Proof. Making similar measurability considerations to those of the proof of Corol-
lary 34.19, we conclude the theorem using the preceding lemmas and the definition
of a degree.
35.9. Properties of a Degree. Let G ⊂ Rk be a bounded open set and
ϕ : G → Rk a continuous mapping.
(a) The function y → deg(y, f, G) is constant on each component of Rk \
f (∂G).
(b) Let ψ : G → Rk be a continuous mapping and let the segment joining ϕ(t)
and ψ(t), t ∈ ∂G, do not contain y. Then deg(y, ϕ, G) = deg(y, ψ, G).
(c) If ϕ is one-to-one and Jϕ > 0 almost everywhere in G, then deg(y, ϕ, G) =
1 for all y ∈ ϕ(G).
(d) If deg(y, ϕ, G) = 0, then the equation ϕ(t) = y has a solution in G.
(e) If G1 , . . . , Gm are disjoint open subsets of G and ϕ(t) = y for t ∈ G \
(G1 ∪ · · · ∪ Gm ), then
deg(y, ϕ, G) = deg(y, ϕ, G1 ) + · · · + deg(y, ϕ, Gm ).
(f) If ϕ is C 1 and Jϕ (t) = 0 for all t ∈ ϕ−1 (y), then

deg(y, ϕ, G) = {sign Jϕ (t) : t ∈ ϕ−1 (y)} .
(g) The degree deg(y, ϕ, G) is an integer number.
Proof. The assertions (a) and (b) are obvious. Combining Change of Variable
Formula in 35.8 and 34.19, we get (c).
(d) Assume that the equation ϕ(t) = y has no solution. Since ϕ(G) is a compact
set, there is an open neighborhood U of y such that U ∩ϕ(G) = ∅. Find a Lipschitz
function ψ close enough to ϕ such that U ∩ ψ(G) = ∅ and deg(y, ψ, G) = 0. Then
G ∩ ψ −1 (U ) = ∅ and by 35.8

0= Jψ = deg(y, ψ, G) λU = 0 ,
G∩ψ −1 (U )
which is a contradiction.
(e) We may assume that ϕ is a Lipschitz mapping. The set K := ϕ(G \ (G1 ∪
· · · ∪ Gm )) is compact and does not contain y. We find an open neighborhood U
of y whose closure does not intersect K. Then by 35.8

deg(y, ϕ, G) = Jϕ = Jϕ + · · · + Jϕ
G∩ϕ−1 (U ) G1 ∩ϕ−1 (U ) Gm ∩ϕ−1 (U )
= deg(y, ϕ, G1 ) + · · · + deg(y, ϕ, Gm ).
(f) Let s ∈ ϕ−1 (y). Since Jϕ (s) = 0, there is a neighborhood V of s such that
ϕ is one-to-one on V , ϕ(V ) is an open set (the Inverse Mapping Theorem) and
Jϕ has a constant sign on V . Then

λϕ(V ) deg(y, ϕ, V ) = Jϕ dt = sign Jϕ (s) |Jϕ | dt = sign Jϕ (s) λϕ(V ) .
V V
It is also clear that the set ϕ−1 (y) is isolated, and therefore ϕ−1 (y) is a finite
subset of G. We complete the proof using (e).
(g) We can assume that ϕ is of class C 1 . Let E = {t ∈ G : Jϕ (t) = 0}. By
the Sard Theorem 34.17, λ(ϕ(E)) = 0. Choose y ∈ Rk \ ϕ(∂G). Let U be a
neighborhood of y which does not intersect ϕ(∂G). Then there is x ∈ U \ ϕ(E).
Since deg(y, ϕ, G) = deg(x, ϕ, G), according to (f) deg(y, ϕ, G) is an integer.
35.10. Open Mapping Theorem. Let ϕ : G0 → Rk be a bilipschitz mapping
on an open set G0 ⊂ Rk . Then ϕ(G0 ) is an open set.
Proof. Let t ∈ G0 and y = ϕ(t). Let G be a bounded open set, t ∈ G ⊂ G ⊂ G0 ,
and U be an open neighborhood of y which does not intersect ϕ(∂G). If we prove
that deg(y, ϕ, G) = 0, then by 35.9.d the equation ϕ(s) = x has a solution for
any x ∈ U . Now, there is an open neighborhood V of t such that ϕ(V ) ⊂ U .
Denote V + = {s ∈ V : Jϕ(s)>0 } and V − = {s ∈ V : Jϕ(s)<0 }. Notice that by the
Rademacher Theorem 30.3, ϕ exists almost everywhere. From the definition of
a bilipschitz mapping it is clear that Jϕ (t) = 0 is impossible, and at least one
+ −
sets V , V contains a compact set K of positive measure. We obtain
of the
0 = K Jϕ = deg(y, ϕ, G) λϕ(K). Hence deg(y, ϕ, G) = 0.
35.11. Theorem on Orientation. Let ϕ be a locally bilipschitz mapping of a
connected open set G0 ⊂ Rk into Rk . Then Jϕ > 0 almost everywhere in G0 , or
Jϕ < 0 almost everywhere in G0 .
Proof. Let G ⊂ G ⊂ G0 be a bounded open set and t ∈ G. There is a neigh-

borhood U of ϕ(t) which does not intersect ϕ(∂G), and a neighborhood V of
t such that ϕ(V ) ⊂ U . As in the proof of the Open Mapping Theorem we
deduce that deg(ϕ(t), ϕ, G) = 0. By 35.9.g, the degree is an integer, so that
|deg(ϕ(t), ϕ, G)| ≥ 1. We have

Jϕ (t) dt = |deg(ϕ(t), ϕ, G)| λϕ(V ) ≥ λϕ(V ) = |Jϕ (t)| dt.
V V
Thus, Jϕ has a constant sign almost everywhere in V . From the connectedness of

G0 we obtain the assertion.
35.12. Notes. Although the origins of the idea of degree go back to K.F. Gauss and
A.L. Cauchy, for smooth mappings and smooth sets they were considered at the turn of the
century by H. Kronecker, H. Poincaré, E. Picard, P. Bohl or J. Hadamard. The essential step for
the developing the theory of degree of mappings in finite dimensional spaces and its application is
due to L.E.J. Brouwer [1912]. Later on, J. Leray and J. Schauder introduced the topological de-
gree also in infinite-dimensional spaces. Today, the significance of a degree, mainly in nonlinear
functional analysis, is undeniable. There is a rich bibliography and many sources for study the
degree theory. We refer the reader e.g. to S. Fučı́k and J. Milota [FM], J. Stará and O. John [SJ],
J.T. Schwartz [*1969], S. Fučı́k, J. Nečas, J. Souček and V. Souček [*1973], K. Deimling [*1985],
I. Fonseca and W. Gangbo [*1995] and P. Drábek and J. Milota [*2004].
36. Hausdorff Measures
In Chapter 34 we proved the existence of a k-dimensional measures on Rn

using a relatively simple method. This approach to k-dimensional measures is
appropriate for purposes of applications to the curve and surface integrals.
In theoretical parts of modern analysis we encounter Hausdorff measures oc-
curing more frequently in various connections (even for noninteger values of k).
According to 34.31, on rectifiable sets, and in particular on k-dimensional sur-
faces, the normalized Hausdorff measure and the measure constructed in 34.9
coincide.
Without loss of clarity we take a slightly deeper look in a more general setting
supposing that p (the “dimension”) is a nonnegative real number and (P, ρ) is
a metric space on which we are going to construct the p-dimensional Hausdorff
measure.
36.1. Outer Hausdorff Measure. Let A ⊂ P . Denote
∞ ∞

Hp (A, δ) = inf (diam Aj )p : Aj ⊃ A, diam Aj ≤ δ for δ > 0,
j=1 j=1
Hp (A) = sup Hp (A, δ) (= lim Hp (A, δ) ).
δ>0 δ→0+
The set function A → Hp (A) is called the p-dimensional (outer) Hausdorff mea-
sure.
As we show later, if P = Rn and k ≤ n is a nonnegative integer, there is a
constant κk such that Hk /κk is a k-dimensional measure on Rn in the sense of
definition 34.8. The measure Hk /κk is called the normalized Hausdorff measure.
160 36. Hausdorff Measures
36.2. Metric Outer Measure. An outer measure γ on P is called a metric

outer measure
if γ(A ∪ B) = γA + γB whenever A, B ⊂ P and inf ρ(x, y) : x ∈
A, y ∈ B > 0.
36.3. Remarks. 1. In the definition of the Hausdorff measure p (A) we considered arbitrary
coverings of A but we may confine to coverings consisting of closed or open sets.
2. Notice that the n-dimensional Hausdorff measure of a set K which is a ball or a cube in Rn
equals c(diam K)n for a suitable constant c (cf. Lemma 36.12).
3. The set functions p (·, δ) are not metric outer measures and cannot be used to describe
length or area. Notice that the “one-dimensional” measure 1 (K, δ) of the unit square K in
R2 is a finite number although K contains infinitely many segments of length one. This is why
we cannot replace the definition of Hausdorff measure by a more simple formula
∞ ∞
˘X [
A → inf (diam Aj )p : Aj ⊃ A}
j=1 j=1
(which is in fact p (A, ∞)).

4. Further examples of k-dimensional measures in Rn (which, in general, do not coincide with
the normalized Hausdorff measure) can be obtained using other covering families. For example,
the spherical measure is defined using coverings formed by open balls, see H. Federer [*1969].
The next series of theorems shows that Hp is a metric outer measure. There-
fore, we can apply Carathodory’s method and to derive that each Borel set is
Hp -measurable, and that the restriction of Hp to the Borel σ-algebra is a mea-
sure.
36.4. Theorem. Let γ be an metric outer measure on P . Then each Borel
subset of P is γ-measurable.
Proof. It would be clearly sufficient to prove that closed sets are γ-measurable.
To this end let a closed set F ⊂ P be given. Choose a test set T ⊂ P , γT < +∞
and denote
1 1
Pj = x ∈ T : ≤ dist(x, F ) < , j = 1, 2, . . . ,
j+1 j

P0 = x ∈ T : dist(x, F ) ≥ 1 .
Then the sets P0 , P2 , P4 , . . . have positive distances, so that

m
!
m
"
γP2j = γ P2j ≤ γT
j=0 j=0

m
∞
for all m ∈ N. Similarly γP2j+1 ≤ γT , and we see that the series γPj
j=0 j=0

m
is convergent. Since for each m ∈ N, the distance between Pj and T ∩ F is
j=0
positive we have
∞ ∞
!
m
" ! "
γ(T \ F ) ≤ γ Pj + γ Pj ≤ γT − γ(T ∩ F ) + γPj .
j=0 j=m+1 j=m+1
Letting m → ∞ we obtain
γ(T \ F ) ≤ γ(T ) − γ(T ∩ F ).
36.5. Remark. If γ is an outer measure on P for which every Borel set is γ-measurable, then
γ is already a metric outer measure.
36.6. Theorem. Hp is a metric outer measure on P .

Proof. Theorem 4.3 tell us that A → Hp (A, δ) is an outer measure for every δ > 0.
Letting δ → 0 we see
that Hp is an outer measure.
Let A, B be sets of positive
distance and δ0 < inf ρ(x, y) : x ∈ A, y ∈ B . Let M ⊂ A ∪ B, diam M ≤ δ0 .
Then either M ⊂ A or M ⊂ B. Hence
Hp (A ∪ B, δ) = Hp (A, δ) + Hp (B, δ)
for all δ ∈ (0, δ0 ). Thus
Hp (A ∪ B) = Hp (A) + Hp (B).
36.7. Corollary. Any Borel subset of P is Hp -measurable.

36.8. Exercise. (a) Let 0 ≤ p < q. If p (A) < +∞, then q (A) = 0.
(b) The number inf{p ≥ 0 : p (A) = 0} is called the Hausdorff dimension of a set A.
Compute the Hausdorff dimension of the Cantor set in [0, 1].
(c) For any p ∈ (0, 1), a “generalized Cantor set” B ⊂ [0, 1] may be constructed with
p (B)= 1.
36.9. Remark. The definition of the p-dimensional Hausdorff measure admits also noninteger
values of p. The Hausdorff measures with noninteger dimensions are not directly linked with
the topics of the following chapters nevertheless they have a great importance e.g. in the theory
of singular sets (to describe the “size” of “negligible” sets, for example the set of discontinuities
for a solution of system of partial differential equations, or the set of points of divergence of a
Fourier series; the applications to “removable singularities” are also frequent). The concept of
Hausdorff measures with noninteger dimension is also a starting point for the famous fractal
theory (see e.g G.A. Edgar [*1990] and K.J. Falconer [*1985]).
36.10. Exercise. Show that each 0-dimensional Hausdorff measure is the counting measure.
36.11. Theorem. Let P and P be metric spaces, Hp the p-dimensional Haus-

dorff measure on P and Hp the p-dimensional Hausdorff measure on P . Let E
be a subset of P and f : E → P a β-Lipschitz mapping. Then
Hp (f (E)) ≤ β p Hp (E).

∞
Proof. For each δ > 0 and each sequence {Aj } of subsets of P with Aj ⊃ E,
j=1
diam Aj < δ we have
∞
∞

Hp (f (E), βδ) ≤ (diam f (Aj )) ≤ β
p p
(diam Aj )p .
j=1 j=1
Hence the desired inequality easily follows.

162 36. Hausdorff Measures
36.12. Lemma. Let Hk be the k-dimensional Hausdorff measure on Rn ,

K = [0, 1]k × {0}n−k . Then 0 < Hk (K) < +∞.
√
k
Proof. Given δ > 0 there is m ∈ N such that < δ. We divide K to mk cubes
√ m
1 k
Kj with edges and diameters < δ. Then
m m
mk √
! k "k
Hk (K, δ) ≤ k
(diam Kj ) = m k
= k k/2 .
j=1
m
Hence Hk (K) ≤ k k/2 .

Conversely, let λ be the “Lebesgue measure” on K. Let A ⊂ K and x ∈ A.
Then
A ⊂ B(x, diam A)
' ( ' (
⊂ x1 − diam A, x1 + diam A × · · · × xk − diam A, xk + diam A ,
thus
λA ≤ 2k (diam A)k .

∞
If Aj is a sequence of subsets of K, Aj = K, then
j=1
∞
∞

(diam Aj )k ≥ 2−k λAj ≥ 2−k λK = 2−k .
j=1 j=1
Whence taking the infimum we obtain the desired lower estimate for Hk (K).
36.13. Normalized Hausdorff Measures on Rn . Let k be a nonnegative
integer, k ≤ n and κk := Hk ([0, 1]k × 0n−k ). Then Hk /κk is a k-dimensional
measure on Rn which will be labelled as the normalized Hausdorff measure.
The constant κk is equal to the number (4/π)k/2 Γ(1 + k2 ). The computation
is not easy, see C.A. Rogers [*1970].
36.14. Remarks. 1. The hints for computation k-dimensional measures of concrete rectifiable
sets are given in Chapter 34. We will not present the (difficult) examples of nonrectifiable sets.
2. Without essential changes of proofs a similar theory to that of this chapter can be built also
for the case of spherical measures (cf. Remark 36.3.4). The computation of the corresponding
constant analogous to κk is much easier, in fact it follows immediately from Exercise 26.26.
3. In terms of the Hausdorff measure the following version of the Sard theorem can be proved:
Let G ⊂ Rk be an open set and f : G → Rn an arbitrary mapping. Let E be the set of all
points t ∈ G at which the derivative f (t) exists and vol f (t) = 0 (which means that the rank
of the matrix f (t) is less than k). Then k (f (E)) = 0.
36.15. Notes. C. Carathéodory developed in [1914] the theory of the one-dimensional (“lin-
ear”) measure in n-dimensional Euclidean spaces. In the same paper he mentions the pos-
sibility of introducing k-dimensional measures in an n-dimensional space (for an integer k).
The k-dimensional measure for arbitrary positive k > 0 on Rn was introduced by F. Hausdorff
[1919]. The interesting Theorem 36.4 is due to C. Carathodory [*1918]. The theory of Haus-
dorff measures was developed very intensively, particularly A.S. Besikovitch published a great
amount of papers devoted to this topic. From monographs on Hausdorff measures we recommend
C.A. Rogers [*1970], K. J. Falconer [*1986] and P. Mattila [*1995].
G. Surface and Curve Integrals 163
G. Surface and Curve Integrals

37. Integral Calculus in Vector Analysis
In this chapter we continue a study of curve and surface integrals and state
a change of variable formula. Moreover, we derive formulae concerning relations
between the integration over the “interior” (of a set or a surface) and the integra-
tion over the “boundary”. Later on we will see that these formulae are particular
cases of general Stokes’ Theorem of the next chapter. They are, in fact, multi-
dimensional generalizations of the famous Leibniz formula
b
f (x) dx = f (b) − f (a) .
a
The reader perhaps appreciates that we include the explanation of three-dimen-
sional (and therefore most important from the point of view of the classical
physics) situations without a deeper excursion to multilinear algebra.
Recall, that the integral calculus on k-dimensional surfaces in Rn is based on
the notion of a k-dimensional measure (34.8) which exists (Theorem 34.9) and on
k-dimensional surfaces is uniquely determined (Remark 34.31). Underline that for
a basic understanding of the topic it is not essential to know how k-dimensional
measures were constructed.
Since all theorems of this chapter are only special cases of more general results
of Chapter 38, we will not disturb the presentation by their proofs.
For the one-dimensional measure on Rn we will use the notation s, while S
will be reserved for the (n − 1)-dimensional measure on Rn .
37.1. Vector Field, Gradient, Divergence, Curl. By a vector field on a set
X we understand a mapping f of a set X into Rn .
Let g be a function of class C 1 on an open set U ⊂ Rn . Then its gradient
grad g on U is defined as the vector field x → [ ∂x ∂g
1
∂g
(x), . . . , ∂xn
(x)]. (The difference
between the gradient and the derivative for functions of class C 1 consists only in
the convention that the gradient is a vector while the derivative is a linear form.)
Let f = [f1 , . . . , fn ] be a continuously differentiable (i.e. of class C 1 ) vector
field on an open set U ⊂ Rn . Then the divergence of the field f is the function
div f on U defined as
n
∂fi
div f = .
i=1
∂xi
If f is a continuously differentiable vector field on an open set U ⊂ R2 , its curl
is the function curl f defined by the formula
∂f2 ∂f1
curl f = − .
∂x1 ∂x2
Finally, if f is a continuously differentiable vector field on an open set U ⊂ R3 ,
we introduce its curl curl f as the vector field on U defined by
' ∂f3 ∂f2 ∂f1 ∂f3 ∂f2 ∂f1 (
curl f = − , − , − .
∂x2 ∂x3 ∂x3 ∂x1 ∂x1 ∂x2
164 37. Integral Calculus in Vector Analysis
In higher-dimensional spaces the curl corresponds to the bilinear form (u, v) →

f (x)u · v − f (x)v · u.
Let V be an n-dimensional vector space with an inner product. A choice of
an orthonormal basis (u1 , . . . , un ) in V transfers the calculus in V into a calculus
in Rn . The coordinate mapping L : V → Rn assigns with each vector x ∈ V a
vector Lx ∈ Rn of its coordinates with respect to (u1 , . . . , un ) in such a way that
if x = t1 u1 + · · · + tn un , then Lx = [t1 , . . . , tn ]. Since the basis (u1 , . . . , un ) is
orthonormal, we have ti = ui ·x. Let U ⊂ V be an open set, g a real function on U
and f : U → V a vector field. Denote G = L(U ), g̃ = g ◦ L−1 and f˜ = L ◦ f ◦ L−1 .
Then f˜ is a real function on G and g̃ is a mapping of G into Rn . Of course, the
coordinates and the matrices of derivatives f˜ and g̃ depend on the choice of a
basis (u1 , . . . , un ). We can introduce grad g(x) := L−1 (grad g̃(Lx)), div f (x) :=
div f˜(Lx), and curl f (x) := curl f˜(Lx) (if n = 2) or curl f (x) := L−1 curl f (Lx)
(if n = 3). Then the operators grad, div and curl do not depend on the choice of
an orthonormal basis in V.
∂(x2 2
1 +x2 ) ∂(x2 −x1 )
37.2. Example (in R2 ). Let f (x) = [x21 + x22 , x2 − x1 ]. Then div f = ∂x1
+ ∂x2
=
∂(x2 −x1 ) ∂(x2 2
1 +x2 )
2x1 + 1 and curl f = ∂x1
− ∂x2
= −1 − 2x2 .
∂x1 ∂x2
2 ∂x3 x1
37.3. Example (in R3 ). Let f (x) = [x1 , x22 , x3 x1 ]. Then div f = ∂x1
+ ∂x2
+ ∂x3
=
∂x2 ∂x2
1 + 2x2 + x1 , curl f = [ ∂x3 x1
∂x2
− 2
∂x3
, ∂x1
∂x3
− ∂x3 x1
∂x1
, 2
∂x1
− ∂x1
∂x2
] = [0, −x3 , 0].
37.4. Exercise. Let g be a function of class 1

defined on a neighborhood of x and
u = grad g(x)
= 0. Show that the function which associates with an unit vector v the derivative
of g at x in the direction v attains its maximum at u/ |u|.
37.5. Vector Product. A vector w ∈ Rn is said to be the vector product (also

called the cross product) of vectors u1 , . . . , un−1 and denoted by u1 × · · · × un−1 ,
if for each vector v ∈ Rn we have w · v = det(v, u1 , . . . , un−1 ). Thus in particular
the i-th coordinate of the vector w is
ei · w = det(ei , u1 , . . . , un−1 ) = (−1)i+1 det(ej · um )m=1,...,n−1

j=1,...,i−1,i+1,...n .
Notice that the vector product is a binary operation if and only if n = 3. The
vector product is always perpendicular to its factors and its norm is
|u1 × · · · × un−1 | = vol(u1 , . . . , un−1 ).
37.6. Example. In R3 we have
h „ « „ « „ «i
1, 1 1, 1 1, 1
[1, 1, 1] × [1, 2, 3] = det , − det , det = [1, −2, 1].
2, 3 1, 3 1, 2
37.7. Example. In R2 , the vector “product” of the vector [1, 2] is the vector [2, −1].
37.8. Exercise. Compute in R4 the vector product [1, 0, 1, 0] × [0, 0, 0, 1] × [0, −2, 0, 0].
37.9. Orientation. Let Ω be a k-dimensional surface and x ∈ Ω. By Px =

Px (Ω) denote the set of all parametrizations of neighborhoods of x in Ω.
By a (local) orientation of Ω at x we understand a mapping, which associates

with every parametrization ϕ ∈ Px at x a positive or a negative sign so that
det(ϕ−1 −1
2 ◦ ϕ1 ) > 0 almost everywhere on a neighborhood of ϕ1 (x) whenever
ϕ1 ∈ Px and ϕ2 ∈ Px are both positive or both negative parametrizations,
and det(ϕ−1 −1
2 ◦ ϕ1 ) < 0 almost everywhere on a neighborhood of ϕ1 (x) whenever
ϕ1 ∈ Px is a positive parametrization and ϕ2 ∈ Px is a negative parametrization.
It can be proved that a local orientation is always available.
By an orientation of a surface Ω we understand its local orientation at each
point satisfying the following additional property: If ϕ is a positive parametriza-
tion at x, then it is positive also at all points of some neighborhood of x.
Although each point has exactly two local orientations, some problems can
occur when orienting a whole surface. There exist surfaces without a possibility
of a (global) orientation (the well known Möbius strip, see Example 39.7). When
orientations exist, their number is even. An orientable surface with n connected
components possess 2n orientations.
An n-dimensional surface in Rn is nothing else than an open subset of Rn .
It possesses its natural orientation in which the identical parametrizations are
positive.
0-dimensional surfaces are countable sets of isolated points, compact 0-dimen-
sional surfaces have only a finite number of points. Any oriented 0-dimensional
surface F is decomposed into sets F + and F − . At points of F + all parametriza-
tions are positive and at all points of F − are negative.
37.10. Curves. One-dimensional surfaces are called curves. Let γ ⊂ Rn be an
oriented curve. If ϕ : G → γ is a positive parametrization, then for almost every
t ∈ G we define the unit tangent vector t(x) at the point x = ϕ(t) by the formula
t(x) = u/ |u|, where u = dϕ
dt (t). Then the vector t(x) is defined up to an s-null set
and unit tangent vectors defined by means of different positive parametrizations
coincide except on an s-null set (for more details see 38.14).
The field t(x) obtained in this way determines the orientation. Namely, a
parametrization ϕ : G → γ is positive whenever

dϕ dϕ
(t) = t(ϕ(t)) (t)
dt dt
for almost all t ∈ G and negative if

dϕ dϕ
(t) = −t(ϕ(t)) (t)
dt dt
for almost all t ∈ G.

The field of unit tangent vectors will be sometimes called shortly the tangent
field .

The integral γ f · t ds is called a curve integral of the vector field f . It satisfies
the following Change of Variable Formula.
37.11. Theorem. Let γ be an oriented curve. Let G ⊂ R be an open set

and ϕ : G → γ be a positive parametrization. Then for any vector field f =
[f1 , . . . , fn ], fj ∈ L 1 (γ, s), we have

! "
f · t ds = f1 (ϕ(t))ϕ1 (t) + · · · + fn (ϕ(t))ϕn (t) dt.
ϕ(G) G
37.12. Example. Let γ = {[x, y, z] ∈ R3 : x = cos z, y = sin z, 0 < z < 2π} be a helix (see
Example 34.25) and f (x, y, z) = [0, x, 3z 2 ]. We orient γ in such a way that the unit tangent
vector would have a positive z-coordinate. Parametrizing ϕ(t) = [x(t), y(t), z(t)] = [cos t, sin t, t]
we compute
1 1
t = √ [− sin z, cos z, 1] = √ [−y, x, 1] ,
2 2
and therefore
Z Z 2π Z 2π
f · t ds = (x y + 3z 2 z ) dt = (cos t(sin t) + 3t2 ) dt = π + 8π 3 .
γ 0 0
37.13. Normal Field on (n − 1)-dimensional Surfaces. Let Γ ⊂ Rn be an

oriented (n − 1)-dimensional surface. If ϕ : G → Γ is a positive parametrization,
then for almost all t ∈ G we define the unit normal vector n(x) at the point
x = ϕ(t) by the formula
w1 × · · · × wn−1
n(x) = ,
|w1 × · · · × wn−1 |
∂ϕ
where wj = ∂t j
(t). Again, the vector n(x) is defined except on a S-null set and
using different parametrizations yields the same result outside a S-null set (for
more details see 38.14).
The field n(x) obtained in this way determines the orientation. A parametriza-
tion ϕ : G → Γ is positive provided for almost all t ∈ G we have
∂ϕ
∂ϕ ∂ϕ ∂ϕ
(t) × · · · × (t) = n(ϕ(t)) (t) × · · · × (t),
∂t1 ∂tn−1 ∂t1 ∂tn−1
and negative if the above equality holds having the opposite sign on the right-hand
side.
The field of unit normal vectors will be briefly called the normal field .

The integral Γ f · n dS is called the surface integral of the vector field f . It
satisfies the following Change of Variable Formula.
37.14. Theorem. Let Γ be an (n − 1)-dimensional oriented surface. Let
G ⊂ Rn−1 be an open set and ϕ : G → Γ a positive parametrization. Then for
any vector field f = [f1 , . . . , fn ], where fj ∈ L 1 (Γ, S), we have

∂ϕ ∂ϕ
f · n dS = f (ϕ(t)) · ( × ··· × ) dt.
ϕ(G) G ∂t1 ∂tn−1
37.15. Example. We evaluate the integral

Z
[x, y, z 2 ] · n dS,
Γ
where
Γ = {[x, y, z] ∈ R3 : z 2 = x2 + y 2 , 0 < z < 1}
and n is supposed to be oriented “outwards from the cone”
Ω := {[x, y, z] ∈ R3 : x2 + y 2 < z 2 , 0 < z < 1}
(see 37.21). We use the parametrization
ϕ(r, t) = [r cos t, r sin t, r]
on G := {[r, t] : r ∈ (0, 1) and t ∈ (0, 2π)}. Obviously ϕ(G) differs from Γ only in a set of
measure zero, hence there is no difference between integration over Γ and over ϕ(G). We have
0 1
cos t, −r sin t
ϕ (r, t) = @ sin t,

r cos t A ,
1, 0
so that
∂ϕ ∂ϕ
× = [−r cos t, −r sin t, r].
∂r ∂t
In case of a positive parametrization this vector would be a positive multiple of the unit nor-
mal vector and thus it would be directed outwards from Ω. Since it is directed inwards, the
parametrization ϕ is negative (like in 37.21 it is possible to precise “outwards” and “inwards”).
The formula will differ only in the sign. We have
Z Z
[x, y, z 2 ] · n dS = [r cos t, r sin t, r2 ] · [r cos t, r sin t, −r] dr dt
Γ
Z G
π
= (r 2 − r 3 ) dr dt = .
G 6
37.16. Example. Let Γ := {[x, y, z] ∈ R3 : x2 + y 2 = z < 1}. Let the unit normal vector
1
n(x, y, z) = ± √ [2x, 2y, −1])
4z + 1
be oriented by the choice of its sign +. Then [t, r] → [x(t, r), y(t, r), z(t, r)] := [r cos t, r sin t, r2 ],
t ∈ (0, 2π), r ∈ (0, 1), is a positive parametrization and its range differs from Γ only in a set of
measure zero. Let f (x, y, z) = [2x, 0, 0]. Then
Z Z
f · n dS = 2x det(∇y, ∇z) dt dr
Γ (0,2π)×(0,1)
Z
= 2r cos t det(∇(r sin t), ∇(r2 )) dt dr
(0,2π)×(0,1)
Z
= 4r 3 cos2 t dt dr = π.
(0,2π)×(0,1)
37.17. Surfaces with Lipschitz Boundaries. One of the most important

formula of the integral calculus on surfaces is the general Stokes’ Theorem and
its special cases. For the purpose of a formulation of these results we need to
introduce the notion of a surface with a Lipschitz boundary.
Denote by Hk+ , Hk− the halfspaces (0, ∞) × Rk−1 and (−∞, 0) × Rk−1 , respec-
tively. Further denote by i the mapping of Rk−1 onto ∂Hk− defined as
i([s1 , . . . , sk−1 ]) = [0, s1 , . . . , sk−1 ].
Let Ω ⊂ Rn be a bounded oriented k-dimensional surface. The k-boundary

of Ω is defined as Ω \ Ω. It is the topological boundary of Ω if k = n. Suppose
that the k-boundary Γ of Ω is an oriented k−1-dimensional surface. We say
that Γ is a Lipschitz k-boundary of Ω if for every point z ∈ Γ there exist an
open set G ⊂ Rk , a neighborhood U of x and a homeomorphic locally bilipschitz
mapping ϕ : G ∩ Hk− → Ω ∪ Γ such that z ∈ ϕ(∂Hk− ), ϕ(G ∩ Hk− ) = Ω ∩ U ,
ϕ(G ∩ ∂Hk− ) = Γ ∩ U and one of the following situations occurs:
(a) ϕ|G∩Hk− is a positive parametrization of Ω∩U and ϕ|G∩∂Hk− ◦i is a positive
parametrization of Γ ∩ U ;
(b) ϕ|G∩Hk− is a negative parametrization of Ω∩U and ϕ|G∩∂Hk− ◦i is a negative
parametrization of Γ ∩ U .
If k > 1 and a parametrization ϕ satisfies (b), the by a “mirror-like” modifica-
tion we get a parametrization satisfying (a).
37.18. Introduction to Curve Integral Theorem. Let γ ⊂ Rn be an ori-
ented curve with a Lipschitz 1-boundary F . As we already know, the orientation
on γ is formed by a field of unit tangent vectors t and F is a finite set consisting
from a “positive part” F + and a “negative part” F − .
The relation between orientations γ and F is given in Definition 37.17. Less
precisely but transparently: The tangent field is directed from points of the set
F − towards points of the set F + . If t(a) is a continuous extension of t to the
point a ∈ F − (warning: its existence is not guaranteed by our assumptions), then
x−a
t(a) = x→a
lim .
x∈γ
|x − a|
The result in b ∈ F + is similar, but with an opposite sign.

Under these assumptions the following result is valid.
37.19. Curve Integral Theorem. Let g be a continuously diferentiable func-
tion on a neighborhood of γ. Then

g(b) − g(a) = grad g · t ds.
b∈F + a∈F − γ
37.20. Example. Let h be a Lipschitz function on [−1, 1], h(−1) = h(1) = 0. Let γ =
{[x, y] : y = h(x), |x| < 1}. If we choose an orientation of the unit tangent vector t to γ so that
its x-coordinate is positive, then
1
t(x, y) = p [1, h (x)] .
1 + (h (x))2
R
We have to evaluate the integral γ f · t ds, where f (x, y) = [3x2 cos y, −x3 sin y]. A direct com-
putation does not seem to be very hopeful, particularly if the function h is rather complicated.
Nevertheless, since f = grad g for g = x3 cos y, by the Curve Integral Theorem we get
Z
f · t ds = g([1, 0]) − g([−1, 0]) = 2.
γ
37.21. Introduction to Gauss Theorem. Next we introduce the Gauss

Theorem which is also called the Gauss-Ostrogradski Theorem, or the Divergence
Theorem.
Let Ω ⊂ Rn be a bounded open set with a Lipschitz boundary Γ (i.e. suppose
that the conditions of 37.17 are satisfied) and consider a natural orientation on
Ω. Thus the orientation of Γ is uniquely determined and it is represented by the
field n of unit normal vectors. Roughly speaking, we can say that n(x) is the
unit vector which is perpendicular to the boundary of Ω at x and is oriented out
of Ω. Therefore n(x) is also labelled as the vector of the outer normal . By the
formulation “out” we understand that for a nonvanishing vector u ∈ Rn and a
small positive t we have x + tu ∈ Ω provided u · n(x) < 0, and x + tu ∈
/ Ω provided
u · n(x) > 0. Of course, such a situation occurs only at the points of smoothness
of Γ.
Under the above described assumptions the following result holds.
37.22. Gauss Theorem. Let f be a vector field of class C 1 on a neighborhood
of Ω. Then
f · n dS = div f dλ.
Γ Ω
37.23. Example (the ball, spherical coordinates). Let Ω = {[x, y, z] ∈ R3 : x2 + y 2 + z 2 < 1},
Γ = {[x, y, z] ∈ R3 : x2 +y 2 +z 2 = 1}. Consider the mapping given by the spherical coordinates:
ψ = [x, y, z], where
x(r, t, a) = r cos a cos t,

y(r, t, a) = r cos a sin t,
z(r, t, a) = r sin a,
and [r, t, a] ∈ H := (0, ∞) × (−π, π) × (− 21 π, 1

2
π). We have
0 1
cos a cos t, −r sin t cos a, −r cos t sin a
ψ (r, t, a) = @ cos a sin t,

r cos t cos a, −r sin t sin a A .
sin a, 0, r cos a
Then the mapping ϕ(s, t, a) = ψ(s + 1, t, a), [s, t, a] ∈ (−1, 0] × (−π, π) × (− 12 π, 12 π) satisfies
requirements of 37.17 for [x, y, z] ∈ ψ(H). (If [x0 , y0 , z0 ] is not in ψ(H), we can use mappings
[s, t, a] → ψ(s + 1, t − t0 , a − a0 ) for suitable t0 and a0 .) After verifying that det ψ = r 2 cos a > 0
and computing the outer normals we get
∂[cos a cos t, cos a sin t, sin a]

w2 := = [− cos a sin t, cos a cos t, 0],
∂t
∂[cos a cos t, cos a sin t, sin a]
w3 := = [− sin a cos t, − sin a sin t, cos a],
∂a
w2 × w3 = [cos2 a cos t, cos2 a sin t, cos a sin a],
|w2 × w3 | = cos a .
Finally
w2 × w 3
n(x, y, z) = = [cos t cos a, sin t cos a, sin a] = [x, y, z].
|w2 × w3 |
37.24. Example. (a) By means of the Gauss Theorem we will evaluate (not for the first
time) the surface measure of the sphere Γ := {(x, y, z) ∈ R3 : x2 + y 2 + z 2 = 1}. We utilize
the fact that Γ is a Lipschitz boundary of the ball Ω = {(x, y, z) ∈ R3 : x2 + y 2 + z 2 < 1}.
We compute the two-dimensional measure of the sphere by integrating the constant 1 which we
express in the form 1 = f · n, where we choose e.g. f (x, y, z) = [x, y, z]. Actually, [x, y, z] · n =
[x, y, z] · [x, y, z] = x2 + y 2 + z 2 = 1 (for a different set it would be necessary to find a different
f ). We obtain
Z Z Z Z
1 dS = [x, y, z] · n dS = div[x, y, z] dx dy dz = 3 dx dy dz = 4π.
Γ Γ Ω Ω)
(b) As an exercise evaluate

Z
4
x2 dS (result: π).
Γ 3
37.25. Example (the cube). Let Ω be a cube (0, 1)3 and Γ its boundary oriented by the
outer normal (e.g. on {0} × (0, 1)2 the unit normal vector is [−1, 0, 0]). The condition 37.17
is clearly satisfied for z lying on this side, and less obvious (but still satisfied) if z lies on an
edge or even at a vertex. Let, for example, z be the vertex [0, 0, 0]. Then a mapping ϕ with the
desired properties can be found in the form ϕ(t) ` = [−t1 +max(0, −t2 , −t3 ), −t1 +max(t2 , 0, t2 −
t3 ), −t1 + max(t3 , t3 − t2 , 0)], t ∈ (− 21 , 0] × − 12 , 12 )2 ∩ {|t2 − t3 | < 12 }.
37.26. Introduction to Green Theorem. Consider the situation described in
the Gauss theorem for the two-dimensional case. Then Γ is a curve, the orientation
of which can be expressed not only by means of the normal field, but also by means
of the tangent field. The vector n(x) is s-almost everywhere the “vector product of
the vector” t(x), i.e. (n(x), t(x)) is a positive orthonormal basis of R2 . (Roughly
speaking, t(x) circulates around Ω anti-clockwise.)
37.27. Green Theorem. Let g = [g1 , g2 ] be a continuously differentiable vector
field on a neighborhood of Ω. Then

g · t ds = curl g dx.
Γ Ω
37.28. Exercise. Compute the unit tangent vector and the unit normal vector to the unit
circle oriented as the (Lipschitz) boundary to the unit disc.
37.29. Example. Using the Green Theorem we compute the contents of the figure Ω :=
x2 y2
{(x, y) ∈ R2 : 2 + 2 < 1}. The boundary (ellipse) Γ can be (up to a set of one-dimensional
a b
measure zero) parametrized by the mapping ϕ(t) = [x(t), y(t)] := (a cos t, b sin t), t ∈ (0, 2π).
Let f be a vector field on Rn whose curl is 1, e.g. f = [0, x], f = [−y, 0], or f = [− 12 y, 12 x]. The
the Green theorem yields
Z Z Z 2π Z 2π
1 dx dy = [0, x] · t(x, y) ds = xy dt = ab cos2 t dt = πab.
Ω Γ 0 0
37.30. Exercise. Using the Green theorem evaluate the content of the figure surrounded by
the asteroide {(a cos3 t, b sin3 t) : t ∈ [0, 2π)}.
37.31. Introduction to Stokes’ Theorem. Let Γ ⊂ R3 be an oriented two-

dimensional surface with a Lipschitz k-boundary γ. Suppose that a normal field
n on Γ and a tangent field t on γ are given. Less precisely, but more transparently
we can say that the tangent field t(x) again circulates anti-clockwise provided we
observe the situation against the direction of the normal vector n(x). (Warning:
in contrast to the Green Theorem, here n means the normal field to the surface.)
Under the above stated assumptions the following theorem holds.
37.32. (Special) Stokes’ Theorem. Let f be a vector field of class C 1 on a
neighborhood of Γ. Then

f · t ds = n · curl f dS.
γ Γ
37.33. Example. Let h be p a positive even function of class 1 on [−1, 1], h(1) = 0. Let
Γ = {[x, y, z] : z = h(r)}, r = x2 + y 2 . Let the unit normal vector have the orientation for
which its z-coordinate is positive, i.e.
1 x y
n(x, y, z) = p [−h (r) , −h (r) , 1].
1 + (h (r))2 r r
R
Let us evaluate Γ g · n dS for g(x, y, z) = [−xez , −yez , 2ez ]. Since g = curl f for f (x, y, z) =
[−yez , xez , 0] and f = [−y, x, 0] on γ := {[x, y, z] ∈ R3 : r = 1, z = 0} is a Lipschitz 2-boundary
of Γ, the Stokes formula gives
Z Z Z Z 2π
g · n dS = f · t ds = [−y, x] · t(x, y) ds = (sin2 t + cos2 t) dt = 2π.
Γ γ γ 0
For the evaluation of the integral over γ we used polar coordinates: x = cos t, y = sin t, t ∈
(0, 2π).
37.44. Notes. Main formulas of the calculus of curve and surface integral were established
in 19th century. The divergence theorem was discovered by K. F. Gauss, G. Green and M. V.
Ostrogradski. The Stokes theorem is due to G. G. Stokes. See also 39.24.
38. Integration of Differential Forms
In this chapter we present the theorem which contains as special cases the
Curve integral theorem, the Gauss theorem and Stokes’ theorem of Chapter 37.
For this purpose we need more advanced tools of exterior algebra.
38.1. k-covectors and k-vectors. Let V be a n-dimensional vector space and
V∗ its dual space. The value of a linear form V ∈ V∗ at a vector u ∈ V will be
denoted by V, u.
A mapping W : Vk → R is called a k-linear form (on V) if it is linear in each
variable separately. As a special case we get linear forms for k = 1, or bilinear
forms for k = 2.
A k-linear form W on V is said to be a k-covector if W is antisymmetric in the
following sense: If π is a transposition (a permutation, which transposes exactly
one pair of elements) of the set {1, . . . , k}, then
) *
W, (uπ(1) , . . . , uπ(k) ) = − W, (u1 , . . . , uk ) .
172 38. Integration of Differential Forms
Given a k-covector W , we have

+ ,
!
k
k
"
W, a1,j uj , . . . , ak,j uj = det A W, (u1 , . . . , uk )
j=1 j=1
for any matrix A = (ai,j )ki,j=1 and any ordered k-tuple (u1 , . . . , uk ) of vectors of
V.
Dually, a k-vector on V will be defined as a k-covector on V∗ . Thus a k-vector
assigns a real number to a k-tuple of linear forms on V. In particular, a 1-covector
is the same as a linear form; a 1-vector u on V will be identified with a vector
v ∈ V, if u assigns to each linear form V ∈ V∗ the value V, v. A real number
c will be identified with a 0-form, or a 0-vector, which assigns the number c to
each 0-tuple of vectors or forms, respectively. The vector space of all k-covectors
on V will be denoted by Λk (V) and the vector space of all k-vectors on V by
Λk (V). So Λk (V) = Λk (V∗ ), Λk (V) = Λk (V∗ ), Λ1 (V) = V, Λ1 (V) = V∗ , and
Λ0 (V) = Λ0 (V) = R.
38.2. Example. A bilinear form is representable by a matrix. For the sake of simplicity
assume V = Rn , then (ai,j )i,j is a matrix of a bilinear form V if
X
V, (u, v) = ai,j ui vj
i,j
for each u, v ∈ Rn . Among bilinear forms we select 2-covectors as bilinear forms with an
antisymmetric matrix, i.e. ai,j = −aj,i (in particular ai,i = 0). An example of a matrix of a
2-covector in R3 is 0 1
0, 2, −3
@ −2, 0, 5 A.
3, −5, 0
The (canonical) inner product represented by the unit matrix is also a bilinear form but it fails
to be a 2-covector.
38.3. Exterior Product. By an exterior product of an ordered k-tuple
(V1 , . . . Vk ) of linear forms on V we mean the k-covector V1 ∧ · · · ∧ Vk defined
by the formula
V1 ∧ · · · ∧ Vk , (u1 , . . . , uk ) = det(Vi , uj )ki,j=1 , (u1 , . . . , uk ) ∈ Vk .
Dually, the exterior product of an ordered k-tuple (u1 , . . . uk ) of vectors from V

will be the k-vector u1 ∧ · · · ∧ uk defined by the formula
(V1 , . . . , Vk ), u1 ∧ · · · ∧ uk = det(Vi , uj )ki,j=1 , (V1 , . . . , Vk ) ∈ (V∗ )k .
Notice that the exterior product is invariant with respect to even permutations of
factors. An odd permutation changes (only) the sign. The exterior product van-
ishes if the factors are linearly dependent (in particular, if two factors coincide).
Now we will investigate coordinates of k-vectors and k-covectors in Rn . Let Xi
denote the i-th coordinate form [u1 , . . . , un ] → ui . Recall that in the context of
these chapters by a multiindex we understand an ordered k-tuple α = [α1 , . . . , αk ]
of indices from {1, . . . , n}, and that the set of all such multiindices is denoted by
{1, . . . , n}k . A multiindex α is called increasing if α1 < · · · < αk . The set of all
increasing multiindices from {1, . . . , n}k is denoted by I(k, n). Each k-covector
W posseses (uniquely determined) real coordinates Wα , α ∈ {1, . . . , n}k , namely
Wα = W, (eα1 , . . . , eαk ) .
Dually, each k-vector w has (uniquely determined) real coordinates wα , α ∈

{1, . . . , n}k , namely
wα = (Xα1 , . . . , Xαk ), w .
Let W be a k-covector (take into account analogous considerations for k-vectors).
From the antisymmetry it follows
Wπ(α) = −Wα
for each transposition π of indices.

In particular Wα = 0 whenever an index occurs in the multiindex at least twice.
For a complete description of a k-covector we need only coordinates corresponding
to increasing multiindices.
We have
W = Wα Xα1 ∧ · · · ∧ Xαk ,
α∈I(k,n)
so that the basis of Λk (Rn ) is {Xα1 ∧ · · · ∧ Xαk : α ∈ I(k, n)}. Dually, we have

w= wα eα1 ∧ · · · ∧ eαk ,
α∈I(k,n)
so that the basis of Λk (Rn ) is {eα1 ∧ . . . ∧ eαk : α ∈ I(k, n)}.

A similar consideration leads to a description of the basis of Λk (V) and Λk (V)
for a general n-dimensional vector space V. (It remains only to replace (e1 , . . . , en )
by a fixed basis V and (X1 , . . . , Xn ) by its dual basis.) It follows that the dimen-
k
sion of the spaces Λ!n"(V) and Λk (V) is equal to the number of the multiindices
of I(k, n) which is k .
By means of coordinates a duality pairing between k-vectors and k-covectors
can be defined: If v ∈ Λk (V) and W ∈ Λk (V), then we introduce

W, v : = Wα vα .
α∈I(k,n)
In particular, given v1 , . . . , vk ∈ V and W1 , . . . , Wk ∈ V∗ , we have
W, v1 ∧ · · · ∧ vk = W, (v1 , . . . , vk ) ,

W1 ∧ · · · ∧ Wk , v = (W1 , . . . , Wk ), v .
Notice that not every k-covector is an exterior product of linear forms and not
every k-vector is an exterior product of vectors, see Example 38.7.
38.4. Example (in R3 ).
(e1 + e2 − e3 ) ∧ (2e2 + e3 ) = 2e1 ∧ e2 + 2e2 ∧ e2 − 2e3 ∧ e2 + e1 ∧ e3 + e2 ∧ e3 − e3 ∧ e3

= 2e1 ∧ e2 + e1 ∧ e3 + (2 + 1)e2 ∧ e3 .
38.5. Example (in R3 , notice how the sign is changed under the transposition).
X2 ∧ (X1 + X3 ) ∧ (X1 − X3 ) = X2 ∧ X1 ∧ X1 − X2 ∧ X1 ∧ X3
+ X2 ∧ X3 ∧ X1 − X2 ∧ X3 ∧ X3
= X1 ∧ X2 ∧ X3 − X3 ∧ X2 ∧ X1 = 2X1 ∧ X2 ∧ X3 .
38.6. Example. In R4 we have
X1 ∧ X2 + X2 ∧ X3 + X3 ∧ X4 , 2e1 ∧ e2 − e1 ∧ e4 + e2 ∧ e4 = 2.
38.7. Example. The 2-covector W = X1 ∧ X2 + X2 ∧ X3 + X3 ∧ X4 in R4 cannot be written

as an exterior product V1 ∧ V2 of linear forms. We prove this by a contradiction: Assume that
W = V1 ∧ V2 . Let A : R4 → R2 be the linear mapping x → [V1 x, V2 x]. Since W(1,2)
= 0, we
have Ae1
= 0. Further, W(1,3) = W(1,4) = 0, so that Ae3 and Ae4 are multiples of Ae1 . This
is a contradiction, as W(3,4)
= 0.
38.8. Example. There is no chance to construct a counterexample similar to 38.7 in R3 .
Indeed, let W be a 2-covector on R3 . If W1,3 = 0, then
W = (W(1,2) X1 − W(2,3) X3 ) ∧ X2 .
If W1,3
= 0, then
W(2,3)
W =( X2 + X1 ) ∧ (W(1,3) X3 + W(1,2) X2 ).
W(1,3)
38.9. Differential Forms. A mapping ω : E → Λk (Rn ), where E ⊂ Rn ,

is called a differential k-form (or, simply a differential form) on E. We identify
differential 0-forms with functions. Any differential k-form ω on E ⊂ Rn is
representable in coordinates

ω= ωα dxα1 ∧ · · · ∧ dxαk ,
α∈I(k,n)
where dxi denotes the constant differential 1-form x → Xi and ωα are functions
on E. We say that a differential form ω is of class C if all its coordinates ωα
are of class C . Similarly we define other properties of differential forms (e.g.
measurability) using coordinates .
38.10. Example. We illustrate the calculation with differential forms by the following
example in R4 :
(dx2 + dx4 ) ∧ (x4 dx1 − x1 dx4 ) ∧ (dx2 + dx3 ) = (x4 dx2 ∧ dx1 + x4 dx4 ∧ dx1 − x1 dx2 ∧ dx4
− x1 dx4 ∧ dx4 ) ∧ (dx2 + dx3 )
= (−x4 dx1 ∧ dx2 − x4 dx1 ∧ dx4 − x1 dx2 ∧ dx4 ) ∧ (dx2 + dx3 )
= −x4 dx1 ∧ dx2 ∧ dx2 − x4 dx1 ∧ dx4 ∧ dx2 − x1 dx2 ∧ dx4 ∧ dx2
− x4 dx1 ∧ dx2 ∧ dx3 − x4 dx1 ∧ dx4 ∧ dx3 − x1 dx2 ∧ dx4 ∧ dx3
= x4 dx1 ∧ dx2 ∧ dx4 − x4 dx1 ∧ dx2 ∧ dx3 − x4 dx1 ∧ dx4 ∧ dx3 + x1 dx2 ∧ dx3 ∧ dx4 .
38.11. Differential. Let

ω= ωα dxα1 ∧ · · · ∧ dxαk−1
α∈I(k−1,n)
be a differential (k−1)-form of class C 1 on an open set U ⊂ Rn . Then its

differential dω is defined as the differential k-form on U given by the formula

n
∂ωα
dω(x) = (x) dxi ∧ dxα1 ∧ · · · ∧ dxαk−1 .
∂xi
α∈I(k−1,n) i=1
The differential of a C 1 -function f on U is the differential 1-form

n
∂f
df (x) = dxi .
i=1
∂xi
In particular, the differential of the coordinate function x → xi is dxi which

corresponds to the notation introduced above.
d(x2 dx1 − x1 sin x2 dx2 ) = dx2 ∧ dx1 − sin x2 dx1 ∧ dx2 − x1 cos x2 dx2 ∧ dx2
= −dx1 ∧ dx2 − sin x2 dx1 ∧ dx2 = (−1 − sin x2 ) dx1 ∧ dx2 .
(a) d(x1 dx1 + x1 x3 dx2 ) = dx1 ∧ dx1 + x3 dx1 ∧ dx2 + x1 dx3 ∧ dx2
= x3 dx1 ∧ dx2 − x1 dx2 ∧ dx3 .
(b) d(x1 x2 dx1 ∧ dx3 ) = x2 dx1 ∧ dx1 ∧ dx3 + x1 dx2 ∧ dx1 ∧ dx3
= −x1 dx1 ∧ dx2 ∧ dx3 .
38.14. Orientation of k-dimensional Surfaces. In the preceding chapters

we have seen that the orientation of a curve forms a vector field (of the unit
tangent vectors) on it while the orientation of an n−1-dimensional surface also
forms a vector field (now that of unit normal vectors) but in an entirely different
way. We will see that both tangent and normal fields are particular cases of a
general object. An orientation of a k-dimensional surface determines a k-vector
tangent field and the (n − k)-covector normal field on it. (In fact the normal
should also be a covector, but it is customary to understand it as a vector – the
inner product structure makes this identification possible.)
Let σ be a k-dimensional measure on Rn and Ω be a k-dimensional oriented
surface in Rn . If ϕ : G → Ω is a positive parametrization, then for almost all
t ∈ G we define the unit tangent k-vector ξ(x) ∈ Λk (Rn ) at the point x = ψ(t)
by the formula
w1 ∧ · · · ∧ wk
ξ(x) = ,
vol(w1 , . . . , wk )
∂ϕ
where wj = ∂t j
(t). Then the k-vector ξ(x) is defined σ-almost everywhere and
does not depend on the particular choice of the parametrization ϕ.
We will not use the normal n−k-covector in full generality, nevertheless for the
sake of completeness we will outline the definition: It associates with a n−k-tuple
u1 , . . . , un−k of vectors in Rn the number
vol−1 (w1 , . . . , wk ) det(u1 , . . . , un−k , w1 , . . . , wk ).
The vector space Tx generated by the vectors w1 , . . . , wk (it will be termed

the tangent space in the next chapter) is σ-almost everywhere independent of the
choice of a parametrization.
Notice that the dimension of Λk (Tx ) is 1. Consider two orientations of a surface
Ω. Suppose that ϕ1 : G1 → Ω, ϕ2 : G2 → Ω are parametrizations such that ϕ1 is
positive under the first orientation and ϕ2 is positive under the second orientation.
Let U ⊂ ϕ1 (G1 ) ∩ ϕ2 (G2 ) be a connected open (relatively in Ω) set. Let ξj be
a field of unit tangent k-vectors on U determined by the j-th orientation (by
the parametrization ϕj ). The above conducted dimension considerations show
that ξ2 (x) ∈ {ξ1 (x), −ξ1 (x)} σ-almost everywhere on U . From Theorem 35.11 it
follows that either ξ2 = ξ1 σ-almost everywhere, or ξ2 = −ξ1 σ-almost everywhere.
We can deduce that the orientation of Ω can be reconstructed from the knowl-
edge of the unit tangent k-vector field. A parametrization ϕ : G → Ω is positive
if for almost all t ∈ G
∂ϕ ∂ϕ ∂ϕ ∂ϕ
(t) ∧ · · · ∧ (t) = ξ(ϕ(t)) vol( (t), . . . , (t)),
∂t1 ∂tk ∂t1 ∂tk
and negative if the above stated equality holds with the opposite sign.
38.15. Example. Let Ω := {x ∈ R4 : x1 > 0, x3 = x1 cos x2 , x4 = x1 sin x2 }. We have to
find the unit tangent 2-vector field on Ω knowing that the parametrization ϕ : (0, ∞)×(−π, π) →
Ω,
ϕ(t1 , t2 ) = [t1 , t2 , t1 cos t2 , t1 sin t2 ]
is positive. We have
∂ϕ
w1 := (t) = [1, 0, cos t2 , sin t2 ],
∂t1
∂ϕ
w2 := (t) = [0, 1, −t1 sin t2 , t1 cos t2 ],
∂t2
and thus
vol2 (w1 , w2 ) = 1 + t21 cos2 t2 + t21 sin2 t2 + cos2 t2 + sin2 t2 + t21 (cos2 t2 + sin2 t2 )2 = 2 + 2t21
and
[1, 0, x3 /x1 , x4 /x1 ] ∧ [0, 1, −x4 , x3 ]
ξ(x) = q .
2 + 2 x21
The 2-vector ξ(x) has six coordinates, namely ξ(1,2) (x), ξ(1,3) (x), ξ(1,4) (x), ξ(2,3) (x), ξ(2,4) (x)
and ξ(3,4) (x). For example,
„ «
1, 0
det
x3 /x1 , −x4 −x4
ξ(1,3) (x) = q = q .
2 + 2 x21 2 + 2 x21
38.16. Integration of Differential Forms. Let Ω be an oriented k-dimen-

sional surface, ξ a unit tangent k-vector field and σ a k-dimensional measure on
Ω. Then the integral of a differential k-form ω is defined as

ω := ω, ξ dσ
Ω Ω
provided the right-hand side makes sense. Here, the symbol Ω represents all the
structure (Ω, orientation): Without the knowledge of ξ it would be impossible to
determine the sign of the integral.
38.17. Particular Cases. Notions of the preceding chapter can be now
revisited from the point of view of the general approach using differential forms
and k-vector fields. Let us start with trivial cases.
If k = 0, then Ω is a finite set, ξ(x) = ±1 on Ω and the integral is only a sum
of values. A differential 0-form ω is a function and

ω= ξ(x)ω(x).
Ω x∈Ω
If k = n, then Ω is an open subset of Rn , ξ = e1 ∧ · · · ∧ en (but theoretically,

the “unnatural” case ξ = −e1 ∧ · · · ∧ en should not be excluded), a differential
n-form ω is described by one coordinate, and we have

f (x) dx1 ∧ · · · ∧ dxn = f dλ.
Ω Ω
A bit more interesting is the one-dimensional case of a curve. Then ξ(x) is a

1-vector t(x), a differential 1-form is expressed by a vector field g and

g1 dx1 + · · · + gn dxn = g · t ds.
Ω Ω
Finally the case of a normal field is also included: Let k = n − 1. Then the
vector field n(x) and the (n − 1)-vector field ξ(x) are linked by the relation
ξ = n1 e2 ∧ · · · ∧ en − n2 e1 ∧ e3 ∧ · · · ∧ en + (−1)n nn e1 ∧ · · · ∧ en−1 .
A differential (n−1)-form is (again, but in a different way) represented by a vector

field g and

g1 dx2 ∧ · · · ∧ dxn − g2 dx1 ∧ dx3 ∧ · · · ∧ dxn
Ω

+ · · · + (−1)n−1 gn dx1 ∧ · · · ∧ dxn−1 = g · n dσ.
Ω
38.18. Change of Variable Formula. Let Ω be an oriented k-dimensional

surface, ξ a unit tangent k-vector field on Ω and σ a k-dimensional measure on
Ω. Let ϕ : G → Ω be a positive parametrization. Then for any differential form

ω= ωα dxα1 ∧ · · · ∧ dxαk ,
α∈I(k,n)
whose coordinates are in L 1 (Ω, σ) we have

ω= ωα ◦ ϕ dϕα1 ∧ · · · ∧ dϕαk
ϕ(G) G α∈I(k,n)

= ωα ◦ ϕ det(∇ϕα1 , . . . , ∇ϕαk ).
G α∈I(k,n)
Proof. The assertion is an obvious consequence of 34.19 and definitions.

38.19. Example. We evaluate the integral of the differential form z dx ∧ dy over the
sphere Ω = {[x, y, z] ∈ R3 : x2 + y 2 + z 2 = 1}. We will use the parametrization ϕ(t, a) =
[x(t, a), y(t, a), z(t, a)], where
x(t, a) = cos a cos t,

y(t, a) = cos a sin t,
z(t, a) = sin a,
[t, a] ∈ G := (0, 2π) × (− 12 π, 12 π) and the orientation is supposed to make ϕ positive. Recall,
that the uncovered part of the sphere has the (n−1)-dimensional measure zero. We have
dx = − sin t cos a dt − cos t sin a da,

dy = cos t cos a dt − sin t sin a da,
dz = cos a da.
Thus
Z Z
z dx ∧ dy = sin a(− sin t cos a dt − cos t sin a da) ∧ (cos t cos a dt − sin t sin a da)
Ω
ZG
= sin2 a cos a sin2 t dt ∧ da − sin2 a cos a cos2 tda ∧ dt
G
Z Z π/2
= sin2 a cos a(sin2 t + cos2 t) dt da = 2π sin2 a cos a da
G −π/2
Z 1 4
= 2π u2 du = π.
−1 3
38.20. Example. Let Ω := {x ∈ R4 : x21 + x22 = x23 + x24 = 1, x1 > 0, x3 > 0}. Consider
the parametrization ϕ(t) = [ cos t1 , sin t1 , cos t2 , sin t2 ], t ∈ G := (− π2 , π2 )2 . Suppose that the
orientation makes ϕ positive. Then
0 1
− sin t1 , 0
B cos t1 , 0 C
B
∇ϕ(t) = @ C
0, − sin t2 A
0, cos t2
so that
∂ϕ ∂ϕ
∧ (t) = sin t1 sin t2 e1 ∧ e3 − sin t1 cos t2 e1 ∧ e4
∂t1 ∂t2
− cos t1 sin t2 e2 ∧ e3 + cos t1 cos t2 e2 ∧ e4
and
` ∂ϕ ∂ϕ ´ p
vol , (t) = sin2 t1 sin2 t2 + sin2 t1 cos2 t2 + cos2 t1 sin2 t2 + cos2 t1 cos2 t2 = 1.
∂t1 ∂t2
∂ϕ ∂ϕ
(The expression under the root is the sum of squares of coordinates of the 2-vector ∂t1
∧ ∂t2
.)
It follows that
ξ(x) = x2 x4 e1 ∧ e3 − x2 x3 e1 ∧ e4 − x1 x4 e2 ∧ e3 + x1 x3 e2 ∧ e4
is a tangent 2-vector field. We can compute (taking into account the sign of ϕ only) for example
Z Z „ « Z
cos t1 , 0
x1 dx2 dx4 = cos t1 det = cos2 t1 cos t2 dt1 dt2 = π.
Ω G 0, cos t2 G
38.21. General Stokes Theorem. Let Ω ⊂ Rn be a bounded oriented k-

dimensional surface with a Lipschitz k-boundary Γ. Let ω be a C 1 differential
k−1-form on a neighborhood of Ω. Then

ω= dω.
Γ Ω
Proof. We present a proof for a differential form ω of the form
ω = ωα dxα1 ∧ · · · ∧ dxαk−1 ,
where α ∈ I(k−1, n). Using compactness of Ω and the definition of a surface

with a Lipschitz k-boundary we get open balls U (x1 , r1 ), . . . U (xm , rm ) covering
Ω, open sets Gq ⊂ Rk and bilipschitz mappings ϕq : G ∩ Hk− → Ω such that, for
each q = 1, . . . , m, we have U (xq , rq ) ∩ Ω = ϕq (Gq ∩ Hk− ), U (xq , rq ) ∩ Γ = ϕq (Gq ∩
∂Hk− ), ϕq|Gq ∩Hk− is a positive parametrization of Ω ∩ U (xq , rq ) and ϕq|Gq ∩∂Hk− ◦ i
is a positive parametrization of Γ ∩ U (xq , rq ). (Without loss of generality we
assume the case (a) of 37.17.) Let χq be Lipschitz functions positive on U (xq , rq ),

m
vanishing outside U (xq , rq ) and satisfying χq = 1 on Ω. (This is in fact a
q=1
partition of the unity, cf. 39.11. We can choose e.g. χ̃q (x) = max(0, rq − |x − xq |)

m
and χq (x) = χ̃q (x)/ χ̃i (x).) Let q be fixed, U = Uq and ϕ = ϕq . We define a
i=1
function η on Rk by the formula
⎧
⎪
⎨1 if t1 ≤ 0,
η(t) = 1 − t1 if 0 < t1 < 1,
⎪
⎩
0 if t1 ≥ 1.
Let ψ1 , . . . , ψk be Lipschitz function with a compact support on Rk satisfying
ψ1 = (χq ωα ) ◦ ϕ, ψ2 = ϕα1 , . . . , ψk = ϕαk−1 on G ∩ Hk− ,

ψ1 = 0 onHk− \ G,
ψi (t) = ψi (0, t2 , . . . , tk ), if 0 < t1 < 1.
The existence of these functions follows from McShane’s theorem 30.5. Applying
Corollary 35.3 to ηψ1 , ψ2 , . . . , ψk we obtain

0= det(∇(η ψ1 ), ∇ψ2 , . . . , ∇ψk ) dt
R
k

= ψ1 det(∇η, ∇ψ2 , . . . , ∇ψk ) dt + η det(∇ψ1 , ∇ψ2 , . . . , ∇ψk ) dt.
Rk Rk
The first integrand can differ from zero only on the strip {0 ≤ t1 ≤ 1}, otherwise
∇η = 0. Fubini’s theorem yields
1 ! ∂ψi "k
− ψ1 det(∇η, ∇ψ2 , . . . , ∇ψk ) dt = (ψ1 dt1 ) det dt2 . . . dtk
Rk ∂tj i,j=2
R 0
k
! ∂ψi "k
= ψ1 det dt2 . . . dtk
G∩∂H−k ∂tj i,j=2

= χq ω.
Γ
The second integrand vanishes on Hk− \G (where ψ1 = 0), on the strip {0 < t1 < 1}
(where the partial derivatives ∂ψi /∂t1 are zero) and for t1 ≥ 1 (where ∇η = 0).
Thus

η det(∇ψ1 , ∇ψ2 , . . . , ∇ψk ) dt = det(∇ψ1 , . . . , ∇ψk ) dt
Rk G∩Hk
−

= d(χq ω).
Ω
Summing over q = 1, . . . , m we get the desired equality.

38.22. Example. Let Ω := {x ∈ R4 : 1 < x21 + x22 = x23 + x24 < 4}, Γ := {x ∈ R4 : x21 + x22 =
x23 + x24 = 1} ∪ {x ∈ R4 : x21 + x22 = x23 + x24 = 4}. We want to verify that for a suitable
orientation, Γ is a Lipschitz 3-boundary of Ω. Using the definition, we show this for the point
[1, 0, 1, 0]. Set
ϕ(t) = [(1 − t1 ) cos t2 , (1 − t1 ) sin t2 , (1 − t1 ) cos t3 , (1 − t1 ) sin t3 ],
t ∈ G := (−1, 1) × (−π, π)2 . We have

0 1
− cos t2 , −(1 − t1 ) sin t2 , 0
B − sin t2 , (1 − t1 ) cos t2 , 0 C
B
∇ϕ(t) = @ C.
− cos t3 , 0, −(1 − t1 ) sin t3 A
− sin t3 , 0, (1 − t1 ) cos t3
∂ϕ ∂ϕ
Hence we easily compute that vol( ∂t (t), ∂t3
(t)) = 1 (similarly as in Example 38.20) and
2
` ∂ϕ ∂ϕ ∂ϕ ´ √
vol (t), (t), )(t) = 2(1 − t1 )2 .
∂t1 ∂t2 ∂t3
Let the orientations of Ω and Γ be chosen in such a way that the parametrization ϕ is positive.
We easily verify that the unit tangent 3-vector field ξ on Ω has the form
1 ` ´
ξ(x) = q x1 e2 ∧ e3 ∧ e4 − x2 e1 ∧ e3 ∧ e4 − x3 e1 ∧ e2 ∧ e4 + x4 e1 ∧ e2 ∧ e3
2 2 2 2
x1 + x2 + x3 + x4
(which corresponds to the normal field
1
n(x) = q [x1 , x2 , −x3 , −x4 ],
x21 + x22 + x23 + x24
see 38.19), and the unit tangent 2-vector field η on Γ is

` ´
η(x) = a x2 x4 e1 ∧ e3 − x2 x3 e1 ∧ e4 − x1 x4 e2 ∧ e3 + x1 x3 e2 ∧ e4 ,
where a equals −1/4 for x21 + x22 = 4 and 1 for x21 + x22 = 1.
p
38.23. Example. Let Ω := {[x, y, z] ∈ R3 : z = x2 + y 2 < 35 x + 1}, Γ := {[x, y, z] ∈
p 3
R3 : z = x2 + y 2 = 5 x + 1}. (A part of the conical surface surrounded by an ellipse .) Choose
the orientation RΩ in such a way that the z-coordinate of the normal is positive. The evaluation
of the integral Ω dy ∧ dz is simplified when applying Stokes’ formula. We use the mapping
r
15 25 5 15 25 5
ϕ(r, t) = [ + r cos t, r sin t, ( + r cos t)2 + ( r sin t)2
16 16 4 16 16 4
which is a positive parametrization; ϕ((0, 1) × (0, 2π)) covers Ω up to a set of two-dimensional
measure zero. The mapping ψ : t → ϕ(1, t), t ∈ (0, 2π) is then a positive parametrization of
Γ \ {[ 52 , 0, 52 ]}. Since z = 35 x + 1 on Γ, we have ψ3 = 35 ψ1 + 1 and dψ3 = 35 dψ1 = − 35 16
25
sin t dt.
Thus Z Z Z 2π
5 · 3 · 25 75
dy ∧ dz = y dz = − sin2 t dt = − π.
Ω Γ 0 4 · 5 · 16 64
38.24. Notes. The exterior multiplication was invented by H. Grassmann in the 19th century.
See also 39.24.
39. Integration on Manifolds
A natural generalization of k-dimensional surfaces in Rn are manifolds. The

main difference consist in the fact that a surface is considered as a subset of Rn
while in the case of manifolds we abstract from the embedding into Rn . In this
chapter we also introduce notions which we omitted in the previous chapters like
the notion of the tangent space.
However, this chapter should not be understood as an introduction to the
analysis on manifolds. It contains only a direct way of presentation integration
theory. We do not give proofs which, in principle, are mostly the same as proofs
of analogous results of the preceding chapters.
Now, the whole theory could be built in the spirit of Lipschitz mappings like in
the last chapters. We hope that the reader welcome at least once the simplicity
of a C 1 -presentation.
182 39. Integration on Manifolds
39.1. Manifolds. Let Ω be a metrizable topological space. A homeomorphic

mapping μ of an open set U ⊂ Ω into Rk is called a (k-dimensional) chart (on
Ω) provided μ(U ) is an open subset of Rk . The domain U of the chart μ will
be denoted by Uμ . A system A of charts on Ω is called a C - atlas (on Ω) if
{Uμ : μ ∈ A } is a covering of Ω and all superpositions ν ◦ μ−1 , where μ, ν ∈ A ,
are of class C . In this case Ω = (Ω, A ) is called a (k-dimensional) manifold of
class C . If the order of differentiability of a manifold is not specified, then C 1 is
tacitly understood.
Let us emphasize that our manifolds are supposed to be metrizable which is not
always the case of other authors. Omitting this assumption we get some peculiar
examples which are not important from the point of view of applications.
If the set Ω will be equipped by two different atlases A1 , A2 , we understand the
manifolds (Ω, A1 ), (Ω, A2 ) to be different. (Sometimes it is preferred to identify
manifolds whose atlases are in a sense equivalent.)
A set G ⊂ Rk is identified with the manifold (G, {id}), where id is the identity
mapping on G.
Let (Ω, A ), (Ω , A ) be two manifolds (not necessarily of the same dimension)
and U ⊂ Ω be an open set. We say that a mapping f : U → Ω is of class C
(measurable, differentiable) at a point x ∈ Ω if all superpositions ν ◦ f ◦ μ−1 ,
μ ∈ A , ν ∈ A , are of class C (measurable, differentiable) at μ(x). A set E ⊂ Ω
is called measurable if μ(E ∩ Uμ ) is λ-measurable for each μ ∈ A , and a null set
if λ(μ(E ∩ Uμ )) = 0 for each μ ∈ A . A homeomorphic mapping f is called a
diffeomorphism provided both f and f −1 are C 1 .
Notice the essential difference with the previous concept of a surface in Rn .
If Ω is a subset of Rn , then a metric and a linear structure induced on Ω gives
an easy way to a differentiation. Having a manifold, the only information about
the structure (except the topological one) is given by the atlas. Without the
knowledge of the atlas we are not able to decide whether a mapping of an open
subset of Rk to the manifold is differentiable.
39.2. Embedding. Let (Ω, A ) be a k-dimensional manifold of class C . Let
f : Ω → Rn be a diffeomorphism and f A = {μ ◦ f −1 : μ ∈ A }. The mapping f
is called an embedding of class C into Rn if (f (Ω), f A ) is again a manifold of
class C .
Most frequently we meet the identical embedding of a manifold being itself
a topological subspace of Rn . A possibility to introduce the structure of an
embedded manifold (an atlas) on a set Ω ⊂ Rn is to use coordinate charts x →
[xα1 , . . . , xαk ], where α is a multiindex of {1, . . . , n}k .
39.3. Orientation. Let k ≥ 1. A diffeomorphism ψ of an open set G ⊂ Rk
into Rk is called positive if Jψ > 0 on G, and negative if Jψ < 0 on G. We say
that (Ω, A ) is an oriented manifold , or that A is an oriented atlas on Ω, if all
superpositions ν ◦ μ−1 , where μ, ν ∈ A , are positive.
Notice that to any oriented manifold (Ω, A ) of dimension k ≥ 1 there exists a
manifold (Ω, A ) with an “opposite” orientation, where
A = {[−μ1 , μ2 , . . . , μk ] : [μ1 , . . . , μk ] ∈ A }.
We say, that a connected manifold (Ω, A ) of a dimension k ≥ 1 is orientable

if there is an oriented atlas A ⊂ A ∪ A . A disconnected manifold is orientable
if all its connected components are orientable.
0-dimensional manifolds consist of isolated points. The orientation of such a
manifold is nothing else than the assignment of a sign plus or minus with every
of its points. Let (Ω, A ) be an oriented 0-dimensional manifold. If μ ∈ A and
z ∈ Uμ , then Uμ = {z} and μ(z) = 0. If z is a positive point, then the chart μ is
positive and conversely.
39.4. Tangent Spaces and Derivative of a Mapping. Let (Ω, A ) be a
k-dimensional manifold and x ∈ Ω. Let μ ∈ A be a chart whose domain contains
x, and ϕ = μ−1 . Then the tangent space Tx to Ω at the point x is generated by
∂ϕ ∂ϕ
vectors ∂t 1
(μ(x)), . . . , ∂tk
(μ(x)). If the manifold Ω is embedded into Rn , it is not
necessary to define these vectors since partial derivatives of ϕ are elements of Rn .
There is a lack of such an interpretation for abstract manifolds. To associate a
∂ϕ
meaning with symbols ∂t j
(μ(x)), we can imagine the following construction: The
vector ∂t∂ϕ
j
(μ(x)) is represented as the linear form f → ∂(f∂t◦ϕ)
j
on the vector space
of all functions on Ω which are differentiable in x. A tedious computation shows
that such a definition of a tangent space is independent of the choice of μ.
Let (X , M ) and (Y , N ) be two manifolds (not necessarily of the same di-
mension) and f : X → Y . If x ∈ X , y ∈ Y , f (x) = y and f is differentiable at
x, then the derivative of f at x is defined as the mapping which assigns with each
vector u ∈ Tx (X ) the vector f (x)u ∈ Ty (Y ). Namely, if u = ∂t ∂ϕ
j
(μ(x)), where
∂(f ◦ϕ)
μ ∈ M and ϕ = μ−1 , we define f (x)u = ∂tj (μ(x)).
Let the k-dimensional manifold (Ω, A ) be oriented μ ∈ A and x ∈ Uμ . The ba-
sis (u1 , . . . , uk ) of the tangent space Tx (Ω) is called positive provided det(μ (x)u1 ,
. . . , μ (x)uk ) > 0, and negative if det(μ (x)u1 , . . . , μ (x)uk ) < 0. Notice that
neither of these notions depend on the particular choice of a chart.
39.5. Example. Let Ω be the sphere {x ∈ Rn : |x|2 = 1}.
(a) The structure of an oriented manifold of class ∞ can be formed on Ω by the atlas of
the coordinate charts: = {μq : q ∈ {−n, . . . , −1, 1, . . . , n}}, where
μ1 (x) = [x2 , . . . , xn ], x1 > 0,
μ−1 (x) = [−x2 , . . . , xn ], x1 < 0,
μ2 (x) = [x1 , x3 , . . . , xn ], x2 > 0,
μ−2 (x) = [−x1 , x3 , . . . , xn ], x2 < 0,
...
μn (x) = [x1 , . . . , xn−1 ], xn > 0,
μ−n (x) = [−x1 , . . . , xn−1 ], xn < 0.
(b) There is no atlas on Ω composed from a sole chart μ. Indeed, Ω is compact, μ is continuous
and thus μ(Ω) should be compact as well. However, there are no compact open sets in Rn−1 .
(c) We find the tangent space at a point x ∈ Ω, for instance by means of μn for xqsatisfying
xn > 0, using ϕ = μ−1 n . Thus, if G = {t ∈ R
n−1 : |t| < 1}, then ϕ(t) = [t , . . . , t
1 n−1 , 1 − |t|2 ],
t ∈ G, and the tangent space at the point x = ϕ(t) is generated by the vectors
−t1 x1 −tn−1 xn−1
[1, 0, . . . , 0, q ] = [1, 0, . . . , 0, − ], . . . , [0, . . . , 0, 1, q ] = [0, . . . , 0, 1, − ].
1 − |t|2 xn 1 − |t|2 xn
√
(d) Consider the mapping g : R2 → R3 defined as g(x) = [x21 , x22 , 2 x1 x2 ] and denote its
restriction to the unit circle Ω2 in R2 by f . Then f maps Ω2 to the unit sphere Ω3 in R3 .
The derivative f (x) maps a vector u ∈ Tx (Ω2 ) ⊂ R2 to the vector f (x)(u) ∈ Tf (x) (Ω3 ), and
√ √
f (x)(u) = g (x)u = u1 ∂x
∂g ∂g
(x) + u2 ∂x (x) = [2u1 x1 , 2u2 x2 , 2u1 x2 + 2u2 x1 ].
1 2
39.6. Example. Let 0 < r < R, and
p
{[x, y, z] ∈ R3 : ( x2 + y 2 − R)2 + z 2 = r 2 }
be an anuloid. We will use the parametrization ϕq (s, t) = [x, y, z], [s, t] ∈ Gq , where
x = (R + r cos s) cos t,
y = (R + r cos s) sin t,
z = r sin s,
G1 = (0, 2π) × (0, 2π),
G2 = (0, 2π) × (−π, π),
G3 = (−π, π) × (0, 2π),
G4 = (−π, π) × (−π, π).
Then the atlas = {ϕ−1 −1

1 , . . . , ϕ4 } forms the structure of an oriented manifold of class ∞
on Ω.
39.7. Example. Let Ω = ϕ(G), where G = (−1/2, 1/2) × (−2 π, 2 π) and
t t t
ϕ(s, t) = [(1 + s cos ) cos t, (1 + s cos ) sin t, s sin ].
2 2 2
Then Ω has the shape of the Möbius strip. Let be the atlas of all 1 charts on Ω. We will
prove that the manifold (Ω, ) is not orientable. We have
0 1
cos t cos 2t , − sin t − s sin t cos 2t − 2s cos t sin 2t
B C
ϕ (s, t) = @ sin t cos 2t , cos t + s cos t cos 2t − 2s sin t sin 2t A .
t s t
sin 2 , 2
cos 2
Assume that there exists an oriented atlas ⊂ . Let H = {[s, t] ∈ G : det ∇(μ ◦ ϕ)(s, t) > 0
whenever μ ∈ and ϕ(s, t) ∈ Uμ }. Then using the connectedness of G it follows that H = G
or H = ∅. Let μ ∈ , [−1, 0, 0] ∈ Uμ . Then det(μ ◦ ϕ) should have the same sign at the point
[0, −π] as at the point [0, π], which leads to a contradiction.
39.8. Differential Forms. A differential k-form on an n-dimensional manifold
Ω is defined as a mapping ω : x ∈ Ω → ω(x) ∈ Λk (Tx (Ω)). The calculation with
differential forms on manifolds is transferred into a calculation with differential
forms in Rk by means of a pullback. Let G ⊂ Rm be an open set and ϕ : G → Ω
a C 1 mapping. Let ϕ be a differential k-form on Ω. Then the differential k-form
ϕ ω which is called the pullback of ϕ on G is defined as
) *
(ϕ ω)(t), (u1 , . . . , uk ) = ω(ϕ(t)), (ϕ (t)u1 , . . . , ϕ (t)uk ) , u1 . . . , uk ∈ Rm .
The pullback of a differential form on Ω

ω(x) = ωα (x)dμα1 ∧ · · · ∧ dμαk (x)
α∈I(k,n)
is a differential form on G

ϕ ω(t) = ωα (ϕ(t))d(μα1 ◦ ϕ) ∧ · · · ∧ d(μαk ◦ ϕ) (t),
α∈I(k,n)
which can be, of course, expressed in coordinates in Rm (see Example 39.10).

We say that a differential form ω on Ω is of class C or measurable if the same

holds for its pullbacks (μ−1 ) ω, μ ∈ A .
The differential of a differential k−1-form ω is defined as a differential k-form
dω such that (μ−1 ) dω = d((μ−1 ) ω) for each chart μ ∈ A . Every C 1 differential
form has a differential (this is not entirely easy), and if Ω is an open subset of
Rn , it coincides with the differential introduced in the preceding chapter.
If a manifold Ω is embedded into Rn , then any differential form

ω= ωα dxα1 ∧ · · · ∧ dxαk
α∈I(k,n)
on a neighborhood of Ω induces a differential form ω̃ on Ω, namely

ω̃ = ωα dx̃α1 ∧ · · · ∧ dx̃αk ,
α∈I(k,n)
where x̃i are the coordinate functions x → xi on Ω. Notice that ω(x) ∈ Λk (Rn )
while ω̃(x) ∈ Λk (Tx (Ω)). The space Tx (Ω) can be understood as a subspace of Rn .
The difference is immaterial from the point of view of integration. Nevetheless,
a certain carefulness is recommended. Indeed, two different elements of Λk (Rn )
can coincide on Tx (Ω) so that a differential form on Ω can have more distinct
descriptions in coordinates (related to Rn ). This phenomenon is demonstrated
by the next example.
39.9. Example. Let Ω be the unit circle {[x, y] ∈ R2 : x2 + y 2 = 1}. Then x dx + y dy induces
the zero differential form on Ω.
q
39.10. Example. Let Ω be the conical surface {x ∈ R3 : x3 = x21 + x22 }. Let μ be the chart
which is the inverse to the mapping ϕ : G → Ω, G = (0, 1)×(0, 2π), ϕ(t) = [ϕ1 (t), ϕ2 (t), ϕ3 (t)] =
[t1 cos t2 , t1 sin t2 , t1 ], and let ω(x) = x1 dx2 ∧ dx3 be the differential form on Ω. Then
ϕ ω = ϕ1 dϕ2 ∧ dϕ3 = t1 cos t2 d(t1 sin t2 ) ∧ dt1 = (−t21 cos2 t2 )dt1 ∧ dt2 .
39.11. Partition of Unity. Let (Ω, A ) be manifold. A system {χτ }τ ∈T

of nonnegative functions of class C 1 on Ω is called a partition of unity on Ω
(subordinated to a covering {Uμ }μ∈A ) if for every τ ∈ T there is μ ∈ A such
that {χτ > 0} ⊂ Uμ and, in addition, each point x ∈ Ω has a neighborhood
V with
a finite set TV ⊂ T such that χτ = 0 on V , provided τ ∈ / TV , and χτ = 1
τ ∈TV
on V . So, χτ = 1 on Ω and this sum is “locally finite”.
τ ∈T
The partition of unity exists, cf. 39.22.
39.12. Riemannian Metric. If we want to introduce a k-dimensional measure
on a manifold, the idea of copying the definition from Rn is not the best one. An
analogy with the Change of Variable Formula of 34.19 is more straightforward. In
this case we need to define a volume of a k-tuple of tangent vectors. The definition
of a volume (if we omit the possibility of its axiomatic introduction) is based on
an inner product. If the given manifold is embedded into Rn , on each tangent
space we have to our disposal an inner product from Rn . In a general case we
need to consider an inner product on tangent spaces as an additional structure.
Let (X , M ) be an n-dimensional manifold of class C 1 and g a mapping asso-

ciating with every x ∈ X a positive definite bilinear form gx on Tx (X ). Then
we can express gx in coordinates with respect to a chart μ ∈ M in such a way
that for any vectors u, v ∈ Tx (X ) we have

n
gx (u, v) = x
gi,j ûi v̂j ,
i,j=1
where ûi = μi (x)u and v̂j = μj (x)v.

If all coordinate functions x → gi,j
x
, i, j =
1, . . . , n, are continuous, we call g a Riemannian metric on X . The structure
(X , M , g) is called a Riemannian manifold . Any manifold of class C 1 admits a
Riemannian structure; it follows easily using the partition of unity.
p
39.13. Example. If := {[x, y, z] ∈ R3 : x = r cos z, y = r sin z, where r = x2 + y 2 }
is a helix, then is a two-dimensional manifold. If we introduce a metric on T[x,y,z] by the
formula
g[x,y,z] (u, v) = P u · P v,
where P is the projection [x, y, z] → [x, y], then we get a Riemannian manifold which does not
have an isometric embedding to Rn . (The mapping P is of course locally an isometric embedding
into R2 but globaly it is not one-to-one.) Such manifolds are useful in complex analysis, the
example demonstrates that non-imbedded manifolds are not only “useless abstractions”.
39.14. Example. Let Ω be an open unit circle in R2 . Set
(u · x)(v · x)
gx (u, v) = u · in + , x ∈ Ω, u, v ∈ R2 .
|x|2
Then g is a Riemannian metric whichqgives the shape of a hemisphere to the manifold Ω.
Indeed, the mapping f : x → [x1 , x2 , x21 + x22 ] which maps to the “actual hemisphere”
f ( ) (endowed with the Euclidean inner product) preserves the inner product. Namely, for all
u, v ∈ R2 and x ∈ Ω we have
(f (x)u) · (f (x)v) = gx (u, v),
hence f is an “isometric mapping”. On the other hand, Ω is not isometric to any open subset of
R2 (it is not possible to “make the hemisphere flat”). From the geometrical point of view, the
shape of the manifold expressed by the Riemannian metric is more important than the original
“underlying space”.
39.15. k-dimensional Measures on Riemannian Manifolds. Suppose that
(X , A , g) is a Riemannian manifold, x ∈ X and (u1 , . . . , uk ) ∈ (Tx (X ))k . We
define a volume of this k-tuple of vectors similarly as in 34.10:
vol(u1 , . . . , uk ) = det(gx (ui , uj ))ki,j=1 .
This immediately introduces also the volume of a linear mapping L : Rk →
Tx (X ). Let σ be a measure on the σ-algebra of all measurable subsets of X .
We say that σ is a k-dimensional measure on X if for each μ ∈ A and each
measurable set E ⊂ Uμ we have

σE = vol ϕ (t) dt , where ϕ = μ−1 .
ϕ−1 (E)
Since the integral does not depend on a particular choice of μ, the partition of
unity leads to the existence of a “k-dimensional measure” on a k-dimensional
Riemannian manifold.
39.16. Integration of Differential Forms on Riemannian Manifolds. Let

(Ω, A , g) be a k-dimensional Riemannian manifold. We define the unit tangent
k-vector ξ(x) to Ω at a point x ∈ Ω by the formula
u1 ∧ · · · ∧ uk
ξ(x) = ,
vol(u1 , . . . , uk )
where (u1 , . . . , uk ) is a positive basis of Tx (Ω) (the definition does not depend
on the choice of the base). Now, we can introduce the integration of differential
forms by means of integration by k-dimensional measure σ on Ω similarly as we
have proceeded in 38.16: Namely, if ω is an integrable differential form on Ω, then

ω= ω, ξ dσ.
Ω Ω
39.17. Example. We evaluate the integral

Z
rx2
√ dσ,
Ω 4r 4 + 5r 2 + 1
q
where r = x21 + x22 , Ω := {x ∈ R4 : x1 = r cos x4 , x2 = r sin x4 , x3 = r 2 , x4 ∈ (0, π), r ∈
(0, 1)} and σ is a two-dimensional measure on Ω.
Let (es , et ) be the canonical basis of R2 and (e1 , . . . , e4 ) the canonical basis of R4 . The
manifold Ω will be parametrized by the mapping ϕ(s, t) = [s cos t, s sin t, s2 , t], [s, t] ∈ G :=
(0, 1) × (0, π). We introduce the structure of an oriented manifold on Ω by the atlas {ϕ−1 }.
Denote L = ϕ (s, t), w = Les ∧ Let . Then
∂ϕ ∂ϕ
w= ∧ (s, t) = [cos t, sin t, 2s, 0] ∧ [−s sin t, s cos t, 0, 1]
∂s ∂t
x1 x 2
=[ , , 2r, 0] ∧ [−x2 , x1 , 0, 1]
r r
x1 x2
= r e1 ∧ e2 + 2rx2 e1 ∧ e3 + e1 ∧ e4 − 2rx1 e2 ∧ e3 + e2 ∧ e4 + 2r e3 ∧ e4 ,
r r
and
x21 x2
|w|2 = r 2 + 4r 2 x22 +
+ 4r 2 x21 + 22 + 4r2 = 4r4 + 5r2 + 1.
r2 r
For the unit tangent k-vector we have
w
ξ= ,
|w|
the integrand is expressed as
rx2 1
√ = dx1 ∧ dx3 , ξ ,
4r 4 + 5r 2 + 1 2
and thus
Z Z
rx2 1
√ dσ = dx1 ∧ dx3
Ω 4r4 + 5r2 + 1 2 Ω
Z Z
1 2
= (cos t ds − s sin t dt) ∧ 2s ds = s2 sin t ds dt = .
2 G G 3
39.18. Integration on General Manifolds. The definition 39.16 is a logical

conclusion of the approach of the preceding chapter, where the presence of the
inner product was quite obvious. Nevertheless, for the purpose of integration of
differential forms on manifolds neither Riemannian structure nor a k-dimensional
measure are needed. Indeed, we can realize that the integral expressions given
by the Change of Variable Formula do not depend on these structures. In the
general case we can proceed as follows: Let (Ω, A ) be a k-dimensional oriented
manifold. Let ω be a measurable
differential k-form on A and E a measurable
subset of Ω. The integral E ω is defined in two steps: First, assume E ⊂ Uμ for
some μ ∈ M . Then we define

ω= (μ−1 ) ω.
E μ(E)
From the Change of Variable Formula it follows that this integral (if it makes
sense) does not depend on the choice μ.
The second step is based on the partition of unity. Let {χτ }τ ∈T be a partition
of unity on (Ω, A ).
For any measurable set E ⊂ Ω denote

I(E) = χτ ω
τ ∈T E∩{χτ >0}
(the expression I(E) does not necessarily makes sense). We say, that the integral
E
ω converges, or that the differential form ω is integrable on E if for any mea-
surable set E ⊂ Ω, the expression I(E ) makes sense and it is a finite number,
and for any sequence {Eq } of parwise disjoint measurable sets Eq ⊂ E, the series
∞

I(Eq )
q=1

converges (absolutely). If the integral E
ω converges, we set

ω = I(E).
E
Since we were careful enough, the value of such a defined integral depends neither
on the “partition” of E, nor on the partition of unity. On the other hand, it
depends on the orientation of Ω: The reverse orientation forces the converse of
the sign of the integral.
Let G ⊂ Rk be an open set and ϕ : G → Ω a diffeomorphism. We say, that
ϕ is a positive parametrization if all superpositions μ ◦ ϕ, μ ∈ A , have a positive
Jacobian. For positive parametrizations the following change of variable formula
is valid:
ω= ϕ ω
ϕ(G) G
provided either of these integrals exists.

39.19. Example. Let Ω be the set of all decreasing solution of the differential equation
(x (s))2
x (s) − x(s) = 0 satisfying the condition 0 < x(0) < 1. Let = {μ : Ω → R2 : there are a,
b ∈ R, such that a < b and μ(x) = [x(a), x(b)] for all x ∈ Ω}. Then (Ω, ) is a two-dimensional
oriented manifold and ϕ : t → et2 s+t1 , t ∈ (−∞, 0)2 is a positive parametrization of Ω. The
tangent space Tx (Ω), x = ϕ(t), is representable as a two-dimensional vector space of functions
∂ϕ ∂ϕ
generated by the functions ∂t (t) : s → et2 s+t1 and ∂t (t) : s → s et2 s+t1 . For each τ ∈ R,
1 2
let ετ be the function on Ω defined as ετ (x) = x(τ ). Then ω := dε0 ∧ dε1 is a differential form
which associates with u1 , u2 ∈ Tx (Ω) the number
„ «
u1 (0), u2 (0)
det .
u1 (1), u2 (1)
We have
ϕ ω = d(et1 ) ∧ d(et1 +t2 ) = et1 dt1 ∧ (et1 +t2 dt1 + et1 +t2 dt2 )
= e2t1 +t2 dt1 ∧ dt2 ,
so that Z Z
1
ω= e2t1 +t2 dt1 dt2 = .
Ω (−∞,0)2 2
39.20. Introduction to General Stokes’ Theorem on Manifolds. Let

(X , A ) be a k-dimensional oriented manifold and Ω ∪ Γ be a compact subset
of X . Let (Γ, B) be a (k − 1)-dimensional oriented manifold. We suppose that
for each point z ∈ Ω ∪ Γ there exist an open set G ⊂ Rk , a neighborhood U
of the point z and a homeomorphic mapping ϕ : G ∩ Hk− → Ω ∪ Γ, such that
z ∈ ϕ(∂Hk− ), ϕ(G ∩ Hk− ) = Ω ∩ U , ϕ(G ∩ ∂Hk− ) = Γ ∩ U and one of the following
cases occurs:
(a) ϕ|G∩Hk− is a positive parametrization of Ω ∩ U and ϕ|G∩∂Hk− ◦ i is a positive
(b) ϕ|G∩Hk− is a negative parametrization of Ω ∩ U and ϕ|G∩∂Hk− ◦ i is a negative
(Recall that, by the conventions of this chapter, any parametrization is a dif-
feomorphism.)
39.21. General Stokes’ Theorem on Manifolds. Let ω be a C 1 differential

(k − 1)-form on X . Then

ω= dω.
Γ Ω
39.22. Existence of the Partition of Unity. Let (Ω, ) be a k-dimensional (topological)

manifold. We introduce a temporary term of an admissible family of functions for a system
{fτ }τ ∈T of nonnegative functions on Ω which satisfies the following conditions: For every τ
from the index set there exists μ ∈ such that {fτ > 0} ⊂ Uμ , and fτ ◦ μ−1 is an infinitely
differentiable function. Further, each point x ∈ Ω has a neighborhood V with a finite set
V ⊂ such that fτ = 0 on V provided τ ∈ / V . The only requirement on partition of unity
being not satisfied by an admissible family of function is that the sum is 1. However, if {fτ }τ ∈T
is an admissible family of function whose sum S is positive, then {fτ /S}τ ∈T is a partition of
unity. Its quality depends on the quality of the atlas. If the atlas is
, or locally Lipschitz,
then also the partition of unity will have the same property.
In the next step let K ⊂ W be subsets of Ω, K compact, W open. We show that there is
an admissible family of function (even a finite one) whose sum is positive on K and vanishing
outside W . For each point a ∈ K we find μa ∈ and a radius ra > 0 so that a ∈ Uμa and
B(μa (a), 2ra ) ⊂ μa (Uμa ∩ W ). Now set
( 2 2
e1/(|μa (x)−μa (a)| −ra ) if x ∈ μ−1
a (U (μa (x), ra )),
fa (x) =
0 in remaining cases.
Then {{fa > 0} : a ∈ K} is a covering of K and taking into account the compactness of K we
can select a finite set {fa1 , . . . , fam } forming an admissible family of functions, whose sum is
positive on K. We have yet solved the existence of the partition of unity provided Ω is compact.
Recall that, by our definitions, Ω is a metrizable space. If Ω is connected, then the Topological
Lemma 39.23 yields the existence of compact sets Kq and open sets Wq (q ∈ N) such that
S
∞
Kq ⊂ Wq , Ω = Kq , and each point has a neighborhood which intersects only a finite
q=1
number of sets Wq . For each couple (Kq , Wq ) we find an admissible family of functions by the
preceding procedure. The union with respect to q will be an admissible family of functions
whose sum is positive on Ω. This solves the existence problem if Ω is connected. If Ω is not
connected, then its topological components are connected submanifolds Ω. By a simple “union”
of partitions of unity on components we obtain the partition of unity on Ω.
39.23. Topological Lemma. Let (P, ρ) be a connected locally compact metric space. Then
there exists a sequence {Kq } of compact subsets of P and a sequence {Wq } of open subsets of
S
∞
X such that X = Kq and each point P has a neighborhood intersecting only finitely many
q=1
sets Wq .
Proof. We may assume that P is not compact, for otherwise there is nothing to prove. Further,
we consider an equivalent metric in which P is bounded. Given x ∈ P , there is a radius r(x)
such that B(x, 2r(x)) is compact and B(x, 4r(x)) is not compact. We construct recursively a
sequence {Vq } of open relatively compact subsets of P . Choose x0 ∈ P and set V1 = U (x0 , 2r0 ).
Assume that V1 , . . . , Vq were already constructed. Thanks to compactness of Vq it follows that
there is a finite system {U (xj , rj )} of balls selected from {U (x, r(x)) : x ∈ V q } such that it
S
covers V q . Set Vq+1 = U (xj , 2rj ). The resulting sequence satisfies V q ⊂ Vq+1 . We prove
j
S
∞
by a contradiction that V := Vq = P . Suppose V
= P . Since P is connected, there exists
q=1
z ∈ ∂V . Set R = r(z)/3, find x ∈ U (z, R) ∩ V and q such that x ∈ Vq . Further find y ∈ Vq and
r = r(y) such that x ∈ U (y, r) and U (y, 2r) ⊂ Vq+1 . Since z ∈
/ V , it follows that ρ(z, y) ≥ 2r.
We have
2r ≤ ρ(y, z) ≤ ρ(y, x) + ρ(x, z) ≤ r + R,
thus r ≤ R. If t ∈ B(y, 4r), then
ρ(t, z) ≤ ρ(t, y) + ρ(y, z) ≤ 4r + r + R ≤ 6R,
so that B(y, 4r(y)) ⊂ B(z, 2r(z)). This is a contradiction, because the ball B(y, 4y) is not
compact and the ball B(z, 2r(z) is compact. We have proved that V = P . To finish the proof it
is enough to set K1 = V 1 , W1 = V2 , W2 = V3 , Kq = Vq \ Vq−1 for q ≥ 2 and Wq = Vq+1 \ V q−2
for q ≥ 3.
39.24. Notes. The modern theory of manifolds is based on ideas of G. F. B. Riemann.
The roots of the topics of Chapter G go back to the 19th century and are connected with
famous names of outstanding mathematicians. From an extensive bibliography we recommend
M. Berger and B. Gostiaux [*1988], L. Boček [Boč], I. Černý and J. Mařı́k [ČM II], H. Federer
[*1969], W. Fleming [*1965], O. Kowalski [Kow], L. Krump, V. Souček and J. A. Těšı́nský [KST],
F. Moran [*1988], R. Sikorski [Sik], L. Simon [*1983].
H. Vector Integration 191
H. Vector Integration
In many branches of analysis we need to integrate functions having values in

vector spaces. Throughout this chapter we consider the case when (Ω, S , μ) is
a measure space and X is a Banach space. Having a mapping f from Ω to X
(a“vector function”) we would like to define an integral Ω f dμ in a reasonable
way. In principle, there are two possibilities:
(1) to utilize real (or complex) case arising by a composition ϕ ◦ f where ϕ
varies over the set of all functionals of the dual space X ∗ ,
(2) to try to choose an appropriate definition of the Lebesgue integral suited
for the vector case.
Note that both methods are also common in other branches of analysis.
In this rather short and informative chapter we will follow both ways of defining
vector integrals. Note that basic knowledge of main topics of functional analy-
sis (like the Hahn–Banach theorem, the Riesz-Fréchet representation theorem of
bounded linear functionals on Hilbert spaces, or the notion of a reflexive space)
is necessary for a good understanding of vector integration.
It seems that the first attempt to define a vector integral of the Riemann type is
due to Graves in 1927. His definition is only a suitable modified original Riemann
definition and its main idea is explained in Exercise 43.7.
In what follows, X will denote a Banach space and (Ω, S , μ) will be a measure
space, where μ is supposed to be a complete probability measure (μΩ = 1).
First of all we concentrate on the notion of measurable functions.
40.1. Measurable Functions. A function f : Ω → X is termed
provided there exist x1 , ..., xn ∈ X and E1 , ..., En ∈ S such that
– simple
f = i xi cEi ,
– measurable if there exists a sequence {fn } of simple functions such that
lim fn (ω) = f (ω) for μ-almost all ω ∈ Ω (i.e. if fn (ω) converge to f (ω) in
the norm of the space X for μ-almost all ω ∈ Ω),
– weakly or scalarly measurable if functions ϕ ◦ f are measurable for each
continuous linear functional ϕ ∈ X ∗ .
40.2. Remarks. 1. Note that we defined measurable functions according to the char-
acterization given in Exercise 5.7. It is clear that the common definition of measurability
({ω ∈ Ω : f (ω) < α} ∈ for each α ∈ X) cannot be used in the case of vector functions.
There are other equivalent definitions of measurability of real functions like those in Exercise
3.6.a ({ω ∈ Ω : f (ω) ∈ B} ∈ for each open, or Borel set B ⊂ R, respectively) suited to
the case of vector functions. Having the last definition in mind the class of all “measurable”
functions coincides with the class of all measurable functions (and also with the class of all
weakly measurable functions according to Pettis’ theorem) provided X is separable. In a non-
separable case the sum of two “measurable” functions according to the last definition need not
be even“measurable”.
2. Assume fn are measurable, fn → f μ-almost everywhere and λ ∈ R. Prove that the
functions f1 + f2 , λf1 , f are also measurable. The same is true for weakly measurable functions.
The relationship between measurable and weakly measurable functions is de-
scribed in the next theorem.
40.3. Pettis’ Theorem. A function f : Ω → X is measurable if and only if

f is weakly measurable and there is a μ-null set E ∈ S such that f (Ω \ E) is a
separable subset of X. In particular, if X is separable the notions of measurability
and weak measurability coincide.
Proof. Suppose fk are simple functions,
μ(E) = 0 and fk → f on Ω \ E. Since
f (Ω \ E) is a subset of the closure of fk (Ω) and fk (Ω) are finite sets, it follows
that f (Ω \ E) is separable.
The proof that ϕ ◦ f is measurable provided ϕ ∈ X ∗ is easy. Indeed, the
assertion is true if f is a simple function and for the general case we pass to the
limit.
Now assume that f is weakly measurable and that the set f (Ω \ E) is separable
for E ∈ S , μE = 0. In the first step of the proof we show that the function
ω → f (ω) is measurable. To this end let {xn } ⊂ f (Ω \ E) be a dense countable
set. Using the Hahn-Banach theorem there are ϕn ∈ X ∗ , ϕn = 1 such that
ϕn (xn ) = xn . It remains to show that
f (ω) = sup |ϕn (f (ω))|

n
for each ω ∈ Ω \ E. Obviously, |ϕn (f (ω)| ≤ ϕn · f (ω) = f (ω). To prove

the reverse inequality, let ω ∈ Ω \ E, n ∈ N and ε > 0 be given. There is xn such
that f (ω) − xn < ε. Then
| f (ω) − ϕn (f (ω))| ≤ | f (ω) − xn | + | xn − ϕn (xn )|

+ |ϕn (xn ) − ϕn (f (ω))|
≤ f (ω) − xn + |ϕn (xn − f (ω))|
≤ ε + ϕn · xn − f (ω) < 2ε.
Similarly, we can prove that the functions gn : ω → f (ω) − xn are measurable.

Now fix k ∈ N, put Enk = {ω ∈ Ω : gn (ω) < k1 } (obviously Enk ∈ S ) and define

xn if ω ∈ Enk \ j<n Ejk ,

hk (ω) =
0 otherwise.
If ω ∈ Ω \ E, then f (ω) − hk (ω) < k1 . Thus, we have shown that there is a

sequence {hk } such that hk → f μ-almost everywhere and each function hk has a
countable range. Since such functions are measurable, the assertion easily follows.
To get a good understanding of vector integration we introduce examples. Be-

fore proceeding let us agree on the following notations:
– c0 is the Banach space of all real sequences x = {xn } satisfying xn → 0
equipped with the sup-norm x = maxn |xn |,
– lp , 1 ≤ p < ∞ is the space of all sequences x = {xn } for which xp :=
! p "1/p
|xn | < ∞,
n
– l∞ will denote the space of all bounded sequences x = {xn } equipped with
the norm x∞ := supn |xn |,
– l2 ([0, 1]) is the Hilbert space of all real functions f on [0, 1] vanishing off
2
a countable set such that t∈[0,1] |f (t)| < ∞, equipped with the inner

product (f, g) = t f (t)g(t).
40.4. Example. Consider X = l2 ([0, 1]) and the measure space ([0, 1], M, λ). Let {et : t ∈
[0, 1]} be the usual orthonormal base of X (et (x) = 1 for x = t, et (x) = 0 otherwise). Define
the mapping f : [0, 1] → X as f (t) = et . If ϕ ∈ X ∗ , the Riesz-Fréchet representation theorem
on Hilbert spaces implies the existence of a uniquely determined a ∈ X such that ϕ(x) = (x, a)
for every x ∈ X. Hence ϕ(f (t)) = (et , a) = a(t) and the set {t ∈ [0, 1] : (et , a)
= 0} is countable.
We can see that ϕ ◦ f = 0 almost√everywhere and therefore f is a weakly measurable function.
On the other hand, et − es = 2 for t
= s. It follows that f ([0, 1] \ E) = {et : t ∈ [0, 1] \ E}
is separable if and only if the set [0, 1] \ E is countable. Thus, there is no set E ⊂ [0, 1] of
measure zero for which f ([0, 1] \ E) would be separable and Pettis’ theorem implies that f is
not measurable. (Further examples can be found in Exercises 43.7.e and f.)
40.5. Notes. The theory of vector measures and integration was developed in an essential
way during 1930’s. Famous Theorem 40.3 appears in B.J. Pettis [1938].
41. Vector Measures
In this chapter we touch briefly on measures whose values are in a given Banach
space X. Before doing this we take notice of a convergence of series in Banach
spaces.
41.1. Absolute and Unconditional Convergence. Let {xn } be a sequence

∞
of elements of a Banach space X. We say that a (formal) series xi = xi is
i=1

n
∞
n
– convergent if there is a limit lim xi (we write xi = lim xi ),
n i=1 i=1 n i=1

∞
– absolutely convergent if xi < +∞ ,
i=1
– unconditionally convergent to x ∈ X if n xP (n) = x whenever P is a
one-to-one mapping of N onto N.
Each absolutely convergent series converges (X is a complete space !) even
unconditionally. On the other hand, every unconditionally convergent series
converges absolutely provided dim X < +∞ (Riemann’s theorem) while in in-
finite dimensional spaces this assertion is no longer true (consider the example
xn = (0, . . . , 0, n1 , 0, 0, . . . ) ∈ c0 ).
In what follows, (Ω, S , μ) will stand for a fixed measure space and X a given
Banach space.
41.2. Vector Measures. A vector-valued set function F : S → X is called an
additive (σ-additive) vector measure if F (∅) = 0 and

F ( En ) = F (En )
n n
for every finite (countable) sequence of pairwise disjoint sets En ∈ S . In case of

σ-additive measures the convergence of a series in the definition is understood in
194 42. The Bochner Integral
the sense as above. Realize that this convergence is even an unconditional one.
Any σ-additive vector measure will be shortly called a vector measure.
41.3. Examples. In all examples, (Ω, , μ) stands for the measure space ([0, 1], M, λ).
1. If X = Lp ([0, 1]), 1 ≤ p ≤ ∞, and F : E → cE for E ∈ M, then F is a σ-additive vector
measure.
2. Let L be a continuous linear operator from L1 [0, 1] into a Banach space X. If F (E) := L(cE )
for E ∈ M, then F is again a σ-additive vector measure. Indeed, a moment’s reflection shows
that
∞
[ Xn ∞
[ ∞
[
F ( Ej ) − F (Ej ) = F ( Ej ) ≤ λ( Ej ) L .
j=1 j=1 j=n+1 j=n+1
3. Let T : L∞ [0, 1] → X be a continuous linear operator, F (E) := T (cE ) for E ∈ M. Then F

is an additive vector measure which may not be σ-additive. To see an example, let T be the
Hahn-Banach extension of the functional ϕ : x → x( 21 ) from [0, 1] to L∞ [0, 1] (notice that
X = R !)
41.4. Absolute Continuity. An additive vector measure F : Ω → X is said
to be absolutely continuous with respect to a measure μ on S if for any ε > 0
there exists δ > 0 such that F (E) < ε whenever μE < δ.
41.5. Theorem (Pettis). A σ-additive vector measure F on S is absolutely
continuous with respect to a finite measure μ on S if and only if F (E) = 0
whenever μE = 0.
Proof. The necessity is obvious. For the proof of the converse we can use analo-
gous reasoning as in Exercise 8.22.b: We assume the existence of an ε > 0 and a
sequence {En } ⊂ S for which
F (En ) ≥ ε and μEn < 2−n .
To reach the contradiction consider compositions ϕ ◦ F where ϕ ∈ X ∗ . Now to

complete the proof we need “uniform” estimates with respect to ϕ and this is
more difficult than the conclusion in Exercise 8.22.b.
41.6. Exercise. Let μ be the counting measure on N and X = l2 . Show that F : E →
{n
1
cE (n)}, E ⊂ N, is a σ-additive vector measure.
41.7. Variation of Vector Measures. Let F : Ω → X be a vector measure. According to
Theorem 6.9 we define the variation of F as
˘X
n [
n
¯
|F (E)| := sup F (Ak ) : Ak ∈ , Ai ∩ Aj = ∅ for i
= j , Ak = E .
k=1 k=1
If |F (Ω)| < +∞, we say that F is a vector measure of bounded variation.

(a) Show that the variation of a vector measure is a (nonnegative) measure.
(b) Examine vector measures of the previous examples and decide whether or not they are
of bounded variation.
42. The Bochner Integral
42.1. Bochner Integral. We say that a vector function f : Ω → X is Bochner

integrable if it is measurable and there exists a sequence of simple functions {fn }
such that Ω f − fn dμ → 0.
A few remarks should be now added. Recall (cf. the proof of Pettis’ theorem, or
Exercise 42.5)
that (real) functions f − fn are measurable. Further, the above
limit lim B fn dμ exists for any B ∈ S since X is complete and the estimates
% %
% %
% fn − %
fk % ≤ fn − fk ≤ fn − f + fk − f
%
B B B Ω

hold. Note also that B ϕ dμ := xi μ(Ei ∩ B) when ϕ = xi cEi is a simple
function and that this definition does not depend on the expression of ϕ as xi cEi
which is not unique.

Moreover, if B f − fn → 0 and B f − gn → 0, B ∈ S (where fn , gn
are simple functions) then lim B fn = lim B gn (consider the sequence f1 , g1 , f2 ,
g2 , . . . ).
Now if f is Bochner
integrable, B ∈ S and {fn } is a sequence of simple func-
tions satisfying B f − fn → 0, the limit lim B hn dμ exists and is independent
of the sequence {hn }. This limit (which is an element of our Banach space X) is
called the Bochner integral of f and it is denoted by B f dμ. The fundamental
characterization of Bochner integrable functions is given in the next theorem.
42.2. Theorem (Bochner). A measurable function f : Ω → X is Bochner
integrable if and only if Ω f dμ < ∞ (i.e. exactly when the (real) function f
is Lebesgue integrable).
Proof. If f is Bochner integrable, then f is measurable (cf. Exercise 42.5). Now
the assertion follows from the estimates

f ≤ f − fn + fn

where {fn } is a sequence of simple functions satisfying f − fn → 0.
For the converse, suppose f is integrable. There are simple functions fn
tending to f μ- almost everywhere. Set

fn (ω) if fn (ω) ≤ 2 f (ω) ,
gn (ω) =
0 otherwise in Ω.
Obviously, gn are again simple functions. Moreover,
gn (ω) ≤ 2 f (ω) , f (ω) − gn (ω) → 0
for μ-almost all ω ∈ Ω. Since
f (ω) − gn (ω) ≤ f (ω) + gn (ω) ≤ 3 f (ω) ,

an appeal to the Lebesgue dominated convergence theorem 8.13 shows f − gn
→ 0.
Denoting LX1 (or, more precisely, LX1 (Ω, S , μ)) the space of all Bochner inte-
grable functions we see that f ∈ LX1 if and only if f ∈ L 1 . Basic properties of
the Bochner integral are summarized in the next theorem.
196 42. The Bochner Integral
% %
42.3. Theorem. (a) If f ∈ LX1 and E ∈ S , then % E f % ≤ E f .

(b) The space LX1 equipped with the norm g1 = Ω g dμ is complete. In
other words, identifying functions which are equal μ-almost everywhere, LX1 is a
Banach space.

(c) If F (E) := E f dμ denotes the indefinite Bochner integral, then F is a σ-
additive vector measure absolutely continuous with respect to μ. Moreover,
if
{En } is a sequence of pairwise disjoint sets from S , then the series n F (En )
converges absolutely.
Sketch of the proof. (a) Using the triangle inequality, the assertion is true for
simple functions. For the general case, pass to the appropriate limit appealing to
the Lebesgue dominated convergence theorem (which is valid even in the vector
case).
(b) The proof is the same as in the case of real functions.
(c) Without any difficulty you can show that the indefinite Bochner integral is
(finitely) additive. If E
n ∈ S are pairwise
disjoint,
then the series F (En ) is
absolutely convergent ( F (En ) ≤ En
f = S
En
f Ω
f < +∞) and
∞ ∞
%
k
% % %
%F ( En ) − F (En )% = %F ( En )% .
n=1 n=1 n=k+1
!
∞ "
The assertion now follows since lim μ En = 0 and the measure E →
k
n=k+1
E
f dμ is absolutely continuous with % to μ (Exercise 8.22.b). Indeed,
% respect
given ε > 0 there is δ > 0 such that % E f % ≤ E f < ε whenever μE < δ.
R
42.4. Remark. We have just seen that the indefinite Bochner integral E → E f dμ is a
σ-additive vector measure which is absolutely continuous with respect to μ. Moreover, it is
simply checked that this measure is of bounded variation. As in the real case, a question arises
whether each σ-additive X-valued vector measure of bounded variation which is absolutely
continuous with respect to μ can be expressed as an indefinite Bochner integral of a Bochner
integrable function. Thus there is a question whether or not the Radon-Nikodým theorem holds
for vector measures. The answer is negative. The vector measure of Example 41.3.1 is absolutely
continuous with respect to Lebesgue measure, in case of p = 1 it is of bounded variation and
still it is not the indefinite Bochner integral of any Bochner integrable function.
We say that a Banach space X has the Radon-Nikodým property (shortly, RNP) if the Radon–
Nikodým theorem holds for any X-valued vector measure. More precisely, whenever (Ω, , μ)
is a probability measure space and ν : → X is a vector measure of bounded variation which
is absolutely continuous with respect to μ, then ν is an indefinite Bochner integral of a Bochner
integrable function.
Consequently, the space L1 [0, 1] does not have the RNP. Neither do the spaces c0 , l∞ , (K)
(K infinite compact) have the RNP. On the other hand, any reflexive Banach space has the
RNP.
42.5. Exercise. If f : Ω → X is measurable, then the real function f is measurable. (In
fact, we proved this assertion in the course of the proof of Pettis’ theorem 40.3.) Prove this
assertion directly.
Hint. The assertion is obvious for simple functions. Now, if fn are simple and fn → f μ-almost
everywhere, it follows that fn → f μ-almost everywhere.
42.6. Exercise. Let f, g be measurable functions, f ≤ g μ-almost everywhere and let g
be Bochner integrable. Show that f is Bochner integrable.
42.7. Exercise. Show that the indefinite Bochner integral is a vector measure of bounded
variation.
42.8. Notes. The Bochner integral was studied by S. Bochner [1933] and N. Dunford [1935].
Today, it is used as a tool in function spaces theories, when examining evolution PDE’s and
differential equations in Banach spaces, or in some problems of the geometry of Banach spaces.
43. The Dunford and Pettis Integrals
Most of this chapter is devoted to weak integrals in Banach spaces. Next

Dunford’s lemma seems to be of a great importance.
43.1. Dunford’s Lemma. Given a vector function f : Ω → X such that
ϕ ◦ f ∈ L 1 (μ) for every ϕ ∈ X ∗ , then for any set E ∈ S there exists an element
LE ∈ X ∗∗ such that
LE (ϕ) = ϕ ◦ f dμ
E
for each ϕ ∈ X ∗ .
Proof.
Obviously, f is weakly measurable. Fix E ∈ S and define LE : ϕ →
E
ϕ ◦ f for ϕ ∈ X ∗ . Without doubt LE is a linear functional and we have to
show that LE is bounded. To this end, define the mapping T : ϕ → ϕ ◦ (f cE )
(X ∗ → L 1 (μ)). We will finish the proof by showing that T has a closed graph.
Indeed, then the closed graph theorem implies that T is bounded (the spaces X ∗
and L1 (μ) are complete !) and

|LE (ϕ)| = ϕ ◦ f = ϕ ◦ (f cE ) ≤ ϕ ◦ (f cE )1 = T ϕ1 ≤ T ϕ

E Ω
∗∗
which shows that LE ∈ X .
Now let us see why T is closed. Assume that ϕn → ϕ, T ϕn → g. By Theorem
12.4 or Exercise 10.9 we can find a subsequence {ϕnk } such that T ϕnk → g μ-
almost everywhere (i.e. ϕnk ◦(f cE ) → g μ-almost everywhere). Since ϕn ◦(f cE ) →
ϕ ◦ (f cE ) everywhere, it follows that g = ϕ ◦ (f cE ) μ-almost everywhere and we
see that T ϕ = g.
43.2. Weak Integrals. A vector function f : Ω → X is said to be Dunford
integrable (or, weakly integrable) if ϕ ◦ f ∈ L 1 (μ) for each ϕ ∈ X ∗ . The element
LE ∈ X ∗∗ whose existence is guaranteed by Dunford’s lemma is called the Dunford
integral (according to some authors also the Gelfand integral ) of f on E. Thus
LE is the Dunford integral of f if

LE (ϕ) = ϕ ◦ f dμ for each ϕ ∈ X ∗ .
E
If even LE ∈ X for each E ∈ S (or, more precisely LE ∈ εX ⊂ X ∗∗ where ε

denotes the canonical embedding of X into X ∗∗ ), then f is called Pettis integrable.
Thus PE ∈ X is the Pettis integral of f over E if

ϕ(PE ) = ϕ ◦ f dμ for each ϕ ∈ X ∗ .
E
198 43. The Dunford and Pettis Integrals

Instead of LE we will use the notation (D) E f (remember that
LE is an element
of X ∗∗ ) while PE (an element of X) will be denoted by (P ) E f .
Both integrals can be identified provided X is reflexive.
43.3. Example. Let X = c0 , (Ω, , μ) = ([0, 1], M, λ) and f : [0, 1] → c0 be defined as
f (t) = {c(0,1] (t), 2c(0,1/2] (t), 3c(0,1/3] , . . . } .

P
For each ϕ ∈ (c0 )∗ there is a sequence {αn } ⊂ l1 such that ϕ(x) = αn xn whenever x =
n
{xn } ∈ c0 . Then one has (the Lebesgue integral in consideration!)
Z 1 Z 1 ˛X ˛ XZ 1 X
˛ ˛
|ϕ ◦ f | = ˛ αn nc(0,1/n] (t)˛ dt ≤ |αn | nc(0,1/n] = |αn | < +∞.
0 0 n 0 n
It follows that f is Dunford integrable. Since

Z 1 Z 1 X X
ϕ◦f = αn nc(0,1/n] = αn ,
0 0 n n
R
we see that (D) 01 f = {1, 1, 1, , . . . }. Therefore the Dunford integral of f is an element of
P
l = (l ) = (c0 )∗∗ determined as a functional ϕ →
∞ 1 ∗ αn . Consequently, f is not Pettis
n
integrable.
Show that
Z 1/n Z 1/n Z 1/n X 1 2 n
(D) f = ϕ◦f = αi ic[0,1/i] = α1 + α2 + · · · + αn + αn+1 + . . . .
0 0 0 i
n n n
It follows that Z 1/n 1 2 n

(D) f = { , , . . . , , 1, 1, 1, . . . } ∈ l∞
0 n n n
and Z
‚ 1/n ‚
‚ ‚
‚(D) f‚ = 1.
0 ∞
43.4. Remark. Having in mind the last example, notice that the indefinite Dunford integral
Z
F : E → (D) f , E∈M
E
is not
‚ absolutely
R ‚ continuous with respect to the Lebesgue measure (λ[0, 1/n] = 1/n → 0 in spite
‚ ‚
of ‚(D) 01/n f ‚ = 1). Moreover, the vector measure F is not σ-additive.
The next theorem characterizes Pettis integrable functions among those which
are Dunford integrable.
43.5. Pettis’ Theorem. For a measurable Dunford integrable function f : Ω →
X the following assertions are equivalent:
(i) f is Pettis integrable,
(ii) the indefinite Dunford integral of f is a σ-additive vector measure,
(iii) the indefinite Dunford integral of f is absolutely continuous with respect
to μ.
The proof of this theorem makes use of deeper theorems of functional analysis and
is beyond the scope of this manuscript.
Many problems dealing with vector integration are subtle and need a deeper knowledge of
Banach spaces theory. This is also the case of the next remarks. The reader is referred, for
example, to J. Diestel and J.J. Uhl [*1977] or L. Mišı́k [*1989].
43.6. Remarks. 1. Any Bochner integrable function is also Pettis integrable and both
integrals coincide on sets of .
2. Moreover, the following theorem holds:
Let f : Ω → X be a measurable Pettis integrable function. Then f is Bochner integrable if
and only if the indefinite Pettis integral of f is a vector measure of bounded variation.
3. The space c0 is exceptional: If X does not contain a copy of c0 (i.e. there is no subspace of
X topologically and algebraically isomorphic with c0 ), then any measurable Dunford integrable
function is Pettis integrable.
4. Let f : Ω → X be a measurable function. There are xn ∈ X and En ∈ such that
P
∞
f = xn cEn almost everywhere. Then
n=1
P
f is Pettis integrable if and only if the series xn μ(En ∩ E) is unconditionally convergent
for any E ∈ ,
P
f is Bochner integrable if and only if the series xn μ(En ∩ E) is absolutely convergent for
any E ∈ .
5. Let C be a compact subset of a Banach space X. Let μ ∈ 1 (C) be a probability Radon
measure on C. (For any A ⊂ X we denote by coA the smallest closed convex subset of X
containing A. It is simply checked that RcoA equals the closure of the convex hull of A.) There
exists a unique z ∈ coC so that ϕ(z) = ϕ dμ for any ϕ ∈ X ∗ . This z is called the barycenter
C
of μ. How is this relatedR to the Pettis integral?
R The answer is simple. When defining f (x) = x
for x ∈ C then z = (P ) f dμ, or z = (P ) x dμ. Illustrate for the case X = R, C = [0, 1] and
C C
μ=λ!
6. There is a generalization of the previous example giving a criterion on the existence of the
Pettis integral.
Let K be a compact metric space, X a Banach space and μ a probability Radon measure on
K. RIf the mapping f : K → X is continuous, then there exists the Pettis integral of f and
(P ) K f dμ ∈ cof (K).
43.7. The Graves Integral. There is a straightforward analogy of Riemann integration
theory for functions having their values in a Banach space. Remember that the Riemann integral
can be defined using Darboux upper and lower sums (and upper and lower integrals), or for its
definition an original Riemann’s approach can be used. It is clear that any definition (like
Darboux’s one) making use of the ordering of the real line (and the notion of the least upper
bound) cannot be immediately carried over for a vector case. Of course, in general Banach
spaces there is no ordering. Nevertheless, in the sequel we will touch briefly analogues of both
Riemann’s and Darboux’s approaches.
Let X again be a Banach space, f a mapping from [0, 1] into X. Given a partition D :=
{0 = x0 < x1 < ... < xn = 1} of [0, 1], I(D) = {ξ = {ξi } : ξi ∈ [xi−1 , xi ]} set
X
n
Ξ(f, D, ξ) := f (ξi )(xi − xi−1 ).
i=1
The real number Ξ(f, D, ξ) is called the Riemann sum of f . Further, define the norm of a
partition D as νD := max{xi − xi−1 : i = 1, ..., n}.
(a) We will say that f is Riemann integrable provided there exists z ∈ X with the following
property: For any ε > 0 there is δ > 0 such that
Ξ(f, D, ξ) − z < ε
whenever νD < δ and ξ ∈ I(D).

If f is Riemann integrable, then the element z of the definition is uniquely determined. It
will be termed the Graves (sometimes also the Riemann–Graves) integral of f and denoted by
R
(RG) 01 f . As in the real case, a mapping f is Riemann integrable if and only if there exists
w ∈ X with the property: For any ε > 0 there is a partition D0 such that
Ξ(f, D, ξ) − w < ε
whenever a partition D is finer than D0 (i.e. D ⊂ D0 ) and ξ ∈ I(D). Of course, w is then the
Graves integral of f .
(b) Realize that the Graves integral is a (Moore-Smith) limit of a “generalized” sequence
{Ξ(f, D, ξ)} when “ordered” either by a norm or an inclusion.
(c) Any Riemann integrable X-valued function f is bounded (there exists K > 0 so that
f (t) < K whenever t ∈ [0, 1]).
In what follows, we make use of the vector integration theory for the case of the special measure
space ([0, 1], M, λ).
(d) If f : [0, 1] → X is Riemann integrable and ϕ ∈ X ∗ , then the (real) function ϕ ◦ f is
Riemann integrable. In particular, f is weakly measurable and Pettis integrable.
R
Hint. Almost all assertions are obvious, even that f is Dunford integrable and (RG) 01 =
R1 R
(D) 0 . Since f is bounded it follows that (D) I f ∈ X for any interval I ⊂ [0, 1] and one can
see that f is Pettis integrable.
(e) Define a (vector) function f : [0, 1] → l∞ [0, 1] (=the space of all bounded functions on
[0, 1] equipped with the sup -norm) as f (t) = c[0,t] . With the aid of the “Bolzano–Cauchy
condition” show that f is Riemann integrable. Now (d) implies that f is weakly measurable.
(Check the last assertion also directly: If ϕ is a continuous linear form on l∞ [0, 1], then it
is easy to see that the function t → ϕ(f (t)) is of bounded variation, and consequently it is
measurable.) On the other hand, f is not measurable (use Pettis’ theorem 40.3 and realize that
f (s) − f (t) = 1 for s
= t).
(f) Another example. Let E ⊂ [0, 1] and define gE : [0, 1] → l∞ [0, 1] as follows: Put gE (t) =
c{t} if t ∈ E and gE (t) equals zero function for t ∈
/ E. Show that gE is Riemann integrable. If
E is a nonmeasurable set, then the function t → gE (t) cannot be measurable. Accordingly,
in this case gE is not measurable (cf. Exercise 42.5 or Pettis’ theorem 40.3). State conditions
on a set E under which gE will be measurable.
(g) Any measurable Riemann integrable X-valued function f is Bochner integrable (and both
integrals equal).
Hint. Recall that a (real) function f is bounded and measurable. Now use Bochner’s charac-
terization 42.2.
(h) Given again a partition D := {0 = x0 < x1 < ... < xn = 1} of [0, 1], set
X
n
(f, D) = sup{f (s) − f (t) : s, t ∈ [xi−1 , xi ]}(xi − xi−1 ).
i=1
We say that f : [0, 1] → X is Darboux integrable if for any ε > 0 there exists δ > 0 such that
(f, D) < ε whenever D is a partition and νD < δ. Show again that instead of “ordering”
given by a norm an equivalent definition can be formulated using an “ordering” determined by
an inclusion. Prove also that any Darboux integrable function is Riemann integrable.
(i) A function f : [0, 1] → X is Darboux integrable if and only if f is bounded and continuous
in almost all points of [0, 1] (cf. 7.9.d).
Hint. Define the oscillation of a X-valued function g as
t → ωg (t) := lim sup{g(u) − g(v) : u, v ∈ (t − δ, t + δ)}.

δ→0+
First show that f is continuous at t0 if and only if ωf (t0 ) = 0. Further prove that the set
{t ∈ [0, 1] : ωf (t) ≥ α} is always closed.
Any Darboux integrable function is obviously bounded. To complete the proof check that
1
λ{t ∈ [0, 1] : ωf (t) ≥ }=0
n
for any n.
For the converse, fix ε > 0 and choose an open set G ⊂ [0, 1] such that λG sup f < ε and f is
continuous at all points of the compact set K := [0, 1]\G. For each t ∈ K find the greatest δt > 0
for which f (s) − f (t) ≤ 12 ε for any s ∈ [0, 1] ∩ (t − δt , t + δt ). A routine compactness argument
establishes the existence of δ > 0 with the property: If D := {0 = x0 < x1 < ... < xn = 1}
is a partition of [0, 1] and νD < δ, then each interval [xi−1 , xi ] is either contained in G or, it
satisfies f (s) − f (t) ≤ ε for any couple s, t ∈ [xi−1 , xi ]. Having such a partition D, we get
(f, D) ≤ 3ε.
(j) It follows from the above considerations that any Darboux integrable function is mea-
surable (use Pettis’ theorem 40.3 and the fact that a continuous image of a separable space is
separable), and therefore with the aid of Bochner’s theorem 42.2 it is also Bochner integrable.
(k) Each bounded function which is continuous at almost all points is Riemann integrable.
When X is a general Banach space, the converse may not be the‚ case.
‚
Choose E = Q in (f). Then gQ is Riemann integrable while ‚gQ ‚ is the Dirichlet function of
Example 7.5 which is nowhere continuous. Nor is gQ continuous at any point of [0, 1] (remember
that the norm is a continuous function!).
(l) We can see that the classes of Darboux and Riemann integrable functions can differ. It
seems that there is no known reasonable characterization of Banach spaces where these classes
coincide.
43.8. Exercise. Let X = c0 , (Ω, , μ) = ([0, 1], M, λ) and define
f (t) = {c(0,1] (t), 2c(0,1/2] (t), 3c(0,1/3] (t), ...},

g(t) = {c(1/2,1] (t), 2c(1/3,1/2] (t), 3c(1/4,1/3] (t), ...},
∞
X
h(t) = { nc( 1 ,1] (t), 0, 0, ...}.
n+1 n
n=1
Show that:
(a) g is Pettis integrable ,
(b) f is Dunford but not Pettis integrable,
(c) h is not even Dunford integrable but it is measurable,
(d) f (t) = g(t) = h(t) for any t ∈ [0, 1] .
43.9. Exercise. Define the mapping f from the interval (0, 1) (Lebesgue measure in consid-
eration) into the Hilbert space l2 as
1 1 1
f : x → ( , , , ...) , x ∈ (0, 1).
x+1 x+2 x+3
R1
Show that (P ) 0 f = {log(1 + 1
)} .
n n
43.10. Exercise. Consider again the measure space ([0, 1], M, λ). Let en := {0, ..., 0, 1, 0, ...}
e0 = {0, 0, ...}.
(a) Put f (rn ) = en and f = e0 otherwise (here {rn } is a sequence of all rational numbers
of [0, 1]). Show that f : [0, 1] → c0 is measurable and Bochner integrable (f is even Riemann
integrable).
(b) Examine the measurability and integrability of the mapping f of (a) replacing the space
c0 by lp , 1 ≤ p < ∞.
(c) Let α ∈ R and define
h(t) = {nα c(0, 1 ] (t)}n whenever t ∈ [0, 1].

n
If X is one of the Banach spaces c0 , 1 ≤ p ≤ ∞ show that h is measurable. There is no

lp ,
difficulty to prove that h is Dunford integrable if X = c0 and it is Dunford integrable for X = l∞
if and only if α ≤ 1. Further, h is Pettis integrable if and only if it is Bochner integrable, and
this is the case exactly when α < 1. For other cases consult I. Chitescu [1990].
43.11. Notes. Fundamental properties of the Pettis integral appeared in B.J. Pettis [1938],
also N. Dunford [1936] studied this integral. Another weak integral (for functions having values
in duals of Banach spaces) was introduced by I.M. Gelfand in [1936]. Nowadays, the Pettis
integral plays an important role in many branches of functional analysis.
L.M. Graves gave the Riemann-type definition for X-valued mappings on [0, 1] in [1927].
Historical comments on vector integration can be found in T.H. Hildebrandt [1953], or J. Di-
estel and J. J. Uhl [*1977].
Appendix on Topology 203
Appendix on Topology
In this appendix, we mention briefly some topological notions used in the man-
uscript which may not be familiar to the reader.
Let us remind that by a topology we always mean a family τ of subsets of a set
X, designated as open or τ -open sets, which has the following properties:
(a) τ contains ∅ and X;
A ∩ B ∈ τ for any A, B ∈ τ ;
(b)
(c) Aα ∈ τ if Aα ∈ τ .
α
Complements of open sets are called closed sets.
A family B ⊂ τ of open sets is a base for the topology τ if every open set can
be expressed as a union of members of B.

Since a topology is often defined with
help of a base B by the formula τ = { {B : B ∈ Z } : Z ⊂ B}, it is useful to
know when a family of sets determines a topology (in this way). The answer is
as follows:
Proposition.

A collection of sets B is a base for a topology on X if and only
if X = B, and if for any z ∈ B1 ∩ B2 (B1 , B2 ∈ B) there is B ∈ B with
z ∈ B ⊂ B1 ∩ B2 .
An important example is the topology of a metric space formed by the family
of all open sets. A base for this topology is, for instance, the set of all open balls.
The discrete topology on X consists of all subsets of X and the singletons form
a base for this topology.
General topological spaces may enjoy a very complicated structure. In whole
of this manuscript we assume that all spaces are Hausdorff: (X, τ ) is a Haus-
dorff space whenever x, y are distinct points of X, there exist disjoint open sets
Gx , Gy ∈ τ with x ∈ Gx , y ∈ Gy .
A neighborhood of a point z ∈ X is every set whose interior contains z. The
interior of a set M is defined as the largest open set contained in M . It is the
union of all open sets contained in M .
The closure A of a set A is the smallest closed set containing A (it exists!). A
set E is dense in X if E = X and nowhere dense if the interior of its closure is
empty.
A topological space is said to be separable if it contains a countable dense
subset. A metric space X is separable if and only if it has a countable base for
its topology, and this is the case exactly when X has the Lindelöf property: Any
open cover of X contains a countable subcover.
If X, Y are topological spaces, a mapping f : X → Y is continuous if the
pre-images of open sets in Y are open in X. In a usual way, we can define the
continuity at a point. A mapping is continuous if and only if it is continuous at
each point.
A function f is continuous on X if and only if the level sets
{x ∈ X : f (x) > α} and {x ∈ X : f (x) < α}

204 Appendix on Topology
are open for every α ∈ R. A function f is said to be lower semicontinuous if the

set {x ∈ X : f (x) > α} is open for every α ∈ R.
A one-to-one mapping f : X → Y is called a homeomorphism if it is continuous
and the inverse mapping f −1 : f (X) → X is also continuous.
A base for the topology of the Cartesian product of topological spaces X1 , X2
is formed by the collection of all sets of the form G1 × G2 , where Gj are open in
Xj .
A topological space is normal if every pair of disjoint closed sets can be sep-
arated by (disjoint) open sets. The normal topological spaces are exactly the
spaces where Urysohn’s lemma and Tietze’s extension theorem hold.
Urysohn’s Lemma for Normal Spaces. If F1 and F2 are disjoint closed sub-
sets of a normal topological space X, then there exists a continuous function f on
X such that
0 ≤ f ≤ 1, f = 0 on F1 , f = 1 on F2 .
Tietze’s Extension Theorem. If f is a continuous function on a closed subset

Z of a normal topological space X, then there exists a continuous function F on
X such that
f = F on Z and sup |F | = sup |f | .
X Z
Any metric space is normal.

An interesting and from the point of view of the measure theory important class
of topological spaces is formed by locally compact spaces, i.e. spaces in which every
point has a compact neighborhood. A set is compact if every its open cover (cover
by open sets) contains a finite subcover. The locally compact spaces fails to be
normal but for them a version of Urysohn’s lemma holds.
Urysohn’s Lemma for Locally Compact Spaces. If K is a compact set and
U an open subset of a locally compact space X, K ⊂ U ⊂ X, then there exists a
continuous function f and a compact set L with
K ⊂ L ⊂ U, 0 ≤ f ≤ 1, f = 1 on K, f = 0 on X \ L.
Every locally compact space with a countable base is metrizable. A finite

Cartesian product of locally compact spaces is a locally compact space.
Let C (X) be the space of all continuous (real- or complex-valued) functions on
a compact set X. If
f := max{|f (t)| : t ∈ X} for f ∈ C (X) ,
then C (X) equipped with this norm is a Banach space.

The convergence in the space C (X) is the uniform convergence. The point-
wise convergence of special sequences of continuous functions can guarantee the
convergence in C (X) as following Dini’s theorem shows.
Appendix on Topology 205
Dini’s Theorem. If {fn } is a monotone sequence of continuous functions on

a compact space X which converges pointwise to a continuous function, then the
convergence of {fn } is uniform on X.
Let A ⊂ C (X). We say that

– A is an algebra if f g ∈ A for f, g ∈ A ;
– A is a lattice if max(f, g), min(f, g) ∈ A whenever f, g ∈ A ;
– A separates points of X if for any x, y ∈ X , x = y there is ϕ ∈ A such
that ϕ(x) = ϕ(y).
The following theorem is useful in many branches of modern analysis.
Stone-Weierstrass Theorem. Let X be a compact space and A a linear sub-
space of C (X). If A is an algebra or a lattice, if A separates points of X and
contains the constant functions, then A is dense in C (X).
Let P be a metric compact space. Then there exists a countable base {Vn } for
the topology on P . Put fn (x) = dist(x, P \Vn ) and consider the algebra generated
by {fn }. The Stone-Weierstrass theorem yields the following proposition.
Theorem. If P is a metric compact space, then the space C (P ) is separable.
206 References
References
Czech Books and Lecture Notes
[Boč] Tenzorový počet, SNTL Praha, 1976.
[ČM I] Integrálnı́ počet I, lecture notes, SPN Praha 1960.

[ČM II] Integrálnı́ počet II, lecture notes, SPN Praha 1961.
[FM] Matematická analýza II. Diferenciálnı́ počet funkcı́ vı́ce proměnných, lecture notes,
SPN Praha 1975.
[Kow] Základy matematické analýzy na varietách, lecture notes, UK Praha 1975.
[KNV] Teorie potenciálu III, lecture notes, SPN Praha 1976.
[KST] Matematická analýza na varietách, Karolinum, Praha 1998.
[KS] Integrálnı́ transformace, lecture notes, SPN Praha 1969.
[L-Př] Přı́klady z matematické analýzy I. Přı́klady k teorii Lebesgueova integrálu, lecture

notes, SPN Praha 1968 (1972, 1984).
[L-T] Teorie mı́ry a integrálu I, lecture notes, SPN Praha 1972 (1974, 1980).
[Zap] Zápisky z funkcionálnı́ analýzy, Karolinum, Praha 1998 (2002, 2003).
[Uvod] Úvod do funkcionálnı́ analýzy, Karolinum, Praha 2005.
[Pr] Problémy z matematické analýzy, lecture notes, SPN Praha 1972 (1974, 1977, 1982).
[LM] Mı́ra a integrál, lecture notes, Univerzita Karlova, Praha 1993.
[Mar] Matematická analýza čtená podruhé, Academia Praha 1976.
[NV] Přı́klady z matematické analýzy. Mı́ra a integrál, lecture notes, UK Praha 1982.
[Ru] Analýza v reálném a komplexnı́m oboru, Academia Praha 1977.
[Sik] Diferenciálnı́ a integrálnı́ počet. Funkce vı́ce proměnných, Academia Praha 1973.
[SJ] Funkcionálnı́ analýza. Nelineárnı́ úlohy, lecture notes, SPN Praha 1986.
Other books
[*1981] Principles of real analysis, North-Holland.
[*1966] A first course in integration, Holt, Reinhart and Winston.

References 207
[*1964] Elements of abstract harmonic analysis, Academic Press.
[*1932] Théorie des opérations linéaires, Warszava.
[*1966] The elements of integration, Wiley.
[*1990] Maß- und Integrationstheorie, Walter de Gruyter.
[*1987] Maß- und Integrationstheorie, Springer-Verlag.
[*1965] Measure and integration, Macmillan.
[*1988] Differential geometry: Manifolds, Curves, and Surfaces, Springer-Verlag.
[*1983] Theory of charges, Academic Press.
[*1968] Convergence of probability measures, John Wiley 1968, Russian translation 1977.
[*1979] Probability and measure, Wiley.
[*1898] Leçons sur la théorie des fonctions, Gauthier-Villars, Paris.
[*1959] Real Analysis, Van Nostrand.
[*1952] Intégration, Herman et Cie, Paris 1952, 2nd edition 1965.
[*1978] Differentiation of real functions, Springer-Verlag.
[*1918] Vorlesungen über reelle Funktionen, Teubner Leipzig 1918, 2nd edition 1927, 3rd
edition Chelsea 1948.
[*1821] Cours d’analyse de l’École Royale Polytechnique, Paris.
[*1969] Lectures on analysis, Vol. I, W.A. Benjamin.
[*1980] Measure theory, Birkhäuser.
[*1985] Integration theory, Vol. 1: Measure and integral, John Wiley & Sons.
[*1977] Applied nonstandard analysis, Wiley-Interscience.
[*1985] Nonlinear functional analysis, Springer–Verlag.
[*1977] Vector measures, AMS.

208 References
[*1994] Measure theory, Springer-Verlag.
[*2004] Lectures on Nonlinear Analysis, Vydavatelský servis, Plzeň.
[*1989] Real analysis and probability, Wadsworth&Brooks/Cole.
[*1990] Measure, topology, and fractal geometry, Springer–Verlag.
[*1992] Measure theory and fine properties of functions, CRC Press, Boca Raton.
[*1985] The geometry of fractal sets, Cambridge University Press, Cambridge.
[*1969] Geometric measure theory, Springer–Verlag. (Second edition Springer 1996).
[*1965] Functions of several variables, Addison–Wesley.
[*1981] Maß- und Integrationstheorie, B.G. Teubner-Verlag, Stuttgart.
[*1984] Real analysis, John Wiley.

[*1995] A course in abstract harmonic analysis, CRC Press.
[*1995] Degree theory in analysis and applications., The Clarendon Press, Oxford University
Press, New York.
[*1991] Fundamentals of real analysis, Marcel Dekker.
[*1822] Théorie analytique de la chaleur, F. Didot, Paris.
[*1974] Topological Riesz spaces and measure, Cambridge University Press.
[*1973] Spectral analysis of nonlinear operators, Lecture Notes in Math. 346, Springer–Verlag
1973.
[*1995] Modern Real Analysis, PWS Publishing Company, Boston.
[*1994] The integrals of Lebesgue, Denjoy, Perron, and Henstock, American Mathematical
Society, Graduate studies in mathematics, vol. 4.
[*1975] Differentiation of integrals in Rn , Lecture Notes in Math. 481, Springer–Verlag.
[*1921] Theorie der reellen Funktionen, ”I. Band”, Julius Springer, Berlin.
[*1948] Set functions, Univ. New Mexico Press.

References 209
[*1950] Measure theory, Van Nostrand 1950, 1966, Springer 1974.
[*1970] Lebesgue’s theory of integration (Its origins and development), Wisconsin Press.
[*1988] Lectures on the theory of integration, World Scientific Publishing, Singapore.

[*1991] The general theory of integration, Clarendon Press, Oxford.
[*1963] Abstract harmonic analysis I, Academic Press 1963 (Russian translation 1975).
[*1970] Abstract harmonic analysis II, Academic Press 1970 (Russian translation 1975).
[*1965] Real and abstract analysis, Springer Verlag.
[*1975] Analysis in Euclidean spaces, Prentice-Hall Inc.
[*1966] Topological vector spaces and distributions I, Addison Wesley.
[*1978] Measure and integral, Academic Press.
[*1933] Grundbegriffe der Wahrscheinlichkeitsrechnung [Foundations of the theory of prob-

ability], Springer-Verlag 1933, Chelsea 1950.
[*1980] Nichtabsolut konvergente Integrale, B.S.B.G. Teubner, Leipzig.
[*1993] Real and functional analysis, Springer-Verlag (3rd edition).
[*1904] Leçons sur l’intégration et la recherche des fonctions primitives, Paris, 2nd edition
1928.
[*1989] Lanzhou lectures on Henstock integration, World Scientific Publishing Co..
[*1988] An introduction to the theory of real functions, John Wiley & Sons.
[*1986] Fine topology methods in real analysis and potential theory, Lecture Notes in Math.
1189, Springer–Verlag.
[*1982] The prehistory of the theory of distributions, Springer–Verlag.
[*1915] Integral i trigonometričeskie rjady, Moskva 1915, 2nd edition 1950.
[*1986] Lecture notes on geometric measure theory, Universidad de Extremadura.

[*1995] Geometry of sets and measures in Euclidean spaces. Fractals and rectifiability, Cam-
bridge University Press, Cambridge.
[*1992] Analyse (Fondaments, techniques, évolution), DeBoeck Université, Bruxelles.

210 References
[*1978] The Bochner integral, Birkhäuser.
[*1907] Diophantische Approximationen, Teubner, Leipzig.
[*1989] Funkcionálna analýza, Alfa.
[*1988] Geometric measure theory, Academic Press.
[*1986] Real and functional analysis, Part A and B, Plenum Press.
[*1971] Measure and integration, Addison-Wesley.
[*1965] The Haar integral, Van Nostrand.
[*1981] Miera a integrál, Veda.
[*1971] Measure and category, Springer-Verlag 1971, 1980, Moskva 1974.
[*1978] Introduction to probability and measure, Springer-Verlag.
[*1989] Analysis now, Springer-Verlag.
[*1977] Integrals and measures, Marcel Dekker, New York.
[*1987] Measure theory and integration, John Wiley & Sons. (Second revised edition Marcel
Dekker, Inc., New York, 2004).
[*1992] Téoria miery, Veda.
[*1970] Hausdorff measures, Cambridge University Press.
[*1982] A second course on real functions, Cambridge University Press.
[*1974] Real and complex analysis, McGraw-Hill (2nd ed.).
[*1968] Real analysis, Macmillan.
[*1937] Theory of the integral, Stechert 1937.
[*1992] Generalized ordinary differential equations, World Scientific, Singapore.
[*2005] Topics in Banach space integration., World Scientific Publishing Co. Pte. Ltd., Hack-
ensack.
References 211
[*1969] Nonlinear functional analysis, Gordon and Breach, New York.
[*1968] Integrals and operators, McGraw-Hill.
[*1983] Lectures on geometric measure theory, Proc. of the Centre for mathematical analysis,
Australian National University, vol.3.
[*1983] Primer on modern analysis, Springer-Verlag.
[*1984] An introduction to classical real analysis, Wadsworth International.
[*1987] Matematická analýza funkcı́ reálnej premennej, Alfa.
[*1965] General theory of functions and integration, Blaisdell.
[*1970] Topology and measure, Lecture Notes in Math. 133, Springer-Verlag.
[*1988] Real variables, Addison-Wesley.
[*1905] Sul problema della misura dei gruppi di punta di una retta, Bologna.
[*1985] The Banach-Tarski paradox, Cambridge University Press.
[*1940] L’Intégration dans les Groupes Topologiques et ses Applications, Hermann et Cie,
Paris 1940, 2nd edition 1965.
[*1973] Lebesgue integration and measure, Cambridge University Press.
[*1977] Measure and integral, Marcel Dekker.
[*1969] Lectures on measure and integration, Van Nostrand.
[*1962] Lebesgue integration, Holt,Rinehart and Winston.
[*1967] Integration, North Holland.
Papers
[1922] Sur les opérations dans les ensembles abstraits et leurs applications aux équations
intégrales, Fund. Math. 3, 133-181.
[1923] Sur le problème de la mesure, Fund. Math. 4, 7-33.
[1924] Sur un theorème de M. Vitali, Fund. Math. 5, 130-136.
[1925] Sur les lignes rectifiables et les surfaces dont l’aire est finie, Fund. Math. 7, 225-237.
212 References
[1924] Sur la décomposition des ensembles de points en parties respectivement congruentes,

Fund. Math. 6, 244-277.
[1957] Sur l’équivalence des théories de l’intégration selon N. Bourbaki et selon M.H. Stone,
Bull. Soc. math. France 85, 51-75.
[1945] A general form of the covering principle and relative differentiation of additive func-
tions, Proc. Cambridge Philos. Soc. 41, 103-110.
[1946] A general form of the covering principle and relative differentiation of additive func-
tions, Proc. Cambridge Philos. Soc. 42, 1-10.
[1933] Integration von Funktionen deren Werte die Elemente eines Vectorraumes sind,
Fund. Math. 20, 262-276.
[1895] Sur quelques points de la théorie des fonctions, Ann. Ecole Normale Sup. 12, 9-55.
[1963] A new treatment of the Haar integral, Michigan Math. J. 10, 365-373.
[1914] Über das lineare Mass von Punktmengeneine Verallgemeinerung des Längenbegriffs,
Nach. Ges. Wiss. Göttingen, 404-426.
[1940] Sur la mesure de Haar, C. R. Acad. Sci. Paris 211, 759-762.
[1990] A parametrical example of Dunford, Pettis and Bochner integration, Stud. Cerc.
Mat. 42, 405-418.
[1986] La naissance de la théorie des capacités: réflexion sur une expérience personelle, La
Vie des Sciences, Comptes rendus, sér. générale 3,4, 385-397.
[1989] Vznik teorie kapacit: zamyšlenı́ nad vlastnı́ zkušenostı́, Pokroky matematiky, fyziky
a astronomie 34, 71-83.
[1918] A general form of integral, Ann. of Math. 19, 279-284.
[1980] The Hahn decomposition theorem, Proc. Amer. Math. Soc. 80, 377.
[1935] Integration in general analysis, Trans. Amer. Math. Soc. 37, 441-453.
[1911] Sur les suites de fonctions mesurables, C. R. Acad. Sci. Paris 152, 244-246.
[1936] Integration and linear operation, Trans. Amer. Math. Soc. 40, 474-494.
[1910] Über stetige Funktionen. II, Math. Annalen 69, 372-433.
[1906] Séries trigonométriques et séries de Taylor, Acta Math. 30, 335-400.
[1981] A proof of Lusin’s theorem, Amer. Math. Monthly 88, 191-192.

References 213
[1907] Sur la convergence en moyenne, Comptes Rendus Acad. Sci. Paris 144, 1022-1024.
[1991] The Hahn-Banach theorem implies the existence of a non Lebesgue-measurable set,
Fund. Math. 138, 13-19.
[1915] Sur l‘intégrale d‘une fonctionnelle étendue à un ensemble abstrait, Bull. Soc. Math.
France 43, 248-265.
[1924] Des familles et fonctions additives d’ensembles abstraits, Fund. Math. 5, 206-251.
[1907] Sugli integrali multipli, Rendiconti Accad. Nazionale dei Lincei (Roma) 16, 608-614.
[1915] Sulla derivazione per serie, Rendiconti Accad. Nazionale dei Lincei (Roma) 24, 204-
206.
[1936] Sur un lemme de la théorie des espaces linéaires, Comm. Ins. Sci. Math. Méc. Univ.
de Kharkov et Soc. Mat. Kharkov 13, 35-40.
[1941] Linear functionals and integrals in abstract spaces, Bull. Amer. Math. Soc. 47, 615-
620.
[1927] Riemann integration and Taylor’s theorem in general analysis, Trans. Amer. Math.
Soc. 29, 163-177.
[1933] Der Maßbegriff in der Theorie der kontinuierlichen Gruppen, Ann. of Math. 34,
147-169.
[1933] Über die multiplikation total-additiver Mengenfunktionen, Annali Scuola Norm. Sup.
Pisa 2, 429-452.
[1957] Note sur la mesurabilité B de la dérivée supérieure, Fund. Math. 44, 238-240.
[1983] The Riesz representation theorem revisited, Amer. Math. Monthly 90, 277-280.
[1919] Dimension und äusseres Mass, Math. Ann. 79, 157-179.
[1961] Definitions of Riemann type of the variational integrals, Proc. London Math. Soc.
13,3, 305-321.
[1988] A short history of integration theory, SEA Bull. Math. 12, 75-95.
[1953] Integration in abstract spaces, Bull. Amer. Math. Soc. 59, 111-139.
[1889] Über einen Mittelwerthssatz, Nachr. Akad. Wiss. Göttingen Math.-Phys., 38-47.
[1970] An introduction to distributions, Amer. Math. Monthly 77, 227-240.
[1881] Sur la série de Fourier, Comptes Rendus Acad. Sci. Paris 92, 228-230.
214 References
[1941] Concrete representation of abstract (M)-spaces, Annals of Math. 42, 994-1024.

[1948] A proof of the uniqueness of Haar’s measure, Ann. Math. 49, 225-226.
[1950] Construction of a non-separable invariant extension of the Lebesgue measure space,

Annals of Math. 52, 580-590.
[1934] Über die zussammenziehenden und Lipschitzchen Transformationen, Fund. Math.

22, 77-108.
[1932] Beiträge zur Masstheorie, Math. Ann. 107, 351-366.
[1985] Note on generalized multiple Perron integral, Čas. Pěst. Mat. 110, 371-374.
[1957] Generalized ordinary differential equations and continuous dependence on a param-

eter, Czechoslovak Math. J. 82, 418-446.
[1901] Sur une généralisation de l’intégrale définie, Comptes Rendus Acad. Sci. Paris 132,
1025-1028.
[1902] Integrale, longuer, aire, Annali Mat. Pura Appl. 7, 231-359.
[1903] Sur une propriétè des fonctions, Comptes Rendus Acad. Sci. Paris 137, 1228-1230.
[1910] Sur l’intégration des fonctions discontinues, Ann. Sci. Ecole Norm. Sup. 27, 361-450.
[1906] Sopra l’integrazione delle serie, Rend. Instituto Lombardo di Sci. e Lett. 39, 775-780.
[1940] Sur les fonctions–vecteurs complètement additives, Bull. Acad. Sci. URSS 6, 465-478.
[1912] Sur les propriétés des fonctions mesurables, C. R. Acad. Sci. Paris 154, 1688-1690.
[1952] Základy theorie integrálu v Euklidových prostorech, Časopis Pěst. Mat. 77, 1-51,
125-145, 267-301.
[1934] Extension of range of functions, Bull. Amer. Math. Soc. 40, 837-842.
[1975] The work of Henri Lebesgue on the theory of functions (on the occasion of his cen-
tenary) Transl. from Uspechi Mat. Nauk, 30(1975), 227-238, Russian Math.Surveys
30, 179-191.
[1976] Côntrole dans les inéquations variationelles elliptiques, J. Functional Analysis 22,
130-185.
[1970] On the extension of Lipschitz, Lipschitz–Hölder continuous and monotone functions,

Bull. Amer. Math. Soc. 76, 334-339.
[1939] The behaviour of a function on its critical set, Ann. of Math. 40, 62–70.
[1947] Perfect blankets, Trans. Amer. Math. Soc. 6, 418-442.
References 215
[1988] A simple proof of the Rademacher theorem, Časopis Pěst. Mat. 113, 337-341.
[1934] Zum Haarschen Maß in topologischen Gruppen, Compositio Math. 1, 106-114.

[1936] The uniqueness of Haar’s measure, Matem. sbornik 43, 721-734.
[1930] Sur une généralisation des intégrales de M.J. Radon, Fund. Math. 15, 131-179.
[1914] Über den Integralbegriff, Sitzungsber. Heidelberg Akad. Wiss. A16, 1-16.
[1927] Die Vollständigkeit der primitiven Darstellungen einer geschlossenen kontinuier-

lichen Gruppe, Math. Ann. 97, 737-755.
[1938] On integration in vector spaces, Trans. Amer. Math. Soc. 44, 277-304.
[1910] Contributions à l’étude de la représentation d’une fonction arbitrire par des intǵrales
définies, Rend. Circ. mat. Palermo 30, 289-335.
[1919] Über partielle und totale Differenzierbarkeit, Math. Ann. 89, 340-359.
[1913] Theorie und Anwendungen der absolut additiv Mengenfunktionen, S.-B. Math. Nat-
ur. Kl. Kais. Akad. Wiss. Wien 122.IIa, 1295-1438.
[1906] Sur les ensembles de fonctions, Comptes Rendus Acad. Sci. Paris 143, 738-741.
[1909a] Sur les suites de fonctions mesurables, Comptes Rendus Acad. Sci. Paris 148, 1303-
1305.
[1909b] Sur les opérations fonctionnelles linéaires, Comptes Rendus Acad. Sci. Paris 149,
974-977.
[1910] Untersuchungen über Systeme integrirbarer Funktionen, Math. Annalen 69, 449-497.
[1912] Sur quelques points de la théorie des fonctions sommables, Comptes Rendus Acad.
Sci. Paris 154, 641-643.
[1920] Sur l’intégrale de Lebesgue, Acta Math. 42, 191-205.
[1930-32] Sur l’existence de la dérivée des fonctions monotones et sur quelques problèmes qui
s’y rattachent, Acta Sci. Math. Szeged 5, 208-221.
[1888] An extension of a certain theorem in inequalities, Messenger of Math. 17, 145-150.
[1938] Integration in abstract metric spaces, Duke Math. J. 4, 408-411.
[1942] The measure of the critical set values of differentiable mappings, Bull. Amer. Math.
Soc. 48, 883–890.
[1951] A note on the space L∗p , Proc. Amer. Math. Soc. 2, 270-275.
[1948] Thorie des distributions et transformation de Fourier (French), Ann. Univ. Greno-
ble. Sect. Sci. Math. Phys. (N.S.) 23, 7-24.
216 References
[1954] Equivalence of measure spaces, Am. J. Math. 73, 275-313.
[1928] Un théorème général sur les familles d’ensembles, Fund. Math. 12, 206-210.
[1936] Méthode nouvelle à résoudre le problème de Cauchy pour les équations hyperboliques
normales, Mat. Sb. 1 (43), 39-71.
L. S. Sobolev
Méthode nouvelle à résoudre le problème de Cauchy pour les équations hyperboliques nor-
males
Mat. Sb. 1 (43) (1936), 39 – 71
[1970] A model of set theory in which every set of reals is Lebesgue measurable, Annals of
Math. 92, 1-56.
[1919] Additive und stetige Funktionaloperationen, Math. Z. 5, 186-221.
[1948] Notes on integration I - III, Proc. Nat. Acad. Sci. 34, 336-342, 447-455, 483-490.
[1949] Notes on integration IV, Proc. Nat. Acad. Sci. 35, 50-58.
[1985] A combinatorial construction of a nonmeasurable set, Amer. Math. Monthly 92,

421-422.
[1909] Sull’integrazione per parti, Rendiconti Accad. Nazionale dei Lincei 18, 246-253.
[1908] On non-measurable sets of points with an example, Trans. Amer. Math. Soc. 9,
237-244.
[1922] Limit in terms of continuous transformation, Bull. Soc. Math. France 50, 119-134.
[1939] The ergodic theorem, Duke Math. J. 1-18.
[1911] On the existence of a differential coefficient, Proc. London Math. Soc. 9, 325-335.
[1904] On upper and lower integration, Proc. London Math. Soc. 2, 52-66.
[1983] On differentiation of metric projections in finite dimensional Banach spaces, Czech.

Math. J. 33,3, 325-336.
A Short Guide to the Notation 217
A Short Guide to the Notation
M, M(γ), Mn ... measurable sets 1.3, 4.4, 26

λ, λn ... Lebesgue measure 1.3, 1.15, 26
σ(A ) ... σ-algebra generated by A 2.3
B, B(P ) ... Borel sets 2.3
μ|A , μA , μ|T ... restrictions of measures 2.4
SA ... restriction of a σ-algebra 2.4
εx ... Dirac measure 2.5
μ ... completion of a measure 2.7
μ∗ , μ∗ ... outer and inner measure 1.3, 4.8
M(S ) ... space of all (signed, complex) measures on (X, S ) 6.17
vol I ... volume of an interval 1.15
μ+ , μ− , |μ| ... variations of a measure 6.6, 6.10
f (μ) ... image of a measure 8.23
L ∗ ... the set of all functions for which the integral exists 8.3
L p , Lp ... Lp -spaces 8.3, 10.1
f p ... Lp -norm of a function f 10.1, 10.5
lp ... lp -spaces 10.7, 40.3-4
S ⊗ T ... product σ-algebra 11.1
M x , Mx ... sections 11.1
μ ⊗ ν ... product measure 11.6
w-lim fj ... weak limit 12.12
w*-lim fj ... weak* limit 12.13
X ∗ ... (topological) dual 12.4
μf ... measure having a density f , 8.19, 13.1
dν
dμ ... Radon-Nikodým derivative 13.1
ν μ ... absolute continuity of measures 13.1
ν⊥μ ... mutually singular measures 13.8
supt f ... support 14.1
CK (P ) ... continuous functions having compact support in K 14.1
Cc (P ) ... continuous functions having compact support 14.1
Cc↑ (P ), Cc↓ (P ) ... semicontinuous functions 14.4
A+ , A− , |A| ... variation of a (signed) Radon integral 14.11, 14.12
μA , μ∗A ... measure (outer measure) corresponding to a Radon integral 16.1,
16.4
Cb (P ) ... bounded continuous functions 16.7
C0 (P ) ... continuous functions vanishing at infinity 16.7
v
μj → μ ... vague convergence 17.1
M (P ) ... signed (complex) Radon measures on P 17.1
Mb (P ) ... finite signed (complex) Radon measures 17.1
e ... unit of a group 19.2
Δ ... modular function 19.9; also a set of all positive functions 25.2
f ∗ g ... convolution 19.19, 26.21
b
V f ... variation 21.1
a
218 A Short Guide to the Notation
D+ f , D− f , D+ f , D− f Df , Df ... extreme derivatives 22.3

∇ϕ ... Jacobi matrix 26.12
ϕ ... (Frchet) derivative 26.12
C ... contiuously differentiable functions of order l 26.12
Jϕ ... Jacobian 26.12 2
Dμ ... symmetric derivative of a measure 28.2
D(Ω) ... (infinitely) smooth functions having compact support 31.1
χk ... smoothing convolution kernel 31.1
Tf ... regular distribution 32.1
Tμ ... distribution determined by a measure μ 32.3
δ ... Dirac distribution 32
D
fk → 0 ... convergence in D 32.1
Dα T ... multiindex derivative 32.4
Zk → Z ... convergence of distributions 32.9
fˆ, F f ... Fourier transform 33.1, 33.5, 33.8
f˜ ... inverse Fourier transform 33.5
S ... Schwartz space 33.5
Mn,k ... space of all matrices 34
L∗ ... adjoint mapping 34
AT ... transpose matrix 34
L ... norm of a linear mapping 34
det A ... determinant 34
σ ... k-dimensional measure in Rn 34.8
vol L, vol(u1 , . . . , uk ) ... volume 34.10
N (x, ϕ, E) ... Banach indicatrix 34.19
deg(y.ϕ, G) ... degree of a mapping 35.7
Hp (A), Hp (A, δ) ... Hausdorff measure 36.1
s, S ... one-dimensional (n − 1-dimensiona) measure 37
grad g, div f , curl f ... gradient, divergence, curl 37.1
u1 × · · · × uk−1 ... vector product 37.5
t, n ... unit tangent (normal) vector 37.10, 37.13
Hk− ... half-space 37.17
i ... embbeding of Rn−1 into ∂Hk− 37.17
< ·, · > ... duality 38.1
Λk (V), Λk (V) ... k-covectors and vectors 38.1
V1 ∧ · · · ∧ Vk ... exterior product 38.3
I(k, n) ... set of all increasing multiindices 38.3
Xi ... coordinate form 38.3
dω ... diferential 38.11
ξ(x) ... unit tangent k-vector 38.14
Uμ ... domain of a chart μ 39.1
Tx ... tangent space 39.4
c0 ... space of all sequences converging to zero 40.3-4
Subject Index 219
Subject Index
absolute convergence 41.1 complex Radon measure 15.8

absolutely continuous function 21.3 complex Radon integral 14.10
absolutely continuous measure 13.1, cone 37.15
13.8 continuous measure 15.18
vector 41.4 convergence 41
adjoint mapping 34 absolute 41.1
algebra 5.1, Appendix almost everywhere 3.12
almost everywhere 3.12, 3.14 in measure 12.1
analytic set 26.10 μ-uniform 12.2
antiderivative 7.1 strong 17.1
antisymmetric k-linear form 38.1 unconditional 41.1
approximately continuous function 29.7 vague 17.2
atlas 39.1 weak 12.12, 12.14, 17.1, 32.9
atom 2.15 weak* 12.13, 12.14
atomic measure 15.18 convergent integral 8.3
convolution of functions 19.19, 26.18
Baire function 7.9
convolution of measures 19.24, 26.18
ball 26.11, 28.2.1
continuous measure 15.18
Banach-Zarecki’s theorem 23.11
counting measure 2.5, 2.10, 19.4
barycenter 43.6
Cousin’s lemma 25.2
base for a topology Appendix
curl 37.1
Besicovitch’s theorem 27.4
curve 37.10
bilinear form 38.1
curve integral 34.24, 37.10
bilipschitz mapping 34.20
Bochner integral 42.1 Daniell’s property 14.3
Bochner’s theorem 42.2 Darboux property (of a measure) 2.15
Borel-Cantelli’s lemma 2.14 Darboux integrable function 43.7
Borel function 3.2 degree of a mapping 35.7
Borel set 2.3 δ-fine partition 25.2
bounded variation 21.1 δ-ring 5.1
Denjoy-Perron integral 25.14
Cantor discontinuum 1.12
Denjoy’s theorem 29.9
Cantor function 23.1
density of a measure 13.1
Cantor (ternary) set 1.12
density point 29.1
capacity 5.14
density topology 29.4
capacitable set 5.15
derivative 26.12
charge 6.20
derivative of a mapping 39.4
chart 34.24, 39.1
derivative of a measure 28.1
Chebyshev’s inequality 7.8
descriptive definition of an integral 23.7
Choquet capacity 5.14
diffeomorphism 34.22, 39.1
compact space Appendix
differential 38.11, 39.8
complete measure 2.4
differential form 38.9, 39.8
complete Riemann integral 25.14
Dini derivative 22.3
completion of a measure 2.7
Dini’s theorem Appendix
complex measure 6.10
Dirac integral 14.2
220 Subject Index
Dirac measure 2.5 locally absolutely continuous 21.3

Dirichlet function 7.5 locally integrable 23.3
discontinuum lower Baire 7.9
Cantor’s 1.12 lower semicontinuous 14.4
of a positive measure 1.13 measurable 3.1, 3.13, 40.1
discrete measure 15.18 modular 19.9
discrete topology Appendix of Baire class one 18.3
distribution 32.1 of class C 26.12
tempered 33.5 of finite variation 21.1
distribution function 24.1 Riemann 7.6
divergence 37.1 Riemann integrable 7.2
d-open set 29.4 saltus 24.7
dual space to Lp 13.17 simple 3.8, 8.1, 40.1
Dunford integral 43.2 upper semicontinuous 14.14
Dunford’s lemma 43.1 weakly integrable 43.2
Dynkin class 5.1 weakly measurable 40.1
with a compact support 14.1, 31.1
Egorov theorem 12.6
embedding 39.2 Gauss theorem 37.22
extremal derivative 22.3 Gδ set 2.3
Gδσ set 2.3
Fatou lemma 8.15
Gelfand integral 43.2
finer partition 43.7
gradient 37.1
finite measure 2.4
Graves integral 43.7
finite variation 21.1
Green’s theorem 37.27
finitely additive measure 6.20
Fourier coefficients 33.20 Haar measure 19.3
Fourier series 33.20 Hahn decomposition 6.3
Fourier transform 33.1, 33.5, 33.11 harmonic measure 14.8
Fréchet derivative 26.12 Hausdorff dimension 36.8
Fσ set 2.3 Hausdorff measure 36.1
Fσδ set 2.3 Hausdorff outer measure 4.2
Fubini’s lemma 24.10 Hausdorff space Appendix
Fubini’s theorem 11.9, 26.9 Heaviside function 32.5
function helix 34.25, 37.12
absolutely continuous 21.3 Henstock–Kurzweil integral 25.2
approximately continuous 29.7 homeomorphism Appendix
Baire 7.9 Hopf’s theorem 5.5
Borel 3.2 Hölder’s inequality 10.3
Cantor 23.1
image of a measure 8.23
Darboux integrable 43.7
increasing multiindex 38.3
Dirichlet 7.5
indefinite Henstock–Kurzweil integral
distribution 24.1
25.5
Heaviside 32.5
indefinite Lebesgue integral 23.3
integrable 8.3, 14.5
indefinite variation 21.1
Lebesgue measurable 26.7
inequality Chebyshev’s 7.8
Lipschitz 20
Subject Index 221
Hölder’s 10.3 Kirszbraun’s theorem 30.6

Minkowski’s 10.4 k-linear form 38.1
Young’s 10.2 Kurzweil (Henstock–Kurzweil) integral
integrable function 8.3, 14.5 25.2
integral 8.3 k-vector 38.1
Bochner 42.1
lattice Appendix
complex Radon 14.10
Lebesgue measurable function 26.7
convergent 8.3
Lebesgue measure 1.3, 1.15, 2.5, 19.4
curve 34.24, 37.10
Lebesgue outer measure 1.1, 1.15
Denjoy-Perron 25.14
Lebesgue point 23.8
Denjoy restricted 25.14
Lebesgue-Stieltjes measure 14.8
Dirac 14.2
Lebesgue’s theorem 8.13, 8.14, 12.6
Dunford 43.2
differentiation 22.5, 23.9
Gelfand 43.2
density 29.2
Graves 43.7
decomposition 13.10
Henstock–Kurzweil 25.2
left Haar measure 19.3
indefinite 25.5
lemma
Lebesgue 8.3, 26
Borel-Cantelli’s 2.14
indefinite 23.3
Cousin’s 25.2
Newton 7.1
Dunford’s 43.1
of a differential form 38.16, 39.18
Fatou 8.15
Perron 25.10
Fubini’s 24.10
Pettis 43.2
Riemann-Lebesgue’s 12.13, 31.10
Poisson 14.2
Saks-Henstock’s 25.6
Radon 14.1
Urysohn’s Appendix
signed 14.10
lemniscate 26.15
Riemann 7.2
Levi’s theorem 8.5, 8.11, 8.12
complete 25.14
linear form 38.1
Riemann-Graves 43.7
L∞ -norm 10.1
Riemann-Stieltjes 14.2
Lipschitz boundary 37.21
surface 34.24, 37.13
Lipschitz k-boundary 37.17
integration by parts 23.13
Lipschitz function 20
inverse Fourier transform 33.5
Lipschitz mapping 30.1
involution 19.23
Ljapunov’s theorem 2.16
isometric mapping 34.2
Lp -norm 10.1
Jacobi matrix 26.12 Lp -space 10.5
Jacobian 26.12 localizable measure 13.18
Jordan-Peano content 1.5, 5.12, 26.24 locally absolutely continuous function
Jordan decomposition 6.7 21.3
locally bilipschitz mapping 34.20
Kadec-Klee property 12.14
locally compact space Appendix
k-boundary 37.17
locally integrable function 23.3
k-covector 38.1
locally Lipschitz mapping 30.1
k-dimensional measure 34.8
locally uniformly convex space 12.14
on a manifold 39.15
lower derivative 22.3
k-dimensional surface 34.24
lower Riemann sum 7.2
222 Subject Index
lower semicontinuous function 14.4 regular 4.7

Luzin’s (N)-property 20.7 probability 2.4
Luzin’s theorem 18.2 singular 13.8
Radon 15.1
majorant 25.10
complex 15.8
manifold 39.1
outer 16.1
mapping
regular Borel 15.2
adjoint 34
right Haar 19.3
bilipschitz 34.20
σ-finite 2.4
diffeomorphic 39.1
signed 6.1
isometric 34.2
translation invariant 1.2
Lipschitz 30.1
trivial 2.5
locally bilipschitz 34.20
vector 41.2
locally Lipschitz 30.1
absolutely continuous 41.4
(of class) C 26.12
measure space 2.4
regular 34.22
metric outer measure 36.2
McShane extension theorem 30.5
Minkowski’s inequality 10.4
measurable function 3.1, 3.13, 40.1
minorant 25.10
measurable set 1.3, 1.15, 2.4, 4.4
modular function 19.9
measurable rectangle 11.1
molecular measure 17.7
measurable space 2.1
monotone class 11.3
measure 2.4
monotone functional 14.1
absolutely continuous 13.1, 13.8
Möbius strip 39.7
atomic 15.18
multiindex 32, 34.13, 38.3
complete 2.4
μ-uniform convergence 12.2
complex 6.10
continuous 15.18 natural orientation 37.9
counting 2.5, 19.4 negative base 39.4
Dirac 2.5 negative diffeomorphism 39.3
discrete 15.18 negative parametrization 37.9
finite 2.4 negative variation of a measure 6.6
finitely additive 6.20 negative variation of a Radon integral
Haar 19.3 14.12
harmonic 14.8 neighborhood Appendix
Hausdorff 36.1 Newton integral 7.1
outer 4.2 Newton potential 5.14
k-dimensional 34.8 Newtonian capacity 5.14
on a manifold 39.15 norm of a partition 43.7
Lebesgue 1.3, 1.15, 2.5, 19.4 normal (vector) 37.13
outer 1.1, 1.15 normal space Appendix
Lebesgue-Stieltjes 14.8
orientation 37.9, 38.14, 39.3
left Haar 19.3
orientable manifold 39.3
localizable 13.18
oriented atlas 39.3
molecular 17.7
oriented manifold 39.3
outer 4.1
Orlicz-Pettis’ theorem 43.5
metric 36.2
orthogonal matrix 34.2
Subject Index 223
oscillation 43.7 rectifiable set 34.31

outer capacity 5.15 regular Borel measure 15.2
outer measure 4.1 regular distribution 32.3
outer normal 37.21 regular mapping 34.22
outer product 38.3 regular outer measure 4.7
restricted Denjoy integral 25.14
parametrization 34.24
restriction of a measure 2.4
partition of an interval 7.2, 25.2
Riemann function 7.6
δ-fine 25.2
Riemann integrable function 7.2
subordinated 25.2
Riemann-Graves integral 43.7
partition of unity 39.11, 39.22
Riemann integral 7.2
Perron integral 25.10
Riemann-Lebesgue lemma 12.13, 31.10
Pettis integral 43.2
Riemann-Stieltjes integral 14.2
Pettis’ theorem 40.3, 41.5
Riemannian manifold 39.12
π-system 5.1
Riemannian metric 39.12
Plancherel theorem 33.7
Riesz lattice 14.14
point of density 29.1
Riesz theorem 12.3
Poisson integral 14.2
representation 16.5
polar coordinates 26.14
right Haar measure 19.3
positive base 39.4
ring 5.1
positive diffeomorphism 39.3
positive functional 14.1 Saks-Henstock’s lemma 25.6
positive parametrization 37.9, 39.18 Sard theorem 34.17
positive variation of a Radon integral Schwartz space 33.5
14.12 Schwartz theorem 32.8
positive variation of a measure 6.6 section 11.1
premeasure 5.2 semicontinuous function 14.4, Appen-
probability measure 2.4 dix
product of measures 11.6 semiring 5.1
product of Radon integrals 14.13 separable space Appendix
product of Radon measures 16.9 set
product σ-algebra 11.1 analytic 26.10
property Borel 2.3
Daniell 14.3 capacitable 5.15
Kadec-Klee 12.14 d-open 29.4
Luzin’s (N) 20.7 Fσ 2.3
Radon-Nikodým 42.4 Fσδ 2.3
Vitali’s 27.1 Gδ 2.3
pullback 39.8 Gδσ 2.3
measurable 1.3, 1.15, 2.4, 4.4
Rademacher theorem 30.3
rectifiable 34.31
Radon integral 14.1
σ-additivity 2.4
Radon measure 15.1
σ-algebra 2.1
Radon-Nikodým derivative 13.1
σ-finite measure 2.4
Radon-Nikodým property 42.4
σ-ring 5.1
Radon-Nikodým theorem 13.4
σ-subadditivity 1.2, 4.1
Radon outer measure 16.1
signed measure 6.1
224 Subject Index
signed Radon integral 14.10 Plancherel’s 33.7

signed Radon measure 15.8 Rademacher’s 30.3
simple function 3.8, 8.1, 40.1 Radon-Nikodým’s 13.4
singular measure 13.8 Riesz 12.3
sphere 34.26, 34.27, 37.23 representation 16.5
spherical coordinates 26.16, 37.23 Sard’s 34.17
Stokes theorem 37.32, 38.21, 39.21 Schwartz 32.8
Stone-Weierstrass theorem Appendix Stokes 37.32, 38.21 39.21
Stone’s condition 14.14 Stone-Weierstrass Appendix
strong convergence 17.1 Tietze’s Appendix
strong subadditivity 5.14 Tonelli’s 26.10
support of a function 14.1 Vitali’s 12.9, 27.2, 27.6
support of a Radon measure 15.10 Young’s convolution 26.20
surface integral 34.24, 37.13 Tietze’s theorem Appendix
surface k-dimensional 34.24 Tonelli’s theorem 26.10
surface with a Lipchitz k-boundary topological group 19.2
37.17 total variation of a measure 6.6, 6.10
translation invariant measure 1.2
tangent k-vector 38.14
trigonometric series 33.14
tangent space 39.4
trivial measure 2.5
tangent vector 37.10
tempered distribution 33.5 unconditional convergence 41.1
theorem uniformly convex space 12.14
Banach-Zarecki’s 23.11 unimodular group 19.11
Besicovitch’s 27.4 unit tangent k-vector 38.14, 39.16
Bochner’s 42.2 unit tangent vector 37.10
change of variable 26.13, 34.18, 34.19, unitary matrix 34.2
35.8, 38.18 upper Baire function 7.9
Denjoy’s 29.9 upper derivative 22.3
density 29.2 upper Riemann sum 7.2
Dini’s Appendix
vague convergence 17.2
Egorov’s 12.6
variation of a function 21.1
Fubini’s 11.9, 26.9
variation of a measure 6.6
Gauss 37.22
variation of a Radon integral 14.11
Green’s 37.27
variation of a vector measure 41.7
Hopf’s 5.5
vector field 37.1
Kirszbraun’s 30.6
vector measure 41.2
Lebesgue’s 8.13, 8.14, 12.6
vector product 37.5
decomposition 13.10
Vitali cover 27.1
density 29.2
Vitali property 27.1
differentiation 22.5, 23.9
Vitali’s theorem 12.9, 27.2, 27.6
Levi’s 8.5, 8.11, 8.12
volume 34.10
Ljapunov’s 2.16
of a k-tuple of vectors 34.10
Luzin’s 18.3
McShane 30.5 Young’s convolution theorem 26.20
Orlicz-Pettis 43.5 Young’s inequality 10.2
Subject Index 225
weak convergence 12.12, 12.14, 17.1,

32.9
weak* convergence 12.13, 12.14
weakly integrable function 43.2
weakly measurable function 40.1
weighted counting measure 2.10
Jaroslav Lukeš, Jan Malý
MEASURE AND INTEGRAL
Published by
MATFYZPRESS
publishing house
of the Faculty of Mathematics and Physics
Charles University in Prague
Sokolovská 83, CZ - 186 75 Praha 8
as the 162. publication
Reviewer: Prof. RNDr. Ivan Netuka, DrSc.
This volume was typeset by the authors using AMS-TE X
the macro system of the American Mathematical Society
Printed by Reproduction center UK MFF
Sokolovská 83, CZ - 186 75 Praha 8
Second edition
Prague 2005
ISBN 80-86732-68-1
ISBN 80-85863-06-5 (First edition)

Jaroslav Lukeš, Jan Malý - Measure and Integral

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Jaroslav Lukeš, Jan Malý - Measure and Integral

Uploaded by

Copyright:

Available Formats

Measure

© Jaroslav Lukeš, Jan Malý, 2005

Prague, 1994 Jaroslav Lukeš and Jan Malý

Preface to the second edition

Prague, 2005 Jaroslav Lukeš and Jan Malý

List of Basic Notations and Frequently Used Symbols

In this manuscript we use the standard notation.

The horizontal notation is preferred for estetical and typographical reasons.

We use fj ⇒ f to denote the uniform convergence of a sequence of functions.

A. Measures and Measurable Functions

provided {Aj } is a ﬁnite disjoint collection of sets. For a succesful development

1.3. Lebesgue Measurable Sets. Let A be a subset of a bounded interval I.

(b) Intervals of any type are in M.

1.7. Exercise. For every bounded set A ⊂ R, deﬁne

where I is a bounded interval containing A. Show that:

1 sometimes also called the Cantor discontinuum

κ(t) := inf{x ∈ [0, 1] : f (x) = t}

1.15. Lebesgue Measure on Rn . In the same way as for R, we introduce

vol I = (b1 − a1 ) · . . . · (bn − an ).

We say that a set A ⊂ Rn is measurable if λ∗ T = λ∗ (A ∩ T ) + λ∗ (A \ T ) for

Proof. The assertion follows from Theorem 4.3.

Proof. The assertion follows from general Theorem 4.5.

1.18. Theorem. If I ⊂ Rn is a bounded interval, I ⊂ Qj where {Qj } is a

Thus the n-dimensional Lebesgue measure λ∗ I is equal to the volume vol I.

Now ﬁnd open intervals Ij and Jj such that

We proved the measurability of all intervals H of the form of a halfspace. Now,

1.21. Theorem. Given a set M ⊂ Rn , the following are equivalent:

each sequence {An } of pairwise disjoint sets from S , μ( An ) =

2.6. Properties of Measures. Let (X, S , μ) be a measure space. Then the

(a) if A1 , A2 , · · · ∈ S , A1 ⊂ A2 ⊂ . . . , then μ ( An ) = lim μAn ;

Proof. (a) Since A1 , A2 \ A1 , A3 \ A2 , . . . are pairwise disjoint, we get

(b) By (a), we obtain

(c) It is suﬃcient to consider the sequence

notice that Bn = An and that the sequence {Bn } is pairwise disjoint.

2.10. Exercise. Let f be a nonnegative function on a set X and a σ-algebra on X. For

(Recall that by deﬁnition

Show that μ is a measure on (so-called weighted counting measure).

If μ = ω|S , then μ is not complete. If

= ∪ {{1}, {2}, {1, 3, 4}, {2, 3, 4}}, ν = ω|T ,

then ν is complete. What is the completion of μ?

it is possible to replace the condition {f > α} ∈ by {f ≥ α} ∈ ) in the deﬁnition of

3.4. Theorem. Let f , g be S -measurable functions on X. Then

the assertion easily follows.

3.11. Exercise. If f is an -measurable function, then there exists a sequence of simple

(b) If μ is a complete measure on (X, S ) and f is a μ-measurable function,

(b) If f is an -measurable function on X, then there exists an -measurable function g

3.17. Exercise. Let {fn } be a sequence of -measurable functions on X which converges to

4. Construction of Measures from Outer Measures

An important example of creating an outer measure is contained in the follow-

(note that inf ∅ = +∞). Then ν̃ is an outer measure.

Proof. We only have to verify that ν̃ is σ-subadditive. So suppose A ⊂ An ,

4.4. γ-measurable Sets. A set M ⊂ X is said to be γ-measurable (in the

for any set T with γT < +∞.

γ(T \ (M ∩ N )) = γ((T ∩ M ) \ N ) + γ(T \ M ).

Thanks to last three equalities, it follows that

γT = γ(T ∩ (M ∩ N )) + γ(T \ (M ∩ N )).

(b) In order to show that γ is σ-additive on M(γ), choose Mn ∈ M(γ) pairwise

for each k ∈ N. Since γ is σ-subadditive, it readily follows that

which is what we wanted.

(a) Suppose μ∗ A < ∞. Show that A ∈ M(μ∗ ) if and only if μ∗ A = μ∗ A.

5. Classes of Sets and Set Functions

5.1. Systems of sets. A family A of subsets of a set X containing ∅ is called:

(b) a ring if given A, B ∈ A , then A ∪ B, A \ B ∈ A (so that also A ∩ B ∈ A );

(e) a π-system if A ∩ B ∈ A for each A, B ∈ A ;

6.18. Exercise. Suppose that μ, μn ∈ M( ), μn − μ → 0. Prove that μn A → μA for

8.6. Theorem. Let g ∈ L ∗ and f be a μ-measurable function, f = g almost

8.11. Levi’s theorem. Let {f n } be a sequence of μ-measurable

Proof. Suppose a, b ∈ I, b = a, x ∈ X \ N . By the mean value theorem there

Proof. It is not hard to verify that f + g1 ≤ f 1 + g1 .

10.5. The Lp Spaces. The behavior of the function f → f p on L p , where