REVIEW OF LEBESGUE MEASURE AND INTEGRATION

CHRISTOPHER HEIL
These notes will briefly review some basic concepts related to the theory of Lebesgue
measure and the Lebesgue integral. We are not trying to give a complete development,
but rather review the basic definitions and theorems with at most a sketch of the proof of
some theorems. These notes follow the text Measure and Integral by R. L. Wheeden and
A. Zygmund, Dekker, 1977, and full details and proofs can be found there.
1. OPEN, CLOSED, AND COMPACT SUBSETS OF EUCLIDEAN SPACE
Notation 1.1. N = ¦1, 2, 3, . . . ¦ is the set of natural numbers, Z = ¦. . . , −1, 0, 1, . . . ¦ is the
set of integers, Q is the set of rational numbers, R is the set of real numbers, and C is the
set of complex numbers. R
d
is real d-dimensional Euclidean space, the space of all vectors
x = (x
1
, . . . , x
d
) with x
1
, . . . , x
d
∈ R.
On occasion, we formally use the extended real number line R ∪ ¦−∞, ∞¦ = [−∞, ∞],
but it is important to note that ∞ is a formal object, not a number. To write a ∈ [−∞, ∞]
means that either a is a finite real number or a is one of ±∞. We write [a[ < ∞ to mean
that a is a finite real number. Note that there is no analogue of the extended reals when we
consider complex numbers; there’s no obvious “∞” or “−∞.”
We declare some arithmetic conventions for the extended real numbers: ∞ + ∞ = ∞,
1/0 = ∞, 1/∞ = 0, and 0 ∞ = 0. The symbols ∞− ∞ are undefined, i.e., they have no
meaning.
The empty set is denoted by ∅. Two sets A, B are disjoint if A ∩ B = ∅. A collection
¦A
k
¦ of sets are disjoint if A
j
∩ A
k
= ∅ whenever j ,= k.
The real part of a complex number z = a + ib is Re (z) = a, and the imaginary part is
Im(z) = b. The complex conjugate of z = a + ib is ¯ z = a − ib. The modulus, or absolute
value, of z = a + ib is [z[ =

z¯ z =

a
2
+ b
2
.
For concreteness, we will use the Euclidean distance on R
d
in these notes. However, all
the results of this section are valid with respect to any norm on R
d
.
Definition 1.2.
(a) The Euclidean norm on R
d
is
[x[ = (x
2
1
+ + x
2
d
)
1/2
.
The distance between x, y ∈ R
d
is [x −y[.
Date: Revised July 30, 2007.
c _2007 by Christopher Heil.
1
2 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
(b) Suppose that ¦x
n
¦
n∈N
is a sequence of points in R
d
and that x ∈ R
d
. We say that x
n
converges to x, and write x
n
→ x or x = lim
n→∞
x
n
, if
lim
n→∞
[x
n
−x[ = 0.
Using this definition of distance, we can now define open and closed sets in R
d
and state
some of their basic properties.
Definition 1.3. Let E ⊆ R
d
be given.
(a) E is open if for each point x ∈ E there is some open ball
B
r
(x) = ¦y ∈ R
d
: [x −y[ < r¦
centered at x that is completely contained in E, i.e., B
r
(x) ⊆ E for some r > 0.
(b) A point x ∈ R
d
is a limit point of E if there exist points x
n
∈ E that converge to x,
i.e., such that x
n
→ x.
(c) The complement of E is
E
C
= R
d
¸E = ¦x ∈ R
d
: x / ∈ E¦.
(d) E is closed if its complement E
C
is open. It can be shown that E is closed if and
only if E contains all its limit points.
(e) If E is any subset of R, then its closure
¯
E is the smallest closed set that contains E.
It can be shown that
¯
E = E ∪ ¦x ∈ R
d
: x is a limit point of E¦.
(f) E is dense in R
d
if
¯
E = R
d
. For example, the set Q of all rational numbers is a dense
subset of R.
(g) E is bounded if it is contained in some ball with finite radius, i.e., if there is some
r > 0 such that E ⊆ B
r
(0).
The following notion of compact sets is very important.
Definition 1.4. Let E ⊆ R
d
be given.
(a) An open cover of E is any collection ¦U
α
¦
α∈J
of open sets such that E ⊆

α
U
α
.
The index set J may be finite, countable, or uncountable, i.e., there may be finitely
many, countably many, or uncountably many open sets U
α
in the collection.
(b) E is compact if every open cover ¦U
α
¦
α∈J
of E contains a finite subcover. That is,
E is compact if whenever we choose open sets U
α
such that E ⊆

α
U
α
, then there
exist finitely many indices α
1
, . . . , α
k
∈ J such that E ⊆ U
α
1
∪ ∪ U
α
k
.
Theorem 1.5. Let E ⊆ R
d
be given.
(a) (Heine–Borel Theorem) E is compact if and only if it is both closed and bounded.
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 3
(b) (Bolzano–Weierstrass Theorem) If E is compact, then every countable sequence of
points ¦x
n
¦
n∈N
with x
n
∈ E has a convergent subsequence (even if the original
sequence does not converge). That is, there exist indices n
1
< n
2
< . . . and a point
x ∈ R
d
so that x
n
k
→ x. (Note that x is then a limit point of E, and therefore x ∈ E
since E is closed.)
Theorem 1.6. Let E, F ⊆ R
d
be given.
(a) If E ⊆ F, then
¯
E ⊆
¯
F.
(b) If E and F are compact sets then E + F = ¦x + y : x ∈ E and y ∈ F¦ is compact.
(c) If E and F are bounded sets (not necessarily compact), then E + F ⊆
¯
E +
¯
F.
Proof. (a) We just have to show that every limit point of E is a limit point of F. So, suppose
that x is a limit point of E. Then there exist points x
n
∈ E such that x
n
→ x. However,
E ⊆ F, so each x
n
is also an element of F. Therefore x is a limit point of F by definition.
(b) Suppose E and F are both compact. Then E and F are both bounded, so they are
contained in some finite balls centered at the origin, say E ⊆ B
r
(0) and F ⊆ B
s
(0). Hence
E + F ⊆ B
r+s
(0), so E + F is bounded.
To show that E + F is closed, suppose that z is any limit point of E + F. This means
that there are points z
n
∈ E + F which converge to z. By definition, z
n
= x
n
+ y
n
for
some x
n
∈ E and y
n
∈ F. WE DO NOT KNOW whether x
n
and y
n
will converge to some
points x and y! However, we do know that ¦x
n
¦
n∈N
is a sequence of points in E and that
E is compact. Therefore, there exists a subsequence ¦x
n
k
¦
k∈N
which does converge to some
x ∈ E. For simplicity of notation, write x

k
= x
n
k
and y

k
= y
n
k
. Now, ¦y

k
¦
k∈N
is a sequence
of points in F and F is compact, so there must be a subsequence ¦y

k
j
¦
j∈N
which converges
to some y ∈ F. Note that since x

k
→ x, it is still true that x

k
j
→ x. Again for simplicity
write x
′′
j
= x

k
j
and y
′′
j
= y

k
j
. Then we have x
′′
j
→ x and y
′′
j
→ y. Therefore x
′′
j
+y
′′
j
→ x +y.
However, remember where these points came from: x
′′
j
= x
n
k
j
and y
′′
j
= y
n
k
j
. Therefore
x
′′
j
+ y
′′
j
→ z since x
n
+ y
n
→ z. So it must be the case that z = x + y. Thus z ∈ E + F, so
E + F contains all its limit points, and therefore is closed. Since E + F is both closed and
bounded, it is compact.
(c) Suppose that E and F are bounded sets. Then
¯
E and
¯
F are closed and bounded sets,
hence compact. Certainly E +F ⊆
¯
E +
¯
F, so we just have to show that every limit point of
E + F is in
¯
E +
¯
F.
So, suppose that z is a limit point of E + F. Then there exist points z
n
∈ E + F such
that z
n
→ z. By definition, z
n
= x
n
+ y
n
for some x
n
∈ E and y
n
∈ F. Since
¯
E and
¯
F are
compact, we can imitate the argument of part b and find convergent subsequences x
′′
j
and
y
′′
j
, i.e., x
′′
j
→ x ∈
¯
E and y
′′
j
→ y ∈
¯
F for some x ∈
¯
E and y ∈
¯
F (not necessarily x ∈ E or
y ∈ F). Therefore x +y = limx
′′
j
+limy
′′
j
= lim(x
′′
j
+y
′′
j
) = z, so z ∈
¯
E +
¯
F, as desired.
4 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
2. MEASURE THEORY
Our goal in this section is to assign to each subset of R
d
a “size” or “measure” that
generalizes the concept of area or volume from simple sets to arbitrary sets. However, we will
see that this cannot be done for all sets without introducing some strange pathologies, and
therefore we must restrict the definition of measure to a subclass of well-behaved “measurable
sets.”
Definition 2.1.
(a) A cube or rectangular box in R
d
is a set of the form
Q = [a
1
, b
1
] [a
d
, b
d
].
The volume of this cube is
vol(Q) = (b
1
−a
1
) (b
d
−a
d
).
(b) The exterior Lebesgue measure of an arbitrary set E ⊆ R
d
is
[E[
e
= inf
_

k
vol(Q
k
) : all countable sequences of cubes Q
k
with E ⊆

k
Q
k
_
.
The exterior measure of a set lies in the range 0 ≤ [E[
e
≤ ∞. Allowing the possibility
of infinite exterior measure, every subset of R
d
has a uniquely defined nonnegative
exterior measure.
Example 2.2.
(a) A seemingly “obvious” fact is that if Q is a cube in R
d
then [Q[
e
= vol(Q). Since Q
covers itself with one cube, it does follow immediately from the definition that [Q[
e

vol(Q). However, the other inequality is not so trivial to prove. More generally, if
Q
1
, . . . , Q
n
are disjoint cubes, then it can be shown that
[Q
1
∪ ∪ Q
n
[
e
= vol(Q
1
) + + vol(Q
n
).
(b) [R
d
[
e
= ∞.
(c) If S ⊆ R
d
contains only countably many points then [S[
e
= 0. For example, the set
of rational numbers in R has zero exterior measure, i.e., [Q[
e
= 0.
(d) The Cantor set C is an example of a subset of R which contains uncountably many
points yet has exterior measure [C[
e
= 0. The Cantor set is also closed and equals
its own boundary.
Definition 2.3. A property which holds except on a set of exterior measure zero is said to
hold almost everywhere (abbreviated a.e.). For example, if
g(x) =
_
1, x rational,
0, x irrational,
(2.1)
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 5
then we say that g = 0 a.e. since the set of points ¦x ∈ R : g(x) ,= 0¦ is countable and
therefore has measure zero.
The prefix essential is often applied to a property that holds a.e. For example, the essential
supremum of a function f : E → [−∞, ∞] is
ess sup
x∈E
f(x) = inf¦M : f(x) ≤ M a.e.¦.
Thus, for the function g given in equation (2.1) we have
sup
x∈R
g(x) = 1 while ess sup
x∈R
g(x) = 0.
Here are some basic properties of exterior measure.
Lemma 2.4. Let E, F ⊆ R
d
be given.
(a) If E ⊆ F, then [E[
e
≤ [F[
e
.
(b) If E
1
, E
2
, . . . ⊆ R
d
, then
¸
¸

E
k
¸
¸
e

k
[E
k
[
e
.
(c) If E ⊆ R
d
and ε > 0, then there exists an open set U ⊇ E such that [U[
e
≤ [E[
e
+ε.
(Note that we also have [E[
e
≤ [U[
e
by part a.)
Remark 2.5. We might expect in Lemma 2.4(b) that if the sets E
1
, E
2
, . . . are disjoint,
then we would actually have [ ∪E
k
[
e
=

k
[E
k
[
e
. Yet it can be shown that this is FALSE in
general: there exist disjoint sets E
1
, E
2
, . . . ⊆ R
d
such that [ ∪ E
k
[
e
<

k
[E
k
[
e
.
Likewise, in Lemma 2.4(c) we might expect that since E ⊆ U and [E[
e
≤ [U[
e
≤ [E[
3
+ε,
the set U¸E should have small exterior measure. Specifically, we expect that [U¸E[
e
≤ ε.
Yet this is also FALSE in general! Consequently, for such sets we have [(U¸E)∪E[
e
= [U[
e

[E[
e
+ε < [E[
e
+[U¸E[
e
even though U is the union of the two disjoint sets U¸E and E.
The problem is that, in some sense, the definition of exterior measure is too inclusive. All
sets have an exterior measure, even though there exist some very strange sets that behave in
unexpected ways (the existence of such strange sets is a consequence of the Axiom of Choice).
One way to handle this problem is to restrict our attention to sets which are “well-behaved”
with respect to exterior measure. This leads us to make the following definition.
Definition 2.6. A set E ⊆ R
d
is Lebesgue measurable, or simply measurable, if given any
ε > 0 there exists an open set U ⊇ E such that [U¸E[
e
≤ ε.
If E is measurable, then its Lebesgue measure is its exterior measure, and is denoted
[E[ = [E[
e
.
6 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
There exists sets that are not measurable. However, as the proof of this fact relies on the
Axiom of Choice, it is nonconstructive, i.e., it simply says that such sets exist but does not
explicitly display one. Typically, the sets we encounter are all measurable, and almost all
operations that we perform on measurable sets leave their measurability intact.
Lemma 2.7. (a) All open subsets of R
d
are measurable.
(b) All closed subsets of R
d
are measurable.
(c) Countable unions of measurable sets are measurable. That is, if E
1
, E
2
, . . . are mea-
surable, then so is

k
E
k
.
(d) Countable intersections of measurable sets are measurable. That is, if E
1
, E
2
, . . . are
measurable, then so is

k
E
k
.
(e) The complement of a measurable set is measurable. That is, if E is measurable, then
so is E
C
.
(f) All sets with exterior measure zero are measurable. That is, if [E[
e
= 0, then E is
measurable.
Proof. (f) Suppose that [E[
e
= 0, and let ε > 0 be given. Then by Lemma 2.4, there exists
an open set U ⊇ E such that [U[
e
≤ [E[
e
+ ε = 0 + ε = ε. Therefore, since U¸E ⊆ U, we
have by Lemma 2.4(a) that [U¸E[
e
≤ [U[
e
≤ ε. Hence E is measurable by definition.
Theorem 2.8. Let E and E
1
, E
2
, . . . be measurable subsets of R
d
.
(a)
¸
¸

E
k
¸
¸

k
[E
k
[.
(b) If E
1
, E
2
, . . . are disjoint, then
¸
¸

E
k
¸
¸
=

k
[E
k
[.
(c) If E
1
⊆ E
2
and [E
2
[ < ∞, then [E
1
¸E
2
[ = [E
1
[ −[E
2
[.
(d) If E
1
⊆ E
2
⊆ , then
¸
¸

E
k
¸
¸
= lim
k→∞
[E
k
[.
(e) If E
1
⊇ E
2
⊇ and [E
1
[ < ∞, then
¸
¸

E
k
¸
¸
= lim
k→∞
[E
k
[.
(f) If h ∈ R
d
and we define E + h = ¦x + h : x ∈ E¦, then [E + h[ = [E[.
(g) If T : R
d
→R
d
is linear, then [T(E)[ = [ det(T)[ [E[.
Here are some final attempts to illustrate the way in which measurable sets are “well-
behaved.”
Definition 2.9. Let E ⊆ R
d
be arbitrary. The inner measure of E ⊆ R
d
is
[E[
i
= sup¦[F[ : all closed sets F such that F ⊆ E¦.
Compare this definition of inner measure to Lemma 2.4(c), which implies that the exterior
measure of E is given by
[E[
e
= inf¦[U[ : all open sets U such that U ⊇ E¦.
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 7
Theorem 2.10. If [E[
e
< ∞, then E is measurable if and only if [E[
e
= [E[
i
.
Theorem 2.11 (Carath´eodory’s Criterion). Let E ⊆ R
d
be given. Then E is measurable if
and only if for every set A ⊆ R
d
we have
[A[
e
= [A∩ E[
e
+[A¸E[
e
.
From now on, when we are given a set E ⊆ R
d
we implicitly assume that it is measurable
unless specifically stated otherwise.
8 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
3. THE LEBESGUE INTEGRAL
Our goal in this section is to define the integral of most real- or complex-valued functions,
including functions for which the Riemann integral is not defined.
Definition 3.1. Let E ⊆ R
d
, and consider a function mapping E to the extended nonneg-
ative reals, i.e., f : E → [0, ∞].
(a) The graph of f is
Γ(f, E) =
_
(x, f(x)) ∈ R
d+1
: x ∈ E, f(x) < ∞
_
.
(b) The region under the graph of f is the set R(f, E) of all points (x, y) ∈ R
d+1
with
x ∈ E and y ∈ R and such that 0 ≤ y ≤ f(x) if f(x) < ∞, or 0 ≤ y < ∞ if
f(x) = ∞.
We begin by defining the integral of a nonnegative, real-valued function. Later we will
extend this definition to general real- or complex-valued functions.
Definition 3.2. Let E be a measurable subset of R
d
, and suppose that f : E → [0, ∞].
(a) We say that f is a measurable function if R(f, E) is a measurable subset of R
d+1
. It
can be shown that f is a measurable function if and only if ¦x ∈ R
d
: f(x) ≥ α¦ is a
measurable subset of R
d
for each α ∈ R. Sums, products, and limits of measurable
functions are measurable.
(b) If f is a measurable function, then the Lebesgue integral of f over E is the measure
of the region under the graph of f as a subset of R
d+1
, i.e.,
_
E
f =
_
E
f(x) dx = [R(f, E)[.
If the set E is understood, then we may write simply
_
f =
_
E
f. Note that the
integral of a nonnegative f lies in the range
0 ≤
_
E
f ≤ ∞.
From now on, when we are given a function f we implicitly assume that it is measurable
unless specifically stated otherwise.
There are many equivalent ways to define the Lebesgue integral. I prefer the one given
in Definition 3.2 because it captures the intuition of what an integral should mean, i.e., the
integral should represent the “area under the graph” of f. Many texts begin by defining the
integral of step functions, i.e., functions which take only finitely many distinct values. It is
clear how to define the integral of a step function. Then, an arbitrary function f is written
as a limit of step functions and the Lebesgue integral of f is defined to be the limit of the
integrals of the step functions.
Here are some basic properties of integrals of nonnegative functions.
Theorem 3.3. Let E ⊆ R
d
be given, and suppose that f, g : E → [0, ∞].
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 9
(a)
_
E
1 = [E[.
(b) If f ≤ g, then
_
E
f ≤
_
E
g.
(c) If E
1
⊆ E
2
, then
_
E
1
f ≤
_
E
2
f.
(d) If E
1
, E
2
, . . . are disjoint sets in R
d
and E = ∪E
k
, then
_
E
f =

k
_
E
k
f.
(e)
_
E
(f + g) =
_
E
f +
_
E
g.
(f) (Tchebyshev’s Inequality) If α > 0, then [¦x ∈ E : f(x) > α¦[ ≤
1
α
_
E
f.
(g) f = 0 a.e. on E if and only if
_
E
f = 0.
(h) If f = g a.e., then
_
E
f =
_
E
g.
Proof. (a) If f(x) = 1 for all x ∈ E then R(f, E) = ¦(x, y) : x ∈ E, 0 ≤ y ≤ 1¦ =
E [0, 1]. The seemingly obvious but nontrivial fact that [E F[ = [E[ [F[ then implies
that [R(f, E)[ = [E[.
(b) Since R(f, E) ⊆ R(g, E), we have
_
E
f = [R(f, E)[ ≤ [R(g, E)[ =
_
E
g.
(c) This follows similarly from the fact that R(f, E
1
) ⊆ R(f, E
2
).
(d) Note that the sets R(f, E
k
) are disjoint and that R(f, E) = ∪R(f, E
k
). Therefore,
this part follows from Theorem 2.8(b).
(e) This is another “obvious” property that is not trivial to prove using the definition of
Lebesgue integral that we have chosen. The proof is not difficult, but it is rather long and
technical. The idea is that it is easy to prove if f and g are step functions, and that arbitrary
functions can be approximated by step functions.
(f) Let F = ¦x ∈ E : f(x) > α¦. Then
_
E
f ≥
_
F
f ≥
_
F
α = α[F[.
(g) If f = 0 a.e. on E then
_
E
f = [R(f, E)[ = 0.
Conversely, suppose that
_
E
f = 0. Let F = ¦x ∈ E : f(x) > 0¦ be the set of points
where f is strictly positive. We have to show that [F[ = 0. Now, for each α > 0, we have
by Tchebyshev’s Inequality that
0 ≤ [¦x ∈ E : f(x) > α¦[ ≤
1
α
_
E
f = 0.
In particular, the set F
n
= ¦x ∈ E : f(x) > 1/n¦ has measure zero for each n > 0. However,
F is the union of the countably many sets F
1
, F
2
, . . ., so 0 ≤ [F[ ≤

n
[F
n
[ = 0.
(h) If f = g a.e., then f −g = 0 a.e., so
_
E
f −
_
E
g =
_
E
(f −g) = 0.
10 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
Note that Theorem 3.3(h) says that if two functions f and g are equal except on a set
of measure zero, then their integrals are equal. Hence, given a function f we can change
the values of f on any set of measure zero without changing the integral of the f. Hence,
whenever we are concerned only with integrals, we typically do not distinguish between two
functions that are equal except on a set of measure zero.
Now we can extend the definition of Lebesgue integral to more general functions.
Definition 3.4 (Lebesgue Integral for Real-Valued Functions). Let f be a real-valued func-
tion f : E → [−∞, ∞]. We write f as a difference of two nonnegative functions by defining
f
+
(x) =
_
f(x), f(x) ≥ 0,
0, f(x) < 0,
and f

(x) =
_
0, f(x) ≥ 0,
[f(x)[, f(x) < 0,
so that
f = f
+
−f

and [f[ = f
+
+ f

.
We say that f is measurable if both f
+
and f

are measurable, and in this case we define
the Lebesgue integral of f to be
_
E
f(x) dx =
_
E
f
+
(x) dx −
_
E
f

(x) dx,
as long as this does not have the form ∞−∞ (in that case, the integral is undefined).
Since 0 ≤ f
+
, f

≤ [f[ = f
+
+ f

, it follows from Definition 3.4 that
_
E
f exists as a finite real value ⇐⇒
_
E
f
+
< ∞ and
_
E
f

< ∞
⇐⇒
_
E
f
+
+
_
E
f

< ∞
⇐⇒
_
E
[f[ < ∞.
Note that if [f(x)[ = ∞ on a set with positive Lebesgue measure then
_
[f[ = ∞. Equiva-
lently, if
_
E
[f[ < ∞ then [f(x)[ < ∞ a.e. However, it is possible to have [f(x)[ < ∞ a.e.
yet still have
_
E
[f[ = ∞. For example, if f(x) = 1 for all x ∈ R, then
_
f = ∞.
Definition 3.5 (Lebesgue Integral for Complex-Valued Functions). Suppose that f : E →C.
Split f into real and imaginary parts by writing f = Re (f) + i Im(f). Then we define the
Lebesgue integral of f to be
_
E
f =
_
E
Re (f) + i
_
E
Im(f),
as long as both integrals on the right are defined and finite.
Note that [Re (f)[, [Im(f)[ ≤ [f[ ≤ [Re (f)[ +[Im(f)[. Therefore
_
E
[f[ < ∞ ⇐⇒
_
E
[Re (f)[,
_
E
[Im(f)[ < ∞.
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 11
Consequently,
_
E
f exists as a complex number ⇐⇒
_
E
[f[ < ∞.
Definition 3.6. We say that a function f : E → [−∞, ∞] or f : E → C is integrable if
_
E
[f[ < ∞. The collection of all integrable functions on E is called L
1
(E). That is, if we
are dealing with real-valued functions then
L
1
(E) =
_
f : E → [−∞, ∞] :
_
E
[f[ < ∞
_
,
or if we are dealing with complex-valued functions then
L
1
(E) =
_
f : E →C :
_
E
[f[ < ∞
_
.
The choice of real-valued versus complex-valued functions is usually clear from context. In
either case, L
1
(E) is a vector space under the usual operations of function addition and
multiplication by scalars, and
|f|
1
=
_
E
[f[
defines a norm on this space, meaning that:
(a) 0 ≤ |f|
1
< ∞ for all f ∈ L
1
(R),
(b) |f|
1
= 0 if and only if f = 0 a.e.,
(c) |cf|
1
= [c[ |f| for all f ∈ L
1
(R) and all scalars c, and
(d) |f + g|
1
≤ |f|
1
+|g|
1
for all f, g ∈ L
1
(R).
Note that in property (a), we only have that |f|
1
= 0 implies f = 0 a.e., not that f = 0.
In this sense |f|
1
does not quite satisfy the requirements of a norm (instead, it is only
a seminorm). On the other hand, we have declared that we will not distinguish between
two functions that are equal a.e., and with this identification we do have that |f|
1
satisfies
all the requirements of a norm. In other words, we regard any function that is 0 a.e. as
being “the” zero element of L
1
(R), and if f = g a.e. then we regard f and g as being the
“same” element of L
1
(R). To be more precise, we are really taking the elements of L
1
(R) to
be equivalence classes of functions that are equal a.e. This distinction between equivalence
classes of functions and the functions themselves is not usually an issue, and we will ignore it.
Definition 3.7. In analogy to L
1
(E), given 1 ≤ p < ∞ we define
L
p
(E) =
_
f : E → [−∞, ∞] :
_
E
[f(x)[
p
dx < ∞
_
,
or make the obvious adjustment if we are dealing with complex-valued functions. It can be
shown that L
p
(E) is a vector space, and that
|f|
p
=
__
E
[f(x)[
p
dx
_
1/p
12 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
is a norm on this space (again in the sense of identification functions that are equal almost
everywhere). In fact, L
p
(E) is complete in this norm (all Cauchy sequences converge), and
is therefore Banach space. If [E[ < ∞ and 1 ≤ p < q < ∞ then L
q
(E) ⊆ L
p
(E). This
inclusion fails if E has infinite measure.
For the case p = ∞, we define
L

(E) =
_
f : E → [−∞, ∞] : ess sup
x∈E
[f(x)[ < ∞
_
.
Then L

(E) is a Banach space with respect to the norm
|f|

= ess sup
x∈E
[f(x)[.
If [E[ < ∞, then L

(E) ⊆ L
p
(E) for each 1 ≤ p < ∞, and in fact L

(E) =

p≥1
L
p
(E).
For the case p = 2,
¸f, g) =
_
E
f(x) g(x) dx
defines an inner product on L
2
(E). Thus L
2
(E) is a Hilbert space as well as a Banach space.
Out of all the exponents 1 ≤ p ≤ ∞, only L
2
(E) is a Hilbert space.
Here is a fundamental inequality for the L
p
norms.
Theorem 3.8 (H¨older’s Inequality). Let E ⊆ R
d
be measurable, and fix 1 ≤ p ≤ ∞. If
f ∈ L
p
(E) and g ∈ L
p

(E) then fg ∈ L
1
(E), and
|fg|
1
≤ |f|
p
|g|
p
′ .
For 1 < p < ∞, this inequality is
_
E
[fg[ ≤
__
E
[f[
p
_
1/p
__
E
[g[
p

_
1/p

.
For p = 2, H¨older’s inequality is known as the Schwarz, Cauchy–Schwarz, or Cauchy–
Bunyakowski–Schwarz inequality. It has the form
_
E
[fg[ ≤
__
E
[f[
2
_
1/2
__
E
[g[
2
_
1/2
.
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 13
4. SWITCHING INTEGRALS
Suppose that f(x, y) is a function of two variables, with x varying through a domain
E ⊆ R
m
and y varying through a domain F ⊆ R
n
. Suppose also that we want to integrate
f over the entire domain
E F = ¦(x, y) ∈ R
m+n
: x ∈ E, y ∈ F¦.
In this case, it is very important which set we integrate over first! In general, it is NOT true
that if we integrate over x first and y second, we will get the same result as if we integrate
over y first and x second.
Fubini’s and Tonelli’s Theorems give two conditions under which we can safely exchange
the order of integration. Fubini’s Theorem says that we can the switch integrals if f is
an integrable function, and Tonelli’s Theorem says that we can switch the integrals if f is a
nonnegative function. If neither theorem applies, then it is possible that
_
E
_
F
f(x, y) dxdy ,=
_
F
_
E
f(x, y) dy dx!
Theorem 4.1 (Fubini’s Theorem). If any one of the possible double integrals of [f(x, y)[ is
finite, then interchanging the order of integration of f(x, y) is allowed. That is, if
(a)
_
E
__
F
[f(x, y)[ dy
_
dx < ∞, or
(b)
_
F
__
E
[f(x, y)[ dx
_
dy < ∞, or
(c)
__
E×F
[f(x, y)[ (dxdy) < ∞,
then
_
E
_
F
f(x, y) dy dx =
_
F
_
E
f(x, y) dxdy =
__
E×F
f(x, y) (dxdy). (4.1)
Moreover, in this case g(x) =
_
F
f(x, y) dy is well-defined for almost every x, and g is an
integrable function of x, i.e., g ∈ L
1
(E). Similarly, h(y) =
_
E
f(x, y) dx is well-defined for
almost every y, and is an integrable function of y, i.e., h ∈ L
1
(F).
Theorem 4.2 (Tonelli’s Theorem). If f(x, y) ≥ 0 a.e., then interchanging the order of
integration of f(x, y) is allowed, i.e.,
_
E
_
F
f(x, y) dy dx =
_
F
_
E
f(x, y) dxdy. (4.2)
Note that the hypotheses of Fubini’s Theorem imply that the integrals in equation (4.1)
are finite real or complex numbers. However, the integrals in equation (4.2) may be finite
real numbers or ∞. The equality in (4.2) means that if one side is finite then the other side
is finite as well, and if one is infinite then the other is infinite as well.
14 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
Because a series can be viewed as a “discrete integral,” there are analogues of Fubini’s
and Tonelli’s theorems that apply to the problem of interchanging an integral and a sum or
interchanging two summations (see Corollary 5.3).
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 15
5. SWITCHING INTEGRALS AND LIMITS
Suppose that ¦f
n
¦
n∈N
is a sequence of functions that converge pointwise almost every-
where, i.e., there is a function f such that f
n
(x) → f(x) for almost every x. The following
example shows that this does NOT imply that the integrals of f
n
must converge to the
integral of f, i.e., it need NOT be true that lim
n→∞
_
E
f
n
=
_
E
lim
n→∞
f
n
!
Example 5.1. Let f
n
be defined as follows. For 0 ≤ x ≤ 1/n the graph of f
n
looks like an
isosceles triangle with base [0, 1/n] and height n. For all other x we set f
n
(x) = 0. Then
f
n
(x) → 0 for every x ∈ R! However,
_
f
n
= 1/2 for every n, so
_
f
n
does not converge to
_
0 dx = 0.
Our goal in this section is to give some conditions under which a limit and an integral
can be interchanged. The first result, known as the Monotone Convergence Theorem or
the Beppo–Levi Theorem, applies to the case of nonnegative functions that are monotone
increasing a.e., i.e., for which
0 ≤ f
1
(x) ≤ f
2
(x) ≤ f
3
(x) ≤ for a.e. x ∈ E.
Theorem 5.2 (Monotone Convergence Theorem). Let E ⊆ R
d
be given, and suppose that
f
n
: E → [0, ∞] for n ∈ N. If the sequence of functions ¦f
n
¦
n∈N
is monotone increasing a.e.
on E and if lim
n→∞
f
n
(x) = f(x) for a.e. x ∈ E, then
lim
n→∞
_
E
f
n
=
_
E
lim
n→∞
f
n
=
_
E
f.
Proof. Note that R(f
1
, E) ⊆ R(f
2
, E) ⊆ and that R(f, E) = ∪R(f
n
, E). It therefore
follows from Theorem 2.8(d) that
_
E
f = [R(f, E)[ = lim
n→∞
[R(f
n
, E)[ = lim
n→∞
_
E
f
n
.
Recall that an infinite series is a limit of the partial sums of the series. Hence, we must be
careful when switching an integral and an infinite series. As a corollary of the Beppo–Levi
Theorem we can prove the following version of Tonelli’s Theorem that gives a condition on
when an integral and a summation can be interchanged.
16 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
Corollary 5.3 (Tonelli’s Theorem). Let E ⊆ R
d
begiven, and suppose that f
n
: E → [0, ∞]
for n ∈ N. Then
_
E

n=1
f
n
=

n=1
_
E
f
n
.
Proof. Set
F
N
(x) =
N

n=1
f
n
(x) and F(x) =

n=1
f
n
(x).
Then F
1
(x) ≤ F
x
(x) ≤ for a.e. x, and lim
N→∞
F
N
(x) = F(x) a.e. Therefore, by the
Monotone Convergence Theorem,
_
E
F = lim
N→∞
_
E
F
N
= lim
N→∞
_
E
N

n=1
f
n
= lim
N→∞
N

n=1
_
E
f
n
=

n=1
_
E
f
n
. (5.1)
Note that we were allowed to switch the sum and the integral in equation (5.1) because it
was a finite sum.
If the functions f
n
are nonnegative but not monotone increasing, then we may not be
able to interchange a limit and an integral. However, the following result states that if the
functions f
n
are all nonnegative, then we do at least have a particular inequality.
Theorem 5.4 (Fatou’s Lemma). Let E ⊆ R
d
be given, and suppose that f
n
: E → [0, ∞]
for n ∈ N. Then
_
E
liminf
n→∞
f
n
≤ liminf
n→∞
_
E
f
n
.
Consequently, if the f
n
converge pointwise almost everywhere, i.e., if lim
n→∞
f
n
(x) = f(x) a.e.,
and if the integrals converge as well, then
_
E
f dx ≤ lim
n→∞
_
E
f
n
dx.
Proof. Set g
n
(x) = inf
k≥n
f
k
(x). Then g
1
(x) ≤ g
2
(x) ≤ , so ¦g
n
¦
n∈N
is monotone increasing
for each x. Define
f(x) = liminf
n→∞
f
n
(x) = lim
n→∞
inf
k≥n
f
k
(x) = lim
n→∞
g
n
(x).
Then by the Monotone Convergence Theorem and the fact that g
n
(x) ≤ f
n
(x), we have
_
E
f =
_
E
lim
n→∞
g
n
= lim
n→∞
_
E
g
n
= liminf
n→∞
_
E
g
n
≤ liminf
n→∞
_
E
f
n
.
The Lebesgue Dominated Convergence Theorem, or LCDT, is perhaps the most important
and useful result of this section. It applies to functions that aren’t necessarily nonnegative
or monotone increasing, and it is the theorem to use in most cases.
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 17
Theorem 5.5 (Lebesgue Dominated Convergence Theorem). Let E ⊆ R
d
be given, and
suppose that f
n
: E → [−∞, ∞] for n ∈ N. Suppose also that the functions f
n
(x) converge
pointwise almost everywhere, i.e., lim
n→∞
f
n
(x) = f(x) for a.e. x ∈ E. If there is a single
function g such that:
(a) [f
n
(x)[ ≤ g(x) a.e. for every n, and
(b) g is integrable, i.e.,
_
E
[g[ < ∞,
then
lim
n→∞
_
E
f
n
=
_
E
lim
n→∞
f
n
=
_
E
f.
In fact, even more is true in this case: f
n
converges to f in L
1
-norm, i.e.,
lim
n→∞
|f −f
n
|
1
= lim
n→∞
_
E
[f −f
n
[ = 0.
Proof. We will give the proof for nonnegative f
n
only, but it can be extended to general f.
Suppose that f
n
≥ 0 a.e. Then by Fatou’s Lemma,
_
E
f =
_
E
liminf
n→∞
f
n
≤ liminf
n→∞
_
E
f
n
.
Further, since g −f
n
≥ 0 a.e., we can apply Fatou’s Lemma to the functions g −f
n
, to obtain
_
E
g −
_
E
f =
_
E
(g −f)
=
_
E
liminf
n→∞
(g −f
n
)
≤ liminf
n→∞
_
E
(g −f
n
) (by Fatou’s Lemma)
= liminf
n→∞
__
E
g −
_
E
f
n
_
=
_
E
g −limsup
n→∞
_
E
f
n
.
Therefore,
_
E
f ≤ liminf
n→∞
_
E
f
n
≤ limsup
n→∞
_
E
f
n

_
E
f.
Hence lim
n→∞
_
E
f
n
exists and equals
_
E
f.
18 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
6. CONVOLUTION
Definition 6.1. Let f and g be real- or complex-value functions with domain R
d
. Then the
convolution of f and g is the function f ∗ g defined by
(f ∗ g)(x) =
_
f(y) g(x −y) dy,
whenever this is well-defined.
Theorem 6.2. If f, g ∈ L
1
(R
d
) then f ∗ g ∈ L
1
(R
d
), and |f ∗ g|
1
≤ |f|
1
|g|
1
.
Proof. We start by computing:
_
[(f ∗ g)(x)[ dx =
_
¸
¸
¸
¸
_
f(y) g(x −y) dy
¸
¸
¸
¸
dx ≤
__
[f(y) g(x −y)[ dy dx = (∗).
Because [f(y) g(x − y)[ ≥ 0 for all x and y, Tonelli’s Theorem allows us to interchange the
order of integration in (∗). So, we can continue as follows:
(∗) =
__
[f(y) g(x −y)[ dxdy =
_
[f(y)[
__
[g(x −y)[ dx
_
dy = (∗∗).
Now, since we are integrating over all of R
d
, we know that
_
[g(x −y)[ dx =
_
[g(x)[ dx
(this wouldn’t necessarily be true if we were integrating on a finite domain). Therefore,
(∗∗) =
_
[f(y)[
__
[g(x)[ dx
_
dy =
_
[f(y)[ |g|
1
dy = |g|
1
_
[f(y)[ dy = |g|
1
|f|
1
.
Put it all together and we have shown |f ∗ g|
1
≤ |g|
1
|f|
1
.
In fact, Theorem 6.2 is just a special case of the following more general result.
Theorem 6.3 (Young’s Convolution Inequality). If 1 ≤ p ≤ ∞ and f ∈ L
p
(R
d
) and
g ∈ L
1
(R
d
) then f ∗ g ∈ L
p
(R
d
), and
|f ∗ g|
p
≤ |f|
p
|g|
1
.
Proof. We’ve already done the case p = 1. The case p = ∞ is easy, so I will leave it as an
exercise. For other p’s it is a little tricky.
First let p

be the dual exponent to p, i.e., the number which satisfies
1
p
+
1
p

= 1. Then
write
[(f ∗ g)(x)[ ≤
_
[f(y) g(x −y)[ dy =
_
[f(y) g(x −y)
1/p
[ [g(x −y)
1/p

[ dy = (∗).
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 19
Now apply H¨older’s inequality to the two parts of the integrand:
(∗) ≤
__
[f(y) g(x −y)
1/p
[
p
dy
_
1/p
__
[g(x −y)
1/p

[
p

dy
_
1/p

=
__
[f(y)[
p
[g(x −y)[ dy
_
1/p
__
[g(x −y)[ dy
_
1/p

=
__
[f(y)[
p
[g(x −y)[ dy
_
1/p
__
[g(y)[ dy
_
1/p

= |g|
1/p

1
__
[f(y)[
p
[g(x −y)[ dy
_
1/p
.
Therefore,
|f ∗ g|
p
p
=
_
[(f ∗ g)(x)[
p
dx = |g|
p/p

1
__
[f(y)[
p
[g(x −y)[ dy dx
= |g|
p/p

1
__
[f(y)[
p
[g(x −y)[ dxdy
= |g|
p/p

1
_
[f(y)[
p
__
[g(x −y)[ dx
_
dy
= |g|
p/p

1
_
[f(y)[
p
__
[g(x)[ dx
_
dy
= |g|
p/p

1
_
[f(y)[
p
|g|
1
dy
= |g|
1+p/p

1
|f|
p
p
= |g|
p
1
|f|
p
p
.
Take pth roots and you’re done.
20 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
7. CROSS PRODUCT BASES
Suppose that we have an orthonormal basis for the Hilbert space
L
2
[a, b] = ¦f : [a, b] →C : |f|
2
=
__
b
a
[f(x)[
2
dx
_
1/2
< ∞¦
of functions that are square-integrable defined on the interval [a, b], with inner product
¸f, g) =
_
b
a
f(x) g(x) dx.
We will show how to use this orthonormal basis to construct an orthonormal basis for the
Hilbert space
L
2
([a, b] [a, b]) = ¦F : [a, b] [a, b] →C : |F|
2
=
__
b
a
_
b
a
[F(x, y)[
2
dxdy
_
1/2
< ∞¦,
of functions that are square-integrable on the square [a, b] [a, b], under the inner product
¸F, G) =
_
b
a
_
b
a
F(x, y) G(x, y) dxdy.
Theorem 7.1. Suppose that ¦f
n
(x)¦
n∈N
is an orthonormal basis for L
2
[a, b], and define
F
mn
(x, y) = f
m
(x) f
n
(y).
Then ¦F
mn
(x, y)¦
m,n∈N
is an orthonormal basis for L
2
([a, b] [a, b]).
Proof. First we check that the functions F
mn
are indeed orthonormal:
¸F
mn
, F
jk
) =
_
b
a
_
b
a
F
mn
(x, y) F
jk
(x, y) dxdy
=
_
b
a
_
b
a
f
m
(x) f
n
(y) f
j
(x) f
k
(y) dxdy
=
_
b
a
f
n
(y) f
k
(y)
__
b
a
f
m
(x) f
j
(x) dx
_
dy
=
_
b
a
f
n
(y) f
k
(y) ¸f
m
, f
j
) dy
= ¸f
m
, f
j
)
_
b
a
f
n
(y) f
k
(y) dy
= ¸f
m
, f
j
) ¸f
n
, f
k
)
=
_
1, if m = j and n = k,
0, if m ,= j or n ,= k.
This establishes that ¦F
mn
¦ is an orthonormal system in L
2
([a, b] [a, b]).
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 21
Now we have to show that this orthonormal system is an orthonormal basis. We have
several choices for doing this. One way is to show that ¦F
mn
¦ is complete, i.e., the only
function F ∈ L
2
([a, b] [a, b]) that is orthogonal to every F
mn
is the zero function. So,
suppose that F ∈ L
2
([a, b] [a, b]) is such that ¸F, F
mn
) = 0 for every m and n. It would be
easy to proceed if it was the case that F(x, y) = f(x) g(y) for some functions f, g ∈ L
2
[a, b],
but it is important to note that only SOME of the functions in L
2
([a, b] [a, b]) can be
“factored” in this way. So, we have to be more careful. For a general function F(x, y), we
begin by computing that
0 = ¸F, F
mn
) =
_
b
a
_
b
a
F(x, y) F
mn
(x, y) dxdy
=
_
b
a
_
b
a
F(x, y) f
m
(x) f
n
(y) dxdy
=
_
b
a
__
b
a
F(x, y) f
m
(x) dx
_
f
n
(y) dy
=
_
b
a
h
m
(y) f
n
(y) dy
= ¸h
m
, f
n
), (7.1)
where
h
m
(y) =
_
b
a
F(x, y) f
m
(x) dx.
Note that h
m
∈ L
2
[a, b] because, by the Schwarz inequality,
|h
m
|
2
2
=
_
b
a
[h
m
(y)[
2
dy =
_
b
a
¸
¸
¸
¸
_
b
a
F(x, y) f
m
(x) dx
¸
¸
¸
¸
2
dy

_
b
a
__
b
a
[F(x, y)[
2
dx
___
b
a
[f
m
(x)[
2
dx
_
dy
=
_
b
a
__
b
a
[F(x, y)[
2
dx
_
|f
m
|
2
dy
=
_
b
a
_
b
a
[F(x, y)[
2
dxdy
= |F|
2
2
< ∞.
Moreover, considering now a fixed m, equation (7.1) says that h
m
is orthogonal to every
function f
n
in the orthonormal basis ¦f
n
¦
n∈N
for L
2
[a, b]. Therefore h
m
= 0 a.e.
We still have to show that F = 0 a.e. So, for each y let G
y
(x) be the function defined by
G
y
(x) = F(x, y).
22 REVIEW OF LEBESGUE MEASURE AND INTEGRATION
As was the case for h
m
, you can easily check that for each fixed y, the function G
y
is in
L
2
[a, b] (as a function of x alone). Moreover, since h(y) = 0 a.e., we have
¸G
y
(x), f
m
(x)) =
_
b
a
F(x, y) f
m
(y) dx = h
m
(y) = 0.
That is, G
y
is orthogonal to every f
m
, and since ¦f
m
¦ is an orthonormal basis for L
2
[a, b]
we therefore conclude that G
y
(x) = 0 for a.e. x. Since this is true for a.e. y, we conclude
that F(x, y) = 0 for a.e. (x, y) (OK, there’s a little argument to fill about sets in the plane
with measure zero, but it’s easy). We started with a function F(x, y) that was orthogonal to
every f
m
(x) f
n
(y) and showed that this F must be zero a.e., so this shows that ¦f
m
(x) f
n
(y)¦
is complete, and hence is an orthonormal basis since we have already shown that it is an
orthonormal system.
Here is another way of showing that ¦f
m
(x) f
n
(y)¦ is complete. Instead of showing that
only the zero function is orthogonal to every element of this system, we can show that its
finite linear span is dense. So, choose any function F(x, y) ∈ L
2
([a, b] [a, b]). By any one of
several arguments, we know that the set of continuous functions is dense in L
2
([a, b] [a, b]).
Therefore, there is a continuous function G(x, y) ∈ L
2
([a, b] [a, b]) such that |F −G|
2
< ε.
Now subdivide the square [a, b] [a, b] into finitely many smaller squares Q
k
for k = 1, . . . N.
Then we can approximate G by a step function
H(x, y) =
N

k=1
c
k
χ
Q
k
(x, y).
For example, by taking the squares small enough and letting c
k
be the average value of G
on the square Q
k
, we can make |G − H|
2
< ε. Now, each square is a cross product of two
intervals: Q
k
= I
k
J
k
. Therefore,
χ
Q
k
(x, y) =
χ
I
k
(x)
χ
J
k
(y).
Since
χ
I
k
,
χ
J
k
∈ L
2
[a, b], we can write
χ
I
k
(x) =

m
a
mk
f
m
(x) and
χ
J
k
(y) =

n
b
nk
f
n
(y)
for some scalars a
mk
and b
nk
(in fact, these scalars are given by inner products, but that
doesn’t really matter for this argument). Hence,
H(x, y) =
N

k=1
c
k
χ
I
k
(x)
χ
J
k
(y) =
N

k=1
c
k
_

m
a
mk
f
m
(x)
__

n
b
nk
f
n
(y)
_
=

m

n
_
N

k=1
c
k
a
mk
b
nk
_
f
m
(x) f
n
(y)
=

m

n
d
mn
f
m
(x) f
n
(y), (7.2)
where d
mn
=

N
k=1
c
k
a
mk
b
nk
are some new scalars (note that this is a finite sum, so it
is well-defined). But equation (7.2) says that H is an infinite linear combination of the
REVIEW OF LEBESGUE MEASURE AND INTEGRATION 23
functions f
m
(x) f
n
(y), hence can be approximated to within ε in L
2
-norm by a function in
span¦f
m
(x) f
n
(y)¦. Therefore F is approximated to within 3ε in L
2
-norm by a function in
this span. Hence the span is dense, so ¦f
m
(x) f
n
(y)¦ is complete, and therefore forms an
orthonormal basis since we already know that it is an orthonormal system.
Example 7.2. The set ¦e
2πinx
¦
n∈Z
is an orthonormal basis for L
2
[0, 1]. Therefore, the
collection
¦e
2πimx
e
2πiny
¦
m,n∈Z
is an orthonormal basis for L
2
([0, 1] [0, 1]).
Theorem 7.1 can easily be adapted to cover the case of finding an orthonormal basis for
L
2
([a, b] [c, d]) or L
2
(R
2
), etc. The general principle is that if ¦f
n
¦ is an orthonormal basis
for L
2
(Ω
1
) and ¦g
n
¦ is an orthonormal basis for L
2
(Ω
2
), then ¦f
m
(x) g
n
(y)¦ is an orthonormal
basis for L
2
(Ω
1

2
).

2

REVIEW OF LEBESGUE MEASURE AND INTEGRATION

(b) Suppose that {xn }n∈N is a sequence of points in Rd and that x ∈ Rd . We say that xn converges to x, and write xn → x or x = lim xn , if
n→∞ n→∞

lim |xn − x| = 0.

Using this definition of distance, we can now define open and closed sets in Rd and state some of their basic properties. Definition 1.3. Let E ⊆ Rd be given.

(a) E is open if for each point x ∈ E there is some open ball centered at x that is completely contained in E, i.e., Br (x) ⊆ E for some r > 0. Br (x) = {y ∈ Rd : |x − y| < r}

(b) A point x ∈ Rd is a limit point of E if there exist points xn ∈ E that converge to x, i.e., such that xn → x. (c) The complement of E is E C = Rd \E = {x ∈ Rd : x ∈ E}. / (d) E is closed if its complement E C is open. It can be shown that E is closed if and only if E contains all its limit points. ¯ (e) If E is any subset of R, then its closure E is the smallest closed set that contains E. It can be shown that ¯ E = E ∪ {x ∈ Rd : x is a limit point of E}. ¯ (f) E is dense in Rd if E = Rd . For example, the set Q of all rational numbers is a dense subset of R. (g) E is bounded if it is contained in some ball with finite radius, i.e., if there is some r > 0 such that E ⊆ Br (0). The following notion of compact sets is very important. Definition 1.4. Let E ⊆ Rd be given.

(a) An open cover of E is any collection {Uα }α∈J of open sets such that E ⊆ α Uα . The index set J may be finite, countable, or uncountable, i.e., there may be finitely many, countably many, or uncountably many open sets Uα in the collection.

(b) E is compact if every open cover {Uα }α∈J of E contains a finite subcover. That is, E is compact if whenever we choose open sets Uα such that E ⊆ α Uα , then there exist finitely many indices α1 , . . . , αk ∈ J such that E ⊆ Uα1 ∪ · · · ∪ Uαk . Theorem 1.5. Let E ⊆ Rd be given.

(a) (Heine–Borel Theorem) E is compact if and only if it is both closed and bounded.

(b) If E and F are compact sets then E + F = {x + y : x ∈ E and y ∈ F } is compact. ¯ ¯ (c) Suppose that E and F are bounded sets. suppose that z is a limit point of E + F . For simplicity of notation. To show that E + F is closed. Thus z ∈ E + F . j j Proof. then every countable sequence of points {xn }n∈N with xn ∈ E has a convergent subsequence (even if the original sequence does not converge). Since E + F is both closed and bounded. That is. so they are contained in some finite balls centered at the origin. so E + F contains all its limit points. Since E and F are compact. (a) We just have to show that every limit point of E is a limit point of F . This means that there are points zn ∈ E + F which converge to z. zn = xn + yn for some xn ∈ E and yn ∈ F .6. and a point x ∈ Rd so that xnk → x. zn = xn + yn for some xn ∈ E and yn ∈ F . So. Therefore x is a limit point of F by definition. Therefore x′′ + yj → x + y. Let E. there exists a subsequence {xnk }k∈N which does converge to some ′ ′ x ∈ E. So. so each xn is also an element of F . ¯ + F . so z ∈ E + F . remember where these points came from: x′′ = xnkj and yj = ynkj . it is still true that x′kj → x. Then there exist points xn ∈ E such that xn → x. then E + F ⊆ E + F . as desired. (Note that x is then a limit point of E. we do know that {xn }n∈N is a sequence of points in E and that E is compact. . Therefore x + y = lim x′′ + lim yj = lim (x′′ + yj ) = z.e. xj → x ∈ E ′′ ′′ ¯ ¯ y ∈ F ). E ⊆ F . it is compact. F ⊆ Rd be given. Note that since x′k → x. and therefore x ∈ E since E is closed. . write x′k = xnk and yk = ynk . then E ⊆ F . (b) Suppose E and F are both compact.) Theorem 1. Then E and F are closed and bounded sets. Hence E + F ⊆ Br+s (0). Therefore j ′′ ′′ xj + yj → z since xn + yn → z.. so E + F is bounded. So it must be the case that z = x + y. suppose that z is any limit point of E + F . ¯ ¯ (c) If E and F are bounded sets (not necessarily compact). Then there exist points zn ∈ E + F such ¯ ¯ that zn → z. Therefore. {yk }k∈N is a sequence ′ of points in F and F is compact. Then we have x′′ → x and yj → y. we can imitate the argument of part b and find convergent subsequences x′′ and j ′′ ′′ ′′ ¯ ¯ ¯ ¯ and yj → y ∈ F for some x ∈ E and y ∈ F (not necessarily x ∈ E or yj . By definition. Certainly E + F ⊆ E ¯ ¯ E + F is in E + F . Now. Then E and F are both bounded. there exist indices n1 < n2 < . say E ⊆ Br (0) and F ⊆ Bs (0). and therefore is closed. ¯ ¯ (a) If E ⊆ F .REVIEW OF LEBESGUE MEASURE AND INTEGRATION 3 (b) (Bolzano–Weierstrass Theorem) If E is compact. suppose that x is a limit point of E. j j j ′′ However. . Again for simplicity ′′ ′ ′′ ′′ write x′′ = x′kj and yj = ykj . so we just have to show that every limit point of ¯ hence compact. WE DO NOT KNOW whether xn and yn will converge to some points x and y! However. By definition. i. However. so there must be a subsequence {ykj }j∈N which converges to some y ∈ F .

and therefore we must restrict the definition of measure to a subclass of well-behaved “measurable sets. Example 2. More generally.e.1) . For example.). Since Q covers itself with one cube. bd ]. The Cantor set is also closed and equals its own boundary.” Definition 2. A property which holds except on a set of exterior measure zero is said to hold almost everywhere (abbreviated a. . (a) A cube or rectangular box in Rd is a set of the form The volume of this cube is Q = [a1 . (d) The Cantor set C is an example of a subset of R which contains uncountably many points yet has exterior measure |C|e = 0. every subset of Rd has a uniquely defined nonnegative exterior measure. However. For example. the set of rational numbers in R has zero exterior measure. Definition 2. MEASURE THEORY Our goal in this section is to assign to each subset of Rd a “size” or “measure” that generalizes the concept of area or volume from simple sets to arbitrary sets. the other inequality is not so trivial to prove.3. k The exterior measure of a set lies in the range 0 ≤ |E|e ≤ ∞. i. if Q1 .. 0.e. . (a) A seemingly “obvious” fact is that if Q is a cube in Rd then |Q|e = vol(Q). Qn are disjoint cubes. (b) The exterior Lebesgue measure of an arbitrary set E ⊆ Rd is |E|e = inf k vol(Qk ) : all countable sequences of cubes Qk with E ⊆ Qk . Allowing the possibility of infinite exterior measure.4 REVIEW OF LEBESGUE MEASURE AND INTEGRATION 2.1. (b) |Rd |e = ∞. (c) If S ⊆ Rd contains only countably many points then |S|e = 0. However. if g(x) = 1. . we will see that this cannot be done for all sets without introducing some strange pathologies. then it can be shown that |Q1 ∪ · · · ∪ Qn |e = vol(Q1 ) + · · · + vol(Qn ). x rational.2. b1 ] × · · · × [ad . vol(Q) = (b1 − a1 ) · · · (bd − ad ). |Q|e = 0. (2. x irrational. it does follow immediately from the definition that |Q|e ≤ vol(Q). .

if given any ε > 0 there exists an open set U ⊇ E such that |U\E|e ≤ ε. Lemma 2. If E is measurable. ⊆ Rd . . in Lemma 2. The problem is that. . then there exists an open set U ⊇ E such that |U|e ≤ |E|e + ε. . Ek e (c) If E ⊆ Rd and ε > 0. then we would actually have | ∪ Ek |e = k |Ek |e . E2 . ⊆ Rd such that | ∪ Ek |e < k |Ek |e . .6.e. ∞] is ess sup f (x) = inf{M : f (x) ≤ M a. the definition of exterior measure is too inclusive. F ⊆ Rd be given. Remark 2. Likewise. . then |E|e ≤ |F |e . then its Lebesgue measure is its exterior measure. we expect that |U\E|e ≤ ε. Definition 2.e. even though there exist some very strange sets that behave in unexpected ways (the existence of such strange sets is a consequence of the Axiom of Choice). A set E ⊆ Rd is Lebesgue measurable. for such sets we have |(U\E)∪E|e = |U|e ≤ |E|e + ε < |E|e + |U\E|e even though U is the union of the two disjoint sets U\E and E. .4. Specifically. are disjoint. This leads us to make the following definition. since the set of points {x ∈ R : g(x) = 0} is countable and therefore has measure zero. For example.4(b) that if the sets E1 .5. . Yet this is also FALSE in general! Consequently. then (a) If E ⊆ F . .REVIEW OF LEBESGUE MEASURE AND INTEGRATION 5 then we say that g = 0 a. (Note that we also have |E|e ≤ |U|e by part a. The prefix essential is often applied to a property that holds a. for the function g given in equation (2. the set U\E should have small exterior measure. . x∈E Thus. and is denoted |E| = |E|e . or simply measurable. the essential supremum of a function f : E → [−∞. All sets have an exterior measure. E2 . Let E. in some sense. x∈R Here are some basic properties of exterior measure. One way to handle this problem is to restrict our attention to sets which are “well-behaved” with respect to exterior measure.) ≤ k |Ek |e . . Yet it can be shown that this is FALSE in general: there exist disjoint sets E1 . (b) If E1 . We might expect in Lemma 2.4(c) we might expect that since E ⊆ U and |E|e ≤ |U|e ≤ |E|3 + ε.1) we have sup g(x) = 1 x∈R while ess sup g(x) = 0.e.}. E2 .

. then so is k Ek . (f) Suppose that |E|e = 0. . are measurable. (a) (b) If E1 .8. then |E1 \E2 | = |E1 | − |E2 |. . if E is measurable. then so is E C . That is.. . since U\E ⊆ U.4. are measurable. . (b) All closed subsets of Rd are measurable. .” Definition 2. i. we have by Lemma 2. (f) All sets with exterior measure zero are measurable.6 REVIEW OF LEBESGUE MEASURE AND INTEGRATION There exists sets that are not measurable. . E2 . (e) The complement of a measurable set is measurable. That is. Ek = lim |Ek |. That is. Theorem 2.9. if E1 .e. . That is. Proof. then E is measurable. (a) All open subsets of Rd are measurable. then (d) If E1 ⊆ E2 ⊆ · · · . Then by Lemma 2. Let E and E1 .7. the sets we encounter are all measurable. Compare this definition of inner measure to Lemma 2. be measurable subsets of Rd .4(c). k→∞ |Ek |. it is nonconstructive. Here are some final attempts to illustrate the way in which measurable sets are “wellbehaved. E2 . Typically. Ek = k (c) If E1 ⊆ E2 and |E2 | < ∞. (c) Countable unions of measurable sets are measurable. are disjoint. (f) If h ∈ R and we define E + h = {x + h : x ∈ E}. if |E|e = 0. which implies that the exterior measure of E is given by |E|e = inf{|U| : all open sets U such that U ⊇ E}. Hence E is measurable by definition. it simply says that such sets exist but does not explicitly display one. E2 . then Ek = lim |Ek |. The inner measure of E ⊆ Rd is |E|i = sup{|F | : all closed sets F such that F ⊆ E}. E2 . (d) Countable intersections of measurable sets are measurable. there exists an open set U ⊇ E such that |U|e ≤ |E|e + ε = 0 + ε = ε. and let ε > 0 be given. Therefore. . k→∞ (g) If T : Rd → Rd is linear. and almost all operations that we perform on measurable sets leave their measurability intact.4(a) that |U\E|e ≤ |U|e ≤ ε. then |T (E)| = | det(T )| |E|. . . Lemma 2. However. then |E + h| = |E|. as the proof of this fact relies on the Axiom of Choice. then d Ek ≤ k |Ek |. . if E1 . Let E ⊆ Rd be arbitrary. (e) If E1 ⊇ E2 ⊇ · · · and |E1 | < ∞. then so is k Ek . .

If |E|e < ∞.11 (Carath´odory’s Criterion). when we are given a set E ⊆ Rd we implicitly assume that it is measurable unless specifically stated otherwise. Let E ⊆ Rd be given. then E is measurable if and only if |E|e = |E|i .REVIEW OF LEBESGUE MEASURE AND INTEGRATION 7 Theorem 2. From now on. Then E is measurable if e and only if for every set A ⊆ Rd we have |A|e = |A ∩ E|e + |A\E|e . Theorem 2.10. .

or 0 ≤ y < ∞ if f (x) = ∞. y) ∈ Rd+1 with x ∈ E and y ∈ R and such that 0 ≤ y ≤ f (x) if f (x) < ∞. E) = (x. Here are some basic properties of integrals of nonnegative functions.e. and limits of measurable functions are measurable. (b) If f is a measurable function. Theorem 3. ∞]. Note that the f ≤ ∞. real-valued function.2 because it captures the intuition of what an integral should mean. and suppose that f : E → [0. Later we will extend this definition to general real. f : E → [0. THE LEBESGUE INTEGRAL Our goal in this section is to define the integral of most real. f = E E f (x) dx = |R(f. then we may write simply integral of a nonnegative f lies in the range 0 ≤ E f . Let E be a measurable subset of Rd . I prefer the one given in Definition 3.or complex-valued functions.e. Many texts begin by defining the integral of step functions. f = E If the set E is understood. f (x) < ∞ . From now on. ∞].1. (a) We say that f is a measurable function if R(f.. ∞]. the integral should represent the “area under the graph” of f . E) is a measurable subset of Rd+1 . products. We begin by defining the integral of a nonnegative. then the Lebesgue integral of f over E is the measure of the region under the graph of f as a subset of Rd+1 ... (b) The region under the graph of f is the set R(f.or complex-valued functions. Then. i.. functions which take only finitely many distinct values. Definition 3.3. an arbitrary function f is written as a limit of step functions and the Lebesgue integral of f is defined to be the limit of the integrals of the step functions. Let E ⊆ Rd be given. Sums. It can be shown that f is a measurable function if and only if {x ∈ Rd : f (x) ≥ α} is a measurable subset of Rd for each α ∈ R. i. and suppose that f . There are many equivalent ways to define the Lebesgue integral.8 REVIEW OF LEBESGUE MEASURE AND INTEGRATION 3. It is clear how to define the integral of a step function. . (a) The graph of f is Γ(f. E)|. E) of all points (x. Let E ⊆ Rd .e. i. Definition 3.2. including functions for which the Riemann integral is not defined. and consider a function mapping E to the extended nonnegative reals. when we are given a function f we implicitly assume that it is measurable unless specifically stated otherwise. i. g : E → [0. f (x)) ∈ Rd+1 : x ∈ E.e.

8(b). we have E f = |R(f. the set Fn = {x ∈ E : f (x) > 1/n} has measure zero for each n > 0. f+ E (f) (Tchebyshev’s Inequality) If α > 0. then E E 1 α E f. so 0 ≤ |F | ≤ n |Fn | = 0.e. suppose that E f = 0. However. y) : x ∈ E. so E f− g = E E (f − g) = 0. (c) This follows similarly from the fact that R(f.. . Now.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 9 (a) E 1 = |E|. 0 ≤ y ≤ 1} = E × [0. are disjoint sets in Rd and E = ∪Ek . f.e. (d) Note that the sets R(f. The seemingly obvious but nontrivial fact that |E × F | = |E| |F | then implies that |R(f. E (b) If f ≤ g. E2 . E). F2 .. Ek ). then f − g = 0 a. (g) If f = 0 a.. Ek ) are disjoint and that R(f. Then E f ≥ F f ≥ F α = α |F |.e. for each α > 0. The proof is not difficult. on E then E f = |R(f. E1 ) ⊆ R(f. this part follows from Theorem 2. F is the union of the countably many sets F1 . then E E f= k Ek f. then (e) (f + g) = f≤ g. Proof. The idea is that it is easy to prove if f and g are step functions. (a) If f (x) = 1 for all x ∈ E then R(f. Therefore. . . E)| = E g. E (d) If E1 . E) ⊆ R(g. . E)| = 0. 1]. f = 0. and that arbitrary functions can be approximated by step functions. E In particular. . . but it is rather long and technical.e. We have to show that |F | = 0. E) = {(x.e. E)| = |E|. we have by Tchebyshev’s Inequality that 0 ≤ |{x ∈ E : f (x) > α}| ≤ 1 α f = 0. (e) This is another “obvious” property that is not trivial to prove using the definition of Lebesgue integral that we have chosen. f= E g. E2 ). Conversely. E2 (c) If E1 ⊆ E2 . then f≤ E1 E g. (h) If f = g a. (b) Since R(f. (f) Let F = {x ∈ E : f (x) > α}. . E) = ∪ R(f. then |{x ∈ E : f (x) > α}| ≤ (g) f = 0 a. E)| ≤ |R(g.. on E if and only if (h) If f = g a. Let F = {x ∈ E : f (x) > 0} be the set of points where f is strictly positive.

then their integrals are equal.e. and in this case we define the Lebesgue integral of f to be f (x) dx = E E f (x). For example. Definition 3. However. E as long as this does not have the form ∞ − ∞ (in that case. f (x) < 0. it follows from Definition 3. ∞]. f − ≤ |f | = f + + f − .3(h) says that if two functions f and g are equal except on a set of measure zero. |f (x)|. Then we define the Lebesgue integral of f to be f = E E Re (f ) + i E Im (f ). Equivalently.10 REVIEW OF LEBESGUE MEASURE AND INTEGRATION Note that Theorem 3.4 (Lebesgue Integral for Real-Valued Functions). We say that f is measurable if both f + and f − are measurable. Note that |Re (f )|. f (x) ≥ 0. Split f into real and imaginary parts by writing f = Re (f ) + i Im (f ). 0. as long as both integrals on the right are defined and finite. f + (x) dx − f − (x) dx. Therefore E |f | < ∞ ⇐⇒ E |Re (f )|. We write f as a difference of two nonnegative functions by defining f + (x) = so that f = f+ − f− and |f | = f + + f − . and f − (x) = 0. given a function f we can change the values of f on any set of measure zero without changing the integral of the f . E |Im (f )| < ∞. Hence. then f = ∞. Hence. it is possible to have |f (x)| < ∞ a. yet still have E |f | = ∞. Let f be a real-valued function f : E → [−∞.5 (Lebesgue Integral for Complex-Valued Functions). we typically do not distinguish between two functions that are equal except on a set of measure zero. f (x) ≥ 0. Definition 3. the integral is undefined). whenever we are concerned only with integrals. Since 0 ≤ f + . Note that if |f (x)| = ∞ on a set with positive Lebesgue measure then |f | = ∞.e. if E |f | < ∞ then |f (x)| < ∞ a. if f (x) = 1 for all x ∈ R. |Im (f )| ≤ |f | ≤ |Re (f )| + |Im (f )|. Suppose that f : E → C. f (x) < 0.4 that f exists as a finite real value E ⇐⇒ ⇐⇒ ⇐⇒ E f + < ∞ and f+ + E E f− < ∞ E f− < ∞ E |f | < ∞. . Now we can extend the definition of Lebesgue integral to more general functions.

The choice of real-valued versus complex-valued functions is usually clear from context. and if f = g a. not that f = 0. To be more precise. then we regard f and g as being the “same” element of L1 (R). On the other hand. meaning that: 1 (b) f = 0 if and only if f = 0 a. The collection of all integrable functions on E is called L1 (E). L1 (E) is a vector space under the usual operations of function addition and multiplication by scalars.. we only have that f 1 = 0 implies f = 0 a. f exists as a complex number E ⇐⇒ E |f | < ∞. g ∈ L1 (R). and with this identification we do have that f 1 satisfies all the requirements of a norm.e.6.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 11 Consequently. if we E are dealing with real-valued functions then L1 (E) = f : E → [−∞. In other words. we are really taking the elements of L1 (R) to be equivalence classes of functions that are equal a.e. In this sense f 1 does not quite satisfy the requirements of a norm (instead. and 1 < ∞ for all f ∈ L1 (R). Definition 3. as being “the” zero element of L1 (R). and that 1/p f p = E |f (x)|p dx . or if we are dealing with complex-valued functions then L1 (E) = E |f | < ∞ . and we will ignore it. Note that in property (a). we regard any function that is 0 a. we have declared that we will not distinguish between two functions that are equal a. (c) cf 1 (d) f + g ≤ f 1 + g 1 for all f . or make the obvious adjustment if we are dealing with complex-valued functions. ∞] or f : E → C is integrable if |f | < ∞. This distinction between equivalence classes of functions and the functions themselves is not usually an issue.e. It can be shown that Lp (E) is a vector space. In analogy to L1 (E).e. In either case.7. = |c| f for all f ∈ L1 (R) and all scalars c.e. ∞] : f: E →C : E |f | < ∞ . and f (a) 0 ≤ f 1 1 = E |f | defines a norm on this space.. it is only a seminorm).. ∞] : E |f (x)|p dx < ∞ . given 1 ≤ p < ∞ we define Lp (E) = f : E → [−∞. Definition 3. We say that a function f : E → [−∞. That is.e.

If o ′ f ∈ Lp (E) and g ∈ Lp (E) then f g ∈ L1 (E).12 REVIEW OF LEBESGUE MEASURE AND INTEGRATION is a norm on this space (again in the sense of identification functions that are equal almost everywhere). In fact. For the case p = ∞. and fix 1 ≤ p ≤ ∞. x∈E p p≥1 L (E). Out of all the exponents 1 ≤ p ≤ ∞. It has the form 1/2 E 1/2 E |f g| ≤ E |f | 2 |g| 2 . Cauchy–Schwarz. and is therefore Banach space. ∞] : ess sup |f (x)| < ∞ . Lp (E) is complete in this norm (all Cauchy sequences converge).8 (H¨lder’s Inequality). This inclusion fails if E has infinite measure. only L2 (E) is a Hilbert space. or Cauchy– o Bunyakowski–Schwarz inequality. we define L∞ (E) = f : E → [−∞. Here is a fundamental inequality for the Lp norms. and in fact L∞ (E) = For the case p = 2. H¨lder’s inequality is known as the Schwarz. Thus L2 (E) is a Hilbert space as well as a Banach space. Theorem 3. this inequality is E ≤ f |f | p p g p′ . If |E| < ∞. f. x∈E Then L (E) is a Banach space with respect to the norm f ∞ p ∞ ∞ = ess sup |f (x)|. and fg 1 For 1 < p < ∞. For p = 2. 1/p E E 1/p′ |f g| ≤ |g| p′ . then L (E) ⊆ L (E) for each 1 ≤ p < ∞. If |E| < ∞ and 1 ≤ p < q < ∞ then Lq (E) ⊆ Lp (E). g = E f (x) g(x) dx defines an inner product on L2 (E). Let E ⊆ Rd be measurable. .

i. then interchanging the order of integration of f (x. The equality in (4.2) Note that the hypotheses of Fubini’s Theorem imply that the integrals in equation (4. and if one is infinite then the other is infinite as well.1) are finite real or complex numbers. y) is a function of two variables. f (x.. y ∈ F }.e. y) is allowed. then interchanging the order of integration of f (x.1) E F Moreover. if (a) E F |f (x.e. y) dy dx! F E Theorem 4. y)| dy dx < ∞. or |f (x. . If neither theorem applies.2 (Tonelli’s Theorem). and is an integrable function of y. y)| (dx dy) < ∞. y) dx dy = F E E×F (b) F E (c) E×F then f (x. we will get the same result as if we integrate over y first and x second.. and Tonelli’s Theorem says that we can switch the integrals if f is a nonnegative function. Theorem 4. In this case. in this case g(x) = F f (x. However. y) dx dy = f (x.e. with x varying through a domain E ⊆ Rm and y varying through a domain F ⊆ Rn . y) dy dx = f (x.. i. y) dy is well-defined for almost every x.2) may be finite real numbers or ∞. y) ∈ Rm+n : x ∈ E. Fubini’s and Tonelli’s Theorems give two conditions under which we can safely exchange the order of integration. then it is possible that E F f (x. (4. h(y) = E f (x. Suppose also that we want to integrate f over the entire domain E × F = {(x. y) dx dy.e. y) dx is well-defined for almost every y. y) is allowed. Similarly. and g is an integrable function of x. the integrals in equation (4. h ∈ L1 (F ). If f (x. or |f (x. y) (dx dy).1 (Fubini’s Theorem). y) dy dx = E F F E f (x. y) ≥ 0 a. That is. g ∈ L1 (E). f (x. y)| dx dy < ∞. (4.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 13 4. If any one of the possible double integrals of |f (x. SWITCHING INTEGRALS Suppose that f (x.2) means that if one side is finite then the other side is finite as well. it is very important which set we integrate over first! In general. it is NOT true that if we integrate over x first and y second. Fubini’s Theorem says that we can the switch integrals if f is an integrable function. y)| is finite.. i.

3).” there are analogues of Fubini’s and Tonelli’s theorems that apply to the problem of interchanging an integral and a sum or interchanging two summations (see Corollary 5.14 REVIEW OF LEBESGUE MEASURE AND INTEGRATION Because a series can be viewed as a “discrete integral. .

For all other x we set fn (x) = 0. there is a function f such that fn (x) → f (x) for almost every x. known as the Monotone Convergence Theorem or the Beppo–Levi Theorem. so fn does not converge to 0 dx = 0.2 (Monotone Convergence Theorem). it need NOT be true that lim n→∞ fn = E E n→∞ lim fn ! Example 5.. 1/n] and height n.1. The following example shows that this does NOT imply that the integrals of fn must converge to the integral of f .REVIEW OF LEBESGUE MEASURE AND INTEGRATION 15 5. If the sequence of functions {fn }n∈N is monotone increasing a.e.e. E) ⊆ R(f2 . E)| = lim n→∞ n→∞ fn . x ∈ E. we must be careful when switching an integral and an infinite series. . on E and if lim fn (x) = f (x) for a. Let E ⊆ Rd be given. E) ⊆ · · · and that R(f. and suppose that fn : E → [0. SWITCHING INTEGRALS AND LIMITS Suppose that {fn }n∈N is a sequence of functions that converge pointwise almost everywhere. x ∈ E. Our goal in this section is to give some conditions under which a limit and an integral can be interchanged. The first result. Then fn (x) → 0 for every x ∈ R! However. Hence. i. applies to the case of nonnegative functions that are monotone increasing a. i. E) = ∪ R(fn .. Note that R(f1 . Let fn be defined as follows. Proof. E)| = lim |R(fn . ∞] for n ∈ N. Theorem 5. fn = 1/2 for every n.e. i.e. E). It therefore follows from Theorem 2. E Recall that an infinite series is a limit of the partial sums of the series. As a corollary of the Beppo–Levi Theorem we can prove the following version of Tonelli’s Theorem that gives a condition on when an integral and a summation can be interchanged. then n→∞ n→∞ lim fn = E E n→∞ lim fn = E f..8(d) that E f = |R(f. For 0 ≤ x ≤ 1/n the graph of fn looks like an isosceles triangle with base [0.e. for which 0 ≤ f1 (x) ≤ f2 (x) ≤ f3 (x) ≤ · · · for a.e..e.

E The Lebesgue Dominated Convergence Theorem. then we may not be able to interchange a limit and an integral. . Proof. x. n→∞ and if the integrals converge as well. Then F1 (x) ≤ Fx (x) ≤ · · · for a. Set N ∞ FN (x) = n=1 fn (x) and N →∞ F (x) = n=1 fn (x)..1) Note that we were allowed to switch the sum and the integral in equation (5. if the fn converge pointwise almost everywhere. If the functions fn are nonnegative but not monotone increasing. i. by the Monotone Convergence Theorem.e. Let E ⊆ Rd begiven. so {gn }n∈N is monotone increasing for each x. then we do at least have a particular inequality. and it is the theorem to use in most cases. we have f = E E n→∞ lim gn = lim n→∞ gn = lim inf E n→∞ E gn ≤ lim inf n→∞ fn . and lim FN (x) = F (x) a. E Consequently. Therefore. F = lim E N N →∞ N ∞ FN = lim E N →∞ fn = lim E n=1 N →∞ fn = n=1 E n=1 E fn . n→∞ n→∞ k≥n n→∞ Then by the Monotone Convergence Theorem and the fact that gn (x) ≤ fn (x). Then ∞ ∞ fn = E n=1 n=1 E fn . Theorem 5.1) because it was a finite sum.16 REVIEW OF LEBESGUE MEASURE AND INTEGRATION Corollary 5. and suppose that fn : E → [0. However. (5.3 (Tonelli’s Theorem). E Proof. ∞] for n ∈ N. or LCDT.. and suppose that fn : E → [0.e.4 (Fatou’s Lemma). ∞] for n ∈ N. Set gn (x) = inf fk (x). It applies to functions that aren’t necessarily nonnegative or monotone increasing. Then g1 (x) ≤ g2 (x) ≤ · · · . if lim fn (x) = f (x) a.e. Define f (x) = lim inf fn (x) = lim inf fk (x) = lim gn (x).e. Let E ⊆ Rd be given. is perhaps the most important and useful result of this section. the following result states that if the functions fn are all nonnegative. Then E lim inf fn ≤ lim inf n→∞ n→∞ fn . then E k≥n f dx ≤ lim n→∞ fn dx.

Let E ⊆ Rd be given. . ∞] for n ∈ N. Proof. E f ≤ lim inf n→∞ E E fn ≤ lim sup n→∞ E fn ≤ f. If there is a single n→∞ function g such that: (b) g is integrable. Then by Fatou’s Lemma. and suppose that fn : E → [−∞. f = E E lim inf fn ≤ lim inf n→∞ n→∞ fn ..e. then n→∞ (a) |fn (x)| ≤ g(x) a.5 (Lebesgue Dominated Convergence Theorem). E Further. we can apply Fatou’s Lemma to the functions g − fn . but it can be extended to general f .e.e.e. Suppose also that the functions fn (x) converge pointwise almost everywhere. x ∈ E. E Hence lim n→∞ fn exists and equals E f. even more is true in this case: fn converges to f in L1 -norm.e. since g − fn ≥ 0 a. lim fn = E E n→∞ lim fn = E f. and E |g| < ∞.. Suppose that fn ≥ 0 a.e. lim fn (x) = f (x) for a. for every n. In fact.. n→∞ lim f − fn 1 = lim n→∞ E |f − fn | = 0. i. to obtain E g− f = E E (g − f ) lim inf (g − fn ) n→∞ E = E ≤ lim inf n→∞ (g − fn ) E (by Fatou’s Lemma) fn = lim inf n→∞ g− E = E g − lim sup n→∞ fn . i. We will give the proof for nonnegative fn only.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 17 Theorem 5.. i. E Therefore.e.

or complex-value functions with domain Rd . The case p = ∞ is easy. Theorem 6. Proof..2 is just a special case of the following more general result. we can continue as follows: (∗) = |f (y) g(x − y)| dx dy = |f (y)| |g(x − y)| dx dy = (∗∗). so I will leave it as an exercise. g ∈ L1 (Rd ) then f ∗ g ∈ L1 (Rd ). Then the convolution of f and g is the function f ∗ g defined by (f ∗ g)(x) = whenever this is well-defined.3 (Young’s Convolution Inequality). i. ≤ f g 1. Then write |(f ∗ g)(x)| ≤ |f (y) g(x − y)| dy = |f (y) g(x − y)1/p | |g(x − y)1/p | dy = (∗). Put it all together and we have shown f ∗ g ≤ g 1 In fact. (∗∗) = |f (y)| |g(x)| dx dy = |f (y)| g 1 1 dy = f 1. We’ve already done the case p = 1. and f ∗g p ≤ f p g 1. CONVOLUTION Definition 6.2. g 1 |f (y)| dy = g 1 f 1. Now. and f ∗ g Proof. We start by computing: |(f ∗ g)(x)| dx = f (y) g(x − y) dy dx ≤ |f (y) g(x − y)| dy dx = (∗). 1 1 First let p′ be the dual exponent to p. Let f and g be real. 1 1 Because |f (y) g(x − y)| ≥ 0 for all x and y.1. since we are integrating over all of Rd . Tonelli’s Theorem allows us to interchange the order of integration in (∗).18 REVIEW OF LEBESGUE MEASURE AND INTEGRATION 6. Theorem 6. the number which satisfies p + p′ = 1. So. ′ .e. If f . Theorem 6. f (y) g(x − y) dy. If 1 ≤ p ≤ ∞ and f ∈ Lp (Rd ) and g ∈ L1 (Rd ) then f ∗ g ∈ Lp (Rd ). Therefore. we know that |g(x − y)| dx = |g(x)| dx (this wouldn’t necessarily be true if we were integrating on a finite domain). For other p’s it is a little tricky.

p p |g(x − y)| dx dy |g(x)| dx dy 1 dy p/p′ 1 p/p′ 1 1+p/p′ 1 p 1 f Take pth roots and you’re done.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 19 Now apply H¨lder’s inequality to the two parts of the integrand: o 1/p 1/p′ (∗) ≤ = = = Therefore. f ∗g p p |f (y) g(x − y) p 1/p p | dy 1/p |g(x − y) 1/p′ p′ | dy 1/p′ |f (y)| |g(x − y)| dy 1/p |g(x − y)| dy 1/p′ |f (y)| |g(x − y)| dy g 1/p′ 1 p |g(y)| dy 1/p |f (y)|p |g(x − y)| dy g g g g g g g p/p′ 1 p/p′ 1 p/p′ 1 . = |(f ∗ g)(x)|p dx = = = = = = = |f (y)|p |g(x − y)| dy dx |f (y)|p |g(x − y)| dx dy |f (y)|p |f (y)|p |f (y)|p g f p p. .

Proof. b]). Then {Fmn (x. Theorem 7. b] × [a. b] → C : b b F 2 = a |F (x. Suppose that {fn (x)}n∈N is an orthonormal basis for L2 [a. b] × [a. y) G(x. CROSS PRODUCT BASES Suppose that we have an orthonormal basis for the Hilbert space b 1/2 L [a. b] = {f : [a. b] × [a. of functions that are square-integrable on the square [a. if m = j and n = k. We will show how to use this orthonormal basis to construct an orthonormal basis for the Hilbert space b b a 1/2 L2 ([a. Fjk = a b a b Fmn (x. b]). g = a f (x) g(x) dx. b]. y) dx dy. if m = j or n = k. This establishes that {Fmn } is an orthonormal system in L2 ([a. and define Fmn (x.1. fj a fn (y) fk (y) dy fn .20 REVIEW OF LEBESGUE MEASURE AND INTEGRATION 7. First we check that the functions Fmn are indeed orthonormal: b b Fmn . y)|2 dx dy < ∞}. b]) = {F : [a. b]. b]. with inner product b f. . b] → C : 2 f 2 = a |f (x)| dx 2 < ∞} of functions that are square-integrable defined on the interval [a. G = a a F (x. y) dx dy fm (x) fn (y) fj (x) fk (y) dx dy a b a b = = a b fn (y) fk (y) a fm (x) fj (x) dx dy = a fn (y) fk (y) fm . under the inner product F. y)}m. fj dy b = fm . b] × [a.n∈N is an orthonormal basis for L2 ([a. y) Fjk (x. b] × [a. y) = fm (x) fn (y). fj = 1. fk = fm . 0.

It would be easy to proceed if it was the case that F (x. y) fm(x) dx dy a a |F (x. b]. b]) can be “factored” in this way. b]) is such that F. for each y let Gy (x) be the function defined by Gy (x) = F (x. by the Schwarz inequality. fn . y) dx dy F (x. b].. b]) that is orthogonal to every Fmn is the zero function. For a general function F (x. y)|2 dx dy < ∞. One way is to show that {Fmn } is complete. the only function F ∈ L2 ([a.e. y) fm(x) dx. Note that hm ∈ L2 [a. b] × [a. So. we have to be more careful. Fmn = a b a b F (x. y) fm(x) fn (y) dx dy a b a b = = a b a F (x. y)| dx |F (x. Fmn = 0 for every m and n. b] × [a. y). suppose that F ∈ L2 ([a.1) says that hm is orthogonal to every function fn in the orthonormal basis {fn }n∈N for L2 [a. = F Moreover. We still have to show that F = 0 a. b b b a b b a b b a b b a 2 2 b 2 hm 2 2 = a |hm (y)| dy = ≤ = 2 F (x. b] because. b] × [a. y) Fmn (x. y).e. So.e. So. Therefore hm = 0 a. but it is important to note that only SOME of the functions in L2 ([a. considering now a fixed m. equation (7. where b (7. i. y) = f (x) g(y) for some functions f . we begin by computing that b b 0 = F. y)|2 dx 2 a |fm (x)|2 dx dy 2 fm dy a = a |F (x. y) fm(x) dx fn (y) dy hm (y) fn(y) dy a = = hm . .REVIEW OF LEBESGUE MEASURE AND INTEGRATION 21 Now we have to show that this orthonormal system is an orthonormal basis. We have several choices for doing this. g ∈ L2 [a.1) hm (y) = a F (x.

there is a continuous function G(x. y.e.22 REVIEW OF LEBESGUE MEASURE AND INTEGRATION As was the case for hm .. For example. But equation (7. b]) such that F − G 2 < ε. y) that was orthogonal to every fm (x) fn (y) and showed that this F must be zero a. (x. b].2) says that H is an infinite linear combination of the . the function Gy is in L2 [a. so this shows that {fm (x) fn (y)} is complete.2) amk bnk are some new scalars (note that this is a finite sum. Then we can approximate G by a step function N H(x. choose any function F (x. you can easily check that for each fixed y. y) = k=1 ck χQk (x. Moreover. Therefore. we know that the set of continuous functions is dense in L2 ([a. Since this is true for a. we have b Gy (x). y). Therefore. we can show that its finite linear span is dense. so it where dmn = is well-defined). since h(y) = 0 a. b] into finitely many smaller squares Qk for k = 1. we can write χIk (x) = amk fm (x) m and χJk (y) = n bnk fn (y) for some scalars amk and bnk (in fact. b]). We started with a function F (x. Instead of showing that only the zero function is orthogonal to every element of this system. b]). So. Now. b] (as a function of x alone). y) = χIk (x) χJk (y). by taking the squares small enough and letting ck be the average value of G on the square Qk .. χJk ∈ L2 [a. but it’s easy). each square is a cross product of two intervals: Qk = Ik × Jk .e. N. y) = k=1 ck χIk (x) χJk (y) = k=1 ck m N amk fm (x) n bnk fn (y) = m n k=1 ck amk bnk fm (x) fn (y) dmn fm (x) fn (y). and since {fm } is an orthonormal basis for L2 [a. there’s a little argument to fill about sets in the plane with measure zero. . x. b] × [a. Here is another way of showing that {fm (x) fn (y)} is complete. m n = N k=1 ck (7. these scalars are given by inner products. Now subdivide the square [a.e. and hence is an orthonormal basis since we have already shown that it is an orthonormal system. Hence. we conclude that F (x. we can make G − H 2 < ε.e. . Gy is orthogonal to every fm . b] we therefore conclude that Gy (x) = 0 for a. y) ∈ L2 ([a. y) fm(y) dx = hm (y) = 0. χQk (x. y) (OK. y) = 0 for a. b] × [a. b] × [a.e. Since χIk . N N H(x. but that doesn’t really matter for this argument). That is. By any one of several arguments. y) ∈ L2 ([a. . fm (x) = a F (x. b] × [a.

Theorem 7. 1]).n∈Z is an orthonormal basis for L2 ([0. 1]. Example 7. Therefore. Therefore F is approximated to within 3ε in L2 -norm by a function in this span. 2 . The general principle is that if {fn } is an orthonormal basis for L2 (Ω1 ) and {gn } is an orthonormal basis for L2 (Ω2 ).2. 1] × [0. hence can be approximated to within ε in L2 -norm by a function in span{fm (x) fn (y)}.1 can easily be adapted to cover the case of finding an orthonormal basis for L ([a. so {fm (x) fn (y)} is complete. and therefore forms an orthonormal basis since we already know that it is an orthonormal system.REVIEW OF LEBESGUE MEASURE AND INTEGRATION 23 functions fm (x) fn (y). then {fm (x) gn (y)} is an orthonormal basis for L2 (Ω1 × Ω2 ). etc. the collection {e2πimx e2πiny }m. The set {e2πinx }n∈Z is an orthonormal basis for L2 [0. Hence the span is dense. b] × [c. d]) or L2 (R2 ).