
Real Analysis

Richard F. Bass
February 2, 2010

$\sigma$-algebras

Let $X$ be a set. We will use the notation $A^c = \{x \in X : x \notin A\}$ and $A - B = A \cap B^c$. (The notation $A \setminus B$ is also commonly used.)

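Definition 1.1 An algebra is a collection $\mathcal{A}$ of subsets of $X$ such that
(a) $\emptyset, X \in \mathcal{A}$;
(b) if $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$;
(c) if $A_1, \ldots, A_n \in \mathcal{A}$, then $\cup_{i=1}^n A_i$ and $\cap_{i=1}^n A_i$ are in $\mathcal{A}$.
$\mathcal{A}$ is a $\sigma$-algebra (or $\sigma$-field) if in addition
(d) if $A_1, A_2, \ldots$ are in $\mathcal{A}$, then $\cup_{i=1}^\infty A_i$ and $\cap_{i=1}^\infty A_i$ are in $\mathcal{A}$.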

In (d) we allow countable unions and intersections only; we do not allow


uncountable unions and intersections.

Example 1.2 Let X = R and A be the collection of all subsets of R.

Example 1.3 Let $X = \mathbb{R}$ and let
$$\mathcal{A} = \{A \subset \mathbb{R} : A \text{ is countable or } A^c \text{ is countable}\}.$$
Parts (a) and (b) of the definition are easy. Suppose $A_1, A_2, \ldots$ are all in $\mathcal{A}$. If each of the $A_i$ is countable, then $\cup_i A_i$ is countable, and so is in $\mathcal{A}$. If $A_{i_0}^c$ is countable for some $i_0$, then
$$(\cup_i A_i)^c = \cap_i A_i^c \subset A_{i_0}^c$$
is countable, and again $\cup_i A_i$ is in $\mathcal{A}$. Since $\cap_i A_i = (\cup_i A_i^c)^c$, the countable intersection of sets in $\mathcal{A}$ is again in $\mathcal{A}$.

Example 1.4 Let $X = [0, 1]$ and $\mathcal{A} = \{\emptyset, X, [0, \tfrac12], (\tfrac12, 1]\}$.

Example 1.5 $X = \{1, 2, 3\}$ and $\mathcal{A} = \{X, \emptyset, \{1\}, \{2, 3\}\}$.

Example 1.6 Let $X = [0, 1]$, and let $B_1, \ldots, B_8$ be subsets of $X$ which are pairwise disjoint and whose union is all of $X$. Let $\mathcal{A}$ be the collection of all finite unions of the $B_i$'s as well as the empty set. (So $\mathcal{A}$ consists of $2^8$ elements.)
Note that if we take an intersection of $\sigma$-algebras, we get a $\sigma$-algebra; this is just a matter of checking the definition. If we have a collection $\mathcal{C}$ of subsets of $X$, there is at least one $\sigma$-algebra containing $\mathcal{C}$, namely, the one consisting of all subsets of $X$. We can take the intersection of all $\sigma$-algebras that contain $\mathcal{C}$; we denote this intersection by $\sigma(\mathcal{C})$. If $\mathcal{A}$ is any $\sigma$-algebra containing $\mathcal{C}$, then $\mathcal{A} \supset \sigma(\mathcal{C})$.
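On a finite set, countable unions reduce to finite unions, so the generated $\sigma$-algebra can be computed by closing a collection under complements and pairwise unions. The following is a minimal sketch under that assumption; the function name `generate_sigma_algebra` is illustrative and not part of the notes.

```python
from itertools import combinations

def generate_sigma_algebra(X, C):
    """Close C under complement and pairwise union on a finite set X.

    Intersections then come for free by De Morgan's laws.
    """
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(c) for c in C}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for A in current:
            if X - A not in sets:          # add missing complements
                sets.add(X - A)
                changed = True
        for A, B in combinations(current, 2):
            if A | B not in sets:          # add missing unions
                sets.add(A | B)
                changed = True
    return sets

# The collection {{1}} on X = {1, 2, 3} generates the sigma-algebra of Example 1.5.
sigma = generate_sigma_algebra({1, 2, 3}, [{1}])
```

Starting from the single set $\{1\}$, the closure adds $\{2, 3\}$ and stops, which recovers the four sets of Example 1.5.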
If $X$ has some additional structure, say, it is a metric space, then we can talk about open sets. If $\mathcal{G}$ is the collection of open subsets of $X$, then we call $\sigma(\mathcal{G})$ the Borel $\sigma$-algebra on $X$, and this is often denoted $\mathcal{B}$. We will see later that when $X$ is the real line, $\mathcal{B}$ is not equal to the collection of all subsets of $X$.
We end this section with the following proposition.

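Proposition 1.7 If $X = \mathbb{R}$, then the Borel $\sigma$-algebra is generated by each of the following collections of sets:
(1) $\mathcal{C}_1 = \{(a, b) : a, b \in \mathbb{R}\}$;
(2) $\mathcal{C}_2 = \{[a, b] : a, b \in \mathbb{R}\}$;
(3) $\mathcal{C}_3 = \{(a, b] : a, b \in \mathbb{R}\}$;
(4) $\mathcal{C}_4 = \{(a, \infty) : a \in \mathbb{R}\}$.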
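Proof. (1) Let $\mathcal{G}$ be the collection of open sets. Then $\mathcal{C}_1 \subset \mathcal{G} \subset \sigma(\mathcal{G})$. $\sigma(\mathcal{G})$ is the Borel $\sigma$-algebra and contains $\mathcal{C}_1$. Since $\sigma(\mathcal{C}_1)$ is the intersection of all $\sigma$-algebras containing $\mathcal{C}_1$, then $\sigma(\mathcal{C}_1) \subset \sigma(\mathcal{G})$.
To get the reverse inclusion, if $G$ is open, it is the countable union of open intervals. So $G \in \sigma(\mathcal{C}_1)$, and hence $\mathcal{G} \subset \sigma(\mathcal{C}_1)$. $\sigma(\mathcal{G})$ is the intersection of all $\sigma$-algebras containing $\mathcal{G}$; $\sigma(\mathcal{C}_1)$ is one such, so $\sigma(\mathcal{G}) \subset \sigma(\mathcal{C}_1)$.
(2) If $[a, b] \in \mathcal{C}_2$, then $[a, b] = \cap_{n=1}^\infty (a - \tfrac1n, b + \tfrac1n) \in \sigma(\mathcal{G})$. So $\mathcal{C}_2 \subset \sigma(\mathcal{G})$, and by an argument similar to that in (1), we conclude $\sigma(\mathcal{C}_2) \subset \sigma(\mathcal{G})$.
If $(a, b) \in \mathcal{C}_1$, choose $n_0 \ge 2/(b - a)$ and note $(a, b) = \cup_{n=n_0}^\infty [a + \tfrac1n, b - \tfrac1n] \in \sigma(\mathcal{C}_2)$. So the Borel $\sigma$-algebra, which is equal to $\sigma(\mathcal{C}_1)$ by part (1), is contained in $\sigma(\mathcal{C}_2)$.
(3) The proof here is similar to (2), using $(a, b] = \cap_{n=1}^\infty (a, b + \tfrac1n)$ and $(a, b) = \cup_{n=n_0}^\infty (a, b - \tfrac1n]$, provided $n_0$ is taken large enough.
(4) The proof of this comes from (3), using that $(a, b] = (a, \infty) - (b, \infty)$ and $(a, \infty) = \cup_{n=1}^\infty (a, a + n]$.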

Measures

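Definition 2.1 A measure on $(X, \mathcal{A})$ is a function $\mu : \mathcal{A} \to [0, \infty]$ such that
(a) $\mu(A) \ge 0$ for all $A \in \mathcal{A}$;
(b) $\mu(\emptyset) = 0$;
(c) if $A_i \in \mathcal{A}$ are disjoint, then
$$\mu(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty \mu(A_i).$$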

Example 2.2 $X$ is any set, $\mathcal{A}$ is the collection of all subsets, and $\mu(A)$ is the number of elements in $A$. This is called counting measure.

Example 2.3 $X = \mathbb{R}$, $\mathcal{A}$ the collection of all subsets, $x_1, x_2, \ldots \in \mathbb{R}$, and $a_1, a_2, \ldots > 0$. Set $\mu(A) = \sum_{\{i : x_i \in A\}} a_i$. A particular case of this is if $x_i = i$ and all the $a_i = 1$. We will see later that this allows us to view infinite series as functions on this space.

Example 2.4 $\delta_x(A) = 1$ if $x \in A$ and 0 otherwise. This measure is called point mass at $x$.
We will construct Lebesgue measure on $\mathbb{R}$; this is an extension of the notion of length. However, the construction is a bit lengthy. We will also construct Lebesgue measure on $\mathbb{R}^n$; when $n = 2$, this is an extension of the notion of area, when $n = 3$, of volume.
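The discrete measures of Examples 2.2 to 2.4 can be sketched in a few lines. This is only an illustration with finitely many atoms; the helper names `counting_measure`, `weighted_measure`, and `point_mass` are assumptions made for the example, not notation from the notes.

```python
from fractions import Fraction

def counting_measure(A):
    """mu(A) = number of elements of the finite set A (Example 2.2)."""
    return len(A)

def weighted_measure(points, weights):
    """mu(A) = sum of a_i over those i with x_i in A (Example 2.3, finitely many atoms)."""
    def mu(A):
        return sum(a for x, a in zip(points, weights) if x in A)
    return mu

def point_mass(x):
    """delta_x(A) = 1 if x is in A and 0 otherwise (Example 2.4)."""
    return lambda A: 1 if x in A else 0

mu = weighted_measure([1, 2, 3], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)])
A, B = {1, 2}, {3, 5}   # disjoint sets
additive = mu(A | B) == mu(A) + mu(B)
```

For disjoint `A` and `B` the flag `additive` checks the finite case of part (c) of Definition 2.1.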

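Proposition 2.5 The following hold:
(a) If $A, B \in \mathcal{A}$ with $A \subset B$, then $\mu(A) \le \mu(B)$.
(b) If $A_i \in \mathcal{A}$ and $A = \cup_{i=1}^\infty A_i$, then $\mu(A) \le \sum_{i=1}^\infty \mu(A_i)$.
(c) If $A_i \in \mathcal{A}$, $A_1 \subset A_2 \subset \cdots$, and $A = \cup_{i=1}^\infty A_i$, then $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
(d) If $A_i \in \mathcal{A}$, $A_1 \supset A_2 \supset \cdots$, $\mu(A_1) < \infty$, and $A = \cap_{i=1}^\infty A_i$, then we have $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.

Proof. (a) Let $A_1 = A$, $A_2 = B - A$, and $A_3 = A_4 = \cdots = \emptyset$. Now use part (c) of the definition of measure.
(b) Let $B_1 = A_1$, $B_2 = A_2 - B_1$, $B_3 = A_3 - (B_1 \cup B_2)$, and so on. The $B_i$ are disjoint and $\cup_{i=1}^\infty B_i = \cup_{i=1}^\infty A_i$. So $\mu(A) = \sum_{i=1}^\infty \mu(B_i) \le \sum_{i=1}^\infty \mu(A_i)$.
(c) Define the $B_i$ as in (b). Since $\cup_{i=1}^n B_i = \cup_{i=1}^n A_i$, then
$$\mu(A) = \mu(\cup_{i=1}^\infty A_i) = \mu(\cup_{i=1}^\infty B_i) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) = \lim_{n \to \infty} \mu(\cup_{i=1}^n B_i) = \lim_{n \to \infty} \mu(\cup_{i=1}^n A_i).$$
(d) Apply (c) to the sets $A_1 - A_i$, $i = 1, 2, \ldots$.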

Example 2.6 To see that $\mu(A_1) < \infty$ is necessary, let $X$ be the positive integers, $\mu$ counting measure, and $A_i = \{i, i + 1, \ldots\}$. Then the $A_i$ decrease, $\mu(A_i) = \infty$ for all $i$, but $\mu(\cap_i A_i) = \mu(\emptyset) = 0$.
Definition 2.7 A probability or probability measure is a measure such that $\mu(X) = 1$. In this case we usually write $(\Omega, \mathcal{F}, \mathbb{P})$ instead of $(X, \mathcal{A}, \mu)$.

Construction of Lebesgue measure

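Define $m((a, b)) = b - a$. If $G$ is an open set and $G \subset \mathbb{R}$, then $G = \cup_{i=1}^\infty (a_i, b_i)$ with the intervals disjoint. Define $m(G) = \sum_{i=1}^\infty (b_i - a_i)$. If $A \subset \mathbb{R}$, define
$$m^*(A) = \inf\{m(G) : G \text{ open}, A \subset G\}. \qquad (3.1)$$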

We will show the following.


(A) $m^*$ is not a measure on the collection of all subsets of $\mathbb{R}$.
(B) $m^*$ is a measure on a strictly smaller $\sigma$-algebra that strictly contains the Borel $\sigma$-algebra.

We will prove these two facts (and a bit more) in a moment, but let's first make some remarks.
A set $N$ is a null set with respect to $m^*$ if $m^*(N) = 0$. Let $\mathcal{L}$ be the smallest $\sigma$-algebra containing $\mathcal{B}$ and all the null sets. More precisely, let $\mathcal{N}$ be the collection of all sets that are null sets with respect to $m^*$ and let $\mathcal{L} = \sigma(\mathcal{B} \cup \mathcal{N})$. $\mathcal{L}$ is called the Lebesgue $\sigma$-algebra, and sets in $\mathcal{L}$ are called Lebesgue measurable.
As part of our proof of (B) we will show that $m^*$ is a measure on $\mathcal{L}$. Lebesgue measure is the measure $m^*$ on $\mathcal{L}$. (A) shows that $\mathcal{L}$ is strictly smaller than the collection of all subsets of $\mathbb{R}$.
It is easy to get lost in the construction of Lebesgue measure, so let us summarize our steps.
First we prove (A), which is Proposition 3.1.
We then turn to the construction of Lebesgue measure. It is more convenient for technical reasons to define
$$m^*(A) = \inf\Big\{\sum_{i=1}^\infty (b_i - a_i) : A \subset \cup_{i=1}^\infty (a_i, b_i]\Big\}. \qquad (3.2)$$
There is no real difference between this and (3.1) since it is clear that the difference between an open set and a set of the form $\cup_{i=1}^\infty (a_i, b_i]$ is countable, and with either definition of $m^*$, the measure of a point is 0, so the measure of a set consisting of countably many points is 0. However, when we talk about Lebesgue-Stieltjes measure, then there is a real difference.
We define what it means to be an outer measure (Definition 3.2) and prove that $m^*$ is an outer measure (Proposition 3.3). We then define what it means for a set to be $m^*$-measurable (Definition 3.4) and prove that the collection of $m^*$-measurable sets is a $\sigma$-algebra and that $m^*$ restricted to this $\sigma$-algebra is a measure.
This looks promising, but we do not yet know that enough sets are $m^*$-measurable. That takes one more step. We show in Proposition 3.6 that the collection of $m^*$-measurable sets contains the Borel $\sigma$-algebra.
Proposition 3.1 $m^*$ is not a measure on the collection of all subsets of $\mathbb{R}$.

Proof. Suppose $m^*$ is a measure. Define $x \sim y$ if $x - y$ is rational. This is an equivalence relationship on $[0, 1]$. For each equivalence class, pick an element out of that class (by the axiom of choice). Call the collection of such points $A$. Given a set $B$, define $B + x = \{y + x : y \in B\}$. Note $m^*(A + q) = m^*(A)$ since this translation invariance holds for intervals, hence for open sets, hence for all sets. Moreover, the sets $A + q$ are disjoint for different rationals $q$.
Now
$$[0, 1] \subset \cup_{q \in [-2, 2]} (A + q),$$
where the union is only over rational $q$, so $1 \le \sum_{q \in [-2, 2]} m^*(A + q)$, and therefore $m^*(A) > 0$. But
$$\cup_{q \in [-2, 2]} (A + q) \subset [-6, 6],$$
where again the union is only over rational $q$, so if $m^*$ is a measure, then $12 \ge \sum_{q \in [-2, 2]} m^*(A + q)$, which implies $m^*(A) = 0$, a contradiction.
Definition 3.2 A function $n$ on the collection of all subsets satisfying
(a) $n(\emptyset) = 0$;
(b) if $A \subset B$, then $n(A) \le n(B)$;
(c) $n(\cup_{i=1}^\infty A_i) \le \sum_{i=1}^\infty n(A_i)$
is called an outer measure.
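Proposition 3.3 $m^*$ defined by (3.2) is an outer measure.
Proof. (a) and (b) are obvious. To prove (c), let $\varepsilon > 0$. For each $i$ there exist intervals $I_{i1}, I_{i2}, \ldots$, each of the form $(a_{ij}, b_{ij}]$, such that $A_i \subset \cup_{j=1}^\infty I_{ij}$ and $\sum_j m(I_{ij}) \le m^*(A_i) + \varepsilon/2^i$. Then $\cup_{i=1}^\infty A_i \subset \cup_{i,j} I_{ij}$ and
$$\sum_{i,j} m(I_{ij}) \le \sum_i m^*(A_i) + \sum_i \varepsilon/2^i = \sum_i m^*(A_i) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, $m^*(\cup_{i=1}^\infty A_i) \le \sum_{i=1}^\infty m^*(A_i)$.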

Definition 3.4 Let $m^*$ be an outer measure. A set $A \subset X$ is $m^*$-measurable if
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c) \qquad (3.3)$$
for all $E \subset X$.
Theorem 3.5 If $m^*$ is an outer measure on $X$, then the collection $\mathcal{A}$ of $m^*$-measurable sets is a $\sigma$-algebra and the restriction of $m^*$ to $\mathcal{A}$ is a measure. Moreover, $\mathcal{A}$ contains all the null sets.
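Proof. By part (c) of Definition 3.2,
$$m^*(E) \le m^*(E \cap A) + m^*(E \cap A^c)$$
for all $E \subset X$. So to check (3.3) it is enough to show $m^*(E) \ge m^*(E \cap A) + m^*(E \cap A^c)$. This will be trivial in the case $m^*(E) = \infty$.
If $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$ by symmetry and the definition of $\mathcal{A}$. Suppose $A, B \in \mathcal{A}$ and $E \subset X$. Then
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c) = (m^*(E \cap A \cap B) + m^*(E \cap A \cap B^c)) + (m^*(E \cap A^c \cap B) + m^*(E \cap A^c \cap B^c)).$$
The first three terms on the right have a sum greater than or equal to $m^*(E \cap (A \cup B))$ because $A \cup B \subset (A \cap B) \cup (A \cap B^c) \cup (A^c \cap B)$. Therefore
$$m^*(E) \ge m^*(E \cap (A \cup B)) + m^*(E \cap (A \cup B)^c),$$
which shows $A \cup B \in \mathcal{A}$. Therefore $\mathcal{A}$ is an algebra.
Let $A_i$ be disjoint sets in $\mathcal{A}$, let $B_n = \cup_{i=1}^n A_i$, and $B = \cup_{i=1}^\infty A_i$. If $E \subset X$,
$$m^*(E \cap B_n) = m^*(E \cap B_n \cap A_n) + m^*(E \cap B_n \cap A_n^c) = m^*(E \cap A_n) + m^*(E \cap B_{n-1}).$$
Repeating for $m^*(E \cap B_{n-1})$, we obtain
$$m^*(E \cap B_n) \ge \sum_{i=1}^n m^*(E \cap A_i).$$
Since $B_n \in \mathcal{A}$, then
$$m^*(E) = m^*(E \cap B_n) + m^*(E \cap B_n^c) \ge \sum_{i=1}^n m^*(E \cap A_i) + m^*(E \cap B^c).$$
Let $n \to \infty$. Recalling that $m^*$ is an outer measure,
$$m^*(E) \ge \sum_{i=1}^\infty m^*(E \cap A_i) + m^*(E \cap B^c) \ge m^*(\cup_{i=1}^\infty (E \cap A_i)) + m^*(E \cap B^c) = m^*(E \cap B) + m^*(E \cap B^c) \ge m^*(E).$$
This shows $B \in \mathcal{A}$.
If we set $E = B$ in this last equation, we obtain
$$m^*(B) = \sum_{i=1}^\infty m^*(A_i),$$
or $m^*$ is countably additive on $\mathcal{A}$.
If $m^*(A) = 0$ and $E \subset X$, then
$$m^*(E \cap A) + m^*(E \cap A^c) = m^*(E \cap A^c) \le m^*(E),$$
which shows $\mathcal{A}$ contains all null sets.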
Define $m(\cup_{i=1}^n (a_i, b_i]) = \sum_{i=1}^n (b_i - a_i)$ if the $(a_i, b_i]$ are disjoint. Note $m$ is well-defined (a set might be expressible as a union of such intervals in more than one way).
The last step in the construction is the following.
Proposition 3.6 Every set in the Borel $\sigma$-algebra is $m^*$-measurable.
Proof. Since the collection of $m^*$-measurable sets is a $\sigma$-algebra, it suffices to show that every interval $J$ of the form $(a, b]$ is $m^*$-measurable. Let $E$ be any set with $m^*(E) < \infty$; we need to show
$$m^*(E) \ge m^*(E \cap J) + m^*(E \cap J^c). \qquad (3.4)$$
Let $\varepsilon > 0$. Choose $I_1, I_2, \ldots$ of the form $(a_i, b_i]$ such that $E \subset \cup_i I_i$ and
$$m^*(E) \ge \sum_i (b_i - a_i) - \varepsilon.$$
Since $E \subset \cup_i I_i$, we have $m^*(E \cap J) \le \sum_i m^*(I_i \cap J)$ and $m^*(E \cap J^c) \le \sum_i m^*(I_i \cap J^c)$. Hence we have
$$m^*(E \cap J) + m^*(E \cap J^c) \le \sum_i [m^*(I_i \cap J) + m^*(I_i \cap J^c)].$$
Now $m^*(I_i \cap J)$ is the length of an interval and $m^*(I_i \cap J^c)$ is the total length of at most two intervals, so
$$m^*(I_i \cap J) + m^*(I_i \cap J^c) = m^*(I_i).$$
Thus
$$m^*(E \cap J) + m^*(E \cap J^c) \le \sum_i m^*(I_i) \le m^*(E) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, this proves (3.4).

We now drop the asterisks from $m^*$ and call $m$ Lebesgue measure.

Examples and related results

Example 4.1 Recall the Cantor set is constructed by taking the interval
[0, 1], removing the middle third, removing the middle thirds of each of the
two remaining subintervals, and continuing. The Cantor set is what remains;
it is closed, uncountable, and every point is a limit point. Moreover, it
contains no intervals.
After one stage, the measure of the two intervals is $2(\tfrac13)$, after two stages $4(\tfrac19)$, and after $n$ stages, $(2/3)^n$. Since the Cantor set $C$ is the intersection of all these sets, the Lebesgue measure of $C$ is 0.
Suppose we define $f_0$ to be 1/2 on the interval $(1/3, 2/3)$, 1/4 on the interval $(1/9, 2/9)$, 3/4 on the interval $(7/9, 8/9)$, and so on. Define $f(x) = \inf\{f_0(y) : y \ge x\}$ for $x < 1$. Define $f(1) = 1$. Notice $f = f_0$ on the complement of the Cantor set. $f$ is monotone, so it has only jump discontinuities. But if it has a jump discontinuity, there is a rational of the form $k/2^n$ with $k \le 2^n$ that is not in the range of $f$. On the other hand, by the construction, each of these values is taken by $f_0$ for some point in the complement of $C$, and so is taken by $f$. The only way this can happen is if $f$ is continuous. This function $f$ is called the Cantor-Lebesgue function. We will use it in examples later on. For now, we can see that it is a function that increases only on the Cantor set, which is of Lebesgue measure 0, yet is continuous.
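The stage-by-stage lengths in Example 4.1 can be checked with exact rational arithmetic. The sketch below builds the closed intervals that remain after $n$ stages; the function names `cantor_stage` and `total_length` are illustrative choices, not notation from the notes.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals remaining after n stages of removing open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))      # left third is kept
            nxt.append((b - third, b))      # right third is kept
        intervals = nxt
    return intervals

def total_length(intervals):
    """Sum of the lengths of disjoint intervals."""
    return sum(b - a for a, b in intervals)

stage5 = cantor_stage(5)
length5 = total_length(stage5)   # equals (2/3)**5 as an exact fraction
```

Since the lengths $(2/3)^n$ tend to 0 and the Cantor set is contained in every stage, its measure is 0, as stated in the example.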
Example 4.2 Let $q_1, q_2, \ldots$ be an enumeration of the rationals, let $\varepsilon > 0$, and let $I_i$ be the interval $(q_i - \varepsilon/2^i, q_i + \varepsilon/2^i)$. Then the measure of $I_i$ is $\varepsilon/2^{i-1}$, so the measure of $\cup_i I_i$ is at most $2\varepsilon$. (It is not equal to that because there is a lot of overlap.) So the measure of $A = [0, 1] - \cup_i I_i$ is larger than $1 - 2\varepsilon$. But $A$ contains no rational numbers.
Example 4.3 Let us follow the construction of the Cantor set, with this difference. Instead of removing the middle third at the first stage, remove the middle fourth, i.e., remove $(3/8, 5/8)$. On each of the two intervals that remain, remove the middle interval of length 1/16. On each of the four intervals that remain, remove the middle interval of length 1/64, and so on. The total that we removed is
$$\tfrac14 + 2(\tfrac{1}{16}) + 4(\tfrac{1}{64}) + \cdots = \tfrac12.$$
The set that remains contains no intervals, is closed, every point is a limit point, is uncountable, and has measure 1/2. Such a set is called a fat Cantor set or generalized Cantor set. Of course, other choices than 1/4, 1/16, etc. are possible.
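The total length removed in Example 4.3 is a geometric series, and its partial sums can be checked exactly. This is a small sketch; `removed_after` is a hypothetical helper name.

```python
from fractions import Fraction

def removed_after(n):
    """Total length removed in the first n stages.

    Stage k removes 2**(k-1) intervals, each of length 4**(-k).
    """
    return sum(Fraction(2 ** (k - 1), 4 ** k) for k in range(1, n + 1))

partial = [removed_after(n) for n in range(1, 6)]
# Each partial sum equals 1/2 - 1/2**(n+1), so the series converges to 1/2.
```

The closed form $\tfrac12 - 2^{-(n+1)}$ for the partial sums shows the remaining set has measure $1 - \tfrac12 = \tfrac12$.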
Let $A \subset [0, 1]$ be a Borel measurable set. We will show that $A$ is almost equal to the countable intersection of open sets and almost equal to the countable union of closed sets. (A similar argument to what follows is possible for sets that have infinite measure.)
Proposition 4.4 Suppose $A \subset [0, 1]$ is a Borel measurable set.
(a) There exists a set $H$ that is the countable intersection of open sets which contains $A$ and $m(H - A) = 0$.
(b) There exists a set $F$ that is the countable union of closed sets which is contained in $A$ and $m(A - F) = 0$.

Proof. (a) For each $i$, there is an open set $G_i$ that contains $A$ and such that $m(G_i - A) < 2^{-i}$. This follows from the fact that $m(A) = m^*(A)$ and the definition of $m^*$. Then $H_i = \cap_{j \le i} G_j$ will contain $A$, is open, and since it is contained in $G_i$, then $m(H_i - A) < 2^{-i}$. Let $H = \cap_{i=1}^\infty H_i$. $H$ need not be open, but it is the intersection of countably many open sets. The set $H$ is a Borel set, contains $A$, and $m(H - A) \le m(H_i - A) < 2^{-i}$ for each $i$, hence $m(H - A) = 0$.
(b) If $A \subset [0, 1]$, let $F_i = [0, 1] - H_i$, where $H_i$ is a decreasing sequence of open sets containing $A^c$ such that $m(H_i - A^c) < 2^{-i}$. (The $H_i$ are constructed as in the proof of (a), but in terms of $A^c$ instead of $A$.) Then $F_i$ is an increasing sequence of closed sets, $F_i \subset A$ for each $i$, and $m(A - F_i) < 2^{-i}$ for each $i$. Our result follows from letting $F = \cup_i F_i$ since $m(A - F) \le m(A - F_i) < 2^{-i}$ for each $i$, hence $m(A - F) = 0$.
The countable intersections of open sets are sometimes called $G_\delta$ sets; the $G$ is for geöffnet, the German word for open, and the $\delta$ for Durchschnitt, the German word for intersection. The countable unions of closed sets are called $F_\sigma$ sets, the $F$ coming from fermé, the French word for closed, and the $\sigma$ coming from Summe, the German word for union.
Therefore, when trying to understand Lebesgue measure, we can look at $G_\delta$ or $F_\sigma$ sets, which are not so bad, and at null sets, which can be quite bad but don't have positive measure.
Next we prove the Carathéodory extension theorem. We say that a measure $\mu$ is $\sigma$-finite if there exist $E_1, E_2, \ldots$, such that $\mu(E_i) < \infty$ for all $i$ and $X \subset \cup_{i=1}^\infty E_i$.
Theorem 4.5 Suppose $\mathcal{A}_0$ is an algebra and $m$ restricted to $\mathcal{A}_0$ is a measure. Define
$$m^*(E) = \inf\Big\{\sum_{i=1}^\infty m(A_i) : A_i \in \mathcal{A}_0, E \subset \cup_{i=1}^\infty A_i\Big\}.$$
Then
(a) $m^*(A) = m(A)$ if $A \in \mathcal{A}_0$;
(b) every set in $\mathcal{A}_0$ is $m^*$-measurable;
(c) if $m$ is $\sigma$-finite, then there is a unique extension to the smallest $\sigma$-algebra containing $\mathcal{A}_0$.
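Proof. We start with (a). Suppose $E \in \mathcal{A}_0$. We know $m^*(E) \le m(E)$ since we can take $A_1 = E$ and $A_2, A_3, \ldots$ empty in the definition of $m^*$. If $E \subset \cup_{i=1}^\infty A_i$ with $A_i \in \mathcal{A}_0$, let $B_n = E \cap (A_n - \cup_{i=1}^{n-1} A_i)$. Then the $B_n$ are disjoint, they are each in $\mathcal{A}_0$, and their union is $E$. Therefore
$$m(E) = \sum_{i=1}^\infty m(B_i) \le \sum_{i=1}^\infty m(A_i).$$
Thus $m(E) \le m^*(E)$.
Next we look at (b). Suppose $A \in \mathcal{A}_0$. Let $\varepsilon > 0$ and let $E \subset X$. Pick $B_i \in \mathcal{A}_0$ such that $E \subset \cup_{i=1}^\infty B_i$ and $\sum_i m(B_i) \le m^*(E) + \varepsilon$. Then
$$m^*(E) + \varepsilon \ge \sum_{i=1}^\infty m(B_i) = \sum_{i=1}^\infty m(B_i \cap A) + \sum_{i=1}^\infty m(B_i \cap A^c) \ge m^*(E \cap A) + m^*(E \cap A^c).$$
Since $\varepsilon$ is arbitrary, $m^*(E) \ge m^*(E \cap A) + m^*(E \cap A^c)$. So $A$ is $m^*$-measurable.
Finally, suppose we have two extensions to the smallest $\sigma$-algebra containing $\mathcal{A}_0$; let the other extension be called $n$. We will show that if $E$ is in this smallest $\sigma$-algebra, then $m^*(E) = n(E)$.
Since $m$ is $\sigma$-finite, we can reduce to the case where $m$ is a finite measure: if $X = \cup_i K_i$ with $m(K_i) < \infty$ and we prove uniqueness for the measure $m_i$ defined by $m_i(A) = m(A \cap K_i)$, then uniqueness for $m$ follows. So we suppose $m(X) < \infty$.
Since $E$ must be $m^*$-measurable,
$$m^*(E) = \inf\Big\{\sum_{i=1}^\infty m(A_i) : E \subset \cup_{i=1}^\infty A_i, A_i \in \mathcal{A}_0\Big\}.$$
But $m = n$ on $\mathcal{A}_0$, so $\sum_i m(A_i) = \sum_i n(A_i)$. Therefore $n(E) \le \sum_i n(A_i)$, which implies $n(E) \le m^*(E)$.
Since we do not know that $n$ is constructed via an outer measure, we must use a different argument to get the reverse inequality. Let $\varepsilon > 0$ and choose $A_i \in \mathcal{A}_0$ such that $m^*(E) + \varepsilon \ge \sum_i m(A_i)$ and $E \subset \cup_i A_i$. Let $A = \cup_i A_i$ and $B_k = \cup_{i=1}^k A_i$. Observe $m^*(E) + \varepsilon \ge m^*(A)$, hence $m^*(A - E) \le \varepsilon$. We have
$$m^*(A) = \lim_{k \to \infty} m^*(B_k) = \lim_{k \to \infty} n(B_k) = n(A).$$
Then
$$m^*(E) \le m^*(A) = n(A) = n(E) + n(A - E) \le n(E) + m^*(A - E) \le n(E) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, this completes the proof.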
Remarks: (1) Uniqueness implies there is only one possible Lebesgue measure.
(2) We will use the Carathéodory extension theorem in the study of product measures. It is also used in the Riesz representation theorem and in the Daniell-Kolmogorov extension theorem.
We now define Lebesgue-Stieltjes measures. Let $\alpha : \mathbb{R} \to \mathbb{R}$ be nondecreasing and right continuous (i.e., $\alpha(x+) = \alpha(x)$ for all $x$, where $\alpha(x+) = \lim_{y \to x, y > x} \alpha(y)$). Suppose we define $m_\alpha((a, b]) = \alpha(b) - \alpha(a)$, define
$$m_\alpha(\cup_{i=1}^\infty (a_i, b_i]) = \sum_i (\alpha(b_i) - \alpha(a_i))$$
when the intervals $(a_i, b_i]$ are disjoint, and define
$$m_\alpha^*(A) = \inf\Big\{\sum_i (\alpha(b_i) - \alpha(a_i)) : A \subset \cup_i (a_i, b_i]\Big\}.$$
Very much as in the previous section we can show that $m_\alpha^*$ is a measure on the Borel $\sigma$-algebra.
The $m_\alpha$ measure of a point $x$ is $\alpha(x) - \alpha(x-)$, where
$$\alpha(x-) = \lim_{y \to x, y < x} \alpha(y).$$
So $m_\alpha(\{x\})$ is equal to the size of the jump (if any) of $\alpha$ at $x$.
Lebesgue measure is the special case of $m_\alpha$ when $\alpha(x) = x$.
Given a measure $\mu$ on $\mathbb{R}$ such that $\mu(K) < \infty$ whenever $K$ is compact, define $\alpha(x) = \mu((0, x])$ if $x \ge 0$ and $\alpha(x) = -\mu((x, 0])$ if $x < 0$. Then $\alpha$ is nondecreasing, right continuous, and it is not hard to see that $\mu = m_\alpha$.
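To make the definition concrete, here is a small sketch for one particular choice of $\alpha$, namely $\alpha(x) = x$ plus a unit jump at 0, which corresponds to Lebesgue measure plus a point mass at 0. The function `alpha` and the helper names are assumptions chosen for illustration, and the jump is only approximated by evaluating slightly to the left of the point.

```python
from fractions import Fraction

def alpha(x):
    """Hypothetical example: alpha(x) = x plus a unit jump at 0 (right continuous)."""
    return x + (1 if x >= 0 else 0)

def m_alpha(a, b):
    """Measure of the half-open interval (a, b], namely alpha(b) - alpha(a)."""
    return alpha(b) - alpha(a)

def jump(x, h=Fraction(1, 10 ** 9)):
    """Approximates alpha(x) - alpha(x-) by evaluating alpha just to the left of x."""
    return alpha(x) - alpha(x - h)

across_zero = m_alpha(Fraction(-1), Fraction(1))   # length 2 plus the atom at 0
away_from_zero = m_alpha(Fraction(1), Fraction(2)) # plain length 1
```

The interval $(-1, 1]$ picks up the atom at 0, while $(1, 2]$ is measured by length alone; `jump(Fraction(0))` is close to 1 and `jump(Fraction(1))` is close to 0.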

Measurable functions

Suppose we have a set $X$ together with a $\sigma$-algebra $\mathcal{A}$.

Definition 5.1 $f : X \to \mathbb{R}$ is measurable or $\mathcal{A}$-measurable if $\{x : f(x) > a\} \in \mathcal{A}$ for all $a \in \mathbb{R}$.
Example 5.2 Suppose $f$ is identically constant. Then $\{x : f(x) > a\}$ is either empty or the whole space, so $f$ is measurable.
Example 5.3 Suppose $f(x) = 1$ if $x \in A$ and 0 otherwise. Then $\{x : f(x) > a\}$ is either $\emptyset$, $A$, or $X$. So $f$ is measurable if and only if $A$ is in the $\sigma$-algebra.
Example 5.4 Suppose $X$ is the real line with the Borel $\sigma$-algebra and $f(x) = x$. Then $\{x : f(x) > a\} = (a, \infty)$, and so $f$ is measurable.
Proposition 5.5 The following are equivalent.
(a) $\{x : f(x) > a\} \in \mathcal{A}$ for all $a$;
(b) $\{x : f(x) \le a\} \in \mathcal{A}$ for all $a$;
(c) $\{x : f(x) < a\} \in \mathcal{A}$ for all $a$;
(d) $\{x : f(x) \ge a\} \in \mathcal{A}$ for all $a$.
Proof. The equivalence of (a) and (b) and of (c) and (d) follow from taking complements. The remaining equivalences follow from the equations
$$\{x : f(x) \ge a\} = \cap_{n=1}^\infty \{x : f(x) > a - 1/n\},$$
$$\{x : f(x) > a\} = \cup_{n=1}^\infty \{x : f(x) \ge a + 1/n\}.$$

Proposition 5.6 If $X$ is a metric space, $\mathcal{A}$ contains all the open sets, and $f$ is continuous, then $f$ is measurable.

Proof. $\{x : f(x) > a\} = f^{-1}((a, \infty))$ is open.

Proposition 5.7 If $f$ and $g$ are measurable, so are $f + g$, $cf$, $fg$, $\max(f, g)$, and $\min(f, g)$.
Proof. If $f(x) + g(x) < \alpha$, then $f(x) < \alpha - g(x)$, and there exists a rational $r$ such that $f(x) < r < \alpha - g(x)$. So
$$\{x : f(x) + g(x) < \alpha\} = \bigcup_{r \text{ rational}} (\{x : f(x) < r\} \cap \{x : g(x) < \alpha - r\}).$$
$f^2$ is measurable since for $a \ge 0$, $\{x : f(x)^2 > a\} = \{x : f(x) > \sqrt{a}\} \cup \{x : f(x) < -\sqrt{a}\}$. The measurability of $fg$ follows since $fg = \tfrac12 [(f + g)^2 - f^2 - g^2]$. $\{x : \max(f(x), g(x)) > a\} = \{x : f(x) > a\} \cup \{x : g(x) > a\}$, and the argument for $\min(f, g)$ is similar.

Proposition 5.8 If $f_i$ is measurable for each $i$, then so are $\sup_i f_i$, $\inf_i f_i$, $\limsup_i f_i$, and $\liminf_i f_i$.
Proof. The result will follow for $\limsup$ and $\liminf$ once we have the result for the $\sup$ and $\inf$ by using the definitions. We have $\{x : \sup_i f_i > a\} = \cup_{i=1}^\infty \{x : f_i(x) > a\}$, and the proof for $\inf f_i$ is similar.

Definition 5.9 We say $f = g$ almost everywhere, written $f = g$ a.e., if $\{x : f(x) \ne g(x)\}$ has measure zero. Similarly, we say $f_i \to f$ a.e., if the set of $x$ where this fails has measure zero.
We saw in Proposition 5.6 that all continuous functions are Borel measurable. The same is true for monotone functions on the real line.
Proposition 5.10 If $f : \mathbb{R} \to \mathbb{R}$ is nondecreasing or nonincreasing, then $f$ is Borel measurable.

Proof. Let us suppose $f$ is nondecreasing. The set $A_a = \{x : f(x) > a\}$ is then either a semi-infinite open interval or semi-infinite closed interval. This can be seen by a picture. To be more careful, given $a \in \mathbb{R}$, let $x_0 = \sup\{y : f(y) \le a\}$. If $f(x_0) \le a$, then $A_a = (x_0, \infty)$, while if $f(x_0) > a$, then $A_a = [x_0, \infty)$. In each case $A_a$ is a Borel set.

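Proposition 5.11 Let $X$ be a space, $\mathcal{A}$ a $\sigma$-algebra on $X$, and $f : X \to \mathbb{R}$ an $\mathcal{A}$-measurable function. If $A$ is in the Borel $\sigma$-algebra on $\mathbb{R}$, then $f^{-1}(A) \in \mathcal{A}$.
Proof. Let $\mathcal{B}$ be the Borel $\sigma$-algebra on $\mathbb{R}$ and $\mathcal{C} = \{A \in \mathcal{B} : f^{-1}(A) \in \mathcal{A}\}$. If $A_1, A_2, \ldots \in \mathcal{C}$, then since $f^{-1}(\cup_i A_i) = \cup_i f^{-1}(A_i) \in \mathcal{A}$, we have that $\mathcal{C}$ is closed under countable unions. Similarly $\mathcal{C}$ is closed under countable intersections and complements, so $\mathcal{C}$ is a $\sigma$-algebra. Since $f$ is measurable, $\mathcal{C}$ contains $(a, \infty)$ for every real $a$, hence $\mathcal{C}$ contains the $\sigma$-algebra generated by these intervals, that is, $\mathcal{C}$ contains $\mathcal{B}$.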

Example 5.12 We want to construct a set that is Lebesgue measurable, but not Borel measurable. Let $F$ be the Cantor-Lebesgue function of Example 4.1 and define
$$f(x) = \inf\{y : F(y) \ge x\}.$$
Although $f$ is not continuous, observe that $f$ is strictly increasing (hence one-to-one) and maps $[0, 1]$ into $C$, the Cantor set. Since $f$ is nondecreasing, $f^{-1}$ maps Borel measurable sets to Borel measurable sets.
Let $A$ be the non-measurable set we constructed in Proposition 3.1. Let $B = f(A)$. Since $f(A) \subset C$ and $m(C) = 0$, then $f(A)$ is a null set, hence is Lebesgue measurable. On the other hand, $f(A)$ is not Borel measurable, because if it were, then $A = f^{-1}(f(A))$ would be Borel measurable, a contradiction.

Integration

In this section we introduce the Lebesgue integral.


Definition 6.1 If $E \subset X$, define the characteristic function of $E$ by
$$\chi_E(x) = \begin{cases} 1 & x \in E; \\ 0 & x \notin E. \end{cases}$$
A simple function $s$ is one of the form
$$s(x) = \sum_{i=1}^n a_i \chi_{E_i}(x)$$
for reals $a_i$ and measurable sets $E_i$.

Proposition 6.2 Suppose $f \ge 0$ is measurable. Then there exists a sequence of nonnegative measurable simple functions increasing to $f$.
Proof. Let $E_{ni} = \{x : (i - 1)/2^n \le f(x) < i/2^n\}$ and $F_n = \{x : f(x) \ge n\}$ for $n = 1, 2, \ldots$, and $i = 1, 2, \ldots, n2^n$. Then define
$$s_n = \sum_{i=1}^{n2^n} \frac{i - 1}{2^n} \chi_{E_{ni}} + n \chi_{F_n}.$$
It is easy to see that $s_n$ has the desired properties.
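The approximating sequence in the proof of Proposition 6.2 is easy to evaluate pointwise: on $E_{ni}$ the value $(i-1)/2^n$ is $f(x)$ rounded down to a multiple of $2^{-n}$, and on $F_n$ the value is capped at $n$. The sketch below assumes $f$ is given as an ordinary Python function; the name `s_n` is illustrative.

```python
import math

def s_n(f, n):
    """Pointwise evaluation of the simple function s_n from Proposition 6.2."""
    def s(x):
        y = f(x)
        if y >= n:                       # x lies in F_n
            return float(n)
        # x lies in E_{ni} with i - 1 = floor(y * 2**n)
        return math.floor(y * 2 ** n) / 2 ** n
    return s

f = lambda x: x * x
values = [s_n(f, n)(1.3) for n in range(1, 11)]   # approximations of f(1.3) = 1.69
```

At a fixed point the values are nondecreasing in $n$, stay below $f(x)$, and are within $2^{-n}$ of $f(x)$ once $n > f(x)$.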

Definition 6.3 If $s = \sum_{i=1}^n a_i \chi_{E_i}$ is a nonnegative measurable simple function, define the Lebesgue integral of $s$ to be
$$\int s \, d\mu = \sum_{i=1}^n a_i \mu(E_i). \qquad (6.1)$$
If $f \ge 0$ is a measurable function, define
$$\int f \, d\mu = \sup\Big\{\int s \, d\mu : 0 \le s \le f, s \text{ simple}\Big\}. \qquad (6.2)$$
If $f$ is measurable and at least one of the integrals $\int f^+ \, d\mu$, $\int f^- \, d\mu$ is finite, where $f^+ = \max(f, 0)$ and $f^- = -\min(f, 0)$, define
$$\int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu. \qquad (6.3)$$
Finally, if $f = u + iv$ and $\int (|u| + |v|) \, d\mu$ is finite, define
$$\int f \, d\mu = \int u \, d\mu + i \int v \, d\mu. \qquad (6.4)$$

A few remarks are in order. A function $s$ might be written as a simple function in more than one way. For example $\chi_{A \cup B} = \chi_A + \chi_B$ if $A$ and $B$ are disjoint. It is clear that the definition of $\int s \, d\mu$ is unaffected by how $s$ is written. Secondly, if $s$ is a simple function, one has to think a moment to verify that the definition of $\int s \, d\mu$ by means of (6.1) agrees with its definition by means of (6.2).
Definition 6.4 If $\int |f| \, d\mu < \infty$, we say $f$ is integrable.

The proof of the next proposition follows from the definitions.


Proposition 6.5 (a) If $f$ is measurable, $a \le f(x) \le b$ for all $x$, and $\mu(X) < \infty$, then $a\mu(X) \le \int f \, d\mu \le b\mu(X)$;
(b) If $f(x) \le g(x)$ for all $x$ and $f$ and $g$ are measurable and integrable, then $\int f \, d\mu \le \int g \, d\mu$.
(c) If $f$ is integrable, then $\int cf \, d\mu = c \int f \, d\mu$ for all real $c$.
(d) If $\mu(A) = 0$ and $f$ is measurable, then $\int f \chi_A \, d\mu = 0$.
The integral $\int f \chi_A \, d\mu$ is often written $\int_A f \, d\mu$. Other notation for the integral is to omit the $\mu$ if it is clear which measure is being used, to write $\int f(x) \, \mu(dx)$, or to write $\int f(x) \, d\mu(x)$.
Proposition 6.6 If $f$ is integrable,
$$\Big|\int f\Big| \le \int |f|.$$

Proof. For the real case, this is easy. $f \le |f|$, so $\int f \le \int |f|$. Also $-f \le |f|$, so $-\int f \le \int |f|$. Now combine these two facts.
For the complex case, $\int f$ is a complex number. If it is 0, the inequality is trivial. If it is not, then $\int f = re^{i\theta}$ for some $r$ and $\theta$. Then
$$\Big|\int f\Big| = r = e^{-i\theta} \int f = \int e^{-i\theta} f.$$
From the definition of $\int f$ when $f$ is complex, we have $\mathrm{Re}\,(\int f) = \int \mathrm{Re}\,(f)$. Since $|\int f|$ is real, we have
$$\Big|\int f\Big| = \mathrm{Re}\,\Big(\int e^{-i\theta} f\Big) = \int \mathrm{Re}\,(e^{-i\theta} f) \le \int |f|.$$
We do not yet have that $\int (f + g) = \int f + \int g$.

Limit theorems

One of the most important results concerning Lebesgue integration is the


monotone convergence theorem.
Theorem 7.1 Suppose $f_n$ is a sequence of nonnegative measurable functions with $f_1(x) \le f_2(x) \le \cdots$ for all $x$ and with $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. By Proposition 6.5(b), $\int f_n$ is an increasing sequence of real numbers. Let $L$ be the limit. Since $f_n \le f$ for all $n$, then $L \le \int f$. We must show $L \ge \int f$.
Let $s = \sum_{i=1}^m a_i \chi_{E_i}$ be any nonnegative simple function less than $f$ and let $c \in (0, 1)$. Let $A_n = \{x : f_n(x) \ge cs(x)\}$. Since the $f_n(x)$ increases to $f(x)$ for each $x$ and $c < 1$, then $A_1 \subset A_2 \subset \cdots$, and the union of the $A_n$ is all of $X$. For each $n$,
$$\int f_n \ge \int_{A_n} f_n \ge c \int_{A_n} s = c \int_{A_n} \sum_{i=1}^m a_i \chi_{E_i} = c \sum_{i=1}^m a_i \mu(E_i \cap A_n).$$
If we let $n \to \infty$, by Proposition 2.5(c), the right hand side converges to
$$c \sum_{i=1}^m a_i \mu(E_i) = c \int s.$$
Therefore $L \ge c \int s$. Since $c$ is arbitrary in the interval $(0, 1)$, then $L \ge \int s$. Taking the supremum over all simple $s \le f$, we obtain $L \ge \int f$.

Example 7.2 Let $X = [0, \infty)$ and $f_n(x) = -1/n$ for all $x$. Then $\int f_n = -\infty$, but $f_n \uparrow f$ where $f = 0$ and $\int f = 0$. The problem here is that the $f_n$ are not nonnegative.
Example 7.3 Suppose $f_n = n\chi_{(0, 1/n)}$. Then $f_n \ge 0$, $f_n \to 0$ for each $x$, but $\int f_n = 1$ does not converge to $\int 0 = 0$. The trouble here is that the $f_n$ do not increase for each $x$.
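Example 7.3 can be checked directly, since each $f_n$ is a simple function whose integral is given by (6.1). The sketch below uses exact fractions; the names `f` and `integral_f` are illustrative.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n times the characteristic function of (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of the simple function f_n: the value n times the length 1/n."""
    return n * (Fraction(1, n) - 0)

x = Fraction(1, 100)
pointwise = [f(n, x) for n in (10, 50, 100, 1000)]   # nonzero only while 1/n > x
integrals = [integral_f(n) for n in (10, 50, 100, 1000)]
```

At the fixed point $x = 1/100$ the values are eventually 0, while every integral equals 1, so the limit of the integrals is not the integral of the limit.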
Once we have the monotone convergence theorem, we can prove that the
Lebesgue integral is linear.
Theorem 7.4 If $f_1$ and $f_2$ are integrable, then
$$\int (f_1 + f_2) = \int f_1 + \int f_2.$$

Proof. First suppose $f_1$ and $f_2$ are nonnegative and simple. Then it is clear from the definition that the theorem holds in this case. Next suppose $f_1$ and $f_2$ are nonnegative. Take $s_n$ simple and increasing to $f_1$ and $t_n$ simple and increasing to $f_2$. Then $s_n + t_n$ increases to $f_1 + f_2$, so the result follows from the monotone convergence theorem and the result for simple functions. Finally in the general case, write $f_1 = f_1^+ - f_1^-$ and similarly for $f_2$, and use the definitions and the result for nonnegative functions.

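Proposition 7.5 Suppose $f_n$ are nonnegative measurable functions. Then
$$\int \sum_{n=1}^\infty f_n = \sum_{n=1}^\infty \int f_n.$$

Proof. Let $F_N = \sum_{n=1}^N f_n$ and write
$$\int \sum_{n=1}^\infty f_n = \int \lim_{N \to \infty} \sum_{n=1}^N f_n = \int \lim_{N \to \infty} F_N = \lim_{N \to \infty} \int F_N = \lim_{N \to \infty} \sum_{n=1}^N \int f_n = \sum_{n=1}^\infty \int f_n, \qquad (7.1)$$
using the monotone convergence theorem and the linearity of the integral.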

The next theorem is known as Fatou's lemma.

Theorem 7.6 Suppose the $f_n$ are nonnegative and measurable. Then
$$\int \liminf_{n \to \infty} f_n \le \liminf_{n \to \infty} \int f_n.$$

Proof. Let $g_n = \inf_{i \ge n} f_i$. Then $g_n$ are nonnegative and $g_n$ increases to $\liminf f_n$. Clearly $g_n \le f_i$ for each $i \ge n$, so $\int g_n \le \int f_i$. Therefore
$$\int g_n \le \inf_{i \ge n} \int f_i.$$
If we take the limit as $n \to \infty$, on the left hand side we obtain $\int \liminf f_n$ by the monotone convergence theorem, while on the right hand side we obtain $\liminf_n \int f_n$.
A typical use of Fatou's lemma is the following. Suppose we have $f_n \to f$ and $\sup_n \int |f_n| \le K < \infty$. Then $|f_n| \to |f|$, and by Fatou's lemma, $\int |f| \le K$.
Another very important theorem is the dominated convergence theorem.
Theorem 7.7 Suppose $f_n$ are measurable functions and $f_n(x) \to f(x)$. Suppose there exists an integrable function $g$ such that $|f_n(x)| \le g(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. Since $f_n + g \ge 0$, by Fatou's lemma,
$$\int (f + g) \le \liminf \int (f_n + g).$$
Since $g$ is integrable,
$$\int f \le \liminf \int f_n.$$
Similarly, $g - f_n \ge 0$, so
$$\int (g - f) \le \liminf \int (g - f_n),$$
and hence
$$-\int f \le \liminf \int (-f_n) = -\limsup \int f_n.$$
Therefore
$$\int f \ge \limsup \int f_n,$$
which with the above proves the theorem.


Example 7.3 is an example where the limit of the integrals is not the
integral of the limit because there is no dominating function g.


If in the monotone convergence theorem or dominated convergence theorem we have only $f_n(x) \to f(x)$ almost everywhere, the conclusion still holds. For if the $f_n$ and $f$ are measurable and $A = \{x : f_n(x) \to f(x)\}$, then $f_n \chi_A \to f \chi_A$ for each $x$. And since $A^c$ has measure 0, we see from Proposition 6.5(d) that $\int f \chi_A = \int f$, and similarly with $f$ replaced by $f_n$.

Properties of Lebesgue integrals

Later on we will need the following two propositions.


Proposition 8.1 Suppose $f$ is measurable and for every measurable set $A$ we have $\int_A f \, d\mu = 0$. Then $f = 0$ almost everywhere.
Proof. Let $A = \{x : f(x) > \varepsilon\}$. Then
$$0 = \int_A f \ge \int_A \varepsilon = \varepsilon \mu(A)$$
since $f \chi_A \ge \varepsilon \chi_A$. Hence $\mu(A) = 0$. We use this argument for $\varepsilon = 1/n$ and $n = 1, 2, \ldots$, so $\mu\{x : f(x) > 0\} = 0$. Similarly $\mu\{x : f(x) < 0\} = 0$.

Proposition 8.2 Suppose $f$ is measurable and nonnegative and $\int f \, d\mu = 0$. Then $f = 0$ almost everywhere.

Proof. If $f$ is not almost everywhere equal to 0, there exists an $n$ such that $\mu(A_n) > 0$ where $A_n = \{x : f(x) > 1/n\}$. But then since $f$ is nonnegative,
$$0 = \int f \ge \int_{A_n} f \ge \frac{1}{n} \mu(A_n),$$
a contradiction.
We give a result on approximating a function on R by continuous functions.

Proposition 8.3 Suppose $f$ is a measurable function from $\mathbb{R}$ to $\mathbb{R}$ that is integrable. Let $\varepsilon > 0$. Then there exists a continuous function $g$ that is 0 outside some bounded interval such that
$$\int |f - g| < \varepsilon.$$
Proof. If we have continuous functions $g_1$, $g_2$ such that $\int |f^+ - g_1| < \varepsilon/2$ and $\int |f^- - g_2| < \varepsilon/2$, where $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$, then taking $g = g_1 - g_2$ will prove our result. So without loss of generality, we may assume $f \ge 0$.
By monotone convergence $\int f \chi_{[-n, n]}$ increases to $\int f$, so by taking $n$ large enough, the difference of the integrals will be less than $\varepsilon/2$. If we find $g$ such that $\int |f \chi_{[-n, n]} - g| < \varepsilon/2$, then $\int |f - g| < \varepsilon$. Therefore we may assume that $f$ is 0 outside some bounded interval.
We can find simple functions increasing to $f$ whose integrals increase to $\int f$. Let $s_m$ be a simple function such that $s_m \le f$ and $\int s_m \ge \int f - \varepsilon/2$. If we find $g$ such that $\int |s_m - g| < \varepsilon/2$, then $\int |f - g| < \varepsilon$. So it suffices to consider the case where $f$ is a simple function.
If $f = \sum_{i=1}^p a_i \chi_{A_i}$ and we find $g_i$ continuous such that $\int |a_i \chi_{A_i} - g_i| < \varepsilon/p$, then $\sum_{i=1}^p g_i$ will be the desired function. So we may assume $f$ is a constant times a characteristic function, and by linearity, we may assume $f$ is equal to $\chi_A$ for some $A$ contained in a bounded interval $[-n, n]$.
We can choose $G$ open and $F$ closed such that $F \subset A \subset G$ and $m(G - F) < \varepsilon$. We can replace $G$ by $G \cap (-n - 1, n + 1)$. $G^c \cap [-n - 1, n + 1]$ and $F$ are compact sets, so there is a minimum distance between them, say, $\delta$. Let $g(x) = \max(0, 1 - \mathrm{dist}(x, F)/\delta)$. Then $g$ is continuous, $0 \le g \le 1$, $g$ is 1 on $F$, $g$ is 0 on $G^c$, and $g$ is 0 outside of $[-n - 1, n + 1]$. Therefore
$$|g - \chi_A| \le \chi_G - \chi_F,$$
so
$$\int |g - \chi_A| \le \int (\chi_G - \chi_F) = m(G - F) < \varepsilon.$$
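The cutoff function $g(x) = \max(0, 1 - \mathrm{dist}(x, F)/\delta)$ from the last step can be sketched for the simplest case where $F$ is a single closed interval $[a, b]$ and $G = (a - \delta, b + \delta)$. That choice of $F$ and $G$, and the helper names, are assumptions made only for this illustration.

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval F = [a, b]."""
    if x < a:
        return a - x
    if x > b:
        return x - b
    return 0.0

def make_g(a, b, delta):
    """g(x) = max(0, 1 - dist(x, F)/delta): equal to 1 on F, 0 at distance >= delta."""
    def g(x):
        return max(0.0, 1.0 - dist_to_interval(x, a, b) / delta)
    return g

g = make_g(0.0, 1.0, 0.5)
samples = [g(x) for x in (-0.5, -0.25, 0.5, 1.25, 1.5, 2.0)]
```

The sampled values rise linearly from 0 to 1 across the gap of width $\delta$ and fall back symmetrically, which is the continuity used in the proof.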

The method of proof, where one proves a result for characteristic functions, then simple functions, then non-negative functions, and then finally
integrable functions is very common.

We finish this section with a comparison of the Lebesgue integral and the Riemann integral. Here we are only looking at bounded functions from $[a, b]$ into $\mathbb{R}$. If we are looking at the Lebesgue integral, we write $\int f$, while, temporarily, if we are looking at the Riemann integral, we write $R(f)$. Recall that the Riemann integral on $[a, b]$ is defined as follows: if $P = \{x_0, x_1, \ldots, x_n\}$ is a partition of $[a, b]$, then
$$U(P, f) = \sum_{i=1}^n \Big(\sup_{x_{i-1} \le x \le x_i} f(x)\Big)(x_i - x_{i-1})$$
and
$$L(P, f) = \sum_{i=1}^n \Big(\inf_{x_{i-1} \le x \le x_i} f(x)\Big)(x_i - x_{i-1}).$$
Set $\overline{R}(f) = \inf\{U(P, f) : P \text{ is a partition}\}$ and similarly $\underline{R}(f) = \sup\{L(P, f) : P \text{ is a partition}\}$. Then the Riemann integral exists if $\overline{R}(f) = \underline{R}(f)$, and the common value is the Riemann integral, which we denote $R(f)$.
Theorem 8.4 A bounded measurable function f on [a, b] is Riemann integrable if and only if the set of points at which f is discontinuous has Lebesgue
measure 0, and in that case, the Riemann integral is equal in value to the
Lebesgue integral.
Proof. If $P$ is a partition, define
$$T_P(x) = \sum_{i=1}^n \Big(\sup_{x_{i-1} \le y \le x_i} f(y)\Big)\chi_{[x_{i-1}, x_i)}(x)$$
and
$$S_P(x) = \sum_{i=1}^n \Big(\inf_{x_{i-1} \le y \le x_i} f(y)\Big)\chi_{[x_{i-1}, x_i)}(x).$$
We see that $\int T_P = U(P, f)$ and $\int S_P = L(P, f)$.

If $f$ is Riemann integrable, there exists a sequence of partitions $Q_i$ such that $U(Q_i, f) \downarrow R(f)$ and a sequence $Q_i'$ such that $L(Q_i', f) \uparrow R(f)$. It is not hard to check that adding points to a partition increases $L$ and decreases $U$, so if we let $P_i = \cup_{j \le i}(Q_j \cup Q_j')$, then $P_i$ is an increasing sequence of partitions, $U(P_i, f) \downarrow R(f)$, $L(P_i, f) \uparrow R(f)$. We see also that $T_{P_i}(x)$ decreases at each point, say, to $T(x)$, and $S_{P_i}(x)$ increases at each point, say, to $S(x)$. Also $T(x) \ge f(x) \ge S(x)$. Then by dominated convergence (recall that $f$ is bounded)
$$\int (T - S) = \lim_{i \to \infty} \int (T_{P_i} - S_{P_i}) = \lim_{i \to \infty} \big(U(P_i, f) - L(P_i, f)\big) = 0.$$
We conclude $T = S = f$ a.e. If $x$ is not in the null set where $T(x) \ne S(x)$ nor in $\cup_i P_i$, which is countable and hence of Lebesgue measure 0, then $T_{P_i}(x) \downarrow f(x)$ and $S_{P_i}(x) \uparrow f(x)$. This implies that $f$ is continuous at such $x$. Since
$$R(f) = \lim_{i \to \infty} U(P_i, f) = \lim_{i \to \infty} \int T_{P_i} = \int f,$$
we see the Riemann integral and Lebesgue integral agree.

Now suppose that $f$ is continuous a.e. Let $\varepsilon > 0$. Let $P_i$ be the partition where we divide $[a, b]$ into $2^i$ equal parts. If $x$ is not in the null set where $f$ is discontinuous, nor in $\cup_{i=1}^\infty P_i$, then $T_{P_i}(x) \downarrow f(x)$ and $S_{P_i}(x) \uparrow f(x)$. By dominated convergence,
$$U(P_i, f) = \int T_{P_i} \to \int f$$
and
$$L(P_i, f) = \int S_{P_i} \to \int f.$$
This does it.
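The dyadic partitions in the second half of the proof are easy to experiment with. The sketch below is an illustration only: it assumes $f$ is nondecreasing, so that the supremum and infimum over each subinterval are attained at the right and left endpoints, and the function name `dyadic_sums` is hypothetical.

```python
def dyadic_sums(f, a, b, i):
    """Upper and lower sums of a nondecreasing f on the partition of [a, b] into 2**i equal parts."""
    n = 2 ** i
    width = (b - a) / n
    # For nondecreasing f the sup on [x_{k}, x_{k+1}] is f(x_{k+1}) and the inf is f(x_k).
    upper = sum(f(a + (k + 1) * width) for k in range(n)) * width
    lower = sum(f(a + k * width) for k in range(n)) * width
    return upper, lower


upper, lower = dyadic_sums(lambda x: x * x, 0.0, 1.0, 10)
gap = upper - lower   # equals (f(1) - f(0)) / 2**10 up to rounding
```

For $f(x) = x^2$ on $[0, 1]$ the two sums bracket $\int f = 1/3$ and their gap is $2^{-i}$, which is the convergence $U(P_i, f) - L(P_i, f) \to 0$ used above.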

Modes of convergence

Definition 9.1 If $\mu$ is a measure, we say a sequence of measurable functions $f_n$ converges to $f$ almost everywhere (written $f_n \to f$ a.e.) if there is a set of measure 0 and for $x$ not in this set we have $f_n(x) \to f(x)$.
We say $f_n$ converges to $f$ in measure if for each $\varepsilon > 0$
$$\mu(\{x : |f_n(x) - f(x)| > \varepsilon\}) \to 0$$
as $n \to \infty$.

Proposition 9.2 Suppose $\mu$ is a finite measure.
(a) If $f_n \to f$ a.e., then $f_n$ converges to $f$ in measure.
(b) If $f_n \to f$ in measure, there is a subsequence $n_j$ such that $f_{n_j} \to f$ a.e.
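Proof. Let $\varepsilon > 0$. If $A_n = \{x : |f_n(x) - f(x)| > \varepsilon\}$, then $\chi_{A_n} \to 0$ a.e., and by dominated convergence,
$$\mu(A_n) = \int \chi_{A_n}(x)\,\mu(dx) \to 0.$$
This proves (a).
To prove (b), let $n_1 = 1$ and choose $n_j > n_{j-1}$ inductively so that
$$\mu(\{x : |f_{n_j}(x) - f(x)| > 1/j\}) \le 2^{-j}.$$
Let $A_j = \{x : |f_{n_j}(x) - f(x)| > 1/j\}$. Then $\mu(A_j) \le 2^{-j}$, and
$$A = \cap_{k=1}^\infty \cup_{j=k}^\infty A_j$$
has measure at most $\mu(\cup_{j=k}^\infty A_j)$ for every $k$, hence at most $2^{-k+1}$ for every $k$. Therefore $A$ has measure 0. If $x \notin A$, then $x \notin \cup_{j=k}^\infty A_j$ for some $k$, so $|f_{n_j}(x) - f(x)| \le 1/j$ for $j \ge k$, which means $f_{n_j}(x) \to f(x)$ for every $x \in A^c$, that is, $f_{n_j} \to f$ a.e.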

Example 9.3 Part (a) of the above proposition is not true if $\mu(X) = \infty$. Let $X = \mathbb{R}$ and let $f_n = \chi_{(n, n+1)}$.
Example 9.4 For an example where $f_n \to f$ in measure but not almost everywhere, let $X = [0, 1]$, let $\mu$ be Lebesgue measure, and let $f_n(x) = \chi_{F_n}(x)$, where
$$F_n = \Big\{y : \Big(\sum_{j=1}^{n} 1/j\Big) (\mathrm{mod}\ 1) \le y \le \Big(\sum_{j=1}^{n+1} 1/j\Big) (\mathrm{mod}\ 1)\Big\}.$$
$z\ (\mathrm{mod}\ 1)$ is defined as the fractional part of $z$ (where the largest integer less than $z$ is subtracted from $z$). Let $f(x) = 0$ for all $x$.
Then $\mu(F_n) \le 1/n \to 0$, so $f_n \to f$ in measure. But any $x$ will be in infinitely many $F_n$'s, so $f_n$ does not converge to $f(x)$ at any point.
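The behavior in Example 9.4 can be observed numerically. The sketch below is a hedged illustration: it assumes each $F_n$ is read as the arc of length $1/(n+1)$ that starts at the fractional part of $\sum_{j=1}^n 1/j$ and wraps around past 1 when necessary, and the function name `hits_and_last_length` is hypothetical.

```python
def hits_and_last_length(x, N):
    """Count the n <= N with x in F_n, and return the length of the last arc F_N.

    F_n is treated as the arc from H_n to H_{n+1} taken mod 1, where H_n is the
    n-th harmonic sum (an assumption about how the wrap-around is handled).
    """
    H = 0.0
    hits = 0
    length = 1.0
    for n in range(1, N + 1):
        H += 1.0 / n              # H is now the harmonic sum H_n
        start = H % 1.0           # left end of the arc F_n
        length = 1.0 / (n + 1)    # H_{n+1} - H_n
        if (x - start) % 1.0 <= length:
            hits += 1
    return hits, length


hits, last_length = hits_and_last_length(0.3, 20000)
```

The arcs shrink, which is convergence in measure, but because the harmonic series diverges they keep sweeping around $[0, 1]$, so the point $x = 0.3$ is hit again on every pass.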
The following is known as Egoroff's theorem.

Theorem 9.5 If $\mu$ is a finite measure, $\varepsilon > 0$, and $f_n \to f$ a.e., then there exists a measurable set $A$ such that $\mu(A) < \varepsilon$ and $f_n \to f$ uniformly on $A^c$.
This type of convergence is sometimes known as almost uniform convergence.
Proof. Let
$$E_{nk} = \cup_{m=n}^\infty \{x : |f_m(x) - f(x)| > 1/k\}.$$
For fixed $k$, $E_{nk}$ decreases as $n$ increases, and the intersection $\cap_n E_{nk}$ has measure 0. So $\mu(E_{nk}) \to 0$ as $n \to \infty$. Then there exists an integer $n_k$ such that $\mu(E_{n_k k}) < \varepsilon 2^{-k}$. Let $E = \cup_{k=1}^\infty E_{n_k k}$. Then $\mu(E) < \varepsilon$, and if $x \notin E$ and $n > n_k$, then $|f_n(x) - f(x)| \le 1/k$. Thus $f_n \to f$ uniformly on $E^c$.

10

Product measures

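If $A_1 \subset A_2 \subset \cdots$ and $A = \cup_{i=1}^\infty A_i$, we write $A_i \uparrow A$. If $A_1 \supset A_2 \supset \cdots$ and $A = \cap_{i=1}^\infty A_i$, we write $A_i \downarrow A$.
Definition 10.1 $\mathcal{M}$ is a monotone class if $\mathcal{M}$ is a collection of subsets of $X$ such that
(a) if $A_i \uparrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$;
(b) if $A_i \downarrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$.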

The intersection of monotone classes is a monotone class, and the intersection of all monotone classes containing a given collection of sets is the
smallest monotone class containing that collection.
The next theorem, the monotone class lemma, is rather technical, but very
useful.
Theorem 10.2 Suppose $\mathcal{A}_0$ is an algebra, $\mathcal{A}$ is the smallest $\sigma$-algebra containing $\mathcal{A}_0$, and $\mathcal{M}$ is the smallest monotone class containing $\mathcal{A}_0$. Then $\mathcal{M} = \mathcal{A}$.

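Proof. A $\sigma$-algebra is clearly a monotone class, so $\mathcal{M} \subset \mathcal{A}$. We must show $\mathcal{A} \subset \mathcal{M}$.
Let $\mathcal{N}_1 = \{A \in \mathcal{M} : A^c \in \mathcal{M}\}$. Note $\mathcal{N}_1$ is contained in $\mathcal{M}$, contains $\mathcal{A}_0$, and is a monotone class. So $\mathcal{N}_1 = \mathcal{M}$, and therefore $\mathcal{M}$ is closed under the operation of taking complements.
Let $\mathcal{N}_2 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{A}_0\}$. $\mathcal{N}_2$ is contained in $\mathcal{M}$; $\mathcal{N}_2$ contains $\mathcal{A}_0$ because $\mathcal{A}_0$ is an algebra; $\mathcal{N}_2$ is a monotone class because $(\cup_{i=1}^\infty A_i) \cap B = \cup_{i=1}^\infty (A_i \cap B)$, and similarly for intersections. Therefore $\mathcal{N}_2 = \mathcal{M}$; in other words, if $B \in \mathcal{A}_0$ and $A \in \mathcal{M}$, then $A \cap B \in \mathcal{M}$.
Let $\mathcal{N}_3 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{M}\}$. As in the preceding paragraph, $\mathcal{N}_3$ is a monotone class contained in $\mathcal{M}$. By the last sentence of the preceding paragraph, $\mathcal{N}_3$ contains $\mathcal{A}_0$. Hence $\mathcal{N}_3 = \mathcal{M}$.
We thus have that $\mathcal{M}$ is a monotone class closed under the operations of taking complements and taking intersections. This shows $\mathcal{M}$ is a $\sigma$-algebra, and so $\mathcal{A} \subset \mathcal{M}$.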
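Suppose $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are two measure spaces, i.e., $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras on $X$ and $Y$, resp., and $\mu$ and $\nu$ are measures on $\mathcal{A}$ and $\mathcal{B}$, resp. A rectangle is a set of the form $A \times B$, where $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Define a set function $\mu \times \nu$ on rectangles by
$$\mu \times \nu(A \times B) = \mu(A)\nu(B).$$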
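Lemma 10.3 Suppose $A \times B = \cup_{i=1}^\infty A_i \times B_i$, where $A, A_i \in \mathcal{A}$ and $B, B_i \in \mathcal{B}$ and the $A_i \times B_i$ are disjoint. Then
$$\mu \times \nu(A \times B) = \sum_{i=1}^\infty \mu \times \nu(A_i \times B_i).$$
Proof. We have
$$\chi_{A \times B}(x, y) = \sum_{i=1}^\infty \chi_{A_i \times B_i}(x, y),$$
and so
$$\chi_A(x)\chi_B(y) = \sum_{i=1}^\infty \chi_{A_i}(x)\chi_{B_i}(y).$$
Holding $x$ fixed and integrating over $y$ with respect to $\nu$, we have, using (7.1),
$$\chi_A(x)\nu(B) = \sum_{i=1}^\infty \chi_{A_i}(x)\nu(B_i).$$
Now use (7.1) again and integrate over $x$ with respect to $\mu$ to obtain the result.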
Let $\mathcal{C}_0 = \{\text{finite unions of rectangles}\}$. It is clear that $\mathcal{C}_0$ is an algebra. By Lemma 10.3 and linearity, we see that $\mu \times \nu$ is a measure on $\mathcal{C}_0$. Let $\mathcal{A} \times \mathcal{B}$ be the smallest $\sigma$-algebra containing $\mathcal{C}_0$; this is called the product $\sigma$-algebra. By the Carathéodory extension theorem, $\mu \times \nu$ can be extended to a measure on $\mathcal{A} \times \mathcal{B}$.
We will need the following observation. Suppose a measure $\mu$ is $\sigma$-finite. So there exist $E_i$ which have finite measure and whose union is $X$. If we let $F_n = \cup_{i=1}^n E_i$, then $F_n \uparrow X$ and $\mu(F_n)$ is finite for each $n$.
If $\mu$ and $\nu$ are both $\sigma$-finite, say with $F_i \uparrow X$ and $G_i \uparrow Y$, then $\mu \times \nu$ will be $\sigma$-finite, using the sets $F_i \times G_i$.
The main result of this section is Fubini's theorem, which allows one to interchange the order of integration.
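Theorem 10.4 Suppose $f : X \times Y \to \mathbb{R}$ is measurable with respect to $\mathcal{A} \times \mathcal{B}$. If $f$ is nonnegative or $\int |f(x, y)|\,d(\mu \times \nu)(x, y) < \infty$, then
(a) the function $g(x) = \int f(x, y)\,\nu(dy)$ is measurable with respect to $\mathcal{A}$;
(b) the function $h(y) = \int f(x, y)\,\mu(dx)$ is measurable with respect to $\mathcal{B}$;
(c) we have
$$\int f(x, y)\,d(\mu \times \nu)(x, y) = \int\Big(\int f(x, y)\,d\mu(x)\Big)\,d\nu(y) = \int\Big(\int f(x, y)\,d\nu(y)\Big)\,\mu(dx).$$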

Proof. First suppose $\mu$ and $\nu$ are finite measures. If $f$ is the characteristic function of a rectangle, then (a)-(c) are obvious. By linearity, (a)-(c) hold if $f$ is the characteristic function of a set in $\mathcal{C}_0$, the set of finite unions of rectangles.
Let $\mathcal{M}$ be the collection of sets $C$ such that (a)-(c) hold for $\chi_C$. If $C_i \uparrow C$ and $C_i \in \mathcal{M}$, then (c) holds for $\chi_C$ by monotone convergence. If $C_i \downarrow C$, then (c) holds for $\chi_C$ by dominated convergence. (a) and (b) are easy. So $\mathcal{M}$ is a monotone class containing $\mathcal{C}_0$, and by Theorem 10.2, $\mathcal{M} = \mathcal{A} \times \mathcal{B}$.
If $\mu$ and $\nu$ are $\sigma$-finite, applying the finite case to $C \cap (F_n \times G_n)$ for suitable $F_n$ and $G_n$ and using monotone convergence, we see that (a)-(c) hold for the characteristic functions of sets in $\mathcal{A} \times \mathcal{B}$ in this case as well.
By linearity, (a)-(c) hold for nonnegative simple functions. By monotone convergence, (a)-(c) hold for nonnegative functions. In the case $\int |f| < \infty$, writing $f = f^+ - f^-$ and using linearity proves (a)-(c) for this case, too.
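The hypothesis that $f$ be nonnegative or integrable cannot simply be dropped. As an illustration that is not taken from the notes, the sketch below uses counting measure on the positive integers for both $\mu$ and $\nu$ (a $\sigma$-finite measure) and the standard array $a(i, j)$ equal to $1$ on the diagonal and $-1$ just to the right of it; the names `a`, `rows_then_cols` and `cols_then_rows` are hypothetical.

```python
def a(i, j):
    """Array with +1 on the diagonal, -1 one step to the right of it, 0 elsewhere."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0


def rows_then_cols(n):
    """Sum over j first (inner sum covers the full support of each row), then over i <= n."""
    return sum(sum(a(i, j) for j in range(1, n + 2)) for i in range(1, n + 1))


def cols_then_rows(n):
    """Sum over i first (inner sum covers the full support of each column), then over j <= n."""
    return sum(sum(a(i, j) for i in range(1, n + 2)) for j in range(1, n + 1))


sum_j_first = rows_then_cols(50)
sum_i_first = cols_then_rows(50)
```

Every row sums to $0$, while the first column sums to $1$ and all other columns sum to $0$, so the two iterated sums are $0$ and $1$. Here $\sum_{i,j} |a(i, j)| = \infty$, so neither hypothesis of Theorem 10.4 is satisfied.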

11

The Radon-Nikodym theorem

Suppose $f$ is nonnegative, measurable, and integrable with respect to $\mu$. If we define $\nu$ by
$$\nu(A) = \int_A f\,d\mu, \qquad (11.1)$$
then $\nu$ is a measure. The only part that needs thought is the countable additivity, and this follows from (7.1) applied to the functions $f\chi_{A_i}$. Moreover, $\nu(A)$ is zero whenever $\mu(A)$ is. We sometimes write $f = d\nu/d\mu$ for (11.1).
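On a finite set, (11.1) reduces to a weighted sum, which makes the properties above easy to verify by hand. The following is a minimal sketch under that assumption; the three-point space and the dictionaries `mu` and `f` are invented for illustration.

```python
from fractions import Fraction

# Hypothetical three-point space; mu gives the point "c" zero mass.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0)}
f = {"a": Fraction(2), "b": Fraction(3), "c": Fraction(7)}


def nu(A):
    """nu(A) = integral of f over A with respect to mu, i.e. a finite weighted sum."""
    return sum((f[x] * mu[x] for x in A), Fraction(0))


def density(x):
    """Recover d(nu)/d(mu) at a point of positive mu-mass."""
    return nu({x}) / mu[x]


nu_ab = nu({"a", "b"})
nu_c = nu({"c"})
```

Additivity is visible in `nu({"a", "b"}) == nu({"a"}) + nu({"b"})`, and `nu({"c"})` is zero because `mu` gives `"c"` no mass, whatever value `f` takes there.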
Definition 11.1 A measure $\nu$ is called absolutely continuous with respect to a measure $\mu$ if $\nu(A) = 0$ whenever $\mu(A) = 0$. This is frequently written $\nu \ll \mu$.

Proposition 11.2 A finite measure $\nu$ is absolutely continuous with respect to $\mu$ if and only if for all $\varepsilon$ there exists $\delta$ such that $\mu(A) < \delta$ implies $\nu(A) < \varepsilon$.

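Proof. If the condition given in the statement of the proposition holds, it is clear that $\nu \ll \mu$. Suppose now that $\nu \ll \mu$. If the condition does not hold, there exist $\varepsilon > 0$ and sets $E_k$ such that $\mu(E_k) < 2^{-k}$ but $\nu(E_k) \ge \varepsilon$. Let $F = \cap_{n=1}^\infty \cup_{k=n}^\infty E_k$. Then
$$\mu(F) = \lim_{n \to \infty} \mu(\cup_{k=n}^\infty E_k) \le \lim_{n \to \infty} \sum_{k=n}^\infty 2^{-k} = 0,$$
but
$$\nu(F) = \lim_{n \to \infty} \nu(\cup_{k=n}^\infty E_k) \ge \varepsilon.$$
This contradicts the absolute continuity.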


Definition 11.3 A function $\mu : \mathcal{A} \to (-\infty, \infty]$ is called a signed measure if $\mu(\emptyset) = 0$ and $\mu(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty \mu(A_i)$ whenever the $A_i$ are disjoint and all the $A_i$ are in $\mathcal{A}$.

Definition 11.4 Let $\mu$ be a signed measure. A set $A \in \mathcal{A}$ is called a positive set for $\mu$ if $\mu(B) \ge 0$ whenever $B \subset A$ and $B \in \mathcal{A}$. We define a negative set similarly. A null set $A$ is one where $\mu(B) = 0$ whenever $B \subset A$ is measurable.
Example 11.5 Suppose $m$ is Lebesgue measure and $\mu(A) = \int_A f\,dm$ for some integrable $f$. If we let $P = \{x : f(x) \ge 0\}$, then $P$ is easily seen to be a positive set, and if $N = \{x : f(x) < 0\}$, then $N$ is a negative one. The Hahn decomposition which we give below is a decomposition of our space (in this case $\mathbb{R}$) into positive and negative sets. This decomposition is unique, except that $C = \{x : f(x) = 0\}$ could be included in $N$ instead of $P$, or apportioned partially to $P$ and partially to $N$. Note, however, that $C$ is a null set. The Jordan decomposition below is a decomposition of $\mu$ into $\mu^+$ and $\mu^-$, where $\mu^+(A) = \int_A f^+\,dm$, and similarly $\mu^-(A) = \int_A f^-\,dm$.
Note that if $\mu$ is a signed measure, then $\mu(\cup_{i=1}^\infty A_i) = \lim_{n \to \infty} \mu(\cup_{i=1}^n A_i)$. The proof is the same as in the case of positive measures.

Proposition 11.6 Let $\mu$ be a signed measure taking values in $(-\infty, \infty]$. Let $E$ be measurable with $\mu(E) < 0$. Then there exists a subset $F$ of $E$ that is a negative set with $\mu(F) < 0$.

Proof. If $E$ is a negative set, we are done. If not, there exists a subset with positive measure. Let $n_1$ be the smallest positive integer such that there exists $E_1 \subset E$ with $\mu(E_1) \ge 1/n_1$. Let $k \ge 2$. If $F_k = E - (E_1 \cup \cdots \cup E_{k-1})$ is negative, we are done. If not, let $n_k$ be the smallest positive integer such that there exists $E_k \subset F_k$ with $\mu(E_k) \ge 1/n_k$. We continue.
If the construction stops after a finite number of sets, we are done. If not, let $F = \cap_k F_k = E - (\cup_k E_k)$. Since $0 > \mu(E) > -\infty$ and $\mu(E_k) \ge 0$, then
$$\mu(E) = \mu(F) + \sum_{k=1}^\infty \mu(E_k).$$
Then $\mu(F) \le \mu(E) < 0$, so the sum converges. If $G \subset F$ is measurable with $\mu(G) > 0$, then $\mu(G) \ge 1/N$ for some $N$, which contradicts the construction. Therefore $F$ must be a negative set.
We write $A \Delta B$ for $(A - B) \cup (B - A)$. The following is known as the Hahn decomposition theorem.
Theorem 11.7 Let $\mu$ be a signed measure taking values in $(-\infty, \infty]$. There exist sets $E$ and $F$ in $\mathcal{A}$ that are disjoint whose union is $X$ and such that $E$ is a negative set and $F$ is a positive set. If $E'$ and $F'$ are another such pair, then $E \Delta E' = F \Delta F'$ is a null set with respect to $\mu$.
Proof. Let $L = \inf\{\mu(A) : A \text{ is a negative set}\}$. Choose negative sets $A_n$ such that $\mu(A_n) \to L$. Let $E = \cup_{n=1}^\infty A_n$. Let $B_n = A_n - (B_1 \cup \cdots \cup B_{n-1})$ for each $n$. Since $A_n$ is a negative set, so is each $B_n$. Also, the $B_n$ are disjoint. If $C \subset E$, then
$$\mu(C) = \lim_{n \to \infty} \mu(C \cap (\cup_{i=1}^n B_i)) = \lim_{n \to \infty} \sum_{i=1}^n \mu(C \cap B_i) \le 0.$$
So $E$ is a negative set.
Since $E$ is negative,
$$\mu(E) = \mu(A_n) + \mu(E - A_n) \le \mu(A_n).$$
Letting $n \to \infty$, we obtain $\mu(E) = L$.
Let $F = E^c$. If $F$ were not a positive set, there would exist $B \subset F$ with $\mu(B) < 0$. By Proposition 11.6 there exists a negative set $C$ contained in $B$ with $\mu(C) < 0$. But then $E \cup C$ would be a negative set with $\mu(E \cup C) < \mu(E) = L$, a contradiction.
To prove uniqueness, if $E'$, $F'$ are another such pair of sets and $A \subset E - E' \subset E$, then $\mu(A) \le 0$. But $A \subset E - E' = F' - F \subset F'$, so $\mu(A) \ge 0$. Therefore $\mu(A) = 0$. The same argument works if $A \subset E' - E$, and any subset of $E \Delta E'$ can be written as the union of $A_1$ and $A_2$, where $A_1 \subset E - E'$ and $A_2 \subset E' - E$.
Let us say two measures $\mu$ and $\nu$ are mutually singular if there exist two disjoint sets $E$ and $F$ in $\mathcal{A}$ whose union is $X$ with $\mu(E) = \nu(F) = 0$. This is often written $\mu \perp \nu$.
Example 11.8 If $\mu$ is Lebesgue measure restricted to $[0, 1/2]$, that is, $\mu(A) = m(A \cap [0, 1/2])$, and $\nu$ is Lebesgue measure restricted to $[1/2, 1]$, then $\mu$ and $\nu$ are mutually singular. We let $E = [0, 1/2]$ and $F = (1/2, 1]$. This example works because the Lebesgue measure of $\{1/2\}$ is 0.
Example 11.9 A more interesting example is the following. Let $f$ be the Cantor-Lebesgue function and let $\mu$ be the Lebesgue-Stieltjes measure associated with $f$. Let $\nu$ be Lebesgue measure restricted to $[0, 1]$. Then $\mu \perp \nu$. To see this, we let $E = C$, where $C$ is the Cantor set, and $F = [0, 1] - C$. We already know that $m(E) = 0$ and we need to show $\mu(F) = 0$. To do that, we need to show $\mu(I) = 0$ for every open interval contained in $F$. This will follow if we show $\mu(J) = 0$ for every interval of the form $J = (a, b]$ contained in $F$. But $f$ is constant on every such interval, so $f(b) = f(a)$, and therefore $\mu(J) = f(b) - f(a) = 0$.
The following is known as the Jordan decomposition theorem.
Theorem 11.10 If $\mu$ is a signed measure, there exist measures $\mu^+$ and $\mu^-$ such that $\mu = \mu^+ - \mu^-$ and $\mu^+$ and $\mu^-$ are mutually singular. This decomposition is unique.
Proof. Let $E$ and $F$ be positive and negative sets for $\mu$ and let $\mu^+(A) = \mu(E \cap A)$, $\mu^-(A) = -\mu(A \cap F)$. This gives the desired decomposition.

If $\mu = \nu^+ - \nu^-$ is another such decomposition with $\nu^+$, $\nu^-$ mutually singular, let $E'$ and $F'$ be the sets in the definition of mutually singular. Then $X = E' \cup F'$ gives another Hahn decomposition, hence $E \Delta E'$ is a null set with respect to $\mu$. Then for any $A \in \mathcal{A}$,
$$\nu^+(A) = \mu(A \cap E') = \mu(A \cap E) = \mu^+(A),$$
and similarly for $\nu^-$, $\mu^-$.
The measure $\mu^+ + \mu^-$ is called the total variation measure and is written $|\mu|$.
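For a signed measure on a finite set, the Hahn and Jordan decompositions can be read off point by point. The sketch below is an illustration under that assumption, following the convention of Example 11.5 that points of zero weight are placed in the positive set; the dictionary `weights` and the function names are hypothetical.

```python
# Hypothetical signed measure on X = {1, ..., 5}, given by point weights.
weights = {1: 3.0, 2: -1.5, 3: 0.0, 4: -2.0, 5: 0.5}
X = set(weights)


def signed(A):
    """The signed measure of A: the sum of the point weights in A."""
    return sum(weights[x] for x in A)


P = {x for x in X if weights[x] >= 0}   # a positive set
N = X - P                               # the complementary negative set


def mu_plus(A):
    return signed(A & P)


def mu_minus(A):
    return -signed(A & N)


def total_variation(A):
    return mu_plus(A) + mu_minus(A)
```

Here `mu_plus` lives on `P` and `mu_minus` lives on `N`, so the two are mutually singular, and `signed(A) == mu_plus(A) - mu_minus(A)` for every subset `A`.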
We now are ready for the Radon-Nikodym theorem.
Theorem 11.11 Suppose $\mu$ is a $\sigma$-finite measure and $\nu$ is a finite measure such that $\nu$ is absolutely continuous with respect to $\mu$. There exists a $\mu$-integrable nonnegative function $f$ such that $\nu(A) = \int_A f\,d\mu$ for all $A \in \mathcal{A}$. Moreover, if $g$ is another such function, then $f = g$ almost everywhere with respect to $\mu$.
Proof. Let us first prove the uniqueness assertion. For every set $A$ we have
$$\int_A (f - g)\,d\mu = \nu(A) - \nu(A) = 0.$$
By Proposition 8.1 we have $f - g = 0$ a.e. with respect to $\mu$.

Since $\mu$ is $\sigma$-finite, there exist $F_i \uparrow X$ such that $\mu(F_i) < \infty$ for each $i$. Let $\mu_i$ be the restriction of $\mu$ to $F_i$, that is, $\mu_i(A) = \mu(A \cap F_i)$. Define $\nu_i$, the restriction of $\nu$ to $F_i$, similarly. If $f_i$ is a function such that $\nu_i(A) = \int_A f_i\,d\mu_i$ for all $A$, the argument of the first paragraph shows that $f_i = f_j$ on $F_i$ if $i \le j$. If we define $f$ by $f(x) = f_i(x)$ if $x \in F_i$, we see that $f$ will be the desired function. So it suffices to restrict attention to the case where $\mu$ is finite.
Let
$$\mathcal{F} = \Big\{g : 0 \le g,\ \int_A g\,d\mu \le \nu(A) \text{ for all } A \in \mathcal{A}\Big\}.$$
$\mathcal{F}$ is not empty because $0 \in \mathcal{F}$. Let $L = \sup\{\int g\,d\mu : g \in \mathcal{F}\}$, and let $g_n$ be a sequence in $\mathcal{F}$ such that $\int g_n\,d\mu \to L$. Let $h_n = \max(g_1, \ldots, g_n)$.

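If $g_1$ and $g_2$ are in $\mathcal{F}$, then $h_2 = \max(g_1, g_2)$ is also in $\mathcal{F}$. To see this, let $B = \{x : g_1(x) \ge g_2(x)\}$, and write
$$\int_A h_2\,d\mu = \int_{A \cap B} h_2\,d\mu + \int_{A \cap B^c} h_2\,d\mu = \int_{A \cap B} g_1\,d\mu + \int_{A \cap B^c} g_2\,d\mu \le \nu(A \cap B) + \nu(A \cap B^c) = \nu(A).$$
By an induction argument, $h_n$ is in $\mathcal{F}$.
The $h_n$ increase, say to $f$. By monotone convergence
$$\int f\,d\mu = L \quad \text{and} \quad \int_A f\,d\mu \le \nu(A) \qquad (11.2)$$
for all $A$.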
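Let $A$ be a set where there is strict inequality in (11.2); let $\varepsilon$ be chosen sufficiently small so that if $\pi$ is defined by
$$\pi(B) = \nu(B) - \int_B f\,d\mu - \varepsilon\mu(B),$$
then $\pi(A) > 0$. $\pi$ is a signed measure; let $F$ be the positive set as constructed in Theorem 11.7. In particular, $\pi(F) > 0$. So for every $B$
$$\int_{B \cap F} f\,d\mu + \varepsilon\mu(B \cap F) \le \nu(B \cap F).$$
We then have, using (11.2), that
$$\int_B (f + \varepsilon\chi_F)\,d\mu = \int_B f\,d\mu + \varepsilon\mu(B \cap F) = \int_{B \cap F^c} f\,d\mu + \int_{B \cap F} f\,d\mu + \varepsilon\mu(B \cap F) \le \nu(B \cap F^c) + \nu(B \cap F) = \nu(B).$$
This says that $f + \varepsilon\chi_F \in \mathcal{F}$. However,
$$L \ge \int (f + \varepsilon\chi_F)\,d\mu = \int f\,d\mu + \varepsilon\mu(F) = L + \varepsilon\mu(F),$$
which implies $\mu(F) = 0$. But then $\nu(F) = 0$, and hence $\pi(F) = 0$, contradicting the fact that $F$ is a positive set for $\pi$ with $\pi(F) > 0$.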

The proof of the Lebesgue decomposition theorem is almost the same.


Theorem 11.12 Suppose $\mu$ and $\nu$ are two finite measures. There exist measures $\lambda$, $\rho$ such that $\nu = \lambda + \rho$, $\rho$ is absolutely continuous with respect to $\mu$, and $\lambda$ and $\mu$ are mutually singular.
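Proof. Define $\mathcal{F}$ and $L$ and construct $f$ as in the proof of the Radon-Nikodym theorem. Let $\rho(A) = \int_A f\,d\mu$ and let $\lambda = \nu - \rho$. We have $\int_A f\,d\mu \le \nu(A)$, so $\lambda(A) \ge 0$ for all $A$. To keep things straight, we record that we have $f = d\rho/d\mu$ and $\lambda + \rho = \nu$. We need to show $\mu$ and $\lambda$ are mutually singular.
Suppose not. Then there exists $F \in \mathcal{A}$ with $\mu(F) > 0$ and $\lambda(F) > 0$, and so $\lambda(F)\mu(F) > 0$. Note that for $\varepsilon$ small enough, $(\lambda - \varepsilon\mu)(F) > 0$. We claim that there exist $\varepsilon > 0$ and $E \subset F$ such that $E$ is a positive set with respect to $\lambda - \varepsilon\mu$ and $\mu(E) > 0$. Given the claim, if $A \in \mathcal{A}$,
$$\varepsilon\int_A \chi_E\,d\mu = \varepsilon\mu(A \cap E) \le \lambda(A \cap E) \le \lambda(A) = \nu(A) - \int_A f\,d\mu.$$
This says that
$$\int_A (f + \varepsilon\chi_E)\,d\mu \le \nu(A)$$
for all $A \in \mathcal{A}$, or $f + \varepsilon\chi_E \in \mathcal{F}$. But
$$\int (f + \varepsilon\chi_E)\,d\mu = \int f\,d\mu + \varepsilon\mu(E) > L,$$
a contradiction to the definition of $L$.
It remains to prove the claim. Let $F = P_n \cup N_n$ be a Hahn decomposition for the measure $\lambda - n^{-1}\mu$ restricted to $F$, let $P = \cup P_n$, and $N = \cap N_n = F - P$. Then $N$ is a negative set for $\lambda - n^{-1}\mu$ for each $n$, or $0 \le \lambda(N) \le n^{-1}\mu(N)$; this implies $\lambda(N) = 0$. If $\mu(P) = 0$, then $\lambda \perp \mu$, and we are supposing that is not the case. Therefore $\mu(P) > 0$, hence $\mu(P_n) > 0$ for some $n$, and $P_n$ is a positive set for $\lambda - n^{-1}\mu$. Now take $\varepsilon = 1/n$ and $E = P_n$.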

12

Differentiation of real-valued functions

In this section we want to look at when $f : \mathbb{R} \to \mathbb{R}$ is differentiable and when the fundamental theorem of calculus holds. Briefly,
(1) Functions of bounded variation are differentiable almost everywhere;
(2) The derivative of $\int_a^x f(y)\,dy$ is equal to $f$ a.e. if $f$ is integrable;
(3) $\int_a^b f'(y)\,dy = f(b) - f(a)$ if $f$ is absolutely continuous.
Let $E \subset \mathbb{R}$ be a measurable set and let $\mathcal{O}$ be a collection of intervals. We say $\mathcal{O}$ is a Vitali cover of $E$ if for each $x \in E$ and each $\varepsilon > 0$ there exists an interval $G \in \mathcal{O}$ containing $x$ whose length is less than $\varepsilon$. $m$ will denote Lebesgue measure.
Lemma 12.1 Let $E$ have finite measure and let $\mathcal{O}$ be a Vitali cover of $E$. Given $\varepsilon > 0$ there exists a finite subcollection of disjoint intervals $I_1, \ldots, I_n$ such that $m(E - \cup_{i=1}^n I_i) < \varepsilon$.
Proof. We may replace each interval in $\mathcal{O}$ by a closed one, since the set of endpoints of a finite subcollection will have measure 0.
Let $O$ be an open set of finite measure containing $E$. Since $\mathcal{O}$ is a Vitali cover, we may suppose without loss of generality that each set of $\mathcal{O}$ is contained in $O$. Let $a_0 = \sup\{m(I) : I \in \mathcal{O}\}$. Let $I_1$ be any element of $\mathcal{O}$ with $m(I_1) \ge a_0/2$. Let $a_1 = \sup\{m(I) : I \in \mathcal{O},\ I \text{ disjoint from } I_1\}$, and choose $I_2 \in \mathcal{O}$ disjoint from $I_1$ such that $m(I_2) \ge a_1/2$. Continue in this way, choosing $I_{n+1}$ disjoint from $I_1, \ldots, I_n$ and in $\mathcal{O}$ with length at least one half as large as any other such interval in $\mathcal{O}$ that is disjoint from $I_1, \ldots, I_n$.
If the process stops at some finite stage, we are done. If not, we generate a sequence of disjoint intervals $I_1, I_2, \ldots$ Since they are disjoint and all contained in $O$, then $\sum_{i=1}^\infty m(I_i) \le m(O) < \infty$. So there exists $N$ such that $\sum_{i=N+1}^\infty m(I_i) < \varepsilon/5$.
Let $R = E - \cup_{i=1}^N I_i$; we will show $m(R) < \varepsilon$. Let $J_n$ be the interval with the same center as $I_n$ but five times the length. Let $x \in R$. There exists an interval $I \in \mathcal{O}$ containing $x$ with $I$ disjoint from $I_1, \ldots, I_N$. Since $\sum m(I_n) < \infty$, then $\sum a_n \le 2\sum m(I_n) < \infty$, and $a_n \to 0$. So $I$ must either be one of the $I_n$ for some $n > N$ or at least intersect it, for otherwise we would have chosen $I$ at some stage. Let $n$ be the smallest integer such that $I$ intersects $I_n$; note $n > N$. We have $m(I) \le a_{n-1} \le 2m(I_n)$. Since $x$ is in $I$ and $I$ intersects $I_n$, the distance from $x$ to the midpoint of $I_n$ is at most $m(I) + m(I_n)/2 \le (5/2)m(I_n)$. Therefore $x \in J_n$.
Then $R \subset \cup_{i=N+1}^\infty J_i$, so $m(R) \le \sum_{i=N+1}^\infty m(J_i) = 5\sum_{i=N+1}^\infty m(I_i) < \varepsilon$.

Given a function $f$, we define the derivates of $f$ at $x$ by
$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0-} \frac{f(x + h) - f(x)}{h},$$
$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0-} \frac{f(x + h) - f(x)}{h}.$$
If all the derivates are equal, we say that $f$ is differentiable at $x$ and define $f'(x)$ to be the common value.
Theorem 12.2 Suppose $f$ is nondecreasing on $[a, b]$. Then $f$ is differentiable almost everywhere, $f'$ is integrable, and $\int_a^b f'(x)\,dx \le f(b) - f(a)$.
Proof. We will show that the set where any two derivates are unequal has measure zero. We consider the set $E$ where $D^+ f(x) > D_- f(x)$, the other sets being similar. Let $E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}$. If we show $m(E_{u,v}) = 0$, then taking the union over all pairs of rationals with $u > v$ shows $m(E) = 0$.
Let $s = m(E_{u,v})$, let $\varepsilon > 0$, and choose an open set $O$ such that $E_{u,v} \subset O$ and $m(O) < s + \varepsilon$. For each $x \in E_{u,v}$ there exists an arbitrarily small interval $[x - h, x]$ contained in $O$ such that $f(x) - f(x - h) < vh$. Use Lemma 12.1 to choose $I_1, \ldots, I_N$ which are disjoint and whose interiors cover a subset $A$ of $E_{u,v}$ of measure greater than $s - \varepsilon$. Suppose $I_n = [x_n - h_n, x_n]$. Summing over these intervals,
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] < v\sum_{n=1}^N h_n < v\,m(O) < v(s + \varepsilon).$$
Each point $y \in A$ is the left endpoint of an arbitrarily small interval $(y, y + k)$ that is contained in some $I_n$ and for which $f(y + k) - f(y) > uk$. Using Lemma 12.1 again, we pick out a finite collection $J_1, \ldots, J_M$ whose union contains a subset of $A$ of measure larger than $s - 2\varepsilon$. Summing over these intervals yields
$$\sum_{i=1}^M [f(y_i + k_i) - f(y_i)] > u\sum_{i=1}^M k_i > u(s - 2\varepsilon).$$
Each interval $J_i$ is contained in some interval $I_n$, and if we sum over those $i$ for which $J_i \subset I_n$ we find
$$\sum [f(y_i + k_i) - f(y_i)] \le f(x_n) - f(x_n - h_n),$$
since $f$ is increasing. Thus
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] \ge \sum_{i=1}^M [f(y_i + k_i) - f(y_i)],$$
and so $v(s + \varepsilon) > u(s - 2\varepsilon)$. This is true for each $\varepsilon$, so $vs \ge us$. Since $u > v$, this implies $s = 0$.
This shows that
$$g(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
is defined almost everywhere and that $f$ is differentiable wherever $g$ is finite. Define $f(x) = f(b)$ if $x \ge b$. Let $g_n(x) = n[f(x + 1/n) - f(x)]$. Then $g_n(x) \to g(x)$ for almost all $x$, and so $g$ is measurable. Since $f$ is increasing, $g_n \ge 0$. By Fatou's lemma
$$\int_a^b g \le \liminf_n \int_a^b g_n = \liminf_n\, n\int_a^b [f(x + 1/n) - f(x)]\,dx = \liminf_n \Big[n\int_b^{b+1/n} f - n\int_a^{a+1/n} f\Big] = \liminf_n \Big[f(b) - n\int_a^{a+1/n} f\Big] \le f(b) - f(a).$$
For the last inequality, we use the fact that $f$ is increasing. This shows that $g$ is integrable and hence finite almost everywhere.
A function is of bounded variation if $\sup\{\sum_{i=1}^k |f(x_i) - f(x_{i-1})|\}$ is finite, where the supremum is over all partitions $a = x_0 < x_1 < \cdots < x_k = b$ of $[a, b]$.
Lemma 12.3 If $f$ is of bounded variation on $[a, b]$, then $f$ can be written as the difference of two nondecreasing functions on $[a, b]$.
Proof. Define
$$P(y) = \sup\Big\{\sum_{i=1}^k [f(x_i) - f(x_{i-1})]^+\Big\}, \qquad N(y) = \sup\Big\{\sum_{i=1}^k [f(x_i) - f(x_{i-1})]^-\Big\},$$
where the supremum is over all partitions $a = x_0 < x_1 < \cdots < x_k = y$ for $y \in [a, b]$. $P$ and $N$ are measurable since they are both increasing. Since
$$\sum_{i=1}^k [f(x_i) - f(x_{i-1})]^+ = \sum_{i=1}^k [f(x_i) - f(x_{i-1})]^- + f(y) - f(a),$$
taking the supremum over all partitions of $[a, y]$ yields
$$P(y) = N(y) + f(y) - f(a).$$
Clearly $P$ and $N$ are nondecreasing in $y$, and the result follows by solving for $f(y)$.
From this lemma, we see that functions of bounded variation are differentiable a.e. The converse fails: the function $\sin(1/x)$ defined on $(0, 1]$ is differentiable everywhere, but is not of bounded variation.
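The unbounded variation of $\sin(1/x)$ can be seen by evaluating the variation sum on partitions that pass through its successive peaks and troughs. The following is a small sketch; the partition points $2/((2k+1)\pi)$ and the helper names are choices made here for illustration.

```python
import math


def variation_on_partition(f, points):
    """Sum of |f(x_i) - f(x_{i-1})| over consecutive points of an increasing partition."""
    return sum(abs(f(points[i]) - f(points[i - 1])) for i in range(1, len(points)))


def peak_partition(K):
    """Increasing points 2/((2k+1)*pi), k = K, ..., 0, where sin(1/x) is +1 or -1, then the endpoint 1."""
    pts = [2.0 / ((2 * k + 1) * math.pi) for k in range(K, -1, -1)]
    pts.append(1.0)
    return pts


def f(x):
    return math.sin(1.0 / x)


v_small = variation_on_partition(f, peak_partition(10))
v_large = variation_on_partition(f, peak_partition(1000))
```

Each consecutive pair of peak points contributes 2 to the sum, so the variation over `peak_partition(K)` is about $2K$ and grows without bound as $K$ increases.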
Next we look at when the derivative of $\int_a^x f(t)\,dt$ is equal to $f(x)$ a.e. Define the indefinite integral of an integrable function $f$ by
$$F(x) = \int_a^x f(t)\,dt.$$

Lemma 12.4 If $f$ is integrable, then $F$ is continuous and of bounded variation.

Proof. The continuity follows from the dominated convergence theorem. The bounded variation follows from
$$\sum_{i=1}^k |F(x_i) - F(x_{i-1})| = \sum_{i=1}^k \Big|\int_{x_{i-1}}^{x_i} f(t)\,dt\Big| \le \sum_{i=1}^k \int_{x_{i-1}}^{x_i} |f(t)|\,dt \le \int_a^b |f(t)|\,dt$$
for all partitions.

Lemma 12.5 If $f$ is integrable and $F(x) = 0$ for all $x$, then $f = 0$ a.e.

Proof. For any interval, $\int_c^d f = \int_a^d f - \int_a^c f = 0$. By dominated convergence and the fact that any open set is the countable union of disjoint open intervals, $\int_O f = 0$ for any open set $O$.
If $E$ is any measurable set, take $O_n$ open such that $\chi_{O_n}$ decreases to $\chi_E$ a.e. By dominated convergence,
$$\int_E f = \int f\chi_E = \lim \int f\chi_{O_n} = \lim \int_{O_n} f = 0.$$
This with Proposition 8.1 implies $f$ is zero a.e.


Proposition 12.6 If $f$ is bounded and measurable, then $F'(x) = f(x)$ for almost every $x$.
Proof. By Lemma 12.4, $F$ is continuous and of bounded variation, and so $F'$ exists a.e. Let $K$ be a bound for $|f|$. If
$$f_n(x) = \frac{F(x + 1/n) - F(x)}{1/n},$$
then
$$f_n(x) = n\int_x^{x+1/n} f(t)\,dt,$$
so $|f_n|$ is also bounded by $K$. Since $f_n \to F'$ a.e., then by dominated convergence,
$$\int_a^c F'(x)\,dx = \lim \int_a^c f_n(x)\,dx = \lim n\int_a^c [F(x + 1/n) - F(x)]\,dx = \lim \Big[n\int_c^{c+1/n} F(x)\,dx - n\int_a^{a+1/n} F(x)\,dx\Big] = F(c) - F(a) = \int_a^c f(x)\,dx,$$
using the fact that $F$ is continuous. So $\int_a^c [F'(x) - f(x)]\,dx = 0$ for all $c$, which implies $F' = f$ a.e. by Lemma 12.5.

Theorem 12.7 If $f$ is integrable, then $F' = f$ almost everywhere.

Proof. Without loss of generality we may assume $f \ge 0$. Let $f_n(x) = f(x)$ if $f(x) \le n$ and let $f_n(x) = n$ if $f(x) > n$. Then $f - f_n \ge 0$. If $G_n(x) = \int_a^x [f - f_n]$, then $G_n$ is nondecreasing, and hence has a derivative almost everywhere. By Proposition 12.6, we know the derivative of $\int_a^x f_n$ is equal to $f_n$ almost everywhere. Therefore
$$F'(x) = G_n'(x) + \Big[\int_a^x f_n\Big]' \ge f_n(x)$$
a.e. Since $n$ is arbitrary, $F' \ge f$ a.e. So $\int_a^b F' \ge \int_a^b f = F(b) - F(a)$. On the other hand, by Theorem 12.2, $\int_a^b F'(x)\,dx \le F(b) - F(a) = \int_a^b f$. We conclude that $\int_a^b [F' - f] = 0$; since $F' - f \ge 0$, this tells us that $F' = f$ a.e.

Finally, we look at when $\int_a^b F'(y)\,dy = F(b) - F(a)$.

A function $f$ is absolutely continuous on $[a, b]$ if given $\varepsilon$ there exists $\delta$ such that $\sum_{i=1}^k |f(x_i') - f(x_i)| < \varepsilon$ whenever $\{(x_i, x_i')\}$ is a finite collection of nonoverlapping intervals with $\sum_{i=1}^k |x_i' - x_i| < \delta$.
It is easy to see that absolutely continuous functions are continuous and that the Cantor-Lebesgue function is not absolutely continuous.
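The failure of absolute continuity for the Cantor-Lebesgue function can be made concrete: the $2^n$ closed intervals left at stage $n$ of the Cantor construction have total length $(2/3)^n$, yet the function increases by a total of 1 across them. The sketch below is an illustration, assuming the usual ternary-digit description of the function; `cantor` and `level_intervals` are hypothetical helper names, and exact rational arithmetic is used to avoid rounding at the interval endpoints.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Cantor-Lebesgue function on [0, 1], computed from the ternary digits of x."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)            # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:            # x lies in a removed middle third: the function is flat here
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value


def level_intervals(n):
    """The 2**n closed intervals remaining after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))
            nxt.append((b - third, b))
        intervals = nxt
    return intervals


n = 8
intervals = level_intervals(n)
total_length = sum(b - a for a, b in intervals)
total_increment = sum(abs(cantor(b) - cantor(a)) for a, b in intervals)
```

Since `total_length` can be made smaller than any $\delta$ by taking `n` large while `total_increment` stays equal to 1, no $\delta$ works for $\varepsilon = 1$ in the definition above.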

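Lemma 12.8 If $F(x) = \int_a^x f(t)\,dt$ for $f$ integrable on $[a, b]$, then $F$ is absolutely continuous.

Proof. Let $\varepsilon > 0$. Choose a simple function $s$ such that $\int_a^b |f - s| < \varepsilon/2$. Let $K$ be a bound for $|s|$ and let $\delta = \varepsilon/2K$. If $\{(x_i, x_i')\}$ is a collection of nonoverlapping intervals, the sum of whose lengths is less than $\delta$, then set $A = \cup_{i=1}^k (x_i, x_i')$ and note $\int_A |f - s| < \varepsilon/2$ and $\int_A |s| < \delta K = \varepsilon/2$. Hence $\sum_{i=1}^k |F(x_i') - F(x_i)| \le \int_A |f| \le \int_A |f - s| + \int_A |s| < \varepsilon$.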
Lemma 12.9 If $f$ is absolutely continuous, then it is of bounded variation.
Proof. Let $\delta$ correspond to $\varepsilon = 1$ in the definition of absolute continuity. Given a partition, add points if necessary so that each subinterval has length at most $\delta$. We can then group the subintervals into at most $K$ collections, each of total length less than $\delta$, where $K$ is an integer larger than $(1 + b - a)/\delta$. So the total variation is then less than $K$.

Lemma 12.10 If $f$ is absolutely continuous on $[a, b]$ and $f'(x) = 0$ a.e., then $f$ is constant.
The Cantor-Lebesgue function is an example to show that we need the absolute continuity.
Proof. Let $c \in [a, b]$, let $E = \{x \in [a, c] : f'(x) = 0\}$, and let $\varepsilon > 0$. For each point $x \in E$ there exist arbitrarily small intervals $[x, x + h] \subset [a, c]$ such that $|f(x + h) - f(x)| < \varepsilon h$. By Lemma 12.1 we can find a finite collection of such intervals that cover all of $E$ except for a set of measure less than $\delta$, where $\delta$ is the $\delta$ in the definition of absolute continuity. If the intervals are $[x_i, y_i]$ with $x_i < y_i \le x_{i+1}$, then $\sum |f(x_{i+1}) - f(y_i)| < \varepsilon$ by the definition of absolute continuity, while $\sum |f(y_i) - f(x_i)| < \varepsilon\sum (y_i - x_i) \le \varepsilon(c - a)$. So adding these two inequalities together,
$$|f(c) - f(a)| = \Big|\sum [f(x_{i+1}) - f(y_i)] + \sum [f(y_i) - f(x_i)]\Big| \le \varepsilon + \varepsilon(c - a).$$
Since $\varepsilon$ is arbitrary, then $f(c) = f(a)$, which implies that $f$ is constant.

Theorem 12.11 If $F$ is absolutely continuous, then
$$F(b) - F(a) = \int_a^b F'(y)\,dy.$$
Proof. Suppose $F$ is absolutely continuous on $[a, b]$. Then $F$ is of bounded variation, so $F = F_1 - F_2$ where $F_1$ and $F_2$ are nondecreasing, and $F'$ exists a.e. Since $|F'(x)| \le F_1'(x) + F_2'(x)$, then $\int |F'(x)|\,dx \le F_1(b) + F_2(b) - F_1(a) - F_2(a)$, and hence $F'$ is integrable. If $G(x) = \int_a^x F'(t)\,dt$, then $G$ is absolutely continuous by Lemma 12.8, so $F - G$ is absolutely continuous. Then $(F - G)' = F' - G' = 0$ a.e. by Theorem 12.7, and therefore $F - G$ is constant by Lemma 12.10. Thus $F(x) = \int_a^x F'(t)\,dt + F(a)$. If we set $x = b$, we get our result.

13

Lp spaces

We assume throughout this section that the measure $\mu$ is $\sigma$-finite. For $1 \le p < \infty$, define the $L^p$ norm of $f$ by
$$\|f\|_p = \Big(\int |f(x)|^p\,d\mu\Big)^{1/p}.$$
For $p = \infty$, define the $L^\infty$ norm of $f$ by
$$\|f\|_\infty = \inf\{M : \mu(\{x : |f(x)| \ge M\}) = 0\}.$$
For $1 \le p \le \infty$ the space $L^p$ is the set $\{f : \|f\|_p < \infty\}$.
The $L^\infty$ norm of a function $f$ is the supremum of $|f|$ provided we disregard sets of measure 0.
It is clear that $\|f\|_p = 0$ if and only if $f = 0$ a.e.
Proposition 13.1 (Hölder's inequality) If $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$, then
$$\int |f(x)g(x)|\,d\mu \le \|f\|_p \|g\|_q.$$
This also holds if $p = \infty$ and $q = 1$.

Proof. If $M = \|f\|_\infty$, then $\int |fg| \le M\int |g|$ and the case $p = \infty$ and $q = 1$ follows. So let us assume $1 < p, q < \infty$. If $\|f\|_p = 0$, then $f = 0$ a.e. and $\int |fg| = 0$, so the result is clear if $\|f\|_p = 0$ and similarly if $\|g\|_q = 0$. Let $F(x) = |f(x)|/\|f\|_p$ and $G(x) = |g(x)|/\|g\|_q$. Note $\|F\|_p = 1$ and $\|G\|_q = 1$, and it suffices to show that $\int FG \le 1$.
The second derivative of the function $e^x$ is again $e^x$, which is positive, and so $e^x$ is convex. Therefore if $0 \le \lambda \le 1$, we have
$$e^{\lambda a + (1 - \lambda)b} \le \lambda e^a + (1 - \lambda)e^b.$$
If $F(x), G(x) \ne 0$, let $a = p\log F(x)$, $b = q\log G(x)$, $\lambda = 1/p$, and $1 - \lambda = 1/q$. We then obtain
$$F(x)G(x) \le \frac{F(x)^p}{p} + \frac{G(x)^q}{q}.$$
Clearly this inequality also holds if $F(x) = 0$ or $G(x) = 0$. Integrating,
$$\int FG \le \frac{\|F\|_p^p}{p} + \frac{\|G\|_q^q}{q} = \frac{1}{p} + \frac{1}{q} = 1.$$

One application of Hölder's inequality is to prove Minkowski's inequality, which is simply the triangle inequality for $L^p$.
We first need the following lemma:
Lemma 13.2 If $a, b \ge 0$ and $1 \le p < \infty$, then
$$(a + b)^p \le 2^{p-1}a^p + 2^{p-1}b^p.$$
Proof. To prove this, we may without loss of generality assume $a \le b$. The case $a = 0$ is obvious, so we assume $a > 0$. Dividing both sides by $a^p$ and letting $x = b/a$, the inequality we want is equivalent to
$$(1 + x)^p \le 2^{p-1} + 2^{p-1}x^p, \qquad x \ge 1. \qquad (13.1)$$
Clearly this inequality is valid for $x = 1$. So to prove (13.1) it suffices to show that the derivative of
$$(1 + x)^p - 2^{p-1} - 2^{p-1}x^p$$
is less than or equal to 0, or
$$p(1 + x)^{p-1} \le 2^{p-1}px^{p-1}, \qquad x \ge 1.$$
This last inequality holds because $2x \ge 1 + x$ when $x \ge 1$.


Proposition 13.3 (Minkowskis inequality) If 1 p , then
kf + gkp kf kp + kgkp .
Proof. Since |(f + g)(x)| |f (x)| + |g(x)|, integrating gives the case when
p = 1. The case p = is also easy. So let us suppose 1 < p < . If kf kp
or kgkp is infinite, the result is obvious, so we may assume both are finite.
The inequality Lemma 13.2 with a = |f (x)| and b = |g(x)| yields, after an
integration,
Z
Z
Z
p
p
p
p
|(f + g)(x)| d 2
|f (x)| d + 2
|g(x)|p d.
So we have kf + gkp < . Clearly we may assume kf + gkp > 0.
Now write
$$|f + g|^p \le |f| \, |f + g|^{p-1} + |g| \, |f + g|^{p-1}$$
and apply Hölder's inequality with $q = (1 - \frac{1}{p})^{-1}$. We obtain
$$\int |f + g|^p \le \|f\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q} + \|g\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q}.$$
Since $p^{-1} + q^{-1} = 1$, then $(p - 1)q = p$, so we have
$$\|f + g\|_p^p \le \big( \|f\|_p + \|g\|_p \big) \|f + g\|_p^{p/q}.$$
Dividing both sides by $\|f + g\|_p^{p/q}$ and using the fact that $p - (p/q) = 1$ gives us our result.
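The two exponent identities used in the proof both follow from $p^{-1} + q^{-1} = 1$. A short sketch of the arithmetic:

```latex
\begin{align*}
\frac{1}{q} = 1 - \frac{1}{p} = \frac{p-1}{p}
  &\quad\Longrightarrow\quad (p-1)\,q = p, \\
p - \frac{p}{q} = p\Bigl(1 - \frac{1}{q}\Bigr) = p \cdot \frac{1}{p}
  &= 1 .
\end{align*}
```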

Minkowski's inequality says that $L^p$ is a normed linear space, provided we identify functions that are equal a.e.
We say $f_n$ converges to $f$ in $L^p$ if $\|f_n - f\|_p \to 0$ as $n \to \infty$. The next proposition compares convergence in $L^p$ to convergence in measure. Before we prove this, we prove an easy preliminary result known as Chebyshev's inequality.

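Lemma 13.4 If $1 \le p < \infty$ and $a > 0$, then
$$\mu(\{x : |f(x)| \ge a\}) \le \frac{\|f\|_p^p}{a^p}.$$

Proof. If $A = \{x : |f(x)| \ge a\}$, then
$$\mu(A) \le \int_A \frac{|f(x)|^p}{a^p} \, d\mu \le \frac{1}{a^p} \int |f|^p \, d\mu.$$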

Proposition 13.5 If $f_n$ converges to $f$ in $L^p$, then it converges in measure.

Proof. If $\varepsilon > 0$, by Chebyshev's inequality
$$\mu(\{x : |f_n(x) - f(x)| > \varepsilon\}) = \mu(\{x : |f_n(x) - f(x)|^p > \varepsilon^p\}) \le \frac{\|f_n - f\|_p^p}{\varepsilon^p} \to 0.$$

Letting $f_n = n^2 \chi_{(0,1/n)}$ on $[0, 1]$ with the measure being Lebesgue measure gives an example where $f_n$ converges to 0 a.e. and in measure, but does not converge in $L^p$.
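The claims about $f_n = n^2 \chi_{(0,1/n)}$ can be checked by direct computation. The sketch below writes $m$ for Lebesgue measure on $[0,1]$ (a symbol chosen here for illustration) and takes $1 \le p < \infty$ and $0 < \varepsilon < 1$.

```latex
\begin{align*}
\|f_n\|_p^p &= \int_0^{1/n} n^{2p}\, dx = n^{2p-1} \to \infty, \\
m(\{x : |f_n(x)| > \varepsilon\}) &= m\bigl((0,1/n)\bigr) = \frac{1}{n} \to 0, \\
f_n(x) &= 0 \quad \text{whenever } n \ge 1/x, \ x \in (0,1].
\end{align*}
```

So $f_n \to 0$ a.e. and in measure, while $\|f_n - 0\|_p$ does not tend to 0.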
Example 9.4 is an example where $f_n$ converges to 0 in $L^p$ but not a.e.
We next show that $L^p$ is complete. This is often phrased as saying that $L^p$ is a Banach space, i.e., a complete normed linear space.

Proposition 13.6 If $1 \le p \le \infty$, then $L^p$ is complete.


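Proof. We do only the case $p < \infty$; the case $p = \infty$ is easy. Suppose $f_n$ is a Cauchy sequence in $L^p$. Given $\varepsilon = 2^{-(j+1)}$, there exists $n_j$ such that if $n, m \ge n_j$, then $\|f_n - f_m\|_p \le 2^{-(j+1)}$. Without loss of generality we may assume $n_j \ge n_{j-1}$ for each $j$.
Set $n_0 = 0$ and define $f_0 \equiv 0$. If $A_j = \{x : |f_{n_j}(x) - f_{n_{j-1}}(x)| > 2^{-j/2}\}$, then from Lemma 13.4, $\mu(A_j) \le 2^{-jp/2}$ for $j \ge 2$. We have
$$\mu\Big( \bigcap_{j=1}^\infty \bigcup_{m=j}^\infty A_m \Big) = \lim_{j \to \infty} \mu\Big( \bigcup_{m=j}^\infty A_m \Big) \le \lim_{j \to \infty} \sum_{m=j}^\infty \mu(A_m) = 0.$$
So except for a set of measure 0, for each $x$ there is a last $j$ for which $x \in \bigcup_{m=j}^\infty A_m$, hence a last $j$ for which $x \in A_j$. So for each $x$ (except for the null set) there is a $j_0$ (depending on $x$) such that if $j \ge j_0$, then $|f_{n_j}(x) - f_{n_{j-1}}(x)| \le 2^{-j/2}$.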
Set
$$g_j(x) = \sum_{m=1}^j |f_{n_m}(x) - f_{n_{m-1}}(x)|.$$
$g_j(x)$ increases for each $x$, and the limit is finite for almost every $x$ by the preceding paragraph. Let us call the limit $g(x)$. We have
$$\|g_j\|_p \le \sum_{m=2}^j 2^{-m} + \|f_{n_1}\|_p \le 2 + \|f_{n_1}\|_p$$
by Minkowski's inequality, and so by Fatou's lemma, $\|g\|_p \le 2 + \|f_{n_1}\|_p < \infty$.


We have
$$f_{n_j}(x) = \sum_{m=1}^j \big( f_{n_m}(x) - f_{n_{m-1}}(x) \big).$$
Suppose $x$ is not in the null set where $g(x)$ is infinite. Since $|f_{n_j}(x) - f_{n_k}(x)| \le |g_j(x) - g_k(x)| \to 0$ as $j, k \to \infty$, then $f_{n_j}(x)$ is a Cauchy sequence (in $\mathbb{R}$), and hence converges, say to $f(x)$. We have $\|f - f_{n_j}\|_p = \lim_{m \to \infty} \|f_{n_m} - f_{n_j}\|_p$; this follows by dominated convergence with $(2g)^p$ as the dominating function, where $g$ is the function defined above.
We have thus shown that $\|f - f_{n_j}\|_p \to 0$. Given $\varepsilon = 2^{-(j+1)}$, if $m \ge n_j$, then $\|f - f_m\|_p \le \|f - f_{n_j}\|_p + \|f_m - f_{n_j}\|_p$. This shows that $f_m$ converges to $f$ in $L^p$ norm.
Next we show:

Proposition 13.7 The set of continuous functions with compact support is dense in $L^p(\mathbb{R}^d)$.

Proof. Suppose $f \in L^p$. By dominated convergence $\int |f - f \chi_{[-n,n]^d}|^p \to 0$ as $n \to \infty$, the dominating function being $|f|^p$. So we may suppose $f$ has compact support. By writing $f = f^+ - f^-$ we may suppose $f \ge 0$. By taking simple functions $s_m$ increasing to $f$, we have $\int |f - s_m|^p \to 0$ by dominated convergence, so it suffices to consider simple functions. By linearity, it suffices to consider characteristic functions with compact support. Given such a $\chi_E$ and $\varepsilon > 0$ we showed in Proposition 8.3 that there exists $g$ continuous with compact support and with values in $[0, 1]$ such that $\int |g - \chi_E| < \varepsilon$. Since $|g - \chi_E| \le 1$, then $\int |g - \chi_E|^p \le \int |g - \chi_E| < \varepsilon$.
The following is very useful.

Proposition 13.8 For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
$$\|f\|_p = \sup\Big\{ \int fg : \|g\|_q \le 1 \Big\}. \qquad (13.2)$$
When $p = 1$, (13.2) holds if we take $q = \infty$, and if $p = \infty$, (13.2) holds if we take $q = 1$.
Proof. The right hand side of (13.2) is less than or equal to the left hand side by Hölder's inequality. So we need only show that the right hand side is greater than or equal to the left hand side.
First suppose $p = 1$. Take $g(x) = \operatorname{sgn} f(x)$, where $\operatorname{sgn} a$ is 1 if $a > 0$, is 0 if $a = 0$, and is $-1$ if $a < 0$. Then $g$ is bounded by 1 and $fg = |f|$. This takes care of the case $p = 1$.
Next suppose $p = \infty$. Since $\mu$ is $\sigma$-finite, there exist sets $F_n$ increasing up to $X$ such that $\mu(F_n) < \infty$ for each $n$. If $M = \|f\|_\infty$, let $a$ be any finite real less than $M$. By the definition of $L^\infty$ norm, the measure of $A = \{x \in F_n : |f(x)| > a\}$ must be positive if $n$ is sufficiently large. Let $g(x) = (\operatorname{sgn} f(x)) \chi_A(x) / \mu(A)$. Then the $L^1$ norm of $g$ is 1 and $\int fg = \int_A |f| / \mu(A) \ge a$. Since $a$ is arbitrary, the supremum on the right hand side must be $M$.
Now suppose $1 < p < \infty$. We may suppose $\|f\|_p > 0$. Let $q_n$ be a sequence of nonnegative simple functions increasing to $f^+$, $r_n$ a sequence of nonnegative simple functions increasing to $f^-$, and $s_n(x) = (q_n(x) - r_n(x)) \chi_{F_n}(x)$. Then $s_n(x) \to f(x)$ for each $x$, $|s_n(x)| \le |f(x)|$ for each $x$, $s_n$ is a simple function, and $\|s_n\|_p < \infty$ for each $n$. If $f \in L^p$, then $\|s_n\|_p \to \|f\|_p$ by dominated convergence. If $\int |f|^p = \infty$, then $\int |s_n|^p \to \infty$ by monotone convergence. For $n$ sufficiently large, $\|s_n\|_p > 0$.
Let
$$g_n(x) = (\operatorname{sgn} f(x)) \frac{|s_n(x)|^{p-1}}{\|s_n\|_p^{p/q}}.$$
Since $(p - 1)q = p$, then
$$\|g_n\|_q = \frac{\big( \int |s_n|^{(p-1)q} \big)^{1/q}}{\|s_n\|_p^{p/q}} = \frac{\|s_n\|_p^{p/q}}{\|s_n\|_p^{p/q}} = 1.$$
On the other hand, since $|f| \ge |s_n|$,
$$\int f g_n = \frac{\int |f| \, |s_n|^{p-1}}{\|s_n\|_p^{p/q}} \ge \frac{\int |s_n|^p}{\|s_n\|_p^{p/q}} = \|s_n\|_p^{p - (p/q)}.$$
Since $p - (p/q) = 1$, then $\int f g_n \ge \|s_n\|_p$, which tends to $\|f\|_p$.
The above proof also establishes

Corollary 13.9 For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
$$\|f\|_p = \sup\Big\{ \int fg : \|g\|_q \le 1, \ g \text{ simple} \Big\}.$$

The space $L^p$ is a normed linear space. We can thus talk about its dual, namely, the set of bounded linear functionals on $L^p$. The dual of a space $Y$ is denoted $Y^*$. If $H$ is a bounded linear functional on $L^p$, we define the norm of $H$ to be $\|H\| = \sup\{H(f) : \|f\|_p \le 1\}$.
Theorem 13.10 If $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$, then $(L^p)^* = L^q$.

What this means is that if $H$ is a bounded linear functional on $L^p$, then there exists $g \in L^q$ such that $H(f) = \int fg$, and that if $g \in L^q$, then $H(f) = \int fg$ is a bounded linear functional on $L^p$.
Proof. If $g \in L^q$, then setting $H(f) = \int fg$ for $f \in L^p$ yields a bounded linear functional; the boundedness follows from Hölder's inequality. Moreover, from Hölder's inequality and Proposition 13.8 we see that $\|H\| = \|g\|_q$.

Now suppose we are given a bounded linear functional $H$ on $L^p$ and we must show there exists $g \in L^q$ such that $H(f) = \int fg$. First suppose $\mu(X) < \infty$. Define $\nu(A) = H(\chi_A)$. If $A$ and $B$ are disjoint, then
$$\nu(A \cup B) = H(\chi_{A \cup B}) = H(\chi_A + \chi_B) = H(\chi_A) + H(\chi_B) = \nu(A) + \nu(B).$$
To show $\nu$ is countably additive, it suffices to show that if $A_n \uparrow A$, then $\nu(A_n) \to \nu(A)$. But if $A_n \uparrow A$, then $\chi_{A_n} \to \chi_A$ in $L^p$, and so $\nu(A_n) = H(\chi_{A_n}) \to H(\chi_A) = \nu(A)$; we use here the fact that $\mu(X) < \infty$. Therefore $\nu$ is a countably additive signed measure. Moreover, if $\mu(A) = 0$, then $\chi_A = 0$ a.e., hence $\nu(A) = H(\chi_A) = 0$. By writing $\nu = \nu^+ - \nu^-$ and using the Radon-Nikodym theorem for both the positive and negative parts, we see there exists an integrable $g$ such that $\nu(A) = \int_A g$ for all sets $A$. If $s = \sum a_i \chi_{A_i}$ is a simple function, by linearity we have
$$H(s) = \sum a_i H(\chi_{A_i}) = \sum a_i \nu(A_i) = \sum a_i \int g \chi_{A_i} = \int gs.$$
By Corollary 13.9,
$$\|g\|_q = \sup\Big\{ \int gs : \|s\|_p \le 1, \ s \text{ simple} \Big\} \le \sup\{H(s) : \|s\|_p \le 1\} \le \|H\|.$$
If $s_n$ are simple functions tending to $f$ in $L^p$, then $H(s_n) \to H(f)$, while by Hölder's inequality $\int s_n g \to \int fg$. We thus have $H(f) = \int fg$ for all $f \in L^p$, and $\|g\|_q \le \|H\|$. By Hölder's inequality, $\|H\| \le \|g\|_q$.
In the case where $\mu$ is $\sigma$-finite, but not finite, let $F_n \uparrow X$ be such that $\mu(F_n) < \infty$ for each $n$. Define functionals $H_n$ by $H_n(f) = H(f \chi_{F_n})$. Clearly each $H_n$ is a bounded linear functional on $L^p$. Applying the above argument, we see there exist $g_n$ such that $H_n(f) = \int f g_n$ and $\|g_n\|_q = \|H_n\| \le \|H\|$. It is easy to see that $g_n(x)$ is 0 if $x \notin F_n$. Moreover, by the uniqueness part of the Radon-Nikodym theorem, if $n > m$, then $g_n = g_m$ on $F_m$. Define $g$ by setting $g(x) = g_n(x)$ if $x \in F_n$. Then $g$ is well defined. By Fatou's lemma, $g$ is in $L^q$ with a norm bounded by $\|H\|$. Since $f \chi_{F_n} \to f$ in $L^p$ by dominated convergence, then $H_n(f) = H(f \chi_{F_n}) \to H(f)$, since $H$ is a bounded linear functional on $L^p$. On the other hand $H_n(f) = \int_{F_n} f g_n = \int_{F_n} fg \to \int fg$ by dominated convergence. So $H(f) = \int fg$. Again by Hölder's inequality $\|H\| \le \|g\|_q$.


14 Fourier transforms

Fourier transforms give a representation of a function in terms of frequencies. We give the basic properties here.
If $f \in L^1(\mathbb{R}^n)$, define the Fourier transform $\widehat{f}$ by
$$\widehat{f}(u) = \int_{\mathbb{R}^n} e^{iu \cdot x} f(x) \, dx, \qquad u \in \mathbb{R}^n. \qquad (14.1)$$
We are using $u \cdot x$ for the standard inner product in $\mathbb{R}^n$. Various books have slightly different definitions. Some put a negative sign before the $iu \cdot x$, some have a $2\pi$ either in front of the integral or in the exponent. The basic theory is the same in any case.
Some basic properties of the Fourier transform are given by

Proposition 14.1 Suppose $f$ and $g$ are in $L^1$. Then
(a) $\widehat{f}$ is bounded and continuous;
(b) $\widehat{(f + g)}(u) = \widehat{f}(u) + \widehat{g}(u)$; $\widehat{(af)}(u) = a \widehat{f}(u)$;
(c) if $f_a(x) = f(x + a)$, then $\widehat{f_a}(u) = e^{-iu \cdot a} \widehat{f}(u)$;
(d) if $g_a(x) = e^{ia \cdot x} f(x)$, then $\widehat{g_a}(u) = \widehat{f}(u + a)$;
(e) if $h_a(x) = f(ax)$, then $\widehat{h_a}(u) = a^{-n} \widehat{f}(u/a)$.
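Proof. (a) $\widehat{f}$ is bounded because $f \in L^1$ and $|e^{iu \cdot x}| = 1$. We have
$$\widehat{f}(u + h) - \widehat{f}(u) = \int \big( e^{i(u+h) \cdot x} - e^{iu \cdot x} \big) f(x) \, dx.$$
So
$$|\widehat{f}(u + h) - \widehat{f}(u)| \le \int |e^{iu \cdot x}| \, |e^{ih \cdot x} - 1| \, |f(x)| \, dx.$$
The integrand is bounded by $2|f(x)|$, which is integrable, and $e^{ih \cdot x} - 1 \to 0$ as $h \to 0$, and thus the continuity follows by dominated convergence.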

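(b) is obvious. (c) follows because
$$\widehat{f_a}(u) = \int e^{iu \cdot x} f(x + a) \, dx = \int e^{iu \cdot (x - a)} f(x) \, dx = e^{-iu \cdot a} \widehat{f}(u)$$
by a change of variables. For (d),
$$\widehat{g_a}(u) = \int e^{iu \cdot x} e^{ia \cdot x} f(x) \, dx = \int e^{i(u + a) \cdot x} f(x) \, dx = \widehat{f}(u + a).$$
Finally for (e), by a change of variables,
$$\widehat{h_a}(u) = \int e^{iu \cdot x} f(ax) \, dx = a^{-n} \int e^{iu \cdot (y/a)} f(y) \, dy = a^{-n} \int e^{i(u/a) \cdot y} f(y) \, dy = a^{-n} \widehat{f}(u/a).$$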
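As an illustration of Proposition 14.1(a), consider $f = \chi_{[-1,1]}$ on $\mathbb{R}$ (an example chosen here, not taken from the text). With the sign convention of (14.1), a direct computation gives:

```latex
\begin{align*}
\widehat{f}(u) &= \int_{-1}^{1} e^{iux}\, dx
  = \frac{e^{iu} - e^{-iu}}{iu}
  = \frac{2 \sin u}{u}, \qquad u \ne 0, \\
\widehat{f}(0) &= \int_{-1}^{1} dx = 2 = \lim_{u \to 0} \frac{2 \sin u}{u}.
\end{align*}
```

The transform is bounded by $\|f\|_1 = 2$ and is continuous, as part (a) predicts.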

One reason for the usefulness of Fourier transforms is that they relate derivatives and multiplication.

Proposition 14.2 Suppose $f \in L^1$ and $x_j f(x) \in L^1$, where $x_j$ is the $j$th coordinate of $x$. Then
$$\frac{\partial \widehat{f}}{\partial u_j}(u) = i \int e^{iu \cdot x} x_j f(x) \, dx.$$
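Proof. Let $e_j$ be the unit vector in the $j$th direction. Then
$$\frac{\widehat{f}(u + h e_j) - \widehat{f}(u)}{h} = \frac{1}{h} \int \big( e^{i(u + h e_j) \cdot x} - e^{iu \cdot x} \big) f(x) \, dx = \int e^{iu \cdot x} \Big( \frac{e^{ih x_j} - 1}{h} \Big) f(x) \, dx.$$
Since
$$\Big| \frac{1}{h} \big( e^{ih x_j} - 1 \big) \Big| \le |x_j|$$
and $x_j f(x) \in L^1$, the right hand side converges to $\int e^{iu \cdot x} i x_j f(x) \, dx$ by dominated convergence. Therefore the left hand side converges. Of course, the limit is $\partial \widehat{f} / \partial u_j$.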

The convolution of $f$ and $g$ is defined by
$$f * g(x) = \int f(x - y) g(y) \, dy.$$
By a change of variables, this is the same as $\int f(y) g(x - y) \, dy$, so $f * g = g * f$.

Proposition 14.3 (a) If $f, g \in L^1$, then $f * g$ is in $L^1$ and $\|f * g\|_1 \le \|f\|_1 \|g\|_1$.
(b) The Fourier transform of $f * g$ is $\widehat{f}(u) \widehat{g}(u)$.
Proof. (a) We will show $f * g$ is finite a.e. by showing $\int |f * g(x)| \, dx < \infty$. We have
$$\int |f * g(x)| \, dx \le \int \int |f(x - y)| \, |g(y)| \, dy \, dx.$$
Since the integrand is nonnegative, we can apply Fubini and the right hand side is equal to
$$\int \int |f(x - y)| \, dx \, |g(y)| \, dy = \int \int |f(x)| \, dx \, |g(y)| \, dy = \|f\|_1 \|g\|_1.$$
The first equality here follows by a change of variables. To verify that we can do a change of variables, we reduce to simple functions and then characteristic functions, and then use the translation invariance of Lebesgue measure.
(b) We have
$$\widehat{f * g}(u) = \int e^{iu \cdot x} \int f(x - y) g(y) \, dy \, dx = \int \int e^{iu \cdot (x - y)} f(x - y) \, dx \, e^{iu \cdot y} g(y) \, dy = \int \widehat{f}(u) e^{iu \cdot y} g(y) \, dy = \widehat{f}(u) \widehat{g}(u).$$
We applied Fubini to interchange the order of integration; this is valid because as we saw in (a), the absolute value of the integrand is integrable.
We want to give a formula for recovering $f$ from $\widehat{f}$. First we need to calculate the Fourier transform of a particular function.

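Proposition 14.4 (a) Suppose $f_1 : \mathbb{R} \to \mathbb{R}$ is defined by
$$f_1(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.$$
Then $\widehat{f_1}(u) = e^{-u^2/2}$.
(b) Suppose $f_n : \mathbb{R}^n \to \mathbb{R}$ is given by
$$f_n(x) = \frac{1}{(2\pi)^{n/2}} e^{-|x|^2/2}.$$
Then $\widehat{f_n}(u) = e^{-|u|^2/2}$.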

Proof. (a) may be proved using contour integration, but let's give a real variable proof. Let $g(u) = \int e^{iux} e^{-x^2/2} \, dx$. Differentiate with respect to $u$. We may differentiate under the integral sign because $(e^{i(u+h)x} - e^{iux})/h$ is bounded in absolute value by $|x|$ and $|x| e^{-x^2/2}$ is integrable; therefore dominated convergence applies. We then obtain
$$g'(u) = i \int e^{iux} x e^{-x^2/2} \, dx.$$
By integration by parts this is equal to
$$-u \int e^{iux} e^{-x^2/2} \, dx = -u g(u).$$
Solving the differential equation $g'(u) = -u g(u)$, we have
$$[\log g(u)]' = \frac{g'(u)}{g(u)} = -u,$$
so $\log g(u) = -u^2/2 + c_1$, and so then
$$g(u) = c_2 e^{-u^2/2}. \qquad (14.2)$$
Since $g(0) = \int e^{-x^2/2} \, dx = \sqrt{2\pi}$, $c_2 = \sqrt{2\pi}$. Substituting this value of $c_2$ in (14.2) and dividing both sides by $\sqrt{2\pi}$ proves (a).
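The integration by parts step can be spelled out as follows. This is a sketch; the boundary term vanishes because $e^{-x^2/2} \to 0$ as $|x| \to \infty$ while $|e^{iux}| = 1$.

```latex
\begin{align*}
i \int e^{iux}\, x e^{-x^2/2}\, dx
  &= i \Bigl[ -e^{iux} e^{-x^2/2} \Bigr]_{x=-\infty}^{x=\infty}
     + i \int (iu)\, e^{iux} e^{-x^2/2}\, dx \\
  &= 0 + i^2 u \int e^{iux} e^{-x^2/2}\, dx
   = -u\, g(u).
\end{align*}
```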


For (b), since $f_n(x) = f_1(x_1) \cdots f_1(x_n)$ if $x = (x_1, \ldots, x_n)$,
$$\widehat{f_n}(u) = \int \cdots \int e^{i \sum_j u_j x_j} f_1(x_1) \cdots f_1(x_n) \, dx_1 \cdots dx_n = \widehat{f_1}(u_1) \cdots \widehat{f_1}(u_n) = e^{-|u|^2/2}.$$

One more preliminary before proving the inversion theorem.

Proposition 14.5 Suppose $\varphi$ is in $L^1$ and $\int \varphi(x) \, dx = 1$. Let $\varphi_A(x) = A^{-n} \varphi(x/A)$.
(a) Then $\|f * \varphi_A - f\|_1 \to 0$ as $A \to 0$.
(b) If $f$ is continuous with compact support, then $f * \varphi_A$ converges to $f$ pointwise.
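Proof. (a) Let $\varepsilon > 0$. Choose $g$ continuous with compact support so that $\|f - g\|_1 < \varepsilon$. Let $h = f - g$. A change of variables shows that $\|\varphi_A\|_1 = \|\varphi\|_1$. Observe
$$\|f * \varphi_A - f\|_1 \le \|g * \varphi_A - g\|_1 + \|h * \varphi_A - h\|_1$$
and
$$\|h * \varphi_A - h\|_1 \le \|h\|_1 + \|h * \varphi_A\|_1 \le \|h\|_1 + \|h\|_1 \|\varphi_A\|_1 < \varepsilon (1 + \|\varphi\|_1).$$
So since $\varepsilon$ is arbitrary, it suffices to show that $g * \varphi_A \to g$ in $L^1$.
We start by writing
$$g * \varphi_A(x) - g(x) = \int g(x - y) \varphi_A(y) \, dy - g(x) = \int g(x - Ay) \varphi(y) \, dy - g(x) = \int [g(x - Ay) - g(x)] \varphi(y) \, dy.$$
We used a change of variables and the fact that $\int \varphi(y) \, dy = 1$. Because $g$ is continuous with compact support, then $g$ is bounded, and the integral on the right goes to 0 by dominated convergence, the dominating function being $2 \|g\|_\infty |\varphi(y)|$. Therefore $g * \varphi_A(x)$ converges to $g(x)$ pointwise.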

To show the convergence in $L^1$, we have
$$\int |g * \varphi_A(x) - g(x)| \, dx \le \int \int |g(x - Ay) - g(x)| \, |\varphi(y)| \, dy \, dx = \int \int |g(x - Ay) - g(x)| \, |\varphi(y)| \, dx \, dy.$$
Since $g$ is continuous with compact support and hence bounded, for each $y$
$$G_A(y) = \int |g(x - Ay) - g(x)| \, dx$$
converges to 0 as $A \to 0$ by dominated convergence. Also
$$G_A(y) \le \int |g(x - Ay)| \, dx + \int |g(x)| \, dx \le 2 \|g\|_1 < \infty.$$
Then
$$\int G_A(y) |\varphi(y)| \, dy$$
converges to 0 as $A \to 0$ by dominated convergence, the dominating function being $2 \|g\|_1 |\varphi(y)|$.
(b) This follows from the argument we used for $g$ above.
Now we are ready to give the inversion formula. The proof seems longer than it might be, but there is no avoiding the introduction of the function $H_a$ or some similar function.

Theorem 14.6 Suppose $f, \widehat{f} \in L^1$. Then
$$f(y) = \frac{1}{(2\pi)^n} \int e^{-iu \cdot y} \widehat{f}(u) \, du, \qquad \text{a.e.}$$

Proof. If $g(x) = a^{-n} f(x/a)$, then its Fourier transform is $\widehat{f}(au)$. So the Fourier transform of
$$\frac{1}{a^n} \frac{1}{(2\pi)^{n/2}} e^{-|x|^2/2a^2}$$
is $e^{-a^2 |u|^2/2}$. Therefore if we let
$$H_a(x) = \frac{1}{(2\pi)^n} e^{-|x|^2/2a^2},$$
we have
$$\widehat{H_a}(u) = (2\pi)^{-n/2} a^n e^{-a^2 |u|^2/2}.$$
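We have
$$\int \widehat{f}(u) e^{-iu \cdot y} H_a(u) \, du = \int \int e^{iu \cdot x} f(x) e^{-iu \cdot y} H_a(u) \, dx \, du = \int \int e^{iu \cdot (x - y)} H_a(u) \, du \, f(x) \, dx = \int \widehat{H_a}(x - y) f(x) \, dx. \qquad (14.3)$$
We can interchange the order of integration because
$$\int \int |f(x)| \, |H_a(u)| \, dx \, du < \infty.$$
The left hand side of the first line of (14.3) converges to $(2\pi)^{-n} \int \widehat{f}(u) e^{-iu \cdot y} \, du$ as $a \to \infty$ by dominated convergence and the fact that $\widehat{f} \in L^1$. The last line of (14.3) is equal to
$$\int \widehat{H_a}(y - x) f(x) \, dx = f * \widehat{H_a}(y) \qquad (14.4)$$
since $\widehat{H_a}$ is symmetric. But by Proposition 14.5, $f * \widehat{H_a}$ converges to $f$ in $L^1$ as $a \to \infty$. Since convergence in $L^1$ implies a.e. convergence along a subsequence, the two limits agree for almost every $y$.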
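To apply Proposition 14.5 one needs $\widehat{H_a}$ to be of the form $\varphi_A$. A hedged sketch of the identification, taking $\varphi = f_n$ from Proposition 14.4(b) and $A = 1/a$:

```latex
\begin{align*}
\varphi(x) &= f_n(x) = (2\pi)^{-n/2} e^{-|x|^2/2},
  \qquad \int \varphi(x)\, dx = \widehat{f_n}(0) = 1, \\
\varphi_{1/a}(x) &= a^{n} \varphi(a x)
  = (2\pi)^{-n/2} a^{n} e^{-a^2 |x|^2/2}
  = \widehat{H_a}(x).
\end{align*}
```

So $A = 1/a \to 0$ exactly when $a \to \infty$, which matches the limit taken in the proof.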
The last topic that we consider is the Plancherel theorem.

Theorem 14.7 (a) Suppose $f$ is continuous with compact support. Then $\widehat{f} \in L^2$ and
$$\|f\|_2 = (2\pi)^{-n/2} \|\widehat{f}\|_2. \qquad (14.5)$$
(b) We can use the result in (a) to define $\widehat{f}$ when $f \in L^2$ and so that (14.5) holds.

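Proof. (a) Let $g(x) = \overline{f(-x)}$. Note
$$\widehat{g}(u) = \int e^{iu \cdot x} \overline{f(-x)} \, dx = \overline{\int e^{-iu \cdot x} f(-x) \, dx} = \overline{\int e^{iu \cdot x} f(x) \, dx} = \overline{\widehat{f}(u)}.$$
By (14.3) and (14.4) with $y = 0$,
$$f * g * \widehat{H_a}(0) = \int \widehat{f * g}(u) H_a(u) \, du. \qquad (14.6)$$
Since $\widehat{f * g}(u) = \widehat{f}(u) \widehat{g}(u) = |\widehat{f}(u)|^2$, the right hand side of (14.6) converges by monotone convergence to $(2\pi)^{-n} \int |\widehat{f}(u)|^2 \, du$ as $a \to \infty$. Since $f$ and $g$ are continuous with compact support, then it is easy to see that $f * g$ is also, and so the left hand side of (14.6) converges to $f * g(0) = \int f(y) g(-y) \, dy = \int |f(y)|^2 \, dy$ by Proposition 14.5(b).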
(b) The set of continuous functions with compact support is dense in $L^2$. Given a function $f$ in $L^2$, choose a sequence of continuous functions with compact support $\{f_m\}$ such that $f_m \to f$ in $L^2$. By the result in (a), $\{\widehat{f_m}\}$ is a Cauchy sequence in $L^2$, and therefore converges to a function in $L^2$, which we call $\widehat{f}$. If $\{f_m'\}$ is another sequence of continuous functions with compact support converging to $f$ in $L^2$, then $\{f_m - f_m'\}$ is a sequence of continuous functions with compact support converging to 0 in $L^2$; by the result in (a), $\widehat{f_m} - \widehat{f_m'}$ converges to 0 in $L^2$, and therefore $\widehat{f}$ is defined uniquely up to almost everywhere equivalence. By passing to the limit in $L^2$ on both sides of (14.5), we see that (14.5) holds for $f \in L^2$.
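As a sanity check of (14.5), take $n = 1$ and $f = f_1$ from Proposition 14.4(a). This $f_1$ is in $L^2$ but is not compactly supported, so the check illustrates part (b) rather than part (a). It assumes, as is standard but not shown in the text, that the $L^2$ definition of the transform agrees with (14.1) when $f \in L^1 \cap L^2$, and it uses only $\int e^{-x^2} \, dx = \sqrt{\pi}$.

```latex
\begin{align*}
\|f_1\|_2^2 &= \frac{1}{2\pi} \int e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2\pi}
  = \frac{1}{2\sqrt{\pi}}, \\
\|\widehat{f_1}\|_2^2 &= \int e^{-u^2}\, du = \sqrt{\pi}, \\
(2\pi)^{-1} \|\widehat{f_1}\|_2^2 &= \frac{\sqrt{\pi}}{2\pi}
  = \|f_1\|_2^2 .
\end{align*}
```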
Richard F. Bass
Department of Mathematics
University of Connecticut
Storrs, CT 06269-3009, USA
bass@math.uconn.edu
