Functional Analysis I Autumn Term 2008: James C. Robinson
James C. Robinson
Introduction
I hope that these notes will be useful. They are, of course, much more wordy
than the notes you will have taken in lectures, but the maths itself is usually
done in a little more detail and should generally be ‘tighter’. You may find
that the order in which the material is presented is a little different to the
lectures, but this should make things more coherent.
I hope that there are relatively few mistakes, but if you find yourself
staring at something thinking that it must be wrong then it most likely is,
so do email me at j.c.robinson@warwick.ac.uk. I will post a list of errata
as and when people find them on my webpage for the course,
www.maths.warwick.ac.uk/~jcr/FAI.
These notes will form the basis of the first part of a textbook on functional
analysis, so any general comments would also be welcome.
Contents
2.2 Convergence
3.1 Compactness
4 Completeness
5 Lebesgue integration
f + λg ∈ V, f, g ∈ V, λ ∈ K = R or C,
and distributive
α ∗ (x + y) = α ∗ x + α ∗ y and (α + β) ∗ x = α ∗ x + β ∗ x
for all α, β ∈ K, x, y ∈ V .
Vector spaces and bases
Example 1.2 For $1 \le p < \infty$ define the space $\ell^p(K)$ of all $p$th power summable sequences with elements in $K$ (recall that $K = \mathbb{R}$ or $\mathbb{C}$):
\[ \ell^p(K) = \Big\{ x = (x_1, x_2, \ldots) : x_j \in K,\ \sum_{j=1}^\infty |x_j|^p < +\infty \Big\}. \]
x + y = (x1 + y1 , x2 + y2 , . . .),
With these definitions $\ell^p(K)$ is a vector space. The only issue is whether $x + y$ is still in $\ell^p(K)$; for $1 \le p < \infty$ this follows since
\[ \sum_{j=1}^n |x_j + y_j|^p \le \sum_{j=1}^n 2^p\big(|x_j|^p + |y_j|^p\big) \le 2^p \sum_{j=1}^\infty |x_j|^p + 2^p \sum_{j=1}^\infty |y_j|^p < +\infty, \]
Example 1.3 The space C 0 ([0, 1]) of all real-valued continuous functions
on the interval [0, 1] is a vector space with the obvious definitions of addition
and scalar multiplication, which we give here for the one and only time: for
Example 1.4 Denote by $\tilde L^1(0,1)$ the set of all real-valued continuous functions on $(0,1)$ for which
\[ \int_0^1 |f(x)|\,dx < +\infty. \]
Then L̃1 (0, 1) is a vector space (with the obvious definitions of addition and
scalar multiplication).
But while the function $f(x) = x^{-1/2}$ is not continuous on $[0,1]$, it is continuous on $(0,1)$ and
\[ \int_0^1 |x^{-1/2}|\,dx = \big[2x^{1/2}\big]_0^1 = 2 < \infty, \]
so f ∈ L̃1 (0, 1). These two examples show that C 0 ([0, 1]) is a strict subset
of L̃1 (0, 1).
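As a hedged numerical companion to the calculation above, the following Python sketch approximates the integral of $x^{-1/2}$ over $[\varepsilon, 1]$ by a midpoint rule and compares it with the exact value $2 - 2\sqrt{\varepsilon}$, which tends to 2 as $\varepsilon \to 0$. The function names and the choice of quadrature rule are illustrative assumptions and are not part of the notes.

```python
import math


def truncated_exact(eps: float) -> float:
    """Exact value of the integral of x**(-1/2) over [eps, 1]."""
    return 2.0 - 2.0 * math.sqrt(eps)


def truncated_midpoint(eps: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the same integral (hypothetical helper)."""
    h = (1.0 - eps) / n
    return h * sum((eps + (i + 0.5) * h) ** -0.5 for i in range(n))


if __name__ == "__main__":
    # The truncated integrals increase towards 2 as eps shrinks.
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, truncated_exact(eps), truncated_midpoint(eps))
```

The singularity at 0 is avoided by truncating the interval, which mirrors the way the improper integral is evaluated by hand.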
that the definitions – and the following arguments – also apply to infinite-
dimensional spaces. In particular the result of Lemma 1.9 is valid for infinite-
dimensional spaces.
Example 1.13 Consider the functions $f_n \in C^0([0,1])$, where $f_n$ is zero for $x \notin I_n = [2^{-n} - 2^{-(n+2)}, 2^{-n} + 2^{-(n+2)}]$ and interpolates linearly between the values
\[ f_n(2^{-n} - 2^{-(n+2)}) = 0, \qquad f_n(2^{-n}) = 1, \qquad f_n(2^{-n} + 2^{-(n+2)}) = 0. \]
The intervals $I_n$ where $f_n \neq 0$ are disjoint, but $f_n(2^{-n}) = 1$. It follows that for any $n$ the $\{f_j\}_{j=1}^n$ are linearly independent, and so $C^0([0,1])$ is infinite-dimensional.
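The independence argument of Example 1.13 can be sketched in Python: evaluating $f_j$ at the peak $2^{-k}$ of $f_k$ gives 1 when $j = k$ and 0 otherwise, so any vanishing linear combination must have all coefficients zero. The helper names `hat` and `evaluation_matrix` are hypothetical, chosen only for this illustration.

```python
def hat(n: int, x: float) -> float:
    """f_n from Example 1.13: a tent of height 1 centred at 2**-n, half-width 2**-(n+2)."""
    centre = 2.0 ** -n
    half_width = 2.0 ** -(n + 2)
    return max(0.0, 1.0 - abs(x - centre) / half_width)


def evaluation_matrix(n_max: int) -> list[list[float]]:
    """Matrix with entries f_j(2**-k) for j, k = 1, ..., n_max."""
    return [
        [hat(j, 2.0 ** -k) for k in range(1, n_max + 1)]
        for j in range(1, n_max + 1)
    ]


if __name__ == "__main__":
    for row in evaluation_matrix(4):
        print(row)
```

Because the evaluation matrix is the identity, evaluating $\sum_j c_j f_j = 0$ at $x = 2^{-k}$ forces $c_k = 0$ for each $k$.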
The proof makes use of Zorn’s Lemma. In order to state this result (which
in fact is more of an axiom, since it is equivalent to the axiom of choice) we
need to introduce some auxiliary concepts.
The order is ‘partial’ because two arbitrary elements of $P$ need not be ordered: consider, for example, the case when $P$ consists of all subsets of $\mathbb{R}$ and $X \preceq Y$ if $X \subseteq Y$; one cannot order $[0,1]$ and $[1,2]$.
Exercise 1.16 Show that the space ℓf consisting of all sequences that contain
only finitely many non-zero terms is a vector space. Show that
ej = (0, . . . , 0, 1, 0, . . .)
(all zeros except for a single 1 in the jth position) is a Hamel basis for ℓf .
2.1 Norms and normed spaces
Note that when p = 2 and K = R this is the natural extension of the standard
Euclidean norm to a countable collection of real numbers.
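We now show that this really does define a norm. It is clear that $\|x\|_{\ell^p} \ge 0$, and that if the norm is zero then $x = 0$. It is also clear that
\[ \|\lambda x\|_{\ell^p} = \Big(\sum_{j=1}^\infty |\lambda x_j|^p\Big)^{1/p} = \Big(|\lambda|^p \sum_{j=1}^\infty |x_j|^p\Big)^{1/p} = |\lambda| \Big(\sum_{j=1}^\infty |x_j|^p\Big)^{1/p} = |\lambda|\,\|x\|_{\ell^p}. \]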
(This requirement explains why we have to take the pth root of the sum of
pth powers.)
It is the triangle inequality that requires some work. Although the ar-
gument is a little long, on the way we will prove two auxiliary – and very
useful – inequalities. We say that $(p, q)$ with $1 < p, q < \infty$ are conjugate if
\[ \frac{1}{p} + \frac{1}{q} = 1; \tag{2.2} \]
we extend the definition to cover the case $p = 1$, $q = \infty$.
Lemma 2.4 (Young’s inequality) Let $a, b > 0$ and $(p, q)$ conjugate with $1 < p, q < \infty$. Then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \tag{2.3} \]
and this is zero only for t = 1, where f (1) = 0. The second derivative
is positive, so this is a minimum. It follows that f (t) ≥ 0 for all t. In
particular, choosing $t = ab^{-q/p}$ we obtain
\[ \frac{a^p b^{-q}}{p} + \frac{1}{q} - ab^{-q/p} \ge 0 \]
so that
\[ \frac{a^p}{p} + \frac{b^q}{q} \ge ab^{-q/p}\, b^q = ab. \]
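The last step uses $q - q/p = q(1 - 1/p) = 1$. As a hedged sanity check of Young's inequality (2.3), the following Python sketch evaluates the gap $a^p/p + b^q/q - ab$, which should be non-negative and vanish when $a^p = b^q$. The function names are illustrative assumptions.

```python
import random


def conjugate(p: float) -> float:
    """Return q with 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


def young_gap(a: float, b: float, p: float) -> float:
    """a**p/p + b**q/q - a*b, which (2.3) says is non-negative."""
    q = conjugate(p)
    return a**p / p + b**q / q - a * b


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    worst = min(
        young_gap(rng.uniform(0.01, 5.0), rng.uniform(0.01, 5.0), rng.uniform(1.1, 6.0))
        for _ in range(10_000)
    )
    print("smallest gap found:", worst)
```

The equality case $a^p = b^q$ corresponds to $t = 1$ in the proof, where $f$ attains its minimum.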
≤ 1.
\[ \sum_{j=1}^n |x_j y_j| \le \Big(\sum_{j=1}^n |x_j|^2\Big)^{1/2} \Big(\sum_{j=1}^n |y_j|^2\Big)^{1/2}, \tag{2.5} \]
\[ \max_{j=1,\ldots,n} |x_j + y_j| \le \max_{j=1,\ldots,n} |x_j| + \max_{j=1,\ldots,n} |y_j|, \]
\[ \sum_{j=1}^n |x_j + y_j| \le \sum_{j=1}^n |x_j| + \sum_{j=1}^n |y_j|, \]
i.e.
\[ \Big(\sum_{j=1}^n |x_j + y_j|^p\Big)^{1 - 1/q} \le \|x\|_{\ell^p} + \|y\|_{\ell^p}. \]
One of our main concerns in what follows will be normed spaces that
consist of functions. For example, the following are norms on C 0 ([0, 1]), the
space of all continuous functions on [0, 1]: the ‘sup(remum) norm’,
\[ \|f\|_\infty = \sup_{x \in [0,1]} |f(x)| \]
Note that of the three candidates here the L2 norm looks most like the
expression (2.1) for the familiar norm in Rn .
Proof The only part that requires much thought is (i), to make sure that $\|f\|_{L^1} = 0$ iff $f = 0$. So suppose that $f \neq 0$. Then $|f(y)| = \delta > 0$ for some $y \in (0,1)$ (if $f(0) \neq 0$ or $f(1) \neq 0$ it follows from continuity that $f(y) \neq 0$ for some $y \in (0,1)$). Since $f$ is continuous, there exists an $\epsilon > 0$ such that for any $x \in (0,1)$ with $|x - y| < \epsilon$ we have
\[ |f(x) - f(y)| < \delta/2. \]
If necessary, reduce $\epsilon$ so that $[y - \epsilon, y + \epsilon] \subset (0,1)$. Then $|f(x)| \ge \delta/2$ on this interval, and so
\[ \int_0^1 |f(x)|\,dx \ge \int_{y-\epsilon}^{y+\epsilon} |f(x)|\,dx \ge \int_{y-\epsilon}^{y+\epsilon} \frac{\delta}{2}\,dx = \epsilon\delta > 0. \]
and
\[ \|f + g\|_{L^1} = \int |f(x) + g(x)|\,dx \le \int |f(x)| + |g(x)|\,dx \le \|f\|_{L^1} + \|g\|_{L^1}. \]
For $\|\cdot\|_{L^2}$ (i) and (ii) are the same as above; we will see (iii) below as a consequence of the Cauchy-Schwarz inequality for inner products.
Definition 2.8 Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $V$ are equivalent – we write $\|\cdot\|_1 \sim \|\cdot\|_2$ – if there exist constants $0 < c_1 \le c_2$ such that
\[ c_1 \|x\|_1 \le \|x\|_2 \le c_2 \|x\|_1 \quad \text{for all } x \in V. \]
and
\[ \|x\|_2 \le \alpha_1^{-1} \|x\|_1 \le \beta_2 \alpha_1^{-1} \|x\|_3, \]
i.e. $\beta_1 \alpha_2^{-1} \|x\|_3 \le \|x\|_2 \le \beta_2 \alpha_1^{-1} \|x\|_3$ and $\|\cdot\|_2$ and $\|\cdot\|_3$ are equivalent.
This is a particular case of the general result that all norms on a finite-
dimensional vector space are equivalent, which we will prove in the following
chapter. As part of this proof, the following proposition – which shows that
one can always find a norm on a finite-dimensional vector space – will be
useful.
Proof First, note that any $v \in V$ can be written uniquely as $v = \sum_j \alpha_j e_j$, so the map $v \mapsto \|v\|_E$ is well-defined. We check that $\|\cdot\|_E$ satisfies the three requirements of a norm:
(i) clearly $\|v\|_E \ge 0$, and if $\|v\|_E = 0$ then $v = \sum_j \alpha_j e_j$ with $\sum_j |\alpha_j|^2 = 0$, i.e. $\alpha_j = 0$ for $j = 1, \ldots, n$, and so $v = 0$.
(ii) If $v = \sum_j \alpha_j e_j$ then $\lambda v = \sum_j (\lambda\alpha_j) e_j$, and so
\[ \|\lambda v\|_E^2 = \sum_j |\lambda\alpha_j|^2 = |\lambda|^2 \sum_j |\alpha_j|^2 = |\lambda|^2 \|v\|_E^2. \]
(iii) For the triangle inequality, if $u = \sum_j \alpha_j e_j$ and $v = \sum_j \beta_j e_j$ then,
We now want to show that with the $\|\cdot\|_E$ norm, any finite-dimensional normed space is ‘the same’ as $\mathbb{R}^n$. For two objects to be ‘the same’ we generally require an isomorphism that also preserves the essential structures of the objects. Here we want to say that two linear spaces, along with their norms, are ‘the same’. So we will need the isomorphism to be linear (so that $\varphi(x) + \varphi(y) = \varphi(x + y)$) and we will also want to preserve the norm (‘an isometry’).
Definition 2.12 Two normed spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are isometrically isomorphic, or simply isometric, if there exists a linear isomorphism $\varphi : X \to Y$ that is also an isometry, i.e.
\[ \|\varphi(x)\|_Y = \|x\|_X \quad \text{for all } x \in X. \]
It is clear that $\varphi$ is one-to-one and onto, that $\varphi$ (and its inverse) are linear, and it follows directly from the definition of $\|\cdot\|_E$ that $|\varphi(x)| = \|x\|_E$ for all $x \in V$.
2.2 Convergence
The proof of this lemma is immediate from the definition of the equivalence of norms, since there exist constants $0 < c_1 \le c_2$ such that
\[ c_1 \|x_n - x\|_1 \le \|x_n - x\|_2 \le c_2 \|x_n - x\|_1; \]
Exercise 2.19 Show that this is equivalent to the $\epsilon$–$\delta$ definition of continuity: for each $x \in X$, for every $\epsilon > 0$ there exists a $\delta > 0$ such that
\[ \|y - x\|_X < \delta \ \Rightarrow\ \|f(y) - f(x)\|_Y < \epsilon. \]
Corollary 2.20 Suppose that $\|\cdot\|_{X,1} \sim \|\cdot\|_{X,2}$ are two equivalent norms on a space $X$, and $\|\cdot\|_{Y,1} \sim \|\cdot\|_{Y,2}$ are two equivalent norms on a space $Y$. Then a function $f : (X, \|\cdot\|_{X,1}) \to (Y, \|\cdot\|_{Y,1})$ is continuous iff it is continuous as a map from $(X, \|\cdot\|_{X,2})$ into $(Y, \|\cdot\|_{Y,2})$.
[Figure: the functions $f_k$ and $g_k$.]
This inequality should make very clear the advantage of the shorthand norm notation, since it just says
\[ \|f_k - f\|_{L^1} \le \|f_k - f\|_\infty. \]
It is also clear that if (2.7) holds then $f_k(x) \to f(x)$ for each $x \in [0,1]$, which is ‘pointwise convergence’. However, neither pointwise convergence nor $L^1$ convergence implies uniform convergence:
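One hedged illustration, chosen here for convenience and not necessarily the example used in the lectures, is $f_k(x) = x^k$ on $[0,1]$: its $L^1$ distance to the zero function is $1/(k+1) \to 0$, and $f_k(x) \to 0$ for each $x < 1$, yet its sup-norm distance to zero is 1 for every $k$. The Python sketch below approximates both norms on a grid; the function name and grid size are assumptions.

```python
def sup_and_l1_norm(k: int, n: int = 10_000) -> tuple[float, float]:
    """Approximate the sup norm and L^1 norm of f_k(x) = x**k on [0, 1]."""
    # Sup norm: maximum over a uniform grid that includes both endpoints.
    sup_norm = max((i / n) ** k for i in range(n + 1))
    # L^1 norm: midpoint rule on the same n subintervals.
    h = 1.0 / n
    l1_norm = h * sum(((i + 0.5) * h) ** k for i in range(n))
    return sup_norm, l1_norm


if __name__ == "__main__":
    for k in (1, 10, 100):
        print(k, sup_and_l1_norm(k))
```

The printed $L^1$ norms shrink like $1/(k+1)$ while the sup norm stays at 1, which is consistent with the inequality $\|f_k - f\|_{L^1} \le \|f_k - f\|_\infty$ only giving information in one direction.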
3.1 Compactness
Proof Let $\{x^{(k)} = (x_1^{(k)}, \ldots, x_n^{(k)})\}$ be a bounded sequence in $\mathbb{R}^n$. Since $x_1^{(k)}$ is a bounded sequence in $\mathbb{R}$, there is a subsequence $x^{(k_{1,j})}$ for which $x_1^{(k_{1,j})}$ converges. Since $x^{(k_{1,j})}$ is again a bounded sequence in $\mathbb{R}^n$, $x_2^{(k_{1,j})}$ is a bounded sequence in $\mathbb{R}$. We can therefore find a subsequence $x^{(k_{2,j})}$ of $x^{(k_{1,j})}$ such that $x_2^{(k_{2,j})}$ converges. Since $x^{(k_{2,j})}$ is a subsequence of $x^{(k_{1,j})}$, $x_1^{(k_{2,j})}$ still converges. We can continue this process inductively to obtain a subsequence $x^{(k_{n,j})}$ such that all the $x_i^{(k_{n,j})}$ for $i = 1, \ldots, n$ converge.
Note that if $\|\cdot\|_1 \sim \|\cdot\|_2$ are two equivalent norms on $V$ then $X$ is bounded with respect to $\|\cdot\|_1$ iff it is bounded with respect to $\|\cdot\|_2$.
Exercise 3.5 Show that if $(X, \|\cdot\|)$ is a normed space then the unit ball
\[ B_X(0, 1) = \{x \in X : \|x\| \le 1\} \]
and the unit sphere
\[ S_X = \{x \in X : \|x\| = 1\} \]
are both closed.
Note that if $\|\cdot\|_1 \sim \|\cdot\|_2$ are two equivalent norms on $V$ then $X$ is compact with respect to $\|\cdot\|_1$ iff it is compact with respect to $\|\cdot\|_2$.
We will see later that this characterisation does not hold in infinite-
dimensional spaces (and this is one way to characterise such spaces).
Proposition 3.10 Let $K$ be a compact subset of $(X, \|\cdot\|)$. Then any continuous function $f : K \to \mathbb{R}$ is bounded and attains its bounds, i.e. there exists an $M > 0$ such that $|f(x)| \le M$ for all $x \in K$, and there exist $\underline{x}, \overline{x} \in K$ such that
\[ f(\underline{x}) = \inf_{x \in K} f(x) \quad \text{and} \quad f(\overline{x}) = \sup_{x \in K} f(x). \]
and so $\overline{f} = f(\overline{x})$ for some $\overline{x} \in K$. [That $\sup(S) \in S$ for any closed $S$ is clear, since for each $n$ there exists an $s_n \in S$ such that $s_n > \sup(S) - 1/n$. Since $s_n \le \sup(S)$ by definition, $s_n \to \sup(S)$, and it follows from the fact that $S$ is closed that $\sup(S) \in S$.] The argument for $\underline{x}$ is identical.
Proof Let $E = \{e_j\}_{j=1}^n$ be a basis for $V$, and let $\|\cdot\|_E$ be the norm on $V$ defined in Proposition 2.11. Let $\|\cdot\|$ be another norm on $V$. We will show that $\|\cdot\|$ is equivalent to $\|\cdot\|_E$. Since equivalence of norms is an equivalence relation, this will imply that all norms on $V$ are equivalent.
Now, if $u = \sum_j \alpha_j e_j$ then
\begin{align*}
\|u\| &= \Big\| \sum_j \alpha_j e_j \Big\| \\
&\le \sum_j |\alpha_j| \|e_j\| && \text{(using the triangle inequality)} \\
&\le \Big(\sum_j |\alpha_j|^2\Big)^{1/2} \Big(\sum_j \|e_j\|^2\Big)^{1/2} && \text{(using the Cauchy-Schwarz inequality)} \\
&= C_E \|u\|_E,
\end{align*}
where $C_E^2 = \sum_j \|e_j\|^2$, i.e. $C_E$ is a constant that does not depend on $u$.
Combining this with $\|u\| \le C_E \|u\|_E$ shows that $\|\cdot\|$ and $\|\cdot\|_E$ are equivalent.
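As a hedged illustration of the upper bound $\|u\| \le C_E \|u\|_E$, take $V = \mathbb{R}^n$ with the standard basis and let $\|\cdot\|$ be the $\ell^1$ norm, so that $\|e_j\| = 1$ and $C_E = \sqrt{n}$. This particular choice of norm and basis is an assumption made for the example; the helper names are hypothetical.

```python
import math


def l1_norm(u: list[float]) -> float:
    """The norm ||u|| = sum of |u_j|, playing the role of the 'other' norm."""
    return sum(abs(t) for t in u)


def e_norm(u: list[float]) -> float:
    """||u||_E with respect to the standard basis: the Euclidean norm of the coefficients."""
    return math.sqrt(sum(t * t for t in u))


def constant_c_e(norm, n: int) -> float:
    """C_E = (sum_j ||e_j||^2)^(1/2) for the standard basis of R^n."""
    basis = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    return math.sqrt(sum(norm(e) ** 2 for e in basis))


if __name__ == "__main__":
    u = [1.0, -2.0, 3.0]
    print(l1_norm(u), "<=", constant_c_e(l1_norm, 3) * e_norm(u))
```

For $u = (1, 1, 1)$ the bound is attained, so in this example the constant $C_E = \sqrt{3}$ cannot be improved.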
Theorem 4.1 A sequence of real numbers $\{x_n\}_{n=1}^\infty$ converges if and only if it is a Cauchy sequence, i.e. given any $\epsilon > 0$ there exists an $N$ such that
\[ |x_n - x_m| < \epsilon \quad \text{for all } n, m \ge N. \]
It follows that in particular $\|x_n\| \le \|x_N\| + 1$ for all $n \ge N$, and hence $\|x_n\|$ is bounded.
Theorem 4.1 states that R with its standard norm is complete (‘R is a
Banach space’). It follows fairly straightforwardly that the same is true for
any finite-dimensional normed space.
Since all norms on $V$ are equivalent (Theorem 3.11), a sequence $\{x^k\}$ that is Cauchy in $\|\cdot\|$ is Cauchy in $\|\cdot\|_E$.
Writing $x^k = \sum_{j=1}^n x_j^k e_j$ it follows that given any $\epsilon > 0$ there exists an $N_\epsilon$
The completeness of ℓp is a little more delicate, but only in the final steps.
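In particular $\{x_j^k\}_{k=1}^\infty$ is a Cauchy sequence in $K$ for every fixed $j$. Since $K$ is complete (recall $K = \mathbb{R}$ or $\mathbb{C}$) it follows that for each $j \in \mathbb{N}$
\[ x_j^k \to a_j \quad \text{as } k \to \infty \]
for some $a_j \in K$.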
Set $a = (a_1, a_2, \cdots)$. We want to show that $a \in \ell^p$ and that $\|x^k - a\|_{\ell^p} \to 0$ as $k \to \infty$. First, since $\{x^k\}$ is Cauchy we have from (4.2) that $\|x^n - x^m\|_{\ell^p} < \epsilon$ for all $n, m \ge N_\epsilon$, and so in particular for any $N \in \mathbb{N}$
\[ \sum_{j=1}^N |x_j^n - x_j^m|^p \le \sum_{j=1}^\infty |x_j^n - x_j^m|^p \le \epsilon^p. \]
Letting $m \to \infty$ we obtain
\[ \sum_{j=1}^N |x_j^n - a_j|^p \le \epsilon^p, \]
This is really the first time we have seen a significant difference between Rn
and the abstract normed vector spaces that we have been considering. The
failure of the Bolzano-Weierstrass property is in fact a defining characteristic
of infinite-dimensional spaces.
Theorem 4.6 $C^0([0,1])$ equipped with the sup norm $\|\cdot\|_\infty$ is complete.
Proof Let $\{f_k\}$ be a Cauchy sequence in $C^0([0,1])$: so given any $\epsilon > 0$ there exists an $N$ such that
In particular $\{f_k(x)\}$ is a Cauchy sequence for each fixed $x$, so $f_k(x)$ converges for each fixed $x \in [0,1]$: define
We need to show that in fact $f_k \to f$ uniformly. But this follows since for every $x \in [0,1]$ we have from (4.3)
For this reason the supremum norm is the ‘standard norm’ on $C^0([0,1])$; if no norm is mentioned this is the norm that is intended.
Example 4.7 $C^0([0,1])$ equipped with the $L^1$ norm is not complete.
[Figure 4: graph of $f_k$.]
\[ f_k(x) = \begin{cases} 0 & 0 \le x \le \tfrac12 - \tfrac1k \\ k\big(x - (\tfrac12 - \tfrac1k)\big) & \tfrac12 - \tfrac1k < x < \tfrac12 \\ 1 & \tfrac12 \le x \le 1, \end{cases} \]
see Figure 4.
\[ f_n(x) = f_m(x) \quad \text{for all } x < \tfrac12 - \tfrac1N,\ x > \tfrac12, \]
and so
\[ \|f_n - f_m\|_{L^1} = \int_0^1 |f_n(x) - f_m(x)|\,dx = \int_{1/2 - 1/N}^{1/2} |f_n(x) - f_m(x)|\,dx \le \frac{1}{N}, \tag{4.4} \]
since $|f_n(x) - f_m(x)| \le 1$ for all $x \in [0,1]$ and all $n, m \in \mathbb{N}$.
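The estimate (4.4) can be checked numerically. The following Python sketch implements the ramp functions $f_k$ of Example 4.7 and approximates $\|f_n - f_m\|_{L^1}$ with a midpoint rule; the function names and the number of quadrature steps are assumptions made for illustration.

```python
def ramp(k: int, x: float) -> float:
    """f_k from Example 4.7: 0 up to 1/2 - 1/k, linear up to 1/2, then 1 (assumes k >= 2)."""
    left = 0.5 - 1.0 / k
    if x <= left:
        return 0.0
    if x < 0.5:
        return k * (x - left)
    return 1.0


def l1_distance(n: int, m: int, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the L^1 distance between f_n and f_m on [0, 1]."""
    h = 1.0 / steps
    return h * sum(
        abs(ramp(n, (i + 0.5) * h) - ramp(m, (i + 0.5) * h)) for i in range(steps)
    )


if __name__ == "__main__":
    for n, m in ((4, 8), (10, 1000), (100, 1000)):
        print(n, m, l1_distance(n, m), "<=", 1.0 / min(n, m))
```

For $n < m$ the two ramps differ only on $(\tfrac12 - \tfrac1n, \tfrac12)$ and the exact distance is $\tfrac{1}{2n} - \tfrac{1}{2m}$, which is comfortably below the bound $1/N$ with $N = \min(n, m)$.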
So this sequence converges in the L1 norm but not the sup norm.
However, every normed space has a completion, i.e. a minimal set $\tilde V$ such that $\tilde V \supset V$ and $(\tilde V, \|\cdot\|)$ is a Banach space. Essentially $\tilde V$ consists of all limit points of Cauchy sequences in $V$ (and in particular, therefore, contains a copy of $V$ via the constant sequence $v_n = v \in V$).
This is equivalent to the fact that given any $v \in V$ there exists a sequence $x_n \in X$ such that
\[ \|x_n - v\| \to 0 \quad \text{as } n \to \infty. \]
This is the particularly useful form of ‘density’: if X is dense in V one can
often deduce properties of V by approximating them with elements of X.
x = (x1 , x2 , x3 , . . .).
We will show (i) that x is a Cauchy sequence, and so [x] ∈ X , and (ii) that
η (k) converges to [x]. This will show that X is complete.
(i) To show that x is Cauchy, observe that
Given $\epsilon > 0$, choose $N$ large enough that $\|x_n - x_m\|_X < \epsilon/2$ for all $n, m \ge N$, and then set $N' = \max(N, 2/\epsilon)$. It follows that for $k \ge N'$,
The space X in the above theorem is a very abstract one, and we are
fortunate that in most situations there is a more concrete description of the
completion of ‘interesting’ normed spaces.
Definition 4.12 The space $L^1(0,1)$ is the completion of $C^0([0,1])$ with respect to the $L^1$ norm.
What is this space $L^1(0,1)$? There are a number of possible answers:
• Heuristically, $L^1(0,1)$ consists of all functions that can arise as the limit (with respect to the $L^1$ norm) of sequences $f_n \in C^0([0,1])$.
This is the most intrinsic definition, and in some ways the most ‘useful’.
But note that given this definition it is certainly not obvious that L1 (0, 1) is
complete, nor that C 0 ([0, 1]) is dense in L1 (0, 1). We will assume these prop-
erties in what follows, but at the risk of over-emphasis: if we use Definition
4.12 to define L1 these properties come for free. If we use the ‘useful’ defi-
nition above there is actually some work to do to check these (which would
be part of a proper development of the Lebesgue integral and corresponding
‘Lebesgue spaces’).
We define the measure (or length) |I| of an interval I = [a, b], (a, b), (a, b],
or [a, b) to be
|I| = b − a.
The class $L^{\mathrm{step}}(\mathbb{R})$ of step functions on $\mathbb{R}$ consists of all those functions $s(x)$ that are piecewise constant on a finite number of intervals, i.e.
\[ s(x) = \sum_{j=1}^n c_j \chi[I_j](x), \tag{5.1} \]
5.1 Integrals of functions in $L^{\mathrm{step}}(\mathbb{R})$
Even though this definition appears entirely reasonable, note that we have not specified the nature of the intervals $I_j$, so the functions
\[ \chi_{(0,1)} \quad \text{and} \quad \chi_{[0,1]} \]
have the same integral (which is 1); by extension we can change the value of $s$ at a finite number of points and leave the value of $\int s$ unchanged.
i.e. where $\phi$ and $\psi$ are expressed using the same choice of intervals (in this case some of the $c_j$ and $d_j$ may be zero).
It is also relatively simple to check that this integral satisfies the following
three fundamental properties:
and so
\[ \int (\phi + \lambda\psi) = \sum_{j=1}^n (c_j + \lambda d_j)|I_j| = \sum_{j=1}^n c_j |I_j| + \lambda \sum_{j=1}^n d_j |I_j| = \int \phi + \lambda \int \psi. \]
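The linearity computation above can be mirrored in a few lines of Python. In this sketch a step function is stored as a list of coefficients $c_j$ over a shared list of intervals $I_j = (a_j, b_j)$, exactly as in the proof where $\phi$ and $\psi$ use the same intervals. The representation and the function names are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def integral(coeffs: Sequence[float], intervals: Sequence[Interval]) -> float:
    """Integral of sum_j c_j * chi_{I_j}: the sum of c_j * |I_j| with |I_j| = b_j - a_j."""
    return sum(c * (b - a) for c, (a, b) in zip(coeffs, intervals))


def linear_combination(c: Sequence[float], d: Sequence[float], lam: float) -> list[float]:
    """Coefficients of phi + lam * psi when both use the same intervals."""
    return [cj + lam * dj for cj, dj in zip(c, d)]


if __name__ == "__main__":
    intervals = [(0.0, 1.0), (1.0, 3.0)]
    phi, psi, lam = [2.0, -1.0], [0.5, 4.0], 3.0
    lhs = integral(linear_combination(phi, psi, lam), intervals)
    rhs = integral(phi, intervals) + lam * integral(psi, intervals)
    print(lhs, rhs)
```

Since only the lengths $b_j - a_j$ enter, the code makes no distinction between open and closed intervals, in line with the remark that the nature of the intervals does not affect the integral.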
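Proof Clearly $|\phi| \pm \phi \ge 0$, and so, since $\phi \in L^{\mathrm{step}}(\mathbb{R})$ implies (see *) that $|\phi| \in L^{\mathrm{step}}(\mathbb{R})$, we can use properties (L) and (P) to give
\[ \int |\phi| \pm \int \phi = \int \big(|\phi| \pm \phi\big) \ge 0, \]
whence
\[ \mp \int \phi \le \int |\phi|, \]
as required.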
(T) Translation invariance: Take $\phi \in L^{\mathrm{step}}(\mathbb{R})$. For $h \in \mathbb{R}$ define $\phi_h(x) = \phi(h + x)$. Then $\phi_h \in L^{\mathrm{step}}(\mathbb{R})$ and $\int \phi_h = \int \phi$.
Proof Clearly if
\[ \phi = \sum_{j=1}^n c_j \chi_{I_j} \]
Note that combining positivity and linearity gives the comparison result which will be critical in what follows: if $\phi, \psi \in L^{\mathrm{step}}(\mathbb{R})$ and $\phi \ge \psi$ then $\int \phi \ge \int \psi$.
exists.
We would like to deduce that the functions $s_n(x)$ must converge to some limit, but we do not know that $s_n(x)$ is bounded, only that the integral $\int s_n$ is bounded. Nevertheless, we can show that $s_n(x)$ converges ‘almost everywhere’, an idea that we now introduce.
First, we will say that a set $A \subset \mathbb{R}$ has ‘measure zero’ (‘zero length’) if, given any $\epsilon > 0$, one can find a (possibly countably infinite) set of intervals $[a_j, b_j]$ that cover $A$ but whose total length is less than $\epsilon$:
\[ A \subset \bigcup_{j=1}^\infty [a_j, b_j] \quad \text{and} \quad \sum_{j=1}^\infty (b_j - a_j) < \epsilon. \]
Exercise 5.1 Show that if $A_j$ has measure zero for all $j = 1, 2, \ldots$ then
\[ \bigcup_{j=1}^\infty A_j \]
also has measure zero. [Hint: $\sum_{j=n+1}^\infty 2^{-j} = 2^{-n}$.]
We will now show that each monotonic sequence $s_n(x)$ with (5.4) uniformly bounded tends pointwise to a function $f(x)$ almost everywhere, i.e. except on a set of measure zero.
Theorem 5.3 Let $\{\phi_n\}$ be an increasing sequence of step functions ($\phi_{n+1}(x) \ge \phi_n(x)$) such that $\int \phi_n \le K$ for all $n$. Then $\phi_n(x)$ converges for almost every $x$.
Proof First, replace {φn } by φ̃n := φn − φ1 . Then φ̃n satisfies the conditions
of the theorem with K replaced by some K ′ , but now φ̃n ≥ 0.
If we can show that φ̃n (x) converges for almost every x then the same
clearly follows for φn (x) = φ̃n (x) + φ1 (x).
We want to show that
E = {x : φ̃n (x) → ∞}
has measure zero. Note that, since φn (x) is non-decreasing for each x, this
is precisely the set of points where φn (x) does not converge.
Fix m > 0, and set
En = {x : φ̃n (x) ≥ K ′ m}.
Then $E_n$ can be covered by a finite collection $I_n$ of disjoint intervals of total length $\le 1/m$: indeed, using (1) and writing $\tilde\phi_n = \sum_j c_j \chi_{K_j}$ with $K_j$ disjoint, let $I_n$ be the collection of those $K_j$ with corresponding $c_j \ge K'm$. Then
\[ K' \ge \int \tilde\phi_n = \sum_j c_j |K_j| \ge \sum_{i \in I_n} c_i |K_i| \ge K'm \sum_{i \in I_n} |K_i|, \]
and so
\[ \sum_{i \in I_n} |K_i| \le \frac{1}{m}. \]
Now, note that En+1 ⊇ En (since φ̃n+1 ≥ φ̃n ); by splitting up the intervals
that occur in In+1 if need be one can ensure that In+1 ⊇ In .
5.2 Increasing sequences of functions in $L^{\mathrm{step}}(\mathbb{R})$: $L^{\mathrm{inc}}(\mathbb{R})$
Since m is arbitrary, it follows from (5.5) and (5.6) that E has zero
measure.
We denote by $L^{\mathrm{inc}}(\mathbb{R})$ the set of all functions that can be arrived at as the almost-everywhere limit of an increasing sequence of step functions. For such a function ($f = \lim_{n\to\infty} s_n$), we define
\[ \int f = \lim_{n \to \infty} \int s_n. \]
Again, we have to check that this definition does not depend on exactly
which sequence {sn } we have chosen.
The proof uses the following technical lemma whose proof, which appeals
to the Heine-Borel Theorem (compactness of [0,1]) we omit.
ψk − f ≤ g − f ≤ 0.
as $n \to \infty$. Since
\[ \int \psi_k - \int \phi_n = \int (\psi_k - \phi_n) \le \int (\psi_k - \phi_n)^+, \]
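Proposition 5.6
(i) If $f, g \in L^{\mathrm{inc}}(\mathbb{R})$ and $f \ge g$ then $\int f \ge \int g$.
(ii) If $f, g \in L^{\mathrm{inc}}(\mathbb{R})$ then $f + g \in L^{\mathrm{inc}}(\mathbb{R})$ and $\int (f + g) = \int f + \int g$.
(iii) If $f \in L^{\mathrm{inc}}(\mathbb{R})$ and $\lambda \ge 0$ then $\int \lambda f = \lambda \int f$.
(iv) If $f, g \in L^{\mathrm{inc}}(\mathbb{R})$ then $\max(f, g), \min(f, g) \in L^{\mathrm{inc}}(\mathbb{R})$; in particular, $f^+ = \max(f, 0) \in L^{\mathrm{inc}}(\mathbb{R})$.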
whence
\[ \int g_1 - \int h_1 = \int g_2 - \int h_2. \]
We can now show that the integral of L1 functions has the same properties
as the integral on Lstep (R):
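where both the bracketed terms are in $L^{\mathrm{inc}}(\mathbb{R})$. So $f_1 + \lambda f_2 \in L^1(\mathbb{R})$, and
\begin{align*}
\int (f_1 + \lambda f_2) &= \int (g_1 + \lambda g_2) - \int (h_1 + \lambda h_2) \\
&= \int (g_1 - h_1) + \lambda \int (g_2 - h_2) = \int f_1 + \lambda \int f_2.
\end{align*}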
Proof Write f = g − h with g, h ∈ Linc (R). One can easily check that
and
\[ \int h - \int g = -\int f \le \int |f| = \int \big(\max(g, h) - \min(g, h)\big). \]
By symmetry we need only check one of these, and for this it is sufficient
to show that
If g < h then the LHS is 2g(x) and the RHS is 2h(x), so the inequality
holds; while if g ≥ h the LHS is g + h and the RHS is h + g, so the
inequality holds once more.
This result will imply that there are some functions that can be integrated but nevertheless are not in $L^1(\mathbb{R})$; see the Examples Sheet.
(T) If $f \in L^1(\mathbb{R})$ and $f_d(x) = f(x + d)$ then $f_d \in L^1(\mathbb{R})$ and $\int f_d = \int f$.
Proof Follows immediately from the same property for functions in Linc (R).
We have now defined the space $L^1(\mathbb{R})$ of integrable functions, and shown that the integral as we have defined it has the four properties we started with for the integral on $L^{\mathrm{step}}(\mathbb{R})$.
5.3 The space $L^1(\mathbb{R})$ of integrable functions
There are three fundamental theorems for the Lebesgue integral. The first
is the Monotone Convergence Theorem, which looks like the construction
of the Lebesgue integral, but with a monotone sequence of step functions
replaced by a monotone sequence of integrable functions.
Proof Exercise.
We denote by $L^p(\mathbb{R})$ [$L^p(I)$] the space of functions that are $p$th power integrable [on an interval $I$]. It is natural to try to use the $L^p$ norm
\[ \|f\|_{L^p} = \Big(\int |f(x)|^p\Big)^{1/p} \]
on $L^p(I)$. However, although this is a norm on $\tilde L^p(I)$ [recall this was continuous functions with finite $L^p$ norm], it in fact fails to satisfy property (i) in the definition of a norm – if $\|f\|_{L^p} = 0$ then we only have $f = 0$ almost everywhere.
In fact one can show that C 0 is dense in L1 fairly easily (see Exercises).
We now show that L1 as defined is complete (and so is the completion of
C 0 that we were after).
Proof
(i) Suppose that $f_k \in L^1(\mathbb{R})$ and
\[ K = \sum_{k=1}^\infty \int |f_k| < \infty. \]
for every k.
(ii) We now apply the DCT to $h_n = \sum_{k=1}^n f_k$. Each $h_n \in L^1(\mathbb{R})$, and by
x · y = x1 y1 + · · · + xn yn . (6.1)
Note that
6.1 Inner products and norms
We will soon show that $\|\cdot\|$ defines a norm; we say that it is the norm induced by the inner product $(\cdot, \cdot)$.
Property (iii), the triangle inequality, follows from the Cauchy-Schwarz inequality (6.3), since
\begin{align*}
\|x + y\|^2 &= (x + y, x + y) \\
&= \|x\|^2 + (x, y) + (y, x) + \|y\|^2 \\
&\le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 \\
&= (\|x\| + \|y\|)^2,
\end{align*}
or just |x · y| ≤ |x||y|.
6.3 The relationship between inner products and their norms
Exercise 6.5 The norm on the sequence space $\ell^2$ derived from the inner product $(x, y) = \sum_j x_j \bar y_j$ is
\[ \|x\|_{\ell^2} = \Big(\sum_{j=1}^\infty |x_j|^2\Big)^{1/2}. \]
Obtain the Cauchy-Schwarz inequality for $\ell^2$ using (6.4) and a limiting argument rather than Lemma 6.4.
Norms derived from inner products have one key property in addition to (i)–(iii) of Definition 2.1:
\begin{align*}
\|x + y\|^2 + \|x - y\|^2 &= (x + y, x + y) + (x - y, x - y) \\
&= \|x\|^2 + (y, x) + (x, y) + \|y\|^2 \\
&\quad + \|x\|^2 - (y, x) - (x, y) + \|y\|^2 \\
&= 2(\|x\|^2 + \|y\|^2).
\end{align*}
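As a hedged numerical illustration of this identity (the parallelogram law), the following Python sketch computes the defect $\|x+y\|^2 + \|x-y\|^2 - 2(\|x\|^2 + \|y\|^2)$ for the $\ell^p$ norms on $\mathbb{R}^2$. The defect vanishes for $p = 2$, where the norm comes from the dot product, but not for $p = 1$. The vectors and function names are assumptions chosen for the example.

```python
def p_norm(x: list[float], p: float) -> float:
    """The l^p norm of a finite vector, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def parallelogram_defect(x: list[float], y: list[float], p: float) -> float:
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) in the l^p norm."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (
        p_norm(plus, p) ** 2
        + p_norm(minus, p) ** 2
        - 2.0 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2)
    )


if __name__ == "__main__":
    x, y = [1.0, 0.0], [0.0, 1.0]
    print("p = 2:", parallelogram_defect(x, y, 2.0))
    print("p = 1:", parallelogram_defect(x, y, 1.0))
```

A non-zero defect for some pair of vectors is enough to rule out the existence of an inner product inducing that norm.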
Exercise 6.7 Show that there is no inner product on $C^0([0,1])$ which induces the sup or $L^1$ norms,
\[ \|f\|_\infty = \sup_{x \in [0,1]} |f(x)| \quad \text{or} \quad \|f\|_{L^1} = \int_0^1 |f(x)|\,dx. \]
Given a norm that is derived from an inner product, one can reconstruct
the inner product as follows:
while if V is complex
Proof Once again, rewrite the right-hand sides as inner products, multiply
out, and simplify.
Lemma 6.9 If $V$ is an inner product space with inner product $(\cdot, \cdot)$ and derived norm $\|\cdot\|$, then $x_n \to x$ and $y_n \to y$ implies that
\[ (x_n, y_n) \to (x, y). \]
Proof Since $x_n$ and $y_n$ converge, $\|x_n\|$ and $\|y_n\|$ are bounded (the proof is a simple exercise). Then
This lemma is extremely useful: that we can swap limits and inner products means that if
\[ \sum_{j=1}^n x_j \]
converges (so that $\sum_{j=1}^n x_j \to x = \sum_{j=1}^\infty x_j$) then
\[ \Big(\sum_{j=1}^\infty x_j, y\Big) = \sum_{j=1}^\infty (x_j, y), \]
(the complex conjugate is redundant if $K = \mathbb{R}$); and $L^2(I)$ with inner product and norm
\[ (f, g) = \int_I f(x)\overline{g(x)}\,dx, \qquad \|f\|_{L^2} = \Big(\int_I |f(x)|^2\,dx\Big)^{1/2}. \]
From now on we will assume unless explicitly stated otherwise that all the above spaces are equipped with their standard inner product (and corresponding norm).
7
Orthonormal bases in Hilbert spaces
Our aim in this chapter is to discuss orthonormal bases for Hilbert spaces.
In contrast to the Hamel basis we considered earlier, we are now going to
allow infinite linear combinations of basis elements (called a Schauder basis).
Definition 7.1 Two elements x and y of an inner product space are said
to be orthogonal if (x, y) = 0. (We sometimes write x ⊥ y.)
Note that this definition does not require E to be countable. Note also
7.1 Orthonormal sets
one can take the inner product with each {ej } in turn to show that αj = 0
for j = 1, . . . , n.
ej = (0, 0, . . . , 1, . . . , 0, . . .)
for any $n, m$
\[ \int_{-\pi}^{\pi} \cos nt\,dt = \int_{-\pi}^{\pi} \sin nt\,dt = \int_{-\pi}^{\pi} \sin nt \cos mt\,dt = 0; \]
Proof Use induction and the Pythagorean property (7.1), noting that
\[ \Big(\sum_{j=1}^{n-1} \alpha_j e_j, \alpha_n e_n\Big) = 0. \]
Lemma 7.6 Let $(\cdot, \cdot)$ be any inner product on a vector space $V$ of dimension $n$. Then there exists an orthonormal basis $\{e_j\}_{j=1}^n$ of $V$.
It follows that in some sense the dot product (6.1) is the canonical inner product on a finite-dimensional space. Indeed, with respect to any orthonormal basis $\{e_j\}$ the inner product $(\cdot, \cdot)$ has the form (6.1), i.e.
\[ \Big(\sum_{j=1}^n x_j e_j, \sum_{k=1}^n y_k e_k\Big) = \sum_{j,k=1}^n x_j \bar y_k (e_j, e_k) = x_1 \bar y_1 + \cdots + x_n \bar y_n. \]
We now formalise our notion of a basis for a Hilbert space. Note that at
present we do not require E to be countable.
for some αj ∈ K, ej ∈ E.
(Note that if $E$ is a basis in the sense of Definition 7.8, i.e. the expansion in terms of the $e_j$ is unique, then $E$ is linearly independent, since if
\[ 0 = \sum_{j=1}^n \alpha_j e_j \]
there is a unique expansion for zero and so we must have $\alpha_j = 0$ for all $j = 1, \ldots, n$.)
† There is a subtlety here. If $E$ is countable then one can assume that $E = \{e_j\}_{j=1}^\infty$, i.e. that the elements of $E$ are specified in a particular order. In this case the ‘uniqueness’ is clear. But if $E$ is uncountable ‘uniqueness’ means that one does not care about the order of the summation in (7.2), and so certainly this order should not affect the value of the sum itself. This requires proof: Suppose that $\{w_j\}$ is a rearrangement of the $\{e_j\}$. Set $\alpha_n = (x, e_n)$, $\beta_m = (x, w_m)$,
\[ x_1 = \sum_{n=1}^\infty \alpha_n e_n \quad \text{and} \quad x_2 = \sum_{m=1}^\infty \beta_m w_m. \]
Then (by Lemma 6.9) it follows that $(x_1, e_n) = (x, e_n)$ and $(x_2, w_m) = (x, w_m)$. Since $e_n = w_m$ for some $m = m(n)$, it follows that
and so $x_1 = x_2$, as required.
\[ x = \sum_{j=1}^\infty \alpha_j e_j \quad \text{and} \quad x = \sum_{k=1}^\infty \beta_k f_k \]
Write $E_1 = \{e_j\}_{j=1}^\infty$ and $E_2 = \{f_j\}_{j=1}^\infty$. Set $U = E_1 \cap E_2$, $E_1' = E_1 \setminus U$, and $E_2' = E_2 \setminus U$. In other words, $U$ consists of those basis elements common to $\{e_j\}$ and $\{f_j\}$, $E_1'$ is those occurring only in $\{e_j\}$, and $E_2'$ those occurring only in $\{f_j\}$. All these sets are at most countable.
It follows that
\[ \sum_{u \in U} (\alpha_u - \beta_u) u + \sum_{e \in E_1'} \alpha_e e - \sum_{f \in E_2'} \beta_f f = 0. \]
Taking the inner product of this with any $e \in E_1'$ shows that $\alpha_e = 0$, while the inner product with any $f \in E_2'$ implies that $\beta_f = 0$. So in fact $E_1 = E_2$, and the inner product with any $u \in U$ then shows that $\alpha_u = \beta_u$, and so the expansion is unique. (In each of these steps we use Lemma 6.9 to swap the order of the inner product and summation.)
To find the coefficients $\alpha_j$, simply take the inner product with some $e_k$ to give
\[ (x, e_k) = \Big(\sum_{j=1}^\infty \alpha_j e_j, e_k\Big) = \sum_{j=1}^\infty \alpha_j (e_j, e_k) = \alpha_k, \]
and so we would expect $\alpha_k = (x, e_k)$. (Note that if $x = \sum_j \alpha_j e_j$ then this manipulation is rigorous, using Lemma 6.9 to change the order of the inner product and the sum.) So if $E$ is an orthonormal basis we would expect to obtain the expansion
\[ x = \sum_{j=1}^\infty (x, e_j) e_j. \]
Assuming that the Pythagoras result of Lemma 7.5 holds for infinite sums, we would expect that
\[ \sum_{j=1}^\infty |(x, e_j)|^2 = \|x\|^2. \]
In some ways this says that ‘the $\{e_j\}$ capture all directions in $H$’. Presumably if $\{e_j\}$ do not form an orthonormal basis we should be able to find an $x$ such that
\[ \sum_{j=1}^\infty |(x, e_j)|^2 < \|x\|^2. \]
We have not proved any of this yet, since we are assuming that (7.2) holds;
but it motivates the following lemma, whose result is known as Bessel’s
inequality.
Clearly
\[ \|x_k\|^2 = \sum_{j=1}^k |(x, e_j)|^2 \]
and so we have
\begin{align*}
\|x - x_k\|^2 &= (x - x_k, x - x_k) \\
&= \|x\|^2 - (x_k, x) - (x, x_k) + \|x_k\|^2 \\
&= \|x\|^2 - \sum_{j=1}^k (x, e_j)(e_j, x) - \sum_{j=1}^k \overline{(x, e_j)}(x, e_j) + \|x_k\|^2 \\
&= \|x\|^2 - \|x_k\|^2.
\end{align*}
It follows that
\[ \sum_{j=1}^k |(x, e_j)|^2 = \|x_k\|^2 \le \|x\|^2 - \|x - x_k\|^2 \le \|x\|^2. \]
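The computation above can be checked on a small example. The following Python sketch works in $\mathbb{R}^3$ with the dot product and an orthonormal set of two vectors that is not a basis, so the inequality is strict. The vectors and function names are assumptions chosen for illustration.

```python
import math


def inner(x: list[float], y: list[float]) -> float:
    """Real dot product, standing in for the inner product (x, y)."""
    return sum(a * b for a, b in zip(x, y))


def bessel_sum(x: list[float], orthonormal: list[list[float]]) -> float:
    """Sum of |(x, e_j)|^2 over the given orthonormal set."""
    return sum(inner(x, e) ** 2 for e in orthonormal)


def partial_expansion(x: list[float], orthonormal: list[list[float]]) -> list[float]:
    """x_k = sum_j (x, e_j) e_j."""
    out = [0.0] * len(x)
    for e in orthonormal:
        c = inner(x, e)
        out = [o + c * ei for o, ei in zip(out, e)]
    return out


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    es = [[1.0, 0.0, 0.0], [0.0, s, s]]
    x = [3.0, 1.0, 2.0]
    xk = partial_expansion(x, es)
    residual = [a - b for a, b in zip(x, xk)]
    # ||x_k||^2 + ||x - x_k||^2 should equal ||x||^2.
    print(bessel_sum(x, es), inner(residual, residual), inner(x, x))
```

Here the missing direction $(0, 1, -1)/\sqrt{2}$ accounts for the gap between $\sum_j |(x, e_j)|^2$ and $\|x\|^2$.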
We can give an interesting corollary about the coefficients $(x, e)$ when $E$ is an uncountable set.
\[ \{e \in E : (x, e) \neq 0\} \]
Then this set can have no more than $m^2 \|x\|^2$ elements. Indeed, if $E_m$ has $N > m^2 \|x\|^2$ elements, one can select $N$ elements $\{e_1, \ldots, e_N\}$ from $E_m$, and then
\[ \sum_{j=1}^N |(x, e_j)|^2 \ge N \times \frac{1}{m^2} > \|x\|^2. \]
7.2 Convergence and orthonormality in Hilbert spaces
But this contradicts Bessel’s inequality. Thus each $E_m$ contains only a finite number of elements, and hence
\[ \bigcup_{m=1}^\infty E_m = \{e \in E : (x, e) \neq 0\} \]
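We now use Bessel’s inequality to give a simple criterion for the convergence of a sum $\sum_{j=1}^\infty \alpha_j e_j$ when the $\{e_j\}$ are orthonormal.
and then
\[ \Big\| \sum_{n=1}^\infty \alpha_n e_n \Big\|^2 = \sum_{n=1}^\infty |\alpha_n|^2. \tag{7.3} \]
We could rephrase this as: $\sum_{n=1}^\infty \alpha_n e_n$ converges iff $\alpha = (\alpha_1, \alpha_2, \ldots) \in \ell^2$.
Proof Suppose that $\sum_{j=1}^n \alpha_j e_j$ converges to $x$ as $n \to \infty$; then
\[ \Big\| \sum_{j=1}^n \alpha_j e_j \Big\|^2 = \sum_{j=1}^n |\alpha_j|^2 \]
converges.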
In fact one can use Corollary 7.10 to deduce that for any orthonormal set
E,
X
(x, e)e
e∈E
converges (we have already seen that this is independent of the order of
summation).
The same results hold for a general orthonormal set E, but we stick to the
countable case for simplicity of presentation.
Clearly if $k \le n$ we have
\[ \Big(\sum_{j=1}^n \alpha_j e_j, e_k\Big) = \alpha_k, \]
and using the properties of the inner product of limits we obtain $\alpha_k = (x, e_k)$ and hence (a) holds. The same argument shows that if we assume (a) then this expansion is unique and so $E$ is a basis.
We first show that (a) ⇒ (b) ⇒ (c) ⇒ (a), and then that (a) ⇒ (d) and
(d) ⇒ (c).
(a) ⇒ (b) is immediate from (2.16).
(b) ⇒ (c) is immediate since kxk = 0 implies that x = 0.
(c) ⇒ (a) Take $x \in H$ and let
\[ y = x - \sum_{j=1}^\infty (x, e_j) e_j. \]
For each $m \in \mathbb{N}$ we have, using Lemma 6.9 (continuity of the inner product),
\begin{align*}
(y, e_m) &= (x, e_m) - \lim_{n \to \infty} \Big(\sum_{j=1}^n (x, e_j) e_j, e_m\Big) \\
&= 0
\end{align*}
since eventually $n \ge m$. It follows from (c) that $y = 0$, i.e. that
\[ x = \sum_{j=1}^\infty (x, e_j) e_j \]
as required.
(a) ⇒ (d) is clear, since given any $x$ and $\epsilon > 0$ there exists an $n$ such that
\[ \Big\| \sum_{j=1}^n (x, e_j) e_j - x \Big\| < \epsilon. \]
ej = (0, 0, . . . , 1, . . . , 0, . . .)
(with the 1 in the $j$th position), is an orthonormal basis for $\ell^2$, since it is clear that if $(x, e_j) = x_j = 0$ for all $j$ then $x = 0$.
Example 7.15 The sine and cosine functions given in example 7.4 are an
o-n basis for L2 (−π, π).
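The orthogonality relations behind this example can be checked numerically. The Python sketch below approximates the real $L^2(-\pi, \pi)$ inner product by a midpoint rule and applies it to the unnormalised functions $\cos nt$ and $\sin nt$; the normalising constants used in Example 7.4 are not reproduced here, and the function names and quadrature rule are assumptions.

```python
import math


def inner_l2(f, g, n: int = 4096) -> float:
    """Midpoint-rule approximation of the real L^2(-pi, pi) inner product of f and g."""
    h = 2.0 * math.pi / n
    return h * sum(
        f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h) for i in range(n)
    )


def cos_n(n: int):
    """The function t -> cos(n t)."""
    return lambda t: math.cos(n * t)


def sin_n(n: int):
    """The function t -> sin(n t)."""
    return lambda t: math.sin(n * t)


if __name__ == "__main__":
    print(inner_l2(cos_n(2), sin_n(3)))  # distinct functions: approximately 0
    print(inner_l2(cos_n(1), cos_n(4)))  # distinct functions: approximately 0
    print(inner_l2(sin_n(2), sin_n(2)))  # squared norm: approximately pi
```

The computed squared norm of $\sin nt$ or $\cos nt$ (for $n \ge 1$) is close to $\pi$, so dividing these functions by $\sqrt{\pi}$ gives unit vectors; the constant function needs the factor $1/\sqrt{2\pi}$ instead.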
\[ \|e_i - e_j\|^2 = 2. \]
The {ej } form a sequence in the unit ball that can have no convergent
subsequence.
8
Closest points and approximation
Which gives
\[ \|2x - (a_n + a_m)\|^2 + \|a_n - a_m\|^2 < 4\delta^2 + \frac{2}{m} + \frac{2}{n} \]
or
\[ \|a_n - a_m\|^2 \le 4\delta^2 + \frac{2}{m} + \frac{2}{n} - 4\|x - \tfrac12(a_n + a_m)\|^2. \]
and so X ⊥ is closed.
Note that Proposition 7.13 shows that E is a basis for H iff E ⊥ = {0}
(since this is just a rephrasing of (c): (u, ej ) = 0 for all j implies that u = 0).
one has
\[ (y, u) = \Big(y, \sum_{j=1}^n \alpha_j e_j\Big) = \sum_{j=1}^n \bar\alpha_j (y, e_j) = 0 \]
In fact one also has E ⊥ = (clin(E))⊥ . Recall that the closed linear span
of E, clin(E) is given by
as required.
x=u+v with u ∈ U, v ∈ U ⊥,
PU x = u
since u1 − u2 ∈ U and v2 − v1 ∈ U ⊥ .
If PU x denotes the closest point to x in U then clearly PU2 = PU , and it
follows from the definition of u that
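Proof Consider $x - \sum_j \alpha_j e_j$. Then
\[
\begin{aligned}
\Big\| x - \sum_j \alpha_j e_j \Big\|^2
&= \|x\|^2 - \sum_j (x, \alpha_j e_j) - \sum_j (\alpha_j e_j, x) + \sum_j |\alpha_j|^2 \\
&= \|x\|^2 - \sum_j \bar{\alpha}_j (x, e_j) - \sum_j \alpha_j \overline{(x, e_j)} + \sum_j |\alpha_j|^2 \\
&= \|x\|^2 - \sum_j |(x, e_j)|^2 + \sum_j \Big[ |(x, e_j)|^2 - \bar{\alpha}_j (x, e_j) - \alpha_j \overline{(x, e_j)} + |\alpha_j|^2 \Big] \\
&= \|x\|^2 - \sum_j |(x, e_j)|^2 + \sum_j |(x, e_j) - \alpha_j|^2,
\end{aligned}
\]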
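The identity above can be checked numerically in a small case. The following is a minimal sketch, assuming $H = \mathbb{R}^3$ with the Euclidean inner product and an orthonormal pair $e_1, e_2$; the vector `x`, the coefficients `other`, and the helper names are illustrative choices, not part of the notes.

```python
def inner(u, v):
    """Euclidean inner product on R^3 (stand-in for the inner product of H)."""
    return sum(a * b for a, b in zip(u, v))

x = [3.0, -1.0, 2.0]
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # orthonormal set spanning a plane

def dist_sq(alpha):
    """||x - sum_j alpha_j e_j||^2, computed directly."""
    approx = [sum(a * ej[i] for a, ej in zip(alpha, e)) for i in range(3)]
    diff = [xi - ai for xi, ai in zip(x, approx)]
    return inner(diff, diff)

def identity_rhs(alpha):
    """||x||^2 - sum_j |(x,e_j)|^2 + sum_j |(x,e_j) - alpha_j|^2."""
    coeffs = [inner(x, ej) for ej in e]
    return (inner(x, x) - sum(c * c for c in coeffs)
            + sum((c - a) ** 2 for c, a in zip(coeffs, alpha)))

best = [inner(x, ej) for ej in e]   # the choice alpha_j = (x, e_j)
other = [2.5, 0.0]                  # any other choice of coefficients
```

Since the last sum in the identity is non-negative and vanishes only when $\alpha_j = (x, e_j)$, `dist_sq(best)` should not exceed `dist_sq(alpha)` for any other `alpha`.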
for every k ∈ N.
Proof First omit all elements of {en } which can be written as a linear
combination of the preceding ones.
Now suppose that we already have an orthonormal set (ẽ1 , . . . , ẽn ) whose
span is the same as (e1 , . . . , en ). Then we can define ẽn+1 by setting
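\[ e'_{n+1} = e_{n+1} - \sum_{i=1}^{n} (e_{n+1}, \tilde{e}_i)\tilde{e}_i \quad\text{and}\quad \tilde{e}_{n+1} = \frac{e'_{n+1}}{\|e'_{n+1}\|}. \]
The span of $(\tilde{e}_1, \ldots, \tilde{e}_{n+1})$ is clearly the same as the span of $(\tilde{e}_1, \ldots, \tilde{e}_n, e_{n+1})$,
which is the same as the span of $(e_1, \ldots, e_n, e_{n+1})$ using the induction hypothesis.
Clearly $\|\tilde{e}_{n+1}\| = 1$ and for $m \le n$ we have
\[ (\tilde{e}_{n+1}, \tilde{e}_m) = \frac{1}{\|e'_{n+1}\|} \Big[ (e_{n+1}, \tilde{e}_m) - \sum_{i=1}^{n} (e_{n+1}, \tilde{e}_i)(\tilde{e}_i, \tilde{e}_m) \Big] = 0 \]
since $(\tilde{e}_1, \ldots, \tilde{e}_n)$ are orthonormal. Setting $\tilde{e}_1 = e_1/\|e_1\|$ starts the induction.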
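The induction in this proof translates directly into a procedure. The following is a minimal sketch for vectors in $\mathbb{R}^n$ with the Euclidean inner product; the function names, the tolerance `tol` used to detect linear dependence, and the sample vectors are assumptions made for illustration.

```python
import math

def inner(u, v):
    """Euclidean inner product on R^n (stand-in for the inner product of H)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise `vectors`, omitting any that depend on the preceding ones."""
    basis = []
    for e in vectors:
        # e' = e - sum_i (e, e~_i) e~_i
        e_prime = list(e)
        for q in basis:
            c = inner(e, q)
            e_prime = [a - c * b for a, b in zip(e_prime, q)]
        norm = math.sqrt(inner(e_prime, e_prime))
        if norm > tol:   # a (numerically) zero remainder means e was dependent
            basis.append([a / norm for a in e_prime])
    return basis

# The third vector is the sum of the first two, so it should be omitted.
vectors = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
basis = gram_schmidt(vectors)
```

Rather than removing dependent elements in a separate first pass, as the proof does, the sketch detects them on the fly: a vector in the span of its predecessors leaves a remainder $e'$ of norm zero (up to rounding).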
Exercise 8.12 Find the best approximation (w.r.t. L2 (−1, 1) norm) of sin x
by a third degree polynomial.
Exercise 8.13 Find the first four polynomials that are orthogonal on L2 (0, 1)
with respect to the usual L2 inner product.
9
Separable Hilbert spaces and ℓ2
\[ \|x_j - u\| < \epsilon. \]
x = (x1 , . . . , xn , 0, 0, 0, . . .)
We now show that $C^0([0, 1])$ is separable, by proving the Weierstrass
approximation theorem: every continuous function can be approximated
arbitrarily closely (in the supremum norm) by a polynomial.
Theorem 9.4 Let f (x) be a real-valued continuous function on [0, 1]. Then
the sequence of polynomials
\[ P_n(x) = \sum_{p=0}^{n} \binom{n}{p} f(p/n)\, x^p (1 - x)^{n-p} \]
Therefore
\[
\begin{aligned}
\sum_{p=0}^{n} (p - nx)^2 r_p(x) &= n^2 x^2 \sum_{p=0}^{n} r_p(x) - 2nx \sum_{p=0}^{n} p\, r_p(x) + \sum_{p=0}^{n} p^2 r_p(x) \\
&= n^2 x^2 - 2nx \cdot nx + (nx + n(n-1)x^2) \\
&= nx(1 - x).
\end{aligned}
\]
One could also state this as: the set of polynomials is dense in C 0 ([0, 1])
equipped with the supremum norm.
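To see the approximation at work, one can evaluate the polynomials $P_n$ of Theorem 9.4 numerically. The following is a rough sketch: the test function $f(x) = |x - 1/2|$, the grid size, and the helper names are assumptions chosen for illustration, and the supremum norm is only estimated on a finite grid.

```python
import math

def bernstein(f, n, x):
    """Evaluate P_n(x) = sum_p C(n,p) f(p/n) x^p (1-x)^(n-p)."""
    return sum(math.comb(n, p) * f(p / n) * x**p * (1 - x)**(n - p)
               for p in range(n + 1))

def sup_error(f, n, samples=201):
    """Estimate the supremum-norm error on a uniform grid in [0, 1]."""
    grid = [k / (samples - 1) for k in range(samples)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

def f(x):
    """Continuous on [0, 1] but not differentiable at 1/2."""
    return abs(x - 0.5)

errors = {n: sup_error(f, n) for n in (4, 16, 64, 256)}
```

The estimated errors shrink as $n$ grows, although slowly for this non-smooth $f$; the theorem guarantees uniform convergence but says nothing about the rate.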
While the set of all polynomials is not countable, the set of all polynomials
with rational coefficients is. Since
\[ \Big\| \sum_{n=1}^{N} a_n x^n - \sum_{n=1}^{N} b_n x^n \Big\|_\infty \le \sum_{n=1}^{N} |a_n - b_n|, \]
one can choose $b_n \in \mathbb{Q}$ such that $|a_n - b_n| < \epsilon/2N$, and then
\[ \Big\| f - \sum_{n=1}^{N} b_n x^n \Big\|_\infty < \epsilon. \]
If we now use the fact that C 0 ([0, 1]) is dense in L2 (0, 1), it follows that:
Proof Take $f \in L^2(0, 1)$. Given $\epsilon > 0$ there exists a $g \in C^0([0, 1])$ such that
$\|f - g\|_{L^2} < \epsilon/2$. We know from above that there exists a polynomial $h$ with
rational coefficients such that
Since
\[ \|g - h\|_{L^2}^2 = \int_0^1 |g(x) - h(x)|^2 \, dx \le \int_0^1 \|g - h\|_\infty^2 \, dx = \|g - h\|_\infty^2, \]
it follows that
Note that this shows immediately that the unit ball in an infinite-dimensional
separable Hilbert space is not compact.
Note that there are Hilbert spaces that are not separable. For example, if
Γ is uncountable then the space ℓ2 (Γ) consisting of all functions f : Γ → R
such that
\[ \sum_{\gamma \in \Gamma} |f(\gamma)|^2 < \infty \]
10.1 Bounded linear maps 79
It follows that
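\[ \|Tz\| = \Big\| \frac{\|z\|}{\delta}\, T\Big( \frac{\delta z}{\|z\|} \Big) \Big\| = \frac{\|z\|}{\delta} \Big\| T\Big( \frac{\delta z}{\|z\|} \Big) \Big\| \le \frac{1}{\delta} \|z\|, \]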
and so T is bounded.
Proof Let us denote by $\|A\|_1$ the value defined in (10.2), and by $\|A\|_2$ the
value defined in (10.3). Then given $x \ne 0$ we have
\[ \Big\| A \frac{x}{\|x\|_X} \Big\|_Y \le \|A\|_2 \quad\text{i.e.}\quad \|Ax\|_Y \le \|A\|_2 \|x\|_X, \]
and
\[ \|A\|_{B(X,Y)} = \sup_{x \ne 0} \frac{\|Ax\|_Y}{\|x\|_X}. \tag{10.4} \]
When there is no room for confusion we will omit the $B(X, Y)$ subscript
on the norm, sometimes adding the subscript “op” (for “operator”) to make
things clearer ($\|\cdot\|_{\rm op}$).
If $T : X \to Y$ then in order to find $\|T\|_{\rm op}$ one can try the following: first
show that
\[ \|Tx\|_Y \le M \|x\|_X \tag{10.5} \]
for some $M > 0$, i.e. show that $T$ is bounded. It is then clear that $\|T\|_{\rm op} \le M$
(since $\|T\|_{\rm op}$ is the infimum of all $M$ such that (10.5) holds). Then, in order
to show that in fact $\|T\|_{\rm op} = M$, find an example of a particular non-zero $z \in X$ such
that
\[ \|Tz\|_Y = M \|z\|_X. \]
This shows from the definition in (10.4) that $\|T\|_{\rm op} \ge M$ and hence that in
fact $\|T\|_{\rm op} = M$.
Example 10.7 Consider the right and left shift operators on ℓ2 , σr and σl
given by
σr (x) = (0, x1 , x2 , . . .) and σl (x) = (x2 , x3 , x4 , . . .).
Both operators are clearly linear. We have
\[ \|\sigma_r(x)\|_{\ell^2}^2 = \sum_{i=1}^{\infty} |x_i|^2 = \|x\|_{\ell^2}^2, \]
then we have
\[ \|\sigma_l(x)\|_{\ell^2}^2 = \sum_{j=2}^{\infty} |x_j|^2 = \|x\|_{\ell^2}^2, \]
Example 10.8 Consider the space L2 (a, b) with −∞ < a < b < +∞ and
the multiplication operator T from L2 (a, b) into itself given by
T x(t) = f (t)x(t) t ∈ [a, b]
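and so
\[ \|Tx\|_{L^2} \le \|f\|_\infty \|x\|_{L^2}, \]
i.e. $\|T\|_{\rm op} \le \|f\|_\infty$.
Now let $s$ be a point at which $|f|$ is maximum. Assume for simplicity that
$s \in (a, b)$, and for each $\epsilon > 0$ consider
\[ x_\epsilon(t) = \begin{cases} 1 & |t - s| < \epsilon \\ 0 & \text{otherwise,} \end{cases} \]
then
\[ \frac{\|Tx_\epsilon\|^2}{\|x_\epsilon\|^2} = \frac{1}{2\epsilon} \int_{s-\epsilon}^{s+\epsilon} |f(t)|^2 \, dt \to |f(s)|^2 \quad\text{as}\quad \epsilon \to 0 \]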
82 Linear maps between Banach spaces
\[ \|T\|_{\rm op} = \|f\|_\infty. \]
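The limiting argument in Example 10.8 can be illustrated numerically. The following is a minimal sketch, assuming the particular choice $f(t) = \cos t$ on $(a, b) = (-1, 1)$, so that $|f|$ is maximal at $s = 0$ with $|f(s)|^2 = 1$; the quadrature rule and the helper name `ratio` are illustrative assumptions.

```python
import math

def ratio(f, s, eps, steps=10_000):
    """Midpoint-rule value of (1/(2 eps)) * integral over (s-eps, s+eps) of |f|^2.

    This is ||T x_eps||^2 / ||x_eps||^2 for the indicator x_eps of (s-eps, s+eps).
    """
    h = 2 * eps / steps
    total = sum(abs(f(s - eps + (k + 0.5) * h)) ** 2 for k in range(steps))
    return total * h / (2 * eps)

f, s = math.cos, 0.0                     # assumed example: |f| is maximal at s = 0
ratios = [ratio(f, s, eps) for eps in (0.5, 0.1, 0.02)]
```

As $\epsilon$ decreases the ratios increase towards $|f(s)|^2 = 1$ without exceeding it, which is the sense in which the bound $\|T\|_{\rm op} \le \|f\|_\infty$ is attained in the limit.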
Example 10.9 Consider the map from L2 (a, b) into itself given by the in-
tegral
\[ (Tx)(t) = \int_a^b K(t, s) x(s) \, ds \quad\text{for all } t \in [a, b] \]
where
\[ \int_a^b \int_a^b |K(t, s)|^2 \, ds \, dt < +\infty. \]
and so
\[ \|T\|_{\rm op}^2 \le \int_a^b \int_a^b |K(t, s)|^2 \, ds \, dt. \]
Note that this upper bound on the operator norm can be strict, see examples.
We now show that for every fixed x ∈ X the sequence {An x} is Cauchy in
Y . This follows since
An x → y,
Ker T = {x ∈ X : T x = 0}
For any f ∈ U ∗ ,
\[ \|f\|_{U^*} = \sup_{\|u\| = 1} |f(u)|. \]
Example 11.1 Take U = C 0 ([a, b]), and consider δx defined for x ∈ [a, b]
by
δx (f ) = f (x) for all f ∈ U.
Then
\[ |\delta_x(f)| = |f(x)| \le \|f\|_\infty, \]
(Note: this shows that – at least for this particular choice of U – knowledge
of T (f ) for all T ∈ U ∗ determines f ∈ U . This result is in fact true in
general.)
Example 11.2 Let U be the real vector space L2 (a, b), and take φ ∈ C 0 ([a, b]).
86 The Riesz representation theorem and the adjoint operator
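Consider
\[ f(u) = \int_a^b \phi(t) u(t) \, dt. \]
Then
\[
\begin{aligned}
|f(u)| &= \Big| \int_a^b \phi(t) u(t) \, dt \Big| \\
&= |(\phi, u)_{L^2}| \\
&\le \|\phi\|_{L^2} \|u\|_{L^2} \quad\text{using the Cauchy-Schwarz inequality,}
\end{aligned}
\]
and so $f \in U^*$ with
\[ \|f\| \le \|\phi\|_{L^2}. \]
If we choose $u = \phi / \|\phi\|_{L^2}$ then $\|u\|_{L^2} = 1$ and
\[ |f(u)| = \int_a^b \frac{|\phi(t)|^2}{\|\phi\|_{L^2}} \, dt = \|\phi\|_{L^2} \]
and so $\|f\| = \|\phi\|_{L^2}$.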
Exercise 11.3 Let $U$ be $C^0([a, b])$ and for some $\phi \in U$ consider $f_\phi$ defined
as
\[ f_\phi(u) = \int_a^b \phi(t) u(t) \, dt \quad\text{for all } u \in U. \]
Show that $f_\phi \in U^*$ with $\|f_\phi\| \le \int_a^b |\phi(t)| \, dt$. Show that this is in fact an
equality by choosing an appropriate sequence of functions $u_n \in U$ for which
$|f_\phi(u_n)| / \|u_n\|_\infty \to \int_a^b |\phi(t)| \, dt$.
The Riesz Representation Theorem shows that this example can be ‘reversed’,
i.e. every linear functional on $H$ corresponds to some inner product:
and $\|y\|_H = \|f\|_{H^*}$.
f (u)v − f (v)u = 0,
Therefore
f (x) = (x, z)f (z) = (x, f (z)z).
Theorem 11.6 Let H and K be Hilbert spaces and T ∈ B(H, K). Then
there exists a unique operator T ∗ ∈ B(K, H), the adjoint of T , such that
\[ (Tx, y)_K = (x, T^* y)_H \]
for all $x \in H$, $y \in K$. In particular, $\|T^*\|_{B(K,H)} \le \|T\|_{B(H,K)}$.
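In finite dimensions the adjoint is the conjugate transpose, which gives a quick sanity check of the defining identity. The following is a minimal sketch, assuming $H = \mathbb{C}^3$, $K = \mathbb{C}^2$ with inner products that are linear in the first argument and conjugate-linear in the second; the matrix `T`, the vectors `x` and `y`, and the helper names are arbitrary illustrative choices.

```python
def inner(u, v):
    """Inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def adjoint(M):
    """Conjugate transpose of M."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

T = [[1 + 2j, 0j, 3 + 0j],
     [2 - 1j, 1j, 1 + 1j]]            # maps C^3 -> C^2
x = [1 + 1j, 2 + 0j, -1j]             # element of H = C^3
y = [0.5 + 0j, 1 - 2j]                # element of K = C^2

lhs = inner(apply(T, x), y)           # (Tx, y)_K
rhs = inner(x, apply(adjoint(T), y))  # (x, T*y)_H
```

The two numbers `lhs` and `rhs` agree up to rounding, as the theorem requires.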
Exercise 11.9 Show that the adjoint of the integral operator T : L2 (0, 1) →
L2 (0, 1) defined as
\[ (Tx)(t) = \int_0^t K(t, s) x(s) \, ds \]
is given by
\[ (T^* y)(t) = \int_t^1 K(s, t) y(s) \, ds. \]
Proof
(a) Exercise
(b) Clearly
\[ (x, (TR)^* y)_H = (TRx, y)_J = (Rx, T^* y)_K = (x, R^* T^* y)_H. \]
Theorem 11.12 Let H and K be Hilbert spaces and T ∈ B(H, K). Then
(a) $(T^*)^* = T$,
(b) $\|T^*\| = \|T\|$, and
(c) $\|T^* T\| = \|T\|^2$.
Proof
(a) Since $T^* \in B(K, H)$, $(T^*)^* \in B(H, K)$. For all $x \in K$, $y \in H$ we
have
\[
\begin{aligned}
(x, (T^*)^* y)_K &= (T^* x, y)_H \\
&= \overline{(y, T^* x)_H} \\
&= \overline{(Ty, x)_K} \\
&= (x, Ty)_K,
\end{aligned}
\]
i.e. $(T^*)^* y = Ty$ for all $y \in H$, i.e. $(T^*)^* = T$.
11.1 Linear operators from H into H 91
and
σl σr x = σl (0, x1 , x2 , . . .) = (x1 , x2 , x3 , . . .).
Let H be a complex Hilbert space and T ∈ B(H, H), then the point spectrum
of T consists of the set of all eigenvalues,
(0, x1 , x2 , . . .) = λ(x1 , x2 , x3 , . . .)
and so
λx1 = 0, λx2 = x1 , λx3 = x2 , . . . .
94 Spectral Theory I: General theory
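i.e. if
\[ x_2 = \lambda x_1, \quad x_3 = \lambda x_2, \quad x_4 = \lambda x_3, \ \ldots. \]
Given $\lambda \ne 0$ this gives a candidate ‘eigenfunction’
\[ x = (1, \lambda, \lambda^2, \lambda^3, \ldots), \]
which is an element of $\ell^2$ provided that
\[ \sum_{n=0}^{\infty} |\lambda|^{2n} = \frac{1}{1 - |\lambda|^2} < \infty \]
which is the case for any $\lambda$ with $|\lambda| < 1$. It follows that
\[ \{\lambda \in \mathbb{C} : |\lambda| < 1\} \subseteq \sigma_p(\sigma_l). \]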
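A truncated version of this eigenvector can be checked directly. The following is a minimal sketch, assuming a particular $\lambda$ with $|\lambda| < 1$ and a truncation length `N`; both are arbitrary choices, and truncating the sequence is only an approximation to the element of $\ell^2$.

```python
def left_shift(x):
    """sigma_l: drop the first entry of the (truncated) sequence."""
    return x[1:]

lam = 0.5 + 0.3j                    # any lambda with |lambda| < 1
N = 200                             # truncation length (assumption of the sketch)
x = [lam ** n for n in range(N)]    # x = (1, lambda, lambda^2, ...)

# sigma_l(x) should agree with lambda * x entry by entry.
shifted = left_shift(x)
residual = max(abs(s - lam * xi) for s, xi in zip(shifted, x))

# ||x||^2 = sum over n >= 0 of |lambda|^(2n), a geometric series.
norm_sq = sum(abs(xi) ** 2 for xi in x)
closed_form = 1 / (1 - abs(lam) ** 2)
```

The residual is at the level of rounding error, and the truncated norm agrees with the geometric-series value $1/(1 - |\lambda|^2)$ because the discarded tail is negligible.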
Exercise 12.3 Show that if A is linear and A−1 exists then it is linear too.
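Proof If $\lambda \notin \sigma(T)$ then $T - \lambda I$ has a bounded inverse,
\[ (T - \lambda I)(T - \lambda I)^{-1} = I = (T - \lambda I)^{-1}(T - \lambda I). \]
Taking adjoints we obtain
\[ [(T - \lambda I)^{-1}]^* (T^* - \bar{\lambda} I) = I = (T^* - \bar{\lambda} I)[(T - \lambda I)^{-1}]^*, \]
and so $T^* - \bar{\lambda} I$ has a bounded inverse, i.e. $\bar{\lambda} \notin \sigma(T^*)$. Starting instead with
$T^*$ we deduce that $\lambda \notin \sigma(T^*) \Rightarrow \bar{\lambda} \notin \sigma(T)$, which completes the proof.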
We have already seen that any eigenvalue $\lambda$ of $T$ must satisfy $|\lambda| \le \|T\|_{\rm op}$.
We now show that this also holds for any $\lambda \in \sigma(T)$; the argument is more
subtle, and based on considering how to solve the linear equation $(I - T)x = y$.
Proof Since
\[ \|T^n x\| \le \|T\| \|T^{n-1} x\| \]
it follows that $\|T^n\| \le \|T\|^n$. Therefore if we consider
\[ V_n = I + T + \cdots + T^n \]
we have (for $n > m$)
\[
\begin{aligned}
\|V_n - V_m\| &= \|T^{m+1} + \cdots + T^{n-1} + T^n\| \\
&\le \|T^{m+1}\| + \cdots + \|T^{n-1}\| + \|T^n\| \\
&\le \|T\|^{m+1} + \cdots + \|T\|^{n-1} + \|T\|^n \\
&\le \|T\|^{m+1} \frac{1}{1 - \|T\|}.
\end{aligned}
\]
It follows that $\{V_n\}$ is Cauchy in the operator norm, and so converges to
some $V \in B(H, H)$ with
\[ \|V\| \le 1 + \|T\| + \|T\|^2 + \cdots = [1 - \|T\|]^{-1}. \]
Clearly
\[ V(I - T) = (I + T + T^2 + \cdots)(I - T) = (I + T + T^2 + \cdots) - (T + T^2 + T^3 + \cdots) = I \]
and similarly $(I - T)V = I$.
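The partial sums $V_n$ can be computed explicitly for a matrix. The following is a minimal sketch, assuming $H = \mathbb{R}^2$ and a particular $2 \times 2$ matrix `T` whose Frobenius norm (and hence operator norm) is less than 1; the matrix, the number of terms, and the helper names are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Frobenius norm is sqrt(0.30) < 1, which bounds the operator norm.
T = [[0.2, 0.3], [0.1, 0.4]]
I = identity(2)

V, power = identity(2), identity(2)
for _ in range(200):                 # V_n = I + T + ... + T^n with n = 200
    power = matmul(power, T)
    V = add(V, power)

I_minus_T = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]
product = matmul(V, I_minus_T)       # should be close to the identity
```

Here `V` approximates $(I - T)^{-1}$, and `product` differs from the identity only by a term of size about $\|T\|^{201}$ plus rounding.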
As promised:
We now show that the spectrum must also be closed, by showing that its
complement (the resolvent set) is open. To this end, we prove the following
theorem, which shows that the set of bounded linear operators with bounded
inverses defined on all of $H$ is open, i.e. that this property is stable under
perturbation.
\[ \|U\| < \|T^{-1}\|^{-1} \]
\[ T^{-1}(T + U)P^{-1} = P^{-1} T^{-1}(T + U) = I; \]
\[ (T + U)P^{-1} T^{-1} = I \]
and so
\[ (T + U)^{-1} = P^{-1} T^{-1} \]
Proof We show that the resolvent set R(T ), the complement of σ(T ), is
Lemma 12.12 The spectrum of σl and of σr are both equal to the unit disc
in the complex plane:
σ(σ· ) = {λ ∈ C : |λ| ≤ 1}.
Exercise 13.1 Let H be a Hilbert space over R, and define its complexifi-
cation HC as the vector space
HC = {x + iy : x, y ∈ H},
equipped with operations + and ∗ defined via
It follows that
\[ \|x + iy\|_{H_C}^2 = \|x\|^2 + \|y\|^2. \]
100 Spectral theory II: compact self-adjoint operators
Proof Clearly
\[ \|\tilde{T}(x + iy)\|_{H_C}^2 = \|Tx\|^2 + \|Ty\|^2 \le \|T\|_{B(H,H)}^2 (\|x\|^2 + \|y\|^2) = \|T\|_{B(H,H)}^2 \|x + iy\|_{H_C}^2, \]
i.e. λ = λ̄.
Now if λ and µ are distinct eigenvalues with T x = λx and T y = µy then
and so (x, y) = 0.
We will develop our spectral theory for operators that are self-adjoint and
‘compact’ according to the following definition:
as n → ∞, then K is compact.
for all yj in the sequence. For such an n, the sequence Kn (yj ) is Cauchy,
and so there exists an N such that for i, j ≥ N we can guarantee
So now
\[ \|K(y_i) - K(y_j)\|_Y \le \epsilon \quad\text{for all } i, j \ge N, \]
with
\[ \int_a^b \int_a^b |K(x, y)|^2 \, dx \, dy < \infty \]
is compact.
Proof Let $\{\phi_j\}$ be an orthonormal basis for $L^2(a, b)$. It follows that $\{\phi_i(x)\phi_j(y)\}$
is an orthonormal basis for $L^2((a, b) \times (a, b))$. If we write $K(x, y)$ in terms
of this basis we have
\[ K(x, y) = \sum_{i,j=1}^{\infty} k_{ij} \phi_i(x) \phi_j(y), \]
and the sum converges in $L^2((a, b) \times (a, b))$. Since $\{\phi_i(x)\phi_j(y)\}$ is a basis
we have
\[ \|K\|_{L^2((a,b) \times (a,b))}^2 = \int_a^b \int_a^b |K(x, y)|^2 \, dx \, dy = \sum_{i,j=1}^{\infty} |k_{ij}|^2. \tag{13.1} \]
and
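\[ [T_n u](x) = \int_a^b K_n(x, y) u(y) \, dy. \]
If $u \in L^2(a, b)$ is given by $u = \sum_{l=1}^{\infty} c_l \phi_l$, then
\[
\begin{aligned}
(T_n u)(x) &= \sum_{i,j=1}^{n} \sum_{l=1}^{\infty} \int_a^b k_{ij} \phi_i(x) \phi_j(y) c_l \phi_l(y) \, dy \\
&= \sum_{i,j=1}^{n} k_{ij} c_j \phi_i(x).
\end{aligned}
\]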
Tn → T in the operator norm then we can use theorem 13.8 to show that T
is compact.
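This is straightforward, since (using the Cauchy-Schwarz inequality in the $y$ integral)
\[
\begin{aligned}
\|(T - T_n)u\|^2 &= \int_a^b \Big| \int_a^b \big( K(x, y) - K_n(x, y) \big) u(y) \, dy \Big|^2 dx \\
&\le \Big( \int_a^b \int_a^b |K(x, y) - K_n(x, y)|^2 \, dx \, dy \Big) \int_a^b |u(y)|^2 \, dy,
\end{aligned}
\]
i.e.
\[
\begin{aligned}
\|T - T_n\|^2 &\le \int_a^b \int_a^b |K(x, y) - K_n(x, y)|^2 \, dx \, dy \\
&\le \int_a^b \int_a^b \Big| \sum_{i,j=n+1}^{\infty} k_{ij} \phi_i(x) \phi_j(y) \Big|^2 dx \, dy \\
&= \sum_{i,j=n+1}^{\infty} |k_{ij}|^2,
\end{aligned}
\]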
We now show that any compact self-adjoint operator has at least one
eigenvalue. (Recall that σr is not even normal, so this is no contradiction.)
Note that any eigenvalue must satisfy $|\lambda| \le \|T\|_{\rm op}$, since if $Tx = \lambda x$
we have
\[ |\lambda| \|x\|^2 = |(\lambda x, x)| = |(Tx, x)| \le \|Tx\| \|x\| \le \|T\|_{\rm op} \|x\|^2; \]
it follows that $\|T\|_{\rm op} = \sup\{|\lambda| : \lambda \in \sigma_p(T)\}$.
Proof Suppose that $T$ has infinitely many eigenvalues that do not form a
sequence tending to zero. Then for some $\epsilon > 0$ there exists a sequence of
distinct eigenvalues with $|\lambda_n| > \epsilon$. Let $x_n$ be a corresponding sequence of
eigenvectors with $\|x_n\| = 1$; then
\[ \|Tx_n - Tx_m\|^2 = (Tx_n - Tx_m, Tx_n - Tx_m) = |\lambda_n|^2 + |\lambda_m|^2 \ge 2\epsilon^2 \]
since $(x_n, x_m) = 0$. It follows that $\{Tx_n\}$ can have no convergent subsequence,
which contradicts the compactness of $T$.
Now suppose that for some eigenvalue $\lambda$ there exists an infinite number of
linearly independent eigenvectors $\{e_n\}_{n=1}^{\infty}$. Using the Gram-Schmidt process
we can find a countably infinite orthonormal set of eigenvectors, since any
linear combination of the $\{e_j\}$ is still an eigenvector:
\[ T\Big( \sum_{j=1}^{n} \alpha_j e_j \Big) = \sum_{j=1}^{n} \alpha_j T e_j = \lambda \Big( \sum_{j=1}^{n} \alpha_j e_j \Big). \]
Now, we have
\[ \|Te_n - Te_m\| = \|\lambda e_n - \lambda e_m\| = |\lambda| \|e_n - e_m\| = \sqrt{2}\,|\lambda|. \]
Then
\[ 0 = T_n y = Ty = Tx - \sum_{j=1}^{n-1} (x, w_j) T w_j = Tx - \sum_{j=1}^{n-1} \lambda_j (x, w_j) w_j \]
which is (13.2).
If $T_n$ is never zero then consider
\[ y_n := x - \sum_{j=1}^{n-1} (x, w_j) w_j \in H_n. \]
Then we have
\[ \|x\|^2 = \|y_n\|^2 + \sum_{j=1}^{n-1} |(x, w_j)|^2, \]
where T e = λe e.
Now let $F$ be an orthonormal basis for Ker $T$ (this exists since Ker($T$) is
a Hilbert space in its own right, and every Hilbert space has an orthonormal
basis); each $f \in F$ is an eigenvector of $T$ with eigenvalue zero, and since
$Tf = 0$ but $Tw_j = \lambda_j w_j$ with $\lambda_j \ne 0$, we know that $(f, w_k) = 0$ for all $f \in F$,
$k \in \mathbb{N}$. So $F \cup \{w_j\}$ is an orthonormal set in $H$.
Now, (13.3) implies that
\[ T\Big( x - \sum_{j=1}^{\infty} (x, w_j) w_j \Big) = 0, \]
i.e. that $x - \sum_{j=1}^{\infty} (x, w_j) w_j \in$ Ker $T$. It follows that $\{w_j\} \cup F$ is an orthonormal
basis for $H$.
We end this chapter with a corollary of Corollary 13.14 (!) that shows that
the eigenvalues are essentially all of the spectrum of a compact self-adjoint
operator.
Now take $\mu \notin \sigma_p(T)$. For such $\mu$, it follows that there exists a $\delta > 0$ such
that
sup |µ − λ| ≥ δ > 0 for all λ ∈ σp (T )
j
converges, and that $\|x\| \le \delta^{-1} \|y\|$. So $(T - \mu I)^{-1}$ exists and is bounded.
14
Sturm-Liouville problems
We will assume that p(x) > 0 on [a, b] and that q(x) ≥ 0 on [a, b].
Lemma 14.1 Let u1 (x) and u2 (x) be two linearly independent non-zero
solutions of
\[ -\frac{d}{dx}\Big( p(x) \frac{du}{dx} \Big) + q(x) u = 0. \]
Then
Wp (u1 , u2 )(x) := p(x)[u′1 (x)u2 (x) − u′2 (x)u1 (x)]
is a constant.
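\[
\begin{aligned}
W_p' &= p' u_1' u_2 + p u_1'' u_2 + p u_1' u_2' - p' u_1 u_2' - p u_1' u_2' - p u_1 u_2'' \\
&= p'(u_1' u_2 - u_2' u_1) + p(u_1'' u_2 - u_2'' u_1) \\
&= p'(u_1' u_2 - u_2' u_1) + u_2(q u_1 - p' u_1') - u_1(q u_2 - p' u_2') \\
&= 0.
\end{aligned}
\]
\[ u_1' u_2 - u_2' u_1 = 0 \quad\Rightarrow\quad \frac{u_1'}{u_1} = \frac{u_2'}{u_2}, \]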
Theorem 14.2 Suppose that u1 (x) and u2 (x) are two linearly independent
non-zero solutions of
\[ -\frac{d}{dx}\Big( p(x) \frac{dy}{dx} \Big) + q(x) y = 0, \]
Now,
\[
\begin{aligned}
u'(x) &= C u_2(x) u_1(x) f(x) + C u_2'(x) \int_a^x u_1(y) f(y) \, dy - C u_1(x) u_2(x) f(x) + C u_1'(x) \int_x^b u_2(y) f(y) \, dy \\
&= C u_2'(x) \int_a^x u_1(y) f(y) \, dy + C u_1'(x) \int_x^b u_2(y) f(y) \, dy,
\end{aligned}
\]
Now we have L[u] = −pu′′ − p′ u′ + qu, and since L is linear with L[u1 ] =
L[u2 ] = 0 it follows that
L[u] = f (x)
as claimed.
L[u] = λu ⇔ u = λT u.
Since p > 0 on [a, b], it follows that u′ = 0 on [a, b], and so u must be
constant on [a, b]. Since u(a) = 0, it follows that u ≡ 0.
We now show that Ker($T$) $= \{0\}$. Indeed, $Tf$ is the solution of $L[u] = f$,
i.e. f = L[T f ]. So if T f = 0, it follows that f = 0.
So φ is an eigenfunction of the SL problem iff it is an eigenvector for T :
\[ L[\phi] = \lambda \phi \quad\Leftrightarrow\quad T\phi = \frac{1}{\lambda} \phi. \]
Since G(x, y) is symmetric and bounded, it follows from Examples 10.9
and 11.8 that T is a bounded self-adjoint operator; Proposition 13.9 shows
that T is also compact. It follows from Theorem 13.13 that T has a set of
orthonormal eigenfunctions {φj } with
T φj = µj φj ,
and since Ker(T ) = {0} the argument of Corollary 13.14 shows that those
form an orthonormal basis for L2 (a, b).
Comparing this with our original problem we obtain an infinite set of
eigenfunctions $\{\phi_j\}$ with corresponding eigenvalues $\lambda_j = \mu_j^{-1}$. Note that
now $\lambda_j \to \infty$ as $j \to \infty$. As above, the eigenfunctions $\{\phi_j\}$ form an
orthonormal basis of $L^2(a, b)$.
equation will form a basis for L2 (0, 1). These are easily found by elementary
methods, and are
\[ \{\sin k\pi x\}_{k=1}^{\infty}. \]
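The orthogonality of these eigenfunctions in $L^2(0, 1)$ is easy to confirm numerically. The following is a minimal sketch, assuming a midpoint-rule quadrature and the normalising factor $\sqrt{2}$ (since $\int_0^1 \sin^2 k\pi x \, dx = 1/2$); the helper names and the number of quadrature steps are illustrative assumptions.

```python
import math

def l2_inner(u, v, steps=2000):
    """Midpoint-rule approximation of the L^2(0,1) inner product of real u, v."""
    h = 1.0 / steps
    return h * sum(u((k + 0.5) * h) * v((k + 0.5) * h) for k in range(steps))

def phi(k):
    """Normalised eigenfunction sqrt(2) sin(k pi x)."""
    return lambda x: math.sqrt(2) * math.sin(k * math.pi * x)

# Gram matrix of the first four eigenfunctions; it should be the identity.
gram = [[l2_inner(phi(j), phi(k)) for k in range(1, 5)] for j in range(1, 5)]
```

Only orthonormality is tested here; completeness (that the functions form a basis) is what the Sturm-Liouville theory supplies and cannot be seen from finitely many inner products.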