
Douglas Farenick

Functional Analysis

Banach Spaces, Operators, and Algebras


An Introductory Course for Graduate Students

December 28, 2007

University of Regina
Department of Mathematics & Statistics
Regina, Saskatchewan S4S 0A2
Canada
Preface

Herein are lecture notes¹ for an introductory graduate course on functional
analysis. (At the University of Regina, this course is Mathematics 813.) The
prerequisites are real and complex analysis, general topology, linear algebra,
and ring theory.

Douglas Farenick
University of Regina
Regina, Saskatchewan, Canada
December 2007

¹ Incomplete in several places.
Contents

Part I Banach Spaces

1 Banach Spaces
    1.1 Banach Space Definitions
    1.2 Finite-Dimensional Spaces
    1.3 Subspaces and Quotients of Banach Spaces
    1.4 Two Useful Lemmas
    1.5 Baire Category Theorem
    1.6 Linear and Schauder Bases
    1.7 Exercises

2 Banach Spaces in Classical Analysis
    2.1 Convex and Concave Functions
    2.2 Classical Inequalities
    2.3 Topological Vector Spaces
    2.4 Banach Spaces of Integrable Functions
    2.5 Essentially Bounded Measurable Functions
    2.6 Banach Spaces of Continuous Functions
    2.7 Stone–Weierstrass Theorems
    2.8 Separable Banach Spaces
    2.9 L^p(T)
    2.10 Exercises

3 Duality
    3.1 Operators
    3.2 The Hahn–Banach Theorem
    3.3 Complementation of Finite-Dimensional Spaces
    3.4 Weak Topology
    3.5 The Second Dual
    3.6 Weak* Topology
    3.7 Subspaces of C(X)
    3.8 Exercises

4 Dual Spaces in Classical Analysis
    4.1 The Dual of L^p, 1 ≤ p < ∞
    4.2 The Radon–Nikodým Theorem
    4.3 Exercises

5 Convexity
    5.1 Convex Combinations
    5.2 Extreme Points and Faces
    5.3 Geometry of the Closed Unit Ball
    5.4 Separation of Convex Sets by Linear Functionals
    5.5 Exercises

6 Hilbert Spaces
    6.1 Inner Products and Euclidean Norms
    6.2 Distance to Convex Sets
    6.3 Orthogonal Complements
    6.4 Orthonormal Bases for Hilbert Spaces
    6.5 Examples of Separable Hilbert Spaces
    6.6 Hilbert Space Duality
    6.7 Exercises

Part II Operators

7 Operators on Banach Spaces
    7.1 Principle of Uniform Boundedness
    7.2 The Open Mapping Theorem
    7.3 Invertible Operators
    7.4 The Banach Space Adjoint
    7.5 The Spectrum
    7.6 Polynomial Functional Calculus
    7.7 Parts of the Spectrum
    7.8 Examples
    7.9 Exercises

8 Compact Operators on Banach Spaces
    8.1 Compact Operators
    8.2 Properties of Compact Operators
    8.3 The Spectra of Compact Operators
    8.4 Examples
    8.5 Exercises

9 Operators on Hilbert Spaces
    9.1 The Hilbert Space Adjoint
    9.2 Examples
    9.3 Hermitian Operators
    9.4 Normal Operators
    9.5 Continuous Functional Calculus
    9.6 Positive Operators
    9.7 Polar Decomposition
    9.8 Projections and Invariant Subspaces
    9.9 Further Examples
    9.10 Exercises

10 Compact Operators on Hilbert Spaces

Part III Algebras

11 Banach Algebras
    11.1 Banach Algebra Definition
    11.2 Invertible Elements and Spectra
    11.3 Subalgebras
    11.4 Ideals and Quotients
    11.5 Exercises

12 Banach Algebras in Analysis
    12.1 Uniform Algebras
    12.2 The Disc Algebra
    12.3 Absolutely Convergent Trigonometric Series
    12.4 Harmonic Analysis
    12.5 Complex Analysis
    12.6 Exercises

13 C*-algebras
    13.1 C*-algebra Definitions
    13.2 Adjoining a Unit to a Nonunital C*-algebra
    13.3 Gelfand Theory for Unital Abelian C*-algebras
    13.4 C*-subalgebras
    13.5 Continuous Functional Calculus
    13.6 Positive Elements
    13.7 Ideals and Quotients
    13.8 C*-algebra Homomorphisms
    13.9 States
    13.10 Representations
    13.11 Exercises

References

Index
Part I

Banach Spaces
1
Banach Spaces

1.1 Banach Space Definitions


The base field for all vector spaces under consideration is the field C of complex
numbers.
Definition 1.1. A norm on a vector space $V$ is a function $\|\cdot\| : V \to \mathbb{R}$ such
that, for all $v, w \in V$ and $\alpha \in \mathbb{C}$,
1. $\|v\| \ge 0$, and $\|v\| = 0$ if and only if $v = 0$,
2. $\|\alpha v\| = |\alpha|\,\|v\|$,
3. $\|v + w\| \le \|v\| + \|w\|$.
The norm $\|\cdot\|$ on a normed vector space $V$ induces a metric topology (or
norm topology) on $V$ via the metric $d : V \times V \to \mathbb{R}_+$ defined by
\[ d(v_1, v_2) = \|v_1 - v_2\|, \qquad v_1, v_2 \in V. \]
Hence, a subset $U$ of $V$ is an open set if and only if for each $u_0 \in U$ there
exists $\varepsilon > 0$ such that $u \in U$ for every $u \in V$ such that $\|u - u_0\| < \varepsilon$.
Definition 1.2. Let $V$ be a normed vector space, $v_0 \in V$, and $\rho > 0$.
1. The set $B_\rho(v_0) = \{v \in V \mid \|v - v_0\| < \rho\}$ is called the open ball of radius
$\rho$, centre $v_0$.
2. The set $\overline{B}_\rho(v_0) = \{v \in V \mid \|v - v_0\| \le \rho\}$ is called the closed ball of radius
$\rho$, centre $v_0$.
3. The set $S_\rho(v_0) = \{v \in V \mid \|v - v_0\| = \rho\}$ is called the sphere of radius $\rho$,
centre $v_0$.
In particular, the sets $B_1(0)$ and $\overline{B}_1(0)$ are called the open and closed unit
balls, respectively, and $S_1(0)$ is called the unit sphere.
Proposition 1.3. In a normed vector space $V$, every open ball $B_\rho(v_0)$ is an open set
whose topological closure is $\overline{B}_\rho(v_0)$ and whose topological boundary $\partial B_\rho(v_0)$ is
$S_\rho(v_0)$.

Proof. Exercise 1. 
Definition 1.4. A subset $K$ of a topological space $X$ is compact if for every
collection $\{U_\alpha\}_\alpha$ of open sets $U_\alpha \subset X$ that cover $K$ (i.e., $K \subset \bigcup_\alpha U_\alpha$), there
are finitely many $U_{\alpha_1}, \dots, U_{\alpha_n}$ of these sets for which $K \subset \bigcup_{j=1}^{n} U_{\alpha_j}$.
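Definition 1.5. Suppose that $V$ is a normed vector space and that $\{v_k\}_{k\in\mathbb{N}}$
is a sequence of vectors $v_k \in V$.
1. $\{v_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence if for every $\varepsilon > 0$ there exists $N_\varepsilon \in \mathbb{N}$ such
that
\[ \|v_n - v_m\| < \varepsilon, \qquad \forall\, m, n \ge N_\varepsilon. \]
2. $\{v_k\}_{k\in\mathbb{N}}$ is convergent with limit $v \in V$ if for every $\varepsilon > 0$ there exists
$N_\varepsilon \in \mathbb{N}$ such that
\[ \|v - v_n\| < \varepsilon, \qquad \forall\, n \ge N_\varepsilon. \]
3. The series $\sum_{k=1}^{\infty} v_k$ converges to $v \in V$ if
\[ \lim_{n\to\infty} \Big\| v - \sum_{k=1}^{n} v_k \Big\| = 0. \]
In this case we shall write $v = \sum_{k=1}^{\infty} v_k$.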

Because every normed vector space V is a metric space, the compactness
of a subset K ⊂ V can be phrased in terms of sequential compactness [9].
Theorem 1.6. The following statements are equivalent for a subset K of a
normed vector space V :
1. K is compact;
2. Every sequence {vk }k∈N ⊂ K admits a convergent subsequence {vkj }j∈N
with limit v ∈ K.
We now are ready to define the concept of a Banach space.
Definition 1.7. A normed vector space V is a Banach space if every Cauchy
sequence in V is convergent in V .
Of all Banach spaces, those that are separable are often among the most
studied in theory and applications.
Definition 1.8. In a Banach space V ,
1. a subset S ⊂ V is dense in V if for every v ∈ V and for every ε > 0 there
exists $w \in S$ such that $\|v - w\| < \varepsilon$, and
2. V is said to be separable if there is a countable subset S ⊂ V that is dense
in V .

1.2 Finite-Dimensional Spaces

No doubt the most familiar example of a finite-dimensional Banach space is
the Euclidean space $\mathbb{C}^n$. Let $\|\cdot\|_2$ denote the norm of $\mathbb{C}^n$; that is,
\[ \|a\|_2 = \sqrt{\sum_{k=1}^{n} |\alpha_k|^2}, \qquad \forall\, a = (\alpha_1, \dots, \alpha_n) \in \mathbb{C}^n. \]
Because the modulus $|\alpha|$ of $\alpha \in \mathbb{C}$ satisfies $|\alpha|^2 = (\Re\alpha)^2 + (\Im\alpha)^2$, where $\Re\alpha$
and $\Im\alpha$ are the real and imaginary parts of $\alpha$, the Euclidean space $\mathbb{C}^n$ may
be identified with the real Euclidean space $\mathbb{R}^{2n}$.
The Euclidean space Rm has several strong features, of which one of the
most important is given by the Heine–Borel Theorem.

Theorem 1.9. (Heine–Borel) A subset $X$ of the Euclidean space $\mathbb{R}^m$ is compact
if and only if $X$ is closed and bounded.

A subset $X$ of a normed vector space $V$ is bounded if
\[ \sup_{v \in X} \|v\| < \infty. \]

The Heine–Borel Theorem implies that the Euclidean space $\mathbb{C}^n$ is a Banach
space. The reasons are these. Assume that $\{v_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence in the
Euclidean space $\mathbb{C}^n$. Since Cauchy sequences are bounded, there is a bounded
and closed set $X \subset \mathbb{C}^n$ such that $v_k \in X$ for every $k \in \mathbb{N}$. (For example, $X$
could be the set of all $v \in \mathbb{C}^n$ for which $\|v\|_2 \le r$, where $r = \sup_k \|v_k\|_2$.) By
the Heine–Borel Theorem, $X$ is compact, and so every sequence in $X$ admits
a convergent subsequence. In particular, $\{v_k\}_{k\in\mathbb{N}}$ admits a convergent
subsequence $\{v_{k_j}\}_{j\in\mathbb{N}}$; let $v \in \mathbb{C}^n$ denote the limit of this subsequence. However,
because $\{v_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence, a subsequence of it converges only if
the entire sequence converges. Hence, $\|v - v_k\|_2 \to 0$ as $k \to \infty$. In other
words, the Euclidean space $\mathbb{C}^n$ is a Banach space.
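A second observation is that the Euclidean space $\mathbb{C}^n$ is separable. To see
why, note that because $\mathbb{Q}$ is dense in $\mathbb{R}$, the countable set $\mathbb{Q} + i\mathbb{Q}$ is dense in
$\mathbb{R} + i\mathbb{R} = \mathbb{C}$. Likewise, the countable set $(\mathbb{Q} + i\mathbb{Q})^n$ is necessarily dense in $\mathbb{C}^n$.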
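
The density of the rational points can be seen numerically. The following Python sketch is only an illustration and not part of the argument; the helper names `rational_approx` and `euclidean_distance` and the sample vector are hypothetical choices. It replaces each real and imaginary part by a nearby fraction with bounded denominator and reports the Euclidean distance to the original vector.

```python
from fractions import Fraction
import math

def rational_approx(a, max_den):
    """Approximate each coordinate of a complex vector by a point of Q + iQ.

    Each real and imaginary part is replaced by the closest fraction whose
    denominator is at most max_den.
    """
    approx = []
    for z in a:
        re = Fraction(z.real).limit_denominator(max_den)
        im = Fraction(z.imag).limit_denominator(max_den)
        approx.append((re, im))
    return approx

def euclidean_distance(a, approx):
    """Euclidean norm of a - b, where b is the rational vector `approx`."""
    total = 0.0
    for z, (re, im) in zip(a, approx):
        total += (z.real - float(re)) ** 2 + (z.imag - float(im)) ** 2
    return math.sqrt(total)

# A sample vector in C^2 with irrational coordinates.
a = [complex(math.sqrt(2), math.pi), complex(-math.e, 1 / 3)]

# Distances to rational approximations with denominators up to 10, 10^3, 10^6.
errors = [euclidean_distance(a, rational_approx(a, 10 ** k)) for k in (1, 3, 6)]
print(errors)
```

Allowing larger denominators drives the distance toward zero, which is what density of $(\mathbb{Q} + i\mathbb{Q})^n$ in $\mathbb{C}^n$ asserts.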
The importance of these observations lies in the fact that the same con-
clusions are true for any finite-dimensional normed vector space.

Theorem 1.10. Every finite-dimensional normed vector space is a separable
Banach space.

The proof of Theorem 1.10 rests upon the following inequality.

Lemma 1.11. If $\{v_1, \dots, v_n\}$ is a basis for a finite-dimensional normed vector
space $V$, then there are positive constants $c$ and $C$ such that
\[ c \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2} \le \Big\| \sum_{j=1}^{n} \alpha_j v_j \Big\| \le C \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2}, \tag{1.1} \]
for all $\alpha_1, \dots, \alpha_n \in \mathbb{C}$.
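Proof. We will first relate the norm of any element $v = \sum_j \alpha_j v_j \in V$ to the
length of the vector $a$ in the Euclidean space $\mathbb{C}^n$ determined by the coefficients
$\alpha_j$ for $v$. Let
\[ X = \{a \in \mathbb{C}^n \mid \|a\|_2 = 1\}. \]
The set $X$ is closed and bounded in $\mathbb{C}^n$; hence, $X$ is a compact set. The
continuous function $f : X \to \mathbb{R}$, defined by
\[ f(a) = \Big\| \sum_{j=1}^{n} \alpha_j v_j \Big\|, \]
must therefore attain its minimum value at some $b \in X$. Set $c = f(b)$, so that
$c \le f(a) = \big\| \sum_{j=1}^{n} \alpha_j v_j \big\|$ for all $a \in X$. Note that $c > 0$: because $b \ne 0$ and
$v_1, \dots, v_n$ are linearly independent, the linear combination of the $v_j$ with
coefficients given by $b$ is nonzero. Hence, by homogeneity of the norm,
\[ c \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2} \le \Big\| \sum_{j=1}^{n} \alpha_j v_j \Big\|, \qquad \forall\, \alpha \in \mathbb{C}^n. \tag{1.2} \]
Now let $C = \big( \sum_{j=1}^{n} \|v_j\|^2 \big)^{1/2}$. Then
\[ \Big\| \sum_{j=1}^{n} \alpha_j v_j \Big\| \le \sum_{j=1}^{n} |\alpha_j|\, \|v_j\| \le \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2} \Big( \sum_{j=1}^{n} \|v_j\|^2 \Big)^{1/2}, \]
where the inequality on the right is the Cauchy–Schwarz inequality in $\mathbb{R}^n$ (that is,
Hölder's inequality with $p = q = 2$; see Section 2.2). Thus,
\[ \Big\| \sum_{j=1}^{n} \alpha_j v_j \Big\| \le C \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2}. \tag{1.3} \]
This completes the proof of the lemma. □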
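
The two constants of Lemma 1.11 can be explored numerically. The Python sketch below is a hedged illustration under assumptions that are not in the notes: $V = \mathbb{C}^2$ carries the $\ell^1$ norm, the basis is $v_1 = (1, 0)$, $v_2 = (1, 1)$, and the function names are hypothetical. It computes $C$ exactly as in the proof and estimates $c$ by random sampling, which only gives an upper estimate of the true minimum.

```python
import math
import random

def norm1(v):
    """The l^1 norm on C^n, playing the role of the norm on V (an assumption)."""
    return sum(abs(x) for x in v)

def combine(alpha, basis):
    """Return the linear combination sum_j alpha_j v_j as a list of coordinates."""
    n = len(basis[0])
    return [sum(a * v[i] for a, v in zip(alpha, basis)) for i in range(n)]

def euclid(alpha):
    """Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in alpha))

basis = [[1, 0], [1, 1]]  # hypothetical basis of V = C^2

# The constant C from the proof: square root of the sum of squared norms.
C = math.sqrt(sum(norm1(v) ** 2 for v in basis))

rng = random.Random(0)
ratios = []
for _ in range(2000):
    alpha = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in basis]
    ratios.append(norm1(combine(alpha, basis)) / euclid(alpha))

# Sampling can only overestimate the minimum value c of the lemma.
c_estimate = min(ratios)
print(c_estimate, max(ratios), C)
```

Every sampled ratio lies between a positive lower bound and $C$, in line with inequality (1.1). For this particular basis and norm, the triangle inequality gives the lower bound $1/\sqrt{2}$ directly.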


Proof of Theorem 1.10. Assume now that $\{w_k\}_k$ is a Cauchy sequence of
elements in $V$. Write
\[ w_k = \sum_{j=1}^{n} \alpha_j^{(k)} v_j, \qquad \forall\, k \in \mathbb{N}. \]
Then inequality (1.2) implies that for each $j = 1, \dots, n$, the sequence $\{\alpha_j^{(k)}\}_k$
is a Cauchy sequence in $\mathbb{C}$ and therefore converges to some $\alpha_j$ (because $\mathbb{C}$ is
complete). Let $w = \sum_{j=1}^{n} \alpha_j v_j$.
Inequality (1.3) implies, therefore, that the sequence $\{w_k\}_k$ converges in $V$ to
$w$. Hence, $V$ is a Banach space.
To show that $V$ is separable, recall that the countable set $(\mathbb{Q} + i\mathbb{Q})^n$ is
dense in the Euclidean space $\mathbb{C}^n$. Therefore, inequality (1.3) now shows that
if $v = \sum_{j=1}^{n} \alpha_j v_j \in V$ and if $\varepsilon > 0$, then there are $\beta_j \in \mathbb{Q} + i\mathbb{Q}$ such that
\[ \Big\| \sum_{j=1}^{n} \alpha_j v_j - \sum_{j=1}^{n} \beta_j v_j \Big\| \le C \Big( \sum_{j=1}^{n} |\alpha_j - \beta_j|^2 \Big)^{1/2} < \varepsilon. \]
Thus, the countable set of all $\sum_{j=1}^{n} \beta_j v_j$, where each $\beta_j \in \mathbb{Q} + i\mathbb{Q}$, is dense
in $V$. □

1.3 Subspaces and Quotients of Banach Spaces


Definition 1.12. Suppose that $V$ is a Banach space and that $L \subseteq V$.
1. $L$ is a linear submanifold of $V$ if
\[ \alpha_1 w_1 + \alpha_2 w_2 \in L, \qquad \forall\, \alpha_1, \alpha_2 \in \mathbb{C},\ w_1, w_2 \in L. \]
2. $L$ is a subspace of $V$ if $L$ is a linear submanifold of $V$ and $L$ is a closed
set (that is, $L$ contains all of its limit points).

Theorem 1.10 implies that if $L$ is a finite-dimensional linear submanifold
of a Banach space $V$, then $L$ is a subspace of $V$.
If $L$ is a subspace of a Banach space $V$, then let $V/L$ denote the (quotient)
vector space of equivalence classes $[v]$ of elements $v \in V$ whereby $[v'] = [v]$ if
and only if $v' - v \in L$.

Proposition 1.13. If $L$ is a subspace of a Banach space $V$, then the function
$\|\cdot\|_q$ on the quotient space $V/L$, defined by
\[ \|[v]\|_q = \inf\{\|v - y\| \mid y \in L\}, \]
is a norm on $V/L$. Furthermore, with respect to this norm, $V/L$ is a Banach
space.

Proof. To show that $\|\cdot\|_q$ is a norm, the only non-obvious property to verify
is that $\|[v]\|_q = 0$ only if $[v] = [0]$. To this end, suppose that $\|[v]\|_q = 0$; thus,
there is a sequence of vectors $y_n \in L$ such that $\|v - y_n\| \to 0$. Because $L$ is
closed and $v$ is the limit of the sequence $\{y_n\}_{n\in\mathbb{N}}$, $v$ must belong to $L$. Hence,
$[v] = [0]$.
Observe that, by definition of $\|\cdot\|_q$,
\[ \|[v]\|_q \le \|v\|, \qquad \forall\, v \in V. \]
To show that $V/L$ is a Banach space, let $\{[v_k]\}_{k\in\mathbb{N}}$ be a Cauchy sequence
in $V/L$. Therefore, there is a subsequence $\{[v_{k_j}]\}_{j\in\mathbb{N}}$ of $\{[v_k]\}_{k\in\mathbb{N}}$ with the
property that
\[ \|[v_{k_j}] - [v_{k_{j-1}}]\|_q < \frac{1}{2^j}, \qquad \forall\, j \ge 2. \]
By definition of $\|\cdot\|_q$ as an infimum, for each $j \ge 2$ there is a vector
$y_j \in [v_{k_j} - v_{k_{j-1}}]$ with $\|y_j\| < 2^{-j}$. Hence, $\sum_{j=2}^{\infty} \|y_j\|$ is convergent in $\mathbb{R}$. By
Exercise 4, this implies that $\sum_{j=2}^{\infty} y_j$ is convergent in $V$. Let $y \in V$ denote
$\sum_{j=2}^{\infty} y_j$ and consider $[y + v_{k_1}] \in V/L$. For any $j \ge 2$,
\[ \begin{aligned}
\big\| [v_{k_j}] - [y + v_{k_1}] \big\|_q &= \big\| [v_{k_j} - v_{k_1}] - [y] \big\|_q \\
&= \Big\| \sum_{i=2}^{j} [v_{k_i} - v_{k_{i-1}}] - [y] \Big\|_q \\
&= \Big\| \Big[ \sum_{i=2}^{j} y_i - y \Big] \Big\|_q \\
&\le \Big\| \sum_{i=2}^{j} y_i - y \Big\|.
\end{aligned} \]
Because $\sum_{i=2}^{\infty} y_i$ converges to $y \in V$, the sequence $\{[v_{k_j}]\}_{j\in\mathbb{N}}$ converges to
$[y + v_{k_1}]$ in $V/L$. However, as $\{[v_{k_j}]\}_{j\in\mathbb{N}}$ is a convergent subsequence of the
Cauchy sequence $\{[v_k]\}_{k\in\mathbb{N}}$, the sequence $\{[v_k]\}_{k\in\mathbb{N}}$ must also converge in $V/L$
to $[y + v_{k_1}]$. □

1.4 Two Useful Lemmas

The use of quotient spaces leads to the following two useful lemmas which
will be invoked from time to time when convenient.

Lemma 1.14. (Useful Lemma 1) If $V$ is a Banach space and if $\{M_n\}_{n\in\mathbb{N}}$ is
a sequence of subspaces such that $M_n \subset M_{n+1}$ (proper containment), then for
each $\delta \in (0, 1)$ there is a sequence of vectors $v_n \in V$ such that
1. $v_n \in M_{n+1}$ and $v_n \notin M_n$,
2. $\|v_n\| = 1$, and
3. $\|v_j - v_k\| \ge \delta$, for all $j, k \in \mathbb{N}$ with $k \ne j$.
Proof. Because the containment $M_n \subset M_{n+1}$ is proper, in the quotient space
$M_{n+1}/M_n$ there is a vector of norm $\delta$. This means, by definition of the quotient
norm and because $\delta < 1$, that there are vectors $f_n \in M_n$ and $g_n \in M_{n+1} \setminus M_n$
such that $\|[g_n]\|_q = \delta$ and $\|g_n - f_n\| = 1$. Therefore, let $v_n = g_n - f_n$, which
is an element of $M_{n+1} \setminus M_n$ of norm 1. Note that if
$j < k$, then $v_j \in M_k$ and $v_k \in M_{k+1} \setminus M_k$. Hence,
\[ \|v_k - v_j\| \ge \inf\{\|v_k - f\| \mid f \in M_k\} = \|[v_k]\|_q = \|[g_k]\|_q = \delta > 0, \]
which completes the proof. □


Now for a sister lemma that is proved by precisely the same type of argu-
ment.
Lemma 1.15. (Useful Lemma 2) If $V$ is a Banach space and if $\{M_n\}_{n\in\mathbb{N}}$ is
a sequence of subspaces such that $M_n \supset M_{n+1}$ (proper containment), then for
each $\delta \in (0, 1)$ there is a sequence of vectors $v_n \in V$ such that
1. $v_n \in M_n$ and $v_n \notin M_{n+1}$,
2. $\|v_n\| = 1$, and
3. $\|v_j - v_k\| \ge \delta$, for all $j, k \in \mathbb{N}$ with $k \ne j$.
As an application of the “usefulness” of the lemmas above, we now show
that there is no counterpart of the Heine–Borel Theorem (namely, that closed
and bounded sets are compact) in infinite-dimensional Banach spaces. In fact,
the compactness of all closed bounded sets in a Banach space V implies that
V is necessarily finite-dimensional.
Proposition 1.16. The closed unit ball in a Banach space V is compact if
and only if V has finite dimension.
Proof. Suppose that V has dimension n ∈ N. The proof of Theorem 1.10 shows
that the closed unit balls of V and the Euclidean space Cn are topologically
equivalent. By the Heine–Borel Theorem, the closed unit ball of Cn is compact;
hence, the closed unit ball of V is compact as well.
Conversely, suppose that the closed unit ball of V is compact. Thus, as V
is a metric space, every sequence in the closed unit ball admits a convergent
subsequence. Assume, contrary to what is to be proved, that V has infinite
dimension. Under this assumption, there is an increasing sequence F1 ⊂ F2 ⊂
. . . ⊂ Fn ⊂ Fn+1 ⊂ . . . of subspaces Fn ⊂ V , each of dimension dim Fn = n.
Fix δ ∈ (0, 1). By Useful Lemma 1 (Lemma 1.14), the ascending sequence of
subspaces Fn determines a sequence of unit vectors vn ∈ V such that

\[ \|v_k - v_j\| \ge \delta > 0, \qquad \forall\, k \ne j. \]

This shows that no subsequence of {vn }n∈N can be a Cauchy sequence. Thus,
the sequence {vn }n∈N does not admit a convergent subsequence, in contradic-
tion to the fact that the closed unit ball of V is compact. Hence, it must be
that V has finite dimension. 
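
A concrete instance of such a separated sequence is given by the standard unit vectors $e_n$ in the sequence space $\ell^2$ of square-summable sequences, an infinite-dimensional Banach space that is not discussed in the text up to this point and is assumed here only for illustration. The Python sketch below works with the first $N$ coordinates only (a truncation chosen for the example) and collects all pairwise distances.

```python
import math
from itertools import combinations

def unit_vector(n, length):
    """The n-th standard unit vector, truncated to `length` coordinates."""
    return [1.0 if i == n else 0.0 for i in range(length)]

def distance(u, v):
    """Euclidean distance between two finite coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

N = 8  # number of unit vectors examined (an arbitrary choice)
vectors = [unit_vector(n, N) for n in range(N)]

# The set of all pairwise distances between distinct unit vectors.
distances = {distance(u, v) for u, v in combinations(vectors, 2)}
print(distances)
```

Each pair of distinct unit vectors is at distance $\sqrt{2}$, so no subsequence of $\{e_n\}$ is Cauchy, even though every $e_n$ lies in the closed unit ball.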

Corollary 1.17. A Banach space V has finite dimension if and only if every
closed, bounded subset of V is compact.

1.5 Baire Category Theorem


We conclude this introduction to Banach spaces with a fundamental result,
known as the Baire Category Theorem, which has a number of useful conse-
quences.

Definition 1.18. Let $X$ be a topological space.
1. A subset $G \subseteq X$ is a $G_\delta$-set if there is a countable family $\{U_k\}_{k\in\mathbb{N}}$ of open
sets $U_k \subseteq X$ such that
\[ G = \bigcap_{k\in\mathbb{N}} U_k. \]
2. A subset $F \subseteq X$ is an $F_\sigma$-set if there is a countable family $\{F_k\}_{k\in\mathbb{N}}$ of
closed sets $F_k \subseteq X$ such that
\[ F = \bigcup_{k\in\mathbb{N}} F_k. \]

Theorem 1.19. (Baire Category Theorem) If $\{U_k\}_{k\in\mathbb{N}}$ is a sequence of open
sets in a Banach space $V$ such that each $U_k$ is dense in $V$, then the $G_\delta$-set
\[ \bigcap_{k\in\mathbb{N}} U_k \]
is also dense in $V$.

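Proof. Let $G = \bigcap_{k\in\mathbb{N}} U_k$. Choose $v_0 \in V$ and let $\varepsilon > 0$. We aim to prove that
$B_\varepsilon(v_0) \cap G \ne \emptyset$.
By hypothesis, $U_1$ is open and dense in $V$. Thus, there is a $\delta_1 \in (0, 1)$
and an element $v_1 \in B_\varepsilon(v_0) \cap U_1$ such that $\overline{B}_{\delta_1}(v_1) \subset B_\varepsilon(v_0) \cap U_1$. Likewise,
$U_2$ is open and dense in $V$, and so there is a $\delta_2 \in (0, \tfrac{1}{2})$ and an element
$v_2 \in B_{\delta_1}(v_1) \cap U_2$ such that $\overline{B}_{\delta_2}(v_2) \subset B_{\delta_1}(v_1) \cap U_2$.
It is clear that this process may be continued by induction to obtain
sequences $\{v_n\}_{n\in\mathbb{N}}$ in $V$ and $\{\delta_n\}_{n\in\mathbb{N}}$ in $\mathbb{R}$ such that (with $\delta_0 = \varepsilon$)
\[ \delta_n \in \Big(0, \frac{1}{2^{n-1}}\Big) \quad\text{and}\quad \overline{B}_{\delta_n}(v_n) \subset B_{\delta_{n-1}}(v_{n-1}) \cap U_n. \]
By construction,
\[ B_{\delta_k}(v_k) \subset B_{\delta_n}(v_n) \subset \overline{B}_{\delta_n}(v_n) \subset B_\varepsilon(v_0), \qquad \forall\, k > n. \tag{1.4} \]
Now fix $n \in \mathbb{N}$ and let $k, m > n$. By (1.4),
\[ v_k, v_m \in B_{\delta_n}(v_n) \subset \overline{B}_{\delta_n}(v_n). \]
Therefore, $\|v_k - v_m\| < 2\delta_n < \frac{1}{2^{n-2}}$, which shows that $\{v_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence.
Because $V$ is a Banach space, there is a limit $v \in V$ to this sequence.
Choose any $n \in \mathbb{N}$. If $k > n$, then
\[ v_k \in \overline{B}_{\delta_n}(v_n) \subset B_{\delta_{n-1}}(v_{n-1}) \cap U_n \subset U_n. \]
Because the closed ball $\overline{B}_{\delta_n}(v_n)$ is a closed set, the limit $v$ also belongs to it.
Hence, $v \in \overline{B}_{\delta_n}(v_n) \subset U_n$ and $v \in \overline{B}_{\delta_n}(v_n) \subset B_\varepsilon(v_0)$, which proves that
$v \in G \cap B_\varepsilon(v_0)$. □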
Definition 1.20. A subset $F \subset X$ is nowhere dense in a topological space $X$
if the topological interior of the closure $\overline{F}$ of $F$ is the empty set.
Corollary 1.21. In a Banach space $V$,
1. the intersection of a countable family of dense $G_\delta$-sets is a dense $G_\delta$-set,
and
2. the union of a countable family of nowhere dense $F_\sigma$-sets is an $F_\sigma$-set
with empty interior.
Proof. Exercise 5. □

1.6 Linear and Schauder Bases


Definition 1.22. If V is a vector space, then a linear basis of V is a subset
B ⊂ V such that the elements of B span V and are linearly independent.
A standard argument from linear algebra tells us that a subset $B \subset V$ is
a linear basis of $V$ if and only if each $v \in V$ admits a unique decomposition
as a linear combination of elements of $B$. That is, for each $v \in V$ there are
$v_1, \dots, v_n \in B$ and $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ such that $v = \sum_{j=1}^{n} \alpha_j v_j$, and, for
distinct $v_1, \dots, v_n \in B$, $0 = \sum_{j=1}^{n} \alpha_j v_j$ only if each $\alpha_j = 0$.
Every Banach space has a linear basis. However, there is no Banach space
with a countably infinite linear basis!
Proposition 1.23. If V is an infinite-dimensional Banach space, and if B is
a linear basis for V , then B is an uncountable set.
Proof. Exercise 6. 
Definition 1.24. A Schauder basis of a Banach space $V$ is a countable set
$Q = \{v_n \mid n \in \mathbb{N}\}$ of unit vectors $v_n \in V$ such that for each $v \in V$ there is a
uniquely determined sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ in $\mathbb{C}$ such that
\[ v = \sum_{n=1}^{\infty} \alpha_n v_n. \]

Observe that if a Banach space $V$ has a Schauder basis, then $V$ is necessarily
separable (Exercise 8). However, there are separable Banach spaces
that do not have a Schauder basis [5].

1.7 Exercises

1. Prove that, in a normed vector space $V$, every open ball $B_\rho(v_0)$ is an open
set whose topological closure is $\overline{B}_\rho(v_0)$ and whose topological boundary
$\partial B_\rho(v_0)$ is $S_\rho(v_0)$.
2. Assuming that $\mathbb{C}$ is a complete metric space with respect to the metric
induced by the modulus, give a detailed proof that the vector space $\mathbb{C}^n$,
in the Euclidean norm
\[ \|\xi\| = \Big( \sum_{j=1}^{n} |\xi_j|^2 \Big)^{1/2}, \qquad \xi \in \mathbb{C}^n, \]
is a separable Banach space.
3. In a normed vector space $V$, prove that
\[ \big|\, \|v_1\| - \|v_2\| \,\big| \le \|v_1 - v_2\|, \qquad \forall\, v_1, v_2 \in V. \]

4. Suppose that $\{v_k\}_{k\in\mathbb{N}}$ is a sequence of vectors in a Banach space $V$. Prove
that if $\sum_k \|v_k\|$ converges in $\mathbb{R}$, then $\sum_k v_k$ converges in $V$. (Suggestion:
show that the partial sums of $\sum_k v_k$ form a Cauchy sequence.)
5. Prove that, in a Banach space $V$,
a) the intersection of a countable family of dense $G_\delta$-sets is a dense
$G_\delta$-set, and
b) the union of a countable family of nowhere dense $F_\sigma$-sets is an $F_\sigma$-set
with empty interior.
6. Use the Baire Category Theorem to prove that if V is an infinite-
dimensional Banach space and if B is a basis for the vector space V ,
then B is an uncountable set. (Suggestion: If v1 , . . . vn ∈ V are linearly
independent, then show that Span {v1 , . . . , vn } is nowhere dense in V .)
7. A topological space X is locally compact if for each x0 ∈ X there is an
open set U ⊆ X that contains x0 and has compact closure. Prove that a
Banach space V is locally compact if and only if V has finite dimension.
8. Prove that if a Banach space V has a Schauder basis, then V is separable.
2
Banach Spaces in Classical Analysis

This chapter introduces and studies a number of the important Banach spaces
that arise in classical analysis.

2.1 Convex and Concave Functions


The goal of this section is to prove one of the most fundamental of all inequal-
ities in real analysis: “Jensen’s inequality.”

Definition 2.1. Let J ⊆ R be an open interval and let ϑ : J → R be a
function.
1. ϑ is a convex function if

ϑ (λx + (1 − λ)y) ≤ λϑ(x) + (1 − λ)ϑ(y) , ∀ λ ∈ (0, 1), ∀ x, y ∈ J .

2. ϑ is a concave function if

λϑ(x) + (1 − λ)ϑ(y) ≤ ϑ (λx + (1 − λ)y) , ∀ λ ∈ (0, 1), ∀ x, y ∈ J .

Note that a function ϑ is convex if and only if −ϑ is concave. The general
shape of the graph of a convex function is “concave up,” whereas the shape
of concave functions is “concave down.” This is made precise by the following
proposition.

Proposition 2.2. If $J \subset \mathbb{R}$ is an open interval and $\vartheta : J \to \mathbb{R}$ has a
continuous, nonnegative second derivative $d^2\vartheta/dt^2$ at every point of $J$, then $\vartheta$ is
a convex function.

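Proof. Assume that $d^2\vartheta/dt^2$ is nonnegative on $J$. Then, $d\vartheta/dt$ is monotone
increasing on $J$. To prove that $\vartheta$ is convex, choose any $x, y \in J$ and $\lambda \in (0, 1)$.
Let $\zeta = \lambda x + (1 - \lambda)y$. By the Fundamental Theorem of Calculus, and by the
fact that $d\vartheta/dt$ is monotone increasing,
\[ \vartheta(\zeta) - \vartheta(x) = \int_x^{\zeta} \frac{d\vartheta}{dt}\, dt \le \frac{d\vartheta}{dt}[\zeta]\, (\zeta - x). \]
Likewise,
\[ \vartheta(y) - \vartheta(\zeta) = \int_{\zeta}^{y} \frac{d\vartheta}{dt}\, dt \ge \frac{d\vartheta}{dt}[\zeta]\, (y - \zeta). \]
Because $\zeta - x = (1 - \lambda)(y - x)$ and $y - \zeta = \lambda(y - x)$, the second terms in each
of the two inequalities above can be expressed in terms of $(y - x)$, leading to
\[ \begin{aligned}
\vartheta(\zeta) &\le \vartheta(x) + \frac{d\vartheta}{dt}[\zeta]\, (1 - \lambda)(y - x), \\
\vartheta(\zeta) &\le \vartheta(y) - \frac{d\vartheta}{dt}[\zeta]\, \lambda (y - x).
\end{aligned} \]
Hence,
\[ \begin{aligned}
\lambda \vartheta(\zeta) &\le \lambda \vartheta(x) + \frac{d\vartheta}{dt}[\zeta]\, \lambda(1 - \lambda)(y - x), \\
(1 - \lambda) \vartheta(\zeta) &\le (1 - \lambda) \vartheta(y) - \frac{d\vartheta}{dt}[\zeta]\, \lambda(1 - \lambda)(y - x).
\end{aligned} \]
Adding these two inequalities leads to
\[ \vartheta(\lambda x + (1 - \lambda)y) \le \lambda \vartheta(x) + (1 - \lambda)\vartheta(y). \]
This proves that $\vartheta$ is a convex function. □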


Proposition 2.2 allows one to readily produce examples of convex functions.
Among the most important convex functions are the ones below.

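Corollary 2.3. The following functions $\vartheta$ are convex:
1. $\vartheta(t) = e^{\alpha t}$ on $\mathbb{R}$, for any $\alpha \in \mathbb{R}$;
2. $\vartheta(t) = t^p$ on $(0, \infty)$, where $p \in \mathbb{R}$ is such that $p \ge 1$;
3. $\vartheta(t) = -\ln t$ on $(0, \infty)$.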

The next result prepares the way for Jensen’s inequality.

Proposition 2.4. If $\vartheta : J \to \mathbb{R}$ is a convex function, and if $x, y, z \in J$ satisfy
$x < y < z$, then
\[ \frac{\vartheta(y) - \vartheta(x)}{y - x} \le \frac{\vartheta(z) - \vartheta(y)}{z - y}. \tag{2.1} \]
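Proof. There is a unique $\lambda \in (0, 1)$ such that $y = \lambda x + (1 - \lambda)z$. Thus,
\[ z = \frac{y - \lambda x}{1 - \lambda} \quad\text{and}\quad z - y = \frac{\lambda}{1 - \lambda}(y - x). \]
Because $\vartheta$ is a convex function, $\vartheta(y) \le \lambda\vartheta(x) + (1 - \lambda)\vartheta(z)$, and so
\[ \lambda \big(\vartheta(z) - \vartheta(x)\big) \le \vartheta(z) - \vartheta(y). \]
That is,
\[ \vartheta(z) - \vartheta(x) \le \frac{1}{\lambda} \big(\vartheta(z) - \vartheta(y)\big). \]
Hence,
\[ \begin{aligned}
\frac{\vartheta(y) - \vartheta(x)}{y - x} &\le \frac{\lambda\vartheta(x) + (1 - \lambda)\vartheta(z) - \vartheta(x)}{y - x} \\
&= \frac{(1 - \lambda)\big(\vartheta(z) - \vartheta(x)\big)}{y - x} \\
&\le \frac{\frac{1 - \lambda}{\lambda}\big(\vartheta(z) - \vartheta(y)\big)}{y - x} \\
&= \frac{\vartheta(z) - \vartheta(y)}{\frac{\lambda}{1 - \lambda}(y - x)} \\
&= \frac{\vartheta(z) - \vartheta(y)}{z - y}.
\end{aligned} \]
This completes the proof. □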
If one views each side of inequality (2.1) as a difference quotient, then one
can interpret inequality (2.1) as saying that the derivative of ϑ (if it exists) is
an increasing function. See Exercise 2.
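
Inequality (2.1) is easy to test numerically. The following Python sketch is an illustration only; the function names, sampling ranges, and tolerance are assumptions made for the example. It samples triples $x < y < z$ and compares the two difference quotients for the convex functions of Corollary 2.3, and for the concave function $\sqrt{t}$ as a contrast.

```python
import math
import random

def slopes_ordered(theta, x, y, z, tol=1e-9):
    """Check (theta(y)-theta(x))/(y-x) <= (theta(z)-theta(y))/(z-y) up to tol."""
    left = (theta(y) - theta(x)) / (y - x)
    right = (theta(z) - theta(y)) / (z - y)
    return left <= right + tol

functions = {
    "exp": math.exp,                     # convex on R
    "square": lambda t: t * t,           # convex, t^p with p = 2
    "neg_log": lambda t: -math.log(t),   # convex on (0, infinity)
    "sqrt": math.sqrt,                   # concave, included as a contrast
}

rng = random.Random(1)
results = {}
for name, theta in functions.items():
    ok = True
    for _ in range(1000):
        # Build x < y < z with gaps of at least 0.1 to avoid round-off issues.
        x = rng.uniform(0.1, 1.5)
        y = x + rng.uniform(0.1, 1.5)
        z = y + rng.uniform(0.1, 1.5)
        ok = ok and slopes_ordered(theta, x, y, z)
    results[name] = ok
print(results)
```

The three convex functions pass every sampled test, while the concave function fails, as the sign reversal in the definition of concavity predicts.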

Theorem 2.5. (Jensen's Inequality) Suppose that $(X, \Sigma, \mu)$ is a measure
space such that $\mu(X) = 1$. If $f : X \to [a_0, b_0] \subset (a, b)$ is a measurable function,
then, for every convex function $\vartheta : (a, b) \to \mathbb{R}$,
\[ \vartheta\Big( \int_X f \, d\mu \Big) \le \int_X \vartheta \circ f \, d\mu. \]
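Proof. Note that $\int_X f \, d\mu \in (a, b)$ because
\[ a < a_0 \le f(x) \le b_0 < b , \quad \forall\, x \in X , \]
implies that
\[ a = \int_X a \, d\mu < \int_X f \, d\mu < \int_X b \, d\mu = b . \]
Let $\zeta = \int_X f \, d\mu$ and let
\[ \beta = \sup_{z \in (a, \zeta)} \frac{\vartheta(\zeta) - \vartheta(z)}{\zeta - z} . \]
Hence,
\[ \vartheta(z) \ge \vartheta(\zeta) + \beta (z - \zeta) , \quad \forall\, z \in (a, \zeta) . \]
Because $\vartheta$ is a convex function, Proposition 2.4 implies that
\[ \beta \le \frac{\vartheta(y) - \vartheta(\zeta)}{y - \zeta} , \quad \forall\, y \in (\zeta, b) . \]
Thus,
\[ \vartheta(y) \ge \beta (y - \zeta) + \vartheta(\zeta) , \quad \forall\, y \in (\zeta, b) . \]
Conclusion:
\[ \vartheta(t) \ge \vartheta(\zeta) + \beta (t - \zeta) , \quad \forall\, t \in (a, b) . \]
In particular,
\[ \vartheta(f(x)) - \vartheta(\zeta) - \beta f(x) + \beta \zeta \ge 0 , \quad \forall\, x \in X . \tag{2.2} \]
On passing to integrals, and using that $\mu(X) = 1$ and $\zeta = \int_X f \, d\mu$, inequality (2.2) yields
\[ \int_X \vartheta \circ f \, d\mu - \vartheta\left( \int_X f \, d\mu \right) - \beta \zeta + \beta \zeta \ge 0 . \]
That is,
\[ \vartheta\left( \int_X f \, d\mu \right) \le \int_X \vartheta \circ f \, d\mu , \]
which completes the proof. □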
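
For a finite set X with a probability measure given by weights, both sides of Jensen’s inequality are finite sums, so the statement can be checked directly. The sketch below does this for three convex functions. The name `jensen_sides` and the sample values and weights are assumptions chosen only for illustration.

```python
import math

def jensen_sides(values, weights, theta):
    """Return (theta(integral of f), integral of theta o f) for a discrete probability measure.

    values[k] plays the role of f(k) and weights[k] the role of mu({k}).
    """
    assert abs(sum(weights) - 1.0) < 1e-12  # mu(X) = 1
    mean = sum(w * v for v, w in zip(values, weights))
    return theta(mean), sum(w * theta(v) for v, w in zip(values, weights))

values = [0.5, 1.0, 2.0, 4.0]
weights = [0.1, 0.2, 0.3, 0.4]
thetas = [math.exp, lambda t: t ** 3, lambda t: -math.log(t)]
sides = [jensen_sides(values, weights, theta) for theta in thetas]
print([lhs <= rhs for lhs, rhs in sides])
```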

2.2 Classical Inequalities


A number of familiar inequalities arise from convex functions. We begin below
with two very elementary, yet useful, inequalities.
Proposition 2.6. Suppose that $\alpha_1, \dots, \alpha_n$ are positive real numbers.
1. Arithmetic-Geometric Mean Inequality:
\[ (\alpha_1 \cdots \alpha_n)^{1/n} \le \frac{1}{n} \left( \alpha_1 + \cdots + \alpha_n \right) . \]
2. Young’s Inequality: If $p_1, \dots, p_n$ are positive and $\frac{1}{p_1} + \cdots + \frac{1}{p_n} = 1$, then
\[ \alpha_1 \cdots \alpha_n \le \frac{1}{p_1} \alpha_1^{p_1} + \cdots + \frac{1}{p_n} \alpha_n^{p_n} . \]

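Proof. For the proof of the Arithmetic-Geometric Mean Inequality, let $X = \{1, 2, \dots, n\}$, let $\Sigma = P(X)$ be the power set of $X$, and let $\mu$ be
\[ \mu(E) = \frac{1}{n} |E| , \quad \forall\, E \subseteq X , \]
where $|E|$ denotes the cardinality of $E$. Thus, $\mu(X) = 1$.
Let $\beta_1, \dots, \beta_n \in \mathbb{R}$ be such that $e^{\beta_k} = \alpha_k$, for each $k$. If $f : X \to \mathbb{R}$ is defined by $f(k) = \beta_k$, for $k \in X$, then
\[ \int_X f \, d\mu = \sum_{k=1}^{n} f(k) \mu(\{k\}) = \frac{1}{n} \sum_{k=1}^{n} \beta_k . \]
The function $\vartheta(t) = e^t$ is convex on $\mathbb{R}$ and
\[ \int_X \vartheta \circ f \, d\mu = \frac{1}{n} \sum_{k=1}^{n} e^{f(k)} = \frac{1}{n} \sum_{k=1}^{n} e^{\beta_k} . \]
Jensen’s inequality states that
\[ \vartheta\left( \int_X f \, d\mu \right) \le \int_X \vartheta \circ f \, d\mu . \]
Hence,
\[ (\alpha_1 \alpha_2 \cdots \alpha_n)^{1/n} = e^{\beta_1/n + \cdots + \beta_n/n} \le \frac{1}{n} \sum_{k=1}^{n} e^{\beta_k} = \frac{1}{n} \sum_{k=1}^{n} \alpha_k . \]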

This completes the proof.


The proof of Young’s Inequality is left as an exercise (Exercise 5). 
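
Both inequalities of Proposition 2.6 can be checked on sample data. In the sketch below the geometric mean is computed exactly as in the proof above, namely as the exponential of the mean of the numbers β_k = ln α_k. The function names and the sample numbers are assumptions made for illustration.

```python
import math

def geometric_mean(alphas):
    """(alpha_1 ... alpha_n)^(1/n), computed as exp of the mean of beta_k = ln(alpha_k)."""
    betas = [math.log(a) for a in alphas]
    return math.exp(sum(betas) / len(betas))

def arithmetic_mean(alphas):
    return sum(alphas) / len(alphas)

def young_sides(alphas, ps):
    """Return (alpha_1 ... alpha_n, sum of alpha_k^p_k / p_k) for exponents with sum(1/p_k) = 1."""
    assert abs(sum(1.0 / p for p in ps) - 1.0) < 1e-12
    return math.prod(alphas), sum(a ** p / p for a, p in zip(alphas, ps))

alphas = [1.0, 2.0, 4.0]
gm, am = geometric_mean(alphas), arithmetic_mean(alphas)
product, bound = young_sides(alphas, [2.0, 3.0, 6.0])
print(gm <= am, product <= bound)
```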
Definition 2.7. Two positive real numbers $p, q \in \mathbb{R}^+$ are said to be conjugate if
\[ \frac{1}{p} + \frac{1}{q} = 1 . \]
Note that the conjugate $q$ to $p$ is uniquely determined by $p$, and that $q \to \infty$ as $p \to 1^+$. Thus, we say that “$\infty$ is conjugate to 1.”
The notion of conjugate positive numbers $p$ and $q$ is nothing more than the association of a pair of convex coefficients, namely $\frac{1}{p}$ and $\frac{1}{q}$, with $p$ and $q$. The following inequality is used frequently in connection with conjugate $p$ and $q$.

Proposition 2.8. (Young’s Inequality in Two Variables) If $\alpha, \beta \in \mathbb{R}$ are nonnegative and if $p, q \in (1, \infty)$ are conjugate, then
\[ \alpha \beta \le \frac{1}{p} \alpha^p + \frac{1}{q} \beta^q . \]

Proof. Exercise 5. 

Definition 2.9. If $(X, \Sigma, \mu)$ is a measure space and if $p \ge 1$, then a measurable function $f : X \to \mathbb{C}$ is said to be $p$-integrable if $|f|^p$ is integrable.

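Theorem 2.10. (Hölder’s Inequality) Suppose that $p, q \in \mathbb{R}^+$ are conjugate. If $(X, \Sigma, \mu)$ is a measure space and if $f, g : X \to \mathbb{C}$ are such that $f$ is $p$-integrable and $g$ is $q$-integrable, then $fg : X \to \mathbb{C}$ is integrable and
\[ \int_X |fg| \, d\mu \le \left( \int_X |f|^p \, d\mu \right)^{1/p} \left( \int_X |g|^q \, d\mu \right)^{1/q} . \tag{2.3} \]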

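Proof. Because $f$ is $p$-integrable and $g$ is $q$-integrable, the functions $|f|^p$ and $|g|^q$ are integrable. If $\int_X |f|^p \, d\mu = 0$ or $\int_X |g|^q \, d\mu = 0$, then $fg = 0$ almost everywhere and inequality (2.3) holds trivially; assume, therefore, that both integrals are positive. Let $F, G : X \to \mathbb{R}$ be the functions defined by
\[ F(x) = \frac{|f(x)|}{\left( \int_X |f|^p \, d\mu \right)^{1/p}} \quad \text{and} \quad G(x) = \frac{|g(x)|}{\left( \int_X |g|^q \, d\mu \right)^{1/q}} . \]
Thus,
\[ \int_X F^p \, d\mu = \int_X G^q \, d\mu = 1 . \]
By Young’s Inequality (Proposition 2.8), for each $x \in X$,
\[ F(x) G(x) \le \frac{1}{p} F(x)^p + \frac{1}{q} G(x)^q . \]
Hence,
\[ \int_X F G \, d\mu \le \frac{1}{p} \int_X F^p \, d\mu + \frac{1}{q} \int_X G^q \, d\mu = \frac{1}{p} + \frac{1}{q} = 1 . \]
That is,
\[ \frac{\int_X |fg| \, d\mu}{\left( \int_X |f|^p \, d\mu \right)^{1/p} \left( \int_X |g|^q \, d\mu \right)^{1/q}} \le 1 , \]
which completes the proof. □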

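Theorem 2.11. (Minkowski’s Inequality) Suppose that $p \ge 1$. If $(X, \Sigma, \mu)$ is a measure space and if $f, g : X \to \mathbb{C}$ are $p$-integrable, then $f + g$ is $p$-integrable and
\[ \left( \int_X |f + g|^p \, d\mu \right)^{1/p} \le \left( \int_X |f|^p \, d\mu \right)^{1/p} + \left( \int_X |g|^p \, d\mu \right)^{1/p} . \tag{2.4} \]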

Proof. The theorem is true for $p = 1$ because the sum of integrable functions is integrable and because inequality (2.4) is simply a consequence of the triangle inequality in real and complex numbers.
Assume, therefore, that $p > 1$ and consider the function $\vartheta : \mathbb{R}^+ \to \mathbb{R}^+$ defined by $\vartheta(t) = t^p$. Because $\vartheta$ is convex,
\[ \vartheta\left( \tfrac{1}{2} |f(x)| + \tfrac{1}{2} |g(x)| \right) \le \tfrac{1}{2} \vartheta(|f(x)|) + \tfrac{1}{2} \vartheta(|g(x)|) , \quad \forall\, x \in X . \]
Hence,
\[ \left( \tfrac{1}{2} \right)^p \left( |f| + |g| \right)^p \le \tfrac{1}{2} |f|^p + \tfrac{1}{2} |g|^p . \]
Because the sum of integrable functions is integrable, the inequality above shows that $(|f| + |g|)^p$ is integrable. But, by the triangle inequality, $|f + g|^p \le (|f| + |g|)^p$; hence, $f + g$ is $p$-integrable.
Let $q \in \mathbb{R}^+$ be conjugate to $p$. Thus, $p = (p - 1) q$ and
\[ \left( (|f| + |g|)^{p-1} \right)^q = (|f| + |g|)^p . \]
Hence, $(|f| + |g|)^{p-1}$ is $q$-integrable. Therefore, one can apply Hölder’s Inequality (Theorem 2.10) to obtain
\[ \int_X |f| (|f| + |g|)^{p-1} \, d\mu \le \left( \int_X |f|^p \, d\mu \right)^{1/p} \left( \int_X (|f| + |g|)^{(p-1)q} \, d\mu \right)^{1/q} \]
and
\[ \int_X |g| (|f| + |g|)^{p-1} \, d\mu \le \left( \int_X |g|^p \, d\mu \right)^{1/p} \left( \int_X (|f| + |g|)^{(p-1)q} \, d\mu \right)^{1/q} . \]
Because $(|f| + |g|)^p = |f| (|f| + |g|)^{p-1} + |g| (|f| + |g|)^{p-1}$, summing the two inequalities above yields a new inequality whose left hand side is
\[ \int_X (|f| + |g|)^p \, d\mu \]
and whose right hand side is
\[ \left( \int_X (|f| + |g|)^{(p-1)q} \, d\mu \right)^{1/q} \left[ \left( \int_X |f|^p \, d\mu \right)^{1/p} + \left( \int_X |g|^p \, d\mu \right)^{1/p} \right] . \]
Divide the new inequality through by $\left( \int_X (|f| + |g|)^{(p-1)q} \, d\mu \right)^{1/q}$ (if this quantity is zero, then inequality (2.4) is trivial) and use that $p = (p - 1) q$ to obtain
\[ \left( \int_X (|f| + |g|)^p \, d\mu \right)^{1 - 1/q} \le \left( \int_X |f|^p \, d\mu \right)^{1/p} + \left( \int_X |g|^p \, d\mu \right)^{1/p} . \]
Because $\frac{1}{p} = 1 - \frac{1}{q}$ and $|f + g|^p \le (|f| + |g|)^p$, the inequality above implies inequality (2.4). □

Definition 2.12. Assume that $p \in \mathbb{R}$ satisfies $p \ge 1$. A sequence $\{\alpha_k\}_{k \in \mathbb{N}}$ in $\mathbb{C}$ is $p$-summable if
\[ \sum_{k \in \mathbb{N}} |\alpha_k|^p < \infty . \]

The Hölder and Minkowski inequalities have formulations for sequences of complex numbers.

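Theorem 2.13. Suppose that $\{\alpha_k\}_{k \in \mathbb{N}}$ and $\{\beta_k\}_{k \in \mathbb{N}}$ are sequences in $\mathbb{C}$. Assume that $p \ge 1$.
1. (Hölder) If $p, q \in (1, \infty)$ are conjugate, and if $\{\alpha_k\}_{k \in \mathbb{N}}$ is $p$-summable and $\{\beta_k\}_{k \in \mathbb{N}}$ is $q$-summable, then $\{\alpha_k \beta_k\}_{k \in \mathbb{N}}$ is summable and
\[ \sum_{k \in \mathbb{N}} |\alpha_k \beta_k| \le \left( \sum_{k \in \mathbb{N}} |\alpha_k|^p \right)^{1/p} \left( \sum_{k \in \mathbb{N}} |\beta_k|^q \right)^{1/q} . \]
2. (Minkowski) If $\{\alpha_k\}_{k \in \mathbb{N}}$ and $\{\beta_k\}_{k \in \mathbb{N}}$ are $p$-summable, then $\{\alpha_k + \beta_k\}_{k \in \mathbb{N}}$ is $p$-summable and
\[ \left( \sum_{k \in \mathbb{N}} |\alpha_k + \beta_k|^p \right)^{1/p} \le \left( \sum_{k \in \mathbb{N}} |\alpha_k|^p \right)^{1/p} + \left( \sum_{k \in \mathbb{N}} |\beta_k|^p \right)^{1/p} . \]

Proof. Apply Theorems 2.10 and 2.11 to the case where the measure space $(X, \Sigma, \mu)$ is given by $X = \mathbb{N}$, $\Sigma = P(\mathbb{N})$ (the power set of $\mathbb{N}$), and $\mu$ is counting measure. □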
Note that the Hölder and Minkowski inequalities are nontrivial even in
the cases where the sequences {αk }k∈N and {βk }k∈N have only finitely-many
nonzero elements.
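
For finitely supported sequences, both inequalities of Theorem 2.13 involve only finite sums and can be evaluated directly. The sketch below does so for one pair of complex sequences with p = 3. The helper names `p_norm` and `conjugate` and the sample data are assumptions introduced for the example.

```python
def p_norm(seq, p):
    """(sum |alpha_k|^p)^(1/p) for a finite sequence of complex numbers."""
    return sum(abs(z) ** p for z in seq) ** (1.0 / p)

def conjugate(p):
    """The exponent q with 1/p + 1/q = 1, for p > 1."""
    return p / (p - 1.0)

a = [1 + 2j, -3.0, 0.5j, 2.0]
b = [0.5, 1j, -2 + 1j, 4.0]
p = 3.0
q = conjugate(p)

# Hoelder: sum |a_k b_k| <= ||a||_p ||b||_q
holder_lhs = sum(abs(x * y) for x, y in zip(a, b))
holder_rhs = p_norm(a, p) * p_norm(b, q)

# Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
minkowski_lhs = p_norm([x + y for x, y in zip(a, b)], p)
minkowski_rhs = p_norm(a, p) + p_norm(b, p)

print(holder_lhs <= holder_rhs, minkowski_lhs <= minkowski_rhs)
```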

2.3 Topological Vector Spaces


A slightly more general concept than that of a normed vector space is the
notion of a topological vector space, which is a vector space with a topology
such that the vector space operations are continuous.

Definition 2.14. A vector space V is a topological vector space if:


1. V is a topological space;
2. scalar multiplication, as a function C × V → V , is continuous;
3. addition, as a function V × V → V , is continuous.

Definition 2.15. A seminorm on a vector space $V$ is a function $\rho : V \to \mathbb{R}$ such that, for all $v, w \in V$ and $\alpha \in \mathbb{C}$,
1. $\rho(v) \ge 0$,
2. $\rho(\alpha v) = |\alpha| \rho(v)$,
3. $\rho(v + w) \le \rho(v) + \rho(w)$.

Property (3) is called the triangle inequality.

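Proposition 2.16. If $\rho$ is a seminorm on a vector space $V$, then the function $d : V \times V \to [0, \infty)$ defined by
\[ d(v, w) = \rho(v - w) , \quad v, w \in V , \]
defines a pseudometric on $V$ (that is, $d$ satisfies the axioms of a metric, except that $d(v, w) = 0$ is possible for $v \ne w$). With respect to the topology induced by $d$, $V$ is a topological vector space.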

Proof. Exercise 7. 
Seminorms $\rho$ and norms $\| \cdot \|$ differ in the following way: the equation $\rho(v) = 0$ can hold for nonzero $v$ whereas $\|v\| = 0$ holds only for $v = 0$. One obtains a norm from a seminorm as follows.

Proposition 2.17. If $\rho$ is a seminorm on a vector space $V$, and if $\sim$ is the relation on $V$ defined by
\[ v \sim w \quad \text{if} \quad \rho(v - w) = 0 , \]
then $\sim$ is an equivalence relation. Moreover, if the equivalence classes of elements of $V$ are denoted by
\[ [v] = \{ w \in V \mid w \sim v \} , \]
then:
1. the set $V/\!\sim$ of equivalence classes is a vector space under the operations
\[ [v] + [w] = [v + w] , \quad v, w \in V , \qquad \alpha [v] = [\alpha v] , \quad \alpha \in \mathbb{C},\ v \in V ; \]
2. the function
\[ \| [v] \| = \rho(v) , \quad v \in V , \]
is a norm on $V/\!\sim$.

Proof. Exercise 8. 
The next sections develop examples of Banach spaces.

2.4 Banach Spaces of Integrable Functions

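Proposition 2.18. Let $(X, \Sigma, \mu)$ be a measure space, and assume that $p \ge 1$. If
\[ \mathcal{L}^p(X, \Sigma, \mu) = \{ f : X \to \mathbb{C} \mid f \text{ is } p\text{-integrable} \} , \]
and if $\rho : \mathcal{L}^p(X, \Sigma, \mu) \to \mathbb{R}$ is given by
\[ \rho(f) = \left( \int_X |f|^p \, d\mu \right)^{1/p} , \quad \forall\, f \in \mathcal{L}^p(X, \Sigma, \mu) , \tag{2.5} \]
then $\rho$ is a seminorm on $\mathcal{L}^p(X, \Sigma, \mu)$.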

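Proof. It is clear that $\alpha f \in \mathcal{L}^p(X, \Sigma, \mu)$, for every $\alpha \in \mathbb{C}$ and $f \in \mathcal{L}^p(X, \Sigma, \mu)$. If $f, g \in \mathcal{L}^p(X, \Sigma, \mu)$, then $f + g \in \mathcal{L}^p(X, \Sigma, \mu)$, by Minkowski’s inequality (Theorem 2.11). Hence, $\mathcal{L}^p(X, \Sigma, \mu)$ is a vector space.
To verify that $\rho$ is a seminorm, the only nontrivial item to verify is that the triangle inequality holds. But this is true because of Minkowski’s inequality:
\begin{align*}
\rho(f + g) &= \left( \int_X |f + g|^p \, d\mu \right)^{1/p} \\
  &\le \left( \int_X |f|^p \, d\mu \right)^{1/p} + \left( \int_X |g|^p \, d\mu \right)^{1/p} \\
  &= \rho(f) + \rho(g) .
\end{align*}
Hence, $\rho$ is a seminorm. □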
The seminorm $\rho$ of Proposition 2.18 need not be a norm. For example, if $f$ is the characteristic function of the set $\mathbb{Q}$ of rational numbers, and if $m$ denotes Lebesgue measure on $\mathbb{R}$, then $f \ne 0$ yet
\[ \rho(f) = \left( \int_{\mathbb{R}} |f|^p \, dm \right)^{1/p} = m(\mathbb{Q})^{1/p} = 0 . \]

On the other hand, Proposition 2.17 demonstrates that a bona fide normed
vector space can be obtained by passing to equivalence classes.

Definition 2.19. If $p \ge 1$ and $(X, \Sigma, \mu)$ is a measure space, then $L^p(X, \Sigma, \mu)$ denotes the normed vector space
\[ L^p(X, \Sigma, \mu) = \mathcal{L}^p(X, \Sigma, \mu) / \!\sim , \]
where $\mathcal{L}^p(X, \Sigma, \mu)$ is the vector space of $p$-integrable functions of Proposition 2.18, where $\sim$ is the equivalence relation $f \sim g$ if $\rho(f - g) = 0$, and where $\rho$ is the seminorm in (2.5). $L^p(X, \Sigma, \mu)$ is called an $L^p$-space.

Conceptual and Notational Convention. The vector space $L^p(X, \Sigma, \mu)$ is a vector space of equivalence classes of $p$-integrable functions $f : X \to \mathbb{C}$. Thus, one might properly denote the elements of $L^p(X, \Sigma, \mu)$ by $[f]$, where $f \in \mathcal{L}^p(X, \Sigma, \mu)$. However, this is cumbersome, and so it is a standard practice to denote the elements of $L^p(X, \Sigma, \mu)$ as simply $f$ rather than $[f]$. The main points to keep in mind, given a $p$-integrable function $f : X \to \mathbb{C}$, are these below.
• The notation $f \in \mathcal{L}^p(X, \Sigma, \mu)$ (script $\mathcal{L}$) refers to $f$ as a function; in particular, $f = 0$ in $\mathcal{L}^p(X, \Sigma, \mu)$ if and only if $f(x) = 0$ for all $x \in X$.
• The notation $f \in L^p(X, \Sigma, \mu)$ refers to $f$ as a vector; in particular, $f = 0$ in $L^p(X, \Sigma, \mu)$ if and only if $\int_X |f|^p \, d\mu = 0$ (equivalently, if and only if $f(x) = 0$ for almost all $x \in X$).
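Assume that $p \ge 1$ and consider the set $\ell^p(\mathbb{N})$ of $p$-summable sequences of complex numbers:
\[ a = \{\alpha_k\}_{k \in \mathbb{N}} . \]
By Minkowski’s inequality, if $a = \{\alpha_k\}_{k \in \mathbb{N}},\ b = \{\beta_k\}_{k \in \mathbb{N}} \in \ell^p(\mathbb{N})$, then
\[ \left( \sum_{k \in \mathbb{N}} |\alpha_k + \beta_k|^p \right)^{1/p} \le \left( \sum_{k \in \mathbb{N}} |\alpha_k|^p \right)^{1/p} + \left( \sum_{k \in \mathbb{N}} |\beta_k|^p \right)^{1/p} . \]
That is, $a + b = \{\alpha_k + \beta_k\}_{k \in \mathbb{N}} \in \ell^p(\mathbb{N})$. Hence, with pointwise sum and scalar multiplication, $\ell^p(\mathbb{N})$ is a vector space over $\mathbb{C}$. The seminorm $\rho$ of Proposition 2.18 is in fact a norm on this space.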
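Proposition 2.20. On the vector space $\ell^p(\mathbb{N})$, the function
\[ \|a\| = \left( \sum_{k \in \mathbb{N}} |\alpha_k|^p \right)^{1/p} \tag{2.6} \]
is a norm.
Proof. We know already that $\| \cdot \|$ is a seminorm. If $a = \{\alpha_k\}_{k \in \mathbb{N}} \in \ell^p(\mathbb{N})$, then
\[ 0 = \|a\| = \left( \sum_{k \in \mathbb{N}} |\alpha_k|^p \right)^{1/p} \]
if and only if $|\alpha_k| = 0$ for all $k$. Hence, $\|a\| = 0$ if and only if $a = 0 \in \ell^p(\mathbb{N})$. □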
Notational Convention. The notation $\ell^p(\mathbb{N})$ refers to the normed vector space of $p$-summable sequences, where $p \ge 1$, under the norm (2.6). If $n \in \mathbb{N}$, then $\ell^p(n)$ denotes the normed vector space of sequences
\[ a = \{\alpha_k\}_{k=1}^{n} \]
of complex numbers under the norm (2.6).

Theorem 2.21. (Riesz–Fischer Theorem) $L^p(X, \Sigma, \mu)$ is a Banach space for every $p \ge 1$.
Proof. We know already that $L^p(X, \Sigma, \mu)$ is a normed vector space. Thus, one need show only that every Cauchy sequence in $L^p(X, \Sigma, \mu)$ is convergent. To this end, suppose that $\{f_k\}_{k \in \mathbb{N}}$ is a Cauchy sequence in $L^p(X, \Sigma, \mu)$. Because this sequence is Cauchy, one can extract from it a subsequence $\{f_{k_j}\}_{j \in \mathbb{N}}$ such that
\[ \| f_{k_{j+1}} - f_{k_j} \| < \left( \tfrac{1}{2} \right)^j , \quad \forall\, j \in \mathbb{N} . \]
For every $i \in \mathbb{N}$, let $g_i \in L^p(X, \Sigma, \mu)$ be given by
\[ g_i = \sum_{j=1}^{i} | f_{k_{j+1}} - f_{k_j} | . \]
Observe that $\{g_i\}_{i \in \mathbb{N}}$ is a monotone-increasing sequence and that
\[ \| g_i \| \le \sum_{j=1}^{i} \big\| \, | f_{k_{j+1}} - f_{k_j} | \, \big\| \le \sum_{j=1}^{i} 2^{-j} \le \sum_{j=1}^{\infty} 2^{-j} = 1 . \]
Thus,
\[ \int_X g_i^p \, d\mu \le 1 , \quad \forall\, i \in \mathbb{N} . \]
Considering $g_i$ as a function, the converse to the Monotone Convergence Theorem implies that $\lim_i g_i(x)^p$ exists for almost all $x \in X$; that is, $\lim_i g_i(x)$ exists for almost all $x \in X$ and the limit function (call it $g$) is $p$-integrable. Let $L \subseteq X$ denote the set of points $x$ at which $\lim_i g_i(x)$ exists; thus,
\[ \lim_{i \to \infty} g_i(x) = \lim_{i \to \infty} \sum_{j=1}^{i} | f_{k_{j+1}}(x) - f_{k_j}(x) | , \quad x \in L . \tag{2.7} \]
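If $f : L \to \mathbb{C}$ denotes the function defined by
\[ f(x) = f_{k_1}(x) + \sum_{j=1}^{\infty} \left( f_{k_{j+1}}(x) - f_{k_j}(x) \right) , \]
then the series above converges absolutely, by (2.7), for every $x \in L$. Extend $f$ to all of $X$ by setting $f(x) = 0$ if $x \in X \setminus L$. For $i \ge 2$, the partial sum of the series with $i - 1$ terms is precisely $f_{k_i}$, and so
\[ | f_{k_i} | = \Big| f_{k_1} + \sum_{j=1}^{i-1} ( f_{k_{j+1}} - f_{k_j} ) \Big| \le | f_{k_1} | + g_{i-1} \le | f_{k_1} | + g \]
on $L$. Let $G = | f_{k_1} | + g$, which is $p$-integrable by Theorem 2.11. Therefore, $| f_{k_i} |^p \le G^p$ for all $i \in \mathbb{N}$ (the case $i = 1$ being trivial). As $G^p$ is integrable and $\lim_i | f_{k_i}(x) |^p = | f(x) |^p$ for all $x \in L$, the Dominated Convergence Theorem asserts that $|f|^p$ is integrable. This proves that $f \in L^p(X, \Sigma, \mu)$.
What remains is to show $\| f - f_k \| \to 0$ as $k \to \infty$. To this end, note that $|f| \le G$ on $L$, and so
\[ | f - f_{k_i} |^p \le \left( |f| + | f_{k_i} | \right)^p \le 2^p G^p , \quad \forall\, i \in \mathbb{N} . \]
Therefore, by the Dominated Convergence Theorem, $| f - f_{k_i} |^p$ is integrable for every $i$ and the limit function (in this case 0) satisfies
\[ \lim_{i \to \infty} \int_X \big| 0 - | f - f_{k_i} |^p \big| \, d\mu = 0 ; \]
that is,
\[ \lim_{i \to \infty} \int_X | f - f_{k_i} |^p \, d\mu = 0 . \]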
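Hence, $\| f - f_{k_i} \| \to 0$ as $i \to \infty$. To show that the entire Cauchy sequence $\{f_k\}_{k \in \mathbb{N}}$ converges in $L^p(X, \Sigma, \mu)$ to $f$, let $\varepsilon > 0$. Since $\{f_k\}_{k \in \mathbb{N}}$ is a Cauchy sequence, there exists $N_\varepsilon \in \mathbb{N}$ such that $\| f_n - f_m \| < \varepsilon$ for all $n, m \ge N_\varepsilon$; in particular, $\liminf_i \| f_{k_i} - f_m \| \le \varepsilon$ for every $m \ge N_\varepsilon$, because $k_i \ge N_\varepsilon$ for all sufficiently large $i$. Using this and the fact that $f = \lim_i f_{k_i}$ almost everywhere, Fatou’s Lemma yields the desired conclusion: namely, for any $m \ge N_\varepsilon$,
\begin{align*}
\| f - f_m \|^p &= \int_X | f - f_m |^p \, d\mu \\
  &\le \liminf_{i \in \mathbb{N}} \int_X | f_{k_i} - f_m |^p \, d\mu \qquad \text{(Fatou’s Lemma)} \\
  &= \liminf_{i \in \mathbb{N}} \| f_{k_i} - f_m \|^p \\
  &\le \varepsilon^p .
\end{align*}
This completes the proof that $L^p(X, \Sigma, \mu)$ is a Banach space. □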
Corollary 2.22. $\ell^p(\mathbb{N})$ is a Banach space, for every $p \ge 1$.

2.5 Essentially Bounded Measurable Functions


If $(X, \Sigma, \mu)$ is a measure space and if $f : X \to \mathbb{C}$ is a measurable function, then the essential range of $f$ is the closed subset ess-ran $f$ of $\mathbb{C}$ defined by
\[ \text{ess-ran}\, f = \bigcap_{E \subset X,\ \mu(X \setminus E) = 0} \overline{f(E)} , \]
where $\overline{f(E)}$ denotes the closure in $\mathbb{C}$ of $f(E) = \{ f(t) \mid t \in E \}$, and where $E \in \Sigma$. The essential supremum of $f$ is the quantity
\[ \text{ess-sup}\, f = \sup \{ |\lambda| \mid \lambda \in \text{ess-ran}\, f \} . \]

Definition 2.23. A measurable function $f$ is essentially bounded if it has finite essential supremum. The set of all essentially bounded functions is denoted by $\mathcal{L}^\infty(X, \Sigma, \mu)$.

Note that if $f$ is essentially bounded, then $f$ is bounded on a subset of $X$ whose complement has measure zero (Exercise 16).

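Proposition 2.24. If $(X, \Sigma, \mu)$ is a measure space, then $\mathcal{L}^\infty(X, \Sigma, \mu)$ is a vector space and the function $\rho : \mathcal{L}^\infty(X, \Sigma, \mu) \to \mathbb{R}$ defined by
\[ \rho(f) = \text{ess-sup}\, f \tag{2.8} \]
is a seminorm on $\mathcal{L}^\infty(X, \Sigma, \mu)$.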

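Definition 2.25. If $(X, \Sigma, \mu)$ is a measure space, then $L^\infty(X, \Sigma, \mu)$ denotes the normed vector space
\[ L^\infty(X, \Sigma, \mu) = \mathcal{L}^\infty(X, \Sigma, \mu) / \!\sim , \]
where $\mathcal{L}^\infty(X, \Sigma, \mu)$ is the vector space of essentially bounded functions of Definition 2.23 and where $\sim$ is the equivalence relation $f \sim g$ if $\rho(f - g) = 0$ under the seminorm (2.8). $L^\infty(X, \Sigma, \mu)$ is called an $L^\infty$-space.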

By consideration of the case where $X = \mathbb{N}$, $\Sigma = P(\mathbb{N})$, and $\mu$ is counting measure, the set $\ell^\infty(\mathbb{N})$ of bounded sequences of complex numbers
\[ a = \{\alpha_k\}_{k \in \mathbb{N}} \]
is a normed vector space under the norm
\[ \|a\| = \sup_{k \in \mathbb{N}} |\alpha_k| . \tag{2.9} \]

Notational Convention. The notation $\ell^\infty(\mathbb{N})$ refers to the normed vector space of bounded sequences under the norm (2.9). If $n \in \mathbb{N}$, then $\ell^\infty(n)$ denotes the normed vector space of sequences
\[ a = \{\alpha_k\}_{k=1}^{n} \]
of complex numbers under the norm (2.9).

Theorem 2.26. L∞ (X, Σ, µ) is a Banach space.

2.6 Banach Spaces of Continuous Functions


Assume that $X$ is a locally compact topological space. This is to say that for every $x \in X$ there is an open set $U \subset X$ containing $x$ such that the closure $\overline{U}$ of $U$ is a compact subset of $X$. Of course, if $X$ is itself compact, then it is also locally compact.

Let Cb (X) denote the set of all functions f : X → C that are continuous
and bounded. The continuity of f ∈ Cb (X) means that for each x ∈ X and
ε > 0 there is an open subset U ⊂ X containing x such that |f(x) − f(y)| < ε
for all $y \in U$. The boundedness of $f \in C_b(X)$ means that there is an $R > 0$
such that $|f(x)| < R$ for all $x \in X$.

Theorem 2.27. If $X$ is a locally compact space, then $C_b(X)$ is a Banach space, where the vector space operations are given by the usual pointwise operations, and where the norm is defined by
\[ \|f\| = \sup_{x \in X} |f(x)| , \quad f \in C_b(X) . \tag{2.10} \]

Proof. It is elementary that $C_b(X)$ is a vector space and that (2.10) defines a norm on $C_b(X)$. Thus, it remains only to show that every Cauchy sequence in $C_b(X)$ is convergent in $C_b(X)$.
Let $\{f_k\}_{k \in \mathbb{N}} \subset C_b(X)$ denote a Cauchy sequence. For each $x \in X$,
\[ | f_n(x) - f_m(x) | \le \sup_{y \in X} | f_n(y) - f_m(y) | = \| f_n - f_m \| . \]
Since $\{f_k\}_{k \in \mathbb{N}}$ is a Cauchy sequence in $C_b(X)$, $\{f_k(x)\}_{k \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{C}$ for each $x \in X$. Because $\mathbb{C}$ is complete, $\lim_k f_k(x)$ exists for every $x \in X$. Therefore, define $f : X \to \mathbb{C}$ by $f(x) = \lim_k f_k(x)$, for each $x \in X$. We aim to show (i) that $f$ is continuous and bounded, and (ii) that $\{f_k\}_{k \in \mathbb{N}}$ converges to $f$ in $C_b(X)$.
Let $\varepsilon > 0$. Because $\{f_k\}_{k \in \mathbb{N}}$ is a Cauchy sequence, there exists $N_\varepsilon \in \mathbb{N}$ such that $\| f_n - f_m \| < \varepsilon$ for all $n, m \ge N_\varepsilon$. Assume that $n \ge N_\varepsilon$. Choose any $x \in X$; thus,
\begin{align*}
| f(x) - f_n(x) | &\le | f(x) - f_m(x) | + | f_m(x) - f_n(x) | \\
  &\le | f(x) - f_m(x) | + \| f_m - f_n \| .
\end{align*}
As the inequalities above are true for all $m \in \mathbb{N}$,
\begin{align*}
| f(x) - f_n(x) | &\le \inf_{m \in \mathbb{N}} \left( | f(x) - f_m(x) | + \| f_m - f_n \| \right) \\
  &\le 0 + \varepsilon .
\end{align*}
The right hand side of the inequality above is independent of the choice of $x \in X$. Hence, if $n \ge N_\varepsilon$ is fixed, then $f - f_n$ is a bounded function $X \to \mathbb{C}$ and
\[ \sup_{x \in X} | f(x) - f_n(x) | \le \varepsilon . \]

Since, for every $\varepsilon > 0$, $f$ is uniformly within $\varepsilon$ of a continuous function, $f$ is continuous at each $x \in X$. Furthermore, since the sum of bounded functions is bounded, $f_n + (f - f_n) = f$ is bounded. This proves that $f \in C_b(X)$. Finally, since $f \in C_b(X)$ satisfies $\| f - f_n \| \le \varepsilon$ for all $n \ge N_\varepsilon$, the Cauchy sequence $\{f_k\}_{k \in \mathbb{N}}$ converges in $C_b(X)$ to $f \in C_b(X)$. □
For a compact space $X$, let $C(X)$ denote the set of all continuous functions $f : X \to \mathbb{C}$. Under the usual pointwise operations, $C(X)$ is a vector space. Moreover, for compact $X$, every continuous function $f : X \to \mathbb{C}$ is bounded and $|f|$ attains its maximum at some $x \in X$. Hence, for compact spaces $X$, $C_b(X) = C(X)$ and
\[ \|f\| = \max_{x \in X} |f(x)| , \quad \forall\, f \in C(X) . \]

Corollary 2.28. If X is a compact space, then C(X) is a Banach space.


Definition 2.29. Assume that $X$ is a locally compact space. A function $f : X \to \mathbb{C}$ vanishes at infinity if the set
\[ \{ x \in X \mid |f(x)| \ge \varepsilon \} \]
is compact in $X$ for every $\varepsilon > 0$.


Proposition 2.30. If X is a locally compact space and if C0 (X) denotes the
set of all f ∈ Cb (X) that vanish at infinity, then C0 (X) is a subspace of
Cb (X).
Proof. Exercise 12. 
Observe that Proposition 2.30 implies that C0 (X) is a Banach space.

2.7 Stone–Weierstrass Theorems


Because C has multiplication as well as addition, we may multiply f, g ∈
Cb (X) to produce a function fg : X → C whose value (fg)[x] at each x ∈ X
is defined by
(fg)[x] = f(x)g(x) .
It is not difficult to see that $fg \in C_b(X)$ and that $\|fg\| \le \|f\| \, \|g\|$.
Definition 2.31. An associative algebra $A$ over $\mathbb{C}$ is a Banach algebra if $A$ is a Banach space and if the norm on $A$ satisfies $\|ab\| \le \|a\| \, \|b\|$ for all $a, b \in A$.
By an “associative algebra” is meant a complex vector space A with mul-
tiplication such that, for all a, b, c ∈ A and α ∈ C, the following properties
hold: (α a)(b) = a(α b) = α(ab), a(b + c) = ab + ac, (a + b)c = ac + bc, and
a(bc) = (ab)c.
Thus, if X is a compact space, then C(X) is a Banach algebra. Note that
in this algebra the function denoted by “1” that sends each x ∈ X to 1 ∈ C
serves as the multiplicative identity for C(X) in the sense that f 1 = f for
every f ∈ C(X).

Definition 2.32. A uniform algebra on a compact space $X$ is a subalgebra $A \subseteq C(X)$ such that:
1. $A$ is a Banach space with respect to the norm of $C(X)$;
2. $1 \in A$;
3. $A$ separates the points of $X$; that is, if $x_1, x_2 \in X$ are distinct, then there exists a function $f \in A$ such that $f(x_1) \ne f(x_2)$.

Is $C(X)$ a uniform algebra? The answer depends on the topology of $X$.

Proposition 2.33. Let X be a compact space.


1. There exists a uniform algebra on X only if X is Hausdorff.
2. If X is Hausdorff and if {x} is a closed set for every x ∈ X, then C(X)
is a uniform algebra on X.

Proof. To prove 1, suppose that $A$ is a uniform algebra on $X$. Suppose that $x_1, x_2 \in X$ are distinct. Because $A$ is a uniform algebra on $X$, there is a function $f \in A$ such that $f(x_1) \ne f(x_2)$. In $\mathbb{C}$ there are disjoint open sets $V_1$ and $V_2$ that contain $f(x_1)$ and $f(x_2)$ respectively. Thus, by continuity of $f$, $U_1 = f^{-1}(V_1)$ and $U_2 = f^{-1}(V_2)$ are disjoint open sets in $X$ that contain $x_1$ and $x_2$ respectively. This proves that $X$ is a Hausdorff space.
For 2, suppose that $X$ is Hausdorff and singleton sets are closed. Every compact Hausdorff space is normal [9, p. 198]. Thus, $X$ is normal and, hence, if $x_0, x_1 \in X$ are distinct, then Urysohn’s Lemma applies to the point sets $\{x_0\}$ and $\{x_1\}$ (which are closed by hypothesis) to obtain a function $f \in C(X)$ such that $f(x_0) = 0 \ne 1 = f(x_1)$. This shows that $C(X)$ is a uniform algebra on $X$. □
The elements of $C(X)$ are complex-valued functions. Therefore, for each $f \in C(X)$ one can consider the continuous function $\overline{f} : X \to \mathbb{C}$ defined by
\[ \overline{f}(x) = \overline{f(x)} , \quad \forall\, x \in X . \]

Definition 2.34. A nonempty subset $S \subseteq C(X)$ is selfadjoint if $\overline{f} \in S$ for every $f \in S$.

The Stone–Weierstrass Theorem asserts that if a compact (Hausdorff) space $X$ admits a selfadjoint uniform algebra $A$, then $A = C(X)$.

Theorem 2.35. (Stone–Weierstrass Theorem) If X is a compact space and


if A is a selfadjoint uniform algebra on X, then A = C(X).

Proof. First of all, because $A$ is selfadjoint, $A$ is spanned by real-valued functions. Indeed, if $f \in A$, then $\Re f = \frac{1}{2}(f + \overline{f})$ and $\Im f = \frac{1}{2i}(f - \overline{f})$ are real-valued elements of $A$ and $f = \Re f + i\, \Im f$. Moreover, if $h \in A$ is any real-valued function, then the continuous function $|h|$ also belongs to $A$. The following few paragraphs explain why this is so.
Assume that $h \in A$ is real-valued and nonzero. Without loss of generality it may be assumed that $\|h\| = 1$; thus, $h(x) \in [-1, 1]$ for all $x \in X$. By Newton’s Binomial Theorem,
\[ \sqrt{1 - t} = 1 - \frac{t}{2} - \sum_{n=2}^{\infty} \frac{1 \cdot 3 \cdots (2n - 3)}{2^n \, n!} \, t^n , \]
which converges uniformly on $[-1, 1]$. For notational convenience, let $\varphi$ denote the function on $[-1, 1]$ given by $\varphi(t) = \sqrt{1 - t}$ and write the power series expansion above of $\varphi$ as
\[ \varphi(t) = \sum_{n=0}^{\infty} \alpha_n t^n . \]

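For each $\delta \in (0, 1)$ let $g_\delta \in C(X)$ be given by $g_\delta(x) = \delta + (1 - \delta) h(x)^2$; that is, $g_\delta = \delta + (1 - \delta) h^2$, where $\delta \in A$ is the constant function $x \mapsto \delta$. Because $A$ is an algebra, $h^2 \in A$ and $g_\delta \in A$. Furthermore, $h(x)^2 \in [0, 1]$ for all $x \in X$, and so $0 \le g_\delta \le 1$ and $0 \le 1 - g_\delta = (1 - \delta)(1 - h^2) \le 1 - \delta$. That is,
\[ 1 - g_\delta(x) \in [0, 1 - \delta] , \quad \forall\, x \in X . \]
Fix $k \in \mathbb{N}$ and define $f_{\delta,k}$ by
\[ f_{\delta,k} = \sum_{n=0}^{k} \alpha_n (1 - g_\delta)^n . \]
Thus, $f_{\delta,k} \in A$ and
\begin{align*}
\| f_{\delta,k} - (g_\delta)^{1/2} \| &= \max_{x \in X} \Big| \sum_{n=0}^{k} \alpha_n (1 - g_\delta(x))^n - \varphi(1 - g_\delta(x)) \Big| \\
  &\le \max_{t \in [0, 1 - \delta]} \Big| \sum_{n=0}^{k} \alpha_n t^n - \varphi(t) \Big| .
\end{align*}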

By Newton’s Binomial Theorem, this final quantity tends to zero as $k \to \infty$. Hence, $(g_\delta)^{1/2} \in A$ (as $A$ is norm closed). Note that $\| h^2 - g_\delta \| = \delta \| 1 - h^2 \| \to 0$ as $\delta \to 0$; that is, $g_\delta \to h^2$ uniformly on $X$ as $\delta \to 0$. The function $\psi(t) = \sqrt{t}$ is uniformly continuous on the compact set $[0, 1]$, and so $\psi \circ g_\delta \to \psi \circ h^2$ uniformly on $X$ as $\delta \to 0$. Because $\psi \circ g_\delta = (g_\delta)^{1/2} \in A$ and $\psi \circ h^2 = |h|$, the limit $\| (g_\delta)^{1/2} - |h| \| \to 0$ implies that $|h| \in A$. This completes the proof that $|h| \in A$ for every real-valued $h \in A$.
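
The uniform convergence on [0, 1 − δ] that drives this step can be observed numerically. The sketch below builds the coefficients α_n of the power series of √(1 − t) from the ratio of consecutive binomial coefficients and reports the largest sampled error of the k-th partial sum on [0, 1 − δ]. The function names, the sampling grid, and the choice δ = 0.1 are assumptions made for illustration; sampling gives only an estimate of the true maximum.

```python
import math

def sqrt_series_coefficients(k):
    """Coefficients alpha_0, ..., alpha_k of the power series of sqrt(1 - t) about t = 0."""
    alphas = [1.0]
    for n in range(1, k + 1):
        # Ratio of consecutive coefficients: alpha_n / alpha_{n-1} = (2n - 3) / (2n).
        alphas.append(alphas[-1] * (2 * n - 3) / (2 * n))
    return alphas

def max_error(k, delta, samples=501):
    """Largest sampled value of |sum_{n<=k} alpha_n t^n - sqrt(1 - t)| on [0, 1 - delta]."""
    alphas = sqrt_series_coefficients(k)
    worst = 0.0
    for j in range(samples):
        t = (1.0 - delta) * j / (samples - 1)
        partial = sum(a * t ** n for n, a in enumerate(alphas))
        worst = max(worst, abs(partial - math.sqrt(1.0 - t)))
    return worst

errors = [max_error(k, delta=0.1) for k in (5, 20, 80)]
print(errors)
```

Every coefficient after the constant term is negative, so the error of a partial sum is the tail of a series of positive terms and is largest at the right endpoint t = 1 − δ.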
As a consequence of the arguments above, if $h_1, h_2 \in A$ are real-valued, then the continuous functions $h_1 + h_2 + |h_1 - h_2|$ and $h_1 + h_2 - |h_1 - h_2|$ are elements of $A$. That is, $\max\{h_1, h_2\} = \frac{1}{2}(h_1 + h_2 + |h_1 - h_2|) \in A$ and $\min\{h_1, h_2\} = \frac{1}{2}(h_1 + h_2 - |h_1 - h_2|) \in A$. Iteration of the argument shows that for any finite number of real-valued functions $h_1, \dots, h_m$ in $A$,
\[ \max\{h_1, \dots, h_m\} \in A \quad \text{and} \quad \min\{h_1, \dots, h_m\} \in A . \]


Select any $f \in C(X)$. To show that $f \in A$, it is sufficient (because $A$ is norm closed) to show that for every $\varepsilon > 0$ there is a $\kappa \in A$ such that $\| f - \kappa \| < \varepsilon$. In fact, because $f$ is a linear combination of $\Re f, \Im f \in C(X)$, it is sufficient to assume that $f$ is real valued. To this end, fix $z_0 \in X$. If $x \in X$ and $x \ne z_0$, then there is a function $h \in A$ such that $h(x) \ne h(z_0)$. Indeed, $\Re h(x) \ne \Re h(z_0)$ or $\Im h(x) \ne \Im h(z_0)$, and so we may assume that $h$ is a real-valued function. Consider the real-valued function $g_x \in A$ defined by
\[ g_x(y) = f(z_0) + \left( f(x) - f(z_0) \right) \left( \frac{h(y) - h(z_0)}{h(x) - h(z_0)} \right) , \quad \forall\, y \in X . \]
In particular, for $y = z_0$ and $y = x$ we obtain $g_x(z_0) = f(z_0)$ and $g_x(x) = f(x)$. For $x = z_0$, let $g_{z_0}$ denote the constant function $y \mapsto f(z_0)$, which is an element of $A$ that agrees with $f$ at $z_0$. Now, for any $x \in X$, the function $g_x - f$ is continuous, and so the set $U_x \subseteq X$ defined by
\[ U_x = \{ y \in X \mid g_x(y) - f(y) < \varepsilon \} \]
is an open subset of $X$. Observe that $g_x(x) = f(x)$ implies that $x \in U_x$. Also,
\[ g_x(y) < f(y) + \varepsilon , \quad \forall\, y \in U_x . \]
Therefore, $\{U_x\}_{x \in X}$ is an open cover of $X$. Because $X$ is compact, this covering admits a finite subcover: say, $U_{x_1}, \dots, U_{x_n}$. The functions $g_{x_j}$ that define these $n$ open sets determine another element of $A$: namely, $\min\{g_{x_1}, \dots, g_{x_n}\}$. Because all of this has depended on the fixed element $z_0 \in X$, this minimum function shall be denoted by $h_{z_0}$. That is, $h_{z_0} = \min\{g_{x_1}, \dots, g_{x_n}\}$ and $h_{z_0}$ has the property
\[ h_{z_0}(y) < f(y) + \varepsilon , \quad \forall\, y \in X . \tag{2.11} \]
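Now, for each $z_0 \in X$ let $h_{z_0} \in A$ be the continuous function described above and consider the open subset $V_{z_0}$ of $X$ defined by
\[ V_{z_0} = \{ y \in X \mid h_{z_0}(y) - f(y) > -\varepsilon \} . \]
Since $h_{z_0}(z_0) - f(z_0) = 0$, $z_0 \in V_{z_0}$. Moreover,
\[ h_{z_0}(y) > f(y) - \varepsilon , \quad \forall\, y \in V_{z_0} . \]
Therefore, $\{V_{z_0}\}_{z_0 \in X}$ is an open cover of $X$ and, by the compactness of $X$, there is a finite subcover: $V_{z_1}, \dots, V_{z_m}$. Let $\kappa = \max\{h_{z_1}, \dots, h_{z_m}\}$, which is an element of $A$. Thus, for each $j = 1, \dots, m$, we have $\kappa(y) \ge h_{z_j}(y) > f(y) - \varepsilon$ for all $y \in V_{z_j}$. Because the sets $V_{z_1}, \dots, V_{z_m}$ cover $X$,
\[ \kappa(y) > f(y) - \varepsilon , \quad \forall\, y \in X . \tag{2.12} \]
Inequality (2.11) holds for each of $h_{z_1}, \dots, h_{z_m}$ and, hence, for their maximum $\kappa$. Combining inequalities (2.11) and (2.12) leads to
\[ f(y) - \varepsilon < \kappa(y) < f(y) + \varepsilon , \quad \forall\, y \in X . \]
That is, $\| f - \kappa \| < \varepsilon$. □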

Corollary 2.36. (Weierstrass Approximation Theorem) If f : [a, b] → C is a


continuous function, then for every ε > 0 there is a polynomial p such that
|f(t) − p(t)| < ε for all t ∈ [a, b].
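Proof. Let $A$ be the closure in $C([a, b])$ of the ring of polynomials in one indeterminate $t$ with complex coefficients. Thus, $A$ is a norm-closed subalgebra of $C([a, b])$ and $1 \in A$. The complex conjugate of a polynomial function on $[a, b]$ is again a polynomial function (with conjugated coefficients), and conjugation preserves the norm; hence, $A$ is selfadjoint. Moreover $A$ separates the points of $[a, b]$, for if $x_1, x_2 \in [a, b]$ are distinct, then $q(x_1) = 0$ and $q(x_2) \ne 0$ for the element $q \in A$ given by $q(t) = t - x_1$. Therefore, the Stone–Weierstrass Theorem yields $A = C([a, b])$. In particular, by the construction of $A$, if $f \in C([a, b])$ and if $\varepsilon > 0$, then there is a polynomial $p$ such that $\| f - p \| < \varepsilon$. □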
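
The proof above is an existence argument and does not produce the polynomial p. For a concrete illustration, one can use Bernstein polynomials, which are a different and constructive route to the same corollary on [0, 1]; they are not the method of these notes. The sketch below approximates the continuous function f(t) = |t − 1/2| and reports sampled uniform errors for increasing degree. The names `bernstein` and `sup_error` and the sample function are assumptions for the example.

```python
from math import comb

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    def b(t):
        return sum(f(k / n) * comb(n, k) * t ** k * (1 - t) ** (n - k)
                   for k in range(n + 1))
    return b

def sup_error(f, p, samples=201):
    """Largest sampled value of |f(t) - p(t)| on an evenly spaced grid in [0, 1]."""
    grid = [j / (samples - 1) for j in range(samples)]
    return max(abs(f(t) - p(t)) for t in grid)

f = lambda t: abs(t - 0.5)  # continuous, but not differentiable at t = 1/2
errors = [sup_error(f, bernstein(f, n)) for n in (10, 40, 160)]
print(errors)
```

The slow decay of the error reflects the corner of f at t = 1/2; the corollary guarantees only that the error can be made smaller than any given ε, not a rate.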
In the study of periodic phenomena, trigonometric polynomials are of great importance. Recall that if $t \in \mathbb{R}$ and $i = \sqrt{-1}$, then $e^{ikt} = \cos(kt) + i \sin(kt)$.
Definition 2.37. A trigonometric polynomial is a $2\pi$-periodic function $p : \mathbb{R} \to \mathbb{C}$ of the form
\[ p(t) = \sum_{k=n}^{m} \alpha_k e^{ikt} , \quad t \in \mathbb{R} , \tag{2.13} \]
where $n \le m$ in $\mathbb{Z}$ and $\alpha_n, \dots, \alpha_m \in \mathbb{C}$. The set of all trigonometric polynomials is denoted by $\mathcal{T}$.
Observe that $\mathcal{T}$ is a complex vector space. The following theorem is a classic result of Weierstrass on uniform approximation of continuous $2\pi$-periodic functions by trigonometric polynomials.
Theorem 2.38. If $f : \mathbb{R} \to \mathbb{C}$ is a $2\pi$-periodic continuous function, then for every $\varepsilon > 0$ there is a trigonometric polynomial $p \in \mathcal{T}$ such that
\[ | f(t) - p(t) | < \varepsilon , \quad \forall\, t \in \mathbb{R} . \tag{2.14} \]

Proof. Consider $\mathbb{R}$ as an additive abelian group. In the standard topology, the group operation (namely, addition) is a continuous function $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$. Let $2\pi\mathbb{Z}$ denote the set $\{ 2n\pi \mid n \in \mathbb{Z} \}$ and observe that $2\pi\mathbb{Z}$ is a closed subset of $\mathbb{R}$. Furthermore, $2\pi\mathbb{Z}$ is a (normal) subgroup of $\mathbb{R}$ and so one may consider the quotient group $T = \mathbb{R}/2\pi\mathbb{Z}$.
Denote the coset of an element $t \in \mathbb{R}$ by $[t]$. Note that $[t] = [t']$ if and only if $t' = t + 2k\pi$ for some $k \in \mathbb{Z}$. Let $Q : \mathbb{R} \to T$ be the quotient group homomorphism $t \mapsto [t]$ and define a subset $V \subseteq T$ to be open in $T$ if the preimage $Q^{-1}(V)$ is open in $\mathbb{R}$. This endows $T$ with a topology (called the quotient topology) in which $Q$ is a continuous function.
It is not difficult to verify that $T$ is a compact space in the quotient topology. To do so, let $\{V_\alpha\}_\alpha$ be a covering of $T$ by open sets. If $U_\alpha = Q^{-1}(V_\alpha)$ for each $\alpha$, then $\{U_\alpha\}_\alpha$ is a covering of $\mathbb{R}$ by open sets; in particular, $\{U_\alpha\}_\alpha$ is a covering of the interval $[-\pi, \pi]$ by open sets. Because $[-\pi, \pi]$ is compact, there is a finite subcover: $U_{\alpha_1}, \dots, U_{\alpha_n}$. Furthermore, $Q$ maps $[-\pi, \pi]$ onto $T$, and so the open sets $V_{\alpha_1}, \dots, V_{\alpha_n}$ cover $T$, which shows that $T$ is compact.
For each $n \in \mathbb{Z}$ let $\gamma_n : T \to \mathbb{C}$ be defined by $\gamma_n([t]) = e^{int}$, where here $i$ refers to $\sqrt{-1}$. Because the function $\mathbb{R} \to \mathbb{C}$ given by $t \mapsto e^{int}$ is continuous and $2\pi$-periodic, the function $\gamma_n$ is well defined and is a continuous map (Exercise 14) of $T$ into $\mathbb{C}$. Observe that $\gamma_n \gamma_m = \gamma_{m+n}$, $\gamma_0 = 1$, and that $\overline{\gamma_n} = \gamma_{-n}$ for all $m, n \in \mathbb{Z}$. Thus, the closure $A$ of $\text{Span}\, \{ \gamma_n \mid n \in \mathbb{Z} \}$ is a selfadjoint, closed subalgebra of $C(T)$ that contains 1. Furthermore, $A$ separates the points of $T$ (indeed, $\gamma_1$ separates the points of $T$) because $e^{it'} = e^{it}$ if and only if $t' - t$ is an integer multiple of $2\pi$. This shows that $A$ is a selfadjoint uniform algebra on $T$. By the Stone–Weierstrass Theorem, $A = C(T)$ and so, upstairs in $\mathbb{R}$, the trigonometric polynomials are uniformly dense in the space of $2\pi$-periodic continuous functions $\mathbb{R} \to \mathbb{C}$. □

2.8 Separable Banach Spaces

Recall that, in a Banach space $V$,

1. a subset $S \subset V$ is dense in $V$ if for every $v \in V$ and for every $\varepsilon > 0$ there is $s \in S$ such that $\| v - s \| < \varepsilon$, and
2. $V$ is said to be separable if there is a countable subset $S \subset V$ that is dense in $V$.

Theorem 2.39. C([a, b]) is a separable Banach space, for all a < b in R.

Proof. Let $\mathcal{C}$ be the set $\{1, t, t^2, t^3, \dots\}$, considered as functions on $[a, b]$. Note that $\mathcal{C}$ is a countable set. Let $\mathbb{Q} + i\mathbb{Q}$ denote the set of all $\zeta = r + is \in \mathbb{C}$ for which $r, s \in \mathbb{Q}$, and note that $\mathbb{Q} + i\mathbb{Q}$ is countable and dense in $\mathbb{C}$. Now let $P$ be the set of all polynomials with coefficients coming from the subring $\mathbb{Q} + i\mathbb{Q}$ of $\mathbb{C}$. Note that $P$ is in bijective correspondence with the set
\[ \bigcup_{n \in \mathbb{N}} (\mathbb{Q} + i\mathbb{Q})^n , \]
which is a countable union of countable sets and, hence, is countable. (The correspondence identifies each polynomial $\alpha_0 + \alpha_1 t + \dots + \alpha_n t^n \in P$ with the $(n + 1)$-tuple $(\alpha_0, \dots, \alpha_n) \in (\mathbb{Q} + i\mathbb{Q})^{n+1}$.)
Let $f \in C([a, b])$. We will show that for every $\varepsilon > 0$ there is a $p \in P$ for which $| f(t) - p(t) | < \varepsilon$ for all $t \in [a, b]$; that is, $\| f - p \| < \varepsilon$. Because the Weierstrass Approximation Theorem already asserts that there is a polynomial $q$ such that $| f(t) - q(t) | < \varepsilon$ for all $t \in [a, b]$, it is sufficient to assume that $f$ is a polynomial, say $f(t) = \sum_{k=0}^{n} \alpha_k t^k$. Let $M$ be the maximum value of the continuous function $\psi$ on $[a, b]$, where
\[ \psi(t) = \left( \sum_{k=0}^{n} |t|^{2k} \right)^{1/2} . \]
Because $\mathbb{Q} + i\mathbb{Q}$ is dense in $\mathbb{C}$, there are $\beta_0, \dots, \beta_n \in \mathbb{Q} + i\mathbb{Q}$ such that $| \alpha_k - \beta_k | < \varepsilon / (M \sqrt{n + 1})$ for each $k$. Let $p(t) = \sum_{k=0}^{n} \beta_k t^k$. Then $p \in P$ and, by the Cauchy–Schwarz inequality, for every $t \in [a, b]$,
\[ | f(t) - p(t) | \le \sum_{k=0}^{n} | \alpha_k - \beta_k | \, |t|^k \le \left( \sum_{k=0}^{n} | \alpha_k - \beta_k |^2 \right)^{1/2} \left( \sum_{k=0}^{n} |t|^{2k} \right)^{1/2} < \varepsilon . \]
Hence, $P$ is dense in $C([a, b])$. □
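
The last step of this proof, replacing the coefficients of a polynomial by nearby rational ones, can be carried out explicitly. The sketch below does this for a polynomial with real coefficients (complex coefficients would be handled by treating real and imaginary parts separately). The tolerance ε/(M√(n+1)), with M the maximum of (Σ |t|^{2k})^{1/2} on [a, b], is one choice for which the Cauchy–Schwarz estimate gives a uniform error below ε. The function names and the sample polynomial are assumptions for the example.

```python
import math
from fractions import Fraction

def approximate_with_rationals(alphas, a, b, eps):
    """Return rational coefficients beta_k with |alpha_k - beta_k| < eps / (M * sqrt(n + 1))."""
    n = len(alphas) - 1
    r = max(abs(a), abs(b))  # |t| <= r on [a, b]
    M = math.sqrt(sum(r ** (2 * k) for k in range(n + 1)))
    tol = eps / (M * math.sqrt(n + 1))
    # With denominators allowed up to `bound` > 1/tol, the nearest fraction
    # lies within 1/(2 * bound) < tol of the target.
    bound = int(1 / tol) + 1
    return [Fraction(c).limit_denominator(bound) for c in alphas]

def evaluate(coeffs, t):
    """Evaluate sum_k coeffs[k] * t^k in floating point."""
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))

alphas = [math.pi, -math.e, math.sqrt(2.0)]
betas = approximate_with_rationals(alphas, 0.0, 2.0, eps=1e-3)
grid = [2.0 * j / 400 for j in range(401)]
worst = max(abs(evaluate(alphas, t) - evaluate(betas, t)) for t in grid)
print(betas, worst)
```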

Definition 2.40. For any $[a, b] \subset \mathbb{R}$, $L^p([a, b])$ denotes $L^p(X, \Sigma, \mu)$, where $p \ge 1$, $X = [a, b]$, $\Sigma = \{ E \cap [a, b] \mid E \in \mathcal{M}(\mathbb{R}) \}$, and $\mu = m$ (Lebesgue measure).

The Weierstrass Approximation Theorem shall be used to establish the


next example of a separable Banach space.

Theorem 2.41. Lp ([a, b]) is separable, for every p ≥ 1 and all a < b in R.

Proof. As in the proof of Theorem 2.39, let P be the countable set of all poly-
nomials with coefficients coming from the subring Q + iQ of C. We shall prove
that P is dense in Lp ([a, b]). The first step is to show that any characteristic
function can be approximated in Lp ([a, b]) by polynomials g ∈ P.
Suppose that E ⊆ [a, b] is a Lebesgue-measurable set and let ϕ be the
characteristic function of E. Assume that ε > 0. If E is one of [a, b], (a, b],
[a, b), or (a, b), then ϕ = 1 almost everywhere, and the constant function 1
belongs to P. Thus, we assume that E is a proper subset of (a, b). The
regularity of Lebesgue measure implies that there are K, U ⊆ R such that K is
closed, U is open, K ⊆ E ⊆ U, m(E\K) < ε/2, and m(U\E) < ε/2. Because E is a
proper subset of (a, b), U may be assumed to be a subset of (a, b) as well. Thus,

m(U\K) = m(U\E) + m(E\K) < ε .

Let F = [a, b]\U = [a, b] ∩ U^c, which is a closed set in [a, b] disjoint from the
closed set K. By Urysohn’s Lemma, there is a continuous function h : [a, b] →
[0, 1] such that h(K) = {1} and h(F) = {0}. Because K ⊆ E ⊆ U,

h|K = ϕ|K and h|F = ϕ|F .

Thus,

\[
\begin{aligned}
\|\varphi - h\|^{p} &= \int_{[a,b]} |\varphi - h|^{p} \, dm \\
 &= \int_{K} |\varphi - h|^{p} \, dm + \int_{U \setminus K} |\varphi - h|^{p} \, dm + \int_{F} |\varphi - h|^{p} \, dm \\
 &= \int_{U \setminus K} |\varphi - h|^{p} \, dm \\
 &\le \int_{U \setminus K} (1 + 1)^{p} \, dm \\
 &< 2^{p} \varepsilon .
\end{aligned}
\]

Because h can be uniformly approximated to within 2^p ε/(b − a) by some g ∈ P,
we have that ‖ϕ − g‖^p < 2^{p+1} ε. That is, characteristic functions are
approximated in L^p([a, b]) by elements g of the countable set P.
Suppose now that ϕ is a simple function of the form

\[ \varphi = \sum_{k=1}^{n} \alpha_k \chi_{E_k} . \]

Let ε > 0. There exist β_k ∈ Q + iQ such that |α_k − β_k| < ε/n for every
1 ≤ k ≤ n. By the paragraph above, there are g_k ∈ P such that
‖χ_{E_k} − g_k‖ < √ε for all 1 ≤ k ≤ n. Let g = β_1 g_1 + . . . + β_n g_n ∈ P. Then

\[ \|\varphi - g\| \le \sum_{k=1}^{n} |\alpha_k - \beta_k| \, \|\chi_{E_k} - g_k\| < \varepsilon . \]

This proves that every simple function is approximated in Lp ([a, b]) by ele-
ments g of P.
Next, suppose that f ∈ Lp ([a, b]) is such that f(x) ≥ 0 for all x ∈ [a, b].
Since every nonnegative measurable function can be approximated pointwise
by a monotone-increasing sequence of simple functions, there is a monotone-
increasing sequence {ϕk }k∈N of simple functions ϕk : [a, b] → [0, ∞) such
that
lim ϕk (x) = f(x) , ∀ x ∈ [a, b] .
k→∞

Because 0 ≤ f(x) − ϕ_k(x) ≤ f(x) for all x ∈ [a, b], it is also true that
|f(x) − ϕ_k(x)|^p ≤ f(x)^p for all x ∈ [a, b]. Therefore, by the Dominated
Convergence Theorem, each f − ϕ_k is an element of L^p([a, b]) and

\[ \lim_{k \to \infty} \int_{[a,b]} |f - \varphi_k|^{p} \, dm = \int_{[a,b]} \lim_{k \to \infty} |f - \varphi_k|^{p} \, dm = 0 . \]

That is, ‖f − ϕ_k‖^p → 0 as k → ∞. Thus, the simple functions are dense
in L^p([a, b]). However, each simple function is approximated by a continuous
function g ∈ P; hence, P is dense in L^p([a, b]). □
One useful byproduct of the proof of the separability of Lp ([a, b]) is:

Corollary 2.42. C([a, b]) is a dense subset of Lp ([a, b]). That is, for every
p-integrable function f : [a, b] → C and ε > 0 there is a continuous function
ϑ : [a, b] → C such that kf − ϑk < ε.
If one notes that R = ⋃_{n∈N} [−n, n], then the following theorem can be
established from the previous one.

Theorem 2.43. Lp (R, M(R), m) is separable for all p ≥ 1.

Proof. Exercise 19. 

2.9 Lp(T)

Henceforth, Lp (T) shall denote Lp ([−π, π], m), where 1 ≤ p ≤ ∞.


Recall that a trigonometric polynomial is a 2π-periodic function p : R → C
of the form

\[ p(t) = \sum_{k=n}^{m} \alpha_k e^{ikt} , \quad \text{where } t \in \mathbb{R} , \; n, m \in \mathbb{Z} , \; n \le m . \tag{2.15} \]

The set of all trigonometric polynomials is a vector space denoted by T.


For every k ∈ Z, consider the 2π-periodic function u_k : [−π, π] → C given
by

\[ u_k(t) = \frac{e^{ikt}}{\sqrt{2\pi}} . \tag{2.16} \]

Observe that if n, m ∈ Z, then

\[ \int_{-\pi}^{\pi} e^{i(n-m)t} \, dt = 0 \text{ if } n \ne m , \text{ or } 2\pi \text{ if } n = m . \]

This fact implies that the elements of {u_n}_{n∈Z} are linearly independent
(Exercise 20). Hence, {u_n}_{n∈Z} is a basis for the vector space T of trigonometric
polynomials. In turn, T is dense in L^p(T), as shown by the following theorem.

Theorem 2.44. For every 1 ≤ p < ∞, the vector space T of trigonometric
polynomials is dense in L^p(T).

Proof. Theorem 2.38 shows that Span {un }n∈Z is uniformly dense in the linear
submanifold of Lp (T) consisting of all 2π-periodic continuous functions; hence,
this is also true with respect to the norm of Lp (T). Corollary 2.42 shows that
C([−π, π]) is dense in Lp ([−π, π]). Therefore, it is sufficient to show that every
f ∈ C([−π, π]) can be approximated to within ε in (the norm of) Lp (T) by a
2π-periodic continuous function h.
To this end, choose f ∈ C([−π, π]) and let ε > 0. Let M = max{|f(t)| : t ∈
[−π, π]} and choose δ > 0 such that δ < ε^p/(2^{p+1} M^p). Let h ∈ C([−π, π]) be the
function that agrees with f on [−π + δ, π − δ], is a straight line from the point
(−π, 0) to the point (−π + δ, f(−π + δ)), and is a straight line from the point
(π − δ, f(π − δ)) to the point (π, 0). Thus, |f(t) − h(t)| = 0 for t ∈ [−π + δ, π − δ]
and |f(t) − h(t)| ≤ 2M for all t ∉ [−π + δ, π − δ]. Hence,

\[ \|f - h\|^{p} = \int_{[-\pi, -\pi+\delta]} |f - h|^{p} \, dm + \int_{[\pi-\delta, \pi]} |f - h|^{p} \, dm \le 2^{p+1} M^{p} \delta . \]

That is, ‖f − h‖ < ε. □
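
The construction of h in this proof can be checked numerically on one example. The sketch below assumes f(t) = t, p = 2 and ε = 1; the helper names make_periodic and lp_distance_to_power_p are hypothetical, and the integral is approximated by a midpoint rule, so the output illustrates the estimate rather than certifying it.

```python
import math

def make_periodic(f, delta):
    """Return h that agrees with f on [-pi + delta, pi - delta] and is linear
    from (-pi, 0) and to (pi, 0) on the two end intervals of length delta."""
    a, b = -math.pi, math.pi
    def h(t):
        if t < a + delta:
            return f(a + delta) * (t - a) / delta
        if t > b - delta:
            return f(b - delta) * (b - t) / delta
        return f(t)
    return h

def lp_distance_to_power_p(f, h, p, samples=20000):
    """Midpoint-rule approximation of the integral of |f - h|^p over [-pi, pi]."""
    w = 2 * math.pi / samples
    points = (-math.pi + (j + 0.5) * w for j in range(samples))
    return sum(abs(f(t) - h(t)) ** p for t in points) * w

f = lambda t: t          # continuous on [-pi, pi] but f(-pi) != f(pi)
p, eps = 2, 1.0
M = math.pi              # max of |f| on [-pi, pi]
delta = 0.5 * eps ** p / (2 ** (p + 1) * M ** p)   # any delta below eps^p / (2^(p+1) M^p)
h = make_periodic(f, delta)
dist_p = lp_distance_to_power_p(f, h, p)
bound = 2 ** (p + 1) * M ** p * delta
```

With these choices h vanishes at both endpoints, so it extends to a 2π-periodic continuous function, and dist_p stays below the bound 2^{p+1} M^p δ used in the proof.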


The analogous result for p = ∞ is left as an exercise.

Theorem 2.45. The vector space T of trigonometric polynomials is dense in
L^∞(T).

Definition 2.46. The Fourier coefficients of f ∈ L^p(T), where p ∈ [1, ∞],
are the complex numbers f̂(k) defined by

\[ \hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dm(t) . \]

For the trigonometric polynomial p(t) = (1/√(2π)) Σ_{k=n}^{m} α_k e^{ikt}, we have

\[ \hat{p}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} p(t) e^{-ikt} \, dt = \alpha_k , \quad \text{for all } n \le k \le m . \]
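
The identity p̂(k) = α_k can be observed numerically. The following is a sketch, not part of the notes: the function names trig_poly and fourier_coefficient and the sample coefficients are hypothetical, and the integral is replaced by a uniform Riemann sum, which reproduces the integral of a trigonometric polynomial up to rounding when the number of samples exceeds the frequencies involved.

```python
import cmath
import math

def trig_poly(alpha, t):
    """p(t) = (1/sqrt(2*pi)) * sum_k alpha[k] * exp(i*k*t); alpha maps k to a coefficient."""
    return sum(a * cmath.exp(1j * k * t) for k, a in alpha.items()) / math.sqrt(2 * math.pi)

def fourier_coefficient(f, k, samples=64):
    """Uniform Riemann sum for (1/sqrt(2*pi)) * integral over [-pi, pi] of f(t) exp(-i*k*t) dt."""
    step = 2 * math.pi / samples
    nodes = [-math.pi + j * step for j in range(samples)]
    total = sum(f(t) * cmath.exp(-1j * k * t) for t in nodes)
    return total * step / math.sqrt(2 * math.pi)

alpha = {-2: 1 + 2j, 0: -0.5, 3: 4j}      # hypothetical coefficients
p = lambda t: trig_poly(alpha, t)
recovered = {k: fourier_coefficient(p, k) for k in range(-3, 5)}
```

Frequencies that do not occur in p, such as k = 1, give a coefficient that is zero up to rounding.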

Definition 2.47. Assume that 1 ≤ p ≤ ∞. The Hardy space H^p(T) is the
subset of L^p(T) defined by

\[ H^{p}(\mathbb{T}) = \Bigl\{ f \in L^{p}(\mathbb{T}) \;\Big|\; \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dm(t) = 0 , \; \forall \, k < 0 \Bigr\} . \]

The following theorem is also left to the reader to prove.

Theorem 2.48. The Hardy spaces are Banach spaces.

2.10 Exercises

1. Prove that every convex function is continuous.
2. Assume that J ⊂ R is an open interval and that ϑ : J → R has a
continuous second derivative d²ϑ/dt² at every point of J. Prove that if ϑ
is a convex function, then d²ϑ/dt² is nonnegative on J.
3. Assume that J is an open interval and that ϑ : J → R is a convex function.
Prove that if t_1, . . . , t_n ∈ [0, 1] satisfy t_1 + . . . + t_n = 1, and if x_1, . . . , x_n ∈ J,
then

\[ \vartheta\Bigl( \sum_{k=1}^{n} t_k x_k \Bigr) \le \sum_{k=1}^{n} t_k \vartheta(x_k) . \]

4. A function ϑ : J → R is strictly convex if

ϑ(λx + (1 − λ)y) < λϑ(x) + (1 − λ)ϑ(y) , ∀ λ ∈ (0, 1) and ∀ x 6= y .

a) Prove that ϑ(t) = e^{αt} is strictly convex on R for every nonzero α ∈ R.
b) Prove that ϑ(t) = t^p is strictly convex on (0, ∞) for every p > 1.
c) Prove that, for positive real numbers α_1, . . . , α_n,

\[ (\alpha_1 \cdots \alpha_n)^{1/n} = \frac{1}{n} (\alpha_1 + \cdots + \alpha_n) \]

if and only if α_1 = . . . = α_n.
5. Let α_1, . . . , α_n be positive real numbers and let p_1, . . . , p_n be positive real
numbers that satisfy 1/p_1 + . . . + 1/p_n = 1.
a) Prove Young’s inequality:

\[ \alpha_1 \cdots \alpha_n \le \sum_{k=1}^{n} \frac{\alpha_k^{p_k}}{p_k} . \]

b) Characterise the cases of equality in Young’s inequality.


6. If {ϑk }k∈N is a sequence of convex functions J → R and if supk ϑk (x)
exists for all x ∈ J, then prove that supk ϑk is a convex function.
7. Let ρ be a seminorm on a vector space V .
a) Prove that the function d : V × V → [0, ∞) defined by

d(v, w) = ρ(v − w) , v, w ∈ V ,

is a pseudo-metric on V .
b) With respect to the topology on V induced by the seminorm ρ, prove
that the vector space operations (scalar multiplication and vector ad-
dition) are continuous. That is, prove that V is a topological vector
space.
8. If ρ is a seminorm on a vector space V, and if ∼ is the relation
on V defined by
v ∼ w if ρ(v − w) = 0 ,
then prove that ∼ is an equivalence relation. Moreover, if the equivalence
classes of elements of V are denoted by

[v] = {w ∈ V | w ∼ v} ,

then prove the following assertions:
a) the set V/∼ of equivalence classes is a vector space under the operations
[v] + [w] = [v + w] , v, w ∈ V ,
α [v] = [αv] , α ∈ C, v ∈ V ;
b) the function
‖[v]‖ = ρ(v) , v ∈ V ,
is a norm on V/∼.
9. Let (X, Σ, µ) be a measure space and p, q ∈ (1, ∞) be conjugate. Assume
that f, g : X → R are nonnegative measurable functions. Prove that if f
is p-integrable and g is q-integrable, then
\[ \int_X f g \, d\mu = \Bigl( \int_X f^{p} \, d\mu \Bigr)^{1/p} \Bigl( \int_X g^{q} \, d\mu \Bigr)^{1/q} \]

if and only if there is a complex number λ such that f^p = λ g^q or g^q = λ f^p
almost everywhere.
10. If (X, Σ, µ) is a measure space, then prove that L^∞(X, Σ, µ) is a vector
space and that the function ρ : L^∞(X, Σ, µ) → R defined by

\[ \rho(f) = \inf \bigl\{ \alpha \in \mathbb{R} \;\big|\; |f|^{-1}(\alpha, \infty) \text{ has measure zero} \bigr\} \]

is a seminorm on L^∞(X, Σ, µ).


11. Prove that L∞ (X, Σ, µ) is a Banach space, for every measure space
(X, Σ, µ).
12. Assume that X is a locally compact space. Prove the following assertions.
a) C0 (X) is a subspace of Cb (X).
b) C0 (X) = Cb (X) if and only if X is compact.

13. Consider the function f(t) = t on a closed interval [0, b], b > 0. Prove
that there is a sequence of polynomials pn such that
a) p_n(0) = 0, for every n ∈ N, and
b) lim_{n→∞} max_{t∈[0,b]} |t − p_n(t)| = 0.

14. Let T denote the group R/2πZ, equipped with the quotient topology, and
for each n ∈ Z let γ_n : T → C be defined by γ_n([t]) = e^{int}, where i refers
to √−1. Prove that the functions γ_n are well defined and continuous.
15. Consider the Banach space ℓ^p(N), where 1 ≤ p < ∞.
a) Prove that the closed unit ball of ℓ^p(N) is not compact.
b) Prove that ℓ^p(N) has a Schauder basis.
16. Suppose that f is an essentially bounded function on a measure space
(X, Σ, µ). Prove that there exists E ∈ Σ such that µ(E) = 0 and
sup_{t∈X\E} |f(t)| < ∞.
17. Prove that if f is an essentially bounded function on a measure space
(X, Σ, µ), then

\[ \operatorname{ess\text{-}sup} f = \inf \bigl\{ \alpha \in \mathbb{R} \;\big|\; |f|^{-1}(\alpha, \infty) \text{ has measure zero} \bigr\} . \]

18. Prove that the Banach space ℓ^∞(N) is not separable.


19. Prove that L^p(R, M(R), m) is a separable Banach space, for every p ≥ 1.
(Suggestion: The Banach spaces V_n = L^p([−n, n]) are separable for every
n ∈ N, and each V_n can be viewed as a subspace of L^p(R, M(R), m). Is
the union ⋃_{n∈N} V_n dense in L^p(R, M(R), m)?)
20. For every k ∈ Z, consider the 2π-periodic function u_k : [−π, π] → C
given by u_k(t) = e^{ikt}/√(2π). Prove that the elements of {u_n}_{n∈Z} are linearly
independent.
21. Prove that the vector space T of trigonometric polynomials is dense in
L^∞([−π, π]).
22. Prove that the Hardy spaces H^p(T) (p ≥ 1) and H^∞(T) are Banach
spaces.
3
Duality

3.1 Operators
Definition 3.1. If V and W are normed vector spaces, then an operator from
V to W is a function T : V → W such that
1. T is linear—that is, T (α1 v1 +α2 v2 ) = α1 T (v1 )+α2 T (v2 ), for all v1 , v2 ∈ V
and α1 , α2 ∈ C—and
2. T is continuous.
The continuity of a linear transformation can be rephrased in terms of the
boundedness of the transformation.
Definition 3.2. If V and W are normed vector spaces, then a linear
transformation T : V → W is bounded if there is a constant M > 0 such that

‖T(v)‖ ≤ M ‖v‖ , ∀ v ∈ V .

Proposition 3.3. A linear transformation T : V → W between normed vector
spaces V and W is bounded if and only if it is continuous.
Proof. Assume that T is bounded and let M > 0 be such that ‖T(v)‖ ≤ M ‖v‖,
for all v ∈ V. Fix v_0 ∈ V. By linearity of T,

‖T(v) − T(v_0)‖ = ‖T(v − v_0)‖ ≤ M ‖v − v_0‖ .

Thus, if ε > 0 and if δ = ε/M, then ‖v − v_0‖ < δ implies ‖T(v) − T(v_0)‖ < ε.
Hence, T is continuous at v_0.
Assume that T is continuous. In particular, T is continuous at 0. Thus, for
ε = 1 there is a δ > 0 such that ‖w‖ < δ implies ‖T(w)‖ < 1. Let M = 2/δ.
If v ∈ V is nonzero, then let w = (δ/(2‖v‖)) v. Thus, ‖w‖ = δ/2 < δ and so
‖T(w)‖ < 1. That is,

‖T(v)‖ < M ‖v‖ ,

which shows that T is bounded. □

Proposition 3.4. Assume that V and W are normed vector spaces.


1. If V has finite dimension, then every linear transformation T : V → W
is bounded.
2. If V has infinite dimension and if W 6= {0}, then there is a linear trans-
formation T : V → W such that T is unbounded.

Proof. The first statement is left to the reader (Exercise 2).


For the second statement, we assume V has infinite dimension. Let B be a
linear basis of V (necessarily infinite), and let {yk | k ∈ N} ⊆ B be a countably
infinite subset. Without loss of generality, we may assume each vector of B
has norm 1.
Because W is nonzero, there is at least one vector w ∈ W of norm kwk = 1.
Define a linear transformation T : V → W by the following action on the
vectors of the linear basis B:
T (yk ) = k w , for each k ∈ N ;
T (y) = 0 , for each y ∈ B\{yk | k ∈ N} .

Note that ‖T(y_k)‖ = k ‖w‖ = k. Because ‖y_k‖ = 1 for every k ∈ N, T is
unbounded. □
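
A concrete unbounded linear transformation, different from the basis construction in the proof, is differentiation on the space of polynomials on [0, 1] with the supremum norm: the monomial t^k has norm 1 while its derivative k t^{k−1} has norm k. The sketch below only illustrates this standard example; the function names are hypothetical and the supremum is taken over a finite grid that contains the endpoint t = 1, where both suprema are attained.

```python
def sup_norm(f, samples=1000):
    """Approximate sup over [0, 1] of |f(t)| on a uniform grid containing 0 and 1."""
    return max(abs(f(j / samples)) for j in range(samples + 1))

def monomial(k):
    """The polynomial t -> t^k."""
    return lambda t: t ** k

def derivative_of_monomial(k):
    """The derivative t -> k * t^(k-1) of the monomial t^k."""
    return lambda t: k * t ** (k - 1)

# ratio ||D(t^k)|| / ||t^k|| for k = 1, ..., 7; it grows without bound as k grows
ratios = [sup_norm(derivative_of_monomial(k)) / sup_norm(monomial(k))
          for k in range(1, 8)]
```

No single constant M can satisfy ‖D(q)‖ ≤ M‖q‖ for all polynomials q, because the ratio equals k on the k-th monomial.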

Proposition 3.5. If V and W are normed vector spaces, then the set B(V, W)
of all operators T : V → W is a vector space under the operations

(T_1 + T_2)(v) = T_1(v) + T_2(v) and (α T)(v) = α T(v) ,

for v ∈ V and α ∈ C. Moreover, the function ‖ · ‖ : B(V, W) → R defined by

\[ \|T\| = \sup_{0 \ne v \in V} \frac{\|T(v)\|}{\|v\|} \tag{3.1} \]

is a norm on B(V, W).

Proof. It is clear that B(V, W ) carries the structure of a vector space, and so
it is sufficient to prove that (3.1) defines a norm on B(V, W ).
To this end, note that (using the fact that every linear transformation
sends zero to zero) equation (3.1) yields

‖T(v)‖ ≤ ‖T‖ ‖v‖ , ∀ v ∈ V . (3.2)

Thus, ‖T‖ = 0 only if ‖T(v)‖ = 0 for every v ∈ V; hence, T(v) = 0 for all
v ∈ V. This proves that T is the zero transformation: T = 0.
It is clear that ‖αT‖ = |α| ‖T‖, and so we now establish the triangle
inequality. Let T_1, T_2 ∈ B(V, W). For every v ∈ V,

\[
\begin{aligned}
\|(T_1 + T_2)(v)\| &= \|T_1(v) + T_2(v)\| \\
 &\le \|T_1(v)\| + \|T_2(v)\| \\
 &\le (\|T_1\| + \|T_2\|) \, \|v\| ,
\end{aligned}
\]

and so

\[ \|T_1 + T_2\| = \sup_{0 \ne v \in V} \frac{\|(T_1 + T_2)(v)\|}{\|v\|} \le \|T_1\| + \|T_2\| . \]

This completes the proof. □
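
For a finite-dimensional illustration of definition (3.1), consider a real matrix acting on R^n with the supremum norm (real scalars are an assumption made here for simplicity; the notes work over C). In that setting the supremum in (3.1) is attained at a vector with entries ±1, and equals the largest absolute row sum. The following sketch, with hypothetical helper names, compares the two computations.

```python
from itertools import product

def sup_norm(v):
    """The norm max_i |v_i| on R^n."""
    return max(abs(x) for x in v)

def apply(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def operator_norm_by_sign_vectors(A):
    """sup of ||A v|| over the unit ball, evaluated at its extreme points (entries +1 or -1)."""
    n = len(A[0])
    return max(sup_norm(apply(A, s)) for s in product((-1.0, 1.0), repeat=n))

def max_abs_row_sum(A):
    """Closed form for the operator norm with respect to the supremum norm."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]          # hypothetical operator from R^3 to R^2
norm_A = operator_norm_by_sign_vectors(A)
v = [0.3, -0.7, 0.2]
```

Inequality (3.2) can then be checked on any particular vector, for instance the vector v above.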
Notational Convention. If V and W are normed vector spaces, then
B(V, W ) denotes the normed vector space of Proposition 3.5—that is, it is
assumed implicitly that B(V, W ) is normed by definition (3.1) (as there could
be other norms on the set of all operators V → W ). If W = V , then B(V ) is
to denote B(V, V ).

Theorem 3.6. If V is a normed vector space and W is a Banach space, then


B(V, W ) is a Banach space.

Proof. Assume that {T_k}_{k∈N} is a Cauchy sequence in B(V, W). Inequality
(3.2) implies that {T_k(v)}_{k∈N} is a Cauchy sequence in W for every v ∈ V.
Because W is a Banach space, each sequence {T_k(v)}_{k∈N} is convergent in W;
denote the limit by T(v). Note that the map

v ↦ T(v)

is indeed a linear transformation T : V → W. It remains to show that T is
bounded and that lim_k ‖T_k − T‖ = 0.
Let ε > 0. Because {T_k}_{k∈N} is a Cauchy sequence, there exists N_ε ∈ N
such that ‖T_n − T_m‖ < ε for all n, m ≥ N_ε. If v ∈ V and if n, m ≥ N_ε, then

\[
\begin{aligned}
\|T(v) - T_n(v)\| &\le \|T(v) - T_m(v)\| + \|T_m(v) - T_n(v)\| \\
 &\le \|T(v) - T_m(v)\| + \|T_m - T_n\| \, \|v\| .
\end{aligned}
\]

As the inequalities above are true for all m ≥ N_ε, and as T_m(v) → T(v),

\[
\begin{aligned}
\|T(v) - T_n(v)\| &\le \inf_{m \ge N_\varepsilon} \bigl( \|T(v) - T_m(v)\| + \|T_m - T_n\| \, \|v\| \bigr) \\
 &\le 0 + \varepsilon \|v\| .
\end{aligned}
\]

Hence, if n ≥ N_ε is fixed, then T − T_n is bounded and ‖T − T_n‖ ≤ ε. Therefore,
T_n + (T − T_n) = T is bounded and ‖T − T_n‖ ≤ ε for all n ≥ N_ε. This proves
that the Cauchy sequence {T_k}_{k∈N} converges in B(V, W) to T ∈ B(V, W). □
Some operators of special importance are the isometries.
Some operators of special importance are the isometries.
46 3 Duality

Definition 3.7. If V and W are normed vector spaces, then a linear
transformation T : V → W is called an isometry if ‖Tv‖ = ‖v‖, for every v ∈ V.

Of course, every isometry T is bounded and, if V ≠ {0}, has norm ‖T‖ = 1.
Moreover, ‖Tv‖ = ‖v‖ implies that Tv = 0 if and only if v = 0, and so every
isometry is an injection.

Proposition 3.8. If V is a Banach space and if W is a normed space, then


the range of every isometry T : V → W is closed in the topology of W .

Proof. Let w ∈ W be in the closure of the range of T. Since W is a metric space,
‖w − T(v_n)‖ → 0, for some sequence of vectors v_n ∈ V. Thus, {T(v_n)}_{n∈N} is a
Cauchy sequence in W. Because ‖Tv_n − Tv_m‖ = ‖T(v_n − v_m)‖ = ‖v_n − v_m‖,
the sequence {v_n}_{n∈N} is Cauchy in V. As V is a Banach space, there is a limit
v ∈ V to this sequence. Moreover,

‖Tv − w‖ ≤ ‖T(v − v_n)‖ + ‖Tv_n − w‖ = ‖v − v_n‖ + ‖Tv_n − w‖ , ∀ n ∈ N.

Both terms on the right tend to 0 as n → ∞. Hence, w = Tv. That is, the range
of T contains all of its limit points, implying that the range of T is closed. □

Definition 3.9. Two normed vector spaces V and W are said to be isomet-
rically isomorphic if there is an isometry T : V → W such that T is a
surjection.

Thus, if V and W are Banach spaces, and if T : V → W is an isometry,


then V is isometrically isomorphic to the range of T . Hence, T embeds V into
W and in so doing preserves the Banach space structure of V . In such cases,
we say that W contains a copy of V .
To this point it is not at all clear whether B(V, W ) has any elements other
than the zero transformation. This issue is addressed in the following section.

3.2 The Hahn–Banach Theorem


Definition 3.10. If V is a normed vector space, then a linear functional on
V is a linear transformation ϕ : V → C with the property that there is a
constant M > 0 such that

|ϕ(v)| ≤ M kvk , ∀v ∈ V .

The set V ∗ of all linear functionals on V is called the dual space of V .

Thus, V∗ = B(V, C). Since C is complete, V∗ is a Banach space, by
Theorem 3.6.
If V has finite dimension n ∈ N, then the concept of dual basis in linear
algebra shows that V∗ has dimension n too. However, if V has infinite
dimension, then it is not obvious a priori that V∗ has any nonzero elements
whatsoever. To show that V∗ has (many) nonzero elements, we require the
Hahn–Banach Theorem, which is possibly the most important fundamental
theorem in functional analysis.
To prove the Hahn–Banach Theorem, it shall be convenient to consider
normed vector spaces V as real vector spaces. Note that if ϕ ∈ V∗, then the
norm of ϕ is given by

\[ \|\varphi\| = \sup_{v \in V, \, \|v\| = 1} |\varphi(v)| . \]

Thus, an R-linear transformation ψ : V → R is said to be bounded if

\[ \sup_{v \in V, \, \|v\| = 1} |\psi(v)| \]

is finite; if so, then this finite value is denoted by ‖ψ‖.

Lemma 3.11. If ϕ ∈ V∗ and if ψ = Re(ϕ), then ψ : V → R is R-linear,
bounded, and ‖ψ‖ = ‖ϕ‖.

Proof. Because ψ(v) = ½ (ϕ(v) + \overline{ϕ(v)}), for every v ∈ V, ψ is R-linear and
‖ψ‖ ≤ ‖ϕ‖. Thus, ψ is bounded. To show that ‖ϕ‖ ≤ ‖ψ‖, note that if v ∈ V,
then there is a θ ∈ R such that e^{iθ} ϕ(v) = |ϕ(v)|. Thus,

|ϕ(v)| = e^{iθ} ϕ(v) = ϕ(e^{iθ} v) = ψ(e^{iθ} v) ≤ ‖ψ‖ ‖e^{iθ} v‖ ,

whence ‖ϕ‖ ≤ ‖ψ‖. □

Lemma 3.12. If ψ : V → R is R-linear and bounded, then the function
ϕ : V → C defined by

ϕ(v) = ψ(v) − iψ(iv) , v ∈ V ,

is C-linear, bounded, and satisfies ‖ϕ‖ = ‖ψ‖.

Proof. For every ζ ∈ C, ζ = Re(ζ) − i Re(iζ); thus, the formula ϕ(v) = ψ(v) −
iψ(iv) implies that ψ = Re(ϕ). It is straightforward to show, for all v, v_1, v_2 ∈ V
and r, s ∈ R, that ϕ(v_1 + v_2) = ϕ(v_1) + ϕ(v_2), ϕ(rv) = rϕ(v), and ϕ(isv) =
isϕ(v). Thus, ϕ is C-linear. The verification of ‖ϕ‖ = ‖ψ‖ is the same as in
Lemma 3.11. □
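
The passage between a complex-linear functional and its real part can be tested on a small example. The sketch below assumes a hypothetical functional phi on C² (its coefficients are arbitrary) and checks that the formula ψ(v) − iψ(iv) recovers phi from ψ = Re(phi).

```python
def phi(v):
    """A hypothetical C-linear functional on C^2."""
    return (2 - 1j) * v[0] + (0.5 + 3j) * v[1]

def psi(v):
    """The real part of phi, which is an R-linear functional."""
    return phi(v).real

def phi_from_psi(v):
    """Rebuild phi from psi through phi(v) = psi(v) - i * psi(i v)."""
    iv = [1j * x for x in v]
    return complex(psi(v), -psi(iv))

vectors = [[1 + 0j, 0j], [0j, 1j], [2 - 3j, -1 + 0.5j]]
errors = [abs(phi_from_psi(v) - phi(v)) for v in vectors]
```

The scalar identity ζ = Re(ζ) − i Re(iζ) is what makes each entry of errors vanish up to rounding.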
We are now ready to prove the Hahn–Banach theorem.

Theorem 3.13. (Hahn–Banach Extension Theorem) If L is a linear submanifold
of a normed vector space V, and if ϕ : L → C is a bounded linear map,
then there is a linear functional Φ : V → C such that Φ|L = ϕ and ‖Φ‖ = ‖ϕ‖.

Proof. If ϕ = 0, then we may take Φ ∈ V∗ to be Φ = 0. Therefore, assume
henceforth that ϕ ≠ 0. Let ψ = Re(ϕ); by Lemma 3.11, ψ is R-linear and
‖ψ‖ = ‖ϕ‖. If we are able to find an R-linear map Ψ : V → R such that
Ψ|L = ψ and ‖Ψ‖ = ‖ψ‖, then (by Lemma 3.12) the desired extension Φ ∈ V∗
can be defined by Φ(v) = Ψ(v) − iΨ(iv), v ∈ V. Therefore, we need only focus
on the case of a nonzero bounded R-linear map ψ : L → R. Without loss of
generality, assume that ‖ψ‖ = 1.
We begin with a Zorn’s Lemma argument. Define a set S consisting of all
ordered pairs (M, ϑ) such that M is a linear submanifold of V containing L
and ϑ : M → R is a bounded R-linear map satisfying kϑk = 1 and ϑ|L = ψ.
The set S is nonempty since (L, ψ) ∈ S. Define a partial order “≤” on S by

(M1 , ϑ1) ≤ (M2 , ϑ2) if and only if M1 ⊆ M2 and ϑ2|M1 = ϑ1 .

With respect to this partial order, let F be any linearly ordered subset of S.
Hence, there is a linearly ordered set Λ such that

F = {(Mλ , ϑλ) | λ ∈ Λ} .

Define M ⊆ V by

\[ M = \bigcup_{\lambda \in \Lambda} M_\lambda , \]

and note that M is a linear submanifold of V containing L. Furthermore, the
function ϑ : M → R given by ϑ(v) = ϑ_λ(v), if v ∈ M_λ, is well defined,
R-linear, and satisfies ‖ϑ‖ = 1. Thus, (M, ϑ) ∈ S and (M, ϑ) is an upper bound
for F. Hence, by Zorn’s Lemma, S has a maximal element.
Let (M, ϑ) denote a maximal element of S. By definition of S, kϑk = 1;
thus, it remains only to prove that M = V . Suppose, on the contrary, that
M 6= V . Choose v0 ∈ V \M . If x, y ∈ M , then

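\[
\begin{aligned}
|\vartheta(x) - \vartheta(y)| &= |\vartheta(x - y)| \\
 &\le \|x - y\| \\
 &= \|x - v_0 + v_0 - y\| \\
 &\le \|v_0 - x\| + \|v_0 - y\| .
\end{aligned}
\]

Hence,

\[ \sup_{x \in M} \bigl( \vartheta(x) - \|v_0 - x\| \bigr) \le \inf_{y \in M} \bigl( \vartheta(y) + \|v_0 - y\| \bigr) . \]

Let δ_0 ∈ R satisfy

\[ \sup_{x \in M} \bigl( \vartheta(x) - \|v_0 - x\| \bigr) \le \delta_0 \le \inf_{y \in M} \bigl( \vartheta(y) + \|v_0 - y\| \bigr) ; \]

therefore,
|δ_0 − ϑ(x)| ≤ ‖v_0 − x‖ , ∀ x ∈ M .
Next, consider the subspace M_0 = Span_R {v_0}. Let M_1 = M_0 + M, which is a
linear submanifold of V which properly contains M. Because M_0 ∩ M = {0},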
each vector v ∈ M_1 has a unique expression as a sum of vectors in M_0 and
M; that is, for each v ∈ M_1 there are unique α ∈ R and x ∈ M such that
v = αv_0 + x. Therefore, the function ϑ_1 : M_1 → R defined by

ϑ_1(αv_0 + x) = αδ_0 + ϑ(x)

is a well-defined R-linear map. The following arguments will show that ϑ_1 is
bounded and that ‖ϑ_1‖ = 1.
If x ∈ M, then |ϑ_1(x)| = |ϑ(x)| ≤ ‖x‖. If α ∈ R is nonzero, and if x ∈ M,
then

\[
\begin{aligned}
|\vartheta_1(\alpha v_0 + x)| &= |\alpha \delta_0 + \vartheta(x)| = |\alpha| \, \Bigl| \delta_0 + \vartheta\Bigl( \frac{1}{\alpha} x \Bigr) \Bigr| \\
 &\le |\alpha| \, \Bigl\| v_0 + \frac{1}{\alpha} x \Bigr\| \\
 &= \|\alpha v_0 + x\| .
\end{aligned}
\]
Thus, kϑ1 k ≤ 1 as an R-linear map M1 → R. Because ϑ1 is an extension of
ϑ from M to M1 , 1 = kϑk ≤ kϑ1 k ≤ 1 implies that kϑ1 k = 1, which shows
that (M1 , ϑ1) ∈ S. However, because M1 properly contains M , the relation
(M, ϑ) ≤ (M1 , ϑ1) contradicts the maximality of (M, ϑ) in S. Therefore, it
must be that M = V . Hence, a maximal element of S has the form (V, ϑ). To
complete the proof, let Ψ = ϑ. 
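
The one-step extension at the heart of this proof can be made concrete in a small real example. The sketch below is an illustration under stated assumptions, not part of the proof: V = R² with the norm |x| + |y|, M is the first coordinate axis with ϑ((x, 0)) = x, v_0 = (0, 1), and the supremum and infimum over M are taken over a finite grid on which they happen to be attained. All function names are hypothetical.

```python
def norm1(v):
    """The norm |x| + |y| on R^2."""
    return abs(v[0]) + abs(v[1])

def theta(x):
    """theta((x, 0)) = x on M = {(x, 0)}; it has norm 1 for norm1."""
    return x

v0 = (0.0, 1.0)
grid = [float(j) for j in range(-50, 51)]      # sample points x, standing for (x, 0) in M

# admissible values of delta0 lie between these two numbers
lower = max(theta(x) - norm1((v0[0] - x, v0[1])) for x in grid)
upper = min(theta(y) + norm1((v0[0] - y, v0[1])) for y in grid)

def extension(delta0):
    """theta1(alpha*v0 + (x, 0)) = alpha*delta0 + theta(x), written for v = (x, alpha)."""
    return lambda v: v[1] * delta0 + theta(v[0])

def is_contractive(functional):
    """Check |functional(v)| <= norm1(v) on a grid of vectors v = (x, alpha)."""
    return all(abs(functional((x, alpha))) <= norm1((x, alpha))
               for x in grid for alpha in grid)
```

In this example every δ_0 in the interval [lower, upper] gives an extension of norm at most 1, while a value outside the interval, such as 1.5, does not.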

Corollary 3.14. If v is a nonzero element of a normed vector space V, then
there is a ϕ ∈ V∗ such that ‖ϕ‖ = 1 and ϕ(v) = ‖v‖.

Proof. On the 1-dimensional subspace L = {λv | λ ∈ C}, let ϕ_0 : L → C be
given by ϕ_0(λv) = λ‖v‖. Then ϕ_0 is a linear transformation and ϕ_0(v) = ‖v‖.
The norm of ϕ_0 is 1, since |ϕ_0(λv)| = |λ| ‖v‖ = ‖λv‖. By the Hahn–Banach
Theorem, there is an extension of ϕ_0 to a linear functional ϕ ∈ V∗ such that
‖ϕ‖ = ‖ϕ_0‖ = 1. □
Another immediate application of note:

Proposition 3.15. If V and W are nonzero normed vector spaces, then


B(V, W ) is nonzero.

Proof. Exercise 6. 

3.3 Complementation of Finite-Dimensional Spaces


Definition 3.16. A subspace M of a Banach space V is complemented if there
is a subspace N of V, which is called the complement of M, such that
1. M ∩ N = {0} and
2. M + N = V.

It need not be true that a subspace of a Banach space be complemented.
But all finite-dimensional subspaces are complemented (thanks to the
Hahn–Banach Theorem).

Proposition 3.17. Every finite-dimensional subspace of a Banach space is
complemented.

Proof. Let {v_1, . . . , v_n} be a basis of a finite-dimensional subspace M of a
Banach space V. By the linear algebra of vector spaces, there are linear functionals
ϕ̃_i : M → C such that ϕ̃_i(v_j) = 0, for j ≠ i, and ϕ̃_i(v_i) = 1, for each
1 ≤ i ≤ n. By the Hahn–Banach Theorem, each of these linear functionals
has an extension to a continuous linear functional ϕ_i : V → C. Let

\[ N = \bigcap_{i=1}^{n} \ker \varphi_i , \]

which is a subspace of V (as each ker ϕ_i is closed). It is easy to verify that
M ∩ N = {0}. To show that V = M + N, choose any v ∈ V. Let

\[ w = \sum_{j=1}^{n} \varphi_j(v) \, v_j , \]

which is an element of M, and consider z = v − w. Since v = w + z, it is enough
to prove that z ∈ N; that is, to prove that ϕ_i(z) = 0 for every 1 ≤ i ≤ n. To
this end, note that ϕ_i(w) = Σ_{j=1}^{n} ϕ_j(v) ϕ_i(v_j) = ϕ_i(v) and compute:

ϕ_i(z) = ϕ_i(v) − ϕ_i(w) = ϕ_i(v) − ϕ_i(v) = 0.

Thus, V = M + N. □

3.4 Weak Topology


Let V and W be any normed vector spaces and suppose that F is a family of
functions f : V → W . Consider

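\[ \mathcal{B} = \bigl\{ f_1^{-1}(\Omega_1) \cap \cdots \cap f_n^{-1}(\Omega_n) \;\big|\; n \in \mathbb{N} , \; \Omega_j \subseteq W \text{ is an open set} , \; f_j \in \mathcal{F} \bigr\} \]
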
and let TF (V ) denote the collection of all sets formed by arbitrary unions of
U ∈ B. The set TF (V ) satisfies the axioms for a topology on V . One can think
of it as being the coarsest (or smallest, weakest) topology on V that makes
each function f ∈ F continuous.

Definition 3.18. The weak topology on V induced by F is the topology TF (V ).

Particular choices of F lead to special cases of interest and utility.

Definition 3.19. If V is a normed vector space, then the weak topology on V
is the topology T_{V∗}(V); that is, the weak topology on V induced by the family
F = V∗.
Suppose that v_0 ∈ V and that U ⊂ V is a weakly open set (that is,
U ∈ T_{V∗}(V)) containing v_0. Thus, there are ϕ_1, . . . , ϕ_n ∈ V∗ and open sets
Ω_1, . . . , Ω_n ⊆ C such that

\[ v_0 \in \bigcap_{j=1}^{n} \varphi_j^{-1}(\Omega_j) \subset U . \]

As ϕ_j(v_0) ∈ Ω_j ⊆ C for each j, there are positive real numbers ε_1, . . . , ε_n such
that, for each j,
{ζ ∈ C | |ζ − ϕ_j(v_0)| < ε_j} ⊂ Ω_j .
Hence,
v_0 ∈ {v ∈ V | |ϕ_j(v) − ϕ_j(v_0)| < ε_j , ∀ j = 1, . . . , n} ⊂ U .
The set in the middle is a basic weakly open neighbourhood of v_0.
Definition 3.20. A net {v_α}_{α∈Λ} ⊂ V converges weakly to v ∈ V if for each
weakly open set U ⊂ V containing v there is an α_0 ∈ Λ such that v_α ∈ U for
every α ≥ α_0.
In the norm topology of a Banach space V, there are bounded open sets
U ⊂ V that contain 0 ∈ V (e.g., U = B_ε(0), for any ε > 0). However, a quite
different situation exists in the weak topology.
Proposition 3.21. If V is an infinite-dimensional Banach space, and if U ⊂
V is a weakly open set such that 0 ∈ U , then U is unbounded. In fact, there
is an infinite-dimensional subspace L ⊂ V such that L ⊂ U .
Proof. By definition of weak topology, there exist ε_1, . . . , ε_n > 0 and
ϕ_1, . . . , ϕ_n ∈ V∗ such that
0 ∈ {v ∈ V | |ϕ_j(v)| < ε_j , ∀ j = 1, . . . , n} ⊂ U .
Let

\[ L = \bigcap_{j=1}^{n} \ker \varphi_j , \]

a subspace of V which is clearly contained in U. We need only verify that L
has infinite dimension.
Let Φ : V → C^n be the linear transformation

\[ \Phi(v) = \begin{pmatrix} \varphi_1(v) \\ \vdots \\ \varphi_n(v) \end{pmatrix} , \quad v \in V . \]

Note that ker Φ = L. The First Isomorphism Theorem in Linear Algebra
asserts that the quotient space V/L is isomorphic to the range of Φ, which is
a subspace of C^n. Hence, L cannot have finite dimension (because dim V ≤
n + dim L). □
52 3 Duality

3.5 The Second Dual

Definition 3.22. The second dual of a normed vector space V is the dual
space (V ∗ )∗ of V ∗ , and is denoted by V ∗∗ .

The following proposition shows that V∗∗ contains an isometric copy of V.

Proposition 3.23. For every normed vector space V, there is an operator
T : V → V∗∗ such that ‖Tv‖ = ‖v‖, for every v ∈ V. If V is a Banach space,
then the range of T is a subspace of V∗∗.

Proof. For each v ∈ V, let ω_v : V∗ → C be the linear transformation defined
by ω_v(ϕ) = ϕ(v), for ϕ ∈ V∗. Because |ϕ(v)| ≤ ‖ϕ‖ ‖v‖, for every ϕ ∈ V∗,
the linear map ω_v is bounded. Hence, ω_v ∈ V∗∗, for every v ∈ V.
It is straightforward that the map T : V → V∗∗ that sends each v ∈ V
to the function ω_v is linear (i.e., ω_{v_1} + ω_{v_2} = ω_{v_1+v_2} and λω_v = ω_{λv}, for all
v_1, v_2, v ∈ V and λ ∈ C). Thus, T is a linear transformation. Moreover, for
each v ∈ V,

\[ \|Tv\| = \sup_{\varphi \in V^{*}, \, \|\varphi\| = 1} |\omega_v(\varphi)| = \sup_{\varphi \in V^{*}, \, \|\varphi\| = 1} |\varphi(v)| = \|v\| . \]

(The last equality follows from Exercise 4.) Thus, T is bounded, norm
preserving, and injective.
It remains to show that if V is a Banach space, then the range of T is a
subspace of V∗∗. This is simply the statement of Proposition 3.8. □

Definition 3.24. A Banach space V is said to be reflexive if the operator T


in Proposition 3.23 is a surjection.

Thus, V is reflexive only if V and V∗∗ are isometrically isomorphic Banach
spaces. However, here is a strange fact: there exists a nonreflexive Banach
space V such that V and V∗∗ are isometrically isomorphic [8]. What is the
difference? In this example [8], the surjective isometry between V and V∗∗ is
not of the form of the isometry T in Proposition 3.23.

3.6 Weak∗ Topology

Proposition 3.23 shows that V can be viewed as sitting inside the dual of V ∗ .
Namely, every vector v ∈ V is a linear functional on V ∗ via

ϕ 7→ ϕ(v) , ϕ ∈V∗.

Thus, V as a family of functions on V∗ induces a weak topology on its dual
V∗.

Definition 3.25. If V is a normed vector space, then the weak∗ topology on
the dual space V∗ is the weak topology on V∗ induced by the range of T in
Proposition 3.23.

By definition, a net {ϕ_α}_{α∈Λ} ⊂ V∗ is weak∗ convergent to ϕ ∈ V∗ if

\[ \lim_{\alpha} \varphi_\alpha(v) = \varphi(v) \quad \text{for every } v \in V . \]

Next, a result of fundamental importance.

Theorem 3.26. (Banach–Alaoglu) Let V be a normed vector space and let
X ⊂ V∗ be the closed unit ball of V∗. Then X is compact in the weak∗
topology.

Proof. For each v ∈ V, let K_v = {λ ∈ C | |λ| ≤ ‖v‖}. Consider the space
K = ∏_{v∈V} K_v, endowed with the product topology. (Recall that the product
topology on K is generated by basic open sets of the form ∏_{v∈V} U_v, where
each U_v is open in K_v and U_v ≠ K_v for at most finitely many v ∈ V.) By
Tychonoff’s Compactness Theorem [9, Theorem 5.1.1], K is compact.
If ζ : V → C is a function for which |ζ(v)| ≤ ‖v‖, for all v ∈ V, then
we may identify ζ with an element of K, namely (ζ(v))_{v∈V}. Conversely, each
ζ ∈ K determines a function ζ : V → C such that |ζ(v)| ≤ ‖v‖, for all v ∈ V.
Therefore, in viewing K as a set of complex-valued functions on V, it is clear
that X ⊂ K. We shall show that X is closed in the topology of K, and that
the topology X inherits from K is the weak∗ topology. This will then imply
that X is weak∗ compact (as closed subsets of compact spaces are compact).
Let U ⊂ K be an open set such that ϕ ∈ U for some ϕ ∈ X. By definition
of the product topology, there are v1 , . . . , vn ∈ V and ε1 , . . . , εn > 0 such that

U0 = {ζ ∈ K | |ζ(vj ) − ϕ(vj )| < εj , 1 ≤ j ≤ n} ⊂ U .

Because U0 is an open set in K, U0 ∩ X is a relatively open subset of X. But


U0 ∩ X is clearly a weak∗ open neighbourhood of ϕ ∈ X. Hence, the topology
on X inherited from K is precisely the weak∗ topology.
We now show that X is a closed subset of K. Let ζ ∈ K be in the closure
of X. We must show that ζ is a linear function. To this end, suppose that
v_1, v_2 ∈ V and λ_1, λ_2 ∈ C. Let ε > 0 and define

U_1 = {ω ∈ K | |ω(v) − ζ(v)| < ε , v ∈ {v_1, v_2, λ_1 v_1 + λ_2 v_2}} ,

which is an open subset of K containing ζ. Since ζ is in the closure of X,
there exists ϕ ∈ X such that ϕ ∈ U_1. Hence,

|ϕ(v_1) − ζ(v_1)| < ε , |ϕ(v_2) − ζ(v_2)| < ε ,

and
|ϕ(λ_1 v_1 + λ_2 v_2) − ζ(λ_1 v_1 + λ_2 v_2)| < ε .
Therefore, because ϕ is linear,

|ζ(λ_1 v_1 + λ_2 v_2) − λ_1 ζ(v_1) − λ_2 ζ(v_2)| < (1 + |λ_1| + |λ_2|) ε .

As ε > 0 is arbitrary, ζ must be linear; since |ζ(v)| ≤ ‖v‖ for all v ∈ V, it
follows that ζ ∈ X. □

3.7 Subspaces of C(X)


The following striking theorem shows that all Banach spaces arise as subspaces
of C(X) for various choices of compact spaces X.

Theorem 3.27. For every Banach space V there is a compact space X such
that V is isometrically isomorphic to a subspace of C(X).

Proof. Let X be the unit ball of V ∗ , which is compact when endowed with the
weak∗ topology (Theorem 3.26). By Proposition 3.23, there is an isometric
embedding T : V → V ∗∗ whereby T v(ϕ) = ϕ(v), for all ϕ ∈ X and v ∈ V .
Thus we need only show that T v ∈ C(X) for every v ∈ V .
Let v ∈ V. To show that Tv is (weak∗) continuous at ϕ_0 ∈ X, let ε > 0
and consider the open ε-ball Ω centered at Tv(ϕ_0), namely

Ω = {λ ∈ C | |λ − ϕ_0(v)| < ε} .

The pre-image of Ω under Tv is {ϕ ∈ X | |ϕ(v) − ϕ_0(v)| < ε}, which is an
open set in (the weak∗ topology of) X. Hence, Tv is continuous. □
If V is a separable Banach space, then the topological space X that arises
in Theorem 3.27 is in fact a compact metric space. This follows from the
proposition below.

Proposition 3.28. If V is a separable Banach space and K is a weak∗ compact
subset of the dual space V∗, then K is metrisable.

Proof. Let T_{w∗} denote the weak∗ topology of K. We will show below that
there is a metric d on K such that T_d = T_{w∗}, where T_d is the topology on K
induced by the metric d.
Because V is separable, there is a countable dense subset {v_n}_{n∈N} ⊂ V.
Define a function d : K × K → R_+ by

\[ d(\varphi_1, \varphi_2) = \sum_{n=1}^{\infty} \frac{1}{2^{n}} \, |\varphi_1(v_n) - \varphi_2(v_n)| , \quad \forall \, \varphi_1, \varphi_2 \in K . \]

The function d plainly satisfies d(ϕ_1, ϕ_2) = d(ϕ_2, ϕ_1) and d(ϕ_1, ϕ_3) ≤
d(ϕ_1, ϕ_2) + d(ϕ_2, ϕ_3), and therefore the only property of d left to verify is that
d(ϕ_1, ϕ_2) = 0 only if ϕ_1 = ϕ_2. Now if d(ϕ_1, ϕ_2) = 0, then ϕ_1(v_n) = ϕ_2(v_n) for
every n ∈ N. We need to show that ϕ_1(v) = ϕ_2(v) for every v ∈ V. Choose
v ∈ V and let ε > 0. There exists n ∈ N such that ‖v − v_n‖ < ε. Hence,

|ϕ_1(v) − ϕ_2(v)| = |ϕ_1(v) − ϕ_1(v_n) + ϕ_2(v_n) − ϕ_2(v)| < ε (‖ϕ_1‖ + ‖ϕ_2‖) .

As ε > 0 is arbitrary, ϕ_1(v) = ϕ_2(v). Hence, d(ϕ_1, ϕ_2) = 0 only if ϕ_1 = ϕ_2.


For each n ∈ N, let ωn = T vn , where T is the isometry T : V → V ∗∗
of Theorem 3.23. Thus, ωn (ϕ) = ϕ(vn ), for every ϕ ∈ K and n ∈ N. Because each
ωn is a weak∗ continuous function on K, the metric d is continuous with respect to
the weak∗ topology of K. Hence, for each ϕ0 ∈ K and ε > 0, the ball

{ϕ ∈ K | d(ϕ, ϕ0 ) < ε} ,

which is an open set in (K, Td), is an open set in (K, Tw∗ ). Thus, Td ⊆ Tw∗ .
Conversely, if one shows that every weak∗ closed set in K is closed in the
metric topology of K, then it follows that every weak∗ open set in K is open
in the metric topology—that is, Tw∗ ⊆ Td . To this end, if F ⊆ K is weak∗
closed, then F is necessarily weak∗ compact (because the space (K, Tw∗ ) is
compact). Since Td ⊆ Tw∗ , every covering of F by open sets in (K, Td ) admits,
therefore, a finite subcovering. Hence, F is compact in (K, Td ). Every metric
space is Hausdorff, and every compact subset of a Hausdorff space is closed
[9, Theorem 3.5.3]; hence F is a closed set in the metric space (K, Td ).

3.8 Exercises

1. Let T : V → W be a linear transformation between normed vector spaces.
Prove that T is bounded if and only if T is continuous at at least one point
v0 ∈ V .
2. Assume that V is a finite-dimensional Banach space and W is any normed
vector space. Prove that every linear transformation T : V → W is
bounded.
3. Prove that the dual space V ∗ of a normed linear space V separates the
points of V ; that is, prove that v1 , v2 ∈ V are distinct if and only if there
is a ϕ ∈ V ∗ such that ϕ(v1 ) ≠ ϕ(v2 ).
4. Prove that if V is a normed vector space and v ∈ V , then
\[
\|v\| = \sup_{\varphi \in V^*,\ \|\varphi\| = 1} |\varphi(v)| .
\]

5. Prove that a linear transformation ϕ : V → C is bounded if and only if
ker ϕ is a subspace of V .

6. Prove that if V and W are nonzero Banach spaces, then there exists a
nonzero operator T : V → W .
7. Suppose that M, N ⊂ V are nonzero subspaces of a Banach space V . If
N has finite dimension, then prove that

M + N = {u + w | u ∈ M, w ∈ N }

is a subspace of V .
8. Prove that the following Banach spaces are reflexive:
a) Every Hilbert space
b) Every finite-dimensional Banach space
9. Let V be a normed vector space and suppose that ϕ1 , . . . , ϕn ∈ V ∗ . Let
\[
L = \bigcap_{j=1}^{n} \ker \varphi_j .
\]
Assume that ϕ ∈ V ∗ satisfies ϕ(ξ) = 0, for every ξ ∈ L.
a) Let Φ : V → Cn be given by
\[
\Phi(v) = \begin{bmatrix} \varphi_1(v) \\ \vdots \\ \varphi_n(v) \end{bmatrix} , \quad v \in V .
\]
Show that there is a linear functional ψ : Cn → C such that ϕ(v) = ψ (Φ(v)), for
every v ∈ V .
b) Use the Riesz Representation Theorem (Theorem 6.28) for Cn to show
that there are λ1 , . . . , λn ∈ C such that
\[
\varphi = \sum_{j=1}^{n} \lambda_j \varphi_j .
\]
10. Prove that the closed unit ball of an infinite-dimensional Hilbert space is
weakly compact.
11. Let X be the unit ball of the dual space V ∗ of a Banach space V , endowed
with the weak∗ topology. Is X necessarily a Hausdorff space?
12. Prove that the vector space operations on V ∗ are continuous in the weak∗
topology, for every normed vector space V .
4
Dual Spaces in Classical Analysis

This [incomplete] chapter considers the dual spaces of some of the Banach
spaces in classical analysis.

4.1 The Dual of Lp, 1 ≤ p < ∞

Example 4.1. If p, q ∈ (1, ∞) satisfy 1/p + 1/q = 1 and if g ∈ Lq (X, Σ, µ),
then the linear transformation ϕ : Lp (X, Σ, µ) → C defined by
\[
\varphi(f) = \int_X f g \, d\mu , \quad \forall\, f \in L^p(X, \Sigma, \mu) , \tag{4.1}
\]
is a linear functional on Lp (X, Σ, µ) of norm ‖ϕ‖ = ‖g‖.

Proof. Let g ∈ Lq (X, Σ, µ) be fixed. Hölder’s Inequality (Theorem 2.10) states
that
\[
\int_X |fg| \, d\mu \le \left( \int_X |f|^p \, d\mu \right)^{1/p} \left( \int_X |g|^q \, d\mu \right)^{1/q} , \quad \forall\, f \in L^p(X, \Sigma, \mu) .
\]

Hence, the function ϕ : Lp (X, Σ, µ) → C defined by equation (4.1) is a linear
functional for which |ϕ(f)| ≤ ‖f‖ ‖g‖ for all f ∈ Lp (X, Σ, µ). Therefore,
‖ϕ‖ ≤ ‖g‖. The rest of the proof aims to show that ‖g‖ ≤ ‖ϕ‖.
By definition, |ϕ(f)| ≤ ‖ϕ‖ ‖f‖ for all f ∈ Lp (X, Σ, µ). Let ζ : X → C be
given by ζ(x) = |g(x)|/g(x) if x ∉ g−1 ({0}), and by ζ(x) = 1 if x ∈ g−1 ({0}).
Thus, ζ is measurable, and |ζ(x)| = 1 and |g(x)| = ζ(x)g(x) for all x ∈ X.
Let f : X → C be defined by

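\[
f(x) = |g(x)|^{q/p} \, \zeta(x) , \quad x \in X .
\]
Observe that f is p-integrable and that fg = |g|^{q/p} |g| = |g|^{1+q/p} = |g|^q . Thus,
\[
\begin{aligned}
\|g\|^q &= \int_X |g|^q \, d\mu \\
&= \int_X f g \, d\mu \\
&= |\varphi(f)| \\
&\le \|\varphi\| \, \|f\| \\
&= \|\varphi\| \left( \int_X |f|^p \, d\mu \right)^{1/p} \\
&= \|\varphi\| \left( \int_X |g|^q \, d\mu \right)^{1/p} \\
&= \|\varphi\| \, \|g\|^{q/p} .
\end{aligned}
\]
Hence, the equation q − q/p = 1 shows that ‖g‖ ≤ ‖ϕ‖.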
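
The norming function f used in the proof can be checked numerically. The following is a minimal sketch, assuming a finite set X with counting measure (so integrals become finite sums); the helper names lp_norm and norming_function and the sample values of p, q, and g are illustrative choices, not part of the notes.

```python
def lp_norm(values, p):
    """p-norm of a finite sequence with respect to counting measure."""
    return sum(abs(x) ** p for x in values) ** (1.0 / p)


def norming_function(g, p, q):
    """Return f = |g|^(q/p) * zeta, where zeta = |g|/g (and zeta = 1 where g = 0)."""
    f = []
    for gx in g:
        zeta = abs(gx) / gx if gx != 0 else 1.0
        f.append(abs(gx) ** (q / p) * zeta)
    return f


# Illustrative data (assumed): conjugate exponents 1/3 + 1/1.5 = 1.
p, q = 3.0, 1.5
g = [2.0, -1.0, 0.0, 0.5 + 0.5j]

f = norming_function(g, p, q)
phi_f = sum(fx * gx for fx, gx in zip(f, g))  # phi(f) = "integral" of f*g
ratio = abs(phi_f) / lp_norm(f, p)            # |phi(f)| / ||f||_p

# The proof predicts that this ratio equals ||g||_q.
print(ratio, lp_norm(g, q))
```

In this finite setting the ratio |ϕ(f)|/‖f‖ agrees with ‖g‖ up to rounding, which is the statement that the bound ‖ϕ‖ ≤ ‖g‖ is attained at f.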
There is a converse to Example 4.1, which is a (remarkable) theorem of
F. Riesz and is stated below but not proved.

Theorem 4.2. (Riesz Representation Theorem) Suppose that p, q ∈ (1, ∞)
are conjugate. If ϕ is a linear functional on Lp (X, Σ, µ), then there is a unique
g ∈ Lq (X, Σ, µ) such that ‖ϕ‖ = ‖g‖ and
\[
\varphi(f) = \int_X f g \, d\mu , \quad \forall\, f \in L^p(X, \Sigma, \mu) .
\]

Corollary 4.3. Lp (X, Σ, µ) is a reflexive Banach space for every 1 < p < ∞.

4.2 The Radon–Nikodým Theorem


Hilbert space duality yields a functional analytic proof of the Radon–Nikodým
Theorem in measure theory, as will be shown in this section.
If (X, Σ) is a measurable space and if µ and ν are measures on (X, Σ),
then define a function λ : Σ → [0, ∞] by

λ(E) = µ(E) + ν(E) , ∀E ∈ Σ . (4.2)

Denote this function λ by λ = µ + ν.

Proposition 4.4. The function λ on (X, Σ) defined by equation (4.2) is a
measure on (X, Σ).

Proof. Exercise 1.

Definition 4.5. If (X, Σ) is a measurable space and if µ and ν are measures
on (X, Σ), then ν is absolutely continuous with respect to µ if ν(E) = 0 for
every E ∈ Σ for which µ(E) = 0.

Observe that if (X, Σ, µ) is a measure space and g : X → R is a nonnegative
measurable function, then the function ν : Σ → [0, ∞] defined by
\[
\nu(E) = \int_E g \, d\mu , \quad \forall\, E \in \Sigma ,
\]
is a measure on (X, Σ) and ν is absolutely continuous with respect to µ. The
Radon–Nikodým Theorem asserts that the converse is true.

Theorem 4.6. (Radon–Nikodým Theorem) Assume that (X, Σ) is a measurable
space and suppose that µ and ν are measures on (X, Σ) such that µ(X)
and ν(X) are finite. If ν is absolutely continuous with respect to µ, then there
is a nonnegative measurable function g : X → R such that
\[
\nu(E) = \int_E g \, d\mu , \quad \forall\, E \in \Sigma .
\]

Proof. Let λ = µ + ν. For notational simplicity, L2 (λ) and L2 (µ) shall denote,
respectively, the Hilbert spaces L2 (X, Σ, λ) and L2 (X, Σ, µ). Because µ(X)
and ν(X) are finite, so is λ(X) and the constant (measurable) function 1 is an
element of both L2 (λ) and L2 (µ).
Without loss of generality, we may assume that µ(X) = 1. If ψ : X → R
is a nonnegative simple function with canonical form
\[
\psi = \sum_{j=1}^{n} \alpha_j \chi_{E_j} ,
\]
then
\[
\int_X \psi \, d\lambda = \sum_{j=1}^{n} \alpha_j \left( \mu(E_j) + \nu(E_j) \right) = \int_X \psi \, d\mu + \int_X \psi \, d\nu .
\]

Therefore, if h ∈ L2 (λ) and if {ψk }k∈N is a monotone increasing sequence of
nonnegative simple functions such that |h(x)|2 = limk ψk (x), for all x ∈ X,
then the Monotone Convergence Theorem, with respect to each of the three
measures, yields
\[
\int_X |h|^2 \, d\lambda = \int_X |h|^2 \, d\mu + \int_X |h|^2 \, d\nu \ge \int_X |h|^2 \, d\mu .
\]
Thus, h ∈ L2 (µ) for every h ∈ L2 (λ).


Therefore, on the Hilbert space L2 (λ) one can define a (linear) function
ϕ : L2 (λ) → C by
\[
\varphi(h) = \int_X h \, d\mu , \quad \forall\, h \in L^2(\lambda) .
\]
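Observe that if f ∈ L2 (λ) and f ≥ 0 almost everywhere, then ϕ(f) ≥ 0. By
the Cauchy–Schwarz inequality and the fact that 1 ∈ L2 (µ), for any h ∈ L2 (λ)
we have
\[
\begin{aligned}
|\varphi(h)| &= \left| \int_X h \, d\mu \right| \\
&= \left| \langle h, 1 \rangle_{L^2(\mu)} \right| \\
&\le \left( \int_X |h|^2 \, d\mu \right)^{1/2} \left( \int_X 1^2 \, d\mu \right)^{1/2} \\
&= \left( \int_X |h|^2 \, d\mu \right)^{1/2} \sqrt{\mu(X)} \\
&\le \left( \int_X |h|^2 \, d\lambda \right)^{1/2} \\
&= \|h\| .
\end{aligned}
\]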
Thus, ϕ is nonnegative, linear, bounded, and of norm ‖ϕ‖ ≤ 1. Hence, Theorem
6.28 implies that there is a unique q ∈ L2 (λ) such that ϕ(h) = ⟨h, q⟩L2 (λ)
for all h ∈ L2 (λ). That is,
\[
\int_X h \, d\mu = \int_X h q \, d\lambda , \quad \forall\, h \in L^2(\lambda) .
\]

In particular, using h = χE ,
\[
\mu(E) = \int_E q \, d\lambda , \quad \forall\, E \in \Sigma . \tag{4.3}
\]
It is also the case that λ(E) = µ(E) + ν(E) (by definition). By simple algebra,
we have another expression for λ(E):
\[
\lambda(E) = \int_E d\lambda = \int_E \left( q + (1 - q) \right) d\lambda = \int_E q \, d\lambda + \int_E (1 - q) \, d\lambda .
\]
Hence,
\[
\nu(E) = \int_E (1 - q) \, d\lambda , \quad \forall\, E \in \Sigma . \tag{4.4}
\]
By Exercise ??, equations (4.3) and (4.4) show that it may be assumed that
0 ≤ q(x) ≤ 1 for all x ∈ X.


If E ∈ Σ and ψ = χE , then
\[
\int_X \psi \, d\mu = \mu(E) = \int_E q \, d\lambda = \int_X \psi \, q \, d\lambda . \tag{4.5}
\]
By linearity, equation (4.5) extends to all nonnegative simple functions ψ:
\[
\int_X \psi \, d\mu = \int_X \psi \, q \, d\lambda . \tag{4.6}
\]
Further, by approximating any nonnegative measurable function ψ by a
monotone increasing sequence of nonnegative simple functions ψk , such that
ψ(x) = limk ψk (x) for all x ∈ X, equation (4.6) also holds for all nonnegative
measurable functions ψ (by the Monotone Convergence Theorem).
If F = {x ∈ X | q(x) = 0}, then the function ψ = χF c (1/q) is nonnegative and
measurable, and so equation (4.6) becomes
\[
\int_{F^c} \frac{1}{q} \, d\mu = \int_X \psi \, d\mu = \int_X \psi \, q \, d\lambda = \int_{F^c} d\lambda . \tag{4.7}
\]
Moreover, by equation (4.3), µ(F ) = ∫F q dλ = 0 because q vanishes on F . As ν
is absolutely continuous with respect to µ, this implies that ν(F ) = 0 and
λ(F ) = 0. Therefore, for any E ∈ Σ,

λ(E) = λ(E ∩ F ) + λ(E ∩ F c) = 0 + λ(E ∩ F c) .

A similar equation holds for µ: namely, µ(E) = µ(E ∩F c). Therefore, equation
(4.7) yields, for every E ∈ Σ,

\[
\int_E \frac{1}{q} \, d\mu = \int_{E \cap F^c} \frac{1}{q} \, d\mu = \int_{E \cap F^c} d\lambda = \int_E d\lambda . \tag{4.8}
\]

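Now let g = χF c (1/q − 1), which is nonnegative and measurable. For every
E ∈ Σ,
\[
\begin{aligned}
\nu(E) &= \lambda(E) - \mu(E) \\
&= \int_E \frac{1}{q} \, d\mu - \int_E d\mu \\
&= \int_{E \cap F^c} \left( \frac{1}{q} - 1 \right) d\mu \\
&= \int_E g \, d\mu ,
\end{aligned}
\]
which completes the proof of the theorem.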
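
The construction of g from q can be traced on a finite measurable space, where every measure is a list of point masses and integrals are finite sums. The sketch below is only an illustration under that assumption; the point masses chosen for mu and nu, and the helper names integral and check_all_subsets, are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Point masses on X = {0, 1, 2, 3}; nu vanishes wherever mu does,
# so nu is absolutely continuous with respect to mu. Here mu(X) = 1.
mu = [Fr(1, 2), Fr(1, 4), Fr(0), Fr(1, 4)]
nu = [Fr(1, 3), Fr(2), Fr(0), Fr(1, 5)]
lam = [m + n for m, n in zip(mu, nu)]  # lambda = mu + nu

# q is the density of mu with respect to lambda, as in equation (4.3).
# On lambda-null points its value is immaterial; 0 is used here.
q = [m / l if l != 0 else Fr(0) for m, l in zip(mu, lam)]

# g = chi_{F^c} (1/q - 1), where F = {x : q(x) = 0}.
g = [1 / qx - 1 if qx != 0 else Fr(0) for qx in q]


def integral(h, measure, E):
    """Integral of h over the subset E with respect to a point-mass measure."""
    return sum(h[x] * measure[x] for x in E)


def check_all_subsets():
    """Verify nu(E) equals the integral of g over E (w.r.t. mu) for every E."""
    points = range(len(mu))
    return all(
        integral(g, mu, E) == sum(nu[x] for x in E)
        for r in range(len(mu) + 1)
        for E in combinations(points, r)
    )


print(check_all_subsets())
```

Exact rational arithmetic is used so that the equality ν(E) = ∫E g dµ can be tested without rounding error; in this discrete case g reduces to the ratio ν({x})/µ({x}) wherever µ({x}) > 0.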

4.3 Exercises

1. If (X, Σ) is a measurable space and if µ and ν are measures on (X, Σ),
then define a function λ : Σ → [0, ∞] by

λ(E) = µ(E) + ν(E) , ∀E ∈ Σ.

Denote this function λ by λ = µ + ν. Prove that λ is a measure on (X, Σ).

2. Let 1 ≤ p < ∞ and consider the Banach space Lp ([−π, π], m). Show that
for each n ∈ Z the function ϕn : Lp ([−π, π], m) → C defined by
\[
\varphi_n(f) = \hat{f}(n) ,
\]
is an element of the dual space of Lp ([−π, π], m).


3. Let X be a compact Hausdorff space. Fix x0 ∈ X and define a function
ϕ : C(X) → C by

ϕ(f) = f(x0 ), f ∈ C(X) .

Prove that ϕ ∈ C(X)∗ and that ‖ϕ‖ = 1.
4. Let X be a compact Hausdorff space. Suppose that µ is a finite Borel
measure on X and define a function ϕ : C(X) → C by
\[
\varphi(f) = \int_X f \, d\mu , \quad f \in C(X) .
\]
Prove that ϕ ∈ C(X)∗ and that ‖ϕ‖ = µ(X).

5
Convexity

To this point in our study, Banach spaces have been considered solely on the
basis of their analytical and topological features. In this chapter, a geometrical
aspect of Banach spaces—namely, convexity—is examined.
If u and v are vectors in a normed vector space V , then the set of all vectors
of the form τ u + (1 − τ )v, where τ ∈ [0, 1] ⊂ R, comprises a line segment in
V connecting v to u. A subset of V that contains the line segments between
any of its points is called a convex set. That is, a subset C of a normed vector
space V is convex if τ u + (1 − τ )v ∈ C, for every τ ∈ [0, 1] and all u, v ∈ C.

5.1 Convex Combinations


Definition 5.1. Assume that V is a vector space.
1. A convex combination of v1 , . . . , vn ∈ V is a sum of the form
\[
\sum_{j=1}^{n} \tau_j v_j , \tag{5.1}
\]
where each τj ∈ [0, 1] and τ1 + · · · + τn = 1.
2. If, in (5.1), each τj ≠ 0, then the convex combination is said to be proper.
3. If S ⊂ V , then the set of all convex combinations of elements of S is called
the convex hull of S and is denoted by Conv S.
Proposition 5.2. A subset C ⊂ V is convex if and only if C contains all
convex combinations of its elements.
Proof. Exercise 2. 
Corollary 5.3. The convex hull Conv S of any S ⊆ V is a convex subset of
V.
The following theorem is very useful in the convex analysis of
finite-dimensional space.
Theorem 5.4. (Carathéodory) If W is an n-dimensional vector space over R
and if S ⊂ W is nonempty, then for each w ∈ Conv S there are z1 , . . . , zm ∈ S
such that w is a convex combination of z1 , . . . , zm and m ≤ n + 1.
Proof. Let w ∈ Conv S. Thus, w is a proper convex combination of some
z1 , . . . , zm ∈ S, say
\[
w = \sum_{j=1}^{m} \tau_j z_j , \quad \text{where each } \tau_j \neq 0 .
\]
If m ≤ n + 1, then the desired conclusion is reached. Therefore, suppose
that m > n + 1. Let W̃ = R ⊕ W and let z̃j = 1 ⊕ zj , 1 ≤ j ≤ m. Since
dim W̃ = n + 1 < m, the vectors z̃1 , . . . , z̃m ∈ W̃ are linearly dependent.
Thus, there are α1 , . . . , αm ∈ R, not all zero, such that Σj αj z̃j = 0; hence,
Σj αj = 0 in R and Σj αj zj = 0 in W . Because the αj sum to zero and are not
all zero, at least one αj is positive. Let i be such that
\[
\frac{\alpha_j}{\tau_j} \le \frac{\alpha_i}{\tau_i} , \quad \forall\, j \neq i ,
\]
so that αi > 0, and set νj = τj − αj τi /αi for every j. Then
\[
\nu_j \in [0, 1] , \quad \nu_i = 0 , \quad \sum_{j=1}^{m} \nu_j = \sum_{j=1}^{m} \tau_j = 1 , \quad \text{and} \quad w = \sum_{j=1}^{m} \nu_j z_j .
\]
But the number of summands in Σj νj zj is less than m since νi = 0. Hence, if
w can be expressed as a convex combination of m > n + 1 elements of S, then
w can be expressed as a convex combination of m − 1 of these same elements.
This shows, by iteration of the argument, that the number of summands can
be reduced to n + 1.
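
The reduction step in the proof is constructive, and it can be sketched in code. The following is an illustrative implementation in exact rational arithmetic, assuming W = R^n with points given as coordinate tuples; the function names null_vector and caratheodory and the sample data are assumptions made for this sketch.

```python
from fractions import Fraction as Fr


def null_vector(rows):
    """Return a nonzero a with rows @ a = 0, assuming more columns than rank."""
    m = [list(r) for r in rows]
    ncols = len(m[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = next(c for c in range(ncols) if c not in pivots)
    a = [Fr(0)] * ncols
    a[free] = Fr(1)
    for row, c in zip(m, pivots):
        a[c] = -row[free]
    return a


def caratheodory(points, weights):
    """Reduce a convex combination in R^n to at most n + 1 points."""
    n = len(points[0])
    pairs = [(tuple(Fr(x) for x in z), Fr(t))
             for z, t in zip(points, weights) if t != 0]
    while len(pairs) > n + 1:
        zs = [z for z, _ in pairs]
        ts = [t for _, t in pairs]
        # Columns are the lifted vectors 1 (+) z_j of the proof.
        rows = [[Fr(1)] * len(zs)] + [[z[k] for z in zs] for k in range(n)]
        alpha = null_vector(rows)
        # Index i maximising alpha_j / tau_j; the maximum is positive.
        i = max(range(len(zs)), key=lambda j: alpha[j] / ts[j])
        nus = [t - a * ts[i] / alpha[i] for t, a in zip(ts, alpha)]
        pairs = [(z, v) for z, v in zip(zs, nus) if v != 0]
    return pairs


pts = [(0, 0), (4, 0), (0, 4), (4, 4), (1, 1)]
wts = [Fr(1, 5)] * 5
reduced = caratheodory(pts, wts)
print(reduced)
```

Each pass of the loop removes at least one point, exactly as in the iteration described in the proof, so for five points in the plane one pass already yields a combination of three points.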
The upper bound of n + 1 in Carathéodory’s Theorem is sharp, as one sees
by considering a triangle C in R2 with vertex set S.
Corollary 5.5. If V is an n-dimensional vector space over C and if S ⊂ V
is nonempty, then for each v ∈ Conv S there are u1 , . . . , um ∈ S such that v
is a convex combination of u1 , . . . , um and m ≤ 2n + 1.

5.2 Extreme Points and Faces


Triangles and squares are determined by their vertices, a disc by its boundary
circle, and a Euclidean ball in R3 by its boundary surface (sphere). The general
concept in convexity theory that captures these phenomena is that of an
extreme point.

Definition 5.6. An element v in a convex subset C ⊆ V is an extreme point
of C if the equation
\[
v = \sum_{j=1}^{n} \tau_j v_j , \quad \text{where } v_j \in C \text{ and } \tau_j \neq 0 \text{ for each } 1 \le j \le n , \tag{5.2}
\]
holds only for v1 = · · · = vn = v. The set of all extreme points of C is denoted
by ext C.

Thus, if C is convex, then v ∈ C is an extreme point if v is not a proper
convex combination of elements other than itself. Geometrically this means:

Proposition 5.7. Let C be a convex set and v ∈ C. The following statements
are equivalent:
1. v is an extreme point of C;
2. v is not interior to any line segment contained in C (i.e., if there are
v1 , v2 ∈ C and τ ∈ (0, 1) such that v = τ v1 + (1 − τ )v2 , then v1 = v2 = v).

Proof. Exercise 3.
Proposition 5.7 implies that the extreme points of a convex set C must lie
on the topological boundary of C. However, some convex sets do not include
their boundaries—the open unit disc D in C is one such example—and so
they do not possess extreme points. On the other hand, in finite-dimensional
spaces, compact convex sets are closed and bounded, and so in these cases there
is a well-defined boundary. But do these sets necessarily have extreme points?
The answer lies in the Kreı̌n–Milman Theorem below.

Definition 5.8. A convex subset F of a convex set C is a face of C if F has the
property that τ v1 + (1 − τ )v2 ∈ F , for some τ ∈ (0, 1) and v1 , v2 ∈ C, only if
v1 , v2 ∈ F .

Proposition 5.7 gives an alternate way to consider extreme points: v ∈ C
is an extreme point of a convex set C if and only if {v} is a face of C.

Theorem 5.9. (Kreı̌n–Milman) Assume that V is a normed vector space
endowed with a topology T . If ∅ ≠ K ⊂ V is a convex subset of V , such that K
is compact in the topology T , then every nonempty T -compact face F ⊆ K
contains an extreme point of K.

Proof. Let F be a nonempty compact face of K and let

SF = {G ⊆ F | G is a nonempty compact face of F } ,

partially ordered by reverse inclusion: G1 ⪯ G2 if and only if G2 ⊆ G1 . Let
L be a linearly ordered subset of SF . Hence, if G1 , . . . , Gn ∈ L, then there is
a j0 such that Gi ⪯ Gj0 for all 1 ≤ i ≤ n. Hence, ∅ ≠ Gj0 ⊆ G1 ∩ · · · ∩ Gn .
Therefore, the family L of compact sets has the finite intersection property,
and so F0 ≠ ∅, where
\[
F_0 = \bigcap_{G \in L} G .
\]
As F0 is a nonempty compact face of F , by Exercise 6, F0 is an upper bound
in SF for L. Zorn’s Lemma implies that SF has a maximal element, say E.
Let ϕ ∈ V ∗ , γ = max{Re ϕ(v) | v ∈ E}, and Eϕ = {v ∈ E | Re ϕ(v) = γ}. If
τ ∈ (0, 1) and v1 , v2 ∈ E are such that τ v1 + (1 − τ )v2 ∈ Eϕ , then
\[
\begin{aligned}
\gamma &= \operatorname{Re} \varphi\left( \tau v_1 + (1 - \tau) v_2 \right) \\
&= \tau \operatorname{Re} \varphi(v_1) + (1 - \tau) \operatorname{Re} \varphi(v_2) \\
&\le \tau \gamma + (1 - \tau) \gamma \\
&= \gamma ,
\end{aligned}
\]
which implies that Re ϕ(v1 ) = Re ϕ(v2 ) = γ; that is, v1 , v2 ∈ Eϕ . Hence, Eϕ is
a face of E and, as well, a face of F (Exercise 6). Furthermore, Eϕ is closed,
and so it is compact (since it is a closed subset of a compact set). Thus,
Eϕ ∈ SF and E ⪯ Eϕ . By the maximality of E in SF , E = Eϕ . We conclude,
therefore, that for every ϕ ∈ V ∗ and all v1 , v2 ∈ E, Re ϕ(v1 ) = Re ϕ(v2 ). By
way of the formula

ϕ(v) = Re ϕ(v) + i Re ϕ(−iv) , v ∈ V ,

ϕ(v1 − v2 ) = 0 for every ϕ ∈ V ∗ , which proves that v1 = v2 . Hence, E is a
singleton set {v0 }. But {v0 } is a face of K if and only if v0 is an extreme point
of K.

5.3 Geometry of the Closed Unit Ball


In any normed vector space, the triangle inequality implies that every open
or closed ball of finite radius is convex (Exercise 1). However, the analytic
definition of the closed unit ball B reveals little about its shape or geometry.
Nevertheless the choice of norm greatly influences the geometry of the unit
ball. For example, the closed unit ball in a dual space always has an extreme
point.

Proposition 5.10. The closed unit ball of the dual space V ∗ of any normed
vector space has an extreme point.

Proof. Let X ⊂ V ∗ be the closed unit ball of V ∗ . Endow V ∗ with the weak∗
topology. As the closed unit ball X is convex (by Exercise 1) and weak∗
compact (by the Banach–Alaoglu Theorem), X has an extreme point, by
Theorem 5.9. 

Proposition 5.10 provides a useful device for showing that certain Banach
spaces W are not (isometrically isomorphic to) dual spaces; one need only
show the closed unit ball of such W has no extreme points. See Exercise 8.
For a second example of the influence of the norm on the geometry of the
closed unit ball, consider V = R2 as an ℓ1 -space and an ℓ2 -space, and let B1
and B2 denote the closed unit balls in these spaces. The ball B2 is simply a
circular disc; however, the ball B1 is a diamond-shaped disc:

[Figure: the closed unit ball B1 of ℓ1 in R2 , a diamond-shaped disc with vertices (±1, 0) and (0, ±1).]

The geometry of B1 is, evidently, quite different from that of B2 . For
example, because the boundary of B2 is round, there are no line segments
contained within the boundary; in contrast, the boundary of B1 is made up
of four line segments. In short, the unit sphere in ℓ1 is flat whereas the unit
sphere in ℓ2 is (uniformly) round.
As for extreme points, the closed unit ball B1 of ℓ1 in 2-dimensional real
space has just 4 extreme points (namely, the corner points of the diamond), but
every boundary point of the closed unit ball B2 of ℓ2 in 2-dimensional real
space is an extreme point of B2 .
In this section, we aim to obtain an analogue of this situation for
the closed unit ball of ℓp (N), where 1 < p < ∞. The rest of the section is
devoted to the proof of the following theorem.
Theorem 5.11. If 1 < p < ∞ and B is the closed unit ball of ℓp (N), then
v ∈ ℓp (N) is an extreme point of B if and only if ‖v‖ = 1.
We start with an analytic proof of the fact that the closed unit disc in C
is round.
Lemma 5.12. The extreme points of the closed unit disc in the complex plane
are precisely the points on the unit circle.
Proof. Let D = {ζ ∈ C | |ζ| ≤ 1}. If ζ ∈ ext D, then necessarily ζ has modulus
1.
Conversely, suppose that ζ ∈ D is such that |ζ| = 1. Thus, there exists
θ ∈ R such that ζ = cos θ + i sin θ. Assume that ζ = τ λ + (1 − τ )µ, for some
τ ∈ (0, 1) and λ, µ ∈ D. The triangle inequality yields |λ| = |µ| = 1 and so
λ = cos α + i sin α and µ = cos β + i sin β for some α, β ∈ R. Hence,

1 = cos² θ + sin² θ = |τ λ + (1 − τ )µ|² = 1 + 2τ² − 2τ + 2τ (1 − τ ) cos(α − β) ,

which implies that cos(α − β) = 1; that is, α = β + 2kπ for some k ∈ Z.
Thus, λ = µ = ζ.
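
The identity for |τλ + (1 − τ)µ|² that drives this proof is easy to test numerically. The following is a small sketch; the function names modulus_squared and closed_form and the sample angles are illustrative assumptions.

```python
import cmath
import math


def modulus_squared(tau, alpha, beta):
    """|tau*lam + (1 - tau)*mu|^2 for unimodular lam = e^{i alpha}, mu = e^{i beta}."""
    lam = cmath.exp(1j * alpha)
    mu = cmath.exp(1j * beta)
    return abs(tau * lam + (1 - tau) * mu) ** 2


def closed_form(tau, alpha, beta):
    """Right-hand side: 1 + 2 tau^2 - 2 tau + 2 tau (1 - tau) cos(alpha - beta)."""
    return 1 + 2 * tau**2 - 2 * tau + 2 * tau * (1 - tau) * math.cos(alpha - beta)


# Equal angles keep the convex combination on the unit circle ...
on_circle = modulus_squared(0.3, 1.0, 1.0)
# ... whereas distinct points of the circle give a point strictly inside the disc.
inside = modulus_squared(0.3, 1.0, 2.0)
print(on_circle, inside)
```

Since 2τ(1 − τ) > 0 for τ ∈ (0, 1), the closed form equals 1 exactly when cos(α − β) = 1, which is the step used to conclude λ = µ.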

Lemma 5.13. If a, b ∈ R and τ ∈ (0, 1) satisfy
\[
e^{\tau a + (1 - \tau) b} = \tau e^{a} + (1 - \tau) e^{b} ,
\]
then a = b.

Proof. The function ϑ(t) = e^t on R has strictly positive second derivative and
is, therefore, strictly convex. Thus, e^{τa+(1−τ)b} = τ e^a + (1 − τ )e^b if and only
if a = b.
Proof of Theorem 5.11. Only the nontrivial implication is shown here. That
is, we show that if v ∈ ℓp (N) has norm ‖v‖ = 1, then v is an extreme point of
B.
Suppose that u, w ∈ B and τ ∈ (0, 1) satisfy v = τ u + (1 − τ )w. Let q > 1
be such that 1/p + 1/q = 1. By Corollary 3.14, there exists a unit vector
ϕ ∈ ℓp (N)∗ = ℓq (N) such that ‖v‖ = ϕ(v). Hence,

1 = ϕ(v) = |ϕ(v)| ≤ τ |ϕ(u)| + (1 − τ )|ϕ(w)| ≤ τ ‖ϕ‖ ‖u‖ + (1 − τ )‖ϕ‖ ‖w‖ ≤ 1 ,

which implies that |ϕ(v)| = ‖ϕ‖ ‖v‖, |ϕ(u)| = ‖ϕ‖ ‖u‖, and |ϕ(w)| = ‖ϕ‖ ‖w‖.
The first of these equalities yields
\[
1 = \left| \sum_{n=1}^{\infty} \varphi_n v_n \right| \le \sum_{n=1}^{\infty} |\varphi_n v_n| \le \left( \sum_{n=1}^{\infty} |\varphi_n|^q \right)^{1/q} \left( \sum_{n=1}^{\infty} |v_n|^p \right)^{1/p} = 1 .
\]
This inequality implies, by Young’s inequality (Proposition 2.6), that
\[
1 = \sum_{n=1}^{\infty} |\varphi_n v_n| \le \sum_{n=1}^{\infty} \left( \frac{1}{q} |\varphi_n|^q + \frac{1}{p} |v_n|^p \right) = \frac{1}{q} \|\varphi\|^q + \frac{1}{p} \|v\|^p = 1 .
\]
Therefore, |ϕn vn | = (1/q)|ϕn |^q + (1/p)|vn |^p for every n ∈ N. That is,
\[
e^{\frac{1}{p} a_n + \frac{1}{q} b_n} = \frac{1}{p} e^{a_n} + \frac{1}{q} e^{b_n} ,
\]
where an = p log |vn | and bn = q log |ϕn | (in nonzero cases). By Lemma 5.13,
|vn |^p = |ϕn |^q for all n ∈ N. Likewise, |un |^p = |wn |^p = |ϕn |^q for all n ∈ N.
Hence, |un | = |wn | = |vn |, for each n ∈ N. But vn = τ un + (1 − τ )wn , and so
Lemma 5.12 yields un = wn = vn ; that is, u = w = v.


5.4 Separation of Convex Sets by Linear Functionals


The goal of this section is to prove the following “separation” theorem.

Theorem 5.14. Let C1 and C2 be convex sets in a normed vector space V
such that C1 ∩ C2 = ∅, C1 is compact, and C2 is closed. Then there exist
ϕ ∈ V ∗ and γ1 , γ2 ∈ R such that

Re ϕ(v1 ) < γ1 < γ2 < Re ϕ(v2 ) , ∀ v1 ∈ C1 , v2 ∈ C2 . (5.3)

The proof of Theorem 5.14 requires several lemmas and the introduction
of the notion of a sublinear functional. The Hahn–Banach Extension Theorem
is the basis of the proof of Theorem 5.14.

Definition 5.15. A function p : V → R for which p(u + v) ≤ p(u) + p(v),
for all u, v ∈ V , and p(λ v) = λ p(v), for all λ ∈ R+ and v ∈ V , is called a
sublinear functional.

Proposition 5.16. If V is a normed vector space over R and if C ⊆ V is a
convex set, then the function p : V → R defined by

p(v) = inf {τ > 0 | τ −1 v ∈ C} , v ∈ V , (5.4)

is a sublinear functional and C = {v ∈ V | p(v) = 1}.

Proof. Exercise 9. 
The sublinear functional p in Proposition 5.16 is sometimes called the
Minkowski functional associated with C.
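
Formula (5.4) can be evaluated numerically when membership in C is testable. The sketch below assumes, beyond what is stated above, that C is convex and contains 0, so that the set {τ > 0 | τ −1 v ∈ C} is upward closed and bisection applies; the function name minkowski, its parameters, and the choice of the ℓ1 unit ball in R2 as C are illustrative.

```python
def minkowski(v, contains, hi=1e6, tol=1e-9):
    """Approximate p(v) = inf{tau > 0 : v/tau in C} by bisection.

    Assumptions (not checked): C is convex, 0 is in C, and v/hi is in C,
    so that every tau above p(v) is admissible and every tau below is not.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid > 0 and contains([x / mid for x in v]):
            hi = mid  # mid is admissible; the infimum is at most mid
        else:
            lo = mid  # mid is not admissible; the infimum is at least mid
    return hi


def in_l1_ball(x):
    """Membership test for the closed unit ball of the l1 norm on R^2."""
    return abs(x[0]) + abs(x[1]) <= 1


value = minkowski([3.0, -4.0], in_l1_ball)
print(value)  # close to the l1 norm |3| + |-4| of the vector
```

For the closed unit ball of a norm, the Minkowski functional recovers the norm itself, which gives a convenient check of the bisection.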
The Hahn–Banach Extension Theorem, in its most general form, is below.
Its proof is a fairly straightforward adaptation of the proof of Theorem 3.13.

Theorem 5.17. (Hahn–Banach) Let V be a normed vector space and suppose
that p : V → R is a sublinear functional. If L ⊂ V is a linear submanifold of
V and if µ0 : L → R is a real linear transformation such that |µ0 (v)| ≤ p(v)
for all v ∈ L, then there is an extension µ of µ0 to a real linear transformation
µ : V → R such that |µ(v)| ≤ p(v), for every v ∈ V .

Proof. Exercise 10.

Definition 5.18. Let V be a normed vector space and let VR be the real
normed vector space obtained from V by restricting the base field from C to
R.
1. A real subspace of V is a closed linear submanifold (over R) of VR .
2. A hyperplane of V is a subspace M such that V /M is isometrically iso-
morphic to C.
3. A real hyperplane of V is a real subspace N such that VR /N is isometrically
isomorphic to R.

Observe that VR = V as sets. Further, by the First Isomorphism Theorem,
M ⊂ V is a hyperplane if and only if M = ker ϕ for some ϕ ∈ V ∗ . (Reason:
the canonical surjection V → V /M ≅ C induces ϕ ∈ V ∗ and C ≅ ran ϕ ≅
V / ker ϕ.) Likewise, N ⊂ V is a real hyperplane of VR if and only if N = ker µ,
for some bounded real linear transformation µ : V → R.

Lemma 5.19. Let V be a normed vector space and assume that p : V → R is
a sublinear functional on V .
1. If a (real) linear transformation µ : VR → R satisfies |µ(v)| ≤ p(v), for
every v ∈ V , then the function ϕ : V → C defined by

ϕ(v) = µ(v) + iµ(−iv) , v ∈ V ,

is a (complex) linear transformation for which |ϕ(v)| ≤ p(v), for every
v ∈ V .
2. If ϕ : V → C is a (complex) linear functional such that |ϕ(v)| ≤ p(v), for
every v ∈ V , then µ = Re ϕ is a (real) linear transformation µ : VR → R
such that |µ(v)| ≤ p(v), for every v ∈ V .

Proof. Exercise 11. 

Lemma 5.20. If C is a nonempty, open, convex set in a normed vector space
V such that 0 ∉ C, then
1. there is a real hyperplane N ⊂ VR such that N ∩ C = ∅, and
2. there is a hyperplane M ⊂ V such that M ∩ C = ∅.

Proof. Fix v0 ∈ C and let C0 = {v0 − v | v ∈ C}; then C0 is nonempty, open,
and convex, and 0 ∈ C0 . Further, v0 ∉ C0 because 0 ∉ C. Let p denote the
Minkowski functional (5.4) associated with C0 . Thus, by Proposition 5.16,
p(v0 ) ≥ 1 (since p(v) < 1 if and only if v ∈ C0 ).
Now let L = SpanR {v0 } and define µ0 : L → R by µ0 (λv0 ) = λp(v0 ), for
all λ ∈ R. The function µ0 is linear over R and satisfies (one should verify this
claim!) |µ0 (w)| ≤ p(w) for every w ∈ L. Hence, by the general Hahn–Banach
Extension Theorem (Exercise 10), there is a linear transformation µ : VR → R
such that |µ(v)| ≤ p(v), for all v ∈ VR , and µ(w) = µ0 (w) for every w ∈ L.
Lemma 5.19 asserts that µ = Re ϕ, for some ϕ ∈ V ∗ . Let N = ker µ and
M = ker ϕ to obtain hyperplanes N ⊂ VR and M ⊂ V such that N ∩ C = ∅
and M ∩ C = ∅.

Lemma 5.21. If C1 and C2 are nonempty, disjoint convex sets in a normed
vector space V such that C1 is open, then there are ϕ ∈ V ∗ and γ ∈ R such
that

Re ϕ(v1 ) < γ ≤ Re ϕ(v2 ) , ∀ v1 ∈ C1 , v2 ∈ C2 . (5.5)

Proof. Let
\[
C = \bigcup_{v_1 \in C_1} \{ v_1 - v_2 \mid v_2 \in C_2 \} ,
\]
and note that C is open and convex.
Further, 0 ∉ C, as C1 ∩ C2 = ∅. By Lemma 5.20, there exists ϕ ∈ V ∗ such
that ker µ ∩ C = ∅, where µ = Re ϕ. Because C is convex and µ is continuous,
µ(C) is a connected subset of R. But ker µ ∩ C = ∅ means that 0 ∉ µ(C), and so
µ(C) ⊂ (−∞, 0) or µ(C) ⊂ (0, ∞). Without loss of generality, assume that
µ(C) ⊂ (−∞, 0). Hence, µ(v1 ) < µ(v2 ), for all v1 ∈ C1 , v2 ∈ C2 . Therefore,
there is a γ ∈ R such that
\[
\sup_{v_1 \in C_1} \mu(v_1) \le \gamma \le \inf_{v_2 \in C_2} \mu(v_2) .
\]
By Exercise 12, the interval µ(C1 ) is open, since C1 is open. Hence, γ ∉ µ(C1 )
and so the inequalities (5.5) hold.

Lemma 5.22. Assume that V is a normed vector space and that K, C ⊂ V
are such that K is compact and C is closed. If K ∩ C = ∅, then there exists
ε > 0 such that
(K + Bε (0)) ∩ C = ∅ .

Proof. If the conclusion is not true, then there are vn ∈ K and wn ∈ V of
norm ‖wn‖ < 1/n such that vn + wn ∈ C, for every n ∈ N. Since K is compact,
{vn }n∈N admits a convergent subsequence {vnk }k∈N with limit v ∈ K. Note
that ‖v − (vnk + wnk )‖ ≤ ‖v − vnk‖ + 1/nk ; thus, v ∈ C, as C is closed. But this
contradicts K ∩ C = ∅; therefore, the conclusion of the lemma must hold.
Proof of Theorem 5.14. Assume that C1 , C2 ⊂ V are disjoint convex sets and
that C1 is compact and C2 is closed. By Lemma 5.22, there exists ε > 0 such
that (C1 + Bε (0)) ∩ C2 = ∅. Note that C1 + Bε (0) is convex and open. Thus,
by Lemma 5.21, there exist ϕ ∈ V ∗ and γ ∈ R such that

Re ϕ(v1 ) < γ ≤ Re ϕ(v2 ) , ∀ v1 ∈ C1 + Bε (0), v2 ∈ C2 .

However, as C1 is compact, Re ϕ(C1 ) is a compact subset of (−∞, γ), and so
max Re ϕ(C1 ) < γ. Therefore, there are γ1 , γ2 ∈ R such that
max Re ϕ(C1 ) < γ1 < γ2 < γ, whence

Re ϕ(v1 ) < γ1 < γ2 < Re ϕ(v2 ) , ∀ v1 ∈ C1 , v2 ∈ C2 ,

which completes the proof. 


The following application of the Separation Theorem adds further infor-
mation about the relationship between compact convex sets and their extreme
points.

Theorem 5.23. If K is a compact convex set in a normed vector space V ,
then the convex hull of the extreme points of K is dense in K.

Proof. By Theorem 5.9, K has at least one extreme point. Let C = Conv (ext K)
and consider the compact convex subset C̄ of K, where C̄ denotes the closure
of C. If C̄ ≠ K, then let w0 ∈ K \ C̄.
By the Separation Theorem (Theorem 5.14), there are ϕ ∈ V ∗ and γ1 , γ2 ∈ R
such that

Re ϕ(v) < γ1 < γ2 < Re ϕ(w0 ) , ∀ v ∈ C̄ .

If δ = max{Re ϕ(w) | w ∈ K}, then γ2 < Re ϕ(w0 ) ≤ δ. The set

Kϕ = {w ∈ K | Re ϕ(w) = δ}

is a compact face of K. Therefore, by Theorem 5.9, there is an extreme
point v0 of K in Kϕ . Thus, Re ϕ(v0 ) = δ and Re ϕ(v0 ) < γ1 < δ (since
v0 ∈ ext K ⊂ C̄), which is a contradiction. Hence, it must be that C̄ = K.
A more general result, stated below without proof, is true.
Theorem 5.24. (Kreı̌n–Milman) If V is a normed vector space and if K is
a nonempty weak∗ compact convex subset of V ∗ , then Conv (ext K) is weak∗
dense in K.

5.5 Exercises

1. Let V be a normed vector space.
a) Prove that the open ball Br (v0 ) is a convex set, for every v0 ∈ V and
r > 0.
b) Prove that the closure C̄ of a convex subset C ⊂ V is convex.
2. Prove that a subset C ⊂ V is convex if and only if C contains all convex
combinations of its elements.
3. Let C be a convex set and v ∈ C. Prove that the following statements are
equivalent:
a) v is an extreme point of C;
b) if there are v1 , v2 ∈ C and τ ∈ (0, 1) such that v = τ v1 + (1 − τ )v2 ,
then v1 = v2 = v.
4. Let C be the closed unit ball of ℓp (N).
a) If p = 1, then show that there is a vector v ∈ C such that ‖v‖ = 1
but v ∉ ext C.
b) If p = ∞, then show that there is a vector v ∈ C such that ‖v‖ = 1
but v ∉ ext C.
5. Let C = [0, 1] × [0, 1] × [0, 1] ⊂ R3 .
a) Determine all the faces of C.
b) Of the faces found, identify those that correspond to extreme points
of C.
6. Let C be a convex set in a vector space V . Show that if F1 ⊆ C is a face
of C and F2 ⊆ F1 is a face of F1 , then F2 is a face of C.
7. Show that the closed unit ball of C0 (R) has no extreme points.
8. Consider the Banach space L1 ([0, 1], m), where m is Lebesgue measure.
a) If f ∈ L1 ([0, 1], m) is such that ‖f‖ = 1, then prove that there is a
δ ∈ (0, 1) for which
\[
\int_{[0,\delta]} |f| \, dm = \frac{1}{2} .
\]
b) Show that there are g, h ∈ L1 ([0, 1], m) such that ‖g‖ = ‖h‖ = 1 and
f = (1/2) g + (1/2) h. (Suggestion: let g = 2f χ[0,δ] .)
c) Prove that L1 ([0, 1], m) is not (isometrically isomorphic to) a dual
space.
9. Suppose that V is a normed vector space over R and that C ⊆ V is a
convex set. Prove that the function p : V → R defined by

p(v) = inf {τ > 0 | τ −1 v ∈ C} , v ∈ V ,

is a sublinear functional and that C = {v ∈ V | p(v) = 1}.


10. Let V be a normed vector space and suppose that p : V → R is a sublinear
functional. Prove that if L ⊂ V is a linear submanifold of V and if µ0 :
L → R is a real linear transformation such that |µ0 (w)| ≤ p(w) for all
w ∈ L, then there is an extension µ of µ0 to a linear transformation
µ : VR → R such that |µ(v)| ≤ p(v), for every v ∈ VR .
11. If V is a normed vector space and if p : V → R is a sublinear functional,
then prove that for any linear transformation µ : VR → R, the (complex)
linear map ϕ : V → C defined by

ϕ(v) = µ(v) + iµ(−iv) , v∈V ,

satisfies |ϕ(v)| ≤ p(v), for every v ∈ V , if and only if |µ(v)| ≤ p(v), for
every v ∈ V .
12. Let V be a normed vector space and assume that C ⊂ V is an open set.
Let ϕ ∈ V ∗ and µ = Re ϕ. Prove that if µ ≠ 0, then µ(C) is an open subset
of R.
13. Prove that if K ⊂ Rn is compact and convex, and if w ∈ K, then w is a
convex combination of at most n + 1 extreme points of K.
14. A cone in a finite-dimensional Banach space V is a convex subset C ⊂ V
such that τ v ∈ C, for every τ ∈ R+ and v ∈ C. Let C † = {ϕ ∈ V ∗ | ϕ(v) ≥
0 , ∀ v ∈ C}.
a) Prove that C † is a cone in the dual space V ∗ .
b) Prove that C †† = C.
6
Hilbert Spaces

Among Banach spaces, the so-called Hilbert spaces are of special importance,
for these spaces carry a geometry that is a formal abstraction of the usual
Euclidean geometry of Cn .

6.1 Inner Products and Euclidean Norms

Definition 6.1. An inner product on a (complex) vector space H is a
sesquilinear form ⟨·, ·⟩ : H × H → C, which is to say that ⟨·, ·⟩ is a function on
the Cartesian product H × H, satisfying the following properties for all vectors
ξ, ξ1 , ξ2 , η, η1 , η2 ∈ H and scalars α ∈ C:
1. ⟨ξ, ξ⟩ ≥ 0, with ⟨ξ, ξ⟩ = 0 if and only if ξ = 0;
2. $\langle \xi, \eta \rangle = \overline{\langle \eta, \xi \rangle}$;
3. ⟨ξ1 + ξ2 , η⟩ = ⟨ξ1 , η⟩ + ⟨ξ2 , η⟩ and ⟨ξ, η1 + η2 ⟩ = ⟨ξ, η1 ⟩ + ⟨ξ, η2 ⟩;
4. ⟨α ξ, η⟩ = α ⟨ξ, η⟩ and ⟨ξ, α η⟩ = ᾱ ⟨ξ, η⟩.

Proposition 6.2. (Cauchy–Schwarz Inequality) If ⟨·, ·⟩ is an inner product
on a vector space H, then

|⟨ξ, η⟩| ≤ ⟨ξ, ξ⟩^{1/2} ⟨η, η⟩^{1/2} , ∀ ξ, η ∈ H . (6.1)

If η ≠ 0, then |⟨ξ, η⟩| = ⟨ξ, ξ⟩^{1/2} ⟨η, η⟩^{1/2} if and only if ξ = αη for some α ∈ C.

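Proof. The result is trivially true if $\langle\xi,\eta\rangle = 0$. Assume, therefore, that $\langle\xi,\eta\rangle \neq 0$.
For any $\lambda\in\mathbb{C}$,

\[
\begin{aligned}
0 \le \langle\xi-\lambda\eta,\,\xi-\lambda\eta\rangle
  &= \langle\xi,\xi\rangle - \overline{\lambda}\langle\xi,\eta\rangle - \lambda\langle\eta,\xi\rangle + |\lambda|^2\langle\eta,\eta\rangle \\
  &= \langle\xi,\xi\rangle - 2\operatorname{Re}\bigl(\lambda\langle\eta,\xi\rangle\bigr) + |\lambda|^2\langle\eta,\eta\rangle .
\end{aligned}
\]

For
\[ \lambda = \frac{\langle\xi,\xi\rangle}{\langle\eta,\xi\rangle}, \]
the inequality above becomes

\[ 0 \le -\langle\xi,\xi\rangle + \frac{\langle\xi,\xi\rangle^2\,\langle\eta,\eta\rangle}{|\langle\xi,\eta\rangle|^2}, \]

which yields inequality (6.1).

The assertion about equality is left as an exercise (Exercise 1). □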

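Corollary 6.3. If $\langle\cdot,\cdot\rangle$ is an inner product on a vector space $H$, then $H$ is a
normed vector space under the norm $\|\cdot\|$ defined by

\[ \|\xi\| = \langle\xi,\xi\rangle^{1/2}, \quad \forall\,\xi\in H. \tag{6.2} \]

Proof. The only non-obvious item to verify is the triangle inequality. If $\xi,\eta\in H$, then

\[
\begin{aligned}
\|\xi+\eta\|^2 &= \langle\xi+\eta,\,\xi+\eta\rangle \\
  &= \|\xi\|^2 + 2\operatorname{Re}\bigl(\langle\xi,\eta\rangle\bigr) + \|\eta\|^2 \\
  &\le \|\xi\|^2 + 2\,|\langle\xi,\eta\rangle| + \|\eta\|^2 \\
  &\le \|\xi\|^2 + 2\,\|\xi\|\,\|\eta\| + \|\eta\|^2 \qquad \text{[by (6.1)]} \\
  &= (\|\xi\| + \|\eta\|)^2 .
\end{aligned}
\]

Hence, equation (6.2) defines a norm on $H$. □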

Definition 6.4. If $\langle\cdot,\cdot\rangle$ is an inner product on a vector space $H$, and if $H$ is
a Banach space relative to the norm (6.2) induced by the inner product, then
$H$ is called a Hilbert space.

A useful consequence of the Cauchy–Schwarz Inequality is the following
“continuity” lemma.

Lemma 6.5. If $\{\xi_k\}_{k\in\mathbb{N}}$ is a sequence in a Hilbert space $H$ that converges to
$\xi\in H$ (that is, $\lim_{k\to\infty}\|\xi_k-\xi\| = 0$), then

\[ \lim_{k\to\infty} |\langle\xi_k,\eta\rangle - \langle\xi,\eta\rangle| = 0, \quad \forall\,\eta\in H. \]

Inner products allow one to carry intuitions and concepts from Euclidean
geometry, such as magnitude and angle, to the study of vectors.

Definition 6.6. In a Hilbert space $H$, vectors $\xi$ and $\eta$ are said to be orthogonal
(perpendicular) if $\langle\xi,\eta\rangle = 0$.

Observe that by definition of inner product, the only vector $\xi$ orthogonal
to itself is the vector $\xi = 0$. A basic feature of Hilbert space geometry is the
Pythagorean Theorem.

Proposition 6.7. If $\xi$ and $\eta$ are orthogonal, then $\|\xi+\eta\|^2 = \|\xi\|^2 + \|\eta\|^2$.

Proof. Expand using inner products. 
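The expansion referred to in the proof can be written out as follows. This is only a worked version of the one-line argument, using the sesquilinearity of the inner product and the hypothesis that the two cross terms vanish.

```latex
\begin{aligned}
\|\xi+\eta\|^2
  &= \langle \xi+\eta,\ \xi+\eta \rangle \\
  &= \langle \xi,\xi \rangle + \langle \xi,\eta \rangle
     + \langle \eta,\xi \rangle + \langle \eta,\eta \rangle \\
  &= \|\xi\|^2 + 0 + \overline{0} + \|\eta\|^2
   = \|\xi\|^2 + \|\eta\|^2 .
\end{aligned}
```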


If the first distinguished geometric property of Hilbert space is the notion
of orthogonality, then the second most characteristic property of Hilbert space
surely must be the parallelogram law.

Proposition 6.8. (The Jordan–von Neumann Theorem) In an inner-product
space $H$,
\[ \|\xi+\eta\|^2 + \|\xi-\eta\|^2 = 2\|\xi\|^2 + 2\|\eta\|^2, \tag{6.3} \]
for all $\xi,\eta\in H$. Furthermore, if $V$ is a normed vector space in which the
parallelogram law (6.3) holds for all $\xi,\eta\in V$, then $V$ is an inner-product
space (that is, the norm on $V$ is induced by an inner product on $V$).

Proof. The verification of the parallelogram law (6.3) is left as an exercise
(Exercise 2). Suppose now that $V$ is a normed vector space and that equation
(6.3) holds for all $\xi,\eta\in V$. Define $\langle\cdot,\cdot\rangle : V\times V\to\mathbb{C}$ by

\[ \langle\xi,\eta\rangle = \tfrac14\bigl( \|\xi+\eta\|^2 - \|\xi-\eta\|^2 + i\|\xi+i\eta\|^2 - i\|\xi-i\eta\|^2 \bigr), \]

for $\xi,\eta\in V$. One then verifies, using (6.3) repeatedly, that $\langle\cdot,\cdot\rangle : V\times V\to\mathbb{C}$ is an
inner product on $V$ and that $\|\xi\| = \langle\xi,\xi\rangle^{1/2}$ for all $\xi\in V$. □
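As a numerical sanity check (not a proof), the following sketch tests the parallelogram law and the polarization formula, in the normalised form with the factor 1/4, for the standard inner product on C^4. The helper names `inner`, `norm_sq`, and `polarization` are illustrative choices and not part of the text; the random test vectors are an assumption made only for the demonstration.

```python
import random

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared Euclidean norm, computed directly from the coordinates."""
    return sum(abs(a) ** 2 for a in x)

def polarization(x, y):
    """Rebuild <x, y> from four squared norms (factor 1/4 included)."""
    def shifted(c):
        return norm_sq([a + c * b for a, b in zip(x, y)])
    return (shifted(1) - shifted(-1)
            + 1j * shifted(1j) - 1j * shifted(-1j)) / 4

random.seed(0)  # fixed seed so the run is reproducible
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]

polarization_gap = abs(polarization(x, y) - inner(x, y))

plus = norm_sq([a + b for a, b in zip(x, y)])
minus = norm_sq([a - b for a, b in zip(x, y)])
parallelogram_gap = abs(plus + minus - 2 * norm_sq(x) - 2 * norm_sq(y))

print(polarization_gap, parallelogram_gap)  # both at rounding-error level
```

Dropping the factor 1/4 in `polarization` would return four times the inner product, which is one quick way to see why the normalisation matters.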

6.2 Distance to Convex Sets


Recall that a subset K of a vector space V is a convex set if

λv + (1 − λ)w ∈ K , ∀ λ ∈ [0, 1], ∀ v, w ∈ K .

If $K = \{(t, 1-t) \mid t\in\mathbb{R}\} \subset \mathbb{R}^2$, then $K$ is closed and convex in both
the $\ell^1$ and $\ell^2$ norms. In considering $\mathbb{R}^2$ as an $\ell^1$-space, we see that there is a
continuum of points $v\in K$ (namely, the points $(t,1-t)$ with $0\le t\le 1$) that are at
distance 1 from the origin, and no point of $K$ is closer. Thus, if
one is seeking to minimise the distance between a point $v_0$ and a closed convex
set $K$ in $\ell^1$, there is no reason to expect that there will be a unique point in
$K$ that is closest to $v_0$. On the other hand, in considering $\mathbb{R}^2$ as an $\ell^2$-space,
there is exactly one $v\in K$ that minimises the distance between the origin and
$K$ (namely, the point $v = (\tfrac12,\tfrac12)$). The differences between $\ell^1$ and $\ell^2$ are in
fact much more profound, and this section aims to establish some important
properties of Hilbert space that arise from its Euclidean geometry—properties
that, as it so happens, are not generally shared by arbitrary Banach spaces.
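The contrast between the two norms on the line {(t, 1 − t)} can be seen numerically. The sketch below samples the line on a grid (the grid range and spacing are arbitrary assumptions) and lists the sample points nearest to the origin in each norm; the function names are illustrative only.

```python
import math

def l1(v):
    """The l^1 norm on R^2."""
    return abs(v[0]) + abs(v[1])

def l2(v):
    """The l^2 (Euclidean) norm on R^2."""
    return math.hypot(v[0], v[1])

# Sample the line {(t, 1 - t)} for t = -1.00, -0.99, ..., 2.00.
points = [(i / 100, 1 - i / 100) for i in range(-100, 201)]

def minimisers(norm, tol=1e-9):
    """Return the least norm over the samples and the samples attaining it."""
    best = min(norm(p) for p in points)
    return best, [p for p in points if norm(p) <= best + tol]

d1, near1 = minimisers(l1)
d2, near2 = minimisers(l2)

print(d1, len(near1))  # many l^1-nearest samples: every t in [0, 1]
print(d2, near2)       # a single l^2-nearest sample
```

On this grid the l^1 minimum is attained at every sample with 0 ≤ t ≤ 1, while the l^2 minimum is attained only at (1/2, 1/2).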
Definition 6.9. If $K$ is a nonempty subset of a normed vector space $V$ and
if $v_0\in V$, then the distance from $v_0$ to $K$ is denoted by $\operatorname{dist}(v_0,K)$ and is
defined by
\[ \operatorname{dist}(v_0,K) = \inf\{\|v_0-v\| \mid v\in K\}. \tag{6.4} \]

The following theorem is fundamental to Hilbert space theory.

Theorem 6.10. If $K$ is a nonempty closed convex subset of a Hilbert space
$H$, and if $\xi_0\in H$, then there is a unique $\eta\in K$ such that

\[ \operatorname{dist}(\xi_0,K) = \|\xi_0-\eta\|. \]

Proof. The existence of a best approximant is shown first; uniqueness of the
best approximant is demonstrated second. The convexity of $K$ will be used
repeatedly in the following guise: if $\eta_1,\eta_2\in K$, then $\tfrac12(\eta_1+\eta_2)\in K$.
By definition of distance, for each $k\in\mathbb{N}$ there is a vector $\eta_k\in K$ such
that
\[ \|\xi_0-\eta_k\|^2 < (\operatorname{dist}(\xi_0,K))^2 + \frac1k . \]
We shall show that the sequence $\{\eta_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence. The fact that
$H$ is a Hilbert space and $K$ is closed in $H$ will ensure the existence of a limiting
vector $\eta$ and that this vector belongs to $K$.
To prove that $\{\eta_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence, note that

\[
\begin{aligned}
\|2\xi_0-(\eta_n+\eta_m)\|^2 + \|\eta_m-\eta_n\|^2
  &= \|(\xi_0-\eta_n)+(\xi_0-\eta_m)\|^2 + \|(\xi_0-\eta_n)-(\xi_0-\eta_m)\|^2 \\
  &= 2\|\xi_0-\eta_n\|^2 + 2\|\xi_0-\eta_m\|^2 \\
  &< 4(\operatorname{dist}(\xi_0,K))^2 + \frac2n + \frac2m ,
\end{aligned}
\]

where the second equality above follows from the parallelogram law. On the
other hand,
\[ 4(\operatorname{dist}(\xi_0,K))^2 \le 4\,\|\xi_0-\tfrac12(\eta_n+\eta_m)\|^2 = \|2\xi_0-(\eta_n+\eta_m)\|^2 . \]

Hence,
\[ \|\eta_m-\eta_n\|^2 < \frac2n + \frac2m , \]
which proves that $\{\eta_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence. Such sequences necessarily
converge; let $\eta\in H$ denote the limit of the sequence $\{\eta_k\}_{k\in\mathbb{N}}$. Because $K$ is
closed, $\eta\in K$. Thus,

\[
\begin{aligned}
\operatorname{dist}(\xi_0,K) &\le \|\xi_0-\eta\| \\
  &= \|\xi_0-\eta_k+\eta_k-\eta\| \\
  &\le \|\xi_0-\eta_k\| + \|\eta_k-\eta\| \\
  &< \sqrt{(\operatorname{dist}(\xi_0,K))^2 + \tfrac1k} + \|\eta_k-\eta\| .
\end{aligned}
\]

In letting $k\to\infty$, the inequalities above sandwich to the equation

\[ \operatorname{dist}(\xi_0,K) = \|\xi_0-\eta\| . \]

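To prove uniqueness, let $\eta'\in K$ be such that

\[ \operatorname{dist}(\xi_0,K) = \|\xi_0-\eta'\| . \]

Thus, $\|\xi_0-\eta\| = \|\xi_0-\eta'\|$. By the parallelogram law,

\[ \|(\xi_0-\eta')+(\xi_0-\eta)\|^2 + \|(\xi_0-\eta')-(\xi_0-\eta)\|^2 = 2\bigl(\|\xi_0-\eta'\|^2 + \|\xi_0-\eta\|^2\bigr) . \]

Thus,
\[
\begin{aligned}
\|\eta-\eta'\|^2 &= 4\|\xi_0-\eta\|^2 - 4\,\|\xi_0-\tfrac12(\eta+\eta')\|^2 \\
  &= 4(\operatorname{dist}(\xi_0,K))^2 - 4\,\|\xi_0-\tfrac12(\eta+\eta')\|^2 \\
  &\le 4(\operatorname{dist}(\xi_0,K))^2 - 4(\operatorname{dist}(\xi_0,K))^2 \\
  &= 0 .
\end{aligned}
\]
That is, $\eta' = \eta$. □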

6.3 Orthogonal Complements


Definition 6.11. If $S$ is a nonempty subset of a Hilbert space $H$, then the
orthogonal complement of $S$ is the subset of $H$ denoted by $S^\perp$ and defined by

\[ S^\perp = \{\eta\in H \mid \langle\eta,\xi\rangle = 0,\ \forall\,\xi\in S\} . \]

By linearity of the inner product in the first variable, it is not difficult
to see that $S^\perp$ is a vector subspace of $H$. By the Cauchy–Schwarz inequality
(that is, by continuity of the inner product in the first variable), $S^\perp$ is closed.
Thus, $S^\perp$ is a subspace of $H$ for every nonempty subset $S\subseteq H$.

Theorem 6.12. Let $M\subseteq H$ be a subspace of a Hilbert space $H$ and let $\xi\in H$
and $\eta\in M$. The following statements are equivalent:
1. $\operatorname{dist}(\xi,M) = \|\xi-\eta\|$;
2. $\xi-\eta\in M^\perp$.
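Proof. Assume that $\operatorname{dist}(\xi,M) = \|\xi-\eta\|$. Let $\eta'\in M$. We aim to prove
that $\langle\xi-\eta,\eta'\rangle = 0$. To this end, consider any $\alpha\in\mathbb{C}$. Because $\eta+\alpha\eta'\in M$,
$\|\xi-(\eta+\alpha\eta')\| \ge \operatorname{dist}(\xi,M) = \|\xi-\eta\|$. Thus,

\[
\begin{aligned}
\|\xi-\eta\|^2 &\le \|(\xi-\eta)-\alpha\eta'\|^2 \\
  &= \|\xi-\eta\|^2 - 2\operatorname{Re}\bigl(\alpha\langle\eta',\xi-\eta\rangle\bigr) + |\alpha|^2\|\eta'\|^2 ,
\end{aligned}
\]

whence
\[ 2\operatorname{Re}\bigl(\alpha\langle\eta',\xi-\eta\rangle\bigr) \le |\alpha|^2\|\eta'\|^2 . \tag{6.5} \]
The complex number $\langle\eta',\xi-\eta\rangle$ is either zero or it is not. If it is zero, then
$\xi-\eta$ is orthogonal to $\eta'$, as desired. We shall now show that the assumption
that $\langle\eta',\xi-\eta\rangle \neq 0$ will lead to a contradiction.
Assume that $\langle\eta',\xi-\eta\rangle \neq 0$ and let $\alpha = t\,\overline{\langle\eta',\xi-\eta\rangle}$, for some $t>0$.
Inequality (6.5) becomes

\[ 2t\,|\langle\eta',\xi-\eta\rangle|^2 \le t^2\,|\langle\eta',\xi-\eta\rangle|^2\,\|\eta'\|^2 . \]

Because $\langle\eta',\xi-\eta\rangle \neq 0$ and $t>0$, the inequality above implies that

\[ 2 \le t\,\|\eta'\|^2 , \]

which is clearly impossible if $t\to 0^+$. Hence, it must be that $\langle\eta',\xi-\eta\rangle = 0$,
which proves that $\xi-\eta$ is orthogonal to every vector $\eta'\in M$.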
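Assume now that $\xi-\eta\in M^\perp$. Let $\eta'\in M$. Then $\xi-\eta$ is orthogonal to
$\eta-\eta'$, because $\eta-\eta'\in M$. Thus, by invoking the Pythagorean theorem,
\[
\begin{aligned}
\|\xi-\eta'\|^2 &= \|(\xi-\eta)+(\eta-\eta')\|^2 \\
  &= \|\xi-\eta\|^2 + \|\eta-\eta'\|^2 \\
  &\ge \|\xi-\eta\|^2 .
\end{aligned}
\]

Thus,
\[ \|\xi-\eta\| \le \inf\{\|\xi-\eta'\| \mid \eta'\in M\} = \operatorname{dist}(\xi,M) , \]
and the reverse inequality holds because $\eta\in M$, which proves that $\|\xi-\eta\| = \operatorname{dist}(\xi,M)$. □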
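The equivalence in Theorem 6.12 can be illustrated in C^3 with a one-dimensional subspace M spanned by a single vector m. The sketch below is an assumption-laden toy case: the vectors, the helper `project_onto_line`, and the random perturbations are all illustrative choices. It checks that the residual ξ − η is orthogonal to M and that no sampled point of M is closer to ξ than η.

```python
import random

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

def project_onto_line(xi, m):
    """Candidate best approximant to xi in M = span{m}."""
    c = inner(xi, m) / inner(m, m)
    return [c * a for a in m]

random.seed(1)  # fixed seed so the run is reproducible
xi = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(3)]
m = [1 + 1j, 2, -1j]  # spans the subspace M (hypothetical choice)

eta = project_onto_line(xi, m)
residual = [a - b for a, b in zip(xi, eta)]
orth_gap = abs(inner(residual, m))  # statement 2: xi - eta lies in M-perp

# Statement 1: every other sampled point eta + alpha*m of M is no closer to xi.
eta_is_closest = True
for _ in range(1000):
    alpha = complex(random.uniform(-2, 2), random.uniform(-2, 2))
    other = [e + alpha * a for e, a in zip(eta, m)]
    if norm([a - b for a, b in zip(xi, other)]) < norm(residual) - 1e-12:
        eta_is_closest = False

print(orth_gap, eta_is_closest)
```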
To conclude the discussion of Hilbert space geometry, we shall consider
decompositions of a Hilbert space as a direct sum of two subspaces that are
orthogonal to each other.
Recall from linear algebra that if M and N are linear submanifolds of H,
then M + N denotes the set

M + N = {ω + δ | ω ∈ M and δ ∈ N }

and M + N is itself a linear submanifold of H.


Proposition 6.13. If $M$ and $N$ are subspaces of a Hilbert space $H$ such that
$M\subseteq N^\perp$, then $M\cap N = \{0\}$ and the linear submanifold $M+N$ is a subspace.

Proof. If $\xi\in M\cap N$, then the hypothesis $M\subseteq N^\perp$ implies that $\xi\in N\cap N^\perp$,
whence $0 = \langle\xi,\xi\rangle$ and so $\xi = 0$.
To show that the linear submanifold M + N is closed in the topology of
H, it is necessary to show that any convergent sequence of ξk ∈ M + N has
its limit ξ in M + N . Let {ξk }k∈N ⊂ M + N be such a convergent sequence.
Therefore, $\{\xi_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence.
For each $k\in\mathbb{N}$, there are $\omega_k\in M$ and $\delta_k\in N$ such that $\xi_k = \omega_k+\delta_k$.
Because the vectors of $M$ are orthogonal to the vectors of $N$ (by hypothesis),
the Pythagorean Theorem applies:

\[ \|\xi_m-\xi_n\|^2 = \|(\omega_m-\omega_n)+(\delta_m-\delta_n)\|^2 = \|\omega_m-\omega_n\|^2 + \|\delta_m-\delta_n\|^2 . \]

Thus, the sequences $\{\omega_k\}_{k\in\mathbb{N}}\subset M$ and $\{\delta_k\}_{k\in\mathbb{N}}\subset N$ are Cauchy sequences.
Let ω ∈ M and δ ∈ N be the limits, respectively, of these sequences. (The
limits exist because H is a Hilbert space; the limits lie in M and N , respec-
tively, because M and N are closed.) The verification that ξ = ω + δ is left to
the reader. Hence, M + N is closed. 
This situation described in Proposition 6.13 is formalised by the following
definition.

Definition 6.14. If M, N ⊆ H are subspaces of a Hilbert space H such that


M ⊆ N ⊥ , then the orthogonal direct sum of M and N is the subspace M + N
and is denoted by M ⊕ N .

Proposition 6.15. If M ⊆ H is a subspace, then H = M ⊕ M ⊥.

Proof. Obviously M ⊕ M ⊥ ⊆ H. Conversely, suppose that ξ ∈ H. Because


M is a subspace, M is closed and convex. Thus, by Theorem 6.10, there is
a unique η ∈ M for which kξ − ηk = dist (ξ, M ). Therefore, ξ − η ∈ M ⊥ ,
by Theorem 6.12. Let ω = η and δ = ξ − η to get ω ∈ M , δ ∈ M ⊥ , and
ξ = ω + δ ∈ M ⊕ M ⊥ . This proves that H ⊆ M ⊕ M ⊥ . 

6.4 Orthonormal Bases for Hilbert Spaces

Definition 6.16. An orthonormal set in a Hilbert space $H$ is a subset $B\subset H$
such that
1. $\|\varphi\| = 1$, for all $\varphi\in B$, and
2. $\langle\varphi_1,\varphi_2\rangle = 0$ if $\varphi_1,\varphi_2\in B$ and $\varphi_1\neq\varphi_2$.
Further, an orthonormal set $B$ is an orthonormal basis of $H$ if $B_0 = B$ for
every orthonormal set $B_0$ for which $B_0\supseteq B$.

A somewhat more convenient way to describe an orthonormal basis is


through the use of the following proposition.

Proposition 6.17. If B ⊂ H is an orthonormal set in H, then B ⊂ H is an


orthonormal basis of H if and only if Span B is dense in H.

Proof. Suppose that Span B is dense in H and let M be the closure of Span B.
Therefore, if ξ ∈ H is orthogonal to every φ ∈ B, then ξ is orthogonal to
Span B. By continuity of the inner product, this means that ξ is orthogonal
to M : ξ ∈ M ⊥ = H ⊥ = {0}. Thus, there is no orthonormal set B0 in H that
properly contains B.
The proof of the converse is straightforward. 
One passes from linearly independent vectors to orthonormal vectors via
the Gram–Schmidt Process.

Proposition 6.18. (Gram–Schmidt Process) If $v_1,\dots,v_k\in H$ are linearly
independent, then there are $k$ orthonormal vectors $\varphi_1,\dots,\varphi_k\in H$ such that

\[ \operatorname{Span}\{\varphi_1,\dots,\varphi_k\} = \operatorname{Span}\{v_1,\dots,v_k\} . \]

In particular, every finite-dimensional Hilbert space has a basis of orthonormal
vectors.

Proof. Let $\varphi_1 = \|v_1\|^{-1}v_1$ and, inductively, for each $j = 2,\dots,k$ let

\[ \varphi_j = \|v_j-w_j\|^{-1}(v_j-w_j), \quad\text{where}\quad w_j = \sum_{i=1}^{j-1}\langle v_j,\varphi_i\rangle\,\varphi_i . \]

The fact that $\varphi_1,\dots,\varphi_k$ are orthonormal is a simple verification by
straightforward calculation. □
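The recursion in the proof translates directly into a short routine. The following is a minimal sketch for vectors in C^n with the standard inner product; the function name `gram_schmidt`, the tolerance `tol`, and the sample vectors are assumptions made for illustration, and this classical form of the process is not the numerically most stable variant.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors, tol=1e-12):
    """Return orthonormal phi_1..phi_k with the same span as v_1..v_k."""
    basis = []
    for v in vectors:
        # w_j = sum_i <v_j, phi_i> phi_i, the part of v_j already spanned.
        w = [0j] * len(v)
        for phi in basis:
            c = inner(v, phi)
            w = [wi + c * p for wi, p in zip(w, phi)]
        r = [vi - wi for vi, wi in zip(v, w)]
        n = norm(r)
        if n < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append([ri / n for ri in r])
    return basis

vs = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]  # linearly independent in C^3
phis = gram_schmidt(vs)

# Largest deviation of the Gram matrix (<phi_i, phi_j>) from the identity.
gram_error = max(abs(inner(phis[i], phis[j]) - (1 if i == j else 0))
                 for i in range(3) for j in range(3))

# Each v_j should be recovered from its coefficients against phi_1..phi_j.
span_error = 0.0
for j, v in enumerate(vs):
    rebuilt = [sum(inner(v, phis[i]) * phis[i][t] for i in range(j + 1))
               for t in range(3)]
    span_error = max(span_error, norm([a - b for a, b in zip(v, rebuilt)]))

print(gram_error, span_error)
```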
The Gram–Schmidt Process demonstrates that there is really no differ-
ence between the notions of linear basis and orthonormal basis—other than
that the latter is composed of pairwise orthogonal unit vectors—in finite-
dimensional Hilbert space. At the level of infinite-dimensional Hilbert space,
a distinction occurs. Nevertheless, by Zorn’s Lemma, every Hilbert space will
have an orthonormal basis.

Theorem 6.19. Every nonzero Hilbert space has an orthonormal basis.

Proof. Exercise 7. 
The cardinality of the basis is related to the topology of the Hilbert space
by way of the following proposition.

Proposition 6.20. A Hilbert space is separable if and only if it has a count-


able orthonormal basis. Moreover, all orthonormal bases of a separable Hilbert
space are in bijective correspondence.
Proof. Let $H$ be a Hilbert space. If $H$ has a countable orthonormal basis
$\{\varphi_k\}_{k\in\mathbb{N}}$, then the countable set

\[ W = \operatorname{Span}_{\mathbb{Q}+i\mathbb{Q}}\{\varphi_k \mid k\in\mathbb{N}\} \]

is dense in $H$ by Proposition 6.17 and by the fact that $\mathbb{Q}+i\mathbb{Q}$ is dense in $\mathbb{C}$.


Conversely, assume that H is separable. If S is any countable, dense subset
of H, then the linear submanifold

W = SpanC {ω | ω ∈ S}

is dense in H. The spanning set S must contain an algebraic basis for the
vector space W . Thus, W has a countable basis and to this basis one can
apply the Gram–Schmidt process to obtain a countable set $\{\varphi_k\}_{k\in\mathbb{N}}$ of
orthonormal vectors whose linear span $O$ is dense in the closure $\overline{W}$ of $W$. But
$\overline{W} = H$, which indicates that $O^\perp = \{0\}$. In other words, there are no nonzero
vectors orthogonal to $\{\varphi_k\}_{k\in\mathbb{N}}$, which proves that $\{\varphi_k\}_{k\in\mathbb{N}}$ is a maximal set
of orthonormal vectors. That is, $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis of $H$.
If H has finite dimension n, then n is an algebraic invariant of the space
H: all linear bases of H must have the same cardinality, whence any two
orthonormal bases of H are in bijective correspondence. If H has infinite
dimension and is separable, then all orthonormal bases of H are countably
infinite and are, therefore, in bijective correspondence with one another. 
It is also true that if H is a nonseparable Hilbert space, then all orthonor-
mal bases of H are in bijective correspondence. The proof of this requires
some delicate and substantial cardinal arithmetic, however. We will not need
to use such a theorem in what follows.
Hilbert spaces with a countable orthonormal basis are especially easy to
analyse. The following theorem, which is an abstraction of classical Fourier
series, illustrates this fact.

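Theorem 6.21. If $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis of a Hilbert space $H$,
then, for every $\xi\in H$,

\[ \lim_{n\to\infty}\Bigl\|\,\xi - \sum_{k=1}^{n}\langle\xi,\varphi_k\rangle\,\varphi_k\Bigr\| = 0 . \tag{6.6} \]

Proof. For each $n\in\mathbb{N}$ let

\[ \xi_n = \sum_{k=1}^{n}\langle\xi,\varphi_k\rangle\,\varphi_k . \]

Observe that
\[ 0 \le \|\xi-\xi_n\|^2 = \langle\xi-\xi_n,\,\xi-\xi_n\rangle = \|\xi\|^2 - \sum_{k=1}^{n}|\langle\xi,\varphi_k\rangle|^2 . \]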
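Hence,
\[ \lim_{n\to\infty}\sum_{k=1}^{n}|\langle\xi,\varphi_k\rangle|^2 \le \|\xi\|^2 , \]

which implies that the sequence $\{\xi_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence, because
$\|\xi_n-\xi_m\|^2 = \sum_{k=m+1}^{n}|\langle\xi,\varphi_k\rangle|^2$ whenever $n>m$. Because $H$ is
a Hilbert space, this sequence has a limit $\xi'\in H$. We shall prove that $\xi' = \xi$.
Choose $k\in\mathbb{N}$. Direct computation shows that $\langle\xi-\xi_n,\varphi_k\rangle = 0$ for every
$n\ge k$. Hence, if $n\ge k$, then

\[
\begin{aligned}
|\langle\xi-\xi',\varphi_k\rangle| &= |\langle\xi-\xi_n+\xi_n-\xi',\varphi_k\rangle| \\
  &= |\langle\xi-\xi_n,\varphi_k\rangle + \langle\xi_n-\xi',\varphi_k\rangle| \\
  &= |\langle\xi_n-\xi',\varphi_k\rangle| \\
  &\le \|\xi_n-\xi'\| .
\end{aligned}
\]

(The final inequality is the Cauchy–Schwarz inequality.) Thus,

\[ 0 \le |\langle\xi-\xi',\varphi_k\rangle| \le \lim_{n\to\infty}\|\xi_n-\xi'\| = 0 . \]

Therefore, $\xi-\xi'$ is orthogonal to every vector $\varphi_k$ in the orthonormal basis.
Thus, $\xi-\xi' = 0$. □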
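Equation (6.6) asserts that if $\xi\in H$ and $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis
for $H$, then
\[ \lim_{n\to\infty}\Bigl\|\,\xi - \sum_{k=1}^{n}\langle\xi,\varphi_k\rangle\,\varphi_k\Bigr\| = 0 . \]

This will be expressed as

\[ \xi = \sum_{k\in\mathbb{N}}\langle\xi,\varphi_k\rangle\,\varphi_k , \tag{6.7} \]

with the understanding that the convergence of the series in (6.7) is as in
(6.6).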

Definition 6.22. If $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis of a separable Hilbert
space $H$, then the series
\[ \xi = \sum_{k\in\mathbb{N}}\langle\xi,\varphi_k\rangle\,\varphi_k \]

is called a Fourier series decomposition of $\xi\in H$. The complex numbers $\langle\xi,\varphi_k\rangle$
are called the Fourier coefficients of $\xi$.

Proposition 6.23. Assume that $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis for a
separable Hilbert space $H$.

1. (Parseval’s Equation) For every $\xi,\eta\in H$,

\[ \langle\xi,\eta\rangle = \sum_{k\in\mathbb{N}}\langle\xi,\varphi_k\rangle\,\overline{\langle\eta,\varphi_k\rangle} . \tag{6.8} \]

2. For every $\xi\in H$,
\[ \|\xi\|^2 = \sum_{k\in\mathbb{N}}|\langle\xi,\varphi_k\rangle|^2 . \]

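Proof. Express $\xi$ in its Fourier series decomposition

\[ \xi = \sum_{k\in\mathbb{N}}\langle\xi,\varphi_k\rangle\,\varphi_k , \]

and consider
\[ \xi_n = \sum_{k=1}^{n}\langle\xi,\varphi_k\rangle\,\varphi_k . \]
Because $\lim_{n\to\infty}\|\xi_n-\xi\| = 0$, Lemma 6.5 shows that
\[
\begin{aligned}
\langle\xi,\eta\rangle &= \lim_{n\to\infty}\langle\xi_n,\eta\rangle \\
  &= \lim_{n\to\infty}\Bigl(\sum_{k=1}^{n}\langle\xi,\varphi_k\rangle\,\langle\varphi_k,\eta\rangle\Bigr) \\
  &= \sum_{k\in\mathbb{N}}\langle\xi,\varphi_k\rangle\,\overline{\langle\eta,\varphi_k\rangle} .
\end{aligned}
\]

This proves Parseval’s Equation. The equation

\[ \|\xi\|^2 = \sum_{k\in\mathbb{N}}|\langle\xi,\varphi_k\rangle|^2 \]

follows from Parseval’s Equation because $\|\xi\|^2 = \langle\xi,\xi\rangle$. □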
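A finite-dimensional instance makes the role of the complex conjugate in Parseval's Equation visible. The sketch below works in C^2 with one particular orthonormal basis; the basis, the vectors, and all variable names are illustrative assumptions. It compares the inner product with the coefficient sum taken with and without the conjugate.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

s = 1 / math.sqrt(2)
basis = [[s, s * 1j], [s, -s * 1j]]  # an orthonormal basis of C^2

xi = [2 + 1j, -1j]
eta = [1, 3 - 2j]

coeff_xi = [inner(xi, phi) for phi in basis]    # Fourier coefficients of xi
coeff_eta = [inner(eta, phi) for phi in basis]  # Fourier coefficients of eta

# Parseval: <xi, eta> = sum_k <xi, phi_k> * conj(<eta, phi_k>).
parseval_sum = sum(a * b.conjugate() for a, b in zip(coeff_xi, coeff_eta))
parseval_gap = abs(parseval_sum - inner(xi, eta))

# Norm identity: ||xi||^2 = sum_k |<xi, phi_k>|^2.
norm_gap = abs(sum(abs(a) ** 2 for a in coeff_xi) - inner(xi, xi).real)

# Omitting the conjugate gives a different number for these vectors.
no_conj_gap = abs(sum(a * b for a, b in zip(coeff_xi, coeff_eta)) - inner(xi, eta))

print(parseval_gap, norm_gap, no_conj_gap)
```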


The determination of an explicit orthonormal basis is a nontrivial issue.
Some concrete examples can be found in the following section.

6.5 Examples of Separable Hilbert Spaces


The existence of an orthonormal basis is one thing, but the determination of
an explicit orthonormal basis is another issue altogether. (For example, the
Hilbert space L2 (X, Σ, µ) has an orthonormal basis; an explicit description of
the basis elements is not at all apparent.) However, it is possible to give an
explicit description of an orthonormal basis of L2 ([a, b]).
In L2 ([a, b]), it is evident that the set {1, t, t2, t3 , . . .} of functions in the
variable t are linearly independent. Thus, the Gram–Schmidt process can be
applied to this set to produce a countably infinite set of polynomials.
Definition 6.24. The Legendre polynomials in $L^2([a,b])$ are the elements of
the countably infinite set of polynomials that result from applying the Gram–Schmidt
process in $L^2([a,b])$ to the set $\{1,t,t^2,t^3,\dots\}$.
As a concrete example, in $L^2([-1,1])$ the Legendre polynomials are the
polynomials $\varphi_k$ given by $\varphi_0(t) = \sqrt{1/2}$ and
\[ \varphi_k(t) = \sqrt{\frac{2k+1}{2}}\;\frac{1}{2^k\,k!}\;\frac{d^k}{dt^k}\bigl[(t^2-1)^k\bigr], \quad k\in\mathbb{N} . \]
Theorem 6.25. The set of Legendre polynomials in L2 ([a, b]) forms an or-
thonormal basis of L2 ([a, b]).
Proof. Let L denote the set of Legendre polynomials and consider the linear
submanifold Span L of L2 ([a, b]). Because the Gram–Schmidt orthogonalisa-
tion process does not alter the span of the vectors that it is applied to,
Span L = Span {1, t, t2, t3 , . . .} .
Therefore, by the Weierstrass Approximation Theorem, if $\vartheta : [a,b]\to\mathbb{C}$ is
continuous and $\varepsilon>0$, there is an element $f\in\operatorname{Span}L$ such that $|\vartheta(t)-f(t)|<\varepsilon$
for all $t\in[a,b]$. Furthermore, Corollary 2.42 states that the set of all
continuous functions $\vartheta$ is dense in $L^2([a,b])$. Thus, a simple calculation shows that
$\operatorname{Span}L$ is dense in $L^2([a,b])$, which proves that $L$ is an orthonormal basis of
$L^2([a,b])$. □
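The link between the Gram–Schmidt process and the closed formula for the Legendre polynomials on [−1, 1] can be checked exactly with rational arithmetic. The sketch below assumes real polynomials stored as coefficient lists (index = power of t) and orthogonalises the monomials without normalising, so that every quantity stays rational; the names `monic_orthogonal` and `rodrigues` are illustrative. It then confirms, for the first six degrees, that the derivative formula yields a positive multiple of the Gram–Schmidt polynomial with squared norm 2/(2k+1), which is what the factor sqrt((2k+1)/2) normalises.

```python
from fractions import Fraction
from math import comb, factorial

def inner(p, q):
    """Exact L^2([-1, 1]) inner product of real polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total

def monic_orthogonal(count):
    """Gram-Schmidt on 1, t, t^2, ... without the normalisation step."""
    polys = []
    for k in range(count):
        mono = [Fraction(0)] * k + [Fraction(1)]  # coefficients of t^k
        p = list(mono)
        for q in polys:
            c = inner(mono, q) / inner(q, q)
            for i, b in enumerate(q):
                p[i] -= c * b
        polys.append(p)
    return polys

def rodrigues(k):
    """Coefficients of (1 / (2^k k!)) d^k/dt^k (t^2 - 1)^k."""
    coeffs = [Fraction(0)] * (k + 1)
    for j in range(k + 1):
        if 2 * j >= k:
            binom = comb(k, j) * (-1) ** (k - j)  # coefficient of t^(2j)
            deriv = Fraction(factorial(2 * j), factorial(2 * j - k))
            coeffs[2 * j - k] += binom * deriv / (2 ** k * factorial(k))
    return coeffs

polys = monic_orthogonal(6)
proportional = []
normalised = []
for k, p in enumerate(polys):
    P = rodrigues(k)
    lead = P[-1]  # positive leading coefficient
    proportional.append(lead > 0 and P == [lead * c for c in p])
    normalised.append(inner(P, P) == Fraction(2, 2 * k + 1))

print(polys[2], all(proportional), all(normalised))
```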
In some instances the Legendre polynomials are not the best orthonormal-
basis vectors to work with. This is especially true if one is interested in periodic
phenomena. In such cases, classical Fourier series offers a better tool.
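For every $k\in\mathbb{Z}$, consider the $2\pi$-periodic function $\varphi_k : [-\pi,\pi]\to\mathbb{C}$ given
by
\[ \varphi_k(t) = \frac{e^{ikt}}{\sqrt{2\pi}} , \tag{6.9} \]

where $i = \sqrt{-1}$ and where $e^{ikt} = \cos(kt) + i\sin(kt)$. If $n,m\in\mathbb{Z}$, then
\[
\begin{aligned}
\langle\varphi_n,\varphi_m\rangle &= \int_{-\pi}^{\pi}\varphi_n(t)\,\overline{\varphi_m(t)}\,dt \\
  &= \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)t}\,dt \\
  &= 0, \text{ if } n\neq m, \text{ or } 1, \text{ if } n = m .
\end{aligned}
\]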

Thus, {φn }n∈Z is a set of orthonormal vectors in L2 ([−π, π]) that span the
ring T of trigonometric polynomials. Indeed, {φn }n∈Z is an orthonormal basis
of L2 ([−π, π]).
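The orthonormality relations for the exponentials, normalised by 1/sqrt(2π), can be confirmed numerically. The sketch below approximates the L^2([−π, π]) inner product by an equally spaced Riemann sum, which happens to be exact up to rounding for trigonometric polynomials of low degree; the sample count and the range of indices are arbitrary assumptions, and the function names are illustrative.

```python
import cmath
import math

def phi(k, t):
    """The function e^{ikt} / sqrt(2*pi) evaluated at t."""
    return cmath.exp(1j * k * t) / math.sqrt(2 * math.pi)

def inner(f, g, samples=64):
    """Riemann-sum approximation of the integral of f * conj(g) over [-pi, pi]."""
    h = 2 * math.pi / samples
    nodes = [-math.pi + j * h for j in range(samples)]
    return sum(f(t) * g(t).conjugate() for t in nodes) * h

indices = range(-3, 4)
gram = {(n, m): inner(lambda t, n=n: phi(n, t), lambda t, m=m: phi(m, t))
        for n in indices for m in indices}

# Largest deviation of <phi_n, phi_m> from the Kronecker delta.
max_err = max(abs(value - (1 if n == m else 0))
              for (n, m), value in gram.items())
print(max_err)
```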
Notation. L2 (T) shall denote L2 ([−π, π], m).
Theorem 6.26. $\{\varphi_n\}_{n\in\mathbb{Z}}$ is an orthonormal basis of $L^2(\mathbb{T})$.

Proof. Theorem 2.38 shows that $\operatorname{Span}\{\varphi_n\}_{n\in\mathbb{Z}}$ is uniformly dense in the linear
submanifold of $L^2(\mathbb{T})$ consisting of all $2\pi$-periodic continuous functions;
hence, this is also true with respect to the norm of $L^2(\mathbb{T})$. Corollary 2.42 shows
that $C([-\pi,\pi])$ is dense in $L^2(\mathbb{T})$. Therefore, it is sufficient to show that every
$f\in C([-\pi,\pi])$ can be approximated to within $\varepsilon$ in (the norm of) $L^2(\mathbb{T})$ by a
$2\pi$-periodic continuous function $h$.
To this end, choose $f\in C([-\pi,\pi])$ and let $\varepsilon>0$. Let $M = \max\{|f(t)| \mid t\in[-\pi,\pi]\}$
and choose $\delta>0$ such that $\delta < \frac{\varepsilon^2}{8M^2}$. Let $h\in C([-\pi,\pi])$ be the
function that agrees with $f$ on $[-\pi+\delta,\pi-\delta]$, is a straight line from the point
$(-\pi,0)$ to the point $(-\pi+\delta, f(-\pi+\delta))$, and is a straight line from the point
$(\pi-\delta, f(\pi-\delta))$ to the point $(\pi,0)$. Thus, $|f(t)-h(t)| = 0$ for $t\in[-\pi+\delta,\pi-\delta]$
and $|f(t)-h(t)| \le 2M$ for all $t\notin[-\pi+\delta,\pi-\delta]$. Hence,
\[ \|f-h\|^2 = \int_{[-\pi,-\pi+\delta]}|f-h|^2\,dm + \int_{[\pi-\delta,\pi]}|f-h|^2\,dm \le 8M^2\delta . \]

That is, $\|f-h\| < \varepsilon$. □

Definition 6.27. The classical Fourier coefficients of $f\in L^2(\mathbb{T})$ are the
complex numbers $\hat f(k)$ defined by

\[ \hat f(k) = \langle f,\varphi_k\rangle . \]

Note that if $p$ is a trigonometric polynomial as in (2.13) and if $n\le k\le m$,
then
\[ \hat p(k) = \frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi} p(t)\,e^{-ikt}\,dt = \alpha_k . \]
Theorem 6.26 implies that the classical Fourier series

\[ \sum_{k\in\mathbb{Z}}\hat f(k)\,e^{ikt} \]

of a $2\pi$-periodic continuous function $f : [-\pi,\pi]\to\mathbb{C}$ “converges in the mean”
to $f$. That is,
\[ \lim_{n\to\infty}\int_{[-\pi,\pi]}\Bigl|\,f(t) - \sum_{k=-n}^{n}\hat f(k)\,e^{ikt}\Bigr|^2\,dt = 0 . \]

6.6 Hilbert Space Duality


The dual spaces of Hilbert spaces are the simplest to describe.
Theorem 6.28. (Riesz Representation Theorem) To every linear functional
$\phi\in H^*$ there corresponds a unique vector $\eta\in H$ such that $\|\phi\| = \|\eta\|$ and
$\phi(\xi) = \langle\xi,\eta\rangle$, for all $\xi\in H$. Conversely, for each vector $\eta\in H$ the formula
$\phi(\xi) = \langle\xi,\eta\rangle$, for $\xi\in H$, determines a unique linear functional $\phi\in H^*$ of
norm $\|\phi\| = \|\eta\|$.

Proof. First of all, suppose that $\phi\in H^*$. If $\phi = 0$, then take $\eta = 0$ and
we obtain, trivially, that $\phi(\xi) = \langle\xi,\eta\rangle$ for all $\xi\in H$. Therefore, assume that
$\phi\neq 0$. By continuity of $\phi$, $\ker\phi$ is closed; and by the First Isomorphism
Theorem, the codimension of $\ker\phi$ is the dimension of the range of $\phi$, which
in this case is 1. Hence, the orthogonal complement $(\ker\phi)^\perp$ to $\ker\phi$ in $H$ is
1-dimensional. Fix a nonzero vector $w\in(\ker\phi)^\perp$. Now because

\[ H = \ker\phi \oplus (\ker\phi)^\perp , \]

for each $\xi\in H$ there exist a unique vector $\nu\in\ker\phi$ and scalar $\lambda\in\mathbb{C}$ for
which
\[ \xi = \nu + \lambda w . \tag{6.10} \]
Thus,
\[ \phi(\xi) = \phi(\nu) + \phi(\lambda w) = \lambda\,\phi(w) . \]
Now take $\eta\in H$ to be the vector

\[ \eta = \frac{\overline{\phi(w)}}{\|w\|^2}\,w , \]

which depends only on the vector $w$ and the value of $\phi$ at $w$. Then, for an
arbitrary $\xi\in H$ having form (6.10),
\[ \langle\xi,\eta\rangle = \Bigl\langle \nu+\lambda w,\ \frac{\overline{\phi(w)}}{\|w\|^2}\,w \Bigr\rangle = \lambda\,\phi(w)\,\frac{\langle w,w\rangle}{\|w\|^2} = \phi(\xi) . \]

Thus, $\phi\in H^*$ is represented by the vector $\eta\in H$. By the Cauchy–Schwarz
inequality,
\[ |\phi(\xi)| = |\langle\xi,\eta\rangle| \le \|\xi\|\,\|\eta\| , \]
and so
\[ \|\phi\| \le \|\eta\| . \]
On the other hand, if $\xi = \|\eta\|^{-1}\eta$, then $\|\xi\| = 1$ and $\phi(\xi) = \|\eta\|$. Hence,
$\|\phi\| = \|\eta\|$.
The proof of the converse is simpler; thus, it is omitted. 
As every Hilbert space H is self dual, the closed unit ball of H is compact
in the weak topology, by the Banach–Alaoglu Theorem. Nevertheless, the weak
limit of a sequence of unit vectors can be far from what one might expect.
Example 6.29. If $\{\varphi_n\}_{n\in\mathbb{N}}$ is a sequence of pairwise orthogonal unit vectors in
a Hilbert space $H$, then the sequence converges weakly to $0\in H$. To see this,
choose any $\phi\in H^*$. By the Riesz Representation Theorem (Theorem 6.28),
there is a vector $\eta\in H$ such that $\phi(\xi) = \langle\xi,\eta\rangle$, for every $\xi\in H$. Let $M$ be the
subspace generated by the sequence $\{\varphi_n\}_{n\in\mathbb{N}}$. The projection of $\eta$ onto $M$ is
the vector
\[ \sum_{n=1}^{\infty}\langle\eta,\varphi_n\rangle\,\varphi_n = \sum_{n=1}^{\infty}\overline{\phi(\varphi_n)}\,\varphi_n , \]

and so the convergence of this series implies that $\phi(\varphi_n)\to 0$. Hence, $\varphi_n\to 0$
weakly. ♦

6.7 Exercises

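1. Prove that if $\langle\cdot,\cdot\rangle$ is an inner product on a vector space $H$ and if $\eta\in H$
is nonzero, then $|\langle\xi,\eta\rangle| = \langle\xi,\xi\rangle^{1/2}\langle\eta,\eta\rangle^{1/2}$ if and only if $\xi = \alpha\eta$ for some
$\alpha\in\mathbb{C}$.
2. Prove that in an inner-product space $H$,

\[ \|\xi+\eta\|^2 + \|\xi-\eta\|^2 = 2\|\xi\|^2 + 2\|\eta\|^2 , \]

for all $\xi,\eta\in H$.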
3. Let K be the closed unit ball of a Hilbert space H. Prove that ξ ∈ K is
an extreme point of K if and only if kξk = 1.
4. Prove that in a Hilbert space H, S ⊥ is a subspace for every S ⊆ H.
5. In a Hilbert space H, let M1 and M2 be closed subspaces.
a) Prove that (M1 + M2 )⊥ = M1⊥ ∩ M2⊥ .
b) Prove or find a counterexample to (M1 ∩ M2 )⊥ = M1⊥ + M2⊥ .
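6. Assume that $\{\varphi_k\}_{k\in\mathbb{N}}$ is an orthonormal basis for a Hilbert space $H$, and
let $\{\delta_k\}_{k=1}^{\infty}$ be a sequence of nonnegative real numbers. Hilbert’s cube is
the set
\[ \Delta = \Bigl\{\, \sum_{k=1}^{\infty}\alpha_k\varphi_k \ :\ |\alpha_k|\le\delta_k,\ \forall\,k \Bigr\} . \]

Prove that $\Delta$ is compact in $H$ if and only if

\[ \sum_{k=1}^{\infty}\delta_k^2 < \infty . \]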

7. Prove that every nonzero Hilbert space has an orthonormal basis. In par-
ticular, if O ⊂ H is a set of orthonormal vectors, then prove that H has
an orthonormal basis B that contains O.
8. Prove that if $B\subset H$ is an orthonormal basis in a Hilbert space $H$, then
$\operatorname{Span}B$ is dense in $H$.
9. Prove that if $B$ is an orthonormal basis for a Hilbert space $H$, and if
$\eta\in H$ is such that $\langle\eta,\varphi\rangle = 0$ for all $\varphi\in B$, then $\eta = 0$.
10. Prove that every orthonormal basis of a separable Hilbert space is a
Schauder basis.
11. In `2 (N), let φk be the sequence

φk = (0, . . . , 0, 1, 0, . . . ) ,

where the 1 occurs in the k-th term of the sequence. Prove that {φk }k∈N
is an orthonormal basis of `2 (N).
12. Suppose that in a Hilbert space $H$, $\varphi_1,\dots,\varphi_n$ are orthonormal vectors.
Prove that if $\xi_1,\dots,\xi_n\in H$ satisfy $\|\xi_j-\varphi_j\| < n^{-1/2}$ for each $j$, then
$\xi_1,\dots,\xi_n$ are linearly independent.
13. Consider the Fourier series expansions of $f_1(t) = t$ and $f_2(t) = e^{\alpha t}$ in
$L^2([-\pi,\pi])$. Calculate the Fourier series of each $f_j$ and use Parseval’s
identity to establish each of the following formulae:

\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6} ; \]

\[ \sum_{n=-\infty}^{\infty}\frac{1}{n^2+\alpha^2} = \frac{\pi}{\alpha}\coth(\alpha\pi) . \]
Part II

Operators
7
Operators on Banach Spaces

Recall from Section 3.1 that an operator $T : V\to W$, where $V$ and $W$ are
Banach spaces, is a linear transformation such that $\|Tv\| \le M\|v\|$, for some
constant $M>0$ and every $v\in V$. The vector space $B(V,W)$ of all operators
$V\to W$ is a Banach space under the norm
\[ \|T\| = \sup_{0\neq v\in V}\frac{\|T(v)\|}{\|v\|} . \]

In the case $W = V$, we denote $B(V,V)$ by $B(V)$. Because the composition
$S\circ T$ of two operators $S,T\in B(V)$ is again an operator of norm

\[ \|S\circ T\| \le \|S\|\,\|T\| , \]

the Banach space $B(V)$ is a Banach algebra, where one considers (and
denotes) the composition $S\circ T$ as a product $ST$.
This chapter presents some of the fundamental theorems concerning oper-
ators.

7.1 Principle of Uniform Boundedness


The first theorem of the chapter concerns the boundedness of a set of operators
acting on a Banach space V , and this result has a crucial role in the spectral
theory of operators (for example, Theorem 9.4).

Definition 7.1. A set $S$ of vectors in a Banach space $W$ is said to be
uniformly bounded if there is a $K>0$ such that $\|w\|\le K$ for all $w\in S$.
Theorem 7.2. (Principle of Uniform Boundedness) Let $\Lambda\subseteq B(V,W)$ be a
set of operators, where $V$ is a Banach space and $W$ is a normed vector space.
Exactly one of the following two statements holds:
1. There is a $K>0$ such that $\|T\|\le K$ for every $T\in\Lambda$.
2. There is a dense $G_\delta$-set $G\subseteq V$ such that

\[ \sup_{T\in\Lambda}\|Tv\| = \infty \quad \forall\,v\in G . \]

Proof. For each $T\in\Lambda$ let $f_T : V\to\mathbb{R}$ be the function defined by

\[ f_T(v) = \|Tv\| , \quad v\in V . \]

Observe that $f_T$ is continuous and so

\[ f_T^{-1}([0,n]) = \{v\in V \mid \|Tv\|\le n\} \]

is a closed set. Therefore, if $n\in\mathbb{N}$, then the set $K_n\subset V$ defined by

\[ K_n = \bigcap_{T\in\Lambda} f_T^{-1}([0,n]) \]

is a closed subset of $V$. Let $U_n = V\setminus K_n$, which is open. Either every $U_n$ is
dense in $V$ or there is at least one $n\in\mathbb{N}$ for which $U_n$ is not dense.
Case #1: $U_n$ is not dense for some $n$. In this case, fix such an $n$ and choose
$v_0\in K_n = V\setminus U_n$ so that $v_0$ lies outside the closure of $U_n$. Thus, there is a
$\rho>0$ with $B_\rho(v_0)\cap U_n = \emptyset$. Hence, if $\varepsilon = \rho/2$, then $v_0+v\in K_n$ for all $v\in V$
that satisfy $\|v\|\le\varepsilon$. That is, if $\|v\|\le\varepsilon$ and $T\in\Lambda$, then

\[ \|Tv\| = \|(Tv_0+Tv) - Tv_0\| \le \|T(v_0+v)\| + \|Tv_0\| \le n + n = 2n . \]

Hence,
\[ \|T\| \le \frac{2n}{\varepsilon} , \quad \forall\,T\in\Lambda , \]
which proves that $\Lambda$ is uniformly bounded.
Case #2: $U_n$ is dense in $V$ for every $n\in\mathbb{N}$. In this case, let $G\subseteq V$ be the
$G_\delta$-set defined by
\[ G = \bigcap_{n\in\mathbb{N}} U_n . \]

By the Baire Category Theorem, $G$ is dense in $V$. By definition, if $v\in U_n$,
then there is a $T\in\Lambda$ with $\|Tv\|>n$. Hence, if $v\in G$, then

\[ \sup_{T\in\Lambda}\|Tv\| = \infty , \]

which proves the theorem. □

7.2 The Open Mapping Theorem


The motivation for the Open Mapping Theorem starts with analysis in finite
dimensions.

Consider the finite-dimensional normed vector space Cn equipped with the


Euclidean (that is, Hilbert space) norm. Suppose that T ∈ B(Cn ) is arbitrary.
By the Singular Value Decomposition in linear algebra, the unit sphere of Cn
is mapped by T to an ellipsoid, where the length of each axis of the ellipsoid is
determined by a singular value of T . The ellipsoid is degenerate precisely when
one of the singular values of T is zero (that is, there is an axis of length zero).
Each singular value of T —and, hence, each axis of the ellipsoid—is nonzero if,
and only if, T is invertible. By the First Isomorphism Theorem, T is invertible
if and only if T is a surjection. Only a little more thought shows that if T
is indeed surjective, then the interior of the unit sphere—namely, the open
unit ball—is mapped by T to the interior of the ellipsoid, which is another
open set and therefore contains an open ball centered at the origin. The Open
Mapping Theorem asserts that these features persist in infinite dimensions as
well.
Theorem 7.3. (Open Mapping Theorem) If $V$ and $W$ are Banach spaces and
if $T\in B(V,W)$ is a surjection, then there is a $\delta>0$ such that the open ball in
$W$ with centre 0 and radius $\delta$ is contained in the image under $T$ of the open
unit ball in $V$; that is,
\[ \{w\in W \mid \|w\|<\delta\} \subseteq \{Tv \mid v\in V \text{ and } \|v\|<1\} . \]
The proof of the Open Mapping Theorem will require the following two
preliminary lemmas.
Lemma 7.4. If $V$ and $W$ are Banach spaces and if $T\in B(V,W)$ is a
surjection, then there is a $\delta>0$ with the following property: if $\eta\in W$ and $\varepsilon>0$ are
given, then there exists $\xi\in V$ such that
1. $\|\xi\| < \delta^{-1}\|\eta\|$, and
2. $\|T\xi-\eta\| < \varepsilon$.
Proof. Let $U_k = \{v\in V \mid \|v\|<k\}$, $k\in\mathbb{N}$. Because $T$ is a surjection,
$W = \bigcup_k T(U_k)$; thus, $W = \bigcup_k K_k$, where $K_k$ is the closure of $T(U_k)$ in $W$. By the
Baire Category Theorem (Corollary 1.21), the Banach space $W$ is not a countable union
of nowhere dense sets; thus, there is at least one $n\in\mathbb{N}$ for which the interior
$\Omega_n$ of $K_n$ is nonempty.
Select $w_0\in\Omega_n$. Because $\Omega_n$ is open, there is a $\gamma>0$ such that
$w_0+w\in\Omega_n\subset K_n = \overline{T(U_n)}$ for every $w\in W$ with $\|w\|<\gamma$. Hence, there is a sequence
$\{v_k\}_{k\in\mathbb{N}}$ in $U_n$ such that $\lim_k\|w_0-Tv_k\| = 0$. Likewise, if $w\in W$ satisfies
$\|w\|<\gamma$, there is a sequence $\{z_k\}_{k\in\mathbb{N}}$ in $U_n$ with $\lim_k\|(w_0+w)-Tz_k\| = 0$.
Let $u_k = z_k-v_k$; thus, $\|u_k\| \le \|z_k\|+\|v_k\| < 2n$ and $\lim_k\|w-Tu_k\| = 0$.
Next, set $\delta = \frac{\gamma}{4n}$. Suppose that $\eta\in W$ (nonzero) and $\varepsilon>0$ are given.
Because the norm of $\frac{\gamma}{2\|\eta\|}\eta$ is less than $\gamma$, the argument of the previous paragraph
shows that there is a $u\in V$ such that $\|u\|<2n$ and
\[ \Bigl\|\,Tu - \frac{\gamma}{2\|\eta\|}\,\eta\Bigr\| < \frac{\gamma}{2\|\eta\|}\,\varepsilon ; \]
that is,
\[ \frac{2\|\eta\|}{\gamma}\,\Bigl\|\,Tu - \frac{\gamma}{2\|\eta\|}\,\eta\Bigr\| < \varepsilon . \]
Now let $\xi = \frac{2\|\eta\|}{\gamma}\,u$ to obtain $\|\xi\| < \frac{2\|\eta\|}{\gamma}(2n) = \delta^{-1}\|\eta\|$ and $\|T\xi-\eta\|<\varepsilon$. □
Lemma 7.5. If $V$ and $W$ are Banach spaces and if $T\in B(V,W)$ is a
surjection, then there is a $\delta>0$ with the following property: if $\varepsilon>0$ is given and if
$\eta\in W$ satisfies $\|\eta\|<\delta$, then there exists $\xi\in V$ such that
1. $\|\xi\| < 1+\varepsilon$, and
2. $T\xi = \eta$.
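Proof. Let $\delta>0$ be the number determined by Lemma 7.4. Suppose that
$\varepsilon>0$ is given and that $\eta\in W$ satisfies $\|\eta\|<\delta$. By Lemma 7.4, there is a
$\xi_1\in V$ with $\|\xi_1\| < \delta^{-1}\|\eta\|$ (whence, $\|\xi_1\|<1$) and $\|T\xi_1-\eta\| < \frac{\delta\varepsilon}{2}$.
Apply Lemma 7.4 again using $\eta-T\xi_1$ in place of $\eta$: thus, there is $\xi_2\in V$
with $\|\xi_2\| < \delta^{-1}\|\eta-T\xi_1\|$ (thus, $\|\xi_2\| < \varepsilon/2$) and $\|T\xi_2-(\eta-T\xi_1)\| < \frac{\delta\varepsilon}{2^2}$.
Repeated use of Lemma 7.4 produces, inductively, a sequence $\{\xi_k\}_{k\in\mathbb{N}}$ in
$V$ such that
\[ \|\xi_k\| < \frac{\varepsilon}{2^{k-1}} \ \text{ for every } k\ge 2 \quad\text{and}\quad \Bigl\|\,\eta - \sum_{j=1}^{k}T\xi_j\Bigr\| < \frac{\delta\varepsilon}{2^k} \ \text{ for every } k\in\mathbb{N} . \]

Hence, $\bigl\{\sum_{j=1}^{k}\xi_j\bigr\}_{k\in\mathbb{N}}$ is a Cauchy sequence in $V$; denote its limit $\xi\in V$ by

\[ \xi = \sum_{j=1}^{\infty}\xi_j . \]

Thus,
\[ \|\xi\| \le \sum_{j=1}^{\infty}\|\xi_j\| \le \|\xi_1\| + \varepsilon\sum_{j=2}^{\infty}\frac{1}{2^{j-1}} < 1+\varepsilon . \]
By continuity of $T$, $\bigl\|T\xi - T\bigl(\sum_{j=1}^{k}\xi_j\bigr)\bigr\| \to 0$. Hence, $T\xi = \eta$. □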

Proof of the Open Mapping Theorem. Let $\delta>0$ be the number given by
Lemma 7.5. Suppose that $\varepsilon>0$. Choose any $\eta\in W$ such that $\|\eta\| < \frac{\delta}{1+\varepsilon}$.
Hence, $\|(1+\varepsilon)\eta\| < \delta$, and so by Lemma 7.5 there is a $\xi\in V$ with $\|\xi\| < 1+\varepsilon$
and $T\xi = (1+\varepsilon)\eta$. Therefore, $\tilde\xi = (1+\varepsilon)^{-1}\xi$ is in the open unit ball of $V$ and
$T\tilde\xi = \eta$. Because $\varepsilon$ is an arbitrary positive number, this proves that
\[ \{w\in W \mid \|w\|<\delta\} = \bigcup_{\varepsilon>0}\Bigl\{\eta\in W \ \Big|\ \|\eta\| < \frac{\delta}{1+\varepsilon}\Bigr\} \subseteq \{T\xi \mid \xi\in V,\ \|\xi\|<1\} , \]

which completes the proof of the Open Mapping Theorem. □

Definition 7.6. If X and Y are topological spaces and if f : X → Y is a


function, then f is called an open map if f(U ) is open in Y for every open
set U in X.

Corollary 7.7. If T is a surjective operator between Banach spaces, then T


is an open map.

Proof. Exercise 4. 

7.3 Invertible Operators

Definition 7.8. An operator T ∈ B(V, W ) is bounded below if there exists


δ > 0 such that
δkvk ≤ kT vk , ∀ v ∈ V .
The positive real number δ is called a lower bound for T .

Proposition 7.9. If V and W are Banach spaces, then the following state-
ments are equivalent for T ∈ B(V, W ).
1. T is a bijection;
2. T is bounded below and has dense range.

Proof. To show that 1 implies 2, assume that T is a bijection. Therefore, the


range of T is W , and so the statement “T has dense range” is trivial. Because
T is surjective, the Open Mapping Theorem asserts that there is a δ > 0 for
which
{w ∈ W | kwk < δ} ⊆ {T v | v ∈ V and kvk < 1} .
In other words, using that T is also injective, for every w ∈ W for which
kwk < δ there is a unique v ∈ V with kvk < 1 and T v = w. The contrapositive
of this statement is: kT vk ≥ δ for every v ∈ V such that kvk ≥ 1. Hence, if
v ∈ V is nonzero, then kvk−1 v is a unit vector and so kT (kvk−1 v)k ≥ δ; that
is, kT vk ≥ δkvk, which proves 2.
For the proof of 2 implies 1, assume that T is bounded below by δ > 0 and
that T has dense range. Because T is bounded below, it is straightforward to
verify that T is injective. To show that T is surjective, the hypothesis that the
range of T is dense in W implies that it is sufficient to show that the range
of T is closed (which would then yield T (V ) = W ). To this end, suppose that
w ∈ W is in the closure of the range of T . Thus, there is a sequence {wk }k∈N
in ran T whose limit is w; this sequence is necessarily a Cauchy sequence.
There are (unique) vk ∈ V for which T vk = wk . Thus, if n, m ∈ N, then

δ kvn − vm k ≤ kT (vn − vm )k = kwn − wm k ,



which implies that {vk }k∈N is a Cauchy sequence in V . Because V is a Banach


space, there exists a limit v ∈ V for the sequence {vk }k∈N . Because T is
continuous, T v = w; that is, w is in the range of T , and so ran T is closed. 
Proposition 7.9 has an important consequence: if an operator is bijective,
then its inverse is also an operator.
Corollary 7.10. If V and W are Banach spaces and if T ∈ B(V, W ) is a
bijection, then the linear transformation T −1 : W → V is bounded.
Proof. Ordinary linear algebra demonstrates that the inverse function $T^{-1} : W \to V$ of $T$ is a linear transformation. By Proposition 7.9, $T$ is bounded below by some $\delta > 0$ (and ran $T$ is dense). Thus, $T^{-1}$ is bounded and $\|T^{-1}\| \le \delta^{-1}$ by the following computation:
\[ \|T^{-1}w\| = \|T^{-1}(Tv)\| = \|v\| \le \frac{1}{\delta}\,\|Tv\| = \delta^{-1}\|w\| , \]
where $w = Tv$. □
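The estimate in Corollary 7.10 can be checked by hand in finite dimensions. The following Python sketch is only an illustration under stated assumptions: the 2x2 matrix `T`, the choice of the max norm, and the sample grid are hypothetical and do not come from the notes. It verifies on sample vectors that δ = 1/‖T⁻¹‖ is a lower bound for T, so that ‖T⁻¹‖ ≤ δ⁻¹ holds with equality for this choice of δ.

```python
from fractions import Fraction as F

def apply(A, v):
    """Apply a 2x2 matrix A to a vector v."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def vec_norm(v):
    """Max norm of a vector."""
    return max(abs(x) for x in v)

def op_norm(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def inverse(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

T = [[F(2), F(1)], [F(0), F(3)]]   # hypothetical example operator on a 2-dimensional space
T_inv = inverse(T)
delta = 1 / op_norm(T_inv)         # candidate lower bound for T

# Check delta * ||v|| <= ||T v|| on a grid of sample vectors (exact arithmetic).
grid = [F(k, 4) for k in range(-8, 9)]
is_lower_bound = all(
    delta * vec_norm([a, b]) <= vec_norm(apply(T, [a, b]))
    for a in grid for b in grid
)
print(delta, op_norm(T_inv), is_lower_bound)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.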
Two criteria for the singularity of an operator are:
Corollary 7.11. If V and W are Banach spaces, then an operator T ∈ B(V, W) fails to be invertible if
1. there is a sequence of unit vectors vk ∈ V with inf_k ‖T vk‖ = 0, or
2. the range of T is not dense in W.

7.4 The Banach Space Adjoint


Every operator T on a Banach space V induces an operator T ∗ on the dual
V ∗ of V . The existence of this operator is shown in the following proposition.
Proposition 7.12. If V and W are Banach spaces, and if T ∈ B(V, W ), then
there is a unique operator T ∗ : W ∗ → V ∗ with the property that
T ∗ ψ(v) = ψ(T v) , ∀ ψ ∈ W ∗, v ∈ V . (7.1)
Furthermore, if T1, T2, T ∈ B(V, W) and α1, α2 ∈ C, then
1. kT ∗ k = kT k,
2. (α1 T1 + α2 T2 )∗ = α1 T1∗ + α2 T2∗ ,
3. if T is invertible, then T ∗ is invertible and (T ∗ )−1 = (T −1 )∗ , and
4. if W = V , then (T1 T2 )∗ = T2∗ T1∗ .
Proof. Exercise 6. 
Definition 7.13. The operator T ∗ is called the adjoint of T .
Conceptually, the adjoint T ∗ of T ∈ B(V, W ) is very much like the trans-
pose T t of an m × n matrix T . Like the transpose of a matrix, the adjoint T ∗
can sometimes be used to study T itself, as we shall see later on in this study.

7.5 The Spectrum

The theory of polynomial rings makes extensive use of factorisation of polynomials by way of their roots. Similarly, the Jordan canonical form in linear algebra is a factorisation of a linear transformation as a product of certain basic transformations which are determined by the roots (eigenvalues) of the transformation. The study of roots, or singular points, extends naturally to operators on Banach spaces.

Definition 7.14. Assume that T is an operator on a Banach space V . The


spectrum of T is the set σ(T ) of all λ ∈ C for which the operator T − λ1 on
V is not invertible.

The following theorem is an operator theoretic analogue of the Fundamen-


tal Theorem of Algebra.

Theorem 7.15. If V is a Banach space and T ∈ B(V ), then σ(T ) is a


nonempty compact subset of {ζ ∈ C | |ζ| ≤ kT k}.

One proof of the Fundamental Theorem of Algebra is via complex analysis


(specifically, Liouville’s Theorem on bounded entire functions). Interestingly,
it is by a very similar route that Theorem 7.15 is proved.

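Lemma 7.16. If $T \in B(V)$ and $\lambda \in \mathbb{C}$ are such that $|\lambda| > \|T\|$, then $\big(1 - \frac{1}{\lambda}T\big)^{-1}$ exists,
\[ \Big(1 - \frac{1}{\lambda}T\Big)^{-1} = \sum_{k=0}^{\infty} \lambda^{-k} T^{k} , \quad\text{and}\quad \Big\| \Big(1 - \frac{1}{\lambda}T\Big)^{-1} \Big\| \le \frac{|\lambda|}{|\lambda| - \|T\|} . \]

Proof. Since $\|\frac{1}{\lambda}T\| < 1$,
\[ \sum_{k=0}^{\infty} \Big( \frac{\|T\|}{|\lambda|} \Big)^{k} = \frac{1}{1 - \frac{\|T\|}{|\lambda|}} = \frac{|\lambda|}{|\lambda| - \|T\|} . \]
Hence, the series $\sum_{k=0}^{\infty} \lambda^{-k}T^{k}$ converges to some $S \in B(V)$ with $\|S\| \le \frac{|\lambda|}{|\lambda| - \|T\|}$. As
\[ \Big(1 - \frac{1}{\lambda}T\Big)\Big(1 + \frac{1}{\lambda}T + \dots + \frac{1}{\lambda^{k}}T^{k}\Big) = 1 - \frac{1}{\lambda^{k+1}}T^{k+1} , \]
and the right-hand side converges to $1$ as $k \to \infty$, we conclude that $S = \big(1 - \frac{1}{\lambda}T\big)^{-1}$. □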

Lemma 7.17. If $T \in B(V)$ satisfies $\|T\| < 1$, then $(1 - T)^{-1}$ exists and
\[ \|1 - (1 - T)^{-1}\| \le \frac{\|T\|}{1 - \|T\|} . \]

Proof. Lemma 7.16 (with $\lambda = 1$) shows that $(1 - T)^{-1}$ exists, equals $\sum_{k=0}^{\infty} T^{k}$, and satisfies $\|(1 - T)^{-1}\| \le \frac{1}{1 - \|T\|}$. Further,
\[ \|1 - (1 - T)^{-1}\| = \Big\| 1 - \sum_{k=0}^{\infty} T^{k} \Big\| = \Big\| \sum_{k=1}^{\infty} T^{k} \Big\| = \Big\| -T \Big( \sum_{k=0}^{\infty} T^{k} \Big) \Big\| = \| -T(1 - T)^{-1} \| \le \|T\|\,\|(1 - T)^{-1}\| . \]
Hence, $\|1 - (1 - T)^{-1}\| \le \frac{\|T\|}{1 - \|T\|}$. □
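The geometric (Neumann) series of Lemmas 7.16 and 7.17 can be watched converging for a small matrix. The sketch below is an illustration only: the 2x2 matrix `T`, the max-row-sum operator norm, and all helper names are assumptions made for the example. It forms the partial sums S_k = 1 + T + ... + T^k in exact rational arithmetic and checks the identity (1 − T)S_k = 1 − T^{k+1} used in the proof of Lemma 7.16, together with the two norm bounds.

```python
from fractions import Fraction as F

I2 = [[F(1), F(0)], [F(0), F(1)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def op_norm(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def neumann_partial_sum(T, k):
    """Return S_k = 1 + T + T^2 + ... + T^k."""
    S, P = I2, I2
    for _ in range(k):
        P = matmul(P, T)
        S = add(S, P)
    return S

T = [[F(1, 4), F(1, 2)], [F(0), F(1, 3)]]    # hypothetical operator with ||T|| = 3/4 < 1
S = neumann_partial_sum(T, 30)
residual = sub(I2, matmul(sub(I2, T), S))     # equals T^31 by the telescoping identity

print(float(op_norm(residual)))               # tiny: at most (3/4)^31
print(float(op_norm(S)), float(op_norm(sub(I2, S))))
```

The partial sums obey the same bounds as the limit: ‖S_k‖ ≤ 1/(1 − ‖T‖) and ‖1 − S_k‖ ≤ ‖T‖/(1 − ‖T‖).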

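Lemma 7.18. If $T \in B(V)$ and $\lambda_0 \notin \sigma(T)$, then $(T - \lambda 1)^{-1}$ exists for all $\lambda \in \mathbb{C}$ for which $|\lambda - \lambda_0| < \|(T - \lambda_0 1)^{-1}\|^{-1}$.
Proof. If $|\lambda - \lambda_0| < \|(T - \lambda_0 1)^{-1}\|^{-1}$, then $\|(\lambda_0 - \lambda)(T - \lambda_0 1)^{-1}\| < 1$, and so Lemma 7.17 asserts that $1 + (\lambda_0 - \lambda)(T - \lambda_0 1)^{-1}$ is invertible. However,
\[ T - \lambda 1 = (T - \lambda_0 1) + (\lambda_0 - \lambda)1 = (T - \lambda_0 1)\big[ 1 + (\lambda_0 - \lambda)(T - \lambda_0 1)^{-1} \big] , \]
which is a product of two invertible operators. Thus, $(T - \lambda 1)^{-1}$ exists for all $\lambda \in \mathbb{C}$ for which $|\lambda - \lambda_0| < \|(T - \lambda_0 1)^{-1}\|^{-1}$. □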

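Lemma 7.19. Suppose that $T \in B(V)$, $\varphi \in B(V)^*$, and $\lambda_0 \notin \sigma(T)$. There is an $\varepsilon > 0$ such that if $\Omega = \{\lambda \in \mathbb{C} \mid |\lambda - \lambda_0| < \varepsilon\}$, then $\Omega \cap \sigma(T) = \emptyset$ and the function $f : \Omega \to \mathbb{C}$ defined by $f(\lambda) = \varphi\big((T - \lambda 1)^{-1}\big)$ is differentiable at $\lambda_0$.
Proof. Let $\varepsilon = \|(T - \lambda_0 1)^{-1}\|^{-1}$ and $\Omega = \{\lambda \in \mathbb{C} \mid |\lambda - \lambda_0| < \varepsilon\}$. By Lemma 7.18, $\Omega \cap \sigma(T) = \emptyset$. Define $f : \Omega \to \mathbb{C}$ by $f(\lambda) = \varphi\big((T - \lambda 1)^{-1}\big)$. For every $\lambda \in \Omega$,
\[ (T - \lambda 1)^{-1} - (T - \lambda_0 1)^{-1} = (\lambda - \lambda_0)\,(T - \lambda 1)^{-1}(T - \lambda_0 1)^{-1} . \]
This yields a difference quotient: for $\lambda \ne \lambda_0$,
\[ \frac{1}{\lambda - \lambda_0}\Big[ (T - \lambda 1)^{-1} - (T - \lambda_0 1)^{-1} \Big] = (T - \lambda 1)^{-1}(T - \lambda_0 1)^{-1} . \]
The limit, as $\lambda \to \lambda_0$, appears to be $(T - \lambda_0 1)^{-2}$. This is indeed true, since, whenever $|\lambda - \lambda_0| \le \varepsilon/2$,
\[ \big\| (T - \lambda_0 1 + (\lambda_0 - \lambda)1)^{-1} - (T - \lambda_0 1)^{-1} + (\lambda_0 - \lambda)(T - \lambda_0 1)^{-2} \big\| \le 2\,\|(T - \lambda_0 1)^{-1}\|^{3}\,|\lambda - \lambda_0|^{2} . \]
Hence,
\[ \lim_{\lambda \to \lambda_0} \frac{f(\lambda) - f(\lambda_0)}{\lambda - \lambda_0} = \varphi\big((T - \lambda_0 1)^{-2}\big) , \]
which implies that $f$ is differentiable at $\lambda_0 \in \mathbb{C}$. □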
We are now prepared to prove that every operator on a Banach space has
a nonempty compact spectrum.
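Proof of Theorem 7.15. If $\lambda \in \mathbb{C}$ satisfies $|\lambda| > \|T\|$, then $\big(1 - \frac{1}{\lambda}T\big)^{-1}$ exists, by Lemma 7.16. Therefore, $-\lambda\big(1 - \frac{1}{\lambda}T\big) = T - \lambda 1$ is invertible. Hence, $\sigma(T) \subseteq \{\zeta \in \mathbb{C} \mid |\zeta| \le \|T\|\}$. This proves that $\sigma(T)$ is bounded. Furthermore, $\mathbb{C} \setminus \sigma(T)$ is an open set, by Lemma 7.18. Thus, $\sigma(T)$ is bounded and closed, and hence $\sigma(T)$ is compact. It remains to show that $\sigma(T)$ is nonempty.
Assume, contrary to what we aim to prove, that $\sigma(T) = \emptyset$. Choose any $\varphi \in B(V)^*$ and let $f : \mathbb{C} \to \mathbb{C}$ be defined by $f(\lambda) = \varphi\big((T - \lambda 1)^{-1}\big)$. By Lemma 7.19, $f$ is differentiable at each $\lambda_0 \in \mathbb{C}$. Hence, $f$ is holomorphic on $\mathbb{C}$.
Suppose that $\lambda \in \mathbb{C}$ satisfies $|\lambda| > \|T\|$. Then
\[ (T - \lambda 1)^{-1} = -\frac{1}{\lambda}\Big(1 - \frac{1}{\lambda}T\Big)^{-1} \]
and so
\[ |f(\lambda)| \le \|\varphi\|\,\frac{1}{|\lambda|}\,\Big\| \Big(1 - \frac{1}{\lambda}T\Big)^{-1} \Big\| \le \frac{\|\varphi\|}{|\lambda| - \|T\|} , \]
by Lemma 7.16. Hence, $\lim_{|\lambda| \to \infty} f(\lambda) = 0$.
On the compact set $\{\zeta \in \mathbb{C} \mid |\zeta| \le \|T\| + 1\}$, the continuous map $f$ is bounded. Moreover, the inequality $|f(\lambda)| \le \frac{\|\varphi\|}{|\lambda| - \|T\|} \le \|\varphi\|$ for $|\lambda| > \|T\| + 1$ implies that $f$ is bounded on its entire domain $\mathbb{C}$. But the only bounded entire functions are the constant functions. Thus, there is an $\alpha \in \mathbb{C}$ such that $f(\lambda) = \alpha$, for all $\lambda \in \mathbb{C}$. Because $\lim_{|\lambda| \to \infty} f(\lambda) = 0$, the constant $\alpha$ must in fact be zero. This, therefore, proves that $\varphi\big((T - \lambda 1)^{-1}\big) = 0$ for all $\varphi \in B(V)^*$ and $\lambda \in \mathbb{C}$. By the Hahn–Banach Theorem, this implies that $(T - \lambda 1)^{-1} = 0$, for each $\lambda$, which is impossible since $0$ is not invertible. Therefore, it must be that $\sigma(T) \ne \emptyset$. □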
Although σ(T ) lies in the closed disc of radius kT k and centre 0 ∈ C, there
could be a smaller disc with the same centre that contains σ(T ). The radius
of the smallest of such discs is called the spectral radius.
Definition 7.20. The spectral radius of $T \in B(V)$ is the quantity spr $T$ defined by
\[ \operatorname{spr} T = \max_{\lambda \in \sigma(T)} |\lambda| . \]

Theorem 7.21. If $T \in B(V)$, then $\lim_{n} \|T^{n}\|^{1/n}$ exists and
\[ \operatorname{spr} T = \lim_{n} \|T^{n}\|^{1/n} . \tag{7.2} \]

Proof. If $\lambda \in \sigma(T)$, then
\[ T^{n} - \lambda^{n} 1 = (T - \lambda 1) \sum_{j=1}^{n} \lambda^{j-1} T^{n-j} = \Big( \sum_{j=1}^{n} \lambda^{j-1} T^{n-j} \Big)(T - \lambda 1) . \]
If $T^{n} - \lambda^{n} 1$ were invertible, then by the equations above $T - \lambda 1$ would have a left and a right inverse and, thus, be invertible (Exercise 8). Therefore, $T^{n} - \lambda^{n} 1$ is not invertible. Thus, $\lambda^{n} \in \sigma(T^{n})$ and, by Theorem 7.15, $|\lambda|^{n} \le \|T^{n}\|$. Hence,
\[ \operatorname{spr} T \le \liminf_{n} \|T^{n}\|^{1/n} . \]

Let Ω, ∆ ⊆ C be the open sets

Ω = {ζ ∈ C | |ζ| (spr T ) < 1} and ∆ = {λ ∈ C | |λ| kT k < 1} .

Note that ∆ ⊆ Ω because spr T ≤ kT k.


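Now, choose any $\varphi \in B(V)^*$ and define $f : \Omega \to \mathbb{C}$ by
\[ f(\lambda) = \varphi\big((1 - \lambda T)^{-1}\big) . \]
If $\lambda \in \Delta$, then $(1 - \lambda T)^{-1}$ is given by a geometric series (Lemma 7.16); hence,
\[ f(\lambda) = \sum_{n=0}^{\infty} \lambda^{n} \varphi(T^{n}) , \quad \forall\, \lambda \in \Delta . \]
On the other hand, if $\zeta \in \Omega$ is nonzero, then
\[ f(\zeta) = \frac{1}{\zeta}\,\varphi\Big( \big( \tfrac{1}{\zeta} 1 - T \big)^{-1} \Big) . \]
By Lemma 7.19, $f$ is differentiable at each nonzero point of $\Omega$. Thus, since $0 \in \Delta \subseteq \Omega$, $f$ is holomorphic on the disc $\Omega$. By the uniqueness of the power series expansion about the origin, we obtain
\[ f(\zeta) = \sum_{n=0}^{\infty} \zeta^{n} \varphi(T^{n}) , \quad \forall\, \zeta \in \Omega . \]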

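Hence, $\lim_{n} |\varphi(\zeta^{n} T^{n})| = 0$ for every $\zeta \in \Omega$. Thus, for each $\zeta \in \Omega$ there is an $M_{\zeta,\varphi} > 0$ such that $|\varphi(\zeta^{n} T^{n})| \le M_{\zeta,\varphi}$ for all $n \in \mathbb{N}$.
Now fix $\zeta \in \Omega$ and consider the family $\{\zeta^{n} T^{n} \mid n \in \mathbb{N}\}$. Because $B(V)$ embeds into $B(V)^{**}$ isometrically as a Banach space of operators on $B(V)^*$, we consider the family $\{\zeta^{n} T^{n} \mid n \in \mathbb{N}\}$ as acting on $B(V)^*$ in this way, namely,
\[ (\zeta^{n} T^{n})(\varphi) = \varphi(\zeta^{n} T^{n}) , \quad \varphi \in B(V)^* . \]
By the Uniform Boundedness Principle (Theorem 7.2), either there is an $R_{\zeta} > 0$ such that $\|\zeta^{n} T^{n}\| \le R_{\zeta}$ for all $n \in \mathbb{N}$ or $\sup_{n} |(\zeta^{n} T^{n})(\varphi)| = \infty$ for a dense set of $\varphi \in B(V)^*$. However, the latter situation cannot occur, since $|\varphi(\zeta^{n} T^{n})| \le M_{\zeta,\varphi}$ for all $n \in \mathbb{N}$. Hence, there is an $R_{\zeta} > 0$ such that $\|\zeta^{n} T^{n}\| \le R_{\zeta}$ for all $n \in \mathbb{N}$. Thus,
\[ |\zeta|\,\|T^{n}\|^{1/n} \le R_{\zeta}^{1/n} . \]
Now choose a nonzero $\zeta \in \Omega$. Therefore, $\operatorname{spr} T < 1/|\zeta|$ and
\[ \limsup_{n} \|T^{n}\|^{1/n} \le \frac{\limsup_{n} R_{\zeta}^{1/n}}{|\zeta|} = \frac{1}{|\zeta|} . \]
Hence,
\[ \limsup_{n} \|T^{n}\|^{1/n} \le \inf_{\zeta \in \Omega \setminus \{0\}} \frac{1}{|\zeta|} = \operatorname{spr} T . \]
This proves that
\[ \limsup_{n} \|T^{n}\|^{1/n} \le \operatorname{spr} T \le \liminf_{n} \|T^{n}\|^{1/n} . \]
That is, $\lim_{n} \|T^{n}\|^{1/n}$ exists and equals $\operatorname{spr} T$. □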
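Formula (7.2) is easy to observe numerically when the norm of T greatly overestimates the spectral radius. The following Python sketch is a hedged illustration: the upper triangular matrix `T` and the max-row-sum operator norm are assumptions chosen for the example. Its only eigenvalue is 0.5, so spr T = 0.5, while ‖T‖ = 10.5; the roots ‖T^n‖^{1/n} nevertheless drift down towards 0.5.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def op_norm(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def root_norms(T, n_max):
    """Return the list [ ||T^n||^(1/n) for n = 1, ..., n_max ]."""
    out, P = [], T
    for n in range(1, n_max + 1):
        out.append(op_norm(P) ** (1.0 / n))
        P = matmul(P, T)
    return out

# Hypothetical example: sigma(T) = {0.5}, so spr T = 0.5, but ||T|| = 10.5.
T = [[0.5, 10.0], [0.0, 0.5]]
values = root_norms(T, 200)
print(values[0], values[9], values[199])
```

Convergence is slow here because ‖T^n‖ = 0.5^n (1 + 20n) for this matrix, and the factor (1 + 20n)^{1/n} tends to 1 only gradually.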

The final general result of this section concerns the relationship between
the spectra of ST and T S.

Proposition 7.22. σ(ST ) ∪ {0} = σ(T S) ∪ {0}, for all S, T ∈ B(V ).

Proof. If 1−ST is invertible, then (1−T S)(1+T ZS) = (1+T ZS)(1−T S) = 1,


where Z = (1 − ST )−1 . Interchanging the roles of S and T leads to: 1 − ST
is invertible if and only if 1 − T S is invertible. Hence, if λ 6= 0, then ST − λ1
is invertible if and only if T S − λ1 is invertible. 
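The algebraic identity behind Proposition 7.22 can be confirmed on a concrete pair of matrices. The sketch below is illustrative only: the 2x2 matrices `S` and `T` are arbitrary assumptions, chosen so that 1 − ST is invertible. It computes Z = (1 − ST)⁻¹ exactly and checks that 1 + TZS is a two-sided inverse of 1 − TS.

```python
from fractions import Fraction as F

I2 = [[F(1), F(0)], [F(0), F(1)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

S = [[F(1), F(2)], [F(0), F(1)]]   # hypothetical operators on a 2-dimensional space
T = [[F(0), F(1)], [F(1), F(1)]]

Z = inverse(sub(I2, matmul(S, T)))        # Z = (1 - ST)^{-1}
W = add(I2, matmul(matmul(T, Z), S))      # candidate inverse 1 + TZS of 1 - TS
left = matmul(sub(I2, matmul(T, S)), W)
right = matmul(W, sub(I2, matmul(T, S)))
print(left == I2, right == I2)
```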

7.6 Polynomial Functional Calculus


The term “functional calculus” refers to any situation in which one evaluates a function f of a real or complex variable at an operator T to produce a new operator f(T). The first case of interest is with polynomials.
If f is a polynomial, say $f(t) = \sum_{j=0}^{n} \alpha_j t^{j}$, then for any T ∈ B(V) we define $f(T) = \sum_{j=0}^{n} \alpha_j T^{j}$, where $T^{0} = 1 \in B(V)$. The map that sends a polynomial f to the operator f(T) is called the polynomial functional calculus for T.
Observe that, for any T ∈ B(V), α ∈ C, and polynomials f, g ∈ C[t], we have
1. (αf)(T) = α f(T),
2. (f + g)(T) = f(T) + g(T), and
3. (fg)(T) = f(T)g(T).


We show below that the map f 7→ f(T ) is spectrum preserving as well.
If X ⊂ C, and if f is a polynomial, then f(X) denotes the set {f(ζ) | ζ ∈
X}.

Theorem 7.23. (Polynomial Spectral Mapping Theorem) If T ∈ B(V ), then

f (σ(T )) = σ(f(T ))

for every complex polynomial f.

Proof. Let f be a complex polynomial and suppose that λ ∈ σ(T). Let g(t) = f(t) − f(λ). As g(λ) = 0, there is a polynomial h such that g(t) = (t − λ)h(t). Hence, f(T) − f(λ)1 = g(T) = (T − λ1)h(T) = h(T)(T − λ1). If f(T) − f(λ)1 were invertible, then these equations imply that T − λ1 has a left and right inverse, from which we would conclude that T − λ1 is invertible (Exercise 8). Therefore, as T − λ1 is not invertible, it must be that f(T) − f(λ)1 is not invertible. That is, f(λ) ∈ σ(f(T)). This proves that f(σ(T)) ⊆ σ(f(T)).
Conversely, suppose that ω ∉ f(σ(T)). Thus, ω ≠ f(λ), for all λ ∈ σ(T). Assume that f is nonconstant (if f is constant, say f = c, then f(T) = c1 and both sets equal {c}, because σ(T) is nonempty). Let h(t) = f(t) − ω and factor h: $h(t) = \alpha (t - \omega_1)^{n_1} \cdots (t - \omega_m)^{n_m}$, where α ≠ 0 is the leading coefficient of f. Since h(λ) ≠ 0 for all λ ∈ σ(T), λ ≠ ωj for every j and every λ ∈ σ(T). That is, ωj ∉ σ(T) for each j. The factorisation of h leads to the following expression for h(T):
\[ f(T) - \omega 1 = \alpha (T - \omega_1 1)^{n_1} \cdots (T - \omega_m 1)^{n_m} . \]
Because ωj ∉ σ(T) for each j, f(T) − ω1 is a product of invertible operators and is, hence, invertible. Thus, ω ∉ σ(f(T)), and so σ(f(T)) ⊆ f(σ(T)). □
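For matrices, the Polynomial Spectral Mapping Theorem can be seen directly on an upper triangular example, where the spectrum is the set of diagonal entries. The sketch below is an illustration under assumptions: the matrix `T` and the polynomial f(t) = t² − 3t + 2 are hypothetical choices. It evaluates f(T) by Horner's rule and compares the diagonal of f(T) with the values of f on the diagonal of T.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def poly_at_matrix(coeffs, T):
    """Evaluate f(T) by Horner's rule; coeffs run from the highest degree down."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, T)
        for i in range(n):
            result[i][i] += c          # add c times the identity
    return result

def poly_at_scalar(coeffs, t):
    value = 0
    for c in coeffs:
        value = value * t + c
    return value

coeffs = [1, -3, 2]                        # f(t) = t^2 - 3t + 2
T = [[1, 4, 5], [0, 2, 6], [0, 0, -1]]     # upper triangular, so sigma(T) = {1, 2, -1}
fT = poly_at_matrix(coeffs, T)

spectrum_T = {T[i][i] for i in range(3)}
spectrum_fT = {fT[i][i] for i in range(3)}  # f(T) is again upper triangular
print(spectrum_fT, {poly_at_scalar(coeffs, lam) for lam in spectrum_T})
```

Note that f maps the two distinct points 1 and 2 of σ(T) to the same point 0 of σ(f(T)), which is why the theorem is stated for sets.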

7.7 Parts of the Spectrum

By Proposition 7.9, an operator T on a Banach space V is invertible if and only


if T is bounded below and has dense range. This fact leads to the following
definitions for subsets of the spectrum.

Definition 7.24. Assume that T is an operator on a Banach space V .


1. The point spectrum of T is the set σp(T) of all λ ∈ C for which the operator T − λ1 on V is not injective.
2. The approximate point spectrum of T is the set σap (T ) of all λ ∈ C for
which the operator T − λ1 on V is not bounded below.
3. The defect spectrum of T is the set σd (T ) of all λ ∈ C for which the
operator T − λ1 on V does not have dense range.

Proposition 7.25. If V is a Banach space and T ∈ B(V ), then σd (T ) =


σp (T ∗ ).

Proof. Suppose that λ ∈ σd(T); thus, the closure $\overline{\operatorname{ran}(T - \lambda 1)}$ is a proper subspace of V. Let $W = V / \overline{\operatorname{ran}(T - \lambda 1)}$ and let q : V → W be the canonical surjection. Since W is nonzero, there exists ψ ∈ W∗ of norm ‖ψ‖ = 1. Let ϕ ∈ V∗ be given by ϕ = ψ ◦ q and note that ϕ ≠ 0 and ran(T − λ1) ⊆ ker ϕ. Thus, T∗ϕ = λϕ, which shows that λ ∈ σp(T∗).
Conversely, assume that λ ∈ σp(T∗). Thus, there exists ϕ ∈ V∗ of norm 1 such that T∗ϕ = λϕ; that is, ϕ(Tv − λv) = 0 for all v ∈ V. If the range of T − λ1 were dense, then, by continuity of ϕ, we would conclude that ϕ(w) = 0, for all w ∈ V; that is, ϕ = 0. But since ϕ ≠ 0, the range of T − λ1 cannot be dense. Hence, λ ∈ σd(T). □
Proposition 7.26. If V is a Banach space and T ∈ B(V ), then σap (T ) is
compact and ∂σ(T ) ⊆ σap(T ).
Proof. To show that σap(T) is closed, let λ ∈ C \ σap(T). Thus, there is a δ > 0 such that ‖(T − λ1)v‖ ≥ δ‖v‖, for every v ∈ V. Let ε = δ/2. If µ ∈ C satisfies |λ − µ| < ε, then, for every v ∈ V,

δ‖v‖ ≤ ‖(T − λ1)v‖ = ‖(T − µ1)v + (µ − λ)v‖ ≤ ‖(T − µ1)v‖ + |λ − µ| ‖v‖ ,

which implies that T − µ1 is bounded below by δ/2. Hence, the complement of σap(T) is open, which proves that σap(T) is closed. As σap(T) ⊆ σ(T) and σ(T) is bounded, σap(T) is compact.
Suppose that λ ∈ ∂σ(T). Therefore, there is a sequence λn ∈ C \ σ(T) such that |λn − λ| → 0. If there were a γ > 0 for which $\|(T - \lambda_n 1)^{-1}\| < \gamma$ for every n ∈ N, then it would be true that T − λ1 is invertible (by Lemma 7.18). Specifically, if $n_0$ is such that $|\lambda_{n_0} - \lambda| < 1/\gamma$, then the distance between the invertible operator $T - \lambda_{n_0} 1$ and the operator T − λ1 satisfies
\[ \|(T - \lambda 1) - (T - \lambda_{n_0} 1)\| = |\lambda_{n_0} - \lambda| < 1/\gamma < \|(T - \lambda_{n_0} 1)^{-1}\|^{-1} . \]
This would imply that T − λ1 is invertible, contrary to the fact that λ ∈ σ(T). Hence, it must be that $\{\|(T - \lambda_n 1)^{-1}\|\}_{n \in \mathbb{N}}$ is an unbounded sequence; passing to a subsequence if necessary, we may assume that $\|(T - \lambda_n 1)^{-1}\|^{-1} \to 0$.
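By definition of norm, for each n ∈ N there is a unit vector $w_n \in V$ such that
\[ \|(T - \lambda_n 1)^{-1}\| < \|(T - \lambda_n 1)^{-1} w_n\| + \frac{1}{n} . \]
Let
\[ v_n = \frac{1}{\|(T - \lambda_n 1)^{-1} w_n\|}\,(T - \lambda_n 1)^{-1} w_n , \quad \forall\, n \in \mathbb{N} . \]
Thus, $\|v_n\| = 1$ and
\[ (T - \lambda 1) v_n = \frac{1}{\|(T - \lambda_n 1)^{-1} w_n\|}\, w_n + (\lambda_n - \lambda) v_n . \]
Hence, for all n large enough that $\|(T - \lambda_n 1)^{-1}\| > 1/n$,
\[ \|(T - \lambda 1) v_n\| \le \frac{1}{\|(T - \lambda_n 1)^{-1}\| - (1/n)} + |\lambda - \lambda_n| , \]
which converges to 0 as n → ∞. That is, λ ∈ σap(T). □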

Corollary 7.27. Every Banach space operator has an approximate eigen-


value.

7.8 Examples

To conclude the present chapter, a few examples of operators are examined.

Multiplication Operators

Let X be a compact Hausdorff space and let V = C(X). Select a function


ψ ∈ C(X) and define a function Tψ : V → V by

Tψ (f) = ψf , ∀ f ∈ C(X) .

The function Tψ is a linear transformation. To show that Tψ is bounded, observe that
\[ \|T_\psi(f)\| = \|\psi f\| = \max_{t \in X} |\psi(t) f(t)| \le \Big( \max_{t \in X} |\psi(t)| \Big)\Big( \max_{t \in X} |f(t)| \Big) = \|\psi\|\,\|f\| . \]
Thus, ‖Tψ(f)‖ ≤ ‖ψ‖ ‖f‖, for every f ∈ V. This proves that Tψ is bounded and ‖Tψ‖ ≤ ‖ψ‖. The following facts will be shown below:
1. ‖Tψ‖ = ‖ψ‖;
2. σ(Tψ) = σap(Tψ) = {ψ(t) | t ∈ X};
3. λ ∈ σp(Tψ) if and only if {t ∈ X | ψ(t) ≠ λ} is not dense in X.

To show that ‖Tψ‖ = ‖ψ‖, let f(t) = 1 and note that f ∈ C(X), ‖f‖ = 1, and Tψ f = ψ. Thus, ‖Tψ(f)‖ = ‖ψ‖ ‖f‖, and so ‖Tψ‖ ≥ ‖ψ‖.
If λ ∉ {ψ(t) | t ∈ X}, then ψ(t) − λ ≠ 0 for all t ∈ X and so g ∈ C(X), where $g(t) = \frac{1}{\psi(t) - \lambda}$. Since Tg(Tψ − λ1) = (Tψ − λ1)Tg = 1, λ ∉ σ(Tψ). Hence, σ(Tψ) ⊆ {ψ(t) | t ∈ X}.
Suppose now that t0 ∈ X and λ = ψ(t0). We shall show that λ ∈ σap(Tψ). Let ε > 0 be given and let U = {t ∈ X | |ψ(t) − ψ(t0)| < ε}, which is an open subset of X (since ψ is continuous). Because X is compact and Hausdorff, there is an open subset W ⊂ U such that $t_0 \in W \subset \overline{W} \subset U$ [9, Lemma 3.8.2]. Hence, $\overline{W}$ and X \ U are disjoint closed sets in X. Because compact Hausdorff spaces are normal, Urysohn’s Lemma implies that there is a continuous function f : X → [0, 1] such that $f(\overline{W}) = \{1\}$ and f(X \ U) = {0}. Hence, ‖f‖ = 1 and |(ψ(t) − λ)f(t)| < ε, for all t ∈ X. That is, ‖(Tψ − λ1)f‖ < ε, which proves that λ ∈ σap(Tψ). As σap(Tψ) ⊆ σ(Tψ), this establishes the second assertion. The assertion about σp(Tψ) is left as an exercise (Exercise 9). ♦
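The approximate eigenvectors constructed above can be imitated on a grid. The sketch below is a rough illustration under assumptions that are not in the notes: X = [0, 1] is sampled at 1001 points, ψ(t) = t, λ = ψ(0.5), and a tent-shaped function stands in for the Urysohn function (it equals 1 only at t = 0.5 rather than on a neighbourhood). It checks that the sampled sup norm of (ψ − λ)f is below ε while that of f is 1.

```python
def sup_norm(values):
    return max(abs(x) for x in values)

grid = [k / 1000 for k in range(1001)]   # sample points of X = [0, 1]

def psi(t):
    return t                              # hypothetical multiplier

lam = psi(0.5)

def bump(t, eps):
    """Continuous, values in [0, 1], equal to 1 at t = 0.5, zero where |t - 0.5| >= eps."""
    return max(0.0, 1.0 - abs(t - 0.5) / eps)

def residual(eps):
    """Sampled sup norm of (psi - lam) * bump, i.e. of (T_psi - lam) applied to bump."""
    return sup_norm([(psi(t) - lam) * bump(t, eps) for t in grid])

for eps in (0.1, 0.01):
    print(eps, sup_norm([bump(t, eps) for t in grid]), residual(eps))
```

Shrinking ε drives the residual to zero while the norm of the test function stays 1, which is the defining behaviour of an approximate eigenvalue.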

The Unilateral Shift Operator

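Assume that p, q ∈ (1, ∞) satisfy 1/p + 1/q = 1. Thus, the dual of $\ell^p(\mathbb{N})$ is isometrically isomorphic to $\ell^q(\mathbb{N})$. The linear transformation $S : \ell^p(\mathbb{N}) \to \ell^p(\mathbb{N})$ defined by
\[ Sv = S \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} 0 \\ v_1 \\ v_2 \\ \vdots \end{pmatrix} , \quad \forall\, v \in \ell^p(\mathbb{N}) , \]
is called the unilateral shift operator on $\ell^p(\mathbb{N})$. Observe that S is an isometry; thus, ‖S‖ = 1 and $\sigma(S) \subseteq \overline{\mathbb{D}}$ (the closed unit disc; $\mathbb{D}$ denotes the open unit disc).
Consider the eigenvalue problem Sv = λv. Because S shifts the entries of $v \in \ell^p(\mathbb{N})$ forward by one position, there are no nonzero solutions $v \in \ell^p(\mathbb{N})$ to the equation Sv = λv. Hence, σp(S) = ∅.
As for the approximate point spectrum of S, note that, for every unit vector $v \in \ell^p(\mathbb{N})$, ‖Sv‖ = 1 (because S is an isometry). Thus, if λ ∈ σap(S) and if $\{v_n\}_{n \in \mathbb{N}}$ is a sequence of unit vectors for which $\|Sv_n - \lambda v_n\| \to 0$, then necessarily |λ| = 1. That is, $\sigma_{ap}(S) \subseteq \partial\mathbb{D}$.
To determine the defect spectrum of S, recall that Proposition 7.25 asserts that λ ∈ σd(S) if and only if λ ∈ σp(S∗). To compute S∗, we need to satisfy the equation S∗ϕ(v) = ϕ(Sv) for all $v \in \ell^p(\mathbb{N})$ and $\varphi \in \ell^p(\mathbb{N})^*$. To this end, identify $\varphi \in \ell^p(\mathbb{N})^*$ with a sequence $\varphi = (\varphi_n)_{n \in \mathbb{N}} \in \ell^q(\mathbb{N})$ so that
\[ \varphi(w) = \sum_{n=1}^{\infty} w_n \varphi_n , \quad \forall\, w \in \ell^p(\mathbb{N}) . \]
Because
\[ \varphi(Sv) = \sum_{n=1}^{\infty} v_n \varphi_{n+1} , \]
it must be that
\[ S^* \varphi = \begin{pmatrix} \varphi_2 \\ \varphi_3 \\ \varphi_4 \\ \vdots \end{pmatrix} , \quad \forall\, \varphi \in \ell^p(\mathbb{N})^* . \]
Therefore, S∗ϕ = λϕ if and only if $\varphi_{n+1} = \lambda \varphi_n$ for all n ∈ N. Hence, for every $\lambda \in \mathbb{D}$, S∗ϕ = λϕ, where the action of $\varphi \in \ell^p(\mathbb{N})^*$ is given by
\[ \varphi(w) = \sum_{n=1}^{\infty} w_n \lambda^{n-1} , \quad \forall\, w \in \ell^p(\mathbb{N}) . \]
Furthermore, it is not hard to verify that there are no eigenvalues of S∗ of modulus 1. Hence, $\sigma_d(S) = \mathbb{D}$, by Proposition 7.25. To this point we know that $\sigma_{ap}(S) \subseteq \partial\mathbb{D}$. However, as
\[ \mathbb{D} \subset \sigma(S) \subseteq \overline{\mathbb{D}} , \]
and because the spectrum is closed, we conclude that $\sigma(S) = \overline{\mathbb{D}}$. Hence, by Proposition 7.26, the boundary of $\overline{\mathbb{D}}$ is a subset of σap(S). That is, $\sigma_{ap}(S) = \partial\mathbb{D}$.
In summary:
1. $\sigma(S) = \overline{\mathbb{D}}$;
2. $\sigma_p(S) = \emptyset$;
3. $\sigma_{ap}(S) = \partial\mathbb{D}$; and
4. $\sigma_d(S) = \mathbb{D}$.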
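The computations for the shift can be tried out on finitely supported sequences. The sketch below is only an illustration: sequences are modelled as finite Python lists (entries beyond the end of a list are treated as zero), and the values of `lam`, `N`, `p` and the test vector `v` are arbitrary assumptions. It checks that S preserves the p-norm, that the backward shift satisfies the duality relation (S∗ϕ)(v) = ϕ(Sv), and that the geometric sequence (1, λ, λ², ...) obeys the eigenvector recursion for S∗.

```python
from fractions import Fraction as F

def shift(v):
    """Unilateral shift on a finitely supported sequence: (v1, v2, ...) -> (0, v1, v2, ...)."""
    return [F(0)] + list(v)

def backward_shift(phi):
    """Candidate adjoint: (phi1, phi2, phi3, ...) -> (phi2, phi3, ...)."""
    return list(phi[1:])

def pairing(phi, w):
    """phi(w) = sum_n w_n phi_n for finite lists; zip stops at the shorter list."""
    return sum(a * b for a, b in zip(phi, w))

def norm_p(v, p):
    return float(sum(abs(x) ** p for x in v)) ** (1.0 / p)

lam = F(1, 2)                                   # a point of the open unit disc
N = 12
phi = [lam ** n for n in range(N + 1)]          # truncation of (1, lam, lam^2, ...)
v = [F((-1) ** n, n + 1) for n in range(N)]     # an arbitrary test vector

isometry_ok = abs(norm_p(shift(v), 3) - norm_p(v, 3)) < 1e-12
duality_ok = pairing(phi, shift(v)) == pairing(backward_shift(phi), v)
eigen_ok = all(backward_shift(phi)[n] == lam * phi[n] for n in range(N))
print(isometry_ok, duality_ok, eigen_ok)
```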

Integral Operators

Consider the unit square X = [0, 1] × [0, 1] ⊂ R², and let κ : X → C be an essentially bounded function (see §2.5). Fix 1 ≤ p < ∞ and let $L^p$ denote $L^p([0, 1], m)$ (Lebesgue measure). For each $f \in L^p$, let $T_\kappa f$ denote a function whose value at t ∈ [0, 1] is given by
\[ T_\kappa f\,(t) = \int_0^1 \kappa(t, s) f(s)\, dm(s) . \]
Observe that $T_\kappa f$ is p-integrable and that
\[ \int_{[0,1]} |T_\kappa f|^p \, dm \le \int_0^1 \!\! \int_0^1 |\kappa(t, s)|^p |f(s)|^p \, dm(t)\, dm(s) \le \|\kappa\|_\infty^p \, \|f\|^p . \]
Thus, $T_\kappa$ is an operator on $L^p$, called an integral operator, of norm $\|T_\kappa\| \le \|\kappa\|_\infty$.
Let p > 1 and q > 1 satisfy 1/p + 1/q = 1; thus, $L^q = (L^p)^*$. The adjoint of $T_\kappa$ must, by definition, satisfy
\[ T_\kappa^* \varphi\,(f) = \varphi(T_\kappa f) , \quad \forall\, f \in L^p ,\ \varphi \in (L^p)^* = L^q . \]
Now because
\begin{align*}
\varphi(T_\kappa f) &= \int_0^1 T_\kappa f(t)\, \varphi(t)\, dm(t) \\
&= \int_0^1 \!\! \int_0^1 \kappa(t, s) f(s)\, \varphi(t)\, dm(s)\, dm(t) \\
&= \int_0^1 \!\! \int_0^1 \kappa(t, s) f(s)\, \varphi(t)\, dm(t)\, dm(s) \qquad \text{[Fubini's Theorem]} \\
&= \int_0^1 f(s) \left( \int_0^1 \kappa(t, s)\, \varphi(t)\, dm(t) \right) dm(s) ,
\end{align*}
we conclude that $T_\kappa^*$ is given by
\[ T_\kappa^* \varphi\,(t) = \int_0^1 \kappa(s, t)\, \varphi(s)\, dm(s) , \quad \varphi \in L^q . \]
In other words, $T_\kappa^*$ is an integral operator induced by the “transpose” κ(s, t) of κ = κ(t, s). ♦
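A discretised version makes the "transpose" description of the adjoint concrete. The sketch below is a hedged illustration: the midpoint rule with 50 nodes, the kernel κ(t, s) = t + s², and the test functions f and ϕ are all assumptions made for the example. With integrals replaced by midpoint sums, the operator becomes a matrix, the candidate adjoint becomes its transpose, and the pairing identity ϕ(Tκ f) = (Tκ∗ϕ)(f) holds up to rounding.

```python
n = 50
pts = [(i + 0.5) / n for i in range(n)]        # midpoints of a uniform partition of [0, 1]

def kappa(t, s):
    return t + s * s                            # hypothetical bounded kernel

K = [[kappa(t, s) for s in pts] for t in pts]   # K[i][j] = kappa(t_i, s_j)

def T_apply(f):
    """Midpoint-rule version of (T f)(t) = integral of kappa(t, s) f(s) ds."""
    return [sum(K[i][j] * f[j] for j in range(n)) / n for i in range(n)]

def T_adjoint_apply(phi):
    """Midpoint-rule version of (T* phi)(t) = integral of kappa(s, t) phi(s) ds."""
    return [sum(K[i][j] * phi[i] for i in range(n)) / n for j in range(n)]

def pairing(g, h):
    """Midpoint-rule version of the integral of g(t) h(t) dt."""
    return sum(a * b for a, b in zip(g, h)) / n

f = [s for s in pts]             # f(s) = s
phi = [1.0 - t for t in pts]     # phi(t) = 1 - t

lhs = pairing(T_apply(f), phi)            # phi(T f)
rhs = pairing(f, T_adjoint_apply(phi))    # (T* phi)(f)
print(lhs, rhs)
```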

7.9 Exercises

1. Let V be the linear submanifold of the Banach space C([0, 1]) that consists
of all f ∈ C([0, 1]) for which the derivative df/dt exists on (0, 1) and is
continuous on [0, 1]. Let Φ : V → C([0, 1]) be the linear transformation
Φ(f) = df/dt. Show that Φ is unbounded.
2. Let H be a Hilbert space and suppose that ξ, ξk ∈ H, k ∈ N, satisfy
\[ \lim_{k \to \infty} \langle \xi_k, \eta \rangle = \langle \xi, \eta \rangle , \quad \forall\, \eta \in H . \]
a) Prove that the set S = {ξk | k ∈ N} is uniformly bounded.
b) Prove that $\|\xi\| \le \liminf_k \|\xi_k\|$.
3. Assume that V is a nonzero Banach space. An operator T ∈ B(V) is said to have finite rank if the range of T has finite dimension. In such cases, the rank of T is defined to be the dimension of the range of T.
a) Prove that there exists T ∈ B(V) such that T has rank 1.
b) Prove that if T has finite rank n ∈ N, then there are rank-1 operators F1, . . . , Fn ∈ B(V) such that $T = \sum_{j=1}^{n} F_j$.
4. Prove that a surjective operator T : V → W between Banach spaces V
and W is an open map.
5. Suppose that V is a Banach space and that T ∈ B(V ) is invertible. Prove
that kT −1 k ≥ kT k−1 .
6. Assume that V and W are Banach spaces, and that T ∈ B(V, W ). Prove
that there is a unique operator T ∗ : W ∗ → V ∗ such that

T ∗ ψ(v) = ψ(T v) , ∀ ψ ∈ W ∗, v ∈ V ,

and satisfying the following properties:


a) kT ∗ k = kT k,
b) (T1 + T2 )∗ = T1∗ + T2∗ ,
c) if T is invertible, then T ∗ is invertible and (T ∗ )−1 = (T −1 )∗ , and
d) if W = V , then (T1 T2 )∗ = T2∗ T1∗ .
7. Let V be a Banach space and consider V as a subspace of V ∗∗ . Define
T ∗∗ to be (T ∗ )∗ . Prove that the restriction of T ∗∗ to V is T . That is,
T ∗∗ |V = T .
8. Assume that R, S, T ∈ B(V ) satisfy ST = T R = 1. Prove that T is
invertible.
9. Let X be a compact Hausdorff space, select ψ ∈ C(X), and let Tψ : C(X) → C(X) be the operator of multiplication by ψ: Tψ f = ψf, for all f ∈ C(X). For each λ ∈ C, let $K_\lambda = \overline{\{t \in X \mid \psi(t) \ne \lambda\}}$ (the closure in X).
a) Prove that {t ∈ X | ψ(t) ≠ λ} is an open set, for every λ ∈ C.
b) Prove that λ ∈ σp(Tψ) if and only if Kλ ≠ X.
10. Assume that p, q ∈ (1, ∞) satisfy 1/p + 1/q = 1. Define a linear transformation $S : \ell^p(\mathbb{N}) \to \ell^p(\mathbb{N})$ by
\[ Sv = S \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ \vdots \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_1 v_1 \\ \alpha_2 v_2 \\ \alpha_3 v_3 \\ \vdots \end{pmatrix} , \quad \forall\, v \in \ell^p(\mathbb{N}) , \]
where $\{\alpha_k\}_{k \in \mathbb{N}} \subset \mathbb{C}$ is a sequence for which $\sup_k |\alpha_k| < \infty$.
a) Prove that S is an operator on $\ell^p(\mathbb{N})$.
b) Determine an explicit form for the adjoint operator S∗ on $\ell^q(\mathbb{N})$.
c) If $\alpha_{2j} = 0$ and $\alpha_{2j-1} = 1$, for all j ∈ N, then compute ‖S‖.

11. Consider the unit square X = [0, 1] × [0, 1] ⊂ R², and let κ ∈ C(X). Fix 1 ≤ p < ∞ and let $L^p$ denote $L^p([0, 1], m)$ (Lebesgue measure). Consider the integral operator $T_\kappa \in B(L^p)$.
a) Prove that if κ is a polynomial, which means κ has the form
\[ \kappa(t, s) = \sum_{i=0}^{m} \sum_{j=0}^{n} \alpha_{ij}\, t^{i} s^{j} , \quad \text{for some } \alpha_{ij} \in \mathbb{C} , \]
then $T_\kappa$ is an operator of finite rank.


b) If κ ∈ C(X) is arbitrary, prove that for every ε > 0 there is a polyno-
mial q ∈ C[t, s] such that |κ(t, s) − q(t, s)| < ε, for all t, s ∈ [0, 1].
c) If κ ∈ C(X) is arbitrary, prove that for every ε > 0 there is a finite
rank operator F ∈ B(Lp ) such that kTκ − F k < ε .
8
Compact Operators on Banach Spaces

If $B_V$ denotes the closed unit ball of a Banach space V, and if T ∈ B(V) satisfies ‖T‖ ≤ 1, then $T(B_V)$ is a subset of $B_V$. However, in general the closure of $T(B_V)$ is not a compact subset of $B_V$. (Recall that $B_V$ is a compact set if and only if V is finite-dimensional (Proposition 1.16).) Nevertheless, there are operators T ∈ B(V) with ‖T‖ ≤ 1 for which the closure of $T(B_V)$ is a compact subset of $B_V$. Not surprisingly, such operators are called compact operators.

8.1 Compact Operators


Definition 8.1. An operator K on a Banach space V is said to be compact
if the set
{Kv | v ∈ V, kvk ≤ 1}
has compact closure in V . The set of all compact operators K acting on a
Banach space V is denoted by K(V ).

Because V is a metric space, K ∈ B(V) is a compact operator if and only if, for every bounded sequence of vectors vn ∈ V, the sequence {Kvn}n∈N admits a convergent subsequence.
The following observation will be used frequently.

Proposition 8.2. If λ ∈ C is nonzero, then λ1 is a compact operator on a


Banach space V if and only if V has finite dimension.

Proof. Let BV denote the closed unit ball of V . Thus, λ1(BV ) = λBV , and
so λ1 is a compact operator if and only if BV is a compact set. But BV is a
compact set if and only if V has finite dimension (Proposition 1.16). 

Theorem 8.3. K(V ) is a nonzero norm closed ideal of B(V ). If V has infinite
dimension, then K(V ) 6= B(V ).

Proof. Let $\{v_n\}_{n \in \mathbb{N}}$ be a sequence of unit vectors. Suppose that $K_1, K_2 \in K(V)$. Since $K_1$ is compact, there is a subsequence $\{v_{n_j}\}_j$ of $\{v_n\}_n$ such that $\{K_1 v_{n_j}\}_j$ is convergent. Since $K_2$ is compact, there is a subsequence $\{v_{n'_i}\}_i$ of $\{v_{n_j}\}_j$ such that $\{K_2 v_{n'_i}\}_i$ is convergent. Because $\{(K_1 + K_2) v_{n'_i}\}_i$ is convergent, $K_1 + K_2$ is compact. Likewise, $\alpha_1 K_1 + \alpha_2 K_2 \in K(V)$ for all $\alpha_1, \alpha_2 \in \mathbb{C}$.
The proof that TK, KT ∈ K(V), for all T ∈ B(V) and K ∈ K(V), is left to the reader (Exercise 1), and so it remains only to show that K(V) is norm closed.
Assume that K ∈ B(V) is in the norm closure of K(V) and let $\{v_i\}_{i \in \mathbb{N}}$ be a sequence of unit vectors. Thus, for every n ∈ N there is a $K_n \in K(V)$ such that $\|K - K_n\| < \frac{1}{n}$. As $K_1$ is compact, there is a subsequence $\{v_{1,j}\}_j$ of $\{v_i\}_i$ such that $\{K_1 v_{1,j}\}_j$ is convergent. As $K_2$ is compact, there is a subsequence $\{v_{2,j}\}_j$ of $\{v_{1,j}\}_j$ such that $\{K_2 v_{2,j}\}_j$ and $\{K_1 v_{2,j}\}_j$ are convergent. Inductively, for each n ∈ N there is a sequence $\{v_{n,j}\}_j$ such that
1. $\{v_{n,j}\}_j$ is a subsequence of $\{v_{n-1,j}\}_j$, and
2. $\{K_\ell v_{n,j}\}_j$ is convergent for all $1 \le \ell \le n$.
Now let ε > 0. Choose p ∈ N such that $\|K - K_p\| < \varepsilon$. Claim: $\{K v_{n,n}\}_n$ is a Cauchy sequence. Note that $\{K_p v_{n,n}\}_{n \ge p}$ is a subsequence of the convergent sequence $\{K_p v_{p,n}\}_n$. Hence, $\{K_p v_{n,n}\}_n$ is convergent and, therefore, Cauchy. Thus,
\[ \|K v_{n,n} - K v_{m,m}\| \le \|(K - K_p) v_{n,n}\| + \|K_p v_{n,n} - K_p v_{m,m}\| + \|(K - K_p) v_{m,m}\| < 2\varepsilon + \|K_p v_{n,n} - K_p v_{m,m}\| , \]
and the last term is less than ε for all sufficiently large n and m. Thus, $\{K v_{n,n}\}_n$ is a Cauchy sequence and, hence, convergent. This proves that K ∈ K(V).
Next, suppose that V has infinite dimension. By Proposition 8.2, 1 ∈ B(V) is not a compact operator. Thus, K(V) ≠ B(V), which implies that K(V) is a proper ideal of B(V). □
Note that if V has infinite dimension, then every operator F : V → V of finite rank is necessarily compact. Thus, the ideal F(V) of finite-rank operators is contained in the norm-closed ideal K(V). Hence, the norm closure $\overline{F(V)}$ of the finite-rank operators is contained in K(V). However, by a theorem of Enflo [5], $\overline{F(V)}$ and K(V) need not be equal, even if V is separable and reflexive.
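A standard situation in which a compact operator is a norm limit of finite-rank operators can be sketched numerically. Everything in the example below is an assumption introduced for illustration and does not appear in the notes: K is taken to be the diagonal operator on ℓ² with K e_n = e_n / n, K_N keeps only its first N diagonal entries (a finite-rank operator), and the code relies on the fact that the norm of a diagonal operator on ℓ² is the supremum of the moduli of its diagonal entries. Only finitely many coordinates (`horizon`) are inspected.

```python
def diagonal_entry(n):
    """n-th diagonal entry of the hypothetical diagonal operator K: K e_n = e_n / n."""
    return 1.0 / n

def truncation_error(N, horizon=10_000):
    """Largest diagonal entry of K - K_N among coordinates N+1, ..., horizon.

    For a diagonal operator on l^2 this is the operator norm of K - K_N
    (restricted to the inspected coordinates)."""
    return max(diagonal_entry(n) for n in range(N + 1, horizon + 1))

errors = [truncation_error(N) for N in (1, 10, 100, 1000)]
print(errors)
```

The errors equal 1/(N + 1) and tend to zero, so in this example K lies in the norm closure of the finite-rank operators.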

8.2 Properties of Compact Operators


Proposition 8.4. If K is a compact operator acting on a Banach space V ,
and if M ⊆ V is a subspace, then

{w − Kw | w ∈ M }

is a subspace of V .

Proof. Let M ⊆ V be a subspace and assume that M ∩ ker(1 − K) = {0}. (The case where M ∩ ker(1 − K) ≠ {0} will be handled at the end of the proof.) Let
L = {w − Kw | w ∈ M}
and suppose that $y \in \overline{L}$, the closure of L. We aim to prove that y ∈ {w − Kw | w ∈ M}. By assumption, there is a sequence of vectors yn ∈ {w − Kw | w ∈ M} with limit y; that is, there is a sequence of vectors wn ∈ M such that

‖y − (wn − Kwn)‖ → 0 .

Assume that the sequence $\{w_n\}_{n \in \mathbb{N}}$ admits a subsequence of vectors $w_{n_k}$ in which $\|w_{n_k}\| \to \infty$. If this is the case, then let $v_k \in M$ denote the unit vector $\|w_{n_k}\|^{-1} w_{n_k}$. The compactness of K implies that the sequence $\{K v_k\}_{k \in \mathbb{N}}$ admits a convergent subsequence $\{K v_{k_j}\}_{j \in \mathbb{N}}$ with limit, say, z ∈ V. Note that
\[ \|v_{k_j} - K v_{k_j}\| = \frac{1}{\|w_{n_{k_j}}\|}\, \|w_{n_{k_j}} - K w_{n_{k_j}}\| . \]
As j → ∞ we have $\|w_{n_{k_j}} - K w_{n_{k_j}}\| \to \|y\|$ and $\|w_{n_{k_j}}\| \to \infty$, so that $\|v_{k_j} - K v_{k_j}\| \to 0$. Therefore, $\{v_{k_j}\}_{j \in \mathbb{N}}$ converges to z, which implies that z ∈ M (because M is closed). However, we now have that z ∈ M and $z - Kz = \lim_j (v_{k_j} - K v_{k_j}) = 0$; that is, z ∈ M ∩ ker(1 − K) = {0}. This is a contradiction because $0 = z = \lim_j v_{k_j}$, where each $\|v_{k_j}\| = 1$. Therefore, it cannot happen that the sequence $\{w_n\}_{n \in \mathbb{N}}$ admits a subsequence of vectors $w_{n_k}$ in which $\|w_{n_k}\| \to \infty$.
Because there is a γ > 0 such that kwn k ≤ γ for all n ∈ N, and because K is
a compact operator, the sequence {Kwn }n∈N admits a convergent subsequence
{Kwnk }k∈N with limit, say, x ∈ V . Now since

Kwnk → x and (wnk − Kwnk ) → y ,

we conclude that wnk → x + y, whence x + y ∈ M . Thus, if w = x + y, then

(1 − K)w = (1 − K)(x + y) = lim (wnk − Kwnk ) = y ,


k→∞

which proves that the linear submanifold {w − Kw | w ∈ M } is closed.


Suppose next that M ∩ ker(1 − K) 6= {0}. Thus, if F = M ∩ ker(1 −
K), then F is a finite-dimensional subspace. (If not, then F is an infinite-
dimensional space upon which the compact operator K acts as the identity,
in contradiction to the fact that 1 is not compact.) Proposition 3.17 asserts
that F is complemented in M ; hence, there is a subspace N ⊂ M such that
N ∩ F = {0} and M = N + F . Consequently, each w ∈ M has the form
w1 + w0 , where w1 ∈ N and (1 − K)w0 = 0. Hence,

{w − Kw | w ∈ M } = {w1 − Kw1 | w1 ∈ N } . (8.1)

Since N ∩ ker(1 − K) = {0}, the linear submanifold in (8.1) is closed by the arguments developed initially. □

Corollary 8.5. If K is a compact operator on a Banach space V , then the


range of 1 − K is closed.

Proposition 8.6. If K is a compact operator acting on a Banach space V


such that 1 − K is injective, then 1 − K is surjective.

Proof. Assume, contrary to what we aim to prove, that 1−K is not a surjection.
Set M0 = V and let

\[ M_n = \{(1 - K)^n v \mid v \in V\} = \{(1 - K)w \mid w \in M_{n-1}\} , \quad \forall\, n \in \mathbb{N} . \]

Thus, M0 ⊃ M1 ⊃ M2 ⊃ . . . ⊃ Mn ⊃ Mn+1 ⊃ . . . is a descending sequence


of subspaces (by Proposition 8.4). This sequence is in fact proper, by the
following argument. Suppose for some $n \in \mathbb{N}$ we have $M_n = M_{n+1}$. Because $1 - K$ is not surjective, there is a $v_0 \in V \setminus \operatorname{ran}(1 - K)$. On the other hand, $(1 - K)^n v_0 \in M_n = M_{n+1}$, and so there is a vector $w_0 \in V$ such that $(1 - K)^{n+1} w_0 = (1 - K)^n v_0$. That is,

\[ (1 - K)^n [(1 - K)w_0 - v_0] = 0 . \]

Because $1 - K$ and, hence, $(1 - K)^n$ are injective, we conclude that $v_0 = (1 - K)w_0$. But this would place $v_0$ in the range of $1 - K$, in contradiction to $v_0 \in V \setminus \operatorname{ran}(1 - K)$. Hence, it must indeed be true that the sequence $\{M_n\}_{n \in \mathbb{N} \cup \{0\}}$ is properly descending. Moreover, if $v \in M_n$, then $v = (1 - K)^n w$ for some $w \in V$; hence, $Kv = (1 - K)^n K w$ (as $K$ and $1 - K$ commute). In other words, each $M_n$ is invariant under $K$.
Let $\delta \in (0, 1)$ be fixed. For each $n \in \mathbb{N}$ there is a vector in the quotient space $M_{n-1}/M_n$ of norm $\delta$. Since $\delta < 1$, this means that there is a unit vector $v_n \in M_{n-1}$ such that $\|v_n - f\| \ge \delta$ for all $f \in M_n$. If $j < k$, then $(1 - K)v_j \in (1 - K)(M_{j-1}) = M_j$ and, because $M_{k-1}$ is invariant under $K$, $K v_k \in M_{k-1} \subseteq M_j$. Thus, $(1 - K)v_j + K v_k \in M_j$ and so
\[ \delta \le \|v_j - [(1 - K)v_j + K v_k]\| = \|K v_j - K v_k\| . \]
Therefore, the sequence $\{K v_n\}_{n \in \mathbb{N}}$ does not admit a Cauchy subsequence and, hence, does not admit a convergent subsequence. This contradicts the fact that $K$ is compact. Thus, the original assumption that $1 - K$ is not surjective cannot hold. That is, $1 - K$ is necessarily surjective. □

Proposition 8.7. If K ∈ B(V ) is a compact operator, then K ∗ ∈ B(V ∗ ) is


a compact operator.

Proof. Let BV ∗ and BV denote the closed unit balls of V ∗ and V . Suppose
that $\{\varphi_n\}_{n \in \mathbb{N}}$ is a sequence in $B_{V^*}$. Because $B_{V^*}$ is weak∗ compact, by the Banach–Alaoglu Theorem (Theorem 3.26), there is a $\varphi \in B_{V^*}$ such that for every weak∗ open neighbourhood $U$ of $\varphi$ in $B_{V^*}$ and $j \in \mathbb{N}$ there is an $n \ge j$ such that $\varphi_n \in U$. (That is, $\varphi$ is a weak∗ cluster point of $\{\varphi_n\}_{n \in \mathbb{N}}$.)
Let $\varepsilon > 0$ and fix $j \in \mathbb{N}$. Because $K(B_V)$ has compact closure, there are $w_1, \dots, w_m \in K(B_V)$ such that
\[ K(B_V) \subset \bigcup_{k=1}^{m} \{w \in V \mid \|w - w_k\| < \varepsilon\} . \]

Consider the weak∗ open neighbourhood U ⊂ BV ∗ of ϕ that is given by

U = {ψ ∈ BV ∗ | |ψ(wk ) − ϕ(wk )| < ε, 1 ≤ k ≤ m} .

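Because $\varphi$ is a weak∗ cluster point of $\{\varphi_n\}_{n \in \mathbb{N}}$, there is an $n_j \ge j$ such that $\varphi_{n_j} \in U$. If $v \in B_V$, then there is a $k$ with $1 \le k \le m$ such that $\|Kv - w_k\| < \varepsilon$. Hence,

\[ \begin{aligned} |K^*\varphi(v) - K^*\varphi_{n_j}(v)| &= |\varphi(Kv) - \varphi_{n_j}(Kv)| \\ &\le |\varphi(Kv - w_k) - \varphi_{n_j}(Kv - w_k)| + |\varphi(w_k) - \varphi_{n_j}(w_k)| \\ &< 3\varepsilon . \end{aligned} \]

Hence,

\[ \|K^*\varphi - K^*\varphi_{n_j}\| = \sup_{\|v\| \le 1} |K^*\varphi(v) - K^*\varphi_{n_j}(v)| \le 3\varepsilon . \]

As the choices of $\varepsilon > 0$ and $j \in \mathbb{N}$ are arbitrary, this proves that there is a subsequence $\{\varphi_{n_j}\}_j$ of $\{\varphi_n\}_n$ such that $\{K^*\varphi_{n_j}\}_j$ is convergent (to $K^*\varphi$). Hence, $K^*$ is a compact operator. □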

Proposition 8.8. If K is a compact operator acting on a Banach space V


such that 1 − K is surjective, then 1 − K is injective.

Proof. Because $K$ is compact, so are $K^*$ and $K^{**}$ (Proposition 8.7). Because $1 - K$ is surjective, $0 \notin \sigma_d(1 - K)$. But since $\sigma_d(1 - K) = \sigma_p(1 - K^*)$, we conclude that $1 - K^*$ is injective. As $K^*$ is compact, Proposition 8.6 implies $1 - K^*$ is surjective. Therefore, $0$ is not in the defect spectrum of $1 - K^*$, and so $0$ is not in the point spectrum of $(1 - K^*)^* = 1 - K^{**}$. In other words, $1 - K^{**}$ is injective. The restriction of $1 - K^{**}$ to the subspace $V$ of $V^{**}$ is precisely $1 - K$ (see Exercise 7 of Chapter 7). As $1 - K^{**}$ remains injective on any smaller domain, $1 - K$ is, therefore, an injective operator. □

8.3 The Spectra of Compact Operators

Theorem 8.9. (The Fredholm Alternative) Assume that K is a compact operator on a Banach space V and that λ ∈ C is nonzero. Then exactly one of
the following statements holds:
1. Kv = λv for some nonzero v ∈ V .
2. K − λ1 is invertible.

Proof. Propositions 8.6 and 8.8 show that 1 − K is injective if and only if
1 − K is surjective. Therefore, if λ ∈ C is nonzero, then replacing K by the compact operator $\frac{1}{\lambda} K$ and using the fact that σ(K) = σap (K) ∪ σd (K), we obtain λ ∈ σp (K) or λ ∉ σ(K). □
The Fredholm Alternative has implications for what properties the spec-
trum of a compact operator may exhibit.

Theorem 8.10. If K is a compact operator acting on an infinite-dimensional


Banach space V , then
1. 0 ∈ σ(K),
2. each nonzero λ ∈ σ(K) is an eigenvalue of finite geometric multiplicity,
3. σ(K) is a finite or countably infinite set, and
4. if σ(K) is infinite, then 0 is the only cluster point of σ(K).

Proof. Because K(V ) is a proper ideal of B(V ), 1 ∉ K(V ). Thus, K is not invertible (otherwise $1 = K^{-1}K$ would belong to K(V )), and so 0 ∈ σ(K).

Assume next that λ ∈ σ(K) is nonzero. Theorem 8.9 (Fredholm Alternative) asserts that λ is an eigenvalue of K. The geometric multiplicity of λ (namely, the dimension of ker(K − λ1)) must be finite, since Kv = λv, for all v ∈ ker(K − λ1), implies that the restriction of λ1 to ker(K − λ1) is compact. Therefore, ker(K − λ1) has finite dimension, by Proposition 8.2.
To show that σ(K) is a finite or countably infinite set, it is enough to
assume that σ(K) is infinite and to prove that 0 is the only cluster point of
σ(K).
Therefore, assume that σ(K) is infinite. Assume that 0 is not a cluster
point of σ(K). There exist, therefore, 1 > δ > 0 and a sequence {λn }n ⊂ σ(K)
(distinct elements) with the property that |λn | ≥ δ for every n ∈ N. Each λn
is an eigenvalue of K; let vn ∈ V be corresponding eigenvectors of length 1.
Fix m ∈ N and let f1 , . . . , fm ∈ C [t] be polynomials such that fi (λj ) = 0,
for j 6= i, and fi (λi ) = 1. (Such polynomials exist; for example, one could use
the Lagrange interpolation to construct them.) Therefore, if $\alpha_1, \dots, \alpha_m \in \mathbb{C}$ satisfy
\[ \sum_{j=1}^{m} \alpha_j v_j = 0 , \]
then
\[ 0 = f(K)\Big( \sum_{j=1}^{m} \alpha_j v_j \Big) = \sum_{j=1}^{m} \alpha_j f(\lambda_j) v_j \]
for every $f \in \mathbb{C}[t]$. In particular, if $f = f_i$, then $0 = \alpha_i v_i$, and so $\alpha_i = 0$. This proves that the sequence $\{v_n\}_n$ consists of linearly independent vectors.
For each m ∈ N let Mm = Span {v1 , . . . , vm }, which yields an ascending
sequence M1 ⊂ M2 ⊂ · · · of finite-dimensional subspaces of V . By Useful
Lemma 1 (Lemma 1.14), there is a sequence of unit vectors wn ∈ V such that
wn ∈ Mn+1 , kwn − uk ≥ δ for all u ∈ Mn , and kwn − wm k ≥ δ if m > n. Each
subspace Mn is spanned by eigenvectors of K; thus, K maps Mn back into
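itself. Therefore, $K - \lambda_n 1$ maps $M_n$ into $M_{n-1}$, by the following calculation:
\[ (K - \lambda_n 1)\Big( \sum_{j=1}^{n} \alpha_j v_j \Big) = \sum_{j=1}^{n} \alpha_j (\lambda_j - \lambda_n) v_j = \sum_{j=1}^{n-1} \alpha_j (\lambda_j - \lambda_n) v_j . \]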

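Therefore, if $\ell < n$, then the vector $u = K w_\ell - (K - \lambda_{n+1} 1) w_n$ lies in $M_n$: indeed, $w_\ell \in M_{\ell+1} \subseteq M_n$, $K$ maps $M_n$ into itself, and $K - \lambda_{n+1} 1$ maps $M_{n+1}$ into $M_n$. Hence,
\[ \begin{aligned} \|K w_n - K w_\ell\| &= \|\lambda_{n+1} w_n + (K - \lambda_{n+1} 1) w_n - K w_\ell\| \\ &= \|\lambda_{n+1} w_n - u\| \\ &= |\lambda_{n+1}| \, \Big\| w_n - \tfrac{1}{\lambda_{n+1}} u \Big\| \\ &\ge \delta^2 > 0 . \end{aligned} \]
This means that $\{K w_n\}_n$ does not admit a convergent subsequence, in contradiction to the compactness of K. Therefore, it must be that 0 is a cluster point of σ(K). The same argument shows that no nonzero λ ∈ σ(K) could possibly be a cluster point of σ(K). □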

8.4 Examples
An Integral Operator

Consider the unit square [0, 1] × [0, 1] ⊂ R2 and denote L2 ([0, 1], m) by L2 .
Define a function $K : L^2 \to L^2$ by
\[ Kf(t) = \int_0^t \Big( \int_s^1 f(u) \, dm(u) \Big) dm(s) , \quad f \in L^2 . \]

Observe that Kf is square-integrable and that kKfk ≤ kfk, for all f ∈ L2 .


We will show that K is compact, but first let us solve the eigenvalue problem
for K.
Consider the equation $Kf = \lambda f$. That is,
\[ \int_0^t \Big( \int_s^1 f(u) \, dm(u) \Big) dm(s) = \lambda f(t) , \quad \text{for almost all } t \in [0, 1] . \]

The left hand side of the equation above is twice differentiable almost everywhere. The first differentiation leads to
\[ \int_t^1 f(u) \, dm(u) = \lambda \frac{df}{dt} . \]
The second derivative yields
\[ -f = \lambda \frac{d^2 f}{dt^2} . \]
Evaluating the original equation at $t = 0$ and the first-derivative equation at $t = 1$ gives, for $\lambda \neq 0$, $f(0) = f'(1) = 0$.
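Notice that $\langle Kg, g \rangle \ge 0$, for every $g \in L^2$, and so $0 \le \langle Kf, f \rangle = \lambda \|f\|^2$ implies that $\lambda \ge 0$; moreover, $\lambda = 0$ would force $f = 0$, so $\lambda > 0$. Therefore, the general solution of $\lambda f'' + f = 0$ is $f(t) = \alpha_1 e^{it/\sqrt{\lambda}} + \alpha_2 e^{-it/\sqrt{\lambda}}$. But $f(0) = 0$ implies that $\alpha_2 = -\alpha_1$. Hence, up to a scalar multiple, $f(t) = 2i \sin(t/\sqrt{\lambda})$. To satisfy both $f \neq 0$ and $f'(1) = 0$, it is necessary and sufficient that $\frac{1}{\sqrt{\lambda}} = \frac{\pi}{2}(2k - 1)$ for some $k \in \mathbb{N}$. Hence, the eigenvalues of K are
\[ \lambda_k = \frac{4}{(2k - 1)^2 \pi^2} , \quad k \in \mathbb{N} , \]
and the corresponding eigenvectors are $f_k(t) = 2i \sin(t/\sqrt{\lambda_k})$.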
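
The eigenvalue computation above can be sanity-checked numerically. The following is only a rough sketch, assuming a midpoint quadrature rule for both integrals; the helper names (`apply_K`, `eigenpair`, `residual`) are hypothetical and not part of these notes, and the scalar factor 2i in the eigenvector is dropped since eigenvectors are determined only up to a scalar.

```python
import math

def apply_K(f, t, n=200):
    """Approximate (Kf)(t) = int_0^t ( int_s^1 f(u) du ) ds by the midpoint rule."""
    def inner(s):
        # Midpoint rule for int_s^1 f(u) du on n subintervals.
        h = (1.0 - s) / n
        return h * sum(f(s + (i + 0.5) * h) for i in range(n))
    # Midpoint rule for the outer integral over [0, t].
    h = t / n
    return h * sum(inner((i + 0.5) * h) for i in range(n))

def eigenpair(k):
    """Candidate eigenvalue 4 / ((2k-1)^2 pi^2) and eigenfunction sin(t / sqrt(lambda))."""
    lam = 4.0 / ((2 * k - 1) ** 2 * math.pi ** 2)
    return lam, (lambda t: math.sin(t / math.sqrt(lam)))

def residual(k, t):
    """|(K f_k)(t) - lambda_k f_k(t)|, which should be near zero up to quadrature error."""
    lam, f = eigenpair(k)
    return abs(apply_K(f, t) - lam * f(t))

max_residual = max(residual(k, t) for k in (1, 2, 3) for t in (0.25, 0.5, 1.0))
print(max_residual)
```

A small residual is consistent with, but of course does not prove, the claim that each λk is an eigenvalue of K.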
Now to show that K is a compact operator, first note that $K = K_2 K_1$, where $K_1, K_2 : L^2 \to L^2$ are defined by
\[ K_1 f(t) = \int_t^1 f(s) \, dm(s) \quad \text{and} \quad K_2 g(t) = \int_0^t g(s) \, dm(s) . \]

Therefore, we need only show that K1 and K2 are compact.

To be completed . . .

8.5 Exercises

1. Prove that T K, KT ∈ K(V ), for all T ∈ B(V ) and K ∈ K(V ).


9
Operators on Hilbert Spaces

Because Hilbert spaces are self-dual, the adjoint T ∗ of an operator T on a Hilbert space H can be considered as an element of B(H). Consequently, the relationship between T and T ∗ takes on greater significance in the context of Hilbert space.

9.1 The Hilbert Space Adjoint

The natural notion of adjoint for Hilbert space operators is slightly different
from the adjoint of Banach space operators. In the Hilbert space setting, one
needs to account for the fact that the inner product is not bilinear—rather,
it is conjugate linear in the second variable.

Theorem 9.1. If T is an operator on a Hilbert space H, then there is a unique


operator T ∗ on H such that

hT ξ, ηi = hξ, T ∗ ηi , ∀ ξ, η ∈ H . (9.1)

Proof. Fix η ∈ H and define ϕη : H → C by ϕη (ξ) = hT ξ, ηi for all ξ ∈ H.


Because ϕη is plainly linear, the Riesz Representation Theorem (Theorem
6.28) states that there is a unique vector, which we will denote by η ∗ , such
that
hT ξ, ηi = ϕη (ξ) = hξ, η ∗ i , ∀ ξ ∈ H . (9.2)
Thus, η ∗ represents ϕη .
Now consider the function T ∗ : H → H that sends each η ∈ H to η ∗ ∈ H.
It is straightforward to verify that T ∗ is a linear function. Therefore equation
(9.2) becomes
hT ξ, ηi = hξ, T ∗ ηi , ∀ ξ, η ∈ H . (9.3)
All that remains now is to show that the transformation T ∗ that satisfies
equation (9.3) is unique. Suppose that S is an operator on H such that

hT ξ, ηi = hξ, Sηi for all ξ, η ∈ H. Then for any η ∈ H, hT ∗ η − Sη, ξi = 0 for


all ξ ∈ H. In particular, T ∗ η − Sη is orthogonal to itself and so T ∗ η − Sη = 0.
Therefore, S = T ∗ . 

Definition 9.2. The operator T ∗ defined by equation (9.1) is called the ad-
joint of the operator T and the map T 7→ T ∗ on B(H) is called the canonical
involution, or more simply the involution, on B(H).

Proposition 9.3. The involution on B(H) has the following properties for
all S, T ∈ B(H) and α ∈ C:
1. T ∗∗ = T ;
2. $(\alpha T)^* = \overline{\alpha}\, T^*$;
3. (S + T )∗ = S ∗ + T ∗ ;
4. (ST )∗ = T ∗ S ∗ .

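Proof. To prove 1, the adjoint $T^{**}$ of $T^*$ is, by (9.3), the unique operator on H for which $\langle T^*\vartheta, \nu \rangle = \langle \vartheta, T^{**}\nu \rangle$ (equivalently, $\overline{\langle T^*\vartheta, \nu \rangle} = \overline{\langle \vartheta, T^{**}\nu \rangle}$) for all $\vartheta, \nu \in H$. In setting $\nu = \xi$ and $\vartheta = \eta$, it follows that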

hξ, T ∗ ηi = hT ∗∗ ξ, ηi , ∀ ξ, η ∈ H .

But of course we know that it is always the case that hξ, T ∗ηi = hT ξ, ηi; if ξ
is fixed, then for every η ∈ H,

hT ξ − T ∗∗ ξ, ηi = 0 .

Thus, the vector T ξ − T ∗∗ ξ is orthogonal to itself, which means that T ξ −


T ∗∗ ξ = 0. As this is true for every ξ, we have then that T ∗∗ = T . The proofs
of the remaining algebraic statements are left to the reader. 

Proposition 9.4. If T ∈ B(H), then

\[ \|T\| = \sup_{\|\xi\| = \|\eta\| = 1} |\langle T\xi, \eta \rangle| . \]

Furthermore, kT ∗ k = kT k and kT ∗ T k = kT k2 .

Proof. If ω ∈ H, then kωk = sup{|ϕ(ω)| | ϕ ∈ H ∗ , kϕk = 1} (see Exercise


4 in Chapter 3). Thus, kωk = sup{|hω, ηi| | η ∈ H, kηk = 1}, by the Riesz
Representation Theorem. Hence,

\[ \|T\| = \sup_{\|\xi\| = 1} \|T\xi\| = \sup_{\|\xi\| = \|\eta\| = 1} |\langle T\xi, \eta \rangle| . \]

Since |hT ξ, ηi| = |hξ, T ∗ ηi|, for all ξ, η ∈ H, we obtain kT ∗ k = kT k immedi-


ately.
For any S, T ∈ B(H), kST k ≤ kSk kT k, simply by the definition of norm.
Therefore,

kT ∗ T k ≤ kT ∗ k kT k = kT k kT k = kT k2
in particular. Conversely, if ξ, η ∈ H are unit vectors, then the Cauchy–
Schwarz inequality yields
|hT ξ, ηi|2 ≤ kT ξk2 kηk2 = hT ξ, T ξi = hT ∗ T ξ, ξi ≤ kT ∗ T k .
Thus, kT k2 ≤ kT ∗ T k. 
Now we turn to the issue of representing adjoint transformations with
matrices.
Example 9.5. Matrix representations of T and T ∗ . Suppose that T ∈ B(H)
and that B = {φk }k∈N is an orthonormal basis of a separable Hilbert space
H. If T = [τij ] is the (infinite) matrix representation of T with respect to the
orthonormal basis B, then the (i, j)-entry of T is determined via
τij = hT φj , φi i .
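In using this for $T^*$ in place of $T$ we conclude that the $(p, q)$-entry of the matrix representation of $T^*$ with respect to B is $\langle T^*\phi_q, \phi_p \rangle$. Furthermore, $\langle T^*\phi_q, \phi_p \rangle$ is given by
\[ \langle T^*\phi_q, \phi_p \rangle = \overline{\langle \phi_p, T^*\phi_q \rangle} = \overline{\langle T\phi_p, \phi_q \rangle} = \overline{\tau_{qp}} . \]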
Thus, the matrix representation T ∗ of T ∗ is determined by transposing T , the
matrix representation of T , and then conjugating each entry. In other words,
T ∗ is the conjugate transpose of T . ♦
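
In finite dimensions, the conclusion of Example 9.5 can be checked directly. The sketch below is purely illustrative: it works on C² with the inner product linear in the first variable, and the helper names (`inner`, `apply`, `adjoint`) and the sample matrix and vectors are hypothetical choices, not taken from the notes.

```python
def inner(x, y):
    """Inner product on C^n: linear in the first variable, conjugate linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Matrix-vector product T x, with T stored as a list of rows."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def adjoint(T):
    """Conjugate transpose: entry (j, i) of the result is the conjugate of T[i][j]."""
    rows, cols = len(T), len(T[0])
    return [[T[i][j].conjugate() for i in range(rows)] for j in range(cols)]

T = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
x = [1 - 1j, 2 + 0j]
y = [0.5j, -1 + 3j]

lhs = inner(apply(T, x), y)           # <T x, y>
rhs = inner(x, apply(adjoint(T), y))  # <x, T* y>
print(abs(lhs - rhs))
```

The two inner products agree up to rounding, as equation (9.1) requires when the adjoint is the conjugate transpose.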
Proposition 9.6. If T ∈ B(H), then
1. ker T = (ran T ∗ )⊥ and
2. $\overline{\operatorname{ran} T} = (\ker T^*)^\perp$.
Proof. For the proof of the first assertion, assume that ξ ∈ ker T . Any vector
in ran T ∗ has the form T ∗ η, for some η ∈ H. Since
hξ, T ∗ ηi = hT ξ, ηi = 0 ,
we conclude that ξ ∈ (ran T ∗ )⊥ .
Conversely, suppose that ξ ∈ (ran T ∗ )⊥ . Thus, for every η ∈ H, 0 =
hξ, T ∗ηi. In particular, if η = T ξ, then
0 = hξ, T ∗ηi = hξ, T ∗ T ξi = hT ξ, T ξi = kT ξk2 ,
and so ξ ∈ ker T .
The proof of the second assertion is left to the reader (Exercise 2). 
Another aspect of the Hilbert space adjoint to be aware of—especially in
light of what has come before in the study of operators on Banach spaces—is
that the defect spectrum is characterised as follows:
Proposition 9.7. If $T \in B(H)$, then $\lambda \in \sigma_d(T)$ if and only if $\overline{\lambda} \in \sigma_p(T^*)$.
Proof. Exercise 3. 

9.2 Examples
Multiplication Operators

Let X be a compact space and let µ be a finite Borel measure on X.


Recall from §2.5 that if ψ : X → C is a Borel-measurable function, then
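the essential range of ψ is the closed subset ess-ran ψ of C defined by
\[ \operatorname{ess-ran} \psi = \bigcap_{E \subset X,\ \mu(X \setminus E) = 0} \overline{\psi(E)} , \]

where $\overline{\psi(E)}$ denotes the closure in C of $\psi(E) = \{\psi(t) \mid t \in E\}$, and where E is a Borel subset of X. The essential supremum of ψ is the quantity

\[ \operatorname{ess-sup} \psi = \max \{ |\lambda| \mid \lambda \in \operatorname{ess-ran} \psi \} . \]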

A measurable function ψ is essentially bounded if it has finite essential supre-


mum.
Note that if ψ is an essentially bounded Borel function ψ : X → C, then ψ
determines an element, denoted again by ψ, in the Banach space L∞ (X, µ),
and
kψk∞ = ess-sup ψ .
Moreover, the essential range of ψ is compact.
Now consider the Hilbert space L2 (X, µ) and let ψ denote an essentially
bounded Borel function ψ : X → C. Note that ψf ∈ L2 for every f ∈ L2 .
Thus, the function Mψ : L2 → L2 defined by

Mψ (f) = ψf , ∀ f ∈ L2 ,

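is a linear transformation on $L^2$. To show that $M_\psi$ is bounded, observe that

\[ \begin{aligned} \|M_\psi(f)\| &= \Big( \int_X |\psi|^2 |f|^2 \, d\mu \Big)^{1/2} \\ &\le \operatorname{ess-sup} |\psi| \, \Big( \int_X |f|^2 \, d\mu \Big)^{1/2} \\ &= \|\psi\|_\infty \|f\| . \end{aligned} \]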

Thus, Mψ is bounded and kMψ k ≤ kψk∞ . In fact, kMψ k = kψk∞ . To prove


this, let γ = kψk∞ . If ε > 0, then there is a Borel subset E ⊂ X such that
µ(E) > 0 and γ < |ψ(t)| + ε, for all t ∈ E. Let f = χE , the characteristic
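function of E. Since $\mu(X) < \infty$, $f \in L^2$. Further,
\[ \|M_\psi f\|^2 \ge \int_E |\psi|^2 |f|^2 \, d\mu \ge (\gamma - \varepsilon)^2 \int_E |f|^2 \, d\mu = (\gamma - \varepsilon)^2 \int_X |f|^2 \, d\mu . \]

Thus, $\|M_\psi f\|^2 \ge (\gamma - \varepsilon)^2 \|f\|^2$, and so $\|M_\psi\| \ge \|\psi\|_\infty - \varepsilon$. Hence, $\|M_\psi\| \ge \|\psi\|_\infty$.
The multiplication operator $M_\psi$ satisfies
\[ \langle f, M_\psi g \rangle = \int_X f \, \overline{\psi g} \, d\mu = \int_X \overline{\psi} f \, \overline{g} \, d\mu = \langle M_{\overline{\psi}} f, g \rangle , \quad \forall\, f, g \in L^2 . \]

Hence, by the uniqueness of the Hilbert space adjoint, $M_\psi^* = M_{\overline{\psi}}$. ♦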

The Bilateral Shift Operator

Let L2 (T) denote the Hilbert space L2 ([−π, π], m). By Theorem 6.26, an or-
thonormal basis of L2 (T) is given by the continuous functions φk : [−π, π] →
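C, k ∈ Z, defined by
\[ \phi_k(t) = \frac{1}{\sqrt{2\pi}} e^{ikt} , \quad t \in [-\pi, \pi] . \]

Each $f \in L^2(\mathbb{T})$ has a Fourier series decomposition

\[ f = \sum_{k \in \mathbb{Z}} \hat{f}(k) \, \phi_k , \]

which is convergent in $L^2(\mathbb{T})$ and where

\[ \hat{f}(k) = \langle f, \phi_k \rangle = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dm(t) . \]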

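Let $\psi(t) = e^{it}$. The multiplication operator $M_\psi$ on $L^2(\mathbb{T})$ is called the bilateral shift operator. Its adjoint is $M_{\overline{\psi}}$. Thus, $M_\psi f(t) = e^{it} f(t)$ and $(M_\psi)^* f(t) = e^{-it} f(t)$, for all $f \in L^2(\mathbb{T})$. If $g = M_\psi f$ and $h = (M_\psi)^* f$, then
\[ \hat{g}(k) = \hat{f}(k - 1) \quad \text{and} \quad \hat{h}(k) = \hat{f}(k + 1) , \quad \forall\, k \in \mathbb{Z} . \]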
The bilateral shift operator on L2 (T) is denoted by B. Thus, the bilateral
shift operator shifts the Fourier coefficients of f ∈ L2 (T) forward by one
position, and its adjoint shifts the Fourier coefficients backwards one position
(which explains the name “bilateral” shift). Put differently,

Bφk = φk+1 and B ∗ φk = φk−1 , ∀k ∈ Z.

Note that B is an invertible isometry and that B −1 = B ∗ . ♦
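
The action of B and B∗ on Fourier coefficients can be mimicked on finitely supported coefficient sequences. This is a minimal sketch under the assumption that a function is represented by a dictionary mapping the index k to its k-th Fourier coefficient; the function names are hypothetical.

```python
def bilateral_shift(coeffs):
    """Coefficients of Bf from those of f: the coefficient at k moves to k + 1."""
    return {k + 1: c for k, c in coeffs.items()}

def bilateral_shift_adjoint(coeffs):
    """Coefficients of B*f from those of f: the coefficient at k moves to k - 1."""
    return {k - 1: c for k, c in coeffs.items()}

def norm_squared(coeffs):
    """Squared L^2 norm via Parseval: the sum of |coefficient|^2."""
    return sum(abs(c) ** 2 for c in coeffs.values())

f_hat = {-2: 1 + 1j, 0: 2.0, 3: -0.5j}
g_hat = bilateral_shift(f_hat)
print(g_hat)
print(bilateral_shift_adjoint(g_hat) == f_hat)
```

Shifting forward and then backward returns the original coefficients, and the norm is unchanged, in line with B being an invertible isometry with inverse B∗.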



The Unilateral Shift Operator

Recall that the Hardy space $H^2(\mathbb{T})$ is the subspace of $L^2(\mathbb{T})$ defined by
\[ H^2 = \Big\{ f \in L^2(\mathbb{T}) \ \Big|\ \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dm(t) = 0 , \ \forall\, k < 0 \Big\} . \]

The bilateral shift operator B on L2 (T) has the property that Bf ∈ H 2 (T)
for every f ∈ H 2 (T). (To verify this, one need only check that the Fourier
coefficients of Bf vanish for negative k; this is clear because the Fourier coef-
ficients of Bf are obtained from f by a forward shift of the Fourier coefficients
of f.) Thus B induces an operator on H 2 (T) which is denoted by S and is
called the unilateral shift operator.
The adjoint of $S$ is not the restriction of $B^*$ to $H^2(\mathbb{T})$ because $B^*\phi_0 = \phi_{-1} \notin H^2(\mathbb{T})$ even though $\phi_0 \in H^2(\mathbb{T})$. However, it is not difficult to verify that $S^*$ satisfies $S^*\phi_k = \phi_{k-1}$, for all $k \in \mathbb{N}$, and $S^*\phi_0 = 0$ (Exercise 5). ♦
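
The asymmetry between S and S∗ shows up already on finitely supported coefficient sequences. In the sketch below, a list `[a0, a1, ...]` stands for the function with Fourier coefficients a0, a1, ... at the indices k = 0, 1, ...; this representation and the function names are illustrative assumptions only.

```python
def unilateral_shift(a):
    """S: move every coefficient one index forward; the coefficient at index 0 becomes 0."""
    return [0] + list(a)

def unilateral_shift_adjoint(a):
    """S*: move every coefficient one index backward; the coefficient at index 0 is discarded."""
    return list(a[1:])

a = [1, 2, 3]
print(unilateral_shift_adjoint(unilateral_shift(a)))  # S*S acts as the identity
print(unilateral_shift(unilateral_shift_adjoint(a)))  # SS* deletes the coefficient at index 0
```

So S∗S is the identity while SS∗ is not, which is one way to see that S is an isometry that fails to be invertible.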

Integral Operators

Consider the unit square X = [0, 1]×[0, 1] ⊂ R2 , and let κ ∈ L2 (X, m2 ), where
m2 denotes Lebesgue measure on R2 . Let L2 denote L2 ([0, 1], m) (Lebesgue
measure). For each $f \in L^2$, let $K_\kappa f$ denote a function whose value at $t \in [0, 1]$ is given by
\[ K_\kappa f(t) = \int_0^1 \kappa(t, s) f(s) \, dm(s) . \]

Observe that $K_\kappa f$ is square-integrable and, by the Cauchy–Schwarz inequality, $\|K_\kappa f\| \le \|\kappa\| \, \|f\|$. Thus, $K_\kappa$ is an operator (called an integral operator) on $L^2$ of norm $\|K_\kappa\| \le \|\kappa\|$.
The adjoint of Kκ must, by definition, satisfy

hf, (Kκ )∗ gi = hKκ f, gi , ∀ f, g ∈ L2 .

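Because
\[ \begin{aligned} \langle K_\kappa f, g \rangle &= \int_0^1 K_\kappa f(t) \, \overline{g(t)} \, dm(t) \\ &= \int_0^1 \int_0^1 \kappa(t, s) f(s) \, \overline{g(t)} \, dm(s) \, dm(t) \\ &= \int_0^1 \int_0^1 \kappa(t, s) f(s) \, \overline{g(t)} \, dm(t) \, dm(s) \qquad \text{[Fubini's Theorem]} \\ &= \int_0^1 f(s) \, \overline{\Big( \int_0^1 \overline{\kappa(t, s)} \, g(t) \, dm(t) \Big)} \, dm(s) , \end{aligned} \]

we conclude that $K_\kappa^*$ is given by

\[ K_\kappa^* g(t) = \int_0^1 \overline{\kappa(s, t)} \, g(s) \, dm(s) , \quad \forall\, g \in L^2 . \]

In other words, $K_\kappa^*$ is an integral operator induced by the “conjugate transpose” $\overline{\kappa(s, t)}$ of $\kappa = \kappa(t, s)$. ♦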

9.3 Hermitian Operators

Much of the theory of Hilbert space operators is devoted to the ways in which
T and T ∗ interact. The first case of interest occurs when T and T ∗ are in fact
the same. Such operators are probably the most important in all of operator
theory and its applications.

Definition 9.8. T ∈ B(H) is hermitian if T ∗ = T .

For any $T \in B(H)$, $T = \frac{1}{2}(T + T^*) + i \, \frac{1}{2i}(T - T^*)$. Since $T + T^*$ and $i(T - T^*)$ are hermitian, the hermitian operators span $B(H)$. This is one reason why hermitian operators are of importance.
If $T \in B(H)$ is hermitian, then $\langle T\xi, \eta \rangle = \langle \xi, T^*\eta \rangle = \langle \xi, T\eta \rangle$, for every $\xi, \eta \in H$. In particular, if $\eta = \xi$, this implies that $\langle T\xi, \xi \rangle = \overline{\langle T\xi, \xi \rangle}$, for all $\xi \in H$; that is, the form $\xi \mapsto \langle T\xi, \xi \rangle$ is necessarily real valued if $T$ is hermitian. This necessary condition is also sufficient.

Proposition 9.9. An operator T ∈ B(H) is hermitian if and only if hT ξ, ξi ∈


R for all ξ ∈ H.

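Proof. If $T$ is hermitian, then for every vector $\xi$,

\[ \langle T\xi, \xi \rangle = \langle \xi, T^*\xi \rangle = \langle \xi, T\xi \rangle = \overline{\langle T\xi, \xi \rangle} , \]

and therefore $\langle T\xi, \xi \rangle \in \mathbb{R}$.
Conversely, suppose that $\langle T\vartheta, \vartheta \rangle \in \mathbb{R}$ for all $\vartheta$; then $\langle T\vartheta, \vartheta \rangle = \langle \vartheta, T\vartheta \rangle$. Let $\lambda \in \mathbb{C}$ be arbitrary and set $\vartheta = \xi + \lambda\eta$. Let $\Re z$ and $\Im z$ denote the real and imaginary parts of a complex number $z$. Because

\[ \langle T\vartheta, \vartheta \rangle = \langle T\xi, \xi \rangle + \overline{\lambda} \langle T\xi, \eta \rangle + \lambda \langle T\eta, \xi \rangle + |\lambda|^2 \langle T\eta, \eta \rangle , \]

\[ \langle \vartheta, T\vartheta \rangle = \langle \xi, T\xi \rangle + \overline{\lambda} \langle \xi, T\eta \rangle + \lambda \langle \eta, T\xi \rangle + |\lambda|^2 \langle \eta, T\eta \rangle , \]

the equation $\langle T\vartheta, \vartheta \rangle = \langle \vartheta, T\vartheta \rangle$ leads to

\[ \overline{\lambda} \langle T\xi, \eta \rangle - \overline{\overline{\lambda} \langle T\xi, \eta \rangle} = \overline{\lambda} \langle \xi, T\eta \rangle - \overline{\overline{\lambda} \langle \xi, T\eta \rangle} . \]

In other words,
\[ \Im \big( \overline{\lambda} \langle T\xi, \eta \rangle \big) = \Im \big( \overline{\lambda} \langle \xi, T\eta \rangle \big) . \]

With $\lambda = i$, the equation above is $\Re \langle T\xi, \eta \rangle = \Re \langle \xi, T\eta \rangle$, whereas with $\lambda = 1$ the equation becomes $\Im \langle T\xi, \eta \rangle = \Im \langle \xi, T\eta \rangle$. This proves, then, that $\langle T\xi, \eta \rangle = \langle \xi, T\eta \rangle$ for all vectors $\xi, \eta \in H$. Hence, $T^* = T$. □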
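
Proposition 9.9 is easy to observe on C². The sketch below uses two hypothetical 2×2 matrices chosen for illustration (one hermitian, one not) and a fixed test vector; the helper names are assumptions, and the inner product is linear in the first variable.

```python
def inner(x, y):
    """Inner product on C^n: linear in the first variable, conjugate linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Matrix-vector product T x, with T stored as a list of rows."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def quadratic_form(T, x):
    """The value <T x, x>."""
    return inner(apply(T, x), x)

hermitian = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]  # equal to its conjugate transpose
nilpotent = [[0j, 1 + 0j], [0j, 0j]]              # not hermitian
xi = [1 + 0j, 1j]

q_hermitian = quadratic_form(hermitian, xi)
q_nilpotent = quadratic_form(nilpotent, xi)
print(q_hermitian, q_nilpotent)
```

The first value is real while the second is purely imaginary and nonzero, so the non-hermitian matrix fails the criterion at this particular vector.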
The next set of propositions reveals some striking features of the spectra
of hermitian operators.

Proposition 9.10. If T ∈ B(H) is a hermitian operator, then σ(T ) =


σap(T ) ⊂ R.

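Proof. Recall that $\sigma(T) = \sigma_{ap}(T) \cup \sigma_d(T)$. By Proposition 9.7, $\lambda \in \sigma_d(T)$ if and only if $\overline{\lambda} \in \sigma_p(T^*)$. As $T^* = T$, if we show that every eigenvalue of $T$ is real, then we will obtain $\sigma_d(T) \subset \mathbb{R}$. To this end, let $T\xi = \lambda\xi$ for some $\lambda \in \mathbb{C}$ and unit vector $\xi \in H$. Because $\langle T\xi, \xi \rangle \in \mathbb{R}$ (Proposition 9.9), we obtain

\[ \lambda = \lambda \langle \xi, \xi \rangle = \langle \lambda\xi, \xi \rangle = \langle T\xi, \xi \rangle \in \mathbb{R} . \]

Hence, $\sigma_d(T) = \sigma_p(T) \subset \mathbb{R}$ and

\[ \sigma(T) = \sigma_{ap}(T) \cup \sigma_d(T) = \sigma_{ap}(T) \cup \sigma_p(T) \subseteq \sigma_{ap}(T) \subseteq \sigma(T) . \]

We now show that $\sigma_{ap}(T) \subset \mathbb{R}$. Suppose that $\lambda \in \mathbb{C} \setminus \mathbb{R}$. For any nonzero $\xi \in H$, we have

\[ \begin{aligned} 0 < |\lambda - \overline{\lambda}| \, \|\xi\|^2 &= |\langle (T - \lambda 1)\xi, \xi \rangle - \langle (T - \overline{\lambda} 1)\xi, \xi \rangle| \\ &= |\langle (T - \lambda 1)\xi, \xi \rangle - \langle \xi, (T - \lambda 1)\xi \rangle| \\ &\le 2 \|(T - \lambda 1)\xi\| \, \|\xi\| . \end{aligned} \]

Hence, $\|(T - \lambda 1)\xi\| \ge \frac{1}{2} |\lambda - \overline{\lambda}| \, \|\xi\|$ for all $\xi \in H$. This proves that $(T - \lambda 1)$ is bounded below and, hence, $\lambda \notin \sigma_{ap}(T)$. This concludes the proof of $\sigma(T) = \sigma_{ap}(T) \subset \mathbb{R}$. □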

Proposition 9.11. If T ∈ B(H) is hermitian, then

kT k = max {|λ| | λ ∈ σ(T )} .

Proof. Let $\nu = \|T\|$. For any unit vector $\xi \in H$,

\[ \begin{aligned} \|(T^2 - \nu^2 1)\xi\|^2 &= \langle (T^2 - \nu^2 1)\xi, (T^2 - \nu^2 1)\xi \rangle \\ &= \|T^2\xi\|^2 - 2\nu^2 \|T\xi\|^2 + \nu^4 \|\xi\|^2 \\ &\le \nu^2 \|T\xi\|^2 - 2\nu^2 \|T\xi\|^2 + \nu^4 \\ &= \nu^4 - \nu^2 \|T\xi\|^2 . \end{aligned} \tag{9.4} \]

By definition of the norm of an operator, there are unit vectors $\xi_n \in H$ such that $\|T\xi_n\| \to \|T\|$. Hence, by inequality (9.4), $\lim_n \|(T^2 - \nu^2 1)\xi_n\|^2$ exists and is equal to 0. Therefore, $\nu^2 \in \sigma(T^2)$.
Because T 2 − ν 2 1 = (T + ν1)(T − ν1), at least one of the two operators on
the right hand side of this expression must fail to be invertible. Thus, ν ∈ σ(T )
or −ν ∈ σ(T ). In either case, there is a λ ∈ σ(T ) such that |λ| = kT k. On the
other hand, |λ| ≤ kT k, for all λ ∈ σ(T ) (Theorem 7.15), which completes the
proof. 
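Proposition 9.12. Assume that $T \in B(H)$ is hermitian and let
\[ m_\ell = \inf_{\|\xi\| = 1} \langle T\xi, \xi \rangle \quad \text{and} \quad m_u = \sup_{\|\xi\| = 1} \langle T\xi, \xi \rangle . \]

Then $m_\ell, m_u \in \sigma(T)$ and $\sigma(T) \subseteq [m_\ell, m_u]$.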


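Proof. If $T' = T - m_\ell 1$, then $T'$ is hermitian and $m_\ell \in \sigma(T)$ if and only if $0 \in \sigma(T')$. Therefore, we assume without loss of generality that $m_\ell = 0$. Under this assumption, the (possibly nondefinite) hermitian form $[\cdot, \cdot] : H \times H \to \mathbb{C}$ defined by $[\xi, \eta] = \langle T\xi, \eta \rangle$ satisfies the Cauchy–Schwarz inequality
\[ |[\xi, \eta]| \le [\xi, \xi]^{1/2} [\eta, \eta]^{1/2} , \quad \forall\, \xi, \eta \in H . \]
Therefore, with $\eta = T\xi$,
\[ \|T\xi\|^4 = |\langle T\xi, T\xi \rangle|^2 \le \langle T\xi, \xi \rangle \, \langle T^2\xi, T\xi \rangle \le \langle T\xi, \xi \rangle \, \|T\|^3 \|\xi\|^2 . \]
Because $\inf_{\|\xi\| = 1} \langle T\xi, \xi \rangle = m_\ell = 0$, it follows that $\inf_{\|\xi\| = 1} \|T\xi\| = 0$, which proves that $T$ is not bounded below. That is, $0 \in \sigma(T)$.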
On the other hand, if λ < 0, then
k(T − λ1)ξk2 = kT ξk2 − 2λhT ξ, ξi + λ2 kξk2 ≥ λ2 kξk2
implies that T − λ1 is bounded below. Hence, λ 6∈ σap (T ) = σ(T ). This proves
that σ(T ) ⊆ [0, ∞).
The proof that mu ∈ σ(T ) and that σ(T ) ⊆ (−∞, mu ] is left to the reader
(Exercise 8). 
Corollary 9.13. If $T \in B(H)$ is hermitian, then $\operatorname{Conv} \sigma(T) = \overline{\{ \langle T\xi, \xi \rangle \mid \|\xi\| = 1 \}}$.

9.4 Normal Operators


Suppose that $T \in B(H)$. The real part of $T$ is the operator $\Re T = \frac{1}{2}(T + T^*)$ and the imaginary part of $T$ is $\Im T = \frac{1}{2i}(T - T^*)$. As every operator $T$ can be written in terms of its real and imaginary parts (namely, $T = \Re T + i \, \Im T$), the complex span of the real vector space of hermitian operators is $B(H)$. If $\Re T$ and $\Im T$ commute, then we say that $T$ is a normal operator. Of course if $T$ is hermitian, then $\Re T = T$ and $\Im T = 0$; hence, hermitian operators are normal. The main result of this section shows that normal operators have spectral properties similar to hermitian operators (Proposition 9.16).

Definition 9.14. An operator $N \in B(H)$ is normal if $(\Re N)(\Im N) = (\Im N)(\Re N)$.

More convenient characterisations of normality are:

Proposition 9.15. The following statements are equivalent for T ∈ B(H):


1. T is normal;
2. kT ∗ ξk = kT ξk, for all ξ ∈ H;
3. T ∗ T = T T ∗ .

Proof. Exercise 9. 

Proposition 9.16. If N is a normal Hilbert space operator, then σ(N ) =


σap(N ).

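Proof. Because $\sigma(N) = \sigma_{ap}(N) \cup \sigma_d(N) = \sigma_{ap}(N) \cup \sigma_p(N^*)^*$ (where $\sigma_p(N^*)^*$ is the set of complex conjugates of the elements of $\sigma_p(N^*)$), to prove the proposition it is sufficient to show that if $\lambda \in \sigma_p(N^*)$, then $\overline{\lambda} \in \sigma_p(N)$. Therefore, assume that $\lambda \in \sigma_p(N^*)$ and let $\xi \in H$ be nonzero with $N^*\xi = \lambda\xi$. Because $N$ is normal, it is also true that $N - \overline{\lambda} 1$ is a normal operator. Consequently, Proposition 9.15 implies that

\[ 0 = \|(N^* - \lambda 1)\xi\| = \|(N - \overline{\lambda} 1)^*\xi\| = \|(N - \overline{\lambda} 1)\xi\| . \]

Hence, $\overline{\lambda} \in \sigma_p(N) \subseteq \sigma_{ap}(N)$. □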

Proposition 9.17. If N ∈ B(H) is normal, then spr N = kN k.

Proof. By the spectral radius formula (Theorem ),

\[ \operatorname{spr} N = \lim_n \|N^n\|^{1/n} . \]

In particular,
\[ \operatorname{spr} N = \lim_k \|N^{2^k}\|^{1/2^k} . \tag{9.5} \]

Because $N$ is normal, $(N^2)^*(N^2) = (N^*N)^2$. Thus,

\[ \|N^2\|^2 = \|(N^2)^*(N^2)\| = \|(N^*N)(N^*N)\| = \|N^*N\|^2 = \|N\|^4 , \]

which implies that $\|N^2\| = \|N\|^2$. By induction, $\|N^{2^k}\| = \|N\|^{2^k}$, for all $k \in \mathbb{N}$. Hence, by (9.5), $\operatorname{spr} N = \|N\|$. □
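
A small worked example may help show why normality cannot be dropped in Proposition 9.17. The 2×2 matrix below is a standard illustration chosen here, not one taken from these notes; it acts on $\mathbb{C}^2$ with its usual inner product.

```latex
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
N^*N = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \neq
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = NN^* ,
\]
\[
\|N\| = \|N^*N\|^{1/2} = 1 , \qquad
N^2 = 0 \ \Longrightarrow\ \operatorname{spr} N = \lim_n \|N^n\|^{1/n} = 0 < \|N\| .
\]
```

Here $N$ is not normal, and the identity $\|N^2\| = \|N\|^2$ used in the proof fails, since $\|N^2\| = 0$.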

9.5 Continuous Functional Calculus


If T ∈ B(H) is hermitian, and if f ∈ C [t] is a polynomial, then f(T ) is
normal. With hermitian operators one can enlarge the polynomial functional
calculus to include continuous functions.

Theorem 9.18. Assume that T ∈ B(H) is hermitian and that σ(T ) ⊆ [a, b].
Then for any continuous function f : [a, b] → C there is a hermitian operator
denoted by f(T ) ∈ B(H) with the property that

kf(T ) − qn (T )k → 0

for any sequence of polynomials $q_n \in \mathbb{C}[t]$ for which

\[ \lim_n \Big( \max_{t \in [a, b]} |f(t) - q_n(t)| \Big) = 0 . \]

Proof. The existence of the sequence $\{q_n\}_{n \in \mathbb{N}}$ is assured by the Weierstrass Approximation Theorem. Since this sequence is assumed to converge uniformly
on [a, b] to f, the sequence is a Cauchy sequence in C([a, b]). Note that qm (T )−
qn (T ) is hermitian, for all m, n ∈ N, and that the Spectral Mapping Theorem
shows that

σ(qm (T ) − qn (T )) = {qm (λ) − qn (λ) | λ ∈ σ(T )} .

Thus, kqm (T ) − qn (T )k = maxλ∈σ(T ) |qm (λ) − qn (λ)| (Proposition 9.17) and


therefore {qn (T )}n∈N is a Cauchy sequence of hermitian operators in B(H).
Hence, the limit of this sequence exists, which we denote by f(T ). Note also
that f(T ) is independent of the choice of approximating sequence {qn }n∈N. 
The map f 7→ f(T ) is called continuous functional calculus for T . Not all
Hilbert space operators admit continuous functional calculus (Exercise 10).
The following properties are readily verified.

Proposition 9.19. If T ∈ B(H) is hermitian and σ(T ) ⊆ [a, b], then for all
continuous functions f, g : [a, b] → C and α ∈ C we have:
1. kf(T )k = maxλ∈σ(T ) |f(λ)|,
2. αf(T ) = α(f(T )),
3. (f + g)(T ) = f(T ) + g(T ), and
4. fg(T ) = f(T )g(T ).

9.6 Positive Operators

As an application of continuous functional calculus, we show below that pos-


itive operators have positive square roots.

Definition 9.20. An operator T ∈ B(H) is positive if T is hermitian and


σ(T ) ⊂ [0, ∞).

Proposition 9.21. T ∈ B(H) is positive if and only if hT ξ, ξi ≥ 0 for every


ξ ∈ H.

Proof. This statement is an immediate consequence of Corollary 9.13. 


In light of Proposition 9.21, T ∗ T is positive for every operator T ∈ B(H).
Theorem 9.22. If T ∈ B(H) is positive, then
1. there is a positive operator R ∈ B(H) such that R2 = T , and
2. if R1 ∈ B(H) is a positive operator such that R21 = T , then R1 = R.

Proof. Since $\sigma(T) \subseteq [0, \|T\|]$ and the function $f(t) = \sqrt{t}$ is continuous on $[0, \|T\|]$, we may consider the hermitian operator $R = f(T)$. By Proposition 9.19, $R^2 = T$. We now show that $R$ is positive.
By scaling $T$ we may assume without loss of generality that $\|T\| = 1$. Thus, $\sigma(T) \subseteq [0, 1]$. For each $n \in \mathbb{N}$, let $q_n$ be the $n$-th Bernstein polynomial approximant of $f(t) = \sqrt{t}$: namely,
\[ q_n(t) = \sum_{k=1}^{n} \sqrt{\frac{k}{n}} \binom{n}{k} t^k (1 - t)^{n-k} . \]

Therefore, $q_n(t) \ge 0$ for all $t \in [0, 1]$ and $\lim_n \max_{t \in [0, 1]} |q_n(t) - f(t)| = 0$ ([2, §10.3]). Thus, each $q_n(T)$ is a positive operator and $\{q_n(T)\}_n$ converges to $R$. We will show that this implies that $R$ is positive.
To this end, Proposition 9.12 implies that the smallest element $m_\ell$ in $\sigma(R)$ has the form
\[ m_\ell = \inf_{\|\xi\| = 1} \langle R\xi, \xi \rangle . \]

If $m_\ell < 0$, then there must be a unit vector $\xi$ and an $n \in \mathbb{N}$ such that $\langle q_n(T)\xi, \xi \rangle < 0$. On the other hand, as $q_n(T)$ is positive, $\langle q_n(T)\xi, \xi \rangle \ge 0$ by Proposition 9.12. This contradiction implies that $m_\ell \ge 0$ and so $R$ is positive. Thus, $\sigma(R) \subset [0, \infty)$. Furthermore, since $0 \le q_n(t) \le 1$ for all $t \in [0, 1]$, each $\|q_n(T)\| \le 1$ and so $\|R\| \le 1$.
Next, suppose that $R_1 \in B(H)$ is positive and $R_1^2 = T$. By the Spectral Mapping Theorem, $\sigma(R_1) \subseteq [0, 1]$. Since $q_n(t) \to \sqrt{t}$ uniformly on $[0, 1]$, $q_n(t^2) \to \sqrt{t^2} = t$ uniformly on $[0, 1]$. Thus, $q_n(R_1^2) \to R_1$. That is,
\[ R_1 = \lim_n q_n(R_1^2) = \lim_n q_n(T) = R , \]

which proves that $T$ has a unique positive square root. □


Notation. If T ∈ B(H) is positive, then T 1/2 denotes the unique positive
square root of T .
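
The Bernstein approximants of the square root function used in the proof of Theorem 9.22 can be evaluated directly. The following is an illustrative sketch only: the function names and the choice of a uniform grid for estimating the maximum error are assumptions, not part of the notes.

```python
import math

def bernstein_sqrt(n, t):
    """n-th Bernstein polynomial of f(t) = sqrt(t), evaluated at a point t of [0, 1]."""
    return sum(
        math.sqrt(k / n) * math.comb(n, k) * t ** k * (1 - t) ** (n - k)
        for k in range(1, n + 1)
    )

def max_error(n, grid=200):
    """Largest value of |q_n(t) - sqrt(t)| over the grid points j / grid, j = 0, ..., grid."""
    return max(
        abs(bernstein_sqrt(n, j / grid) - math.sqrt(j / grid))
        for j in range(grid + 1)
    )

err_small_n = max_error(10)
err_large_n = max_error(100)
print(err_small_n, err_large_n)
```

On the grid, the error for n = 100 is smaller than for n = 10, consistent with uniform convergence; every term of the sum is nonnegative, which is the positivity of the approximants that the proof relies on.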

9.7 Polar Decomposition


In working with complex numbers z, it is sometimes advantageous to express
z in its polar form z = ei θ |z|, where θ is the argument of z. One can do the
same with operators on Hilbert space, and the result is a major structure
theorem for arbitrary operators called Polar Decomposition.

Definition 9.23. The modulus $|T|$ of an operator $T \in B(H)$ is the operator
\[ |T| = (T^*T)^{1/2} . \]
In the context of Hilbert space, one characterises isometries via the inner
product.
Proposition 9.24. The following statements are equivalent for V ∈ B(H):
1. V is an isometry;
2. hV ξ, V ηi = hξ, ηi for all ξ, η ∈ H;
3. V ∗ V = 1.
Proof. Exercise 11. 
A more general notion of isometry will be required.
Definition 9.25. An operator V ∈ B(H) is a partial isometry if kV ξk = kξk,
for every ξ ∈ (ker V )⊥ .
If V ∈ B(H) is a partial isometry, then the range of V is closed. The
subspaces (ker V )⊥ and ran V are called, respectively, the initial space and
final space of the partial isometry V .
Theorem 9.26. (Polar Decomposition) Assume that T ∈ B(H).
1. There exists a partial isometry $V \in B(H)$ with initial space $\overline{\operatorname{ran} |T|}$ such that $T = V|T|$.
2. If $T = V_1 R_1$ for some positive operator $R_1$ and partial isometry $V_1$ with initial space $\overline{\operatorname{ran} R_1}$, then $R_1 = |T|$ and $V_1 = V$.
Proof. For every ξ ∈ H,
k |T |ξk2 = h|T |ξ, |T |ξi = h|T |2 ξ, ξi = hT ∗ T ξ, ξi = kT ξk2 . (9.6)
Therefore, ker |T | = ker T .
Let $V_0 : \operatorname{ran} |T| \to H$ be the linear transformation that maps each $|T|\xi \in \operatorname{ran} |T|$ to $T\xi \in \operatorname{ran} T$. Since $|T|\xi_1 = |T|\xi_2$ only if $\xi_1 - \xi_2 \in \ker |T| = \ker T$, $V_0$ is well defined. Furthermore, (9.6) implies that $\|V_0 \psi\| = \|\psi\|$ for all $\psi \in \operatorname{ran} |T|$. Hence, $V_0$ extends (by continuity) to an isometry $\overline{\operatorname{ran} |T|} \to H$, denoted again by $V_0$. Now extend $V_0$ to a partial isometry $V \in B(H)$ by defining $V\eta = 0$ for all $\eta \in (\operatorname{ran} |T|)^\perp = \ker |T| = \ker T$. Hence, $V$ is a partial isometry with initial space $\overline{\operatorname{ran} |T|}$ and satisfies $V|T| = T$.
Suppose next that $T = V_1 R_1$ for some positive operator $R_1$ and partial isometry $V_1$ with initial space $\overline{\operatorname{ran} R_1}$. Because, for every $\xi \in H$,
\[ \langle T^*T\xi, \xi \rangle = \|T\xi\|^2 = \|V_1 R_1 \xi\|^2 = \|R_1 \xi\|^2 = \langle R_1^2 \xi, \xi \rangle , \]
$T^*T = R_1^2$ (by the polarisation identity). Hence, $|T| = (T^*T)^{1/2} = (R_1^2)^{1/2} = R_1$ by the uniqueness of the positive square root. Hence, $V|T| = V_1|T|$. That is, $V$ and $V_1$ agree on $\operatorname{ran} |T|$ and, by continuity, on $\overline{\operatorname{ran} |T|}$. But since the initial space of $V_1$ is $\overline{\operatorname{ran} |T|}$, $V_1$ is zero on $(\operatorname{ran} |T|)^\perp$, as is $V$. Hence, $V$ and $V_1$ agree on all of $H$. □
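
A worked example on $\mathbb{C}^2$ may make the construction concrete. The matrix below is an illustrative choice, not one taken from these notes; $e_1, e_2$ denote the standard basis vectors.

```latex
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T^*T = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
|T| = (T^*T)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} .
\]
\[
V = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
V|T| = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = T .
\]
```

In this example $\operatorname{ran} |T| = \operatorname{Span}\{e_2\} = (\ker V)^\perp$ and $V e_2 = e_1$, so $V$ is a partial isometry with initial space $\operatorname{ran} |T|$ and final space $\operatorname{Span}\{e_1\} = \operatorname{ran} T$; it is not unitary, which is consistent with $T$ failing to be invertible.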

Definition 9.27. An isometry U ∈ B(H) is called a unitary operator if U is


invertible.

Thus, an operator U ∈ B(H) is unitary if and only if U is invertible and


U ∗ = U −1 .

Proposition 9.28. If T ∈ B(H) is invertible, then T = U |T | for some uni-


tary operator U ∈ B(H).

Proof. Exercise 12. 


The spectrum of a unitary operator lies on the unit circle.

Proposition 9.29. If U ∈ B(H) is unitary, then σap (U ) = σ(U ) ⊆ ∂D.

Proof. The operator U is normal, since U ∗ U = U U ∗ = 1; thus, σ(U ) = σap (U )
by Proposition 9.16. Because U is an isometry, ‖(U − λ1)ξ‖ ≥ |1 − |λ|| ‖ξ‖ for
all ξ ∈ H and λ ∈ C, and so σap (U ) ⊆ ∂D.

9.8 Projections and Invariant Subspaces


Definition 9.30. An operator P ∈ B(H) with the property that P 2 = P = P ∗
is called a projection.

Note that P 2 = P implies that P (1 − P ) = (1 − P )P = 0. Since both P


and (1 − P ) are hermitian, they have orthogonal ranges (Exercise 16). In fact:

Proposition 9.31. If P ∈ B(H) is a projection, then ker P = ran (1 − P ).

Proof. By Proposition 9.6, ker P = (ran P ∗ )⊥ . Thus, ker P = (ran P )⊥ , because
P ∗ = P . Furthermore, ran (1 − P ) ⊥ ran P because

⟨(1 − P )ξ, P η⟩ = ⟨P (1 − P )ξ, η⟩ = 0 , ∀ ξ, η ∈ H .

Hence, ran (1 − P ) ⊆ (ran P )⊥ = ker P . Conversely, if ξ ∈ ker P , then
ξ = (1 − P )ξ ∈ ran (1 − P ). Hence, ker P = ran (1 − P ).


Observe that Proposition 9.31 shows that the range of a projection is
closed.

Proposition 9.32. Let M ⊂ H be a subspace and consider the direct sum


decomposition H = M ⊕ M ⊥ . Define a linear transformation P : H → H by
P (ξ + η) = ξ, for all ξ ∈ M , η ∈ M ⊥ . Then
1. P ∈ B(H),
2. P is a projection, and
3. M = ran P and M ⊥ = ker P = ran(1 − P ).

Proof. Assume that H = M ⊕ M ⊥ and that P (ξ + η) = ξ, for all ξ ∈ M ,


η ∈ M ⊥ . If ν ∈ H, then there are ξ ∈ M and η ∈ M ⊥ such that ν = ξ + η.
Hence,

‖P ν‖² = ‖P (ξ + η)‖² = ‖ξ‖² ≤ ‖ξ‖² + ‖η‖² = ‖ν‖² ,

which proves that P is bounded (and ‖P ‖ ≤ 1).


By the definition of the action of P on H, we see immediately that P 2 = P .
To prove that P ∗ = P , let ξ, η ∈ H. There exist unique ξj ∈ M and ηj ∈ M ⊥
such that ξ = ξ1 + η1 and η = ξ2 + η2 . Thus,

⟨P ξ, η⟩ = ⟨P (ξ1 + η1 ), η⟩ = ⟨ξ1 , ξ2 + η2 ⟩ = ⟨ξ1 , ξ2 ⟩
= ⟨ξ1 + η1 , ξ2 ⟩ = ⟨ξ, P η⟩ .

Hence, P ∗ = P .
Finally, by the definition of the action of P on H, it is obvious that M =
ran P and M ⊥ = ker P = ran(1 − P ). 

Definition 9.33. A subspace L of a Hilbert space H is invariant under an
operator T ∈ B(H) if
T ξ ∈ L, ∀ ξ ∈ L.
The set of all subspaces L ⊆ H that are invariant under T is denoted by
Lat T and is called the invariant-subspace lattice of T .

Why the term “lattice”? The answer is given by the following definition
and proposition.

Definition 9.34. If ℱ is a family of subspaces of H, then the join of ℱ is
\[
\bigvee_{F \in \mathcal{F}} F = \operatorname{Span}\Big( \bigcup_{F \in \mathcal{F}} F \Big)
\]
and the meet of ℱ is
\[
\bigwedge_{F \in \mathcal{F}} F = \bigcap_{F \in \mathcal{F}} F .
\]
A family ℰ of subspaces of H is called a lattice if ℰ contains {0} and H,
and if for every subfamily ℱ of ℰ, both the join and the meet of ℱ belong to ℰ.

Proposition 9.35. Lat T is a lattice, for every T ∈ B(H).

Proof. Exercise 17. 

Example 9.36. Kernel and Closure of the Range. If T ∈ B(H) and λ ∈ C,
then ker(T − λ1) and the closure of ran(T − λ1) both belong to Lat T .
This is straightforward to verify (Exercise 18). ♦

From the point of view of invariant subspace theory, what makes projec-
tions of interest is that, for a fixed operator T , the issue of whether a subspace
is invariant or not can be reformulated as an equation involving operators.

Proposition 9.37. Assume that T, P ∈ B(H) and that P is a projection.


1. ran P ∈ Lat T if and only if P T P = T P .
2. ran P ∈ Lat T and ran (1 − P ) ∈ Lat T if and only if P T = T P .

Proof. For the proof of 1, assume that ran P ∈ Lat T . If ξ ∈ H, then P ξ ∈


ran P and so T P ξ ∈ ran P . As P acts like the identity on ran P , we have
P (T P ξ) = T P ξ; thus, P T P = T P .
Conversely, assume that P T P = T P . Suppose that η ∈ ran P . Then there
is a ξ ∈ H such that P ξ = η. Thus, T η = T P ξ = P T P ξ ∈ ran P , which
proves that ran P is invariant under T . This proves 1.
To prove 2, assume that both ran P ∈ Lat T and ran (1 − P ) ∈ Lat T . By
(1), this is to say that P T P = T P and (1 − P )T (1 − P ) = T (1 − P ). Thus,

T − T P = (1 − P )T (1 − P ) = T − P T − T P + P T P = T − P T ,

and so T P = P T .
Conversely, if P T = T P , then P T P = T P 2 = T P ; thus ran P ∈ Lat T by
1. Moreover,

(1 − P )T (1 − P ) = T − T P − P T + P T P = T − T P = T (1 − P ) ,

and therefore ran (1 − P ) ∈ Lat T , again by 1. 
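The two operator equations in Proposition 9.37 are easy to test numerically. The following is a minimal sketch on H = C², with illustrative matrices and helper names (`matmul`, `range_is_invariant`, `range_is_reducing`) that are assumptions made here, not part of the notes.

```python
# Illustrative check of Proposition 9.37 on H = C^2 (example data chosen here).

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# P is the projection onto L = Span{e1}.
P = [[1, 0], [0, 0]]

# Upper triangular T: L is invariant, but its orthogonal complement is not.
T_upper = [[1, 2], [0, 3]]
# Diagonal T: both L and its orthogonal complement are invariant.
T_diag = [[1, 0], [0, 3]]

def range_is_invariant(T, P):
    """Statement 1: ran P is in Lat T if and only if PTP = TP."""
    return matmul(matmul(P, T), P) == matmul(T, P)

def range_is_reducing(T, P):
    """Statement 2: ran P and ran(1 - P) are in Lat T if and only if PT = TP."""
    return matmul(P, T) == matmul(T, P)

print(range_is_invariant(T_upper, P), range_is_reducing(T_upper, P))
print(range_is_invariant(T_diag, P), range_is_reducing(T_diag, P))
```

The upper triangular matrix satisfies the first equation but not the second, which matches the block-matrix description of invariant subspaces discussed next.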


A projection P whose range is an invariant subspace of T affords a special
matrix representation of T . Suppose that L, M ⊂ H are nonzero subspaces
such that H = L ⊕ M ; that is, M = L⊥ . Further, assume that Tij are the
four operators indicated below:

T11 : L → L , T12 : M → L ,
T21 : L → M , T22 : M → M .

There is a natural way in which a single operator on H can be defined by


these four operators above: simply require that T act on H according to the
equation

T (ξ + η) = (T11 ξ + T12 η) + (T21 ξ + T22 η) , ∀ξ + η ∈ L⊕ M = H .

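By writing ξ + η ∈ L ⊕ M in vector notation as
\[
\begin{pmatrix} \xi \\ \eta \end{pmatrix},
\]
the action of the operator T on vectors ν = ξ + η ∈ H is represented in
matricial form by
\[
\begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix}
\begin{pmatrix} \xi \\ \eta \end{pmatrix}
=
\begin{pmatrix} T_{11}\xi + T_{12}\eta \\ T_{21}\xi + T_{22}\eta \end{pmatrix}.
\]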

If P is the projection with range L and kernel M , then the operators Tij are
recovered from T and P via
P T P = T11 : ran P → ran P ,
P T (1 − P ) = T12 : ran (1 − P ) → ran P ,
(1 − P )T P = T21 : ran P → ran (1 − P ) ,
(1 − P )T (1 − P ) = T22 : ran (1 − P ) → ran (1 − P ) .

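Thus, with respect to the decomposition H = L ⊕ M and the corresponding
matrix representation of T ∈ B(L ⊕ M ), Proposition 9.37 tells us that
\[
L \in \operatorname{Lat} T \quad\text{if and only if}\quad
T = \begin{pmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{pmatrix},
\]
and that
\[
L, M \in \operatorname{Lat} T \quad\text{if and only if}\quad
T = \begin{pmatrix} T_{11} & 0 \\ 0 & T_{22} \end{pmatrix}.
\]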

Definition 9.38. A subspace L ⊂ H is reducing for an operator T ∈ B(H)


if both L and L⊥ are invariant under T .

Proposition 9.39. The following statements are equivalent for an operator


T and projection P on a Hilbert space H:
1. ran P is reducing for T ;
2. (1 − P )T ∗P = (1 − P )T P = 0;
3. T P = P T .

Proof. Exercise 21. 


Observe that Proposition 9.39 tells us that the invariant subspaces of a
hermitian operator are reducing. However, this is not true for normal operators
(Exercise 20). Furthermore, it is not known whether every operator acting
on an infinite-dimensional separable Hilbert space H has invariant subspaces
other than {0} and H. (All operators on finite-dimensional and nonseparable
Hilbert spaces do have invariant subspaces: see Exercise 22.)

9.9 Further Examples

To conclude this chapter we consider examples of some of the concepts intro-


duced here.

The Spectra of Multiplication Operators

Assume that X is a compact Hausdorff space, that µ is a regular, finite Borel
measure on X, and that ψ : X → C is an essentially bounded Borel measurable
function. The multiplication operator Mψ on L²(X, µ) is normal, since
Mψ∗ Mψ = M|ψ|² = Mψ Mψ∗ . Consequently, σ(Mψ ) = σap (Mψ ) by Proposition
9.16. In fact, as shown below, σ(Mψ ) = ess-ran ψ.
To establish this claim, suppose that λ ∉ ess-ran ψ. Thus, there is a
Borel set E ⊂ X such that µ(X \ E) = 0 and λ does not belong to the closure
of ψ(E). Hence, there is a δ > 0 such that |ψ(t) − λ| ≥ δ > 0 for all t ∈ E.
Since 0 = µ(X \ E) = µ(X) − µ(E), ‖f‖² = ∫_E |f|² dµ, for every f ∈ L²(X, µ).
Further, for any f ∈ L²(X, µ),
\[
\|(M_\psi - \lambda 1)f\|^2 = \int_E |\psi - \lambda|^2\, |f|^2 \, d\mu
\;\ge\; \delta^2 \int_E |f|^2 \, d\mu = \delta^2 \|f\|^2 .
\]
Thus, Mψ − λ1 is bounded below, and so λ ∉ σap (Mψ ). As Mψ is normal,
σap (Mψ ) = σ(Mψ ) (Proposition 9.16). Hence, σ(Mψ ) ⊆ ess-ran ψ.
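Conversely, suppose that λ ∈ ess-ran ψ. Choose any ε > 0 and for each
Borel set E ⊂ X for which µ(X \ E) = 0 let FE ⊂ E be the Borel set of X
defined by
FE = ψ⁻¹(Bε (λ)) ∩ E .
Note that each FE is nonempty because λ belongs to the closure of ψ(E). Now let
\[
F = \bigcup_{E \subset X,\ \mu(X \setminus E) = 0} F_E
\qquad\text{and}\qquad f = \chi_F .
\]
Then µ(F ) > 0 (otherwise E = X \ F would be a Borel set with µ(X \ E) = 0
for which FE is empty) and
\[
\|(M_\psi - \lambda 1)f\|^2 = \int_F |\psi(t) - \lambda|^2\, |f(t)|^2 \, d\mu(t)
\;\le\; \varepsilon^2 \|f\|^2 .
\]
Because ε > 0 is arbitrary, Mψ − λ1 is not bounded below, which proves that
λ ∈ σ(Mψ ). Thus, ess-ran ψ ⊆ σ(Mψ ).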

The Square Root of a Positive Operator on C3

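Consider the operator T on the Hilbert space C³ defined by
\[
T = \begin{pmatrix}
1 & \frac{1}{\sqrt{2}} & 0 \\
\frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}} & 1
\end{pmatrix}. \tag{9.7}
\]
We claim that T is positive and
\[
T^{1/2} = \begin{pmatrix}
\frac{1}{2} + \frac{\sqrt{2}}{4} & \frac{1}{2} & -\frac{1}{2} + \frac{\sqrt{2}}{4} \\
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\
-\frac{1}{2} + \frac{\sqrt{2}}{4} & \frac{1}{2} & \frac{1}{2} + \frac{\sqrt{2}}{4}
\end{pmatrix}. \tag{9.8}
\]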

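To show this, note T ∗ = T and σ(T ) = {0, 1, 2}; thus, T is indeed positive.
The normalised eigenvectors φ1 , φ2 , φ3 ∈ C³ corresponding to the eigenvalues
λ1 = 0, λ2 = 1, and λ3 = 2 of T are:
\[
\phi_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -1 \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \qquad
\phi_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad
\phi_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 1 \\ \frac{1}{\sqrt{2}} \end{pmatrix}.
\]
These vectors not only span C³ , but form an orthonormal basis. Therefore,
\[
T = \lambda_1 \phi_1\phi_1^* + \lambda_2 \phi_2\phi_2^* + \lambda_3 \phi_3\phi_3^*
  = \lambda_2 \phi_2\phi_2^* + \lambda_3 \phi_3\phi_3^* .
\]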

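Now perform matrix multiplication to obtain
\[
P_2 = \phi_2\phi_2^* = \begin{pmatrix}
\frac{1}{2} & 0 & -\frac{1}{2} \\
0 & 0 & 0 \\
-\frac{1}{2} & 0 & \frac{1}{2}
\end{pmatrix},
\qquad
P_3 = \phi_3\phi_3^* = \begin{pmatrix}
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\
\frac{1}{2\sqrt{2}} & \frac{1}{2} & \frac{1}{2\sqrt{2}} \\
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4}
\end{pmatrix}.
\]
The operators Pj = φj φj∗ are rank-1 projections with range ker(T − λj 1). Thus
P1 + P2 + P3 = 1 and Pi Pj = Pj Pi = 0 for i ≠ j. Because T = P2 + 2P3 , and
because the positive operator P2 + √2 P3 squares to P2 + 2P3 , we see that
\[
T^{1/2} = P_2 + \sqrt{2}\, P_3 = \begin{pmatrix}
\frac{1}{2} + \frac{\sqrt{2}}{4} & \frac{1}{2} & -\frac{1}{2} + \frac{\sqrt{2}}{4} \\
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\
-\frac{1}{2} + \frac{\sqrt{2}}{4} & \frac{1}{2} & \frac{1}{2} + \frac{\sqrt{2}}{4}
\end{pmatrix},
\]
as claimed in (9.8).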
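The computation above can be double-checked numerically. The sketch below rebuilds P2 and P3 from the eigenvectors, forms P2 + √2 P3, and confirms in floating-point arithmetic that its square is T. The helper names (`matmul`, `outer`) are illustrative choices made here.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def outer(v):
    """Rank-one projection v v^* for a real unit vector v."""
    return [[vi * vj for vj in v] for vi in v]

s = 1 / math.sqrt(2)
T = [[1, s, 0], [s, 1, s], [0, s, 1]]

# Normalised eigenvectors for the eigenvalues 1 and 2 (the eigenvalue 0
# contributes nothing to T or to its square root).
phi2 = [s, 0.0, -s]
phi3 = [0.5, s, 0.5]
P2, P3 = outer(phi2), outer(phi3)

# Candidate square root: sqrt(1) * P2 + sqrt(2) * P3.
root = [[P2[i][j] + math.sqrt(2) * P3[i][j] for j in range(3)]
        for i in range(3)]

square = matmul(root, root)
max_error = max(abs(square[i][j] - T[i][j])
                for i in range(3) for j in range(3))
print(max_error)  # expected to be at the level of rounding error
```

The same recipe, replacing each eigenvalue by its square root in the spectral decomposition, applies to any positive matrix once its eigenvectors are known.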

The Invariant Subspaces of a Unilateral Shift on Cn

On the Hilbert space Cⁿ let S be the operator
\[
S = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & & & 0 & 1 \\
0 & \cdots & \cdots & 0 & 0
\end{pmatrix}.
\]
If e1 , . . . , en denote the standard basis vectors for Cⁿ , then

Lat S = {{0}, Span{e1 }, Span{e1 , e2 }, . . . , Span{e1 , . . . , ek }, . . . , Cⁿ } .



To verify this, first observe that the subspaces Span{e1 , . . . , ek }, for 1 ≤ k ≤ n,
are plainly invariant under S, because S e1 = 0 and S ej = ej−1 for j ≥ 2.
Conversely, suppose that L ∈ Lat S is an arbitrary nonzero subspace. Each
nonzero vector v ∈ L is expressed uniquely as a linear combination
v = α1 e1 + · · · + αn en of the standard basis vectors, where at least one
coefficient αj is nonzero. With respect to this representation of v,
let l(v) be the maximum j for which αj ≠ 0 and let k be the maximum of all
l(v) as v ranges through all nonzero vectors in L. Then clearly

L ⊆ Span{e1 , . . . , ek } .

Now let v ∈ L be such that l(v) = k; so, v = α1 e1 + · · · + αk ek and αk ≠ 0.
Note that S^m v ∈ L for all integers m ≥ 0 and that Sⁿ = 0. When m = k − 1,
the result is S^{k−1} v = αk e1 . Thus,
e1 = αk⁻¹ S^{k−1} v ∈ L .
Now with m = k − 2 we have S^{k−2} v = αk e2 + αk−1 e1 , and so

e2 = αk⁻¹ (S^{k−2} v − αk−1 e1 ) ∈ L .

Inductively, the same type of argument reveals that for any m such that
1 ≤ m ≤ k,
\[
e_m = \alpha_k^{-1} \Big( S^{k-m} v - \sum_{j=1}^{m-1} \alpha_{k-m+j}\, e_j \Big) \in L .
\]
In particular, this proves that Span{e1 , . . . , ek } ⊆ L.
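The inductive step can be followed on a concrete vector. The sketch below works in C⁴ with exact rational arithmetic and recovers e1, ..., ek from a single vector v with l(v) = k, using only powers of S and linear combinations. The function names (`shift`, `recover_basis`) and the sample vector are assumptions made here for illustration.

```python
from fractions import Fraction

def shift(v):
    """Apply S: S e_1 = 0 and S e_j = e_{j-1}, so (Sv)_i = v_{i+1}."""
    return v[1:] + [Fraction(0)]

def recover_basis(v):
    """Return e_1, ..., e_k built from powers of S applied to v, where k = l(v)."""
    k = max(j + 1 for j, a in enumerate(v) if a != 0)   # l(v)
    alpha_k = v[k - 1]
    basis = []
    for m in range(1, k + 1):
        w = list(v)
        for _ in range(k - m):          # w = S^{k-m} v
            w = shift(w)
        for j in range(1, m):           # subtract alpha_{k-m+j} e_j
            w[j - 1] -= v[k - m + j - 1]
        basis.append([entry / alpha_k for entry in w])
    return basis

# Sample vector in C^4 with l(v) = 3.
v = [Fraction(2), Fraction(-1), Fraction(3), Fraction(0)]
basis = recover_basis(v)
for e in basis:
    print([int(entry) for entry in e])
```

Every vector produced lies in any S-invariant subspace containing v, which is the content of the inclusion Span{e1, ..., ek} ⊆ L.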

The Invariant Subspaces of the Volterra Operator

The Volterra operator is the integral operator K on L²([0, 1], m) defined by
\[
Kf(t) = \int_0^t f(s)\, dm(s) , \qquad f \in L^2 .
\]
The operator K indeed belongs to the class of integral operators described
earlier, for its kernel function κ : [0, 1] × [0, 1] → C is given by κ(t, s) = 1 for
0 ≤ s ≤ t and κ(t, s) = 0 for 1 ≥ s > t ≥ 0.
For each α ∈ (0, 1) let
\[
L_\alpha = \Big\{ f \in L^2 \;\Big|\; \int_{[0,\alpha]} |f|^2 \, dm = 0 \Big\} .
\]

Each Lα is invariant under K (Exercise 24). In fact, a nontrivial subspace L


is invariant under K if and only if L = Lα for some α ∈ (0, 1) [10, Theorem
4.14].

9.10 Exercises

1. Prove that T, S ∈ B(H) are equal if and only if ⟨T ξ, ξ⟩ = ⟨Sξ, ξ⟩, for all
ξ ∈ H.
2. Prove that the closure of ran T is (ker T ∗ )⊥ , for every T ∈ B(H).
3. For T ∈ B(H), prove that λ ∈ σd (T ) if and only if λ̄ ∈ σp (T ∗ ).
4. Assume that T ∈ B(H) is an operator of rank m ∈ N.
a) Prove that if m = 1, then there are unit vectors γ, η ∈ H such that
T ξ = ⟨ξ, γ⟩ η, for all ξ ∈ H.
b) Prove that the rank of T ∗ is m.
5. Prove that if {φk }, k = 0, 1, 2, . . . , is the canonical orthonormal basis of
the Hardy space H²(T) and if S ∈ B(H²(T)) is the unilateral shift operator,
then S ∗ satisfies S ∗ φk = φk−1 , for all k ∈ N, and S ∗ φ0 = 0.
6. With respect to the canonical orthonormal basis {φk }, k = 0, 1, 2, . . . , of
the Hardy space H²(T), find the matrix representation S of the unilateral
shift operator S ∈ B(H²(T)). Viewing S as acting on ℓ²(N ∪ {0}), determine
the action of the matrix S on a vector ξ ∈ ℓ²(N ∪ {0}).
7. Prove that if T ∈ B(H) is hermitian, then
\[
\|T\| = \sup_{\|\xi\| = 1} |\langle T\xi, \xi\rangle| .
\]

8. Assume that T ∈ B(H) is hermitian and let
\[
m_\ell = \inf_{\|\xi\| = 1} \langle T\xi, \xi\rangle
\qquad\text{and}\qquad
m_u = \sup_{\|\xi\| = 1} \langle T\xi, \xi\rangle .
\]
Complete the proof of Proposition 9.12 by proving the following statements.
a) mu ∈ σ(T ).
b) σ(T ) ⊆ (−∞, mu ].
c) σ(T ) ⊆ [mℓ , mu ].
9. Prove that the following statements are equivalent for T ∈ B(H):
a) T is normal;
b) ‖T ∗ ξ‖ = ‖T ξ‖, for all ξ ∈ H;
c) T ∗ T = T T ∗ .
10. Let
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \in B(\mathbb{C}^2) .
\]
a) Prove that σ(T ) = {0}.
b) Prove that there are no operators R ∈ B(C2 ) such that R2 = T .
11. Prove that the following statements are equivalent for an operator V ∈
B(H):
a) V is an isometry;
b) ⟨V ξ, V η⟩ = ⟨ξ, η⟩ for all ξ, η ∈ H;
c) V ∗ V = 1.
12. Prove that if T ∈ B(H) is invertible, then T = U |T | for some unitary
operator U ∈ B(H).
13. Prove that every projection is a positive operator.
14. Let P ∈ B(H) be a projection other than 0 or 1. Compute σ(P ) and kP k.
15. Assume that V ∈ B(H).
a) If V is a partial isometry, prove that V ∗ V and V V ∗ are projections
and determine their ranges.
b) If V ∈ B(H) is an operator for which V ∗ V and V V ∗ are projections,
then prove that V is a partial isometry and determine the initial and
final spaces of V .
16. Suppose that T1 , T2 ∈ B(H) are hermitian operators such that T1 T2 =
T2 T1 = 0. Prove that ⟨ξ, η⟩ = 0 for all ξ ∈ ran T1 and η ∈ ran T2 .
17. Prove that Lat T is a lattice, for every T ∈ B(H).
18. Assume that T ∈ B(H) and λ ∈ C. Prove that ker(T − λ1) and the closure
of ran(T − λ1) both belong to Lat T .

19. Prove that, for every T ∈ B(H),

M ∈ Lat T if and only if M ⊥ ∈ Lat T ∗ .

20. Let B denote the bilateral shift operator on L²(T). Prove that the Hardy
space H²(T) is invariant under B and that, with respect to the decomposition
L²(T) = H²(T) ⊕ H²(T)⊥ , B is represented by an operator matrix
of the form
\[
\begin{pmatrix} S & B_{12} \\ 0 & B_{22} \end{pmatrix},
\]
where S is the unilateral shift operator on H²(T) and B12 ≠ 0.
21. Prove that the following statements are equivalent for an operator T and
projection P on a Hilbert space H:
a) ran P is reducing for T ;
b) (1 − P )T ∗ P = (1 − P )T P = 0;
c) T P = P T .
22. A subspace L ⊂ H is said to be nontrivial if L is neither {0} nor H.
Suppose that T ∈ B(H).
a) Prove that if H has finite dimension, then T has a nontrivial invariant
subspace. (Hint: think about eigenvectors.)
b) Prove that if H is nonseparable, then T has a nontrivial invariant
subspace. (Hint: if ξ ∈ H is nonzero, consider the subspace generated
by T k ξ for k ∈ N.)
23. Let ψ : [0, 1] → [0, 1] be given by ψ(t) = t and consider the (hermitian)
multiplication operator Mψ on L2 ([0, 1], m).

a) Prove that Mψ has no eigenvalues.


b) Prove that Mψ has no finite-dimensional invariant subspaces.
c) Find one nontrivial subspace of L2 ([0, 1], m) that is invariant under
Mψ .
24. Consider the integral operator K on L²([0, 1], m) defined by
\[
Kf(t) = \int_0^t f(s)\, dm(s) , \qquad f \in L^2 .
\]
For each α ∈ (0, 1) let
\[
L_\alpha = \Big\{ f \in L^2 \;\Big|\; \int_{[0,\alpha]} |f|^2 \, dm = 0 \Big\} .
\]
Prove that Lα is invariant under K for every α ∈ (0, 1).


25. Let S be the unilateral shift operator on ℓ²(N).
a) Prove that if T ∈ B(ℓ²(N)) satisfies T S = ST and T S ∗ = S ∗ T , then
T = λ1 for some λ ∈ C.
b) Prove that the only subspaces L ⊆ ℓ²(N) that are invariant under
both S and S ∗ are L = {0} and L = ℓ²(N).
26. Assume that N ∈ B(H) is normal.
a) Prove that ker(N − λ1 1) ⊥ ker(N − λ2 1), for all distinct λ1 , λ2 ∈ σp (N ).
b) If λ1 , λ2 ∈ σap (N ) are distinct, and if ξn , ηn ∈ H are unit vectors for
which
\[
\lim_n \|(N - \lambda_1 1)\xi_n\| = \lim_n \|(N - \lambda_2 1)\eta_n\| = 0 ,
\]
then prove that
\[
\lim_n \langle \xi_n, \eta_n \rangle = 0 .
\]
(Suggestion: Consider (λ1 − λ2 )⟨ξn , ηn ⟩.)


10
Compact Operators on Hilbert Spaces

To be completed ...
Part III

Algebras
11
Banach Algebras

Some Banach spaces, such as B(V ) and C0 (X), carry the structure of a
normed algebra in addition to that of a normed vector space. The abstract
properties of such algebras are studied in this chapter. Examples of Banach
algebras are examined in the following chapter.

11.1 Banach Algebra Definition

An associative algebra, or simply an algebra, over the field C of complex num-


bers is a vector space A such that
(i) A is an associative ring, and
(ii) (α a)(b) = a(α b) = α(ab), for all a, b ∈ A and α ∈ C.
Furthermore,
(iii) if ab = ba, for all a, b ∈ A, then A is called a commutative (or abelian)
algebra, and
(iv) if there is an element 1 ∈ A such that a 1 = 1 a = a, for every a ∈ A, then
A is said to be an algebra with identity.
If A is a unital algebra, then the same notation, namely 1, is used to denote
the multiplicative identity of the field C and the multiplicative identity of the
algebra A. Normally this should not cause confusion.
Because an algebra A is an associative ring, the following rules for multi-
plication hold:

a(b + c) = ab + ac (left distributive law) ,


(a + b)c = ac + bc (right distributive law) ,
a(bc) = (ab)c (associative law) .

Definition 11.1. An algebra A over C is a Banach algebra if there is a norm
‖ · ‖ on A such that
1. A is a Banach space with respect to this norm, and
2. ‖xy‖ ≤ ‖x‖ ‖y‖, for all x, y ∈ A.

The second property above asserts that the norm on A is submultiplicative.

Definition 11.2. If A is a Banach algebra with identity 1 ∈ A, and if ‖1‖ = 1,
then A is called a unital Banach algebra.

In this study, all algebras with identity that arise will satisfy ‖1‖ = 1.

Definition 11.3. If A and B are Banach algebras, then a homomorphism


from A to B is a function ρ : A → B such that, for all x, y ∈ A and α, β ∈ C,
1. ρ(αx + βy) = α ρ(x) + β ρ(y), and
2. ρ(xy) = ρ(x) ρ(y).

A homomorphism ρ : A → B is bounded if there is an M > 0 such that
‖ρ(x)‖ ≤ M ‖x‖, for all x ∈ A. If a bounded homomorphism satisfies ‖ρ(x)‖ =
‖x‖, for every x ∈ A, then ρ is said to be an isometric homomorphism.

11.2 Invertible Elements and Spectra

Definition 11.4. An element x ∈ A in a unital algebra A is invertible if
there is a y ∈ A such that xy = yx = 1.

If x has an inverse in a unital algebra, then it is unique (Exercise 2). Thus,
we denote the inverse of x by x⁻¹ . The set

GL(A) = {x ∈ A | x is invertible in A}

is a multiplicative group, called the general linear group of A.

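Lemma 11.5. Assume that A is a unital Banach algebra and that x ∈ A. If
λ ∈ C satisfies |λ| > ‖x‖, then (1 − λ⁻¹ x) ∈ GL(A) and
\[
\Big( 1 - \frac{1}{\lambda} x \Big)^{-1} = \sum_{k=0}^{\infty} \lambda^{-k} x^k ,
\qquad\text{and}\qquad
\Big\| \Big( 1 - \frac{1}{\lambda} x \Big)^{-1} \Big\| \le \frac{|\lambda|}{|\lambda| - \|x\|} .
\]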

Proof. The proof is identical to that of Lemma 7.16. 
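Lemma 11.5 can be observed numerically in the Banach algebra of 2 × 2 matrices. The sketch below is an illustration under assumptions made here: the norm is the maximum absolute row sum (a submultiplicative norm for which the identity has norm 1), and the matrix x and scalar λ are arbitrary sample data.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(a):
    """Maximum absolute row sum: a submultiplicative norm with norm(1) = 1."""
    return max(sum(abs(entry) for entry in row) for row in a)

identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[0.0, 1.0], [0.5, 0.0]]   # sample element with norm 1
lam = 2.0                      # |lam| > norm of x

# Partial sum of the series sum_k lam^{-k} x^k.
total = [[0.0, 0.0], [0.0, 0.0]]
term = identity
for _ in range(60):
    total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    term = [[entry / lam for entry in row] for row in matmul(term, x)]

# Check that the partial sum inverts 1 - x/lam, up to rounding.
y = [[identity[i][j] - x[i][j] / lam for j in range(2)] for i in range(2)]
product = matmul(y, total)
residual = norm_inf([[product[i][j] - identity[i][j] for j in range(2)]
                     for i in range(2)])

bound = abs(lam) / (abs(lam) - norm_inf(x))
print(residual, norm_inf(total), bound)
```

The printed norm of the computed inverse stays below the bound |λ|/(|λ| − ‖x‖) from the lemma.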

Lemma 11.6. Assume that A is a unital Banach algebra and that x ∈ GL(A).
If h ∈ A satisfies ‖h‖ < ‖x⁻¹‖⁻¹ , then x + h ∈ GL(A) and
\[
\|(x+h)^{-1} - x^{-1} + x^{-1}hx^{-1}\| \le
\frac{\|h\|^2\, \|x^{-1}\|^2}{\|x^{-1}\|^{-1} - \|h\|} .
\]

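Proof. If ‖h‖ < ‖x⁻¹‖⁻¹ , then ‖x⁻¹h‖ < 1 and so Lemma 11.5 implies that
(1 + x⁻¹h) ∈ GL(A) and
\[
(1 + x^{-1}h)^{-1} = \sum_{k=0}^{\infty} (-1)^k (x^{-1}h)^k .
\]
Because x + h = x(1 + x⁻¹h), we conclude that x + h is invertible and that
(x + h)⁻¹ = (1 + x⁻¹h)⁻¹ x⁻¹ .
Next, observe that
\[
(x+h)^{-1} - x^{-1} + x^{-1}hx^{-1}
= \big( (1 + x^{-1}h)^{-1} - 1 + x^{-1}h \big)\, x^{-1}
\]
and
\[
\big( (1 + x^{-1}h)^{-1} - 1 + x^{-1}h \big)\, x^{-1}
= \Big( \sum_{k=2}^{\infty} (-1)^k (x^{-1}h)^k \Big)\, x^{-1} .
\]
Thus,
\[
\|(x+h)^{-1} - x^{-1} + x^{-1}hx^{-1}\|
\le \Big( \sum_{k=2}^{\infty} \|x^{-1}h\|^k \Big) \|x^{-1}\|
\le \frac{\|h\|^2\, \|x^{-1}\|^2}{\|x^{-1}\|^{-1} - \|h\|} ,
\]
which completes the proof.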
Corollary 11.7. GL(A) is an open set in the norm topology of A.

The map GL(A) → GL(A) that sends x ↦ x⁻¹ is of course a bijection.
The next result shows that the map is in fact a homeomorphism.

Proposition 11.8. The map i : GL(A) → GL(A), defined by i(x) = x⁻¹ , is
a continuous bijection such that (xy)⁻¹ = y⁻¹ x⁻¹ , for all x, y ∈ GL(A).
Proof. Exercise 3. 

Definition 11.9. If A is a unital Banach algebra and if x ∈ A, then the


spectrum of x is the set σ(x) of all λ ∈ C for which the element x − λ1 has
no inverse in A.

The remaining results below are algebraic versions of theorems in §7.5 that
were proved for Banach space operators. The proofs of these results in fact
did not use in any way the fact that the operators were acting on a Banach
space; rather, the proofs depended only upon analytic and algebraic features
of (the unital Banach algebra) B(V ). For this reason, the proofs of theorems
below are exactly the same as the proofs of the corresponding results in §7.5.
Lemma 11.10. Assume that A is a unital Banach algebra and that x ∈ A.

1. If λ0 ∉ σ(x), then (x − λ1)⁻¹ exists for all λ ∈ C for which |λ − λ0 | <
‖(x − λ0 1)⁻¹‖⁻¹ .
2. Suppose that ϕ ∈ A∗ and λ0 ∉ σ(x). There is an ε > 0 such that if
Ω = {λ ∈ C | |λ − λ0 | < ε}, then Ω ∩ σ(x) = ∅ and the function f : Ω → C
defined by f(λ) = ϕ((x − λ1)⁻¹) is differentiable at λ0 .
Theorem 11.11. If A is a unital Banach algebra and if x ∈ A, then σ(x) is
a nonempty compact subset of {ζ ∈ C | |ζ| ≤ ‖x‖}.
Definition 11.12. Assume that A is a unital Banach algebra. The spectral
radius of x ∈ A is the quantity spr x defined by
\[
\operatorname{spr} x = \max_{\lambda \in \sigma(x)} |\lambda| .
\]
Theorem 11.13. If A is a unital Banach algebra and if x ∈ A, then the limit
of ‖xⁿ‖^{1/n} as n → ∞ exists and
\[
\operatorname{spr} x = \lim_{n} \|x^n\|^{1/n} . \tag{11.1}
\]
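Formula (11.1) can be watched converging in a small example. The sketch below is an illustration under assumptions made here: the algebra is that of 2 × 2 matrices with the maximum absolute row sum norm, and the sample matrix x has the single eigenvalue 0.5, so spr x = 0.5 although ‖x‖ = 1.5. The function name `root_norm` is hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(a):
    """Maximum absolute row sum: a submultiplicative norm with norm(1) = 1."""
    return max(sum(abs(entry) for entry in row) for row in a)

def root_norm(x, n):
    """Return the n-th root of the norm of x^n, for n >= 1."""
    power = x
    for _ in range(n - 1):
        power = matmul(power, x)
    return norm_inf(power) ** (1.0 / n)

# Upper triangular sample matrix: its spectrum is {0.5}.
x = [[0.5, 1.0], [0.0, 0.5]]

for n in (1, 10, 100, 400):
    print(n, root_norm(x, n))
```

The values decrease from ‖x‖ = 1.5 towards 0.5, slowly, because the off-diagonal entry of xⁿ grows like n · 0.5ⁿ⁻¹.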

Corollary 11.14. If x ∈ A satisfies ‖y²‖ = ‖y‖² for every power y = x^{2^k},
k ≥ 0 (in particular, if ‖y²‖ = ‖y‖² for every y ∈ A), then ‖x‖ = spr x.

Proof. The hypothesis implies, by induction, that ‖x^{2^k}‖ = ‖x‖^{2^k} for every
k ∈ N. Hence
\[
\lim_{k} \|x^{2^k}\|^{1/2^k} = \|x\| .
\]
Hence, by Theorem 11.13, ‖x‖ = spr x.


If A is a unital Banach algebra and if f is a polynomial, say
\[
f(t) = \sum_{j=0}^{n} \alpha_j t^j ,
\]
then for any x ∈ A we define
\[
f(x) = \sum_{j=0}^{n} \alpha_j x^j , \qquad\text{where } x^0 = 1 \in A .
\]
The map that sends a polynomial f to f(x) ∈ A is called the polynomial
functional calculus for x. This functional calculus has the following
properties: for any x ∈ A, α ∈ C, and polynomials f, g ∈ C [t],
1. (αf )(x) = α f(x),
2. (f + g)(x) = f(x) + g(x), and
3. (fg)(x) = f(x)g(x).
Theorem 11.15. (Polynomial Spectral Mapping Theorem) If A is a unital
Banach algebra and x ∈ A, then

f (σ(x)) = σ(f(x))

for every complex polynomial f.


Proposition 11.16. In a unital Banach algebra A, σ(xy)∪{0} = σ(yx)∪{0},
for all x, y ∈ A.

The following theorem is inspired by the fact that the only complex
finite-dimensional associative algebra that is also a division algebra is the
1-dimensional algebra (and field) C.

Theorem 11.17. If A is a unital Banach algebra with the property that x is


invertible, for every nonzero x ∈ A, then there is an isometric homomorphism
ρ : A → C.

Proof. Assume x ∈ A is nonzero. As σ(x) ≠ ∅, there is at least one λ ∈ C
such that x − λ1 is singular. Because 0 is the only noninvertible element of
A, x − λ1 = 0, whence x = λ1 and σ(x) = {λ}. Therefore, each x ∈ A is
determined uniquely by its spectrum.
For any x = λ1, ‖x²‖ = |λ²| = |λ|² = ‖x‖² . Therefore, by Corollary
11.14, the map ρ : A → C defined by ρ(x) = λ, where {λ} = σ(x), satisfies
|ρ(x)| = ‖x‖. One now need only verify that ρ is an algebra homomorphism.
But this is trivial, as each element of A is a scalar multiple of the identity.

11.3 Subalgebras

If B is a Banach algebra, then a subset A ⊆ B is a Banach subalgebra of B


if A is a Banach algebra with respect to the sum, product, and norm of B.
If B is unital and if the multiplicative identity 1 ∈ B belongs to a Banach
subalgebra A of B, then A is said to be a unital Banach subalgebra—or, more
simply, a subalgebra—of B.
If F ⊂ A is a subset of a Banach algebra A, then the Banach algebra
generated by F is the smallest Banach subalgebra of A that contains F and
is denoted by Alg(F )− . If F = {x}, for a single x ∈ A, then Alg{x}− is the
norm-closure in A of the subring of A given by elements of the form
\[
\sum_{j=1}^{n} \alpha_j x^j , \qquad\text{where } n \in \mathbb{N},\ \alpha_j \in \mathbb{C} .
\]

In particular, if A is unital, then every element in the unital Banach subalgebra


Alg{x, 1} is the limit of a sequence {fn (x)}n∈N , where fn is a polynomial over
C.

11.4 Ideals and Quotients

Definition 11.18. A subset J of a Banach algebra A is an algebraic ideal of


A if, for all a ∈ A and x, y ∈ J,
1. x − y ∈ J,
2. ax ∈ J and xa ∈ J.

If an algebraic ideal J of A is also norm closed in A, then J is said to be an


ideal of A.

The trivial ideals of A are those of the form J = {0} and J = A.

Definition 11.19. A Banach algebra A is simple if {xy | x, y ∈ A} ≠ {0} and
the only ideals of A are the trivial ideals.

If J is an ideal of a Banach algebra A, then the quotient space A/J is a


Banach space with respect to the quotient norm (Proposition 1.13).

11.5 Exercises

1. Prove that multiplication is continuous in a Banach algebra A. That is,


if {xn }n and {yn }n are convergent sequences in A with limits x and y
respectively, then ‖xn yn − xy‖ → 0.
2. Assume that R is a unital ring and that x ∈ R. Prove the following
statements.
a) If there exist y, z ∈ R such that xy = 1 and zx = 1, then x is invertible.
b) If there exist y, y′ ∈ R such that xy = yx = 1 and xy′ = y′x = 1, then
y′ = y.
3. Prove that inversion is continuous in a unital Banach algebra A. That
is, prove that the map i : GL(A) → GL(A), defined by i(x) = x−1 , is a
continuous bijection.
12
Banach Algebras in Analysis

Some Banach spaces, such as B(V ), carry the structure of a normed algebra
in addition to that of a normed vector space. The abstract properties of such
spaces are studied in this chapter.

12.1 Uniform Algebras

12.2 The Disc Algebra

12.3 Absolutely Convergent Trigonometric Series

12.4 Harmonic Analysis

12.5 Complex Analysis

12.6 Exercises
13
C∗-algebras

If H is a Hilbert space, then the Banach algebra B(H) differs from algebras
of the form B(V ), where V is a Banach space, by the fact that the adjoint
operator T ∗ is also an operator on H. Therefore, B(H) is an algebra with
involution T ↦ T ∗ . Banach algebras A with involution have a special theory;
an even richer theory arises if the involution satisfies ‖x∗ x‖ = ‖x‖² , for all
x ∈ A. Such algebras are called C∗ -algebras.

13.1 C∗-algebra Definitions


A complex normed algebra A with involution x ↦ x∗ is called a C∗-algebra if
1. A is a Banach algebra, and
2. ‖x∗ x‖ = ‖x‖² , for every x ∈ A.
An immediate consequence of the “C∗ -norm axiom” ‖x∗ x‖ = ‖x‖² is that
‖x∗‖ = ‖x‖, for every x ∈ A. That is, the involution on a C∗ -algebra is an
isometry. To prove this, note that

‖x‖² = ‖x∗ x‖ ≤ ‖x∗‖ ‖x‖ and ‖x∗‖² = ‖x∗∗ x∗‖ ≤ ‖x∗∗‖ ‖x∗‖ = ‖x‖ ‖x∗‖ ,

and so ‖x‖ ≤ ‖x∗‖ and ‖x∗‖ ≤ ‖x‖.


If A is a C∗ -algebra with multiplicative identity 1 ∈ A, then ‖1‖ = 1 by
an argument that is similar to the one above. Thus, C∗ -algebras with identity
are unital Banach algebras. We shall call them unital C∗ -algebras.
Furthermore, if x ∈ A, then

1∗ x = (1∗ x)∗∗ = (x∗ 1)∗ = x∗∗ = x .

Likewise, x1∗ = x. By the uniqueness of the multiplicative identity in a unital


ring, 1∗ = 1.

Definition 13.1. Assume that A is a C∗ -algebra and x ∈ A.



1. If x∗ = x, then x is said to be hermitian.


2. If x∗x = xx∗, then x is normal.
3. If A is unital and if x∗ x = xx∗ = 1, then x is unitary.

As we noted above, if A is unital, then 1 is hermitian.

Definition 13.2. The set of hermitian elements in a C∗ -algebra A is denoted


by Asa .

The real and imaginary parts of x ∈ A are
\[
\Re x = \frac{1}{2}(x + x^*) \qquad\text{and}\qquad \Im x = \frac{1}{2i}(x - x^*) .
\]
Hence:

Proposition 13.3. Asa is a real vector space and SpanC Asa = A.

In the category of C∗ -algebras, the natural maps between C∗ -algebras are
called ∗-homomorphisms.

Definition 13.4. If A and B are C∗ -algebras, then a ∗-homomorphism from


A to B is a map ρ : A → B such that, for all x, y ∈ A and α, β ∈ C,
1. ρ(αx + βy) = α ρ(x) + β ρ(y),
2. ρ(xy) = ρ(x) ρ(y), and
3. ρ(x∗ ) = ρ(x)∗ .

Definition 13.5. A ∗-homomorphism ρ : A → B of C∗-algebras is isometric if
‖ρ(x)‖ = ‖x‖ for every x ∈ A. A bijective ∗-homomorphism is called a
∗-isomorphism.

13.2 Adjoining a Unit to a Nonunital C∗-algebra

Assume that A is a nonunital C∗ -algebra. Let A1 denote the complex involu-


tive algebra
A1 = A × C = {(a, α) | a ∈ A, α ∈ C} ,
where the involution and the vector space operations on A1 are defined
through the involution and the vector space operations in each coordinate
and where the multiplication is given by

(a, α) · (b, β) = (ab + αb + βa, αβ) .

The algebra A1 is unital: its multiplicative identity is (0, 1). Identify A with
{(a, 0) | a ∈ A}, which is a subalgebra of A1 . As vector spaces, A1 /A ≅ C, and
so A has codimension 1 in A1 . Moreover, if z ∈ A1 and a ∈ A, then za ∈ A.
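The claims in this paragraph follow directly from the multiplication rule on A1. As a short check, written out here for convenience, with (a, α) and (b, β) arbitrary elements of A1:

```latex
\[
(a,\alpha)\cdot(0,1) = (a\cdot 0 + \alpha\cdot 0 + 1\cdot a,\ \alpha\cdot 1) = (a,\alpha),
\qquad
(0,1)\cdot(a,\alpha) = (0\cdot a + 1\cdot a + \alpha\cdot 0,\ 1\cdot\alpha) = (a,\alpha),
\]
% so (0,1) is a two-sided identity of A_1. Multiplying into A:
\[
(b,\beta)\cdot(a,0) = (ba + \beta a,\ 0) \in A,
\qquad
(a,0)\cdot(b,\beta) = (ab + \beta a,\ 0) \in A .
\]
```

The second line shows that za ∈ A and az ∈ A whenever z ∈ A1 and a ∈ A.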

Theorem 13.6. If A is a nonunital C∗ -algebra and if A1 denotes the minimal
unitisation of A, then the formula

‖z‖′ = sup{‖zb‖ | b ∈ A, ‖b‖ ≤ 1} , ∀ z ∈ A1 , (13.1)

defines a norm on A1 such that:
1. A1 is a C∗ -algebra with respect to ‖ · ‖′ , and
2. ‖a‖′ = ‖a‖, for every a ∈ A.
Proof. The first step will be to show that equation (13.1) defines a
submultiplicative norm on A1 . The proof that, for all z, z1 , z2 ∈ A1 and α ∈ C,
1. ‖αz‖′ = |α| ‖z‖′ ,
2. ‖z1 + z2 ‖′ ≤ ‖z1 ‖′ + ‖z2 ‖′ , and
3. ‖z1 z2 ‖′ ≤ ‖z1 ‖′ ‖z2 ‖′
is left as an exercise. Furthermore, ‖z ∗‖′ = ‖z‖′ (see Exercise 1). What remains
is to prove that ‖z‖′ = 0 only if z = 0.
To this end, suppose that z ∈ A1 satisfies ‖z‖′ = 0. If z ∈ A, then ‖z‖′ = 0
implies that ‖zb‖ = 0 for every b ∈ A. In particular, ‖zz ∗‖ = 0, whence
‖z ∗‖ = 0. Since the involution on A is an isometry, ‖z‖ = 0. This proves that
z = 0 if z ∈ A and ‖z‖′ = 0.
Next, consider the possibility that z ≠ 0 yet ‖z‖′ = 0. The paragraph
above shows that z ∉ A (for otherwise z would be 0). Thus, z = (a, λ) for
some a ∈ A and nonzero λ ∈ C. The hypothesis ‖z‖′ = 0 again implies that
z · b = 0 for all b ∈ A; that is, ab + λb = 0 for every b ∈ A. Hence, −λ⁻¹ a is a
left multiplicative identity for A. By passing to adjoints, (−λ⁻¹ a)∗ is a right
multiplicative identity of A. Thus,

(−λ⁻¹ a)∗ = (−λ⁻¹ a)(−λ⁻¹ a)∗ = −λ⁻¹ a .

In other words, −λ⁻¹ a is a multiplicative identity for A, which is in
contradiction to the hypothesis that A is a nonunital algebra. Therefore, it must
be that ‖z‖′ = 0 only if z = 0. This completes the proof that ‖ · ‖′ is a
submultiplicative norm on A1 .
Suppose that a ∈ A. For every b ∈ A with ‖b‖ ≤ 1, ‖ab‖ ≤ ‖a‖ ‖b‖ ≤ ‖a‖;
thus, ‖a‖′ ≤ ‖a‖. On the other hand, if a is normalised so as to have norm
‖a‖ = 1, then ‖a‖′ ≥ ‖aa∗‖ = ‖a∗‖² = ‖a‖² = 1 = ‖a‖. This proves that
‖a‖′ = ‖a‖, for every a ∈ A. This fact together with ‖(0, λ)‖′ = |λ|, for all
λ ∈ C, shows that A × C is a Banach algebra (Exercise 2).
The proof of the theorem will be complete once it is shown that the norm
‖ · ‖′ has the C∗ -norm property. To this end, let z ∈ A1 and b ∈ A. Because
A is an algebraic ideal of A1 , zb ∈ A; thus,

‖zb‖² = ‖(zb)∗ (zb)‖ = ‖b∗ (z ∗ z)b‖ = ‖b∗ (z ∗ z)b‖′ ≤ ‖b‖² ‖z ∗ z‖′ . (13.2)

To show that ‖z ∗ z‖′ ≥ (‖z‖′ )² , note that for each ε > 0 there is a b ∈ A with
‖b‖ ≤ 1 such that ‖zb‖ > (1 − ε)‖z‖′ . Thus, ‖z ∗ z‖′ > (1 − ε)² (‖z‖′ )² by (13.2).
As ε > 0 is arbitrary, the inequality ‖z ∗ z‖′ ≥ (‖z‖′ )² must hold. Conversely,
because ‖ · ‖′ is submultiplicative and ∗ is an isometry on A1 with respect to
‖ · ‖′ , the inequality ‖z ∗ z‖′ ≤ ‖z ∗‖′ ‖z‖′ = (‖z‖′ )² leads to the conclusion that
‖z ∗ z‖′ = (‖z‖′ )² .

Definition 13.7. If A is a nonunital C∗ -algebra, then the C∗ -algebra A1 is


called the minimal unitisation of A.

The adjective “minimal” is justified by the fact that A has codimension 1


in A1 .
Notational Convention. The norm ‖ · ‖′ of (13.1) on A1 is denoted by ‖ · ‖.

13.3 Gelfand Theory for Unital Abelian C∗-algebras

Theorem 13.8. If A is a unital, abelian C∗ -algebra, then the Gelfand trans-


form Γ : A → C (sp (A)) is an isometric ∗-homomorphism of A onto
C (sp (A)).

Proof. By Theorem ??, the Gelfand transform Γ is an isometry if ‖x²‖ = ‖x‖² ,
for each x ∈ A. To see that this indeed holds for A, choose x ∈ A and note
that, because A is abelian, xx∗ = x∗ x; thus,

‖x²‖² = ‖(x²)∗ (x²)‖ = ‖(x∗ x)∗ (x∗ x)‖ = ‖x∗ x‖² = (‖x‖²)² .

Hence, ‖Γ (x)‖ = ‖x‖, for every x ∈ A.


For each z ∈ A, the series
\[
\sum_{n=0}^{\infty} \frac{1}{n!}\, z^n
\]
converges in A; denote the limit by e^z . Since A is abelian, a calculation with
the series expansions shows that e^{z1 + z2} = e^{z1} e^{z2} , for all z1 , z2 ∈ A (Exercise 3).
The next step is to show that Γ(h) is a real-valued function for every
hermitian h ∈ A. To this end, fix a hermitian h ∈ A and select ω ∈ sp(A). The
previous paragraph demonstrates that e^{−iθh} e^{iθh} = 1 for every θ ∈ R.
Because (e^{iθh})∗ = e^{−iθh}, each e^{iθh} is unitary and therefore has norm 1.
Hence, by the continuity of the Banach algebra homomorphism ω (which has
norm 1), in C we have the inequality

\[ |e^{i\theta\omega(h)}| = |\omega(e^{i\theta h})| \le \|e^{i\theta h}\| = 1 . \]

Writing ω(h) = α + iβ with α, β ∈ R, the left-hand side equals e^{−θβ}. As the
inequality is true for every θ ∈ R, β = 0; that is, ω(h) must be a real number.
Thus, the continuous function Γ(h) on sp(A) is real-valued for every
hermitian h ∈ A.
Now suppose that x ∈ A is arbitrary. Write x = h + ig, where h, g ∈ A are
hermitian. Thus, x∗ = h − ig and so

\[ \Gamma(x^*) = \Gamma(h) - i\Gamma(g) = \bigl(\Gamma(h) + i\Gamma(g)\bigr)^* = \bigl(\Gamma(x)\bigr)^* . \]

This proves that the homomorphism Γ is in fact a ∗-homomorphism. The only
item that remains to be verified is that the ∗-homomorphism Γ is surjective.
To this end, because Γ is an isometry, the range of Γ, Γ(A), is a C∗-subalgebra
of C(sp(A)). In fact Γ(1) = 1, and so Γ(A) is a unital C∗-subalgebra of
C(sp(A)). Therefore, to show that Γ(A) = C(sp(A)), it is enough, by the
Stone–Weierstrass Theorem, to show that Γ(A) separates the points of sp(A).
Suppose that ω1, ω2 ∈ sp(A) are distinct. Then, as functions on A, ω1 ≠ ω2,
and so there is an x ∈ A such that ω1(x) ≠ ω2(x). That is,
Γ(x)(ω1) ≠ Γ(x)(ω2), which proves that Γ(A) separates the points of sp(A).
The next result shows that the spectrum of an abelian C∗ -algebra A,
considered as a topological space, is an isomorphism invariant of A.

Proposition 13.9. If X and Y are compact Hausdorff spaces such that X and
Y are homeomorphic, then the C∗ -algebras C(X) and C(Y ) are isometrically
isomorphic.

Proof. Let ψ : X → Y be a homeomorphism. (That is, ψ is a bijection such
that both ψ and ψ⁻¹ : Y → X are continuous.) Because ψ⁻¹ is continuous,
the function f ∘ ψ⁻¹ : Y → C is continuous for every f ∈ C(X). Thus, the map
ρ : C(X) → C(Y) given by ρ(f) = f ∘ ψ⁻¹ is readily seen to be a C∗-algebra
homomorphism. Furthermore, for each f ∈ C(X),

\[ \|\rho(f)\| = \max_{s \in Y} \bigl| f\bigl(\psi^{-1}(s)\bigr) \bigr| = \max_{t \in X} |f(t)| = \|f\| . \]

Thus, ρ is an isometry and, therefore, the range of ρ is norm-closed in C(Y).
That is, the range of ρ is a unital C∗-subalgebra of C(Y). Because ψ is a
bijection, the range of ρ separates the points of Y. Hence, by the
Stone–Weierstrass Theorem, ρ is a bijection.

13.4 C∗-subalgebras
Definition 13.10. If B is a C∗-algebra, then a subset A ⊆ B is a C∗-
subalgebra of B if A is a C∗ -algebra with respect to the sum, product, involu-
tion, and norm of B. If B is unital and if the multiplicative identity 1 ∈ B
belongs to a C∗ -subalgebra A of B, then A is said to be a unital C∗ -subalgebra
of B.
Definition 13.11. If F ⊂ A is a subset of a C∗ -algebra A, then the C∗-
algebra generated by F is the smallest C∗ -subalgebra of A that contains F
and is denoted by C ∗ (F ).

Let Wn(N) denote the set of all monomials (words) ω of degree N in n
noncommuting variables ζ1, . . . , ζn, and let Pn denote the algebra over C given
by the complex linear span of Wn(N) over all N ∈ N ∪ {0}, where the product

ω1ω2 of two monomials ω1 and ω2 of degrees N1 and N2 is a new monomial of
degree N1 + N2 formed by concatenation of ω1 and ω2. The unique monomial
ω of degree 0 is ω(ζ1, . . . , ζn) = 1, which serves as the multiplicative identity
of Pn. Assume further that the algebra Pn is equipped with an involution ∗
whereby ζj∗ = ζj for each j, 1∗ = 1, and

\[ \bigl( \alpha\, \zeta_{k_1}^{n_1} \zeta_{k_2}^{n_2} \cdots \zeta_{k_m}^{n_m} \bigr)^{*} = \overline{\alpha}\, \zeta_{k_m}^{n_m} \cdots \zeta_{k_2}^{n_2} \zeta_{k_1}^{n_1} , \quad \forall\, \alpha \in \mathbb{C} ,\ \forall\, m, k_m, n_m \in \mathbb{N} . \]

Let P0n be the subalgebra of codimension 1 consisting of all f ∈ Pn such that
f(0, 0) = 0.
If F = {x}, for a single x ∈ A, then C∗(x) and C∗(1, x) are the norm
closures in A of the sets

{f(x, x∗) | f ∈ P0n} and {f(x, x∗) | f ∈ Pn} ,

respectively. If x∗ = x, then every element in the unital C∗-algebra C∗(x, 1)
generated by x is the limit of a sequence {fn(x)}n∈N, where fn is a polynomial
over C.
If C is a unital Banach algebra and if x ∈ C, then the spectrum of x,
considered as an element of C, is the (nonempty, compact) set

σC (x) = {λ ∈ C | x − λ1 is not invertible in C} .

Recall from the spectral theory of Banach algebras that if C is a unital Banach
subalgebra of a unital Banach algebra D, then σD(x) ⊆ σC(x) and ∂σC(x) ⊆
∂σD (x). In contrast, the following (extremely important) theorem shows that
the spectrum of an element x in a unital C∗ -algebra is independent of the
unital C∗ -algebra that contains x.

Theorem 13.12. If A is a unital C∗-subalgebra of a unital C∗-algebra B, then
σA(x) = σB(x), for every x ∈ A.

The proof of Theorem 13.12 requires a preliminary result.

Lemma 13.13. If h is a hermitian element of a unital C∗-algebra B, then
σB(h) ⊂ R.

Proof. Let A = C∗(h, 1), the unital, abelian C∗-subalgebra of B generated by
h. The range of the function Γ(h) is the spectrum σA(h) of h relative to
A; thus, since σB(h) ⊆ σA(h), it is enough to show that Γ(h) is a real-valued
function on sp(A). Because Γ(x∗) = (Γ(x))∗ for every x ∈ A, and because
h∗ = h, Γ(h) = Γ(h)∗. Hence, Γ(h) is a real-valued function on sp(A).
Proof of Theorem 13.12. The inclusion σB (x) ⊆ σA (x) is straightforward
and has already been noted. To prove the containment σA (x) ⊆ σB (x) it is
sufficient to show that 0 ∈ σA (x) implies 0 ∈ σB (x). This is most simply done
by proving the contrapositive: if x ∈ A has an inverse x−1 in B, then x−1 ∈ A.
Therefore, assume that x ∈ A is invertible in B. Then x∗ is invertible as
well, since 1 = xz = zx implies that 1 = z ∗ x∗ = x∗z ∗ . Consequently, x∗ x ∈ A
is invertible in B. Lemma 13.13 shows that σA (x∗ x) ⊂ R. Hence,

σA (x∗ x) = ∂σA (x∗ x) ⊆ σB (x∗ x) ⊆ σA (x∗ x) ,

which implies that x∗x is invertible in A. A left inverse for x in B is
(x∗x)⁻¹x∗, because [(x∗x)⁻¹x∗]x = 1. Since x is in fact invertible (in B),
this left inverse is necessarily the inverse x⁻¹ ∈ B of x; hence,
x⁻¹ = (x∗x)⁻¹x∗. But (x∗x)⁻¹x∗ ∈ A, which implies that x is invertible in A.
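
The identity x⁻¹ = (x∗x)⁻¹x∗ used at the end of the proof can be checked
numerically in the C∗-algebra of 2 × 2 complex matrices. The following is only
an illustrative sketch: the helper names (mul, adj, inv, dist) and the sample
matrix x are assumptions made for this example and are not part of the notes.

```python
# 2x2 complex matrices as nested lists; every helper name here is illustrative.
def mul(a, b):
    # matrix product ab
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(a):
    # adjoint (conjugate transpose) a*
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inv(a):
    # inverse of an invertible 2x2 matrix via the adjugate formula
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def dist(a, b):
    # largest entrywise difference between a and b
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

x = [[1 + 2j, 3j], [0.5, 2 - 1j]]               # an arbitrary invertible matrix
candidate = mul(inv(mul(adj(x), x)), adj(x))    # (x*x)^{-1} x*
identity = [[1, 0], [0, 1]]

err_left = dist(mul(candidate, x), identity)    # candidate is a left inverse
err_inv = dist(candidate, inv(x))               # and agrees with x^{-1}
print(err_left, err_inv)
```

Both printed errors should be at the level of floating-point rounding.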
Theorem 13.12 indicates that the notation σ(x) for the spectrum of x is un-
ambiguous. Therefore, one can define the spectrum for elements of nonunital
C∗ -algebras.

Definition 13.14. If A is a nonunital C∗-algebra, and if x ∈ A, then the
spectrum of x is the set

σ(x) = {λ ∈ C | x − λ1 ∉ GL(A1)} .

Note that if A is nonunital, then 0 ∈ σ(x), for all x ∈ A (Exercise 7).
Furthermore, in any C∗-algebra A the norm of x ∈ A is determined by the
spectral radius of x∗x ∈ A.

Proposition 13.15. If A is any C∗-algebra, then ‖x‖ = √(spr(x∗x)), for every
x ∈ A.

Proof. For every x ∈ A, x∗x is normal and so ‖x∗x‖ = spr(x∗x). Because
‖x‖² = ‖x∗x‖, the formula follows.
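
As a numerical illustration of Proposition 13.15, take A to be the 2 × 2
complex matrices with the operator norm. The sketch below computes
√(spr(x∗x)) from the closed-form eigenvalues of the hermitian matrix x∗x and
compares it with a brute-force estimate of sup ‖xv‖ over sampled unit vectors v.
The matrix x, the grid size, and all function names are assumptions chosen for
the example.

```python
import cmath
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

x = [[1 + 1j, 2], [0, 1j]]        # sample element of M2(C)
g = mul(adj(x), x)                # x*x, a hermitian matrix

# Largest eigenvalue of the 2x2 hermitian matrix x*x from trace and determinant.
tr = (g[0][0] + g[1][1]).real
det = (g[0][0] * g[1][1] - g[0][1] * g[1][0]).real
spr = (tr + math.sqrt(tr * tr - 4 * det)) / 2
norm_from_spectrum = math.sqrt(spr)

def image_norm(a, b):
    # Euclidean norm of x v for the unit vector v = (cos a, e^{ib} sin a)
    v = (math.cos(a), cmath.exp(1j * b) * math.sin(a))
    w = [x[i][0] * v[0] + x[i][1] * v[1] for i in range(2)]
    return math.sqrt(abs(w[0]) ** 2 + abs(w[1]) ** 2)

steps = 200
sampled = max(image_norm(math.pi * i / steps, 2 * math.pi * j / steps)
              for i in range(steps) for j in range(steps))
print(norm_from_spectrum, sampled)
```

The sampled value can only underestimate the operator norm, and it should
approach √(spr(x∗x)) as the grid is refined.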



An important consequence of Proposition 13.15 is that each C∗-algebra
has a unique C∗-norm (Exercise 8).

13.5 Continuous Functional Calculus

Theorem 13.16. (Continuous Functional Calculus) If x is a normal element
of a unital C∗-algebra B and if A = C∗(x, 1) (the abelian, unital C∗-subalgebra
of B generated by x), then
1. the compact Hausdorff spaces sp A and σ(x) are homeomorphic,
2. there is an isometric isomorphism Φ : C (σ(x)) → C ∗ (x, 1) such that
Φ(ι) = x, where ι ∈ C (σ(x)) is the function ι(t) = t, and
3. (Spectral Mapping Theorem) for each f ∈ C (σ(x)), the spectrum of
Φ(f) ∈ C ∗(x, 1) is

σ (Φ(f)) = {f(λ) | λ ∈ σ(x)} .

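Proof. A is abelian and so σ(x) = {ω(x) | ω ∈ sp A}. Therefore, define a
function ψ : sp A → σ(x) by ψ(ω) = ω(x). The topology of sp A is the
weak∗-topology; thus, ψ is continuous. Moreover, ψ is a bijection: it is
surjective by the formula for σ(x), and it is injective because each
ω ∈ sp A is a continuous ∗-homomorphism and is therefore determined by its
value at the generator x. As every continuous bijection from a compact space
to a Hausdorff space is a homeomorphism, ψ is a homeomorphism.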

The fact that sp A and σ(x) are homeomorphic implies that the C∗ -
algebras C(sp A) and C (σ(x)) are isometrically isomorphic, by Proposition
13.9. The isomorphism ρ : C(sp A) → C (σ(x)) given by Proposition 13.9
satisfies
ρ(f)[λ] = f(ω) , where λ = ω(x) ∈ σ(x) .
Furthermore, the Gelfand transform is an isometric isomorphism C∗(x, 1) →
C(sp A). Hence, composing the inverses of these two isomorphisms produces
an isometric isomorphism Φ : C(σ(x)) → C∗(x, 1); that is, Φ = Γ⁻¹ ∘ ρ⁻¹.
Let ι ∈ C(σ(x)) denote the function ι(t) = t; thus, ι∗ ∈ C(σ(x)) is
the function ι∗(t) = t̄. To show that Φ(ι) = x, it is sufficient to show that
Φ⁻¹(x) = ι. Note that Φ⁻¹ = ρ ∘ Γ. If λ ∈ σ(x), then there is a unique
ω ∈ sp A such that ω(x) = λ; equivalently, ψ⁻¹(λ) = ω. Therefore, by the
following calculation, the function Φ⁻¹(x) sends λ to λ:

Φ−1 (x)[λ] = ρ (Γ (x)) [λ] = Γ (x)[ψ−1 (λ)] = Γ (x)[ω] = ω(x) = λ .

This proves that Φ(ι) = x. Thus, Φ(ι∗) = x∗ and

\[ \Phi\Bigl( \sum_{k=0}^{m} \sum_{j=0}^{n} \alpha_{kj}\, t^k\, \overline{t}^{\,j} \Bigr) = \sum_{k=0}^{m} \sum_{j=0}^{n} \alpha_{kj}\, x^k (x^*)^j . \tag{13.3} \]

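Note that the spectrum of the element given in (13.3) is the range of the
continuous function

\[ \Gamma\Bigl( \sum_{k=0}^{m} \sum_{j=0}^{n} \alpha_{kj}\, x^k (x^*)^j \Bigr) \in C(\operatorname{sp} A) , \]

namely,

\[ \Bigl\{ \sum_{k=0}^{m} \sum_{j=0}^{n} \alpha_{kj}\, \omega(x)^k\, \overline{\omega(x)}^{\,j} \;\Big|\; \omega \in \operatorname{sp} A \Bigr\} = \Bigl\{ \sum_{k=0}^{m} \sum_{j=0}^{n} \alpha_{kj}\, \lambda^k\, \overline{\lambda}^{\,j} \;\Big|\; \lambda \in \sigma(x) \Bigr\} . \]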

Since the ring of all polynomials in commuting variables t and t̄ is uniformly
dense in C(σ(x)) (by the Stone–Weierstrass Theorem once again), equation
(13.3) shows that
σ(Φ(f)) = {f(λ) | λ ∈ σ(x)} .
This completes the proof.
Notational Convention. In applications of Theorem 13.16 it is customary
to denote Φ(f) by f(x), for each f ∈ C (σ(x)).
The next proposition is a convenient device when working in the context
of nonunital C∗ -algebras.

Lemma 13.17. Suppose that X ⊂ R is a compact set such that 0 ∈ X. If
f ∈ C(X) satisfies f(0) = 0 and if ε > 0, then there is a polynomial p such
that p(0) = 0 and |f(t) − p(t)| < ε for all t ∈ X.

Proof. Exercise 11.

Proposition 13.18. Suppose that A is a unital C∗-algebra and h ∈ A is
hermitian. Let X ⊂ R be a compact set such that X ⊇ σ(h) ∪ {0}. If f ∈ C(X)
satisfies f(0) = 0, then f(h) ∈ C∗(h).

Remark. The algebra C ∗(h) does not necessarily contain the identity of A;
thus, the conclusion f(h) ∈ C ∗(h) is sharper than the conclusion of Theorem
13.16, namely that f(h) ∈ C ∗ (h, 1).
Proof. By Lemma 13.17, the condition f(0) = 0 implies that there is a sequence
of polynomials fn for which fn(0) = 0 and |f(t) − fn(t)| → 0 uniformly on X
(and, thus, on σ(h) as well). Since fn(h) ∈ C∗(h),

\[ \lim_{n\to\infty} \|f(h) - f_n(h)\| = \lim_{n\to\infty} \Bigl( \max_{t \in \sigma(h)} |f(t) - f_n(t)| \Bigr) = 0 , \]

and so f(h) ∈ C∗(h).

13.6 Positive Elements


Definition 13.19. An element h ∈ A is positive if h∗ = h and σ(h) ⊂ R+ .

Let A+ be the set

A+ = {h ∈ A | h is a positive element of A} .

Note that if A is a C∗ -subalgebra of a C∗ -algebra B, then A+ ⊆ B+ (since


the spectrum of an element x is independent of the C∗ -algebra that contains
x).
The main achievements of this section are to show (i) that every positive
element has a unique positive square root, (ii) that A+ is a pointed convex
cone, and (iii) that x∗ x ∈ A+ for every x ∈ A. This latter fact is rather tricky.
The first proposition provides criteria to determine whether a hermitian h ∈ A
is positive.

Proposition 13.20. The following statements are equivalent for a hermitian
element h in a unital C∗-algebra A:
1. h ∈ A+;
2. h = b² for some b ∈ A+ such that b ∈ C∗(h);
3. ‖α1 − h‖ ≤ α, for every α ≥ ‖h‖;
4. ‖α₀1 − h‖ ≤ α₀, for some α₀ ≥ ‖h‖.

Proof. 1 ⇒ 2. Because h is positive, σ(h) ⊂ R+. Let X = [0, ‖h‖] and let
f ∈ C(X) be given by f(t) = √t. By Proposition 13.18, the hermitian element
b = f(h) is an element of C∗(h), and b² = h because f(t)² = t for all t ∈ X.
Furthermore, σ(b) = {√λ | λ ∈ σ(h)} ⊂ R+, which implies that b ∈ C∗(h)+,
and so b ∈ A+.
2 ⇒ 3. Assume that h = b² for some positive b ∈ C∗(h). Choose any α ≥
‖h‖. By the Spectral Mapping Theorem (Theorem 13.16), σ(b²) = {λ² | λ ∈
σ(b)}; thus, σ(h) = σ(b²) ⊂ R+. Since the norm of a positive element is its
spectral radius, 0 ≤ λ ≤ ‖h‖ ≤ α implies that α − λ = |α − λ| ≤ α for every
λ ∈ σ(h). Hence, α ≥ spr(α1 − h) = ‖α1 − h‖.
3 ⇒ 4. This is trivial.
4 ⇒ 1. Assume that α₀ ≥ ‖h‖ satisfies α₀ ≥ ‖α₀1 − h‖. Thus, if λ ∈
σ(h), then |λ| ≤ α₀ and |α₀ − λ| ≤ α₀; that is, the real number λ must be
nonnegative.
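
The norm criterion in statements 3 and 4 is easy to see in the abelian
C∗-algebra C(X) when X is a finite set, so that an element is a finite list of
values and the norm is the maximum modulus. The following sketch (the
function names and sample data are assumptions for illustration only) tests
‖α1 − h‖ ≤ α with α = ‖h‖ for two real-valued functions.

```python
def sup_norm(f):
    # norm of a function on a finite set X, stored as the list of its values
    return max(abs(t) for t in f)

def passes_norm_test(h):
    # statement 4 of the proposition with alpha_0 = ||h||
    alpha = sup_norm(h)
    return sup_norm([alpha - t for t in h]) <= alpha

positive = [0.0, 0.25, 2.0]        # every value is nonnegative
not_positive = [-0.5, 1.0, 2.0]    # takes a negative value

print(passes_norm_test(positive), passes_norm_test(not_positive))
```

A negative value t of h makes α − t exceed α, which is exactly why the test
fails for the second function.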

Corollary 13.21. If A is a C∗-algebra, then A+ is a pointed convex cone.
That is, if γ, δ ∈ R+ and if h, k ∈ A+, then
1. γh + δk ∈ A+, and
2. −h ∈ A+ only if h = 0.

Proof. Exercise 13 handles the case where A is unital. If A is nonunital, then
consider A as a C∗-subalgebra of its minimal unitisation A1. Observe that 2
and 4 of Proposition 13.20 indicate that A+ = A ∩ (A1)+. Thus, the unital
case implies the nonunital case.

Definition 13.22. The positive element b ∈ C∗(h) that satisfies b² = h in 2
of Proposition 13.20 is called a positive square root of h and is denoted by
h^{1/2}.
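
For a concrete instance of the square root, consider a positive 2 × 2 real
symmetric matrix h with two distinct eigenvalues. In that finite-dimensional
case the functional calculus reduces to f(h) = f(λ1)p1 + f(λ2)p2, where p1 and
p2 are the spectral projections. The sketch below is an illustration under
these assumptions; the matrix h and the helper names are hypothetical choices.

```python
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

h = [[2.0, 1.0], [1.0, 2.0]]      # positive matrix with eigenvalues 1 and 3

# Eigenvalues of the symmetric 2x2 matrix from its trace and determinant.
tr = h[0][0] + h[1][1]
det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
disc = math.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr - disc) / 2, (tr + disc) / 2

def proj(lam_keep, lam_other):
    # spectral projection (h - lam_other 1) / (lam_keep - lam_other)
    return [[(h[i][j] - (lam_other if i == j else 0.0)) / (lam_keep - lam_other)
             for j in range(2)] for i in range(2)]

p1, p2 = proj(lam1, lam2), proj(lam2, lam1)

def f_of_h(f):
    # functional calculus for h: f(h) = f(lam1) p1 + f(lam2) p2
    return [[f(lam1) * p1[i][j] + f(lam2) * p2[i][j] for j in range(2)]
            for i in range(2)]

b = f_of_h(math.sqrt)             # candidate for the positive square root
b_squared = mul(b, b)
print(b, b_squared)
```

Here b is symmetric with eigenvalues √λ1 and √λ2, so it is positive, and b²
reproduces h up to rounding.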

The proposition below shows that h^{1/2} is unambiguously defined because
if h = b² for some other positive b, then b = h^{1/2}.

Proposition 13.23. Let A be a unital C∗-algebra and suppose that b, h ∈ A+
are such that b² = h. Then b = h^{1/2}.

Proof. Let β ≥ 1 be large enough so that σ(h) ∪ σ(b) ⊆ [0, β]. Therefore, for
any g ∈ C([0, β]),

\[ \|g(b)\| = \max_{\lambda \in \sigma(b)} |g(\lambda)| \le \max_{0 \le t \le \beta} |g(t)| . \tag{13.4} \]

By Lemma 13.17, there is a sequence of polynomials fn such that fn(0) = 0,
for all n ∈ N, and |fn(t) − √t| → 0 uniformly on [0, β²] (and, thus, on [0, β]
as well) as n → ∞. Thus, each fn(h) ∈ A and ‖fn(h) − h^{1/2}‖ → 0. Note that
fn(h) = fn(b²) ∈ A, and so

\[ \|h^{1/2} - b\| = \lim_n \|f_n(h) - b\| = \lim_n \|f_n(b^2) - b\| = \lim_n \|g_n(b)\| , \]

where gn(t) = fn(t²) − t, for each n. Since gn → 0 uniformly on [0, β] as
n → ∞, inequality (13.4) shows that ‖gn(b)‖ → 0, whence h^{1/2} = b.
A common technique in measure theory is to write an arbitrary real-valued
function as a difference of two nonnegative functions whose product is zero.
That idea carries over, via functional calculus, to hermitian elements of C∗ -
algebras.

Proposition 13.24. If h is a hermitian element of a C∗-algebra A, then there
are positive h+, h− ∈ A+ such that h = h+ − h− and h+h− = h−h+ = 0.

Proof. First assume that A is unital. The C∗-algebra C∗(h, 1) is a unital,
abelian C∗-subalgebra of A; moreover, C∗(h, 1) and C(σ(h)) are isometrically
isomorphic. Let X = [−‖h‖, ‖h‖], a compact set that contains σ(h) and 0.
Consider the functions f, g ∈ C(X) defined by f(t) = (t + |t|)/2 and g(t) =
f(−t). The functions f and g are nonnegative and vanish at 0; thus, by the
Spectral Mapping Theorem and Proposition 13.18 the elements f(h) and g(h)
are positive and belong to C∗(h). Let h+ = f(h) and h− = g(h). Because
t = f(t) − g(t) and f(t)g(t) = 0 for all t ∈ X, the Continuous Functional
Calculus yields h = h+ − h− and h+h− = h−h+ = 0.
If A is nonunital, then consider A as a C∗ -subalgebra of its minimal uniti-
sation A1 . The argument above yields h+ , h− ∈ C ∗ (h)+ ⊆ A+ ⊂ (A1 )+ such
that h = h+ − h− and h+ h− = h− h+ = 0, thereby completing the proof. 
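
In the abelian C∗-algebra of functions on a finite set, the decomposition
h = h+ − h− is the familiar splitting of a real-valued function into its positive
and negative parts. A minimal sketch, with the sample values of h chosen
arbitrarily for illustration:

```python
def f(t):
    # f(t) = (t + |t|)/2, the positive part of t
    return (t + abs(t)) / 2

def g(t):
    # g(t) = f(-t), the negative part of t
    return f(-t)

h = [-2.0, -0.5, 0.0, 1.5, 3.0]    # a real-valued function on a 5-point set
h_plus = [f(t) for t in h]         # h+ = f(h)
h_minus = [g(t) for t in h]        # h- = g(h)

print(h_plus, h_minus)
```

Pointwise, h+ − h− returns h and the product h+h− vanishes, mirroring the
relations t = f(t) − g(t) and f(t)g(t) = 0 used in the proof.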
We now arrive at the main theorem of this section: a characterisation of
the positive cone in any C∗ -algebra.

Theorem 13.25. A+ = {x∗x | x ∈ A} for every C∗-algebra A.

Proof. If h ∈ A+, then assertion 2 of Proposition 13.20 yields a positive element
b ∈ C∗(h) such that b² = h. Hence, h = b∗b ∈ {x∗x | x ∈ A}.
Conversely, let x ∈ A. By Proposition 13.24, the hermitian element x∗x ∈
A may be expressed as x∗x = b+ − b−, where b+, b− ∈ A+ and b+b− = b−b+ =
0. To show that x∗x ∈ A+ it is sufficient to prove that b− = 0.
Let c = (b−)^{1/2} ∈ A+ and a = xc. By Lemma 13.17, f(t) = √t can be
approximated uniformly on σ(b−) by polynomials p such that p(0) = 0. Since
p(b−)b+ = b+p(b−) = 0 for any polynomial for which p(0) = 0, we conclude
that cb+ = b+c = 0. Hence,

−a∗a = −cx∗xc = −c(b+ − b−)c = cb−c = (b−)² ,

which implies that σ(−a∗a) ⊂ R+. Thus,

σ(a∗a) ⊂ −(R+) . (13.5)

Let u, v ∈ A be the real and imaginary parts of a; thus, a = u + iv. By the
Spectral Mapping Theorem, u² and v² are positive and so, by Corollary 13.21,

u² + v² ∈ A+. Therefore, a∗a + aa∗ ∈ A+ as well, since a∗a + aa∗ = 2(u² + v²).
By Corollary 13.21 once again, we have that a∗a + aa∗ + (b−)² ∈ A+. But

a∗a + aa∗ + (b−)² = a∗a + aa∗ − a∗a = aa∗ ;

this shows that aa∗ ∈ A+ and so

σ(aa∗) ⊂ R+ . (13.6)

Proposition 11.16 asserts that σ(aa∗) ∪ {0} = σ(a∗a) ∪ {0}. Therefore, (13.5)
and (13.6) combine to give

σ(a∗a) ⊆ (−R+) ∩ R+ = {0} .

Therefore, the spectral radius of a∗a is 0. Since the spectral radius and norm
coincide for hermitian elements, a∗a = 0. That is, 0 = ‖a∗a‖ = ‖a‖², which
proves that a = 0. Since (b−)² = −a∗a = 0 and b− is positive, we obtain
b− = [(b−)²]^{1/2} = 0^{1/2} = 0.

Definition 13.26. If h, k ∈ Asa , then

h ≤ k denotes k − h ∈ A+ .

The relation “≤” on Asa has the following properties (Exercise 14). If
a, b, c ∈ Asa , then:
1. a ≤ a;
2. If a ≤ b and b ≤ a, then b = a; and
3. If a ≤ b and b ≤ c, then a ≤ c.
That is, “≤” is a partial order on the R-vector space Asa .

Proposition 13.27. If h, k ∈ Asa satisfy h ≤ k, then x∗ hx ≤ x∗kx for every


x ∈ A.

Proof. If x ∈ A, then

x∗ kx − x∗ hx = x∗ (k − h)x = x∗(k − h)1/2 (k − h)1/2 x = z ∗ z ∈ A+ ,

where z = (k − h)1/2 x. 
Via the partial order, the real vector space Asa mirrors some of the prop-
erties of R. Through the notion of modulus, A mirrors some of the properties
of C.

Definition 13.28. If A is a C∗ -algebra and if x ∈ A, then the modulus of x


is the element |x| ∈ A defined by (x∗ x)1/2 .

For example, if A is a unital C∗ -algebra, then every invertible x ∈ A has


a polar form: x = u|x|, where u ∈ A is unitary. (See Exercise 19.)

13.7 Ideals and Quotients


A linear submanifold J of a C∗ -algebra A is an algebraic ideal of A if ax ∈ J
and xa ∈ J, for all a ∈ A and x ∈ J. If an algebraic ideal J of A is also norm
closed in A (that is, J is a Banach subalgebra of A), then J is said to be an
ideal of A.
The trivial ideals of A are those of the form J = {0} and J = A. A
C∗ -algebra A is simple if the only ideals of A are the trivial ideals.
Ideals of C∗ -algebras inherit many properties of the ambient C∗ -algebra.
First and foremost of these is that every ideal of a C∗ -algebra is itself a C∗ -
algebra.

Lemma 13.29. If J is an ideal of a C∗-algebra A and if x ∈ J, then there
is a sequence {en}n∈N ⊂ J+ such that σ(en) ⊂ [0, 1], for all n ∈ N, and
‖xen − x‖ → 0.

Proof. First suppose that x ∈ A. If A is unital and if e ∈ A+ satisfies σ(e) ⊂
[0, 1], then ‖1 − e‖ ≤ 1 (Exercise 21). Thus, ‖x − xe‖² = ‖(1 − e)x∗x(1 − e)‖ ≤
‖x∗x(1 − e)‖ = ‖x∗x − x∗xe‖. If A is nonunital, then one can embed A into A1
to produce the same inequality. Therefore, regardless of whether A is unital
or not,

\[ \|x - xe\|^2 \le \|x^*x - x^*xe\| , \quad \forall\, e \in A_+ \text{ with } \sigma(e) \subseteq [0, 1] . \tag{13.7} \]

Suppose now that x ∈ J. Because J is an ideal, x∗x ∈ J. Let h = x∗x. For
each n ∈ N, let fn(t) = nt/(1 + nt); thus, fn ∈ C(σ(h)), 0 ≤ fn(t) ≤ 1, for all
t, and fn(0) = 0. Let en = fn(h). Theorem 13.16 and Proposition 13.18 show
that en ∈ J+ and σ(en) ⊂ [0, 1]. We aim to verify that ‖h − hen‖ → 0. To this
end, note that

\[ t - t f_n(t) = \frac{t}{1 + nt} = \Bigl( \frac{nt}{1 + nt} \Bigr) \frac{1}{n} < \frac{1}{n} , \quad \forall\, t \in \sigma(h) . \]

Therefore, by the fact that continuous functional calculus is an isometric
homomorphism, ‖h − hen‖ < 1/n. Hence, by inequality (13.7),

\[ \|x - x e_n\|^2 \le \|x^*x - x^*x e_n\| < \frac{1}{n} . \]

That is, ‖xen − x‖ → 0 as n → ∞.
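
The scalar estimate t − t fn(t) < 1/n that drives the proof can be checked
numerically. The sketch below samples t on an interval [0, t_max]; the
interval, the sample count, and the function names are assumptions made for
this illustration only.

```python
def f(n, t):
    # f_n(t) = nt / (1 + nt)
    return n * t / (1 + n * t)

def gap(n, t_max=10.0, samples=1000):
    # largest sampled value of t - t f_n(t) on [0, t_max]
    grid = (t_max * k / samples for k in range(samples + 1))
    return max(t - t * f(n, t) for t in grid)

gaps = {n: gap(n) for n in (1, 10, 100, 1000)}
for n, value in gaps.items():
    print(n, value, 1.0 / n)
```

For each n the sampled maximum stays below 1/n, in line with the bound
‖h − hen‖ < 1/n obtained from the functional calculus.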

Theorem 13.30. If J is an ideal of a C∗ -algebra A, then J is a C∗ -subalgebra


of A.

Proof. All that needs to be verified is that x∗ ∈ J for every x ∈ J. By Lemma
13.29, there is a sequence {en}n∈N ⊂ J+ such that σ(en) ⊂ [0, 1], for all n ∈ N,
and ‖xen − x‖ → 0. Note that enx∗ ∈ J for every n ∈ N. The involution is
isometric and (xen − x)∗ = enx∗ − x∗, and so

\[ \lim_{n\to\infty} \|e_n x^* - x^*\| = \lim_{n\to\infty} \|x e_n - x\| = 0 . \]

Because J is norm-closed and each enx∗ ∈ J, we conclude that x∗ ∈ J.


If J is an ideal of a C∗ -algebra A, then let [x] denote the set

[x] = {y ∈ A | y − x ∈ J} ,

and let
A/J = {[x] | x ∈ A} .
By general ring theory, A/J is a complex algebra under the operations

α[x] = [α x] ; [x] + [y] = [x + y] ; [x] [y] = [xy] .

Furthermore, A/J is an involutive algebra under (the well defined) involution

[x]∗ = [x∗] .

Banach space theory shows that A/J is a Banach algebra under the quotient
norm

\[ \|[x]\| = \inf \{ \|x - b\| \mid b \in J \} . \]

The new fact that is proved here is that the quotient norm satisfies the
C∗-norm axiom ‖[x]‖² = ‖[x]∗[x]‖.

Theorem 13.31. If J is an ideal of a C∗ -algebra A, then A/J is a C∗ -algebra.

Proof. As mentioned in the preamble to the statement of the theorem, the only
point to verify is that the quotient norm on A/J satisfies ‖[x]‖² = ‖[x]∗[x]‖.
To this end, fix x ∈ A and define

E = {e ∈ J+ | σ(e) ⊆ [0, 1]} .

If A is unital and if e ∈ E, then ‖1 − e‖ ≤ 1 and, for any b ∈ J, ‖x + b‖ ≥
‖(x + b)(1 − e)‖ = ‖(x − xe) + (b − be)‖. If A is nonunital, then one can embed
A into A1 to produce the same inequality. Hence,

\[ \|x + b\| \ge \|(x - xe) + (b - be)\| , \quad \forall\, e \in E ,\ b \in J , \]

regardless of whether A is unital or not. By definition of the quotient norm,

\[ \|[x]\| \le \inf \{ \|x - xe\| \mid e \in E \} . \tag{13.8} \]

The first task is to show that equality holds in (13.8).
Let b ∈ J. By Lemma 13.29, there is a sequence {en}n∈N ⊂ E such that
‖ben − b‖ → 0. Thus, for every n ∈ N,

\[ \|x + b\| \ge \|(x - xe_n) + (b - be_n)\| , \]



and so

\[ \|x + b\| \ge \liminf_{n} \|x - xe_n\| \ge \inf_{e \in E} \|x - xe\| \ge \|[x]\| . \]

The infimum of the left hand side of the inequality above over all b ∈ J yields
‖[x]‖.
Hence, ‖[x]‖ = inf {‖x − xe‖ | e ∈ E}, for every x ∈ A, and therefore

\[ \|[x]\|^2 = \inf_{e \in E} \|x - xe\|^2 \le \inf_{e \in E} \|x^*x - x^*xe\| = \|[x]^*[x]\| , \]

where the inequality is by (13.7). Thus,

\[ \|[x]\|^2 \le \|[x]^*[x]\| \le \|[x]^*\|\, \|[x]\| . \tag{13.9} \]

Conversely, ‖[x]∗‖ = inf{‖x∗ − b∗‖ | b ∈ J} = inf{‖x − b‖ | b ∈ J} = ‖[x]‖,
since J is ∗-closed. Therefore, inequality (13.9) is an equality.

13.8 C∗-algebra Homomorphisms


The main features of C∗ -algebra homomorphisms are described by the follow-
ing theorem.

Theorem 13.32. If A and B are C∗-algebras, and if ρ : A → B is a
homomorphism, then
1. ρ is continuous and ‖ρ‖ ≤ 1,
2. ρ is an isometry if and only if ker ρ = {0},
3. the kernel of ρ is an ideal of A, and
4. the range of ρ is a C∗-subalgebra of B.

Proof. By Exercise 33, spr ρ(x∗x) ≤ spr(x∗x), for all x ∈ A. Thus,

\[ \|\rho(x)\|^2 = \|\rho(x)^*\rho(x)\| = \operatorname{spr} \rho(x^*x) \le \operatorname{spr}(x^*x) = \|x^*x\| = \|x\|^2 . \]

That is, ρ is bounded and ‖ρ‖ ≤ 1, which proves 1.


For 2, it is trivial that isometries are injective, and so only the converse
is proved here. Thus, assume that ker ρ = {0}. Assume, contrary to what
we aim to prove, that there is an element x ∈ A with ‖ρ(x)‖ < ‖x‖. Then,
‖ρ(h)‖ < ‖h‖, where h = x∗x ∈ A+. Let f : [0, ‖h‖] → R be any continuous
function such that f(t) = 0 for t ∈ [0, ‖ρ(h)‖] and f(‖h‖) = 1. By the Spectral
Mapping Theorem, ‖f(ρ(h))‖ = 0 and ‖f(h)‖ ≥ 1. Because f(ρ(h)) = ρ(f(h))
(by the continuity of ρ and the Weierstrass Approximation Theorem), it must

be that ‖ρ(f(h))‖ = 0. Since ρ is injective, this means that f(h) = 0,
in contradiction of ‖f(h)‖ ≥ 1. Therefore, it must be that ρ is isometric if
ker ρ = {0}, which proves 2.
Since ρ is continuous, ker ρ is closed. As the kernel of any homomorphism
is an algebraic ideal, we conclude that ker ρ is an ideal, thereby proving 3.
For the proof of 4, consider the quotient C∗ -algebra A/ ker ρ and let φ :
A/ ker ρ → B be defined by φ ([x]) = ρ(x), for every x ∈ A. Then φ is a well
defined homomorphism with trivial kernel and range equal to the range of ρ.
Thus, by 2, φ is an isometry, and so its range is norm closed. Hence, the range
of ρ is norm closed. 

Corollary 13.33. If two C∗ -algebras are isomorphic, then they are isometri-
cally isomorphic.

13.9 States

Definition 13.34. A state on a C∗-algebra A is a linear map ϕ : A → C such
that
1. ‖ϕ‖ = 1 and
2. ϕ(h) ≥ 0 for every h ∈ A+.
The state space of A is the set S(A) of all states on A.

Two basic examples of states are as follows.


1. States on C(X). If X is a compact Hausdorff space, then

C(X)+ = {f ∈ C(X) | f(x) ≥ 0, ∀ x ∈ X} .

If x0 ∈ X is fixed, then the linear map ϕ that sends every f ∈ C(X) to
f(x0) ∈ C has norm 1 and is positive preserving. Hence, point evaluations
on C(X) are states. However, there are other states on C(X). For example,
if µ is a probability measure on the Borel sets of X, then the map ϕ :
C(X) → C defined by

\[ \varphi(f) = \int_X f \, d\mu , \quad \forall\, f \in C(X) , \]

is a state on C(X).
2. States on B(H). The positive cone of B(H) is

B(H)+ = {h ∈ B(H) | ⟨hξ, ξ⟩ ≥ 0, ∀ ξ ∈ H} .

For every unit vector ξ ∈ H, the linear map that sends x ∈ B(H) to
⟨xξ, ξ⟩ ∈ C is a state on B(H). However, not all states on B(H) are of

this form. For example, if ξ1, . . . , ξm ∈ H are unit vectors, then the map
ϕ : B(H) → C defined by

\[ \varphi(x) = \frac{1}{m} \sum_{j=1}^{m} \langle x\xi_j , \xi_j \rangle \]

is a state on B(H).
States on A necessarily map Asa onto R. This can be seen via expressing
h ∈ Asa as h = h+ − h−, where h+, h− ∈ A+. Therefore, by expressing any
x ∈ A in terms of its real and imaginary parts, we obtain

\[ \varphi(x^*) = \overline{\varphi(x)} , \quad \forall\, x \in A ,\ \forall\, \varphi \in S(A) . \]

Proposition 13.35. (Schwarz Inequality) If ϕ ∈ S(A), then

|ϕ(y∗x)|² ≤ ϕ(x∗x) ϕ(y∗y) , ∀ x, y ∈ A . (13.10)

Proof. The equation [x, y] = ϕ(y∗x) defines a sesquilinear form on A × A.
Therefore, the proof of the inequality can be achieved by arguing as in the
proof of the Cauchy–Schwarz inequality in Hilbert space.
Choose x, y ∈ A. If [x, y] = 0, then the inequality holds trivially. Thus,
assume that [x, y] ≠ 0. Note that x∗x, y∗y ∈ A+ imply that [x, x], [y, y] ∈ R+.
For any λ ∈ C,

\[ 0 \le [x - \lambda y ,\, x - \lambda y] = [x, x] - 2\, \Re \bigl( \lambda [y, x] \bigr) + |\lambda|^2 [y, y] . \]

For

\[ \lambda = \frac{[x, x]}{[y, x]} , \]

the inequality above becomes

\[ 0 \le -[x, x] + \frac{[x, x]^2\, [y, y]}{|[x, y]|^2} , \]

which yields inequality (13.10).
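
A quick numerical sanity check of (13.10) is possible for a vector state on
the 2 × 2 complex matrices. In the sketch below, the unit vector xi, the
random seed, the number of trials, and all function names are assumptions
chosen for illustration; the state is ϕ(x) = ⟨xξ, ξ⟩.

```python
import random

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

xi = (0.6, 0.8j)                  # a unit vector in C^2

def phi(a):
    # vector state phi(a) = <a xi, xi>, inner product linear in the first slot
    w = [a[i][0] * xi[0] + a[i][1] * xi[1] for i in range(2)]
    return sum(w[i] * xi[i].conjugate() for i in range(2))

rng = random.Random(0)

def rand_matrix():
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
            for _ in range(2)]

worst = float("-inf")             # largest observed value of lhs - rhs
for _ in range(1000):
    x, y = rand_matrix(), rand_matrix()
    lhs = abs(phi(mul(adj(y), x))) ** 2
    rhs = phi(mul(adj(x), x)).real * phi(mul(adj(y), y)).real
    worst = max(worst, lhs - rhs)
print(worst)
```

The printed value should not be positive beyond rounding error, since for this
state (13.10) is the Cauchy–Schwarz inequality for the vectors xξ and yξ.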


If A is a unital C∗ -algebra, then there is a relatively simple criterion for a
linear functional to be a state.

Proposition 13.36. Suppose that A is a unital C∗-algebra and that ϕ : A →
C is a linear functional of norm ‖ϕ‖ = 1. The following statements are
equivalent:
1. ϕ is a state on A;
2. ϕ(1) = 1.

Proof. Assume that ϕ is a state on A. Because 1 = 1∗1 ∈ A+ and ‖1‖ = 1,
we have that 0 ≤ ϕ(1∗1) = ϕ(1) ≤ ‖ϕ‖ ‖1‖ = 1. To show that 1 ≤ ϕ(1),
choose any x ∈ A with ‖x‖ ≤ 1. Thus, ‖x∗x‖ ≤ 1. Since ‖x∗x‖ = spr(x∗x)
and σ(x∗x) ⊂ R+, the hermitian element 1 − x∗x is positive in A. Thus,
0 ≤ ϕ(1 − x∗x) = ϕ(1) − ϕ(x∗x), which implies that ϕ(x∗x) ≤ ϕ(1). Therefore,
by an application of the Schwarz inequality,

\[ |\varphi(x)|^2 = |\varphi(1^*x)|^2 \le \varphi(x^*x)\, \varphi(1^*1) \le \varphi(1)^2 \le 1 , \]

since ϕ(1) ≤ 1. Hence |ϕ(x)| ≤ ϕ(1), for all x ∈ A with ‖x‖ ≤ 1, which implies
that ‖ϕ‖ ≤ ϕ(1). By hypothesis, ‖ϕ‖ = 1; therefore, ϕ(1) = 1.
Conversely, suppose that ϕ(1) = 1; that is, ‖ϕ‖ = ϕ(1) = 1. It must
happen that ϕ(Asa) ⊆ R, for if not then there is a hermitian element h ∈ Asa
such that ϕ(h) = α + iβ, where α, β ∈ R and β ≠ 0. Therefore, with k =
β⁻¹(h − α1) ∈ Asa, we would have that ϕ(k) = i and, for each γ ∈ R,

\[ \begin{aligned} (\gamma + 1)^2 = |i + \gamma i|^2 &= |\varphi(k + \gamma i 1)|^2 \\ &\le \|\varphi\|^2\, \|k + \gamma i 1\|^2 \\ &= \|(k + \gamma i 1)^*(k + \gamma i 1)\| \\ &= \|k^2 + \gamma^2 1\| \\ &= \|k^2\| + \gamma^2 . \end{aligned} \]

Thus, (2γ + 1) ≤ ‖k²‖ for all γ ∈ R. But this is impossible; therefore, it must
be that ϕ(h) is real for every h ∈ Asa.
Now, if h ∈ A+, then ‖ ‖h‖1 − h ‖ ≤ ‖h‖ by Proposition 13.20. Because
‖ϕ‖ = 1 = ϕ(1) and ϕ(h) ∈ R, it follows that ‖h‖ − ϕ(h) = ϕ(‖h‖1 − h) ≤ ‖h‖,
which implies that ϕ(h) ≥ 0.


Proposition 13.37. For every nonzero h ∈ A+ there is a state ϕ on A with
ϕ(h) = ‖h‖.

Proof. If A is nonunital, then consider the minimal unitisation A1 of A;
otherwise, A1 = A in the case where A is unital.
If h ∈ A+, then h ∈ (A1)+ as well. Consider the unital, abelian C∗-algebra
generated by h: C∗(h, 1). By the Gelfand theory, there is a homomorphism
ρ : C∗(h, 1) → C such that ρ(h) = ‖h‖. Of course, ρ(1) = ‖ρ‖ = 1. By the
Hahn–Banach Theorem, ρ extends to a linear functional Φ : A1 → C with
‖Φ‖ = ‖ρ‖. Since ‖Φ‖ = Φ(1) = 1, Φ is a state on A1 by Proposition 13.36.
Thus, if A is unital, we may take ϕ = Φ. If A is nonunital, then let ϕ = Φ|A.
Note that ϕ(k) ≥ 0 for all k ∈ A+ and that ‖ϕ‖ ≤ 1. With k = ‖h‖⁻¹h ∈ A+,
we have ‖k‖ = 1 and ϕ(k) = 1. Hence ‖ϕ‖ = 1, and so ϕ is a state on A.

13.10 Representations

Definition 13.38. A representation of a C∗-algebra A on a Hilbert space H
is a homomorphism π : A → B(H) such that π(x∗) = π(x)∗ for every x ∈ A.

Definition 13.39. A representation π : A → B(H) of a C∗-algebra A on a
Hilbert space H has a cyclic vector ξ ∈ H if the linear manifold {π(x)ξ | x ∈ A}
is dense in H.

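If A = M2(C), the C∗-algebra of 2 × 2 complex matrices, then the vector

\[ \xi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} \]

is a cyclic vector for the representation π of A on the Hilbert space C⁴ defined
by

\[ \pi(x) = \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} , \quad \forall\, x \in A . \]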
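
That ξ is cyclic can be verified directly: applying π to the four matrix units
of M2(C) produces four vectors that span C⁴. The sketch below is only an
illustration; the names pi_apply and unit are hypothetical helpers introduced
for this example.

```python
import math

s = 1 / math.sqrt(2)
xi = [s, 0.0, 0.0, s]             # the vector xi = (1, 0, 0, 1)/sqrt(2)

def pi_apply(x, v):
    # pi(x) = diag(x, x): x acts on the first two and the last two coordinates
    top = [x[i][0] * v[0] + x[i][1] * v[1] for i in range(2)]
    bottom = [x[i][0] * v[2] + x[i][1] * v[3] for i in range(2)]
    return top + bottom

def unit(i, j):
    # matrix unit with a 1 in row i, column j and zeros elsewhere
    return [[1.0 if (r, c) == (i, j) else 0.0 for c in range(2)]
            for r in range(2)]

images = [pi_apply(unit(i, j), xi) for i in range(2) for j in range(2)]

# Position of the single nonzero coordinate of each image vector.
support = sorted(next(k for k, t in enumerate(v) if t != 0) for v in images)
print(images, support)
```

Each image is a nonzero multiple of a different standard basis vector of C⁴,
so {π(x)ξ | x ∈ A} is all of C⁴.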

Definition 13.40. A representation π : A → B(H) of a C∗-algebra A on a
Hilbert space H is said to be
1. cyclic if there is a nonzero vector ξ ∈ H such that {π(x)ξ | x ∈ A} is dense
in H;
2. irreducible if {π(x)ξ | x ∈ A} is dense in H for every nonzero ξ ∈ H; and
3. faithful if π(x∗x) = 0 only if x = 0.

13.11 Exercises

1. Suppose that A is a nonunital C∗-algebra and that A1 is the minimal
unitisation of A. If

\[ \|z\|' = \sup \{ \|zb\| \mid b \in A ,\ \|b\| \le 1 \} \]

for every z ∈ A1, then show that the following properties hold for all
z, z1, z2 ∈ A1 and α ∈ C:
a) ‖αz‖′ = |α| ‖z‖′,
b) ‖z1 + z2‖′ ≤ ‖z1‖′ + ‖z2‖′,
c) ‖z1z2‖′ ≤ ‖z1‖′ ‖z2‖′, and
d) ‖z∗‖′ = ‖z‖′.
2. Assume that A is a nonunital C∗-algebra and that A1 = A × C is its
unitisation. Prove that in the norm ‖·‖′, A1 is a Banach algebra.
3. Assume that A is a unital C∗ -algebra.

a) Prove that the series \( \sum_{n=0}^{\infty} z^n/(n!) \) converges in A, for each z ∈ A.
(Suggestion: show that the partial sums of the series form a Cauchy
sequence.)
b) If e^z denotes the limit of \( \sum_{n=0}^{\infty} z^n/(n!) \), then prove that
e^{x+y} = e^x e^y if x, y ∈ A commute (xy = yx).
4. Let X be a compact Hausdorff space and let f ∈ C(X). Determine nec-
essary and sufficient conditions on the range of f so that:
a) f is invertible;
b) f is hermitian;
c) f is unitary.
5. If X and Y are locally compact Hausdorff spaces and if X and Y are
homeomorphic, then prove that the C∗ -algebras C0 (X) and C0 (Y ) are
isometrically isomorphic.
6. Show that C0 (R) is nonunital.
7. Prove that if A is a nonunital C∗ -algebra, then 0 ∈ σ(x), for all x ∈ A.
8. Suppose that A is a C∗-algebra with norm ‖·‖. Prove that if ‖·‖′ is a
norm on A that satisfies all of the axioms of a C∗-norm, then ‖x‖′ = ‖x‖
for all x ∈ A.
9. If A is a unital C∗ -algebra and if u ∈ A is unitary, then prove that σ(u) ⊆
∂D, where D is the open unit disc of the complex plane.
10. If A is a unital C∗ -algebra and if x ∈ A (not necessarily normal), then
prove or find a counterexample to each of the following statements.
a) σ(x∗) = {λ̄ | λ ∈ σ(x)}.
b) x∗x is invertible if x is invertible.
c) x is invertible if x∗ x is invertible.
11. Suppose that X ⊂ R is a compact set such that 0 ∈ X. Prove that if
f ∈ C(X) satisfies f(0) = 0 and if ε > 0, then there is a polynomial p
such that p(0) = 0 and |f(t) − p(t)| < ε for all t ∈ X.
12. In a unital Banach algebra A, an element x ∈ A is quasinilpotent if σ(x) =
{0}, and x is properly quasinilpotent if σ(xy) = {0} for all y ∈ A. Prove
that if A is a unital C∗ -algebra, then the only properly quasinilpotent
element x ∈ A is x = 0.
13. Suppose that A is a unital C∗ -algebra, γ ∈ R+ , and h, k ∈ A+ . Prove that
a) γ h ∈ A+ ,
b) h + k ∈ A+ , and
c) −h ∈ A+ only if h = 0.
14. Suppose that a, b, c ∈ Asa . Prove the following assertions.
a) a ≤ a.
b) If a ≤ b and b ≤ a, then b = a.

c) If a ≤ b and b ≤ c, then a ≤ c.
15. Suppose that A is a C∗ -algebra with x ∈ A, h ∈ A+ , and xh = hx. Prove
that xh1/2 = h1/2 x. (Suggestion: show that xf(h) = f(h)x for every
polynomial f.)
16. If A is a C∗-algebra and if a, b ∈ A+ satisfy a ≤ b and ab = ba, then prove
that a² ≤ b². Show by example that a ≤ b does not always imply a² ≤ b²
if ab ≠ ba.
17. If A is a unital C∗ -algebra, prove that 1 +x∗x is invertible for every x ∈ A.
18. Prove that σ(ab) ⊂ R+ for all positive elements a and b in a C∗ -algebra
A.
19. Prove that if A is a unital C∗ -algebra and if x ∈ A is invertible, then there
is a unitary u ∈ A such that x = u|x|.
20. Prove or find a counterexample to the following assertion in a C∗-algebra
A: ‖x‖ = ‖ |x| ‖, for every x ∈ A.
21. Prove that if h ∈ Asa , where A is a unital C∗-algebra, and if α ≥ ‖h‖,
then
a) h ≤ α1, and
b) ‖α1 − h‖ ≤ α.
22. Prove that if X is a locally compact Hausdorff space, then

C0 (X)+ = {f ∈ C0 (X) | f(t) ≥ 0, ∀ t ∈ X} .

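23. Consider the 2-dimensional Hilbert space C² with its canonical inner product
\[ \left\langle \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix}, \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} \right\rangle = \xi_1 \overline{\eta_1} + \xi_2 \overline{\eta_2} . \]
Let x : C² → C² be the operator given by the 2 × 2 complex matrix
\[ x = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} . \]
Find necessary and sufficient conditions on the entries of x so that: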
a) x is a hermitian operator;
b) x is a positive operator;
c) x is a unitary operator.
24. Suppose that h is a positive operator acting on a Hilbert space H. Prove
that if ξ ∈ H satisfies ⟨hξ, ξ⟩ = 0, then hξ = 0.
25. Suppose that h, k ∈ A are hermitian. Prove or find a counterexample to
the following assertion: either h ≤ k or k ≤ h.
26. Suppose that J is a proper ideal of a unital C∗-algebra A (that is, J ≠ A).
Show that 1 ∉ J.
27. Suppose that J is an ideal of a nonunital C∗-algebra A. Consider the
inclusion of A in its minimal unitisation A1 . Show that J is an ideal of
A1 .
28. Suppose that J is a proper ideal of a unital C∗ -algebra A. Define J + C1
by
J + C1 = {x + λ1 | x ∈ J, λ ∈ C} .
a) Prove that J + C1 is a unital C∗ -subalgebra of A.
b) Prove or find a counterexample to the following statement: the C∗ -
algebras J 1 and J + C1 are isometrically isomorphic.
29. Suppose that H is an infinite-dimensional Hilbert space.
a) Prove that the set K(H) of compact operators is a proper ideal of
B(H).
b) If H is separable, then prove that K(H) is the only nonzero proper
ideal of B(H).
30. Let H be a nonseparable Hilbert space and consider the norm-closure S̄
of the set S of all x ∈ B(H) for which {xξ | ξ ∈ H} is separable. Show
that S̄ is an ideal that contains, but does not equal, the ideal K(H) of
compact operators on H.
31. If J is an ideal of a C∗ -algebra A, then show that the map q : A → A/J
defined by q(x) = [x] is a surjective homomorphism.
32. Prove that if A and B are unital C∗ -algebras and if ρ : A → B is a
homomorphism, then σ(ρ(x)) ⊆ σ(x), for all x ∈ A.
33. Prove that if A and B are C∗ -algebras and if ρ : A → B is a homo-
morphism, then spr ρ(x∗ x) ≤ spr (x∗ x), for all x ∈ A. (A and B are not
assumed to be unital.)
34. If H is an infinite-dimensional separable Hilbert space, then the quotient
C∗ -algebra B(H)/K(H) is called the Calkin algebra. An operator x ∈
B(H) is said to be essentially normal if q(x) is a normal element of the
Calkin algebra, where q : B(H) → B(H)/K(H) is the canonical quotient
homomorphism. Let H = ℓ²(N) and let s ∈ B(H) be the unilateral shift,
that is, s maps each sequence ξ = (ξ_n)_{n∈N} to the sequence sξ = (γ_n)_{n∈N},
where γ_1 = 0 and γ_n = ξ_{n−1} for all n ≥ 2.
a) Show that the unilateral shift operator s is essentially normal. (Sug-
gestion: Compute s∗ s − ss∗ .)
b) The C∗-subalgebra C∗(q(s)) of B(H)/K(H) is unital and abelian.
Show that C∗(q(s)) is isometrically isomorphic to C(∂D).
Appendix A: Zorn’s Lemma

Definition 13.41. A partial order on a nonempty set S is a relation, denoted
by ⪯, which has the following properties:
1. (reflexivity) for all a ∈ S, a ⪯ a;
2. (antisymmetry) for all a, b ∈ S, a ⪯ b and b ⪯ a implies that b = a;
3. (transitivity) for all a, b, c ∈ S, a ⪯ b and b ⪯ c implies that a ⪯ c.
More special still is a linear order, which is a partial order that satisfies one
more property:
4. (comparability) for all a, b ∈ S, either a ⪯ b or b ⪯ a.
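
As a small illustration of these axioms, the following sketch checks them by brute force on a finite set. The set {1, ..., 12}, the divisibility relation, and the function names are illustrative choices and are not taken from the notes.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Brute-force check of reflexivity, antisymmetry, and transitivity."""
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(
        a == b
        for a, b in product(elements, repeat=2)
        if leq(a, b) and leq(b, a)
    )
    transitive = all(
        leq(a, c)
        for a, b, c in product(elements, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    return reflexive and antisymmetric and transitive


def is_linear_order(elements, leq):
    """A linear order is a partial order in which any two elements compare."""
    comparable = all(
        leq(a, b) or leq(b, a) for a, b in product(elements, repeat=2)
    )
    return is_partial_order(elements, leq) and comparable


def divides(a, b):
    """a precedes b when a divides b."""
    return b % a == 0


def usual(a, b):
    """The usual order on the integers."""
    return a <= b


S = range(1, 13)

print(is_partial_order(S, divides))  # True
print(is_linear_order(S, divides))   # False: 2 and 3 are incomparable
print(is_linear_order(S, usual))     # True
```

Divisibility is a partial order that is not linear, whereas the usual order on the same set is linear.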

Definition 13.42. In a partially ordered set S, an element a ∈ S is said to
be a maximal element if for c ∈ S the relation a ⪯ c implies that c = a. If
E ⊆ S, then an element a ∈ S is an upper bound for E if b ⪯ a for every
b ∈ E.
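
The following sketch computes maximal elements and upper bounds in a finite partially ordered set. The example (the set {1, ..., 12} ordered by divisibility) and the helper names are assumptions made for illustration only.

```python
def divides(a, b):
    """a precedes b when a divides b."""
    return b % a == 0


def maximal_elements(elements, leq):
    """Elements a such that leq(a, c) forces c == a."""
    elements = list(elements)
    return [a for a in elements if all(c == a for c in elements if leq(a, c))]


def upper_bounds(elements, subset, leq):
    """Elements a of the ambient set with leq(b, a) for every b in subset."""
    return [a for a in elements if all(leq(b, a) for b in subset)]


S = list(range(1, 13))

print(maximal_elements(S, divides))      # [7, 8, 9, 10, 11, 12]
print(upper_bounds(S, [2, 3], divides))  # [6, 12]
print(upper_bounds(S, [5, 7], divides))  # []
```

The example shows that a maximal element need not be unique, and that a subset such as {5, 7} may have no upper bound at all in S.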

Zorn’s Lemma is stated below.

Theorem 13.43. (Zorn’s Lemma) If S is a nonempty partially ordered set
such that every linearly ordered subset E ⊆ S (where the linear order on E is
inherited from the partial order on S) has an upper bound in S, then S has
a maximal element.

The sorts of fundamental propositions that Zorn’s Lemma proves are:
1. every vector space has a basis (see [6, Theorem ]);
2. every unital commutative ring has a maximal ideal ([7, ]);
3. every Hilbert space has an orthonormal basis (Theorem ??).
Deeper consequences of Zorn’s Lemma are the Hahn–Banach Theorem (The-
orem 3.13) and the Kreı̌n–Milman Theorem (Theorem 5.9).
The Zermelo–Fraenkel Axioms for Set Theory were introduced in the early
twentieth century to clarify which collections constitute sets and what the
permissible operations on sets are. Of all the ZF-axioms, the choice axiom is
possibly the most subtle.
The Axiom of Choice. If {Xα }α∈Λ is a family of nonempty, pairwise disjoint
subsets of a set X, then there is a subset Y of X such that Y ∩ Xα is a
singleton set, for every α ∈ Λ.
If, in the statement of the Axiom of Choice, the singleton set Y ∩ Xα
is denoted by {xα }, then the axiom asserts that a set Y can be formed by
selecting exactly one element xα from each of the sets Xα . This justifies the
use of the word “choice.”
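
For a finite family no axiom is needed, but a finite example can still make the statement concrete. The sketch below assumes that the sets in the family are nonempty and pairwise disjoint, without which a set Y of the required kind need not exist. The function name, the sample family, and the rule "take the smallest element" are hypothetical choices for illustration.

```python
from itertools import combinations


def choice_set(family):
    """Return a set Y that meets each member of `family` in exactly one point.

    `family` maps an index alpha to a finite set X_alpha. The members are
    assumed to be nonempty and pairwise disjoint.
    """
    members = list(family.values())
    if not all(members):
        raise ValueError("every member of the family must be nonempty")
    if any(A & B for A, B in combinations(members, 2)):
        raise ValueError("the members of the family must be pairwise disjoint")
    # An explicit selection rule is available here: take the least element.
    return {min(X_alpha) for X_alpha in members}


family = {"alpha": {1, 2}, "beta": {3}, "gamma": {4, 5, 6}}
Y = choice_set(family)

print(sorted(Y))                                                  # [1, 3, 4]
print(all(len(Y & X_alpha) == 1 for X_alpha in family.values()))  # True
```

The axiom matters precisely when no explicit selection rule such as `min` is available for an infinite family.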
In the ZF-axioms for set theory (with the Axiom of Choice included), Zorn’s
Lemma is a theorem. Conversely, if one removes the Axiom of Choice from the
ZF-axioms and poses Zorn’s Lemma as an axiom in its place, then the Axiom of
Choice is a theorem. Thus, relative to the remaining axioms, the Axiom of
Choice and Zorn’s Lemma are logically equivalent. For a detailed discussion,
see [9].
References

1. W.B. Arveson, A Short Course on Spectral Theory, Springer–Verlag, GTM 122,
New York, 2003.
2. K.R. Davidson and A.P. Donsig, Real Analysis with Real Applications, Prentice–
Hall, New Jersey, 2002.
3. E. DiBenedetto, Real Analysis, Birkhäuser, Boston, 2002.
4. R.G. Douglas, Banach Algebra Techniques in Operator Theory, Springer–Verlag,
GTM 115, New York, 2000.
5. P. Enflo, A counterexample to the approximation problem in Banach spaces,
Acta Math. 130 (1973), 309–317.
6. D.R. Farenick, Algebras of Linear Transformations, Springer–Verlag, New York,
2001.
7. I.N. Herstein, Topics in Algebra, Blaisdell, New York, 1964.
8. R.C. James, A non-reflexive Banach space isometric with its second conjugate
space, Proc. Nat. Acad. Sci. 37 (1951), 174–177.
9. J.R. Munkres, Topology, Prentice-Hall, Englewood Cliffs, NJ, 1975.
10. H. Radjavi and P. Rosenthal, Invariant Subspaces, Springer–Verlag, New York,
1973.
11. H.L. Royden, Real Analysis, 3rd Edition, Macmillan, New York, 1988.
12. W. Rudin, Real and Complex Analysis, McGraw–Hill, New York, 1986.
13. E.M. Stein and R. Shakarchi, Real Analysis: Measure Theory, Integration, and
Hilbert Spaces, Princeton Lectures in Analysis, Princeton University Press,
Princeton, 2005.
Index

∗-homomorphism, 158
    isometric, 158
∗-isomorphism, 158
Fσ-set, 12
Gδ-set, 12
C∗-subalgebra, 161
    unital, 161
ideal, 154, 169

abelian, 149
absolutely continuous measure, 59
adjoint, 98
algebra, 149
    associative, 149
    commutative, 149
    unital, 150
algebraic ideal, 153, 169
associative algebra, 30
Axiom of Choice, 180

Banach algebra, 30, 149
    algebraic ideal, 153
    ideal, 154
    simple, 154
Banach space, 6
    reflexive, 52, 56, 58
    separable, 6, 35
Banach subalgebra, 153
    unital, 153
basis
    linear, 13
    orthonormal, 81, 83
    Schauder, 13
bilateral shift operator, 125
bounded set, 7

C∗-algebra, 157
    algebraic ideal, 169
    ideal, 169
    simple, 169
    unital, 157
Calkin algebra, 178
Carathéodory’s Theorem, 64
compact set, 6
complemented subspace, 49
cone, 73
convergence
    weak, 51
    weak∗, 53
convex combination, 63
convex hull, 63
convex set, 63
cyclic vector, 175

dense subset, 6, 35
dual
    second, 52
dual space, 46

essential range of a measurable function, 27, 124
essential supremum, 27, 124
essentially bounded, 124
essentially bounded function, 28
extreme point, 65

face, 65
final space, 133
finite-rank operator, 109
Fourier coefficients
    in L^p, 39
Fourier series, 84
Fredholm alternative, 118
function
    p-integrable, 20
    concave, 15
    convex, 15
    vanishing at infinity, 30
functional calculus
    polynomial, 103, 152

general linear group, 150
Gram–Schmidt Process, 82

Hahn–Banach Theorem, 47
Hardy space, 39, 126
hermitian
    element of a C∗-algebra, 158
Hilbert space, 76
Hilbert’s cube, 89
homomorphism
    isometric, 150
hyperplane, 69
    real, 69

inequality
    arithmetic–geometric mean, 18
    Cauchy–Schwarz, 75
    Hölder, 20
    Jensen, 17
    Minkowski, 20
    Young, 18, 19
initial space, 133
inner product, 75, 76
invariant subspace, 135
involution
    on B(H), 122
isometrically isomorphic, 46
isometry, 46
isomorphism
    isometric, 46

linear basis, 13
linear functional, 46
linear submanifold, 9
locally compact space, 28

measure
    absolutely continuous, 59
metric
    pseudo-, 40
Minkowski functional, 69

norm, 5
normal
    element of a C∗-algebra, 158
    essentially, 178
nowhere dense set, 13

open map, 97
operator, 43
    adjoint, 98, 122
    bilateral shift, 125
    bounded, 43
    bounded below, 97
    compact, 113
    finite-rank, 109
    hermitian, 127
    integral, 109, 126
    isometry, 46, 141
    lower bound, 97
    modulus, 133
    normal, 130
    partial isometry, 133
    positive, 131
    unilateral shift, 107, 126
    unitary, 134
    Volterra, 140

parallelogram law, 77
Parseval’s Equation, 85
partial isometry, 133
Polar Decomposition, 133
polar form, 168
polynomial
    Legendre, 86
    trigonometric, 34, 38
positive
    element of a C∗-algebra, 165
projection, 134
pseudo-metric, 40
Pythagorean Theorem, 77

quotient norm, 170

real numbers
    conjugate, 19
reflexive Banach space, 52, 56, 58
representation, 175
    cyclic, 175
    faithful, 175
    irreducible, 175
Riesz Representation Theorem, 58, 88

Schauder basis, 13
second dual, 52
selfadjoint subset, 31
seminorm, 22
separable Banach space, 6, 35
sequence
    p-summable, 22
    Cauchy, 6
Spectral Mapping Theorem
    Polynomial, 104, 152
spectral radius, 101, 152
spectrum, 99, 151, 163
state, 172
state space, 172
Stone–Weierstrass Theorem, 31
sublinear functional, 69
    Minkowski, 69
subspace, 9
    complemented, 49
    real, 69

topological vector space, 22
topology
    metric, 5
    norm, 5
    weak, 50

uniform algebra, 31
uniformly bounded sets, 93
unilateral shift operator, 126, 178
unitary, 134
    element of a C∗-algebra, 158
unitisation
    minimal, 160

Volterra operator, 140

weak convergence, 51
weak topology, 51
weak∗ convergence, 53
weak∗ topology, 53
Weierstrass Approximation Theorem, 31, 34

Zorn’s Lemma, 179
