
Linear Algebra

Hong Kong University of Science and Technology

March 17, 2024


Contents

1 Vector Spaces and Linear Maps
  1.1 Vector Spaces
  1.2 Linear Maps
  1.3 Dual Vector Space

2 Inner Product Spaces
  2.1 Inner Products and Norms
  2.2 Orthonormal Bases
  2.3 Orthogonal Complement

3 Determinants

4 Eigenvalues and Eigenvectors
  4.1 Invariant Subspaces
  4.2 Eigenvalues and Characteristic Polynomial
  4.3 Cayley-Hamilton theorem
  4.4 Eigenvectors, Eigenspaces and Diagonal Matrices

5 Real and Complex Inner Product Spaces
  5.1 Adjoint of Operators
  5.2 Self-Adjoint and Normal Operators
  5.3 Spectral Theorems
  5.4 Polar Decomposition and Singular Value Decomposition

6 Canonical Forms
  6.1 Minimal Polynomial
  6.2 Root Subspace Decomposition
  6.3 Jordan Canonical Form

Chapter 1

Vector Spaces and Linear Maps

Linear algebra is the study of linear maps. We have many examples of maps with the linearity property, for example the derivative map, the definite integral map, the matrix multiplication map, etc.:

d/dx (λ f + µ g) = λ df/dx + µ dg/dx;    ∫_a^b (λ f + µ g) = λ ∫_a^b f + µ ∫_a^b g.

1.1 Vector Spaces


We now define the special structure on sets - vector spaces - on which linear maps preserve their structure, i.e. the spaces that have linear structures.

1.1 Vector spaces


A vector space over K (where K is R or C) is a set V equipped with two operations, an addition V × V → V , (v1 , v2 ) ↦ v1 + v2 , and a scalar multiplication K × V → V , (c, v) ↦ c · v, satisfying the following axioms (for all v1 , v2 , v3 , v, w ∈ V and x, y, c ∈ K):
1. v1 + v2 = v2 + v1 ;
2. v1 + (v2 + v3 ) = (v1 + v2 ) + v3 ;
3. There is 0 ∈ V such that 0 + v = v, ∀v ∈ V .
4. For any v ∈ V there is v0 such that v + v0 = 0.
5. 1 · v = v;
6. c · (v + w) = c · v + c · w;
7. (x + y) · v = x · v + y · v;
8. (xy) · v = x · (y · v).


The first four axioms are about the + operation and the last four axioms are about
the · operation.

1.2 Properties of vector spaces


Suppose V is a vector space.
1. There is a unique 0 ∈ V such that 0 + v = v, ∀v ∈ V .
2. For any v ∈ V , there is a unique v0 ∈ V such that v + v0 = 0. We denote it as
−v.
Furthermore,
1. x · 0 = 0 for every x ∈ K.
2. 0 · v = 0 and (−1) · v = −v for every v ∈ V .

1.3 Subspaces
(V, +, ·) is a vector space. U ⊂ V is a subspace of V if (U, +, ·) is a vector space.

1.4 Characterization of subspaces


Suppose V is a vector space. U is a subspace of V if and only if 0 ∈ U and U is
closed under + and ·.

1.5 Sum of subspaces


Let U1 , · · · ,Um be subspaces of V . The sum of U1 , · · · ,Um is defined as

U1 + · · · +Um = {u1 + · · · + um : u j ∈ U j }.

A sum is direct if each u ∈ U1 + · · · + Um can be written in only one way as a sum


u1 + · · · + um where u j ∈ U j .

Exercise 1.1.1
Show that U1 + · · · +Um is the smallest subspace of V containing all of U1 , · · · ,Um .

1.6 Characterization of direct sum


U1 + · · · + Um is direct if and only if the only way to write 0 as a sum as in the
definition is by choosing u j = 0, ∀ j = 1, · · · , m.

PROOF. Suppose
v = u1 + · · · + um = ũ1 + · · · + ũm .
Then
(u1 − ũ1 ) + · · · + (um − ũm ) = 0,
so if the only way to write 0 is the trivial one, then u j = ũ j for every j. The converse is immediate.

Exercise 1.1.2
1. Prove the characterization for two subspaces: U + W is direct if and only if
U ∩W = {0}.
2. Show that the above characterization does not generalize to more than two subspaces.
3. U1 , · · · ,Um are subspaces of V . Their sum is direct if and only if

U j ∩ (U1 + · · · +U j−1 +U j+1 + · · · +Um ) = {0} for each j = 1, · · · , m.

1.7 Span of a list


Given u1 , · · · , um ∈ V , the span of the given list of vectors is span(u1 , · · · , um ) = {c1 u1 + · · · + cm um : c j ∈ K}.

The span of the columns of a matrix is called its column space; the dimension of this space is the rank of the matrix.

Exercise 1.1.3
Reformulate 1.7 in terms of sum of subspaces.

Exercise 1.1.4
Show that span(u1 , · · · , um ) is the smallest subspace of V containing u1 , · · · , um . Compare with 1.1.1.

1.8 Finite-dimensional vector spaces


A vector space V is called finite dimensional if V is finitely spanned.

We continue to draw analogies between results about lists of subspaces and lists of vectors.

1.9 Linearly independent list


A list of vectors in V is called linearly independent if

c1 u1 + · · · + cm um = 0 =⇒ c1 = · · · = cm = 0.

Exercise 1.1.5
Reformulate 1.9 in terms of direct sum.

1.10 Reducing a spanning list to a linearly independent spanning list

V is finite dimensional. Given a spanning list v1 , · · · , vn of V we could always reduce it to a linearly independent list in V that still spans V .

PROOF. Suppose v1 , · · · , vn is linearly dependent. Show that there exists j ∈ {1, · · · , n} such that v j ∈ span(v1 , · · · , v j−1 ) (why?). The key is that the span does not change after removing this v j ; repeat until the remaining list is linearly independent.

1.11 Extending a linearly independent list to a spanning list


V is finite dimensional with a spanning list v1 , · · · , vn . Given a linearly independent list u1 , · · · , um , we can always extend it, using vectors from the spanning list, to a list of length n that spans V .

A direct corollary is that the length of a spanning list is no smaller than the length
of a linearly independent list.

PROOF. Suppose v1 , · · · , vn spans V . Then u1 , v1 , · · · , vn is linearly dependent. Using the same argument as in the proof of 1.10, we can remove one of the v j 's without changing the span. By repeatedly inserting the next u and removing one of the v's in this way, we get a list u1 , · · · , um , v_{i1} , · · · , v_{i_{n−m}} that spans V .

1.12 Length of bases are equal


Suppose V is finite dimensional, and each of u1 , · · · , um and v1 , · · · , vn is a linearly
independent list of vectors that spans V (the existence follows from 1.10). Then
m = n.

P ROOF.u1 , · · · , um is linearly independent, and v1 , · · · , vn is a spanning list, hence


m ≤ n. Similarly n ≤ m.

1.13 Bases of a vector space and dimension


A basis of a finite dimensional space V is a linearly independent list of vectors that spans V . The dimension of V is the length of a basis; this is well defined by 1.12.

Exercise 1.1.6
Prove that the subspaces of a finite dimensional space are finite dimensional.

Exercise 1.1.7
V finite dimensional with U1 ,U2 being its subspaces. Show that

dim(U1 +U2 ) = dimU1 + dimU2 − dim(U1 ∩U2 ).

Hence derive a criterion for U1 +U2 being a direct sum.
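As an illustration of the formula in 1.1.7 (an editorial example, not from the original notes): take V = R³ and let U1 , U2 be two distinct planes through the origin. Then U1 + U2 = R³ while U1 ∩ U2 is a line, and indeed dim(U1 + U2 ) = 2 + 2 − 1 = 3. The sum is direct exactly when dim(U1 ∩ U2 ) = 0, i.e. U1 ∩ U2 = {0}.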

Exercise 1.1.8
V finite dimensional with U1 , · · · ,Um being its subspaces. Show that

dim(U1 + · · · +Um ) ≤ dimU1 + · · · + dimUm .

Prove that the equality holds if and only if the sum is direct.

1.2 Linear Maps

1.14 Linear maps


U,V are K−vector spaces. T : U → V is called a linear map or a vector space
homomorphism if it preserves the vector space structure i.e.

T (u + w) = T (u) + T (w), T (cu) = cT (u).

The set of all linear maps from U to V forms a vector space under pointwise addition and scalar multiplication.

1.15 Hom(U,V )
The vector space formed by all linear maps is called Hom(U,V ), the vector space
of homomorphisms from U to V .

We often write Hom(V,V ) as End(V ), the vector space of endomorphisms (homomorphism


from V to itself) on V .

1.16 Determination of a linear map by images of a basis


Suppose u1 , · · · , um is a basis of U and v1 , · · · , vm is an arbitrary list of vectors in V of length m. Then there exists a unique T ∈ Hom(U,V ) such that T (u j ) = v j , j = 1, · · · , m.

P ROOF.Suppose there exists such a linear map T , then

T (c1 u1 + · · · + cm um ) = c1 T (u1 ) + · · · + cm T (um ) = c1 v1 + · · · + cm vm .

Therefore the map would be unique. It is trivial to show that the above map is
indeed linear.

A corollary is the following.

1.17 Dimension of Hom(U,V )


Suppose dimU = m, dimV = n. Then

dim Hom(U,V ) = mn.

Suppose v1 , · · · , vn is a basis of V and v ∈ V ,

v = a1 v1 + · · · + an vn .

Then by the property of a basis, we can identify v through the coordinate map with the column vector (a1 , · · · , an )⊺ .

1.18 Matrix of a linear map


Suppose U,V are finite-dimensional, u1 , · · · , um is a basis of U, v1 , · · · , vn is a basis of V , and u ∈ U. Let

M = ( T (u1 ) · · · T (um ) )

be the matrix whose jth column is the coordinate vector of T (u j ) with respect to v1 , · · · , vn . Then

T (u) = M · u,

where · is the standard multiplication of matrices and each vector is identified with its coordinate column in the respective basis.

The proof needs the following convenient result about the standard multiplication of matri-
ces.

Suppose

v = (a1 , · · · , an )⊺ ,    A = ( c1 · · · cn ),

where c1 , · · · , cn are the columns of A. Then

A · v = a1 c1 + · · · + an cn ,

i.e. A · v is the linear combination of the columns of A with coefficients a1 , · · · , an . From this, the proof is evident. This theorem says every linear map f : V → W between finite-dimensional spaces can be understood as a linear map M : Kn → Km .
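The following short Python/NumPy sketch is an editorial illustration (not part of the original notes) of 1.16-1.18: a linear map is determined by the images of a basis, and those images, written in coordinates, form the columns of its matrix. The map and basis chosen here are made-up examples.

import numpy as np

# Example map: the derivative d/dx on polynomials of degree <= 2,
# with basis 1, x, x^2 (a polynomial is identified with its coefficient column).
# d/dx(1) = 0, d/dx(x) = 1, d/dx(x^2) = 2x, so the columns of M are their coordinates:
M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])

p = np.array([5.0, 3.0, 4.0])   # p(x) = 5 + 3x + 4x^2
print(M @ p)                     # [3. 8. 0.], i.e. p'(x) = 3 + 8x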

Exercise 1.2.1
1. For X = (xi j ) ∈ Kn×n , define tr : X ↦ ∑_{i=1}^{n} xii . Show that tr is a linear map
from Kn×n to K, and tr(AB) = tr(BA).
2. Now suppose a linear map T from Kn×n to K satisfies T (AB) = T (BA).
Show that there exists c ∈ K such that T = c tr.

We now study the structure of linear maps.

1.19 Kernel of a linear map


The kernel of a linear map T ∈ Hom(U,V ) is the subspace of U

ker T = {u ∈ U : Tu = 0}.

Exercise 1.2.2
Show that T is injective if and only if ker T = {0}.

1.20 Image of a linear map


The image of a linear map T ∈ Hom(U,V ) is the subspace of V
im T = {Tu : u ∈ U}.

The next theorem is a fundamental result in this section.

1.21 Rank-Nullity theorem


Suppose T : U → V , U,V finite-dimensional. Then

dimU = dim ker T + dim im T.

Also, the rank of (any) matrix of T is dim im T .

P ROOF.Find a basis v1 , · · · , vm of ker T and extend it to a basis


v1 , · · · , vm , vm+1 , · · · , vn of U. Then T vm+1 , · · · , T vn is a basis of im T .
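A small numerical illustration of 1.21 (an editorial addition, not from the notes; the matrix is a made-up example): for the map T (u) = Au, NumPy's rank gives dim im T and the theorem then gives dim ker T .

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])        # a 2x3 matrix of rank 1

rank = np.linalg.matrix_rank(A)        # dim im T
nullity = A.shape[1] - rank            # dim ker T, by the rank-nullity theorem
print(rank, nullity, A.shape[1])       # 1 2 3: indeed 3 = 2 + 1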

As an application of 1.21 we study the invertible linear maps.

1.22 Invertible linear maps


We say a linear map T ∈ Hom(U,V ) is invertible or is an isomorphism if ∃S ∈ Hom(V,U) : T S = IV , ST = IU .

1.23 Invertible = injective + surjective


T ∈ Hom(U,V ) is invertible if and only if it is both injective and surjective.

We omit the proof.

1.24 Invertibility of linear map


Let U,V finite dimensional, T ∈ Hom(U,V ). T is invertible if and only if ker T =
{0} and dimU = dimV .

PROOF. ker T = {0} is equivalent to T being injective; combined with dimU = dimV , the rank-nullity theorem 1.21 then gives surjectivity (and conversely).

A simple corollary is the following.



1.25 Invertibility of matrices


Only square matrices are invertible. Among the square matrices i.e. for A ∈ Kn×n ,
A is invertible if and only if rank A = n.

P ROOF.dim im T = rank A.

In the following we present more applications.

Exercise 1.2.3
Let Pn−1 be the set of all polynomials with coefficients in R of degree at most n − 1, and let α1 , · · · , αn be distinct points in R. Define

L : Pn−1 → Rn , f 7→ L( f ) = ( f (α1 ), · · · , f (αn )).

1. Show
p j (t) = ∏_{1≤k≤n, k≠ j} (t − αk )/(α j − αk )
maps to the standard basis in Rn and conclude L is invertible.
2. Find L−1 and the coordinate of f ∈ Pn−1 under the basis p1 , · · · , pn .
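The sketch below is an editorial illustration of Exercise 1.2.3 (not from the notes; the sample points and the polynomial f are made-up): the Lagrange polynomials p_j are sent by L to the standard basis of R^n, and the inverse map reconstructs f from its values.

import numpy as np

alphas = np.array([0.0, 1.0, 2.0])             # distinct points, n = 3

def lagrange_basis(j, t, alphas):
    """Evaluate p_j(t) = prod_{k != j} (t - alpha_k) / (alpha_j - alpha_k)."""
    num = np.prod([t - a for k, a in enumerate(alphas) if k != j])
    den = np.prod([alphas[j] - a for k, a in enumerate(alphas) if k != j])
    return num / den

# L(p_j) should be the j-th standard basis vector of R^3.
for j in range(len(alphas)):
    print([round(lagrange_basis(j, a, alphas), 10) for a in alphas])

# L^{-1}(w) = sum_j w_j p_j: recover f(t) = t^2 from its values at the alphas.
w = alphas ** 2
t = 1.5
print(sum(w[j] * lagrange_basis(j, t, alphas) for j in range(3)))   # 2.25 = 1.5^2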

Exercise 1.2.4
Suppose V is finite-dimensional and R, S, T ∈ End(V ). Prove the Frobenius’ in-
equality:
dim im RST + dim im S ≥ dim im RS + dim im ST.
Hint: after some applications of 1.21, consider the restrictions ST |ker RST and S|ker RS .

Exercise 1.2.5
Suppose V is finite-dimensional and T ∈ End(V ). Prove:
dim(ker T ∩ im T ) = dim ker T 2 − dim ker T = dim im T − dim im T 2 .

Exercise 1.2.6
Suppose V is finite-dimensional and T ∈ End(V ). Prove the following equivalence:

V = ker T ⊕ im T
⇐⇒ dim ker T 2 = dim ker T
⇐⇒ ker T 2 = ker T
⇐⇒ im T = im T 2
⇐⇒ dim im T = dim im T 2 .

Exercise 1.2.7
T ∈ Hom(U,V ), U,V finite dimensional.
1. Show that there exists S ∈ Hom(V,U) such that T ST = T, ST S = S
2. Prove: if S is unique then T is invertible with inverse S.

Exercise 1.2.8
U,V finite dimensional. T ∈ Hom(U,V ), S ∈ Hom(V,U) such that T ST = T, ST S =
S (c.f. 1.2.7).
1. Show that
U = ker T ⊕ im S,V = ker S ⊕ im T.
2. Prove: T |im S , S|im T are inverses of each other.

Suppose f : V → W , where V and W are finite-dimensional. The choice of a basis α : v1 , · · · , vn in V induces a coordinate map A : Kn → V . This is an isomorphism. A different choice of basis, α̃ : ṽ1 , · · · , ṽn , would give another coordinate map A′ . Similarly, bases β : w1 , · · · , wm and β̃ : w̃1 , · · · , w̃m of W give coordinate maps B and B′ . Let T_A^{A′} be the linear isomorphism mapping ei to A−1 ṽi , the coordinate of ṽi under α . This gives the left commutative triangle. The right commutative triangle is obtained by defining T_B^{B′} as the linear isomorphism mapping ei to (B′ )−1 wi , the coordinate of wi under β̃ . Joining the commutative diagrams gives

M̃ = T_B^{B′} M T_A^{A′} ,

where M and M̃ are the matrices of f with respect to the old and the new pairs of bases respectively.

This is the so-called change of basis formula.
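A numerical sketch of the change of basis formula (an editorial addition, not from the notes; the map f and the bases are made-up examples). The coordinate maps A, A′ , B, B′ are the matrices whose columns are the chosen basis vectors.

import numpy as np

f = np.array([[1.0, 2.0],
              [3.0, 4.0]])                          # f in the standard bases

A  = np.eye(2);  B = np.eye(2)                       # old bases (standard)
A2 = np.array([[1.0, 1.0],                           # new basis of the domain
               [0.0, 1.0]])
B2 = np.array([[2.0, 0.0],                           # new basis of the codomain
               [0.0, 1.0]])

M  = np.linalg.inv(B) @ f @ A                        # matrix of f w.r.t. old bases
TA = np.linalg.inv(A) @ A2                           # T_A^{A'}: e_i -> A^{-1} v~_i
TB = np.linalg.inv(B2) @ B                           # T_B^{B'}: e_i -> (B')^{-1} w_i
M2 = TB @ M @ TA

print(np.allclose(M2, np.linalg.inv(B2) @ f @ A2))   # True: M~ is f in the new bases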

We now define the notion of matrix equivalence.

1.26 Equivalence
Given A, B ∈ Km×n . A and B are said to be equivalent if there exists invertible
P ∈ Km×m , Q ∈ Kn×n such that

B = P AQ.

1.27 Canonical form under equivalence


 
Suppose A ∈ Km×n , rank A = r. Then A is equivalent to

( Ir  0 )
( 0   0 ).

P ROOF.Combine Theorem 1.21 and change of basis.

There is a corollary.

1.28 Row rank = Column rank


For A ∈ Km×n define its row rank as the rank of A⊺ . Then the row rank of A is
equal to the (column) rank.

For endomorphisms, to observe what happens within a single space, we require the basis in which the image is written to be the same as the basis of the domain.

1.29 Similarity
Given A, B ∈ Kn×n . A and B are said to be similar if there exists invertible P ∈
Kn×n such that
B = P AP −1 .

Exercise 1.2.9
Suppose A ∈ Kn×n , B ∼ A. Prove: tr B = tr A.

1.3 Dual Vector Space


In the following sections, we provide a dual view of linear homomorphisms discussed in
previous sections. The key is to consider the space Hom(V, K).
For simplicity, we use the notation V ∗ = Hom(V, K).

1.30 Isomorphism between V and Hom(V, K)


Suppose v1 , · · · , vn is a basis of V . Then
L : V ∗ → V, f ↦ ∑_{i=1}^{n} f (vi ) vi

is an isomorphism from V ∗ to V .

From the above theorem, we directly have

1.31 Basis of Hom(V, K)


Suppose v1 , · · · , vn is a basis of V . Then
f1 , · · · , fn ∈ V ∗ : fi (v j ) = δi j := 1 if i = j, 0 if i ≠ j

is a basis of V ∗ . It is called the dual basis of v1 , · · · , vn .



Suppose v ∈ V, f ∈ V ∗ , then
v = ∑_{i=1}^{n} fi (v) vi ,    f = ∑_{i=1}^{n} f (vi ) fi .
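A short numerical sketch of the dual basis (an editorial addition, not from the notes; the basis below is a made-up example): if the columns of P are a basis v1 , · · · , vn of Kn , then the rows of P −1 represent the dual functionals f1 , · · · , fn , since (P −1 P)i j = fi (v j ) = δi j .

import numpy as np

P = np.array([[1.0, 1.0],
              [0.0, 2.0]])             # columns: v1 = (1,0), v2 = (1,2)
F = np.linalg.inv(P)                   # rows: the dual basis f1, f2

print(F @ P)                           # identity matrix, i.e. f_i(v_j) = delta_ij

v = np.array([3.0, 4.0])
coords = F @ v                         # f_i(v): coordinates of v in the basis
print(np.allclose(P @ coords, v))      # v = sum_i f_i(v) v_i -> True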

The main result in this section is the following theorem.

1.32 Duality of Hom(U,V )


Suppose U,V are finite-dimensional. Then

Hom(U,V ) = U ∗ ⊗V.

For simplicity, we have used the following notation.

1.33 f ⊗ v
Suppose v ∈ V, f ∈ U ∗ . Define

f ⊗ v ∈ Hom(U,V ) : · 7→ f (·)v.

1.34 U ∗ ⊗V
Define U ∗ ⊗V as the subspace

U ∗ ⊗V := span{ f ⊗ v, f ∈ U ∗ , v ∈ V }.

of Hom(U,V ).

The following definition provides a way for readers to verify 1.32.

1.35 Dual map


For T ∈ Hom(U,V ), we call the linear map T ∗ ∈ Hom(V ∗ ,U ∗ ) such that

T ∗ ( f ) = f ◦ T, ∀ f ∈ V ∗

the dual map of T .



1.36 T ∗ and T
U,V finite-dimensional. Suppose T ∈ Hom(U,V ). v1 , · · · , vn is a basis of V with
dual basis f1 , · · · , fn . Then
T = ∑_{i=1}^{n} T ∗ ( fi ) ⊗ vi .

T∗ is the unique map satisfying the above.

Therefore,
f ⊗ v ∈ Hom(U,V ) : · 7→ f (·)v
induces a natural isomorphism between U ∗ ⊗V and Hom(U,V ).
Chapter 2

Inner Product Spaces

2.1 Inner Products and Norms


We introduce inner products to our vector spaces to help us study the diagonalization of
matrices of linear operators. The reason why an inner product is helpful lies in its intrinsic
relation with linear functionals.

For inner products to be defined, we need our vector space to be defined on R or C. There-
fore, we use F in this chapter to denote a field that is either R or C.

2.1 Inner products


An inner product on V is a function

h·, ·i : V ×V → F

that satisfies
• hv, vi ∈ R+ ∪ {0} for all v ∈ V ;
• hv, vi = 0 ⇐⇒ v = 0;
• hu + v, wi = hu, wi + hv, wi for all u, v, w ∈ V ;
• hλ u, vi = λ hu, vi for all λ ∈ F, u, v ∈ V ;
• hu, vi and hv, ui are complex conjugates of each other (conjugate symmetry).

A vector space with an inner product defined on it is called an inner product space.


An inner product has the following properties granted by its definition.

2.2 Properties of an inner product


Suppose V is an inner product space and u, v, w ∈ V . Then
• h0, ui = hu, 0i = 0;
• hu, v + wi = hu, vi + hu, wi;
• hu, λ vi = λ̄ hu, vi.

P ROOF.The first proposition follows by observing hu, vi = −h−u, vi. The second
and third comes from the conjugate symmetry.

For each fixed u ∈ V , the map v 7→ hv, ui is a linear functional.

Norms are also important quantities on vector spaces. They are closely related to inner
products, and an important class of them could be defined using inner product.

2.3 Norms
A norm is a function k·k : V → R+ ∪ {0} that satisfies
• kvk = 0 ⇐⇒ v = 0;
• kλ vk = |λ | kvk;
• ku + vk ≤ kuk + kvk.

2.4 Norm induced from inner product


Suppose V is an inner product space. Then
p
kvk = hv, vi

is a norm on V .

P ROOF.The proof to 2.5 and 2.7 combined together gives a proof to 2.4.

2.5 Properties of the norm of vectors


Suppose V is an inner product space and v ∈ V .
• kvk = 0 ⇐⇒ v = 0;
• kλ vk = |λ | kvk;

P ROOF.The first proposition follows from the definiteness of inner products. The
second proposition follows from expansion and |λ |² = λ λ̄ .

2.6 Cauchy-Schwarz inequality


Suppose V is an inner product space and u, v ∈ V . Then

|hu, vi| ≤ kuk kvk .

This inequality is an equality if and only if u, v are linearly dependent.

P ROOF.Let λ ∈ C. Then

0 ≤ hλ u + v, λ u + vi = |λ |2 hu, ui + 2 Re(λ hu, vi) + hv, vi.

Now take λ = µ hv, ui where µ ∈ R and we have

0 ≤ µ² |hu, vi|² kuk² + 2µ |hu, vi|² + kvk² ,

which is a real quadratic polynomial in µ that is nonnegative for every µ . Hence its discriminant satisfies ∆ ≤ 0, which gives the desired result. The equality holds if and only if hλ u + v, λ u + vi = 0 ⇐⇒ λ u + v = 0 for some λ ∈ C, i.e. u, v are linearly dependent.

From 2.6, we could derive the triangular inequality.

2.7 Triangular inequality


Suppose V is an inner product space and u, v ∈ V . Then
ku + vk ≤ kuk + kvk .
This inequality is an equality if and only if one of u, v is a nonnegative scalar multiple of the other.

P ROOF.We have, from the property of complex numbers and 2.6,

ku + vk2 = hu + v, u + vi
= kuk2 + kvk2 + 2 Re(hu, vi)
≤ kuk2 + kvk2 + 2 |hu, vi|
≤ kuk2 + kvk2 + 2 kuk kvk
= (kuk + kvk)2 .

The equality holds if and only if

Re(hu, vi) = |hu, vi| = kuk kvk ⇐⇒ hu, vi = kuk kvk

since Re(hu, vi) = |hu, vi| ⇐⇒ hu, vi ∈ R+ ∪ {0}.

We have now completed our proof of 2.4. This means that we have a natural norm induced
from the inner product in an inner product space.

We have two other fundamental properties of the norm induced by an inner product.

2.8 Parallelogram identity


Suppose V is an inner product space and u, v ∈ V . Then

ku + vk2 + ku − vk2 = 2(kuk2 + kvk2 )

P ROOF.We have from expansion

ku + vk2 + ku − vk2 = hu + v, u + vi + hu − v, u − vi
= hu, ui + hu, vi + hv, ui + hv, vi
+ hu, ui − hu, vi − hv, ui + hv, vi
= 2(kuk2 + kvk2 ).

2.9 Polarization identity


Suppose V is an inner product space over F and u, v ∈ V .
• If F = R, then

hu, vi = (ku + vk² − ku − vk²)/4.

• If F = C, then

hu, vi = (ku + vk² − ku − vk² + ku + ivk² i − ku − ivk² i)/4.

P ROOF.We have from expansion

ku + vk2 − ku − vk2 = hu + v, u + vi − hu − v, u − vi = 2(hu, vi + hv, ui).

• If F = R, then hu, vi = hv, ui and the identity follows.


• If F = C, then

ku + ivk² − ku − ivk² = 2i(hv, ui − hu, vi),

and multiplying this by i and adding it to the expansion above, the hv, ui terms cancel and the identity follows.

The next theorem shows that the class of norms that satisfies 2.8 can all be induced from
inner products.

2.10 Norms satisfying parallelogram identity


Suppose k·k is a norm on V satisfying the parallelogram equality, then there is an
inner product h·, ·i on V such that
p
kvk = hv, vi, ∀v ∈ V.

P ROOF.The key idea of the proof is to construct the function h·, ·i : V × V → F as


in 2.9, and verify that this function is indeed an inner product on V .

Theorem 2.10 characterizes inner product spaces: a vector space on which a norm satisfying 2.8 can be defined is an inner product space, and conversely by 2.4.

2.2 Orthonormal Bases


We now introduce orthogonality, which defines orthonormality and helps us to relate
linear functionals and inner products.

2.11 Orthogonality
Suppose V is an inner product space. We say u, v ∈ V are orthogonal to each other
or u ⊥ v if hu, vi = 0.

The next theorem generalizes the fundamental result about right triangles.

2.12 Pythagorean theorem


Suppose u, v ∈ V and u ⊥ v. Then

ku + vk2 = kuk2 + kvk2 .

P ROOF.We have from expansion

ku + vk2 = hu + v, u + vi = hu, ui + hv, vi + hu, vi + hv, ui

and the proposition follows since hu, vi = 0.

2.13 Orthonormal lists


A list of vectors e1 , · · · , en in V is called orthonormal if hei , e j i = δi j for all i, j. In other words, the vectors all have norm 1 and are orthogonal to each other.

2.14 The norm of an orthonormal linear combination


Suppose e1 , · · · , em is an orthonormal list in V . Then

ka1 e1 + · · · + am em k2 = |a1 |2 + · · · + |am |2

for all a1 , · · · , am ∈ F.

PROOF. Since each e j has norm 1, this follows easily from repeated applications of 2.12.

2.14 also implies that orthonormal lists (in fact all orthogonal lists) are linearly
independent lists.

2.15 A vector as linear combination of orthonormal basis


Suppose e1 , · · · , en is an orthonormal basis of V . Then

v = hv, e1 ie1 + · · · + hv, en ien .

P ROOF.Suppose v = a1 e1 + · · · + an en . Then we have

hv, e j i = ha1 e1 + · · · + an en , e j i = a j .

This also shows that

kvk2 = |hv, e1 i|2 + · · · + |hv, en i|2 .

Suppose u, v ∈ V with v ≠ 0. We wish to find a vector w ⊥ v such that u = cv + w for some c ∈ F. This is equivalent to finding c ∈ F such that hu − cv, vi = 0. It is easy to see that

hu − cv, vi = hu, vi − chv, vi = 0 ⇐⇒ c = hu, vi / kvk².

Also notice that if there exists n ∈ V such that n ⊥ u, n ⊥ v, then

hw, ni = hu − cv, ni = 0 =⇒ w ⊥ n.

Now suppose we have a list v1 , · · · , vn of pairwise orthogonal vectors in V , and we wish to find w ∈ V such that u = w + a1 v1 + · · · + an vn with w orthogonal to every v j . We can first find w1 = u − a1 v1 such that w1 ⊥ v1 . Then we can find w2 = w1 − a2 v2 such that w2 ⊥ v2 , while at the same time w2 ⊥ v1 .

The same idea allows us to get a list of vectors orthogonal to each other. Upon
scaling the vectors to vectors of norm 1, we could get an orthonormal list.

2.16 Gram-Schmidt Process


Suppose v1 , · · · , vm is a linearly independent list of vectors in V . Let e1 = v1 /kv1 k. For j = 2, · · · , m, define e j inductively by

e j = (v j − hv j , e1 ie1 − · · · − hv j , e j−1 ie j−1 ) / kv j − hv j , e1 ie1 − · · · − hv j , e j−1 ie j−1 k.

Then e1 , · · · , em is an orthonormal list of vectors in V and

span(v1 , · · · , v j ) = span(e1 , · · · , e j ), ∀ j = 1, · · · , m.

The spanning property is from the linear independence. If the list given is a linearly
dependent list, the process spits out 0.
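A minimal numerical sketch of the Gram-Schmidt process 2.16 (an editorial addition, not from the notes; the input list is a made-up example), using the standard inner product on R^n.

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors."""
    es = []
    for v in vectors:
        w = v - sum(np.dot(v, e) * e for e in es)   # subtract projections on earlier e's
        es.append(w / np.linalg.norm(w))            # normalize (w != 0 by independence)
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(vs)
G = np.array([[np.dot(ei, ej) for ej in es] for ei in es])
print(np.allclose(G, np.eye(3)))                    # True: the list is orthonormal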

2.17 Existence of an orthonormal basis


Suppose V is finite-dimensional inner product space. Then there exists an orthonor-
mal basis of V .

2.18 Orthonormal list could be extended to orthonormal basis


Suppose V is finite-dimensional inner product space. Then every orthonormal list
of vectors in V can be extended to an orthonormal basis of V .

P ROOF.Extend the list to a basis, and apply 2.16. The first vectors are not changed,
and the basis is changed to an orthonormal basis.

After showing 2.16, we have studied most of the details of the inner product structure that we need for the discussion in the following chapters. In particular, we can now upgrade theorems about general bases to theorems about orthonormal bases.

2.19 Schur’s theorem


Suppose V is a finite-dimensional complex inner product space and T ∈ L (V ).
Then T has an upper-triangular matrix w.r.t. some orthonormal basis of V .

PROOF. Take a basis v1 , · · · , vn of V with respect to which T has an upper-triangular matrix (such a basis exists; see 4.13, whose proof does not depend on this result), and apply 2.16 to it. The resulting orthonormal basis gives the wanted matrix since span(e1 , · · · , e j ) = span(v1 , · · · , v j ), ∀ j.

2.3 Orthogonal Complement

2.20 Orthogonal complement


Suppose V is an inner product space and U is a subset of V . The orthogonal com-
plement of U, denoted U ⊥ , is the set of all vectors in V that are orthogonal to every
vector in U.

2.21 Properties of orthogonal complement


Suppose V is an inner product space and U is a subset of V . Then
• U ⊥ is a subspace of V .
• U ∩U ⊥ ⊂ {0}.
• U ⊂ (U ⊥ )⊥ .

PROOF. Note that U need not be a subspace of V .


• Suppose v, w ⊥ u, ∀u ∈ U, then

hλ v + µ w, ui = λ hv, ui + µ hw, ui = 0 =⇒ λ v + µ w ∈ U ⊥ .

• Suppose v ∈ U ∩U ⊥ , then hv, vi = 0 =⇒ v = 0.


• Suppose u ∈ U, then u ⊥ v, ∀v ∈ U ⊥ =⇒ U ⊂ (U ⊥ )⊥ .

2.22 Orthogonal complement of a subspace


Suppose U ⊂ V is finite-dimensional. Then
• if u1 , · · · , um is a basis of U and w ⊥ u j , ∀ j = 1, · · · , m, then w ∈ U ⊥ .
• V = U ⊕U ⊥ . We write V = U ⊥ U ⊥ .
• U = (U ⊥ )⊥ .

P ROOF.
Note that V need not be finite-dimensional.
• Let e1 , · · · , em be an orthonormal basis of U. Then

v = hv, e1 ie1 + · · · + hv, em iem + (v − hv, e1 ie1 − · · · − hv, em iem ).

Since

hv − hv, e1 ie1 − · · · − hv, em iem , e j i = hv, e j i − hv, e j i = 0

we have V = U +U ⊥ . From 2.21, we have U ∩U ⊥ = {0}. Together we have


V = U ⊕U ⊥ .
• Suppose v ∈ (U ⊥ )⊥ , then ∃ u ∈ U, w ∈ U ⊥ : v = u + w.
Suppose n ∈ U ⊥ , then

hn, vi = 0 =⇒ hn, u + wi = hn, ui + hn, wi = hn, wi = 0.

Therefore, w ∈ U ⊥ ∩ (U ⊥ )⊥ =⇒ w = 0 =⇒ v = u ∈ U. From 2.21, we


have U ⊂ (U ⊥ )⊥ . Together we have U = (U ⊥ )⊥ .
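A numerical sketch of 2.22 for a subspace of R^n with the standard inner product (an editorial addition, not from the notes; the subspace below is a made-up example): an orthonormal basis of U ⊥ can be read off from the null space of the matrix whose rows span U, and dimU + dimU ⊥ = n.

import numpy as np

U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])            # rows span a 2-dimensional subspace U of R^3

# Basis of U-perp from the SVD: right singular vectors with zero singular value.
_, s, Vt = np.linalg.svd(U)
rank = np.sum(s > 1e-12)
W = Vt[rank:].T                            # columns: basis of U-perp
print(W.shape[1], np.allclose(U @ W, 0))   # 1 True: dim U-perp = 3 - 2, and U-perp ⟂ U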
Chapter 3

Determinants

We assume the vector spaces in this chapter are finite-dimensional.

The determinant of a matrix is a powerful tool for studying linear operators. Thanks to
the theorems from exterior algebra, we could define it axiomatically - meaning we could
construct it uniquely from some axioms.

3.1 Determinant
Suppose A ∈ Kn×n and write A = (ai j )n×n . The determinant is the function det : Kn×n → K that satisfies:
• multilinearity (in columns), i.e.

det(· · · λ bi + µ ci · · · ) = λ det(· · · bi · · · ) + µ det(· · · ci · · · )

where bi = (b1i , · · · , bni )⊺ , ci = (c1i · · · , cni )⊺ ;


• alternating property, i.e.

det(· · · ai · · · a j · · · ) = − det(· · · a j · · · ai · · · )

where ai = (a1i , · · · , ani )⊺ , a j = (a1 j · · · , an j )⊺ ;


• normalizing property, i.e.
det I = 1
where I = (11 , · · · , 1n ).


Suppose Zn = {1, 2, · · · , n}, n ∈ N∗ , n ≥ 2. A permutation σ is a bijection from Zn to itself.


We wish to show that 3.1 gives us a unique construction of determinants for square matrices.

Make the observation that

ak = (a1k , · · · , ank )⊺ = a1k (1, 0, · · · , 0)⊺ + · · · + ank (0, · · · , 0, 1)⊺

and det(· · · 1k · · · 1k · · · ) = 0. The multilinearity and the alternating property make the determinant of a matrix A a scalar multiple of det(I) = 1, hence unique, provided we can find the value, i.e. the sign, of

det(1σ (1) , · · · , 1σ (n) )

for any permutation σ of Zn .

A transposition τ is a permutation such that exactly two elements i, j ∈ Zn are exchanged:


τi j (i) = j; τi j ( j) = i; τi j (k) = k, ∀k 6= i, k 6= j.
The parity of a permutation is the parity of the count of transpositions a permutation could
be decomposed into.
There are infinitely many ways to decompose a permutation into transpositions. However,
the parity of a permutation is always well defined.

3.2 Parity of a permutation is well defined


The parity of a permutation σ is the parity of the count of inversions in σ (Zn ) :=
(σ (1), · · · , σ (n)), i.e. the count of pairs i, j such that i < j and σ (i) > σ ( j).

P ROOF.The key is that every transposition changes the parity of the count of inver-
sions.
• Suppose we apply τi j . Clearly, inversions formed by i or j with an element
outside of [i, j] will not be affected.
• For the M = j − i − 1 elements strictly between positions i and j, assume Ni of them form inversions with i and N j of them form inversions with j. After applying τi j , the change in the number of inversions these elements form with i is M − 2Ni , which has the same parity as M; similarly the change for j is M − 2N j . Together these two changes are even, so they do not change the parity of the count of inversions.
• However, applying τi j changes the number of inversions formed by the pair i, j itself by adding or subtracting 1.

Therefore, in total the parity changes after τi j is applied for arbitrary i, j.


Note that before applying σ , the count of inversions is 0. Therefore, the count of
transpositions the permutation σ could be decomposed into has the same parity as
the parity of inversions, which is always well defined.

By 3.2, we could define the sign of a permutation as sgn(σ ) = (−1)^{π (σ )} , where π (σ ) is the parity of σ .
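As an editorial illustration (not from the notes), the short Python sketch below computes sgn(σ ) by counting inversions as in 3.2; a permutation of Zn is written 0-based as a tuple.

def sign(perm):
    """perm contains 0..n-1; returns (-1)**(number of inversions)."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

print(sign((0, 1, 2)), sign((1, 0, 2)), sign((2, 0, 1)))   # 1 -1 1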

3.3 Leibniz expansion of determinants


Suppose A ∈ Kn×n . Then

det A = ∑_{σ ∈Sn } sgn(σ ) aσ (1)1 · · · aσ (n)n

where Sn is the set of all permutations on Zn . More generally, if f : Kn×n → K is


multilinear and alternating, then

f (A) = f (I) det A.

The key to the proof is that


• The multilinearity allows us to rewrite the determinant as a sum of determinants of
matrices with one entry in one column.
• The alternating property cancels the ones with two entries in a same row and the
"legal" ones left corresponds to permutations of the indices of rows.

P ROOF.We claim that


f (A) = f (I) ∑_{σ ∈Sn } sgn(σ ) aσ (1)1 · · · aσ (n)n .

This follows from our discussion above by repeatedly using multilinearity and al-
ternating property to
ak = (a1k , · · · , ank )⊺ = a1k (1, 0, · · · , 0)⊺ + · · · + ank (0, · · · , 0, 1)⊺
and eliminating
f (· · · 1k · · · 1k · · · ) = 0.
Now apply det I = 1.
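A numerical sketch of the Leibniz expansion 3.3 (an editorial addition, not from the notes; the matrix is a made-up example), compared against NumPy's determinant.

import numpy as np
from itertools import permutations

def sign(perm):
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det_leibniz(A):
    n = A.shape[0]
    # sum over all permutations sigma of sgn(sigma) * a_{sigma(1),1} ... a_{sigma(n),n}
    return sum(sign(p) * np.prod([A[p[k], k] for k in range(n)])
               for p in permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 2.0, 5.0]])
print(np.isclose(det_leibniz(A), np.linalg.det(A)))   # True (both are 9)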

Then an important result, similar to 1.2.1, follows from 3.3.

3.4 Determinant of products of matrices


Suppose A, B ∈ Kn×n . Then

det AB = det A det B.

P ROOF.Let
f : Kn×n → K, B 7→ det AB.
By the definition of matrix multiplication, f satisfy multilinearity in columns of B
and alternating property, since adding, scalar multiplying and exchanging columns
of B have the exact same effect on the product AB.
By 3.3, we have f (B) = f (I) det B, where f (I) = det AI = det A.

3.4 not only shows det AB = det BA, but also the stronger fact that the determinant of matrices is multiplicative.

3.5 Determinant is similarity invariant


Suppose A ∈ Kn×n , B ∼ A. Then

det B = det A.

P ROOF.By 3.4, we have

det P AP −1 = det(P ) det(A) det(P −1 )


= det(P −1 ) det(P ) det(A)
= det((P −1 P )(A))
= det A.

By Exercise 1.2.9 and 3.5, we could use the notation tr A , det A . The next theorem follows
from the permutation expression as in 3.3.

3.6 Determinant and invertibility


Suppose A ∈ Kn×n . Then A is invertible if and only if det A 6= 0.

P ROOF.If A is invertible, by 3.4, we have

det AA−1 = det A det A−1 = det I = 1 =⇒ det A 6= 0.

If A is not invertible, then its columns are linearly dependent, so some column is a linear combination of the others. Expanding det A in that column by multilinearity gives a sum of determinants of matrices with two equal columns, each of which is 0 by the alternating property; hence det A = 0.

Suppose A ∈ L (V ) and A is its matrix w.r.t. any basis of V . By the definition of


invertibility of matrices, A is invertible if and only if det A 6= 0.

The next theorem is a direct result of 3.3 and 3.4. It is extremely important in giving proof
to propositions on determinants.

3.7 Determinant of block upper triangular matrices


Suppose A ∈ Kn×n , B ∈ Km×m . Then
 
det ( A  ∗ ) = det A det B.
    ( 0  B )

PROOF. Since

( A  ∗ )   ( I  ∗ ) ( A  0 )
( 0  B ) = ( 0  B ) ( 0  I )

we only need to show that

det ( I  ∗ ) = det B,    det ( A  0 ) = det A.
    ( 0  B )                 ( 0  I )

The key is that these two functions (of B and of A respectively) are both multilinear and alternating, and

det ( I  ∗ ) = det ( I  0 ) = 1
    ( 0  I )       ( 0  I )
by 3.3.

3.8 Determinant of transpose


Suppose A ∈ Kn×n . Then
det A = det A⊺ .

P ROOF.Let A = (ai j )n×n . Then A⊺ = (bi j )n×n where bi j = a ji .


The key is that
sgn(σ ) = sgn(σ −1 )
since σ ◦ σ −1 = σ −1 ◦ σ = I, and

{(σ (i), i) : i = 1, · · · , n} = {(i, σ −1 (i)) : i = 1, · · · , n}.

Then the result follows from the permutation expression as in 3.3 and the fact that
permutations are injective and surjective.

This directly implies that we also have the multilinearity and alternating property of
determinants in rows.

To further describe the determinant of a matrix, we introduce minors.

3.9 Minors
Suppose A ∈ Kn×n . The minors of A are determinants of submatrices of A.

The submatrix Ai j of A is the matrix obtained by removing the i-th row and the j-th column of A, i.e. the entries a1 j , · · · , an j and ai1 , · · · , ain .

The determinant of such a submatrix, Ci j = det Ai j , is called a remainder minor. We could use


these minors to expand determinants.

3.10 Laplace expansion


Suppose A ∈ Kn×n . Let A = (ai j )n×n . Then

det A = (−1)1+k a1kC1k + · · · + (−1)n+k ankCnk , k = 1, · · · , n.



P ROOF.The key is that, by multilinearity in columns and rows from 3.1 and 3.8,

det A = a1kC̃1k + · · · + ankC̃nk , k = 1, · · · , n

where
C̃i j = det Ãi j
and Ãi j is obtained by changing the entries

a1 j , · · · , an j , ai1 , · · · , ai, j−1 , ai, j+1 , · · · , ain

to zero. Then by the alternating property in columns and rows we get


 
C̃i j = (−1)^{i+ j} det ( 1   0  )
                       ( 0  Ai j ).

By 3.7, we get
C̃i j = (−1)i+ j det Ai j = (−1)i+ jCi j .

We call Di j = (−1)i+ jCi j the cofactors or algebraic remainder minors of a matrix


A. Also, by 3.8, we have that

det A = (−1)k+1 ak1Ck1 + · · · + (−1)k+n aknCkn , k = 1, · · · , n.
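A recursive sketch of the Laplace expansion 3.10 along the first column (an editorial addition, not from the notes; the matrix is a made-up example).

import numpy as np

def det_laplace(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)   # remove row i, column 1
        total += (-1) ** i * A[i, 0] * det_laplace(minor)       # (-1)^{i+1} in 1-based indexing
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 2.0, 5.0]])
print(det_laplace(A), np.linalg.det(A))   # both are 9 (up to rounding)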

Exercise 3.0.1
Suppose X = (xi j ) ∈ Kn×n . Prove the following formula for the derivative of the determinant:

d(det X)/dxi j = Di j .
Chapter 4

Eigenvalues and Eigenvectors

4.1 Invariant Subspaces


From this chapter on, we study the structure of linear operators on finite dimensional vector
spaces. We assume V to be finite dimensional all the time. This is done by trying to decom-
pose it. Recall that an operator is a linear map from a vector space V to itself. Suppose that
V could be decomposed as the following direct sum:

V = U1 ⊕ · · · ⊕Um .

It is not always true that the restriction of T to each U j is an operator on that subspace: for some U j , T may not map vectors in U j to vectors in U j . When T does map U j into itself, we say that U j is an invariant subspace of T .

4.1 Invariant subspace


Suppose T ∈ End(V ). A subspace U ⊂ V is an invariant subspace of T if

u ∈ U =⇒ Tu ∈ U.

Exercise 4.1.1
Suppose every subspace of V is T −invariant. Show that T = cI.


Exercise 4.1.2
Suppose T ∈ End(V ) and tr T = 0. Show that there is a basis of V under which the
presentation of T has all diagonal entries equal to 0.
Hint: Exercise 4.1.1.

Exercise 4.1.3
T 2 = T , U is T −invariant. Show that U = K ⊕ I, where K is a subspace of ker T
and I a subspace of im T .

If for each j, U j is a T −invariant subspace, then we have a direct decomposition of the


operator.

4.2 Direct decomposition of an operator


Suppose T ∈ End(V ) and
V = U1 ⊕ · · · ⊕Um
where each U j is T −invariant. A direct decomposition of T is

T = T1 ⊕ · · · ⊕ Tm

where T j = T |U j ∈ End(U j ) is the restriction operator of T defined as

T j u = Tu ∈ U j

for every u ∈ U j .

If we have a direct decomposition of an operator T on a finite-dimensional V , then we could


have a block diagonal matrix of it. A simple enough block diagonal matrix could save us
a lot of work when we need to manipulate them. The invariant property gives 0 entries in
other cells except for the block matrices on the diagonal.

4.3 Block diagonal matrix



Suppose V is a finite-dimensional vector space and T ∈ End(V ). If

T = T1 ⊕ · · · ⊕ Tm

then there exists a basis of V w.r.t. which the matrix of T is of block diagonal form diag(A1 , · · · , Am ), i.e. the diagonal blocks are A1 , · · · , Am and all entries outside these blocks are 0, where A j is the matrix of T j (a square matrix).

If we have a direct sum decomposition of V in which dimU j = 1 for every j, then we


would have a diagonal matrix of the operator T . However, this is not always possible. We
will return later to a deeper study of invariant subspaces, whilst now we first turn to these
simplest possible nontrivial invariant subspaces: invariant subspaces with dimension 1.

4.2 Eigenvalues and Characteristic Polynomial


Suppose v ∈ V and consider span(v). If span(v) is an invariant subspace of an operator T ,
then there exists λ ∈ K such that
T v = λ v.
Consider any other u ∈ span(v), we have

Tu = T (µ v) = µ T v = µ (λ v) = (µλ )v = (λ µ )v = λ (µ v) = λ u.

Therefore, T |span(v) u = λ u.
Conversely, if T v = λ v, then span(v) is an invariant subspace of T . We name such values λ
eigenvalues of T and such v eigenvectors of T .

4.4 Eigenvalues of an operator


Suppose T ∈ End(V ). A number λ ∈ K is an eigenvalue of T if there exists a
nonzero vector v ∈ V such that T v = λ v.

Having T v = λ v for some v ≠ 0 is equivalent to having (T − λ I)v = 0, i.e. ker(T − λ I) is nontrivial. Considering the operator T − λ I, we have the following result.

4.5 Equivalence conditions to be an eigenvalue


Suppose V is finite-dimensional, T ∈ End(V ), λ ∈ K. Then the following are equiv-
alent:
1. λ is an eigenvalue of T .
2. T − λ I is not injective.
3. T − λ I is not invertible.

PROOF. The equivalence between (1) and (2) follows from the equivalence of T v = λ v and (T − λ I)v = 0. The equivalence between (2) and (3) holds because T − λ I is an operator on a finite-dimensional space, for which injectivity is equivalent to invertibility (1.24).

By 3.6, λ ∈ K is an eigenvalue of A ∈ End(V ) if and only if det(λ I − A ) = 0 i.e. λ is a


root of χ (z) = det(zI − A ). This is the characteristic polynomial of A .

4.6 Characteristic polynomial


Suppose A ∈ End(V ). The characteristic polynomial of A is χ (z) = det(zI − A ).

4.7 Roots of characteristic polynomial


Suppose A ∈ End(V ). Then the roots of the characteristic polynomial of A are exactly the eigenvalues of A .

P ROOF.This directly follows from 3.6.

4.8 Algebraic multiplicity


The algebraic multiplicity d(λ ) of an eigenvalue λ is the multiplicity of λ as a root of χ , i.e. the power of the linear factor (z − λ ) in χ .

The fundamental theorem of algebra says χ for T on complex vector spaces splits. There-
fore, we have the following important structural theorem.

4.9 Operators on complex vector spaces have an eigenvalue


Every operator on a finite-dimensional, nonzero, complex vector space has an eigen-
value.

Exercise 4.2.1
Suppose V is a complex vector space, dimV = n and T ∈ End(V ) is invertible.
Let p denote the characteristic polynomial of T and let q denote the characteristic
polynomial of T −1 . Prove that

q(z) = (p(0))−1 zn p(z−1 ).

4.10 Characteristic polynomial of restriction operators


Suppose A ∈ End(V ) and U ⊂ V is A −invariant. Let χU be the characteristic
polynomial of A |U . Then χU |χ .

PROOF. The key is that there exists a matrix of A of block upper triangular form

A = ( AU  ∗ )
    ( 0   B ).

Then by 3.7, we have

det(zI − A) = det ( zI − AU     ∗    ) = det(zI − AU ) det(zI − B).
                  (    0     zI − B  )

This implies det(zI − A |U ) | det(zI − A ).

4.11 Characteristic polynomials of restriction operators in a direct decomposition of an operator

Suppose A ∈ End(V ) and V = V1 ⊕ · · · ⊕ Vm where each V j is A −invariant. Then χ is the product of χV1 , · · · , χVm .

PROOF. The key is that there exists a matrix of A that is of block diagonal form. Then by 3.7 we have exactly what we want.

The principal minors of A are the determinants of submatrices where the rows and columns
selected have the same indices in A.

4.12 Coefficients of characteristic polynomial


Suppose A ∈ End(V ) and A be a matrix of A . Let

χ (z) = det(zI − A) = z^n + σ1 z^{n−1} + · · · + σn .

Then

σk = (−1)^k ∑ Pkk ,

where Pkk denotes a k-by-k principal minor of A and the summation is over all k-by-k principal minors.

PROOF. This follows from the permutation expression in 3.3. Notice that

∏_{1≤k≤n} (ak + bk ) = ∑_{C⊆Zn } ∏_{i∈C} ai ∏_{j∉C} b j .

Let A = (ai j )n×n . Then we have

det(zI − A) = ∑_{σ ∈Sn } sgn σ ∏_{1≤k≤n} (zδk,σ (k) − ak,σ (k) )
            = ∑_{σ ∈Sn } sgn σ ∑_{C⊆Zn } ∏_{i∈C} (−ai,σ (i) ) ∏_{j∉C} zδ j,σ ( j)
            = ∑_{C⊆Zn } (−1)^{|C|} ∑_{σ ∈Sn } sgn σ ∏_{i∈C} ai,σ (i) ∏_{j∉C} zδ j,σ ( j) ,

where Sn is the set of all permutations on Zn .

The key is that the product

∏_{j∉C} zδ j,σ ( j) = 0 ⇐⇒ ∃ j ∉ C : σ ( j) ≠ j,

and otherwise it equals z^{n−|C|}. Let S(C) be the set of permutations of Zn that fix every element outside C. In particular, the sign of a permutation in S(C) is the same as its sign as a permutation in Sn by 3.2. Then we have

∑_{C⊆Zn } (−1)^{|C|} ∑_{σ ∈Sn } sgn σ ∏_{i∈C} ai,σ (i) ∏_{j∉C} zδ j,σ ( j)
= ∑_{C⊆Zn } (−1)^{|C|} ∑_{σ ∈S(C)} sgn σ ∏_{i∈C} ai,σ (i) z^{n−|C|}
= ∑_{C⊆Zn } (−1)^{|C|} z^{n−|C|} ∑_{σ ∈S(C)} sgn σ ∏_{i∈C} ai,σ (i) .

Note that

∑_{σ ∈S(C)} sgn σ ∏_{i∈C} ai,σ (i)

is exactly the principal minor corresponding to the set C. The result follows directly.

In particular, we have

σ1 = − ∑ P11 = − tr A , σn = (−1)n ∑ Pnn = (−1)n det A .

Since the characteristic polynomial is the same for any matrix A of A , σ1 , · · · , σn


are all similarity invariants of A.
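A numerical check of two consequences of 4.12 (an editorial addition, not from the notes; the matrix is a made-up example): the coefficient of z^{n−1} is −tr A and the constant term is (−1)^n det A.

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
coeffs = np.poly(A)                      # coefficients of det(zI - A), leading 1 first
n = A.shape[0]

print(np.isclose(coeffs[1], -np.trace(A)))                     # sigma_1 = -tr A -> True
print(np.isclose(coeffs[-1], (-1) ** n * np.linalg.det(A)))    # sigma_n = (-1)^n det A -> True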

Exercise 4.2.2
Prove 4.12 using Laplace’s expansion 3.10.

We are now able to show that every operator on a complex vector space has an upper triangular matrix.

4.13 Operators on complex vector spaces have an upper triangular matrix

Suppose V is a finite-dimensional complex vector space and T ∈ End(V ). Then T has a matrix of the form

( λ1      ∗ )
(    ...    )
( 0      λn )

with respect to some basis of V .

P ROOF.The theorem is equivalent to showing that every square complex matrix


is similar to some upper triangular matrix. It is trivial when the square matrix is
1-by-1, and we show the case for arbitrary n by induction.
• Assume the proposition is true for the case n = k.
• Pick an eigenvector v ∈ V , this exists since T is an operator on a complex
vector space.
• Extend it to a basis v, u1 , · · · , uk . After changing to this new basis, the matrix of T becomes

( λ  ∗  )
( 0  Ak )

where Ak is a k-by-k matrix.


• By our induction hypothesis, Ak is similar to some upper triangular matrix
Rk .
– Ak is the matrix (in the basis u1 , · · · , uk ) of the operator S ∈ End(span(u1 , · · · , uk )) defined by

Tu j = Su j + a j v,   Su j ∈ span(u1 , · · · , uk ).

A decomposition of this form, with suitable scalars, holds for any basis chosen for span(u1 , · · · , uk ).
– Therefore, changing the basis of span(u1 , · · · , uk ) from u1 , · · · , uk to an-
other basis w1 , · · · , wk that gives us an upper triangular form allows us
to have Sw j ∈ span(w1 , · · · , w j ). Combining this with Tw j = Sw j + c j v (for some scalars c j ) gives us Tw j ∈ span(v, w1 , · · · , w j ). Therefore, the matrix of T w.r.t. the basis v, w1 , · · · , wk has the desired upper triangular form

( λ  ∗  )
( 0  Rk ).

• By induction, the result holds for arbitrary n.



This result is true for vector spaces on algebraically closed fields.

By considering the rank of a matrix, it is simple to determine from the upper triangular form whether a linear operator is invertible.

4.14 Determination of invertibility from upper triangular matrix


Suppose T ∈ End(V ) has an upper-triangular form. Then T is invertible if and only
if all the entries on the diagonal of that upper triangular matrix are nonzero.

PROOF. The key idea is that if any entry on the diagonal is zero, then the corresponding column is either the zero vector (when it happens at the first column) or it lies in the span of the previous columns. This means the matrix is not of full rank and hence not invertible. The converse is true for the same reason.

If we consider relation of λ being an eigenvalue and the invertibility of T − λ I, we could


determine eigenvalues from an upper triangular matrix.

4.15 Determination of eigenvalues from upper triangular matrix


Suppose T ∈ End(V ) has an upper triangular form. Then the eigenvalues of T are
precisely the entries on the diagonal of that upper-triangular matrix.

P ROOF.The invertibility of T − λ I is equivalent to whether there are zero entries on


the diagonal of its upper triangular form, which is equivalent to having some Akk
on the diagonal of the upper triangular form of T that satisfy Akk − λ = 0. Thus λ
is an eigenvalue of T if and only if it equals one of the numbers A11 , · · · , Ann .

Exercise 4.2.3
Suppose V is complex, T ∈ End(V ), f ∈ C[z]. Prove that α ∈ C is an eigenvalue of
f (T ) if and only if α = f (λ ) for some eigenvalue λ of T .

4.16 Determinant of matrices on algebraically closed fields


Suppose K is algebraically closed, V is a vector space on K, A ∈ L (V ) and
dimV = n. Let A ∈ Kn×n be the matrix of A w.r.t. any basis of V . Then det A
is the product of eigenvalues of A to the power of corresponding algebraic mul-
tiplicities.

PROOF. By 4.13, there exists a basis w.r.t. which A has an upper triangular matrix Rn ∈ Kn×n . Notice that

Rn = ( R11    ∗   )
     ( 0    Rn−1  )

where R11 ∈ K and Rn−1 is an upper triangular matrix. Then by repeatedly using 3.7, we get

det Rn = R11 · · · Rnn .

By 3.5, we have det Rn = det A for any matrix A of A . Therefore, for any matrix A of A , det A is the product of the diagonal entries of Rn , which by 4.15 are the eigenvalues of A .
To show that the number of times λ appears on the diagonal is exactly d(λ ), consider the matrix zI − Rn . By the same computation,

det(zI − Rn ) = ∏_i (z − λi )^{ni} ,

where ni is the number of times λi appears on the diagonal. But the left hand side is the characteristic polynomial of A . This concludes our proof.

Exercise 4.2.4
Suppose K is algebraically closed, V is a vector space on K, A ∈ L (V ) and
dimV = n. Let A ∈ Kn×n be the matrix of A w.r.t. any basis of V . Prove that
tr A is the sum of eigenvalues of A .

4.3 Cayley-Hamilton theorem


We begin by defining the key concept of a polynomial of an operator.

In fact, the main reason that a richer theory exists for operators than for more general
linear maps is that operators can be raised to powers and hence can be written as
polynomials.

4.17 Polynomial of operator


Suppose T ∈ End(V ) and m is a positive integer. Then we write

T m = T ◦ (T m−1 )

and T 0 = I.
Suppose p(z) ∈ K[z] given by

p(z) = a0 + a1 z + · · · + am zm .

Then p(T ) is an operator defined as

p(T ) = a0 I + a1 T + · · · + am T m .

4.18 Polynomials of operator commute


Suppose p(z), q(z) ∈ K[z], we have

p(z)q(z) = q(z)p(z).

In particular, we have p(T )q(T ) = q(T )p(T ).

PROOF. The key idea is to see from the definition that

p(z)q(z) = ∑_k ck z^k ,   where ck = ∑_{i+ j=k} ai b j = ∑_{i+ j=k} bi a j ,

and the same computation applies with T in place of z since powers of T commute with each other.



A corollary of 4.18 is that ker p(T ), im p(T ) are T −invariant subspaces, where p ∈
K[z].

We now give a proof to the fundamental theorem in this chapter.

4.19 Cayley-Hamilton theorem


Suppose A ∈ End(V ). Then χ (A ) = 0.

PROOF. It suffices to prove the case where V is complex, since a real matrix can be regarded as a complex matrix with the same characteristic polynomial. By 4.13, we have an upper triangular matrix representation of A ; write it in block form as [λ1 , ∗; 0, B], where the semicolon separates the two block rows. Since the diagonal entries λ1 , · · · , λn are the eigenvalues (4.15) and χ (z) = (z − λ1 ) · · · (z − λn ), we have

χ ([λ1 , ∗; 0, B]) = [0, ∗; 0, B − λ1 In−1 ] · [λ1 − λ2 , ∗; 0, B − λ2 In−1 ] · · · [λ1 − λn , ∗; 0, B − λn In−1 ].

This further simplifies to

[0, ∗; 0, B − λ1 In−1 ] · [(λ1 − λ2 ) · · · (λ1 − λn ), ∗; 0, (B − λ2 In−1 ) · · · (B − λn In−1 )].

By induction on the dimension of V , we can assume

(B − λ2 In−1 ) · · · (B − λn In−1 ) = 0.

This gives

[0, ∗; 0, B − λ1 In−1 ] · [(λ1 − λ2 ) · · · (λ1 − λn ), ∗; 0, 0] = 0.
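A numerical verification of the Cayley-Hamilton theorem (an editorial addition, not from the notes; the matrix is randomly generated): evaluating the characteristic polynomial at the matrix gives the zero matrix.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

coeffs = np.poly(A)                        # characteristic polynomial, monic, degree 4
chi_A = sum(c * np.linalg.matrix_power(A, 4 - k) for k, c in enumerate(coeffs))

print(np.allclose(chi_A, np.zeros((4, 4)), atol=1e-6))   # True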

4.4 Eigenvectors, Eigenspaces and Diagonal Matrices


In the last section, we discussed the existence of eigenvalues. In this section, we discuss the
relations of invariant subspaces of dimension 1.

4.20 Eigenvectors of an operator


Suppose T ∈ End(V ) and λ ∈ K is an eigenvalue of T . A vector v ∈ V is an eigen-
vector of T corresponding to λ if v 6= 0 and T v = λ v.

We need to first discuss the relations of eigenvectors. The next theorem shows that eigen-
vectors corresponding to distinct eigenvalues must be linearly independent.

4.21 Linearly independent eigenvectors


Suppose T ∈ End(V ), λ1 , · · · , λm are distinct eigenvalues of T , and v1 , · · · , vm is a
list of eigenvectors each of which corresponds to respective λ . Then v1 , · · · , vm is a
linearly independent list.

P ROOF.The key idea is that the image of different vectors are scalar multiples by
distinct scalars. Suppose that

a1 v1 + · · · + am vm = 0.

Apply the operator (λ1 − T ) · · · (λm−1 − T ) to both sides, and we have

am (λ1 − λm ) · · · (λm−1 − λm )vm = 0.

Since λ s are distinct, am = 0. Do this for arbitrary j = 1, · · · , m to get a j = 0.

We also define the eigenspaces to show the relation of eigenvectors corresponding to the
same eigenvalues.

4.22 Eigenspaces
Suppose T ∈ End(V ) and λ ∈ K. The eigenspace of T corresponding to λ is defined
as
E(λ , T ) = ker(T − λ I).

In particular, eigenspaces are subspaces of V since they are kernels of operators on


V.

Since eigenvectors corresponding to distinct eigenvalues are linearly independent, it is ob-


vious that the sum of eigenspaces is direct.

4.23 Sum of eigenspaces is a direct sum


Suppose T ∈ End(V ) and λ1 , · · · , λm are distinct eigenvalues. Then the sum

E(λ1 , T ) + · · · + E(λm , T )

is direct.

P ROOF.Suppose u1 + · · · + um = 0 in which u j ∈ E(λ j , T ). Since eigenvectors cor-


responding to distinct eigenvalues are linearly independent, the u j must be 0 for
each j.

Recall our discussion on decomposition of operators. Suppose dimV = n. If an operator has a diagonal matrix, then there must be a decomposition of V into n invariant subspaces of dimension 1, and vice versa. Notice that this is equivalent to the existence of an eigenspace decomposition, and to having a basis of V consisting of eigenvectors of T . We then have the following equivalent conditions for diagonalizable operators.

4.24 Equivalence conditions to be diagonalizable


Suppose V is finite-dimensional and T ∈ End(V ). Let λ1 , · · · , λm denote the distinct
eigenvalues of T . Then the following are equivalent:
1. T is diagonalizable.
2. V has a basis consisting of eigenvectors of T .
3. ∃U1 , · · · ,Un where each dimU j = 1 and U j is T −invariant such that

V = U1 ⊕ · · · ⊕Un .

4. V = E(λ1 , T ) ⊕ · · · ⊕ E(λm , T ).

P ROOF.(1) and (2) are trivially equivalent: (1) implies (2) by using 4.15, the other
way by using definition. (2) and (3) are even more obviously equivalent using the
definition of direct sum.
• Suppose that (2) is true, then we have

V = E(λ1 , T ) + · · · + E(λm , T )

which by direct sum property we have (4).


• Suppose that (4) is true, then by selecting a basis in each eigenspace, we have
a basis of V consisting of eigenvectors of T .
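A short numerical sketch of 4.24 (an editorial addition, not from the notes; the matrix is a made-up example with distinct eigenvalues): collecting eigenvectors into the columns of P gives a basis in which the operator is diagonal.

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])               # eigenvalues 2 and 5 (distinct)

evals, P = np.linalg.eig(A)              # columns of P: eigenvectors of A
D = np.linalg.inv(P) @ A @ P

print(np.allclose(D, np.diag(evals)))    # True: A is diagonalizable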

We next discuss some less trivial results about diagonalization.


Suppose V1 ,V2 ⊂ V are invariant subspaces of T and V = V1 ⊕ V2 . Consider the direct
decomposition T = T1 ⊕ T2 . If T1 , T2 are diagonalizable, then T is diagonalizable. The next
theorem shows the other way is also true.

4.25 Diagonalizable operators and diagonalizable restriction operators
Suppose V = V1 ⊕ · · · ⊕ Vm is finite-dimensional and all V j are invariant under T .
Then T is diagonalizable if and only if all T |V j are diagonalizable.

P ROOF.If each T |V j := T j is diagonalizable, then by adjoining the eigenbasis of V j


corresponding to T j , we get an eigenbasis of V .
If T is diagonalizable, then let v1 , · · · , vn be an eigenbasis corresponding to T . Then
we have v j = v j1 + · · · + v jm , v jk ∈ Vk and this decomposition is unique since the
sum is direct. For each w ∈ Vi , we have
w = a1 v1 + · · · + an vn = a1 v1i + · · · + an vni . (∗)
since the sum is direct (so the other components must equal to 0). Clearly
v1i , · · · , vni is a spanning list of Vi since w is arbitrary, then we could reduce it
to some basis of Vi . Some computations are needed to verify that w.r.t. this basis,
Ti is diagonalizable. The key would be from (∗) we have
Tw = a1 λ1 v1i + · · · + an λn vni
since all the other terms are image of 0 so they add to 0.

In fact, the list v1i , · · · , vni is a basis of Vi exactly after removing the 0s. This is because either a j = 0 or v jk = 0 for that j with nonzero a j and all k ≠ i.

The following is a useful lemma and the argument could be used in proofs of many similar
results, from 5.17 to Lie’s theorem in the theory of Lie algebra.

4.26 Lemma on commuting operators


Suppose T, S ∈ End(V ) commutes: T S = ST . Then:
1. If E is an eigenspace of T , then E is S−invariant.
2. If V is complex, T, S have a common eigenvector.

4.27 Simultaneously diagonalizable list of operators


Suppose V is finite-dimensional and T1 , · · · , Tm is a finite list of diagonalizable operators T j ∈ End(V ). Then they all have a common eigenbasis if and only if they commute: Ti T j = T j Ti , ∀ 1 ≤ i, j ≤ m.

P ROOF.One direction is obvious since

Ti T j u = λ µ u = µλ u = T j Ti u.

We use induction on m to show the other direction.


• If m = 1, then there is nothing to prove. Assume it is true for m ≤ k − 1.
• If m = k, then let
V = E(λ1 , Tk ) ⊕ · · · ⊕ E(λn , Tk )
where λ1 , · · · , λn are eigenvalues of Tk .
– Each E(λ j , Tk ) is Ti −invariant for all i by 4.26.
– By 4.25, the restriction Ti |E(λ j ,Tk ) is diagonalizable and commute with
each other for different i.
– Hence by induction, there exists a simultaneous eigenbasis of Ti |E(λ j ,Tk )
on Eλ j ,Tk for i = 1, · · · , k − 1. But Eλ j ,Tk is an eigenspace of Tk , therefore
this is a simultaneous eigenbasis for all Ti , i = 1, · · · , k.
– Adjoining the bases for each j and we have the wanted eigenbasis.
Chapter 5

Real and Complex Inner Product Spaces

In this chapter, we need to exploit the relation between operators and real and complex
numbers using inner product.

5.1 Adjoint of Operators


The first thing we need is to define the adjoint, a concept similar to the conjugate of a complex number. This requires us to exploit the similarity between inner products and linear functionals.

As such, we first show that for every linear functional φ ∈ V ∗ on V , we can find a unique u to represent φ .

5.1 Riesz representation theorem


Suppose V is finite-dimensional and φ is a linear functional on V . Then there is a
unique vector u ∈ V such that
φ (v) = hv, ui
for all v ∈ V .


P ROOF.Let e1 , · · · , en be an orthonormal basis. The key is that (the dual basis


f1 , · · · , fn satisfies)
fi = h·, ei i.
The existence and uniqueness then follows from the property of a basis.

Suppose V is an inner product space, T ∈ End(V ) and u, v ∈ V . Consider the inner product hTu, vi
and the linear functional
φv,T : u 7→ hTu, vi.
By 5.1, there exists a unique w ∈ V such that

φv,T (u) = hu, wi.

This induces a map T ∗ : v 7→ w such that

hTu, vi = hu, T ∗ vi, ∀u, v ∈ V.

Linearity of T ∗ is not hard to check from the properties of h·, ·i.

5.2 Adjoint of an operator


Suppose T ∈ End(V ). The adjoint of T is the linear map T ∗ : V → V such that

hTu, vi = hu, T ∗ vi, ∀u, v ∈ V.

Adjoints have the following properties granted by its definition.

5.3 Properties of adjoints


Suppose V is an inner product space, S, T ∈ End(V ) and λ ∈ F. Then
• (S + T )∗ = S∗ + T ∗ ;
• (λ T )∗ = λ̄ T ∗ ;
• (T ∗ )∗ = T ;
• I ∗ = I;
• (ST )∗ = T ∗ S∗ .

We have from the definition that hTu, vi = hu, T ∗ vi, so hT ∗ v, ui is the complex conjugate of hTu, vi, for all u, v ∈ V .



Suppose e1 , · · · , en is an orthonormal basis of V . Then

T ∗ e j = hT ∗ e j , e1 ie1 + · · · + hT ∗ e j , en ien ,

and each coefficient hT ∗ e j , ei i is the complex conjugate of hTei , e j i.

As such, similar to a dual map, we have the following result.

5.4 Conjugate transpose


The conjugate transpose of a matrix is the matrix obtained by taking the transpose
and then taking the complex conjugate of each entry.

5.5 Matrix of adjoint w.r.t. orthonormal bases


Suppose V is an inner product space and T ∈ End(V ). Let e1 , · · · , en be an orthonor-
mal basis of V . Then the matrix of T ∗ w.r.t e1 , · · · , en is the conjugate transpose of
the matrix of T w.r.t the same basis.

PROOF. This follows from our discussion above: the coordinates of T ∗ e j form the jth column of the matrix of T ∗ , and its ith entry hT ∗ e j , ei i is the complex conjugate of hTei , e j i, which is the ( j, i) entry of the matrix of T .
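A numerical sketch of 5.5 (an editorial addition, not from the notes; the matrix and vectors are randomly generated): for the standard inner product on C^n (which corresponds to the standard orthonormal basis), the adjoint is represented by the conjugate transpose.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = lambda x, y: np.vdot(y, x)      # <x, y> = sum_i x_i * conj(y_i)
print(np.isclose(inner(A @ u, v), inner(u, A.conj().T @ v)))   # True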

The adjoint of an operator T behaves similarly to the dual map, but lives on the same space as T .

This similarity is due to the similarity between dual bases and orthonormal bases.

5.6 Null space and range of the adjoint


Suppose V is an inner product space and T ∈ End(V ). Then
• ker T ∗ = (im T )⊥ .
• im T ∗ = (ker T )⊥ .

PROOF.

v ∈ ker T∗ ⇐⇒ T∗v = 0
⇐⇒ ⟨u, T∗v⟩ = 0, ∀u ∈ V
⇐⇒ ⟨Tu, v⟩ = 0, ∀u ∈ V
⇐⇒ v ∈ (im T)⊥.

The second proposition follows from the first, applied to T∗ and using (U⊥)⊥ = U.

A corollary of this proposition is

5.7 Eigenvalues of the adjoint


Suppose λ is an eigenvalue of T. Then its conjugate λ̄ is an eigenvalue of T∗.

5.2 Self-Adjoint and Normal Operators


We now define two special classes of operators: the self-adjoint operators and the normal
operators. We first focus on self-adjoint operators.

5.8 Self-adjoint operators


Suppose V is an inner product space. An operator T ∈ End(V) is self-adjoint if

⟨Tu, v⟩ = ⟨u, T v⟩, ∀u, v ∈ V.

In other words, a self-adjoint operator is an operator with T = T∗.

5.9 Eigenvalues of self-adjoint operators are real


Every eigenvalue of a self-adjoint operator is real.

PROOF. Suppose T v = λv with v ≠ 0. Then
λ‖v‖² = ⟨λv, v⟩ = ⟨T v, v⟩ = ⟨v, T v⟩ = ⟨v, λv⟩ = λ̄‖v‖²,
so λ = λ̄, i.e. λ is real.

This is our first result that shows the similarity between self-adjoint operators and
real numbers.

Before we proceed to our next result, we make a rough comparison between operators on real vector spaces and complex vector spaces, using informal language to illustrate our motivation.

Suppose V is a complex vector space and v is an eigenvector corresponding to λ ∈ C. Then T v need not be a mere stretching of v: multiplication by a complex number also involves a rotation. On a real vector space a rotation has no eigenvectors, whereas on a complex vector space it becomes a scalar multiple.

In conclusion, operators that behave like complex numbers, i.e. involve rotation, lose their eigenvalues and eigenvectors when they are viewed on real vector spaces.

5.10 Difference between real and complex vector spaces


Suppose V is a complex inner product space and T ∈ End(V ). Then

⟨T v, v⟩ = 0, ∀v ∈ V

if and only if T = 0.

PROOF. The key is the identity

⟨Tu, v⟩ = (⟨T(u+v), u+v⟩ − ⟨T(u−v), u−v⟩)/4 + (⟨T(u+iv), u+iv⟩ − ⟨T(u−iv), u−iv⟩)/4 · i,

and then we get that

⟨Tu, v⟩ = 0, ∀u, v ∈ V =⇒ ⟨Tu, Tu⟩ = 0, ∀u ∈ V =⇒ T = 0.



The identity we use is the analogue of 2.9 for sesquilinear forms:

B(u, v) = (B(u+v, u+v) − B(u−v, u−v))/4 + (B(u+iv, u+iv) − B(u−iv, u−iv))/4 · i.

A sesquilinear form is only required to satisfy the linearity conditions of an inner product (linearity in the first slot, conjugate-linearity in the second); positivity and conjugate symmetry are not assumed. The computation in 2.9 uses only these linearity properties, so it still yields this identity.

The significance is that we can upgrade a conclusion about all B(u, u) to a conclusion about all B(u, v).

A corollary of 5.10 is that

T = T∗ ⇐⇒ ⟨T v, v⟩ ∈ R, ∀v ∈ V.

The proof is left to the reader as an exercise.

5.10 is not true on real inner product spaces: a rotation of the plane by a right angle satisfies ⟨T v, v⟩ = 0 for all v but is not the zero operator. Such rotations lose their (complex) eigenvalues and eigenvectors over R. Since self-adjoint operators have only real eigenvalues, the analogue of 5.10 does hold for this class on real vector spaces.

5.11 Self-adjoint operators on real vector spaces


Suppose V is a real inner product space and T ∈ End(V ) is self-adjoint. Then

⟨T v, v⟩ = 0, ∀v ∈ V

if and only if T = 0.

PROOF. The key is the identity

⟨Tu, v⟩ = (⟨T(u+v), u+v⟩ − ⟨T(u−v), u−v⟩)/4,

and then we get that

⟨Tu, v⟩ = 0, ∀u, v ∈ V =⇒ ⟨Tu, Tu⟩ = 0, ∀u ∈ V =⇒ T = 0.

The identity we use is the analogue of 2.9 for symmetric bilinear forms, i.e. forms satisfying B(u, v) = B(v, u):

B(u, v) = (B(u+v, u+v) − B(u−v, u−v))/4.

Let B(u, v) = ⟨Tu, v⟩; the condition B(u, v) = B(v, u) is given by the self-adjoint property together with the symmetry of a real inner product.

5.12 Normal operators


Suppose V is an inner product space. An operator T ∈ End(V ) is normal if T ∗ T =
T T ∗.

It is easy to observe that self-adjoint operators are normal.

The next theorem provides a simple characterization of normal operators. The geometric implication of this characterization will become clear from a later result.

5.13 Characterization of normal operators


Suppose V is an inner product space. An operator T ∈ End(V) is normal if and only if ‖T v‖ = ‖T∗v‖ for all v ∈ V.

PROOF. The key is to notice that T∗T − T T∗ is a self-adjoint operator, since

(T∗T − T T∗)∗ = T∗(T∗)∗ − (T∗)∗T∗ = T∗T − T T∗.

Therefore

T∗T = T T∗ ⇐⇒ T∗T − T T∗ = 0
⇐⇒ ⟨(T∗T − T T∗)v, v⟩ = 0, ∀v ∈ V (by 5.10 and 5.11)
⇐⇒ ⟨T∗T v, v⟩ = ⟨T T∗v, v⟩, ∀v ∈ V
⇐⇒ ‖T v‖² = ‖T∗v‖², ∀v ∈ V.
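A small numerical illustration of 5.13 (an aside, not from the notes; it assumes NumPy, and the specific angle and vector are made up): a rotation matrix is normal and satisfies ‖T v‖ = ‖T∗v‖, while a Jordan block is not normal and fails this.

import numpy as np

theta = 0.7
N = np.array([[np.cos(theta), -np.sin(theta)],   # a rotation: normal, not self-adjoint
              [np.sin(theta),  np.cos(theta)]])
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])                       # a nilpotent Jordan block: not normal

v = np.array([1.0, 2.0])
assert np.isclose(np.linalg.norm(N @ v), np.linalg.norm(N.T @ v))
assert not np.isclose(np.linalg.norm(A @ v), np.linalg.norm(A.T @ v))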

Another interesting result, with geometric implication (to be revealed later), is as follows.

5.14 Normal operators and its adjoint have same eigenvectors


Suppose T ∈ End(V ) is normal. Then
1. ker T = ker T ∗ .
2. T − λ I is also normal.
3. If v is an eigenvector of T corresponding to λ, then v is also an eigenvector of T∗ corresponding to λ̄.

PROOF. The first property follows from 5.13: T v = 0 ⇐⇒ ‖T v‖ = 0 ⇐⇒ ‖T∗v‖ = 0 ⇐⇒ T∗v = 0. The second is easy to check by computation. For the third, apply the first two properties to the normal operator T − λI: if (T − λI)v = 0, then (T − λI)∗v = (T∗ − λ̄I)v = 0, so v is an eigenvector of T∗ corresponding to λ̄.

5.15 Eigenvectors of normal operators corresponding to distinct eigenvalues are orthogonal

Suppose T ∈ End(V) is normal. Then eigenvectors of T corresponding to distinct eigenvalues are orthogonal.

PROOF. The key is to use 5.14.

Suppose u, v are eigenvectors of T corresponding to λ ≠ μ respectively. Then v is also an eigenvector of T∗ corresponding to μ̄, and

⟨Tu, v⟩ = ⟨u, T∗v⟩
⟨λu, v⟩ = ⟨u, μ̄v⟩
λ⟨u, v⟩ = μ⟨u, v⟩,

which implies ⟨u, v⟩ = 0 since λ ≠ μ.

5.3 Spectral Theorems


The spectral theorem, which characterizes the orthogonally diagonalizable operators on inner product spaces, is the highlight of our discussion of operators on inner product spaces.

In this section and remaining sections of this chapter, we assume V is finite-dimensional.



We first study more about the invariant subspaces of normal operators.

5.16 Invariant subspaces of operators and adjoint


Suppose V = U ⊥ U ⊥ is an inner product space, T ∈ End(V ), and U is T −invariant.
Then U ⊥ is T ∗ −invariant.

5.17 Spectral theorem of complex vector spaces


Suppose V is a complex inner product vector space. Then T ∈ End(V ) is orthogo-
nally diagonalizable if and only if T is normal.

PROOF. One direction is immediate: w.r.t. an orthonormal eigenbasis, the matrices of T and T∗ are diagonal, and diagonal matrices commute, so T is normal. Suppose conversely that T is normal. We use induction on n = dim V to show that T is orthogonally diagonalizable.
• If n = 1, then there is nothing to prove. Assume the proposition is true for all dimensions ≤ k − 1.
• If n = k, then by 4.9 we can choose an eigenvector u of T with ‖u‖ = 1.
  – Let E be the eigenspace containing u. By 4.26, which applies since T and T∗ commute, E is invariant under T∗. By 5.16, E⊥ is then T−invariant (and also T∗−invariant, since E is T−invariant), so the restriction T|E⊥ is normal, and we can apply the induction hypothesis on E⊥ to get an orthonormal basis of E⊥ consisting of eigenvectors of T|E⊥.
  – Adjoining this basis to an orthonormal basis of E produces an orthonormal basis of V consisting of eigenvectors of T.
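A numerical illustration of 5.17 (an aside, not from the notes; it assumes NumPy, and the rotation angle is arbitrary): a plane rotation is normal but not self-adjoint, and over C it is unitarily diagonalizable.

import numpy as np

theta = 0.9
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(T @ T.conj().T, T.conj().T @ T)   # T is normal

w, V = np.linalg.eig(T)                              # eigenvalues e^{±i theta}
assert np.allclose(V.conj().T @ V, np.eye(2))        # the eigenvectors are orthonormal
assert np.allclose(V @ np.diag(w) @ V.conj().T, T)   # T is unitarily diagonalized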

5.18 Simultaneously orthogonally diagonalizable lists of operators

Suppose V is an inner product space and T1, · · · , Tm is a finite list of normal operators Tj ∈ End(V). Then all of them have the same orthonormal eigenbasis if and only if Ti Tj = Tj Ti, ∀1 ≤ i, j ≤ m.

PROOF. This follows from 5.17, 4.27 and 2.16.



The next exercise shows that the commuting list of normal operators T1 , · · · , Tm induces the
commuting algebra
C[T1 , · · · , Tm , T1∗ , · · · , Tm∗ ].
This is a commutative subalgebra of the ∗−algebra of End(V ).

Exercise 5.3.1
1. Suppose V is complex and T ∈ End(V ) is normal. Then ∃p ∈ C[z] : T ∗ =
p(T ). (Hint: Use 1.2.3.)
2. Hence, show that if T is normal, ST = T S =⇒ ST ∗ = T ∗ S.

We have studied above how self-adjoint operators resemble real numbers. We now show that self-adjoint operators are orthogonally diagonalizable on real inner product spaces.

5.19 Self-adjoint operators have an eigenvalue


Suppose V is a real inner product space and T ∈ End(V) is self-adjoint. Then T has an eigenvalue.

PROOF. χ(z) splits over C, and by 5.9 (the matrix of T w.r.t. an orthonormal basis is real symmetric, hence Hermitian) every root of it is real. This means χ(z) in fact splits over R. Therefore T has a real eigenvalue and a corresponding eigenspace.

The complex spectral theorem 5.17 then generalizes to the real case.

5.20 Spectral theorem of real vector spaces


Suppose V is a real inner product space. Then T ∈ End(V) is orthogonally diagonalizable if and only if T is self-adjoint.

5.4 Polar Decomposition and Singular Value Decomposition
The operator T∗T is always self-adjoint. Moreover, if T∗T v = λv, then

λ‖v‖² = λ⟨v, v⟩ = ⟨T∗T v, v⟩ = ⟨T v, T v⟩ = ‖T v‖²


which implies ‖T v‖ = √λ ‖v‖ (so in particular λ ≥ 0). This motivates us to study √(T∗T).

5.21 Positive operators


Suppose V is an inner product space and T ∈ End(V ). T is a positive operator if T is
self-adjoint and all eigenvalues of T are nonnegative and a positive definite operator
if T is self-adjoint and all eigenvalues of T are positive.

5.22 Square root


Suppose T, R ∈ End(V ). R is a square root of T if T = R2 .

Suppose T is a positive operator. Then, by the spectral theorem, T = λ1 I ⊥ · · · ⊥ λn I with respect to an orthogonal decomposition of V into eigenspaces. Let R = √λ1 I ⊥ · · · ⊥ √λn I. Then R is a positive square root of T. The next result shows it is unique.

5.23 Positive square root of each positive operator is unique


Every positive operator on V has a unique positive square root.

PROOF. Suppose T = R² with R positive. Then T R = RT. Therefore, by 5.18, T and R share an orthonormal eigenbasis, and on each eigenspace of T for λ the operator R acts as a nonnegative scalar whose square is λ. This means R = √λ1 I ⊥ · · · ⊥ √λn I is the only possible positive square root of T.
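As a numerical aside (a sketch, not part of the notes; it assumes NumPy, and the matrix is a made-up positive definite example), the construction of the positive square root can be carried out by diagonalizing and taking square roots of the eigenvalues.

import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
T = A.T @ A + np.eye(3)                # a positive (in fact positive definite) operator

w, Q = np.linalg.eigh(T)               # orthonormal eigenbasis, eigenvalues w > 0
R = Q @ np.diag(np.sqrt(w)) @ Q.T      # the positive square root of T

assert np.allclose(R, R.T)             # R is self-adjoint
assert np.all(np.linalg.eigvalsh(R) > 0)   # with positive eigenvalues
assert np.allclose(R @ R, T)           # and R^2 = T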


We use √T to denote the unique positive square root of T.

We have shown above that every positive operator T is of the form T = (√T)² = (√T)∗ √T. The converse is also obviously true. This gives us a characterization of positive operators.

5.24 Characterization of positive operators


Suppose T ∈ End(V ). Then the following statements are equivalent.
1. T is positive.
2. T is self-adjoint and ⟨T v, v⟩ ≥ 0, ∀v ∈ V .
3. T has a positive square root.
4. T has a self-adjoint square root.
5. ∃R ∈ End(V ) such that T = R∗ R.

PROOF. Suppose T is positive. Then

⟨T v, v⟩ = λ⟨v, v⟩ = λ‖v‖² ≥ 0

for every eigenvector v of T with eigenvalue λ, and this extends to all v ∈ V by expanding v in an orthonormal eigenbasis of T (which exists by the spectral theorem, since T is self-adjoint).
Suppose T is self-adjoint and ⟨T v, v⟩ ≥ 0, and suppose T v = λv. Then by the same equation λ ≥ 0, so T is positive, and the construction above gives a positive square root of T, which is in particular self-adjoint.
Suppose T has a self-adjoint square root R; then T = R² = R∗R.
Finally, suppose T = R∗R for some R ∈ End(V). Then T∗ = (R∗R)∗ = R∗R = T, and if T v = λv, then

⟨T v, v⟩ = ⟨R∗Rv, v⟩ = ⟨Rv, Rv⟩ = ‖Rv‖² ≥ 0,

so by the same equation again λ ≥ 0, i.e. T is positive.


5.25 Effects of T ∗T
Suppose T ∈ End(V). Then

‖T v‖ = ‖√(T∗T) v‖, ∀v ∈ V.

This is an easy upgrade of the computation at the beginning of this section.

5.26 Isometric operators


An operator S ∈ End(V ) is called an isometric operator if

‖Sv‖ = ‖v‖, ∀v ∈ V.

5.27 Characterization of isometric operators


Suppose S ∈ End(V). Then the following statements are equivalent:
1. S is an isometric operator.
2. ⟨Su, Sv⟩ = ⟨u, v⟩, ∀u, v ∈ V.
3. Se1, · · · , Sen is orthonormal for every orthonormal list of vectors e1, · · · , en in V.
4. Se1, · · · , Sen is orthonormal for some orthonormal basis e1, · · · , en of V.
5. S∗S = I.

6. SS∗ = I.
7. S∗ is an isometric operator.

PROOF. Suppose S is an isometric operator; then by 2.9 (polarization), ‖Sv‖ = ‖v‖ for all v implies ⟨Su, Sv⟩ = ⟨u, v⟩ for all u, v.
Suppose ⟨Su, Sv⟩ = ⟨u, v⟩; then every orthonormal list e1, · · · , en is mapped to the orthonormal list Se1, · · · , Sen.
Suppose there is an orthonormal basis e1, · · · , en with Se1, · · · , Sen orthonormal; then ⟨S∗Sei, ej⟩ = ⟨Sei, Sej⟩ = ⟨ei, ej⟩ = δij, hence ⟨S∗Su, v⟩ = ⟨u, v⟩ for all u, v, which means S∗S = I.
Suppose S∗S = I; then SS∗ = I, since ST = I ⇐⇒ T S = I in End(V).
Suppose SS∗ = I; then ‖S∗v‖² = ⟨S∗v, S∗v⟩ = ⟨SS∗v, v⟩ = ⟨v, v⟩ = ‖v‖², so S∗ is an isometric operator.
Finally, if S∗ is an isometric operator, the implications already proved (applied to S∗) give SS∗ = S∗S = I, hence ‖Sv‖² = ⟨S∗Sv, v⟩ = ‖v‖², so S is an isometric operator.
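A quick numerical check of 5.27 (an aside, not from the notes; it assumes NumPy, and the random unitary matrix is only for illustration):

import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S, _ = np.linalg.qr(X)                               # a random unitary (isometric) matrix

assert np.allclose(S.conj().T @ S, np.eye(3))        # S* S = I
assert np.allclose(S @ S.conj().T, np.eye(3))        # S S* = I
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(np.linalg.norm(S @ v), np.linalg.norm(v))            # ||Sv|| = ||v||
assert np.isclose(np.linalg.norm(S.conj().T @ v), np.linalg.norm(v))   # S* is isometric too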

5.28 Polar decomposition


Suppose T ∈ End(V). Then there exists an isometric operator S ∈ End(V) such that

T = S √(T∗T).

PROOF. The key is that

S : im √(T∗T) → im T,   √(T∗T) v ↦ T v

is well defined and can be extended to an isometry of V. The extension part is simple: we already have ‖S w‖ = ‖w‖ for w ∈ im √(T∗T) from 5.25, and dim im √(T∗T) = dim im T (also by 5.25, since the two operators have the same kernel), so we only need to map an orthonormal basis of (im √(T∗T))⊥ to an orthonormal basis of (im T)⊥ to extend S to a global isometry. That S is well defined is another application of 5.25, as

√(T∗T) v1 = √(T∗T) v2 ⇐⇒ √(T∗T)(v1 − v2) = 0 ⇐⇒ T(v1 − v2) = 0 ⇐⇒ T v1 = T v2.

The second ⇐⇒ is by 5.25.

The only key property we used is the identity 5.25. Therefore, the proof of this theorem applies to any pair T, L ∈ End(V) with ‖T v‖ = ‖Lv‖ for all v ∈ V, and in particular gives the following result.
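As a numerical aside (a sketch, not part of the notes; it assumes NumPy), the two factors of the polar decomposition can be computed from the SVD rather than by the proof's construction: if T = U Σ Wᴴ, then S = U Wᴴ is an isometry and √(T∗T) = W Σ Wᴴ.

import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((4, 4))

U, s, Wh = np.linalg.svd(T)            # T = U diag(s) Wh
S = U @ Wh                             # isometric factor
P = Wh.conj().T @ np.diag(s) @ Wh      # P = sqrt(T* T), positive

assert np.allclose(S.T @ S, np.eye(4))           # S is an isometry (real orthogonal here)
assert np.allclose(P, P.T) and np.all(s >= 0)    # P is self-adjoint with nonnegative spectrum
assert np.allclose(S @ P, T)                     # T = S sqrt(T* T)
assert np.allclose(P @ P, T.T @ T)               # P^2 = T* T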

5.29 Normal operators differ by an isometry from its adjoint


Suppose T ∈ End(V ) is normal. Then there exists an isometric operator S ∈ End(V )

such that
T = ST ∗ .

5.30 Polar decomposition of normal operator



Suppose V is a complex vector space, T ∈ End(V) and T = S √(T∗T). Then T is normal if and only if the eigenvectors of S are eigenvectors of √(T∗T).

PROOF. If the eigenvectors of S are eigenvectors of √(T∗T), then they are eigenvectors of T = S √(T∗T), so T is normal by the easy direction of 5.17.
If T is normal, the key is that T and √(T∗T) have a common eigenbasis: by 5.14, an eigenvector of T for λ is an eigenvector of T∗ for λ̄, hence of T∗T for |λ|² and of √(T∗T) for |λ|. Then S must act as a scalar on each of these eigenvectors.

5.28 allows us to describe T using its singular values.

5.31 Singular values



Suppose T ∈ End(V). The singular values of T are the eigenvalues of √(T∗T), with each eigenvalue λ repeated dim E(λ, √(T∗T)) times.

From the definition, the singular values are also the nonnegative square roots of the eigenvalues of T∗T.

5.32 Singular value decomposition


Suppose T ∈ End(V ) has singular values s1 , · · · , sn . Then there exist orthonormal
bases e1 , · · · , en and f1 , · · · , fn of V such that

T v = s1⟨v, e1⟩ f1 + · · · + sn⟨v, en⟩ fn.


PROOF. Let e1, · · · , en be an orthonormal eigenbasis of √(T∗T) with √(T∗T) ej = sj ej, and let fj = Sej, where S is the isometry in 5.28. Writing v = ⟨v, e1⟩e1 + · · · + ⟨v, en⟩en, we get T v = S √(T∗T) v = s1⟨v, e1⟩ f1 + · · · + sn⟨v, en⟩ fn.

5.32 gives us a diagonal matrix w.r.t. the orthonormal bases e1 , · · · , en and f1 , · · · , fn .


The entries on the diagonal are the singular values of T .
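As a closing numerical aside (a sketch, not part of the notes; it assumes NumPy, and the random matrix is only for illustration), the SVD produces exactly the bases of 5.32: with T = U Σ Vᴴ, the columns of V play the role of the ej and the columns of U the role of the fj.

import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((3, 3))
U, s, Vh = np.linalg.svd(T)
E = Vh.T                               # columns e_j
F = U                                  # columns f_j; s contains the singular values s_j

v = rng.standard_normal(3)
Tv = sum(s[j] * (v @ E[:, j]) * F[:, j] for j in range(3))   # sum of s_j <v, e_j> f_j
assert np.allclose(Tv, T @ v)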
Chapter 6

Canonical Forms

In this chapter, we study further the structure of operators when there does not exist an eigenspace decomposition of V. We assume the vector spaces in this chapter are finite-dimensional.
We study the structure of operators on finite-dimensional vector spaces by considering direct decompositions into more general invariant subspaces, specifically cyclic invariant subspaces. An inner product would not help here.
The plan of this chapter is as follows:
• we begin with generalized eigenvectors and generalized eigenspaces, which lead to the important root subspace decomposition;
• from there we reduce to the study of nilpotent operators, by decomposing each restriction of T − λI into cyclic invariant subspaces (the cyclic decomposition of nilpotent operators); this is a second, finer decomposition after the root subspace decomposition;
• the remaining properties of the Jordan canonical form are then derived easily from the proof of its existence.

6.1 Minimal Polynomial


A polynomial p ∈ K[z] is annihilating for T if p(T) = 0. Let n = dim V. Then the list

I, T, · · · , T^{n²}

is not linearly independent in End(V), since dim End(V) = n². Therefore there exists a smallest m ≤ n² such that

T^m = a0 I + a1 T + · · · + am−1 T^{m−1}

for some a0, · · · , am−1 ∈ K. This gives us a monic annihilating polynomial of T of the smallest possible degree, namely m(z) = z^m − am−1 z^{m−1} − · · · − a1 z − a0. This polynomial is also unique: the difference of two distinct monic annihilating polynomials of degree m would yield, after scaling, a monic annihilating polynomial of smaller degree. The polynomial m is the minimal polynomial of T.
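As a numerical aside (a sketch, not part of the notes; it assumes NumPy, and the diagonal matrix is a made-up example), the construction above can be carried out directly: look for the smallest m with T^m in the span of I, T, · · · , T^{m−1}.

import numpy as np

T = np.diag([2.0, 2.0, 3.0])            # minimal polynomial (z-2)(z-3): degree 2 < dim V
n = T.shape[0]

powers = [np.eye(n).ravel()]            # flattened powers I, T, T^2, ...
for m in range(1, n * n + 1):
    powers.append(np.linalg.matrix_power(T, m).ravel())
    A = np.column_stack(powers[:-1])    # columns: I, T, ..., T^{m-1}
    coeffs = np.linalg.lstsq(A, powers[-1], rcond=None)[0]
    if np.allclose(A @ coeffs, powers[-1]):     # T^m is a combination of lower powers
        print("degree of minimal polynomial:", m)                    # prints 2
        print("T^m = a_0 I + ... + a_{m-1} T^{m-1}, a =", np.round(coeffs, 6))  # [-6, 5]
        break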

6.1 Minimal polynomial


Suppose T ∈ End(V ). Then the minimal polynomial of T is the monic polynomial
m of smallest degree such that m(T ) = 0.

6.2 Existence and uniqueness of minimal polynomial


For every operator T ∈ End(V ), the minimal polynomial m exists and is unique.

P ROOF.This directly follows from our discussion above.

The reason why minimal polynomials are important is their minimality and the fact that
they are closely related to invariant subspaces of T . The following result is a direct demon-
stration, although it is not the most interesting one.

6.3 Minimal polynomial of restriction operators


Suppose T ∈ End(V ) and U ⊂ V is T −invariant. Let mU be the minimal polynomial
of T |U . Then mU |m.

PROOF. First, m(T|U) = 0, so m annihilates T|U. By division with remainder, write

m(z) = mU(z)q(z) + r(z),   with r = 0 or deg r < deg mU.

The key is that

r(T|U) = m(T|U) − q(T|U) mU(T|U) = 0.

If r ≠ 0, then (after scaling to be monic) r would be an annihilating polynomial of T|U of degree smaller than deg mU, contradicting minimality. Therefore r(z) = 0, i.e. mU | m.

The proof above applies verbatim with m replaced by any annihilating polynomial p of T: the key is only that deg r < deg mU and r(T|U) = 0.

The minimality of m also implies the following corollary.

6.4 Minimal polynomials of restriction operators in a direct decomposition of an operator

Suppose T ∈ End(V) and V = V1 ⊕ · · · ⊕ Vm where Vj is T−invariant. Then m is the least common multiple of mV1, · · · , mVm.

PROOF. By 6.3, m is a common multiple of mV1, · · · , mVm, so the least common multiple L of mV1, · · · , mVm divides m. Conversely, L(T) vanishes on each Vj, hence on V = V1 ⊕ · · · ⊕ Vm, so L is annihilating and m | L. Therefore m = L.

In fact, the direct sum condition is not needed.

The division argument above also shows that every annihilating polynomial of T is a multiple of the minimal polynomial of T.
Also, all of the roots of m are eigenvalues of T (otherwise a factor could be removed and m would not be minimal), and all eigenvalues of T are roots of m (apply 6.3 to the T−invariant eigenspaces, on which the minimal polynomial is z − λ).

6.5 Roots of minimal polynomials and eigenvalues


Suppose T ∈ End(V ). Then λ is an eigenvalue of T if and only if m(λ ) = 0.

PROOF. If m(λ) = 0, then ∃q ∈ K[z] : m(z) = (z − λ)q(z). If q(T) = 0, then m is not the minimal polynomial. Therefore, pick v ∈ V such that q(T)v ≠ 0. Then (T − λI)q(T)v = m(T)v = 0, so q(T)v ∈ ker(T − λI) is an eigenvector, which means λ is an eigenvalue of T.
If λ is an eigenvalue of T, then the minimal polynomial of T restricted to the eigenspace E(λ, T) is z − λ, so (z − λ) | m by 6.3, and m(λ) = 0.

More generally, we can use a factorization of m to produce invariant subspaces of T beyond the eigenspaces obtained from its roots. This leads to the primary decomposition, a more general form of the root subspace decomposition 6.11.

6.6 Primary decomposition


Suppose T ∈ End(V) and m = p1 · · · pk, where pi, pj have no common factors when i ≠ j. Then

V = ker(p1(T)) ⊕ · · · ⊕ ker(pk(T)).

Note that ker(p1(T)), · · · , ker(pk(T)) are T−invariant subspaces.

PROOF.
The case k = 1 is trivial, and the general case follows by induction on k (group p2 · · · pk into a single factor coprime to p1). Hence we only need to prove the case k = 2.
• By Bezout's theorem, since p1 and p2 have no common factors, there exist polynomials q1, q2 such that

  1 = p1(z)q1(z) + p2(z)q2(z).

• Apply this identity of operators to v ∈ V and we get

  v = p1(T)q1(T)v + p2(T)q2(T)v.

  Let v1 = p2(T)q2(T)v and v2 = p1(T)q1(T)v. Then p1(T)v1 = q2(T)m(T)v = 0 and p2(T)v2 = q1(T)m(T)v = 0, so

  v = v1 + v2,   v1 ∈ ker(p1(T)),   v2 ∈ ker(p2(T)).

  Therefore we have V = ker(p1(T)) + ker(p2(T)).
• If u ∈ ker(p1(T)) ∩ ker(p2(T)), then

  u = q1(T)p1(T)u + q2(T)p2(T)u = 0.

  Therefore we have V = ker(p1(T)) ⊕ ker(p2(T)).

6.6 allows us to study operators with non-split minimal polynomials.

6.7 Factorization of minimal polynomials and eigenspace decomposition

Suppose T ∈ End(V). Then T is diagonalizable if and only if m is the product of distinct linear factors.

PROOF. If T is diagonalizable, then there exists an eigenspace decomposition of V, and the minimal polynomial of T restricted to the eigenspace for λj is z − λj. Therefore m is the product of distinct linear factors by 6.4.
If m is the product of distinct linear factors m(z) = (z − λ1) · · · (z − λk), then by 6.6 we have

V = ker(T − λ1 I) ⊕ · · · ⊕ ker(T − λk I),

which is an eigenspace decomposition.

Exercise 6.1.1
1. Prove that T is invertible if and only if m has nonzero constant term.
2. Suppose T is invertible. Find f ∈ K[z] such that T −1 = f (T ).

Exercise 6.1.2
Suppose the minimal polynomial for T ∈ End(V) is m(z) = ∏_{i=1}^{k} (z − λi), λi ≠ λj if i ≠ j. Show that for any v ∈ V, v = ∑_{i=1}^{k} vi, where

vi = fi(T)v,   fi(z) = ∏_{1≤j≤k, j≠i} (z − λj)/(λi − λj).

Exercise 6.1.3
Suppose V is complex and T^n = I. Show that for any v ∈ V, v = ∑_{i=0}^{n−1} vi, where

vi = (1/n) ∑_{j=0}^{n−1} ω^{−ij} T^j v,   ω = e^{2πi/n}.

6.2 Root Subspace Decomposition


We know that an eigenspace of an operator T ∈ End(V) corresponding to λ is ker(T − λI). However, such subspaces might not be large enough to form an eigenspace decomposition of V (although their sum is always direct). An observation is that

{0} = ker T^0 ⊂ ker T^1 ⊂ · · · ⊂ ker T^k ⊂ · · · .

Applying this to T − λI, we have ker(T − λI) ⊂ ker(T − λI)^j for all j ≥ 1.


Another observation is that there exists a positive integer k at which the chain stops growing. This is clear since ker T^k ⊂ V for every k: if dim V = n, the chain can grow at most n times from {0}. Also notice that if ker T^k = ker T^{k+1}, then

ker T^k = ker T^{k+1} = · · · = ker T^m = · · ·

for all m ≥ k. Indeed, using ker T^k = ker T^{k+1} we can reduce the power to which T is raised when considering its kernel: suppose v ∈ ker T^{k+2}; then T^{k+2}v = T^{k+1}(T v) = 0 =⇒ T v ∈ ker T^{k+1} = ker T^k, which implies T^k(T v) = T^{k+1}v = 0 =⇒ v ∈ ker T^{k+1}.
The observations above mean that

ker(T − λI)^j ⊂ ker(T − λI)^{dim V}

for all j ≥ 1.
If ker(T − λI)^j is nontrivial, then λ is an eigenvalue of T: take a nonzero v ∈ ker(T − λI)^j and the smallest i ≥ 1 with (T − λI)^i v = 0; then (T − λI)^{i−1} v is an eigenvector of T corresponding to λ.
To generalize the notion of eigenvectors in this way, we prefer that vectors in the spaces ker(T − λI)^j corresponding to distinct λ be linearly independent, i.e. that the sum of such spaces be direct. The next theorem asserts that this is indeed true.
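As a numerical aside (a sketch, not part of the notes; it assumes NumPy, and the 4×4 nilpotent Jordan block is a made-up example), one can watch the kernel chain grow and stabilize:

import numpy as np

N = np.diag([1.0, 1.0, 1.0], k=1)      # 4x4 nilpotent Jordan block (eigenvalue 0)
n = N.shape[0]

def kernel_dim(A, tol=1e-10):
    # dimension of the kernel = number of (numerically) zero singular values
    return int(np.sum(np.linalg.svd(A, compute_uv=False) < tol))

dims = [kernel_dim(np.linalg.matrix_power(N, j)) for j in range(n + 1)]
print(dims)     # [0, 1, 2, 3, 4]: ker N^0 ⊂ ker N^1 ⊂ ... stabilizes at dim V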

6.8 Linearly independent generalized eigenvectors


Suppose T ∈ End(V) and λ1, · · · , λm are distinct eigenvalues of T. Let v1, · · · , vm be nonzero vectors such that vj ∈ ker(T − λj I)^{dim V} for each j = 1, · · · , m. Then v1, · · · , vm is a linearly independent list of vectors.

PROOF. A proof can be structured similarly to the proof of 4.21.

Suppose that

a1 v1 + · · · + am vm = 0.

Apply the operator (λ1 I − T)^{dim V} · · · (λm−1 I − T)^{dim V} (λm I − T)^k to both sides, where k is the largest integer with (λm I − T)^k vm ≠ 0 (such k exists since vm ≠ 0). All terms with j < m are killed by the factors (λj I − T)^{dim V}, so

am (λ1 I − T)^{dim V} · · · (λm−1 I − T)^{dim V} (λm I − T)^k vm = 0.

Since w := (λm I − T)^k vm satisfies T w = λm w, we have

am (λ1 − λm)^{dim V} · · · (λm−1 − λm)^{dim V} w = 0.

As the λ's are distinct and w ≠ 0, am = 0. Arguing in the same way for each j = 1, · · · , m gives aj = 0.

We may directly apply the proof of 6.6 to get the directness of the sum.

Since the sum of such spaces is direct, we can define generalized eigenspaces to be these spaces, and generalized eigenvectors to be the nonzero vectors that live in them.

6.9 General eigenvectors and eigenspaces


Suppose T ∈ End(V). A nonzero vector v ∈ V is a generalized eigenvector of T corresponding to λ if

(T − λI)^j v = 0

for some positive integer j.
The generalized eigenspace of T corresponding to λ , G(λ , T ), is the set of all gen-
eralized eigenvectors of T corresponding to λ along with 0.

We can characterize generalized eigenspaces using the reasoning from the discussion above.

6.10 Characterization of general eigenspaces


Suppose T ∈ End(V ). Then

G(λ, T) = ker(T − λI)^{dim V}.



PROOF. The proof is a direct result of our discussion above. If v ∈ G(λ, T), then (T − λI)^j v = 0 for some j, so v ∈ ker(T − λI)^j ⊂ ker(T − λI)^{dim V}. The converse is true from the definition.

Now we show the important structural result: the root subspace decomposition of complex
vector spaces.

6.11 Root subspace decomposition


Suppose V is a complex vector space and T ∈ End(V ) with distinct eigenvalues
λ1 , · · · , λm . Then
V = G(λ1 , T ) ⊕ · · · ⊕ G(λm , T ).

PROOF. By 4.19, the characteristic polynomial χ of T is annihilating. Hence so is its multiple

f(z) = (z − λ1)^n · · · (z − λm)^n,   where n = dim V.

By 6.6 and 6.10, the result follows.

This result holds for vector spaces over any algebraically closed field.

Exercise 6.2.1
V is complex and T ∈ End(V ). Prove that T is diagonalizable if and only if every
generalized eigenvector is an eigenvector.

The root subspace decomposition theorem shows that we can always find a basis of generalized eigenvectors of T when V is complex. If we write T as a matrix w.r.t. such a basis, ordered suitably, we obtain a block diagonal matrix. We then define the algebraic multiplicity to describe the sizes of these blocks.

6.12 Algebraic multiplicity


The algebraic multiplicity of an eigenvalue λ of T, d(λ), is the dimension of the corresponding generalized eigenspace G(λ, T).

The next result relates the algebraic multiplicity of a (complex) eigenvalue to the number of times it appears on the diagonal of an upper triangular matrix of T, and to the number of times it is repeated as a root of the characteristic polynomial.

Exercise 6.2.2
V is complex and T ∈ End(V ). Then the characteristic polynomial of T is

χ(z) = (z − λ1)^{d(λ1)} · · · (z − λm)^{d(λm)}

where λ1 , · · · , λm are all the distinct eigenvalues of T .

Exercise 6.2.3
T ∈ End(V) and m(z) = ∏_{i=1}^{k} (z − λi)^{m_i} is the minimal polynomial of T, where λi ≠ λj if i ≠ j. Prove:
1. G(λi, T) = ker(λi I − T)^{m_i}.
2. mi(z) = (z − λi)^{m_i} is the minimal polynomial of T|G(λi,T).

6.3 Jordan Canonical Form


Suppose T ∈ End(V) and λ ∈ K is an eigenvalue of T. Then (T − λI)^{dim V} vanishes on G(λ, T). We can thus reduce the study of T on each generalized eigenspace to the study of T − λI, whose restriction to that generalized eigenspace is a nilpotent operator.

6.13 Nilpotent operator


Suppose N ∈ End(V). Then N is called a nilpotent operator if some power of it equals 0.

Exercise 6.3.1
T ∈ End(V). Show that there exist invariant subspaces U, W such that V = U ⊕ W, where T|U is an invertible operator and T|W is nilpotent.
Compare this with 6.11.
Hint: Exercise 1.2.6.

The following theorem characterizes nilpotent operator on complex vector spaces.

6.14 Characterization of nilpotent operators on complex vector spaces

Suppose V is a complex vector space. An operator N ∈ End(V) is nilpotent if and only if 0 is its only eigenvalue.

PROOF. If N is nilpotent, say N^k = 0, then N is not injective, so 0 is an eigenvalue; and if Nv = λv with v ≠ 0, then λ^k v = N^k v = 0, hence λ = 0.
If N has 0 as its only eigenvalue, then V = G(0, N) by the root subspace decomposition 6.11 (here we use that V is complex), so N^{dim V} = 0, which means N is nilpotent.

The proof requires V to be complex in one (which?) direction.

The cyclic decomposition of nilpotent operators gives us a second, finer decomposition of operators after the root subspace decomposition on complex vector spaces.

6.15 Cyclic invariant subspaces


Suppose T ∈ End(V ), v ∈ V . Then

Z(v; T ) = span(v, T v, · · · )

is the cyclic invariant subspace generated by v.

We now state and prove the corresponding results.



6.16 Cyclic decomposition of nilpotent operators


Suppose N ∈ End(V ) is nilpotent, then there exists a basis

v11 , · · · , v1r1 ; · · · ; vs1 , · · · , vsrs

such that
1. Nv j1 = 0; and
2. Nv ji = v j,i−1 for each i = 2, · · · , r j
for each j.

PROOF. We prove the result by induction on dim V.

• If dim V = 1, then N = 0 and there is nothing to prove.
• Assume the result holds for all spaces of dimension at most dim V − 1. Since N is nilpotent, it is not injective. Therefore im N is a proper subspace of V.
  – Apply the induction hypothesis to N|im N. Then there exist vectors v1, · · · , vn ∈ im N such that

    N^{m1} v1, · · · , v1; · · · ; N^{mn} vn, · · · , vn

    is a basis of im N and

    N^{m1+1} v1 = · · · = N^{mn+1} vn = 0.

  – Since ∃uj ∈ V : vj = N uj for each j, we claim that

    N^{m1+1} u1, · · · , u1; · · · ; N^{mn+1} un, · · · , un

    is a linearly independent list in V.

    ∗ Suppose a linear combination of the list equals 0. Apply N to both sides; the result is a linear combination of the basis vectors N^{m1} v1, · · · , v1; · · · ; N^{mn} vn, · · · , vn of im N, so the coefficients of all vectors in the list other than N^{m1+1} u1, · · · , N^{mn+1} un are 0.
    ∗ Therefore we only need

      N^{m1+1} u1 = N^{m1} v1, · · · , N^{mn+1} un = N^{mn} vn

      to be linearly independent, which is given.



  – Extend the list to a basis

    N^{m1+1} u1, · · · , u1; · · · ; N^{mn+1} un, · · · , un, w1, · · · , wt

    of V. We want to replace the list w1, · · · , wt by vectors un+1, · · · , us (with s = n + t) satisfying N uj = 0 for each j = n + 1, · · · , s. This is possible since

    im N = span(N^{m1+1} u1, · · · , N u1; · · · ; N^{mn+1} un, · · · , N un)

    implies that there exists

    xj ∈ span(N^{m1+1} u1, · · · , N u1; · · · ; N^{mn+1} un, · · · , N un)

    such that N wj = N xj for each j.

    Let un+j = wj − xj. Then, after replacing w1, · · · , wt by un+1, · · · , us, we have the desired basis: N un+j = 0, and the new list has the same span and length as the basis before the change, since each wj differs from un+j by a vector in the span of the other listed vectors.

Exercise 6.3.2
N ∈ End(V), N^m = 0. Prove: there exists a subspace W ⊂ V such that

V = N^0(W) ⊕ N^1(W) ⊕ · · · ⊕ N^{m−1}(W).

Compare this with 6.16.

V:             K1    K2     · · ·   Km−1    Km
W:             F1    F2     · · ·   Fm−1    Fm
N(W):                N(F2)  · · ·           N(Fm)
⋮                                   ⋱
N^{m−1}(W):                                 N^{m−1}(Fm)

Combining the above result (the cyclic decomposition of (T − λI)|G(λ,T) on each generalized eigenspace) with the first step, the root subspace decomposition of the complex vector space V, gives us the existence of a Jordan basis.

6.17 Jordan basis


Suppose T ∈ End(V ). A basis

v11 , · · · , v1r1 ; · · · ; vs1 , · · · , vsrs

of V is called a Jordan basis w.r.t T if


1. T v j1 = λ j v j1 for some eigenvalue λ j ∈ K; and
2. (T − λ j I)v ji = v j,i−1 for each i = 2, · · · , r j
for each j.

6.18 Existence of Jordan basis


Suppose V is a complex vector space, and T ∈ End(V ), then there exists a Jordan
basis of V w.r.t. T .

PROOF. This follows from our discussion above.

Suppose V is a complex vector space and T ∈ End(V ) with an eigenvalue λ . The matrix of
T|G(λ,T) w.r.t. a Jordan basis has the form

[ J1           0 ]
[      ⋱         ]
[ 0           Js ]

where each Jj has the form

[ λ   1            0 ]
[     λ   ⋱          ]
[          ⋱    1    ]
[ 0              λ   ]

Exercise 6.3.3
Suppose V is complex and T ∈ End(V) is invertible. Prove that T has an m−th root, i.e. there exists R ∈ End(V) such that T = R^m, for every integer m ≥ 2.

We call each J j a Jordan block corresponding to λ .

A Jordan basis is not necessarily unique. However, the Jordan form does not depend on the specific basis chosen, as we show next.

We are curious about the number of Jordan blocks for each eigenvalue, so we introduce
geometric multiplicity.

6.19 Geometric multiplicity


The geometric multiplicity of an eigenvalue λ of T is s(λ ) = dim(E(λ , T )).

Each cyclic invariant subspace of (T − λI)|G(λ,T) contains a one-dimensional eigenspace of T corresponding to λ, spanned by the last vector of its chain. The decomposition of G(λ, T) into cyclic invariant subspaces of (T − λI)|G(λ,T) is a direct sum, and an eigenvector of T in G(λ, T) has all of its components in these subspaces annihilated by T − λI, so these one-dimensional eigenspaces together span E(λ, T). Therefore, the number of Jordan blocks corresponding to λ is equal to s(λ).

Exercise 6.3.4
T ∈ End(V) and m(z) = ∏_{i=1}^{k} (z − λi)^{m_i} is the minimal polynomial of T, where λi ≠ λj if i ≠ j. Prove that

d(λi) ≤ m_i s(λi).

We are yet to find the number tj(λ) of Jordan blocks of size j-by-j for each j. However, it is easy to see that

tj(λ) + · · · + t_{d(λ)}(λ) = dim ker(T − λI)^j − dim ker(T − λI)^{j−1},

since each Jordan block of size at least j contributes exactly one new dimension to ker(T − λI)^j compared with ker(T − λI)^{j−1}. And hence

tj(λ) = 2 dim ker(T − λI)^j − dim ker(T − λI)^{j−1} − dim ker(T − λI)^{j+1}.

This means that the Jordan form of T is unique up to rearrangement of the Jordan blocks, as tj(λ) is uniquely determined by T for each eigenvalue λ of T.
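As a numerical aside (a sketch, not part of the notes; it assumes NumPy, and the block sizes are a made-up example), the block-counting formula can be checked directly: build a matrix from known Jordan blocks for λ = 0 and recover tj(λ) from kernel dimensions.

import numpy as np

def jordan_block(lam, size):
    return lam * np.eye(size) + np.diag(np.ones(size - 1), k=1)

# Jordan blocks for lambda = 0 of sizes 3, 2, 2, 1, so t_1 = 1, t_2 = 2, t_3 = 1.
sizes = (3, 2, 2, 1)
n = sum(sizes)
T = np.zeros((n, n))
i = 0
for s in sizes:
    T[i:i+s, i:i+s] = jordan_block(0.0, s)
    i += s

def ker_dim(A, tol=1e-10):
    # dimension of the kernel = number of (numerically) zero singular values
    return int(np.sum(np.linalg.svd(A, compute_uv=False) < tol))

lam = 0.0
d = [ker_dim(np.linalg.matrix_power(T - lam * np.eye(n), j)) for j in range(n + 2)]
t = [2 * d[j] - d[j - 1] - d[j + 1] for j in range(1, n + 1)]
print(t[:4])    # [1, 2, 1, 0]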

6.20 Jordan form is unique


Suppose T ∈ End(V) has a Jordan form. Then the Jordan form is unique up to rearrangement of blocks.

PROOF. This follows from our discussion above.

Therefore, we say that the Jordan form is canonical: it depends only on T and not on the specific Jordan basis.

The Jordan canonical form yields a useful criterion for similarity of matrices: it tells exactly which matrices are similar.

6.21 Similar matrices ⇐⇒ Same JCF


A, B ∈ C^{n×n} are similar if and only if A and B have the same Jordan canonical form (up to rearrangement of blocks).

PROOF.
Suppose A, B ∈ C^{n×n} are similar. Then there is a single operator T ∈ End(C^n) whose matrices w.r.t. two different bases are A and B. By 6.20, T has a unique Jordan canonical form, determined by T, so A and B have the same JCF.
The converse follows from the definition of similar matrices, since A and B are then both similar to their common JCF.
