Contents
3 Determinants
6 Canonical Forms
6.1 Minimal Polynomial
6.2 Root Subspace Decomposition
6.3 Jordan Canonical Form
Chapter 1
Vector Spaces and Linear Maps
Linear algebra is the study of linear maps. We have many examples of maps with the linearity property: for example the derivative map, the definite integral map, and the matrix multiplication map.
\[ \frac{d}{dx}(\lambda f + \mu g) = \lambda \frac{df}{dx} + \mu \frac{dg}{dx}; \qquad \int_a^b (\lambda f + \mu g) = \lambda \int_a^b f + \mu \int_a^b g. \]
The first four axioms are about the + operation and the last four axioms are about
the · operation.
1.3 Subspaces
Let (V, +, ·) be a vector space. U ⊆ V is a subspace of V if (U, +, ·) is itself a vector space.
U_1 + · · · + U_m = {u_1 + · · · + u_m : u_j ∈ U_j}.
Exercise 1.1.1
Show that the sum U_1 + · · · + U_m is the smallest subspace of V containing U_1, · · · , U_m.
Proof. Suppose
v = u_1 + · · · + u_m = ũ_1 + · · · + ũ_m.
Then
(u_1 − ũ_1) + · · · + (u_m − ũ_m) = 0.
Exercise 1.1.2
1. Prove the characterization of directness for two subspaces: U + W is direct if and only if U ∩ W = {0}.
2. Show that the above characterization does not generalize to more than two subspaces (a counterexample is sketched below).
3. U_1, · · · , U_m are subspaces of V. Their sum is direct if and only if, for each j,
U_j ∩ (U_1 + · · · + U_{j−1} + U_{j+1} + · · · + U_m) = {0}.
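For item 2, one can keep in mind a standard counterexample: in R², take U_1 = span{(1, 0)}, U_2 = span{(0, 1)} and U_3 = span{(1, 1)}. Any two of these subspaces intersect in {0}, yet the sum U_1 + U_2 + U_3 is not direct, since (1, 1) = (1, 0) + (0, 1) = 0 + 0 + (1, 1) exhibits two different decompositions of the same vector.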
We call the span of the list of column vectors of a matrix the column space of that matrix; the rank of the matrix is the dimension of its column space.
Exercise 1.1.3
Reformulate 1.7 in terms of sum of subspaces.
Exercise 1.1.4
Show that the span of u_1, · · · , u_m, i.e. the set {c_1 u_1 + · · · + c_m u_m : c_j ∈ K}, is the smallest subspace containing u_1, · · · , u_m. Compare with 1.1.1.
c1 u1 + · · · + cm um = 0 =⇒ c1 = · · · = cm = 0.
Exercise 1.1.5
Reformulate 1.9 in terms of direct sum.
A direct corollary is that the length of a spanning list is no smaller than the length
of a linearly independent list.
Exercise 1.1.6
Prove that the subspaces of a finite dimensional space are finite dimensional.
Exercise 1.1.7
V finite dimensional with U_1, U_2 being its subspaces. Show that
\[ \dim(U_1 + U_2) = \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2). \]
Exercise 1.1.8
V finite dimensional with U_1, · · · , U_m being its subspaces. Show that
\[ \dim(U_1 + \cdots + U_m) \le \dim U_1 + \cdots + \dim U_m. \]
Prove that equality holds if and only if the sum is direct.
1.15 Hom(U,V )
The vector space formed by all linear maps from U to V is called Hom(U,V), the vector space of homomorphisms from U to V.
Therefore the map is unique. It is straightforward to check that the map defined above is indeed linear.
v = a1 v1 + · · · + an vn .
Then, by the property of a basis, we can identify v through the coordinate map with
\[ \mathbf{v} = (a_1, \cdots, a_n)^{\intercal}. \]
satisfies
\[ T(u) = M \cdot \mathbf{u}. \]
Here · is the standard multiplication of matrices, and \mathbf{v} denotes the coordinate vector of v ∈ V written in some basis v_1, · · · , v_n.
The proof needs the following convenient result about the standard multiplication of matri-
ces.
Suppose
\[ \mathbf{v} = (a_1, \cdots, a_n)^{\intercal}, \qquad A = (\mathbf{a}_1 \ \cdots \ \mathbf{a}_n), \]
where the \mathbf{a}_i are the columns of A. Then
\[ A \cdot \mathbf{v} = \sum_{i=1}^{n} a_i \mathbf{a}_i. \]
From this, the proof is evident. This theorem says every linear map T : V → W can be understood as a linear map M : K^n → K^m.
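This column-combination identity is easy to check numerically. The following is a small sketch in Python with numpy (the matrix and coordinates are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])        # columns a_1, a_2
v = np.array([10., -1.])        # coordinates: 10 and -1

# A @ v equals the linear combination of the columns of A
combo = sum(v[i] * A[:, i] for i in range(A.shape[1]))
assert np.allclose(A @ v, combo)
print(A @ v)                    # [ 8. 26. 44.]
```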
Exercise 1.2.1
1. For X = (x_{ij}) ∈ K^{n×n}, define tr : X ↦ ∑_{i=1}^{n} x_{ii}. Show that tr is a linear map from K^{n×n} to K, and tr(AB) = tr(BA).
2. Now suppose a linear map T from K^{n×n} to K satisfies T(AB) = T(BA) for all A, B ∈ K^{n×n}. Show that there exists c ∈ K such that T = c tr.
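Both claims in item 1 are easy to sanity-check numerically before proving them; a minimal sketch (the random matrices are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# tr is linear, and tr(AB) = tr(BA) even though AB != BA in general
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
assert np.isclose(np.trace(2 * A + 3 * B), 2 * np.trace(A) + 3 * np.trace(B))
```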
ker T = {u ∈ U : Tu = 0}.
Exercise 1.2.2
Show that T is injective if and only if ker T = {0}.
Proof. dim im T = rank A.
Exercise 1.2.3
Let P_{n−1} be the set of all polynomials with coefficients in R of degree at most n − 1, and let α_1, · · · , α_n be distinct in R. Define
L : P_{n−1} → R^n, L(f) = (f(α_1), · · · , f(α_n))^⊺.
1. Show that the polynomials
\[ p_j(t) = \prod_{1 \le k \le n,\ k \ne j} \frac{t - \alpha_k}{\alpha_j - \alpha_k} \]
map to the standard basis in R^n, and conclude that L is invertible.
2. Find L^{−1} and the coordinates of f ∈ P_{n−1} under the basis p_1, · · · , p_n.
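The key property in item 1, namely p_j(α_k) = δ_{jk}, can be checked numerically. A brief sketch (the nodes below are arbitrary choices):

```python
import numpy as np

alphas = np.array([0.0, 1.0, 2.0])   # distinct nodes alpha_1, ..., alpha_n (here n = 3)

def p(j, t):
    """p_j(t) = prod_{k != j} (t - alpha_k) / (alpha_j - alpha_k)."""
    factors = [(t - a) / (alphas[j] - a) for k, a in enumerate(alphas) if k != j]
    return np.prod(factors)

# L(p_j) = (p_j(alpha_1), ..., p_j(alpha_n)) is the j-th standard basis vector
for j in range(len(alphas)):
    values = np.array([p(j, a) for a in alphas])
    assert np.allclose(values, np.eye(len(alphas))[j])
```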
Exercise 1.2.4
Suppose V is finite-dimensional and R, S, T ∈ End(V). Prove the Frobenius inequality:
dim im RST + dim im S ≥ dim im RS + dim im ST.
Hint: after some applications of 1.21, consider the restrictions ST |ker RST and S|ker RS .
Exercise 1.2.5
Suppose V is finite-dimensional and T ∈ End(V). Prove:
\[ \dim(\ker T \cap \operatorname{im} T) = \dim \ker T^2 - \dim \ker T = \dim \operatorname{im} T - \dim \operatorname{im} T^2. \]
Exercise 1.2.6
Suppose V is finite-dimensional and T ∈ End(V). Prove the following equivalences:
V = ker T ⊕ im T
⇐⇒ dim ker T 2 = dim ker T
⇐⇒ ker T 2 = ker T
⇐⇒ im T = im T 2
⇐⇒ dim im T = dim im T 2 .
Exercise 1.2.7
T ∈ Hom(U,V ), U,V finite dimensional.
1. Show that there exists S ∈ Hom(V,U) such that T ST = T, ST S = S
2. Prove: if S is unique then T is invertible with inverse S.
Exercise 1.2.8
U, V finite dimensional. T ∈ Hom(U,V), S ∈ Hom(V,U) such that TST = T, STS = S (cf. 1.2.7).
1. Show that
U = ker T ⊕ im S,V = ker S ⊕ im T.
2. Prove: T |im S , S|im T are inverses of each other.
1.26 Equivalence
Given A, B ∈ Km×n . A and B are said to be equivalent if there exists invertible
P ∈ Km×m , Q ∈ Kn×n such that
B = P AQ.
There is a corollary.
For endomorphisms, to observe what happens within the same space, we require the basis in which the image is written to be the same as the basis of the domain.
1.29 Similarity
Given A, B ∈ Kn×n . A and B are said to be similar if there exists invertible P ∈
Kn×n such that
B = PAP^{−1}.
Exercise 1.2.9
Suppose A ∈ Kn×n , B ∼ A. Prove: tr B = tr A.
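A quick numerical sanity check of this exercise (the random matrices are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
P = rng.standard_normal((3, 3))        # invertible with probability 1
B = P @ A @ np.linalg.inv(P)           # B is similar to A

assert np.isclose(np.trace(B), np.trace(A))
```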
is an isomorphism from V ∗ to V .
Suppose v ∈ V, f ∈ V∗. Then
\[ v = \sum_{i=1}^{n} f_i(v)\, v_i, \qquad f = \sum_{i=1}^{n} f(v_i)\, f_i. \]
Hom(U,V ) = U ∗ ⊗V.
1.33 f ⊗ v
Suppose v ∈ V, f ∈ U ∗ . Define
f ⊗ v ∈ Hom(U,V) : u ↦ f(u)v.
1.34 U ∗ ⊗V
Define U∗ ⊗ V as the subspace
U∗ ⊗ V := span{ f ⊗ v : f ∈ U∗, v ∈ V }
of Hom(U,V).
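In coordinates, f ⊗ v is a rank-one matrix: if f has coordinate row (f_1, · · · , f_m) and v has a coordinate column, the matrix of f ⊗ v is the outer product of v with f. A small numerical sketch (the coordinates are arbitrary choices):

```python
import numpy as np

f = np.array([1., 2., 3.])      # coordinates of a functional f on U = R^3
v = np.array([4., 5.])          # a vector v in V = R^2

M = np.outer(v, f)              # matrix of f (x) v : u |-> f(u) * v
u = np.array([1., 1., 1.])
assert np.allclose(M @ u, (f @ u) * v)
assert np.linalg.matrix_rank(M) == 1
```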
T ∗ ( f ) = f ◦ T, ∀ f ∈ V ∗
1.36 T ∗ and T
U,V finite-dimensional. Suppose T ∈ Hom(U,V ). v1 , · · · , vn is a basis of V with
dual basis f1 , · · · , fn . Then
\[ T = \sum_{i=1}^{n} T^{*}(f_i) \otimes v_i. \]
Therefore,
f ⊗ v ∈ Hom(U,V) : u ↦ f(u)v
induces a natural isomorphism between U ∗ ⊗V and Hom(U,V ).
Chapter 2
Inner Product Spaces
For inner products to be defined, we need our vector space to be over R or C. Therefore, we use F in this chapter to denote a field that is either R or C.
⟨·, ·⟩ : V × V → F
that satisfies
• ⟨v, v⟩ ∈ R+ ∪ {0} for all v ∈ V;
• ⟨v, v⟩ = 0 ⟺ v = 0;
• ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ for all u, v, w ∈ V;
• ⟨λu, v⟩ = λ⟨u, v⟩ for all λ ∈ F, u, v ∈ V;
• ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩ (conjugate symmetry).
A vector space with an inner product defined on it is called an inner product space.
Proof. The first proposition follows by observing ⟨u, v⟩ = −⟨−u, v⟩. The second and third come from the conjugate symmetry.
Norms are another important quantity on vector spaces. They are closely related to inner products, and an important class of norms can be defined using inner products.
2.3 Norms
A norm is a function ‖·‖ : V → R+ ∪ {0} that satisfies
• ‖v‖ = 0 ⟺ v = 0;
• ‖λv‖ = |λ| ‖v‖;
• ‖u + v‖ ≤ ‖u‖ + ‖v‖.
The induced function ‖v‖ := √⟨v, v⟩ is a norm on V.
Proof. The proofs of 2.5 and 2.7 combined give a proof of 2.4.
Proof. The first proposition follows from the definiteness of inner products. The second proposition follows from expansion and |\lambda|^2 = \lambda\overline{\lambda}.
Proof. Let λ ∈ C be a scalar. Then
\[ 0 \le \|u - \lambda v\|^2 = \|u\|^2 - 2\operatorname{Re}(\overline{\lambda}\langle u, v\rangle) + |\lambda|^2 \|v\|^2, \]
and choosing λ = ⟨u, v⟩/‖v‖² (for v ≠ 0) gives |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
‖u + v‖² = ⟨u + v, u + v⟩
= ‖u‖² + ‖v‖² + 2 Re⟨u, v⟩
≤ ‖u‖² + ‖v‖² + 2 |⟨u, v⟩|
≤ ‖u‖² + ‖v‖² + 2 ‖u‖ ‖v‖
= (‖u‖ + ‖v‖)².
We have now completed our proof of 2.4. This means that we have a natural norm induced
from the inner product in an inner product space.
We have two other fundamental properties of the norm induced by an inner product.
‖u + v‖² + ‖u − v‖² = ⟨u + v, u + v⟩ + ⟨u − v, u − v⟩
= ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
+ ⟨u, u⟩ − ⟨u, v⟩ − ⟨v, u⟩ + ⟨v, v⟩
= 2(‖u‖² + ‖v‖²).
and the identity follows since the ⟨v, u⟩ terms cancel out.
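The parallelogram law is easy to test numerically, and the same test shows that a norm not induced by an inner product (such as the 1-norm) generically fails it. A brief sketch with arbitrary random vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.standard_normal(5)
v = rng.standard_normal(5)
norm = np.linalg.norm

# the euclidean norm is induced by an inner product: the law holds
assert np.isclose(norm(u + v)**2 + norm(u - v)**2,
                  2 * (norm(u)**2 + norm(v)**2))

# the 1-norm generically violates the law, so no inner product induces it
lhs = norm(u + v, 1)**2 + norm(u - v, 1)**2
rhs = 2 * (norm(u, 1)**2 + norm(v, 1)**2)
print(lhs, rhs)                 # generically different
```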
The next theorem shows that the class of norms that satisfies 2.8 can all be induced from
inner products.
The theorem 2.10 characterizes inner product spaces: a vector space on which a norm satisfying 2.8 can be defined is an inner product space, and vice versa by 2.4.
2.11 Orthogonality
Suppose V is an inner product space. We say u, v ∈ V are orthogonal, written u ⊥ v, if ⟨u, v⟩ = 0.
for all a1 , · · · , am ∈ F.
Proof. Since each e_j has norm 1, this follows easily from repeated applications of 2.12.
2.14 also implies that orthonormal lists (in fact, all orthogonal lists of nonzero vectors) are linearly independent.
⟨v, e_j⟩ = ⟨a_1 e_1 + · · · + a_n e_n, e_j⟩ = a_j.
\[ \langle u - cv, v\rangle = \langle u, v\rangle - c\langle v, v\rangle = 0 \iff c = \frac{\langle u, v\rangle}{\|v\|^2}. \]
⟨w, n⟩ = ⟨u − cv, n⟩ = 0 ⟹ w ⊥ n.
The same idea allows us to get a list of vectors orthogonal to each other. Upon scaling each vector to norm 1, we obtain an orthonormal list.
span(v1 , · · · , v j ) = span(e1 , · · · , e j ), ∀ j = 1, · · · , m.
The spanning property follows from the linear independence. If the given list is linearly dependent, the process produces a zero vector at some step.
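The process described above can be sketched in a few lines of code. The following minimal implementation (the tolerance and test vectors are arbitrary choices) also shows the behaviour on a linearly dependent list:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of vectors; a dependent input vector
    produces a (near-)zero vector, which is skipped here."""
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, e) * e for e in basis)   # subtract projections
        if np.linalg.norm(w) > tol:                    # w == 0 <=> v is dependent
            basis.append(w / np.linalg.norm(w))
    return basis

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([2., 1., 1.])]
es = gram_schmidt(vs)      # the third vector is v1 + v2, so only 2 survive
print(len(es))             # 2
print(np.round([[np.dot(a, b) for b in es] for a in es], 10))   # 2x2 identity
```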
Proof. Extend the list to a basis, and apply 2.16. The first vectors are not changed, and the basis is turned into an orthonormal basis.
With 2.16 established, we have studied most of the details of the inner product structure that we need for the following chapters. In particular, it means that we can upgrade theorems about general bases to orthonormal bases.
⟨λv + µw, u⟩ = λ⟨v, u⟩ + µ⟨w, u⟩ = 0 ⟹ λv + µw ∈ U⊥.
Proof.
Note that V is not required to be finite-dimensional.
• Let e1 , · · · , em be an orthonormal basis of U. Then
Since
Chapter 3
Determinants
The determinant of a matrix is a powerful tool for studying linear operators. Thanks to the theorems from exterior algebra, we can define it axiomatically, meaning we can construct it uniquely from a few axioms.
3.1 Determinant
Suppose A ∈ K^{n×n}, and write A = (a_{ij})_{n×n} with columns a_1, · · · , a_n. The determinant is a function det : K^{n×n} → K that satisfies:
• multilinearity in the columns;
• the alternating property, i.e.
det(· · · a_i · · · a_j · · ·) = − det(· · · a_j · · · a_i · · ·);
• the normalization det I = 1.
Proof. The key is that every transposition changes the parity of the count of inversions.
• Suppose we apply τ_{ij}. Clearly, inversions formed by i or j with an element outside of [i, j] are not affected.
• For the M = j − i − 1 elements within the interval [i + 1, j − 1], assume N_i of them form inversions with i and N_j of them form inversions with j. After applying τ_{ij}, the count of inversions involving i changes by M − 2N_i, which has the same parity as M, and similarly for j. These two changes together do not change the parity of the number of inversions of σ(Z_n), since they sum to 2M − 2N_i − 2N_j, an even number.
• However, applying τ_{ij} changes the number of inversions of the pair i, j itself by ±1.
By 3.2, we can define the sign of a permutation sgn(σ) = (−1)^{π(σ)}, where π(σ) is the parity of σ.
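This definition can be implemented directly by counting inversions; a single transposition then flips the sign, in line with 3.2. A brief sketch:

```python
from itertools import combinations

def sign(perm):
    """sgn(sigma) = (-1)^(number of inversions)."""
    inversions = sum(1 for i, j in combinations(range(len(perm)), 2)
                     if perm[i] > perm[j])
    return (-1) ** inversions

sigma = [2, 0, 1]                 # a 3-cycle, an even permutation
print(sign(sigma))                # 1

tau = sigma.copy()
tau[0], tau[1] = tau[1], tau[0]   # apply one transposition
print(sign(tau))                  # -1
```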
This follows from our discussion above by repeatedly using multilinearity and the alternating property on
\[ a_k = (a_{1k}, \cdots, a_{nk})^{\intercal} = a_{1k}(1, 0, \cdots, 0)^{\intercal} + \cdots + a_{nk}(0, \cdots, 0, 1)^{\intercal}, \]
eliminating the terms with a repeated column, since
\[ f(\cdots e_k \cdots e_k \cdots) = 0. \]
Now apply det I = 1.
Proof. Let
f : K^{n×n} → K, B ↦ det AB.
By the definition of matrix multiplication, f satisfies multilinearity in the columns of B and the alternating property, since adding, scaling and exchanging columns of B have the exact same effect on the product AB.
By 3.3, we have f (B) = f (I) det B, where f (I) = det AI = det A.
3.3 not only shows det AB = det BA, but also that determinant of matrices is multiplica-
tive.
det B = det A.
By Exercise 1.2.9 and 3.5, the notations tr A and det A are well defined for an operator A, independently of the chosen matrix presentation. The next theorem follows from the permutation expression as in 3.3.
If A is not invertible, then its columns are linearly dependent. Expanding one column as a combination of the others by multilinearity, det A becomes a sum of determinants with repeated columns, hence it is 0.
The next theorem is a direct result of 3.3 and 3.4. It is extremely important in proving propositions about determinants.
Proof. Since
\[
\begin{pmatrix} A & * \\ 0 & B \end{pmatrix}
= \begin{pmatrix} I & * \\ 0 & B \end{pmatrix} \cdot \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix},
\]
we only need to show that
\[
\det \begin{pmatrix} I & * \\ 0 & B \end{pmatrix} = \det B, \qquad
\det \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} = \det A.
\]
The key is that the two functions are both multilinear and alternating (in the columns of B and A respectively), and
\[
\det \begin{pmatrix} I & * \\ 0 & I \end{pmatrix} = \det \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} = 1
\]
by 3.3.
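The block-triangular determinant identity can be sanity-checked numerically; a brief sketch with arbitrary random blocks:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((2, 3))                  # the "*" block

M = np.block([[A, C],
              [np.zeros((3, 2)), B]])
assert np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(B))
```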
Then the result follows from the permutation expression as in 3.3 and the fact that
permutations are injective and surjective.
This directly implies that we also have the multilinearity and alternating property of
determinants in rows.
3.9 Minors
Suppose A ∈ Kn×n . The minors of A are determinants of submatrices of A.
Proof. The key is that, by multilinearity in columns and rows from 3.1 and 3.8,
\[ \det A = \sum_{i=1}^{n} a_{ij}\, \tilde{C}_{ij}, \]
where
\[ \tilde{C}_{ij} = \det \tilde{A}_{ij} \]
and Ã_{ij} is obtained from A by replacing the entries of the j-th column with the i-th standard basis vector.
By 3.7, we get
\[ \tilde{C}_{ij} = (-1)^{i+j} \det A_{ij} = (-1)^{i+j} C_{ij}. \]
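The resulting cofactor (Laplace) expansion can be turned directly into a recursive algorithm. The sketch below expands along the first row; it is exponentially slow and is meant only to mirror the formula, not to compute determinants in practice:

```python
import numpy as np

def det_laplace(A):
    """det A = sum_j (-1)^(1+j) a_{1j} det(A_{1j}), expanding along row 1."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # A_{1j}
        total += (-1) ** j * A[0, j] * det_laplace(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
assert np.isclose(det_laplace(A), np.linalg.det(A))
```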
Exercise 3.0.1
Suppose X = (x_{ij}). Prove the following derivative-of-determinant formula:
\[ \frac{\partial \det X}{\partial x_{ij}} = D_{ij}. \]
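Reading D_{ij} as the (i, j) cofactor (an assumption about the notation here), the formula can be checked against a finite difference; a brief numerical sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4))
i, j, h = 1, 2, 1e-6

# cofactor: (-1)^(i+j) times the minor with row i and column j deleted
minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
cofactor = (-1) ** (i + j) * np.linalg.det(minor)

# central finite difference of det X in the (i, j) entry
E = np.zeros_like(X)
E[i, j] = h
numeric = (np.linalg.det(X + E) - np.linalg.det(X - E)) / (2 * h)

assert np.isclose(numeric, cofactor, rtol=1e-4)
```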
Chapter 4
Eigenvalues and Eigenvectors
V = U1 ⊕ · · · ⊕Um .
It is not always true that the restriction of T to each U_j is an operator on that subspace: T may fail to map vectors in U_j back into U_j. When T does map U_j into itself, we say that U_j is an invariant subspace of T.
u ∈ U =⇒ Tu ∈ U.
Exercise 4.1.1
Suppose every subspace of V is T −invariant. Show that T = cI.
Exercise 4.1.2
Suppose T ∈ End(V ) and tr T = 0. Show that there is a basis of V under which the
presentation of T has all diagonal entries equal to 0.
Hint: Exercise 4.1.1.
Exercise 4.1.3
T² = T and U is T-invariant. Show that U = K ⊕ I, where K is a subspace of ker T and I is a subspace of im T.
T = T1 ⊕ · · · ⊕ Tm
T j u = Tu ∈ U j
for every u ∈ U j .
T = T1 ⊕ · · · ⊕ Tm
then there exists a basis of V w.r.t. which the matrix of T is of block diagonal form
\[ \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix} \]
Tu = T (µ v) = µ T v = µ (λ v) = (µλ )v = (λ µ )v = λ (µ v) = λ u.
Therefore, T |span(v) u = λ u.
Conversely, if T v = λ v, then span(v) is an invariant subspace of T . We name such values λ
eigenvalues of T and such v eigenvectors of T .
The fundamental theorem of algebra says χ for T on complex vector spaces splits. There-
fore, we have the following important structural theorem.
Exercise 4.2.1
Suppose V is a complex vector space, dimV = n and T ∈ End(V ) is invertible.
Let p denote the characteristic polynomial of T and let q denote the characteristic
polynomial of T^{−1}. Prove that
\[ q(z) = \frac{z^n}{p(0)}\, p(1/z) \qquad \text{for all } z \neq 0. \]
Proof. The key is that there exists a matrix of A of block upper triangular form
\[ A = \begin{pmatrix} A_U & * \\ 0 & B \end{pmatrix}. \]
Proof. The key is that there exists a matrix of A that is of diagonal form. Then by 3.7 we have exactly what we want.
The principal minors of A are the determinants of submatrices where the rows and columns
selected have the same indices in A.
Then
\[ \sigma_k = (-1)^{n-k} \sum P_{n-k}, \]
where P_{n−k} denotes an (n−k)-by-(n−k) principal minor of A and the summation is over all (n−k)-by-(n−k) principal minors.
\[ \prod_{1 \le k \le n} (a_k + b_k) = \sum_{C \subseteq Z_n} \prod_{i \in C} a_i \prod_{j \notin C} b_j. \]
Hence
\[
\chi(z) = \det(zI - A) = \sum_{C \subseteq Z_n} (-1)^{|C|} \sum_{\sigma \in S_n} \operatorname{sgn}\sigma \prod_{i \in C} a_{i,\sigma(i)} \prod_{j \notin C} z\,\delta_{j,\sigma(j)}.
\]
Since
\[
\prod_{j \notin C} z\,\delta_{j,\sigma(j)} = 0 \iff \exists\, j \notin C : j \neq \sigma(j),
\]
only the permutations that fix every index outside C contribute, so
\[
\chi(z) = \sum_{C \subseteq Z_n} (-1)^{|C|} \sum_{\sigma \in S(C)} \operatorname{sgn}\sigma \prod_{i \in C} a_{i,\sigma(i)}\, z^{\,n-|C|}.
\]
Note that
\[
\sum_{\sigma \in S(C)} \operatorname{sgn}\sigma \prod_{i \in C} a_{i,\sigma(i)}
\]
is exactly the principal minor corresponding to the set C. The result follows directly.
In particular, σ_{n−1} = −tr A and σ_0 = (−1)^n det A.
Exercise 4.2.2
Prove 4.12 using Laplace’s expansion 3.10.
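The relation between the coefficients σ_k and the principal minors can also be verified numerically. A brief sketch (np.poly returns the characteristic polynomial coefficients, highest degree first):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))

sigma = np.poly(A)[::-1]       # sigma[k] = coefficient of z^k in det(zI - A)

for k in range(n):
    minors = sum(np.linalg.det(A[np.ix_(c, c)])           # principal minors
                 for c in combinations(range(n), n - k))  # of size (n-k)x(n-k)
    assert np.isclose(sigma[k], (-1) ** (n - k) * minors)
```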
We are now able to show that every operator on a complex vector space has an upper triangular matrix.
Tu j = Su j + a j v.
The above equation holds for any basis chosen for span(u1 , · · · , uk ).
– Therefore, changing the basis of span(u1 , · · · , uk ) from u1 , · · · , uk to an-
other basis w1 , · · · , wk that gives us an upper triangular form allows us
to have Sw j ∈ span(w1 , · · · , w j ). Combining this with Tw j = Sw j + a j v
gives us Tw j ∈ span(v, w1 , · · · , w j ). Therefore, the matrix w.r.t. the
basis v, w1 , · · · , wk has the desired upper triangular form
\[ \begin{pmatrix} \lambda & * \\ 0 & R_k \end{pmatrix}. \]
By considering the rank of a matrix, it is simple to determine from the upper triangular form whether a linear operator is invertible.
Proof. The key idea is that if any entry on the diagonal is zero, then the corresponding column is either zero (when it happens at the first column) or lies in the span of the previous columns. This means the matrix is not of full rank and hence not invertible. The converse is true for the same reason.
Exercise 4.2.3
Suppose V is complex, T ∈ End(V ), f ∈ C[z]. Prove that α ∈ C is an eigenvalue of
f (T ) if and only if α = f (λ ) for some eigenvalue λ of T .
Proof. By 4.13, there exists a basis w.r.t. which A has an upper triangular matrix R_n ∈ K^{n×n}. Notice that
\[ R_n = \begin{pmatrix} R_{11} & * \\ 0 & R_{n-1} \end{pmatrix}, \]
where R_{11} ∈ K and R_{n−1} is an upper triangular matrix.
where R11 ∈ K and Rn−1 is an upper triangular matrix. Then by repeatedly using
3.7, we get
det Rn = R11 · · · Rnn .
By 3.5, we have that
det Rn = det A
for any matrix A of A . Therefore, for any matrix A of A , det A is the product of
eigenvalues of A .
To show that the number of times λ appears on the diagonal is exactly d(λ), consider the matrix R_n − zI. By the same computation,
\[ \det(R_n - zI) = (R_{11} - z) \cdots (R_{nn} - z). \]
But this is also, up to sign, the characteristic polynomial (from the left-hand side). This concludes our proof.
Exercise 4.2.4
Suppose K is algebraically closed, V is a vector space over K, A ∈ End(V), and dim V = n. Let A ∈ K^{n×n} be the matrix of A w.r.t. any basis of V. Prove that tr A is the sum of the eigenvalues of A.
In fact, the main reason that a richer theory exists for operators than for more general
linear maps is that operators can be raised to powers and hence can be written as
polynomials.
T m = T ◦ (T m−1 )
and T 0 = I.
Suppose p(z) ∈ K[z] given by
p(z) = a0 + a1 z + · · · + am zm .
p(T) = a_0 I + a_1 T + · · · + a_m T^m.
p(z)q(z) = q(z)p(z).
\[ p(z)q(z) = \sum_k c_k z^k. \]
A corollary of 4.18 is that ker p(T ), im p(T ) are T −invariant subspaces, where p ∈
K[z].
Proof. It suffices to prove the case where V is complex, since the characteristic polynomial is the same. By 4.13, we have an upper triangular matrix presentation for A. We write it as
\[ \begin{pmatrix} \lambda_1 & * \\ 0 & B \end{pmatrix}. \]
By ??, we have
\[
\chi\!\begin{pmatrix} \lambda_1 & * \\ 0 & B \end{pmatrix}
= \begin{pmatrix} 0 & * \\ 0 & B - \lambda_1 I_{n-1} \end{pmatrix}
\begin{pmatrix} \lambda_1 - \lambda_2 & * \\ 0 & B - \lambda_2 I_{n-1} \end{pmatrix}
\cdots
\begin{pmatrix} \lambda_1 - \lambda_n & * \\ 0 & B - \lambda_n I_{n-1} \end{pmatrix}.
\]
By the induction hypothesis applied to B,
\[ (B - \lambda_2 I_{n-1}) \cdots (B - \lambda_n I_{n-1}) = 0. \]
This gives
\[
\begin{pmatrix} 0 & * \\ 0 & B - \lambda_1 I_{n-1} \end{pmatrix}
\begin{pmatrix} (\lambda_1 - \lambda_2) \cdots (\lambda_1 - \lambda_n) & * \\ 0 & 0 \end{pmatrix}
= 0.
\]
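The Cayley-Hamilton theorem is easy to witness numerically: evaluating the characteristic polynomial on the matrix itself gives (numerically) zero. A brief sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))

c = np.poly(A)                  # char poly coefficients, highest degree first
chi_A = sum(coeff * np.linalg.matrix_power(A, len(c) - 1 - k)
            for k, coeff in enumerate(c))
print(np.max(np.abs(chi_A)))    # ~ 1e-14: chi(A) = 0 up to rounding
```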
We need to first discuss the relations of eigenvectors. The next theorem shows that eigen-
vectors corresponding to distinct eigenvalues must be linearly independent.
Proof. The key idea is that the images of the different vectors are scalar multiples by distinct scalars. Suppose that
a1 v1 + · · · + am vm = 0.
We also define the eigenspaces to show the relation of eigenvectors corresponding to the
same eigenvalues.
4.22 Eigenspaces
Suppose T ∈ End(V ) and λ ∈ K. The eigenspace of T corresponding to λ is defined
as
E(λ , T ) = ker(T − λ I).
E(λ1 , T ) + · · · + E(λm , T )
is direct.
V = U1 ⊕ · · · ⊕Un .
4. V = E(λ1 , T ) ⊕ · · · ⊕ E(λm , T ).
Proof. (1) and (2) are trivially equivalent: (1) implies (2) by 4.15, and the converse holds by definition. (2) and (3) are even more obviously equivalent using the definition of direct sum.
• Suppose that (2) is true, then we have
V = E(λ1 , T ) + · · · + E(λm , T )
In fact, the list u_{1i}, · · · , u_{ni} is a basis of V_i exactly after removing the 0s. This is because either a_j = 0, or u_{jk} = 0 for all k ≠ i, for that j with nonzero a_j.
The following is a useful lemma, and its argument recurs in proofs of many similar results, from 5.17 to Lie's theorem in the theory of Lie algebras.
Ti T j u = λ µ u = µλ u = T j Ti u.
Chapter 5
Real and Complex Inner Product Spaces

In this chapter, we exploit the relation between operators and real and complex numbers using inner products.
As such, we first show that for every linear functional φ ∈ V∗ on V, we can find a unique u that represents φ.
Suppose V is an inner product space, T ∈ End(V) and u, v ∈ V. Consider the inner product ⟨Tu, v⟩ and the linear functional
φ_{v,T} : u ↦ ⟨Tu, v⟩.
By 5.1, there exists a unique w ∈ V such that ⟨Tu, v⟩ = ⟨u, w⟩ for all u ∈ V; we define T∗v := w. In an orthonormal basis e_1, · · · , e_n,
\[ T^{*}e_j = \langle T^{*}e_j, e_1\rangle e_1 + \cdots + \langle T^{*}e_j, e_n\rangle e_n = \overline{\langle Te_1, e_j\rangle}\, e_1 + \cdots + \overline{\langle Te_n, e_j\rangle}\, e_n. \]
Proof. This follows from our discussion above: the coordinates of T∗e_j form the j-th column of the matrix of T∗, and the i-th of these coordinates, \overline{\langle Te_i, e_j\rangle}, is the conjugate of the j-th entry of the i-th row of M = [T]_{e_1,···,e_n}. Hence the matrix of T∗ is the conjugate transpose of M.
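In coordinates this says the matrix of T∗ is the conjugate transpose; a brief numerical sketch (random complex data; note that np.vdot conjugates its first argument):

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = lambda x, y: np.vdot(y, x)     # <x, y>, linear in x, conjugate-linear in y
M_star = M.conj().T                    # conjugate transpose

assert np.isclose(inner(M @ u, v), inner(u, M_star @ v))
```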
The adjoint of an operator T behaves similarly to the dual map, but lives on the same spaces as T. This similarity is due to the similarity between dual bases and orthonormal bases.
Proof.
v ∈ ker T∗ ⟺ T∗v = 0
⟺ ⟨u, T∗v⟩ = 0, ∀u ∈ V
⟺ ⟨Tu, v⟩ = 0, ∀u ∈ V
⟺ v ∈ (im T)⊥.
Proof.
\[ Tv = \lambda v \implies \lambda \|v\|^2 = \langle \lambda v, v\rangle = \langle Tv, v\rangle = \langle v, Tv\rangle = \langle v, \lambda v\rangle = \overline{\lambda}\, \|v\|^2, \]
hence λ = \overline{λ}, i.e. λ ∈ R.
This is our first result that shows the similarity between self-adjoint operators and
real numbers.
Before we proceed to our next result, we make a rough comparison between operators on real vector spaces and on complex vector spaces, using informal language to illustrate our motivation.
In conclusion, operators that behave like complex numbers, i.e. involve rotation, lose their eigenvectors and eigenvalues when they are described on real vector spaces.
⟨Tv, v⟩ = 0, ∀v ∈ V
if and only if T = 0.
\[
\langle Tu, v\rangle = \frac{\langle T(u+v), u+v\rangle - \langle T(u-v), u-v\rangle}{4}
+ \frac{\langle T(u+iv), u+iv\rangle - \langle T(u-iv), u-iv\rangle}{4}\, i,
\]
which is an instance of the corresponding identity for sesquilinear forms:
\[
B(u, v) = \frac{B(u+v, u+v) - B(u-v, u-v)}{4}
+ \frac{B(u+iv, u+iv) - B(u-iv, u-iv)}{4}\, i.
\]
A sesquilinear form satisfies only the linearity conditions required of an inner product. The proof in 2.9 still holds for them, as we used only linearity, yielding this identity. The significance is that we can upgrade conclusions about all the values B(u, u) to all the values B(u, v).
5.10 is not true on real vector spaces, since there are rotation operators that lose the
complex eigenvalues and corresponding eigenvectors. Since self-adjoint operators
only have real eigenvalues, 5.10 holds for this class on real vector spaces.
⟨Tv, v⟩ = 0, ∀v ∈ V
if and only if T = 0.
Proof. The key is the identity
\[ \langle Tu, v\rangle = \frac{\langle T(u+v), u+v\rangle - \langle T(u-v), u-v\rangle}{4}, \]
from which we get
⟨Tv, v⟩ = 0, ∀v ∈ V ⟹ ⟨Tu, v⟩ = 0, ∀u, v ∈ V ⟹ ⟨Tu, Tu⟩ = 0, ∀u ∈ V ⟹ T = 0.
The identity we use is the analogue of 2.9 for sesquilinear forms satisfying B(u, v) = B(v, u):
\[ B(u, v) = \frac{B(u+v, u+v) - B(u-v, u-v)}{4}. \]
Let B(u, v) = ⟨Tu, v⟩; the condition B(u, v) = B(v, u) is given by the self-adjoint property.
The next theorem provides a simple characterization of normal operators. The geometric
implication of this characterization would be clear after we arrive at the result ??.
(T ∗ T − T T ∗ )∗ = T ∗ (T ∗ )∗ − (T ∗ )∗ T ∗ = T ∗ T − T T ∗ .
Therefore
T∗T = TT∗ ⟺ T∗T − TT∗ = 0
⟺ ⟨(T∗T − TT∗)v, v⟩ = 0, ∀v ∈ V
⟺ ⟨T∗Tv, v⟩ = ⟨TT∗v, v⟩, ∀v ∈ V
⟺ ‖Tv‖² = ‖T∗v‖², ∀v ∈ V.
Another interesting result, with geometric implication (to be revealed later), is as follows.
Proof. The first property is easy to check by 5.13. The second is also easy to check by computation. The key to the third is that the conclusion of 4.26, which applies since T and T∗ commute, can be strengthened using the first and second properties to show that they have common eigenspaces.
\[ \langle Tu, v\rangle = \langle u, T^{*}v\rangle, \qquad \langle \lambda u, v\rangle = \langle u, \overline{\mu}\, v\rangle, \qquad \lambda \langle u, v\rangle = \mu \langle u, v\rangle, \]
so λ ≠ µ forces ⟨u, v⟩ = 0.
The next exercise shows that the commuting list of normal operators T1 , · · · , Tm induces the
commuting algebra
C[T1 , · · · , Tm , T1∗ , · · · , Tm∗ ].
This is a commutative subalgebra of the ∗−algebra of End(V ).
Exercise 5.3.1
1. Suppose V is complex and T ∈ End(V ) is normal. Then ∃p ∈ C[z] : T ∗ =
p(T ). (Hint: Use 1.2.3.)
2. Hence, show that if T is normal, ST = T S =⇒ ST ∗ = T ∗ S.
We studied how self-adjoint operators resemble real numbers in the previous chapter. In this chapter, we show that self-adjoint operators are orthogonally diagonalizable on real inner product spaces.
Proof. χ(z) splits over C, and by 5.9 every root of it is real. This means χ(z) in fact splits over R. Therefore T has an eigenvalue, and hence a nontrivial eigenspace.
The complex spectral theorem 5.17 then generalizes to the real case.
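Numerically, the real spectral theorem is exactly what np.linalg.eigh computes for a symmetric matrix: real eigenvalues and an orthonormal eigenbasis. A brief sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(8)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # a real self-adjoint (symmetric) matrix

w, Q = np.linalg.eigh(A)               # real eigenvalues, orthonormal eigenvectors
assert np.allclose(Q.T @ Q, np.eye(4))          # Q is orthogonal
assert np.allclose(Q @ np.diag(w) @ Q.T, A)     # A = Q diag(w) Q^T
```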
Suppose T is a positive operator. Then T = λ_1 I ⊥ · · · ⊥ λ_n I, the spectral decomposition along orthogonal eigenspaces. Let R = √λ_1 I ⊥ · · · ⊥ √λ_n I. Then R is a positive square root of T. The next result shows it is unique.
We use √T to denote the unique positive square root of T.
We have shown above that every positive operator T is of the form T = (√T)² = (√T)∗(√T). The converse is also clearly true. This gives us a characterization of positive operators.
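The construction of the square root through the spectral decomposition translates directly into code; a brief sketch (the clip guards against tiny negative rounding errors):

```python
import numpy as np

rng = np.random.default_rng(9)
B = rng.standard_normal((3, 3))
T = B.T @ B                            # a positive operator

w, Q = np.linalg.eigh(T)
R = Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T   # sqrt(T)

assert np.allclose(R @ R, T)
assert np.all(np.linalg.eigvalsh(R) >= -1e-10)          # R is positive
```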
⟨Tv, v⟩ = λ⟨v, v⟩ = λ‖v‖² ≥ 0.
5.25 Effects of √(T∗T)
Suppose T ∈ End(V). Then
\[ \|Tv\| = \|\sqrt{T^{*}T}\, v\|, \qquad \forall v \in V. \]
‖Sv‖ = ‖v‖, ∀v ∈ V.
6. SS∗ = I.
7. S∗ is an isometric operator.
The only key property we used is the identity 5.25. Therefore, the proof of this theorem applies to any pair T, L with ‖Tv‖ = ‖Lv‖ for all v, and in particular gives the following result.
such that
\[ T = S\sqrt{T^{*}T}. \]
Proof. If the eigenvectors of S are eigenvectors of √(T∗T), then they are eigenvectors of T, and then T is normal by the easy direction in 5.17.
If T is normal, the key is that T and √(T∗T) must have the same eigenbasis by 5.25. Then S must act as a scalar on each of them.
From the definition, the singular values are also the arithmetic square roots of the eigenvalues of T∗T.
Tv = s_1⟨v, e_1⟩ f_1 + · · · + s_n⟨v, e_n⟩ f_n.
Proof. Let e_1, · · · , e_n be the eigenbasis of √(T∗T) and f_j = Se_j, where S is the isometry in 5.28. Then the result follows.
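This is precisely the decomposition returned by np.linalg.svd: T = F diag(s) E^⊺ with orthonormal columns e_j, f_j. A brief sketch verifying the displayed formula with arbitrary random data:

```python
import numpy as np

rng = np.random.default_rng(10)
T = rng.standard_normal((3, 3))
v = rng.standard_normal(3)

F, s, Et = np.linalg.svd(T)            # T = F @ diag(s) @ Et
E = Et.T                               # columns of E are the e_j

Tv = sum(s[j] * np.dot(v, E[:, j]) * F[:, j] for j in range(3))
assert np.allclose(T @ v, Tv)
```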
Chapter 6
Canonical Forms
In this chapter, we study further the structure of operators when there does not exist an eigenspace decomposition of V. We assume the vector spaces in this chapter are finite-dimensional.
We study the structure of operators on finite-dimensional vector spaces by considering direct decompositions into some more general invariant subspaces, specifically cyclic invariant subspaces. An inner product would not help here.
The design of this chapter is as follows:
• we begin our study with generalized eigenvectors and eigenspaces, which lead us to the important result of root subspace decomposition;
• from there we reduce to the study of nilpotent operators, as we study the direct decomposition of T − λI into its cyclic invariant subspaces (the cyclic decomposition of nilpotent operators), a second decomposition performed after the first, the root space decomposition;
• the other properties of the Jordan canonical form are then derived easily from its proof.
Since End(V) is finite-dimensional, the powers I, T, T², · · · are linearly dependent, so for some m,
\[ T^m = a_0 I + a_1 T + \cdots + a_{m-1} T^{m-1}. \]
Taking the smallest such m gives us a monic annihilating polynomial m of T of the smallest possible degree. This polynomial is also unique, since otherwise the difference of two such polynomials would be an annihilating polynomial of smaller degree. The polynomial m is the minimal polynomial of T.
The reason why minimal polynomials are important is their minimality and the fact that
they are closely related to invariant subspaces of T . The following result is a direct demon-
stration, although it is not the most interesting one.
The theorem 6.4 also implies that every annihilating polynomial is a multiple of the minimal polynomial of T: dividing by m leaves a remainder of smaller degree that also annihilates T, and hence vanishes.
Also, all of the roots of m are eigenvalues of T since otherwise m is not minimal, and all
eigenvalues of T are roots of m since the eigenspaces are T −invariant.
More generally, we can use the factorization of m to study all the invariant subspaces of T, not just the eigenspaces arising from the roots. This leads to the primary decomposition, a more general form of the root space decomposition 6.11.
Proof.
The case m = 1 is trivial, and the general case follows directly by induction. Hence we only need to prove the case m = 2.
• By Bezout's theorem, there exist polynomials q_1, q_2 such that p_1q_1 + p_2q_2 = 1. Hence for u ∈ ker p_1(T) ∩ ker p_2(T),
\[ u = p_1(T)q_1(T)u + p_2(T)q_2(T)u = 0. \]
V = ker(T − λ1 ) ⊕ · · · ⊕ ker(T − λm )
Exercise 6.1.1
1. Prove that T is invertible if and only if m has nonzero constant term.
2. Suppose T is invertible. Find f ∈ K[z] such that T −1 = f (T ).
Exercise 6.1.2
Suppose the minimal polynomial of T ∈ End(V) is m(z) = ∏_{i=1}^{k} (z − λ_i), with λ_i ≠ λ_j if i ≠ j. Show that for any v ∈ V, v = ∑_{i=1}^{k} v_i, where
\[ v_i = f_i(T)v, \qquad f_i(z) = \prod_{1 \le j \le k,\ j \ne i} \frac{z - \lambda_j}{\lambda_i - \lambda_j}. \]
Exercise 6.1.3
Suppose V is complex and T^n = I. Show that for any v ∈ V, v = ∑_{i=0}^{n−1} v_i, where
\[ v_i = \frac{1}{n} \sum_{j=0}^{n-1} \omega^{-ij}\, T^j v, \qquad \omega = e^{2\pi i/n}. \]
for all j ≥ 1.
Clearly, if ker(T − λI)^j is nontrivial, then λ is an eigenvalue of T, since (T − λI)^{j−1}v is an eigenvector for any v ∈ ker(T − λI)^j with (T − λI)^{j−1}v ≠ 0.
To generalize the notion of eigenvectors in this way, we prefer that vectors in each ker(T −
λ I) j corresponding to distinct λ are linearly independent i.e. the sum of such spaces is
direct. The next theorem asserts it is indeed true.
Suppose that
a1 v1 + · · · + am vm = 0.
Apply the operator (λ_1 − T)^{dim V} · · · (λ_{m−1} − T)^{dim V}(λ_m − T)^k to both sides, where k is the largest integer with (λ_m − T)^k v_m ≠ 0. We obtain
\[ a_m (\lambda_1 - \lambda_m)^{\dim V} \cdots (\lambda_{m-1} - \lambda_m)^{\dim V} (\lambda_m - T)^k v_m = 0, \]
which forces a_m = 0; repeating the argument gives a_1 = · · · = a_m = 0.
We may directly apply the proof of 6.6 to get the directness of the sum.
Since the sum of such spaces is direct, we can then define generalized eigenspaces as such spaces, and generalized eigenvectors as the vectors that live in them.
We can characterize generalized eigenspaces using the reasoning from our discussion above.
Proof. This is a direct result of our discussion above. If v ∈ G(λ, T), then (T − λI)^j v = 0 for some j, so v ∈ ker(T − λI)^j ⊂ ker(T − λI)^{dim V}. The converse is true by definition.
Now we show the important structural result: the root subspace decomposition of complex
vector spaces.
f (z) = (z − λ1 )n · · · (z − λm )n .
Exercise 6.2.1
V is complex and T ∈ End(V ). Prove that T is diagonalizable if and only if every
generalized eigenvector is an eigenvector.
The root space decomposition theorem shows that we can always find a basis of generalized eigenvectors of T in complex V. If we write T as a matrix w.r.t. this basis, we then have a block diagonal matrix, as long as we keep these basis vectors in an appropriate order. We then define algebraic multiplicity to describe the size of such diagonal blocks.
The next result relates the algebraic multiplicity of a (complex) eigenvalue to the number of times it appears on the diagonal of the upper triangular matrix and to its multiplicity as a root of the characteristic polynomial.
Exercise 6.2.2
V is complex and T ∈ End(V) with distinct eigenvalues λ_1, · · · , λ_m. Show that the characteristic polynomial of T is
\[ \chi(z) = (z - \lambda_1)^{d(\lambda_1)} \cdots (z - \lambda_m)^{d(\lambda_m)}. \]
Exercise 6.2.3
T ∈ End(V) and m(z) = ∏_{i=1}^{k} (z − λ_i)^{m_i} is the minimal polynomial of T, where λ_i ≠ λ_j if i ≠ j. Prove:
1. G(λ_i, T) = ker(λ_i I − T)^{m_i}.
2. m_i(z) = (z − λ_i)^{m_i} is the minimal polynomial of T|_{G(λ_i,T)}.
Exercise 6.3.1
T ∈ End(V). Show that there exist invariant subspaces U, W such that V = U ⊕ W, where T|_U is an invertible operator and T|_W is nilpotent.
Compare this with 6.11.
Hint: Exercise 1.2.6.
The cyclic decomposition of nilpotent operators would give us a second order decomposi-
tion of operators after the root space decomposition on complex vector spaces.
Z(v; T ) = span(v, T v, · · · )
such that
1. Nv j1 = 0; and
2. Nv ji = v j,i−1 for each i = 2, · · · , r j
for each j.
N^{m_1}v_1, · · · , v_1; · · · ; N^{m_n}v_n, · · · , v_n
is a basis of im N and
N^{m_1+1}v_1 = · · · = N^{m_n+1}v_n = 0.
N^{m_1+1}u_1, · · · , u_1; · · · ; N^{m_n+1}u_n, · · · , u_n
N^{m_1}u_1, · · · , u_1; · · · ; N^{m_n}u_n, · · · , u_n
all are 0.
∗ Therefore we only need
N^{m_1+1}u_1 = N^{m_1}v_1, · · · , N^{m_n+1}u_n = N^{m_n}v_n
N^{m_1+1}u_1, · · · , u_1; · · · ; N^{m_n+1}u_n, · · · , u_n, w_1, · · · , w_t.
Exercise 6.3.2
N ∈ End(V), N^m = 0. Prove: there exists a subspace W ⊂ V such that
\[ V = \bigoplus_{i=1}^{m} N^{i-1}(W). \]
V:            K_1    K_2     · · ·    K_{m−1}    K_m
W:            F_1    F_2     · · ·    F_{m−1}    F_m
N(W):                T(F_2)  · · ·    · · ·      T(F_m)
⋮                                     ⋱
N^{m−1}(W):                                      T^{m−1}(F_m)
Combining the above results, namely the cyclic decomposition of (T − λI)|_{G(λ,T)} on each generalized eigenspace, with the first step, the root space decomposition of the complex vector space V, gives us the existence of a Jordan basis.
Suppose V is a complex vector space and T ∈ End(V ) with an eigenvalue λ . The matrix of
T |G(λ ,T ) w.r.t. a Jordan basis has the form
\[ \begin{pmatrix} J_1 & & 0 \\ & \ddots & \\ 0 & & J_s \end{pmatrix} \]
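The Jordan form of an explicit matrix can be computed symbolically with sympy; a brief sketch with a non-diagonalizable example (the matrix is chosen for illustration):

```python
import sympy as sp

# a single eigenvalue 2; not diagonalizable, since rank(A - 2I) = 1
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 2]])

P, J = A.jordan_form()       # A = P J P^{-1}
sp.pprint(J)                 # a 2x2 Jordan block and a 1x1 block for lambda = 2
assert P * J * P.inv() == A
```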
Exercise 6.3.3
Suppose V is complex and T ∈ End(V) is invertible. Prove that T has an m-th root, i.e. there exists R ∈ End(V) such that T = R^m, for every integer m ≥ 2.
A Jordan basis is not necessarily unique. However, the Jordan form does not depend on the specific basis chosen, as we will show next.
We are curious about the number of Jordan blocks for each eigenvalue, so we introduce
geometric multiplicity.
Exercise 6.3.4
T ∈ End(V) and m(z) = ∏_{i=1}^{k} (z − λ_i)^{m_i} is the minimal polynomial of T, where λ_i ≠ λ_j if i ≠ j. Prove that
d(λ_i) ≤ m_i s(λ_i).
We are yet to find the number t_j(λ) of Jordan blocks of size j-by-j for each j. However, it is easy to see that
\[ \dim \ker (T - \lambda I)^j - \dim \ker (T - \lambda I)^{j-1} \]
is the number of Jordan blocks for λ of size at least j-by-j, and hence
\[ t_j(\lambda) = 2 \dim \ker (T - \lambda I)^j - \dim \ker (T - \lambda I)^{j-1} - \dim \ker (T - \lambda I)^{j+1}. \]
This means that the Jordan form of T is unique up to rearrangement of the Jordan blocks, as t_j(λ) is uniquely determined by T for each eigenvalue λ of T.
Therefore, we say that the Jordan form is canonical, since it depends only on T and not on the specific Jordan basis.
From the Jordan canonical form we can deduce useful results about similarity of matrices: it tells us exactly which matrices are similar.
Proof.
Suppose A, B ∈ C^{n×n} are similar. Then there is a single T ∈ End(C^n) corresponding to both A and B under different bases. However, there is a unique JCF for T, which is determined by T.
The converse follows from the definition of similar matrices.