
Chapter 6

The Jordan Canonical Form

6.1 Introduction
The importance of the Jordan canonical form became evident in the last chapter, where
it frequently served as an important theoretical tool to derive practical procedures for
calculating matrix polynomials.
In this chapter we shall take a closer look at the Jordan canonical form of a given
matrix A. In particular, we shall be interested in the following questions:

• how to determine its structure;

• how to calculate P such that P −1 AP is a Jordan matrix.

As we learned in the previous chapter in connection with the diagonalization
theorem (cf. section 5.4), the eigenvalues and eigenvectors of A yield important clues for
determining the shape of the Jordan canonical form. Now it is not difficult to see that
for 2 × 2 and 3 × 3 matrices the knowledge of the eigenvalues and eigenvectors of A alone
suffices to determine the Jordan canonical form J of A, but for larger matrices this
is no longer true. However, by generalizing the notion of eigenvectors, we can determine
J from the resulting additional information. Thus we shall:

• study some basic properties of eigenvalues and eigenvectors in section 6.2;

• learn how to find J and P when m ≤ 3 (section 6.3);

• define and study generalized eigenvectors and learn how to determine J (section 6.4);

• learn a general algorithm for determining P in section 6.5.

In addition, we shall also look at some applications of the Jordan canonical form
such as a proof of the Cayley-Hamilton theorem (cf. section 6.6). Other applications will
follow in later chapters.

6.2 Algebraic and geometric multiplicities of eigenvalues
As we shall see, much (but not all) of the structure of the Jordan canonical form J of a
matrix A can be read off from the algebraic and geometric multiplicities of the eigenvalues
of A, which we now define.
Definition. Let A be an m × m matrix and λ ∈ C. Then

mA (λ) = multλ (chA ), the multiplicity of λ as a root of chA (t) (cf. chapter 3),
is called the algebraic multiplicity of λ in A;
νA (λ) = dimC EA (λ) is called the geometric multiplicity of λ in A.

Here, as before (cf. section 5.4),

EA (λ) = {~v ∈ Cm : A~v = λ~v } = Nullsp(A − λI) denotes the λ-eigenspace of A.

Remarks. 1) Note that the above definition does not require λ to be an eigenvalue of
A. Thus by definition:

λ is an eigenvalue of A ⇔ νA (λ) ≥ 1 ⇔ mA (λ) ≥ 1.

2) We shall see later (in Theorem 6.4) that we always have νA (λ) ≤ mA (λ).
3) By linear algebra, νA (λ) = dim Nullsp(A − λI) = m − rank(A − λI), where the first equality is just the definition.
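Remark 3 translates directly into a computation. The following NumPy sketch (our own illustration, not part of the text; the helper name geometric_multiplicity is ours) computes νA (λ) = m − rank(A − λI) for the matrix A of Example 6.1 below:

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """nu_A(lam) = m - rank(A - lam*I), per Remark 3 above."""
    m = A.shape[0]
    return m - np.linalg.matrix_rank(A - lam * np.eye(m))

A = np.array([[1., 1., 2.],
              [0., 1., 2.],
              [0., 0., 3.]])
print(geometric_multiplicity(A, 1.0))  # -> 1
print(geometric_multiplicity(A, 3.0))  # -> 1
print(geometric_multiplicity(A, 7.0))  # 7 is not an eigenvalue -> 0
```

Note that the definition does not require λ to be an eigenvalue: for any other λ the rank is full and the multiplicity is 0.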
Example 6.1. Find the algebraic and geometric multiplicities of (the eigenvalues of)
the matrices

    A = [ 1 1 2 ]        B = [ 1 0 2 ]
        [ 0 1 2 ]            [ 0 1 2 ]
        [ 0 0 3 ]            [ 0 0 3 ]
Solution. Since A and B are both upper triangular and have the same diagonal entries
1, 1, 3 we see that
chA (t) = chB (t) = (t − 1)2 (t − 3).
Thus, both matrices have λ1 = 1 and λ2 = 3 as their eigenvalues, with algebraic multiplicities

    mA (1) = mB (1) = 2 and mA (3) = mB (3) = 1.
To calculate the geometric multiplicities, we have to determine the ranks of A − λi I and
B − λi I for i = 1, 2. Now

    A − I = [ 0 1 2 ]   B − I = [ 0 0 2 ]   A − 3I = [ −2  1 2 ]   B − 3I = [ −2  0 2 ]
            [ 0 0 2 ]           [ 0 0 2 ]            [  0 −2 2 ]            [  0 −2 2 ]
            [ 0 0 2 ]           [ 0 0 2 ]            [  0  0 0 ]            [  0  0 0 ]

Thus, since A − I clearly has 2 linearly independent column vectors, we see that rank(A − I) = 2, and so νA (1) = 3 − rank(A − I) = 3 − 2 = 1. Similarly, rank(B − I) = 1, and so
νB (1) = 3 − rank(B − I) = 3 − 1 = 2.
Furthermore, since A − 3I and B − 3I both have rank 2, it follows that νA (3) =
νB (3) = 3 − 2 = 1. Thus, the geometric multiplicities of the eigenvalues of A and B are
νA (1) = 1, νB (1) = 2 and νA (3) = νB (3) = 1.
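As a quick numerical cross-check of Example 6.1 (our own illustration, not part of the text), the algebraic multiplicities can be read off by counting eigenvalues and the geometric ones from ranks:

```python
import numpy as np

A = np.array([[1., 1., 2.], [0., 1., 2.], [0., 0., 3.]])
B = np.array([[1., 0., 2.], [0., 1., 2.], [0., 0., 3.]])

for name, M in [("A", A), ("B", B)]:
    eig = np.linalg.eigvals(M)
    m1 = int(np.sum(np.isclose(eig, 1.0)))           # algebraic multiplicity of 1
    nu1 = 3 - np.linalg.matrix_rank(M - np.eye(3))   # geometric multiplicity of 1
    print(name, m1, nu1)
# A: m_A(1) = 2, nu_A(1) = 1 ;  B: m_B(1) = 2, nu_B(1) = 2
```

The output agrees with the hand computation: the two matrices have the same characteristic polynomial but different geometric multiplicities.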

Example 6.2. Consider the following three Jordan matrices:

    J1 = [ 5 0 0 ]     J2 = [ 5 1 0 ]     J3 = [ 5 1 0 ]
         [ 0 5 0 ]          [ 0 5 0 ]          [ 0 5 1 ]
         [ 0 0 5 ]          [ 0 0 5 ]          [ 0 0 5 ]
Then their algebraic and geometric multiplicities are given in the following table:

    i            1                  2             3
    chJi (t)     (t − 5)³           (t − 5)³      (t − 5)³
    mJi (5)      3                  3             3
    νJi (5)      3                  2             1
    EJi (5)      ⟨~e1 , ~e2 , ~e3 ⟩    ⟨~e1 , ~e3 ⟩    ⟨~e1 ⟩

Here ~e1 = (1, 0, 0)t , ~e2 = (0, 1, 0)t , ~e3 = (0, 0, 1)t denote the standard basis vectors of C3
and ⟨. . .⟩ denotes the span (= set of all linear combinations) of the vectors.
Verification of table: To check the first two rows of the table we note that

    chJi (t) = (−1)³ det(Ji − tI) = − det [ 5−t   ∗    ∗  ] = −(5 − t)³ = (t − 5)³.
                                          [  0   5−t   ∗  ]
                                          [  0    0   5−t ]
Thus, for all three matrices λ1 = 5 is the only eigenvalue and its algebraic multiplicity is
mJi (5) = 3 (= the exponent of (t − 5) in chJi (t)).
To compute νJi (5), it is enough to find rank(Ji − 5I) = the number of non-zero rows
of the associated row echelon form. Here we need to consider the three cases separately:

1) Since J1 − 5I = 0 and rank(0) = 0, we have νJ1 (5) = 3 − rank(J1 − 5I) = 3 − 0 = 3.

2) Next,

       J2 − 5I = [ 0 1 0 ]
                 [ 0 0 0 ]
                 [ 0 0 0 ],

which is in row echelon form. Thus rank(J2 − 5I) = 1, and hence νJ2 (5) = 3 − rank(J2 − 5I) = 3 − 1 = 2.

3) Similarly,

       J3 − 5I = [ 0 1 0 ]
                 [ 0 0 1 ]
                 [ 0 0 0 ],

which is again in row echelon form. Thus νJ3 (5) = 3 − rank(J3 − 5I) = 3 − 2 = 1.
Finally, the indicated basis of EJi (5) is obtained by using back-substitution.
 
Example 6.3. Let

    J = J(λ, k) = [ λ 1        ]
                  [   λ 1      ]
                  [     ⋱  ⋱  ]
                  [        λ 1 ]
                  [          λ ]

be a Jordan block of size k. Then:

    chJ (t) = (t − λ)^k   (since J is upper triangular),
    EJ (λ) = {c(1, 0, . . . , 0)t : c ∈ C}   (cf. Example 5.7 of chapter 5),
    mJ (λ) = multλ (chJ ) = k,
    νJ (λ) = dimC EJ (λ) = 1.
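The conclusions of Example 6.3 can be checked numerically. In the following sketch (our own illustration; the function name jordan_block is ours), a Jordan block is built as λI plus a superdiagonal of 1's, and its single-dimensional eigenspace shows up as nullity 1:

```python
import numpy as np

def jordan_block(lam, k):
    """The k x k Jordan block J(lam, k): lam on the diagonal, 1's above it."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

J = jordan_block(5.0, 4)
k = J.shape[0]
nu = k - np.linalg.matrix_rank(J - 5.0 * np.eye(k))
print(nu)  # -> 1: a single Jordan block has a one-dimensional eigenspace
```

Regardless of the block size k, the geometric multiplicity stays 1 while the algebraic multiplicity is k.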
The above example shows us how to quickly find the algebraic and geometric multiplicities of Jordan blocks. To extend this to Jordan matrices, i.e. to matrices of the form

    J = Diag(J1 , J2 , . . . , Jr ) = [ J1  0  . . .  0 ]
                                      [ 0  J2  . . .  0 ]
                                      [ ⋮        ⋱   ⋮ ]
                                      [ 0   . . . 0  Jr ],
where the Ji = J(λi , mi ) are Jordan blocks, we shall use the following result.

Theorem 6.1 (Sum Formula). If A = Diag(B, C) = [ B 0 ; 0 C ], then the algebraic and geometric multiplicities of the eigenvalues of A are the sums of the corresponding multiplicities of B and C. In other words, for any λ ∈ C we have

(1) mA (λ) = mB (λ) + mC (λ) and νA (λ) = νB (λ) + νC (λ).

Example 6.4. As in Example 6.2, let J2 = Diag(J(5, 2), J(5, 1)) = Diag(B, C). Then by (1) and Example 6.3:

    mJ2 (5) = mB (5) + mC (5) = 2 + 1 = 3,
    νJ2 (5) = νB (5) + νC (5) = 1 + 1 = 2.

This example generalizes as follows:


Corollary. If J = Diag(J11 , J12 , . . . , Jij , . . . ) is a Jordan matrix with Jordan blocks
Jij = J(λi , kij ) and λ ∈ C, then

νJ (λ) = the number of Jordan blocks Jij with eigenvalue λi = λ,


mJ (λ) = the sum of the sizes kij of the Jordan blocks Jij with eigenvalue λi = λ.
Proof. By Theorem 6.1 we have νJ (λ) = Σi,j νJij (λ). Now by Example 6.3 we know
that νJij (λ) = 1 if Jij has eigenvalue λi = λ and νJij (λ) = 0 otherwise, so the assertion
for νJ (λ) follows. The formula for mJ (λ) is proved similarly.

Theorem 6.1 is, in fact, a special case of a much more precise result. To state it in a
convenient form, it is useful to introduce the following notation.
Notation. (a) If ~v = (v1 , . . . , vn )t ∈ Cn and ~w = (w1 , . . . , wm )t ∈ Cm , then the vector

    ~v ⊕ ~w := (v1 , . . . , vn , w1 , . . . , wm )t ∈ Cn+m

is called the direct sum of ~v and ~w.

(b) If V ⊂ Cn and W ⊂ Cm are subspaces, then the direct sum of V and W is the subspace

    V ⊕ W = {~v ⊕ ~w ∈ Cn+m : ~v ∈ V, ~w ∈ W }.
Remarks. 1) If ~v1 , . . . , ~vr is a basis of V ⊂ Cn and ~w1 , . . . , ~ws is one of W ⊂ Cm , then
~v1 ⊕ ~0m , . . . , ~vr ⊕ ~0m , ~0n ⊕ ~w1 , . . . , ~0n ⊕ ~ws is a basis of V ⊕ W . (Here, ~0m = (0, . . . , 0)t ∈ Cm .) Thus

(2) dim(V ⊕ W ) = dim V + dim W.


2) If A is an a × m matrix and B is a b × n matrix, then for every ~v ∈ Cm and ~w ∈ Cn we have

(3) Diag(A, B)(~v ⊕ ~w) = (A~v ) ⊕ (B ~w).

Example 6.5. Let V = {c1 (1, 2)t + c2 (3, 4)t : c1 , c2 ∈ C} and W = {c′1 (1, 2, 1)t + c′2 (3, 4, 1)t : c′1 , c′2 ∈ C}. Verify the addition rule (2) for V and W .

Solution. If ~v ∈ V , then ~v = c1 (1, 2)t + c2 (3, 4)t = (c1 , 2c1 )t + (3c2 , 4c2 )t = (c1 + 3c2 , 2c1 + 4c2 )t , and similarly each ~w ∈ W has the form ~w = (c′1 + 3c′2 , 2c′1 + 4c′2 , c′1 + c′2 )t . Thus

    ~v ⊕ ~w = (c1 + 3c2 , 2c1 + 4c2 )t ⊕ (c′1 + 3c′2 , 2c′1 + 4c′2 , c′1 + c′2 )t
            = (c1 + 3c2 , 2c1 + 4c2 , c′1 + 3c′2 , 2c′1 + 4c′2 , c′1 + c′2 )t
            = c1 (1, 2, 0, 0, 0)t + c2 (3, 4, 0, 0, 0)t + c′1 (0, 0, 1, 2, 1)t + c′2 (0, 0, 3, 4, 1)t ,

and so V ⊕ W = {c1 (1, 2, 0, 0, 0)t + c2 (3, 4, 0, 0, 0)t + c′1 (0, 0, 1, 2, 1)t + c′2 (0, 0, 3, 4, 1)t :
c1 , c2 , c′1 , c′2 ∈ C}. Therefore, dim(V ⊕ W ) = 4 = dim V + dim W .

Example 6.6. Verify the multiplication rule (3) for A = [ 1 2 ; 3 4 ] and B = [ 5 2 ; 6 1 ].

Solution. Write ~v = (v1 , v2 )t and ~w = (w1 , w2 )t . Then

    Diag(A, B)(~v ⊕ ~w) = [ 1 2 0 0 ] [ v1 ]   [ v1 + 2v2  ]
                          [ 3 4 0 0 ] [ v2 ] = [ 3v1 + 4v2 ]
                          [ 0 0 5 2 ] [ w1 ]   [ 5w1 + 2w2 ]
                          [ 0 0 6 1 ] [ w2 ]   [ 6w1 + w2  ]

                        = (v1 + 2v2 , 3v1 + 4v2 )t ⊕ (5w1 + 2w2 , 6w1 + w2 )t = A~v ⊕ B ~w.
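The multiplication rule (3) verified in Example 6.6 is also easy to test numerically. A short NumPy sketch (our own illustration; the test vectors v and w are arbitrary choices):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 2.], [6., 1.]])
# assemble Diag(A, B) as a block-diagonal matrix
D = np.block([[A, np.zeros((2, 2))], [np.zeros((2, 2)), B]])

v = np.array([1., -1.])
w = np.array([2., 0.5])
lhs = D @ np.concatenate([v, w])        # Diag(A, B)(v ⊕ w)
rhs = np.concatenate([A @ v, B @ w])    # (A v) ⊕ (B w)
print(np.allclose(lhs, rhs))  # -> True
```

The direct sum of vectors is just concatenation, so rule (3) reduces to the block structure of the matrix product.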

We are now ready to state and prove the following refinement of Theorem 6.1.
Theorem 6.2. If A = Diag(B, C), then
(4) chA (t) = chB (t) · chC (t) and EA (λ) = EB (λ) ⊕ EC (λ).
Proof. (a) Since the determinant of a block diagonal matrix is the product of the deter-
minants of the blocks, we obtain
chA (t) = det(tI−A) = det(Diag(tI−B, tI−C)) = det(tI−B) det(tI−C) = chB (t) chC (t).
(b) Suppose that B is an m × m matrix and C an n × n matrix. Now each ~u ∈ Cm+n
can be written uniquely as ~u = ~v ⊕ ~w with ~v ∈ Cm and ~w ∈ Cn , so (A − λI)~u =
Diag(B − λI, C − λI)(~v ⊕ ~w) = (B − λI)~v ⊕ (C − λI)~w by (3). Thus, ~u ∈ EA (λ) ⇔
(A − λI)~u = ~0 ⇔ (B − λI)~v = ~0 and (C − λI)~w = ~0 ⇔ ~v ∈ EB (λ) and ~w ∈ EC (λ) ⇔
~u = ~v ⊕ ~w ∈ EB (λ) ⊕ EC (λ), and so EA (λ) = EB (λ) ⊕ EC (λ), as claimed.
Remark. If A = [ B 0 ; X C ] or A = [ B X ; 0 C ], then it is still true that chA (t) = chB (t) chC (t).
However, in this case the second formula of (4) no longer holds (in general).
Proof of Theorem 6.1. It is easy to see that Theorem 6.1 follows from Theorem 6.2. Indeed, by (4), mA (λ) = multλ (chA ) = multλ (chB chC ) = multλ (chB ) + multλ (chC ) = mB (λ) + mC (λ), which proves the first equality of Theorem 6.1.
For the second we observe that, by (4) and (2), νA (λ) = dim EA (λ) = dim(EB (λ) ⊕ EC (λ)) = dim EB (λ) + dim EC (λ) = νB (λ) + νC (λ).

Example 6.7. Find chA (t) and EA (λ) when

    A = [ 2 1  0 0 ]
        [ 1 2  0 0 ]
        [ 0 0  0 1 ]
        [ 0 0 −1 2 ]

Solution. Since A = Diag(B, C), where B = [ 2 1 ; 1 2 ] and C = [ 0 1 ; −1 2 ], it is enough by
Theorem 6.2 to work out the eigenspaces and characteristic polynomials of B and C.
(a) chB (t) = (t − 2)² − 1 = t² − 4t + 3 = (t − 1)(t − 3);
    EB (1) = {c(1, −1)t : c ∈ C}, EB (3) = {c(1, 1)t : c ∈ C}, and EB (λ) = {~0} for λ ≠ 1, 3.

(b) chC (t) = −t(2 − t) + 1 = (t − 1)²;
    EC (1) = {c(1, 1)t : c ∈ C}, and EC (λ) = {~0} for λ ≠ 1.

(c) Thus, by (4):
    chA (t) = chB (t) · chC (t) = (t − 1)(t − 3) · (t − 1)² = (t − 1)³ (t − 3);
    EA (1) = EB (1) ⊕ EC (1) = {c1 (1, −1)t } ⊕ {c2 (1, 1)t }
           = {c1 (1, −1, 0, 0)t + c2 (0, 0, 1, 1)t : c1 , c2 ∈ C};
    EA (3) = EB (3) ⊕ EC (3) = {c1 (1, 1)t } ⊕ {~0}   (note: 3 is not an eigenvalue of C)
           = {c1 (1, 1, 0, 0)t : c1 ∈ C};
    EA (λ) = EB (λ) ⊕ EC (λ) = {~0} ⊕ {~0} = {~0}, if λ ≠ 1, 3.
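As a numerical sanity check of Example 6.7 (our own illustration, not part of the text), the claimed basis vectors of EA (1) and EA (3) should be annihilated by A − λI:

```python
import numpy as np

A = np.array([[2., 1.,  0., 0.],
              [1., 2.,  0., 0.],
              [0., 0.,  0., 1.],
              [0., 0., -1., 2.]])
I = np.eye(4)
# claimed eigenspace bases from Example 6.7
for lam, vecs in [(1.0, [[1, -1, 0, 0], [0, 0, 1, 1]]),
                  (3.0, [[1, 1, 0, 0]])]:
    for v in vecs:
        print(np.allclose((A - lam * I) @ np.array(v, float), 0))  # -> True each time
```

Each printed True confirms that the direct-sum decomposition of the eigenspaces matches the block structure of A.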

Another important property of algebraic and geometric multiplicities is the following.


Theorem 6.3 (Invariance Property). If B = P −1 AP , then

(5) chB (t) = chA (t) and EA (λ) = P EB (λ) := {P~v : ~v ∈ EB (λ)}.

In particular, we have

mB (λ) = mA (λ) and νB (λ) = νA (λ).

Example 6.8. Find the characteristic polynomial and the eigenspaces of

    A = P BP⁻¹ = [ 4 0 −2 ]   where  B = [ 2 1 0 ]   and   P = [ 2 1 1 ]
                 [ 1 2 −1 ],             [ 0 2 0 ]             [ 1 1 1 ]
                 [ 2 0  0 ]              [ 0 0 2 ]             [ 2 0 1 ].

Solution. First note that B = Diag(J1 , J2 ), where J1 = J(2, 2) and J2 = J(2, 1). Thus, by (5), (4) and Example 6.3, chA (t) = chB (t) = chJ1 (t) chJ2 (t) = (t − 2)² (t − 2) = (t − 2)³. Thus, λ1 = 2 is the only eigenvalue of A (and of B).
Moreover, by (4),

    EB (2) = {c1 (1, 0)t : c1 ∈ C} ⊕ {c2 (1) : c2 ∈ C} = {c1 (1, 0, 0)t + c2 (0, 0, 1)t : c1 , c2 ∈ C},

and hence, by (5),

    EA (2) = {c1 P (1, 0, 0)t + c2 P (0, 0, 1)t : c1 , c2 ∈ C}
           = {c1 (2, 1, 2)t + c2 (1, 1, 1)t : c1 , c2 ∈ C},

since P (1, 0, 0)t and P (0, 0, 1)t are just the 1st and 3rd columns of P .

Check: A(2, 1, 2)t = (P BP⁻¹)(P (1, 0, 0)t ) = P B(1, 0, 0)t = P (2, 0, 0)t = 2 (2, 1, 2)t
⇒ (2, 1, 2)t ∈ EA (2).
Proof of Theorem 6.3. (a) Recall that for any two n × n matrices X and Y we have
det(XY ) = det(X) · det(Y ), and hence also det(P −1 XP ) = det(P )−1 det(X) det(P ) =
det(X). Applying this to X = A − tI yields

chP −1 AP (t) = (−1)n det(P −1 AP − tI) = (−1)n det(P −1 (A − tI)P )


= (−1)n det(A − tI) = chA (t).

(b) We have: ~v ∈ EA (λ) ⇔ A~v = λ~v ⇔ P BP −1~v = λ~v


⇔ BP −1~v = P −1 λ~v ⇔ B(P −1~v ) = λ(P −1~v )
⇔ P −1~v ∈ EB (λ) ⇔ ~v ∈ P EB (λ).
We have thus shown that ~v ∈ EA (λ) ⇔ ~v ∈ P EB (λ), which means that EA (λ) = P EB (λ).
The last two assertions follow from (5) by taking degrees and dimensions.

Corollary. If P −1 AP = J is a Jordan matrix, then


νA (λ) = the number of Jordan blocks of J with eigenvalue λ
mA (λ) = the sum of the sizes of the Jordan blocks of J with eigenvalue λ.
Proof. Combine Theorem 6.3 with the Corollary of Theorem 6.1: by Theorem 6.3 and that Corollary,

    νA (λ) = νJ (λ) = #{Jordan blocks Jij of J with eigenvalue λ},

and the assertion about mA (λ) is proved similarly.
Remark. The above corollary represents a fundamental step towards computing the
structure of the Jordan canonical form J associated to A: we see that the algebraic and
geometric multiplicities reveal the number of Jordan blocks and the sum of their sizes.
If n ≤ 3, then the above rules already determine J, as we shall see in more detail in
the next section. However, if n ≥ 4, then this is no longer true, as the following example
illustrates.
Example 6.9. The two Jordan matrices

    J1 = [ 2 1 0 0 ]        J2 = [ 2 0 0 0 ]
         [ 0 2 0 0 ]             [ 0 2 1 0 ]
         [ 0 0 2 1 ]             [ 0 0 2 1 ]
         [ 0 0 0 2 ]             [ 0 0 0 2 ]
clearly have the same number of Jordan blocks (so νJ1 (2) = νJ2 (2) = 2), and the sum of
their sizes is also the same (so mJ1 (2) = mJ2 (2) = 4), but the Jordan matrices are not the
same (even if we rearrange the blocks). Thus, the algebraic and geometric multiplicities
alone cannot distinguish between these Jordan forms.
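Anticipating section 6.4, the two matrices of Example 6.9 can be told apart by also looking at ranks of powers of Ji − 2I. A NumPy sketch (our own illustration, not part of the text):

```python
import numpy as np

J1 = np.array([[2., 1, 0, 0], [0, 2, 0, 0], [0, 0, 2, 1], [0, 0, 0, 2]])
J2 = np.array([[2., 0, 0, 0], [0, 2, 1, 0], [0, 0, 2, 1], [0, 0, 0, 2]])
for J in (J1, J2):
    N = J - 2 * np.eye(4)
    # nullities of N and N^2: the first agrees for J1 and J2, the second does not
    print(4 - np.linalg.matrix_rank(N), 4 - np.linalg.matrix_rank(N @ N))
# J1: 2 4
# J2: 2 3
```

Both matrices have nullity 2 for the first power (two Jordan blocks each), but the nullity of the square differs, so the square of A − λI carries the extra structural information that the ordinary multiplicities miss.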

Theorem 6.4 (Jordan Canonical Form). Every square matrix A is similar to a Jordan
matrix. In other words, there is an invertible matrix P such that
P −1 AP = J = Diag(J11 , . . . , Jij , . . .)
is a block diagonal matrix consisting of Jordan blocks Jij = J(λi , kij ). Moreover:
1) The λ1 , λ2 , . . . , λs are the (distinct) eigenvalues of A.
2) The number of Jordan blocks Ji1 , Ji2 , . . . with eigenvalue λi equals the geometric
multiplicity νA (λi ) = νi (so that this list ends with Ji,νi ).
3) The sum of the sizes kij of the blocks Ji1 , Ji2 , . . . , Ji,νi with eigenvalue λi equals the
algebraic multiplicity mi = mA (λi ):
(6) ki1 + ki2 + · · · + kiνi = mi ;
in particular: νi ≤ mi .
4) The Jij ’s are uniquely determined by A up to order.

Remarks. 1) In the above statement of the Jordan canonical form the following fact is
implicitly used:
If J and J 0 are two Jordan matrices which have the same lists of Jordan blocks but in a
different order, then J and J 0 are similar, i.e. there is a matrix P such that J 0 = P −1 JP .
[To see why this is true, consider an m × m block diagonal matrix Diag(A, B) consisting
of two blocks A and B of size k × k and (m − k) × (m − k), respectively. Then we have

    Pk⁻¹ Diag(A, B) Pk = Diag(B, A),

where Pk = (~e_{k+1} |~e_{k+2} | . . . |~e_m |~e_1 | . . . |~e_k ), and, as usual, ~e_i = (0, . . . , 0, 1, 0, . . . , 0)t ∈ Cm denotes the i-th standard basis vector.
From this the above statement about the Jordan matrices follows readily. Note that
the same argument also yields the corresponding statement for arbitrary block diagonal
matrices.]
As a result of this fact, we can always choose the matrix P in Theorem 6.4 in such
a way that, after fixing an ordering λ1 , . . . , λs of the eigenvalues, the Jordan matrix has
the form J = Diag(J11 , . . . , Jij , . . .), where the Jordan blocks Jij = J(λi , kij ) are ordered
in decreasing size (for each eigenvalue λi ), i.e. we have
ki1 ≥ ki2 ≥ . . . ≥ kiνi ;
such a Jordan matrix will be said to be in standard form. For example, J = Diag(J(1, 2),
J(1, 1), J(2, 2), J(−1, 3), J(−1, 2)) is in standard form but Diag(J(2, 2), J(2, 3)) is not.
2) Conversely, suppose J and J 0 are two Jordan matrices which are similar. Then the
lists of Jordan blocks of J and J 0 are the same (up to order), as we shall see later (cf.
Theorem 6.5, Corollary).
Corollary 1. Two m × m matrices A and B are similar (that is, B = P −1 AP for some
P ) if and only if they have the same Jordan canonical form (up to order) J.
Proof. Let J and J 0 be the Jordan canonical forms of A and B, respectively. Since A
is similar to J and B is similar to J 0 , it follows that A and B are similar if and only if
J and J 0 are similar. Now by the above remark, J and J 0 are similar if and only if they
are identical up to the order of their blocks, and so the assertion follows.
Corollary 2. A matrix A is diagonable if and only if the algebraic and geometric mul-
tiplicities of all its eigenvalues λi are the same:
νA (λ1 ) = mA (λ1 ), . . . , νA (λs ) = mA (λs ).
Proof. First note that if A is diagonable, then the associated diagonal matrix P −1 AP
is the Jordan canonical form of A, and conversely, if the JCF of A is a diagonal matrix,
then A is clearly diagonable. Thus:
A is diagonable ⇔ its associated Jordan form is a diagonal matrix
⇔ all Jordan blocks have size 1 × 1
⇔ kij = 1 for all i, j
⇔ νA (λi ) = mA (λi ), 1 ≤ i ≤ s, by (6).
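Corollary 2 gives a practical diagonalizability test. The following NumPy sketch (a hedged helper of our own; the name is_diagonable and the numerical tolerances are our assumptions, not the text's) compares the algebraic and geometric multiplicity of each eigenvalue:

```python
import numpy as np

def is_diagonable(A):
    """A is diagonable iff nu_A(lam) = m_A(lam) for every eigenvalue lam (Corollary 2)."""
    m = A.shape[0]
    eig = np.linalg.eigvals(A)
    # group numerically-close eigenvalues
    lams = []
    for e in eig:
        if not any(abs(e - l) < 1e-6 for l in lams):
            lams.append(e)
    for lam in lams:
        m_alg = int(np.sum(np.abs(eig - lam) < 1e-6))
        nu = m - np.linalg.matrix_rank(A - lam * np.eye(m), tol=1e-6)
        if nu != m_alg:
            return False
    return True

print(is_diagonable(np.array([[1., 1.], [0., 1.]])))  # Jordan block -> False
print(is_diagonable(np.diag([1., 2., 2.])))           # diagonal matrix -> True
```

For matrices with clustered floating-point eigenvalues this test is numerically delicate; for exact arithmetic one would use a computer algebra system instead.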

Remark. By using terminology which will be studied in more detail in the next chapter,
the above Corollary 2 can be rephrased more elegantly as follows:
Corollary 2′. A matrix A is diagonable if and only if all its eigenvalues are regular.
Here, an eigenvalue λ of A is called regular if its algebraic and geometric multiplicities
coincide, i.e. if mA (λ) = νA (λ).
Indeed, the above proof (or equation (6)) shows more precisely that
Corollary 3. An eigenvalue λ of A is regular if and only if all its Jordan blocks (in the
associated Jordan canonical form J of A) have size 1 × 1.

Exercises 6.2.

1. Find all the eigenvalues, their algebraic and geometric multiplicities and their associated eigenspaces of the matrix A when:

    (a) A = [  2 1  0 0 ]        (b) A = P BP⁻¹, where
            [ −1 4  0 0 ]
            [ −1 1  2 1 ]
            [ −1 1 −1 4 ]

        [   1  1 1 1  1   1    1 ]            [ 1 1 0 0 0 0 0 ]
        [   3  2 1 0 −1  −2   −3 ]            [ 0 1 1 0 0 0 0 ]
        [   9  4 1 0  1   4    9 ]            [ 0 0 1 0 0 0 0 ]
    P = [  27  8 1 0 −1  −8  −27 ]   and  B = [ 0 0 0 1 1 0 0 ].
        [  81 16 1 0  1  16   81 ]            [ 0 0 0 0 1 0 0 ]
        [ 243 32 1 0 −1 −32 −243 ]            [ 0 0 0 0 0 2 1 ]
        [ 729 64 1 0  1  64  729 ]            [ 0 0 0 0 0 0 2 ]

Hint: In (b), you shouldn’t have to do any calculations.

2. Write down two 4 × 4 Jordan matrices which are not similar and yet have the
same eigenvalues and the same algebraic and geometric multiplicities. Justify your
answer.

3. Find a Jordan matrix J such that

       P JP⁻¹ = [  4 1 −1 ]
                [ −2 1  1 ]
                [  2 1  1 ]

for some invertible matrix P . [Do not find P !]

4. Find all the Jordan matrices J (in standard form) with chJ (t) = (t − 1)2 (t − 2)4 .

5. Consider the Jordan matrices

       J1 = [ 1 1 0 ]        J2 = [ 1 0 0 ]
            [ 0 1 0 ]             [ 0 1 1 ]
            [ 0 0 1 ]             [ 0 0 1 ]

(a) Which of these is in standard form?


(b) Find a matrix P such that J1 = P −1 J2 P .
(c) Find a matrix Q such that Q⁻¹JQ is in standard form, where

        J = Diag(J1 , J2 ) = [ J1 0 ; 0 J2 ] = [ 1 1 0 0 0 0 ]
                                               [ 0 1 0 0 0 0 ]
                                               [ 0 0 1 0 0 0 ]
                                               [ 0 0 0 1 0 0 ]
                                               [ 0 0 0 0 1 1 ]
                                               [ 0 0 0 0 0 1 ]

6. (a) Suppose that A1 and A2 are two square matrices with characteristic polynomials

        chAj (t) = (t − λ1 )^m1j (t − λ2 )^m2j · · · (t − λs )^msj ,

    where mij ≥ 0 for 1 ≤ i ≤ s and j = 1, 2. Let Eik(Aj) denote the ik-th constituent
    matrix of Aj for 1 ≤ i ≤ s and 0 ≤ k ≤ mij − 1, and put Eik(Aj) = 0 if k ≥ mij . Show
    that the constituent matrices Eik(A) of A = Diag(A1 , A2 ) are given by

        Eik(A) = Diag(Eik(A1), Eik(A2)),   1 ≤ i ≤ s, 0 ≤ k ≤ mi1 + mi2 − 1.

(b) Use this formula to find the constituent matrices of the Jordan matrices

(i) J = Diag(J(−1, 2), J(1, 2)) and (ii) J 0 = Diag(J(−1, 2), J(−1, 3)).

7. Let a0 , a1 , . . . , an−1 ∈ C and consider the matrix

        A = [ 0   1    0   . . .   0   ]
            [ 0   0    1   . . .   0   ]
            [ ⋮             ⋱     ⋮   ]
            [ 0   0   . . .  0     1   ]
            [ a0  a1  . . . an−2  an−1 ]

(a) Show that the geometric multiplicity of every eigenvalue λ of A is νA (λ) = 1.


(b) Show that A is diagonable if and only if chA (t) has n distinct roots. [Recall
from section 5.9 that chA (t) = tn − an−1 tn−1 − . . . − a1 t − a0 , but you don’t
need this here.]

6.3 How to find P such that P⁻¹AP = J (for m ≤ 3)


Before explaining the general procedure for finding the Jordan canonical form J (and the
associated matrix P ) of an m × m matrix A, let us first look at the special case m ≤ 3.
The advantage of the case m ≤ 3 is that the algebraic and geometric multiplicities
suffice for finding the Jordan canonical form; this is no longer true if m ≥ 4; cf. Example 6.9.
Nevertheless, in calculating the associated matrix P , we are naturally led to a method
which can be generalized to larger matrices, as will become evident in the next sections.
This method consists of looking at the so-called generalized eigenvectors, which we will
need here only in special cases. The basic idea is the following.
Basic Idea: In order to find P = (~v1 | . . . |~vn ) such that P −1 AP = J, write this equation
as
AP = P J.
By using the identities

AP = (A~v1 | · · · |A~vn ),
P (a1 , . . . , an )t = a1~v1 + · · · + an~vn ,

the equation AP = P J translates into a set of (vector) equations for the ~vi ’s which we
can solve. The following examples show how this method works.
 
Example 6.10. If A = [ 0 1 ; −1 −2 ], find P such that J = P⁻¹AP is a Jordan matrix.
Solution. The procedure naturally divides into two steps.
Step 1. Find the Jordan canonical form J of A.
(i) The characteristic polynomial of A is

    chA (t) = (−1)² det [ −t     1   ] = t(t + 2) + 1 = (t + 1)² ,
                        [ −1  −2 − t ]

and so λ1 = −1 is the only eigenvalue; it has algebraic multiplicity m1 = mA (λ1 ) = 2.


Thus, the sum of the sizes of the Jordan blocks of J is m1 = 2.
(ii) Since

    A + I = [  1  1 ]  →  [ 1 1 ]
            [ −1 −1 ]     [ 0 0 ],

it follows that the λ1 -eigenspace is EA (−1) = {c(1, −1)t : c ∈ C}; in particular, ν1 = 1.
Thus, we have 1 Jordan block.
(iii) By combining (i) and (ii) we can conclude:

    m1 = 2 and ν1 = 1  ⇒  1 Jordan block of size 2 (with eigenvalue λ1 = −1)

                       ⇒  J = [ −1  1 ]   is the associated Jordan canonical form.
                              [  0 −1 ]

Step 2. Find P such that P⁻¹AP = J or, equivalently, such that AP = P J.

Write P = (~v1 |~v2 ), with ~v1 , ~v2 ∈ C². Since

    AP = (A~v1 |A~v2 ),
    P J = (~v1 |~v2 ) [ −1 1 ; 0 −1 ] = (−~v1 |~v1 − ~v2 ),

we want to choose ~v1 , ~v2 in such a way that

    1) A~v1 = −~v1        }
    2) A~v2 = ~v1 − ~v2    }  (⇔ AP = P J)
    3) ~v1 , ~v2 are linearly independent (⇔ P is invertible).

Observations: (a) The equations 1) and 2) can also be written in the form

    1′) (A + I)~v1 = ~0,
    2′) (A + I)~v2 = ~v1 .

(b) By 2′) and 1′), these equations imply that (A + I)²~v2 = (A + I)~v1 = ~0, i.e.

    4) (A + I)²~v2 = ~0.

(c) Conversely, if we pick ~v2 such that 4) holds and define ~v1 := (A + I)~v2 , then both
1) and 2) hold.

(d) However: we have to pick ~v2 carefully so that condition 3) holds. It turns out
that condition 3) will hold if (and only if) we take

    3′) ~v2 ∉ EA (−1).

[Indeed, suppose that ~v2 ∉ EA (−1) (and satisfies 4)); then ~v1 := (A + I)~v2 ≠ ~0. Now
if c1~v1 + c2~v2 = ~0, then applying A + I yields ~0 = (A + I)(c1~v1 + c2~v2 ) = c2~v1 (because
(A + I)~v1 = ~0 by 1′)), so c2 = 0. But then c1~v1 = ~0, so c1 = 0, and hence ~v1 and ~v2 are
linearly independent.]
These observations lead to the following strategy:

    Pick ~v2 ∉ EA (−1) such that (A + I)²~v2 = ~0,
    put ~v1 = (A + I)~v2 .
    Then P = (~v1 |~v2 ) is invertible and P⁻¹AP = [ −1 1 ; 0 −1 ].

Let us apply this strategy here. Since (A + I)² = 0 (either by direct computation or
by using the Cayley-Hamilton Theorem: (A + I)² = chA (A) = 0), it follows that every
~v ∈ C² satisfies 4). Thus, pick any ~v2 ∉ EA (−1) = {c(1, −1)t }; take, for example, ~v2 = (1, 0)t .
Then

    ~v1 = (A + I)~v2 = [  1  1 ] [ 1 ] = [  1 ]
                       [ −1 −1 ] [ 0 ]   [ −1 ],

so

    P = (~v1 |~v2 ) = [  1 1 ]
                     [ −1 0 ]

is the desired matrix.

Check: P⁻¹AP = [ 0 −1 ; 1 1 ] [ 0 1 ; −1 −2 ] [ 1 1 ; −1 0 ] = [ −1 1 ; 0 −1 ] = J.

Remark. Note that the above procedure depends on picking a vector ~v2 in the space

    EA²(λ) := {~v ∈ Cn : (A − λI)²~v = ~0};

such a vector is called a generalized eigenvector (of order ≤ 2).
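The strategy of Example 6.10 is easy to carry out numerically. The following NumPy sketch (our own illustration, not part of the text) picks ~v2 = (1, 0)t exactly as in the example, forms the chain ~v1 = (A + I)~v2 , and verifies that P⁻¹AP = J:

```python
import numpy as np

# A and J from Example 6.10; v2 is any vector outside E_A(-1) = span{(1, -1)^t}
A = np.array([[0., 1.], [-1., -2.]])
I = np.eye(2)
v2 = np.array([1., 0.])
v1 = (A + I) @ v2              # generalized eigenvector chain: (A + I)v1 = 0
P = np.column_stack([v1, v2])  # P = (v1 | v2)
J = np.linalg.inv(P) @ A @ P
print(np.allclose(J, [[-1., 1.], [0., -1.]]))  # -> True
```

Any other choice of ~v2 outside EA (−1) would work equally well; only the columns of P change, not J.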


 
Example 6.11. If

    A = [ 1  2 −1 ]
        [ 0  2  0 ]
        [ 1 −2  3 ],

find P such that P⁻¹AP is a Jordan matrix.
Solution. We shall follow the steps of the previous example.
Step 1. Find the associated Jordan canonical form J.
(i) Expanding the determinant along the 2nd row yields

    chA (t) = (−1)³ (2 − t) det [ 1 − t   −1  ] = (t − 2)[(1 − t)(3 − t) + 1] = (t − 2)³ .
                                [   1    3 − t ]

Thus, the only eigenvalue is λ1 = 2; its algebraic multiplicity is m1 = 3.


(ii) By row reduction we get

    A − 2I = [ −1  2 −1 ]  →  [ 1 −2 1 ]
             [  0  0  0 ]     [ 0  0 0 ]
             [  1 −2  1 ]     [ 0  0 0 ].

Thus, the 2-eigenspace is EA (2) = {c1 (2, 1, 0)t + c2 (−1, 0, 1)t : c1 , c2 ∈ C}, and hence
ν1 = 2. Therefore, J has 2 Jordan blocks.

(iii) From (i) and (ii) we conclude that

    J = [ 2 1 0 ]
        [ 0 2 0 ]
        [ 0 0 2 ].
Step 2. Find P such that AP = P J.

Write P = (~v1 |~v2 |~v3 ), where ~v1 , ~v2 , ~v3 ∈ C³. Then we want to choose the ~vi 's such that
AP = P J, i.e. such that (A~v1 |A~v2 |A~v3 ) = (2~v1 |~v1 + 2~v2 |2~v3 ). Thus we want:

    A~v1 = 2~v1          ⇔  (A − 2I)~v1 = ~0
    A~v2 = ~v1 + 2~v2     ⇔  (A − 2I)~v2 = ~v1
    A~v3 = 2~v3          ⇔  (A − 2I)~v3 = ~0

In addition, we need to pick the ~vi 's such that ~v1 , ~v2 and ~v3 are linearly independent.
Following the same line of thought as in the previous example, this leads to the

Strategy: pick ~v2 ∈ EA²(2), not in EA (2);
          define ~v1 := (A − 2I)~v2 ;
          pick ~v3 ∈ EA (2), linearly independent from ~v1 , ~v2 .

Now EA (2) = {c1 (2, 1, 0)t + c2 (1, 0, −1)t } (cf. step 1) and EA²(2) = C³ (since (A − 2I)² = 0).
Thus, take ~v2 = (1, 0, 0)t ; then ~v1 = (A − 2I)~v2 = (−1, 0, 1)t . Also take ~v3 = (2, 1, 0)t ∈ EA (2). Then

    P = (~v1 |~v2 |~v3 ) = [ −1 1 2 ]
                          [  0 0 1 ]
                          [  1 0 0 ].

  
Check: P⁻¹AP = [ 0  0 1 ] [ 1  2 −1 ] [ −1 1 2 ]   [ 1 −2 3 ] [ −1 1 2 ]   [ 2 1 0 ]
               [ 1 −2 1 ] [ 0  2  0 ] [  0 0 1 ] = [ 2 −4 2 ] [  0 0 1 ] = [ 0 2 0 ] = J;
               [ 0  1 0 ] [ 1 −2  3 ] [  1 0 0 ]   [ 0  2 0 ] [  1 0 0 ]   [ 0 0 2 ]

in the above, P⁻¹ was computed by row reducing (P |I) → (I|P⁻¹):

    [ −1 1 2 | 1 0 0 ]     [ 1 0 0 | 0 0 1 ]     [ 1 0 0 | 0  0 1 ]
    [  0 0 1 | 0 1 0 ]  →  [ 0 1 2 | 1 0 1 ]  →  [ 0 1 0 | 1 −2 1 ].
    [  1 0 0 | 0 0 1 ]     [ 0 0 1 | 0 1 0 ]     [ 0 0 1 | 0  1 0 ]

 
Example 6.12. Find P such that P⁻¹AP is a Jordan matrix when

    A = [ 2  2 1 ]
        [ 0  3 1 ]
        [ 0 −1 1 ].
Step 1. Find the Jordan canonical form J.
(i) By expanding the determinant along the first column, we get

    chA (t) = (t − 2) det [ 3 − t    1   ] = (t − 2)[(3 − t)(1 − t) + 1] = (t − 2)³ ,
                          [  −1    1 − t ]

so λ1 = 2 and m1 = 3.
(ii) Since

    A − 2I = [ 0  2  1 ]  →  [ 0 2 1 ]
             [ 0  1  1 ]     [ 0 0 1 ]
             [ 0 −1 −1 ]     [ 0 0 0 ],

we see that the 2-eigenspace is EA (2) = {c(1, 0, 0)t : c ∈ C}. Thus, ν1 = 1, and so J
consists of a single Jordan block:

    J = [ 2 1 0 ]
        [ 0 2 1 ]
        [ 0 0 2 ].
Step 2. Find P such that AP = P J.

Again, write P = (~v1 |~v2 |~v3 ), and choose the ~vi 's such that AP = P J, i.e. such that

    (A~v1 |A~v2 |A~v3 ) = (2~v1 |~v1 + 2~v2 |~v2 + 2~v3 ).

Thus we want:

    A~v1 = 2~v1          ⇔  (A − 2I)~v1 = ~0
    A~v2 = ~v1 + 2~v2     ⇔  (A − 2I)~v2 = ~v1
    A~v3 = ~v2 + 2~v3     ⇔  (A − 2I)~v3 = ~v2

Extending the reasoning of Example 6.10, we see that all these conditions are satisfied if
we pick ~v3 such that

    ~v3 ∈ EA³(2) := {~v ∈ C³ : (A − 2I)³~v = ~0},

and then define ~v2 = (A − 2I)~v3 and ~v1 = (A − 2I)~v2 . In addition, we need to pick the
~vi 's to be linearly independent, and this means that we must require that ~v3 ∉ EA²(2).
We thus have the following

Strategy: pick ~v3 ∈ EA³(2), not in EA²(2),
          define ~v2 := (A − 2I)~v3 ,
          define ~v1 := (A − 2I)~v2 .
For this, we first need to compute the generalized eigenspaces EA²(2) and EA³(2). Since

    (A − 2I)² = [ 0  2  1 ] [ 0  2  1 ]   [ 0 1 1 ]
                [ 0  1  1 ] [ 0  1  1 ] = [ 0 0 0 ],
                [ 0 −1 −1 ] [ 0 −1 −1 ]   [ 0 0 0 ]

it follows that EA²(2) = {c1 (0, 1, −1)t + c2 (1, 0, 0)t : c1 , c2 ∈ C}. Moreover, EA³(2) = C³
since (A − 2I)³ = 0, as can be seen either by a direct computation or by applying the
Cayley-Hamilton Theorem: (A − 2I)³ = chA (A) = 0. Thus

    EA (2) = Nullsp(A − 2I) = {c(1, 0, 0)t },
    EA²(2) = Nullsp((A − 2I)²) = {c1 (0, 1, −1)t + c2 (1, 0, 0)t },
    EA³(2) = Nullsp((A − 2I)³) = C³.

Take ~v3 = (0, 0, 1)t ∈ C³; note that ~v3 ∉ EA²(2). Then ~v2 = (A − 2I)~v3 = (1, 1, −1)t ∈
EA²(2) and ~v1 = (A − 2I)~v2 = (1, 0, 0)t ∈ EA (2). Thus

    P = (~v1 |~v2 |~v3 ) = [ 1  1 0 ]
                          [ 0  1 0 ]     satisfies  P⁻¹AP = J.
                          [ 0 −1 1 ]
   
Check: P⁻¹AP = [ 1 −1 0 ] [ 2  2 1 ] [ 1  1 0 ]   [ 2 −1 0 ] [ 1  1 0 ]   [ 2 1 0 ]
               [ 0  1 0 ] [ 0  3 1 ] [ 0  1 0 ] = [ 0  3 1 ] [ 0  1 0 ] = [ 0 2 1 ] = J,
               [ 0  1 1 ] [ 0 −1 1 ] [ 0 −1 1 ]   [ 0  2 2 ] [ 0 −1 1 ]   [ 0 0 2 ]

where (as in the previous example) we have computed P⁻¹ by row reduction:

    (P |I) = [ 1  1 0 | 1 0 0 ]     [ 1 1 0 | 1 0 0 ]     [ 1 0 0 | 1 −1 0 ]
             [ 0  1 0 | 0 1 0 ]  →  [ 0 1 0 | 0 1 0 ]  →  [ 0 1 0 | 0  1 0 ].
             [ 0 −1 1 | 0 0 1 ]     [ 0 0 1 | 0 1 1 ]     [ 0 0 1 | 0  1 1 ]
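The chain construction of Example 6.12 in code (our own illustration, not part of the text): pick ~v3 outside EA²(2), then peel off ~v2 and ~v1 with A − 2I.

```python
import numpy as np

A = np.array([[2., 2., 1.], [0., 3., 1.], [0., -1., 1.]])
N = A - 2 * np.eye(3)
v3 = np.array([0., 0., 1.])   # in E_A^3(2) = C^3 but not in E_A^2(2)
v2 = N @ v3                   # -> (1, 1, -1)^t
v1 = N @ v2                   # -> (1, 0, 0)^t, an ordinary eigenvector
P = np.column_stack([v1, v2, v3])
print(np.allclose(np.linalg.inv(P) @ A @ P,
                  [[2, 1, 0], [0, 2, 1], [0, 0, 2]]))  # -> True
```

Exercise 3(c) below formalizes exactly this pattern: the columns of P are B^{m−1}~v , . . . , B~v , ~v for B = A − λI.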

Remark. For matrices of size ≥ 4 a similar method also works, once we can complete step 1, i.e. predict the matrix J. As long as the geometric multiplicity of every
eigenvalue satisfies νA (λ) ≤ 3, the above method generalizes without much change, but
not when some νA (λ) > 3. However, we shall see presently how to do this in general!

Exercises 6.3.

1. Find an invertible matrix P such that P⁻¹AP is in Jordan canonical form, where

       (a) A = [ 1 −2 ]        (b) A = [ 2 1 −1 ]
               [ 2  5 ]                [ 0 1  0 ]
                                       [ 1 1  0 ].

   Also, find the Jordan canonical form of A in each case.

2. Find a matrix P such that P⁻¹AP is in Jordan canonical form when

       (a) A = [ −2  1 −1 ]        (b) A = [ −2 0 −2 ]
               [ −6  4 −1 ]                [ −6 2 −3 ]
               [  8 −2  4 ]                [  8 0  6 ].

3. (a) Suppose B is a 3 × 3 matrix such that B³ = 0, and there exists ~v ∈ C³ such that
   B²~v ≠ ~0. Show that ~v , B~v , B²~v are linearly independent and that we have

       P⁻¹BP = J(0, 3) = [ 0 1 0 ]
                         [ 0 0 1 ]     if P = (B²~v |B~v |~v ).
                         [ 0 0 0 ]

(b) Let A be a matrix with characteristic polynomial chA (t) = (t − λ)3 . Suppose
there exists a vector ~v ∈ C3 such that B 2~v 6= ~0, where B = A − λI. Show that
P = (B 2~v |B~v |~v ) is invertible and that P −1 AP = J(λ, 3).
(c) More generally, suppose A is an m × m matrix with characteristic polynomial
chA (t) = (t − λ)m and that there exists a vector ~v ∈ Cm such that B m−1~v 6= ~0,
where B = A − λI. Show that P = (B m−1~v |B m−2~v | . . . |B~v |~v ) is invertible and that
P −1 AP = J(λ, m).

6.4 Generalized Eigenvectors and the JCF


While for m ≤ 3 the algebraic and geometric multiplicities of the eigenvalues of A deter-
mine the Jordan canonical form (JCF), this is no longer true for m ≥ 4, as Example 6.9
shows. For this reason we need to look at generalized eigenspaces.
Definition. Let A be an m × m matrix and λ ∈ C. If p ≥ 1 is an integer, then the p-th
generalized eigenspace of A with respect to λ is the subspace

    EA^p(λ) = Nullsp((A − λI)^p) = {~v ∈ Cm : (A − λI)^p ~v = ~0}.

Its dimension

    νA^p(λ) := dim EA^p(λ) = m − rank((A − λI)^p)

is called the p-th geometric multiplicity of λ in A, and any vector ~v ∈ EA^p(λ) is called a
generalized λ-eigenvector of A of order ≤ p.
Remark. The generalized eigenspaces fit into an increasing sequence of subspaces
{0} ⊂ EA1 (λ) = EA (λ) ⊂ EA2 (λ) ⊂ · · · ⊂ EAp (λ) ⊂ · · · ⊂ Cm
(the first term EA1 (λ) being the usual eigenspace),

for if ~v ∈ EAp (λ), then B p~v = ~0, where B = A − λI, and hence also B p+1~v = B(B p~v ) =
B~0 = ~0, i.e. ~v ∈ EAp+1 (λ). Thus, the generalized geometric multiplicities satisfy the
inequalities
0 ≤ νA1 (λ) = νA (λ) ≤ νA2 (λ) ≤ . . . ≤ νAp (λ) ≤ . . . ≤ m.

Notation. We denote the sequence of generalized geometric multiplicities by


νA∗ (λ) = (νA1 (λ), νA2 (λ), . . . , νAp (λ), . . .).

Example 6.13. If J = J(λ, k) is a Jordan block of size k, then (J − λI)p = J(0, k)p is
the k × k matrix whose only nonzero entries are 1’s on the p-th superdiagonal, i.e. in the
positions (1, p + 1), (2, p + 2), . . . , (k − p, k). Thus, for p < k we have

EJp (λ) = Nullsp((J − λI)p ) = {c1~e1 + · · · + cp~ep },
whereas EJp (λ) = Nullsp(0) = Ck if p ≥ k. Thus, for all p ≥ 1 we have

(7)    νJp (λ) = min(p, k)    (= p if p ≤ k, and = k if p ≥ k),
and hence
νJ∗ (λ) = (1, 2, 3, . . . , k − 1, k, k, . . .).
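Formula (7) can be verified numerically: since νJp (λ) = k − rank((J − λI)p ), a few lines of numpy (an illustrative sketch, not part of the text) reproduce the stabilizing sequence (1, 2, . . . , k, k, . . .).

```python
import numpy as np

def jordan_block(lam, k):
    """The k x k Jordan block J(lam, k)."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def nu(A, lam, p):
    """p-th geometric multiplicity: dim Nullsp((A - lam I)^p)."""
    m = A.shape[0]
    B = np.linalg.matrix_power(A - lam * np.eye(m), p)
    return m - np.linalg.matrix_rank(B)

J = jordan_block(4.0, 5)
seq = [nu(J, 4.0, p) for p in range(1, 8)]
print(seq)   # min(p, 5) for each p: [1, 2, 3, 4, 5, 5, 5]
```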

Theorem 6.5 (Properties of generalized eigenvectors). Let λ ∈ C and p ≥ 1.


(a) If A = Diag(B, C), then EAp (λ) = EBp (λ)⊕ECp (λ) and hence νAp (λ) = νBp (λ)+νCp (λ).
(b) If B = P −1 AP, then EAp (λ) = P EBp (λ); in particular, νBp (λ) = νAp (λ).
(c) If A is similar to a Jordan matrix J = Diag(. . . , J(λi , kij ), . . .), then

(8) νAp (λ) − νAp−1 (λ) = #(Jordan blocks J(λi , kij ) of J with λi = λ and kij ≥ p).

(d) If νAp+1 (λ) = νAp (λ), then νAp+q (λ) = νAp (λ), for all q ≥ 1.

Proof. (a) Since (A − λI)p = Diag(B − λI, C − λI)p = Diag((B − λI)p , (C − λI)p ), we
have (cf. the proof of Theorem 6.3)

EAp (λ) = Nullsp((A − λI)p ) = Nullsp(Diag((B − λI)p , (C − λI)p ))

       = Nullsp((B − λI)p ) ⊕ Nullsp((C − λI)p ) = EBp (λ) ⊕ ECp (λ).

This proves the first statement of (a) and the second follows from the first by taking
dimensions.
(b) We have (B − λI)p = P −1 (A − λI)p P, and so

~v ∈ EAp (λ) ⇔ (A − λI)p~v = ~0
           ⇔ P −1 (A − λI)p P P −1~v = ~0
           ⇔ (B − λI)p P −1~v = ~0
           ⇔ P −1~v ∈ EBp (λ),

which means that EAp (λ) = P EBp (λ). Taking dimensions yields νAp (λ) = νBp (λ).
(c) Suppose first that A = J(λ, m) is a Jordan block. Then by Example 6.13 we have

(9)    νAp (λ) − νAp−1 (λ) = min(p, m) − min(p − 1, m)    (= 1 if p ≤ m, and = 0 if p > m).

Thus, the formula (8) is true for Jordan blocks.


Next, suppose that A = J = Diag(J11 , . . . , Jij , . . . ) is a Jordan matrix, where Jij =
J(λi , kij ). Then by (a) and (9) we have

νJp (λi ) − νJp−1 (λi ) = Σj ( νJij p (λi ) − νJij p−1 (λi ) ) = #{j : kij ≥ p},

which proves formula (8) for Jordan matrices.


Finally, if A = P JP −1 , then by (b) we have νAp (λ) − νAp−1 (λ) = νJp (λ) − νJp−1 (λ), and
so formula (8) follows from what was just proved.
(d) We first note that if A is similar to a Jordan matrix J, then the assertion is clear
by (c). Indeed, if νAp (λ) = νAp+1 (λ), then it follows from (c) that J has no Jordan blocks
of size ≥ p + 1, hence also none of size ≥ r for any r ≥ p + 1, and so, again by (c),
νAr (λ) = νAr−1 (λ) for all r > p, which means that νAp+q (λ) = νAp (λ) for all q ≥ 1.

Now although every matrix A is indeed similar to a Jordan matrix (Jordan’s theorem),
we do not want to use this fact here, and so we give a direct proof of (d). This proof is
based on the following formula which is also interesting in itself:

(10) νAp+1 (λ) − νAp (λ) = dim(Im(A − λI) ∩ EAp (λ)),

in which Im(B) = {B~v : ~v ∈ Cn } denotes (as usual) the image space of a matrix B (also
called the range or column space of B).
From this formula (10) the assertion follows immediately: if ν p+1 = ν p , then also
E p+1 = E p , and hence by (10) we get ν p+2 − ν p+1 = dim(Im(B) ∩ E p+1 ) =
dim(Im(B) ∩ E p ) = ν p+1 − ν p = 0. Thus ν p+2 = ν p+1 , and so the claim follows by
induction.
It thus remains to verify (10). For this, put B = A − λI. Then we have

(11) BEAp+1 (λ) = Im(B) ∩ EAp (λ).

Indeed, if w~ ∈ BEAp+1 (λ), i.e. w~ = B~v with ~v ∈ EAp+1 (λ), then B p w~ = B p (B~v ) =
B p+1~v = ~0, and so w~ ∈ Im(B) ∩ EAp (λ). Conversely, if w~ = B~v ∈ Im(B) ∩ EAp (λ), then
B p+1~v = B p w~ = ~0, so ~v ∈ EAp+1 (λ) and w~ = B~v ∈ BEAp+1 (λ). Thus, equality holds in (11).
Taking dimensions in (11) yields

dim(Im(B) ∩ EAp (λ)) = dim(BEAp+1 (λ)) = dim(EAp+1 (λ)) − dim(Nullsp(B) ∩ E p+1 ),

where the latter equality follows from the general fact (the rank-nullity theorem) that
for any subspace V we have dim BV = dim V − dim(V ∩ Nullsp(B)). Now since here
Nullsp(B) = EA (λ) ⊂ EAp+1 (λ), it follows that dim(Nullsp(B) ∩ EAp+1 (λ)) = dim EA (λ) =
νA (λ), and so (10) follows since νAp+1 (λ) = dim EAp+1 (λ) by definition.
Remark. By part (d) we see that the generalized geometric multiplicities νAp (λ) exhibit
the following growth pattern:

νA1 (λ) < . . . < νAp (λ) = νAp+1 (λ) = . . . = νAp+q (λ) = . . . .

This raises the question: what is the value to which the geometric multiplicities stabilize?
Now it turns out (but this is more difficult to prove) that this stabilizing value is precisely
the algebraic multiplicity:

νAp (λ) = νAp+1 (λ) ⇔ νAp (λ) = mA (λ).

Corollary. The numbers νAp (λ) determine the Jordan canonical form J of A by taking
second differences. More precisely, the number npA (λ) of Jordan blocks of J of type J(λ, p)
is given by the formula

(12) npA (λ) = ∆2 νAp (λ) := (νAp (λ) − νAp−1 (λ)) − (νAp+1 (λ) − νAp (λ)).

In particular, two Jordan matrices J and J 0 are similar if and only if νJp (λ) = νJp0 (λ), for
all p ≥ 1 and λ ∈ C.
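The corollary translates directly into a small function. The sketch below (illustrative, with hypothetical names) takes the sequence (νA1 (λ), νA2 (λ), . . .), assumed to have stabilized, and returns the block counts npA (λ) via the second differences of (12).

```python
def block_counts(nu_seq, max_p):
    """n^p = (nu^p - nu^{p-1}) - (nu^{p+1} - nu^p), using nu^0 = 0 and
    extending the (stabilized) sequence by repeating its last value."""
    nu = [0] + list(nu_seq)
    nu += [nu[-1]] * max(0, max_p + 2 - len(nu))
    return [(nu[p] - nu[p - 1]) - (nu[p + 1] - nu[p])
            for p in range(1, max_p + 1)]

# The sequence (2, 4, 5, 5, 5, ...) of Example 6.14 below:
print(block_counts([2, 4, 5, 5, 5], 4))   # [0, 1, 1, 0]
```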

Proof. By formula (8) we have (νAp (λ) − νAp−1 (λ)) − (νAp+1 (λ) − νAp (λ)) = #(Jordan
blocks of size ≥ p) − #(Jordan blocks of size ≥ p + 1) = npA (λ), which is (12).
If J and J 0 are similar, then by Theorem 6.5b) we have that νJp (λ) = νJp0 (λ), for all
p ≥ 1 and λ ∈ C. Conversely, if all these numbers are equal, then it follows from (12)
that J and J 0 have exactly the same number of Jordan blocks of each type and hence
are equal up to order. By the remark after Theorem 6.4 we know that then J and J 0 are
similar.

Example 6.14. Determine the Jordan blocks of a Jordan matrix A with characteristic
polynomial chA (t) = (t − 7)5 and generalized geometric multiplicities

νA∗ (7) = (2, 4, 5, 5, 5, . . . )

Solution. By the above corollary, we have to determine the second differences ∆2 νA∗ ,
which can be calculated by using the following scheme:
p          0    1    2    3    4    5
νA∗        0    2    4    5    5    5
∆νA∗         2    2    1    0    0
∆2 νA∗          0    1    1    0

Note that we added an extra column for p = 0 in the above table in order to be able to
compute the first column of ∆2 νA∗ .
Conclusion. A has: 0 blocks of size 1×1
1 block of size 2×2
1 block of size 3×3
0 blocks of size 4 × 4 etc.
 
                                ( 7 1 0 0 0 )
                                ( 0 7 1 0 0 )
Thus: A = Diag(J(7, 3), J(7, 2)) = ( 0 0 7 0 0 )    (up to order).
                                ( 0 0 0 7 1 )
                                ( 0 0 0 0 7 )
Remark. In place of the above calculation scheme, we could have used instead the
following longer but more detailed analysis:
νA4 − νA3 = 5 − 5 = 0  ⇒  0 blocks of size ≥ 4
νA3 − νA2 = 5 − 4 = 1  ⇒  1 block of size ≥ 3    (hence 1 block of size exactly 3)
νA2 − νA1 = 4 − 2 = 2  ⇒  2 blocks of size ≥ 2    (hence 1 block of size exactly 2)
νA1 − νA0 = 2 − 0 = 2  ⇒  2 blocks of size ≥ 1    (hence 0 blocks of size exactly 1)
 
                                                    ( 7 0 0 1 0 )
                                                    ( 0 7 0 0 0 )
Example 6.15. Find the Jordan canonical form of A = ( 1 0 7 0 0 ) .
                                                    ( 0 0 0 7 0 )
                                                    ( 0 1 0 0 7 )
Solution. Step 1: Find the characteristic polynomial of A:
Expanding along the last column and then along the last row, we get

                    ( 7−t   0    0    1    0  )
                    (  0   7−t   0    0    0  )
chA (t) = (−1)5 det (  1    0   7−t   0    0  )
                    (  0    0    0   7−t   0  )
                    (  0    1    0    0   7−t )

                      ( 7−t   0    0    1  )
          = (t − 7) det (  0   7−t   0    0  )
                      (  1    0   7−t   0  )
                      (  0    0    0   7−t )

                         ( 7−t   0    0  )
          = −(t − 7)2 det (  0   7−t   0  )  = (t − 7)5 .
                         (  1    0   7−t )
Step 2: Calculate the generalized geometric multiplicities:
Put B = A − 7I. Then

    ( 0 0 0 1 0 )         ( 0 0 0 0 0 )         ( 0 0 0 0 0 )
    ( 0 0 0 0 0 )         ( 0 0 0 0 0 )         ( 0 0 0 0 0 )
B = ( 1 0 0 0 0 ) ,  B 2 = ( 0 0 0 1 0 ) ,  B 3 = ( 0 0 0 0 0 ) ,
    ( 0 0 0 0 0 )         ( 0 0 0 0 0 )         ( 0 0 0 0 0 )
    ( 0 1 0 0 0 )         ( 0 0 0 0 0 )         ( 0 0 0 0 0 )

which have rank 3, 1, and 0, respectively. Thus, since νAp (7) = 5 − rank(B p ), we see that
νA∗ (7) = (2, 4, 5, 5, . . .).
Step 3: Find the Jordan blocks by the method of second differences:
Since the Jordan canonical form J of A has chJ (t) = (t−7)5 and its generalized geometric
multiplicities are νJ∗ (7) = (2, 4, 5, 5, . . .), we can conclude by Example 6.14 that

                                ( 7 1 0 0 0 )
                                ( 0 7 1 0 0 )
J = Diag(J(7, 3), J(7, 2)) =    ( 0 0 7 0 0 )    (up to order).
                                ( 0 0 0 7 1 )
                                ( 0 0 0 0 7 )
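The rank computations of Step 2 are easy to reproduce by machine. Below is a minimal numpy sketch (for checking one's hand computation, not part of the text):

```python
import numpy as np

# The matrix of Example 6.15
A = np.array([[7, 0, 0, 1, 0],
              [0, 7, 0, 0, 0],
              [1, 0, 7, 0, 0],
              [0, 0, 0, 7, 0],
              [0, 1, 0, 0, 7]], dtype=float)

B = A - 7 * np.eye(5)
ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(B, p)) for p in (1, 2, 3)]
nus = [5 - r for r in ranks]
print(ranks)   # [3, 1, 0]
print(nus)     # nu_A^p(7) for p = 1, 2, 3: [2, 4, 5]
```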
 
                                                    (  1 0  6  2 0 2 )
                                                    (  1 2 −1  0 0 0 )
Example 6.16. Find the Jordan canonical form of A = (  0 0  4  1 0 1 ) .
                                                    (  1 0 −2  2 0 0 )
                                                    ( −1 0  4  0 2 1 )
                                                    ( −1 0 −2 −2 0 0 )

Solution. Step 1: Compute the characteristic polynomial of A.
Expanding the determinant successively along the 2nd column, the 4th column and the
3rd row, we get

                    ( 1−t   0    6    2    0    2 )
                    (  1   2−t  −1    0    0    0 )
chA (t) = (−1)6 det (  0    0   4−t   1    0    1 )
                    (  1    0   −2   2−t   0    0 )
                    ( −1    0    4    0   2−t   1 )
                    ( −1    0   −2   −2    0   −t )

                        ( 1−t   6    2    0    2 )
                        (  0   4−t   1    0    1 )
          = (2 − t) det (  1   −2   2−t   0    0 )
                        ( −1    4    0   2−t   1 )
                        ( −1   −2   −2    0   −t )

                         ( 1−t   6    2    2 )
          = (t − 2)2 det (  0   4−t   1    1 )
                         (  1   −2   2−t   0 )
                         ( −1   −2   −2   −t )

                     [         (  6    2    2 )              ( 1−t   2    2 )
          = (t − 2)2 [ 1 · det ( 4−t   1    1 )  − (−2) · det (  0    1    1 )
                     [         ( −2   −2   −t )              ( −1   −2   −t )

                                    ( 1−t   6    2 ) ]
                     + (2 − t) · det (  0   4−t   1 ) ]
                                    ( −1   −2   −t ) ]

          = (t − 2)2 [(−4 + 6t − 2t2 ) + 2(1 − t)(2 − t) + (2 − t)(4 − 8t + 5t2 − t3 )]
          = (t − 2)2 (t − 2)[(2 − 2t) + 2(t − 1) + (t3 − 5t2 + 8t − 4)]
          = (t − 2)3 [t3 − 5t2 + 8t − 4] = (t − 1)(t − 2)5 .
Thus, we have two eigenvalues: λ1 = 1 and λ2 = 2 with algebraic multiplicities mA (λ1 ) =
1 and mA (λ2 ) = 5, respectively.
Step 2: Find the p-th geometric multiplicities:
(i) For λ1 = 1:
                  (  0 0  6  2 0  2 )       ( 1 0 −2  1 0 0 )
                  (  1 1 −1  0 0  0 )       ( 0 1  1 −1 0 0 )
Put B1 = A − I =  (  0 0  3  1 0  1 )   →   ( 0 0  3  1 0 1 ) .
                  (  1 0 −2  1 0  0 )       ( 0 0  0  1 0 1 )
                  ( −1 0  4  0 1  1 )       ( 0 0  0  0 1 0 )
                  ( −1 0 −2 −2 0 −1 )       ( 0 0  0  0 0 0 )
Thus, νA (λ1 ) = n − rk(B1 ) = 6 − 5 = 1. Since also mA (λ1 ) = 1, and always ν p (λ1 ) ≤
mA (λ1 ) (cf. the remark after Theorem 6.5), we see that ν p (λ1 ) = 1 for all p ≥ 1. (Alter-
natively, we could have computed B12 and noticed that rank(B12 ) = rank(B1 ).)

(ii) For λ2 = 2:
                   ( −1 0  6  2 0  2 )       ( 1 0 −1 0 0 0 )
                   (  1 0 −1  0 0  0 )       ( 0 0  1 0 0 0 )
Put B2 = A − 2I =  (  0 0  2  1 0  1 )   →   ( 0 0  0 1 0 1 ) .
                   (  1 0 −2  0 0  0 )       ( 0 0  0 0 0 1 )
                   ( −1 0  4  0 0  1 )       ( 0 0  0 0 0 0 )
                   ( −1 0 −2 −2 0 −2 )       ( 0 0  0 0 0 0 )
Thus νA (λ2 ) = n − rk(B2 ) = 6 − 4 = 2. Furthermore, since

        (  1 0 −2 0 0 0 )       ( 1 0 −2 0 0 0 )
        ( −1 0  4 1 0 1 )       ( 0 0  2 1 0 1 )
B22 =   (  0 0  0 0 0 0 )   →   ( 0 0  0 0 0 0 ) ,
        ( −1 0  2 0 0 0 )       ( 0 0  0 0 0 0 )
        (  0 0  0 0 0 0 )       ( 0 0  0 0 0 0 )
        (  1 0 −2 0 0 0 )       ( 0 0  0 0 0 0 )

        ( −1 0  2 0 0 0 )       ( 1 0 −2 0 0 0 )
        (  1 0 −2 0 0 0 )       ( 0 0  0 0 0 0 )
B23 =   (  0 0  0 0 0 0 )   →   ( 0 0  0 0 0 0 ) ,
        (  1 0 −2 0 0 0 )       ( 0 0  0 0 0 0 )
        (  0 0  0 0 0 0 )       ( 0 0  0 0 0 0 )
        ( −1 0  2 0 0 0 )       ( 0 0  0 0 0 0 )

it follows that νA2 (2) = n − rk(B22 ) = 6 − 2 = 4, and νA3 (2) = n − rk(B23 ) = 6 − 1 = 5. We
can stop here because νA3 (2) = 5 = mA (2), and hence νAp (2) = 5 for all p ≥ 3.
Step 3: Find the number of Jordan blocks via the method of second differences.
By step 2 we have:
νA∗ (1) = (1, 1, 1, 1, . . .)
νA∗ (2) = (2, 4, 5, 5, . . .)
Thus, since νA (1) = mA (1) = 1, we have 1 block J(1, 1). Moreover, since the values of
νA∗ (2) are identical to those of νA∗ (7) in Example 6.14, we also have the same number
of blocks (by taking second differences).
Thus, J has:  1 block of size 1 with eigenvalue λ1 = 1
              1 block of size 2 with eigenvalue λ2 = 2
              1 block of size 3 with eigenvalue λ2 = 2,
and hence the Jordan canonical form J of A is
                                         ( 1 0 0 0 0 0 )
                                         ( 0 2 1 0 0 0 )
J = Diag(J(1, 1), J(2, 2), J(2, 3)) =    ( 0 0 2 0 0 0 )    (up to order).
                                         ( 0 0 0 2 1 0 )
                                         ( 0 0 0 0 2 1 )
                                         ( 0 0 0 0 0 2 )
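The multiplicity computations in Step 2 can be cross-checked numerically; the sketch below (illustrative, not part of the text) recomputes both sequences of generalized geometric multiplicities from matrix ranks.

```python
import numpy as np

# The matrix of Example 6.16
A = np.array([[ 1, 0,  6,  2, 0, 2],
              [ 1, 2, -1,  0, 0, 0],
              [ 0, 0,  4,  1, 0, 1],
              [ 1, 0, -2,  2, 0, 0],
              [-1, 0,  4,  0, 2, 1],
              [-1, 0, -2, -2, 0, 0]], dtype=float)

def nus(A, lam, pmax):
    """The sequence nu_A^p(lam) = n - rank((A - lam I)^p), p = 1, ..., pmax."""
    n = A.shape[0]
    B = A - lam * np.eye(n)
    return [n - np.linalg.matrix_rank(np.linalg.matrix_power(B, p))
            for p in range(1, pmax + 1)]

print(nus(A, 1.0, 3))   # [1, 1, 1]
print(nus(A, 2.0, 4))   # [2, 4, 5, 5]
```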

Exercises 6.4.

1. Find the generalized geometric multiplicities of the following Jordan matrices:


(a) J = Diag(J(1, 2), J(1, 3));
(b) J = Diag(J(2, 1), J(2, 2), J(2, 3));
(c) J = Diag(J(3, 3), J(3, 3), J(3, 3));
(d) J = Diag(J(1, 3), J(3, 2), J(3, 3)).

2. Let J be a Jordan matrix (in standard form) and suppose that its characteristic
polynomial has the form chJ (t) = (t + 1)m . Find J if its sequence of generalized
geometric multiplicities is
(a) νJ∗ (−1) = (3, 6, 6, 6, . . .);
(b) νJ∗ (−1) = (3, 5, 6, 6, 6, . . .);
(c) νJ∗ (−1) = (3, 6, 7, 7, 7, . . .);
(d) νJ∗ (−1) = (3, 6, 7, 8, 9, 9, 9, . . .).

3. Find the Jordan canonical form J of the following matrices and justify your result.
        ( −2  1  0 −1  1 )            ( −2  0  0 −1  1 )
        (  0 −2  1 −1  1 )            (  0 −2  0 −1  1 )
(a) A = (  0  0 −2  0  1 ) ;  (b) B = (  0  0 −2  0  1 ) .
        (  0  0  0 −2  1 )            (  0  0  0 −2  1 )
        (  0  0  0  0 −2 )            (  0  0  0  0 −2 )

[Do not find P such that P −1 AP = J or P −1 BP = J]

4. Find the Jordan canonical form of the matrix

        ( −1  1  0  0  1  1  1  1 )
        (  0 −1  0  0  0  1  1  1 )
        (  0  0 −1  1  0  0  1  1 )
A =     (  0  0  0 −1  0  0  0  1 )  .
        (  0  0  0  0 −1  1  1  0 )
        (  0  0  0  0  0 −1  0  1 )
        (  0  0  0  0  0  0  1  1 )
        (  0  0  0  0  0  0  0  1 )

Be sure to justify your result by suitable computations, but do not find P such that
P −1 AP is a Jordan matrix.

6.5 A procedure for finding P such that P −1 AP = J


In the previous section we proved that the generalized geometric multiplicities νAp (λ) =
dim EAp (λ) determine the Jordan canonical form J of a matrix A. Here we shall see that
generalized eigenspaces EAp (λ) can be used to find a matrix P such that P −1 AP = J;
this partly generalizes the method which we learned in section 6.3.

Procedure for finding P such that P −1 AP = J


I. Compute and factor the characteristic polynomial of the n × n matrix A:

chA (t) = (t − λ1 )m1 (t − λ2 )m2 · · · (t − λs )ms .

II. For each eigenvalue λi , 1 ≤ i ≤ s, find a basis Bi of EA∗ (λi ) := EAmi (λi ) as follows:
    1. Compute νAj (λi ) = n − rank((A − λi I)j ) for j = 1, . . . , mi , and find the
       minimal k = ki such that νAk (λi ) = mi .
       [It is usually a good idea to find the generalized eigenspaces EAj (λi ) as well.]
    2. Build up a basis Bi of EAk (λi ) = EAmi (λi ) = EA∗ (λi ) as follows:
       i. Pick a generalized eigenvector ~v ∈ EAk (λi ) of degree k (so ~v ∉ EAk−1 (λi ));
          start the list Bi with

          Bi = {w~ i1 := ~v , w~ i2 := (A − λi I)w~ i1 , . . . , w~ ik := (A − λi I)w~ i,k−1 }.

       ii. If we already have mi vectors in the list, Bi is the desired basis and we
           are done with the eigenvalue λi ; otherwise, proceed to the next step.
       iii. Determine the largest ` such that EA` (λi ) ⊄ EA`−1 (λi ) + span(Bi ), and
            pick a generalized eigenvector ~u ∈ EA` (λi ) such that ~u ∉ EA`−1 (λi ) +
            span(Bi ). (In other words, pick a generalized eigenvector ~u of highest
            possible degree ` such that ~u is linearly independent of the vectors already
            in the list Bi , together with those of EA`−1 (λi ).)
       iv. Add the vectors ~u, (A − λi I)~u, . . . , (A − λi I)`−1~u at the end of the list Bi .
           Go back to step ii.
III. Each (ordered) list Bi now has the form Bi = {w~ i1 , w~ i2 , . . . , w~ imi }. Assemble these
     lists as the column vectors of the matrix P by reversing the order in each list Bi ;
     thus,

     P = ( w~ 1m1 | w~ 1,m1 −1 | . . . | w~ 11 | w~ 2m2 | w~ 2,m2 −1 | . . . | w~ 21 | . . . | w~ sms | . . . | w~ s1 ).

Then: J = P −1 AP is the Jordan Canonical Form of A, and the Jordan blocks


with the same eigenvalue are arranged in order of increasing block size, i.e. J is in
reverse standard form (cf. p. 281).
 
                                                              ( 3 1 0 0 0 )
                                                              ( 0 3 0 0 0 )
Example 6.17. Verify the algorithm for A = Diag(J(3, 2), J(3, 3)) = ( 0 0 3 1 0 ) .
                                                              ( 0 0 0 3 1 )
                                                              ( 0 0 0 0 3 )
Solution. Since this matrix is already in (reverse standard) Jordan Canonical Form, we
know that we can take P = I. Thus, we expect the algorithm to assemble the identity
matrix I.

I. Clearly, chA (t) = (t − 3)5 , so λ1 = 3 and m1 = n = 5.

II. 1. The generalized eigenspaces are:

                                   ( 0 1 0 0 0 )
                                   ( 0 0 0 0 0 )
EA (3) = Nullsp(A − 3I) = Nullsp   ( 0 0 0 1 0 )   = {c1~e1 + c2~e3 },
                                   ( 0 0 0 0 1 )
                                   ( 0 0 0 0 0 )

                                      ( 0 0 0 0 0 )
                                      ( 0 0 0 0 0 )
EA2 (3) = Nullsp((A − 3I)2 ) = Nullsp ( 0 0 0 0 1 )   = {c1~e1 + c2~e2 + c3~e3 + c4~e4 },
                                      ( 0 0 0 0 0 )
                                      ( 0 0 0 0 0 )

EA3 (3) = Nullsp((A − 3I)3 ) = Nullsp(0) = C5 .

Thus, νA (3) = 2 < νA2 (3) = 4 < νA3 (3) = 5 = m1 , and so k = 3.


2. We now construct the basis B1 of EA3 (3) = C5 as follows.

i. Pick ~v ∈ C5 of exact degree 3, i.e., ~v ∉ EA2 (3). For example, take ~v = ~e5 .
Then: w ~ 12 = (A − 3I)~e5 = ~e4 , w
~ 11 = ~e5 , w ~ 13 = (A − 3I)~e4 = ~e3 .
ii. At this point the list is B1 = {~e5 , ~e4 , ~e3 }, which consists only of 3 < 5 elements,
so we continue with step iii.
iii. Since EA2 (3) + span(B1 ) = EA3 (3), we cannot take ` = 3. Thus, try ` = 2,
and look for ~u ∈ EA2 (3) = {c1~e1 + c2~e2 + c3~e3 + c4~e4 } such that ~u ∉ EA (3) +
span(B1 ) = {c1~e1 + c3~e3 + c4~e4 + c5~e5 } (the first two terms spanning EA (3)).
Clearly, we can take ~u = ~e2 (and hence ` = 2).
iv. Thus, we add ~u = ~e2 and (A − 3I)~u = ~e1 at the end of B1 to get B1 =
{~e5 , ~e4 , ~e3 , ~e2 , ~e1 }. Since we now have 5 = m1 vectors in B1 , we have con-
structed the desired basis of EA3 (3) = C5 .

III. Assembling P from B1 (in reverse order) yields P = (~e1 | . . . |~e5 ) = I, the 5 × 5
identity matrix. Thus, P −1 AP = A = J is in Jordan Canonical Form.

Example 6.18. If A is the matrix of Example 6.16, find P such that J = P −1 AP is the
Jordan canonical form of A.
Solution. We follow the steps of the algorithm.
Step I. Compute and factor the characteristic polynomial:
By Example 6.16 we know that the characteristic polynomial is chA (t) = (t − 1)(t − 2)5 ,
so we have 2 eigenvalues: λ1 = 1 and λ2 = 2.
Step II. For each i find a basis Bi of EAmi (λi ):
a) For λ1 = 1:
1. Again, from Example 6.16 we know that νA (1) = 1; moreover, by the reduced matrix
given there we obtain
EA (1) = ⟨(1, −1, 0, −1, 0, 1)t ⟩.
2. Thus, B1 = {w~ 11 }, where w~ 11 := (1, −1, 0, −1, 0, 1)t .
b) For λ2 = 2:
1. From Example 6.16 we know that νA∗ (2) = (2, 4, 5, 5, . . .), so the smallest k such that
νAk (2) = m2 is k = 3. Moreover, from the row reduced matrices of Example 6.16 we
obtain:
EA (2) = ⟨~u11 , ~u12 ⟩,                  where ~u11 = (0, 1, 0, 0, 0, 0)t ,
                                               ~u12 = (0, 0, 0, 0, 1, 0)t ;
EA2 (2) = ⟨~u11 , ~u12 , ~u21 , ~u22 ⟩,         where ~u21 = (0, 0, 0, −1, 0, 1)t ,
                                               ~u22 = (2, 0, 1, −2, 0, 0)t ;
EA3 (2) = ⟨~u11 , ~u12 , ~u31 , ~u32 , ~u33 ⟩,   where ~u31 = (0, 0, 0, 1, 0, 0)t ,
                                               ~u32 = (0, 0, 0, 0, 0, 1)t ,
                                               ~u33 = (2, 0, 1, 0, 0, 0)t .

Note that there are two relations among the uij ’s: ~u21 = −~u31 + ~u32 and ~u22 = ~u33 − 2~u31 .
2. i. Clearly, ~v = ~u31 = (0, 0, 0, 1, 0, 0)t ∈ EA3 (2) but ~v ∉ EA2 (2), so ~v has degree 3. Thus,
we can start the list B21 with ~v as a generator. (We could also take instead ~v = ~u32 or ~u33
or most linear combinations of these vectors.) Thus, put w~ 21 = ~v , w~ 22 = (A − 2I)w~ 21 =
(2, 0, 1, 0, 0, −2)t = ~u22 − 2~u21 , w~ 23 = (A − 2I)w~ 22 = (0, 1, 0, 0, 0, 0)t = ~u11 . (Note that
(A − 2I)w~ 23 = ~0, as expected since k = 3.) We thus have the list

B21 = {w~ 21 , w~ 22 , w~ 23 }.

ii. Since #B21 = 3 < m2 = 5, we continue with the next step.

iii. Clearly EA2 (2) + span(B21 ) = ⟨~u11 , ~u12 , ~u21 , ~u22 , ~u31 ⟩ = EA3 (2), so ` < 3.
However, ~u21 ∈ EA2 (2) but ~u21 ∉ EA (2) + span(B21 ) = ⟨~u11 , ~u12 , ~u31 , ~u22 − 2~u21 ⟩, so
` = 2. Moreover, we can take w~ 24 := ~u21 as the next generator. Thus w~ 25 := (A − 2I)w~ 24 =
(0, 0, 0, 0, 1, 0)t = ~u12 is the next element in the list:

B22 = {w~ 24 , w~ 25 }.

ii. We have now constructed the list B2 = B21 ∪ B22 = {w~ 21 , . . . , w~ 25 }. Since #B2 =
5 = mA (2), we are done with step II.
Step III. In step II we constructed the lists B1 = {w~ 11 } and B2 = B21 ∪ B22 =
{w~ 21 , . . . , w~ 25 }. Assembling the elements of each list in reverse order yields
 
                                   (  1 0  0 0  2 0 )
                                   ( −1 0  0 1  0 0 )
P = (w~ 11 | w~ 25 | w~ 24 | . . . | w~ 21 ) = (  0 0  0 0  1 0 ) .
                                   ( −1 0 −1 0  0 1 )
                                   (  0 1  0 0  0 0 )
                                   (  1 0  1 0 −2 0 )
Thus, P is the matrix which transforms A to its Jordan canonical form; i.e. P is such
that J = P −1 AP is a Jordan matrix.
Check: P −1 AP =

    (  1 0 −2 0 0 0 ) (  1 0  6  2 0 2 ) (  1 0  0 0  2 0 )
    (  0 0  0 0 1 0 ) (  1 2 −1  0 0 0 ) ( −1 0  0 1  0 0 )
    ( −1 0  4 0 0 1 ) (  0 0  4  1 0 1 ) (  0 0  0 0  1 0 )
    (  1 1 −2 0 0 0 ) (  1 0 −2  2 0 0 ) ( −1 0 −1 0  0 1 )
    (  0 0  1 0 0 0 ) ( −1 0  4  0 2 1 ) (  0 1  0 0  0 0 )
    (  0 0  2 1 0 1 ) ( −1 0 −2 −2 0 0 ) (  1 0  1 0 −2 0 )

    (  1 0 −2 0 0 0 ) (  1 0  0 0  2 0 )   ( 1 0 0 0 0 0 )
    ( −1 0  4 0 2 1 ) ( −1 0  0 1  0 0 )   ( 0 2 1 0 0 0 )
  = ( −2 0  8 0 0 2 ) (  0 0  0 0  1 0 ) = ( 0 0 2 0 0 0 ) .
    (  2 2 −3 0 0 0 ) ( −1 0 −1 0  0 1 )   ( 0 0 0 2 1 0 )
    (  0 0  4 1 0 1 ) (  0 1  0 0  0 0 )   ( 0 0 0 0 2 1 )
    (  0 0  4 2 0 2 ) (  1 0  1 0 −2 0 )   ( 0 0 0 0 0 2 )
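The same check can be done by machine; the sketch below (illustrative, not part of the text) recomputes P −1 AP numerically and rounds to integers.

```python
import numpy as np

# A and P from Examples 6.16 and 6.18
A = np.array([[ 1, 0,  6,  2, 0, 2],
              [ 1, 2, -1,  0, 0, 0],
              [ 0, 0,  4,  1, 0, 1],
              [ 1, 0, -2,  2, 0, 0],
              [-1, 0,  4,  0, 2, 1],
              [-1, 0, -2, -2, 0, 0]], dtype=float)

P = np.array([[ 1, 0,  0, 0,  2, 0],
              [-1, 0,  0, 1,  0, 0],
              [ 0, 0,  0, 0,  1, 0],
              [-1, 0, -1, 0,  0, 1],
              [ 0, 1,  0, 0,  0, 0],
              [ 1, 0,  1, 0, -2, 0]], dtype=float)

J = np.rint(np.linalg.inv(P) @ A @ P).astype(int)
print(J)   # Diag(J(1,1), J(2,2), J(2,3)), as found in Example 6.16
```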

Exercises 6.5.
1. Find a matrix P such that P −1 AP is in Jordan canonical form for each of the
matrices A of Problem 3 of Exercise 6.4.
2. If A is as in Problem 4 of Exercises 6.4, find P such that P −1 AP is in Jordan
canonical form.
3. Find a matrix P such that P −1 AP is in Jordan canonical form where

        ( −1  1  0  0  1  1  1  1 )
        (  0 −1  0  0  0  1  1  1 )
        (  0  0 −1  1  0  0  1  1 )
A =     (  0  0  0 −1  0  0  0  1 )  .
        (  0  0  0  0 −1  1  1  0 )
        (  0  0  0  0  0  1  0  0 )
        (  0  0  0  0  0  1  1  0 )
        (  0  0  0  0  0  1  1  1 )

6.6 A Proof of the Cayley–Hamilton Theorem


As was promised in the introduction, we want to use Jordan’s theorem (Theorem 6.4) to
prove:1

Theorem 6.6 (Cayley-Hamilton). For any square matrix A we have chA (A) = 0.

Proof. We present here a proof that is typical of arguments using the Jordan canonical
form, in that they all follow the same three-step pattern:
Step 1. Prove the assertion for Jordan blocks.
Step 2. Prove the statement for Jordan matrices (using step 1).
Step 3. Deduce from step 2 and Jordan’s theorem that the assertion is true for a
general matrix.
We now apply this strategy to proving the Cayley-Hamilton theorem.
Step 1. Let A = J(λ, k) be a Jordan block.
Then clearly chA (t) = (t − λ)k , so

chA (A) = (J(λ, k) − λI)k = J(0, k)k = 0

because J(0, k)p (for p < k) is the matrix with 1’s on the p-th superdiagonal and zeros
elsewhere (cf. Example 6.13), so that J(0, k)k = 0.

Thus, the Cayley-Hamilton theorem holds for Jordan blocks.


Step 2. Let A = Diag(J11 , . . . , Jij , . . .) be a Jordan matrix.
Put c(t) = chA (t). Since A is a diagonal block matrix, we have

c(t) = chJ11 (t) · · · chJij (t) · · · ,

so in particular for each i, j we have c(t) = gij (t) chJij (t) for some polynomial gij (t).
Thus, c(Jij ) = gij (Jij ) chJij (Jij ) = 0 since by step 1 we have chJij (Jij ) = 0. Thus

c(A) = c(Diag(J11 , . . . , Jij , . . .)) = Diag(c(J11 ), . . . , c(Jij ), . . .) = Diag(0, . . . , 0, . . . ) = 0,

and so the statement holds for Jordan matrices.


1
In the appendix we shall give a direct proof of the Cayley-Hamilton Theorem; cf. Theorem 6.7 and
the remark following it.

Step 3. Let A be an arbitrary matrix.


By Jordan’s theorem, there is a matrix P such that J = P −1 AP is a Jordan matrix.
Then chA (t) = chJ (t) and so, using step 2, we obtain

chA (A) = chJ (P JP −1 ) = P chJ (J) P −1 = P · 0 · P −1 = 0.

Thus, the Cayley-Hamilton theorem holds for an arbitrary matrix A.
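The theorem is also easy to check numerically for any particular matrix. The sketch below uses an arbitrary 3 × 3 example (not from the text); np.poly returns the coefficients of the characteristic polynomial det(tI − A), computed from the eigenvalues, highest degree first.

```python
import numpy as np

# Arbitrary example with ch_A(t) = (t - 2)^2 (t - 3)
A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [1., 0., 3.]])

c = np.poly(A)   # coefficients of ch_A(t), highest degree first
n = len(c) - 1
chA_of_A = sum(c[i] * np.linalg.matrix_power(A, n - i) for i in range(n + 1))
print(np.allclose(chA_of_A, 0))   # True: ch_A(A) = 0
```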


As was mentioned in the above proof, the basic strategy used in the proof applies to
many other situations as well. For example:

Example 6.19. Find all m × m matrices A such that A2 = I.


Solution. We follow the above strategy.
Step 1. Find all Jordan blocks J = J(λ, m) satisfying J 2 = I.
By the explicit formula of powers of Jordan blocks (cf. Theorem 5.7), this can only
happen if m = 1. Furthermore, in that case we must have that λ = ±1, i.e. either λ = 1
or λ = −1.
Step 2. Find all Jordan matrices J = Diag(J11 , . . . , Jij , . . .) satisfying J 2 = I.
Since J 2 = Im implies that Jij2 = Ikij for each block, we obtain from step 1 that J is a
diagonal matrix with ±1 along the diagonal. (Conversely, every matrix J of this form satisfies J 2 = I.)
Step 3. General case: A = P JP −1 is any matrix such that A2 = I.
Then we also have J 2 = P −1 A2 P = I, so by step 2, A is similar to a diagonal matrix
J = Diag(±1, . . . , ±1). Conversely, any matrix A of this form satisfies A2 = I.
Conclusion. A matrix A satisfies A2 = I if and only if it is similar to a diagonal matrix
of the form Diag(±1, . . . , ±1).
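The conclusion can be illustrated numerically with a simple involution that is not itself diagonal (a hypothetical example, not from the text):

```python
import numpy as np

# The swap matrix is an involution (A @ A = I) and is similar to Diag(1, -1).
A = np.array([[0., 1.],
              [1., 0.]])

assert np.allclose(A @ A, np.eye(2))
evals = np.sort(np.linalg.eigvals(A).real)
print(evals)   # [-1.  1.]
```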

Exercises 6.6.

1. Let A be a matrix with characteristic polynomial

chA (t) = (t − λ1 )m1 (t − λ2 )m2 · · · (t − λs )ms .

Prove that for any polynomial f (t) ∈ C[t], the characteristic polynomial of f (A) is

chf (A) (t) = (t − f (λ1 ))m1 (t − f (λ2 ))m2 · · · (t − f (λs ))ms .

Hint: First prove it in the case that A is a Jordan block, then for a general Jordan
matrix, and then use Jordan’s theorem.

2. Find all m × m matrices A satisfying A2 = A. (A matrix satisfying this equation


is called an idempotent matrix.)
