
MTH205: Linear Algebra II

0.1 Recommended Texts


1. Linear Algebra and Its Applications by Gilbert Strang, Fourth Edition.

2. Linear Algebra by Jim Hefferon.

3. Applied Linear Algebra by Ben Noble and James W. Daniel.

4. Systems of Linear Equations by Beifang Chen.

Chapter 1

System of Linear Equations

An equation of the form


a1 x1 + a2 x2 + · · · + an xn = b,
where the ai's are real or complex numbers and the xi's are the variables (unknowns), for
i = 1, 2, · · · , n, is called a linear equation. The constants ai's are called the coefficients of the
xi's, while b is the constant term of the equation.
A system of linear equations (or linear system) is a finite collection of linear equations of
the same variables. For example, a linear system of m equations in n unknowns x1 , x2 , · · · , xn
can be expressed as,
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \qquad (1.1)$$
A solution of (1.1) is a tuple (x∗1 , x∗2 , · · · , x∗n ) of numbers that satisfies each of the equations
when x∗1 , x∗2 , · · · , x∗n are substituted for x1 , x2 , · · · , xn respectively. The set of all solutions of
a linear system is called the solution set of the system.
The coefficient matrix of (1.1) is
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n}\\ \vdots & \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{pmatrix}$$
while
$$[A|b] = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & b_1\\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} & b_2\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} & b_m \end{pmatrix}$$
is called the augmented matrix of (1.1). If b1 = b2 = b3 = · · · = bm = 0, then equation (1.1)
is called a homogeneous system of equations; otherwise it is non-homogeneous.

1.1 Gaussian Elimination
Definition 1.1
There are three types of elementary row/column operations on matrices:

a. Swapping or interchanging two rows/columns

b. Multiplying all entries of one row or column by a non-zero constant.

c. Adding/subtracting a multiple of one row/column to another row/column.

Definition 1.2
Two linear systems with the same unknowns are said to be equivalent if their solution sets
are the same. A matrix A is said to be row equivalent to a matrix B, written A ∼ B
(pronounced A tilde B), if there is a sequence of elementary row operations that changes A
to B.

1.1.1 Echelon Form


Definition 1.3
A matrix is said to be in row echelon form if it satisfies the following conditions:

a. All zero rows are gathered at the bottom.

b. The first non-zero entry of a row, called the leading entry of that row, lies strictly to
the left of the leading entry of the next row.
A matrix in row echelon form is said to be in row reduced echelon form if it satisfies
the following conditions:

c. The leading entry of every non-zero row is 1.

d. Each leading entry 1 is the only non-zero entry in its column.

1.2 Row Reduction Algorithm


Definition 1.4
A pivot position of a matrix A is a location of entries of A that corresponds to a leading
entry of a matrix in echelon form.

Algorithm 1.1 (Row Reduction Algorithm)


1. Begin with the leftmost non-zero column, which is a pivot column.

2. If the entry in the pivot position is zero, choose a non-zero entry in the pivot column,
i.e., interchange the pivot row and the row containing this non-zero entry.

3. If the pivot position is non-zero, use elementary row operations to reduce all entries
below the pivot position to zero (and, for row reduced echelon form, the pivot position
to 1 and the entries above the pivot position to zero).

4. Cover the pivot row and the rows above it; repeat (1) to (3) on the remaining sub-matrix.
A minimal Octave sketch of this algorithm follows.
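
The steps above can be sketched in Octave (the language used for the computations in Chapter 6). The helper name my_rref is hypothetical; Octave's built-in rref performs the same reduction.

function R = my_rref(A)
  [m, n] = size(A);
  row = 1;
  for col = 1:n
    % Steps 1-2: locate a non-zero entry in the pivot column at or below 'row'
    [val, idx] = max(abs(A(row:m, col)));
    if val < eps
      continue;        % no pivot in this column
    end
    idx = idx + row - 1;
    A([row idx], :) = A([idx row], :);    % swap it into the pivot row
    A(row, :) = A(row, :) / A(row, col);  % scale the pivot to 1
    % Step 3: reduce all other entries in the pivot column to zero
    for r = [1:row-1, row+1:m]
      A(r, :) = A(r, :) - A(r, col) * A(row, :);
    end
    row = row + 1;     % Step 4: repeat on the remaining sub-matrix
    if row > m, break; end
  end
  R = A;
end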

Definition 1.5
The number of non-zero rows/columns in any matrix in echelon form is called the rank. It
can also be defined as the number of pivots after reduction to echelon form.

Theorem 1.1
The number k of non-zero rows and the column numbers of the leading columns are the
same in any echelon form produced from a given matrix A by elementary row operations,
irrespective of the actual sequence of row operations used.

Proof
See [3], pages 141–143.

Theorem 1.1a(Row operations)


Suppose B results from applying a sequence of elementary row operations to A. Then there
exists a non-singular matrix F for which B = F A and hence F −1 B = A.

Example 1.1
The first three matrices below are in echelon form:
$$\begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & -2 & 3\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 4\\ 0 & 0 & 1 \end{pmatrix}$$
These matrices
$$\begin{pmatrix} 2 & 3\\ 1 & 0\\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 2\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 2 & 1 \end{pmatrix}$$
are not.

Example 1.2
The following 2 by 2 matrix
$$A = \begin{pmatrix} 2 & 4\\ 3 & 9 \end{pmatrix}$$
is reduced to
$$A' = \begin{pmatrix} 1 & 2\\ 0 & 1 \end{pmatrix}$$
by R2 ↔ (1/3)R2; R1 ↔ (1/2)R1; R2 ↔ R2 − R1. If we swap the two rows of A first and then perform
row operations to echelon form, we have
$$A'' = \begin{pmatrix} 1 & 3\\ 0 & 1 \end{pmatrix}$$
This example shows the non-uniqueness of the row echelon form, since A is reduced to two
different matrices A' and A''. However, the following theorem shows that the row reduced echelon
form is unique irrespective of row interchange(s). Nevertheless, Theorem 1.1 is satisfied.

Theorem 1.2(Row Reduced Echelon Form)


Each matrix has one and only one row reduced echelon form to which it can be reduced by
elementary row operations, irrespective of the actual sequence of operations used to produce
it.

Proof
Let A' and A'' be two row reduced echelon forms of the same matrix A. Since A' and A'' are
also in echelon form, then from Theorem 1.1 they have the same number k of non-zero rows
and the same column numbers of their k leading columns. The ith leading column of each
matrix is just e_i, the ith column of the identity matrix. From Theorem 1.1a, there exist non-singular
matrices F' and F'' such that F'A = A' and F''A = A''. Therefore A'' = HA' and
A' = H^{-1}A'', where H = F''(F')^{-1}. The rule for partitioned multiplication tells us that the columns of
A' and A'' are related by a'' = Ha', so, applying this to the leading columns, we obtain He_i = e_i for 1 ≤ i ≤ k. Any
column a' of A' is a sum of multiples of these first k unit column matrices e_i:

a' = Σ α_j e_j.

The corresponding column a'' of A'' is just Ha':

a'' = Ha' = H(Σ α_j e_j) = Σ α_j He_j = Σ α_j e_j = a',

and thus a'' = a', so corresponding columns of A' and A'' are equal; i.e. A' = A''. In the next
section, we will discuss conditions for solvability of systems of equations.

1.3 The Number of Solutions


In the simple case of one equation, ax = b, in one unknown x, we tend to say that the solution
of this equation is x = b/a, but there are three possibilities:

1. If a ≠ 0, then x = b/a is the unique solution.

2. If a = 0, then two possibilities exist:

i. If b ≠ 0, then we want to find x such that 0x = b ≠ 0, and no solution x exists. We
say that "no solution exists" or that "the equation is inconsistent", since 0 = b ≠ 0 is
a contradiction.

ii. If b = 0, then there are infinitely many solutions: every number x is a solution,
since 0x = 0 = b no matter what x is.
For example, the equations

x1 + x2 = 2
x1 − x2 = 0

have a unique solution;

x1 + x2 = 2
x1 + x2 = 1

are inconsistent (no solution); and

x1 + x2 = 2
2x1 + 2x2 = 4

have infinitely many solutions, namely x1 = k, x2 = 2 − k, for all k.
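
These three cases can be checked in Octave by comparing ranks, anticipating Theorem 1.4 below (a small sketch using the systems above):

A = [1 1; 1 -1]; b = [2; 0];
printf("unique:   rank(A)=%d, rank([A b])=%d\n", rank(A), rank([A b]));  % 2 and 2
A = [1 1; 1 1];  b = [2; 1];
printf("none:     rank(A)=%d, rank([A b])=%d\n", rank(A), rank([A b]));  % 1 and 2
A = [1 1; 2 2];  b = [2; 4];
printf("infinite: rank(A)=%d, rank([A b])=%d\n", rank(A), rank([A b]));  % 1 and 1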

Examples
1.

x1 + 2x2 − 5x3 = 2
2x1 − 3x2 + 4x3 = 4
4x1 + x2 − 6x3 = 8

2.

x1 + 2x2 − x3 + 2x4 = 4
2x1 + 7x2 + x3 + x4 = 14
3x1 + 8x2 − x3 + 4x4 = 17

3.

−2x1 + 2x2 − 4x3 − 6x4 = −4


−3x1 + 6x2 + 3x3 − 15x4 = −3
5x1 − 8x2 − x3 + 17x4 = 9
x1 + x2 + 11x3 + 7x4 = 7

1.4 Solvability of Systems of Equations


In this section, we will use the row echelon form of the augmented matrix to analyse the
solvability of the system of equations Ax = b. First of all, we want to show that Gauss
elimination leaves the set of solutions unchanged.

Theorem 1.3 (Gauss elimination and solution sets)
Suppose that the system of equations Ax = b, or equivalently the augmented matrix [A, b], is
transformed by a sequence of elementary row operations into the system A'x = b', or
equivalently into the augmented matrix [A', b']. Then the solution sets are identical, i.e. x
solves Ax = b iff x solves A'x = b'.

Proof
By Theorem 1.1a on row operations, there is a non-singular matrix F such that

[A', b'] = F[A, b] = [FA, Fb],

where the above equality follows from the rule for partitioned multiplication. This means
that A' = FA and b' = Fb. Suppose that x solves Ax = b. Pre-multiplication by F gives
FAx = Fb, that is, A'x = b', and thus x solves the transformed equations as well. Conversely,
if x satisfies A'x = b', then pre-multiplication by F^{-1} shows that x solves Ax = b. Therefore,
the two solution sets are identical.
The above theorem shows that we can study the set of all solutions to Ax = b by studying
the set of solutions of the much simpler system A'x = b' obtained by reducing [A, b] to echelon
form [A', b']. As was demonstrated in tutorial one of MTH 204 in the first semester,
three possible cases can occur in solving Ax = b:

a. No solution (or inconsistent)

b. Infinitely many solutions

c. Exactly one solution

These possibilities are summarised in the following theorem in terms of rank.

Theorem 1.4 (Rank and solvability)


Let Ax = b be a system of equations. Exactly one of these three possibilities must hold:

1. The rank of the augmented matrix [A, b] is greater than the rank of A, and no solution
exists to Ax = b.

2. The rank of [A, b] equals the rank of A, and both equal the number of unknowns; the
system Ax = b has exactly one solution.

3. The rank of [A, b] equals that of A, which is strictly less than the number of unknowns,
and the system Ax = b has infinitely many solutions.

Proof
See [3], pages 148–149.

Example
For what value of α will the following system of equations have one, no, or infinitely many
solutions?
x − 3y = −2
2x + y = 3
3x − 2y = α
Solution
The augmented matrix is reduced from
$$\begin{pmatrix} 1 & -3 & -2\\ 2 & 1 & 3\\ 3 & -2 & \alpha \end{pmatrix} \quad\text{to}\quad \begin{pmatrix} 1 & -3 & -2\\ 0 & 1 & 1\\ 0 & 0 & \alpha-1 \end{pmatrix}$$
If α = 1, then both A' and [A', b'] are in echelon form and rank(A') = rank([A', b']) = 2,
which equals the number of unknowns. This corresponds to case 2 of Theorem 1.4 and there is a
unique solution, namely x = y = 1.
If α − 1 ≠ 0, then the third row can be divided by α − 1 to produce a 1 at the bottom
of the third column, giving the rank of A' as 2 but the rank of [A', b'] as 3; this is case 1 and
there are no solutions.
The case Ax = 0 was studied extensively in MTH 204 as the nullspace solution
when balancing chemical reactions. It corresponds to case 3, infinitely many solutions.
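
As a quick check, the ranks in this example can be computed in Octave (a sketch, using the built-in rank):

A = [1 -3; 2 1; 3 -2];
for alpha = [1 2]
  b = [-2; 3; alpha];
  printf("alpha=%d: rank(A)=%d, rank([A b])=%d\n", alpha, rank(A), rank([A b]));
end
% alpha=1: ranks are equal (unique solution); alpha=2: rank([A b]) = 3 (no solution)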

1.5 Structure of the Solution Set


When the solution set is empty (there are no solutions), or when the set consists of one single
solution, the structure of the solution set is very clear. However, what happens in the third
case, when there are infinitely many solutions?
If y and z are both solutions to Ax = b, so that Ay = b and Az = b, then
Ay − Az = b − b
A(y − z) = 0.
That is, the column matrix h = y − z is a solution of the homogeneous system Ah = 0
corresponding to the original system Ax = b.
If xp is a particular solution and h (or xn as used in MTH 204) is the homogeneous
solution, then
A(xp + h) = Axp + Ah
= b + 0
= b.
That is, xp + h is another solution to Ax = b. This leads to the following result.

Theorem 1.5(Solution Sets)


Suppose xp is a particular solution to the system of equations Ax = b. Then the set of all
solutions x to Ax = b is the same as the set of all matrices of the form xp + h, where h ranges
over the set of all solutions of the homogeneous system Ah = 0.

Chapter 2

Eigenvalues and Eigenvectors

Let A be an n by n matrix. A vector x ≠ o satisfying

Ax = λx

is called an eigenvector corresponding to the eigenvalue λ. Rewriting,

Ax = λIx
(A − λI)x = o.

Since x ≠ o and (A − λI)x = o, the matrix A − λI must be singular and the solutions
x are infinitely many. Since A − λI is singular, its determinant must be zero. Hence the
characteristic polynomial is

det(A − λI) = |A − λI|.

Eigenvalues are the roots of the characteristic polynomial. The characteristic equation is
det(A − λI) = 0. x is sometimes called a right eigenvector of A. A left eigenvector of A
is an eigenvector of A^T. That is,

A^T y = λy,  y ≠ o,

or

y^T A = λy^T.

2.0.1 How to find Eigenvalues and Eigenvectors


1. Find the characteristic polynomial = det(A − λI) of A.

2. Find the roots of the characteristic polynomial.

3. For each eigenvalue, solve the equation (A − λI)x = o. Since the determinant is zero,
there are solutions other than x = o; those are the eigenvectors.

The set of all eigenvectors of A is called the eigenspace of A. It is often defined as the
nullspace of (A − λI).
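
In Octave, steps (1)–(3) are carried out by the built-in eig; a small sketch (the matrix A here is just an arbitrary example):

A = [2 4; 3 9];        % an arbitrary square matrix
[V, D] = eig(A);       % eigenvectors in the columns of V, eigenvalues on the diagonal of D
lambda = diag(D)       % the eigenvalues
A*V - V*D              % approximately zero: A*v = lambda*v, column by column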

Definition 2.0.1. When the characteristic polynomial of an n by n matrix is written in the
form
$$\det(A - \lambda I) = (\lambda_1 - \lambda)^{m_1} (\lambda_2 - \lambda)^{m_2} \cdots (\lambda_q - \lambda)^{m_q}$$
with λi ≠ λj for 1 ≤ i ≠ j ≤ q and m1 + m2 + m3 + · · · + mq = n, then mi is called the
algebraic multiplicity of the eigenvalue λi. A simple eigenvalue is an eigenvalue of algebraic
multiplicity one.

Definition 2.0.2. Let V_λ = nullspace(A − λI). The dimension of the eigenspace V_λ is the
geometric multiplicity of λ.

Definition 2.0.3. If the algebraic multiplicity of some eigenvalue of A is greater than its
geometric multiplicity, then the matrix is called defective.

Theorem 2.0.1.

Nonzero eigenvectors belonging to distinct eigenvalues are linearly independent.


Proof. The proof is by induction. Let v1, v2, · · · , vk be nonzero eigenvectors of A belonging
to distinct eigenvalues λ1, · · · , λk. For k = 1 the result is immediate, because an eigenvector
cannot be zero, so c1 v1 = o forces c1 = 0. Assume the result holds
for k − 1 eigenvectors. Consider the linear combination

c1 v1 + c2 v2 + · · · + ck−1 vk−1 + ck vk = o,   (2.1)

where the ci's are scalars. Multiply (2.1) by the matrix A:

A(c1 v1 + c2 v2 + · · · + ck−1 vk−1 + ck vk) = c1 Av1 + c2 Av2 + · · · + ck−1 Avk−1 + ck Avk

= c1 λ1 v1 + c2 λ2 v2 + · · · + ck−1 λk−1 vk−1 + ck λk vk = o.

By multiplying (2.1) by λk, we also have c1 λk v1 + c2 λk v2 + · · · + ck−1 λk vk−1 + ck λk vk = o.

Subtract this from the previous equation; the terms involving vk cancel and we are left with
the equation
c1 (λ1 − λk)v1 + · · · + ck−1 (λk−1 − λk)vk−1 = o.
This is a linear combination of the first k − 1 eigenvectors. By the induction hypothesis these
are linearly independent, so all the coefficients must be zero. That is,

c1 (λ1 − λk) = 0, c2 (λ2 − λk) = 0, · · · , ck−1 (λk−1 − λk) = 0.

Since the eigenvalues were assumed to be distinct, λj ≠ λk when j ≠ k. This implies that
c1 = c2 = · · · = ck−1 = 0. Substituting this back into equation (2.1) gives ck vk = o, and so
ck = 0 because vk ≠ o. We have proved that (2.1) holds if and only if c1 = c2 = · · · = ck = 0,
which implies the linear independence of the eigenvectors v1, v2, · · · , vk−1, vk.

Crucial
The general formula for the determinant of an n by n matrix with entries aij:
$$\det A = \sum_{\pi} (\operatorname{sign}\pi)\, a_{\pi(1),1}\, a_{\pi(2),2} \cdots a_{\pi(n),n} \qquad (2.2)$$
The sum is over all possible permutations π of the rows of A. The 'sign' of the permutation,
written sign π, equals the determinant of the corresponding permutation matrix P: sign
π = det P = +1 if the permutation is composed of an even number of row exchanges, and
−1 if composed of an odd number. For example, the six terms in the well known formula
$$\begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{31}a_{12}a_{23} + a_{21}a_{32}a_{13} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33} - a_{31}a_{22}a_{13}$$
for a 3 by 3 determinant correspond to the six possible permutations of a 3-rowed matrix:
$$\begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}, \begin{pmatrix} 0&1&0\\0&0&1\\1&0&0 \end{pmatrix}, \begin{pmatrix} 0&0&1\\1&0&0\\0&1&0 \end{pmatrix}, \begin{pmatrix} 0&1&0\\1&0&0\\0&0&1 \end{pmatrix}, \begin{pmatrix} 0&0&1\\0&1&0\\1&0&0 \end{pmatrix}, \begin{pmatrix} 1&0&0\\0&0&1\\0&1&0 \end{pmatrix}.$$

Theorem 2.0.2. (Characteristic Polynomials)


Let A be an n by n matrix. Then
(a). det(A − λI) is a polynomial of exact degree n in the variable λ; it is called the
characteristic polynomial fA(λ) of A:

$$f_A(\lambda) = \det(A - \lambda I) = c_n \lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_1 \lambda + c_0 \qquad (2.3)$$

(b). The coefficient of λ^n in fA(λ) equals (−1)^n.

(c). The coefficient of λ^{n−1} in fA(λ) equals (−1)^{n−1} tr(A), where tr(A) (the trace of A) is
the sum of the entries on the main diagonal of A.
(d). The constant term of fA(λ) equals det A.

Proof
(a). Let fA(λ) = det(A − λI) = cn λ^n + cn−1 λ^{n−1} + · · · + c1 λ + c0, with
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$$
Then
$$|A - \lambda I| = \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22}-\lambda & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda \end{vmatrix}.$$
The fact that fA(λ) is a polynomial of degree n is a consequence of the general determinantal
formula (2.2). Indeed, every term is prescribed by a permutation π of the rows of the matrix,
and equals plus or minus a product of n distinct matrix entries including one from each row
and one from each column. The term corresponding to the identity permutation is obtained
by multiplying the diagonal entries together:

$$(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda) = (-1)^n \lambda^n + (-1)^{n-1}(a_{11} + a_{22} + \cdots + a_{nn})\lambda^{n-1} + \cdots \qquad (2.4)$$

All of the other terms have at most n − 2 diagonal factors aii − λ, and so are polynomials of
degree at most n − 2 in λ.
(b). By comparing coefficients, cn = (−1)^n.
(c). cn−1 = (−1)^{n−1}(a11 + a22 + · · · + ann) = (−1)^{n−1} tr(A).
(d). The constant term of any polynomial f(λ) can be found as f(0); since fA(λ) = det(A − λI),
fA(0) = det(A − 0I) = det A.
Example of Theorem 2.0.2
Let
$$A = \begin{pmatrix} a & b\\ c & d \end{pmatrix},$$
so that det A = ad − bc and tr(A) = a + d.

The characteristic polynomial is

$$f_A(\lambda) = \det(A - \lambda I) = \begin{vmatrix} a-\lambda & b\\ c & d-\lambda \end{vmatrix}.$$

The characteristic equation fA(λ) = 0 simplifies to

fA(λ) = (a − λ)(d − λ) − bc
= ad − aλ − dλ + λ² − bc
= λ² − (a + d)λ + ad − bc
= λ² − tr(A)λ + det(A) = 0.

Hence, the eigenvalues of A become

$$\lambda_{1,2} = \frac{(a + d) \pm \sqrt{(a + d)^2 - 4(ad - bc)}}{2} = \frac{\operatorname{tr}(A) \pm \sqrt{(\operatorname{tr} A)^2 - 4\det A}}{2}.$$

By the Fundamental Theorem of Algebra, every polynomial of degree n ≥ 1 can be completely
factored, so we can write the characteristic polynomial in factored form:

$$f_A(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n) \qquad (2.7)$$

The numbers λ1, λ2, · · · , λn, some of which may be repeated, are the roots of the characteristic
equation fA(λ) = 0, and hence the eigenvalues of the matrix A. Observe that

fA(0) = (−1)^n (0 − λ1)(0 − λ2) · · · (0 − λn)
= (−1)^{2n} (λ1 λ2 · · · λn)
= λ1 λ2 · · · λn  for all n ∈ N.

If we multiply out (2.7) explicitly and equate the result to the characteristic polynomial
(2.3), we find that the coefficients

c0 = λ1 λ2 · · · λn, and cn−1 = (−1)^{n−1}(λ1 + λ2 + · · · + λn).

Comparison with our previous formulae for the coefficients c0 and cn−1 leads to the following
result.

Theorem 2.0.3. The sum of the eigenvalues of a matrix equals its trace:

λ1 + λ2 + · · · + λn = tr(A) = a11 + a22 + · · · + ann.

The product of the eigenvalues equals its determinant:

det A = λ1 λ2 · · · λn.
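
A quick numerical check of Theorem 2.0.3 in Octave (using the matrix of Example 2.0.1 below):

A = [1 3; 2 1];          % the matrix of Example 2.0.1 below
lambda = eig(A);
[sum(lambda), trace(A)]  % both equal 2
[prod(lambda), det(A)]   % both equal -5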

Theorem 2.0.4. (Cayley–Hamilton Theorem)

Every matrix is a zero of its characteristic polynomial.


Proof: Assignment.

Example 2.0.1.

Show that the matrix below satisfies its characteristic polynomial.

$$A = \begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix}$$

fA(λ) = det(A − λI) = λ² − 2λ − 5.

A is a zero of fA(λ) since

$$f_A(A) = \begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix}^2 - 2\begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix} - 5\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}.$$
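
The same check can be done numerically in Octave; note that poly(A) returns the coefficients of det(λI − A), which for this 2 by 2 matrix coincides with fA(λ):

A = [1 3; 2 1];
A^2 - 2*A - 5*eye(2)              % f_A(A) = A^2 - 2A - 5I = zero matrix
c = poly(A);                      % coefficients of det(lambda*I - A): [1 -2 -5]
c(1)*A^2 + c(2)*A + c(3)*eye(2)   % zero matrix again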

2.1 EIGENVECTORS AND DIAGONALIZABILITY
A square matrix A is said to be diagonalizable if there exists a nonsingular matrix S and a
diagonal matrix Λ = diag(λ1, λ2, · · · , λn) such that
$$S^{-1}AS = \Lambda = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix},$$
or equivalently A = SΛS^{-1}.

Note: We call S the "eigenvector matrix" and Λ the "eigenvalue matrix".

Theorem 2.1.1. Let A be a square n by n matrix with n linearly independent eigenvectors.


If these eigenvectors are the columns of a matrix S, then S^{-1}AS is a diagonal matrix Λ.
The eigenvalues of A are on the diagonal of Λ.

Proof. S is formed by putting the eigenvectors xi in its columns. Compute AS by columns:

$$AS = A\begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} = \begin{pmatrix} Ax_1 & Ax_2 & \cdots & Ax_n \end{pmatrix} = \begin{pmatrix} \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_n x_n \end{pmatrix}.$$

Now, we split the last matrix into SΛ via

$$\begin{pmatrix} \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_n x_n \end{pmatrix} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix}\begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}.$$

Therefore, AS = SΛ, or S^{-1}AS = Λ, or A = SΛS^{-1}. S is invertible because its columns (the
eigenvectors) were assumed to be linearly independent.

Remarks

1. If the matrix A has no repeated eigenvalues, that is, the eigenvalues λ1, λ2, · · · , λn are
distinct, then its n eigenvectors are automatically independent. Any matrix with distinct
eigenvalues can be diagonalized.

2. The diagonalizing matrix S is not unique.

3. Not all matrices are diagonalizable.

4. Defective matrices are not diagonalizable. The standard example of a "defective
matrix" is
$$A = \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}.$$
Its eigenvalues are λ1 = λ2 = 0, since it is triangular with zeros on the diagonal:
$$\det(A - \lambda I) = \det\begin{pmatrix} -\lambda & 1\\ 0 & -\lambda \end{pmatrix} = \lambda^2.$$
All eigenvectors of A are multiples of the vector (1, 0)^T:
$$\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}x = \begin{pmatrix} 0\\ 0 \end{pmatrix} \quad\Longrightarrow\quad x = \begin{pmatrix} c\\ 0 \end{pmatrix}.$$

The algebraic multiplicity of the eigenvalue 0 is 2 but its geometric multiplicity is one. A is not diagonalizable.

Examples
1. A = \begin{pmatrix} 1 & -1\\ 2 & 4 \end{pmatrix}, with λ1 = 2 and λ2 = 3, and eigenvectors
$$x_1 = \begin{pmatrix} 1\\ -1 \end{pmatrix} \quad\text{and}\quad x_2 = \begin{pmatrix} 1\\ -2 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 1\\ -1 & -2 \end{pmatrix}.$$

$$S^{-1}AS = \begin{pmatrix} 2 & 1\\ -1 & -1 \end{pmatrix}\begin{pmatrix} 1 & -1\\ 2 & 4 \end{pmatrix}\begin{pmatrix} 1 & 1\\ -1 & -2 \end{pmatrix} = \begin{pmatrix} 2 & 0\\ 0 & 3 \end{pmatrix}.$$

 
2. K = \begin{pmatrix} 0 & -1\\ 1 & 0 \end{pmatrix}; det(K − λI) = λ² + 1 = 0, so λ1 = i and λ2 = −i.
$$(K - \lambda_1 I)x_1 = \begin{pmatrix} -i & -1\\ 1 & -i \end{pmatrix}\begin{pmatrix} a\\ b \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix},$$

−ai − b = 0,  a − ib = 0 ⟹ a = ib,
$$x_1 = \begin{pmatrix} 1\\ -i \end{pmatrix}.$$
Similarly,
$$(K - \lambda_2 I)x_2 = \begin{pmatrix} i & -1\\ 1 & i \end{pmatrix}\begin{pmatrix} a\\ b \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}, \qquad x_2 = \begin{pmatrix} 1\\ i \end{pmatrix}.$$
$$S = \begin{pmatrix} 1 & 1\\ -i & i \end{pmatrix} \quad\text{and}\quad S^{-1}KS = \begin{pmatrix} i & 0\\ 0 & -i \end{pmatrix}.$$

 
3. A = \begin{pmatrix} 0 & -1 & -1\\ 1 & 2 & 1\\ 1 & 1 & 2 \end{pmatrix}; det(A − λI) = −λ³ + 4λ² − 5λ + 2 = −(λ − 1)²(λ − 2) = 0, so
λ1 = λ2 = 1 and λ3 = 2, with eigenvectors
$$x_1 = \begin{pmatrix} -1\\ 1\\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} -1\\ 0\\ 1 \end{pmatrix} \quad\text{and}\quad x_3 = \begin{pmatrix} -1\\ 1\\ 1 \end{pmatrix}.$$
The eigenvector matrix and its inverse are
$$S = \begin{pmatrix} -1 & -1 & -1\\ 1 & 0 & 1\\ 0 & 1 & 1 \end{pmatrix} \quad\text{and}\quad S^{-1} = \begin{pmatrix} -1 & 0 & -1\\ -1 & -1 & 0\\ 1 & 1 & 1 \end{pmatrix},$$
$$S^{-1}AS = \begin{pmatrix} -1 & 0 & -1\\ -1 & -1 & 0\\ 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & -1 & -1\\ 1 & 2 & 1\\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} -1 & -1 & -1\\ 1 & 0 & 1\\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{pmatrix}.$$

2.2 POWERS OF MATRICES: A^k


The eigenvalues of A² are λ1², λ2², · · · , λn², and every eigenvector of A is also an eigenvector
of A². If Ax = λx, then A²x = A(Ax) = A(λx) = λAx = λ(λx) = λ²x. Thus λ² is an eigenvalue
of A² with the same eigenvector x. The same result can be obtained by diagonalization:

(S^{-1}AS)(S^{-1}AS) = Λ·Λ = Λ²,

$$S^{-1}A^2S = \Lambda^2 = \begin{pmatrix} \lambda_1^2 & & & \\ & \lambda_2^2 & & \\ & & \ddots & \\ & & & \lambda_n^2 \end{pmatrix},$$

or

A² = (SΛS^{-1})(SΛS^{-1}) = SΛ(S^{-1}S)ΛS^{-1} = (SΛ)(ΛS^{-1}) = SΛ²S^{-1}.

In the same vein,

A^k = (SΛS^{-1})(SΛS^{-1})(SΛS^{-1}) · · · (SΛS^{-1})
= SΛ(S^{-1}S)Λ(S^{-1}S)Λ(S^{-1}S) · · · (SΛS^{-1}).

Each S^{-1} cancels an S, except for the first S and the last S^{-1}. Hence

A^k = SΛ^k S^{-1}.
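
A numerical sketch of A^k = SΛ^kS^{-1} in Octave, using Example 1 from Section 2.1:

A = [1 -1; 2 4];           % Example 1 of Section 2.1
[S, Lambda] = eig(A);
k = 10;
Ak = S * Lambda^k * inv(S);
norm(Ak - A^k)             % approximately zero: both routes agree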

HOMEWORK
If
$$A = \begin{pmatrix} 4 & 3\\ 1 & 2 \end{pmatrix},$$
find A^{100} by diagonalizing A.

TUTORIAL TWO

1. Find the eigenvalues and eigenvectors of the following matrices

1 − 23
       
1 −2 3 1 1 2
(a). A = (b). F = 1 (c). (d).
−2 1 2
− 16 −1 1 −1 1

 
  4 0 0 0  
3 −1 0 1 2 −1 −1
3 0 0

(e). −1 2 −1 (f ). 
−1 1 (g). A = −2 1 1
2 0
0 −1 3 1 0 1
1 −1 1 1

 
$$\text{(h).}\quad \begin{pmatrix} 0 & c & -b\\ -c & 0 & a\\ b & -a & 0 \end{pmatrix}$$
2(a). Find the eigenvalues of the rotation matrix
$$R = \begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}.$$
2(b). For what values of θ are the eigenvalues real?

2(c). Repeat 2(a) for
$$F = \begin{pmatrix} \cos\theta & \sin\theta\\ \sin\theta & -\cos\theta \end{pmatrix}.$$

TUTORIAL THREE

(1). Choose a, b, c such that det(A − λI) = 9λ − λ³. Then the eigenvalues are −3, 0, 3.
$$A = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ a & b & c \end{pmatrix}$$
(2). Compute the eigenvalues and corresponding eigenvectors of
$$A = \begin{pmatrix} 1 & 4 & 4\\ 3 & -1 & 0\\ 0 & 2 & 3 \end{pmatrix}$$
(b). Compute the trace of A and check that it equals the sum of the eigenvalues.

(c). Find the determinant of A and check that it is equal to the product of the eigen-
values

(3). Suppose that λ is an eigenvalue of A.

(a). Prove that cλ is an eigenvalue of cA

(b). Prove that λ + d is an eigenvalue of A + dI

(c). More generally, cλ + d is an eigenvalue of B = cA + dI for scalers c and d.

(d). Prove that if λ ≠ 0 is a nonzero eigenvalue of A, then 1/λ is an eigenvalue of A^{-1}.

(e). In (d) above, what happens if A has 0 as an eigenvalue


 
(4). Let A = \begin{pmatrix} a & b\\ c & d \end{pmatrix} be a 2 by 2 matrix.
(a). Prove that A satisfies its own characteristic equation, meaning

fA(A) = A² − tr(A)A + det(A)I = 0.

(b). Prove the inverse formula A^{-1} = ((tr A)I − A)/det A.

 
(c). Check the Cayley–Hamilton theorem for A = \begin{pmatrix} 2 & 1\\ -3 & 2 \end{pmatrix}.

TUTORIAL FOUR

Diagonalize the following matrices and find A⁵.
$$\text{(1).}\ \begin{pmatrix} 3 & -9\\ 2 & -6 \end{pmatrix} \quad \text{(2).}\ A = \begin{pmatrix} 5 & -4\\ 2 & -1 \end{pmatrix} \quad \text{(3).}\ K = \begin{pmatrix} -4 & -2\\ 5 & 2 \end{pmatrix}$$
$$\text{(4).}\ A = \begin{pmatrix} -2 & 3 & 1\\ 0 & 1 & 6\\ 0 & 0 & 3 \end{pmatrix} \quad \text{(5).}\ C = \begin{pmatrix} 3 & 3 & 5\\ 5 & 6 & 5\\ -5 & -8 & -7 \end{pmatrix} \quad \text{(6).}\ K = \begin{pmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix}$$
 
2 1 −1 0  
−3 −2 0 2 5 5
1 (8). B = 0 2
(7). A = 
0 0
0 1 2
0 −5 −3
0 0 1 −1

Diagonalize the following complex matrices
$$\begin{pmatrix} i & 1\\ 1 & i \end{pmatrix}, \quad \begin{pmatrix} 2-i & 2+i\\ 3-i & 1+i \end{pmatrix}, \quad \begin{pmatrix} -i & 0 & 1\\ -i & 1 & -1\\ 1 & 0 & -i \end{pmatrix}$$

Chapter 3

Eigenvalues & Similarity


Transformations

3.1 Definition
Let A and B be two square matrices and P a non-singular matrix. B is said to be similar
to A if
B = P^{-1}AP.   (3.1)
B can also be said to be obtained from A by a similarity transformation.

3.2 Example
Similarity of matrices has some properties which we state by means of theorems.

3.3 THEOREM
3.3.1 Similarity As An Equivalence
Similarity of matrices is an equivalence relation:
1. A is similar to itself.

2. If B is similar to A, then A is similar to B.

3. If C is similar to B and B is similar to A, then C is similar to A.

3.3.2 Proof
1. Let P = I, where I is the n by n identity matrix. Then A = P^{-1}AP = I^{-1}AI; thus A
is similar to A because I is non-singular.

2. If B = P^{-1}AP, then we want to show that A is similar to B. Pre-multiplying both
sides by P gives PB = PP^{-1}AP = IAP = AP, so PB = AP. Now post-multiply both sides
by P^{-1}:
PBP^{-1} = APP^{-1} = AI = A.
Since A = PBP^{-1} = (P^{-1})^{-1}BP^{-1}, letting P^{-1} = Q gives A = Q^{-1}BQ.

3. If C is similar to B, then ∃ a non-singular matrix Q such that C = Q^{-1}BQ; since B is
similar to A,
B = P^{-1}AP
C = Q^{-1}BQ
C = Q^{-1}(P^{-1}AP)Q
C = (Q^{-1}P^{-1})A(PQ)
C = (PQ)^{-1}A(PQ)
C = S^{-1}AS
with S = PQ. This implies that C is similar to A.

The next theorem shows the connection between the eigenvalues and eigenvectors of similar
matrices.

3.3.3 Similarity And Eigensystems


A. Similar matrices have the same characteristic polynomial and the same eigenvalues.

B. Let B be similar to A, with B = P^{-1}AP. Then x is an eigenvector of A corresponding
to the eigenvalue λ iff P^{-1}x is an eigenvector of B corresponding to the eigenvalue λ.

3.3.4 Proof
A. Since det P^{-1} = 1/det P, we have

det(B − λI) = det[P^{-1}AP − λI]
= det[P^{-1}AP − λP^{-1}P]
= det[P^{-1}(A − λI)P]
= det P^{-1} det(A − λI) det P
= det(A − λI),

showing that the characteristic polynomials are the same. Since the eigenvalues are the
roots of the characteristic polynomial, similar matrices have the same eigenvalues.
B. Recall Ax = λx, B = P^{-1}AP and A = PBP^{-1}. Then
Ax = (PBP^{-1})x = λx
P^{-1}PBP^{-1}x = λP^{-1}x
B(P^{-1}x) = λ(P^{-1}x)
⟹ P^{-1}x is an eigenvector of B corresponding to the eigenvalue λ.

3.4 Example
Consider the matrices A = \begin{pmatrix} 2 & -3\\ 1 & -1 \end{pmatrix} and B = \begin{pmatrix} 0 & -1\\ 1 & 1 \end{pmatrix}. Then
$$\det(A - \lambda I) = \det\begin{pmatrix} 2-\lambda & -3\\ 1 & -1-\lambda \end{pmatrix} = \lambda^2 - \lambda + 1$$
and
$$\det(B - \lambda I) = \det\begin{pmatrix} 0-\lambda & -1\\ 1 & 1-\lambda \end{pmatrix} = \lambda^2 - \lambda + 1.$$
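
This can be confirmed in Octave with poly, which returns the coefficients of det(λI − A); both matrices give λ² − λ + 1:

A = [2 -3; 1 -1];
B = [0 -1; 1 1];
poly(A)   % [1 -1 1], i.e. lambda^2 - lambda + 1
poly(B)   % [1 -1 1], the same characteristic polynomial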

3.5 THEOREM
3.5.1 Similarity and Powers
Let B be similar to A, with B = P^{-1}AP. Then

A. B^k is similar to A^k, with B^k = P^{-1}A^kP, for every positive integer k.

B. det B = det A.

C. B is non-singular iff A is non-singular.

D. If A and B are non-singular, then B^k is similar to A^k with B^k = P^{-1}A^kP for every
integer k. In particular, B^{-1} = P^{-1}A^{-1}P.

E. If f is a polynomial with f(x) = am x^m + am−1 x^{m−1} + · · · + a1 x + a0, and if f(X) for a
square matrix X denotes am X^m + am−1 X^{m−1} + · · · + a0 I, then f(B) is similar to f(A),
with f(B) = P^{-1}f(A)P.

PROOF

a. B^k = (P^{-1}AP)(P^{-1}AP) · · · (P^{-1}AP) (k times); since PP^{-1} = I, B^k = P^{-1}A^kP.

b. det B = det(P^{-1}AP) = det P^{-1} det(A) det P = det(A).

c. Immediate from (b) and the fact that a matrix is non-singular iff its determinant is
non-zero.

d. B^{-1} = (P^{-1}AP)^{-1} = P^{-1}A^{-1}(P^{-1})^{-1} = P^{-1}A^{-1}P.

e. f(B) = am B^m + am−1 B^{m−1} + · · · + a0 I
= am P^{-1}A^mP + am−1 P^{-1}A^{m−1}P + · · · + a0 P^{-1}P
= P^{-1}(am A^m + am−1 A^{m−1} + · · · + a1 A + a0 I)P
= P^{-1}f(A)P.

3.5.2 Example on similarity
Consider the matrix
$$A = \begin{pmatrix} 1 & 2\\ 3 & 2 \end{pmatrix}.$$
Find an invertible matrix P such that B = P^{-1}AP is diagonal.
The characteristic polynomial is
$$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 2\\ 3 & 2-\lambda \end{vmatrix} = \lambda^2 - 3\lambda - 4 = 0.$$
After finding the roots of the characteristic equation, we obtain λ = 4 or λ = −1.
Let
$$x = \begin{pmatrix} a\\ b \end{pmatrix}$$
be the eigenvector corresponding to the eigenvalue λ = 4, so that Ax = 4x or (A − 4I)x = 0:
$$(A - 4I)x = \begin{pmatrix} -3 & 2\\ 3 & -2 \end{pmatrix}\begin{pmatrix} a\\ b \end{pmatrix} = 0,$$

−3a + 2b = 0
3a − 2b = 0, or 3a = 2b.

Let a = 2 and b = 3; then
$$x = \begin{pmatrix} 2\\ 3 \end{pmatrix}$$
is a non-zero eigenvector belonging to the eigenvalue λ = 4. Let
$$y = \begin{pmatrix} c\\ d \end{pmatrix}$$
be the eigenvector corresponding to the eigenvalue λ = −1. Now

$$(A - (-I))y = (A + I)y = \begin{pmatrix} 2 & 2\\ 3 & 3 \end{pmatrix}\begin{pmatrix} c\\ d \end{pmatrix} = 0,$$

2c + 2d = 0
3c + 3d = 0
⟹ c + d = 0,
i.e. c = −d. Let c = 1 and d = −1, so
$$y = \begin{pmatrix} c\\ d \end{pmatrix} = \begin{pmatrix} 1\\ -1 \end{pmatrix}.$$
Let
$$P = \begin{pmatrix} 2 & 1\\ 3 & -1 \end{pmatrix}$$
be the non-singular matrix with inverse
$$P^{-1} = \begin{pmatrix} 1/5 & 1/5\\ 3/5 & -2/5 \end{pmatrix}.$$
A is similar to the diagonal matrix of eigenvalues:
$$B = P^{-1}AP = \begin{pmatrix} 1/5 & 1/5\\ 3/5 & -2/5 \end{pmatrix}\begin{pmatrix} 1 & 2\\ 3 & 2 \end{pmatrix}\begin{pmatrix} 2 & 1\\ 3 & -1 \end{pmatrix} = \begin{pmatrix} 4 & 0\\ 0 & -1 \end{pmatrix}.$$
The diagonal elements 4 and −1 of the matrix B are the eigenvalues corresponding to the
given eigenvectors.
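
A one-line check of this example in Octave:

A = [1 2; 3 2];
P = [2 1; 3 -1];
inv(P)*A*P   % diag(4, -1), as computed above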

Tutorial

For each matrix below, find all eigenvalues and eigenvectors.

1. A = \begin{pmatrix} 2 & 2\\ 1 & 3 \end{pmatrix}

2. B = \begin{pmatrix} 4 & 2\\ 3 & 3 \end{pmatrix}

3. C = \begin{pmatrix} 5 & -1\\ 1 & 3 \end{pmatrix}

4. Find invertible matrices P, Q and R such that P^{-1}AP, Q^{-1}BQ and R^{-1}CR are diagonal.

5. Show that A and AT have the same eigenvalues.

Chapter 4

Eigenvalues of Symmetric Matrices

4.1 Introduction
If A = A^T is real and symmetric, then (Ax)^T y = x^T A^T y = x^T(Ay). Simply,

(Ax)^T y = x^T(Ay).   (4.1)

For two real vectors x and y, the Euclidean dot product (or inner product) is
x^T y = x1y1 + x2y2 + · · · + xnyn
y^T x = x1y1 + x2y2 + · · · + xnyn = x^T y.

If x and y are complex vectors, then the Hermitian dot product of x and y is in general x^H y ≠ y^H x:
x^H y = x̄1y1 + x̄2y2 + · · · + x̄nyn = x̄^T y
y^H x = ȳ1x1 + ȳ2x2 + · · · + ȳnxn = ȳ^T x = x^T ȳ.

For a complex vector, ‖x‖² = x^H x = x̄1x1 + x̄2x2 + · · · + x̄nxn = |x1|² + |x2|² + · · · + |xn|².

A symmetric matrix is a matrix that is equal to its transpose, i.e. A^T = A, or aij = aji for i, j = 1, 2, · · · , n.
A symmetric matrix need not be invertible; it could be a zero matrix. If A^{-1} exists, then it is
also symmetric. For a complex matrix, the term Hermitian matrix is used: A is Hermitian
if A^H = Ā^T = A, where Ā^T denotes the conjugate transpose of A and A^H is the Hermitian
transpose of A. Moreover, āji = aij.

4.1.1 THEOREM
Let A = A^T be a real symmetric n by n matrix. Then:

a. All the eigenvalues of A are real.

b. Eigenvectors corresponding to distinct eigenvalues are orthogonal.

c. There is an orthonormal basis of R^n consisting of n eigenvectors of A.

4.1.2 PROOF
a. Consider Ax = λx. Pre-multiply both sides by x^T to yield x^T Ax = x^T λx, so
$$\lambda = \frac{x^T A x}{x^T x} = \frac{x^T A x}{\|x\|^2}.$$
Since x ≠ 0 (it is an eigenvector) and x^T Ax is real, the eigenvalue λ is real.
b. Let Ax = λx and Ay = µy with λ ≠ µ. Then
y^T Ax = λ y^T x = λ x^T y.
Pre-multiply both sides of Ay = µy by x^T to yield x^T Ay = µ x^T y. But x^T Ay = y^T Ax,
hence
λ x^T y = µ x^T y
(λ − µ) x^T y = 0.
Since λ ≠ µ, this implies x^T y = 0.
c. Exercise

4.1.3 Hermitian Matrix


For a real matrix, symmetry means A = A^T. For a matrix with complex entries, the notion of symmetry
needs to be extended. The generalization is not to matrices that equal their transpose, but
to matrices that equal their conjugate transpose. These are the Hermitian matrices, and a
typical example is
$$A = \begin{pmatrix} 2 & 3-3i\\ 3+3i & 5 \end{pmatrix} = A^H.$$
3 − 3i is the conjugate of 3 + 3i. A real symmetric matrix is Hermitian. For real matrices
there is no difference between A^T and A^H (read "A Hermitian").

4.1.4 THEOREM
If A = A^H, then for all complex vectors x, the number x^H Ax is real.

4.1.5 PROOF
(x^H Ax)^H = x^H A^H x = x^H Ax. But (x^H Ax)^H is the conjugate of the scalar x^H Ax, and we
get the same number again, so that number must be real.

4.1.6 EXAMPLE
1. A = \begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix}
has real eigenvalues λ1 = 4 and λ2 = 2. The corresponding eigenvectors
$$x_1 = \begin{pmatrix} 1\\ 1 \end{pmatrix} \quad\text{and}\quad x_2 = \begin{pmatrix} -1\\ 1 \end{pmatrix}$$
are orthogonal:
$$x_1^T x_2 = \begin{pmatrix} 1 & 1 \end{pmatrix}\begin{pmatrix} -1\\ 1 \end{pmatrix} = (1)(-1) + (1)(1) = 0.$$

The orthonormal eigenvector basis from Theorem 4.1.1 is obtained by dividing each
eigenvector by its Euclidean norm: ‖x1‖ = √(1² + 1²) = √2 and ‖x2‖ = √((−1)² + 1²) = √2, so
$$u_1 = \begin{pmatrix} 1/\sqrt{2}\\ 1/\sqrt{2} \end{pmatrix} \quad\text{and}\quad u_2 = \begin{pmatrix} -1/\sqrt{2}\\ 1/\sqrt{2} \end{pmatrix}.$$

2. A = \begin{pmatrix} 5 & -4 & 2\\ -4 & 5 & 2\\ 2 & 2 & -1 \end{pmatrix}
The eigenvalues are λ1 = 9, λ2 = 3 and λ3 = −3, with eigenvectors
$$x_1 = \begin{pmatrix} 1\\ -1\\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix} \quad\text{and}\quad x_3 = \begin{pmatrix} 1\\ 1\\ -2 \end{pmatrix}.$$
We want to show that the eigenvectors form an orthogonal basis:

x1^T x2 = 1 − 1 + 0 = 0,
x1^T x3 = 1 − 1 + 0 = 0.

CHECK: x2^T x3 = 0. To form an orthonormal basis from the eigenvectors, ‖x1‖ = √2,
‖x2‖ = √3 and ‖x3‖ = √6, so
$$u_1 = \begin{pmatrix} 1/\sqrt{2}\\ -1/\sqrt{2}\\ 0 \end{pmatrix}, \quad u_2 = \begin{pmatrix} 1/\sqrt{3}\\ 1/\sqrt{3}\\ 1/\sqrt{3} \end{pmatrix}, \quad u_3 = \begin{pmatrix} 1/\sqrt{6}\\ 1/\sqrt{6}\\ -2/\sqrt{6} \end{pmatrix}.$$

The eigenvalues of a symmetric matrix can be used to test its positive definiteness, as
stated in the next theorem.

Theorem

A symmetric matrix is positive definite if and only if all its eigenvalues are strictly
positive.

Proof

See Olver and Shakiban, page 414.

3. K = \begin{pmatrix} 8 & 0 & 1\\ 0 & 8 & 1\\ 1 & 1 & 7 \end{pmatrix}
Its characteristic equation is
det(K − λI) = −λ³ + 23λ² − 174λ + 432 = −(λ − 9)(λ − 8)(λ − 6).
The eigenvalues are 9, 8 and 6. Since they are all positive, K is a positive definite
matrix. The eigenvectors are
$$x_1 = \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} -1\\ 1\\ 0 \end{pmatrix} \quad\text{and}\quad x_3 = \begin{pmatrix} -1\\ -1\\ 2 \end{pmatrix}.$$
The eigenvectors form an orthogonal basis of R³. The corresponding orthonormal
eigenvector basis, obtained by dividing each eigenvector by its norm, is
$$u_1 = \begin{pmatrix} 1/\sqrt{3}\\ 1/\sqrt{3}\\ 1/\sqrt{3} \end{pmatrix}, \quad u_2 = \begin{pmatrix} -1/\sqrt{2}\\ 1/\sqrt{2}\\ 0 \end{pmatrix}, \quad u_3 = \begin{pmatrix} -1/\sqrt{6}\\ -1/\sqrt{6}\\ 2/\sqrt{6} \end{pmatrix}.$$

Theorem
Two eigenvectors of a real symmetric or Hermitian matrix, if they come from different
eigenvalues, are orthogonal to one another.

Proof
Let Ax = λx and Ay = µy, with λ ≠ µ and A = A^H. Then
(λx)^H y = (Ax)^H y = x^H A^H y = x^H Ay = x^H(µy).
The outside numbers give λ x^H y = µ x^H y.
Since the eigenvalues are real and λ ≠ µ, (λ − µ)x^H y = 0.
Hence x^H y = 0.

4.2 The Spectral Theorem and Orthogonal Diagonalisation
At the beginning of this chapter, we showed that if a matrix has a full set of eigenvectors,
it is diagonalizable, i.e. S^{-1}AS = Λ. However, if the matrix is Hermitian, the diagonalizing
matrix can be chosen with orthonormal columns. In this section, we state that a symmetric
matrix has an orthogonal diagonalization. For a real and symmetric matrix, its eigenvalues
are real and its eigenvectors are orthogonal; the eigenvectors can be chosen real. The orthonormalized
eigenvectors go into an orthogonal matrix Q. A matrix Q is said to be orthogonal if
Q^T Q = I = Q^H Q and Q^T = Q^H.
Hence S^{-1}AS = Λ becomes Q^{-1}AQ = Λ, or A = QΛQ^{-1} = QΛQ^T.
This leads to the following important theorem of Linear Algebra:

Theorem
A real symmetric matrix can be factored into A = QΛQ^T. Its orthonormal eigenvectors are
in the orthogonal matrix Q and its eigenvalues are in Λ.

Proof
Exercise.
In Mathematics, the formula A = QΛQ^T is known as the spectral theorem and can also be
referred to as an orthogonal diagonalization of A. If we multiply columns by rows,
$$A = Q\Lambda Q^T = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix}\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}\begin{pmatrix} x_1^T\\ x_2^T\\ \vdots\\ x_n^T \end{pmatrix} = \lambda_1 x_1 x_1^T + \lambda_2 x_2 x_2^T + \cdots + \lambda_n x_n x_n^T.$$

Example
The 2 by 2 matrix
$$A = \begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix}$$
considered earlier has orthonormal eigenvectors that produce the diagonalizing orthogonal matrix
$$Q = \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2}\\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix},$$
$$A = \begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix} = Q\Lambda Q^T = \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2}\\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}\begin{pmatrix} 4 & 0\\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2}\\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.$$
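
In Octave, eig applied to a real symmetric matrix returns an orthonormal eigenvector matrix, so the spectral factorization can be verified directly (a sketch):

A = [3 1; 1 3];
[Q, L] = eig(A);   % for a real symmetric A the eigenvector matrix is orthogonal
Q*L*Q' - A         % approximately zero: A = Q*Lambda*Q^T
Q'*Q               % approximately the identity: orthonormal columns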

Tutorial

a. Find the eigenvalues and eigenvectors of the following matrices.

b. Use the eigenvalues to compute the determinant of A.
c. Which of the matrices is positive definite?
d. Find an orthonormal basis of R² or R³ determined by A, or explain why none exists.
e. Write out the spectral factorization of A if possible.

1. A = \begin{pmatrix} -3 & 4\\ 4 & 3 \end{pmatrix}
2. A = \begin{pmatrix} 2 & -1\\ -1 & 4 \end{pmatrix}
3. A = \begin{pmatrix} 1 & 1 & 0\\ 1 & 2 & 1\\ 0 & 1 & 1 \end{pmatrix}
4. A = \begin{pmatrix} 3 & -1 & -1\\ -1 & 2 & 0\\ -1 & 0 & 2 \end{pmatrix}
5. A = \begin{pmatrix} 2 & 1 & -1\\ 1 & 2 & 1\\ -1 & 1 & 2 \end{pmatrix}
6. Find the spectral factorization of the following matrices.
a. A = \begin{pmatrix} 3 & 2i\\ -2i & 6 \end{pmatrix}
b. B = \begin{pmatrix} 6 & 1-2i\\ 1+2i & 2 \end{pmatrix}

7. For which values of b and c does the system
x1 + x2 + bx3 = 1
bx1 + 3x2 − x3 = −2
3x1 + 4x2 + x3 = c
a. Have no solution b. Exactly one solution c. Infinitely many solutions.

8. Determine the rank of the following matrix:
$$\begin{pmatrix} 1 & -1 & 2 & 1\\ 2 & 1 & -1 & 0\\ 1 & 2 & -3 & -1\\ 4 & -1 & 3 & 2\\ 0 & 3 & -5 & -2 \end{pmatrix}$$

9. Find the lengths of u = [1 + i, 1 + 2i] and v = [i, i, i]. Also find u^H v and v^H u.

Chapter 5

Jordan Form

5.1 Introduction
The question we want to answer now is the following:
If A is not similar to a diagonal matrix, then what is the simplest matrix that A is similar
to?
Before we can prove the answer, we will have to introduce a few definitions.

Definition
A square matrix A is "block diagonal" if A has the form
$$A = \begin{pmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & A_k \end{pmatrix}$$
where each Ai is a square matrix and the diagonals of each Ai lie on the diagonal of A. Each
0 is a zero matrix of appropriate size. Each Ai is called a "block" of A.
Technically, every square matrix is a block diagonal matrix, but we only use the terminology
when there are at least two blocks in the matrix. Here is an example of a 'typical' block
diagonal matrix:
$$A = \begin{pmatrix} 1 & 3 & 2 & 0 & 0 & 0\\ 7 & 0 & 2 & 0 & 0 & 0\\ 1 & 1 & 2 & 0 & 0 & 0\\ 0 & 0 & 0 & 6 & 0 & 0\\ 0 & 0 & 0 & 0 & 2 & 1\\ 0 & 0 & 0 & 0 & 3 & 3 \end{pmatrix}$$
This matrix has blocks of size 3, 1 and 2 as we move down the diagonal. The three blocks in
this matrix are
$$A_1 = \begin{pmatrix} 1 & 3 & 2\\ 7 & 0 & 2\\ 1 & 1 & 2 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 6 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 2 & 1\\ 3 & 3 \end{pmatrix}.$$

Definition
A "Jordan block" with value λ is a square, upper triangular matrix whose entries are all λ
on the diagonal, all 1 on the entries immediately above the diagonal, and 0 elsewhere:
$$J(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0\\ 0 & \lambda & 1 & \cdots & 0 & 0\\ 0 & 0 & \lambda & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & \lambda & 1\\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$
Here is what the Jordan blocks of sizes 1, 2 and 3 look like:
$$\begin{pmatrix} \lambda \end{pmatrix}, \quad \begin{pmatrix} \lambda & 1\\ 0 & \lambda \end{pmatrix}, \quad \begin{pmatrix} \lambda & 1 & 0\\ 0 & \lambda & 1\\ 0 & 0 & \lambda \end{pmatrix}$$

Definition
A "Jordan form" matrix is a block diagonal matrix whose blocks are all Jordan blocks.
For example, every diagonal p × p matrix is a Jordan form, with p 1 × 1 Jordan blocks. Here
are some more interesting examples:
$$\begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 3 & 1 & 0 & 0\\ 0 & 0 & 0 & 3 & 0 & 0\\ 0 & 0 & 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0 & 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 2 & 1 & 0 & 0\\ 0 & 2 & 1 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & 1 & 0 & 0\\ 0 & 2 & 1 & 0\\ 0 & 0 & 2 & 1\\ 0 & 0 & 0 & 2 \end{pmatrix}$$

Now, here's the big theorem that answers our first question:

Theorem 1
Let A be a p × p matrix. Then there is a Jordan form matrix J that is similar to A.
In fact, we can be more specific than that:

Theorem 2
Let A be a p × p matrix, with s distinct eigenvalues λ1, · · · , λs. Let each λi have algebraic
multiplicity mi and geometric multiplicity µi. Then A is similar to a Jordan form matrix
$$J = \begin{pmatrix} J_1 & 0 & \cdots & 0\\ 0 & J_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & J_\mu \end{pmatrix}$$
where
1. µ = µ1 + µ2 + · · · + µs

2. For each λi, the number of Jordan blocks in J with value λi is equal to µi

3. λi appears on the diagonal of J exactly mi times.


Further, the matrix J is unique, up to re-ordering the Jordan blocks on the diagonal.
This is a pretty complicated theorem and we aren't going to try to prove it here. But we will
learn a method for finding the Jordan form of a matrix A, and also finding the non-singular
matrix Q such that J = Q^{-1}AQ.

5.2 Algorithm for the Jordan Form of A


1. Compute the distinct eigenvalues λ1, λ2, · · · , λs, along with the associated algebraic
multiplicities m1, m2, · · · , ms and geometric multiplicities µ1, µ2, · · · , µs.

2. Treat each eigenvalue in turn. For a given eigenvalue λ of algebraic multiplicity m and
geometric multiplicity µ, we start computing the E-spaces and their dimensions. The
k-th E-space is
E_λ^k = {x : (A − λI)^k x = 0}.
So E_λ^1 is just the eigenspace E_λ, and we build from there. We stop when we get to an E_λ^k that has
dimension m, the algebraic multiplicity of λ.

3. We make a diagram of boxes as follows. Compute the numbers

d1 = dim E_λ^1,
d2 = dim E_λ^2 − dim E_λ^1,
...
dk = dim E_λ^k − dim E_λ^{k−1}.

Now we make a diagram with d1 boxes in the first row, d2 boxes in the second row,
and so on. For example, if d1 = 4, d2 = 2, d3 = 2, d4 = 1, then we get a diagram with
rows of 4, 2, 2 and 1 boxes. We are going to 'fill in' the diagram with vectors as follows.

4. Start at the bottom of the diagram and fill the boxes in row k with linearly independent
vectors that belong to E_λ^k but not E_λ^{k−1}. Any time you have a vector v in a box, the
box immediately above it gets filled with the vector (A − λI)v. If a box is the lowest
in its column and belongs to row i, fill that box with a new vector from E_λ^i which is
linearly independent of both E_λ^{i−1} and all the other vectors in row i.

5. Repeat steps 2 through 4 for each distinct eigenvalue. You will get a diagram full of
vectors for each one.

6. Make a matrix Q as follows. For each eigenvalue, consider the associated diagram.
The vectors in the boxes become the columns of Q as follows. Start at the top of the
leftmost column, and use the vectors as you go down the column. When you reach the
end of a column, go to the next column. When you finish one diagram, go to the first
column of the next diagram. This gives the matrix Q.

7. The Jordan form of A is given by J = Q^{-1}AQ. But the nice part of the algorithm is
that you can compute J without finding Q! In fact, J will have one Jordan block for
each column of each diagram. The value of the block is given by the eigenvalue, and
the size of the block is equal to the number of squares in the column. You put the
blocks down the diagonal of J in the same order you chose the vectors in Q.

Examples
1. A = \begin{pmatrix} 2 & -3\\ 3 & -4 \end{pmatrix}
For this matrix, the characteristic polynomial is (1 + λ)², so there is one eigenvalue,
λ = −1, with m = 2. Now we compute E-spaces. E_{−1}^1: solving (A + I)x = 0,
$$\begin{pmatrix} 3 & -3 & 0\\ 3 & -3 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0\\ 0 & 0 & 0 \end{pmatrix},$$
$$E_{-1}^1 = \left\{ \begin{pmatrix} t\\ t \end{pmatrix} \right\} = t\begin{pmatrix} 1\\ 1 \end{pmatrix}.$$
So d1 = µ = 1. Since dim E_{−1}^1 < m, we have to compute another E-space.
E_{−1}^2: solving (A + I)²x = 0:
$$(A + I)^2 = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix},$$
so E_{−1}^2 = span(e1, e2), and d2 = 2 − 1 = 1. Since dim E_{−1}^2 = m, we don't need any
more E-spaces. Since we have d1 = 1, d2 = 1, our diagram is a single column of two boxes.
We put a vector in the lower box. It has to be a vector in E_{−1}^2 that is linearly
independent of E_{−1}^1. That's easy enough, how about v1 = (1, 0)^T. Above v1, we have to
put (A + I)v1, which is
$$v_2 = \begin{pmatrix} 3 & -3\\ 3 & -3 \end{pmatrix}\begin{pmatrix} 1\\ 0 \end{pmatrix} = \begin{pmatrix} 3\\ 3 \end{pmatrix}.$$

Hence our diagram is v2 above v1, so
$$Q = [v_2, v_1] = \begin{pmatrix} 3 & 1\\ 3 & 0 \end{pmatrix}.$$
Finally, without computing Q^{-1}AQ, we still know what J looks like. There is only one
column, so J is just one Jordan block of size 2, with value λ = −1:
$$J = \begin{pmatrix} -1 & 1\\ 0 & -1 \end{pmatrix}.$$
Alternatively, choosing v1 = (0, 1)^T gives
$$v_2 = \begin{pmatrix} 3 & -3\\ 3 & -3 \end{pmatrix}\begin{pmatrix} 0\\ 1 \end{pmatrix} = \begin{pmatrix} -3\\ -3 \end{pmatrix}, \qquad Q = \begin{pmatrix} -3 & 0\\ -3 & 1 \end{pmatrix},$$
and again
$$J = Q^{-1}AQ = \begin{pmatrix} -1 & 1\\ 0 & -1 \end{pmatrix}.$$

2. A = \begin{pmatrix} 3 & 1 & 0\\ -1 & 1 & 0\\ 3 & 2 & 2 \end{pmatrix}
We skip the computation to show that A has only one eigenvalue, λ = 2, of multiplicity
3. Computing E_2^1:
$$\begin{pmatrix} 1 & 1 & 0 & 0\\ -1 & -1 & 0 & 0\\ 3 & 2 & 0 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}$$
So E_2^1 is spanned by the vector (0, 0, 1)^T, and d1 = 1. Turning to E_2^2, we solve (A −
2I)²x = 0:
$$(A - 2I)^2 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 1 & 1 & 0 \end{pmatrix}$$
So E_2^2 is spanned by (−1, 1, 0)^T and (0, 0, 1)^T, and d2 = 2 − 1 = 1. We need another
E-space. But computation shows that (A − 2I)³ = 0, the zero matrix, so E_2^3 is spanned
by e1, e2, e3. So d3 = 3 − 2 = 1, and we can stop here. Our diagram is one column of three
boxes.
The bottom box gets filled with a vector from E_2^3 that is linearly independent of E_2^2.
The vector v1 = e1 will work. Above it goes v2 = (A − 2I)v1 = (1, −1, 3)^T, and
above that goes v3 = (A − 2I)v2 = (0, 0, 1)^T. Our diagram is now the column v3, v2, v1,
and we get the transition matrix
$$Q = [v_3, v_2, v_1] = \begin{pmatrix} 0 & 1 & 1\\ 0 & -1 & 0\\ 1 & 3 & 0 \end{pmatrix}.$$
Again, there is only one column, so only one Jordan block, which has value 2 and size
3. We get the Jordan form matrix
$$J = \begin{pmatrix} 2 & 1 & 0\\ 0 & 2 & 1\\ 0 & 0 & 2 \end{pmatrix}.$$

3. A = \begin{pmatrix} 2 & 4 & -8\\ 0 & 0 & 4\\ 0 & -1 & 4 \end{pmatrix}
You can check that this matrix also has only the eigenvalue 2, with multiplicity 3. We
compute the E-spaces. First, for E_2^1:
$$[A - 2I\,|\,0] = \begin{pmatrix} 0 & 4 & -8 & 0\\ 0 & -2 & 4 & 0\\ 0 & -1 & 2 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 0 & 1 & -2 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}$$

So E_2^1 is spanned by the two vectors (1, 0, 0)^T and (0, 2, 1)^T. Since d1 = 2 < 3, we need
another E-space. Computing E_2^2, we see that (A − 2I)² = 0, so E_2^2 is spanned by
the standard basis, and d2 = 3 − 2 = 1. We can stop, since E_2^2 has dimension 3.
Our diagram has two boxes in the first row (holding v2 and v3) and one box in the second row, below v2 (holding v1),
where v1 is a vector in E_2^2 linearly independent of E_2^1, v2 = (A − 2I)v1, and we
finally choose v3 ∈ E_2^1 linearly independent of v2. If we start by choosing v1 = e2, we
wind up getting
$$Q = [v_2, v_1, v_3] = \begin{pmatrix} 4 & 0 & 1\\ -2 & 1 & 0\\ -1 & 0 & 0 \end{pmatrix}.$$
Finally, the diagram tells us that we get 2 Jordan blocks this time. Both have value 2,
but one is of size 2 and one is of size 1. So
$$J = \begin{pmatrix} 2 & 1 & 0\\ 0 & 2 & 0\\ 0 & 0 & 2 \end{pmatrix}.$$
You can check, in this example and in all of the previous ones, that indeed J = Q^{-1}AQ.
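
Example 3 can be verified numerically in Octave; the E-space dimensions come from ranks, since dim E_2^k = 3 − rank((A − 2I)^k):

A = [2 4 -8; 0 0 4; 0 -1 4];
M = A - 2*eye(3);
[3 - rank(M), 3 - rank(M^2)]   % dim E_2^1 = 2, dim E_2^2 = 3
Q = [4 0 1; -2 1 0; -1 0 0];
J = inv(Q)*A*Q                 % [2 1 0; 0 2 0; 0 0 2]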

Chapter 6

Using Row Reduced Echelon Form in


Balancing Chemical Equations

6.1 Introduction
According to Risteski [2], a chemical reaction is an expression showing a symbolic representation
of the reactants and products, usually positioned on the left and right hand
sides of a particular chemical reaction. Substances that take part in a chemical reaction are
represented by their molecular formulae, and their symbolic representation is also regarded as
a chemical reaction [3]. A chemical reaction can either be reversible or irreversible. These
differ from mathematical equations in the sense that a chemical reaction is connected by a single arrow (in the case of
an irreversible reaction) or by a double arrow pointing in the forward and backward directions
between reactants and products (in the case of a reversible reaction) [4], whereas
an equality sign links the left and right hand sides of a mathematical equation.
'The quantitative and qualitative knowledge of the chemical processes which estimates the
amount of reactants, predicting the nature and amount of products and determining conditions
under which a reaction takes place is important in balancing a chemical reaction.
Balancing chemical reactions is an excellent demonstrative and instructive example of the
inter-connectedness between Linear Algebra and Stoichiometric principles' [1].
If the number of atoms of each type of element on the left is the same as the number
of atoms of the corresponding type on the right, then the chemical equation is said to be
balanced [4]; otherwise it is not. The quantitative study of the relationship between reactants and products
in a chemical reaction is termed Stoichiometry [6]. Tuckerman [5] mentioned two methods for
balancing a chemical reaction: by inspection and algebraic. The balancing-by-inspection
method involves making successive intelligent guesses at the coefficients that will
balance an equation and continuing until the equation is balanced [1]. For simple
equations this procedure is straightforward. However, according to [7], there is need for
a 'step-by-step' approach which is easily applicable and can be mastered, rather than the
haphazard hoping of inspection or a highly refined inspection. In addition, the balancing-by-inspection
method leads one to believe that there is only one possible solution, rather than
an infinite number of solutions, which the method proposed here illustrates. The
algebraic approach circumvents the above loopholes in the inspection method and
can handle complex chemical reactions.
The algebraic approach discussed in [5] involves putting unknown coefficients in front of
each molecular species in the equation and solving for the unknowns. This is followed
by writing down the balance conditions on each element, after which one of the
unknowns is set to one and the coefficients of the remaining unknowns are obtained in turn.
In the proposed approach, instead of setting one of the unknowns to one, we write out the
set of equations in matrix form and obtain a homogeneous system of equations. Since the system
of equations is homogeneous, the solution obtained is in the nullspace of the corresponding
matrix. We then perform elementary row operations on the matrix to reduce it to row
reduced echelon form. We also show the use of software environments like Matlab/Octave
to reduce the corresponding matrix to row reduced echelon form using the rref command.
This approach surpasses that of [1], in the sense that we do not need to manually reduce
the matrix to echelon form as shown in that paper: there, the corresponding matrix is reduced
to echelon form, but elementary row operations are not used to convert it to row reduced
echelon form.
In the next section, we state two well known results pertaining to echelon form and row
reduced echelon form.

6.2 Theory
In this section, we state well known results about echelon form and row reduced echelon
form. We will not bother with the algorithms, as these are readily available in most Linear
Algebra textbooks.

Lemma 6.2.1. The number of nonzero rows and the leading columns are the same in any echelon form
produced from a given matrix A by elementary row operations, irrespective of the sequence
of row operations used.

Given an n × m matrix A:

1. Use Gauss elimination to produce an echelon form from A.

2. Use the bottom-most non-zero entry in each leading column of the echelon form,
starting with the rightmost leading column and working to the left, to eliminate
all non-zero entries in that column strictly above that entry.

Definition 6.2.1. An n × m matrix A is said to be in row reduced echelon form when:

(a). It is in echelon form (with k non-zero rows, say);

(b). The ith leading column equals ei, the ith column of the identity matrix, for
1 ≤ i ≤ k.

The next result, which can be found in [9], describes the uniqueness of the row reduced
echelon form. It is the uniqueness of the row reduced echelon form that makes it a tool for
finding the nullspace of a matrix.

Theorem 6.2.1. (Row Reduced Echelon Form): Each matrix has precisely one row reduced
echelon form to which it can be reduced by elementary row operations, regardless of the actual
sequence of operations used to produce it.
Proof. See [9].

6.3 Worked Examples


Example 6.3.1. Rust is formed when there is a chemical reaction between iron and oxygen.
The compound that is formed covers the iron object in reddish-brown scales. Rust is an
iron oxide whose chemical formula is Fe2O3, so the reaction is

Fe + O2 −→ Fe2O3.

Balance the equation.


In balancing the equation, let p, q and r be the unknown variables such that

pFe + qO2 −→ rFe2O3.

We compare the number of Iron (Fe) and Oxygen (O) atoms of the reactants with the number
of atoms of the product. We obtain the following set of equations:

Fe : p = 2r
O : 2q = 3r.

The homogeneous system of equations becomes
$$\begin{pmatrix} 1 & 0 & -2\\ 0 & 2 & -3 \end{pmatrix}\begin{pmatrix} p\\ q\\ r \end{pmatrix} = 0, \quad\text{where}\quad A = \begin{pmatrix} 1 & 0 & -2\\ 0 & 2 & -3 \end{pmatrix}.$$
From the above, the matrix A is already in echelon form U, with two pivots 1 and 2,
but not in row reduced echelon form, even though there is a zero above the second pivot 2.
To reduce it to row reduced echelon form R, all the pivots must be one. Hence, we
replace row two with half of row two, that is R2 ↔ ½R2, to yield
$$R = \begin{pmatrix} 1 & 0 & -2\\ 0 & 1 & -3/2 \end{pmatrix}. \qquad (6.1)$$
Thus, Rx = 0 becomes
$$\begin{pmatrix} 1 & 0 & -2\\ 0 & 1 & -3/2 \end{pmatrix}\begin{pmatrix} p\\ q\\ r \end{pmatrix} = 0.$$
Upon expanding, we have

p − 2r = 0 or p = 2r
q − (3/2)r = 0 or q = (3/2)r,

and the nullspace solution is
$$x = \begin{pmatrix} p\\ q\\ r \end{pmatrix} = \begin{pmatrix} 2\\ 3/2\\ 1 \end{pmatrix} r.$$
There are two pivot variables p, q and one free variable r. If we choose r = 1, then
p = 2, q = 3/2. To avoid fractions, we can also let r = 2, so that p = 4, q = 3 and r = 2. We
remark that these are not the only solutions: since there is a free variable r, there are infinitely
many nullspace solutions. Therefore, the chemical equation can be balanced as
$$2\text{Fe} + \tfrac{3}{2}\text{O}_2 \longrightarrow \text{Fe}_2\text{O}_3,$$
or
$$4\text{Fe} + 3\text{O}_2 \longrightarrow 2\text{Fe}_2\text{O}_3.$$
Example 6.3.2. Ethane (C2H6) burns in oxygen to produce carbon (IV) oxide (CO2) and
steam. The steam condenses to form droplets of water:

C2H6 + O2 −→ CO2 + H2O.

Balance the equation.


Let the unknowns be p, q, r and s, such that

pC2H6 + qO2 −→ rCO2 + sH2O.

We compare the number of Carbon (C), Hydrogen (H) and Oxygen (O) atoms of the reactants
with the number of atoms of the products. We obtain the following set of equations:

C : 2p = r
H : 6p = 2s
O : 2q = 2r + s.

In homogeneous form,
$$\begin{pmatrix} 2 & 0 & -1 & 0\\ 6 & 0 & 0 & -2\\ 0 & 2 & -2 & -1 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}.$$
In the first step of elimination, replace row two by row two minus three times row one, i.e.
R2 ↔ R2 − 3R1, to yield
$$\sim \begin{pmatrix} 2 & 0 & -1 & 0\\ 0 & 0 & 3 & -2\\ 0 & 2 & -2 & -1 \end{pmatrix}.$$
Exchange rows two and three to reduce A to echelon form U:
$$U = \begin{pmatrix} 2 & 0 & -1 & 0\\ 0 & 2 & -2 & -1\\ 0 & 0 & 3 & -2 \end{pmatrix}.$$
In the next set of operations, carried out to reduce U to R, we perform row
operations that change the entries above the pivots to zero: replace row two with three
times row two plus two times row three (R2 ↔ 3R2 + 2R3), and replace row one with three
times row one plus row three (R1 ↔ 3R1 + R3), to yield
$$\sim \begin{pmatrix} 6 & 0 & 0 & -2\\ 0 & 6 & 0 & -7\\ 0 & 0 & 3 & -2 \end{pmatrix}.$$

The last operation, which gives us R, is to reduce all the pivots to unity: replace row
one with one-sixth of row one, row two with one-sixth of row two, and row three with one-third of
row three, to obtain
$$R = \begin{pmatrix} 1 & 0 & 0 & -1/3\\ 0 & 1 & 0 & -7/6\\ 0 & 0 & 1 & -2/3 \end{pmatrix}. \qquad (6.2)$$
The solution to Ax = 0 reduces to Rx = 0, where x is the nullspace of A, which is
the same as the nullspace of R. Hence,
$$\begin{pmatrix} 1 & 0 & 0 & -1/3\\ 0 & 1 & 0 & -7/6\\ 0 & 0 & 1 & -2/3 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = 0.$$

Upon expanding, we have

p − (1/3)s = 0 or p = (1/3)s
q − (7/6)s = 0 or q = (7/6)s
r − (2/3)s = 0 or r = (2/3)s,

and the nullspace solution is
$$x = \begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = \begin{pmatrix} 1/3\\ 7/6\\ 2/3\\ 1 \end{pmatrix} s.$$
There are three pivot variables p, q, r and one free variable s. Let s = 3, so that p = 1, q = 7/2
and r = 2. We remark that this is not the only solution: since there is a free variable s, there are
infinitely many nullspace solutions. Therefore, the chemical equation can be balanced as
$$\text{C}_2\text{H}_6 + \tfrac{7}{2}\text{O}_2 \longrightarrow 2\text{CO}_2 + 3\text{H}_2\text{O}.$$

Example 6.3.3. Sodium hydroxide (NaOH) reacts with sulphuric acid (H2SO4) to yield
sodium sulphate (Na2SO4) and water:
NaOH + H2SO4 −→ Na2SO4 + H2O.
Balance the equation.
In balancing the equation, let p, q, r and s be the unknown variables such that
pNaOH + qH2SO4 −→ rNa2SO4 + sH2O.
We compare the number of Sodium (Na), Oxygen (O), Hydrogen (H) and Sulphur (S) atoms
of the reactants with the number of atoms of the products. We obtain the following set of
equations:
Na : p = 2r
O : p + 4q = 4r + s
H : p + 2q = 2s
S : q = r.
Re-writing these equations in standard form, we have a homogeneous system Ax = 0 of
linear equations in p, q, r and s:
p − 2r = 0
p + 4q − 4r − s = 0
p + 2q − 2s = 0
q − r = 0,
or
$$\begin{pmatrix} 1 & 0 & -2 & 0\\ 1 & 4 & -4 & -1\\ 1 & 2 & 0 & -2\\ 0 & 1 & -1 & 0 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = 0, \quad\text{where}\quad A = \begin{pmatrix} 1 & 0 & -2 & 0\\ 1 & 4 & -4 & -1\\ 1 & 2 & 0 & -2\\ 0 & 1 & -1 & 0 \end{pmatrix}.$$
The augmented system becomes
$$[A\;0] = \begin{pmatrix} 1 & 0 & -2 & 0 & | & 0\\ 1 & 4 & -4 & -1 & | & 0\\ 1 & 2 & 0 & -2 & | & 0\\ 0 & 1 & -1 & 0 & | & 0 \end{pmatrix}.$$
Since the right hand side is the zero vector, we work with the matrix A, because no row
operation will change the zeros.
Replace row two with row two minus row one, i.e. R2 ↔ R2 − R1. Similarly, replace row
three with row three minus row one, i.e. R3 ↔ R3 − R1. This first set of row operations
reduces A to
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0\\ 0 & 4 & -2 & -1\\ 0 & 2 & 2 & -2\\ 0 & 1 & -1 & 0 \end{pmatrix}.$$

In the second set of row operations, we replace row three by two times row three minus
row two (R3 ↔ 2R3 − R2), and replace row four by four times row four minus row two
(R4 ↔ 4R4 − R2), to yield
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0\\ 0 & 4 & -2 & -1\\ 0 & 0 & 6 & -3\\ 0 & 0 & -2 & 1 \end{pmatrix}.$$
In the third stage of the elimination process, we replace row four with three times row four plus
row three, i.e. R4 ↔ 3R4 + R3, to yield the row echelon (upper triangular) matrix U:
$$U = \begin{pmatrix} 1 & 0 & -2 & 0\\ 0 & 4 & -2 & -1\\ 0 & 0 & 6 & -3\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

We now reduce U to row reduced echelon form R as follows. First, we reduce the pivots in
rows two and three to unity via R2 ↔ ¼R2 and R3 ↔ (1/6)R3 to obtain
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0\\ 0 & 1 & -1/2 & -1/4\\ 0 & 0 & 1 & -1/2\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

Replace row one by row one plus two times row three (R1 ↔ R1 + 2R3), and row two by row
two plus half of row three (R2 ↔ R2 + ½R3). These two operations change all nonzeros
above the pivots to zero, resulting in the row reduced echelon form
$$R = \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1/2\\ 0 & 0 & 1 & -1/2\\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (6.3)$$

The solution to Ax = 0 reduces to Rx = 0, where x is the nullspace of A, which is
the same as the nullspace of R. Hence,
$$\begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1/2\\ 0 & 0 & 1 & -1/2\\ 0 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = 0.$$

Upon expanding, we have

p − s = 0 or p = s
q − (1/2)s = 0 or q = (1/2)s
r − (1/2)s = 0 or r = (1/2)s,

and the nullspace solution is
$$x = \begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = \begin{pmatrix} 1\\ 1/2\\ 1/2\\ 1 \end{pmatrix} s.$$
There are three pivot variables p, q, r and one free variable s. We set s = 2, so that p = 2, q =
1 and r = 1. We remark that this is not the only solution: since there is a free variable s, there are
infinitely many nullspace solutions. Therefore, the chemical equation can be balanced as

2NaOH + H2SO4 −→ Na2SO4 + 2H2O.

Example 6.3.4. Using row reduced echelon form, balance the following chemical reaction:

KHC8H4O4 + KOH −→ K2C8H4O4 + H2O.

Let p, q, r and s be the unknown variables such that

pKHC8H4O4 + qKOH −→ rK2C8H4O4 + sH2O.

We obtain the following set of equations for each of the elements:

K : p + q = 2r
H : 5p + q = 4r + 2s
C : 8p = 8r
O : 4p + q = 4r + s.

The corresponding matrix becomes
$$A = \begin{pmatrix} 1 & 1 & -2 & 0\\ 5 & 1 & -4 & -2\\ 8 & 0 & -8 & 0\\ 4 & 1 & -4 & -1 \end{pmatrix}.$$

The row operations R2 ↔ R2 − 5R1, R3 ↔ R3 − 8R1 and R4 ↔ R4 − 4R1 reduce A to
$$\sim \begin{pmatrix} 1 & 1 & -2 & 0\\ 0 & -4 & 6 & -2\\ 0 & -8 & 8 & 0\\ 0 & -3 & 4 & -1 \end{pmatrix}.$$
In the same vein, the row operations R3 ↔ R3 − 2R2 and R4 ↔ 4R4 − 3R2 reduce
the above matrix to
$$\sim \begin{pmatrix} 1 & 1 & -2 & 0\\ 0 & -4 & 6 & -2\\ 0 & 0 & -4 & 4\\ 0 & 0 & -2 & 2 \end{pmatrix}.$$

Finally, R4 ↔ 2R4 − R3 reduces the matrix to echelon form:
$$U = \begin{pmatrix} 1 & 1 & -2 & 0\\ 0 & -4 & 6 & -2\\ 0 & 0 & -4 & 4\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

There are three pivots, respectively 1, −4 and −4. Hence, to reduce the matrix to row reduced
echelon form, we make the entries above the pivots zero and then change the pivots
to unity. The row operations R2 ↔ 4R2 + 6R3, R1 ↔ 2R1 − R3 and R1 ↔ R1 + (1/8)R2 change
the nonzero entries above the pivots to zero, so that U reduces to
$$\sim \begin{pmatrix} 2 & 0 & 0 & -2\\ 0 & -16 & 0 & 16\\ 0 & 0 & -4 & 4\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
The row operations R1 ↔ ½R1, R2 ↔ −(1/16)R2 and R3 ↔ −¼R3 lead to the row reduced
echelon form
$$R = \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (6.4)$$
Therefore, the solution x to Rx = 0 becomes
$$x = \begin{pmatrix} p\\ q\\ r\\ s \end{pmatrix} = \begin{pmatrix} 1\\ 1\\ 1\\ 1 \end{pmatrix} s.$$

For simplicity, we set s equal to one, so that p = q = r = s = 1. This shows that the
equation was balanced in the first place.

6.4 Using the Matlab or Octave rref function


In this section, we use Octave to reduce each of the matrices considered in the last section
to row reduced echelon form. We remark that, just as predicted by the theory, row exchanges
do not change the outcome of the row reduced echelon form. This means that if you interchange
any of the rows of each of the matrices in the four examples, the rref will be the same.

Example 6.4.1. Type the matrix A = [1 0 -2; 0 2 -3] and R = rref(A). This
gives the same R as in (6.1):
$$R = \begin{pmatrix} 1 & 0 & -2\\ 0 & 1 & -3/2 \end{pmatrix}.$$
Example 6.4.2. Type A = [2 0 -1 0; 6 0 0 -2; 0 2 -2 -1] and R = rref(A).
This gives the same R as in (6.2):
$$R = \begin{pmatrix} 1 & 0 & 0 & -0.3333\\ 0 & 1 & 0 & -1.1667\\ 0 & 0 & 1 & -0.6667 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & -1/3\\ 0 & 1 & 0 & -7/6\\ 0 & 0 & 1 & -2/3 \end{pmatrix}.$$

Example 6.4.3. Type A = [1 0 -2 0; 1 4 -4 -1; 1 2 0 -2; 0 1 -1 0] and
R = rref(A). This gives the same R as in (6.3):
$$R = \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1/2\\ 0 & 0 & 1 & -1/2\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

Example 6.4.4. Type A = [1 1 -2 0; 5 1 -4 -2; 8 0 -8 0; 4 1 -4 -1] and
R = rref(A). This gives the same R as in (6.4):
$$R = \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

In the next example, we illustrate the power of the rref command.

Example 6.4.5. Consider balancing the following chemical reaction from [5]:

NaCl + SO2 + H2O + O2 −→ Na2SO4 + HCl.

Let the unknown coefficients be p, q, r, s, t, u such that

pNaCl + qSO2 + rH2O + sO2 −→ tNa2SO4 + uHCl.

We write down the balance conditions on each element as

Sodium: p = 2t
Chlorine: p = u
Sulphur: q = t
Oxygen: 2q + r + 2s = 4t
Hydrogen: 2r = u.

After transposing, the above system of equations can be written in the form Ax = 0 as
$$\begin{pmatrix} 1 & 0 & 0 & 0 & -2 & 0\\ 1 & 0 & 0 & 0 & 0 & -1\\ 0 & 1 & 0 & 0 & -1 & 0\\ 0 & 2 & 1 & 2 & -4 & 0\\ 0 & 0 & 2 & 0 & 0 & -1 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s\\ t\\ u \end{pmatrix} = 0.$$
Using the Matlab or Octave command R = rref(A), Ax = 0 reduces to Rx = 0:
$$\begin{pmatrix} 1 & 0 & 0 & 0 & 0 & -1\\ 0 & 1 & 0 & 0 & 0 & -0.5\\ 0 & 0 & 1 & 0 & 0 & -0.5\\ 0 & 0 & 0 & 1 & 0 & -0.25\\ 0 & 0 & 0 & 0 & 1 & -0.5 \end{pmatrix}\begin{pmatrix} p\\ q\\ r\\ s\\ t\\ u \end{pmatrix} = 0 \quad\text{or}\quad \begin{pmatrix} p\\ q\\ r\\ s\\ t \end{pmatrix} = \begin{pmatrix} 1\\ 1/2\\ 1/2\\ 1/4\\ 1/2 \end{pmatrix} u.$$

If we set u = 4, then p = 4, q = 2, r = 2, s = 1 and t = 2. The balanced equation becomes

4NaCl + 2SO2 + 2H2O + O2 −→ 2Na2SO4 + 4HCl.
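
The whole computation of Example 6.4.5 can be carried out as a short Octave script (a sketch; the final scaling by 4 clears the fractions):

A = [1 0 0 0 -2  0;
     1 0 0 0  0 -1;
     0 1 0 0 -1  0;
     0 2 1 2 -4  0;
     0 0 2 0  0 -1];
R = rref(A);           % last column holds minus the ratios of p,...,t to u
x = [-R(1:5, 6); 1];   % nullspace vector with the free variable u set to 1
coeffs = 4 * x         % clear the fractions: [4 2 2 1 2 4]'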

References

[1] Gabriel C. I. and Onwuka G. I.: Balancing of Chemical Equations using Matrix Algebra.
Journal of Natural Sciences Research, Vol. 3, No. 5, 29–36, 2015.

[2] Risteski I. B.: Journal of the Chinese Chemical Society, 56: 65–79, 2009.

[3] Rao C. N. R.: University General Chemistry: An Introduction to Chemical Science.
Rajiv Beri for Macmillan India Ltd, 17–41, 2007.

[4] Lay D. C.: Linear Algebra and its Applications, 17–120, 2006.

[5] Tuckerman M. E.: http://www.nyu.edu/classes/tuckerman/adv.chem/lectures/lecture 2/node3.htm
2011.

[6] Hill J. W., McCreary T. W., Kolb D. K.: Chemistry for Changing Times. Pearson Education
Inc., 123–143, 2000.

[7] Hutchings L., Peterson L., Almasude A.: The Journal of Mathematics and Science:
Collaborative Explorations 9: 119–133, 2007.

[8] Guo C.: Journal of Chemical Education, Vol. 74, 13–65, 1997.

[9] Ben Noble, James W. Daniel: Applied Linear Algebra, Third Edition, pp. 90–97, 103,
140–149, 1988.

