Chapter 1
1.1 Gaussian Elimination
Definition 1.1
There are three types of elementary row/column operations on matrices:
1. interchanging two rows (columns);
2. multiplying a row (column) by a non-zero scalar;
3. adding a multiple of one row (column) to another row (column).
Definition 1.2
Two linear systems with the same unknowns are said to be equivalent if their solution sets
are the same. A matrix A is said to be row equivalent to a matrix B, written A ∼ B
(pronounced A tilde B), if there is a sequence of elementary row operations that changes A
to B.
b. The first non-zero entry of a row, called the leading entry of that row, lies to the left of
the leading entry of the row below it.
A matrix in row echelon form is said to be in row reduced echelon form if it satisfies
the following conditions:
2. If the entry in the pivot position is zero, choose a non-zero entry in the pivot column
and interchange the pivot row with the row containing this non-zero entry.
3. If the pivot position is non-zero, use elementary row operations to reduce all entries
below the pivot position to zero (and, for row reduced echelon form, the pivot position
to 1 and the entries above the pivot position to zero).
4. Cover the pivot row and the rows above it; repeat (1) to (3) on the remaining
submatrix.
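The steps above can be sketched in code. The following is a minimal illustration in Python; the function name `rref` and the use of exact fractions are our own choices, not part of the notes.

```python
from fractions import Fraction

def rref(matrix):
    """Row reduce a matrix following the steps above: pick a pivot column,
    swap in a non-zero pivot (step 2), scale the pivot to 1 and clear the
    rest of the column (step 3), then move to the next row (step 4)."""
    A = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        # find a non-zero entry in the pivot column at or below row r
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]        # interchange rows
        A[r] = [x / A[r][c] for x in A[r]]     # scale pivot to 1
        for i in range(rows):                  # clear the column above and below
            if i != r and A[i][c] != 0:
                A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A
```

Exact fractions avoid the rounding issues a floating-point version would have near zero pivots.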
Definition 1.5
The number of non-zero rows/columns in any matrix in echelon form is called the rank. It
can also be defined as the number of pivots after reduction to echelon form.
Theorem 1.1
The number k of non-zero rows and the column numbers of the leading columns are the
same in any echelon form produced from a given matrix A by elementary row operations,
irrespective of the actual sequence of row operations used.
Proof
See [3], pages 141-143.
Example 1.1
The matrices below are in echelon form:
[0 1 −2 3; 0 0 0 1; 0 0 0 0] and [1 2 3; 0 1 4; 0 0 1].
Example 1.2
The following 2 by 2 matrix
A = [2 4; 3 9]
is reduced to
A′ = [1 2; 0 1]
by R2 → (1/3)R2; R1 → (1/2)R1; R2 → R2 − R1. If we swap the two rows of A and then perform
row operations to echelon form, we have
A″ = [1 3; 0 1].
This example shows the non-uniqueness of the row echelon form, since A is reduced to two
different matrices A′ and A″. However, the following theorem shows that the row reduced echelon
form is unique irrespective of row interchange(s). Nevertheless, Theorem (1.1) is satisfied.
Proof
Let A′ and A″ be two row reduced echelon forms of the same matrix A. Since A′ and A″ are
also in echelon form, then from Theorem 1.1 they have the same number k of non-zero rows
and the same column numbers of their k leading columns. The ith leading column of each
matrix is just ei, the ith column of the identity matrix. From Theorem (1.1a) there exist
non-singular matrices F′ and F″ such that F′A = A′ and F″A = A″. Therefore A″ = HA′ and
A′ = H⁻¹A″, where H = F″F′⁻¹. The rule for partitioned multiplication tells us that the columns of
A′ and A″ are related by a″ = Ha′, so we obtain Hei = ei for 1 ≤ i ≤ k. Any
column a′ of A′ is a sum of multiples of these first k unit column matrices ei:
a′ = Σ αi ei.
The corresponding column a″ of A″ is just Ha′ = Σ αi Hei = Σ αi ei = a′, so A″ = A′.
ii. If b = 0, then there are infinitely many solutions: every number x is a solution,
since 0x = 0 = b no matter what x is.
For example, the equations
x1 + x2 = 2
x1 − x2 = 0
have exactly one solution; the equations
x1 + x2 = 2
x1 + x2 = 1
have no solution; and the equations
x1 + x2 = 2
2x1 + 2x2 = 4
have infinitely many solutions.
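The first pair of equations above has the unique solution x1 = x2 = 1, which can be checked numerically. A small NumPy sketch (the variable names are ours):

```python
import numpy as np

# x1 + x2 = 2 and x1 - x2 = 0, written as Ax = b
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([2.0, 0.0])
x = np.linalg.solve(A, b)   # valid because det(A) = -2 is non-zero
```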
Examples
1.
x1 + 2x2 − 5x3 = 2
2x1 − 3x2 + 4x3 = 4
4x1 + x2 − 6x3 = 8
2.
x1 + 2x2 − x3 + 2x4 = 4
2x1 + 7x2 + x3 + x4 = 14
3x1 + 8x2 − x3 + 4x4 = 17
3.
Theorem 1.3 (Gauss elimination and solution sets)
Suppose that the system of equations Ax = b, or equivalently the augmented matrix [A, b], is
transformed by a sequence of elementary row operations into the system A′x = b′,
or equivalently into the augmented matrix [A′, b′]. Then the solution sets are identical, i.e., x
solves Ax = b iff x solves A′x = b′.
Proof
By Theorem (1.1a), on row operations there is a non-singular matrix F such that
[A′, b′] = F[A, b] = [FA, Fb],
where the above equality follows from the rule for partitioned multiplications. This means
that A′ = FA and b′ = Fb. Suppose that x solves Ax = b. Pre-multiplication by F gives
FAx = Fb, that is, A′x = b′, and thus x solves the multiplied equations as well. Conversely,
if x satisfies A′x = b′, then pre-multiplication by F⁻¹ shows that x solves Ax = b. Therefore,
the two solution sets are identical.
The above theorem shows that we can study the set of all solutions to Ax = b by studying
the set of solutions of the much simpler system A′x = b′ obtained by reducing [A, b] to echelon
form [A′, b′]. As was demonstrated in part of tutorial one of MTH 204 in the first semester,
three possible cases can occur in solving Ax = b:
1. The rank of the augmented matrix [A, b] is greater than the rank of A, and no solution
exists to Ax = b.
2. The rank of [A, b] equals that of A and equals the number of unknowns, and the system
Ax = b has exactly one solution.
3. The rank of [A, b] equals that of A, which is strictly less than the number of unknowns,
and the system Ax = b has infinitely many solutions.
Proof
See [3], pages 148-149.
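The three cases can be checked mechanically from the two ranks. A hedged NumPy sketch (the helper name `classify` is ours); Example 1 above turns out to be case 3, since its third equation is twice the first plus the second:

```python
import numpy as np

def classify(A, b):
    """Apply the three rank cases to Ax = b."""
    aug = np.column_stack([A, b])
    r_A = np.linalg.matrix_rank(A)
    r_aug = np.linalg.matrix_rank(aug)
    n = A.shape[1]                      # number of unknowns
    if r_aug > r_A:
        return "no solution"            # case 1
    if r_A == n:
        return "unique solution"        # case 2
    return "infinitely many solutions"  # case 3

# Example 1 above: row three equals 2*(row one) + (row two)
A = np.array([[1.0, 2.0, -5.0], [2.0, -3.0, 4.0], [4.0, 1.0, -6.0]])
b = np.array([2.0, 4.0, 8.0])
```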
Example
For what value of α will the following system of equations have one, no, infinitely many
solutions?
x − 3y = −2
2x + y = 3
3x − 2y = α
Solution
The augmented matrix is reduced from
[1 −3 −2; 2 1 3; 3 −2 α] to [1 −3 −2; 0 1 1; 0 0 α−1].
If α = 1, then both A′ and [A′, b′] are in echelon form and rank(A′) = 2, rank([A′, b′]) = 2,
which equals the number of unknowns. This corresponds to case 2 of Theorem 1.4 and there is a
unique solution, namely x = y = 1.
If α − 1 ≠ 0, then the third row can be divided by α − 1 ≠ 0 to produce a 1 at the bottom
of the third column, giving the rank of A as 2 but the rank of [A, b] as 3; this is case 1 and
there are no solutions.
The case Ax = 0 was extensively studied in MTH 204 as the nullspace problem when
balancing chemical reactions; there the rank is less than the number of unknowns, so it
corresponds to case 3, infinitely many solutions.
Chapter 2
Let A be an n by n matrix. An eigenvalue λ and eigenvector x ≠ o of A satisfy
Ax = λx
Ax = λIx
(A − λI)x = o.
Since x ≠ o and (A − λI)x = o, the matrix A − λI must be singular and the solutions
x are infinitely many. Since (A − λI) is singular, its determinant must be zero. Hence the
characteristic polynomial is
fA(λ) = det(A − λI) = |A − λI|.
Eigenvalues are the roots of the characteristic polynomial. The characteristic equation is
det(A − λI) = 0. x is sometimes called a right eigenvector of A. The left eigenvector of A
is the eigenvector of Aᵀ. That is,
Aᵀy = λy, y ≠ o,
or
yᵀA = λyᵀ.
3. For each eigenvalue, solve the equation (A − λI)x = o. Since the determinant is zero,
there are solutions other than x = o; those are the eigenvectors.
The set of all eigenvectors of A belonging to an eigenvalue λ (together with o) is called the
eigenspace of A for λ. It is the nullspace of (A − λI).
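Steps like these are what `numpy.linalg.eig` carries out; a quick sketch using a small symmetric matrix whose eigenvalues happen to be 4 and 2 (the matrix is our choice):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
lam, V = np.linalg.eig(A)            # eigenvalues and eigenvector columns
# each column V[:, i] satisfies (A - lam[i] I) x = 0, i.e. A x = lam[i] x
residual = A @ V - V @ np.diag(lam)
```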
Definition 2.0.1. When the characteristic polynomial of an n by n matrix is written in the
form
det(A − λI) = (λ1 − λ)m1 (λ2 − λ)m2 · · · (λq − λ)mq
with λi ≠ λj for 1 ≤ i < j ≤ q and m1 + m2 + m3 + · · · + mq = n, mi is called the
algebraic multiplicity of the eigenvalue λi . A simple eigenvalue is an eigenvalue of algebraic
multiplicity one.
Definition 2.0.2. Let vλ = nullspace(A − λI). The dimension of the eigenspace vλ is its
geometric multiplicity.
Definition 2.0.3. If the algebraic multiplicity of an eigenvalue of A is greater than its
geometric multiplicity, then the matrix is defective.
Theorem 2.0.1. Eigenvectors v1, v2, · · · , vk corresponding to distinct eigenvalues are
linearly independent.
Since the eigenvalues were assumed to be distinct, λj ≠ λk when j ≠ k. This implies that
c1 = c2 = · · · = ck−1 = 0. Substituting this back into equation (2.1) gives ckvk = o and so
ck = 0, because vk ≠ o. We have proved that (2.1) holds if and only if c1 = c2 = · · · = ck = 0,
which implies the linear independence of the eigenvectors v1, v2, · · · , vk−1, vk.
Crucial
The general formula for the determinant of an n by n matrix with entries aij is
det A = Σπ (sign π) aπ(1),1 aπ(2),2 · · · aπ(n),n    (2.2)
The sum is over all possible permutations π of the rows of A. The 'sign' of the permutation,
written sign π, equals the determinant of the corresponding permutation matrix P, so sign
π = det P = +1 if the permutation is composed of an even number of row exchanges and
−1 if composed of an odd number. For example, the six terms in the well-known 3 by 3 formula
|a11 a12 a13; a21 a22 a23; a31 a32 a33| = a11a22a33 + a31a12a23 + a21a32a13 − a11a32a23 − a21a12a33 − a31a22a13
correspond to the six possible permutations of the rows of the 3 by 3 identity matrix:
[1 0 0; 0 1 0; 0 0 1], [0 1 0; 0 0 1; 1 0 0], [0 0 1; 1 0 0; 0 1 0],
[0 1 0; 1 0 0; 0 0 1], [0 0 1; 0 1 0; 1 0 0], [1 0 0; 0 0 1; 0 1 0].
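Formula (2.2) can be transcribed directly, permutation by permutation. A sketch in Python (exponential-time, for illustration only; the helper names are ours):

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(A):
    """Formula (2.2): sum over permutations pi of sign(pi)*a[pi(1),1]*...*a[pi(n),n]."""
    n = len(A)
    return sum(sign(p) * prod(A[p[j]][j] for j in range(n))
               for p in permutations(range(n)))
```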
Then
|A − λI| = |a11−λ, a12, · · · , a1n; a21, a22−λ, · · · , a2n; · · · ; an1, an2, · · · , ann−λ|.
The fact that fA(λ) is a polynomial of degree n is a consequence of the general determinantal
formula (2.2). Indeed, every term is prescribed by a permutation π of the rows of the matrix,
and equals plus or minus a product of n distinct matrix entries, including one from each row
and one from each column. The term corresponding to the identity permutation is obtained
by multiplying the diagonal entries together, which in this case is
(a11 − λ)(a22 − λ) · · · (ann − λ).
All of the other terms have at most n − 2 diagonal factors aii − λ and so are polynomials of
degree less than or equal to n − 2 in λ.
(b). By comparing coefficients, the leading coefficient is cn = (−1)ⁿ.
(c). cn−1 = (−1)ⁿ⁻¹(a11 + a22 + · · · + ann) = (−1)ⁿ⁻¹ tr(A).
(d). The constant term of any polynomial f(λ) can be found as f(0); since fA(λ) = det(A − λI),
fA(0) = det(A − 0I) = det A.
Examples of Theorem 2.0.2
A = [a b; c d]
det A = ad − bc
tr(A) = a + d
fA(λ) = det(A − λI) = det [a−λ, b; c, d−λ]
= (a − λ)(d − λ) − bc
= ad − aλ − dλ + λ² − bc
= λ² − (a + d)λ + ad − bc
= λ² − tr(A) λ + det A.
By the fundamental theorem of algebra, every polynomial of degree n ≥ 1 can be completely
factored, so we can write the characteristic polynomial in factored form:
fA(λ) = (−1)ⁿ(λ − λ1)(λ − λ2) · · · (λ − λn).
The numbers λ1, λ2, · · · , λn, some of which may be repeated, are the roots of the characteristic
equation fA(λ) = 0, and hence the eigenvalues of the matrix A. If we multiply out this
factored form explicitly and equate the result to the characteristic polynomial, we can read
off its coefficients. Comparison with our previous formulae for the coefficients c0 and cn−1
leads to the following result.
Theorem 2.0.3. The sum of the eigenvalues of a matrix equals its trace, and the product of
its eigenvalues equals its determinant:
λ1 + λ2 + · · · + λn = tr(A),  det A = λ1 λ2 · · · λn.
Example 2.0.1. For A = [1 3; 2 1],
fA(λ) = det(A − λI) = λ² − tr(A)λ + det A = λ² − 2λ − 5.
2.1 EIGENVECTORS AND DIAGONALIZABILITY
A square matrix A is said to be diagonalizable if there exists a nonsingular matrix S and a
diagonal matrix Λ = diag(λ1 , λ2 , · · · , λn ) such that
S⁻¹AS = Λ = diag(λ1, λ2, · · · , λn),
or equivalently A = SΛS −1 .
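The definition can be verified numerically: using the eigenvector columns of a non-defective matrix as S, the product S⁻¹AS comes out diagonal. A NumPy sketch with a small test matrix:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [2.0, 4.0]])             # eigenvalues 2 and 3
lam, S = np.linalg.eig(A)              # columns of S are eigenvectors
Lam = np.linalg.inv(S) @ A @ S         # should equal diag(lam)
```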
Remarks
4. Defective matrices are not diagonalizable. The standard example of a "defective
matrix" is
A = [0 1; 0 0].
Its eigenvalues are λ1 = λ2 = 0, since it is triangular with zeros on the diagonal:
det(A − λI) = det [−λ 1; 0 −λ] = λ².
All eigenvectors of A are multiples of the vector [1; 0]:
[0 1; 0 0] x = [0; 0] forces x = [c; 0].
Examples
1. A = [1 −1; 2 4], λ1 = 2, λ2 = 3,
x1 = [1; −1] and x2 = [1; −2]
S = [1 1; −1 −2]
S⁻¹AS = [2 1; −1 −1][1 −1; 2 4][1 1; −1 −2] = [2 0; 0 3]
2. K = [0 −1; 1 0]; det(K − λI) = λ² + 1 = 0, so λ1 = i and λ2 = −i.
(K − λ1I)x1 = [−i −1; 1 −i][a; b] = [0; 0]
−ia − b = 0,
a − ib = 0 =⇒ a = ib
x1 = [1; −i]
Similarly,
(K − λ2I)x2 = [i −1; 1 i][a; b] = [0; 0]
x2 = [1; i]
S = [1 1; −i i] and S⁻¹KS = [i 0; 0 −i]
3. A = [0 −1 −1; 1 2 1; 1 1 2]; det(A − λI) = −λ³ + 4λ² − 5λ + 2 = −(λ − 1)²(λ − 2) = 0
λ1 = λ2 = 1 and λ3 = 2
x1 = [−1; 1; 0], x2 = [−1; 0; 1] and x3 = [−1; 1; 1]
The eigenvector matrix
S = [−1 −1 −1; 1 0 1; 0 1 1] and S⁻¹ = [−1 0 −1; −1 −1 0; 1 1 1]
S⁻¹AS = [−1 0 −1; −1 −1 0; 1 1 1][0 −1 −1; 1 2 1; 1 1 2][−1 −1 −1; 1 0 1; 0 1 1] = [1 0 0; 0 1 0; 0 0 2]
(S⁻¹A²S) = Λ² = diag(λ1², λ2², · · · , λn²),
or
A² = (SΛS⁻¹)(SΛS⁻¹) = SΛ(S⁻¹S)ΛS⁻¹ = (SΛ)(ΛS⁻¹) = SΛ²S⁻¹.
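The same identity gives Aᵏ = SΛᵏS⁻¹ for any power k, which is the practical payoff of diagonalization. A NumPy sketch:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [2.0, 4.0]])
lam, S = np.linalg.eig(A)
k = 5
Ak = S @ np.diag(lam**k) @ np.linalg.inv(S)   # A^k = S Lam^k S^{-1}
```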
HOMEWORK
If A = [4 3; 1 2], find A¹⁰⁰ by diagonalizing A.
TUTORIAL TWO
1 − 23
1 −2 3 1 1 2
(a). A = (b). F = 1 (c). (d).
−2 1 2
− 16 −1 1 −1 1
4 0 0 0
3 −1 0 1 2 −1 −1
3 0 0
(e). −1 2 −1 (f ).
−1 1 (g). A = −2 1 1
2 0
0 −1 3 1 0 1
1 −1 1 1
(h). [0 c −b; −c 0 a; b −a 0]
2(a). Find the eigenvalues of the rotation matrix
R = [cos θ, −sin θ; sin θ, cos θ].
2(b). For what values of θ are the eigenvalues real?
TUTORIAL THREE
(1). Choose a, b, c such that det(A − λI) = 9λ − λ³. Then the eigenvalues are −3, 0, 3.
A = [0 1 0; 0 0 1; a b c]
(2). Compute the eigenvalues and corresponding eigenvectors of
A = [1 4 4; 3 −1 0; 0 2 3]
(b). Compute the trace of A and check that it equals the sum of the eigenvalues.
(c). Find the determinant of A and check that it is equal to the product of the eigen-
values
(b). Prove the inverse formula A⁻¹ = ((tr A)I − A)/det A for 2 by 2 matrices.
(c). Check the Cayley-Hamilton theorem for A = [2 1; −3 2].
TUTORIAL FOUR
Diagonalize the following matrices and find A⁵.
(1). [3 −9; 2 −6]  (2). A = [5 −4; 2 −1]  (3). K = [−4 −2; 5 2]
0 0 1 0
−2 3 1 3 3 5 0 0 0 1
(4). A = 0 1 6 (5). C = 5 6 5 (6). K =
1
0 0 0
0 0 3 −5 −8 −7
0 1 0 0
2 1 −1 0
−3 −2 0 2 5 5
1 (8). B = 0 2
(7). A =
0 0
0 1 2
0 −5 −3
0 0 1 −1
Chapter 3
3.1 Definition
Let A and B be two square matrices and P a non-singular matrix. B is said to be similar
to A if
B = P⁻¹AP.    (3.1)
B can also be said to be obtained from A by a similarity transformation.
3.2 Example
Similarity of matrices has some properties which we state by means of theorems.
3.3 THEOREM
3.3.1 Similarity As An Equivalence
Similarity of matrices is an equivalence relation.
1. A is similar to itself
3.3.2 Proof
1. Let P = I, where I is the n by n identity matrix. Then
A = P⁻¹AP = I⁻¹AI; thus A is similar to A, because I is non-singular.
2. Suppose B is similar to A, so that B = P⁻¹AP. Pre-multiplying both sides by P,
we have PB = PP⁻¹AP = IAP = AP.
So PB = AP. Now, post-multiply both sides by P⁻¹:
PBP⁻¹ = APP⁻¹ = AI = A.
Since A = PBP⁻¹ = (P⁻¹)⁻¹BP⁻¹, let P⁻¹ = Q; then A = Q⁻¹BQ, so A is similar to B.
The next theorem shows the connection between the eigenvalues and eigenvectors of similar
matrices
3.3.4 Proof
A. Since det P⁻¹ = 1/det P, we have
det(B − λI) = det(P⁻¹AP − λP⁻¹P) = det(P⁻¹(A − λI)P) = det P⁻¹ det(A − λI) det P = det(A − λI),
showing that the characteristic polynomials are the same. Since the eigenvalues are the
roots of the characteristic polynomial, similar matrices have the same eigenvalues.
B. Recall Ax = λx, B = P⁻¹AP and A = PBP⁻¹.
Then Ax = (PBP⁻¹)x = λx. Pre-multiplying by P⁻¹,
P⁻¹PBP⁻¹x = λP⁻¹x
B(P⁻¹x) = λ(P⁻¹x)
=⇒ P⁻¹x is an eigenvector of B corresponding to the eigenvalue λ.
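Both parts can be confirmed numerically; a sketch with an arbitrary non-singular P of our choosing:

```python
import numpy as np

A = np.array([[2.0, -3.0],
              [1.0, -1.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])                     # any non-singular P
B = np.linalg.inv(P) @ A @ P                   # B is similar to A
eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_B = np.sort_complex(np.linalg.eigvals(B))
```

Here the eigenvalues are complex, (1 ± i√3)/2, and still agree between A and B.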
3.4 Example
Consider the matrices A = [2 −3; 1 −1] and B = [0 −1; 1 1]:
det(A − λI) = det [2−λ, −3; 1, −1−λ] = λ² − λ + 1
and
det(B − λI) = det [−λ, −1; 1, 1−λ] = λ² − λ + 1.
3.5 THEOREM
3.5.1 Similarity and Powers
Let B be similar to A such that B = P⁻¹AP. Then
B. det B = det A
PROOF
c. Immediate from (b) and the fact that a matrix is non-singular iff its determinant is
non-zero.
3.5.2 Example on similarity
Consider the matrix below
1 2
A=
3 2
Find an invertible matrix P such that B = P −1 AP .
The characteristic polynomial is
det(A − λI) = det [1−λ, 2; 3, 2−λ] = λ² − 3λ − 4 = 0.
After finding the roots of the characteristic equation, we obtain λ = 4 or λ = −1.
Let
x = [a; b]
be the eigenvector corresponding to the eigenvalue λ = 4, so that Ax = 4x or (A − 4I)x = 0:
(A − 4I)x = [−3 2; 3 −2][a; b] = 0
−3a + 2b = 0
3a − 2b = 0, or 3a = 2b.
Let a = 2 and b = 3. Then
x = [2; 3]
is a non-zero eigenvector belonging to the eigenvalue λ = 4. Let
y = [c; d]
be an eigenvector belonging to λ = −1, so that (A + I)y = 0:
(A + I)y = [2 2; 3 3][c; d] = 0
2c + 2d = 0
3c + 3d = 0
=⇒ c + d = 0, i.e., c = −d. Let c = 1 and d = −1, so
y = [c; d] = [1; −1].
Let
P = [2 1; 3 −1]
be the non-singular matrix of eigenvectors, with inverse
P⁻¹ = [1/5, 1/5; 3/5, −2/5].
A is similar to the diagonal matrix of eigenvalues:
B = P⁻¹AP = [1/5, 1/5; 3/5, −2/5][1 2; 3 2][2 1; 3 −1] = [4 0; 0 −1].
The diagonal elements 4 and −1 of the matrix B are the eigenvalues corresponding to the
given eigenvectors.
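The whole computation can be double-checked in a few lines of NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
P = np.array([[2.0, 1.0],
              [3.0, -1.0]])      # columns are the eigenvectors x and y
B = np.linalg.inv(P) @ A @ P     # should be diag(4, -1)
```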
Tutorial
1. A = [2 2; 1 3]
2. B = [4 2; 3 3]
3. C = [5 −1; 1 3]
Chapter 4
4.1 Introduction
If A = Aᵀ is real and symmetric, then (Ax)ᵀy = xᵀAᵀy = xᵀ(Ay); simply put, A can be moved from one side of the inner product to the other.
For two real vectors x and y, the Euclidean dot product (or inner product) is
xᵀy = x1y1 + x2y2 + · · · + xnyn = yᵀx.
If x and y are complex vectors, then the Hermitian dot product of x and y is
xᴴy = x̄1y1 + x̄2y2 + · · · + x̄nyn = x̄ᵀy,
while
yᴴx = ȳ1x1 + ȳ2x2 + · · · + ȳnxn = ȳᵀx, the conjugate of xᴴy,
so in general xᴴy ≠ yᴴx.
4.1.1 THEOREM
Let A = Aᵀ be a real symmetric n by n matrix. Then:
a. all the eigenvalues of A are real;
b. eigenvectors belonging to distinct eigenvalues are orthogonal.
4.1.2 PROOF
a. Consider Ax = λx. Pre-multiply both sides by xᵀ to yield
xᵀAx = λxᵀx,
so
λ = xᵀAx / xᵀx = xᵀAx / ‖x‖².
Since x ≠ o (it is an eigenvector) and xᵀAx is real, the eigenvalue λ is real.
b. Let Ax = λx and Ay = µy for λ ≠ µ. Pre-multiply Ax = λx by yᵀ:
yᵀAx = λyᵀx = λxᵀy.
Pre-multiply both sides of Ay = µy by xᵀ to yield
xᵀAy = µxᵀy.
Since A = Aᵀ, xᵀAy = yᵀAx. Hence
λxᵀy = µxᵀy
(λ − µ)xᵀy = 0.
But since λ ≠ µ, this implies xᵀy = 0.
c. Exercise
4.1.4 THEOREM
If A = AH , then for all complex vectors x, the number xH Ax is real.
4.1.5 PROOF
(xᴴAx)ᴴ is the conjugate of the scalar xᴴAx, but since A = Aᴴ we get the same number again:
(xᴴAx)ᴴ = xᴴAᴴx = xᴴAx.
So that number must be real.
4.1.6 EXAMPLE
1. A = [3 1; 1 3]
has real eigenvalues λ1 = 4 and λ2 = 2. The corresponding eigenvectors
x1 = [1; 1] and x2 = [−1; 1]
are orthogonal:
x1ᵀx2 = (1)(−1) + (1)(1) = 0,
and the normalized eigenvectors are
u1 = [1/√2; 1/√2] and u2 = [−1/√2; 1/√2].
2. A = [5 −4 2; −4 5 2; 2 2 −1]
The eigenvalues are λ1 = 9, λ2 = 3 and λ3 = −3, with eigenvectors
x1 = [1; −1; 0], x2 = [1; 1; 1] and x3 = [1; 1; −2].
We want to show that the eigenvectors form an orthogonal basis:
x1ᵀx2 = (1)(1) + (−1)(1) + (0)(1) = 1 − 1 + 0 = 0
x1ᵀx3 = (1)(1) + (−1)(1) + (0)(−2) = 1 − 1 + 0 = 0
CHECK: x2ᵀx3 = 0. To form an orthonormal basis from the eigenvectors, ‖x1‖ = √2,
‖x2‖ = √3 and ‖x3‖ = √6, so
u1 = [1/√2; −1/√2; 0]
u2 = [1/√3; 1/√3; 1/√3]
u3 = [1/√6; 1/√6; −2/√6]
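For symmetric matrices, NumPy provides `numpy.linalg.eigh`, which returns real eigenvalues in ascending order together with orthonormal eigenvector columns; a sketch using the matrix of this example:

```python
import numpy as np

A = np.array([[5.0, -4.0, 2.0],
              [-4.0, 5.0, 2.0],
              [2.0, 2.0, -1.0]])
lam, Q = np.linalg.eigh(A)   # for symmetric A: real eigenvalues, orthonormal Q
```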
The eigenvalues of a symmetric matrix can be used to test its positive definiteness, as
stated in the next theorem.
Theorem
A symmetric matrix is positive definite if and only if all its eigenvalues are strictly
positive.
Proof
3. K = [8 0 1; 0 8 1; 1 1 7]
Its characteristic equation is
det(K − λI) = −λ³ + 23λ² − 174λ + 432 = −(λ − 9)(λ − 8)(λ − 6) = 0.
The eigenvalues are 9, 8 and 6. Since they are all positive, K is a positive definite
matrix. The eigenvectors are
x1 = [1; 1; 1], x2 = [−1; 1; 0] and x3 = [−1; −1; 2].
The eigenvectors form an orthogonal basis of R³. The corresponding orthonormal
eigenvector basis is
u1 = [1/√3; 1/√3; 1/√3]
u2 = [−1/√2; 1/√2; 0]
u3 = [−1/√6; −1/√6; 2/√6]
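The positive-definiteness test stated in the theorem above is easy to apply numerically; a sketch using K from this example:

```python
import numpy as np

K = np.array([[8.0, 0.0, 1.0],
              [0.0, 8.0, 1.0],
              [1.0, 1.0, 7.0]])
lam = np.linalg.eigvalsh(K)              # real eigenvalues, ascending order
is_positive_definite = bool(np.all(lam > 0))
```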
Theorem
Two eigenvectors of a real symmetric or Hermitian matrix, if they come from different
eigenvalues, are orthogonal to one another.
Proof
Let Ax = λx and Ay = µy, with λ ≠ µ and A = Aᴴ.
(λx)ᴴy = (Ax)ᴴy = xᴴAᴴy = xᴴAy = xᴴ(µy)
The outside numbers give λxᴴy = µxᴴy (the eigenvalues of a Hermitian matrix are real, so λ̄ = λ).
Since λ ≠ µ, (λ − µ)xᴴy = 0.
Hence xᴴy = 0.
Since the eigenvalues of a real symmetric matrix are real, and its eigenvectors (from distinct
eigenvalues) are orthogonal, the eigenvectors can be taken real. The orthonormalized
eigenvectors go into an orthogonal matrix Q. A real matrix Q is said to be orthogonal if
QᵀQ = I; for real Q, Qᵀ = Qᴴ, so QᴴQ = I as well.
Hence S⁻¹AS = Λ becomes Q⁻¹AQ = Λ, or A = QΛQ⁻¹ = QΛQᵀ.
This now leads to the following important theorem of Linear Algebra:
Theorem
A real symmetric matrix can be factored into A = QΛQᵀ, where its orthonormal eigenvectors
are the columns of the orthogonal matrix Q and its eigenvalues are the entries of Λ.
Proof
Exercise.
In Mathematics, the formula A = QΛQT is known as the spectral theorem and can also be
referred to as an orthogonal diagonalization of A. If we multiply columns by rows,
A = QΛQᵀ = [x1 x2 · · · xn] diag(λ1, λ2, · · · , λn) [x1ᵀ; x2ᵀ; · · · ; xnᵀ]
= λ1x1x1ᵀ + λ2x2x2ᵀ + · · · + λnxnxnᵀ.
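The column-times-row expansion can be checked term by term with outer products; a NumPy sketch on a small symmetric matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
lam, Q = np.linalg.eigh(A)
# rebuild A as lam_1 x_1 x_1^T + lam_2 x_2 x_2^T
A_rebuilt = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(len(lam)))
```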
Example
The 2 by 2 matrix
A = [3 1; 1 3]
considered earlier: its orthonormal eigenvectors produce the diagonalizing orthogonal matrix
Q = [1/√2, −1/√2; 1/√2, 1/√2],
and
A = QΛQᵀ = [1/√2, −1/√2; 1/√2, 1/√2][4 0; 0 2][1/√2, 1/√2; −1/√2, 1/√2] = [3 1; 1 3].
Tutorial
1. A = [−3 4; 4 3]
2. A = [2 −1; −1 4]
3. A = [1 1 0; 1 2 1; 0 1 1]
4. A = [3 −1 −1; −1 2 0; −1 0 2]
5. A = [2 1 −1; 1 2 1; −1 1 2]
6. Find the spectral factorization of the following matrices.
a. A = [3, 2i; −2i, 6]
b. B = [6, 1−2i; 1+2i, 2]
7. For which values of b and c does the system
x1 + x2 + bx3 = 1
bx1 + 3x2 − x3 = −2
3x1 + 4x2 + x3 = c
a. Have no solution b. Exactly one solution c. Infinitely many solutions.
Chapter 5
Jordan Form
5.1 Introduction
The question we want to answer now is the following:
If A is not similar to a diagonal matrix, then what is the simplest matrix that A is similar
to?
Before we can prove the answer, we will have to introduce a few definitions.
Definition
A square matrix A is "block diagonal" if A has the form
A = [A1 0 · · · 0; 0 A2 · · · 0; · · · ; 0 0 · · · Ak]
where each Ai is a square matrix and the diagonals of each Ai lie on the diagonal of A. Each
0 is a zero matrix of appropriate size. Each Ai is called a "block" of A.
Technically, every square matrix is a block diagonal matrix. But we only use the terminology
when there are at least two blocks in the matrix. Here is an example of a ’typical’ block
diagonal matrix.
A = [1 3 2 0 0 0; 7 0 2 0 0 0; 1 1 2 0 0 0; 0 0 0 6 0 0; 0 0 0 0 2 1; 0 0 0 0 3 3]
This matrix has blocks of size 3, 1 and 2 as we move down the diagonal. The three blocks in
this matrix are
A1 = [1 3 2; 7 0 2; 1 1 2]
A2 = [6]
A3 = [2 1; 3 3]
The lines are just drawn to illustrate the blocks.
Definition
A ”Jordan block” with value λ is a square, upper triangular matrix whose entries are all λ
on the diagonal, all 1 on the entries immediately above the diagonal and 0 elsewhere:
J(λ) = [λ 1 0 · · · 0 0; 0 λ 1 · · · 0 0; 0 0 λ · · · 0 0; · · · ; 0 0 0 · · · λ 1; 0 0 0 · · · 0 λ]
Here is what the Jordan blocks of sizes 1, 2 and 3 look like:
[λ],  [λ 1; 0 λ],  [λ 1 0; 0 λ 1; 0 0 λ]
Definition
A ”Jordan form” matrix is a block diagonal whose blocks are all Jordan blocks.
For example, every diagonal p × p matrix is a Jordan form, with p 1 × 1 Jordan blocks. Here
are some more interesting examples:
[1 1 0 0 0 0; 0 1 0 0 0 0; 0 0 3 1 0 0; 0 0 0 3 0 0; 0 0 0 0 −1 0; 0 0 0 0 0 −1],
[2 1 0 0 0; 0 2 1 0 0; 0 0 2 0 0; 0 0 0 2 1; 0 0 0 0 2],
[2 1 0 0; 0 2 1 0; 0 0 2 1; 0 0 0 2].
Now, here’s the big theorem that answers our first question:
Theorem 1
Let A be a p × p matrix. Then there is a Jordan form matrix J that is similar to A.
In fact, we can be more specific than that:
Theorem 2
Let A be a p × p matrix, with s distinct eigenvalues λ1, · · · , λs. Let each λi have algebraic
multiplicity mi and geometric multiplicity µi. Then A is similar to a Jordan form matrix
J = [J1 0 · · · 0; 0 J2 · · · 0; · · · ; 0 0 · · · Jµ]
where
1. µ = µ1 + µ2 + · · · + µs
2. Treat each eigenvalue in turn. For a given eigenvalue λ of algebraic multiplicity m and
geometric multiplicity µ, we start computing the E-spaces and their dimensions. The
k-th E-space is
Eλᵏ = {x : (A − λI)ᵏx = 0}.
So Eλ¹ is just the eigenspace Eλ, and we build from there. We stop when we get to an
Eλᵏ that has dimension m.
4. Start at the bottom of the diagram and fill the boxes in row k with linearly independent
vectors that belong to Eλᵏ but not Eλᵏ⁻¹. Any time you have a vector v in a box, the
box immediately above it gets filled with the vector (A − λI)v. If a box is the lowest
in its column and belongs to row i, fill that box with a new vector from Eλⁱ which is
linearly independent of both Eλⁱ⁻¹ and all the other vectors in row i.
5. Repeat steps 2 through 4 for each distinct eigenvalue. You will get a diagram full of
vectors for each one.
6. Make a matrix Q as follows. For each eigenvalue, consider the associated diagram.
The vectors in the boxes become the columns of Q as follows. Start at the top of the
leftmost column, and use the vectors as you go down the column. When you reach the
end of a column, go to the next column. When you finish one diagram, go to the first
column of the next diagram. This gives the matrix Q.
7. The Jordan form of A is given by J = Q−1 AQ. But the nice part of the algorithm is
that you can compute J without finding Q! In fact J will have one Jordan block for
each column of each diagram. The value of the block is given by the eigenvalue, and
the size of the block is equal to the number of squares in the column. You put the
blocks down the diagonal of J in the same order you chose the vectors in Q.
Examples
1. A = [2 −3; 3 −4]
For this matrix, the characteristic polynomial is (1 + λ)², so there is one eigenvalue,
λ = −1, with m = 2. Now we compute the E-spaces. E₋₁¹: solving (A + I)x = 0,
[3 −3 0; 3 −3 0] −→ [1 −1 0; 0 0 0]
E₋₁¹ = {[t; t]} = t[1; 1]
So d1 = µ = 1. Since dim E₋₁¹ < m, we have to compute another E-space.
E₋₁²: solving (A + I)²x = 0:
(A + I)² = [0 0; 0 0],
so E₋₁² = span(e1, e2), and d2 = 2 − 1 = 1. Since dim E₋₁² = m, we don't need any
more E-spaces. Since we have d1 = 1, d2 = 1, our diagram is a single column of two boxes.
We put a vector in the lower box. It has to be a vector in E₋₁² that is linearly
independent of E₋₁¹. That is easy enough: take v1 = [1, 0]ᵀ. Above v1, we have to
put v2 = (A + I)v1, which is
v2 = [3 −3; 3 −3][1; 0] = [3; 3]
2. A = [3 1 0; −1 1 0; 3 2 2]
We skip the computation to show that A has only one eigenvalue, λ = 2, of multiplicity
3. Computing E₂¹:
[1 1 0 0; −1 −1 0 0; 3 2 0 0] −→ [1 0 0 0; 0 1 0 0; 0 0 0 0]
So E₂¹ is spanned by the vector [0, 0, 1]ᵀ, and d1 = 1. Turning to E₂², we solve
(A − 2I)²x = 0:
(A − 2I)² = [0 0 0; 0 0 0; 1 1 0]
So E₂² is spanned by [1, −1, 0]ᵀ and [0, 0, 1]ᵀ, and d2 = 2 − 1 = 1. We need to compute
another E-space. Computation shows that (A − 2I)³ = 0, the zero matrix, so E₂³ is spanned
by e1, e2, e3. So d3 = 1 and we can stop here. Our diagram is one column of three
boxes.
The bottom box gets filled with a vector from E₂³ that is linearly independent of E₂².
The vector v1 = e1 will work. Above it goes v2 = (A − 2I)v1 = [1, −1, 3]ᵀ, and
above that goes v3 = (A − 2I)v2 = [0, 0, 1]ᵀ. Since our diagram now looks like
v3
v2
v1
we get the transition matrix,
0 1 1
Q = [v3 , v2 , v1 ] = 0 −1 0
1 3 0
Again, there is only one column, so only one Jordan block, which has value 2 and size
3. We get the Jordan form matrix
2 1 0
J = 0 2 1
0 0 2
3.
2 4 −8
A= 0 0 4
0 −1 4
You can check that this matrix also has only the eigenvalue 2, with multiplicity 3. We
compute the E-spaces. First, for E₂¹,
[A − 2I | 0] = [0 4 −8 0; 0 −2 4 0; 0 −1 2 0] −→ [0 1 −2 0; 0 0 0 0; 0 0 0 0]
So E₂¹ is spanned by the two vectors [1, 0, 0]ᵀ and [0, 2, 1]ᵀ. Also, d1 = 2 < 3, so we need
another E-space. Computing E₂², we see that (A − 2I)² = 0, so E₂² is spanned by
the standard basis, and d2 = 3 − 2 = 1. We can stop, since E₂² has dimension 3.
Our diagram looks like two columns: v2 and v3 in the top row, with v1 below v2,
where v1 is a vector in E₂² linearly independent of E₂¹. We get v2 = (A − 2I)v1, and we
finally choose v3 ∈ E₂¹ linearly independent of v2. If we start by choosing v1 = e2, we
wind up getting
4 0 1
Q = [v2 , v1 , v3 ] = −2 1 0
−1 0 0
Finally, the diagram tells us that we get 2 Jordan blocks this time. Both have value 2,
but one is of size 2 and one is of size 1. So
2 1 0
J = 0 2 0
0 0 2
We drew the lines just to illustrate the blocks. You can check in this example, and in
all of the previous ones, that indeed J = Q−1 AQ.
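If SymPy is available, `Matrix.jordan_form` reproduces the algorithm's output (a sketch using the matrix of example 3; the ordering of the blocks inside J may differ from the hand computation):

```python
from sympy import Matrix

A = Matrix([[2, 4, -8],
            [0, 0, 4],
            [0, -1, 4]])
P, J = A.jordan_form()   # A = P * J * P**-1, with J in Jordan form
```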
Chapter 6
6.1 Introduction
According to Risteski [2], a chemical reaction is an expression showing a symbolic representation
of the reactants and products, which are usually positioned on the left and right hand
sides respectively. Substances that take part in a chemical reaction are
represented by their molecular formulae, and this symbolic representation is also regarded as
a chemical equation [3]. A chemical reaction can either be reversible or irreversible. Chemical
equations differ from mathematical equations in the sense that reactants and products are
connected by a single arrow (in the case of an irreversible reaction) or a double arrow pointing
in both the forward and backward directions (in the case of a reversible reaction) [4],
whereas an equality sign links the left and right hand sides of a mathematical equation.
'The quantitative and qualitative knowledge of the chemical processes which estimates the
amount of reactants, predicting the nature and amount of products and determining
conditions under which a reaction takes place is important in balancing a chemical reaction.
Balancing chemical reactions is an excellent demonstrative and instructive example of the
inter-connectedness between Linear Algebra and stoichiometric principles' [1].
If the number of atoms of each type of element on the left is the same as the number
of atoms of the corresponding type on the right, then the chemical equation is said to be
balanced [4]; otherwise it is not. The quantitative study of the relationship between reactants
in a chemical reaction is termed stoichiometry [6]. Tuckerman [5] mentioned two methods for
balancing a chemical reaction: by inspection and algebraic. The balancing-by-inspection
method involves making successive intelligent guesses at the coefficients that will
balance an equation and continuing until the equation is balanced [1]. For simple
equations this procedure is straightforward. However, according to [7], there is a need for
a 'step-by-step' approach which is easily applicable and can be mastered, rather than the
haphazard hoping of inspection or a highly refined inspection. In addition, the balancing-by-inspection
method leads one to believe that there is only one possible solution, rather than
the infinite number of solutions which the method proposed in this paper illustrates. The
algebraic approach circumvents the above shortcomings of the inspection method and
can handle complex chemical reactions.
The algebraic approach discussed in [5] involves putting unknown coefficients in front of
each molecular species in the equation and solving for the unknowns. This is then followed
by writing down the balance conditions on each element. He then sets one of the
unknowns to one and solves in turn for the remaining unknowns.
In the proposed approach, instead of setting one of the unknowns to one, we write out the
set of equations in matrix form and obtain a homogeneous system of equations. Since the system
of equations is homogeneous, the solution obtained is in the nullspace of the corresponding
matrix. We then perform elementary row operations on the matrix to reduce it to row
reduced echelon form. We also show the use of software environments like Matlab/Octave
to reduce the corresponding matrix to row reduced echelon form using the rref command.
This approach improves on [1] in the sense that we do not need to manually reduce
the matrix to echelon form as shown in that paper. In that paper, they showed how the
corresponding matrix is reduced to echelon form but did not use elementary row operations
to convert it to row reduced echelon form.
In the next section, we state two well known results pertaining to echelon form and row
reduced echelon form.
6.2 Theory
In this section, we state well known results about echelon form and row reduced echelon
form. We will not bother about the algorithm as this is readily available in most Linear
Algebra textbooks.
Lemma 6.2.1. : The number of nonzero rows and columns are the same in any echelon form
produced from a given matrix A by elementary row operations, irrespective of the sequence
of row operations used.
Given an n × m matrix A,
2. Use the bottom-most non-zero entry 1 in each leading column of the echelon form,
starting with the rightmost leading column and working to the left, to eliminate
all non-zero entries in that column strictly above that entry.
(b). The ith leading column equals ei , the ith column of the identity matrix of order p, for
1 ≤ i ≤ k.
The next result which can be found in [9], describes the uniqueness of the row reduced
echelon form. It is the uniqueness of the row reduced echelon form that makes it a tool for
finding the nullspace of a matrix.
Theorem 6.2.1. (Row Reduced Echelon Form): Each matrix has precisely one row reduced
echelon form to which it can be reduced by elementary row operations, regardless of the actual
sequence of operations used to produce it.
Proof. See [9].
F e + O2 −→ F e2 O3 .
pF e + qO2 −→ rF e2 O3 .
We compare the number of Iron (Fe) and Oxygen (O) atoms of the reactants with the number
of atoms of the product. We obtain the following set of equations:
F e : p = 2r
O : 2q = 3r,
p − 2r = 0 or p = 2r
3 3
q − r = 0 or q = r,
2 2
45
the nullspace solution
x = [p; q; r] = [2; 3/2; 1] r.
There are two pivot variables p, q and one free variable r. If we choose r = 1, then
p = 2, q = 3/2. To avoid fractions, we can also let r = 2, so that p = 4, q = 3. We
remark that these are not the only solutions: since there is a free variable r, there are
infinitely many nullspace solutions. Therefore, the chemical equation can be balanced as
2Fe + (3/2)O2 −→ Fe2O3,
or
4Fe + 3O2 −→ 2Fe2O3.
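The elimination above can also be carried out programmatically. The following Python sketch is our addition, not part of the original text: it reduces the coefficient matrix of pFe + qO2 −→ rFe2O3 to row reduced echelon form using exact fractions (rather than Matlab's rref), and the special solution p = 2, q = 3/2 can then be read off the last column.

```python
from fractions import Fraction

def rref(rows):
    """Row reduced echelon form of a matrix, using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot = 0
    for col in range(len(m[0])):
        # Find a row at or below `pivot` with a nonzero entry in this column.
        r = next((i for i in range(pivot, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue  # no pivot in this column
        m[pivot], m[r] = m[r], m[pivot]      # interchange into pivot position
        p = m[pivot][col]
        m[pivot] = [x / p for x in m[pivot]]  # scale the pivot to 1
        for i in range(len(m)):               # clear the rest of the column
            if i != pivot and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[pivot])]
        pivot += 1
        if pivot == len(m):
            break
    return m

# Balance p Fe + q O2 -> r Fe2O3.  Atom counts give (columns p, q, r):
#   Fe: p - 2r = 0,   O: 2q - 3r = 0
A = [[1, 0, -2],
     [0, 2, -3]]
R = rref(A)
# R[1][2] == Fraction(-3, 2), i.e. q = (3/2)r for the special solution.
```

Setting the free variable r = 1 recovers p = 2, q = 3/2, in agreement with the hand computation above.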
Example 6.3.2. : Ethane (C2H6) burns in oxygen to produce carbon (IV) oxide (CO2) and
steam. The steam condenses to form droplets of water, viz.
C2H6 + O2 −→ CO2 + H2O.
In balancing the equation, let p, q, r and s be the unknown variables such that
pC2H6 + qO2 −→ rCO2 + sH2O.
We compare the number of Carbon (C), Hydrogen (H) and Oxygen (O) atoms of the reactants
with the number of atoms of the products. We obtain the following set of equations:
C : 2p = r
H : 6p = 2s
O : 2q = 2r + s.
In homogeneous form,
[2 0 −1 0; 6 0 0 −2; 0 2 −2 −1] [p; q; r; s] = [0; 0; 0].
In the first step of elimination, replace row two by row two minus three times row one, i.e.,
R2 ↔ R2 − 3R1 to yield,
∼ [2 0 −1 0; 0 0 3 −2; 0 2 −2 −1].
Exchange rows two and three to reduce A to echelon form U,
U = [2 0 −1 0; 0 2 −2 −1; 0 0 3 −2].
46
In the next set of operations, carried out to reduce U to R, we perform row
operations that change the entries above the pivots to zero: replace row two by three
times row two plus two times row three, i.e., R2 ↔ 3R2 + 2R3, and replace row one with three
times row one plus row three (R1 ↔ 3R1 + R3), to yield
∼ [6 0 0 −2; 0 6 0 −7; 0 0 3 −2].
The last operation that gives us R is to reduce all the pivots to unity: replace row
one with one-sixth row one, row two with one-sixth row two and row three with one-third
row three to obtain
R = [1 0 0 −1/3; 0 1 0 −7/6; 0 0 1 −2/3]. (6.2)
The solution to Ax = 0 reduces to Rx = 0, where x lies in the nullspace of A, which is
equal to the nullspace of R. Hence,
[1 0 0 −1/3; 0 1 0 −7/6; 0 0 1 −2/3] [p; q; r; s] = 0,
so that p = (1/3)s, q = (7/6)s and r = (2/3)s. Choosing the free variable s = 6 to avoid
fractions gives p = 2, q = 7, r = 4 and s = 6, so the chemical equation can be balanced as
2C2H6 + 7O2 −→ 4CO2 + 6H2O.
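As a quick sanity check (our addition, not in the source), the coefficients p = 2, q = 7, r = 4, s = 6 obtained from the nullspace can be tested directly against the atom-balance equations C: 2p = r, H: 6p = 2s, O: 2q = 2r + s stated above:

```python
# Coefficients for p C2H6 + q O2 -> r CO2 + s H2O, from the nullspace
# solution with the free variable chosen as s = 6 (an assumed choice).
p, q, r, s = 2, 7, 4, 6

# Atom-balance equations from the text.
assert 2 * p == r              # carbon
assert 6 * p == 2 * s          # hydrogen
assert 2 * q == 2 * r + s      # oxygen

print("2C2H6 + 7O2 -> 4CO2 + 6H2O is balanced")
```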
Example 6.3.3. : Sodium hydroxide (NaOH) reacts with sulphuric acid (H2SO4) to yield
sodium sulphate (Na2SO4) and water,
NaOH + H2SO4 −→ Na2SO4 + H2O.
Balance the equation.
In balancing the equation, let p, q, r and s be the unknown variables such that
pNaOH + qH2SO4 −→ rNa2SO4 + sH2O.
We compare the number of Sodium (Na), Oxygen (O), Hydrogen (H) and Sulphur (S) atoms
of the reactants with the number of atoms of the products. We obtain the following set of
equations:
Na : p = 2r
O : p + 4q = 4r + s
H : p + 2q = 2s
S : q = r.
Re-writing these equations in standard form, we have a homogeneous system Ax = 0 of
linear equations with p, q, r and s
p − 2r = 0
p + 4q − 4r − s = 0
p + 2q − 2s = 0
q−r = 0,
or
[1 0 −2 0; 1 4 −4 −1; 1 2 0 −2; 0 1 −1 0] [p; q; r; s] = 0,
where A = [1 0 −2 0; 1 4 −4 −1; 1 2 0 −2; 0 1 −1 0].
The augmented system becomes
[A 0] = [1 0 −2 0 | 0; 1 4 −4 −1 | 0; 1 2 0 −2 | 0; 0 1 −1 0 | 0].
Since the right-hand side is the zero vector, we work with the matrix A alone, because
row operations cannot change the zeros.
Replace row two with row two minus row one, i.e., R2 ↔ R2 − R1. Similarly, replace row
three with row three minus row one, i.e., R3 ↔ R3 − R1. This first set of row operations
reduces A to
∼ [1 0 −2 0; 0 4 −2 −1; 0 2 2 −2; 0 1 −1 0].
In the second set of row operations, we replace row three by two times row three minus
row two or R3 ↔ 2R3 − R2 and replace row four by four times row four minus row two or
R4 ↔ 4R4 − R2 to yield
∼ [1 0 −2 0; 0 4 −2 −1; 0 0 6 −3; 0 0 −2 1].
In the third stage of the elimination process, we replace row four with 3 times row four plus
row three i.e, R4 ↔ 3R4 + R3 to yield the row echelon matrix or upper triangular U,
U = [1 0 −2 0; 0 4 −2 −1; 0 0 6 −3; 0 0 0 0].
We now reduce U to row reduced echelon form R as follows: First, we reduce the pivots to
unity in rows two and three via R2 ↔ 41 R2 and R3 ↔ 61 R3 to obtain
∼ [1 0 −2 0; 0 1 −1/2 −1/4; 0 0 1 −1/2; 0 0 0 0].
Replace row one by row one plus two times row three, i.e., R1 ↔ R1 + 2R3, and row two by
row two plus half row three, that is, R2 ↔ R2 + (1/2)R3. These two operations reduce all
nonzero entries above the pivots to zero, resulting in the row reduced echelon form R
R = [1 0 0 −1; 0 1 0 −1/2; 0 0 1 −1/2; 0 0 0 0]. (6.3)
p − s = 0 or p = s
q − (1/2)s = 0 or q = (1/2)s
r − (1/2)s = 0 or r = (1/2)s,
the nullspace solution
x = [p; q; r; s] = [1; 1/2; 1/2; 1] s.
There are three pivot variables p, q, r and one free variable s. We set s = 2, so that
p = 2, q = 1 and r = 1. We remark that this is not the only solution: since there is a free
variable s, there are infinitely many nullspace solutions. Therefore, the chemical equation
can be balanced as
2NaOH + H2SO4 −→ Na2SO4 + 2H2O.
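A short check (our addition, not the author's) confirms that x = (2, 1, 1, 2) lies in the nullspace of the coefficient matrix A built from the Na, O, H and S balance equations:

```python
# Coefficient matrix A from the equations p - 2r = 0, p + 4q - 4r - s = 0,
# p + 2q - 2s = 0 and q - r = 0 (columns ordered p, q, r, s).
A = [[1, 0, -2,  0],
     [1, 4, -4, -1],
     [1, 2,  0, -2],
     [0, 1, -1,  0]]
x = [2, 1, 1, 2]  # the balancing coefficients found above

# Ax should be the zero vector.
residual = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(residual)  # [0, 0, 0, 0]
```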
Example 6.3.4. : Using row reduced echelon form, balance the following chemical reaction:
KHC8H4O4 + KOH −→ K2C8H4O4 + H2O.
Let p, q, r and s be the unknown variables such that
pKHC8H4O4 + qKOH −→ rK2C8H4O4 + sH2O.
Comparing the numbers of Potassium (K), Hydrogen (H), Carbon (C) and Oxygen (O) atoms
in the reactants and products gives
K : p + q = 2r
H : 5p + q = 4r + 2s
C : 8p = 8r
O : 4p + q = 4r + s.
After the first two sets of row operations, the coefficient matrix reduces to
∼ [1 1 −2 0; 0 −4 6 −2; 0 0 −4 4; 0 0 −2 2].
Finally, R4 ↔ 2R4 − R3 reduces the matrix to echelon form
U = [1 1 −2 0; 0 −4 6 −2; 0 0 −4 4; 0 0 0 0].
There are three pivots, namely 1, −4, −4. Hence, to reduce the matrix to row reduced
echelon form, we make the entries above the pivots zero and then change the pivots
to unity. The row operations R2 ↔ 4R2 + 6R3, R1 ↔ 2R1 − R3 and R1 ↔ R1 + (1/8)R2 change
the nonzero entries above the pivots to zero, so that U reduces to
∼ [2 0 0 −2; 0 −16 0 16; 0 0 −4 4; 0 0 0 0].
The row operations R1 ↔ (1/2)R1, R2 ↔ −(1/16)R2 and R3 ↔ −(1/4)R3 lead to the row reduced
echelon form
R = [1 0 0 −1; 0 1 0 −1; 0 0 1 −1; 0 0 0 0]. (6.4)
Therefore, the solution x to Rx = 0 becomes
x = [p; q; r; s] = [1; 1; 1; 1] s.
For simplicity, we set s equal to one so that p = q = r = s = 1. This shows that the
equation was balanced in the first place.
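Again as a check (our addition), the solution p = q = r = s = 1 can be substituted into the four balance equations of Example 6.3.4:

```python
# Coefficients for p KHC8H4O4 + q KOH -> r K2C8H4O4 + s H2O.
p = q = r = s = 1

assert p + q == 2 * r                  # potassium
assert 5 * p + q == 4 * r + 2 * s      # hydrogen
assert 8 * p == 8 * r                  # carbon
assert 4 * p + q == 4 * r + s          # oxygen

print("KHC8H4O4 + KOH -> K2C8H4O4 + H2O is balanced as written")
```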
Example 6.4.2. : A = [2 0 −1 0; 6 0 0 −2; 0 2 −2 −1] and R = rref(A).
This gives the same R as in (6.2):
R = [1 0 0 −0.3333; 0 1 0 −1.1667; 0 0 1 −0.6667] = [1 0 0 −1/3; 0 1 0 −7/6; 0 0 1 −2/3].
Example 6.4.5. : Consider balancing the following chemical reaction from [5]
Using the Matlab or Octave command R = rref(A), Ax = 0 reduces to Rx = 0 as
[1 0 0 0 0 −1; 0 1 0 0 0 −0.5; 0 0 1 0 0 −0.5; 0 0 0 1 0 −0.25; 0 0 0 0 1 −0.5] [p; q; r; s; t; u] = 0,
or
x = [p; q; r; s; t; u] = [1; 1/2; 1/2; 1/4; 1/2; 1] u.
References
[1] Gabriel C. I. and Onwuka G. I.: Balancing of Chemical Equations using Matrix Algebra.
Journal of Natural Sciences Research, Vol. 3, No. 5, 29–36, 2015.
[2] Risteski I. B.: Journal of the Chinese Chemical Society, 56: 65–79, 2009.
[6] Hill J. W., McCreary T. W., Kolb D. K.: Chemistry for Changing Times. Pearson
Education Inc., 123–143, 2000.
[7] Hutchings L., Peterson L., Almasude A.: The Journal of Mathematics and Science:
Collaborative Explorations, 9: 119–133, 2007.
[9] Ben Noble, James W. Daniel: Applied Linear Algebra, Third Edition, pp. 90–97, 103,
140–149, 1988.