
Notes on Linear Algebra

A. K. Lal S. Pati
July 11, 2011
Contents
1 Introduction to Matrices
  1.1 Definition of a Matrix
    1.1.1 Special Matrices
  1.2 Operations on Matrices
    1.2.1 Multiplication of Matrices
    1.2.2 Inverse of a Matrix
  1.3 Some More Special Matrices
    1.3.1 Submatrix of a Matrix
  1.4 Summary
2 System of Linear Equations
  2.1 Introduction
    2.1.1 A Solution Method
    2.1.2 Gauss Elimination Method
    2.1.3 Gauss-Jordan Elimination
  2.2 Elementary Matrices
  2.3 Rank of a Matrix
  2.4 Existence of Solution of Ax = b
  2.5 Determinant
    2.5.1 Adjoint of a Matrix
    2.5.2 Cramer's Rule
  2.6 Miscellaneous Exercises
  2.7 Summary
3 Finite Dimensional Vector Spaces
  3.1 Finite Dimensional Vector Spaces
    3.1.1 Subspaces
    3.1.2 Linear Span
  3.2 Linear Independence
  3.3 Bases
    3.3.1 Dimension of a Finite Dimensional Vector Space
    3.3.2 Application to the study of $\mathbb{C}^n$
  3.4 Ordered Bases
  3.5 Summary
4 Linear Transformations
  4.1 Definitions and Basic Properties
  4.2 Matrix of a linear transformation
  4.3 Rank-Nullity Theorem
  4.4 Similarity of Matrices
  4.5 Change of Basis
  4.6 Summary
5 Inner Product Spaces
  5.1 Introduction
  5.2 Definition and Basic Properties
    5.2.1 Basic Results on Orthogonal Vectors
  5.3 Gram-Schmidt Orthogonalization Process
  5.4 Orthogonal Projections and Applications
    5.4.1 Matrix of the Orthogonal Projection
  5.5 QR Decomposition
6 Eigenvalues, Eigenvectors and Diagonalization
  6.1 Introduction and Definitions
  6.2 Diagonalization
  6.3 Diagonalizable Matrices
  6.4 Sylvester's Law of Inertia and Applications
7 Appendix
  7.1 Permutation/Symmetric Groups
  7.2 Properties of Determinant
  7.3 Dimension of M + N
Chapter 1
Introduction to Matrices
1.1 Definition of a Matrix
Definition 1.1.1 (Matrix) A rectangular array of numbers is called a matrix.
The horizontal arrays of a matrix are called its rows and the vertical arrays are called its columns. A matrix is said to have the order $m \times n$ if it has $m$ rows and $n$ columns. An $m \times n$ matrix $A$ can be represented in either of the following forms:
\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\quad \text{or} \quad
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix},
\]
where $a_{ij}$ is the entry at the intersection of the $i$-th row and $j$-th column. In a more concise manner, we also write $A_{m \times n} = [a_{ij}]$ or $A = [a_{ij}]_{m \times n}$ or $A = [a_{ij}]$. We shall mostly be concerned with matrices having real numbers, denoted $\mathbb{R}$, as entries. For example, if $A = \begin{bmatrix} 1 & 3 & 7 \\ 4 & 5 & 6 \end{bmatrix}$ then $a_{11} = 1$, $a_{12} = 3$, $a_{13} = 7$, $a_{21} = 4$, $a_{22} = 5$, and $a_{23} = 6$.
A matrix having only one column is called a column vector; and a matrix with only one
row is called a row vector. Whenever a vector is used, it should be understood from
the context whether it is a row vector or a column vector. Also, all the vectors
will be represented by bold letters.
Definition 1.1.2 (Equality of two Matrices) Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ having the same order $m \times n$ are equal if $a_{ij} = b_{ij}$ for each $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.
In other words, two matrices are said to be equal if they have the same order and their corresponding entries are equal.
Example 1.1.3 The linear system of equations $2x + 3y = 5$ and $3x + 2y = 5$ can be identified with the matrix $\begin{bmatrix} 2 & 3 & : & 5 \\ 3 & 2 & : & 5 \end{bmatrix}$. Note that $x$ and $y$ are indeterminate and we can think of $x$ being associated with the first column and $y$ being associated with the second column.
1.1.1 Special Matrices
Definition 1.1.4
1. A matrix in which each entry is zero is called a zero-matrix, denoted by $\mathbf{0}$. For example,
\[ \mathbf{0}_{2 \times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad \mathbf{0}_{2 \times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
2. A matrix that has the same number of rows as columns is called a square matrix. A square matrix is said to have order $n$ if it is an $n \times n$ matrix.
3. The entries $a_{11}, a_{22}, \ldots, a_{nn}$ of an $n \times n$ square matrix $A = [a_{ij}]$ are called the diagonal entries (the principal diagonal) of $A$.
4. A square matrix $A = [a_{ij}]$ is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$. In other words, the non-zero entries appear only on the principal diagonal. For example, the zero matrix $\mathbf{0}_n$ and $\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$ are a few diagonal matrices.
A diagonal matrix $D$ of order $n$ with the diagonal entries $d_1, d_2, \ldots, d_n$ is denoted by $D = \mathrm{diag}(d_1, \ldots, d_n)$. If $d_i = d$ for all $i = 1, 2, \ldots, n$ then the diagonal matrix $D$ is called a scalar matrix.
5. A scalar matrix $A$ of order $n$ is called an identity matrix if $d = 1$. This matrix is denoted by $I_n$.
For example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. The subscript $n$ is suppressed in case the order is clear from the context or if no confusion arises.
6. A square matrix $A = [a_{ij}]$ is said to be an upper triangular matrix if $a_{ij} = 0$ for $i > j$.
A square matrix $A = [a_{ij}]$ is said to be a lower triangular matrix if $a_{ij} = 0$ for $i < j$.
A square matrix $A$ is said to be triangular if it is an upper or a lower triangular matrix.
For example, $\begin{bmatrix} 0 & 1 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{bmatrix}$ is upper triangular, and $\begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ is lower triangular.
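Exercise 1.1.5 Are the following matrices upper triangular, lower triangular or both?
1. $\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$
2. The square matrices $\mathbf{0}$ and $I$ of order $n$.
3. The matrix $\mathrm{diag}(1, 1, 0, 1)$.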
1.2 Operations on Matrices
Definition 1.2.1 (Transpose of a Matrix) The transpose of an $m \times n$ matrix $A = [a_{ij}]$ is defined as the $n \times m$ matrix $B = [b_{ij}]$, with $b_{ij} = a_{ji}$ for $1 \le i \le n$ and $1 \le j \le m$. The transpose of $A$ is denoted by $A^t$.
That is, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ then $A^t = \begin{bmatrix} 1 & 0 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$. Thus, the transpose of a row vector is a column vector and vice-versa.
Theorem 1.2.2 For any matrix $A$, $(A^t)^t = A$.
Proof. Let $A = [a_{ij}]$, $A^t = [b_{ij}]$ and $(A^t)^t = [c_{ij}]$. Then, the definition of transpose gives
\[ c_{ij} = b_{ji} = a_{ij} \quad \text{for all } i, j \]
and the result follows.
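The definition of the transpose and Theorem 1.2.2 can be checked on the small example above. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper name `transpose` is illustrative and not part of the notes.

```python
def transpose(A):
    """Return the transpose of A, where A is a list of equal-length rows."""
    # zip(*A) groups the j-th entries of every row, i.e. the columns of A.
    return [list(column) for column in zip(*A)]


A = [[1, 4, 5],
     [0, 1, 2]]          # the 2 x 3 example matrix from the text

At = transpose(A)        # a 3 x 2 matrix
print(At)
print(transpose(At) == A)   # Theorem 1.2.2: transposing twice gives A back
```

A numerical check of this kind only confirms the statement for one matrix; the proof above is what establishes it for every matrix.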
Definition 1.2.3 (Addition of Matrices) Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two $m \times n$ matrices. Then the sum $A + B$ is defined to be the matrix $C = [c_{ij}]$ with $c_{ij} = a_{ij} + b_{ij}$.
Note that the sum of two matrices is defined only when the two matrices have the same order.
Definition 1.2.4 (Multiplying a Scalar to a Matrix) Let $A = [a_{ij}]$ be an $m \times n$ matrix. Then for any element $k \in \mathbb{R}$, we define $kA = [k a_{ij}]$.
For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ and $k = 5$, then $5A = \begin{bmatrix} 5 & 20 & 25 \\ 0 & 5 & 10 \end{bmatrix}$.
Theorem 1.2.5 Let $A$, $B$ and $C$ be matrices of order $m \times n$, and let $k, \ell \in \mathbb{R}$. Then
1. $A + B = B + A$ (commutativity).
2. $(A + B) + C = A + (B + C)$ (associativity).
3. $k(\ell A) = (k\ell)A$.
4. $(k + \ell)A = kA + \ell A$.
Proof. Part 1.
Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Then
\[ A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = [b_{ij}] + [a_{ij}] = B + A \]
as real numbers commute.
The reader is required to prove the other parts, as all the results follow from the properties of real numbers.
Definition 1.2.6 (Additive Inverse) Let $A$ be an $m \times n$ matrix.
1. Then there exists a matrix $B$ with $A + B = \mathbf{0}$. This matrix $B$ is called the additive inverse of $A$, and is denoted by $-A = (-1)A$.
2. Also, for the matrix $\mathbf{0}_{m \times n}$, $A + \mathbf{0} = \mathbf{0} + A = A$. Hence, the matrix $\mathbf{0}_{m \times n}$ is called the additive identity.
Exercise 1.2.7
1. Find a $3 \times 3$ non-zero matrix $A$ satisfying $A = A^t$.
2. Find a $3 \times 3$ non-zero matrix $A$ such that $A^t = -A$.
3. Find the $3 \times 3$ matrix $A = [a_{ij}]$ satisfying $a_{ij} = 1$ if $i \neq j$ and $2$ otherwise.
4. Find the $3 \times 3$ matrix $A = [a_{ij}]$ satisfying $a_{ij} = 1$ if $|i - j| \le 1$ and $0$ otherwise.
5. Find the $4 \times 4$ matrix $A = [a_{ij}]$ satisfying $a_{ij} = i + j$.
6. Find the $4 \times 4$ matrix $A = [a_{ij}]$ satisfying $a_{ij} = 2^{i+j}$.
7. Suppose $A + B = A$. Then show that $B = \mathbf{0}$.
8. Suppose $A + B = \mathbf{0}$. Then show that $B = (-1)A = [-a_{ij}]$.
9. Let $A = \begin{bmatrix} 1 & 1 \\ 2 & 3 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 3 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. Compute $A + B^t$ and $B + A^t$.
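1.2.1 Multiplication of Matrices
Definition 1.2.8 (Matrix Multiplication / Product) Let $A = [a_{ij}]$ be an $m \times n$ matrix and $B = [b_{ij}]$ be an $n \times r$ matrix. The product $AB$ is a matrix $C = [c_{ij}]$ of order $m \times r$, with
\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}. \]
That is, if
\[
A_{m \times n} = \begin{bmatrix} \cdots & \cdots & \cdots & \cdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix}
\quad \text{and} \quad
B_{n \times r} = \begin{bmatrix} \vdots & b_{1j} & \vdots \\ \vdots & b_{2j} & \vdots \\ \vdots & \vdots & \vdots \\ \vdots & b_{nj} & \vdots \end{bmatrix}
\]
then
\[ AB = [(AB)_{ij}]_{m \times r} \quad \text{and} \quad (AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}. \]
Observe that the product $AB$ is defined if and only if
the number of columns of $A$ = the number of rows of $B$.
For example, if $A = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$ and $B = \begin{bmatrix} \alpha & \beta & \gamma & \delta \\ x & y & z & t \\ u & v & w & s \end{bmatrix}$ then
\[
AB = \begin{bmatrix} a\alpha + bx + cu & a\beta + by + cv & a\gamma + bz + cw & a\delta + bt + cs \\ d\alpha + ex + fu & d\beta + ey + fv & d\gamma + ez + fw & d\delta + et + fs \end{bmatrix}. \tag{1.2.1}
\]
Observe that in Equation (1.2.1), the first row of $AB$ can be re-written as
\[ a \cdot \begin{bmatrix} \alpha & \beta & \gamma & \delta \end{bmatrix} + b \cdot \begin{bmatrix} x & y & z & t \end{bmatrix} + c \cdot \begin{bmatrix} u & v & w & s \end{bmatrix}. \]
That is, if $\mathrm{Row}_i(B)$ denotes the $i$-th row of $B$ for $1 \le i \le 3$, then the matrix product $AB$ can be re-written as
\[
AB = \begin{bmatrix} a \cdot \mathrm{Row}_1(B) + b \cdot \mathrm{Row}_2(B) + c \cdot \mathrm{Row}_3(B) \\ d \cdot \mathrm{Row}_1(B) + e \cdot \mathrm{Row}_2(B) + f \cdot \mathrm{Row}_3(B) \end{bmatrix}. \tag{1.2.2}
\]
Similarly, observe that if $\mathrm{Col}_j(A)$ denotes the $j$-th column of $A$ for $1 \le j \le 3$, then the matrix product $AB$ can be re-written as
\[
\begin{aligned}
AB = \big[\, & \mathrm{Col}_1(A) \cdot \alpha + \mathrm{Col}_2(A) \cdot x + \mathrm{Col}_3(A) \cdot u, \\
& \mathrm{Col}_1(A) \cdot \beta + \mathrm{Col}_2(A) \cdot y + \mathrm{Col}_3(A) \cdot v, \\
& \mathrm{Col}_1(A) \cdot \gamma + \mathrm{Col}_2(A) \cdot z + \mathrm{Col}_3(A) \cdot w, \\
& \mathrm{Col}_1(A) \cdot \delta + \mathrm{Col}_2(A) \cdot t + \mathrm{Col}_3(A) \cdot s \,\big].
\end{aligned} \tag{1.2.3}
\]
Remark 1.2.9 Observe the following:
1. In this example, while $AB$ is defined, the product $BA$ is not defined.
However, for square matrices $A$ and $B$ of the same order, both the products $AB$ and $BA$ are defined.
2. The product $AB$ corresponds to operating on the rows of the matrix $B$ (see Equation (1.2.2)). This is the row method for calculating the matrix product.
3. The product $AB$ also corresponds to operating on the columns of the matrix $A$ (see Equation (1.2.3)). This is the column method for calculating the matrix product.
4. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices. Suppose $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are the rows of $A$ and $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_p$ are the columns of $B$. If the product $AB$ is defined, then check that
\[
AB = [A\mathbf{b}_1, A\mathbf{b}_2, \ldots, A\mathbf{b}_p] = \begin{bmatrix} \mathbf{a}_1 B \\ \mathbf{a}_2 B \\ \vdots \\ \mathbf{a}_n B \end{bmatrix}.
\]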
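The entrywise definition, the row method and the column method all describe the same product, and this can be checked on a small case. The sketch below is a minimal plain-Python illustration, assuming matrices are stored as lists of rows; the function names and the numeric matrices are illustrative choices, not taken from the notes.

```python
def matmul(A, B):
    """Entrywise definition: (AB)[i][j] = sum over k of A[i][k] * B[k][j]."""
    n = len(B)  # number of rows of B, which must equal the columns of A
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def row_of_product(A, B, i):
    """Row method: row i of AB = sum over k of A[i][k] * Row_k(B)."""
    row = [0] * len(B[0])
    for k, weight in enumerate(A[i]):
        row = [acc + weight * entry for acc, entry in zip(row, B[k])]
    return row


def col_of_product(A, B, j):
    """Column method: column j of AB = sum over k of Col_k(A) * B[k][j]."""
    col = [0] * len(A)
    for k in range(len(B)):
        col = [acc + A[i][k] * B[k][j] for i, acc in enumerate(col)]
    return col


A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3 (illustrative values)
B = [[1, 0, 2, 1],
     [0, 1, 1, 0],
     [2, 1, 0, 3]]       # 3 x 4 (illustrative values)

C = matmul(A, B)
rows_agree = all(row_of_product(A, B, i) == C[i] for i in range(len(A)))
cols_agree = all(col_of_product(A, B, j) == [r[j] for r in C]
                 for j in range(len(B[0])))
print(C, rows_agree, cols_agree)
```

With these shapes `matmul(B, A)` raises `ValueError`, which mirrors the first observation of the remark: $AB$ is defined here while $BA$ is not.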
Example 1.2.10 Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 0 & 1 \\ 0 & -1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 1 \\ 0 & -1 & 1 \end{bmatrix}$. Use the row/column method of matrix multiplication to
1. find the second row of the matrix $AB$.
Solution: Observe that the second row of $AB$ is obtained by multiplying the second row of $A$ with $B$. Hence, the second row of $AB$ is
\[ 1 \cdot [1, 0, -1] + 0 \cdot [0, 0, 1] + 1 \cdot [0, -1, 1] = [1, -1, 0]. \]
2. find the third column of the matrix $AB$.
Solution: Observe that the third column of $AB$ is obtained by multiplying $A$ with the third column of $B$. Hence, the third column of $AB$ is
\[ -1 \cdot \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + 1 \cdot \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} + 1 \cdot \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
Definition 1.2.11 (Commutativity of Matrix Product) Two square matrices $A$ and $B$ are said to commute if $AB = BA$.
Remark 1.2.12 Note that if $A$ is a square matrix of order $n$ and if $B$ is a scalar matrix of order $n$ then $AB = BA$. In general, the matrix product is not commutative. For example, consider $A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$. Then check that the matrix product
\[ AB = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \neq \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = BA. \]
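The two claims of the remark can be verified directly. The following is a small sketch in plain Python, using the matrices of the remark for the non-commuting pair; the scalar matrix `S` and the matrix `M` are illustrative values that do not appear in the notes.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# The pair from the remark: AB and BA are both defined but differ.
A = [[1, 1],
     [0, 0]]
B = [[1, 0],
     [1, 0]]
AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)

# A scalar matrix commutes with every square matrix of the same order.
S = [[3, 0],
     [0, 3]]             # 3 * I_2 (illustrative)
M = [[1, 2],
     [3, 4]]             # arbitrary 2 x 2 matrix (illustrative)
print(matmul(S, M) == matmul(M, S))
```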
Theorem 1.2.13 Suppose that the matrices $A$, $B$ and $C$ are so chosen that the matrix multiplications are defined.
1. Then $(AB)C = A(BC)$. That is, the matrix multiplication is associative.
2. For any $k \in \mathbb{R}$, $(kA)B = k(AB) = A(kB)$.
3. Then $A(B + C) = AB + AC$. That is, multiplication distributes over addition.
4. If $A$ is an $n \times n$ matrix then $AI_n = I_n A = A$.
5. For any square matrix $A$ of order $n$ and $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$, we have
 the first row of $DA$ is $d_1$ times the first row of $A$;
 for $1 \le i \le n$, the $i$-th row of $DA$ is $d_i$ times the $i$-th row of $A$.
A similar statement holds for the columns of $A$ when $A$ is multiplied on the right by $D$.
Proof. Part 1. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times p}$ and $C = [c_{ij}]_{p \times q}$. Then
\[ (BC)_{kj} = \sum_{\ell=1}^{p} b_{k\ell} c_{\ell j} \quad \text{and} \quad (AB)_{i\ell} = \sum_{k=1}^{n} a_{ik} b_{k\ell}. \]
Therefore,
\[
\begin{aligned}
\big(A(BC)\big)_{ij} &= \sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} a_{ik} \Big( \sum_{\ell=1}^{p} b_{k\ell} c_{\ell j} \Big) \\
&= \sum_{k=1}^{n} \sum_{\ell=1}^{p} a_{ik} \big( b_{k\ell} c_{\ell j} \big) = \sum_{k=1}^{n} \sum_{\ell=1}^{p} \big( a_{ik} b_{k\ell} \big) c_{\ell j} \\
&= \sum_{\ell=1}^{p} \Big( \sum_{k=1}^{n} a_{ik} b_{k\ell} \Big) c_{\ell j} = \sum_{\ell=1}^{p} (AB)_{i\ell}\, c_{\ell j} \\
&= \big((AB)C\big)_{ij}.
\end{aligned}
\]
Part 5. For all $j = 1, 2, \ldots, n$, we have
\[ (DA)_{ij} = \sum_{k=1}^{n} d_{ik} a_{kj} = d_i a_{ij} \]
as $d_{ik} = 0$ whenever $i \neq k$. Hence, the required result follows.
The reader is required to prove the other parts.
Exercise 1.2.14
1. Find a $2 \times 2$ non-zero matrix $A$ satisfying $A^2 = \mathbf{0}$.
2. Find a $2 \times 2$ non-zero matrix $A$ satisfying $A^2 = A$ and $A \neq I_2$.
3. Find $2 \times 2$ non-zero matrices $A$, $B$ and $C$ satisfying $AB = AC$ but $B \neq C$. That is, the cancellation law does not hold.
4. Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$. Compute $A + 3A^2 - A^3$ and $aA^3 + bA + cA^2$.
5. Let $A$ and $B$ be two matrices. If the matrix addition $A + B$ is defined, then prove that $(A + B)^t = A^t + B^t$. Also, if the matrix product $AB$ is defined then prove that $(AB)^t = B^t A^t$.
6. Let $A = [a_1, a_2, \ldots, a_n]$ and $B^t = [b_1, b_2, \ldots, b_n]$. Then check that the order of $AB$ is $1 \times 1$, whereas $BA$ has order $n \times n$. Determine the matrix products $AB$ and $BA$.
7. Let $A$ and $B$ be two matrices such that the matrix product $AB$ is defined.
(a) If the first row of $A$ consists entirely of zeros, prove that the first row of $AB$ also consists entirely of zeros.
(b) If the first column of $B$ consists entirely of zeros, prove that the first column of $AB$ also consists entirely of zeros.
(c) If $A$ has two identical rows then the corresponding rows of $AB$ are also identical.
(d) If $B$ has two identical columns then the corresponding columns of $AB$ are also identical.
8. Let $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}$. Use the row/column method of matrix multiplication to compute the
(a) first row of the matrix $AB$.
(b) third row of the matrix $AB$.
(c) first column of the matrix $AB$.
(d) second column of the matrix $AB$.
(e) first column of $B^t A^t$.
(f) third column of $B^t A^t$.
(g) first row of $B^t A^t$.
(h) second row of $B^t A^t$.
9. Let $A$ and $B$ be the matrices given in Exercise 1.2.14.8. Compute $A - A^t$, $(3AB)^t - 4B^t A$ and $3A - 2A^t$.
10. Let $n$ be a positive integer. Compute $A^n$ for the following matrices:
\[ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}. \]
Can you guess a formula for $A^n$ and prove it by induction?
11. Construct the matrices $A$ and $B$ satisfying the following statements.
(a) The matrix product $AB$ is defined but $BA$ is not defined.
(b) The matrix products $AB$ and $BA$ are defined but they have different orders.
(c) The matrix products $AB$ and $BA$ are defined and they have the same order but $AB \neq BA$.
12. Let $A$ be a $3 \times 3$ matrix satisfying $A \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} a + b \\ b - c \\ 0 \end{bmatrix}$. Determine the matrix $A$.
13. Let $A$ be a $2 \times 2$ matrix satisfying $A \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a \cdot b \\ a \end{bmatrix}$. Can you construct the matrix $A$ satisfying the above? Why!
1.2.2 Inverse of a Matrix
Definition 1.2.15 (Inverse of a Matrix) Let $A$ be a square matrix of order $n$.
1. A square matrix $B$ is said to be a left inverse of $A$ if $BA = I_n$.
2. A square matrix $C$ is called a right inverse of $A$, if $AC = I_n$.
3. A matrix $A$ is said to be invertible (or is said to have an inverse) if there exists a matrix $B$ such that $AB = BA = I_n$.
Lemma 1.2.16 Let $A$ be an $n \times n$ matrix. Suppose that there exist $n \times n$ matrices $B$ and $C$ such that $AB = I_n$ and $CA = I_n$, then $B = C$.
Proof. Note that
\[ C = CI_n = C(AB) = (CA)B = I_n B = B. \]
Remark 1.2.17
1. From the above lemma, we observe that if a matrix $A$ is invertible, then the inverse is unique.
2. As the inverse of a matrix $A$ is unique, we denote it by $A^{-1}$. That is, $AA^{-1} = A^{-1}A = I$.
Example 1.2.18
1. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.
(a) If $ad - bc \neq 0$, then verify that $A^{-1} = \dfrac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.
(b) If $ad - bc = 0$ then prove that either $[a\ b] = \alpha [c\ d]$ for some $\alpha \in \mathbb{R}$ or $[a\ c] = \alpha [b\ d]$ for some $\alpha \in \mathbb{R}$. Hence, prove that $A$ is not invertible.
(c) In particular, the inverse of $\begin{bmatrix} 2 & 3 \\ 4 & 7 \end{bmatrix}$ equals $\dfrac{1}{2} \begin{bmatrix} 7 & -3 \\ -4 & 2 \end{bmatrix}$. Also, the matrices $\begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 \\ 4 & 0 \end{bmatrix}$ and $\begin{bmatrix} 4 & 2 \\ 6 & 3 \end{bmatrix}$ do not have inverses.
2. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 6 \end{bmatrix}$. Then $A^{-1} = \begin{bmatrix} -2 & 0 & 1 \\ 0 & 3 & -2 \\ 1 & -2 & 1 \end{bmatrix}$.
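The $2 \times 2$ inverse formula of Example 1.2.18 is easy to test with exact rational arithmetic. The sketch below is a minimal plain-Python illustration, assuming matrices are stored as lists of rows; `inverse_2x2` and `matmul` are illustrative helper names, and returning `None` for a non-invertible matrix is a design choice of this sketch.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] via (1/(ad - bc)) * [[d, -b], [-c, a]].

    Returns None when ad - bc = 0, i.e. when no inverse exists.
    """
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[2, 3],
     [4, 7]]
A_inv = inverse_2x2(A)
identity = [[1, 0], [0, 1]]
print(A_inv)
print(matmul(A, A_inv) == identity, matmul(A_inv, A) == identity)

# The three singular matrices listed in the example have no inverse.
singular = [[[1, 2], [0, 0]], [[1, 0], [4, 0]], [[4, 2], [6, 3]]]
print([inverse_2x2(M) for M in singular])
```

Using `Fraction` keeps the entries exact, so the comparison with the identity matrix is not affected by floating-point rounding.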
Theorem 1.2.19 Let $A$ and $B$ be two matrices with inverses $A^{-1}$ and $B^{-1}$, respectively. Then
1. $(A^{-1})^{-1} = A$.
2. $(AB)^{-1} = B^{-1}A^{-1}$.
3. $(A^t)^{-1} = (A^{-1})^t$.
Proof. Proof of Part 1.
By definition $AA^{-1} = A^{-1}A = I$. Hence, if we denote $A^{-1}$ by $B$, then we get $AB = BA = I$. Thus, the definition implies $B^{-1} = A$, or equivalently $(A^{-1})^{-1} = A$.
Proof of Part 2.
Verify that $(AB)(B^{-1}A^{-1}) = I = (B^{-1}A^{-1})(AB)$.
Proof of Part 3.
We know $AA^{-1} = A^{-1}A = I$. Taking transpose, we get
\[ (AA^{-1})^t = (A^{-1}A)^t = I^t \iff (A^{-1})^t A^t = A^t (A^{-1})^t = I. \]
Hence, by definition $(A^t)^{-1} = (A^{-1})^t$.
We will again come back to the study of invertible matrices in Sections 2.2 and 2.5.
Exercise 1.2.20
1. Let $A$ be an invertible matrix and let $r$ be a positive integer. Prove that $(A^{-1})^r = A^{-r}$.
2. Find the inverse of $\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$ and $\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$.
3. Let $A_1, A_2, \ldots, A_r$ be invertible matrices. Prove that the product $A_1 A_2 \cdots A_r$ is also an invertible matrix.
4. Let $\mathbf{x}^t = [1, 2, 3]$ and $\mathbf{y}^t = [2, 1, 4]$. Prove that $\mathbf{x}\mathbf{y}^t$ is not invertible even though $\mathbf{x}^t \mathbf{y}$ is invertible.
5. Let $A$ be an $n \times n$ invertible matrix. Then prove that
(a) $A$ cannot have a row or column consisting entirely of zeros.
(b) any two rows of $A$ cannot be equal.
(c) any two columns of $A$ cannot be equal.
(d) the third row of $A$ cannot be equal to the sum of the first two rows, whenever $n \ge 3$.
(e) the third column of $A$ cannot be equal to the first column minus the second column, whenever $n \ge 3$.
6. Suppose $A$ is a $2 \times 2$ matrix satisfying $(I + 3A)^{-1} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Determine the matrix $A$.
7. Let $A$ be a $3 \times 3$ matrix such that $(I - A)^{-1} = \begin{bmatrix} -2 & 0 & 1 \\ 0 & 3 & -2 \\ 1 & -2 & 1 \end{bmatrix}$. Determine the matrix $A$ [Hint: See Example 1.2.18.2 and Theorem 1.2.19.1].
8. Let $A$ be a square matrix satisfying $A^3 + A - 2I = \mathbf{0}$. Prove that $A^{-1} = \frac{1}{2}\left(A^2 + I\right)$.
9. Let $A = [a_{ij}]$ be an invertible matrix and let $p$ be a nonzero real number. Then determine the inverse of the matrix $B = [p^{i-j} a_{ij}]$.
1.3 Some More Special Matrices
Definition 1.3.1
1. A matrix $A$ over $\mathbb{R}$ is called symmetric if $A^t = A$ and skew-symmetric if $A^t = -A$.
2. A matrix $A$ is said to be orthogonal if $AA^t = A^t A = I$.
Example 1.3.2
1. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 3 & 1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix}$. Then $A$ is a symmetric matrix and $B$ is a skew-symmetric matrix.
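2. Let $A = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} \end{bmatrix}$. Then $A$ is an orthogonal matrix.
3. Let $A = [a_{ij}]$ be an $n \times n$ matrix with $a_{ij}$ equal to $1$ if $i - j = 1$ and $0$, otherwise. Then $A^n = \mathbf{0}$ and $A^{\ell} \neq \mathbf{0}$ for $1 \le \ell \le n - 1$. The matrices $A$ for which a positive integer $k$ exists such that $A^k = \mathbf{0}$ are called nilpotent matrices. The least positive integer $k$ for which $A^k = \mathbf{0}$ is called the order of nilpotency.
4. Let $A = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}$. Then $A^2 = A$. The matrices that satisfy the condition that $A^2 = A$ are called idempotent matrices.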
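The nilpotent and idempotent examples above can be checked by direct computation. The sketch below is a minimal plain-Python illustration, assuming matrices are stored as lists of rows; the names `shift_matrix` and `nilpotency_order` and the choice $n = 4$ are illustrative and not part of the notes.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def shift_matrix(n):
    """The n x n matrix with a_ij = 1 if i - j = 1 and 0 otherwise."""
    return [[1 if i - j == 1 else 0 for j in range(n)] for i in range(n)]


def nilpotency_order(A):
    """Least k >= 1 with A^k = 0, or None if no power up to n vanishes."""
    n = len(A)
    zero = [[0] * n for _ in range(n)]
    power = A                       # holds A^k at the start of step k
    for k in range(1, n + 1):
        if power == zero:
            return k
        power = matmul(power, A)
    return None


N = shift_matrix(4)
print(nilpotency_order(N))          # order of nilpotency of the 4 x 4 example

half = Fraction(1, 2)
P = [[half, half],
     [half, half]]
print(matmul(P, P) == P)            # idempotent: P^2 = P
```

The loop stops at the $n$-th power because a nilpotent $n \times n$ matrix always satisfies $A^n = \mathbf{0}$; this fact is used here without proof.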
Exercise 1.3.3
1. Let $A$ be a real square matrix. Then $S = \frac{1}{2}(A + A^t)$ is symmetric, $T = \frac{1}{2}(A - A^t)$ is skew-symmetric, and $A = S + T$.
2. Show that the product of two lower triangular matrices is a lower triangular matrix. A similar statement holds for upper triangular matrices.
3. Let $A$ and $B$ be symmetric matrices. Show that $AB$ is symmetric if and only if $AB = BA$.
4. Show that the diagonal entries of a skew-symmetric matrix are zero.
5. Let $A$, $B$ be skew-symmetric matrices with $AB = BA$. Is the matrix $AB$ symmetric or skew-symmetric?
6. Let $A$ be a symmetric matrix of order $n$ with $A^2 = \mathbf{0}$. Is it necessarily true that $A = \mathbf{0}$?
7. Let $A$ be a nilpotent matrix. Prove that there exists a matrix $B$ such that $B(I + A) = I = (I + A)B$ [Hint: If $A^k = \mathbf{0}$ then look at $I - A + A^2 - \cdots + (-1)^{k-1} A^{k-1}$].
1.3.1 Submatrix of a Matrix
Definition 1.3.4 A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a submatrix of the given matrix.
For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$, a few submatrices of $A$ are
\[ [1], \quad [2], \quad \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad [1\ 5], \quad \begin{bmatrix} 1 & 5 \\ 0 & 2 \end{bmatrix}, \quad A. \]
But the matrices $\begin{bmatrix} 1 & 4 \\ 1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 4 \\ 0 & 2 \end{bmatrix}$ are not submatrices of $A$. (The reader is advised to give reasons.)
Let $A$ be an $n \times m$ matrix and $B$ be an $m \times p$ matrix. Suppose $r < m$. Then, we can decompose the matrices $A$ and $B$ as $A = [P\ Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$, where $P$ has order $n \times r$ and $H$ has order $r \times p$. That is, the matrices $P$ and $Q$ are submatrices of $A$: $P$ consists of the first $r$ columns of $A$ and $Q$ consists of the last $m - r$ columns of $A$. Similarly, $H$ and $K$ are submatrices of $B$: $H$ consists of the first $r$ rows of $B$ and $K$ consists of the last $m - r$ rows of $B$. We now prove the following important theorem.
Theorem 1.3.5 Let $A = [a_{ij}] = [P\ Q]$ and $B = [b_{ij}] = \begin{bmatrix} H \\ K \end{bmatrix}$ be defined as above. Then
\[ AB = PH + QK. \]
Proof. First note that the matrices $PH$ and $QK$ are each of order $n \times p$. The matrix products $PH$ and $QK$ are valid as the orders of the matrices $P$, $H$, $Q$ and $K$ are, respectively, $n \times r$, $r \times p$, $n \times (m - r)$ and $(m - r) \times p$. Let $P = [P_{ij}]$, $Q = [Q_{ij}]$, $H = [H_{ij}]$, and $K = [K_{ij}]$. Then, for $1 \le i \le n$ and $1 \le j \le p$, we have
\[
\begin{aligned}
(AB)_{ij} &= \sum_{k=1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} a_{ik} b_{kj} + \sum_{k=r+1}^{m} a_{ik} b_{kj} \\
&= \sum_{k=1}^{r} P_{ik} H_{kj} + \sum_{k=r+1}^{m} Q_{i,\,k-r} K_{k-r,\,j} \\
&= (PH)_{ij} + (QK)_{ij} = (PH + QK)_{ij}.
\end{aligned}
\]
Here the index shift $k - r$ reflects that the $k$-th column of $A$, for $k > r$, is the $(k - r)$-th column of $Q$, and likewise for the rows of $K$.
Remark 1.3.6 Theorem 1.3.5 is very useful due to the following reasons:
1. The orders of the matrices $P$, $Q$, $H$ and $K$ are smaller than that of $A$ or $B$.
2. It may be possible to block the matrix in such a way that a few blocks are either identity matrices or zero matrices. In this case, it may be easy to handle the matrix product using the block form.
3. Or when we want to prove results using induction, then we may assume the result for $r \times r$ submatrices and then look for $(r + 1) \times (r + 1)$ submatrices, etc.
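For example, if $A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 5 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}$, then
\[
AB = \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} [e\ f] = \begin{bmatrix} a + 2c & b + 2d \\ 2a + 5c & 2b + 5d \end{bmatrix}.
\]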
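The identity $AB = PH + QK$ can be tried out numerically. The sketch below is a minimal plain-Python illustration, assuming matrices are stored as lists of rows; the helper names are illustrative, and the numeric entries of `B` are assumed values standing in for the symbolic entries $a, \ldots, f$ used above.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matadd(X, Y):
    """Entrywise sum of two matrices of the same order."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split_columns(A, r):
    """A = [P Q]: P holds the first r columns of A, Q the remaining ones."""
    return [row[:r] for row in A], [row[r:] for row in A]


def split_rows(B, r):
    """B = [H; K]: H holds the first r rows of B, K the remaining ones."""
    return B[:r], B[r:]


A = [[1, 2, 0],
     [2, 5, 0]]
B = [[1, 2],
     [3, 4],
     [5, 6]]             # assumed numeric stand-ins for a, b, c, d, e, f
r = 2

P, Q = split_columns(A, r)
H, K = split_rows(B, r)
block_product = matadd(matmul(P, H), matmul(Q, K))
print(block_product, block_product == matmul(A, B))
```

Because the last column of `A` is zero, the term `matmul(Q, K)` vanishes and the whole product comes from the $2 \times 2$ blocks, which is the kind of saving the remark describes.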
If $A = \begin{bmatrix} 0 & 1 & 2 \\ 3 & 1 & 4 \\ 2 & 5 & 3 \end{bmatrix}$, then $A$ can be decomposed into blocks in several ways: the same matrix can be partitioned by different choices of horizontal and vertical dividing lines between its rows and columns, and so on.
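Suppose $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$, where the row blocks have $m_1$ and $m_2$ rows and the column blocks have $n_1$ and $n_2$ columns, and $B = \begin{bmatrix} E & F \\ G & H \end{bmatrix}$, where the row blocks have $s_1$ and $s_2$ rows and the column blocks have $r_1$ and $r_2$ columns. Then the matrices $P$, $Q$, $R$, $S$ and $E$, $F$, $G$, $H$ are called the blocks of the matrices $A$ and $B$, respectively.
Even if $A + B$ is defined, the orders of $P$ and $E$ may not be the same and hence, we may not be able to add $A$ and $B$ in the block form. But, if $A + B$ and $P + E$ are defined then $A + B = \begin{bmatrix} P + E & Q + F \\ R + G & S + H \end{bmatrix}$.
Similarly, if the product $AB$ is defined, the product $PE$ need not be defined. Therefore, we can talk of the matrix product $AB$ as a block product of matrices, if both the products $AB$ and $PE$ are defined. And in this case, we have $AB = \begin{bmatrix} PE + QG & PF + QH \\ RE + SG & RF + SH \end{bmatrix}$.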
That is, once a partition of A is fixed, the partition of B has to be properly
chosen for purposes of block addition or multiplication.
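Exercise 1.3.7
1. Complete the proofs of Theorems 1.2.5 and 1.2.13.
2. Let $A = \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix}$ and $C = \begin{bmatrix} 2 & 2 & 2 & 6 \\ 2 & 1 & 2 & 5 \\ 3 & 3 & 4 & 10 \end{bmatrix}$. Compute
(a) the first row of $AC$,
(b) the first row of $B(AC)$,
(c) the second row of $B(AC)$, and
(d) the third row of $B(AC)$.
(e) Let $\mathbf{x}^t = [1, 1, 1, 1]$. Compute the matrix product $C\mathbf{x}$.
3. Let $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$. Determine the $2 \times 2$ matrix
(a) $A$ such that $\mathbf{y} = A\mathbf{x}$ gives rise to counter-clockwise rotation through an angle $\alpha$.
(b) $B$ such that $\mathbf{y} = B\mathbf{x}$ gives rise to the reflection along the line $y = (\tan \gamma)x$.
Now, let $C$ and $D$ be two $2 \times 2$ matrices such that $\mathbf{y} = C\mathbf{x}$ gives rise to counter-clockwise rotation through an angle $\beta$ and $\mathbf{y} = D\mathbf{x}$ gives rise to the reflection along the line $y = (\tan \delta)x$, respectively. Then prove that
(c) $\mathbf{y} = (AC)\mathbf{x}$ or $\mathbf{y} = (CA)\mathbf{x}$ give rise to counter-clockwise rotation through an angle $\alpha + \beta$.
(d) $\mathbf{y} = (BD)\mathbf{x}$ or $\mathbf{y} = (DB)\mathbf{x}$ give rise to rotations. Which angles do they represent?
(e) What can you say about $\mathbf{y} = (AB)\mathbf{x}$ or $\mathbf{y} = (BA)\mathbf{x}$?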
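4. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$, $B = \begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{bmatrix}$ and $C = \begin{bmatrix} \cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{bmatrix}$. If $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ then geometrically interpret the following:
(a) $\mathbf{y} = A\mathbf{x}$, $\mathbf{y} = B\mathbf{x}$ and $\mathbf{y} = C\mathbf{x}$.
(b) $\mathbf{y} = (BC)\mathbf{x}$, $\mathbf{y} = (CB)\mathbf{x}$, $\mathbf{y} = (BA)\mathbf{x}$ and $\mathbf{y} = (AB)\mathbf{x}$.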
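5. Consider the two coordinate transformations
\[
\begin{aligned} x_1 &= a_{11}y_1 + a_{12}y_2 \\ x_2 &= a_{21}y_1 + a_{22}y_2 \end{aligned}
\quad \text{and} \quad
\begin{aligned} y_1 &= b_{11}z_1 + b_{12}z_2 \\ y_2 &= b_{21}z_1 + b_{22}z_2. \end{aligned}
\]
(a) Compose the two transformations to express $x_1, x_2$ in terms of $z_1, z_2$.
(b) If $\mathbf{x}^t = [x_1, x_2]$, $\mathbf{y}^t = [y_1, y_2]$ and $\mathbf{z}^t = [z_1, z_2]$ then find matrices $A$, $B$ and $C$ such that $\mathbf{x} = A\mathbf{y}$, $\mathbf{y} = B\mathbf{z}$ and $\mathbf{x} = C\mathbf{z}$.
(c) Is $C = AB$?
6. Let $A$ be an $n \times n$ matrix. Then the trace of $A$, denoted $\mathrm{tr}(A)$, is defined as
\[ \mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}. \]
(a) Let $A = \begin{bmatrix} 3 & 2 \\ 2 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 3 \\ 5 & 1 \end{bmatrix}$. Compute $\mathrm{tr}(A)$ and $\mathrm{tr}(B)$.
(b) Then for two square matrices $A$ and $B$ of the same order, prove that
i. $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.
ii. $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.
(c) Prove that there do not exist matrices $A$ and $B$ such that $AB - BA = cI_n$ for any $c \neq 0$.
7. Let $A$ and $B$ be two $m \times n$ matrices with real entries. Then prove that
(a) $A\mathbf{x} = \mathbf{0}$ for all $n \times 1$ vectors $\mathbf{x}$ with real entries implies $A = \mathbf{0}$, the zero matrix.
(b) $A\mathbf{x} = B\mathbf{x}$ for all $n \times 1$ vectors $\mathbf{x}$ with real entries implies $A = B$.
8. Let $A$ be an $n \times n$ matrix such that $AB = BA$ for all $n \times n$ matrices $B$. Show that $A = \alpha I$ for some $\alpha \in \mathbb{R}$.
9. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 1 \end{bmatrix}$.
(a) Find a matrix $B$ such that $AB = I_2$.
(b) What can you say about the number of such matrices? Give reasons for your answer.
(c) Does there exist a matrix $C$ such that $CA = I_3$?
10. Let $A = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 2 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$. Compute the matrix product $AB$ using the block matrix multiplication.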
11. Let $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$. If $P$, $Q$, $R$ and $S$ are symmetric, is the matrix $A$ symmetric? If $A$ is symmetric, is it necessary that the matrices $P$, $Q$, $R$ and $S$ are symmetric?
12. Let $A$ be a $3 \times 3$ matrix and let $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & c \end{bmatrix}$, where $A_{11}$ is a $2 \times 2$ invertible matrix and $c$ is a real number.
(a) If $p = c - A_{21}A_{11}^{-1}A_{12}$ is non-zero, prove that $B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & p^{-1} \end{bmatrix}$ is the inverse of $A$, where
$B_{11} = A_{11}^{-1} + A_{11}^{-1}A_{12}\,p^{-1}A_{21}A_{11}^{-1}$, $B_{12} = -A_{11}^{-1}A_{12}\,p^{-1}$ and $B_{21} = -p^{-1}A_{21}A_{11}^{-1}$.
(b) Find the inverse of the matrices $\begin{bmatrix} 0 & 1 & 2 \\ 1 & 1 & 4 \\ 2 & 1 & 1 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1 & 2 \\ 3 & 1 & 4 \\ 2 & 5 & 3 \end{bmatrix}$.
13. Let $\mathbf{x}$ be an $n \times 1$ matrix satisfying $\mathbf{x}^t \mathbf{x} = 1$.
(a) Define $A = I_n - 2\mathbf{x}\mathbf{x}^t$. Prove that $A$ is symmetric and $A^2 = I$. The matrix $A$ is commonly known as the Householder matrix.
(b) Let $\alpha \neq 1$ be a real number and define $A = I_n - \alpha\mathbf{x}\mathbf{x}^t$. Prove that $A$ is symmetric and invertible [Hint: the inverse is also of the form $I_n + \beta\mathbf{x}\mathbf{x}^t$ for some value of $\beta$].
14. Let $A$ be an $n \times n$ invertible matrix and let $\mathbf{x}$ and $\mathbf{y}$ be two $n \times 1$ matrices. Also, let $\beta$ be a real number such that $\alpha = 1 + \beta\,\mathbf{y}^t A^{-1} \mathbf{x} \neq 0$. Then prove the famous Sherman-Morrison formula
\[ (A + \beta\,\mathbf{x}\mathbf{y}^t)^{-1} = A^{-1} - \frac{\beta}{\alpha} A^{-1} \mathbf{x}\mathbf{y}^t A^{-1}. \]
This formula gives the information about the inverse when an invertible matrix is modified by a rank one matrix.
15. Let $J$ be an $n \times n$ matrix having each entry $1$.
(a) Prove that $J^2 = nJ$.
(b) Let $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathbb{R}$. Prove that there exist $\alpha_3, \beta_3 \in \mathbb{R}$ such that
\[ (\alpha_1 I_n + \beta_1 J)\,(\alpha_2 I_n + \beta_2 J) = \alpha_3 I_n + \beta_3 J. \]
(c) Let $\alpha, \beta \in \mathbb{R}$ with $\alpha \neq 0$ and $\alpha + n\beta \neq 0$ and define $A = \alpha I_n + \beta J$. Prove that $A$ is invertible.
16. Let $A$ be an upper triangular matrix. If $A^{*}A = AA^{*}$ then prove that $A$ is a diagonal matrix. The same holds for a lower triangular matrix.
1.4 Summary
In this chapter, we started with the definition of a matrix and came across lots of examples. In particular, the following examples were important:
1. The zero matrix of size $m \times n$, denoted $\mathbf{0}_{m \times n}$ or $\mathbf{0}$.
2. The identity matrix of size $n \times n$, denoted $I_n$ or $I$.
3. Triangular matrices
4. Hermitian/Symmetric matrices
5. Skew-Hermitian/skew-symmetric matrices
6. Unitary/Orthogonal matrices
We also learnt the product of two matrices. Even though it seemed complicated, it basically tells the following:
1. Multiplying a matrix $A$ on the left by a matrix is the same as performing row operations on $A$.
2. Multiplying a matrix $A$ on the right by a matrix is the same as performing column operations on $A$.
Chapter 2
System of Linear Equations
2.1 Introduction
Let us look at some examples of linear systems.
1. Suppose $a, b \in \mathbb{R}$. Consider the system $ax = b$.
(a) If $a \neq 0$ then the system has a unique solution $x = \frac{b}{a}$.
(b) If $a = 0$ and
i. $b \neq 0$ then the system has no solution.
ii. $b = 0$ then the system has an infinite number of solutions, namely all $x \in \mathbb{R}$.
2. Consider a system with 2 equations in 2 unknowns. The equation $ax + by = c$ represents a line in $\mathbb{R}^2$ if either $a \neq 0$ or $b \neq 0$. Thus the solution set of the system
\[ a_1 x + b_1 y = c_1, \quad a_2 x + b_2 y = c_2 \]
is given by the points of intersection of the two lines. The different cases are illustrated by examples (see Figure 1).
(a) Unique Solution
$x + 2y = 1$ and $x + 3y = 1$. The unique solution is $(x, y)^t = (1, 0)^t$.
Observe that in this case, $a_1 b_2 - a_2 b_1 \neq 0$.
(b) Infinite Number of Solutions
$x + 2y = 1$ and $2x + 4y = 2$. The solution set is $(x, y)^t = (1 - 2y, y)^t = (1, 0)^t + y(-2, 1)^t$ with $y$ arbitrary, as both the equations represent the same line. Observe the following:
i. Here, $a_1 b_2 - a_2 b_1 = 0$, $a_1 c_2 - a_2 c_1 = 0$ and $b_1 c_2 - b_2 c_1 = 0$.
ii. The vector $(1, 0)^t$ corresponds to the solution $x = 1, y = 0$ of the given system, whereas the vector $(-2, 1)^t$ corresponds to the solution $x = -2, y = 1$ of the system $x + 2y = 0$, $2x + 4y = 0$.
(c) No Solution
$x + 2y = 1$ and $2x + 4y = 3$. The equations represent a pair of parallel lines and hence there is no point of intersection. Observe that in this case, $a_1 b_2 - a_2 b_1 = 0$ but $a_1 c_2 - a_2 c_1 \neq 0$.
Figure 1: Examples in 2 dimensions. Panels: No Solution (pair of parallel lines $\ell_1$ and $\ell_2$); Infinite Number of Solutions (coincident lines $\ell_1$ and $\ell_2$); Unique Solution (intersecting lines, with P the point of intersection).
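The three cases above can be told apart by the quantities $a_1 b_2 - a_2 b_1$, $a_1 c_2 - a_2 c_1$ and $b_1 c_2 - b_2 c_1$. The following is a minimal Python sketch of that test; the function name `classify_2x2` and its return format are illustrative choices, not part of the notes, and it assumes each equation really is a line (not both coefficients zero).

```python
from fractions import Fraction

def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify a1*x + b1*y = c1, a2*x + b2*y = c2 (hypothetical helper).

    Assumes (a1, b1) != (0, 0) and (a2, b2) != (0, 0), i.e. both equations are lines.
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Intersecting lines: solve explicitly with exact arithmetic.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return ("unique", (x, y))
    if a1 * c2 - a2 * c1 != 0 or b1 * c2 - b2 * c1 != 0:
        # Parallel but distinct lines.
        return ("none", None)
    # Coincident lines.
    return ("infinite", None)

print(classify_2x2(1, 2, 1, 1, 3, 1))  # x + 2y = 1, x + 3y = 1
print(classify_2x2(1, 2, 1, 2, 4, 2))  # x + 2y = 1, 2x + 4y = 2
print(classify_2x2(1, 2, 1, 2, 4, 3))  # x + 2y = 1, 2x + 4y = 3
```

The three calls correspond to cases (a), (b) and (c) of the example.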
3. As a last example, consider 3 equations in 3 unknowns.
A linear equation $ax + by + cz = d$ represents a plane in $\mathbb{R}^3$ provided $(a, b, c) \neq (0, 0, 0)$. Here,
we have to look at the points of intersection of the three given planes.
(a) Unique Solution
Consider the system $x + y + z = 3$, $x + 4y + 2z = 7$ and $4x + 10y - z = 13$. The unique
solution to this system is $(x, y, z)^t = (1, 1, 1)^t$; i.e. the three planes intersect at
a point.
(b) Infinite Number of Solutions
Consider the system $x + y + z = 3$, $x + 2y + 2z = 5$ and $3x + 4y + 4z = 11$. The solution
set is $(x, y, z)^t = (1, 2 - z, z)^t = (1, 2, 0)^t + z(0, -1, 1)^t$, with $z$ arbitrary. Observe the
following:
i. Here, the three planes intersect in a line.
ii. The vector $(1, 2, 0)^t$ corresponds to the solution $x = 1$, $y = 2$ and $z = 0$ of the linear
system $x + y + z = 3$, $x + 2y + 2z = 5$ and $3x + 4y + 4z = 11$. Also, the vector
$(0, -1, 1)^t$ corresponds to the solution $x = 0$, $y = -1$ and $z = 1$ of the linear system
$x + y + z = 0$, $x + 2y + 2z = 0$ and $3x + 4y + 4z = 0$.
(c) No Solution
The system $x + y + z = 3$, $x + 2y + 2z = 5$ and $3x + 4y + 4z = 13$ has no solution. In
this case, we get three parallel lines as pairwise intersections of the above planes, namely
i. a line passing through $(1, 2, 0)$ with direction ratios $(0, -1, 1)$,
ii. a line passing through $(3, 1, 0)$ with direction ratios $(0, -1, 1)$, and
iii. a line passing through $(-1, 4, 0)$ with direction ratios $(0, -1, 1)$.
The readers are advised to supply the proof.
Definition 2.1.1 (Linear System) A system of $m$ linear equations in $n$ unknowns $x_1, x_2, \ldots, x_n$
is a set of equations of the form
\[
\begin{array}{ccc}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &=& b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &=& b_2 \\
\vdots & & \vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &=& b_m
\end{array}
\tag{2.1.1}
\]
where for $1 \le i \le m$ and $1 \le j \le n$, $a_{ij}, b_i \in \mathbb{R}$. Linear System (2.1.1) is called homogeneous if
$b_1 = 0 = b_2 = \cdots = b_m$ and non-homogeneous otherwise.
We rewrite the above equations in the form $Ax = b$, where
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
\]
The matrix $A$ is called the coefficient matrix and the block matrix $[A \;\; b]$ is called the
augmented matrix of the linear system (2.1.1).
Remark 2.1.2 1. The first column of the augmented matrix corresponds to the coefficients of
the variable $x_1$.
2. In general, the $j$-th column of the augmented matrix corresponds to the coefficients of the
variable $x_j$, for $j = 1, 2, \ldots, n$.
3. The $(n + 1)$-th column of the augmented matrix consists of the vector $b$.
4. The $i$-th row of the augmented matrix represents the $i$-th equation for $i = 1, 2, \ldots, m$.
That is, for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, the entry $a_{ij}$ of the coefficient matrix $A$
corresponds to the $i$-th linear equation and the $j$-th variable $x_j$.
Definition 2.1.3 For a system of linear equations $Ax = b$, the system $Ax = 0$ is called the
associated homogeneous system.
Definition 2.1.4 (Solution of a Linear System) A solution of $Ax = b$ is a column vector $y$
with entries $y_1, y_2, \ldots, y_n$ such that the linear system (2.1.1) is satisfied by substituting $y_i$ in place
of $x_i$. The collection of all solutions is called the solution set of the system.
That is, if $y^t = [y_1, y_2, \ldots, y_n]$ is a solution of the linear system $Ax = b$ then $Ay = b$ holds.
For example, from Example 3(a) above, we see that the vector $y^t = [1, 1, 1]$ is a solution of the system
$Ax = b$, where
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 4 & 2 \\ 4 & 10 & -1 \end{bmatrix}, \quad x^t = [x, y, z] \quad \text{and} \quad b^t = [3, 7, 13].
\]
We now state a theorem about the solution set of a homogeneous system. The readers are
advised to supply the proof.
Theorem 2.1.5 Consider the homogeneous linear system $Ax = 0$. Then
1. The zero vector, $0 = (0, \ldots, 0)^t$, is always a solution, called the trivial solution.
2. Suppose $x_1, x_2$ are two solutions of $Ax = 0$. Then $k_1 x_1 + k_2 x_2$ is also a solution of $Ax = 0$
for any $k_1, k_2 \in \mathbb{R}$.
Remark 2.1.6 1. A non-zero solution of $Ax = 0$ is called a non-trivial solution.
2. If $Ax = 0$ has a non-trivial solution, say $y \neq 0$, then $z = cy$ for every $c \in \mathbb{R}$ is also a solution.
Thus, the existence of a non-trivial solution of $Ax = 0$ is equivalent to having an infinite
number of solutions for the system $Ax = 0$.
3. If $u, v$ are two distinct solutions of $Ax = b$ then one has the following:
(a) $u - v$ is a solution of the system $Ax = 0$.
(b) Define $x_h = u - v$. Then $x_h$ is a solution of the homogeneous system $Ax = 0$.
(c) That is, any two solutions of $Ax = b$ differ by a solution of the associated homogeneous
system $Ax = 0$.
(d) Or equivalently, the set of solutions of the system $Ax = b$ is of the form $x_0 + x_h$, where
$x_0$ is a particular solution of $Ax = b$ and $x_h$ is a solution of the associated homogeneous
system $Ax = 0$.
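As a small numerical check of Remark 2.1.6.3, the sketch below uses the system $x + y + z = 3$, $x + 2y + 2z = 5$, $3x + 4y + 4z = 11$ from the introductory examples. The helper `mat_vec` and the choice of the two solutions `u` and `v` are illustrative assumptions; any two solutions would do.

```python
def mat_vec(A, x):
    """Matrix-vector product with plain Python lists (hypothetical helper)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1, 1], [1, 2, 2], [3, 4, 4]]
b = [3, 5, 11]

u = [1, 2, 0]   # one solution of Ax = b
v = [1, 1, 1]   # another solution of Ax = b
x_h = [ui - vi for ui, vi in zip(u, v)]   # their difference u - v

print(mat_vec(A, u), mat_vec(A, v))   # both reproduce b
print(mat_vec(A, x_h))                # the difference solves Ax = 0

# A particular solution plus any multiple of x_h is again a solution.
w = [ui + 5 * hi for ui, hi in zip(u, x_h)]
print(mat_vec(A, w))
```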
2.1.1 A Solution Method
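Example 2.1.7 Solve the linear system $y + z = 2$, $2x + 3z = 5$, $x + y + z = 3$.
Solution: In this case, the augmented matrix is
$\begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$
and the solution method proceeds along the following steps.
1. Interchange 1st and 2nd equation.
\[
\begin{array}{rcl} 2x + 3z &=& 5 \\ y + z &=& 2 \\ x + y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}.
\]
2. Replace 1st equation by 1st equation times $\frac{1}{2}$.
\[
\begin{array}{rcl} x + \frac{3}{2} z &=& \frac{5}{2} \\ y + z &=& 2 \\ x + y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}.
\]
3. Replace 3rd equation by 3rd equation minus the 1st equation.
\[
\begin{array}{rcl} x + \frac{3}{2} z &=& \frac{5}{2} \\ y + z &=& 2 \\ y - \frac{1}{2} z &=& \frac{1}{2} \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 1 & -\frac{1}{2} & \frac{1}{2} \end{bmatrix}.
\]
4. Replace 3rd equation by 3rd equation minus the 2nd equation.
\[
\begin{array}{rcl} x + \frac{3}{2} z &=& \frac{5}{2} \\ y + z &=& 2 \\ -\frac{3}{2} z &=& -\frac{3}{2} \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & -\frac{3}{2} & -\frac{3}{2} \end{bmatrix}.
\]
5. Replace 3rd equation by 3rd equation times $-\frac{2}{3}$.
\[
\begin{array}{rcl} x + \frac{3}{2} z &=& \frac{5}{2} \\ y + z &=& 2 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]
The last equation gives $z = 1$. Using this, the second equation gives $y = 1$. Finally, the first
equation gives $x = 1$. Hence the solution set is $\{(x, y, z)^t : (x, y, z) = (1, 1, 1)\}$, a unique solution.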
In Example 2.1.7, observe that certain operations on equations (rows of the augmented matrix)
helped us in getting a system in Item 5, which was easily solvable. We use this idea to define
elementary row operations and equivalence of two linear systems.
Definition 2.1.8 (Elementary Row Operations) Let $A$ be an $m \times n$ matrix. Then the elementary
row operations are defined as follows:
1. $R_{ij}$: Interchange of the $i$-th and the $j$-th row of $A$.
2. For $c \neq 0$, $R_k(c)$: Multiply the $k$-th row of $A$ by $c$.
3. For $c \neq 0$, $R_{ij}(c)$: Replace the $j$-th row of $A$ by the $j$-th row of $A$ plus $c$ times the $i$-th row of $A$.
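The three elementary row operations can be sketched in Python as follows, using exact fractions. The function names `R_swap`, `R_scale` and `R_add` are illustrative stand-ins for $R_{ij}$, $R_k(c)$ and $R_{ij}(c)$; row indices are 1-based to match the notation of the notes. The script then replays the five steps of Example 2.1.7.

```python
from fractions import Fraction

def R_swap(M, i, j):
    """R_ij: interchange rows i and j (1-based indices)."""
    N = [row[:] for row in M]
    N[i - 1], N[j - 1] = N[j - 1], N[i - 1]
    return N

def R_scale(M, k, c):
    """R_k(c): multiply row k by a non-zero scalar c."""
    if c == 0:
        raise ValueError("c must be non-zero")
    N = [row[:] for row in M]
    N[k - 1] = [c * entry for entry in N[k - 1]]
    return N

def R_add(M, i, j, c):
    """R_ij(c): replace row j by row j plus c times row i."""
    N = [row[:] for row in M]
    N[j - 1] = [ej + c * ei for ei, ej in zip(N[i - 1], N[j - 1])]
    return N

# Augmented matrix of y + z = 2, 2x + 3z = 5, x + y + z = 3.
M = [[Fraction(v) for v in row]
     for row in [[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]]

M = R_swap(M, 1, 2)                   # step 1
M = R_scale(M, 1, Fraction(1, 2))     # step 2
M = R_add(M, 1, 3, -1)                # step 3: row 3 minus row 1
M = R_add(M, 2, 3, -1)                # step 4: row 3 minus row 2
M = R_scale(M, 3, Fraction(-2, 3))    # step 5

for row in M:
    print([str(v) for v in row])
```

Each function returns a new matrix rather than modifying its argument, so intermediate steps can be kept and compared with the matrices shown in the example.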
Definition 2.1.9 (Equivalent Linear Systems) Let $[A \;\; b]$ and $[C \;\; d]$ be augmented matrices of
two linear systems. Then the two linear systems are said to be equivalent if $[C \;\; d]$ can be obtained
from $[A \;\; b]$ by application of a finite number of elementary row operations.
Definition 2.1.10 (Row Equivalent Matrices) Two matrices are said to be row-equivalent if
one can be obtained from the other by a finite number of elementary row operations.
Thus, note that linear systems at each step in Example 2.1.7 are equivalent to each other. We
also prove the following result that relates elementary row operations with the solution set of a
linear system.
Lemma 2.1.11 Let Cx = d be the linear system obtained from Ax = b by application of a single
elementary row operation. Then Ax = b and Cx = d have the same solution set.
Proof. We prove the result for the elementary row operation $R_{jk}(c)$ with $c \neq 0$. The reader is
advised to prove the result for other elementary operations.
In this case, the systems $Ax = b$ and $Cx = d$ vary only in the $k$-th equation. Let $(\alpha_1, \alpha_2, \ldots, \alpha_n)$
be a solution of the linear system $Ax = b$. Then substituting the $\alpha_i$'s in place of the $x_i$'s in the $k$-th and
$j$-th equations, we get
\[
a_{k1}\alpha_1 + a_{k2}\alpha_2 + \cdots + a_{kn}\alpha_n = b_k, \quad \text{and} \quad a_{j1}\alpha_1 + a_{j2}\alpha_2 + \cdots + a_{jn}\alpha_n = b_j.
\]
Therefore,
\[
(a_{k1} + c a_{j1})\alpha_1 + (a_{k2} + c a_{j2})\alpha_2 + \cdots + (a_{kn} + c a_{jn})\alpha_n = b_k + c b_j. \tag{2.1.2}
\]
But then the $k$-th equation of the linear system $Cx = d$ is
\[
(a_{k1} + c a_{j1})x_1 + (a_{k2} + c a_{j2})x_2 + \cdots + (a_{kn} + c a_{jn})x_n = b_k + c b_j. \tag{2.1.3}
\]
Therefore, using Equation (2.1.2), $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is also a solution of the $k$-th equation (2.1.3).
Use a similar argument to show that if $(\beta_1, \beta_2, \ldots, \beta_n)$ is a solution of the linear system $Cx = d$
then it is also a solution of the linear system $Ax = b$. Hence, the required result follows.
The readers are advised to use Lemma 2.1.11 as an induction step to prove the main result of
this subsection which is stated next.
Theorem 2.1.12 Two equivalent linear systems have the same solution set.
2.1.2 Gauss Elimination Method
We first define the Gauss elimination method and give a few examples to understand the method.
Definition 2.1.13 (Forward/Gauss Elimination Method) The Gaussian elimination method
is a procedure for solving a linear system $Ax = b$ (consisting of $m$ equations in $n$ unknowns) by
bringing the augmented matrix
\[
[A \;\; b] = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1m} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2m} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mm} & \cdots & a_{mn} & b_m
\end{bmatrix}
\]
to an upper triangular form
\[
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1m} & \cdots & c_{1n} & d_1 \\
0 & c_{22} & \cdots & c_{2m} & \cdots & c_{2n} & d_2 \\
\vdots & \vdots & \ddots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & c_{mm} & \cdots & c_{mn} & d_m
\end{bmatrix}
\]
by application of elementary row operations. This elimination process is also called the forward
elimination method.
We have already seen an example before defining the notion of row equivalence. We give two
more examples to illustrate the Gauss elimination method.
Example 2.1.14 Solve the following linear system by Gauss elimination method.
\[ x + y + z = 3, \quad x + 2y + 2z = 5, \quad 3x + 4y + 4z = 11 \]
Solution: Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 3 & 4 & 4 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 5 \\ 11 \end{bmatrix}$. The Gauss elimination method starts with the
augmented matrix $[A \;\; b]$ and proceeds as follows:
1. Replace 2nd equation by 2nd equation minus the 1st equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 3x + 4y + 4z &=& 11 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 3 & 4 & 4 & 11 \end{bmatrix}.
\]
2. Replace 3rd equation by 3rd equation minus 3 times 1st equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ y + z &=& 2 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 2 \end{bmatrix}.
\]
3. Replace 3rd equation by 3rd equation minus the 2nd equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]
Thus, the solution set is $\{(x, y, z)^t : (x, y, z) = (1, 2 - z, z)\}$ or equivalently $\{(x, y, z)^t : (x, y, z) =
(1, 2, 0) + z(0, -1, 1)\}$, with $z$ arbitrary. In other words, the system has an infinite number of
solutions. Observe that the vector $y^t = (1, 2, 0)$ satisfies $Ay = b$ and the vector $z^t = (0, -1, 1)$ is
a solution of the homogeneous system $Ax = 0$.
Example 2.1.15 Solve the following linear system by Gauss elimination method.
\[ x + y + z = 3, \quad x + 2y + 2z = 5, \quad 3x + 4y + 4z = 12 \]
Solution: Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 3 & 4 & 4 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 5 \\ 12 \end{bmatrix}$. The Gauss elimination method starts with the
augmented matrix $[A \;\; b]$ and proceeds as follows:
1. Replace 2nd equation by 2nd equation minus the 1st equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 3x + 4y + 4z &=& 12 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 3 & 4 & 4 & 12 \end{bmatrix}.
\]
2. Replace 3rd equation by 3rd equation minus 3 times 1st equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 3 \end{bmatrix}.
\]
3. Replace 3rd equation by 3rd equation minus the 2nd equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 0 &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]
The third equation in the last step is
\[ 0x + 0y + 0z = 1. \]
This can never hold for any value of $x, y, z$. Hence, the system has no solution.
Remark 2.1.16 Note that to solve a linear system Ax = b, one needs to apply only the row
operations to the augmented matrix [A b].
Definition 2.1.17 (Row Echelon Form of a Matrix) A matrix $C$ is said to be in the row echelon
form if
1. the rows consisting entirely of zeros appear after the non-zero rows,
2. the first non-zero entry in a non-zero row is 1. This term is called the leading term or a
leading 1. The column containing this term is called the leading column.
3. In any two successive non-zero rows, the leading 1 in the lower row occurs farther to the right than
the leading 1 in the higher row.
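Example 2.1.18 The matrices
\[
\begin{bmatrix} 0 & 1 & 4 & 2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]
are in row-echelon form. Whereas, the matrices
\[
\begin{bmatrix} 0 & 1 & 4 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix}
\]
are not in row-echelon form.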
Definition 2.1.19 (Basic, Free Variables) Let $Ax = b$ be a linear system consisting of $m$ equations
in $n$ unknowns. Suppose the application of Gauss elimination method to the augmented matrix
$[A \;\; b]$ yields the matrix $[C \;\; d]$.
1. Then the variables corresponding to the leading columns (in the first $n$ columns of $[C \;\; d]$) are
called the basic variables.
2. The variables which are not basic are called free variables.
The free variables are called so as they can be assigned arbitrary values. Also, the basic variables
can be written in terms of the free variables and hence the values of the basic variables in the solution
set depend on the values of the free variables.
Remark 2.1.20 Observe the following:
1. In Example 2.1.14, the solution set was given by
\[ (x, y, z) = (1, 2 - z, z) = (1, 2, 0) + z(0, -1, 1), \text{ with } z \text{ arbitrary.} \]
That is, we had $x, y$ as two basic variables and $z$ as a free variable.
2. Example 2.1.15 didn't have any solution because the row-echelon form of the augmented matrix
had a row of the form $[0, 0, 0, 1]$.
3. Suppose the application of row operations to $[A \;\; b]$ yields the matrix $[C \;\; d]$ which is in row
echelon form. If $[C \;\; d]$ has $r$ non-zero rows then $[C \;\; d]$ will consist of $r$ leading terms or $r$
leading columns. Therefore, provided the system is consistent, the linear system $Ax = b$ will have $r$ basic variables
and $n - r$ free variables.
Before proceeding further, we have the following definition.
Definition 2.1.21 (Consistent, Inconsistent) A linear system is called consistent if it admits
a solution and is called inconsistent if it admits no solution.
We are now ready to prove conditions under which the linear system $Ax = b$ is consistent or
inconsistent.
Theorem 2.1.22 Consider the linear system $Ax = b$, where $A$ is an $m \times n$ matrix and $x^t =
(x_1, x_2, \ldots, x_n)$. If one obtains $[C \;\; d]$ as the row-echelon form of $[A \;\; b]$ with $d^t = (d_1, d_2, \ldots, d_m)$
then
1. $Ax = b$ is inconsistent (has no solution) if $[C \;\; d]$ has a row of the form $[0^t \;\; 1]$, where
$0^t = (0, \ldots, 0)$.
2. $Ax = b$ is consistent (has a solution) if $[C \;\; d]$ has no row of the form $[0^t \;\; 1]$. Furthermore,
(a) if the number of variables equals the number of leading terms then $Ax = b$ has a unique
solution.
(b) if the number of variables is strictly greater than the number of leading terms then $Ax = b$
has an infinite number of solutions.
Proof. Part 1: The linear equation corresponding to the row $[0^t \;\; 1]$ equals
\[ 0x_1 + 0x_2 + \cdots + 0x_n = 1. \]
Obviously, this equation has no solution and hence the system $Cx = d$ has no solution. Thus, by
Theorem 2.1.12, $Ax = b$ has no solution. That is, $Ax = b$ is inconsistent.
Part 2: Suppose $[C \;\; d]$ has $r$ non-zero rows. As $[C \;\; d]$ is in row echelon form there exist
positive integers $1 \le i_1 < i_2 < \cdots < i_r \le n$ such that the entries $c_{\ell i_\ell}$ for $1 \le \ell \le r$ are leading
terms. This in turn implies that the variables $x_{i_j}$, for $1 \le j \le r$, are the basic variables and the
remaining $n - r$ variables, say $x_{t_1}, x_{t_2}, \ldots, x_{t_{n-r}}$, are free variables. So for each $\ell$, $1 \le \ell \le r$, one
obtains
\[ x_{i_\ell} + \sum_{k > i_\ell} c_{\ell k} x_k = d_\ell \]
($k > i_\ell$ in the summation as $[C \;\; d]$ is an upper triangular matrix). Or
equivalently,
\[
x_{i_\ell} = d_\ell - \sum_{j = \ell + 1}^{r} c_{\ell i_j} x_{i_j} - \sum_{s = 1}^{n - r} c_{\ell t_s} x_{t_s} \quad \text{for } 1 \le \ell \le r.
\]
Hence, a solution of the system $Cx = d$ is obtained by taking $x_{t_s} = 0$ for $s = 1, \ldots, n - r$ and then
substituting backwards:
\[
x_{i_r} = d_r, \quad x_{i_{r-1}} = d_{r-1} - c_{r-1, i_r} x_{i_r}, \quad \ldots, \quad x_{i_1} = d_1 - \sum_{j=2}^{r} c_{1 i_j} x_{i_j}.
\]
Thus, by Theorem 2.1.12 the system $Ax = b$ is consistent. In case of Part 2a, there are no free
variables, so $r = n$ and $i_\ell = \ell$, and hence the unique solution is given by
\[
x_n = d_n, \quad x_{n-1} = d_{n-1} - c_{n-1, n} x_n, \quad \ldots, \quad x_1 = d_1 - \sum_{j=2}^{n} c_{1j} x_j.
\]
In case of Part 2b, there is at least one free variable and hence $Ax = b$ has an infinite number of
solutions. Thus, the proof of the theorem is complete.
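The criterion of Theorem 2.1.22 can be turned into a short procedure. The Python sketch below is one possible reading, not the authors' algorithm: `row_echelon` brings the augmented matrix to a row echelon form with leading 1's, and `classify` (a hypothetical name) counts leading terms and looks for a row of the form $[0^t \;\; 1]$.

```python
from fractions import Fraction

def row_echelon(M):
    """Return a row echelon form (with leading 1's) of M via elementary row operations."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0  # index of the next row that should receive a leading term
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]          # interchange rows
        lead = M[r][c]
        M[r] = [v / lead for v in M[r]]          # make the leading term 1
        for i in range(r + 1, rows):             # clear entries below it
            factor = M[i][c]
            M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

def classify(aug):
    """Classify Ax = b from its augmented matrix [A b]."""
    n = len(aug[0]) - 1                          # number of unknowns
    leading = 0
    for row in row_echelon(aug):
        first = next((j for j, v in enumerate(row) if v != 0), None)
        if first is None:
            continue                             # zero row
        if first == n:
            return "inconsistent"                # row of the form [0 ... 0 1]
        leading += 1
    return "unique" if leading == n else "infinite"

print(classify([[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]))    # Example 2.1.7
print(classify([[1, 1, 1, 3], [1, 2, 2, 5], [3, 4, 4, 11]]))   # Example 2.1.14
print(classify([[1, 1, 1, 3], [1, 2, 2, 5], [3, 4, 4, 12]]))   # Example 2.1.15
```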
We omit the proof of the next result as it directly follows from Theorem 2.1.22.
Corollary 2.1.23 Consider the homogeneous system $Ax = 0$. Then
1. $Ax = 0$ is always consistent as $0$ is a solution.
2. If $m < n$ then $n - m > 0$ and there will be at least $n - m$ free variables. Thus $Ax = 0$ has
an infinite number of solutions. Or equivalently, $Ax = 0$ has a non-trivial solution.
We end this subsection with some applications related to geometry.
Example 2.1.24 1. Determine the equation of the line/circle that passes through the points
$(-1, 4)$, $(0, 1)$ and $(1, 4)$.
Solution: The general equation of a line/circle in the 2-dimensional plane is given by $a(x^2 +
y^2) + bx + cy + d = 0$, where $a, b, c$ and $d$ are the unknowns. Since this curve passes through
the given points, we have
\[
\begin{array}{rcl}
a((-1)^2 + 4^2) + (-1)b + 4c + d &=& 0 \\
a((0)^2 + 1^2) + (0)b + 1c + d &=& 0 \\
a((1)^2 + 4^2) + (1)b + 4c + d &=& 0.
\end{array}
\]
Solving this system, we get $(a, b, c, d) = (\frac{3}{13} d, 0, -\frac{16}{13} d, d)$. Hence, taking $d = 13$, the equation
of the required circle is
\[ 3(x^2 + y^2) - 16y + 13 = 0. \]
2. Determine the equation of the plane that contains the points $(1, 1, 1)$, $(1, 3, 2)$ and $(2, -1, 2)$.
Solution: The general equation of a plane in 3-dimensional space is given by $ax + by + cz + d =
0$, where $a, b, c$ and $d$ are the unknowns. Since this plane passes through the given points, we
have
\[
\begin{array}{rcl}
a + b + c + d &=& 0 \\
a + 3b + 2c + d &=& 0 \\
2a - b + 2c + d &=& 0.
\end{array}
\]
Solving this system, we get $(a, b, c, d) = (-\frac{4}{3} d, -\frac{d}{3}, \frac{2}{3} d, d)$. Hence, taking $d = 3$, the equation
of the required plane is $-4x - y + 2z + 3 = 0$.
3. Let A =
_
_
2 3 4
0 1 0
0 3 4
_
_
.
(a) Find a non-zero $x^t \in \mathbb{R}^3$ such that $Ax = 2x$.
(b) Does there exist a non-zero vector $y^t \in \mathbb{R}^3$ such that $Ay = 4y$?
Solution of Part 3a: Solving for $Ax = 2x$ is the same as solving for $(A - 2I)x = 0$. This
leads to the augmented matrix
_
_
0 3 4 0
0 3 0 0
0 4 2 0
_
_
. Check that a non-zero solution is given by $x^t = (1, 0, 0)$.
Solution of Part 3b: Solving for $Ay = 4y$ is the same as solving for $(A - 4I)y = 0$. This
leads to the augmented matrix
_
_
2 3 4 0
0 5 0 0
0 3 0 0
_
_
. Check that a non-zero solution is given by $y^t = (2, 0, 1)$.
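The answers to Parts 1 and 2 of Example 2.1.24 can be checked by direct substitution. The sketch below assumes the points are $(-1, 4)$, $(0, 1)$, $(1, 4)$ for the circle and $(1, 1, 1)$, $(1, 3, 2)$, $(2, -1, 2)$ for the plane, which is the reading consistent with the displayed equations; the function names `circle` and `plane` are illustrative.

```python
def circle(x, y):
    """Left-hand side of 3(x^2 + y^2) - 16y + 13 = 0."""
    return 3 * (x**2 + y**2) - 16 * y + 13

def plane(x, y, z):
    """Left-hand side of -4x - y + 2z + 3 = 0."""
    return -4 * x - y + 2 * z + 3

circle_points = [(-1, 4), (0, 1), (1, 4)]         # assumed signs, see lead-in
plane_points = [(1, 1, 1), (1, 3, 2), (2, -1, 2)]  # assumed signs, see lead-in

print([circle(*p) for p in circle_points])  # every value should be 0
print([plane(*p) for p in plane_points])    # every value should be 0
```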
Exercise 2.1.25 1. Determine the equation of the curve $y = ax^2 + bx + c$ that passes through
the points $(-1, 4)$, $(0, 1)$ and $(1, 4)$.
2. Solve the following linear system.
(a) $x + y + z + w = 0$, $x - y + z + w = 0$ and $x + y + 3z + 3w = 0$.
(b) $x + 2y = 1$, $x + y + z = 4$ and $3y + 2z = 1$.
(c) $x + y + z = 3$, $x + y - z = 1$ and $x + y + 7z = 6$.
(d) $x + y + z = 3$, $x + y - z = 1$ and $x + y + 4z = 6$.
(e) $x + y + z = 3$, $x + y - z = 1$, $x + y + 4z = 6$ and $x + y - 4z = 1$.
3. For what values of $c$ and $k$ do the following systems have i) no solution, ii) a unique solution
and iii) an infinite number of solutions?
(a) $x + y + z = 3$, $x + 2y + cz = 4$, $2x + 3y + 2cz = k$.
(b) $x + y + z = 3$, $x + y + 2cz = 7$, $x + 2y + 3cz = k$.
(c) $x + y + 2z = 3$, $x + 2y + cz = 5$, $x + 2y + 4z = k$.
(d) $kx + y + z = 1$, $x + ky + z = 1$, $x + y + kz = 1$.
(e) $x + 2y - z = 1$, $2x + 3y + kz = 3$, $x + ky + 3z = 2$.
(f) $x - 2y = 1$, $x - y + kz = 1$, $ky + 4z = 6$.
4. For what values of $a$ do the following systems have i) no solution, ii) a unique solution
and iii) an infinite number of solutions?
(a) $x + 2y + 3z = 4$, $2x + 5y + 5z = 6$, $2x + (a^2 - 6)z = a + 20$.
(b) $x + y + z = 3$, $2x + 5y + 4z = a$, $3x + (a^2 - 8)z = 12$.
5. Find the condition(s) on $x, y, z$ so that the system of linear equations given below (in the
unknowns $a, b$ and $c$) is consistent.
(a) $a + 2b - 3c = x$, $2a + 6b - 11c = y$, $a - 2b + 7c = z$
(b) $a + b + 5c = x$, $a + 3c = y$, $2a - b + 4c = z$
(c) $a + 2b + 3c = x$, $2a + 4b + 6c = y$, $3a + 6b + 9c = z$
6. Let $A$ be an $n \times n$ matrix. If the system $A^2 x = 0$ has a non-trivial solution then show that
$Ax = 0$ also has a non-trivial solution.
7. Prove that 5 distinct points are needed to specify a general conic in the 2-dimensional
plane.
8. Let $u^t = (1, 1, 2)$ and $v^t = (1, 2, 3)$. Find the condition on $x, y$ and $z$ such that the system
$c u^t + d v^t = (x, y, z)$ in the unknowns $c$ and $d$ is consistent.
2.1.3 Gauss-Jordan Elimination
The Gauss-Jordan method consists of first applying the Gauss elimination method to get the row-echelon
form of the matrix $[A \;\; b]$ and then applying further row operations, as illustrated below. For
example, consider Example 2.1.7. We start with Step 5 and apply row operations once again. But
this time, we start with the 3rd row.
I. Replace 2nd equation by 2nd equation minus the 3rd equation.
\[
\begin{array}{rcl} x + \frac{3}{2} z &=& \frac{5}{2} \\ y &=& 1 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]
II. Replace 1st equation by 1st equation minus $\frac{3}{2}$ times 3rd equation.
\[
\begin{array}{rcl} x &=& 1 \\ y &=& 1 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]
III. Thus, the solution set equals $\{(x, y, z)^t : (x, y, z) = (1, 1, 1)\}$.
Definition 2.1.26 (Row-Reduced Echelon Form) A matrix $C$ is said to be in the row-reduced
echelon form or reduced row echelon form if
1. $C$ is already in the row echelon form;
2. every leading column has all its entries other than the leading 1 equal to zero.
A matrix which is in the row-reduced echelon form is also called a row-reduced echelon matrix.
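Example 2.1.27 Let
\[
A = \begin{bmatrix} 0 & 1 & 4 & 2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.
\]
Then $A$ and $B$ are in row echelon form. If $C$ and $D$ are the row-reduced echelon forms of $A$ and $B$, respectively,
then
\[
C = \begin{bmatrix} 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
D = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.
\]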
Definition 2.1.28 (Back Substitution/Gauss-Jordan Method) The procedure to get the row-reduced
echelon matrix from the row-echelon matrix is called the back substitution. The elimination
process applied to obtain the row-reduced echelon form of the augmented matrix is called the
Gauss-Jordan elimination method.
That is, the Gauss-Jordan elimination method consists of both the forward elimination and the
backward substitution.
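A compact way to compute the row-reduced echelon form is sketched below in Python with exact fractions. Note the swapped-in technique: instead of a forward pass followed by back substitution, this sketch clears the entries both above and below each leading 1 as soon as it is created. The function name `rref` is an illustrative choice; the output agrees with the two-pass procedure because the row-reduced echelon form of a matrix is unique.

```python
from fractions import Fraction

def rref(M):
    """Row-reduced echelon form of M, computed with elementary row operations."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0  # next row to receive a leading 1
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]              # R_ij
        lead = M[r][c]
        M[r] = [v / lead for v in M[r]]              # R_k(1/lead)
        for i in range(rows):                        # R_ij(c) for every other row
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
    return M

# The matrix A of Example 2.1.27 and the augmented matrix of Example 2.1.7.
for row in rref([[0, 1, 4, 2], [0, 0, 1, 1], [0, 0, 0, 0]]):
    print([str(v) for v in row])
for row in rref([[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]):
    print([str(v) for v in row])
```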
Remark 2.1.29 Note that the row reduction involves only row operations and proceeds from left
to right. Hence, if $A$ is a matrix consisting of the first $s$ columns of a matrix $C$, then the row-reduced
form of $A$ will consist of the first $s$ columns of the row-reduced form of $C$.
The proof of the following theorem is beyond the scope of this book and is omitted.
Theorem 2.1.30 The row-reduced echelon form of a matrix is unique.
Remark 2.1.31 Consider the linear system $Ax = b$. Then Theorem 2.1.30 implies the following:
1. The application of the Gauss elimination method to the augmented matrix may yield different
matrices even though it leads to the same solution set.
2. The application of the Gauss-Jordan method to the augmented matrix yields the same matrix
and also the same solution set even though we may have used different sequences of row
operations.
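Example 2.1.32 Consider $Ax = b$, where $A$ is a $3 \times 3$ matrix. Let $[C \;\; d]$ be the row-reduced
echelon form of $[A \;\; b]$. Also, assume that the first column of $A$ has a non-zero entry. Then the
possible choices for the matrix $[C \;\; d]$ with respective solution sets are given below:
1. $\begin{bmatrix} 1 & 0 & 0 & d_1 \\ 0 & 1 & 0 & d_2 \\ 0 & 0 & 1 & d_3 \end{bmatrix}$. $Ax = b$ has a unique solution, $(x, y, z) = (d_1, d_2, d_3)$.
2. $\begin{bmatrix} 1 & 0 & \alpha & d_1 \\ 0 & 1 & \beta & d_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$,
$\begin{bmatrix} 1 & \alpha & 0 & d_1 \\ 0 & 0 & 1 & d_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ or
$\begin{bmatrix} 1 & \alpha & \beta & d_1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. $Ax = b$ has no solution for any
choice of $\alpha, \beta$.
3. $\begin{bmatrix} 1 & 0 & \alpha & d_1 \\ 0 & 1 & \beta & d_2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$,
$\begin{bmatrix} 1 & \alpha & 0 & d_1 \\ 0 & 0 & 1 & d_2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$,
$\begin{bmatrix} 1 & \alpha & \beta & d_1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. $Ax = b$ has an infinite number of
solutions for every choice of $\alpha, \beta$.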
Exercise 2.1.33 1. Let $Ax = b$ be a linear system in 2 unknowns. What are the possible choices
for the row-reduced echelon form of the augmented matrix $[A \;\; b]$?
2. Find the row-reduced echelon form of the following matrices:
\[
\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 3 \\ 3 & 0 & 7 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & 3 \\ 1 & 1 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 1 \\ 2 & 0 & 3 \\ 5 & 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 2 & 3 \\ 3 & 3 & 3 & 3 \\ 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \end{bmatrix}.
\]
3. Find all the solutions of the following system of equations using the Gauss-Jordan method. No
other method will be accepted.
\[
\begin{array}{rcl}
x + y - 2u + v &=& 2 \\
z + u + 2v &=& 3 \\
v + w &=& 3 \\
v + 2w &=& 5
\end{array}
\]
2.2 Elementary Matrices
In the previous section, we solved a system of linear equations with the help of either the Gauss
elimination method or the Gauss-Jordan method. These methods required us to make row
operations on the augmented matrix. Also, we know (see Section 1.2.1) that the row operations
correspond to multiplying a matrix on the left. So, in this section, we try to understand the
matrices which helped us in performing the row operations and also use this understanding to get some
important results in the theory of square matrices.
Definition 2.2.1 A square matrix $E$ of order $n$ is called an elementary matrix if it is obtained
by applying exactly one row operation to the identity matrix, $I_n$.
Remark 2.2.2 Fix a positive integer $n$. Then the elementary matrices of order $n$ are of three types
and are as follows:
1. $E_{ij}$ corresponds to the interchange of the $i$-th and the $j$-th row of $I_n$.
2. For $c \neq 0$, $E_k(c)$ is obtained by multiplying the $k$-th row of $I_n$ by $c$.
3. For $c \neq 0$, $E_{ij}(c)$ is obtained by replacing the $j$-th row of $I_n$ by the $j$-th row of $I_n$ plus $c$ times
the $i$-th row of $I_n$.
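Example 2.2.3 1. In particular, for $n = 3$ and a real number $c \neq 0$, one has
\[
E_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
E_1(c) = \begin{bmatrix} c & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad
E_{32}(c) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}.
\]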
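2. Let $A = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 4 \\ 3 & 4 & 5 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 3 & 4 & 5 & 6 \\ 2 & 0 & 3 & 4 \end{bmatrix}$. Then $B$ is obtained from $A$ by the
interchange of the 2nd and 3rd rows. Verify that
\[
E_{23} A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 4 \\ 3 & 4 & 5 & 6 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 & 0 \\ 3 & 4 & 5 & 6 \\ 2 & 0 & 3 & 4 \end{bmatrix} = B.
\]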
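3. Let $A = \begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$. Then $B = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$ is the row-reduced echelon form of $A$. The
readers are advised to verify that
\[
B = E_{32}(-1) \cdot E_{21}(-1) \cdot E_3(1/3) \cdot E_{23}(2) \cdot E_{23} \cdot E_{12}(-2) \cdot E_{13} \cdot A.
\]
Or equivalently, check that
\[
E_{13} A = A_1 = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \end{bmatrix}, \quad
E_{12}(-2) A_1 = A_2 = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & -2 & 1 & -1 \\ 0 & 1 & 1 & 2 \end{bmatrix},
\]
\[
E_{23} A_2 = A_3 = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & -2 & 1 & -1 \end{bmatrix}, \quad
E_{23}(2) A_3 = A_4 = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 3 & 3 \end{bmatrix},
\]
\[
E_3(1/3) A_4 = A_5 = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad
E_{21}(-1) A_5 = A_6 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix},
\]
\[
E_{32}(-1) A_6 = B = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]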
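The chain of products in Example 2.2.3.3 can be verified mechanically. The Python sketch below builds the three types of elementary matrices following Remark 2.2.2; the function names `E_swap`, `E_scale`, `E_add` and `matmul` are illustrative and use 1-based row indices as in the notes. The signs of the scalars are the ones forced by the intermediate matrices $A_1, \ldots, A_6$.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def E_swap(n, i, j):
    """E_ij: interchange rows i and j of I_n."""
    E = identity(n)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E

def E_scale(n, k, c):
    """E_k(c): multiply row k of I_n by c."""
    E = identity(n)
    E[k - 1][k - 1] = Fraction(c)
    return E

def E_add(n, i, j, c):
    """E_ij(c): replace row j of I_n by row j plus c times row i (i != j)."""
    E = identity(n)
    E[j - 1][i - 1] = Fraction(c)
    return E

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

A = [[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]
steps = [
    E_swap(3, 1, 3),                 # E_13
    E_add(3, 1, 2, -2),              # E_12(-2)
    E_swap(3, 2, 3),                 # E_23
    E_add(3, 2, 3, 2),               # E_23(2)
    E_scale(3, 3, Fraction(1, 3)),   # E_3(1/3)
    E_add(3, 2, 1, -1),              # E_21(-1)
    E_add(3, 3, 2, -1),              # E_32(-1)
]
B = A
for E in steps:                      # each step multiplies on the left
    B = matmul(E, B)
for row in B:
    print([str(v) for v in row])
```

Multiplying on the left by each matrix in turn reproduces the row operations of the example one at a time.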
Remark 2.2.4 Observe the following:
1. The inverse of the elementary matrix $E_{ij}$ is the matrix $E_{ij}$ itself. That is, $E_{ij} E_{ij} = I =
E_{ij} E_{ij}$.
2. Let $c \neq 0$. Then the inverse of the elementary matrix $E_k(c)$ is the matrix $E_k(1/c)$. That is,
$E_k(c) E_k(1/c) = I = E_k(1/c) E_k(c)$.
3. Let $c \neq 0$. Then the inverse of the elementary matrix $E_{ij}(c)$ is the matrix $E_{ij}(-c)$. That is,
$E_{ij}(c) E_{ij}(-c) = I = E_{ij}(-c) E_{ij}(c)$.
That is, all the elementary matrices are invertible and the inverses are also elementary
matrices.
4. Suppose the row-reduced echelon form of the augmented matrix $[A \;\; b]$ is the matrix $[C \;\; d]$.
As row operations correspond to multiplying on the left with elementary matrices, we can find
elementary matrices, say $E_1, E_2, \ldots, E_k$, such that
\[ E_k \cdot E_{k-1} \cdots E_2 \cdot E_1 \cdot [A \;\; b] = [C \;\; d]. \]
That is, the Gauss-Jordan method (or Gauss elimination method) is equivalent to multiplying
$[A \;\; b]$ on the left by a finite number of elementary matrices.
We are now ready to prove a few equivalent statements in the study of invertible matrices.
Theorem 2.2.5 Let $A$ be a square matrix of order $n$. Then the following statements are equivalent.
1. $A$ is invertible.
2. The homogeneous system $Ax = 0$ has only the trivial solution.
3. The row-reduced echelon form of $A$ is $I_n$.
4. $A$ is a product of elementary matrices.
Proof. $1 \Longrightarrow 2$
As $A$ is invertible, we have $A^{-1} A = I_n = A A^{-1}$. Let $x_0$ be a solution of the homogeneous
system $Ax = 0$. Then $A x_0 = 0$ and
\[ x_0 = I_n x_0 = (A^{-1} A) x_0 = A^{-1} (A x_0) = A^{-1} 0 = 0. \]
Thus, we see that $0$ is the only solution of the homogeneous system $Ax = 0$.
$2 \Longrightarrow 3$
Let $x^t = [x_1, x_2, \ldots, x_n]$. As $0$ is the only solution of the linear system $Ax = 0$, the final
equations are $x_1 = 0, x_2 = 0, \ldots, x_n = 0$. These equations can be rewritten as
\[
\begin{array}{rcl}
1 \cdot x_1 + 0 \cdot x_2 + 0 \cdot x_3 + \cdots + 0 \cdot x_n &=& 0 \\
0 \cdot x_1 + 1 \cdot x_2 + 0 \cdot x_3 + \cdots + 0 \cdot x_n &=& 0 \\
0 \cdot x_1 + 0 \cdot x_2 + 1 \cdot x_3 + \cdots + 0 \cdot x_n &=& 0 \\
\vdots & & \vdots \\
0 \cdot x_1 + 0 \cdot x_2 + 0 \cdot x_3 + \cdots + 1 \cdot x_n &=& 0.
\end{array}
\]
That is, the final homogeneous system is given by $I_n \cdot x = 0$. Or equivalently, the row-reduced
echelon form of the augmented matrix $[A \;\; 0]$ is $[I_n \;\; 0]$. That is, the row-reduced echelon
form of $A$ is $I_n$.
$3 \Longrightarrow 4$
Suppose that the row-reduced echelon form of $A$ is $I_n$. Then using Remark 2.2.4.4, there exist
elementary matrices $E_1, E_2, \ldots, E_k$ such that
\[ E_1 E_2 \cdots E_k A = I_n. \tag{2.2.4} \]
Now, using Remark 2.2.4, the matrix $E_j^{-1}$ is an elementary matrix and is the inverse of $E_j$ for
$1 \le j \le k$. Therefore, successively multiplying Equation (2.2.4) on the left by $E_1^{-1}, E_2^{-1}, \ldots, E_k^{-1}$,
we get
\[ A = E_k^{-1} E_{k-1}^{-1} \cdots E_2^{-1} E_1^{-1} \]
and thus $A$ is a product of elementary matrices.
$4 \Longrightarrow 1$
Suppose $A = E_1 E_2 \cdots E_k$, where the $E_i$'s are elementary matrices. As the elementary matrices
are invertible (see Remark 2.2.4) and the product of invertible matrices is also invertible, we get
the required result.
As an immediate consequence of Theorem 2.2.5, we have the following important result.
Theorem 2.2.6 Let $A$ be a square matrix of order $n$.
1. Suppose there exists a matrix $C$ such that $CA = I_n$. Then $A^{-1}$ exists.
2. Suppose there exists a matrix $B$ such that $AB = I_n$. Then $A^{-1}$ exists.
Proof. Suppose there exists a matrix $C$ such that $CA = I_n$. Let $x_0$ be a solution of the
homogeneous system $Ax = 0$. Then $A x_0 = 0$ and
\[ x_0 = I_n \cdot x_0 = (CA) x_0 = C(A x_0) = C 0 = 0. \]
That is, the homogeneous system $Ax = 0$ has only the trivial solution. Hence, using Theorem 2.2.5,
the matrix $A$ is invertible.
Using the first part, it is clear that the matrix $B$ in the second part is invertible. Hence
\[ AB = I_n = BA. \]
Thus, $A$ is invertible as well.
Remark 2.2.7 Theorem 2.2.6 implies the following:
1. If we want to show that a square matrix $A$ of order $n$ is invertible, it is enough to show the
existence of
(a) either a matrix $B$ such that $AB = I_n$
(b) or a matrix $C$ such that $CA = I_n$.
2. Let $A$ be an invertible matrix of order $n$. Suppose there exist elementary matrices $E_1, E_2, \ldots, E_k$
such that $E_1 E_2 \cdots E_k A = I_n$. Then $A^{-1} = E_1 E_2 \cdots E_k$.
Remark 2.2.7 gives the following method of computing the inverse of a matrix.
Summary: Let $A$ be an $n \times n$ matrix. Apply the Gauss-Jordan method to the matrix $[A \;\; I_n]$.
Suppose the row-reduced echelon form of the matrix $[A \;\; I_n]$ is $[B \;\; C]$. If $B = I_n$, then $A^{-1} = C$;
otherwise $A$ is not invertible.
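Example 2.2.8 Find the inverse of the matrix $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$ using the Gauss-Jordan method.
Solution: Let us apply the Gauss-Jordan method to the matrix
$\begin{bmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}$.
1. $\begin{bmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}
\xrightarrow{R_{13}}
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}$
2. $\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}
\xrightarrow{R_{31}(-1),\; R_{32}(-1)}
\begin{bmatrix} 1 & 1 & 0 & -1 & 0 & 1 \\ 0 & 1 & 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}$
3. $\begin{bmatrix} 1 & 1 & 0 & -1 & 0 & 1 \\ 0 & 1 & 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}
\xrightarrow{R_{21}(-1)}
\begin{bmatrix} 1 & 0 & 0 & 0 & -1 & 1 \\ 0 & 1 & 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}$.
Thus, the inverse of the given matrix is $\begin{bmatrix} 0 & -1 & 1 \\ -1 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$.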
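The method of row-reducing $[A \;\; I_n]$ can be sketched in Python as below. This is a minimal illustration with exact fractions, not a numerically robust routine; the function name `inverse` and the convention of returning `None` for a non-invertible matrix are assumptions of this sketch.

```python
from fractions import Fraction

def inverse(A):
    """Inverse of a square matrix by Gauss-Jordan on [A | I_n], or None if A is not invertible."""
    n = len(A)
    # Build the block matrix [A | I_n].
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            return None                      # left block cannot be reduced to I_n
        M[c], M[pivot] = M[pivot], M[c]      # interchange rows
        lead = M[c][c]
        M[c] = [v / lead for v in M[c]]      # make the leading term 1
        for i in range(n):                   # clear the rest of column c
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [vi - factor * vc for vi, vc in zip(M[i], M[c])]
    return [row[n:] for row in M]            # right block is the inverse

for row in inverse([[0, 0, 1], [0, 1, 1], [1, 1, 1]]):
    print([str(v) for v in row])
print(inverse([[1, 2], [2, 4]]))             # a matrix that is not invertible
```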
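Exercise 2.2.9 1. Find the inverse of the following matrices using the Gauss-Jordan method.
(i) $\begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \\ 2 & 4 & 7 \end{bmatrix}$, (ii) $\begin{bmatrix} 1 & 3 & 3 \\ 2 & 3 & 2 \\ 2 & 4 & 7 \end{bmatrix}$, (iii) $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$, (iv) $\begin{bmatrix} 0 & 0 & 2 \\ 0 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix}$.
2. Which of the following matrices are elementary?
\[ \begin{bmatrix} 2 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},\ \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \]
3. Let $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$. Find the elementary matrices $E_1, E_2, E_3$ and $E_4$ such that $E_4 E_3 E_2 E_1 A = I_2$.
4. Let $B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 3 \end{bmatrix}$. Determine elementary matrices $E_1, E_2$ and $E_3$ such that $E_3 E_2 E_1 B = I_3$.
5. In Exercise 2.2.9.3, let $C = E_4 E_3 E_2 E_1$. Then check that $AC = I_2$.
6. In Exercise 2.2.9.4, let $C = E_3 E_2 E_1$. Then check that $BC = I_3$.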
7. Find the inverse of the three matrices given in Example 2.2.3.3.
8. Show that a triangular matrix $A$ is invertible if and only if each diagonal entry of $A$ is non-zero.
9. Let $A$ be a $1 \times 2$ matrix and $B$ be a $2 \times 1$ matrix having positive entries. Which of $BA$ or $AB$ is invertible? Give reasons.
10. Let $A$ be an $n \times m$ matrix and $B$ be an $m \times n$ matrix. Prove that
(a) the matrix $I - BA$ is invertible if and only if the matrix $I - AB$ is invertible [Hint: Use Theorem 2.2.5.2].
(b) $(I - BA)^{-1} = I + B(I - AB)^{-1}A$ whenever $I - AB$ is invertible.
(c) $(I - BA)^{-1}B = B(I - AB)^{-1}$ whenever $I - AB$ is invertible.
(d) $(A^{-1} + B^{-1})^{-1} = A(A + B)^{-1}B$ whenever $A$, $B$ and $A + B$ are all invertible.
We end this section by giving two more equivalent conditions for a matrix to be invertible.
Theorem 2.2.10 The following statements are equivalent for an $n \times n$ matrix $A$.
1. $A$ is invertible.
2. The system $Ax = b$ has a unique solution for every $b$.
3. The system $Ax = b$ is consistent for every $b$.
Proof. $1 \Longrightarrow 2$
Observe that $x_0 = A^{-1}b$ is the unique solution of the system $Ax = b$.
$2 \Longrightarrow 3$
The system $Ax = b$ has a solution and hence, by definition, the system is consistent.
$3 \Longrightarrow 1$
For $1 \le i \le n$, define $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^t$, with the entry $1$ in the $i$th position, and consider the linear system $Ax = e_i$. By assumption, this system has a solution, say $x_i$, for each $i$, $1 \le i \le n$. Define a matrix $B = [x_1, x_2, \ldots, x_n]$. That is, the $i$th column of $B$ is the solution of the system $Ax = e_i$. Then
\[ AB = A[x_1, x_2, \ldots, x_n] = [Ax_1, Ax_2, \ldots, Ax_n] = [e_1, e_2, \ldots, e_n] = I_n. \]
Therefore, by Theorem 2.2.6, the matrix $A$ is invertible.
We now state another important result whose proof is immediate from Theorem 2.2.10 and
Theorem 2.2.5 and hence the proof is omitted.
Theorem 2.2.11 Let $A$ be an $n \times n$ matrix. Then the two statements given below cannot hold together.
1. The system $Ax = b$ has a unique solution for every $b$.
2. The system $Ax = 0$ has a non-trivial solution.
Exercise 2.2.12 1. Let $A$ and $B$ be two square matrices of the same order such that $B = PA$. Prove that $A$ is invertible if and only if $B$ is invertible.
2. Let $A$ and $B$ be two $m \times n$ matrices. Then prove that the two matrices $A$, $B$ are row-equivalent if and only if $B = PA$, where $P$ is a product of elementary matrices. When is this $P$ unique?
3. Let $b^t = [1, 2, 1, 2]$. Suppose $A$ is a $4 \times 4$ matrix such that the linear system $Ax = b$ has no solution. Mark each of the statements given below as true or false.
(a) The homogeneous system $Ax = 0$ has only the trivial solution.
(b) The matrix $A$ is invertible.
(c) Let $c^t = [1, 2, 1, 2]$. Then the system $Ax = c$ has no solution.
(d) Let $B$ be the row-reduced echelon form of $A$. Then
i. the fourth row of $B$ is $[0, 0, 0, 0]$.
ii. the fourth row of $B$ is $[0, 0, 0, 1]$.
iii. the third row of $B$ is necessarily of the form $[0, 0, 0, 0]$.
iv. the third row of $B$ is necessarily of the form $[0, 0, 0, 1]$.
v. the third row of $B$ is necessarily of the form $[0, 0, 1, \alpha]$, where $\alpha$ is any real number.
2.3 Rank of a Matrix
In the previous section, we gave a few equivalent conditions for a square matrix to be invertible. We also used the Gauss-Jordan method and the elementary matrices to compute the inverse of a square matrix $A$. In this section and the subsequent sections, we will mostly be concerned with $m \times n$ matrices.
Let $A$ be an $m \times n$ matrix. Suppose that $C$ is the row-reduced echelon form of $A$. Then the matrix $C$ is unique (see Theorem 2.1.30). Hence, we use the matrix $C$ to define the rank of the matrix $A$.
Definition 2.3.1 (Row Rank of a Matrix) Let $C$ be the row-reduced echelon form of a matrix $A$. The number of non-zero rows in $C$ is called the row-rank of $A$.
For a matrix $A$, we write row-rank$(A)$ to denote the row-rank of $A$. By the very definition, it is clear that row-equivalent matrices have the same row-rank. Thus, the number of non-zero rows in the row echelon form of a matrix equals the number of non-zero rows in its row-reduced echelon form. Therefore, we just need to get the row echelon form of the matrix to know its rank.
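Example 2.3.2 1. Determine the row-rank of $A = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 2 & 1 \end{bmatrix}$.
Solution: The row-reduced echelon form of $A$ is obtained as follows:
\[ \begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 2 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & -1 & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}. \]
The final matrix has 3 non-zero rows. Thus row-rank$(A) = 3$. This also follows from the third matrix.
2. Determine the row-rank of $A = \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 2 & 3 & 1 & 2 & 2 \\ 1 & 1 & 0 & 1 & 1 \end{bmatrix}$.
Solution: row-rank$(A) = 2$ as one has the following:
\[ \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 2 & 3 & 1 & 2 & 2 \\ 1 & 1 & 0 & 1 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 0 & -1 & -1 & 0 & 0 \\ 0 & -1 & -1 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 0 & -1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \]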
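As a hedged illustration of Definition 2.3.1, the row-rank can be computed by reducing a matrix to a row echelon form and counting its non-zero rows. The function name `row_rank` is an assumption made for this sketch; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def row_rank(a):
    """Count the non-zero rows in a row echelon form of `a` (list of rows)."""
    rows = [[Fraction(x) for x in row] for row in a]
    m, n = len(rows), len(rows[0])
    rank = 0
    for col in range(n):
        # Find a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, m) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Make every entry below the pivot zero.
        for r in range(rank + 1, m):
            f = rows[r][col] / rows[rank][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == m:
            break
    return rank

print(row_rank([[1, 2, 1, 1], [2, 3, 1, 2], [1, 1, 2, 1]]))
print(row_rank([[1, 2, 1, 1, 1], [2, 3, 1, 2, 2], [1, 1, 0, 1, 1]]))
```

The two calls use the matrices of Example 2.3.2.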
The following remark related to the augmented matrix is immediate as computing the rank only involves the row operations (also see Remark 2.1.29).
Remark 2.3.3 Let $Ax = b$ be a linear system with $m$ equations in $n$ unknowns. Then the row-reduced echelon form of $A$ agrees with the first $n$ columns of the row-reduced echelon form of $[A \;\; b]$, and hence
\[ \text{row-rank}(A) \le \text{row-rank}([A \;\; b]). \]
Now, consider an $m \times n$ matrix $A$ and an elementary matrix $E$ of order $n$. Then the product $AE$ corresponds to applying a column transformation on the matrix $A$. Therefore, for each elementary matrix, there is a corresponding column transformation as well. We summarize these ideas as follows.
Definition 2.3.4 The column transformations obtained by right multiplication of elementary matrices are called column operations.
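Example 2.3.5 Let $A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 2 & 0 & 3 & 2 \\ 3 & 4 & 5 & 3 \end{bmatrix}$. Then
\[ A \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 2 & 1 \\ 2 & 3 & 0 & 2 \\ 3 & 5 & 4 & 3 \end{bmatrix} \quad \text{and} \quad A \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 0 \\ 3 & 4 & 5 & 0 \end{bmatrix}. \]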
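The two products in Example 2.3.5 can be checked with a small sketch. The helper `matmul` and the variable names are assumptions introduced only for this illustration.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2, 3, 1],
     [2, 0, 3, 2],
     [3, 4, 5, 3]]

# Elementary matrix that interchanges columns 2 and 3 when multiplied on the right.
swap_cols_2_3 = [[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]]

# Elementary matrix that replaces column 4 by column 4 minus column 1.
col4_minus_col1 = [[1, 0, 0, -1],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]]

print(matmul(A, swap_cols_2_3))
print(matmul(A, col4_minus_col1))
```

Right multiplication acts on columns, whereas left multiplication by the same elementary matrices would act on rows.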
Remark 2.3.6 After application of a finite number of elementary column operations (see Definition 2.3.4) to a matrix $A$, we can obtain a matrix $B$ having the following properties:
1. The first non-zero entry in each column is 1, called the leading term.
2. Column(s) containing only 0's come after all columns with at least one non-zero entry.
3. The first non-zero entry (the leading term) in each non-zero column moves down in successive columns.
We define the column-rank of $A$ as the number of non-zero columns in $B$.
It will be proved later that row-rank$(A)$ = column-rank$(A)$. Thus we are led to the following definition.
Definition 2.3.7 The number of non-zero rows in the row-reduced echelon form of a matrix $A$ is called the rank of $A$, denoted rank$(A)$.
We are now ready to prove a few results associated with the rank of a matrix.
Theorem 2.3.8 Let $A$ be a matrix of rank $r$. Then there exist a finite number of elementary matrices $E_1, E_2, \ldots, E_s$ and $F_1, F_2, \ldots, F_\ell$ such that
\[ E_1 E_2 \cdots E_s \, A \, F_1 F_2 \cdots F_\ell = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}. \]
Proof. Let $C$ be the row-reduced echelon matrix of $A$. As rank$(A) = r$, the first $r$ rows of $C$ are non-zero rows. So by Theorem 2.1.22, $C$ will have $r$ leading columns, say $i_1, i_2, \ldots, i_r$. Note that, for $1 \le s \le r$, the $i_s$th column will have 1 in the $s$th row and zero elsewhere.
We now apply column operations to the matrix $C$. Let $D$ be the matrix obtained from $C$ by successively interchanging the $s$th and $i_s$th columns of $C$ for $1 \le s \le r$. Then $D$ has the form $\begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, where $B$ is a matrix of an appropriate size. As the $(1, 1)$ block of $D$ is an identity matrix, the block $(1, 2)$ can be made the zero matrix by application of column operations to $D$. This gives the required result.
The next result is a corollary of Theorem 2.3.8. It gives the solution set of a homogeneous system $Ax = 0$. One can also obtain this result as a particular case of Corollary 2.1.23.2 as, by definition, rank$(A) \le m$, the number of rows of $A$.
Corollary 2.3.9 Let $A$ be an $m \times n$ matrix. Suppose rank$(A) = r < n$. Then $Ax = 0$ has an infinite number of solutions. In particular, $Ax = 0$ has a non-trivial solution.
Proof. By Theorem 2.3.8, there exist elementary matrices $E_1, \ldots, E_s$ and $F_1, \ldots, F_\ell$ such that $E_1 E_2 \cdots E_s \, A \, F_1 F_2 \cdots F_\ell = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$. Define $P = E_1 E_2 \cdots E_s$ and $Q = F_1 F_2 \cdots F_\ell$. Then the matrix $PAQ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$. As the $E_i$'s, for $1 \le i \le s$, correspond only to row operations, we get $AQ = \begin{bmatrix} C & 0 \end{bmatrix}$, where $C$ is a matrix of size $m \times r$. Let $Q_1, Q_2, \ldots, Q_n$ be the columns of the matrix $Q$. Then check that $AQ_i = 0$ for $i = r + 1, \ldots, n$. Hence, the required result follows (use Theorem 2.1.5).
Exercise 2.3.10 1. Determine the ranks of the coefficient and the augmented matrices that appear in Exercise 2.1.25.2.
2. Let $P$ and $Q$ be invertible matrices such that the matrix product $PAQ$ is defined. Prove that rank$(PAQ)$ = rank$(A)$.
3. Let $A = \begin{bmatrix} 2 & 4 & 8 \\ 1 & 3 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$. Find $P$ and $Q$ such that $B = PAQ$.
4. Let $A$ and $B$ be two matrices. Prove that
(a) if $A + B$ is defined, then rank$(A + B) \le$ rank$(A)$ + rank$(B)$,
(b) if $AB$ is defined, then rank$(AB) \le$ rank$(A)$ and rank$(AB) \le$ rank$(B)$.
5. Let $A$ be a matrix of rank $r$. Then prove that there exist invertible matrices $B_i$, $C_i$ such that
\[ B_1 A = \begin{bmatrix} R_1 & R_2 \\ 0 & 0 \end{bmatrix},\quad A C_1 = \begin{bmatrix} S_1 & 0 \\ S_3 & 0 \end{bmatrix},\quad B_2 A C_2 = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad B_3 A C_3 = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \]
where the $(1, 1)$ block of each matrix is of size $r \times r$. Also, prove that $A_1$ is an invertible matrix.
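6. Let $A$ be an $m \times n$ matrix of rank $r$. Then prove that $A$ can be written as $A = BC$, where both $B$ and $C$ have rank $r$, $B$ is of size $m \times r$ and $C$ is of size $r \times n$.
7. Let $A$ and $B$ be two matrices such that $AB$ is defined and rank$(A)$ = rank$(AB)$. Then prove that $A = ABX$ for some matrix $X$. Similarly, if $BA$ is defined and rank$(A)$ = rank$(BA)$, then $A = YBA$ for some matrix $Y$. [Hint: Choose invertible matrices $P$, $Q$ satisfying $PAQ = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix}$ and $P(AB) = (PAQ)(Q^{-1}B) = \begin{bmatrix} A_2 & A_3 \\ 0 & 0 \end{bmatrix}$. Now find an invertible matrix $R$ with $P(AB)R = \begin{bmatrix} C & 0 \\ 0 & 0 \end{bmatrix}$. Define $X = R \begin{bmatrix} C^{-1}A_1 & 0 \\ 0 & 0 \end{bmatrix} Q^{-1}$.]
8. Suppose the matrices $B$ and $C$ are invertible and the involved partitioned products are defined. Then prove that
\[ \begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & C^{-1} \\ B^{-1} & -B^{-1}AC^{-1} \end{bmatrix}. \]
9. Suppose $A^{-1} = B$ with $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ and $B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$. Also, assume that $A_{11}$ is invertible and define $P = A_{22} - A_{21}A_{11}^{-1}A_{12}$. Then prove that
(a) $A$ is row-equivalent to the matrix $\begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} - A_{21}A_{11}^{-1}A_{12} \end{bmatrix}$,
(b) $P$ is invertible and $B = \begin{bmatrix} A_{11}^{-1} + (A_{11}^{-1}A_{12})P^{-1}(A_{21}A_{11}^{-1}) & -(A_{11}^{-1}A_{12})P^{-1} \\ -P^{-1}(A_{21}A_{11}^{-1}) & P^{-1} \end{bmatrix}$.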
We end this section by giving another equivalent condition for a square matrix to be invertible. To do so, we need the following definition.
Definition 2.3.11 An $n \times n$ matrix $A$ is said to be of full rank if rank$(A) = n$.
Theorem 2.3.12 Let $A$ be a square matrix of order $n$. Then the following statements are equivalent.
1. $A$ is invertible.
2. $A$ has full rank.
3. The row-reduced echelon form of $A$ is $I_n$.
Proof. $1 \Longrightarrow 2$
Suppose, if possible, that rank$(A) = r < n$. Then there exists an invertible matrix $P$ (a product of elementary matrices) such that $PA = \begin{bmatrix} B_1 & B_2 \\ 0 & 0 \end{bmatrix}$, where $B_1$ is an $r \times r$ matrix. Since $A$ is invertible, let $A^{-1} = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}$, where $C_1$ is an $r \times n$ matrix. Then
\[ P = P I_n = P(AA^{-1}) = (PA)A^{-1} = \begin{bmatrix} B_1 & B_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = \begin{bmatrix} B_1C_1 + B_2C_2 \\ 0 \end{bmatrix}. \]
Thus, $P$ has $n - r$ rows consisting of only zeros. Hence, $P$ cannot be invertible, a contradiction. Thus, $A$ is of full rank.
$2 \Longrightarrow 3$
Suppose $A$ is of full rank. This implies that the row-reduced echelon form of $A$ has all non-zero rows. But $A$ has as many columns as rows and therefore, the last row of the row-reduced echelon form of $A$ is $[0, 0, \ldots, 0, 1]$. Hence, the row-reduced echelon form of $A$ is $I_n$.
$3 \Longrightarrow 1$
Using Theorem 2.2.5.3, the required result follows.
2.4 Existence of Solution of Ax = b
In Section 2.2, we studied the system of linear equations in which the matrix A was a square matrix.
We will now use the rank of a matrix to study the system of linear equations even when A is not
a square matrix. Before proceeding with our main result, we give an example for motivation and
observations. Based on these observations, we will arrive at a better understanding, related to the
existence and uniqueness results for the linear system Ax = b.
Consider a linear system $Ax = b$. Suppose the application of the Gauss-Jordan method has reduced the augmented matrix $[A \;\; b]$ to
\[ [C \;\; d] = \begin{bmatrix} 1 & 0 & 2 & -1 & 0 & 0 & 2 & 8 \\ 0 & 1 & 1 & 3 & 0 & 0 & 5 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & -1 & 2 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \]
Then to get the solution set, we observe the following.
Observations:
1. The number of non-zero rows in $C$ is 4. This number is also equal to the number of non-zero rows in $[C \;\; d]$. So, there are 4 leading columns/basic variables.
2. The leading terms appear in columns 1, 2, 5 and 6. Thus, the respective variables $x_1, x_2, x_5$ and $x_6$ are the basic variables.
3. The remaining variables, $x_3, x_4$ and $x_7$, are free variables.
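Hence, the solution set is given by
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix} = \begin{bmatrix} 8 - 2x_3 + x_4 - 2x_7 \\ 1 - x_3 - 3x_4 - 5x_7 \\ x_3 \\ x_4 \\ 2 + x_7 \\ 4 - x_7 \\ x_7 \end{bmatrix} = \begin{bmatrix} 8 \\ 1 \\ 0 \\ 0 \\ 2 \\ 4 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} -2 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ -3 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_7 \begin{bmatrix} -2 \\ -5 \\ 0 \\ 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \]
where $x_3, x_4$ and $x_7$ are arbitrary.
Let $x_0 = (8, 1, 0, 0, 2, 4, 0)^t$, $u_1 = (-2, -1, 1, 0, 0, 0, 0)^t$, $u_2 = (1, -3, 0, 1, 0, 0, 0)^t$ and $u_3 = (-2, -5, 0, 0, 1, -1, 1)^t$.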
Then it can easily be verified that $Cx_0 = d$, and for $1 \le i \le 3$, $Cu_i = 0$. Since $[C \;\; d]$ is obtained from $[A \;\; b]$ by row operations, it follows that $Ax_0 = b$, and for $1 \le i \le 3$, $Au_i = 0$.
A similar idea is used in the proof of the next theorem, and hence the proof is omitted here. The proof appears on page 74 as Theorem 3.3.26.
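The verification mentioned above can be carried out mechanically. The following sketch assumes the reduced matrix $C$, the vector $d$ and the vectors $x_0, u_1, u_2, u_3$ of the preceding example (with the signs as reconstructed from the solution set); the helper names `apply` and `general_solution` are hypothetical.

```python
C = [
    [1, 0, 2, -1, 0, 0,  2],
    [0, 1, 1,  3, 0, 0,  5],
    [0, 0, 0,  0, 1, 0, -1],
    [0, 0, 0,  0, 0, 1,  1],
    [0, 0, 0,  0, 0, 0,  0],
    [0, 0, 0,  0, 0, 0,  0],
]
d = [8, 1, 2, 4, 0, 0]

x0 = [8, 1, 0, 0, 2, 4, 0]
u1 = [-2, -1, 1, 0, 0, 0, 0]
u2 = [1, -3, 0, 1, 0, 0, 0]
u3 = [-2, -5, 0, 0, 1, -1, 1]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def general_solution(k1, k2, k3):
    """x0 + k1*u1 + k2*u2 + k3*u3 for chosen values of the free variables."""
    return [p + k1 * a + k2 * b + k3 * c for p, a, b, c in zip(x0, u1, u2, u3)]

print(apply(C, x0) == d)                       # particular solution
print([apply(C, u) for u in (u1, u2, u3)])     # homogeneous solutions
print(apply(C, general_solution(3, -2, 5)) == d)
```

Any choice of the three parameters gives a solution, which is exactly what it means for $x_3, x_4, x_7$ to be free variables.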
Theorem 2.4.1 (Existence/Non-Existence Result) Consider a linear system $Ax = b$, where $A$ is an $m \times n$ matrix, and $x$, $b$ are vectors of orders $n \times 1$ and $m \times 1$, respectively. Suppose rank$(A) = r$ and rank$([A \;\; b]) = r_a$. Then exactly one of the following statements holds:
1. If $r < r_a$, the linear system has no solution.
2. If $r_a = r$, then the linear system is consistent. Furthermore,
(a) if $r = n$ then the solution set contains a unique vector $x_0$ satisfying $Ax_0 = b$.
(b) if $r < n$ then the solution set has the form
\[ \{ x_0 + k_1u_1 + k_2u_2 + \cdots + k_{n-r}u_{n-r} : k_i \in \mathbb{R},\ 1 \le i \le n - r \}, \]
where $Ax_0 = b$ and $Au_i = 0$ for $1 \le i \le n - r$.
Remark 2.4.2 Let $A$ be an $m \times n$ matrix. Then Theorem 2.4.1 implies that
1. the linear system $Ax = b$ is consistent if and only if rank$(A)$ = rank$([A \;\; b])$.
2. the vectors $u_i$, for $1 \le i \le n - r$, correspond to each of the free variables.
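The case analysis of Theorem 2.4.1 can be sketched as a small classifier that compares rank$(A)$, rank$([A \;\; b])$ and the number of unknowns. The names `row_rank` and `classify` are assumptions made for this illustration; the rank is computed by row reduction with exact fractions.

```python
from fractions import Fraction

def row_rank(a):
    """Number of non-zero rows in a row echelon form of `a`."""
    rows = [[Fraction(x) for x in row] for row in a]
    m, n = len(rows), len(rows[0])
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, m) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, m):
            f = rows[r][col] / rows[rank][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == m:
            break
    return rank

def classify(a, b):
    """Classify Ax = b using r = rank(A) and r_a = rank([A b])."""
    n = len(a[0])                                   # number of unknowns
    r = row_rank(a)
    r_a = row_rank([row + [bi] for row, bi in zip(a, b)])
    if r < r_a:
        return "no solution"
    return "unique solution" if r == n else "infinitely many solutions"

print(classify([[1, 1], [1, 1]], [1, 2]))
print(classify([[1, 1], [1, -1]], [2, 0]))
print(classify([[1, 1], [2, 2]], [1, 2]))
```

The three calls correspond to two parallel lines, two intersecting lines and two coincident lines in the plane.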
Exercise 2.4.3 In the introduction, we gave 3 figures (see Figure 2) to show the cases that arise in the Euclidean plane (2 equations in 2 unknowns). It is well known that in the case of Euclidean space (3 equations in 3 unknowns), there
1. is a figure to indicate the system has a unique solution.
2. are 4 distinct figures to indicate the system has no solution.
3. are 3 distinct figures to indicate the system has an infinite number of solutions.
Determine all the figures.
2.5 Determinant
In this section, we associate a number with each square matrix. To do so, we start with the following notation. Let $A$ be an $n \times n$ matrix. Then for positive integers $\alpha_i$, $1 \le i \le k$, and $\beta_j$, $1 \le j \le \ell$, we write $A(\alpha_1, \ldots, \alpha_k \mid \beta_1, \ldots, \beta_\ell)$ to mean the submatrix of $A$ that is obtained by deleting the rows corresponding to the $\alpha_i$'s and the columns corresponding to the $\beta_j$'s of $A$.
Example 2.5.1 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \\ 2 & 4 & 7 \end{bmatrix}$. Then $A(1 \mid 2) = \begin{bmatrix} 1 & 2 \\ 2 & 7 \end{bmatrix}$, $A(1 \mid 3) = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$ and $A(1, 2 \mid 1, 3) = [4]$.
With the notations as above, we have the following inductive definition of the determinant of a matrix. This definition is commonly known as the expansion of the determinant along the first row. The students with a knowledge of symmetric groups/permutations can find the definition of the determinant in Appendix 7.1.15. It is also proved in the Appendix that the definition given below does correspond to the expansion of the determinant along the first row.
Definition 2.5.2 (Determinant of a Square Matrix) Let $A$ be a square matrix of order $n$. The determinant of $A$, denoted $\det(A)$ (or $|A|$), is defined by
\[ \det(A) = \begin{cases} a, & \text{if } A = [a] \ (n = 1), \\ \sum\limits_{j=1}^{n} (-1)^{1+j} a_{1j} \det\bigl(A(1 \mid j)\bigr), & \text{otherwise.} \end{cases} \]
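The inductive definition translates directly into a recursive function. The sketch below is illustrative only (the names `minor` and `det` are assumptions); indices are 0-based in the code, so the sign $(-1)^{1+j}$ of the definition becomes `(-1) ** j`.

```python
def minor(a, i, j):
    """Submatrix A(i|j): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

print(minor([[1, 2, 3], [1, 3, 2], [2, 4, 7]], 0, 1))
print(det([[1, 2], [3, 5]]))
print(det([[1, 2, 3], [2, 3, 1], [1, 2, 2]]))
```

This recursion performs on the order of $n!$ multiplications, so it is useful for understanding the definition rather than for large matrices.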
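Example 2.5.3 1. Let $A = [2]$. Then $\det(A) = |A| = 2$.
2. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then, $\det(A) = |A| = a \det\bigl(A(1 \mid 1)\bigr) - b \det\bigl(A(1 \mid 2)\bigr) = ad - bc$. For example, if $A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$ then $\det(A) = \begin{vmatrix} 1 & 2 \\ 3 & 5 \end{vmatrix} = 1 \cdot 5 - 2 \cdot 3 = -1$.
3. Let $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then,
\[ \begin{aligned} \det(A) = |A| &= a_{11} \det(A(1 \mid 1)) - a_{12} \det(A(1 \mid 2)) + a_{13} \det(A(1 \mid 3)) \\ &= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\ &= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}). \end{aligned} \tag{2.5.1} \]
Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$. Then $|A| = 1 \cdot \begin{vmatrix} 3 & 1 \\ 2 & 2 \end{vmatrix} - 2 \cdot \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} + 3 \cdot \begin{vmatrix} 2 & 3 \\ 1 & 2 \end{vmatrix} = 4 - 2(3) + 3(1) = 1$.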
Exercise 2.5.4 Find the determinant of the following matrices.
i) $\begin{bmatrix} 1 & 2 & 7 & 8 \\ 0 & 4 & 3 & 2 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 5 \end{bmatrix}$, ii) $\begin{bmatrix} 3 & 0 & 0 & 1 \\ 0 & 2 & 0 & 5 \\ 6 & 7 & 1 & 0 \\ 3 & 2 & 0 & 6 \end{bmatrix}$, iii) $\begin{bmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{bmatrix}$.
Definition 2.5.5 (Singular, Non-Singular) A matrix $A$ is said to be singular if $\det(A) = 0$. It is called non-singular if $\det(A) \neq 0$.
We omit the proof of the next theorem that relates the determinant of a square matrix with row operations. The interested reader is advised to go through Appendix 7.2.
Theorem 2.5.6 Let $A$ be an $n \times n$ matrix. If
1. $B$ is obtained from $A$ by interchanging two rows then $\det(B) = -\det(A)$,
2. $B$ is obtained from $A$ by multiplying a row by $c$ then $\det(B) = c \det(A)$,
3. $B$ is obtained from $A$ by replacing the $j$th row by the $j$th row plus $c$ times the $i$th row, where $i \neq j$, then $\det(B) = \det(A)$,
4. all the elements of one row or column of $A$ are 0 then $\det(A) = 0$,
5. two rows of $A$ are equal then $\det(A) = 0$,
6. $A$ is a triangular matrix then $\det(A)$ is the product of the diagonal entries.
Since $\det(I_n) = 1$, where $I_n$ is the $n \times n$ identity matrix, the following remark gives the determinant of the elementary matrices. The proof is omitted as it is a direct application of Theorem 2.5.6.
Remark 2.5.7 Fix a positive integer $n$. Then
1. $\det(E_{ij}) = -1$, where $E_{ij}$ corresponds to the interchange of the $i$th and the $j$th row of $I_n$.
2. For $c \neq 0$, $\det(E_k(c)) = c$, where $E_k(c)$ is obtained by multiplying the $k$th row of $I_n$ by $c$.
3. For $c \neq 0$, $\det(E_{ij}(c)) = 1$, where $E_{ij}(c)$ is obtained by replacing the $j$th row of $I_n$ by the $j$th row of $I_n$ plus $c$ times the $i$th row of $I_n$.
Remark 2.5.8 Theorem 2.5.6.1 implies that one can also calculate the determinant by expanding along any row. Hence, the computation of the determinant using the $k$th row, for $1 \le k \le n$, is given by
\[ \det(A) = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det\bigl(A(k \mid j)\bigr). \]
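Example 2.5.9 1. Let $A = \begin{bmatrix} 2 & 2 & 6 \\ 1 & 3 & 2 \\ 1 & 1 & 2 \end{bmatrix}$. Determine $\det(A)$.
Solution: Check that
\[ \begin{vmatrix} 2 & 2 & 6 \\ 1 & 3 & 2 \\ 1 & 1 & 2 \end{vmatrix} \xrightarrow{R_1(2)} \begin{vmatrix} 1 & 1 & 3 \\ 1 & 3 & 2 \\ 1 & 1 & 2 \end{vmatrix} \xrightarrow{R_{21}(-1),\ R_{31}(-1)} \begin{vmatrix} 1 & 1 & 3 \\ 0 & 2 & -1 \\ 0 & 0 & -1 \end{vmatrix}. \]
Thus, using Theorem 2.5.6, $\det(A) = 2 \cdot 1 \cdot 2 \cdot (-1) = -4$.
2. Let $A = \begin{bmatrix} 2 & 2 & 6 & 8 \\ 1 & 1 & 2 & 4 \\ 1 & 3 & 2 & 6 \\ 3 & 3 & 5 & 8 \end{bmatrix}$. Determine $\det(A)$.
Solution: The successive application of the row operations $R_1(2)$, $R_{21}(-1)$, $R_{31}(-1)$, $R_{41}(-3)$, $R_{23}$ and $R_{34}(-4)$ and the application of Theorem 2.5.6 implies
\[ \det(A) = 2 \cdot (-1) \cdot \begin{vmatrix} 1 & 1 & 3 & 4 \\ 0 & 2 & -1 & 2 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -4 \end{vmatrix} = -16. \]
Observe that the row operation $R_1(2)$ gives 2 as the first product and the row operation $R_{23}$ gives $-1$ as the second product.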
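The computations in Example 2.5.9 follow a general recipe: reduce the matrix to triangular form, keeping track of how each row operation changes the determinant (Theorem 2.5.6). A minimal sketch, with the hypothetical name `det_by_elimination`, is given below. Unlike the example, it does not scale rows, so only row interchanges (each contributing a factor of $-1$) need to be recorded.

```python
from fractions import Fraction

def det_by_elimination(a):
    """Determinant via reduction to upper triangular form."""
    rows = [[Fraction(x) for x in row] for row in a]
    n = len(rows)
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)        # no pivot in this column: determinant is 0
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign              # a row interchange changes the sign
        for r in range(col + 1, n):
            # Adding a multiple of one row to another leaves det unchanged.
            f = rows[r][col] / rows[col][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= rows[i][i]          # product of the diagonal entries
    return result

print(det_by_elimination([[2, 2, 6], [1, 3, 2], [1, 1, 2]]))
print(det_by_elimination([[2, 2, 6, 8], [1, 1, 2, 4], [1, 3, 2, 6], [3, 3, 5, 8]]))
```

This approach needs on the order of $n^3$ arithmetic operations, in contrast to the factorial growth of the expansion along a row.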
Remark 2.5.10 1. Let $u^t = (u_1, u_2)$ and $v^t = (v_1, v_2)$ be two vectors in $\mathbb{R}^2$. Consider the parallelogram on vertices $P = (0, 0)^t$, $Q = u$, $R = u + v$ and $S = v$ (see Figure 3). Then Area$(PQRS) = |u_1v_2 - u_2v_1|$, the absolute value of $\begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}$.
Figure 3: Parallelepiped with vertices P, Q, R and S as base
Recall the following: The dot product of $u^t = (u_1, u_2)$ and $v^t = (v_1, v_2)$, denoted $u \cdot v$, equals $u \cdot v = u_1v_1 + u_2v_2$, and the length of a vector $u$, denoted $\ell(u)$, equals $\ell(u) = \sqrt{u_1^2 + u_2^2}$. Also, if $\theta$ is the angle between $u$ and $v$ then we know that $\cos(\theta) = \dfrac{u \cdot v}{\ell(u)\ell(v)}$. Therefore
\[ \text{Area}(PQRS) = \ell(u)\ell(v)\sin(\theta) = \ell(u)\ell(v)\sqrt{1 - \left(\frac{u \cdot v}{\ell(u)\ell(v)}\right)^2} = \sqrt{\ell(u)^2\,\ell(v)^2 - (u \cdot v)^2} = \sqrt{(u_1v_2 - u_2v_1)^2} = |u_1v_2 - u_2v_1|. \]
That is, in $\mathbb{R}^2$, the determinant is $\pm$ times the area of the parallelogram.
2. Consider Figure 3 again. Let $u^t = (u_1, u_2, u_3)$, $v^t = (v_1, v_2, v_3)$ and $w^t = (w_1, w_2, w_3)$ be three vectors in $\mathbb{R}^3$. Then $u \cdot v = u_1v_1 + u_2v_2 + u_3v_3$ and the cross product of $u$ and $v$, denoted $u \times v$, equals
\[ u \times v = (u_2v_3 - u_3v_2,\ u_3v_1 - u_1v_3,\ u_1v_2 - u_2v_1). \]
The vector $u \times v$ is perpendicular to the plane containing both $u$ and $v$. Note that if $u_3 = v_3 = 0$, then we can think of $u$ and $v$ as vectors in the $XY$-plane and in this case $\ell(u \times v) = |u_1v_2 - u_2v_1| = \text{Area}(PQRS)$. Hence, if $\gamma$ is the angle between the vector $w$ and the vector $u \times v$, then
\[ \text{volume}(P) = \text{Area}(PQRS) \cdot \text{height} = |w \cdot (u \times v)| = \pm \begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}. \]
In general, for any $n \times n$ matrix $A$, it can be proved that $|\det(A)|$ is indeed equal to the volume of the $n$-dimensional parallelepiped. The actual proof is beyond the scope of this book.
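The identities in Remark 2.5.10 can be checked numerically. The sketch below uses hypothetical helper names (`dot`, `cross`, `det3`, `parallelogram_area`, `parallelepiped_volume`) and integer vectors chosen only for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(r1, r2, r3):
    """Determinant of the 3 x 3 matrix with rows r1, r2, r3 (first-row expansion)."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))

def parallelogram_area(u, v):
    """Area of the parallelogram spanned by u, v in R^2."""
    return abs(u[0] * v[1] - u[1] * v[0])

def parallelepiped_volume(u, v, w):
    """Volume of the parallelepiped spanned by u, v, w in R^3."""
    return abs(dot(w, cross(u, v)))

u, v, w = (1, 2, 3), (2, 3, 1), (1, 2, 2)
print(parallelogram_area((3, 0), (1, 2)))
print(dot(w, cross(u, v)), det3(w, u, v))
print(parallelepiped_volume(u, v, w))
```

The second line prints the same number twice, reflecting the identity $w \cdot (u \times v) = \det$ of the matrix with rows $w, u, v$.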
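Exercise 2.5.11 In each of the questions given below, use Theorem 2.5.6 to arrive at your answer.
1. Let $A = \begin{bmatrix} a & b & c \\ e & f & g \\ h & j & \ell \end{bmatrix}$, $B = \begin{bmatrix} a & b & \alpha c \\ e & f & \alpha g \\ h & j & \alpha \ell \end{bmatrix}$ and $C = \begin{bmatrix} a & b & \alpha a + \beta b + c \\ e & f & \alpha e + \beta f + g \\ h & j & \alpha h + \beta j + \ell \end{bmatrix}$ for some complex numbers $\alpha$ and $\beta$. Prove that $\det(B) = \alpha \det(A)$ and $\det(C) = \det(A)$.
2. Let $A = \begin{bmatrix} 1 & 3 & 2 \\ 2 & 3 & 1 \\ 1 & 5 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$. Prove that 3 divides $\det(A)$ and $\det(B) = 0$.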
2.5.1 Adjoint of a Matrix
Definition 2.5.12 (Minor, Cofactor of a Matrix) The number $\det(A(i \mid j))$ is called the $(i, j)$th minor of $A$. We write $A_{ij} = \det(A(i \mid j))$. The $(i, j)$th cofactor of $A$, denoted $C_{ij}$, is the number $(-1)^{i+j} A_{ij}$.
Definition 2.5.13 (Adjoint of a Matrix) Let $A$ be an $n \times n$ matrix. The matrix $B = [b_{ij}]$ with $b_{ij} = C_{ji}$, for $1 \le i, j \le n$, is called the Adjoint of $A$, denoted Adj$(A)$.
Example 2.5.14 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$. Then Adj$(A) = \begin{bmatrix} 4 & 2 & -7 \\ -3 & -1 & 5 \\ 1 & 0 & -1 \end{bmatrix}$ as
$C_{11} = (-1)^{1+1}A_{11} = 4$, $C_{21} = (-1)^{2+1}A_{21} = 2$, $\ldots$, $C_{33} = (-1)^{3+3}A_{33} = -1$.
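The cofactor and adjoint definitions can be sketched in code as follows. The names `minor`, `det` and `adjugate` are assumptions made for this illustration, indices in the code are 0-based, and the sketch assumes a matrix of order at least 2.

```python
def minor(a, i, j):
    """Submatrix A(i|j): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Adj(A): the (i, j) entry is the cofactor C_ji (assumes order >= 2)."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2, 3], [2, 3, 1], [1, 2, 2]]
print(adjugate(A))
print(matmul(A, adjugate(A)))   # should be det(A) times the identity matrix
```

Since $\det(A) = 1$ for this matrix, the product of $A$ with its adjoint is the identity matrix itself.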
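Theorem 2.5.15 Let $A$ be an $n \times n$ matrix. Then
1. for $1 \le i \le n$, $\sum\limits_{j=1}^{n} a_{ij}C_{ij} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{i+j}A_{ij} = \det(A)$,
2. for $i \neq \ell$, $\sum\limits_{j=1}^{n} a_{ij}C_{\ell j} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{\ell+j}A_{\ell j} = 0$, and
3. $A(\text{Adj}(A)) = \det(A) I_n$. Thus, whenever $\det(A) \neq 0$ one has
\[ A^{-1} = \frac{1}{\det(A)} \text{Adj}(A). \]
Proof. Part 1: It directly follows from Remark 2.5.8 and the definition of the cofactor.
Part 2: Fix positive integers $i, \ell$ with $1 \le i \neq \ell \le n$, and let $B = [b_{ij}]$ be a square matrix whose $\ell$th row equals the $i$th row of $A$ and the remaining rows of $B$ are the same as those of $A$.
Then by construction, the $i$th and $\ell$th rows of $B$ are equal. Thus, by Theorem 2.5.6.5, $\det(B) = 0$. As $A(\ell \mid j) = B(\ell \mid j)$ for $1 \le j \le n$, using Remark 2.5.8, we have
\[ \begin{aligned} 0 = \det(B) &= \sum_{j=1}^{n} (-1)^{\ell+j} b_{\ell j} \det\bigl(B(\ell \mid j)\bigr) = \sum_{j=1}^{n} (-1)^{\ell+j} a_{ij} \det\bigl(B(\ell \mid j)\bigr) \\ &= \sum_{j=1}^{n} (-1)^{\ell+j} a_{ij} \det\bigl(A(\ell \mid j)\bigr) = \sum_{j=1}^{n} a_{ij}C_{\ell j}. \end{aligned} \tag{2.5.3} \]
This completes the proof of Part 2.
Part 3: Using Equation (2.5.3) and Remark 2.5.8, observe that
\[ \bigl[A\,\text{Adj}(A)\bigr]_{ij} = \sum_{k=1}^{n} a_{ik}\bigl(\text{Adj}(A)\bigr)_{kj} = \sum_{k=1}^{n} a_{ik}C_{jk} = \begin{cases} 0, & \text{if } i \neq j, \\ \det(A), & \text{if } i = j. \end{cases} \]
Thus, $A(\text{Adj}(A)) = \det(A) I_n$. Therefore, if $\det(A) \neq 0$ then $A\left(\dfrac{1}{\det(A)}\text{Adj}(A)\right) = I_n$. Hence, by Theorem 2.2.6,
\[ A^{-1} = \frac{1}{\det(A)} \text{Adj}(A). \]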

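Example 2.5.16 For $A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 1 \\ 1 & 2 & 1 \end{bmatrix}$, Adj$(A) = \begin{bmatrix} -1 & 1 & -1 \\ 1 & 1 & -1 \\ -1 & -3 & 1 \end{bmatrix}$ and $\det(A) = -2$. Thus, by Theorem 2.5.15.3, $A^{-1} = \begin{bmatrix} 1/2 & -1/2 & 1/2 \\ -1/2 & -1/2 & 1/2 \\ 1/2 & 3/2 & -1/2 \end{bmatrix}$.
The next corollary is a direct consequence of Theorem 2.5.15.3 and hence the proof is omitted.
Corollary 2.5.17 Let $A$ be a non-singular matrix. Then
\[ \bigl(\text{Adj}(A)\bigr) A = \det(A) I_n \quad \text{and} \quad \sum_{i=1}^{n} a_{ij}C_{ik} = \begin{cases} \det(A), & \text{if } j = k, \\ 0, & \text{if } j \neq k. \end{cases} \]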
The next result gives another equivalent condition for a square matrix to be invertible.
Theorem 2.5.18 A square matrix $A$ is non-singular if and only if $A$ is invertible.
Proof. Let $A$ be non-singular. Then $\det(A) \neq 0$ and hence $A^{-1} = \dfrac{1}{\det(A)}\text{Adj}(A)$.
Now, let us assume that $A$ is invertible. Then, using Theorem 2.2.5, $A = E_1 E_2 \cdots E_k$, a product of elementary matrices. Also, by Remark 2.5.7, $\det(E_i) \neq 0$ for each $i$, $1 \le i \le k$. Thus, a repeated application of the first three parts of Theorem 2.5.6 gives $\det(A) \neq 0$. Hence, the required result follows.
We are now ready to prove a very important result that relates the determinant of a product of two matrices with their determinants.
Theorem 2.5.19 Let $A$ and $B$ be square matrices of order $n$. Then
\[ \det(AB) = \det(A)\det(B) = \det(BA). \]
Proof. Step 1. Let $A$ be non-singular. Then by Theorem 2.5.15.3, $A$ is invertible. Hence, using Theorem 2.2.5, $A = E_1 E_2 \cdots E_k$, a product of elementary matrices. Then a repeated application of the first three parts of Theorem 2.5.6 gives
\[ \begin{aligned} \det(AB) &= \det(E_1 E_2 \cdots E_k B) = \det(E_1)\det(E_2 \cdots E_k B) \\ &= \det(E_1)\det(E_2)\det(E_3 \cdots E_k B) \\ &= \det(E_1 E_2)\det(E_3)\det(E_4 \cdots E_k B) \\ &\ \ \vdots \\ &= \det(E_1 E_2 \cdots E_k)\det(B) = \det(A)\det(B). \end{aligned} \]
Thus, if $A$ is non-singular then $\det(AB) = \det(A)\det(B)$. This will be used in the second step.
Step 2. Let $A$ be singular. Then, using Theorem 2.5.18, $A$ is not invertible. Hence, there exists an invertible matrix $P$ such that $PA = C$, where $C = \begin{bmatrix} C_1 \\ 0 \end{bmatrix}$. So, $A = P^{-1}C$ and therefore
\[ \begin{aligned} \det(AB) &= \det\bigl((P^{-1}C)B\bigr) = \det\bigl(P^{-1}(CB)\bigr) = \det\left(P^{-1}\begin{bmatrix} C_1B \\ 0 \end{bmatrix}\right) \\ &= \det(P^{-1}) \cdot \det\left(\begin{bmatrix} C_1B \\ 0 \end{bmatrix}\right) \quad \text{as } P^{-1} \text{ is non-singular} \\ &= \det(P^{-1}) \cdot 0 = 0 = 0 \cdot \det(B) = \det(A)\det(B). \end{aligned} \]
Thus, the proof of the theorem is complete.
The next result relates the determinant of a matrix with the determinant of its transpose. As an application of this result, the determinant can be computed by expanding along any column as well.
Theorem 2.5.20 Let $A$ be a square matrix. Then $\det(A) = \det(A^t)$.
Proof. If $A$ is non-singular, Corollary 2.5.17 gives $\det(A) = \det(A^t)$.
If $A$ is singular, then by Theorem 2.5.18, $A$ is not invertible. Therefore, $A^t$ is also not invertible (as $A^t$ invertible implies $A^{-1} = \bigl((A^t)^{-1}\bigr)^t$). Thus, using Theorem 2.5.18 again, $\det(A^t) = 0 = \det(A)$. Hence the required result follows.
2.5.2 Cramer's Rule
Let $A$ be a square matrix. Then using Theorem 2.2.10 and Theorem 2.5.18, one has the following result.
Theorem 2.5.21 Let $A$ be a square matrix. Then the following statements are equivalent:
1. $A$ is invertible.
2. The linear system $Ax = b$ has a unique solution for every $b$.
3. $\det(A) \neq 0$.
Thus, $Ax = b$ has a unique solution for every $b$ if and only if $\det(A) \neq 0$. The next theorem gives a direct method of finding the solution of the linear system $Ax = b$ when $\det(A) \neq 0$.
Theorem 2.5.22 (Cramer's Rule) Let $A$ be an $n \times n$ matrix. If $\det(A) \neq 0$ then the unique solution of the linear system $Ax = b$ is
\[ x_j = \frac{\det(A_j)}{\det(A)}, \quad \text{for } j = 1, 2, \ldots, n, \]
where $A_j$ is the matrix obtained from $A$ by replacing the $j$th column of $A$ by the column vector $b$.
Proof. Since $\det(A) \neq 0$, $A^{-1} = \dfrac{1}{\det(A)}\text{Adj}(A)$. Thus, the linear system $Ax = b$ has the solution $x = \dfrac{1}{\det(A)}\text{Adj}(A)\,b$. Hence $x_j$, the $j$th coordinate of $x$, is given by
\[ x_j = \frac{b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj}}{\det(A)} = \frac{\det(A_j)}{\det(A)}. \]
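In Theorem 2.5.22, $A_1 = \begin{bmatrix} b_1 & a_{12} & \cdots & a_{1n} \\ b_2 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_n & a_{n2} & \cdots & a_{nn} \end{bmatrix}$, $A_2 = \begin{bmatrix} a_{11} & b_1 & a_{13} & \cdots & a_{1n} \\ a_{21} & b_2 & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & b_n & a_{n3} & \cdots & a_{nn} \end{bmatrix}$ and so on till $A_n = \begin{bmatrix} a_{11} & \cdots & a_{1\,n-1} & b_1 \\ a_{21} & \cdots & a_{2\,n-1} & b_2 \\ \vdots & \ddots & \vdots & \vdots \\ a_{n1} & \cdots & a_{n\,n-1} & b_n \end{bmatrix}$.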
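Example 2.5.23 Solve $Ax = b$ using Cramer's rule, where $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.
Solution: Check that $\det(A) = 1$ and $x^t = (-1, 1, 0)$ as
\[ x_1 = \begin{vmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 2 \end{vmatrix} = -1, \quad x_2 = \begin{vmatrix} 1 & 1 & 3 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{vmatrix} = 1, \quad \text{and} \quad x_3 = \begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 2 & 1 \end{vmatrix} = 0. \]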
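Cramer's rule can be sketched as follows; the names `det` and `cramer` are assumptions introduced for this illustration, and the determinant is computed by the first-row expansion of Definition 2.5.2. The input is assumed to consist of integers so that exact fractions can be formed.

```python
from fractions import Fraction

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, b):
    """Solve Ax = b by Cramer's rule; requires det(A) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    solution = []
    for j in range(len(a)):
        # A_j: replace the j-th column of A by the column vector b.
        a_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_j), d))
    return solution

print(cramer([[1, 2, 3], [2, 3, 1], [1, 2, 2]], [1, 1, 1]))
```

The call uses the data of Example 2.5.23. Cramer's rule requires $n + 1$ determinants, so for larger systems the Gauss elimination method is usually preferable.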
2.6 Miscellaneous Exercises
Exercise 2.6.1 1. Let $A$ be an orthogonal matrix. Prove that $\det A = \pm 1$.
2. Prove that every $2 \times 2$ matrix $A$ satisfying tr$(A) = 0$ and $\det(A) = 0$ is a nilpotent matrix.
3. Let $A$ and $B$ be two non-singular matrices. Are the matrices $A + B$ and $AB$ non-singular?
4. Let $A$ be an $n \times n$ matrix. Prove that the following statements are equivalent:
(a) $A$ is not invertible.
(b) rank$(A) \neq n$.
(c) $\det(A) = 0$.
(d) $A$ is not row-equivalent to $I_n$.
(e) The homogeneous system $Ax = 0$ has a non-trivial solution.
(f) The system $Ax = b$ is either inconsistent or it is consistent and in this case it has an infinite number of solutions.
(g) $A$ is not a product of elementary matrices.
5. For what value(s) of $\lambda$ do the following systems have non-trivial solutions? Also, for each value of $\lambda$, determine a non-trivial solution.
(a) $(\lambda - 2)x + y = 0$, $x + (\lambda + 2)y = 0$.
(b) $\lambda x + 3y = 0$, $(\lambda + 6)y = 0$.
6. Let $x_1, x_2, \ldots, x_n$ be fixed real numbers and define $A = [a_{ij}]_{n \times n}$ with $a_{ij} = x_i^{\,j-1}$. Prove that $\det(A) = \prod\limits_{1 \le i < j \le n} (x_j - x_i)$. This matrix is usually called the Vandermonde matrix.
7. Let $A = [a_{ij}]_{n \times n}$ with $a_{ij} = \max\{i, j\}$. Prove that $\det A = (-1)^{n-1} n$.
8. Let $A = [a_{ij}]_{n \times n}$ with $a_{ij} = \dfrac{1}{i + j - 1}$. Using induction, prove that $A$ is invertible. This matrix is commonly known as the Hilbert matrix.
9. Solve the following systems of equations by Cramer's rule.
i) $x + y + z - w = 1$, $x + y - z + w = 2$, $2x + y + z - w = 7$, $x + y + z + w = 3$.
ii) $x - y + z - w = 1$, $x + y - z + w = 2$, $2x + y - z - w = 7$, $x - y - z + w = 3$.
10. Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are two $n \times n$ matrices with $b_{ij} = p^{\,i-j} a_{ij}$ for $1 \le i, j \le n$, for some non-zero $p \in \mathbb{R}$. Then compute $\det(B)$ in terms of $\det(A)$.
11. The position of an element $a_{ij}$ of a determinant is called even or odd according as $i + j$ is even or odd. Show that
(a) if all the entries in odd positions are multiplied by $-1$ then the value of the determinant doesn't change.
(b) if all entries in even positions are multiplied by $-1$ then the determinant
i. does not change if the matrix is of even order.
ii. is multiplied by $-1$ if the matrix is of odd order.
12. Let $A$ be a Hermitian matrix (that is, $A^* = \overline{A^t} = A$). Prove that $\det A$ is a real number.
13. Let $A$ be an $n \times n$ matrix. Then $A$ is invertible if and only if Adj$(A)$ is invertible.
14. Let $A$ and $B$ be invertible matrices. Prove that Adj$(AB)$ = Adj$(B)$Adj$(A)$.
15. Let $P = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ be a rectangular matrix with $A$ a square matrix of order $n$ and $|A| \neq 0$. Then show that rank$(P) = n$ if and only if $D = CA^{-1}B$.
2.7 Summary
In this chapter, we started with a system of linear equations $Ax = b$ and related it to the augmented matrix $[A \mid b]$. We applied row operations to $[A \mid b]$ to get its row echelon form and the row-reduced echelon form. Depending on the row echelon matrix, say $[C \mid d]$, thus obtained, we had the following result:
1. If $[C \mid d]$ has a row of the form $[0 \;\cdots\; 0 \mid 1]$ then the linear system $Ax = b$ has no solution.
2. Suppose $[C \mid d]$ does not have any row of the form $[0 \;\cdots\; 0 \mid 1]$. Then the linear system $Ax = b$ has at least one solution.
(a) If the number of leading terms equals the number of unknowns then the system $Ax = b$ has a unique solution.
(b) If the number of leading terms is less than the number of unknowns then the system $Ax = b$ has an infinite number of solutions.
The following conditions are equivalent for an $n \times n$ matrix $A$.
1. $A$ is invertible.
2. The homogeneous system $Ax = 0$ has only the trivial solution.
3. The row-reduced echelon form of $A$ is $I$.
4. $A$ is a product of elementary matrices.
5. The system $Ax = b$ has a unique solution for every $b$.
6. The system $Ax = b$ has a solution for every $b$.
7. rank$(A) = n$.
8. $\det(A) \neq 0$.
Suppose the matrix $A$ in the linear system $Ax = b$ is of size $m \times n$. Then exactly one of the following statements holds:
1. if rank$(A)$ < rank$([A \mid b])$, then the system $Ax = b$ has no solution.
2. if rank$(A)$ = rank$([A \mid b])$, then the system $Ax = b$ is consistent. Furthermore,
(a) if rank$(A) = n$ then the system $Ax = b$ has a unique solution.
(b) if rank$(A) < n$ then the system $Ax = b$ has an infinite number of solutions.
We also dealt with the following types of problems:
1. Solving the linear system $Ax = b$. In the next chapter, we will see that this leads us to the question "is the vector $b$ a linear combination of the columns of $A$?"
2. Solving the linear system $Ax = 0$. In the next chapter, we will see that this leads us to the question "are the columns of $A$ linearly independent/dependent?"
(a) If $Ax = 0$ has a unique solution, the trivial solution, then the columns of $A$ are linearly independent.
(b) If $Ax = 0$ has an infinite number of solutions then the columns of $A$ are linearly dependent.
3. Let $b^t = [b_1, b_2, \ldots, b_m]$. Find conditions on the $b_i$'s such that the linear system $Ax = b$ always has a solution. Observe that for different choices of $x$ the vector $Ax$ gives rise to vectors that are linear combinations of the columns of $A$. This idea will be used in the next chapter to get the geometrical representation of the linear span of the columns of $A$.
Chapter 3
Finite Dimensional Vector Spaces
3.1 Finite Dimensional Vector Spaces
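Recall that the set of real numbers was denoted by $\mathbb{R}$ and the set of complex numbers was denoted
by $\mathbb{C}$. Also, we wrote $F$ to denote either the set $\mathbb{R}$ or the set $\mathbb{C}$.
Let A be an $m \times n$ complex matrix. Then using Theorem 2.1.5, we see that the solution set of
the homogeneous system $A\mathbf{x} = \mathbf{0}$, denoted $V$, satisfies the following properties:
1. The vector $\mathbf{0} \in V$ as $A\mathbf{0} = \mathbf{0}$.
2. If $\mathbf{x} \in V$ then $A(\alpha\mathbf{x}) = \alpha(A\mathbf{x}) = \mathbf{0}$ for all $\alpha \in \mathbb{C}$. Hence, $\alpha\mathbf{x} \in V$ for any complex number $\alpha$.
In particular, $-\mathbf{x} \in V$ whenever $\mathbf{x} \in V$.
3. Let $\mathbf{x}, \mathbf{y} \in V$. Then for any $\alpha, \beta \in \mathbb{C}$, $\alpha\mathbf{x}, \beta\mathbf{y} \in V$ and $A(\alpha\mathbf{x} + \beta\mathbf{y}) = \mathbf{0} + \mathbf{0} = \mathbf{0}$. In particular,
$\mathbf{x} + \mathbf{y} \in V$ and $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$. Also, $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$.
That is, the solution set of a homogeneous linear system satisfies some nice properties. We use
these properties to define a set and devote this chapter to the study of the structure of such sets.
We will also see that the set of real numbers, $\mathbb{R}$, the Euclidean plane, $\mathbb{R}^2$, and the Euclidean space,
$\mathbb{R}^3$, are examples of this set. We start with the following definition.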
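Definition 3.1.1 (Vector Space) A vector space over $F$, denoted $V(F)$ or in short $V$ (if the field
$F$ is clear from the context), is a non-empty set, satisfying the following axioms:
1. Vector Addition: To every pair $\mathbf{u}, \mathbf{v} \in V$ there corresponds a unique element $\mathbf{u} \oplus \mathbf{v}$ in $V$
(called the addition of vectors) such that
(a) $\mathbf{u} \oplus \mathbf{v} = \mathbf{v} \oplus \mathbf{u}$ (Commutative law).
(b) $(\mathbf{u} \oplus \mathbf{v}) \oplus \mathbf{w} = \mathbf{u} \oplus (\mathbf{v} \oplus \mathbf{w})$ (Associative law).
(c) There is a unique element $\mathbf{0}$ in $V$ (the zero vector) such that $\mathbf{u} \oplus \mathbf{0} = \mathbf{u}$, for every $\mathbf{u} \in V$
(called the additive identity).
(d) For every $\mathbf{u} \in V$ there is a unique element $-\mathbf{u} \in V$ such that $\mathbf{u} \oplus (-\mathbf{u}) = \mathbf{0}$ (called the
additive inverse).
2. Scalar Multiplication: For each $\mathbf{u} \in V$ and $\alpha \in F$, there corresponds a unique element
$\alpha \odot \mathbf{u}$ in $V$ (called the scalar multiplication) such that
(a) $\alpha \cdot (\beta \odot \mathbf{u}) = (\alpha\beta) \odot \mathbf{u}$ for every $\alpha, \beta \in F$ and $\mathbf{u} \in V$.
(b) $1 \odot \mathbf{u} = \mathbf{u}$ for every $\mathbf{u} \in V$, where $1 \in \mathbb{R}$.
3. Distributive Laws: relating vector addition with scalar multiplication
For any $\alpha, \beta \in F$ and $\mathbf{u}, \mathbf{v} \in V$, the following distributive laws hold:
(a) $\alpha \odot (\mathbf{u} \oplus \mathbf{v}) = (\alpha \odot \mathbf{u}) \oplus (\alpha \odot \mathbf{v})$.
(b) $(\alpha + \beta) \odot \mathbf{u} = (\alpha \odot \mathbf{u}) \oplus (\beta \odot \mathbf{u})$.
Note: the number $0$ is an element of $F$ whereas $\mathbf{0}$ is the zero vector.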
Remark 3.1.2 The elements of $F$ are called scalars, and those of $V$ are called vectors. If $F = \mathbb{R}$,
the vector space is called a real vector space. If $F = \mathbb{C}$, the vector space is called a complex
vector space.
Some interesting consequences of Definition 3.1.1 are collected in the following useful result. Intuitively, these
results seem obvious, but for a better understanding of the axioms it is desirable to go through
the proof.
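Theorem 3.1.3 Let $V$ be a vector space over $F$. Then
1. $\mathbf{u} \oplus \mathbf{v} = \mathbf{u}$ implies $\mathbf{v} = \mathbf{0}$.
2. $\alpha \odot \mathbf{u} = \mathbf{0}$ if and only if either $\mathbf{u}$ is the zero vector or $\alpha = 0$.
3. $(-1) \odot \mathbf{u} = -\mathbf{u}$ for every $\mathbf{u} \in V$.
Proof. Part 1: For each $\mathbf{u} \in V$, by Axiom 3.1.1.1d there exists $-\mathbf{u} \in V$ such that $-\mathbf{u} \oplus \mathbf{u} = \mathbf{0}$.
Hence, $\mathbf{u} \oplus \mathbf{v} = \mathbf{u}$ is equivalent to
$$-\mathbf{u} \oplus (\mathbf{u} \oplus \mathbf{v}) = -\mathbf{u} \oplus \mathbf{u} \iff (-\mathbf{u} \oplus \mathbf{u}) \oplus \mathbf{v} = \mathbf{0} \iff \mathbf{0} \oplus \mathbf{v} = \mathbf{0} \iff \mathbf{v} = \mathbf{0}.$$
Part 2: As $\mathbf{0} = \mathbf{0} \oplus \mathbf{0}$, using Axiom 3.1.1.3, we have
$$\alpha \odot \mathbf{0} = \alpha \odot (\mathbf{0} \oplus \mathbf{0}) = (\alpha \odot \mathbf{0}) \oplus (\alpha \odot \mathbf{0}).$$
Thus, for any $\alpha \in F$, Part 1 gives $\alpha \odot \mathbf{0} = \mathbf{0}$. In the same way,
$$0 \odot \mathbf{u} = (0 + 0) \odot \mathbf{u} = (0 \odot \mathbf{u}) \oplus (0 \odot \mathbf{u}).$$
Hence, using Part 1 again, one has $0 \odot \mathbf{u} = \mathbf{0}$ for any $\mathbf{u} \in V$.
Now suppose $\alpha \odot \mathbf{u} = \mathbf{0}$. If $\alpha = 0$ then the proof is over. Therefore, let us assume $\alpha \neq 0$ (note
that $\alpha$ is a real or complex number, hence $\frac{1}{\alpha}$ exists). Then
$$\mathbf{0} = \frac{1}{\alpha} \odot \mathbf{0} = \frac{1}{\alpha} \odot (\alpha \odot \mathbf{u}) = \left(\frac{1}{\alpha}\,\alpha\right) \odot \mathbf{u} = 1 \odot \mathbf{u} = \mathbf{u}$$
as $1 \odot \mathbf{u} = \mathbf{u}$ for every vector $\mathbf{u} \in V$. Thus, if $\alpha \neq 0$ and $\alpha \odot \mathbf{u} = \mathbf{0}$ then $\mathbf{u} = \mathbf{0}$.
Part 3: As $\mathbf{0} = 0\mathbf{u} = (1 + (-1))\mathbf{u} = \mathbf{u} + (-1)\mathbf{u}$, the uniqueness of the additive inverse gives $(-1)\mathbf{u} = -\mathbf{u}$.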
Example 3.1.4 The readers are advised to justify the statements made in the examples given below.
1. Let A be an $m \times n$ matrix with complex entries and suppose rank(A) = $r \leq n$. Let $V$ denote
the solution set of $A\mathbf{x} = \mathbf{0}$. Then using Theorem 2.4.1, we know that $V$ contains at least the
trivial solution, the $\mathbf{0}$ vector. Thus, check that the set $V$ satisfies all the axioms stated in
Definition 3.1.1 (some of them were proved to motivate this chapter).
2. The set $\mathbb{R}$ of real numbers, with the usual addition and multiplication of real numbers (i.e.,
$\oplus \equiv +$ and $\odot \equiv \cdot$) forms a vector space over $\mathbb{R}$.
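3. Let $\mathbb{R}^2 = \{(x_1, x_2) : x_1, x_2 \in \mathbb{R}\}$. Then for $x_1, x_2, y_1, y_2 \in \mathbb{R}$ and $\alpha \in \mathbb{R}$, define
$$(x_1, x_2) \oplus (y_1, y_2) = (x_1 + y_1, x_2 + y_2) \quad\text{and}\quad \alpha \odot (x_1, x_2) = (\alpha x_1, \alpha x_2).$$
Then $\mathbb{R}^2$ is a real vector space.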
4. Let $\mathbb{R}^n = \{(a_1, a_2, \ldots, a_n) : a_i \in \mathbb{R}, 1 \leq i \leq n\}$ be the set of n-tuples of real numbers. For
$\mathbf{u} = (a_1, \ldots, a_n)$, $\mathbf{v} = (b_1, \ldots, b_n)$ in $V = \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, we define
$$\mathbf{u} \oplus \mathbf{v} = (a_1 + b_1, \ldots, a_n + b_n) \quad\text{and}\quad \alpha \odot \mathbf{u} = (\alpha a_1, \ldots, \alpha a_n)$$
(called component wise operations). Then $V$ is a real vector space. This vector space $\mathbb{R}^n$ is
called the real vector space of n-tuples.
Recall that the symbol $i$ represents the complex number $\sqrt{-1}$.
5. Consider the set $\mathbb{C} = \{x + iy : x, y \in \mathbb{R}\}$ of complex numbers and let $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$. Define
$$z_1 \oplus z_2 = (x_1 + x_2) + i(y_1 + y_2),$$
and
(a) for any $\alpha \in \mathbb{R}$, define $\alpha \odot z_1 = (\alpha x_1) + i(\alpha y_1)$. Then $\mathbb{C}$ is a real vector space as the
scalars are the real numbers.
(b) $(\alpha + i\beta) \odot (x_1 + iy_1) = (\alpha x_1 - \beta y_1) + i(\alpha y_1 + \beta x_1)$ for any $\alpha + i\beta \in \mathbb{C}$. Here, the scalars
are complex numbers and hence $\mathbb{C}$ forms a complex vector space.
6. Let $\mathbb{C}^n = \{(z_1, z_2, \ldots, z_n) : z_i \in \mathbb{C}, 1 \leq i \leq n\}$. For $(z_1, \ldots, z_n), (w_1, \ldots, w_n) \in \mathbb{C}^n$ and $\alpha \in F$,
define
$$(z_1, \ldots, z_n) \oplus (w_1, \ldots, w_n) = (z_1 + w_1, \ldots, z_n + w_n), \quad\text{and}\quad \alpha \odot (z_1, \ldots, z_n) = (\alpha z_1, \ldots, \alpha z_n).$$
Then it can be verified that $\mathbb{C}^n$ forms a vector space over $\mathbb{C}$ (called complex vector space) as
well as over $\mathbb{R}$ (called real vector space). Whenever there is no mention of scalars, it will
always be assumed to be $\mathbb{C}$, the complex numbers.
Remark 3.1.5 If the scalars are $\mathbb{C}$ then $i(1, 0) = (i, 0)$ is allowed. Whereas, if the scalars
are $\mathbb{R}$ then $i$ is not a scalar, so $(i, 0)$ cannot be written as the scalar multiple $i(1, 0)$.
7. Fix a positive integer n and let $T_n(\mathbb{R})$ denote the set of all polynomials in $x$ of degree $\leq n$
with coefficients from $\mathbb{R}$. Algebraically,
$$T_n(\mathbb{R}) = \{a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n : a_i \in \mathbb{R}, 0 \leq i \leq n\}.$$
Let $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$, $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n \in T_n(\mathbb{R})$ for some
$a_i, b_i \in \mathbb{R}$, $0 \leq i \leq n$. It can be verified that $T_n(\mathbb{R})$ is a real vector space with the addition and
scalar multiplication defined by
$$f(x) \oplus g(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n, \quad\text{and}$$
$$\alpha \odot f(x) = \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_n x^n \quad\text{for } \alpha \in \mathbb{R}.$$
8. Let $T(\mathbb{R})$ be the set of all polynomials with real coefficients. As any polynomial $a_0 + a_1 x + \cdots + a_m x^m$
also equals $a_0 + a_1 x + \cdots + a_m x^m + 0 \cdot x^{m+1} + \cdots + 0 \cdot x^p$, whenever $p > m$, let
$f(x) = a_0 + a_1 x + \cdots + a_p x^p$, $g(x) = b_0 + b_1 x + \cdots + b_p x^p \in T(\mathbb{R})$ for some $a_i, b_i \in \mathbb{R}$, $0 \leq i \leq p$.
Then, with vector addition and scalar multiplication defined coefficient-wise as below,
$T(\mathbb{R})$ forms a real vector space:
$$f(x) \oplus g(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_p + b_p)x^p \quad\text{and}$$
$$\alpha \odot f(x) = \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_p x^p \quad\text{for } \alpha \in \mathbb{R}.$$
9. Let $T(\mathbb{C})$ be the set of all polynomials with complex coefficients. Then with respect to vector
addition and scalar multiplication defined coefficient-wise, the set $T(\mathbb{C})$ forms a vector space.
10. Let $V = \mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\}$. This is not a vector space under the usual operations of
addition and scalar multiplication (why?). But $\mathbb{R}^+$ is a real vector space with 1 as the additive
identity if we define vector addition and scalar multiplication by
$$u \oplus v = u \cdot v \quad\text{and}\quad \alpha \odot u = u^{\alpha}$$
for all $u, v \in \mathbb{R}^+$ and $\alpha \in \mathbb{R}$.
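11. Let $V = \{(x, y) : x, y \in \mathbb{R}\}$. For any $\alpha \in \mathbb{R}$ and $\mathbf{x} = (x_1, x_2)$, $\mathbf{y} = (y_1, y_2) \in V$, let
$$\mathbf{x} \oplus \mathbf{y} = (x_1 + y_1 + 1, x_2 + y_2 - 3) \quad\text{and}\quad \alpha \odot \mathbf{x} = (\alpha x_1 + \alpha - 1, \alpha x_2 - 3\alpha + 3).$$
Then $V$ is a real vector space with $(-1, 3)$ as the additive identity.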
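12. Let $M_2(\mathbb{C})$ denote the set of all $2 \times 2$ matrices with complex entries. Then $M_2(\mathbb{C})$ forms a
vector space with vector addition and scalar multiplication defined by
$$A \oplus B = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \oplus \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 & a_2 + b_2 \\ a_3 + b_3 & a_4 + b_4 \end{bmatrix}, \qquad \alpha \odot A = \begin{bmatrix} \alpha a_1 & \alpha a_2 \\ \alpha a_3 & \alpha a_4 \end{bmatrix}.$$
13. Fix positive integers m and n and let $M_{m \times n}(\mathbb{C})$ denote the set of all $m \times n$ matrices with complex
entries. Then $M_{m \times n}(\mathbb{C})$ is a vector space with vector addition and scalar multiplication
defined by
$$A \oplus B = [a_{ij}] \oplus [b_{ij}] = [a_{ij} + b_{ij}], \qquad \alpha \odot A = \alpha \odot [a_{ij}] = [\alpha a_{ij}].$$
In case m = n, the vector space $M_{m \times n}(\mathbb{C})$ will be denoted by $M_n(\mathbb{C})$.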
14. Let $C([-1, 1])$ be the set of all real valued continuous functions on the interval $[-1, 1]$. Then
$C([-1, 1])$ forms a real vector space if for all $x \in [-1, 1]$, we define
$(f \oplus g)(x) = f(x) + g(x)$ for all $f, g \in C([-1, 1])$ and
$(\alpha \odot f)(x) = \alpha f(x)$ for all $\alpha \in \mathbb{R}$ and $f \in C([-1, 1])$.
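15. Let $V$ and $W$ be vector spaces over $F$, with operations $(+, \bullet)$ and $(\oplus, \odot)$, respectively. Let
$V \times W = \{(\mathbf{v}, \mathbf{w}) : \mathbf{v} \in V, \mathbf{w} \in W\}$. Then $V \times W$ forms a vector space over $F$, if for every
$(\mathbf{v}_1, \mathbf{w}_1), (\mathbf{v}_2, \mathbf{w}_2) \in V \times W$ and $\alpha \in F$, we define
$$(\mathbf{v}_1, \mathbf{w}_1) \oplus' (\mathbf{v}_2, \mathbf{w}_2) = (\mathbf{v}_1 + \mathbf{v}_2, \mathbf{w}_1 \oplus \mathbf{w}_2), \quad\text{and}\quad \alpha \circ (\mathbf{v}_1, \mathbf{w}_1) = (\alpha \bullet \mathbf{v}_1, \alpha \odot \mathbf{w}_1).$$
Here $\mathbf{v}_1 + \mathbf{v}_2$ and $\mathbf{w}_1 \oplus \mathbf{w}_2$ on the right hand side mean vector addition in $V$ and $W$, respectively.
Similarly, $\alpha \bullet \mathbf{v}_1$ and $\alpha \odot \mathbf{w}_1$ correspond to scalar multiplication in $V$ and $W$, respectively.
From now on, we will use $\mathbf{u} + \mathbf{v}$ for $\mathbf{u} \oplus \mathbf{v}$ and $\alpha\mathbf{u}$ or $\alpha \cdot \mathbf{u}$ for $\alpha \odot \mathbf{u}$.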
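Unusual operations such as those of Example 3.1.4.11 are easy to get wrong by hand, so a quick numerical spot check can be helpful. The sketch below is illustrative only: it assumes the operations $\mathbf{x} \oplus \mathbf{y} = (x_1 + y_1 + 1, x_2 + y_2 - 3)$ and $\alpha \odot \mathbf{x} = (\alpha x_1 + \alpha - 1, \alpha x_2 - 3\alpha + 3)$, the names `add`, `smul`, `neg` and `check_axioms` are hypothetical, and passing on a few sample points is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    """Vector addition of Example 3.1.4.11 (as assumed above)."""
    return (x[0] + y[0] + 1, x[1] + y[1] - 3)

def smul(alpha, x):
    """Scalar multiplication of Example 3.1.4.11 (as assumed above)."""
    return (alpha * x[0] + alpha - 1, alpha * x[1] - 3 * alpha + 3)

ZERO = (Fraction(-1), Fraction(3))  # the additive identity claimed in the example

def neg(x):
    """Additive inverse, obtained as (-1) times x (Theorem 3.1.3.3)."""
    return smul(Fraction(-1), x)

def check_axioms(vectors, scalars):
    """Spot-check the axioms of Definition 3.1.1 on the given samples."""
    for u, v, w in product(vectors, repeat=3):
        assert add(u, v) == add(v, u)                  # commutative law
        assert add(add(u, v), w) == add(u, add(v, w))  # associative law
    for u in vectors:
        assert add(u, ZERO) == u                       # additive identity
        assert add(u, neg(u)) == ZERO                  # additive inverse
        assert smul(Fraction(1), u) == u               # 1 acts as identity
    for a, b in product(scalars, repeat=2):
        for u, v in product(vectors, repeat=2):
            assert smul(a, smul(b, u)) == smul(a * b, u)
            assert smul(a, add(u, v)) == add(smul(a, u), smul(a, v))
            assert smul(a + b, u) == add(smul(a, u), smul(b, u))
    return True

samples = [(Fraction(0), Fraction(0)), (Fraction(1, 2), Fraction(-3)), (Fraction(2), Fraction(5))]
scalars = [Fraction(0), Fraction(-1), Fraction(3, 2)]
print(check_axioms(samples, scalars))
```

Exact fractions are used so that equality tests are not disturbed by floating point rounding.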
Exercise 3.1.6 1. Verify that all the axioms are satisfied in all the examples of vector spaces considered
in Example 3.1.4.
2. Prove that the set $M_{m \times n}(\mathbb{R})$ for fixed positive integers m and n forms a real vector space with
usual operations of matrix addition and scalar multiplication.
3. Let $V = \{(x, y) : x, y \in \mathbb{R}\}$. For $\mathbf{x} = (x_1, x_2)$, $\mathbf{y} = (y_1, y_2) \in V$, define
$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2) \quad\text{and}\quad \alpha\mathbf{x} = (\alpha x_1, 0)$$
for all $\alpha \in \mathbb{R}$. Is $V$ a vector space? Give reasons for your answer.
4. Let $a, b \in \mathbb{R}$ with $a < b$. Then prove that $C([a, b])$, the set of all complex valued continuous
functions on $[a, b]$, forms a vector space if for all $x \in [a, b]$, we define
$(f \oplus g)(x) = f(x) + g(x)$ for all $f, g \in C([a, b])$ and
$(\alpha \odot f)(x) = \alpha f(x)$ for all $\alpha \in \mathbb{R}$ and $f \in C([a, b])$.
5. Prove that $C(\mathbb{R})$, the set of all real valued continuous functions on $\mathbb{R}$, forms a vector space if
for all $x \in \mathbb{R}$, we define
$(f \oplus g)(x) = f(x) + g(x)$ for all $f, g \in C(\mathbb{R})$ and
$(\alpha \odot f)(x) = \alpha f(x)$ for all $\alpha \in \mathbb{R}$ and $f \in C(\mathbb{R})$.
3.1.1 Subspaces
Definition 3.1.7 (Vector Subspace) Let $S$ be a non-empty subset of $V$. The set $S$ over $F$ is
said to be a subspace of $V(F)$ if $S$ in itself is a vector space, where the vector addition and scalar
multiplication are the same as that of $V(F)$.
Example 3.1.8 1. Let $V(F)$ be a vector space. Then the sets given below are subspaces of $V$.
They are called trivial subspaces.
(a) $S = \{\mathbf{0}\}$, consisting only of the zero vector $\mathbf{0}$ and
(b) $S = V$, the whole space.
2. Let $S = \{(x, y, z) \in \mathbb{R}^3 : x + 2y - z = 0\}$. Then $S$ is a subspace of $\mathbb{R}^3$ ($S$ is a plane in $\mathbb{R}^3$
passing through the origin).
3. Let $S = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 0, x - y - z = 0\}$. Then $S$ is a subspace of $\mathbb{R}^3$ ($S$ is a
line in $\mathbb{R}^3$ passing through the origin).
4. Let $S = \{(x, y, z) \in \mathbb{R}^3 : z - 3x = 0\}$. Then $S$ is a subspace of $\mathbb{R}^3$.
5. The vector space $T_n(\mathbb{R})$ is a subspace of the vector space $T(\mathbb{R})$.
6. Prove that $S = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 3\}$ is not a subspace of $\mathbb{R}^3$ ($S$ is still a plane in
$\mathbb{R}^3$ but it does not pass through the origin).
7. Prove that $W = \{(x, 0) \in \mathbb{R}^2 : x \in \mathbb{R}\}$ is a subspace of $\mathbb{R}^2$.
8. Let $W = \{(x, 0) \in V : x \in \mathbb{R}\}$, where $V$ is the vector space of Example 3.1.4.11. Then
$(x, 0) \oplus (y, 0) = (x + y + 1, -3) \notin W$. Hence $W$ is not a subspace of $V$ but $S = \{(x, 3) : x \in \mathbb{R}\}$
is a subspace of $V$. Note that the zero vector $(-1, 3) \in V$.
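9. Let $W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_2(\mathbb{C}) : a = \overline{d} \right\}$. For $W$ to be closed under multiplication by a scalar $\alpha$, we need
$\alpha a = \overline{\alpha d} = \overline{\alpha}\, a$. Thus the condition $a = \overline{d}$ forces us to have $\alpha = \overline{\alpha}$ for
any scalar $\alpha \in \mathbb{C}$. Hence,
(a) $W$ is not a vector subspace of the complex vector space $M_2(\mathbb{C})$, but
(b) $W$ is a vector subspace of the real vector space $M_2(\mathbb{C})$.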
We are now ready to prove a very important result in the study of vector subspaces. This result
basically tells us that if we want to prove that a non-empty set $W$ is a subspace of a vector space
$V(F)$ then we need to verify only one condition. That is, we don't have to prove all the axioms
stated in Definition 3.1.1.
Theorem 3.1.9 Let $V(F)$ be a vector space and let $W$ be a non-empty subset of $V$. Then $W$ is a
subspace of $V$ if and only if $\alpha\mathbf{u} + \beta\mathbf{v} \in W$ whenever $\alpha, \beta \in F$ and $\mathbf{u}, \mathbf{v} \in W$.
Proof. Let $W$ be a subspace of $V$ and let $\mathbf{u}, \mathbf{v} \in W$. Then for every $\alpha, \beta \in F$, $\alpha\mathbf{u}, \beta\mathbf{v} \in W$ and
hence $\alpha\mathbf{u} + \beta\mathbf{v} \in W$.
Now, let us assume that $\alpha\mathbf{u} + \beta\mathbf{v} \in W$ whenever $\alpha, \beta \in F$ and $\mathbf{u}, \mathbf{v} \in W$. We need to show that $W$ is a
subspace of $V$. To do so, observe the following:
1. Taking $\alpha = 1$ and $\beta = 1$, we see that $\mathbf{u} + \mathbf{v} \in W$ for every $\mathbf{u}, \mathbf{v} \in W$.
2. Taking $\alpha = 0$ and $\beta = 0$, we see that $\mathbf{0} \in W$.
3. Taking $\beta = 0$, we see that $\alpha\mathbf{u} \in W$ for every $\alpha \in F$ and $\mathbf{u} \in W$ and hence using
Theorem 3.1.3.3, $-\mathbf{u} = (-1)\mathbf{u} \in W$ as well.
4. The commutative and associative laws of vector addition hold as they hold in $V$.
5. The axioms related with scalar multiplication and the distributive laws also hold as they hold
in $V$.
Thus, we have the required result.
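The single condition of Theorem 3.1.9 also suggests a cheap way to refute a subspace claim: find sample vectors $\mathbf{u}, \mathbf{v} \in W$ and scalars with $\alpha\mathbf{u} + \beta\mathbf{v} \notin W$. The following is a small sketch under that idea; the function name `closed_under_combinations` and the chosen sample points are assumptions made for illustration. A `False` result disproves the subspace property, whereas `True` is only supporting evidence.

```python
from fractions import Fraction
from itertools import product

def closed_under_combinations(in_W, samples, scalars):
    """Test alpha*u + beta*v in W for all sampled u, v in W and all sampled scalars."""
    for u, v in product(samples, repeat=2):
        for a, b in product(scalars, repeat=2):
            w = tuple(a * ui + b * vi for ui, vi in zip(u, v))
            if not in_W(w):
                return False
    return True

def plane_through_origin(p):
    """Membership test for the plane x + 2y - z = 0 of Example 3.1.8.2."""
    return p[0] + 2 * p[1] - p[2] == 0

def shifted_plane(p):
    """Membership test for the plane x + y + z = 3 of Example 3.1.8.6."""
    return p[0] + p[1] + p[2] == 3

scalars = [Fraction(0), Fraction(1), Fraction(-2), Fraction(1, 3)]
on_first_plane = [(1, 0, 1), (0, 1, 2), (2, -1, 0)]
on_second_plane = [(3, 0, 0), (0, 3, 0), (1, 1, 1)]

print(closed_under_combinations(plane_through_origin, on_first_plane, scalars))
print(closed_under_combinations(shifted_plane, on_second_plane, scalars))
```

For the second plane the choice $\alpha = \beta = 0$ already produces the zero vector, which does not satisfy $x + y + z = 3$.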
Exercise 3.1.10 1. Determine all the subspaces of $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$.
2. Prove that a line in $\mathbb{R}^2$ is a subspace if and only if it passes through $(0, 0) \in \mathbb{R}^2$.
3. Let $V = \{(a, b) : a, b \in \mathbb{R}\}$. Is $V$ a vector space over $\mathbb{R}$ if $(a, b) \oplus (c, d) = (a + c, 0)$ and
$\alpha \odot (a, b) = (\alpha a, 0)$? Give reasons for your answer.
4. Let $V = \mathbb{R}$. Define $x \oplus y = x - y$ and $\alpha \odot x = -\alpha x$. Which vector space axioms are not
satisfied here?
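5. Which of the following are correct statements (why!)?
(a) $S = \{(x, y, z) \in \mathbb{R}^3 : z = x^2\}$ is a subspace of $\mathbb{R}^3$.
(b) $S = \{\alpha\mathbf{x} : \alpha \in F\}$ forms a vector subspace of $V(F)$ for each fixed $\mathbf{x} \in V$.
(c) $S = \{\alpha(1, 1, 1) + \beta(1, 1, 0) : \alpha, \beta \in \mathbb{R}\}$ is a vector subspace of $\mathbb{R}^3$.
(d) All the sets given below are subspaces of $C([-1, 1])$ (see Example 3.1.4.14).
i. $W = \{f \in C([-1, 1]) : f(1/2) = 0\}$.
ii. $W = \{f \in C([-1, 1]) : f(0) = 0, f(1/2) = 0\}$.
iii. $W = \{f \in C([-1, 1]) : f(-1/2) = 0, f(1/2) = 0\}$.
iv. $W = \{f \in C([-1, 1]) : f'(\tfrac{1}{4}) \text{ exists}\}$.
(e) All the sets given below are subspaces of $T(\mathbb{R})$.
i. $W = \{f(x) \in T(\mathbb{R}) : \deg(f(x)) = 3\}$.
ii. $W = \{f(x) \in T(\mathbb{R}) : \deg(f(x)) = 0\}$.
iii. $W = \{f(x) \in T(\mathbb{R}) : f(1) = 0\}$.
iv. $W = \{f(x) \in T(\mathbb{R}) : f(0) = 0, f(1) = 0\}$.
(f) Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then $\{\mathbf{x} : A\mathbf{x} = \mathbf{b}\}$ is a subspace of $\mathbb{R}^3$.
(g) Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix}$. Then $\{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$ is a subspace of $\mathbb{R}^3$.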
6. Which of the following are subspaces of $\mathbb{R}^n(\mathbb{R})$?
(a) $\{(x_1, x_2, \ldots, x_n) : x_1 \geq 0\}$.
(b) $\{(x_1, x_2, \ldots, x_n) : x_1 + 2x_2 = 4x_3\}$.
(c) $\{(x_1, x_2, \ldots, x_n) : x_1 \text{ is rational}\}$.
(d) $\{(x_1, x_2, \ldots, x_n) : x_1 = x_3^2\}$.
(e) $\{(x_1, x_2, \ldots, x_n) : \text{either } x_1 \text{ or } x_2 \text{ or both are } 0\}$.
(f) $\{(x_1, x_2, \ldots, x_n) : |x_1| \leq 1\}$.
7. Which of the following are subspaces of i) $\mathbb{C}^n(\mathbb{R})$ ii) $\mathbb{C}^n(\mathbb{C})$?
(a) $\{(z_1, z_2, \ldots, z_n) : z_1 \text{ is real}\}$.
(b) $\{(z_1, z_2, \ldots, z_n) : z_1 + z_2 = \overline{z_3}\}$.
(c) $\{(z_1, z_2, \ldots, z_n) : |z_1| = |z_2|\}$.
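8. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$. Are the sets given below subspaces of $\mathbb{R}^3$?
(a) $W = \{\mathbf{x}^t \in \mathbb{R}^3 : A\mathbf{x} = \mathbf{0}\}$.
(b) $W = \{\mathbf{b}^t \in \mathbb{R}^3 : \text{there exists } \mathbf{x}^t \in \mathbb{R}^3 \text{ with } A\mathbf{x} = \mathbf{b}\}$.
(c) $W = \{\mathbf{x}^t \in \mathbb{R}^3 : \mathbf{x}^t A = \mathbf{0}\}$.
(d) $W = \{\mathbf{b}^t \in \mathbb{R}^3 : \text{there exists } \mathbf{x}^t \in \mathbb{R}^3 \text{ with } \mathbf{x}^t A = \mathbf{b}^t\}$.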
9. Fix a positive integer n. Then $M_n(\mathbb{R})$ is a real vector space with usual operations of matrix
addition and scalar multiplication. Prove that the sets $W \subset M_n(\mathbb{R})$, given below, are subspaces
of $M_n(\mathbb{R})$.
(a) $W = \{A : A^t = A\}$, the set of symmetric matrices.
(b) $W = \{A : A^t = -A\}$, the set of skew-symmetric matrices.
(c) $W = \{A : A \text{ is an upper triangular matrix}\}$.
(d) $W = \{A : A \text{ is a lower triangular matrix}\}$.
(e) $W = \{A : A \text{ is a diagonal matrix}\}$.
(f) $W = \{A : \operatorname{trace}(A) = 0\}$.
(g) $W = \{A = (a_{ij}) : a_{11} + a_{22} = 0\}$.
(h) $W = \{A = (a_{ij}) : a_{21} + a_{22} + \cdots + a_{2n} = 0\}$.
10. Fix a positive integer n. Then $M_n(\mathbb{C})$ is a complex vector space with usual operations of
matrix addition and scalar multiplication. Are the sets $W \subset M_n(\mathbb{C})$, given below, subspaces
of $M_n(\mathbb{C})$? Give reasons.
(a) $W = \{A : A^* = A\}$, the set of Hermitian matrices.
(b) $W = \{A : A^* = -A\}$, the set of skew-Hermitian matrices.
(c) $W = \{A : A \text{ is an upper triangular matrix}\}$.
(d) $W = \{A : A \text{ is a lower triangular matrix}\}$.
(e) $W = \{A : A \text{ is a diagonal matrix}\}$.
(f) $W = \{A : \operatorname{trace}(A) = 0\}$.
(g) $W = \{A = (a_{ij}) : a_{11} + a_{22} = 0\}$.
(h) $W = \{A = (a_{ij}) : a_{21} + a_{22} + \cdots + a_{2n} = 0\}$.
What happens if $M_n(\mathbb{C})$ is a real vector space?
11. Prove that the following sets are not subspaces of $M_n(\mathbb{R})$.
(a) $G = \{A \in M_n(\mathbb{R}) : \det(A) = 0\}$.
(b) $G = \{A \in M_n(\mathbb{R}) : \det(A) \neq 0\}$.
(c) $G = \{A \in M_n(\mathbb{R}) : \det(A) = 1\}$.
3.1.2 Linear Span
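Definition 3.1.11 (Linear Combination) Let $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ be a collection of vectors from a
vector space $V(F)$. A vector $\mathbf{u} \in V$ is said to be a linear combination of the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$ if we
can find scalars $\alpha_1, \ldots, \alpha_n \in F$ such that $\mathbf{u} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_n\mathbf{u}_n$.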
Example 3.1.12 1. Is (4, 5, 5) a linear combination of (1, 0, 0), (2, 1, 0), and (3, 3, 1)?
Solution: The vector (4, 5, 5) is a linear combination if the linear system
$$a(1, 0, 0) + b(2, 1, 0) + c(3, 3, 1) = (4, 5, 5) \qquad (3.1.1)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. The augmented matrix of Equation (3.1.1) equals
$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 3 & 5 \\ 0 & 0 & 1 & 5 \end{bmatrix}$$
and back substitution gives the solution $c = 5$, $b = 5 - 3c = -10$ and $a = 4 - 2b - 3c = 9$.
2. Is (4, 5, 5) a linear combination of the vectors (1, 2, 3), (−1, 1, 4) and (3, 3, 2)?
Solution: The vector (4, 5, 5) is a linear combination if the linear system
$$a(1, 2, 3) + b(-1, 1, 4) + c(3, 3, 2) = (4, 5, 5) \qquad (3.1.2)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. The row reduced echelon form of the augmented
matrix of Equation (3.1.2) equals
$$\begin{bmatrix} 1 & 0 & 2 & 3 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus, one has an infinite number of
solutions. For example, $(4, 5, 5) = 3(1, 2, 3) - (-1, 1, 4)$.
3. Is (4, 5, 5) a linear combination of the vectors (1, 2, 1), (1, 0, −1) and (1, 1, 0)?
Solution: The vector (4, 5, 5) is a linear combination if the linear system
$$a(1, 2, 1) + b(1, 0, -1) + c(1, 1, 0) = (4, 5, 5) \qquad (3.1.3)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. An application of Gauss elimination method to
Equation (3.1.3) gives
$$\begin{bmatrix} 1 & 1 & 1 & 4 \\ 0 & 1 & \tfrac{1}{2} & \tfrac{3}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Thus, Equation (3.1.3) has no solution and hence
(4, 5, 5) is not a linear combination of the given vectors.
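The three questions of Example 3.1.12 all reduce to row reducing an augmented matrix whose columns are the given vectors followed by the target vector. The sketch below automates this with exact fractions. The names `rref` and `combination` are hypothetical, the vectors $(-1, 1, 4)$ and $(1, 0, -1)$ are used under the assumption that these are the signs intended in parts 2 and 3, and when the system has free unknowns the code simply sets them to 0 to return one particular solution.

```python
from fractions import Fraction

def rref(rows):
    """Row reduced echelon form over the rationals; returns (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]           # make the leading term 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def combination(vectors, target):
    """Return coefficients expressing target through vectors, or None if impossible."""
    n = len(vectors)
    # each row of the augmented matrix collects one coordinate of every vector
    aug = [[v[i] for v in vectors] + [target[i]] for i in range(len(target))]
    m, pivots = rref(aug)
    if n in pivots:            # a pivot in the last column means inconsistency
        return None
    coeffs = [Fraction(0)] * n  # free unknowns are set to 0
    for row, c in enumerate(pivots):
        coeffs[c] = m[row][n]
    return coeffs

print(combination([(1, 0, 0), (2, 1, 0), (3, 3, 1)], (4, 5, 5)))
print(combination([(1, 2, 3), (-1, 1, 4), (3, 3, 2)], (4, 5, 5)))
print(combination([(1, 2, 1), (1, 0, -1), (1, 1, 0)], (4, 5, 5)))
```

A return value of `None` corresponds to a row of the form $[\mathbf{0}^t \mid 1]$ in the reduced augmented matrix.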
Exercise 3.1.13 1. Prove that every $\mathbf{x} \in \mathbb{R}^3$ is a unique linear combination of the vectors
(1, 0, 0), (2, 1, 0), and (3, 3, 1).
2. Find condition(s) on x, y and z such that (x, y, z) is a linear combination of (1, 2, 3), (−1, 1, 4)
and (3, 3, 2).
3. Find condition(s) on x, y and z such that (x, y, z) is a linear combination of the vectors
(1, 2, 1), (1, 0, −1) and (1, 1, 0).
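Definition 3.1.14 (Linear Span) Let $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be a non-empty subset of a vector
space $V(F)$. The linear span of $S$ is the set defined by
$$L(S) = \{\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_n\mathbf{u}_n : \alpha_i \in F, 1 \leq i \leq n\}.$$
If $S$ is an empty set we define $L(S) = \{\mathbf{0}\}$.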
Example 3.1.15 1. Let $S = \{(1, 0), (0, 1)\} \subset \mathbb{R}^2$. Determine $L(S)$.
Solution: By definition, the required linear span is
$$L(S) = \{a(1, 0) + b(0, 1) : a, b \in \mathbb{R}\} = \{(a, b) : a, b \in \mathbb{R}\} = \mathbb{R}^2. \qquad (3.1.4)$$
2. For each $S \subset \mathbb{R}^3$, determine the geometrical representation of $L(S)$.
(a) $S = \{(1, 1, 1), (2, 1, 3)\}$.
Solution: By definition, the required linear span is
$$L(S) = \{a(1, 1, 1) + b(2, 1, 3) : a, b \in \mathbb{R}\} = \{(a + 2b, a + b, a + 3b) : a, b \in \mathbb{R}\}. \qquad (3.1.5)$$
Note that finding all vectors of the form $(a + 2b, a + b, a + 3b)$ is equivalent to finding
conditions on x, y and z such that $(a + 2b, a + b, a + 3b) = (x, y, z)$, or equivalently, the
system
$$a + 2b = x, \quad a + b = y, \quad a + 3b = z$$
always has a solution. Check that the row reduced form of the augmented matrix equals
$$\begin{bmatrix} 1 & 0 & 2y - x \\ 0 & 1 & x - y \\ 0 & 0 & z + y - 2x \end{bmatrix}.$$
Thus, we need $2x - y - z = 0$ and hence
$$L(S) = \{a(1, 1, 1) + b(2, 1, 3) : a, b \in \mathbb{R}\} = \{(x, y, z) \in \mathbb{R}^3 : 2x - y - z = 0\}. \qquad (3.1.6)$$
Equation (3.1.5) is called an algebraic representation of $L(S)$ whereas Equation (3.1.6)
gives its geometrical representation as a subspace of $\mathbb{R}^3$.
(b) $S = \{(1, 2, 1), (1, 0, -1), (1, 1, 0)\}$.
Solution: As in Example 3.1.15.2a, we need to find condition(s) on x, y, z such that the
linear system
$$a(1, 2, 1) + b(1, 0, -1) + c(1, 1, 0) = (x, y, z) \qquad (3.1.7)$$
in the unknowns a, b, c is always consistent. An application of Gauss elimination method
to Equation (3.1.7) gives
$$\begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & \tfrac{1}{2} & \tfrac{2x - y}{2} \\ 0 & 0 & 0 & x - y + z \end{bmatrix}.$$
Thus,
$$L(S) = \{(x, y, z) : x - y + z = 0\}.$$
(c) $S = \{(1, 2, 3), (-1, 1, 4), (3, 3, 2)\}$.
Solution: We need to find condition(s) on x, y, z such that the linear system
$$a(1, 2, 3) + b(-1, 1, 4) + c(3, 3, 2) = (x, y, z)$$
in the unknowns a, b, c is always consistent. An application of Gauss elimination method
gives $5x - 7y + 3z = 0$ as the required condition. Thus,
$$L(S) = \{(x, y, z) : 5x - 7y + 3z = 0\}.$$
3. Let $S = \{(1, 2, 3, 4), (-1, 1, 4, 5), (3, 3, 2, 3)\} \subset \mathbb{R}^4$. Determine $L(S)$.
Solution: The readers are advised to show that
$$L(S) = \{(x, y, z, w) : 2x - 3y + w = 0, \; 5x - 7y + 3z = 0\}.$$
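When $L(S) \subset \mathbb{R}^3$ is a plane through the origin, its geometrical representation can be cross-checked without row reduction: a normal vector of the plane spanned by two independent vectors is their cross product. This is a different technique from the Gauss elimination used above, offered only as a sanity check. The helper names `cross` and `dot` are illustrative, and the vector $(-1, 1, 4)$ is used under the assumption that this is the sign pattern intended in Example 3.1.15.2c.

```python
def cross(u, v):
    """Cross product of two vectors of R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))

# Example 3.1.15.2a: the normal (2, -1, -1) encodes the condition 2x - y - z = 0
normal_a = cross((1, 1, 1), (2, 1, 3))
print(normal_a)

# Example 3.1.15.2c: the normal of the plane spanned by the first two vectors
normal_c = cross((1, 2, 3), (-1, 1, 4))
print(normal_c)

# the third generator lies in the same plane exactly when this dot product is 0
print(dot(normal_c, (3, 3, 2)))
```

The coefficients of the normal vectors agree with the conditions $2x - y - z = 0$ and $5x - 7y + 3z = 0$ obtained in the example.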
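Exercise 3.1.16 For each of the sets $S$, determine the geometric representation of $L(S)$.
1. $S = \{1\} \subset \mathbb{R}$.
2. $S = \{\tfrac{1}{10^4}\} \subset \mathbb{R}$.
3. $S = \{\sqrt{15}\} \subset \mathbb{R}$.
4. $S = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\} \subset \mathbb{R}^3$.
5. $S = \{(1, 0, 1), (0, 1, 0), (3, 0, 3)\} \subset \mathbb{R}^3$.
6. $S = \{(1, 0, 1), (1, 1, 0), (3, 4, 3)\} \subset \mathbb{R}^3$.
7. $S = \{(1, 2, 1), (2, 0, 1), (1, 1, 1)\} \subset \mathbb{R}^3$.
8. $S = \{(1, 0, 1, 1), (0, 1, 0, 1), (3, 0, 3, 1)\} \subset \mathbb{R}^4$.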
Definition 3.1.17 (Finite Dimensional Vector Space) A vector space $V(F)$ is said to be finite
dimensional if we can find a subset $S$ of $V$, having a finite number of elements, such that $V = L(S)$.
If such a subset does not exist then $V$ is called an infinite dimensional vector space.
Example 3.1.18 1. The set $\{(1, 2), (2, 1)\}$ spans $\mathbb{R}^2$ and hence $\mathbb{R}^2$ is a finite dimensional vector
space.
2. The set $\{1, 1 + x, 1 - x + x^2, x^3, x^4, x^5\}$ spans $T_5(\mathbb{C})$ and hence $T_5(\mathbb{C})$ is a finite dimensional
vector space.
3. Fix a positive integer n and consider the vector space $T_n(\mathbb{C})$. Then $T_n(\mathbb{C})$ is a finite dimensional
vector space as $T_n(\mathbb{C}) = L(\{1, x, x^2, \ldots, x^n\})$.
4. Recall $T(\mathbb{C})$, the vector space of all polynomials with complex coefficients. Since the degree of a
polynomial can be an arbitrarily large positive integer, $T(\mathbb{C})$ cannot be a finite dimensional vector space.
Indeed, check that $T(\mathbb{C}) = L(\{1, x, x^2, \ldots, x^n, \ldots\})$.
Lemma 3.1.19 (Linear Span is a Subspace) Let $S$ be a non-empty subset of a vector space
$V(F)$. Then $L(S)$ is a subspace of $V(F)$.
Proof. By definition, $S \subset L(S)$ and hence $L(S)$ is a non-empty subset of $V$. Let $\mathbf{u}, \mathbf{v} \in L(S)$.
Then, there exist a positive integer n, vectors $\mathbf{w}_i \in S$ and scalars $\alpha_i, \beta_i \in F$ such that
$\mathbf{u} = \alpha_1\mathbf{w}_1 + \alpha_2\mathbf{w}_2 + \cdots + \alpha_n\mathbf{w}_n$ and $\mathbf{v} = \beta_1\mathbf{w}_1 + \beta_2\mathbf{w}_2 + \cdots + \beta_n\mathbf{w}_n$. Hence,
$$a\mathbf{u} + b\mathbf{v} = (a\alpha_1 + b\beta_1)\mathbf{w}_1 + \cdots + (a\alpha_n + b\beta_n)\mathbf{w}_n \in L(S)$$
for every $a, b \in F$ as $a\alpha_i + b\beta_i \in F$ for $i = 1, \ldots, n$. Thus using Theorem 3.1.9, $L(S)$ is a vector
subspace of $V(F)$.
Remark 3.1.20 Let $W$ be a subspace of a vector space $V(F)$. If $S \subset W$ then $L(S)$ is a subspace
of $W$ as $W$ is a vector space in its own right.
Theorem 3.1.21 Let $S$ be a non-empty subset of a vector space $V$. Then $L(S)$ is the smallest
subspace of $V$ containing $S$.
Proof. For every $\mathbf{u} \in S$, $\mathbf{u} = 1 \cdot \mathbf{u} \in L(S)$ and hence $S \subseteq L(S)$. To show $L(S)$ is the smallest
subspace of $V$ containing $S$, consider any subspace $W$ of $V$ containing $S$. Then by Remark 3.1.20,
$L(S) \subseteq W$ and hence the result follows.
Exercise 3.1.22 1. Find all the vector subspaces of $\mathbb{R}^2$ and $\mathbb{R}^3$.
2. Prove that $\{(x, y, z) \in \mathbb{R}^3 : ax + by + cz = d\}$ is a subspace of $\mathbb{R}^3$ if and only if $d = 0$.
3. Let $W$ be a set that consists of all polynomials of degree 5. Prove that $W$ is not a subspace of
$T(\mathbb{R})$.
4. Determine all vector subspaces of $V$, the vector space in Example 3.1.4.11.
5. Let $P$ and $Q$ be two subspaces of a vector space $V$.
(a) Prove that $P \cap Q$ is a subspace of $V$.
(b) Give examples of $P$ and $Q$ such that $P \cup Q$ is not a subspace of $V$.
(c) Determine conditions on $P$ and $Q$ such that $P \cup Q$ is a subspace of $V$.
(d) Define $P + Q = \{\mathbf{u} + \mathbf{v} : \mathbf{u} \in P, \mathbf{v} \in Q\}$. Prove that $P + Q$ is a subspace of $V$.
(e) Prove that $L(P \cup Q) = P + Q$.
6. Let $\mathbf{x}_1 = (1, 0, 0)$, $\mathbf{x}_2 = (1, 1, 0)$, $\mathbf{x}_3 = (1, 2, 0)$, $\mathbf{x}_4 = (1, 1, 1)$ and let $S = \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4\}$.
Determine all $\mathbf{x}_i$ such that $L(S) = L(S \setminus \{\mathbf{x}_i\})$.
7. Let $P = L(\{(1, 0, 0), (1, 1, 0)\})$ and $Q = L(\{(1, 1, 1)\})$ be subspaces of $\mathbb{R}^3$. Show that $P + Q = \mathbb{R}^3$
and $P \cap Q = \{\mathbf{0}\}$. If $\mathbf{u} \in \mathbb{R}^3$, determine $\mathbf{u}_P$, $\mathbf{u}_Q$ such that $\mathbf{u} = \mathbf{u}_P + \mathbf{u}_Q$ where $\mathbf{u}_P \in P$ and
$\mathbf{u}_Q \in Q$. Is it necessary that $\mathbf{u}_P$ and $\mathbf{u}_Q$ are unique?
8. Let $P = L(\{(1, -1, 0), (1, 1, 0)\})$ and $Q = L(\{(1, 1, 1), (1, 2, 1)\})$ be subspaces of $\mathbb{R}^3$. Show that
$P + Q = \mathbb{R}^3$ and $P \cap Q \neq \{\mathbf{0}\}$. Also, find a vector $\mathbf{u} \in \mathbb{R}^3$ such that $\mathbf{u}$ cannot be written
uniquely in the form $\mathbf{u} = \mathbf{u}_P + \mathbf{u}_Q$ where $\mathbf{u}_P \in P$ and $\mathbf{u}_Q \in Q$.
In this section, we saw that a non-zero vector space over $F$ has an infinite number of vectors. Hence, one can start
with any finite collection of vectors and obtain their span. This means that a vector space typically contains
a large, often infinite, number of vector subspaces. Therefore, the following questions arise:
1. What are the conditions under which the linear spans of two distinct sets are the same?
2. Is it possible to find/choose vectors so that the linear span of the chosen vectors is the whole
vector space itself?
3. Suppose we are able to choose certain vectors whose linear span is the whole space. Can we
find the minimum number of such vectors?
We try to answer these questions in the subsequent sections.
3.2 Linear Independence
Definition 3.2.1 (Linear Independence and Dependence) Let $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ be a
non-empty subset of a vector space $V(F)$. The set $S$ is said to be linearly independent if the system
of equations
$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_m\mathbf{u}_m = \mathbf{0}, \qquad (3.2.1)$$
in the unknowns $\alpha_i$, $1 \leq i \leq m$, has only the trivial solution. If the system (3.2.1) has a non-trivial
solution then the set $S$ is said to be linearly dependent.
Example 3.2.2 Is the set $S$ a linearly independent set? Give reasons.
1. Let $S = \{(1, 2, 1), (2, 1, 4), (3, 3, 5)\}$.
Solution: Consider the linear system $a(1, 2, 1) + b(2, 1, 4) + c(3, 3, 5) = (0, 0, 0)$ in the
unknowns a, b and c. It can be checked that this system has an infinite number of solutions. Hence
$S$ is a linearly dependent subset of $\mathbb{R}^3$.
2. Let $S = \{(1, 1, 1), (1, 1, 0), (1, 0, 1)\}$.
Solution: Consider the system $a(1, 1, 1) + b(1, 1, 0) + c(1, 0, 1) = (0, 0, 0)$ in the unknowns a, b
and c. Check that this system has only the trivial solution. Hence $S$ is a linearly independent
subset of $\mathbb{R}^3$.
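For vectors in $\mathbb{R}^n$, the homogeneous system of Definition 3.2.1 has only the trivial solution exactly when the matrix having the given vectors as rows has rank equal to the number of vectors. The sketch below checks the two sets of Example 3.2.2 in this way, taking the entries exactly as printed above. The function names `rank` and `is_linearly_independent` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """True when only the trivial combination of the vectors gives the zero vector."""
    return rank([list(v) for v in vectors]) == len(vectors)

S1 = [(1, 2, 1), (2, 1, 4), (3, 3, 5)]
S2 = [(1, 1, 1), (1, 1, 0), (1, 0, 1)]
print(is_linearly_independent(S1))
print(is_linearly_independent(S2))
```

In the first set the third vector is the sum of the first two, which is why the rank drops to 2.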
In other words, if $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$ is a non-empty subset of a vector space $V$, then one needs
to solve the linear system of equations
$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_m\mathbf{u}_m = \mathbf{0} \qquad (3.2.2)$$
in the unknowns $\alpha_1, \ldots, \alpha_m$. If $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$ is the only solution of (3.2.2), then $S$
is a linearly independent subset of $V$. Otherwise, the set $S$ is a linearly dependent subset of $V$. We
are now ready to state the following important results. The proof of only the first part is given.
The reader is required to supply the proof of the other parts.
Proposition 3.2.3 Let $V(F)$ be a vector space.
1. Then the zero-vector cannot belong to a linearly independent set.
2. A non-empty subset of a linearly independent set of $V$ is also linearly independent.
3. Every set containing a linearly dependent set of $V$ is also linearly dependent.
Proof. Let $S = \{\mathbf{0} = \mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be a set containing the zero vector. Then for any $\gamma \neq 0$,
$\gamma\mathbf{u}_1 + 0\mathbf{u}_2 + \cdots + 0\mathbf{u}_n = \mathbf{0}$. Hence, the system $\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_n\mathbf{u}_n = \mathbf{0}$ has a non-trivial
solution $(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\gamma, 0, \ldots, 0)$. Thus, the set $S$ is linearly dependent.
Theorem 3.2.4 Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$ be a linearly independent subset of a vector space $V(F)$. If for
some $\mathbf{v} \in V$, the set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p, \mathbf{v}\}$ is linearly dependent, then $\mathbf{v}$ is a linear combination of
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$.
Proof. Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_p, \mathbf{v}\}$ is linearly dependent, there exist scalars $c_1, \ldots, c_{p+1}$, not all zero,
such that
$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_p\mathbf{v}_p + c_{p+1}\mathbf{v} = \mathbf{0}. \qquad (3.2.3)$$
Claim: $c_{p+1} \neq 0$.
Let if possible $c_{p+1} = 0$. As the scalars in Equation (3.2.3) are not all zero, the linear system
$\alpha_1\mathbf{v}_1 + \cdots + \alpha_p\mathbf{v}_p = \mathbf{0}$ in the unknowns $\alpha_1, \ldots, \alpha_p$ has a non-trivial solution $(c_1, \ldots, c_p)$. This, by the definition
of linear independence, implies that the set $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ is linearly dependent, a contradiction to
our hypothesis. Thus, $c_{p+1} \neq 0$ and we get
$$\mathbf{v} = -\frac{1}{c_{p+1}}(c_1\mathbf{v}_1 + \cdots + c_p\mathbf{v}_p) \in L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p)$$
as $-\frac{c_i}{c_{p+1}} \in F$ for $1 \leq i \leq p$. Thus, the result follows.
We now state a very important corollary of Theorem 3.2.4 without proof. The readers are
advised to supply the proof for themselves.
Corollary 3.2.5 Let $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be a subset of a vector space $V(F)$. If $S$ is linearly
1. dependent then there exists a $k$, $2 \leq k \leq n$ with $L(\mathbf{u}_1, \ldots, \mathbf{u}_k) = L(\mathbf{u}_1, \ldots, \mathbf{u}_{k-1})$.
2. independent and there is a vector $\mathbf{v} \in V$ with $\mathbf{v} \notin L(S)$ then $\{\mathbf{u}_1, \ldots, \mathbf{u}_n, \mathbf{v}\}$ is also a linearly
independent subset of $V$.
Exercise 3.2.6 1. Consider the vector space $\mathbb{R}^2$. Let $\mathbf{u}_1 = (1, 0)$. Find all choices for the vector
$\mathbf{u}_2$ such that $\{\mathbf{u}_1, \mathbf{u}_2\}$ is a linearly independent subset of $\mathbb{R}^2$. Do there exist vectors $\mathbf{u}_2$ and $\mathbf{u}_3$
such that $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is a linearly independent subset of $\mathbb{R}^2$?
2. Let $S = \{(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 1)\} \subset \mathbb{R}^4$. Does $(1, 1, 2, 1) \in L(S)$? Furthermore,
determine conditions on x, y, z and u such that $(x, y, z, u) \in L(S)$.
3. Show that $S = \{(1, 2, 3), (2, 1, 1), (8, 6, 10)\} \subset \mathbb{R}^3$ is linearly dependent.
4. Show that $S = \{(1, 0, 0), (1, 1, 0), (1, 1, 1)\} \subset \mathbb{R}^3$ is linearly independent.
5. Prove that $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a linearly independent subset of $V(F)$ if and only if
$\{\mathbf{u}_1, \mathbf{u}_1 + \mathbf{u}_2, \ldots, \mathbf{u}_1 + \cdots + \mathbf{u}_n\}$ is a linearly independent subset of $V(F)$.
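6. Find 3 vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^4$ such that $\{\mathbf{u}, \mathbf{v}, \mathbf{w}\}$ is linearly dependent whereas $\{\mathbf{u}, \mathbf{v}\}$, $\{\mathbf{u}, \mathbf{w}\}$
and $\{\mathbf{v}, \mathbf{w}\}$ are linearly independent.
7. What is the maximum number of linearly independent vectors in $\mathbb{R}^3$?
8. Show that any set of k vectors in $\mathbb{R}^3$ is linearly dependent if $k \geq 4$.
9. Is $\{(1, 0), (i, 0)\}$ a linearly independent subset of $\mathbb{C}^2(\mathbb{R})$?
10. Suppose $V$ is a collection of vectors such that $V(\mathbb{C})$ as well as $V(\mathbb{R})$ are vector spaces. Prove
that the set $\{\mathbf{u}_1, \ldots, \mathbf{u}_k, i\mathbf{u}_1, \ldots, i\mathbf{u}_k\}$ is a linearly independent subset of $V(\mathbb{R})$ if and only if
$\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is a linearly independent subset of $V(\mathbb{C})$.
11. Let $M$ be a subspace of $V$ and let $\mathbf{u}, \mathbf{v} \in V$. Define $K = L(M, \mathbf{u})$ and $H = L(M, \mathbf{v})$. If $\mathbf{v} \in K$
and $\mathbf{v} \notin M$ prove that $\mathbf{u} \in H$.
12. Let $A \in M_n(\mathbb{R})$ and let $\mathbf{x}$ and $\mathbf{y}$ be two non-zero vectors such that $A\mathbf{x} = 3\mathbf{x}$ and $A\mathbf{y} = 2\mathbf{y}$.
Prove that $\mathbf{x}$ and $\mathbf{y}$ are linearly independent.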
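13. Let $A = \begin{bmatrix} 2 & 1 & 3 \\ 4 & -1 & 3 \\ 3 & -2 & 5 \end{bmatrix}$. Determine non-zero vectors $\mathbf{x}$, $\mathbf{y}$ and $\mathbf{z}$ such that $A\mathbf{x} = 6\mathbf{x}$, $A\mathbf{y} = 2\mathbf{y}$
and $A\mathbf{z} = -2\mathbf{z}$. Use the vectors $\mathbf{x}$, $\mathbf{y}$ and $\mathbf{z}$ obtained here to prove the following.
(a) $A^2\mathbf{v} = 4\mathbf{v}$, where $\mathbf{v} = c\mathbf{y} + d\mathbf{z}$ for any real numbers c and d.
(b) The set $\{\mathbf{x}, \mathbf{y}, \mathbf{z}\}$ is linearly independent.
(c) Let $P = [\mathbf{x}, \mathbf{y}, \mathbf{z}]$ be a $3 \times 3$ matrix. Then $P$ is invertible.
(d) Let $D = \begin{bmatrix} 6 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -2 \end{bmatrix}$. Then $AP = PD$.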
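14. Let $P$ and $Q$ be subspaces of $\mathbb{R}^n$ such that $P + Q = \mathbb{R}^n$ and $P \cap Q = \{\mathbf{0}\}$. Prove that each
$\mathbf{u} \in \mathbb{R}^n$ is uniquely expressible as $\mathbf{u} = \mathbf{u}_P + \mathbf{u}_Q$, where $\mathbf{u}_P \in P$ and $\mathbf{u}_Q \in Q$.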
3.3 Bases
Definition 3.3.1 (Basis of a Vector Space) A basis of a vector space $V$ is a subset $B$ of $V$ such
that $B$ is a linearly independent set in $V$ and the linear span of $B$ is $V$. Also, any element of $B$ is
called a basis vector.
Remark 3.3.2 Let $B$ be a basis of a vector space $V(F)$. Then for each $\mathbf{v} \in V$, there exist vectors
$\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n \in B$ such that $\mathbf{v} = \sum_{i=1}^{n} \alpha_i\mathbf{u}_i$, where $\alpha_i \in F$, for $1 \leq i \leq n$. By convention, the linear
span of an empty set is $\{\mathbf{0}\}$. Hence, the empty set is a basis of the vector space $\{\mathbf{0}\}$.
Lemma 3.3.3 Let $B$ be a basis of a vector space $V(F)$. Then each $\mathbf{v} \in V$ is a unique linear
combination of the basis vectors.
Proof. On the contrary, assume that there exists $\mathbf{v} \in V$ that can be expressed in at least two
ways as a linear combination of basis vectors. That means, there exist a positive integer p, scalars
$\alpha_i, \beta_i \in F$ and $\mathbf{v}_i \in B$ such that
$$\mathbf{v} = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \cdots + \alpha_p\mathbf{v}_p \quad\text{and}\quad \mathbf{v} = \beta_1\mathbf{v}_1 + \beta_2\mathbf{v}_2 + \cdots + \beta_p\mathbf{v}_p.$$
Equating the two expressions of $\mathbf{v}$ leads to the expression
$$(\alpha_1 - \beta_1)\mathbf{v}_1 + (\alpha_2 - \beta_2)\mathbf{v}_2 + \cdots + (\alpha_p - \beta_p)\mathbf{v}_p = \mathbf{0}. \qquad (3.3.1)$$
Since the vectors are from $B$, by definition (see Definition 3.3.1) the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$ is a
linearly independent subset of $V$. This implies that the linear system $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$ in
the unknowns $c_1, c_2, \ldots, c_p$ has only the trivial solution. Thus, each of the scalars $\alpha_i - \beta_i$ appearing
in Equation (3.3.1) must be equal to 0. That is, $\alpha_i - \beta_i = 0$ for $1 \leq i \leq p$. Thus, for $1 \leq i \leq p$,
$\alpha_i = \beta_i$ and the result follows.
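Example 3.3.4 1. The set $\{1\}$ is a basis of the vector space $\mathbb{R}(\mathbb{R})$.
2. The set $\{(1, 1), (1, -1)\}$ is a basis of the vector space $\mathbb{R}^2(\mathbb{R})$.
3. Fix a positive integer n and let $\mathbf{e}_i = (0, \ldots, 0, \underbrace{1}_{i\text{th place}}, 0, \ldots, 0) \in \mathbb{R}^n$ for $1 \leq i \leq n$. Then
$B = \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is called the standard basis of $\mathbb{R}^n$.
(a) $B = \{\mathbf{e}_1\} = \{1\}$ is the standard basis of $\mathbb{R}(\mathbb{R})$.
(b) $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ with $\mathbf{e}_1 = (1, 0)$ and $\mathbf{e}_2 = (0, 1)$ is the standard basis of $\mathbb{R}^2$.
(c) $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ with $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, 1, 0)$ and $\mathbf{e}_3 = (0, 0, 1)$ is the standard basis
of $\mathbb{R}^3$.
(d) $B = \{(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)\}$ is the standard basis of $\mathbb{R}^4$.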
4. Let V = (x, y, 0) : x, y R R
3
. Then B = (2, 0, 0), (1, 3, 0) is a basis of V.
5. Let V = (x, y, z) R
3
: x + y z = 0 be a vector subspace of R
3
. As each element
(x, y, z) V satises x +y z = 0. Or equivalently z = x +y and hence
(x, y, z) = (x, y, x +y) = (x, 0, x) + (0, y, y) = x(1, 0, 1) +y(0, 1, 1).
Hence (1, 0, 1), (0, 1, 1) forms a basis of V.
6. Let V = a + ib : a, b R be a complex vector space. Then any element a + ib V equals
a +ib = (a +ib) 1. Hence a basis of V is 1.
7. Let V = a + ib : a, b R be a real vector space. Then 1, i is a basis of V (R) as
a +ib = a 1 +b i for a, b R and 1, i is a linearly independent subset of V (R).
8. In C
2
, (a+ib, c+id) = (a+ib)(1, 0)+(c+id)(0, 1). So, (1, 0), (0, 1) is a basis of the complex
vector space C
2
.
9. In case of the real vector space C
2
, (a +ib, c +id) = a(1, 0) +b(i, 0) +c(0, 1) +d(0, i). Hence
(1, 0), (i, 0), (0, 1), (0, i) is a basis.
10. B = e
1
, e
2
, . . . , e
n
is the standard basis of C
n
. But B is not a basis of the real vector space
C
n
.
Before coming to the end of this section, we give an algorithm to obtain a basis of any finite dimensional vector space $V$. This will be done by a repeated application of Corollary 3.2.5. The algorithm proceeds as follows:
Step 1: Let $v_1 \in V$ with $v_1 \neq \mathbf{0}$. Then $\{v_1\}$ is linearly independent.
Step 2: If $V = L(v_1)$, we have got a basis of $V$. Else, pick $v_2 \in V$ such that $v_2 \notin L(v_1)$. Then by Corollary 3.2.5.2, $\{v_1, v_2\}$ is linearly independent.
Step i: Either $V = L(v_1, v_2, \ldots, v_i)$ or $L(v_1, v_2, \ldots, v_i) \neq V$. In the first case, $\{v_1, v_2, \ldots, v_i\}$ is a basis of $V$. In the second case, pick $v_{i+1} \in V$ with $v_{i+1} \notin L(v_1, v_2, \ldots, v_i)$. Then, by Corollary 3.2.5.2, the set $\{v_1, v_2, \ldots, v_{i+1}\}$ is linearly independent.
This process will finally end as $V$ is a finite dimensional vector space.
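The steps above can be sketched in code for subspaces of $\mathbb{R}^n$. The following is only an illustration under stated assumptions: the names `rank` and `greedy_basis` are hypothetical, and the step "pick a vector outside the current span" is replaced by scanning a finite list of candidate vectors that is assumed to span $V$. A candidate is kept exactly when adding it raises the rank, that is, when it lies outside the span of the vectors kept so far.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def greedy_basis(candidates):
    """Keep each candidate that is not in the span of the vectors kept so far."""
    basis = []
    for v in candidates:
        if rank(basis + [v]) > len(basis):  # v lies outside L(basis)
            basis.append(v)
    return basis

# Hypothetical spanning list for R^3, containing the zero vector and repeats.
candidates = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
basis = greedy_basis(candidates)
print(basis)
```

The zero vector and any vector already in the span are skipped, so the loop mirrors Step 1 through Step i; it stops because the candidate list is finite.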
Exercise 3.3.5 1. Let $u_1, u_2, \ldots, u_n$ be basis vectors of a vector space $V$. Then prove that whenever $\sum_{i=1}^{n} \alpha_i u_i = \mathbf{0}$, we must have $\alpha_i = 0$ for each $i = 1, \ldots, n$.
2. Find a basis of $\mathbb{R}^3$ containing the vector (1, 1, 2).
3. Find a basis of $\mathbb{R}^3$ containing the vectors (1, 1, 2) and (1, 2, 1).
4. Is it possible to find a basis of $\mathbb{R}^4$ containing the vectors $(1, 1, 1, -2)$, $(1, 2, -1, 1)$ and $(1, -2, 7, -11)$?
5. Let $S = \{v_1, v_2, \ldots, v_p\}$ be a subset of a vector space $V(\mathbb{F})$. Suppose $L(S) = V$ but $S$ is not a linearly independent set. Then prove that each vector in $V$ can be expressed in more than one way as a linear combination of vectors from $S$.
6. Show that the set $\{(1, 0, 1), (1, i, 0), (1, 1, 1 - i)\}$ is a basis of $\mathbb{C}^3$.
7. Find a basis of the real vector space $\mathbb{C}^n$ containing the basis $\mathcal{B}$ given in Example 3.3.4.10.
8. Find a basis of $V = \{(x, y, z, u) \in \mathbb{R}^4 : x - y - z = 0,\ x + z - u = 0\}$.
9. Let $A = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 2 & 3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$. Find a basis of $V = \{x^t \in \mathbb{R}^5 : Ax = \mathbf{0}\}$.
10. Prove that $\{1, x, x^2, \ldots, x^n, \ldots\}$ is a basis of the vector space $\mathcal{P}(\mathbb{R})$. This basis has an infinite number of vectors. This is also called the standard basis of $\mathcal{P}(\mathbb{R})$.
11. Let $u^t = (1, 1, 2)$, $v^t = (1, 2, 3)$ and $w^t = (1, 10, 1)$. Find a basis of $L(u, v, w)$. Determine a geometrical representation of $L(u, v, w)$.
12. Prove that $\{(1, 0, 0), (1, 1, 0), (1, 1, 1)\}$ is a basis of $\mathbb{C}^3$. Is it a basis of $\mathbb{C}^3(\mathbb{R})$?
3.3.1 Dimension of a Finite Dimensional Vector Space
We first prove a result which helps us in associating a non-negative integer to every finite dimensional vector space.
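Theorem 3.3.6 Let $V$ be a vector space with basis $\{v_1, v_2, \ldots, v_n\}$. Let $m$ be a positive integer with $m > n$. Then the set $S = \{w_1, w_2, \ldots, w_m\} \subset V$ is linearly dependent.
Proof. We need to show that the linear system
$$\alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_m w_m = \mathbf{0} \tag{3.3.2}$$
in the unknowns $\alpha_1, \alpha_2, \ldots, \alpha_m$ has a non-trivial solution. We start by expressing the vectors $w_j$ in terms of the basis vectors $v_i$.
As $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$, for each $w_j \in V$, $1 \le j \le m$, there exist unique scalars $a_{ij}$, $1 \le i \le n$, $1 \le j \le m$, such that
$$\begin{aligned}
w_1 &= a_{11} v_1 + a_{21} v_2 + \cdots + a_{n1} v_n, \\
w_2 &= a_{12} v_1 + a_{22} v_2 + \cdots + a_{n2} v_n, \\
&\ \ \vdots \\
w_m &= a_{1m} v_1 + a_{2m} v_2 + \cdots + a_{nm} v_n.
\end{aligned}$$
Hence, Equation (3.3.2) can be rewritten as
$$\alpha_1 \Big( \sum_{j=1}^{n} a_{j1} v_j \Big) + \alpha_2 \Big( \sum_{j=1}^{n} a_{j2} v_j \Big) + \cdots + \alpha_m \Big( \sum_{j=1}^{n} a_{jm} v_j \Big) = \mathbf{0}.$$
Or equivalently,
$$\Big( \sum_{i=1}^{m} \alpha_i a_{1i} \Big) v_1 + \Big( \sum_{i=1}^{m} \alpha_i a_{2i} \Big) v_2 + \cdots + \Big( \sum_{i=1}^{m} \alpha_i a_{ni} \Big) v_n = \mathbf{0}. \tag{3.3.3}$$
Since $\{v_1, \ldots, v_n\}$ is a basis, using Exercise 3.3.5.1, we get
$$\sum_{i=1}^{m} \alpha_i a_{1i} = \sum_{i=1}^{m} \alpha_i a_{2i} = \cdots = \sum_{i=1}^{m} \alpha_i a_{ni} = 0.$$
Therefore, finding $\alpha_i$'s satisfying Equation (3.3.2) reduces to solving the homogeneous system $A\alpha = \mathbf{0}$, where
$$\alpha = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}.$$
Since $n < m$, Corollary 2.1.23.2 (here the matrix $A$ is $n \times m$, so it has fewer rows than columns) implies that $A\alpha = \mathbf{0}$ has a non-trivial solution. Hence Equation (3.3.2) has a non-trivial solution and thus $\{w_1, w_2, \ldots, w_m\}$ is a linearly dependent set.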
Corollary 3.3.7 Let $\mathcal{B}_1 = \{u_1, \ldots, u_n\}$ and $\mathcal{B}_2 = \{v_1, \ldots, v_m\}$ be two bases of a finite dimensional vector space $V$. Then $m = n$.
Proof. Let if possible, $m > n$. Then by Theorem 3.3.6, $\{v_1, \ldots, v_m\}$ is a linearly dependent subset of $V$, contradicting the assumption that $\mathcal{B}_2$ is a basis of $V$. Hence we must have $m \le n$. A similar argument implies $n \le m$ and hence $m = n$.
Let $V$ be a finite dimensional vector space. Then Corollary 3.3.7 implies that the number of elements in any basis of $V$ is the same. This number is used to define the dimension of any finite dimensional vector space.
Definition 3.3.8 (Dimension of a Finite Dimensional Vector Space) Let $V$ be a finite dimensional vector space. Then the dimension of $V$, denoted $\dim(V)$, is the number of elements in a basis of $V$.
Note that Corollary 3.2.5.2 can be used to generate a basis of any non-trivial finite dimensional vector space.
Example 3.3.9 The dimensions of the vector spaces in Example 3.3.4 are as follows:
1. $\dim(\mathbb{R}) = 1$ in Example 3.3.4.1.
2. $\dim(\mathbb{R}^2) = 2$ in Example 3.3.4.2.
3. $\dim(V) = 2$ in Example 3.3.4.4.
4. $\dim(V) = 2$ in Example 3.3.4.5.
5. $\dim(V) = 1$ in Example 3.3.4.6.
6. $\dim(V) = 2$ in Example 3.3.4.7.
7. $\dim(\mathbb{C}^2) = 2$ in Example 3.3.4.8.
8. $\dim(\mathbb{C}^2(\mathbb{R})) = 4$ in Example 3.3.4.9.
9. For a fixed positive integer $n$, $\dim(\mathbb{R}^n) = n$ in Example 3.3.4.3 and in Example 3.3.4.10, one has $\dim(\mathbb{C}^n) = n$ and $\dim(\mathbb{C}^n(\mathbb{R})) = 2n$.
Thus, we see that the dimension of a vector space depends on the set of scalars.
Example 3.3.10 Let $V$ be the set of all functions $f : \mathbb{R}^n \to \mathbb{R}$ with the property that $f(x + y) = f(x) + f(y)$ and $f(\alpha x) = \alpha f(x)$ for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. For any $f, g \in V$ and $t \in \mathbb{R}$, define
$$(f \oplus g)(x) = f(x) + g(x) \quad \text{and} \quad (t \odot f)(x) = f(tx).$$
Then it can be easily verified that $V$ is a real vector space. Also, for $1 \le i \le n$, define the functions $e_i(x) = e_i\big((x_1, x_2, \ldots, x_n)\big) = x_i$. Then it can be easily verified that the set $\{e_1, e_2, \ldots, e_n\}$ is a basis of $V$ and hence $\dim(V) = n$.
The next theorem follows directly from Corollary 3.2.5.2 and Theorem 3.3.6. Hence, the proof is omitted.
Theorem 3.3.11 Let $S$ be a linearly independent subset of a finite dimensional vector space $V$. Then the set $S$ can be extended to form a basis of $V$.
Theorem 3.3.11 is equivalent to the following statement:
Let $V$ be a vector space of dimension $n$. Suppose we have found a linearly independent subset $\{v_1, \ldots, v_r\}$ of $V$ with $r < n$. Then it is possible to find vectors $v_{r+1}, \ldots, v_n$ in $V$ such that $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$. Thus, one has the following important corollary.
Corollary 3.3.12 Let $V$ be a vector space of dimension $n$. Then
1. any set consisting of $n$ linearly independent vectors forms a basis of $V$.
2. any subset $S$ of $V$ having $n$ vectors with $L(S) = V$ forms a basis of $V$.
Exercise 3.3.13 1. Determine $\dim(\mathcal{P}_n(\mathbb{R}))$. Is $\dim(\mathcal{P}(\mathbb{R}))$ finite?
2. Let $W_1$ and $W_2$ be two subspaces of a vector space $V$ such that $W_1 \subset W_2$. Show that $W_1 = W_2$ if and only if $\dim(W_1) = \dim(W_2)$.
3. Consider the vector space $C([-\pi, \pi])$. For each integer $n$, define $e_n(x) = \sin(nx)$. Prove that $\{e_n : n = 1, 2, \ldots\}$ is a linearly independent set.
[Hint: For any positive integer $\ell$, consider the set $\{e_{k_1}, \ldots, e_{k_\ell}\}$ and the linear system
$$\alpha_1 \sin(k_1 x) + \alpha_2 \sin(k_2 x) + \cdots + \alpha_\ell \sin(k_\ell x) = 0 \quad \text{for all } x \in [-\pi, \pi]$$
in the unknowns $\alpha_1, \ldots, \alpha_\ell$. Now for suitable values of $m$, consider the integral
$$\int_{-\pi}^{\pi} \sin(mx) \big( \alpha_1 \sin(k_1 x) + \alpha_2 \sin(k_2 x) + \cdots + \alpha_\ell \sin(k_\ell x) \big)\, dx$$
to get the required result.]
4. Determine a basis and dimension of $W = \{(x, y, z, w) \in \mathbb{R}^4 : x + y - z + w = 0\}$.
5. Let $W_1$ be a subspace of a vector space $V$. If $\dim(V) = n$ and $\dim(W_1) = k$ with $k \ge 1$ then prove that there exists a subspace $W_2$ of $V$ such that $W_1 \cap W_2 = \{\mathbf{0}\}$, $W_1 + W_2 = V$ and $\dim(W_2) = n - k$. Also, prove that for each $v \in V$ there exist unique vectors $w_1 \in W_1$ and $w_2 \in W_2$ such that $v = w_1 + w_2$. The subspace $W_2$ is called the complementary subspace of $W_1$ in $V$.
6. Is the set $W = \{p(x) \in \mathcal{P}_4(\mathbb{R}) : p(-1) = p(1) = 0\}$ a subspace of $\mathcal{P}_4(\mathbb{R})$? If yes, find its dimension.
3.3.2 Application to the study of $\mathbb{C}^n$
In this subsection, we will study results that are intrinsic to the understanding of linear algebra, especially results associated with matrices. We start with a few exercises that should have appeared in previous sections of this chapter.
Exercise 3.3.14 1. Let $V = \{A \in M_2(\mathbb{C}) : \operatorname{tr}(A) = 0\}$, where $\operatorname{tr}(A)$ stands for the trace of the matrix $A$. Show that $V$ is a real vector space and find its basis. Is W =
__
a b
c a
_
: c = b
_
a subspace of V ?
2. Find the dimension and a basis for each of the subspaces given below.
(a) $sl_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) : \operatorname{tr}(A) = 0\}$.
(b) $S_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) : A = A^t\}$.
(c) $A_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) : A + A^t = \mathbf{0}\}$.
(d) $sl_n(\mathbb{C}) = \{A \in M_n(\mathbb{C}) : \operatorname{tr}(A) = 0\}$.
(e) $S_n(\mathbb{C}) = \{A \in M_n(\mathbb{C}) : A = A^*\}$.
(f) $A_n(\mathbb{C}) = \{A \in M_n(\mathbb{C}) : A + A^* = \mathbf{0}\}$.
3. Does there exist an $A \in M_2(\mathbb{C})$ satisfying $A^2 \neq \mathbf{0}$ but $A^3 = \mathbf{0}$?
4. Prove that there does not exist an $A \in M_n(\mathbb{C})$ satisfying $A^n \neq \mathbf{0}$ but $A^{n+1} = \mathbf{0}$. That is, if $A$ is an $n \times n$ nilpotent matrix then the order of nilpotency is at most $n$.
5. Let $A \in M_n(\mathbb{C})$ be a triangular matrix. Then the rows/columns of $A$ form a linearly independent subset of $\mathbb{C}^n$ if and only if $a_{ii} \neq 0$ for $1 \le i \le n$.
6. Prove that the rows/columns of $A \in M_n(\mathbb{C})$ are linearly independent if and only if $\det(A) \neq 0$.
7. Prove that the rows/columns of $A \in M_n(\mathbb{C})$ span $\mathbb{C}^n$ if and only if $A$ is an invertible matrix.
8. Let $A$ be a skew-symmetric matrix of odd order. Prove that the rows/columns of $A$ are linearly dependent. Hint: What is $\det(A)$?
We now define subspaces that are associated with matrices.
Definition 3.3.15 Let $A \in M_{m \times n}(\mathbb{C})$ and let $R_1, R_2, \ldots, R_m \in \mathbb{C}^n$ be the rows of $A$ and $a_1, a_2, \ldots, a_n \in \mathbb{C}^m$ be its columns. We define
1. RowSpace($A$), denoted $\mathcal{R}(A)$, as $\mathcal{R}(A) = L(R_1, R_2, \ldots, R_m) \subset \mathbb{C}^n$,
2. ColumnSpace($A$), denoted $\mathcal{C}(A)$, as $\mathcal{C}(A) = L(a_1, a_2, \ldots, a_n) \subset \mathbb{C}^m$,
3. NullSpace($A$), denoted $\mathcal{N}(A)$, as $\mathcal{N}(A) = \{x^t \in \mathbb{C}^n : Ax = \mathbf{0}\}$,
4. Range($A$), denoted $\operatorname{Im}(A)$, as $\operatorname{Im}(A) = \{y : Ax = y \text{ for some } x^t \in \mathbb{C}^n\}$.
Note that the column space of $A$ consists of all $b$ such that $Ax = b$ has a solution. Hence, $\mathcal{C}(A) = \operatorname{Im}(A)$. The above subspaces can also be defined for $A^*$, the conjugate transpose of $A$. We illustrate the above definitions with the help of an example and then ask the readers to solve the exercises that appear after the example.
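Example 3.3.16 Compute the above mentioned subspaces for $A = \begin{bmatrix} 1 & 1 & 1 & -2 \\ 1 & 2 & -1 & 1 \\ 1 & -2 & 7 & -11 \end{bmatrix}$.
Solution: Verify the following:
1. $\mathcal{R}(A) = L(R_1, R_2, R_3) = \{(x, y, z, u) \in \mathbb{C}^4 : 3x - 2y = z,\ 5x - 3y + u = 0\} = \mathcal{C}(A^*)$.
2. $\mathcal{C}(A) = L(a_1, a_2, a_3, a_4) = \{(x, y, z) \in \mathbb{C}^3 : 4x - 3y - z = 0\} = \mathcal{R}(A^*)$.
3. $\mathcal{N}(A) = \{(x, y, z, u) \in \mathbb{C}^4 : x + 3z - 5u = 0,\ y - 2z + 3u = 0\}$.
4. $\mathcal{N}(A^*) = \{(x, y, z) \in \mathbb{C}^3 : x + 4z = 0,\ y - 3z = 0\}$.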
Exercise 3.3.17 1. Let $A \in M_{m \times n}(\mathbb{C})$. Then prove that
(a) $\mathcal{R}(A)$ is a subspace of $\mathbb{C}^n$,
(b) $\mathcal{C}(A)$ is a subspace of $\mathbb{C}^m$,
(c) $\mathcal{N}(A)$ is a subspace of $\mathbb{C}^n$,
(d) $\mathcal{N}(A^*)$ is a subspace of $\mathbb{C}^m$,
(e) $\mathcal{R}(A) = \mathcal{C}(A^*)$ and $\mathcal{C}(A) = \mathcal{R}(A^*)$.
2. Let $A = \begin{bmatrix} 1 & 2 & 1 & 3 & 2 \\ 0 & 2 & 2 & 2 & 4 \\ 2 & 2 & 4 & 0 & 8 \\ 4 & 2 & 5 & 6 & 10 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 4 & 0 & 6 \\ 1 & 0 & 2 & 5 \\ 3 & 5 & 1 & 4 \\ 1 & 1 & 1 & 2 \end{bmatrix}$.
(a) Find the row-reduced echelon forms of $A$ and $B$.
(b) Find $P_1$ and $P_2$ such that $P_1 A$ and $P_2 B$ are in row-reduced echelon form.
(c) Find a basis each for the row spaces of $A$ and $B$.
(d) Find a basis each for the range spaces of $A$ and $B$.
(e) Find bases of the null spaces of $A$ and $B$.
(f) Find the dimensions of all the vector subspaces so obtained.
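Lemma 3.3.18 Let $A \in M_{m \times n}(\mathbb{C})$ and let $B = EA$ for some elementary matrix $E$. Then $\mathcal{R}(A) = \mathcal{R}(B)$ and $\dim(\mathcal{R}(A)) = \dim(\mathcal{R}(B))$.
Proof. We prove the result for the elementary matrix $E_{ij}(c)$, where $c \neq 0$ and $1 \le i < j \le m$. The readers are advised to prove the results for other elementary matrices. Let $R_1, R_2, \ldots, R_m$ be the rows of $A$. Then $B = E_{ij}(c)A$ implies
$$\begin{aligned}
\mathcal{R}(B) &= L(R_1, \ldots, R_{i-1}, R_i + cR_j, R_{i+1}, \ldots, R_m) \\
&= \{\alpha_1 R_1 + \cdots + \alpha_{i-1} R_{i-1} + \alpha_i (R_i + cR_j) + \cdots + \alpha_m R_m : \alpha_\ell \in \mathbb{C},\ 1 \le \ell \le m\} \\
&= \Big\{ \sum_{\ell=1}^{m} \alpha_\ell R_\ell + \alpha_i (cR_j) : \alpha_\ell \in \mathbb{C},\ 1 \le \ell \le m \Big\} \\
&= \Big\{ \sum_{\ell=1}^{m} \beta_\ell R_\ell : \beta_\ell \in \mathbb{C},\ 1 \le \ell \le m \Big\} = L(R_1, \ldots, R_m) = \mathcal{R}(A).
\end{aligned}$$
Hence, the proof of the lemma is complete.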
We omit the proof of the next result as the proof is similar to the proof of Lemma 3.3.18.
Lemma 3.3.19 Let $A \in M_{m \times n}(\mathbb{C})$ and let $C = AE$ for some elementary matrix $E$. Then $\mathcal{C}(A) = \mathcal{C}(C)$ and $\dim(\mathcal{C}(A)) = \dim(\mathcal{C}(C))$.
The first and second parts of the next result are a repeated application of Lemma 3.3.18 and Lemma 3.3.19, respectively. Hence the proof is omitted. This result is also helpful in finding a basis of a subspace of $\mathbb{C}^n$.
Corollary 3.3.20 Let $A \in M_{m \times n}(\mathbb{C})$. If
1. $B$ is the row-reduced echelon form of $A$ then $\mathcal{R}(A) = \mathcal{R}(B)$. In particular, the non-zero rows of $B$ form a basis of $\mathcal{R}(A)$ and $\dim(\mathcal{R}(A)) = \dim(\mathcal{R}(B)) = \text{Row rank}(A)$.
2. the application of column operations gives a matrix $C$ that has the form given in Remark 2.3.6, then $\dim(\mathcal{C}(A)) = \dim(\mathcal{C}(C)) = \text{Column rank}(A)$ and the non-zero columns of $C$ form a basis of $\mathcal{C}(A)$.
Before proceeding with applications of Corollary 3.3.20, we first prove that for any $A \in M_{m \times n}(\mathbb{C})$, $\text{Row rank}(A) = \text{Column rank}(A)$.
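Theorem 3.3.21 Let $A \in M_{m \times n}(\mathbb{C})$. Then $\text{Row rank}(A) = \text{Column rank}(A)$.
Proof. Let $R_1, R_2, \ldots, R_m$ be the rows of $A$ and $C_1, C_2, \ldots, C_n$ be the columns of $A$. Let $\text{Row rank}(A) = r$. Then by Corollary 3.3.20.1, $\dim\big(L(R_1, R_2, \ldots, R_m)\big) = r$. Hence, there exist vectors
$$u_1^t = (u_{11}, \ldots, u_{1n}),\ u_2^t = (u_{21}, \ldots, u_{2n}),\ \ldots,\ u_r^t = (u_{r1}, \ldots, u_{rn}) \in \mathbb{C}^n$$
with
$$R_i \in L(u_1^t, u_2^t, \ldots, u_r^t) \subset \mathbb{C}^n, \quad \text{for all } i,\ 1 \le i \le m.$$
Therefore, there exist scalars $\alpha_{ij}$, $1 \le i \le m$, $1 \le j \le r$, such that
$$\begin{aligned}
R_1 &= \alpha_{11} u_1^t + \cdots + \alpha_{1r} u_r^t = \Big( \sum_{i=1}^{r} \alpha_{1i} u_{i1},\ \sum_{i=1}^{r} \alpha_{1i} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{1i} u_{in} \Big), \\
R_2 &= \alpha_{21} u_1^t + \cdots + \alpha_{2r} u_r^t = \Big( \sum_{i=1}^{r} \alpha_{2i} u_{i1},\ \sum_{i=1}^{r} \alpha_{2i} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{2i} u_{in} \Big),
\end{aligned}$$
and so on, till
$$R_m = \alpha_{m1} u_1^t + \cdots + \alpha_{mr} u_r^t = \Big( \sum_{i=1}^{r} \alpha_{mi} u_{i1},\ \sum_{i=1}^{r} \alpha_{mi} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{mi} u_{in} \Big).$$
So,
$$C_1 = \begin{bmatrix} \sum_{i=1}^{r} \alpha_{1i} u_{i1} \\ \vdots \\ \sum_{i=1}^{r} \alpha_{mi} u_{i1} \end{bmatrix} = u_{11} \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \\ \vdots \\ \alpha_{m1} \end{bmatrix} + u_{21} \begin{bmatrix} \alpha_{12} \\ \alpha_{22} \\ \vdots \\ \alpha_{m2} \end{bmatrix} + \cdots + u_{r1} \begin{bmatrix} \alpha_{1r} \\ \alpha_{2r} \\ \vdots \\ \alpha_{mr} \end{bmatrix}.$$
In general, for $1 \le j \le n$, we have
$$C_j = \begin{bmatrix} \sum_{i=1}^{r} \alpha_{1i} u_{ij} \\ \vdots \\ \sum_{i=1}^{r} \alpha_{mi} u_{ij} \end{bmatrix} = u_{1j} \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \\ \vdots \\ \alpha_{m1} \end{bmatrix} + u_{2j} \begin{bmatrix} \alpha_{12} \\ \alpha_{22} \\ \vdots \\ \alpha_{m2} \end{bmatrix} + \cdots + u_{rj} \begin{bmatrix} \alpha_{1r} \\ \alpha_{2r} \\ \vdots \\ \alpha_{mr} \end{bmatrix}.$$
Therefore, $C_1, C_2, \ldots, C_n$ are linear combinations of the $r$ vectors
$$(\alpha_{11}, \alpha_{21}, \ldots, \alpha_{m1})^t,\ (\alpha_{12}, \alpha_{22}, \ldots, \alpha_{m2})^t,\ \ldots,\ (\alpha_{1r}, \alpha_{2r}, \ldots, \alpha_{mr})^t.$$
Thus, by Corollary 3.3.20.2, $\text{Column rank}(A) = \dim\big(\mathcal{C}(A)\big) \le r = \text{Row rank}(A)$. A similar argument gives $\text{Row rank}(A) \le \text{Column rank}(A)$. Hence, we have the required result.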
Let $M$ and $N$ be two subspaces of a vector space $V(\mathbb{F})$. Then recall that (see Exercise 3.1.22.5d) $M + N = \{u + v : u \in M,\ v \in N\}$ is the smallest subspace of $V$ containing both $M$ and $N$. We now state a very important result that relates the dimensions of the three subspaces $M$, $N$ and $M + N$ (for a proof, see Appendix 7.3.1).
Theorem 3.3.22 Let $M$ and $N$ be two subspaces of a finite dimensional vector space $V(\mathbb{F})$. Then
$$\dim(M) + \dim(N) = \dim(M + N) + \dim(M \cap N). \tag{3.3.4}$$
Let $S$ be a subset of $\mathbb{R}^n$ and let $V = L(S)$. Then Theorem 3.3.6 and Corollary 3.3.20.1 can be used to obtain a basis of $V$. The algorithm proceeds as follows:
1. Construct a matrix $A$ whose rows are the vectors in $S$.
2. Apply row operations on $A$ to get $B$, a matrix in row echelon form.
3. Let $\mathcal{B}$ be the set of non-zero rows of $B$. Then $\mathcal{B}$ is a basis of $L(S) = V$.
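The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `row_echelon` and `basis_of_span` are hypothetical names, exact rational arithmetic is assumed so that zero rows are detected reliably, and pivots are scaled to 1. The input is the set $S = \{(1, 1, 1, 1), (1, 1, -1, 1), (1, 1, 0, 1), (1, -1, 1, 1)\}$.

```python
from fractions import Fraction

def row_echelon(matrix):
    """Forward elimination with row swaps; each pivot is scaled to 1."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        p = rows[r][col]
        rows[r] = [x / p for x in rows[r]]
        for i in range(r + 1, len(rows)):  # clear the entries below the pivot
            f = rows[i][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return rows

def basis_of_span(vectors):
    """Steps 1 to 3: stack the vectors as rows, reduce, keep the non-zero rows."""
    return [row for row in row_echelon(vectors) if any(x != 0 for x in row)]

S = [(1, 1, 1, 1), (1, 1, -1, 1), (1, 1, 0, 1), (1, -1, 1, 1)]
basis = basis_of_span(S)
print([[int(x) for x in row] for row in basis])
```

Because row swaps are allowed, the non-zero rows need not correspond to the first few vectors of $S$; they are a basis of $L(S)$, not necessarily a subset of $S$.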
Example 3.3.23 1. Let $S = \{(1, 1, 1, 1), (1, 1, -1, 1), (1, 1, 0, 1), (1, -1, 1, 1)\} \subset \mathbb{R}^4$. Find a basis of $L(S)$.
Solution: Here $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 1 \end{bmatrix}$. Then $B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ is the row echelon form of $A$ and hence $\mathcal{B} = \{(1, 1, 1, 1), (0, 1, 0, 0), (0, 0, 1, 0)\}$ is a basis of $L(S)$. Observe that the non-zero rows of $B$ can be obtained using the first, second and fourth or the first, third and fourth rows of $A$. Hence the subsets $\{(1, 1, 1, 1), (1, 1, 0, 1), (1, -1, 1, 1)\}$ and $\{(1, 1, 1, 1), (1, 1, -1, 1), (1, -1, 1, 1)\}$ of $S$ are also bases of $L(S)$.
2. Let $V = \{(v, w, x, y, z) \in \mathbb{R}^5 : v + x + z = 3y\}$ and $W = \{(v, w, x, y, z) \in \mathbb{R}^5 : w - x = z,\ v = y\}$ be two subspaces of $\mathbb{R}^5$. Find bases of $V$ and $W$ containing a basis of $V \cap W$.
Solution: Let us find a basis of $V \cap W$. The solution set of the linear equations
$$v + x - 3y + z = 0, \quad w - x - z = 0 \quad \text{and} \quad v = y$$
is
$$(v, w, x, y, z)^t = (y, 2y, x, y, 2y - x)^t = y(1, 2, 0, 1, 2)^t + x(0, 0, 1, 0, -1)^t.$$
Thus, a basis of $V \cap W$ is $\mathcal{B} = \{(1, 2, 0, 1, 2), (0, 0, 1, 0, -1)\}$. Similarly, a basis of $V$ is $\mathcal{B}_1 = \{(-1, 0, 1, 0, 0), (0, 1, 0, 0, 0), (3, 0, 0, 1, 0), (-1, 0, 0, 0, 1)\}$ and that of $W$ is $\mathcal{B}_2 = \{(1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (0, 1, 0, 0, 1)\}$. To find a basis of $V$ containing a basis of $V \cap W$, form a matrix whose rows are the vectors in $\mathcal{B}$ and $\mathcal{B}_1$ (see the first matrix in Equation (3.3.5)) and apply row operations without disturbing the first two rows that have come from $\mathcal{B}$. Then after a few row operations, we get
$$\begin{bmatrix} 1 & 2 & 0 & 1 & 2 \\ 0 & 0 & 1 & 0 & -1 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 3 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 2 & 0 & 1 & 2 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \tag{3.3.5}$$
Thus, a required basis of $V$ is $\{(1, 2, 0, 1, 2), (0, 0, 1, 0, -1), (0, 1, 0, 0, 0), (0, 0, 0, 1, 3)\}$. Similarly, a required basis of $W$ is $\{(1, 2, 0, 1, 2), (0, 0, 1, 0, -1), (0, 1, 0, 0, 1)\}$, where the third vector is taken from $\mathcal{B}_2$; it satisfies $w - x = z$ and $v = y$ and is not a linear combination of the first two.
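As a cross-check of the second part of the example against Theorem 3.3.22, the dimension of $V + W$ can be computed as the rank of the matrix whose rows are the basis vectors of $V$ and of $W$, and $\dim(V \cap W)$ then follows from Equation (3.3.4). The sketch below assumes the bases $\{(-1, 0, 1, 0, 0), (0, 1, 0, 0, 0), (3, 0, 0, 1, 0), (-1, 0, 0, 0, 1)\}$ of $V$ and $\{(1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (0, 1, 0, 0, 1)\}$ of $W$; the function name `rank` is hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

basis_V = [(-1, 0, 1, 0, 0), (0, 1, 0, 0, 0), (3, 0, 0, 1, 0), (-1, 0, 0, 0, 1)]
basis_W = [(1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (0, 1, 0, 0, 1)]

dim_V = rank(basis_V)
dim_W = rank(basis_W)
dim_sum = rank(basis_V + basis_W)         # V + W is spanned by both bases together
dim_intersection = dim_V + dim_W - dim_sum  # rearrangement of Equation (3.3.4)

print(dim_V, dim_W, dim_sum, dim_intersection)
```

The value obtained for $\dim(V \cap W)$ agrees with the two-element basis of $V \cap W$ found in the solution.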
Exercise 3.3.24 1. If $M$ and $N$ are 4-dimensional subspaces of a vector space $V$ of dimension 7 then show that $M$ and $N$ have at least one vector in common other than the zero vector.
2. Let $V = \{(x, y, z, w) \in \mathbb{R}^4 : x + y - z + w = 0,\ x + y + z + w = 0,\ x + 2y = 0\}$ and $W = \{(x, y, z, w) \in \mathbb{R}^4 : x - y - z + w = 0,\ x + 2y - w = 0\}$ be two subspaces of $\mathbb{R}^4$. Find bases and dimensions of $V$, $W$, $V \cap W$ and $V + W$.
3. Let $W_1$ and $W_2$ be two subspaces of a vector space $V$. If $\dim(W_1) + \dim(W_2) > \dim(V)$, then prove that $W_1 \cap W_2$ contains a non-zero vector.
4. Give examples to show that the column spaces of two row-equivalent matrices need not be the same.
5. Let $A \in M_{m \times n}(\mathbb{C})$ with $m < n$. Prove that the columns of $A$ are linearly dependent.
6. Suppose a sequence of matrices $A = B_0 \to B_1 \to \cdots \to B_{k-1} \to B_k = B$ satisfies $\mathcal{R}(B_l) \subset \mathcal{R}(B_{l-1})$ for $1 \le l \le k$. Then prove that $\mathcal{R}(B) \subset \mathcal{R}(A)$.
Before going to the next section, we prove the rank-nullity theorem and the main theorem of system of linear equations (see Theorem 2.4.1).
Theorem 3.3.25 (Rank-Nullity Theorem) For any matrix $A \in M_{m \times n}(\mathbb{C})$,
$$\dim(\mathcal{C}(A)) + \dim(\mathcal{N}(A)) = n.$$
Proof. Let $\dim(\mathcal{N}(A)) = r < n$ and let $\{u_1, u_2, \ldots, u_r\}$ be a basis of $\mathcal{N}(A)$. Since $\{u_1, \ldots, u_r\}$ is a linearly independent subset of $\mathbb{C}^n$, there exist vectors $u_{r+1}, \ldots, u_n \in \mathbb{C}^n$ (see Corollary 3.2.5.2) such that $\{u_1, \ldots, u_n\}$ is a basis of $\mathbb{C}^n$. Then by definition,
$$\begin{aligned}
\mathcal{C}(A) &= L(Au_1, Au_2, \ldots, Au_n) \\
&= L(\mathbf{0}, \ldots, \mathbf{0}, Au_{r+1}, Au_{r+2}, \ldots, Au_n) = L(Au_{r+1}, \ldots, Au_n).
\end{aligned}$$
We need to prove that $\{Au_{r+1}, \ldots, Au_n\}$ is a linearly independent set. Consider the linear system
$$\alpha_1 Au_{r+1} + \alpha_2 Au_{r+2} + \cdots + \alpha_{n-r} Au_n = \mathbf{0} \tag{3.3.6}$$
in the unknowns $\alpha_1, \ldots, \alpha_{n-r}$. This linear system is equivalent to
$$A(\alpha_1 u_{r+1} + \alpha_2 u_{r+2} + \cdots + \alpha_{n-r} u_n) = \mathbf{0}.$$
Hence, by definition of $\mathcal{N}(A)$, $\alpha_1 u_{r+1} + \cdots + \alpha_{n-r} u_n \in \mathcal{N}(A) = L(u_1, \ldots, u_r)$. Therefore, there exist scalars $\beta_i$, $1 \le i \le r$, such that
$$\alpha_1 u_{r+1} + \alpha_2 u_{r+2} + \cdots + \alpha_{n-r} u_n = \beta_1 u_1 + \beta_2 u_2 + \cdots + \beta_r u_r.$$
Or equivalently,
$$\beta_1 u_1 + \cdots + \beta_r u_r - \alpha_1 u_{r+1} - \cdots - \alpha_{n-r} u_n = \mathbf{0}. \tag{3.3.7}$$
As $\{u_1, \ldots, u_n\}$ is a linearly independent set, the only solution of Equation (3.3.7) is
$$\alpha_i = 0 \text{ for } 1 \le i \le n - r \quad \text{and} \quad \beta_j = 0 \text{ for } 1 \le j \le r.$$
In other words, we have shown that the only solution of Equation (3.3.6) is the trivial solution ($\alpha_i = 0$ for all $i$, $1 \le i \le n - r$). Hence, the set $\{Au_{r+1}, \ldots, Au_n\}$ is linearly independent and is a basis of $\mathcal{C}(A)$. Thus
$$\dim(\mathcal{C}(A)) + \dim(\mathcal{N}(A)) = (n - r) + r = n$$
and the proof of the theorem is complete.
Theorem 3.3.25 is part of what is known as the fundamental theorem of linear algebra (see Theorem 5.2.14). As the final result in this direction, we now prove the main theorem on linear systems stated on page 41 (see Theorem 2.4.1) whose proof was omitted.
Theorem 3.3.26 Consider a linear system $Ax = b$, where $A$ is an $m \times n$ matrix, and $x$, $b$ are vectors of orders $n \times 1$ and $m \times 1$, respectively. Suppose $\operatorname{rank}(A) = r$ and $\operatorname{rank}([A\ b]) = r_a$. Then exactly one of the following statements holds:
1. If $r < r_a$, the linear system has no solution.
2. If $r_a = r$, then the linear system is consistent. Furthermore,
(a) if $r = n$, then the solution set of the linear system has a unique $n \times 1$ vector $x_0$ satisfying $Ax_0 = b$.
(b) if $r < n$, then the set of solutions of the linear system is an infinite set and has the form
$$\{x_0 + k_1 u_1 + k_2 u_2 + \cdots + k_{n-r} u_{n-r} : k_i \in \mathbb{R},\ 1 \le i \le n - r\},$$
where $x_0, u_1, \ldots, u_{n-r}$ are $n \times 1$ vectors satisfying $Ax_0 = b$ and $Au_i = \mathbf{0}$ for $1 \le i \le n - r$.
Proof. Proof of Part 1. As $r < r_a$, the $(r + 1)$-th row of the row-reduced echelon form of $[A\ b]$ has the form $[\mathbf{0}, 1]$. Thus, by Theorem 1, the system $Ax = b$ is inconsistent.
Proof of Part 2a and Part 2b. As $r = r_a$, using Corollary 3.3.20, $\mathcal{C}(A) = \mathcal{C}([A, b])$. Hence, the vector $b \in \mathcal{C}(A)$ and therefore there exist scalars $c_1, c_2, \ldots, c_n$ such that $b = c_1 a_1 + c_2 a_2 + \cdots + c_n a_n$, where $a_1, a_2, \ldots, a_n$ are the columns of $A$. That is, we have a vector $x_0^t = [c_1, c_2, \ldots, c_n]$ that satisfies $Ax_0 = b$.
If in addition $r = n$, then the system $Ax = b$ has no free variables in its solution set and thus we have a unique solution (see Theorem 2.1.22.2a).
Whereas the condition $r < n$ implies that the system $Ax = b$ has $n - r$ free variables in its solution set and thus we have an infinite number of solutions (see Theorem 2.1.22.2b). To complete the proof of the theorem, we just need to show that the solution set in this case has the form $\{x_0 + k_1 u_1 + k_2 u_2 + \cdots + k_{n-r} u_{n-r} : k_i \in \mathbb{R},\ 1 \le i \le n - r\}$, where $Ax_0 = b$ and $Au_i = \mathbf{0}$ for $1 \le i \le n - r$.
To get this, note that using the rank-nullity theorem (see Theorem 3.3.25), $\operatorname{rank}(A) = r$ implies that $\dim(\mathcal{N}(A)) = n - r$. Let $\{u_1, u_2, \ldots, u_{n-r}\}$ be a basis of $\mathcal{N}(A)$. Then by definition $Au_i = \mathbf{0}$ for $1 \le i \le n - r$ and hence
$$A(x_0 + k_1 u_1 + k_2 u_2 + \cdots + k_{n-r} u_{n-r}) = Ax_0 + k_1 \mathbf{0} + \cdots + k_{n-r} \mathbf{0} = b.$$
Thus, the required result follows.
Example 3.3.27 Let $A = \begin{bmatrix} 1 & 1 & 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 1 & 2 & 3 & 0 & -2 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}$ and $V = \{x^t \in \mathbb{R}^7 : Ax = \mathbf{0}\}$. Find a basis and dimension of $V$.
Solution: Observe that $x_1$, $x_3$ and $x_6$ are the basic variables and the rest are the free variables. Writing the basic variables in terms of free variables, we get
$$x_1 = x_7 - x_2 - x_4 - x_5, \quad x_3 = 2x_7 - 2x_4 - 3x_5 \quad \text{and} \quad x_6 = -x_7.$$
Hence,
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix} = \begin{bmatrix} x_7 - x_2 - x_4 - x_5 \\ x_2 \\ 2x_7 - 2x_4 - 3x_5 \\ x_4 \\ x_5 \\ -x_7 \\ x_7 \end{bmatrix} = x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -1 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} -1 \\ 0 \\ -3 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_7 \begin{bmatrix} 1 \\ 0 \\ 2 \\ 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}. \tag{3.3.8}$$
Therefore, if we let $u_1^t = [-1, 1, 0, 0, 0, 0, 0]$, $u_2^t = [-1, 0, -2, 1, 0, 0, 0]$, $u_3^t = [-1, 0, -3, 0, 1, 0, 0]$ and $u_4^t = [1, 0, 2, 0, 0, -1, 1]$ then $S = \{u_1, u_2, u_3, u_4\}$ is a basis of $V$, so that $\dim(V) = 4$. The reasons are as follows:
1. For linear independence, we consider the homogeneous system
$$c_1 u_1 + c_2 u_2 + c_3 u_3 + c_4 u_4 = \mathbf{0} \tag{3.3.9}$$
in the unknowns $c_1, c_2, c_3$ and $c_4$. Then relating the unknowns with the free variables $x_2, x_4, x_5$ and $x_7$ and then comparing Equations (3.3.8) and (3.3.9), we get
(a) $c_1 = 0$ as the 2-nd coordinate consists only of $c_1$.
(b) $c_2 = 0$ as the 4-th coordinate consists only of $c_2$.
(c) $c_3 = 0$ as the 5-th coordinate consists only of $c_3$.
(d) $c_4 = 0$ as the 7-th coordinate consists only of $c_4$.
Hence, the set $S$ is linearly independent.
2. $L(S) = V$ is obvious as any vector of $V$ has the form mentioned as the first equality in Equation (3.3.8).
The understanding built in Example 3.3.27 gives us the following remark.
Remark 3.3.28 The vectors $u_1, u_2, \ldots, u_{n-r}$ in Theorem 3.3.26.2b correspond to expressing the solution set with the help of the free variables. This is done by writing the basic variables in terms of the free variables and then writing the solution set in such a way that each $u_i$ corresponds to a specific free variable.
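The recipe of Remark 3.3.28 can be sketched in code. The following is an illustration only, with hypothetical helper names `rref` and `null_space_basis`: the matrix is row-reduced with exact arithmetic, and for each free variable one vector is produced by setting that free variable to 1, the other free variables to 0, and solving for the basic variables. The matrix used is the one of Example 3.3.27, with last column $(-1, -2, 1)^t$ as implied by the equations for $x_1$, $x_3$ and $x_6$ there.

```python
from fractions import Fraction

def rref(matrix):
    """Return (row-reduced echelon form, pivot column indices), in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        p = rows[r][col]
        rows[r] = [x / p for x in rows[r]]
        for i in range(len(rows)):
            if i != r:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def null_space_basis(matrix):
    """One basis vector of {x : Ax = 0} per free variable."""
    n = len(matrix[0])
    reduced, pivots = rref(matrix)
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)                  # this free variable is 1, the others 0
        for row_index, p in enumerate(pivots):
            u[p] = -reduced[row_index][f]   # basic variable in terms of the free one
        basis.append(u)
    return basis

A = [[1, 1, 0, 1, 1, 0, -1],
     [0, 0, 1, 2, 3, 0, -2],
     [0, 0, 0, 0, 0, 1, 1]]
basis = null_space_basis(A)
for u in basis:
    print([int(x) for x in u])
```

The number of vectors returned equals the number of free variables, that is, $n - \operatorname{rank}(A)$, in line with Theorem 3.3.25.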
The following are some of the consequences of the rank-nullity theorem. The proof is left as an exercise for the reader.
Exercise 3.3.29 1. Let $A$ be an $m \times n$ real matrix. Then
(a) if $n > m$, then the system $Ax = \mathbf{0}$ has infinitely many solutions,
(b) if $n < m$, then there exists a non-zero vector $b = (b_1, b_2, \ldots, b_m)^t$ such that the system $Ax = b$ does not have any solution.
2. The following statements are equivalent for an $m \times n$ matrix $A$.
(a) $\operatorname{Rank}(A) = k$.
(b) There exists a set of $k$ rows of $A$ that are linearly independent.
(c) There exists a set of $k$ columns of $A$ that are linearly independent.
(d) $\dim(\mathcal{C}(A)) = k$.
(e) There exists a $k \times k$ submatrix $B$ of $A$ with $\det(B) \neq 0$ and the determinant of every $(k + 1) \times (k + 1)$ submatrix of $A$ is zero.
(f) There exists a linearly independent subset $\{b_1, b_2, \ldots, b_k\}$ of $\mathbb{R}^m$ such that the system $Ax = b_i$ for $1 \le i \le k$ is consistent.
(g) $\dim(\mathcal{N}(A)) = n - k$.
3.4 Ordered Bases
Let $\mathcal{B} = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V$. As $\mathcal{B}$ is a set, there is no ordering of its elements. In this section, we want to associate an order among the vectors in any basis of $V$ as this helps in getting a better understanding of finite dimensional vector spaces and their relationship with matrices.
Definition 3.4.1 (Ordered Basis) Let $V$ be a vector space of dimension $n$. Then an ordered basis for $V$ is a basis $\{u_1, u_2, \ldots, u_n\}$ together with a one-to-one correspondence between the sets $\{u_1, u_2, \ldots, u_n\}$ and $\{1, 2, 3, \ldots, n\}$.
If the ordered basis has $u_1$ as the first vector, $u_2$ as the second vector and so on, then we denote this by writing the ordered basis as $(u_1, u_2, \ldots, u_n)$.
Example 3.4.2 1. Consider the vector space $\mathcal{P}_2(\mathbb{R})$ with basis $\{1 - x, 1 + x, x^2\}$. Then one can take either $\mathcal{B}_1 = (1 - x, 1 + x, x^2)$ or $\mathcal{B}_2 = (1 + x, 1 - x, x^2)$ as ordered bases. Also for any element $a_0 + a_1 x + a_2 x^2 \in \mathcal{P}_2(\mathbb{R})$, one has
$$a_0 + a_1 x + a_2 x^2 = \frac{a_0 - a_1}{2}(1 - x) + \frac{a_0 + a_1}{2}(1 + x) + a_2 x^2.$$
Thus, $a_0 + a_1 x + a_2 x^2$ in the ordered basis
(a) $\mathcal{B}_1$, has $\frac{a_0 - a_1}{2}$ as the coefficient of the first element, $\frac{a_0 + a_1}{2}$ as the coefficient of the second element and $a_2$ as the coefficient of the third element of $\mathcal{B}_1$.
(b) $\mathcal{B}_2$, has $\frac{a_0 + a_1}{2}$ as the coefficient of the first element, $\frac{a_0 - a_1}{2}$ as the coefficient of the second element and $a_2$ as the coefficient of the third element of $\mathcal{B}_2$.
2. Let $V = \{(x, y, z) : x + y = z\}$ and let $\mathcal{B} = \{(-1, 1, 0), (1, 0, 1)\}$ be a basis of $V$. Then check that $(3, 4, 7) = 4(-1, 1, 0) + 7(1, 0, 1) \in V$.
That is, as ordered bases $(u_1, u_2, \ldots, u_n)$, $(u_2, u_3, \ldots, u_n, u_1)$ and $(u_n, u_{n-1}, \ldots, u_2, u_1)$ are different even though they have the same set of vectors as elements. To proceed further, we now define the notion of coordinates of a vector depending on the chosen ordered basis.
Definition 3.4.3 (Coordinates of a Vector) Let $\mathcal{B} = (v_1, v_2, \ldots, v_n)$ be an ordered basis of a vector space $V$. If an element $v \in V$ is expressible as
$$v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n \quad \text{for some scalars } \beta_1, \beta_2, \ldots, \beta_n,$$
then the tuple $(\beta_1, \beta_2, \ldots, \beta_n)$ is called the coordinate of the vector $v$ with respect to the ordered basis $\mathcal{B}$ and is denoted by $[v]_{\mathcal{B}} = (\beta_1, \ldots, \beta_n)^t$, a column vector.
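Example 3.4.4 1. In Example 3.4.2.1, let $p(x) = a_0 + a_1 x + a_2 x^2$. Then
$$[p(x)]_{\mathcal{B}_1} = \begin{bmatrix} \frac{a_0 - a_1}{2} \\ \frac{a_0 + a_1}{2} \\ a_2 \end{bmatrix}, \quad [p(x)]_{\mathcal{B}_2} = \begin{bmatrix} \frac{a_0 + a_1}{2} \\ \frac{a_0 - a_1}{2} \\ a_2 \end{bmatrix} \quad \text{and} \quad [p(x)]_{\mathcal{B}_3} = \begin{bmatrix} a_2 \\ \frac{a_0 - a_1}{2} \\ \frac{a_0 + a_1}{2} \end{bmatrix},$$
where $\mathcal{B}_3 = (x^2, 1 - x, 1 + x)$ is the ordered basis to which the third coordinate vector corresponds.
2. In Example 3.4.2.2, $\big[(3, 4, 7)\big]_{\mathcal{B}} = \begin{bmatrix} 4 \\ 7 \end{bmatrix}$ and $\big[(x, y, z)\big]_{\mathcal{B}} = \big[(z - y, y, z)\big]_{\mathcal{B}} = \begin{bmatrix} y \\ z \end{bmatrix}$.
3. Let the ordered bases of $\mathbb{R}^3$ be $\mathcal{B}_1 = \big((1, 0, 0), (0, 1, 0), (0, 0, 1)\big)$, $\mathcal{B}_2 = \big((1, 0, 0), (1, 1, 0), (1, 1, 1)\big)$ and $\mathcal{B}_3 = \big((1, 1, 1), (1, 1, 0), (1, 0, 0)\big)$. Then
$$\begin{aligned}
(1, -1, 1) &= 1 \cdot (1, 0, 0) + (-1) \cdot (0, 1, 0) + 1 \cdot (0, 0, 1) \\
&= 2 \cdot (1, 0, 0) + (-2) \cdot (1, 1, 0) + 1 \cdot (1, 1, 1) \\
&= 1 \cdot (1, 1, 1) + (-2) \cdot (1, 1, 0) + 2 \cdot (1, 0, 0).
\end{aligned}$$
Therefore, if we write $u = (1, -1, 1)$, then
$$[u]_{\mathcal{B}_1} = (1, -1, 1)^t, \quad [u]_{\mathcal{B}_2} = (2, -2, 1)^t, \quad [u]_{\mathcal{B}_3} = (1, -2, 2)^t.$$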
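Finding the coordinates of a vector with respect to an ordered basis of $\mathbb{R}^n$ amounts to solving a linear system whose coefficient columns are the basis vectors. The sketch below illustrates this for the vector $u = (1, -1, 1)$ and the three ordered bases of item 3; the helper names `rref` and `coordinates` are hypothetical, and the code assumes the given vectors really form a basis (it raises an error otherwise).

```python
from fractions import Fraction

def rref(matrix):
    """Return (row-reduced echelon form, pivot column indices), in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        p = rows[r][col]
        rows[r] = [x / p for x in rows[r]]
        for i in range(len(rows)):
            if i != r:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def coordinates(basis, v):
    """Solve c_1 b_1 + ... + c_n b_n = v for the ordered basis (b_1, ..., b_n)."""
    n = len(basis)
    # Column j of the coefficient matrix is the j-th basis vector; the last column is v.
    augmented = [[basis[j][i] for j in range(n)] + [v[i]] for i in range(n)]
    reduced, pivots = rref(augmented)
    if pivots != list(range(n)):
        raise ValueError("the given vectors do not form a basis")
    return [row[n] for row in reduced]

u = (1, -1, 1)
B1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
B2 = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
B3 = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
for basis in (B1, B2, B3):
    print([int(c) for c in coordinates(basis, u)])
```

Reordering the basis vectors permutes the coordinates, which is why $\mathcal{B}_2$ and $\mathcal{B}_3$ give different coordinate vectors for the same $u$.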
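In general, let $V$ be an $n$-dimensional vector space with ordered bases $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$. Since $\mathcal{B}_1$ is a basis of $V$, there exist unique scalars $a_{ij}$, $1 \le i, j \le n$, such that
$$v_i = \sum_{l=1}^{n} a_{li} u_l, \quad \text{or equivalently,} \quad [v_i]_{\mathcal{B}_1} = (a_{1i}, a_{2i}, \ldots, a_{ni})^t \quad \text{for } 1 \le i \le n.$$
Suppose $v \in V$ with $[v]_{\mathcal{B}_2} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t$. Then
$$v = \sum_{i=1}^{n} \alpha_i v_i = \sum_{i=1}^{n} \alpha_i \Big( \sum_{j=1}^{n} a_{ji} u_j \Big) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} a_{ji} \alpha_i \Big) u_j.$$
Since $\mathcal{B}_1$ is a basis, this representation of $v$ in terms of the $u_i$'s is unique. So,
$$[v]_{\mathcal{B}_1} = \Big( \sum_{i=1}^{n} a_{1i} \alpha_i,\ \sum_{i=1}^{n} a_{2i} \alpha_i,\ \ldots,\ \sum_{i=1}^{n} a_{ni} \alpha_i \Big)^t = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} = A[v]_{\mathcal{B}_2},$$
where $A = \big[ [v_1]_{\mathcal{B}_1}, [v_2]_{\mathcal{B}_1}, \ldots, [v_n]_{\mathcal{B}_1} \big]$. Hence, we have proved the following theorem.
Theorem 3.4.5 Let $V$ be an $n$-dimensional vector space with ordered bases $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$. Define an $n \times n$ matrix $A$ by $A = \big[ [v_1]_{\mathcal{B}_1}, [v_2]_{\mathcal{B}_1}, \ldots, [v_n]_{\mathcal{B}_1} \big]$. Then, $A$ is an invertible matrix (see Exercise 3.3.14.7) and
$$[v]_{\mathcal{B}_1} = A[v]_{\mathcal{B}_2} \quad \text{for all } v \in V.$$
Theorem 3.4.5 states that the coordinates of a vector with respect to different bases are related via an invertible matrix $A$.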
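Example 3.4.6 Let $\mathcal{B}_1 = \big((1, 0, 0), (1, 1, 0), (1, 1, 1)\big)$ and $\mathcal{B}_2 = \big((1, 1, 1), (1, -1, 1), (1, 1, 0)\big)$ be two bases of $\mathbb{R}^3$.
1. Then $[(x, y, z)]_{\mathcal{B}_1} = (x - y,\ y - z,\ z)^t$ and $[(x, y, z)]_{\mathcal{B}_2} = \big(\frac{y - x}{2} + z,\ \frac{x - y}{2},\ x - z\big)^t$.
2. Check that $A = \big[ [(1, 1, 1)]_{\mathcal{B}_1}, [(1, -1, 1)]_{\mathcal{B}_1}, [(1, 1, 0)]_{\mathcal{B}_1} \big] = \begin{bmatrix} 0 & 2 & 0 \\ 0 & -2 & 1 \\ 1 & 1 & 0 \end{bmatrix}$, as
$$\begin{aligned}
(1, 1, 1) &= 0 \cdot (1, 0, 0) + 0 \cdot (1, 1, 0) + 1 \cdot (1, 1, 1), & \text{so } [(1, 1, 1)]_{\mathcal{B}_1} &= (0, 0, 1)^t, \\
(1, -1, 1) &= 2 \cdot (1, 0, 0) + (-2) \cdot (1, 1, 0) + 1 \cdot (1, 1, 1), & \text{so } [(1, -1, 1)]_{\mathcal{B}_1} &= (2, -2, 1)^t, \\
(1, 1, 0) &= 0 \cdot (1, 0, 0) + 1 \cdot (1, 1, 0) + 0 \cdot (1, 1, 1), & \text{so } [(1, 1, 0)]_{\mathcal{B}_1} &= (0, 1, 0)^t.
\end{aligned}$$
3. Thus, for any $(x, y, z) \in \mathbb{R}^3$,
$$[(x, y, z)]_{\mathcal{B}_1} = \begin{bmatrix} x - y \\ y - z \\ z \end{bmatrix} = \begin{bmatrix} 0 & 2 & 0 \\ 0 & -2 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \frac{y - x}{2} + z \\ \frac{x - y}{2} \\ x - z \end{bmatrix} = A\, [(x, y, z)]_{\mathcal{B}_2}.$$
4. Observe that the matrix $A$ is invertible and hence $[(x, y, z)]_{\mathcal{B}_2} = A^{-1} [(x, y, z)]_{\mathcal{B}_1}$.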
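The identity $[v]_{\mathcal{B}_1} = A[v]_{\mathcal{B}_2}$ of Theorem 3.4.5 can be spot-checked numerically for this example. The sketch below is only an illustration: the function names are hypothetical, the coordinate formulas are those of item 1, the second vector of $\mathcal{B}_2$ is taken to be $(1, -1, 1)$, and the test vector $(3, 5, -2)$ is an arbitrary choice.

```python
from fractions import Fraction

def coords_B1(v):
    """Coordinates with respect to B1 = ((1,0,0), (1,1,0), (1,1,1))."""
    x, y, z = v
    return [x - y, y - z, z]

def coords_B2(v):
    """Coordinates with respect to B2 = ((1,1,1), (1,-1,1), (1,1,0))."""
    x, y, z = v
    return [Fraction(y - x, 2) + z, Fraction(x - y, 2), x - z]

def mat_vec(M, c):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(m * x for m, x in zip(row, c)) for row in M]

# Column j of A is the B1-coordinate vector of the j-th vector of B2.
B2 = [(1, 1, 1), (1, -1, 1), (1, 1, 0)]
columns = [coords_B1(v) for v in B2]
A = [[columns[j][i] for j in range(3)] for i in range(3)]

v = (3, 5, -2)  # arbitrary test vector
lhs = coords_B1(v)
rhs = mat_vec(A, coords_B2(v))
print(A, lhs == rhs)
```

A single test vector does not prove the identity, but since both sides are linear in $v$, checking it on the three standard basis vectors would.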
In the next chapter, we try to understand Theorem 3.4.5 again using the ideas of linear transformations/functions.
Exercise 3.4.7 1. Consider the vector space $\mathcal{P}_3(\mathbb{R})$.
(a) Prove that $\mathcal{B}_1 = (1 - x,\ 1 + x^2,\ 1 - x^3,\ 3 + x^2 - x^3)$ and $\mathcal{B}_2 = (1,\ 1 - x,\ 1 + x^2,\ 1 - x^3)$ are bases of $\mathcal{P}_3(\mathbb{R})$.
(b) Find the coordinates of $u = 1 + x + x^2 + x^3$ with respect to $\mathcal{B}_1$ and $\mathcal{B}_2$.
(c) Find the matrix $A$ such that $[u]_{\mathcal{B}_2} = A[u]_{\mathcal{B}_1}$.
(d) Let $v = a_0 + a_1 x + a_2 x^2 + a_3 x^3$. Then verify that
$$[v]_{\mathcal{B}_1} = \begin{bmatrix} -a_1 \\ -a_0 - a_1 + 2a_2 - a_3 \\ -a_0 - a_1 + a_2 - 2a_3 \\ a_0 + a_1 - a_2 + a_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_0 + a_1 - a_2 + a_3 \\ -a_1 \\ a_2 \\ -a_3 \end{bmatrix},$$
where the last column vector is $[v]_{\mathcal{B}_2}$.
2. Let $\mathcal{B} = \big((2, 1, 0), (2, 1, 1), (2, 2, 1)\big)$ be an ordered basis of $\mathbb{R}^3$. Determine the coordinates of (1, 2, 1) and (4, 2, 2) with respect to $\mathcal{B}$.
3.5 Summary
In this chapter, we started with the definition of vector spaces over $\mathbb{F}$, the set of scalars. The set $\mathbb{F}$ was either $\mathbb{R}$, the set of real numbers, or $\mathbb{C}$, the set of complex numbers.
It was important to note that given a non-empty set $V$ of vectors with a set $\mathbb{F}$ of scalars, we need to do the following:
1. first define vector addition and scalar multiplication and
2. then verify the axioms in Definition 3.1.1.
If all the axioms are satisfied then $V$ is a vector space over $\mathbb{F}$. To check whether a non-empty subset $W$ of a vector space $V$ over $\mathbb{F}$ is a subspace of $V$, we only need to check whether $u + v \in W$ for all $u, v \in W$ and $\alpha u \in W$ for all $\alpha \in \mathbb{F}$ and $u \in W$.
We then came across the definition of a linear combination of vectors and the linear span of vectors. It was also shown that the linear span of a subset $S$ of a vector space $V$ is the smallest subspace of $V$ containing $S$. Also, to check whether a given vector $v$ is a linear combination of the vectors $u_1, u_2, \ldots, u_n$, we need to solve the linear system
\[ c_1 u_1 + c_2 u_2 + \cdots + c_n u_n = v \]
in the unknowns $c_1, \ldots, c_n$. This corresponds to solving the linear system $Ax = b$. It was also shown that the geometrical representation of the linear span of $S = \{u_1, u_2, \ldots, u_n\}$ is equivalent to finding conditions on the coordinates of the vector $b$ such that the linear system $Ax = b$ is consistent, where the matrix $A$ is formed with the coordinates of the vector $u_i$ as the $i$-th column of the matrix $A$.
By definition, $S = \{u_1, u_2, \ldots, u_n\}$ is a linearly independent subset of $V(\mathbb{F})$ if the homogeneous system $Ax = 0$ has only the trivial solution in $\mathbb{F}$, else $S$ is linearly dependent, where the matrix $A$ is formed with the coordinates of the vector $u_i$ as the $i$-th column of the matrix $A$.
We then had the notion of the basis of a finite dimensional vector space $V$ and the following results were proved.
1. A linearly independent set can be extended to form a basis of $V$.
2. Any two bases of $V$ have the same number of elements.
This number was defined as the dimension of $V$ and we denoted it by $\dim(V)$.
The following conditions are equivalent for an $n \times n$ matrix $A$.
1. $A$ is invertible.
2. The homogeneous system $Ax = 0$ has only the trivial solution.
3. The row reduced echelon form of $A$ is $I$.
4. $A$ is a product of elementary matrices.
5. The system $Ax = b$ has a unique solution for every $b$.
6. The system $Ax = b$ has a solution for every $b$.
7. $\mathrm{rank}(A) = n$.
8. $\det(A) \neq 0$.
9. The row space of $A$ is $\mathbb{R}^n$.
10. The column space of $A$ is $\mathbb{R}^n$.
11. The rows of $A$ form a basis of $\mathbb{R}^n$.
12. The columns of $A$ form a basis of $\mathbb{R}^n$.
13. The null space of $A$ is $\{0\}$.
Let $A$ be an $m \times n$ matrix. Then we proved the rank-nullity theorem which states that $\mathrm{rank}(A) + \mathrm{nullity}(A) = n$, the number of columns. This implied that if $\mathrm{rank}(A) = r$ then the solution set of the linear system $Ax = b$ is of the form $x_0 + c_1 u_1 + \cdots + c_{n-r} u_{n-r}$, where $Ax_0 = b$ and $Au_i = 0$ for $1 \le i \le n - r$. Also, the vectors $u_1, u_2, \ldots, u_{n-r}$ are linearly independent.
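The rank-nullity statement for matrices can be illustrated computationally. The sketch below is an illustration added here, not taken from the notes: the matrix `A` and the helper names `rref` and `null_space_basis` are assumptions chosen for the example. It row-reduces over the rationals, reads off the rank as the number of pivots, and builds one null-space vector per free column.

```python
from fractions import Fraction


def rref(rows):
    """Return (R, pivots): the row reduced echelon form and its pivot columns."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(M[0])):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots


def null_space_basis(rows):
    """One solution of Ax = 0 per free column; together they span the null space."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        u = [Fraction(0)] * n
        u[f] = Fraction(1)
        for i, p in enumerate(pivots):
            u[p] = -R[i][f]
        basis.append(u)
    return basis


# Hypothetical 3 x 4 example: the third row is the sum of the first two.
A = [[1, 2, 1, 0],
     [2, 4, 0, 2],
     [3, 6, 1, 2]]
rank = len(rref(A)[1])
basis = null_space_basis(A)
print(rank, len(basis), len(A[0]))
```

Here the number of pivots plus the number of free columns is the number of columns by construction, which is exactly the counting behind $\mathrm{rank}(A) + \mathrm{nullity}(A) = n$.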
Let $V$ be a subspace of $\mathbb{R}^n$ for some positive integer $n$ with $\dim(V) = k$. Then $V$ may not have a standard basis. Even if $V$ has a basis that looks like a standard basis, our problem may force us to look for some other basis. In such a case, it is always helpful to fix an ordered basis $B$ and then express each vector in $V$ as a linear combination of elements from $B$. This idea helps us in writing each element of $V$ as a column vector of size $k$. We will also see its use in the study of linear transformations and the study of eigenvalues and eigenvectors.
Chapter 4
Linear Transformations
4.1 Definitions and Basic Properties
In this chapter, it will be shown that if $V$ is a real vector space with $\dim(V) = n$ then $V$ looks like $\mathbb{R}^n$. On similar lines, a complex vector space of dimension $n$ has all the properties that are satisfied by $\mathbb{C}^n$. To do so, we start with the definition of functions over vector spaces that commute with the operations of vector addition and scalar multiplication.
Definition 4.1.1 (Linear Transformation, Linear Operator) Let $V$ and $W$ be vector spaces over the same scalar set $\mathbb{F}$. A function (map) $T : V \to W$ is called a linear transformation if for all $\alpha \in \mathbb{F}$ and $u, v \in V$ the function $T$ satisfies
\[ T(\alpha \cdot u) = \alpha \odot T(u) \quad \text{and} \quad T(u + v) = T(u) \oplus T(v), \]
where $+, \cdot$ are the binary operations in $V$ and $\oplus, \odot$ are the binary operations in $W$. In particular, if $W = V$ then the linear transformation $T$ is called a linear operator.
We now give a few examples of linear transformations.
Example 4.1.2 1. Define $T : \mathbb{R} \to \mathbb{R}^2$ by $T(x) = (x, 3x)$ for all $x \in \mathbb{R}$. Then $T$ is a linear transformation as
\[ T(\alpha x) = (\alpha x, 3\alpha x) = \alpha (x, 3x) = \alpha T(x) \quad \text{and} \]
\[ T(x + y) = \big(x + y, 3(x + y)\big) = (x, 3x) + (y, 3y) = T(x) + T(y). \]
2. Let $V, W$ and $Z$ be vector spaces over $\mathbb{F}$. Also, let $T : V \to W$ and $S : W \to Z$ be linear transformations. Then, for each $v \in V$, the composition of $T$ and $S$ is defined by $S \circ T(v) = S\big(T(v)\big)$. It is easy to verify that $S \circ T$ is a linear transformation. In particular, if $V = W$, one writes $T^2$ in place of $T \circ T$.
3. Let $x^t = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$. Then for a fixed vector $a^t = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$, define $T : \mathbb{R}^n \to \mathbb{R}$ by $T(x^t) = \sum_{i=1}^{n} a_i x_i$ for all $x^t \in \mathbb{R}^n$. Then $T$ is a linear transformation. In particular,
(a) $T(x^t) = \sum_{i=1}^{n} x_i$ for all $x^t \in \mathbb{R}^n$ if $a_i = 1$ for $1 \le i \le n$.
(b) if $a = e_i$ for a fixed $i$, $1 \le i \le n$, one can define $T_i(x^t) = x_i$ for all $x^t \in \mathbb{R}^n$.
4. Define $T : \mathbb{R}^2 \to \mathbb{R}^3$ by $T(x, y) = (x + y, 2x - y, x + 3y)$. Then $T$ is a linear transformation with $T(1, 0) = (1, 2, 1)$ and $T(0, 1) = (1, -1, 3)$.
5. Let $A \in M_{m \times n}(\mathbb{C})$. Define a map $T_A : \mathbb{C}^n \to \mathbb{C}^m$ by $T_A(x^t) = Ax$ for every $x^t = (x_1, x_2, \ldots, x_n) \in \mathbb{C}^n$. Then $T_A$ is a linear transformation. That is, every $m \times n$ complex matrix defines a linear transformation from $\mathbb{C}^n$ to $\mathbb{C}^m$.
6. Define $T : \mathbb{R}^{n+1} \to \mathcal{P}_n(\mathbb{R})$ by $T(a_1, a_2, \ldots, a_{n+1}) = a_1 + a_2 x + \cdots + a_{n+1} x^n$ for $(a_1, a_2, \ldots, a_{n+1}) \in \mathbb{R}^{n+1}$. Then $T$ is a linear transformation.
7. Fix $A \in M_n(\mathbb{C})$. Then $T_A : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ and $S_A : M_n(\mathbb{C}) \to \mathbb{C}$ are both linear transformations, where
\[ T_A(B) = BA^{*} \quad \text{and} \quad S_A(B) = \mathrm{tr}(BA^{*}) \quad \text{for every } B \in M_n(\mathbb{C}). \]
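The two defining conditions of a linear transformation can be tested on sample vectors. The sketch below is an added illustration under stated assumptions: `looks_linear`, `add`, `scale` and the map `affine` are hypothetical names introduced here, and a randomized test can only refute linearity, never prove it. It is applied to the map of Example 4.1.2.4.

```python
from fractions import Fraction
import random


def T(v):
    """The map T(x, y) = (x + y, 2x - y, x + 3y) of Example 4.1.2.4."""
    x, y = v
    return (x + y, 2 * x - y, x + 3 * y)


def affine(v):
    """A hypothetical non-linear map: it moves the zero vector to (1, 0)."""
    x, y = v
    return (x + 1, y)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def scale(c, v):
    return tuple(c * a for a in v)


def looks_linear(f, dim, trials=100, seed=0):
    """Test f(u + v) = f(u) + f(v) and f(c u) = c f(u) on random rational inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(Fraction(rng.randint(-9, 9)) for _ in range(dim))
        v = tuple(Fraction(rng.randint(-9, 9)) for _ in range(dim))
        c = Fraction(rng.randint(-9, 9), rng.randint(1, 9))
        if f(add(u, v)) != add(f(u), f(v)):
            return False
        if f(scale(c, u)) != scale(c, f(u)):
            return False
    return True


print(looks_linear(T, 2), looks_linear(affine, 2))
```

The map `affine` fails the additivity test on every pair of inputs, because the constant $1$ is added once on the left and twice on the right.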
Before proceeding further with some more definitions and results associated with linear transformations, we prove that any linear transformation sends the zero vector to a zero vector.
Proposition 4.1.3 Let $T : V \to W$ be a linear transformation. Suppose that $0_V$ is the zero vector in $V$ and $0_W$ is the zero vector of $W$. Then $T(0_V) = 0_W$.
Proof. Since $0_V = 0_V + 0_V$, we have
\[ T(0_V) = T(0_V + 0_V) = T(0_V) + T(0_V). \]
As $T(0_V) \in W$, adding its additive inverse in $W$ to both sides gives $T(0_V) = 0_W$.
From now on, we write $0$ for both the zero vector of the domain and codomain.
Definition 4.1.4 (Zero Transformation) Let $V$ and $W$ be two vector spaces over $\mathbb{F}$ and define $T : V \to W$ by $T(v) = 0$ for every $v \in V$. Then $T$ is a linear transformation and is usually called the zero transformation, denoted $0$.
Definition 4.1.5 (Identity Operator) Let $V$ be a vector space over $\mathbb{F}$ and define $T : V \to V$ by $T(v) = v$ for every $v \in V$. Then $T$ is a linear transformation and is usually called the identity transformation, denoted $I$.
Definition 4.1.6 (Equality of two Linear Operators) Let $V$ be a vector space and let $T, S : V \to V$ be linear operators. The operators $T$ and $S$ are said to be equal if $T(x) = S(x)$ for all $x \in V$.
We now prove a result that relates a linear transformation $T$ with its value on a basis of the domain space.
Theorem 4.1.7 Let $V$ and $W$ be two vector spaces over $\mathbb{F}$ and let $T : V \to W$ be a linear transformation. If $B = (u_1, \ldots, u_n)$ is an ordered basis of $V$ then for each $v \in V$, the vector $T(v)$ is a linear combination of $T(u_1), \ldots, T(u_n) \in W$. That is, we have full information of $T$ if we know $T(u_1), \ldots, T(u_n) \in W$, the images of the basis vectors in $W$.
Proof. As $B$ is a basis of $V$, for every $v \in V$, we can find $c_1, \ldots, c_n \in \mathbb{F}$ such that $v = c_1 u_1 + \cdots + c_n u_n$, or equivalently $[v]_B = (c_1, \ldots, c_n)^t$. Hence, by definition
\[ T(v) = T(c_1 u_1 + \cdots + c_n u_n) = c_1 T(u_1) + \cdots + c_n T(u_n). \]
That is, we just need to know the vectors $T(u_1), T(u_2), \ldots, T(u_n)$ in $W$ to get $T(v)$, as $[v]_B = (c_1, \ldots, c_n)^t$ is known in $V$. Hence, the required result follows.
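Theorem 4.1.7 says that the coordinate vector of $v$ and the images of the basis vectors are all that is needed to evaluate $T(v)$. A minimal Python sketch of this evaluation follows; the function name `apply_from_basis_images` is an assumption made for this illustration, and the data are the images $T(1, 0)$ and $T(0, 1)$ of Example 4.1.2.4 with respect to the standard basis of $\mathbb{R}^2$.

```python
from fractions import Fraction


def apply_from_basis_images(coords, images):
    """Return T(v) = c_1 T(u_1) + ... + c_n T(u_n).

    coords : the coordinate vector [v]_B = (c_1, ..., c_n)
    images : the vectors T(u_1), ..., T(u_n), each a tuple in W
    """
    m = len(images[0])
    result = [Fraction(0)] * m
    for c, image in zip(coords, images):
        for i in range(m):
            result[i] += Fraction(c) * image[i]
    return tuple(result)


# Images of the standard basis vectors under T(x, y) = (x + y, 2x - y, x + 3y).
images = [(1, 2, 1), (1, -1, 3)]

# Evaluate T at v = 2 e_1 + 5 e_2 without using the formula for T.
value = apply_from_basis_images((2, 5), images)
print(value == (2 + 5, 2 * 2 - 5, 2 + 3 * 5))
```

The comparison in the last line evaluates the formula for $T$ directly, so it confirms on this input that the basis images carry full information about $T$.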
Exercise 4.1.8 1. Are the maps $T : V \to W$ given below linear transformations?
(a) Let $V = \mathbb{R}^2$ and $W = \mathbb{R}^3$ with $T(x, y) = (x + y + 1, 2x - y, x + 3y)$.
(b) Let $V = W = \mathbb{R}^2$ with $T(x, y) = (x - y, x^2 - y^2)$.
(c) Let $V = W = \mathbb{R}^2$ with $T(x, y) = (x - y, |x|)$.
(d) Let $V = \mathbb{R}^2$ and $W = \mathbb{R}^4$ with $T(x, y) = (x + y, x - y, 2x + y, 3x - 4y)$.
(e) Let $V = W = \mathbb{R}^4$ with $T(x, y, z, w) = (z, x, w, y)$.
2. Which of the following maps $T : M_2(\mathbb{R}) \to M_2(\mathbb{R})$ are linear operators?
(a) $T(A) = A^t$ (b) $T(A) = I + A$ (c) $T(A) = A^2$ (d) $T(A) = BAB^{-1}$, where $B$ is a fixed invertible $2 \times 2$ matrix.
3. Prove that a map $T : \mathbb{R} \to \mathbb{R}$ is a linear transformation if and only if there exists a unique $c \in \mathbb{R}$ such that $T(x) = cx$ for every $x \in \mathbb{R}$.
4. Let $A \in M_n(\mathbb{C})$ and define $T_A : \mathbb{C}^n \to \mathbb{C}^n$ by $T_A(x^t) = Ax$ for every $x^t \in \mathbb{C}^n$. Prove that for any positive integer $k$, $T_A^k(x^t) = A^k x$.
5. Use matrices to give examples of linear operators $T, S : \mathbb{R}^3 \to \mathbb{R}^3$ that satisfy:
(a) $T \neq 0$, $T^2 \neq 0$, $T^3 = 0$.
(b) $T \neq 0$, $S \neq 0$, $S \circ T \neq 0$, $T \circ S = 0$.
(c) $S^2 = T^2$, $S \neq T$.
(d) $T^2 = I$, $T \neq I$.
6. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator with $T \neq 0$ and $T^2 = 0$. Prove that there exists a vector $x \in \mathbb{R}^n$ such that the set $\{x, T(x)\}$ is linearly independent.
7. Fix a positive integer $p$ and let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator with $T^k \neq 0$ for $1 \le k \le p$ and $T^{p+1} = 0$. Then prove that there exists a vector $x \in \mathbb{R}^n$ such that the set $\{x, T(x), \ldots, T^p(x)\}$ is linearly independent.
8. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation with $T(x_0) = y_0$ for some $x_0 \in \mathbb{R}^n$ and $y_0 \in \mathbb{R}^m$. Define $T^{-1}(y_0) = \{x \in \mathbb{R}^n : T(x) = y_0\}$. Then prove that for every $x \in T^{-1}(y_0)$ there exists $z \in T^{-1}(0)$ such that $x = x_0 + z$. Also, prove that $T^{-1}(y_0)$ is a subspace of $\mathbb{R}^n$ if and only if $0 \in T^{-1}(y_0)$.
9. Define a map $T : \mathbb{C} \to \mathbb{C}$ by $T(z) = \bar{z}$, the complex conjugate of $z$. Is $T$ a linear transformation over $\mathbb{C}(\mathbb{R})$?
10. Prove that there exist infinitely many linear transformations $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T(1, 1, 1) = (1, 2)$ and $T(1, 1, 2) = (1, 0)$.
11. Does there exist a linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T(1, 0, 1) = (1, 2)$, $T(0, 1, 1) = (1, 0)$ and $T(1, 1, 1) = (2, 3)$?
12. Does there exist a linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T(1, 0, 1) = (1, 2)$, $T(0, 1, 1) = (1, 0)$ and $T(1, 1, 2) = (2, 3)$?
13. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T(x, y, z) = (2x + 3y + 4z, x + y + z, x + y + 3z)$. Find the value of $k$ for which there exists a vector $x^t \in \mathbb{R}^3$ such that $T(x^t) = (9, 3, k)$.
14. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T(x, y, z) = (2x - 2y + 2z, 2x + 5y + 2z, 8x + y + 4z)$. Find the value of $k$ for which there exists a vector $x^t \in \mathbb{R}^3$ such that $T(x^t) = (k, 1, 1)$.
15. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T(x, y, z) = (2x + y + 3z, 4x - y + 3z, 3x - 2y + 5z)$. Determine non-zero vectors $x^t, y^t, z^t \in \mathbb{R}^3$ such that $T(x^t) = 6x$, $T(y^t) = 2y$ and $T(z^t) = -2z$. Is the set $\{x, y, z\}$ linearly independent?
16. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T(x, y, z) = (2x + 3y + 4z, y, 3y + 4z)$. Determine non-zero vectors $x^t, y^t, z^t \in \mathbb{R}^3$ such that $T(x^t) = 2x$, $T(y^t) = 4y$ and $T(z^t) = z$. Is the set $\{x, y, z\}$ linearly independent?
17. Let $n$ be any positive integer. Prove that there does not exist a linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^n$ such that $T(1, 1, -2) = x^t$, $T(-1, 2, 3) = y^t$ and $T(1, 10, 1) = z^t$ where $z = x + y$. Do there exist real numbers $c, d$ such that $z = cx + dy$ and $T$ is indeed a linear transformation?
18. Find all functions $f : \mathbb{R}^2 \to \mathbb{R}^2$ that fix the line $y = x$ and send $(x_1, y_1)$ for $x_1 \neq y_1$ to its mirror image along the line $y = x$. Or equivalently, $f$ satisfies
(a) $f(x, x) = (x, x)$ and
(b) $f(x, y) = (y, x)$ for all $(x, y) \in \mathbb{R}^2$.
4.2 Matrix of a linear transformation
In the previous section, we learnt the definition of a linear transformation. We also saw in Example 4.1.2.5 that for each $A \in M_{m \times n}(\mathbb{C})$, there exists a linear transformation $T_A : \mathbb{C}^n \to \mathbb{C}^m$ given by $T_A(x^t) = Ax$ for each $x^t \in \mathbb{C}^n$. In this section, we prove that every linear transformation over finite dimensional vector spaces corresponds to a matrix. Before proceeding further, we advise the reader to recall the results on ordered basis, studied in Section 3.4.
Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ with dimensions $n$ and $m$, respectively. Also, let $B_1 = (v_1, \ldots, v_n)$ and $B_2 = (w_1, \ldots, w_m)$ be ordered bases of $V$ and $W$, respectively. If $T : V \to W$ is a linear transformation then Theorem 4.1.7 implies that $T(v) \in W$ is a linear combination of the vectors $T(v_1), \ldots, T(v_n)$. So, let us find the coordinate vectors $[T(v_j)]_{B_2}$ for each $j = 1, 2, \ldots, n$. Let us assume that
\[ [T(v_1)]_{B_2} = (a_{11}, \ldots, a_{m1})^t,\; [T(v_2)]_{B_2} = (a_{12}, \ldots, a_{m2})^t,\; \ldots,\; [T(v_n)]_{B_2} = (a_{1n}, \ldots, a_{mn})^t. \]
Or equivalently,
\[ T(v_j) = a_{1j} w_1 + a_{2j} w_2 + \cdots + a_{mj} w_m = \sum_{i=1}^{m} a_{ij} w_i \quad \text{for } j = 1, 2, \ldots, n. \tag{4.2.1} \]
Therefore, for a fixed $x \in V$, if $[x]_{B_1} = (x_1, x_2, \ldots, x_n)^t$ then
\[ T(x) = T\Big(\sum_{j=1}^{n} x_j v_j\Big) = \sum_{j=1}^{n} x_j T(v_j) = \sum_{j=1}^{n} x_j \Big(\sum_{i=1}^{m} a_{ij} w_i\Big) = \sum_{i=1}^{m} \Big(\sum_{j=1}^{n} a_{ij} x_j\Big) w_i. \tag{4.2.2} \]
Hence, using Equation (4.2.2), the coordinates of $T(x)$ with respect to the basis $B_2$ equal
\[ [T(x)]_{B_2} = \begin{bmatrix} \sum_{j=1}^{n} a_{1j} x_j \\ \vdots \\ \sum_{j=1}^{n} a_{mj} x_j \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = A\,[x]_{B_1}, \]
where
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \big[[T(v_1)]_{B_2}, [T(v_2)]_{B_2}, \ldots, [T(v_n)]_{B_2}\big]. \tag{4.2.3} \]
The above observations lead to the following theorem and the subsequent definition.
Theorem 4.2.1 Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ with dimensions $n$ and $m$, respectively. Let $T : V \to W$ be a linear transformation. Also, let $B_1 = (v_1, \ldots, v_n)$ and $B_2$ be ordered bases of $V$ and $W$, respectively. Then there exists a matrix $A \in M_{m \times n}(\mathbb{F})$, denoted $A = T[B_1, B_2]$, with $A = \big[[T(v_1)]_{B_2}, [T(v_2)]_{B_2}, \ldots, [T(v_n)]_{B_2}\big]$ such that
\[ [T(x)]_{B_2} = A\,[x]_{B_1} \quad \text{for all } x \in V. \]
Definition 4.2.2 (Matrix of a Linear Transformation) Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ with dimensions $n$ and $m$, respectively. Let $T : V \to W$ be a linear transformation. Then the matrix $T[B_1, B_2]$ is called the matrix of the linear transformation with respect to the ordered bases $B_1$ and $B_2$.
Remark 4.2.3 Let $B_1 = (v_1, \ldots, v_n)$ and $B_2 = (w_1, \ldots, w_m)$ be ordered bases of $V$ and $W$, respectively. Also, let $T : V \to W$ be a linear transformation. Then writing $T[B_1, B_2]$ in place of the matrix $A$, Equation (4.2.1) can be rewritten as
\[ T(v_j) = \sum_{i=1}^{m} T[B_1, B_2]_{ij}\, w_i, \quad \text{for } 1 \le j \le n. \tag{4.2.4} \]
We now give a few examples to understand the above discussion and Theorem 4.2.1.
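Figure 4.1: Counter-clockwise rotation by an angle $\theta$. (The figure shows $P = (1, 0)$ and $Q = (0, 1)$ with images $P' = (\cos\theta, \sin\theta)$ and $Q' = (-\sin\theta, \cos\theta)$, and a general point $P = (x, y)$ with image $P' = (x', y')$.)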
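Example 4.2.4 1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a function that counterclockwise rotates every point in $\mathbb{R}^2$ by an angle $\theta$, $0 \le \theta < 2\pi$. Then using Figure 4.1 it can be checked that $x' = OP' \cos(\alpha + \theta) = OP\big(\cos\alpha \cos\theta - \sin\alpha \sin\theta\big) = x\cos\theta - y\sin\theta$ and similarly $y' = x\sin\theta + y\cos\theta$. Or equivalently, if $B = (e_1, e_2)$ is the standard ordered basis of $\mathbb{R}^2$, then using $T(1, 0) = (\cos\theta, \sin\theta)$ and $T(0, 1) = (-\sin\theta, \cos\theta)$, we get
\[ T[B, B] = \big[[T(1, 0)]_B, [T(0, 1)]_B\big] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \tag{4.2.5} \]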
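2. Let $B_1 = \big((1, 0), (0, 1)\big)$ and $B_2 = \big((1, 1), (1, -1)\big)$ be two ordered bases of $\mathbb{R}^2$. Compute $T[B_1, B_1]$ and $T[B_2, B_2]$ for the linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(x, y) = (x + y, x - 2y)$.
Solution: Observe that for $(x, y) \in \mathbb{R}^2$, $[(x, y)]_{B_1} = \begin{bmatrix} x \\ y \end{bmatrix}$ and $[(x, y)]_{B_2} = \begin{bmatrix} \frac{x + y}{2} \\ \frac{x - y}{2} \end{bmatrix}$. Also, $T(1, 0) = (1, 1)$, $T(0, 1) = (1, -2)$, $T(1, 1) = (2, -1)$ and $T(1, -1) = (0, 3)$. Thus, we have
\[ T[B_1, B_1] = \big[[T(1, 0)]_{B_1}, [T(0, 1)]_{B_1}\big] = \big[[(1, 1)]_{B_1}, [(1, -2)]_{B_1}\big] = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \]
and
\[ T[B_2, B_2] = \big[[T(1, 1)]_{B_2}, [T(1, -1)]_{B_2}\big] = \big[[(2, -1)]_{B_2}, [(0, 3)]_{B_2}\big] = \begin{bmatrix} \frac{1}{2} & \frac{3}{2} \\ \frac{3}{2} & -\frac{3}{2} \end{bmatrix}. \]
Hence, we see that
\[ [T(x, y)]_{B_1} = [(x + y, x - 2y)]_{B_1} = \begin{bmatrix} x + y \\ x - 2y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
and
\[ [T(x, y)]_{B_2} = [(x + y, x - 2y)]_{B_2} = \begin{bmatrix} \frac{2x - y}{2} \\ \frac{3y}{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{3}{2} \\ \frac{3}{2} & -\frac{3}{2} \end{bmatrix} \begin{bmatrix} \frac{x + y}{2} \\ \frac{x - y}{2} \end{bmatrix}. \]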
3. Let $B_1 = \big((1, 0, 0), (0, 1, 0), (0, 0, 1)\big)$ and $B_2 = \big((1, 0), (0, 1)\big)$ be ordered bases of $\mathbb{R}^3$ and $\mathbb{R}^2$, respectively. Define $T : \mathbb{R}^3 \to \mathbb{R}^2$ by $T(x, y, z) = (x + y - z, x + z)$. Then
\[ T[B_1, B_2] = \big[[T(1, 0, 0)]_{B_2}, [T(0, 1, 0)]_{B_2}, [T(0, 0, 1)]_{B_2}\big] = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 0 & 1 \end{bmatrix}. \]
Check that $[T(x, y, z)]_{B_2} = (x + y - z, x + z)^t = T[B_1, B_2]\,[(x, y, z)]_{B_1}$.
4. Let $B_1 = \big((1, 0, 0), (0, 1, 0), (0, 0, 1)\big)$ and $B_2 = \big((1, 0, 0), (1, 1, 0), (1, 1, 1)\big)$ be two ordered bases of $\mathbb{R}^3$. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(x^t) = x$ for all $x^t \in \mathbb{R}^3$. Then
$T(1, 0, 0) = 1 \cdot (1, 0, 0) + 0 \cdot (1, 1, 0) + 0 \cdot (1, 1, 1)$ gives $[T(1, 0, 0)]_{B_2} = (1, 0, 0)^t$,
$T(0, 1, 0) = -1 \cdot (1, 0, 0) + 1 \cdot (1, 1, 0) + 0 \cdot (1, 1, 1)$ gives $[T(0, 1, 0)]_{B_2} = (-1, 1, 0)^t$, and
$T(0, 0, 1) = 0 \cdot (1, 0, 0) + (-1) \cdot (1, 1, 0) + 1 \cdot (1, 1, 1)$ gives $[T(0, 0, 1)]_{B_2} = (0, -1, 1)^t$.
Thus, check that
\[ T[B_1, B_2] = \big[[T(1, 0, 0)]_{B_2}, [T(0, 1, 0)]_{B_2}, [T(0, 0, 1)]_{B_2}\big] = \big[(1, 0, 0)^t, (-1, 1, 0)^t, (0, -1, 1)^t\big] = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}, \]
\[ T[B_2, B_1] = \big[[T(1, 0, 0)]_{B_1}, [T(1, 1, 0)]_{B_1}, [T(1, 1, 1)]_{B_1}\big] = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \]
$T[B_1, B_1] = I_3 = T[B_2, B_2]$ and $T[B_2, B_1]^{-1} = T[B_1, B_2]$.
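The recipe of Theorem 4.2.1 (column $j$ of $T[B_1, B_2]$ is the $B_2$-coordinate vector of $T(v_j)$) translates directly into code. The sketch below is an added illustration, not the authors' method of computation: the helpers `solve` and `matrix_of` are hypothetical names, the coordinates are found by Gauss-Jordan elimination over the rationals, and the data are those of Example 4.2.4.2.

```python
from fractions import Fraction


def solve(basis, target):
    """Coordinates of `target` with respect to `basis` (n independent vectors of length n)."""
    n = len(basis)
    # Augmented matrix: the basis vectors are the columns, the target is the last column.
    M = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(target[i])]
         for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]


def matrix_of(T, basis_v, basis_w):
    """T[B1, B2]: column j holds the B2-coordinates of T applied to the j-th vector of B1."""
    cols = [solve(basis_w, T(v)) for v in basis_v]
    return [[col[i] for col in cols] for i in range(len(basis_w))]


def T(v):
    """The operator T(x, y) = (x + y, x - 2y) of Example 4.2.4.2."""
    x, y = v
    return (x + y, x - 2 * y)


B1 = [(1, 0), (0, 1)]
B2 = [(1, 1), (1, -1)]
print(matrix_of(T, B1, B1))
print(matrix_of(T, B2, B2))
```

The same `matrix_of` call works for any pair of ordered bases given as lists of tuples, provided each list really is a basis; `solve` raises `StopIteration` if it is not.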
Remark 4.2.5 1. Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ with ordered bases $B_1 = (v_1, \ldots, v_n)$ and $B_2$ of $V$ and $W$, respectively. If $T : V \to W$ is a linear transformation then
(a) $T[B_1, B_2] = \big[[T(v_1)]_{B_2}, [T(v_2)]_{B_2}, \ldots, [T(v_n)]_{B_2}\big]$.
(b) $[T(x)]_{B_2} = T[B_1, B_2]\,[x]_{B_1}$ for all $x \in V$. That is, the coordinate vector of $T(x) \in W$ is obtained by multiplying the matrix of the linear transformation with the coordinate vector of $x \in V$.
2. Let $A \in M_{m \times n}(\mathbb{R})$. Then $A$ induces a linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$ defined by $T_A(x^t) = Ax$ for all $x^t \in \mathbb{R}^n$. Let $B_1$ and $B_2$ be the standard ordered bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Then it can be easily verified that $T_A[B_1, B_2] = A$.
Exercise 4.2.6 1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a function that reflects every point in $\mathbb{R}^2$ about the line $y = mx$. Find its matrix with respect to the standard ordered basis of $\mathbb{R}^2$.
2. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a function that reflects every point in $\mathbb{R}^3$ about the X-axis. Find its matrix with respect to the standard ordered basis of $\mathbb{R}^3$.
3. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a function that counterclockwise rotates every point in $\mathbb{R}^3$ around the positive Z-axis by an angle $\theta$, $0 \le \theta < 2\pi$. Prove that $T$ is a linear operator and find its matrix with respect to the standard ordered basis of $\mathbb{R}^3$.
4. Define a function $D : \mathcal{P}_n(\mathbb{R}) \to \mathcal{P}_n(\mathbb{R})$ by
\[ D(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}. \]
Prove that $D$ is a linear operator and find the matrix of $D$ with respect to the standard ordered basis of $\mathcal{P}_n(\mathbb{R})$. Observe that the image of $D$ is contained in $\mathcal{P}_{n-1}(\mathbb{R})$.
5. Let $T$ be a linear operator in $\mathbb{R}^2$ satisfying $T(3, 4) = (0, 1)$ and $T(1, 1) = (2, 3)$. Let $B = \big((1, 0), (1, 1)\big)$ be an ordered basis of $\mathbb{R}^2$. Compute $T[B, B]$.
6. For each linear transformation given in Example 4.1.2, find its matrix with respect to the standard ordered bases.
4.3 Rank-Nullity Theorem
We are now ready to relate the rank-nullity theorem (see Theorem 3.3.25) with the rank-nullity theorem for linear transformations. To do so, we first define the range space and the null space of any linear transformation.
Definition 4.3.1 (Range Space and Null Space) Let $V$ be a finite dimensional vector space over $\mathbb{F}$ and let $W$ be any vector space over $\mathbb{F}$. Then for a linear transformation $T : V \to W$, we define
1. $\mathcal{R}(T) = \{T(x) : x \in V\}$ as the range space of $T$ and
2. $\mathcal{N}(T) = \{x \in V : T(x) = 0\}$ as the null space of $T$.
We now prove some results associated with the above definitions.
Proposition 4.3.2 Let $V$ be a vector space over $\mathbb{F}$ with basis $\{v_1, \ldots, v_n\}$. Also, let $W$ be a vector space over $\mathbb{F}$. Then for any linear transformation $T : V \to W$,
1. $\mathcal{R}(T) = L(T(v_1), \ldots, T(v_n))$ is a subspace of $W$ and $\dim(\mathcal{R}(T)) \le \dim(W)$.
2. $\mathcal{N}(T)$ is a subspace of $V$ and $\dim(\mathcal{N}(T)) \le \dim(V)$.
3. The following statements are equivalent.
(a) $T$ is one-one.
(b) $\mathcal{N}(T) = \{0\}$.
(c) $\{T(v_i) : 1 \le i \le n\}$ is a basis of $\mathcal{R}(T)$.
4. $\dim(\mathcal{R}(T)) = \dim(V)$ if and only if $\mathcal{N}(T) = \{0\}$.
Proof. Parts 1 and 2: The results about $\mathcal{R}(T)$ and $\mathcal{N}(T)$ can be easily proved. We thus leave the proof for the readers.
We now assume that $T$ is one-one. We need to show that $\mathcal{N}(T) = \{0\}$.
Let $u \in \mathcal{N}(T)$. Then by definition, $T(u) = 0$. Also for any linear transformation (see Proposition 4.1.3), $T(0) = 0$. Thus $T(u) = T(0)$. So, $T$ is one-one implies $u = 0$. That is, $\mathcal{N}(T) = \{0\}$.
Let $\mathcal{N}(T) = \{0\}$. We need to show that $T$ is one-one. So, let us assume that for some $u, v \in V$, $T(u) = T(v)$. Then, by linearity of $T$, $T(u - v) = 0$. This implies $u - v \in \mathcal{N}(T) = \{0\}$. This in turn implies $u = v$. Hence, $T$ is one-one.
The other parts can be similarly proved.
Remark 4.3.3 1. $\mathcal{R}(T)$ is called the range space and $\mathcal{N}(T)$ the null space of $T$.
2. $\dim(\mathcal{R}(T))$ is denoted by $\rho(T)$ and is called the rank of $T$.
3. $\dim(\mathcal{N}(T))$ is denoted by $\nu(T)$ and is called the nullity of $T$.
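Example 4.3.4 Determine the range and null space of the linear transformation
\[ T : \mathbb{R}^3 \to \mathbb{R}^4 \quad \text{with} \quad T(x, y, z) = (x - y + z,\; y - z,\; x,\; 2x - 5y + 5z). \]
Solution: By definition,
\begin{align*}
\mathcal{R}(T) &= L\big((1, 0, 1, 2), (-1, 1, 0, -5), (1, -1, 0, 5)\big) \\
&= L\big((1, 0, 1, 2), (1, -1, 0, 5)\big) \\
&= \{\alpha(1, 0, 1, 2) + \beta(1, -1, 0, 5) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(\alpha + \beta, -\beta, \alpha, 2\alpha + 5\beta) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(x, y, z, w) \in \mathbb{R}^4 : x + y - z = 0,\; 5y - 2z + w = 0\}
\end{align*}
and
\begin{align*}
\mathcal{N}(T) &= \{(x, y, z) \in \mathbb{R}^3 : T(x, y, z) = 0\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : (x - y + z, y - z, x, 2x - 5y + 5z) = 0\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : x - y + z = 0,\; y - z = 0,\; x = 0,\; 2x - 5y + 5z = 0\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : y - z = 0,\; x = 0\} \\
&= \{(0, y, y) \in \mathbb{R}^3 : y \in \mathbb{R}\} = L\big((0, 1, 1)\big).
\end{align*}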
Exercise 4.3.5 1. Define a linear operator $D : \mathcal{P}_n(\mathbb{R}) \to \mathcal{P}_n(\mathbb{R})$ by
\[ D(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}. \]
Describe $\mathcal{N}(D)$ and $\mathcal{R}(D)$. Note that $\mathcal{R}(D) \subseteq \mathcal{P}_{n-1}(\mathbb{R})$.
2. Let $T : V \to W$ be a linear transformation. If $\{T(v_1), \ldots, T(v_n)\}$ is a linearly independent subset of $\mathcal{R}(T)$ then prove that $\{v_1, \ldots, v_n\} \subseteq V$ is linearly independent.
3. Define a linear operator $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(1, 0, 0) = (0, 0, 1)$, $T(1, 1, 0) = (1, 1, 1)$ and $T(1, 1, 1) = (1, 1, 0)$. Then
(a) determine $T(x, y, z)$ for $x, y, z \in \mathbb{R}$.
(b) determine $\mathcal{R}(T)$ and $\mathcal{N}(T)$. Also calculate $\rho(T)$ and $\nu(T)$.
(c) prove that $T^3 = T$ and find the matrix of $T$ with respect to the standard basis.
4. Find a linear operator $T : \mathbb{R}^3 \to \mathbb{R}^3$ for which $\mathcal{R}(T) = L\big((1, 2, 0), (0, 1, 1), (1, 3, 1)\big)$.
5. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of a vector space $V(\mathbb{F})$. If $W(\mathbb{F})$ is a vector space and $w_1, w_2, \ldots, w_n \in W$ then prove that there exists a unique linear transformation $T : V \to W$ such that $T(v_i) = w_i$ for all $i = 1, 2, \ldots, n$.
We now state the rank-nullity theorem for linear transformations. The proof of this result is similar to the proof of Theorem 3.3.25 and it also follows from Proposition 4.3.2. Hence, we omit the proof.
Theorem 4.3.6 (Rank Nullity Theorem) Let $V$ be a finite dimensional vector space and let $T : V \to W$ be a linear transformation. Then $\rho(T) + \nu(T) = \dim(V)$. That is,
\[ \dim(\mathcal{R}(T)) + \dim(\mathcal{N}(T)) = \dim(V). \]
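As a worked check of Theorem 4.3.6 (an illustration added here, using only the data computed in Example 4.3.4), take $T : \mathbb{R}^3 \to \mathbb{R}^4$ with $T(x, y, z) = (x - y + z, y - z, x, 2x - 5y + 5z)$. There the range space $L\big((1, 0, 1, 2), (1, -1, 0, 5)\big)$ is spanned by two linearly independent vectors and the null space is $L\big((0, 1, 1)\big)$, so
\[ \rho(T) + \nu(T) = 2 + 1 = 3 = \dim(\mathbb{R}^3). \]
Note that the dimension of the codomain $\mathbb{R}^4$ plays no role in the count.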
Using the rank-nullity theorem, we give a short proof of the following result.
Corollary 4.3.7 Let $V$ be a finite dimensional vector space and let $T : V \to V$ be a linear operator. Then the following statements are equivalent.
1. $T$ is one-one.
2. $T$ is onto.
3. $T$ is invertible.
Proof. By Proposition 4.3.2, $T$ is one-one if and only if $\mathcal{N}(T) = \{0\}$. By Theorem 4.3.6, $\mathcal{N}(T) = \{0\}$ if and only if $\dim(\mathcal{R}(T)) = \dim(V)$, or equivalently, $T$ is onto.
Now, we know that $T$ is invertible if and only if $T$ is one-one and onto. But we have just shown that $T$ is one-one if and only if $T$ is onto. Thus, we have the required result.
Remark 4.3.8 Let $V$ be a finite dimensional vector space and let $T : V \to V$ be a linear operator. If either $T$ is one-one or $T$ is onto then $T$ is invertible.
Theorem 4.3.9 Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ and let $T : V \to W$ be a linear transformation. Also assume that $T$ is one-one and onto. Then
1. for each $w \in W$, the set $T^{-1}(w)$ is a set consisting of a single element.
2. the map $T^{-1} : W \to V$ defined by $T^{-1}(w) = v$ whenever $T(v) = w$ is a linear transformation.
Proof. Since $T$ is onto, for each $w \in W$ there exists $v \in V$ such that $T(v) = w$. So, the set $T^{-1}(w)$ is non-empty.
Suppose there exist vectors $v_1, v_2 \in V$ such that $T(v_1) = T(v_2)$. Then the assumption that $T$ is one-one implies $v_1 = v_2$. This completes the proof of Part 1.
We are now ready to prove that $T^{-1}$, as defined in Part 2, is a linear transformation. Let $w_1, w_2 \in W$. Then by Part 1, there exist unique vectors $v_1, v_2 \in V$ such that $T^{-1}(w_1) = v_1$ and $T^{-1}(w_2) = v_2$. Or equivalently, $T(v_1) = w_1$ and $T(v_2) = w_2$. So, for any $\alpha_1, \alpha_2 \in \mathbb{F}$, $T(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 w_1 + \alpha_2 w_2$. Hence, by definition, for any $\alpha_1, \alpha_2 \in \mathbb{F}$,
\[ T^{-1}(\alpha_1 w_1 + \alpha_2 w_2) = \alpha_1 v_1 + \alpha_2 v_2 = \alpha_1 T^{-1}(w_1) + \alpha_2 T^{-1}(w_2). \]
Thus the proof of Part 2 is over.
Definition 4.3.10 (Inverse Linear Transformation) Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ and let $T : V \to W$ be a linear transformation. If the map $T$ is one-one and onto, then the map $T^{-1} : W \to V$ defined by
\[ T^{-1}(w) = v \quad \text{whenever} \quad T(v) = w \]
is called the inverse of the linear transformation $T$.
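Example 4.3.11 1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $T(x, y) = (x + y, x - y)$. Then $T^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by $T^{-1}(x, y) = \left(\frac{x + y}{2}, \frac{x - y}{2}\right)$. One can see that
\begin{align*}
T \circ T^{-1}(x, y) &= T\big(T^{-1}(x, y)\big) = T\left(\frac{x + y}{2}, \frac{x - y}{2}\right) \\
&= \left(\frac{x + y}{2} + \frac{x - y}{2},\; \frac{x + y}{2} - \frac{x - y}{2}\right) = (x, y) = I(x, y),
\end{align*}
where $I$ is the identity operator. Hence, $T \circ T^{-1} = I$. Verify that $T^{-1} \circ T = I$. Thus, the map $T^{-1}$ is indeed the inverse of $T$.
2. For $(a_1, \ldots, a_{n+1}) \in \mathbb{R}^{n+1}$, define the linear transformation $T : \mathbb{R}^{n+1} \to \mathcal{P}_n(\mathbb{R})$ by
\[ T(a_1, a_2, \ldots, a_{n+1}) = a_1 + a_2 x + \cdots + a_{n+1} x^n. \]
Then it can be checked that $T^{-1} : \mathcal{P}_n(\mathbb{R}) \to \mathbb{R}^{n+1}$ is defined by $T^{-1}(a_1 + a_2 x + \cdots + a_{n+1} x^n) = (a_1, a_2, \ldots, a_{n+1})$ for all $a_1 + a_2 x + \cdots + a_{n+1} x^n \in \mathcal{P}_n(\mathbb{R})$.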
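The two compositions in Example 4.3.11.1 can be checked on sample points. This is a small Python sketch added for illustration; the names `T_inv`, `samples` and `both_ways` are assumptions, and the sample points are arbitrary. Agreement on samples is a sanity check, not a proof, although for linear maps agreement on a basis would already suffice.

```python
from fractions import Fraction


def T(v):
    """T(x, y) = (x + y, x - y) of Example 4.3.11.1."""
    x, y = v
    return (x + y, x - y)


def T_inv(v):
    """The claimed inverse: (x, y) is sent to ((x + y)/2, (x - y)/2)."""
    x, y = Fraction(v[0]), Fraction(v[1])
    return ((x + y) / 2, (x - y) / 2)


# Arbitrary test points, including a non-integer one.
samples = [(0, 0), (1, 0), (0, 1), (3, -7), (Fraction(1, 2), 5)]

# Both compositions should return the input unchanged.
both_ways = all(T(T_inv(v)) == v and T_inv(T(v)) == v for v in samples)
print(both_ways)
```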
Exercise 4.3.12 1. Let $V$ be a finite dimensional vector space and let $T : V \to W$ be a linear transformation. Then
(a) prove that $\mathcal{N}(T)$ and $\mathcal{R}(T)$ are also finite dimensional.
(b) prove that
i. if $\dim(V) < \dim(W)$ then $T$ cannot be onto.
ii. if $\dim(V) > \dim(W)$ then $T$ cannot be one-one.
2. Let $V$ be a vector space of dimension $n$ and let $B = (v_1, \ldots, v_n)$ be an ordered basis of $V$. For $i = 1, \ldots, n$, let $w_i \in V$ with $[w_i]_B = [a_{1i}, a_{2i}, \ldots, a_{ni}]^t$. Also, let $A = [a_{ij}]$. Then prove that $\{w_1, \ldots, w_n\}$ is a basis of $V$ if and only if $A$ is invertible.
3. Let $T, S : V \to V$ be linear transformations with $\dim(V) = n$.
(a) Show that $\mathcal{R}(T + S) \subseteq \mathcal{R}(T) + \mathcal{R}(S)$. Deduce that $\rho(T + S) \le \rho(T) + \rho(S)$.
(b) Now, use Theorem 4.3.6 to prove $\nu(T + S) \ge \nu(T) + \nu(S) - n$.
4. Let $z_1, z_2, \ldots, z_k$ be $k$ distinct complex numbers and define a linear transformation $T : \mathcal{P}_n(\mathbb{C}) \to \mathbb{C}^k$ by $T\big(P(z)\big) = \big(P(z_1), P(z_2), \ldots, P(z_k)\big)$. For each $k \ge 1$, determine $\dim(\mathcal{R}(T))$.
5. Fix $A \in M_n(\mathbb{R})$ satisfying $A^2 = A$ and define $T_A : \mathbb{R}^n \to \mathbb{R}^n$ by $T_A(v^t) = Av$ for all $v^t \in \mathbb{R}^n$. Then prove that
(a) $T_A \circ T_A = T_A$. Equivalently, $T_A \circ (I - T_A) = 0$, where $I : \mathbb{R}^n \to \mathbb{R}^n$ is the identity map and $0 : \mathbb{R}^n \to \mathbb{R}^n$ is the zero map.
(b) $\mathcal{N}(T_A) \cap \mathcal{R}(T_A) = \{0\}$.
(c) $\mathbb{R}^n = \mathcal{R}(T_A) + \mathcal{N}(T_A)$. [Hint: $x = T_A(x) + (I - T_A)(x)$]
4.4 Similarity of Matrices
Let $V$ be a finite dimensional vector space with ordered basis $B$. Then we saw that any linear operator $T : V \to V$ corresponds to a square matrix of order $\dim(V)$ and this matrix was denoted by $T[B, B]$. In this section, we will try to understand the relationship between $T[B_1, B_1]$ and $T[B_2, B_2]$, where $B_1$ and $B_2$ are distinct ordered bases of $V$. This will enable us to understand the reason for defining the matrix product somewhat differently.
Theorem 4.4.1 (Composition of Linear Transformations) Let $V, W$ and $Z$ be finite dimensional vector spaces with ordered bases $B_1, B_2$ and $B_3$, respectively. Also, let $T : V \to W$ and $S : W \to Z$ be linear transformations. Then the composition map $S \circ T : V \to Z$ (see Figure 4.2) is a linear transformation and
\[ (S \circ T)[B_1, B_3] = S[B_2, B_3] \cdot T[B_1, B_2]. \]
Figure 4.2: Composition of Linear Transformations. (The diagram shows $(V, B_1, n) \to (W, B_2, m) \to (Z, B_3, p)$, with the arrows labelled by $T[B_1, B_2]_{m \times n}$ and $S[B_2, B_3]_{p \times m}$, and the composite labelled by $(S \circ T)[B_1, B_3]_{p \times n} = S[B_2, B_3] \cdot T[B_1, B_2]$.)
Proof. Let $B_1 = (u_1, \ldots, u_n)$, $B_2 = (v_1, \ldots, v_m)$ and $B_3 = (w_1, \ldots, w_p)$ be ordered bases of $V, W$ and $Z$, respectively. Then using Equation (4.2.4), we have
\begin{align*}
(S \circ T)(u_t) &= S(T(u_t)) = S\Big(\sum_{j=1}^{m} (T[B_1, B_2])_{jt}\, v_j\Big) = \sum_{j=1}^{m} (T[B_1, B_2])_{jt}\, S(v_j) \\
&= \sum_{j=1}^{m} (T[B_1, B_2])_{jt} \sum_{k=1}^{p} (S[B_2, B_3])_{kj}\, w_k = \sum_{k=1}^{p} \Big(\sum_{j=1}^{m} (S[B_2, B_3])_{kj} (T[B_1, B_2])_{jt}\Big) w_k \\
&= \sum_{k=1}^{p} \big(S[B_2, B_3]\, T[B_1, B_2]\big)_{kt}\, w_k.
\end{align*}
Thus, using matrix multiplication, the $t$-th column of $(S \circ T)[B_1, B_3]$ is given by
\[ [(S \circ T)(u_t)]_{B_3} = \begin{bmatrix} \big(S[B_2, B_3]\, T[B_1, B_2]\big)_{1t} \\ \big(S[B_2, B_3]\, T[B_1, B_2]\big)_{2t} \\ \vdots \\ \big(S[B_2, B_3]\, T[B_1, B_2]\big)_{pt} \end{bmatrix} = S[B_2, B_3] \begin{bmatrix} T[B_1, B_2]_{1t} \\ T[B_1, B_2]_{2t} \\ \vdots \\ T[B_1, B_2]_{mt} \end{bmatrix}. \]
Hence, $(S \circ T)[B_1, B_3] = \big[[(S \circ T)(u_1)]_{B_3}, \ldots, [(S \circ T)(u_n)]_{B_3}\big] = S[B_2, B_3] \cdot T[B_1, B_2]$ and the proof of the theorem is over.
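Theorem 4.4.1 can be observed on a concrete pair of maps. In the sketch below, which is an added illustration with standard ordered bases throughout, `std_matrix` and `mat_mul` are hypothetical helper names; the maps are $T(x, y, z) = (x + y - z, x + z)$ from Example 4.2.4.3 and $S(x, y) = (x + y, x - 2y)$ from Example 4.2.4.2.

```python
def std_matrix(f, n):
    """Matrix of f : R^n -> R^m with respect to the standard ordered bases."""
    # Column j is the image of the j-th standard basis vector.
    cols = [f(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


def mat_mul(A, B):
    """Usual row-by-column product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def T(v):
    x, y, z = v
    return (x + y - z, x + z)


def S(v):
    x, y = v
    return (x + y, x - 2 * y)


def S_after_T(v):
    """The composition S o T : R^3 -> R^2."""
    return S(T(v))


lhs = std_matrix(S_after_T, 3)
rhs = mat_mul(std_matrix(S, 2), std_matrix(T, 3))
print(lhs == rhs)
```

The order of the factors matters: the matrix of $S$ stands on the left because $S$ is applied last.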
Proposition 4.4.2 Let $V$ be a finite dimensional vector space and let $T, S : V \to V$ be two linear operators. Then $\nu(T) + \nu(S) \ge \nu(T \circ S) \ge \max\{\nu(T), \nu(S)\}$.
Proof. We first prove the second inequality.
Suppose $v \in \mathcal{N}(S)$. Then $(T \circ S)(v) = T(S(v)) = T(0) = 0$ gives $\mathcal{N}(S) \subseteq \mathcal{N}(T \circ S)$. Therefore, $\nu(S) \le \nu(T \circ S)$.
We now use Theorem 4.3.6 to see that the inequality $\nu(T) \le \nu(T \circ S)$ is equivalent to $\rho(T \circ S) \le \rho(T)$, for which it suffices to show $\mathcal{R}(T \circ S) \subseteq \mathcal{R}(T)$. But this holds true as $\mathcal{R}(S) \subseteq V$ and hence $T(\mathcal{R}(S)) \subseteq T(V)$. Thus, the proof of the second inequality is over.
For the proof of the first inequality, assume that $k = \nu(S)$ and $\{v_1, \ldots, v_k\}$ is a basis of $\mathcal{N}(S)$. Then $\{v_1, \ldots, v_k\} \subseteq \mathcal{N}(T \circ S)$ as $T(0) = 0$. So, let us extend it to get a basis $\{v_1, \ldots, v_k, u_1, \ldots, u_\ell\}$ of $\mathcal{N}(T \circ S)$.
Claim: $\{S(u_1), S(u_2), \ldots, S(u_\ell)\}$ is a linearly independent subset of $\mathcal{N}(T)$.
It is easily seen that $\{S(u_1), \ldots, S(u_\ell)\}$ is a subset of $\mathcal{N}(T)$. So, let us solve the linear system $c_1 S(u_1) + c_2 S(u_2) + \cdots + c_\ell S(u_\ell) = 0$ in the unknowns $c_1, c_2, \ldots, c_\ell$. This system is equivalent to $S(c_1 u_1 + c_2 u_2 + \cdots + c_\ell u_\ell) = 0$. That is, $\sum_{i=1}^{\ell} c_i u_i \in \mathcal{N}(S)$. Hence, $\sum_{i=1}^{\ell} c_i u_i$ is a unique linear combination of the vectors $v_1, \ldots, v_k$. Thus,
\[ c_1 u_1 + c_2 u_2 + \cdots + c_\ell u_\ell = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k \tag{4.4.1} \]
for some scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$. But by assumption, $\{v_1, \ldots, v_k, u_1, \ldots, u_\ell\}$ is a basis of $\mathcal{N}(T \circ S)$ and hence linearly independent. Therefore, the only solution of Equation (4.4.1) is given by $c_i = 0$ for $1 \le i \le \ell$ and $\alpha_j = 0$ for $1 \le j \le k$.
Thus, $\{S(u_1), S(u_2), \ldots, S(u_\ell)\}$ is a linearly independent subset of $\mathcal{N}(T)$ and so $\nu(T) \ge \ell$. Hence, $\nu(T \circ S) = k + \ell \le \nu(S) + \nu(T)$.
Remark 4.4.3 Using Theorem 4.3.6 and Proposition 4.4.2, we see that if $A$ and $B$ are two $n \times n$ matrices then
\[ \min\{\rho(A), \rho(B)\} \ge \rho(AB) \ge n - \nu(A) - \nu(B). \]
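The rank inequalities of Remark 4.4.3 can be tried on a small case. The sketch below is an added illustration: the two diagonal matrices are hypothetical choices made so that the lower bound is attained, and `rank` and `mat_mul` are assumed helper names, with the rank computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


n = 3
A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # rank 2, nullity 1
B = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]   # rank 2, nullity 1

rA, rB, rAB = rank(A), rank(B), rank(mat_mul(A, B))
nuA, nuB = n - rA, n - rB                # nullities via rank-nullity
holds = min(rA, rB) >= rAB >= n - nuA - nuB
print(rA, rB, rAB, holds)
```

In this example $\rho(AB) = 1 = n - \nu(A) - \nu(B)$, so the second inequality cannot be improved in general.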
Let $V$ be a finite dimensional vector space and let $T : V \to V$ be an invertible linear operator. Then using Theorem 4.3.9, the map $T^{-1} : V \to V$ is a linear operator defined by $T^{-1}(u) = v$ whenever $T(v) = u$. The next result relates the matrix of $T$ and $T^{-1}$. The reader is required to supply the proof (use Theorem 4.4.1).
Theorem 4.4.4 (Inverse of a Linear Transformation) Let $V$ be a finite dimensional vector space with ordered bases $B_1$ and $B_2$. Also let $T : V \to V$ be an invertible linear operator. Then the matrices of $T$ and $T^{-1}$ are related by $T[B_1, B_2]^{-1} = T^{-1}[B_2, B_1]$.
Exercise 4.4.5 Find the matrix of the linear transformations given below.
1. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(1, 1, 1) = (1, -1, 1)$, $T(1, -1, 1) = (1, 1, -1)$ and $T(1, 1, -1) = (1, 1, 1)$. Find $T[B, B]$, where $B = \big((1, 1, 1), (1, -1, 1), (1, 1, -1)\big)$. Is $T$ an invertible linear operator?
2. Let $B = \big(1, x, x^2, x^3\big)$ be an ordered basis of $\mathcal{P}_3(\mathbb{R})$. Define $T : \mathcal{P}_3(\mathbb{R}) \to \mathcal{P}_3(\mathbb{R})$ by
\[ T(1) = 1,\quad T(x) = 1 + x,\quad T(x^2) = (1 + x)^2 \quad \text{and} \quad T(x^3) = (1 + x)^3. \]
Prove that $T$ is an invertible linear operator. Also, find $T[B, B]$ and $T^{-1}[B, B]$.
We end this section with definitions, results and examples related to the notion of isomorphism. The result states that for each fixed positive integer $n$, every real vector space of dimension $n$ is isomorphic to $\mathbb{R}^n$ and every complex vector space of dimension $n$ is isomorphic to $\mathbb{C}^n$.
Definition 4.4.6 (Isomorphism) Let $V$ and $W$ be two vector spaces over $\mathbb{F}$. Then $V$ is said to be isomorphic to $W$ if there exists a linear transformation $T : V \to W$ that is one-one, onto and invertible. We also denote it by $V \cong W$.
Theorem 4.4.7 Let $V$ be a vector space over $\mathbb{R}$. If $\dim(V) = n$ then $V \cong \mathbb{R}^n$.
Proof. Let $B$ be the standard ordered basis of $\mathbb{R}^n$ and let $B_1 = (v_1, \ldots, v_n)$ be an ordered basis of $V$. Define a map $T : V \to \mathbb{R}^n$ by $T(v_i) = e_i$ for $1 \le i \le n$. Then it can be easily verified that $T$ is a linear transformation that is one-one, onto and invertible (the image of a basis vector is a basis vector). Hence, the result follows.
A similar idea leads to the following result and hence we omit the proof.
Theorem 4.4.8 Let $V$ be a vector space over $\mathbb{C}$. If $\dim(V) = n$ then $V \cong \mathbb{C}^n$.
Example 4.4.9 1. The standard ordered basis of $\mathcal{P}_n(\mathbb{C})$ is given by $(1, x, x^2, \ldots, x^n)$. Hence, define $T : \mathcal{P}_n(\mathbb{C}) \to \mathbb{C}^{n+1}$ by $T(x^i) = e_{i+1}$ for $0 \le i \le n$. In general, verify that $T(a_0 + a_1 x + \cdots + a_n x^n) = (a_0, a_1, \ldots, a_n)$ and that $T$ is a linear transformation which is one-one, onto and invertible. Thus, the vector space $\mathcal{P}_n(\mathbb{C}) \cong \mathbb{C}^{n+1}$.
2. Let $V = \{(x, y, z, w) \in \mathbb{R}^4 : x - y + z - w = 0\}$. Suppose that $B$ is the standard ordered basis of $\mathbb{R}^3$ and $B_1 = \big((1, 1, 0, 0), (-1, 0, 1, 0), (1, 0, 0, 1)\big)$ is the ordered basis of $V$. Then $T : V \to \mathbb{R}^3$ defined by $T(v) = T(y - z + w, y, z, w) = (y, z, w)$ is a linear transformation and $T[B_1, B] = I_3$. Thus, $T$ is one-one, onto and invertible.
4.5 Change of Basis
Let $V$ be a vector space with ordered bases $B_1 = (u_1, \ldots, u_n)$ and $B_2 = (v_1, \ldots, v_n)$. Also, recall that the identity linear operator $I : V \to V$ is defined by $I(x) = x$ for every $x \in V$. If
$$I[B_2, B_1] = \big[\, [I(v_1)]_{B_1}, [I(v_2)]_{B_1}, \ldots, [I(v_n)]_{B_1} \,\big] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$
then by definition of $I[B_2, B_1]$, we see that $v_i = I(v_i) = \sum_{j=1}^{n} a_{ji} u_j$ for all $i$, $1 \le i \le n$. Thus, we have proved the following result which also appeared in another form in Theorem 3.4.5.
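Theorem 4.5.1 (Change of Basis Theorem) Let $V$ be a finite dimensional vector space with ordered bases $B_1 = (u_1, u_2, \ldots, u_n)$ and $B_2 = (v_1, v_2, \ldots, v_n)$. Suppose $x \in V$ with $[x]_{B_1} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t$ and $[x]_{B_2} = (\beta_1, \beta_2, \ldots, \beta_n)^t$. Then $[x]_{B_1} = I[B_2, B_1]\, [x]_{B_2}$. Or equivalently,
$$\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}.$$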
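The theorem can be tried out on a small case. The sketch below uses the two ordered bases of $\mathcal{P}_2(\mathbb{R})$ that appear in Example 4.5.6.1, namely $B_1 = (1 + x, 1 + 2x + x^2, 2 + x)$ and $B_2 = (1, 1 + x, 1 + x + x^2)$, with $I[B_2, B_1]$ written out with the signs obtained by solving for the $B_1$-coordinates directly. The coordinate vector `x_B2` and the helper names `combine` and `matvec` are hypothetical and serve only as an illustration.

```python
# Polynomials a + b x + c x^2 are stored as coefficient triples (a, b, c).
B1 = [(1, 1, 0), (1, 2, 1), (2, 1, 0)]   # 1 + x, 1 + 2x + x^2, 2 + x
B2 = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]   # 1, 1 + x, 1 + x + x^2

# I[B2, B1]: column i holds the B1-coordinates of the i-th vector of B2.
I_B2_B1 = [[-1, 1, -2],
           [ 0, 0,  1],
           [ 1, 0,  1]]

def combine(coords, basis):
    """Return sum_i coords[i] * basis[i] as a coefficient triple."""
    return tuple(sum(c * b[k] for c, b in zip(coords, basis)) for k in range(3))

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

x_B2 = [3, -1, 2]                 # hypothetical coordinates [x]_{B2}
x_B1 = matvec(I_B2_B1, x_B2)      # Theorem 4.5.1: [x]_{B1} = I[B2, B1] [x]_{B2}

# Both coordinate vectors must describe the same polynomial.
print(x_B1, combine(x_B2, B2), combine(x_B1, B1))
```

Here both expansions give the polynomial $4 + x + 2x^2$, which is what the theorem predicts.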
Remark 4.5.2 Observe that the identity linear operator $I : V \to V$ is invertible and hence by Theorem 4.4.4, $I[B_2, B_1]^{-1} = I^{-1}[B_1, B_2] = I[B_1, B_2]$. Therefore, we also have $[x]_{B_2} = I[B_1, B_2]\, [x]_{B_1}$.
Let $V$ be a finite dimensional vector space with ordered bases $B_1$ and $B_2$. Then for any linear operator $T : V \to V$ the next result relates $T[B_1, B_1]$ and $T[B_2, B_2]$.
Theorem 4.5.3 Let $B_1 = (u_1, \ldots, u_n)$ and $B_2 = (v_1, \ldots, v_n)$ be two ordered bases of a vector space $V$. Also, let $A = [a_{ij}] = I[B_2, B_1]$ be the matrix of the identity linear operator. Then for any linear operator $T : V \to V$
$$T[B_2, B_2] = A^{-1} \, T[B_1, B_1] \, A = I[B_1, B_2] \, T[B_1, B_1] \, I[B_2, B_1]. \qquad (4.5.2)$$
Proof. The proof uses Theorem 4.4.1 by representing $T[B_1, B_2]$ as $(I \circ T)[B_1, B_2]$ and $(T \circ I)[B_1, B_2]$, where $I$ is the identity operator on $V$ (see Figure 4.3). By Theorem 4.4.1, we have
$$T[B_1, B_2] = (I \circ T)[B_1, B_2] = I[B_1, B_2] \, T[B_1, B_1] = (T \circ I)[B_1, B_2] = T[B_2, B_2] \, I[B_1, B_2].$$
[Figure 4.3: Commutative Diagram for Similarity of Matrices. The top row is $(V, B_1) \to (V, B_1)$ with matrix $T[B_1, B_1]$, the bottom row is $(V, B_2) \to (V, B_2)$ with matrix $T[B_2, B_2]$, both vertical maps have matrix $I[B_1, B_2]$, and the diagonal is labelled $T \circ I$ and $I \circ T$.]
Thus, using $I[B_2, B_1] = I[B_1, B_2]^{-1}$, we get $I[B_1, B_2] \, T[B_1, B_1] \, I[B_2, B_1] = T[B_2, B_2]$ and the result follows.
Let $T : V \to V$ be a linear operator on $V$. If $\dim(V) = n$ then each ordered basis $B$ of $V$ gives rise to an $n \times n$ matrix $T[B, B]$. Also, we know that any vector space has infinitely many choices for an ordered basis. So, as we change an ordered basis, the matrix of the linear transformation changes. Theorem 4.5.3 tells us that all these matrices are related by an invertible matrix (see Remark 4.5.2). Thus we are led to the following remark and definition.
Remark 4.5.4 Equation (4.5.2) shows that $T[B_2, B_2] = I[B_1, B_2] \, T[B_1, B_1] \, I[B_2, B_1]$. Hence, the matrix $I[B_1, B_2]$ is called the $B_1 : B_2$ change of basis matrix.
Definition 4.5.5 (Similar Matrices) Two square matrices $B$ and $C$ of the same order are said to be similar if there exists a non-singular matrix $P$ such that $P^{-1} B P = C$ or equivalently $BP = PC$.
Example 4.5.6 1. Let $B_1 = (1 + x, 1 + 2x + x^2, 2 + x)$ and $B_2 = (1, 1 + x, 1 + x + x^2)$ be ordered bases of $\mathcal{P}_2(\mathbb{R})$. Then $I(a + bx + cx^2) = a + bx + cx^2$. Thus,
$$I[B_2, B_1] = \big[\, [1]_{B_1}, [1 + x]_{B_1}, [1 + x + x^2]_{B_1} \,\big] = \begin{bmatrix} -1 & 1 & -2 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$
and
$$I[B_1, B_2] = \big[\, [1 + x]_{B_2}, [1 + 2x + x^2]_{B_2}, [2 + x]_{B_2} \,\big] = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$
Also, verify that $I[B_1, B_2]^{-1} = I[B_2, B_1]$.
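2. Let $B_1 = \big((1, 0, 0), (1, 1, 0), (1, 1, 1)\big)$ and $B_2 = \big((1, 1, -1), (1, 2, 1), (2, 1, 1)\big)$ be two ordered bases of $\mathbb{R}^3$. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(x, y, z) = (x + y, x + y + 2z, y - z)$. Then
$$T[B_1, B_1] = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 1 & 4 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad T[B_2, B_2] = \begin{bmatrix} -4/5 & 1 & 8/5 \\ -2/5 & 2 & 9/5 \\ 8/5 & 0 & -1/5 \end{bmatrix}.$$
Also, check that
$$I[B_2, B_1] = \begin{bmatrix} 0 & -1 & 1 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{bmatrix}$$
and
$$T[B_1, B_1] \, I[B_2, B_1] = I[B_2, B_1] \, T[B_2, B_2] = \begin{bmatrix} 2 & -2 & -2 \\ -2 & 4 & 5 \\ 2 & 1 & 0 \end{bmatrix}.$$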
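The identity claimed in Example 4.5.6.2 is easy to confirm by machine. The sketch below is only an illustrative check in exact arithmetic: the three matrices are entered with the signs that follow from recomputing the coordinates from $T(x, y, z) = (x + y, x + y + 2z, y - z)$ and the two bases, and the helper name `matmul` is not from the text.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

T_B1 = [[0, 0, -2],
        [1, 1,  4],
        [0, 1,  0]]                       # T[B1, B1]
T_B2 = [[F(-4, 5), 1, F(8, 5)],
        [F(-2, 5), 2, F(9, 5)],
        [F(8, 5),  0, F(-1, 5)]]          # T[B2, B2]
A = [[ 0, -1, 1],
     [ 2,  1, 0],
     [-1,  1, 1]]                         # I[B2, B1]

left = matmul(T_B1, A)     # T[B1, B1] I[B2, B1]
right = matmul(A, T_B2)    # I[B2, B1] T[B2, B2]
print(left == right)
print(left)
```

Equality of `left` and `right` is the statement $T[B_1, B_1]\, A = A\, T[B_2, B_2]$, which is Equation (4.5.2) multiplied on the left by $A$.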
Exercise 4.5.7 1. Let $V$ be an $n$-dimensional vector space and let $T : V \to V$ be a linear operator. Suppose $T$ has the property that $T^{n-1} \ne 0$ but $T^n = 0$.
(a) Prove that there exists $u \in V$ with $\{u, T(u), \ldots, T^{n-1}(u)\}$ a basis of $V$.
(b) For $B = \big(u, T(u), \ldots, T^{n-1}(u)\big)$ prove that
$$T[B, B] = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}.$$
(c) Let $A$ be an $n \times n$ matrix satisfying $A^{n-1} \ne 0$ but $A^n = 0$. Then prove that $A$ is similar to the matrix given in Part 1b.
2. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(x, y, z) = (x + y + 2z, x - y - 3z, 2x + 3y + z)$. Let $B$ be the standard basis and $B_1 = \big((1, 1, 1), (1, 1, 1), (1, 1, 2)\big)$ be another ordered basis of $\mathbb{R}^3$. Then find the
(a) matrices $T[B, B]$ and $T[B_1, B_1]$.
(b) matrix $P$ such that $P^{-1} \, T[B, B] \, P = T[B_1, B_1]$.
3. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(x, y, z) = (x, x + y, x + y + z)$. Let $B$ be the standard basis and $B_1 = \big((1, 0, 0), (1, 1, 0), (1, 1, 1)\big)$ be another ordered basis of $\mathbb{R}^3$. Then find the
(a) matrices $T[B, B]$ and $T[B_1, B_1]$.
(b) matrix $P$ such that $P^{-1} \, T[B, B] \, P = T[B_1, B_1]$.
4. Let $B_1 = \big((1, 2, 0), (1, 3, 2), (0, 1, 3)\big)$ and $B_2 = \big((1, 2, 1), (0, 1, 2), (1, 4, 6)\big)$ be two ordered bases of $\mathbb{R}^3$. Find the change of basis matrix
(a) $P$ from $B_1$ to $B_2$.
(b) $Q$ from $B_2$ to $B_1$.
(c) from the standard basis of $\mathbb{R}^3$ to $B_1$. What do you notice?
Is it true that $PQ = I = QP$? Give reasons for your answer.
4.6 Summary
Chapter 5
Inner Product Spaces
5.1 Introduction
In the previous chapters, we learnt about vector spaces and linear transformations that are maps (functions) between vector spaces. In this chapter, we will start with the definition of inner product that helps us to view vector spaces geometrically.
5.2 Definition and Basic Properties
In $\mathbb{R}^2$ and $\mathbb{R}^3$, we had a notion of dot product between two vectors. In particular, if $x^t = (x_1, x_2, x_3)$, $y^t = (y_1, y_2, y_3)$ are two vectors in $\mathbb{R}^3$ then their dot product was defined by
$$x \cdot y = x_1 y_1 + x_2 y_2 + x_3 y_3.$$
Note that for any $x^t, y^t, z^t \in \mathbb{R}^3$ and $\alpha \in \mathbb{R}$, the dot product satisfied the following conditions:
$$x \cdot (y + \alpha z) = x \cdot y + \alpha \, x \cdot z, \quad x \cdot y = y \cdot x, \quad \text{and} \quad x \cdot x \ge 0.$$
Also, $x \cdot x = 0$ if and only if $x = 0$. So, in this chapter, we generalize the idea of dot product to arbitrary vector spaces. This generalization is commonly known as inner product, which is our starting point for this chapter.
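Definition 5.2.1 (Inner Product) Let $V$ be a vector space over $\mathbb{F}$. An inner product over $V$, denoted by $\langle \, , \rangle$, is a map from $V \times V$ to $\mathbb{F}$ satisfying
1. $\langle au + bv, w \rangle = a \langle u, w \rangle + b \langle v, w \rangle$, for all $u, v, w \in V$ and $a, b \in \mathbb{F}$,
2. $\langle u, v \rangle = \overline{\langle v, u \rangle}$, the complex conjugate of $\langle v, u \rangle$, for all $u, v \in V$ and
3. $\langle u, u \rangle \ge 0$ for all $u \in V$ and equality holds if and only if $u = 0$.
Definition 5.2.2 (Inner Product Space) Let $V$ be a vector space with an inner product $\langle \, , \rangle$. Then $(V, \langle \, , \rangle)$ is called an inner product space (in short, ips).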
Example 5.2.3 The first two examples given below are called the standard inner product or the dot product on $\mathbb{R}^n$ and $\mathbb{C}^n$, respectively. From now on, whenever an inner product is not mentioned, it will be assumed to be the standard inner product.
1. Let $V = \mathbb{R}^n$. Define $\langle u, v \rangle = u_1 v_1 + \cdots + u_n v_n = u^t v$ for all $u^t = (u_1, \ldots, u_n)$, $v^t = (v_1, \ldots, v_n) \in V$. Then it can be easily verified that $\langle \, , \rangle$ satisfies all the three conditions of Definition 5.2.1. Hence, $\big(\mathbb{R}^n, \langle \, , \rangle\big)$ is an inner product space.
2. Let $u^t = (u_1, \ldots, u_n)$, $v^t = (v_1, \ldots, v_n)$ be two vectors in $\mathbb{C}^n(\mathbb{C})$. Define $\langle u, v \rangle = u_1 \overline{v_1} + u_2 \overline{v_2} + \cdots + u_n \overline{v_n} = v^* u$. Then it can be easily verified that $\big(\mathbb{C}^n, \langle \, , \rangle\big)$ is an inner product space.
3. Let $V = \mathbb{R}^2$ and let $A = \begin{bmatrix} 4 & -1 \\ -1 & 2 \end{bmatrix}$. Define $\langle x, y \rangle = x^t A y$ for $x^t, y^t \in \mathbb{R}^2$. Then prove that $\langle \, , \rangle$ is an inner product. Hint: $\langle x, y \rangle = 4 x_1 y_1 - x_1 y_2 - x_2 y_1 + 2 x_2 y_2$ and $\langle x, x \rangle = (x_1 - x_2)^2 + 3 x_1^2 + x_2^2$.
4. Prove that $\langle x, y \rangle = 10 x_1 y_1 + 3 x_1 y_2 + 3 x_2 y_1 + 2 x_2 y_2 + x_2 y_3 + x_3 y_2 + x_3 y_3$ defines an inner product in $\mathbb{R}^3$, where $x^t = (x_1, x_2, x_3)$ and $y^t = (y_1, y_2, y_3) \in \mathbb{R}^3$.
5. For $x^t = (x_1, x_2)$, $y^t = (y_1, y_2) \in \mathbb{R}^2$, we define three maps that satisfy two conditions out of the three conditions for an inner product. Determine the condition which is not satisfied.
(a) $\langle x, y \rangle = x_1 y_1$.
(b) $\langle x, y \rangle = x_1^2 + y_1^2 + x_2^2 + y_2^2$.
(c) $\langle x, y \rangle = x_1 y_1^3 + x_2 y_2^3$.
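6. For $A, B \in M_n(\mathbb{R})$, define $\langle A, B \rangle = \mathrm{tr}(A B^t)$. Then
$$\langle A + B, C \rangle = \mathrm{tr}\big((A + B) C^t\big) = \mathrm{tr}(A C^t) + \mathrm{tr}(B C^t) = \langle A, C \rangle + \langle B, C \rangle,$$
$$\langle A, B \rangle = \mathrm{tr}(A B^t) = \mathrm{tr}\big((A B^t)^t\big) = \mathrm{tr}(B A^t) = \langle B, A \rangle.$$
If $A = (a_{ij})$, then
$$\langle A, A \rangle = \mathrm{tr}(A A^t) = \sum_{i=1}^{n} (A A^t)_{ii} = \sum_{i,j=1}^{n} a_{ij} a_{ij} = \sum_{i,j=1}^{n} a_{ij}^2$$
and therefore, $\langle A, A \rangle > 0$ for every non-zero matrix $A$.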
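For maps of the form $\langle x, y \rangle = x^t G y$ with $G$ real and symmetric, as in Examples 3 and 4 above, the first two conditions of Definition 5.2.1 are immediate, so only positivity needs work. The sketch below is one possible check and not the method suggested by the hints: it relies on Sylvester's criterion (a real symmetric matrix is positive definite exactly when all its leading principal minors are positive), a standard fact that the text does not use. The names `det`, `leading_minors`, `G3` and `G4` are hypothetical.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minors(M):
    """Determinants of the top-left k x k blocks, for k = 1, ..., n."""
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

def form(x, G, y):
    """Evaluate x^t G y."""
    return sum(x[i] * G[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

G3 = [[4, -1], [-1, 2]]                      # matrix of Example 3
G4 = [[10, 3, 0], [3, 2, 1], [0, 1, 1]]      # coefficient matrix of Example 4

print(leading_minors(G3), leading_minors(G4))

# Spot check of the hint in Example 3: <x, x> = (x1 - x2)^2 + 3 x1^2 + x2^2.
x = [2, -5]
print(form(x, G3, x) == (x[0] - x[1]) ** 2 + 3 * x[0] ** 2 + x[1] ** 2)
```

All the minors come out positive, which agrees with the claim that both maps are inner products; the exercise of proving this by hand remains.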
Exercise 5.2.4 1. Verify that the inner products defined in Examples 3 and 4 are indeed inner products.
2. Let $\langle x, y \rangle = 0$ for every vector $y$ of an inner product space $V$. Prove that $x = 0$.
Definition 5.2.5 (Length/Norm of a Vector) Let $V$ be an inner product space. Then for any vector $u \in V$, we define the length (norm) of $u$, denoted $\|u\|$, by $\|u\| = \sqrt{\langle u, u \rangle}$, the positive square root. A vector of norm 1 is called a unit vector.
Example 5.2.6 1. Let $V$ be an inner product space and $u \in V$. Then for any scalar $\alpha$, it is easy to verify that $\|\alpha u\| = |\alpha| \cdot \|u\|$.
2. Let $u^t = (1, 1, 2, 3) \in \mathbb{R}^4$. Then $\|u\| = \sqrt{1 + 1 + 4 + 9} = \sqrt{15}$. Thus, $\frac{1}{\sqrt{15}} u$ and $-\frac{1}{\sqrt{15}} u$ are vectors of norm 1 in the vector subspace $L(u)$ of $\mathbb{R}^4$. Or equivalently, $\frac{1}{\sqrt{15}} u$ is a unit vector in the direction of $u$.
Exercise 5.2.7 1. Let $u^t = (1, 1, 2, 3, 7) \in \mathbb{R}^5$. Find all $\alpha \in \mathbb{R}$ such that $\|\alpha u\| = 1$.
2. Let $u^t = (1, 1, 2, 3, 7) \in \mathbb{C}^5$. Find all $\alpha \in \mathbb{C}$ such that $\|\alpha u\| = 1$.
3. Prove that $\|x + y\|^2 + \|x - y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big)$, for all $x, y \in \mathbb{R}^n$. This equality is commonly known as the Parallelogram Law as in a parallelogram the sum of the squares of the lengths of the diagonals equals twice the sum of the squares of the lengths of two adjacent sides.
4. Prove that for any two continuous functions $f(x), g(x) \in C([-1, 1])$, the map $\langle f(x), g(x) \rangle = \int_{-1}^{1} f(x) \, g(x) \, dx$ defines an inner product in $C([-1, 1])$.
5. Fix an ordered basis $B = (u_1, \ldots, u_n)$ of a complex vector space $V$. Prove that $\langle \, , \rangle$ defined by $\langle u, v \rangle = \sum_{i=1}^{n} a_i \overline{b_i}$, whenever $[u]_B = (a_1, \ldots, a_n)^t$ and $[v]_B = (b_1, \ldots, b_n)^t$, is indeed an inner product in $V$.
A very useful and a fundamental inequality concerning the inner product is due to Cauchy and
Schwarz. The next theorem gives the statement and a proof of this inequality.
Theorem 5.2.8 (Cauchy-Schwarz inequality) Let $V(\mathbb{F})$ be an inner product space. Then for any $u, v \in V$
$$|\langle u, v \rangle| \le \|u\| \, \|v\|. \qquad (5.2.1)$$
Equality holds in Equation (5.2.1) if and only if the vectors $u$ and $v$ are linearly dependent. Furthermore, in that case, if $u \ne 0$ then $v = \Big\langle v, \dfrac{u}{\|u\|} \Big\rangle \dfrac{u}{\|u\|}$.
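Proof. If $u = 0$, then the inequality (5.2.1) holds trivially. Hence, let $u \ne 0$. Also, by the third property of inner product, $\langle \lambda u + v, \lambda u + v \rangle \ge 0$ for all $\lambda \in \mathbb{F}$. In particular, for $\lambda = -\dfrac{\langle v, u \rangle}{\|u\|^2}$,
$$0 \le \langle \lambda u + v, \lambda u + v \rangle = \lambda \overline{\lambda} \|u\|^2 + \lambda \langle u, v \rangle + \overline{\lambda} \langle v, u \rangle + \|v\|^2$$
$$= \frac{\langle v, u \rangle}{\|u\|^2} \, \frac{\overline{\langle v, u \rangle}}{\|u\|^2} \, \|u\|^2 - \frac{\langle v, u \rangle}{\|u\|^2} \langle u, v \rangle - \frac{\overline{\langle v, u \rangle}}{\|u\|^2} \langle v, u \rangle + \|v\|^2 = \|v\|^2 - \frac{|\langle v, u \rangle|^2}{\|u\|^2}.$$
Or, in other words, $|\langle v, u \rangle|^2 \le \|u\|^2 \|v\|^2$ and the proof of the inequality is over.
If $u \ne 0$ then $\langle \lambda u + v, \lambda u + v \rangle = 0$ if and only if $\lambda u + v = 0$. Hence, equality holds in (5.2.1) if and only if $\lambda u + v = 0$ for $\lambda = -\dfrac{\langle v, u \rangle}{\|u\|^2}$. That is, $u$ and $v$ are linearly dependent and in this case $v = \Big\langle v, \dfrac{u}{\|u\|} \Big\rangle \dfrac{u}{\|u\|}$.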

Let $V$ be a real inner product space. Then for every pair of non-zero vectors $u, v \in V$, the Cauchy-Schwarz inequality (see (5.2.1)) implies that $-1 \le \dfrac{\langle u, v \rangle}{\|u\| \, \|v\|} \le 1$. Also, we know that $\cos : [0, \pi] \to [-1, 1]$ is a one-one and onto function. We use this idea to relate the inner product with the angle between two vectors in an inner product space $V$.
Definition 5.2.9 (Angle between two vectors) Let $V$ be a real inner product space and let $u, v \in V$ be non-zero vectors. Suppose $\theta$ is the angle between $u$ and $v$. We define
$$\cos \theta = \frac{\langle u, v \rangle}{\|u\| \, \|v\|}.$$
1. The real number $\theta$ with $0 \le \theta \le \pi$ and satisfying $\cos \theta = \dfrac{\langle u, v \rangle}{\|u\| \, \|v\|}$ is called the angle between the two vectors $u$ and $v$ in $V$.
2. The vectors $u$ and $v$ in $V$ are said to be orthogonal if $\langle u, v \rangle = 0$. Orthogonality corresponds to perpendicularity.
3. A set of vectors $\{u_1, u_2, \ldots, u_n\}$ is called mutually orthogonal if $\langle u_i, u_j \rangle = 0$ for all $1 \le i \ne j \le n$.
[Figure 5.2: Triangle with vertices A, B and C; the sides opposite to A, B and C have lengths a, b and c, respectively.]
Before proceeding further with one more definition, recall that if $A$, $B$ and $C$ are the vertices of a triangle (see Figure 5.2) then $\cos(A) = \dfrac{b^2 + c^2 - a^2}{2bc}$. We prove this as our next result.
Lemma 5.2.10 Let $A$, $B$ and $C$ be the vertices of a triangle in a real inner product space $V$, with opposite sides of lengths $a$, $b$ and $c$. Then
$$\cos(A) = \frac{b^2 + c^2 - a^2}{2bc}.$$
Proof. Let the coordinates of the vertices $A$, $B$ and $C$ be $0$, $u$ and $v$, respectively. Then $\vec{AB} = u$, $\vec{AC} = v$ and $\vec{BC} = v - u$. Thus, we need to prove that
$$\cos(A) = \frac{\|v\|^2 + \|u\|^2 - \|v - u\|^2}{2 \|v\| \, \|u\|}.$$
Now, using the properties of an inner product and Definition 5.2.9, it follows that
$$\|v\|^2 + \|u\|^2 - \|v - u\|^2 = 2 \langle u, v \rangle = 2 \, \|v\| \, \|u\| \cos(A).$$
Thus, the required result follows.
Definition 5.2.11 (Orthogonal Complement) Let $W$ be a subset of a vector space $V$ with inner product $\langle \, , \rangle$. Then the orthogonal complement of $W$ in $V$, denoted $W^{\perp}$, is defined by
$$W^{\perp} = \{v \in V : \langle v, w \rangle = 0 \text{ for all } w \in W\}.$$
Exercise 5.2.12 Let $W$ be a subset of an inner product space $V$. Then prove that $W^{\perp}$ is a subspace of $V$.
Example 5.2.13 1. Let $\mathbb{R}^4$ be endowed with the standard inner product. Fix two vectors $u^t = (1, 1, 1, 1)$, $v^t = (1, 1, -1, 0) \in \mathbb{R}^4$. Determine two vectors $z$ and $w$ such that $u = z + w$, $z$ is parallel to $v$ and $w$ is orthogonal to $v$.
Solution: Let $z^t = k v^t = (k, k, -k, 0)$ for some $k \in \mathbb{R}$ and let $w^t = (a, b, c, d)$. As $w$ is orthogonal to $v$, $\langle w, v \rangle = 0$ and hence $a + b - c = 0$. Thus, $c = a + b$ and
$$(1, 1, 1, 1) = u^t = z^t + w^t = (k, k, -k, 0) + (a, b, a + b, d).$$
Comparing the corresponding coordinates, we get
$$d = 1, \quad a + k = 1, \quad b + k = 1 \quad \text{and} \quad a + b - k = 1.$$
Solving for $a$, $b$ and $k$ gives $a = b = \frac{2}{3}$ and $k = \frac{1}{3}$. Thus, $z^t = \frac{1}{3}(1, 1, -1, 0)$ and $w^t = \frac{1}{3}(2, 2, 4, 3)$.
2. Let $\mathbb{R}^3$ be endowed with the standard inner product and let $P = (1, 1, 1)$, $Q = (2, 1, 3)$ and $R = (-1, 1, 2)$ be three vertices of a triangle in $\mathbb{R}^3$. Compute the angle between the sides $PQ$ and $PR$.
Solution: Method 1: The sides are represented by the vectors
$$\vec{PQ} = (2, 1, 3) - (1, 1, 1) = (1, 0, 2), \quad \vec{PR} = (-2, 0, 1) \quad \text{and} \quad \vec{RQ} = (3, 0, 1).$$
As $\langle \vec{PQ}, \vec{PR} \rangle = 0$, the angle between the sides $PQ$ and $PR$ is $\frac{\pi}{2}$.
Method 2: $\|PQ\| = \sqrt{5}$, $\|PR\| = \sqrt{5}$ and $\|QR\| = \sqrt{10}$. As
$$\|QR\|^2 = \|PQ\|^2 + \|PR\|^2,$$
by Pythagoras theorem, the angle between the sides $PQ$ and $PR$ is $\frac{\pi}{2}$.
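Both methods of Example 5.2.13.2 can be reproduced in a few lines. The sketch below assumes $R = (-1, 1, 2)$, the reading that is consistent with the side vectors and lengths quoted in the solution; the function names `sub`, `dot` and `angle` are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def angle(a, b):
    """Angle in [0, pi] between two non-zero vectors of R^n (Definition 5.2.9)."""
    c = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding error

P, Q, R = (1, 1, 1), (2, 1, 3), (-1, 1, 2)
PQ, PR, QR = sub(Q, P), sub(R, P), sub(R, Q)

# Method 1: the inner product of the two sides vanishes.
print(PQ, PR, angle(PQ, PR))
# Method 2: the squared lengths satisfy the Pythagoras relation.
print(dot(QR, QR) == dot(PQ, PQ) + dot(PR, PR))
```

The clamp inside `angle` is a design choice for floating point: the Cauchy-Schwarz inequality guarantees that the exact ratio lies in $[-1, 1]$, but rounding can push the computed value slightly outside that interval.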
We end this section by stating and proving the fundamental theorem of linear algebra. To do this, recall that for a matrix $A \in M_n(\mathbb{C})$, $A^*$ denotes the conjugate transpose of $A$, $\mathcal{N}(A) = \{v \in \mathbb{C}^n : Av = 0\}$ denotes the null space of $A$ and $\mathcal{R}(A) = \{Av : v \in \mathbb{C}^n\}$ denotes the range space of $A$. The readers are also advised to go through Theorem 3.3.25 (the rank-nullity theorem for matrices) before proceeding further as the first part is stated and proved there.
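Theorem 5.2.14 (Fundamental Theorem of Linear Algebra) Let $A$ be an $n \times n$ matrix with complex entries and let $\mathcal{N}(A)$ and $\mathcal{R}(A)$ be defined as above. Then
1. $\dim(\mathcal{N}(A)) + \dim(\mathcal{R}(A)) = n$.
2. $\mathcal{N}(A) = \big(\mathcal{R}(A^*)\big)^{\perp}$ and $\mathcal{N}(A^*) = \big(\mathcal{R}(A)\big)^{\perp}$.
3. $\dim(\mathcal{R}(A)) = \dim(\mathcal{R}(A^*))$.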
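Proof. Part 1: Proved in Theorem 3.3.25.
Part 2: We first prove that $\mathcal{N}(A) \subseteq \mathcal{R}(A^*)^{\perp}$. Let $x \in \mathcal{N}(A)$. Then $Ax = 0$ and
$$0 = \langle Ax, u \rangle = u^* A x = (A^* u)^* x = \langle x, A^* u \rangle$$
for all $u \in \mathbb{C}^n$. Thus, $x \in \mathcal{R}(A^*)^{\perp}$ and hence $\mathcal{N}(A) \subseteq \mathcal{R}(A^*)^{\perp}$.
We now prove that $\mathcal{R}(A^*)^{\perp} \subseteq \mathcal{N}(A)$. Let $x \in \mathcal{R}(A^*)^{\perp}$. Then for every $y \in \mathbb{C}^n$,
$$0 = \langle x, A^* y \rangle = (A^* y)^* x = y^* (A^*)^* x = y^* A x = \langle Ax, y \rangle.$$
In particular, for $y = Ax$, we get $\|Ax\|^2 = 0$ and hence $Ax = 0$. That is, $x \in \mathcal{N}(A)$. Thus, the proof of the first equality in Part 2 is over. We omit the second equality as it proceeds on the same lines as above.
Part 3: Use the first two parts to get the result: $\dim(\mathcal{R}(A^*)) = n - \dim\big(\mathcal{R}(A^*)^{\perp}\big) = n - \dim(\mathcal{N}(A)) = \dim(\mathcal{R}(A))$.
Hence the proof of the fundamental theorem is complete.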
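The three parts of the theorem can be observed on a concrete matrix. The following sketch is restricted to a real matrix with rational entries, so that $A^* = A^t$ and exact arithmetic is available; the matrix `A` is a hypothetical example of rank 2, and `rref`, `rank` and `null_space` are illustrative helpers, not routines from the text.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form over the rationals; returns (rows, pivot columns)."""
    A = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(A[0])):
        if r == len(A):
            break
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        piv = A[r][c]
        A[r] = [x / piv for x in A[r]]
        for i in range(len(A)):
            if i != r:
                f = A[i][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

def rank(M):
    return len(rref(M)[1])

def null_space(M):
    """Basis of {x : Mx = 0}, one vector per free column."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for free in [c for c in range(n) if c not in pivots]:
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row, p in zip(R, pivots):
            x[p] = -row[free]
        basis.append(x)
    return basis

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]      # hypothetical real matrix of rank 2
At = [list(col) for col in zip(*A)]        # A* = A^t for real entries
N = null_space(A)

print(rank(A) + len(N) == 3)                           # Part 1
# Part 2: the rows of A are the columns of A^t, which span R(A^t).
print(all(dot(x, row) == 0 for x in N for row in A))
print(rank(A) == rank(At))                             # Part 3
```

The second check only shows $\mathcal{N}(A) \subseteq \mathcal{R}(A^t)^{\perp}$; equality then follows by comparing dimensions, which is the same counting argument as in Part 3.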
For more information related to the fundamental theorem of linear algebra the interested readers are advised to see the article "The Fundamental Theorem of Linear Algebra", Gilbert Strang, The American Mathematical Monthly, Vol. 100, No. 9, Nov. 1993, pp. 848-855.
Exercise 5.2.15 1. Answer the following questions when $\mathbb{R}^3$ is endowed with the standard inner product.
(a) Let $u^t = (1, 1, 1)$. Find vectors $v, w \in \mathbb{R}^3$ that are orthogonal to $u$ and to each other.
(b) Find the equation of the line that passes through the point (1, 1, 1) and is parallel to the vector $(a, b, c) \ne (0, 0, 0)$.
(c) Find the equation of the plane that contains the point (1, 1 1) and has the vector $(a, b, c) \ne (0, 0, 0)$ as a normal vector.
(d) Find the area of the parallelogram with vertices (0, 0, 0), (1, 2, 2), (2, 3, 0) and (3, 5, 2).
(e) Find the equation of the plane that contains the point (2, 2, 1) and is perpendicular to the line with parametric equations $x = t - 1$, $y = 3t + 2$, $z = t + 1$.
(f) Let P = (3, 0, 2), Q = (1, 2, 1) and R = (2, 1, 1) be three points in $\mathbb{R}^3$.
i. Find the area of the triangle with vertices P, Q and R.
ii. Find the area of the parallelogram built on vectors $\vec{PQ}$ and $\vec{QR}$.
iii. Find a nonzero vector orthogonal to the triangle with vertices P, Q and R.
iv. Find all vectors $x$ orthogonal to $\vec{PQ}$ and $\vec{QR}$ with $\|x\| = \sqrt{2}$.
v. Choose one of the vectors $x$ found in part 1(f)iv. Find the volume of the parallelepiped built on vectors $\vec{PQ}$, $\vec{QR}$ and $x$. Do you think the volume would be different if you choose the other vector $x$?
(g) Find the equation of the plane that contains the lines (x, y, z) = (1, 2, 2) + t(1, 1, 0) and (x, y, z) = (1, 2, 2) + t(0, 1, 2).
(h) Let $u^t = (1, 1, 1)$ and $v^t = (1, k, 1)$. Find $k$ such that the angle between $u$ and $v$ is $\pi/3$.
(i) Let $p_1$ be a plane that passes through the point A = (1, 2, 3) and has n = (2, 1, 1) as its normal vector. Then
i. find the equation of the plane $p_2$ which is parallel to $p_1$ and passes through the point (1, 2, 3).
ii. calculate the distance between the planes $p_1$ and $p_2$.
(j) In the parallelogram ABCD, $AB \parallel DC$ and $AD \parallel BC$ and A = (2, 1, 3), B = (1, 2, 2), C = (3, 1, 5). Find
i. the coordinates of the point D,
ii. the cosine of the angle BCD,
iii. the area of the triangle ABC,
iv. the volume of the parallelepiped determined by the vectors AB, AD and the vector (0, 0, 7).
(k) Find the equation of a plane that contains the point (1, 1, 2) and is orthogonal to the line with parametric equation $x = 2 + t$, $y = 3$ and $z = 1 - t$.
(l) Find a parametric equation of a line that passes through the point (1, 2, 1) and is orthogonal to the plane x + 3y + 2z = 1.
2. Let $\{e_1^t, e_2^t, \ldots, e_n^t\}$ be the standard basis of $\mathbb{R}^n$. Then prove that with respect to the standard inner product on $\mathbb{R}^n$, the vectors $e_i$ satisfy the following:
(a) $\|e_i\| = 1$ for $1 \le i \le n$.
(b) $\langle e_i, e_j \rangle = 0$ for $1 \le i \ne j \le n$.
3. Let $x^t = (x_1, x_2)$, $y^t = (y_1, y_2) \in \mathbb{R}^2$. Then $\langle x, y \rangle = 4 x_1 y_1 - x_1 y_2 - x_2 y_1 + 2 x_2 y_2$ defines an inner product. Use this inner product to find
(a) the angle between $e_1^t = (1, 0)$ and $e_2^t = (0, 1)$.
(b) $v \in \mathbb{R}^2$ such that $\langle v, (1, 0)^t \rangle = 0$.
(c) vectors $x^t, y^t \in \mathbb{R}^2$ such that $\|x\| = \|y\| = 1$ and $\langle x, y \rangle = 0$.
4. Does there exist an inner product in $\mathbb{R}^2$ such that
$$\|(1, 2)\| = \|(2, 1)\| = 1 \quad \text{and} \quad \langle (1, 2), (2, 1) \rangle = 0?$$
[Hint: Consider a symmetric matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$. Define $\langle x, y \rangle = y^t A x$. Use the given conditions to get a linear system of 3 equations in the unknowns $a, b, c$. Solve this system.]
5. Let $W = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 0\}$. Find a basis of $W^{\perp}$.
6. Let $W$ be a subspace of a finite dimensional inner product space $V$. Prove that $(W^{\perp})^{\perp} = W$.
7. Let $x^t = (x_1, x_2, x_3)$, $y^t = (y_1, y_2, y_3) \in \mathbb{R}^3$. Show that
$$\langle x, y \rangle = 10 x_1 y_1 + 3 x_1 y_2 + 3 x_2 y_1 + 4 x_2 y_2 + x_2 y_3 + x_3 y_2 + 3 x_3 y_3$$
is an inner product in $\mathbb{R}^3(\mathbb{R})$. With respect to this inner product, find the angle between the vectors (1, 1, 1) and (2, 5, 2).
8. Recall the inner product space $M_{n \times n}(\mathbb{R})$ (see Example 5.2.3.6). Determine $W^{\perp}$ for the subspace $W = \{A \in M_{n \times n}(\mathbb{R}) : A^t = A\}$.
9. Prove that $\langle A, B \rangle = \mathrm{tr}(A B^*)$ defines an inner product in $M_n(\mathbb{C})$. Determine $W^{\perp}$ for $W = \{A \in M_n(\mathbb{C}) : A^* = A\}$.
10. Prove that $\langle f(x), g(x) \rangle = \int_{-\pi}^{\pi} f(x) \, g(x) \, dx$ defines an inner product in $C[-\pi, \pi]$. Define $\mathbf{1}(x) = 1$ for all $x \in [-\pi, \pi]$. Prove that
$$S = \{\mathbf{1}\} \cup \{\cos(mx) : m \ge 1\} \cup \{\sin(nx) : n \ge 1\}$$
is a linearly independent subset of $C[-\pi, \pi]$.
11. Let $V$ be an inner product space. Prove the triangle inequality
$$\|u + v\| \le \|u\| + \|v\| \quad \text{for every } u, v \in V.$$
12. Let $z_1, z_2, \ldots, z_n \in \mathbb{C}$. Use the Cauchy-Schwarz inequality to prove that
$$|z_1 + z_2 + \cdots + z_n| \le \sqrt{n \big(|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2\big)}.$$
When does the equality hold?
13. Let $x, y \in \mathbb{R}^n$. Prove the following:
(a) $\langle x, y \rangle = 0 \iff \|x - y\|^2 = \|x\|^2 + \|y\|^2$ (Pythagoras Theorem).
(b) $\|x\| = \|y\| \iff \langle x + y, x - y \rangle = 0$ ($x$ and $y$ form adjacent sides of a rhombus as the diagonals $x + y$ and $x - y$ are orthogonal).
(c) $4 \langle x, y \rangle = \|x + y\|^2 - \|x - y\|^2$ (polarization identity).
Are the above results true if $x, y \in \mathbb{C}^n(\mathbb{C})$?
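14. Let $x, y \in \mathbb{C}^n(\mathbb{C})$. Prove that
(a) $4 \langle x, y \rangle = \|x + y\|^2 - \|x - y\|^2 + i \|x + iy\|^2 - i \|x - iy\|^2$.
(b) If $x \ne 0$ then $\|x + ix\|^2 = \|x\|^2 + \|ix\|^2$, even though $\langle x, ix \rangle \ne 0$.
(c) $\langle x, y \rangle = 0$ whenever $\|x + y\|^2 = \|x\|^2 + \|y\|^2$ and $\|x + iy\|^2 = \|x\|^2 + \|iy\|^2$.
15. Let $\langle \, , \rangle$ denote the standard inner product on $\mathbb{C}^n(\mathbb{C})$ and let $A \in M_n(\mathbb{C})$. That is, $\langle x, y \rangle = y^* x$ for all $x^t, y^t \in \mathbb{C}^n$. Prove that $\langle Ax, y \rangle = \langle x, A^* y \rangle$ for all $x, y \in \mathbb{C}^n$.
16. Let $(V, \langle \, , \rangle)$ be an $n$-dimensional inner product space and let $u \in V$ be a fixed vector with $\|u\| = 1$. Then give reasons for the following statements.
(a) Let $S^{\perp} = \{v \in V : \langle v, u \rangle = 0\}$. Then $\dim(S^{\perp}) = n - 1$.
(b) Let $0 \ne \alpha \in \mathbb{F}$. Then $S = \{v \in V : \langle v, u \rangle = \alpha\}$ is not a subspace of $V$.
(c) Let $v \in V$. Then $v = v_0 + \alpha u$ for a vector $v_0 \in S^{\perp}$ and a scalar $\alpha$. That is, $V = L(u, S^{\perp})$.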
5.2.1 Basic Results on Orthogonal Vectors
We start this subsection with the definition of an orthonormal set. Then a theorem is proved that implies that the coordinates of a vector with respect to an orthonormal basis are just the inner products with the basis vectors.
Definition 5.2.16 (Orthonormal Set) Let $S = \{v_1, v_2, \ldots, v_n\}$ be a set of non-zero, mutually orthogonal vectors in an inner product space $V$. Then $S$ is called an orthonormal set if $\|v_i\| = 1$ for $1 \le i \le n$. If $S$ is also a basis of $V$ then $S$ is called an orthonormal basis of $V$.
Example 5.2.17 1. Consider $\mathbb{R}^2$ with the standard inner product. Then a few orthonormal sets in $\mathbb{R}^2$ are
$$\big\{(1, 0), (0, 1)\big\}, \quad \Big\{\tfrac{1}{\sqrt{2}}(1, 1), \tfrac{1}{\sqrt{2}}(1, -1)\Big\} \quad \text{and} \quad \Big\{\tfrac{1}{\sqrt{5}}(2, 1), \tfrac{1}{\sqrt{5}}(1, -2)\Big\}.$$
2. Let $\mathbb{R}^n$ be endowed with the standard inner product. Then by Exercise 5.2.15.2, the standard ordered basis $(e_1^t, e_2^t, \ldots, e_n^t)$ is an orthonormal set.
Theorem 5.2.18 Let $V$ be an inner product space and let $\{u_1, u_2, \ldots, u_n\}$ be a set of non-zero, mutually orthogonal vectors of $V$.
1. Then the set $\{u_1, u_2, \ldots, u_n\}$ is linearly independent.
2. Let $v = \sum_{i=1}^{n} \alpha_i u_i \in V$. Then $\|v\|^2 = \Big\| \sum_{i=1}^{n} \alpha_i u_i \Big\|^2 = \sum_{i=1}^{n} |\alpha_i|^2 \|u_i\|^2$.
3. Let $v = \sum_{i=1}^{n} \alpha_i u_i$. If $\|u_i\| = 1$ for $1 \le i \le n$ then $\alpha_i = \langle v, u_i \rangle$ for $1 \le i \le n$. That is,
$$v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i \quad \text{and} \quad \|v\|^2 = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2.$$
4. Let $\dim(V) = n$. Then $\langle v, u_i \rangle = 0$ for all $i = 1, 2, \ldots, n$ if and only if $v = 0$.
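Proof. Consider the linear system
$$c_1 u_1 + c_2 u_2 + \cdots + c_n u_n = 0 \qquad (5.2.2)$$
in the unknowns $c_1, c_2, \ldots, c_n$. As $\langle 0, u \rangle = 0$ for each $u \in V$ and $\langle u_j, u_i \rangle = 0$ for all $j \ne i$, we have
$$0 = \langle 0, u_i \rangle = \langle c_1 u_1 + c_2 u_2 + \cdots + c_n u_n, u_i \rangle = \sum_{j=1}^{n} c_j \langle u_j, u_i \rangle = c_i \langle u_i, u_i \rangle.$$
As $u_i \ne 0$, $\langle u_i, u_i \rangle \ne 0$ and therefore $c_i = 0$ for $1 \le i \le n$. Thus, the linear system (5.2.2) has only the trivial solution. Hence, the proof of Part 1 is complete.
For Part 2, we use a similar argument to get
$$\Big\| \sum_{i=1}^{n} \alpha_i u_i \Big\|^2 = \Big\langle \sum_{i=1}^{n} \alpha_i u_i, \sum_{i=1}^{n} \alpha_i u_i \Big\rangle = \sum_{i=1}^{n} \alpha_i \Big\langle u_i, \sum_{j=1}^{n} \alpha_j u_j \Big\rangle = \sum_{i=1}^{n} \alpha_i \sum_{j=1}^{n} \overline{\alpha_j} \langle u_i, u_j \rangle = \sum_{i=1}^{n} \alpha_i \overline{\alpha_i} \langle u_i, u_i \rangle = \sum_{i=1}^{n} |\alpha_i|^2 \|u_i\|^2.$$
Note that $\langle v, u_i \rangle = \Big\langle \sum_{j=1}^{n} \alpha_j u_j, u_i \Big\rangle = \sum_{j=1}^{n} \alpha_j \langle u_j, u_i \rangle = \alpha_i$, as $\|u_i\| = 1$. Thus, the proof of Part 3 is complete.
Part 4 directly follows using Part 3 as the set $\{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Therefore, we have obtained the required result.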
In view of Theorem 5.2.18, we inquire into the question of extracting an orthonormal basis from a given basis. In the next section, we describe a process (called the Gram-Schmidt Orthogonalization process) that generates an orthonormal set from a given set containing finitely many vectors.
Remark 5.2.19 The last two parts of Theorem 5.2.18 can be rephrased as follows:
Let $B = (v_1, \ldots, v_n)$ be an ordered orthonormal basis of an inner product space $V$ and let $u \in V$. Then
$$[u]_B = \big(\langle u, v_1 \rangle, \langle u, v_2 \rangle, \ldots, \langle u, v_n \rangle\big)^t.$$
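Remark 5.2.19 says that, for an orthonormal basis, coordinates cost nothing more than inner products. A minimal sketch in $\mathbb{R}^2$ follows; it uses the orthonormal set $\big\{\tfrac{1}{\sqrt{5}}(2, 1), \tfrac{1}{\sqrt{5}}(1, -2)\big\}$ (signs chosen so that the two vectors are orthogonal) and a hypothetical vector `u`, and works in floating point.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(5)
# An ordered orthonormal basis of R^2.
B = [(2 * s, 1 * s), (1 * s, -2 * s)]

u = (3.0, 4.0)                         # hypothetical vector
coords = [dot(u, b) for b in B]        # [u]_B = (<u, v_1>, <u, v_2>)^t

# Rebuild u from its coordinates and compare the two expressions for ||u||^2
# given in Part 3 of Theorem 5.2.18.
rebuilt = [sum(c * b[k] for c, b in zip(coords, B)) for k in range(2)]
print(coords, rebuilt)
print(dot(u, u), sum(c * c for c in coords))
```

No linear system has to be solved here, which is the practical advantage of an orthonormal basis over an arbitrary one.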
Exercise 5.2.20 1. Let $B = \Big(\tfrac{1}{\sqrt{2}}(1, 1), \tfrac{1}{\sqrt{2}}(1, -1)\Big)$ be an ordered basis of $\mathbb{R}^2$. Determine $[(2, 3)]_B$. Also, compute $[(x, y)]_B$.
2. Let $B = \Big(\tfrac{1}{\sqrt{3}}(1, 1, 1), \tfrac{1}{\sqrt{2}}(1, -1, 0), \tfrac{1}{\sqrt{6}}(1, 1, -2)\Big)$ be an ordered basis of $\mathbb{R}^3$. Determine $[(2, 3, 1)]_B$. Also, compute $[(x, y, z)]_B$.
3. Let $u^t = (u_1, u_2, u_3)$, $v^t = (v_1, v_2, v_3)$ be two vectors in $\mathbb{R}^3$. Then recall that their cross product, denoted $u \times v$, equals
$$u^t \times v^t = (u_2 v_3 - u_3 v_2, \; u_3 v_1 - u_1 v_3, \; u_1 v_2 - u_2 v_1).$$
Use this to find an orthonormal basis of $\mathbb{R}^3$ containing the vector $\tfrac{1}{\sqrt{6}}(1, 2, 1)$.
4. Let $u^t = (1, 1, 2)$. Find vectors $v^t, w^t \in \mathbb{R}^3$ such that $v$ and $w$ are orthogonal to $u$ and to each other as well.
5. Let $A$ be an $n \times n$ orthogonal matrix. Prove that the rows/columns of $A$ form an orthonormal basis of $\mathbb{R}^n$.
6. Let $A$ be an $n \times n$ unitary matrix. Prove that the rows/columns of $A$ form an orthonormal basis of $\mathbb{C}^n$.
7. Let $\{u_1^t, u_2^t, \ldots, u_n^t\}$ be an orthonormal basis of $\mathbb{R}^n$. Prove that the $n \times n$ matrix $A = [u_1, u_2, \ldots, u_n]$ is an orthogonal matrix.
5.3 Gram-Schmidt Orthogonalization Process
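Suppose we are given two non-zero vectors $u$ and $v$ in a plane. Then in many instances, we need to decompose the vector $v$ into two components, say $y$ and $z$, such that $y$ is a vector parallel to $u$ and $z$ is a vector perpendicular (orthogonal) to $u$. We do this as follows (see Figure 5.3):
Let $\hat{u} = \dfrac{u}{\|u\|}$. Then $\hat{u}$ is a unit vector in the direction of $u$. Also, using trigonometry, we know that $\cos(\theta) = \dfrac{\|\vec{OQ}\|}{\|\vec{OP}\|}$ and hence $\|\vec{OQ}\| = \|\vec{OP}\| \cos(\theta)$. Or using Definition 5.2.9,
$$\|\vec{OQ}\| = \|v\| \, \frac{\langle v, u \rangle}{\|v\| \, \|u\|} = \frac{\langle v, u \rangle}{\|u\|},$$
where we need to take the absolute value of the right hand side expression as the length of a vector is always a non-negative quantity. Thus, we get
$$\vec{OQ} = \|\vec{OQ}\| \, \hat{u} = \Big\langle v, \frac{u}{\|u\|} \Big\rangle \frac{u}{\|u\|}.$$
Thus, we see that $y = \vec{OQ} = \Big\langle v, \dfrac{u}{\|u\|} \Big\rangle \dfrac{u}{\|u\|}$ and $z = v - \Big\langle v, \dfrac{u}{\|u\|} \Big\rangle \dfrac{u}{\|u\|}$. It is easy to verify that $v = y + z$, $y$ is parallel to $u$ and $z$ is orthogonal to $u$. In the literature, the vector $y = \vec{OQ}$ is often called the orthogonal projection of the vector $v$ on $u$ and is denoted by $\mathrm{Proj}_u(v)$. Thus,
$$\mathrm{Proj}_u(v) = \Big\langle v, \frac{u}{\|u\|} \Big\rangle \frac{u}{\|u\|} \quad \text{and} \quad \|\mathrm{Proj}_u(v)\| = \|\vec{OQ}\| = \frac{|\langle v, u \rangle|}{\|u\|}. \qquad (5.3.1)$$
[Figure 5.3: Decomposition of vector v. With O the origin, $\vec{OP} = v$, $\vec{OQ} = \dfrac{\langle v, u \rangle}{\|u\|^2} u$ lies along $u$, $\vec{OR} = v - \dfrac{\langle v, u \rangle}{\|u\|^2} u$ is orthogonal to $u$, and $\theta$ is the angle between $u$ and $v$.]
Also, note that $\hat{u}$ is a unit vector in the direction of $u$ and $\hat{z} = \dfrac{z}{\|z\|}$ is a unit vector orthogonal to $u$. This idea is generalized to study the Gram-Schmidt Orthogonalization process which is given as the next result. Before stating this result, we look at the following example to understand the process.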
Example 5.3.1 1. In Example 5.2.13.1, we note that $\mathrm{Proj}_v(u) = (u \cdot v) \dfrac{v}{\|v\|^2}$ is parallel to $v$ and $u - \mathrm{Proj}_v(u)$ is orthogonal to $v$. Thus,
$$z = \mathrm{Proj}_v(u) = \tfrac{1}{3}(1, 1, -1, 0)^t \quad \text{and} \quad w = (1, 1, 1, 1)^t - z = \tfrac{1}{3}(2, 2, 4, 3)^t.$$
2. Let $u^t = (1, 1, 1, 1)$, $v^t = (1, 1, -1, 0)$ and $w^t = (1, 1, 0, -1)$ be three vectors in $\mathbb{R}^4$. Write $v = v_1 + v_2$ where $v_1$ is parallel to $u$ and $v_2$ is orthogonal to $u$. Also, write $w = w_1 + w_2 + w_3$ such that $w_1$ is parallel to $u$, $w_2$ is parallel to $v_2$ and $w_3$ is orthogonal to both $u$ and $v_2$.
Solution: Note that
(a) $v_1 = \mathrm{Proj}_u(v) = \langle v, u \rangle \dfrac{u}{\|u\|^2} = \tfrac{1}{4} u = \tfrac{1}{4}(1, 1, 1, 1)^t$ is parallel to $u$ and
(b) $v_2 = v - \tfrac{1}{4} u = \tfrac{1}{4}(3, 3, -5, -1)^t$ is orthogonal to $u$.
Note that $\mathrm{Proj}_u(w)$ is parallel to $u$ and $\mathrm{Proj}_{v_2}(w)$ is parallel to $v_2$. Hence, we have
(a) $w_1 = \mathrm{Proj}_u(w) = \langle w, u \rangle \dfrac{u}{\|u\|^2} = \tfrac{1}{4} u = \tfrac{1}{4}(1, 1, 1, 1)^t$ is parallel to $u$,
(b) $w_2 = \mathrm{Proj}_{v_2}(w) = \langle w, v_2 \rangle \dfrac{v_2}{\|v_2\|^2} = \tfrac{7}{44}(3, 3, -5, -1)^t$ is parallel to $v_2$ and
(c) $w_3 = w - w_1 - w_2 = \tfrac{3}{11}(1, 1, 2, -4)^t$ is orthogonal to both $u$ and $v_2$.
That is, from the given vector subtract all the orthogonal components that are obtained as orthogonal projections. If this new vector is non-zero then this vector is orthogonal to the previous ones.
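The arithmetic of Example 5.3.1.2 can be replayed exactly. The sketch below assumes the vectors $u^t = (1, 1, 1, 1)$, $v^t = (1, 1, -1, 0)$ and $w^t = (1, 1, 0, -1)$, the reading consistent with the projections quoted in the solution, and uses rational arithmetic; the helper names `dot`, `proj` and `sub` are hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj(v, u):
    """Orthogonal projection of v on a non-zero vector u: (<v, u> / <u, u>) u."""
    c = Fraction(dot(v, u)) / Fraction(dot(u, u))
    return [c * x for x in u]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

u = [1, 1, 1, 1]
v = [1, 1, -1, 0]
w = [1, 1, 0, -1]

v1 = proj(v, u)                 # component of v along u
v2 = sub(v, v1)                 # component of v orthogonal to u
w1 = proj(w, u)
w2 = proj(w, v2)
w3 = sub(sub(w, w1), w2)        # what is left is orthogonal to u and to v2

print(v2, w3)
print(dot(v2, u), dot(w3, u), dot(w3, v2))
```

Note that `proj` divides by $\langle u, u \rangle$ instead of normalizing $u$ first; this is the same vector as in Equation (5.3.1) and avoids square roots, which keeps the computation exact.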
Theorem 5.3.2 (Gram-Schmidt Orthogonalization Process) Let $V$ be an inner product space. Suppose $\{u_1, u_2, \ldots, u_n\}$ is a set of linearly independent vectors in $V$. Then there exists a set $\{v_1, v_2, \ldots, v_n\}$ of vectors in $V$ satisfying the following:
1. $\|v_i\| = 1$ for $1 \le i \le n$,
2. $\langle v_i, v_j \rangle = 0$ for $1 \le i \ne j \le n$ and
3. $L(v_1, v_2, \ldots, v_i) = L(u_1, u_2, \ldots, u_i)$ for $1 \le i \le n$.
Proof. We successively define the vectors $v_1, v_2, \ldots, v_n$ as follows.
Step 1: $v_1 = \dfrac{u_1}{\|u_1\|}$.
Step 2: Calculate $w_2 = u_2 - \langle u_2, v_1 \rangle v_1$, and let $v_2 = \dfrac{w_2}{\|w_2\|}$.
Step 3: Obtain $w_3 = u_3 - \langle u_3, v_1 \rangle v_1 - \langle u_3, v_2 \rangle v_2$, and let $v_3 = \dfrac{w_3}{\|w_3\|}$.
Step i: In general, if $v_1, v_2, \ldots, v_{i-1}$ are already obtained, we compute
$$w_i = u_i - \langle u_i, v_1 \rangle v_1 - \langle u_i, v_2 \rangle v_2 - \cdots - \langle u_i, v_{i-1} \rangle v_{i-1}. \qquad (5.3.2)$$
As the set $\{u_1, u_2, \ldots, u_n\}$ is linearly independent, it can be verified that $\|w_i\| \ne 0$ and hence we define $v_i = \dfrac{w_i}{\|w_i\|}$.
We prove this by induction on $n$, the number of linearly independent vectors. For $n = 1$, $v_1 = \dfrac{u_1}{\|u_1\|}$. As $u_1$ is an element of a linearly independent set, $u_1 \ne 0$ and thus $v_1 \ne 0$ and
$$\|v_1\|^2 = \langle v_1, v_1 \rangle = \Big\langle \frac{u_1}{\|u_1\|}, \frac{u_1}{\|u_1\|} \Big\rangle = \frac{\langle u_1, u_1 \rangle}{\|u_1\|^2} = 1.$$
Hence, the result holds for $n = 1$.
Let the result hold for all $k \le n - 1$. That is, suppose we are given any set of $k$, $1 \le k \le n - 1$, linearly independent vectors $\{u_1, u_2, \ldots, u_k\}$ of $V$. Then by the inductive assumption, there exists a set $\{v_1, v_2, \ldots, v_k\}$ of vectors satisfying the following:
1. $\|v_i\| = 1$ for $1 \le i \le k$,
2. $\langle v_i, v_j \rangle = 0$ for $1 \le i \ne j \le k$, and
3. $L(v_1, v_2, \ldots, v_i) = L(u_1, u_2, \ldots, u_i)$ for $1 \le i \le k$.
Now, let us assume that $\{u_1, u_2, \ldots, u_n\}$ is a linearly independent subset of $V$. Then by the inductive assumption, we already have vectors $v_1, v_2, \ldots, v_{n-1}$ satisfying
1. $\|v_i\| = 1$ for $1 \le i \le n - 1$,
2. $\langle v_i, v_j \rangle = 0$ for $1 \le i \ne j \le n - 1$, and
3. $L(v_1, v_2, \ldots, v_i) = L(u_1, u_2, \ldots, u_i)$ for $1 \le i \le n - 1$.
Using (5.3.2), we dene
w
n
= u
n
u
n
, v
1
)v
1
u
n
, v
2
)v
2
u
n
, v
n1
)v
n1
. (5.3.3)
We rst show that w
n
, L(v
1
, v
2
, . . . , v
n1
). This will imply that w
n
,= 0 and hence v
n
=
wn
wn
is well dened. Also, |v
n
| = 1.
On the contrary, assume that $w_n \in L(v_1, v_2, \dots, v_{n-1})$. Then, by definition, there exist scalars $\alpha_1, \alpha_2, \dots, \alpha_{n-1}$ such that
\[ w_n = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_{n-1} v_{n-1}. \]
So, substituting $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_{n-1} v_{n-1}$ for $w_n$ in (5.3.3), we get
\[ u_n = \bigl(\alpha_1 + \langle u_n, v_1\rangle\bigr) v_1 + \bigl(\alpha_2 + \langle u_n, v_2\rangle\bigr) v_2 + \cdots + \bigl(\alpha_{n-1} + \langle u_n, v_{n-1}\rangle\bigr) v_{n-1}. \]
That is, $u_n \in L(v_1, v_2, \dots, v_{n-1})$. But $L(v_1, \dots, v_{n-1}) = L(u_1, \dots, u_{n-1})$ by the third induction assumption. Hence $u_n \in L(u_1, \dots, u_{n-1})$, which contradicts the assumption that the set of vectors $\{u_1, \dots, u_n\}$ is linearly independent.
Also, it can be easily verified that $\langle v_n, v_i\rangle = 0$ for $1 \le i \le n-1$. Hence, by the principle of mathematical induction, the proof of the theorem is complete.
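The construction in (5.3.2) translates almost line by line into code. The following is a minimal sketch, assuming the standard inner product on $\mathbb{R}^n$ and floating point arithmetic; the names `inner` and `gram_schmidt` and the tolerance `tol` are illustrative choices and not part of the notes.

```python
import math

def inner(x, y):
    """Standard inner product on R^n (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Return v_1, ..., v_n obtained from linearly independent u_1, ..., u_n."""
    basis = []
    for u in vectors:
        # w_i = u_i - sum_j <u_i, v_j> v_j, as in (5.3.2)
        w = list(u)
        for v in basis:
            c = inner(u, v)
            w = [wi - c * vi for wi, vi in zip(w, v)]
        norm = math.sqrt(inner(w, w))
        if norm < tol:
            # w_i = 0 can only happen for a linearly dependent input
            raise ValueError("input vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

# Three linearly independent vectors of R^4
vs = gram_schmidt([(1, 0, 1, 0), (0, 1, 0, 1), (1, -1, 1, 1)])
for v in vs:
    print([round(c, 4) for c in v])
```

The sketch subtracts the projections one at a time from the running vector `w` but always uses the coefficient $\langle u_i, v_j\rangle$ computed from the original $u_i$, exactly as in (5.3.2).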
We illustrate the Gram-Schmidt process by the following example.
Example 5.3.3 1. Let $\{(1, -1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)\} \subset \mathbb{R}^4$. Find a set $\{v_1, v_2, v_3\}$ that is orthonormal and $L((1, -1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)) = L(v_1^t, v_2^t, v_3^t)$.
Solution: Let $u_1^t = (1, 0, 1, 0)$, $u_2^t = (0, 1, 0, 1)$ and $u_3^t = (1, -1, 1, 1)$. Then $v_1^t = \frac{1}{\sqrt{2}}(1, 0, 1, 0)$. Also, $\langle u_2, v_1\rangle = 0$ and hence $w_2 = u_2$. Thus, $v_2^t = \frac{1}{\sqrt{2}}(0, 1, 0, 1)$ and
\[ w_3 = u_3 - \langle u_3, v_1\rangle v_1 - \langle u_3, v_2\rangle v_2 = (0, -1, 0, 1)^t. \]
Therefore, $v_3^t = \frac{1}{\sqrt{2}}(0, -1, 0, 1)$.
2. Find an orthonormal set in $\mathbb{R}^3$ containing $(1, 2, 1)$.
Solution: Let $(x, y, z) \in \mathbb{R}^3$ with $\langle (1, 2, 1), (x, y, z)\rangle = 0$. Then $x + 2y + z = 0$ or equivalently, $x = -2y - z$. Thus,
\[ (x, y, z) = (-2y - z, y, z) = y(-2, 1, 0) + z(-1, 0, 1). \]
Observe that the vectors $(-2, 1, 0)$ and $(-1, 0, 1)$ are both orthogonal to $(1, 2, 1)$ but are not orthogonal to each other.
Method 1: Consider $\{\frac{1}{\sqrt{6}}(1, 2, 1), (-2, 1, 0), (-1, 0, 1)\} \subset \mathbb{R}^3$ and apply the Gram-Schmidt process to get the result.
Method 2: This method can be used only if the vectors are from $\mathbb{R}^3$. Recall that in $\mathbb{R}^3$, the cross product of two vectors $u$ and $v$, denoted $u \times v$, is a vector that is orthogonal to both the vectors $u$ and $v$. Hence, the vector
\[ (1, 2, 1) \times (-2, 1, 0) = (0 - 1, -2 - 0, 1 + 4) = (-1, -2, 5) \]
is orthogonal to the vectors $(1, 2, 1)$ and $(-2, 1, 0)$ and hence the required orthonormal set is
\[ \left\{ \frac{1}{\sqrt{6}}(1, 2, 1),\ \frac{1}{\sqrt{5}}(-2, 1, 0),\ \frac{1}{\sqrt{30}}(-1, -2, 5) \right\}. \]
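Method 2 is easy to check by machine. The sketch below is an illustration only: the helper names `cross`, `dot` and `normalize` are assumptions of this example, and the vector $(-2, 1, 0)$ is the solution of $x + 2y + z = 0$ obtained by taking $y = 1$, $z = 0$.

```python
import math

def cross(u, v):
    """Cross product of two vectors of R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    length = math.sqrt(dot(u, u))
    return tuple(c / length for c in u)

a = (1, 2, 1)
b = (-2, 1, 0)       # orthogonal to a, since 1*(-2) + 2*1 + 1*0 = 0
c = cross(a, b)      # orthogonal to both a and b
orthonormal_set = [normalize(a), normalize(b), normalize(c)]
print(c)
```

The squared lengths of the three vectors are 6, 5 and 30, which is where the normalizing factors $\frac{1}{\sqrt{6}}$, $\frac{1}{\sqrt{5}}$ and $\frac{1}{\sqrt{30}}$ come from.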
Remark 5.3.4 1. Let $V$ be a vector space. Then the following holds.
(a) Let $\{u_1, u_2, \dots, u_k\}$ be a linearly independent subset of $V$. Then the Gram-Schmidt orthogonalization process gives an orthonormal set $\{v_1, v_2, \dots, v_k\}$ of $V$ with
\[ L(v_1, v_2, \dots, v_i) = L(u_1, u_2, \dots, u_i) \quad \text{for } 1 \le i \le k. \]
(b) Let $W$ be a subspace of $V$ with a basis $\{u_1, u_2, \dots, u_k\}$. Then $\{v_1, v_2, \dots, v_k\}$ is also a basis of $W$.
(c) Suppose $\{u_1, u_2, \dots, u_n\}$ is a linearly dependent subset of $V$. Then there exists a smallest $k$, $2 \le k \le n$, such that $w_k = 0$.
Idea of the proof: Linear dependence (see Corollary 3.2.5) implies that there exists a smallest $k$, $2 \le k \le n$, such that
\[ L(u_1, u_2, \dots, u_k) = L(u_1, u_2, \dots, u_{k-1}). \]
Also, by the Gram-Schmidt orthogonalization process,
\[ L(u_1, u_2, \dots, u_{k-1}) = L(v_1, v_2, \dots, v_{k-1}). \]
Thus, $u_k \in L(v_1, v_2, \dots, v_{k-1})$ and hence by Remark 5.2.19
\[ u_k = \langle u_k, v_1\rangle v_1 + \langle u_k, v_2\rangle v_2 + \cdots + \langle u_k, v_{k-1}\rangle v_{k-1}. \]
So, by definition, $w_k = 0$.
2. Let $S$ be a countably infinite set of linearly independent vectors. Then one can apply the Gram-Schmidt process to get a countably infinite orthonormal set.
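3. Let $B = (e_1, e_2, \dots, e_n)$ be the standard ordered basis of $\mathbb{R}^n$. Suppose $\mathbb{R}^n$ has an orthonormal set $\{v_1, v_2, \dots, v_n\}$. Then we can find real numbers $\alpha_{ij}$, $1 \le i, j \le n$, such that
\[ [v_i]_B = (\alpha_{1i}, \alpha_{2i}, \dots, \alpha_{ni})^t \quad \text{for } i = 1, 2, \dots, n. \]
Then in the ordered basis $B$, the matrix $A = [v_1, v_2, \dots, v_n]$ equals
\[ A_{n \times n} = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{pmatrix}. \]
As $\|v_i\| = 1$ and $\langle v_i, v_j\rangle = 0$ for $1 \le i \neq j \le n$, we get
\[ 1 = \|v_i\| = \|v_i\|^2 = \langle v_i, v_i\rangle = \sum_{j=1}^{n} \alpha_{ji}^2, \quad \text{and} \quad 0 = \langle v_i, v_j\rangle = \sum_{s=1}^{n} \alpha_{si}\alpha_{sj}. \tag{5.3.4} \]
Note that,
\[ A^t A = \begin{pmatrix} v_1^t \\ v_2^t \\ \vdots \\ v_n^t \end{pmatrix} [v_1, v_2, \dots, v_n] = \begin{pmatrix} \|v_1\|^2 & \langle v_1, v_2\rangle & \cdots & \langle v_1, v_n\rangle \\ \langle v_2, v_1\rangle & \|v_2\|^2 & \cdots & \langle v_2, v_n\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_n, v_1\rangle & \langle v_n, v_2\rangle & \cdots & \|v_n\|^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} = I_n. \]
Or using (5.3.4), in the language of matrices, we get
\[ A^t A = \begin{pmatrix} \alpha_{11} & \alpha_{21} & \cdots & \alpha_{n1} \\ \alpha_{12} & \alpha_{22} & \cdots & \alpha_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{1n} & \alpha_{2n} & \cdots & \alpha_{nn} \end{pmatrix} \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{pmatrix} = I_n. \]
Recall that a matrix $A$ with $A^t A = I_n$ is an orthogonal matrix. So, we see that the rows/columns of an $n \times n$ orthogonal matrix give rise to an orthonormal basis of $\mathbb{R}^n$. Using a similar idea, one can show that the rows/columns of an $n \times n$ unitary matrix give rise to an orthonormal basis of the complex vector space $\mathbb{C}^n$.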
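Part 1(c) of Remark 5.3.4 suggests a practical test for linear dependence: run the Gram-Schmidt process and stop at the first index $k$ with $w_k = 0$. The following is a rough sketch under the standard inner product on $\mathbb{R}^n$; the function name `first_zero_step` and the tolerance `tol` (needed because floating point arithmetic rarely gives an exact zero) are assumptions of this example. Unlike the remark, the sketch also reports $k = 1$ when $u_1$ itself is the zero vector.

```python
import math

def first_zero_step(vectors, tol=1e-9):
    """Return the smallest k (1-based) with w_k = 0, or None if there is none."""
    basis = []
    for k, u in enumerate(vectors, start=1):
        w = [float(c) for c in u]
        for v in basis:
            coeff = sum(a * b for a, b in zip(u, v))   # <u_k, v_j>
            w = [wi - coeff * vi for wi, vi in zip(w, v)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm < tol:
            return k            # u_k lies in L(u_1, ..., u_{k-1})
        basis.append([c / norm for c in w])
    return None

dependent = [(1, -1, 1, 1), (1, 0, 1, 0), (1, -2, 1, 2), (0, 1, 0, 1)]
independent = [(1, 0, 1, 0), (0, 1, 0, 1), (1, -1, 1, 1)]
print(first_zero_step(dependent), first_zero_step(independent))
```

In the first list the third vector equals twice the first minus the second, so the process breaks down at $k = 3$.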
Definition 5.2.11 started with a subspace of an inner product space $V$ and looked at its complement. We now look at the orthogonal complement of a subset of an inner product space $V$ and the associated results.
Definition 5.3.5 (Orthogonal Subspace of a Set) Let $V$ be an inner product space. Let $S$ be a non-empty subset of $V$. We define
\[ S^\perp = \{ v \in V : \langle v, s\rangle = 0 \text{ for all } s \in S \}. \]
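Example 5.3.6 Let $V = \mathbb{R}$.
1. $S = \{0\}$. Then $S^\perp = \mathbb{R}$.
2. $S = \mathbb{R}$. Then $S^\perp = \{0\}$.
3. Let $S$ be any subset of $\mathbb{R}$ containing a non-zero real number. Then $S^\perp = \{0\}$.
4. Let $S = \{(1, 2, 1)\} \subset \mathbb{R}^3$. Then using Example 5.3.3.2, $S^\perp = L((-2, 1, 0), (-1, 0, 1))$.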

We now state the result which gives the existence of an orthogonal subspace of a finite dimensional inner product space.
Theorem 5.3.7 Let $S$ be a subset of a finite dimensional inner product space $V$, with inner product $\langle\ ,\ \rangle$. Then
1. $S^\perp$ is a subspace of $V$.
2. Let $W = L(S)$. Then the subspaces $W$ and $S^\perp = W^\perp$ are complementary. That is, $V = W + S^\perp = W + W^\perp$.
3. Moreover, $\langle u, w\rangle = 0$ for all $w \in W$ and $u \in S^\perp$.
Proof. We leave the proof of the first part to the reader. The proof of the second part is as follows:
Let $\dim(V) = n$ and $\dim(W) = k$. Let $\{w_1, w_2, \dots, w_k\}$ be a basis of $W$. By the Gram-Schmidt orthogonalization process, we get an orthonormal basis, say $\{v_1, v_2, \dots, v_k\}$, of $W$. Then, for any $v \in V$,
\[ v - \sum_{i=1}^{k} \langle v, v_i\rangle v_i \in S^\perp. \]
So, $V \subset W + S^\perp$. Hence, $V = W + S^\perp$. We now need to show that $W \cap S^\perp = \{0\}$.
To do this, let $v \in W \cap S^\perp$. Then $v \in W$ and $v \in S^\perp$. Hence, by definition, $\langle v, v\rangle = 0$. That is, $\|v\|^2 = \langle v, v\rangle = 0$, implying $v = 0$ and hence $W \cap S^\perp = \{0\}$.
The third part is a direct consequence of the definition of $S^\perp$.
Exercise 5.3.8 1. Let $A$ be an $n \times n$ orthogonal matrix. Then prove that
(a) the rows of $A$ form an orthonormal basis of $\mathbb{R}^n$.
(b) the columns of $A$ form an orthonormal basis of $\mathbb{R}^n$.
(c) for any two vectors $x, y \in \mathbb{R}^{n \times 1}$, $\langle Ax, Ay\rangle = \langle x, y\rangle$.
(d) for any vector $x \in \mathbb{R}^{n \times 1}$, $\|Ax\| = \|x\|$.
2. Let $A$ be an $n \times n$ unitary matrix. Then prove that
(a) the rows/columns of $A$ form an orthonormal basis of the complex vector space $\mathbb{C}^n$.
(b) for any two vectors $x, y \in \mathbb{C}^{n \times 1}$, $\langle Ax, Ay\rangle = \langle x, y\rangle$.
(c) for any vector $x \in \mathbb{C}^{n \times 1}$, $\|Ax\| = \|x\|$.
3. Let $A$ and $B$ be two $n \times n$ orthogonal matrices. Then prove that $AB$ and $BA$ are both orthogonal matrices. Prove a similar result for unitary matrices.
4. Prove the statements made in Remark 5.3.4.3 about orthogonal matrices. State and prove a similar result for unitary matrices.
5. Let $A$ be an $n \times n$ upper triangular matrix with positive diagonal entries. If $A$ is also an orthogonal matrix then prove that $A = I_n$.
6. Determine an orthonormal basis of R
4
containing the vectors (1, 2, 1, 3) and (2, 1, 3, 1).
7. Consider the real inner product space $C[-1, 1]$ with $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Prove that the polynomials $1$, $x$, $\frac{3}{2}x^2 - \frac{1}{2}$, $\frac{5}{2}x^3 - \frac{3}{2}x$ form an orthogonal set in $C[-1, 1]$. Find the corresponding functions $f(x)$ with $\|f(x)\| = 1$.
8. Consider the real inner product space $C[-\pi, \pi]$ with $\langle f, g\rangle = \int_{-\pi}^{\pi} f(t)g(t)\,dt$. Find an orthonormal basis for $L(x, \sin x, \sin(x + 1))$.
9. Let $M$ be a subspace of $\mathbb{R}^n$ and $\dim M = m$. A vector $x \in \mathbb{R}^n$ is said to be orthogonal to $M$ if $\langle x, y\rangle = 0$ for every $y \in M$.
(a) How many linearly independent vectors can be orthogonal to $M$?
(b) If $M = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 + x_2 + x_3 = 0\}$, determine a maximal set of linearly independent vectors orthogonal to $M$ in $\mathbb{R}^3$.
10. Determine an orthogonal basis of L((1, 1, 0, 1), (1, 1, 1, 1), (0, 2, 1, 0), (1, 0, 0, 0)) in R
4
.
11. Let $\mathbb{R}^n$ be endowed with the standard inner product. Suppose we have a vector $x^t = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ with $\|x\| = 1$.
(a) Then prove that the set $\{x\}$ can always be extended to form an orthonormal basis of $\mathbb{R}^n$.
(b) Let this basis be $\{x, x_2, \dots, x_n\}$. Suppose $B = (e_1, e_2, \dots, e_n)$ is the standard basis of $\mathbb{R}^n$ and let $A = \bigl[ [x]_B, [x_2]_B, \dots, [x_n]_B \bigr]$. Then prove that $A$ is an orthogonal matrix.
12. Let $v, w \in \mathbb{R}^n$, $n \ge 1$, with $\|v\| = \|w\| = 1$. Prove that there exists an orthogonal matrix $A$ such that $Av = w$. Prove also that $A$ can be chosen such that $\det(A) = 1$.
5.4 Orthogonal Projections and Applications
Recall that given a $k$-dimensional vector subspace $W$ of a vector space $V$ of dimension $n$, one can always find an $(n - k)$-dimensional vector subspace $W_0$ of $V$ (see Exercise 3.3.13.5) satisfying
\[ W + W_0 = V \quad \text{and} \quad W \cap W_0 = \{0\}. \]
The subspace $W_0$ is called the complementary subspace of $W$ in $V$. We first use Theorem 5.3.7 to get the complementary subspace in such a way that the vectors in different subspaces are orthogonal. That is, $\langle w, v\rangle = 0$ for all $w \in W$ and $v \in W_0$. We then use this to define an important class of linear transformations on an inner product space, called orthogonal projections.
Definition 5.4.1 (Orthogonal Complement and Orthogonal Projection) Let $W$ be a subspace of a finite dimensional inner product space $V$.
1. Then $W^\perp$ is called the orthogonal complement of $W$ in $V$. We represent it by writing $V = W \oplus W^\perp$ in place of $V = W + W^\perp$.
2. Also, for each $v \in V$ there exist unique vectors $w \in W$ and $u \in W^\perp$ such that $v = w + u$. We use this to define
\[ P_W : V \longrightarrow V \quad \text{by} \quad P_W(v) = w. \]
Then $P_W$ is called the orthogonal projection of $V$ onto $W$.
Exercise 5.4.2 Let $W$ be a subspace of a finite dimensional inner product space $V$. Use $V = W \oplus W^\perp$ to define the orthogonal projection operator $P_{W^\perp}$ of $V$ onto $W^\perp$. Prove that the maps $P_W$ and $P_{W^\perp}$ are indeed linear transformations.
Example 5.4.3 1. Let $V = \mathbb{R}^3$ and $W = \{(x, y, z) \in \mathbb{R}^3 : x + y - z = 0\}$. Then it can be easily verified that $\{(1, 1, -1)\}$ is a basis of $W^\perp$ as for each $(x, y, z) \in W$, we have $x + y - z = 0$ and hence
\[ \langle (x, y, z), (1, 1, -1)\rangle = x + y - z = 0 \quad \text{for each } (x, y, z) \in W. \]
Also, using Equation (5.3.1), for every $x^t = (x, y, z) \in \mathbb{R}^3$, we have $u = \frac{x + y - z}{3}(1, 1, -1)$, $w = \left( \frac{2x - y + z}{3}, \frac{-x + 2y + z}{3}, \frac{x + y + 2z}{3} \right)$ and $x = w + u$. Let
\[ A = \frac{1}{3} \begin{pmatrix} 2 & -1 & 1 \\ -1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix} \quad \text{and} \quad B = \frac{1}{3} \begin{pmatrix} 1 & 1 & -1 \\ 1 & 1 & -1 \\ -1 & -1 & 1 \end{pmatrix}. \]
Then by definition, $P_W(x) = w = Ax$ and $P_{W^\perp}(x) = u = Bx$. Observe that $A^2 = A$, $B^2 = B$, $A^t = A$, $B^t = B$, $A \cdot B = 0_3$, $B \cdot A = 0_3$ and $A + B = I_3$, where $0_3$ is the zero matrix of size $3 \times 3$ and $I_3$ is the identity matrix of size 3. Also, verify that $\mathrm{rank}(A) = 2$ and $\mathrm{rank}(B) = 1$.
2. Let $W = L((1, 2, 1)) \subset \mathbb{R}^3$. Then using Example 5.3.3.2 and Equation (5.3.1), we get
\[ W^\perp = L((-2, 1, 0), (-1, 0, 1)) = L((-2, 1, 0), (-1, -2, 5)), \]
$u = \left( \frac{5x - 2y - z}{6}, \frac{-2x + 2y - 2z}{6}, \frac{-x - 2y + 5z}{6} \right)$ and $w = \frac{x + 2y + z}{6}(1, 2, 1)$ with $(x, y, z) = w + u$. Hence, for
\[ A = \frac{1}{6} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} \quad \text{and} \quad B = \frac{1}{6} \begin{pmatrix} 5 & -2 & -1 \\ -2 & 2 & -2 \\ -1 & -2 & 5 \end{pmatrix}, \]
we have $P_W(x) = w = Ax$ and $P_{W^\perp}(x) = u = Bx$. Observe that $A^2 = A$, $B^2 = B$, $A^t = A$ and $B^t = B$, $A \cdot B = 0_3$, $B \cdot A = 0_3$ and $A + B = I_3$, where $0_3$ is the zero matrix of size $3 \times 3$ and $I_3$ is the identity matrix of size 3. Also, verify that $\mathrm{rank}(A) = 1$ and $\mathrm{rank}(B) = 2$.
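The identities listed at the end of the second example can be verified with exact rational arithmetic. The sketch below is only an illustration; the helpers `matmul`, `transpose`, `scale` and `add` are written for this example, and the entries of $A$ and $B$ are those obtained from $w = \frac{x + 2y + z}{6}(1, 2, 1)$ and $u = (x, y, z) - w$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def scale(c, X):
    return [[c * entry for entry in row] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Z3 = [[0, 0, 0] for _ in range(3)]

# Matrices of P_W and of the projection onto W-perp for W = L((1, 2, 1))
A = scale(Fraction(1, 6), [[1, 2, 1], [2, 4, 2], [1, 2, 1]])
B = scale(Fraction(1, 6), [[5, -2, -1], [-2, 2, -2], [-1, -2, 5]])

checks = {
    "A^2 = A": matmul(A, A) == A,
    "B^2 = B": matmul(B, B) == B,
    "A^t = A": transpose(A) == A,
    "B^t = B": transpose(B) == B,
    "AB = 0": matmul(A, B) == Z3,
    "BA = 0": matmul(B, A) == Z3,
    "A + B = I": add(A, B) == I3,
}
for name, ok in checks.items():
    print(name, ok)
```

Using `Fraction` avoids rounding, so the comparisons with the zero and identity matrices are exact.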
We now prove some basic properties related to orthogonal projection maps. We also need the following definition.
Definition 5.4.4 (Self-Adjoint Transformation/Operator) Let $V$ be an inner product space with inner product $\langle\ ,\ \rangle$. A linear transformation $T : V \longrightarrow V$ is called a self-adjoint operator if $\langle T(v), u\rangle = \langle v, T(u)\rangle$ for every $u, v \in V$.
The example below gives an indication that the self-adjoint operators and Hermitian matrices are related. It also shows that $\mathbb{C}^n$ ($\mathbb{R}^n$) can be decomposed in terms of the null space and range space of Hermitian matrices. These examples also follow directly from the fundamental theorem of linear algebra.
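Example 5.4.5 1. Let $A$ be an $n \times n$ real symmetric matrix and define $T_A : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ by $T_A(x) = Ax$ for every $x^t \in \mathbb{R}^n$.
(a) $T_A$ is a self-adjoint operator.
As $A = A^t$, for every $x^t, y^t \in \mathbb{R}^n$,
\[ \langle T_A(x), y\rangle = (y^t)Ax = (y^t)A^t x = (Ay)^t x = \langle x, Ay\rangle = \langle x, T_A(y)\rangle. \]
(b) $\mathcal{N}(T_A) = \mathcal{R}(T_A)^\perp$ follows from Theorem 5.2.14 as $A = A^t$. But we do give a proof for completeness.
Let $x \in \mathcal{N}(T_A)$. Then $T_A(x) = 0$ and $\langle x, T_A(u)\rangle = \langle T_A(x), u\rangle = 0$. Thus, $x \in \mathcal{R}(T_A)^\perp$ and hence $\mathcal{N}(T_A) \subset \mathcal{R}(T_A)^\perp$.
Let $x \in \mathcal{R}(T_A)^\perp$. Then $0 = \langle x, T_A(y)\rangle = \langle T_A(x), y\rangle$ for every $y \in \mathbb{R}^n$. Hence, by Exercise 2, $T_A(x) = 0$. That is, $x \in \mathcal{N}(A)$ and hence $\mathcal{R}(T_A)^\perp \subset \mathcal{N}(T_A)$.
(c) $\mathbb{R}^n = \mathcal{N}(T_A) \oplus \mathcal{R}(T_A)$ as $\mathcal{N}(T_A) = \mathcal{R}(T_A)^\perp$.
(d) Thus $\mathcal{N}(A) = \mathrm{Im}(A)^\perp$, or equivalently, $\mathbb{R}^n = \mathcal{N}(A) \oplus \mathrm{Im}(A)$.
2. Let $A$ be an $n \times n$ Hermitian matrix. Define $T_A : \mathbb{C}^n \longrightarrow \mathbb{C}^n$ by $T_A(z) = Az$ for all $z^t \in \mathbb{C}^n$. Then using arguments similar to the arguments in Example 5.4.5.1, prove the following:
(a) $T_A$ is a self-adjoint operator.
(b) $\mathcal{N}(T_A) = \mathcal{R}(T_A)^\perp$ and $\mathbb{C}^n = \mathcal{N}(T_A) \oplus \mathcal{R}(T_A)$.
(c) $\mathcal{N}(A) = \mathrm{Im}(A)^\perp$ and $\mathbb{C}^n = \mathcal{N}(A) \oplus \mathrm{Im}(A)$.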
We now state and prove the main result related with orthogonal projection operators.
Theorem 5.4.6 Let $W$ be a vector subspace of a finite dimensional inner product space $V$ and let $P_W : V \longrightarrow V$ be the orthogonal projection operator of $V$ onto $W$.
1. Then $\mathcal{N}(P_W) = \{v \in V : P_W(v) = 0\} = W^\perp = \mathcal{R}(P_{W^\perp})$.
2. Then $\mathcal{R}(P_W) = \{P_W(v) : v \in V\} = W = \mathcal{N}(P_{W^\perp})$.
3. Then $P_W \circ P_W = P_W$, $P_{W^\perp} \circ P_{W^\perp} = P_{W^\perp}$.
4. Let $0_V$ denote the zero operator on $V$ defined by $0_V(v) = 0$ for all $v \in V$. Then $P_{W^\perp} \circ P_W = 0_V$ and $P_W \circ P_{W^\perp} = 0_V$.
5. Let $I_V$ denote the identity operator on $V$ defined by $I_V(v) = v$ for all $v \in V$. Then $I_V = P_W \oplus P_{W^\perp}$, where we have written $\oplus$ instead of $+$ to indicate the relationship $P_{W^\perp} \circ P_W = 0_V$ and $P_W \circ P_{W^\perp} = 0_V$.
6. The operators $P_W$ and $P_{W^\perp}$ are self-adjoint.
Proof. Part 1: Let $u \in W^\perp$. As $V = W \oplus W^\perp$, we have $u = 0 + u$ for $0 \in W$ and $u \in W^\perp$. Hence by definition, $P_W(u) = 0$ and $P_{W^\perp}(u) = u$. Thus, $W^\perp \subset \mathcal{N}(P_W)$ and $W^\perp \subset \mathcal{R}(P_{W^\perp})$.
Also, suppose that $v \in \mathcal{N}(P_W)$ for some $v \in V$. As $v$ has a unique expression as $v = w + u$ for some $w \in W$ and some $u \in W^\perp$, by definition of $P_W$, we have $P_W(v) = w$. As $v \in \mathcal{N}(P_W)$, by definition, $P_W(v) = 0$ and hence $w = 0$. That is, $v = u \in W^\perp$. Thus, $\mathcal{N}(P_W) \subset W^\perp$.
One can similarly show that $\mathcal{R}(P_{W^\perp}) \subset W^\perp$. Thus, the proof of the first part is complete.
Part 2: Similar argument as in the proof of Part 1.
Part 3, Part 4 and Part 5: Let $v \in V$ and let $v = w + u$ for some $w \in W$ and $u \in W^\perp$. Then by definition,
\[ (P_W \circ P_W)(v) = P_W\bigl(P_W(v)\bigr) = P_W(w) = w \quad \text{and} \quad P_W(v) = w, \tag{5.4.1} \]
\[ (P_{W^\perp} \circ P_W)(v) = P_{W^\perp}\bigl(P_W(v)\bigr) = P_{W^\perp}(w) = 0 \quad \text{and} \tag{5.4.2} \]
\[ (P_W \oplus P_{W^\perp})(v) = P_W(v) + P_{W^\perp}(v) = w + u = v = I_V(v). \tag{5.4.3} \]
Hence, applying Exercise 2 to Equations (5.4.1), (5.4.2) and (5.4.3), respectively, we get $P_W \circ P_W = P_W$, $P_{W^\perp} \circ P_W = 0_V$ and $I_V = P_W \oplus P_{W^\perp}$.
Part 6: Let $u = w_1 + x_1$ and $v = w_2 + x_2$, where $w_1, w_2 \in W$ and $x_1, x_2 \in W^\perp$. Then, by definition, $\langle w_i, x_j\rangle = 0$ for $1 \le i, j \le 2$. Thus,
\[ \langle P_W(u), v\rangle = \langle w_1, v\rangle = \langle w_1, w_2\rangle = \langle u, w_2\rangle = \langle u, P_W(v)\rangle \]
and the proof of the theorem is complete.
The next theorem is a generalization of Theorem 5.4.6 when a finite dimensional inner product space $V$ can be written as $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$, where the $W_i$'s are vector subspaces of $V$. That is, for each $v \in V$ there exist unique vectors $v_1, v_2, \dots, v_k$ such that
1. $v_i \in W_i$ for $1 \le i \le k$,
2. $\langle v_i, v_j\rangle = 0$ for each $v_i \in W_i$, $v_j \in W_j$, $1 \le i \neq j \le k$ and
3. $v = v_1 + v_2 + \cdots + v_k$.
We omit the proof as it basically uses arguments that are similar to the arguments used in the proof of Theorem 5.4.6.
Theorem 5.4.7 Let $V$ be a finite dimensional inner product space and let $W_1, W_2, \dots, W_k$ be vector subspaces of $V$ such that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$. Then for each $i, j$, $1 \le i \neq j \le k$, there exist orthogonal projection operators $P_{W_i} : V \longrightarrow V$ of $V$ onto $W_i$ satisfying the following:
1. $\mathcal{N}(P_{W_i}) = W_i^\perp = W_1 \oplus W_2 \oplus \cdots \oplus W_{i-1} \oplus W_{i+1} \oplus \cdots \oplus W_k$.
2. $\mathcal{R}(P_{W_i}) = W_i$.
3. $P_{W_i} \circ P_{W_i} = P_{W_i}$.
4. $P_{W_i} \circ P_{W_j} = 0_V$.
5. $P_{W_i}$ is a self-adjoint operator, and
6. $I_V = P_{W_1} \oplus P_{W_2} \oplus \cdots \oplus P_{W_k}$.
Remark 5.4.8 1. By Exercise 5.4.2, $P_W$ is a linear transformation.
2. By Theorem 5.4.6, we observe the following:
(a) The orthogonal projection operators $P_W$ and $P_{W^\perp}$ are idempotent operators.
(b) The orthogonal projection operators $P_W$ and $P_{W^\perp}$ are also self-adjoint operators.
(c) Let $v \in V$. Then $v - P_W(v) = (I_V - P_W)(v) = P_{W^\perp}(v) \in W^\perp$. Thus,
\[ \langle v - P_W(v), w\rangle = 0 \quad \text{for every } v \in V \text{ and } w \in W. \]
(d) Using Remark 5.4.8.2c, $P_W(v) - w \in W$ for each $v \in V$ and $w \in W$. Thus,
\[ \begin{aligned} \|v - w\|^2 &= \|v - P_W(v) + P_W(v) - w\|^2 \\ &= \|v - P_W(v)\|^2 + \|P_W(v) - w\|^2 + 2\langle v - P_W(v), P_W(v) - w\rangle \\ &= \|v - P_W(v)\|^2 + \|P_W(v) - w\|^2. \end{aligned} \]
Therefore,
\[ \|v - w\| \ge \|v - P_W(v)\| \]
and equality holds if and only if $w = P_W(v)$. Since $P_W(v) \in W$, we see that
\[ d(v, W) = \inf \{ \|v - w\| : w \in W \} = \|v - P_W(v)\|. \]
That is, $P_W(v)$ is the vector of $W$ nearest to $v$. This can also be stated as: the vector $P_W(v)$ solves the following minimization problem:
\[ \inf_{w \in W} \|v - w\| = \|v - P_W(v)\|. \]
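The minimization property in part 2(d) can be observed on a small example. The sketch below is illustrative only: it uses the plane $W = \{(x, y, z) : x + y - z = 0\}$ and the matrix $A$ of $P_W$ from Example 5.4.3.1, and compares $P_W(v)$ with a finite grid of integer points of $W$. The grid, the test vector `v` and the helper names `apply` and `dist2` are assumptions made for this illustration.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Matrix of P_W for W = {(x, y, z) : x + y - z = 0}
A = [[2 * third, -third, third],
     [-third, 2 * third, third],
     [third, third, 2 * third]]

def apply(M, x):
    """Matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dist2(x, y):
    """Squared distance, which is enough for comparing distances."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

v = [1, 2, 0]
p = apply(A, v)                      # P_W(v)

# Integer points (a, b, a + b) of W in a small box
grid = [(a, b, a + b) for a in range(-3, 4) for b in range(-3, 4)]
closest = min(grid, key=lambda w: dist2(v, w))
print(p, closest, dist2(v, p))
```

A finite grid can of course only illustrate, not prove, the inequality $\|v - w\| \ge \|v - P_W(v)\|$; the proof is the computation in part 2(d).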
Exercise 5.4.9 1. Let $A \in M_n(\mathbb{R})$ be an idempotent matrix and define $T_A : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ by $T_A(v) = Av$ for all $v^t \in \mathbb{R}^n$. Recall the following results from Exercise 4.3.12.5.
(a) $T_A \circ T_A = T_A$.
(b) $\mathcal{N}(T_A) \cap \mathcal{R}(T_A) = \{0\}$.
(c) $\mathbb{R}^n = \mathcal{R}(T_A) + \mathcal{N}(T_A)$.
The linear map $T_A$ need not be an orthogonal projection operator as $\mathcal{R}(T_A)^\perp$ need not be equal to $\mathcal{N}(T_A)$. Here $T_A$ is called a projection operator of $\mathbb{R}^n$ onto $\mathcal{R}(T_A)$ along $\mathcal{N}(T_A)$.
(d) If $A$ is also symmetric then prove that $T_A$ is an orthogonal projection operator.
(e) Which of the above results can be generalized to an $n \times n$ complex idempotent matrix $A$?
2. Find all $2 \times 2$ real matrices $A$ such that $A^2 = A$. Hence or otherwise, determine all projection operators of $\mathbb{R}^2$.
5.4.1 Matrix of the Orthogonal Projection
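The minimization problem stated above arises in a lot of applications. So, it is very helpful if the matrix of the orthogonal projection can be obtained under a given basis.
To this end, let $W$ be a $k$-dimensional subspace of $\mathbb{R}^n$ with $W^\perp$ as its orthogonal complement. Let $P_W : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be the orthogonal projection of $\mathbb{R}^n$ onto $W$. Suppose we are given an orthonormal basis $B = (v_1, v_2, \dots, v_k)$ of $W$. Under the assumption that $B$ is known, we explicitly give the matrix of $P_W$ with respect to an extended ordered basis of $\mathbb{R}^n$.
Let $B_1 = (v_1, v_2, \dots, v_k, v_{k+1}, \dots, v_n)$ be an ordered orthonormal basis of $\mathbb{R}^n$ containing $B$. Then (by Theorem 5.2.18) for any $v \in \mathbb{R}^n$, $v = \sum_{i=1}^{n} \langle v, v_i\rangle v_i$ and by definition, $P_W(v) = \sum_{i=1}^{k} \langle v, v_i\rangle v_i$. Let $A = [v_1, v_2, \dots, v_k]$ and let $B_2 = (e_1, e_2, \dots, e_n)$ be the standard ordered basis of $\mathbb{R}^n$. If $v_i = \sum_{j=1}^{n} a_{ji} e_j$ for $1 \le i \le n$, then
\[ A_{n \times k} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{pmatrix}, \quad [v]_{B_2} = \begin{pmatrix} \sum_{i=1}^{n} a_{1i}\langle v, v_i\rangle \\ \vdots \\ \sum_{i=1}^{n} a_{ni}\langle v, v_i\rangle \end{pmatrix} \quad \text{and} \quad [P_W(v)]_{B_2} = \begin{pmatrix} \sum_{i=1}^{k} a_{1i}\langle v, v_i\rangle \\ \vdots \\ \sum_{i=1}^{k} a_{ni}\langle v, v_i\rangle \end{pmatrix}. \]
Also, as observed in Remark 5.3.4.3, $A^t A = I_k$. More generally, as $B_1$ is orthonormal, for $1 \le i, j \le n$,
\[ \sum_{s=1}^{n} a_{si} a_{sj} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \tag{5.4.4} \]
Thus, using the associativity of matrix product and (5.4.4), we get
\[ \begin{aligned} \bigl(AA^t\bigr)[v]_{B_2} &= A \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1k} & a_{2k} & \cdots & a_{nk} \end{pmatrix} \begin{pmatrix} \sum_{i=1}^{n} a_{1i}\langle v, v_i\rangle \\ \sum_{i=1}^{n} a_{2i}\langle v, v_i\rangle \\ \vdots \\ \sum_{i=1}^{n} a_{ni}\langle v, v_i\rangle \end{pmatrix} = A \begin{pmatrix} \sum_{s=1}^{n} a_{s1} \Bigl( \sum_{i=1}^{n} a_{si}\langle v, v_i\rangle \Bigr) \\ \sum_{s=1}^{n} a_{s2} \Bigl( \sum_{i=1}^{n} a_{si}\langle v, v_i\rangle \Bigr) \\ \vdots \\ \sum_{s=1}^{n} a_{sk} \Bigl( \sum_{i=1}^{n} a_{si}\langle v, v_i\rangle \Bigr) \end{pmatrix} \\ &= A \begin{pmatrix} \sum_{i=1}^{n} \Bigl( \sum_{s=1}^{n} a_{s1} a_{si} \Bigr) \langle v, v_i\rangle \\ \sum_{i=1}^{n} \Bigl( \sum_{s=1}^{n} a_{s2} a_{si} \Bigr) \langle v, v_i\rangle \\ \vdots \\ \sum_{i=1}^{n} \Bigl( \sum_{s=1}^{n} a_{sk} a_{si} \Bigr) \langle v, v_i\rangle \end{pmatrix} = A \begin{pmatrix} \langle v, v_1\rangle \\ \langle v, v_2\rangle \\ \vdots \\ \langle v, v_k\rangle \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{k} a_{1i}\langle v, v_i\rangle \\ \sum_{i=1}^{k} a_{2i}\langle v, v_i\rangle \\ \vdots \\ \sum_{i=1}^{k} a_{ni}\langle v, v_i\rangle \end{pmatrix} \\ &= [P_W(v)]_{B_2} = \bigl( P_W[B_2, B_2] \bigr) [v]_{B_2}. \end{aligned} \]
Thus $P_W[B_2, B_2] = AA^t$ and hence we have proved the following theorem.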
Theorem 5.4.10 Let $W$ be a $k$-dimensional subspace of $\mathbb{R}^n$ and let $P_W$ be the corresponding orthogonal projection of $\mathbb{R}^n$ onto $W$. Also assume that $B = (v_1, v_2, \dots, v_k)$ is an orthonormal ordered basis of $W$. Then the matrix of $P_W$ in the standard ordered basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{R}^n$ is $AA^t$ (a symmetric matrix), where $A = [v_1, v_2, \dots, v_k]$ is an $n \times k$ matrix.
We illustrate the above theorem with the help of an example. One can also see Example 5.4.3.
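Example 5.4.11 Let $W = \{(x, y, z, w) \in \mathbb{R}^4 : x = y, z = w\}$ be a subspace of $\mathbb{R}^4$. Then orthonormal ordered bases of $W$ and $W^\perp$ are
\[ \left( \frac{1}{\sqrt{2}}(1, 1, 0, 0), \frac{1}{\sqrt{2}}(0, 0, 1, 1) \right) \quad \text{and} \quad \left( \frac{1}{\sqrt{2}}(1, -1, 0, 0), \frac{1}{\sqrt{2}}(0, 0, 1, -1) \right), \]
respectively. Let $P_W : \mathbb{R}^4 \longrightarrow \mathbb{R}^4$ be the orthogonal projection of $\mathbb{R}^4$ onto $W$. Then
\[ A = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 \\ 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} \end{pmatrix} \quad \text{and} \quad P_W[B, B] = AA^t = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \end{pmatrix}, \]
where $B = \left( \frac{1}{\sqrt{2}}(1, 1, 0, 0), \frac{1}{\sqrt{2}}(0, 0, 1, 1), \frac{1}{\sqrt{2}}(1, -1, 0, 0), \frac{1}{\sqrt{2}}(0, 0, 1, -1) \right)$. Verify that
1. $P_W[B, B]$ is symmetric,
2. $(P_W[B, B])^2 = P_W[B, B]$ and
3. $\bigl( I_4 - P_W[B, B] \bigr) P_W[B, B] = 0 = P_W[B, B] \bigl( I_4 - P_W[B, B] \bigr)$.
Also, $[(x, y, z, w)]_B = \left( \frac{x + y}{\sqrt{2}}, \frac{z + w}{\sqrt{2}}, \frac{x - y}{\sqrt{2}}, \frac{z - w}{\sqrt{2}} \right)^t$ and hence
\[ P_W\bigl( (x, y, z, w) \bigr) = \frac{x + y}{2}(1, 1, 0, 0) + \frac{z + w}{2}(0, 0, 1, 1) \]
is the vector of the subspace $W$ closest to $(x, y, z, w)$, for any vector $(x, y, z, w) \in \mathbb{R}^4$.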
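Theorem 5.4.10 gives a direct recipe for computing the projection matrix: collect an orthonormal basis of $W$ as the columns of $A$ and form $AA^t$. A minimal floating point sketch, applied to $W = \{(x, y, z, w) : x = y, z = w\}$, is given below; the function name `projection_matrix` and the sample vector `x` are choices made for this illustration.

```python
import math

def projection_matrix(onb):
    """Return A A^t, where the columns of A are the orthonormal vectors in onb.

    Entry (i, j) of A A^t is the sum over the basis vectors v of v[i] * v[j].
    """
    n = len(onb[0])
    return [[sum(v[i] * v[j] for v in onb) for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
onb_W = [(s, s, 0.0, 0.0), (0.0, 0.0, s, s)]   # orthonormal basis of W
P = projection_matrix(onb_W)

x = (3.0, 1.0, 5.0, -1.0)
Px = [sum(P[i][j] * x[j] for j in range(4)) for i in range(4)]
for row in P:
    print([round(e, 4) for e in row])
print([round(e, 4) for e in Px])
```

For the sample vector, the formula $\frac{x + y}{2}(1, 1, 0, 0) + \frac{z + w}{2}(0, 0, 1, 1)$ gives $(2, 2, 2, 2)$, which the matrix product reproduces up to rounding.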
Exercise 5.4.12 1. Show that for any non-zero vector $v^t \in \mathbb{R}^n$, $\mathrm{rank}(vv^t) = 1$.
2. Let $W$ be a subspace of an inner product space $V$ and let $P : V \longrightarrow V$ be the orthogonal projection of $V$ onto $W$. Let $B$ be an orthonormal ordered basis of $V$. Then prove that $(P[B, B])^t = P[B, B]$.
3. Let $W_1 = \{(x, 0) : x \in \mathbb{R}\}$ and $W_2 = \{(x, x) : x \in \mathbb{R}\}$ be two subspaces of $\mathbb{R}^2$. Let $P_{W_1}$ and $P_{W_2}$ be the corresponding orthogonal projection operators of $\mathbb{R}^2$ onto $W_1$ and $W_2$, respectively. Compute $P_{W_1} \circ P_{W_2}$ and conclude that the composition of two orthogonal projections need not be an orthogonal projection.
4. Let $W$ be an $(n - 1)$-dimensional subspace of $\mathbb{R}^n$. Suppose $B$ is an orthogonal ordered basis of $\mathbb{R}^n$ obtained by extending an orthogonal ordered basis of $W$. Define
\[ T : \mathbb{R}^n \longrightarrow \mathbb{R}^n \quad \text{by} \quad T(v) = w_0 - w \]
whenever $v = w + w_0$ for some $w \in W$ and $w_0 \in W^\perp$. Then
(a) prove that $T$ is a linear transformation,
(b) find $T[B, B]$ and
(c) prove that $T[B, B]$ is an orthogonal matrix.
$T$ is called the reflection operator along $W^\perp$.
5.5 QR Decomposition

The next result gives the proof of the QR decomposition for real matrices. A similar result holds
for matrices with complex entries. The readers are advised to prove that for themselves. This de-
composition and its generalizations are helpful in the numerical calculations related with eigenvalue
problems (see Chapter 6).
Theorem 5.5.1 (QR Decomposition) Let A be a square matrix of order n with real entries.
Then there exist matrices Q and R such that Q is orthogonal and R is upper triangular with
A = QR.
In case, A is non-singular, the diagonal entries of R can be chosen to be positive. Also, in this
case, the decomposition is unique.
Proof. We prove the theorem when $A$ is non-singular. The proof for the singular case is left as an exercise.
Let the columns of $A$ be $x_1, x_2, \dots, x_n$. Then $\{x_1, x_2, \dots, x_n\}$ is a basis of $\mathbb{R}^n$ and hence the Gram-Schmidt orthogonalization process gives an ordered basis (see Remark 5.3.4), say $B = (v_1, v_2, \dots, v_n)$ of $\mathbb{R}^n$ satisfying
\[ \left. \begin{array}{l} L(v_1, v_2, \dots, v_i) = L(x_1, x_2, \dots, x_i), \\ \|v_i\| = 1, \ \langle v_i, v_j\rangle = 0, \end{array} \right\} \quad \text{for } 1 \le i \neq j \le n. \tag{5.5.5} \]
As $x_i \in \mathbb{R}^n$ and $x_i \in L(v_1, v_2, \dots, v_i)$, we can find $\alpha_{ji}$, $1 \le j \le i$, such that
\[ x_i = \alpha_{1i} v_1 + \alpha_{2i} v_2 + \cdots + \alpha_{ii} v_i, \quad \text{that is,} \quad [x_i]_B = (\alpha_{1i}, \dots, \alpha_{ii}, 0, \dots, 0)^t. \tag{5.5.6} \]
Now define $Q = [v_1, v_2, \dots, v_n]$ and
\[ R = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ 0 & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{nn} \end{pmatrix}. \]
Then by Exercise 5.3.8.4, $Q$ is an orthogonal matrix and using (5.5.6), we get
\[ QR = [v_1, v_2, \dots, v_n] \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ 0 & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{nn} \end{pmatrix} = \Bigl[ \alpha_{11} v_1, \ \alpha_{12} v_1 + \alpha_{22} v_2, \ \dots, \ \sum_{i=1}^{n} \alpha_{in} v_i \Bigr] = [x_1, x_2, \dots, x_n] = A. \]
Thus, we see that $A = QR$, where $Q$ is an orthogonal matrix (see Remark 5.3.4.1) and $R$ is an upper triangular matrix.
The proof does not guarantee that for $1 \le i \le n$, $\alpha_{ii}$ is positive. But this can be achieved by replacing the vector $v_i$ by $-v_i$ whenever $\alpha_{ii}$ is negative.
Uniqueness: suppose $Q_1 R_1 = Q_2 R_2$, where the diagonal entries of $R_1$ and $R_2$ are positive. Then $Q_2^{-1} Q_1 = R_2 R_1^{-1}$. Observe the following properties of upper triangular matrices.
1. The inverse of an upper triangular matrix is also an upper triangular matrix, and
2. the product of upper triangular matrices is also upper triangular.
Thus the matrix $R_2 R_1^{-1}$ is an upper triangular matrix. Also, by Exercise 5.3.8.3, the matrix $Q_2^{-1} Q_1$ is an orthogonal matrix. Hence, by Exercise 5.3.8.5, $R_2 R_1^{-1} = I_n$. So, $R_2 = R_1$ and therefore $Q_2 = Q_1$.
Let $A = [x_1, x_2, \dots, x_k]$ be an $n \times k$ matrix with $\mathrm{rank}(A) = r$. Then by Remark 5.3.4.1c, the Gram-Schmidt orthogonalization process applied to $\{x_1, x_2, \dots, x_k\}$ yields a set $\{v_1, v_2, \dots, v_r\}$ of orthonormal vectors of $\mathbb{R}^n$ and for each $i$, $1 \le i \le r$, we have
\[ L(v_1, v_2, \dots, v_i) = L(x_1, x_2, \dots, x_j), \quad \text{for some } j, \ i \le j \le k. \]
Hence, proceeding on the lines of the above theorem, we have the following result.
Theorem 5.5.2 (Generalized QR Decomposition) Let $A$ be an $n \times k$ matrix of rank $r$. Then $A = QR$, where
1. $Q = [v_1, v_2, \dots, v_r]$ is an $n \times r$ matrix with $Q^t Q = I_r$,
2. $L(v_1, v_2, \dots, v_r) = L(x_1, x_2, \dots, x_k)$, and
3. $R$ is an $r \times k$ matrix with $\mathrm{rank}(R) = r$.
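Example 5.5.3 1. Let $A = \begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix}$. Find an orthogonal matrix $Q$ and an upper triangular matrix $R$ such that $A = QR$.
Solution: From Example 5.3.3, we know that
\[ v_1 = \frac{1}{\sqrt{2}}(1, 0, 1, 0), \quad v_2 = \frac{1}{\sqrt{2}}(0, 1, 0, 1) \quad \text{and} \quad v_3 = \frac{1}{\sqrt{2}}(0, -1, 0, 1). \tag{5.5.7} \]
We now compute $w_4$. If we denote $u_4 = (2, 1, 1, 1)^t$ then
\[ w_4 = u_4 - \langle u_4, v_1\rangle v_1 - \langle u_4, v_2\rangle v_2 - \langle u_4, v_3\rangle v_3 = \frac{1}{2}(1, 0, -1, 0)^t. \tag{5.5.8} \]
Thus, using Equations (5.5.7), (5.5.8) and $Q = [v_1, v_2, v_3, v_4]$, we get
\[ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 & 0 & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} \sqrt{2} & 0 & \sqrt{2} & \frac{3}{\sqrt{2}} \\ 0 & \sqrt{2} & 0 & \sqrt{2} \\ 0 & 0 & \sqrt{2} & 0 \\ 0 & 0 & 0 & \frac{1}{\sqrt{2}} \end{pmatrix}. \]
The readers are advised to check that $A = QR$ is indeed correct.
2. Let $A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ -1 & 0 & -2 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 2 & 1 \end{pmatrix}$. Find a $4 \times 3$ matrix $Q$ satisfying $Q^t Q = I_3$ and an upper triangular matrix $R$ such that $A = QR$.
Solution: Let us apply the Gram-Schmidt orthogonalization process to the columns of $A$. That is, apply the process to the subset $\{(1, -1, 1, 1), (1, 0, 1, 0), (1, -2, 1, 2), (0, 1, 0, 1)\}$ of $\mathbb{R}^4$.
Let $u_1 = (1, -1, 1, 1)$. Define $v_1 = \frac{1}{2} u_1$. Let $u_2 = (1, 0, 1, 0)$. Then
\[ w_2 = (1, 0, 1, 0) - \langle u_2, v_1\rangle v_1 = (1, 0, 1, 0) - v_1 = \frac{1}{2}(1, 1, 1, -1). \]
Hence, $v_2 = \frac{1}{2}(1, 1, 1, -1)$. Let $u_3 = (1, -2, 1, 2)$. Then
\[ w_3 = u_3 - \langle u_3, v_1\rangle v_1 - \langle u_3, v_2\rangle v_2 = u_3 - 3v_1 + v_2 = 0. \]
So, we again take $u_3 = (0, 1, 0, 1)$. Then
\[ w_3 = u_3 - \langle u_3, v_1\rangle v_1 - \langle u_3, v_2\rangle v_2 = u_3 - 0v_1 - 0v_2 = u_3. \]
So, $v_3 = \frac{1}{\sqrt{2}}(0, 1, 0, 1)$. Hence,
\[ Q = [v_1, v_2, v_3] = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{1}{2} & \frac{1}{\sqrt{2}} \\ \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{\sqrt{2}} \end{pmatrix}, \quad \text{and} \quad R = \begin{pmatrix} 2 & 1 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & \sqrt{2} \end{pmatrix}. \]
The readers are advised to check the following:
(a) $\mathrm{rank}(A) = 3$,
(b) $A = QR$ with $Q^t Q = I_3$, and
(c) $R$ is a $3 \times 4$ upper triangular matrix with $\mathrm{rank}(R) = 3$.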
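The proof of Theorem 5.5.1 is constructive, so the factors $Q$ and $R$ can be computed by running the Gram-Schmidt process on the columns and recording the coefficients $\alpha_{ji} = \langle x_i, v_j\rangle$. The following is a minimal floating point sketch for a non-singular matrix given by its columns; the function name `qr_gram_schmidt` and the column-list representation are assumptions of this example, and the code divides by zero if the columns are linearly dependent. The input is the matrix of the first part of Example 5.5.3, with the signs obtained from the computation in Example 5.3.3.

```python
import math

def qr_gram_schmidt(columns):
    """Return (q_cols, R) with column i of A equal to sum_j R[j][i] * q_cols[j].

    Assumes the columns are linearly independent.
    """
    n = len(columns)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(columns):
        w = [float(c) for c in x]
        for j, v in enumerate(q_cols):
            R[j][i] = sum(a * b for a, b in zip(x, v))      # alpha_{ji} = <x_i, v_j>
            w = [wi - R[j][i] * vi for wi, vi in zip(w, v)]
        R[i][i] = math.sqrt(sum(c * c for c in w))          # alpha_{ii} = ||w_i|| > 0
        q_cols.append([c / R[i][i] for c in w])
    return q_cols, R

cols = [(1, 0, 1, 0), (0, 1, 0, 1), (1, -1, 1, 1), (2, 1, 1, 1)]
q_cols, R = qr_gram_schmidt(cols)

# Rebuild each column of A from Q and R as a check of A = QR
rebuilt = [[sum(R[j][i] * q_cols[j][r] for j in range(4)) for r in range(4)]
           for i in range(4)]
for row in R:
    print([round(e, 4) for e in row])
```

Because the diagonal entries are taken to be the norms $\|w_i\|$, they are positive automatically, which is the normalization under which the decomposition is unique.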
Chapter 6
Eigenvalues, Eigenvectors and
Diagonalization
6.1 Introduction and Definitions
In this chapter, the linear transformations are from the complex vector space $\mathbb{C}^n$ to itself. Observe that in this case, the matrix of the linear transformation is an $n \times n$ matrix. So, in this chapter, all the matrices are square matrices and a vector $x$ means $x = (x_1, x_2, \dots, x_n)^t$ for some positive integer $n$.
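Example 6.1.1 Let $A$ be a real symmetric matrix. Consider the following problem:
\[ \text{Maximize (Minimize) } x^t A x \text{ such that } x \in \mathbb{R}^n \text{ and } x^t x = 1. \]
To solve this, consider the Lagrangian
\[ L(x, \lambda) = x^t A x - \lambda(x^t x - 1) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j - \lambda \Bigl( \sum_{i=1}^{n} x_i^2 - 1 \Bigr). \]
Partially differentiating $L(x, \lambda)$ with respect to $x_i$ for $1 \le i \le n$, we get
\[ \frac{\partial L}{\partial x_1} = 2a_{11}x_1 + 2a_{12}x_2 + \cdots + 2a_{1n}x_n - 2\lambda x_1, \]
\[ \frac{\partial L}{\partial x_2} = 2a_{21}x_1 + 2a_{22}x_2 + \cdots + 2a_{2n}x_n - 2\lambda x_2, \]
and so on, till
\[ \frac{\partial L}{\partial x_n} = 2a_{n1}x_1 + 2a_{n2}x_2 + \cdots + 2a_{nn}x_n - 2\lambda x_n. \]
Therefore, to get the points of extremum, we solve for
\[ (0, 0, \dots, 0)^t = \Bigl( \frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2}, \dots, \frac{\partial L}{\partial x_n} \Bigr)^t = \frac{\partial L}{\partial x} = 2(Ax - \lambda x). \]
We therefore need to find a $\lambda \in \mathbb{R}$ and $0 \neq x \in \mathbb{R}^n$ such that $Ax = \lambda x$ for the extremal problem.
Let $A$ be a matrix of order $n$. In general, we ask the question:
For what values of $\lambda \in \mathbb{F}$ does there exist a non-zero vector $x \in \mathbb{F}^n$ such that
\[ Ax = \lambda x? \tag{6.1.1} \]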
Here, $\mathbb{F}^n$ stands for either the vector space $\mathbb{R}^n$ over $\mathbb{R}$ or $\mathbb{C}^n$ over $\mathbb{C}$. Equation (6.1.1) is equivalent to the equation
\[ (A - \lambda I)x = 0. \]
By Theorem 2.4.1, this system of linear equations has a non-zero solution if
\[ \mathrm{rank}(A - \lambda I) < n, \quad \text{or equivalently} \quad \det(A - \lambda I) = 0. \]
So, to solve (6.1.1), we are forced to choose those values of $\lambda \in \mathbb{F}$ for which $\det(A - \lambda I) = 0$. Observe that $\det(A - \lambda I)$ is a polynomial in $\lambda$ of degree $n$. We are therefore led to the following definition.
Definition 6.1.2 (Characteristic Polynomial, Characteristic Equation) Let $A$ be a square matrix of order $n$. The polynomial $\det(A - \lambda I)$ is called the characteristic polynomial of $A$ and is denoted by $p_A(\lambda)$ (in short, $p(\lambda)$, if the matrix $A$ is clear from the context). The equation $p(\lambda) = 0$ is called the characteristic equation of $A$. If $\lambda \in \mathbb{F}$ is a solution of the characteristic equation $p(\lambda) = 0$, then $\lambda$ is called a characteristic value of $A$.
Some books use the term eigenvalue in place of characteristic value.
Theorem 6.1.3 Let $A \in M_n(\mathbb{F})$. Suppose $\lambda = \lambda_0 \in \mathbb{F}$ is a root of the characteristic equation. Then there exists a non-zero $v \in \mathbb{F}^n$ such that $Av = \lambda_0 v$.
Proof. Since $\lambda_0$ is a root of the characteristic equation, $\det(A - \lambda_0 I) = 0$. This shows that the matrix $A - \lambda_0 I$ is singular and therefore by Theorem 2.4.1 the linear system
\[ (A - \lambda_0 I_n)x = 0 \]
has a non-zero solution.
Remark 6.1.4 Observe that the linear system $Ax = \lambda x$ has a solution $x = 0$ for every $\lambda \in \mathbb{F}$. So, we consider only those $x \in \mathbb{F}^n$ that are non-zero and are also solutions of the linear system $Ax = \lambda x$.
Definition 6.1.5 (Eigenvalue and Eigenvector) Let $A \in M_n(\mathbb{F})$ and let the linear system $Ax = \lambda x$ have a non-zero solution $x \in \mathbb{F}^n$ for some $\lambda \in \mathbb{F}$. Then
1. $\lambda \in \mathbb{F}$ is called an eigenvalue of $A$,
2. $x \in \mathbb{F}^n$ is called an eigenvector corresponding to the eigenvalue $\lambda$ of $A$, and
3. the tuple $(\lambda, x)$ is called an eigen-pair.
Remark 6.1.6 To understand the difference between a characteristic value and an eigenvalue, we give the following example.
Let $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Then $p_A(\lambda) = \lambda^2 + 1$. Also, define the linear operator $T_A : \mathbb{F}^2 \longrightarrow \mathbb{F}^2$ by $T_A(x) = Ax$ for every $x \in \mathbb{F}^2$.
1. Suppose $\mathbb{F} = \mathbb{C}$, i.e., $A \in M_2(\mathbb{C})$. Then the roots of $p(\lambda) = 0$ in $\mathbb{C}$ are $\pm i$. So, $A$ has $(i, (1, i)^t)$ and $(-i, (i, 1)^t)$ as eigen-pairs.
2. If $A \in M_2(\mathbb{R})$, then $p(\lambda) = 0$ has no solution in $\mathbb{R}$. Therefore, if $\mathbb{F} = \mathbb{R}$, then $A$ has no eigenvalue but it has $\pm i$ as characteristic values.
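The distinction between the two fields can be made concrete for $2 \times 2$ matrices, where $\det(A - \lambda I) = \lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$ and the roots follow from the quadratic formula. The sketch below is an illustration only; the function name `characteristic_roots_2x2` and the tolerance used to decide whether a root is real are assumptions of this example.

```python
import cmath

def characteristic_roots_2x2(a, b, c, d):
    """Roots in C of det(A - t I) = t^2 - (a + d) t + (ad - bc) for A = [[a, b], [c, d]]."""
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

roots = characteristic_roots_2x2(0, 1, -1, 0)                     # p(t) = t^2 + 1
real_roots = [r.real for r in roots if abs(r.imag) < 1e-12]
print(roots, real_roots)
```

For this matrix the list of real roots is empty, which is exactly the statement that $A$ has no eigenvalue over $\mathbb{R}$ although it has two characteristic values in $\mathbb{C}$.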
Remark 6.1.7 1. Let $A \in M_n(\mathbb{F})$. Suppose $(\lambda, x)$ is an eigen-pair of $A$. Then for any $c \in \mathbb{F}$, $c \neq 0$, $(\lambda, cx)$ is also an eigen-pair for $A$. Similarly, if $x_1, x_2, \dots, x_r$ are linearly independent eigenvectors of $A$ corresponding to the eigenvalue $\lambda$, then $\sum_{i=1}^{r} c_i x_i$ is also an eigenvector of $A$ corresponding to $\lambda$ if at least one $c_i \neq 0$. Hence, if $S$ is a collection of eigenvectors, it is implicitly understood that the set $S$ is linearly independent.
2. Suppose $p_A(\lambda_0) = 0$ for some $\lambda_0 \in \mathbb{F}$. Then $A - \lambda_0 I$ is singular. If $\mathrm{rank}(A - \lambda_0 I) = r$ then $r < n$. Hence, by Theorem 2.4.1 on page 41, the system $(A - \lambda_0 I)x = 0$ has $n - r$ linearly independent solutions. That is, $A$ has $n - r$ linearly independent eigenvectors corresponding to $\lambda_0$ whenever $\mathrm{rank}(A - \lambda_0 I) = r$.
Example 6.1.8 1. Let $A = \mathrm{diag}(d_1, d_2, \ldots, d_n)$ with $d_i \in \mathbb{R}$ for $1 \le i \le n$. Then $p(\lambda) = \prod_{i=1}^{n} (\lambda - d_i)$ and the eigen-pairs are $(d_1, e_1), (d_2, e_2), \ldots, (d_n, e_n)$.
2. Let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Then $p(\lambda) = (1 - \lambda)^2$. Hence, the characteristic equation has roots 1, 1. That is, 1 is a repeated eigenvalue. But the system $(A - I_2)x = 0$ for $x = (x_1, x_2)^t$ implies that $x_2 = 0$. Thus, $x = (x_1, 0)^t$ is a solution of $(A - I_2)x = 0$. Hence using Remark 6.1.7.1, $(1, 0)^t$ is an eigenvector. Therefore, note that 1 is a repeated eigenvalue whereas there is only one eigenvector.
3. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Then $p(\lambda) = (1 - \lambda)^2$. Again, 1 is a repeated root of $p(\lambda) = 0$. But in this case, the system $(A - I_2)x = 0$ has a solution for every $x^t \in \mathbb{R}^2$. Hence, we can choose any two linearly independent vectors $x^t, y^t$ from $\mathbb{R}^2$ to get $(1, x)$ and $(1, y)$ as the two eigen-pairs. In general, if $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$ are linearly independent vectors then $(1, x_1), (1, x_2), \ldots, (1, x_n)$ are eigen-pairs of the identity matrix, $I_n$.
4. Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Then $p(\lambda) = (\lambda - 3)(\lambda + 1)$ and its roots are $3, -1$. Verify that the eigen-pairs are $(3, (1, 1)^t)$ and $(-1, (1, -1)^t)$. The readers are advised to prove the linear independence of the two eigenvectors.
5. Let $A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$. Then $p(\lambda) = \lambda^2 - 2\lambda + 2$ and its roots are $1 + i, 1 - i$. Hence, over $\mathbb{R}$, the matrix A has no eigenvalue. Over $\mathbb{C}$, the reader is required to show that the eigen-pairs are $(1 + i, (i, 1)^t)$ and $(1 - i, (1, i)^t)$.
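The 2 × 2 computations above can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper names `eigenvalues_2x2` and `is_eigenpair` are hypothetical, and the code simply applies the quadratic formula to $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$. It is tried on the symmetric matrix with rows (1, 2), (2, 1) and on a real matrix whose characteristic polynomial is $\lambda^2 - 2\lambda + 2$.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Roots of the characteristic polynomial of [[a, b], [c, d]].

    For a 2 x 2 matrix, det(A - t I) = t^2 - (a + d) t + (a d - b c),
    so the roots follow from the quadratic formula (over C).
    """
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def is_eigenpair(matrix, value, vector, tol=1e-12):
    """Check A v = value * v entry by entry for a 2 x 2 matrix."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (abs(a * x + b * y - value * x) < tol
            and abs(c * x + d * y - value * y) < tol)


# Real eigenvalues: the matrix with rows (1, 2) and (2, 1).
print(eigenvalues_2x2(1, 2, 2, 1))

# No real eigenvalues: characteristic polynomial t^2 - 2t + 2.
print(eigenvalues_2x2(1, -1, 1, 1))
print(is_eigenpair(((1, -1), (1, 1)), 1 + 1j, (1j, 1)))
```

Working over `cmath` rather than `math` is a deliberate choice here: it lets the same function report the complex characteristic values of a real matrix that has no real eigenvalue.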
Exercise 6.1.9 1. Find the eigenvalues of a triangular matrix.
2. Find eigen-pairs over $\mathbb{C}$, for each of the following matrices:
$$\begin{bmatrix} 1 & 1+i \\ 1-i & 1 \end{bmatrix}, \quad \begin{bmatrix} i & 1+i \\ -1+i & i \end{bmatrix}, \quad \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}.$$
3. Let A and B be similar matrices.
(a) Then prove that A and B have the same set of eigenvalues.
(b) If $B = PAP^{-1}$ for some invertible matrix P then prove that $Px$ is an eigenvector of B if and only if $x$ is an eigenvector of A.
4. Let $A = (a_{ij})$ be an $n \times n$ matrix. Suppose that for all $i$, $1 \le i \le n$, $\sum_{j=1}^{n} a_{ij} = a$. Then prove that $a$ is an eigenvalue of A. What is the corresponding eigenvector?
5. Prove that the matrices A and $A^t$ have the same set of eigenvalues. Construct a $2 \times 2$ matrix A such that the eigenvectors of A and $A^t$ are different.
6. Let A be a matrix such that $A^2 = A$ (A is called an idempotent matrix). Then prove that its eigenvalues are either 0 or 1 or both.
7. Let A be a matrix such that $A^k = 0$ (A is called a nilpotent matrix) for some positive integer $k \ge 1$. Then prove that its eigenvalues are all 0.
8. Compute the eigen-pairs of the matrices $\begin{bmatrix} 2 & 1 \\ -1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 2 & i \\ i & 0 \end{bmatrix}$.
Theorem 6.1.10 Let $A = [a_{ij}]$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, not necessarily distinct. Then $\det(A) = \prod_{i=1}^{n} \lambda_i$ and $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \sum_{i=1}^{n} \lambda_i$.
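Proof. Since $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the n eigenvalues of A, by definition,
$$\det(A - \lambda I_n) = p(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n). \tag{6.1.2}$$
(6.1.2) is an identity in $\lambda$ as polynomials. Therefore, by substituting $\lambda = 0$ in (6.1.2), we get
$$\det(A) = (-1)^n (-1)^n \prod_{i=1}^{n} \lambda_i = \prod_{i=1}^{n} \lambda_i.$$
Also,
$$\det(A - \lambda I_n) = \begin{vmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{vmatrix} \tag{6.1.3}$$
$$= a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} \lambda^{n-1} a_{n-1} + (-1)^n \lambda^n \tag{6.1.4}$$
for some $a_0, a_1, \ldots, a_{n-1} \in F$. Note that $a_{n-1}$, the coefficient of $(-1)^{n-1} \lambda^{n-1}$, comes from the product
$$(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda).$$
So, $a_{n-1} = \sum_{i=1}^{n} a_{ii} = \mathrm{tr}(A)$ by definition of trace.
But, from (6.1.2) and (6.1.4), we get
$$a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} \lambda^{n-1} a_{n-1} + (-1)^n \lambda^n = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n). \tag{6.1.5}$$
Therefore, comparing the coefficient of $(-1)^{n-1} \lambda^{n-1}$, we have
$$\mathrm{tr}(A) = a_{n-1} = (-1)\Big\{ (-1) \sum_{i=1}^{n} \lambda_i \Big\} = \sum_{i=1}^{n} \lambda_i.$$
Hence, we get the required result.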
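As an illustration of the theorem (a worked check added here, not part of the original argument), consider the $3 \times 3$ matrix with 2 on the diagonal and 1 elsewhere, whose eigenvalues are 4, 1, 1. Both identities can be confirmed by direct computation:

```latex
\[
A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix},
\qquad \lambda_1 = 4,\ \lambda_2 = \lambda_3 = 1,
\]
\[
\det(A) = 2(4 - 1) - 1(2 - 1) + 1(1 - 2) = 4 = \lambda_1 \lambda_2 \lambda_3,
\qquad
\operatorname{tr}(A) = 2 + 2 + 2 = 6 = \lambda_1 + \lambda_2 + \lambda_3 .
\]
```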
Exercise 6.1.11 1. Let A be a skew symmetric matrix of order $2n + 1$. Then prove that 0 is an eigenvalue of A.
2. Let A be a $3 \times 3$ orthogonal matrix ($AA^t = I$). If $\det(A) = 1$, then prove that there exists a non-zero vector $v \in \mathbb{R}^3$ such that $Av = v$.
Let A be an $n \times n$ matrix. Then in the proof of the above theorem, we observed that the characteristic equation $\det(A - \lambda I) = 0$ is a polynomial equation of degree n in $\lambda$. Also, for some numbers $a_0, a_1, \ldots, a_{n-1} \in F$, it has the form
$$\lambda^n + a_{n-1} \lambda^{n-1} + a_{n-2} \lambda^{n-2} + \cdots + a_1 \lambda + a_0 = 0.$$
Note that, in the expression $\det(A - \lambda I) = 0$, $\lambda$ is an element of F. Thus, we can only substitute $\lambda$ by elements of F.
It turns out that the expression
$$A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = 0$$
holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications.
Theorem 6.1.12 (Cayley Hamilton Theorem) Let A be a square matrix of order n. Then A satisfies its characteristic equation. That is,
$$A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = 0$$
holds true as a matrix identity.
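A small worked check may help fix ideas; it is an illustration added here rather than part of the original text. Take the symmetric matrix with rows (1, 2) and (2, 1), whose characteristic polynomial has roots 3 and $-1$. The identity can be verified directly, and it immediately expresses higher powers of A in terms of A and I:

```latex
\[
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}, \qquad
\lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1), \qquad
A^2 = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix},
\]
\[
A^2 - 2A - 3I
= \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}
- \begin{bmatrix} 2 & 4 \\ 4 & 2 \end{bmatrix}
- \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},
\]
\[
A^3 = A(2A + 3I) = 2A^2 + 3A = 2(2A + 3I) + 3A = 7A + 6I
= \begin{bmatrix} 13 & 14 \\ 14 & 13 \end{bmatrix}.
\]
```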
Some of the implications of the Cayley Hamilton Theorem are as follows.
Remark 6.1.13 1. Let $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$. Then its characteristic polynomial is $p(\lambda) = \lambda^2$. Also, for the function $f(x) = x$, $f(0) = 0$, and $f(A) = A \neq 0$. This shows that the condition $f(\lambda) = 0$ for each eigenvalue $\lambda$ of A does not imply that $f(A) = 0$.
2. Suppose we are given a square matrix A of order n and we are interested in calculating $A^{\ell}$ where $\ell$ is large compared to n. Then we can use the division algorithm to find numbers $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ and a polynomial $f(\lambda)$ such that
$$\lambda^{\ell} = f(\lambda) \left( \lambda^n + a_{n-1} \lambda^{n-1} + a_{n-2} \lambda^{n-2} + \cdots + a_1 \lambda + a_0 \right) + \alpha_0 + \lambda \alpha_1 + \cdots + \lambda^{n-1} \alpha_{n-1}.$$
Hence, by the Cayley Hamilton Theorem,
$$A^{\ell} = \alpha_0 I + \alpha_1 A + \cdots + \alpha_{n-1} A^{n-1}.$$
That is, we just need to compute the powers of A till $n - 1$.
In the language of graph theory, it says the following:
Let G be a graph on n vertices. Suppose there is no path of length $n - 1$ or less from a vertex v to a vertex u of G. Then there is no path from v to u of any length. That is, the graph G is disconnected and v and u are in different components.
3. Let A be a non-singular matrix of order n. Then note that $a_0 = (-1)^n \det(A) \neq 0$ and
$$A^{-1} = \frac{-1}{a_0} \left[ A^{n-1} + a_{n-1} A^{n-2} + \cdots + a_1 I \right].$$
This matrix identity can be used to calculate the inverse.
Note that the vector $A^{-1}$ (as an element of the vector space of all $n \times n$ matrices) is a linear combination of the vectors $I, A, \ldots, A^{n-1}$.
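The inverse formula in item 3 can be turned into a short computation. The sketch below is one possible way to do it and is not taken from the notes: it obtains the coefficients $a_{n-1}, \ldots, a_0$ of the monic characteristic polynomial with the Faddeev-LeVerrier recursion (a technique swapped in here for convenience), and the helper names `mat_mul`, `identity`, `trace` and `char_poly_and_inverse` are hypothetical. Exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def char_poly_and_inverse(A):
    """Return ([a_{n-1}, ..., a_0], A^{-1}) for a non-singular matrix A.

    Faddeev-LeVerrier recursion: M_1 = I, a_{n-k} = -tr(A M_k) / k and
    M_{k+1} = A M_k + a_{n-k} I. By the Cayley Hamilton Theorem
    A M_n + a_0 I = 0, where M_n = A^{n-1} + a_{n-1} A^{n-2} + ... + a_1 I,
    so A^{-1} = -M_n / a_0.
    """
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    I = identity(n)
    M = I
    coeffs = []
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        a = -trace(AM) / k
        coeffs.append(a)
        if k < n:
            M = [[AM[i][j] + a * I[i][j] for j in range(n)] for i in range(n)]
    a0 = coeffs[-1]
    if a0 == 0:
        raise ValueError("matrix is singular")
    inverse = [[-M[i][j] / a0 for j in range(n)] for i in range(n)]
    return coeffs, inverse


A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
coeffs, A_inv = char_poly_and_inverse(A)
print(coeffs)   # coefficients a_2, a_1, a_0 of the monic characteristic polynomial
print(A_inv)
```

For this matrix the characteristic polynomial is $(\lambda - 4)(\lambda - 1)^2 = \lambda^3 - 6\lambda^2 + 9\lambda - 4$, so the formula reads $A^{-1} = \frac{1}{4}(A^2 - 6A + 9I)$.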
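Exercise 6.1.14 Find the inverse of the following matrices by using the Cayley Hamilton Theorem:
$$\text{i) } \begin{bmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 1 & 1 & 2 \end{bmatrix} \qquad \text{ii) } \begin{bmatrix} -1 & -1 & 1 \\ 1 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{iii) } \begin{bmatrix} 1 & -2 & -1 \\ -2 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}.$$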
Theorem 6.1.15 If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues of a matrix A with corresponding eigenvectors $x_1, x_2, \ldots, x_k$, then the set $\{x_1, x_2, \ldots, x_k\}$ is linearly independent.
Proof. The proof is by induction on the number m of eigenvalues. The result is obviously true if $m = 1$ as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.
Let the result be true for m, $1 \le m < k$. We prove the result for $m + 1$. We consider the equation
$$c_1 x_1 + c_2 x_2 + \cdots + c_{m+1} x_{m+1} = 0 \tag{6.1.6}$$
for the unknowns $c_1, c_2, \ldots, c_{m+1}$. We have
$$0 = A0 = A(c_1 x_1 + c_2 x_2 + \cdots + c_{m+1} x_{m+1}) = c_1 A x_1 + c_2 A x_2 + \cdots + c_{m+1} A x_{m+1}$$
$$= c_1 \lambda_1 x_1 + c_2 \lambda_2 x_2 + \cdots + c_{m+1} \lambda_{m+1} x_{m+1}. \tag{6.1.7}$$
From Equations (6.1.6) and (6.1.7), we get
$$c_2 (\lambda_2 - \lambda_1) x_2 + c_3 (\lambda_3 - \lambda_1) x_3 + \cdots + c_{m+1} (\lambda_{m+1} - \lambda_1) x_{m+1} = 0.$$
This is an equation in m eigenvectors. So, by the induction hypothesis, we have
$$c_i (\lambda_i - \lambda_1) = 0 \quad \text{for } 2 \le i \le m + 1.$$
But the eigenvalues being distinct implies $\lambda_i - \lambda_1 \neq 0$ for $2 \le i \le m + 1$. We therefore get $c_i = 0$ for $2 \le i \le m + 1$. Also, $x_1 \neq 0$ and therefore (6.1.6) gives $c_1 = 0$.
Thus, we have the required result.
We are thus led to the following important corollary.
Corollary 6.1.16 The eigenvectors corresponding to distinct eigenvalues are linearly independent.
Exercise 6.1.17 1. Let $A, B \in M_n(\mathbb{R})$. Prove that
(a) if $\lambda$ is an eigenvalue of A then $\lambda^k$ is an eigenvalue of $A^k$ for all $k \in \mathbb{Z}^+$.
(b) if A is invertible and $\lambda$ is an eigenvalue of A then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
(c) if A is nonsingular then $BA^{-1}$ and $A^{-1}B$ have the same set of eigenvalues.
(d) AB and BA have the same non-zero eigenvalues.
In each case, what can you say about the eigenvectors?
2. Let $A \in M_n(\mathbb{R})$ be an invertible matrix and let $x^t, y^t \in \mathbb{R}^n$. Define $B = x y^t A^{-1}$. Then prove that
(a) 0 is an eigenvalue of B of multiplicity $n - 1$ [Hint: Use Exercise 6.1.9.5].
(b) $\lambda_0 = y^t A^{-1} x$ is an eigenvalue of multiplicity 1.
(c) $1 + \alpha \lambda_0$ is an eigenvalue of $I + \alpha B$ of multiplicity 1 for any $\alpha \in \mathbb{R}$.
(d) 1 is an eigenvalue of $I + \alpha B$ of multiplicity $n - 1$ for any $\alpha \in \mathbb{R}$.
(e) $\det(A + \alpha x y^t)$ equals $(1 + \alpha \lambda_0) \det(A)$ for any $\alpha \in \mathbb{R}$. This result is known as the Sherman-Morrison formula for determinant.
3. Let $A, B \in M_2(\mathbb{R})$ such that $\det(A) = \det(B)$ and $\mathrm{tr}(A) = \mathrm{tr}(B)$.
(a) Do A and B have the same set of eigenvalues?
(b) Give examples to show that the matrices A and B need not be similar.
4. Let $A, B \in M_n(\mathbb{R})$. Also, let $(\lambda_1, u)$ be an eigen-pair for A and $(\lambda_2, v)$ be an eigen-pair for B.
(a) If $u = \alpha v$ for some $\alpha \in \mathbb{R}$ then $(\lambda_1 + \lambda_2, u)$ is an eigen-pair for $A + B$.
(b) Give an example to show that if u and v are linearly independent then $\lambda_1 + \lambda_2$ need not be an eigenvalue of $A + B$.
5. Let $A \in M_n(\mathbb{R})$ be an invertible matrix with eigen-pairs $(\lambda_1, u_1), (\lambda_2, u_2), \ldots, (\lambda_n, u_n)$. Then prove that $\mathcal{B} = \{u_1, u_2, \ldots, u_n\}$ forms a basis of $\mathbb{R}^n(\mathbb{R})$. If $[b]_{\mathcal{B}} = (c_1, c_2, \ldots, c_n)^t$ then the system $Ax = b$ has the unique solution
$$x = \frac{c_1}{\lambda_1} u_1 + \frac{c_2}{\lambda_2} u_2 + \cdots + \frac{c_n}{\lambda_n} u_n.$$
6.2 Diagonalization
Let $A \in M_n(F)$ and let $T_A : F^n \to F^n$ be the corresponding linear operator. In this section, we ask the following question: does there exist a basis $\mathcal{B}$ of $F^n$ such that $T_A[\mathcal{B}, \mathcal{B}]$, the matrix of the linear operator $T_A$ with respect to the ordered basis $\mathcal{B}$, is a diagonal matrix? It will be shown that for a certain class of matrices, the answer to the above question is in the affirmative. To start with, we have the following definition.
Definition 6.2.1 (Matrix Diagonalization) A matrix A is said to be diagonalizable if there exists a non-singular matrix P such that $P^{-1}AP$ is a diagonal matrix.
Remark 6.2.2 Let $A \in M_n(F)$ be a diagonalizable matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. By definition, A is similar to a diagonal matrix $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, as similar matrices have the same set of eigenvalues and the eigenvalues of a diagonal matrix are its diagonal entries.
Example 6.2.3 Let $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Then we have the following:
1. Let $V = \mathbb{R}^2$. Then A has no real eigenvalue (see Example 6.1.7) and hence A doesn't have eigenvectors that are vectors in $\mathbb{R}^2$. Hence, there does not exist any non-singular $2 \times 2$ real matrix P such that $P^{-1}AP$ is a diagonal matrix.
2. In case $V = \mathbb{C}^2(\mathbb{C})$, the two complex eigenvalues of A are $-i, i$ and the corresponding eigenvectors are $(i, 1)^t$ and $(-i, 1)^t$, respectively. Also, $(i, 1)^t$ and $(-i, 1)^t$ can be taken as a basis of $\mathbb{C}^2(\mathbb{C})$. Define $U = \frac{1}{\sqrt{2}} \begin{bmatrix} i & -i \\ 1 & 1 \end{bmatrix}$. Then
$$U^* A U = \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix}.$$
Theorem 6.2.4 Let $A \in M_n(\mathbb{R})$. Then A is diagonalizable if and only if A has n linearly independent eigenvectors.
Proof. Let A be diagonalizable. Then there exist matrices P and D such that
$$P^{-1} A P = D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n).$$
Or equivalently, $AP = PD$. Let $P = [u_1, u_2, \ldots, u_n]$. Then $AP = PD$ implies that
$$A u_i = \lambda_i u_i \quad \text{for } 1 \le i \le n.$$
Since the $u_i$'s are the columns of a non-singular matrix P, using Corollary 4.3.7, they form a linearly independent set. Thus, we have shown that if A is diagonalizable then A has n linearly independent eigenvectors.
Conversely, suppose A has n linearly independent eigenvectors $u_i$, $1 \le i \le n$, with eigenvalues $\lambda_i$. Then $A u_i = \lambda_i u_i$. Let $P = [u_1, u_2, \ldots, u_n]$. Since $u_1, u_2, \ldots, u_n$ are linearly independent, by Corollary 4.3.7, P is non-singular. Also,
$$AP = [A u_1, A u_2, \ldots, A u_n] = [\lambda_1 u_1, \lambda_2 u_2, \ldots, \lambda_n u_n] = [u_1, u_2, \ldots, u_n] \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = PD.$$
Therefore, the matrix A is diagonalizable.
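The construction in the converse direction (place the eigenvectors as the columns of P and the eigenvalues on the diagonal of D) can be replayed on a small case. This is only an illustrative sketch: the matrix with rows (1, 2), (2, 1) and its eigen-pairs $(3, (1, 1)^t)$, $(-1, (1, -1)^t)$ are supplied by hand, and `mat_mul` is a hypothetical helper.

```python
def mat_mul(X, Y):
    """Product of matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 2], [2, 1]]
eigen_pairs = [(3, [1, 1]), (-1, [1, -1])]   # assumed known in advance

n = len(A)
# P has the eigenvectors as its columns; D has the eigenvalues on its diagonal.
P = [[vec[i] for _, vec in eigen_pairs] for i in range(n)]
D = [[eigen_pairs[i][0] if i == j else 0 for j in range(n)] for i in range(n)]

det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]

print(mat_mul(A, P) == mat_mul(P, D))   # AP = PD
print(det_P != 0)                       # P is non-singular
```

Checking $AP = PD$ together with $\det(P) \neq 0$ avoids computing $P^{-1}$ explicitly, which keeps the arithmetic in integers.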
Corollary 6.2.5 If the eigenvalues of $A \in M_n(\mathbb{R})$ are distinct then A is diagonalizable.
Proof. As $A \in M_n(\mathbb{R})$, it has n eigenvalues. Since all the eigenvalues of A are distinct, by Corollary 6.1.16, the n eigenvectors are linearly independent. Hence, by Theorem 6.2.4, A is diagonalizable.
Corollary 6.2.6 Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $A \in M_n(\mathbb{R})$ and let $p(x)$ be its characteristic polynomial. Suppose that for each i, $1 \le i \le k$, $(x - \lambda_i)^{m_i}$ divides $p(x)$ but $(x - \lambda_i)^{m_i + 1}$ does not divide $p(x)$ for some positive integers $m_i$. Then A is diagonalizable if and only if $\dim\big(\ker(A - \lambda_i I)\big) = m_i$ for each i, $1 \le i \le k$. Or equivalently, A is diagonalizable if and only if $\mathrm{rank}(A - \lambda_i I) = n - m_i$ for each i, $1 \le i \le k$.
Proof. Suppose A is diagonalizable. Then, by Theorem 6.2.4, A has n linearly independent eigenvectors. Also, by assumption, $\sum_{i=1}^{k} m_i = n$ as $\deg(p(x)) = n$. Each of these n eigenvectors is a solution of the homogeneous linear system $(A - \lambda_i I)x = 0$ for exactly one i, and the number of linearly independent eigenvectors corresponding to $\lambda_i$ cannot exceed its multiplicity as a root of $p(x)$. Therefore, $\dim\big(\ker(A - \lambda_i I)\big) \le m_i$ for each i. As these dimensions must add up to at least $n = \sum_{i=1}^{k} m_i$, a simple counting argument gives $\dim\big(\ker(A - \lambda_i I)\big) = m_i$ for $1 \le i \le k$.
Now suppose that for each i, $1 \le i \le k$, $\dim\big(\ker(A - \lambda_i I)\big) = m_i$. Then for each i, $1 \le i \le k$, we can choose $m_i$ linearly independent eigenvectors. Also by Corollary 6.1.16, the eigenvectors corresponding to distinct eigenvalues are linearly independent. Hence A has $n = \sum_{i=1}^{k} m_i$ linearly independent eigenvectors. Hence by Theorem 6.2.4, A is diagonalizable.
Example 6.2.7 1. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 0 & -1 & 1 \end{bmatrix}$. Then $p_A(\lambda) = (2 - \lambda)^2 (1 - \lambda)$. Hence, the eigenvalues of A are 1, 2, 2. Verify that $\big(1, (1, 0, -1)^t\big)$ and $\big(2, (1, 1, -1)^t\big)$ are the only eigen-pairs. That is, the matrix A has exactly one eigenvector corresponding to the repeated eigenvalue 2. Hence, by Theorem 6.2.4, A is not diagonalizable.
2. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. Then $p_A(\lambda) = (4 - \lambda)(1 - \lambda)^2$. Hence, A has eigenvalues 1, 1, 4. Verify that $u_1 = (1, -1, 0)^t$ and $u_2 = (1, 0, -1)^t$ are eigenvectors corresponding to 1 and $u_3 = (1, 1, 1)^t$ is an eigenvector corresponding to the eigenvalue 4. As $u_1, u_2, u_3$ are linearly independent, by Theorem 6.2.4, A is diagonalizable.
Note that the vectors $u_1$ and $u_2$ (corresponding to the eigenvalue 1) are not orthogonal. So, in place of $u_1, u_2$, we will take the orthogonal vectors $u_2$ and $w = 2u_1 - u_2$ as eigenvectors. Now define
$$U = \left[ \tfrac{1}{\sqrt{3}} u_3, \ \tfrac{1}{\sqrt{2}} u_2, \ \tfrac{1}{\sqrt{6}} w \right] = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{-2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \end{bmatrix}.$$
Then U is an orthogonal matrix and $U^* A U = \mathrm{diag}(4, 1, 1)$.
Observe that A is a symmetric matrix. In this case, we chose our eigenvectors to be mutually orthogonal. This result is true for any real symmetric matrix A. This result will be proved later.
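The rank criterion of Corollary 6.2.6 lends itself to a direct check. The sketch below is illustrative only: `rank`, `shifted` and `is_diagonalizable` are hypothetical helper names, the eigenvalues and their multiplicities are supplied by hand rather than computed, and the rank is found by exact Gaussian elimination over the rationals. The two test matrices are the $3 \times 3$ matrices with rows (2, 1, 1), (1, 2, 1), (0, -1, 1) and (2, 1, 1), (1, 2, 1), (1, 1, 2).

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def shifted(A, lam):
    """The matrix A - lam * I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def is_diagonalizable(A, multiplicities):
    """Check rank(A - lam I) = n - m for every eigenvalue lam of multiplicity m.

    `multiplicities` maps each distinct eigenvalue to its multiplicity as a
    root of the characteristic polynomial (assumed known beforehand).
    """
    n = len(A)
    return all(rank(shifted(A, lam)) == n - m
               for lam, m in multiplicities.items())


A1 = [[2, 1, 1], [1, 2, 1], [0, -1, 1]]   # eigenvalues 1, 2, 2
A2 = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]    # eigenvalues 1, 1, 4

print(is_diagonalizable(A1, {2: 2, 1: 1}))
print(is_diagonalizable(A2, {1: 2, 4: 1}))
```

For the first matrix, $\mathrm{rank}(A - 2I) = 2$ while $n - m = 1$, so the test fails; for the second, both rank conditions hold.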
Exercise 6.2.8 1. Are the matrices $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ and $B = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$ for some $\theta$, $0 \le \theta \le 2\pi$, diagonalizable?
2. Find the eigen-pairs of $A = [a_{ij}]_{n \times n}$, where $a_{ij} = a$ if $i = j$ and $b$, otherwise.
3. Let $A \in M_n(\mathbb{R})$ and $B \in M_m(\mathbb{R})$. Suppose $C = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$. Then prove that C is diagonalizable if and only if both A and B are diagonalizable.
4. Let $T : \mathbb{R}^5 \to \mathbb{R}^5$ be a linear operator with $\mathrm{rank}(T - I) = 3$ and
$$\mathcal{N}(T) = \{ (x_1, x_2, x_3, x_4, x_5) \in \mathbb{R}^5 \mid x_1 + x_4 + x_5 = 0, \ x_2 + x_3 = 0 \}.$$
(a) Determine the eigenvalues of T.
(b) Find the number of linearly independent eigenvectors corresponding to each eigenvalue.
(c) Is T diagonalizable? Justify your answer.
5. Let A be a non-zero square matrix such that $A^2 = 0$. Prove that A cannot be diagonalized. [Hint: Use Remark 6.2.2.]
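6. Are the following matrices diagonalizable?
$$\text{i) } \begin{bmatrix} 1 & 3 & 2 & 1 \\ 0 & 2 & 3 & 1 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix}, \quad \text{ii) } \begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 1 \\ 0 & 2 & 0 \end{bmatrix}, \quad \text{iii) } \begin{bmatrix} 1 & -3 & 3 \\ 0 & -5 & 6 \\ 0 & -3 & 4 \end{bmatrix} \quad \text{and iv) } \begin{bmatrix} 2 & i \\ i & 0 \end{bmatrix}.$$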
6.3 Diagonalizable Matrices
In this section, we will look at some special classes of square matrices that are diagonalizable. Recall that for a matrix $A = [a_{ij}]$, $A^* = [\overline{a_{ji}}] = \overline{A^t} = \overline{A}^{\,t}$ is called the conjugate transpose of A. We also recall the following definitions.
Definition 6.3.1 (Special Matrices) 1. A matrix $A \in M_n(\mathbb{C})$ is called
(a) a Hermitian matrix if $A^* = A$.
(b) a unitary matrix if $A A^* = A^* A = I_n$.
(c) a skew-Hermitian matrix if $A^* = -A$.
(d) a normal matrix if $A^* A = A A^*$.
2. A matrix $A \in M_n(\mathbb{R})$ is called
(a) a symmetric matrix if $A^t = A$.
(b) an orthogonal matrix if $A A^t = A^t A = I_n$.
(c) a skew-symmetric matrix if $A^t = -A$.
Note that a symmetric matrix is always Hermitian, a skew-symmetric matrix is always skew-Hermitian and an orthogonal matrix is always unitary. Each of these matrices is normal. If A is a unitary matrix then $A^* = A^{-1}$.
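Example 6.3.2 1. Let $B = \begin{bmatrix} i & 1 \\ -1 & i \end{bmatrix}$. Then B is skew-Hermitian.
2. Let $A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$. Then A is a unitary matrix and B is a normal matrix. Note that $\sqrt{2} A$ is also a normal matrix.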
Definition 6.3.3 (Unitary Equivalence) Let $A, B \in M_n(\mathbb{C})$. They are called unitarily equivalent if there exists a unitary matrix U such that $A = U^* B U$. As U is a unitary matrix, $U^* = U^{-1}$. Hence, A is also unitarily similar to B.
Exercise 6.3.4 1. Let A be a square matrix such that $U A U^*$ is a diagonal matrix for some unitary matrix U. Prove that A is a normal matrix.
2. Let $A \in M_n(\mathbb{C})$. Then $A = \frac{1}{2}(A + A^*) + \frac{1}{2}(A - A^*)$, where $\frac{1}{2}(A + A^*)$ is the Hermitian part of A and $\frac{1}{2}(A - A^*)$ is the skew-Hermitian part of A. Recall that a similar result was given in Exercise 1.3.3.1.
3. Every square matrix can be uniquely expressed as $A = S + iT$, where both S and T are Hermitian matrices.
4. Let $A \in M_n(\mathbb{C})$. Prove that $A - A^*$ is always skew-Hermitian.
5. Does there exist a unitary matrix U such that $U^{-1} A U = B$ where
$$A = \begin{bmatrix} 1 & 1 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & -1 & 3\sqrt{2} \\ 0 & 1 & \sqrt{2} \\ 0 & 0 & 3 \end{bmatrix}?$$
Theorem 6.3.5 Let $A \in M_n(\mathbb{C})$ be a Hermitian matrix. Then
1. the eigenvalues, $\lambda_i$, $1 \le i \le n$, of A are real.
2. A is unitarily diagonalizable. That is, there exists a unitary matrix U such that $U^* A U = D$, where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. In other words, the eigenvectors of A form an orthonormal basis of $\mathbb{C}^n$.
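Proof. For the proof of Part 1, let $(\lambda, x)$ be an eigen-pair. Then $Ax = \lambda x$ and $A^* = A$ imply that $x^* A = x^* A^* = (Ax)^* = (\lambda x)^* = \overline{\lambda} x^*$. Hence,
$$\lambda x^* x = x^* (\lambda x) = x^* (Ax) = (x^* A) x = (\overline{\lambda} x^*) x = \overline{\lambda} x^* x.$$
As x is an eigenvector, $x \neq 0$ and therefore $\|x\|^2 = x^* x \neq 0$. Thus $\lambda = \overline{\lambda}$. That is, $\lambda$ is a real number.
For the proof of Part 2, we use induction on n, the size of the matrix. The result is clearly true for $n = 1$. Let the result be true for $n = k - 1$. We need to prove the result for $n = k$.
Let $(\lambda_1, x)$ be an eigen-pair of a $k \times k$ matrix A with $\|x\| = 1$. Then by Part 1, $\lambda_1 \in \mathbb{R}$. As $\{x\}$ is a linearly independent set, by Theorem 3.3.11 and the Gram-Schmidt Orthogonalization process, we get an orthonormal basis $\{x, u_2, \ldots, u_k\}$ of $\mathbb{C}^k$. Let $U_1 = [x, u_2, \ldots, u_k]$ (the vectors $x, u_2, \ldots, u_k$ are columns of the matrix $U_1$). Then $U_1$ is a unitary matrix. In particular, $u_i^* x = 0$ for $2 \le i \le k$. Therefore, for $2 \le i \le k$,
$$(A u_i)^* x = (u_i^* A^*) x = u_i^* (A^* x) = u_i^* (A x) = u_i^* (\lambda_1 x) = \lambda_1 (u_i^* x) = 0,$$
so that $x^* (A u_i) = \overline{(A u_i)^* x} = 0$, and
$$U_1^* A U_1 = U_1^* [Ax, A u_2, \ldots, A u_k] = \begin{bmatrix} x^* \\ u_2^* \\ \vdots \\ u_k^* \end{bmatrix} [\lambda_1 x, A u_2, \ldots, A u_k]$$
$$= \begin{bmatrix} \lambda_1 x^* x & \cdots & x^* A u_k \\ u_2^* (\lambda_1 x) & \cdots & u_2^* (A u_k) \\ \vdots & \ddots & \vdots \\ u_k^* (\lambda_1 x) & \cdots & u_k^* (A u_k) \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & B \end{bmatrix},$$
where B is a $(k - 1) \times (k - 1)$ matrix. As $(U_1^* A U_1)^* = U_1^* A U_1$ and $\lambda_1 \in \mathbb{R}$, the matrix B is also Hermitian. Therefore, by the induction hypothesis there exists a $(k - 1) \times (k - 1)$ unitary matrix $U_2$ such that $U_2^* B U_2 = D_2 = \mathrm{diag}(\lambda_2, \ldots, \lambda_k)$, where $\lambda_i \in \mathbb{R}$, for $2 \le i \le k$, are the eigenvalues of B. Define $U = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}$. Then U is a unitary matrix and
$$U^* A U = \left( U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \right)^* A \left( U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \right) = \left( \begin{bmatrix} 1 & 0 \\ 0 & U_2^* \end{bmatrix} U_1^* \right) A \left( U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \right)$$
$$= \begin{bmatrix} 1 & 0 \\ 0 & U_2^* \end{bmatrix} \left( U_1^* A U_1 \right) \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & U_2^* \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & B \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}$$
$$= \begin{bmatrix} \lambda_1 & 0 \\ 0 & U_2^* B U_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & D_2 \end{bmatrix}.$$
Thus, $U^* A U$ is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$, the eigenvalues of A. Hence, the result follows.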
Corollary 6.3.6 Let $A \in M_n(\mathbb{R})$ be a symmetric matrix. Then
1. the eigenvalues of A are all real,
2. the eigenvectors can be chosen to have real entries and
3. the eigenvectors also form an orthonormal basis of $\mathbb{R}^n$.
Proof. As A is symmetric, A is also a Hermitian matrix. Hence, by Theorem 6.3.5, the eigenvalues of A are all real. Let $(\lambda, x)$ be an eigen-pair of A. Suppose $x^t \in \mathbb{C}^n$. Then there exist $y^t, z^t \in \mathbb{R}^n$ such that $x = y + iz$. So,
$$Ax = \lambda x \implies A(y + iz) = \lambda (y + iz).$$
Comparing the real and imaginary parts, we get $Ay = \lambda y$ and $Az = \lambda z$. Thus, we can choose the eigenvectors to have real entries.
The readers are advised to prove the orthonormality of the eigenvectors (see the proof of Theorem 6.3.5).
Exercise 6.3.7 1. Let A be a skew-Hermitian matrix. Then the eigenvalues of A are either zero or purely imaginary. Also, the eigenvectors corresponding to distinct eigenvalues are mutually orthogonal. [Hint: Carefully see the proof of Theorem 6.3.5.]
2. Let A be a normal matrix with $(\lambda, x)$ as an eigen-pair. Then
(a) $(A^*)^k x$ for $k \in \mathbb{Z}^+$ is also an eigenvector corresponding to $\lambda$.
(b) $(\overline{\lambda}, x)$ is an eigen-pair for $A^*$. [Hint: Verify $\|A^* x - \overline{\lambda} x\|^2 = \|Ax - \lambda x\|^2$.]
3. Let A be an $n \times n$ unitary matrix. Then
(a) the rows of A form an orthonormal basis of $\mathbb{C}^n$.
(b) the columns of A form an orthonormal basis of $\mathbb{C}^n$.
(c) for any two vectors $x, y \in \mathbb{C}^{n \times 1}$, $\langle Ax, Ay \rangle = \langle x, y \rangle$.
(d) for any vector $x \in \mathbb{C}^{n \times 1}$, $\|Ax\| = \|x\|$.
(e) $|\lambda| = 1$ for any eigenvalue $\lambda$ of A.
(f) the eigenvectors x, y corresponding to distinct eigenvalues $\lambda$ and $\mu$ satisfy $\langle x, y \rangle = 0$. That is, if $(\lambda, x)$ and $(\mu, y)$ are eigen-pairs with $\lambda \neq \mu$, then x and y are mutually orthogonal.
4. Show that the matrices $A = \begin{bmatrix} 4 & 4 \\ 0 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 10 & 9 \\ -4 & -2 \end{bmatrix}$ are similar. Is it possible to find a unitary matrix U such that $A = U^* B U$?
5. Let A be a $2 \times 2$ orthogonal matrix. Then prove the following:
(a) if $\det(A) = 1$, then $A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ for some $\theta$, $0 \le \theta < 2\pi$. That is, A counter-clockwise rotates every point in $\mathbb{R}^2$ by an angle $\theta$.
(b) if $\det A = -1$, then $A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$ for some $\theta$, $0 \le \theta < 2\pi$. That is, A reflects every point in $\mathbb{R}^2$ about a line passing through origin. Determine this line. Or equivalently, there exists a non-singular matrix P such that $P^{-1} A P = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$.
6. Let A be a $3 \times 3$ orthogonal matrix. Then prove the following:
(a) if $\det(A) = 1$, then A is a rotation about a fixed axis, in the sense that A has an eigen-pair $(1, x)$ such that the restriction of A to the plane $x^{\perp}$ is a two dimensional rotation in $x^{\perp}$.
(b) if $\det A = -1$, then A corresponds to a reflection through a plane P, followed by a rotation about the line through origin that is orthogonal to P.
7. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. Find a non-singular matrix P such that $P^{-1} A P = \mathrm{diag}(4, 1, 1)$. Use this to compute $A^{301}$.
8. Let A be a Hermitian matrix. Then prove that rank(A) equals the number of non-zero eigenvalues of A.
Remark 6.3.8 Let A and B be the $2 \times 2$ matrices in Exercise 6.3.7.4. Then A and B were similar matrices but they were not unitarily equivalent. In numerical calculations, unitary transformations are preferred as compared to similarity transformations due to the following main reasons:
1. Exercise 6.3.7.3d implies that an orthonormal change of basis does not alter the sum of squares of the absolute values of the entries. This need not be true under a non-singular change of basis.
2. For a unitary matrix U, $U^{-1} = U^*$ and hence unitary equivalence is computationally simpler.
3. Also there is no round-off error in the operation of conjugate transpose.
We next prove Schur's Lemma and use it to show that normal matrices are unitarily diagonalizable. The proof is similar to the proof of Theorem 6.3.5. We give it again so that the readers have a better understanding of unitary transformations.
Lemma 6.3.9 (Schur's Lemma) Let $A \in M_n(\mathbb{C})$. Then A is unitarily similar to an upper triangular matrix.
Proof. We will prove the result by induction on n. The result is clearly true for $n = 1$. Let the result be true for $n = k - 1$. We need to prove the result for $n = k$.
Let $(\lambda_1, x)$ be an eigen-pair of a $k \times k$ matrix A with $\|x\| = 1$. Let us extend the set $\{x\}$, a linearly independent set, to form an orthonormal basis $\{x, u_2, u_3, \ldots, u_k\}$ (using Gram-Schmidt Orthogonalization) of $\mathbb{C}^k$. Then $U_1 = [x \ u_2 \ \cdots \ u_k]$ is a unitary matrix and
$$U_1^* A U_1 = U_1^* [Ax \ A u_2 \ \cdots \ A u_k] = \begin{bmatrix} x^* \\ u_2^* \\ \vdots \\ u_k^* \end{bmatrix} [\lambda_1 x \ A u_2 \ \cdots \ A u_k] = \begin{bmatrix} \lambda_1 & * \\ 0 & B \end{bmatrix},$$
where B is a $(k - 1) \times (k - 1)$ matrix. By the induction hypothesis there exists a $(k - 1) \times (k - 1)$ unitary matrix $U_2$ such that $U_2^* B U_2$ is an upper triangular matrix with diagonal entries $\lambda_2, \ldots, \lambda_k$, the eigenvalues of B. Define $U = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}$. Then check that U is a unitary matrix and $U^* A U$ is an upper triangular matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$, the eigenvalues of the matrix A. Hence, the result follows.
In Lemma 6.3.9, it can be observed that whenever A is a normal matrix then the matrix B is also a normal matrix. It is also known that if T is an upper triangular matrix that satisfies $T T^* = T^* T$ then T is a diagonal matrix (see Exercise 16). Thus, it follows that normal matrices are unitarily diagonalizable. We state it as a remark.
Remark 6.3.10 (The Spectral Theorem for Normal Matrices) Let A be an $n \times n$ normal matrix. Then there exists an orthonormal basis $\{x_1, x_2, \ldots, x_n\}$ of $\mathbb{C}^n(\mathbb{C})$ such that $A x_i = \lambda_i x_i$ for $1 \le i \le n$. In particular, if $U = [x_1, x_2, \ldots, x_n]$ then $U^* A U$ is a diagonal matrix.
Exercise 6.3.11 1. Let $A \in M_n(\mathbb{R})$ be an invertible matrix. Prove that $A A^t = P D P^t$, where P is an orthogonal matrix and D is a diagonal matrix with positive diagonal entries.
2. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{bmatrix}$, $B = \begin{bmatrix} 2 & -1 & \sqrt{2} \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}$ and $U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}$. Prove that A and B are unitarily equivalent via the unitary matrix U. Hence, conclude that the upper triangular matrix obtained in Schur's Lemma need not be unique.
3. Prove Remark 6.3.10.
4. Let A be a normal matrix. If all the eigenvalues of A are 0 then prove that $A = 0$. What happens if all the eigenvalues of A are 1?
5. Let A be an $n \times n$ matrix. Prove that if A is
(a) Hermitian and $x A x^* = 0$ for all $x \in \mathbb{C}^n$ then $A = 0$.
(b) a real, symmetric matrix and $x A x^t = 0$ for all $x \in \mathbb{R}^n$ then $A = 0$.
Do these results hold for arbitrary matrices?
We end this chapter with an application of the theory of diagonalization to the study of conic
sections in analytic geometry and the study of maxima and minima in analysis.
6.4 Sylvester's Law of Inertia and Applications
Definition 6.4.1 (Bilinear Form) Let A be an $n \times n$ real symmetric matrix. A bilinear form in $x = (x_1, x_2, \ldots, x_n)^t$, $y = (y_1, y_2, \ldots, y_n)^t$ is an expression of the type
$$Q(x, y) = y^t A x = \sum_{i,j=1}^{n} a_{ij} \, y_i x_j.$$
Definition 6.4.2 (Sesquilinear Form) Let A be an $n \times n$ Hermitian matrix. A sesquilinear form in $x = (x_1, x_2, \ldots, x_n)^*$, $y = (y_1, y_2, \ldots, y_n)^*$ is given by
$$H(x, y) = y^* A x = \sum_{i,j=1}^{n} a_{ij} \, \overline{y_i} \, x_j.$$
Observe that if $A = I_n$ then the bilinear (sesquilinear) form reduces to the standard real (complex) inner product. Also, it can be easily seen that $H(x, y)$ is linear in x, the first component, and conjugate linear in y, the second component. The expression $Q(x, x)$ is called the quadratic form and $H(x, x)$ the Hermitian form. We generally write $Q(x)$ and $H(x)$ in place of $Q(x, x)$ and $H(x, x)$, respectively. It can be easily shown that for any choice of x, the Hermitian form $H(x)$ is a real number. Hence, for any real number $\alpha$, the equation $H(x) = \alpha$ represents a conic in $\mathbb{C}^n$.
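Example 6.4.3 Let $A = \begin{bmatrix} 1 & 2 - i \\ 2 + i & 2 \end{bmatrix}$. Then $A^* = A$ and for $x = (x_1, x_2)^*$,
$$H(x) = x^* A x = (\overline{x_1}, \overline{x_2}) \begin{bmatrix} 1 & 2 - i \\ 2 + i & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
$$= \overline{x_1} x_1 + 2 \overline{x_2} x_2 + (2 - i) \overline{x_1} x_2 + (2 + i) \overline{x_2} x_1$$
$$= |x_1|^2 + 2 |x_2|^2 + 2 \, \mathrm{Re}[(2 - i) \overline{x_1} x_2]$$
where 'Re' denotes the real part of a complex number. This shows that for every choice of x the Hermitian form is always real. Why?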
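The claim that $H(x)$ is always real can be spot-checked numerically. The following is a small sketch under stated assumptions, not part of the original example: `hermitian_form` is a hypothetical helper that evaluates $x^* A x$ entry by entry with Python's built-in complex numbers, and the test vector is chosen arbitrarily.

```python
def hermitian_form(A, x):
    """Evaluate H(x) = x* A x as the double sum of conj(x_i) * a_ij * x_j."""
    n = len(x)
    return sum(x[i].conjugate() * A[i][j] * x[j]
               for i in range(n) for j in range(n))


A = [[1, 2 - 1j], [2 + 1j, 2]]   # the Hermitian matrix of the example
x = [1 + 2j, 3 - 1j]             # an arbitrary complex vector

value = hermitian_form(A, x)

# The closed form |x1|^2 + 2|x2|^2 + 2 Re[(2 - i) conj(x1) x2].
closed_form = (abs(x[0]) ** 2 + 2 * abs(x[1]) ** 2
               + 2 * ((2 - 1j) * x[0].conjugate() * x[1]).real)

print(value)        # the imaginary part vanishes
print(closed_form)
```

The two off-diagonal terms are complex conjugates of each other, which is why their sum is twice a real part and the imaginary parts cancel.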
The main idea of this section is to express $H(x)$ as a sum of squares and hence determine the possible values that it can take. Note that if we replace x by cx, where c is any complex number, then $H(x)$ simply gets multiplied by $|c|^2$ and hence one needs to study only those x for which $\|x\| = 1$, i.e., x is a normalized vector.
Let $A^* = A \in M_n(\mathbb{C})$. Then by Theorem 6.3.5, the eigenvalues $\lambda_i$, $1 \le i \le n$, of A are real and there exists a unitary matrix U such that $U^* A U = D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Now define $z = (z_1, z_2, \ldots, z_n)^* = U^* x$. Then $\|z\| = 1$, $x = Uz$ and
$$H(x) = z^* U^* A U z = z^* D z = \sum_{i=1}^{n} \lambda_i |z_i|^2 = \sum_{i=1}^{p} \left| \sqrt{|\lambda_i|} \, z_i \right|^2 - \sum_{i=p+1}^{r} \left| \sqrt{|\lambda_i|} \, z_i \right|^2. \tag{6.4.1}$$
Thus, the possible values of $H(x)$ depend only on the eigenvalues of A. Since U is an invertible matrix, the components $z_i$'s of $z = U^* x$ are commonly known as linearly independent linear forms. Also, note that in Equation (6.4.1), the number p (respectively $r - p$) seems to be related to the number of eigenvalues of A that are positive (respectively negative). This is indeed true. That is, in any expression of $H(x)$ as a sum of n absolute squares of linearly independent linear forms, the number p (respectively $r - p$) gives the number of positive (respectively negative) eigenvalues of A. This is stated as the next lemma and it is popularly known as Sylvester's law of inertia.
Lemma 6.4.4 Let $A \in M_n(\mathbb{C})$ be a Hermitian matrix and let $x = (x_1, x_2, \ldots, x_n)^*$. Then every Hermitian form $H(x) = x^* A x$, in n variables, can be written as
$$H(x) = |y_1|^2 + |y_2|^2 + \cdots + |y_p|^2 - |y_{p+1}|^2 - \cdots - |y_r|^2$$
where $y_1, y_2, \ldots, y_r$ are linearly independent linear forms in $x_1, x_2, \ldots, x_n$, and the integers p and r, satisfying $0 \le p \le r \le n$, depend only on A.
Proof. From Equation (6.4.1) it is easily seen that $H(x)$ has the required form. We only need to show that p and r are uniquely determined by A. Hence, let us assume on the contrary that there exist positive integers p, q, r, s with $p > q$ such that
$$H(x) = |y_1|^2 + |y_2|^2 + \cdots + |y_p|^2 - |y_{p+1}|^2 - \cdots - |y_r|^2 = |z_1|^2 + |z_2|^2 + \cdots + |z_q|^2 - |z_{q+1}|^2 - \cdots - |z_s|^2,$$
where $y = (y_1, y_2, \ldots, y_n)^* = Mx$ and $z = (z_1, z_2, \ldots, z_n)^* = Nx$ for some invertible matrices M and N. Hence, $z = By$ for some invertible matrix B. Let us write $Y_1 = (y_1, \ldots, y_p)^*$, $Z_1 = (z_1, \ldots, z_q)^*$ and $B = \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix}$, where $B_1$ is a $q \times p$ matrix. As $p > q$, the homogeneous linear system $B_1 Y_1 = 0$ has a non-zero solution. That is, we can achieve $Z_1 = 0$ with a non-zero $Y_1$ when the remaining $y_i$'s are zero. So, let $\tilde{Y}_1 = (\tilde{y}_1, \ldots, \tilde{y}_p)^*$ be a non-zero solution and let $\tilde{y}^* = (\tilde{Y}_1^*, 0^*)$. Then
$$H(\tilde{y}) = |\tilde{y}_1|^2 + |\tilde{y}_2|^2 + \cdots + |\tilde{y}_p|^2 = -\left( |z_{q+1}|^2 + \cdots + |z_s|^2 \right).$$
Now, this can hold only if $\tilde{y}_1 = \tilde{y}_2 = \cdots = \tilde{y}_p = 0$, which gives a contradiction. Hence $p = q$.
Similarly, the case $r > s$ can be resolved. Thus, the proof of the lemma is over.
Remark 6.4.5 The integer r is the rank of the matrix A and the number $r - 2p$ is sometimes called the inertial degree of A.
We complete this chapter by understanding the graph of
$$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0$$
for $a, b, c, f, g, h \in \mathbb{R}$. We first look at the following example.
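Example 6.4.6 Sketch the graph of $3x^2 + 4xy + 3y^2 = 5$.
Solution: Note that $3x^2 + 4xy + 3y^2 = [x, y] \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$ and the eigen-pairs of the matrix $\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$ are $(5, (1, 1)^t)$, $(1, (1, -1)^t)$. Thus,
$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}.$$
Now, let $u = \frac{x + y}{\sqrt{2}}$ and $v = \frac{x - y}{\sqrt{2}}$. Then
$$3x^2 + 4xy + 3y^2 = [x, y] \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = [x, y] \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
$$= [u, v] \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = 5u^2 + v^2.$$
Thus, the given graph reduces to $5u^2 + v^2 = 5$ or equivalently to $u^2 + \frac{v^2}{5} = 1$. Therefore, the given graph represents an ellipse with the principal axes $u = 0$ and $v = 0$ (corresponding to the lines $x + y = 0$ and $x - y = 0$, respectively). See Figure 6.4.6.
[Figure: the ellipse drawn together with the lines $y = x$ and $y = -x$]
Figure 1: The ellipse $3x^2 + 4xy + 3y^2 = 5$.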
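The change of variables in this example can be sanity-checked numerically. The snippet below is a sketch added for illustration (the function names are hypothetical): it compares the original quadratic form with $5u^2 + v^2$ at a few points, and then walks along the ellipse $u^2 + v^2/5 = 1$ in the $(u, v)$-plane, mapping each point back to $(x, y)$ to confirm that it satisfies the original equation up to floating-point error.

```python
import math


def quadratic_form(x, y):
    """The original form 3x^2 + 4xy + 3y^2."""
    return 3 * x * x + 4 * x * y + 3 * y * y


def to_principal_axes(x, y):
    """Coordinates u = (x + y)/sqrt(2), v = (x - y)/sqrt(2)."""
    return (x + y) / math.sqrt(2), (x - y) / math.sqrt(2)


def from_principal_axes(u, v):
    """Inverse change of variables (the transformation is its own inverse)."""
    return (u + v) / math.sqrt(2), (u - v) / math.sqrt(2)


# The two expressions agree at arbitrary points.
for x, y in [(1.0, 0.0), (0.5, -2.0), (-3.0, 1.25)]:
    u, v = to_principal_axes(x, y)
    print(math.isclose(quadratic_form(x, y), 5 * u * u + v * v))

# Points on u^2 + v^2/5 = 1 map to points on 3x^2 + 4xy + 3y^2 = 5.
for k in range(8):
    t = 2 * math.pi * k / 8
    x, y = from_principal_axes(math.cos(t), math.sqrt(5) * math.sin(t))
    print(math.isclose(quadratic_form(x, y), 5.0))
```

The semi-axis along $v$ (length $\sqrt{5}$) lies on the line $x + y = 0$ and the shorter one along $u$ (length 1) lies on $x - y = 0$, matching the principal axes identified above.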
We now consider the general conic. We obtain conditions on the eigenvalues of the associated quadratic form, defined below, to characterize conic sections in $\mathbb{R}^2$ (endowed with the standard inner product).
Definition 6.4.7 (Quadratic Form) Let $ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0$ be the equation of a general conic. The quadratic expression
$$ax^2 + 2hxy + by^2 = \begin{bmatrix} x, & y \end{bmatrix} \begin{bmatrix} a & h \\ h & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
is called the quadratic form associated with the given conic.
Proposition 6.4.8 For fixed real numbers a, b, c, g, f and h, consider the general conic
$$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0.$$
Then this conic represents
1. an ellipse if $ab - h^2 > 0$,
2. a parabola if $ab - h^2 = 0$, and
3. a hyperbola if $ab - h^2 < 0$.
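Proof. Let $A = \begin{bmatrix} a & h \\ h & b \end{bmatrix}$. Then $ax^2 + 2hxy + by^2 = [x \; y] \, A \begin{bmatrix} x \\ y \end{bmatrix}$ is the associated quadratic form. As $A$ is a symmetric matrix, by Corollary 6.3.6, the eigenvalues $\lambda_1, \lambda_2$ of $A$ are both real, the corresponding eigenvectors $u_1, u_2$ are orthonormal and $A$ is unitarily diagonalizable with
$$A = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} u_1^t \\ u_2^t \end{bmatrix}.$$
Let $\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u_1^t \\ u_2^t \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$. Then $ax^2 + 2hxy + by^2 = \lambda_1 u^2 + \lambda_2 v^2$ and the equation of the conic section in the $(u, v)$-plane reduces to
$$\lambda_1 u^2 + \lambda_2 v^2 + 2g_1 u + 2f_1 v + c = 0. \tag{6.4.2}$$
Now, depending on the eigenvalues $\lambda_1, \lambda_2$, we consider different cases:
1. $\lambda_1 = 0 = \lambda_2$. Substituting $\lambda_1 = \lambda_2 = 0$ in Equation (6.4.2) gives the straight line $2g_1 u + 2f_1 v + c = 0$ in the $(u, v)$-plane.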
138 CHAPTER 6. EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION
2. $\lambda_1 = 0$, $\lambda_2 > 0$. As $\lambda_1 = 0$, $\det(A) = 0$. That is, $ab - h^2 = \det(A) = 0$. Also, in this case, Equation (6.4.2) reduces to
$$\lambda_2 (v + d_1)^2 = d_2 u + d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R}.$$
To understand this case, we need to consider the following subcases:
(a) Let $d_2 = d_3 = 0$. Then $v + d_1 = 0$ is a pair of coincident lines.
(b) Let $d_2 = 0$, $d_3 \neq 0$.
i. If $d_3 > 0$, then we get a pair of parallel lines given by $v = -d_1 \pm \sqrt{\frac{d_3}{\lambda_2}}$.
ii. If $d_3 < 0$, the solution set of the corresponding conic is an empty set.
(c) Let $d_2 \neq 0$. Then the given equation is of the form $Y^2 = 4aX$ for some translates $X = x + \alpha$ and $Y = y + \beta$ and thus represents a parabola.
3. $\lambda_1 > 0$ and $\lambda_2 < 0$. In this case, $ab - h^2 = \det(A) = \lambda_1 \lambda_2 < 0$. Let $\lambda_2 = -\alpha_2$ with $\alpha_2 > 0$. Then Equation (6.4.2) can be rewritten as
$$\lambda_1 (u + d_1)^2 - \alpha_2 (v + d_2)^2 = d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R} \tag{6.4.3}$$
whose understanding requires the following subcases:
(a) Let $d_3 = 0$. Then Equation (6.4.3) reduces to
$$\left( \sqrt{\lambda_1}\,(u + d_1) + \sqrt{\alpha_2}\,(v + d_2) \right) \cdot \left( \sqrt{\lambda_1}\,(u + d_1) - \sqrt{\alpha_2}\,(v + d_2) \right) = 0$$
or equivalently, a pair of intersecting straight lines in the $(u, v)$-plane.
(b) Let $d_3 \neq 0$. In particular, let $d_3 > 0$. Then Equation (6.4.3) reduces to
$$\frac{\lambda_1 (u + d_1)^2}{d_3} - \frac{\alpha_2 (v + d_2)^2}{d_3} = 1$$
or equivalently, a hyperbola in the $(u, v)$-plane, with principal axes $u + d_1 = 0$ and $v + d_2 = 0$.
4. $\lambda_1, \lambda_2 > 0$. In this case, $ab - h^2 = \det(A) = \lambda_1 \lambda_2 > 0$ and Equation (6.4.2) can be rewritten as
$$\lambda_1 (u + d_1)^2 + \lambda_2 (v + d_2)^2 = d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R}. \tag{6.4.4}$$
We consider the following three subcases to understand this.
(a) Let $d_3 = 0$. Then Equation (6.4.4) is satisfied only by the single point $(u, v) = (-d_1, -d_2)$, the point of intersection of the perpendicular lines $u + d_1 = 0$ and $v + d_2 = 0$ in the $(u, v)$-plane.
(b) Let $d_3 < 0$. Then the solution set of Equation (6.4.4) is an empty set.
(c) Let $d_3 > 0$. Then Equation (6.4.4) reduces to the ellipse
$$\frac{\lambda_1 (u + d_1)^2}{d_3} + \frac{\lambda_2 (v + d_2)^2}{d_3} = 1$$
whose principal axes are $u + d_1 = 0$ and $v + d_2 = 0$.
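The test in Proposition 6.4.8 depends only on the sign of $ab - h^2$. The following Python sketch (the function name `classify_conic` is a hypothetical helper, not taken from the notes) applies that test. It ignores the degenerate subcases listed in the proof (lines, a single point, the empty set), and the exact comparison with zero is meant for integer or rational coefficients rather than floating-point input.

```python
def classify_conic(a, h, b):
    """Classify ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0 by the sign of ab - h^2.

    Only the quadratic part matters for this test; degenerate cases
    (pairs of lines, a point, the empty set) are not detected here.
    """
    discriminant = a * b - h * h  # equals det(A) for A = [[a, h], [h, b]]
    if discriminant > 0:
        return "ellipse"
    if discriminant == 0:
        return "parabola"
    return "hyperbola"


# Example 6.4.6: 3x^2 + 4xy + 3y^2 = 5 has a = 3, h = 2, b = 3.
print(classify_conic(3, 2, 3))
```

Note that $h$ is half the coefficient of $xy$, so a term $4xy$ corresponds to $h = 2$.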


Remark 6.4.9 Observe that the condition $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$ implies that the principal axes of the conic are functions of the eigenvectors $u_1$ and $u_2$.
Exercise 6.4.10 Sketch the graph of the following conics:
1. $x^2 + 2xy + y^2 - 6x - 10y = 3$.
2. $2x^2 + 6xy + 3y^2 - 12x - 6y = 5$.
3. $4x^2 - 4xy + 2y^2 + 12x - 8y = 10$.
4. $2x^2 - 6xy + 5y^2 - 10x + 4y = 7$.
As a last application, we consider the following problem that helps us in understanding quadrics. Let
$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz + 2lx + 2my + 2nz + q = 0 \tag{6.4.5}$$
be a general quadric. Then to get the geometrical idea of this quadric, do the following:
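1. Define $A = \begin{bmatrix} a & d & e \\ d & b & f \\ e & f & c \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 2l \\ 2m \\ 2n \end{bmatrix}$ and $\mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$. Note that Equation (6.4.5) can be rewritten as $\mathbf{x}^t A \mathbf{x} + \mathbf{b}^t \mathbf{x} + q = 0$.
2. As $A$ is symmetric, find an orthogonal matrix $P$ such that $P^t A P = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$.
3. Let $\mathbf{y} = P^t \mathbf{x} = (y_1, y_2, y_3)^t$. Then Equation (6.4.5) reduces to
$$\lambda_1 y_1^2 + \lambda_2 y_2^2 + \lambda_3 y_3^2 + 2l_1 y_1 + 2l_2 y_2 + 2l_3 y_3 + q' = 0. \tag{6.4.6}$$
4. Depending on which $\lambda_i \neq 0$, rewrite Equation (6.4.6) by completing the square. That is, if $\lambda_1 \neq 0$ then rewrite $\lambda_1 y_1^2 + 2l_1 y_1$ as $\lambda_1 \left( y_1 + \frac{l_1}{\lambda_1} \right)^2 - \frac{l_1^2}{\lambda_1}$.
5. Use the condition $\mathbf{x} = P\mathbf{y}$ to determine the center and the planes of symmetry of the quadric in terms of the original system.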
Example 6.4.11 Determine the following quadrics:
1. $2x^2 + 2y^2 + 2z^2 + 2xy + 2xz + 2yz + 4x + 2y + 4z + 2 = 0$.
2. $3x^2 - y^2 + z^2 + 10 = 0$.
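Solution: For Part 1, observe that $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 4 \\ 2 \\ 4 \end{bmatrix}$ and $q = 2$. Also, the orthogonal matrix
$$P = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \end{bmatrix} \quad \text{satisfies} \quad P^t A P = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Hence, the quadric reduces to
$$4y_1^2 + y_2^2 + y_3^2 + \frac{10}{\sqrt{3}} y_1 + \frac{2}{\sqrt{2}} y_2 - \frac{2}{\sqrt{6}} y_3 + 2 = 0,$$
or equivalently to
$$4\left( y_1 + \frac{5}{4\sqrt{3}} \right)^2 + \left( y_2 + \frac{1}{\sqrt{2}} \right)^2 + \left( y_3 - \frac{1}{\sqrt{6}} \right)^2 = \frac{9}{12}.$$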
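So, the standard form of the quadric is $4z_1^2 + z_2^2 + z_3^2 = \frac{9}{12}$, where the center is given by
$$(x, y, z)^t = P \left( -\frac{5}{4\sqrt{3}}, \; -\frac{1}{\sqrt{2}}, \; \frac{1}{\sqrt{6}} \right)^t = \left( -\frac{3}{4}, \; \frac{1}{4}, \; -\frac{3}{4} \right)^t.$$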
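The center found in Part 1 can be cross-checked without computing $P$. The sketch below uses a different technique from the one in the solution: for a symmetric $A$, the gradient of $\mathbf{x}^t A \mathbf{x} + \mathbf{b}^t \mathbf{x} + q$ is $2A\mathbf{x} + \mathbf{b}$, and for an invertible $A$ the center is the point where this gradient vanishes. The helper `solve_linear_system` is a hypothetical name introduced for this illustration; it performs Gauss-Jordan elimination over the rationals and assumes the matrix is invertible.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs exactly; assumes the matrix is square and invertible."""
    n = len(matrix)
    # Build the augmented matrix with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
b = [4, 2, 4]

# Solve 2 A x = -b for the center of the quadric.
center = solve_linear_system([[2 * v for v in row] for row in A], [-v for v in b])
print(center)
```

Working with `Fraction` keeps the arithmetic exact, so the result can be compared directly with the center $(-\frac{3}{4}, \frac{1}{4}, -\frac{3}{4})^t$ stated above.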
For Part 2, observe that $A = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, $\mathbf{b} = \mathbf{0}$ and $q = 10$. In this case, we can rewrite the quadric as
$$\frac{y^2}{10} - \frac{3x^2}{10} - \frac{z^2}{10} = 1,$$
which is the equation of a hyperboloid consisting of two sheets.
The calculation of the planes of symmetry is left as an exercise to the reader.
Chapter 7
Appendix
7.1 Permutation/Symmetric Groups
In this section, $S$ denotes the set $\{1, 2, \ldots, n\}$.
Definition 7.1.1 1. A function $\sigma : S \to S$ is called a permutation on $n$ elements if $\sigma$ is both one to one and onto.
2. The set of all functions $\sigma : S \to S$ that are both one to one and onto will be denoted by $\mathcal{S}_n$. That is, $\mathcal{S}_n$ is the set of all permutations of the set $\{1, 2, \ldots, n\}$.
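Example 7.1.2 1. In general, we represent a permutation $\sigma$ by
$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.$$
This representation of a permutation is called a two row notation for $\sigma$.
2. For each positive integer $n$, $\mathcal{S}_n$ has a special permutation called the identity permutation, denoted $Id_n$, such that $Id_n(i) = i$ for $1 \leq i \leq n$. That is, $Id_n = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}$.
3. Let $n = 3$. Then
$$\begin{aligned}
\mathcal{S}_3 = \Big\{ &\tau_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \; \tau_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \; \tau_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \\
&\tau_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \; \tau_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \; \tau_6 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \Big\}
\end{aligned} \tag{7.1.1}$$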
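Remark 7.1.3 1. Let $\sigma \in \mathcal{S}_n$. Then $\sigma$ is determined if $\sigma(i)$ is known for $i = 1, 2, \ldots, n$. As $\sigma$ is both one to one and onto, $\{\sigma(1), \sigma(2), \ldots, \sigma(n)\} = S$. So, there are $n$ choices for $\sigma(1)$ (any element of $S$), $n - 1$ choices for $\sigma(2)$ (any element of $S$ different from $\sigma(1)$), and so on. Hence, there are $n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n!$ possible permutations. Thus, the number of elements in $\mathcal{S}_n$ is $n!$. That is, $|\mathcal{S}_n| = n!$.
2. Suppose that $\sigma, \tau \in \mathcal{S}_n$. Then both $\sigma$ and $\tau$ are one to one and onto. So, their composition map $\sigma \circ \tau$, defined by $(\sigma \circ \tau)(i) = \sigma\big(\tau(i)\big)$, is also both one to one and onto. Hence, $\sigma \circ \tau$ is also a permutation. That is, $\sigma \circ \tau \in \mathcal{S}_n$.
3. Suppose $\sigma \in \mathcal{S}_n$. Then $\sigma$ is both one to one and onto. Hence, the function $\sigma^{-1} : S \to S$ defined by $\sigma^{-1}(m) = \ell$ if and only if $\sigma(\ell) = m$ for $1 \leq m \leq n$, is well defined and indeed $\sigma^{-1}$ is also both one to one and onto. Hence, for every element $\sigma \in \mathcal{S}_n$, $\sigma^{-1} \in \mathcal{S}_n$ and is the inverse of $\sigma$.
4. Observe that for any $\sigma \in \mathcal{S}_n$, the compositions $\sigma \circ \sigma^{-1} = \sigma^{-1} \circ \sigma = Id_n$.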
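Proposition 7.1.4 Consider the set of all permutations $\mathcal{S}_n$. Then the following hold:
1. Fix an element $\tau \in \mathcal{S}_n$. Then the sets $\{\sigma \circ \tau : \sigma \in \mathcal{S}_n\}$ and $\{\tau \circ \sigma : \sigma \in \mathcal{S}_n\}$ have exactly $n!$ elements. Or equivalently,
$$\mathcal{S}_n = \{\tau \circ \sigma : \sigma \in \mathcal{S}_n\} = \{\sigma \circ \tau : \sigma \in \mathcal{S}_n\}.$$
2. $\mathcal{S}_n = \{\sigma^{-1} : \sigma \in \mathcal{S}_n\}$.
Proof. For the first part, we need to show that given any element $\alpha \in \mathcal{S}_n$, there exist elements $\beta, \gamma \in \mathcal{S}_n$ such that $\alpha = \tau \circ \beta = \gamma \circ \tau$. It can easily be verified that $\beta = \tau^{-1} \circ \alpha$ and $\gamma = \alpha \circ \tau^{-1}$.
For the second part, note that for any $\sigma \in \mathcal{S}_n$, $(\sigma^{-1})^{-1} = \sigma$. Hence the result holds.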
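Definition 7.1.5 Let $\sigma \in \mathcal{S}_n$. Then the number of inversions of $\sigma$, denoted $n(\sigma)$, equals
$$|\{(i, j) : i < j, \; \sigma(i) > \sigma(j)\}|.$$
Note that, for any $\sigma \in \mathcal{S}_n$, $n(\sigma)$ also equals
$$\sum_{i=1}^{n} |\{\sigma(j) < \sigma(i), \text{ for } j = i + 1, i + 2, \ldots, n\}|.$$
Definition 7.1.6 A permutation $\sigma \in \mathcal{S}_n$ is called a transposition if there exist two distinct positive integers $m, r \in \{1, 2, \ldots, n\}$ such that $\sigma(m) = r$, $\sigma(r) = m$ and $\sigma(i) = i$ for $1 \leq i \neq m, r \leq n$.
For the sake of convenience, a transposition $\sigma$ for which $\sigma(m) = r$, $\sigma(r) = m$ and $\sigma(i) = i$ for $1 \leq i \neq m, r \leq n$ will be denoted simply by $\sigma = (m \; r)$ or $(r \; m)$. Also, note that for any transposition $\sigma \in \mathcal{S}_n$, $\sigma^{-1} = \sigma$. That is, $\sigma \circ \sigma = Id_n$.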
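Example 7.1.7 1. The permutation $\tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix}$ is a transposition as $\tau(1) = 3$, $\tau(3) = 1$, $\tau(2) = 2$ and $\tau(4) = 4$. Here note that $\tau = (1 \; 3) = (3 \; 1)$. Also, check that
$$n(\tau) = |\{(1, 2), (1, 3), (2, 3)\}| = 3.$$
2. Let $\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 2 & 3 & 5 & 1 & 9 & 8 & 7 & 6 \end{pmatrix}$. Then check that
$$n(\tau) = 3 + 1 + 1 + 1 + 0 + 3 + 2 + 1 = 12.$$
3. Let $\ell$, $m$ and $r$ be distinct elements from $\{1, 2, \ldots, n\}$. Suppose $\sigma = (m \; r)$ and $\tau = (m \; \ell)$. Then
$$\begin{aligned}
(\sigma \circ \tau)(\ell) &= \sigma\big(\tau(\ell)\big) = \sigma(m) = r, & (\sigma \circ \tau)(m) &= \sigma\big(\tau(m)\big) = \sigma(\ell) = \ell, \\
(\sigma \circ \tau)(r) &= \sigma\big(\tau(r)\big) = \sigma(r) = m, & (\sigma \circ \tau)(i) &= \sigma\big(\tau(i)\big) = \sigma(i) = i \text{ if } i \neq \ell, m, r.
\end{aligned}$$
Therefore,
$$\sigma \circ \tau = (m \; r) \circ (m \; \ell) = \begin{pmatrix} 1 & 2 & \cdots & \ell & \cdots & m & \cdots & r & \cdots & n \\ 1 & 2 & \cdots & r & \cdots & \ell & \cdots & m & \cdots & n \end{pmatrix} = (r \; \ell) \circ (r \; m).$$
Similarly check that $\tau \circ \sigma = \begin{pmatrix} 1 & 2 & \cdots & \ell & \cdots & m & \cdots & r & \cdots & n \\ 1 & 2 & \cdots & m & \cdots & r & \cdots & \ell & \cdots & n \end{pmatrix}$.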
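The inversion counts and the composition rule in Example 7.1.7 are easy to check by machine. The following Python sketch is an illustration only: the function names are hypothetical, and a permutation is stored as the bottom row of its two row notation, so that `perm[i - 1]` holds $\sigma(i)$.

```python
def count_inversions(perm):
    """Number of pairs (i, j) with i < j and perm(i) > perm(j)."""
    n = len(perm)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if perm[i] > perm[j]
    )


def compose(sigma, tau):
    """Return sigma o tau, that is, the map i -> sigma(tau(i))."""
    return [sigma[tau[i] - 1] for i in range(len(tau))]


# Parts 1 and 2 of Example 7.1.7.
print(count_inversions([3, 2, 1, 4]))
print(count_inversions([4, 2, 3, 5, 1, 9, 8, 7, 6]))

# Part 3 with n = 3, m = 1, r = 2 and l = 3: (m r) o (m l) equals (r l) o (r m).
m_r, m_l = [2, 1, 3], [3, 2, 1]
r_l, r_m = [1, 3, 2], [2, 1, 3]
print(compose(m_r, m_l) == compose(r_l, r_m))
```

The double loop is quadratic in $n$, which is adequate for checking small examples by hand.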
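With the above definitions, we state and prove two important results.
Theorem 7.1.8 For any $\sigma \in \mathcal{S}_n$, $\sigma$ can be written as a composition (product) of transpositions.
Proof. We will prove the result by induction on $n(\sigma)$, the number of inversions of $\sigma$. If $n(\sigma) = 0$, then $\sigma = Id_n = (1 \; 2) \circ (1 \; 2)$. So, let the result be true for all $\sigma \in \mathcal{S}_n$ with $n(\sigma) \leq k$.
For the next step of the induction, suppose that $\tau \in \mathcal{S}_n$ with $n(\tau) = k + 1$. Choose the smallest positive number, say $\ell$, such that
$$\tau(i) = i, \text{ for } i = 1, 2, \ldots, \ell - 1 \text{ and } \tau(\ell) \neq \ell.$$
As $\tau$ is a permutation, there exists a positive number, say $m$, such that $\tau(\ell) = m$. Also, note that $m > \ell$. Define a transposition $\sigma$ by $\sigma = (\ell \; m)$. Then note that
$$(\sigma \circ \tau)(i) = i, \text{ for } i = 1, 2, \ldots, \ell.$$
So, the definition of number of inversions and $m > \ell$ imply that
$$\begin{aligned}
n(\sigma \circ \tau) &= \sum_{i=1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&= \sum_{i=1}^{\ell} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&\quad + \sum_{i=\ell+1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&= \sum_{i=\ell+1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&\leq (m - \ell - 1) + \sum_{i=\ell+1}^{n} |\{\tau(j) < \tau(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&< (m - \ell) + \sum_{i=\ell+1}^{n} |\{\tau(j) < \tau(i), \text{ for } j = i + 1, i + 2, \ldots, n\}| \\
&= n(\tau).
\end{aligned}$$
Here the inequality $\leq$ holds because, on the positions $\ell + 1, \ldots, n$, the permutation $\sigma \circ \tau$ differs from $\tau$ only at the position $p$ with $\tau(p) = \ell$, where the value $\ell$ is replaced by $m$; this creates at most one new inversion for each of the $m - \ell - 1$ values strictly between $\ell$ and $m$. The last equality holds because $\tau(i) = i$ for $i < \ell$ and exactly $m - \ell$ of the values $\tau(j)$, $j > \ell$, are smaller than $\tau(\ell) = m$.
Thus, $n(\sigma \circ \tau) < k + 1$. Hence, by the induction hypothesis, the permutation $\sigma \circ \tau$ is a composition of transpositions. That is, there exist transpositions, say $\alpha_i$, $1 \leq i \leq t$, such that
$$\sigma \circ \tau = \alpha_1 \circ \alpha_2 \circ \cdots \circ \alpha_t.$$
Hence, $\tau = \sigma \circ \alpha_1 \circ \alpha_2 \circ \cdots \circ \alpha_t$ as $\sigma \circ \sigma = Id_n$ for any transposition $\sigma \in \mathcal{S}_n$. Therefore, by mathematical induction, the proof of the theorem is complete.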
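The proof of Theorem 7.1.8 is constructive: it repeatedly composes with the transposition $(\ell \; m)$, where $\ell$ is the smallest point not yet fixed and $m$ is its image. The Python sketch below follows that idea iteratively. The function names are hypothetical, permutations are stored as the bottom row of the two row notation, and the identity is returned as an empty list of factors (the proof writes it as $(1 \; 2) \circ (1 \; 2)$ instead).

```python
def transposition_factors(perm):
    """Return transpositions s_1, ..., s_t (as pairs) with perm = s_1 o s_2 o ... o s_t."""
    current = list(perm)
    factors = []
    for l in range(1, len(current) + 1):
        m = current[l - 1]
        if m != l:
            # Composing with (l m) on the left swaps the values l and m,
            # after which position l is fixed.
            current = [l if v == m else m if v == l else v for v in current]
            factors.append((l, m))
    return factors


def apply_factors(factors, n):
    """Evaluate s_1 o ... o s_t on 1..n; the rightmost factor acts first."""
    result = []
    for i in range(1, n + 1):
        image = i
        for l, m in reversed(factors):
            if image == l:
                image = m
            elif image == m:
                image = l
        result.append(image)
    return result


perm = [4, 2, 3, 5, 1, 9, 8, 7, 6]
factors = transposition_factors(perm)
print(factors)
print(apply_factors(factors, len(perm)) == perm)
```

For the permutation of Example 7.1.7.2 this produces four transpositions, an even number, in agreement with its 12 inversions.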
Before coming to our next important result, we state and prove the following lemma.
Lemma 7.1.9 Suppose there exist transpositions $\sigma_i$, $1 \leq i \leq t$, such that
$$Id_n = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_t;$$
then $t$ is even.
Proof. Observe that $t \neq 1$ as the identity permutation is not a transposition. Hence, $t \geq 2$. If $t = 2$, we are done. So, let us assume that $t \geq 3$. We will prove the result by the method of mathematical induction. The result clearly holds for $t = 2$. Let the result be true for all expressions in which the number of transpositions $t \leq k$. Now, let $t = k + 1$.
Suppose $\sigma_1 = (m \; r)$. Note that the possible choices for the composition $\sigma_1 \circ \sigma_2$ are $(m \; r) \circ (m \; r) = Id_n$, $(m \; r) \circ (m \; \ell) = (r \; \ell) \circ (r \; m)$, $(m \; r) \circ (r \; \ell) = (\ell \; r) \circ (\ell \; m)$ and $(m \; r) \circ (\ell \; s) = (\ell \; s) \circ (m \; r)$, where $\ell$ and $s$ are distinct elements of $\{1, 2, \ldots, n\}$ and are different from $m, r$. In the first case, we can remove $\sigma_1 \circ \sigma_2$ and obtain $Id_n = \sigma_3 \circ \sigma_4 \circ \cdots \circ \sigma_t$. In this expression for identity, the number of transpositions is $t - 2 = k - 1 < k$. So, by mathematical induction, $t - 2$ is even and hence $t$ is also even.
In the other three cases, we replace the original expression for $\sigma_1 \circ \sigma_2$ by their counterparts on the right to obtain another expression for identity in terms of $t = k + 1$ transpositions. But note that in the new expression for identity, the positive integer $m$ doesn't appear in the first transposition, but appears in the second transposition. We can continue the above process with the second and third transpositions. At this step, either the number of transpositions will reduce by 2 (giving us the result by mathematical induction) or the positive number $m$ will get shifted to the third transposition. The continuation of this process will at some stage lead to an expression for identity in which the number of transpositions is $t - 2 = k - 1$ (which will give us the desired result by mathematical induction), or else we will have an expression in which the positive number $m$ will get shifted to the rightmost transposition. In the latter case, the positive integer $m$ appears exactly once in the expression for identity and hence this expression does not fix $m$, whereas for the identity permutation $Id_n(m) = m$. So the latter case leads us to a contradiction.
Hence, the process will surely lead to an expression in which the number of transpositions at some stage is $t - 2 = k - 1$. Therefore, by mathematical induction, the proof of the lemma is complete.

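Theorem 7.1.10 Let $\alpha \in \mathcal{S}_n$. Suppose there exist transpositions $\tau_1, \tau_2, \ldots, \tau_k$ and $\sigma_1, \sigma_2, \ldots, \sigma_\ell$ such that
$$\alpha = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_k = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_\ell;$$
then either $k$ and $\ell$ are both even or both odd.
Proof. Observe that the condition $\tau_1 \circ \tau_2 \circ \cdots \circ \tau_k = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_\ell$ and $\sigma \circ \sigma = Id_n$ for any transposition $\sigma \in \mathcal{S}_n$, implies that
$$Id_n = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_k \circ \sigma_\ell \circ \sigma_{\ell-1} \circ \cdots \circ \sigma_1.$$
Hence by Lemma 7.1.9, $k + \ell$ is even. Hence, either $k$ and $\ell$ are both even or both odd. Thus the result follows.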
Definition 7.1.11 A permutation $\sigma \in \mathcal{S}_n$ is called an even permutation if $\sigma$ can be written as a composition (product) of an even number of transpositions. A permutation $\sigma \in \mathcal{S}_n$ is called an odd permutation if $\sigma$ can be written as a composition (product) of an odd number of transpositions.
Remark 7.1.12 Observe that if $\sigma$ and $\tau$ are both even or both odd permutations, then the permutations $\sigma \circ \tau$ and $\tau \circ \sigma$ are both even. Whereas if one of them is odd and the other even then the permutations $\sigma \circ \tau$ and $\tau \circ \sigma$ are both odd. We use this to define a function on $\mathcal{S}_n$, called the sign of a permutation, as follows:
7.2. PROPERTIES OF DETERMINANT 145
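Definition 7.1.13 Let $sgn : \mathcal{S}_n \to \{1, -1\}$ be a function defined by
$$sgn(\sigma) = \begin{cases} 1 & \text{if } \sigma \text{ is an even permutation} \\ -1 & \text{if } \sigma \text{ is an odd permutation.} \end{cases}$$
Example 7.1.14 1. The identity permutation $Id_n$ is an even permutation whereas every transposition is an odd permutation. Thus, $sgn(Id_n) = 1$ and for any transposition $\sigma \in \mathcal{S}_n$, $sgn(\sigma) = -1$.
2. Using Remark 7.1.12, $sgn(\sigma \circ \tau) = sgn(\sigma) \cdot sgn(\tau)$ for any two permutations $\sigma, \tau \in \mathcal{S}_n$.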
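We are now ready to define the determinant of a square matrix $A$.
Definition 7.1.15 Let $A = [a_{ij}]$ be an $n \times n$ matrix with entries from $\mathbb{F}$. The determinant of $A$, denoted $\det(A)$, is defined as
$$\det(A) = \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \prod_{i=1}^{n} a_{i\sigma(i)}.$$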
Remark 7.1.16 1. Observe that $\det(A)$ is a scalar quantity. The expression for $\det(A)$ seems complicated at first glance. But this expression is very helpful in proving the results related to properties of the determinant.
2. If $A = [a_{ij}]$ is a $3 \times 3$ matrix, then using (7.1.1),
$$\begin{aligned}
\det(A) &= \sum_{\sigma \in \mathcal{S}_3} sgn(\sigma) \prod_{i=1}^{3} a_{i\sigma(i)} \\
&= sgn(\tau_1) \prod_{i=1}^{3} a_{i\tau_1(i)} + sgn(\tau_2) \prod_{i=1}^{3} a_{i\tau_2(i)} + sgn(\tau_3) \prod_{i=1}^{3} a_{i\tau_3(i)} \\
&\quad + sgn(\tau_4) \prod_{i=1}^{3} a_{i\tau_4(i)} + sgn(\tau_5) \prod_{i=1}^{3} a_{i\tau_5(i)} + sgn(\tau_6) \prod_{i=1}^{3} a_{i\tau_6(i)} \\
&= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}.
\end{aligned}$$
Observe that this expression for $\det(A)$ for a $3 \times 3$ matrix $A$ is the same as that given in (2.5.1).
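Definition 7.1.15 can be evaluated directly for small matrices. The Python sketch below is an illustration with hypothetical function names. It relies on one fact that is standard but not stated in the text above: the sign of a permutation equals $(-1)^{n(\sigma)}$, where $n(\sigma)$ is the number of inversions. Indices are 0-based in the code, which does not change the inversion count.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation, computed from the parity of its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_leibniz(a):
    """Determinant as the sum over all permutations of signed products."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]  # one entry from each row and each column
        total += term
    return total


print(det_leibniz([[2, 1, 1], [1, 2, 1], [1, 1, 2]]))
```

The sum has $n!$ terms, so this is useful for checking identities on small matrices rather than for computation at scale.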
7.2 Properties of Determinant
Theorem 7.2.1 (Properties of Determinant) Let $A = [a_{ij}]$ be an $n \times n$ matrix. Then
1. if $B$ is obtained from $A$ by interchanging two rows, then $\det(B) = -\det(A)$.
2. if $B$ is obtained from $A$ by multiplying a row by $c$ then $\det(B) = c \det(A)$.
3. if all the elements of one row are 0 then $\det(A) = 0$.
4. if $A$ is a square matrix having two rows equal then $\det(A) = 0$.
5. Let $B = [b_{ij}]$ and $C = [c_{ij}]$ be two matrices which differ from the matrix $A = [a_{ij}]$ only in the $m$th row for some $m$. If $c_{mj} = a_{mj} + b_{mj}$ for $1 \leq j \leq n$ then $\det(C) = \det(A) + \det(B)$.
6. if $B$ is obtained from $A$ by replacing the $\ell$th row by itself plus $k$ times the $m$th row, for $\ell \neq m$, then $\det(B) = \det(A)$.
7. if $A$ is a triangular matrix then $\det(A) = a_{11} a_{22} \cdots a_{nn}$, the product of the diagonal elements.
8. If $E$ is an elementary matrix of order $n$ then $\det(EA) = \det(E) \det(A)$.
9. $A$ is invertible if and only if $\det(A) \neq 0$.
10. If $B$ is an $n \times n$ matrix then $\det(AB) = \det(A) \det(B)$.
11. $\det(A) = \det(A^t)$, where recall that $A^t$ is the transpose of the matrix $A$.
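Proof. Proof of Part 1. Suppose $B = [b_{ij}]$ is obtained from $A = [a_{ij}]$ by the interchange of the $\ell$th and $m$th row. Then $b_{\ell j} = a_{mj}$, $b_{mj} = a_{\ell j}$ for $1 \leq j \leq n$ and $b_{ij} = a_{ij}$ for $1 \leq i \neq \ell, m \leq n$, $1 \leq j \leq n$.
Let $\tau = (\ell \; m)$ be a transposition. Then by Proposition 7.1.4, $\mathcal{S}_n = \{\sigma \circ \tau : \sigma \in \mathcal{S}_n\}$. Hence by the definition of determinant and Example 7.1.14.2, we have
$$\begin{aligned}
\det(B) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \prod_{i=1}^{n} b_{i\sigma(i)} = \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma \circ \tau) \prod_{i=1}^{n} b_{i(\sigma \circ \tau)(i)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\tau) \cdot sgn(\sigma) \, b_{1(\sigma \circ \tau)(1)} b_{2(\sigma \circ \tau)(2)} \cdots b_{\ell(\sigma \circ \tau)(\ell)} \cdots b_{m(\sigma \circ \tau)(m)} \cdots b_{n(\sigma \circ \tau)(n)} \\
&= sgn(\tau) \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{\ell\sigma(m)} \cdots b_{m\sigma(\ell)} \cdots b_{n\sigma(n)} \\
&= -\left( \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{\ell\sigma(\ell)} \cdots a_{n\sigma(n)} \right) \quad \text{as } sgn(\tau) = -1 \\
&= -\det(A).
\end{aligned}$$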
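Proof of Part 2. Suppose that $B = [b_{ij}]$ is obtained by multiplying the $m$th row of $A$ by $c \neq 0$. Then $b_{mj} = c \, a_{mj}$ and $b_{ij} = a_{ij}$ for $1 \leq i \neq m \leq n$, $1 \leq j \leq n$. Then
$$\begin{aligned}
\det(B) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots c \, a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= c \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= c \det(A).
\end{aligned}$$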
Proof of Part 3. Note that $\det(A) = \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$. So, each term in the expression for the determinant contains one entry from each row. Hence, from the condition that $A$ has a row consisting of all zeros, the value of each term is 0. Thus, $\det(A) = 0$.
Proof of Part 4. Suppose that the $\ell$th and $m$th rows of $A$ are equal. Let $B$ be the matrix obtained from $A$ by interchanging the $\ell$th and $m$th rows. Then by the first part, $\det(B) = -\det(A)$. But the assumption implies that $B = A$. Hence, $\det(B) = \det(A)$. So, we have $\det(B) = -\det(A) = \det(A)$. Hence, $\det(A) = 0$.
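Proof of Part 5. By definition and the given assumption, we have
$$\begin{aligned}
\det(C) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, c_{1\sigma(1)} c_{2\sigma(2)} \cdots c_{m\sigma(m)} \cdots c_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, c_{1\sigma(1)} c_{2\sigma(2)} \cdots (b_{m\sigma(m)} + a_{m\sigma(m)}) \cdots c_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&\quad + \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \det(B) + \det(A).
\end{aligned}$$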
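Proof of Part 6. Suppose that $B = [b_{ij}]$ is obtained from $A$ by replacing the $\ell$th row by itself plus $k$ times the $m$th row, for $\ell \neq m$. Then $b_{\ell j} = a_{\ell j} + k \, a_{mj}$ and $b_{ij} = a_{ij}$ for $1 \leq i \neq \ell \leq n$, $1 \leq j \leq n$. Then
$$\begin{aligned}
\det(B) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{\ell\sigma(\ell)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots (a_{\ell\sigma(\ell)} + k \, a_{m\sigma(\ell)}) \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{\ell\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&\quad + k \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{\ell\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \quad \text{(use Part 4)} \\
&= \det(A).
\end{aligned}$$
Here Part 4 applies because the second sum is the determinant of the matrix obtained from $A$ by replacing its $\ell$th row with a copy of its $m$th row.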
Proof of Part 7. First let us assume that $A$ is an upper triangular matrix. Observe that if $\sigma \in \mathcal{S}_n$ is different from the identity permutation then $n(\sigma) \geq 1$. So, for every $\sigma \neq Id_n \in \mathcal{S}_n$, there exists a positive integer $m$, $2 \leq m \leq n$ (depending on $\sigma$) such that $m > \sigma(m)$. As $A$ is an upper triangular matrix, $a_{m\sigma(m)} = 0$ for each $\sigma \, (\neq Id_n) \in \mathcal{S}_n$. Hence the result follows.
A similar reasoning holds true in case $A$ is a lower triangular matrix.
Proof of Part 8. Let $I_n$ be the identity matrix of order $n$. Then using Part 7, $\det(I_n) = 1$. Also, recalling the notations for the elementary matrices given in Remark 2.2.2, we have $\det(E_{ij}) = -1$ (using Part 1), $\det(E_i(c)) = c$ (using Part 2) and $\det(E_{ij}(k)) = 1$ (using Part 6). Again using Parts 1, 2 and 6, we get $\det(EA) = \det(E) \det(A)$.
Proof of Part 9. Suppose $A$ is invertible. Then by Theorem 2.2.5, $A$ is a product of elementary matrices. That is, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $A = E_1 E_2 \cdots E_k$. Now a repeated application of Part 8 implies that $\det(A) = \det(E_1) \det(E_2) \cdots \det(E_k)$. But $\det(E_i) \neq 0$ for $1 \leq i \leq k$. Hence, $\det(A) \neq 0$.
Now assume that $\det(A) \neq 0$. We show that $A$ is invertible. On the contrary, assume that $A$ is not invertible. Then by Theorem 2.2.5, the matrix $A$ is not of full rank. That is, there exists a positive integer $r < n$ such that $\mathrm{rank}(A) = r$. So, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $E_1 E_2 \cdots E_k A = \begin{bmatrix} B \\ 0 \end{bmatrix}$. Therefore, by Part 3 and a repeated application of Part 8,
$$\det(E_1) \det(E_2) \cdots \det(E_k) \det(A) = \det(E_1 E_2 \cdots E_k A) = \det\left( \begin{bmatrix} B \\ 0 \end{bmatrix} \right) = 0.$$
But $\det(E_i) \neq 0$ for $1 \leq i \leq k$. Hence, $\det(A) = 0$. This contradicts our assumption that $\det(A) \neq 0$. Hence our assumption is false and therefore $A$ is invertible.
Proof of Part 10. Suppose $A$ is not invertible. Then by Part 9, $\det(A) = 0$. Also, the product matrix $AB$ is also not invertible. So, again by Part 9, $\det(AB) = 0$. Thus, $\det(AB) = \det(A) \det(B)$.
Now suppose that $A$ is invertible. Then by Theorem 2.2.5, $A$ is a product of elementary matrices. That is, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $A = E_1 E_2 \cdots E_k$. Now a repeated application of Part 8 implies that
$$\begin{aligned}
\det(AB) &= \det(E_1 E_2 \cdots E_k B) = \det(E_1) \det(E_2) \cdots \det(E_k) \det(B) \\
&= \det(E_1 E_2 \cdots E_k) \det(B) = \det(A) \det(B).
\end{aligned}$$
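Proof of Part 11. Let $B = [b_{ij}] = A^t$. Then $b_{ij} = a_{ji}$ for $1 \leq i, j \leq n$. By Proposition 7.1.4, we know that $\mathcal{S}_n = \{\sigma^{-1} : \sigma \in \mathcal{S}_n\}$. Also $sgn(\sigma) = sgn(\sigma^{-1})$. Hence,
$$\begin{aligned}
\det(B) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma^{-1}) \, b_{\sigma^{-1}(1)\,1} \, b_{\sigma^{-1}(2)\,2} \cdots b_{\sigma^{-1}(n)\,n} \\
&= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma^{-1}) \, a_{1\sigma^{-1}(1)} \, a_{2\sigma^{-1}(2)} \cdots a_{n\sigma^{-1}(n)} \\
&= \det(A).
\end{aligned}$$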

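Remark 7.2.2 1. The result that $\det(A) = \det(A^t)$ implies that in the statements made in Theorem 7.2.1, wherever the word "row" appears it can be replaced by "column".
2. Let $A = [a_{ij}]$ be a matrix satisfying $a_{11} = 1$ and $a_{1j} = 0$ for $2 \leq j \leq n$. Let $B$ be the submatrix of $A$ obtained by removing the first row and the first column. Then it can be easily shown that $\det(A) = \det(B)$. The reason is as follows:
saying that $\sigma \in \mathcal{S}_n$ satisfies $\sigma(1) = 1$ is equivalent to saying that $\sigma$ is a permutation of the elements $2, 3, \ldots, n$. That is, $\sigma \in \mathcal{S}_{n-1}$. Hence,
$$\begin{aligned}
\det(A) &= \sum_{\sigma \in \mathcal{S}_n} sgn(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in \mathcal{S}_n, \, \sigma(1) = 1} sgn(\sigma) \, a_{2\sigma(2)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in \mathcal{S}_{n-1}} sgn(\sigma) \, b_{1\sigma(1)} \cdots b_{(n-1)\sigma(n-1)} = \det(B).
\end{aligned}$$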
We are now ready to relate this definition of determinant with the one given in Definition 2.5.2.
Theorem 7.2.3 Let $A$ be an $n \times n$ matrix. Then $\det(A) = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det\big(A(1|j)\big)$, where recall that $A(1|j)$ is the submatrix of $A$ obtained by removing the 1st row and the $j$th column.
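Proof. For $1 \leq j \leq n$, define two matrices
$$B_j = \begin{bmatrix} 0 & 0 & \cdots & a_{1j} & \cdots & 0 \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn} \end{bmatrix}_{n \times n} \quad \text{and} \quad C_j = \begin{bmatrix} a_{1j} & 0 & 0 & \cdots & 0 \\ a_{2j} & a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{nj} & a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}_{n \times n}.$$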
7.3. DIMENSION OF M +N 149
Then by Theorem 7.2.1.5,
$$\det(A) = \sum_{j=1}^{n} \det(B_j). \tag{7.2.2}$$
We now compute $\det(B_j)$ for $1 \leq j \leq n$. Note that the matrix $B_j$ can be transformed into $C_j$ by $j - 1$ interchanges of columns done in the following manner:
first interchange the $(j-1)$th and $j$th columns, then interchange the $(j-2)$th and $(j-1)$th columns, and so on (the last step consists of interchanging the 1st column with the 2nd column). Then by Remark 7.2.2 and Parts 1 and 2 of Theorem 7.2.1, we have $\det(B_j) = (-1)^{j-1} \det(C_j) = (-1)^{j-1} a_{1j} \det\big(A(1|j)\big)$. Therefore by (7.2.2),
$$\det(A) = \sum_{j=1}^{n} (-1)^{j-1} a_{1j} \det\big(A(1|j)\big) = \sum_{j=1}^{n} (-1)^{j+1} a_{1j} \det\big(A(1|j)\big).$$

7.3 Dimension of M +N
Theorem 7.3.1 Let $V(\mathbb{F})$ be a finite dimensional vector space and let $M$ and $N$ be two subspaces of $V$. Then
$$\dim(M) + \dim(N) = \dim(M + N) + \dim(M \cap N). \tag{7.3.3}$$
Proof. Since $M \cap N$ is a vector subspace of $V$, consider a basis $B_1 = \{u_1, u_2, \ldots, u_k\}$ of $M \cap N$. As $M \cap N$ is a subspace of the vector spaces $M$ and $N$, we extend the basis $B_1$ to form a basis $B_M = \{u_1, u_2, \ldots, u_k, v_1, \ldots, v_r\}$ of $M$ and also a basis $B_N = \{u_1, u_2, \ldots, u_k, w_1, \ldots, w_s\}$ of $N$.
We now proceed to prove that the set $B_2 = \{u_1, u_2, \ldots, u_k, w_1, \ldots, w_s, v_1, v_2, \ldots, v_r\}$ is a basis of $M + N$.
To do this, we show that
1. the set $B_2$ is a linearly independent subset of $V$, and
2. $L(B_2) = M + N$.
The second part can be easily verified. To prove the first part, we consider the linear system of equations
$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s + \gamma_1 v_1 + \cdots + \gamma_r v_r = 0. \tag{7.3.4}$$
This system can be rewritten as
$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s = -(\gamma_1 v_1 + \cdots + \gamma_r v_r).$$
The vector $v = -(\gamma_1 v_1 + \cdots + \gamma_r v_r) \in M$, as $v_1, \ldots, v_r \in B_M$. But we also have $v = \alpha_1 u_1 + \cdots + \alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s \in N$ as the vectors $u_1, u_2, \ldots, u_k, w_1, \ldots, w_s \in B_N$. Hence, $v \in M \cap N$ and therefore, there exist scalars $\delta_1, \ldots, \delta_k$ such that $v = \delta_1 u_1 + \delta_2 u_2 + \cdots + \delta_k u_k$.
Substituting this representation of $v$ in Equation (7.3.4), we get
$$(\alpha_1 - \delta_1) u_1 + \cdots + (\alpha_k - \delta_k) u_k + \beta_1 w_1 + \cdots + \beta_s w_s = 0.$$
But then, the vectors $u_1, u_2, \ldots, u_k, w_1, \ldots, w_s$ are linearly independent as they form a basis of $N$. Therefore, by the definition of linear independence, we get
$$\alpha_i - \delta_i = 0 \text{ for } 1 \leq i \leq k \quad \text{and} \quad \beta_j = 0 \text{ for } 1 \leq j \leq s.$$
Thus the linear system of Equations (7.3.4) reduces to
$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \gamma_1 v_1 + \cdots + \gamma_r v_r = 0.$$
As these vectors form the basis $B_M$ of $M$, the only solution for this linear system is
$$\alpha_i = 0 \text{ for } 1 \leq i \leq k \quad \text{and} \quad \gamma_j = 0 \text{ for } 1 \leq j \leq r.$$
Thus we see that the linear system of Equations (7.3.4) has no non-zero solution. And therefore, the vectors are linearly independent.
Hence, the set $B_2$ is a basis of $M + N$. We now count the vectors in the sets $B_1$, $B_2$, $B_M$ and $B_N$ to get the required result.
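As a small illustration of Theorem 7.3.1, the Python sketch below computes dimensions as ranks of spanning sets, using exact rational row reduction. The function `rank` and the two example subspaces of $\mathbb{R}^3$ are assumptions made for this illustration and do not come from the notes. The dimension of $M \cap N$ is obtained from Equation (7.3.3) rather than computed independently.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, found by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots located so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


# Hypothetical example: M = span{e1, e2} and N = span{e2, e3} in R^3.
spanning_M = [[1, 0, 0], [0, 1, 0]]
spanning_N = [[0, 1, 0], [0, 0, 1]]

dim_M = rank(spanning_M)
dim_N = rank(spanning_N)
dim_sum = rank(spanning_M + spanning_N)  # list concatenation spans M + N
dim_intersection = dim_M + dim_N - dim_sum  # rearranged Equation (7.3.3)
print(dim_M, dim_N, dim_sum, dim_intersection)
```

In this example $M \cap N$ is the line spanned by $e_2$, which agrees with the value 1 given by the formula.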
Index
Adjoint of a Matrix, 45
Back Substitution, 30
Basic Variables, 26
Basis of a Vector Space, 64
Bilinear Form, 135
Cauchy-Schwarz Inequality, 99
Cayley Hamilton Theorem, 125
Change of Basis Theorem, 94
Characteristic Equation, 122
Characteristic Polynomial, 122
Cofactor Matrix, 45
Column Operations, 37
Column Rank of a Matrix, 37
Complex Vector Space, 52
Coordinates of a Vector, 77
Definition
Diagonal Matrix, 6
Equality of two Matrices, 5
Identity Matrix, 6
Lower Triangular Matrix, 6
Matrix, 5
Principal Diagonal, 6
Square Matrix, 6
Transpose of a Matrix, 6
Triangular Matrix, 6
Upper Triangular Matrix, 6
Zero Matrix, 6
Determinant
Properties, 145
Determinant of a Square Matrix, 42, 145
Dimension
Finite Dimensional Vector Space, 67
Eigen-pair, 122
Eigenvalue, 122
Eigenvector, 122
Elementary Matrices, 31
Elementary Row Operations, 23
Elimination
Gauss, 24
Gauss-Jordan, 30
Equality of Linear Operators, 82
Forward Elimination, 24
Free Variables, 26
Fundamental Theorem of Linear Algebra, 101
Gauss Elimination Method, 24
Gauss-Jordan Elimination Method, 30
Gram-Schmidt Orthogonalization Process, 107
Idempotent Matrix, 14
Identity Operator, 82
Inner Product, 97
Inner Product Space, 97
Inverse of a Linear Transformation, 90
Inverse of a Matrix, 12
Linear Algebra
Fundamental Theorem, 101
Linear Combination of Vectors, 58
Linear Dependence, 62
Linear Independence, 62
Linear Operator, 81
Equality, 82
Linear Span of Vectors, 59
Linear System, 20
Associated Homogeneous System, 21
Augmented Matrix, 21
Coefficient Matrix, 21
Equivalent Systems, 23
Homogeneous, 20
Non-Homogeneous, 20
Non-trivial Solution, 21
Solution, 21
Solution Set, 21
Trivial Solution, 21
Consistent, 26
Inconsistent, 26
Linear Transformation, 81
Matrix, 85
Matrix Product, 91
Null Space, 87
Range Space, 87
Composition, 91
Inverse, 90, 92
Nullity, 87
Rank, 87
Matrix, 5
Cofactor, 45
Column Rank, 37
Determinant, 42
Eigen-pair, 122
Eigenvalue, 122
Eigenvector, 122
Elementary, 31
Full Rank, 39
Hermitian, 130
Non-Singular, 42
Rank, 37
Row Equivalence, 23
Row-Reduced Echelon Form, 29
Scalar Multiplication, 7
Singular, 42
Skew-Hermitian, 130
Diagonalisation, 127
Idempotent, 14
Inverse, 12
Minor, 45
Nilpotent, 14
Normal, 130
Orthogonal, 14
Product of Matrices, 8
Row Echelon Form, 25
Row Rank, 36
Skew-Symmetric, 14
Submatrix, 14
Symmetric, 14
Trace, 17
Unitary, 130
Matrix Equality, 5
Matrix Multiplication, 8
Matrix of a Linear Transformation, 85
Minor of a Matrix, 45
Nilpotent Matrix, 14
Non-Singular Matrix, 42
Normal Matrix
Spectral Theorem, 134
Operations
Column, 37
Operator
Identity, 82
Order of Nilpotency, 14
Ordered Basis, 76
Orthogonal Complement, 112
Orthogonal Projection, 112
Orthogonal Subspace of a Set, 110
Orthogonal Vectors, 99
Orthonormal Basis, 104
Orthonormal Set, 104
Orthonormal Vectors, 104
Properties of Determinant, 145
QR Decomposition, 118
Generalized, 119
Rank Nullity Theorem, 89
Rank of a Matrix, 37
Real Vector Space, 52
Row Equivalent Matrices, 23
Row Operations
Elementary, 23
Row Rank of a Matrix, 36
Row-Reduced Echelon Form, 29
Sesquilinear Form, 135
Similar Matrices, 95
Singular Matrix, 42
Solution Set of a Linear System, 21
Spectral Theorem for Normal Matrices, 134
Square Matrix
Bilinear Form, 135
Determinant, 145
Sesquilinear Form, 135
Submatrix of a Matrix, 14
Subspace
Linear Span, 61
Orthogonal Complement, 100, 110
Sum of two Matrices, 7
System of Linear Equations, 20
Trace of a Matrix, 17
Transformation
Zero, 82
Unit Vector, 98
Unitary Equivalence, 130
Vector Space, 51
C
n
: Complex n-tuple, 53
R
n
: Real n-tuple, 53
Basis, 64
Dimension, 67
Dimension of M +N, 149
Inner Product, 97
Isomorphism, 93
Real, 52
Subspace, 55
Complex, 52
Finite Dimensional, 60
Infinite Dimensional, 60
Vector Subspace, 55
Vectors
Angle, 99
Coordinates, 77
Length, 98
Linear Combination, 58
Linear Independence, 62
Linear Span, 59
Norm, 98
Orthogonal, 99
Orthonormal, 104
Linear Dependence, 62
Zero Transformation, 82