
DEPARTMENT OF MATHEMATICS, IIT Guwahati

MA102: Mathematics II, January - May 2020


Lecture Notes

Note: If a proof of a result or a solution of an example is not written in this summary of lecture notes, then you are
advised to find the same in the recommended text book (Linear Algebra: A Modern Introduction by David Poole). Most
of the materials in this summary have been extracted from this text book. However, a few results and examples have
also been taken from other sources. Proofs/solutions of such results/examples have either been provided or are easy
enough for you to work out on your own.

1 System of Linear Equations and Matrices


An Example for Motivation:
To solve the system of linear equations: x − y − z = 2, 3x − 3y + 2z = 16, 2x − y + z = 9.
We solve this system by eliminating the variables.

Step 1: Represent the given system of equations in the rectangular array form as follows.

    x − y − z = 2            [ 1  −1  −1    2 ]
    3x − 3y + 2z = 16        [ 3  −3   2   16 ]
    2x − y + z = 9           [ 2  −1   1    9 ]

Step 2: Subtract 3 times the 1st equation from the 2nd equation; and
subtract 3 times the 1st row from the 2nd row.
 
    x − y − z = 2            [ 1  −1  −1    2 ]
            5z = 10          [ 0   0   5   10 ]
    2x − y + z = 9           [ 2  −1   1    9 ]

Step 3: Subtract 2 times the 1st equation from the 3rd equation; and
subtract 2 times the 1st row from the 3rd row.
 
    x − y − z = 2            [ 1  −1  −1    2 ]
            5z = 10          [ 0   0   5   10 ]
       y + 3z = 5            [ 0   1   3    5 ]

Step 4: Interchange the 2nd and 3rd equation; and interchange the 2nd and 3rd row.
 
    x − y − z = 2            [ 1  −1  −1    2 ]
       y + 3z = 5            [ 0   1   3    5 ]
            5z = 10          [ 0   0   5   10 ]

Now by backward substitution, we find that z = 2, y = −1, x = 3 is a solution of the given system of equations.
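The elimination above can also be reproduced numerically. The following is a minimal sketch (assuming the third-party numpy library, which is not part of these notes) that solves the same system and confirms the values obtained by back substitution.

    import numpy as np

    # Coefficient matrix and right-hand side of the motivating system
    A = np.array([[1, -1, -1],
                  [3, -3,  2],
                  [2, -1,  1]], dtype=float)
    b = np.array([2.0, 16.0, 9.0])

    # np.linalg.solve uses an elimination (LU) based method internally
    x = np.linalg.solve(A, b)
    print(x)  # [ 3. -1.  2.], i.e. x = 3, y = -1, z = 2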
A linear system of m equations in n unknowns x1 , x2 , . . . , xn is a set of equations of the form

    a11 x1 + a12 x2 + . . . + a1n xn = b1
    a21 x1 + a22 x2 + . . . + a2n xn = b2
      ...      ...               ...   ...                (1)
    am1 x1 + am2 x2 + . . . + amn xn = bm ,

where aij , bi ∈ C for each 1 ≤ i ≤ m and 1 ≤ j ≤ n. Letting


     
         [ a11  a12  . . .  a1n ]        [ x1 ]            [ b1 ]
         [ a21  a22  . . .  a2n ]        [ x2 ]            [ b2 ]
    A =  [  ..   ..          .. ] , x =  [ .. ]  and  b =  [ .. ] ,
         [ am1  am2  . . .  amn ]        [ xn ]            [ bm ]

we can represent the above system of equations as Ax = b.


It is now clear that matrices are quite handy for expressing a system of linear equations in compact form.

Definition:
• A rectangular array of (complex) numbers is called a matrix. Formally, an m × n matrix A = [aij ] is an array of
numbers in m rows and n columns as shown below:
 
         [ a11  a12  . . .  a1n ]
         [ a21  a22  . . .  a2n ]
    A =  [  ..   ..          .. ] .
         [ am1  am2  . . .  amn ]

The matrix A is called a matrix of size m × n or a matrix of order m × n.

• The number aij is called the (i, j)-th entry of A.

• A 1 × n matrix is called a row matrix (or row vector ) and an n × 1 matrix is called a column matrix (or
column vector ).

• Two matrices A = [aij ] and B = [bij ] are said to be equal if they are of the same size and aij = bij for each
i = 1, 2, . . . , m and j = 1, 2, . . . , n.

• An m × n matrix is called a zero matrix of size m × n, denoted Om×n (or simply O), if all the entries are equal
to 0.

• If m = n, then A is called a square matrix.

• If A is a square matrix, then the entries aii are called the diagonal entries of A.

• If A is a square matrix and if aij = 0 for all i ≠ j, then A is called a diagonal matrix.

• If an n × n diagonal matrix has all diagonal entries equal to 1, then it is called the identity matrix of size n,
and is denoted by In (or simply by I).

• A matrix B is said to be a sub-matrix of A if B is obtained by deleting some rows and/or columns of A.

• Let A = [aij ] and B = [bij ] be two m × n matrices. Then the sum A + B is defined to be the matrix C = [cij ],
where cij = aij + bij . Similarly, the difference A − B is defined to be the matrix D = [dij ], where dij = aij − bij .

• For a matrix A = [aij ] and c ∈ C (set of complex numbers), we define cA to be the matrix [caij ].

• Let A = [aij ] and B = [bjk ] be two m × n and n × r matrices, respectively. Then the product AB is defined to
be the m × r matrix AB = [cik ], where

cik = ai1 b1k + ai2 b2k + . . . + ain bnk .

• The transpose At of an m × n matrix A = [aij ] is defined to be the n × m matrix At = [aji ], where the i-th row
of At is the i-th column of A for all i = 1, 2, . . . , n.

• The matrix A is said to be symmetric if At = A, and skew-symmetric if At = −A.


• If A is a complex matrix, then its conjugate is Ā = [āij ] and A∗ = (Ā)t .

• The matrix A∗ is called the conjugate transpose of A.

• The (complex) matrix A is said to be Hermitian if A∗ = A, and skew-Hermitian if A∗ = −A.

• A square matrix A is said to be upper triangular if aij = 0 for all i > j.

• A square matrix A is said to be lower triangular if aij = 0 for all i < j.

• Let A be an n × n square matrix. Then we define A0 = In , A1 = A and A2 = AA.

• In general, if k is a positive integer, we define the power Ak as follows:

    Ak = AA . . . A  (k times).
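As a quick illustration of these definitions, the sketch below (assuming numpy, a third-party library) computes a sum, a product, a transpose and a conjugate transpose, and shows that the matrix product is not commutative in general.

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[0, 1], [1, 0]])

    print(A + B)       # entrywise sum
    print(A @ B)       # matrix product
    print(B @ A)       # differs from A @ B: the product is not commutative
    print(A.T)         # transpose A^t

    C = np.array([[1 + 1j, 2], [0, 1 - 1j]])
    print(C.conj().T)  # conjugate transpose C* = (C̄)^t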

It is easy to see that if A and O are matrices of the same size, then A + O = A = O + A, A − O = A, O − A = −A
and A − A = O.
Unless otherwise mentioned, all the matrices will be taken to have complex numbers as entries.
The method of mathematical induction is a useful tool for proving many results in mathematics. We now present two
equivalent versions of the method of induction.
Method of Induction: [Version I] Let P (n) be a mathematical statement based on all positive integers n.
Suppose that P (1) is true. If k ≥ 1 and if the assumption that P (k) is true gives that P (k + 1) is also true, then the
statement P (n) is true for all positive integers.

Method of Induction: [Version II] Let i be an integer and let P (n) be a mathematical statement based on
all integers n of the set {i, i + 1, i + 2, . . .}. Suppose that P (i) is true. If k ≥ i and if the assumption that P (k) is true
gives that P (k + 1) is also true, then the statement P (n) is true for all integers of the set {i, i + 1, i + 2, . . .}.

Result 1.1. Let A be an m × n matrix, ei a 1 × m standard unit row vector, and ej an n × 1 standard unit column
vector. Then ei A is the i-th row of A and Aej is the j-th column of A.

Result 1.2. Let A be a square matrix and let r and s be non-negative integers. Then Ar As = Ar+s and (Ar )s = Ars .

Proof. We have Ar As = (AA . . . A)(AA . . . A) = AA . . . A = Ar+s , where the first factor contains r copies of A, the
second contains s copies, and the resulting product contains r + s copies.
Also (Ar )s = Ar Ar . . . Ar (s factors) = AA . . . A (rs copies) = Ars .

Result 1.3. Let A, B and C be matrices of size m × n, and let s, r ∈ C. Then

1. Commutative Law: A + B = B + A.

2. Associative Law: (A + B) + C = A + (B + C).

3. 1A = A, s(rA) = (sr)A.

4. s(A + B) = sA + sB and (s + r)A = sA + rA.

Proof. Let A = [aij ], B = [bij ], C = [cij ] and s, r ∈ C. We have

1. ij-th entry of A + B = aij + bij = bij + aij = ij-th entry of B + A. Hence A + B = B + A.

2. ij-th entry of (A + B) + C = (aij + bij ) + cij = aij + (bij + cij ) = ij-th entry of A + (B + C). Hence (A + B) + C =
A + (B + C).

3. ij-th entry of 1A = 1.aij = aij = ij-th entry of A. Hence 1A = A.


Also ij-th entry of s(rA) = s(raij ) = (sr)aij = ij-th entry of (sr)A. Hence s(rA) = (sr)A.

4. ij-th entry of s(A + B) = s(aij + bij ) = saij + sbij = ij-th entry of sA + sB. Hence s(A + B) = sA + sB.
Also ij-th entry of (s + r)A = (s + r)aij = saij + raij = ij-th entry of sA + rA. Hence (s + r)A = sA + rA.

Result 1.4. Let A, B and C be matrices, and let s ∈ C. Then

1. Associative Law: (AB)C = A(BC), if the respective matrix products are defined.

2. Distributive Law: A(B + C) = AB + AC, (A + B)C = AC + BC, if the respective matrix sum and matrix
products are defined.

3. s(AB) = (sA)B = A(sB), if the respective matrix products are defined.

4. Im A = A = AIn , if A is of size m × n.

Proof. Let A = [aij ], B = [bij ], C = [cij ] and s ∈ C. Let In = [eij ] so that eij = 1 if i = j and eij = 0 otherwise.

1. Let the orders of A, B, C be m × p, p × q, q × n, respectively. Then the ij-th entry of (AB)C is equal to

    Σ_{k=1}^{q} ( Σ_{r=1}^{p} air brk ) ckj = Σ_{r=1}^{p} air ( Σ_{k=1}^{q} brk ckj ) = ij-th entry of A(BC).

Hence (AB)C = A(BC).

2. Let the orders of A, B, C be m × p, p × n, p × n, respectively. Then

    ij-th entry of A(B + C) = Σ_{k=1}^{p} aik (bkj + ckj ) = Σ_{k=1}^{p} aik bkj + Σ_{k=1}^{p} aik ckj = ij-th entry of AB + AC.

Hence A(B + C) = AB + AC.
Again let the orders of A, B, C be m × p, m × p, p × n, respectively. Then

    ij-th entry of (A + B)C = Σ_{k=1}^{p} (aik + bik )ckj = Σ_{k=1}^{p} aik ckj + Σ_{k=1}^{p} bik ckj = ij-th entry of AC + BC.

Hence (A + B)C = AC + BC.

3. Let the orders of A, B be m × p, p × n, respectively. Then

    ij-th entry of s(AB) = s Σ_{k=1}^{p} aik bkj = Σ_{k=1}^{p} (saik )bkj = ij-th entry of (sA)B.

Hence s(AB) = (sA)B. Again

    ij-th entry of s(AB) = s Σ_{k=1}^{p} aik bkj = Σ_{k=1}^{p} aik (sbkj ) = ij-th entry of A(sB).

Hence s(AB) = A(sB).

4. Let the order of A be m × n. Then

    ij-th entry of Im A = Σ_{k=1}^{m} eik akj = eii aij = aij = ij-th entry of A.

Hence Im A = A. Again

    ij-th entry of AIn = Σ_{k=1}^{n} aik ekj = aij ejj = aij = ij-th entry of A.

Hence AIn = A.

Result 1.5. Let A and B be two matrices and k ∈ C. Then

1. (At )t = A, (kA)t = kAt .

2. (A + B)t = At + B t if A and B are of the same size.

3. (AB)t = B t At if the matrix product AB is defined.

4. (Ar )t = (At )r for any non-negative integer r.

Proof. Let A = [aij ], B = [bij ], At = [a′ij ], B t = [b′ij ] and k ∈ C, so that a′ij = aji and b′ij = bji .

1. ij-th entry of (At )t = ji-th entry of At = ij-th entry of A. Hence (At )t = A. Again
ij-th entry of (kA)t = ji-th entry of kA = kaji = ka′ij = ij-th entry of kAt . Hence (kA)t = kAt .

2. ij-th entry of (A + B)t = ji-th entry of A + B = aji + bji = a′ij + b′ij = ij-th entry of At + B t . Hence
(A + B)t = At + B t .

3. ij-th entry of (AB)t = ji-th entry of AB = Σ_{k=1}^{p} ajk bki = Σ_{k=1}^{p} b′ik a′kj = ij-th entry of B t At . Hence (AB)t = B t At .

4. We use induction on r. For r = 0, it is clear that (A0 )t = I = (At )0 . Now let us assume that (Ak )t = (At )k for a

k ≥ 0. Then applying this assumption as well as Part 3, we have

(Ak+1 )t = (Ak .A)t = At (Ak )t = At (At )k = (At )k+1 .

Thus (Ar )t = (At )r is also true for r = k + 1. Hence, by the principle of mathematical induction, we have that
(Ar )t = (At )r for any non-negative integer r.

Partitioned Matrix: By introducing vertical and horizontal lines into a given matrix, we can partition it into
some blocks of smaller sub-matrices. A given matrix can have several possible partitions. For example, one partition
of the matrix A separates the first two rows and the first two columns:

        [ 1  0 | 0  2 ]
        [ 0  1 | 1  1 ]
    A = [-------------] .
        [ 0  0 | 2  1 ]
        [ 0  0 | 1  5 ]

• A block matrix A = [Aij ] is a matrix, where each entry Aij is itself a matrix.

• Thus a partition of a given matrix gives us a block matrix.

• For example, the partition of the matrix A shown above gives the block matrix

    [ I  B ]
    [ O  C ] ,

  where

    I = [ 1  0 ]      [ 0  0 ]      [ 0  2 ]          [ 2  1 ]
        [ 0  1 ] , O = [ 0  0 ] , B = [ 1  1 ]  and C = [ 1  5 ] .
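Block assembly can be checked directly. A small sketch (assuming numpy; np.block stitches blocks into one matrix):

    import numpy as np

    I = np.eye(2, dtype=int)
    O = np.zeros((2, 2), dtype=int)
    B = np.array([[0, 2], [1, 1]])
    C = np.array([[2, 1], [1, 5]])

    A = np.block([[I, B], [O, C]])  # reassemble the partitioned matrix
    print(A)                        # recovers the 4 x 4 matrix A above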

Result 1.6. Let A and B be two matrices of sizes m × n and n × r, respectively.

1. If B = [b1 | b2 | . . . | br ], where bj is the j-th column of B then AB = [Ab1 | Ab2 | . . . | Abr ].


   
2. If

         [ a1 ]
         [ a2 ]
    A =  [ .. ] ,
         [ am ]

where ai is the i-th row of A, then

          [ a1 B ]
          [ a2 B ]
    AB =  [  ..  ] .
          [ am B ]

3. If m = n = r and if A = [at1 | at2 | . . . | atn ] and

         [ bt1 ]
         [ bt2 ]
    B =  [ ..  ] ,
         [ btn ]

where atk is the k-th column of A and btk is the k-th row of B, then AB = at1 bt1 + at2 bt2 + . . . + atn btn .

Proof. Let A = [aij ] and B = [bij ] be of sizes m × n and n × r, respectively.

1. Here bj = [b1j , b2j , . . . , bnj ]t . The i-th entry of the j-th column of AB is Σ_{k=1}^{n} aik bkj , which is exactly
the i-th entry of Abj . Thus the j-th column of AB equals Abj , and hence AB = [Ab1 | Ab2 | . . . | Abr ].

2. Here ai = [ai1 ai2 . . . ain ]. We have

    i-th row of AB = [ Σ_{k=1}^{n} aik bk1 , Σ_{k=1}^{n} aik bk2 , . . . , Σ_{k=1}^{n} aik bkr ] = ai B.

Hence the i-th row of AB is ai B for each i, as claimed.

3. Here atk = [a1k , a2k , . . . , amk ]t and btk = [bk1 bk2 . . . bkr ]. Now clearly the ij-th entry of atk btk is aik bkj .
Therefore

    ij-th entry of at1 bt1 + at2 bt2 + . . . + atn btn = ai1 b1j + ai2 b2j + . . . + ain bnj = ij-th entry of AB.

Hence AB = at1 bt1 + at2 bt2 + . . . + atn btn .
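These three descriptions of the product can be checked numerically. The following sketch (assuming numpy) verifies all three views on a small example; note that the outer-product expansion of part 3 in fact holds for any compatible sizes, not only for m = n = r, and the sketch uses that more general form.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(2, 3))
    B = rng.integers(-3, 4, size=(3, 2))
    AB = A @ B

    # 1. Column view: the j-th column of AB is A times the j-th column of B.
    cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

    # 2. Row view: the i-th row of AB is the i-th row of A times B.
    rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

    # 3. Outer-product view: AB is the sum of (k-th column of A)(k-th row of B).
    outer = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

    assert (AB == cols).all() and (AB == rows).all() and (AB == outer).all()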

Practice Problems Set 1

1. Find two different 2 × 2 real matrices A and B such that A2 = O = B 2 but A ≠ O, B ≠ O.


   
2. Let

        [ 1  −1   2 ]           [ 0  1   3 ]
    A = [ 2  −3   4 ]  and  B = [ 1  2  −3 ] .
        [ 4  −1   3 ]           [ 2  1  −1 ]

Compute A2 + 2AB + B 2 and (A + B)2 . Are they equal? If not, give reasons.
3. Let A and B be two n × n matrices. If AB = BA then show that (AB)m = Am B m and

    (A + B)m = Σ_{i=0}^{m} (m choose i) Am−i B i

for every m ∈ N. If AB ≠ BA then show that these two results need not be true.

4. Show that every square matrix can be written as a sum of a symmetric and a skew-symmetric matrix. Further,
show that if A and B are symmetric, then AB is symmetric if and only if AB = BA. Give an example to show
that if A and B are symmetric then AB need not be symmetric.
 
5. Let

        [ 1  −1 ]           [  3  1 ]
    A = [ 2   2 ]  and  B = [ −4  4 ] .
        [ 1   0 ]

Is there a matrix C such that CA = B?

6. Let

    C = [ a  b ]
        [ c  d ]

be a 2 × 2 matrix. Show that there exist two 2 × 2 matrices A and B satisfying C = AB − BA
if and only if a + d = 0.

7. Find conditions on the numbers a, b, c and d such that

    A = [ a  b ]
        [ c  d ]

commutes with every 2 × 2 matrix.

8. Let A1 , A2 , . . . , An be matrices of the same size, where n ≥ 1. Using mathematical induction, prove that

(a) (A1 + A2 + . . . + An )t = At1 + At2 + . . . + Atn ; and


(b) (A1 A2 . . . An )t = Atn Atn−1 . . . At1 .

9. For an n × n matrix A = [aij ], the trace is defined as tr(A) = a11 + . . . + ann . If A and B are two n × n matrices
then prove that tr(A + B) = tr(A) + tr(B), tr(AB) = tr(BA) and tr(kA) = k.tr(A), where k ∈ R. Also find an
expression of tr(AAt ) in terms of entries of A.

10. Let A and B be two n × n matrices. If AB = O then show that tr((A + B)k ) = tr(Ak ) + tr(B k ) for any positive
integer k.

11. Let

    A = [ 0  1 ]
        [ 1  1 ] .

Show that tr(Ak ) = tr(Ak−1 ) + tr(Ak−2 ), for any positive integer k.

12. Let A = [aij ] be an n × n matrix such that aij = 1 if i = j + 1, and aij = 0 otherwise. Show that An = O but
Al ≠ O for 1 ≤ l ≤ n − 1.

13. Show that the product of two lower triangular matrices is a lower triangular matrix. (A similar statement also
holds for upper triangular matrices.)

14. Let A and B be two skew-symmetric matrices such that AB = BA. Is the matrix AB symmetric or skew-
symmetric?

15. Let A and B be two m × n matrices. Prove that if Ax = 0 for all x ∈ Rn then A is the zero matrix. Further,
prove that if Ax = Bx for all x ∈ Rn then A = B.
 
16. Let

        [ 1  2 ]
    A = [ 2  1 ] .
        [ 3  1 ]

Show that there exist infinitely many matrices B such that BA = I2 . Also, show that there
does not exist any matrix C such that AC = I3 .

17. Show that if A is a complex triangular matrix and AA∗ = A∗ A, then A is a diagonal matrix.

18. Let A be a real matrix such that AAt = O. Show that A = O. Is the same true if A is a complex matrix?

Hints to Practice Problems Set 1

1. For example,

    A = [ 0  0 ]          [ 1  −1 ]
        [ 1  0 ]  and B = [ 1  −1 ] .

2. The reason is AB ≠ BA.

3. Use induction on m. Find counterexamples for the last part.

4. A = (1/2)(A + At ) + (1/2)(A − At ). For the last part, take

    A = [ 1  2 ]          [ 1  3 ]
        [ 2  1 ]  and B = [ 3  0 ] .

5. Solve CA = B for

    C = [ a  b  c ]
        [ x  y  z ] .

One solution is

    C = [  1  1  0 ]
        [ −4  0  0 ] .

6. For the first part, use tr(C) = tr(AB − BA) = 0. For the second part, if a ≠ 0 then take

    A = [ 0  a ]          [ 0      1     ]
        [ 0  c ]  and B = [ 1  (b + c)/a ] .

If a = 0, then solve

    C = [ x  0 ] [ p  q ]   [ p  q ] [ x  0 ]
        [ 0  y ] [ r  s ] − [ r  s ] [ 0  y ]

to obtain

    A = [ 1    0   ]      [ 0  −b/c ]
        [ 0  1 + c ], B = [ 1    0  ]  for c ≠ 0, and

    A = [ 1    0   ]      [   0   1 ]
        [ 0  1 − b ], B = [ −c/b  0 ]  for b ≠ 0.

7. a = d, b = c = 0

8. Easy.
9. tr(AAt ) = Σ_{i,j=1}^{n} (aij )2 , the sum of the squares of all the entries of A.

10. Use the expansion of (A + B)k and the facts AB = O, tr(XY ) = tr(Y X).

11. Use induction on k.

12. Show that A2 has n − 2 non-zero rows, and so on.


13. If [aij ] and [bij ] are lower triangular matrices then show that Σ_{k=1}^{n} aik bkj = 0 for i < j.

14. AB is symmetric.

15. Take x = ei to show that the i-th column of A is zero.

16. Take

    X = [ x  y  z ]
        [ a  b  c ]

to solve the systems XA = I2 and AX = I3 .

17. Compare the (i, i)-entries of AA∗ and A∗ A.

18. Use Problem 9 or compare the (i, i)-entries of AAt and O.

2 Solutions of System of Linear Equations
Row Echelon Form: A matrix A is said to be in row echelon form if it satisfies the following properties:
1. All rows consisting entirely of 0’s are at the bottom.

2. In each non-zero row, the first non-zero entry (called the leading entry or pivot) is in a column strictly to the left
of any leading entry below it. [A column containing a leading entry is called a leading column.]

Notice that if a matrix A is in row echelon form, then in each column of A containing a leading entry, the entries
below that leading entry are zero. For example, the following matrices are in row echelon form:

    [ 1  −1 ]    [ 1  −1  −1 ]    [ 0  −1  3   2 ]
    [ 0   1 ] ,  [ 0   1   5 ] ,  [ 0   0  5  10 ] .
                 [ 0   0   1 ]    [ 0   0  0   0 ]

However, the following matrices are not in row echelon form:

    [ 1  −1 ]    [ 0  −1 ]    [ 1  −1  −1 ]    [ 0  0  0   0 ]
    [ 2   1 ] ,  [ 2   1 ] ,  [ 0   0   5 ] ,  [ 1  0  5  10 ] .
                              [ 0   1   1 ]    [ 0  0  0   0 ]

Elementary Row Operations: The following row operations are called the elementary row operations on a matrix:
1. Interchange of two rows Ri and Rj (shorthand notation Ri ↔ Rj ).

2. Multiply a row Ri by a non-zero constant c (shorthand notation Ri → cRi ).

3. Add a multiple of a row Rj to another row Ri (shorthand notation Ri → Ri + cRj ).

Any given matrix can be reduced to a row echelon form by applying suitable elementary row operations on the matrix.

Example 2.1. Transform the following matrix to row echelon form:

    [ 0   2   3   8 ]
    [ 2   3   1   5 ] .
    [ 1  −1  −2  −5 ]

Solution. We have

    [ 0   2   3   8 ]            [ 1  −1  −2  −5 ]                  [ 1  −1  −2  −5 ]
    [ 2   3   1   5 ]  R1 ↔ R3   [ 2   3   1   5 ]  R2 → R2 − 2R1   [ 0   5   5  15 ]
    [ 1  −1  −2  −5 ]            [ 0   2   3   8 ]                  [ 0   2   3   8 ]

                   [ 1  −1  −2  −5 ]                  [ 1  −1  −2  −5 ]
    R2 → (1/5)R2   [ 0   1   1   3 ]  R3 → R3 − 2R2   [ 0   1   1   3 ] .
                   [ 0   2   3   8 ]                  [ 0   0   1   2 ]

Thus a row echelon form of the given matrix is

    [ 1  −1  −2  −5 ]
    [ 0   1   1   3 ] .
    [ 0   0   1   2 ]
Note that if A is in row echelon form then for any c ≠ 0, the matrix cA is also in row echelon form. Thus a given
matrix can be reduced to several row echelon forms.
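The reduction in Example 2.1 can be replayed step by step in exact arithmetic. The sketch below (assuming the third-party sympy library) applies the same elementary row operations to a mutable sympy Matrix; the rref() call at the end goes further and produces the reduced row echelon form discussed later in this section.

    from sympy import Matrix

    M = Matrix([[0, 2, 3, 8],
                [2, 3, 1, 5],
                [1, -1, -2, -5]])

    M.row_swap(0, 2)                 # R1 <-> R3
    M[1, :] = M[1, :] - 2 * M[0, :]  # R2 -> R2 - 2 R1
    M[1, :] = M[1, :] / 5            # R2 -> (1/5) R2
    M[2, :] = M[2, :] - 2 * M[1, :]  # R3 -> R3 - 2 R2
    print(M)                         # the row echelon form found above

    R, pivots = M.rref()             # reduced row echelon form, pivot columns
    print(R, pivots)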

Row Equivalent Matrices: Matrices A and B are said to be row equivalent if there is a finite sequence of
elementary row operations that converts A into B or B into A. We will see later that if A can be converted into B
through a finite sequence of elementary row operations, then B can also be converted into A through a finite sequence
of elementary row operations.

Result 2.1. Matrices A and B are row equivalent if and only if (iff) they can be reduced to the same row echelon form.

Linear System of Equations: Consider the following linear system of equations

    a11 x1 + a12 x2 + . . . + a1n xn = b1
    a21 x1 + a22 x2 + . . . + a2n xn = b2
      ...      ...               ...   ...                (2)
    am1 x1 + am2 x2 + . . . + amn xn = bm ,

where aij , bi ∈ C for each 1 ≤ i ≤ m and 1 ≤ j ≤ n. Letting


     
         [ a11  a12  . . .  a1n ]        [ x1 ]            [ b1 ]
         [ a21  a22  . . .  a2n ]        [ x2 ]            [ b2 ]
    A =  [  ..   ..          .. ] , x =  [ .. ]  and  b =  [ .. ] ,
         [ am1  am2  . . .  amn ]        [ xn ]            [ bm ]

we write the above system of equations as Ax = b.

• The matrix A is called the coefficient matrix of the system of equations Ax = b.

• The matrix [A| b], as given below, is called the augmented matrix of the system of equations Ax = b.
 
              [ a11  a12  . . .  a1n | b1 ]
              [ a21  a22  . . .  a2n | b2 ]
    [A| b] =  [  ..   ..          .. | .. ] .
              [ am1  am2  . . .  amn | bm ]

The vertical bar is used in the augmented matrix [A| b] only to separate the column vector b from the coefficient
matrix A.

• If b = 0 = [0, 0, . . . , 0]t , i.e., if b1 = b2 = . . . = bm = 0, the system Ax = 0 is called a homogeneous system of
equations. Otherwise, if b ≠ 0 then Ax = b is called a non-homogeneous system of equations.

• A solution of the linear system Ax = b is a column vector y = [y1 , y2 , . . . , yn ]t such that the linear system (2)
is satisfied by substituting yi in place of xi . That is, Ay = b holds true.

• The solution 0 of Ax = 0 is called the trivial solution and any other solution of Ax = 0 is called a non-trivial
solution.

• Two systems of linear equations are called equivalent if their augmented matrices are row equivalent.

• An elementary row operation on Ax = b is an elementary row operation on the augmented matrix [A | b].

Result 2.2. Let Cx = d be the linear system obtained from the linear system Ax = b by a single elementary row
operation. Then the linear systems Ax = b and Cx = d have the same set of solutions.

Proof. Let the system Ax = b be given by

    a11 x1 + a12 x2 + . . . + a1n xn = b1
      ...
    ai1 x1 + ai2 x2 + . . . + ain xn = bi
      ...
    aj1 x1 + aj2 x2 + . . . + ajn xn = bj
      ...
    am1 x1 + am2 x2 + . . . + amn xn = bm .

Let [α1 , α2 , . . . , αn ]t be a solution of Ax = b. Then we have

    a11 α1 + a12 α2 + . . . + a1n αn = b1
      ...
    ai1 α1 + ai2 α2 + . . . + ain αn = bi                 (3)
      ...
    aj1 α1 + aj2 α2 + . . . + ajn αn = bj
      ...
    am1 α1 + am2 α2 + . . . + amn αn = bm .

Let Cx = d be obtained from Ax = b by a single elementary row operation. We consider three cases.


Case I: Let Cx = d be obtained from Ax = b by applying Ri ↔ Rj . Then the system Cx = d is given by

    a11 x1 + a12 x2 + . . . + a1n xn = b1
      ...
    aj1 x1 + aj2 x2 + . . . + ajn xn = bj
      ...
    ai1 x1 + ai2 x2 + . . . + ain xn = bi
      ...
    am1 x1 + am2 x2 + . . . + amn xn = bm .

Notice that (3) can also be written as

    a11 α1 + a12 α2 + . . . + a1n αn = b1
      ...
    aj1 α1 + aj2 α2 + . . . + ajn αn = bj
      ...
    ai1 α1 + ai2 α2 + . . . + ain αn = bi
      ...
    am1 α1 + am2 α2 + . . . + amn αn = bm ,

which shows that [α1 , α2 , . . . , αn ]t is a solution of Cx = d. Since Ax = b is obtained from Cx = d by applying


again Ri ↔ Rj , we can conclude by the same argument as above that, if [β1 , β2 , . . . , βn ]t is a solution of Cx = d then
[β1 , β2 , . . . , βn ]t is also a solution of Ax = b. Thus Ax = b and Cx = d have the same set of solutions.
Case II: Let Cx = d be obtained from Ax = b by applying Ri → cRi (c ≠ 0). Then the system Cx = d is given by

    a11 x1 + a12 x2 + . . . + a1n xn = b1
      ...
    cai1 x1 + cai2 x2 + . . . + cain xn = cbi
      ...
    am1 x1 + am2 x2 + . . . + amn xn = bm .

Notice that by multiplying the i-th equality in (3) by c, we find

    a11 α1 + a12 α2 + . . . + a1n αn = b1
      ...
    cai1 α1 + cai2 α2 + . . . + cain αn = cbi
      ...
    am1 α1 + am2 α2 + . . . + amn αn = bm ,

which shows that [α1 , α2 , . . . , αn ]t is a solution of Cx = d. Since Ax = b is obtained from Cx = d by applying
Ri → (1/c)Ri , we can conclude by the same argument as above that, if [β1 , β2 , . . . , βn ]t is a solution of Cx = d, then
[β1 , β2 , . . . , βn ]t is also a solution of Ax = b. Thus Ax = b and Cx = d have the same set of solutions.
Case III: Let Cx = d be obtained from Ax = b by applying Rj → Rj + cRi . Then the system Cx = d is given by

    a11 x1 + a12 x2 + . . . + a1n xn = b1
      ...
    ai1 x1 + ai2 x2 + . . . + ain xn = bi
      ...
    (aj1 + cai1 )x1 + (aj2 + cai2 )x2 + . . . + (ajn + cain )xn = bj + cbi
      ...
    am1 x1 + am2 x2 + . . . + amn xn = bm .

Notice that replacing the j-th equality in (3) by the result of adding c times the i-th equality to the j-th equality,
we find

    a11 α1 + a12 α2 + . . . + a1n αn = b1
      ...
    ai1 α1 + ai2 α2 + . . . + ain αn = bi
      ...
    (aj1 + cai1 )α1 + (aj2 + cai2 )α2 + . . . + (ajn + cain )αn = bj + cbi
      ...
    am1 α1 + am2 α2 + . . . + amn αn = bm ,

which shows that [α1 , α2 , . . . , αn ]t is a solution of Cx = d. Since Ax = b is obtained from Cx = d by applying


Rj → Rj − cRi , we can conclude by the same argument as above that, if [β1 , β2 , . . . , βn ]t is a solution of Cx = d then
[β1 , β2 , . . . , βn ]t is also a solution of Ax = b. Thus Ax = b and Cx = d have the same set of solutions.

Result 2.3. Two equivalent systems of linear equations have the same set of solutions.

Proof. Let Ax = b and Cx = d be two equivalent systems of linear equations, so that the augmented matrices [A | b]
and [C | d] are row equivalent. Then the matrix [C | d] is obtained by applying a finite number of elementary
row operations to [A | b]. By Result 2.2, we know that the solution set does not change under a single
elementary row operation on Ax = b. Therefore the solution set does not change under a finite number of
elementary row operations on Ax = b. Hence the systems Ax = b and Cx = d have the same set of solutions.
Gaussian Elimination Method: Use the following steps to solve a system of equations Ax = b.
1. Write the augmented matrix [A | b].

2. Use elementary row operations to reduce [A | b] to row echelon form.

3. Use back substitution to solve the equivalent system that corresponds to the row echelon form.
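Step 3, back substitution, is easy to mechanize. Below is a minimal sketch (assuming numpy; the helper name back_substitute is ours, not a library routine) for a square upper triangular system with non-zero pivots, applied to the row echelon form reached in the motivating example of Section 1.

    import numpy as np

    def back_substitute(U, c):
        """Solve Ux = c for an upper triangular U with non-zero diagonal."""
        n = len(c)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            # subtract the already-known part, then divide by the pivot
            x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

    U = np.array([[1.0, -1.0, -1.0],
                  [0.0,  1.0,  3.0],
                  [0.0,  0.0,  5.0]])
    c = np.array([2.0, 5.0, 10.0])
    print(back_substitute(U, c))  # [ 3. -1.  2.]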

Example 2.2. Use Gaussian Elimination method to solve the system:
(a) y + z = 1, x + y + z = 2, x + 2y + 2z = 3
(b) y + z = 1, x + y + z = 2, x + 2y + 3z = 4
(c) y + z = 1, x + y + z = 2, x + 2y + 2z = 4.

Solution.

(a) The augmented matrix of the given system is

    [ 0  1  1 | 1 ]           [ 1  1  1 | 2 ]
    [ 1  1  1 | 2 ]  R1 ↔ R2  [ 0  1  1 | 1 ]
    [ 1  2  2 | 3 ]           [ 1  2  2 | 3 ]

                   [ 1  1  1 | 2 ]                 [ 1  1  1 | 2 ]
    R3 → R3 − R1   [ 0  1  1 | 1 ]  R3 → R3 − R2   [ 0  1  1 | 1 ] .
                   [ 0  1  1 | 1 ]                 [ 0  0  0 | 0 ]

Thus the given system of equations is reduced to x + y + z = 2 and y + z = 1. Letting z = s, we find y = 1 − z = 1 − s
and x = 2 − y − z = 2 − (1 − s) − s = 1. Thus a solution [x, y, z]t of the given system is of the form [1, 1 − s, s]t .
Hence the required solution set is {[1, 1 − s, s]t : s ∈ R}.

(b) The augmented matrix of the given system is

    [ 0  1  1 | 1 ]           [ 1  1  1 | 2 ]
    [ 1  1  1 | 2 ]  R1 ↔ R2  [ 0  1  1 | 1 ]
    [ 1  2  3 | 4 ]           [ 1  2  3 | 4 ]

                   [ 1  1  1 | 2 ]                 [ 1  1  1 | 2 ]
    R3 → R3 − R1   [ 0  1  1 | 1 ]  R3 → R3 − R2   [ 0  1  1 | 1 ] .
                   [ 0  1  2 | 2 ]                 [ 0  0  1 | 1 ]

Thus the given system of equations is reduced to x + y + z = 2, y + z = 1 and z = 1. This gives y = 1 − z = 1 − 1 = 0
and x = 2 − y − z = 2 − 0 − 1 = 1. Thus the solution [x, y, z]t of the given system is [1, 0, 1]t . Hence the
required solution set is {[1, 0, 1]t }.

(c) The augmented matrix of the given system is

    [ 0  1  1 | 1 ]           [ 1  1  1 | 2 ]
    [ 1  1  1 | 2 ]  R1 ↔ R2  [ 0  1  1 | 1 ]
    [ 1  2  2 | 4 ]           [ 1  2  2 | 4 ]

                   [ 1  1  1 | 2 ]                 [ 1  1  1 | 2 ]
    R3 → R3 − R1   [ 0  1  1 | 1 ]  R3 → R3 − R2   [ 0  1  1 | 1 ] .
                   [ 0  1  1 | 2 ]                 [ 0  0  0 | 1 ]

The equation corresponding to the last row of this equivalent augmented matrix is 0.x + 0.y + 0.z = 1, i.e., 0 = 1,
which is absurd. Hence the given system of equations does not have a solution, that is, the solution set is ∅.
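All three cases of Example 2.2 can be diagnosed at once from the pivot columns of the reduced row echelon form of the augmented matrix. A small sketch (assuming sympy): a pivot in the last (augmented) column signals inconsistency, and fewer pivots than variables signal free variables.

    from sympy import Matrix

    systems = {
        "(a)": Matrix([[0, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2, 3]]),
        "(b)": Matrix([[0, 1, 1, 1], [1, 1, 1, 2], [1, 2, 3, 4]]),
        "(c)": Matrix([[0, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2, 4]]),
    }
    for name, aug in systems.items():
        R, pivots = aug.rref()
        print(name, "pivot columns:", pivots)
    # (a) (0, 1)    : two pivots, z free  -> infinitely many solutions
    # (b) (0, 1, 2) : three pivots        -> unique solution
    # (c) (0, 1, 3) : pivot in the augmented column -> inconsistent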

• If the system Ax = b has at least one solution then it is called a consistent system. Otherwise, it is called an
inconsistent system.

Reduced Row Echelon Form: A matrix A is said to be in reduced row echelon form (RREF) if it satisfies
the following properties:

1. A is in row echelon form.

2. The leading entry in each non-zero row is a 1.

3. Each column containing a leading 1 has zeros everywhere else.

For example, the following matrices are in reduced row echelon form:

    [ 1  0 ]    [ 1  0  0 ]    [ 0  1  0   2 ]
    [ 0  1 ] ,  [ 0  1  0 ] ,  [ 0  0  1  10 ] .
                [ 0  0  0 ]    [ 0  0  0   0 ]

Applying elementary row operations, any given matrix can be transformed to a reduced row echelon form.
Leading and Free Variable: Consider the linear system Ax = b in n variables and m equations. Let [R | r] be
the reduced row echelon form of the augmented matrix [A | b].

• Then the variables corresponding to the leading columns in the first n columns of [R | r] are called the leading
variables or basic variables.

• The variables which are not leading are called free variables.

Remark: If A = [B | C], and RREF (A) = [R1 | R2 ], then note that R1 is the RREF of B.

Result 2.4. Every matrix has a unique reduced row echelon form.

Proof. We will apply induction on the number of columns.

Base Case: Consider a matrix with one column, i.e., Am×1 . Since the only possibilities for its RREF are Om×1 and
e1 , its RREF is unique.
Induction Hypothesis: Assume that every matrix having k − 1 columns has a unique RREF.
Inductive Case: We need to show that Am×k also has a unique RREF. Assume A has two RREFs, say B and C.
Note that, because of the induction hypothesis, the sub-matrices B1 and C1 formed by the first k − 1 columns of B
and C, respectively, are identical. In particular, they have the same number of non-zero rows.
We know that the systems Ax = 0, Bx = 0 and Cx = 0 have the same set of solutions. Suppose the k-th columns
of B and C differ in the i-th row, i.e., bik ≠ cik , so that bik − cik ≠ 0.
Let u be an arbitrary solution of Ax = 0.
⇒ Bu = 0, Cu = 0 and (B − C)u = 0
⇒ (bik − cik )uk = 0, since B1 = C1 makes the k-th column the only non-zero column of B − C
⇒ uk must be 0
⇒ xk is not a free variable, as xk can take only one value
⇒ there must be a leading 1 in the k-th column of both B and C
⇒ the location of this leading 1 differs between B and C, as both k-th columns are standard unit vectors and they differ
⇒ the numbers of non-zero rows in B1 and C1 are different, a contradiction.
This contradiction leads us to conclude that B = C.
Hence, by the method of induction, we conclude that every matrix has a unique reduced row echelon form.

Result 2.5. The matrices A and B are row equivalent iff RREF (A) = RREF (B).

Method of transforming a given matrix to reduced row echelon form: Let A be an m×n matrix.
Then the following step by step method is used to obtain the reduced row echelon form of the matrix A.

1. Let the i-th column be the left most non-zero column of A. Interchange rows, if necessary, to make the first entry
of this column non-zero. Now use elementary row operations to make all the entries below this first entry equal
to 0.

2. Forget the first row and first i columns. Start with the lower (m − 1) × (n − i) sub-matrix of the matrix obtained
in the first step and proceed as in Step 1.

3. Repeat the above steps until we get a row echelon form of A.

4. Now use the leading term in each leading column to make (by elementary row operations) all other entries
in that column equal to zero. Apply this step starting from the rightmost leading column.

5. Scale the leading entries of the matrix obtained in the previous step, by multiplying the rows by suitable constants,
to make all the leading entries equal to 1, ending with the unique reduced row echelon form of A.

Gauss-Jordan Elimination Method: Use the following steps to solve a system of equations Ax = b.
1. Write the augmented matrix [A | b].

2. Use elementary row operations to transform [A | b] to reduced row echelon form.

3. Use back substitution to solve the equivalent system that corresponds to the reduced row echelon form. That is,
solve for the leading variables in terms of the remaining free variables, if possible.

Example 2.3. Solve the system w − x − y + 2z = 1, 2w − 2x − y + 3z = 3, − w + x − y = −3 using Gauss-Jordan


elimination method.

Example 2.4. Solve the following systems using Gauss-Jordan elimination method:
(a) 2y + 3z = 8, 2x + 3y + z = 5, x − y − 2z = −5
(b) x − y + 2z = 3, x + 2y − z = −3, 2y − 2z = 1.

Rank: The rank of a matrix A, denoted rank(A), is the number of non-zero rows in its reduced row echelon form.
Result 2.6. Let Ax = b be a system of equations with n variables. Then the number of free variables is equal to n − rank(A).

Proof. By definition of rank, the number of basic variables is equal to the rank of A. Hence the number of free variables
is equal to n − rank(A).

Result 2.7. Let Ax = 0 be a homogeneous system of equations with n variables. If rank(A) < n then the system has
infinitely many solutions.

Proof. Since rank(A) < n, we have n − rank(A) > 0, so the system has at least one free variable. Since each
free variable can be assigned any real number, the system has infinitely many solutions.

Result 2.8. Let Ax = b be a system of equations with n variables.

1. If rank(A) ≠ rank([A | b]) then the system Ax = b is inconsistent;

2. If rank(A) = rank([A | b]) = n then the system Ax = b has a unique solution; and

3. If rank(A) = rank([A | b]) < n then the system Ax = b has infinitely many solutions.

Proof. The given system is Ax = b, where x = [x1 , x2 , . . . , xn ]t .

1. If rank(A) ≠ rank([A | b]), that is, rank(A) < rank([A | b]), then the last non-zero row of the reduced row echelon
form of [A | b] will be of the form [0, 0, . . . , 0 | 1]. The equation corresponding to this row becomes 0 = 1, which
is absurd. Hence the system Ax = b does not have a solution, that is, it is inconsistent.

2. Let [R | r] be the sub-matrix consisting of the non-zero rows of reduced row echelon form of [A | b] and let
r = [r1 , r2 , . . . , rn ]t . Then the system Ax = b is equivalent to the system Rx = r, and R is an identity matrix of
order n. Now the equations of the system Rx = r give x1 = r1 , x2 = r2 , . . . , xn = rn . Thus the system Rx = r
has the unique solution r, and hence the system Ax = b has a unique solution.

3. Let [R | r] be the sub-matrix consisting of the non-zero rows of reduced row echelon form of [A | b]. Then the
system Ax = b is equivalent to the system Rx = r. Since rank(A) = rank([A | b]) < n, the number of rows
of [R | r] is less than n. Therefore there is at least one free variable to the system Rx = r. Further, since
rank(A) = rank([A | b]), no row of [R | r] is of the form [0, 0, . . . , 0 | 1]. Therefore the system Rx = r cannot be
inconsistent.
Since free variables can be assigned any real numbers, we find infinitely many choices for the values of all the
variables of the system Rx = r. Therefore the system Rx = r has infinitely many solutions, and hence the system
Ax = b has infinitely many solutions.
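Result 2.8 translates directly into a rank test. A minimal sketch (assuming sympy; classify is our own helper name, not a library function):

    from sympy import Matrix

    def classify(A, b):
        """Classify Ax = b by comparing rank(A), rank([A | b]) and n (Result 2.8)."""
        n = A.cols
        rA = A.rank()
        rAb = A.row_join(b).rank()
        if rA != rAb:
            return "inconsistent"
        return "unique solution" if rA == n else "infinitely many solutions"

    A = Matrix([[1, 1, 1], [0, 1, 1], [0, 0, 0]])
    print(classify(A, Matrix([2, 1, 0])))  # infinitely many solutions
    print(classify(A, Matrix([2, 1, 1])))  # inconsistent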
Practice Problems Set 2

1. What are the possible types of reduced row echelon forms of 2 × 2 and 3 × 3 matrices?

2. Find the reduced row echelon form of each of the following matrices.

    [  1  −1   2  3 ]    [  1   2   3   4 ]    [ 3   4  5  −6 ]      [ 1   2  1  3   2 ]
    [  0   5   6  2 ]    [  5   6   7   8 ]    [ 2   3  1   1 ]      [ 0   2  2  2   4 ]
    [ −1   2   4  3 ] ,  [  9  10  11  12 ] ,  [ 0   2  0   0 ]  and [ 2  −2  4  0   8 ] .
    [  1   2  −1  2 ]    [ 13  14  15  16 ]    [ 5  −5  5   5 ]      [ 4   2  5  6  10 ]

3. Solve the following systems of equations using Gaussian elimination method as well as Gauss-Jordan elimination
method:

(a) 4x + 2y − 5z = 0, 3x + 3y + z = 0, 2x + 8y + 5z = 0;
(b) −x + y + z + w = 0, x − y + z + w = 0, − x + y + 3z + 3w = 0, x − y + 5z + 5w = 0;
(c) x + y + z = 3, x − y − z = −1, 4x + 4y + z = 9;
(d) x + y + 2z = 3, − x − 3y + 4z = 2, − x − 5y + 10z = 11;
(e) x − 3y − 2z = 0, − x + 2y + z = 0, 2x + 4y + 6z = 0; and
(f) 2w + 3x − y + 4z = 0, 3w − x + z = 1, 3w − 4x + y − z = 2.

4. Consider the system Ax = 0, where

    A = [ a  b ]
        [ c  d ] .

Prove the following:

(a) If each entry of A is 0 then each vector x = [x, y]t is a solution of Ax = 0.


(b) If ad − bc ≠ 0 then the system Ax = 0 has only the trivial solution 0 = [0, 0]t .

(c) If ad − bc = 0 and some entry of A is non-zero, then there is a solution x0 = [x0 , y0 ]t ≠ 0 such that x = [x, y]t
is a solution if and only if x = kx0 , y = ky0 for some constant k.

5. For what values of c ∈ R and k ∈ R do the following systems of equations have (i) no solution, (ii) a unique solution,
and (iii) infinitely many solutions?

(a) x + y + z = 3, x + 2y + cz = 4, 2x + 3y + 2cz = k;
(b) x + y + 2z = 3, x + 2y + cz = 5, x + 2y + 4z = k; and
(c) x + 2y − z = 1, 2x + 3y + kz = 3, x + ky + 3z = 2.

Also, find the solutions whenever they exist.

6. For what values of a ∈ R and b ∈ R does the following system of equations in the unknowns x, y and z have (i) no solution,
(ii) a unique solution, and (iii) infinitely many solutions:

ax + by + 2z = 1, ax + (2b − 1)y + 3z = 1, ax + by + (b + 3)z = 2b − 1 ?

Also, find the solutions whenever they exist.

7. Solve the following system of equations applying the Gaussian elimination method:

    (1 − n)x1 + x2 + . . . + xn = 0
    x1 + (1 − n)x2 + x3 + . . . + xn = 0
      ...
    x1 + x2 + . . . + xn−1 + (1 − n)xn = 0.

8. Prove that if r < s, then the r-th and the s-th rows of a matrix can be interchanged by performing 2(s − r) − 1
interchanges of adjacent rows.

9. Let A be an m × n matrix. Prove that the system of equations Ax = b is consistent for all b ∈ Rm if and only if
each row of the row echelon form of A contains a leading term.

10. Let A be an m × n matrix and b ∈ Rm . Prove that the system of equations Ax = b is inconsistent if and only if
there is a leading term in the last column of a row echelon form of its augmented matrix.
 
11. Let

        [ i  −1 − i   0 ]
    A = [ 1    −2     1 ] .
        [ 1    2i    −1 ]

Determine the reduced row echelon form of A. Hence solve the system Ax = 0.
12. Show that x = 0 is the only solution of the system of equations Ax = 0 if and only if the rank of A equals the
number of variables.

13. Show that a consistent system of equations Ax = b has a unique solution if and only if the rank of A equals the
number of variables.

14. Prove that if two homogeneous systems of m linear equations in two variables have the same solution, then these
two systems are equivalent. Is the same true for more than two variables? Justify.

15. (a) If x1 is a solution of the non-homogeneous system Ax = b and if y0 is a solution of the system Ax = 0 then
show that x1 + y0 is a solution of Ax = b.
(b) If x0 , x1 are solutions of the system Ax = b then show that there is a solution y0 of the system Ax = 0 such
that x0 = x1 + y0 .
[Let Sh and Snh be the solution sets of Ax = 0 and Ax = b, respectively. Note that part (a) implies x1 +Sh ⊆ Snh
and part (b) implies Snh ⊆ x1 + Sh .]

16. Suppose the system Ax = b is consistent with s0 as one of the solutions. If Sh is the set of solutions of Ax = 0,
then show that the set of solutions of Ax = b is s0 + Sh .

17. Suppose that the non-homogeneous system Ax = b has solutions u1 , u2 , . . . , uk . Show that a linear combination
a1 u1 + a2 u2 + . . . + ak uk is a solution of Ax = b if and only if a1 + a2 + . . . + ak = 1. Also, show that
c1 u1 + c2 u2 + . . . + ck uk = 0 implies c1 + c2 + . . . + ck = 0.

18. (Not for Exam) Solve the following systems of equations over the indicated Zp .

(a) x + 2y = 1, x + y = 2 over Z3 .
(b) x + y = 1, y + z = 0, x + z = 1 over Z2 .
(c) 3x + 2y = 1, x + 4y = 1 over Z5 .
(d) 2x + 3y = 4, 4x + 3y = 2 over Z6 .

19. (Not for Exam) Let A be an m × n matrix of rank r with entries in Zp , where p is a prime number. Prove that
every consistent system of equations with coefficient matrix A has exactly pn−r solutions over Zp .
Hints to Practice Problems Set 2

1. There are four and eight possible reduced row echelon forms of a 2 × 2 and a 3 × 3 matrix, respectively. These
matrices can be obtained by considering cases based on the ranks and the positions of the leading entries.

2. The reduced row echelon forms of the given matrices are the following (in order):

    [ 1  0  0  0 ]    [ 1  0  −1  −2 ]    [ 1  0  0  0 ]      [ 1  0  0  1  0 ]
    [ 0  1  0  0 ]    [ 0  1   2   3 ]    [ 0  1  0  0 ]      [ 0  1  0  1  0 ]
    [ 0  0  1  0 ] ,  [ 0  0   0   0 ] ,  [ 0  0  1  0 ]  and [ 0  0  1  0  0 ] .
    [ 0  0  0  1 ]    [ 0  0   0   0 ]    [ 0  0  0  1 ]      [ 0  0  0  0  1 ]

3. (a) {[0, 0, 0]t }, (b) {[r, r, −s, s]t : r, s ∈ R}, (c) {[1, 1, 1]t }, (d) Inconsistent, (e) {[s, s, −s]t : s ∈ R},
(f) {[1/2 − s, 1/2 − 2s, 5/2 − 4s, s]t : s ∈ R}.

4. (b) If a = 0 or ac ≠ 0 or c = 0 then the reduced row echelon form of A is

    [ 1  0 ]
    [ 0  1 ] .

(c) If a = c = 0 then the reduced row echelon form of A is

    [ 0  1 ]
    [ 0  0 ] .

If (ac ≠ 0) or (a = 0, c ≠ 0) or (a ≠ 0, c = 0), then the reduced row echelon form of A is

    [ 1  α ]
    [ 0  0 ]

for some α ∈ R.
h it
k−7 k−7
5. (a) If c = 1, k 6= 7 then no solution. If c 6= 1 then unique solution: k − 5 − c−1 , 8 − k, c−1 .
t
If c = 1, k = 7 then infinitely many solutions: {[2 − s, 1, s] : s ∈ R}.
h it
(c−2)(k−5) k−5
(b) If c = 4, k 6= 5 then no solution. If c 6= 4 then unique solution: 6 − k, 2 + c−4 , 4−c .
t
If c = 4, k = 5 then infinitely many solutions: {[1, 2 − 2s, s] : s ∈ R}.
h 2 it
(c) If k = 0 then no solution. If k 6= 0 then unique solution: k −k+3k2 , k−2
k2 ,
k−1
k2 .

6. The augmented matrix can be reduced to

    [ a    1      1    |    1     ]
    [ 0  b − 1    1    |    0     ]
    [ 0    0    b + 1  | 2(b − 1) ] .

If a = 0, b = −1, then no solution.
If a = 0, b = 1, then infinitely many solutions: {[s, 1, 0]t : s ∈ R}.
If a = 0, b = 5, then infinitely many solutions: {[s, −1/3, 4/3]t : s ∈ R}.
If a = 0, b ≠ 5, ±1, then no solution.
If a ≠ 0, b = −1, then no solution.
If a ≠ 0, b = 1, then infinitely many solutions: {[(1 − s)/a, s, 0]t : s ∈ R}.
If a ≠ 0, b ≠ ±1, then unique solution: [(5 − b)/(a(b + 1)), −2/(b + 1), 2(b − 1)/(b + 1)]t .

7. The coefficient matrix can be transformed to

    [ 1  1  1  1  . . .  1  1 − n ]
    [ 0  1  0  0  . . .  0    −1  ]
    [ 0  0  1  0  . . .  0    −1  ]
    [ 0  0  0  1  . . .  0    −1  ]
    [ ..  ..             ..   ..  ]
    [ 0  0  0  0  . . .  1    −1  ]
    [ 0  0  0  0  . . .  0     0  ]

Hence xn = xn−1 = . . . = x2 = x1 .

8. Apply the following sequence of elementary row operations:

Rs ↔ Rs−1 , Rs−1 ↔ Rs−2 , . . . . . . , Rr+1 ↔ Rr , Rr+1 ↔ Rr+2 , Rr+2 ↔ Rr+3 , . . . . . . , Rs−1 ↔ Rs .

9. For the ‘only if’ part (⇒), take b = ei for i = 1, 2, . . . , m. For the ‘if’ part (⇐), m = rank(A) ≤ rank([A | b]) ≤ m.

10. Prove the contra-positive for the ‘only if’ part. That is, assume Ax = b is consistent and then prove that the last
column of a row echelon form of its augmented matrix cannot contain a leading term.
 
11. The reduced row echelon form is

    [ 1  0       i      ]
    [ 0  1  −(1 − i)/2  ] .
    [ 0  0       0      ]

The solution set is {t[−i, (1 − i)/2, 1]t : t ∈ C}.

12. For the ‘only if’ part (⇒), prove the contra-positive, so that there will be at least one free variable. For the ‘if’
part (⇐), there will be no free variable.

13. Similar to Problem 12.

14. Similar to a tutorial problem. The statement is also true in general. Indeed, if Ax = b and Cx = d are two
systems of m equations and having equal solution sets (non-empty) then [A | b] is row equivalent to [C | d].
This has been partially covered in a tutorial problem.

15. Take y0 = x0 − x1 for the second part.

16. Use Problem 15.

17. Easy.

18. (a) {[0, 2]t }, (b) {[0, 1, 1]t , [1, 0, 0]t }, (c) Inconsistent, (d) {[2, 0]t , [2, 2]t , [2, 4]t }.

19. There are p choices to assign values to each of the n − r free variables.

3 Vector Space
Definition 3.1. Let n be a positive integer. Then the space Rn , as defined below, is called the n-dimensional Euclidean
space:
Rn = {[x1 , x2 , . . . , xn ]t : x1 , x2 , . . . , xn ∈ R}.

• Elements of Rn are called n-vectors or simply vectors.

• Note that [x1 , x2 , . . . , xn ]t denotes the column vector

    [ x1 ]
    [ x2 ]
    [ .. ] .
    [ xn ]

• Sometimes, an element [x1 , x2 , . . . , xn ]t of Rn is also written as a row vector [x1 , x2 , . . . , xn ] or (x1 , x2 , . . . , xn ).

• The element (x1 , x2 , . . . , xn ) is termed an n-tuple.

• Normally, while discussing a system of linear equations, elements of Rn are regarded as column vectors. Otherwise,
elements of Rn may also be regarded as row vectors.

• In a similar way, we can define Cn .

Recall that a vector ~x = aî + bĵ + ck̂ can also be written as (a, b, c) or [a, b, c]t . Thus elements of R3 are basically the
vectors that we had learnt in school. For u, v, w in R3 , c, d ∈ R and for the zero vector 0, we learnt in school that

1. u + v ∈ R3 ;

2. u + v = v + u;

3. (u + v) + w = u + (v + w);

4. u + 0 = u;

5. u + (−u) = 0;

6. c.u ∈ R3 ;

7. c.(u + v) = c.u + c.v;

8. (c + d).u = c.u + d.u;

9. c.(d.u) = (cd).u; and

10. 1.u = u.

The above properties are sufficient to do vector algebra in R3 . The same properties, exactly in the same way, hold good
in Rn too.

• If u, v, w, 0 ∈ Cn and c, d ∈ R, we get all the previous ten properties.

• If u, v, w, 0 ∈ Cn and c, d ∈ C, we get all the previous ten properties.

• If A, B, C, O ∈ M2 (R) (set of all 2 × 2 real matrices) and c, d ∈ R, we get all the previous ten properties.

• If p(x), q(x), r(x), 0 ∈ R2 [x] (set of all polynomials of degree at most two with real coefficients) and c, d ∈ R, we
get all the previous ten properties.

• In our discussion, the symbol F will be used as a representative of R or C. That is, F = R or F = C.

• The elements of F will be termed as scalars.

Definition 3.2. Let V be a non-empty set. For every u, v ∈ V and c ∈ F, let the addition u + v (called the vector
addition) and the multiplication c.u (called the scalar multiplication) be defined. Then V is called a vector space over
F if for all u, v, w ∈ V and c, d ∈ F, the following properties are satisfied:

1. u + v ∈ V ;

2. u + v = v + u;

3. (u + v) + w = u + (v + w);

4. There is an element 0, called a zero, such that u + 0 = u;

5. For each u ∈ V , there is an element −u such that u + (−u) = 0;

6. c.u ∈ V ;

7. c.(u + v) = c.u + c.v;

8. (c + d).u = c.u + d.u;

9. c.(d.u) = (cd).u; and

10. 1.u = u.

Example 3.1. For any n ≥ 1, the set Rn is a vector space over R with respect to usual operations of addition and
scalar multiplication. [Show that all the ten defining properties of a vector space are satisfied].

Example 3.2. For any n ≥ 1, the set Cn is a vector space over R with respect to usual operations of addition and
scalar multiplication. [Show that all the ten defining properties of a vector space are satisfied].

Example 3.3. For any n ≥ 1, the set Cn is a vector space over C with respect to usual operations of addition and
scalar multiplication. [Show that all the ten defining properties of a vector space are satisfied].

Example 3.4. The set Rn is not a vector space over C with respect to usual operations of addition and scalar multi-
plication.

Solution. For c = i and u = [1, 1, . . . , 1]t ∈ Rn , we have c.u = i.[1, 1, . . . , 1]t = [i, i, . . . , i]t ∉ Rn , where i = √−1.
Hence Rn is not a vector space over C with respect to the usual operations of addition and scalar multiplication
[the 6th property is violated].

Example 3.5. The set Z is not a vector space over R with respect to usual operations of addition and scalar multipli-
cation.

Solution. For c = 1/2 ∈ R and u = 3 ∈ Z, we have c.u = (1/2).3 = 3/2 ∉ Z. Hence Z is not a vector space over R
with respect to the usual operations of addition and scalar multiplication [the 6th property is violated].

Example 3.6. The set M2 (C) of all 2 × 2 complex matrices is a vector space over C with respect to usual operations
of matrix addition and matrix scalar multiplication.

Example 3.7. The set R2 is not a vector space over R with respect to usual operations of addition and the following
definition of scalar multiplication:
c.[x, y]t = [cx, 0]t for [x, y]t ∈ R2 , c ∈ R.

Solution. For c = 1 and u = [2, 3]t ∈ R2 , we have c.u = 1.[2, 3]t = [2, 0]t ≠ u. Hence R2 is not a vector space over R
with respect to the operations defined in the example [the 10th property is violated].

Example 3.8. Let R2 [x] denote the set of all polynomials of degree at most two with real coefficients. That is,

R2 [x] = {a + bx + cx2 : a, b, c ∈ R}.

For p(x) = a0 + b0 x + c0 x2 , q(x) = a1 + b1 x + c1 x2 ∈ R2 [x] and k ∈ R, define

p(x) + q(x) = (a0 + a1 ) + (b0 + b1 )x + (c0 + c1 )x2 and k.p(x) = (ka0 ) + (kb0 )x + (kc0 )x2 .

Then R2 [x] is a vector space over R.

• If V is a vector space, then the elements of V are called vectors.

• If there is no confusion, c.u is simply written as cu.

• In general, we take V to be a vector space over F.

• If F needs to be specified, then we write V (F) to denote that V is a vector space over F.

• We call V a real vector space or complex vector space according as F = R or F = C.

Result 3.1. Let V be a vector space over F. Let u ∈ V and c ∈ F.

1. 0.u = 0;

2. c.0 = 0;

3. (−1).u = −u; and

4. If c.u = 0 then either c = 0 or u = 0.

Proof. [Proof of Part 1 and Part 3] We have 0.u = (0 + 0).u = 0.u + 0.u. Therefore, adding −(0.u) on both sides,

    [−(0.u)] + 0.u = [−(0.u)] + (0.u + 0.u)  ⇒  −(0.u) + 0.u = [−(0.u) + 0.u] + 0.u
                                             ⇒  0 = 0 + 0.u
                                             ⇒  0 = 0.u.

We have 0 = 0.u = [1 + (−1)].u = 1.u + (−1).u = u + (−1).u. Therefore, adding −u on both sides, we get

    (−u) + 0 = (−u) + [u + (−1).u]  ⇒  −u = [(−u) + u] + (−1).u
                                    ⇒  −u = 0 + (−1).u
                                    ⇒  −u = (−1).u.

Subspace: Let V be a vector space and let W be a non-empty subset of V . Then W is called a subspace of V if and
only if au + bv ∈ W for every u, v ∈ W and for every a, b ∈ F.

• If W is a subspace of a vector space V , then W is also a vector space.

• If W is a subspace of a vector space V then 0 ∈ W .

• The sets {0} and V are always subspaces of any vector space V . These are called the trivial subspaces.

• If S is a subspace of Rn then it is clear that if u, v ∈ S then 0u + 0v = 0 ∈ S, au + 0u = au ∈ S and


1u + 1v = u + v ∈ S.

• U is a subspace iff U is closed under addition and scalar multiplication.

Example 3.9. Examine whether the sets S = {[x, y, z]t ∈ R3 : x = y + 1}, T = {[x, y, z]t ∈ R3 : x = z 2 } and
U = {[x, y, z]t ∈ R3 : x = 5y} are subspaces of R3 .

Solution. Notice that [0, 0, 0]t does not satisfy x = y + 1 and so [0, 0, 0]t ∉ S. Therefore S is not a subspace.
We have [1, 1, 1]t ∈ T but 2.[1, 1, 1]t = [2, 2, 2]t ∉ T . Therefore T is not a subspace.
For the third set, let u = [x1 , y1 , z1 ]t , v = [x2 , y2 , z2 ]t ∈ U and a, b ∈ R. Then we have x1 = 5y1 and x2 = 5y2 . Now

au + bv = [ax1 + bx2 , ay1 + by2 , az1 + bz2 ]t and ax1 + bx2 = a(5y1 ) + b(5y2 ) = 5(ay1 + by2 ).

Therefore au + bv ∈ U , and hence U is a subspace.

Example 3.10. Let W be the set of all 2 × 2 real symmetric matrices. Then W is a subspace of M2 (R).

Example 3.11. Let W = {[x, y, z]t ∈ R3 : x + y − z = 0}. Then W is a subspace of R3 .

Solution. Let u = [x1 , y1 , z1 ]t , v = [x2 , y2 , z2 ]t ∈ W . Then we have x1 + y1 − z1 = 0, x2 + y2 − z2 = 0. Now


for a, b ∈ R, we have au + bv = [ax1 + bx2 , ay1 + by2 , az1 + bz2 ]t and (ax1 + bx2 ) + (ay1 + by2 ) − (az1 + bz2 ) =
a(x1 + y1 − z1 ) + b(x2 + y2 − z2 ) = 0. So au + bv ∈ W and hence W is a subspace of R3 .

Example 3.12. Let A be an m × n matrix. Then S = {x ∈ Cn : Ax = 0} is a subspace, called the null space of A.

Example 3.13. Let U and W be two subspaces of V . Then U + W = {u + w : u ∈ U, w ∈ W } is also a subspace of V .

Solution. Let x, y ∈ U + W and a, b ∈ F. Then x = u1 + v1 and y = u2 + v2 for some u1 , u2 ∈ U, v1 , v2 ∈ W . Now


we have ax + by = a(u1 + v1 ) + b(u2 + v2 ) = (au1 + bu2 ) + (av1 + bv2 ). Since U and W are subspaces, we have
au1 + bu2 ∈ U, av1 + bv2 ∈ W . Therefore ax + by ∈ U + W and hence U + W is a subspace of V .

• If U and W are subspaces of V such that U ∩ W = {0}, then U + W is called a direct sum, denoted by U ⊕ W .

Result 3.2. Let V be a vector space.

1. If U and W are subspaces of V , then U ∩ W is also a subspace of V .


2. If {Ui : i ∈ ∆} is a collection of subspaces of V , then ∩i∈∆ Ui is also a subspace of V .

3. Let U and W be subspaces of V . Then U ∪ W is a subspace of V iff U ⊆ W or W ⊆ U .

Proof.

1. Clearly U ∩ W ≠ ∅, as 0 ∈ U ∩ W . Let x, y ∈ U ∩ W and a, b ∈ F. Then we have x, y ∈ U and x, y ∈ W . Since


U and W are subspaces of V , we have

ax + by ∈ U and ax + by ∈ W ⇒ ax + by ∈ U ∩ W.

Thus U ∩ W is a subspace of V .
2. Clearly ∩i∈∆ Ui ≠ ∅, as 0 ∈ Ui for all i implies 0 ∈ ∩i∈∆ Ui . Let x, y ∈ ∩i∈∆ Ui and a, b ∈ F. Then x, y ∈ Ui for each i.
Since Ui is a subspace for each i, we have

    ax + by ∈ Ui for each i  ⇒  ax + by ∈ ∩i∈∆ Ui .

Thus ∩i∈∆ Ui is a subspace of V .

3. If U ⊆ W or W ⊆ U , then we have U ∪ W = W or U ∪ W = U . So, clearly U ∪ W is a subspace of V .


Conversely, assume that U ∪ W is a subspace of V . To show that U ⊆ W or W ⊆ U . Let u ∈ U and w ∈ W .
Now

u, w ∈ U ∪ W ⇒u + w ∈ U ∪ W
⇒u + w ∈ U or u+w ∈W
⇒(−1)u + 1.(u + w) ∈ U or 1.(u + w) + (−1)w ∈ W, as U, W are subspaces
⇒−u+u+w ∈U or u+w−w ∈W
⇒w ∈ U or u∈W
⇒W ⊆ U or U ⊆ W.

Linear Combination: Let V be a vector space. Let v1 , v2 , . . . , vk ∈ V and c1 , c2 , . . . , ck ∈ F. Then the vector
c1 v1 + c2 v2 + . . . + ck vk is called a linear combination of the vectors v1 , v2 , . . . , vk .
A vector v in V is called a linear combination of the vectors v1 , v2 , . . . , vk in V if there exist scalars c1 , c2 , . . . , ck
such that v = c1 v1 + c2 v2 + . . . + ck vk . The numbers c1 , c2 , . . . , ck are called the coefficients of the linear combination.

Example 3.14. Is the vector [1, 2, 3]t a linear combination of [1, 0, 3]t and [−1, 1, −3]t ?

Result 3.3. A system of linear equations with augmented matrix [A | b] is consistent if and only if b is a linear
combination of the columns of A.

Proof. Let a = [α1 , α2 , . . . , αn ]t ∈ Fn and let the system Ax = b be given by

    a11 x1 + a12 x2 + . . . + a1n xn = b1
      ...
    am1 x1 + am2 x2 + . . . + amn xn = bm .

Now we have: a is a solution of Ax = b

    ⇔ a11 α1 + a12 α2 + . . . + a1n αn = b1 , . . . , am1 α1 + am2 α2 + . . . + amn αn = bm
    ⇔ α1 [a11 , . . . , am1 ]t + α2 [a12 , . . . , am2 ]t + . . . + αn [a1n , . . . , amn ]t = [b1 , . . . , bm ]t
    ⇔ b is a linear combination of the columns of A.
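Result 3.3 also answers Example 3.14 computationally: asking whether [1, 2, 3]t is a linear combination of [1, 0, 3]t and [−1, 1, −3]t is asking whether the system with those vectors as columns is consistent. A sketch (assuming sympy):

    from sympy import Matrix, linsolve, symbols

    # Columns are the candidate vectors; b is the target vector.
    A = Matrix([[1, -1], [0, 1], [3, -3]])
    b = Matrix([1, 2, 3])
    c1, c2 = symbols('c1 c2')

    print(linsolve((A, b), [c1, c2]))
    # {(3, 2)}: [1,2,3]^t = 3[1,0,3]^t + 2[-1,1,-3]^t, a linear combination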

Span of a Set: Let S = {v1 , v2 , . . . , vk } be a subset of a vector space V . Then the set of all linear combinations
of v1 , v2 , . . . , vk is called the span of v1 , v2 , . . . , vk , and is denoted by span(v1 , v2 , . . . , vk ) or span(S). That is,

span(S) = {v | v = c1 v1 + c2 v2 + . . . + ck vk for some c1 , c2 , . . . , ck ∈ F}.

Let S ⊆ V (S may be infinite!). The span of S is defined by

    span(S) := { α1 v1 + α2 v2 + . . . + αm vm | vi ∈ S, αi ∈ F, m a non-negative integer }.

• Convention: span(∅) = {0}.

• If span(S) = V , then S is called a spanning set for V .

• For example, R2 = span(e1 , e2 ), where e1 = [1, 0]t and e2 = [0, 1]t , since for any [x, y]t ∈ R2 we have
[x, y]t = xe1 + ye2 .

• R2 [x] = span(1, x, x2 ).

• R[x] = span(1, x, x2 , . . .) [R[x] := set of all polynomials in x].

Clearly span(1, x, x2 , . . .) ⊆ R[x]. Also, if p(x) ∈ R[x] then p(x) = a1 xm1 + . . . + ak xmk for some a1 , . . . , ak ∈ R
and m1 , . . . , mk ∈ N ∪ {0}. As xm1 , . . . , xmk ∈ {1, x, x2 , . . .}, we have p(x) ∈ span(1, x, x2 , . . .) and consequently
R[x] ⊆ span(1, x, x2 , . . .). Thus we get R[x] = span(1, x, x2 , . . .).

Example 3.15. Let u = [1, 2, 3]t and v = [−1, 1, −3]t . Describe span(u, v) geometrically.

Solution. Let w = [a, b, c]t ∈ span(u, v). Then there exist α, β ∈ R such that w = αu + βv, which implies that the
1 −1 " # a
  x1  
system of equation  2 1  =  b  is consistent. We have
x2
3 −3 c
     
1 −1 a 1 −1 a 1 −1 a
     
 2 1 b  R2 → R2 − 2R1  0 3 b − 2a  R3 → R3 − 3R1  0 3 b − 2a  .
−−−−−−−−−−−→ −−−−−−−−−−−→
3 −3 c 3 −3 c 0 0 c − 3a

Thus the above system of equation is consistent iff c − 3a = 0. Hence span(u, v) is the plane whose equation is given
by z − 3x = 0.
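Membership in span(u, v) can also be tested by a rank comparison: w lies in the span iff appending w as an extra column does not raise the rank. A sketch (assuming sympy; in_span is our own helper name):

    from sympy import Matrix

    u = Matrix([1, 2, 3])
    v = Matrix([-1, 1, -3])

    def in_span(w):
        # appending w leaves the rank unchanged iff w is in span(u, v)
        return Matrix.hstack(u, v, w).rank() == Matrix.hstack(u, v).rank()

    print(in_span(Matrix([2, 0, 6])))  # True: satisfies z = 3x
    print(in_span(Matrix([1, 0, 0])))  # False: z != 3x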

Result 3.4. Let S be a subset of a vector space V . Then span(S) is a subspace of V .

Proof. Let u, v ∈ span(S). Then u = α1 u1 + . . . + αk uk and v = β1 v1 + . . . + βm vm for some u1 , . . . , uk , v1 , . . . , vm ∈ S


and α1 , . . . , αk , β1 , . . . , βm ∈ F. Let a, b ∈ F. We have

au + bv =a(α1 u1 + . . . + αk uk ) + b(β1 v1 + . . . + βm vm )
=(aα1 )u1 + . . . + (aαk )uk + (bβ1 )v1 + . . . + (bβm )vm ∈ span(S).

Hence span(S) is a subspace of V .


Linear Dependence: A set {v1 , v2 , . . . , vk } of vectors in a vector space V is said to be linearly dependent if
there are scalars c1 , c2 , . . . , ck , at least one of them non-zero, such that c1 v1 + c2 v2 + . . . + ck vk = 0.
An infinite set S ⊆ V is linearly dependent if there is some finite linearly dependent subset of S.

Linear Independence: A set S of vectors in a vector space V is said to be linearly independent (LI) if it is
not linearly dependent. Thus
S = {v1 , v2 , . . . , vk } is linearly independent (LI) if c1 v1 + c2 v2 + . . . + ck vk = 0 ⇒ c1 = c2 = . . . = ck = 0.
If S is infinite then S is linearly independent (LI) if every finite subset of S is linearly independent.

• Set {0} is linearly dependent as 1.0 = 0. [A non-trivial linear combination of 0 is 0.]

• If 0 ∈ S, then S is always linearly dependent as S contains a linearly dependent set {0}.



• If {u, v} is linearly dependent, then au + bv = 0, where either a ≠ 0 or b ≠ 0. If a ≠ 0, then u = −(b/a)v. If
b ≠ 0, then v = −(a/b)u. Thus two vectors are linearly dependent iff one of them is a scalar multiple of the other.

• Convention: the set ∅ is linearly independent.

Example 3.16. Let ei ∈ Rn be the i-th column of the identity matrix In . Is {e1 , e2 , . . . , en } linearly independent?

Solution. For a1 , a2 , . . . , an ∈ R, we have

a1 e1 + a2 e2 + . . . + an en = 0 ⇒ [a1 , a2 , . . . , an ]t = 0 ⇒ a1 = a2 = . . . = an = 0.

Hence {e1 , e2 , . . . , en } is linearly independent.

Example 3.17. Examine whether the sets T = {[1, 2, 0]t , [1, 1, −1]t , [1, 4, 2]t } and S = {[1, 4]t , [−1, 2]t } are linearly
dependent.

Solution. For a, b, c ∈ R, we have

a[1, 2, 0]t + b[1, 1, −1]t + c[1, 4, 2]t = 0 ⇒ a + b + c = 0, 2a + b + 4c = 0, − b + 2c = 0.

Now the augmented matrix of this system of equations is

[ 1  1  1 | 0 ]                  [ 1  1  1 | 0 ]                [ 1  1  1 | 0 ]
[ 2  1  4 | 0 ]  R2 → R2 − 2R1   [ 0 −1  2 | 0 ]  R3 → R3 − R2  [ 0 −1  2 | 0 ] .
[ 0 −1  2 | 0 ]                  [ 0 −1  2 | 0 ]                [ 0  0  0 | 0 ]

The equations corresponding to the last row echelon form are a + b + c = 0 and −b + 2c = 0. Taking c = 1, we have
b = 2 and a = −(b + c) = −3. Therefore

(−3)[1, 2, 0]t + 2 · [1, 1, −1]t + [1, 4, 2]t = 0.

Hence T is a linearly dependent set.


It is easy to see that [1, 4]t = a[−1, 2]t , a ∈ R ⇒ 1 = −a, 4 = 2a, which is impossible.
Similarly, [−1, 2]t = b[1, 4]t , b ∈ R is not possible. Hence S is a linearly independent set.
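[Example 3.17 can also be verified computationally — a sketch in Python, assuming numpy is available; it uses the
rank criterion of the results below (a set of vectors is linearly dependent iff the matrix having them as columns has
rank less than the number of vectors):]

    import numpy as np

    # columns are [1,2,0]^t, [1,1,-1]^t, [1,4,2]^t
    T = np.array([[1, 1, 1],
                  [2, 1, 4],
                  [0, -1, 2]], dtype=float)
    # columns are [1,4]^t, [-1,2]^t
    S = np.array([[1, -1],
                  [4, 2]], dtype=float)

    print(np.linalg.matrix_rank(T))   # 2 < 3, so T is linearly dependent
    print(np.linalg.matrix_rank(S))   # 2 = 2, so S is linearly independent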

Example 3.18. The set {A, B, C} is linearly dependent in M2 (R), where

A = [ 1 1 ; 0 1 ],  B = [ 1 −1 ; 1 0 ]  and  C = [ 2 0 ; 1 1 ].

(Indeed, A + B − C = O, so A + B = C.)

Example 3.19. The set {1, x, x2 , . . . , xn } is linearly independent in Rn [x].

Example 3.20. The set {1, x, x2 , . . .} is linearly independent in R[x].

Result 3.5. The vectors v1 , v2 , . . . , vk in a vector space V are linearly dependent iff either v1 = 0 or there is an
integer r such that vr can be expressed as a linear combination of v1 , . . . , vr−1 .

Linear combinations of rows:
Suppose A is an m × n matrix with rows at1 , at2 , . . . , atm (so that each ai ∈ Rn ). Then

• For ci ∈ R, a = c1 at1 + . . . + cm atm is a linear combination of the rows of A. Note that a is a 1 × n matrix and
at ∈ Rn .

• Note: c1 at1 + . . . + cm atm = [c1 , . . . , cm ]A. Thus, for any c ∈ Rm , ct A is a linear combination of rows of A.

• The rows of A are linearly dependent iff ct A = c1 at1 + . . . + cm atm = 0t (zero row) for some nonzero c ∈ Rm .

• The rows of A are linearly dependent iff a1 , . . . , am are linearly dependent, i.e., the columns of At are linearly
dependent.

Result 3.6. Suppose R = RREF(A) has a zero row. Then the rows of A are linearly dependent.

Proof. If there is a zero row in A, then clearly the rows of A are linearly dependent. Otherwise, a zero row in
RREF(A) can only arise by adding multiples of some rows to a fixed row and then moving the rows (by
interchanges of rows) so that the zero rows are at the bottom. This process basically expresses the zero vector as a
non-trivial linear combination of the row vectors of A. Hence the rows of A are linearly dependent.
[The converse of this result is also true. You are asked to prove the converse in a practice problem.]

Result 3.7. Let v1 , v2 , . . . , vm be vectors in Rn and let A be the n × m matrix [v1 v2 . . . vm ] with these vectors as its
columns. Then v1 , v2 , . . . , vm are linearly dependent if and only if the homogeneous system Ax = 0 has a non-trivial
solution.

Result 3.8. Any set of m vectors in Rn is linearly dependent if m > n. (Indeed, if m > n then the homogeneous
system Ax = 0 of Result 3.7 has more unknowns than equations, and hence a non-trivial solution.)


 
Result 3.9. Let S = {v1 , v2 , . . . , vm } ⊆ Rn and consider the m × n matrix A whose i-th row is vit , for 1 ≤ i ≤ m.
Then S is linearly dependent if and only if rank(A) < m.

Result 3.10. Let S = {v1 , v2 , . . . , vm } ⊆ Rn and A = [v1 v2 · · · vm ]. Then the following are equivalent.

1. S is linearly dependent.

2. Columns of A are linearly dependent.

3. Ax = 0 has a nontrivial solution.

4. Rows of At are linearly dependent.

5. rank(At ) < m.

6. RREF(At ) has a zero row.

Proof. Follows from previous few results.
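[A sketch of how the equivalent conditions of Result 3.10 can be checked with exact arithmetic — in Python, assuming
the sympy library is available, for the set T of Example 3.17:]

    import sympy as sp

    A = sp.Matrix([[1, 1, 1],
                   [2, 1, 4],
                   [0, -1, 2]])     # A = [v1 v2 v3], the vectors as columns

    print(A.nullspace())            # non-empty: Ax = 0 has a nontrivial solution (3)
    print(A.T.rank())               # 2 < 3 = m                                   (5)
    print(A.T.rref()[0])            # RREF(A^t) has a zero row                    (6)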

Result 3.11. Let S = {v1 , . . . , vr } be a linearly independent set in V and let u ∉ span(S). Then S ∪ {u} is also
linearly independent.

Proof. Let au + a1 v1 + . . . + ar vr = 0 for a, a1 , . . . , ar ∈ F. Now if a ≠ 0, then we have

au + a1 v1 + . . . + ar vr = 0 ⇒ u = (−a1 /a)v1 + . . . + (−ar /a)vr ∈ span{v1 , . . . , vr },

a contradiction to the given assumption. Therefore we must have a = 0. Now using a = 0, we have

a1 v 1 + . . . + ar v r = 0
⇒a1 = 0, . . . , ar = 0, as {v1 , . . . , vr } is linearly independent.

Thus au + a1 v1 + . . . + ar vr = 0 ⇒ a = 0, a1 = 0, . . . , ar = 0 and hence {u, v1 , v2 , . . . , vr } is linearly independent.


Basis: A subset B of a vector space V is said to be a basis for V if span(B) = V and if B is linearly independent.
• The set {1} is a basis for R1 (= R).

• The set {e1 , e2 }, where e1 = [1, 0]t and e2 = [0, 1]t , is a basis for R2 . Similarly, {[1, 0]t , [1, 1]t } is also a basis for
R2 .

• In general, if ei = [0, . . . , 0, 1, 0, . . . , 0]t , where 1 is the i-th entry and the other entries are zero, then {e1 , e2 , . . . , en }
is a basis for Rn .

• The vectors ei (for i = 1, 2, . . . , n) are called the n × 1 standard unit vectors.

• If ei is written as a row vector, then the vectors ei (for i = 1, 2, . . . , n) are called the 1 × n standard unit vectors.

Example 3.21. {e1 , e2 , . . . , en } is a basis for Fn . This basis is called the standard basis for Fn .

Example 3.22. {1, x, x2 , . . . , xn } is a basis for Rn [x], known as the standard basis for Rn [x].

Example 3.23. {1, x, x2 , . . .} is a basis for R[x], known as the standard basis for R[x].

Example 3.24. E = {E11 , E12 , E21 , E22 } is a basis for M2 (R), where
" # " # " # " #
1 0 0 1 0 0 0 0
E11 = , E12 = , E21 = and E22 = .
0 0 0 0 1 0 0 1

Example 3.25. B = {1 + x, x + x2 , 1 + x2 } is a basis for R2 [x].

Result 3.12. For a vector space U , a subset {v1 , . . . , vr } ⊆ U is a basis of U iff every element of U is a unique linear
combination of v1 , . . . , vr .

Proof. Let B = {v1 , . . . , vr } be a basis of U . Therefore B spans U and so every element of U can surely be expressed as a
linear combination of elements of {v1 , . . . , vr }. Let v ∈ U be expressed as v = a1 v1 +. . .+ar vr and v = b1 v1 +. . .+br vr
for a1 , . . . , ar , b1 , . . . , br ∈ R. Then we have

a 1 v 1 + . . . + a r v r = v = b1 v 1 + . . . + br v r
⇒(a1 − b1 )v1 + . . . + (ar − br )vr = 0
⇒a1 − b1 = . . . = ar − br = 0, as {v1 , . . . , vr } is linearly independent
⇒a1 = b1 , . . . , ar = br .

Thus every element of U is a unique linear combination of v1 , . . . , vr .


Conversely, suppose that every element of U is a unique linear combination of v1 , . . . , vr . Then clearly B spans U .
Now for a1 , . . . , ar ∈ R, assume that a1 v1 + . . . + ar vr = 0. Since 0.v1 + . . . + 0.vr = 0, we find two linear combinations
of the element 0. By the uniqueness of expressions of elements of U as a linear combination of v1 , . . . , vr , we find that
a1 = 0, . . . , ar = 0. Thus B is also linearly independent. Hence B is a basis of U .

Example 3.26. Find a basis for the subspace S = {x ∈ R4 : Ax = 0}, where


 
1 −1 −1 2
 
A =  2 −2 −1 3  .
−1 1 −1 0

Solution. Let x = [x, y, z, w]t . We have


     
[  1 −1 −1  2 | 0 ]                  [  1 −1 −1  2 | 0 ]                 [ 1 −1 −1  2 | 0 ]
[  2 −2 −1  3 | 0 ]  R2 → R2 − 2R1   [  0  0  1 −1 | 0 ]  R3 → R3 + R1   [ 0  0  1 −1 | 0 ]
[ −1  1 −1  0 | 0 ]                  [ −1  1 −1  0 | 0 ]                 [ 0  0 −2  2 | 0 ]

                                                         [ 1 −1 −1  2 | 0 ]
                                         R3 → R3 + 2R2   [ 0  0  1 −1 | 0 ]
                                                         [ 0  0  0  0 | 0 ]

                                                         [ 1 −1  0  1 | 0 ]
                                         R1 → R1 + R2    [ 0  0  1 −1 | 0 ] .
                                                         [ 0  0  0  0 | 0 ]

The last RREF gives x − y + w = 0 and z − w = 0, where y and w are free variables. Setting y = s, w = r, we find
z = r, x = s − r. So

x = [x, y, z, w]t = [s − r, s, r, r]t = s [1, 1, 0, 0]t + r [−1, 0, 1, 1]t .

Thus {[1, 1, 0, 0]t , [−1, 0, 1, 1]t } spans S. Also for a, b ∈ R,

a[1, 1, 0, 0]t + b[−1, 0, 1, 1]t = 0 ⇒ [a − b, a, b, b]t = 0 ⇒ a = 0 = b.

Thus {[1, 1, 0, 0]t , [−1, 0, 1, 1]t } is also linearly independent. Hence {[1, 1, 0, 0]t , [−1, 0, 1, 1]t } is a basis for S.
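[A computational check of Example 3.26 — a sketch in Python, assuming sympy is available; its nullspace() method
returns a basis of {x : Ax = 0}:]

    import sympy as sp

    A = sp.Matrix([[1, -1, -1, 2],
                   [2, -2, -1, 3],
                   [-1, 1, -1, 0]])

    for v in A.nullspace():   # two basis vectors, so dim(S) = 2, as found above
        print(v.T)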

Result 3.13. Let S = {v1 , . . . , vr } ⊆ Rn and let T ⊆ span(S) be such that m = |T | > r. Then T is linearly dependent.

Proof. Let T = {u1 , . . . , um }. Write

ui = ai1 v1 + ai2 v2 + · · · + air vr , 1 ≤ i ≤ m.

Let A = [aij ] be the m × r matrix of these coefficients and let at1 , . . . , atm denote its rows, so that ui = ati [v1 , . . . , vr ]t
(a formal product in which [v1 , . . . , vr ]t is a column whose entries are vectors). Since m > r, the rows of A are linearly
dependent, say α1 at1 + · · · + αm atm = 0t with not all αi equal to 0. Then

α1 u1 + · · · + αm um = (α1 at1 + · · · + αm atm ) [v1 , . . . , vr ]t = 0t [v1 , . . . , vr ]t = 0,

and hence T is linearly dependent.

To see that the set B = {1 + x, x + x2 , 1 + x2 } of Example 3.25 is linearly independent, let a, b, c ∈ R. Then

a(1 + x) + b(x + x2 ) + c(1 + x2 ) = 0 ⇒ [1 + x, x + x2 , 1 + x2 ] [a, b, c]t = 0
⇒ [1, x, x2 ] M [a, b, c]t = 0, where M = [ 1 0 1 ; 1 1 0 ; 0 1 1 ]
⇒ M [a, b, c]t = 0, as {1, x, x2 } is LI
⇒ [a, b, c]t = 0, as rank(M ) = 3
⇒ {1 + x, x + x2 , 1 + x2 } is LI.

• Note the correspondence 1 + x ←→ [1, 1, 0]t , x + x2 ←→ [0, 1, 1]t , 1 + x2 ←→ [1, 0, 1]t (the columns of M ).

• {1 + x, x + x2 , 1 + x2 } is LI iff {[1, 1, 0]t , [0, 1, 1]t , [1, 0, 1]t } is LI.
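[The correspondence just described is easy to automate — a sketch in Python, assuming sympy is available: map each
polynomial to its coefficient vector with respect to {1, x, x2 } and test the rank of the resulting matrix M :]

    import sympy as sp

    M = sp.Matrix([[1, 0, 1],    # constant coefficients of 1+x, x+x^2, 1+x^2
                   [1, 1, 0],    # coefficients of x
                   [0, 1, 1]])   # coefficients of x^2

    print(M.rank())              # 3, so {1+x, x+x^2, 1+x^2} is linearly independent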

Coordinate: Let B = {v1 , v2 , . . . , vn } be an ordered basis for a vector space V (F) and let v ∈ V . Let
v = c1 v1 + c2 v2 + . . . + cn vn . Then the scalars c1 , c2 , . . . , cn are called the coordinates of v with respect to
B, and the column vector

[v]B = [c1 , c2 , . . . , cn ]t

is called the coordinate vector of v with respect to B.
Note: The coordinate vector of a vector is always associated with an ordered basis.

Example 3.27. The coordinate vector [p(x)]B of p(x) = 1 − 3x + 4x2 with respect to the ordered basis B = {1, x, x2 }
of R2 [x] is [p(x)]B = [1, −3, 4]t .

Result 3.14. Let B = {v1 , v2 , . . . , vn } be an ordered basis for a vector space V , let u, v ∈ V and let c ∈ F. Then

[u + v]B = [u]B + [v]B and [cu]B = c[u]B .

Result 3.15. Let B = {v1 , v2 , . . . , vn } be an ordered basis for a vector space V , and let u1 , u2 , . . . , uk be vectors in
V . Then {u1 , u2 , . . . , uk } is linearly independent in V if and only if {[u1 ]B , [u2 ]B , . . . , [uk ]B } is linearly independent
in Fn .

Result 3.16. Let B = {v1 , v2 , . . . , vn } be a basis for a vector space V .

1. Any set of more than n vectors in V must be linearly dependent.

2. Any set of fewer than n vectors in V cannot span V .

Result 3.17. Let S = {v1 , . . . , vr } ⊆ V and T ⊆ span(S) be such that m = |T | > r. Then T is linearly dependent.

Proof. Similar to Result 3.13.

Result 3.18. Let S = {v1 , . . . , vm } ⊆ V . Then S contains a basis of span(S).

Proof. Let vi1 ∈ S with vi1 ≠ 0. If span({vi1 }) ≠ span(S), then span({vi1 }) is a proper subset of span(S). So,
S must have some non-zero elements which do not belong to span({vi1 }). Let vi2 ∈ S \ span({vi1 }). Then clearly,
{vi1 , vi2 } is linearly independent. If span({vi1 , vi2 }) ≠ span(S), then let vi3 ∈ S \ span{vi1 , vi2 }. Then clearly,
{vi1 , vi2 , vi3 } is linearly independent. Continuing in this way, we find a linearly independent set {vi1 , vi2 , . . . , vik } ⊆ S
such that span({vi1 , vi2 , . . . , vik }) = span(S). Thus {vi1 , vi2 , . . . , vik } is a basis for span(S).
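[Result 3.18 in computational form — a sketch in Python, assuming sympy is available: the pivot columns reported
by rref() pick out a subset of the spanning set that is a basis for its span (here the set of Example 3.17):]

    import sympy as sp

    A = sp.Matrix([[1, 1, 1],
                   [2, 1, 4],
                   [0, -1, 2]])           # columns: the spanning set S

    _, pivots = A.rref()
    print(pivots)                          # (0, 1): the first two vectors suffice
    basis = [A.col(j) for j in pivots]     # a basis of span(S) contained in S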

• The vector space V is called finite-dimensional if V has a finite spanning set. Else, V is called infinite-
dimensional.

Result 3.19 (The Basis Theorem). Every finite-dimensional vector space V has a basis. Further, any two bases for V
have the same number of vectors.

Dimension: Let V be a vector space.


• The dimension of V , denoted dim V , is the number of vectors in a basis for V . We write dim V = ∞ if V does
not have a finite basis.

• The dimension of the zero space {0} is defined to be 0.

• The vector space V is called finite-dimensional if dim V is a finite number.

• The vector space V is called infinite-dimensional if there is no finite basis for V .

Example 3.28. dim(Rn ) = n, dim C(C) = 1, dim C(R) = 2, dim M2 (R) = 4 and dim Rn [x] = n + 1.

Solution. A basis of Rn over R is {e1 , e2 , . . . , en }. A basis of C over C is {1}. A basis of C over R is {1, i}.

Result 3.20. Let V be a vector space with dim V = n.

1. Any linearly independent set in V contains at most n vectors.

2. Any spanning set for V contains at least n vectors.

3. Any linearly independent set of exactly n vectors in V is a basis for V .

4. Any spanning set for V of exactly n vectors is a basis for V .

5. Any linearly independent set in V can be extended to a basis for V .

6. Any spanning set for V can be reduced to a basis for V .

Result 3.21. Let W be a subspace of a finite-dimensional vector space V . Then

1. W is also finite-dimensional and dim W ≤ dim V ; and

2. dim W = dim V if and only if W = V .

Proof. [Proof of Part 2] Let dim(W ) = dim(V ) = k and {v1 , . . . , vk } be a basis for W so that span({v1 , . . . , vk }) = W .
If W ≠ V , then span({v1 , . . . , vk }) is a proper subset of V and so there exists v ∈ V such that v ∉ span({v1 , . . . , vk }).
By Result 3.11, the set {v, v1 , . . . , vk } is linearly independent in V , which gives that dim(V ) ≥ k + 1. This is a
contradiction to the hypothesis that dim(W ) = dim(V ). Hence we must have W = V .
Change of Basis: Let B = {u1 , u2 , . . . , un } and C = {v1 , v2 , . . . , vn } be ordered bases for a vector space V . The
n × n matrix whose columns are the coordinate vectors [u1 ]C , [u2 ]C , . . . , [un ]C is denoted by PC←B , and is called the
change of basis matrix from B to C. That is,

PC←B = [[u1 ]C , [u2 ]C , . . . , [un ]C ].

Result 3.22. Let B = {u1 , u2 , . . . , un } and C = {v1 , v2 , . . . , vn } be ordered bases for a vector space V and let PC←B
be the change of basis matrix from B to C. Then

1. PC←B [x]B = [x]C for all x ∈ V ;

2. PC←B is the unique matrix P with the property that P [x]B = [x]C for all x ∈ V ;

3. PC←B is invertible and (PC←B )−1 = PB←C .

Example 3.29. Find the change of basis matrix PC←B and PB←C for the ordered bases B = {1, x, x2 } and C =
{1 + x, x + x2 , 1 + x2 } of R2 [x]. Then find the coordinate vector of p(x) = 1 + 2x − x2 with respect to C.
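[A sketch of Example 3.29 in Python, assuming sympy is available, working with coordinates relative to the standard
basis {1, x, x2 }: the columns of M below are the C-basis polynomials 1 + x, x + x2 , 1 + x2 , so [q]C = M −1 [q]B for any
q ∈ R2 [x]; since B = {1, x, x2 } is the standard basis here, PC←B = M −1 and PB←C = M .]

    import sympy as sp

    M = sp.Matrix([[1, 0, 1],
                   [1, 1, 0],
                   [0, 1, 1]])

    P_CB = M.inv()                # change of basis matrix from B to C
    P_BC = M                      # change of basis matrix from C to B
    p = sp.Matrix([1, 2, -1])     # p(x) = 1 + 2x - x^2 in B-coordinates
    print(P_CB * p)               # [2, 0, -1]^t: p(x) = 2(1+x) + 0(x+x^2) - (1+x^2)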
Practice Problems Set 3

1. Consider the vector space R2 over R. Give an example of a subset of R2 which is

(a) closed under addition but not closed under scalar multiplication; and
(b) closed under scalar multiplication but not closed under addition.

2. Let S be a non-empty set and let V be the set of all functions from S to R. Show that V is a vector space
with respect to addition (f + g)(x) = f (x) + g(x) for f, g ∈ V and scalar multiplication (c.f )(x) = cf (x) for
c ∈ R, f ∈ V .

3. Show that the space of all real (respectively complex) matrices is a vector space over R (respectively C) with
respect to usual addition and scalar multiplication of matrices.

4. A set V and the operation of vector addition and scalar multiplication are given below. Examine whether V is a
vector space over R. If not, then find which of the vector space properties it violates.

(a) V = R2 and for [x, y]t , [z, w]t ∈ V, a ∈ R define [x, y]t + [z, w]t = [x + z, y + w]t and a.[x, y]t = [ax, 0]t .
(b) V = R2 and for [x, y]t , [z, w]t ∈ V, a ∈ R define [x, y]t + [z, w]t = [x + z, 0]t and a.[x, y]t = [ax, 0]t .
(c) V = R and for x, y ∈ V, a ∈ R define x ⊕ y = x − y and a ⊙ x = −ax.
(d) V = R2 and for [x, y]t , [z, w]t ∈ V, a ∈ R define [x, y]t + [z, w]t = [x + z, y + w]t and a.[x, y]t = [ax, y]t .
(e) V = R2 and for [x, y]t , [z, w]t ∈ V, a ∈ R define [x, y]t + [z, w]t = [x − z, y − w]t and a.[x, y]t = [−ax, −ay]t .
(f) V = {[x, 3x + 1]t : x ∈ R} with usual addition and scalar multiplication in R2 .
(g) V = {[x, mx + c]t : x ∈ R}, where m and c are some fixed real numbers, with usual addition and scalar
multiplication in R2 .
(h) V = {f : R → C | f is a function such that f (−t) = f̄ (t) for all t ∈ R}, with usual addition and scalar
multiplication of functions (as defined in Problem 2). Also, find a function in V whose range is contained in
R. (" # )
a b
(i) V = M2 (R) = : a, b, c, d ∈ R, ad − bc 6= 0 with usual addition and scalar multiplication of
c d
matrices.

5. Let V = R+ (the set of positive real numbers). This is not a vector space under usual operations of addition and
scalar multiplication (why?). Define a new vector addition and scalar multiplication as

v1 ⊕ v2 = v1 · v2 and α ⊙ v = v α (i.e., v raised to the power α),

for all v1 , v2 , v ∈ R+ and α ∈ R. Show that R+ is a vector space over R with 1 as the additive identity.

6. (Optional) Let V = R2 . Define [x1 , x2 ]t ⊕ [y1 , y2 ]t = [x1 + y1 + 1, x2 + y2 − 3]t and
α ⊙ [x1 , x2 ]t = [αx1 + α − 1, αx2 − 3α + 3]t for [x1 , x2 ]t , [y1 , y2 ]t ∈ R2 and α ∈ R. Verify that the vector
[−1, 3]t is the additive identity, and V is indeed a vector space over R.

7. (Optional) Let V and W be vector spaces over R with binary operations (+, •) and (⊕, ⊙), respectively. Consider
the following operations on the set V × W : for (x1 , y1 ), (x2 , y2 ) ∈ V × W and α ∈ R, define

(x1 , y1 ) ⊞ (x2 , y2 ) = (x1 + x2 , y1 ⊕ y2 ) and α ◦ (x1 , y1 ) = (α • x1 , α ⊙ y1 ).

With the above definitions, show that V × W is also a vector space over R.

8. Show that R is a vector space over R with additive identity 1 with respect to the addition ⊕ and scalar multipli-
cation ⊗ defined as follows:

x ⊕ y = x + y − 1 for x, y ∈ R and a ⊗ x = a(x − 1) + 1 for a, x ∈ R.

9. Let V = C[x] be the set of all polynomials with complex coefficients. Show that V is a vector space over C with
respect to the following definitions of addition and scalar multiplication: If p(x) = a0 + a1 x + . . . + an xn and
q(x) = b0 + b1 x + . . . + bm xm are in V with m ≥ n, then

p(x) + q(x) = (a0 + b0 ) + (a1 + b1 )x + . . . + (am + bm )xm , where ai = 0 for i > n,

and c · p(x) = (ca0 ) + (ca1 )x + . . . + (can )xn for c ∈ C.

10. Prove that every vector space has a unique zero vector.

11. Prove that for every vector v in a vector space V , there is a unique v′ ∈ V such that v + v′ = 0.

12. Examine whether the following sets are subspaces of R2 .

{[x, y]t ∈ R2 : x2 + y 2 = 0}, {[x, y]t ∈ R2 : x = y}, {[x, y]t ∈ R2 : x, y ∈ Z},

{[x, y]t ∈ R2 : x − y = 1}, {[x, y]t ∈ R2 : xy ≥ 0} and {[x, y]t ∈ R2 : x/y = 1}.

13. Examine whether the following sets are subspaces of R3 .

{[x, y, z]t ∈ R3 | x ≥ 0}, {[x, y, z]t ∈ R3 | x + y = z} and {[x, y, z]t ∈ R3 | x = y 2 }.

14. Examine whether the following sets are subspaces of Rn .

S1 = {[x1 , . . . , xn ]t : x1 ≥ 0}, S2 = {[x1 , x2 , . . . , xn ]t : x2 = x21 }

and S3 = {[x1 , x2 , x3 , x4 , . . . , xn ]t : 3x1 − x2 + x3 + 2x4 = 0}.

15. Determine the subspaces of R3 spanned by each of the following sets:

{[1, 1, 1]t , [0, 1, 2]t , [1, 0, −1]t }, {[1, 2, 3]t , [1, 3, 5]t }, {[2, 1, 0]t , [2, 0, −2]t },

{[1, 2, 3]t , [1, 3, 5]t , [1, 2, 4]t } and {[1, 2, 0]t , [1, 0, 5]t , [1, 2, 3]t }.

16. Show that S = {[1, 0, 0]t , [1, 1, 0]t , [1, 1, 1]t } is a linearly independent set in R3 . In general, if {u, v, w} is a linearly
independent set of vectors in some Rn then prove that {u, u + v, u + v + w} is also a linearly independent set.

17. Show, without using the following result, that any set of k vectors in R3 is linearly dependent if k ≥ 4.
Result: Any set of m vectors in Rn is linearly dependent if m > n.

18. Let W be a subspace of R3 . Show that {[1, 0, 0]t , [0, 1, 0]t , [0, 0, 1]t } ⊆ W if and only if W = R3 . Determine which
of the following sets span R3 :

{[0, 0, 2]t , [2, 2, 0]t , [0, 2, 2]t }, {[3, 3, 1]t , [1, 1, 0]t , [0, 0, 1]t } and {[−1, 2, 3]t , [0, 1, 2]t , [3, 2, 1]t }.

19. Determine whether s(x) = 3 − 5x + x2 is in span(p(x), q(x), r(x)), where p(x) = 1 − 2x, q(x) = x − x2 and
r(x) = −2 + 3x + x2 .

20. Let S = {u1 , . . . , uk } be a spanning set for a vector space V . Show that if uk ∈ span(u1 , . . . , uk−1 ) then
span(u1 , . . . , uk−1 ) = V .

21. Let V be a vector space over R (or C) and let ∅ ≠ S ⊆ V . Let A = {W | S ⊆ W, W is a subspace of V }. Show
that span(S) = ∩W ∈A W (i.e., span(S) is the smallest subspace of V containing S).

22. Prove or disprove:

(a) If u1 , u2 , . . . , ur are linearly dependent vectors in a vector space then each of these vectors is a linear
combination of the other vectors.

(b) If any r − 1 vectors of the set {u1 , u2 , . . . , ur } are linearly independent in a vector space then the set
{u1 , u2 , . . . , ur } is also linearly independent.
(c) If V = span{u1 , u2 , . . . , un } and if every ui (1 ≤ i ≤ n) is a linear combination of no more than r vectors in
{u1 , u2 , . . . , un } \ {ui } then dim V ≤ r.

23. If u1 , u2 , . . . , ur are linearly independent vectors in a vector space V and the vectors u1 , u2 , . . . , ur , v are linearly
dependent, then show that v can be uniquely expressed as a linear combination of u1 , u2 , . . . , ur .

24. Let {u1 , u2 , . . . , un } be a basis for a vector space V , where n ≥ 2. Show that {u1 , u1 + u2 , . . . , u1 + . . . + un } is
also a basis for V . How about the converse?

25. Discuss the linear independence/linear dependence of the following sets. For those that are linearly dependent,
express one of the vectors as a linear combination of the others.

(a) {[1, 0, 0]t , [1, 1, 0]t , [1, 1, 1]t } of R3 .


(b) {[1, 0, 0, 0]t , [1, 1, 0, 0]t , [1, 2, 0, 0]t , [1, 1, 1, 1]t } of R4 .
(c) {[1, i, 0]t , [1, 0, 1]t , [i + 2, −1, 2]t } of C3 (C).
(d) {[1, i, 0]t , [1, 0, 1]t , [i + 2, −1, 2]t } of C3 (R).
(e) {1, i} of C(C) and {1, i} of C(R).
(f) {1 + x, 1 + x2 , 1 − x + x2 } of R2 [x].
(" # " # " #)
1 1 1 −1 1 0
(g) , , of M2 (R).
0 −1 1 0 3 2

26. Let {u, v, w} be a linearly independent set of vectors in a vector space.

(a) Is {u + v, v + w, w + u} linearly independent? Either prove that it is or give a counterexample to show that
it is not.
(b) Is {u − v, v − w, u − w} linearly independent? Either prove that it is or give a counterexample to show that
it is not.

27. Let A be a lower triangular matrix such that none of the diagonal entries are zero. Show that the row (or column)
vectors of A are linearly independent.

28. Let V be a vector space and v1 , v2 , . . . , vk ∈ V . Let W = span(v1 , v2 , . . . , vk ) and v1 + v2 , v1 − 2v2 ∈


span(v3 , v4 , . . . , vk ). Prove that W = span(v3 , v4 , . . . , vk ).

29. Examine whether the sets {1 − x, 1 − x2 , x − x2 } and {1, 1 + 2x + 3x2 } are bases for R2 [x].

30. Find all values of a for which the set {[a2 , 0, 1]t , [0, a, 2]t , [1, 0, 1]t } is a basis for R3 .

31. Find a basis for the solution space of the following system of n + 1 linear equations in 2n unknowns:

x1 + x2 + . . . + xn = 0
x2 + x3 + . . . + xn+1 = 0
.. .. .. .. ..
. . . . .
xn+1 + xn+2 + . . . + x2n = 0.

32. Let t ∈ R. Discuss the linear independence of the following three vectors over R:

u = [1, 1, 0]t , v = [1, 3, −1]t , w = [5, 3, t]t .

33. Let u, v, w and x be four linearly independent vectors in Rn , where n ≥ 4. Write true or false for each of the
following statements, with proper justification:

(a) The vectors u + v, v + w, w + x and x + u are linearly independent.


(b) The vectors u − v, v − w, w − x and x − u are linearly independent.
(c) The vectors u + v, v + w, w + x and x − u are linearly independent.

(d) The vectors u + v, v + w, w − x and x − u are linearly independent.
34. Let S = {[x1 , x2 , . . . , xn ]t ∈ Rn : x1 + x2 + . . . + xn = 0}. Show that S is a subspace of Rn . Find a basis and
dim(S).

35. Prove that the row vectors of a matrix form a linearly dependent set if and only if there is a zero row in any row
echelon form of that matrix.

36. Prove that the column vectors of a matrix form a linearly dependent set if and only if there is a column containing
no leading term in any row echelon form of that matrix.

37. Prove that the column vectors of a matrix form a linearly independent set if and only if each column contains a
leading term in any row echelon form of that matrix.

38. Let V be a vector space over R and let x, y, z ∈ V such that x + 2y + 7z = 0. Show that span(x, y) = span(y, z) =
span(z, x).

39. Show that the set {[1, 0]t , [i, 0]t } is linearly independent over R but is linearly dependent over C.

40. Under what conditions on the complex number α are the vectors [1 + α, 1 − α]t and [α − 1, 1 + α]t in C2 (R) linearly
independent?

41. Let V be a vector space over C and let {x1 , x2 , . . . , xk } be a linearly independent subset of V . Show that the set
{x1 , x2 , . . . , xk , ix1 , ix2 , . . . , ixk } is linearly independent over R.

42. Examine whether the following sets are subspaces of each of C3 (R) and C3 (C):

{[z1 , z2 , z3 ]t ∈ C3 : z1 is real}, {[z1 , z2 , z3 ]t ∈ C3 : z1 + z2 = 10z3 },

{[z1 , z2 , z3 ]t ∈ C3 : |z1 | = |z2 |} and {[z1 , z2 , z3 ]t ∈ C3 : z1 + z2 = 2z̄3 }.

43. Let U and W be two subspaces of a vector space V . Show that U ∩ W is also a subspace of V . Give examples to
show that U ∪ W need not be a subspace of V .

44. Let U and W be subspaces of a vector space V . Define the linear sum of U and W to be

U + W = {u + w : u ∈ U, w ∈ W }.

Show that U +W is a subspace of V . If V = R3 , U is the x-axis and W is the y-axis, what is U +W geometrically?

45. Let W be a subspace of a vector space V . Show that ∆ = {(w, w) : w ∈ W } is a subspace of V × V and that
dim ∆ = dim W .

46. Let U and V be two finite-dimensional vector spaces. Find a formula for dim (U × V ) in terms of dim U and
dim V .

47. Let {x1 , x2 , . . . , xn } be a basis for a vector space V . Show that the set {x1 , . . . , xi−1 , cxi , xi+1 , . . . , xn } is also a
basis for V , where c is a non-zero scalar.

48. Find a basis for span(1 − x, x − x2 , 1 − x2 , 1 − 2x + x2 ) in R2 [x].

49. Extend {1 + x, 1 + x + x2 } to a basis for R2 [x].


(" # " #)
0 1 1 1
50. Extend , to a basis for M2 (R).
0 1 0 1

51. Prove or disprove:

(a) Every linearly independent set of a vector space V is a basis for some subspace of V .
(b) If {x1 , x2 , . . . , xn } is a linearly dependent subset of a vector space V , then xn ∈ span(x1 , x2 , . . . , xn−1 ).
(c) Let W1 and W2 be two subspaces of a vector space V . If {u1 , u2 , v1 , v2 } is a basis for W1 and {u1 , u2 , y1 , y2 , y3 }
is a basis for W2 , then {u1 , u2 } is a basis for W1 ∩ W2 .
(d) B = {x3 , x3 + 2, x2 + 1, x + 1} is a basis for R3 [x].

52. Find a basis for each of the following subspaces.

(a) {[ a b ; c d ] ∈ M2 (R) : a − d = 0}.
(b) {[ a b ; c d ] ∈ M2 (R) : 2a − c − d = 0, a + 3b = 0}.
(c) {a + bx + cx3 ∈ R3 [x] : a − 2b + c = 0}.
(d) {A = [aij ] ∈ Mm×n (R) : ai1 + ai2 + . . . + ain = 0 for i = 1, . . . , m}.

53. Recall that Mn (R) denotes the space of all n × n real matrices. Show that the following sets are subspaces of
Mn (R). Also, find a basis and the dimension of each of these subspaces.

(a) Un (R) = {A ∈ Mn (R) : A is upper triangular}.


(b) Ln (R) = {A ∈ Mn (R) : A is lower triangular}.
(c) Dn (R) = {A ∈ Mn (R) : A is diagonal}.
(d) sln (R) = {A ∈ Mn (R) : tr(A) = 0}.
(e) An (R) = {A ∈ Mn (R) : A + At = 0}.
(" # )
a b
54. Let W = : a, b, c ∈ R . Show that W is a subspace of M2 (R) and that the following set form an
b c
ordered basis for W : (" # " # " #)
1 0 0 1 0 0
, , .
0 0 1 0 0 1
" #
1 −2
Find the coordinate of the matrix with respect to this ordered basis.
−2 3

55. Find a basis and the dimension for each of the following vector spaces:

(a) Mn (C) over C;


(b) Mn (C) over R;
(c) Mn (R) over R.
(d) Hn (C) (the space of all n × n Hermitian matrices) over R; and
(e) Sn (C) (the space of all n × n skew-Hermitian matrices) over R.
(" # ) (" # )
a b a b
56. Let V = ∈ M2 (C) : a + d = 0 and W = ∈ V : c = −b . Show that V is a vector space
c d c d
over R and find a basis for V . Also, show that W is a subspace of V , and find the dimension of W .

57. Show that W = {[x, y, z, t]t ∈ R4 : x + y + z + 2t = 0 = x + y + z} is a subspace of R4 . Find a basis for W , and
extend it to a basis for R4 .

58. Show that W = {[v1 , . . . , v6 ]t ∈ R6 | v1 + v2 + v3 = 0, v2 + v3 + v4 = 0, v5 + v6 = 0} is a subspace of R6 . Find a


basis for W , and extend it to a basis for R6 .

59. Let {v1 , v2 , . . . , vn } be a basis for Rn . Show that {v1 , αv1 + v2 , αv1 + v3 , . . . , αv1 + vn } is also a basis of Rn for
every α ∈ R.

60. Show that the set {[1, 0, 1]t , [1, i, 0]t , [1, 1, 1 − i]t } is a basis for C3 (C).

61. Show that {[1, 0, 0]t , [1, 1, 1]t , [1, 1, −1]t } is a basis for C3 (C). Is it a basis for C3 (R) as well? If not, extend it to
a basis for C3 (R).

62. Show that {1, (x − 1), (x − 1)(x − 2)} is a basis for R2 [x], and that W = {p(x) ∈ R2 [x] : p(1) = 0} is a subspace
of R2 [x]. Also, find dim W .

63. For each of the following statements, write true or false with proper justification:

(a) {p(x) ∈ R3 [x] : p(x) = ax + b} is a subspace of R3 [x].
(b) {p(x) ∈ R3 [x] : p(x) = ax2 } is a subspace of R3 [x].
(c) {p(x) ∈ R[x] : p(0) = 1} is a subspace of R[x].
(d) {p(x) ∈ R[x] : p(0) = 0} is a subspace of R[x].
(e) {p(x) ∈ R[x] : p(x) = p(−x)} is a subspace of R[x].

64. Let B be a set of vectors in a vector space V with the property that every vector in V can be uniquely expressed
as a linear combination of the vectors in B. Prove that B is a basis for V .
" #
x −x
65. Let W1 be the set of all real matrices of the form and W2 be the set of all real matrices of the form
y z
" #
a b
. Show that W1 and W2 are subspaces of M2 (R). Find the dimension for each of W1 , W2 , W1 + W2
−a c
and W1 ∩ W2 .

66. Let W1 and W2 be two subspaces of R8 and dim(W1 ) = 6, dim(W2 ) = 5. What are the possible values for
dim(W1 ∩ W2 )?

67. Do there exist subspaces M and N of R7 such that dim(M ) = 4 = dim(N ) and M ∩ N = {0}? Justify.

68. Let M be an m-dimensional subspace of an n-dimensional vector space V . Prove that there exists an (n − m)-
dimensional subspace N of V such that M + N = V and M ∩ N = {0}.

69. Let W1 = {[x, y, z]t ∈ R3 : 2x + y + 4z = 0} and W2 = {[x, y, z]t ∈ R3 : x − y + z = 0}. Show that W1 and W2
are subspaces of R3 , and find a basis for each of W1 , W2 , W1 ∩ W2 and W1 + W2 .

70. Let W1 = span([1, 1, 0]t , [−1, 1, 0]t ) and W2 = span([1, 0, 2]t , [−1, 0, 4]t ). Show that W1 + W2 = R3 . Give an
example of a vector v ∈ R3 such that v can be expressed in two different ways in the form v = w1 + w2 , where
w 1 ∈ W1 , w 2 ∈ W2 .

71. Are the vector spaces M6×4 (R) (the space of all 6 × 4 real matrices) and M3×8 (R) (the space of all 3 × 8 real
matrices) isomorphic? Justify your answer.

72. Let P = span([1, 0, 0]t , [1, 1, 0]t ) and Q = span([1, 1, 1]t ). Show that R3 = P + Q and P ∩ Q = {0}. For an x ∈ R3 ,
find xp ∈ P and xq ∈ Q such that x = xp + xq . Is the choice of xp and xq unique? Justify.

73. Let V = {f | f : R → R is a function}, Ve = {f ∈ V : f (−x) = f (x) for all x ∈ R} and Vo = {f ∈ V : f (−x) =


−f (x) for all x ∈ R}. Prove that Ve and Vo are subspaces of V , Ve ∩ Vo = {0} and V = Ve + Vo .

74. Let V1 and V2 be two subspaces of a vector space V such that V = V1 + V2 and V1 ∩ V2 = {0}. Prove that for
each vector v ∈ V , there are unique vectors v1 ∈ V1 and v2 ∈ V2 such that v = v1 + v2 . (In such a situation, V
is called the direct sum of V1 and V2 , and is written as V = V1 ⊕ V2 .)

75. Find the coordinate of p(x) = 2 − x + 3x2 with respect to the ordered basis {1 + x, 1 − x, x2 } of R2 [x].
" # (" # " # " # " #)
1 2 1 0 1 1 1 1 1 1
76. Find the coordinate of with respect to the ordered basis , , ,
3 4 0 0 0 0 1 0 1 1
of M2 (R).

77. Consider the ordered bases B = {1 + x + x2 , x + x2 , x2 } and C = {1, x, x2 } for R2 [x], and let p(x) = 1 + x2 . Find
the coordinate vectors [p(x)]B and [p(x)]C . Also, find the change of basis matrices PC←B and PB←C . Finally,
compute [p(x)]B and [p(x)]C using the change of basis matrices and compare the result with the initially computed
coordinates of p(x).

78. Show that the set Rn [x] of all real polynomials of degree at most n is a subspace of the vector space R[x] of all
real polynomials. Find a basis for Rn [x]. Also, show that the set of all real polynomials of degree exactly n is
not a subspace of R[x]. Further, show that {x + 1, x2 + x − 1, x2 − x + 1} is a basis for R2 [x]. Finally, find the
coordinates of the elements 2x − 1, x2 + 1 and x2 + 5x − 1 of R2 [x] with respect to the above ordered basis.

79. For 1 ≤ i ≤ n, let xi = [0, . . . , 0, 1, 1, . . . , 1]t ∈ Rn (i.e., the first (i − 1) entries of xi are 0 and the rest are 1). Show
that B = {x1 , x2 , . . . , xn } is a basis for Rn . Also, find the coordinates of the vectors [1, 1, . . . , 1]t , [1, 2, 3, . . . , n]t
and [0, . . . , 0, 1, 0, . . . , 0]t with respect to the ordered basis B.

80. Consider the ordered bases B = {[1, 2, 0]t , [1, 3, 2]t , [0, 1, 3]t } and C = {[1, 2, 1]t , [0, 1, 2]t , [1, 4, 6]t } for R3 . Find
the change of basis matrix P from B to C. Also, find the change of basis matrix Q from C to B. Verify that
P Q = I3 .

81. Show that the vectors u1 = [1, 1, 0, 0]t , u2 = [0, 0, 1, 1]t , u3 = [1, 0, 0, 4]t and u4 = [0, 0, 0, 2]t form a basis for R4 .
Find the coordinates of each of the standard basis vectors of R4 with respect to the ordered basis {u1 , u2 , u3 , u4 }.

82. Let W be the subspace of C3 spanned by u1 = [1, 0, i]t and u2 = [1 + i, 1, −1]t . Show that {u1 , u2 } is a basis for
W . Also, show that v1 = [1, 1, 0]t and v2 = [1, i, 1 + i]t are in W , and {v1 , v2 } form a basis for W . Finally, find
the coordinates of u1 and u2 with respect to the ordered basis {v1 , v2 }.

83. Let {u1 , u2 , . . . , un } be a basis for an n-dimensional vector space V . Show that {a1 u1 , a2 u2 , . . . , an un } is also a
basis for V , for any non-zero scalars a1 , a2 , . . . , an . If the coordinate of a vector v with respect to the ordered
basis {u1 , u2 , . . . , un } is x = [x1 , x2 , . . . , xn ]t , what is the coordinate of v with respect to the ordered basis
{a1 u1 , a2 u2 , . . . , an un }? What are the coordinates of w = u1 + u2 + . . . + un with respect to each of the ordered
bases {u1 , u2 , . . . , un } and {a1 u1 , a2 u2 , . . . , an un }?

84. Let A = {u1 , u2 , . . . , um } be a set of vectors in an n-dimensional vectors space V , and let B be a basis for V . Let
S = {[u1 ]B , [u2 ]B , . . . , [um ]B } be the set of coordinate vectors of A with respect to the ordered basis B. Prove
that span(A) = V if and only if span(S) = Rn .

85. Consider two ordered bases B and C for R2 [x]. Find C, if B = {x, 1 + x, 1 − x + x2 } and the change of basis
matrix from B to C is PC←B = [ 1 0 0 ; 0 2 1 ; −1 1 1 ].

86. Let V be an n-dimensional vector space with an ordered basis B = {v1 , . . . , vn }. Let P = [pij ] be an n × n
invertible matrix, and set

uj = p1j v1 + p2j v2 + . . . + pnj vn for all j = 1, 2, . . . , n.

Prove that C = {u1 , . . . , un } is an ordered basis for V and P = PB←C .


Hints to Practice Problems Set 3

1. Take {[x, y]t : x ∈ R, y ∈ Z} and {[x, y]t ∈ R2 : 2x + y = 0 or x + 2y = 0}.

2. Routine check.

3. Routine check.

4. (a), (b) 1 · [2, 2]t ≠ [2, 2]t . (c) Addition is not commutative. (d) (1 + 1)[2, 2]t ≠ 1 · [2, 2]t + 1 · [2, 2]t .
(e) Addition is not commutative. (f ), (i) 0 ∉ V . (g) V is a vector space iff c = 0. (h) Routine check.

5. Routine check.

6. Routine check.

7. Routine check.

8. Routine check.

9. Routine check.

10. If 0 and 0′ are two zero vectors then 0 = 0 + 0′ = 0′ .

11. v + v′ = 0 = v + v′′ ⇒ v′′ + (v + v′ ) = v′′ + 0 ⇒ (v′′ + v) + v′ = v′′ ⇒ 0 + v′ = v′′ .

12. Only the first two sets are subspaces.

13. Only the second set is a subspace.

14. Only S3 is a subspace.

15. The span of the first three sets is the plane x − 2y + z = 0. The last two sets span R3 .

16. Easy.

17. For any non-empty S ⊆ R3 , we have dim span(S) ≤ dim R3 = 3.

18. Only the first and the third sets span R3 .

19. No. Compare coefficients of s(x) = ap(x) + bq(x) + cr(x) and solve for a, b, c.

20. span(S) = span(S \ {uk }).


21. span(S) ∈ A ⇒ ∩W ∈A W ⊆ span(S). Also S ⊆ W for every W ∈ A ⇒ span(S) ⊆ W for every W ∈ A ⇒
span(S) ⊆ ∩W ∈A W .

22. (a) Consider u1 = [1, 0]t , u2 = [2, 0]t , u3 = [0, 1]t . (b) Consider u1 = [1, 0, 0]t , u2 = [0, 1, 0]t , u3 = [1, 1, 0]t .
(c) Consider u1 = [1, 0]t , u2 = [2, 0]t , u3 = [0, 1]t , u4 = [0, 2]t .
23. av + a1 u1 + . . . + ak uk = 0 ⇒ a ≠ 0. Also b1 u1 + . . . + bk uk = v = c1 u1 + . . . + ck uk ⇒ bi = ci for each i.

24. a1 u1 + a2 (u1 + u2 ) + . . . + an (u1 + u2 + . . . + un ) = 0 ⇒ a1 + . . . + an = 0, . . . , an−1 + an = 0, an = 0. The


converse is also true.

25. (a), (d), (f ), (g) Linearly independent. (b) Linearly dependent: [1, 2, 0, 0]t = 2[1, 1, 0, 0]t − [1, 0, 0, 0]t + 0[1, 1, 1, 1]t .
(c) Linearly dependent: [i + 2, −1, 2]t = i[1, i, 0]t + 2[1, 0, 1]t .
(e) {1, i} is linearly independent in C(R), but linearly dependent in C(C): 1 = (−i) · i.

26. (a) Yes. (b) (u − v) + (v − w) = u − w.

27. Similar to Problem 24. Or, the reduced row echelon form of A is In .

28. v1 = (1/3)[2(v1 + v2 ) + (v1 − 2v2 )], v2 = (1/3)[(v1 + v2 ) − (v1 − 2v2 )].

29. None of them are bases. Notice that x − x2 = (1 − x2 ) − (1 − x), and the second set is linearly independent but
not spanning.

30. a ≠ 0, 1, −1.

31. xn+2 , xn+3 , . . . , x2n are the free variables. A basis for the solution space is {f1 , f2 , . . . , fn−1 }, where

fi = [−1, 0, 0, . . . , 0, 1, 0, . . . , 0, −1, 0, . . . , 0, 1, 0, . . . , 0]t , 1 ≤ i ≤ n − 1.

Note that the first and the (n + 1)-th entries of each fi are −1, and the two entries equal to 1 are at the (i + 1)-th
and the (n + i + 1)-th positions, respectively.

32. The set is linearly independent iff t ≠ 1.

33. Only part (c) is true.

34. A basis is {f1 , f2 , . . . , fn−1 }, where

fi = [−1, 0, 0, . . . , 0, 1, 0, . . . , 0]t , 1 ≤ i ≤ n − 1.

Note that the first entry of each fi is −1 and the entry equal to 1 is at the (i + 1)-th position.

35. Similar to Result 3.6.

36. Take hint from Result 3.5.

37. Same as Problem no. 36.

38. z = −(1/7)(x + 2y), y ∈ span(x, y) ⇒ span(y, z) ⊆ span(x, y).

39. [1, 0]t + i[i, 0]t = [0, 0]t .

40. The set is linearly independent for all values of α.


41. Σ ak xk + Σ bk (ixk ) = 0 ⇒ Σ (ak + ibk )xk = 0.

42. The 1st and the 4th sets are subspaces of C3 (R) but not of C3 (C). The 2nd set is a subspace of each of C3 (R)
and C3 (C). The 3rd set is neither a subspace of C3 (R) nor a subspace of C3 (C).

43. Take U = {[x, 0]t : x ∈ R} and W = {[0, x]t : x ∈ R}.

44. Routine check. U + W will represent the xy-plane in R3 .

45. If {w1 , w2 , . . . , wk } is a basis for W then {(wi , wi ) : i = 1, 2, . . . , k} is a basis for ∆.

46. dim(U × V ) = dim(U ) + dim(V ).


P
47. ak xk + ai (cxi ) = 0 ⇒ ak = 0 for k 6= i and cai = 0.
k6=i

48. {1 − x, x − x2 }.

49. {1, 1 + x, 1 + x + x2 } (some other answers are also possible).


(" # " # " # " #)
0 1 1 1 1 0 0 0
50. , , , (some other answers are also possible).
0 1 0 1 1 0 1 1

51. (a), (d) True statements. (b) Take x1 = [1, 0]t , x2 = [2, 0]t , x3 = [0, 1]t . (c) If W1 = span(1, x, x + x2 , x3 ), W2 =
span(1, x, x2 , x4 , x5 ) then W1 ∩ W2 = span(1, x, x2 ) = span(1, x, x + x2 ).
(" # " # " #) (" # " #)
0 1 0 0 1 0 3 −1 3 −1
52. (a) , , , (b) , , (c) {2 + x, 1 − x3 },
0 0 1 0 0 1 6 0 0 6
(d) Let Alk = [aij ], where al1 = 1, alk = −1 and aij = 0 otherwise. Then {Alk : 1 ≤ l ≤ m, 2 ≤ k ≤ n} is a basis.
(
1 if i = l, j = k,
53. (a) {Alk : 1 ≤ l ≤ n, 1 ≤ k ≤ n, l ≤ k}, where Alk = [aij ] and aij =
0 otherwise.
(
1 if i = l, j = k,
(b) {Blk : 1 ≤ l ≤ n, 1 ≤ k ≤ n, l ≥ k}, where Blk = [bij ] and bij =
0 otherwise.
(
1 if i = j = l,
(c) {Al : 1 ≤ l ≤ n}, where Al = [aij ] and aij =
0 otherwise.
(
1 if i = l, j = k,
(d) {Alk : 1 ≤ l ≤ n, 1 ≤ k ≤ n, l 6= k} ∪ {Bl : 2 ≤ l ≤ n}, where Alk = [aij ], aij = and
0 otherwise,

 1 if i = j = 1,

Bl = [bij ], bij = −1 if i = j = l,


0 otherwise.

 1 if i = l, j = k,

(e) {Alk : 1 ≤ l ≤ n, 1 ≤ k ≤ n, l < k}, where Alk = [aij ] and aij = −1 if i = k, j = l,


0 otherwise.

54. The required coordinate is [1, −2, 3]t .

55. (a), (c), (d), (e) n2 , (b) 2n2 .


(" # " # " # " # " # " #)
0 1 0 0 1 0 0 i 0 0 i 0
56. A basis for V is , , , , , .
0 0 1 0 0 −1 0 0 i 0 0 −i
(" # " # " # " #)
1 0 0 1 i 0 0 i
A basis for W is , , , .
0 −1 −1 0 0 −i i 0

57. B = {[1, −1, 0, 0]t , [1, 0, −1, 0]t } is a basis for W . An extension is B ∪ {[1, 0, 0, 0]t , [0, 0, 0, 1]t }.

58. B = {[0, 0, 0, 0, 1, −1]t , [0, 1, −1, 0, 0, 0]t , [1, 0, −1, 1, 0, 0]t } is a basis for W . An extension is
B ∪ {[1, 0, 0, 0, 0, 0]t , [0, 1, 0, 0, 0, 0]t , [0, 0, 0, 0, 1, 0]t }.

59. a1 v1 + Σni=2 ai (αv1 + vi ) = 0 ⇒ a1 + αa2 + . . . + αan = 0, a2 = 0, . . . , an = 0.

60. Examine that the given set is linearly independent.

61. The given set is linearly independent over C. An extension is {[1, 0, 0]t , [1, 1, 1]t , [1, 1, −1]t , [i, 0, 0]t , [i, i, i]t , [i, i, −i]t }.

62. A basis for W is {x − 1, (x − 1)(x − 2)}.

63. Only part (c) is not a subspace.


64. If B = {v1 , . . . , vk } then, because of the uniqueness, Σ ai vi = 0 ⇒ ai = 0 for each i.

65. dim(W1 ) = 3, dim(W2 ) = 3, dim(W1 + W2 ) = 4 and dim(W1 ∩ W2 ) = 2.

66. dim(W1 ∩ W2 ) ≤ dim(W2 ) = 5. Also dim(W1 ∩ W2 ) = dim(W1 ) + dim(W2 ) − dim(W1 + W2 ) ≥ 6 + 5 − 8 = 3.

67. dim(M ∩ N ) = dim(M ) + dim(N ) − dim(M + N ) ≥ 4 + 4 − 7 = 1.

68. If B = {v1 , . . . , vm } is a basis for M and B ∪ {u1 , . . . , un−m } is an extension of B, then consider
N = span(u1 , . . . , un−m ).

69. {[−1, 2, 0]t , [−2, 0, 1]t }, {[1, 1, 0]t , [1, 0, −1]t }, {[−5, −2, 3]t } and {[1, 1, 0]t , [1, 0, −1]t , [−1, 2, 0]t } are bases for
W1 , W2 , W1 ∩ W2 and W1 + W2 , respectively.

70. {[1, 1, 0]t , [−1, 1, 0]t , [1, 0, 2]t } is linearly independent. Also [1, 1, 1]t = [0, 1, 0]t + [1, 0, 1]t = [1, 1, 0]t + [0, 0, 1]t .

71. Both the spaces have equal dimension.

72. The set {[1, 0, 0]t , [1, 1, 0]t , [1, 1, 1]t } is linearly independent. Also [x, y, z]t = [x − z, y − z, 0]t + [z, z, z]t .

73. f ∈ Ve ∩ Vo ⇒ f (x) = f (−x), f (−x) = −f (x) for all x ∈ R ⇒ f (x) = −f (x) for all x ∈ R ⇒ f = 0. Also
f = fe + fo , where fe (x) = (f (x) + f (−x))/2 and fo (x) = (f (x) − f (−x))/2.

74. v = v1 + v2 = v1′ + v2′ ⇒ v1 − v1′ = v2′ − v2 ∈ V1 ∩ V2 = {0}.

75. [1/2, 3/2, 3]t .

76. [−1, −1, −1, 4]t .


   
77. [p(x)]B = [1, −1, 1]t , [p(x)]C = [1, 0, 1]t , PC←B = [ 1 0 0 ; 1 1 0 ; 1 1 1 ], PB←C = [ 1 0 0 ; −1 1 0 ; 0 −1 1 ].

78. (1 + xn ) + (−xn ) = 1. If B is the given basis then [2x − 1]B = [1/2, 3/4, −3/4]t , [x2 + 1]B = [1/2, 1/4, 3/4]t and
[x2 + 5x − 1]B = [2, 2, −1]t .

79. Let u1 = [1, 1, . . . , 1]t , u2 = [1, 2, 3, . . . , n]t and u3 = [0, . . . , 0, 1, 0, . . . , 0]t . Then [u1 ]B = e1 , [u2 ]B = u1 and
[u3 ]B = [0, . . . , 0, 1, −1, 0, . . . , 0]t , where 1 is at the i-th place.
   
80. PC←B = [ 2 2 −1 ; 2 3 −1 ; −1 −1 1 ], PB←C = [ 2 −1 1 ; −1 1 0 ; 1 0 2 ].

81. [e1 ]B = [0, 0, 1, −2]t , [e2 ]B = [1, 0, −1, 2]t , [e3 ]B = [0, 1, 0, −1/2]t and [e4 ]B = [0, 0, 0, 1/2]t .

82. v1 = −iu1 + u2 , v2 = (2 − i)u1 + iu2 , [u1 ]B = [(1 − i)/2, (1 + i)/2]t , [u2 ]B = [(3 + i)/2, (i − 1)/2]t .

83. [v]C = [x1 /a1 , . . . , xn /an ]t , [w]B = [1, 1, . . . , 1]t and [w]C = [1/a1 , . . . , 1/an ]t .
84. Let span(A) = V and B = {v1 , . . . , vn }, x = [x1 , . . . , xn ]t . Take v = x1 v1 + . . . + xn vn .
Then v = y1 u1 + . . . + ym um ⇒ x = [v]B = y1 [u1 ]B + . . . + ym [um ]B .

85. PB←C = (PC←B )−1 and C = {1 − 2x + 2x2 , 2x − x2 , 1 − 3x + 2x2 }.

86. a1 u1 + . . . + an un = 0 ⇒ P a = 0, where a = [a1 , . . . , an ]t ⇒ a = 0.

4 The Inverse of a Matrix
Definition 4.1. An n × n matrix A is said to be invertible if there exists an n × n matrix B satisfying AB = In = BA,
and B is called an inverse of A.

• Note that we can talk of invertibility only for square matrices.


" #
2 5
• For example, the matrix A = is invertible since
1 3
" #" # " # " #" #
2 5 3 −5 1 0 3 −5 2 5
= = .
1 3 −1 2 0 1 −1 2 1 3

• It is easy to see that the zero matrix O is never invertible.


" #
1 1
• The matrix is not invertible.
2 2

Result 4.1. If A is an invertible matrix, then its inverse is unique.

• We write A−1 to denote the inverse of an invertible matrix A.

• That is, if A is invertible then AA−1 = In = A−1 A.

• If A is a 1 × 1 invertible matrix, what is A−1 ?


" # " #
a b 1 d −b
Result 4.2. Let A = . If ad − bc 6= 0 then A is invertible and A−1 = ad−bc . If ad − bc = 0 then
c d −c a
A is not invertible.
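[A quick numerical check of the 2 × 2 formula — a sketch in Python, assuming numpy is available, using the matrix
from the example above:]

    import numpy as np

    a, b, c, d = 2.0, 5.0, 1.0, 3.0
    A = np.array([[a, b], [c, d]])
    A_inv = (1.0 / (a * d - b * c)) * np.array([[d, -b], [-c, a]])
    print(np.allclose(A @ A_inv, np.eye(2)))   # True: this is the inverse found earlier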

Result 4.3. Let A and B be two invertible matrices of the same size.

1. The matrix A−1 is also invertible, and (A−1 )−1 = A.

2. If c ≠ 0 then cA is also invertible, and (cA)−1 = (1/c)A−1 .

3. The matrix AB is invertible, and (AB)−1 = B −1 A−1 .

4. The matrix At is invertible, and (At )−1 = (A−1 )t .

5. For any non-negative integer k, the matrix Ak is invertible, and (Ak )−1 = (A−1 )k .

Proof. [Proof of Part 2] We have (cA)((1/c)A−1 ) = (c · (1/c))AA−1 = I and ((1/c)A−1 )(cA) = ((1/c) · c)A−1 A = I.
Hence cA is also invertible, and (cA)−1 = (1/c)A−1 .
[Proof of Part 4] Using the facts (AB)t = B t At and (AB)−1 = B −1 A−1 , we have

At (A−1 )t = (A−1 A)t = I t = I and (A−1 )t At = (AA−1 )t = I t = I.

Hence At is invertible, and (At )−1 = (A−1 )t .


Elementary Matrices: An elementary matrix is a matrix that can be obtained by performing an elementary
row operation on the identity matrix.

• Since there are three types of elementary row operations, there are three types of corresponding elementary
matrices.

• For example, the following are examples of the three types of elementary matrices of size 3.
     
E1 = [ 1 0 0 ; 0 0 1 ; 0 1 0 ],  E2 = [ 1 0 0 ; 0 5 0 ; 0 0 1 ],  E3 = [ 1 0 0 ; 0 1 0 ; −2 0 1 ].

• Note that E1 is obtained by performing R2 ↔ R3 on I3 , E2 is obtained by performing R2 → 5R2 on I3 and E3 is


obtained by performing R3 → R3 − 2R1 on I3 .

• Let A = [ a b c ; x y z ; p q r ]. Then we have

E1 A = [ a b c ; p q r ; x y z ],  E2 A = [ a b c ; 5x 5y 5z ; p q r ]  and
E3 A = [ a b c ; x y z ; p − 2a q − 2b r − 2c ].

• Notice that E1 A is the matrix obtained from A by performing the elementary row operation R2 ↔ R3 .

• The matrix E2 A is the matrix obtained from A by performing the elementary row operation R2 → 5R2 .

• The matrix E3 A is the matrix obtained from A by performing the elementary row operation R3 → R3 − 2R1 .
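[Result 4.4 below can be previewed computationally — a sketch in Python, assuming numpy is available: build E3 by
applying R3 → R3 − 2R1 to I3 , and check that E3 A equals the matrix obtained by applying the same operation to A
directly:]

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])    # any 3 x 3 matrix works here

    E3 = np.eye(3)
    E3[2, 0] = -2.0                 # R3 -> R3 - 2R1 applied to I_3

    B = A.copy()
    B[2, :] -= 2.0 * B[0, :]        # the same row operation applied to A
    print(np.allclose(E3 @ A, B))   # True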

Result 4.4.

1. Let E be an elementary matrix obtained by an elementary row operation on In . If the same elementary row
operation is performed on an n × r matrix A, then the resulting matrix is equal to EA.

2. The matrices B and A are row equivalent iff there are elementary matrices E1 , · · · , Ek such that B = Ek · · · E1 A.

Proof.
 
1. Let us write A = [aij ] as a row of columns, A = [a1 | a2 | . . . | ar ], and E = [rij ] as a column of rows
r1 , r2 , . . . , rn . The ij-th entry of EA is ri1 a1j + ri2 a2j + . . . + rin anj , that is, ri aj . We consider three cases.
Case I: Let E be the elementary matrix corresponding to the row operation Ri ↔ Rj . Then each row rk , for
k ≠ i, j, has 1 on the k-th place and zeroes elsewhere. The row ri has 1 on the j-th place and zeroes elsewhere.
The row rj has 1 on the i-th place and zeroes elsewhere. For k ≠ i, j, we have

kl-th entry of EA = rk al = rk1 a1l + . . . + rkk akl + . . . + rkn anl = rkk akl = akl = kl-th entry of A.

Thus all rows of EA except for the i-th and j-th rows coincide with the corresponding rows of A. Again

il-th entry of EA =ri al


=ri1 a1l + . . . + rij ajl + . . . + rin anl
=rij ajl , as ri has 1 on the j-th place and zero elsewhere
=ajl
=jl-th entry of A.

Also

jl-th entry of EA =rj al


=rj1 a1l + . . . + rji ail + . . . + rjn anl
=rji ail , as rj has 1 on the i-th place and zero elsewhere
=ail
=il-th entry of A.

Thus the i-th row of EA is equal to the j-th row of A and the j-th row of EA is equal to the i-th row of A. Thus
we see that EA is obtained from A by applying the same row operation used in order to get E from I.
Case II: Let E be the elementary matrix corresponding to the row operation Ri → xRi , x ≠ 0. Then each
row rk , for k ≠ i, has 1 on the k-th place and zeroes elsewhere. The row ri has x on the i-th place and zeroes
elsewhere. For k ≠ i, we have

kl-th entry of EA = rk al = rk1 a1l + . . . + rkk akl + . . . + rkn anl = rkk akl = akl = kl-th entry of A.

Thus all rows of EA except for the i-th row coincide with the corresponding rows of A. Again

il-th entry of EA =ri al


=ri1 a1l + . . . + rii ail + . . . + rin anl
=rii ail , as ri has x on the i-th place and zero elsewhere
=xail .

Thus the i-th row of EA is equal to x times the i-th row of A. Thus we see that EA is obtained from A by
applying the same operation used in order to get E from I.
Case III: Let E be the elementary matrix corresponding to the row operation Ri → Ri + xRj . Then each row
rk , for k ≠ i, has 1 on the k-th place and zeroes elsewhere. The row ri has 1 on the i-th place and x on the j-th
place and zeroes elsewhere. For k ≠ i, we have

kl-th entry of EA = rk al = rk1 a1l + . . . + rkk akl + . . . + rkn anl = rkk akl = akl = kl-th entry of A.

Thus all rows of EA except for the i-th row coincide with the corresponding rows of A. Again

il-th entry of EA
=ri al
=ri1 a1l + . . . + rii ail + . . . + rij ajl + . . . + rin anl
=rii ail + rij ajl , as ri has 1 on the i-th place and x on the j-th place and zero elsewhere
=ail + xajl .

Thus the i-th row of EA is obtained by adding x times the j-th row of A to the i-th row of A. Thus we see that
EA is obtained from A by applying the same operation used in order to get E from I.

2. The matrices B and A are row equivalent iff B can be obtained from A by applying a finite sequence of elementary
row operations. Since each application of elementary row operations on A is as good as pre-multiplication of A
by the corresponding elementary matrix, we find a sequence of elementary matrices E1 , E2 , · · · , Ek such that
B = Ek · · · E2 E1 A.

Note that every elementary row operation can be undone or reversed. Applying this fact to the previous elementary
matrices E1 , E2 and E3 , we see that they are invertible.

• Indeed, applying R2 ↔ R3 on I3 we find E1−1 , applying R2 → (1/5)R2 on I3 we find E2−1 , and applying
R3 → R3 + 2R1 on I3 we find E3−1 .

We have

E1−1 = [ 1 0 0 ; 0 0 1 ; 0 1 0 ] = E1 ,  E2−1 = [ 1 0 0 ; 0 1/5 0 ; 0 0 1 ],  E3−1 = [ 1 0 0 ; 0 1 0 ; 2 0 1 ].
Result 4.5. Every elementary matrix is invertible, and its inverse is an elementary matrix of the same type.

Proof. We consider three cases.


Case I: Let E be the elementary matrix corresponding to the elementary row operation Ri ↔ Rj . It is easy to see that
two times application of this row operation on any matrix does not change the matrix. Since the result of application
of the row operation on A is equal to EA, we find that EEA = A. In particular, we have EE = EEI = I. Hence E is
invertible, and E = E −1 .
Case II: Let k ≠ 0, and let E and F be the elementary matrices corresponding to the elementary row operations
Ri → kRi and Ri → (1/k)Ri , respectively. For any matrix A = [aij ], using Result 4.4, the i-th row of EA becomes
[kai1 kai2 . . . kaij . . . kain ] and all other rows remain unchanged. Therefore the i-th row of F EA becomes
[ai1 ai2 . . . aij . . . ain ] and all other rows remain unchanged. That is, F EA = A. Similarly, EF A = A. In
particular, for A = I, we have F E = I = EF . Hence E is invertible, and F = E −1 .
Case III: Let E and F be the elementary matrices corresponding to the elementary row operations Ri → Ri + kRl
and Ri → Ri − kRl , respectively. For any matrix A = [aij ], using Result 4.4, the i-th row of EA becomes
[ai1 + kal1 . . . aij + kalj . . . ain + kaln ] and all other rows remain unchanged. Therefore the i-th row of F EA becomes

[(ai1 + kal1 ) − kal1 . . . (aij + kalj ) − kalj . . . (ain + kaln ) − kaln ] = [ai1 . . . aij . . . ain ] ,

and all other rows remain unchanged. That is, F EA = A. Similarly, EF A = A. In particular, for A = I, we have
F E = I = EF . Hence E is invertible, and F = E −1 .
From the above three cases, we conclude that every elementary matrix is invertible, and its inverse is an elementary
matrix of the same type.

Result 4.6 (The Fundamental Theorem of Invertible Matrices: Version I). Let A be an n × n matrix. Then
the following statements are equivalent.

1. A is invertible.

2. At is invertible.

3. Ax = b has a solution for every b in Rn .

4. Ax = b has a unique solution for every b in Rn .

5. Ax = 0 has only the trivial solution.

6. The reduced row echelon form of A is In .

7. The rows of A are linearly independent.

8. The columns of A are linearly independent.

9. rank(A) = n.

10. A is a product of elementary matrices.

Proof. We will prove the equivalence of the statements as follows: the chain of implications

(1) =⇒ (3) =⇒ (4) =⇒ (5) =⇒ (6) =⇒ (10) =⇒ (1),

together with the equivalences (1) ⇐⇒ (2), (5) ⇐⇒ (8), (6) ⇐⇒ (7) and (6) ⇐⇒ (9).

(1) ⇐⇒ (2) Follows from Part 4 of Result 4.3.

(1) =⇒ (3) If A is invertible, then for every b ∈ Rn , the vector y = A−1 b is a solution of Ax = b.

(3) =⇒ (4) Assume that Ax = b has a solution for every b in Rn . We first claim that Ãx = b has a solution for every
b ∈ Rn , where à = RREF (A). Suppose, if possible, Ãx = b does not have a solution for some b ∈ Rn . Apply
the elementary row operations on [Ã | b] that convert Ã to A, to obtain the matrix [A | b′ ]. Since [A | b′ ] and
[Ã | b] are row-equivalent and Ãx = b does not have a solution, we find that the system Ax = b′ does not have
a solution. This contradicts our initial assumption. Thus we find that Ãx = b has a solution for every b ∈ Rn .
In particular, Ãx = en has a solution, where en = [0, 0, 0, . . . , 0, 1]t .
If à has a zero row, then, the last equation of Ãx = en gives 0 = 1, which is absurd. Hence we find that RREF (A)
does not have a zero row and therefore all the columns of A are leading columns. Consequently, all the variables
of Ax = b are leading variables. Since there are no free variables, we find that Ax = b cannot have more than
one solution for every b in Rn .

Aliter

Assume that Ax = b has a solution for every b in Rn . If RREF (A) has a zero row, then some i-th row of A
will be a linear combination of the remaining rows of A. Consequently, the i-th row of [A | ei ] can be made
[0 0 0 . . . 0 | 1] by elementary row operations. This gives that the system Ax = ei is inconsistent, which
is against the initial assumption. Thus all the rows of RREF (A) are non-zero, and therefore all the columns of
A are leading columns. Consequently, all the variables of Ax = b are leading variables. Since there are no free
variables, we find that Ax = b cannot have more than one solution for every b in Rn .

(4) =⇒ (5) Find it from the text book.

(5) =⇒ (6) Find it from the text book.

(6) =⇒ (10) Find it from the text book.

(10) =⇒ (1) Find it from the text book.

(5) ⇐⇒ (8) Let the columns of A be a1 , a2 , . . . , an , respectively. For α = [α1 , α2 , . . . , αn ]t ∈ Rn , we find that Aα = α1 a1 +
α2 a2 + . . . + αn an . Therefore

Ax = 0 has only the trivial solution


⇔Aα = 0 implies α = 0
⇔α1 a1 + α2 a2 + . . . + αn an = 0 implies α1 = 0 = α2 = . . . = αn
⇔ the columns a1 , a2 , . . . , an are linearly independent.

(6) ⇐⇒ (7) Notice that, since A is a square matrix, RREF (A) is not In iff RREF (A) contains a zero row. Thus the
contrapositive of the equivalence to be proved is: RREF (A) contains a zero row iff the rows of A are linearly dependent,
which follows from Result 3.6.

(6) ⇐⇒ (9) Recall that rank(A) is defined to be the number of non-zero rows in the reduced row echelon form (RREF) of A.
Therefore if RREF (A) is In , then clearly rank(A) = n. Conversely, if rank(A) = n, then there cannot be a zero
row in RREF (A). Since A is a square matrix and all the rows of RREF (A) are non-zero, we find that every
row of RREF (A) contains a leading term. Consequently, the columns of RREF (A) are the standard unit
vectors ei . Hence RREF (A) is In .
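[Several of the equivalent statements of Result 4.6 can be checked directly for a given matrix — a sketch in Python,
assuming sympy is available, using the matrix of Example 4.1 below:]

    import sympy as sp

    A = sp.Matrix([[1, 2, -1],
                   [0, 1, 0],
                   [1, 0, 0]])

    print(A.rank())        # 3 = n, statement (9)
    print(A.rref()[0])     # the identity I_3, statement (6)
    print(A.nullspace())   # [] : Ax = 0 has only the trivial solution, statement (5)
    print(A.inv())         # exists, so A is invertible, statement (1)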

Result 4.7. Let A be a square matrix. If B is a square matrix such that either AB = I or BA = I, then A is invertible
and B = A−1 .

Proof. Assume that BA = I and consider the equation Ax = 0. Multiplying both sides by B, we have BAx =
B0 = 0 ⇒ Ix = 0 ⇒ x = 0. Thus the equation Ax = 0 has only the trivial solution, and hence by the Fundamental
Theorem of Invertible Matrices, the matrix A is invertible, that is, A−1 exists and satisfies AA−1 = I = A−1 A. Now
BA = I ⇒ (BA)A−1 = IA−1 ⇒ B = A−1 .
Now let us assume that AB = I. As in the previous case, we find that B is invertible and A = B −1 . However, this
implies that A is also invertible and that A−1 = (B −1 )−1 = B.

Result 4.8. Let A be a square matrix. If a finite sequence of elementary row operations transforms A to the identity
matrix I, then the same sequence of elementary row operations transforms I into A−1 .

Gauss-Jordan Method for Computing Inverse:


Let A be an n × n matrix.

• Apply elementary row operations on the augmented matrix [A | In ].

• If A is invertible, then [A | In ] will be transformed to [In | A−1 ].

• If A is not invertible, then [A | In ] can never be transformed to a matrix of the type [In | B].
 
Example 4.1. Find the inverse of the matrix [ 1 2 −1 ; 0 1 0 ; 1 0 0 ], if it exists.
Solution. We have

[ 1 2 −1 | 1 0 0 ]             [ 1 0  0 | 0 0 1 ]                 [ 1 0  0 | 0 0  1 ]
[ 0 1  0 | 0 1 0 ]  R1 ↔ R3    [ 0 1  0 | 0 1 0 ]  R3 → R3 − R1   [ 0 1  0 | 0 1  0 ]
[ 1 0  0 | 0 0 1 ]             [ 1 2 −1 | 1 0 0 ]                 [ 0 2 −1 | 1 0 −1 ]

                                                                  [ 1 0  0 | 0  0  1 ]
                                                  R3 → R3 − 2R2   [ 0 1  0 | 0  1  0 ]
                                                                  [ 0 0 −1 | 1 −2 −1 ]

                                                                  [ 1 0 0 |  0 0 1 ]
                                                  R3 → (−1)R3     [ 0 1 0 |  0 1 0 ] .
                                                                  [ 0 0 1 | −1 2 1 ]

Hence the required inverse is [ 0 0 1 ; 0 1 0 ; −1 2 1 ].

Practice Problems Set 4

1. Using mathematical induction, prove that if A1, A2, . . . , An are invertible matrices of the same size then the
product A1A2 . . . An is also invertible and that (A1A2 . . . An)⁻¹ = Aₙ⁻¹Aₙ₋₁⁻¹ . . . A₁⁻¹ for all n ≥ 1.

2. Give a counterexample to show that (AB)−1 ≠ A−1B−1 in general, where A and B are two invertible matrices of
the same size. Find a necessary and sufficient condition such that (AB)−1 = A−1B−1.

3. Give a counterexample to show that (A + B)−1 ≠ A−1 + B−1 in general, where A and B are two matrices of the
same size such that each of A, B and A + B is invertible.

4. Show that if A is a square matrix that satisfies the equation A2 − 2A + I = O, then A is invertible and A−1 = 2I − A.

5. Solve each of the following matrix equations for X:

XA2 = A−1 , (A−1 X)−1 = A(B −2 A)−1 and ABXA−1 B −1 = I + A.

6. Let A be an invertible matrix. Show that no row or column of A can be entirely zero.

7. Find the inverse of the following matrices, whenever they exist, preferably using the Gauss-Jordan method
(here x, y, z are distinct real numbers):

[ 2+i  6 ]    [  3 −1 2 ]    [ 1 1 3 ]    [ 1  1  1  ]    [ 1 2 2 ]
[ 0   −3 ],   [ −6  3 1 ],   [ 4 5 6 ],   [ x  y  z  ],   [ 2 1 2 ],
              [ −7 −2 3 ]    [ 1 2 4 ]    [ x² y² z² ]    [ 2 2 1 ]

[ 1  0  0 0 ]        [ 1 −1  0  0 ]
[ a  1  0 0 ]        [ 2  4  0  0 ]
[ a² a  1 0 ]  and   [ 0  0  5  2 ].
[ a³ a² a 1 ]        [ 0  0 −6 −2 ]

8. Find the inverse of the following matrices, where a, b, c, d are all non-zero real numbers:

[ a 0 0 0 ]    [ 0 0 0 a ]    [ 0 a 0 0 ]         [ 1 0 0 0 ]
[ 0 b 0 0 ]    [ 0 0 b 0 ]    [ b 0 0 0 ]         [ a 1 0 0 ]
[ 0 0 c 0 ],   [ 0 c 0 0 ],   [ 0 0 0 c ]   and   [ 0 a 1 0 ].
[ 0 0 0 d ]    [ d 0 0 0 ]    [ 0 0 d 0 ]         [ 0 0 a 1 ]

9. Find a 3 × 3 real matrix A such that Au1 = u1 , Au2 = 2u2 and Au3 = 3u3 , where

u1 = [1, 2, 2]t , u2 = [2, −2, 1]t and u3 = [−2, −1, 2]t .

10. Let A, B, C, D be n × n matrices such that ABCD = I. Show that

ABCD = DABC = CDAB = BCDA = I.

11. Let A and B be two n × n matrices. Show that if AB = A ± B then AB = BA.

12. (a) Let A be the 3 × 3 matrix whose main diagonal entries are all 0 and whose remaining entries are all 1, i.e., aii = 0 for 1 ≤ i ≤ 3
and aij = 1 for i ≠ j. Show that A is invertible and find A−1.
(b) Let A be the n × n matrix whose main diagonal entries are all 0 and whose remaining entries are all 1, i.e., aii = 0 for 1 ≤ i ≤ n
and aij = 1 for i ≠ j. Show that A is invertible and find A−1.
" #
1 0
13. Let A = .
2 3

(a) Find elementary matrices E1 and E2 such that E2 E1 A = I2 .


(b) Write A and A−1 as a product of elementary matrices.

14. Let A and B be two n × n matrices and let B be invertible. If b ∈ Rn then show that the systems of equations
Ax = b and BAx = Bb are equivalent.

15. Let A = [aij] be a 5 × 5 invertible matrix such that Σ_{j=1}^{5} aij = 1 for i = 1, 2, 3, 4, 5. Show that the sum of all the
entries of A−1 is 5.

16. Prove that if a matrix A is row equivalent to B, then there exists an invertible matrix P such that B = P A.
Further, show that

(a) if A is n × n and invertible then P is unique; and


(b) if A is n × n and non-invertible then P need not be unique.

17. Find invertible matrices P and Q such that B = P AQ, where

A = [ 2 4 8 ]   and   B = [ 1 0 0 ]
    [ 1 3 2 ]             [ 0 1 0 ].

18. Show that the inverse of an invertible Hermitian matrix (i.e., A = A∗ ) is Hermitian. Also, show that the product
of two Hermitian matrices is Hermitian if and only if they commute.

19. Find the inverse of the matrix

[ 1 1 1 . . . 1 ]
[ 1 2 2 . . . 2 ]
[ 1 2 3 . . . 3 ]
[ ...           ]
[ 1 2 3 . . . n ].

20. Let A be an m × n matrix. Show that, by means of a finite number of elementary row and/or column operations,
one can transform A to a matrix R = [rij ] which is both ‘reduced row echelon form’ and ‘reduced column echelon
form’, i.e., rij = 0 if i 6= j, and there is a k ∈ {1, 2, . . . , n} such that rii = 1 if 1 ≤ i ≤ k and rii = 0 if i > k.
Show that R = P AQ, where P and Q are invertible matrices of sizes m × m and n × n, respectively.
Hints to Practice Problems Set 4

1. Use induction on n.

2. Take two invertible matrices which do not commute.

3. Take A = B = I2 .

4. The given condition gives I = (2I − A)A.

5. The solutions are (in order): X = A−3 , X = AB −2 and X = B −1 A−1 BA + A.

6. Show that AB = I is never possible if A has a zero row.

7. For each matrix A if you get the answer as B, then to be sure verify that AB = I.

8. For each matrix A if you get the answer as B, then to be sure verify that AB = I.

9. A[u1 , u2 , u3 ] = [u1 , 2u2 , 3u3 ].

10. For square matrices, XY = I ⇔ Y X = I.

11. AB = A − B ⇒ (A + I)(I − B) = I ⇒ (A + I)(I − B) = I = (I − B)(A + I) ⇒ AB = BA.


 
12. Here A = J − I, where each entry of J is 1. We have (J − I)((1/(n−1))J − I) = I.
" # " #
1 0 1 0
13. E1 = , E2 = and A−1 = E2 E1 , A = E1−1 E2−1 .
−2 1 0 1/3

14. Use the fact that B is a product of elementary matrices, OR directly show that both the systems have the same
solution sets.
15. If B = [bij] is the inverse of A, then summing all the entries of BA = I gives Σ_{i=1}^{5} Σ_{j=1}^{5} Σ_{k=1}^{5} bik akj = 5.
Since Σ_{j=1}^{5} akj = 1 for each k, this reduces to Σ_{i=1}^{5} Σ_{k=1}^{5} bik = 5.

Or note that A[1, 1, . . . , 1]t = [1, 1, . . . , 1]t which gives [1, 1, . . . , 1]t = A−1 [1, 1, . . . , 1]t .

16. Each elementary row operation corresponds to an elementary matrix.
(a) P A = B = QA ⇒ P = Q, on multiplying by A−1 on the right.
(b) Take A = [ 1 0 0 ; 0 0 0 ; 0 0 0 ] and B = [ 2 0 0 ; 0 0 0 ; 0 0 0 ]. Then both P = [ 2 0 0 ; 0 1 0 ; 0 0 1 ]
and Q = [ 2 0 0 ; 0 0 1 ; 0 1 0 ] satisfy P A = B = QA.

17. Find P and Q from the multiplication of the elementary matrices corresponding to the respective elementary row and
column operations. P = [ 3/2 −2 ; −1/2 1 ] and Q = [ 1 0 −8 ; 0 1 2 ; 0 0 1 ] are two such matrices. There could be other
such matrices also.

18. Easy.

19. xi is the solution of Ax = ei , where x1 = [2, −1, 0, . . . , 0]t , xn = [0, . . . , 0, −1, 1]t and xi = [0, . . . , 0, −1, 2, −1, 0, . . . , 0]t
for 1 < i < n. Then A−1 = [x1 , x2 , . . . , xn ].

20. Apply sufficient numbers of elementary row and column operations on A.

5 Determinant
Definition 5.1. Let A = [aij ] be an n × n matrix.

• For a 1 × 1 matrix A = [a], we define the determinant of A as det(A) = a.


" #
a b
• If A = , then we define det(A) = ad − bc.
c d

• In general, if Aij is the submatrix of A obtained by deleting the i-th row and the j-th column of A, then det(A) is
defined recursively as follows:
det(A) = a11 det(A11) − a12 det(A12) + . . . + (−1)^{1+n} a1n det(A1n) = Σ_{j=1}^{n} (−1)^{1+j} a1j det(A1j).

• Sometimes det(A) is also denoted by |A|.

• We define det(Aij ) to be the (i, j)-minor of A.

• The number Cij = (−1)i+j det(Aij ) is called the (i, j)-cofactor of A.

• Thus we can write

det(A) = |A| = a11 C11 + a12 C12 + . . . + a1n C1n = Σ_{j=1}^{n} a1j C1j.
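The recursive definition above can be implemented verbatim. A small illustrative Python sketch (the cofactor recursion does factorial-time work, so a built-in routine such as numpy.linalg.det is what one uses in practice):

```python
import numpy as np

def det_cofactor(A):
    """det(A) by cofactor expansion along the first row (Definition 5.1)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]                              # det([a]) = a
    total = 0.0
    for j in range(n):                              # 0-based j <-> 1-based j+1
        A1j = np.delete(A[1:, :], j, axis=1)        # delete row 1 and column j+1
        total += (-1) ** j * A[0, j] * det_cofactor(A1j)  # sign (-1)^{1+(j+1)}
    return total

A = [[1, 2, -1], [2, 2, 4], [1, 3, -3]]
print(det_cofactor(A), np.linalg.det(A))            # both give -2.0
```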

Result 5.1 (Properties of Determinants). Let A = [aij ] be an n × n matrix. Then

a11 C11 + a12 C12 + . . . + a1n C1n = det(A) = a11 C11 + a21 C21 + . . . + an1 Cn1 .

Result 5.2 (Properties of Determinants). Let A be an n × n matrix. If B is obtained by interchanging any two
rows of A, then det(B) = −det(A).

Result 5.3 (Laplace Expansion Theorem). The determinant of an n × n matrix A = [aij ], where n ≥ 2, can be
computed as
det(A) = ai1 Ci1 + ai2 Ci2 + . . . + ain Cin = Σ_{j=1}^{n} aij Cij

(this is the cofactor expansion along the i-th row), and also as

det(A) = a1j C1j + a2j C2j + . . . + anj Cnj = Σ_{i=1}^{n} aij Cij

(this is the cofactor expansion along the j-th column).

Result 5.4.

1. For any square matrix A, det(At ) = det(A).

2. The determinant of a triangular matrix is the product of the diagonal entries. That is, if A = [aij ] is an n × n
triangular matrix then det(A) = a11 a22 . . . ann .

Proof.

1. The rows of A are the columns of At . So, the proof follows from Laplace Expansion Theorem.

2. Let A be a lower triangular matrix of size n. We use induction on the size n. For n = 1, it is clear that
det(A) = a11 . Let us assume that for any k × k lower triangular matrix, the determinant is equal to the product
of the diagonal entries. Now let A = [aij ] be a (k + 1) × (k + 1) lower triangular matrix. Then we have

det(A) = a11 C11 + a12 C12 + . . . + a1,k+1 C1,k+1 = a11 C11 + 0·C12 + . . . + 0·C1,k+1 = a11 C11.

Since C11 is the determinant of a lower triangular matrix of size k with diagonal entries a22, . . . , ak+1,k+1, by the
induction hypothesis, we have C11 = a22 · · · ak+1,k+1. Therefore det(A) = a11 a22 · · · ak+1,k+1. Hence by the
principle of mathematical induction, we conclude that for any lower triangular matrix, the determinant is equal
to the product of the diagonal entries.
Now the proof for upper triangular matrix follows from the fact that At is lower triangular, whenever A is upper
triangular and that det(A) = det(At ). [Induction can also be used separately through column-wise expansion.]

Result 5.5. [Properties of Determinants]

1. If A has a zero row then det(A) = 0.

2. If A has two identical rows then det(A) = 0.

3. If B is obtained by multiplying a row of A by k, then det(B) = k.det(A).

4. If the matrices A, B and C are identical except that the i-th row of C is the sum of the i-th rows of A and B, then
det(C) = det(A) + det(B).

5. If C is obtained by adding a multiple of one row of A to another row, then det(C) = det(A).

6. If A = [aij], where n ≥ 2, then ai1 Cj1 + ai2 Cj2 + . . . + ain Cjn = 0 for i ≠ j.

Proof. [Proof of Part 1] Expanding through the zero row (by Laplace Expansion Theorem), we find det(A) = 0.

[Proof of Part 5] Let C be obtained by adding k times of the j-th row of A to the i-th row of A. Let B be the
matrix whose i-th row is equal to k times of the j-th row of A, and all other rows of B are identical with that of A.
Application of Part 2 and Part 3 give that det(B) = 0.
Now we see that the matrices A, B and C are identical except that the i-th row of C is the sum of the i-th rows of
A and B. Therefore det(C) = det(A) + det(B) = det(A) + 0 = det(A).

[Proof of Part 6] For i ≠ j, the sum ai1 Cj1 + ai2 Cj2 + . . . + ain Cjn is the cofactor expansion along the j-th row of the matrix obtained from A by replacing its j-th row with its i-th row. This matrix has two identical rows, so by Part 2 the sum is 0.

Result 5.6 (Determinants of Elementary Matrices). Let E be an n × n elementary matrix and A be any n × n
matrix. Then

1. det(E) = −1, k or 1.

2. det(EA) = det(E)det(A).

Proof.

1. Since det(In ) = 1, Result 5.2 and Result 5.5 (Part 3 and Part 5) give the desired result.

2. Let E1 be the elementary matrix corresponding to the elementary row operation Ri ←→ Rj . Then det(E1 ) = −1
and E1 A is the matrix obtained by interchanging the i-th and j-th rows of A. Therefore det(E1 A) = −det(A)
and also det(E1 )det(A) = −det(A). Thus det(E1 A) = det(E1 )det(A).
Let E2 be the elementary matrix corresponding to the elementary row operation Ri −→ kRi , k 6= 0. Then
det(E2 ) = k and E2 A is the matrix obtained by multiplying the i-th row of A by k. Therefore det(E2 A) = kdet(A)
and also det(E2 )det(A) = kdet(A). Thus det(E2 A) = det(E2 )det(A).
Let E3 be the elementary matrix corresponding to the elementary row operation Ri ←→ Ri + kRj . Then
det(E3 ) = 1 and E3 A is the matrix obtained by adding k-times of the j-th row of A to the i-th row of A.
Therefore det(E3 A) = det(A) and also det(E3 )det(A) = 1.det(A). Thus det(E3 A) = det(E3 )det(A).

Result 5.7.

1. A square matrix A is invertible if and only if det(A) ≠ 0.

2. Let A be an n × n matrix. Then det(kA) = kⁿ det(A).

3. Let A and B be two n × n matrices. Then det(AB) = det(A)det(B).


4. If the matrix A is invertible then det(A−1) = 1/det(A).

Proof. [Proof of Part 2] If k = 0, then clearly det(kA) = 0 = kⁿ det(A). For k ≠ 0, let Ei be the elementary matrix
corresponding to the elementary row operation Ri −→ kRi for i = 1, 2, . . . , n. Then we see that kA = E1E2 . . . EnA.
Therefore

det(kA) = det(E1E2 . . . EnA) = det(E1)det(E2) . . . det(En)det(A) = kⁿ det(A).
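These identities are easy to sanity-check numerically. A hedged NumPy snippet (random 4 × 4 integer matrices of our own choosing; over floating point, equality holds only up to round-off):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 4)).astype(float)
k, n = 2.5, 4

det = np.linalg.det
print(np.isclose(det(k * A), k**n * det(A)))        # det(kA) = k^n det(A)
print(np.isclose(det(A @ B), det(A) * det(B)))      # det(AB) = det(A)det(B)
if not np.isclose(det(A), 0):                       # A invertible
    print(np.isclose(det(np.linalg.inv(A)), 1 / det(A)))  # det(A^{-1}) = 1/det(A)
```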

★ A matrix A is said to be singular or non-singular according as det(A) = 0 or det(A) ≠ 0.

★ As a determinant can be expanded column-wise, all the previous results based on row-wise expansion of determinant
are also valid for column-wise expansion.

Result 5.8.

1. If B is obtained by interchanging any two columns of A, then det(B) = −det(A).

2. If A has a zero column then det(A) = 0.

3. If A has two identical columns then det(A) = 0.

4. If B is obtained by multiplying a column of A by k, then det(B) = kdet(A).

5. If A, B and C are identical except that the i-th column of C is the sum of the i-th columns of A and B, then
det(C) = det(A) + det(B).

6. If C is obtained by adding a multiple of one column of A to another column, then det(C) = det(A).

7. If A = [aij], where n ≥ 2, then a1i C1j + a2i C2j + . . . + ani Cnj = 0 for i ≠ j.

Proof. Follows from Result 5.2, Result 5.5 and det(At ) = det(A).
" # " #
P R P R
Result 5.9. Let the matrices , P and Q be square matrices. Then det = det(P )det(Q).
O Q O Q
" #
P R
Proof. We use induction on the size of P . Let A = = [aij ]. Let Cij be the co-factor of aij in A. If P = [a11 ]
O Q
is 1 × 1, then C11 = det(Q). Expanding through the first column, we have
" #
P R
det = a11 C11 + 0.C21 + . . . + 0.Cn1 = a11 C11 = det(P )det(Q).
O Q

Let us now assume that the result is true whenever the size of P is at most k − 1. Now let the size of P be k and
consider  
a11 a12 . . . a1k a1,k+1 ... a1n
 a a . . . a a2,k+1 ... a2n 
 21 22 2k 
 . . . . .. .. .. 
#   .. .. .. .. 
" . . . 
P R  
A = det =  ak1 ak2 . . . akk
 ak,k+1 ... akn .
O Q 


 0 0 . . . 0 a k+1,k+1 ... ak+1,n 
 . .. .. .. .. .. .. 
 . 
 . . . . . . . 
0 0 ... 0 an,k+1 ... ann
Then we have
det(A) = a11 C11 + a21 C21 + . . . + ak1 Ck1 .
0
For i = 1, . . . , k, let Ci1 denote the co-factor of ai1 in P . Notice that, for each i = 1, 2,". . . , k, the# sub-matrix Ai1 of A
Pi R i
obtained by deleting the i-th row and the first column is also a block matrix of the form , where the size of Pi
O Q
is k − 1. Further, Pi is the sub-matrix of P obtained by deleting the "i-th row and # the first column from P . Therefore by
Pi R i 0
induction hypothesis, we have Ci1 = (−1)i+1 det(Ai1 ) = (−1)i+1 det = (−1)i+1 det(Pi )det(Q) = Ci1 .det(Q).
O Q
Therefore

det(A) =a11 C11 + a21 C21 + . . . + ak1 Ck1


0 0 0
=a11 C11 .det(Q) + a21 C21 .det(Q) + . . . + ak1 Ck1 .det(Q)
0 0 0
=(a11 C11 + a21 C21 + . . . + ak1 Ck1 )det(Q)
=det(P )det(Q).

Thus we see that the "result is also


# true whenever the size of P is k. Hence by the principle of mathematical induction,
P R
we conclude that det = det(P )det(Q).
O Q

Definition 5.2. Let A be an n × n matrix and let b ∈ Rn . Then Ai (b) denotes the matrix obtained by replacing the
i-th column of A by b. That is, if A = [a1 a2 . . . an ], then Ai (b) = [a1 a2 . . . ai−1 b ai+1 . . . an ].

Result 5.10 (Cramer’s Rule). Let A be an n × n invertible matrix and let b ∈ Rn. Then the unique solution
x = [x1, x2, . . . , xn]t of the system Ax = b is given by

xi = det(Ai(b)) / det(A)   for i = 1, 2, . . . , n.
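Cramer's rule is immediate to code, although it is a tool for theory rather than computation (n + 1 determinants instead of one elimination). A minimal NumPy sketch, with a hypothetical helper name cramer_solve:

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b for invertible A via x_i = det(A_i(b)) / det(A)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    x = np.empty_like(b)
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                      # A_i(b): replace the i-th column by b
        x[i] = np.linalg.det(Ai) / d
    return x

A, b = [[2, 1], [1, 3]], [3, 5]
print(cramer_solve(A, b))                 # [0.8 1.4]
print(np.linalg.solve(A, b))              # same answer
```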
The Adjoint of a Matrix: Let A = [aij] be an n × n matrix and let Cij be the (i, j)-cofactor of A. Then the
adjoint of A, denoted adj(A), is defined as

adj(A) = [ C11 C21 . . . Cn1 ]
         [ C12 C22 . . . Cn2 ]
         [ ...               ]
         [ C1n C2n . . . Cnn ]  = [Cij]t.

Result 5.11. Let A be an n × n invertible matrix. Then A−1 = (1/det(A)) · adj(A).

Remark: The proof of Result 5.11 can be obtained using Part 6 of Result 5.5 or as given in the text book.

Example 5.1. Use the adjoint method to find the inverse of the matrix

A = [ 1 2 −1 ]
    [ 2 2  4 ]
    [ 1 3 −3 ].
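The text leaves the computation to the reader; as a check, the cofactor matrix can be assembled mechanically. A short NumPy sketch of the adjoint method (the function name is our own): it computes adj(A) = [Cij]t and divides by det(A).

```python
import numpy as np

def adjoint_inverse(A):
    """A^{-1} = adj(A)/det(A), with adj(A) the transpose of the cofactor matrix."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)   # cofactor C_ij
    return C.T / np.linalg.det(A)                              # adj(A) = C^t

A = [[1, 2, -1], [2, 2, 4], [1, 3, -3]]      # the matrix of Example 5.1
Ainv = adjoint_inverse(A)
print(Ainv)
print(np.allclose(np.array(A) @ Ainv, np.eye(3)))   # verifies A·A^{-1} = I
```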
Practice Problems Set 5

1. Find the inverse of the following matrices using the adjoint method, where x, y, z are distinct real numbers:

[ 1 1 3 ]    [ 1  1  1  ]         [ 1 2 2 ]
[ 4 5 6 ],   [ x  y  z  ]   and   [ 2 1 2 ].
[ 1 2 4 ]    [ x² y² z² ]         [ 2 2 1 ]

2. If A is an idempotent matrix (i.e., A2 = A), then find all possible values of det(A).

3. Let A = [aij ] be an n × n matrix such that aij = max{i, j}. Find det(A).

4. Let A = [aij] be an n × n matrix such that aij = j^{i−1}. Show that

det(A) = (n − 1)(n − 2)²(n − 3)³ · · · 3^{n−3} 2^{n−2}.

5. Let A = [a, r2 , r3 , r4 ] and B = [b, r2 , r3 , r4 ] be two 4 × 4 matrices, where a, b, r2 , r3 , r4 ∈ R4 . If det(A) = 4 and


det(B) = 1, find det(A + B). What is det(C), where C = [r4 , r3 , r2 , a + b]?

6. Let A be an n × n matrix.

(a) Show that if A2 + I = O then n must be an even integer.


(b) Does (a) remain true for complex matrices?

7. Let A be an n × n matrix. If AAt = I and det(A) < 0, then find det(A + I).

8. If A is a matrix satisfying A³ = 2I then show that the matrix B is invertible, where B = A² − 2A + 2I.

9. Show that if a ≠ b then det(A) = (a^{n+1} − b^{n+1})/(a − b), where A is an n × n matrix given as follows:

A = [ a+b  ab   0    . . .  0    0   ]
    [ 1    a+b  ab   . . .  0    0   ]
    [ 0    1    a+b  . . .  0    0   ]
    [ ...                            ]
    [ 0    0    0    . . .  a+b  ab  ]
    [ 0    0    0    . . .  1    a+b ].

If a = b then what happens to the value of det(A)?

10. The position of the (i, j)-th entry aij of an n × n matrix A = [aij ] is called even or odd according as i + j is even
or odd.

(a) Let B be the matrix obtained from multiplying all the entries of A in odd positions by −1. That is, if
B = [bij ] then bij = aij or bij = −aij according as i + j is even or odd. Show that det(B) = det(A).
(b) Let C = [cij ] be the matrix such that cij = −aij or cij = aij according as i + j is even or odd. Show that
det(C) = det(A) or det(C) = −det(A) according as n is even or odd.

11. Let λ ≠ 0. Determine |λI − A| for the 10 × 10 matrix

A = [ 0     1 0 . . . 0 0 ]
    [ 0     0 1 . . . 0 0 ]
    [ ...                 ]
    [ 0     0 0 . . . 0 1 ]
    [ 10¹⁰  0 0 . . . 0 0 ].

12. Show that an upper triangular (square) matrix is invertible if and only if every entry on its main diagonal is
non-zero.

13. Let A and B be two non-singular matrices of the same size. Are A + B, A − B and −A non-singular? Justify.

14. Each of the numbers 1375, 1287, 4191 and 5731 is divisible by 11. Show, without calculating the actual value, that
the determinant of the matrix

[ 1 1 4 5 ]
[ 3 2 1 7 ]
[ 7 8 9 3 ]
[ 5 7 1 1 ]

is also divisible by 11.
15. Let A = [aij] be an n × n matrix, where aij = 1/(i + j) for all i, j. Show that A is invertible.

16. Let A be an n × n matrix. Show that there exists an n × n non-zero matrix B such that AB = O if and only if
det(A) = 0.

17. Let aij be integers, where 1 ≤ i, j ≤ n. If for any set of integers b1, b2, . . . , bn, the system of linear equations

Σ_{j=1}^{n} aij xj = bi   for i = 1, 2, . . . , n,

has an integer solution [x1, x2, . . . , xn]t, then show that det(A) = ±1, where A = [aij].

18. Show that det(A∗) = det(Ā) equals the complex conjugate of det(A). Hence show that if A is Hermitian (i.e., A∗ = A) then det(A) is a real
number.

19. A matrix A is said to be orthogonal if AAt = I = At A. Show that if A is orthogonal then det(A) = ±1.

20. Let A = [aij] be an n × n complex matrix. Show that

(a) if A is skew-Hermitian (i.e., A∗ = −A) and n is even then det(A) is real;
(b) if A is skew-symmetric (i.e., At = −A) and n is odd then det(A) = 0;
(c) if A is invertible and A−1 = At then det(A) = ±1; and
(d) if A is invertible and A−1 = A∗ then |det(A)| = 1.

21. Suppose A is a 2 × 1 matrix and B is a 1 × 2 matrix. Prove that the matrix AB is not invertible. When is the
matrix BA invertible?

22. Prove or disprove: The matrix A, as given below, is invertible and all entries of A−1 are integers.

A = [ 1    1/2      1/3      . . .  1/n      ]
    [ 1/2  1/3      1/4      . . .  1/(n+1)  ]
    [ 1/3  1/4      1/5      . . .  1/(n+2)  ]
    [ ...                                    ]
    [ 1/n  1/(n+1)  1/(n+2)  . . .  1/(2n−1) ].

23. Find the determinant and the inverse of the n × n matrix
 
0 1 1 ... 1
 1 0 1 ... 1 
 
 
 1 1 0 ... 1 .
 
 .. .. .. .. .. 
 . . . . . 
1 1 1 ... 0

24. Assuming that all matrix inverses involved below exist, show that

(A − B)−1 = A−1 + A−1 (B −1 − A−1 )−1 A−1 .

In particular, show that

(I + A)−1 = I − (A−1 + I)−1 and |(I + A)−1 + (I + A−1 )−1 | = 1.

25. Let S be the backward identity matrix; that is,


 
0 0 ... 0 1

 0 0 ... 1 0 

 .. .. .. .. .. 
S=
 . . . . .
.

 
 0 1 ... 0 0 
1 0 ... 0 0

Show that S −1 = S t = S. Find det(S) and SAS for n × n matrix A = [aij ].


Hints to Practice Problems Set 5

1. For each matrix A if you get the answer as B, then to be sure verify that AB = I.

2. A2 = A ⇒ (det A)2 = det A.

3. Apply the row operations Rn → Rn − Rn−1, Rn−1 → Rn−1 − Rn−2, . . . , R2 → R2 − R1 and then expand along
the last column. We get det A = (−1)^{n+1} n.

4. Consider the Vandermonde matrix, where xi = i for i = 1, 2, . . . , n.

5. Use Result 5.1 repeatedly to find det(A + B) = 40 and det C = 5.

6. A² = −I ⇒ (det A)² = (−1)ⁿ. For the 2nd part, using the fact i² = −1, try to find a counterexample for n = 3.

7. det A = −1 and det(A + I) = det(A + AAt ). We find det(A + I) = 0.

8. B = A(A + 2I)(A − I). Also A³ = 2I ⇒ I = (A − I)(A² + A + I) and A³ + 8I = 10I ⇒ (A + 2I)(A² − 2A + 4I) = 10I.
Thus det(A − I) ≠ 0 and det(A + 2I) ≠ 0, and hence det B ≠ 0.

9. Use induction on n. Induction hypothesis: Assume the result to be true for matrices of size n, where n ≤ k. Now
prove for a matrix of size k + 1. [This is another version of method of induction]
[Note that we need to prove det(A) = (a^{n+1} − b^{n+1})/(a − b).] For the second part, take the limit of (a^{n+1} − b^{n+1})/(a − b) as a tends to b.

10. (a) The numbers i + j and i − j are either both even or both odd. Thus, multiplying the entries at odd positions
of A by −1 is the same as multiplying each entry of A by (−1)^{i−j}. Now use Problem 32 of the Tutorial Sheet.
(b) Multiplying the entries at even positions of A by −1 is the same as multiplying each entry of B by −1.

11. Expand along the last row. The answer is λ¹⁰ − 10¹⁰.

12. What is the determinant of a triangular matrix?

13. A + B and A − B need not be non-singular (find counterexamples). Also, det(−A) = (−1)ⁿ det(A).

14. Apply suitable elementary row operations.

15. To show that det(A) ≠ 0: subtract the last row of A from the n − 1 preceding rows. Now take the factors
1/(n+1), 1/(n+2), . . . , 1/(2n−1), 1/(2n) common from the respective columns and also take the factors n − 1, n − 2, . . . , 2 common
from the respective rows. Now in the remaining determinant, subtract the last column from each of the preceding
columns and take suitable factors common from the rows as well as from the columns. Finally, expand the
remaining determinant along a suitable row (or column) to get a similar determinant of size n − 1. Now use
mathematical induction on n.

16. Ax = 0 has a non-trivial solution iff det(A) = 0.

17. For each i, there is a column vector ci , each of whose entries are integers, such that Aci = ei . Set C = [c1 , . . . , cn ]
so that AC = I.

18. Easy.

19. det(AAt ) = 1.
20. (a) det(A∗) = det(−A) gives that the conjugate of det(A) equals (−1)ⁿ det(A). (b) Similar to the first part. (c) AAt = I. (d) AA∗ = I.

21. det(AB) = 0.

22. The given statement is correct. The proof is similar to Problem 15.

23. Apply the row operations R1 ↔ Rn, R2 → R2 − R1, R3 → R3 − R1, . . . , Rn → Rn − Rn−1 and then use
induction to find det(A) = (−1)^{n−1}(n − 1). Find A−1 as in Problem 12 of Section 4, or solve Ax = ei for each
i = 1, 2, . . . , n.
 
24. Check that (A − B)[A−1 + A−1(B−1 − A−1)−1A−1] = I.
25. Direct computation gives S² = I. Use induction to show that det(S) = (−1)^{(n²+3n)/2}. The (i, j)-th entry of SAS is
a_{n−i+1, n−j+1}. (Notice that if S = [uij] then uij = 1 iff i + j = n + 1.)

6 Subspaces Associated with Matrices
Definition 6.1. Let A be an m × n matrix.

1. The null space of A, denoted null(A), is the subspace of Cn consisting of the solutions of the homogeneous linear
system Ax = 0. In other words, null(A) = {x ∈ Cn | Ax = 0}.

2. The column space of A, denoted col(A), is the subspace of Cm spanned by the columns of A. In other words,
col(A) = {Ax | x ∈ Cn }.

3. The row space of A, denoted row(A), is the subspace of Cn spanned by the rows of A.
In other words, row(A) = {xT A | x ∈ Cm }.

[Here, elements of row(A) are row vectors. How can they be elements of Cn ? In strict sense, row(A) := col(AT ).]

Result 6.1. Let the matrices B and A be row equivalent. Then row(B) = row(A).

Corollary 6.1. For any A, row(A) = row(RREF(A)).

Proof. Since A and RREF (A) are row equivalent, we find that row(A) = row(RREF(A)).

Corollary 6.2. For any A, the non-zero rows of RREF(A) form a basis of row(A).

Proof. Since row(A) = row(RREF(A)), the non-zero rows of RREF(A) span row(A). Let {a1 , a2 , . . . , ar } be the set of
non-zero rows of RREF(A). We claim that {a1 , a2 , . . . , ar } is linearly independent to prove that the non-zero rows of
RREF(A) form a basis of row(A).
Let j1 , j2 , . . . , jr be the position of the first non-zero entry of a1 , a2 , . . . , ar , respectively. Since a1 , a2 , . . . , ar are the
non-zero rows of an RREF of a matrix, we find that j1 < j2 < . . . < jr . Therefore for α1 , α2 , . . . , αr ∈ R, the jk -th
entry of α1 a1 + α2 a2 + . . . + αr ar is αk for k = 1, 2, . . . , r. So

α 1 a1 + α 2 a2 + . . . + α r ar = 0 ⇒ α 1 = 0 = α 2 = . . . = α r .

Hence {a1, a2, . . . , ar} is linearly independent.

Suppose A and B are row-equivalent. Are col(A) and col(B) equal? No. Take A = [ 1 1 ; 1 1 ] and B = [ 1 1 ; 0 0 ].
Suppose A and B are row-equivalent. Do col(A) and col(B) have the same dimension? Yes. We will see soon. Indeed,
rank(A) = dim(row(A)) = dim(col(A)).
Method of Finding Bases of the Row Space, the Null Space and the Column Space of a
Matrix:
Let A be a given matrix and let R be the reduced row echelon form of A.

1. Use the non-zero rows of R to form a basis for row(A).

2. Solve the leading variables of Rx = 0 in terms of the free variables, set the free variables equal to parameters,
substitute back into x, write the result as a linear combination of k vectors (where k is the number of free
variables). These k vectors form a basis for null(A).

3. A basis for row(At) will also be a basis for col(A). Or, use the columns of A that correspond to the columns of
R containing the leading 1’s to form a basis for col(A).

Example 6.1. Find bases for the row space, column space and the null space of the following matrix:

A = [ 1 2 −1 ]
    [ 2 2  4 ]
    [ 4 6  2 ].

Solution. We have

[ 1 2 −1 ]
[ 2 2  4 ]
[ 4 6  2 ]

R2 → R2 − 2R1:
[ 1  2 −1 ]
[ 0 −2  6 ]
[ 4  6  2 ]

R3 → R3 − 4R1:
[ 1  2 −1 ]
[ 0 −2  6 ]
[ 0 −2  6 ]

R3 → R3 − R2:
[ 1  2 −1 ]
[ 0 −2  6 ]
[ 0  0  0 ]

R2 → −(1/2)R2:
[ 1 2 −1 ]
[ 0 1 −3 ]
[ 0 0  0 ]

R1 → R1 − 2R2:
[ 1 0  5 ]
[ 0 1 −3 ]  = R, say.
[ 0 0  0 ]

A basis for row(A) is {[1, 0, 5]t, [0, 1, −3]t}. A basis for col(A) is {[1, 2, 4]t, [2, 2, 6]t}. Also for x = [x, y, z]t, we have

Rx = 0 ⇒ x + 5z = 0, y − 3z = 0 ⇒ x = −5z, y = 3z.

Take z = s, so that x = −5s, y = 3s and x = [−5s, 3s, s]t = s[−5, 3, 1]t. Hence a basis for null(A) is {[−5, 3, 1]t}.
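The same computation can be checked with a computer algebra system. A hedged SymPy sketch for the matrix of Example 6.1 (rref() returns the reduced row echelon form together with the indices of the leading columns):

```python
from sympy import Matrix

A = Matrix([[1, 2, -1], [2, 2, 4], [4, 6, 2]])   # matrix of Example 6.1

R, pivots = A.rref()
print(R)              # rows [1, 0, 5] and [0, 1, -3]: a basis of row(A)
print(pivots)         # (0, 1): columns 1 and 2 of A give a basis of col(A)
print(A.nullspace())  # [Matrix([-5, 3, 1])]: a basis of null(A)
```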

Result 6.2. Let R = [b1 b2 . . . bn] be the reduced row echelon form of a matrix A = [a1 a2 . . . an] of rank r. Let
bj1, bj2, . . . , bjr be the columns of R such that bjk = ek for k = 1, . . . , r. Then {aj1, aj2, . . . , ajr} is a basis for col(A).

Proof. Since R is the RREF of A, we have R = E1 E2 . . . Ek A for some elementary matrices E1 , E2 , . . . , Ek . Taking
E1 E2 . . . Ek = M , we find that M is an invertible matrix and R = M A. Since the i-th column of M A is M ai , we have
bi = M ai for i = 1, 2, . . . , n. Now for α1, α2, . . . , αr ∈ R, we have

α 1 aj1 + α 2 aj2 + . . . + α r ajr = 0


⇒M (α1 aj1 + α2 aj2 + . . . + αr ajr ) = M 0 = 0
⇒α1 (M aj1 ) + α2 (M aj2 ) + . . . + αr (M ajr ) = 0
⇒α1 bj1 + α2 bj2 + . . . + αr bjr = 0
⇒α1 e1 + α2 e2 + . . . + αr er = 0
⇒α1 = α2 = . . . = αr = 0.

Therefore {aj1, aj2, . . . , ajr} is linearly independent. Now let x ∈ col(A). Then there exist α1, α2, . . . , αn ∈ R such
that x = α1 a1 + α2 a2 + . . . + αn an. Notice that each bj is of the form bj = [b1j, b2j, . . . , brj, 0, 0, . . . , 0]t. Now we have

x = α1 a1 + α2 a2 + . . . + αn an
⇒ Mx = α1(Ma1) + α2(Ma2) + . . . + αn(Man) = α1 b1 + α2 b2 + . . . + αn bn
⇒ Mx = α1(b11 e1 + . . . + br1 er) + α2(b12 e1 + . . . + br2 er) + . . . + αn(b1n e1 + . . . + brn er)
⇒ Mx = (α1 b11 + . . . + αn b1n) e1 + . . . + (α1 br1 + . . . + αn brn) er
⇒ Mx = (α1 b11 + . . . + αn b1n) bj1 + . . . + (α1 br1 + . . . + αn brn) bjr
⇒ x = (α1 b11 + . . . + αn b1n)(M−1 bj1) + . . . + (α1 br1 + . . . + αn brn)(M−1 bjr)
⇒ x = (α1 b11 + . . . + αn b1n) aj1 + . . . + (α1 br1 + . . . + αn brn) ajr.

Thus x is a linear combination of aj1, aj2, . . . , ajr, and so {aj1, aj2, . . . , ajr} spans col(A).
Finally, {aj1 , aj2 , . . . , ajr } is linearly independent and spans col(A), and hence is a basis for col(A).
Remark: Note that if {x1 , x2 , . . . , xk } is a linearly independent subset of Rn and P is an n × n invertible matrix, then
it is immediate from the previous proof that {P x1 , P x2 , . . . , P xk } is also linearly independent. Also, if {x1 , x2 , . . . , xk }
is a basis of a subspace of Rn and P is an n × n invertible matrix, then it is immediate from the previous proof that
{P x1 , P x2 , . . . , P xk } is also a basis of the same subspace.

Result 6.3. The row space and the column space of a matrix A have the same dimension, and dim(row(A)) =
dim(col(A)) = rank(A).

Proof. By Corollary 6.2, the non-zero rows of RREF (A) form a basis for row(A). Therefore dim(row(A)) = rank(A).
Again, from Result 6.2, dim(col(A)) = rank(A).

Result 6.4. For any matrix A, we have rank(At ) = rank(A).

Nullity: The nullity of a matrix A is the dimension of its null space, and is denoted by nullity(A).
Note that nullity(A) = the number of free variables of the system Ax = 0.

Result 6.5 (Rank Nullity Theorem). Let A be an m × n matrix. Then

rank(A) + nullity(A) = n.
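For any concrete matrix the theorem is a one-line check. A small SymPy sketch (rank(A) is the number of leading columns, nullity(A) the number of basis vectors of the null space):

```python
from sympy import Matrix

A = Matrix([[1, 2, -1], [2, 2, 4], [4, 6, 2]])   # an m x n matrix with n = 3
n = A.cols
print(A.rank())                             # 2
print(len(A.nullspace()))                   # 1 = nullity(A)
print(A.rank() + len(A.nullspace()) == n)   # True: rank + nullity = n
```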

Result 6.6 (The Fundamental Theorem of Invertible Matrices: Version II). Let A be an n × n matrix. Then
the following statements are equivalent.

1. A is invertible.

2. At is invertible.

3. Ax = b has a solution for every b in Rn .

4. Ax = b has a unique solution for every b in Rn .

5. Ax = 0 has only the trivial solution.

6. The reduced row echelon form of A is In .

7. The rows of A are linearly independent.

8. The columns of A are linearly independent.

9. rank(A) = n.

10. A is a product of elementary matrices.

11. nullity(A) = 0.

12. The column vectors of A span Rn .

13. The column vectors of A form a basis for Rn .

14. The row vectors of A span Rn .

15. The row vectors of A form a basis for Rn .

Proof. The equivalence of the statements (1) to (10) is proved in Result 4.6. The remaining equivalences will be
proved as follows: (9) ⇐⇒ (11); (12) ⇐⇒ (13) together with (13) ⇐⇒ (8); and (14) ⇐⇒ (15) together with (15) ⇐⇒ (7).

(9) ⇐⇒ (11) From rank(A) + nullity(A) = n, we find that rank(A) = n ⇐⇒ nullity(A) = 0.

(12) ⇐⇒ (13) Suppose the column vectors {a1, a2, . . . , an} of A span Rn. If {a1, . . . , an} is linearly dependent, then some column
ai is a linear combination of the remaining columns. Consequently, the n − 1 columns {a1, . . . , ai−1, ai+1, . . . , an}
also span Rn, contradicting the fact that dim(Rn) = n. Therefore {a1, a2, . . . , an} is linearly independent, which
along with the fact that span({a1, a2, . . . , an}) = Rn gives that {a1, a2, . . . , an} is a basis for Rn.
Now (13) =⇒ (12) follows directly from the definition of basis.

(13) ⇐⇒ (8) Suppose the column vectors {a1, a2, . . . , an} are linearly independent. If {a1, a2, . . . , an} does not form a basis
for Rn, then it does not span Rn, and so it extends to a linearly independent set of more than n vectors, contradicting
dim(Rn) = n. Hence the column vectors of A form a basis for Rn.
Now (13) =⇒ (8) follows directly from the definition of basis.

(14) ⇐⇒ (15) Suppose the row vectors {r1, r2, . . . , rn} of A span Rn. If {r1, . . . , rn} is linearly dependent, then some row ri
is a linear combination of the remaining rows. Consequently, the n − 1 rows {r1, . . . , ri−1, ri+1, . . . , rn} also span
Rn, contradicting the fact that dim(Rn) = n. Therefore {r1, r2, . . . , rn} is linearly independent, which along with
the fact that span({r1, r2, . . . , rn}) = Rn gives that {r1, r2, . . . , rn} is a basis for Rn.
Now (15) =⇒ (14) follows directly from the definition of basis.

(15) ⇐⇒ (7) Suppose the row vectors {r1, r2, . . . , rn} are linearly independent. If {r1, r2, . . . , rn} does not form a basis for Rn,
then it does not span Rn, and so it extends to a linearly independent set of more than n vectors, contradicting
dim(Rn) = n. Hence the row vectors of A form a basis for Rn.
Now (15) =⇒ (7) follows directly from the definition of basis.

Example 6.2. Show that the vectors [1, 2, 3]t , [−1, 0, 1]t and [4, 9, 7]t form a basis for R3 .

Result 6.7. Let A be an m × n matrix. Then

1. rank(At A) = rank(A).

2. The n × n matrix At A is invertible if and only if rank(A) = n.

Result 6.8. Let A, B, T, S be matrices, where T and S are invertible.

1. If T A and AS are defined, then rank(T A) = rank(A) = rank(AS).

2. If AB is defined then rank(AB) ≤ min{rank(A), rank(B)}.

3. If A + B is defined then rank(A + B) ≤ rank(A) + rank(B).

Proof. We provide the proof of Part 1. See the Tutorial Sheet for the proofs of the other two parts.
If T is invertible, then T = E1E2 . . . Ek for some elementary matrices E1, E2, . . . , Ek. Thus T A = E1E2 . . . EkA.
Since the effect of pre-multiplication of a given matrix by an elementary matrix is the same as the effect of applying the
corresponding elementary row operation, we conclude that T A and A are row equivalent. Therefore row(T A) = row(A),
and hence rank(T A) = rank(A).
Now rank(AS) = rank((AS)t) = rank(StAt) = rank(At) = rank(A).
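Part 1 can be spot-checked quickly. A hedged SymPy sketch, with one rank-2 matrix and two invertible factors of our own choosing:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3], [2, 4, 6], [1, 0, 1]])    # rank(A) = 2
T = Matrix([[1, 1, 0], [0, 1, 0], [2, 0, 1]])    # det(T) = 1, invertible
S = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 3]])    # det(S) = -3, invertible

print(A.rank(), (T * A).rank(), (A * S).rank())  # 2 2 2
```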

Result 6.9. Let A be an n × n matrix of rank r, where 1 ≤ r < n. Then there exist elementary matrices E1, . . . , Ep
and F1, . . . , Fq such that

E1 . . . Ep A F1 . . . Fq = [ Ir O ]
                            [ O  O ].

Proof. See Tutorial Sheet.


Practice Problems Set 6

1. Determine the reduced row echelon form for each of the following matrices. Hence, find a basis for each of the
corresponding row spaces, column spaces and the null spaces.

[  1 −1  2 3 ]   [  1  2  3  4 ]   [ 3  4 5 −6 ]   [ 1  2 1 3  2 ]
[  0  5  6 2 ]   [  5  6  7  8 ]   [ 2  3 1  1 ]   [ 0  2 2 2  4 ]
[ −1  2  4 3 ]   [  9 10 11 12 ]   [ 0  2 0  0 ]   [ 2 −2 4 0  8 ]
[  1  2 −1 2 ]   [ 13 14 15 16 ]   [ 5 −5 5  5 ]   [ 4  2 5 6 10 ]

2. Prove that if R is an echelon form of a matrix A, then the non-zero rows of R form a basis of row(A).

3. Give examples to show that the column space of two row equivalent matrices need not be the same.

4. Find a matrix whose row space contains the vector [1, 2, 1]t and whose null space contains the vector [1, −2, 1]t ,
or prove that there is no such matrix.

5. If a matrix A has rank r then prove that A can be written as the sum of r matrices, each of which has rank 1.
 
6. If the rank of the matrix

[ 1  2 −1  1 ]
[ 2  0  t  0 ]
[ 0 −4  5 −2 ]

is 2, find all the possible values of t.

 
7. For what values of t is the rank of the matrix

[ t 1 1 1 ]
[ 1 t 1 1 ]
[ 1 1 t 1 ]
[ 1 1 1 t ]

equal to 3?

8. Let A and B be two n × n matrices. Show that if AB = O then rank(A) + rank(B) ≤ n.

9. Let A be an n × n matrix such that A2 = A3 and rank(A) = rank(A2 ). Show that A = A2 . Also, show that the
condition rank(A) = rank(A2 ) cannot be dropped even for a 2 × 2 matrix.

10. If B is a sub matrix of a matrix A obtained by deleting s rows and t columns from A, then show that
rank(A) ≤ s + t + rank(B).

11. Let A be an n × n matrix. Show that A2 = A if and only if rank(A) + rank(A − I) = n.

12. Find the values of λ ∈ R for which β = [0, λ, λ²]t belongs to the column space of A, where

A = [ 1+λ  1    1   ]
    [ 1    1+λ  1   ]
    [ 1    1    1+λ ].

13. Let A be a square matrix. If rank(A) = rank(A2 ), show that the linear systems of equations Ax = 0 and A2 x = 0
have the same solution space.

14. Let A be a p × n matrix and B be a q × n matrix. If rank(A) + rank(B) < n then show that there exists an
x (6= 0) ∈ Rn such that Ax = 0 and Bx = 0.

15. Let A and B be n × n matrices such that nullity(A) = l and nullity(B) = m. Show that nullity(AB) ≥ max(l, m).

16. Let A be an m × n matrix of rank r. Show that A can be expressed as A = BC, where each of B and C have
rank r, B is a matrix of size m × r and C is a matrix of size r × n.

17. Let A and B be two matrices such that AB is defined and rank(A) = rank(AB). Show that A = ABX for some
matrix X. Similarly, if BA is defined and rank(A) = rank(BA) then show that A = Y BA for some matrix Y .

18. Let A be an m × n matrix with complex entries. Show that the system A∗Ax = A∗b is consistent for each b ∈ Cm.
Hints to Practice Problems Set 6

1. The reduced row echelon forms are given in Problem 2 of Section 2. Bases for the row space, column space
and the null space are:
First matrix: {[1, 0, 0, 0]t , [0, 1, 0, 0]t , [0, 0, 1, 0]t , [0, 0, 0, 1]t }, {[1, 0, −1, 1]t , [−1, 5, 2, 2]t , [2, 6, 4, −1]t , [3, 2, 3, 2]t } and
∅, respectively.
Second matrix: {[1, 0, −1, −2]t , [0, 1, 2, 3]t }, {[1, 5, 9, 13]t , [2, 6, 10, 14]t } and {[1, −2, 1, 0]t , [2, −3, 0, 1]t }, respec-
tively.
Third matrix: {[1, 0, 0, 0]t , [0, 1, 0, 0]t , [0, 0, 1, 0]t , [0, 0, 0, 1]t }, {[3, 2, 0, 5]t , [4, 3, 2, −5]t , [5, 1, 0, 5]t , [−6, 1, 0, 5]t } and
∅, respectively.
Fourth matrix: {[1, 0, 0, 1, 0]t , [0, 1, 0, 1, 0]t , [0, 0, 1, 0, 0]t , [0, 0, 0, 0, 1]t }, {[1, 0, 2, 4]t , [2, 2, −2, 2]t , [1, 2, 4, 5]t , [2, 4, 8, 10]t }
and {[−1, −1, 0, 1, 0]t }, respectively.

2. Show that the non-zero rows of R are linearly independent.

3. One example is given in this lecture note. Find other examples of your own.

4. Not possible, since [1, 2, 1][1, −2, 1]t ≠ 0.


" #
Ir O
5. There are invertible matrices P and Q such that P AQ = .
O O

6. t = 3.

7. t = −3.

8. If AB = O then col(B) ⊆ null(A).

9. I = (I − A) + A ⇒ n ≤ rank(I − A) + rank(A). Also A³ = A² ⇒ rank(I − A) ≤ nullity(A²) = n − rank(A²). Now
use Problem 11. Take [ 0 0 ; 1 0 ] as a counterexample.

10. If As is the matrix obtained by deleting s rows of A, then rank(A) − s ≤ rank(As ).

11. n ≤ rank(A − I) + rank(A). Also A2 = A ⇒ rank(A − I) ≤ nullity(A) = n − rank(A). Conversely, n =


rank(A − I) + rank(A) ⇒ rank(A − I) = nullity(A). Then z ∈ null(A) ⇒ z = −(A − I)z ∈ col(A − I). Thus
null(A) ⊆ col(A − I) ⇒ null(A) = col(A − I), and so A(A − I)x = 0 for all x ∈ Rn . Hence A2 = A.

12. λ ≠ −3 (transform the matrix to row echelon form).

13. null(A) ⊆ null(A2 ). Also rank(A) = rank(A2 ) ⇒ nullity(A) = nullity(A2 ).

14. The formula dim(U + W ) = dim(U ) + dim(W ) − dim(U ∩ W ) will be discussed in vector space.

dim[null(A) ∩ null(B)] = dim[null(A)] + dim[null(B)] − dim[null(A) + null(B)]


= n − rank(A) + n − rank(B) − dim[null(A) + null(B)] > n − dim[null(A) + null(B)] ≥ 0.

15. nullity(AB) = n − rank(AB) ≥ n − max{n − l, n − m} = max{l, m}.


16. Let A = [a1, . . . , an] and {b1, . . . , br} a basis for col(A). If ai = Σ_{j=1}^{r} αij bj then take B = [b1, . . . , br] and
C = [αji].

17. col(AB) ⊆ col(A), rank(A) = rank(AB) ⇒ col(AB) = col(A). Now if A = [a1, . . . , an], AB = [b1, . . . , bk] and
ai = Σ_{j=1}^{k} αij bj then take X = [αji].

18. Show that rank(A∗ A) ≤ rank([A∗ A | A∗ b]) ≤ rank(A∗ [A | b]) ≤ rank(A∗ ) and rank(A∗ A) = rank(A∗ ). Hence
rank(A∗ A) = rank([A∗ A | A∗ b]).

7 Linear Transformation
• Suppose A ∈ Mm×n . Take v ∈ Rn . Then Av ∈ Rm . Thus, we have a map (function) F : Rn → Rm given by
F (v) = Av.

• Take F : R[x] → R[x] given by F (p(x)) = p′(x).

• Take F : R[x] → R given by F (p(x)) = p(3).

What is common in all of these? They are maps (functions) whose domains and co-domains are vector spaces
over R. Further,

F (u + v) = F (u) + F (v),    F (αv) = αF (v),

or, equivalently, F (αu + βv) = αF (u) + βF (v). Such functions are called linear transformations (LT).
Definition 7.1. A linear transformation (or a linear mapping) from a vector space V (F) into a vector space W (F)
is a mapping T : V → W such that for all u, v ∈ V and for all a ∈ F

T (au + v) = aT (u) + T (v).

Equivalently, T (u + v) = T (u) + T (v) and T (av) = aT (v) for all u, v ∈ V, a ∈ F.


Example 7.1. Let A be an m × n matrix. Define T : Rn → Rm such that T (x) = Ax for all x ∈ Rn . Then T is a
linear transformation from Rn into Rm .
Example 7.2. The map T : R → R, defined by T (x) = x + 1 for all x ∈ R, is not a linear transformation.
Example 7.3. The map T : R2 → R2 , defined by T ([x, y]t ) = [2x, x + y]t for all [x, y]t ∈ R2 , is a linear transformation.
Solution. Let u = [x1 , y1 ]t , v = [x2 , y2 ]t ∈ R2 and a ∈ R. We have au + v = [ax1 + x2 , ay1 + y2 ]t . So

T (au + v) =T ([ax1 + x2 , ay1 + y2 ]t )


=[2(ax1 + x2 ), (ax1 + x2 ) + (ay1 + y2 )]t
=a[2x1 , x1 + y1 ]t + [2x2 , x2 + y2 ]t
=aT (u) + T (v).

Hence T is a linear transformation.


Example 7.4. Let V and W be two vector spaces. The map T0 : V → W , defined by T0 (v) = 0 for all v ∈ V , is a
linear transformation. The map T0 is called the zero transformation.
Example 7.5. Let V be a vector space. The map I : V → V , defined by I(v) = v for all v ∈ V , is a linear
transformation. The map I is called the identity transformation.
Result 7.1. Let T : V → W be a linear transformation. Then
1. T (0) = 0;

2. T (−v) = −T (v) for all v ∈ V ; and

3. T (u − v) = T (u) − T (v) for all u, v ∈ V .


Proof. For the second part, we have T (−v) = T ((−1)v) = (−1)T (v) = −T (v).
Example 7.6. Suppose T : R2 → R2 [x] is a linear transformation such that T [1, 0]t = 2 − 3x + x2 and T [0, 1]t = 1 − x2 .
Find T [2, 3]t and T [a, b]t .
Result 7.2. Let {v1 , . . . , vn } be a basis of V (dim(V ) = n). Let u1 , . . . , un be arbitrarily chosen in W . Then there is
a unique linear transformation T : V −→ W such that T (vi ) = ui .
Proof. Let v ∈ V . Then v equals a unique linear combination a1 v1 + · · · + an vn . Define T (v) = a1 u1 + · · · + an un ∈ W .
The resulting map T is the linear transformation we are looking for.
Remark: To define (know) a linear transformation, it is enough to define (know) the images of vectors in any basis of
the domain.
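When V = Rn, the construction in Result 7.2 is a matrix computation: if the basis vectors form the columns of a matrix V and their prescribed images the columns of a matrix U, then T (x) = U V⁻¹ x, since V⁻¹x is exactly the coordinate vector of x in the chosen basis. A small NumPy sketch (the particular basis and images below are our own illustration):

```python
import numpy as np

V = np.array([[1, 1], [1, -1]], dtype=float).T       # columns: v1=[1,1], v2=[1,-1]
U = np.array([[1, 0, 2], [0, 1, 0]], dtype=float).T  # columns: u1, u2 in R^3

M = U @ np.linalg.inv(V)          # matrix of the unique T with T(v_i) = u_i

x = np.array([3.0, 1.0])          # x = 2*v1 + 1*v2
print(M @ x)                      # [2. 1. 4.]
print(2 * U[:, 0] + 1 * U[:, 1])  # T(x) = 2*u1 + 1*u2, the same vector
```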
Composition of Linear Transformation: Let T : U → V and S : V → W be two linear transformations.
Then the composition of S with T is the mapping S ◦ T : U → W defined by

(S ◦ T )(u) = S(T (u)) for all u ∈ U.

Result 7.3. Let T : U → V and S : V → W be two linear transformations. Then the composition S ◦ T is also a linear
transformation.

Inverse of a Function: A function f : X → Y is said to be invertible if there is another function g : Y → X


such that
g ◦ f = IX and f ◦ g = IY .

• If f is invertible, then the function g satisfying g ◦ f = IX, f ◦ g = IY is called the inverse of f.

• Inverse of a function, if it exists, is unique.

• The symbol f −1 is used to denote the inverse of f .

• Inverse of a linear transformation is linear.

Example 7.7. Let T : R2 → R2 and S : R2 → R2 be defined by

T [x, y]t = [x − y, −3x + 4y]t and S[x, y]t = [4x + y, 3x + y]t for all [x, y]t ∈ R2 .

Then S is the inverse of T .

Solution. We have

(T ◦ S)(x, y) = T (S(x, y)) = T ([4x + y, 3x + y]t ) = [(4x + y) − (3x + y), −3(4x + y) + 4(3x + y)]t = [x, y]t = I([x, y]t ),

(S ◦ T )(x, y) = S(T (x, y)) = S([x − y, −3x + 4y]t ) = [4(x − y) + (−3x + 4y), 3(x − y) + (−3x + 4y)]t = [x, y]t = I([x, y]t ).
Thus T ◦ S = I and S ◦ T = I, and therefore T −1 = S.
Kernel and Range: Let T : V → W be a linear transformation. Then the kernel of T (null space of T ), denoted
ker(T ), and the range of T , denoted range(T ), are defined as

ker(T ) = {v ∈ V : T (v) = 0}, and

range(T ) = {w ∈ W : w = T (v) for some v ∈ V }.

Result 7.4. Let T : V → W be a linear transformation and let B = {v1 , v2 , . . . , vk } be a spanning set for V . Then
T (B) = {T (v1 ), T (v2 ), . . . , T (vk )} spans the range of T .

Example 7.8. Let A be an m × n matrix. Define T : Rn → Rm such that T (x) = Ax for all x ∈ Rn . Then
ker(T ) = null(A) and range(T ) = col(A).

Result 7.5. Let T : V → W be a linear transformation. Then ker(T ) is a subspace of V and range(T ) is a subspace of W .

Definition 7.2. Let T : V → W be a linear transformation. Then we define

• rank(T ) = dimension of range(T ); and

• nullity(T ) = dimension of ker(T ).


Example 7.9. Let D : R3[x] → R2[x] be defined by D(p(x)) = (d/dx) p(x). Then rank(D) = 3 and nullity(D) = 1.

Result 7.6 (The Rank-Nullity Theorem). Let T : V → W be a linear transformation from a finite dimensional
vector space V into a vector space W . Then

rank(T ) + nullity(T ) = dim(V ).

Definition 7.3. Let T : V → W be a linear transformation. Then

1. T is called one-one if T maps distinct vectors in V into distinct vectors in W .

2. T is called onto if range(T ) = W .

• For all u, v ∈ V, if u ≠ v implies that T (u) ≠ T (v), then T is one-one.

• For all u, v ∈ V , if T (u) = T (v) implies that u = v, then T is one-one.

• For all w ∈ W , if there is at least one v ∈ V such that T (v) = w, then T is onto.

Example 7.10. Some examples of one-one and onto linear transformation.

• T : R → R2 defined by T (x) = [x, 0]t , x ∈ R is one-one but not onto.

• T : R2 → R defined by T [x, y]t = x, for [x, y]t ∈ R2 is onto but not one-one.

• T : R2 → R2 defined by T [x, y]t = [−x, −y]t , for [x, y]t ∈ R2 is one-one and onto.

Solution.

• For x, y ∈ R, we have T (x) = T (y) ⇒ [x, 0]t = [y, 0]t ⇒ x = y. So T is one-one. However, [1, 1]t does not have a
pre-image, and so T is not onto.

• We have [1, 1]t, [1, 2]t ∈ R2 such that [1, 1]t ≠ [1, 2]t but T ([1, 1]t) = T ([1, 2]t). So, T is not one-one. However, for
every x ∈ R, we have [x, 1]t ∈ R2 such that T ([x, 1]t ) = x. So, T is onto.

• For [x, y]t , [u, v]t ∈ R2 , we have T ([x, y]t ) = T ([u, v]t ) ⇒ [−x, −y]t = [−u, −v]t ⇒ −x = −u, −y = −v ⇒ [x, y]t =
[u, v]t . So T is one-one. Also, for every [x, y]t ∈ R2 , we have [−x, −y]t ∈ R2 such that T ([−x, −y]t ) = [x, y]t . So,
T is onto.

Result 7.7. A linear transformation T : V → W is one-one iff ker(T ) = {0}.

Result 7.8. Let dim(V ) = dim(W ). Then a linear transformation T : V → W is one-one iff T is onto.

Example 7.11. Let T : V −→ W be a linear transformation and v1, . . . , vk ∈ V such that T (v1), . . . , T (vk) are linearly
independent. Can v1, . . . , vk be linearly dependent? Justify.

Solution. No, because if a1v1 + . . . + akvk = 0 then a1T (v1) + . . . + akT (vk) = T (a1v1 + . . . + akvk) = 0, which forces a1 = . . . = ak = 0.

Result 7.9. Let T : V → W be a one-one linear transformation. If S = {v1 , . . . , vk } is a linearly independent set in
V then T (S) = {T (v1 ), . . . , T (vk )} is a linearly independent set in W .

Remark: Find an example to see that if T is not one-one then {v1 , . . . , vk } is linearly independent does not necessarily
imply that {T (v1 ), . . . , T (vk )} is linearly independent.

Result 7.10. Let dim(V ) = dim(W ). Then a one-one linear transformation T : V → W maps a basis for V onto a
basis for W .

Isomorphism:
• A linear transformation T : V → W is called an isomorphism if it is one-one and onto.

• If T : V → W is an isomorphism, then we say that V and W are isomorphic, and we write V ≅ W.

Example 7.12. The vector spaces R3 and R2 [x] are isomorphic.

Result 7.11. Let V (F) and W (F) be two finite dimensional vector spaces. Then V ≅ W iff dim(V ) = dim(W ).

Example 7.13. The vector spaces Rn and Rn [x] are not isomorphic.

Matrix of a Linear Transformation:


Result 7.12. Let V and W be two finite-dimensional vector spaces with ordered bases B and C respectively, where
B = {v1 , v2 , . . . , vn } and C = {u1 , u2 , . . . , um }. If T : V → W is a linear transformation, then the m × n matrix A
defined by
A = [[T (v1 )]C , [T (v2 )]C , . . . , [T (vn )]C ]

satisfies
A[v]B = [T (v)]C for all v ∈ V.

• The matrix A in Result 7.12 is called the matrix of T with respect to the ordered bases B and C.

• The matrix A is also written as [T ]C←B .

• If B = C, then [T ]C←B is written as [T ]B .

Remark 7.1. Result 7.12 means:
Suppose we know [T ]C←B with respect to given ordered bases B and C. Then we know T in the following sense:

If v = Σ_{i=1}^{n} ai vi and [T ]C←B [a1, a2, . . . , an]t = [b1, b2, . . . , bm]t, then T (v) = Σ_{j=1}^{m} bj uj.

The relation A[v]B = [T (v)]C can be represented by the following diagram:

      v ∈ V      ──T──→     T (v) ∈ W
        |                      |
        ↓                      ↓
   [v]B ∈ Fn   ──TA──→   [T (v)]C = A[v]B ∈ Fm

Here TA : Fn −→ Fm is the linear transformation given by TA(x) = Ax.

Example 7.14. Let T : R3 → R2 be the linear transformation defined by

T ([x, y, z]t ) = [x − 2y, x + y − 3z]t for [x, y, z]t ∈ R3 .

Let B = {e1 , e2 , e3 } and C = {e2 , e1 } be ordered bases for R3 and R2 , respectively. Find [T ]C←B and verify Result 7.12
for v = [1, 3, −2]t .

Example 7.15. Consider D : R3[x] → R3[x] defined by D(p(x)) = p′(x). Find the matrix [D]B with respect to the
ordered basis B = {1, x, x², x³} of R3[x]. Also verify that [D]B [p(x)]B = [D(p(x))]B.
 
Solution. Since D(1) = 0, D(x) = 1, D(x²) = 2x, D(x³) = 3x², we get

[D]B = [ 0 1 0 0 ]
       [ 0 0 2 0 ]
       [ 0 0 0 3 ]
       [ 0 0 0 0 ].

Consider p(x) = a0 + a1x + a2x² + a3x³. Then D(p(x)) = a1 + 2a2x + 3a3x². We see that

[p(x)]B = [a0, a1, a2, a3]t   and   [D]B [p(x)]B = [a1, 2a2, 3a3, 0]t = [D(p(x))]B.
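The verification at the end of the example is just a matrix–vector product. A tiny NumPy sketch with one concrete polynomial of our own choosing:

```python
import numpy as np

# [D]_B for D = d/dx on R3[x], B = {1, x, x^2, x^3}, from Example 7.15.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]], dtype=float)

p = np.array([5, -1, 4, 2], dtype=float)  # coordinates of 5 - x + 4x^2 + 2x^3
print(D @ p)   # [-1. 8. 6. 0.]: coordinates of D(p) = -1 + 8x + 6x^2
```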
Result 7.13. Let U, V and W be three finite-dimensional vector spaces with ordered bases B, C and D, respectively.
Let T : U → V and S : V → W be linear transformations. Then

[S ◦ T ]D←B = [S]D←C [T ]C←B .

Result 7.14. Let T : V → W be a linear transformation between two n-dimensional vector spaces V and W with
ordered bases B and C, respectively. Then T is invertible if and only if the matrix [T ]C←B is invertible. In this case,

([T ]C←B )−1 = [T −1 ]B←C .

Example 7.16. Let T : R2 → R1 [x] be defined by T ([a, b]t ) = a + (a + b)x for [a, b]t ∈ R2 . Show that T is invertible,
and hence find T −1 .
Practice Problems Set 7

1. Examine whether the following maps T : V → W are linear transformations.

(a) V = W = R3 and T [x, y, z]t = [3x + y, z, |x|]t for all [x, y, z]t ∈ R3 .
" #
a b
(b) V = W = M2 (R) and for every A = ∈ M2 (R), define (i) T (A) = At , (ii) T (A) = A + I2 ,
c d
" # " # " # " #
a b c d a b a + b 0
(iii) T (A) = A2 , (iv) T (A) = det A, (v) T = , (vi) T = .
c d a b c d 0 c+d
(c) V = W = Mn (R) and for every A = [aij ] ∈ Mn (R), define (i) T (A) = tr(A), (ii) T (A) = rank(A) and
(iii) T (A) = a11 a22 . . . ann .
(d) V = W = R2 [x] and T (a + bx + cx2 ) = a + b(x + 1) + c(x + 1)2 for all a + bx + cx2 ∈ R2 [x].

2. Let T : R2 → R2 [x] be a linear transformation for which T ([1, 1]t ) = 1 − 2x and T ([3, −1]t ) = x + 2x2 . Find
T ([−7, 9]t ) and T ([a, b]t ) for [a, b]t ∈ R2 [x].

3. Let T : R2 [x] → R2 [x] be a linear transformation for which T (1 + x) = 1 + x2 , T (x + x2 ) = x − x2 and


T (1 + x2 ) = 1 + x + x2 . Find T (4 − x + 3x2 ) and T (a + bx + cx2 ) for a + bx + cx2 ∈ R2 [x].

4. Consider the linear transformations T : R2 → R2 and S : R2 → R2 defined by T [x, y]t = [0, x]t and S[x, y]t = [y, x]t
for all [x, y]t ∈ R2 . Compute T ◦ S and S ◦ T . What do you observe?

5. Let T : C → C be the map defined by T (z) = z̄ (the complex conjugate of z) for all z ∈ C. Show that T is a linear transformation on C(R)
but not a linear transformation on C(C).

6. Let {u1 , u2 , . . . , un } be a basis for a vector space V and T : V → V be a linear transformation. Prove that if
T (ui ) = ui for all i = 1, 2, . . . , n, then T is the identity transformation on V .

7. Let V be a vector space over R (or C) of dimension n and let {v1 , v2 , . . . , vn } be a basis for V . If W is another
vector space over R (or C) and w1 , w2 , . . . , wn ∈ W , then show that there exists a unique linear transformation
T : V → W such that T (vi ) = wi for all i = 1, 2, . . . , n.

8. Examine the linearity of the following maps. Also, find bases for their range spaces and null spaces, whenever
they are linear.

(a) T : M2(R) → M2(R) defined by T [ a b ; c d ] = [ a 0 ; 0 d ] for [ a b ; c d ] ∈ M2(R).
(b) T : R2[x] → R2 defined by T (a + bx + cx²) = [a − b, b + c]t for a + bx + cx² ∈ R2[x].
(c) T : R2 → R3 defined by T [x, y]t = [x, x + y, x − y]t for all [x, y]t ∈ R2.
(d) T : R2[x] → R3[x] defined by T (p(x)) = x·p(x) for all p(x) ∈ R2[x].
(e) T : M2(R) → R2 defined by T [ a b ; c d ] = [a − b, c − d]t for [ a b ; c d ] ∈ M2(R).

9. Examine whether the following linear transformations are one-one and onto.

(a) T : R2 → R2 defined by T [x, y]t = [2x − y, x + 2y]t for all [x, y]t ∈ R2 .
(b) T : R2 [x] → R3 defined by T (a + bx + cx2 ) = [2a − b, a + b − 3c, c − a]t for all a + bx + cx2 ∈ R2 [x].

10. Find a linear transformation T : R3 → R3 such that

(a) range(T ) = span([1, 1, 1]t ) (b) range(T ) = span([1, 2, 3]t , [1, 3, 2]t ).

11. Let S : V → W and T : U → V be two linear transformations.

(a) If S ◦ T is one-one then prove that T is also one-one.


(b) If S ◦ T is onto then prove that S is also onto.

12. Let T : R3 → R3 be a linear transformation such that T [1, 0, 0]t = [1, 0, 0]t, T [1, 1, 0]t = [1, 1, 1]t and T [1, 1, 1]t =
[1, 1, 0]t. Find T [x, y, z]t, nullity(T ) and rank(T ), where [x, y, z]t ∈ R3. Also, show that T 2 = I.

13. Let z1, z2, . . . , zk be k distinct complex numbers. Let T : Cn[z] → Ck be defined by T (f (z)) = [f (z1), f (z2), . . . , f (zk)]t
for all f (z) ∈ Cn[z]. Show that T is a linear transformation, and find the dimension of range(T ).

14. Show that each of the following linear transformations is an isomorphism.

(a) T : R3[x] → R4 defined by T (a + bx + cx² + dx³) = [a, b, c, d]t for all a + bx + cx² + dx³ ∈ R3[x].
(b) T : M2(C) → C4 defined by T [ a b ; c d ] = [a, b, c, d]t for all [ a b ; c d ] ∈ M2(C).
(c) T : Mn (R) → Mn (R) defined by T (X) = A−1 XA for all X ∈ Mn (R), where A is a given n × n invertible
matrix.
(d) T : Rn[x] → Rn[x] defined by T (p(x)) = p(x) + (d/dx) p(x) for all p(x) ∈ Rn[x].

15. Examine whether the following vector spaces V and W are isomorphic. Whenever they are isomorphic, find an
explicit isomorphism T : V → W .

(a) V = C and W = R2 .
(b) V = {A ∈ M2 (R) : tr(A) = 0} and W = R2 .
(c) V = the vector space of all 3 × 3 diagonal matrices and W = R3 .
(d) V = the vector space of all 3 × 3 symmetric matrices and W = the vector space of all 3 × 3 skew-symmetric
matrices.

Result 7.15. Let T : V → W be a linear transformation. If V is finite-dimensional then ker(T ) and range(T )
are also finite-dimensional.

16. Let A be an m × n real matrix. Using the above result, show that if m < n then the system of equations Ax = 0
has infinitely many solutions, and if m > n then there exists a non-zero vector b ∈ Rm such that the system of
equations Ax = b does not have any solution.

17. Let {u1 , u2 , u3 , u4 } be a basis for a vector space V of dimension 4. Define a linear transformation T : V → V
such that
T (u1 ) = T (u2 ) = T (u3 ) = u1 , T (u4 ) = u2 .
Describe each of the spaces ker(T ), range(T ), ker(T ) ∩ range(T ) and ker(T ) + range(T ).

18. Let V be a finite-dimensional vector space and let T : V → V be a linear transformation. If rank(T ) = rank(T 2 )
then prove that range(T ) ∩ ker(T ) = {0}.

19. Let U and W be subspaces of a finite-dimensional vector space V . Define T : U × W → V by T (u, w) = u − w


for all (u, w) ∈ U × W .

(a) Show that T is a linear transformation.


(b) Show that range(T ) = U + W .
(c) Show that ker(T ) ≅ U ∩ W.
(d) Prove Grassmann’s identity: dim (U + W ) = dim U + dim W − dim (U ∩ W ).

20. Find the matrix [T ]C←B for each of the following linear transformations T : V → W with respect to the given
ordered bases B and C for V and W , respectively.

(a) T : R3 → R2 defined by T [x, y, z]t = [x−y+z, y−z]t for all [x, y, z]t ∈ R3 , and B = {[1, 1, 1]t , [1, 1, 0]t , [1, 0, 0]t },
C = {[1, 1]t , [1, −1]t }.
(b) T : R3 → R3 defined by T [x, y, z]t = [x+y, y+z, z+x]t for all [x, y, z]t ∈ R3 , and B = C = {[1, 1, 0]t , [0, 1, 1]t , [1, 0, 1]t }.
(c) T : Rn → Rn defined by T [x1 , x2 , x3 , . . . , xn ]t = [x2 , x3 , . . . , xn , 0]t for all [x1 , x2 , x3 , . . . , xn ]t ∈ Rn , and
B = C = the standard basis for Rn .
(d) T : R3 [x] → R4 [x] defined by T (p(x)) = x.p(x) for all p(x) ∈ R3 [x], and B = {1, x, x2 , x3 },
C = {1, x, x2 , x3 , x4 }.
(e) T : C2 → C2 defined by T [z1 , z2 ]t = [z1 + z2 , iz2 ]t for all [z1 , z2 ]t ∈ C2 , and B = the standard basis for C2 ,
C = {[1, 1]t , [1, 0]t }.
→ M2 (C)# defined
(f) T : M2 (C)(" " by# T"(A) = A#+"iAt for all
#)A ∈ M2 (C), and
1 0 0 1 0 0 0 0
B=C= , , , .
0 0 0 0 1 0 0 1

21. Let A be an m × n matrix and let T : Rn → Rm be a linear transformation defined by T (x) = Ax for all x ∈ Rn .
Show that the matrix of T with respect to the standard bases for Rn and Rm is A.

22. Let T : R3 → R3 be the linear transformation whose matrix with respect to the standard basis B for R3 is given by

A = [ 1  2 −4 ]
    [ 2 −3  5 ]
    [ 1  0  1 ].

Let [x, y, z]t ∈ R3. Determine T [x, y, z]t and show that T is invertible. Also, find T −1[x, y, z]t.

64 @bikash
23. Let T : R3 → R3 be the linear transformation whose matrix with respect to the basis B = {[1, 1, 1]t , [1, 0, 1]t , [1, −1, −1]t }
for R3 is given by  
1 2 −4
 
A =  2 −3 5 .
1 0 1
Determine T [x, y, z]t , where [x, y, z]t ∈ R3 .
t t t 3
24. Consider the " bases B = {[1,
# 1, 1] , [0, 1, 1] , [0, 0, 1] } and C = {1 − t, 1 + t} for R and R1 [t], respectively. If
1 2 −1
[T ]C←B = is the matrix of a linear transformation T : R3 → R1 [t], determine T [x, y, z]t , where
−1 0 1
[x, y, z]t ∈ R3 .

25. Let {u1 , u2 , u3 , u4 } be a basis for a vector space V of dimension 4, and let T be a linear transformation on V
whose matrix representation with respect to this basis is given by
 
1 0 2 1
 −1 2 1 3 
 
A= .
 1 2 5 5 
2 −2 1 −2

(a) Describe each of the spaces ker(T ) and range(T ).


(b) Find a basis for ker(T ), extend it to a basis for V , and then find the matrix representation of T with respect
to this basis.

26. Let T and S be two linear transformations " on# R2 . Suppose the matrix representation of T with respect to
1 2
the basis {u1 = [1, 2]t , u2 = [2, 1]t } is , and the matrix representation of S with respect to the basis
2 3
" #
3 3
{v1 = [1, 1]t , v2 = [1, 2]t } is . Let u = [3, 3]t ∈ R2 .
2 4

(a) Find the matrix of T + S with respect to the basis {v1 , v2 }.


(b) Find the matrix of T ◦ S with respect to the basis {u1 , u2 }.
(c) Find the coordinate of T (u) with respect to the basis {u1 , u2 }.
(d) Find the coordinate of S(u) with respect to the basis {v1 , v2 }.
Hints to Practice Problems Set 7

1. Yes: (b)i, (b)v, (b)vi, (c)i, (d). No: (a), (b)ii, (b)iii, (b)iv, (c)ii, (c)iii.

2. T [a, b]t = 41 [(a + 3b) − (a + 7b)x + 2(a − b)x2 ].


(3a−b−c) 2
3. T (a + bx + cx2 ) = a + cx + 2 x .

4. (T ◦ S)[x, y]t = [0, y]t , (S ◦ T )[x, y]t = [x, 0]t .

5. T (i.1) 6= iT (1).
P P P
6. T (v) = T ( ai ui ) = ai T (ui ) = ai ui = v for all v ∈ V .
P P
7. For v = ai vi , define T (v) = ai wi .
(" # ) (" # )
0 x x 0
8. (a) ker(T ) = : x, y ∈ R , range(T ) = : x, y ∈ R .
y 0 0 y
(b) ker(T ) = span(1 + x − x2 ), range(T ) = R2 , (c) ker(T ) = (" {0}, range(T
# ) = {[x,
t 3
)y, z] ∈ R : 2x − y − z = 0},
a a
(d) ker(T ) = {0}, range(T ) = span(x, x2 , x3 ), (e) ker(T ) = : a, c ∈ R , range(T ) = R2 .
c c

9. (a) ker(T ) = {0}, (b) ker(T ) 6= {0}.

10. (a) Define T (e1 ) = [1, 1, 1]t , T (e2 ) = T (e3 ) = 0. (b) Define T (e1 ) = [1, 2, 3]t , T (e2 ) = [1, 3, 2]t , T (e3 ) = 0.

65 @bikash
11. (a) T (x) = 0 ⇒ (S ◦ T )(x) = 0 ⇒ x = 0. (b) For w ∈ W , S(T u) = w where u ∈ U .

12. T [x, y, z]t = [x, y, y − z]t , nullity(T ) = 0, rank(T ) = 3, T (T [x, y, z]t ) = [x, y, z]t .

13. If k ≥ n + 1 then f ∈ ker(T ) ⇒ f will have at least n + 1 roots ⇒ ker(T ) = {0}.


If k = n then ker(T ) = span(f (z)), where f (z) = (z − z1 )(z − z2 ) . . . (z − zn ).
If k < n then ker(T ) = {(z − z1 )(z − z2 ) . . . (z − zk )q(z) : q(z) is a polynomial of degree at most n − k}. Now use
rank(T ) + nullity(T ) = n + 1.

14. All are isomorphisms. One method may be by showing that ker(T ) = {0}.

15. (a) T (a + ib) = [a, b]t , (b) dim(V ) 6= dim(R2 ), (c) T (diag[x, y, z]) = [x, y, z]t , (d) dim(V ) 6= dim(W ).

16. Consider T : Rn → Rm defined by T (x) = Ax. Now m < n ⇒ T is not one-one, and m > n ⇒ T is not onto.

17. {u1 , u2 } is a basis for range(T ), {u1 −u2 , u1 −u3 } is a basis for ker(T ), {u1 , u2 , u3 } is a basis for ker(T )+range(T )
and {u1 − u2 } is a basis for ker(T ) ∩ range(T ).

18. rank(T ) = rank(T 2 ) ⇒ nullity(T ) = nullity(T 2 ). Also ker(T ) ⊆ ker(T 2 ). Hence ker(T ) = ker(T 2 ). Now
x ∈ ker(T ) ∩ range(T ) ⇒ T x = 0, x = T y for some y ⇒ T 2 y = 0 ⇒ T y = 0 ⇒ x = 0.

19. (c) Define S : ker(T ) → U ∩ W by S(w, w) = w. Then S is an isomorphism. (d) Use Rank-Nullity Theorem.
" # " #
1 1 1
2 2 2 0 i
20. (a) 1
, (e) , (c) [0, e1 , e2 , . . . , en−1 ], (d) [e2 , e3 , e4 , e5 ],
2 − 21 12 1 1−i
 
  1+i 0 0 0
1 1 0  0
   1 i 0  
(b)  0 1 1  , (f )  .
 0 i 1 0 
1 0 1
0 0 0 1+i

21. If A = [a1 , a2 , . . . , an ] then T (ei ) = ai .

22. T [x, y, z]t = A[x, y, z]t = [x + 2y − 4z, 2x − 3y + 5z, x + z]t . Also A is invertible ⇒ T is invertible, and
T −1 [x, y, z]t = A−1 [x, y, z]t .
 
23. Use A[x]B = [T (x)]B . We have T [x, y, z]t = 3x + 5y − 4z, −5x−4y+9z
2 , x + 3y − 2z .

24. Use A[x]B = [T (x)]C . We have T [x, y, z]t = (−2x + 2y) + (−4y + 2z)t.

25. (a) A basis for null(A) is {x, y}, where x = [−2, − 32 , 1, 0]t , y = [−1, −2, 0, 1]t . We have ker(T ) = span(v, w),
where v = −2u1 − 32 u2 + u3 and w = −u1 − 2u2 + u4 .
Now col(A) = span(a1 , a2 ), where a1 , a2 are the 1st and the 2nd column of A. Therefore range(T ) = span(T u1 , T u2 ),
since a1 = [T u1 ]B , a2 = [T u2 ]B .
 
0 0 1 2
 0 0 2 −2 
 
(b) C = {v, w, u1 , u2 } will be a basis for V . Also [T ]C =  .
 0 0 5 2 
0 0 9/2 1
" # " #
8 9 7 8
26. (a) , (b) , (c) [3, 5]t , (d) [9, 6]t .
4/3 3 13 14

66 @bikash
8 Eigenvalue, Eigenvector and Diagonalizability
Recall that like the space Rn , we also defined the space Cn . Indeed,

Cn = {[x1 , x2 , . . . , xn ]t : x1 , x2 , . . . , xn ∈ C}.

Definition 8.1. Let A be an n × n matrix. A complex number λ is called an eigenvalue of A if there is a vector
x ∈ Cn , x 6= 0 such that Ax = λx. Such a vector x is called an eigenvector of A corresponding to λ.
" #
1 3
Example 8.1. The numbers 4, −2 are eigenvalues of A = with corresponding eigenvectors [1, 1]t and [1, −1]t ,
3 1
respectively.

Definition 8.2. Let λ be an eigenvalue of a matrix A. Then the collection of all eigenvectors of A corresponding to λ,
together with the zero vector, is called the eigenspace of λ, and is denoted by Eλ .

Result 8.1. Let A be an n × n matrix and let λ be an eigenvalue of A. Then

1. λ is an eigenvalue of A iff det(A − λI) = 0.

2. λ is an eigenvalue of A iff A − λI is not invertible.

3. 0 is an eigenvalue of A iff A is not invertible.

4. Eλ = null(A − λI), that is, Eλ is a non-trivial subspace of Cn .

5. Let v1 , . . . , vk be eigenvectors of A corresponding to λ and v = α1 v1 + . . . + αk vk 6= 0. Then v is also an


eigenvector of A corresponding to λ.

6. Eigenvalues of a triangular matrix are its diagonal entries.


" #
Ap C
7. Eigenvalues of are the eigenvalues of Ap and Bq .
O Bq

Proof. Let λ be an eigenvalue of A with corresponding eigenvector x, so that Ax = λx and x 6= 0.

1. We have

λ is an eigenvalue of A ⇔ Ax = λx, x 6= 0
⇔ (A − λI)x = 0, x 6= 0
⇔ (A − λI)y = 0 has a non-trivial solution
⇔ A − λI is not invertible
⇔ det(A − λI) = 0.

2. Use Part 1.

3. Take λ = 0 and use Part 2.

4. Eλ = {x ∈ Cn : Ax = λx} = {x ∈ Cn : (A − λI)x = 0} = null(A − λI).

5. Given that Avi = λvi for i = 1, 2, . . . , k and v = α1 v1 + . . . + αk vk 6= 0. Now

Av = A(α1 v1 ) + A(α2 v2 ) + . . . + A(αk vk ) = α1 (λv1 ) + α2 (λv2 ) + . . . + αk (λvk ) = λv.

Hence v is also an eigenvector of A corresponding to λ.

6. Let A be a triangular matrix with diagonal entries a11 , a22 and ann , respectively. Then we have

det(A − λI) = 0 ⇔ (a11 − λ)(a22 − λ) · · · (ann − λ) = 0 ⇔ λ = a11 , a22 , . . . , ann .

67 @bikash
7. We have
" # " # !
Ap C Ap C
λ is an eigenvalue of ⇔ det − λI =0
O Bq O Bq
" #
Ap − λI C
⇔ det =0
O Bq − λI
⇔ det(Ap − λI).det(Bq − λI) = 0
⇔ λ is an eigenvalue of Ap or Bq .

Definition 8.3. Let A be an n × n matrix. Then

• pA (x) = det(A − xI) is called characteristic polynomial of A.

• pA (x) = 0 is called characteristic equation of A.

Method of Finding Eigenvalues and Bases for Corresponding Eigenspaces:


Let A be an n × n matrix.

1. Compute the characteristic polynomial det(A − λI).

2. Find the eigenvalues of A by solving the characteristic equation det(A − λI) = 0 for λ.

3. For each eigenvalue λ, find null(A − λI). This null space is the required eigenspace, i.e., Eλ = null(A − λI).

4. Find a basis for each eigenspace.

Example 8.2. Find the eigenvalues and the corresponding eigenspaces of the following matrices:
   
0 1 0 −1 0 1
   
 0 0 1  and  3 0 −3  .
2 −5 4 1 0 −1

Example 8.3. A real matrix may have "complex eigenvalues


# and complex eigenvectors. No wonder, every real matrix
0 1
is also a complex matrix. Consider A = . The characteristic polynomial of A is λ2 + 1, eigenvalues are ± ι̇.
−1 0
Corresponding to the eigenvalues ± ι̇, it has eigenvectors [1, ι̇]T and [ι̇, 1]T , respectively. Eigenspaces Eι̇ and E(−ι̇) are
one dimensional. However, most of our examples will be of real matrices with real eigenvalues.

Result 8.2 (The Fundamental Theorem of Invertible Matrices: Version III). Let A be an n × n matrix. Then
the following statements are equivalent.

1. A is invertible.

2. At is invertible.

3. Ax = b has a solution for every b in Cn .

4. Ax = b has a unique solution for every b in Cn .

5. Ax = 0 has only the trivial solution.

6. The reduced row echelon form of A is In .

7. The rows of A are linearly independent.

8. The columns of A are linearly independent.

9. rank(A) = n.

10. A is a product of elementary matrices.

11. nullity(A) = 0.

68 @bikash
12. The column vectors of A span Cn .

13. The column vectors of A form a basis for Cn .

14. The row vectors of A span Cn .

15. The row vectors of A form a basis for Cn .

16. det A 6= 0.

17. 0 is not an eigenvalue of A.

Proof. The equivalence of the statements (1) to (15) are proved in Result 6.6. The equivalence (1) ⇐⇒ (16) is proved
in Part 1 of Result 5.7. The equivalence (1) ⇐⇒ (17) is proved in Part 3 of Result 8.1.

Result 8.3. Let A be a matrix with eigenvalue λ and corresponding eigenvector x.

1. For any positive integer n, λn is an eigenvalue of An with corresponding eigenvector x.


1
2. If A is invertible, then λ is an eigenvalue of A−1 with corresponding eigenvector x.

3. If A is invertible then for any integer n, λn is an eigenvalue of An with corresponding eigenvector x.

Proof. Given that Ax = λx and x 6= 0.

2. If A is invertible, then λ 6= 0. Now


1
Ax = λx ⇒ A−1 (Ax) = A−1 (λx) ⇒ x = λ(A−1 x) ⇒ A−1 x = x.
λ
1
Thus λ is an eigenvalue of A−1 with corresponding eigenvector x.

3. If A is invertible, then λ 6= 0. For n > 0, Part 1 gives that λn is an eigenvalue of An with corresponding eigenvector
x. For n = 0, clearly λ0 = 1 is an eigenvalue of A0 = I with corresponding eigenvector x. Now let n = −m < 0,
where m > 0. We have
1 1
A−1 x = x ⇒ (A−1 )m x = m x ⇒ A−m x = λ−m x ⇒ An x = λn x.
λ λ
Thus λn is an eigenvalue of An with corresponding eigenvector x.

Result 8.4. Let v1 , v2 , . . . , vm be eigenvectors of a matrix A with corresponding eigenvalues λ1 , λ2 , . . . , λm , respectively.


Let x = c1 v1 + c2 v2 + . . . + cm vm . Then for any positive integer k,

Ak x = c1 λk1 v1 + c2 λk2 v2 + . . . + cm λkm vm .

Proof. Applying Part 1 of Result 8.3, we have Ak vi = λki vi for i = 1, 2, . . . , k. Therefore

Ak x =Ak (c1 v1 + c2 v2 + . . . + cm vm )
=c1 (Ak v1 ) + c2 (Ak v2 ) + . . . + cm (Ak vm )
=c1 λk1 v1 + c2 λk2 v2 + . . . + cm λkm vm .

Result 8.5. Let λ1 , λ2 , . . . , λm be distinct eigenvalues of a matrix A with corresponding eigenvectors v1 , v2 , . . . , vm ,


respectively. Then the set {v1 , v2 , . . . , vm } is linearly independent.

Result 8.6 (Cayley-Hamilton Theorem). Let p(λ) be the characteristic polynomial of a matrix A. Then p(A) = O, the
zero matrix.

This is a beautiful and useful theorem. We will see a simple proof of this theorem in the last section of the course.
" #
1 2
Example 8.4. Let A = .
2 1

• Verify that characteristic polynomial of A is λ2 − 2λ − 3.

• Verify that A2 − 2A − 3I = O.

• Use the fact A2 = 2A + 3I to compute A5 without computing any matrix multiplication.

69 @bikash
1
• Argue that A is invertible using its characteristic polynomial. Since 3I = A2 − 2A, we have A−1 = 3 [A − 2I].
Verify that the inverse you found here is the same obtained by the Gauss-Jordan method.

Similar Matrices: Let A and B be two n × n matrices. Then A is said to be similar to B if there is an n × n
invertible matrix T such that T −1 AT = B. Then note that B is also similar to A, since T −1 AT = B ⇒ A = T BT −1 .

• If A is similar to B, we write A ≈ B.

• If A ≈ B, we can equivalently write that A = T BT −1 or AT = T B.


" # " # " #
1 2 1 0 1 −1
Example 8.5. Let A = ,B = and T = . Then A ≈ B since AT = T B.
0 −1 −2 −1 1 1

Result 8.7. Let A, B and C be n × n matrices. Then

1. A ≈ A.

2. If A ≈ B then B ≈ A.

3. If A ≈ B and B ≈ C then A ≈ C.

Proof. [Proof of Part 3] Since A ≈ B and B ≈ C, we find invertible matrices T and S such that A = T BT −1 and
B = SCS −1 . Now T S is also invertible and A = T BT −1 = A = T (SCS −1 )T −1 = (T S)C(T S)−1 . Hence A ≈ C.

Result 8.8. Let A, B and T be three matrices such that T −1 AT = B, that is, A ≈ B. Then

1. det A = det B.

2. A is invertible iff B is invertible.

3. A and B have the same rank.

4. A and B have the same characteristic polynomial.

5. A and B have the same set of eigenvalues.

6. λ is an eigenvalue of B with eigenvector v iff λ is an eigenvalue of A with eigenvector T v.

7. The dim(Eλ ) for A is same as dim(Eλ ) for B.

8. Am ≈ B m for all integers m ≥ 0.

9. If A is invertible, then Am ≈ B m for all integers m.

Proof. Given that A ≈ B so that we find invertible matrix T such that A = T BT −1 .

1. det A = det (T BT −1 ) = det (T ).det (B).det (T −1 ) = det (T ).det (B). det1(T ) = det B.

2. A is invertible ⇔ det A 6= 0 ⇔ det B 6= 0 ⇔ B is invertible.

3. Recall that pre or post multiplication by an invertible matrix do not alter the rank. Hence A and B have the
same rank.

4. det (A − λI) = det (T BT −1 − λI) = det [T (B − λI)T −1 ] = det (B − λI).

5. Since A and B have the same characteristic polynomial, by Part 4, they must have the same set of eigenvalues.

6. We have

λ is an eigenvalue of B with corresponding eigenvector v


⇔Bv = λv
⇔(T −1 AT )v = λv
⇔A(T v) = λ(T v)
⇔λ is an eigenvalue of A with corresponding eigenvector T v.

70 @bikash
7. Let dim(Eλ ) = k for B and S = {v1 , . . . , vk } be a basis of Eλ for B. We claim that {T v1 , . . . , T vk } is a basis of
Eλ for A. We have for a1 , . . . , ak ∈ C,

a1 (T v1 ) + . . . + ak (T vk ) = 0 ⇒T (a1 v1 + . . . + ak vk ) = 0
⇒a1 v1 + . . . + ak vk = 0, since T is invertible
⇒a1 = . . . = ak = 0, since S is linearly independent
⇒{T v1 , . . . , T vk } is linearly independent.

Again if v is an eigenvector of A, then by Part 6, v = T u for some eigenvector u of B. Now u = a1 v1 + . . . + ak vk


for some a1 , . . . , ak ∈ C and hence

v = T u = T (a1 v1 + . . . + ak vk ) = a1 (T v1 ) + . . . + ak (T vk ).

Thus {T v1 , . . . , T vk } is linearly independent and spans Eλ for A. Therefore {T v1 , . . . , T vk } is a basis of Eλ for


A, and hence the dim(Eλ ) for A is same as dim(Eλ ) for B.

8. Let m ≥ 0. We use induction to show that Am = T B m T −1 . Clearly Am = T B m T −1 is true for m = 0. Now


assume that Ak = T B k T −1 for k ≥ 0. Now we have

Ak+1 = Ak .A = (T B k T −1 ).(T BT −1 ) = T B k .BT −1 = T B k+1 T −1 .

Thus whenever Ak = T B k T −1 , we find Ak+1 = T B k+1 T −1 . Hence by principle of mathematical induction,


Am = T B m T −1 for all m ≥ 0. That is, Am ≈ B m for all integers m ≥ 0.

9. Let m = −n so that n > 0. If A is invertible, then

A = T BT −1 ⇒ An = T B n T −1 ⇒ (An )−1 = (T B n T −1 )−1 ⇒ A−n = (T −1 )−1 (B n )−1 T −1 = T B −n T −1


⇒ Am = T B m T −1 .

Thus Am = T B m T −1 , and hence Am ≈ B m for all integers m.

Diagonalizable Matrix: A matrix A is said to be diagonalizable if there is a diagonal matrix D such that
A ≈ D, that is, if there is an invertible matrix T and a diagonal matrix D such that AT = T D.
" # " # " #
1 3 4 0 1 3
Example 8.6. The matrix A = is diagonalizable, since if D = and T = then
2 2 0 −1 1 −2
AT = T D.

Result 8.9. Let A be an n × n matrix. Then A is diagonalizable iff A has n linearly independent eigenvectors.

• Let A be an n×n matrix. Then there exists an invertible matrix T and a diagonal matrix D satisfying T −1 AT = D
iff the columns of T are n linearly independent eigenvectors of A and the diagonal entries of D are the eigenvalues
of A corresponding to the columns (eigenvectors of A) of T in the same order.

Example 8.7. Check for the diagonalizablity of the following matrices. If they are diagonalizable, find invertible
matrices T that diagonalizes them:
   
0 1 0 −1 0 1
   
 0 0 1  and  3 0 −3  .
2 −5 4 1 0 −1

Result 8.10. If A is an n × n matrix with n distinct eigenvalues then A is diagonalizable.

Result 8.11. Let λ1 , λ2 , . . . , λk be distinct eigenvalues of a matrix A. If Bi is a basis for the eigenspace Eλi , then
B = B1 ∪ B2 ∪ . . . ∪ Bk is a linearly independent set.

71 @bikash
Definition 8.4. Let λ be an eigenvalue of a matrix A.

• The algebraic multiplicity of λ is the multiplicity of λ as a root of the characteristic polynomial of A.

• The geometric multiplicity of λ is the dimension of Eλ .

Result 8.12. The geometric multiplicity of each eigenvalue of a matrix is less than or equal to its algebraic multiplicity.

Result 8.13 (The Diagonalization Theorem). Let A be an n×n matrix whose distinct eigenvalues are λ1 , λ2 , . . . , λk .
Then the following statements are equivalent:

1. A is diagonalizable.

2. The union B of the bases of the eigenspaces of A contains n vectors.

3. The algebraic multiplicity of each eigenvalue equals its geometric multiplicity.

4. A has n linearly independent eigenvectors.


Practice Problems Set 8

1. Let A be an n × n matrix and S be an n × n invertible matrix. Show that the eigenvalues of A and S −1 AS are
the same. Are their corresponding eigenvectors same?

2. Find the eigenvalues and the corresponding eigenvectors of the following matrices:
" # " # " # " # " # " #
1 0 1 1+i 1 2 1 i 1 0 1 2
, , , , , ,
0 0 1−i 1 1 3 2 1 5 4 3 4
         
1 1 1 0 1 0 1 2 0 2 1 −1 1 1 1
         
 −1 −3 −3 ,
  1 0 0 ,  3 2 −2  ,  1 2 −1  and  1 1 1 .
2 4 4 0 0 1 0 3 0 1 0 0 1 1 1

3. Show that the following matrices A, B and C are diagonalizable. Also, find invertible matrices S1 , S2 and S3 such
that S1−1 AS1 , S2−1 BS2 and S3−1 CS3 are all diagonal matrices.
     
1 1 0 1 1 −1 1 1 1
     
A =  3 2 −2  , B =  −1 1 1  and C =  1 1 1 .
0 3 0 −1 1 1 1 1 1

4. Prove that the following matrices A and B are similar by showing that they are similar to the same diagonal
matrix. Also, find an invertible matrix P such that P −1 AP = B.
" # " #
3 1 1 2
(a) A = , B= .
0 −1 2 1
   
2 1 0 3 2 −5
   
(b) A =  0 −2 1  , B =  1 2 −1 .
0 0 1 2 2 −4

5. Let A and B be invertible matrices of the same size. Show that the matrices AB and BA are similar.

6. Let A and B be two similar matrices and let λ be an eigenvalue of A and B. Prove that the algebraic (geometric)
multiplicity of λ in A is equal to the algebraic (geometric) multiplicity of λ in B.
" # " #
2 1 2 i
7. Show that the matrices and are not diagonalizable.
−1 0 i 0

8. Let A be a symmetric matrix. Show that the eigenvalues of A are real numbers.

9. Let A be an skew-symmetric matrix of odd order. Prove that 0 is an eigenvalue of A.

10. Let A be an n × n matrix with eigenvalues λ1 , λ2 , . . . , λn . Show that det(A) = λ1 .λ2 . . . . .λn and
tr(A) = λ1 + λ2 + . . . + λn . Further, show that A is invertible if and only if none of the eigenvalues of A
are zero.

72 @bikash
11. Let A and B be two n × n matrices. Prove that the sum of all the eigenvalues of A + B is the sum of all the
eigenvalues of A and B individually. Also, prove that the product of all the eigenvalues of AB is the product of
all the eigenvalues of A and B individually.
 
5 4 14 0
 4 13 14 0 
 then there exists a symmetric matrix B such that A = B 52 .
 
12. Prove or disprove: If A = 
 14 14 49 0 
0 0 0 −1
13. Let A and B be two n × n matrices with eigenvalues λ and µ, respectively.

(a) Give an example to show that λ + µ need not be an eigenvalue of A + B.


(b) Give an example to show that λµ need not be an eigenvalue of AB.
(c) Suppose that λ and µ correspond to the same eigenvector x. Show that λ + µ is an eigenvalue of A + B and
λµ is an eigenvalue of AB.

14. Let A be an n × n matrix and let c (6= 0) be a constant. Show that λ is an eigenvalue of A if and only if cλ is an
eigenvalue of cA.

15. Let A be an n × n matrix. Show that A and At have the same eigenvalues. Are their corresponding eigenspaces
same?

16. Let A be a n × n matrix. Show that the eigenvalues of A are either real numbers or complex conjugates occurring
in pairs. Further, show that if the order of A is odd then A has at least one real eigenvalue.

17. Let A be an n × n complex matrix. Show that


t
(a) if A is Hermitian (i.e., A∗ = A = A) then all eigenvalues of A are real numbers; and
t
(b) if A is skew-Hermitian (i.e., A∗ = A = −A) and A 6= O then all the eigenvalues of A are purely imaginary
numbers.

18. Let A be an n × n matrix. Show that

(a) if A is idempotent (i.e., A2 = A) then all the eigenvalues of A are either 0 or 1; and
(b) if A is nilpotent (i.e., Am = 0 for some m ≥ 1) then all the eigenvalues of A are 0.

19. Let A be an n × n matrix. Prove that


1
(a) if λ (6= 0) is an eigenvalue of A then λ det(A) is an eigenvalue of adj(A); and
(b) if v is an eigenvector of A then v is also an eigenvector of adj(A).

20. Let A and B be two n × n matrices and let A be invertible. Show that the matrices BA−1 and A−1 B have the
same eigenvalues.

21. Let A be an n × n matrix with eigenvalues λ1 , λ2 , . . . , λn (which are not necessarily real). Denote λk = xk + iyk
for k = 1, 2, . . . , n. Show that

(a) y1 + y2 + . . . + yn = 0;
(b) x1 y1 + x2 y2 + . . . + xn yn = 0; and
(c) tr(A2 ) = (x21 + x22 + . . . + x2n ) − (y12 + y22 + . . . + yn2 ).

22. For each of the following matrices, find the eigenvalues and the corresponding eigenspaces over C:
" # " # " # " # " #
1 0 1 1+i i 1+i cos θ − sin θ cos θ sin θ
, , , and .
0 0 1−i 1 −1 + i i sin θ cos θ sin θ − cos θ

23. Using Cayley Hamilton Theorem, find the inverse of the following matrices, whenever they exist:
   
" # " # 0 0 2 0 1 0
1 i 1 1    
, ,  0 2 0  and  1 0 0  .
i 2 4 1
2 0 3 0 0 1

73 @bikash
24. Find all real values of k for which the following matrices are diagonalizable.
   
" # " # " # 1 0 k 1 1 k
1 1 1 k k 1    
, , , 0 1 0  and  1 1 k  .
0 k 0 1 1 0
0 0 1 1 1 k

25. Prove that if A and B are similar matrices then tr(A) = tr(B).

26. For any real numbers a, b and c, show that the matrices
     
b c a c a b a b c
     
A= c a b , B =  a b c  and C =  b c a 
a b c b c a c a b

are similar to each other. Moreover, if BC = CB then show that A has two zero eigenvalues.

27. Prove that if the matrix A is similar to B, then At is similar to B t .

28. Prove that if the matrix A is diagonalizable, then At is also diagonalizable.

29. Let A be an invertible matrix. Prove that if A is diagonalizable, then A−1 is also diagonalizable.

30. Let A be a diagonalizable matrix and let Am = O for some m ≥ 1. Show that A = O.
" #
a b
31. Let A = .
c d

(a) Prove that A is diagonalizable if (a − d)2 + 4bc > 0.


(b) Find two examples to demonstrate that if (a − d)2 + 4bc = 0, then A may or may not be diagonalizable.

32. Let A be a 2 × 2 matrix with eigenvectors v1 = [1, −1]t and v2 = [1, 1]t and corresponding eigenvalues 21 and 2,
respectively. Find A10 x and Ak x for k ≥ 1, where x = [5, 1]t . What happens if k becomes large (i.e., k → ∞.)
Definition: Let p(x) = xn + an−1 xn−1 + . . . + a1 x + a0 be a polynomial. Then the companion matrix of p(x) is
the n × n matrix  
−an−1 −an−2 ... −a1 −a0

 1 0 ... 0 0 

 
C(p) =  0 1 ... 0 0 .
 
 .. .. .. .. .. 
 . . . . . 
0 0 ... 1 0

33. Show that the companion matrix C(p) of p(x) = x2 + ax + b has characteristic polynomial λ2 + aλ + b. Also,
show that if λ is an eigenvalue of this companion matrix then [λ, 1]t is an eigenvector of C(p) corresponding to
the eigenvalue λ.

34. Show that the companion matrix C(p) of p(x) = x3 +ax2 +bx+c has characteristic polynomial −(λ3 +aλ2 +bλ+c).
Further, show that if λ is an eigenvalue of this companion matrix then [λ2 , λ, 1]t is an eigenvector of C(p)
corresponding to λ.

35. Use mathematical induction to show that for n ≥ 2, the companion matrix C(p) of p(x) = xn + an−1 xn−1 + . . . +
a1 x + a0 has characteristic polynomial (−1)n p(λ). Further, show that if λ is an eigenvalue of this companion
matrix then [λn−1 , λn−2 , . . . , λ, 1]t is an eigenvector of C(p) corresponding to λ.

36. Let A be an n × n non-singular matrix and let p(λ) = λn + an−1 λn−1 + . . . + a1 λ + a0 be its characteristic
polynomial. Show that A−1 = − a10 (An−1 + an−1 An−2 + . . . + a1 I).

37. Let A and B be two 2 × 2 real matrices for which det(A) = det(B) and tr(A) = tr(B).

(a) Do A and B have the same set of eigenvalues?


(b) Give examples to show that the matrices A and B need not be similar.

74 @bikash
Hints to Practice Problems Set 8
" # " #
1 0 2 0
1. det(S −1 AS − λI) = det[S −1 (A − λI)S]. Take the counterexample and for the second part.
0 2 0 1

2. For each of the matrices, the ordered pairs given below consist of an eigenvalue and a basis for the corresponding
eigenspace:
√ √ √ √
1st matrix: (0, {[0, 1]t }), (1, {[1, 0]t }), 2nd matrix: (1 + 2, {[ 2, 1 − i]t }), (1 − 2, {[− 2, 1 − i]t }),
√ √ t √ √ t
3rd matrix: (2 + 3, {[2, 1 + 3] }), (2 − 3, {[2, 1 − 3] }), 5th matrix: (1, {[−3, 5]t }), (4, {[0, 1]t }),
√ √ √ √
4th matrix: (1 + √2i, {[1, −i 2i]t }), (1 − 2i,√{[1, i 2i]t }),
√ √
6th matrix: ( 52 + 233 , {[3 − 33, −6]t }), ( 25 − 233 , {[3 + 33, −6]t }),
7th matrix: (0, {[0, 1, −1]t }), (2, {[1, −2, 3]t }),
8th matrix: (1, {[0, 0, 1]t , [1, 1, 0]t }), (−1,√{[1, −1, 0]t }),
√ 2(2−i 2) i

2
√ √
2(2+i 2)

i 2
9th matrix: (3, {[1, 1, 1]t }), (i 2, {[ 9 , 3 , 1] t
}), (−i 2, {[ 9 , −
√ 3
, 1]t }),

3+ 5
√ √ √ t 3− 5
√ √ √
10th matrix: (1, {[1, −2, 1] }), ( 2 , {[−1 − 5, −1 − 5, 1 − 5] }), ( 2 , {[−1 + 5, −1 + 5, 1 + 5]t }),
t

11th matrix: (0, {[−1, 1, 0]t , [−1, 0, 1]t }), (3, {[1, 1, 1]t }).
 √ √ 
−8−2i 11 8−2i 11
2 9 9
 √ √ 
3. A has 3 distinct eigenvalues and S1 =  2 13 (5 − i 11) − 31 (5 + i 11) .
√ √
3 −1 − i 11 1 − i 11
 
1 1 0
 
B has 3 distinct eigenvalues and S2 =  0 1 1 .
1 1 1
 
−1 −1 1
 
C has eigenvalues 0 and 3, but E0 has dimension 2, and S3 =  1 0 1 .
0 1 1
" #
−1 0
4. (a) Both the matrices are similar to D = . If U −1 AU = D and V −1 AV = D then take P = U V −1 .
0 3
(b) Both the matrices are similar to diag[2, −2, 1].

5. A−1 (AB)A = BA.

6. B = P −1 AP ⇒ A and B have the same characteristic polynomial. Also any eigenvector of B must be of the form
P −1 v for some eigenvector v of A.

7. Find the algebraic and geometric multiplicities of the eigenvalues.

8. Au = λu ⇒ ut A = λut ⇒ ut Au = λut u.

9. At = −A ⇒ det(A) = (−1)n det(A).

10. det(A − λI) = (−1)n (λ − λ1 )(λ − λ2 ) . . . (λ − λn ). Compare the constant term and the coefficient of λn−1 on both
the sides.

11. tr(A + B) = tr(A) + tr(B) and det(AB) = det(A)det(B).

12. −1 is an eigenvalue of A. So, A = B 52 ⇒ λ52 = −1 for some eigenvalue λ of B.


" # " # " # " #
1 1 1 1 1 1 2 0
13. (a) A = , B= , (b) A = , B= , (c) Easy.
1 1 0 1 0 1 2 2

14. Easy.
" #
t 1 1
15. det(A − λI) = det(A − λI). Take the counterexample for the second part.
0 0

16. Complex roots of a polynomial equation occur in pairs.

17. Similar to Problem 8.

18. (a) Ax = λx, x 6= 0 ⇒ λ2 x = λx. (b) Ax = λx ⇒ 0 = λm x.

75 @bikash
19. (a) Av = λv and adj(A)A = det(A)I.
(b) If Av = λv and λ 6= 0, then use Part (a). Now let Av = 0 i.e., v ∈ null(A). If rank(A) ≤ n−2 then adj(A) = O.
If rank(A) = n − 1 then null(A) = {αv : α ∈ R} and also A.adj(A)v = det(A)v = 0 ⇒ adj(A)v ∈ null(A).

20. A−1 (BA−1 )A = A−1 B.


n
P
21. tr(A) and tr(A2 ) are real numbers. Also tr(A2 ) = λ2k .
k=1

22. 1st matrix: E0 = {[0, α]t : α ∈ C}, E1 = {[α, 0]t : α ∈ C},


√ √
2nd matrix: E1+√2 = {α[ 2, 1 − i]t : α ∈ C}, E1−√2 = {α[− 2, 1 − i]t : α ∈ C},
√ √
3rd matrix: Ei+i√2 = {α[i 2, i − 1]t : α ∈ C}, Ei−i√2 = {α[−i 2, i − 1]t : α ∈ C},
4th matrix: If θ 6= nπ then Eeiθ = {α[i, 1]t : α ∈ C} and Ee−iθ = {α[−i, 1]t : α ∈ C}. If θ = 2nπ then E1 = R2 .
If θ = (2n + 1)π then E−1 = R2 .
5th matrix: Consider each of the cases θ 6= nπ, nπ + π2 , θ = 2nπ, θ = (2n+1)π, θ = 2nπ + π2 and θ = (2n+1)π + π2
one by one.

23. If the given matrices are A, B, C, D (in order), then A−1 = 31 (3I −A), B −1 = 13 (B −2I), C −1 = 81 (−C 2 +5C −2I)
and D−1 = −D2 + D + I.

24. 1st matrix: k 6= 1, 2nd matrix: k = 0, 3rd matrix: k ∈ R, 4th matrix: k = 0, 5th matrix: k 6= −2.

25. A and B have the same set of eigenvalues.

26. Take P = [e3 , e1 , e2 ] and Q = [e2 , e3 , e1 ]. Then AP = P B and AQ = QC. Also BC = CB ⇒ a3 + b3 + c3 = 3abc.
Use this equation in computing det(A − λI).

27. Take transpose of both sides of P −1 AP = B.

28. Use Problem 27.

29. Take inverse of both sides of P −1 AP = B.

30. P −1 AP = D ⇒ P −1 Am P = Dm .
" # " #
1 0 3 −1
31. (a) When does the equation det(A − λI) = 0 have distinct real roots? (b) A = , B= .
0 1 1 1

32. x = 2v1 + 3v2 , Ak x = [21−k + 3.2k , 3.2k − 21−k ]t .

33. Easy.

34. Easy.

35. Expand det(C(p) − λI) through the last column and use induction.

36. a0 = det(A) 6= 0. Use Cayley-Hamilton Theorem.


" # " #
0 0 0 1
37. (a) xy = ab, x + y = a + b ⇒ {x, y} = {a, b}. (b) A= , B= .
0 0 0 0

76 @bikash
9 Inner Product Spaces
Let F denote C or R. Let u = [u1 , u2 , . . . , un ]t , v = [v1 , v2 , . . . , vn ]t ∈ Fn .

• The dot product or the standard inner product u.v of u and v is defined by

u.v = u1 v1 + u2 v2 + . . . + un vn .

Notice that " v1 #


X
u · v = [ u1 ··· un ] .
.
.
= u1 v 1 + u2 v 2 + . . . + un v n = ui vi = u∗ v.
vn

• Sometimes, the notation hu, vi is also used to denote the inner product of u and v.

• The length or norm kuk of u is defined by


√ p
kuk = u.u = |u1 |2 + |u2 |2 + . . . + |un |2 .

• Observe that kαuk = |α| kuk.

• The vectors u and v are said to be orthogonal if u.v = 0.

• A vector u is called an unit vector if kuk = 1.

Result 9.1. Let u, v, w ∈ Fn and c ∈ F. Then

1. u · v = v · u;

2. u · (v + w) = u · v + u · w;

3. (cu) · v = c(u · v);

4. u · (cv) = c(u · v);

5. u.u ≥ 0 and u.u = 0 if and only if u = 0.

Proof. Let u = [u1 , u2 , . . . , un ]t , v = [v1 , v2 , . . . , vn ]t and w = [w1 , w2 , . . . , wn ]t .


P P
1. u · v = ui v i = ui vi = v · u.
P P P
2. u · (v + w) = ui (vi + wi ) = ui vi + ui wi = u · v + u · w.
P P
3. (cu) · v = cui vi = c ui vi = c(u · v).
P P
4. u · (cv) = ui (cvi ) = c ui vi = c(u · v).
P P P
5. u.u = ui ui = |ui |2 ≥ 0 and so u.u = 0 ⇔ |ui |2 = 0 ⇔ |ui |2 = 0 for all i ⇔ ui = 0 for all i ⇔ u = 0.

Orthogonal Set: A set of vectors {v1 , v2 , . . . , vk } in Fn is said to be an orthogonal set if all pairs of distinct
vectors in the set are orthogonal, that is, if

vi .vj = 0 whenever i 6= j for i, j = 1, 2, . . . , k.

• The set {[2, 1, −1]t , [0, 1, 1]t , [1, −1, 1]t } is an orthogonal set in R3 .

• An orthogonal set can contain a zero vector.

Result 9.2. If S = {v1 , v2 , . . . , vk } is an orthogonal set of non-zero vectors, then S is linearly independent.

Result 9.3. Let A be a Hermitian (or real symmetric) matrix. Then

• all eigenvalues of A are real.

• eigenvectors corresponding to distinct eigenvalues of A are orthogonal.

77 @bikash
Proof. Suppose A is a Hermitian matrix, i.e., A∗ = A. Let λ be an eigenvalue of A. Then for an eigenvector v of A, we
have

λkvk2 = λ(v · v) = v · (λv) = v · Av = v∗ (Av) = (vT A)v =(AT v)T v


=(A∗ v)T v
=(Av)T v
=(Av)∗ v
=(λv) · v
=λ(v · v)
=λkvk2 .

Since v 6= 0, we have kvk2 6= 0. Thus, λ = λ, i.e., λ is real.


Next, suppose λ, µ are two distinct eigenvalues of A, and u and v are corresponding eigenvectors. Then

µ(u · v) = u · (µv) = u · (Av) = (u∗ A)v = (A∗ u)∗ v = (Au)∗ v = (λu) · v = λ(u · v).

Since λ is real we get (λ − µ)(u · v) = 0, i.e., u · v = 0.


Orthogonal Basis: An orthogonal basis for a subspace W is a basis for W that is an orthogonal set.
Example 9.1. Notice that u = [2, 1, −1]t , v = [0, 1, 1]t , w = [1, −1, 1]t } form an orthogonal basis for R3 . Take
x = [1, 1, 1]t ∈ R3 . Find a, b, c such that [1, 1, 1]t = a[2, 1, −1]t + b[0, 1, 1]t + c[1, −1, 1]t .
1 1
Solution. We have a = kuk
u·x
2 = 3 , b = kvk2 = 1 and c = kwk2 = 3 .
v·x w·x

     
2 0 1 1
3
Example 9.2. u = 1 , v = 1 , w = −1 form an orthogonal basis for C . Take x = 1 ∈ C3 . Find a, b, c such
−1 1 1 i
that x = au + bv + cw.
3−i 1+i
Solution. We have a = u·x
kuk2 = 6 , b= v·x
kvk2 = 2 and c = w·x
kwk2 = 3i .

Result 9.4. Let {v1 , v2 , . . . , vk } be an orthogonal basis for a subspace W and let w ∈ W . Then the unique scalars
c1 , c2 , . . . , ck such that w = c1 v1 + c2 v2 + . . . + ck vk are given by
vi .w
ci = for i = 1, 2, . . . , k.
vi .vi
Orthonormal Set: A set of vectors is said to be an orthonormal set if it is an orthogonal set of unit vectors.
Orthonormal Basis: An orthonormal basis of a subspace W is a basis of W that is an orthonormal set.
Result 9.5. Let {u1 , u2 , . . . , uk } be an orthonormal basis for a subspace W and let w ∈ W . Then

w = (u1 .w)u1 + (u2 .w)u2 + . . . + (uk .w)uk ,

and this representation is unique.

Orthogonal Complement: Let W be a subset of F3 .


• A vector v is said to be orthogonal to W if v is orthogonal to every vector in W .

• The orthogonal complement of W , denoted W ⊥ , is defined as

W ⊥ = {v ∈ Fn : v.w = 0 for all w ∈ W }.

• The vector 0 is orthogonal to W , for any subset W .

• In R3 , take W = {e1 }. Then W ⊥ = yz-plane.


 
x
• In R3 , take W = {[1, 1, 1]t }. Then W ⊥ = {[x, y, z]t : [1 1 1 ] y = 0} = {[x, y, z]t : x + y + z = 0}.
z

h ix 
1 1 1
• In R3 , take W = {[1, 1, 1]t , [1, 2, 3]t }. Then W ⊥ = {[x, y, z]t : 1 2 3
y = 0}.
z
  h ix 
x
• In C3 , take W = span{[1, 1, i]t , [1, 2i, 3]t }. Then W ⊥ = { y : 11 1
−2i
−i
3
y = 0}.
z z

78 @bikash
Result 9.6. Let W be a subspace of Fn . Then

1. W ⊥ is also a subspace of Fn .

2. W ∩ W ⊥ = {0}.

3. If {u1 , . . . , ur } and {{v1 , . . . , vs }} are linearly independent sets in W and W ⊥ , respectively, then the union
{u1 , . . . , ur , v1 , . . . , vs } is also linearly independent.

4. If B = {w1 , w2 , . . . , wk } is a basis for W , then v ∈ W ⊥ if and only if v.wi = 0 for all i = 1, 2, . . . , k.


h i
Ik
5. Let A = [w1 , w2 , . . . , wk ], where wi ’s are as in Part 4. Let T be an invertible matrix such that T A = O
. Then
∗ ⊥
the last n − k columns of T will form a basis for W .

Proof.

1. Clearly W ⊥ is non-empty since 0 ∈ W ⊥ . Let x, y ∈ W ⊥ and a, b ∈ F. Then x.w = 0, y.w = 0 for all w ∈ W .
Now (ax + by).w = a(x.w) + b(y.w) = 0, and so ax + by ∈ W ⊥ . Hence W ⊥ is a subspace of Fn .

2. If v ∈ W ∩ W ⊥ , then v · v = 0 ⇒ v = 0. Hence W ∩ W ⊥ = {0}.

3. Let α1 , . . . , αr , β1 , . . . , βs ∈ F be such that α1 u1 + . . . + αr ur + β1 v1 + . . . + βs vs = 0. Then

α1 u1 + . . . + αr ur = −(β1 v1 + . . . + βs vs ) ∈ W ∩ W ⊥ = {0}
⇒α1 u1 + . . . + αr ur = 0 = β1 v1 + . . . + βs vs
⇒α1 = 0 = . . . = αr , β1 = 0 = . . . = βs .

Hence {u1 , . . . , ur , v1 , . . . , vs } is a linearly independent set.


Pk
4. Suppose that v · wi = 0, for each i. Let w ∈ W . Then w = ai wi for some a1 , . . . , ak ∈ F.
P i=1
Now v · w = ai (v · wi ) = 0 and so v ∈ W ⊥ .
Conversely, if v ∈ W ⊥ , that is, if v · w = 0 for each w ∈ W , then in particular, v · wi = 0 for each i.
h i
I
5. Note that rank(A) = k and so nullity(A) = 0. Also T A = Ok = RREF (A). Consider the last n − k rows of T .
These rows are linearly independent, as they are part of an invertible matrix. Notice that the matrix product of
such a row with the vectors wi is 0. Therefore by Part 4, the last n − k columns of T ∗ will belong to W ⊥ . Now
by Part 3, dim(W ⊥ ) = n − k. Thus the last n − k columns (linearly independent) of T ∗ form a basis for W ⊥ .

Corollary 9.1. Let W be a subspace of Fn . Then dim W + dim W ⊥ = n.

Proof. Follows from Part 5 of Result 9.6.

Result 9.7 (Orthogonal Decomposition Theorem). Let W be a subspace of Fn and let v ∈ Fn . Then there are
unique vectors w in W and w⊥ in W ⊥ such that v = w + w⊥ . That is, W ⊕ W ⊥ = Fn .

Proof. We already know that if the dimension of W is k then the dimension of W ⊥ is n − k. Let {v1 , . . . , vk } be a basis
for W and {vk+1 , . . . , vn } be a basis for W ⊥ . Then by Part 3 of Result 9.6, the set {v1 , . . . , vn } is linearly independent,
and hence it is a basis for Fn .
Pk 0
Pk
Thus for any v ∈ Fn , we have a1 , . . . , an ∈ F such that v = i=1 ai vi = w + w , where w = i=1 ai vi and
P n
w0 = i=k+1 ai vi .
Now if v = w1 + w10 = w2 + w20 , where w1 , w2 ∈ W, w10 , w20 ∈ W ⊥ then

w1 − w2 = w20 − w10 ∈ W ∩ W ⊥ = {0} ⇒ w1 = w2 , w10 = w20 .

Thus there are unique vectors w in W and w⊥ in W ⊥ such that v = w + w⊥ . That is, W ⊕ W ⊥ = Fn .

Result 9.8. Let W be a subspace of Fn . Then (W ⊥ )⊥ = W .

Proof. Note that (W ⊥ )⊥ contains those elements of Fn which are orthogonal to W ⊥ . In particular, as each element w
of W is orthogonal to W ⊥ , we see that w ∈ (W ⊥ )⊥ . Thus W ⊆ (W ⊥ )⊥ .
As dim(W ) = k, we know that dim(W ⊥ ) = n − k. Hence dim((W ⊥ )⊥ ) = n − (n − k) = k. Now W ⊆ (W ⊥ )⊥ and
both have the same dimension, and hence they must be equal.

79 @bikash
Result 9.9. Let A be an m × n matrix. Then (col(A))⊥ = null(A∗ ), (row(A))⊥ = null(A) and row(A) = (null(A))⊥ .
( " a11 # " a1n #)
. .
Proof. Let A = [a1 a2 . . . an ], where a1 = .
.
, . . . , an = .
.
. Notice that if w ∈ col(A) then we have
wm1 amn
w = α1 a1 + . . . + αn an for some α1 , . . . , αn ∈ F. Therefore v.ai = 0 for all i = 1, 2, . . . , n implies that v.w = 0.
Conversely, it is clear that if v.w = 0 for all w ∈ col(A), then v.ai = 0 for all i = 1, 2, . . . , n. Now we have

(col(A))⊥ ={v : v · w = 0 for each w ∈ col(A)}


={v : v · ai = 0 for each i = 1, . . . , n}
={v : ai · v = 0 for each i = 1, . . . , n}
={v : ai ∗ v = 0 for each i = 1, . . . , n}
( " a11 · · · am1 # "0#)
. .
= v : .
.
v= .
.
a1n ··· amn 0

=null(A∗ ).

Now row(A)⊥ = (col(A∗ ))⊥ = null(A), as row(A) = col(At ). Again we have row(A) = ((row(A))⊥ )⊥ = (null(A))⊥ .

Corollary 9.2. Let A be an m × n matrix. Then rank(A) + nullity(A) = n.

Proof. We first prove that nullity(A) = nullity(A). It is clear that Av = 0 ⇔ Av = 0. Let {v1 , . . . vk } be a basis for
null(A). Now for a1 , . . . , ak ∈ F we have

a1 v1 + . . . + ak vk = 0 ⇒ a1 v1 + . . . + ak vk = 0 ⇒ a1 = 0, . . . , ak = 0 ⇒ a1 = 0, . . . , ak = 0.

Thus {v1 , . . . , vk } is linearly independent. Also if v ∈ null(A), then v ∈ null(A). Accordingly we find b1 , . . . , bk ∈ F
such that v = b1 v1 + . . . + bk vk . Then v = b1 v1 + . . . + bk vk . Thus {v1 , . . . , vk } spans null(A). Finally we conclude
that {v1 , . . . , vk } is a basis for null(A). Hence nullity(A) = nullity(A).
Now we have nullity(A) = nullity(A) = dim(null(A)) = dim(row(A)⊥ ) and rank(A) = dim(row(A)). Therefore
n = dim(row(A)) + dim(row(A)⊥ ) = rank(A) + nullity(A).

Example 9.3. Let S be a subspace of Fn and let {v1 , . . . , vk } form a basis for S ⊥ . Consider the k × n matrix A whose
i-th row is vi∗ . Show that S = null(A).

Solution. Note that vi ∗ w = 0 for each w ∈ S. Thus Aw = 0 for each w ∈ S. Hence S ⊆ null(A). Furthermore,
rank(A) = k = dim(S ⊥ ). Hence dim(S) = n − k = dim(null(A)), so that S = null(A).

Example 9.4. Let A and B be two m × n matrices and let the linear systems Ax = 0 and Bx = 0 have the same
solution space. Show that the matrix A is row equivalent to B.

Solution. We have null(A) = null(B). Hence (null(A))⊥ = (null(B))⊥ , that is, row(A) = row(B). Hence row(A) =
row(A). Let k = dim(row(A)). Let A0 be the matrix obtained by taking the first k rows of RREF of A and B 0 be the
matrix obtained by taking the first k rows of RREF of B. As rows of A0 are linear combinations of rows of B 0 , we have
A0 = SB 0 for some k × k matrix S. Note that k = rank(A0 ) = rank(SB 0 ) ≤ rank(S). That is, S is invertible. Hence
A0 is row equivalent to B 0 . Now A is row equivalent to A0 ,A0 is row equivalent to B 0 and B 0 is row equivalent to B
altogether give that A is row equivalent to B. [Indeed, A0 = B 0 as RREF of A is row equivalent to RREF of B and
that RREF of a matrix is unique.]

Example 9.5. Let A and B be two m × n matrices and let the consistent linear systems Ax = c and Bx = d have
the same solution set. Show that the matrix A is row equivalent to B.

Solution. If Ax = c and Bx = d have the same solution set, then Ax = 0 and Bx = 0 have the same solution set. Now
proceed as in Example 9.4.

Example 9.6. Let W be the subspace of R5 spanned by the vectors w1 = [1, −3, 5, 0, 5]t , w2 = [−1, 1, 2, −2, 3]t and
w3 = [0, −1, 4, −1, 5]t . Find a basis for W ⊥ .
  
x 1
3 ⊥
Example 9.7. Consider W = { y ∈ R : x + y + z = 0}. Find bases for W and W . For v = 2 , find (somehow)
z 3
w ∈ W and w0 ∈ W ⊥ such that v = w + w0 .

80 @bikash
     
x −1 −1
Solution. We have x + y + z = 0 ⇒ x = −y − z so that y =y 1 +z 0 . Thus {[−1, 1, 0]t , [−1, 0, 1]t } is a basis
z 0 1
for W . Now [x, y, z]t ∈ W ⊥ iff [x, y, z]t .[−1, 1, 0]t = 0 and [x, y, z]t .[−1, 0, 1]t = 0. That is, −x + y = 0 and −x + z = 0
t ⊥
 {[1,
which imply x = y = z. Thus  1,1]} is a basis for W .       
1 −1 −1 1 1 −1 1 −1 2
Solving 2 =a 1 +b 0 +c 1 for a, b, c, we have 2 = 0 + 2 , so that
1 0 ∈ W and 2 ∈ W ⊥ .
3 0 1 1 3 1 1 1 2
Orthogonal Projection: Let W be a subspace of Fn and let {w1 , w2 , . . . , wk } be an orthogonal basis for W . If
v ∈ Fn , then the orthogonal projection of v onto W is defined as
w1 w2 wk .v
projW (v) = (w1 .v) 2 + (w2 .v) 2 + . . . + (wk .v) 2.
kw1 k kw2 k kwk k

The component of v orthogonal to W is the vector

perpW (v) = v − projW (v).

• Note that if Wi is the subspace spanned by the vector wi , then projW (v) = projW1 (v)+projW2 (v)+. . .+projWk (v).

• If {w1 , w2 , . . . , wk } is an orthonormal basis for W , then

projW (v) = (w1 .v)w1 + . . . + (wk .v)wk .

• Assume that {w1 , . . . , wn } is an orthonormal basis for Fn such that {w1 , w2 , . . . , wk } is a basis for W and
{wk+1 , . . . , wn } is a basis for W ⊥ . Then for v ∈ Fn

v =(w1 .v)w1 + . . . + (wn .v)wn


=[(w1 .v)w1 + . . . + (wk .v)wk ] + [(wk+1 .v)wk+1 + . . . + (wn .v)wn ]
=w + w0 .

Notice that w = projW (v) and w0 = perpW (v).

• The orthogonal projection of v on W is the unique vector w ∈ W such that v = w + w0 , for some w0 ∈ W ⊥ .
In other words, it is the unique vector w ∈ W such that v − w ∈ W ⊥ . Again in words, it is the unique vector
w ∈ W such that (v − w) is orthogonal to w.

• For Example 9.7, we have {[−1, 2, −1]t , [1, 0, −1]t } is an orthogonal basis for W . Hence

[−1, 2, −1]t .[1, 2, 3]t [1, 0, −1]t .[1, 2, 3]t


projW ([1, 2, 3]t ) = [−1, 2, −1]t + [1, 0, −1]t = [−1, 0, 1]t .
6 2
Thus,
perpW ([1, 2, 3]t ) = [1, 2, 3]t − [−1, 0, 1]t = [2, 2, 2]t .

Result 9.10 (The Gram-Schmidt Process). Let {x1 , x2 , . . . , xk } be an ordered basis for a subspace W of Fn and
define

v1 = x1 , W1 = span(x1 );
 
v1 .x2
v2 = x2 − v1 , W2 = span(x1 , x2 );
v1 .v1
   
v1 .x3 v2 .x3
v3 = x3 − v1 − v2 , W3 = span(x1 , x2 , x3 );
v1 .v1 v2 .v2
.. .. .. ..
.. . .
.. .. .. ..
.. . .
     
v1 .xk v2 .xk vk−1 .xk
vk = xk − v1 − v2 − . . . − vk−1 , Wk = span(x1 , . . . , xk ).
v1 .v1 v2 .v2 vk−1 .vk−1

Then for each i = 1, 2, . . . , k, {v1 , . . . , vi } is an orthogonal basis for Wi . In particular, {v1 , . . . , vk } is an orthogonal
basis for W .

Example 9.8. Apply the Gram-Schmidt process to find an orthonormal basis of the subspace spanned by u =
[1, −1, 1]t , v = [0, 3, −3]t and w = [3, 2, 2]t .

81 @bikash
Solution. We have
 
1
v1 =x1 = −1 ,
1
         
v1 .x2 −6 0 2 2
v2 =x2 − v1 = x2 − v1 = 3 + −2 = 1 ,
v1 .v1 3 −3 2 −1
             
v1 .x3 v2 .x3 3 6 3 1 2 0
v3 =x3 − v1 − v2 = x3 − v1 − v2 = 2 − −1 − 1 = 2 .
v1 .v1 v2 .v2 3 6 2 1 −1 2

 √  √  
1/ 3 2/ 6 0
√ √ √
Therefore kvv11 k = −1/√3 , kvv22 k = 1/√6 and kvv33 k = 1/√2 .
1/ 3 6 1/ 2
 −1/ √   √   
1/ 3 2/ 6 0
√ √ √
Hence an orthonormal basis is −1/ 3 ,

1/ 6 ,

1/ 2
√ .
1/ 3 −1/ 6 1/ 2

• Given a set of vectors S, we can use Gram-Schmidt process to check its linear dependency.

• We can find an orthonormal basis B for span(S).

• The vectors in S corresponding to the elements of B are linearly independent.

• The angle θ between u and v, (u, v ∈ Rn ), is defined by


u.v
cos θ = , θ ∈ [0, π].
kuk kvk

Orthogonal Matrix: An n × n matrix Q whose columns form an orthonormal set (i.e., QQt = I = Qt Q) is called
an orthogonal matrix.
Practice Problems Set 9

1. Let x, y, z ∈ Fn and let a, b ∈ R. Show that (ax + by).z = a(x.z) + b(y.z) and x.(ay + bz) = a(x.y) + b(x.z).

2. For x, y ∈ Fn , prove the following:


2 2 2
(a) x.y = 0 if and only if kx + yk = kxk + kyk ;
2 2 2 2
(b) kx + yk + kx − yk = 2 kxk + 2 kyk ;
2 2
(c) kx + yk − kx − yk = 4(x.y);
(d) kxk = kyk if and only if (x + y).(x − y) = 0; and
(e) kx + yk = kxk + kyk if and only if ksx + tyk = s kxk + t kyk for all s, t ≥ 0.

3. Let {v1 , v2 , . . . , vk } be an orthogonal set of vectors in Fn . Prove that


2
Xk k
X
2
vi = kvi k .

i=1 i=1

4. Find a basis for each of the orthogonal complements S ⊥ , M ⊥ and W ⊥ , where

S = {[x, y, z]t ∈ R3 : 2x − y + 3z = 0}, M = {[x, y, z]t ∈ R3 : x = 2t = y, z = −t, t ∈ R}

and W = {[x, y, z]t ∈ R3 : x = t, y = −t, z = 3t, t ∈ R}.

Also, write the orthogonal complements S ⊥ , M ⊥ and W ⊥ .

5. Let {v1 , v2 , . . . , vn } be an orthonormal basis for Fn . Prove that the matrix A = [v1 , v2 , . . . , vn ] is invertible, and
compute its inverse.

6. Let M and N be two subspaces of Fn . Show that (M + N )⊥ = M ⊥ ∩ N ⊥ and (M ∩ N )⊥ = M ⊥ + N ⊥ .

7. Let A be a symmetric matrix. Show that the eigenvectors corresponding to distinct eigenvalues of A are orthogonal
to each other.

8. If Q is an orthogonal matrix, prove that any matrix obtained by rearranging the rows of Q is also orthogonal.

9. Prove that the columns of an m × n matrix Q form an orthonormal set if and only if Qt Q = In .

82 @bikash
10. Show that a square matrix Q is orthogonal if and only if Q−1 = Qt .

11. Prove that if an upper triangular matrix is orthogonal, then it must be a diagonal matrix.

12. Let Q be an n × n matrix. Prove that the following statements are equivalent:

(a) Q is orthogonal.
(b) (Qx).(Qy) = x.y for every x, y ∈ Rn .
(c) kQxk = kxk for every x ∈ Rn .

13. Prove that if n > m, then there is no m × n matrix A such that kAxk = kxk for all x ∈ Rn .

14. Apply Gram-Schmidt process to the following sets to obtain an orthonormal set in the spaces spanned by the
corresponding vectors:

(a) {[1, 1, 0]t , [3, 4, 2]t } in R3 ;


(b) {[−1, 0, 1]t , [1, −1, 0]t , [0, 0, 1]t } in R3 ;
(c) {[2, −1, 1, 2]t , [3, −1, 0, 4]t } in R4 ; and
(d) {[1, 1, 1, 1]t , [0, 2, 0, 2]t , [−1, 1, 3, −1]t } in R4 .

15. Use Gram-Schmidt process to find an orthogonal basis for the column space of each of the following matrices:
 
  1 1 1
0 1 1  
   1 −1 2 
 1 0 1  and  .
 −1 1 0 
1 1 0
1 5 1

16. Find an orthogonal basis for R3 containing the vector [3, 1, 5]t .

17. Find an orthogonal basis for R4 containing the vectors [2, 1, 0, −1]t and [1, 0, 3, 2]t .

18. Find an orthogonal basis for the subspace spanned by [1, 1, 0, 1]t , [−1, 1, 1, −1]t , [0, 2, 1, 0]t and [1, 0, 0, 0]t .

19. Find the orthogonal projection of v onto the subspace W spanned by the vectors u1 and u2 , where

v = [1, 2, 3]t , u1 = [2, −2, 1]t and u2 = [−1, 1, 4]t .

20. Find the orthogonal projection of v onto the subspace W spanned by the vectors u1 , u2 and u3 , where

v = [3, −2, 4, −3]t , u1 = [1, 1, 0, 0]t , u2 = [1, −1, −1, 1]t and u3 = [0, 0, 1, 1]t .

21. Let M be a subspace of Rm and let dim(M ) = k. How many linearly independent vectors can be orthogonal to
M ? Justify your answer.

22. Let A be an n × n orthogonal matrix. Show that the rows of A form an orthonormal basis for Rn . Similarly, the
columns of A also form an orthonormal basis for Rn .

23. Let u = [1, 0, 0, 0]t and v = [0, 21 , 12 , √12 ]t . Find vectors w and x in R4 such that {u, v, w, x} form an orthonormal
basis for R4 .

24. Let {v1 , v2 , . . . , vn } be an orthogonal basis for Fn and let W = span(v1 , v2 , . . . , vk ) for some 1 ≤ k < n. Is it
necessarily true that W ⊥ = span(vk+1 , . . . , vn )? Either prove that it is true or find a counterexample.

25. Let W be a subspace of Fn and let x ∈ Fn . Prove that

(a) x ∈ W if and only if projW (x) = x;


(b) x is orthogonal to W if and only if projW (x) = 0; and
(c) projW (projW (x)) = projW (x).

83 @bikash
Hints to Practice Problems Set 9

1. Easy.
2
2. For the first four parts, applly distributive and commutative laws in kx + yk = (x + y).(x + y).
(e) ksx + tyk ≤ s kxk + t kyk. Also for t ≥ s, ksx + tyk = kt(x + y) − (t − s)xk ≥ t kx + yk − (t − s) kxk.
Similarly, for t < s, ksx + tyk ≥ s kxk + t kyk.
n
P 2 Pn Pn
3. Use vi .vj = 0 in the expansion of the RHS of vi

= ( vi ).( vi ).
i=1 i=1 i=1

4. S ⊥ = {[u, v, w]t ∈ R3 : u + 2v = 0, 3u − 2w = 0} and a basis for S ⊥ is {[2, −1, 3]t }.


M ⊥ = {[u, v, w]t ∈ R3 : 2u + 2v − w = 0} and a basis for M ⊥ is {[−1, 1, 0]t , [1, 0, 2]t }.
W ⊥ = {[u, v, w]t ∈ R3 : u − v + 3w = 0} and a basis for W ⊥ is {[1, 1, 0]t , [3, 0, −1]t }.

5. {v1 , . . . , vn } is linearly independent, and A−1 = At .

6. Show that (M + N )⊥ ⊆ M ⊥ ∩ N ⊥ and M ⊥ ∩ N ⊥ ⊆ (M + N )⊥ . For the 2nd part, replace M by M ⊥ and N by


N ⊥ in the 1st part.

7. Ax = λx, Ay = µy ⇒ λxt y = xt Ay = µxt y.

8. Rearranging the columns of Qt will not change the orthogonality of its columns.

9. See Theorem 5.4 of the text book.

10. Use Problem 9.

11. If A = [aij ] is orthogonal and upper triangular, then we have a211 = a211 + a212 + . . . + a21n .

12. See Theorem 5.6 of the text book.

13. Use Problem 12.



14. (a) {[ √12 , √12 , 0]t , [− 3√
1 1
, √
2 3 2
, 2 3 2 ]t }, (b) Find and check yourself.

(c) {[ √210 , − √110 , √110 , √210 ]t , [0, √114 , − √314 , √27 ]t },
(d) {[ √14 , √14 , √14 , √14 ]t , [− √14 , √14 , − √14 , √14 ]t , [− √210 , √110 , √210 , − √110 ]t }.

15. {[0, 1, 1]t , [1, − 12 , 21 ]t , [ 23 , 32 , − 23 ]t } and {[1, 1, −1, 1]t , [0, −2, 2, 4]t , [0, 1, 1, 0]t }.

16. Apply Gram-Schmidt Process to {[3, 1, 5]t } ∪ B, where {[3, 1, 5]t } ∪ B is a basis for R3 .

17. Similar to Problem 16.

18. Apply Gram-Schmidt Process to a basis for the subspace.

19. projW (v) = [− 12 , 12 , 3]t .

20. projW (v) = [0, 1, 1, 0]t .

21. m − k.

22. AAt = I = At A.

23. Similar to Problem 17.

24. The given statement is true.

25. (a) Direct use of the definition of projW (v). (c) Use part (a).

84 @bikash
10 Spectral Theorem
In this section, we shall study those matrices which are diagonalizable in a very nice manner, in the sense that P −1 AP
is diagonal, where P is unitary or orthogonal in some cases. Note that if P is unitary, then P −1 = P ∗ , and if P is
orthogonal, then P −1 = P t .
" #
t 1 2
Example 10.1. Examine if there is an orthogonal matrix P such that P AP is diagonal, where A = .
2 −2

Definition 10.1. Recall the definitions of transpose and conjugate transpose of a matrix, symmetric and Hermitian
matrices.

• A matrix A is said to be normal if AA∗ = A∗ A.

• A real matrix A is said to be orthogonal if AAt = I = At A.

• A matrix A is said to be unitary if AA∗ = I = A∗ A.

• Note that symmetric and Hermitian matrices are normal.

• A real unitary matrix is orthogonal.

• A real Hermitian matrix is symmetric.

Definition 10.2.

• A real matrix A is said to be orthogonally diagonalizable if there is an orthogonal matrix Q such that Qt AQ is a
diagonal matrix.

• A matrix A is said to be unitarily diagonalizable if there is an unitary matrix U such that U ∗ AU is a diagonal
matrix.

Result 10.1 (Schur Unitary Triangularization Theorem). For a square matrix A, there is an unitary matrix U such
that U ∗ AU is upper triangular.

Proof. Let A be an n × n matrix. We use induction on n. The case n = 1 is trivial. Let n > 1, and assume that the
result is true for every (n − 1) × (n − 1) matrices.
" w1 such#that kw1 k = 1. Consider and orthonormal basis {w1 , . . . , wn }
Let λ1 be an eigenvalue of A, with eigenvector
λ1 ∗
of Cn , and W = [w1 · · · wn ]. Then W ∗ AW = , where A0 is an (n − 1) × (n − 1) matrix.
0 A0
" #
0∗ 0 0 0 0 λ1 0
By induction hypothesis, U A U = T is upper-triangular for some unitary matrix U . Take U = W
0 U0
so that U is unitary. We have
" # " # " #" #" #
∗ 1 0 ∗ 1 0 1 0 λ1 ∗ 1 0
U AU = W AW =
0 U 0∗ 0 U0 0 U 0∗ 0 A0 0 U0
" #
λ1 ∗
=
0 U 0∗ A0 U 0
" #
λ1 ∗
= , an upper-triangular matrix.
0 T0

Hence, we conclude by principle of mathematical induction that the result is true for all n.

Corollary 10.1. If a square matrix A and its eigenvalues are real, then there is a (real) orthogonal matrix Q such that
Qt AQ is upper triangular.

Proof. If a square matrix A and its eigenvalues are real, then Ax = λx can be solved in Rn . Thus if the proof of Schur
Unitary Triangularization Theorem is applied for A, every numbers, vectors and matrices can be taken to be real.

Remark 10.1. The diagonal entries in U ∗ AU or Qt AQ of the previous results can be obtained (taken) in any prescribed
order, accordingly the U or Q will change. They are the eigenvalues of A.

Result 10.2. • If A is orthogonally diagonalizable then A is symmetric.

85 @bikash
• If A is unitarily diagonalizable then A is normal.

Proof.

• Let P t AP = D be diagonal, where P t P = I = P P t . We have A = P DP t and so At = (P DP t )t = P DP t = A.


Hence A is symmetric.

• Let P ∗ AP = D be diagonal, where P ∗ P = I = P P ∗ . We have A = P DP ∗ so that A∗ = P D∗ P ∗ . Now

A∗ A = (P D∗ P ∗ )(P DP ∗ ) = P D∗ DP ∗ = P DD∗ P ∗ = (P DP ∗ )(P D∗ P ∗ ) = AA∗ .

Hence A is normal.

The following is Result 9.3 stated again.

Result 10.3. If A is a Hermitian matrix, then its eigenvalues are real.

The second part of the following result is a part of Result 9.3.

Result 10.4. • If A is a normal matrix, then any two eigenvectors corresponding to distinct eigenvalues of A are
orthogonal.

• If A is a real symmetric matrix, then any two eigenvectors corresponding to distinct eigenvalues of A are orthog-
onal.

Proof. Let A∗ A = AA∗ , Au = λu and Av = µv, where u 6= 0, v 6= 0 and λ 6= µ. To show that v∗ u = 0.


We notice that
2 2
kAxk = (Ax)∗ (Ax) = x∗ (A∗ A)x = x∗ (AA∗ )x = (A∗ x)∗ (A∗ x) = kA∗ xk .

Therefore

Av = µv ⇔ k(A − µI)vk = 0 ⇔ k(A − µI)∗ vk = 0 ⇔ k(A∗ − µI)vk = 0 ⇔ A∗ v = µv ⇔ v∗ A = µv∗ .

Therefore

λ(v∗ u) = v∗ (λu) = v∗ (Au) = (v∗ A)u = µv∗ u ⇒ (λ − µ)v∗ u = 0 ⇒ v∗ u = 0, as λ 6= µ.

Result 10.5 (Spectral Theorem for Normal Matrices). A matrix A is normal iff there is an unitary matrix U such
that U ∗ AU is a diagonal matrix.

Proof. Let A be normal. By Schur’s Triangulation Theorem, there exists an unitary matrix U such that U ∗ AU = T is
upper-triangular, where T = [tij ]. From AA∗ = A∗ A, we get T T ∗ = T ∗ T . Now
n
X
(T T ∗ )11 = (T ∗ T )11 ⇒ |t1i |2 = |t11 |2 ⇒ t1i = 0 for i = 2, 3, . . . , n.
i=1

Now from (T T ∗ )22 = (T ∗ T )22 , we get t2i = 0 for i = 3, 4, . . . , n. Repeating this process, we find that T is diagonal.
The other part of the proof is easy, and was done in Result 10.2.

Corollary 10.2. A matrix A is Hermitian iff there is an unitary matrix U such that U ∗ AU is a real diagonal matrix.

Proof. Let A = A∗ so that A is normal as well. By Spectral Theorem, there is an unitary matrix U such that U ∗ AU = D
is a diagonal matrix. Now A = A∗ gives that

D∗ = (U ∗ AU )∗ = U ∗ AU = D ⇒ D is a real diagonal matrix.

Thus U ∗ AU is a real diagonal matrix. The other part of the proof is easy.

Corollary 10.3 (Spectral Theorem for Real Symmetric Matrices). A real matrix A is symmetric iff there is an
orthogonal matrix Q such that Qt AQ is a diagonal matrix.

86 @bikash
Proof. Let A be a real symmetric matrix. Then all eigenvalues of A are real. Therefore by Corollary 10.1, there is a
(real) orthogonal matrix Q such that Qt AQ = D is upper triangular. Now A = At and Qt AQ = D give that D = Dt ,
and hence D is a diagonal matrix.
The other part of the proof is easy, and was done in Result 10.2.

Result 10.6 (Cayley-Hamilton Theorem). Every matrix satisfies its characteristic equation.
n
Q
Proof. Let λ1 , . . . , λn be the eigenvalues of a square matrix A, so that its characteristic polynomial is p(x) = (x − λi ).
i=1
By Schur’s Triangulation Theorem, there exists an unitary matrix U such that U ∗ AU = T is upper-triangular, where
T = [tij ] with tii = λi for each i. We have
n
Y n
Y
p(A) = (A − λi I) = (U T U ∗ − λi U IU ∗ ) = U [(T − λ1 I) . . . (T − λn I)] U ∗ = U OU ∗ = O.
i=1 i=1

Note that T − λ1 I has the first column zero, (T − λ1 I)(T − λ2 I) has the first two columns zero, and so on.
Practice Problems Set 10

1. Orthogonally diagonalize the following matrices (here [r1; r2; . . . ] lists the rows of a matrix):

(a) [4 1; 1 4], (b) [−1 3; 3 −1], (c) [1 √2; √2 0], (d) [9 −2; −2 6], (e) [5 0 0; 0 1 3; 0 3 1], (f) [2 3 0; 3 2 4; 0 4 2], (g) [1 0 −1; 0 1 0; −1 0 1], (h) [1 2 2; 2 1 2; 2 2 1], (i) [1 1 0 0; 1 1 0 0; 0 0 1 1; 0 0 1 1], (j) [2 0 0 1; 0 1 0 0; 0 0 1 0; 1 0 0 2].

2. If b ≠ 0, then orthogonally diagonalize the real matrices A = [a b; b a] and B = [a 0 b; 0 a 0; b 0 a].

3. Let A and B be orthogonally diagonalizable real matrices of the same size and let c ∈ R, k ∈ N. Show that the matrices A + B, cA, A² and Aᵏ are also orthogonally diagonalizable.

4. If the matrix A is invertible and orthogonally diagonalizable, then show that A−1 is also orthogonally diagonalizable.

5. If A and B are orthogonally diagonalizable and AB = BA, then show that AB is also orthogonally diagonalizable.

6. If A is a symmetric matrix, then show that every eigenvalue of A is non-negative iff A = B² for some symmetric matrix B.
7. Let {u1, . . . , un} be an orthonormal set in Rn and α1, . . . , αn ∈ R. Show that the matrix α1u1ut1 + . . . + αnunutn is symmetric.

8. Let {u1, . . . , un} be an orthonormal set in Cn and α1, . . . , αn ∈ C. Show that the matrix α1u1u∗1 + . . . + αnunu∗n is normal.

9. Let A be a real symmetric matrix. Show that there is an orthonormal set {u1, . . . , un} in Rn and α1, . . . , αn ∈ R such that A = α1u1ut1 + . . . + αnunutn.

10. Let A be a normal matrix. Show that there is an orthonormal set {u1, . . . , un} in Cn and α1, . . . , αn ∈ C such that A = α1u1u∗1 + . . . + αnunu∗n.

11. Let A be a symmetric matrix. Show that there are distinct real numbers α1 , . . . , αk and real matrices E1 , . . . , Ek
such that A = α1E1 + . . . + αkEk, where E1 + . . . + Ek = I, EiEj = O for i ≠ j and Ei² = Ei for i = 1, . . . , k.

12. Find the spectral decomposition of each of the matrices given in Exercise 1.

13. Find symmetric matrices of appropriate sizes with the given eigenvalues and the corresponding eigenvectors for
the following problems:

(a) λ1 = −1, λ2 = 2 and v1 = [1; 1], v2 = [1; −1];

(b) λ1 = 3, λ2 = −2 and v1 = [3; 4], v2 = [−4; 3];

(c) λ1 = −3, λ2 = −3 and v1 = [1; 1], v2 = [1; −1];

(d) λ1 = 1, λ2 = −4, λ3 = −4 and v1 = [4; 5; −1], v2 = [−1; 1; 1], v3 = [2; −1; 3].

14. Let A be a real nilpotent matrix. Prove that there is an orthogonal matrix Q such that Qt AQ is upper-triangular
with zeros on its diagonal.

15. Let {q1 , . . . , qk } be an orthonormal set of vectors in Rn and W = span(q1 , . . . , qk ).

(a) Show that the matrix of the orthogonal projection onto W is given by

P = q1 qt1 + . . . + qk qtk .

(b) Show that the matrix in Part (a) above is symmetric and satisfies P 2 = P .
(c) Show that P = QQt , and deduce that rank(P ) = k, where Q = [q1 · · · qk ].
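The claims in Problem 15 are easy to test numerically. A sketch (Python/NumPy; here Q is generated with orthonormal columns via a QR factorization, and all names are ours):

    import numpy as np

    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal columns, k = 2
    P = Q @ Q.T                                        # P = q1 q1^t + q2 q2^t

    print(np.allclose(P, P.T))                         # True: P is symmetric
    print(np.allclose(P @ P, P))                       # True: P^2 = P
    print(np.linalg.matrix_rank(P))                    # 2, i.e. rank(P) = k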
Hints to Practice Problems Set 10

1. (a) Q = [1/√2 1/√2; 1/√2 −1/√2], D = [5 0; 0 3].

(b) The eigenvalues of the matrix are 2 and −4 with corresponding eigenvectors [1; 1] and [1; −1], respectively. Normalize the eigenvectors to get Q = [1/√2 1/√2; 1/√2 −1/√2], D = [2 0; 0 −4].

(c) Q = [2/√6 1/√3; 1/√3 −2/√6], D = [2 0; 0 −1].

(d) Q = [2/√5 1/√5; −1/√5 2/√5], D = [10 0; 0 5].

(e) Q = [1 0 0; 0 1/√2 −1/√2; 0 1/√2 1/√2], D = [5 0 0; 0 4 0; 0 0 −2].

(f) Q = (1/5)[3/√2 3/√2 4; 5/√2 −5/√2 0; 4/√2 4/√2 −3], D = [7 0 0; 0 −3 0; 0 0 2].

(g) Q = [−1/√2 0 1/√2; 0 1 0; 1/√2 0 1/√2], D = [2 0 0; 0 1 0; 0 0 0].

(h) Q = [1/√3 −1/√2 −1/√6; 1/√3 0 2/√6; 1/√3 1/√2 −1/√6], D = [5 0 0; 0 −1 0; 0 0 −1]. (The two eigenvectors for λ = −1 must be orthonormalized within the eigenspace, e.g. by Gram–Schmidt, so that Q is orthogonal.)

(i) Q = [1/√2 0 1/√2 0; 1/√2 0 −1/√2 0; 0 1/√2 0 1/√2; 0 1/√2 0 −1/√2], D = [2 0 0 0; 0 2 0 0; 0 0 0 0; 0 0 0 0].

(j) Q = [1/√2 1/√2 0 0; 0 0 1 0; 0 0 0 1; 1/√2 −1/√2 0 0], D = [3 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1].

2. For A: Q = [1/√2 1/√2; 1/√2 −1/√2], D = [a+b 0; 0 a−b].

For B: Q = [0 1/√2 1/√2; 1 0 0; 0 1/√2 −1/√2], D = [a 0 0; 0 a+b 0; 0 0 a−b].
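Hint 2 can be verified symbolically (a sketch, assuming SymPy is available; all names are our own choices):

    import sympy as sp

    a, b = sp.symbols('a b', real=True)
    A = sp.Matrix([[a, b], [b, a]])
    Q = sp.Matrix([[1, 1], [1, -1]]) / sp.sqrt(2)   # columns: eigenvectors for a+b, a-b
    D = sp.diag(a + b, a - b)

    print(sp.simplify(Q.T * Q - sp.eye(2)))   # zero matrix: Q is orthogonal
    print(sp.simplify(Q * D * Q.T - A))       # zero matrix: A = Q D Q^t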

3. Since A and B are orthogonally diagonalizable real matrices, they are symmetric, and therefore A + B, cA, A² and Aᵏ are also symmetric. OR, if Q is orthogonal and QtAQ = D is diagonal, then Qt(cA)Q = cD, QtA²Q = D² and QtAᵏQ = Dᵏ.

4. If A is symmetric, then A−1 is symmetric. OR, if Q is orthogonal and Qt AQ = D, then Qt A−1 Q = D−1 .

5. If A, B are symmetric and AB = BA, then AB is symmetric.

6. If QtAQ = D and every eigenvalue of A is non-negative, take B = QD1Qt, where each diagonal entry of D1 is the non-negative square root of the corresponding entry of D; then B is symmetric and B² = QD1²Qt = QDQt = A. Conversely, if A = B² for a symmetric B and QtBQ = D1 is diagonal, then QtAQ = D1², so every eigenvalue of A is non-negative.
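A sketch of this square-root construction (the example matrix, with eigenvalues 1 and 3, is our own):

    import numpy as np

    A = np.array([[2., 1.], [1., 2.]])        # symmetric, eigenvalues 1 and 3 (both >= 0)
    w, Q = np.linalg.eigh(A)
    B = Q @ np.diag(np.sqrt(w)) @ Q.T         # B = Q D1 Q^t with D1 = sqrt(D)

    print(np.allclose(B, B.T))                # True: B is symmetric
    print(np.allclose(B @ B, A))              # True: B^2 = A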

7. (αiuiuti)t = αi(uti)tuti = αiuiuti, and a sum of symmetric matrices is symmetric.

8. (αiuiu∗i)∗(αiuiu∗i) = |αi|²uiu∗i = (αiuiu∗i)(αiuiu∗i)∗. For i ≠ j, (αiuiu∗i)∗(αjuju∗j) = O = (αiuiu∗i)(αjuju∗j)∗. Expanding A∗A and AA∗ term by term now gives A∗A = AA∗.

9. For QtAQ = D, let u1, . . . , un be the columns of Q and α1, . . . , αn be the diagonal entries of D. Then A = QDQt = α1u1ut1 + . . . + αnunutn.

10. For Q∗AQ = D, let u1, . . . , un be the columns of Q and α1, . . . , αn be the diagonal entries of D. Then A = QDQ∗ = α1u1u∗1 + . . . + αnunu∗n.

11. Use Problem 9 and proceed as in the Spectral Decomposition Theorem.

12. (a) 5 [1/2 1/2; 1/2 1/2] + 3 [1/2 −1/2; −1/2 1/2].

(b) 2 [1/2 1/2; 1/2 1/2] − 4 [1/2 −1/2; −1/2 1/2].

(c) 2 [2/3 2/√18; 2/√18 1/3] − [1/3 −2/√18; −2/√18 2/3].

(d) 10 [4/5 −2/5; −2/5 1/5] + 5 [1/5 2/5; 2/5 4/5].

(e) 5 [1 0 0; 0 0 0; 0 0 0] + 4 [0 0 0; 0 1/2 1/2; 0 1/2 1/2] − 2 [0 0 0; 0 1/2 −1/2; 0 −1/2 1/2].

(f) 7 [9/50 15/50 12/50; 15/50 25/50 20/50; 12/50 20/50 16/50] − 3 [9/50 −15/50 12/50; −15/50 25/50 −20/50; 12/50 −20/50 16/50] + 2 [16/25 0 −12/25; 0 0 0; −12/25 0 9/25].

(g) 2 [1/2 0 −1/2; 0 0 0; −1/2 0 1/2] + 1 [0 0 0; 0 1 0; 0 0 0] + 0 [1/2 0 1/2; 0 0 0; 1/2 0 1/2] = 2 [1/2 0 −1/2; 0 0 0; −1/2 0 1/2] + [0 0 0; 0 1 0; 0 0 0].

(h) 5 [1/3 1/3 1/3; 1/3 1/3 1/3; 1/3 1/3 1/3] − 1 [1/2 0 −1/2; 0 0 0; −1/2 0 1/2] − 1 [1/6 −1/3 1/6; −1/3 2/3 −1/3; 1/6 −1/3 1/6], using the orthonormal eigenvectors from Hint 1(h).

(i) 2 [1/2 1/2 0 0; 1/2 1/2 0 0; 0 0 0 0; 0 0 0 0] + 2 [0 0 0 0; 0 0 0 0; 0 0 1/2 1/2; 0 0 1/2 1/2].

(j) 3 [1/2 0 0 1/2; 0 0 0 0; 0 0 0 0; 1/2 0 0 1/2] + 1 [1/2 0 0 −1/2; 0 0 0 0; 0 0 0 0; −1/2 0 0 1/2] + 1 [0 0 0 0; 0 1 0 0; 0 0 0 0; 0 0 0 0] + 1 [0 0 0 0; 0 0 0 0; 0 0 1 0; 0 0 0 0].

13. (a) A = [1/2 −3/2; −3/2 1/2].

(b) Normalize the given vectors using the Gram–Schmidt process to obtain u1 = [3/5; 4/5], u2 = [−4/5; 3/5]. Now A = 3u1ut1 − 2u2ut2 = [−1/5 12/5; 12/5 6/5].

(c) Normalize the given vectors using the Gram–Schmidt process to obtain u1 = [1/√2; 1/√2], u2 = [1/√2; −1/√2]. Now A = −3u1ut1 − 3u2ut2 = [−3 0; 0 −3].
   
(d) Normalize the given vectors using the Gram–Schmidt process to obtain u1 = (1/√42)[4; 5; −1], u2 = (1/√3)[−1; 1; 1] and u3 = (1/√14)[2; −1; 3]. Now A = 1·u1ut1 − 4·u2ut2 − 4·u3ut3 = (1/42)[−88 100 −20; 100 −43 −25; −20 −25 −163].
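The recipe behind these answers is simply A = λ1u1ut1 + λ2u2ut2 + . . . after normalizing the prescribed (mutually orthogonal) eigenvectors. A sketch for part (b) (Python/NumPy; names are ours):

    import numpy as np

    lams = np.array([3., -2.])
    V = np.array([[3., -4.],
                  [4.,  3.]])                  # columns are the given eigenvectors
    U = V / np.linalg.norm(V, axis=0)          # normalize each column

    A = sum(l * np.outer(u, u) for l, u in zip(lams, U.T))
    print(A)                                   # approximately [[-0.2, 2.4], [2.4, 1.2]]
    print(np.allclose(A @ V[:, 0], 3 * V[:, 0]))   # True: (3, v1) is an eigenpair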

14. 0 is the only eigenvalue of a nilpotent matrix; since all eigenvalues are real, Corollary 10.1 gives an orthogonal Q with QtAQ upper-triangular, and the diagonal entries of QtAQ are the eigenvalues, i.e. zeros.

15. projW (v) = (qt1 v)q1 + . . . + (qtk v)qk = (q1 qt1 + . . . + qk qtk )v.

(a) T (v) = Av, where A = q1 qt1 + . . . + qk qtk .


(b) (qiqti)t = qiqti, (qiqti)(qiqti) = qiqti and (qiqti)(qjqtj) = qi(qtiqj)qtj = O for i ≠ j.
 
(c) Clearly, QQt = [q1 · · · qk][qt1; . . . ; qtk] = q1qt1 + . . . + qkqtk = P, so rank(P) = rank(QQt) ≤ rank(Q) = k. Further, since qtjqi = 1 for j = i and qtjqi = 0 for j ≠ i,

Pqi = Q(Qtqi) = Qei = qi ⇒ qi ∈ range(P),

so rank(P) ≥ k, and hence rank(P) = k.

End of Note

