
MA 106-2023-2 and MA110-2023-2 (1st half): Linear Algebra

H. Ananthnarayan*
Department of Mathematics, I.I.T. Bombay

February 20, 2024

This set of slides is the material presented in my classes (Divisions 1 & 4) of MA106,
and the first half of MA110 in Spring 2024 at IIT Bombay. The primary content was
developed by me and my co-instructor, Prof. Rekha Santhanam using the reference:
Linear Algebra and its Applications by G. Strang, 4th Ed., Thomson.
The topics covered are:

1. LINEAR EQUATIONS & MATRICES

(a) Linear Equations & Pivots


(b) Matrices
(c) Gaussian Elimination
(d) Null Space & Column Space: Introduction

2. VECTOR SPACES

(a) Vector Spaces & Subspaces


(b) Linear Span & Independence
(c) Basis & Dimension
(d) Null Space, Column Space & Row Space
(e) Linear Transformations

3. EIGENVALUES & EIGENVECTORS

(a) Eigenvalues & Eigenvectors


(b) Diagonalizability

4. ORTHOGONALITY & PROJECTIONS

(a) Orthogonality
(b) Projection & Least Squares Method
(c) Gram-Schmidt Process & Applications

APPENDIX I - DETERMINANTS

NOTE: (i) The notation in these slides is the same as that discussed in class.
(ii) Work out as many examples as you can.

Chapter 1. LINEAR EQUATIONS & MATRICES

1.1 LINEAR EQUATIONS & PIVOTS

What is Linear Algebra?


It is the theory of solving simultaneous linear equations in a finite number of unknowns.
EXAMPLE: You buy food worth Rs. 2600 at the Cafe 2023.
(And your smartphone refuses to pay.)
Suppose you have 10, 3, 6 and 1 currency notes respectively of denominations Rs. 100,
Rs. 200, Rs. 500 and Rs. 1000. How many notes of each denomination will you need?
SOLUTION: Give names to the unknowns: let t, u, v, w be the number of notes that you pick,
of denominations Rs. 100, Rs. 200, Rs. 500 and Rs. 1000 respectively.
Want to solve: 100t + 200u + 500v + 1000w = 2600.
A possible solution: (t, u, v, w) = (1, 0, 5, 0).
Is this unique? No. e.g., (10, 3, 2, 0) or (10, 3, 0, 1)(?)
Are the following solutions permissible:
(1, 5, 3, 0)? No. We need u ≤ 3.
or (−1, 1, 5, 0)? Only if you can get change back.
or (0, 0, 5.2, 0)? No. We need integer values.
KEY NOTE: In general, we are looking for all possible solutions to the given system,
i.e., without any constraints, unlike the introductory example.

An Example
Solve the system: (1) x + 2y = 3, (2) 3x + y = 4.
ELIMINATION OF VARIABLES:
Eliminate x by (2) − 3 × (1) to get −5y = −5, i.e., y = 1.
CRAMER'S RULE (DETERMINANT):

        | 1 3 |
        | 3 4 |     4 − 9
    y = ------- = --------- = 1
        | 1 2 |     1 − 6
        | 3 1 |
In either case, back substitution gives x = 1.
We could also solve for x first and use back substitution for y.
POINT TO PONDER: Why is this possible?
COMPARISON: For a large system, say 100 equations in 100 variables, the elimination
method is preferred, since computing 101 determinants of size 100 × 100
is time-consuming.

Geometry of linear equations


ROW METHOD:

x+2y=3 and 3x+y=4

represent lines in R2 passing through (0, 3/2) and (3, 0) and through
(0, 4) and (4/3, 0) respectively.

[Figure: the lines x + 2y = 3 and 3x + y = 4 in the x-y plane, meeting at the point (1, 1).]
The intersection of the two lines is the unique point (1, 1). Hence (x, y) = (1, 1)
is the (only) solution of above system of linear equations.

Geometry of linear equations

COLUMN METHOD: The system is

    x [1] + y [2] = [3]
      [3]     [1]   [4]
We need to find a linear combination of the column vectors on LHS
to produce the column vector on RHS.
Geometrically this is same as completing the parallelogram
with given directions and diagonal.
[Figure: the parallelogram with sides along (1, 3) and (2, 1) and diagonal (3, 4).]
Here x = 1, y = 1 will work.
POINT TO PONDER: How do the pictures change
if the system is x + 2y = 5 and 3x + y = 5?

Equations in 3 variables
ROW METHOD: A linear equation in 3 variables represents
a plane in the 3-dimensional space R3, e.g.,
P1: x+2y+3z=6 is a plane passing through: (0, 0, 2), (0, 3, 0), (6, 0, 0).

P2: x+2y+3z=0 is a plane passing through: (0, 0, 0), (−3, 0, 1), (−2, 1, 0).
NOTE: P2 consists of all (x, y, z) such that (x, y, z) · (1, 2, 3) = 0, i.e.,
P2 is the set of all vectors perpendicular to the vector (1, 2, 3).
P1 and P2 have the same normal vector (1, 2, 3), and hence are parallel.

Equations in 3 variables: Examples


EXAMPLE 1: (1) x + 2y + 3z = 6 (2) x + 2y + 3z = 0.
The two equations represent planes with normal vector (1, 2, 3)
and are parallel to each other.
How many solutions can we find? There are no solutions.
EXAMPLE 2: (1) x + 2y + 3z = 0 (2) −x + 2y + z = 0
The two equations represent planes passing through (0,0,0).

The intersection is non-empty, i.e., the system has at least one solution. In fact,
the solution set is a line in R3 , passing through the origin.
EXERCISE: Find all the solutions in the second example.

3 equations in 3 variables

• Solving 3 by 3 system by the row method means finding an intersection of three


planes, say P1 , P2 , P3 . This is same as the intersection of a line L (intersection of P1
and P2 ) with the plane P3 .
POINT TO PONDER:
There is an implicit assumption made in this point. What is it?

• If the line L does not intersect the plane P3 , then the linear system has no solution,
i.e., the system is inconsistent.

• If the line L is contained in the plane P3 , then the system has infinitely many solutions.
In this case, every point of L is a solution.

• Work out some more examples.

Linear Combinations (or Weighted Sum)


COLUMN METHOD: Consider the 3 × 3 system:
x + 2y + 3z = 0, −2x + 3y = 1, −x + 5y + 2z = 2. Equivalently,

    x [ 1]     [2]     [3]   [0]
      [-2] + y [3] + z [0] = [1]
      [-1]     [5]     [2]   [2]

We want a linear combination (or weighted sum) of the column vectors on the
LHS which is equal to the RHS.
OBSERVE:
1. x = 1, y = 1, z = −1 is a solution. Q: Is it unique?
2. Since each column represents a vector in R3 from the origin,
we can find the solution geometrically, as in the 2 × 2 case.
Q: Can we do the same when the number of variables is more than 3?
Solve the system by other techniques to answer such questions.

Matrix notation (A⃗x = ⃗b) for linear systems


Consider the system
2u + v + w = 5, 4u − 6v = −2, −2u + 7v + 2w = 9.
Let ⃗x = (u, v, w)T be the unknown vector, and ⃗b = (5, −2, 9)T.
The coefficient matrix is

    A = [ 2  1  1 ]
        [ 4 -6  0 ]
        [-2  7  2 ]

If we have m equations in n variables, then A has m rows and n columns,
the column vector ⃗b has size m, and the unknown vector ⃗x has size n.

NOTATION: From now on, we will write ⃗x as x and ⃗b as b.

Gaussian Elimination
EXAMPLE: 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + 2w = 9.
ALGORITHM: Eliminate u from the last 2 equations by
(2) − (4/2) × (1), and (3) − (−2/2) × (1), to get the equivalent system:

2u + v + w = 5,  −8v − 2w = −12,  8v + 3w = 14

The coefficient used for eliminating a variable is called a pivot.

The first pivot is 2, the second pivot is −8. Eliminate v from the last equation
(by (3) − (8/−8) × (2)) to get an equivalent triangular system:

2u + v + w = 5,  −8v − 2w = −12,  1 · w = 2

Solve this triangular system by back substitution, to get the unique solution

w = 2, v = 1, u = 1.
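
The two phases above translate directly into code. The following is a minimal numpy sketch (an illustration added to these notes, not part of the lecture material) of forward elimination followed by back substitution on this example; it assumes the pivots are non-zero, so that no row exchanges are needed.

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    b = np.array([5., -2., 9.])
    n = len(b)
    U, c = A.copy(), b.copy()

    # Forward elimination: for each pivot, clear the entries below it.
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i, k] / U[k, k]          # multiplier (pivot assumed non-zero)
            U[i, k:] -= m * U[k, k:]
            c[i] -= m * c[k]

    # Back substitution on the triangular system U x = c.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]

    print(x)   # [1. 1. 2.], i.e., u = 1, v = 1, w = 2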

Elimination: Matrix form


EXAMPLE: 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + 2w = 9.
Forward elimination in the augmented matrix form [A|b]:
(NOTE: The last column is the constant vector b.)

    [ 2  1  1 |  5 ]      [ 2  1  1 |   5 ]
    [ 4 -6  0 | -2 ]  →   [ 0 -8 -2 | -12 ]
    [-2  7  2 |  9 ]      [ 0  8  3 |  14 ]

        [ 2  1  1 |   5 ]
    →   [ 0 -8 -2 | -12 ] .   Solution is: x = (1, 1, 2)T.
        [ 0  0  1 |   2 ]
POINT TO PONDER: Is there a relation between ‘pivots’ and ‘unique solution’?

Singular case: No solution


EXAMPLE: 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + w = 9.
Step 1: Eliminate u (using the 1st pivot 2) to get:
2u + v + w = 5, −8v − 2w = −12, 8v + 2w = 14
Step 2: Eliminate v (using the 2nd pivot −8) to get:
2u + v + w = 5, −8v − 2w = −12, 0 · w = 2.
The last equation shows that there is no solution,
i.e., the system is inconsistent.
GEOMETRIC REASONING: In Step 1, notice we get two distinct parallel planes
8v + 2w = 12 and 8v + 2w = 14.
They have no point in common.
NOTE: The planes in the original system were not parallel,
but in an equivalent system, we get two distinct parallel planes!

Singular Case: Infinitely many solutions
EXAMPLE: 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + w = 7.
Step 1: Eliminate u (using the 1st pivot 2) to get:
2u + v + w = 5, −8v − 2w = −12, 8v + 2w = 12
Step 2: Eliminate v (using the 2nd pivot −8) to get:
2u + v + w = 5, −8v − 2w = −12, 0 · w = 0.
There are only two equations left. For every value of w, values for u and v
are obtained by back-substitution,
e.g., w = 2 gives (1, 1, 2), and w = 0 gives (7/4, 3/2, 0)
⇒ the system has infinitely many solutions.

GEOMETRIC REASONING: In Step 1, we get the two equations −8v − 2w = −12
and 8v + 2w = 12, giving the same plane. Hence we are looking at the
intersection of the two planes 2u + v + w = 5 and 8v + 2w = 12, which is a line.

Singular Cases: Matrix Form


1. 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + w = 9.

    [ 2  1  1 |  5 ]      [ 2  1  1 |   5 ]
    [ 4 -6  0 | -2 ]  →   [ 0 -8 -2 | -12 ] .
    [-2  7  1 |  9 ]      [ 0  0  0 |   2 ]

No solution! Why?
2. 2u + v + w = 5, 4u − 6v = −2, −2u + 7v + w = 7.

    [ 2  1  1 |  5 ]      [ 2  1  1 |   5 ]
    [ 4 -6  0 | -2 ]  →   [ 0 -8 -2 | -12 ] .
    [-2  7  1 |  7 ]      [ 0  0  0 |   0 ]

Infinitely many solutions! Why?

POINT TO PONDER: Is there a relation between pivots and the number of solutions?

Choosing pivots: Two examples


EXAMPLE 1:
−6v + 4w = −2, u + v + 2w = 5, 2u + 7v − 2w = 9.
Forward elimination in the augmented matrix form [A|b]:

    [ 0 -6  4 | -2 ]
    [ 1  1  2 |  5 ]
    [ 2  7 -2 |  9 ]

The coefficient of u in the first equation is 0. To eliminate u, exchange the first
two equations, i.e., interchange the first two rows of the matrix and get

    [ 1  1  2 |  5 ]
    [ 0 -6  4 | -2 ]
    [ 2  7 -2 |  9 ]

EXERCISE: Continue using the elimination method; find all solutions.

Choosing pivots: Two examples
EXAMPLE 2: 3 equations in 3 unknowns (u, v, w)
0u + 6v + 4w = −2, 0u + v + 2w = 1, 0u + 7v − 2w = −9.

    [A|b] = [ 0  1  2 |  1 ]
            [ 0  6  4 | -2 ]
            [ 0  7 -2 | -9 ]

The coefficient of u is 0 in every equation. Start by eliminating v. Solve for v and w
to get w = 1 and v = −1.
NOTE: (0, −1, 1) is a solution of the system. So is (1, −1, 1).
In fact, (∗, −1, 1) is a solution, for any real number ∗.
OBSERVE: ‘Unique solution’ is not an option!
Q: Does such a system always have infinitely many solutions?
A: Depends on the constant vector b.
EXERCISE: Find 3 vectors b for which the above system has
(i) no solutions (ii) infinitely many solutions.

Summary: Pivots

• Can a pivot be zero? No (since we need to divide by it).

• If the first pivot (the coefficient of the 1st variable in the 1st equation) is zero, interchange
the equation with one below it so that you get a non-zero first pivot. Do the same for the other pivots.

• If the coefficient of the 1st variable is zero in every equation, treat the 2nd variable
as the 1st and repeat the previous step.

• Consider a system of n equations in n variables.

The non-singular case, i.e., the system has n pivots:
the system has a unique solution.
The singular case, i.e., the system has at most n − 1 pivots:
the system either has no solutions (it is inconsistent),
or it has infinitely many solutions (if it is consistent).

1.2 MATRICES

What is a matrix?
A matrix is a collection of numbers arranged into a fixed number of
rows and columns.
If a matrix A has m rows and n columns, the size of A is m × n.
The rows of A are denoted A1∗, A2∗, . . . , Am∗, i.e.,

    A = [ A1∗ ]
        [ A2∗ ]
        [ ... ]
        [ Am∗ ]

the columns are denoted A∗1, A∗2, . . . , A∗n, i.e., A = [A∗1 A∗2 · · · A∗n],
and the (i, j)-th entry is Aij (or aij).
THUS, an m × n matrix can be thought of as mn real numbers, n column vectors in Rm,
or m row vectors in Rn, arranged in a particular order.

Operations on Matrices: Matrix Addition
EXAMPLE 1. We know how to add two row or column vectors.

    [1 2 3] + [-3 -2 -1] = [? ? ?]   (component-wise)

We can add matrices if and only if they have the same size,
and the addition is component-wise.
EXAMPLE 2.

    [1 2 3]   [-1 -4 -2]   [?  -2  ?]
    [2 2 4] + [ 0  0  1] = [2   ?  ?]

THUS
(A + B)i∗ = Ai∗ + Bi∗ and (A + B)∗j = A∗j + B∗j

Linear Systems: Multiplying a Matrix and a Vector


ONE ROW AT A TIME (DOT PRODUCT): The system
2u + v + w = 5, 4u − 6v = −2, −2u + 7v + 2w = 9
can be rewritten using dot products as follows:

    [2 1 1] · (u, v, w)T = 5,  [4 -6 0] · (u, v, w)T = -2,  and  [-2 7 2] · (u, v, w)T = 9.

Write the system in the Ax = b form:

    [ 2  1  1 ] [u]   [ 2u +  v +  w ]   [ 5]
    [ 4 -6  0 ] [v] = [ 4u - 6v      ] = [-2]
    [-2  7  2 ] [w]   [-2u + 7v + 2w ]   [ 9]

NOTE: No. of columns of A = length of the vector x.

Multiplication of a Matrix and a Vector


DOT PRODUCT (ROW METHOD):
Ax is obtained by taking the dot product of each row of A with x.

    If A = [A1∗]            [A1∗ · x]
           [A2∗] , then Ax = [A2∗ · x]
           [A3∗]            [A3∗ · x]

LINEAR COMBINATIONS (COLUMN METHOD):
The column form of the system
2u + v + w = 5, 4u − 6v = −2, −2u + 7v + 2w = 9 is:

    u [ 2]     [ 1]     [1]   [ 2  1  1 ] [u]
      [ 4] + v [-6] + w [0] = [ 4 -6  0 ] [v]
      [-2]     [ 7]     [2]   [-2  7  2 ] [w]

THUS Ax is a linear combination of the columns of A, with the coordinates of x
as weights, i.e., Ax = uA∗1 + vA∗2 + wA∗3.
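
As a quick illustration (a numpy sketch added to these notes, using the example system above), the row method and the column method compute the same vector Ax:

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    x = np.array([1., 1., 2.])   # (u, v, w)

    # Row method: one dot product per row of A.
    row_method = np.array([A[i, :] @ x for i in range(3)])
    # Column method: a linear combination of the columns of A.
    col_method = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]

    print(row_method, col_method, A @ x)   # all three give [ 5. -2.  9.]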

An Example

    Let A = [ 1  3 -3 -1 ]
            [ 1  2  0 -2 ] ,  x = (2, 1, 1, 2)T, and e1 = (1, 0, 0, 0)T.
            [ 1  0 -2  0 ]

A1∗ = [1 3 -3 -1], A2∗ = [1 2 0 -2], A3∗ = ?.

Then A1∗ · x = ?, A2∗ · x = 0, A3∗ · x = 0, hence Ax = (?, ?, ?)T.
Q: What is Ae1? A: The first column A∗1 of A.
EXERCISE: What should e2, e3, e4 be so that Aej = A∗j, the j-th column of A?
POINTS TO PONDER: Let A be m × n.
(i) For which x ∈ Rn is Ax = 0?
(ii) For which b ∈ Rm is Ax = b solvable?
(iii) Ae1 gives A∗1. How can we get A1∗, the first row of A?

Operations on Matrices: Matrix Multiplication


Key Point: We know how to multiply a matrix and a vector.
OBSERVE: No. of rows of Ax = no. of rows of A,
and no. of columns of Ax = no. of columns of x.

Two matrices A and B can be multiplied if and only if
no. of columns of A = no. of rows of B.
If A is m × n and B is n × r, then AB is m × r.

ROW-WISE: Write A row-wise, i.e., let A1∗, . . . , Am∗ be the rows of A. Then

    AB = [A1∗]       [A1∗ B]
         [ ... ] B = [ ... ]
         [Am∗]       [Am∗ B]

NOTE: Each Ai∗ is a row vector of size 1 × n. Hence, Ai∗ B is a row vector
of size 1 × r. So, the size of AB is m × r.
COLUMN-WISE: Write B column-wise, i.e., let B = [B∗1 B∗2 · · · B∗r]. Then

    AB = [AB∗1 AB∗2 · · · AB∗r]

NOTE: (i) Each B∗j is a column vector of length n. Hence, AB∗j is a column vector
of length m. So, the size of AB is m × r.
(ii) Moreover, AB∗j is a linear combination of the columns of A
with the entries of B∗j as weights.
THUS, the columns of AB are linear combinations of the columns of A.
POINT TO PONDER: Each row of AB is a linear combination of ________.
WORKING RULE:
The entry in the i-th row and j-th column of AB is the dot product
of the i-th row of A with the j-th column of B, i.e., (AB)ij = Ai∗ · B∗j.
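
A small numerical check of the working rule (a numpy sketch added for illustration), using the matrices A and B that appear in the examples on the next slide:

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    B = np.array([[1., 2., 3.],
                  [2., 2., 4.]])

    P = B @ A                    # B is 2 x 3 and A is 3 x 3, so BA is 2 x 3
    entry = B[0, :] @ A[:, 1]    # (1st row of B) . (2nd column of A)
    print(entry, P[0, 1])        # 10.0 10.0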

Properties of Matrix Multiplication


Let A be m × n and B be n × r.
a) (AB)ij = Ai∗ · B∗j = (i-th row of A) · (j-th column of B)
b) j-th column of AB = A · (j-th column of B), i.e., (AB)∗j = AB∗j.
c) i-th row of AB = (i-th row of A) · B, i.e., (AB)i∗ = Ai∗ B.
PROPERTIES:

• (associativity) (AB)C = A(BC)

• (distributivity) A(B + C) = AB + AC

(B + C)D = BD + CD

• (non-commutativity) AB ≠ BA, in general. Find examples.


 
• (Identity) Let I be the n × n matrix with 1's on the diagonal and 0's elsewhere.
If A is n × n, then AI = A = IA.

Matrix Multiplication: Examples

EXAMPLES:

    A = [ 2  1  1 ]     B = [1 2 3]     I = [1 0 0]  (Identity)
        [ 4 -6  0 ] ,       [2 2 4] ,       [0 1 0]
        [-2  7  2 ]                         [0 0 1]

    (Permutation) P = [0 1 0]                   E = [ 1 0 0]
                      [1 0 0] = [e2 e1 e3] ,        [-2 1 0]
                      [0 0 1]                       [ 0 0 1]

• AB = ??
• size of BA is __ × __

• BA = [? 10 ?]
       [4  ?  ?] ,

• and IA = A = AI.
• AP = [Ae2 Ae1 Ae3] = [A∗2 A∗1 A∗3]
EXERCISE: Find EA and P A.
POINT TO PONDER: Can you obtain EA and P A directly from A?

Transpose of a Matrix
DEFN. AT, the transpose of A, is defined as follows:
the i-th row of A is the i-th column of AT, and vice-versa.
Hence if Aij = a, then (AT)ji = a.

    EXAMPLE: If A = [1  2 3]             [1  0]
                    [0 -2 1] , then AT = [2 -2] .
                                         [3  1]
• If A is m × n, then AT is n × m.
• If A is upper triangular, then AT is lower triangular.
• (AT )T = A, (A + B)T = AT + B T .

• (AB)T = B T AT .
Proof. Exercise.

Symmetric Matrix
DEFN. If AT = A, then A is called a symmetric matrix.
NOTE: A symmetric matrix is always n × n (square).

    EXAMPLES: A = [1 2]     B = [0 1]
                  [2 3] ,       [1 0]   are symmetric.

• If A, B are symmetric, then AB may NOT be symmetric.

    In the above case, AB = [2 1]
                            [3 2] .
• If A and B are symmetric, then so is A + B.
• If A is n × n, A + AT is symmetric.
• For any m × n matrix B, BB T and B T B are symmetric.
EXERCISE: If AT = −A, we say that A is skew-symmetric.
Verify whether similar observations are true for skew-symmetric matrices.

Inverse of a Matrix
DEFN. Given A of size n × n, we say B is an inverse of A if AB = I = BA.
If this happens, we say A is invertible.
• An inverse may not exist. Find an example. Hint: n = 1.
• An inverse of A, if it exists, has size n × n.
• If the inverse of A exists, it is unique, and is denoted A−1.
Proof. Let B and C be inverses of A.

⇒ BA = I by definition of inverse.
⇒ (BA)C = IC multiply both sides on the right by C.
⇒ B(AC) = IC by associativity.
⇒ BI = IC since C is an inverse of A.
⇒B=C by property of the identity matrix I.

Inverses and Matrix Operations


• If A and B are invertible, what about AB?
AB is invertible, with inverse (AB)−1 = B −1 A−1 .
Proof. Exercise.
• If A, B are invertible, what about A + B?
A + B may not be invertible. Example: I + (−I) = (0).
• If A is invertible, what about AT ?
AT is invertible with inverse (AT )−1 = (A−1 )T .
Proof. Use AA−1 = I. Take transpose.
• If A is symmetric and invertible, then is A−1 symmetric?
Yes. Proof. Exercise!
• (Identity) I −1 = I.

Inverses and Linear Systems


• If A is invertible then the system Ax = b has a solution
for every constant vector b, namely x = A−1 b. Is this unique?
• Since x = 0 is always a solution of Ax = 0, if Ax = 0 has a non-zero solution,
then A is not invertible by the last remark.
• If A is invertible, then the Gaussian elimination of A produces n pivots.

EXERCISE:
1. A diagonal matrix A is invertible if and only if ________.
(Hint: When are the diagonal entries pivots?)
2. When is an upper triangular matrix invertible?

• Since AB = [AB∗1 AB∗2 · · · AB∗n] and I = [e1 e2 · · · en],
if B = A−1, then B∗j is a solution of Ax = ej for all j.
• STRATEGY TO FIND A−1: Let A be an n × n invertible matrix.
Solve Ax = e1, Ax = e2, . . . , Ax = en.

Finding A−1
STRATEGY: Let A be an invertible n × n matrix. If v1, v2, . . . , vn
are solutions of Ax = e1, Ax = e2, . . . , Ax = en respectively, then A−1 = [v1 v2 · · · vn].

If Ax = ej is not solvable for some j, then A is not invertible.

THUS, finding A−1 reduces to solving multiple systems of linear equations
with the same coefficient matrix.
Q: If A is m × n, and b1 , b2 ∈ Rm , how do we (attempt to) solve Ax = b1 and Ax = b2 ?
A: Apply elimination to (A | b1 ) and (A | b2 ).
OBSERVE: The elimination steps depend only on A, and not on
the constant vectors b1 or b2.
Hence, we can apply the elimination steps to the augmented matrix (A | b1 b2 )
to solve Ax = b1 and Ax = b2 simultaneously.
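
A minimal sketch of this idea in code (added for illustration), using the invertible coefficient matrix from the earlier example; np.linalg.solve accepts a whole matrix of right-hand sides at once, mirroring elimination on (A | b1 b2), but it requires A to be square and invertible:

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    B = np.array([[5., 1.],
                  [-2., 0.],
                  [9., 0.]])        # the two constant vectors b1, b2 as columns

    X = np.linalg.solve(A, B)       # column j of X solves A x = B_*j
    print(np.allclose(A @ X, B))    # True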

Solutions to Multiple Systems

    Q: Let A = [0 1 -1]       B = [-1 1]
               [1 0  2] ,         [ 2 0] = [b1 b2].
               [1 2  0]           [ 0 2]

Is there a matrix C such that AC = B,
i.e., such that AC∗1 = b1, AC∗2 = b2?
Rephrased question: Are Ax = b1 and Ax = b2 simultaneously consistent?

    [A|B] = [0 1 -1 | -1 1]  R1↔R2   [1 0  2 |  2 0]
            [1 0  2 |  2 0]  ----->  [0 1 -1 | -1 1]
            [1 2  0 |  0 2]          [1 2  0 |  0 2]

    R3-R1   [1 0  2 |  2 0]  R3-2R2  [1 0  2 |  2 0]
    ----->  [0 1 -1 | -1 1]  ----->  [0 1 -1 | -1 1]
            [0 2 -2 | -2 2]          [0 0  0 |  0 0]

A solution to Ax = b1 is (0, 0, 1)T, and to Ax = b2 is (0, 1, 0)T. (Verify!)

So C = [e3 e2] works! Is it unique?
 
EXERCISE: Can you find the inverse of

    A = [ 0 1 -1 ]
        [ 1 0  2 ]
        [-1 2  0 ]

(if it exists), by using elimination as above?

1.3 GAUSSIAN ELIMINATION

Elementary and Permutation
  Matrices

0 1 −1 −1
Let A = 1 0 2 , b =  2 .
1 2 0 0
In the process of solving Ax = b by elimination, we see that
   
0 1 −1 | −1 1 0 2 | 2
[A | b] = 1 0 2 | 2  −→ 0 1 −1 | −1
1 2 0 | 0 0 0 0 | 0

OBSERVE: We use a row exchange: R1 ↔ R2,
and elimination using pivots: R3 ↦ R3 − R1, R3 ↦ R3 − 2R2.
Both can be achieved by left multiplication by special matrices.
EXERCISE: Identify the special matrices in this example.

Row Operations: Elementary Matrices

    EXAMPLE: E x = [ 1 0 0] [u]   [u     ]
                   [-2 1 0] [v] = [v - 2u] .
                   [ 0 0 1] [w]   [w     ]

If A = [A∗1 A∗2 A∗3], then EA = [EA∗1 EA∗2 EA∗3].
Thus, EA has the same effect on A as the row operation
R2 ↦ R2 + (−2)R1 on the matrix A.
NOTE: E is obtained from the identity matrix I by the row operation
R2 ↦ R2 + (−2)R1.
Such a matrix (diagonal entries 1 and at most one off-diagonal entry non-zero)
is called an elementary matrix.
NOTATION: E := E21(−2). Similarly define Eij(λ).

Row Operations: Permutation Matrices

    EXAMPLE: P x = [0 1 0] [u]   [v]
                   [1 0 0] [v] = [u]
                   [0 0 1] [w]   [w]

If A = [A∗1 A∗2 A∗3], then P A = [P A∗1 P A∗2 P A∗3].
Thus P A has the same effect on A as the row interchange R1 ↔ R2.
NOTE: We get P from I by interchanging the first and second rows.
A matrix obtained from I by row exchanges is called a permutation matrix.
NOTATION: P = P12. Similarly define Pij.
REMARK: Row operations correspond to multiplication by elementary matrices
Eij(λ) or permutation matrices Pij on the left.

Elementary and Permutation Matrices: Inverses

For any n × n matrix A, observe that the row operations
R2 ↦ R2 − 2R1 and R2 ↦ R2 + 2R1, applied one after the other, leave the matrix unchanged.
In matrix terms, E21(2)E21(−2)A = IA = A, and in particular,

    E21(-2) E21(2) = [ 1 0 0] [1 0 0]   [1 0 0]
                     [-2 1 0] [2 1 0] = [0 1 0] .
                     [ 0 0 1] [0 0 1]   [0 0 1]

 
    • If E21(λ) = [1 0 0]
                  [λ 1 0] ,  what is your guess for E21(λ)−1? Verify.
                  [0 0 1]

    • Let P12 = [0 1 0]   [e2T]
                [1 0 0] = [e1T] .  What is P12T? P12T P12? P12−1?
                [0 0 1]   [e3T]

Elementary and Permutation Matrices: Inverses

Notice that the row interchange R1 ↔ R2 followed by R1 ↔ R2
leaves a matrix unchanged.
In matrix terms, P12 P12 A = IA = A, and in particular,

    P12 P12 = [0 1 0] [0 1 0]   [1 0 0]
              [1 0 0] [1 0 0] = [0 1 0] .
              [0 0 1] [0 0 1]   [0 0 1]

• Let Pij be obtained by interchanging the i-th and j-th rows of I.
Show that PijT = Pij = Pij−1.

    • Let P = [0 0 1]   [e3T]
              [1 0 0] = [e1T] .  Show that P = P12 P23.
              [0 1 0]   [e2T]

Hence, P−1 = (P12 P23)−1 = P23−1 P12−1 = P23T P12T = PT.

Elimination using Elementary Matrices

    Consider [ 2  1  1] [u]   [ 5]
             [ 4 -6  0] [v] = [-2]     (Ax = b)
             [-2  7  2] [w]   [ 9]

Step 1: Eliminate u by R2 ↦ R2 − 2R1, R3 ↦ R3 − (−1)R1.

This corresponds to multiplying both sides on the left first by
E21(−2) and then by E31(1). The equivalent system is:

E31(1) E21(−2) A x = E31(1) E21(−2) b, i.e.,

    [2  1  1] [u]   [  5]
    [0 -8 -2] [v] = [-12] .
    [0  8  3] [w]   [ 14]

Step 2: Eliminate v by R3 ↦ R3 − (−1)R2,
i.e., multiply both sides by E32(1) to get U x = c,

    where U = E32(1) E31(1) E21(−2) A = [2  1  1]
                                        [0 -8 -2]
                                        [0  0  1]

and c = E32(1) E31(1) E21(−2) b = (5, −12, 2)T.

Elimination changed A to an upper triangular matrix
and reduced the problem to solving U x = c.
OBSERVE: The pivots of the system Ax = b are the non-zero diagonal entries of U.
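
The same computation can be checked numerically. Below is a numpy sketch added for illustration; E(i, j, lam) is a hypothetical helper written here (not a standard library function) that builds Eij(λ), with rows and columns numbered from 0:

    import numpy as np

    def E(i, j, lam, n=3):
        """Elementary matrix E_ij(lam), 0-indexed: identity with lam at (i, j)."""
        M = np.eye(n)
        M[i, j] = lam
        return M

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    b = np.array([5., -2., 9.])

    M = E(2, 1, 1) @ E(2, 0, 1) @ E(1, 0, -2)   # E32(1) E31(1) E21(-2)
    print(M @ A)   # U: rows (2, 1, 1), (0, -8, -2), (0, 0, 1)
    print(M @ b)   # c = [  5. -12.   2.]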

Summary: Elementary and Permutation Matrices
ELEMENTARY MATRIX Eij(λ): Obtained from In by Ri ↦ Ri + λRj.
• Eij(λ)A has the same effect on A as the row operation Ri ↦ Ri + λRj.
• INVERSE: Eij(λ)−1 = Eij(−λ).
PERMUTATION MATRIX: Obtained from In by row exchanges, e.g.,
• Pij: Obtained from In by Ri ↔ Rj.
• Pij A has the same effect on A as the row operation Ri ↔ Rj.
• A permutation matrix P is a product of Pij's.
• INVERSE: Pij−1 = Pij = PijT, and hence, P−1 = PT.
Q: Is P = P−1?

Triangular Factorization
Thus Ax = b is equivalent to U x = c, where

    E32(1) E31(1) E21(−2) A = U

Multiply both sides by E32(−1) on the left: E31(1) E21(−2) A = E32(−1) U.
Multiply first by E31(−1) and then by E21(2) on the left:

    A = E21(2) E31(−1) E32(−1) U = LU

where U is upper triangular, obtained by forward elimination,
with diagonal entries as pivots, and L = E21(2) E31(−1) E32(−1).
Note that each Eij(a) above is lower triangular, and a product of lower triangular
matrices is lower triangular. In particular, L is lower triangular:

    L = E21(2) E31(−1) E32(−1)

      = [1 0 0] [ 1 0 0] [1  0 0]   [ 1  0 0]
        [2 1 0] [ 0 1 0] [0  1 0] = [ 2  1 0]
        [0 0 1] [-1 0 1] [0 -1 1]   [-1 -1 1]

OBSERVE: L is lower triangular with diagonal entries 1, and
below the diagonal are the multipliers (2, −1, −1 in this example).

LU Decomposition
If A is an n × n matrix, with no row interchanges needed
in the Gaussian elimination of A, then A = LU, where
• U is an upper triangular matrix, obtained by forward elimination,
with its non-zero diagonal entries as the pivots.
• L is lower triangular with diagonal entries 1, and with the multipliers used
in the elimination algorithm below the diagonal.
Q: What happens if row exchanges are required?

LU Decomposition: with Row Exchanges
    EXAMPLE: A = [0 2]
                 [3 4] .  A cannot be factored as LU. (Check!)

The 1st step in the Gaussian elimination of A is a row exchange.

    P12 A = [0 1] [0 2]   [3 4]
            [1 0] [3 4] = [0 2]

Now elimination can be carried out without row exchanges.


• If A is an n × n matrix, then there is a permutation matrix P such that
P A = LU, where L and U are as defined earlier.
POINT TO PONDER: What happens when A is an m × n matrix?

Applications: 1. Linear Equations

    Let A = [ 1   2  3]   [ 1 1 0 0] ← no;

    Let A = [ 1   2  3]   [ 1 0 0] [1  2  3]
            [-2 -12 -5] = [-2 1 0] [0 -8  1]
            [ 1  -6  3]   [ 1 1 1] [0  0 -1]

To solve Ax = b, we can solve two triangular systems,
Lc = b and U x = c. Then Ax = LU x = Lc = b.

    Take b = (1, 2, 5)T. First solve [ 1 0 0] [c1]   [1]
                                     [-2 1 0] [c2] = [2] .
                                     [ 1 1 1] [c3]   [5]

We get c1 = 1, −2c1 + c2 = 2 ⇒ c2 = 4, and similarly c3 = 0.

    Now solve [1  2  3] [u]   [1]
              [0 -8  1] [v] = [4] .
              [0  0 -1] [w]   [0]

We get w = 0, v = −1/2, u = 2.
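
As a sketch of these two triangular solves in code (added for illustration, using scipy's solve_triangular; the factors L and U are copied from the slide):

    import numpy as np
    from scipy.linalg import solve_triangular

    L = np.array([[1., 0., 0.],
                  [-2., 1., 0.],
                  [1., 1., 1.]])
    U = np.array([[1., 2., 3.],
                  [0., -8., 1.],
                  [0., 0., -1.]])
    b = np.array([1., 2., 5.])

    c = solve_triangular(L, b, lower=True)    # forward substitution: L c = b
    x = solve_triangular(U, c, lower=False)   # back substitution:    U x = c
    print(c)   # [1. 4. 0.]
    print(x)   # [ 2.  -0.5  0. ]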

Applications: 2. Invertibility of a Matrix


Let A be n × n, and let P, L and U as before be such that P A = LU.
• P is invertible and P−1 = PT ⇒ A = P−1 LU.
• L is lower triangular, with diagonal entries 1 ⇒ L is invertible.
Q: What is L−1? e.g., Try L = E21(2) E31(−1) E32(−1) first.
• The non-zero diagonal entries of U are the pivots of A.
Thus, A invertible ⇒ A has n pivots
⇒ all diagonal entries of U are non-zero ⇒ U is invertible.
Q: Why? HINT: UT is invertible, by elimination. (Verify.)
Conversely, suppose U is invertible. Then A is invertible. Why?
Moreover, A−1 = ________.
We have proved:
A is invertible ⇔ U is invertible ⇔ A has n pivots.

Computing the Inverse
KEY NOTE: A = LU ⇒ A−1 = U−1 L−1.
(Replace A by P A if row exchanges are needed.)

    EXAMPLE: A = [ 2  1 1]
                 [ 4 -6 0]  is invertible. Find A−1.
                 [-2  7 2]

Recall: If A−1 = [x1 x2 x3], where xi is the i-th column of A−1,
then AA−1 = I gives three systems of linear equations

    Ax1 = e1, Ax2 = e2, Ax3 = e3

where ei is the i-th column of I. Since the coefficient matrix A is the same
in the three systems, we can solve them simultaneously as follows:

Calculation of A−1: Gauss-Jordan Method

Steps: (A|I) --Forward Elimination--> (U | L−1) --Backward Elimination--> (I | U−1 L−1).

    Step 1: (A | e1 e2 e3) = [ 2  1 1 | 1 0 0]
                             [ 4 -6 0 | 0 1 0]
                             [-2  7 2 | 0 0 1]

    R2-2R1   [2  1  1 |  1 0 0]
    ------>  [0 -8 -2 | -2 1 0]
    R3+R1    [0  8  3 |  1 0 1]

    R3+R2    [2  1  1 |  1 0 0]
    ------>  [0 -8 -2 | -2 1 0]  =  (U | L−1).
             [0  0  1 | -1 1 1]

    Step 2: (U | L−1) = [2  1  1 |  1 0 0]
                        [0 -8 -2 | -2 1 0]
                        [0  0  1 | -1 1 1]

    R1-R3    [2  1 0 |  2 -1 -1]
    ------>  [0 -8 0 | -4  3  2]
    R2+2R3   [0  0 1 | -1  1  1]

    R1+(1/8)R2   [2  0 0 | 12/8 -5/8 -6/8]
    ---------->  [0 -8 0 |  -4    3    2 ]
                 [0  0 1 |  -1    1    1 ]

    Divide by pivots --> [1 0 0 | 12/16 -5/16 -6/16]
                         [0 1 0 |  4/8  -3/8  -2/8 ]
                         [0 0 1 |  -1     1     1  ]

    = (I | U−1 L−1) = (I | A−1)
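
A compact Gauss-Jordan sketch in numpy follows. It is an illustration added to these notes, not the course's code, and it orders the steps slightly differently from the slide (each pivot row is scaled to 1 as soon as it is used); it assumes A is invertible and that no row exchanges are needed:

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., -6., 0.],
                  [-2., 7., 2.]])
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])          # the augmented matrix (A | I)

    for k in range(n):
        M[k] /= M[k, k]                    # scale so the pivot becomes 1
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]     # clear the rest of column k

    A_inv = M[:, n:]
    print(np.allclose(A @ A_inv, np.eye(n)))   # True
    print(A_inv)   # matches (I | U^-1 L^-1) computed above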

Summary: LU Decomposition
If A is an n × n matrix, with no row interchanges needed
in the Gaussian elimination of A, then A = LU, where
• U is an upper triangular matrix, obtained by forward elimination,
with its non-zero diagonal entries as the pivots.
• L is lower triangular with diagonal entries 1, and with the multipliers used
in the elimination algorithm below the diagonal.
Q: What happens if row exchanges are required?
• If A is an n × n matrix, then there is a permutation matrix P
such that P A = LU, where L and U are as above.
Q: What happens when A is an m × n matrix?

Echelon Form
Q: What happens when A is not a square matrix?

    EXAMPLE: Let A = [1 2 3  5]
                     [2 4 8 12] .  By elimination, we see:
                     [3 6 7 13]

    A  R2-2R1   [1 2  3  5]  R3-(-1)R2   [1 2 3 5]
       ------>  [0 0  2  2]  --------->  [0 0 2 2]  =  U.
       R3-3R1   [0 0 -2 -2]              [0 0 0 0]

    Thus A = LU, where L = E21(2) E31(3) E32(−1) = [1  0 0]
                                                   [2  1 0] .
                                                   [3 -1 1]

Q: What are the pivots of A?

Echelon Form
If A is m × n, we can find P , L and U as before.
In this case, L and P will be m × m and U will be m × n.
U has the following properties:
1. Pivots are the 1st nonzero entries in their rows.
2. Entries below pivots are zero, by elimination.
3. Each pivot lies to the right of the pivot in the row above.
4. Zero rows are at the bottom of the matrix.
U is called an echelon form of A.
Possible 2 × 2 echelon forms (let • = pivot entry):

    [• ∗]    [• ∗]    [0 •]        [0 0]
    [0 •] ,  [0 0] ,  [0 0] , and  [0 0] .

Row Reduced Form


To obtain the row reduced form R of a matrix A:
1) Get the echelon form U . 2) Make the pivots 1.
3) Make the entries above the pivots 0.
EXERCISE: Find all possible 2 × 2 row reduced forms.

    EXAMPLE. Let A = [1 2 3  5]            [1 2 3 5]
                     [2 4 8 12] . Then U = [0 0 2 2] .
                     [3 6 7 13]            [0 0 0 0]

    Divide by pivots: R2/2 gives [1 2 3 5]
                                 [0 0 1 1] .
                                 [0 0 0 0]

    By R1 ↦ R1 − 3R2, the row reduced form of A is R = [1 2 0 2]
                                                       [0 0 1 1] .
                                                       [0 0 0 0]
U and R are used to solve Ax = 0 and Ax = b.

1.4 NULL SPACE & COLUMN SPACE: INTRODUCTION

Null Space: Solution of Ax = 0


Let A be m × n. Q: For which x ∈ Rn is Ax = 0?
The Null Space of A, denoted N(A),
is the set of all vectors x in Rn such that Ax = 0.

    EXAMPLE: A = [1 2 3  5]
                 [2 4 8 12] .  Are the following in N(A)?
                 [3 6 7 13]

    [-2]     [ 0]     [-2]     [-5]
    [ 1]     [ 0]     [ 0]     [ 0]
    [ 0] ?   [-5] ?   [-1] ?   [ 0] ?
    [ 0]     [ 3]     [ 1]     [ 1]

NOTE: x is in N(A) ⇔ A1∗ · x = 0, A2∗ · x = 0, and A3∗ · x = 0,
i.e., x is perpendicular to every row of A.
Q: x = (−2, 1, 0, 0)T and y = (−2, 0, −1, 1)T are in N(A).
What about x + y = (−4, 1, −1, 1)T, and −3 · x = (6, −3, 0, 0)T?
REMARK: Let A be an m × n matrix.
• The null space of A, N(A), contains vectors from Rn.
• If x, y are in N(A), i.e., Ax = 0 and Ay = 0, and u, v are real numbers, then
A(ux + vy) = u(Ax) + v(Ay) = 0, i.e., ux + vy is in N(A),
i.e., a linear combination of vectors in N(A) is also in N(A).
Thus N(A) is closed under linear combinations.

Finding N(A)
KEY POINT: Let U be the echelon form of A, and R be the reduced form of A.
Then Ax = 0 has the same solutions as U x = 0,
which has the same solutions as Rx = 0, i.e.,
N(A) = N(U) = N(R).
REASON: If A is m × n, and Q is an invertible m × m matrix, then N(A) = N(QA).
(Verify this!)
EXAMPLE:

    For A = [1 2 3  5]                [1 2 0 2]
            [2 4 8 12] ,  we have R = [0 0 1 1]
            [3 6 7 13]                [0 0 0 0]

and x = (t, u, v, w)T.
Rx = 0 gives t + 2u + 2w = 0 and v + w = 0,
i.e., t = −2u − 2w and v = −w.
t and v depend on the values of u and w.
u and w are free and independent,
i.e., we can choose any value for these two variables.
SPECIAL SOLUTIONS:
u = 1 and w = 0 gives x = (−2, 1, 0, 0)T.
u = 0 and w = 1 gives x = (−2, 0, −1, 1)T.
The null space contains:

    x = [t]   [-2u - 2w]     [-2]     [-2]
        [u] = [    u   ] = u [ 1] + w [ 0]
        [v]   [   -w   ]     [ 0]     [-1]
        [w]   [    w   ]     [ 0]     [ 1]

i.e., all possible linear combinations of the special solutions.
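
The computation of N(A) can be reproduced with sympy (a sketch added for illustration): rref() returns the row reduced form R together with the pivot columns, and nullspace() returns exactly the special solutions.

    from sympy import Matrix

    A = Matrix([[1, 2, 3, 5],
                [2, 4, 8, 12],
                [3, 6, 7, 13]])

    R, pivots = A.rref()
    print(R)               # rows (1, 2, 0, 2), (0, 0, 1, 1), (0, 0, 0, 0)
    print(pivots)          # (0, 2): the pivot variables are t and v
    print(A.nullspace())   # special solutions (-2, 1, 0, 0)^T and (-2, 0, -1, 1)^T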

Rank of a matrix
Ax = 0 always has a solution: the trivial one, i.e., x = 0.
M AIN Q1: When does Ax = 0 have a non-zero solution?
A: When there is at least one free variable,
i.e., not every column of R contains a pivot.
To keep track of this, we define:
rank(A) = number of columns containing pivots in R .
If A is m × n and rank(A) = r, then
• rank(A) ≤ min{m, n}.
• no. of dependent variables = r.
• no. of free variables = n − r.
• Ax = 0 has only the 0 solution ⇔ r = n.
• m < n ⇒ Ax = 0 has non-zero solutions.
TRUE/FALSE: If m ≥ n, then Ax = 0 has only the 0 solution.

    EXAMPLE: R = [1 2 0 2]          [1 2 3  5]
                 [0 0 1 1]  when A = [2 4 8 12] .
                 [0 0 0 0]          [3 6 7 13]

The number of columns containing pivots in R is 2. So, rank(A) = 2.
R contains a 2 × 2 identity matrix, namely in the rows and columns
corresponding to the pivots.

    This is the row reduced form of the corresponding submatrix of A,
    [1 3]
    [2 8] ,

which is invertible, since it has 2 pivots.
Thus, rank(A) = r ⇒ A has an r × r invertible submatrix.
Is the converse true?

Finding N(A) = N(U) = N(R)


Let A be m × n. To solve Ax = 0, find R and solve Rx = 0.
1. Find free (independent) and pivot (dependent) variables:
pivot variables: columns in R with pivots (↔ t and v).
free variables: columns in R without pivots (↔ u and w).
2. No free variables, i.e., rank(A) = n ⇒ N (A) = 0.
3(a). If rank(A) < n, obtain a special solution:
Set one free variable = 1, the other free variables = 0.
Solve Rx = 0 to obtain values of pivot variables.
3(b). Find special solutions for each free variable.
N (A) = space of linear combinations of special solutions.
• This information is stored in a compact form in:
N ULL S PACE M ATRIX : Special solutions as columns.

Solving Ax = b
CAUTION: If b ≠ 0, solving Ax = b may not be the same as
solving U x = b or Rx = b.

    EXAMPLE: Ax = [1 2 3  5] [t]   [b1]
                  [2 4 8 12] [u] = [b2] = b.
                  [3 6 7 13] [v]   [b3]
                             [w]

Convert to U x = c and then Rx = d:

    [1 2  3  5 | b1]     [1 2  3  5 | b1     ]
    [2 4  8 12 | b2]  →  [0 0  2  2 | b2-2b1 ]
    [3 6  7 13 | b3]     [0 0 -2 -2 | b3-3b1 ]

    →  [1 2 3 5 | b1           ]
       [0 0 2 2 | b2 - 2b1     ]
       [0 0 0 0 | b3 + b2 - 5b1]

The system is consistent ⇔ b3 + b2 − 5b1 = 0, i.e., b3 = 5b1 − b2.

Solving Ax = b or Ux = c or Rx = d
Ax = b has a solution ⇔ b3 = 5b1 − b2,
e.g., there is no solution when b = (1, 0, 4)T.
Suppose b = (1, 0, 5)T. Then

    [A|b] → [1 2 3 5 | b1           ]   [1 2 3 5 |  1]
            [0 0 2 2 | b2 - 2b1     ] = [0 0 2 2 | -2]
            [0 0 0 0 | b3 + b2 - 5b1]   [0 0 0 0 |  0]

    → [1 2 3 5 |  1]     [1 2 0 2 |  4]
      [0 0 1 1 | -1]  →  [0 0 1 1 | -1]
      [0 0 0 0 |  0]     [0 0 0 0 |  0]

Ax = b is reduced to solving U x = c = (1, −2, 0)T,
which is further reduced to solving Rx = d = (4, −1, 0)T,
i.e., we want to solve

    [1 2 0 2] [t]   [ 4]
    [0 0 1 1] [u] = [-1]
    [0 0 0 0] [v]   [ 0]
              [w]

i.e., t = 4 − 2u − 2w and v = −1 − w.
Set the free variables u and w to 0 to get t = 4 and v = −1.
A particular solution: x = (4, 0, −1, 0)T.
EXERCISE: Check that it is a solution, i.e., check Ax = b.
OBSERVE: In Rx = d, the vector d gives the values of the pivot variables
when the free variables are 0.

General Solution of Ax = b
From Rx = d, we get t = 4 − 2u − 2w and v = −1 − w, where u and w are free.
Complete set of solutions to Ax = b:

    [t]   [4 - 2u - 2w]   [ 4]     [-2]     [-2]
    [u] = [     u     ] = [ 0] + u [ 1] + w [ 0]
    [v]   [  -1 - w   ]   [-1]     [ 0]     [-1]
    [w]   [     w     ]   [ 0]     [ 0]     [ 1]

To solve Ax = b completely, reduce to Rx = d. Then:
1. Find xNullSpace, i.e., N(A), by solving Rx = 0.
2. Set the free variables to 0, and solve Rx = d for the pivot variables.
This is a particular solution: xparticular.
3. Complete solutions: xcomplete = xparticular + xNullSpace.

EXERCISE: Verify geometrically for a 1 × 2 matrix, say A = [1 2].
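
A quick numerical sketch (added for illustration) of xcomplete = xparticular + xNullSpace for this example: every choice of the free weights u and w gives a solution of Ax = b.

    import numpy as np

    A = np.array([[1., 2., 3., 5.],
                  [2., 4., 8., 12.],
                  [3., 6., 7., 13.]])
    b = np.array([1., 0., 5.])

    xp = np.array([4., 0., -1., 0.])    # particular solution (free variables 0)
    s1 = np.array([-2., 1., 0., 0.])    # special solutions spanning N(A)
    s2 = np.array([-2., 0., -1., 1.])

    for u, w in [(0, 0), (1, 0), (0, 1), (3, -2)]:
        x = xp + u * s1 + w * s2
        print(np.allclose(A @ x, b))    # True every time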

The Column Space of a Matrix


Q: Does Ax = b have a solution? A: Not always.
MAIN Q2: When does Ax = b have a solution?
If Ax = b has a solution, then we can find numbers x1, . . . , xn such that

    [A∗1 · · · A∗n] (x1, . . . , xn)T = x1 A∗1 + · · · + xn A∗n = b,

i.e., b can be written as a linear combination of the columns of A.
The column space of A, denoted C(A),
is the set of all linear combinations of the columns of A
= {b in Rm such that Ax = b is consistent}.

Finding C(A): Consistency of Ax = b

    EXAMPLE: Let A = [1 2 3  5]
                     [2 4 8 12] .
                     [3 6 7 13]

Then Ax = b, where b = (b1, b2, b3)T, has a solution whenever −5b1 + b2 + b3 = 0.
• C(A) is a plane in R3 passing through the origin with normal vector (−5, 1, 1)T.
• a = (1, 0, 4)T is not in C(A), as Ax = a is inconsistent.
• b = (1, 0, 5)T is in C(A), as Ax = b is consistent.
EXERCISE: Write b as a linear combination of the columns of A.
(A different way of saying: solve Ax = b.)
x = (4, 0, −1, 0)T is a solution of Ax = b. Hence
(1, 0, 5)T = 4A∗1 + (−1)A∗3.
Q: Can you write b as a different combination of A∗1, . . . , A∗4?

Linear Combinations in C(A)


Let A be an m × n matrix, and let u and v be real numbers.
• The column space of A, C(A), contains vectors from Rm.
• If a, b are in C(A), i.e., Ax = a and Ay = b for some x, y in Rn,
then ua + vb = u(Ax) + v(Ay) = A(ux + vy) = Aw, where w = ux + vy.
Hence, if w = (w1, . . . , wn)T, then ua + vb = w1 A∗1 + · · · + wn A∗n,
i.e., a linear combination of vectors in C(A) is also in C(A).
Thus, C(A) is closed under linear combinations.
• If b is in C(A), then b can be written as a linear combination
of the columns of A in as many ways as there are solutions of Ax = b.

Summary: N(A) and C(A)
REMARK: Let A be an m × n matrix.
• The null space of A, N(A), contains vectors from Rn.
• Ax = 0 ⇔ x is in N(A).
• The column space of A, C(A), contains vectors from Rm.
• If B is the nullspace matrix of A, then C(B) = N(A).
• Ax = b is consistent ⇔ b is in C(A) ⇔
b can be written as a linear combination of the columns of A.
This can be done in as many ways as there are solutions of Ax = b.
• Let A be n × n.
A is invertible ⇔ N(A) = {0} ⇔ C(A) = Rn. Why?
• N(A) and C(A) are closed under linear combinations.
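
The consistency test "b is in C(A)" has a convenient computational form (a sketch added to these notes, not part of the original slides): Ax = b is solvable exactly when appending b to A does not increase the rank.

    import numpy as np

    A = np.array([[1., 2., 3., 5.],
                  [2., 4., 8., 12.],
                  [3., 6., 7., 13.]])

    def in_column_space(A, b):
        """b is in C(A) iff rank([A | b]) == rank(A)."""
        return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

    print(in_column_space(A, np.array([1., 0., 5.])))   # True:  b3 = 5b1 - b2
    print(in_column_space(A, np.array([1., 0., 4.])))   # False: inconsistent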

Chapter 2. VECTOR SPACES

2.1 VECTOR SPACES & SUBSPACES

Vector Spaces: Rn
We begin with R1, R2, . . . , Rn, etc., where Rn consists of all column vectors
of length n, i.e., Rn = {x = (x1, . . . , xn)T, where x1, . . . , xn are in R}.
We can add two vectors, and we can multiply vectors by scalars
(i.e., real numbers). Thus, we can take linear combinations in Rn.
EXAMPLES:
R1 is the real line, R3 is the usual 3-dimensional space, and
R2 is represented by the x-y plane; the x and y co-ordinates
are given by the two components of the vector.
[Figure: the vectors (−2, 1), (0, 1), (−2, 0) and (1, 0) in the x-y plane.]

Vector Spaces: Definition


DEFN. A non-empty set V is a vector space if it is closed under
vector addition (i.e., if x, y are in V, then x + y must be in V) and
scalar multiplication (i.e., if x is in V and a is in R, then a ∗ x must be in V).
Equivalently: x, y in V, a, b in R ⇒ a ∗ x + b ∗ y must be in V.
• A vector space is a triple (V, +, ∗) with vector addition + and scalar
multiplication ∗, with + and ∗ satisfying some additional properties
(see next slide).
• The elements of V are called vectors, and the scalars are chosen to be
real numbers (for now).
• If the scalars are allowed to be complex numbers, then V is a
complex vector space.
• For a scalar a and a vector x, we denote a ∗ x by ax.
• PRIMARY EXAMPLE: Rn. Under which operations?

Vector Space: Definition continued


Let x, y and z be vectors, a and b be scalars.
The vector addition and scalar multiplication are also required to satisfy:

i) x + y = y + x Commutativity of addition

ii) (x + y) + z = x + (y + z) Associativity of addition

iii) There is a unique vector 0, such that x + 0 = x Existence of additive identity

iv) For each x, there is a unique −x such that x + (−x) = 0 Existence of additive inverse

v) 1 ∗ x = x Unit property

vi) (a + b)x = ax + bx, a(x + y) = ax + ay, (ab)x = a(bx) Compatibility

Vector Spaces: Examples

1. V = {0}, the space consisting of only the zero vector.

2. V = Rn , the n-dimensional space.

3. V = R∞ , vectors with infinite number of components,


i.e., a sequence of real numbers, e.g., x = (1, 1, 2, 3, 5, 8, . . .), y = (0, 1, 0, 2, 0, 3, 0, 4, . . .),
with component-wise addition and scalar multiplication.

4. V = M, the set of 2 × 2 matrices. What are + and ∗?
Q: Is this the ‘same’ as R4?

5. V = P, the set of polynomials, e.g., 1 + 2x + 3x2 + · · · + 2018x2017, with term-wise + and ∗.

6. V = C[0, 1], the set of continuous real-valued functions on the closed interval [0, 1],
e.g., x2 and ex are vectors in V.
Q: Is 1/x a vector in V? How about 1/(x − 2)?

Vector addition and scalar multiplication are pointwise:
(f + g)(x) = f(x) + g(x) and (a ∗ f)(x) = af(x).

Subspaces: Definition and Examples


Let V be a vector space. If W is a non-empty subset of V,
then W is a subspace of V if:
x, y in W, a, b in R ⇒ a ∗ x + b ∗ y is in W,
i.e., linear combinations stay in the subspace.
E XAMPLES :

i) {0}: The zero subspace and Rn itself.

ii) {(x1 , x2 ) : x1 ≥ 0, x2 ≥ 0} is not a subspace of R2 . Why?

iii) The line x − y = 1 is not a subspace of R2 . Why?

iv) The line x − y = 0 is a subspace of R2 . Why?


POINT TO PONDER: When is a line in R2 a subspace? Justify!

v) Let A be an m × n matrix.
The null space of A, N(A), is a subspace of ________.
The column space of A, C(A), is a subspace of ________.
RECALL: They are both closed under linear combinations.

vi) The set of 2 × 2 symmetric matrices is a subspace of M. So is the set of 2 × 2 lower
triangular matrices. Is the set of invertible 2 × 2 matrices a subspace of M?

vii) The set of convergent sequences is a subspace of R∞ . What about the set of sequences
convergent to 1?

viii) The set of differentiable functions is a subspace of C[0, 1]. Is the same true for the set
of functions integrable on [0, 1]?

ix) - ? See the tutorial sheet for many more examples!

EXERCISE: Verify that a subspace is itself a vector space.
In particular, it must contain the 0 vector!

Subspaces of R2
What are the subspaces of R2?
• V = {(0, 0)T}.
• V = R2.
What if V is neither of the above?
EXAMPLE: Suppose V contains a non-zero vector, say v1 = (−1, 1)T.
[Figure: the line L : x + y = 0 through the multiples (−0.5)v1, v1 and 1.5v1 of v1.]

V must contain the entire line L : x + y = 0, i.e., all multiples of v1.
If V ≠ L, it contains a vector v2 which is not a multiple of v1, say v2 = (0, 1)T.

    OBSERVE: A = [v1 v2] = [-1 0]
                           [ 1 1]  has two pivots

⇔ A is invertible
⇔ for any v in R2, Ax = v is solvable
⇔ v is in C(A)
⇔ v can be written as a linear combination of v1 and v2
⇒ v is in V, i.e., V = R2.
TO SUMMARISE: A subspace of R2 which is non-zero and not all of R2
is a line passing through the origin.
POINT TO PONDER: What happens in R3?

2.2 LINEAR SPAN & INDEPENDENCE

Linear Span: Definition


Given a collection S = {v1, v2, . . . , vn} in a vector space V, the linear span of S,
denoted Span(S) or Span{v1, . . . , vn},
is the set of all linear combinations of v1, v2, . . . , vn, i.e.,
Span(S) = {v = a1 v1 + · · · + an vn, for scalars a1, . . . , an}.

Let {v1, . . . , vn} be in Rm. Then A = [v1 · · · vn] is m × n.

OBSERVE:
1. Span{v1, . . . , vn} = C(A). Thus
v is in Span{v1, . . . , vn} ⇔ Ax = v is consistent.
2. Span{v1, . . . , vn} = Rm ⇔ Ax = v is solvable for every v ∈ Rm ⇔ A has m pivots.
In particular, m ≤ n.
3. Let m = n.
Then A is invertible ⇔ A has n pivots ⇔ Ax = v is consistent for every v in Rn ⇔
Span{v1, . . . , vn} = Rn ⇔ C(A) = Rn.
EXAMPLE: Span{e1, . . . , en} = Rn.

Linear Span: Examples


EXAMPLES:
1. Span{0} = {0}.

2. If v ≠ 0 is a vector, Span{v} = {av, for scalars a}.


GEOMETRICALLY (IN Rm): Span{v} = the line in the direction of v passing through the
origin.

3. Span{(−1, 1)T, (0, 1)T} = R2.

4. If A is m × n, then Span{A∗1 , . . . , A∗n } = C(A).

5. If v1 , . . . , vk are the special solutions of A,


then Span {v1 , . . . , vk } = N (A).

NOTE: All of the above are subspaces.


EXERCISE: Span(S) is a subspace of V.

    6. Let v1 = [1]    v2 = [2]    v3 = [3]         v4 = [ 5]
                [2] ,       [4] ,       [8] ,  and       [12] .
                [3]         [6]         [7]              [13]

Is v = (1, 0, 4)T in Span{v1, v2, v3, v4}?
Let A = [v1 · · · v4], and b = (b1, b2, b3)T.
Recall: Ax = b is solvable ⇔ 5b1 − b2 − b3 = 0.
⇒ v is not in Span{v1, v2, v3, v4}, and
w = (1, 0, 5)T = 4v1 + (−1)v3 is in it.
OBSERVE: v2 = 2v1 and v4 = 2v1 + v3. Hence v2, v4 are in Span{v1, v3}
⇒ Span{v1, v2, v3, v4} = Span{v1, v3}.
Thus, C(A) = the plane P : (5x − y − z = 0) = Span{v1, v3}.
POINT TO PONDER:

Is the span of two vectors in R3 always a plane?

    7. Let v1 = [2]    v2 = [4]    v3 = [6]         v4 = [4]
                [2] ,       [5] ,       [7] ,  and       [6] .
                [2]         [3]         [5]              [2]

Is v = (4, 3, 5)T in Span{v1, v2, v3, v4}?
If yes, write v as a linear combination of v1, v2, v3, v4.
Let A = [v1 · · · v4]. The question can be rephrased as:
Q: Is v in C(A), i.e., is Ax = v solvable? If yes, find a solution.
EXERCISE: Ax = (a, b, c)T is consistent ⇔ 2a − b − c = 0.
OBSERVE: Span{v1, v2, v3, v4} = C(A) is a plane!
Verify that v is in Span{v1, v2, v3, v4} (and w = (4, 3, 4)T is not).
Solve Ax = v using the row reduced form of A to get
the particular solution (4, −1, 0, 0)T, hence v = 4v1 + (−1)v2.

Linear Independence: Example

With v1 = (2, 2, 2)T, v2 = (4, 5, 3)T, v3 = (6, 7, 5)T and v4 = (4, 6, 2)T,
observe: v3 = v1 + v2 and v4 = −2v1 + 2v2.
Hence v3 and v4 are in Span{v1, v2}.
Therefore, Span{v1, v2} = Span{v1, v2, v3, v4} = C(A) = the plane 2x − y − z = 0.
Q: Is the span of two vectors in R3 always a plane?
A: Not always. If v is a multiple of w, then Span{v, w} = Span{w}, which is a line through
the origin, or zero.
Q: What if v and w are not on the same line through the origin?
A: Then yes. Such v, w are examples of linearly independent vectors.

Linear Independence: Definition


The vectors v1, . . . , vn in a vector space V are linearly independent
if a1 v1 + · · · + an vn = 0 ⇒ a1 = 0, . . . , an = 0.
Equivalently, for every nonzero (a1, . . . , an)T in Rn,
we have a1 v1 + · · · + an vn ≠ 0 in V.
The vectors v1, . . . , vn are linearly dependent if they are not linearly independent,
i.e., we can find (a1, . . . , an)T ≠ 0 in Rn such that a1 v1 + · · · + an vn = 0 in V.
OBSERVE: When V = Rm, if A = [v1 · · · vn], then
v1, . . . , vn are linearly dependent
⇔ Ax = x1 v1 + · · · + xn vn = 0 has a non-trivial solution ⇔ N(A) ≠ 0,
and
Ax = x1 v1 + · · · + xn vn = 0 has only the trivial solution
⇔ N(A) = 0 ⇔ v1, . . . , vn are linearly independent.

Linear Independence: Remarks


REMARKS/EXAMPLES:

• The zero vector 0 is linearly dependent. Why?

• If v ≠ 0, then it is linearly independent. Why?

• v, w are linearly dependent ⇔ one is a multiple of the other ⇔ (for V = Rm) they lie on
the same line through the origin.

• More generally, v1, . . . , vn are linearly dependent ⇔ one of the vi's can be written as a
linear combination of the others, i.e., some vi is in Span{vj : j = 1, . . . , n, j ≠ i}.

• Let A be m × n. Then rank(A) = n ⇔ N(A) = 0 ⇔ A∗1, · · · , A∗n are linearly independent.
In particular, if A is n × n, A is invertible ⇔ A∗1, · · · , A∗n are linearly independent.

EXAMPLE: e1, . . . , en are linearly independent vectors in Rn.

Linear Independence: Example


EXAMPLE: Let v1 = (2, 2, 2)T, v2 = (4, 5, 3)T, v3 = (6, 7, 5)T and v4 = (4, 6, 2)T.
Are v1, v2, v3, v4 linearly independent?

    For A = [v1 · · · v4], the reduced form is R = [1 0 1 -2]
                                                   [0 1 1  2] .
                                                   [0 0 0  0]

A has only 2 pivots ⇒ N(A) ≠ 0, so v1, v2, v3, v4 are not independent.
A non-trivial linear combination which is zero is (1)v1 + (1)v2 + (−1)v3 + (0)v4, or
(2)v1 + (−2)v2 + (0)v3 + (1)v4.
MORE GENERALLY: If v1, . . . , vn are vectors in Rm, then A = [v1 · · · vn] is m × n.
If m < n, then rank(A) < n ⇒ N(A) ≠ 0.
THUS: In Rm, any set with more than m vectors is linearly dependent.
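
A sketch checking this example in sympy (added for illustration): stacking v1, . . . , v4 as columns, a rank smaller than the number of columns signals dependence, and nullspace() produces the dependency weights quoted above.

    from sympy import Matrix

    A = Matrix([[2, 4, 6, 4],
                [2, 5, 7, 6],
                [2, 3, 5, 2]])   # columns v1, v2, v3, v4

    print(A.rank())              # 2 < 4 columns, so the columns are dependent
    for v in A.nullspace():      # dependency weights (up to sign and scale)
        print(v.T)               # (-1, -1, 1, 0) and (2, -2, 0, 1)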

Summary: Vector Spaces, Span and Independence


• Vector space: a triple (V, +, ∗) which is closed under + and ∗,
with some additional properties satisfied by + and ∗.
• Subspace: a non-empty subset W of V closed under linear combinations.
Let V = Rm, let v1, . . . , vn be in V, and let A = [v1 · · · vn].

• For v in V, v is in Span{v1, . . . , vn} ⇔ Ax = v is consistent.

• v1, . . . , vn are linearly independent
⇔ N(A) = 0 ⇔ rank(A) = n.
• In particular, with n = m, A is invertible
⇔ Ax = v is consistent for every v
⇔ Span{v1, . . . , vn} = Rn ⇔ rank(A) = n
⇔ N(A) = 0 ⇔ v1, . . . , vn are linearly independent.
• If Span{v1, . . . , vn} = Rm, then m ≤ n, and
any subset of Rm with more than m vectors is dependent.

2.3 BASIS & DIMENSION

Basis: Introduction
       
    Let v1 = [1]    v2 = [2]    v3 = [3]    v4 = [ 5]
             [2] ,       [4] ,       [8] ,       [12] ,  and A = [v1 v2 v3 v4].
             [3]         [6]         [7]         [13]

Can C(A) = Span{v1, v2, v3, v4} be spanned by fewer than 4 vectors?
OBSERVE:
• v2 = 2v1 & v4 = 2v1 + v3 ⇒ C(A) = Span{v1, v3}.
Moreover, the span of only v1 or only v3 is a line.
Thus, {v1, v3} is a minimal spanning set for C(A).
• Clearly v1 is not on the line spanned by v3, or vice versa.
Hence, v1 and v3 are linearly independent vectors in C(A).
Moreover, v in C(A) = Span{v1, v3} ⇒ v1, v3, v are linearly dependent. Why?
Thus, {v1, v3} is a maximal linearly independent set in C(A).
Any such set of vectors gives a basis of C(A).

Basis: Definition
DEFN. A subset B of a vector space V is said to be a basis of V
if it is linearly independent and Span(B) = V.
THEOREM: For any subset S of a vector space V, the following are equivalent:
(i) S is a maximal linearly independent set in V.
(ii) S is linearly independent and Span(S) = V.
(iii) S is a minimal spanning set of V.
REMARKS/EXAMPLES:
• Every vector space V has a basis.
• By convention, the empty set is a basis for V = {0}.
• {e1, . . . , en} is a basis for Rn, called the standard basis.
• A basis of R is just {1}. Is this unique?
• {(−1, 1)T, (0, 1)T} is a basis for R2. So is {e1, e2},
as is the set consisting of the columns of any 2 × 2 invertible matrix.
• Find a basis in all the examples seen so far.

Coordinate Vector: Definition


Let B = {v1, . . . , vn} be a basis for V and v a vector in V.
Span(B) = V ⇒ v = a1 v1 + · · · + an vn for scalars a1, . . . , an.
Linear independence ⇒ this expression for v is unique. Thus:

Every v ∈ V can be uniquely written as a linear combination of {v1, . . . , vn}.

EXERCISE: Prove this.

DEFINITION: If v = a1 v1 + · · · + an vn, then (a1, . . . , an)T ∈ Rn is called
the coordinate vector of v w.r.t. B, denoted [v]B.
NOTE:
[v]B depends not only on the basis B, but also on the order of the elements in B.
POINT TO PONDER: How does [v]B change if B is rewritten as {v2, v1, v3, . . . , vn}?

Coordinate Vectors: Examples

1. Consider the basis B = {v1 = (1, −1)T, v2 = (0, 1)T} of R2, and v = (1, 1)T. Note that
v = 1v1 + 2v2. Hence, the coordinate vector of v w.r.t. B is [v]B = (1, 2)T.

2. EXERCISE: Show that B = {1, x, x2} is a basis of P2
(called the standard basis of P2).
The coordinate vector of v = 2x2 − 3x + 1 w.r.t. B is [v]B = (1, −3, 2)T.

3. EXERCISE: Show that B′ = {1, (x − 1), (x − 1)2, (x − 1)3} is a basis of P3.
HINT: Taylor expansion.
Let B be the standard basis of P3. Then [x3]B = ( , , , )T, and [x3]B′ = ( , , , )T.

IMPORTANT NOTE: To write coordinates, we have to fix a basis B,
and also fix the order of elements in it!

Example: A basis for R2


Pick v1 ≠ 0. Choose v2, not a multiple of v1. For any v in R2, there are unique scalars a
and b such that v = a v1 + b v2,
e.g., pick v1 = (1, −1)T, v2 = (0, 1)T, and let v = (1, 1)T.
[Figure: the grid of lines a = const and b = const determined by v1 and v2, with v = v1 + 2v2.]
Thus the lines a = 0 and b = 0 give a set of axes for R2, and v = v1 + 2v2.
With this basis B = {v1, v2}, the coordinates of v will be [v]B = (1, 2)T.

Dimension of a Vector Space


Q: The number of vectors in each basis of R3 is 3. Why?
RECALL: If v1, . . . , vn span Rm, then m ≤ n, and
if they are linearly independent, then n ≤ m.
DEFN.: More generally, if v1, . . . , vm and w1, . . . , wn are both bases of V, then m = n.
This common number is called the dimension of V. Thus
dim(V) = number of elements in a basis of V.
EXAMPLES: • dim({0}) = 0. • dim(Rn) = n.
• A line through the origin in R3 is of the form
L = {tu | t ∈ R} for some u in R3 \ {0}. A basis for L is { ____ }, and dim(L) = ____.
• The dimension of a plane P in R3 is 2. Why?
• A basis for C as a vector space over R is {1, i}.
A basis for C as a complex vector space is {1},
i.e., dim(C) = 2 as an R-vector space and 1 as a C-vector space.
Thus, dimension depends on the choice of scalars!

Basis & Dimension: Remarks
Let dim(V) = n and S = {v1, . . . , vk} ⊆ V.
RECALL: A basis is a minimal spanning set.
In particular, if Span(S) = V, then k ≥ n, and S contains a basis of V,
i.e., there exists {vi1, . . . , vin} ⊆ S which is a basis of V.
EXAMPLE: The columns of a 3 × 4 matrix A with 3 pivots span R3. Hence the columns
contain a basis of R3.
RECALL: A basis is a maximal linearly independent set.
In particular, if S is linearly independent, then k ≤ n, and S can be extended
to a basis of V, i.e., there exist w1, . . . , wn−k in V such that {v1, . . . , vk, w1, . . . , wn−k}
is a basis of V.
EXAMPLE: A 3 × 2 matrix A with 2 pivots has linearly independent columns,
which can hence be extended to a basis of R3.

Coordinate Vector and Dimension: Example


A basis for M2×2, the vector space of 2 × 2 matrices
(called the standard basis of M), is B = {e11, e12, e21, e22}, where

    e11 = [1 0]    e12 = [0 1]    e21 = [0 0]    e22 = [0 0]
          [0 0] ,        [0 0] ,        [1 0] ,        [0 1] .

(Verify this!) Hence dim(M2×2) = 4.
Every 2 × 2 matrix A = (aij) can be written uniquely as
A = a11 e11 + a12 e12 + a21 e21 + a22 e22.
Thus, the coordinate vector of A with respect to B is
[A]B = (a11, a12, a21, a22)T.
NOTE: [A]B completely determines A, once we fix B and order the elements in B.
Once we fix a basis, we will need 4 coordinates to describe each matrix,
since dim(M2×2) = 4.
EXERCISE: Find two bases of Mm×n (other than the standard one)
and its dimension. Find [e11]B in both cases.

Summary: Basis and Dimension


• A basis of a vector space V is a linearly independent subset B which spans V.
• A basis is a maximal linearly independent subset of V
⇒ any linearly independent subset of V can be extended to a basis of V.
• A basis is a minimal spanning set of V
⇒ every spanning set of V contains a basis.
• The number of elements in each basis is the same,
and the dimension of V, dim(V), is the number of elements in a basis of V.
• B = {v1, . . . , vn} is a basis for V ⇔
every v ∈ V can be uniquely written as a linear combination of {v1, . . . , vn},
and if v = a1 v1 + · · · + an vn, then the coordinate vector of v w.r.t. the basis B is
[v]B = (a1, . . . , an)T ∈ Rn.
• dim(Rn) = n, and the set B = {v1, . . . , vn} is a basis of Rn
⇔ A = [v1 · · · vn] is invertible.

2.4 NULL SPACE, COLUMN SPACE & ROW SPACE

Subspaces Associated to a Matrix
Associated to an m × n matrix A, we have four subspaces:
• The column space of A: C(A) = Span{A∗1, . . . , A∗n}
= {v : Ax = v is consistent} ⊆ Rm.
• The null space of A: N(A) = {x : Ax = 0} ⊆ Rn.
• The row space of A = Span{A1∗, . . . , Am∗} = C(AT) ⊆ Rn.
• The left null space of A = {x : xT A = 0} = N(AT) ⊆ Rm.
Q: Why are the row space and the left null space subspaces?
RECALL: Let U be the echelon form of A, and R its reduced form. Then
N(A) = N(U) = N(R).
OBSERVE: The rows of U (and R) are linear combinations of the rows of A,
and vice versa ⇒ their row spaces are the same, i.e., C(AT) = C(UT) = C(RT).
We compute bases and dimensions of these special subspaces.

An Example
We illustrate how to find a basis and the dimension of the null space N(A), the
column space C(A), and the row space C(AT) using the following example.

    Let A = [2 4 6 4]
            [2 5 7 6] .
            [2 3 5 2]

RECALL:

    • The reduced form of A is R = [1 0 1 -2]
                                   [0 1 1  2] .
                                   [0 0 0  0]

• The 1st and 2nd columns are the pivot columns ⇒ rank(A) = 2.
• v = (a, b, c)T is in C(A) ⇔ Ax = v is solvable ⇔ 2a − b − c = 0.
• We can compute special solutions to Ax = 0.
The number of special solutions to Ax = 0 is the number of free variables.

The Null Space: N(A)

    For A = [2 4 6 4]                         [1 0 1 -2]
            [2 5 7 6] , the reduced form is R = [0 1 1  2] .
            [2 3 5 2]                         [0 0 0  0]

Writing x = (a, b, c, d)T, Rx = 0 gives a = −c + 2d and b = −c − 2d, so

    N(A) = { (−c + 2d, −c − 2d, c, d)T }
         = Span{ w1 = (−1, −1, 1, 0)T, w2 = (2, −2, 0, 1)T }.

w1, w2 are linearly independent. (Why?)
⇒ B = {w1, w2} forms a basis for N(A) ⇒ dim(N(A)) = 2.
A basis for N(A) is the set of special solutions.
dim(N(A)) = no. of free variables = no. of variables − rank(A).
dim(N(A)) is called nullity(A).
SHOW: w = (−3, −7, 5, 1)T is in N(A). Find [w]B.

The Column Space: C(A)

    For A = [2 4 6 4]                         [1 0 1 -2]
            [2 5 7 6] , the reduced form is R = [0 1 1  2] .
            [2 3 5 2]                         [0 0 0  0]

Write A = [v1 v2 v3 v4] and R = [w1 w2 w3 w4].
RECALL: Relations between the column vectors of A are the same as the relations
between the column vectors of R, since N(A) = N(R).
⇒ Ax = v3 has the same solutions as Rx = w3, and Ax = v4 has the same
solutions as Rx = w4.
Particular solutions are (1, 1, 0, 0)T and (−2, 2, 0, 0)T respectively ⇒ v3 = v1 + v2,
v4 = −2v1 + 2v2.
OBSERVE:
• v1 and v2 correspond to the pivot columns of A.
• {v1, v2} are linearly independent. Why?
• C(A) = Span{v1, . . . , v4} = Span{v1, v2}.
Thus B = {v1, v2} is a basis of C(A). Q: What is [vi]B?

The Rank-Nullity Theorem


More generally, for an m × n matrix A,
• Let rank(A) = r. The r pivot columns are linearly independent since their reduced
form contains an r × r identity matrix.
• Each non-pivot column A∗j of A can be written as a linear combination of the pivot
columns, by solving Ax = A∗j . Thus

A basis for C(A) is given by the pivot columns of A.

dim(C(A)) = no. of pivot variables = rank(A).

• A basis for N (A) is given by the special solutions of A. Thus

dim(N (A)) = no. of free variables = nullity(A).

R ANK -N ULLITY T HEOREM: Let A be an m × n matrix. Then

dim(C(A)) + dim(N (A)) = no. of variables = n

The Row Space: C(AT )
R ECALL : If A = [2 4 6 4; 2 5 7 6; 2 3 5 2], then R = [1 0 1 −2; 0 1 1 2; 0 0 0 0].
R ECALL : R is obtained from A by taking non-zero scalar multiples of rows and their
sums ⇒ C(RT ) = C(AT ).
O BSERVE : The non-zero rows of R will span C(AT ), and they contain an identity sub-
matrix ⇒ they are linearly independent.
Thus, the non-zero rows of R form a basis for C(RT ) = C(AT ).
E XERCISE : Give two different bases for C(AT ).
Since the number of non-zero rows of R = number of pivots of A, we have:

dim C(AT )= no. of pivots of A = rank(A).

R ECALL : dim C(AT ) = rank(AT ). Thus,

rank(AT ) = dim (C(AT )) = rank(A)

Extra Reading: The Left Null Space - N (AT )


The no. of columns of AT is m.
By Rank-Nullity Theorem, rank(AT )+ dim(N (AT )) = m.
Hence:
dim(N (AT )) = m− rank(A).
 
E XERCISE : Complete the example by finding a basis for N (AT ), where A = [2 4 6 4; 2 5 7 6; 2 3 5 2]
has reduced form R = [1 0 1 −2; 0 1 1 2; 0 0 0 0].
Q. Can you use R to compute a basis for N (AT )? Why not?
A. We need the reduced form of AT , which is [1 0 2; 0 1 −1; 0 0 0; 0 0 0].

2.5 L INEAR T RANSFORMATIONS

Matrices as Transformations: Examples


     
1. Let A = [1 0; 0 −1]. Then A(x1 , x2 )T = (x1 , −x2 )T . Let x = (2, 1)T . What is Ax?

Ax = (2, −1)T .

How does A transform x? A reflects vectors across the X-axis.


     
2. Let B = [0 1; 1 0]. Then B(x1 , x2 )T = (x2 , x1 )T . If x = (−1, 0.5)T , then Bx = (0.5, −1)T .


How does B transform x? B reflects vectors across the line x1 = x2 .
Q: Do reflections preserve scalar multiples? Sums of vectors?
3. P = [1/2 1/2; 1/2 1/2] transforms x = (x1 , x2 )T to P x = ((x1 + x2 )/2, (x1 + x2 )/2)T .

P e1 = (1/2, 1/2)T = P e2 , and P transforms the vector (−1, 1)T to the origin.
Geometrically, P transforms vectors in R2 by projecting onto the line x1 = x2 .
Q: What happens to sums of vectors when you project them? What about scalar
multiples?
 
E XERCISE : Understand the effect of P = [1 0; 0 0] on e1 and e2 and interpret what P
represents geometrically.
4. Let Q = [1/√2 −1/√2; 1/√2 1/√2] = [cos(45◦ ) − sin(45◦ ); sin(45◦ ) cos(45◦ )].

How does Q transform the standard basis vectors e1 and e2 ?


Qe1 = (1/√2, 1/√2)T and Qe2 = (−1/√2, 1/√2)T .
 
Q: What does the transformation x = (x1 , x2 )T ↦ Qx represent geometrically?
Again note that rotations map sums of vectors to the sums of their images, and a scalar
multiple of a vector to the same scalar multiple of its image.
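All four examples can be checked numerically. A small sketch, assuming Python with NumPy is available:

import numpy as np

A = np.array([[1, 0], [0, -1]])              # reflection across the X-axis
B = np.array([[0, 1], [1, 0]])               # reflection across x1 = x2
P = np.array([[0.5, 0.5], [0.5, 0.5]])       # projection onto x1 = x2
c, s = np.cos(np.pi/4), np.sin(np.pi/4)
Q = np.array([[c, -s], [s, c]])              # rotation by 45 degrees

print(A @ np.array([2.0, 1.0]))              # [ 2. -1.]
print(B @ np.array([-1.0, 0.5]))             # [ 0.5 -1. ]
print(P @ np.array([-1.0, 1.0]))             # [0. 0.]: (-1, 1) goes to the origin
print(Q @ np.array([1.0, 0.0]))              # [0.7071 0.7071] = Qe1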

Matrices as Transformations
• An m × n matrix A transforms a vector x in Rn into the vector Ax in Rm .
Thus T (x) = Ax defines a function T : Rn → Rm .
• The domain of T is . The codomain of T is .
• Let b ∈ Rm . Then b is in C(A) ⇔ Ax = b is consistent
⇔ T (x) = b, i.e., b is in the image (or range) of T .
Hence, the range of T is .
 
2 4 6 4
E XAMPLE : Let A = 2 5 7 6 . Then T (x) = Ax is a function with domain R4 ,
2 3 5 2

codomain R3 , and range equal to C(A) = {(a, b, c)T | 2a − b − c = 0} ⊆ R3 .
Q: How does T transform sums and scalar multiples of vectors?
A. Nicely! For scalars a and b, and vectors x and y,
T (ax + by) = A(ax + by) = aAx + bAy = aT (x) + bT (y). Thus

T takes linear combinations to linear combinations.

Linear Transformations
D EFN . Let V and W be vector spaces.
• A linear transformation from V to W is a function T : V → W such that
for x, y ∈ V , scalars a and b,
T (ax + by) = aT (x) + bT (y)
i.e., T takes linear combinations of vectors in V
to the linear combinations of their images in W .
• If T is also a bijection, we say T is a linear isomorphism.
• The image (or range) of T is defined to be
C(T ) = {y ∈ W | T (x) = y for some x ∈ V }.
• The kernel (or null space) of T is defined as
N (T ) = {x ∈ V | T (x) = 0}.
M AIN E XAMPLE : Let A be an m × n matrix. Define T (x) = Ax.
• This defines a linear transformation T : Rn → Rm .
• The image of T is the column space of A, i.e., C(T ) = C(A).
• The kernel of T is the null space of A, i.e., N (T ) = N (A).

Linear Transformations: Examples


Which of the following functions are linear transformations?
• g : R3 → R3 defined as g(x1 , x2 , x3 )T = (x1 , x2 , 0)T
         
ag(x) + bg(y) = a (x1 , x2 , 0)T + b (y1 , y2 , 0)T = (ax1 + by1 , ax2 + by2 , 0)T = g(ax + by),
so g is a linear transformation.
E XERCISE : Find N (g) and C(g).
• h : R3 → R3 defined as h(x1 , x2 , x3 )T = (x1 , x2 , 5)T .
N OTE : h(0 + 0) ̸= h(0) + h(0).
O BSERVE : A linear transformation must map 0 ∈ V to 0 ∈ W .
• f : R2 → R4 defined by f (x1 , x2 )T = (x1 , 0, x2 , x22 )T .
N OTE : f transforms the Y -axis in R2 to {(0, 0, y, y 2 )T | y ∈ R}.
O BSERVE :
A linear transformation must transform a subspace of V into a subspace of W .
 
• S : M2×2 → R4 defined by S([a b; c d]) = (a, b, c, d)T is a linear transformation.
O BSERVE : S is also a bijection, and hence an isomorphism!
S is onto ⇒ C(S) = R4 , and S(A) = S(B) ⇒ A = B,
i.e., S is one-one. In particular, N (S) = {0}.
• T : R∞ → R∞ defined by T (x1 , x2 , . . .) = (x1 + x2 , x2 + x3 , . . .).
E XERCISE : What is N (T )?
• S : R∞ → R∞ defined by S(x1 , x2 , . . .) = (x2 , x3 , . . .).
E XERCISE : Find C(S), and a basis of N (S).

• Let T : P2 → P1 be T (a0 + a1 x + a2 x2 ) = a1 + 4a2 x.
E XERCISE : Show that dim (N (T )) = 1, and find C(T ).
• Let D : C ∞ ([0, 1]) → C ∞ ([0, 1]) be defined as Df = df /dx.
E XERCISE : Is D2 = D ◦ D linear? What about D3 ?
E XERCISE : What is N (D)? N (D2 )? N (Dk )?
P OINT TO P ONDER : Is integration linear?
O BSERVE : Images and null spaces are subspaces!
Of which vector space?

Properties of Linear transformations


Let B = {v1 , . . . , vn } ⊆ V , T : V → W be linear, and T (B) = {T (v1 ), . . . , T (vn )}.
Then:
• T (au + bv) = aT (u) + bT (v). In particular, T (0) = 0.
• N (T ) is a subspace of V . Why? C(T ) is a subspace of W . Why?
• If Span(B) = V , is Span{T (B)} = W ? N OTE : It is C(T ).
C ONCLUSION : (i) If dim (V ) = n, then dim (C(T )) ≤ n.
(ii) T is onto ⇔ Span{T (B)} = C(T ) = W .
• T (u) = T (v) ⇔ u − v ∈ N (T ).
C ONCLUSION : T is one-one ⇔ N (T ) = 0.
• If B ⊆ V is linearly independent, is {T (B)} ⊆ W linearly independent?
H INT : a1 T (v1 ) + · · · + an T (vn ) = 0 ⇒ a1 v1 + · · · + an vn ∈ N (T ).
• S : U → V , T : V → W are linear ⇒ T ◦ S : U → W is linear.
E XERCISE : Show that N (S) ⊆ N (T ◦ S).
How are C(T ◦ S) and C(T ) related?

Isomorphism of Vector Spaces


R ECALL : A linear map T : V → W is an isomorphism if T is also a bijection.
N OTATION : V ≃ W .
Q: If T : V → W is an isomorphism, is T −1 : W → V linear?
R ECALL : T is one-one ⇔ N (T ) = 0 & T is onto ⇔ C(T ) = W .
Thus T is an isomorphism ⇔ N (T ) = 0 and C(T ) = W .
E XAMPLE : If V is the subspace of convergent sequences in R∞ ,
then L : V → R given by L(x1 , x2 , . . .) = limn→∞ (xn ) is linear.
What is N (L)? C(L)? Is L one-one or onto?
E XERCISE : Given A ∈ Mm×n , let T (x) = Ax for x ∈ Rn .
Then T is an isomorphism ⇔ m = n and A is invertible.
E XERCISE : In the previous examples, identify linear maps which are one-one,
and those which
 are onto.
E XAMPLE : S([a b; c d]) = (a, b, c, d)T is an isomorphism since
N (S) = 0 and C(S) = R4 . Thus M2×2 ≃ R4 . What is S −1 ?

Linear Maps and Basis
• Consider S : M2×2 → R4 given by S([a b; c d]) = (a, b, c, d)T .
Recall that {e11 , e12 , e21 , e22 } is a basis of M2×2
such that A = [a b; c d] = ae11 + be12 + ce21 + de22 .
Observe that S(e11 ) = e1 , S(e12 ) = e2 , S(e21 ) = e3 , S(e22 ) = e4 .
Thus, S(A) = aS(e11 ) + bS(e12 ) + cS(e21 ) + dS(e22 )= ae1 + be2 + ce3 + de4 = (a, b, c, d)T .
G ENERAL CASE :
If {v1 , . . . , vn } is a basis of V , T : V → W is linear, and v ∈ V , then v = a1 v1 + · · · + an vn
⇒ T (v) = a1 T (v1 ) + · · · + an T (vn ). Why? Thus, T is determined by its action on a basis,
i.e., for any n vectors w1 , . . . , wn in W (not necessarily distinct), there is
a unique linear transformation T : V → W such that T (v1 ) = w1 , . . . , T (vn ) = wn .

Finite-dimensional Vector Spaces


I MPOR TANT O BSERVATION : Let dim (V ) = n, and B = {v1 , . . . , vn } be a basis of V .
Define T : V → Rn by T (vi ) = ei .
e.g., If v = v1 + vn , T (v) =? If v = 3v2 − 5v3 , T (v) =?
If v = a1 v1 + · · · + an vn , T (v) =?
Thus T (v) = [v]B .
Is T a linear transformation? What is N (T )? What is C(T )?
C ONCLUSION : If dim (V) = n, then V ≃ Rn .
Q: Is P3 ≃ M2×2 ?
K EY POINT : Composition of isomorphisms is an isomorphism,
and inverse of an isomorphism is an isomorphism.
E XERCISE : Find 3 isomorphisms each from P3 to R4 , and M2×2 to R4 .

Linear maps from Rn to Rm
E XAMPLE : T (e1 ) = (3, 1)T , T (e2 ) = (2, −1)T , T (e3 ) = (−5, 0)T
defines a linear map T : R3 → R2 .
If x = (x1 , x2 , x3 )T , then T (x) = T (x1 e1 + x2 e2 + x3 e3 ) = x1 T (e1 ) + x2 T (e2 ) + x3 T (e3 )
= x1 (3, 1)T + x2 (2, −1)T + x3 (−5, 0)T , i.e., T (x) = Ax, where A = [3 2 −5; 1 −1 0].
Q: A∗j = ? Observe that A∗j = T (ej ).
G ENERAL CASE : If T : Rn → Rm is linear, then
for x = (x1 , . . . , xn )T in Rn ,
T (x) = x1 T (e1 ) + · · · + xn T (en ) = Ax,
where A = [T (e1 ) · · · T (en )] ∈ Mm×n , i.e., A∗j = T (ej ).
D EFN . A is called the standard matrix of T . Thus

Linear transformations from Rn to Rm


are in one-one correspondence with m × n matrices.

P OINT TO P ONDER : Can you imitate this if V and W are not Rn and Rm ?
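The boxed correspondence is constructive: stacking the images of the standard basis vectors as columns produces the standard matrix. A sketch of this construction, assuming NumPy (the helper standard_matrix is ours, not a library routine):

import numpy as np

def standard_matrix(T, n):
    """Column j of the standard matrix is T(e_j)."""
    I = np.eye(n)
    return np.column_stack([T(I[:, j]) for j in range(n)])

# The map T : R^3 -> R^2 from the example above
T = lambda x: np.array([3*x[0] + 2*x[1] - 5*x[2], x[0] - x[1]])
print(standard_matrix(T, 3))    # [[ 3.  2. -5.], [ 1. -1.  0.]]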

Matrix Associated to a Linear Map: Example
S : P2 → P1 given by S(a0 + a1 x + a2 x2 ) = a1 + 4a2 x is linear.
Q: Is there a matrix associated to S?
Expected size: 2 × 3. Why?
I DEA : Construct an associated linear map R3 → R2 . Use coordinate vectors!
Fix bases B = {1, x, x2 } of P2 , and C = {1, x} of P1 to do this.
Identify f = a0 + a1 x + a2 x2 ∈ P2 with [f ]B = (a0 , a1 , a2 )T ∈ R3 ,
and S(f ) ∈ P1 with [S(f )]C = (a1 , 4a2 )T ∈ R2 .
The associated linear map S ′ : R3 → R2 is defined by S ′ (a0 , a1 , a2 )T = (a1 , 4a2 )T ,
i.e., S ′ ([f ]B ) = [S(f )]C , i.e.,
S ′ is defined by S ′ (e1 ) = (0, 0)T , S ′ (e2 ) = (1, 0)T , S ′ (e3 ) = (0, 4)T
⇒ the standard matrix of S ′ is A = [0 1 0; 0 0 4].
Q: How is A related to S?
O BSERVE : A∗1 =[S(1)]C , A∗2 = [S(x)]C , A∗3 = [S(x2 )]C .

Matrix Associated to a Linear Map


E XAMPLE : The matrix of S(a0 + a1 x + a2 x2 ) = a1 + 4a2 x,
w.r.t. the bases B = {1, x, x2 } of P2 and C = {1, x} of P1 , is A = [0 1 0; 0 0 4],
and A∗1 = [S(1)]C , A∗2 = [S(x)]C , A∗3 = [S(x2 )]C .
G ENERAL C ASE : If T : V → W is linear, then the matrix of T with respect to the
ordered bases B = {v1 , . . . , vn } of V , and C = {w1 , . . . , wm } of W , denoted [T ]B,C , is
A = [[T (v1 )]C · · · [T (vn )]C ] ∈ Mm×n .
E XAMPLE : Projection onto the line x1 = x2 :
P ((x1 , x2 )T ) = ((x1 + x2 )/2, (x1 + x2 )/2)T has standard matrix [1/2 1/2; 1/2 1/2].
This is the matrix of P w.r.t. the standard basis.
Q: What is [P ]B,B where B = {(1, 1)T , (−1, 1)T }?
C ONCLUSION : The matrix of a transformation depends on the chosen basis.
Some are better than others!
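For T (x) = Ax on Rn and a basis B with basis matrix Q = [v1 · · · vn ], writing everything in coordinates gives [T ]B,B = Q−1 AQ. A quick numerical check for the projection above, assuming NumPy:

import numpy as np

A = np.array([[0.5, 0.5], [0.5, 0.5]])     # standard matrix of the projection
Q = np.array([[1.0, -1.0], [1.0, 1.0]])    # columns: B = {(1, 1)T, (-1, 1)T}
print(np.linalg.inv(Q) @ A @ Q)            # diag(1, 0): a very good basis!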

Chapter 3. E IGENVALUES , E IGENVECTORS , & D IAGONALIZABILITY

3.1 E IGENVALUES & E IGENVECTORS

Eigenvalues and Eigenvectors: Motivation


• Solve the differential equation (DE) for u: du/dt = 3u.
The solution is u(t) = c e3t ,
where c ∈ R.
With initial condition (IC) u(0) = 2, the solution is u(t) = 2e3t .
• Consider the system of linear 1st order differential equations with constant coefficients:

du1 /dt = 4u1 − 5u2 , du2 /dt = 2u1 − 3u2 ,


How does one find the solution?
• Write the system in matrix form du/dt = Au,
   
where u(t) = (u1 (t), u2 (t))T , A = [4 −5; 2 −3].
• Assuming the solution is u(t) = eλt v, where v = (x, y)T ∈ R2 ,
we need to find λ and v.

Eigenvalues and Eigenvectors: Definition


P ROBLEM : Find u1 (t) = eλt x, u2 (t) = eλt y, given
u′1 = 4u1 − 5u2 , u′2 = 2u1 − 3u2 .
λ eλt x = 4eλt x − 5eλt y,
λ eλt y = 2eλt x − 3eλt y.
Cancelling eλt , we get
E IGENVALUE PROBLEM : Find λ and v = (x, y)T satisfying
4x − 5y = λx, 2x − 3y = λy.
In the matrix form, it is Av = λv .
This equation has two unknowns, λ and v.
D EFN . If there exists a λ such that Av = λv has a non-zero solution v, then λ is called
an eigenvalue of A and all nonzero v satisfying Av = λv are called eigenvectors of A
associated to λ.
Q: How many eigenvalues can A have?
How do we find them & the associated eigenvectors?
Reduce the number of unknowns!

Eigenvalues and Eigenvectors: Solving Ax = λx


Rewrite Av = λv as (A − λI)v = 0.
• λ is an eigenvalue of A
⇔ there is a nonzero v in the nullspace of A − λI
⇔ N (A − λI) ̸= 0, i.e., dim (N (A − λI)) ≥ 1,
⇔ A − λI is not invertible
⇔ det(A − λI) = 0.
• det(A − λI) is a polynomial in the variable λ of degree n. Hence it has at most n roots
⇒ A has at most n eigenvalues.

• det(A − λI) is called the characteristic polynomial of A.
• If λ is an eigenvalue of A, then the nullspace of A − λI is called the
eigenspace of A associated to eigenvalue λ.
P OINT TO P ONDER : When is 0 an eigenvalue of A?

Eigenvalues and Eigenvectors: Example


T O SUMMARISE : An eigenvalue of A is a root (in R) of its characteristic polynomial.
Any non-zero vector in the corresponding eigenspace is an associated eigenvector.
R ECALL : The ODE system we want to solve is
u′1 = 4u1 − 5u2 , u′2 = 2u1 − 3u2 ,
The solutions are u1 (t) = eλt x, u2 (t) = eλt y, where (x, y)T is a solution of:
    
[4 −5; 2 −3] (x, y)T = λ (x, y)T (Av = λv)
The characteristic polynomial of A is det(A − λI)
 
= det [4 − λ −5; 2 −3 − λ] = (4 − λ)(−3 − λ) + 10
= λ2 − λ − 2 = (λ + 1)(λ − 2)
The eigenvalues of A are λ1 = −1, λ2 = 2.
Eigenvectors v1 and v2 associated to λ1 = −1 and λ2 = 2 respectively are in:
N (A − λ1 I) = N (A + I), and N (A − λ2 I) = N (A − 2I).
Solving (A + I)v = 0, i.e., [5 −5; 2 −2] (x, y)T = 0, we get N (A + I) = {(y, y)T | y ∈ R},
and hence v1 = (1, 1)T is an eigenvector associated to λ1 = −1.
Similarly, solving (A − 2I)v = 0 gives N (A − 2I) = {(5y/2, y)T | y ∈ R}.
In particular, v2 = (5, 2)T is an eigenvector associated to λ2 = 2.
Thus, the system du/dt = Au has two special solutions: e−t v1 and e2t v2 .
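A numerical check of this eigenvalue computation, assuming NumPy (which normalizes the eigenvectors and may list the eigenvalues in a different order):

import numpy as np

A = np.array([[4.0, -5.0], [2.0, -3.0]])
evals, evecs = np.linalg.eig(A)
print(evals)    # -1 and 2
print(evecs)    # columns: unit eigenvectors, proportional to (1,1) and (5,2)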

Reading Slide: Complete Solution to ODE


N OTE : When two functions satisfy du/dt = Au, then so do their linear combinations.
     
C OMPLETE SOLUTION : u(t) = c1 e−t v1 + c2 e2t v2 , i.e.,
(u1 (t), u2 (t))T = c1 e−t (1, 1)T + c2 e2t (5, 2)T ,
i.e. u1 (t) = c1 e−t + 5c2 e2t , u2 (t) = c1 e−t + 2c2 e2t .


If we put initial conditions (IC) u1 (0) = 8 and u2 (0) = 5, then
c1 + 5c2 = 8, c1 + 2c2 = 5 ⇒ c1 = 3, c2 = 1.
Hence the solution of the original ODE system with the given IC is
u1 (t) = 3e−t + 5e2t , u2 (t) = 3e−t + 2e2t .

Finding Eigenvalues: Diagonal Matrices
It is easy to find the eigenvalues, and associated eigenvectors, for diagonal matrices.
E XAMPLE : A = [3 0; 0 2].
Characteristic polynomial: (3 − λ)(2 − λ). Eigenvalues: λ1 = 3, λ2 = 2.
Eigenvectors: A∗1 = Ae1 = 3e1 ⇒ we can take v1 = e1 .
Similarly, an eigenvector associated to λ2 = 2 is v2 = e2 .
Note: R2 has a basis consisting of eigenvectors of A: {e1 , e2 }.
G ENERAL CASE : If A is a diagonal matrix with diagonal entries λ1 , · · · , λn , then
Characteristic Polynomial: (λ1 − λ) · · · (λn − λ) Eigenvalues: λ1 , · · · , λn &
n
Eigenvectors: e1 , · · · , en , which form a basis for R .

Finding Eigenvalues: Examples
E XAMPLE : Projection onto the line x = y: P = [1/2 1/2; 1/2 1/2]. v1 = (1, 1)T projects onto
itself ⇒ λ1 = 1 with eigenvector v1 . v2 = (1, −1)T ↦ 0 ⇒ λ2 = 0 with eigenvector v2 .
Further, {v1 , v2 } is a basis of R2 .
Q: Does a collection of eigenvectors always form a basis of Rn ?
A: No! E XAMPLE : For c ∈ R, let A = [c 1; 0 c].
Characteristic Polynomial: det(A − λI) = (c − λ)2 . Eigenvalues: λ = c.
Eigenvectors: (A − cI)v = 0 ⇒ v = (1, 0)T . Q: Is it unique?
Eigenspace of A is 1 dimensional ⇒ R2 has no basis of eigenvectors of A.
P OINT TO P ONDER : What is the advantage of such a basis?

Similarity and Eigenvalues


D EFN . The n × n matrices A and B are similar, if there exists an invertible matrix P
such that P −1 AP = B.
O BSERVE : If B = P −1 AP , then (i) det(A) = det(B), and (ii) B n = P −1 An P for each n ∈ N.
Q: If A is invertible, what more can we say?
T HEOREM : If A and B are similar, then they have the same characteristic polynomial.
Proof. Given: B = P −1 AP . Want: det(A − λI) = det(B − λI).
This is true since A − λI and B − λI are similar!
Indeed, B − λI = P −1 AP − λP −1 P = P −1 (A − λI)P . □
Q: Why care? Write det(A − λI) = (λ1 − λ) · · · (λn − λ).
Compare constant coeff.: det(A) = λ1 · · · λn = det(B);
Compare coeff. of λn−1 :
Sum of diagonal entries = a11 + · · · + ann = Trace of A = λ1 + . . . + λn = Trace of B.
Q: How are eigenvalues of A and B related?

Summary: Eigenvalues and Characteristic Polynomial


Let A be n × n.

1. The characteristic polynomial of A is det(A − λI) (of degree n) and its (real) roots are
the eigenvalues of A.

2. For each eigenvalue λ, the associated eigenspace is N (A − λI). To find it, solve (A −
λI)v = 0. Any non-zero vector in N (A − λI) is an eigenvector associated to λ.

3. If A is a diagonal matrix with diagonal entries λ1 , · · · , λn , then its eigenvalues and


associated eigenvectors are λ1 , · · · , λn and e1 , · · · , en respectively.

4. Write det(A − λI) = (λ1 − λ) · · · (λn − λ) and expand.


Trace of A = λ1 + . . . + λn and det(A) = λ1 · · · λn .
T HUS : If λ1 , . . ., λn are real numbers, then Tr(A) = sum of eigenvalues, and det(A) =
product of eigenvalues.

5. If B is similar to A, then det(B − λI) = det(A − λI).

3.2 D IAGONALIZABILITY

Diagonalizability: Introduction
N OTE : Finding roots of polynomials (and hence eigenvalues) is difficult in general.
For n ≥ 5, no formula exists for the roots. (Abel, Galois)
For n = 3, 4, formulae for the roots exist, but they are not easy to use.
D EFN . An n × n matrix A is diagonalizable if A is similar to a diagonal matrix Λ,
i.e., there is an invertible matrix P and a diagonal matrix Λ such that P −1 AP = Λ.
I MPOR TANCE OF D IAGONALIZABILITY :
Let the n × n matrix A be diagonalizable, i.e., P −1 AP = Λ, where P is invertible and Λ
is diagonal. If this happens,
• The eigenvalues of A are the diagonal entries of Λ,
• det(A) is the product of the diagonal entries of Λ, and
• Trace(A) = sum of the diagonal entries of Λ.
• O THER I NFORMATION : e.g., what is Trace(An )?

Diagonalization: Example 
1 5 6
E XAMPLE : A = 0 2 −4 is triangular.
0 0 3
det(A − λI) = (1 − λ)(2 − λ)(3 − λ) Eigenvalues: λ1 = 1, λ2 = 2, λ3 = 3.
N OTE : If A is triangular, its eigenvalues are on the diagonal
Eigenvectors: v1 = e1 , v2 = (5, 1, 0)T , v3 = (−7, −4, 1)T . (V ERIFY !)
Further, {v1 , v2 , v3 } is a basis of R3 .
Hence P = [v1 v2 v3 ] is invertible, and AP = [Av1 Av2 Av3 ] = [v1 2v2 3v3 ] = P Λ,
where Λ = diag(1, 2, 3). Thus P −1 AP = Λ, i.e., A is diagonalizable.
E XERCISE : If B = {v1 , v2 , v3 }, and T (v) = Av, then [T ]B,B = .

Eigenvalue Decomposition (EVD)
Let A be an n×n matrix with n eigenvectors v1 , . . . , vn , associated to eigenvalues λ1 , . . . , λn .
If B = {v1 , . . . , vn } is a basis of Rn , then the matrix P = [v1 · · · vn ] is invertible.
N OTE : P Λ = [λ1 v1 · · · λn vn ]. (T RUE / F ALSE : ΛP = P Λ).
T HUS :
AP = A[v1 · · · vn ] = [Av1 · · · Avn ] = [λ1 v1 · · · λn vn ] = P Λ, where Λ = diag(λ1 , . . . , λn ).
Therefore P −1 AP = Λ, i.e., A is similar to a diagonal matrix.


 
T HUS : Eigenvectors diagonalize a matrix
 
E IGENVALUE D ECOMPOSITION (EVD): Let A be diagonalizable.
With notation as above, we have A = P ΛP −1 .
This is called the eigenvalue decomposition (EVD) of A.
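A sketch of the EVD in code, using the triangular example above (assuming NumPy):

import numpy as np

A = np.array([[1.0, 5.0, 6.0],
              [0.0, 2.0, -4.0],
              [0.0, 0.0, 3.0]])
lam, P = np.linalg.eig(A)              # columns of P are eigenvectors of A
print(lam)                             # 1, 2, 3 (possibly reordered)
print(np.allclose(A, P @ np.diag(lam) @ np.linalg.inv(P)))   # True: A = P Lam P^{-1}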

Diagonalizability & Eigenvectors


T HEOREM : A is diagonalizable ⇔ A has n linearly independent eigenvectors.
In particular, Rn has a basis consisting of eigenvectors of A.

Proof. (⇐): Done! To prove (⇒), assume P = [v1 · · · vn ] is an invertible matrix such
that P −1 AP = Λ = diag(λ1 , . . . , λn ). Then AP = P Λ,
i.e. [Av1 . . . Avn ] = [λ1 v1 . . . λn vn ]. Therefore v1 , . . . , vn are eigenvectors of A.
They are linearly independent since P is invertible. □
Q: Is every matrix diagonalizable? A: No.
E XAMPLES : Q = [0 −1; 1 0] has no eigenvalues (over R)!
P = [0 1; 0 0] does not have enough eigenvectors!

Summary: Diagonalizability
T HUS : If an n × n matrix A has n linearly independent eigenvectors v1 , . . . , vn , then A is
diagonalizable. Moreover, if λ1 , . . . , λn are the corresponding eigenvalues, then
P −1 AP = Λ, where the diagonalizing matrix is P = [v1 · · · vn ], and the diagonal matrix
is Λ = diag(λ1 , . . . , λn ).
The diagonal entries of Λ are the eigenvalues of A, and
the columns of P are the corresponding eigenvectors of A.
The EVD of A is A = P ΛP −1 .
N OTE : P need not be unique, e.g., replace v1 by 2v1 , etc.

When is A Diagonalizable?
A: When A has n linearly independent eigenvectors!
Q: When does this happen?
• If v1 , . . . , vr are eigenvectors of A associated to distinct eigenvalues λ1 , . . . , λr ,
then v1 , . . . , vr are linearly independent.
Proof. Suppose v1 , . . . , vr are linearly dependent.
Choose a linear relation involving the minimum number of vi ’s, say
(1) a1 v1 + · · · + at vt = 0. (1 < t ≤ r, t is minimal, ai ̸= 0)
Apply A to get a1 λ1 v1 + · · · + at λt vt = 0 (2)
λ1 (1) − (2) gives a2 (λ1 − λ2 )v2 + · · · + at (λ1 − λt )vt = 0,
which contradicts the minimality of t. □
• If A has n distinct eigenvalues, then A is diagonalizable.
Proof. If v1 , . . . , vn are eigenvectors associated to distinct eigenvalues
 λ1 , . . . , λn , then
{v1 , . . . , vn } is linearly independent. Then P = v1 . . . vn is invertible, and
P −1 AP = Λ as seen earlier. Hence A is diagonalizable. □

Reading Slide: Eigenvalues of AB and A + B


• If λ is an eigenvalue of A, µ is an eigenvalue of B, is λµ an eigenvalue of AB?
False Proof. ABv = A(Bv) = A(µv) = µ(Av) = λµv.
This is false since A and B may not have the same eigenvector v.
E XAMPLE : A = [0 1; 0 0], B = [0 0; 1 0], AB = [1 0; 0 0].
The eigenvalues of A and B are 0, 0 and those of AB are 1, 0.
• λ + µ is NOT an eigenvalue of A + B in general.
In the above example, A + B = [0 1; 1 0] has eigenvalues 1, −1.
• If A and B have same eigenvectors associated to λ and µ,
then λµ and λ + µ are eigenvalues of AB and A + B respectively.
Q: When do A and B have the same eigenvectors?

Extra Reading: Simultaneous Diagonalizability


Assume A and B are diagonalizable .
Then A and B have the same eigenvector matrix P if and only if AB = BA.
Proof. (⇒) Assume P −1 AP = Λ1 and P −1 BP = Λ2 , where Λ1 and Λ2 are diagonal matrices.
Then AB = (P Λ1 P −1 )(P Λ2 P −1 ) = P (Λ1 Λ2 )P −1 and BA = P (Λ2 Λ1 )P −1 .
Since Λ1 Λ2 = Λ2 Λ1 , we get AB = BA.
(Part of ⇐) Assume AB = BA. If Av = λv, then ABv = B(Av) = B(λv) = λBv. If Bv = 0,
then v is an eigenvector of B, associated to µ = 0.
If Bv ̸= 0, then v and Bv both are eigenvectors of A, associated to λ.
S PECIAL CASE : Assume all the eigenspaces of A are one dimensional. Then Bv = µv
for some scalar µ ⇒ v is an eigenvector of B. We will not prove the general case.

Eigenvalues of Ak
• If Av = λv, then A2 v = A(Av) = A(λv) = λ(Av) = λ2 v. Similarly Ak v = λk v for any k ≥ 0.
Thus if v is an eigenvector of A with associated eigenvalue λ, then v is also an eigenvector
of Ak with associated eigenvalue λk for k ≥ 0. If A is invertible, then λ ̸= 0. Hence,
the same also holds for k < 0 since A−1 v = λ−1 v.
• If A is diagonalizable , then P −1 AP = Λ is diagonal where columns of P are
eigenvectors of A. Since (P −1 Ak P ) = Λk , which is diagonal, we see that Ak is
diagonalizable, and the eigenvectors of Ak are same as eigenvectors of A.
Similarly, the same also holds for k < 0 if A is invertible.
Q: What is the EVD of Ak ?

Reading Slide: An Application: Fibonacci Numbers


Let F0 = 0, F1 = 1 and Fk = Fk−1 + Fk−2 for k ≥ 2 define the Fibonacci sequence.
What is the kth term?
If uk = (Fk+1 , Fk )T , then (Fk+1 , Fk )T = [1 1; 1 0] (Fk , Fk−1 )T , i.e., uk = Auk−1 for k ≥ 1,
where A = [1 1; 1 0] ⇒ uk = Ak u0 for k ≥ 1.
Characteristic polynomial of A: λ2 − λ − 1; Eigenvalues: λ1 = (1 + √5)/2, λ2 = (1 − √5)/2.
There are 2 distinct eigenvalues ⇒ the associated eigenvectors x1 and x2
are linearly independent ⇒ {x1 , x2 } is a basis for R2 .
Write u0 = c1 x1 + c2 x2 . Then uk = Ak u0 = Ak (c1 x1 + c2 x2 )
= c1 Ak x1 + c2 Ak x2 = c1 ((1 + √5)/2)k x1 + c2 ((1 − √5)/2)k x2 .
Q: Find x1 , x2 , c1 and c2 and get the exact formula for Fk .
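A numerical version of this computation, assuming NumPy; the closed form it leads to is Binet's formula Fk = (λ1 k − λ2 k )/√5:

import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0]])
u = np.array([1.0, 0.0])               # u0 = (F1, F0)
for _ in range(10):
    u = A @ u                          # uk = A uk-1
print(u[1])                            # F10 = 55

lam1, lam2 = (1 + 5**0.5)/2, (1 - 5**0.5)/2
print(round((lam1**10 - lam2**10) / 5**0.5))   # 55 again, from the eigenvalues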

An Application: Predator-Prey Model


Let the owl and rat populations at time k be Ok and Rk respectively.
Owls prey on the rats, so if there are no rats, the population of owls will go down,
say by 50%. If there are no owls to prey on the rats,
then the rat population will increase, say by 10%.
In particular, suppose the rat and owl populations dependence is as follows:

Ok+1 = 0.5Ok + 0.4Rk


Rk+1 = −pOk + 1.1Rk
 
The term −pOk accounts for the rats preyed upon by the owls. Thus, if Pk = (Ok , Rk )T and
A = [0.5 0.4; −p 1.1], then Pk+1 = APk for all k. In particular, Pk = Ak P0 .
E XERCISE : If we start with a certain initial population of owls and rats,
how many will be there in, say, 50 years, i.e., given P0 , what is P50 ?
What is the steady state, i.e., what is limk→∞ Pk ?

An Application: Steady State
Suppose we have a system where the current state uk depends on the previous one uk−1
linearly, i.e., uk = Auk−1 . Then observe that uk = Ak u0 . The steady state of
the system is u∞ = limk→∞ (uk ). How do we find this?
• If u0 is an eigenvector of A associated to λ, then uk = λk u0 .
• Let v1 , . . . , vr be eigenvectors of A associated respectively to λ1 , . . . , λr .
If u0 ∈ Span{v1 , . . . , vr }, i.e., u0 = c1 v1 + · · · + cr vr for scalars c1 , . . . , cr , then
uk = Ak u0 = c1 Ak v1 +· · ·+cr Ak vr = c1 λk1 v1 +· · ·+cr λkr vr . In particular, if A is diagonalizable,
then there is a basis of Rn of eigenvectors of A. Hence, this is applicable to every u0 ∈ Rn .
Let A be diagonalizable, and uk represent population.
• Under what conditions will there be a population explosion?
• What conditions will force the population to become extinct?
• When does it stabilise (to a non-zero value)?
H INT : Depends on |λi |.
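These questions are easy to explore numerically. A sketch for the predator-prey model above (cf. the exercise there), assuming NumPy and taking p = 0.104 purely for illustration:

import numpy as np

p = 0.104                               # illustrative value, not from the model
A = np.array([[0.5, 0.4], [-p, 1.1]])
print(np.linalg.eigvals(A))             # approx [0.58, 1.02]

P = np.array([10.0, 100.0])             # some initial populations (O0, R0)
for _ in range(50):
    P = A @ P
print(P)    # grows roughly like 1.02^k along the dominant eigenvector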

Chapter 4. O R THOGONALITY & P ROJECTIONS

4.1 O R THOGONALITY

Inner product on Rn
D EFN . Define the inner product (dot product) of two vectors v, w ∈ Rn as v · w = v T w
This satisfies the following properties:
For v = (a1 , . . . , an )T , w = (b1 , . . . , bn )T in Rn and c in R
• v · w = a1 b1 + · · · + an bn = w · v (Commutativity) .
• (Bilinearity)
(v + w) · z = v · z + w · z
cv · w = c (v · w) = v · cw.
• v · v ≥ 0 and v · v = 0 if and only if v = 0.

Define the length (or norm) of v in Rn to be ∥v∥ = √(v · v).

Henceforth we use v T w to write the dot product v · w.

Reading Slide: Length of a vector in R3 and Rn


Let v = (2, 3, 2).
By the Pythagoras theorem, ||v|| = √(||(2, 3, 0)||² + ||(0, 0, 2)||²) = √(2² + 3² + 2²) = √17.


Generalize by induction: Let v = (x1 , · · · , xn )T ∈ Rn . Define
∥v∥ = √(∥(x1 , · · · , xn−1 , 0)∥² + ∥(0, 0, · · · , xn )∥²) = √(x1 ² + · · · + xn−1 ² + xn ²) = √(v T v).
The length in Rn is compatible with the vector space structure. Let v, w ∈ Rn and c ∈ R.
Then,
• ∥v∥ ≥ 0 and ∥v∥ = 0 if and only if v = 0.
• ∥cv∥ = |c|∥v∥ • ∥v + w∥ ≤ ∥v∥ + ∥w∥ (Triangle Inequality)

Orthogonal vectors in Rn
Vectors v and w in R3 are orthogonal (perpendicular)
if they satisfy the Pythagoras theorem ⇔ ||v||2 + ||w||2 = ||v − w||2

⇔ ∥v∥2 + ∥w∥2 = (v − w)T (v − w)
= (v T − wT )(v − w)
= v T v − wT v − v T w + wT w
= ∥v∥2 − 2 v T w + ∥w∥2 (since wT v = v T w )

Therefore, we define v and w in Rn to be orthogonal if v T w = 0.


P OINT TO P ONDER :
What can be said about Span{v} and Span{w} when v and w are orthogonal?

Orthogonal and Orthonormal Sets


D EFN . A set of non-zero vectors {v1 , · · · , vk } ⊂ Rn is said to be an orthogonal set
if viT vj = 0 for all i, j = 1, · · · , k, i ̸= j.
E XAMPLES : {(1, 3, 1)T , (−1, 0, 1)T } ⊂ R3 , {(2, 1, 0, −1)T , (0, 1, 0, 1)T , (−1, 1, 0, −1)T } ⊆ R4 ,
{(1/√2, 0, 1/√2)T , (1/√2, 0, −1/√2)T } ⊆ R3 , {e1 , · · · , en } ⊆ Rn .
Of these, the last two examples consist of unit vectors.


D EFN . An orthogonal set {v1 , · · · , vk } ⊆ Rn with all unit vectors, i.e., ∥vi ∥ = 1 for all i,
is called an orthonormal set.
N OTE : {v1 , · · · , vk } is an orthogonal set ⇒ {u1 , · · · , uk } is orthonormal, where ui = vi /||vi ||.
E XERCISE : If S = {v1 , . . . , vk } is an orthogonal set, then vk is orthogonal to
each v ∈ Span{v1 , . . . , vk−1 }.

Orthogonality and Linear Independence


T HEOREM : An orthogonal set in Rn is linearly independent.
Proof. Let {v1 , · · · , vk } be an orthogonal set in Rn , i.e. vi ̸= 0 and viT vj = 0 for i ̸= j.
Note that for i = j, viT vi = ||vi ||2 ̸= 0.
Assume for some a1 , · · · , ak ∈ R,

a1 v1 + a2 v2 + · · · + ak vk = 0
⇒ (a1 v1 + a2 v2 + · · · + ak vk )T v1 = 0 · v1 = 0
⇒ (a1 v1T + a2 v2T + · · · + ak vkT ) v1 = 0
⇒ a1 v1T v1 + a2 v2T v1 + · · · + ak vkT v1 = 0
⇒ a1 ||v1 ||2 = 0
⇒ a1 = 0 since v1 ̸= 0

Similarly, we get a2 = · · · = ak = 0. Hence {v1 , · · · , vk } is linearly independent.


T RUE /F ALSE : Any matrix whose columns form an orthogonal set is invertible.

Matrices with Orthogonal Columns


Let A = [v1 · · · vn ] be m × n. If {v1 , . . . , vn } form an orthonormal set in Rm ,
then AT A has (i, j) entry viT vj , so AT A = In .
D EFN . A square matrix A whose column vectors form an orthonormal set is called an
orthogonal matrix.
If Q = [u1 · · · un ] is an orthogonal matrix, then
• {u1 , . . . , un } is an orthonormal set (by definition)

• QT Q = In = QQT . Why?
• ∥Qv∥ = √((Qv)T (Qv)) = √(v T QT Qv) = √(v T v) = ∥v∥.
⇒ the only (real) eigenvalues of Q, if they exist, are ±1.
• Row vectors of Q are orthonormal, since QQT = I.
E XAMPLES : 1. [cos θ − sin θ; sin θ cos θ]. 2. (1/√2) [1 1; 1 −1].
3. (1/3) [2 1 −2; 2 −2 1; 1 2 2]. 4. (1/2) [1 −1 −1 1; 1 1 1 1; −1 −1 1 1; −1 1 −1 1].
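The defining identities are easy to verify numerically, e.g., for example 3 above (assuming NumPy):

import numpy as np

Q = (1.0/3) * np.array([[2, 1, -2],
                        [2, -2, 1],
                        [1, 2, 2]])
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: QT Q = I
print(np.allclose(Q @ Q.T, np.eye(3)))   # True: rows are orthonormal too
v = np.array([1.0, 2.0, 3.0])
print(np.linalg.norm(Q @ v), np.linalg.norm(v))   # equal: Q preserves lengths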

Orthogonal Basis
D EFN . A basis B = {v1 , . . . , vk } of a subspace V of Rn is an orthogonal basis
if it is an orthogonal set, i.e., viT vj = 0 for i ̸= j.
Furthermore, if ||vi || = 1 for each i, then B is an orthonormal basis (or o.n.b.) of V .
E XAMPLE : Consider the bases of R2 : B1 = {w1 = (8, 0)T , w2 = (6, 3)T },
B2 = {(8, 0)T , (0, 3)T } and B3 = {(8/√(8² + 0²), 0)T , (0, 3/√(0² + 3²))T }.

Then B1 is not orthogonal, B2 is an orthogonal basis, but not an orthonormal basis,


and B3 is an orthonormal basis of R2 .
N OTE : If {u1 , . . . , uk } ⊆ Rn is an orthonormal set, then it is an o.n.b. of V = Span{u1 , . . . , uk }.

Importance of Orthogonal Basis


E XAMPLE : The set B = {v1 = (−1, 1)T , v2 = (1, 1)T } is an orthogonal basis of R2 .
• Find [v]B = (a, b)T : v = av1 + bv2 = a(−1, 1)T + b(1, 1)T .
v1T v = (−1, 1)v = a(−1, 1)(−1, 1)T = 2a = a||v1 ||².
Then a = v1T v/2 = v1T v/||v1 ||² and b = v2T v/2 = v2T v/||v2 ||².
G ENERAL C ASE : If B = {v1 , . . . , vn } is an o.n.b of V ,
then [v]B = (c1 , . . . , cn )T , where cj = vjT v.
Moreover, if T : V → V is linear and [T ]B,B = [aij ], then
[T ]B,B = [[T (v1 )]B · · · [T (vn )]B ] ⇒ aij = .
P OINT TO P ONDER : When does T map orthogonal sets to orthogonal sets?

4.2 P ROJECTIONS & THE L EAST S QUARES M ETHOD

Orthogonal Basis and Projections


Every subspace of Rn has an orthogonal basis. To construct one, we can start with any
basis and modify it (Gram-Schmidt process). First we see what happens in R2 .
[Figure: v2 splits as projv1 (v2 ) along v1 , plus the orthogonal part v2 − projv1 (v2 ).]

To construct an orthogonal basis in Rn , we need to know how to find projv1 (v2 ) in Rn .

Orthogonal Projections in Rn
If v (̸= 0), w ∈ Rn , then projv (w) is a multiple of v, and w − projv (w) is orthogonal to v.
Thus

projv (w) = av for some a ∈ R, and
v T (w − projv (w)) = 0
⇔ v T w − a v T v = 0 ⇔ a = v T w/v T v.
Therefore projv (w) = (v T w/v T v) v.
E XAMPLE . If w = (1, 1, 1)T and v = (1, 2, 3)T , then the orthogonal projection
of w on Span{v} is given by projv (w) = (v T w/v T v) v = (6/14) v.
A PPLICATION : This can be used to “solve” an inconsistent system of equations.
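In code, the projection formula is one line (assuming NumPy):

import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 1.0, 1.0])
proj = (v @ w) / (v @ v) * v      # (vT w / vT v) v = (6/14) v
print(proj)                       # approx [0.4286 0.8571 1.2857]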

Linear Least Squares and Projections


Suppose the system Ax = b is inconsistent, i.e. b ̸∈ C(A).
The error E = ∥Ax − b∥ is the distance from b to Ax ∈ C(A).

We want the least square solution x̂ which minimizes E,


i.e., we want to find b̂ closest to b such that Ax̂ = b̂ is a consistent system.
Therefore, b̂ = projC(A) (b) and Ax̂ = b̂.
The error vector e = b − Ax̂ must be perpendicular to C(A), which is also the row space
of AT . So, e must be in the left null space of A, N (AT ), i.e.,
AT (b − Ax̂) = 0 or AT A x̂ = AT b
Therefore, to find x̂, we need to solve AT A x̂ = AT b.
Let A be m × n. Then AT A is a symmetric n × n matrix.
We now find a condition on A which implies that AT A is invertible.
• N (AT A) = N (A) .
Proof. Ax = 0 ⇒ AT Ax = 0. So, N (A) ⊆ N (AT A).
For the other inclusion, take x ∈ N (AT A).
AT Ax = 0 ⇒ xT (AT Ax) = (Ax)T (Ax) = ∥Ax∥2 = 0 ⇒ Ax = 0, i.e., x ∈ N (A).
• Since N (A) = N (AT A), by rank-nullity theorem, rank(A) = n−dim (N (A)) = rank(AT A).
• A has linearly independent columns ⇔ rank(A) = n ⇔ rank(AT A) = n
⇔ AT A is invertible.
• If rank(A) = n, then the least square solution of Ax = b is given by AT A x̂ = AT b ⇒
x̂ = (AT A)−1 AT b and the orthogonal projection of b on C(A) is b̂ = Ax̂ = P b ,
where P = A(AT A)−1 AT is the projection matrix. Q: Is P 2 = P ?

Linear Least Squares: Example
E XAMPLE : Find the least square solution to the system
   
−1 2 4
 2 −3  x =  1  (Ax = b)
−1 3 2
   
We need to solve AT A x̂ = AT b. Now AT b = (−4, 11)T and AT A = [6 −11; −11 22].
[AT A | AT b] = [6 −11 | −4; −11 22 | 11] → [6 −11 | −4; 0 11/6 | 11/3].
Therefore x̂2 = 2, and x̂1 = 3.

E XERCISE : Find the projection matrix P , and check that P b = Ax̂.
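The worked example (and the exercise) can be verified numerically, assuming NumPy:

import numpy as np

A = np.array([[-1.0, 2.0], [2.0, -3.0], [-1.0, 3.0]])
b = np.array([4.0, 1.0, 2.0])

xhat = np.linalg.solve(A.T @ A, A.T @ b)       # normal equations
print(xhat)                                     # [3. 2.]

P = A @ np.linalg.inv(A.T @ A) @ A.T            # projection matrix onto C(A)
print(np.allclose(P @ b, A @ xhat))             # True: P b = A xhat
print(np.allclose(P @ P, P))                    # True: P^2 = P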

Reading Slide: Linear Least Squares: Application

Hooke’s Law states that displacement x of the spring is directly proportional to


the load (mass) applied, i.e., m = kx.
A student performs experiments to calculate spring constant k. The data collected says
for loads 4, 7, 11 kg applied, the displacement is 3, 5, 8 inches respectively. Hence we
have:
   
(3, 5, 8)T k = (4, 7, 11)T (ak = b).

Clearly the data is inconsistent.


Allowing for various errors, how do we find an estimate for k?
The method of least squares allows us to find a consistent system “close” to this one!
E XERCISE : Estimate k using the method of least squares.

Reading Slide: Line of Best Fit: Example


Q: We want to find the best line y = C + Dx which fits the given data
and gives the least square error. Data: (x, y) = (−2, 4), (−1, 3), (0, 1), and (2, 0).
   
The system [1 −2; 1 −1; 1 0; 1 2] (C, D)T = (4, 3, 1, 0)T (Ax = b) is inconsistent.
Find the least square solution by solving AT A x̂ = AT b.
Q: Find the best quadratic curve y = C + Dx + Ex2 which fits the above data
and gives least square error.
H INT. The first row of the matrix A in this case will be [1 − 2 4].
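A sketch of both fits, assuming NumPy (np.linalg.lstsq solves the least squares problem directly):

import numpy as np

x = np.array([-2.0, -1.0, 0.0, 2.0])
y = np.array([4.0, 3.0, 1.0, 0.0])

A = np.column_stack([np.ones_like(x), x])          # columns: 1 and x
print(np.linalg.lstsq(A, y, rcond=None)[0])        # (C, D) for the best line

A2 = np.column_stack([np.ones_like(x), x, x**2])   # add a column of x^2
print(np.linalg.lstsq(A2, y, rcond=None)[0])       # (C, D, E) for the best parabola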

4.3 G RAM -S CHMIDT P ROCESS & A PPLICATIONS

Gram-Schmidt Process
If the set of vectors v1 , . . . , vr in Rn are linearly independent, then we can find an
orthonormal set of vectors q1 , . . . , qr such that Span{v1 , . . . , vr } = Span{q1 , . . . , qr }.
First find an orthogonal set.
Let w1 = v1 , w2 = v2 − projw1 (v2 ). Then w1 ⊥ w2 and Span{v1 , v2 } = Span{w1 , w2 }.
Let c1 w1 + c2 w2 be the projection of v3 on Span{w1 , w2 }.
Then (v3 − c1 w1 − c2 w2 ) ⊥ w1 and (v3 − c1 w1 − c2 w2 ) ⊥ w2 . ⇒ w1T (v3 − c1 w1 − c2 w2 ) = 0
⇒ c1 w1 = projw1 (v3 ) and similarly c2 w2 = projw2 (v3 ). Therefore,
w3 = v3 − projSpan{w1 ,w2 } (v3 ) = v3 − (w1T v3 /∥w1 ∥²) w1 − (w2T v3 /∥w2 ∥²) w2 .
Span{v1 , v2 , v3 } = Span{w1 , w2 , w3 } and w1T w3 = 0, w2T w3 = 0.
By induction,

wr = vr − projSpan{w1 ,...,wr−1 } (vr ) = vr − projw1 (vr ) − projw2 (vr ) − · · · − projwr−1 (vr )
= vr − (w1T vr /∥w1 ∥²) w1 − (w2T vr /∥w2 ∥²) w2 − · · · − (wr−1 T vr /∥wr−1 ∥²) wr−1 .
Now take q1 = w1 /∥w1 ∥, q2 = w2 /∥w2 ∥, . . . , qr = wr /∥wr ∥. Then {q1 , . . . , qr } is an orthonormal set
and W = Span{v1 , . . . , vr } = Span{w1 , . . . , wr } = Span{q1 , . . . , qr }.
In particular, {q1 , q2 , . . . , qr } is an orthonormal basis for W .
E XERCISE : Show that if {w1 , . . . , wr } is an orthogonal set, then
projSpan{w1 ,...wr } (v) = projw1 (v) + projw2 (v) + · · · + projwr (v).
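A direct transcription of the process into code, assuming NumPy (a sketch: production code would use a numerically stabler variant, or a library QR routine):

import numpy as np

def gram_schmidt(V):
    """Columns of V: linearly independent v1, ..., vr.
    Returns Q whose columns are the orthonormal q1, ..., qr."""
    Q = []
    for v in V.T:                        # iterate over the columns of V
        w = v.astype(float)
        for q in Q:
            w = w - (q @ v) * q          # subtract proj_q(v)
        Q.append(w / np.linalg.norm(w))
    return np.column_stack(Q)

# The vectors of the example on the next slide
V = np.array([[3.0, -5.0, 1.0],
              [1.0, 1.0, 1.0],
              [-1.0, 5.0, -2.0],
              [3.0, -7.0, 8.0]])
Q = gram_schmidt(V)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: orthonormal columns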

Gram-Schmidt Method: Example
Q: Let S = {v1 = (3, 1, −1, 3)T , v2 = (−5, 1, 5, −7)T , v3 = (1, 1, −2, 8)T } and W = Span(S).
Find an orthonormal basis for W .
E XERCISE : First verify that {v1 , v2 , v3 } are linearly independent.
(Check that the rank of [v1 v2 v3 ] is 3.) Hence S is a basis of W .
Use the Gram-Schmidt method: w1 = v1 , w2 = v2 − (w1T v2 /∥w1 ∥²) w1
   
⇒ w2 = v2 − ((−15 + 1 − 5 − 21)/(9 + 1 + 1 + 9)) w1 = v2 − (−40/20) w1 = v2 + 2w1 = (1, 3, 3, −1)T .
O BSERVE : v1 , v2 ∈ Span{w1 , w2 }, w1 , w2 ∈ Span{v1 , v2 } ⇒ Span{v1 , v2 } = Span{w1 , w2 }.
C HECK : w1T w2 = 0.
Now w3 = v3 − (w1T v3 /∥w1 ∥²) w1 − (w2T v3 /∥w2 ∥²) w2 = v3 − ((3 + 1 + 2 + 24)/20) w1 − ((1 + 3 − 6 − 8)/20) w2
⇒ w3 = v3 − (3/2) w1 + (1/2) w2 = (1, 1, −2, 8)T − (3/2)(3, 1, −1, 3)T + (1/2)(1, 3, 3, −1)T = (−3, 1, 1, 3)T .

Check w1T w3 = 0 = w2T w3 and Span{v1 , v2 , v3 } = Span{w1 , w2 , w3 }. Hence {w1 , w2 , w3 } is an
orthogonal basis of W . An orthonormal basis for W is {(1/√20) w1 , (1/√20) w2 , (1/√20) w3 }.

QR Factorization
Let A = [v1 · · · vr ] be an n × r matrix of rank r. Then v1 , . . . , vr are linearly independent
vectors in Rn . By the Gram-Schmidt method, we get an orthonormal basis
{q1 , . . . , qr } of C(A), where qi = wi /∥wi ∥, w1 = v1 , and for k > 1,
wk = vk − (w1T vk /∥w1 ∥²) w1 − · · · − (wk−1 T vk /∥wk−1 ∥²) wk−1 .

Let Q = [q1 . . . qr ]. How are A and Q related?
Note that Span{v1 , . . . , vk } = Span{w1 , . . . , wk } = Span{q1 , . . . , qk } for all k.
If vk = c1 q1 + . . . + ck qk , then c1 = q1T vk , c2 = q2T vk , . . . , ck = qkT vk .
Hence vk = (q1T vk )q1 + . . . + (qkT vk )qk for each k.
Therefore
[v1 v2 . . . vr ] = [q1 q2 . . . qr ] [q1T v1 q1T v2 · · · q1T vr ; 0 q2T v2 · · · q2T vr ; · · · ; 0 0 · · · qrT vr ],

i.e. A = QR , where the columns of Q form an orthonormal set and R is an


invertible r × r matrix. Q: Why is R invertible?
This is called QR-factorization of A.
• If A is an n × n invertible matrix, then A = QR , where Q is an n × n orthogonal matrix
and R is an n × n invertible upper triangular matrix.
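NumPy computes this factorization directly; a quick check on the earlier example (note that a library may flip the signs of some columns of Q and rows of R):

import numpy as np

A = np.array([[3.0, -5.0, 1.0],
              [1.0, 1.0, 1.0],
              [-1.0, 5.0, -2.0],
              [3.0, -7.0, 8.0]])
Q, R = np.linalg.qr(A)                   # Q is 4 x 3, R is 3 x 3 upper triangular
print(np.allclose(A, Q @ R))             # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # True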

Diagonalizing Symmetric Matrices: Example
E XAMPLE : Let A = [1 1 1; 1 1 1; 1 1 1]. Then A − λI = [1−λ 1 1; 1 1−λ 1; 1 1 1−λ], and
det(A − λI) = (3 − λ)λ². Eigenvalues: λ1 = 3, λ2 = 0, λ3 = 0.
To find N (A − 3I), solve (A − 3I)(x, y, z)T = 0, where A − 3I = [−2 1 1; 1 −2 1; 1 1 −2].
N (A) is the plane x + y + z = 0.
Hence, the associated eigenvectors are v1 = (1, 1, 1)T , v2 = (−1, 0, 1)T and v3 = (0, −1, 1)T .
Note that v2 and v3 are linearly independent in N (A). Observe v1T v2 = 0 = v1T v3 .
How do we get an orthogonal Q such that A = QΛQT , where Λ is diagonal
with entries 3, 0, 0 on the diagonal?
S TEPS : 1. Let u1 = v1 /∥v1 ∥.
2. Start with the basis {v2 , v3 } of N (A), and apply the Gram-Schmidt process to get an
orthonormal basis {u2 , u3 } for N (A). Note that u2 and u3 are eigenvectors of A associated
to λ = 0, and are linearly independent since they are non-zero orthogonal vectors.
3. Then Q = [u1 u2 u3 ] is orthogonal, and Q−1 AQ = Λ.
4. Since Q−1 = QT , A = QΛQT .

Diagonalizing Symmetric Matrices
Let A be a symmetric matrix, which is diagonalizable. Then there is an orthogonal
matrix Q, and a diagonal matrix Λ such that A = QΛQT .
O BSERVE : Eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. Let λ and µ be distinct eigenvalues of A with associated eigenvectors v and w
respectively. Now, λ(v T w) = (λv)T w = (Av)T w = v T (AT w) = v T (Aw) = µ(v T w).
Since λ ̸= µ, this implies v T w = 0, proving the result.
Step 1: Find the eigenvalues and the respective eigenvectors.
Step 2: Use the Gram-Schmidt process to get an orthogonal basis for each eigenspace.
T HEOREM : (Real Spectral Theorem)
Every symmetric matrix (with real entries) is diagonalizable, and hence
decomposes as above.
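For symmetric matrices, NumPy's eigh routine returns the orthogonal Q of the Spectral Theorem directly. A check on the example above:

import numpy as np

A = np.ones((3, 3))                      # the symmetric example above
lam, Q = np.linalg.eigh(A)               # eigenvalues ascending; Q orthogonal
print(lam)                               # approx [0, 0, 3], up to rounding
print(np.allclose(A, Q @ np.diag(lam) @ Q.T))   # True: A = Q Lam QT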

Appendix I - D ETERMINANTS

Reading Slide: Determinants: Key Properties


Let A and B be n × n matrices, and c a scalar.

• T RUE /F ALSE : det(A + B) = det(A) + det(B).

• T RUE /F ALSE : det(cA) = c det(A).

• det(AB) = det(A)det(B).

• det(A) = det(AT ).

• If A is orthogonal, i.e., AAT = I, then det(A) =

• If A = [aij ] is triangular, then det(A) =

• A is invertible ⇔ det(A) ̸= 0.
If this happens, then det(A−1 ) =

• If B = P −1 AP for an invertible matrix P ,


i.e., A and B are similar, then det(B) =

• If A is invertible, and d1 , . . . , dn are the pivots of A,


then det(A) = .

Reading Slide: Determinants: Defining Properties


D EFN . The determinant function det : Mn×n (R) → R can be defined (uniquely)
by its three basic properties.
• det(I) = 1.
• The sign of determinant is reversed by a row exchange.
Thus, if B = Pij A, i.e., B is obtained from A by exchanging two rows,
then det(B) = −det(A). In particular, det(I) = 1 ⇒ det(Pij ) = −1.
• det is linear in each row separately, i.e., if we fix n − 1 row vectors, say v2 , · · · , vn ,
then det [− v2 · · · vn ]T : Rn → R is a linear function.
I.e., for c, d in R, and vectors u and v, if A1∗ = cu + dv, we have
det [cu + dv A2∗ · · · An∗ ]T = c det [u A2∗ · · · An∗ ]T + d det [v A2∗ · · · An∗ ]T .
There are n such equations (for n choices of rows).

Reading Slide: Determinants: Induced Properties

1. If two rows of A are equal, then det(A) = 0.


Proof. Suppose i-th and j-th rows of A are equal, i.e., Ai∗ = Aj∗ , then A = Pij A.

Hence det(A) = det(Pij A) = −det(A) ⇒ det(A) = 0 .

2. If B is obtained from A by Ri ↦ Ri + aRj , then det(B) = det(A). (Use linearity in the


ith row)

3. If A is n × n, and its row echelon form U is obtained without row exchanges, then
det(U ) = det(A).
Q: What happens if there are row exchanges? Exercise!

4. If A has a zero row, then det(A) = 0.
Proof: Let the ith row of A be zero, i.e., Ai∗ = 0.
Let B be obtained from A by Ri = Ri + Rj , i.e., B = Eij (1)A. Then Bi∗ = Bj∗ .

E XERCISE : Complete the proof.

Reading Slide: Determinants: Special Matrices


 
5. If A = diag(a1 , . . . , an ) is diagonal, then det(A) = a1 · · · an . (Use linearity.)

6. If A = (aij ) is triangular, then det(A) = a11 . . . ann .


Proof. If all aii are non-zero, then by elementary row operations, A reduces to the
diagonal matrix diag(a11 , . . . , ann ), whose determinant is a11 · · · ann .

If at least one diagonal entry is zero, then elimination will produce a zero row ⇒
det(A) = 0.

Reading Slide: Formula for Determinant: 2 × 2 case


Write (a, b) = (a, 0) + (0, b), the sum of vectors in coordinate directions.
Similarly write (c, d) = (c, 0) + (0, d). By linearity,
|a b; c d| = |a 0; c d| + |0 b; c d| = |a 0; c 0| + |a 0; 0 d| + |0 b; c 0| + |0 b; 0 d|.
For an n × n matrix, each row splits into n coordinate directions,
so the expansion of det(A) has n^n terms.
However, when two rows are in the same coordinate direction, that term will be zero, e.g.,
|a 0; c 0| = 0, |0 b; 0 d| = 0, |0 b; c 0| = −|c 0; 0 b| = −bc ⇒ |a b; c d| = ad − bc.

The non-zero terms have to come in different columns.


So, there will be n! such terms in the n × n case.

Reading Slide: Formula for Determinant: n × n case


For an n × n matrix A = (aij ),
det(A) = Σall permutations P (a1α1 . . . anαn ) det(P ).

The sum is over the n! permutations of (1, . . . , n). Here a permutation (i1 , i2 , . . . , in )
of (1, 2, . . . , n) corresponds to the permutation matrix P with rows eTi1 , . . . , eTin .
Then det(P ) = +1 if the number of row exchanges in P needed to get I is even, and −1
if it is odd.
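The formula can be transcribed literally (plain Python; usable only for small n, since it sums n! terms):

from itertools import permutations

def sign(p):
    """+1 for an even permutation, -1 for an odd one (count inversions)."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    n = len(A)
    total = 0
    for p in permutations(range(n)):     # alpha ranges over all permutations
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

print(det([[1, 2], [3, 4]]))                      # -2
print(det([[1, 5, 6], [0, 2, -4], [0, 0, 3]]))    # 6 = 1 * 2 * 3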

Reading Slide: Cofactors: 3 × 3 Case

det(A) = |a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 |
= a11 a22 a33 (1) + a11 a23 a32 (−1) + a12 a21 a33 (−1)
+ a12 a23 a31 (1) + a13 a21 a32 (1) + a13 a22 a31 (−1)
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 )
= a11 |a22 a23 ; a32 a33 | + a12 (−1)|a21 a23 ; a31 a33 | + a13 |a21 a22 ; a31 a32 |
= a11 C11 + a12 C12 + a13 C13 , where
C11 = |a22 a23 ; a32 a33 |, C12 = (−1)|a21 a23 ; a31 a33 |, C13 = |a21 a22 ; a31 a32 |.

Reading Slide: Cofactors: n × n Case


Let C1j be the coefficient of a1j in the expansion
det(A) = Σall permutations P (a1α1 . . . anαn ) det(P ).
Then det(A) = a11 C11 + a12 C12 + . . . + a1n C1n , where
C1j = Σ a2β2 . . . anβn det(P ) = (−1)1+j det(M1j ),

where M1j is obtained from A by deleting the 1st row and j th column.

Extra Reading: det(AB) = det(A) det(B) (Proof)

7.
det(AB) = det(A) det(B)

Proof. We may assume that B is invertible. Else, rank(AB) ≤ rank(B) < n
⇒ AB is not invertible, and both sides are 0.
Hint: For fixed B, show that the function d defined by

d(A) = det(AB)/det(B)

satisfies the following properties

(a) d(I) = 1.
(b) If we interchange two rows of A, then d changes its sign.
(c) d is a linear function in each row of A.

Then d is the unique determinant function det and det(AB) = det(A) det(B). □

Extra Reading: Determinants of Transposes (Proof)

8.
det(A) = det(AT )

Proof. With U , L, and P as usual, write P A = LU ⇒ AT P T = U T LT . Since U and L are
triangular, we get det(U ) = det(U T ) and det(L) = det(LT ).
Since P P T = I and det(P ) = ±1, we get det(P ) = det(P T ).
Thus det(A) = det(AT ). □

Extra Reading: Determinants and Invertibility (Proof)

9. A is invertible if and only if det(A) ̸= 0.


By elimination, we get an upper triangular matrix U , a lower triangular matrix L with
diagonal entries 1, and a permutation matrix P , such that P A = LU .
O BSERVATION 1: If A is singular, then det(A) = 0.
This is because elimination produces a zero row in U and hence det(A) = ± det(U ) = 0.

O BSERVATION 2: If A is invertible, then det(A) ̸= 0.


This is because elimination produces n pivots, say d1 , . . . , dn , which are non-zero.
Then U is upper triangular, with diagonal entries d1 , . . . , dn ⇒ det(A) = ± det(U ) =
± d1 · · · dn ̸= 0.
T HUS we have: A invertible ⇒ det(A) = ±(product of pivots).
E XERCISE : If AB is invertible, then so are A and B.
E XERCISE : A is invertible if and only if AT is invertible.

Extra Reading: Determinant: Geometric Interpretation (2 × 2)


I NVERTIBILITY : Very often we are interested in knowing when a matrix is invertible.
Consider A = [a b; c d]. Then A is invertible if and only if A has full rank.
If a, c both are zero then clearly rank(A) < 2 ⇒ A is not invertible.
   
Assume a ̸= 0, else interchange rows. The row operation R2 ↦ R2 − (c/a)R1 gives
[a b; c d] → [a b; 0 d − cb/a],
which shows that A is invertible if and only if d − cb/a ̸= 0, i.e., ad − bc ̸= 0.
A REA : The area of the parallelogram with sides as vectors v = (a, b) and w = (c, d) is
equal to |ad − bc|. Thus,
A 2 × 2 matrix A is singular ⇔ its columns are on the same line ⇔ the area is zero. 
 

Extra Reading: Determinant: Geometric Interpretation
• Test for invertibility: An n × n matrix A is invertible ⇔ det(A) ̸= 0.
• n-dimensional volume: If A is n × n, then |det(A)| = the volume of the box
(in n-dimensional space Rn ) with edges as rows of A.
Examples: (1) The volume (area) of a line in R2 = 0.
(2) The determinant of the matrix A = [1 2; 1 −2] is −4.
(3) The volume of the box (parallelogram) with edges as the rows of A (or the columns
of A) is |det(A)| = 4.

Extra Reading: Expansion along the i-th row (Proof)


If Cij is the coefficient of aij in the formula of det(A), then
det(A) = ai1 Ci1 + . . . + ain Cin , where Cij is determined as follows:
By i − 1 row exchanges on A, get the matrix B = [Ai∗ A1∗ · · · A(i−1)∗ A(i+1)∗ · · · An∗ ]T .
Since det(A) = (−1)i−1 det(B), we get
Cij (A) = (−1)i−1 C1j (B) = (−1)i−1 (−1)j−1 det(M ),
where M is obtained from B by deleting its first row and j-th column, and hence from A
by deleting the i-th row and j-th column. Write M as Mij . Then Cij = (−1)i+j det(Mij ).

Extra Reading: Expansion along the j-th column (Proof)


Note that Cij (AT ) = Cji (A).
Hence, if we write AT = (bij ), then
det(A) = det(AT )
= bj1 Cj1 (AT ) + . . . + bjn Cjn (AT )
= a1j C1j (A) + . . . + anj Cnj (A)
This is the expansion of det(A) along j-th column of A.

Extra Reading: Applications: 1. Computing A−1


If C = (Cij ) is the cofactor matrix of A, then A C T = det(A) I,
i.e., [a11 . . . a1n ; . . . ; an1 . . . ann ][C11 . . . Cn1 ; . . . ; C1n . . . Cnn ] = diag(det(A), . . . , det(A)).
Proof. We have seen that ai1 Ci1 + . . . + ain Cin = det(A). Now a11 C21 + a12 C22 + . . . + a1n C2n
is the determinant of the matrix obtained from A by replacing its 2nd row by its 1st,
which has two equal rows, so it is 0.
Similarly, if i ̸= j, then ai1 Cj1 + ai2 Cj2 + . . . + ain Cjn = 0. □

R EMARK . If A is invertible, then A−1 = (1/det(A)) C T .
For n ≥ 4, this is not a good formula to find A−1 . Use elimination to find A−1 for n ≥ 4.
This formula is of theoretical importance.

Extra Reading: Applications: 2. Solving Ax = b


C RAMER ’ S RULE : If A is invertible, then Ax = b has a unique solution.
  
x = A−1 b = (1/det(A)) C T b = (1/det(A)) [C11 · · · Cn1 ; . . . ; C1n · · · Cnn ] (b1 , . . . , bn )T .
Hence xj = (1/det(A)) (b1 C1j + b2 C2j + · · · + bn Cnj ) = det(Bj )/det(A), where Bj is obtained by
replacing the j th column of A by b, and det(Bj ) is computed along the j th column.
R EMARK : For n ≥ 4, use elimination to solve Ax = b. Cramer’s rule is of theoretical
importance.
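A sketch of Cramer's rule in code (assuming NumPy; as remarked, this is for illustration, not for efficiency):

import numpy as np

def cramer(A, b):
    """Solve Ax = b via xj = det(Bj)/det(A)."""
    dA = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b                    # replace the j-th column of A by b
        x[j] = np.linalg.det(Bj) / dA
    return x

A = np.array([[1.0, 2.0], [1.0, -2.0]])
b = np.array([3.0, -1.0])
print(cramer(A, b))                     # [1. 1.]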

Extra Reading: Applications: 3. Volume of a box


Assume the rows of A are mutually orthogonal. Then
AAT = [A1∗ ; . . . ; An∗ ][(A1∗ )T . . . (An∗ )T ] = diag(l1 ², . . . , ln ²),
where li = √(Ai∗ · Ai∗ ) is the length of Ai∗ . Since det(A) = det(AT ), we get |det(A)| = l1 · · · ln .

Since the edges of the box spanned by the rows of A are at right angles, the volume of the
box = the product of the lengths of the edges = |det(A)|.

Extra Reading: Applications: 4. A Formula for Pivots


O BSERVATION : If row exchanges are not required, then the first k pivots are determined
by the top-left k × k submatrices Ãk of A.
E XAMPLE . If A = [aij ]3×3 , then Ã1 = (a11 ), Ã2 = [a11 a12 ; a21 a22 ] and Ã3 = A.
Assume the pivots are d1 , . . . , dn , obtained without row exchange. Then

• det(Ã1 ) = a11 = d1
• det(Ã2 ) = d1 d2 = det(Ã1 ) d2
• det(Ã3 ) = d1 d2 d3 = det(Ã2 ) d3 , etc.
• If det(Ãk ) = 0, then we need a row exchange in elimination.
• Otherwise the k-th pivot is dk = det(Ãk )/det(Ãk−1 ).
