
MA110-2023-2 (1st half): Linear Algebra

H. Ananthnarayan & Rekha Santhanam



Department of Mathematics, I.I.T. Bombay

March 8, 2024

This set of notes supplements the material presented in our Linear Algebra classes
in the first half of MA110 in Spring 2024 at IIT Bombay. The primary content was
developed using the reference:
Linear Algebra and its Applications by G. Strang, 4th Ed., Thomson.
Most of the content in these appendices was not covered in class; it is the result
of various questions asked by curious students, and is being shared primarily
to give such students more material to mull over.
The topics covered in the class were:

1. Linear Equations & Matrices

2. Vector Spaces

3. Eigenvalues & Eigenvectors

4. Orthogonality & Projections

The topics covered in these slides are:

Appendix I - Determinants

Appendix II - Orthogonal Subspaces

Appendix III - Additional Topics

(a) An Application of QR-factorization


(b) Angles in R^n
(c) Inner Product Spaces
(d) Symmetric Matrices & SVD

Note: (i) The notation in these notes is the same as that discussed in class.
(ii) Work out as many examples as you can.

Appendix I - Determinants

Determinants: Key Properties


Let A and B be n × n matrices, and c a scalar.

• True/False: det(A + B) = det(A) + det(B).

• True/False: det(cA) = c det(A).

• det(AB) = det(A) det(B).

• det(A) = det(A^T).

• If A is orthogonal, i.e., AA^T = I, then det(A) = ____.

• If A = [a_ij] is triangular, then det(A) = ____.

• A is invertible ⇔ det(A) ≠ 0.
If this happens, then det(A^{-1}) = ____.

• If B = P^{-1}AP for an invertible matrix P,
i.e., A and B are similar, then det(B) = ____.

• If A is invertible, and d_1, …, d_n are the pivots of A,
then det(A) = ____.

(A numerical sanity check of several of these properties appears below.)
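
The following minimal NumPy sketch (assuming NumPy is available; the matrices are
random, chosen purely for illustration) checks several of the properties above
numerically. Note how the scaling check resolves the second True/False question.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))
    c = 2.5

    # Product rule: det(AB) = det(A) det(B)
    assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

    # Transpose: det(A) = det(A^T)
    assert np.isclose(np.linalg.det(A), np.linalg.det(A.T))

    # Scaling acts on each of the n rows: det(cA) = c^n det(A), NOT c det(A)
    assert np.isclose(np.linalg.det(c * A), c**n * np.linalg.det(A))

    # det(A + B) = det(A) + det(B) fails in general: these two numbers differ
    print(np.linalg.det(A + B), np.linalg.det(A) + np.linalg.det(B))

Linearity of det holds row by row (see the defining properties below), not for the
matrix as a whole; that is why det(cA) picks up a factor of c from each of the n rows.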

Determinants: Defining Properties


Defn. The determinant function det : M_{n×n}(R) → R can be defined (uniquely)
by its three basic properties.
• det(I) = 1.
• The sign of the determinant is reversed by a row exchange.
Thus, if B = P_ij A, i.e., B is obtained from A by exchanging two rows,
then det(B) = −det(A). In particular, det(I) = 1 ⇒ det(P_ij) = −1.
• det is linear in each row separately, i.e., if we fix n − 1 row vectors, say v_2, …, v_n,
then u ↦ det[u; v_2; …; v_n] is a linear function from R^n to R.
That is, for c, d in R, and vectors u and v, if A_{1*} = cu + dv, we have
det[cu + dv; A_{2*}; …; A_{n*}] = c det[u; A_{2*}; …; A_{n*}] + d det[v; A_{2*}; …; A_{n*}].
There are n such equations (one for each choice of row).

Determinants: Induced Properties

1. If two rows of A are equal, then det(A) = 0.

Proof. Suppose the i-th and j-th rows of A are equal, i.e., A_{i*} = A_{j*}. Then A = P_ij A.
Hence det(A) = det(P_ij A) = −det(A) ⇒ det(A) = 0.

2. If B is obtained from A by R_i ↦ R_i + aR_j, then det(B) = det(A). (Use linearity in the
i-th row.)

3. If A is n × n, and its row echelon form U is obtained without row exchanges, then
det(U) = det(A).
Q: What happens if there are row exchanges? Exercise!

4. If A has a zero row, then det(A) = 0.
Proof: Let the i-th row of A be zero, i.e., A_{i*} = 0.
Let B be obtained from A by R_i ↦ R_i + R_j, i.e., B = E_ij(1)A. Then B_{i*} = B_{j*}.

Exercise: Complete the proof.

Determinants: Special Matrices


 
5. If A = diag(a_1, …, a_n) is diagonal, then det(A) = a_1 ⋯ a_n. (Use linearity.)

6. If A = (a_ij) is triangular, then det(A) = a_11 ⋯ a_nn.

Proof. If all a_ii are non-zero, then by elementary row operations A reduces to the
diagonal matrix diag(a_11, …, a_nn), whose determinant is a_11 ⋯ a_nn.

If at least one diagonal entry is zero, then elimination will produce a zero row ⇒
det(A) = 0.

Formula for Determinant: 2 × 2 case


Write (a, b) = (a, 0) + (0, b), the sum of vectors in the coordinate directions.
Similarly write (c, d) = (c, 0) + (0, d). By linearity,

det[a b; c d] = det[a 0; c d] + det[0 b; c d]
             = det[a 0; c 0] + det[a 0; 0 d] + det[0 b; c 0] + det[0 b; 0 d].

For an n × n matrix, each row splits into n coordinate directions,
so the expansion of det(A) has n^n terms.
However, when two rows are in the same coordinate direction, that term is zero, e.g.,

det[a 0; c 0] = 0,  det[0 b; 0 d] = 0,  det[0 b; c 0] = −det[c 0; 0 b] = −bc
⇒ det[a b; c d] = ad − bc.

The non-zero terms must take their entries from different columns.

So, there will be n! such terms in the n × n case.

Formula for Determinant: n × n case


For an n × n matrix A = (a_ij),

det(A) = Σ_{all permutations P} (a_{1α_1} ⋯ a_{nα_n}) det(P).

The sum is over the n! permutations of (1, …, n). Here a permutation (i_1, i_2, …, i_n)
of (1, 2, …, n) corresponds to the permutation matrix P = [e_{i_1}^T; …; e_{i_n}^T],
whose rows are the rows of the identity in permuted order.
Then det(P) = +1 if the number of row exchanges needed to bring P to I is even, and −1
if it is odd.
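
To make this n!-term expansion concrete, here is a minimal sketch (NumPy and the
standard library's itertools assumed; the helper name det_by_permutations is ours),
checked against numpy.linalg.det. The sign of a permutation is computed by counting
inversions, which has the same parity as the number of row exchanges.

    import numpy as np
    from itertools import permutations

    def det_by_permutations(A):
        # Leibniz expansion: sum over all n! permutations of the columns
        n = A.shape[0]
        total = 0.0
        for perm in permutations(range(n)):
            # parity of the permutation = parity of its inversion count
            inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                             if perm[i] > perm[j])
            sign = -1.0 if inversions % 2 else 1.0
            term = 1.0
            for row, col in enumerate(perm):
                term *= A[row, col]
            total += sign * term
        return total

    A = np.array([[1.0, 2.0], [1.0, -2.0]])
    print(det_by_permutations(A))   # -4.0
    print(np.linalg.det(A))         # -4.0 (up to round-off)

This costs O(n · n!) operations and is only practical for tiny n; it is shown to make
the formula concrete, not as a computational method.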

Cofactors: 3 × 3 Case

det(A) = det[a11 a12 a13; a21 a22 a23; a31 a32 a33]
= a11 a22 a33 (1) + a11 a23 a32 (−1) + a12 a21 a33 (−1)
+ a12 a23 a31 (1) + a13 a21 a32 (1) + a13 a22 a31 (−1)
= a11 (a22 a33 − a23 a32) − a12 (a21 a33 − a23 a31) + a13 (a21 a32 − a22 a31)
= a11 det[a22 a23; a32 a33] + a12 (−1) det[a21 a23; a31 a33] + a13 det[a21 a22; a31 a32]
= a11 C11 + a12 C12 + a13 C13, where
C11 = det[a22 a23; a32 a33], C12 = −det[a21 a23; a31 a33], C13 = det[a21 a22; a31 a32].

Cofactors: n × n Case
Let C_{1j} be the coefficient of a_{1j} in the expansion

det(A) = Σ_{all permutations P} (a_{1α_1} ⋯ a_{nα_n}) det(P).

Then det(A) = a_{11} C_{11} + a_{12} C_{12} + … + a_{1n} C_{1n}, where

C_{1j} = Σ a_{2β_2} ⋯ a_{nβ_n} det(P) = (−1)^{1+j} det(M_{1j}),

and M_{1j} is the (n−1) × (n−1) matrix obtained from A by deleting the 1st row and
j-th column.
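
A direct recursive implementation of the cofactor expansion along the first row makes
this definition concrete (a sketch, NumPy assumed; still exponential time, so only
for small matrices):

    import numpy as np

    def det_by_cofactors(A):
        # Expand along the first row: det(A) = sum_j a_1j * C_1j
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # M_1j: delete row 0 and column j
            M = np.delete(np.delete(A, 0, axis=0), j, axis=1)
            C = (-1) ** j * det_by_cofactors(M)   # (-1)^(1+j) with 0-based j
            total += A[0, j] * C
        return total

    A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
    print(det_by_cofactors(A), np.linalg.det(A))   # both 8.0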

det(AB) = det(A) det(B) (Proof)

7.
det(AB) = det(A) det(B)

Proof. We may assume that B is invertible: otherwise rank(AB) ≤ rank(B) < n,
so AB is not invertible and both sides of the equation are zero.
Hint: For a fixed invertible B, show that the function d defined by

d(A) = det(AB)/det(B)

satisfies the following properties:

(a) d(I) = 1.
(b) If we interchange two rows of A, then d changes its sign.
(c) d is a linear function in each row of A.

Then d is the unique determinant function det and det(AB) = det(A) det(B). □

Determinants of Transposes (Proof)

8.
det(A) = det(AT )

Proof. With U, L, and P as usual, write PA = LU ⇒ A^T P^T = U^T L^T. Since U and L are
triangular, we get det(U) = det(U^T) and det(L) = det(L^T).

Since PP^T = I and det(P) = ±1, we get det(P) = det(P^T).

Taking determinants, det(P) det(A) = det(L) det(U) and det(A^T) det(P^T) = det(U^T) det(L^T).

Thus det(A) = det(A^T). □

Determinants and Invertibility (Proof)

9. A is invertible if and only if det(A) ̸= 0.


By elimination, we get an upper triangular matrix U, a lower triangular matrix L with
diagonal entries 1, and a permutation matrix P, such that PA = LU.
Observation 1: If A is singular, then det(A) = 0.
This is because elimination produces a zero row in U, and hence det(A) = ±det(U) = 0.

Observation 2: If A is invertible, then det(A) ≠ 0.

This is because elimination produces n pivots, say d_1, …, d_n, which are non-zero.
Then U is upper triangular with diagonal entries d_1, …, d_n ⇒ det(A) = ±det(U) =
±d_1 ⋯ d_n ≠ 0.
Thus we have: A invertible ⇒ det(A) = ±(product of pivots).
Exercise: If AB is invertible, then so are A and B.
Exercise: A is invertible if and only if A^T is invertible.

Determinant: Geometric Interpretation (2 × 2)


Invertibility: Very often we are interested in knowing when a matrix is invertible.
Consider A = [a b; c d]. Then A is invertible if and only if A has full rank.
If a and c are both zero, then clearly rank(A) < 2 ⇒ A is not invertible.
Assume a ≠ 0 (else interchange the rows). The row operation R2 ↦ R2 − (c/a)R1 gives

[a b; c d] → [a b; 0 d − cb/a],

which shows that A is invertible if and only if d − cb/a ≠ 0, i.e., ad − bc ≠ 0.
Area: The area of the parallelogram with sides given by the vectors v = (a, b) and
w = (c, d) is |ad − bc|. Thus,
A 2 × 2 matrix A is singular ⇔ its columns lie on the same line ⇔ the area is zero.

Determinant: Geometric Interpretation
• Test for invertibility: An n × n matrix A is invertible ⇔ det(A) ̸= 0.
• n-dimensional volume: If A is n × n, then |det(A)| = the volume of the box
(in n-dimensional space R^n) with edges given by the rows of A.
Examples: (1) The volume (area) of a line in R^2 is 0.
(2) The determinant of the matrix A = [1 2; 1 −2] is −4.
(3) The volume of the box (parallelogram) with edges given by the rows of A
(or the columns of A) is |−4| = 4.
[Figure: the parallelogram spanned by the rows of A, with area 4.]

Expansion along the i-th row (Proof)


If C_ij is the coefficient of a_ij in the formula for det(A), then
det(A) = a_{i1} C_{i1} + … + a_{in} C_{in}, where C_ij is determined as follows:
By i − 1 row exchanges on A, get the matrix B = [A_{i*}; A_{1*}; …; A_{(i−1)*}; A_{(i+1)*}; …; A_{n*}],
whose first row is A_{i*}. Since det(A) = (−1)^{i−1} det(B), we get
C_ij(A) = (−1)^{i−1} C_{1j}(B) = (−1)^{i−1} (−1)^{1+j} det(M),
where M is obtained from B by deleting its first row and j-th column, and hence from A
by deleting the i-th row and j-th column. Write M as M_ij. Then C_ij = (−1)^{i+j} det(M_ij).

Expansion along the j-th column (Proof)


Note that C_ij(A^T) = C_ji(A).
Hence, if we write A^T = (b_ij), then
det(A) = det(A^T)
       = b_{j1} C_{j1}(A^T) + … + b_{jn} C_{jn}(A^T)
       = a_{1j} C_{1j}(A) + … + a_{nj} C_{nj}(A).
This is the expansion of det(A) along the j-th column of A.

Applications: 1. Computing A−1


If C = (C_ij) is the cofactor matrix of A, then A C^T = det(A) I, i.e.,

[a11 … a1n; …; an1 … ann] [C11 … Cn1; …; C1n … Cnn] = [det(A) … 0; …; 0 … det(A)].

Proof. We have seen that a_{i1} C_{i1} + … + a_{in} C_{in} = det(A). Now
a_{11} C_{21} + a_{12} C_{22} + … + a_{1n} C_{2n} = det[a11 … a1n; a11 … a1n; a31 … a3n; …; an1 … ann] = 0,
since this is the determinant of A with its 2nd row replaced by its 1st, a matrix with
two equal rows. Similarly, if i ≠ j, then a_{i1} C_{j1} + a_{i2} C_{j2} + … + a_{in} C_{jn} = 0. □

6
Remark. If A is invertible, then A^{-1} = (1/det(A)) C^T.
For n ≥ 4, this is not a good formula for computing A^{-1}; use elimination instead.
This formula is of theoretical importance.
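
A short sketch of this adjugate formula (NumPy assumed; the helper name
inverse_by_cofactors is ours, and numpy.linalg.det is used for the minors for brevity):

    import numpy as np

    def inverse_by_cofactors(A):
        # A^{-1} = C^T / det(A), where C is the cofactor matrix
        n = A.shape[0]
        C = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                M = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(M)
        d = np.linalg.det(A)
        if np.isclose(d, 0):
            raise ValueError("A is singular")
        return C.T / d

    A = np.array([[2.0, 1.0], [5.0, 3.0]])   # det(A) = 1
    print(inverse_by_cofactors(A))           # [[3, -1], [-5, 2]]
    print(np.linalg.inv(A))                  # same answer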

Applications: 2. Solving Ax = b
Cramer's Rule: If A is invertible, then Ax = b has a unique solution:

x = A^{-1} b = (1/det(A)) C^T b = (1/det(A)) [C11 … Cn1; …; C1n … Cnn] [b1; …; bn].

Hence x_j = (1/det(A)) (b_1 C_{1j} + b_2 C_{2j} + … + b_n C_{nj}) = det(B_j)/det(A), where B_j is
obtained by replacing the j-th column of A by b, and det(B_j) is computed along the
j-th column.
Remark: For n ≥ 4, use elimination to solve Ax = b. Cramer's rule is of theoretical
importance.
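
A minimal sketch of Cramer's rule (NumPy assumed; only sensible for small,
well-conditioned systems):

    import numpy as np

    def cramer_solve(A, b):
        # x_j = det(B_j)/det(A), where B_j is A with column j replaced by b
        d = np.linalg.det(A)
        if np.isclose(d, 0):
            raise ValueError("A is singular")
        n = A.shape[0]
        x = np.empty(n)
        for j in range(n):
            Bj = A.copy()
            Bj[:, j] = b
            x[j] = np.linalg.det(Bj) / d
        return x

    A = np.array([[1.0, 2.0], [1.0, -2.0]])
    b = np.array([3.0, -1.0])
    print(cramer_solve(A, b))        # [1. 1.]
    print(np.linalg.solve(A, b))     # same answer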

Applications: 3. Volume of a box


Assume the rows of A are mutually orthogonal. Then

AA^T = [A_{1*}; …; A_{n*}] [(A_{1*})^T … (A_{n*})^T] = diag(l_1², …, l_n²),

where l_i = √(A_{i*} · A_{i*}) is the length of the i-th row. Since det(A) = det(A^T),
we get det(A)² = det(AA^T) = l_1² ⋯ l_n², so |det(A)| = l_1 ⋯ l_n.

Since the edges of the box spanned by the rows of A are at right angles, the volume of
the box = the product of the lengths of the edges = |det(A)|.

Applications: 4. A Formula for Pivots


Observation: If row exchanges are not required, then the first k pivots are determined
by the top-left k × k submatrix Ã_k of A.

Example. If A = [a_ij] is 3 × 3, then Ã_1 = (a11), Ã_2 = [a11 a12; a21 a22], and Ã_3 = A.

Assume the pivots are d_1, …, d_n, obtained without row exchanges. Then

• det(Ã_1) = a11 = d_1

• det(Ã_2) = d_1 d_2 = det(Ã_1) d_2

• det(Ã_3) = d_1 d_2 d_3 = det(Ã_2) d_3, etc.

• If det(Ã_k) = 0, then we need a row exchange in elimination.

• Otherwise the k-th pivot is d_k = det(Ã_k)/det(Ã_{k−1}), as in the sketch below.
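
A quick numerical illustration of the pivot formula (NumPy assumed; the matrix is an
arbitrary example chosen so that no row exchanges are needed; its pivots are 2, 1, 2):

    import numpy as np

    A = np.array([[2.0, 1.0, 1.0],
                  [4.0, 3.0, 3.0],
                  [8.0, 7.0, 9.0]])

    # Leading principal minors det(A_k) for k = 1, 2, 3
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

    # Pivot formula: d_k = det(A_k) / det(A_{k-1}), with det(A_0) = 1
    prev = 1.0
    for k, m in enumerate(minors, start=1):
        print(f"pivot d_{k} =", m / prev)   # prints 2.0, 1.0, 2.0
        prev = m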

Appendix II - Orthogonal Subspaces

Orthogonal Subspaces: Examples


If {w_1, …, w_r} is an orthogonal set, then w_r is orthogonal to Span{w_1, …, w_{r−1}}. Then
every vector in Span{w_r} is orthogonal to every vector in Span{w_1, …, w_{r−1}}.
Example: Let u = (1, 1, 0)^T, v = (0, 0, 1)^T, and w = (1, −1, 0)^T,
P = Span{u, v}, and L = Span{w}.
[Figure: the plane P, containing the line y = x and the z-axis, and the line L through (1, −1, 0).]

Observe: For scalars a, b, t, (tw)^T (au + bv) = 0.

The subspace L = {(t, −t, 0) | t ∈ R} is orthogonal to the subspace P = {(a, a, b) | a, b ∈ R}.
Observe: dim(L) + dim(P) = dim(R³).

Orthogonal Subspaces: Definition and Properties


Defn. (Orthogonal Subspaces) Let V and W be subspaces of R^n. We say V and W are
orthogonal to each other (notation: V ⊥ W) if every vector in V is orthogonal to
every vector in W, i.e., for every v ∈ V and w ∈ W, v · w = v^T w = 0.
Note: Let V = Span{v_1, …, v_r} and W = Span{w_1, …, w_s}.
If v_i^T w_j = 0 for all i, j, then V ⊥ W.
Proof. If v = a_1 v_1 + … + a_r v_r ∈ V and w = b_1 w_1 + … + b_s w_s ∈ W, then
v^T w = (a_1 v_1^T + … + a_r v_r^T)(b_1 w_1 + … + b_s w_s)
     = a_1 b_1 v_1^T w_1 + a_1 b_2 v_1^T w_2 + … + a_r b_s v_r^T w_s = 0, using bilinearity.

Orthogonal Subspaces: Examples


Q. Let V be the yz-plane and W be the xz-plane.
Are V and W orthogonal to each other (as subspaces of R³)?
A. No. The vector e_3 = (0, 0, 1) lies in both V and W, and e_3^T e_3 ≠ 0.
Remark: If V and W are orthogonal subspaces of R^n, then V ∩ W = 0.
This is a necessary condition.
Q. Is it a sufficient condition? A. No.
Example 1: Let V = Span{(1, 0)^T} and W = Span{(1, 1)^T}. Then V ∩ W = 0,
but V and W are not orthogonal, since (1 0)(1, 1)^T = 1 ≠ 0.

Orthogonal Subspaces:Examples     
1 0 1
1 1 −1
E XAMPLE 2: Let v1 =  0 , v2 = 1 and w =  1 .
    

0 0 0
If V = Span {v1 , v2 }, and W = Span{w}, then V ⊥ W .
Sketch of Proof. It is enough to see that v1 , v2 ∈ V are orthogonal to w ∈ W .
C OMPARE : In Example 1, dim (L) + dim (P ) = dim (R3 ).
In Example 2, dim (V ) + dim (W ) = 3 < dim (R4 ).
Q: Can we enlarge W to W ′ ⊆ R4 such that V ⊥ W ′ and dim (V ) + dim (W ′ ) = dim (R4 )?
A: Yes!
O BSERVE : (0, 0, 0, 1) is orthogonal to both V and W .
 T  
v 1 1 0 0
Let A = 1T = . Then C(AT ) = V .
v2 0 1 1 0
Note that x ∈ N (A) ⇔ v1T x = 0 = v2T x. ⇒ (1, −1, 1, 0)T ∈ N (A) ⇒ W ⊂ N (A).
By the Rank-Nullity Theorem, rank(A)+ dim(N (A)) = 4. Since rank(A) = 2,
we get dim(N (A)) = 2. Therefore if W ′ = N (A), then V ⊥ W ′ and dim (V ) + dim (W ′ ) = 4.
By inspection, W ′ = Span{w1 = (1, −1, 1, 0)T , w2 = (0, 0, 0, 1)T } is orthogonal to V .
The dimension of the two spaces sum up to dim (R4 ).
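
This computation can be reproduced numerically; a minimal sketch using
scipy.linalg.null_space (SciPy assumed to be available):

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1.0, 1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0, 0.0]])

    W = null_space(A)              # orthonormal basis for N(A)
    print(W.shape)                 # (4, 2): dim N(A) = 4 - rank(A) = 2

    # Every nullspace vector is orthogonal to every row of A, i.e., to C(A^T) = V
    print(np.allclose(A @ W, 0))   # True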

The Four Fundamental Spaces and Orthogonality


Let A be an m × n matrix.
1. The row space of A, C(A^T), is orthogonal to N(A).
2. C(A) is orthogonal to the left nullspace of A, N(A^T).
In fact, N(A) = the set of all vectors orthogonal to C(A^T),
and N(A^T) = the set of all vectors orthogonal to C(A).
Example. Let A = [1 2; 2 4; 3 6]. Then A has 1 pivot ⇒ dim(C(A)) = 1 = dim(C(A^T)).
Observe: C(A^T) = Span{(1, 2)^T}, N(A) = Span{(−2, 1)^T},
C(A) = Span{(1, 2, 3)^T}, and
N(A^T) is the plane y_1 + 2y_2 + 3y_3 = 0.
Moreover, dim(C(A^T)) + dim(N(A)) = 2 = dim(R²), and
dim(C(A)) + dim(N(A^T)) = 3 = dim(R³).

Fundamental Theorem of Orthogonality


Defn. Let W be a subspace of R^n.
Define its orthogonal complement as W^⊥ = {v ∈ R^n | v^T w = 0 for all w ∈ W}.
Claim. W^⊥ is a subspace of R^n.
If v_1, v_2 ∈ W^⊥ and w ∈ W, then v_1^T w = 0 = v_2^T w.
Hence for c_1, c_2 ∈ R, (c_1 v_1 + c_2 v_2)^T w = c_1 v_1^T w + c_2 v_2^T w = 0 ⇒ c_1 v_1 + c_2 v_2 ∈ W^⊥.
Theorem (Fundamental Theorem of Orthogonality)
Let A be an m × n matrix.
1. The row space of A = the orthogonal complement of N(A).
2. The column space of A = the orthogonal complement of the left nullspace N(A^T).
I.e., C(A^T) = N(A)^⊥, and C(A) = N(A^T)^⊥.

Orthogonal Complements
Theorem (Orthogonal Complement)
Given a subspace W ⊆ R^n, dim(W) + dim(W^⊥) = n.
Proof. Let v_1, …, v_r be a basis of W. Let A be the matrix with rows v_1, …, v_r.
Then rank(A) = r and W = C(A^T), so W^⊥ = N(A) has dimension n − r.
This proves the theorem.
Observe: Let V = Span{v_1 = (1, 1, 0, 0)^T, v_2 = (0, 1, 1, 0)^T}
and W = Span{w_1 = (1, −1, 1, 0)^T, w_2 = (0, 0, 0, 1)^T}.

• {v_1, v_2} and {w_1, w_2} are bases for V and W respectively.
• V ⊥ W ⇒ V ⊆ W^⊥, and dim(V) + dim(W) = 4 = dim(R⁴)
⇒ V = W^⊥.
• V ∩ W = 0 ⇒ B = {v_1, v_2, w_1, w_2} is linearly independent ⇒
B is a basis of R⁴ ⇒ V + W = {v + w | v ∈ V, w ∈ W} = R⁴,
and for every x ∈ R⁴, ∃ unique v ∈ V = W^⊥ and w ∈ W such that x = v + w.

Orthogonal Complements
To summarise: If W is a subspace of R^n with basis B, and B′ is a basis of W^⊥,
then B ∪ B′ is a basis of R^n, W ∩ W^⊥ = 0, dim(W) + dim(W^⊥) = n, W + W^⊥ = R^n,
and for every x ∈ R^n, ∃ unique w_1 ∈ W and w_2 ∈ W^⊥ such that x = w_1 + w_2.
Some consequences: Let A be m × n of rank r. Then
1. C(A^T) ∩ N(A) = 0 and R^n = C(A^T) + N(A).
Similarly, C(A) ∩ N(A^T) = 0 and R^m = C(A) + N(A^T).
2. If x ∈ R^n, there is a unique expression x = x_r + x_n, where x_r ∈ C(A^T), x_n ∈ N(A).
Hence Ax = Ax_r ∈ C(A). Thus:

The matrix A transforms its row space C(A^T) into its column space C(A).

Appendix III - Additional Topics

Introduction
Topics included (in no particular order) are:

1. An Application of QR-factorization

2. Angles in R^n

3. Inner Product Spaces

4. Symmetric Matrices & SVD

III.1 An Application of QR-Factorization

Application: Least squares + QR


Let A be an n × r matrix of rank r.
Let A = QR, where Q is an n × r matrix with orthonormal columns
and R is an r × r invertible upper triangular matrix.
Recall that Q^T Q = [q_i^T q_j] = I_r, since q_i^T q_j = 1 if i = j and 0 otherwise.
Let Ax = b be an inconsistent system.
We apply the least-squares method to get the best approximate solution:
A^T A = (QR)^T QR = R^T Q^T Q R = R^T R. Therefore we solve
A^T A x̂ = R^T R x̂ = A^T b = R^T Q^T b. Since R^T is invertible,
the least-squares solution is given by R x̂ = Q^T b.
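
A minimal numerical sketch of this (NumPy assumed; the data is a made-up
overdetermined system, fitting a line y = c_0 + c_1 t through three points):

    import numpy as np

    # Fit y = c0 + c1*t to the points (0,1), (1,2), (2,2): an inconsistent 3x2 system
    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 2.0, 2.0])

    Q, R = np.linalg.qr(A)                  # reduced QR: Q is 3x2, R is 2x2
    x_hat = np.linalg.solve(R, Q.T @ b)     # solve R x = Q^T b

    print(x_hat)                                   # least-squares coefficients
    print(np.linalg.lstsq(A, b, rcond=None)[0])    # same answer

In practice one back-substitutes through the triangular R rather than forming R^{-1};
np.linalg.solve is used here only to keep the sketch short.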

III.2 Angles in R^n

Angles in R^2
Note: w ∈ R² can be written as (‖w‖ cos α, ‖w‖ sin α), where α is the angle w makes
with the x-axis.
[Figure: vectors v and w, with w at angle α from the x-axis and v at angle θ + α.]

v = ‖v‖ (cos(θ + α), sin(θ + α))^T,   w = ‖w‖ (cos α, sin α)^T.

v^T w = ‖v‖‖w‖ (cos(θ + α) cos α + sin(θ + α) sin α)
     = ‖v‖‖w‖ cos(θ + α − α) = ‖v‖‖w‖ cos θ.

This gives a formula for the angle between two vectors in terms of the inner product
and the norm (length) of the vectors.

Angles in R^n
To generalise the angle between two vectors to R^n, we use the notion of inner product
and norm in R^n. But to do this, we need the Cauchy-Schwarz inequality for v, w ∈ R^n:

|v^T w| ≤ ‖v‖‖w‖.

To verify this (for v ≠ 0), expand the inequality ‖w − (v^T w/‖v‖²) v‖² ≥ 0.
Define the angle between any two non-zero vectors v, w ∈ R^n as

θ = cos⁻¹( v^T w / (‖v‖‖w‖) ).

Thus: θ is 90° if and only if v^T w = 0.

Example: The angle between (0, 1, 1) and (0, 2, 0) is cos⁻¹( 2/(2√2) ) = 45°.
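
The example can be checked directly (NumPy assumed; the helper name angle_between
is ours):

    import numpy as np

    def angle_between(v, w):
        # angle in degrees, via cos(theta) = v.w / (|v| |w|)
        cos_theta = v @ w / (np.linalg.norm(v) * np.linalg.norm(w))
        # clip guards against round-off pushing the ratio slightly outside [-1, 1]
        return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

    v = np.array([0.0, 1.0, 1.0])
    w = np.array([0.0, 2.0, 0.0])
    print(angle_between(v, w))   # 45.0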

III.3 General Inner Product Spaces

Inner Product
We have defined orthogonality only in R^n.
Can we do something similar for other vector spaces?
If V is n-dimensional, we can use coordinate vectors and the inner product in R^n.
Point to Ponder:
Can different bases and coordinate vectors be used to define new inner products on R^n?
Note: On the infinite-dimensional space R^∞, the space of all real sequences,
the dot product does not generalise:
if x_n = 1/√n, then (x_n) · (x_n) = Σ_n 1/n, which does not converge, and hence is not finite.
However, this generalisation works for sequences (x_n) and (y_n) for which Σ_n x_n y_n
is convergent.

Define ℓ²(R) = {(x_1, …, x_n, …) ∈ R^∞ | Σ_i x_i² < ∞}. E.g., (1, 1/2, 1/3, …, 1/n, …) ∈ ℓ²(R).
Which of the following are in ℓ²(R)? (1, 0, …, 0, …), (1, 0, 1, 0, 1, 0, …), (1, 1/2, 1/2², 1/2³, …)

Inner Product Spaces : Definition


If (x_n), (y_n) ∈ ℓ²(R), define ⟨(x_n), (y_n)⟩ = Σ_n x_n y_n.
Why is this sum finite? (x_k − y_k)² ≥ 0 ⇒ x_k y_k ≤ ½ (x_k² + y_k²).
Note that, like the usual dot product on R^n, ⟨ , ⟩ has the following properties,
for all x = (x_n), y = (y_n), z = (z_n) ∈ ℓ²(R) and c ∈ R:

• (Symmetry) ⟨x, y⟩ = ⟨y, x⟩.

• (Positive Definiteness) ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 ⇐⇒ x = 0.

• (Bilinearity) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ and ⟨cx, y⟩ = c⟨x, y⟩ = ⟨x, cy⟩.

An inner product on a vector space V is a rule which gives a real number ⟨u, v⟩
for every pair u, v ∈ V and has the three above-mentioned properties.
This lets us define the length of a vector in such spaces:
define the length (norm) of v as ‖v‖ = √⟨v, v⟩.
Point to Ponder:
What would one need to define the angle between two non-zero vectors in V?

Inner Product Spaces : Properties


Let V be a vector space with an inner product.
Then it satisfies the Cauchy-Schwarz inequality, that is, for all u, v ∈ V,

|⟨u, v⟩| ≤ ‖u‖‖v‖,

and the Pythagoras theorem, that is, for u, v ∈ V,

⟨u, v⟩ = 0 ⇐⇒ ‖u − v‖² = ‖u‖² + ‖v‖².

We define two vectors u, v ∈ V to be orthogonal to each other if ⟨u, v⟩ = 0.

Inner Product spaces : Another Example


Let V = C[0, 2π] be the vector space of continuous functions on [0, 2π].
We now define an inner product on V as follows: for f, g ∈ C[0, 2π],

⟨f, g⟩ = ∫₀^{2π} f(t) g(t) dt.

This satisfies all the axioms of an inner product.

Some of the best-understood functions are the cosine and sine functions on [0, 2π],
and it is often useful to write periodic functions in terms of sines and cosines.
This idea is a simple application of taking orthogonal projections in the
vector space C[0, 2π] onto the subspace W = Span{cos nx, sin mx | m, n ∈ Z, m, n ≥ 0}.
In order to take orthogonal projections, we need an orthogonal basis of W.

An Application: Fourier Series


The set S_n = {1} ∪ {cos(kx), sin(lx) | 1 ≤ k, l ≤ n} is orthogonal. Why?
For example, ⟨cos x, sin x⟩ = ∫₀^{2π} cos t sin t dt = ½ ∫₀^{2π} sin 2t dt = 0, and
⟨cos x, cos x⟩ = ∫₀^{2π} cos² t dt = ½ ∫₀^{2π} (1 + cos 2t) dt = π.
Then S_n is an orthogonal basis of W_n = Span(S_n) ⊆ W. Let f ∈ C[0, 2π]. Then

f_n = proj_{W_n}(f(x)) = a_0 + a_1 cos(x) + b_1 sin(x) + … + a_n cos(nx) + b_n sin(nx),

where a_0 = ⟨1, f(x)⟩/2π, a_k = ⟨cos(kx), f(x)⟩/π, and b_k = ⟨sin(kx), f(x)⟩/π, for 1 ≤ k ≤ n.
(The denominators are the squared norms: ‖1‖² = 2π and ‖cos(kx)‖² = ‖sin(kx)‖² = π.)
Thus (f_n) is exactly the sequence of partial sums of the Fourier series expansion of f.
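
A numerical sketch of these projection formulas (NumPy assumed; f(x) = x on [0, 2π]
is an arbitrary test function, and the integrals are approximated by a Riemann sum):

    import numpy as np

    t = np.linspace(0.0, 2.0 * np.pi, 2001)
    dt = t[1] - t[0]
    f = t.copy()                       # test function f(x) = x

    def inner(g, h):
        # <g, h> = integral of g*h over [0, 2pi], approximated numerically
        return np.sum(g * h) * dt

    n = 5
    a0 = inner(np.ones_like(t), f) / (2.0 * np.pi)
    fn = a0 * np.ones_like(t)
    for k in range(1, n + 1):
        ak = inner(np.cos(k * t), f) / np.pi
        bk = inner(np.sin(k * t), f) / np.pi
        fn += ak * np.cos(k * t) + bk * np.sin(k * t)

    # Compare the partial sum and f at x = pi (index 1000 of the grid)
    print(fn[1000], f[1000])           # both close to pi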

III.4 Symmetric Matrices & SVD

(Real) Symmetric Matrices


• Let A be a symmetric n × n matrix. Then all its eigenvalues are real.
Proof. Let λ ∈ C be an eigenvalue of A with eigenvector v ∈ Cⁿ.
Then Av = λv ⇒ v̄^T A v = λ v̄^T v. Conjugating Av = λv gives Āv̄ = λ̄v̄, and transposing,
v̄^T Ā^T = λ̄ v̄^T; since A is real and symmetric, Ā^T = A^T = A, so v̄^T A v = λ̄ v̄^T v.
Since v̄^T v > 0 for non-zero v, we have λ = λ̄ ⇒ λ is real.
• λ has an associated eigenvector with real entries.
Proof. If x ∉ Rⁿ, write x = u + iv, where u, v ∈ Rⁿ.
Then λ ∈ R ⇒ Au = λu and Av = λv; at least one of u, v is non-zero,
and it is a real eigenvector associated to λ.
• If λ_1 and λ_2 are distinct eigenvalues of A, with associated eigenvectors v_1, v_2 ∈ Rⁿ,
then v_1 and v_2 are orthogonal.
Proof. We want to prove v_1^T v_2 = 0. Now

λ_1 (v_1^T v_2) = (λ_1 v_1)^T v_2 = (Av_1)^T v_2 = (v_1^T A^T) v_2 = v_1^T (Av_2) = v_1^T (λ_2 v_2) = λ_2 (v_1^T v_2).

Since λ_1 ≠ λ_2, we have v_1^T v_2 = 0.


Spectral Theorem for a Real Symmetric Matrix:

Every real symmetric matrix A can be diagonalised.

In particular, there is an orthogonal matrix Q and a diagonal matrix Λ such that

A = QΛQ^T.

Given that A can be diagonalised, A = QΛQ^T follows from the previous slide,
together with the Gram-Schmidt process applied to each eigenspace of A.
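
This decomposition can be computed with numpy.linalg.eigh, which is designed for
symmetric (Hermitian) matrices (a small sketch; the matrix is an arbitrary example):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])

    eigvals, Q = np.linalg.eigh(A)   # real eigenvalues (ascending), orthonormal eigenvectors
    Lam = np.diag(eigvals)

    print(np.allclose(Q @ Q.T, np.eye(3)))   # True: Q is orthogonal
    print(np.allclose(Q @ Lam @ Q.T, A))     # True: A = Q Lambda Q^T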

Application: Singular Value Decomposition (SVD)


SVD: Given an m × n matrix A, there exist an m × n "diagonal" matrix Σ and
orthogonal matrices U (m × m) and V (n × n) such that A = UΣV^T.
Note: If U, V and Σ are as above, they satisfy:
AA^T = (UΣV^T)(VΣ^T U^T) = U(ΣΣ^T)U^T, and A^T A = V(Σ^T Σ)V^T. Thus:
1. The non-zero diagonal entries of Σ, called the singular values of A,
are the square roots of the common non-zero eigenvalues of AA^T and A^T A.
2. The columns of U are eigenvectors of AA^T, and those of V are eigenvectors of A^T A.
3. If the first r diagonal entries of Σ are non-zero, then {U_{*1}, …, U_{*r}} and
{V_{*1}, …, V_{*r}} are orthonormal bases for the column space C(A) and the row space
C(A^T) of A, respectively. Furthermore, AV = UΣ ⇒ AV_{*j} = σ_j U_{*j} for each j ≤ r.
Simple-minded application: the SVD can be used in image compression.
If an image is represented by A, find U, Σ, V as in the SVD. Replace "small" singular
values in Σ by 0 to get Σ̃. Then Ã = UΣ̃V^T can be used to represent the compressed
image, as in the sketch below.
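
A minimal sketch of this truncation (NumPy assumed; a random matrix stands in for
real pixel data):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((100, 80))     # stand-in for an image

    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # s holds the singular values

    r = 10                                 # keep only the r largest singular values
    A_tilde = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

    # Storage drops from 100*80 numbers to r*(100 + 80 + 1)
    err = np.linalg.norm(A - A_tilde) / np.linalg.norm(A)
    print(f"rank-{r} approximation, relative error {err:.3f}")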

∼∼∼ fin ∼∼∼

