1 Decomposition of Matrices, Generalized Inverses
1.1 Rank
Column space of a matrix: Given an m × n matrix A, each column of A is a vector in R^m, and the subspace spanned by those columns is known as the 'column space' of A (denoted by C(A)). The dimension of the column space of A is known as the 'column rank' of A.

Row space of a matrix: Given an m × n matrix A, each row of A is a vector in R^n, and the subspace spanned by those rows is known as the 'row space' of A (denoted by R(A)). The dimension of the row space of A is known as the 'row rank' of A.

Note: For any matrix A, the column rank and the row rank coincide.

Definition 1.1 (Rank). The rank of an m × n matrix A is the column rank of A, which is the same as the row rank of A.
Lemma. For any two matrices A and B for which the product AB is defined,
rank(AB) ≤ min{rank(A), rank(B)}.

Proof. A vector in C(AB) is of the form ABx for some vector x, and therefore it belongs to C(A). Therefore C(AB) ⊆ C(A), and hence rank(AB) ≤ rank(A). Similarly, we observe that R(AB) ⊆ R(B), and therefore rank(AB) ≤ rank(B).
Theorem (Rank factorization). An m × n matrix A of rank r can be written as A = BC, where B is an m × r matrix and C is an r × n matrix, each of rank r.

Proof. Consider a basis for the column space of A, say b_1, b_2, ..., b_r. Construct the m × r matrix B = (b_1 ··· b_r). Since each column of A is a linear combination of the columns of B, there exists an r × n matrix C such that A = BC. From the definition of B, it is trivial that rank(B) = r. Since r = rank(A) = rank(BC) ≤ rank(C) and C is of size r × n, we obtain rank(C) = r.
Exercise 1.2. Let A be an n × n matrix. Then the following conditions are equivalent.
Exercise 1.3. Let A be an m × n matrix and let M and N be invertible matrices of size m × m and n × n, respectively. Then prove that rank(M A N) = rank(A).
Theorem 1.4. Given an m × n matrix A of rank r, there exist invertible matrices M, N of order m × m and n × n respectively such that
M A N = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.
Example 1. Obtain the canonical form of the following matrix and give two different rank factorizations:
A = \begin{pmatrix} 3 & 6 & 6 \\ 1 & 2 & 2 \end{pmatrix}.
Take M = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} and N = \begin{pmatrix} 1 & -2 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}; then M A N = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, and P = M^{-1} = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}, Q = N^{-1} = \begin{pmatrix} 1 & 2 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

Therefore
A = M^{-1} \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} N^{-1} = P \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} Q = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

A rank factorization of A is A = \begin{pmatrix} 3 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \end{pmatrix}. Now
A = \begin{pmatrix} 3 \\ 1 \end{pmatrix} I_1 \begin{pmatrix} 1 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} (2) (2)^{-1} \begin{pmatrix} 1 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \end{pmatrix} \begin{pmatrix} 1/2 & 1 & 1 \end{pmatrix},
which gives a second rank factorization of A.
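These computations are easy to check numerically. A minimal numpy sketch (verifying the canonical form and both rank factorizations found above):

```python
import numpy as np

# Check of Example 1: M A N is the canonical form and both
# displayed rank factorizations reproduce A.
A = np.array([[3.0, 6.0, 6.0], [1.0, 2.0, 2.0]])
M = np.array([[0.0, 1.0], [1.0, -3.0]])
N = np.array([[1.0, -2.0, -2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

print(M @ A @ N)  # [[1, 0, 0], [0, 0, 0]]

B1, C1 = np.array([[3.0], [1.0]]), np.array([[1.0, 2.0, 2.0]])
B2, C2 = np.array([[6.0], [2.0]]), np.array([[0.5, 1.0, 1.0]])
print(np.allclose(B1 @ C1, A), np.allclose(B2 @ C2, A))  # True True
```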
Exercise 1.4. Obtain the canonical form of the following matrices and give two different rank factorizations:
\begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} -2 & 1 \\ -1 & -1 \end{pmatrix}, \quad \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 & -1 \\ 2 & 1 & 1 \end{pmatrix}.
Exercise 1.5. Let A be an n × n matrix of rank r. Then there exists an n × n matrix Z of rank n − r such
that A + Z is nonsingular.
1.2 Determinants
Consider R^{n×n}, the set of all n × n matrices over R.
A mapping D : R^{n×n} → R is said to be n-linear if, for each i, 1 ≤ i ≤ n, D is a linear function of the i-th row when the other (n − 1) rows are held fixed.
A mapping D : R^{n×n} → R is said to be alternating if
(i) D is n-linear;
(ii) D(B) = −D(A) whenever B is obtained from A by interchanging two rows of A.
When D : R^{n×n} → R satisfies the condition (i) above, the condition (ii) can be replaced by
(ii)′ D(A) = 0 if two rows of A are equal. (Exercise)

Theorem 1.6. If a mapping D : R^{n×n} → R is n-linear, then the following are equivalent:
(i) If B is a matrix obtained by interchanging two rows of A, then D(A) = −D(B).
(ii) D(A) = 0 if two rows of A are equal.
(iii) If B is a matrix obtained by interchanging two adjacent rows of A, then D(A) = −D(B).
(iv) D(A) = 0 if two adjacent rows of A are equal.
Proof. (i) ⟹ (ii): Consider a matrix A such that its i-th row A_i and j-th row A_j are the same for some i ≠ j. Since the two rows are equal, interchanging them leaves A unchanged, while by (i) the interchange flips the sign:
D(A) = D(A_1, ..., A_i, ..., A_j, ..., A_n) = D(A_1, ..., A_j, ..., A_i, ..., A_n)
and
D(A_1, ..., A_i, ..., A_j, ..., A_n) = −D(A_1, ..., A_j, ..., A_i, ..., A_n).
Therefore D(A) = −D(A), i.e., D(A) = 0.

(ii) ⟹ (i): Consider a matrix B whose k-th row B_k is the same as A_k, the k-th row of A, for all k ≠ i, j, and B_i = A_j, B_j = A_i (i < j). Now obtain a matrix C whose k-th row C_k is the same as A_k for all k ≠ i, j, and C_i = C_j = A_i + A_j. From (ii), D(C) = 0, and by n-linearity we get
0 = D(C_1, ..., C_i, ..., C_j, ..., C_n)
= D(A_1, ..., (A_i + A_j), ..., (A_i + A_j), ..., A_n)
= D(A_1, ..., A_i, ..., A_i, ..., A_n) + D(A_1, ..., A_i, ..., A_j, ..., A_n) + D(A_1, ..., A_j, ..., A_i, ..., A_n) + D(A_1, ..., A_j, ..., A_j, ..., A_n)
= D(A_1, ..., A_i, ..., A_j, ..., A_n) + D(A_1, ..., A_j, ..., A_i, ..., A_n)   (the other two terms vanish by (ii))
= D(A) + D(B).
Hence D(B) = −D(A).
(iii) ⟹ (i): Suppose B is obtained from A by interchanging the i-th and j-th rows, i < j. Begin by interchanging the row A_i with A_{i+1}, and continue until the rows appear in the order
A_1, ..., A_{i−1}, A_{i+1}, ..., A_j, A_i, A_{j+1}, ..., A_n.
This requires k = j − i interchanges of adjacent rows. Further, to bring A_j to the i-th position we require k − 1 such interchanges of adjacent rows. So, if B is the matrix obtained by interchanging the i-th and j-th rows of A, we get B from A after 2k − 1 interchanges of adjacent rows. So, D(B) = (−1)^{2k−1} D(A) = −D(A).
(ii) ⟹ (iv) is trivial, and (iv) ⟹ (ii) follows from (iii) ⟹ (i).
For example, take n = 2 and A = (a_{ij}) ∈ R^{2×2}, and let e_1, e_2 denote the rows of the 2 × 2 identity matrix. Writing each row of A in terms of e_1 and e_2, n-linearity gives
D(A) = D(a_{11} e_1 + a_{12} e_2, a_{21} e_1 + a_{22} e_2)
= D(a_{11} e_1, a_{21} e_1 + a_{22} e_2) + D(a_{12} e_2, a_{21} e_1 + a_{22} e_2)
= a_{11} a_{21} D(e_1, e_1) + a_{11} a_{22} D(e_1, e_2) + a_{12} a_{21} D(e_2, e_1) + a_{12} a_{22} D(e_2, e_2).  (1.1)
Determinant Function
A determinant function on R^{n×n} is a mapping D : R^{n×n} → R such that
(i) D is n-linear;
(ii) D is alternating;
(iii) D(I) = 1.
Since D is alternating, D(e_1, e_1) = D(e_2, e_2) = 0 and D(e_2, e_1) = −D(e_1, e_2); if D also satisfies the property (iii) given above, then (1.1) reduces to D(A) = a_{11} a_{22} − a_{12} a_{21}.
Now, for any alternating n-linear function D : R^{n×n} → R, we shall consider e_i, the i-th row of the identity matrix. Then
D(A) = D\left( \sum_{i=1}^{n} a_{1i} e_i, \; \sum_{i=1}^{n} a_{2i} e_i, \; \ldots, \; \sum_{i=1}^{n} a_{ni} e_i \right).  (1.2)
Applying the n-linear property and D(e_{k_1}, e_{k_2}, ..., e_{k_n}) = 0 for any repeated k_i, we get from (1.2)
D(A) = \sum_{\sigma} a_{1\sigma_1} a_{2\sigma_2} \cdots a_{n\sigma_n} D(e_{\sigma_1}, e_{\sigma_2}, \ldots, e_{\sigma_n}),
where the sum runs over all permutations σ of {1, 2, ..., n}. Further, if D satisfies (iii) of the determinant function, then D(e_{\sigma_1}, ..., e_{\sigma_n}) = sgn(σ) D(I) = sgn(σ), so D(A) is uniquely determined by the entries of A and
D(A) = \sum_{\sigma} \mathrm{sgn}(\sigma)\, a_{1\sigma_1} a_{2\sigma_2} \cdots a_{n\sigma_n}.
Theorem 1.7. Let D be an alternating (n − 1)-linear function on (n − 1) × (n − 1) matrices. Given an n × n matrix A, for each j, 1 ≤ j ≤ n, define
E_j(A) = \sum_{i=1}^{n} (−1)^{i+j} A_{ij} D_{ij}(A),
where A_{ij} denotes the (i, j) entry of A and D_{ij}(A) is the value of D at the (n − 1) × (n − 1) matrix obtained by deleting the i-th row and j-th column of A. Then E_j is an alternating n-linear function on R^{n×n}.
The following are some basic properties of the determinant which are useful:
(i) The determinant is a linear function of any row when all the other rows are held fixed.
(ii) The determinant changes sign when two rows are interchanged.
(iii) The determinant is unchanged if a constant multiple of one row is added to another row.
Remark (Laplace expansion): The determinant can be evaluated by expansion along a row or a column. Here A(i | j) denotes the submatrix obtained by deleting the i-th row and j-th column of A.
• The determinant expansion for a real matrix A of size n × n about the j-th column is
det(A) = \sum_{i=1}^{n} (−1)^{i+j} a_{ij} \det(A(i \mid j)).
• The determinant expansion for a real matrix A of size n × n about the i-th row is
det(A) = \sum_{j=1}^{n} (−1)^{i+j} a_{ij} \det(A(i \mid j)).
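The row expansion translates directly into a short recursive routine. A minimal Python sketch (the function name det_laplace is our own), checked against numpy on the matrix of Exercise 1.15 below:

```python
import numpy as np

def det_laplace(A):
    """Determinant by Laplace (cofactor) expansion along the first row:
    det(A) = sum_j (-1)^(1+j) a_{1j} det(A(1|j)). Exponential time, so
    only suitable for small matrices."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)  # A(1|j)
        total += (-1) ** j * A[0, j] * det_laplace(minor)
    return total

A = np.array([[91, 92, 93], [94, 95, 96], [97, 98, 99]])  # Exercise 1.15
print(det_laplace(A), np.linalg.det(A))  # both are 0
```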
Exercise 1.7. For a square matrix A ∈ R^{n×n}, the adjoint matrix adj(A) is the matrix whose (i, j) entry is the cofactor (−1)^{i+j} det(A(j | i)).
Prove that A · adj(A) = det(A) I. Hence prove Cramer's rule, i.e., the j-th coordinate x_j of the solution to Ax = b is given by x_j = |B_j| / |A|, where B_j is the matrix obtained by replacing the j-th column of A by b.
Exercise 1.8. A square matrix A ∈ R^{n×n} has an inverse if and only if det(A) is nonzero.
Exercise 1.9. Find the inverses of the following matrices using the adjoint method:
(i) \begin{pmatrix} 1 & 2 & -1 \\ -1 & 1 & 2 \\ 2 & -1 & 1 \end{pmatrix}
(ii) \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
Exercise 1.10. Solve the following systems of equations using Cramer's rule:
(i) 3x + y + 2z = 3, 2x − 3y − z = −3, x + 2y + z = 4
(ii) x + y + z = 11, 2x − 6y − z = 0, 3x + 4y + 2z = 0
(iii) 3x − 2y = 7, 3y − 2z = 6, 3z − 2x = −1
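As a numerical check on Cramer's rule, a minimal numpy sketch for system (i):

```python
import numpy as np

# Cramer's rule x_j = |B_j| / |A|, where B_j replaces column j of A by b.
# System (i): 3x + y + 2z = 3, 2x - 3y - z = -3, x + 2y + z = 4.
A = np.array([[3.0, 1.0, 2.0],
              [2.0, -3.0, -1.0],
              [1.0, 2.0, 1.0]])
b = np.array([3.0, -3.0, 4.0])

x = np.empty(3)
for j in range(3):
    Bj = A.copy()
    Bj[:, j] = b          # replace j-th column of A by b
    x[j] = np.linalg.det(Bj) / np.linalg.det(A)

print(x, np.linalg.solve(A, b))  # the two answers agree
```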
Exercise 1.11. Solve for x: \begin{vmatrix} x & 2 & -1 \\ 2 & 5 & x \\ -1 & 2 & x \end{vmatrix} = 0.
Exercise 1.12. If ω is an imaginary cube root of unity, evaluate \begin{vmatrix} 1 & \omega & \omega^2 \\ \omega & \omega^2 & 1 \\ \omega^2 & 1 & \omega \end{vmatrix}.
Exercise 1.13. Prove that \begin{vmatrix} 1+a & b & c \\ a & 1+b & c \\ a & b & 1+c \end{vmatrix} = 1 + a + b + c.
Exercise 1.14. Show that \begin{vmatrix} a+b+2c & a & b \\ c & b+c+2a & b \\ c & a & c+a+2b \end{vmatrix} = 2(a+b+c)^3.
Exercise 1.15. Evaluate \begin{vmatrix} 91 & 92 & 93 \\ 94 & 95 & 96 \\ 97 & 98 & 99 \end{vmatrix}.
Exercise 1.16. Solve for x: \begin{vmatrix} x+3 & 5 & 7 \\ 3 & x+5 & 7 \\ 3 & 5 & x+7 \end{vmatrix} = 0.
Exercise 1.17. If A = \begin{pmatrix} 4 & 10 & 11 \\ 7 & 6 & 2 \\ 1 & 5 & 4 \end{pmatrix} and B = \begin{pmatrix} 1 & -2 & 3 \\ 0 & 2 & 1 \\ -4 & 5 & 2 \end{pmatrix}, find |A·B|.
Exercise 1.18. If A = \begin{pmatrix} 102 & 105 \\ 100 & 100 \end{pmatrix} and B = \begin{pmatrix} 160 & 150 \\ 150 & 150 \end{pmatrix}, find |A·B|.
Exercise 1.19. If A = \begin{pmatrix} 3 & 2 & x \\ 4 & 1 & -1 \\ 0 & 3 & 4 \end{pmatrix} is a singular matrix, find x.
Exercise 1.20. If x = −9 is a root of \begin{vmatrix} x & 3 & 7 \\ 2 & x & 2 \\ 7 & 6 & x \end{vmatrix} = 0, find the other two roots.
Exercise 1.21. Decide whether the determinant of the following matrix A is even or odd, without evaluating it explicitly:
A = \begin{pmatrix} 387 & 456 & 589 & 238 \\ 488 & 455 & 677 & 382 \\ 440 & 982 & 654 & 651 \\ 892 & 564 & 786 & 442 \end{pmatrix}.
Exercise 1.22. If A, B are n × n matrices, show that \begin{vmatrix} A+B & A \\ A & A \end{vmatrix} = |A||B|.
1.3 Eigenvalues, Positive Definite Matrices and Decompositions
Note: In the characteristic polynomial, the coefficient of x^n is 1, the coefficient of x^{n−1} is (−1)^1 Trace(A), ..., the coefficient of x^{n−i} is (−1)^i s_i, where s_i is the sum of all i × i principal minors of A, ..., and the constant term is (−1)^n det(A).
Definition 1.2. The roots of the characteristic equation of a square matrix A are called the eigenvalues (characteristic values) of A; in other words, λ is said to be an eigenvalue of A if there exists a vector x ≠ 0 such that Ax = λx. Such a vector x is called an eigenvector of A corresponding to the eigenvalue λ.
Note that the set of vectors { x : Ax = λ x} is the nullspace of A − λ I. This nullspace is called the
eigenspace of A corresponding to the eigenvalue λ and its dimension is called the geometric multi-
plicity of λ.
The eigenvalues may not all be distinct. The number of times an eigenvalue occurs as a root of the
characteristic equation is called the algebraic multiplicity of the eigenvalue.
Cayley-Hamilton theorem: Given an n × n matrix A with characteristic polynomial P(x) = det(xI − A), we have P(A) = 0.
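The theorem is easy to check numerically. A minimal numpy sketch (the 2 × 2 test matrix is our own choice; np.poly returns the coefficients of the characteristic polynomial):

```python
import numpy as np

# Numerical check of the Cayley-Hamilton theorem P(A) = 0.
A = np.array([[2.0, 1.0], [1.0, 2.0]])

coeffs = np.poly(A)           # coefficients of det(xI - A), highest degree first
n = A.shape[0]
P_of_A = np.zeros_like(A)
for c in coeffs:
    P_of_A = P_of_A @ A + c * np.eye(n)   # Horner evaluation of P at the matrix A

print(np.allclose(P_of_A, 0))  # True
```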
Exercise 1.26. Find the characteristic polynomial of a 2 × 2 matrix whose trace and determinant are 7
and 6 respectively.
Exercise 1.27. Show that a matrix A and its transpose A T have the same characteristic polynomial.
Exercise 1.28. Suppose M = \begin{pmatrix} A_1 & B \\ 0 & A_2 \end{pmatrix}, where A_1 and A_2 are square matrices. Show that the characteristic polynomial of M is the product of the characteristic polynomials of A_1 and A_2.
(iii) \begin{pmatrix} 5 & 8 & -1 & 0 \\ 0 & 3 & 6 & 7 \\ 0 & -3 & 5 & -4 \\ 0 & 0 & 0 & 7 \end{pmatrix}
Exercise 1.30. Find all the eigenvalues and the eigenvectors corresponding to each of the eigenvalues of the following matrices:
(i) \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}
(ii) \begin{pmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix}
(iii) \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}
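Such eigenvalue exercises can be checked with numpy. A minimal sketch for part (i) of Exercise 1.30:

```python
import numpy as np

# Eigenvalues and eigenvectors for Exercise 1.30 (i).
A = np.array([[1.0, 4.0], [2.0, 3.0]])
vals, vecs = np.linalg.eig(A)    # columns of `vecs` are the eigenvectors
print(vals)                      # 5 and -1
for lam, v in zip(vals, vecs.T):
    print(np.allclose(A @ v, lam * v))   # A v = lambda v holds for each pair
```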
Exercise 1.31. Show that the eigenvectors corresponding to distinct eigenvalues of a matrix are linearly
independent.
1.4 Diagonalization
Definition 1.3. Two n × n matrices A and B are said to be similar if there exists an invertible matrix P
such that P −1 AP = B.
Suppose A is diagonalizable, i.e., P^{-1} A P = D for some invertible matrix P and diagonal matrix D, which implies
A P = P D \implies A [P_1 | P_2 | \cdots | P_n] = [P_1 | P_2 | \cdots | P_n] \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.
Equivalently,
[AP1 | AP2 |...| AP n ] = [λ1 P1 |λ2 P2 |...|λn P n ].
Hence A P_j = λ_j P_j, i.e., (λ_j, P_j) is an eigenpair for A. Thus P must be a matrix whose columns constitute n linearly independent eigenvectors, and D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues. Conversely, if there exists a linearly independent set of n eigenvectors that are used as columns to build a nonsingular matrix P, and if D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues, then P^{-1} A P = D.
A complete set of eigenvectors for A n×n is any set of n linearly independent eigenvectors for A. By the
above discussion it follows that A is diagonalizable if and only if it has a complete set of eigenvectors.
Hence A is diagonalizable if and only if the algebraic multiplicity equals the geometric multiplicity of
each eigenvalue.
Exercise 1.32. If possible, diagonalize the following matrix with a similarity transformation:
A = \begin{pmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{pmatrix}
Exercise 1.33. If possible, diagonalize the following matrices with a similarity transformation. Other-
wise give reasons why they are not diagonalizable:
1. A = \begin{pmatrix} 0 & 1 \\ -8 & 4 \end{pmatrix}
2. A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}
3. A = \begin{pmatrix} 5 & 4 & 2 & 1 \\ 0 & 1 & -1 & -1 \\ -1 & -1 & 3 & 0 \\ 1 & 1 & -1 & 2 \end{pmatrix}
4. A = \begin{pmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{pmatrix}
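A quick numerical check for Exercise 1.32 (a sketch; for the non-diagonalizable matrices of Exercise 1.33 the eigenvector matrix P below comes out singular and the check fails):

```python
import numpy as np

# Diagonalization check: A = P D P^{-1} when a complete set of
# eigenvectors exists.
A = np.array([[1.0, -4.0, -4.0],
              [8.0, -11.0, -8.0],
              [-8.0, 8.0, 5.0]])

vals, P = np.linalg.eig(A)
D = np.diag(vals)
print(np.linalg.matrix_rank(P) == 3)             # P invertible: diagonalizable
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # A = P D P^{-1}
```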
1.5 Positive Definite Matrices
A symmetric n × n matrix A is said to be positive definite if x^T A x > 0 for every nonzero x ∈ R^n, and positive semidefinite if x^T A x ≥ 0 for every x ∈ R^n.
5. Let A be a symmetric n × n matrix. Then A is positive definite if and only if the eigenvalues of A are all positive. Similarly, A is positive semidefinite if and only if the eigenvalues of A are all nonnegative.
6. Let A be a symmetric n × n matrix. Then A is positive definite if and only if all principal minors of A are positive. (Similarly, A is positive semidefinite if and only if all principal minors of A are nonnegative.)
7. Let A be a symmetric n × n matrix. Then A is positive definite if and only if all leading principal
minors of A are positive.
Note: (i) If A is a symmetric matrix then the eigenvalues of A are all real.
(ii) If v, w are eigenvectors of a symmetric matrix corresponding to distinct eigenvalues α, β, then v and w are mutually orthogonal.
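Two of the positive-definiteness tests from items 5 and 7 above, as a minimal numpy sketch (the test matrix is our own choice; both tests assume A is symmetric):

```python
import numpy as np

def is_pd_eigen(A, tol=1e-12):
    # all eigenvalues positive (item 5)
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

def is_pd_minors(A, tol=1e-12):
    # all leading principal minors positive (item 7)
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > tol for k in range(1, n + 1))

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 3 and 1: positive definite
print(is_pd_eigen(A), is_pd_minors(A))   # True True
```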
1.6 LU Decomposition
Theorem 1.8. Let
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
be a non-singular matrix. Then A can be factorized into the form LU, where
L = \begin{pmatrix} l_{11} & 0 & 0 & \cdots & 0 \\ l_{21} & l_{22} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn} \end{pmatrix} \quad and \quad U = \begin{pmatrix} 1 & u_{12} & u_{13} & \cdots & u_{1n} \\ 0 & 1 & u_{23} & \cdots & u_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix},
if
a_{11} ≠ 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} ≠ 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} ≠ 0, and so on.
Such a factorization, whenever it exists, is unique.
i.e.,
A = \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ l_{21} u_{11} & l_{21} u_{12} + u_{22} & l_{21} u_{13} + u_{23} \\ l_{31} u_{11} & l_{31} u_{12} + l_{32} u_{22} & l_{31} u_{13} + l_{32} u_{23} + u_{33} \end{pmatrix}.
Equating the corresponding entries, we get
u_{11} = a_{11}, \quad u_{12} = a_{12}, \quad u_{13} = a_{13},
l_{21} = \frac{a_{21}}{a_{11}}, \quad l_{31} = \frac{a_{31}}{a_{11}},
u_{22} = a_{22} − \frac{a_{21}}{a_{11}} a_{12}, \quad u_{23} = a_{23} − \frac{a_{21}}{a_{11}} a_{13},
l_{32} = \frac{a_{32} − \frac{a_{31}}{a_{11}} a_{12}}{u_{22}},
from which u_{33} can be computed.
Note: We follow a systematic procedure to evaluate the elements of L and U (where L is unit lower triangular and U is upper triangular).
Step I: Determine first row of U and first column of L.
Step II: Determine the second row of U and the second column of L.
Step III: Determine third row of U.
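The three steps translate into a short routine. A minimal Python sketch of this procedure (unit lower-triangular L, no pivoting, so it assumes the nonzero leading minors of Theorem 1.8; the test matrix is our own):

```python
import numpy as np

def lu_doolittle(A):
    """LU factorization with unit lower-triangular L, following the
    row-of-U / column-of-L order of Steps I-III above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        for j in range(k, n):                 # k-th row of U
            U[k, j] = A[k, j] - L[k, :k] @ U[:k, j]
        for i in range(k + 1, n):             # k-th column of L
            L[i, k] = (A[i, k] - L[i, :k] @ U[:k, k]) / U[k, k]
    return L, U

A = np.array([[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]])
L, U = lu_doolittle(A)
print(np.allclose(L @ U, A))  # True
```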
Exercise. Factorize the matrix with rows (3 1 2), ... into the LU form.
Exercise. Factorize the matrix with rows (3 5 3), ... into the LU form.
Exercise. Factorize the matrix with rows (3 7 4), (1 2 1), ... into the LU form.
1.7 Cholesky Decomposition
Theorem 1.9. A positive definite matrix A can be factorized into a product A = L L^T, where L is a lower-triangular matrix with positive diagonal entries:
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0 & \cdots & 0 \\ l_{21} & l_{22} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} & l_{31} & \cdots & l_{n1} \\ 0 & l_{22} & l_{32} & \cdots & l_{n2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & l_{nn} \end{pmatrix}.
L is called the Cholesky factor of A, and L is unique.
i.e.,
A = \begin{pmatrix} l_{11}^2 & l_{21} l_{11} & l_{31} l_{11} \\ l_{21} l_{11} & l_{21}^2 + l_{22}^2 & l_{21} l_{31} + l_{22} l_{32} \\ l_{31} l_{11} & l_{21} l_{31} + l_{22} l_{32} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{pmatrix}.
Equating the first columns of each of the matrices, we get
l_{11} = \sqrt{a_{11}}, \quad l_{21} = \frac{a_{21}}{l_{11}}, \quad l_{31} = \frac{a_{31}}{l_{11}}.
Note: If the matrix is Positive Semi-definite, instead of Positive Definite, then it still has a de-
composition of the form A = LL T , where the diagonal entries are allowed to be zero. However, this
decomposition is not unique.
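The equating-of-entries scheme above becomes a short routine. A minimal Python sketch (the positive definite test matrix is our own choice):

```python
import numpy as np

def cholesky_lower(A):
    """Cholesky factor L with A = L L^T, computed column by column
    exactly as in the equations above (A assumed symmetric positive definite)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    for j in range(n):
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

A = np.array([[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]])
L = cholesky_lower(A)
print(np.allclose(L @ L.T, A), np.allclose(L, np.linalg.cholesky(A)))  # True True
```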
1.8 Spectral Decomposition Theorem
Theorem (Spectral Decomposition). Let A be a symmetric n × n matrix of rank r. Then there exists an orthogonal matrix U such that
A = U diag(λ_1, λ_2, ..., λ_r, 0, ..., 0) U^T,
where λ_1, λ_2, ..., λ_r are the nonzero eigenvalues of A and the first r columns of U are the unit eigenvectors corresponding to the eigenvalues.
Lemma 1.1. If A is a symmetric matrix of size n × n and v is a unit eigenvector corresponding to the eigenvalue α, then B = α v v^T is a matrix satisfying the following properties: C = A − B is symmetric, rank(A) = rank(B) + rank(C), and B C = C B = 0.
Proof. The proof is by induction on the rank of the given symmetric matrix A. If the rank of A is one with nonzero eigenvalue λ_1 and eigenvector v, then A is of the form
A = λ_1 u_1 u_1^T,
where u_1 = \frac{1}{\|v\|} v. Now extend {u_1} to an orthonormal basis {u_1, u_2, ..., u_n} of R^n, and construct a matrix U by taking u_1, u_2, ..., u_n as columns in the same order. Clearly U U^T = U^T U = I and
A = U \begin{pmatrix} λ_1 & 0 \\ 0 & 0 \end{pmatrix} U^T.
So the theorem holds when the rank of A is one. Suppose that the theorem holds for all symmetric matrices of rank r − 1. If A is a symmetric matrix of rank r, choose an eigenvalue λ_1 of A with corresponding unit eigenvector u_1, as in the lemma above. Now construct B = λ_1 u_1 u_1^T and C = A − B, which satisfy rank(A) = rank(B) + rank(C) and BC = CB = 0. Since C is a symmetric matrix and rank(C) = r − 1, by induction there exists an orthogonal matrix V such that
C = V diag(λ_2, ..., λ_r, 0, ..., 0) V^T,
where λ_2, ..., λ_r are the nonzero eigenvalues of C and the first r − 1 columns u_2, ..., u_r of V are the corresponding unit eigenvectors. Since BC = CB = 0, the vectors u_2, ..., u_r are eigenvectors corresponding to the eigenvalues λ_2, ..., λ_r of C as well as of A. For the same reason, u_1 together with the first r − 1 columns {u_2, ..., u_r} of V forms a set of orthonormal vectors. Extend this set of orthonormal vectors to an orthonormal basis of R^n, and let U be the orthogonal matrix with these basis vectors as columns (u_1, u_2, ..., u_r first). Then
A = U diag(λ_1, λ_2, ..., λ_r, 0, ..., 0) U^T.
Example 6. Obtain the spectral decomposition of A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.
We seek an orthogonal Q and diagonal D with A = Q D Q^T. The first step is to find the eigenvalues of A. To get the eigenvalues we solve the characteristic equation |A − λI| = 0:
|A − λ I| = \begin{vmatrix} 2 − λ & 1 \\ 1 & 2 − λ \end{vmatrix} = 0.
Solving, we get λ = 3, 1.
An eigenvector corresponding to the eigenvalue λ = 3 is given by
\begin{pmatrix} −1 & 1 \\ 1 & −1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix};
solving, we get (x_1, x_2)^T = (1, 1)^T. The normalized eigenvector corresponding to λ = 3 is (1/\sqrt{2}, 1/\sqrt{2})^T.
An eigenvector corresponding to the eigenvalue λ = 1 is given by
\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix};
solving, we get (x_1, x_2)^T = (−1, 1)^T. The normalized eigenvector corresponding to λ = 1 is (−1/\sqrt{2}, 1/\sqrt{2})^T.
The spectral decomposition of A is given as
A = \begin{pmatrix} 1/\sqrt{2} & −1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ −1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.
Exercise 1.39. If A is a positive semidefinite matrix, then there exists a unique positive semidefinite matrix B such that B^2 = A. (The matrix B is called the square root of A and is denoted by A^{1/2}.)
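A sketch of the square-root construction of Exercise 1.39 via the spectral decomposition, using the matrix of Example 6:

```python
import numpy as np

# If A = U diag(lam) U^T is the spectral decomposition, take
# B = U diag(sqrt(lam)) U^T as the square root A^{1/2}.
A = np.array([[2.0, 1.0], [1.0, 2.0]])     # eigenvalues 3 and 1

lam, U = np.linalg.eigh(A)                 # eigh: orthogonal U for symmetric A
B = U @ np.diag(np.sqrt(lam)) @ U.T

print(np.allclose(B @ B, A))               # B^2 = A
print(np.all(np.linalg.eigvalsh(B) >= 0))  # B is positive semidefinite
```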
Exercise 1.40. Obtain the spectral decomposition of \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.

Exercise 1.41. Obtain the spectral decomposition of the matrices
\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 2 & 5 \\ 5 & 3 \end{pmatrix}.
1.9 Singular Values
Definition. Let A be an n × n matrix. The singular values of A are the nonnegative square roots of the eigenvalues of A^T A, denoted by σ_1(A) ≥ σ_2(A) ≥ ··· ≥ σ_n(A), or simply by
σ_1 ≥ ··· ≥ σ_n.
Suppose A is an m × n matrix with m < n. Augment A by n − m zero rows to get a square matrix, say B. Then the singular values of A are defined to be the singular values of B. Suppose m > n; then a similar definition can be given by augmenting A by m − n zero columns, instead of zero rows.
The following assertions can be verified easily. We omit the proof.
(i) The singular values of A and P AQ are identical for any orthogonal matrices P,Q.
(ii) The rank of a matrix equals the number of nonzero singular values of the matrix.
(iii) If A is symmetric, then the singular values of A are the absolute values of its eigenvalues. If A is positive semidefinite, then the singular values are the same as its eigenvalues.
1.10 Singular Value Decomposition
Theorem (Singular Value Decomposition). Let A be an m × n matrix of rank r with positive singular values σ_1 ≥ σ_2 ≥ ··· ≥ σ_r. Then there exist orthogonal matrices U and V such that
A = U diag(σ_1, σ_2, ..., σ_r, 0, ..., 0) V^T.
Proof. If A is a matrix of size m × n with rank r, it is clear that the matrix A A^T is of size m × m with rank r. Further, the eigenvalues of A A^T are non-negative; let the positive eigenvalues be σ_1^2, σ_2^2, ..., σ_r^2. Consider the orthogonal unit eigenvectors u_1, u_2, ..., u_r of A A^T corresponding to the eigenvalues σ_1^2, σ_2^2, ..., σ_r^2. From the spectral decomposition theorem, we have
A A^T = P D^2 P^T,
where D = diag(σ_1, ..., σ_r) and P is the matrix obtained by taking the orthogonal unit eigenvectors u_1, u_2, ..., u_r as its columns. Note that P^T P = I and P P^T A A^T = A A^T, which implies P P^T A = A. Now write v_i = \frac{1}{σ_i} A^T u_i. Observe that
A^T A v_i = A^T A \left( \frac{1}{σ_i} A^T u_i \right) = \frac{1}{σ_i} A^T (A A^T u_i) = \frac{1}{σ_i} A^T (σ_i^2 u_i) = σ_i^2 v_i,
and therefore the v_i are eigenvectors of A^T A. Further, the v_i are orthogonal unit vectors, and therefore {v_1, v_2, ..., v_r} is a set of orthonormal vectors. For Q obtained by taking {v_1, v_2, ..., v_r} as its columns, we get
A^T A = Q D^2 Q^T, \qquad Q = A^T P D^{-1},
and therefore
P D Q^T = P D D^{-1} P^T A = P P^T A = A.
Now extend the matrices P and Q to orthogonal matrices U and V, respectively, to get
A = U \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} V^T.
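The theorem can be exercised numerically. A minimal numpy sketch, using the rank-one matrix of Example 1 (this also illustrates assertion (ii) above):

```python
import numpy as np

# numpy returns the singular values in nonincreasing order, and
# rank(A) equals the number of nonzero singular values.
A = np.array([[3.0, 6.0, 6.0], [1.0, 2.0, 2.0]])   # Example 1, rank 1

U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(s)                                   # one nonzero singular value
print(np.allclose(U @ Sigma @ Vt, A))      # A = U Sigma V^T
print(np.sum(s > 1e-12) == np.linalg.matrix_rank(A))  # True
```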
1.11 Generalized Inverses and Applications
Consider the linear system
A x = y, \qquad (1.3)
where A is an m × n matrix and y ∈ R(A), the range space of A. If the matrix A is nonsingular, then x = A^{-1} y is the solution to the system (1.3). Suppose the matrix A is singular or m ≠ n; then we need a right candidate G of order n × m such that
A G y = y. \qquad (1.4)
That is, G y is a solution to the linear system (1.3). Equivalently, G is of order n × m such that
A G A = A. \qquad (1.5)
Definition (generalized inverse). A matrix G of order n × m is said to be a generalized inverse of the m × n matrix A if
A G A = A.
G is also known as a g-inverse, {1}-inverse, pseudo inverse, or partial inverse by many authors in the literature. We denote an arbitrary generalized inverse by A^−. The set of all generalized inverses is denoted by {A^−}.
Remark: If A is square and nonsingular, then A −1 is the unique g-inverse of A.
Lemma 1.2. If G is a g-inverse of A, then rank(A) = rank(AG) = rank(GA).
Proof. Since A G A = A and rank(AB) ≤ min{rank(A), rank(B)} for any two matrices A, B, we have
rank(A) = rank(A G A) ≤ rank(A G) ≤ rank(A).
Also,
rank(A) = rank(A G A) ≤ rank(G A) ≤ rank(A).
Example 7. Let A be a matrix and let G be a g-inverse of A. Show that the class of all g-inverses of A is given by
G + (I − G A) U + V (I − A G),
where U and V are arbitrary matrices of appropriate dimensions.
Solution. Consider any matrix G + (I − G A)U + V (I − A G), where U and V are arbitrary matrices and G is a g-inverse of A. We have A G A = A. Now,
A (G + (I − G A)U + V (I − A G)) A = A G A + A U A − A G A U A + A V A − A V A G A
= A + A U A − A U A + A V A − A V A
= A,
so every matrix of this form is a g-inverse of A.
Conversely, let H be an arbitrary g-inverse of A and set W = H − G. Then A W A = A H A − A G A = A − A = 0. Write
W = (I − G A)W + G A W.
Since A W A = 0, we have G A W A = 0, hence G A W A G = 0 and G A W = G A W (I − A G). So
W = (I − G A)W + G A W (I − A G),
which is of the form (I − G A)U + V (I − A G) with U = W and V = G A W. Therefore H = G + (I − G A)U + V (I − A G), as required.
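Example 7 can be checked numerically: starting from any one g-inverse, the parametrization generates further g-inverses. A minimal numpy sketch (we take G = A^+ via np.linalg.pinv, which in particular satisfies A G A = A, and reuse the matrix of Example 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 6.0, 6.0], [1.0, 2.0, 2.0]])   # the matrix of Example 1

G = np.linalg.pinv(A)                # one particular g-inverse (n x m)
n, m = G.shape
for _ in range(5):
    U, V = rng.normal(size=(n, m)), rng.normal(size=(n, m))
    H = G + (np.eye(n) - G @ A) @ U + V @ (np.eye(m) - A @ G)
    print(np.allclose(A @ H @ A, A))   # True every time
```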
1.12 Construction of generalized inverse
Let A = BC be a rank factorization of A, where B is m × r and C is r × n, each of rank r. Then
G = C_r^- B_l^-
is a g-inverse of A, where C_r^- is a right inverse of C and B_l^- is a left inverse of B. Indeed,
A G A = B C C_r^- B_l^- B C = B I_r I_r C = B C = A,
so G is a g-inverse of A.
This also shows that any matrix which is not a square, nonsingular matrix admits infinitely many
g-inverses.
3. Let A be of rank r. Since A is a matrix of rank r, there exists an r × r submatrix whose determinant is nonzero. Without loss of generality, let
A = \begin{pmatrix} B & C \\ D & E \end{pmatrix},
where B_{r×r} is the non-singular submatrix of A. Also, there exists a matrix X such that C = BX and E = DX. Then
G = \begin{pmatrix} B^{-1} & 0 \\ 0 & 0 \end{pmatrix}
is a g-inverse of A.
Solution. Note that the matrix is of rank 2, since the Echelon form of the matrix is
\begin{pmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
Now note that A_1 = \begin{pmatrix} 2 & 0 \\ -1 & 1 \end{pmatrix} is a 2 × 2 nonsingular minor. Fitting the inverse of A_1 in the appropriate place, we get the g-inverse
G_1 = \begin{pmatrix} 0 & 1/2 & 0 & 0 \\ 0 & 1/2 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
Similarly, A_2 = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}, which gives A_2^{-1} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}. And the g-inverse is
G_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
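The same construction in numpy, on a hypothetical rank-2 matrix of our own (the full matrix of the worked example above is not reproduced in these notes):

```python
import numpy as np

# Invert a nonsingular r x r submatrix and place its inverse,
# transposed in position, into an otherwise zero n x m matrix.
A = np.array([[1.0, 0.0, -1.0, 2.0],
              [2.0, 1.0, -2.0, 9.0],
              [3.0, 1.0, -3.0, 11.0]])          # rank 2: row3 = row1 + row2
rows, cols = [0, 1], [0, 1]                     # A1 = A[rows][:, cols] is nonsingular

A1 = A[np.ix_(rows, cols)]
G = np.zeros((A.shape[1], A.shape[0]))
G[np.ix_(cols, rows)] = np.linalg.inv(A1)       # fit A1^{-1} in the appropriate place

print(np.allclose(A @ G @ A, A))                # G is a g-inverse of A
```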
1.13 Minimum Norm, Least Squares g-inverse and Moore-Penrose inverse
Definition (reflexive g-inverse). A g-inverse G of A is said to be a reflexive g-inverse if
G A G = G.
Theorem. For any g-inverse G of A,
rank(A) ≤ rank(G),
and G is reflexive if and only if rank(A) = rank(G).
Proof. From Lemma 1.2, rank(A) = rank(G A) ≤ rank(G). If G is a reflexive g-inverse of A, then A is a g-inverse of G and hence rank(G) ≤ rank(A), so equality holds.
Conversely, suppose rank(A) = rank(G). First observe that C(G A) ⊆ C(G). By Lemma 1.2, rank(G) = rank(G A) and hence C(G) = C(G A). Therefore G = G A X for some X. Now G A G = G A G A X = G A X = G, and hence G is reflexive.
Definition (minimum norm g-inverse). A g-inverse G of A is said to be a minimum norm g-inverse if, in addition to A G A = A, it satisfies
(G A)^T = G A.
Definition 1.12 (least squares g-inverse). A g-inverse G of A is said to be a least squares g-inverse if, in addition to A G A = A, it satisfies
(A G)^T = A G.
Definition 1.13 (Moore-Penrose inverse). If G is a reflexive g-inverse of A which is both minimum norm and least squares, then it is called a Moore-Penrose inverse of A.
In other words, G is said to be a Moore-Penrose inverse of A if it satisfies
A G A = A, \quad G A G = G, \quad (A G)^T = A G, \quad (G A)^T = G A.
Lemma 1.3. Let A be a complex matrix of order m × n. Then the Moore-Penrose inverse of A exists and it is unique.
Proof. Let A = BC be a rank factorization. Define B^+ = (B^T B)^{-1} B^T and C^+ = C^T (C C^T)^{-1}, and then
A^+ = C^+ B^+.
Verification:
(i) A A^+ A = B C C^T (C C^T)^{-1} (B^T B)^{-1} B^T B C = B C
= A.
(ii) A^+ A A^+ = C^T (C C^T)^{-1} (B^T B)^{-1} B^T \cdot B C \cdot C^T (C C^T)^{-1} (B^T B)^{-1} B^T = C^T (C C^T)^{-1} (B^T B)^{-1} B^T
= A^+.
(iii) A A^+ = B C C^T (C C^T)^{-1} (B^T B)^{-1} B^T
= B (B^T B)^{-1} B^T, so
(A A^+)^T = (B (B^T B)^{-1} B^T)^T
= B (B^T B)^{-1} B^T;
∴ (A A^+)^T = A A^+.
(iv) A^+ A = C^T (C C^T)^{-1} (B^T B)^{-1} B^T B C
= C^T (C C^T)^{-1} C, so (A^+ A)^T
= C^T (C C^T)^{-1} C;
∴ (A^+ A)^T = A^+ A.
Since all the four conditions of Moore-Penrose are satisfied, A^+ is a Moore-Penrose inverse of A. Hence the existence. To prove the uniqueness, let G_1 and G_2 be two Moore-Penrose inverses of A. Then
G_1 = G_1 A G_1 = G_1 G_1^T A^T (∵ A G_1 = (A G_1)^T)
= G_1 G_1^T A^T G_2^T A^T (∵ A G_2 A = A)
= G_1 G_1^T A^T A G_2 (∵ A G_2 = (A G_2)^T)
= G_1 A G_1 A G_2 (∵ A G_1 = (A G_1)^T)
= G_1 A G_2 A G_2 (∵ A G_1 A = A, G_2 A G_2 = G_2)
= G_1 A A^T G_2^T G_2 (∵ G_2 A = (G_2 A)^T)
= A^T G_1^T A^T G_2^T G_2 (∵ G_1 A = (G_1 A)^T)
= A^T G_2^T G_2 (∵ A G_1 A = A)
= G_2 A G_2 = G_2 (∵ G_2 A = (G_2 A)^T, G_2 A G_2 = G_2).
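The existence construction can be checked numerically. A minimal numpy sketch, with the rank factorization of Example 1's matrix:

```python
import numpy as np

# From a rank factorization A = B C, the Moore-Penrose inverse is
# A+ = C^T (C C^T)^{-1} (B^T B)^{-1} B^T.
B = np.array([[3.0], [1.0]])
C = np.array([[1.0, 2.0, 2.0]])
A = B @ C

A_plus = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
print(np.allclose(A_plus, np.linalg.pinv(A)))   # matches numpy's pinv
```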
(i) If A is a matrix of rank 1, then A^+ = \frac{1}{α} A^T is the Moore-Penrose inverse of A, where α = Trace(A^T A). (Proof as exercise.)

Exercise 1.43. Find the Moore-Penrose inverse of A = \begin{pmatrix} 2 & 1 & 3 \\ 4 & 2 & 6 \end{pmatrix}.
Here rank(A) = 1,
A^T A = \begin{pmatrix} 20 & 10 & 30 \\ 10 & 5 & 15 \\ 30 & 15 & 45 \end{pmatrix}, \quad Trace(A^T A) = 70,
and
A^+ = \frac{1}{α} A^T = \begin{pmatrix} 2/70 & 4/70 \\ 1/70 & 2/70 \\ 3/70 & 6/70 \end{pmatrix}, where α = Trace(A^T A) = 70.
(ii) Let A be an m × n matrix. The singular value decomposition of A is given by A_{m×n} = U_{m×m} Σ_{m×n} V^T_{n×n}, where U and V are orthogonal matrices and Σ is a block diagonal matrix consisting of the singular values of A and zeros.
Example. Find the Moore-Penrose inverse of
A = \begin{pmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{pmatrix}
using the singular value decomposition.
Solution. Given A as above, the formula for the SVD is A_{m×n} = U_{m×m} Σ_{m×n} V^T_{n×n}, where U, V are orthogonal matrices. In this case the formula for calculating the Moore-Penrose inverse is A^+ = V Σ^+ U^T.
The first step is to find the singular values of A. To get the singular values we have to find the eigenvalues of A^T A by solving |A^T A − λI| = 0. Now
A^T A = \begin{pmatrix} 1 & -2 & 2 \\ -1 & 2 & -2 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{pmatrix} = \begin{pmatrix} 9 & -9 \\ -9 & 9 \end{pmatrix}.
Therefore
|A^T A − λ I| = \begin{vmatrix} 9 − λ & −9 \\ −9 & 9 − λ \end{vmatrix} = 0; \quad ∴ λ = 18, 0.
Thus the singular values are σ = \sqrt{18}, 0.
We find the eigenvector corresponding to λ = 18 by solving the matrix equation (A^T A − λ I) x = 0, i.e.,
\begin{pmatrix} −9 & −9 \\ −9 & −9 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Solving, we get v = (−1, 1)^T. Normalizing v, we get v_1 = (−1/\sqrt{2}, 1/\sqrt{2})^T.
We require a unit vector v_2 which is orthogonal to v_1. To find v_2 we write v_2 = (y_1, y_2)^T such that y_1^2 + y_2^2 = 1 and v_1^T v_2 = 0, i.e., \frac{1}{\sqrt{2}}(−y_1 + y_2) = 0.
∴ v_2 = (1/\sqrt{2}, 1/\sqrt{2})^T and
V = \begin{pmatrix} −1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.
Next is to find U_{3×3}. Let
u_1 = \frac{1}{σ_1} A v_1 = \frac{1}{\sqrt{18}} \begin{pmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} −1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} −1/3 \\ 2/3 \\ −2/3 \end{pmatrix}.
We now extend the set {u_1} to form an orthonormal basis for R^3. We need two orthonormal vectors which are orthogonal to u_1, satisfying u_1^T x = 0, i.e., −x_1 + 2x_2 − 2x_3 = 0:
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2x_2 − 2x_3 \\ x_2 \\ x_3 \end{pmatrix} = x_2 \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + x_3 \begin{pmatrix} −2 \\ 0 \\ 1 \end{pmatrix}.
Orthonormalizing, we may take
u_2 = \begin{pmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \\ 0 \end{pmatrix}, \quad u_3 = \begin{pmatrix} −2/\sqrt{45} \\ 4/\sqrt{45} \\ 5/\sqrt{45} \end{pmatrix}.
∴ U = \begin{pmatrix} −1/3 & 2/\sqrt{5} & −2/\sqrt{45} \\ 2/3 & 1/\sqrt{5} & 4/\sqrt{45} \\ −2/3 & 0 & 5/\sqrt{45} \end{pmatrix}.
∴ A = \begin{pmatrix} −1/3 & 2/\sqrt{5} & −2/\sqrt{45} \\ 2/3 & 1/\sqrt{5} & 4/\sqrt{45} \\ −2/3 & 0 & 5/\sqrt{45} \end{pmatrix} \begin{pmatrix} \sqrt{18} & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} −1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} is the required singular value decomposition of A.
Hence the Moore-Penrose inverse of A is
A^+ = V Σ^+ U^T = \begin{pmatrix} −1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1/\sqrt{18} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} U^T = \begin{pmatrix} 1/18 & −1/9 & 1/9 \\ −1/18 & 1/9 & −1/9 \end{pmatrix}.
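As a check on the worked example, numpy's built-in pinv reproduces the same A^+:

```python
import numpy as np

A = np.array([[1.0, -1.0], [-2.0, 2.0], [2.0, -2.0]])
A_plus = np.array([[1/18, -1/9, 1/9],
                   [-1/18, 1/9, -1/9]])
print(np.allclose(np.linalg.pinv(A), A_plus))   # True
```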
We now obtain some characterizations of the Moore–Penrose inverse in terms of volume. For example,
we will show that if A is an n × n matrix then A + is a g-inverse of A with minimum volume. First we
prove some preliminary results. It is easily seen that A + can be determined from the singular value
decomposition of A. A more general result is proved next.
Theorem. Let A = U \begin{pmatrix} Σ & 0 \\ 0 & 0 \end{pmatrix} V^T be a singular value decomposition of A, where Σ = diag(σ_1, ..., σ_r) carries the positive singular values. Then the class of all g-inverses G of A is given by
G = V \begin{pmatrix} Σ^{-1} & X \\ Y & Z \end{pmatrix} U^T, \qquad (1.6)
where X, Y, Z are arbitrary matrices of appropriate dimension. The class of reflexive g-inverses G of A is given by (1.6) with the additional condition that Z = Y Σ X. The class of least squares g-inverses G of A is given by (1.6) with X = 0. The class of minimum norm g-inverses G of A is given by (1.6) with Y = 0. Finally, the Moore-Penrose inverse of A is given by (1.6) with X, Y, Z all being zero.