
SINGULAR VALUE DECOMPOSITION

A great matrix factorization has been saved for the end of the decomposition modules. 𝑼𝚺𝑽^T joins with 𝑳𝑼 from elimination and 𝑸𝑹 from orthogonalization (Gram-Schmidt). Nobody's name is attached; 𝑨 = 𝑼𝚺𝑽^T is known as the "SVD" or the singular value decomposition. We want to describe it, to prove it, and to discuss its applications, which are many and growing.

The SVD is closely associated with the eigenvalue-eigenvector factorization 𝑸𝑫𝑸^T of a symmetric matrix. The eigenvalues are in the diagonal matrix 𝑫. The eigenvector matrix 𝑸 is orthogonal (𝑸^T𝑸 = 𝑰) because eigenvectors of a symmetric matrix can be chosen to be orthonormal. For most matrices that is not true, and for rectangular matrices it is ridiculous (eigenvalues undefined). But now we allow the 𝑸 on the left and the 𝑸^T on the right to be any two orthogonal matrices 𝑼 and 𝑽^T, not necessarily transposes of each other. Then every matrix will split into 𝑨 = 𝑼𝚺𝑽^T.

The diagonal (but rectangular) matrix 𝚺 contains the square roots of the eigenvalues of 𝑨^T𝑨, not the eigenvalues of 𝑨 itself. Those positive entries (also called sigmas) will be σ₁, …, σᵣ. They are the singular values of 𝑨. They fill the first r places on the main diagonal of 𝚺 when 𝑨 has rank r; the rest of 𝚺 is zero. With rectangular matrices, the key is almost always to consider 𝑨^T𝑨 and 𝑨𝑨^T.
Singular Value Decomposition: Any m × n matrix 𝑨 can be factored into

𝑨 = 𝑼𝚺𝑽^T = (orthogonal)(diagonal)(orthogonal).

The columns of 𝑼 (m × m) are eigenvectors of 𝑨𝑨^T, and the columns of 𝑽 (n × n) are eigenvectors of 𝑨^T𝑨. The r singular values on the diagonal of 𝚺 (m × n) are the square roots of the nonzero eigenvalues of both 𝑨𝑨^T and 𝑨^T𝑨.
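
Before proving and applying this, here is how the factorization looks numerically. This is a minimal sketch using NumPy's `numpy.linalg.svd`; the test matrix is an arbitrary illustration, not one from these notes:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                 # any m x n matrix; m=3, n=2 here

U, s, Vt = np.linalg.svd(A)                # U (m x m), sigmas, V^T (n x n)

Sigma = np.zeros(A.shape)                  # rebuild the rectangular Sigma
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(A, U @ Sigma @ Vt)      # A = U Sigma V^T
assert np.allclose(U.T @ U, np.eye(3))     # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(2))   # V is orthogonal
```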

Remark 1:

For positive definite matrices, 𝚺 is 𝑫 and 𝑼𝚺𝑽^T is identical to 𝑸𝑫𝑸^T. For other symmetric matrices, any negative eigenvalues in 𝑫 become positive in 𝚺 (the sign is absorbed into 𝑼 or 𝑽). For complex matrices, 𝚺 remains real but 𝑼 and 𝑽 become unitary (the complex version of orthogonal). We take complex conjugates in 𝑼*𝑼 = 𝑰 and 𝑽*𝑽 = 𝑰, and 𝑨 = 𝑼𝚺𝑽*.

Remark 2:

𝑼 and 𝑽 give orthonormal bases for all four fundamental subspaces (a numerical sketch follows the list):

First r columns of 𝑼: column space of 𝑨

Last m − r columns of 𝑼: nullspace of 𝑨^T

First r columns of 𝑽: row space of 𝑨

Last n − r columns of 𝑽: nullspace of 𝑨
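
A sketch of reading these bases off in practice (NumPy again; the rank tolerance `tol` below is a common heuristic, not a rule from these notes):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0]])            # rank r = 1

U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s.max()
r = int(np.sum(s > tol))                   # numerical rank

col_space  = U[:, :r]                      # column space of A
left_null  = U[:, r:]                      # nullspace of A^T
row_space  = Vt[:r, :].T                   # row space of A
null_space = Vt[r:, :].T                   # nullspace of A

assert np.allclose(A @ null_space, 0)      # A v = 0 on the nullspace
assert np.allclose(A.T @ left_null, 0)     # A^T u = 0 on the left nullspace
```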


Remark 3:

The SVD chooses those bases in an extremely special way. They are more than just orthonormal. When 𝑨 multiplies a column 𝒗ⱼ of 𝑽, it produces σⱼ times the column 𝒖ⱼ of 𝑼. That comes directly from 𝑨𝑽 = 𝑼𝚺, looked at a column at a time: 𝑨𝒗ⱼ = σⱼ𝒖ⱼ.

Remark 4:

Eigenvectors of 𝑨𝑨^T and 𝑨^T𝑨 must go into the columns of 𝑼 and 𝑽:

$$AA^T = (U\Sigma V^T)(V\Sigma^T U^T) = U\Sigma\Sigma^T U^T \quad\text{and, similarly,}\quad A^T A = V\Sigma^T\Sigma V^T. \tag{1}$$

𝑼 must be the eigenvector matrix for 𝑨𝑨^T. The eigenvalue matrix in the middle is 𝚺𝚺^T, which is m × m with σ₁², …, σᵣ² on the diagonal.

From 𝑨^T𝑨 = 𝑽𝚺^T𝚺𝑽^T, the matrix 𝑽 must be the eigenvector matrix for 𝑨^T𝑨. The diagonal matrix 𝚺^T𝚺 has the same σ₁², …, σᵣ², but it is n × n.
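
A quick numerical confirmation of equation (1), namely that 𝚺^T𝚺 and 𝚺𝚺^T carry the squared singular values (the test matrix is an arbitrary choice):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                 # arbitrary 3 x 2 test matrix

U, s, Vt = np.linalg.svd(A)

# Eigenvalues of A^T A (n x n), sorted to descending order
lam_n = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(lam_n, s**2)            # Sigma^T Sigma carries sigma_i^2

# A A^T (m x m) has the same nonzero eigenvalues, padded with zeros
lam_m = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
assert np.allclose(lam_m[:len(s)], s**2)
```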

Example:

Compute the SVD of $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{bmatrix}$.

Solution:

$$AA^T = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 6 & 12 \\ 12 & 24 \end{bmatrix}$$

$$\det(AA^T - \lambda I) = 0 \;\Rightarrow\; \begin{vmatrix} 6-\lambda & 12 \\ 12 & 24-\lambda \end{vmatrix} = 0 \;\Rightarrow\; (6-\lambda)(24-\lambda) - 144 = 0$$

$$\Rightarrow\; 144 - 30\lambda + \lambda^2 - 144 = 0 \;\Rightarrow\; \lambda = 0,\ 30$$
Now we find the eigenvectors associated with these eigenvalues.

𝝀 = 𝟎:

$$AA^T x = 0 \;\Rightarrow\; \begin{bmatrix} 6 & 12 \\ 12 & 24 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} 6 & 12 \\ 12 & 24 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix}$$

So $x_1 + 2x_2 = 0 \Rightarrow x_1 = -2x_2$, giving

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = t \begin{bmatrix} -2 \\ 1 \end{bmatrix}$$

So $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ is a basis for the eigenspace corresponding to the eigenvalue 0.
𝝀 = 𝟑𝟎:

$$(AA^T - 30I)x = 0 \;\Rightarrow\; \begin{bmatrix} -24 & 12 \\ 12 & -6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} -24 & 12 \\ 12 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & -\tfrac{1}{2} \\ 0 & 0 \end{bmatrix}$$

$$\Rightarrow\; x_1 - \tfrac{1}{2}x_2 = 0 \;\Rightarrow\; \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = t \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix}$$

So $\begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix}$ is a basis for the eigenspace corresponding to the eigenvalue 30.

Normalizing both eigenvectors and putting the one for the larger eigenvalue λ = 30 in the first column,

$$U = \begin{bmatrix} \tfrac{1}{\sqrt{5}} & -\tfrac{2}{\sqrt{5}} \\ \tfrac{2}{\sqrt{5}} & \tfrac{1}{\sqrt{5}} \end{bmatrix}.$$

Now to find 𝑽 we start with 𝑨^T𝑨.

$$A^T A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 10 & 5 \\ 10 & 20 & 10 \\ 5 & 10 & 5 \end{bmatrix}$$

$$\det(A^T A - \lambda I) = 0 \;\Rightarrow\; \begin{vmatrix} 5-\lambda & 10 & 5 \\ 10 & 20-\lambda & 10 \\ 5 & 10 & 5-\lambda \end{vmatrix} = 0$$

Expanding along the first row:

$$(5-\lambda)\big[(20-\lambda)(5-\lambda) - 100\big] - 10\big[10(5-\lambda) - 50\big] + 5\big[100 - 5(20-\lambda)\big] = 0$$

$$\Rightarrow\; (5-\lambda)\,\lambda(\lambda - 25) + 100\lambda + 25\lambda = 0 \;\Rightarrow\; \lambda\,(30\lambda - \lambda^2 - 125 + 100 + 25) = 0$$

$$\Rightarrow\; \lambda^2(30 - \lambda) = 0 \;\Rightarrow\; \lambda = 0,\ 0,\ 30$$

𝝀 = 𝟎:

$$A^T A x = 0 \;\Rightarrow\; \begin{bmatrix} 5 & 10 & 5 \\ 10 & 20 & 10 \\ 5 & 10 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \;\sim\; \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

$$\Rightarrow\; x_1 + 2x_2 + x_3 = 0; \qquad x_3 = t,\; x_2 = s,\; x_1 = -2s - t$$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = t \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} + s \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}$$
Now we apply the Gram-Schmidt process to get orthonormal vectors. Normalizing the first eigenvector,

$$v_1 = \begin{bmatrix} -\tfrac{1}{\sqrt{2}} \\ 0 \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}$$

$$P_1 = \text{projection of } \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} \text{ on } v_1 = v_1 v_1^T \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -2 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

$$\text{Now } \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix}, \qquad \text{so } v_2 = \begin{bmatrix} -\tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{3}} \\ -\tfrac{1}{\sqrt{3}} \end{bmatrix}$$
𝝀 = 𝟑𝟎:

$$(A^T A - 30I)x = 0 \;\Rightarrow\; \begin{bmatrix} -25 & 10 & 5 \\ 10 & -10 & 10 \\ 5 & 10 & -25 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} -25 & 10 & 5 \\ 10 & -10 & 10 \\ 5 & 10 & -25 \end{bmatrix} \sim \begin{bmatrix} 1 & -\tfrac{2}{5} & -\tfrac{1}{5} \\ 10 & -10 & 10 \\ 5 & 10 & -25 \end{bmatrix} \sim \begin{bmatrix} 1 & -\tfrac{2}{5} & -\tfrac{1}{5} \\ 0 & -6 & 12 \\ 0 & 12 & -24 \end{bmatrix} \sim \begin{bmatrix} 1 & -\tfrac{2}{5} & -\tfrac{1}{5} \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix}$$

$$\Rightarrow\; x_1 - \tfrac{2}{5}x_2 - \tfrac{1}{5}x_3 = 0, \qquad x_2 - 2x_3 = 0$$

$$\Rightarrow\; x_3 = t,\; x_2 = 2t,\; x_1 = \tfrac{4}{5}t + \tfrac{1}{5}t = t \;\Rightarrow\; \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = t \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

Normalizing and putting this eigenvector (for λ = 30) in the first column,

$$V = \begin{bmatrix} \tfrac{1}{\sqrt{6}} & -\tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{3}} \\ \tfrac{2}{\sqrt{6}} & 0 & \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{6}} & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{3}} \end{bmatrix}.$$

Now

$$U\Sigma V^T = \begin{bmatrix} \tfrac{1}{\sqrt{5}} & -\tfrac{2}{\sqrt{5}} \\ \tfrac{2}{\sqrt{5}} & \tfrac{1}{\sqrt{5}} \end{bmatrix} \begin{bmatrix} \sqrt{30} & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{6}} & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \\ -\tfrac{1}{\sqrt{2}} & 0 & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} & -\tfrac{1}{\sqrt{3}} \end{bmatrix}$$

$$= \begin{bmatrix} \tfrac{\sqrt{30}}{\sqrt{5}} & 0 & 0 \\ \tfrac{2\sqrt{30}}{\sqrt{5}} & 0 & 0 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{6}} & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \\ -\tfrac{1}{\sqrt{2}} & 0 & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} & -\tfrac{1}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{bmatrix} = A.$$
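
The hand computation can be checked numerically. Note that a library may flip the signs of corresponding columns of 𝑼 and 𝑽, which leaves 𝑼𝚺𝑽^T unchanged:

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0]])

U, s, Vt = np.linalg.svd(A)
print(s)                                   # [5.477..., ~0] = [sqrt(30), 0]
assert np.isclose(s[0], np.sqrt(30))

# Columns agree with the hand computation up to an overall sign flip
assert np.allclose(np.abs(U[:, 0]), [1/np.sqrt(5), 2/np.sqrt(5)])
assert np.allclose(np.abs(Vt[0, :]), [1/np.sqrt(6), 2/np.sqrt(6), 1/np.sqrt(6)])
```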

Applications of the SVD

We will pick a few important applications, after emphasizing one key point. The SVD is terrific for numerically stable computations, because 𝑼 and 𝑽 are orthogonal matrices: they never change the length of a vector. Since ‖𝑼𝒙‖² = 𝒙^T𝑼^T𝑼𝒙 = ‖𝒙‖², multiplication by 𝑼 cannot destroy the scaling.
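
A one-line numerical illustration of that length-preserving property (arbitrary random data):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, _, _ = np.linalg.svd(A)                 # U is 4 x 4 orthogonal

x = rng.standard_normal(4)
# ||Ux|| = ||x||: an orthogonal factor cannot blow up or shrink a vector
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
```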

Image processing

Suppose a satellite takes a picture and wants to send it to Earth. The picture may contain 1000 × 1000 "pixels": a million little squares, each with a definite color. We can code the colors and send back 1,000,000 numbers. It is better to find the essential information inside the 1000 × 1000 matrix, and send only that.

Suppose we know the SVD. The key is in the singular values (in 𝚺). Typically, some σ's are significant and others are extremely small. If we keep 20 and throw away 980, then we send only the corresponding 20 columns of 𝑼 and 𝑽. The other 980 columns are multiplied in 𝑼𝚺𝑽^T by the small σ's that are being ignored. We can do the matrix multiplication as columns times rows:

$$A = U\Sigma V^T = u_1\sigma_1 v_1^T + u_2\sigma_2 v_2^T + \cdots + u_r\sigma_r v_r^T. \tag{3}$$

Any matrix is the sum of r matrices of rank 1. If only 20 terms are kept, we send 20 times 2000 numbers instead of a million (25 to 1 compression). The pictures are really striking as more and more singular values are included: at first you see nothing, and suddenly you recognize everything. The cost is in computing the SVD; this has become much more efficient, but it is expensive for a big matrix.
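
A sketch of the idea in code: keep the k largest singular values and the matching columns of 𝑼 and 𝑽. The random matrix below is only a stand-in for an image; real photographs have rapidly decaying singular values, random noise does not:

```python
import numpy as np

def truncated_svd(A, k):
    """Keep the k largest singular values: the best rank-k approximation."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
img = rng.standard_normal((1000, 1000))    # stand-in for a 1000 x 1000 image

approx = truncated_svd(img, 20)            # rank-20 version of the picture
# Numbers sent: 20 columns of U and of V (plus 20 sigmas) vs one million
print(20 * (1000 + 1000) / 1_000_000)      # 0.04, i.e. 25:1 compression
```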

Polar decomposition

Every nonzero complex number z is a positive number r times a number e^{iθ} on the unit circle: z = re^{iθ}. That expresses z in "polar coordinates." If we think of z as a 1 × 1 matrix, r corresponds to a positive definite matrix and e^{iθ} corresponds to an orthogonal matrix. More exactly, since e^{iθ} is complex and satisfies e^{−iθ}e^{iθ} = 1, it forms a 1 × 1 unitary matrix: 𝑼*𝑼 = 𝑰. We take the complex conjugate as well as the transpose, for 𝑼*.

The SVD extends this "polar factorization" to matrices of any size:

Every real square matrix can be factored into 𝑨 = 𝑸𝑺, where 𝑸 is orthogonal and 𝑺 is symmetric positive semidefinite. If 𝑨 is invertible then 𝑺 is positive definite.

For proof we just insert 𝑽^T𝑽 = 𝑰 into the middle of the SVD:

$$A = U\Sigma V^T = (UV^T)(V\Sigma V^T). \tag{4}$$

The factor 𝑺 = 𝑽𝚺𝑽^T is symmetric and semidefinite (because 𝚺 is). The factor 𝑸 = 𝑼𝑽^T is an orthogonal matrix (because 𝑸^T𝑸 = 𝑽𝑼^T𝑼𝑽^T = 𝑰). In the complex case, 𝑺 becomes Hermitian instead of symmetric and 𝑸 becomes unitary instead of orthogonal. In the invertible case 𝚺 is definite and so is 𝑺.
Polar decomposition:

$$A = QS: \quad \begin{bmatrix} 1 & -2 \\ 3 & -1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ -1 & 2 \end{bmatrix}.$$

Reverse polar decomposition:

$$A = S'Q: \quad \begin{bmatrix} 1 & -2 \\ 3 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$

Both 𝑺 and 𝑺′ are symmetric positive definite because this 𝑨 is invertible.
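
A sketch of computing both polar factors from the SVD, using the matrix from the example above. The factors here are unique because 𝑨 is invertible, so the numerical result should reproduce the example:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0, -1.0]])

U, s, Vt = np.linalg.svd(A)
Q = U @ Vt                                 # orthogonal factor Q = U V^T
S = Vt.T @ np.diag(s) @ Vt                 # symmetric factor S = V Sigma V^T

assert np.allclose(A, Q @ S)               # A = Q S
assert np.allclose(Q, [[0, -1], [1, 0]])   # matches the example above
assert np.allclose(S, [[3, -1], [-1, 2]])

S_prime = U @ np.diag(s) @ U.T             # reverse factorization A = S' Q
assert np.allclose(A, S_prime @ Q)
assert np.allclose(S_prime, [[2, 1], [1, 3]])
```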

Moore-Penrose Inverse

We have already seen in previous modules that finding the generalized inverse takes considerable calculation and time. But here is an easy way to find it if we know the singular value decomposition.

If 𝑨 = 𝑼𝚺𝑽^T (the SVD), then its pseudoinverse is

$$A^+ = V\Sigma^+ U^T,$$

where 𝚺⁺ is n × m with the reciprocals 1/σ₁, …, 1/σᵣ on its diagonal.
Example:

$$A = \begin{bmatrix} -1 & 2 & 2 \end{bmatrix} = U\Sigma V^T = \begin{bmatrix} 1 \end{bmatrix} \begin{bmatrix} 3 & 0 & 0 \end{bmatrix} \begin{bmatrix} -\tfrac{1}{3} & \tfrac{2}{3} & \tfrac{2}{3} \\ \tfrac{2}{3} & -\tfrac{1}{3} & \tfrac{2}{3} \\ \tfrac{2}{3} & \tfrac{2}{3} & -\tfrac{1}{3} \end{bmatrix}$$

$$V\Sigma^+ U^T = \begin{bmatrix} -\tfrac{1}{3} & \tfrac{2}{3} & \tfrac{2}{3} \\ \tfrac{2}{3} & -\tfrac{1}{3} & \tfrac{2}{3} \\ \tfrac{2}{3} & \tfrac{2}{3} & -\tfrac{1}{3} \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{9} \\ \tfrac{2}{9} \\ \tfrac{2}{9} \end{bmatrix} = A^+.$$
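
The same pseudoinverse computed numerically; NumPy's built-in `np.linalg.pinv` is also SVD-based, so the two should agree:

```python
import numpy as np

A = np.array([[-1.0, 2.0, 2.0]])           # the 1 x 3 matrix from the example

U, s, Vt = np.linalg.svd(A)                # s = [3.]
# Sigma+ is n x m: reciprocals of the nonzero sigmas, transposed shape
Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
Sigma_plus[:len(s), :len(s)] = np.diag(1.0 / s)

A_plus = Vt.T @ Sigma_plus @ U.T
print(A_plus.ravel())                      # [-1/9, 2/9, 2/9]
assert np.allclose(A_plus, np.linalg.pinv(A))
```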
