
Singular Value Decomposition (SVD)

Shusen Wang
Orthonormal Basis

Definition (Orthonormal Basis).
A set of vectors 𝐯_1, 𝐯_2, ⋯, 𝐯_n ⊂ ℝ^n forms an orthonormal basis if:
• unit ℓ₂-norm: ‖𝐯_i‖₂ = 1 for all i;
• orthogonal to each other: 𝐯_iᵀ 𝐯_j = 0 for all i ≠ j.
SVD and Truncated SVD
Singular Value Decomposition (SVD)

• Let 𝐀 ∈ ℝ^{m×n} be any matrix.
• Rank: r = rank(𝐀). (r ≤ m, n.)
• SVD: 𝐀 = ∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ. (Each term σ_i 𝐮_i 𝐯_iᵀ is a rank-1 matrix.)
• Singular values: σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_r > 0.
• Left singular vectors: 𝐮_1, 𝐮_2, ⋯, 𝐮_r ⊂ ℝ^m form an orthonormal basis.
• Right singular vectors: 𝐯_1, 𝐯_2, ⋯, 𝐯_r ⊂ ℝ^n form an orthonormal basis.
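These properties can be checked numerically. A small NumPy sketch (the random test matrix and variable names are ours, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))

# np.linalg.svd returns U, the singular values, and V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A equals the sum of the rank-1 terms sigma_i * u_i * v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(A_rebuilt, A)

# The left/right singular vectors have orthonormal columns.
assert np.allclose(U.T @ U, np.eye(4), atol=1e-10)
assert np.allclose(Vt @ Vt.T, np.eye(4), atol=1e-10)
```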
Truncated SVD

• Let 𝐀 ∈ ℝ^{m×n} be any matrix.
• Rank: r = rank(𝐀). (r ≤ m, n.)
• SVD: 𝐀 = ∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ.
• Truncated SVD's definition: for 0 < k < r, 𝐀_k = ∑_{i=1}^{k} σ_i 𝐮_i 𝐯_iᵀ.
• Truncated SVD's properties:
  • 𝐀_k is the best rank-k approximation to 𝐀.
  • 𝐀_k = argmin_𝐁 ‖𝐀 − 𝐁‖_F²  s.t. rank(𝐁) ≤ k.
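A minimal sketch of the truncated SVD in NumPy (the helper name `truncated_svd` is ours):

```python
import numpy as np

def truncated_svd(A, k):
    """Return A_k = sum_{i=1}^{k} sigma_i * u_i * v_i^T (rank-k truncation)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
A2 = truncated_svd(A, 2)
assert np.linalg.matrix_rank(A2) == 2  # A_k has rank k
```

Keeping all r terms (here k = 4) recovers 𝐀 exactly.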
Matrix Frobenius Norm

• Let 𝐀 = [a_{ij}] ∈ ℝ^{m×n} be any matrix.
• Frobenius norm: ‖𝐀‖_F = √(∑_{i=1}^{m} ∑_{j=1}^{n} a_{ij}²).
• A generalization of the ℓ₂ vector norm to matrices.
• Property: ‖𝐀‖_F² = ∑_{i=1}^{r} σ_i².
  • σ_i is the i-th singular value of 𝐀.
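A quick numerical check of this property (the random test matrix is ours):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 7))

fro_sq = np.sum(A**2)  # ||A||_F^2: sum of squared entries
sigmas = np.linalg.svd(A, compute_uv=False)

# ||A||_F^2 equals the sum of the squared singular values.
assert np.isclose(fro_sq, np.sum(sigmas**2))
```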


Matrix Rank & Singular Values

• Let 𝐀 ∈ ℝ^{m×n} be any matrix. (Suppose m ≤ n.)
• r = rank(𝐀). (r ≤ m.)
• Singular values:
  σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_r > 0 = σ_{r+1} = ⋯ = σ_m.
• rank(𝐀) equals the number of positive singular values.

Error of Truncated SVD

• SVD: 𝐀 = ∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ.
• Truncated SVD: for 0 < k < r, 𝐀_k = ∑_{i=1}^{k} σ_i 𝐮_i 𝐯_iᵀ.
• Error of the truncated SVD:
  ‖𝐀 − 𝐀_k‖_F² = ∑_{i=k+1}^{r} σ_i²,
  the sum of the squared bottom (the smallest) singular values.
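The error formula can be verified directly in NumPy (the random test matrix and the choice k = 2 are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k truncated SVD

# ||A - A_k||_F^2 equals the sum of the squared bottom singular values.
err_sq = np.linalg.norm(A - A_k, 'fro')**2
assert np.isclose(err_sq, np.sum(s[k:]**2))
```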


SVD: Example

• Original image: a 1000×1500 matrix.
• [Figure: singular values plotted against their indices; the top-50 singular values dominate, and the rest decay quickly.]
• [Figure: reconstructions of the image from the rank-5, rank-20, rank-50, and rank-100 truncated SVDs; quality improves with rank.]
• The rank-100 truncated SVD is a high-quality approximation using only 10% of the singular values/vectors!
SVD: Example

• The original matrix:
  • Shape: 1000×1500.
  • #Entries: 1.5M.
• The rank-100 truncated SVD:
  • Shapes: 100×1 (singular values), 100×1500 (right singular vectors), and 100×1000 (left singular vectors).
  • #Entries: ≈0.25M.
• Truncated SVD saves about 83% of the storage.
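The storage arithmetic above can be reproduced directly:

```python
# Entry counts for the 1000x1500 example with a rank-100 truncation.
m, n, k = 1000, 1500, 100

original = m * n               # dense matrix: 1,500,000 entries
truncated = k + k * m + k * n  # singular values + left + right vectors

savings = 1 - truncated / original
assert original == 1_500_000
assert truncated == 250_100    # about 0.25M
assert round(savings, 2) == 0.83
```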
Power Iteration for Computing Truncated SVD
A Property

Theorem. If 𝐀 = ∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ is the SVD of 𝐀, then 𝐀ᵀ𝐀 = ∑_{i=1}^{r} σ_i² 𝐯_i 𝐯_iᵀ.

Proof.
• 𝐀ᵀ𝐀 = (∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ)ᵀ (∑_{j=1}^{r} σ_j 𝐮_j 𝐯_jᵀ) = (∑_{i=1}^{r} σ_i 𝐯_i 𝐮_iᵀ)(∑_{j=1}^{r} σ_j 𝐮_j 𝐯_jᵀ).
• 𝐀ᵀ𝐀 = ∑_{i=1}^{r} σ_i² 𝐯_i 𝐮_iᵀ 𝐮_i 𝐯_iᵀ + ∑_{i≠j} σ_i σ_j 𝐯_i 𝐮_iᵀ 𝐮_j 𝐯_jᵀ.
  (Using ∑_i 𝐗_iᵀ ∑_j 𝐗_j = ∑_i 𝐗_iᵀ 𝐗_i + ∑_{i≠j} 𝐗_iᵀ 𝐗_j.)
• 𝐀ᵀ𝐀 = ∑_{i=1}^{r} σ_i² 𝐯_i (1) 𝐯_iᵀ + ∑_{i≠j} σ_i σ_j 𝐯_i (0) 𝐯_jᵀ = ∑_{i=1}^{r} σ_i² 𝐯_i 𝐯_iᵀ.
  (Using the properties of an orthonormal basis: 𝐮_iᵀ 𝐮_i = 1 and 𝐮_iᵀ 𝐮_j = 0 for i ≠ j.) ∎

This is the eigenvalue decomposition of 𝐀ᵀ𝐀: the 𝐯_i are eigenvectors with eigenvalues σ_i².

Implication: to compute the top singular values σ_i and right singular vectors 𝐯_i, we can do an eigenvalue decomposition instead of the SVD.
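The implication can be verified with NumPy: the eigenvalues of 𝐀ᵀ𝐀 match the squared singular values of 𝐀 (the random test matrix is ours):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))

# eigvalsh handles the symmetric matrix A^T A; eigenvalues come out ascending.
eigvals = np.linalg.eigvalsh(A.T @ A)
sigmas = np.linalg.svd(A, compute_uv=False)  # descending order

assert np.allclose(eigvals[::-1], sigmas**2)
```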
Power Iteration for Truncated SVD

Goal: Compute the top-1 eigenvalue/eigenvector of 𝐀ᵀ𝐀 = ∑_{i=1}^{r} σ_i² 𝐯_i 𝐯_iᵀ.

Algorithm:
1. Randomly initialize a vector 𝐱_0 (with unit ℓ₂-norm);
2. Repeat the power iteration: 𝐱_t ← (𝐀ᵀ𝐀) 𝐱_{t−1} and 𝐱_t ← 𝐱_t / ‖𝐱_t‖₂.

Each iteration is merely 2 matrix-vector multiplications (first 𝐀𝐱, then 𝐀ᵀ(𝐀𝐱)). Very cheap!

Convergence Analysis (𝐱_t converges to 𝐯_1):
• 𝐱_0 = ∑_{i=1}^{n} α_i 𝐯_i. Here α_1 = Ω(1/n) with high probability.
  • Every vector can be written as a linear combination of the orthonormal basis.
  • Because 𝐱_0 is randomly initialized, α_1 = 𝐱_0ᵀ 𝐯_1 = Ω(1/n) with high probability.
• 𝐱_t ∝ (𝐀ᵀ𝐀)^t 𝐱_0 = (∑_{i=1}^{r} σ_i^{2t} 𝐯_i 𝐯_iᵀ)(∑_{j=1}^{n} α_j 𝐯_j) = ∑_{i=1}^{r} α_i σ_i^{2t} 𝐯_i.
  • It can be proved that (𝐀ᵀ𝐀)^t = ∑_{i=1}^{r} σ_i^{2t} 𝐯_i 𝐯_iᵀ.
• Dividing by σ_1^{2t}: 𝐱_t ∝ ∑_{i=1}^{r} α_i (σ_i/σ_1)^{2t} 𝐯_i = α_1 𝐯_1 + ∑_{i=2}^{r} α_i (σ_i/σ_1)^{2t} 𝐯_i.
• The second term converges to 0 because σ_i/σ_1 < 1 for all i ≥ 2 (assuming σ_1 > σ_2), so 𝐱_t converges to 𝐯_1 (up to sign).
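The algorithm above, sketched in NumPy (the function name, the fixed iteration count, and the test matrix with a known spectrum are ours; 𝐀ᵀ𝐀 𝐱 is applied as two matrix-vector products):

```python
import numpy as np

def power_iteration(A, num_iters=500, seed=0):
    """Estimate the top singular value and right singular vector of A."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[1])
    x /= np.linalg.norm(x)           # unit l2-norm initialization
    for _ in range(num_iters):
        x = A.T @ (A @ x)            # two matrix-vector multiplications
        x /= np.linalg.norm(x)       # re-normalize
    sigma1 = np.linalg.norm(A @ x)   # ||A v_1|| = sigma_1
    return sigma1, x

# Example on a matrix with known singular values 4 > 2 > 1:
rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((10, 3)))
V, _ = np.linalg.qr(rng.standard_normal((7, 3)))
A = U @ np.diag([4.0, 2.0, 1.0]) @ V.T
sigma1, v1 = power_iteration(A)
assert np.isclose(sigma1, 4.0)
```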
Power Iteration for Truncated SVD

Goal: Compute the top-k eigenvalues/eigenvectors of 𝐀ᵀ𝐀 = ∑_{i=1}^{r} σ_i² 𝐯_i 𝐯_iᵀ.

Algorithm:
1. Randomly initialize a matrix 𝐗_0 ∈ ℝ^{n×k} (entries are i.i.d. standard Gaussian);
2. Orthogonalize the columns: 𝐗_0 ← orth(𝐗_0);
3. Repeat the power iteration:
   i.  𝐗_t ← (𝐀ᵀ𝐀) 𝐗_{t−1};
   ii. 𝐗_t ← orth(𝐗_t).
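A sketch of the block (top-k) version in NumPy, using a reduced QR factorization for the orth(·) step (the function name and test matrix are ours):

```python
import numpy as np

def block_power_iteration(A, k, num_iters=500, seed=0):
    """Estimate the top-k singular values / right singular vectors of A."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((A.shape[1], k))  # i.i.d. standard Gaussian init
    X, _ = np.linalg.qr(X)                    # orth(X) via reduced QR
    for _ in range(num_iters):
        X = A.T @ (A @ X)                     # power step
        X, _ = np.linalg.qr(X)                # re-orthogonalize the columns
    sigmas = np.linalg.norm(A @ X, axis=0)    # ||A v_i|| = sigma_i
    return sigmas, X

# Example on a matrix with known singular values 5 > 3 > 2 > 1:
rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.standard_normal((8, 4)))
V, _ = np.linalg.qr(rng.standard_normal((6, 4)))
A = U @ np.diag([5.0, 3.0, 2.0, 1.0]) @ V.T
sigmas, X = block_power_iteration(A, k=2)
assert np.allclose(sigmas, [5.0, 3.0])
```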
Summary

• SVD: 𝐀 = ∑_{i=1}^{r} σ_i 𝐮_i 𝐯_iᵀ.
• Truncated SVD: abandon the bottom singular values/vectors.
• Power iteration: an algorithm for computing the truncated SVD.
