
2010/08/30

Experiments of RedSVD

Daisuke Okanohara
RedSVD
• RedSVD is a C++ library for matrix
decompositions
– New BSD license
– http://code.google.com/p/redsvd/
• RedSVD implements the algorithm in [1]
– [1] “Finding structure with randomness: Stochastic algorithms
for constructing approximate matrix decompositions”, N. Halko,
P.G. Martinsson, J. Tropp, arXiv 0909.4061
RedSVD (contd.)
• RedSVD differs from the original work [1].
– To reduce the memory requirement further,
RedSVD samples both rows and columns and solves
a smaller SVD problem.
• RedSVD is optimized for truncated SVD and for
sparse matrices
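One way to read "sample both rows and columns" is sketched below in plain Python. This is an illustrative assumption, not RedSVD's actual code: the function names (`reduce_both_sides`, `orthonormalize_columns`, and so on) are hypothetical, and the final SVD of the small matrix is left out. Random Gaussian combinations of the rows and then of the columns shrink an m x n matrix A to an r x r matrix C with A ≈ Z C Y^T.

```python
import random

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Dense product of matrices stored as lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def gaussian(rows, cols, rng):
    """rows x cols matrix of independent standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def orthonormalize_columns(a):
    """Modified Gram-Schmidt on the columns of a (assumes full column rank)."""
    basis = []
    for v in transpose(a):
        for q in basis:
            dot = sum(x * y for x, y in zip(q, v))
            v = [x - dot * y for x, y in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        basis.append([x / norm for x in v])
    return transpose(basis)

def reduce_both_sides(a, r, rng):
    """Return (Z, C, Y) with A ~= Z C Y^T, where C is only r x r."""
    m = len(a)
    # Random combinations of the rows of A, orthonormalized: n x r.
    y = orthonormalize_columns(matmul(transpose(a), gaussian(m, r, rng)))
    b = matmul(a, y)  # m x r
    # Random combinations of the columns of B, orthonormalized: m x r.
    z = orthonormalize_columns(matmul(b, gaussian(r, r, rng)))
    c = matmul(transpose(z), b)  # r x r
    return z, c, y

rng = random.Random(0)
a = matmul(gaussian(8, 3, rng), gaussian(3, 6, rng))  # 8 x 6 matrix of exact rank 3
z, c, y = reduce_both_sides(a, 3, rng)
approx = matmul(matmul(z, c), transpose(y))
error = max(abs(p - q) for ra, rb in zip(a, approx) for p, q in zip(ra, rb))
```

If C = U S V^T is then computed with any dense solver, A ≈ (Z U) S (Y V)^T, so only the r x r matrix needs a full SVD.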
Experiments
• Conducted the following two experiments
– Performance
– Accuracy
Performance Test
Setup
• Compare the running time for the following cases
– Dense matrix
• # of rows is fixed and # of columns is increased
• Square matrix
– Sparse matrix
• Square matrix; the nonzero ratio is varied
– svd : Eigen::SVD Version 3.0
– redsvd : REDSVD::RedSVD Version 0.0.3
– Some configurations were not measured because of
insufficient memory
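The slides do not say how the sparse test matrices were generated. As a minimal sketch under stated assumptions (uniformly random positions, Gaussian values, and the hypothetical helper name `random_sparse_square`), a square matrix with a given nonzero ratio could be produced like this:

```python
import random

def random_sparse_square(n, nonzero_ratio, rng):
    """Return {(row, col): value} for an n x n matrix with a fixed nonzero count."""
    nnz = round(nonzero_ratio * n * n)
    # Sample distinct flat positions so the nonzero count is exact.
    positions = rng.sample(range(n * n), nnz)
    return {(p // n, p % n): rng.gauss(0.0, 1.0) for p in positions}

rng = random.Random(0)
entries = random_sparse_square(1000, 0.001, rng)  # nonzero ratio = 0.1%
```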
The result of SVD for dense matrices
[Figure: log-log plot of time (sec) against the number of columns x (100 to 100000), with row=500, col=x. Series: svd, redsvd r=10, redsvd r=100, redsvd r=1000.]
The result of SVD for dense matrices
[Figure: log-log plot of time (sec) against the matrix size x (100 to 10000), with row=x, col=x. Series: svd, redsvd r=10, redsvd r=100, redsvd r=1000.]
The result of SVD for sparse matrices
row = x, col = x, nonzero ratio = 0.1%
[Figure: log-log plot of time (sec) against the matrix size x (100 to 100000). Series: redsvd r=10, redsvd r=20, redsvd r=40, redsvd r=80.]
The result of SVD for sparse matrices
row = x, col = x, nonzero ratio = 1%
[Figure: log-log plot of time against the matrix size x (100 to 100000). Series: redsvd r=10, redsvd r=20, redsvd r=40, redsvd r=80.]
Accuracy Test
Setup
• Generate random square matrices U and V
– These are orthonormalized by Gram-Schmidt
• Set the singular values as S_i = 0.9^i
• Set a sample matrix A := U S V^T
• Compute a truncated SVD A ≈ U_t S_t V_t^T with the top-10
singular values
– n : the number of rows/columns of A
– r : the actual rank of A
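The setup above can be sketched in plain Python. Several details are assumptions rather than facts from the slides: the singular values are read as S_i = 0.9^i, the index i starts at 1, singular values beyond rank r are zero, and the helper names `orthonormal_columns` and `make_test_matrix` are hypothetical.

```python
import random

def orthonormal_columns(n, rng):
    """Return n orthonormal vectors of length n (Gram-Schmidt on Gaussian vectors)."""
    basis = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in basis:
            dot = sum(x * y for x, y in zip(q, v))
            v = [x - dot * y for x, y in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        basis.append([x / norm for x in v])
    return basis

def make_test_matrix(n, r, rng):
    """Build A = U S V^T with rank r and S_i = 0.9^i (i = 1..r, an assumption)."""
    u = orthonormal_columns(n, rng)  # u[k] is the k-th column of U
    v = orthonormal_columns(n, rng)  # v[k] is the k-th column of V
    s = [0.9 ** i for i in range(1, r + 1)]
    a = [[sum(u[k][i] * s[k] * v[k][j] for k in range(r)) for j in range(n)]
         for i in range(n)]
    return a, s

rng = random.Random(0)
a, s = make_test_matrix(20, 5, rng)
# With orthonormal U and V, the squared Frobenius norm equals the sum of S_i^2.
frobenius_sq = sum(x * x for row in a for x in row)
```

A truncated SVD of `a` can then be compared against `s` value by value, which is what the accuracy plot reports on a log scale.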
Singular Values
[Figure: order of magnitude = log(S_i) of the singular values against rank index (1 to 11). Series: actual, n=100 r=10, n=100 r=20, n=100 r=100, n=1000 r=10, n=1000 r=20, n=1000 r=100, n=1000 r=1000. Y axis from 2.00E-01 down to -1.60E+00.]


LSA
(SVD for Doc-Term Matrix)
• Data: English Wikipedia
• A_ij = I(term j appears in doc i)
– I(x) returns 1 if x is true and 0 otherwise
– A is very sparse; the nonzero ratio is 0.2% - 0.5%
• SVD of A is known as Latent Semantic Analysis
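As a small illustration of the indicator matrix defined above, the sketch below builds A row by row from tokenized documents and reports its nonzero ratio. The tokenization, the toy documents, and the function names are assumptions for illustration; the slides do not describe how the Wikipedia data was preprocessed.

```python
def build_indicator_matrix(docs):
    """docs: list of token lists. rows[i] lists the term ids j with A_ij = 1."""
    vocab = {}
    rows = []
    for tokens in docs:
        cols = set()
        for term in tokens:
            # Assign the next free id to a term the first time it is seen.
            cols.add(vocab.setdefault(term, len(vocab)))
        rows.append(sorted(cols))
    return rows, vocab

def nonzero_ratio(rows, num_terms):
    """Fraction of entries of the docs x terms matrix that are nonzero."""
    nnz = sum(len(cols) for cols in rows)
    return nnz / (len(rows) * num_terms)

docs = [["svd", "matrix", "svd"], ["matrix", "sparse"], ["wikipedia"]]
rows, vocab = build_indicator_matrix(docs)
ratio = nonzero_ratio(rows, len(vocab))
```

Repeated terms in a document still give a single 1 in that row, which matches the indicator definition I(x).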
The result of LSA
[Figure: time performance of RedSVD, log-log plot of time (ms) against the # of total terms (100000 to 10000000).]
• The numbers of docs are 3560, 46857, 118110, 233717,
and those of terms are 27106, 147144, 261495, 402239
