semantic indexing
Theorem 1.
Theorem 2.
Theorem 3.
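Proof (Theorem 1): Let U be the matrix with the eigenvectors of S as columns, and let Λ be the diagonal matrix of the corresponding eigenvalues.
Then we have SU = UΛ, since S scales each eigenvector by its eigenvalue.
Thus, we have S = UΛU⁻¹.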
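The proof step above can be checked numerically. The following is a minimal NumPy sketch, assuming Theorem 1 is the diagonalization statement that the proof's eigenvector matrix suggests; the example matrix S and the names U, Lam, and S_rebuilt are illustrative choices, not taken from the original slides.

```python
import numpy as np

# Hypothetical example matrix (an assumption for illustration); any square
# matrix with linearly independent eigenvectors would serve.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of U are eigenvectors of S; eigvals holds the matching eigenvalues.
eigvals, U = np.linalg.eig(S)
Lam = np.diag(eigvals)

# S U = U Lam: S scales each eigenvector column by its eigenvalue.
left = S @ U
right = U @ Lam

# Multiplying on the right by the inverse of U recovers S.
S_rebuilt = U @ Lam @ np.linalg.inv(U)

print(np.allclose(left, right), np.allclose(S_rebuilt, S))
```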
Figure: Illustration of the singular value decomposition.
The top figure shows a matrix C for which M > N.
The bottom figure shows the case M < N.
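Since the figure itself is not reproduced here, the two cases can be illustrated by inspecting factor shapes. This is a sketch that assumes the reduced form of the decomposition (NumPy's `full_matrices=False`); the helper name `svd_shapes` and the random test matrices are hypothetical.

```python
import numpy as np

def svd_shapes(M, N, seed=0):
    """Return the shapes of the reduced SVD factors of a random M x N matrix."""
    rng = np.random.default_rng(seed)
    C = rng.random((M, N))
    # Reduced SVD: U is M x K, sigma has K entries, Vt is K x N, K = min(M, N).
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    # The three factors multiply back to C.
    assert np.allclose(U @ np.diag(sigma) @ Vt, C)
    return U.shape, sigma.shape, Vt.shape

tall = svd_shapes(5, 3)   # M > N
wide = svd_shapes(3, 5)   # M < N
print(tall)
print(wide)
```

In both cases the number of singular values is min(M, N); only the shapes of the outer factors change.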
Low-rank approximations
The singular value decomposition can be used to solve the low-rank matrix
approximation problem.
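We invoke the following three-step procedure to this end:
1. Given C, construct its SVD in the form C = UΣVᵀ.
2. Derive from Σ the matrix Σ_k by replacing all but the k largest singular values on its diagonal with zeros.
3. Compute and output C_k = UΣ_kVᵀ as the rank-k approximation to C.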
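The truncated matrix of singular values, Σ_k, has at most k non-zero values, so the resulting approximation C_k has rank at most k.
Replacing small eigenvalues by zero will not substantially alter the product,
leaving it “close” to C.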
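The procedure can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the standard truncated-SVD construction (decompose, zero the smallest singular values, multiply back); the function name `low_rank_approx` and the small example matrix are hypothetical.

```python
import numpy as np

def low_rank_approx(C, k):
    """Return a rank-at-most-k approximation of C and the singular values of C."""
    # Step 1: decompose C.
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    # Step 2: keep the k largest singular values, zero the rest.
    # np.linalg.svd returns them in descending order.
    sigma_k = sigma.copy()
    sigma_k[k:] = 0.0
    # Step 3: multiply the factors back together.
    return U @ np.diag(sigma_k) @ Vt, sigma

# Hypothetical 3-term x 4-document matrix, used only for illustration.
C = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 0.0]])

C_k, sigma = low_rank_approx(C, 2)

# The Frobenius-norm error is determined entirely by the discarded
# singular values, which is why dropping small ones changes C only slightly.
error = np.linalg.norm(C - C_k, "fro")
print(np.linalg.matrix_rank(C_k), error)
```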
Theorem 4.
Latent semantic indexing
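A query vector q is mapped into its representation in the LSI space by the
transformation q_k = Σ_k⁻¹ U_kᵀ q, where U_k holds the first k columns of U and Σ_k the k largest singular values.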
Now, we can use cosine similarities to compute the similarity between a query and a
document.
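A minimal NumPy sketch of query mapping and cosine scoring follows. It assumes the usual fold-in mapping, in which a query is multiplied by the transposed top-k left singular vectors and divided by the top-k singular values, so that a document column of C maps to the corresponding row of the right singular vectors. The function names `lsi_fit`, `map_query`, and `cosine`, and the example matrix and query, are hypothetical.

```python
import numpy as np

def lsi_fit(C, k):
    """Truncate the SVD of the term-document matrix C to k dimensions."""
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    U_k, sigma_k = U[:, :k], sigma[:k]
    docs_k = Vt[:k, :].T          # one row per document, k columns
    return U_k, sigma_k, docs_k

def map_query(q, U_k, sigma_k):
    """Map a term-space vector into the k-dimensional LSI space."""
    # Assumes the k retained singular values are all non-zero.
    return (U_k.T @ q) / sigma_k

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-term x 4-document matrix, used only for illustration.
C = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 0.0]])

U_k, sigma_k, docs_k = lsi_fit(C, k=2)

q = np.array([1.0, 0.0, 0.0])     # query containing only the first term
q_k = map_query(q, U_k, sigma_k)
scores = [cosine(q_k, d) for d in docs_k]
print(scores)
```

Documents are then ranked by descending score. Mapping a document's own column of C through `map_query` reproduces its row in `docs_k`, which is a useful sanity check on the mapping.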
Notice that the low-rank approximation, unlike the original matrix C, can have
negative entries.
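A small example makes this concrete. The sketch below uses a hypothetical non-negative 3 x 3 matrix (not from the original slides) whose rank-2 approximation has a negative entry where the original had a zero.

```python
import numpy as np

# Hypothetical non-negative matrix, chosen so the effect is easy to see.
C = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

U, sigma, Vt = np.linalg.svd(C)

# Rank-2 approximation: keep only the two largest singular values.
k = 2
C_k = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

# The zero entries in the corners of C become slightly negative in C_k.
print(np.round(C_k, 4))
print(C_k.min() < 0)
```

A negative entry has no interpretation as a term count or weight, so the entries of the approximation are best read as coordinates in the latent space rather than as frequencies.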
LSI shares two basic drawbacks of vector space retrieval: there is no good way of expressing negations, and no way of enforcing Boolean conditions.
LSI has not been established as a significant force in scoring and ranking for information
retrieval.