There is a lot of buzz about Google's use of Latent Semantic Indexing (LSI).
Why LSI?
Despite its success, the vector model suffers from some serious problems. Unrelated documents may be retrieved simply because query terms happen to occur in them, and related documents may be missed because no term in the document occurs in the query. Consider synonyms: one study found that different people use the same keywords to express the same concept only 20% of the time.
It is therefore an interesting idea to base retrieval on concepts rather than on terms: first map the terms (and the queries as well) to a "concept space", and then establish the ranking by similarity within that concept space.
In layman's terms, search engines, with their vast databases, can use LSI to associate certain terms with concepts when indexing web pages. An LSI tool tries to read the semantic map of searchers in order to display results. Google realized that it needed a better way for its bots to ascertain the true theme of a web page, and that is what Latent Semantic Indexing is all about.
The following screenshot confirms that Google has started using LSI, although only in some areas. See the area circled in red, which confirms the usage of related words.
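The idea of mapping terms and queries to a concept space can be sketched with a truncated singular value decomposition (SVD) of a term-document matrix, which is the usual textbook formulation of LSI. This is only a minimal illustration and makes no claim about how Google implements anything: the toy corpus, the function names and the choice of k = 2 concepts are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy corpus: three documents about cars, two about fruit.
docs = [
    "car engine repair",
    "automobile engine service",
    "car automobile dealer",
    "banana fruit recipe",
    "fruit salad bowl",
]

def build_term_doc_matrix(documents):
    """Return a (terms x documents) raw count matrix and a term -> row index map."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            matrix[index[word], j] += 1.0
    return matrix, index

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_documents(query, documents, k=2):
    """Score every document against the query in a k-dimensional concept space."""
    matrix, index = build_term_doc_matrix(documents)
    # Truncated SVD: keep only the k strongest "concepts".
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    u_k, s_k, doc_coords = u[:, :k], s[:k], vt[:k, :].T
    # Build the query's term vector, ignoring words outside the vocabulary.
    q = np.zeros(matrix.shape[0])
    for word in query.split():
        if word in index:
            q[index[word]] += 1.0
    # Fold the query into concept space: q_hat = q^T U_k S_k^{-1}.
    q_hat = (q @ u_k) / s_k
    return [cosine(q_hat, doc_coords[j]) for j in range(len(documents))]

if __name__ == "__main__":
    for doc, score in zip(docs, rank_documents("automobile", docs)):
        print(f"{score:+.3f}  {doc}")
```

In this toy corpus the query "automobile" shares no term with the document "car engine repair", so a plain term-matching vector model would score that pair as zero. In the concept space the document still scores highly, because "car" and "automobile" co-occur with the same neighbouring terms, while the fruit documents score close to zero.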