Professional Documents
Culture Documents
Tim Berners-Lee and his World Wide Web entered the 2.4 Heterogeneity
information retrieval world in 1989 .This event caused a
branch that focused specifically on search within this new The data is heterogeneous, existing in multiple formats,
document collection to break away from traditional languages, and alphabets with duplicity of this data at various
information retrieval. This branch is called web information places. In addition, there is no editorial review process, which
retrieval [4]. Web Information retrieval is the process of means errors, falsehoods, and invalid statements abound.
searching within a huge World Wide Web document
collection for a particular information need (called a query).
2.5 Duplication
By all measures, the Web is enormous and growing at a Several studies indicate that nearly 30% of the web's content
staggering rate, which has made it increasingly intricate and is duplicated, mainly due to mirroring[7].
crucial for both people and programs to have quick and
accurate access to Web information and services. Thus, it is 2.6 Hyperlinked
imperative to provide users with tools for efficient and
The web is hyperlinked. Hyperlinks make focused, effective
effective resource and knowledge discovery. Web IR can be
defined as the application of theories and methodologies from searching a reality. The hyperlink functionality alone—that is,
IR to the World Wide Web. It is concerned with addressing the hyperlink to Web page B that is contained in Web page
A—is not directly useful in information retrieval. However,
the technological challenges facing Information Retrieval (IR)
the way Web page authors use hyperlinks can give them
in the setting of WWW [10].
valuable information content. [4][10].
2. CHARACTERISTICS OF WEB IR These characteristics of Web make the task of retrieving
The web is a unique document collection. Some of the information from it quite different from the traditional
distinguished characteristics of the Web make the nature of information retrieval.
29
International Journal of Computer Applications (0975 – 8887)
Volume 76– No.1, August 2013
30
International Journal of Computer Applications (0975 – 8887)
Volume 76– No.1, August 2013
31
International Journal of Computer Applications (0975 – 8887)
Volume 76– No.1, August 2013
search engines especially Google is a testimony to this fact. [5] S. Spat ―About Web IR models.pdf‖, pp 31-35, March
The paper further discussed the characteristics and models of 2007
Web IR.
[6] H.Monica, ―The Past, Present, and Future of Web Search
7. REFERENCES Engines‖, Proceedings of 31st International Colloquium,
ICALP 2004,Finland, July 12-16, 2004.
[1] M. Kobayashi & K. Takeda, ―Information Retrieval on the
Web‖, ACM Computing Surveys, Vol. 32, No.2, June [7] N. Sérgio, ―State of the Art in Web Information
2000. Retrieval‖, Technical Report, FEUP, 2006.
[2] N. Fuhr, ―Models in Information Retrieval‖, ESSIR 2000: [8] H.Monica, ―Information Retrieval on the Web‖, 39th
21-50. Annual Symposium on Foundations of Computer
Science (FOCS'98), Palo Alto, CA
[3] S.Amit, "Modern Information Retrieval: A Brief
Overview", Bulletin of the IEEE Computer Society [9] R. Baeza-Yates, B. Ribeiro-Neto, ―Modern Information
Technical Committee on Data Engineering 24 (4): 35– Retrieval‖. Addison Wesley, New York, 1999.
43, 2001. [10] B.MPS,A.Kumar ― A Primar on the Web Information
[4] N. Langville and D. Meyer ―Science of search engine Retrieval Paradigm‖, JATIT, March 2007.
rankings‖, Chapter 1 ,Princeton Pubs,2006
32
IJCATM : www.ijcaonline.org