Information Retrieval: FOREWORD
Department of Computer Science, University of Arizona

In the not-so-long-ago past, information retrieval meant going to the town's library and asking the librarian for help. The librarian usually knew all the books in his possession, and could give one a definite, although often negative, answer. As the number of books grew--and with them the number of libraries and librarians--it became impossible for one person or any group of persons to possess so much information. Tools for information retrieval had to be devised. The most important of these tools is the index--a collection of terms with pointers to places where information about them can be found. The terms can be subject matters, author names, call numbers, etc., but the structure of the index is essentially the same. Indexes are usually placed at the end of a book, or, in another form, implemented as card catalogs in a library. The Sumerian literary catalogue, of c. 2000 B.C., is probably the first list of books ever written. Book indexes had appeared in a primitive form in the 16th century, and by the 18th century some were similar to today's indexes. Given the incredible technology advances in the last 200 years, it is quite surprising that today, for the vast majority of people, an index, or a hierarchy of indexes, is still the only available tool for information retrieval! Furthermore, at least from my experience, many book indexes are not of high quality. Writing a good index is still more a matter of experience and art than a precise science.

Why do most people still use 18th century technology today? It is not because there are no other methods or no new technology. I believe that the main reason is simple: Indexes work. They are extremely simple and effective to use for small to medium-size data. As President Reagan was fond of saying, "if it ain't broke, don't fix it." We read books in essentially the same way we did in the 18th century, we walk the same way (most people don't use small wheels, for example, for walking, although it is technologically feasible), and some people argue that we teach our students in the same way. There is a great comfort in not having to learn something new to perform an old task. However, with the information explosion just upon us, "it" is about to be broken. We not only have an immensely greater amount of information from which to retrieve, we also have much more complicated needs. Faster computers, larger capacity high-speed data storage devices, and higher bandwidth networks will all come along, but they will not be enough.
We will need better techniques for storing, accessing, querying, and manipulating information.

It is doubtful that in our lifetime most people will read books, say, from a notebook computer, that people will have rockets attached to their backs, or that teaching will take a radical new form (I dare not even venture what form), but it is likely that information will be retrieved in many new ways, by many more people, and on a grander scale.