and Knowledge†
† This Perspectives issue is dedicated to the memory of Robert R. Korfhage: teacher, scholar, friend.

© 1999 John Wiley & Sons, Inc.
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 50(9):790–793, 1999 CCC 0002-8231/99/090790-04

This Perspectives issue is assembled to provide a historical background to visualization in information retrieval. It is a review of the assumptions and technology configurations by which the current literature may be interpreted. The techniques of the authors of this issue differ, but all treat their techniques as manuals of description flowing from a history of common mathematical and technical influences.

All technologies have histories of development. The historical forces of visualization frame the current efforts and comprise the field in which new problem dimensions are addressed. No field of scientific inquiry emerges without a background. This issue adds to the depth necessary for the study of visualization by new students and new scholars.

Moreover, these articles also reflect developments by some of the world’s most prestigious research institutions. Not only are two American national laboratories represented, but also the National Aeronautics and Space Administration (NASA), the Institute for Scientific Information, the Digital Libraries Initiative in the Alexandria Project, the Canadian National Archives, and the Foundations of Advanced Information Visualization (FADIVA) European group. This issue is far from the last word on this topic, but surely it is among the most authoritative ones.

In the Beginning Was the Word

The history of information retrieval is mostly the history of word retrieval. Very early in this history, word frequency in document collections was used to convey distinctions among classes of documents. Much followed from this basic insight, and today the notion of combining like documents by common words is universal in practical information retrieval engineering. This concept has reached so refined a state, however, that discoveries now occur only in small increments.

There seems little amiss in the notion that knowledge of the world about us should be encapsulated by words. Yet, words may be false, or their meaning misunderstood. Take the notion of words which uniquely define knowledge of nature, for example. One might universally define “spring” as the period between the vernal equinox and the summer solstice, or the word “sun” as the brightest of all objects in the sky. But the first definition will elude the untutored and the second remain unverifiable for half of each planetary rotation.

In many circumstances, descriptive knowledge of the world must be limited to closed vocabularies such as mathematics or predicate logic. These special languages remove ambiguity, but frequently limit the power of expressiveness to the trivial, or distort common observations into the realm of the bizarre. The common sense in which words are used to encapsulate knowledge is the only sense in which words interest us. We seek in description to elucidate those properties of a phenomenon which most differentiate it from others, or which best provide the boundaries by which phenomena may be treated in common as a class. It may well be that in the future visualization in retrieval will be construed as such a special language, but one vastly more expressive and universal.

The digital world that technology has introduced bequeaths us ever more complex problems of knowledge differentiation and classification. No one would argue that images do not encapsulate knowledge, but how images encapsulating knowledge may be described from their primitive forms is still in the resolution stage. Images are not usually parsed into words as documents are. On an even more intricate plane of examples, the methodology by which sounds may be retrieved by images is a problem so confounded that no literature exists yet to address it.

In the early literature of information science, reference is often made to a paradigm known as “the answer document.” In this formalism, the question is immutable, and the world image is that of an “answer document” flying like an arrow to intercept the question and bring it to ground. This characterization is considered antique. Yet, in the test collections that we must use to determine the efficacy of varying approaches to retrieval, we essentially resort to the same formalism. Although these static methodologies may be altered by interactive experimentation, the quality of knowledge is of a different order when we do so.

The authors in this issue present an alternative vision. It is a vision which relies on user presentation of entire answer sets resting in a visual field. It is not an “answer document” upon which this vision relies, but rather an “answer set.” Is this vision of greater power than the ordering of retrieved objects by lists? All the authors in this issue claim that it is a more powerful vision. Moreover, this claim is made on the face validation presented by our human senses and the pre-linguistic evolutionary properties of the human visual neocortex. The visual processing skills which permitted our species’ early triumph over much stronger and better adapted
ones provide the prima facie rationale for the use of visualization in information retrieval. As one observer put it late at night at an ASIS conference, “Whatever we were doing to survive in the brush 50,000 years ago, you can bet that it wasn’t perfecting the skill of taking notes.”

No one claims that words are not useful in capturing knowledge. Indeed, for detailed control and description of complex phenomena, they represent our most valued cultural legacy. The illiterate cannot expect to profit much from visualization technology. Rather, the claim made in this issue is that the holistic expression of the interrelation of knowledge objects is more powerful than any other compilation of such objects. In this paradigm, a single Linnaean taxonomic tree is more revelatory than all the nodes of description in the tree individually presented. And in this proposition, these authors propose new visual grammars for interpreting the manifold relations among words and images.

Such propositions do not resolve the issues of knowing by any user. Nor do they resolve the issues of classification boundaries by which like phenomena may be known. They do, on the other hand, offer a paradigm of alternative engineering strategies for user apprehension of a multimedia world. Alternatives are priceless. Even a rough and uneven road is preferable to a dead end. And although it is quite unfair to characterize advances in word retrieval as negligible, such advances have surely been limited in recent years.

The First Visual Interface

The first visual interface to a collection was designed and implemented at the Johnson Space Center of NASA in the years 1988 to 1992. In this interface, described in the article entitled “The NASA Image Collection Visual Thesaurus,” the authors assumed that the task of inferring images from terms and terms from images would introduce invariance in image indexing. The system remained in use for two years, but eventually failed because no automatic method to assign terms to images could be discovered, and the manual cost of such term assignment was too great to be supported.

The authors of this article attempted to use image descriptions clustered by cosine vector methods to identify a unique image for every thesaurus term. The candidate images suggested by this method were often heartbreakingly close to the mark. But close was not good enough. These developments were described in detail by Seloff (1990). Although the Seloff article has been widely cited, the initial article which specified the design parameters for the system of his report has remained unpublished. It appears in this issue in the form originally presented at the ASIS midyear conference of 1988.

The article is significant because it represents the first identification of the components of a visual interface. Its heritage is reflected in the article “A Collection of Visual Thesauri for Browsing Large Collections of Geographic Images,” by Ramsay et al., in this issue. These authors were more successful than Rorvig et al., due to their use of Kohonen feature maps, a far more robust technique than the simple cosine vector intersection of terms. They were also able to take advantage of the homogeneity of their earth observation data. Their system is more than merely “promising.” It actually works.

Theoretical Foundations

The article by Henry Small of the Institute for Scientific Information contributes the most clearly defined statement of visualization techniques to appear in print. This is a rather bold statement, yet it is true. The article addresses the two-decade-long historical use of visualization techniques in calculating the relationships among scientific fields by their patterns of co-citation.

Researchers who seek “cookbook” renditions of algorithms as they developed over time will find this article to be a precise guide to alternatives. Small, in his article entitled “Visualizing Science by Citation Mapping,” begins with the simplest of algorithms as conceived within the computational limitations of the 1970s and ends with the most ambitious ones presently available through Sandia National Laboratories (SNL). In this article, students and scholars will find algorithms applicable to many different aspects of the co-citation problem, as Small frankly describes the research paths that were successful and led to further enhancements as well as the ones that were eventually discarded either because of their inefficiency in computation, or their failure to yield truthful insights validated by earlier techniques. Many of these algorithms may be transplanted to address similar problems with data that may be encountered by researchers who require some intermediate processing alternatives.

Exemplar Applications

This issue would surely be guilty of intellectual hubris without providing a few examples of applications of retrieval visualization to common search problems. Although the bibliography provided by Robert Korfhage in the online site illustrates the widespread trials of visualization technology, these two applications illustrate what can be done on a completely practical level. Neither of the two systems presented in this section is a demonstration system; rather, both are part of larger efforts to control massive collections. The first of these two articles, by Ramsay et al., “A Collection of Visual Thesauri for Browsing Large Collections of Geographic Images,” describes the efforts to provide some meaningful access to the more than eight million earth-observing images made available in this decade. The second article, by Brooks and Campbell, “Interactive Graphical Queries for Bibliographic Search,” describes the use of a Canadian government-sponsored National Archives product designed to permit visual searching of text materials in a traditional Boolean logic-based environment.
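
To make the cosine vector method mentioned in connection with the first visual interface more concrete, the following Python sketch shows one way a thesaurus term might be matched to the closest image description. It is an illustration only and not the NASA system's implementation: the function names, the whitespace tokenization, and the sample descriptions are all assumptions introduced here.

```python
import math
from collections import Counter


def term_vector(text):
    """Count lowercase word occurrences in a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares no terms with anything
    return dot / (norm_a * norm_b)


def best_image_for_term(term, descriptions):
    """Return the id of the image whose description is closest to the term."""
    query = term_vector(term)
    return max(
        descriptions,
        key=lambda image_id: cosine(query, term_vector(descriptions[image_id])),
    )


# Hypothetical image descriptions, invented purely for illustration.
descriptions = {
    "img1": "shuttle launch pad at night",
    "img2": "astronaut spacewalk outside shuttle",
    "img3": "mission control room consoles",
}

print(best_image_for_term("shuttle launch", descriptions))
```

A score of this kind ranks candidates only by shared terms, which is consistent with the observation above that the suggested images were often close to the mark without being reliably correct.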