
Searching and Browsing on Map Displays
Xia Lin
School of Library and Information Science
University of Kentucky
Lexington, Kentucky

• Abstract
• Introduction
• Experimental Design
  o Research questions and objectives
  o Hypotheses
  o Map displays used in the experiment
  o Subjects and experimental procedures
• Experimental Results
  o About the map displays
  o About learning or memorizing
• Discussions
• Conclusions
• References

ABSTRACT

An experiment was designed to compare three different map displays generated from the
same set of documents by either a self-organizing algorithm or human subjects. The purposes
of this experiment were 1) to evaluate the usefulness of map displays for information seeking,
2) to observe how people search and browse on map displays, and 3) to compare structural and
visual features of map displays. Sixty-eight subjects were randomly assigned to three
selected map displays. They were asked to perform some simple retrieval tasks. Their
performance was observed and analyzed. The results indicated that both the organization
and the visual appearance of map displays had significant effects on subjects' searching and
browsing on the map displays. In particular, the map displays were found to provide three
useful functions. The first is to assist subjects to spot an area of the displays that may be
related to the query. The second is to help subjects learn or memorize the display structures,
which improves their subsequent searches. The third is to help subjects make judgments on
whether or not the right location has been selected for the requested information.
Accordingly, future visual interfaces for retrieval systems need to support these functions.

INTRODUCTION

In previous research, a map display was proposed as a visual display for information
retrieval (Lin et al., 1991; Lin, 1992; Lin, 1993; Lin et al., 1993). The map display was
designed to show both the contents and the associative structures of a document collection,
and to reveal semantic relationships among documents by organizing terms extracted from the
document collection. The map display was first generated by a neural network's learning
algorithm (Lin et al., 1991; Lin, 1992). An experiment was then conducted to observe how
human subjects organized such map displays (Lin et al., 1993).

This paper presents results of another experiment to compare map displays generated by the
algorithm and by human subjects. In this experiment, we observed how subjects used the map
displays and examined the functions of the map displays through analysis of the
subjects' searching processes. Results of the experiment not only reveal similarities and
differences between the two types of map displays, but also identify some factors and
features that make map displays useful for searching and browsing.

In the following sections, the experimental design is described first, and experimental
results are presented next, followed by discussions of the results. Conclusions of the
experiment are summarized in the final section.

EXPERIMENTAL DESIGN

Research questions and objectives

The overall objective of the research is to investigate features and properties of visual
displays for information retrieval. The map display has been suggested as a promising
format for visual displays compared with other formats such as hierarchical displays, network
displays, and scatter displays. We define the map display as a visual display that is
generated from a direct mapping of underlying data, and that uses a spatial analog and
geographical features to reveal contents and structures of the data. Like geographical maps,
the map display needs to have labels or elaborate signs or symbols to represent the data. It
needs to be selective in order to provide an appropriate granularity to display structures and
relationships of the data. It usually conveys a large amount of information in a limited
space.

The mapping process that generates the map display can be a machine's self-organizing
process or a human cognitive process. They share the same goal of creating a display to
reflect the semantic structure of underlying data "as truly as possible." Nevertheless,
structures created by different processes for the same data may still be very different. In the
previous experiment on generating map displays by human subjects (Lin et al., 1993), we
found that all subjects generated different map displays which could be divided into two
types: category-based or association-based. The category-based map display was arranged
by more or less distinct groups, usually in columns. The association-based map display
maintained clear associations among clusters and groups, but boundaries between clusters
and groups were not clear.

The goal of this experiment was thus to evaluate different types of map displays in a
retrieval setting. Specifically, the objectives of this experiment were:

to study efficiency of searching and browsing on map displays generated by the mapping
algorithm and by human subjects,

to compare different types of map displays and explore their organizational and visual
features, and

to observe how subjects search and browse the map displays and identify cognitive factors
that may affect the use of map displays.

Hypotheses

For purposes of this experiment, we concentrate on two aspects of using the map display:
how quickly a viewer can identify a given document from the map display, and how well
the viewer can learn and memorize layouts and details of the map display to improve the
speed of searching and browsing.

The following hypotheses were proposed:

H1. Subjects can complete the assigned retrieval tasks more quickly on the map displays
than on a random display;

H2. The human-generated map display is more helpful for the assigned retrieval tasks than
is the machine-generated map display;

H3. The association-based map display is more helpful than the category-based map
display for the assigned retrieval tasks; and

H4. Subjects can learn and memorize structures of map displays to improve the speed of
their searching and browsing.

These hypotheses mainly focus on comparing different types of map displays. Other factors
that may influence use of map displays, such as users' backgrounds, knowledge levels, and
language proficiencies, were explored through questionnaire data. No a priori hypotheses
were generated for them.

Map displays used in the experiment

Three map displays were used for this experiment. They were all generated from the same
set of documents: a total of 133 documents retrieved by a query on library automation from
the LISA database. Of the three, one was a machine-generated map display; it was the
mapping result of Kohonen's self-organizing algorithm (Kohonen, 1989). The other two
were human-generated, one each for the association-based and the category-based map
displays. They were selected by the experimenter from the eight map displays generated
in the previous experiment (Lin et al., 1993). Each was, as judged by the experimenter,
a typical representative of its category. All these map
displays were table-size (about 31 by 40 inches). The two human-generated map displays
were left as the subjects had created them in the previous experiment; the machine-generated
map display was re-created at the same size as the other two. Figure 1 shows
the three map displays. These map displays were re-drawn based on the table-size displays
used in the experiment. These re-drawings show the same organizational structures as the
original displays except that individual document titles were represented by dots (these
titles were readable in the original displays).
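
For readers unfamiliar with Kohonen's self-organizing algorithm, the following is a minimal sketch of the general technique in Python. It is not the implementation used to produce the machine-generated map display. The grid size, the learning-rate and neighborhood schedules, and the toy document vectors are all assumptions made for illustration, and the function names `train_som` and `best_matching_unit` are hypothetical.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid node whose weight vector is closest to `vector`."""
    return min(weights, key=lambda node: math.dist(weights[node], vector))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small self-organizing map (illustrative parameters only)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    radius0 = max(rows, cols) / 2
    # One randomly initialized weight vector per grid node.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Learning rate and neighborhood radius both shrink over time.
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        radius = max(radius0 * (1 - frac), 0.5)
        for vector in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, vector)
            for node, w in weights.items():
                d = math.dist(node, bmu)
                if d <= radius:
                    # Nodes near the winning node move toward the input.
                    influence = math.exp(-(d * d) / (2 * radius * radius))
                    for i in range(dim):
                        w[i] += lr * influence * (vector[i] - w[i])
    return weights


# Hypothetical document vectors: 1 if an (invented) index term occurs.
docs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
som = train_som(docs, rows=3, cols=3)
placement = [best_matching_unit(som, d) for d in docs]  # grid cell per document
```

Each document is placed at the grid cell of its best-matching node, so documents with similar term vectors tend to land in the same or neighboring cells, which is the property the map display relies on.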

As a comparison, a random display of the same size (a display with document titles
randomly put on the grid) was also used in this experiment. The above hypotheses were
tested through evaluating differences and similarities of these map displays, and through
comparison of subjects' retrieval performance with the three map displays and with the
random display.

Subjects and experimental procedures

A total of 68 subjects participated in this experiment. All except three were library school
students. Each subject spent about 10 minutes looking up ten titles on the one display (of
the four) assigned to them. The ten titles were randomly selected from the group of
documents used to generate the displays, and the same ten titles were given to every
subject, one at a time, during the experiment. The first title was used as practice. The time
subjects took to look up each of the other nine titles was recorded. The subjects were
interrupted if they spent more than two minutes locating a title. During the experiment,
subjects' searching and browsing behaviors were observed and noted by the experimenter. After
completing the tasks, the subjects were invited to comment on questions such as what
their strategies for completing the tasks were, how difficult they thought the tasks were, and
what their impressions of the displays they used were.

Subjects also filled out a brief questionnaire to provide information related to their
backgrounds. The questionnaire measured how much they knew about the content of the
documents (library automation), how familiar they were with online searching, whether
English was their first language, and other demographic information. Among the 68
subjects, 47 were female and 21 were male; 50 were native English speakers and 18 were
not.

EXPERIMENTAL RESULTS

The primary dependent measure in this experiment was the time that subjects used to
complete the 9 look-up tasks. The results are presented in two groups: those about the map
displays and those about learning or memorizing.

About the map displays

Table 1(a) shows the results of a one-way ANOVA on the data. The dependent variable was the
mean time each subject spent to locate a title on the assigned display. The means were 26.2,
25.9, and 28.1 seconds with the machine-generated, the association-based, and the
category-based map displays, respectively, and 38.0 seconds with the random display.
A significant difference among the means was found (p=0.04).
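
As an illustration of the test reported here, the sketch below computes the F statistic of a one-way ANOVA from per-subject mean times. The numbers in `groups` are invented purely for demonstration and are not the experimental data; the function name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Invented mean look-up times (seconds) for four hypothetical display groups.
groups = [
    [24.0, 27.5, 26.0],   # machine-generated
    [25.0, 26.5, 27.0],   # association-based
    [29.0, 27.0, 28.5],   # category-based
    [36.0, 40.5, 38.0],   # random
]
f_value, df_b, df_w = one_way_anova_f(groups)
```

The p-value is then read from the F distribution with `df_b` and `df_w` degrees of freedom, using a statistical table or library.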

Table 1. Statistical results on the search data. The dependent variable is the mean time
that each subject spent to look up a title from an assigned map display.

Two groups of a priori contrasts were defined to test hypotheses H1 and H2. The results of
the first group (Table 1(b)) indicated a significant difference (at the 0.05 level) between the
mean time that subjects spent with the random display and the mean time spent with each of
the three map displays for the given retrieval tasks; thus hypothesis H1 is accepted. The
results of the second group (Table 1(b)) showed no significant differences between the mean
times spent with the human-generated map displays and the machine-generated map display;
thus hypothesis H2 is rejected. In other words, we found no difference between the mean time
each subject used to complete the 9 searches with the machine-generated map display and with
the association-based or the category-based map display.

The last contrast showed no significant difference between the mean times subjects spent
with the association-based and the category-based map displays; thus hypothesis H3 is
rejected.

About learning or memorizing

The learning effect was tested based on the subjects' performance in locating the first three
titles and the last three titles. Four null hypotheses were tested for hypothesis H4:

H04.1 There is no significant difference between the time used to locate a title with the
machine-generated map display for the first three titles and for the last three titles.

H04.2 There is no significant difference between the time used to locate a title with the
association-based map display for the first three titles and for the last three titles.

H04.3 There is no significant difference between the time used to locate a title with the
category-based map display for the first three titles and for the last three titles.

H04.4 There is no significant difference between the time used to locate a title with the
random display for the first three titles and for the last three titles.

Table 2 shows t-test results. The results indicated statistically significant differences
between the mean time that subjects spent on the first and the last three titles for the
machine-generated map display and for the association-based map display (p=0.008 and
p<0.001, respectively). There was some difference for the category-based map display
between the time spent on the first and the last three titles, but the difference was not
statistically significant (p=0.143). For the random display, there was no significant
difference between the times spent on the first and the last three titles (p=0.832).
Therefore, null hypotheses H04.1 and H04.2 were rejected; null hypotheses H04.3 and H04.4
were not rejected.

Table 2. Comparison of time spent on searching for the first three titles and the last
three titles.

These results indicated a learning effect. The subjects seemed able to learn or memorize
the map displays to improve their search speed, especially when the displays were
associatively organized. They were not able to do so when the display was not organized.
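
The comparison of first-three and last-three titles can be sketched as a paired t-test, in which each subject contributes one difference score. The source does not state whether a paired or an independent-samples t-test was used, so the paired form is an assumption here, and the times below are invented for illustration.

```python
import math


def paired_t(first, last):
    """Return (t, df) for a paired t-test on per-subject times."""
    diffs = [a - b for a, b in zip(first, last)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var / n)
    return t, n - 1


# Invented per-subject mean times (seconds) on the first and last three titles.
first_three = [30.0, 40.0, 35.0, 45.0]
last_three = [20.0, 30.0, 30.0, 30.0]
t_value, df = paired_t(first_three, last_three)
```

A positive `t_value` means subjects were slower on the first three titles than on the last three, which is the direction a learning effect would produce.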

An ANOVA on the data revealed other details about the learning effects. The results
confirmed that there were no significant differences among the mean times spent with the
four types of displays for the first three questions (F=0.49, p=0.69), and there were
significant differences among the mean times spent with the displays for the last three
questions (F=6.51, p=0.01). While the type of display accounts for only 2% of the total
variation for the first three questions (R=.15), it accounts for 23% of the total variation
(R=.48) for the last three questions.
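
The percentages of variation quoted here are the squares of the reported R values, and in a one-way ANOVA the same quantity determines F. The sketch below works through that arithmetic. It assumes that all 68 subjects and all four displays entered each analysis, which the text does not state explicitly; the function names are hypothetical.

```python
def variance_explained(r):
    """Proportion of total variation accounted for, given R."""
    return r * r


def implied_f(r, n_subjects, n_groups):
    """F statistic implied by R in a one-way ANOVA."""
    r2 = r * r
    df_between = n_groups - 1
    df_within = n_subjects - n_groups
    return (r2 / df_between) / ((1 - r2) / df_within)


first = variance_explained(0.15)   # 0.0225, about 2%
last = variance_explained(0.48)    # 0.2304, about 23%
f_first = implied_f(0.15, 68, 4)   # about 0.49
f_last = implied_f(0.48, 68, 4)    # about 6.4
```

Under these assumptions the implied F values are close to the reported F=0.49 and F=6.51; the small gap for the last three questions is within what rounding R to two decimal places would produce.
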

Furthermore, the results (Table 3) showed that the learning effect was particularly apparent
for the association-based map display. For the first three questions, subjects spent the least
time with the category-based map display, but for the last three questions, they spent more
time with the category-based map display than with any display other than the random
display. These results indicated that the category-based map display would be more helpful
when the subjects were new to the map display. The learning effect was stronger with the
association-based map display and seemed to be less robust with the
category-based map display.

Table 3. Learning effects by map types. The mean is the average time (in seconds) that
each subject spent to look for the first and the last three questions.

DISCUSSIONS

As expected, the findings show that subjects searched much faster on the map displays than
they did on the random display. This suggests that both the machine-generated and the
human-generated map displays provide reasonable structures to support viewers' searching
and browsing. The results also indicate that, for the simple retrieval task, the machine-
generated map display works as well as the human-generated map displays.

The results show that the subjects completed the assigned tasks surprisingly fast. Even on
the random display, the subjects did not confront as much difficulty as originally expected.
This may be due to the fact that table-size displays were used. Browsing becomes much
easier on table-size displays because 1) all the titles are displayed clearly and legibly, 2) the
grid, the titles, the labels, and the boundary sticks are all of different colors, which makes
the display elements more distinctive, and 3) there is much more space to represent
similarities and differences among documents. Comments by the subjects indicated that
they took advantage of all three of these features in their searching and browsing. In
particular, since the assigned tasks were to look for specific titles, subjects only needed to
visually scan the displays for a match. Many visual cues, such as different layouts, lengths of
titles, capitalized words, and unfamiliar words, were very helpful even in the random display. Some
subjects indicated that they always looked for only two or three words, either the first two
or three words in the titles, or the major words that they picked up from the titles. Human
visual capabilities are powerful in discriminating the selected two or three words from the
displays with the help of visual cues.

Why did the subjects do better on the map displays when they could already do a good
job on the random display? One answer is clear from the findings: the organization of map
displays helps the subjects to learn and memorize the map displays. The learning effect was
confirmed by the statistical results. It was also shown in the subjects' remarks. For example,
a subject commented that "It got easy to do as I went along because I remembered the
categories more easily, and once I remember the categories, it's easier to pick them out."
Psychologically, the subjects also felt that the retrieval task became easier and easier as
indicated in remarks such as "I thought it was very slow at the beginning and getting fast as
I went along."

The map displays seemed to allow the subjects to identify a starting point quickly. Subjects
could easily associate orientations of the displays with document contents. For example, the
subjects were able to say what was on the left and what was on the right of the displays
after using the map displays for a while. They could identify (off the top of their heads) the
major groups and their locations on the map displays. When they searched on the map
displays, a typical reaction after reading a search title was that "I think it's somewhat
here."

The map displays also made it easier for the subjects to focus on one or two groups on the
displays. While the labels on the displays were not precise and sometimes were even
confusing, subjects seemed to rely on the labels to decide whether to focus on certain groups
or to exclude extraneous groups. Remarks such as "it's got to be in this area" and "it
wouldn't be in that group" were often heard during the experiment. When asked how
difficult the assigned tasks were, subjects often said that it depended on the search titles:
some titles were easier to search for than others. Many subjects put titles in two categories:
those that could be found at the place the subjects thought they should be, and those that
could not be found at the first try. Typical comments were that "I got at least half of them at
the first try. For those I didn't I had to end up looking all over because it wasn't where I
thought it should be," and, "if you look at a title, and you hit the term right away, and it's
under the term you're thinking of, it is easy to retrieve. But if you don't see it the same way
as it is conceptualized here, you have to go back and kind of re-thinking how it would be
put in the system." These remarks indicated that an important function of the map displays
was to improve the success rate of the "first try." The map displays had a much higher
"first try" success rate than the random display.

When the first try was not successful, the subjects also relied on the map displays to direct
their browsing. The association-based map display was particularly helpful for this
function. One reason is that boundaries on the displays seemed to exclude
neighbors in the category-based map display but to link neighbors in the association-based
map display. With the category-based map display, the subjects often thought that
categories on the display were precise. They were less willing, thus less likely, to browse
through neighboring categories when they did not find the title in a category where they
thought it should be. With the association-based map display, the subjects were encouraged
to look around since there were no clear separations among clusters or groups. Their views
naturally extended more broadly if they did not find the title.

Finally, we observed in the experiment that some subjects were able to complete the
assigned tasks comfortably, while others needed extra effort to complete the tasks. This is
likely attributable to individual differences (Allen, 1992; Borgman, 1989). The large
standard deviation of the search times (Table 1) shows the effect of individual differences.
To look for the factors that might cause the individual differences, the questionnaire data
were explored. A t-test on the difference between the mean time spent by subjects who
claimed to have more content knowledge and that spent by subjects who claimed to have less
content knowledge showed no significant difference (p=.34). Similarly, there was no
significant difference between the
mean time spent by subjects who were familiar with online searching and subjects who
were not (p=.55). However, a significant difference was found between the results of native
English speakers and non-native English speakers (p=.007). This, on one hand, indicated
that it was important to have legible verbal elements on the visual display. People needed to
scan the labels and titles to direct their browsing. On the other hand, this might also indicate
cultural differences in the use of map displays and in the organization of knowledge.
Therefore, the use of non-verbal cues such as icons might help smooth the use of map
displays among culturally diverse users.

A note about the retrieval task is due here. The retrieval task was to search for known
items in a small set, which is only a very special case of information retrieval in a real
environment. The results are further limited by the small number of subjects on the random
display, which made the number of subjects uneven across the four treatments of the
experiment. To this extent, the experiment was a first step that showed that a full evaluation
of map displays is warranted. Such a full evaluation should include examining the detailed
structures of map displays, implementing map displays as an interface for a retrieval
system, and testing map displays for different retrieval tasks on systems with different data sets.
More detailed and comprehensive studies are needed for further investigation of map
displays for information retrieval.

CONCLUSIONS

This experiment compared subjects' searching and browsing performance on three map
displays and a random display. Several hypotheses were tested. The following conclusions
were reached:

Map displays organized to reflect semantic structures of documents significantly improve
the completion of the retrieval tasks we defined. Both the machine-generated and the
human-generated map displays provided a reasonable structure to show underlying document
relationships.

While human-generated map displays can be divided into category-based and association-
based, they both facilitate searching and browsing; they both allow the subjects to learn and
memorize the map displays to improve their searching and browsing.

While the organization of map displays is important, the visual appearance of map displays
is also essential. The subjects can use many visual cues on the displays to support their
searching and browsing.

Browsing map displays is found to be related to language skills. People who are familiar
with the language used in the map display will be more comfortable browsing in that
display.

These conclusions are based on the results on the selected map displays. As the map display
is treated as a special case of visual interfaces, the conclusions and discussion of how
subjects use the map display for searching and browsing will be useful for the design of
future visual interfaces for information retrieval systems.

REFERENCES

Allen, Bryce L. (1992). Cognitive differences in end-user searching of a CD-ROM index.
Proceedings of the 15th Annual International ACM/SIGIR Conference on Research and
Development in Information Retrieval, pp. 298-309.

Borgman, Christine L. (1989). All users of information systems are not created equal: An
exploration into individual differences. Information Processing & Management, 25, 237-
251.

Kohonen, T. (1989). Self-organization and associative memory. (3rd ed.). New York:
Springer-Verlag.

Lin, X. (1993). Self-organizing semantic maps as graphical interfaces for information
retrieval. Unpublished Doctoral Dissertation, University of Maryland, College Park.

Lin, X., Marchionini, G., & Soergel, D. (1993). Category-based and association-based map
displays by human subject. In: Proceedings of the 4th ASIS Classification Research
Workshop (Columbus, Ohio, October, 1993), pp. 147-164.

Lin, X. (1992). Visualization for the document space. Proceedings of Visualization'92,
(Boston, October 21-23, 1992), pp. 274-281.

Lin, X., Soergel, D., & Marchionini, G. (1991). A self-organizing semantic map for
information retrieval. Proceedings of the 14th Annual International ACM/SIGIR
Conference on Research and Development in Information Retrieval, pp. 262-269.