
Library Projects

Knowledge-based Editing and Visualization for Hypermedia Encyclopedias
Christoph Hüser, Klaus Reichenberger, Lothar Rostek, and Norbert Streitz

One of the main goals in developing digital libraries is to provide users with opportunities for accessing and using information in highly flexible and user-oriented ways not available in current information repositories. This implies a focus on supporting intellectual access to information and on considering the context of the information request. Systems for digital libraries should be able to select the appropriate type and amount of information from a comprehensive pool and compose it on the fly into a meaningful and coherent presentation. Because it is customized to the current situation, such a presentation may never have occurred before and may never occur again.

To meet these goals, one has to adopt a new perspective on information retrieval, the notion of documents, and publishing in general. This is achieved by exploiting the paradigm shift currently taking place in electronic publishing, caused by hypertext and hypermedia. That is, documents are no longer viewed as static entities published at one point in time in a definite form, but as dynamic and networked collections of information composed on demand and presented with possibilities for interaction. This also allows for multiple views of the content according to the different aspects of interest of different users.

At the Integrated Publications and Information Systems Institute (GMD-IPSI) in Darmstadt, Germany, we are developing concepts and tools supporting the production and use of innovative publication products as flexibly compiled cross-sectional information collections. These efforts are part of the Europublishing project, which is funded by the European R&D program RACE II. As a concrete application, we use the Dictionary of Art, which is to be published this year by Macmillan Publishers Ltd., U.K., as a 34-volume print edition. More than 6,000 authors and 50 editors have been involved in its conventional publication process for over 15 years. Our research focuses on providing support for the complex and demanding editorial work and on the presentation of hypermedia publications (see Figure 1).

Like digital library applications, the Dictionary of Art poses problems of large amounts of information and thus of individual access, involving the selection and combination of information put together according to users’ demands. As is true for all electronic publication products, there is the additional challenge of meeting the quality standards of traditional publishing, in particular those of design and layout, when presenting the information on the screen.

Our approach to meeting these requirements is based on three concepts, which also reflect the overall process. First, we use an object-oriented, formal representation of facts that are extracted from Dictionary of Art source material in order to create a powerful knowledge base. Second, we support a process of network editing and enrichment, achieved partially by automatic means and partially by sophisticated tools for the human editor who is part of the editorial cycle. Third, we employ a set of procedures and tools for automatic presentation of results utilizing the underlying formal representation.

COMMUNICATIONS OF THE ACM April 1995/Vol. 38, No. 4 49

Figure 1. User interface for the Dictionary of Art

The core components of our publishing environment are the Editor’s Workbench for advanced hypermedia publications and an automatic visualization engine. The Editor’s Workbench reflects our view that maintaining and editing a knowledge base of extracted facts remains an intellectual task. However, the Editor’s Workbench (see Figure 2) makes the added value of the intellectual efforts accessible for further processing. The skeleton of the knowledge base is an object network consisting of Dictionary of Art articles and domain-specific objects such as representations of art styles, artists, and works of art. Schema and update behavior of these object types are modeled using the frame-based representation tool SmallTalk Frame Kit (SFK). It supports a variety of consistency checks and atomicity of complex update operations.
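The idea of an object network with schema-driven consistency checks and atomic updates can be illustrated with a minimal Python sketch. It is not SFK and does not reproduce its API: the names `Frame`, `ObjectNetwork`, `check`, and `atomic_update`, as well as the example schema, are assumptions made purely for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A typed object with named slots (hypothetical, not SFK's API)."""
    kind: str                                   # e.g. "Artist", "ArtStyle", "Article"
    name: str
    slots: dict = field(default_factory=dict)   # slot name -> name of target frame


class ObjectNetwork:
    def __init__(self, schema):
        # schema maps a frame kind to {slot name: required kind of the target frame}
        self.schema = schema
        self.frames = {}

    def add(self, frame):
        self.frames[frame.name] = frame

    def check(self):
        """Return a list of consistency violations (empty when consistent)."""
        errors = []
        for frame in self.frames.values():
            for slot, target_name in frame.slots.items():
                expected = self.schema.get(frame.kind, {}).get(slot)
                target = self.frames.get(target_name)
                if expected is None:
                    errors.append(f"{frame.name}: unknown slot {slot!r}")
                elif target is None or target.kind != expected:
                    errors.append(f"{frame.name}.{slot}: expected a {expected}")
        return errors

    def atomic_update(self, operations):
        """Apply all operations or none: restore the snapshot on any failure."""
        snapshot = copy.deepcopy(self.frames)
        try:
            for operation in operations:
                operation(self)
            errors = self.check()
            if errors:
                raise ValueError("; ".join(errors))
        except Exception:
            self.frames = snapshot
            raise
```

In this sketch a complex update is passed as a list of callables. If any step fails, or if the resulting network violates the schema, the network is restored to its previous state, so editors never see a half-applied change.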
Figure 2. Conceptual architecture of the Editor’s Workbench

For knowledge acquisition, that is, for populating the object network, we experiment with automatic text processing techniques ranging from pattern-oriented parsing to full text analysis applied, e.g., to biography articles. The head of the dictionary biographies is a particularly well-structured, densely phrased piece of text that contains the essential facts of a person’s life structured according to editorial guidelines. These guidelines are encoded in the rules of our parsing and text-to-object conversion tools (XGrammar).
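Pattern-oriented parsing of a biography head can be sketched in a few lines of Python. This is not XGrammar, and the head layout assumed here (name, then birth and death data in parentheses, then nationality and professions) is a hypothetical stand-in for the actual editorial guidelines. `HEAD_PATTERN`, `Person`, and `parse_head` are illustrative names only.

```python
import re
from dataclasses import dataclass

# Hypothetical head layout, assumed for illustration only:
#   Name (b Place, date; d Place, date). Nationality profession, profession and profession.
HEAD_PATTERN = re.compile(
    r"^(?P<name>[^(]+?)\s*"
    r"\(b\s+(?P<birth_place>[^,;]+),\s*(?P<birth_date>[^;]+);\s*"
    r"d\s+(?P<death_place>[^,;]+),\s*(?P<death_date>[^)]+)\)\.\s*"
    r"(?P<nationality>\w+)\s+(?P<professions>.+?)\.?$"
)


@dataclass
class Person:
    """Target object for the text-to-object conversion."""
    name: str
    birth_place: str
    birth_date: str
    death_place: str
    death_date: str
    nationality: str
    professions: list


def parse_head(head):
    """Convert one biography head into a Person, or return None if it does not match."""
    match = HEAD_PATTERN.match(head.strip())
    if match is None:
        return None
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    # Split "painter, draughtsman and etcher" into individual professions.
    parts = re.split(r",\s*|\s+and\s+", fields.pop("professions"))
    return Person(professions=[p for p in parts if p], **fields)
```

Calling `parse_head` on a head that follows the assumed layout yields a `Person` whose fields can be linked into the object network. Heads that deviate from the layout return `None` and would be left for the human editor, in line with the view that editing remains an intellectual task.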
Allowing flexible information access to the knowledge base results in unpredictable content selections to be presented. This requires automatic generation of graphical presentations in combination with text generation that flexibly transforms complex knowledge representations into readable natural language texts. Our approach is unique insofar as there are no predefined templates for the presentations, such as timeline, network, or geographical diagrams.

Although a variety of presentations is created by our system, each one is a result of the specific request combined with the available information and depends on the characteristics of the selected content. This integrative approach is based on a common description for both the facts represented in the knowledge base and the basic graphical means of expression: both are described as relations, existing either between domain objects or between graphical elements. Correspondence in their characteristic relational properties serves to determine whether a particular domain relation can be visualized using a particular graphical relation. When a conflict-free set of graphical relations has been decided, an optimization algorithm based on a force model is applied to compute the final presentation.

We built an environment consisting of integrated prototypes providing the functions described here. As a whole, they address major issues relevant to digital libraries of the future, where we expect the source material to go beyond scanned images of text pages of existing books. The described prototypes are part of a more comprehensive effort at GMD-IPSI including multimedia archives based on object-oriented databases, support for cooperative work, such as multiple authors and editors, and information retrieval and 3D visualization for large document bases.

References

1. Kamps, T., Reichenberger, K. A dialogue approach to graphical information access. Designing User Interfaces for Hypermedia, Schuler, W., Hannemann, J., and Streitz, N., Eds. Springer, Heidelberg (1995), 141–55.
2. Rostek, L., Möhr, W. An editor’s workbench for an art history reference work. In Proceedings of the ACM European Conference on Hypermedia Technology. Edinburgh, U.K., Sept. 13–18, 1994, pp. 233–238.
3. Rostek, L., Möhr, W., Fischer, D. Weaving a web: The structure and creation of an object network representing an electronic reference work. Electronic Publishing: Origination, Dissemination and Design. Special Issue, 6, 4. Wiley, NY, 495–505.

Christoph Hüser is the manager of the Publications and Visualization Environment research department at GMD-IPSI. Klaus Reichenberger is a member of the research staff at GMD-IPSI. Lothar Rostek is a senior member of the research staff at GMD-IPSI. Norbert A. Streitz is the deputy director of GMD-IPSI and the manager of the Cooperative Hypermedia Systems research division. Email: hueser, reichen, rostek, streitz@darmstadt.gmd.de

© ACM 0002-0782/95/0400

The University of California CD-ROM Information System
Deane Merrill, Nathan Parker, Fredric Gey, and Chris Stuber

The University of California CD-ROM Information System replaces the equivalent of 260,000 books of published federal statistics with a CD-ROM-based online information system. The size of this database is currently 270 CD-ROMs (135GB). It contains 1990 U.S. census data (approximately 3,000 items of socio-economic and demographic information, including race-ethnicity, employment, income, educational level, and poverty)

[Screenshot: NCSA Mosaic document view of "1990 Census Lookup (1.0.5e)", listing RACE counts (Universe: Persons) for Ann Arbor city.]
Figure 1. Example use of LOOKUP system to retrieve the racial composition of the population of Ann Arbor, Michigan.
