Knowledge-Based Editing and Visualization For Hypermedia Encyclopedias
One of the main goals in developing digital libraries is to provide users with opportunities for accessing and using information in highly flexible and user-oriented ways not available in current information repositories. This implies a focus on supporting intellectual access to information and on considering the context of the information request. Systems for digital libraries should be able to select the appropriate type and amount of information from a comprehensive pool and compose it on the fly into a meaningful and coherent presentation, one which might never have occurred before and might never occur again because it is customized to the current situation.

In order to meet these goals, one has to adopt a new perspective towards information retrieval, the notion of documents, and publishing in general. This is achieved by utilizing the paradigm shift currently taking place in electronic publishing caused by hypertext and hypermedia: documents are no longer viewed as static entities published at one point in time in a definite form, but as dynamic and networked collections of information composed on demand and presented with possibilities for interaction. This also allows for multiple views on the content according to different aspects of interest initiated by different users.

At the Integrated Publications and Information Systems Institute (GMD-IPSI) in Darmstadt, Germany, we are developing concepts and tools supporting the production and use of innovative publication products as flexibly compiled cross-sectional information collections. These efforts are part of the Europublishing project, which is funded by the European R&D program RACE II. As a concrete application, we use the Dictionary of Art, which is to be published this year by Macmillan Publishers Ltd., U.K., as a 34-volume print edition. More than 6,000 authors and 50 editors have been involved in its conventional publication process for over 15 years. Our research focuses on providing support for the complex and demanding editorial work and on the presentation of hypermedia publications (see Figure 1).

Like digital library applications, the Dictionary of Art poses problems of large amounts of information and thus of individual access involving the selection and combination of information, which is put together according to users' demands. As is true for all electronic publication products, there is the additional challenge of meeting the quality standards of traditional publishing, in particular of design and layout, when presenting the information on the screen.

Our approach for meeting these requirements is based on three concepts which also reflect the overall process. First, we use an object-oriented formal representation of facts which
are extracted from Dictionary of Art source material in order to create a powerful knowledge base. Second, we support a process of network editing and enrichment, partially achieved by automatic means and partially by sophisticated tools for the human editor who is part of the editorial cycle. Third, we employ a set of procedures and tools for automatic presentation of results utilizing the underlying formal representation.

The core components of our publishing environment are the Editor’s Workbench for advanced hypermedia publications and an automatic visualization engine. The Editor’s Workbench reflects our view that maintaining and editing a knowledge base of extracted facts remains an intellectual task. However, the Editor’s Workbench (see Figure 2) makes the added value of the intellectual efforts accessible for further processing. The skeleton of the knowledge base is an object network consisting of Dictionary of Art articles and domain-specific objects such
as representations of art styles, artists, and works of art. Schema and update behavior of these object types are modeled using the frame-based representation tool Smalltalk Frame Kit (SFK). It supports a variety of consistency checks and atomicity of complex update operations.
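
To make the idea of consistency checks combined with atomic updates more concrete, the following is a minimal Python sketch. It is not SFK and does not reproduce its API; the `Frame` class, the `ConsistencyError` exception, the slot names, and the sample artist data are all hypothetical and chosen only for illustration. The sketch assumes that a "consistency check" means validating slot names and value types, and that "atomicity" means a multi-slot update is applied completely or not at all.

```python
class ConsistencyError(Exception):
    """Raised when an update would violate a slot constraint (hypothetical)."""


class Frame:
    """A hypothetical frame: named slots, each restricted to one value type."""

    def __init__(self, name, schema):
        self.name = name
        self.schema = dict(schema)  # slot name -> required Python type
        self.slots = {}

    def check(self, slots):
        """Consistency check: every slot must be declared and well typed."""
        for slot, value in slots.items():
            if slot not in self.schema:
                raise ConsistencyError(f"{self.name}: unknown slot {slot!r}")
            if not isinstance(value, self.schema[slot]):
                raise ConsistencyError(f"{self.name}: bad value for {slot!r}")

    def update(self, **changes):
        """Apply several slot changes atomically: all of them or none."""
        candidate = dict(self.slots)
        candidate.update(changes)
        self.check(candidate)   # raises before anything is committed
        self.slots = candidate  # commit only after all checks passed


# Illustrative data, not taken from the Dictionary of Art knowledge base.
artist = Frame("Artist", {"name": str, "born": int, "styles": list})
artist.update(name="Albrecht Dürer", born=1471)

try:
    # One valid change and one invalid change: the whole update is rejected.
    artist.update(styles=["Renaissance"], born="1471")
    rejected = False
except ConsistencyError:
    rejected = True
```

Because the changes are validated on a copy before being committed, the rejected update leaves the frame exactly as it was, including the slot whose new value would have been valid on its own.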
For knowledge acquisition, that is, for populating the object network, we experiment with automatic text processing techniques ranging from pattern-oriented parsing to full text analysis applied, e.g., to biography articles. The head of the dictionary biographies is a particularly well-structured, densely phrased piece of text that contains the essential facts of a person’s life structured

Figure 2. Conceptual architecture of the Editor’s Workbench
The University of California CD-ROM Information System replaces the equivalent of 260,000 books of published federal statistics with a CD-ROM-based online information system. The size of this database is currently 270 CD-ROMs (135GB). It contains 1990 U.S. census data (approximately 3,000 items of socio-economic and demographic information, including race-ethnicity, employment, income, educational level, and poverty)

[...] description for both the facts represented in the knowledge base and for the basic graphical means of expression: both are described as relations, existing either between domain objects or between graphical elements. Correspondence in their characteristic relational properties serves to determine whether a particular domain relation can be visualized using a particular graphical relation. When a conflict-free set of graphical relations has been decided, an optimization algorithm based on a force model is applied to compute the final presentation.
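
The two steps described above, matching domain relations to graphical relations by their relational properties and then optimizing the layout with a force model, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the relation names, the property sets, the rule that a graphical relation qualifies only when its property set equals that of the domain relation, and the one-dimensional spring model. The article does not specify the actual matching criterion or the actual force model.

```python
# Hypothetical relational properties of domain and graphical relations.
DOMAIN = {
    "teacher_of": {"irreflexive", "asymmetric"},
    "contemporary_of": {"symmetric"},
}
GRAPHICAL = {
    "arrow": {"irreflexive", "asymmetric"},
    "left_of": {"irreflexive", "asymmetric", "transitive"},
    "adjacent": {"symmetric"},
}


def candidates(domain_relation):
    """Graphical relations whose properties equal the domain relation's.

    Exact equality is an assumed criterion: a graphical relation with extra
    properties (e.g. transitivity) could suggest facts that do not hold.
    """
    wanted = DOMAIN[domain_relation]
    return sorted(g for g, props in GRAPHICAL.items() if props == wanted)


def relax(positions, edges, rest=1.0, step=0.1, iterations=200):
    """Toy 1-D force model: each edge is a spring with rest length `rest`."""
    pos = dict(positions)
    for _ in range(iterations):
        forces = {node: 0.0 for node in pos}
        for a, b in edges:
            d = pos[b] - pos[a]
            dist = abs(d) or 1e-9            # avoid division by zero
            f = (dist - rest) * (d / dist)   # pulls together if stretched
            forces[a] += f
            forces[b] -= f
        for node in pos:
            pos[node] += step * forces[node]
    return pos


teacher_choice = candidates("teacher_of")
contemporary_choice = candidates("contemporary_of")
layout = relax({"a": 0.0, "b": 5.0}, [("a", "b")])
gap = abs(layout["b"] - layout["a"])
```

In this toy setting the two connected nodes drift toward each other until the distance between them settles at the spring's rest length; a realistic layout would work in two dimensions and add repulsive forces between unconnected elements.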
[Figure: NCSA Mosaic Document View of the 1990 Census Lookup (1.0.5e), http://cedr.lbl.gov/cdrom/lookup/date=788117324; current level: State – Place]

We built an environment consisting of integrated prototypes providing the functions described here. As a whole, they address major issues relevant to digital libraries of the future, where we expect the source material to go beyond scanned images of text pages of existing books. The described prototypes are part of a more comprehensive effort at GMD-IPSI