
Article · July 2000



CGWorld - A Web Based Workbench for Conceptual
Graphs Management and Applications
Pavlin Dobrev¹ and Kristina Toutanova²

¹ ProSyst Bulgaria Ltd., Sofia, Bulgaria
p_dobrev@prosyst.com
² Stanford University, Department of Computer Science, Stanford, CA, USA
kristina@cs.stanford.edu

Abstract. This article presents CGWorld, a Web based workbench for the joint
distributed development of a knowledge base (KB) of conceptual graphs (CGs) that
resides on a central server. It is implemented in Java and Prolog. The workbench
includes a graphical, easy to use CG Editor written as a Java applet. CGWorld has
facilities for translating CGs between four formats (Display form, Internal Prolog
Representation, First Order Predicate Calculus, and CGIF), as well as textual and
graphical browsing features for easy search in KBs of conceptual graphs. In addition,
Application Server technology is used to add Internet access to previously developed
CG applications. Because the client is the standard Internet client, the browser, the
new presentation layer also makes it possible to add new features to these applications.

1 Introduction

One of the problems faced by CG researchers has been the diversity of the formats used
to encode CGs and of the platforms and software tools used to develop CG based
applications. A second major problem that we encountered was the absence of a reliable
CG editing tool that allows partners in different parts of the world to develop a large
knowledge base of conceptual graphs over the course of several years. The CGWorld
workbench presented in this paper is our initial solution to some of these problems. It is
a unified framework for distributed remote development of a knowledge base and
applications of conceptual graphs. Java technologies have been employed to implement
multi-user access over the Internet for viewing and modifying a KB of conceptual graphs.
Several formats of CG encoding are supported to improve interchangeability and the reuse
of previously developed applications: import and export from and to CGIF [1], Internal
Prolog Representation (IPR), First Order Predicate Calculus (FOPC), and Display form
have been implemented. Web based access to a restricted version of a natural language
generation system [6], written in Prolog, has been added as part of the CGWorld
workbench, thus demonstrating the promise that Java Application Server technology holds
for the reuse and integration of new or previously developed conceptual graphs
applications.

2 Motivation

The system presented here was first inspired by the needs of the LARFLAST (LeARning
Foreign LAnguage Scientific Terminology¹) project. The overall aim of the project is to
develop intelligent tools that assist users from CCE/NIS countries in foreign language
learning, specifically in learning scientific and technical language. The approach of the
project is to apply natural language processing techniques in developing a generic
intelligent system for learning foreign language terminology, which can subsequently be
adapted to different source languages, target languages and scientific/technical areas.

The project involves the development of a knowledge base in some knowledge
representation formalism to encode the linguistic, semantic, pedagogical knowledge, etc.,
as well as the implementation of a number of multilingual tools to accommodate different
source and target languages and technical areas.

A major requirement of this project is the ability to develop and maintain, cooperatively
and remotely, a large knowledge base of conceptual graphs over the course of several
years. The CGWorld workbench has been built in response to this need. Using the
universal Internet client, the browser, the system allows access to and editing of the KB
from any computer connected to the Internet. A mixed textual and graphical browsing
interface and a graphical CG Editor have been developed so that even non-expert users
can edit the KB.

In summary, the goals that the CGWorld workbench had to meet were: (i) to allow
collaborative, distributed acquisition and editing of a CG knowledge base; (ii) to provide
easy search and navigation in a large KB; (iii) to maintain different representation
languages, thus accommodating the needs of the different users of CGWorld and of the
different applications in which the KB of CGs is used; (iv) to provide a graphical editor
and viewer for CGs that is easy to use even for users who are not experts in CG theory;
(v) to integrate currently or previously developed CG applications, written in different
programming languages, and to add Web access to them.

¹ INCO Copernicus'98 Joint Research Project #977074


3 Implementation

Java - the implementation language


Java was chosen as the implementation language because it provides programmers with
object-oriented technology, automatic memory management, and platform independence.
It is increasingly used in the software industry.

Prolog
The Prolog programming language is one of the major languages for Artificial
Intelligence. Many CG tools and applications have used Prolog [2,3,4,6]. It is suitable for
fast development of prototypes of real systems. Its most recent implementations allow for
its use for industrial applications as well.

For the work presented in this paper SICStus Prolog is used because of its easy
integration with Java through Jasper. Jasper is a bi-directional interface between SICStus
Prolog and Java. It is currently built on top of the JNI together with the C-Prolog
interface. As a result, a Prolog-enhanced Java program cannot be used in an applet, for
example, because SICStus would generally not be installed on the client computer and,
furthermore, applets do not have the privileges to invoke native applications residing on
the client. Our solution has been to develop a Java server that provides the communication
between the two programming languages.

Application Server Technology


Generally defined as a platform for data interchange among applications of different
systems, the application server plays the role of an intermediary between the Internet
client and the applications and databases of the client.

3.1 Architecture

The system is composed of three parts - client(s), server(s) and a Prolog agent.

A client can be any Internet user. The only requirements for the client part are a Java
enabled browser and the Swing 1.1 library, which is freely available from the JavaSoft site
(http://java.sun.com). Other CG applications that use the knowledge base are also clients.

A server manages the client support and the information transfer between the Prolog
agent and the client. A simple Web server has been developed to support HTML clients
(browsers). An RMI server has been implemented to accept the requests from other
applications: the CG Editor, the Type Hierarchy browser, etc.

The Prolog agent is responsible for the synchronization between the Prolog knowledge
base and its object representation in Java. It also invokes operations implemented in
Prolog whose results are used for generating HTML pages.

[Figure omitted: a Web client (browser; HTML and Java) communicates over the Internet via
HTTP and RMI with the Java server, which works with the Prolog agent and the Conceptual
Graphs KB (Prolog).]

Fig. 1. High Level Architecture

3.2 Structure of the Knowledge Base

The basic representation for all conceptual objects in the KB is the Internal Prolog
Representation. It is based on the CGLex [4] representation. It allows for efficient
implementation of the canonical formation rules and for straightforward translation to
CGIF and FOPC representations. The CGLex format was further developed to meet the
needs of the LARFLAST project. For example, the extended format allows additional
information to be kept in annotations about the intent of creation of the CGs, the creator,
the date of creation, etc. The position and color information for the conceptual objects can
also be kept in comments in the Prolog source files; therefore the translation to display
form is straightforward as well. Further information about the format of the knowledge
base can be found in [2].

On top of this Prolog representation, a Java object-oriented model of the KB is built. Only
the Prolog representation is kept on disk; the object-oriented one is constructed from it
when necessary.

Here is a short description of the conceptual objects:

• Cgc - represents a concept;
• Cgr - represents a conceptual relation;
• Coreference - represents a coreference link between two concepts;
• Cg - represents a conceptual graph;
• TypeDef - represents a type definition;
• Isa - represents an isa relation in the type hierarchy;
• IsaKind - represents a subtyping relation in the type hierarchy which encodes
information about perspectives;
• CgComment - represents a CG comment that contains the layout information.

Fig. 2. UML class diagram of the Java object-oriented model of the KB
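
As a rough illustration of what such an object model might look like in Java, the sketch
below defines minimal classes for three of the conceptual objects listed above. Only the
class names Cgc, Cgr and Cg come from the paper; the fields, constructors and helper
methods are assumptions made for this example and are not taken from the CGWorld sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** A concept: a type label plus an optional referent (hypothetical fields). */
class Cgc {
    final String id;
    final String typeLabel;
    final String referent; // null for a generic concept

    Cgc(String id, String typeLabel, String referent) {
        this.id = id;
        this.typeLabel = typeLabel;
        this.referent = referent;
    }
}

/** A conceptual relation: a name and an ordered list of argument concepts. */
class Cgr {
    final String name;
    final List<Cgc> arguments;

    Cgr(String name, List<Cgc> arguments) {
        this.name = name;
        this.arguments = Collections.unmodifiableList(new ArrayList<>(arguments));
    }
}

/** A conceptual graph: an identifier, its concepts and its relations. */
class Cg {
    final String id;
    final List<Cgc> concepts = new ArrayList<>();
    final List<Cgr> relations = new ArrayList<>();

    Cg(String id) {
        this.id = id;
    }

    /** Creates a concept, adds it to this graph and returns it. */
    Cgc addConcept(String conceptId, String typeLabel, String referent) {
        Cgc concept = new Cgc(conceptId, typeLabel, referent);
        concepts.add(concept);
        return concept;
    }

    /** Creates a relation over the given concepts and adds it to this graph. */
    Cgr addRelation(String name, Cgc... args) {
        Cgr relation = new Cgr(name, Arrays.asList(args));
        relations.add(relation);
        return relation;
    }
}
```

With these classes, a graph for "A cat sits on a mat" could be assembled by adding three
generic concepts (Cat, Sit, Mat) and two relations, for example Agnt(Sit, Cat) and
Loc(Sit, Mat).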

3.3 Web Interface

The interface to the CGWorld workbench is a set of dynamically generated HTML pages
and Java applets. The HTML pages contain information from the knowledge base of
CGs. A number of useful features for browsing and visualization of the KB are accessible
through the Web interface. From the index page there are links to seven ways of searching
and browsing through the conceptual objects in the KB.

The concepts can be browsed by identifier and by a substring contained in their Prolog
representation. The conceptual graphs can be browsed by identifier, by the type label of a
concept or relation that they contain, and by a substring of their internal Prolog
representation.

After a search request is submitted, the matching concepts are displayed in IPR and the
matching conceptual graphs are displayed in IPR and graphical form. On any of these
dynamically generated HTML pages, clicking on the image of a conceptual graph invokes
the generation of a new HTML page that contains all four representations of that CG. It is
equivalent to the result of a search under the View Conceptual Graphs link. From the View
Conceptual Graphs link, CGs can be browsed by identifier and are displayed in four
different forms: Display Form, IPR, CGIF, and FOPC.

As an example, the next figure shows part of the result of a search request under the
Search by Word Contained in the Prolog Representation link. The keyword typed is
“water”. The HTML page shows fifteen conceptual graphs containing “water” in their
IPRs and around fifty concepts. From this page the user can obtain the representations of
the CGs in four formats by clicking on their images.

Fig. 3. Substring Search for “water”
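
The substring search described above can be pictured with a very small Java sketch. It
assumes, purely for illustration, that the textual representation of each conceptual
object is available as a string keyed by its identifier; the class and method names are
hypothetical and the matching is a plain case-sensitive containment test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SubstringSearchSketch {
    /**
     * Returns the identifiers of all objects whose stored textual
     * representation contains the keyword, in the iteration order of the map.
     */
    static List<String> search(Map<String, String> representationById, String keyword) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : representationById.entrySet()) {
            if (entry.getValue().contains(keyword)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

A linear scan like this is adequate for a small KB; for a large one an index or a database
query would be the more likely choice.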

The Type Hierarchy can be viewed through the Type Hierarchy Java Browser or the
HTML Viewer.
The next figure shows dynamically generated Web pages: the left one displays a
conceptual graph in four representations and the one to the right shows a visualization of
the type hierarchy as a graph.

Fig. 4. The Conceptual Graph “A cat sits on a mat” and the Type Hierarchy

The page to the left in Figure 4 has been generated in the following manner: Given a CG
identifier, Java generates a blank HTML page; fills it with the IPR representation of this
conceptual graph and all the concepts it contains; generates the graphical representation of
the CG and passes it as a parameter to an applet, which visualizes it; calls Prolog
predicates which translate the graph into CGIF and FOPC forms; displays their results in
the tables shown.

The figure to the right displays a visualization of the type hierarchy implemented through
a Java applet, which builds the tree from the IPR representation of the isa relations
between types; a node with more than one parent appears once under each parent. The
nodes can be expanded and collapsed by clicking on them. The information about the type
of partitions in the type hierarchy (exhaustive, disjoint, role, packager), encoded through
isa_kind/3 facts in the IPR, is not visualized at this stage.
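
The idea of unfolding a hierarchy with multiple inheritance into a display tree can be
sketched as follows. This is an illustrative reconstruction rather than the applet's code:
the isa relations are assumed to be given as {subtype, supertype} pairs, the hierarchy is
assumed to be acyclic, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TypeTreeSketch {
    /** A displayed node; a type with several supertypes yields several nodes. */
    static class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type) {
            this.type = type;
        }
    }

    /** Builds the display tree under root from {subtype, supertype} pairs. */
    static Node build(String root, List<String[]> isaPairs) {
        Map<String, List<String>> subtypes = new LinkedHashMap<>();
        for (String[] pair : isaPairs) {
            subtypes.computeIfAbsent(pair[1], k -> new ArrayList<>()).add(pair[0]);
        }
        return expand(root, subtypes);
    }

    private static Node expand(String type, Map<String, List<String>> subtypes) {
        Node node = new Node(type);
        for (String sub : subtypes.getOrDefault(type, new ArrayList<>())) {
            // A fresh node is created on every visit, so a type reachable
            // through two parents is shown once under each of them.
            node.children.add(expand(sub, subtypes));
        }
        return node;
    }

    /** Counts how many times a type appears in the tree. */
    static int occurrences(Node node, String type) {
        int count = node.type.equals(type) ? 1 : 0;
        for (Node child : node.children) {
            count += occurrences(child, type);
        }
        return count;
    }
}
```

For example, if Cat is declared a subtype of both Animal and Pet, and both of these are
subtypes of Entity, the tree built from Entity contains two Cat nodes.
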
3.4 The CGWorld Graphical Conceptual Graphs Editor

The graphical CG Editor has been implemented as a Java applet and provides an easy to
use Drag & Drop interface for the definition and manipulation of a conceptual graphs
knowledge base. After it is started, the Editor can load conceptual objects from, and save
them to, the server from which it has been loaded. (The server is automatically detected
when the applet is loaded in the client browser.)

The data transfer between the Editor and the server is carried out through two protocols,
HTTP and RMI. The currently edited knowledge base is loaded through HTTP. This speeds
up the transfer because all data is loaded in a single HTTP operation and the data is not
additionally encoded as in RMI. On the other hand, the loading and saving of parts of the
knowledge base is implemented through RMI. The advantages of this approach are the
data transparency and the high level of abstraction that RMI provides. The operations
implemented through RMI include insert, delete, and load of concepts, relations, type
definitions, and annotations, as well as operations for saving the changes to the KB and
synchronizing it with the Prolog KB.
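
To make the RMI side more concrete, the following sketch shows what a remote interface
for a few of these operations might look like. The interface name, method names and the
in-memory implementation are assumptions for illustration only; the actual CGWorld
interface is not given in the paper. The representation of a concept is passed as an
opaque string.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical remote interface; every RMI method must declare RemoteException. */
interface KnowledgeBaseService extends Remote {
    void insertConcept(String id, String representation) throws RemoteException;

    String loadConcept(String id) throws RemoteException;

    boolean deleteConcept(String id) throws RemoteException;
}

/**
 * In-memory stand-in for the server side. A real server would export an
 * instance through RMI and keep it synchronized with the Prolog KB.
 */
class InMemoryKnowledgeBase implements KnowledgeBaseService {
    private final Map<String, String> concepts = new ConcurrentHashMap<>();

    @Override
    public void insertConcept(String id, String representation) {
        // Reject duplicate identifiers so that identifiers stay unique.
        if (concepts.putIfAbsent(id, representation) != null) {
            throw new IllegalArgumentException("Duplicate identifier: " + id);
        }
    }

    @Override
    public String loadConcept(String id) {
        return concepts.get(id); // null if the identifier is unknown
    }

    @Override
    public boolean deleteConcept(String id) {
        return concepts.remove(id) != null;
    }
}
```

On a server, such an object would typically be exported with
`UnicastRemoteObject.exportObject` and bound in a registry obtained from
`LocateRegistry`, after which a client such as the Editor can look it up and call the
methods as if they were local.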

Fig. 5. The display form of the CG 'Mixture consisting of small solid particles, evenly
dispersed in liquid or gas'

The CGWorld Editor has one main panel, which includes 11 buttons that are used for
creation and manipulation of conceptual objects and data interchange between the server
and the Editor.

The main features of the CGWorld Editor are:

• Portable across all platforms (It has been tested with the most popular browsers -
Netscape Navigator and Internet Explorer);
• Any number of graph windows may be opened for editing;
• Concepts, relations, arcs, coreference links and contexts are supported for editing via
a simple Drag & Drop interface;
• Ability to customize the color, the position and the size of conceptual objects;
• Ability to assign any number of additional properties to the conceptual objects (e.g.
number, definite marker, comment);
• Zooming capability;
• Storing and retrieving of conceptual graphs to/from the application server.

3.5 Java and Prolog Work Together

The chosen model, a mixed Java and Prolog application, allows distinct modules to be
implemented in different programming languages.

Prolog is the language of the main representation of CGs, the IPR, which is saved across
sessions. It was also used for the FOPC representation, for inference on the FOPC
representation and for other CG manipulations. Some of the Prolog modules are not yet
accessible from CGWorld.

Java was used for implementing the CGWorld Graphical CG Editor, the Application
Server, the type hierarchy viewer and other modules.

Canonical Formation Rules


The CG operations copy, restrict, project, simplify, and join have been implemented for
a restricted kind of conceptual graphs. They exist as Prolog predicates and are not yet
accessible through the Web interface, because work is under way on better and more
general implementations of them.

Translation to First Order Predicate Calculus Form


The subset of the CGs in the LARFLAST KB that is representable as FOPC formulas is
translated to logical formulas, which is sufficient for some simple reasoning tasks. The
translation is not meant to provide for rigorous inference on the knowledge base. The
algorithm for translation follows the basic algorithm proposed by John Sowa in [5]. Some
extensions have been implemented to account for (i) named concepts; (ii) universally
quantified concepts; (iii) generic sets; (iv) contexts and relations between them (only
contexts of type proposition and relations that represent logical connectives are allowed).
For the kinds of conceptual graphs that are translated we have not run into the
problematic cases for the translation operator as originally defined by Sowa.
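
For the simplest case, a graph made only of generic concepts and relations, the basic
translation can be sketched as follows: each concept contributes an existentially
quantified variable and a one-place predicate named after its type, and each relation
contributes a predicate over the variables of its arguments. The sketch covers only this
basic case and none of the extensions (i) to (iv); the class name, the array-based input
and the ASCII notation (`exists`, `&`) are assumptions made for the example.

```java
import java.util.List;
import java.util.StringJoiner;

class FopcSketch {
    /**
     * concepts:  {variable, typeLabel} for each generic concept.
     * relations: {name, variable1, variable2, ...} for each relation.
     */
    static String translate(List<String[]> concepts, List<String[]> relations) {
        StringBuilder quantifiers = new StringBuilder();
        StringJoiner body = new StringJoiner(" & ");
        for (String[] concept : concepts) {
            // One existential quantifier and one type predicate per concept.
            quantifiers.append("exists ").append(concept[0]).append(" ");
            body.add(concept[1] + "(" + concept[0] + ")");
        }
        for (String[] relation : relations) {
            StringJoiner args = new StringJoiner(",");
            for (int i = 1; i < relation.length; i++) {
                args.add(relation[i]);
            }
            body.add(relation[0] + "(" + args + ")");
        }
        return quantifiers + "(" + body + ")";
    }
}
```

Called with the concepts {x, Cat}, {y, Sit}, {z, Mat} and the relations {Agnt, y, x},
{Loc, y, z}, one possible encoding of "A cat sits on a mat", the method returns
`exists x exists y exists z (Cat(x) & Sit(y) & Mat(z) & Agnt(y,x) & Loc(y,z))`.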

Translation to CGIF
For the purpose of being transmitted and interchanged, the CGs in the LARFLAST KB
can be imported from and exported to CGIF form. The translation from IPR to CGIF is
rather straightforward, since in both formats a conceptual graph is a list of relations and
a relation is a relation name followed by a list of arguments.
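
A sketch of the export direction is given below. It does not use the actual IPR syntax,
which is described in [2]; instead it assumes a graph is already available as a list of
labelled concepts and a list of relations over those labels, and it emits the basic CGIF
pattern in which a concept introduces a defining label (`*x`) and relations refer to it
through bound labels (`?x`). The class and method names are hypothetical.

```java
import java.util.List;
import java.util.StringJoiner;

class CgifSketch {
    /**
     * concepts:  {label, typeLabel} for each generic concept.
     * relations: {name, label1, label2, ...} for each relation.
     */
    static String toCgif(List<String[]> concepts, List<String[]> relations) {
        StringJoiner out = new StringJoiner(" ");
        for (String[] concept : concepts) {
            // A concept node with a defining label, e.g. [Cat: *x]
            out.add("[" + concept[1] + ": *" + concept[0] + "]");
        }
        for (String[] relation : relations) {
            // A relation node with bound labels, e.g. (Agnt ?y ?x)
            StringJoiner node = new StringJoiner(" ", "(", ")");
            node.add(relation[0]);
            for (int i = 1; i < relation.length; i++) {
                node.add("?" + relation[i]);
            }
            out.add(node.toString());
        }
        return out.toString();
    }
}
```

For the concepts {x, Cat}, {y, Sit}, {z, Mat} and the relations {Agnt, y, x}, {Loc, y, z}
the result is `[Cat: *x] [Sit: *y] [Mat: *z] (Agnt ?y ?x) (Loc ?y ?z)`.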

Web Interface for the Natural Language Generators from DBR-MAT


The Application server developed as a part of CGWorld has been used to add Web access
to the natural language generation modules of DBR-MAT, a previously developed natural
language application based on conceptual graphs. DBR-MAT is a Machine-Aided-Translation
system that uses the conceptual graphs formalism to encode domain knowledge and to
perform inference and natural language generation over its knowledge base of conceptual
graphs.

Fig. 6. Explanation in Bulgarian of suspension with level of detailness less


The modules for explanation generation in Bulgarian and German have been made
accessible from CGWorld. Because the client is the standard Internet client, the browser,
the new presentation layer adds valuable features to the DBR-MAT generators. The
browsing capabilities of standard browsers provide easier navigation through the domain
knowledge. Web access is by itself an important advantage for computer systems, because
it means easier and wider access. In addition, user management and session histories
allow better user modeling and customization of the system's replies.

The CGWorld graphical CG Editor can also be used to manipulate the DBR-MAT
knowledge base and to view the knowledge from which the natural language explanations
have been generated. There are plans to add a Generate button to the CG Editor working
panel, thus allowing the direct use of the generation modules in the editor. The
multilingual generation program can also be used for automatic generation of CG comments
and annotations, which aids the collaborative development of knowledge bases.

Figure 6 shows a sample session with the NL Generators: an explanation in Bulgarian of
the concept suspension. In the figure, some words have hyperlinks. Clicking on them
invokes a query to the system for the corresponding concept.

4 User Access

The CGWorld workbench is publicly available at http://www.larflast.bas.bg:8080. The
figure below shows the initial page of the knowledge base browser. It has hyperlinks to
different ways of viewing and searching through the concepts, conceptual graphs and the
type hierarchy. The main entrance page contains instructions for use, installation
information, links for logging in to the KB server, a link to the User Guide of the
CGWorld editor, and other information.

There are two ways to log in to the knowledge base server: as a guest, without
authorization, and as a registered user. There is a registration form, which must be filled
in to establish a user account on the server. The access levels are determined according
to the user and the group he/she belongs to. Currently the mechanism of user
identification is not in full operation and serves only for determining global access and
for logging user requests.

After logging in to the CGWorld server, the user is presented with a set of HTML pages
and Java applets through which he/she communicates with the system.

Fig. 7. CGWorld – Knowledge Base Browser Index Page

5 Conclusions and Future Work

CGWorld is an initial solution to the goals that motivated its creation. It allows easy
distributed development of a CG knowledge base over the Internet. Although it is a
framework for collaborative development of a knowledge base, no special protocols have
been integrated to ensure the semantic consistency of the knowledge, apart from the RMI
interface to the KB, which takes care of data integrity, uniqueness of identifiers, etc.

The browsing capabilities have been considered important, and due attention has been
paid to ensuring fast and easy search in the knowledge base and 'nice' visualization of
the knowledge constructs in both textual and graphical forms. As the knowledge base
becomes larger, an underlying database will be integrated for efficient storage and
retrieval.

The four different representation formats of the CGs allow flexibility depending on the
particular applications of the knowledge base of conceptual graphs. The use of the
Application server to integrate the DBR-MAT generation modules has been quite
successful and holds promise for the reuse of previously developed CG resources.

As compared to other CG engineering environments, CGWorld lacks some features and
on the other hand has some advantages.

The canonical formation rules for CGs have not yet been integrated, as they are in [7, 8].
This will be the next step in the development of our CG workbench. The graphical editing
facilities are quite advanced, and the Editor is run over the Internet rather than
downloaded locally as in [9]. CGWorld does not currently allow saving the KB to a local
machine, as [9] does. The CG Editor is implemented as an applet rather than as an
application as in [8], thus giving up the privileges that applications have but increasing
security and ease of maintenance.

In accordance with the new Java technologies, the next release of the CGWorld workbench
will use an application server that supports Java 2 Enterprise Edition (J2EE). The current
application server will be replaced with a set of servlets and Java Server Pages (JSP). The
application logic will be implemented as a set of Session Enterprise JavaBeans, which will
facilitate the management of user sessions. A set of Entity Enterprise JavaBeans will
represent persistent conceptual objects such as concept, relation, context, referent, and
arc. This will allow the maintenance of large amounts of data, and data integrity will be
controlled by the built-in mechanisms for transaction maintenance. As a next step, the CG
workbench will be extended with enterprise components for other modules of LARFLAST.

References

1. Conceptual Graph Standard. Information Technology (IT) - Conceptual Graphs, draft proposed
American National Standard (dpANS) NCITS.T2/98-003
(http://www.bestweb.net/~sowa/cg/cgdpansw.htm).
2. G. Angelova, K. Toutanova, and S. Damianova. Knowledge Base of Conceptual Graphs in
DBR-MAT. University of Hamburg, Computer Science Faculty, Project DBR-MAT (funded by the
Volkswagen Foundation). Technical Report BG-3-98, July 1998.
3. A.-M. Rassinoux, R. H. Baud and J.-R. Scherrer. A Multilingual Analyzer of Medical Texts
4. G. Angelova, S. Damianova, K. Toutanova, K. Bontcheva. Menu-Based Interfaces to Conceptual
Graphs: The CGLex Approach. ICCS 1997: pp. 603-606.
5. John F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Reading,
MA: Addison-Wesley Publ., 1984.
6. G. Angelova, K. Bontcheva. DB-MAT: Knowledge Acquisition, Processing and NL Generation
Using Conceptual Graphs. ICCS 1996: pp. 115-129.
7. D. Lukose. CGKEE: Conceptual Graph Knowledge Engineering Environment. ICCS 1997: pp.
598-602.
8. S. Pollitt, A. Burrow, P. Eklund. WebKB-GE - A Visual Editor for Canonical Conceptual Graphs.
ICCS 1998: pp. 111-118.
9. H. Delugach. CharGer - A Conceptual Graph Editor written by Harry Delugach.
http://www.cs.uah.edu/~delugach/CharGer/
