KNOWLEDGE MANAGEMENT

Knowledge Retrieval and the World Wide Web

Philippe Martin and Peter W. Eklund, Griffith University

The Web is currently a distributed mass of simple hypertext documents. Large-scale Web search engines such as AltaVista and Infoseek are not capable of retrieving precise information results. The authors present a new tool, WebKB, that interprets semantic statements stored in Web-accessible documents.

Large-scale Web search engines effectively retrieve entire documents, but they are imprecise, because they do not exploit, and hence do not retrieve, the semantic content of Web documents. We cannot yet automatically extract such content from general documents. Manually structuring Web documents (for example, with XML) lets us retrieve more precise information using string- and structure-matching tools, such as the Web robots Harvest, WebSQL, and WebLog. However, this approach is not scalable, because it only retrieves fine-grained information if the documents are thinly structured and the querier knows their structures, exact names, and forms.

Knowledge representation languages that support logic inference can help us achieve more flexible and precise knowledge representation and retrieval. Industry is currently developing many metadata languages to let people index Web information resources with knowledge representations (logical statements) and store them in Web documents. However, these metadata languages are insufficient to satisfy several requirements necessary to allow precise, flexible, and scalable information retrieval.

On the basis of ease of use and representational completeness, we argue in favor of general and intuitive knowledge representation languages such as conceptual graphs (CGs) [1] rather than the direct use of XML-based languages. To let users represent knowledge at the level of detail they require, we propose simple notations for restricted knowledge representation cases and a technique that lets users leave knowledge terms undeclared. We built a Web-accessible tool (CGI server), WebKB [2, 3], to support this approach and let its users combine lexical, structural, and knowledge-based techniques to exploit or generate Web documents. WebKB is an ontology server and directed Web robot. (See the sidebar for a list of related URLs.)

Metadata language requirements

A first requirement is that the metadata language must be intuitive and concise enough for people to use easily (after a short training period). Most current knowledge-oriented metadata languages are built above XML, such as the Resource Description Framework (RDF) and Ontology Markup Language (OML). The choice of XML as an underlying format lets you use standard XML tools to exchange and parse these metadata languages. However, because XML is verbose, the metadata languages built above it are verbose and difficult to use without specialized editors. Such editors do not eliminate the need for people to use a language to represent knowledge (except in application-dependent editors that only let you fill predefined "frames"). Consequently, with XML-based languages, we must write information in two versions: one for machines and another for humans [4]. Additionally, standard XML tools are of little use, because we still require specialized editors,
18 1094-7167/00/$10.00 © 2000 IEEE IEEE INTELLIGENT SYSTEMS

analyzers, and inference engines to manage these languages.

To reduce information redundancy, Ontobroker, an ontology that guides information retrieval from annotated HTML documents accessible on the Web, provides a notation for embedding attribute-value pairs inside an HTML hyperlink tag. A document's author can use these tags to delimit an element. Thus, he or she can implicitly reference each element in the knowledge statement within the tag enclosing the element. A wrapper has been added to Ontobroker to generate RDF definitions automatically from Ontobroker metadata.

Along this same line, a document's author should be allowed to make some knowledge statements visible to readers. This is an obvious requirement when we can use an especially intuitive notation, for example, when we can make graphics with a visual language or write sentences with a controlled language, a subset of natural language that eliminates ambiguity. This visualization feature is also handy with any notation when the document provides explanations about the knowledge statements it stores. In this way, for example, we can integrate a knowledge base and its associated documentation within the same document and access both using classic searches (such as string-matching, navigating a table of contents, and so on) and knowledge-based searches.

Although the Ontobroker metadata language was designed to reduce information redundancy, its statements cannot be displayed by browsers, because they are within HTML tags. Furthermore, like the RDF, the Ontobroker metadata language is a notation for attribute-value pairs. Such a representation is general but basic and hard to read. Finally, only a document's authors can index its parts, because they index document elements with HTML tags. Others are limited to only those elements that are accessible via URLs.

A metadata language should also be sufficiently precise and general to let users represent any Web-accessible information at the desired level of precision. This implies that the metadata language is based on an expressive formal model and that it has a notation letting the user exploit the formal model's expressivity. Any formalism equivalent to first-order logic that permits the use of contexts is an appropriate candidate. The Knowledge Interchange Format (KIF) and CGs are good examples. It is important not to restrict users, but for efficiency reasons, a search engine can ignore some features in knowledge statements. For example, a CG-based search engine can ignore references to sets within CGs and still exhibit adequate precision (the CGs with references to sets are also retrieved).

The Ontobroker metadata language and the RDF are general but imprecise, because they are oriented toward representing entire documents (not arbitrary parts of them) and do not propose conventions to represent logic-based features, such as quantifiers and operators. This limits the capacity of their statements.

Relevant URLs
AI-Trader: www.vsb.informatik.uni-frankfurt.de/projects/aitrader/intro.html
Apecks: ksi.cpsc.ucalgary.ca/KAW/KAW98/tennison/index.html
CGI server: www.w3.org/CGI
Co4: ksi.cpsc.ucalgary.ca/KAW/KAW96/euzenat/euzenat96b.html
Concept Maps: sern.ucalgary.ca/~kremer/tutorials/conceptmaps/high
Conceptual Graphs: meganesia.int.gu.edu.au/~phmartin/WebKB/doc/CGs.html
Controlled Languages: cslu.cse.ogi.edu/HLTsurvey/ch7node8.html
FastDB: www.ispras.ru/~knizhnik/fastdb.html
Harvest: www.ncsa.uiuc.edu/SDG/IT94/proceedings/searching/schwartz.harvest/schwartz.harvest.html
Knowledge Interchange Format: logic.stanford.edu/kif/kif.html
Ontobroker: ontobroker.aifb.uni-karlsruhe.de
Ontolingua ontology server: www-KSL-SVC.stanford.edu:5915
Ontology Markup Language: www.mailbase.ac.uk/list/rdf-dev/1998-08/0000.html
Ontosaurus: www.isi.edu/isd/ontosaurus.html
Resource Description Framework: www.w3.org/RDF
Shore: www.cs.wisc.edu/shore
Tadzebao: ksi.cpsc.ucalgary.ca:80/KAW/KAW98/domingue
Visual languages: www.cs.orst.edu/~burnett/vpl.html
WebKB: meganesia.int.gu.edu.au/~phmartin/WebKB
WebLog: www.cs.concordia.ca/~special/bibdb/weblog.html
WordNet: www.cogsci.princeton.edu/~wn
World Wide Web Consortium: www.w3.org
XML: www.w3.org/XML

WebKB

The first three requirements for precise, flexible, and scalable information retrieval imply several easy-to-use notations (some intuitive, some precise and expressive) and the possibility of inserting them anywhere in a Web document. WebKB interprets chunks of knowledge statements in Web documents to satisfy these requirements. Two special HTML markers (<KR> and </KR>) or the strings $( and )$ delimit each chunk, or group of statements. These chunks are visible unless the document's author hides them with HTML comment tags. The author must specify the knowledge representation language used in each chunk at its beginning: <KR language="CG">.

Currently, WebKB can interpret the linear notation of CGs and the more readable notations we have invented: a formalized English and a frame-like CG linear notation. These simpler notations are automatically translated into CGs. We chose the CG formalism, first, because it has a graphical and a linear notation that are both concise and easily comprehensible. Second, we can reuse two CG inference engines, CoGITo [5] and Peirce [6], that exploit subsumption defined between formal terms for calculating specialization relations between graphs and, therefore, between a query and facts in a knowledge base. Hence, we can make statements and queries at different levels of granularity.

Another requirement is that each user should not have to explicitly declare and organize all the terms in the knowledge statements. Indeed, declaring and organizing terms is tedious and often complex work that deters most users and is probably one of the main reasons why so few hypertext systems have been knowledge-based (MacWeb [7] is an exception). This requirement is a rationale for semiformal knowledge-representation languages such as concept maps as opposed to logic-based formalisms such as KIF. The use of semiformal statements is at the expense of knowledge precision and accessibility but allows rapid expression and incremental refinement of knowledge. When forewarned by a special command (no decl), WebKB accepts CGs that include some undeclared terms. Another informal feature WebKB
MAY/JUNE 2000 19
accepts are notations for sets within CGs: WebKB ignores them during searches but displays each retrieved CG in the form in which it was entered.

HTML and XML do not let users reference or index any part of a document that they have not created. WebKB provides an indexation notation that lets a document element be referred to by its content and its occurrence in a document.

Simply representing knowledge within documents is insufficient; knowledge- and string-based commands are also necessary. It is handy to be able to use them within documents and, if desired, have the results automatically inserted in place of the commands. The hypertext literature refers to this technique as dynamic linking and calls the generated document a dynamic or virtual document [7]. This idea has many applications, for example, adapting a document's content to a user. A procedural, or declarative, language should combine the commands and their results. Web robots perform some document generation in this way, but current metadata languages only allow knowledge representation. WebKB lets users generate virtual documents and combine lexical, structural, and knowledge-based data management by proposing commands for searching and joining CGs, Unix-like file management commands working on Web-accessible documents, and a simple Unix shell-like script language to combine commands. Users can insert these commands in documents or in form-based interfaces.

Figure 1 shows the WebKB menu and the knowledge-based information retrieval and handling tool, the main general interface to WebKB.

Figure 1. The WebKB menu and a knowledge-based information retrieval and handling tool. The example query shows how a document containing indexed images is loaded by the browser into the WebKB processor and then how the command spec, which looks for specializations of a conceptual graph, can be used to retrieve CGs and the images they index. According to the value selected in the "kinds of results" option (top right), the images, but not the knowledge statements, will be presented.

Although this document-based approach is useful, its scalability is limited. For example, before using knowledge query commands, the WebKB user must assert knowledge or use commands (such as load URL) to load documents that contain knowledge. To let users benefit from the knowledge of users they do not know, we are currently extending WebKB to handle a cooperatively built knowledge repository.

Knowledge representation languages versus XML-based metadata languages

To represent knowledge within documents, we advocate using knowledge representation languages over XML-based metadata languages. To compare the alternatives, Figure 2 shows how to represent a simple sentence with CGs in WebKB and with KIF and the RDF. The sentence is, "John believes that Mary has a cousin who is her age."

The CG representation in Figure 2 seems simpler than the others. The semantic network structure of CGs has three advantages:
• it restricts knowledge formation without compromising expressivity, which tends to reduce the computational cost of knowledge comparison;
• it encourages users to fully describe relations between concepts (as opposed to languages where they can use "slots" of frames or objects); and
• it permits a better visualization of relations between concepts.

Even if CGs seem relatively intuitive, they are not readable by everyone. In restricted cases, we might prefer simpler notations. For instance, Figure 3 shows notations that WebKB accepts as equivalent to the following CG:

    TC for KADS_conceptual_model(x) are //note: TC means "Typical Conditions"
      [KADS_conceptual_model:*x]-
        { (Part)->[Model_of_problem_solving_expertise];
          (Part)->[Model_of_communication_expertise];
          (Part)->[Model_of_cooperation_expertise];
          (Input)<-[Knowledge_design]->(Output)->[Knowledge_base_system];
        }

Undeclared terms in knowledge statements

Users might not want to take the time to declare and order most of the terms they use when representing knowledge. For example, this might be the case when a user indexes sentences from various documents for private knowledge organization purposes.
<KR language="CG">
load "http://www.bar.com/topLevelOntology"; //Import this ontology
Age < Property; //Declare Age as a subtype of Property
Cousin(Person,Person) {Relation type Cousin};
[Person:"John"]<-(Believer)<-[Descr: [Person:"Mary"]- { (Chrc)->[Age: *a];
                                                        (Cousin)->[Person]->(Chrc)->[*a];
                                                      } ];
</KR>
<KR language="KIF">
load "http://www.bar.com/topLevelOntology"; //Import this ontology
(Define-Ontology Example (Slot-Constraint-Sugar topLevelOntology))
(Define-Class Age (?X) :Def (Property ?X))
(Define-Relation Cousin(?s ?p) :Def (And (Person ?s) (Person ?p)))
(Exists ((?j Person))
  (And (Name ?j John) (Believer ?j '(Exists ((?m Person) (?p Person) (?a Age))
                                       (And (Name ?m Mary) (Chrc ?m ?a)
                                            (Cousin ?m ?p) (Chrc ?p ?a)
                                       ))
  )))
</KR>
<!-- RDF notation; assumed location: http://www.bar.com/example -->
<RDF xmlns="http://www.w3.org/TR/WD-rdf-schema#"
     xmlns:t="http://www.bar.com/topLevelOntology#">
  <Class ID="Age"><subClassOf resource="t:Property"/></Class>
  <PropertyType ID="Cousin" comment="Relation type Cousin">
    <range resource="t:Person"/>
    <domain resource="t:Person"/></PropertyType> </RDF>
<RDF xmlns="http://www.w3.org/TR/WD-rdf-schema#" xmlns:x="http://www.bar.com/example#"
     xmlns:t="http://www.bar.com/topLevelOntology#">
  <t:Person bagID="Statement_01"><t:Name>Mary</t:Name>
    <t:Chrc><x:Age ID="age"></x:Age></t:Chrc>
    <x:Cousin><t:Person><t:Chrc resource="x:age"/></t:Person></x:Cousin>
  </t:Person>
  <Description aboutEach="#Statement_01" t:Believer="John"/> </RDF>
Figure 2. Comparing knowledge representation with the Knowledge Interchange Format, Conceptual Graphs, and the Resource Description Framework.
/* Structured text (":" ends the name of a "typical" relation,
   "=>" of a "necessary" relation, "<=" of a sufficient relation) */
KADS1_conceptual_model.
Part: Model_of_problem_solving_expertise,
Model_of_communication_expertise,
Model_of_cooperation_expertise.
Input of: Knowledge_design (Output: Knowledge_base_system).
/* Text structured with HTML tags (and same conventions for relations) */
<dl><dt>KADS1_conceptual_model
<dd>Part: <ul><li>Model_of_problem_solving_expertise
<li>Model_of_communication_expertise
<li>Model_of_cooperation_expertise
</ul>
<dd>Input of: Knowledge_design (Output: Knowledge_base_system)
</dl>
/* Formalized English */
A KADS1_conceptual_model has for part a model_of_problem_solving_expertise,
a model_of_communication_expertise and a model_of_cooperation_expertise.
A knowledge design has for input a KADS1_conceptual_model and has for output a
knowledge_base_system.
Figure 3. Complementary notations for simple knowledge statements.
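
The correspondence between the frame-like notation of Figure 3 and the CG linear notation can be sketched in a few lines of Python. This is only an illustration of the mapping, not WebKB's translator: the `Slot` class and the `render_branch` and `to_cg` helpers are hypothetical names, the "typical conditions" header and the `*x` variable are left out, and only one nested slot per target is handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """One frame slot: a relation from the subject to a target concept type."""
    relation: str                     # relation type, e.g. "Part"
    target: str                       # concept type at the other end
    inverse: bool = False             # True for "X of:" slots (arrow points back to the subject)
    nested: Optional["Slot"] = None   # optional forward slot attached to the target


def render_branch(slot: Slot) -> str:
    """Render one slot as a branch of a CG in linear notation."""
    arrow = "<-" if slot.inverse else "->"
    text = f"({slot.relation}){arrow}[{slot.target}]"
    if slot.nested is not None:
        text += f"->({slot.nested.relation})->[{slot.nested.target}]"
    return text


def to_cg(subject: str, slots: list) -> str:
    """Render a subject concept and its slots as a multi-branch CG."""
    branches = [render_branch(slot) + ";" for slot in slots]
    return f"[{subject}]-\n{{ " + "\n  ".join(branches) + "\n}"


# The structured-text statement of Figure 3, entered by hand as slots.
kads_cg = to_cg("KADS1_conceptual_model", [
    Slot("Part", "Model_of_problem_solving_expertise"),
    Slot("Part", "Model_of_communication_expertise"),
    Slot("Part", "Model_of_cooperation_expertise"),
    Slot("Input", "Knowledge_design", inverse=True,
         nested=Slot("Output", "Knowledge_base_system")),
])
print(kads_cg)
```

The "Input of:" slot is the interesting case: the relation arrow is reversed, so the branch reads `(Input)<-[Knowledge_design]`, and the parenthesized `Output` slot becomes a further relation leaving `Knowledge_design`.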
$(Indexation
(Context: Language: CG;
Ontology: http://www.bar.com/topLevelOntology.html;
Repr_author: phmartin; Creation_date: Mon Sep 14 02:32:21 1998;
Indexed_doc: http://www.bar.com/example.html; )
(DE: {2nd occurrence} the red damaged vehicle )
(Repr: [Color: red]<-(Color)<-[Vehicle]->(Attr)->[Damaged] )
)$
$(DEconnection
(Context: Language: CG;
Ontology: http://www.bar.com/topLevelOntology.html;
Repr_author: phmartin; Creation_date: Mon Sep 14 02:53:36 1998;)
(DE: {Document: http://www.bar.com/example.html} )
(Relation: Summary)
(DE: {Document: http://www.bar.com/example.html} {section title: Abstract})
)$
Figure 4. A language for knowledge indexing or connecting any Web-accessible document element.
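
Referring to a document element by its content and occurrence number, as in the `{2nd occurrence}` qualifier of Figure 4, amounts to a repeated substring search. The Python sketch below shows one plausible way to resolve such a reference to a character offset; the function name `find_occurrence` is hypothetical, and WebKB's actual matching rules (for example, how it treats HTML tags or whitespace) are not described in this article.

```python
def find_occurrence(text: str, content: str, occurrence: int = 1) -> int:
    """Return the offset of the n-th occurrence (1-based) of content in text, or -1."""
    if occurrence < 1 or not content:
        raise ValueError("occurrence must be >= 1 and content must be non-empty")
    position = -1
    for _ in range(occurrence):
        # Resume the search just after the start of the previous match.
        position = text.find(content, position + 1)
        if position == -1:
            return -1
    return position


page = "the red damaged vehicle hit the red damaged vehicle"
second = find_occurrence(page, "the red damaged vehicle", 2)
print(second)
```

Because the reference carries the content itself, it needs no markup in the target document, which is what lets a user index a document they did not write.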
To permit this and still let the system perform some minimal semantic checks and knowledge organization, we propose that the casual user represent knowledge with basic declared relation types and leave undeclared the terms used as concept types. This method has four rationales:

First, if knowledge statements are made from concepts linked by basic relations (that is, if the complexity is manifest within concept types rather than in relation types), only a limited set of relation types is necessary for an application. WebKB already proposes a top-level ontology of 200 basic relation types collecting common thematic, mathematical, spatial, temporal, rhetorical, and argumentative relation types [8, 9].

Second, WebKB can use relation signatures to give suitable types to the undeclared terms used as concept types. For instance, in the top-level ontology WebKB proposes, the relation types Input, Output, Agent, Method, SubProcess, and Purpose are all defined to have a concept of type Process as the first argument. Hence, in the previous KADS example, WebKB can infer that Knowledge_design must be a subtype of Process.
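
A minimal Python sketch of this inference follows. It assumes a signature table that records only the type required for each relation's first argument; the table layout and the `infer_supertypes` function are illustrative assumptions, not WebKB's data structures.

```python
# Assumed signature table: relation type -> type required for its first argument.
# The six relation types are the ones named in the text.
FIRST_ARG_TYPE = {
    name: "Process"
    for name in ("Input", "Output", "Agent", "Method", "SubProcess", "Purpose")
}


def infer_supertypes(edges, declared):
    """Suggest supertypes for undeclared concept types.

    edges: iterable of (first_argument, relation, second_argument) triples.
    declared: set of concept types that are already declared.
    """
    inferred = {}
    for first_arg, relation, _second_arg in edges:
        required = FIRST_ARG_TYPE.get(relation)
        if required is not None and first_arg not in declared:
            inferred.setdefault(first_arg, set()).add(required)
    return inferred


# (Input)<-[Knowledge_design]->(Output)->[Knowledge_base_system]
# Knowledge_design is the first argument of both Input and Output.
edges = [
    ("Knowledge_design", "Input", "KADS_conceptual_model"),
    ("Knowledge_design", "Output", "Knowledge_base_system"),
]
suggestions = infer_supertypes(edges, declared=set())
print(suggestions)
```

A fuller version would also use the signature of the second argument and reject statements whose declared types conflict with a signature.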
Figure 5. Images, knowledge indexations, and a customized query interface contained within one document. The sample query shows how the command spec, which looks for specializations of a conceptual graph, can be used to retrieve images indexed by CGs. (Figure 7 shows the results.)

Third, we merged the natural language ontology WordNet (120,000 words linked to 90,000 concept types) into our top-level ontology [8, 9]. Once the WebKB shared repository is implemented and initialized with these ontologies, WebKB will be able to semiautomatically relate the undeclared terms used as concept types to precise concept types in the repository, thanks to links between words and concept types and to constraints imposed by the relation signatures. For example, consider the following CG where users have not declared the terms cat and table: [Cat]->(On)->[Table]. In WordNet, cat has five meanings (feline, gossiper, x-ray, beat, and vomit), and table has five meanings (array, furniture, tableland, food, and postpone). In the WebKB ontology, the relation type On connects a concept of type Spatial_entity to another concept of the same type. Thus, WebKB can infer that beat and vomit are not the intended meanings for cat, and array and postpone are not the intended meanings for table. To further identify the intended meanings, WebKB could put the following questions to the user: "Does cat refer to feline, gossiper, x-ray, or something else?" and "Does table refer to furniture, tableland, food, or something else?"
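
The filtering step in the cat and table example can be sketched in Python as follows. The sense table is hand-written for illustration: the text only says which senses are compatible with Spatial_entity, so the label `Not_spatial` is a placeholder rather than a real WebKB or WordNet category, and a real implementation would test subsumption in the type hierarchy instead of comparing labels.

```python
# Illustrative sense table: word -> {sense: top-level category}.
# "Not_spatial" is a placeholder for any category outside Spatial_entity.
SENSES = {
    "cat": {
        "feline": "Spatial_entity",
        "gossiper": "Spatial_entity",
        "x-ray": "Spatial_entity",
        "beat": "Not_spatial",
        "vomit": "Not_spatial",
    },
    "table": {
        "array": "Not_spatial",
        "furniture": "Spatial_entity",
        "tableland": "Spatial_entity",
        "food": "Spatial_entity",
        "postpone": "Not_spatial",
    },
}


def candidate_senses(term: str, required_type: str) -> list:
    """Keep the senses of term whose category satisfies the relation signature."""
    return [sense for sense, category in SENSES[term].items()
            if category == required_type]


def clarification_question(term: str, required_type: str) -> str:
    """Build the question put to the user for the remaining senses."""
    options = ", ".join(candidate_senses(term, required_type))
    return f"Does {term} refer to {options}, or something else?"


# The relation type On requires Spatial_entity for both arguments.
print(clarification_question("cat", "Spatial_entity"))
print(clarification_question("table", "Spatial_entity"))
```

The signature prunes two of the five senses for each word, and the user's answer settles the rest.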
Finally, knowledge statements are more readily comparable if they follow the same conventions. Thus, the convention of using basic relations is important. (The opposite convention, using primitive concepts and complex relations, would be much harder to follow.) For example, consider the sentence, "Mary is 20 years old." Following our convention it is better to use the concept type Age ([Person:"Mary"]->(Chrc)->[Age:@20]), unless a user has predefined this relation type:

    relation Age (x,y) is [Age]-
      { (Chrc)->[Livingentity:*x];
        (Measure)->[Integer:*y];
      }

(This solution implies that the inference engine expands the relation type definition when comparing graphs. Few CG engines can perform type expansion.)

By default, WebKB enforces declared terms in the CG linear notation but permits undeclared terms in simpler notations (see Figure 3). The commands decl and no decl override this default mode, and an exclamation mark before a type explicitly tells the system that the type was deliberately left undeclared. We can also use quoted sentences; WebKB understands them as individual concepts of the type Description.

Another facility of the WebKB parser is that, like HTML browsers, it ignores HTML tags (except definition list tags) in knowledge statements. However, when these statements are displayed in response to a query, they are displayed using the exact form given by the user, including HTML tags. Thus, the user can combine HTML or XML features with knowledge statements. For example, the user can put some types in italics or make them the source of hypertext links.

Indexing any document element using knowledge

A document element (DE) is any textual or HTML data, such as a sentence, section, or reference to an image or entire document. This definition excludes binary data but includes textual knowledge statements. WebKB lets users index any DE of a Web-accessible document (or later, our repository) with knowledge statements or connect DEs by relations. Figure 4 shows an example of each case.

XML provides more ways to isolate and reference DEs than HTML. Because WebKB exploits the capacities of Web browsers, WebKB users can use XML mechanisms. However, XML does not help users annotate others' documents, because DEs cannot be referenced if the documents' authors have not explicitly delimited them. Therefore, the WebKB facility of referring to a DE by specifying its content and its occurrence number will still be useful.

A simple example. The indexation notations in Figure 4 let the statements and the indexed DEs be in different documents. Thus, any user can index any element of a document on the Web. Figure 1 presents a general interface for knowledge-based queries and shows how a document containing knowledge must be loaded in the WebKB processor before being queried.

WebKB also lets the author of a document index an image with a knowledge statement directly stored in the alt field of the HTML img tag used to specify the image. We use this special indexation case to present a simple illustration of WebKB's features. The example in Figure 5 is a good synthesis but does not represent the general use of WebKB, because it mixes the indexed source data (in this case, a collection of images), their indexation, and a customized interface to query them in a single document. Figure 6 shows a part of this document that illustrates the indexation. Figure 7 displays the results of the query in Figure 5.

Figure 6. The HTML source code of the image indexation in Figure 5.

Lexical and structural query commands. Because WebKB proposes knowledge representation, query commands, and a script language, we have not felt the need to give it a lexical and structural query language as precise as those in Harvest, WebSQL, and WebLog. Instead, we have implemented some Unix-like text-processing commands to exploit Web-accessible documents or databases and generate other documents, for example, cat, grep, fgrep, diff, head, tail, awk, cd, pwd, wc, and echo. We added the hyperlink-path-exploring command accessibleDocFrom. This command lists the documents directly and indirectly accessible from given documents within a maximal number of hyperlinks. For example, the following command lists the HTML documents that are accessible from www.foo.bar/foo.html (maximum two levels) and that include the string knowledge in their HTML source code:

    accessibleDocFrom -maxlevel 2 -HTMLonly http://www.foo.bar/foo.html | grep -i knowledge
Knowledge-query commands. WebKB has commands for displaying the specializations or generalizations of a concept type, a relation type, or an entire CG in a knowledge base. At present, queries for CG specializations only retrieve connected CGs; the processor cannot retrieve paths between concepts specified in a query. If a retrieved CG indexes a document element, we can present it instead of the CG. (Figure 7 gives an example.) In both cases, WebKB generates hypertext links to reach the source of each answer in its original document. WebKB actually presents a slightly modified copy of this original document to instruct the Web browser to display and highlight the selected answer in its source document. What follows is an example of such an interaction, assuming that www.bar.com/example.html is the file indexed in Figure 4 and Something is the most general concept type:

    > load http://www.bar.com/example.html
    > spec [Something]->(Color)->[Color: red]
    [Color: red]<-(Color)<-[Vehicle]->(Attr)->[Damaged]
    Source
    > use Repr //*answer with DEs
    > spec [Something]->(Color)->[Color: red]
    the red damaged vehicle
    Source

Figure 7. The document generated in response to Figure 5's query.

Queries for specializations give users some freedom in the way they express queries; they can search at a general level and subsequently refine searches according to the results. However, they must know the exact names of types. To improve this situation, WebKB lets users give only a substring of a type in a query CG if they prefix this substring with the character %. WebKB generates the actual requests by replacing the substring with the manually and automatically declared types that include that substring. WebKB discards replacements that violate the constraints imposed by relation signatures or individual types. Then, it displays and executes each remaining request. For example, spec [%thing] will trigger the generation and execution of spec [something].

Users can combine knowledge-query commands with the script language to generate complex documents, perform consistency tests on the knowledge base, or solve problems procedurally. The WebKB site provides many examples of queries and scripts; one script solves the Sisyphus-I room allocation problem (meganesia.int.gu.edu.au/~phmartin/WebKB/kb/sisyphus1.html). Readers are invited to test these examples at meganesia.int.gu.edu.au/~phmartin/WebKB or www.int.gu.edu.au/phmartin/WebKB.

Knowledge-generation commands. The only knowledge-generation commands in WebKB are commands that join CGs. We can define various kinds of joins, but WebKB only proposes joins that, given a set of CGs, create a new CG specializing each of the source CGs. Although we insert the result in the CG base, it might not represent anything true for the user; it is rather a device for accelerating knowledge representation. For instance, in WebKB, we can collect and automatically merge CGs related to a type with a command such as spec [TypeX] | maxjoin. The result can then serve as a basis for the user to create a type definition for TypeX.

The following is a concrete example for the maximal join command:

    > maxjoin [Cat]->(On)->[Mat] [Cat:Tom]->(Near)->[Table]
    [Cat:Tom]- { (On)->[Mat];
                 (Near)->[Table]
               }

A scalable cooperatively built knowledge repository

Ontology servers support shared knowledge repositories, for example, the Ontolingua ontology server and Ontosaurus. However, they are not usable for managing large quantities of knowledge, and apart from AI-Trader [10], they do not allow the indexation and
retrieval of parts of documents. Support of cooperation between the users is essentially limited to consistency enforcement, annotations, and structured dialogues, as in APECKS, Co4, and Tadzebao. We are extending WebKB to handle a knowledge repository. (For more details, see meganesia.int.gu.edu.au/~phmartin/WebKB/doc/coopKBbuilding.html.) We address scalability by

• implementing a knowledge-based system that reuses FastDB;
• using visualization techniques (mainly the handling of aliases for terms and the generation of views) that avoid lexical conflicts and let users focus on certain kinds of knowledge;
• using protocols that let users solve semantic conflicts by inserting new terms and relations in the common ontology and, in some cases, in the knowledge of other users; and
• using conventions for representing knowledge that improve the automatic comparison of knowledge from different users and hence their consistency and retrieval.

Current information retrieval techniques are not knowledge-enabled and hence cannot give precise answers to precise questions. To overcome this problem, a current trend on the Web is to let users annotate documents using metadata languages. WebKB lets its users combine lexical, structural, and knowledge-based techniques to exploit or generate Web documents. The scalable knowledge repository we are building will permit the fusion and reuse of knowledge from various sources. In an operational context, these knowledge-based features need to be combined with more traditional information retrieval ideas that give both coarse-grained search capabilities and the fine-grained, precision-based knowledge retrieval we describe here.

References

1. J.F. Sowa, Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, Reading, Mass., 1984.
2. P. Martin and P. Eklund, "WWW Indexation and Document Navigation Using Conceptual Structures," Proc. 2nd IEEE Int'l Conf. Intelligent Processing Systems (ICIPS '98), IEEE Press, Piscataway, N.J., 1998, pp. 217–221.
3. P. Martin and P. Eklund, "Embedding Knowledge in Web Documents," Proc. 8th Int'l World Wide Web Conf., Elsevier, Amsterdam, The Netherlands, 1999, pp. 324–341.
4. S. Decker et al., "Ontobroker: Ontology Based Access to Distributed and Semi-Structured Information," Semantic Issues in Multimedia Systems: Proc. DS-8, R. Meersman et al., eds., Kluwer Academic Publisher, Boston, 1999, pp. 351–369.
5. O. Haemmerlé, CoGITo: Une Plate-Forme de Developpement de Logiciels sur les Graphes Conceptuels (CoGITo: A Conceptual Graph Workbench), PhD thesis, Montpellier II Univ., France, Jan. 1995.
6. G. Ellis, Managing Complex Objects, PhD thesis, Dept. of Computer Science, Queensland University, Australia, 1995.
7. J. Nanard et al., "Integrating Knowledge-based Hypertext and Database for Task-Oriented Access to Documents," Proc. DEXA'93, Lecture Notes in Computer Science, Vol. 720, Springer-Verlag, New York, 1993, pp. 721–732.
8. P. Martin, "Using the WordNet Concept Catalog and a Relation Hierarchy for Knowledge Acquisition," Proc. 4th Peirce Workshop, 1995. www.inria.fr/acacia/Publications/1995/peirce95phm.ps.Z (current May 2000).
9. P. Martin, Exploitation de Graphes Conceptuels et de Documents Structure (Exploitation of Conceptual Graphs and Structured Documents for Knowledge Acquisition and Information Retrieval), PhD thesis, University of Nice–Sophia Antipolis, France, 1996.
10. A. Puder and K. Romer, "Generic Trading Service in Telecommunication Platforms," Proc. 5th Int'l Conf. Conceptual Structures, Lecture Notes in Artificial Intelligence, Vol. 1257, Springer-Verlag, New York, 1997, pp. 551–565.

Philippe Martin is a research fellow at Griffith University's Gold Coast Campus, Australia. His main interests are knowledge representation, sharing, and retrieval. He has an engineering degree, a DEA, and a PhD in software engineering from the University of Nice–Sophia Antipolis, France. Contact him at Griffith Univ., School of Information Technology, PMB 50 Gold Coast MC, QLD 9726, Australia; philippe.martin@gu.edu.au.

Peter W. Eklund is the Foundation Chair of Information Technology at Griffith University's Gold Coast Campus. He is also a key researcher at the Distributed Systems Technology Center in Brisbane. His main interests are knowledge ordering, visualization, and management. He graduated in mathematics from Wollongong University, Australia, and has an MPhil from Brighton University, UK, and a PhD in computer science from Linköping University, Sweden. Contact him at Griffith Univ., School of Information Technology, PMB 50 Gold Coast MC, QLD 9726, Australia; p.eklund@gu.edu.au.