You are on page 1of 85

Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.

XML Design
(A Gentle Transition from XML to RDF)

Roger L. Costello
David B. Jacobs
The MITRE Corporation
(The creation of this tutorial was sponsored by DARPA)
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
2

Acknowledgments
• We are very grateful to the Defense Agency Research
Projects Agency (DARPA) for funding the creation of this
tutorial. We are especially grateful to Murray Burke
(DARPA) and John Flynn (BBN) for making it all happen.
• Special thanks to Frank Manola for answering our many
questions.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
3

What is the Purpose of RDF?


• The purpose of RDF (Resource Description
Framework) is to give a standard way of specifying
data "about" something.
• Here's an example of an XML document that
specifies data about China's Yangtze river:
<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

"Here is data about the Yangtze River. It has a length of 6300 kilometers.
Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation
is the East China Sea."
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
4

XML --> RDF


Modify the following XML document so that it is also a valid RDF document:

<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
XML <length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze.xml

"convert to"

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
5

The RDF Format


1 RDF provides an ID attribute for identifying the resource being described.

2 The ID attribute is in the RDF namespace.

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

3 Add the "fragment identifier symbol" to


the namespace.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
6

The RDF Format (cont.)

2 Identifies the resource being described. This


resource is an instance of River.
1 Identifies the type
<?xml version="1.0"?>
(class) of the <River rdf:ID="Yangtze"
resource being xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
described. xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
3 These are properties, <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
or attributes, of the <endingLocation>East China Sea</endingLocation>
type (class). </River>

4 Values of the properties


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
7

Namespace Convention
Question: Why was "#" placed onto the end of the namespace? E.g.,
xmlns="http://www.geodesy.org/river#"

Best Practice
Best Practice

Answer: RDF is very concerned about uniquely identifying things -


uniquely identifying the type (class) and uniquely identifying the properties.
If we concatenate the namespace with the type then we get a unique
identifier for the type, e.g.,
http://www.geodesy.org/river#River
If we concatenate the namespace with a property then we get a unique
identifier for the property, e.g.,
http://www.geodesy.org/river#length

http://www.geodesy.org/river#startingLocation

http://www.geodesy.org/river#endingLocation

Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
8

The RDF Format

<?xml version="1.0"?>
<Class rdf:ID="Resource"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="uri">
<property>value</property>
<property>value</property>
...
</Class>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
9

Advantage of using the RDF Format


Interoperability

• You may ask: "Why should I bother designing my XML to be in the


RDF format?"
• Answer: there are numerous benefits:
– The RDF format, if widely used, will help to make XML more
interoperable:
• Tools can instantly characterize the structure, "this element is a type (class), and
here are its properties”.
• RDF promotes the use of standardized vocabularies ... standardized types
(classes) and standardized properties.
Network effect

– The RDF format gives you a structured approach to designing your XML
documents. The RDF format is a regular, recurring pattern.
– It enables you to quickly identify weaknesses and inconsistencies of non-
RDF-compliant XML designs. It helps you to better understand your data!
– You reap the benefits of both worlds:
• You can use standard XML editors and validators to create, edit, and validate
your XML.
• You can use the RDF tools to apply inferencing to the data.
– It positions your data for the Semantic Web!
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
10

Disadvantage of using the


RDF Format
• Constrained: the RDF format constrains you on
how you design your XML (i.e., you can't design
your XML in any arbitrary fashion).
• RDF uses namespaces to uniquely identify types
(classes), properties, and resources. Thus, you
must have a solid understanding of namespaces.
• Another XML vocabulary to learn: to use the RDF
format you must learn the RDF vocabulary.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
11

Uniquely Identify the Resource

• Earlier we said that RDF is very concerned about uniquely


identifying the type (class) and the properties. RDF is also
very concerned about uniquely identifying the resource,
e.g.,
This is the resource being described. We want to uniquely
identify this resource.

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
12

rdf:ID
• The value of rdf:ID is a "relative URI".
• The "complete URI" is obtained by concatenating the URL of the
XML document with "#" and then the value of rdf:ID, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

Yangtze.rdf

Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers.


Thus, the complete URI for this resource is:

http://www.china.org/geography/rivers#Yangtze
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
13

xml:base
• On the previous slide we showed how the URL of the document provided the base URI.
• Depending on the location of the document is brittle: it will break if the document is
moved, or is copied to another location.
• A more robust solution is to specify the base URI in the document, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

Resource URI = concatenation(xml:base, '#', rdf:ID)


= concatenation(http://www.china.org/geography/rivers, '#', "Yangtze")
= http://www.china.org/geography/rivers#Yangtze
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
14

rdf:about
• Instead of identifying a resource with a relative URI (which
then requires a base URI to be prepended), we can give the
complete identity of a resource. However, we use rdf:about,
rather than rdf:ID, e.g.,

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
15

Triple -> resource/property/value


http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#length of 6300 kilometers

resource property value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#startingLocation of western China's ...

resource property value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#endingLocation of East China Sea

resource property value


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
16

The RDF Format = triples!


• The fundamental design pattern of RDF is to structure your XML data
as resource/property/value triples!
<?xml version="1.0"?>
<Resource-A> Notice that the RDF design pattern is an
<property-A> alternating sequence of resource-property.
<Resource-B>
<property-B> This pattern is known as "striping".
<Resource-C>
<property-C>
value of value of Value-C
property-A property-B </property-C>
</Resource-C>
</property-B>
</Resource-B>
</property-A>
</Resource-A>

The value of a property can be a literal (e.g., length has a value of 6300 kilometers).
Also, the value of a property can be a resource, as shown above (e.g., property-A
has a value of Resource-B, property-B has a value of Resource-C). We will see examples
of properties having a resource value in a little bit.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
17

Naming Convention
• The convention is to use a capital letter to
start a type (class) name, and use a
lowercase letter to start a property name.
– This helps the eye quickly discern the striping
pattern.
<?xml version="1.0"?>
uppercase <River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
lowercase <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
18

RDF Model (graph)

Legend:
Ellipse indicates "Resource"
Rectangle indicates "literal string value"
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
19

rdf:Description + rdf:type
• There is still another way of representing the XML. This way makes it
very clear that you are describing something, and it makes it very clear
what the type (class) is of the thing you are describing:

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</rdf:Description>

This is read as: "This is a Description about the resource http://www.china.org/geography/rivers#Yangtze.


This resource is an instance of the River type (class). The http://www.china.org/geography/rivers#Yangtze
resource has a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau,
and an endingLocation of the East China Sea."

Note: this form of describing a resource is called the "long form". The form we have seen previously is an
abbreviation of this long form. An RDF Parser interprets the abbreviated form as if it were this long form.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
20

Alternative
• Alternatively we can use rdf:ID rather than rdf:about, as shown
here:

<?xml version="1.0"?>
<rdf:Description rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</rdf:Description>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
21

Equivalent Representations!
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
Note: In the RDF literature the examples
<rdf:type rdf:resource="http://www.geodesy.org/river#River"/> are typically shown in this form.
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</rdf:Description>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
22

RDF Namespace

http://www.w3.org/1999/02/22-rdf-syntax-ns#

ID

about

type
resource

Description
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
23

Terminology
• As you read the RDF literature you may see the following
terminology:
– Subject: this term refers to the item that is playing the role of the
resource.
– predicate: this term refers to the item that is playing the role of the
property.
– Object: this term refers to the item that is playing the role of the
value.

predicate
Subject Object

Equivalent!
property
Resource Value
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
24

RDF Parser

• There is a nice RDF parser at the W3 Web site:

http://www.w3.org/RDF/Validator/

This RDF parser will tell you if your XML


is in the proper RDF format.

Do Lab1
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
25

Example #2
Modify the following XML document so that it is RDF-compliant:

<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<Dam id="ThreeGorges"
xmlns="http://www.geodesy.org/dam">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</River>

Yangtze2.xml
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
26

Note the two types (classes)

River Dam

Instance: Yangtze Instance: ThreeGorges


Properties: Properties:
length name
startingLocation width
endingLocation height
cost
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
27

Dam - out of place


<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<Dam id="ThreeGorges"
xmlns="http://www.geodesy.org/dam">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</River>
Dam

Types (classes) contain properties . Here we see the River type


containing the properties - length, startingLocation, and endingLocation.
It also shows River containing a type - Dam. Thus, there is a Resource
that contains another Resource. This is inconsistent with RDF design
pattern. (We are seeing one of the benefits of using the RDF format - to
identify inconsistencies in an XML design.)
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
28

Property value must be a


Literal or a Resource
<length>6300 kilometers</length>

property
Value is a Literal

property <obstacle>
<Dam id="ThreeGorges"
xmlns="http://www.geodesy.org/dam">
<name>The Three Gorges Dam</name>
Value is a Resource <width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</obstacle>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
29

Modified XML (to make it consistent)


<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<obstacle>
<Dam id="ThreeGorges"
xmlns="http://www.geodesy.org/dam">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</obstacle>
</River>

Yangtze2,v2.xml

"The Yangtze River has an obstacle that is the ThreeGorges Dam. The Dam
has a name - The Three Gorges Dam. It has a width of 1.5 miles, a height of 610 feet,
and a cost of $30 billion."
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
30

RDF Format
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<obstacle>
<Dam rdf:ID="ThreeGorges"
xmlns="http://www.geodesy.org/dam#">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</obstacle>
</River>

Changed id to rdf:ID
Added the '#' symbol

As always, the other representations using rdf:about and rdf:Description are available.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
31

RDF Model (graph)


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
32

Alternatively, suppose that someone has already created a document containing


information about the Three Gorges Dam:
<?xml version="1.0"?>
<Dam rdf:ID="ThreeGorges"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/dam#"
xml:base="http://www.china.org/geography/rivers">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
Three-Gorges-Dam.rdf

Then we can simply reference the Three Gorges Dam resource using rdf:resource,
as shown here:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/>
</River>
Yangtze.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
33

Note: reference is to a resource,


not to a file
Why was this the reference:
<obstacle rdf:resource="http://www.china.org/geography/rivers #ThreeGorges"/>

and not this:


<obstacle rdf:resource="http://www.china.org/geography/rivers /Three-Gorges-Dam.rdf"/>

That is, why wasn't the reference to a "file"?

Answer:
1. What if the file moved? Then the reference would break.
2. By using an identifier of the Three Gorges Dam, and keeping a particular file unspecified,
then an "aggregator tool" will be able to collect information from all the files that talk
about the Three Gorges Dam resource (see next slide).

Do Lab2
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
34

Anyone, Anywhere, Anytime


Can Talk About a Resource
• In all of our examples we have provided a unique identifier
to resources, e.g.,

http://www.china.org/geography/rivers#Yangtze

• Consequently, if another RDF document identifies the


same resource then the data that it specifies gives
additional data about that resource.
• An aggregator tool will be able to collect all data about a
resource and present a consolidated set of data for the
resource. That's powerful!
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
35

rdf:ID versus rdf:about

• When should rdf:ID be used? When should


rdf:about be used?
– When you want to introduce a resource, and provide
an initial set of information about a resource use rdf:ID
– When you want to extend the information about a
resource use rdf:about
• The RDF philosophy is akin to the Web philosophy. That is,
anyone, anywhere, anytime can provide information about a
resource.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
36

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

http://www.china.org/geography/rivers/yangtze.rdf <?xml version="1.0"?>


<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<name>Dri Chu - Female Yak River</name>
<name>Tongtian He, Travelling-Through-the-Heavens River</name>
<name>Jinsha Jiang, River of Golden Sand</name>
</River>

http://www.encyclopedia.org/yangtze-alternate-names.rdf

Aggregator tool collects


data about the Yangtze

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
A distributed network of data!
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<name>Dri Chu - Female Yak River</name>
<name>Tongtian He, Travelling-Through-the-Heavens River</name>
<name>Jinsha Jiang, River of Golden Sand</name>
</River>

Aggregated Data!
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
<?xml version="1.0"?> 37
<Dam rdf:ID="ThreeGorges"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/dam#"
xml:base="http://www.china.org/geography/rivers">
Another Example of Aggregation
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>

http://www.encyclopedia.org/three-gorges-dam.rdf
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<obstacle rdf:resource="http://www.china.org/geography/rivers #ThreeGorges"/>
</River>
Aggregate!
http://www.china.org/geography/rivers/yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xml:base="http://www.china.org/geography/rivers">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<obstacle>
<Dam rdf:ID="ThreeGorges"
xmlns="http://www.geodesy.org/dam#"> Note that the reference to the
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
ThreeGorges Dam resource has been
<height>610 feet</height> replaced by whatever information the
<cost>$30 billion</cost>
</Dam> aggregator could find on this resource!
</obstacle>
</River>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
38

Example #3
Notice that in this XML document there is no unique identifier:
<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
<name>Yangtze</name>
XML <length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze3.xml

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#"> The RDF is
<name>Yangtze</name> identical to the
RDF <length>6300 kilometers</length> XML!
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze3.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
39

Interpreting the RDF

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
<name>Yangtze</name>
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze3.rdf

This is read as: "This is an instance of the River type (class). The River has a
name of Yangtze, a length of 6300 kilometers, a startingLocation of western
China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea."

In this document the resource is anonymous - it has no identifier.


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
40

Disadvantage of anonymous
resources
<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
<name>Yangtze</name>
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

http://www.china.org/geography/rivers/yangtze.rdf
<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
<name>Yangtze</name>
<name>Dri Chu - Female Yak River</name>
<name>Tongtian He, Travelling-Through-the-Heavens River</name>
<name>Jinsha Jiang, River of Golden Sand</name>
</River>

http://www.encyclopedia.org/yangtze-alternate-names.rdf

Aggregate

An aggregator tool will not be able to determine if these


documents are talking about the same resource.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
41

Example #4
<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river"
xmlns:uom="http://www.measurements.org/units-of-measure">
XML <length uom:units="kilometers">6300</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze4.xml

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF xmlns="http://www.geodesy.org/river#"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length>
<rdf:Description>
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</rdf:Description>
</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze4.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
42

<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length uom:units="kilometers">6300</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze4.xml

RDF does not allow attributes on the properties (except for special RDF
attributes such as rdf:resource). So we need to make the uom:units attribute
a child element.
Your first instinct might be to modify length to have two child elements:

<?xml version="1.0"?>
<River id="Yangtze" However, now
xmlns="http://www.geodesy.org/river" the length
xmlns:uom="http://www.measurements.org/units-of-measure#"> property has
<length> as its value two
<value>6300</value> values.
<uom:units>kilometers</uom:units> RDF only
</length> binary relations
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> i.e., a single
<endingLocation>East China Sea</endingLocation> value for a
</River> property.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
43

rdf:value

6300
length
kilometers

length has two values - 6300 and kilometers.


RDF provides a special property, rdf:value, to
be used for specifying the "primary" value.
In this example, 6300 is the primary value, and
kilometers is a value which provides additional
information about the primary value.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
44

RDF Format
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length>
<rdf:Description>
An anonymous <rdf:value>6300</rdf:value>
resource <uom:units>kilometers</uom:units>
</rdf:Description>
</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze4.rdf

Read this as: "The Yangtze River has a length whose value is a resource
which has a value of 6300 and whose units is kilometers.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
45

Advantage of anonymous
resources
<rdf:Description>
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</rdf:Description>

This is an anonymous resource. Its purpose


is solely to provide a context for the two
properties. Other RDF documents will have
no need to amplify this resource. So, in
this case, there is no reason for giving the
resource an identifier. In this case it makes
good sense to use an anonymous resource.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
46

RDF Model (graph)

Legend:
An anonymous resource (also called a "blank node"). That is, a
resource with no identifier. (Note: RDF Parsers will typically
generate a unique identifier for anonymous resources, to
distinguish one anonymous resource from another.)
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
47

rdf:parseType="Resource"
If the value of a property is comprised of several values then one option
is to create an anonymous resource, as we saw. RDF provides a shorthand,
so that you don't need to create an rdf:Description element, by using
rdf:parseType="Resource", as shown here:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length rdf:parseType="Resource">
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Yangtze4,v2.rdf

The meaning of this is identical to that shown on the previous slide.


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
48

Equivalent!

<length>
<rdf:Description>
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</rdf:Description>
</length>

<length rdf:parseType="Resource">
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</length>

Do Lab3
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
49

Summary
Modify the following XML document so that it is also a valid RDF document:

<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length uom:units="kilometers">6300</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
<Dam id="ThreeGorges"
xmlns="http://www.geodesy.org/dam">
<name>The Three Gorges Dam</name>
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</River>
Yangtze.xml

See next slide -->


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
50

RDF Format!
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xmlns:uom="http://www.measurements.org/units-of-measure#"
xml:base="http://www.china.org/geography/rivers">
<length rdf:parseType="Resource">
<rdf:value>6300</rdf:value>
<uom:units>kilometers</uom:units>
</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
With relatively
<endingLocation>East China Sea</endingLocation>
few changes the
<obstacle>
XML document
<Dam rdf:ID="ThreeGorges"
is now usable by
xmlns="http://www.geodesy.org/dam#">
both XML tools
<name>The Three Gorges Dam</name>
and RDF tools!
<width>1.5 miles</width>
<height>610 feet</height>
<cost>$30 billion</cost>
</Dam>
</obstacle>
</River>
Yangtze.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
51

Example #5
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length uom:units="kilometers">6300</length>
<maxWidth uom:units="meters">175</maxWidth>
<maxDepth uom:units="meters">55</maxDepth>
</River>
Yangtze5.xml

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xmlns:uom="http://www.measurements.org/units-of-measure#">
<length rdf:parseType="Resource"> This is one way of doing it.
<rdf:value>6300</rdf:value>
Now we will see a better
<uom:units>kilometers</uom:units>
</length> way - using "typed literals".
<maxWidth rdf:parseType="Resource"> (See next slide)
<rdf:value>175</rdf:value>
<uom:units>meters</uom:units></maxWidth>
<maxDepth rdf:parseType="Resource">
<rdf:value>55</rdf:value>
<uom:units>kilometers</uom:units>
</maxDepth>
</River>
Yangtze5.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
52

Alternate RDF Format


<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length rdf:datatype="http://www.uom.org/distance#kilometer">6300</length>
<maxWidth rdf:datatype="http://www.uom.org/distance#meter">175</maxWidth>
<maxDepth rdf:datatype="http://www.uom.org/distance#meter">55</maxDepth>
</River>

Yangtze5.rdf

With rdf:datatype you can give a property's value a datatype label. The
rdf:datatype value acts as a semantic label for the datatype of the value.
This is called a typed literal.

For this example there must be a namespace, http://www.uom.org/distance#,


which defines two datatypes - kilometer and meter.

On the next slide is shown how to do this using XML Schemas.


Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
53

Defining the kilometer and meter


datatypes using XML Schemas

<?xml version="1.0" encoding="UTF-8"?>


<schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.uom.org/distance#">

<simpleType name="kilometer">
<restriction base="integer">
</restriction>
</simpleType>

<simpleType name="meter">
<restriction base="integer">
</restriction>
</simpleType>

</schema>
uom.xsd
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
54

Another example using


rdf:datatype
<?xml version="1.0"?>
<Person rdf:ID="JohnSmith"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.person.org#">
<age rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">30</age>
</Person>

In this example we are specifying that the value (30) of


age is a nonNegativeInteger (which is defined in the XML
Schema namespace).
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
55

Example #6
Modify the following XML document so that it is also a valid RDF document:

<?xml version="1.0"?>
<WeatherReading xmlns="http://www.meteorology.org#">
<Weather>
<temperature>-2 degrees Celsius</temperature>
<barometer>30.4 (rising)</barometer>
</Weather>
<Location xmlns="http://www.geodesy.org#">
<country>USA</country>
<state>Massachusetts</state>
<city>Boston</city>
</Location>
<datetime>January 22, 2003, 11:15 EST</datetime>
</WeatherReading>
WMUR_TV_WeatherReading.xml
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
56

Things to note
1. The XML document is using three types (classes):

WeatherReading Weather Location

2. The 3 type instances are anonymous. Consequently,


we cannot benefit from others, and others cannot
benefit from us. Where does it make sense to give
an identifier to an instance?
- In general, the root element should have an identifier.
- There is lots of information about Boston. Let's give
the Location instance an identifier.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
57

Modification 1: Identifiers added

<?xml version="1.0"?>
<WeatherReading id="BOS-012203-1115"
xmlns="http://www.meteorology.org#">
<Weather>
<temperature>-2 degrees Celsius</temperature>
<barometer>30.4 (rising)</barometer>
</Weather>
<Location id="BOS"
xmlns="http://www.geodesy.org#">
<country>USA</country>
<state>Massachusetts</state>
<city>Boston</city>
</Location>
<datetime>January 22, 2003, 11:15 EST</datetime>
</WeatherReading>
WMUR_TV_WeatherReading,v2.xml
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
58

Modification 2: Create a property for


the Weather, Location types
Types (classes) contain properties. We need to wrap the Weather and Location
types within a property:
<?xml version="1.0"?>
<WeatherReading id="BOS-012203-1115"
xmlns="http://www.meteorology.org#">
<instrumentReading>
<Weather>
<temperature>-2 degrees Celsius</temperature>
<barometer>30.4 (rising)</barometer>
</Weather>
</instrumentReading>
<instrumentLocation>
<Location id="BOS"
xmlns="http://www.geodesy.org#">
<country>USA</country>
<state>Massachusetts</state>
<city>Boston</city>
</Location>
</instrumentLocation>
<datetime>January 22, 2003, 11:15 EST</datetime>
</WeatherReading>
WMUR_TV_WeatherReading,v3.xml
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
59

RDF Format!
<?xml version="1.0"?>
<WeatherReading rdf:ID="BOS-012203-1115"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.meteorology.org#"
xml:base="http://www.wmur-tv.com/weather">
<instrumentReading>
<Weather>
<temperature>-2 degrees Celsius</temperature>
<barometer>30.4 (rising)</barometer>
</Weather>
</instrumentReading>
<instrumentLocation>
<Location rdf:ID="BOS"
xmlns="http://www.geodesy.org#"
xml:base="http://www.airports.org/icao">
<country>USA</country>
<state>Massachusetts</state>
<city>Boston</city>
</Location>
</instrumentLocation>
<datetime>January 22, 2003, 11:15 EST</datetime>
</WeatherReading>

WMUR_TV_WeatherReading.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
60

The rdf:Bag type (class)

• The rdf:Bag type is used to represent an unordered


collection.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
61

Example #7
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<Meeting id="XML-Design-Patterns"
xmlns="http://www.business.org">
<attendees>
<name>John Smith</name>
<name>Sally Jones</name>
</attendees>
</Meeting>
DesignMeeting.xml

<?xml version="1.0"?>
<Meeting rdf:ID="XML-Design-Pattern"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.business.org#">
<attendees>
<rdf:Bag>
rdf:Bag makes it clear
<name>John Smith</name>
that this is an unordered
<name>Sally Jones</name>
collection of names.
</rdf:Bag>
</attendees>
</Meeting>
DesignMeeting.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
62

The rdf:Alt type (class)

• The rdf:Alt type is used to represent a set of


alternate properties.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
63

Example #8
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<Retailer id="BarnesAndNoble"
xmlns="http://www.retailers.org">
<webLocation>
<url>http://www.bn.com</url>
<url>http://www.barnesandnoble.com</url>
</webLocation>
</Retailer>
BarnesAndNoble.xml

<?xml version="1.0"?>
<Retailer rdf:ID="BarnesAndNoble"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.retailers.org#">
<webLocation>
<rdf:Alt> rdf:Alt makes it clear
<url>http://www.bn.com</url> that the urls listed are
<url>http://www.barnesandnoble.com</url> alternates, i.e., choose
</rdf:Alt> one of them.
</webLocation>
</Retailer>
BarnesAndNoble.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
64

The rdf:Seq type (class)

• The rdf:Seq type is used to represent a


sequence of properties.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
65

Example #9
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<ToDoList id="MondayMeetings"
xmlns="http://www.reminders.org">
<activities>
<activity1>Meet with CEO at 10am</activity1>
<activity2>Luncheon at The Eatery</activity2>
<activity3>Flight at 3pm</activity3>
</activities>
</ToDoList>
MyDaysActivities.xml

<?xml version="1.0"?>
<ToDoList rdf:ID="MondayMeetings"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.reminders.org#">
<activities>
<rdf:Seq>
<activity1>Meet with CEO at 10am</activity1> rdf:Seq makes it clear
<activity2>Luncheon at The Eatery</activity2> that the activities listed
<activity3>Flight at 3pm</activity3> are to be done in the
</rdf:Seq> sequence listed.
</activities>
</ToDoList>
MyDaysActivities.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
66

rdf:li Property
• The property, rdf:li ("list item"), is provided by
RDF for use with either rdf:Bag, rdf:Alt, or
rdf:Seq.
• The rdf:li property is provided for you to specify
an item in a Bag/Alt/Seq.
• An RDF Parser will replace each rdf:li with
rdf:_1, rdf:_2, rdf:_3, etc.
• The following slide recasts the previous examples
using the rdf:li property.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
67
<?xml version="1.0"?>
<Retailer rdf:ID="BarnesAndNoble"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.retailers.org#">
<webLocation>
<rdf:Alt>
<rdf:li>http://www.bn.com</rdf:li>
<rdf:li>http://www.barnesandnoble.com</rdf:li>
</rdf:Alt> <?xml version="1.0"?>
</webLocation> <Meeting rdf:ID="XML-Design-Pattern"
</Retailer> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.business.org#">
BarnesAndNoble.rdf <attendees>
<rdf:Bag>
<rdf:li>John Smith</rdf:li>
<rdf:li>Sally Jones</rdf:li>
</rdf:Bag>
</attendees>
</Meeting>

DesignMeeting.rdf
<?xml version="1.0"?>
<ToDoList rdf:ID="MondayMeetings"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.reminders.org#">
<activities>
<rdf:Seq>
<rdf:li>Meet with CEO at 10am</rdf:li>
<rdf:li>Luncheon at The Eatery</rdf:li>
<rdf:li>Flight at 3pm</rdf:li>
</rdf:Seq>
</activities>
</ToDoList>

MyDaysActivities.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
68

Example #10
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<Catalogue xmlns="http://www.publishing.org#"
xmlns:dc="http://pur1.org/metadata/dublin-core#">
<Book>
<dc:Title>Lateral Thinking</dc:Title>
<dc:Creator>Edward de Bono</dc:Creator>
<dc:Date>1973</dc:Date>
<ISBN>0-06-099325-2</ISBN>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
<Book>
<dc:Title>Illusions: The Adventures of a Reluctant Messiah</dc:Title>
<dc:Creator>Richard Bach</dc:Creator>
<dc:Date>1977</dc:Date>
<ISBN>0-440-34319-4</ISBN>
<dc:Publisher>Dell Publishing Co.</dc:Publisher>
</Book>
<Book> Note: regrettably,
<dc:Title>The First and Last Freedom</dc:Title>
<dc:Creator>J. Krishnamurti</dc:Creator> the Dublin Core does
<dc:Date>1954</dc:Date> not conform to the
<ISBN>0-06-064831-7</ISBN> RDF naming
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book> conventions. Hence,
</Catalogue> you see properties
Barnes_and_Noble _BookCatalogue.xml with the first letter
capitalized.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
69

Things to note

1. The XML document uses two types (classes):

Catalogue Book

2. All type instances are anonymous. Consequently,


we cannot benefit from others, and others cannot
benefit from us. Where does it make sense to give
the instance an identifier?
- In general, the root element should have an identifier.
- There is lots of information about each book instance.
Let's give each book instance an identifier (the ISBN).
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
70

Modification 1: Identifiers added


<?xml version="1.0"?>
<Catalogue id="BookCatalogue"
xmlns="http://www.publishing.org#"
xmlns:dc="http://pur1.org/metadata/dublin-core#">
<Book id="_0-06-099325-2"> Why was an underscore
<dc:Title>Lateral Thinking</dc:Title> placed in front of the
<dc:Creator>Edward de Bono</dc:Creator> ISBN?
<dc:Date>1973</dc:Date>
Answer: The ID datatype
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book> does not allow an
<Book id="_0-440-34319-4"> identifier to begin with
<dc:Title>Illusions: The Adventures of a Reluctant Messiah</dc:Title> a digit. So, we
<dc:Creator>Richard Bach</dc:Creator> (arbitrarily) decided to
<dc:Date>1977</dc:Date> use an underscore.
<dc:Publisher>Dell Publishing Co.</dc:Publisher>
</Book>
<Book id="_0-06-064831-7">
<dc:Title>The First and Last Freedom</dc:Title>
<dc:Creator>J. Krishnamurti</dc:Creator>
<dc:Date>1954</dc:Date>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
</Catalogue>

Barnes_and_Noble _BookCatalogue,v2.xml

Notice that the ISBN elements were deleted and their values used as identifiers.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
71

Modification 2: Create a property for the Book type


Types (classes) contain properties. We need to wrap the Book type within a property:
<?xml version="1.0"?>
<Catalogue id="BookCatalogue"
xmlns="http://www.publishing.org#"
xmlns:dc="http://pur1.org/metadata/dublin-core#">
<item>
<Book id="_0-06-099325-2">
<dc:Title>Lateral Thinking</dc:Title>
<dc:Creator>Edward de Bono</dc:Creator>
<dc:Date>1973</dc:Date>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
</item>
<item>
<Book id="_0-440-34319-4">
<dc:Title>Illusions: The Adventures of a Reluctant Messiah</dc:Title>
<dc:Creator>Richard Bach</dc:Creator>
<dc:Date>1977</dc:Date>
<dc:Publisher>Dell Publishing Co.</dc:Publisher>
</Book>
</item>
<item>
<Book id="_0-06-064831-7">
<dc:Title>The First and Last Freedom</dc:Title>
<dc:Creator>J. Krishnamurti</dc:Creator>
<dc:Date>1954</dc:Date>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
</item>
</Catalogue>

Barnes_and_Noble _BookCatalogue,v3.xml
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
72

RDF Format!
<?xml version="1.0"?>
<Catalogue rdf:ID="BookCatalogue"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.publishing.org#"
xmlns:dc="http://pur1.org/metadata/dublin-core#"
xml:base="http://www.bn.com">
<item>
<Book rdf:ID="_0-06-099325-2"
xml:base="http://www.publishing.org/book">
<dc:Title>Lateral Thinking</dc:Title>
<dc:Creator>Edward de Bono</dc:Creator>
<dc:Date>1973</dc:Date>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
</item>
<item>
<Book rdf:ID="_0-440-34319-4"
xml:base="http://www.publishing.org/book">
<dc:Title>Illusions: The Adventures of a Reluctant Messiah</dc:Title>
<dc:Creator>Richard Bach</dc:Creator>
<dc:Date>1977</dc:Date>
<dc:Publisher>Dell Publishing Co.</dc:Publisher>
</Book>
</item>
...
</Catalogue>

Barnes_and_Noble_BookCatalogue.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
73

Dublin Core (dc:)


• The Dublin Core is a standard set of properties:

Content Intellectual Instance


Property

Title Creator Date


Subject Publisher Type
Description Contributor Format
Language Rights Identifier
Relation
Coverage
Source

Note: many people use these properties in their HTML today. For example:
<META NAME="DC.Creator" CONTENT="John Smith">
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
74

rdf:parseType="Collection"
• This may be added as an attribute of a property to
indicate that the contents of the property is a list of
resources.
• The following slide recasts the BookCatalogue
example to use this list type.
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
75

<?xml version="1.0"?>
<Catalogue rdf:ID="BookCatalogue"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.publishing.org#"
xmlns:dc="http://pur1.org/metadata/dublin-core#"
xml:base="http://www.bn.com">
<books rdf:parseType="Collection">
<Book rdf:ID="_0-06-099325-2"
xml:base="http://www.publishing.org/book">
<dc:Title>Lateral Thinking</dc:Title>
<dc:Creator>Edward de Bono</dc:Creator>
<dc:Date>1973</dc:Date>
<dc:Publisher>Harper &amp; Row</dc:Publisher>
</Book>
<Book rdf:ID="_0-440-34319-4"
xml:base="http://www.publishing.org/book">
<dc:Title>Illusions: The Adventures of a Reluctant Messiah</dc:Title>
<dc:Creator>Richard Bach</dc:Creator>
<dc:Date>1977</dc:Date>
<dc:Publisher>Dell Publishing Co.</dc:Publisher>
</Book>
...
</books>
</Catalogue>
Barnes_and_Noble_BookCatalogue.rdf Do Lab4
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
76

Example #11
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<Article id="QuickBrownFox"
xmlns="http://www.publishing.org">
<paragraph>
The quick brown <bold>fox</bold> jumped over the lazy dog.
The paragraph contains
An important person once said "<quote>Now is the time for all
"mixed content".
good men to come to the aid of their country</quote>"
</paragraph>
</Article>
QuickBrownFox.xml

<?xml version="1.0"?>
<Article rdf:ID="QuickBrownFox"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.publishing.org#">
<paragraph rdf:parseType="Literal"> rdf:parseType="Literal"
The quick brown <bold>fox</bold> jumped over the lazy dog. indicates that the content
An important person once said "<quote>Now is the time for all of paragraph is to be
good men to come to the aid of their country</quote>" treated simply as literal
</paragraph> XML, i.e., tools shouldn't try
</Article> to parse the content into
resource/property/value triples.
QuickBrownFox.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
77

rdf:parseType="Literal"

• In all of the previous examples the data was


structured as resource/property/value triples.
• Sometimes it doesn't make sense to do such
structuring
– Example: with mixed content
• In those cases we can simply indicate "hey, the
content of this property is okay. Treat it as a
literal XML string."
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
78

Dangers of
rdf:parseType="Literal"
• The advantage of structuring your XML as
resource/property/value triples is enhanced
interoperability.
• When you use rdf:parseType="Literal" you lose
the ability for a tool to instantly take advantage of
the resource/property/value structure (since you
are, by definition, saying that the data doesn't have
this structure).
• Lesson Learned: use rdf:parseType="Literal"
sparingly!
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
79

Example 12
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<River id="Yangtze"
Note that attributes are
xmlns="http://www.geodesy.org/river"
being used, not child
XML length="6300 kilometers"
elements!
startingLocation="western China's Qinghai-Tibet Plateau"
endingLocation="East China Sea"/>
Yangtze.xml

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
The RDF format allows
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
you to use attributes as
RDF xmlns="http://www.geodesy.org/river#"
well!
xmlns:r="http://www.geodesy.org/river#"
r:length="6300 kilometers"
r:startingLocation="western China's Qinghai-Tibet Plateau"
r:endingLocation="East China Sea"/>
Yangtze.rdf
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
80

Attributes

• Recall that at the very beginning of this tutorial we


said that a resource has properties (attributes).
Thus, a property can be represented either as a
child element, or as an attribute. (Of course, a
property can only be represented as an attribute if
it has a literal value, not a structured value.)
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
81

Equivalent!
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#"
xmlns:r="http://www.geodesy.org/river#"
r:length="6300 kilometers"
r:startingLocation="western China's Qinghai-Tibet Plateau"
r:endingLocation="East China Sea"/>

Do Lab5
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
82

Example #13
Modify the following XML document so that it is also a valid RDF document:
<?xml version="1.0"?>
<Resume xmlns="http://www.resume.org#">
<Person xmlns="http://www.person.org#">
<name>John Smith</name>
<phone>555-1212</phone>
<email>abc@def.com</email>
</Person>
<experience>
<Job>
<startDate>1995-01-01</startDate>
<endDate>1999-01-01</endDate>
<title>researcher</title>
<employer>Some Corp</employer>
<employerHomepage>http://www.company.org</employerHomepage>
<description>Cool stuff</description>
</Job>
</experience>
<skills>
<Skill name="Oracle" yrs="13"/>
<Skill name="Java" yrs="3"/>
</skills>
<education>
<Degree type="BS" school="RPI" year="1987"/>
</education>
</Resume>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
83

<?xml version="1.0"?>
<Resume rdf:ID="JSmith-2003"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" <skills rdf:parseType="Collection">
xmlns="http://www.resume.org#" <Skill>
xml:base="http://www.jsmith.com/resume"> <name>Oracle</name>
<about> <yearsExperience>13</yearsExperience>
<Person xmlns="http://www.person.org#"> </Skill>
<name>John Smith</name> <Skill>
<phone>555-1212</phone> <name>Java</name>
<email>abc@def.com</email> <yearsExperience>3 </yearsExperience>
</Person> </Skill>
</about> </skills>
<experience rdf:parseType="Collection"> <education rdf:parseType="Collection">
<Job> <Degree xmlns="http://www.schools.com#">
<startDate>1995-01-01</startDate> <type>BS</type>
<endDate>1999-01-01</endDate> <school>RPI</school>
<title>researcher</title> <graduationYear>1987</graduationYear>
<employer> </Degree>
<Company xmlns="http://www.company.org#"> </education>
<name>Some Corp</name> </Resume>
<homepage>http://www.company.org</homepage>
</Company>
</employer>
<description>Cool stuff</description>
</Job>
</experience>
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
84

Conclusion
• In this tutorial we demonstrated that, with
oftentimes slight modifications, "traditional XML"
documents can be made RDF-compliant.
• We recommend using the RDF format for all your
XML documents.
• We feel that the advantages accrued by formatting
your XML far outweigh any possible
disadvantages (having to structure your XML in a
specific format and having to learn a new
vocabulary)
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
85

Related Articles

"Make Your XML RDF-Friendly" by Bob DuCharme, John Cowan

http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html

You might also like