
An Introduction to XML and Web Technologies

XML Documents

Anders Møller & Michael I. Schwartzbach

© 2006 Addison-Wesley

Objectives

• What is XML, in particular in relation to HTML
• The XML data model and its textual representation
• The XML Namespace mechanism

What is XML?

• XML: Extensible Markup Language
• A framework for defining markup languages
• Each language is targeted at its own application domain with its own markup tags
• There is a common set of generic tools for processing XML documents
• XHTML: an XML variant of HTML
• Inherently internationalized and platform independent (Unicode)
• Developed by W3C, standardized in 1998

Recipes in XML

• Define our own "Recipe Markup Language"
• Choose markup tags that correspond to concepts in this application domain
  • recipe, ingredient, amount, ...
• No canonical choices
  • granularity of markup?
  • structuring?
  • elements or attributes?
  • ...
Example (1/2)

<collection>
  <description>Recipes suggested by Jane Dow</description>

  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <date>Wed, 14 Jun 95</date>

    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
    <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
    <ingredient name="nutmeg" amount="1" unit="dash"/>

    <preparation>
      <step>
        Combine all and use as cobbler, pie, or crisp.
      </step>
    </preparation>

Example (2/2)

    <comment>
      Rhubarb Cobbler made with bananas as the main sweetener.
      It was delicious.
    </comment>

    <nutrition calories="170" fat="28%"
               carbohydrates="58%" protein="14%"/>
    <related ref="42">Garden Quiche is also yummy</related>
  </recipe>
</collection>
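One of the "generic tools" mentioned earlier is an ordinary XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and an abridged copy of the recipe collection; the helper name `list_ingredients` is hypothetical and not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the recipe collection from the example slides.
RECIPES = """
<collection>
  <description>Recipes suggested by Jane Dow</description>
  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <date>Wed, 14 Jun 95</date>
    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
    <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
    <ingredient name="nutmeg" amount="1" unit="dash"/>
    <nutrition calories="170" fat="28%" carbohydrates="58%" protein="14%"/>
  </recipe>
</collection>
"""


def list_ingredients(xml_text):
    """Return a list of (recipe title, [(name, amount, unit or None), ...])."""
    root = ET.fromstring(xml_text)
    result = []
    for recipe in root.findall("recipe"):
        title = recipe.findtext("title")
        ingredients = [
            # Attribute values are always strings; convert the amount explicitly.
            (i.get("name"), float(i.get("amount")), i.get("unit"))
            for i in recipe.findall("ingredient")
        ]
        result.append((title, ingredients))
    return result


for title, ingredients in list_ingredients(RECIPES):
    print(title)
    for name, amount, unit in ingredients:
        print(" ", amount, unit or "", name)
```

The optional `unit` attribute (missing for the banana) comes back as `None`, which illustrates one consequence of the "elements or attributes?" design question: optional attributes have to be handled by the consumer.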


Building on the XML Notation

• Defining the syntax of our recipe language
  • DTD, XML Schema, ...
• Showing recipe documents in browsers
  • XPath, XSLT
• Recipe collections as databases
  • XQuery
• Building a Web-based recipe editor
  • HTTP, Servlets, JSP, ...
• ...

These are the topics of the following weeks.

XML Trees

• Conceptually, an XML document is a tree structure
  • node, edge
  • root, leaf
  • child, parent
  • sibling (ordered), ancestor, descendant
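The tree vocabulary (child, parent, sibling, ancestor, descendant) can be made concrete with a small sketch. It assumes Python's standard `xml.etree.ElementTree`, which models element nodes only and stores no parent links, so the parent relation is derived here; the helper names are hypothetical. "Leaf" in this sketch means an element without child elements, a simplification of the XML data model in which text nodes are the leaves.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<collection><recipe>"
    "<title>Rhubarb Cobbler</title><date>Wed, 14 Jun 95</date>"
    "</recipe></collection>"
)

root = ET.fromstring(DOC)

# ElementTree keeps only child links, so build the parent relation explicitly.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Tag names of all ancestors of node, nearest ancestor first."""
    names = []
    while node in parent_of:
        node = parent_of[node]
        names.append(node.tag)
    return names


def descendants(node):
    """Tag names of all descendants of node, in document order."""
    return [d.tag for d in node.iter() if d is not node]


def siblings(node):
    """Tag names of the other children of node's parent, in document order."""
    parent = parent_of.get(node)
    return [] if parent is None else [s.tag for s in parent if s is not node]


def is_leaf(node):
    """True if the element has no child elements."""
    return len(node) == 0


title = root.find("recipe/title")
print(ancestors(title), siblings(title), descendants(root), is_leaf(title))
```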


An Analogy: File Systems

[Figure not reproduced]

Tree View of the XML Recipes

[Figure not reproduced]


Nodes in XML Trees

• Text nodes: carry the actual contents; always leaf nodes
• Element nodes: define hierarchical logical groupings of contents; each has a name
• Attribute nodes: unordered; each is associated with an element node and has a name and a value
• Comment nodes: ignorable meta-information
• Processing instructions: instructions to specific processors; each has a target and a value
• Root nodes: every XML tree has one root node that represents the entire tree

Textual Representation

• Text nodes: written as the text they carry
• Element nodes: start-end tags
  • <bla ...> ... </bla>
  • short-hand notation for empty elements: <bla/>
• Attribute nodes: name="value" in start tags
• Comment nodes: <!-- bla -->
• Processing instructions: <?target value?>
• Root nodes: implicit
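The correspondence between node kinds and their textual form can be inspected with a DOM parser. This is a minimal sketch, assuming Python's standard `xml.dom.minidom`; the sample document and the `describe` helper are made up for illustration. The document object returned by the parser plays the role of the implicit root node.

```python
from xml.dom import minidom, Node

# Hypothetical sample containing one node of each kind discussed above.
DOC = '<bla a="b"><?mytool do something?>some text<!-- a comment --><empty/></bla>'

KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}


def describe(xml_text):
    """List (kind, detail) for the top element, its attributes and its children."""
    dom = minidom.parseString(xml_text)  # dom is the implicit root (document) node
    top = dom.documentElement
    out = [("element", top.tagName)]
    out += [("attribute", f'{name}="{value}"') for name, value in top.attributes.items()]
    for child in top.childNodes:
        kind = KIND.get(child.nodeType, "other")
        if child.nodeType == Node.ELEMENT_NODE:
            detail = child.tagName
        elif child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            detail = f"{child.target} {child.data}"
        else:
            detail = child.data  # text and comment nodes carry their content in .data
        out.append((kind, detail))
    return out


for kind, detail in describe(DOC):
    print(f"{kind}: {detail!r}")
```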

Browsing XML (without XSLT)

[Figure not reproduced]

More Constructs

• XML declaration
• Character references
• CDATA sections
• Document type declarations and entity references (explained later...)
• Whitespace?
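Character references and CDATA sections are purely notational: after parsing, both end up as ordinary character data. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a made-up one-line document:

```python
import xml.etree.ElementTree as ET

# &#169; is a character reference; the CDATA section protects the angle brackets.
DOC = '<features a="b">Copyright &#169; 2005 <![CDATA[<this is not a tag>]]></features>'

root = ET.fromstring(DOC)

# The parser delivers plain text: the reference is resolved, the CDATA wrapper is gone.
text = root.text
print(text)

# When the tree is written out again, the serializer escapes the markup characters.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

The tree does not record which notation was used in the input, so a serializer is free to choose a different but equivalent one.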


Example

<?xml version="1.1" encoding="ISO-8859-1"?>
<!DOCTYPE features SYSTEM "example.dtd">
<features a="b">
  <?mytool here is some information specific to mytool?>
  El señor está bien, garçon!
  Copyright &#169; 2005
  <![CDATA[ <this is not a tag> ]]>
  <!-- always remember to specify the
       right character encoding -->
</features>

Well-formedness

• Every XML document must be well-formed
  • start and end tags must match and nest properly
    • <x><y></y></x> (well-formed)
    • </z><x><y></x></y> (not well-formed)
  • exactly one root element
  • ...
• In other words, it defines a proper tree structure
• XML parser: given the textual XML document, constructs its tree representation
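Because a parser must build a tree, it has to reject input that is not well-formed. The following is a minimal sketch of a well-formedness check, assuming Python's standard `xml.etree.ElementTree` (which raises `ParseError` on malformed input); the function name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser can build a tree from xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<x><y></y></x>"))       # tags match and nest properly
print(is_well_formed("</z><x><y></x></y>"))   # stray end tag, improper nesting
print(is_well_formed("<x/><y/>"))             # more than one root element
```

Note that this checks well-formedness only; it says nothing about validity against a DTD or schema.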
Simpler Alternatives?

S-expressions, 1958:

(collection
  (recipe
    (title "Rhubarb Cobbler")
    (date "Wed, 14 Jun 95")
    ...
  )
)

• XML is defined as a simplified subset of SGML
• XML could have been designed simpler...
• ... but it wasn't [end of discussion]

Applications

Rough classification:

• Data-oriented languages
• Document-oriented languages
• Protocols and programming languages
• Hybrids
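To see how close the S-expression notation from the "Simpler Alternatives?" slide is to the XML tree, here is a rough sketch that renders an element tree as an S-expression. It assumes Python's standard `xml.etree.ElementTree`; the function `to_sexpr` is hypothetical, and it deliberately ignores attributes, mixed content after child elements, and escaping of quotes.

```python
import xml.etree.ElementTree as ET


def to_sexpr(elem):
    """Render an element as (name "text" child...), ignoring attributes."""
    parts = [elem.tag]
    text = (elem.text or "").strip()
    if text:
        parts.append('"' + text + '"')
    parts.extend(to_sexpr(child) for child in elem)
    return "(" + " ".join(parts) + ")"


DOC = (
    "<collection><recipe>"
    "<title>Rhubarb Cobbler</title><date>Wed, 14 Jun 95</date>"
    "</recipe></collection>"
)

print(to_sexpr(ET.fromstring(DOC)))
# (collection (recipe (title "Rhubarb Cobbler") (date "Wed, 14 Jun 95")))
```

The features this sketch drops (attributes, comments, processing instructions, mixed content) are part of what the plain S-expression notation does not distinguish on its own.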

Example: XHTML

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello world!</title></head>
  <body>
    <h1>This is a heading</h1>
    This is some text.
  </body>
</html>

Example: CML

<molecule id="METHANOL">
  <atomArray>
    <stringArray builtin="id">a1 a2 a3 a4 a5 a6</stringArray>
    <stringArray builtin="elementType">C O H H H H</stringArray>
    <floatArray builtin="x3" units="pm">
      -0.748 0.558 ...
    </floatArray>
    <floatArray builtin="y3" units="pm">
      -0.015 0.420 ...
    </floatArray>
    <floatArray builtin="z3" units="pm">
      0.024 -0.278 ...
    </floatArray>
  </atomArray>
</molecule>

Example: ebXML

<MultiPartyCollaboration name="DropShip">
  <BusinessPartnerRole name="Customer">
    <Performs initiatingRole='//binaryCollaboration[@name="Firm Order"]/
                              InitiatingRole[@name="buyer"]' />
  </BusinessPartnerRole>
  <BusinessPartnerRole name="Retailer">
    <Performs respondingRole='//binaryCollaboration[@name="Firm Order"]/
                              RespondingRole[@name="seller"]' />
    <Performs initiatingRole='//binaryCollaboration[...]/
                              InitiatingRole[@name="buyer"]' />
  </BusinessPartnerRole>
  <BusinessPartnerRole name="DropShip Vendor">
    ...
  </BusinessPartnerRole>
</MultiPartyCollaboration>

Example: ThML

<h3 class="s05" id="One.2.p0.2">Having a Humble Opinion of Self</h3>
<p class="First" id="One.2.p0.3">EVERY man naturally desires knowledge
  <note place="foot" id="One.2.p0.4">
    <p class="Footnote" id="One.2.p0.5"><added id="One.2.p0.6">
      <name id="One.2.p0.7">Aristotle</name>, Metaphysics, i. 1.
    </added></p>
  </note>;
  but what good is knowledge without fear of God? Indeed a humble
  rustic who serves God is better than a proud intellectual who
  neglects his soul to study the course of the stars.
  <added id="One.2.p0.8"><note place="foot" id="One.2.p0.9">
    <p class="Footnote" id="One.2.p0.10">
      Augustine, Confessions V. 4.
    </p>
  </note></added>
</p>


XML Namespaces

<widget type="gadget">
  <head size="medium"/>
  <big><subwidget ref="gizmo"/></big>
  <info>
    <head>
      <title>Description of gadget</title>
    </head>
    <body>
      <h1>Gadget</h1>
      A gadget contains a big gizmo
    </body>
  </info>
</widget>

• When combining languages, element names may become ambiguous!
• Common problems call for common solutions

The Idea

• Assign a URI to every (sub-)language
  e.g. http://www.w3.org/1999/xhtml for XHTML 1.0
• Qualify element names with URIs:
  {http://www.w3.org/1999/xhtml}head
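The `{URI}name` form for qualified names is more than a notation on a slide: Python's standard `xml.etree.ElementTree` happens to expose namespace-qualified element names in exactly this shape. A minimal sketch, using the XHTML example document from earlier; the helper `split_qualified` is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello world!</title></head>
  <body><h1>This is a heading</h1>This is some text.</body>
</html>"""


def split_qualified(tag):
    """Split '{uri}local' into (uri, local); unqualified names give (None, tag)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


root = ET.fromstring(XHTML)
print(root.tag)                       # the qualified name of the root element
print(split_qualified(root.tag))

# Lookups must use the qualified name; the bare local name does not match.
head = root.find("{" + XHTML_NS + "}head")
print(head is not None, root.find("head") is None)
```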
The Actual Solution

• Namespace declarations bind URIs to prefixes

<... xmlns:foo="http://www.w3.org/TR/xhtml1">
  ...
  <foo:head>...</foo:head>
  ...
</...>

• Lexical scope
• Default namespace (no prefix) declared with xmlns="..."
• Attribute names can also be prefixed

Widgets with Namespaces

<widget type="gadget" xmlns="http://www.widget.inc">
  <head size="medium"/>
  <big><subwidget ref="gizmo"/></big>
  <info xmlns:xhtml="http://www.w3.org/TR/xhtml1">
    <xhtml:head>
      <xhtml:title>Description of gadget</xhtml:title>
    </xhtml:head>
    <xhtml:body>
      <xhtml:h1>Gadget</xhtml:h1>
      A gadget contains a big gizmo
    </xhtml:body>
  </info>
</widget>

Namespace map: for each element, maps prefixes to URIs
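With namespaces in place, the two `head` elements of the widget document are no longer ambiguous. The sketch below assumes Python's standard `xml.etree.ElementTree`; the prefix map `NS` and its prefixes `w` and `x` are chosen for this example only, which also shows that prefixes are local shorthand and only the URIs matter.

```python
import xml.etree.ElementTree as ET

WIDGET = """<widget type="gadget" xmlns="http://www.widget.inc">
  <head size="medium"/>
  <big><subwidget ref="gizmo"/></big>
  <info xmlns:xhtml="http://www.w3.org/TR/xhtml1">
    <xhtml:head>
      <xhtml:title>Description of gadget</xhtml:title>
    </xhtml:head>
    <xhtml:body>
      <xhtml:h1>Gadget</xhtml:h1>
      A gadget contains a big gizmo
    </xhtml:body>
  </info>
</widget>"""

# Query-side prefix map; these prefixes need not match those used in the document.
NS = {"w": "http://www.widget.inc", "x": "http://www.w3.org/TR/xhtml1"}

root = ET.fromstring(WIDGET)

widget_head = root.find("w:head", NS)        # head in the default (widget) namespace
xhtml_head = root.find("w:info/x:head", NS)  # head in the namespace bound to xhtml:

print(widget_head.tag, widget_head.get("size"))
print(xhtml_head.tag, xhtml_head.findtext("x:title", namespaces=NS))

# Unprefixed attributes are not placed in the default namespace,
# and namespace declarations are not reported as ordinary attributes.
print(root.attrib)
```

The `info` element itself is found as `w:info` because the default namespace declared on `widget` is in scope for all unprefixed descendant elements, which is the lexical scoping rule named above.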


Summary

• XML: a notation for hierarchically structured text
• Conceptual tree model vs. concrete textual representation
• Well-formedness
• Namespaces

Essential Online Resources

• http://www.w3.org/TR/xml11/
• http://www.w3.org/TR/xml-names11
• http://www.unicode.org/

