Recommendation, February 1998
An Introduction to XML
Patrice Bonhomme & Laurent Romary
Lucid-IT LORIA
bonhomme@lucid-it.com
romary@loria.fr
Objectives
! Understanding the basic concepts of XML
! Elements, attributes and content
! DTDs (and Schemas)
! Namespaces
! An overview of the main associated
recommendations:
! XML path language (XPath)
! XML pointers and links (XPointer and XLink)
! The transformation language of XSL (eXtensible
Stylesheet Language)
XML in the document chain
[Figure: the document chain in four stages: Conception, Edition, Transformation, Consultation. Formats shown: DTD/Schema, XML, XSL/XSLT, HTML, XHTML. Labels: Data structures, Data processing, User perspective.]
A quick historical overview
! 1986
! SGML (Standard Generalized Markup Language)
! ISO standard: ISO:8879:1986
! 1987
! TEI (Text Encoding Initiative)
! 1990
! HTML 1.0 (HyperText Markup Language)
! 1997/1998
! XML 1.0 (eXtensible Markup Language)
What XML is:
! XML: eXtensible Markup Language
! A W3C (World Wide Web Consortium)
Recommendation
! A meta-language: it lets you define your
own markup languages
! A simplification of the SGML standard
! SGML was intended to represent the “logical”
structure of a document
! HTML was conceived as an application of SGML
A simplified SGML
! An XML document is an SGML document
! With some slight (but essential) differences...
! XML has the expressive power of SGML
without its complexity
! Opens the door to the transmission of
structured documents on the web
! Databases also entered the game...
What can we do with it?
! Data modeling (as a complement to UML, for
instance)
! Publication of structured data on the web
! Separation of the logical structure of a
document from its actual presentation
! Distributed applications (cf. well-formed vs.
valid documents)
! Integrating data from heterogeneous sources
Why can’t we avoid it?
! Simplicity, which makes it easy to integrate into any
kind of application
! XML specification = 36 pages
! SGML standard, ISO-8879 = 250 pages
! Consequence:
! a lot of software is available: editors, parsers, bridges from and to
existing editing environments or DBMSs
From HTML to XML - 1
! A simple HTML document:
[Example not preserved in the source]
Elements and their content
[Figure: an element, with its opening tag and closing tag labelled]
! Comments
<!-- this is a comment -->
! CDATA section
<![CDATA[Langue & Dialogue]]>
! Processing instruction (application specific)
<?edit line="wrap"?>
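A minimal sketch of how a parser treats these three constructs, using Python's standard xml.etree.ElementTree module. The EQUIPE wrapper element and the helper name parses are assumptions made for the illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical element combining the three constructs listed above.
SOURCE = (
    "<EQUIPE>"
    "<![CDATA[Langue & Dialogue]]>"
    "<!-- this is a comment -->"
    '<?edit line="wrap"?>'
    "</EQUIPE>"
)

equipe = ET.fromstring(SOURCE)

# CDATA content arrives as ordinary character data; the default parser
# drops the comment and the processing instruction.
team_name = equipe.text
child_count = len(equipe)


def parses(text):
    """Return True if the parser accepts `text`."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Outside a CDATA section, a bare "&" is a syntax error
# and has to be written "&amp;".
bare_ampersand_ok = parses("<EQUIPE>Langue & Dialogue</EQUIPE>")
escaped_ok = parses("<EQUIPE>Langue &amp; Dialogue</EQUIPE>")
```

The CDATA section is what allows the "&" of "Langue & Dialogue" to appear unescaped.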
From one document to a class…
! How do I know the structure of my document?
! How may I share this structure with others?
Document Type Definition
! Expresses constraints on:
! Allowed element and attribute names
! Possible content of a given element (“content
model”)
! To which elements a given attribute can be
attached
! Similar to the traditional SGML approach, but:
! Simplified syntax
! The DTD is optional for a document
Example
<!ELEMENT MEMBRE (LOGIN, NOM?, PRENOM?, MEL, TEL+, FAX*, EQUIPE)>
<!DOCTYPE MEMBRE [
<!ELEMENT MEMBRE … >
…
]>
<MEMBRE TYPE="IE" ID="M28">
…
</MEMBRE>
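A content model reads like a regular expression over the sequence of child element names ("?" optional, "+" one or more, "*" zero or more). The sketch below makes that reading explicit in Python. It is only an illustration of the MEMBRE content model, not a DTD validator: the function name, the regular expression and the two sample documents are assumptions, and the children are left empty for brevity.

```python
import re
import xml.etree.ElementTree as ET

# The content model (LOGIN, NOM?, PRENOM?, MEL, TEL+, FAX*, EQUIPE),
# rewritten as a regular expression over space-terminated child names.
CONTENT_MODEL = re.compile(r"LOGIN (NOM )?(PRENOM )?MEL (TEL )+(FAX )*EQUIPE ")


def matches_content_model(element):
    """Check the sequence of child element names against the content model."""
    children = "".join(child.tag + " " for child in element)
    return CONTENT_MODEL.fullmatch(children) is not None


# Hypothetical instances: the first has two TEL children, the second has none.
conforming = ET.fromstring(
    "<MEMBRE><LOGIN/><MEL/><TEL/><TEL/><EQUIPE/></MEMBRE>"
)
missing_tel = ET.fromstring("<MEMBRE><LOGIN/><MEL/><EQUIPE/></MEMBRE>")

conforming_ok = matches_content_model(conforming)
missing_tel_ok = matches_content_model(missing_tel)
```

The second document is rejected because TEL+ requires at least one TEL element.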
Valid vs. Well-formed
! Well-formed documents
! Tags are properly nested and closed; no DTD is required
! Empty element:
<toto></toto> = <toto/>
! Valid documents
! With a DTD (à la SGML)
! Essential difference with SGML
! Extracting and re-using document fragments
! One usually produces valid documents and distributes
well-formed ones
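Well-formedness can be checked without any DTD. A minimal sketch with Python's standard xml.etree.ElementTree parser, which does not validate; the helper name is_well_formed and the two broken samples are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML (no DTD is consulted)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The two spellings of an empty element are both well-formed...
long_form_ok = is_well_formed("<toto></toto>")
short_form_ok = is_well_formed("<toto/>")

# ...and they produce the same tree once parsed.
same_tree = (
    ET.tostring(ET.fromstring("<toto></toto>"))
    == ET.tostring(ET.fromstring("<toto/>"))
)

# Overlapping or unclosed tags break the bracketing.
overlapping_ok = is_well_formed("<a><b></a></b>")
unclosed_ok = is_well_formed("<toto>")
```

Checking validity would additionally require a validating parser and the DTD, which this sketch deliberately leaves out.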
XML namespaces
! Objectives: avoid conflicts between element and
attribute names coming from various sources
! Composite documents
! XSLT instructions, Schema declarations
! Declaration:
<DOC xmlns:mml="http://www.w3.org/Math/MathML/"
xmlns="http://www.ua99.net/DOC/1.0">
<P>blah blah :
<mml:fn mml:definitionURL="mydef.xml">
…
</mml:fn> re blah blah</P>
</DOC>
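A namespace-aware parser resolves each prefix to its URI, so the prefix itself no longer matters. A minimal sketch with Python's standard xml.etree.ElementTree, which writes qualified names as {uri}local. The document is the one above with the elided content of mml:fn left out; the prefixes d and m used in the query are arbitrary choices for this example.

```python
import xml.etree.ElementTree as ET

DOC_NS = "http://www.ua99.net/DOC/1.0"
MML_NS = "http://www.w3.org/Math/MathML/"

SOURCE = (
    '<DOC xmlns:mml="http://www.w3.org/Math/MathML/" '
    'xmlns="http://www.ua99.net/DOC/1.0">'
    "<P>blah blah : "
    '<mml:fn mml:definitionURL="mydef.xml"/> re blah blah</P>'
    "</DOC>"
)

doc = ET.fromstring(SOURCE)
paragraph = doc[0]
function = paragraph[0]

# Unprefixed elements belong to the default namespace.
doc_tag = doc.tag
paragraph_tag = paragraph.tag

# Prefixed elements and attributes belong to the namespace bound to the prefix.
function_tag = function.tag
definition = function.get("{" + MML_NS + "}definitionURL")

# Queries may use any prefixes, as long as they map to the same URIs.
found = doc.find("d:P/m:fn", {"d": DOC_NS, "m": MML_NS})
```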
Reserved namespaces
! The xml: prefix is reserved by the W3C for specific
attributes:
<title xml:space="default">...</title>
<p xml:lang="FR">…</p>
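The xml: prefix needs no declaration: it is always bound to the namespace http://www.w3.org/XML/1998/namespace. A short sketch in Python (standard library); the text content "Bonjour" is a placeholder for the elided content of the paragraph.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# No xmlns:xml declaration is needed for the reserved prefix.
paragraph = ET.fromstring('<p xml:lang="FR">Bonjour</p>')

# The attribute is stored under its namespace-qualified name.
language = paragraph.get("{" + XML_NS + "}lang")
unqualified = paragraph.get("lang")
```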
XPath
! XML Path Language 1.0 REC 29012000
! General-purpose syntax for addressing parts of an
XML document
! Common foundation shared by XML Pointers
(XPointer recommendation) and the XSLT
transformation language
! Allows one to access, select and filter XML
fragments (cf. tree representation of an XML
document)
Addressing nodes in XPath
! Absolute addressing
! Given: a URL
! id(M28), root()
! Relative addressing along axes
! Given: a node
! ancestor, child
! descendant
! psibling, fsibling (named preceding-sibling and following-sibling in XPath 1.0)
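The axes can be pictured as simple walks over the document tree. The sketch below spells them out in Python with the standard xml.etree.ElementTree module. It is an illustration of the idea, not an XPath engine: the function names are assumptions, the sample is a shortened MEMBRE database, and siblings are returned in document order.

```python
import xml.etree.ElementTree as ET

DB = ET.fromstring(
    "<DB>"
    '<MEMBRE TYPE="IE" ID="M28"><LOGIN ID="bonhomme"/>'
    '<EQUIPE LAB="LORIA">Langue et Dialogue</EQUIPE></MEMBRE>'
    '<MEMBRE TYPE="CR" ID="M14"><LOGIN ID="romary"/></MEMBRE>'
    "</DB>"
)

# ElementTree keeps no parent pointers, so build a child -> parent map once.
PARENT = {child: parent for parent in DB.iter() for child in parent}


def children(node):
    return list(node)


def descendants(node):
    # iter() yields the node itself first, then everything below it.
    return list(node.iter())[1:]


def ancestors(node):
    found = []
    while node in PARENT:
        node = PARENT[node]
        found.append(node)
    return found


def preceding_siblings(node):
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[: siblings.index(node)]


def following_siblings(node):
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[siblings.index(node) + 1:]


# Relative addressing starts from a given node, here the first LOGIN.
login = DB[0][0]
ancestor_tags = [node.tag for node in ancestors(login)]
following_tags = [node.tag for node in following_siblings(login)]
descendant_tags = [node.tag for node in descendants(DB)]
previous_member_ids = [node.get("ID") for node in preceding_siblings(DB[1])]
```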
An XML document represents a
hierarchical structure
The only view you should ever, ever have of an XML document
[Figure: tree representation of the document, with the root (/ or /DB) and a MEMBRE node carrying TYPE="IE" and ID="M28"]
XPath - Examples
<DB>
<MEMBRE TYPE="IE" ID="M28">
<LOGIN ID="bonhomme"/>
...
<EQUIPE LAB="LORIA">Langue et Dialogue</EQUIPE>
</MEMBRE>
<MEMBRE TYPE="CR" ID="M14">
<LOGIN ID="romary"/>
...
</MEMBRE>
</DB>
/ or /DB
/DB/MEMBRE
/DB/MEMBRE[2]
/DB/MEMBRE[@ID='M28']/EQUIPE[1]/text()
/DB/MEMBRE/LOGIN[@ID='romary']/../@ID
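These expressions can be tried out with Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. In this sketch, paths are therefore written relative to the DB root element, and the final text() and @ID steps are replaced by the .text attribute and the get() method. The elided children ("...") of the sample document are simply left out.

```python
import xml.etree.ElementTree as ET

db = ET.fromstring(
    "<DB>"
    '<MEMBRE TYPE="IE" ID="M28"><LOGIN ID="bonhomme"/>'
    '<EQUIPE LAB="LORIA">Langue et Dialogue</EQUIPE></MEMBRE>'
    '<MEMBRE TYPE="CR" ID="M14"><LOGIN ID="romary"/></MEMBRE>'
    "</DB>"
)

# /DB/MEMBRE : every MEMBRE child of the root element
members = db.findall("MEMBRE")

# /DB/MEMBRE[2] : the second MEMBRE (positions start at 1)
second = db.find("MEMBRE[2]")

# /DB/MEMBRE[@ID='M28']/EQUIPE[1]/text()
team = db.find("MEMBRE[@ID='M28']/EQUIPE[1]").text

# /DB/MEMBRE/LOGIN[@ID='romary']/../@ID : go back up to the parent MEMBRE
owner = db.find("MEMBRE/LOGIN[@ID='romary']/..").get("ID")
```

A full XPath 1.0 implementation would accept the five expressions as written, including the leading / and the final text() and @ID steps.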
XPointer
! In HTML, the target document must contain an anchor:
<A NAME="toto">
http://www.titi.fr/index.html#toto
! In XML, pointers can directly address a
document component:
http://…/doc.xml#xptr(id(M28))
http://…/doc.xml#xptr(/DB/MEMBRE[28]/MEL)
! Advantage: no need to modify the target
document (notion of primary source)
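A rough sketch of how such a pointer could be resolved, in Python (standard library). It handles only the id(...) form, written with the xptr(...) syntax used above. The host example.org, the function name resolve and the shortened sample document are assumptions, and a plain ID attribute stands in for a DTD-declared identifier.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical target document; it contains no anchors of any kind.
DB = ET.fromstring(
    "<DB>"
    '<MEMBRE TYPE="IE" ID="M28"><LOGIN ID="bonhomme"/></MEMBRE>'
    '<MEMBRE TYPE="CR" ID="M14"><LOGIN ID="romary"/></MEMBRE>'
    "</DB>"
)


def resolve(uri, root):
    """Return the element addressed by an xptr(id(...)) fragment, or None."""
    _, fragment = urldefrag(uri)
    if not (fragment.startswith("xptr(") and fragment.endswith(")")):
        raise ValueError("not an xptr(...) fragment: " + fragment)
    pointer = fragment[len("xptr("):-1]
    if not (pointer.startswith("id(") and pointer.endswith(")")):
        raise ValueError("only id(...) is supported in this sketch")
    wanted = pointer[len("id("):-1]
    for element in root.iter():
        if element.get("ID") == wanted:
            return element
    return None


target = resolve("http://example.org/doc.xml#xptr(id(M28))", DB)
missing = resolve("http://example.org/doc.xml#xptr(id(M99))", DB)
```

The target document is left untouched: the pointer carries all the addressing information.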
XLink
! In HTML: the elements which may carry links are
known:
<A>, <IMG>, ...
! In XML: any element may carry a simple or
complex link
! This is done by using pre-defined attributes:
<a xlink:type="simple"
xlink:href="http://www.w3.org/">W3C</a>
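Because links are carried by attributes, an application finds them by looking at attributes rather than element names. A minimal sketch in Python (standard library), assuming the xlink prefix is bound to the XLink namespace http://www.w3.org/1999/xlink. The P and NOTE elements and the notes.xml target are invented for the example.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragment: two unrelated elements both carry a simple link.
SOURCE = (
    '<P xmlns:xlink="http://www.w3.org/1999/xlink">See the '
    '<a xlink:type="simple" xlink:href="http://www.w3.org/">W3C</a> and this '
    '<NOTE xlink:type="simple" xlink:href="notes.xml">note</NOTE>.</P>'
)


def simple_links(root):
    """List (element name, target) for every element carrying a simple link."""
    links = []
    for element in root.iter():
        if element.get("{" + XLINK + "}type") == "simple":
            links.append((element.tag, element.get("{" + XLINK + "}href")))
    return links


links = simple_links(ET.fromstring(SOURCE))
```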
Visualizing XML documents
! Basically, an XML document does not provide
any information about its presentation
! Visualizing a document may depend on the
target audience, device etc.
! Stylesheets:
! Cascading Style Sheets (CSS 1 and 2)
! Extensible Stylesheet Language (XSL) >> XSLT
eXtensible Stylesheet Language
! Describes the way a
document will be shown,
printed or verbalized…
[Figure: XML + XSL]
XSL: a two-fold proposal
! XSL = Transformations + Visualizing properties
! XSLT : Transformation of XML documents
! Allows one to transform an XML document into another
XML document
! Use this to produce well-formed (!) HTML documents
! Visualizing properties (PDF, MIF, …)
! Not a recommendation yet :-(
General structure of an XSL document
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
…
<xsl:template match="/">
…
</xsl:template>
<xsl:template match="NOM">
…
</xsl:template>
</xsl:stylesheet>
Declarative approach
! Sequence of rules (templates) specifying:
! The pattern (XPath) of nodes to which the rule can
be applied
! Actions to be undertaken:
! Elements to be generated in the target document
! Selection of the elements to be further explored in the
source document
! Additional functionalities: testing, sorting, etc.
A simple rule
<xsl:template match='/DB/MEMBRE/NOM'>
<B>
<xsl:apply-templates/>
</B>
</xsl:template>
! match='/DB/MEMBRE/NOM': pattern (XPath)
! <B>: HTML element to be produced
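The control flow behind such a rule can be imitated in a few lines of Python (standard library). This is a toy model of the declarative approach, not XSLT. Rules are looked up by element name only rather than by a full XPath pattern, the output text is not escaped, and the function names, the RULES table and the sample NOM content are assumptions.

```python
import xml.etree.ElementTree as ET


def apply_templates(node, rules):
    """Process the content of `node` in document order (cf. xsl:apply-templates)."""
    output = node.text or ""
    for child in node:
        output += process(child, rules)
        output += child.tail or ""
    return output


def process(node, rules):
    """Apply the rule registered for this element, or the default behaviour."""
    rule = rules.get(node.tag)
    if rule is None:
        # Default behaviour: keep the text and explore the children.
        return apply_templates(node, rules)
    return rule(node, rules)


def nom_rule(node, rules):
    # Counterpart of the rule above: produce <B>, then continue inside NOM.
    return "<B>" + apply_templates(node, rules) + "</B>"


RULES = {"NOM": nom_rule}

# Hypothetical source document.
db = ET.fromstring('<DB><MEMBRE ID="M28"><NOM>Bonhomme</NOM></MEMBRE></DB>')
html = process(db, RULES)
```

As in a stylesheet, the traversal is implicit: each rule states what to produce and where processing should continue.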