
CSC570 XML Programming
Chapter 2
XML Documents
By: Anis Shobirin
Amended by: Aimi Liyana Amir

Objective

To introduce XML documents

To introduce well-formed XML documents

Compiled by Anis Shobirin Bt Abdullah Sani

TOC

XML Document
Creating Well-Formed XML Documents
Elements
Attributes
Entities Reference
Namespaces
Adding Comments, Processing Instructions
and CDATA Sections


The XML Document

<novel>
<foreword>
<paragraph>This is the great American novel.
</paragraph>
</foreword>
<chapter number="1">
<paragraph>It was a dark and stormy night.
</paragraph>
<paragraph>Suddenly, a shot rang out!
</paragraph>
</chapter>
</novel>

XML as a tree

An XML document represents a hierarchy; a hierarchy is a tree

novel
    foreword
        paragraph: This is the great American novel.
    chapter (number="1")
        paragraph: It was a dark and stormy night.
        paragraph: Suddenly, a shot rang out!
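The tree view can be reproduced with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name outline is made up for this illustration.

```python
import xml.etree.ElementTree as ET

NOVEL = """<novel>
  <foreword>
    <paragraph>This is the great American novel.</paragraph>
  </foreword>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>"""

def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    label = element.tag
    if element.attrib:
        # Show attributes next to the element name, as in the slide's tree.
        label += " " + " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    lines = ["  " * depth + label]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(NOVEL)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the output corresponds to one level of nesting in the document.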


XML Basic Components

Elements.
The basic building blocks of XML markup.

Attributes.
Information regarding the element.

Character and Entity References.
Used to represent characters that are not supported by the input device (keyboard).
Entities may be defined and used.


XML Basic Components

Processing Instructions.
Allow documents to contain instructions for applications.

Comments.
Allow programmers to add comments inside the code.


XML Document
<?xml version="1.0"?>
<book>                                  <!-- root element -->
  <author>                              <!-- parent of <lastname> -->
    <lastname>Tennant</lastname>
    <firstname>Roy</firstname>          <!-- child of <author> -->
  </author>
  <title>The Great American Novel</title>
  <chapter number="1">
    <!-- <chaptitle> and the <p> elements are siblings -->
    <chaptitle>It Was Dark and Stormy</chaptitle>
    <p>It was a dark and stormy night.</p>
    <p>An owl hooted.</p>
  </chapter>
</book>

XML Declaration

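The XML declaration looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

The XML declaration is optional in XML 1.0 and browsers do not require it, but it is recommended and some XML processors expect it.
If present, the XML declaration must be the very first thing in the document, with nothing (not even whitespace) before it.
version="1.0" is required whenever the declaration is present.
encoding can be "UTF-8" (the default, compatible with ASCII), "UTF-16", or something else, or can be omitted.
standalone="yes" states that the document does not depend on a separate (external) DTD; it can also be omitted.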
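The placement rules for the declaration can be tried out with a parser. This is a rough sketch, assuming Python's standard xml.etree.ElementTree module (which is backed by the expat parser); the labels in candidates are made up for this illustration.

```python
import xml.etree.ElementTree as ET

DECL = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

# Each candidate is a complete document, given as bytes.
candidates = {
    "declaration first": DECL + b"\n<book/>",
    "no declaration": b"<book/>",
    "blank line before declaration": b"\n" + DECL + b"\n<book/>",
    "missing version": b'<?xml encoding="UTF-8"?>\n<book/>',
}

accepted = {}
for label, data in candidates.items():
    try:
        ET.fromstring(data)
        accepted[label] = True
    except ET.ParseError:
        accepted[label] = False

for label, ok in accepted.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

With this parser, the first two documents are accepted and the last two are rejected.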

Well-Formed XML Documents

A well-formed XML document means that the document is syntactically correct.

This allows it to be processed by generic processors (XML parsers) that traverse the document and create an internal tree representation.
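As an illustration of a generic processor, here is a small sketch assuming Python's standard xml.etree.ElementTree module. The helper name check and the sample documents are invented for this example.

```python
import xml.etree.ElementTree as ET

def check(text):
    """Parse text; return (root, None) on success or (None, (line, column)) on failure."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple locating the problem.
        return None, err.position

# Well-formed: the parser builds an internal tree we can walk.
root, where = check("<note><to>Ann</to><from>Bob</from></note>")
print(root.tag, [child.tag for child in root])

# Not well-formed: <to> is never closed, so no tree is produced.
root2, where2 = check("<note><to>Ann</note>")
print(root2, where2)
```

The parser either returns a complete tree or reports where the syntax broke; it does not return a partial tree.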


Well-Formed XML Documents


A well-formed document must adhere to the following rules:

Every start tag has a matching end tag.
Elements may nest, but must not overlap.
There must be exactly one root element.
Attribute values must be quoted.
An element may not have two attributes with the same name.
Comments and processing instructions may not appear inside tags.
No unescaped < or & signs may occur inside character data.

Only well-formed documents can be processed by XML parsers.

Well-Formed XML Documents

Every start tag has a matching end tag.
E.g. <name> ... </name>
But empty elements can be abbreviated: <break />.

XML tags are case sensitive.
The tag <Student> is different from the tag <student>.

Element names must follow these rules:
Names must start with a letter (including non-Latin characters) or the "_" character, not with a number or other punctuation character.
After the first character, numbers are allowed, as are the characters "-" and ".".
Names can't contain spaces.
Names shouldn't contain the ":" character, which is reserved for namespace prefixes.
Names can't start with the letters "xml" in uppercase, lowercase, or mixed case: you can't start a name with "xml", "XML", "XmL", or any other combination.
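Some of these tag and naming rules can be checked by feeding small documents to a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the sample documents are invented for this illustration. It covers only the rules that the parser checks syntactically.

```python
import xml.etree.ElementTree as ET

samples = [
    "<Student></Student>",   # start and end tag match exactly
    "<Student></student>",   # differs only in case, so the tags do not match
    "<_id-1.a/>",            # "_" first, then digits, "-" and "." are allowed
    "<1st/>",                # a name may not start with a digit
    "<-a/>",                 # nor with punctuation such as "-"
    "<first name/>",         # a space ends the name; "name" is read as a broken attribute
]

results = {}
for doc in samples:
    try:
        ET.fromstring(doc)
        results[doc] = "ok"
    except ET.ParseError:
        results[doc] = "error"

for doc, outcome in results.items():
    print(f"{doc:22} {outcome}")
```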


Well-Formed XML Documents

Elements may nest, but must not overlap.

e.g. <b><i>bold and italic</b></i> is not allowed; write <b><i>bold and italic</i></b> instead.

There must be exactly one root element.


<root>
<child>
<subchild>..</subchild>
<subchild>..</subchild>
</child>
</root>

Attribute values must be quoted.

e.g. <time unit="days">
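The three rules on this slide can be exercised against a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name first_error and the labels are made up for this example, and the exact error wording depends on the underlying parser.

```python
import xml.etree.ElementTree as ET

def first_error(text):
    """Return the parser's error message, or None if text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None

examples = [
    ("overlapping tags", "<b><i>bold and italic</b></i>"),
    ("properly nested", "<b><i>bold and italic</i></b>"),
    ("two root elements", "<root>..</root><root>..</root>"),
    ("unquoted attribute", "<time unit=days>3</time>"),
    ("quoted attribute", '<time unit="days">3</time>'),
]

report = {label: first_error(doc) for label, doc in examples}
for label, error in report.items():
    print(f"{label}: {error or 'well-formed'}")
```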



Well-Formed XML Documents

An element may not have two attributes with the same name.
e.g. not allowed: <time unit="days" unit="hours">

Comments and processing instructions may not appear inside tags.
Example comment: <!-- This is a comment -->

No unescaped < or & signs may occur inside character data.
e.g. not allowed: <comparison>6 is < 7 & 7 > 6</comparison>
Write instead: <comparison>6 is &lt; 7 &amp; 7 &gt; 6</comparison>
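A short sketch of the duplicate-attribute and escaping rules, assuming Python's standard xml.etree.ElementTree module. The helper name parses is invented for this illustration.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts text as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

raw = "<comparison>6 is < 7 & 7 > 6</comparison>"
escaped = "<comparison>6 is &lt; 7 &amp; 7 &gt; 6</comparison>"
duplicate = '<time unit="days" unit="hours"/>'

raw_ok = parses(raw)              # unescaped < and & in character data
duplicate_ok = parses(duplicate)  # the same attribute name twice

# The parser turns the entity references back into the original characters.
text = ET.fromstring(escaped).text
print(raw_ok, duplicate_ok, text)
```

The escaped form is what is stored in the file; the program reading the document sees the plain characters again.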


Well-Formed XML Documents

A document that obeys all the rules is said to be well-formed.

Being well-formed is the minimum requirement for being an XML document.


Element Content

Four kinds:

Data content:
<name>Steve Ballmer</name>

Element content:
<book><intro/><chapter/><chapter/></book>

Empty:
<book></book> or <book/>

Mixed content:
<bookname>The Difference Between
<name>God</name> and
<name>Larry Ellison</name>
</bookname>
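The kinds of content can be told apart programmatically. The following is a sketch only, assuming Python's standard xml.etree.ElementTree module; the function content_kind is hypothetical, and it treats whitespace-only text as no text, which is a simplification chosen for this example.

```python
import xml.etree.ElementTree as ET

def content_kind(element):
    """Classify an element as 'data', 'element', 'empty' or 'mixed' content."""
    has_children = len(element) > 0
    # Text can sit before the first child (.text) or after any child (.tail).
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "data"
    return "empty"

docs = {
    "data": "<name>Steve Ballmer</name>",
    "element": "<book><intro/><chapter/><chapter/></book>",
    "empty long form": "<book></book>",
    "empty short form": "<book/>",
    "mixed": "<bookname>The Difference Between <name>God</name> and "
             "<name>Larry Ellison</name></bookname>",
}

kinds = {label: content_kind(ET.fromstring(doc)) for label, doc in docs.items()}
print(kinds)
```

Both spellings of the empty element give the same result, since <book></book> and <book/> are equivalent to the parser.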


Element Attributes

Written inside the element's start tag:

<name first="David" last="Matuszek"></name>

<name ORIGIN="Scotland">Groundskeeper Willie</name>

Attributes are metadata: data about data.

Each attribute consists of a name and a value.

Browsers display only element content (the values enclosed by tags), not the tags and attributes.
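To see how a program reads attributes separately from element content, here is a brief sketch assuming Python's standard xml.etree.ElementTree module. The variable names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

name = ET.fromstring('<name first="David" last="Matuszek"></name>')
willie = ET.fromstring('<name ORIGIN="Scotland">Groundskeeper Willie</name>')

# Attributes are exposed as name/value pairs, apart from the element's text.
all_attributes = dict(name.attrib)
first = name.get("first")
missing = name.get("middle")         # no such attribute, so None

content = willie.text                # the element content
origin = willie.get("ORIGIN")        # the metadata about it
wrong_case = willie.get("origin")    # attribute names are case sensitive

print(all_attributes, first, missing)
print(content, origin, wrong_case)
```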


Entities Reference

Five special characters must be written as entities:

&amp;   for  &   (almost always necessary)
&lt;    for  <   (almost always necessary)
&gt;    for  >   (not usually necessary)
&quot;  for  "   (necessary inside double quotes)
&apos;  for  '   (necessary inside single quotes)

These are the only predefined entities in XML
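Escaping is usually done by a library rather than by hand. The sketch below assumes Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the sample strings are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

original = "6 is < 7 & 7 > 6"

# escape() handles &, < and > by default.
escaped = escape(original)

# Quotes are only escaped if asked for, via an extra mapping.
quotes = {'"': "&quot;", "'": "&apos;"}
escaped_quotes = escape("\"hi\" & 'bye'", quotes)

# unescape() reverses the default replacements.
restored = unescape(escaped)

# A parser resolves the entity references when reading the document.
parsed = ET.fromstring("<q>" + escaped + "</q>").text

print(escaped)
print(escaped_quotes)
print(restored == original, parsed == original)
```

Note that & is replaced first, so the ampersands introduced by the other replacements are not escaped a second time.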



Namespaces and URIs*

A namespace is defined as a unique string.

To guarantee uniqueness, typically a URI (Uniform Resource Identifier) is used, because the author owns the domain.

It doesn't have to be a real URI; it just has to be a unique string.

Example: http://www.matuszek.org/ns

There are two ways to use namespaces:

Declare a default namespace.

Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace.

Namespace syntax*

In any start tag you can use the reserved attribute name xmlns:
<book xmlns="http://www.matuszek.org/ns">

This namespace will be used as the default for all elements up to the corresponding end tag.
You can override it with a specific prefix.

You can use almost this same form to declare a prefix:
<book xmlns:dave="http://www.matuszek.org/ns">

Use this prefix on every tag and attribute you want to use from this namespace, including end tags; it is not a default prefix.
<dave:chapter dave:number="1">To Begin</dave:chapter>

You can use the prefix in the start tag in which it is defined:
<dave:book xmlns:dave="http://www.matuszek.org/ns">
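Both ways of declaring a namespace lead to the same names once parsed. Here is a sketch, assuming Python's standard xml.etree.ElementTree module, which writes a namespaced name as {uri}localname. The two sample documents are assembled from the slide's fragments for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.matuszek.org/ns"

default_doc = f'<book xmlns="{NS}"><chapter>To Begin</chapter></book>'
prefixed_doc = (
    f'<dave:book xmlns:dave="{NS}">'
    '<dave:chapter dave:number="1">To Begin</dave:chapter>'
    '</dave:book>'
)

by_default = ET.fromstring(default_doc)
by_prefix = ET.fromstring(prefixed_doc)

# The prefix itself is not part of the name; only the namespace string is.
print(by_default.tag)
print(by_prefix.tag)

# When searching, any prefix may be mapped to the namespace.
chapter = by_prefix.find("x:chapter", {"x": NS})
number = chapter.get(f"{{{NS}}}number")   # attribute in the namespace
inner = by_default.find("x:chapter", {"x": NS}).text
print(number, inner)
```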

Comments

<!-- This is a comment in both HTML and XML -->

Comments can be put almost anywhere in an XML document, but not inside a tag or before the XML declaration.

Comments are useful for:

Explaining the structure of an XML document
Commenting out parts of the XML during development and testing

Comments are not elements and do not have an end tag.

Comments

The blanks after <!-- and before --> are optional.

The closing bracket must be -->.

Comments are not displayed by browsers, but can be seen by anyone who looks at the source code.
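Parsers treat comments in a similar way: they are in the source but are normally skipped. This is a sketch assuming Python 3.8 or later and its standard xml.etree.ElementTree module; the sample document is invented for this example.

```python
import xml.etree.ElementTree as ET

DOC = "<book><!-- draft note --><title>Dark and Stormy</title></book>"

# By default the parser drops comments when building the tree.
plain = ET.fromstring(DOC)
plain_tags = [child.tag for child in plain]

# A tree builder can be asked to keep them (Python 3.8+).
keeping = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
kept = ET.fromstring(DOC, parser=keeping)
kept_comments = [child.text for child in kept if child.tag is ET.Comment]

# A comment must open with <!-- ; anything else is a syntax error.
try:
    ET.fromstring("<book><! This is comment --></book>")
    malformed_ok = True
except ET.ParseError:
    malformed_ok = False

print(plain_tags, kept_comments, malformed_ok)
```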


Processing instructions*

PIs (Processing Instructions) may occur anywhere in the XML document.

A PI is a command to the program processing the XML document to handle it in a certain way.

XML documents are typically processed by more than one program.

Programs that do not recognize a given PI should just ignore it.


Processing instructions*

General format of a PI: <?target instructions?>

Example:

<?xml-stylesheet type="text/css" href="mySheet.css"?>


<?CDParser MessageBox("There are songs missing!")?>
<?nameprocessor PRINT name?>
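A program can pick out the PIs addressed to it by looking at the target. The following is a sketch only, assuming Python's standard xml.dom.minidom module; the <songs> document and the helper collect_pis are made up for this illustration.

```python
from xml.dom import minidom, Node

DOC = (
    '<?xml-stylesheet type="text/css" href="mySheet.css"?>'
    '<songs><?nameprocessor PRINT name?><song/></songs>'
)

def collect_pis(node, found):
    """Walk the DOM tree and record (target, data) for every PI."""
    for child in node.childNodes:
        if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            found.append((child.target, child.data))
        else:
            collect_pis(child, found)
    return found

pis = collect_pis(minidom.parseString(DOC), [])
for target, data in pis:
    print(target, "->", data)

# A program handles only the targets it recognizes and ignores the rest.
mine = [data for target, data in pis if target == "nameprocessor"]
```

The first word after <? is the target; everything after it up to ?> is passed along as the instruction text.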
