
Integrative Programming And Technologies

(ITec4121)

Chapter Three
XML and XML Related Technologies
Introduction

 XML (Extensible Markup Language) is a markup language designed to store and transport data.
 It was created to provide an easy way to use and store self-describing data.
(Self-describing data is data that describes both its content and its
structure.)
 It is not a replacement for HTML.
 It is designed to be self-descriptive and to carry data, not to
display data.
 XML tags are not predefined. You must define your own tags.
 XML is platform independent and language independent.
 What makes XML truly powerful is its international acceptance.
 Because it is platform independent, XML interfaces exist for databases, programming
languages, office applications, mobile phones, and more.

Features of XML

 It separates data from HTML
 It simplifies data sharing
 It simplifies data transport
 It simplifies platform changes
 It increases data availability
 It can be used to create new internet languages
Examples:
 WSDL for describing available web services
 WAP and WML as markup languages for handheld devices
 RSS for news feeds
 RDF and OWL for describing resources and ontologies
 SMIL for describing multimedia for the web

XML Document

Example 1:
<?xml version="1.0" encoding="ISO-8859-1"?>  
<note>  
  <to>John</to>  
  <from>Ahmed</from>  
  <heading>Reminder</heading>  
  <body>Don't forget me this weekend!</body>  
</note>  
 The first line is the XML declaration. It defines the XML version (1.0)
and the encoding used (ISO-8859-1 = Latin-1/West European character
set).
 The next line describes the root element of the document (like saying:
"this document is a note"):
<note>  


 The next 4 lines describe 4 child elements of the root (to, from,
heading, and body).
<to>John</to>  
<from>Ahmed</from>  
<heading>Reminder</heading>  
<body>Don't forget me this weekend!</body>  
And finally the last line defines the end of the root element.
</note>  
 Note: XML documents must contain a root element. This element is "the
parent" of all other elements
 The elements in an XML document form a document tree.
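The document tree of Example 1 can be inspected in code. The sketch below is only an illustration: it assumes Python and its standard-library `xml.etree.ElementTree` module, neither of which is part of these slides.

```python
import xml.etree.ElementTree as ET

# Example 1 as a string (the XML declaration is left out because the
# text is already a decoded Python string).
NOTE_XML = """<note>
  <to>John</to>
  <from>Ahmed</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE_XML)              # the root element of the tree
child_tags = [child.tag for child in root]  # direct children, in document order

print(root.tag)     # note
print(child_tags)   # ['to', 'from', 'heading', 'body']
```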


 All elements can have sub elements (child elements).


 Example 2:
<root>  
 <child>  
    <subchild>.....</subchild>  
  </child>  
</root>  
 The terms parent, child, and sibling are used to describe the
relationships between elements.
 Parent elements have children.
 Children on the same level are called siblings (brothers or sisters).


 Example:
<?xml version="1.0"?>  
<University>  
  <student>  
      <firstname>Abdi</firstname>  
      <lastname>Kemal</lastname>  
      <contact>0999044993</contact>  
      <email>abdikemal@gmail.com</email>  
      <address>  
           <city>Ambo</city>  
           <state>Oromia</state>  
           <pin>201007</pin>  
      </address>  
  </student>  
</University>   
XML Related Technologies

XML Attributes

 XML elements can have attributes, which add information about the
element.
 XML attributes enhance the properties of the elements.
 XML attribute values must always be quoted. Either single or double quotes can be used.
 Example:
<book publisher="Tata McGraw Hill"></book>  
Or
<book publisher='Tata McGraw Hill'></book>  
 Metadata should be stored as attributes and data should be stored as elements.
<book category="computer">  
<author> A &amp; B </author>  
</book>  

 

 Data can be stored in attributes or in child elements.


 Difference between attribute and sub-element:
 Attributes are part of markup, while sub elements are part of the basic
document contents.
 Example
1st way:
<book publisher="Tata McGraw Hill"> </book>  
2nd way:
<book>  
<publisher> Tata McGraw Hill </publisher>  
</book>  
 In the first way publisher is used as an attribute and in the second way
publisher is an element.
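The two ways of storing the publisher are also read differently by a program. A minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` (an illustrative choice, not something these slides prescribe):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<book publisher="Tata McGraw Hill"></book>')
as_element = ET.fromstring(
    "<book><publisher>Tata McGraw Hill</publisher></book>"
)

# 1st way: an attribute is read from the element's attribute mapping.
publisher_from_attribute = as_attribute.get("publisher")
# 2nd way: a sub-element is located in the tree and its text is read.
publisher_from_element = as_element.find("publisher").text

print(publisher_from_attribute)  # Tata McGraw Hill
print(publisher_from_element)    # Tata McGraw Hill
```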

 All elements can have text content and attributes (just like in HTML).

XML Comments

 XML comments are used to make code more understandable to other
developers.
 Comments add notes or lines that explain the purpose of XML
code.
 Syntax:
<!-- Write your comment-->  
 Note: You cannot nest one XML comment inside another.


Example:
<?xml version="1.0" encoding="UTF-8" ?>  
<!--Students marks are uploaded by months-->  
<students>  
   <student>  
      <name>Daba</name>  
      <marks>70</marks>  
   </student>  
   <student>  
      <name>Almaz</name>  
      <marks>60</marks>  
   </student>  
 </students>   

 Rules for adding XML comments:


 Don't put a comment before the XML declaration.
 You can use a comment anywhere in an XML document except
inside a tag or an attribute value.
 Don't nest a comment inside another comment.

XML Validation

 A well-formed XML document is an XML document with correct
syntax.
 It is necessary to understand what a valid XML document is before
looking at XML validation.
 An XML file can be validated in two ways:
1. against DTD (Document Type Definition)
2. against XSD (XML Schema Definition)
 DTD and XSD are used to define XML structure.
 Valid XML document:
 It must be well formed (satisfy all the basic syntax conditions)
 It must conform to a predefined DTD or XML schema


 Rules for well formed XML:


 If an XML declaration is present, it must be the first thing in the document.
 It must have exactly one root element.
 Every start tag must have a matching end tag.
 XML tags are case sensitive.
 All elements must be closed.
 All elements must be properly nested.
 All attribute values must be quoted.
 XML entities must be used for special characters (for example, &lt; for < and &amp; for &).
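Several of these rules can be checked by handing small documents to a parser. The sketch below assumes Python's standard-library `xml.etree.ElementTree` and `xml.sax.saxutils.escape`; the helper name `is_well_formed` is hypothetical and only checks well-formedness, not validity.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<a><b></b></a>"))             # True
print(is_well_formed("<a><b></a></b>"))             # False: improperly nested
print(is_well_formed("<Note></note>"))              # False: tags are case sensitive
print(is_well_formed("<a></a><b></b>"))             # False: more than one root element
print(is_well_formed("<book category=computer/>"))  # False: unquoted attribute value
print(is_well_formed("<author>A & B</author>"))     # False: bare ampersand

# escape() replaces &, < and > with the predefined entities.
safe = "<author>" + escape("A & B") + "</author>"
print(safe)                  # <author>A &amp; B</author>
print(is_well_formed(safe))  # True
```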


 DTD (Document Type Definition) defines the legal building
blocks of an XML document. It is used to define document
structure with a list of legal elements and attributes.
 XSD (XML Schema Definition) is itself written in XML,
and it uses namespaces to allow reuse of existing definitions.
It supports a large number of built-in data types and the definition of
derived data types.
 DTD and XML Schema are both used to check that a well-formed
XML document is also valid. We should avoid errors in XML
documents because a parser stops processing a document that is not well-formed.

Checking Validation using DTD

 Before validating against a DTD, check that the document is well-formed. An
XML document is called "well-formed" if it has correct
syntax.
 A valid XML document is a well-formed document that has also been
validated against a DTD.
 Example: well-formed and valid XML document.
employee.xml
<?xml version="1.0"?>  
<!DOCTYPE employee SYSTEM "employee.dtd">  
<employee>  
  <firstname>Abebe</firstname>  
  <lastname>Zewdie</lastname>  
  <email>abewinta@gmail.com</email>  
</employee>   

 In the above example, the DOCTYPE declaration refers to an
external DTD file.
 The content of the file is shown below.
employee.dtd
<!ELEMENT employee (firstname,lastname,email)>  
<!ELEMENT firstname (#PCDATA)>  
<!ELEMENT lastname (#PCDATA)>  
<!ELEMENT email (#PCDATA)>  


Description of DTD:
 <!DOCTYPE employee : defines that the root element of the
document is employee.
 <!ELEMENT employee: defines that the employee element contains 3
elements, "firstname, lastname and email", in that order.
 <!ELEMENT firstname: defines that the firstname element is of type
#PCDATA (parsed character data).
 <!ELEMENT lastname: defines that the lastname element is of type
#PCDATA (parsed character data).
 <!ELEMENT email: defines that the email element is of type
#PCDATA (parsed character data).


XML DTD with entity declaration:


 A DOCTYPE declaration can also define entities: named strings that can be
used in the XML file.
 An entity reference has three parts:
 An ampersand (&)
 An entity name
 A semicolon (;)
 Syntax to declare entity:
<!ENTITY entity-name "entity-value">  


 Example:
author.xml
<?xml version="1.0" standalone="yes" ?>  
<!DOCTYPE author [  
  <!ELEMENT author (#PCDATA)>  
  <!ENTITY jm "John Michael">  
]>  
<author>&jm;</author>  
 In the above example, jm is an entity that is used inside the author
element. The parser replaces the reference &jm; with the value of the entity, which is "John
Michael".
 Note: A single DTD can be used in many XML files
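Entity expansion in author.xml can be observed by parsing it. This is a sketch assuming Python's standard-library `xml.etree.ElementTree`, whose underlying parser expands entities declared in the internal DTD subset. The reference is written as `&jm;`, with no space between the ampersand and the entity name, as the three-part form requires.

```python
import xml.etree.ElementTree as ET

AUTHOR_XML = """<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY jm "John Michael">
]>
<author>&jm;</author>"""

author = ET.fromstring(AUTHOR_XML)

# The parser has already replaced &jm; with the entity value.
print(author.text)  # John Michael
```

Entity expansion should only be relied on for trusted documents, since deeply nested entity definitions can be abused to exhaust memory.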
XML CSS with DTD

 CSS (Cascading Style Sheets) can be used to add style and
display information to an XML document. It can format the
whole XML document.
 To link XML files with CSS, you should use the following syntax:
<?xml-stylesheet type="text/css" href="cssemployee.css"?>   


XML CSS Example:


cssemployee.css
employee  
{  
background-color: pink;  
}  
firstname,lastname,email  
{  
font-size:25px;  
display:block;  
color: blue;  
margin-left: 50px;  
}   

 Create the DTD file:


employee.dtd
<!ELEMENT employee (firstname,lastname,email)>  
<!ELEMENT firstname (#PCDATA)>  
<!ELEMENT lastname (#PCDATA)>  
<!ELEMENT email (#PCDATA)>  


 Example of XML file using CSS and DTD:


employee.xml
<?xml version="1.0"?>  
<?xml-stylesheet type="text/css" href="cssemployee.css"?>  
<!DOCTYPE employee SYSTEM "employee.dtd">  
<employee>  
  <firstname>Abebe</firstname>  
  <lastname>Zewdie</lastname>  
  <email>abewinta@gmail.com</email>  
</employee>  
 Note: CSS is not generally used to format XML files. W3C
recommends XSLT instead of CSS.

XML Schema

 An XML schema language is used for expressing constraints
on XML documents.
 Examples of schema languages are RELAX NG and XSD (XML
Schema Definition).
 An XML schema is used to define the structure of an XML
document.
 It is like a DTD but provides more control over the XML structure.

Checking Validation with XSD

 A well-formed and valid XML document is one which has been validated
against Schema.
 Example: Create a schema file:


 The XML file below references the XSD file through the xsi:schemaLocation attribute.


employee.xml
<?xml version="1.0"?>  
<employee  
xmlns="http://www.javatpoint.com"  
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
xsi:schemaLocation="http://www.javatpoint.com employee.xsd">  
  
  <firstname>Abebe</firstname>  
  <lastname>Zewdie</lastname>  
  <email>abewinta@gmail.com</email>  
</employee>  


Description of XML Schema:


 <xs:element name="employee"> : defines the element name
employee.
 <xs:complexType> : defines that the element 'employee' is complex
type.
 <xs:sequence> : defines that the complex type is a sequence of
elements.
 <xs:element name="firstname" type="xs:string"/> : defines that the
element 'firstname' is of string/text type.
 <xs:element name="lastname" type="xs:string"/> : defines that the
element 'lastname' is of string/text type.
 <xs:element name="email" type="xs:string"/> : defines that the
element 'email' is of string/text type.
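The declarations described above fit together into a schema along the following lines. This is a hypothetical reconstruction, not the original employee.xsd: the `targetNamespace` and `elementFormDefault` attributes are assumptions inferred from the namespace used in employee.xml. The sketch uses Python's standard-library `xml.etree.ElementTree` only to show that the schema is itself an XML document and to list its element declarations; validating against an XSD needs a schema-aware library, which is outside this sketch.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical reconstruction of employee.xsd (namespace attributes assumed).
EMPLOYEE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.javatpoint.com"
           xmlns="http://www.javatpoint.com"
           elementFormDefault="qualified">
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(EMPLOYEE_XSD)

# Collect the name of every xs:element declaration, in document order.
declared = [el.get("name") for el in schema.iter(f"{{{XS}}}element")]
print(declared)  # ['employee', 'firstname', 'lastname', 'email']
```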

DTD vs. XSD

CDATA and PCDATA

CDATA (Unparsed Character data):


 CDATA contains the text which is not parsed further in an XML
document. Tags inside the CDATA text are not treated as markup and
entities will not be expanded.
 Example of CDATA:
<?xml version="1.0"?>  
<!DOCTYPE employee SYSTEM "employee.dtd">  
<employee>  
<![CDATA[  
  <firstname>Abebe</firstname> 
  <lastname>Zewdie</lastname> 
  <email>abewinta@gmail.com</email> 
]]>   
</employee>   


 In the above CDATA example, the CDATA section is placed directly inside the
employee element so that its text is not parsed. The value of
employee is therefore:
<firstname>Abebe</firstname><lastname>Zewdie</lastname><email>abewinta@gmail.com</email>

PCDATA(Parsed Character Data):


 PCDATA is the text that will be parsed by a parser. Tags inside the
PCDATA will be treated as markup and entities will be expanded.
 Example:
<?xml version="1.0"?>  
<!DOCTYPE employee SYSTEM "employee.dtd">  
<employee>  
  <firstname>Abebe</firstname>  
  <lastname>Zewdie</lastname>  
  <email>abewinta@gmail.com</email>  
</employee>   

 In the above example, the employee element contains 3 more
elements, 'firstname', 'lastname', and 'email', so the parser parses further to
get the text of firstname, lastname and email and gives the
value of employee as:
Abebe Zewdie abewinta@gmail.com
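The difference between the two cases shows up directly in what a parser returns. A minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` (the DOCTYPE line is left out so the snippet does not depend on employee.dtd):

```python
import xml.etree.ElementTree as ET

CDATA_XML = """<employee><![CDATA[
  <firstname>Abebe</firstname>
  <lastname>Zewdie</lastname>
  <email>abewinta@gmail.com</email>
]]></employee>"""

PCDATA_XML = """<employee>
  <firstname>Abebe</firstname>
  <lastname>Zewdie</lastname>
  <email>abewinta@gmail.com</email>
</employee>"""

cdata_root = ET.fromstring(CDATA_XML)
pcdata_root = ET.fromstring(PCDATA_XML)

# Inside CDATA the tags are plain text, so employee has no child elements.
print(len(cdata_root))                         # 0
print("<firstname>" in cdata_root.text)        # True

# Without CDATA the same tags are parsed into three child elements.
print(len(pcdata_root))                        # 3
print([child.text for child in pcdata_root])   # ['Abebe', 'Zewdie', 'abewinta@gmail.com']
```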

XML Parsers

 An XML parser is a software library or package that provides
interfaces for client applications to work with an XML document.
 It is designed to read the XML and create a way for programs to use
XML.
 It checks that the document is well-formed; a validating parser also
checks the document against its DTD or schema.

Types of XML Parsers

 Two types of XML Parsers:


1. SAX
2. XML DOM

SAX (Simple API for XML)

 A SAX parser implements the SAX API, which is event based.
Features of SAX Parser:
 It does not create any internal structure.
 Clients do not decide which methods to call; they override the methods
of the API and place their own code inside them.
 It is an event-based parser; it works like an event handler in Java.
 It is simple and memory efficient.
 It is very fast and works for huge documents.
 Because it is event based, its API is less intuitive.
 Clients never see the whole document at once because the data is delivered in pieces.
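The override-the-callbacks style can be sketched with Python's standard-library `xml.sax` module (an illustrative choice; the slides mention Java). The handler class `MarksHandler` and the sample data, borrowed from the students example, are assumptions made for this sketch.

```python
import xml.sax

class MarksHandler(xml.sax.ContentHandler):
    """Collects the text of every <marks> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.in_marks = False
        self.buffer = []
        self.marks = []

    def startElement(self, name, attrs):
        if name == "marks":
            self.in_marks = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so it is accumulated.
        if self.in_marks:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "marks":
            self.marks.append(int("".join(self.buffer)))
            self.in_marks = False

STUDENTS_XML = b"""<students>
  <student><name>Daba</name><marks>70</marks></student>
  <student><name>Almaz</name><marks>60</marks></student>
</students>"""

handler = MarksHandler()
xml.sax.parseString(STUDENTS_XML, handler)  # the parser drives the callbacks
print(handler.marks)  # [70, 60]
```

No tree is built here: the handler keeps only the values it needs, which is why this style stays memory efficient on very large documents.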

XML DOM

 A DOM document is an object which contains all the information of an XML
document.
 The DOM Parser implements a DOM API which is very simple to use.
 DOM defines a standard way to access and manipulate XML documents.
 The Document Object Model (DOM) is a programming API for HTML and
XML documents.
 It defines the logical structure of documents and the way a document is
accessed and manipulated.
 The Document Object Model can be used with any programming language.
 The XML DOM makes a tree-structure view for an XML document.


Features of DOM Parser:


 It creates an internal structure in memory, a DOM document object, and
client applications get information about the XML document by calling methods on this object.
 It has a tree-based structure.
 It supports both read and write operations and the API is very simple to use.
 It is preferred when random access to widely separated parts of a document is
required.
 It is memory inefficient (it consumes more memory because the whole XML
document needs to be loaded into memory).
 It is comparatively slower than other parsers.
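The read and write operations of a DOM parser can be sketched with Python's standard-library `xml.dom.minidom`, a small DOM implementation. The choice of Python and the sample note document are assumptions made for illustration.

```python
from xml.dom import minidom

NOTE_XML = """<note>
  <to>John</to>
  <from>Ahmed</from>
  <body>Hello XML DOM</body>
</note>"""

doc = minidom.parseString(NOTE_XML)  # the whole document is loaded into memory

# Random access: jump straight to any element by tag name.
to_node = doc.getElementsByTagName("to")[0]
print(to_node.firstChild.nodeValue)  # John

# Write access: change existing text and append a new element.
to_node.firstChild.data = "Abdi"
heading = doc.createElement("heading")
heading.appendChild(doc.createTextNode("Reminder"))
doc.documentElement.appendChild(heading)

print(doc.getElementsByTagName("to")[0].firstChild.nodeValue)  # Abdi
print(doc.documentElement.lastChild.toxml())  # <heading>Reminder</heading>
```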


 With the DOM, we can modify or delete the content of elements and also create new elements.
 The elements and their content (text and attributes) are all known as nodes.
 For example, consider this table, taken from an HTML document:
<TABLE>  
<ROWS>   
<TR>   
<TD>A</TD>  
<TD>B</TD>   
</TR>   
<TR>  
<TD>C</TD>  
<TD>D</TD>   
</TR>   
</ROWS>  
</TABLE>  

 The Document Object Model represents this table like this:

Example 1: Load XML File

 This example parses an XML document (“note.xml”) into an XML
DOM object and extracts information from it with JavaScript.
 The XML file below contains the message.

note.xml
<?xml version="1.0" encoding="ISO-8859-1"?>    
<note>    
  <to>sonoojaiswal@javatpoint.com</to>    
  <from>vimal@javatpoint.com</from>    
  <body>Hello XML DOM</body>    
</note> 


 The HTML file that extracts the data of XML document using DOM:
xmldom.html
<!DOCTYPE html>  
<html>  
<body>  
<h1>Important Note</h1>  
<div>  
<b>To:</b> <span id="to"></span><br>  
<b>From:</b> <span id="from"></span><br>  
<b>Message:</b> <span id="message"></span>  
</div>  

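<script>  
// Create the request object (ActiveXObject is only needed for IE5/IE6).  
var xmlhttp;  
if (window.XMLHttpRequest)  
  {// code for IE7+, Firefox, Chrome, Opera, Safari  
  xmlhttp=new XMLHttpRequest();  
  }  
else  
  {// code for IE6, IE5  
  xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");  
  }  
// Synchronous request (third argument false): send() blocks until note.xml is loaded.  
xmlhttp.open("GET","note.xml",false);  
xmlhttp.send();  
// responseXML holds the parsed XML DOM document.  
var xmlDoc=xmlhttp.responseXML;  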

document.getElementById("to").innerHTML=  
  xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;  
document.getElementById("from").innerHTML=  
  xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;  
document.getElementById("message").innerHTML=  
  xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;  
</script>  
</body>  
</html>
XML Database

 An XML database is a data persistence software system used for storing large
amounts of information in XML format.
 It provides a secure place to store XML documents.
 You can query the stored data using XQuery, then export and
serialize it into the desired format.
 XML databases are usually associated with document-oriented
databases.

Types of XML databases

1. XML-enabled database
2. Native XML database (NXD)
XML-enabled Database:
 It works just like a relational database.
 It acts as an extension that converts XML documents to and from the relational form.
 It stores data in tables, in the form of rows and columns.
Native XML Database:
 It stores large amounts of data.
 Instead of a table format, it is based on a container format.
 You can query data with XPath expressions.
 It is preferred over an XML-enabled database because it is better suited
to storing, maintaining and querying XML documents.

Example of XML database

<?xml version="1.0"?>  
<contact-info>  
   <contact1>  
      <name>Abebe Zewdie</name>  
      <company>Ambo University</company>  
      <phone>(0120) 4256464</phone>  
   </contact1>  
   <contact2>  
      <name>John Michael </name>  
      <company>Ambo University</company>  
      <phone>09990449935</phone>  
   </contact2>  
</contact-info>  
 In the above example, the root element contact-info holds two contacts
(contact1 and contact2). Each one contains 3 elements: name, company and phone.
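Querying such a document with path expressions can be sketched as follows. This is an in-memory illustration only, not a database: it assumes Python's standard-library `xml.etree.ElementTree`, which supports a limited subset of XPath and no XQuery.

```python
import xml.etree.ElementTree as ET

CONTACTS_XML = """<contact-info>
  <contact1>
    <name>Abebe Zewdie</name>
    <company>Ambo University</company>
    <phone>(0120) 4256464</phone>
  </contact1>
  <contact2>
    <name>John Michael</name>
    <company>Ambo University</company>
    <phone>09990449935</phone>
  </contact2>
</contact-info>"""

root = ET.fromstring(CONTACTS_XML)

# All <name> elements anywhere below the root.
names = [n.text for n in root.findall(".//name")]
# Phone of the contact whose <name> child has the given text.
phone = root.findtext("./*[name='John Michael']/phone")

print(names)  # ['Abebe Zewdie', 'John Michael']
print(phone)  # 09990449935
```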
XML Namespaces

 A namespace is used to avoid element name conflicts in an XML document.
 It is declared using a reserved XML attribute whose name must start
with "xmlns".
 Syntax:
<element xmlns:name = "URL"> 
 Where:
 name is a namespace prefix.
 URL is a namespace identifier 
 Example: XML file
<?xml version="1.0" encoding="UTF-8"?>  
<cont:contact xmlns:cont="http://sssit.org/contact-us">  
   <cont:name>Vimal Jaiswal</cont:name>  
   <cont:company>SSSIT.org</cont:company>  
   <cont:phone>(0120) 425-6464</cont:phone>  
</cont:contact>

 In the above example:


Namespace Prefix: cont
Namespace Identifier(URL): http://sssit.org/contact-us
 Generally, a name conflict occurs when we try to mix XML documents
from different XML applications.
 Take an example with two tables:
Table1:
<table>  
  <tr>  
    <td>Aries</td>  
    <td>Bingo</td>  
  </tr>  
</table>   

 Table2: This table carries information about a computer table.


<table>  
  <name>Computer table</name>  
  <width>80</width>  
  <length>120</length>  
</table> 
 If these two XML fragments were added together, there would be a
name conflict because both contain a <table> element, although the two
elements have different content and meaning.
  


 You can get rid of name conflict:


1. By Using a Prefix
2. By Using xmlns Attribute
By Using a Prefix:
 You can easily avoid the name conflict by using a name prefix.
<h:table>  
  <h:tr>  
    <h:td>Aries</h:td>  
    <h:td>Bingo</h:td>  
  </h:tr>  
</h:table>  
<f:table>  
  <f:name>Computer table</f:name>  
  <f:width>80</f:width>  
  <f:length>120</f:length>  
</f:table>  
 Note: In this example, there is no conflict because the two table elements have
different (prefixed) names.

By Using xmlns Attribute:


 You can use xmlns attribute to define namespace with the following syntax:
<element xmlns:name = "URL">  
 Example:
<root>  
<h:table xmlns:h="http://www.abc.com/TR/html4/">  
  <h:tr>  
    <h:td>Aries</h:td>  
    <h:td>Bingo</h:td>  
  </h:tr>  
</h:table>  
  
<f:table xmlns:f="http://www.xyz.com/furniture">  
  <f:name>Computer table</f:name>  
  <f:width>80</f:width>  
  <f:length>120</f:length>  
</f:table>  
</root>   

 In the above example, each <table> element declares a namespace. When
a namespace is declared on an element, the child elements with
the same prefix are associated with the same namespace.
 Namespaces can also be declared in the root element:
<root xmlns:h="http://www.abc.com/TR/html4/"  
xmlns:f="http://www.xyz.com/furniture">  
<h:table>  
  <h:tr>  
    <h:td>Aries</h:td>  
    <h:td>Bingo</h:td>  
  </h:tr>  
</h:table>  
<f:table>  
  <f:name>Computer table</f:name>  
  <f:width>80</f:width>  
  <f:length>120</f:length>  
</f:table>  
</root>   


 Note: The namespace URI used in the above example does not have to
point to a real resource. The parser does not use it to look up information. Its
only purpose is to give the namespace a unique name.
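How a parser keeps the two table elements apart can be seen by parsing the mixed document. A minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the query prefixes `html` and `furn` are arbitrary names chosen for this sketch.

```python
import xml.etree.ElementTree as ET

MIXED_XML = """<root xmlns:h="http://www.abc.com/TR/html4/"
      xmlns:f="http://www.xyz.com/furniture">
  <h:table><h:tr><h:td>Aries</h:td><h:td>Bingo</h:td></h:tr></h:table>
  <f:table><f:name>Computer table</f:name></f:table>
</root>"""

root = ET.fromstring(MIXED_XML)

# The parser stores each name as {namespace-URI}local-name.
tags = [child.tag for child in root]
print(tags)
# ['{http://www.abc.com/TR/html4/}table', '{http://www.xyz.com/furniture}table']

# Prefixes used in queries are local; only the URIs have to match.
ns = {"html": "http://www.abc.com/TR/html4/",
      "furn": "http://www.xyz.com/furniture"}
furniture_name = root.find("furn:table/furn:name", ns).text
cells = [td.text for td in root.findall(".//html:td", ns)]

print(furniture_name)  # Computer table
print(cells)           # ['Aries', 'Bingo']
```

The URIs are never fetched; they act purely as unique strings that distinguish one table from the other.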

The Default Namespace

 A default namespace saves you from using prefixes in all the child elements.
 You can also use multiple namespaces within the same document: just
define a namespace on a child node.
 Example:
<tutorials xmlns="http://www.javatpoint.com/java-tutorial">  
  <tutorial>  
    <title>Java-tutorial</title>  
    <author>Sonoo Jaiswal</author>  
  </tutorial>  
  ...  
</tutorials>   
 Note: If you define a namespace without a prefix, all unprefixed descendant
elements are considered to belong to that namespace.
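The effect of a default namespace can be checked by listing the qualified names a parser assigns. A sketch assuming Python's standard-library `xml.etree.ElementTree`; the elided part of the example (`...`) is simply left out.

```python
import xml.etree.ElementTree as ET

TUTORIALS_XML = """<tutorials xmlns="http://www.javatpoint.com/java-tutorial">
  <tutorial>
    <title>Java-tutorial</title>
    <author>Sonoo Jaiswal</author>
  </tutorial>
</tutorials>"""

root = ET.fromstring(TUTORIALS_XML)
NS = "{http://www.javatpoint.com/java-tutorial}"

# Every unprefixed element inherits the default namespace.
all_tags = [el.tag for el in root.iter()]
all_in_default_ns = all(tag.startswith(NS) for tag in all_tags)
title = root.find(f"{NS}tutorial/{NS}title").text

print(all_in_default_ns)  # True
print(title)              # Java-tutorial
```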
