
Overview of XML

“The Extensible Markup Language (XML) has replaced
Java, Design Patterns, and Object Technology as the
software industry’s solution to world hunger.”

Developed By : Jay A. Vora.

 XML is...
 XML’s History and Applications…

 XML syntax and data model

 Parsing XML Documents

 Validating XML Documents

 - XML DTDs, Schema

... an eXtensible Markup Language
 ... HTML  presentation tags + your-own-tags
 ... a technology for describing structured information
 ... You need domain-specific standards and code
libraries to use it effectively.
 ... Moreover, far from making Java technology
obsolete, XML works very well with Java.
 ... It allows you to express structural hierarchy
and repeated elements without contortions.
 ... It looks similar to an HTML file.

Some History…
 SGML (Standard Generalized Markup Language)
 ISO Standard, 1986, for describing the structure
of complex documents.
 A famous SGML language: HTML!!
 Used with success in some industries that
require ongoing maintenance of massive documentation.
 Used in U.S. government & contractors, large
manufacturing companies, technical info.
 However, SGML is quite complex, so it has never
caught on in a big way.
 SGML wants to make sure that documents are
formed according to the rules for their document type.

 XML (eXtensible Markup Language)
 W3C (World Wide Web Consortium) -- recommendation in
 Simple subset (80/20 rule) of SGML: “ASCII of
the Web”, “Semantic Web”
 Canonical XML
 “normalization”, equivalence testing of XML
 SML (Simple Markup Language)
 “Reduce to the max”: No Attributes / No
Processing Instructions (PI) / No DTD / No
non-character entity-references / No CDATA
marked sections / Support for only UTF-8
character encoding / No optional features

 XML Schema
 XML Schema definition language
 X-Zoo (Xoo?), “Brave New X-World”
 Specifications: CSS • Digital Signatures • ebXML
Project Teams • ebXML •
IETF Specifications • Internationalization •
IOTP (Internet Open Trading Protocol) •
OASIS • Requirements Documents • SMIL •
SVG (Scalable Vector Graphics) • Topic Maps
• W3C Activity Pages • W3C Notes •
W3C Standards •
W3C Standards-in-progress • WAP •
WebDAV • XHTML • XLink • XPath • XSLT
 Vocabularies: DTDs • Music • P3P • RDF • RSS •
SMIL • W3C Standards •
W3C Standards-in-progress • WML • XHTML

XML Applications & Industry Initiatives
Advertising: adXML - place an ad onto an ad network or to a single vendor
Literature: Gutenberg - convert the world’s great literature into XML
Directories: dirXML - Novell’s Directory Services Markup Language (DSML)
Web Servers: apacheXML - parsers, XSL, web publishing
Travel: openTravel - information for airlines, hotels, and car rental places
News: NewsML - creation, transfer and delivery of news
Human Resources: XML-HR - standardization of HR/electronic recruiting XML definitions
International Development: IDML - improve the management and exchange of information for sustainable development
Wireless: WAP (Wireless Application Protocol) - wireless devices on the World Wide Web
Banking: MBA (Mortgage Bankers Association of America) - credit report, loan file, underwriting…
Healthcare: HL7 - DTDs for prescriptions, policies & procedures, clinical trials
 Unlike HTML, XML is case-sensitive. For example, <H1>
and <h1> are different XML tags.
 In HTML, you can omit end tags such as </p> or </li>
if it is clear from the context where a paragraph or
list item ends. In XML, you can never omit an end tag.
 In XML, elements that have a single tag without a
matching end tag must end in a /, as in
<img src="coffeecup.jpeg"/>.
 In XML, attribute values must be enclosed in quotation
marks. In HTML, quotation marks are optional.
 In HTML, you can have attribute names without values,
such as <input type="radio" name="language" checked>.
In XML, all attributes must have values, such as
checked="true" or checked="checked".
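
The rules above can be tried out with the parser that ships with Java. The following is a minimal sketch, not a definitive implementation: the class name WellFormedCheck and the method isWellFormed are made up for this illustration, and the sample strings are adapted from the slide.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of any library.
class WellFormedCheck {
    /** Returns true if the text parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so that errors are not printed to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<h1>Title</h1>",                                   // matching tags
            "<H1>Title</h1>",                                   // case mismatch
            "<ul><li>one<li>two</ul>",                          // omitted end tags
            "<img src=\"coffeecup.jpeg\"/>",                    // empty element with /
            "<img src=coffeecup.jpeg/>",                        // unquoted attribute value
            "<input type=\"radio\" name=\"language\" checked/>", // attribute without value
            "<input type=\"radio\" name=\"language\" checked=\"true\"/>"
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s);
        }
    }
}
```

Only the first, fourth, and last samples follow the XML rules; the parser rejects the HTML-style ones.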
The Structure of an XML Document:
An XML document should start with a header such as

 <?xml version="1.0"?>

or

 <?xml version="1.0" encoding="UTF-8"?>

A header is optional, but it is highly recommended.

 The header can be followed by a Document Type
Definition (DTD), such as…

 "-//Sun Microsystems,Inc">.

 DTDs are an important mechanism to ensure the
correctness of a document, but they are not required.

 Finally, the body of the XML document contains the root
element, which can contain other elements, like

<?xml version="1.0"?>
<!DOCTYPE configuration . . . >
<configuration>
   <font>
      <name>Arial</name>
      <size>26</size>
   </font>
   . . .
</configuration>

Elements and their Content:

<bibliography>                          (element type)
  <paper ID="object-fusion">            (element content)
    <authors>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>        (empty element)
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>

XML Tutorial, Bertram Ludäscher

Parsing an XML Document:
The Java library supplies two kinds of XML parsers:

 Tree parsers, such as the Document Object Model (DOM) parser.

 Streaming parsers, such as the Simple API for
XML (SAX) parser, that generate events as they read
an XML document.

 The DOM parser is easy to use for most purposes. You
would consider a streaming parser if you process
very long documents whose tree structure would use
up a lot of memory.
 The DOM parser interface is standardized by the
World Wide Web Consortium (W3C).
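
To illustrate the streaming style, here is a small sketch of a SAX handler that records element names as the parser reports them, without building a tree. The class SaxElementLister and the method elementNames are hypothetical names chosen for this example; the parser classes come from the standard javax.xml.parsers and org.xml.sax packages.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
class SaxElementLister {
    /** Returns the element names in the order their start tags are read. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        // The parser calls startElement once for every start tag it encounters.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<font><name>Arial</name><size>26</size></font>";
        System.out.println(elementNames(xml)); // prints [font, name, size]
    }
}
```

Because the handler only keeps what it needs, memory use stays small even for very long documents, which is the case where a streaming parser is worth the extra effort.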
 A Simple DOM Tree:

Document
  Element <font>
    Text: white space
    Element <name>
      Text: Arial
    Text: white space
    Element <size>
      Text: 25
    Text: white space
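
The tree above can be reproduced with the DOM parser. The sketch below assumes a small font document written with line breaks between the elements; the class DomTreeWalk and the method describeChildren are invented names for this example. It lists the children of the root element, including the white space text nodes that the parser keeps when no DTD is present.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of any library.
class DomTreeWalk {
    /** Describes each child node of the root element, in document order. */
    static List<String> describeChildren(String xml) {
        List<String> out = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                String text = child.getTextContent().trim();
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    out.add("Element <" + child.getNodeName() + ">: " + text);
                } else if (child.getNodeType() == Node.TEXT_NODE) {
                    // Line breaks and indentation between tags show up as text nodes.
                    out.add(text.isEmpty() ? "Text: white space" : "Text: " + text);
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<font>\n  <name>Arial</name>\n  <size>25</size>\n</font>";
        for (String line : describeChildren(xml)) {
            System.out.println(line);
        }
    }
}
```

For this input the root element has five children: three white space text nodes and the two elements between them, matching the diagram.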
 Validating XML Documents:
 One of the major benefits of an XML parser is that
it can automatically verify that a document has the
correct structure.
 To specify document structure, you can supply a
DTD or an XML Schema definition. A DTD or
Schema contains rules that explain how a document
should be formed.
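
As a rough sketch of DTD validation, the example below turns on validation in the DOM parser and collects the validity errors it reports. The DTD for the font element is an assumption written for this illustration, as are the class name DtdValidation and the method validate; setValidating and ErrorHandler are standard Java APIs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of any library.
class DtdValidation {
    // Assumed DTD: a font must contain exactly one name followed by one size.
    static final String DTD =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE font ["
      + "<!ELEMENT font (name, size)>"
      + "<!ELEMENT name (#PCDATA)>"
      + "<!ELEMENT size (#PCDATA)>"
      + "]>";

    /** Parses the document with DTD validation on and returns the error messages. */
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                // Validity violations are reported as recoverable errors.
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("document is not well-formed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String good = DTD + "<font><name>Arial</name><size>26</size></font>";
        String bad = DTD + "<font><name>Arial</name></font>"; // size is missing
        System.out.println("good: " + validate(good));
        System.out.println("bad:  " + validate(bad));
    }
}
```

The first document produces an empty list; the second produces at least one message because the required size element is absent.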
XML Document Type Definitions (DTDs):
• define the structure of "allowed" documents (i.e., valid wrt. a DTD)
• like a database schema => improve query formulation, execution, ...
XML Schema
• defines structure and data types
• allows developers to build their own libraries
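
To show how a schema adds data types on top of structure, here is a small sketch using the standard javax.xml.validation package. The schema for the font element is an assumption made for this example (size is declared as an integer), and the names SchemaValidation, FONT_XSD, and isValid are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of any library.
class SchemaValidation {
    // Assumed schema: font contains a string name followed by an integer size.
    static final String FONT_XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xsd:element name=\"font\">"
      + "<xsd:complexType><xsd:sequence>"
      + "<xsd:element name=\"name\" type=\"xsd:string\"/>"
      + "<xsd:element name=\"size\" type=\"xsd:int\"/>"
      + "</xsd:sequence></xsd:complexType>"
      + "</xsd:element>"
      + "</xsd:schema>";

    /** Returns true if the document conforms to FONT_XSD. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(FONT_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // structure or data type violation
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<font><name>Arial</name><size>26</size></font>"));
        System.out.println(isValid("<font><name>Arial</name><size>big</size></font>"));
    }
}
```

The second document has the right structure but fails because "big" is not an integer, a check that a DTD alone cannot express.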
Thank You ….