Lecture 1

Advanced XML


• Introduction to XML
– Outline the features of markup languages and list their drawbacks
– Define and describe XML
– State the benefits and scope of XML

• Exploring XML
– Describe the structure of an XML document
– Explain the lifecycle of an XML document

• Exploring XML (Cont'd)
– State the functions of editors for XML and list the popularly used editors
– State the functions of parsers for XML and list the names of commonly used parsers
– State the functions of browsers for XML and list the commonly used browsers



Objectives
• Working with XML
– Explain the steps towards building an XML document
– Define what is meant by well-formed XML
• XML Syntax
– State and describe the use of comments and processing instructions in XML
– Classify character data that is written between tags
– Describe entities, DOCTYPE declarations and attributes

Objectives
• Describe namespaces
– Define XML Namespaces
– Working with Namespaces Syntax
• Problems posed by prefixes
• Placing attributes in a namespace
• Default Namespaces
• Override default namespaces

History of Markup
• Documents recorded using paper and pen
• Typesetters formatting documents
• Tools used by typesetters to format a document

Markup Language
• A markup language defines the rules that help to add meaning to the content and structure of documents.
• Markup is classified as:
– Stylistic Markup – It determines the presentation of the document
– Structural Markup – It defines the structure of the document
– Semantic Markup – It describes the meaning of the content of the document

SGML
• Generalized Markup Language (GML) is a system for formatting documents.
• GML was fine-tuned and came to be known as Standard Generalized Markup Language (SGML).
• SGML is the origin of many later markup languages, including HTML and XML.

Features of SGML
• It describes markup languages, which allows authors to create their own tags that relate to their content.
• It needs a separate file that contains all the rules for the language, for its interpretation.
• An SGML application is a markup language derived from SGML.

Introduction to HTML
• HTML is a markup language
• Using HTML tags and elements, we can:
– Control the appearance of the page and the content
– Publish online documents and retrieve online information using the links inserted in the HTML document
– Create online forms. These forms can be used to collect information about the user, conduct transactions, and so on

HTML
• HTML is the most famous markup language derived from SGML.
• It was created to mark up technical papers so that they could be transferred across different platforms for the scientific community.
• It is now also used by non-scientific users who are concerned about their document's presentation.

Drawbacks of HTML
• Fixed tag set
• Presentation technology does not relate to the contents
• It is flat
• Clogging
• HTML is not international
• Data interchange is impossible
• Does not have a robust linking mechanism
• HTML is not reusable

HTML and XML Code Examples

HTML Code
<UL>
  <LI> TOM CRUISE
  <UL>
    <LI> CLIENT ID : 100
    <LI> COMPANY : XYZ Corp.
    <LI> Email : tom@usa.net
    <LI> Phone : 3336767
    <LI> Street Address : 25th St.
    <LI> City : Toronto
    <LI> State : Toronto
    <LI> Zip : 20056
  </UL>
</UL>

XML Code
<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE</PERSON_NAME>
    <ID>100</ID>
    <Company>XYZ Corp.</Company>
    <Email>tom@usa.net</Email>
    <Phone>3336767</Phone>
    <Street>25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP>20056</ZIP>
  </CONTACT>
</Details>
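The practical difference between the two listings is that the XML tags name each value, so a program can look fields up by name. The following is a minimal sketch of that idea, assuming Python's standard xml.etree.ElementTree module as the parser; the variable names are illustrative and not part of the lecture.

```python
import xml.etree.ElementTree as ET

# The XML contact record from the slide, as a string.
CONTACT_XML = """<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE</PERSON_NAME>
    <ID>100</ID>
    <Company>XYZ Corp.</Company>
    <Email>tom@usa.net</Email>
    <Phone>3336767</Phone>
    <Street>25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP>20056</ZIP>
  </CONTACT>
</Details>"""

root = ET.fromstring(CONTACT_XML)
contact = root.find("CONTACT")

# Each tag name says what its value means, so fields can be read by name.
record = {child.tag: child.text.strip() for child in contact}

print(record["PERSON_NAME"], record["Email"])
```

With the HTML version, the same program would have to guess the meaning of each <LI> item from its label text.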

XML - 1
• XML stands for Extensible Markup Language.
• It is a simplified subset of SGML.
• It inherits the features of SGML and combines them with the features of HTML.
• It addresses the drawbacks of HTML listed earlier.
• It allows users to define their own set of tags, and also makes it possible for others (people or programs) to understand them.
• It is more flexible than HTML.

XML - 2
• XML is a metalanguage: it is used to describe other languages.
• The data contained in an XML file can be displayed in different ways.
• It can also be offered to other applications for further processing.
• Style sheets help transform structured data into different HTML views. This enables data to be displayed on different browsers.

XML Architecture - 1
• XML supports a three-tier architecture for handling and manipulating data.
• It can be generated from existing databases using a scalable three-tier model.
• XML tags represent the logical structure of data that can be interpreted and used in various ways by different applications.
• The middle tier is used to access multiple databases and translate data into XML.

XML Architecture - 2
[Figure]

XML – A Universal Data Format
• HTML is a single markup language, but XML is a family of markup languages.
• Any type of data can be easily defined in XML.
• XML is popular because it supports a wide range of applications and is easy to use.
• XML has a structured data format, which allows it to store complex data.

Benefits of XML
• The three-tier architecture has easier scalability and better security.
• The benefits of XML are classified into the following:
– Business benefits
– Technological benefits

Business Benefits
• Information sharing:
– Allows businesses to define data formats in XML
– Provides tools to read, write and transform data between XML and other formats
• XML inside a single application:
– Powerful, flexible and extensible language
• Content delivery:
– Supports different users and channels, like digital TV, phone, web and multimedia kiosks

Business Benefits
• Other benefits:
– Data independence
– Easier to parse
– Reducing server load
– Easier to create
– Web site content
– Remote procedure calls
– e-Commerce

Technological Benefits
• Separation of data and presentation
• Semantic information
• Extensibility
• Re-use of data

XML Document Structure
[Figure]

XML Document Structure
• An XML document is composed of sets of "entities" identified by unique names.
• All documents begin with a root or document entity.
• Entities are aliases for more complex functions.
• Documents are logically composed of declarations, elements, comments, character references, and processing instructions.

Well-Formed and Valid Documents
• An XML document is considered well formed if it satisfies a minimum set of requirements defined in the XML 1.0 specification.
• The requirements ensure that correct language terms are used in the right manner.
• A valid XML document is a well-formed XML document that also conforms to the rules of a Document Type Definition (DTD).
• A DTD defines the rules that the markup in the XML document must follow.
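The well-formedness test can be tried out with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative only. This parser is non-validating, so it checks well-formedness but does not check a document against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text`, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<BOOK><TITLE>XML</TITLE></BOOK>"
bad_nesting = "<BOOK><TITLE>XML</BOOK></TITLE>"  # tags overlap instead of nesting
bad_case = "<BOOK><Title>XML</TITLE></BOOK>"     # tag names are case sensitive
two_roots = "<BOOK></BOOK><BOOK></BOOK>"         # only one root element is allowed

for sample in (good, bad_nesting, bad_case, two_roots):
    print(is_well_formed(sample), sample)
```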

XML Document Life Cycle
• Important components:
– Editors
– Parsers
– Browsers

Editors
• The main functions that editors provide:
– Add opening and closing tags to the code
– Check for validity of XML
– Verify XML against a DTD/Schema
– Perform a series of transforms over a document
– Color the XML syntax
– Display the line numbers
– Present the content and hide the code
– Complete the word

Editors
• The popularly used editors are:
– Oxygen
– XML Writer
– XML Spy
– XML Pro
– XML Mind
– XMetal

Parsers - 1
• Parsers help the computer interpret an XML file.
<?xml version="1.0"?>
<nxn>
</nxn>
Editor with the XML document -> XML document parsed by the parser -> Parsed document viewed in the browser
• There are two types of parsers:
– Non-validating parser
– Validating parser

Parsers - 2
XML file + other related files (like DTD file) -> Parser -> Data tree
• Parsers load the XML and other related files to check whether the XML document is well formed and valid

Parsers - 3
• Commonly used parsers are:
– Crimson
– Xerces
– Oracle XML Parser
– JAXP (Java API for XML Processing)
– MSXML

Browsers
• Commonly used web browsers are as follows:
– Netscape
– Mozilla
– Internet Explorer
– Firefox
– Opera

Data vs. Markup
<NAME> Tom Cruise </NAME>
• Markup: <NAME> and </NAME>
• Data: Tom Cruise

Creating an XML Document
• To create an XML document:
– State an XML declaration
– Create a root element
– Create the XML code
– Verify the document


Stating an XML Declaration
• Syntax
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
• The 'standalone' and 'encoding' attributes are optional; only the version number is mandatory. When both are present, 'encoding' must come before 'standalone'.
• 'standalone' – states whether the document depends on external markup declarations, such as an external DTD
• 'encoding' – specifies the character encoding used by the author
• XML 1.0 is the default version

Creating a Root Element
• There can only be one root element
• It describes the function of the document
• Every XML document must have a root element
Example
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<BOOK>
</BOOK>

Creating the XML Code - 1
• It is the process of creating our own elements and attributes as required by our application.
• Elements are the basic units of XML content.
• Tags tell the user agent to do something to the content encased between the start and end tag.
Parts of an element
<TITLE> FPT University </TITLE>
– Opening tag: <TITLE>
– Content: FPT University
– Closing tag: </TITLE>

Creating the XML Code - 2
• Rules that govern elements:
– At least one element is required
– XML tags are case sensitive
– End the tags correctly
– Nest tags properly
– Use legal tags
– Length of markup names
– Define valid attributes

Verify the Document
• The document should follow the XML rules, otherwise it will not be read by the browser or by any other XML reader

Comments
• A comment is information for the understanding of the user, and is to be ignored by the processor.
• Syntax
<!-- Write the comment here -->
Example
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
The example given will display only the name TOM CRUISE; the other names are treated as comments.
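The effect of the comment can be checked with a parser. This is a small sketch assuming Python's standard xml.etree.ElementTree module, whose default tree builder drops comments; the ACTORS root element is a hypothetical wrapper added only so that the snippet is a complete document.

```python
import xml.etree.ElementTree as ET

# ACTORS is a made-up root element wrapping the slide's example.
DOC = """<ACTORS>
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
</ACTORS>"""

root = ET.fromstring(DOC)

# Only NAME elements outside the comment reach the parsed tree.
names = [el.text for el in root.findall("NAME")]
print(names)
```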

Processing Instruction
• A processing instruction is a bit of information meant for the application using the XML document.
• These instructions are passed directly by the parser to the application.
• The XML declaration uses the same <? ... ?> syntax, although the XML 1.0 specification does not class it as a processing instruction.
<?xml-stylesheet type="text/xsl"?>
– Name of application: xml-stylesheet
– Instruction information: type="text/xsl"

Character Data
• The text between the start and end tags is defined as 'character data'.
• Character data may consist of any legal (Unicode) characters.
• Character data is classified into:
– PCDATA
– CDATA

PCDATA
• It stands for parsed character data.
• PCDATA is text that will be parsed by a parser.
• Tags inside the text will be treated as markup and entities will be expanded.

Predefined entities
Entity Name    Character
&lt;           <
&gt;           >
&amp;          &
&quot;         "
&apos;         '
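Entity expansion in PCDATA can be observed directly. The sketch below assumes Python's standard library: xml.etree.ElementTree for parsing and xml.sax.saxutils.escape for going the other way, from raw text to text that is safe to place between tags. The NOTE element and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing: the predefined entities are expanded into the characters they name.
note = ET.fromstring("<NOTE>5 &lt; 7 &amp; 7 &gt; 5</NOTE>")
parsed_text = note.text

# Writing: escape() replaces &, < and > so raw text can be embedded as PCDATA.
escaped_text = escape("AT&T <rocks>")

print(parsed_text)
print(escaped_text)
```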

CDATA
• It means character data.
• It will not be parsed by the parser.
• CDATA sections make it convenient to include large blocks of special characters.
• The character string ]]> is not allowed within a CDATA block, as it signals the end of the CDATA block.
Example
<SAMPLE>
<![CDATA[<DOCUMENT>
<NAME>TOM CRUISE</NAME>
<EMAIL>tom@usa.com</EMAIL>
</DOCUMENT>]]>
</SAMPLE>
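A quick way to see that CDATA content is not treated as markup is to parse it and inspect the result. This sketch assumes Python's standard xml.etree.ElementTree module and uses a shortened, single-line form of the SAMPLE example.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<SAMPLE><![CDATA[<DOCUMENT><NAME>TOM CRUISE</NAME></DOCUMENT>]]></SAMPLE>"
)

root = ET.fromstring(SAMPLE)

# The tags inside the CDATA section arrive as plain text, not as child elements.
child_count = len(root)
literal_text = root.text

print(child_count)
print(literal_text)
```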

Entities
• Entities are used to avoid typing long pieces of text repeatedly within a document.
• There are two categories of entities:
– General entities
Syntax
<!ENTITY ADDRESS "text that is to be represented by an entity">
– Parameter entities
Syntax
<!ENTITY % ADDRESS "text that is to be represented by an entity">


Examples of Entities

An example of a General entity
<!ENTITY full_address "My Address 12 Tenth Ave. Suite 12 Paris, France">
• Entity reference
– Syntax &ENTITY_NAME;
– Example &full_address;

An example of Parameter entities
< CLIENT = "&FPT;" PRODUCT = "&PRODUCT_ID;" QUANTITY = "15">
• Entity reference (parameter entity references are used only inside the DTD)
– Syntax %PARAMETER_ENTITY_NAME;
– Example %address;

The DOCTYPE declarations
• The <!DOCTYPE [..]> declaration follows the XML declaration in an XML document.
• Syntax
<?xml version="1.0"?>
<!DOCTYPE myDoc [
  ...declare the entities here....
]>
<myDoc>
  ...body of the document....
</myDoc>

Example
<!DOCTYPE CUSTOMERS [
  <!ENTITY firstFloor "15 Downing St Floor 1">
  <!ENTITY secondFloor "15 Downing St Floor 2">
  <!ENTITY thirdFloor "15 Downing St Floor 3">
]>
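To see general entities declared in a DOCTYPE being expanded, the declarations can be combined with a document body that references them. This is a sketch assuming Python's standard xml.etree.ElementTree module, which expands entities declared in the internal DTD subset; the ADDRESS element is a hypothetical name used only for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE CUSTOMERS [
  <!ENTITY firstFloor "15 Downing St Floor 1">
  <!ENTITY secondFloor "15 Downing St Floor 2">
]>
<CUSTOMERS>
  <ADDRESS>&firstFloor;</ADDRESS>
  <ADDRESS>&secondFloor;</ADDRESS>
</CUSTOMERS>"""

root = ET.fromstring(DOC)

# Each &name; reference is replaced by the text given in its declaration.
addresses = [el.text for el in root.findall("ADDRESS")]
print(addresses)
```

Each address is typed once in the declaration and can then be reused anywhere in the body through its reference.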

Attributes
• An attribute gives information about an element.
• Attributes are embedded in the element start tag.
• An attribute consists of an attribute name and an attribute value.
Example
<TV count="8">SONY</TV>
<LAPTOP count="10">IBM</LAPTOP>
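Attribute values can be read back by name once the document is parsed. The sketch below assumes Python's standard xml.etree.ElementTree module; STOCK is a hypothetical root element added so that the two example elements form one well-formed document.

```python
import xml.etree.ElementTree as ET

# STOCK is a made-up wrapper; TV and LAPTOP come from the slide.
DOC = '<STOCK><TV count="8">SONY</TV><LAPTOP count="10">IBM</LAPTOP></STOCK>'

stock = ET.fromstring(DOC)

# get() returns the attribute value as a string, so it is converted to int here.
counts = {item.tag: int(item.get("count")) for item in stock}
brands = {item.tag: item.text for item in stock}

print(counts)
print(brands)
```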



XML Namespaces - 1
• Two or more applications on the Internet may use some of the same element names. Namespaces help avoid the ambiguity that may arise.
• Namespaces also allow documents from different sources to be combined, and make it possible to identify which source each element or attribute comes from.
• A namespace is identified by a URI. The URI acts as a unique name; it does not have to point to a DTD or to any retrievable resource.

XML Namespaces - 2
• A URI (Uniform Resource Identifier) is used to identify namespaces in XML.
• URIs include Uniform Resource Names (URNs) and Uniform Resource Locators (URLs).
• A URL contains the reference for a document or an HTML page on the web.
• A URN is a universally unique name that identifies an Internet resource.

Needs of a Namespace
• Namespaces are used to overcome the conflicts that arise when DTDs are reused and extended.
• Namespaces ensure that element names do not conflict, and clarify their origins.
• Namespaces help standardize and uniquely brand elements and attributes.
• Namespaces use the URI purely as a unique identifier; a parser is not required to retrieve a DTD or any other resource from that URI.



Syntax for Namespace
• A prefix is associated with the URI that is used as a namespace.
• Syntax
xmlns:[prefix]="[URI of namespace]"
– xmlns: is a reserved attribute
• Example
xmlns:ins="http://www.fpt.edu.vn"
– A namespace needs to be declared before it is used
– It is declared in the root element of the document

Attributes and Namespaces
• An attribute belongs to its element and is not placed in a namespace unless it is prefixed.
• We can also incorporate attributes from two domains:
<sample xmlns="http://www.fpt.edu.vn"
        xmlns:tea_batch="http://www.tea.org">
  <batch-list>
    <batch type="thirdbatch">Evening Batch</batch>
    <batch tea_batch:type="thirdbatch">Tea batch III</batch>
    <batch>Afternoon Batch</batch>
  </batch-list>
</sample>
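How a parser reports namespaced elements and attributes can be sketched as follows, assuming Python's standard xml.etree.ElementTree module, which writes qualified names as {URI}local-name. The two namespace URIs are taken from the slide as http://www.fpt.edu.vn and http://www.tea.org; that reading of the slide is an assumption.

```python
import xml.etree.ElementTree as ET

DOC = """<sample xmlns="http://www.fpt.edu.vn"
        xmlns:tea_batch="http://www.tea.org">
  <batch-list>
    <batch type="thirdbatch">Evening Batch</batch>
    <batch tea_batch:type="thirdbatch">Tea batch III</batch>
    <batch>Afternoon Batch</batch>
  </batch-list>
</sample>"""

FPT = "{http://www.fpt.edu.vn}"
TEA = "{http://www.tea.org}"

root = ET.fromstring(DOC)

# Elements inherit the default namespace, so lookups must use qualified names.
batches = root.findall(f"{FPT}batch-list/{FPT}batch")

# An unprefixed attribute has no namespace; a prefixed one carries its URI.
plain_attr = batches[0].attrib
prefixed_attr = batches[1].attrib

print(root.tag)
print(plain_attr, prefixed_attr)
```

The first batch reports its attribute simply as type, while the second reports it under the tea_batch namespace URI, so the two cannot be confused.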

Namespace Application
• XSL is written in XML syntax and uses tags, elements, and attributes.
• The new XSL syntax makes use of namespaces to identify both its own tags and the formatting vocabulary tags.
• Elements with the xsl: prefix are in the http://www.w3.org/TR/WD-xsl namespace.
• Elements with the fo: prefix are in the http://www.w3.org/TR/WD-xsl/FO namespace.

Namespace Example
<book xmlns:html="http://www.w3.org/TR/WD-xsl/FO">
  <index>
    <chapter>this is chapter 1</chapter>
    <html:br/>
    <chapter>this is chapter 1</chapter>
  </index>
</book>

Default Namespace
[Figure]

Override Default Namespace
[Figure]
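As a rough illustration of a default namespace and how a nested element can override it, the following sketch assumes Python's standard xml.etree.ElementTree module. The document, its element names, and the two example.com URIs are all hypothetical and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: catalog declares a default namespace,
# and note overrides it for itself and its children.
DOC = """<catalog xmlns="http://example.com/books">
  <book>
    <title>XML Basics</title>
    <note xmlns="http://example.com/notes">
      <title>Review copy</title>
    </note>
  </book>
</catalog>"""

root = ET.fromstring(DOC)

# iter() walks the tree in document order; tags appear as {URI}name.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

The two title elements end up in different namespaces even though neither carries a prefix, because each takes the default namespace that is in scope where it appears.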

Summary-1
• A markup language defines a set of rules that adds meaning to the content and structure of documents
• XML is extensible, which means that we can define our own set of tags, and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML
• XML inherits features from SGML and includes the features of HTML. XML can be generated from existing databases using a scalable three-tier model. XML-based data does not contain information about how data should be displayed
• An XML document is composed of a set of "entities" identified by unique names

Summary-2
• A well-formed document is one that conforms to the basic rules of XML; a valid document is a well-formed document that conforms to the rules of a DTD (Document Type Definition)
• The parser helps the computer to interpret an XML file
• Steps involved in the building of an XML document are:
– Stating an XML declaration
– Creating a root element
– Creating the XML code
– Verifying the document
• Character data is classified into PCDATA and CDATA

Summary-3
• Entities are used to avoid typing long pieces of text repeatedly in a document. The two types of entities are:
– General entities
– Parameter entities
• The <!DOCTYPE [...]> declaration follows the XML declaration in an XML document.
• An attribute gives information about an element
