<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title> Title of text XHTML Document </title>
  </head>
  <body>
    <div class="myDiv">
      <h1> Heading of Page </h1>
      <p> Here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p>
      <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p>
      <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p>
    </div>
  </body>
</html>

Example Message
<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    ... Order something else ...
  </order>
</partorders>

XML Parsers
An XML parser is a program that reads an XML document and provides programmatic access to its contents.
Two types of parsers:
1) DOM based (Document Object Model): constructs a tree that represents the document
2) SAX based (Simple API for XML): generates events when parts of the document are encountered
Parsers can also be classified as push or pull parsers.

XML Characters
XML documents consist of carriage returns, line feeds and Unicode characters
XML is either markup or text
Markup is enclosed in < and >
Character text is the text between a start and end tag
Child elements are considered markup

White Space
Parsers consider whitespace inside text data to be significant and must pass it to the application
An application can consider whitespace significant or insignificant
Normalization is the process in which whitespace is collapsed or removed

Entities
&, <, >, ' (apostrophe), and " (double quote) are special characters
& and < may never be used directly in character data; >, ' and " can be mistaken for markup in some contexts (for example, a quote inside an attribute value) and are usually escaped as well
To use these characters we code entity references, which begin with an ampersand and end with a semicolon:
&amp;  &lt;  &gt;  &apos;  &quot;
<mytag>David&apos;s Tag</mytag>

Root Element
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>

Some Simple Data Types
XML Schema has many built-in data types, including:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time

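To make the built-in types listed above more concrete, here is a minimal Python sketch of how an application might turn the text of a simple element into a typed value. The `CONVERTERS` table and the `convert` and `parse_boolean` helpers are hypothetical names introduced for illustration; they are not part of XML Schema or of any library. Python's own parsing rules only approximate the XML Schema lexical rules, so this is not a validator.

```python
from datetime import date, time
from decimal import Decimal


def parse_boolean(text):
    """xs:boolean allows the lexical forms true, false, 1 and 0."""
    value = text.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not an xs:boolean: {text!r}")


# Hypothetical lookup table: schema type name -> Python converter.
# Assumption: Python's parsers are close enough for illustration only.
CONVERTERS = {
    "xs:string": str,
    "xs:decimal": lambda t: Decimal(t.strip()),
    "xs:integer": lambda t: int(t.strip()),
    "xs:boolean": parse_boolean,
    "xs:date": lambda t: date.fromisoformat(t.strip()),
    "xs:time": lambda t: time.fromisoformat(t.strip()),
}


def convert(type_name, text):
    """Convert element text to a Python value for the given type name."""
    return CONVERTERS[type_name](text)


print(convert("xs:integer", "36"))
print(convert("xs:date", "1970-03-27"))
print(convert("xs:boolean", "true"))
```

A real schema processor checks the lexical form strictly before converting; the sketch simply relies on each Python converter raising `ValueError` (or a subclass) for text it cannot parse.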
Simple Elements
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>

Default and Fixed Values for Simple Elements
Simple elements may have a default value OR a fixed value specified.
A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
A fixed value is also automatically assigned to the element, and you cannot specify another value. In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
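
The difference between default and fixed values can be sketched in Python. The standard library does not validate against XML Schema, so the `effective_value` function below is a hypothetical, hand-written stand-in for what a schema-aware processor would report. It assumes the element is present in the document and treats an empty element as "no value specified".

```python
import xml.etree.ElementTree as ET


def effective_value(element, default=None, fixed=None):
    """Return the value a schema-aware processor might report (illustrative only)."""
    text = element.text or ""
    if fixed is not None:
        # A fixed value fills an empty element and forbids any other value.
        if text and text != fixed:
            raise ValueError(f"<{element.tag}> must be {fixed!r}, got {text!r}")
        return fixed
    if not text and default is not None:
        # A default value only applies when nothing else was specified.
        return default
    return text


empty = ET.fromstring("<color/>")
blue = ET.fromstring("<color>blue</color>")

print(effective_value(empty, default="red"))  # red
print(effective_value(blue, default="red"))   # blue
print(effective_value(empty, fixed="red"))    # red
try:
    effective_value(blue, fixed="red")
except ValueError as err:
    print("rejected:", err)
```

With a default, the document's own value wins when one is given; with a fixed value, any different value is an error rather than an override.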