
Getting started with HTML

This is a short introduction to writing HTML. What is HTML? It is a special kind of text document that is used by
Web browsers to present text and graphics. The text includes markup tags such as <p> to indicate the start of a
paragraph, and </p> to indicate the end of a paragraph. HTML documents are often referred to as "Web pages". The
browser retrieves Web pages from Web servers which, thanks to the Internet, can be pretty much anywhere in the world.
Many people still write HTML by hand using tools such as NotePad on Windows, or TextEdit on the Mac. This guide
will get you up and running. Even if you don't intend to edit HTML directly and instead plan to use an HTML editor
such as Netscape Composer, or W3C's Amaya, this guide will enable you to understand enough to make better use of
such tools and how to make your HTML documents accessible on a wide range of browsers. Once you are
comfortable with the basics of authoring HTML, you may want to learn how to add a touch of style using CSS, and to
go on to try out features covered in my page on advanced HTML.
P.S. A good way to learn is to look at how other people have coded their HTML pages. To do this, click on the "View"
menu and then on "Source". On some browsers, you instead need to click on the "File" menu and then on "View
Source". Try it with this page to see how I have applied the ideas I explain below. You will find yourself developing a
critical eye as many pages look rather a mess under the hood!
For Mac users, before you can save a file with the ".html" extension, you will need to ensure that your document is
formatted as plain text. For TextEdit, you can set this with the "Format" menu's "Make Plain Text" option.
If you are looking for something else, try the advanced HTML page.
Start with a title
Every HTML document needs a title. Here is what you need to type:
<title>My first HTML document</title>
Change the text from "My first HTML document" to suit your own needs. The title text is preceded by the start tag
<title> and ends with the matching end tag </title>. The title should be placed at the beginning of your document.
To try this out, type the above into a text editor and save the file as "test.html", then view the file in a web browser. If
the file extension is ".html" or ".htm" then the browser will recognize it as HTML. Most browsers show the title in the
window caption bar. With just a title, the browser will show a blank page. Don't worry. The next section will show
how to add displayable content.
Add headings and paragraphs
If you have used Microsoft Word, you will be familiar with the built in styles for headings of differing importance. In
HTML there are six levels of headings. H1 is the most important, H2 is slightly less important, and so on down to H6,
the least important.
Here is how to add an important heading:
<h1>An important heading</h1>
and here is a slightly less important heading:
<h2>A slightly less important heading</h2>
Each paragraph you write should start with a <p> tag. The </p> is optional, unlike the end tags for elements like
headings. For example:
<p>This is the first paragraph.</p>

<p>This is the second paragraph.</p>

Adding a bit of emphasis


You can emphasize one or more words with the <em> tag, for instance:
This is a really <em>interesting</em> topic!

Adding interest to your pages with images


Images can be used to make your Web pages distinctive and greatly help to get your message across. The simple way
to add an image is using the <img> tag. Let's assume you have an image file called "peter.jpg" in the same
folder/directory as your HTML file. It is 200 pixels wide by 150 pixels high.
<img src="peter.jpg" width="200" height="150">
The src attribute names the image file. The width and height aren't strictly necessary but help to speed the display of
your Web page. Something is still missing! People who can't see the image need a description they can read in its
absence. You can add a short description as follows:
<img src="peter.jpg" width="200" height="150"
alt="My friend Peter">
The alt attribute is used to give the short description, in this case "My friend Peter". For complex images, you may
need to also give a longer description. Assuming this has been written in the file "peter.html", you can add one as
follows using the longdesc attribute:
<img src="peter.jpg" width="200" height="150"
alt="My friend Peter" longdesc="peter.html">
You can create images in a number of ways, for instance with a digital camera, by scanning an image in, or creating
one with a painting or drawing program. Most browsers understand the GIF and JPEG image formats, and newer browsers
also understand the PNG image format. To avoid long delays while the image is downloaded over the network, you
should avoid using large image files.
Generally speaking, JPEG is best for photographs and other smoothly varying images, while GIF and PNG are good
for graphic art involving flat areas of color, lines and text. All three formats support options for progressive rendering
where a crude version of the image is sent first and progressively refined.
Adding links to other pages
What makes the Web so effective is the ability to define links from one page to another, and to follow links at the click
of a button. A single click can take you right across the world!
Links are defined with the <a> tag. Let's define a link to the page defined in the file "peter.html" in the same
folder/directory as the HTML file you are editing:
This is a link to <a href="peter.html">Peter's page</a>.
The text between the <a> and the </a> is used as the caption for the link. It is common for the caption to be in blue
underlined text.
If the file you are linking to is in a parent folder/directory, you need to put "../" in front of it, for instance:
<a href="../mary.html">Mary's page</a>
If the file you are linking to is in a subdirectory, you need to put the name of the subdirectory followed by a "/" in
front of it, for instance:
<a href="friends/sue.html">Sue's page</a>
The use of relative paths allows you to link to a file by walking up and down the tree of directories as needed, for
instance:
<a href="../college/friends/john.html">John's page</a>
This first looks in the parent directory for another directory called "college", and then at a subdirectory of that
named "friends" for a file called "john.html".
To link to a page on another Web site you need to give the full Web address (commonly called a URL). For instance, to
link to www.w3.org you need to write:
This is a link to <a href="http://www.w3.org/">W3C</a>.
You can turn an image into a hypertext link, for example, the following allows you to click on the company logo to get
to the home page:
<a href="/"><img src="logo.gif" alt="home page"></a>
This uses "/" to refer to the root of the directory tree, i.e. the home page.
Three kinds of lists
HTML supports three kinds of lists. The first kind is a bulleted list, often called an unordered list. It uses the <ul> and
<li> tags, for instance:
<ul>
  <li>the first list item</li>
  <li>the second list item</li>
  <li>the third list item</li>
</ul>
Note that you always need to end the list with the </ul> end tag, but that the </li> is optional and can be left off. The
second kind of list is a numbered list, often called an ordered list. It uses the <ol> and <li> tags. For instance:
<ol>
  <li>the first list item</li>
  <li>the second list item</li>
  <li>the third list item</li>
</ol>
Like bulleted lists, you always need to end the list with the </ol> end tag, but the </li> end tag is optional and can be
left off.
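As a minimal sketch of what leaving off the optional end tags looks like, here is the same numbered list with each </li> omitted. The item text is just the placeholder wording used above, and a browser should render this the same way as the fully tagged version:

```html
<ol>
  <li>the first list item
  <li>the second list item
  <li>the third list item
</ol>
```

Each new <li> start tag implicitly closes the previous item, and the </ol> end tag closes the last one.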
The third and final kind of list is the definition list. This allows you to list terms and their definitions. This kind of list
starts with a <dl> tag and ends with </dl>. Each term starts with a <dt> tag and each definition starts with a <dd>. For
instance:
<dl>
  <dt>the first term</dt>
  <dd>its definition</dd>

  <dt>the second term</dt>
  <dd>its definition</dd>

  <dt>the third term</dt>
  <dd>its definition</dd>
</dl>
The end tags </dt> and </dd> are optional and can be left off. Note that lists can be nested, one within another. For
instance:
<ol>
  <li>the first list item</li>

  <li>
    the second list item
    <ul>
      <li>first nested item</li>
      <li>second nested item</li>
    </ul>
  </li>

  <li>the third list item</li>
</ol>
You can also make use of paragraphs and headings etc. for longer list items.
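For example, a list item can hold a heading followed by one or more paragraphs. The following is only a sketch; the headings and text are made up for illustration:

```html
<ul>
  <li>
    <h3>Packing</h3>
    <p>Pack light clothes and a waterproof jacket.</p>
    <p>Leave room in your bag for souvenirs.</p>
  </li>

  <li>
    <h3>Travel</h3>
    <p>Trains leave every hour from the main station.</p>
  </li>
</ul>
```

Here it is worth keeping the </li> end tags, as they make it easier to see where each long item finishes.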
HTML has a head and a body
If you use your web browser's view source feature (see the View or File menus) you can see the structure of HTML
pages. The document generally starts with a declaration of which version of HTML has been used, and is then
followed by an <html> tag followed by <head> and at the very end by </html>. The <html> ... </html> acts like a
container for the document. The <head> ... </head> contains the title, and information on style sheets and scripts,
while the <body> ... </body> contains the markup with the visible content. Here is a template you can copy and paste
into your text editor for creating your own pages:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title> replace with your document's title </title>
</head>
<body>

replace with your document's content

</body>
</html>
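Here is one way the template might look once filled in, as a sketch that reuses the examples from the earlier sections. It assumes the files "peter.jpg" and "peter.html" are in the same folder/directory as the page:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>My first HTML document</title>
</head>
<body>
  <h1>An important heading</h1>

  <p>This is a really <em>interesting</em> topic!</p>

  <p><img src="peter.jpg" width="200" height="150"
     alt="My friend Peter"></p>

  <p>This is a link to <a href="peter.html">Peter's page</a>.</p>
</body>
</html>
```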

Tidying up your markup


A convenient way to automatically fix markup errors is to use HTML Tidy, which also tidies the markup, making it
easier to read and easier to edit. I recommend you regularly run Tidy over any markup you are editing. Tidy is very
effective at cleaning up markup created by authoring tools with sloppy habits. Tidy is available for a wide range of
operating systems from the TidyLib Sourceforge site, and has also been integrated into a variety of HTML editing
tools.

More advanced features


Having mastered the basics, it is time to move on to more advanced features. The following will teach you how to:
• force line breaks
• introduce non-breaking spaces
• use entities for special characters
• link into the middle of pages
• use preformatted text
• flow text around images
• define clickable regions within images
• create tables
• use roll-overs and other tricks
• enable users to listen to sound files
How to force line breaks
Just occasionally, you will want to force a line break. You do this using the br element, for example when you want to
include a postal address:
<p>The Willows<br>
21 Runnymede Avenue<br>
Morton-in-the-marsh<br>
Oxfordshire OX27 3BQ</p>
The br element never needs an end-tag. In general, elements that don't take end-tags are known as empty elements.
How to introduce non-breaking spaces
Browsers automatically wrap text to fit within the margins. Line breaks can be introduced wherever space characters
appear in the markup. Sometimes you will want to prevent the browser from wrapping text between two particular
words, for instance between the two words in a brand name such as "Coca Cola". The trick is to use &nbsp; in place of
the space character, for example:
Sweetened carbonated beverages, such as Coca&nbsp;Cola,
have attained world-wide popularity.
It is bad practice to use repeated non-breaking spaces to indent text. Instead, you are advised to set the indent via style
rules.
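As a rough sketch of what such style rules might look like, the following uses the CSS text-indent and margin-left properties. The class name "indented" is a hypothetical name chosen for this example, and the style element belongs in the head of your document:

```html
<style type="text/css">
  /* indent only the first line of every paragraph */
  p { text-indent: 2em; }

  /* set a whole paragraph in from the left margin;
     "indented" is a made-up class name */
  p.indented { margin-left: 2em; text-indent: 0; }
</style>

<p>The first line of this paragraph is indented.</p>

<p class="indented">This whole paragraph is set in from the
left margin.</p>
```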
How to use entities for special characters
For copyright notices or trademarks, it is customary to include the appropriate signs:

Symbol                Entity    Example
Copyright sign        &copy;    Copyright © 1999 W3C
Registered trademark  &reg;     MagiCo ®
Trademark             &#8482;   Webfarer™

Note HTML 4.0 defines &trade; for the trademark sign but this is not yet as widely supported as &#8482;
There are a number of other entities you may find useful:

Symbol             Entity    Example
Less than          &lt;      <
Greater than       &gt;      >
Ampersand          &amp;     &
Nonbreaking space  &nbsp;
Em dash            &#8212;   —
Quotation mark     &quot;    "
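To show these entities in use, here is a short sketch with made-up sentences. The first paragraph displays markup tags as literal text rather than having the browser interpret them:

```html
<p>Paragraphs start with &lt;p&gt; and end with &lt;/p&gt;.</p>

<p>Fish &amp; chips &#8212; a seaside favorite.</p>

<p>She said &quot;hello&quot; and left.</p>
```

The &lt; and &amp; entities matter most, since a bare < or & in your text can be mistaken for the start of a tag or an entity.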

And then, there are entities for accented characters and miscellaneous symbols in the Latin-1 character set:
&nbsp; &#160; Ð &ETH; &#208;
¡ &iexcl; &#161; Ñ &Ntilde; &#209;
¢ &cent; &#162; Ò &Ograve; &#210;
£ &pound; &#163; Ó &Oacute; &#211;
¤ &curren; &#164; Ô &Ocirc; &#212;
¥ &yen; &#165; Õ &Otilde; &#213;
¦ &brvbar; &#166; Ö &Ouml; &#214;
§ &sect; &#167; × &times; &#215;
¨ &uml; &#168; Ø &Oslash; &#216;
© &copy; &#169; Ù &Ugrave; &#217;
ª &ordf; &#170; Ú &Uacute; &#218;
« &laquo; &#171; Û &Ucirc; &#219;
¬ &not; &#172; Ü &Uuml; &#220;
&shy; &#173; Ý &Yacute; &#221;
® &reg; &#174; Þ &THORN; &#222;
¯ &macr; &#175; ß &szlig; &#223;
° &deg; &#176; à &agrave; &#224;
± &plusmn; &#177; á &aacute; &#225;
² &sup2; &#178; â &acirc; &#226;
³ &sup3; &#179; ã &atilde; &#227;
´ &acute; &#180; ä &auml; &#228;
µ &micro; &#181; å &aring; &#229;
¶ &para; &#182; æ &aelig; &#230;
· &middot; &#183; ç &ccedil; &#231;
¸ &cedil; &#184; è &egrave; &#232;
¹ &sup1; &#185; é &eacute; &#233;
º &ordm; &#186; ê &ecirc; &#234;
» &raquo; &#187; ë &euml; &#235;
¼ &frac14; &#188; ì &igrave; &#236;
½ &frac12; &#189; í &iacute; &#237;
¾ &frac34; &#190; î &icirc; &#238;
¿ &iquest; &#191; ï &iuml; &#239;
À &Agrave; &#192; ð &eth; &#240;
Á &Aacute; &#193; ñ &ntilde; &#241;
Â &Acirc; &#194; ò &ograve; &#242;
Ã &Atilde; &#195; ó &oacute; &#243;
Ä &Auml; &#196; ô &ocirc; &#244;
Å &Aring; &#197; õ &otilde; &#245;
Æ &AElig; &#198; ö &ouml; &#246;
Ç &Ccedil; &#199; ÷ &divide; &#247;
È &Egrave; &#200; ø &oslash; &#248;
É &Eacute; &#201; ù &ugrave; &#249;
Ê &Ecirc; &#202; ú &uacute; &#250;
Ë &Euml; &#203; û &ucirc; &#251;
Ì &Igrave; &#204; ü &uuml; &#252;
Í &Iacute; &#205; ý &yacute; &#253;
Î &Icirc; &#206; þ &thorn; &#254;
Ï &Iuml; &#207; ÿ &yuml; &#255;
You can also use numeric character entities for the Greek letters and mathematical symbols defined in Unicode. For
more details, take a look at the list specified in the HTML 4 specification. Note that the entity names for these
characters aren't recognized in Navigator 4, so you are recommended to stick to the numeric entities instead.
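As an illustration, here are a few numeric entities for Greek letters and mathematical symbols. The numbers are the decimal Unicode code points for each character; the selection is just a sample chosen for this sketch:

```html
<p>&#945; (alpha), &#946; (beta), &#960; (pi)</p>

<p>&#8721; (summation sign), &#8804; (less than or equal to)</p>
```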
Linking into the middle of Web pages
Imagine you have written a long Web page with a table of contents near the start. How do you make the entries in the
table of contents into hypertext links to the corresponding sections?
Let's assume that each section starts with a heading, for instance:
<h2>Local Night Spots</h2>
You make the heading into a potential target for a hypertext link by enclosing its contents with
<a name="identifier"> ... </a>, for instance:
<h2><a name="night-spots">Local Night Spots</a></h2>
The name attribute specifies the name you will use to identify the link target, in this case: "night-spots". The table of
contents can now include a hypertext link using this name, for instance:
<ul>
...
<li><a href="#night-spots">Local Night Spots</a></li>
...
</ul>
The # character is needed before the target name. If the target is in a different document, then you need to place the
web address of that document before the # character. For example if the document is in "http://www.bath.co.uk/", then
the link becomes:
<a href="http://www.bath.co.uk/#night-spots">Local Night Spots</a>
In the future, you will eventually be able to define link targets without the need for the <a> element. The new method
is much easier, as all you need to do is to add an id attribute to the heading, for instance:
<h2 id="night-spots">Local Night Spots</h2>
This method doesn't work for 4th generation or earlier browsers, so it should be used with care while these browsers
are still in use!
Preformatted Text
One advantage of the Web is that text is automatically wrapped into lines fitting within the current window size.
Sometimes though, you will want to disable this behavior, for example when including samples of program code. You
do this using the pre element. For instance:
<pre>
void Node::Remove()
{
    if (prev)
        prev->next = next;
    else if (parent)
        parent->SetContent(null);

    if (next)
        next->prev = prev;

    parent = null;
}
</pre>
Which renders as:
void Node::Remove()
{
    if (prev)
        prev->next = next;
    else if (parent)
        parent->SetContent(null);

    if (next)
        next->prev = prev;

    parent = null;
}
The text and background colors were set via the style sheet. Note that all line breaks and spaces are rendered exactly
as they appear in the HTML. The exception is a newline immediately after the start tag <pre> and immediately before
the end tag </pre>, which are discarded. This means that the following two examples are rendered identically:
<pre>preformatted text</pre>

<pre>
preformatted text
</pre>
Preformatted text is generally rendered using a monospaced font where each character has the same width. If you
define a style rule for the pre element, some browsers forget to use the monospace font, necessitating the use of the
font-family property. For instance if you want to render all pre elements in green you can define the style rule:
<style type="text/css">
pre { color: green; background: white; font-family: monospace; }
</style>
When setting the foreground color for text, you are advised to also set the color for the background. This will prevent
accidents where the background color is hard to distinguish from the foreground. Rather than setting the background
color on the pre element, you may find it more convenient to set it on the body element, for instance:
<style type="text/css">
body { color: black; background: white; }
pre { color: green; font-family: monospace; }
</style>

Flowing text around images


With HTML, you can choose whether any given image is treated as part of the current text line or is floated to the left
or right margins. You control this via the align attribute. If the align attribute is set to left the image floats to the left
margin. If it is set to right the image floats to the right margin. For instance:
<p><img src="sun.jpg" alt="sunburst graphic"
width="32" height="21" align="left"> This text will be
flowed around the right side of the graphic.</p>
which renders as:
This text will be flowed around the right side of the graphic.

The following uses align="right":


<p><img src="sun.jpg" alt="sunburst graphic"
width="32" height="21" align="right"> This text will be
flowed around the left side of the graphic.</p>
which renders as:
This text will be flowed around the left side of the graphic.
To force rendering to continue below the floated image you can use the <br clear=all> element, for example:
<p><img src="sun.jpg" alt="sunburst graphic"
width="32" height="21" align="left"> This text will be
flowed around the right side of the graphic.<br clear="all">
This starts a new line below the floated image.</p>
which renders as:
This text will be flowed around the right side of the graphic.
This starts a new line below the floated image.
Clickable regions within images
The following image acts as a map of a group of Web pages. You can click on the circles to go to the corresponding
page.
The markup for this is as follows:
<p align="center">
<img src="pages.gif" width="384" height="245"
alt="site map" usemap="#sitemap" border="0">
<map name="sitemap">
<area shape="circle" coords="186,44,45"
href="Overview.html" alt="Getting Started">
<area shape="circle" coords="42,171,45"
href="Style.html" alt="A Touch of Style">
<area shape="circle" coords="186,171,45"
alt="Web Page Design">
<area shape="circle" coords="318,173,45"
href="Advanced.html" alt="Advanced HTML">
</map>
</p>
The src attribute on the img element specifies the image "pages.gif". The usemap attribute references a map element.
It uses a Web address to do so, hence the # character. The border attribute is set to "0" to suppress the blue border
around the image.
The map element specifies which regions in the image act as hypertext links. The name attribute matches the usemap
attribute on the img element and acts much like the name attribute on the <a> element. In practice, the map element
needs to be in the same file as the img element.
The area element is used to define a region on the image and to bind it to a Web address. The shape attribute specifies
"rect", "circle" or "poly". The coords attribute specifies the coordinates for the region depending on the shape.
• rect: left-x, top-y, right-x, bottom-y
• circle: center-x, center-y, radius
• poly: x1,y1, x2,y2, ... xn,yn
The top left pixel is considered as the origin of the image with x and y both equal to zero, x increases rightwards
across the image and y increases downwards. Most image manipulation tools allow you to find the pixel coordinates of
any given point in the image.
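The site map above only uses circles, so here is a sketch of the other two shapes. The image "shapes.gif", its size, the coordinates and the linked file names are all hypothetical values chosen for illustration:

```html
<img src="shapes.gif" width="300" height="200"
     alt="navigation shapes" usemap="#shapes" border="0">
<map name="shapes">
  <!-- rectangle: left-x, top-y, right-x, bottom-y -->
  <area shape="rect" coords="10,10,110,60"
        href="first.html" alt="First page">

  <!-- triangle given as three x,y pairs -->
  <area shape="poly" coords="200,20,260,120,140,120"
        href="second.html" alt="Second page">
</map>
```

The rectangle runs from the point (10,10) to the point (110,60), and the polygon joins the three listed points in order, with the last point connecting back to the first.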
If two or more defined regions overlap, the region-defining element that appears earliest in the document takes
precedence (i.e., responds to user input). For a complex shape such as an annular ring, you can make part of a region
inactive by overlaying it with another region using the nohref attribute, for example:
<area shape="circle" coords="186,44,50" nohref>
<area shape="circle" coords="186,44,100"
href="Overview.html" alt="Getting Started">
Here the first circle creates an inactive region within the larger circle created by the second area element. To have
any effect, the inactive shape needs to be placed first, as otherwise it will be hidden by the active shape.
Why you need to specify the alt attribute
The alt attribute on the area element is used to supply a text label for the link. Without it the image map is inaccessible
to people who can't see the image.
Tables
Tables are used for information as well as for layout. You can stretch tables to fill the margins, specify a fixed width
or leave it to the browser to automatically size the table to match the contents.
Tables consist of one or more rows of table cells. Here is a simple example:
Year Sales
2000 $18M
2001 $25M
2002 $36M
The markup for this is:
<table border="1">
<tr><th>Year</th><th>Sales</th></tr>
<tr><td>2000</td><td>$18M</td></tr>
<tr><td>2001</td><td>$25M</td></tr>
<tr><td>2002</td><td>$36M</td></tr>
</table>
The table element acts as the container for the table. The border attribute specifies the border width in pixels. The tr
element acts as a container for each table row. The th and td elements act as containers for heading and data cells
respectively.
Cell Padding
You can increase the amount of padding for all cells using the cellpadding attribute on the table element. For instance,
to set the padding to 10 pixels:
<table border="1" cellpadding="10">
this has the effect:

Year Sales

2000 $18M

2001 $25M

2002 $36M

Cell Spacing
By contrast the cellspacing attribute sets the space between the cells. Setting the cell spacing to 10:
<table border="1" cellpadding="10" cellspacing="10">
has the effect:

Year Sales

2000 $18M

2001 $25M

2002 $36M

Table Width
You can set the width of the table using the width attribute. The value is either the width in pixels or a percentage
value representing the percentage of the space available between the left and right margins. For instance to set the
width to 80% of the margins:
<table border="1" cellpadding="10" width="80%">
which has the effect:

Year Sales

2000 $18M

2001 $25M

2002 $36M
Text Alignment within Cells
By default browsers center heading cells (th), and left align data cells (td). You can change alignment using the align
attribute, which can be added to each cell or to the row (tr element). It is used with the values "left", "center" or
"right":
<table border="1" cellpadding="10" width="80%">
<tr align="center"><th>Year</th><th>Sales</th></tr>
<tr align="center"><td>2000</td><td>$18M</td></tr>
<tr align="center"><td>2001</td><td>$25M</td></tr>
<tr align="center"><td>2002</td><td>$36M</td></tr>
</table>
with the following result:

Year Sales

2000 $18M

2001 $25M

2002 $36M

The valign attribute plays a similar role for the vertical alignment of cell content. It is used with the values "top",
"middle" or "bottom", and can be added to each cell or row. By default, heading cells (th) position their content in the
middle of the cells while data cells align their content at the top of each cell.
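Here is a small sketch of valign in use. The data cells are made taller with line breaks so that the effect is visible; the layout is made up for this example and reuses the sales figures from above:

```html
<table border="1" cellpadding="10">
  <!-- valign on the row applies to every cell in it -->
  <tr valign="top">
    <th>Year</th>
    <td>2000<br>2001<br>2002</td>
  </tr>

  <!-- valign on a single cell applies to that cell only -->
  <tr>
    <th valign="bottom">Sales</th>
    <td>$18M<br>$25M<br>$36M</td>
  </tr>
</table>
```

In the first row the heading "Year" should sit level with "2000", and in the second row the heading "Sales" should sit level with "$36M".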
Empty Cells
One quirk is the way browsers deal with empty cells. Compare:

Year Sales

2000 $18M

2001 $25M

2002 $36M

2003

with

Year Sales

2000 $18M

2001 $25M

2002

The former occurs when a cell is empty:


<td></td>
To prevent this, include a non-breaking space:
<td>&nbsp;</td>

Cells that span more than one row or column


Let's extend the above example to break out sales by north and south sales regions:

Year  Sales
      North  South  Total
2000  $10M   $8M    $18M
2001  $14M   $11M   $25M

The heading "Year" now spans two rows, while the heading "Sales" spans three columns. This is done by setting the
rowspan and colspan attributes respectively. The markup for the above is:
<table border="1" cellpadding="10" width="80%">
<tr align="center"><th rowspan="2">Year</th><th colspan="3">Sales</th></tr>
<tr align="center"><th>North</th><th>South</th><th>Total</th></tr>
<tr align="center"><td>2000</td><td>$10M</td><td>$8M</td><td>$18M</td></tr>
<tr align="center"><td>2001</td><td>$14M</td><td>$11M</td><td>$25M</td></tr>
</table>
You can simplify this by taking advantage of the fact that browsers don't need the end tags for table cells and rows:
<table border="1" cellpadding="10" width="80%">
<tr align="center"><th rowspan="2">Year<th colspan="3">Sales
<tr align="center"><th>North<th>South<th>Total
<tr align="center"><td>2000<td>$10M<td>$8M<td>$18M
<tr align="center"><td>2001<td>$14M<td>$11M<td>$25M
</table>
Notice that as the heading "Year" spans two rows, the first th element on the second row appears on the second rather
than the first column.
Borderless tables
These are commonly used for laying out pages in a gridded fashion. All you need to do is to add border="0" and
cellspacing="0" to the table element:

Year Sales

2000 $18M

2001 $25M

2002 $36M

This was produced using the following markup:


<table border="0" cellspacing="0" cellpadding="10">
<tr><th>Year</th><th>Sales</th></tr>
<tr><td>2000</td><td>$18M</td></tr>
<tr><td>2001</td><td>$25M</td></tr>
<tr><td>2002</td><td>$36M</td></tr>
</table>
If you leave out the cellspacing attribute you will get a thin gap between the cells, as shown below:

Year Sales

2000 $18M

2001 $25M
2002 $36M

Coloring your tables


This page uses a style sheet to set the background colors for tables, with a different color for heading and data cells.
The style rules I used are as follows:
table {
margin-left: -4%;
font-family: sans-serif;
background: white;
border-width: 2px;
border-color: white;
}
th { font-family: sans-serif; background: rgb(204, 204, 153) }
td { font-family: sans-serif; background: rgb(255, 255, 153) }
The last two lines above set the background color for th and td cells to given red/green/blue values. The numbers are
in the range 0 to 255 (fully saturated).
Another way to set the background color is to use the bgcolor attribute. This works with nearly all browsers, and
doesn't rely on style sheet support. The first step is to determine the hexadecimal values for the red, green and blue
components of the color you wish to use. A converter is included in the style page.
<table border="0" cellspacing="0" cellpadding="10">
<tr>
<th bgcolor="#CCCC99">Year</th>
<th bgcolor="#CCCC99">Sales</th>
</tr>
<tr>
<td bgcolor="#FFFF66">2000</td>
<td bgcolor="#FFFF66">$18M</td>
</tr>
<tr>
<td bgcolor="#FFFF66">2001</td>
<td bgcolor="#FFFF66">$25M</td>
</tr>
<tr>
<td bgcolor="#FFFF66">2002</td>
<td bgcolor="#FFFF66">$36M</td>
</tr>
</table>
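To make the conversion concrete, here is the arithmetic worked by hand for the two colors used in the markup above. Each red, green or blue component is written as two hexadecimal digits, where the first digit counts sixteens and the second counts units. The comments in this sketch show the working:

```html
<!-- #CCCC99 is rgb(204, 204, 153):
     204 = 12*16 + 12, which is CC
     153 =  9*16 +  9, which is 99 -->
<th bgcolor="#CCCC99">Year</th>

<!-- #FFFF66 is rgb(255, 255, 102):
     255 = 15*16 + 15, which is FF
     102 =  6*16 +  6, which is 66 -->
<td bgcolor="#FFFF66">2000</td>
```

The digits A to F stand for the values 10 to 15, so C is 12 and F is 15.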

Centering your tables


You can position your tables midway between the left and right margins by using some CSS. If your style sheet
includes the following rule, then all tables will be centered:
table {
margin-left: auto;
margin-right: auto;
}
You can make this specific to a given table by giving it an id value, or by setting a class. The following example
applies to tables with a class attribute value of centered:
First here is the style rule:
table.centered {
margin-left: auto;
margin-right: auto;
}
and here is the table markup:
<table class="centered" border="1">
<tr><th>Year</th><th>Sales</th></tr>
<tr><td>2000</td><td>$18M</td></tr>
<tr><td>2001</td><td>$25M</td></tr>
<tr><td>2002</td><td>$36M</td></tr>
</table>
and here is how it is rendered in your browser:
Year Sales
2000 $18M
2001 $25M
2002 $36M
Note that you can replace the border attribute by CSS rules for greater control over the appearance of table and cell
borders. See the style guide for examples of how to set border styles.
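As a rough sketch of the idea, the following sets the borders with CSS instead of the border attribute. The class name "ruled" and the particular widths and colors are assumptions made for this example:

```html
<style type="text/css">
  /* "ruled" is a made-up class name */
  table.ruled {
    border-collapse: collapse;
    border: 2px solid black;
  }
  table.ruled th, table.ruled td {
    border: 1px solid gray;
    padding: 10px;
  }
</style>

<table class="ruled">
  <tr><th>Year</th><th>Sales</th></tr>
  <tr><td>2000</td><td>$18M</td></tr>
  <tr><td>2001</td><td>$25M</td></tr>
</table>
```

The border-collapse property merges the borders of adjacent cells into a single line, and the padding property plays the role of the cellpadding attribute.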
Making your tables accessible
If you are unable to see the table it can be quite hard to understand what the table is about. The first step is to add
information describing the purpose and structure of the table. The caption element allows you to provide a caption,
and to position this above or below the table. The caption element should appear before the tr element for the first row.

Projected sales revenue by year

Year Sales

2000 $18M

2001 $25M

which was produced by the following markup:


<table border="1" cellpadding="10" width="80%">
<caption>Projected sales revenue by year</caption>
<tr align="center">
<th>Year</th><th>Sales</th>
</tr>
<tr align="center"><td>2000</td><td>$18M</td></tr>
<tr align="center"><td>2001</td><td>$25M</td></tr>
</table>
Here is the same table with align="bottom" added to the caption element:

Year Sales

2000 $18M

2001 $25M

Projected sales revenue by year

The table element's summary attribute should be used to describe the structure of the table for people who can't see the
table. For instance: "the first column gives the year and the second, the revenue for that year".
<table summary="the first column gives the year
and the second, the revenue for that year">
Specifying the relation between header and data cells
When a table is rendered to audio or to Braille, it is useful to be able to identify which headers describe each cell. For
instance, an aural browser could allow you to move up and down or left and right from cell to cell, with the
appropriate headers spoken before each cell.
To support this you need to annotate the header and/or data cells. The simplest approach is to add the scope attribute to
header cells. It may be used with the following values:
• row: The current cell provides header information for the rest of the row that contains it.
• col: The current cell provides header information for the rest of the column that contains it.
Applying this to the example table gives:
<table border="1" cellpadding="10" width="80%">
<caption>Projected sales revenue by year</caption>
<tr align="center">
<th scope="col">Year</th>
<th scope="col">Sales</th>
</tr>
<tr align="center"><td>2000</td><td>$18M</td></tr>
<tr align="center"><td>2001</td><td>$25M</td></tr>
</table>
For more complex tables, you can use the headers attribute on individual data cells to provide a space separated list of
identifiers for header cells. Each such header cell must have an id attribute with a matching identifier.
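As an illustration, here is a sketch of the headers attribute applied to a cut-down version of the regional sales table from earlier. The id values and the caption text are made up for this example:

```html
<table border="1" cellpadding="10">
  <caption>Projected sales revenue by region</caption>
  <tr>
    <th id="year" rowspan="2">Year</th>
    <th id="sales" colspan="2">Sales</th>
  </tr>
  <tr>
    <th id="north" headers="sales">North</th>
    <th id="south" headers="sales">South</th>
  </tr>
  <tr>
    <td headers="year">2000</td>
    <td headers="sales north">$10M</td>
    <td headers="sales south">$8M</td>
  </tr>
</table>
```

With this markup, a browser reading the cell "$10M" aloud has what it needs to announce the headers "Sales" and "North" first.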
A final point is that you should consider using the abbr attribute to specify an abbreviation for long headers. This
makes it tolerable to listen to lists of headers for each cell, for instance:
<th abbr="W3C">World Wide Web Consortium</th>

Roll-Overs and other tricks


A little JavaScript can go a long way to enliven your pages. You will be shown below how to create "rollovers" where
the appearance of a link changes as you move the mouse over it. You will also learn how to create cycling banner ads
which help to direct visitors to your sponsors' sites.
Roll-Overs
In the most common form, a roll-over consists of an image serving as a hypertext link. While the mouse pointer is
over the image, it changes appearance to attract attention to the link. For example, you could add a glow effect, a drop
shadow or simply change the background color. Here is an example:
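<script type="text/javascript">
// preload both images so that the swap is immediate
var rollovers = {};

if (document.images)
{
  rollovers.image1 = new Image();
  rollovers.image2 = new Image();
  rollovers.image1.src = "enter1.gif";
  rollovers.image2.src = "enter2.gif";
}

// point the named img element at one of the preloaded images
function chgImg(name, image)
{
  if (document.images)
  {
    document[name].src = rollovers[image].src;
  }
}
</script>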

...

<a href="/" onMouseOver='chgImg("enter", "image2")'
 onMouseOut='chgImg("enter", "image1")'><img name="enter"
 src="enter1.gif" border="0" alt="Enter if you dare!"></a>
and here is what it looks like ...
I created these images using a freeware painting tool by adding a hot wax effect and then a drop shadow to the text.
You can find lots of advice and royalty-free clipart on the Web via most search engines.
Banner Ads
If your website has several sponsors, then you can use an image link that cycles through each of the sponsors in turn.
The first step is to create an image for each of your sponsors. All the images should have the same size. The
corresponding URLs for the images and for the websites are then placed into the arrays named adImages and adURLs
defined at the start of the script. The img element for the link should be initialized to the first image in the array. The
cycle is started off using the onload event on the body element.
<html>
<head>
<title>cycling banner ads</title>
<script type="text/javascript">
// Banner images and the matching sponsor sites, in the same order
if (document.images)
{
  var adImages = new Array("hosts/csail.gif",
      "hosts/ercim.gif", "hosts/keio.gif");
  var adURLs = new Array("www.csail.mit.edu",
      "www.ercim.org", "www.keio.ac.jp");
  var thisAd = 0;
}

function cycleAds()
{
  if (document.images)
  {
    // only move on once the current image has finished loading
    if (document.adBanner.complete)
    {
      if (++thisAd == adImages.length)
        thisAd = 0;

      document.adBanner.src = adImages[thisAd];
    }
  }

  // change to next sponsor every 3 seconds
  setTimeout(cycleAds, 3000);
}

// Follow the link for the sponsor currently on display
function gotoAd()
{
  document.location.href = "http://" + adURLs[thisAd];
}
</script>
</head>
<body onload="cycleAds()">
...

<a href="javascript:gotoAd()"><img name="adBanner"
 src="hosts/csail.gif" border="0" alt="Our sponsors"></a>
Note: make sure that all of the images have the same width and height. An alternative is to add
width and height attributes to the img element to ensure the images are all shown at the same size.
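For instance, the banner link from the example above could fix the display size like this. The pixel dimensions are placeholders; use the size of your own images:

```html
<a href="javascript:gotoAd()"><img name="adBanner"
 src="hosts/csail.gif" width="200" height="60"
 border="0" alt="Our sponsors"></a>
```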
What about browsers that don't support scripting?
The content of a noscript element is only shown if the browser doesn't support scripting, or if scripting has been
turned off. Use it when you want to give those users access to information that would otherwise be out of reach.
Let's assume that you want to make the links for your sponsors available as text:
<noscript>
Our sponsors: <a href="http://www.lcs.mit.edu/">MIT</a>,
<a href="http://www.inria.fr/">INRIA</a>, and
<a href="http://www.keio.ac.jp/">Keio University</a>.
</noscript>
There are many free sources of information about scripting, which can be easily found via most search engines.
Enable users to listen to sound files
Let's assume that you and your friends have got together to record some music in your garage, and you now want to
get this out to the listening public. The first step is to compress the recorded audio, e.g. as an mp3 file and upload it to
your website. For explanatory purposes, let's assume that this is then available at:
http://example.com/music/myband.mp3. In the examples below you should replace this with the correct location
for your website.
The next step is to create a playlist file with the file extension .m3u. This avoids the lengthy wait before users start to
hear the music that typically occurs if you link directly to the mp3 file. You can create the playlist with a text editor
and it just needs to include the URL for the mp3 file. For the example sound file, this would be:
http://example.com/music/myband.mp3
Upload the m3u playlist file to your web server. You can now add a link to your band's web page as follows:
<a href="http://example.com/music/myband.m3u"
type="audio/x-mpegurl">listen to our band</a>
You may also need to check with the web server administrator that the correct MIME types are set for both mp3
and m3u files.
• m3u extension with audio/x-mpegurl
• mp3 extension with audio/mpeg
Note that the above approach works best for people with broadband connections. You should consider providing a lower
quality version of the mp3 file for people on low speed connections (i.e. at a much lower data rate than 128 kbps).
A very similar approach can also be used for Ogg Vorbis sound files (file extension ".ogg"; the MIME type is "audio/ogg",
with "application/ogg" used by older configurations). These can be used with either m3u or pls playlist files, but not
all music players will be configured to support this. Consult a search engine for more details on this and other audio formats.
Getting Further Information
W3C's Recommendation for HTML 4.0 is the authoritative specification for HTML. However, it is a technical
specification. For a less technical source of information you may want to purchase one of the many books on HTML,
for example "Raggett on HTML 4", published 1998 by Addison Wesley. XHTML 1.0 is now a W3C
Recommendation.

Chapter 2
CSS
This is chapter 2 of the book Cascading Style Sheets, designing for the Web, by Håkon Wium Lie and Bert Bos (2nd
edition, 1999, Addison Wesley, ISBN 0-201-59625-3)
As we explained in the previous chapter, HTML elements enable Web page designers to mark up a document as to its
structure. The HTML specification lists guidelines on how browsers should display these elements. For example, you
can be reasonably sure that the contents of a strong element will be displayed bold-faced. Also, you can pretty much
trust that most browsers will display the content of an h1 element using a big font size... at least bigger than the p
element and bigger than the h2 element. But beyond trust and hope, you don't have any control over how your text
appears.
CSS changes that. CSS puts the designer in the driver's seat. We devote much of the rest of this book to explaining
what you can do with CSS. In this chapter, we begin by introducing you to the basics of how to write style sheets and
how CSS and HTML work together to describe both the structure and appearance of your document.
Rules and Style Sheets
To start using CSS, you don't even have to write style sheets. Chapter 16 will tell you how to point to existing style
sheets on the Web.
There are two ways to create style sheets. You can either use a normal text editor and write the style sheets "by hand," or
you can use a dedicated tool - for example a Web page design application - which supports CSS. The dedicated tools
allow you to create style sheets without learning the syntax of the CSS language. However, in many cases the designer
will want to tweak the style sheet by hand afterwards, so we recommend that you learn to write and edit style sheets by
hand. Let's get started!
H1 { color: green }
What you see above is a simple CSS style sheet that contains one rule. A rule is a statement about one stylistic aspect of one
or more elements. A style sheet is a set of one or more rules that apply to an HTML document. The rule above sets the
color of all first-level headings (h1). Let's take a quick look at what the visual result of the rule could be:
Figure 2.1

We will now start dissecting the rule.


Anatomy of a rule
A rule consists of two parts:
• Selector - the part before the left curly brace
• Declaration - the part within the curly braces

The selector is the link between the HTML document and the style. It specifies what elements are affected by the
declaration. The declaration is that part of the rule that sets forth what the effect will be. In the example above, the
selector is h1 and the declaration is "color: green." Hence, all h1 elements will be affected by the declaration, that is,
they will be turned green. (The color property just affects the foreground text color; there are other properties for
background, border, etc.)
The above selector is based on the type of the element: it selects all elements of type "h1." This kind of selector is
called a type selector. Any HTML element type can be used as a type selector. Type selectors are the simplest kind of
selectors. We discuss other kinds of selectors in Chapter 4, "CSS selectors."
Anatomy of a declaration
A declaration has two parts separated by a colon:
• Property - that part before the colon
• Value - that part after the colon

The property is a quality or characteristic that something possesses. In the previous example, it is color. CSS2 (see
separate box) defines around 120 properties and we can assign values to all of them.
CSS specifications
Cascading Style Sheets is formally described in two specifications from W3C: CSS1 and CSS2. CSS1 was issued in
December 1996 and describes a simple formatting model mostly for screen-based presentations. CSS1 has around 50
properties (for example color and font-size). CSS2 was finalized in May 1998 and builds on CSS1. CSS2 includes all
CSS1 properties and adds around 70 of its own, such as properties to describe aural presentations and page breaks. In
this book we do not try to distinguish between CSS1 and CSS2 and use the term "CSS" unless the distinction is
important. Most features described in the first four chapters are part of CSS1. If you would like to read the CSS
specifications themselves, you can find them at:
http://www.w3.org/TR/REC-CSS1
http://www.w3.org/TR/REC-CSS2

The value is a precise specification of the property. In the example, it is "green," but it could just as easily be blue, red,
yellow, or some other color.
The diagram below shows all ingredients of a rule. The curly braces ({ }) and colon (:) make it possible for the
browser to distinguish between the selector, property, and value.
Figure 2.2 Diagram of a rule.

Grouping selectors and rules


In designing CSS, brevity was a goal. We figured that if we could reduce the size of style sheets, we could enable
designers to write and edit style sheets "by hand." Also, short style sheets load faster than longer ones. CSS therefore
includes several mechanisms to shorten style sheets by way of grouping selectors and declarations.
For example, consider these three rules:
H1 { font-weight: bold }
H2 { font-weight: bold }
H3 { font-weight: bold }

All three rules have exactly the same declaration - they set the font to be bold. (This is done using the font-weight
property, which we discuss in Chapter 5, "Fonts.") Since all three declarations are identical, we can group the selectors into a
comma-separated list and only list the declaration once, like this:
H1, H2, H3 { font-weight: bold }
This rule will produce the same result as the first three.
A selector may have more than one declaration. For example, we could write a style sheet with these two rules:
H1 { color: green }
H1 { text-align: center }
In this case, we set all h1s to be green and to be centered on the canvas. (This is done using the text-align property,
discussed in Chapter 5 .)
But we can achieve the same effect faster by grouping the declarations that relate to the same selector into a
semicolon-separated list, like this:
H1 {
color: green;
text-align: center;
}
All declarations must be contained within the pair of curly braces. A semicolon separates the declarations and may -
but doesn't have to - also appear at the end of the last declaration. Also, to make your code easier to read, we suggest
you place each declaration on its own line, as we did here. (Browsers won't care; they'll just ignore all the extra
whitespace and line breaks.)
Now you have the basics of how to create CSS rules and style sheets. However, you're not done yet. In order for the
style sheet to have any effect you have to "glue" your style sheet to your HTML document.
"Gluing" Style Sheets to the Document
For any style sheet to affect the HTML document, it must be "glued" to the document. That is, the style sheet and the
HTML document must be combined so that they can work together to present the document. This can be done in any
of four ways:
1. Apply the basic, document-wide style sheet for the document by using the style element.
2. Apply a style sheet to an individual element using the style attribute.
3. Link an external style sheet to the document using the link element.
4. Import a style sheet using the CSS @import notation.
In the next section, we discuss the first method: using the style element. We discuss using the style attribute in Chapter
4 , "CSS selectors," and using the link element and the @import notation in Chapter 16 , "External style sheets."
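To give a feel for the four methods before going into detail, here is a rough sketch that uses all of them in one document. The file names bach.css and extra.css are invented for the example, and the rules themselves are only placeholders:

```html
<HTML>
  <TITLE>Bach's home page</TITLE>
  <!-- link element: points to an external style sheet -->
  <LINK REL="stylesheet" TYPE="text/css" HREF="bach.css">
  <!-- style element: document-wide style sheet -->
  <STYLE TYPE="text/css">
    /* @import notation: must come before any other rules */
    @import url(extra.css);
    H1 { color: green }
  </STYLE>
  <BODY>
    <H1>Bach's home page</H1>
    <!-- style attribute: applies to this one element only -->
    <P STYLE="font-style: italic">Johann Sebastian Bach was a
    prolific composer.
  </BODY>
</HTML>
```

In practice you would rarely need all four at once; the sketch only shows where each one goes.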
Gluing by using the STYLE element
You can glue the style sheet and the HTML document together by putting the style sheet inside a style element at the
top of your document. The style element was introduced in HTML specifically to allow style sheets to be inserted
inside HTML documents. Here's a style sheet (shown in bold) glued to a sample document by using the style element.
The result is shown in Figure 2.3 .
<HTML>
<TITLE>Bach's home page</TITLE>
<STYLE>
H1, H2 { color: green }
</STYLE>
<BODY>
<H1>Bach's home page</H1>
<P>Johann Sebastian Bach was a prolific
composer. Among his works are:
<UL>
<LI>the Goldberg Variations
<LI>the Brandenburg Concertos
<LI>the Christmas Oratorio
</UL>
<H2>Historical perspective</H2>
<P>Bach composed in what has been referred to as
the Baroque period.
</BODY>
</HTML>
Figure 2.3 The result of adding to a style sheet a rule to turn h1 and h2 elements green and then gluing the style sheet
to the document using the style element.
Notice that the style element is placed after the title element and before the body element. The title of a document does
not show up on the canvas, so it is not affected by CSS styles.
The content of a style element is a style sheet. However, whereas the content of such elements as h1, p, and ul appears
on the canvas, the content of a style element does not show on the canvas. Rather, it is the effect of the content of the
style element - the style sheet - that appears on the canvas. So you don't see "{ color: green }" displayed on your
screen; you see instead the h1 and h2 elements colored green. No rules have been added that affect any of the other elements,
so those elements appear in the browser's default color.
Browsers and CSS
For an updated overview of available browsers, see the W3C overview page.
For CSS to work as described in this book, you must use a CSS-enhanced browser, that is, a browser that supports
CSS. A CSS-enhanced browser will recognize the style element as a container for a style sheet and present the
document accordingly. Most browsers that are distributed today support CSS, for example Microsoft Internet Explorer
4 (IE4), Netscape Navigator 4 (NS4) and Opera 3.5 (O3.5). Conservative estimates indicate that more than half the
people on the Web use a CSS-enhanced browser, and the figures are steadily rising. Chances are that the people you
communicate with have CSS-enhanced browsers. If not, give them a reason to upgrade!
The best source for information on how different browsers support CSS is WebReview's charts.
Alas, not all CSS implementations are perfect. When you start experimenting with style sheets, you will soon notice
that each browser comes with a set of bugs and limitations. In general, newer browsers behave better than older ones.
IE4 and O3.5 are among the best, and Netscape's next offering - code-named Gecko - also promises much improved
support for CSS.
Those who don't use CSS-enhanced browsers can still read pages that use style sheets. CSS was carefully designed so
that all content should remain visible even if the browser knows nothing about CSS. Some browsers, such as
Netscape's Navigator versions 2 and 3, don't support style sheets, but they know enough about the style element to fully
ignore it. Next to supporting style sheets, this is the correct behavior.
However, other browsers that do not know the style element, such as Netscape's Navigator 1 and Microsoft Internet
Explorer 2, will ignore the style tags but display the content of the style element. Thus, the user will end up with the
style sheet printed on the top of the canvas. At the time of writing, only a few percent of Web users will experience
this problem. To avoid this, you can put your style sheet inside an HTML comment, which we discussed in Chapter 1 .
Because comments don't display on the screen, by placing your style sheet inside an HTML comment, you prevent the
oldest browsers from displaying the style element's content. CSS-enhanced browsers are aware of this trick, and will
treat the content of the style element as a style sheet.
Recall that HTML comments start with <!-- and end with -->. Here's an excerpt from the previous code example that
shows how you write a style sheet in an HTML comment. The comment encloses the style element content only:
<HTML>
<TITLE>Bach's home page</TITLE>
<STYLE>
<!--
H1 { color: green }
-->
</STYLE>
<BODY>
..
</BODY>
</HTML>
CSS also has its own set of comments that you can use within the style sheet. A CSS comment begins with "/*" and
ends with "*/." (Those familiar with the C programming language will recognize these.) CSS rules inside a CSS
comment will not have any effect on the presentation of the document.
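As a brief illustration, the style sheet below uses one CSS comment as a note and another to switch off a rule. The second rule is there only to show the effect:

```html
<STYLE>
  /* headings use the house color */
  H1 { color: green }
  /* H1 { text-align: center } */
</STYLE>
```

Here the h1 elements turn green but are not centered, because the second rule sits inside a comment.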
The browser also needs to be told that you are working with CSS style sheets. CSS is currently the only style sheet
language in use with HTML documents and we don't expect this to change. For XML the situation might be different.
But just as there is more than one image format (GIF, JPEG and PNG come to mind), there could be more than one
style sheet language. So it's a good habit to tell browsers that they are dealing with CSS. (In fact, HTML requires you
to.) This is done with the type attribute of the style element. The value of type indicates what type of style sheet is
being used. For CSS, that value is "text/css." The following is an excerpt from our previous sample document that
shows you how you would write this (in combination with the use of the HTML comment):
<HTML>
<TITLE>Bach's home page</TITLE>
<STYLE TYPE="text/css">
<!--
H1 { color: green }
-->
</STYLE>
<BODY>
..
</BODY>
</HTML>
When the browser loads a document, it checks to see if it understands the style sheet language. If it does, it will try to
read the sheet; otherwise it will ignore it. The type attribute (see Chapter 1 for a discussion on HTML attributes) on
the style element is a way to let the browser know which style sheet language is being used. The type attribute must be
included.
To make examples easier to read, we have chosen not to wrap style sheets in HTML comments, but we do use the type
attribute throughout this book.
Tree structures and inheritance
Recall from Chapter 1 the discussion about HTML representing a document with a tree-like structure and how
elements in HTML have children and parents. There are many reasons for having tree-structured documents. For style
sheets, there is one very good reason: inheritance. Just as children inherit from their parents, so do HTML elements.
Instead of inheriting genes and money, HTML elements inherit stylistic properties.
Let's start by taking a look at the sample document:
<HTML>
<TITLE>Bach's home page</TITLE>
<BODY>
<H1>Bach's home page</H1>
<P>Johann Sebastian Bach was a
<STRONG>prolific</STRONG> composer. Among his
works are:
<UL>
<LI>the Goldberg Variations
<LI>the Brandenburg Concertos
<LI>the Christmas Oratorio
</UL>
</BODY>
</HTML>
The tree structure of this document is:

Through inheritance, CSS property values set on one element will be transferred down the tree to its descendants. For
example, our examples have up to now set the color to be green for h1 and h2 elements. Now, say, you would like to
set the same color on all elements in your document. You could do this by listing all element types in the selector:
<STYLE TYPE="text/css">
H1, H2, P, LI { color: green }
</STYLE>
However, most HTML documents are more complex than our sample document, and your style sheet would soon get
long. There is a better - and shorter - way. Instead of setting the style on each element type, we set it on their common
ancestor, the body element:
<STYLE TYPE="text/css">
BODY { color: green }
</STYLE>
Since other elements inherit properties from the body element, they will all inherit the color green (Figure 2.4 ).
Figure 2.4 The result of inheritance.

As you have seen above, inheritance is a transport vehicle that will distribute stylistic properties to descendants of an
element. Since the body element is a common ancestor for all visible elements, body is a convenient selector when you
want to set stylistic rules for the entire document.
Overriding Inheritance
In the previous example, all elements were given the same color through inheritance. Sometimes, however, children
don't look like their parents. Not surprisingly, CSS also accounts for this. Say you would like for h1 elements to be
blue while the rest should be green. This is easily expressed in CSS:
<STYLE TYPE="text/css">
BODY { color: green }
H1 { color: navy }
</STYLE>
Since h1 is a child element of body (and thereby inherits from body), the two rules in the above style sheet are
conflicting. The first one sets the color of the body element - and thereby also the color of h1 through inheritance -
while the second one sets the color specifically on the h1 element. Which rule will win? Let's find out:
The reason why the second rule wins is that it is more specific than the first. The first rule is very general - it affects all
elements on the canvas. The second rule only affects h1 elements in the document and is therefore more specific.
If CSS had been a programming language, the order in which the rules were specified would determine which of them
would win. CSS is not a programming language, and in the above example, the order is irrelevant. The result is exactly
the same if we use this style sheet:
<STYLE TYPE="text/css">
H1 { color: navy }
BODY { color: green }
</STYLE>
CSS has been designed to resolve conflicts between style sheet rules like the one above. Specificity is one aspect of
that. You can find the details in Chapter 15 , "Cascading and inheritance."
Properties that don't inherit
As a general rule, properties in CSS inherit from parent to child elements as described in the previous examples. Some
properties, however, don't inherit and there is always a good reason why. We will use the background property
(described in Chapter 11) as an example of a property that doesn't inherit.
Let's say you want to set a background image for a page. This is a common effect on the Web. In CSS, you can write:
<HTML>
<TITLE>Bach's home page</TITLE>
<STYLE TYPE="text/css">
BODY {
background: url(texture.gif) white;
color: black;
}
</STYLE>
<BODY>
<H1>Bach's <EM>home</EM> page</H1>
<P>Johann Sebastian Bach was a prolific
composer.
</BODY>
</HTML>
The background property has a URL ("texture.gif") that points to a background image as value. When the image is
loaded, the canvas looks like:
There are a few noteworthy things in the above example:
• The background image covers the surface like wallpaper - the backgrounds of the h1 and p elements are covered
as well. This is not due to inheritance, but to the fact that unless otherwise set, all backgrounds are
transparent. So, since we haven't set the backgrounds of the h1 or p element to something else, the background
of the parent element, body, will shine through.
• In addition to the URL of the image, a color (white) has also been specified as the background. In case the
image can't be found, you will see the color instead.
• The color of the body element has been set to black. To ensure contrast between the text and the background, it
is a good habit to always set a color when the background property is set.
So, exactly why doesn't the background property inherit? Visually, the effect of transparency is similar to inheritance:
it looks like all elements have the same backgrounds. There are two reasons: first, transparent backgrounds are faster
to display (there is nothing to display!) than other backgrounds. Second, since background images are aligned relative
to the element they belong to, you would otherwise not always end up with a smooth background surface.
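To see the difference between transparency and an explicitly set background, you could give one element a background of its own. In this sketch (the choice of white for h1 is arbitrary) the h1 element gets a solid box, while the p element stays transparent and still shows the texture set on body:

```html
<STYLE TYPE="text/css">
  BODY {
    background: url(texture.gif) white;
    color: black;
  }
  /* h1 no longer lets the body background shine through */
  H1 { background: white }
</STYLE>
```

The em element inside the h1 has no background of its own, so it in turn shows the white of its parent.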
Common tasks with CSS
Setting colors and backgrounds - as described above - are among the most common tasks performed by CSS. Other
common tasks include setting fonts and white space around elements. This section gives you a guided tour of the most
commonly used properties in CSS.
Common tasks: fonts
Let's start with fonts. If you have used desktop publishing applications in the past, you should be able to read this little
style sheet:
H1 { font: 36pt serif }
The rule above sets the font for h1 elements. The first part of the value - 36pt - sets the font size to be 36 points. A
"point" is an old typographic unit of measurement which has survived into the digital age. In the next chapter we will
tell you why you should use the "em" unit instead of "pt" but for now we'll stick to points. The second part of the value
- serif - tells the browser to use a font with serifs (the little hooks at the ends of the strokes; Chapter 5 will tell you all
about them). The more decorated serif fonts suit Bach's home page well since the modern sans-serif fonts (fonts
without serifs) weren't used in his time. Here is the result:

The font property is a shorthand property for setting several other properties at once. By using it, you can shorten your
style sheets and set values on all properties it replaces. If you choose to use the expanded version, you would have to
set all of these to replace the example above:
H1 {
font-size: 36pt;
font-family: serif;
font-style: normal;
font-weight: normal;
font-variant: normal;
line-height: normal;
}
Sometimes you only want to set one of these. For example, you may want to slant the text in some elements. Here is
an example:
UL { font-style: italic }
The font-style property will not change the font size or the font family; it will only slant the existing font. When set on
the ul element, the li elements inside will become slanted, since font-style is inherited. Here is the result when applied
to the by now familiar test page:

Similarly, the font-weight property is used to change the weight - thickness - of the letters. You can further emphasize
the list items by setting their ancestor to be bold:
UL {
font-style: italic;
font-weight: bold;
}

Which yields:

The last properties, font-variant and line-height, haven't been widely supported in browsers up to now and are
therefore not as commonly used yet.
Common tasks: margins
Setting space around elements is a basic tool in typography. The headline above this paragraph has space above it and
(slightly less) space below it. This paragraph, as printed in the book, has space on the left and (slightly less) on the
right. CSS can be used to express how much space there should be around different kinds of elements.
By default, your browser knows quite a bit about how to display the different kinds of elements in HTML. For
example, it knows that lists and blockquote elements should be indented to set them apart from the rest of the text. As
a designer, you can build on these settings while at the same time providing your own refinements. Let's use the
blockquote element as an example. Here's a test document:
<HTML>
<TITLE>Frederick the Great meets Bach</TITLE>
<BODY>
<P>One evening, just as Frederick the Great was
getting his flute ready, and his musicians
were assembled, an officer brought him a
list of the strangers who had arrived. With
his flute in his hand he ran over the list,
but immediately turned to the assembled
musicians, and said, with a kind of
agitation:
<BLOCKQUOTE>"Gentlemen, old Bach is come."
</BLOCKQUOTE>
<P>The flute was now laid aside, and old Bach, who
had alighted at his son's lodgings, was immediately
summoned to the Palace.
</BODY>
</HTML>
The screen-shot below is how a typical HTML browser would display the document:

As you can see, the browser has added space on all sides of the quoted text. In CSS, this space is called "margins" and
all elements have margins on all four sides. The properties are called: margin-top, margin-right, margin-bottom,
and margin-left. You can change how the blockquote element is displayed by writing a little style sheet:
BLOCKQUOTE {
margin-top: 1em;
margin-right: 0em;
margin-bottom: 1em;
margin-left: 0em;
font-style: italic;
}
The "em" unit will be treated in detail in the next chapter, but we can already now reveal its secret: it scales relative to
the font size. So, the above example will result in the vertical margins being as high as the font size (1em) of the
blockquote, and horizontal margins having zero width. To make sure the quoted text can still be distinguished, it has
been given an italic slant. The result is:
Just as font is a shorthand property to set several font-related properties at once, margin is a shorthand property
which sets all margin properties. The above example can therefore be written:
BLOCKQUOTE {
margin: 1em 0em 1em 0em;
font-style: italic;
}
The first part of the value - 1em - is assigned to margin-top. From there it's clockwise: 0em is assigned to
margin-right, 1em is assigned to margin-bottom, and 0em is assigned to margin-left.
With the left margin set to zero, the quoted text needs more styling to set it apart from the rest of the text. Setting
font-style to italic helps, and adding a background color further amplifies the quote:
BLOCKQUOTE {
margin: 1em 0em 1em 0em;
font-style: italic;
background: #EDB;
}
The result is:

As expected, the background color behind the quote has changed. Unlike previous examples, the color was specified
in red/green/blue (RGB) components. RGB colors are described in detail in Chapter 11 .
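As a short aside ahead of that chapter: the three-digit form used above is shorthand for a six-digit value in which each digit is doubled, so #EDB stands for #EEDDBB (red EE, green DD, blue BB). The two rules in this sketch therefore set the same background color; the P rule is added only for comparison:

```html
<STYLE TYPE="text/css">
  BLOCKQUOTE { background: #EDB }
  P { background: #EEDDBB }
</STYLE>
```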
One stylistic problem in the example above is that the background color barely covers the quoted text. The space
around the quote - the margin area - does not use the element's background color. CSS has another kind of space,
called padding, which uses the background color of the element. In other respects the padding properties are like the
margin properties: they add space around an element. Let's add some padding to the quote:
BLOCKQUOTE {
margin: 1em 0em 1em 0em;
font-style: italic;
background: #EDB;
padding: 0.5em;
}
The result of setting the padding is added space between the text and the rectangle that surrounds it:
Notice that the padding property was only given one value (0.5em). Just like the margin property, padding could
have taken four values which would have been assigned to the top, right, bottom and left padding respectively. However,
when the same value is to be set on all sides, listing it once will suffice. This is true both for padding and margin (as
well as some border properties, which are described in the chapter "Space around boxes").
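To make the value counts concrete, here is a small sketch. The first two rules use the forms described above. The two-value form in the third rule is not covered in this chapter but is also part of CSS: the first value sets top and bottom, the second sets right and left. The choice of elements and lengths is arbitrary:

```html
<STYLE TYPE="text/css">
  /* one value: all four sides */
  BLOCKQUOTE { padding: 0.5em }
  /* four values: top, right, bottom, left */
  UL { margin: 1em 0em 1em 2em }
  /* two values: top and bottom, then right and left
     (same as margin: 1em 0em 1em 0em) */
  P { margin: 1em 0em }
</STYLE>
```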
Common tasks: links
To make it easier for users to browse in hypertext documents, the links should have a style that distinguishes them
from normal text. HTML browsers have often underlined hyperlink text. Also, various color schemes have been used
to indicate if the user has previously visited the link or not. Since hyperlinks are such a fundamental part of the Web,
CSS has special support for styling them. Here's a simple example:
A:link { text-decoration: underline }
The above example specifies that unvisited links should be underlined:

The links are underlined, as we have specified, but they are also blue, which we have not. When authors do not specify
all possible styles, browsers use default styles to fill in the gaps. The interaction between author styles, browser default
styles and user styles (the user's own preferences) is another example of CSS's conflict resolution rules. It is called the
cascade (the "C" of CSS). We will discuss the cascade below.
The selector (A:link) deserves special mention. You probably recognize "A" as being an HTML element, but the
last part is new. ":link" is one of several so-called pseudo-classes in CSS. Pseudo-classes are used to give style to
elements based on information outside of the document itself. For example, the author of the document can't know if a
certain link will be visited or not. Pseudo-classes are described in detail in Chapter 4, and we'll only give a few more
examples here:
A:visited { text-decoration: none }
This rule gives style to visited links, just like A:link gave style to unvisited links. Here is a slightly more complex
example:
A:link, A:visited { text-decoration: none }
A:hover { background: cyan }
The last rule introduces a new pseudo-class :hover. Assuming the user is moving a pointing device (like a mouse), the
specified style will be applied to the element when the user moves the pointer over ("hovers" over) the link. A
common effect is to change the background color. Here is what it looks like:

The :hover pseudo-class has an interesting history. It was introduced in CSS2 after the hover effect became popular
among JavaScript programmers. The JavaScript solution requires complicated code compared to the CSS pseudo-class
and this is an example of CSS picking up effects that have become popular among Web designers.
A word about Cascading
A fundamental feature of CSS is that more than one style sheet can influence the presentation of a document. This
feature is known as cascading because the different style sheets are thought of as coming in a series. Cascading is a
fundamental feature of CSS, because we realized that any single document could very likely end up with style sheets
from multiple sources: the browser, the designer, and possibly the user.
In the last set of examples you saw that the text color of the links turned blue without that being specified in the style
sheet. Also, the browser knew how to format blockquote and h1 elements without being told so explicitly. Everything
that the browser knows about formatting is stored in the browser's default style sheet and is merged with author and
user style sheets when the document is displayed.
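To make the idea of a default style sheet concrete, here is a rough sketch of the kind of rules a browser might apply on its own. The values are illustrative assumptions and are not taken from any particular browser.

```css
/* Illustrative browser defaults (assumed values) */
h1         { display: block; font-size: 2em; font-weight: bold }
blockquote { display: block; margin: 1em 40px }
a:link     { color: blue; text-decoration: underline }
a:visited  { color: purple; text-decoration: underline }
```

An author rule such as A:link { text-decoration: underline } changes nothing about the color, so the assumed default of blue still applies.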
We have known for years that designers want to develop their own style sheets. However, we discovered that users,
too, want the option of influencing the presentation of their documents. With CSS, they can do this by supplying a
personal style sheet that will be merged with the browser's and the designer's style sheets. Any conflicts between the
various style sheets are resolved by the browser. Usually, the designer's style sheet will have the strongest claim on the
document, followed by the user's, and then the browser's default. However, the user can say that a rule is very
important and it will then override any author or browser styles.
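A user style sheet might look like the sketch below. How such a style sheet is installed depends on the browser, and the specific rules here are assumptions chosen only for illustration.

```css
/* Hypothetical user style sheet */

/* A normal user rule: a conflicting rule in the designer's style sheet
   will usually win over it */
body { font-family: Verdana, sans-serif }

/* Rules marked as important: these override author and browser styles */
body { font-size: 1.2em !important }
a:link, a:visited { text-decoration: underline !important }
```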
We go into details about cascading in Chapter 15, "Cascading and inheritance." Before that, there is much to learn
about fonts, space and colors.

Web Style Sheets


CSS tips & tricks
Tips & tricks
A random collection of CSS examples and some help in using them.
1. Figures & captions
2. A pinned-down menu
3. Indented paragraphs
4. Alternative style sheets
5. A confetti menu
6. Getting rid of colored scrollbars (user style sheets)
7. Even/odd: coloring every other row
8. A tabbed interface
9. A chart comparing font styles
10. Horizontal and vertical centering
11. Boxes with drop shadows
12. Text shadows
13. Rounded boxes and unsharp shadows
14. Units of length: px, em, cm, etc.

CSS tutorial
starting with HTML + CSS
Contents
• 1. The HTML
• 2. Adding color
• 3. Adding fonts
• 4. A navigation bar
• 5. Styling links
• 6. Horizontal line
• 7. External CSS
• Further reading
This short tutorial is meant for people who want to start using CSS and have never written a CSS style sheet before.
It does not explain much of CSS. It just explains how to create an HTML file, a CSS file and how to make them work
together. After that, you can read any of a number of other tutorials to add more features to the HTML and CSS files.
Or you can switch to using a dedicated HTML or CSS editor that helps you set up complex sites.
At the end of the tutorial, you will have made an HTML file that looks like this:
The resulting HTML page, with colors and layout, all done with CSS.
Note that I don't claim that this is beautiful ☺
Sections that look like this are optional. They contain some extra explanation of the HTML and CSS codes in
the example. The “alert!” sign at the start indicates that this is more advanced material than the rest of the text.
Step 1: writing the HTML
For this tutorial, I suggest you use only the very simplest of tools. E.g., Notepad (under Windows), TextEdit (on the
Mac) or KEdit (under KDE) will do fine. Once you understand the principles, you may want to switch to more
advanced tools, or even to commercial programs, such as Style Master, Dreamweaver or GoLive. But for your very
first CSS style sheet, it is good not to be distracted by too many advanced features.
Don't use a wordprocessor, such as Microsoft Word or OpenOffice. They typically make files that a Web browser
cannot read. For HTML and CSS, we want simple, plain text files.
Step 1 is to open your text editor (Notepad, TextEdit, KEdit, or whatever is your favorite), start with an empty window
and type the following:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
</head>

<body>

<!-- Site navigation menu -->
<ul class="navbar">
<li><a href="index.html">Home page</a>
<li><a href="musings.html">Musings</a>
<li><a href="town.html">My town</a>
<li><a href="links.html">Links</a>
</ul>

<!-- Main content -->
<h1>My first styled page</h1>

<p>Welcome to my styled page!

<p>It lacks images, but at least it has style.
And it has links, even if they don't go
anywhere&hellip;

<p>There should be more here, but I don't know
what yet.

<!-- Sign and date the page, it's only polite! -->
<address>Made 5 April 2004<br>
by myself.</address>

</body>
</html>
In fact, you don't have to type it: you can copy and paste it from this Web page into the editor.
(If you are using TextEdit on the Mac, don't forget to tell TextEdit that the text is really plain text, by going to the
Format menu and selecting “Make plain text”.)

The first line of the HTML file above tells the browser which type of HTML this is (DOCTYPE means
DOCument TYPE). In this case, it is HTML version 4.01.
Words within < and > are called tags and, as you can see, the document is contained within the <html> and </html>
tags. Between <head> and </head> there is room for various kinds of information that is not shown on screen. So far it
contains the title of the document, but later we will add the CSS style sheet there, too.
The <body> is where the actual text of the document goes. In principle, everything in there will be displayed, except
for the text inside <!-- and -->, which serves as a comment to ourselves. The browser will ignore it.
Of the tags in the example, <ul> introduces an “Unordered List”, i.e., a list in which the items are not numbered. The
<li> is the start of a “List Item.” The <p> is a “Paragraph.” And the <a> is an “Anchor,” which is what creates a
hyperlink.

The KEdit editor showing the HTML source.

If you want to know what the names in <…> mean, one good place to start is Getting started with HTML. But
just a few words about the structure of our example HTML page.
• The “ul” is a list with one hyperlink per item. This will serve as our “site navigation menu,” linking to the
other pages of our (hypothetical) Web site. Presumably, all pages on our site have a similar menu.
• The “h1” and “p” elements form the unique content of this page, while the signature at the bottom (“address”)
will again be similar on all pages of the site.
Note that I didn't close the “li” and “p” elements. In HTML (but not in XHTML), it is allowed to omit the </li> and
</p> tags, which I did here, to make the text a little easier to read. But you may add them, if you prefer.
Let's assume that this is going to be one page of a Web site with several similar pages. As is common for current Web
pages, this one has a menu that links to other pages on the hypothetical site, some unique content and a signature.
Now select “Save As…” from the File menu, navigate to a directory/folder where you want to put it (the Desktop is
fine) and save the file as “mypage.html”. Don't close the editor yet, we will need it again.
(If you are using TextEdit on Mac OS X before version 10.4, you will see an option Don't append the .txt extension in
the Save as dialog. Select that option, because the name “mypage.html” already includes an extension. Newer versions
of TextEdit will notice the .html extension automatically.)
Next, open the file in a browser. You can do that as follows: find the file with your file manager (Windows Explorer,
Finder or Konqueror) and click or double click the “mypage.html” file. It should open in your default Web browser.
(If it does not, open your browser and drag the file to it.)
As you can see, the page looks rather boring…
Step 2: adding some colors
You probably see some black text on a white background, but it depends on how the browser is configured. So one
easy thing we can do to make the page more stylish is to add some colors. (Leave the browser open, we will use it
again later.)
We will start with a style sheet embedded inside the HTML file. Later, we will put the HTML and the CSS in separate
files. Separate files are good, since that makes it easier to use the same style sheet for multiple HTML files: you only have
to write the style sheet once. But for this step, we just keep everything in one file.
We need to add a <style> element to the HTML file. The style sheet will be inside that element. So go back to the
editor window and add the following five lines in the head part of the HTML file. The lines to add are lines 5 to 9 of
the listing below, from <style> to </style>.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<style type="text/css">
body {
color: purple;
background-color: #d8da3d }
</style>
</head>

<body>
[etc.]
The first line says that this is a style sheet and that it is written in CSS (“text/css”). The second line says that we add
style to the “body” element. The third line sets the color of the text to purple and the next line sets the background to a
sort of greenish yellow.

Style sheets in CSS are made up of rules. Each rule has three parts:
1. the selector (in the example: “body”), which tells the browser which part of the document is affected by the
rule;
2. the property (in the example, 'color' and 'background-color' are both properties), which specifies what aspect of
the layout is being set;
3. and the value ('purple' and '#d8da3d'), which gives the value for the style property.
The example shows that rules can be combined. We have set two properties, so we could have made two separate
rules:
body { color: purple }
body { background-color: #d8da3d }
but since both rules affect the body, we only wrote “body” once and put the properties and values together. For more
about selectors, see chapter 2 of Lie & Bos.
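The three parts can be labeled directly on the rule from the example. The comment is only an annotation; browsers ignore it.

```css
/* selector   property            value    */
   body     { color:              purple;
              background-color:   #d8da3d }
```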
The background of the body element will also be the background of the whole document. We haven't given any of the
other elements (p, li, address…) any explicit background, so by default they will have none (or: will be transparent).
The 'color' property sets the color of the text for the body element, but all other elements inside the body inherit that
color, unless explicitly overridden. (We will add some other colors later.)
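As a sketch of how inheritance and overriding interact, the rules below keep the inherited purple for most text but give one element its own color. The choice of the address element and of black are assumptions for illustration; this rule is optional and is not part of the tutorial's style sheet.

```css
body {
  color: purple;               /* inherited by p, li, h1, address, ... */
  background-color: #d8da3d }

address {
  color: black }               /* explicit value overrides the inherited purple */
```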
Now save this file (use “Save” from the File menu) and go back to the browser window. If you press the “Reload”
button, the display should change from the “boring” page to a colored (but still rather boring) page. Apart from the list
of links at the top, the text should now be purple against a greenish yellow background.

How one browser shows the page now that some colors have been added.

Colors can be specified in CSS in several ways. This example shows two of them: by name
(“purple”) and by hexadecimal code (“#d8da3d”). There are about 140 color names and the hexadecimal codes allow
for over 16 million colors. Adding a touch of style explains more about these codes.
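For comparison, here is the same purple written in three notations. The rgb() form is a third notation that the tutorial's style sheet does not use, and the h1 selector is just an assumption for the example.

```css
/* Three ways to write the same purple */
h1 { color: purple }             /* by name */
h1 { color: #800080 }            /* hexadecimal: red 80, green 00, blue 80 */
h1 { color: rgb(128, 0, 128) }   /* decimal red, green, blue (0 to 255) */

/* A three-digit code is shorthand: each digit is doubled */
h1 { background-color: #fd3 }    /* same as #ffdd33 */
```

Two hexadecimal digits per channel give 256 levels each for red, green and blue, and 256 x 256 x 256 = 16,777,216, which is where "over 16 million" comes from.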
Step 3: adding fonts
Another thing that is easy to do is to make some distinction in the fonts for the various elements of the page. So let's
set the text in the “Georgia” font, except for the h1 heading, which we'll give “Helvetica.”
On the Web, you can never be sure what fonts your readers have on their computers, so we add some alternatives as
well: if Georgia is not available, Times New Roman or Times are also fine, and if all else fails, the browser may use
any other font with serifs. If Helvetica is absent, Geneva, Arial and SunSans-Regular are quite similar in shape, and if
none of these work, the browser can choose any other font that is serif-less.
In the text editor add the following lines (lines 7-8 and 11-13):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<style type="text/css">
body {
font-family: Georgia, "Times New Roman",
Times, serif;
color: purple;
background-color: #d8da3d }
h1 {
font-family: Helvetica, Geneva, Arial,
SunSans-Regular, sans-serif }
</style>
</head>

<body>
[etc.]
If you save the file again and press “Reload” in the browser, there should now be different fonts for the heading and
the other text.

Now the main text has a different font from the heading.
Step 4: adding a navigation bar
The list at the top of the HTML page is meant to become a navigation menu. Many Web sites have some sort of menu
along the top or on the side of the page and this page should have one as well. We will put it on the left side, because
that is a little more interesting than at the top…
The menu is already in the HTML page. It is the <ul> list at the top. The links in it don't work, since our “Web site” so
far consists of only one page, but that doesn't matter now. On a real Web site, there should not be any broken links, of
course.
So we need to move the list to the left and move the rest of the text a little to the right, to make room for it. The CSS
properties we use for that are 'padding-left' (to move the body text) and 'position', 'left' and 'top' (to move the menu).
There are other ways to do it. If you look for “column” or “layout” on the Learning CSS page, you will find several
ready-to-run templates. But this one is OK for our purposes.
In the editor window, add the following lines to the HTML file (lines 7 and 12-16):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<style type="text/css">
body {
padding-left: 11em;
font-family: Georgia, "Times New Roman",
Times, serif;
color: purple;
background-color: #d8da3d }
ul.navbar {
position: absolute;
top: 2em;
left: 1em;
width: 9em }
h1 {
font-family: Helvetica, Geneva, Arial,
SunSans-Regular, sans-serif }
</style>
</head>

<body>
[etc.]
If you save the file again and reload it in the browser, you should now have the list of links to the left of the main text.
That already looks much more interesting, doesn't it?

The main text has been moved over to the right and the list of links is now to the left of it, instead of above.

The 'position: absolute' says that the ul element is positioned independently of any text that comes before or
after it in the document and the 'left' and 'top' indicate what that position is. In this case, 2em from the top and 1em
from the left side of the window.
'2em' means 2 times the size of the current font. E.g., if the menu is displayed with a font of 12 points, then '2em' is 24
points. The 'em' is a very useful unit in CSS, since it can adapt automatically to the font that the reader happens to use.
Most browsers have a menu for increasing or decreasing the font size: you can try it and see that the menu increases in
size as the font increases, which would not have been the case, if we had used a size in pixels instead.
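The arithmetic can be written out as follows. The font size of 12pt is an assumption made only to make the numbers visible; the tutorial's style sheet does not set it.

```css
ul.navbar {
  font-size: 12pt;      /* assumed font size for this illustration */
  position: absolute;
  top: 2em;             /* 2 x 12pt = 24pt from the top */
  left: 1em;            /* 1 x 12pt = 12pt from the left */
  width: 9em }          /* 9 x 12pt = 108pt wide */
```

If the reader enlarges the font to 18pt, the same rule places the menu 36pt from the top and makes it 162pt wide, without any change to the style sheet.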
Step 5: Styling the links
The navigation menu still looks like a list, instead of a menu. Let's add some style to it. We'll remove the list bullet and
move the items to the left, to where the bullet was. We'll also give each item its own white background and a black
square. (Why? No particular reason, just because we can.)
We also haven't said what the colors of the links should be, so let's add that as well: blue for links that the user hasn't
seen yet and purple for links already visited (lines 13-15 and 23-33):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<style type="text/css">
body {
padding-left: 11em;
font-family: Georgia, "Times New Roman",
Times, serif;
color: purple;
background-color: #d8da3d }
ul.navbar {
list-style-type: none;
padding: 0;
margin: 0;
position: absolute;
top: 2em;
left: 1em;
width: 9em }
h1 {
font-family: Helvetica, Geneva, Arial,
SunSans-Regular, sans-serif }
ul.navbar li {
background: white;
margin: 0.5em 0;
padding: 0.3em;
border-right: 1em solid black }
ul.navbar a {
text-decoration: none }
a:link {
color: blue }
a:visited {
color: purple }
</style>
</head>

<body>
[etc.]

Traditionally, browsers show hyperlinks with underlines and with colors. Usually, the colors are similar to
what we specified here: blue for links to pages that you haven't visited yet (or visited a long time ago), purple for
pages that you have already seen.
In HTML, hyperlinks are created with <a> elements, so to specify the color, we need to add a style rule for “a”. To
differentiate between visited and unvisited links, CSS provides two “pseudo-classes” (:link and :visited). They are
called “pseudo-classes” to distinguish them from class attributes, which appear in the HTML directly, e.g., the
class="navbar" in our example.
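Other pseudo-classes can be used in the same way. As an optional sketch (not part of the tutorial's style sheet), the last rule below changes the color of a link while the pointer is over it; the choice of red is arbitrary.

```css
a:link {
  color: blue }
a:visited {
  color: purple }

/* Same weight as the two rules above, so it has to come after them
   for its color to take effect */
a:hover {
  color: red }
```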
Step 6: adding a horizontal line
The final addition to the style sheet is a horizontal rule to separate the text from the signature at the bottom. We will
use 'border-top' to add a dotted line above the <address> element (lines 34-37):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<style type="text/css">
body {
padding-left: 11em;
font-family: Georgia, "Times New Roman",
Times, serif;
color: purple;
background-color: #d8da3d }
ul.navbar {
list-style-type: none;
padding: 0;
margin: 0;
position: absolute;
top: 2em;
left: 1em;
width: 9em }
h1 {
font-family: Helvetica, Geneva, Arial,
SunSans-Regular, sans-serif }
ul.navbar li {
background: white;
margin: 0.5em 0;
padding: 0.3em;
border-right: 1em solid black }
ul.navbar a {
text-decoration: none }
a:link {
color: blue }
a:visited {
color: purple }
address {
margin-top: 1em;
padding-top: 1em;
border-top: thin dotted }
</style>
</head>

<body>
[etc.]
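The 'border-top' property used above is a shorthand. A roughly equivalent longhand version, shown here only as a sketch, spells out the width and style separately:

```css
address {
  margin-top: 1em;
  padding-top: 1em;
  border-top-width: thin;
  border-top-style: dotted
  /* no border color is given, so the line takes the text color
     of the address element */
}
```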
Now our style is complete. Next, let's look at how we can put the style sheet in a separate file, so that other pages can
share the same style.
Step 7: putting the style sheet in a separate file
We now have an HTML file with an embedded style sheet. But if our site grows we probably want many pages to
share the same style. There is a better method than copying the style sheet into every page: if we put the style sheet in
a separate file, all pages can point to it.
To make a style sheet file, we need to create another empty text file. You can choose “New” from the File menu in the
editor, to create an empty window. (If you are using TextEdit, don't forget to make it plain text again, using the Format
menu.)
Then cut and paste everything that is inside the <style> element from the HTML file into the new window. Don't copy
the <style> and </style> themselves. They belong to HTML, not to CSS. In the new editor window, you should now
have the complete style sheet:
body {
padding-left: 11em;
font-family: Georgia, "Times New Roman",
Times, serif;
color: purple;
background-color: #d8da3d }
ul.navbar {
list-style-type: none;
padding: 0;
margin: 0;
position: absolute;
top: 2em;
left: 1em;
width: 9em }
h1 {
font-family: Helvetica, Geneva, Arial,
SunSans-Regular, sans-serif }
ul.navbar li {
background: white;
margin: 0.5em 0;
padding: 0.3em;
border-right: 1em solid black }
ul.navbar a {
text-decoration: none }
a:link {
color: blue }
a:visited {
color: purple }
address {
margin-top: 1em;
padding-top: 1em;
border-top: thin dotted }
Choose “Save As…” from the File menu, make sure that you are in the same directory/folder as the mypage.html file,
and save the style sheet as “mystyle.css”.
Now go back to the window with the HTML code. Remove everything from the <style> tag up to and including the
</style> tag and replace it with a <link> element, as follows (line 5):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>My first styled page</title>
<link rel="stylesheet" href="mystyle.css">
</head>

<body>
[etc.]
This will tell the browser that the style sheet is found in the file called “mystyle.css” and since no directory is
mentioned, the browser will look in the same directory where it found the HTML file.
If you save the HTML file and reload it in the browser, you should see no change: the page is still styled the same
way, but now the style comes from an external file.
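If one page later needs a few extra rules on top of the shared ones, one option is a second style sheet that imports the first. This is a sketch under assumptions: the file name musings.css and the rule in it are hypothetical.

```css
/* musings.css (hypothetical): the shared rules plus page-specific ones */
@import url("mystyle.css");   /* must come before the style rules below */

h1 {
  font-style: italic }
```

The page in question would then link to musings.css instead of mystyle.css, while all other pages keep pointing at mystyle.css.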
The final result
The next step is to put both files, mypage.html and mystyle.css on your Web site. (Well, you might want to change
them a bit first…) But how to do that depends on your Internet provider.
Further reading
For an introduction to CSS, see chapter 2 of Lie & Bos, or Dave Raggett's intro to CSS.
Other information, including more tutorials, can be found on the learning CSS page.

XHTML Modules and Markup Languages


How to create XHTML Family modules and markup languages for fun
and profit
Full Table of Contents
• 1. Introduction
• 2. Terminology
• 3. Module Construction
○ 3.1. Defining the Qname Module
○ 3.2. Defining the Declaration Module
○ 3.3. Using the module as a stand-alone DTD
• 4. DTD Construction
○ 4.1. Define the Content Model as a 'Model Module'
○ 4.2. Define the qualified names collection
○ 4.3. Define the Driver
• 5. Module Examples
○ 5.1. Qname Modules
▪ 5.1.1. Inventory Qname Module
▪ 5.1.2. Inventory Extensions Qname Module
○ 5.2. Declaration Modules
▪ 5.2.1. Inventory Declaration Module
▪ 5.2.2. Inventory Extensions Declaration Module
• 6. Markup Language Examples
○ 6.1. XHTML-Inventory Extensions - an XHTML Family Markup Language
▪ 6.1.1. Qname Collection Module
▪ 6.1.2. Content Model Module
▪ 6.1.3. DTD Driver
• 7. Usage Examples
○ 7.1. XHTML-Inventory with no prefixes
○ 7.2. XHTML-Inventory with Inventory prefixes
○ 7.3. XHTML-Inventory with all prefixes

1. Introduction
XHTML Modularization provides a structure for the creation of new markup languages through the extension of the
XHTML Core modules and the use of the XHTML Module Framework. In some instances, people will want to create
complete, proprietary markup languages through these mechanisms. In other instances, people may wish to create
new, reusable modules that will be used by their organization or by others in the definition of markup languages. In
either case, the mechanics of the modules added and the markup language definition are the same. This document
describes the manner in which such modules are defined, and the way in which modules should be combined to create
new markup languages.

2. Terminology
Markup Language
A grammar (in this case, an XML grammar) that can be used to structure information. Once structured, the
information can be processed in the context of the markup language. Such processing might include
presentation to a user, extraction of key information, transformation into other forms, etc.
Hybrid Markup Language
A Markup Language that is made up of Modules from multiple Namespaces.
Namespace
A namespace is a collection of names that are delimited in some way. An XML Namespace is a W3C-defined
mechanism for delimiting XML elements and attributes. XHTML-defined modules are all in the XHTML
Namespace. XHTML-family modules are required to be in their own XML Namespace. XHTML
Modularization defines a mechanism for declaring the XML Namespace of a module in a way that is
compatible with XML DTDs and permits XML Validation of XHTML-family documents.
XML Validation
The XML Recommendation defines validation as ensuring that a document is well-formed and that it conforms to
the content model defined in the document's associated DTD. XHTML family documents are required to be
XML Valid.
Module
In XHTML, a module is a collection of one or more files that define entities, elements, and/or attributes. A
module may represent a complete, stand-alone markup language. It may also represent a small, incremental
change to some other markup language or some other module. Regardless, modules can be combined with
other modules using the XHTML Framework. With care, the elements defined by these modules can be
combined into a complete content model for a markup language.
DTD
A DTD (Document Type Definition) is a grammar in which XML-based markup languages can be defined (there are others, but right now we
are talking about DTDs). It is also a term commonly used to refer to the file in which a markup language
definition can be found. In the context of XHTML Modules and Markup Languages, a DTD is actually a file
that includes the XHTML-family modules that make up the markup language (along with some other helper
files). In DTD parlance, this file can also be called a "DTD Driver" file.
Qualified Name
The combination of XML and XML Namespaces gives rise to a class of elements and attributes that have
"qualified names". A qualified name consists of the element or attribute name, possibly prefixed with a
namespace declarator (e.g. xhtml:p for paragraph). In XHTML, the qualified names for elements and attributes
are defined in a Qname Module.

3. Module Construction
XHTML Modules are made up of at least two modules - a Qname Module and a Declaration Module. In this section
we will walk through building each of these. In the next section we will use this new module with another XHTML-
family module and some XHTML Core modules to define a new markup language.
3.1. Defining the Qname Module
An XHTML Qname Module should be constructed using the following process:
1. Define a parameter entity MODULE.prefixed that announces whether the elements in the module are being
used with XML Namespace prefixed names or not. This parameter entity's default value should be
"%NS.prefixed;". The NS.prefixed parameter entity is defined by the XHTML framework to be IGNORE by
default, and can be used in a document instance to switch on prefixing for all included namespaces (see the
prefixing example for more on this).
2. Define a parameter entity MODULE.xmlns that contains the namespace identifier for this module.
3. Define a parameter entity MODULE.prefix that contains the default prefix string to use when prefixing is
enabled.
4. Define a parameter entity MODULE.pfx that is "%MODULE.prefix;:" when prefixing is enabled, and "" when
it is not.
5. Define a parameter entity MODULE.xmlns.extra.attrib that contains the declaration of any XML Namespace
attributes for namespaces referenced by this module (e.g., xmlns:xlink). When %MODULE.prefixed; is set to
INCLUDE, this attribute should include the xmlns:%MODULE.prefix; declaration as well.
6. For each of the elements defined by the module, create a parameter entity of the form
"MODULE.NAME.qname" to hold its qualified name. The value for this parameter entity must be
"%MODULE.pfx;NAME". In this way, the parsed value will be "PREFIX:NAME" when prefixes are enabled,
and "NAME" otherwise. For example:
<!ENTITY % MODULE.myelement.qname "%MODULE.pfx;myelement" >
If the module adds attributes to elements defined in modules that do not share the namespace of this module, declare
those attributes so that they use the %MODULE.pfx prefix. For example:
<!ENTITY % MODULE.img.myattr.qname "%MODULE.pfx;myattr" >

3.2. Defining the Declaration Module


An XHTML Declaration Module should be constructed using the following process:
1. Define a parameter entity to use within the ATTLIST of each declared element. This parameter entity should
be %NS.decl.attrib; when %MODULE.prefixed; is set to INCLUDE, and %NS.decl.attrib; plus "xmlns=
%MODULE.xmlns;" when %MODULE.prefixed; is set to IGNORE.
2. Declare all of the elements and attributes for the module. Within each ATTLIST for an element, include the
parameter entity defined above so that all of the required xmlns attributes are available on each element in the
module.
3.3. Using the module as a stand-alone DTD
It is sometimes desirable to have an XHTML module also usable as a stand alone DTD. A good example of this might
be a module that defines inventory items. These items need to be embeddable in an XHTML document, and also need
to be available as free-standing documents extracted from a database (or something). The easiest way to accomplish
this is to define a DTD file that instantiates the components of your module. Such a DTD would have this structure:
1. Include the XHTML Datatypes module (your qnames module likely uses some of these datatypes).
2. Include the Qnames Module for your module.
3. Define the parameter entity %NS.prefixed.attrib to be MODULE.xmlns.extra.attrib when MODULE.prefixed
is set to IGNORE, or to be MODULE.xmlns.extra.attrib and "xmlns:MODULE.prefix=MODULE.xmlns"
when MODULE.prefixed is set to INCLUDE.
4. Include the Declaration Module(s) for your module.
This DTD can then be referenced by documents that use only the elements from your module.

4. DTD Construction
Once you have defined your module(s), you are going to want to combine them with XHTML and other modules to
create a new markup language. Since in this document we are talking about building these markup languages using
DTDs, what you need to do is define a DTD that reflects the markup language. In the remainder of this section, we
will explore the process for creating such a "hybrid markup language".
4.1. Define the Content Model as a 'Model Module'
A Model Module is an XHTML Module that defines the content model for your new markup language. This module
can be extremely complex, or it can be as simple as the declaration of a parameter entity and the inclusion of some
other Model Module. Regardless, the purpose is the same: Define the structure of all of the elements in your markup
language.
4.2. Define the qualified names collection
Your markup language may include one or more additional XHTML-family Modules. Each of these Modules will
have a Qname Module. The qualified names collection is a module in which all of the Qname Modules are
instantiated, and the set of prefixed attributes are defined. Specifically, a qualified names collection module contains:
1. A reference to the Qname Module of each non-XHTML module included
2. A definition of the parameter entity XHTML.xmlns.extra.attrib to be the collection of the
MODULE.xmlns.extra.attrib parameter entities, one from each included Module.
4.3. Define the Driver
The driver is the actual file that is referenced by documents written in your new markup language. The driver may be
complex or simple, depending upon the markup language. However, each XHTML-family markup language driver
must contain the following elements in order to work well:
1. A definition of the parameter entity XHTML.version. This should be set to the Formal Public Identifier for
your new markup language.
2. A definition of the parameter entity xhtml-qname-extra.mod. This must be set to the qualified names collection
module defined above. It is fine to have this as only a SYSTEM identifier, since it is internal to the DTD.
3. A definition of the parameter entity xhtml-model.mod. This must be set to the Model Module defined above. It
is fine to also have this as only a SYSTEM identifier, since it is internal to the DTD.
4. A series of references to the modules that make up the DTD. This may be a reference to another DTD that you
are incrementally modifying, or it may be an explicit list of the XHTML Modules that are being included, or
some combination of the two. Regardless, the first thing that actually gets instantiated through this reference is
the XHTML Modularization Framework Module. This Module takes care of incorporating all of the XHTML
infrastructure, merging it with your specified qualified names and your content model via the parameter
entities defined in steps 2 and 3. Don't forget to include your new Declaration Modules, since that is where
your new markup language's elements and attributes are defined!
Now you are ready to go. Your new Markup Language, defined via a DTD, can be referenced in the DOCTYPE
declaration of a document, and that document can be validated against your new DTD using common commercial and
freeware tools.

5. Module Examples
In the following sections, you will see examples of each type of module referred to in this document, as well as the
components that make up two different markup language definitions.
5.1. Qname Modules
This first qname module is for an inventory module. The second is for some extensions to the inventory module.
5.1.1. Inventory Qname Module
<!-- ...................................................................... -->
<!-- Inventory Qname Module ................................................... -->
<!-- file: inventory-qname-1.mod

PUBLIC "-//MY COMPANY//ENTITIES XHTML Inventory Qnames 1.0//EN"


SYSTEM "http://www.my.org/DTDs/inventory-qname-1.mod"

xmlns:inventory="http://www.my.org/xmlns/inventory"
...................................................................... -->

<!-- Declare the default value for prefixing of this module's elements -->
<!-- Note that the NS.prefixed will get overridden by the XHTML Framework or
by a document instance. -->
<!ENTITY % NS.prefixed "IGNORE" >
<!ENTITY % Inventory.prefixed "%NS.prefixed;" >

<!-- Declare the actual namespace of this module -->


<!ENTITY % Inventory.xmlns "http://www.my.org/xmlns/inventory" >

<!-- Declare the default prefix for this module -->


<!ENTITY % Inventory.prefix "inventory" >

<!-- Declare the prefix and any prefixed namespaces that are required by
this module -->
<![%Inventory.prefixed;[
<!ENTITY % Inventory.pfx "%Inventory.prefix;:" >
<!ENTITY % Inventory.xmlns.extra.attrib
"xmlns:%Inventory.prefix; %URI.datatype; #FIXED '%Inventory.xmlns;'" >
]]>
<!ENTITY % Inventory.pfx "" >
<!ENTITY % Inventory.xmlns.extra.attrib "" >

<!ENTITY % XHTML.xmlns.extra.attrib "%Inventory.xmlns.extra.attrib;" >

<!ENTITY % Inventory.shelf.qname "%Inventory.pfx;shelf" >


<!ENTITY % Inventory.item.qname "%Inventory.pfx;item" >
<!ENTITY % Inventory.desc.qname "%Inventory.pfx;desc" >
<!ENTITY % Inventory.sku.qname "%Inventory.pfx;sku" >
<!ENTITY % Inventory.price.qname "%Inventory.pfx;price" >

5.1.2. Inventory Extensions Qname Module


<!-- ...................................................................... -->
<!-- Extension Qname Module ............................................... -->
<!-- file: extension-qname-1.mod

xmlns:invext="http://www.my.org/xmlns/invext"
...................................................................... -->

<!-- Declare the default value for prefixing of this module's elements -->
<!-- Note that the NS.prefixed will get overridden by the XHTML Framework or
by a document instance. -->
<!ENTITY % NS.prefixed "IGNORE" >
<!ENTITY % Extension.prefixed "%NS.prefixed;" >

<!-- Declare the actual namespace of this module -->


<!ENTITY % Extension.xmlns "http://www.my.org/xmlns/invext" >

<!-- Declare the default prefix for this module -->


<!ENTITY % Extension.prefix "invext" >

<!-- Declare the prefix and any prefixed namespaces that are required by
this module -->
<![%Extension.prefixed;[
<!ENTITY % Extension.pfx "%Extension.prefix;:" >
<!ENTITY % Extension.xmlns.extra.attrib
"xmlns:%Extension.prefix; %URI.datatype; #FIXED '%Extension.xmlns;'" >
]]>
<!ENTITY % Extension.pfx "" >
<!ENTITY % Extension.xmlns.extra.attrib "" >

<!ENTITY % Extension.store.qname "%Extension.pfx;store" >


<!ENTITY % Extension.aisle.qname "%Extension.pfx;aisle">

5.2. Declaration Modules


The first declaration module is for the inventory module elements. The second is for the extension elements.
5.2.1. Inventory Declaration Module
<!-- ...................................................................... -->
<!-- Inventory Elements Module ................................................... -->
<!-- file: inventory-1.mod

PUBLIC "-//MY COMPANY//ELEMENTS XHTML Inventory Elements 1.0//EN"


SYSTEM "http://www.my.org/DTDs/inventory-1.mod"

xmlns:inventory="http://www.my.org/xmlns/inventory"
...................................................................... -->

<!-- Inventory Module

item
sku
desc
price

This module defines a simple inventory item structure


-->

<!-- Define the global namespace attributes -->


<![%Inventory.prefixed;[
<!ENTITY % Inventory.xmlns.attrib
"%NS.decl.attrib;"
>
]]>
<!ENTITY % Inventory.xmlns.attrib
"%NS.decl.attrib;
xmlns %URI.datatype; #FIXED '%Inventory.xmlns;'"
>

<!ELEMENT %Inventory.shelf.qname;
( %Inventory.item.qname; )* >
<!ATTLIST %Inventory.shelf.qname;
location CDATA #IMPLIED
%Inventory.xmlns.attrib;
>
<!ELEMENT %Inventory.item.qname;
( %Inventory.desc.qname;, %Inventory.sku.qname;, %Inventory.price.qname;) >
<!ATTLIST %Inventory.item.qname;
location CDATA #IMPLIED
%Inventory.xmlns.attrib;
>

<!ELEMENT %Inventory.desc.qname; ( #PCDATA ) >


<!ATTLIST %Inventory.desc.qname;
%Inventory.xmlns.attrib;
>

<!ELEMENT %Inventory.sku.qname; ( #PCDATA ) >


<!ATTLIST %Inventory.sku.qname;
%Inventory.xmlns.attrib;
>

<!ELEMENT %Inventory.price.qname; ( #PCDATA ) >


<!ATTLIST %Inventory.price.qname;
%Inventory.xmlns.attrib;
>

<!-- end of inventory-1.mod -->

5.2.2. Inventory Extensions Declaration Module


<!-- ...................................................................... -->
<!-- Extension Elements Module ................................................... -->
<!-- file: extension-1.mod

SYSTEM "extension-1.mod"

xmlns:invext="http://www.my.org/xmlns/invext"
...................................................................... -->

<!-- Extension Module

store
aisle

This module defines an extension to the inventory structure


-->

<!-- Define the global namespace attributes -->


<![%Extension.prefixed;[
<!ENTITY % Extension.xmlns.attrib
"%NS.decl.attrib;"
>
]]>
<!ENTITY % Extension.xmlns.attrib
"%NS.decl.attrib;
xmlns %URI.datatype; #FIXED '%Extension.xmlns;'"
>

<!ELEMENT %Extension.store.qname;
( %Extension.aisle.qname; )* >
<!ATTLIST %Extension.store.qname;
name CDATA #IMPLIED
%Extension.xmlns.attrib;
>
<!ELEMENT %Extension.aisle.qname;
( %Inventory.shelf.qname; )* >
<!ATTLIST %Extension.aisle.qname;
number CDATA #IMPLIED
%Extension.xmlns.attrib;
>

<!-- end of extension-1.mod -->

6. Markup Language Examples


6.1. XHTML-Inventory Extensions - an XHTML Family Markup Language
This markup language complies with all of the requirements for an XHTML family markup language. It uses the
XHTML Core Modules, and extends that with the Inventory and Inventory Extensions modules defined above.
6.1.1. Qname Collection Module
<!-- Bring in the inventory qualified names -->
<!ENTITY % Inventory-qname.mod
PUBLIC "-//MY COMPANY//ENTITIES XHTML Inventory Qnames 1.0//EN"
"inventory-qname-1.mod" >
%Inventory-qname.mod;

<!-- Bring in the local extension module -->


<!ENTITY % Extension-qname.mod
SYSTEM "extension-qname-1.mod" >
%Extension-qname.mod;

<!-- Define the xmlns extension attributes -->


<!ENTITY % XHTML.xmlns.extra.attrib
"%Inventory.xmlns.extra.attrib;
%Extension.xmlns.extra.attrib;" >

6.1.2. Content Model Module


<!-- ...................................................................... -->
<!-- Inventory Extension Model Module .................................... -->
<!-- file: xhtml-invext-model-1.mod

SYSTEM "xhtml-invext-model-1.mod"
...................................................................... -->

<!-- Define the content model for Misc.extra -->


<!ENTITY % Misc.class
"| %script.qname; | %noscript.qname; | %Extension.store.qname; ">

<!-- .................... Inline Elements ...................... -->

<!ENTITY % HeadOpts.mix
"( %meta.qname; )*" >

<!ENTITY % I18n.class "" >

<!ENTITY % InlStruct.class "%br.qname; | %span.qname;" >

<!ENTITY % InlPhras.class
"| %em.qname; | %strong.qname; | %dfn.qname; | %code.qname;
| %samp.qname; | %kbd.qname; | %var.qname; | %cite.qname;
| %abbr.qname; | %acronym.qname; | %q.qname;" >

<!ENTITY % InlPres.class
"| %tt.qname; | %i.qname; | %b.qname; | %big.qname;
| %small.qname; | %sub.qname; | %sup.qname;" >

<!ENTITY % Anchor.class "| %a.qname;" >

<!ENTITY % InlSpecial.class "| %img.qname; " >

<!ENTITY % Inline.extra "" >

<!-- %Inline.class; includes all inline elements,


used as a component in mixes
-->
<!ENTITY % Inline.class
"%InlStruct.class;
%InlPhras.class;
%InlPres.class;
%Anchor.class;
%InlSpecial.class;"
>

<!-- %InlNoAnchor.class; includes all non-anchor inlines,


used as a component in mixes
-->
<!ENTITY % InlNoAnchor.class
"%InlStruct.class;
%InlPhras.class;
%InlPres.class;
%InlSpecial.class;"
>

<!-- %InlNoAnchor.mix; includes all non-anchor inlines


-->
<!ENTITY % InlNoAnchor.mix
"%InlNoAnchor.class;
%Misc.class;"
>

<!-- %Inline.mix; includes all inline elements, including %Misc.class;


-->
<!ENTITY % Inline.mix
"%Inline.class;
%Misc.class;"
>

<!-- ..................... Block Elements ...................... -->

<!ENTITY % Heading.class
"%h1.qname; | %h2.qname; | %h3.qname;
| %h4.qname; | %h5.qname; | %h6.qname;" >

<!ENTITY % List.class "%ul.qname; | %ol.qname; | %dl.qname;" >

<!ENTITY % Blkstruct.class "%p.qname; | %div.qname;" >

<!ENTITY % Blkphras.class
"| %pre.qname; | %blockquote.qname; | %address.qname;" >

<!ENTITY % Blkpres.class "| %hr.qname;" >

<!ENTITY % Block.extra "" >


<!-- %Block.class; includes all block elements,
used as a component in mixes
-->
<!ENTITY % Block.class
"%Blkstruct.class;
%Blkphras.class;
%Blkpres.class;
%Block.extra;"
>

<!-- %Block.mix; includes all block elements plus %Misc.class;


-->
<!ENTITY % Block.mix
"%Heading.class;
| %List.class;
| %Block.class;
%Misc.class;"
>

<!-- ................ All Content Elements .................. -->

<!-- %Flow.mix; includes all text content, block and inline


-->
<!ENTITY % Flow.mix
"%Heading.class;
| %List.class;
| %Block.class;
| %Inline.class;
%Misc.class;"
>

<!-- end of xhtml-invext-model-1.mod -->

6.1.3. DTD Driver


<!-- ....................................................................... -->
<!-- Inventory Extension DTD .............................................. -->
<!-- file: xhtml-invext-1.dtd -->

<!-- This is the DTD driver for inventory extension 1.0.

Please use this formal public identifier to identify it:

"-//MY COMPANY//DTD XHTML Inventory Extension 1.0//EN"

And this namespace for extension-unique elements:

xmlns:invext="http://www.my.org/xmlns/invext"

Other namespaces are also included.


-->
<!ENTITY % XHTML.version "-//MY COMPANY//DTD XHTML Inventory Extension 1.0//EN" >

<!-- Define the xhtml qualified names module to be ours -->


<!ENTITY % xhtml-qname-extra.mod
SYSTEM "xhtml-invext-qname-1.mod" >

<!-- reserved for use with document profiles -->


<!ENTITY % XHTML.profile "" >

<!-- Define the Content Model for the framework to use -->
<!ENTITY % xhtml-model.mod
SYSTEM "xhtml-invext-model-1.mod" >

<!-- Disable bidirectional text support -->


<!ENTITY % XHTML.bidi "IGNORE" >

<!-- Bring in the XHTML Framework -->


<!ENTITY % xhtml-framework.mod
PUBLIC "-//W3C//ENTITIES XHTML Modular Framework 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-framework-1.mod" >
%xhtml-framework.mod;

<!-- Text Module (Required) ............................... -->


<!ENTITY % xhtml-text.mod
PUBLIC "-//W3C//ELEMENTS XHTML Text 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-text-1.mod" >
%xhtml-text.mod;

<!-- Hypertext Module (required) ................................. -->


<!ENTITY % xhtml-hypertext.mod
PUBLIC "-//W3C//ELEMENTS XHTML Hypertext 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-hypertext-1.mod" >
%xhtml-hypertext.mod;

<!-- Lists Module (required) .................................... -->


<!ENTITY % xhtml-list.mod
PUBLIC "-//W3C//ELEMENTS XHTML Lists 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-list-1.mod" >
%xhtml-list.mod;

<!-- Inventory Module ........................................ -->


<!ENTITY % Inventory-elements.mod
SYSTEM "inventory-1.mod" >
%Inventory-elements.mod;

<!-- Inventory Extension Module .............................. -->


<!ENTITY % Invext-elements.mod
SYSTEM "extension-1.mod" >
%Invext-elements.mod;

<!-- XHTML Images module ........................................ -->


<!ENTITY % xhtml-image.mod
PUBLIC "-//W3C//ELEMENTS XHTML Images 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-image-1.mod" >
%xhtml-image.mod;

<!-- Document Metainformation Module ............................ -->


<!ENTITY % xhtml-meta.mod
PUBLIC "-//W3C//ELEMENTS XHTML Metainformation 1.0//EN"
"xhtml-meta-1.mod" >
%xhtml-meta.mod;

<!-- Document Structure Module (required) ....................... -->


<!ENTITY % xhtml-struct.mod
PUBLIC "-//W3C//ELEMENTS XHTML Document Structure 1.0//EN"
"http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-struct-1.mod" >
%xhtml-struct.mod;

7. Usage Examples
7.1. XHTML-Inventory with no prefixes
This example uses the new markup language in its default form - with no prefixes being defined for any module.
<!DOCTYPE html SYSTEM "xhtml-invext-1.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>An example using defaults</title>
</head>
<body>
<p>This is content in the XHTML namespace</p>
<shelf>
<item>
<desc>
this is a description.
</desc>
<sku>
this is the sku.
</sku>
<price>
this is the price.
</price>
</item>
</shelf>
</body>
</html>

7.2. XHTML-Inventory with Inventory prefixes


This example uses the new markup language with prefixes enabled for just the inventory and extension components.
<!DOCTYPE html SYSTEM "xhtml-invext-1.dtd" [
<!ENTITY % Inventory.prefixed "INCLUDE">
<!ENTITY % Inventory.prefix "i">
]>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:i="http://www.my.org/xmlns/inventory" >
<head>
<title>An example using prefixes</title>
</head>
<body>
<p>This is content in the XHTML namespace</p>
<i:shelf>
<i:item>
<i:desc>
this is a description.
</i:desc>
<i:sku>
this is the sku.
</i:sku>
<i:price>
this is the price.
</i:price>
</i:item>
</i:shelf>
</body>
</html>

7.3. XHTML-Inventory with all prefixes


This example places a prefix on every element.
<!DOCTYPE x:html SYSTEM "xhtml-invext-1.dtd" [
<!ENTITY % NS.prefixed "INCLUDE">
<!ENTITY % XHTML.prefix "x" >
<!ENTITY % Inventory.prefix "i">
]>
<x:html xmlns:x="http://www.w3.org/1999/xhtml"
xmlns:i="http://www.my.org/xmlns/inventory" >
<x:head>
<x:title>An example using prefixes</x:title>
</x:head>
<x:body>
<x:p>This is content in the XHTML namespace</x:p>
<i:shelf>
<i:item>
<i:desc>
this is a description.
</i:desc>
<i:sku>
this is the sku.
</i:sku>
<i:price>
this is the price.
</i:price>
</i:item>
</i:shelf>
</x:body>
</x:html>


While the most common scripting language, ECMAScript (more widely known as JavaScript), is developed by Ecma, a
great many of the APIs made available in browsers have been defined at W3C.
What is scripting?
A script is program code that doesn’t need pre-processing (e.g. compiling) before being run. In the context of a Web
browser, scripting usually refers to program code written in JavaScript that is executed by the browser when a page is
downloaded, or in response to an event triggered by the user.
Scripting can make Web pages more dynamic. For example, without reloading a new version of a page it may allow
modifications to the content of that page, or allow content to be added to or sent from that page. The former has been
called DHTML (Dynamic HTML), and the latter AJAX (Asynchronous JavaScript and XML).
Beyond this, scripts increasingly allow developers to create a bridge between the browser and the platform it is
running on, making it possible, for example, to create Web pages that incorporate information from the user’s
environment, such as current location, address book details, etc.
This additional interactivity makes Web pages behave like a traditional software application. These Web pages are
often called Web applications and can be made available either directly in the browser as a Web page, or can be
packaged and distributed as Widgets.
What scripting interfaces are available?
The most basic scripting interface developed at W3C is the DOM, the Document Object Model which allows
programs and scripts to dynamically access and update the content, structure and style of documents. DOM
specifications form the core of DHTML.
Modifications of the content using the DOM by the user and by scripts trigger events that developers can make use of
to build rich user interfaces.
A number of more advanced interfaces are being standardized, for instance:
• XMLHttpRequest makes it possible to load additional content from the Web without loading a new document,
a core component of AJAX,
• the Geolocation API makes the user’s current location available to browser-based applications,
• several APIs make the integration of Web applications with the local file system and storage seamless.
WAI-ARIA offers mechanisms to ensure that this additional interactivity remains usable independent of devices and
disabilities. Additional considerations apply to the development of Web applications for mobile devices.
Beyond scripting
While scripting offers a great opportunity to develop new interfaces and experiment with new user interactions, over
time a number of these additions benefit from a more declarative approach; for instance, instead of having each and
every developer re-implement a calendar-interface that allows a user to pick a date, defining an input type (<input
type='date' />) that does it automatically saves a lot of time and bugs, and creates a ground for further innovation.
Beyond the set of declarative interfaces made available through HTML, several technologies have been developed to
make these Declarative Web Applications possible.

Adding a touch of style


This is a short guide to styling your Web pages. It will show you how to use W3C's Cascading Style Sheets language
(CSS) as well as alternatives using HTML itself. The route will steer you clear of most of the problems caused by
differences between different brands and versions of browsers.
For style sheets to work, it is important that your markup be free of errors. A convenient way to automatically fix
markup errors is to use the HTML Tidy utility. This also tidies the markup making it easier to read and easier to edit. I
recommend you regularly run Tidy over any markup you are editing. Tidy is very effective at cleaning up markup
created by authoring tools with sloppy habits.
The following will teach you how to:
• use the style element
• link to separate style sheets
• set page margins
• set left and right and first-line indents
• set the amount of whitespace above and below
• set the font type, style and size
• add borders and backgrounds
• set colors with named or numeric values
• add style for browsers that don't understand CSS
Getting started
Let's start with setting the color of the text and the background. You can do this by using the STYLE element to set
style properties for the document's tags:
<style type="text/css">
body { color: black; background: white; }
</style>
The style element is placed within the document head. Here is a template HTML file showing where the above style
element fits. You can copy and paste this into your editor as a convenient way to experiment with CSS style sheets:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title> replace with your document's title </title>
<style type="text/css">
body { color: black; background: white; }
</style>
</head>
<body>

replace with your document's content

</body>
</html>
The stuff between the <style> and </style> is written in special notation for style rules. Each rule starts with a tag
name followed by a list of style properties bracketed by { and }. In this example, the rule matches the body tag. As you
will see, the body tag provides the basis for setting the overall look and feel of your Web page.
Each style property starts with the property's name, then a colon and lastly the value for this property. When there is
more than one style property in the list, you need to use a semicolon between each of them to delimit one property
from the next. In this example, there are two properties - "color" which sets the color of the text, and "background"
which sets the color of the page background. I recommend always adding the semicolon even after the last property.
Colors can be given as names or as numerical values, for instance rgb(255, 204, 204) which is a fleshy pink. The 3
numbers correspond to red, green and blue respectively in the range 0 to 255. You can also use hexadecimal
notation: the same color can be written as #FFCCCC. More details on color are given in a later section.
Note that the style element must be placed in the document's head along with the title element. It shouldn't be placed
within the body.
Linking to a separate style sheet
If you are likely to want to use the same styles for several Web pages it is worth considering using a separate style
sheet which you then link from each page. You can do this as follows:
<link type="text/css" rel="stylesheet" href="style.css">
The link tag should be placed within the <head> ... </head> element. Here is an HTML file with a link to an external
style sheet:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title> replace with your document's title </title>
<link type="text/css" rel="stylesheet" href="style.css">
</head>
<body>

replace with your document's content

</body>
</html>
The link element's rel attribute must be set to the value "stylesheet" to allow the browser to recognize that the href
attribute gives the Web address (URL) for your style sheet. A simple stylesheet file might look like the following:
/* style.css - a simple style sheet */
body {
margin-left: 10%; margin-right: 10%;
color: black; background: white;
}
Note that the same HTML file can link to an external style sheet and also include a style element for additional style
settings specific to this page. If you place the link element before the style element, you can use the latter to override
the style settings in the linked style sheet.
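As a small sketch of such an override, assume the page links to the style.css shown above but wants a different background. The pale yellow value is only an illustration. The rule below would go in the page's own style element, placed after the link element:

```css
/* Page-specific rule, placed in a style element after the link element. */
/* It replaces the background set in style.css; the margins and text */
/* color from style.css still apply. */
body { background: rgb(255, 255, 204); }
```

Both rules match the body tag, so the one the browser reads last wins.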
Setting the page margins
Web pages look a lot nicer with bigger margins. You can set the left and right margins with the "margin-left" and
"margin-right" properties, e.g.
<style type="text/css">
body { margin-left: 10%; margin-right: 10%; }
</style>
This sets both margins to 10% of the window width, and the margins will scale when you resize the browser window.
Setting left and right indents
To make headings a little more distinctive, you can make them start within the margin set for the body, e.g.
<style type="text/css">
body { margin-left: 10%; margin-right: 10%; }
h1 { margin-left: -8%;}
h2,h3,h4,h5,h6 { margin-left: -4%; }
</style>
This example has three style rules. One for the body, one for h1 (used for the most important headings) and one for the
rest of the headings (h2, h3, h4, h5 and h6). The margins for the headings are additive to the margins for the body.
Negative values are used to move the start of the headings to the left of the margin set for the body.
In the following sections, the examples of particular style rules will need to be placed within the style element in the
document's head (if present) or in a linked style sheet.
Controlling the white space above and below
Browsers do a pretty good job for the white space above and below headings and paragraphs etc. Two reasons for
taking control of this yourself are: when you want a lot of white space before a particular heading or paragraph, or
when you need precise control for the general spacings.
The "margin-top" property specifies the space above and the "margin-bottom" specifies the space below. To set these
for all h2 headings you can write:
h2 { margin-top: 8em; margin-bottom: 3em; }
The em is a very useful unit as it scales with the size of the font. One em is the height of the font. By using em's you
can preserve the general look of the Web page independently of the font size. This is much safer than alternatives such
as pixels or points, which can cause problems for users who need large fonts to read the text.
Points are commonly used in word processing packages, e.g. 10pt text. Unfortunately the same point size is rendered
differently on different browsers. What works fine for one browser will be illegible on another! Sticking with em's
avoids these problems.
To specify the space above a particular heading, you should create a named style for the heading. You do this with the
class attribute in the markup, e.g.
<h2 class="subsection">Getting started</h2>
The style rule is then written as:
h2.subsection { margin-top: 8em; margin-bottom: 3em; }
The rule starts with the tag name, a dot and then the value of the class attribute. Be careful to avoid placing a space
before or after the dot. If you do, the rule won't work. There are other ways to set the styles for a particular element but
the class attribute is the most flexible.
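One of those other ways, shown here only as a sketch, is to select a single element by its id attribute. The id value "intro" is a made-up name and assumes markup such as <h2 id="intro">Getting started</h2>:

```css
/* Matches the one h2 element whose id attribute is "intro" */
h2#intro { margin-top: 8em; margin-bottom: 3em; }
```

An id must be unique within a page, so this approach styles at most one element per page, whereas a class can be reused on as many elements as you like.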
When a heading is followed by a paragraph, the value for margin-bottom for the heading isn't added to the value for
margin-top for the paragraph. Instead, the maximum of the two values is used for the spacing between the heading and
paragraph. This subtlety applies to margin-top and margin-bottom regardless of which tags are involved.
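To illustrate, here is a minimal sketch with values chosen only for the example:

```css
h2 { margin-bottom: 3em; }  /* measured in the heading's own font size */
p  { margin-top: 1em; }     /* measured in the paragraph's font size */
/* The gap between an h2 and the paragraph that follows it is the larger
   of the two margins (here the heading's 3em), not the sum of both. */
```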
First-line indent
Sometimes you may want to indent the first line of each paragraph. The following style rule emulates the traditional
way paragraphs are rendered in novels:
p { text-indent: 2em; margin-top: 0; margin-bottom: 0; }
It indents the first line of each paragraph by 2 em's and suppresses the inter-paragraph spacing.
Controlling the font
This section explains how to set the font and size, and how to add italic, bold and other styles.
Font styles
The most common styles are to place text in italic or bold. Most browsers render the em tag in italic and the strong tag
in bold. Let's assume you instead want em to appear in bold italic and strong in bold uppercase:
em { font-style: italic; font-weight: bold; }
strong { text-transform: uppercase; font-weight: bold; }
If you feel so inclined, you can fold headings to lower case as follows:
h2 { text-transform: lowercase; }

Setting the font size


Most browsers use a larger font size for more important headings. If you override the default size, you run the risk of
making the text too small to be legible, particularly if you use points. You are therefore recommended to specify font
sizes in relative terms.
This example sets heading sizes in percentages relative to the size used for normal text:
h1 { font-size: 200%; }
h2 { font-size: 150%; }
h3 { font-size: 100%; }
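The same relative sizes can also be written in em units, since for the font-size property 1em stands for the font size of the parent element. A sketch equivalent to the percentages above:

```css
h1 { font-size: 2em; }    /* same as 200% */
h2 { font-size: 1.5em; }  /* same as 150% */
h3 { font-size: 1em; }    /* same as 100% */
```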

Setting the font family


It is likely that your favorite font won't be available on all browsers. To get around this, you are allowed to list several
fonts in preference order. There is a short list of generic font names which are guaranteed to be available, so you are
recommended to end your list with one of these: serif, sans-serif, cursive, fantasy, or monospace, for instance:
body { font-family: Verdana, sans-serif; }
h1,h2 { font-family: Garamond, "Times New Roman", serif; }
In this example, important headings would preferably be shown in Garamond, failing that in Times New Roman, and
if that is unavailable, in the browser's default serif font. Paragraph text would appear in Verdana or, if that is unavailable,
in the browser's default sans-serif font.
The legibility of different fonts generally depends more on the height of lower case letters than on the font size itself.
Fonts like Verdana are much more legible than ones like "Times New Roman" and are therefore recommended for
paragraph text.
Avoid problems with fonts and margins
My first rule is to avoid text at the body level that isn't wrapped in a block level element such as p. For instance:
<h2>Spring in Wiltshire</h2>

Blossom on the trees, bird song and the sound of lambs
bleating in the fields.
The text following the heading runs the risk on some browsers of being rendered with the wrong font and margins.
You are therefore advised to enclose all such text in a paragraph, e.g.
<h2>Spring in Wiltshire</h2>

<p>Blossom on the trees, bird song and the sound of lambs
bleating in the fields.</p>
My second rule is to set the font family for pre elements, as some browsers forget to use a fixed pitch font when you
set the font size or other properties for pre.
pre { font-family: monospace; }
My third rule is to set the font family on headings, p and ul elements if you intend to set borders or backgrounds on
elements such as div. This is a work-around for a bug where the browser forgets to use the inherited font family,
instead switching to the default font as set by the browser preferences.
h1,h2,h3,h4,h5,p,ul { font-family: sans-serif; }

Adding borders and backgrounds


You can easily add a border around a heading, list, paragraph or a group of these enclosed with a div element. For
instance:
div.box { border: solid; border-width: thin; width: 100%; }
Note that without the "width" property, some browsers will place the right margin too far to the right. This rule can then be
used with markup such as:
<div class="box">
The content within this DIV element will be enclosed
in a box with a thin line around it.
</div>
There is a limited choice of border types: dotted, dashed, solid, double, groove, ridge, inset and outset. The
border-width property sets the width. Its values include thin, medium and thick as well as a specified width, e.g. 0.1em. The
border-color property allows you to set the color.
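As an illustration of combining these properties, the following sketch draws a dashed navy border. The class name "note" is made up for this example:

```css
div.note {
  border-style: dashed;   /* one of the border types listed above */
  border-width: 0.1em;    /* a specified width instead of thin, medium or thick */
  border-color: navy;     /* one of the 16 standard color names */
}
```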
A nice effect is to paint the background of the box with a solid color or with a tiled image. To do this you use the
background property. You can fill the box enclosing a div as follows:
div.color {
background: rgb(204,204,255);
padding: 0.5em;
border: none;
}
Without an explicit definition for the border property, some browsers will only paint the background color under each
character. The padding property introduces some space between the edges of the colored region and the text it
contains.
You can set different values for padding on the left, top, right and bottom sides with the padding-left, padding-top,
padding-right and padding-bottom properties, e.g. padding-left: 1em.
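For instance, a variation on the colored box with more room at the sides than at the top and bottom might look like this. The values are chosen only for illustration:

```css
div.color {
  background: rgb(204,204,255);
  border: none;
  padding-top: 0.5em;
  padding-bottom: 0.5em;
  padding-left: 1em;      /* extra room on the left */
  padding-right: 1em;     /* and on the right */
}
```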
Suppose you only want borders on some of the sides. You can control the border properties for each of the sides
independently using the border-left, border-top, border-right and border-bottom family of properties together with the
appropriate suffix: style, width or color, e.g.
p.changed {
padding-left: 0.2em;
border-left: solid;
border-right: none;
border-top: none;
border-bottom: none;
border-left-width: thin;
border-color: red;
}
which sets a red border down the left hand side only of any paragraph with the class "changed".
Setting Colors
Some examples for setting colors appeared in earlier sections. Here is a reminder:
body {
color: black;
background: white;
}
strong { color: red; }
This sets the default to black text on a white background, but renders strong elements in red. There are 16 standard
color names, which are explained just below. You can also use decimal values for red, green and blue, where each
value appears in the range 0 to 255, e.g. rgb(255, 0, 0) is the same as red. You can also use hex color values, which
start with the '#' character followed by six hexadecimal digits. A two-way converter is included below which allows
you to convert from RGB to hex color values.
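To make the three notations concrete, each of the following rules sets the same red. You would use only one of them in practice:

```css
strong { color: red; }             /* color name */
strong { color: rgb(255, 0, 0); }  /* decimal red, green and blue values */
strong { color: #FF0000; }         /* hexadecimal: FF red, 00 green, 00 blue */
```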
Setting Link Colors
You can use CSS to set the color for hypertext links, with a different color for links that you have yet to follow, ones
you have followed, and the active color for when the link is being clicked. You can even set the color for when the
mouse pointer is hovering over the link.
:link { color: rgb(0, 0, 153); } /* for unvisited links */
:visited { color: rgb(153, 0, 153); } /* for visited links */
a:hover { color: rgb(0, 96, 255); } /* when mouse is over link */
a:active { color: rgb(255, 0, 102); } /* when link is clicked; listed after hover so that it takes effect */
Sometimes you may want to show hypertext links without them being underlined. You can do this by setting the text-
decoration property to none, for example:
a.plain { text-decoration: none; }
Which would suppress underlining for a link such as:
This is <a class="plain" href="what.html">not underlined</a>
Most people, when they see underlined text on a Web page, will expect it to be part of a hypertext link. As a result,
you are advised to leave underlining on for hypertext links. A similar argument applies to the link colors: most people
will interpret underlined blue text as a hypertext link. You are advised to leave link colors alone, except when the
color of the background would otherwise make the text hard to read.
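Here is one possible sketch of that exception: a page with a dark background, where the usual dark blue links would be hard to read. The colors are only an assumption for this illustration, and the underlining is left on.

```html
<style type="text/css">
  /* Hypothetical dark color scheme for this sketch */
  body { color: white; background: rgb(0, 0, 102); }
  :link { color: rgb(153, 204, 255); }     /* pale blue: unvisited */
  :visited { color: rgb(255, 204, 255); }  /* pale pink: visited */
</style>
<p>Read the <a href="what.html">explanation</a> for details.</p>
```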
Color Blindness
When using color, remember that 5 to 10% of men have some form of color blindness. This can cause difficulties
distinguishing between red and green, or between yellow and blue. In rare cases, there is an inability to perceive any
colors at all. You should avoid foreground/background color combinations that would make the text hard
to read for people with color blindness.
Named colors
The standard set of color names is: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red,
silver, teal, white, and yellow. These 16 colors are defined in HTML 3.2 and 4.01 and correspond to the basic VGA set
on PCs. Most browsers accept a wider set of color names but use of these is not recommended.
Color names and sRGB values

black   = "#000000"    green  = "#008000"
silver  = "#C0C0C0"    lime   = "#00FF00"
gray    = "#808080"    olive  = "#808000"
white   = "#FFFFFF"    yellow = "#FFFF00"
maroon  = "#800000"    navy   = "#000080"
red     = "#FF0000"    blue   = "#0000FF"
purple  = "#800080"    teal   = "#008080"
fuchsia = "#FF00FF"    aqua   = "#00FFFF"

Thus, the color value "#800080" is the same as "purple".


Hexadecimal color values
Values like "#FF9999" represent colors as hexadecimal numbers for red, green and blue. The first two characters
following the # give the number for red, the next two for green and the last two for blue. These numbers are always in
the range 0 to 255 decimal. If you know the values in decimal, you can convert to hexadecimal using a calculator, like
the one that comes as part of Microsoft Windows.
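To make the conversion concrete: each pair of hex digits is worth 16 times the first digit plus the second, so FF is 15 × 16 + 15 = 255 and 99 is 9 × 16 + 9 = 153. The sketch below therefore sets the same color twice, once in each notation. The class names "hex" and "rgb" are made up for this illustration.

```html
<style type="text/css">
  /* Both rules set the same color: FF = 255, 99 = 153 */
  p.hex { background: #FF9999; }
  p.rgb { background: rgb(255, 153, 153); }
</style>
<p class="hex">Background given as a hex value.</p>
<p class="rgb">Background given as decimal RGB values.</p>
```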


Browser safe colors


New computers support thousands or millions of colors, but many older color systems can only show up to 256 colors
at a time. To cope with this, these browsers make do with colors from a fixed palette. The effect of this is often visible
as a speckling of colors as the browser tries to approximate the true color at any point in the image. This problem will
gradually go away as older computers are replaced by newer models.
Most browsers support the same so-called "browser safe" palette. This uses 6 evenly spaced gradations in red, green
and blue and their combinations. If you select image colors from this palette, you can avoid the speckling effect. This
is particularly useful for background areas of images.
If the browser is using the browser safe palette, the page background uses the nearest color in the palette. If you set the
page background to a color which isn't in the browser safe palette, you run the risk that the background will have
different colors depending on whether the computer is using indexed or true-color.
The browser safe palette is constructed from colors where red, green and blue are restricted to the values:
RGB   0  51  102  153  204  255
Hex  00  33   66   99   CC   FF
Here is a table of the browser safe colors and their hex values. My thanks to Bob Stein for this arrangement.
[Table not reproduced: a grid of the browser safe colors, each labelled with its six-digit hex value, e.g. FFCC00.
The palette contains every combination of the six values listed above for red, green and blue, 216 colors in all.]
Color swatches for the browser safe palette are available free for many popular graphics packages, from
www.visibone.com.
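As a small illustration of picking page colors from the browser safe palette, every pair of hex digits in the sketch below is one of the six permitted values. The particular colors are just an example, not a recommendation.

```html
<style type="text/css">
  /* Every pair of hex digits below is one of 00, 33, 66, 99, CC or FF */
  body { color: #000000; background: #FFFFCC; }
  h1 { color: #003366; }
  strong { color: #CC0000; }
</style>
```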
What about browsers that don't support CSS?
Older browsers, that is to say before Netscape 4.0 and Internet Explorer 4.0, either don't support CSS at all or do so
inconsistently. For these browsers you can still control the style by using HTML itself.
Setting the color and background
You can set the color using the BODY tag. The following example sets the background color to white and the text
color to black:
<body bgcolor="white" text="black">
The BODY element should be placed before the visible content of the Web page, e.g. before the first heading. You can
also control the color of hypertext links. There are three attributes for this:
• link for unvisited links
• vlink for visited links
• alink for the color used when you click the link
Here is an example that sets all three:
<body bgcolor="white" text="black"
link="navy" vlink="maroon" alink="red">
You can also get the browser to tile the page background with an image using the background attribute to specify the
Web address for the image, e.g.
<body bgcolor="white" background="texture.jpeg" text="black"
link="navy" vlink="maroon" alink="red">
It is a good idea to specify a background color using the bgcolor attribute in case the browser is unable to render the
image. You should check that the colors you have chosen don't cause legibility problems. As an extreme case consider
the following:
<body bgcolor="black">
Most browsers will render text in black by default. The end result is that the page will be shown with black text on a
black background! Remember too that lots of people suffer from one form of color blindness or another; for example,
olive green may appear brown to some people.
A separate problem appears when you try to print the Web page. Many browsers will ignore the background color, but
will obey the text color. Setting the text to white will often result in a blank page when printed, so the following is not
recommended:
<body bgcolor="black" text="white">
You can also use the bgcolor attribute on table cells, e.g.
<table border="0" cellpadding="5">
<tr>
<td bgcolor="yellow">colored table cell</td>
</tr>
</table>
Tables can be used for a variety of layout effects and have been widely exploited for this. In the future this role is
likely to be supplanted by style sheets, which make it practical to achieve precise layout with less effort.
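For comparison, here is a sketch of the same colored cell written with a style sheet rather than the bgcolor and cellpadding attributes. It assumes a browser with CSS support, and the class name "highlight" is made up for this illustration.

```html
<style type="text/css">
  /* "highlight" is a hypothetical class name */
  td.highlight { background: yellow; padding: 5px; }
</style>
<table>
<tr>
<td class="highlight">colored table cell</td>
</tr>
</table>
```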
Setting the font, its size and color
The FONT tag can be used to select the font, to set its size and the color. This example just sets the color:
This sentence has a <font color="yellow">word</font> in yellow.
The face attribute is used to set the font. It takes a list of fonts in preference order, e.g.
<font face="Garamond, Times New Roman">some text ...</font>
The size attribute can be used to select the font size as a number from 1 to 7. If you place a - or + sign before the
number it is interpreted as a relative value. Use size="+1" when you want to use the next larger font size and size="-1"
when you want to use the next smaller font size, e.g.
<font size="+1" color="maroon"
face="Garamond, Times New Roman">some text ...</font>
There are a couple of things you should avoid: Don't choose color combinations that make text hard to read for people
who are color blind. Don't use font to make regular text into headings, which should always be marked up using the h1
to h6 tags as appropriate to the importance of the heading.
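The second point can be sketched as follows. The heading text is invented for this example.

```html
<!-- Not recommended: regular text dressed up as a heading -->
<p><font size="+2"><b>Opening hours</b></font></p>

<!-- Preferred: mark the heading up as a heading -->
<h2>Opening hours</h2>
```

The h2 version tells browsers and other tools that the text is a heading, whereas the font version only changes how it looks.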
English Typography
Use the proper English characters instead of their misused equivalents.
Quotes
“ (&#8220;) opening quote (instead of ")
” (&#8221;) closing quote (instead of ")
Apostrophe
’ (&#8217;) apostrophe (instead of ')
Dashes and Hyphens
– (&#8211; or &ndash;) en dash, used for ranges, e.g. “13–15 November” (instead of -)
— (&#8212; or &mdash;) em dash, used for change of thought, e.g. “Star Wars is—as everyone knows—
amazing.” (instead of -, or --)
Ellipsis
… (&#8230; or &hellip;) horizontal ellipsis, used to indicate an omission or a pause (instead of ...)
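Putting these together, a paragraph might be written along these lines. The sentence itself is invented for this sketch; only the character references come from the list above.

```html
<!-- Curly quotes, apostrophe, en dash for the range, ellipsis and em dash -->
<p>&#8220;It&#8217;s open 9&#8211;5 &#8230; most days,&#8221; she said&#8212;then left.</p>
```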

Internationalization Quicktips
Use Unicode wherever possible for content, databases, etc. Always declare the encoding of content.
The character encoding you choose determines how bytes are mapped to characters in your text.
Normally character encodings limit you to a particular script or set of languages. Unicode allows you to deal simply
with almost all scripts and languages in use around the world. In this way Unicode simplifies the handling of content
in multiple languages, whether within a single page or across one or more sites. Unicode is particularly useful when
used in forms, scripts and databases, where you often need to support multiple languages. Unicode also makes it very
straightforward to add new languages to your content.
Unless you appropriately declare which character encoding you are using, your users may be unable to read your
content. This is because incorrect assumptions may be made by the application interpreting your text about how the
bytes map to characters.
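One common way to declare the encoding inside an HTML page is a meta element in the head, sketched below. This assumes the file really is saved as UTF-8 and that the server does not send a different charset in its HTTP headers, which would take precedence.

```html
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>My first HTML document</title>
</head>
```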
Give me more background
Character encodings for beginners explains some of the basic concepts about character encodings, and why
you should care. Introducing Character sets and Encodings gives a gentle introduction to various aspects of
the topic.
So, how do I do this?
HTML & CSS authors • Spec developers • Server setup
Use characters rather than escapes (e.g. &#xE1; &#225; or &aacute;) whenever you can.
Escapes such as Numeric Character References (NCRs), and entities are ways of representing any Unicode character
in markup using only ASCII characters. For example, you can represent the character á in X/HTML as &#xE1; or
&#225; or &aacute;.
Such escapes are useful for clearly representing ambiguous or invisible characters, and to prevent problems with
syntax characters such as ampersands and angle brackets. They may also be useful on occasion to represent characters
not supported by your character encoding or unavailable from your keyboard. Otherwise you should always use
characters rather than escapes.
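The difference can be sketched as follows. The sentences are invented for this example, and the first paragraph assumes the page is saved and declared in an encoding, such as UTF-8, that supports the characters used.

```html
<!-- Preferred: type the characters themselves -->
<p>El Niño está aquí.</p>

<!-- Same text with escapes: much harder to read and edit -->
<p>El Ni&#xF1;o est&#xE1; aqu&#xED;.</p>

<!-- Escapes remain the right choice for syntax characters -->
<p>Fish &amp; chips costs &lt; 5 pounds.</p>
```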
Give me more background
Using character entities and NCRs provides additional information about the use of escapes in markup
languages. In particular, note that entities (such as &aacute;) should be used with caution.
So, how do I do this?
HTML & CSS authors • Spec developers • SVG authors
Declare the language of documents and indicate internal language changes.
Information about the (human) language of content is already important for accessibility, styling, searching, editing,
and other reasons. As more and more content is tagged and tagged correctly, applications that can detect language
information will become more and more useful and pervasive.
When declaring language, you may need to express information about a specific range of content in a different way
from metadata about the document as a whole. It is important to understand this distinction.
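In HTML, one way to do both is the lang attribute: on the html tag for the document as a whole, and on an inner element for a change of language. This is only a sketch with invented content. For XHTML 1.0 pages you would typically give xml:lang alongside lang.

```html
<html lang="en">
<head>
<title>Language example</title>
</head>
<body>
<p>The French word for "cat" is <span lang="fr">chat</span>.</p>
</body>
</html>
```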
Give me more background
Language on the Web gives a gentle introduction to various aspects of the topic.
So, how do I do this?
HTML & CSS authors • SVG authors • XML authors • Schema developers • Server setup
Use style sheets for presentational information. Restrict markup to semantics.
It is an important principle of Web design to keep the way content is styled or presented separate from the actual text
itself. This makes it simple to apply alternative styling for the same text, for example in order to display the same
content on both a conventional browser and a small hand-held device.
This principle is particularly useful for localization, since different scripts have different typographic needs. For
example, due to the complexity of Japanese characters, it may be preferable to show emphasis in Japanese X/HTML
pages in other ways than bolding or italicisation. It is much easier to apply such changes if the presentation is
described using CSS, and markup is much cleaner and more manageable if text is correctly and unambiguously
labelled as 'emphasised' rather than just 'bold'.
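As a rough sketch of this idea, the markup below labels text as emphasised with em, and a style rule changes how emphasis looks in Japanese text only. The choice of a background tint is purely an assumption for illustration, and the :lang() selector is not supported by some older browsers.

```html
<style type="text/css">
  /* Assumption for this sketch: Japanese emphasis is shown with a
     background tint instead of italics */
  em:lang(ja) { font-style: normal; background: rgb(255, 255, 204); }
</style>
<p lang="ja">これは<em>重要</em>です。</p>
<p lang="en">This is <em>important</em>.</p>
```

Changing the presentation later means editing the one style rule; the markup stays as it is.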
It can save considerable time and effort during localization to work with CSS files rather than have to change the
markup, because any needed changes can be made in a single location for all pages, and the translator can focus on the
content rather than the presentation.
Give me more background
Read the slides from the 2007 @media conference presentation "Designing for International Users:
Practical Tips".
Check for translatability and inappropriate cultural bias in images, animations & examples.
If you want your content to really communicate with people, you need to speak their language, not only through the
text, but also through local imagery, color, objects and preoccupations. It is easy to overlook the culture-specific
nature of symbolism, behaviour, concepts, body language, humor, etc. You should get feedback on the suitability and
relevance of your images, video-clips, and examples from in-country users.
You should also take care when incorporating text in graphics when content is translated. Text on complex
backgrounds or in restricted spaces can cause considerable trouble for the translator. You should provide graphics to
the localization group that have text on a separate layer, and you should bear in mind that text in languages such as
English and Chinese will almost certainly expand in translation.
Give me more background
Read the slides from the 2007 @media conference presentation "Designing for International Users:
Practical Tips".
Use an appropriate encoding on both form and server. Support local formats of names/addresses,
times/dates, etc.
The encoding used for an HTML page that contains a form should support all the characters needed to enter data into
that form. This is particularly important if users are likely to enter information in multiple languages.
Databases and scripts that receive data from forms on pages in multiple languages must also be able to support the
characters for all those languages simultaneously.
The simplest way to enable this is to use Unicode for both pages containing forms and all back-end processing and
storage. In such a scenario the user can fill in data in whatever language and script they need to.
You should also try to avoid making assumptions that things such as the user's name and address will follow the same
formatting rules as your own. Ask yourself how much detail you really need to break out into separate fields for things
such as addresses. Bear in mind that in some cultures there are no street names, in others the house number follows the
street name, some people need more than one line for the part of the address that precedes the town or city name, etc.
In fact in some places an address runs top down from the general to the specific, which implies a very different layout
strategy. Be very careful about building into validation routines incorrect assumptions about area codes or telephone
number lengths. Recognize that careful labelling is required for how to enter numeric dates, since there are different
conventions for ordering of day, month and year.
If you are gathering information from people in more than one country, it is important to develop a strategy for
addressing the different formats people will expect to be able to use. Not only is this important for the design of the
forms you create, but it also has an impact on how you will store such information in databases.
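A minimal sketch of a form along these lines is shown below. It assumes the page itself is served as UTF-8; the action URL and the field names are hypothetical. The address is a single free-form field rather than separate street and house number fields, and the date field states the expected order explicitly.

```html
<form action="/register" method="post" accept-charset="UTF-8">
<p><label>Full name: <input type="text" name="fullname"></label></p>
<p><label>Postal address:<br>
<textarea name="address" rows="5" cols="40"></textarea></label></p>
<p><label>Date of birth (year-month-day, e.g. 1990-12-31):
<input type="text" name="birthdate"></label></p>
<p><input type="submit" value="Send"></p>
</form>
```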
So, how do I do this?
HTML & CSS authors • HTML & CSS authors • Spec developers
Use simple, concise text. Use care when composing sentences from multiple strings.
Simple, concise text is easier to translate. It is also easier for people to read if the text they are reading is not in their
first language.
You should take considerable care when composing messages from multiple substrings, or when inserting variable
text into strings. For example, suppose your site uses JSP scripting, and you decide to compose certain messages on
the fly. You may create messages by concatenating separate substrings, such as 'Only' or 'Don't', ' return results in ',
and 'any format' or 'HTML'. Because the order of text in sentences of other languages can be very different, translating
this may present major difficulties.
Similarly, it is important to avoid fixing the positions of variables in text such as "Page 1 of 10". The syntax of other
languages may require the numbers to be reversed to make sense. If you use PHP, this would mean using a formatting
string such as "Page %1\$d of %2\$d.", rather than the simpler "Page %d of %d.". The latter is untranslatable in
some languages.
So, how do I do this?
HTML & CSS authors
On each page include clearly visible navigation to localized pages or sites, using the target
language.
Where you have versions of a page or site in a different language, or for a different country or region, you should
provide a way for the user to view the version they prefer. This should be available from any page on your site where
an alternative exists.
When providing links to pages in other languages, use the name of the target language in the native language and
script. Don't assume that the user can read English. For example, in a link to a French page, 'French' would be written
'français'. This also applies if you are guiding the user to a country- or region-specific page or site, e.g. 'Germany'
would be 'Deutschland'.
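Such navigation might be sketched like this. The URLs are hypothetical; the lang attribute describes the language of the link text and hreflang the language of the page being linked to.

```html
<p>
<a href="/fr/" lang="fr" hreflang="fr">français</a> |
<a href="/de/" lang="de" hreflang="de">Deutsch</a> |
<a href="/ja/" lang="ja" hreflang="ja">日本語</a>
</p>
```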
So, how do I do this?
HTML & CSS authors
For XHTML, add dir="rtl" to the html tag for right-to-left text. Only re-use it to change the base
direction.
Text in languages such as Arabic, Hebrew, Persian and Urdu is read from right to left. This reading order typically
leads to right-aligned text and mirror-imaging of things like page and table layout. You can set the default alignment
and ordering of page content to right to left by simply including dir="rtl" in the html tag.
The direction set in the html tag sets a base direction for the document which cascades down through all the elements
on the page. It is not necessary to repeat the attribute on lower level elements unless you want to explicitly change the
directional flow.
Embedded text in, for example, Latin script still runs left to right within the overall right to left flow. So do numbers.
If you are working with right to left languages, you should become familiar with the basics of the Unicode
bidirectional algorithm. This algorithm takes care of much of this bidirectional text without the need for intervention
from the author. There are some circumstances, however, where markup or Unicode control characters are needed to
ensure the correct effect.
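The following is a minimal sketch of a right-to-left XHTML page with invented content. The base direction is set once on the html tag, and dir is repeated only on the one paragraph that changes it.

```html
<html xmlns="http://www.w3.org/1999/xhtml" dir="rtl" lang="he" xml:lang="he">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>דוגמה</title>
</head>
<body>
<!-- Inherits the right-to-left base direction from the html tag -->
<p>שלום עולם</p>
<!-- dir is repeated here only because the base direction changes -->
<p dir="ltr" lang="en" xml:lang="en">This paragraph runs left to right.</p>
</body>
</html>
```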
Give me more background
Creating (X)HTML Pages in Arabic & Hebrew provides a gentle introduction to the basics of handling right-
to-left text in HTML. The principles are similar for other markup languages.
What you need to know about the bidi algorithm and inline markup provides a gentle introduction to the basics
of handling inline bidirectional text.
So, how do I do this?
HTML & CSS authors • SVG authors • XML authors • Schema developers
Validate! Use techniques, tutorials, and articles at http://www.w3.org/International/
Accessibility: WCAG2 at a Glance
Perceivable
• Provide text alternatives for non-text content.
• Provide captions and alternatives for audio and video content.
• Make content adaptable; and make it available to assistive technologies.
• Use sufficient contrast to make things easy to see and hear.
Operable
• Make all functionality keyboard accessible.
• Give users enough time to read and use content.
• Do not use content that causes seizures.
• Help users navigate and find content.
Understandable
• Make text readable and understandable.
• Make content appear and operate in predictable ways.
• Help users avoid and correct mistakes.
Robust
• Maximize compatibility with current and future technologies.
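As one small illustration of the first point under Perceivable, an image can carry a text alternative in its alt attribute. The file name and the description are invented for this sketch.

```html
<img src="sales-chart.png"
     alt="Bar chart: sales rose from 10 units in January to 40 in April">
```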
Mission of the XHTML2 Working Group
The mission of the XHTML2 Working Group is to fulfill the promise of XML for applying XHTML to a wide variety
of platforms with proper attention paid to internationalization, accessibility, device-independence, usability and
document structuring. The group will provide an essential piece for supporting rich Web content that combines
XHTML with other W3C work on areas such as math, scalable vector graphics, synchronized multimedia, and forms,
in cooperation with other Working Groups.
To join the XHTML2 Working Group, see the instructions for joining. To enquire about the possibility of joining as
an invited expert, please contact the HTML Activity Lead.
What is HTML?
HTML is the lingua franca for publishing hypertext on the World Wide Web. It is a non-proprietary format based
upon SGML, and can be created and processed by a wide range of tools, from simple plain text editors - you type it in
from scratch - to sophisticated WYSIWYG authoring tools. HTML uses tags such as <h1> and </h1> to structure text
into headings, paragraphs, lists, hypertext links etc. Here is a 10-minute guide for newcomers to HTML. W3C's
statement of direction for HTML is given on the HTML Activity Statement. See also the page on our work on the next
generation of Web forms, and the section on Web history.
What is XHTML?
The Extensible HyperText Markup Language (XHTML™) is a family of current and future document types and
modules that reproduce, subset, and extend HTML, reformulated in XML rather than SGML. XHTML Family
document types are all XML-based, and ultimately are designed to work in conjunction with XML-based user agents.
XHTML is the successor of HTML, and a series of specifications has been developed for XHTML. See also: HTML
and XHTML Frequently Answered Questions
Recommendations
W3C produces what are known as "Recommendations". These are specifications, developed by W3C working groups,
and then reviewed by Members of the Consortium. A W3C Recommendation indicates that consensus has been
reached among the Consortium Members that a specification is appropriate for widespread use.
In general, XHTML specifications include implementations of their requirements in various syntaxes (e.g., XML
DTD, XML Schema, RelaxNG). These implementations are normative, and are meant to be used either as building
blocks for new markup languages (e.g., XHTML Modularization) or as complete markup language implementations
(e.g., XHTML 1.1).
While a normative part of the W3C Recommendation in which they are presented, these implementations are also
code containing potential errors or omissions. When such errors are discovered, it is sometimes important that they be
addressed very quickly to ensure that technologies relying on the implementations work as expected (e.g., validators
and content authoring systems). The W3C process allows for the publication and frequent updating of errata, but
unfortunately this process does not enable implementations to be quickly updated. As a result, the XHTML 2 Working
Group has adopted the following policy concerning the production and evolution of its implementations:
• All implementations will adhere to the naming convention(s) and evolution rules as defined in XHTML
Modularization. These names include both Formal Public Identifiers and System Identifiers. These conventions
require that the System Identifier must include a revision number. This revision number is ONLY incremented
when a revision is not backward compatible.
• Each applicable Recommendation will include fixed, unchanging versions of those implementations within the
formal dated location for the Recommendation (/TR/YYYY/REC-whatever-YYYYmmdd/...).
• The Working Group will also provide a version of that implementation in the working group's space on the
W3C server (/MarkUp), uncoupled from a specific dated version of the associated Recommendation. In the
beginning this uncoupled version will be *identical* to the version from the associated Recommendation.
• If the Working Group identifies a problem with an implementation, and it is possible to solve the problem in a
way that is 100 percent backward compatible, then the version in the group's space will be updated in place and
an announcement will be sent to the XHTML 2 public email list.
The XHTML 2 Working Group states that the term "backward compatible" should be used only when:
• The external interface to the module cannot change in any way that would break another module or markup
language, either within or outside of the W3C.
• The content model cannot change in any way that would cause a previously valid document to become invalid.
If either of the above constraints would be violated by a change, the working group will either 1) not make the change,
or 2) revise the applicable module. In the latter case, the working group will also change the associated identifiers.
XHTML 1.0
XHTML 1.0 was the W3C's first Recommendation for XHTML, following on from earlier work on HTML 4.01,
HTML 4.0, HTML 3.2 and HTML 2.0. With a wealth of features, XHTML 1.0 is a reformulation of HTML 4.01 in
XML, and combines the strength of HTML 4 with the power of XML.
XHTML 1.0 was the first major change to HTML since HTML 4.0 was released in 1997. It brings the rigor of XML to
Web pages and is the keystone in W3C's work to create standards that provide richer Web pages on an ever increasing
range of browser platforms including cell phones, televisions, cars, wallet sized wireless communicators, kiosks, and
desktops.
XHTML 1.0 was the first step: it reformulates HTML as an XML application. This makes it easier to process and
easier to maintain. XHTML 1.0 borrows elements and attributes from W3C's earlier work on HTML 4, and can be
interpreted by existing browsers, by following a few simple guidelines. This allows you to start using XHTML now!
You can roll over your old HTML documents into XHTML using the open source HTML Tidy utility. This tool also
cleans up markup errors, removes clutter and prettifies the markup making it easier to maintain.
Three "flavors" of XHTML 1.0
XHTML 1.0 is specified in three "flavors". You specify which of these variants you are using by inserting a line at the
beginning of the document. For example, the HTML for this document starts with a line which says that it is using
XHTML 1.0 Strict. Thus, if you want to validate the document, the tool used knows which variant you are using. Each
variant has its own DTD - Document Type Definition - which sets out the rules and regulations for using HTML in a
succinct and definitive manner.
• XHTML 1.0 Strict - Use this when you want really clean structural mark-up, free of any markup associated
with layout. Use this together with W3C's Cascading Style Sheet language (CSS) to get the font, color, and
layout effects you want.
• XHTML 1.0 Transitional - Many people writing Web pages for the general public to access might want to
use this flavor of XHTML 1.0. The idea is to take advantage of XHTML features including style sheets but
nonetheless to make small adjustments to your markup for the benefit of those viewing your pages with older
browsers which can't understand style sheets. These include using the body element with bgcolor, text and
link attributes.
• XHTML 1.0 Frameset - Use this when you want to use Frames to partition the browser window into two or
more frames.
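The line in question is the document type declaration. Here is a minimal sketch of an XHTML 1.0 Strict document; the title and body text are invented for this example.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Minimal XHTML 1.0 Strict document</title>
</head>
<body>
<p>Hello, XHTML!</p>
</body>
</html>
```

For the other two flavors, the declaration uses "-//W3C//DTD XHTML 1.0 Transitional//EN" with http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd, or "-//W3C//DTD XHTML 1.0 Frameset//EN" with http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd.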
The complete XHTML 1.0 specification is available in English in several formats, including HTML, PostScript and
PDF. See also the list of translations produced by volunteers.
HTML 4.01
HTML 4.01 is a revision of the HTML 4.0 Recommendation first released on 18th December 1997. The revision fixes
minor errors that have been found since then. The XHTML 1.0 spec relies on HTML 4.01 for the meanings of
XHTML elements and attributes. This allowed us to reduce the size of the XHTML 1.0 spec very considerably.
XHTML Basic
XHTML Basic is the second Recommendation in a series of XHTML specifications.
The XHTML Basic document type includes the minimal set of modules required to be an XHTML Host Language
document type, and in addition it includes images, forms, basic tables, and object support. It is designed for Web
clients that do not support the full set of XHTML features; for example, Web clients such as mobile phones, PDAs,
pagers, and settop boxes. The document type is rich enough for content authoring.
XHTML Basic is designed as a common base that may be extended. For example, an event module that is more
generic than the traditional HTML 4 event system could be added or it could be extended by additional modules from
XHTML Modularization such as the Scripting Module. The goal of XHTML Basic is to serve as a common language
supported by various kinds of user agents.
The document type definition is implemented using XHTML modules as defined in "Modularization of XHTML".
The complete XHTML Basic specification is available in English in several formats, including HTML, plain text,
PostScript and PDF. See also the list of translations produced by volunteers.
XHTML Modularization
XHTML Modularization is the third Recommendation in a series of XHTML specifications.
This Recommendation does not specify a markup language but an abstract modularization of XHTML and an
implementation of the abstraction using XML Document Type Definitions (DTDs) and (in version 1.1) XML
Schemas. This modularization provides a means for subsetting and extending XHTML, a feature needed for extending
XHTML's reach onto emerging platforms.
Modularization of XHTML makes it easier to combine with markup tags for things like vector graphics, multimedia,
math, electronic commerce and more. Content providers will find it easier to produce content for a wide range of
platforms, with better assurances as to how the content is rendered, and that the content is valid.
The modular design reflects the realization that a one-size-fits-all approach no longer works in a world where browsers
vary enormously in their capabilities. A browser in a cellphone can't offer the same experience as a top of the range
multimedia desktop machine. The cellphone doesn't even have the memory to load the page designed for the desktop
browser.
XHTML 1.1 - Module-based XHTML
This Recommendation defines a new XHTML document type that is based upon the module framework and modules
defined in Modularization of XHTML. The purpose of this document type is to serve as the basis for future extended
XHTML 'family' document types, and to provide a consistent, forward-looking document type cleanly separated from
the deprecated, legacy functionality of HTML 4 that was brought forward into the XHTML 1.0 document types.
This document type is essentially a reformulation of XHTML 1.0 Strict using XHTML Modules. This means that
many facilities available in other XHTML Family document types (e.g., XHTML Frames) are not available in this
document type. These other facilities are available through modules defined in Modularization of XHTML, and
document authors are free to define document types based upon XHTML 1.1 that use these facilities (see
Modularization of XHTML for information on creating new document types).
What is the difference between XHTML 1.0, XHTML Basic and XHTML 1.1?
The first step was to reformulate HTML 4 in XML, resulting in XHTML 1.0. By following the HTML Compatibility
Guidelines set forth in Appendix C of the XHTML 1.0 specification, XHTML 1.0 documents could be compatible
with existing HTML user agents.
The next step is to modularize the elements and attributes into convenient collections for use in documents that
combine XHTML with other tag sets. The modules are defined in Modularization of XHTML. XHTML Basic is an
example of a fairly minimal build of these modules and is targeted at mobile applications.
XHTML 1.1 is an example of a larger build of the modules, avoiding many of the presentation features. While
XHTML 1.1 looks very similar to XHTML 1.0 Strict, it is designed to serve as the basis for future extended XHTML
Family document types, and its modular design makes it easier to add other modules as needed or integrate itself into
other markup languages. The XHTML 1.1 plus MathML 2.0 document type is an example of such an XHTML Family
document type.
XHTML-Print
XHTML-Print is a member of the family of XHTML Languages defined by the Modularization of XHTML. It is
designed to be appropriate for printing from mobile devices to low-cost printers that might not have a full-page buffer
and that generally print from top-to-bottom and left-to-right with the paper in a portrait orientation. XHTML-Print is
also targeted at printing in environments where it is not feasible or desirable to install a printer-specific driver and
where some variability in the formatting of the output is acceptable.
XML Events
Note. This specification was renamed from "XHTML Events".
The XML Events module defined in this specification provides XML languages with the ability to uniformly integrate
event listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces. The
result is to provide an interoperable way of associating behaviors with document-level markup.
Previous Versions of HTML
HTML 4.0
First released as a W3C Recommendation on 18 December 1997. A second release was issued on 24 April
1998 with changes limited to editorial corrections. This specification has now been superseded by HTML
4.01.
HTML 3.2
W3C's first Recommendation for HTML which represented the consensus on HTML features for 1996. HTML
3.2 added widely-deployed features such as tables, applets, text-flow around images, superscripts and
subscripts, while providing backwards compatibility with the existing HTML 2.0 Standard.
HTML 2.0
HTML 2.0 (RFC 1866) was developed by the IETF's HTML Working Group, which closed in 1996. It set the
standard for core HTML features based upon current practice in 1994. Note that with the release of RFC 2854,
RFC 1866 has been obsoleted and its current status is HISTORIC.
ISO HTML
ISO/IEC 15445:2000 is a subset of HTML 4, standardized by ISO/IEC. It takes a more rigorous stance; for instance, an
h3 element can't occur after an h1 element unless there is an intervening h2 element. Roger Price and David
Abrahamson have written a user's guide to ISO HTML.
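The heading rule mentioned above can be sketched as a small check. This is only an illustration of a simplified reading of the rule (a heading may be at most one level deeper than the heading before it), not the ISO HTML DTD itself. The function name findHeadingSkips and the input format (a flat list of heading levels) are assumptions made for this example, and the first heading is deliberately left unconstrained.

```javascript
// Sketch only: checks a simplified heading-order rule on a flat list of
// heading levels (1 for h1, 2 for h2, ...). The function name and the
// input format are assumptions made for this illustration.
function findHeadingSkips(levels) {
  var problems = [];
  for (var i = 1; i < levels.length; i++) {
    var previous = levels[i - 1];
    var level = levels[i];
    // A heading may go at most one level deeper than the one before it,
    // so an h3 directly after an h1 is reported.
    if (level > previous + 1) {
      problems.push({ index: i, level: level, previous: previous });
    }
  }
  return problems;
}

// h1, h2, h3, h2, h3 is fine; h1 followed directly by h3 is reported.
console.log(findHeadingSkips([1, 2, 3, 2, 3]).length);
console.log(findHeadingSkips([1, 3]).length);
```

Going back up by any number of levels (for example from h3 to h1) is accepted by this sketch; only skipping a level on the way down is flagged.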
Other Public Drafts
The current editors' drafts of all specifications are linked to from a separate drafts page.
If you have any comments on any of our specifications we would like to hear from you via email. Please send your
comments to: www-html-editor@w3.org (archive). Don't forget to include XHTML in the subject line.
XHTML 2.0
XHTML 2.0 is a markup language intended for rich, portable web-based applications. While the ancestry of XHTML
2.0 comes from HTML 4, XHTML 1.0, and XHTML 1.1, it is not intended to be 100% backwards compatible with its
earlier versions. Application developers familiar with its earlier ancestors will be comfortable working with XHTML
2.0.
XHTML 2.0 is a member of the XHTML Family of markup languages. It is an XHTML Host Language as defined in
Modularization of XHTML. As such, it is made up of a set of XHTML Modules that together describe the elements
and attributes of the language, and their content model. XHTML 2.0 updates many of the modules defined in
Modularization of XHTML, and includes the updated versions of all those modules and their semantics.
XHTML 2.0 essentially consists of a packaging of several parts currently independently proceeding to
recommendation:
• RDFa
• XForms
• Access
• Role
• XML Events
plus the necessary text and hyperlinking modules, which you will find in the XHTML2 draft.
The most recent editor's draft can always be found on the XHTML2 WG's drafts page.
An XHTML + MathML + SVG Profile
An XHTML+MathML+SVG profile is a profile that combines XHTML 1.1, MathML 2.0 and SVG 1.1 together. This
profile enables mixing XHTML, MathML and SVG in the same document using the XML namespaces mechanism, while
allowing validation of such a mixed-namespace document.
This specification is a joint work with the SVG Working Group, with help from the Math WG.
XFrames
XFrames is an XML application for composing documents together, replacing HTML Frames. XFrames is not a part
of XHTML per se, but it allows similar functionality to HTML Frames, with fewer usability problems, principally by
making the content of the frameset visible in its URI.
HLink
The HLink module defined in this specification provides XHTML Family Members with the ability to specify which
attributes of elements represent Hyperlinks, and how those hyperlinks should be traversed, and extends XLink use to a
wider class of languages than those restricted to the syntactic style allowed by XLink.
XHTML Media Types
This document summarizes the best current practice for using various Internet media types for serving various
XHTML Family documents. In summary, 'application/xhtml+xml' SHOULD be used for XHTML Family documents,
and the use of 'text/html' SHOULD be limited to HTML-compatible XHTML 1.0 documents. 'application/xml' and
'text/xml' MAY also be used, but whenever appropriate, 'application/xhtml+xml' SHOULD be used rather than those
generic XML media types.
XHTML 1.0 in XML Schema
This document describes non-normative XML Schemas for XHTML 1.0. These Schemas are still a work in progress,
and this document does not change the normative definition of XHTML 1.0.
XHTML2 Working Group Roadmap
This describes the timeline for deliverables of the XHTML2 working group. It used to be a W3C NOTE but has now
been moved to the MarkUp area for easier maintenance.
Issue tracking
There are two sets of issues being tracked:
XHTML2 Issue Tracking System
This database is dedicated to XHTML2 issues.
Voyager Issue Tracking System
This database contains issues for all other specs.
Guidelines for authoring
Here are some rough guidelines for HTML authors. If you use these, you are more likely to end up with pages that are
easy to maintain, look acceptable to users regardless of the browser they are using, and can be accessed by the many
Web users with disabilities. Meanwhile W3C have produced some more formal guidelines for authors. Have a look at
the detailed Web Content Accessibility Guidelines 1.0.
1. A question of style sheets. For most people the look of a document - the color, the font, the margins - is as
important as the textual content of the document itself. But make no mistake! HTML is not designed to be used
to control these aspects of document layout. What you should do is to use HTML to mark up headings,
paragraphs, lists, hypertext links, and other structural parts of your document, and then add a style sheet to
specify layout separately, just as you might do in a conventional Desk Top Publishing Package. That way, not
only is there a better chance of all browsers displaying your document properly, but also, if you want to change
such things as the font or color, it's really simple to do so. See the Touch of style.
2. FONT tag considered harmful! Many filters from word-processing packages, and also some HTML authoring
tools, generate HTML code which is completely contrary to the design goals of the language. What they do is
to look at a document almost purely from the point of view of layout, and then mimic that layout in HTML by
doing tricks with FONT, BR and &nbsp; (non-breaking spaces). HTML documents are supposed to be structured
around items such as paragraphs, headings and lists. Yet some of these documents barely have a paragraph tag
in sight!
The problem comes when the content of pages needs to be updated, or given a new layout, or re-cast in XML
(which is now to be the new mark-up language). With proper use of HTML, such operations are not difficult,
but with a muddle of non-structural tags it's quite a different matter; maintenance tasks become impractical. To
correct pages suffering from injudicious use of FONT, try the HTML Tidy program, which will do its best to put
things right and generate better and more manageable HTML.
3. Make your pages readable by those with disabilities. The Web is a tremendously useful tool for the visually
impaired or blind user, but bear in mind that these users rely on speech synthesizers or Braille readers to render
the text. Sloppy mark-up, or mark-up which doesn't have the layout defined in a separate style sheet, is hard for
such software to deal with. Wherever possible, use a style sheet for the presentational aspects of your pages,
using HTML purely for structural mark-up.
Also, remember to include descriptions with each image, and try to avoid server-side image maps. For tables,
you should include a summary of the table's structure, and remember to associate table data with relevant
headers. This will give non-visual browsers a chance to help orient people as they move from one cell to the
next. For forms, remember to include labels for form fields.
Do look at the accessibility guidelines for a more detailed account of how to make your Web pages really accessible.
W3C Markup Validation Service
To further promote the reliability and fidelity of communications on the Web, W3C has introduced the W3C Markup
Validation Service at http://validator.w3.org/.
Content providers can use this service to validate their Web pages against the HTML and XHTML Recommendations,
thereby ensuring the maximum possible audience for their Web pages. It also supports XHTML Family document
types such as XHTML+MathML and XHTML+MathML+SVG, and also other markup vocabularies such as SVG.
Software developers who write HTML and XHTML editing tools can ensure interoperability with other Web software
by verifying that the output of their tool complies with the W3C Recommendations for HTML and XHTML.
HTML Tidy
HTML Tidy is a stand-alone tool for checking and pretty-printing HTML that is in many cases able to fix up mark-up
errors, and also offers a means to convert existing HTML content into well-formed XML, for delivery as XHTML.
HTML Tidy was originally written by Dave Raggett, and it is now maintained as an open source project at
SourceForge by a group of volunteers.
There is an archived public mailing list html-tidy@w3.org. Please send bug reports / suggestions on HTML Tidy to
this mailing list.
Related W3C Work
XML
XML is the universal format for structured documents and data on the Web. It allows you to define your own
mark-up formats when HTML is not a good fit. XML is being used increasingly for data; for instance, W3C's
metadata format RDF.
Style Sheets
W3C's Cascading Style Sheets language (CSS) provides a simple means to style HTML pages, allowing you to
control visual and aural characteristics; for instance, fonts, margins, line-spacing, borders, colors, layers and
more. W3C is also working on a new style sheet language written in XML called XSL, which provides a
means to transform XML documents into HTML.
Document Object Model
Provides ways for scripts to manipulate HTML using a set of methods and data types defined independently of
particular programming languages or computer platforms. It forms the basis for dynamic effects in Web pages,
but can also be exploited in HTML editors and other tools by extensions for manipulating HTML content.
Internationalization
HTML 4 provides a number of features for use with a wide variety of languages and writing systems. For
instance, mixed language text, and right-to-left and mixed direction text. HTML 4 is formally based upon
Unicode, but allows you to store and transmit documents in a variety of character encodings. Further work is
envisaged for handling vertical text and phonetic annotations for Kanji (Ruby).
Access for People with Disabilities
HTML 4 includes many features for improved access by people with disabilities. W3C's Web Accessibility
Initiative is working on providing effective guidelines for making your pages accessible to all, not just those
using graphical browsers.
XForms
Forms are a very widely used feature in web pages. W3C is working on the design of the next generation of
web forms with a view to separating the presentation, data and logic, as a means to allowing the same forms to
be used with widely differing presentations.
Mathematics
Work on representing mathematics on the Web has focused on ways to handle the presentation of
mathematical expressions and also the intended meaning. The MathML language is an application of XML,
which, while not suited to hand-editing, is easy to process by machine.

XBL 2.0 Primer: An Introduction for Developers


W3C Working Draft 18 July 2007
Abstract
This practical guide provides you with the knowledge required to effectively use the XML Binding Language 2.0. It
introduces both the basic and advanced concepts of XBL and describes its syntax and scenarios that should be
considered best-practice. It also describes the purpose of the language elements described in the XBL 2.0
specification.
XBL describes the ability to associate elements in one document with script, event handlers, styles, and more complex
content models in another document. You can use XBL to re-order and wrap content so that, for instance, simple
HTML or XHTML markup can have complex CSS styles applied without requiring that the markup be polluted with
multiple div elements. In addition, if you are a programmer, you can use XBL to implement new DOM interfaces,
and, in conjunction with other specifications, it enables arbitrary XML tag sets to be treated as "widgets" (pluggable
user interface components).
Status of this Document
This section describes the status of this document at the time of its publication. Other documents may supersede this
document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C
technical reports index at http://www.w3.org/TR/.
This is the 18 July 2007 First Public Working Draft of the XBL 2.0 Primer: An Introduction for Developers. This
document is produced by the Web Application Formats (WAF) Working Group (WG). This WG is part of the Rich
Web Clients Activity and this activity is within the W3C's Interaction Domain.
Web content and browser developers are encouraged to review this draft. Please send comments to
public-appformats@w3.org, the W3C's public email list for issues related to Web Application Formats. Archives of the list
are available. The editor's draft of this document is available in W3C CVS. A detailed list of changes is also available
from the W3C CVS server.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a
public list of any patent disclosures made in connection with the deliverables of the group; that page also includes
instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes
contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Please note that at the time of writing there are no implementations of XBL 2.0 publicly available, so everything in
this document is untested.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and
may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as
other than work in progress.
1. Chapter 1 - Overview of XBL
The move in web development towards avoiding the table element for layout has led developers to consider how to
exploit other HTML elements, CSS, and ECMAScript to achieve complex layouts. To a large extent, this move has
been fueled by proponents of the Web 2.0 movement who promote the importance of having highly accessible content
that is both adaptive and provides an engaging user experience. However, a new problem has emerged whereby web
documents are now heavily 'polluted' with the semantically-neutral div element and complex JavaScript and CSS that
is hard for authors to maintain.
The XML Binding Language 2.0 (XBL) is a declarative language that can be used together with existing or new web
documents to enhance their presentation, behavior, accessibility, and maintainability. This Primer is designed to
provide you with the practical knowledge required to use XBL effectively in your work. It introduces both the basic
and advanced concepts of XBL and describes its syntax and best-practices usage scenarios. It also describes the
purpose of the language elements described in the XBL specification document.
1.1. Intended audience
The primary intended audience for the Primer is web developers: that is, anyone who has some experience working
with HTML, CSS, JavaScript, and perhaps some exposure to XML. This Primer assumes that the reader is familiar with
those and other related web development techniques and technologies. A second intended audience is XML
developers who are considering XBL as a tool to enhance the behavior and programmatic functionality of DOM
elements. Where relevant, we make note of advanced functionality of XBL specifically for XML developers or
advanced web developers.
We have written this document as a series of tutorials for developers who want to learn XBL in a 'hands-on' manner.
We have made every effort to write this in a relaxed style that should be understandable by a large audience. While
this is not a technical specification and it does not include any implementation details or requirements, this may still be
a useful introduction to the concepts of XBL for people who are intending to implement the XBL specification.
1.4. XBL
XML Binding Language (XBL) 2.0 is a mechanism for extending the presentation and behavior of a document. XBL
2.0 is based upon the original XBL 1.0 specification created and implemented by Mozilla, though it has been
significantly redesigned and is not backwards compatible. One of the goals of XBL is to allow you to directly enhance
the user experience of web documents without needing to overuse structuring elements, such as the div element, in
your HTML.
XBL provides various mechanisms to dynamically pre-load and include new content and style sheets into a document,
and to enhance HTML or XML elements with scripted functionality. For example, an HTML input element can
automatically validate user input via a custom script that is bound to it using XBL. These features potentially translate
into a richer end-user experience and documents that are easier to code, style, and maintain.
XBL is structured into several different components. The bindings are used to attach presentation and behavior to an
element, and the scripts are used to define helper functions used by the bindings. The bindings are comprised of
templates, event handlers, API implementations and resources (figure 1).

Figure 1. Structure of an XBL Document


1.5. Bindings
A binding is a way to attach presentation and behavior to an HTML or XML element. The concept is similar to the
way we already style elements using CSS and attach event listeners to them with JavaScript, but adding an extra
layer of abstraction in between simplifies the development process. Bindings are not a way to replace existing
authoring tools like CSS and JavaScript, but rather an enhancement to them.
There are four main aspects of a binding: templates, handlers, implementations and resources.
Templates
A way to enhance the presentation (particularly layout) beyond what is possible with existing
CSS techniques.
Handlers
An improved way to declare event listeners (e.g. mouse and key events).
Implementations
A means to add new methods and properties to an XML or HTML element.
Resources
A way to load style sheets and prefetch images, video, audio or any other content
associated with the binding.

1.6. Attaching Bindings


Bindings can be attached to elements in several ways using:
1. the 'element' attribute of the binding element via a CSS-style selector [SELECTORS],
2. the 'binding' property in CSS,
3. the 'addBinding()' method in a script.
We discuss these three attachment methods below.
1.6.1. The element Attribute
To create a binding using the element attribute of a binding element you need to specify a selector. It’s the same type
of selector that is used in CSS, so it’s very easy to understand. This binding will be attached to all elements that match
the selector: #nav li.
<xbl xmlns="http://www.w3.org/ns/xbl">
  <binding element="#nav li">
    <implementation>...</implementation>
    <template>...</template>
    <handlers>...</handlers>
    <resources>...</resources>
  </binding>
</xbl>
1.6.2. The 'binding' Property
The 'binding' property can be used in your CSS to attach a binding, in exactly the same way you apply any other
style to an element.
bindings.xml:
<xbl xmlns="http://www.w3.org/ns/xbl">
  <binding id="demo">
    <implementation>...</implementation>
    <template>...</template>
    <handlers>...</handlers>
    <resources>...</resources>
  </binding>
</xbl>
The style sheet:
#nav li { binding: url(bindings.xml#demo); }
1.6.3. Using the addBinding() method
Elements will implement the ElementXBL interface, which defines three methods: addBinding(), removeBinding()
and hasBinding(). The addBinding() method can be used to attach a binding to an individual element using a script,
like this:
var e = document.getElementById("example"); // Get the element
e.addBinding("bindings.xml#foo"); // Attach the binding
It is also possible to check if a binding has been attached using the hasBinding() function.
if (e.hasBinding("bindings.xml#foo")) {
  // Do something
}
Bindings can also be detached using the removeBinding() function.
e.removeBinding("bindings.xml#foo");

1.7. Event Handlers


As stated earlier, handlers offer an improved way to declare event listeners (eg. mouse and key events).
1.7.1. Traditional Event Handling
The following example illustrates some typical unobtrusive scripting techniques to attach event listeners, including
both the window.onload property and the addEventListener() function.
window.onload = function() {
  var nav = document.getElementById("nav");
  var li = nav.getElementsByTagName("li");
  for (var i = 0; i < li.length; i++) {
    li[i].addEventListener("mouseover", doSomething, false);
  }
};
Another common method is to use the HTML onevent attributes, like the following.
<li onmouseover="doSomething();">...</li>
There are advantages and disadvantages to both methods, but the former is generally considered better because it
separates the behavior layer from the markup. However, the latter is a simple declarative syntax that can be quite
convenient in some cases.
1.7.2. Handling Events with XBL
In XBL, instead of requiring authors to use a script to search for the elements, the event listeners are attached to the
elements that the binding is attached to. XBL provides a simple declarative syntax which also continues to separate the behavior
layer from the semantic markup layer. Event listeners are declared using both the handlers element and its child
handler elements. For example, this binding will be attached to all li elements within an element with id="nav".
<xbl xmlns="http://www.w3.org/ns/xbl">
  <binding element="#nav li">
    <handlers>
      <handler event="mouseover">
        doSomething();
      </handler>
    </handlers>
  </binding>
</xbl>
If present, only one handlers element is allowed within a binding, but it can contain as many child handler elements
as required, to capture as many different events as you like. This binding declares a single event handler that listens for
the mouseover event. When the mouseover event is fired on a bound element (i.e. an element to which this binding is
attached), the handler is invoked in effectively the same way it would have been using the other methods shown
above.
1.7.2.1. Event Filters
There are often times when you only want to handle an event under certain conditions. For example, when you want to
capture a click event and do something only when the user clicks the left mouse button; or capture a keyboard event
and perform different functions depending on which key was pressed. In traditional scripting techniques, you have to
check the values of certain properties using if or switch statements in your function, like the following.
function doSomething(e) {
  var code;
  e = e || window.event;
  code = e.keyCode || e.which;
  switch(code) {
    ...
  }
}
Much of that involves handling of incompatibilities between legacy browsers, but even if all browsers supported the
DOM Events standard, it is still quite complicated. XBL addresses this by providing a simple declarative syntax for
describing these conditions using attributes on the handler element.
In the following example, separate handlers are provided for handling the keypress events depending on which
character was entered. The first handles the character a, the second handles b. If any other character was entered,
neither of these two handlers will be invoked.
<handlers>
  <handler event="keypress" text="a">
    doSomethingA();
  </handler>
  <handler event="keypress" text="b">
    doSomethingB();
  </handler>
</handlers>
Similarly, in the following example, the handler will only be invoked when the user left clicks while holding the
Shift key down.
<handlers>
  <handler event="click" button="0" modifiers="shift">
    doSomething();
  </handler>
</handlers>
There are several other filters that can be used. The following list is a subset of the available attributes for this purpose.
These are expected to be the most commonly used filters because they cover the majority of mouse and keyboard
event usage on the web today.
button
A space-separated list of mouse buttons pressed by the user, e.g. button="0 2" matches either
the left or right mouse button.
click-count
The number of times the user clicked, e.g. click-count="2" matches double clicks.
text
The text entered by the user. This is different from the key code because it matches the letter
that was entered, regardless of the keys that were pressed. This is particularly important for
languages that require several key presses to enter certain letters.
modifiers
Modifier keys, including alt, control, shift, meta, etc.
key
Matches against the keyIdentifier value defined in DOM 3 Events.
key-location
For matching the location of the key that was pressed on the keyboard, including standard, left,
right and numpad.
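To make the comparison with traditional scripting concrete, here is a rough sketch of the checks that a few of these filter attributes stand for. It is plain script, not XBL: the function matchesFilter and the shape of the filter object are hypothetical, and the sketch only checks that every listed modifier is held down rather than attempting the full matching rules of the XBL specification. The event fields used (type, button, detail, shiftKey and so on) are the usual DOM mouse event properties.

```javascript
// Sketch only: emulates in plain script what some declarative filter
// attributes express. matchesFilter and the filter object are hypothetical.
function matchesFilter(filter, event) {
  if (filter.event !== event.type) return false;

  // button="0 2": a space-separated list; any listed button matches.
  if (filter.button !== undefined) {
    var buttons = filter.button.trim().split(/\s+/).map(Number);
    if (buttons.indexOf(event.button) === -1) return false;
  }

  // click-count="2": DOM mouse events report the click count in event.detail.
  if (filter.clickCount !== undefined &&
      Number(filter.clickCount) !== event.detail) {
    return false;
  }

  // modifiers="shift": every listed modifier key must be held down.
  if (filter.modifiers !== undefined) {
    var held = {
      alt: event.altKey,
      control: event.ctrlKey,
      shift: event.shiftKey,
      meta: event.metaKey
    };
    var wanted = filter.modifiers.trim().split(/\s+/);
    for (var i = 0; i < wanted.length; i++) {
      if (!held[wanted[i]]) return false;
    }
  }
  return true;
}

// Mirrors <handler event="click" button="0" modifiers="shift">.
var shiftLeftClick = { event: "click", button: "0", modifiers: "shift" };
console.log(matchesFilter(shiftLeftClick,
  { type: "click", button: 0, detail: 1, shiftKey: true }));
```

With XBL, none of this code has to be written or maintained by the author; the attributes on the handler element carry the same conditions.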

1.8. Templates
Templates are used to control the presentation of a document. They can be used to reorder and restructure content in the
document without affecting the underlying DOM.
Templates are created using the template element within a binding. The templating model allows you to combine
elements from the document with additional elements in creative ways, removing the need for unnecessary and
extraneous elements to be added to the original document. You could, for example, use XBL to extract the data from
an HTML table and present it as a chart using SVG.
An important concept to grasp is that regardless of what content you include in the template, the template does not
alter the semantics of the original document. For example, in an HTML document, a heading could be bound to a
binding with a template containing an SVG image. The bound element still semantically represents a heading, only its
presentation has changed from plain text to an image. That concept shouldn't be too hard to grasp: the example is (in
principle) not much different from any of the widely used image replacement techniques; it only differs in
implementation.
The XBL content element can be used to insert content from the document into the template. The includes attribute
value is a selector, used to select which elements to insert into the tree at that location. The div element is provided as
a generic structural element that you can use for any purpose you like.
1.8.1. Shadow Trees
When elements are bound, the contents of their binding’s template are cloned and appended to them as children,
creating shadow trees. Shadow trees exist outside of the normal DOM and are thus transparent to ordinary DOM
processing. In other words, shadow trees are rendered as though they were part of the original document, but do not
actually exist within the document itself. For example, consider the following document body:
<body>
  <div id="main">...</div>
  <div id="nav">...</div>
</body>
Using XBL, the content can be reordered and restructured, which will allow for more complex styles to be applied.
<xbl xmlns="http://www.w3.org/ns/xbl">
  <binding element="body">
    <template>
      <div id="container">
        <div id="left"><content includes="#nav"/></div>
        <div id="right"><content includes="#main"/></div>
      </div>
    </template>
  </binding>
</xbl>
This will create the following shadow tree.
Need a diagram
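In the absence of a diagram, the following sketch models how the template and the children of the bound body element are combined for presentation. It is a toy model, not an XBL implementation: nodes are plain objects, the helper names (el, content, flatten, outline) are invented for this example, and the includes attribute is limited to simple "#id" selectors.

```javascript
// Sketch only: a toy model of how a template is combined with the bound
// element's children. All helper names here are hypothetical.
function el(name, id, children) {
  return { name: name, id: id || null, children: children || [] };
}
function content(includes) {
  return { name: "content", includes: includes, children: [] };
}

// Copy the template, replacing each content element with the bound
// element's children that match its includes selector ("#id" only).
function flatten(templateNode, boundChildren) {
  if (templateNode.name === "content") {
    var wantedId = templateNode.includes.slice(1); // strip the leading "#"
    return boundChildren.filter(function (child) {
      return child.id === wantedId;
    });
  }
  var copy = el(templateNode.name, templateNode.id);
  templateNode.children.forEach(function (child) {
    copy.children = copy.children.concat(flatten(child, boundChildren));
  });
  return [copy];
}

// Render a tree as an indented outline, one node per line.
function outline(node, depth) {
  depth = depth || 0;
  var indent = new Array(depth + 1).join("  ");
  var line = indent + node.name + (node.id ? "#" + node.id : "");
  return [line].concat(node.children.map(function (child) {
    return outline(child, depth + 1);
  })).join("\n");
}

// The children of the bound body element, in document order.
var bodyChildren = [el("div", "main"), el("div", "nav")];

// The template from the binding above.
var template = el("div", "container", [
  el("div", "left", [content("#nav")]),
  el("div", "right", [content("#main")])
]);

var shadow = flatten(template, bodyChildren)[0];
console.log(outline(shadow));
// Prints:
// div#container
//   div#left
//     div#nav
//   div#right
//     div#main
```

Note that bodyChildren itself is left untouched: the nav element is presented before the main element, but the order in the original document does not change.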
1.9. Implementing Interfaces
The implementation element describes a set of methods and properties that are attached to the bound element. That
is, a way to enhance the bound element’s DOM interface. For example, if you wanted to add custom validation to
an HTML input element, you would need to do the following in JavaScript:
var customInput = document.createElement("input");
customInput.max_value = 56;
customInput.checkValue = function() {
  // Custom validation
};

That example illustrates the basic way in which we can add properties and methods to an already existing HTML
element using script. The implementation element is designed to do much the same thing from within a binding. The
equivalent to the example above in XBL would be:
<xbl xmlns="http://www.w3.org/ns/xbl">
  <binding element="#customInput">
    <implementation>
      ({
        max_value: 56,
        checkValue: function() {
          // Custom validation
        }
      })
    </implementation>
  </binding>
</xbl>
In the HTML you would have:
<input type="text" id="customInput"/>
In this example, the binding is attached to elements matching the selector: #customInput.
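As a rough illustration of the effect, the following sketch copies the members of an implementation object onto a target object. How an XBL implementation actually exposes these members on the bound element is not described here, so applyImplementation is a hypothetical helper, a plain object stands in for the input element, and the validation rule inside checkValue is an assumption made for this example.

```javascript
// Sketch only: copies an implementation object's members onto an element.
// applyImplementation is a hypothetical helper, not part of XBL.
function applyImplementation(element, implementation) {
  for (var name in implementation) {
    if (Object.prototype.hasOwnProperty.call(implementation, name)) {
      element[name] = implementation[name];
    }
  }
  return element;
}

// The same shape as the object inside the implementation element above.
var implementation = {
  max_value: 56,
  checkValue: function () {
    // Example rule (an assumption): the value must not exceed max_value.
    return Number(this.value) <= this.max_value;
  }
};

// A plain object stands in for the bound input element.
var input = applyImplementation({ value: "42" }, implementation);
console.log(input.checkValue());
```

Once the members are in place, scripts can call input.checkValue() as if the method had always been part of the element's interface.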
1.10. Resources
Resources include style sheets and additional files that are used by the binding, such as images, audio and video. The
style sheets are used to add style to the binding’s template.
This section is incomplete
1.11. Scripts
The XBL script element, which is similar to the script element in HTML, can be used to define helper functions
for your bindings.
Just like in HTML, scripts can either be external resources, referenced using the script element's src attribute, or be
written inline, and you can declare as many script elements as you need:
<xbl xmlns="http://www.w3.org/ns/xbl">
  <script src="example.js"/>
  <script><![CDATA[
    function foo(){
      example(); // Assume this is defined in example.js
      ...
    }
  ]]></script>
  ...
</xbl>
In the following example, the doSomething() function that is defined in an XBL script element will be
automatically called when binding foobar is attached to an element.
<xbl xmlns="http://www.w3.org/ns/xbl">
<script>
function doSomething(){...};
</script>
<binding id="foobar">

<implementation>
({
xblBindingAttached: function() {
doSomething(); // Calls the function defined in the script element
}
})
</implementation>
</binding>
</xbl>
The default scripting language is ECMAScript. Other languages may be used by specifying them with the
script-type attribute on the xbl element. Since JavaScript is the most common scripting language on the web, the
default will usually be acceptable to most authors.
Functions and variables defined in the XBL script element are scoped to the XBL document, so they cannot be
accessed from the bound document. Conversely, for security reasons, functions defined in the bound document cannot
be invoked from within the XBL document. For example, the following will not work:
<html xmlns="http://www.w3.org/1999/xhtml">
<xbl xmlns="http://www.w3.org/ns/xbl">
<script>

function bar(){...}
foo(); //error, foo is undefined!
</script>
</xbl>
<!-- script in the XHTML namespace -->
<script type="text/javascript">
function foo(){...}
bar(); //error, bar is undefined!
</script>

</html>
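The isolation described above can be imitated in plain JavaScript with two separate function scopes. This is only an
analogy, assuming that each scope behaves like one of the two documents; the names xblScope and documentScope are
made up for the illustration.

```javascript
var results = [];

// Stand-in for the XBL document's script scope.
(function xblScope() {
  function bar() { return "bar"; }
  try {
    foo(); // foo lives in the other scope
  } catch (e) {
    results.push(e.name);
  }
})();

// Stand-in for the bound document's script scope.
(function documentScope() {
  function foo() { return "foo"; }
  try {
    bar(); // bar lives in the other scope
  } catch (e) {
    results.push(e.name);
  }
})();
```

Each call fails because the function it names is only visible inside the other scope.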

HTML and XHTML Frequently Answered Questions


Why is XHTML needed? Isn't HTML good enough?
HTML is probably the most successful document markup language in the world. But when XML was introduced, a
two-day workshop was organised to discuss whether a new version of HTML in XML was needed. The opinion at the
workshop was a clear 'Yes': with an XML-based HTML other XML languages could include bits of XHTML, and
XHTML documents could include bits of other markup languages. We could also take advantage of the redesign to
clean up some of the more untidy parts of HTML, and add some new needed functionality, like better forms.
What are the advantages of using XHTML rather than HTML?
If your document is just pure XHTML 1.0 (not including other markup languages) then you will not yet notice much
difference. However as more and more XML tools become available, such as XSLT for transforming documents, you
will start noticing the advantages of using XHTML. XForms for instance will allow you to edit XHTML documents
(or any other sort of XML document) in simple controllable ways. Semantic Web applications will be able to take
advantage of XHTML documents.
If your document is more than XHTML 1.0, for instance including MathML, SMIL, or SVG, then the advantages are
immediate: you can't do that sort of thing with HTML.
Can I just put the XML declaration on top of existing HTML documents? Can I intermix HTML
4.01 and XHTML documents?
No. HTML is not in XML format. You have to make the changes necessary to make the document proper XML before
you can get it accepted as XML.
What is the easiest way to convert my HTML documents to XHTML?
HTML Tidy gives you the option to transform any HTML document into an XHTML one. Amaya is a browser/editor
that will save HTML documents as XHTML.
Why are browsers so fussy about XML? They were more accepting with HTML.
This is deliberate. HTML browsers accept any input, correct or incorrect, and try to make something sensible of it.
This error-correction makes browsers very hard to write, especially if all browsers are expected to do the same thing. It
has also meant that huge numbers of HTML documents are incorrect: because they display OK in the browser,
the author isn't aware of the errors. This makes it incredibly difficult to write new web user agents since documents
claiming to be HTML are often so poor.
Why should I care if my document is in correct HTML? It displays all right on my browser.
All browsers know how to deal with correct HTML. However, if it is incorrect, the browser has to repair the
document, and since not all browsers repair documents in the same way, this introduces differences, so that your
document may look and work differently on different browsers. Since there are hundreds of different browsers, and
more coming all the time (not only on PCs, but also on PDAs, mobile phones, televisions, printers, even refrigerators),
it is impossible to test your document on every browser. If you use incorrect HTML and your document doesn't work
on a particular browser, it is your fault; if you use correct HTML and it doesn't work, it is a bug in the browser.
Where can I go to verify my document uses correct markup?
W3C offers a service at http://validator.w3.org/. The Amaya browser/editor will also ensure that your markup is
correct.
Why do you say "user agent" everywhere, instead of "browser"?
Although browsers are indeed important users of HTML and XHTML, there are other programs and systems that read
those documents. Search engines for instance read documents, but are not browsers. By using the term "user agent" we
are trying to remind people of the difference.
For example, when you do a Google search often you will see under some of the search results something like "This
web page uses frames, but your browser doesn't support them." therefore surely frightening off some people from
clicking on that link. The author of the website in question hasn't realised that there are more than just browsers, and
that they ought to include better text in their <noframes> section, so that they don't appear so foolish when people
search their site.
Why do I have to use these namespace things in XHTML?
In the early days of HTML different groups and companies added new elements and attributes to HTML at will. This
threatened to cause a chaos of different non-interoperable versions of HTML. XML (the X stands for Extensible)
allows anyone to use elements and attributes from different languages, but for a browser or other user agent to know
which element belongs to which language, you have to tell it. The namespace declarations do just that.
Why is it allowed to send XHTML 1.0 documents as text/html?
XHTML is an XML format; this means that strictly speaking it should be sent with an XML-related media type
(application/xhtml+xml, application/xml, or text/xml). However, XHTML 1.0 was carefully designed so that,
with care, it would also work on legacy HTML user agents. If you follow some simple guidelines, you can get
many XHTML 1.0 documents to work in legacy browsers. However, legacy browsers only understand the media type
text/html, so you have to use that media type if you send XHTML 1.0 documents to them. But be well aware, sending
XHTML documents to browsers as text/html means that those browsers see the documents as HTML documents,
not XHTML documents.
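One way a server might decide which media type to send is to look at the Accept header of the request. The following
is a minimal sketch under stated assumptions: chooseMediaType is a hypothetical helper, it only checks whether
application/xhtml+xml is listed and not explicitly refused with q=0, and it does not rank the other quality values.

```javascript
// Hypothetical helper: picks the media type for an XHTML 1.0 document
// from the value of the request's Accept header.
function chooseMediaType(acceptHeader) {
  var ranges = String(acceptHeader || "").split(",");
  for (var i = 0; i < ranges.length; i++) {
    var parts = ranges[i].split(";");
    var type = parts[0].trim().toLowerCase();
    if (type !== "application/xhtml+xml") continue;
    // q=0 means the client explicitly refuses this type.
    var refused = parts.slice(1).some(function (param) {
      var m = /^\s*q\s*=\s*([0-9.]+)\s*$/i.exec(param);
      return m !== null && parseFloat(m[1]) === 0;
    });
    return refused ? "text/html" : "application/xhtml+xml";
  }
  // Legacy user agents that never mention the XML type get text/html.
  return "text/html";
}
```

A user agent that does not list application/xhtml+xml falls through to text/html, which matches the legacy case described above.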
Which browsers accept the media type application/xhtml+xml?
Browsers known to us include all Mozilla-based browsers, such as Mozilla, Netscape 5 and higher, Galeon and
Firefox, as well as Opera, Amaya, Camino, Chimera, DocZilla, iCab, Safari, and all browsers on mobile phones that
accept WAP2. In fact, any modern browser. Most accept XHTML documents as application/xml as well. See the
XHTML Media-type test for details.
Does Microsoft Internet Explorer accept the media type application/xhtml+xml?
No. However, there is a trick that allows you to serve XHTML 1.0 documents to Internet Explorer as
application/xml.
Include at the top of your document the xml-stylesheet line (the second line here):
<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="copy.xsl"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
where copy.xsl is a file that contains the following:
<stylesheet version="1.0"
xmlns="http://www.w3.org/1999/XSL/Transform">
<template match="/">
<copy-of select="."/>
</template>
</stylesheet>
Note that this file must be on the same site as the document referring to it.
Although you are serving the document as XML, and it gets parsed as XML, the browser thinks it has received
text/html, and so your XHTML 1.0 document must follow many of the guidelines for serving to legacy browsers.
Your XHTML document will continue to work on browsers that accept XHTML 1.0 as application/xml.
CSS has a lot of special rules that only apply to HTML. Do these also apply to XHTML?
No. CSS rules that apply only to HTML apply only to documents that are delivered as text/html.
Does document.write work in XHTML?
No. Because of the way XML is defined, it is not possible to do tricks like this, where markup is generated by
scripting while the parser is still parsing the markup.
You can still achieve the same effects, but you have to do it by using the DOM to add and delete elements.
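As an illustration of the DOM approach, the sketch below adds a paragraph without generating any markup as text.
The helper name appendParagraph is hypothetical; createElementNS, createTextNode and appendChild are the standard
DOM methods. The document is passed in as a parameter so that the function does not depend on a particular page.

```javascript
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical helper: builds <p>text</p> in the XHTML namespace and
// appends it to the given parent element.
function appendParagraph(doc, parent, text) {
  var p = doc.createElementNS(XHTML_NS, "p");
  p.appendChild(doc.createTextNode(text));
  parent.appendChild(p);
  return p;
}

// In a browser this would be called as, for example:
//   appendParagraph(document, document.body, "Generated by script");
```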
Why is it disallowed to send XHTML 1.1 documents as text/html?
XHTML 1.1 is pure XML, and only intended to be XML. It cannot reliably be sent to legacy browsers. Therefore
XHTML 1.1 documents must be sent with an XML-related media type, such as application/xhtml+xml.
Why was the target attribute removed from XHTML 1.1?
It wasn't. XHTML 1.0 comes in three versions: strict, transitional, and frameset. All three of these were deliberately
kept as close as possible to HTML 4.01 as XML would allow. XHTML 1.1 is an updated version of XHTML 1.0
strict, and no version of HTML strict has ever included the target attribute. The other two versions, transitional and
frameset, were not updated, because there was nothing to update. If you want to use the target attribute, use XHTML
1.0 transitional.
What is the use of XHTML Modularization?
XHTML Modularization is not aimed at the regular users of XHTML, but at designers of XHTML-based languages. It
had been observed that companies and groups had the tendency to design their own versions of HTML and XHTML
that were often not interoperable at basic levels. XHTML Modularization splits XHTML into a number of modules
that can be individually selected when defining a new language; in this way any XHTML-based language that uses
tables is guaranteed to use the same definition of tables, and not some divergent version. Modularization also makes it
clear where it is OK to add new elements, and where it is not.
Why is XHTML2 needed? Isn't XHTML 1 good enough?
HTML and XHTML have done good service, but there are many things that can be improved. Areas that have
received particular attention include better structuring possibilities, removing features that are duplicated in XML,
usability, accessibility, internationalization, device independence, better forms, and reducing the need for scripting.
Is <img> being replaced by <object> in XHTML2?
No. <img> is being replaced in XHTML2, but by something else (although you could use <object> if you wanted).
The design of <img> has many problems in HTML:
• It has no fallback possibilities, so that if you use an image of type PNG for instance, and the browser can't
handle that type, the only alternative is to use the alt text. This fact has hampered the adoption of PNG
images, which in many ways are better than GIF and JPG, since people have continued to use the lowest-
common denominator format, to ensure that everyone can see the images.
• The alt text cannot be marked up, so that if it gets used, you just get the plain text.
• It is possible to include a longdesc link to a description of the image, to help people who cannot see, but it is
seldom implemented.
What XHTML2 does is say that all images are equivalent to some piece of content; it does this by allowing you to put
a src attribute on any element at all. What this says is: if the image is available, and the browser can process it, use it,
otherwise use the content of the element. For instance:
<p src="map.png">Exit from the station, turn left,
go straight on to <strong>High Street</strong>,
and turn right</p>
The advantage of this is that if the image is not available for some reason (such as network failure) or the browser can't
render that sort of image, your document is still usable. If you want to supply more than one sort of image, you can do:
<p src="map.png"><span src="map.gif">Exit from station...</span></p>
although it is better to use content negotiation if your server supports it (and most do):
<p src="map">Exit from station...</p>
which would negotiate with the browser which sort of image it accepts, and give the browser its preferred sort. If there
is no available image, then the content of the element would be used. This has an added advantage that you can later
add other image types on your server and you don't have to change the page for it still to work.
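The fallback rule can be sketched as a small recursive function. This is an illustration only, not how any browser
implements it: elements are modelled as plain objects with an optional src and a list of children, and canRender is
a hypothetical callback that says whether the browser can fetch and display a given resource.

```javascript
// Returns what would be presented for a node: the image if its src can be
// rendered, otherwise the (recursively resolved) content of the element.
function render(node, canRender) {
  if (typeof node === "string") return node;
  if (node.src && canRender(node.src)) return "[image " + node.src + "]";
  return node.children.map(function (child) {
    return render(child, canRender);
  }).join("");
}

// <p src="map.png"><span src="map.gif">Exit from station...</span></p>
var paragraph = {
  src: "map.png",
  children: [{ src: "map.gif", children: ["Exit from station..."] }]
};

// A browser that only handles GIF skips map.png and uses map.gif.
var gifOnly = render(paragraph, function (src) { return /\.gif$/.test(src); });
```

If neither image can be rendered, the same call falls through to the text content of the element.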
Why doesn't XHTML2 use XLink?
XLink and XHTML had different requirements for linking that turned out not to be reconcilable.
Why isn't XHTML2 backwards compatible?
It is, but in a different way to how previous versions of HTML were backwards compatible.
Because earlier versions of HTML were special-purpose languages, it was necessary to ensure a level of backwards
compatibility with new versions so that new documents would still be usable in older browsers. For instance, this is
why the <meta> element has its content in an attribute rather than in the content of the element, since it would have
shown up in older browsers.
However, thanks to XML and stylesheets, such strict element-wise backwards compatibility is no longer necessary,
since an XML-based browser, which at the time of writing means more than 95% of browsers in use, can process
new markup languages without having to be updated. Much of XHTML 2 works already in existing browsers,
browsers that are not pre-programmed to accept XHTML2. Much works, but not all: when forms and tables were
added to HTML, people had to wait for new versions of browsers; similarly some parts of XHTML 2, such as XForms
and XML Events, still require user agents that understand that functionality.
Why is xml:space set to 'preserve' on all elements of XHTML? I don't want to see extra space in
my output.
The attribute xml:space is about input: that is to say, it controls if the spaces will be present in the DOM (i.e. in the
internal version of the document inside the browser); it says nothing about what will appear on your screen. Output
whitespace is controlled by the CSS property 'white-space'. Set it to 'pre' and the spaces in the DOM will be
preserved on output; set it to 'normal' and the whitespace will be collapsed (CSS3 will have more properties to enable
greater control).
This is the reason that all elements are set to xml:space="preserve" in XHTML2, otherwise the CSS 'white-space'
property would have no effect, and you would have no control over visible whitespace. The default stylesheet will set
'white-space' to 'normal' for all elements except <pre>, but you will be free to change them.
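The split between input and output can be pictured with a much simplified sketch. It assumes the DOM already holds
every space from the source (the effect of xml:space="preserve") and models only two values of the CSS white-space
property. The function name renderText is hypothetical, and real CSS collapsing has more rules than this (for
example at the start and end of lines).

```javascript
// domText: the text as stored in the DOM, with all whitespace preserved.
// whiteSpace: "pre" keeps it as is; anything else is treated as "normal".
function renderText(domText, whiteSpace) {
  if (whiteSpace === "pre") return domText;
  // "normal": runs of spaces, tabs and line breaks collapse to one space.
  return domText.replace(/[ \t\r\n]+/g, " ");
}

var stored = "Exit   from\n  the station";
var collapsed = renderText(stored, "normal"); // "Exit from the station"
var preserved = renderText(stored, "pre");    // identical to stored
```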

XML Events for HTML Authors


Introduction
This document is a quick introduction to XML Events for HTML authors. XML Events is a method of catching events
in markup languages that offers several advantages over the HTML onclick style of event handling.

The Basic Idea


The important thing to know about XML Events is that it uses exactly the same event mechanism as HTML, only
written differently.
Consider this simple HTML example:
<input type="submit" onclick="validate(); return true;">
This says that if the <input> element (or any of its children) gets the click event, then the piece of code in the
onclick attribute is performed.
We say "or any of its children" because in a case like
<a href="..." onclick="...">A <em>very</em> nice place to go</a>
or
<a href="..." onclick="..."><strong>More</strong></a>
you want the onclick to be performed even if the click actually happens on the <em> or <strong> elements. In these
cases we call the element that was clicked on the target, and the element that responds to the event an observer
(though often target and observer are the same element).
So what you see is that there are three important things involved: an event, an observer, and a piece of script (called a
handler). As you can see from the above, the fourth thing, the target, isn't usually important.
There are some problems with the way HTML specifies the relationship between the three:
• the event name is hard-wired into the language, rather than being a parameter, so that to be able to deal with a
new sort of event you have to add a new attribute, like onflash if an event called flash were introduced.
• the event name is usually very hardware specific, such as click, when in fact you don't care how the button is
activated, only that it has been activated
• you can only use one scripting language (since you can't have two attributes called onclick, one for JavaScript
and one for VB)
• event handling and markup are intertwined — there are no ways to separate the two.
XML Events specifies the relationship between the event, observer and handler in a different way. The equivalent to:
<input type="submit" onclick="validate(); return true;">
would be:
<input type="submit">
<script ev:event="DOMActivate" type="text/javascript">
validate();
</script>
</input>
(Note that there is no markup language yet that allows you to write exactly this: these examples just show what the
equivalent markup would be, in order to explain the concepts. There are some pointers at the end to languages that
already use XML Events.)
This says that the <script> element is a handler for the DOMActivate event (which we use in preference to click,
because buttons can be activated in different ways, not just by clicks), and in the absence of any other information, the
parent element is the observer (<input> in this case).
(Note that handlers must only be evaluated when the event happens, and not when the document is loading as with
<script> in HTML4.)
This approach now allows you to specify handlers for different scripting languages:
<input type="submit">
<script ev:event="DOMActivate" type="text/javascript">
...
</script>
<script ev:event="DOMActivate" type="text/vbs">
...
</script>
</input>
and/or different events:
<input type="submit">
<script ev:event="DOMActivate" type="text/javascript">
...
</script>
<script ev:event="DOMFocusIn" type="text/javascript">
...
</script>
</input>
You now know enough to be able to do event handling using XML Events. However, there are some other useful
features.
Other Ways to Specify the Relationship
Just as with HTML, there are other ways you can specify the event-observer-handler relationship. You can put the
information on the handler, on the observer, or on a <listener> element. Note that whichever method you use, you
always have to specify the three parts: so if you use the handler you have to say what the event and observer are, if you
use the observer you have to say what the event and handler are, and with <listener> you have to specify all three.
The reason why you would want to do this is mostly to make your documents more manageable. You can separate all
scripting out of the main body of your document. It also allows you to avoid duplicating your handlers, and to keep
just one copy which you can apply to several places.
Specifying the Relationship on the Handler
Here you move the handler to some other part of the document, and specify the relationship there (like some versions
of HTML use the for attribute on the <script> element):
<script ev:observer="button" ev:event="DOMActivate" type="text/javascript">
validate();
</script>
...
<input type="submit" id="button"/>

Specifying the Relationship with <listener>


Here you again put the handler somewhere, and specify the relationship in another place with the <listener> element
(this allows you to use the same handler for more than one observer):
<ev:listener observer="button" handler="#validator" event="DOMActivate"/>
...
<script id="validator" type="text/javascript">
validate();
</script>
...
<input type="submit" id="button"/>
(Note that in this case it is the <listener> element that carries the ev: prefix, and not the attributes.)
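In script terms, a <listener> plays a role similar to a call to the DOM addEventListener method: it names the
observer, the event and the handler in one place. The sketch below assumes an environment that provides the standard
EventTarget and Event constructors (current browsers and recent versions of Node.js); plain EventTarget objects stand
in for the two buttons, and listen is a hypothetical helper.

```javascript
var calls = [];

// One shared handler, playing the role of <script id="validator">.
function validate(event) {
  calls.push(event.type);
}

// Hypothetical helper mirroring <ev:listener observer=... event=... handler=.../>.
function listen(observer, eventName, handler) {
  observer.addEventListener(eventName, handler);
}

// Stand-ins for two <input type="submit"> elements.
var button1 = new EventTarget();
var button2 = new EventTarget();

listen(button1, "DOMActivate", validate);
listen(button2, "DOMActivate", validate);

button1.dispatchEvent(new Event("DOMActivate"));
button2.dispatchEvent(new Event("DOMActivate"));
```

The same validate function responds for both observers, which is the reuse that <listener> makes possible.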
Specifying the Relationship on the Observer
And finally, you can specify the relationship on the observer:
<script id="validator" type="text/javascript">
validate();
</script>
...
<input type="submit" ev:handler="#validator" ev:event="DOMActivate"/>

Fine Details
Really for everyday use of events, that's all you need to know. There are however a few details that may occasionally
come in useful.
phase ("capture" | "default"*)
Since, as was said, event handlers get invoked for an element or any of its children, it is possible for instance to listen
for all clicks on anything within a paragraph:
<p>
<script ev:event="DOMActivate" type="...">
...
</script>
...
<a href="...">...</a>
...
<a href="...">...</a>
...
</p>
which would respond to clicks on either contained <a> (as well as on anything else in the paragraph).
However, since the <a> elements might also have handlers on them, the phase attribute is used to specify whether a
given handler should be activated before handlers lower in the tree, or after them. Normally handlers lower in the tree
for an event are activated first, but the attribute ev:phase="capture" causes a handler higher in the tree to be
executed before lower handlers for the same event:
<p>
<script ev:event="DOMActivate" ev:phase="capture" type="...">
<!-- This handler will be done before any lower in the tree -->
...
</script>
...
<a href="...">...</a>
...
<a href="...">...</a>
...
</p>

target
If you really do care about which element is the target, then you can say that you are only listening for events for that
element:
<p>
<script ev:event="DOMActivate" ev:target="img1" type="...">
...
</script>
...
<a href="..."><img id="img1" ... /><img id="img2" ... /></a>
</p>
(Remember, the target is the thing actually clicked on, which we usually aren't particularly interested in.)
propagate ("stop" | "continue"*)
Event handlers for an event are handled in the following order: first all handlers with ev:phase="capture", starting
first at the top of the tree, and then working down to the target's parent element, and then all other handlers, working
from the target element, back up to the top of the tree.
If any handler has ev:propagate="stop" then no handler after those on the current element will be invoked for the
current event.
<body>
<script ev:event="DOMActivate" ev:propagate="stop" ev:phase="capture" type="...">
<!-- This script is the only one that will respond to DOMActivate -->
...
</script>
...
</body>
By the way, if there is more than one handler on an element for the same event, it is undefined what order they are
performed in.
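The ordering rules for phase and propagate can be sketched as a small function that lists which handlers would run.
This is a simplified model for illustration, not an implementation of XML Events: dispatchOrder is a hypothetical
name, path is the chain of elements from the top of the tree down to the target, and each handler is a plain object
with an observer, an optional phase and an optional propagate value.

```javascript
function dispatchOrder(path, handlers) {
  var invoked = [];
  var stopped = false;

  // Runs all handlers of the wanted phase on one element.
  function runOn(element, wantCapture) {
    handlers.forEach(function (h) {
      var isCapture = h.phase === "capture";
      if (h.observer === element && isCapture === wantCapture) {
        invoked.push(h.name);
        // Handlers on the current element still run; later elements do not.
        if (h.propagate === "stop") stopped = true;
      }
    });
  }

  // Capture handlers: from the top of the tree down to the target's parent.
  for (var i = 0; i < path.length - 1 && !stopped; i++) runOn(path[i], true);
  // All other handlers: from the target back up to the top of the tree.
  for (var j = path.length - 1; j >= 0 && !stopped; j--) runOn(path[j], false);

  return invoked;
}

var order = dispatchOrder(["body", "p", "a"], [
  { name: "a-default", observer: "a" },
  { name: "p-capture", observer: "p", phase: "capture" },
  { name: "body-default", observer: "body" },
  { name: "body-capture", observer: "body", phase: "capture" }
]);
// order: body-capture, p-capture, a-default, body-default
```

Adding propagate: "stop" to the body-capture handler would leave it as the only handler in the result.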
defaultAction ("cancel" | "perform"*)
Many events have a default action which gets carried out after all event handlers for the event in the tree have been
performed. For instance, a click on a button will cause the button to be activated, a click on an <a> or one of its
children will cause the link to be activated. It is possible to stop this default action with the attribute
ev:defaultAction="cancel".
<input type="submit" ...>
<script ev:event="DOMActivate" ev:defaultAction="cancel" type="...">
<!-- This script will be activated, but the default action will not be done -->
...
</script>
</input>
(This is what onclick="...; return false;" does in HTML4. If you want to do this conditionally in XML Events,
you should call the DOM preventDefault method appropriately.)
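A conditional cancel might look like the following sketch. It assumes an environment with the standard EventTarget
and Event constructors (current browsers and recent versions of Node.js); a plain EventTarget stands in for the
submit button, and formIsValid is a hypothetical flag that real code would compute from the form.

```javascript
var button = new EventTarget(); // stand-in for <input type="submit">
var formIsValid = false;        // hypothetical validation result

button.addEventListener("DOMActivate", function (event) {
  // Cancel the default action only when validation fails.
  if (!formIsValid) {
    event.preventDefault();
  }
});

// The event must be cancelable for preventDefault to have any effect.
var activation = new Event("DOMActivate", { cancelable: true });

// dispatchEvent returns false when a handler cancelled the default action.
var performDefault = button.dispatchEvent(activation);
```

Here performDefault tells the caller whether the default action should still be carried out.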