
Internet Client-Server Systems

Web Server

Client

1 Internet Banking
Internet Client-Server Systems

Client

Web Server

Uber is an app and taxi service that connects riders and drivers with the tap
of a button by using their phone’s GPS capabilities.
Client Client
Outline of Today’s Class
◆ Web Servers
◆ Static and Dynamic Web Pages
◆ CGI Programming
◆ What makes the CGI work?
➢ FORM
➢ GET and POST Methods
➢ QUERY_STRING and CONTENT_LENGTH

◆ SGML, HTML and XHTML


◆ XML and DTD
◆ XML Examples
Web Servers
◆ How does a web server work?
You contact the web server and request a file. The
server returns the file.

[Figure: PC-1 sends "GET foo.html" to the web server and receives foo.html;
PC-2 sends "GET index.html" and receives index.html. The server's files are
/myDir/index.html, /myDir/foo.html and /myDir/bar.html.]
Web Servers
Request 1

Request 2
Web Servers
◆ Most web servers are very simple. They
just return files to the PC that requests them.

◆ The web browser does the hard work of
translating a file into pretty pictures.

◆ See “View->Source” for the file that is
returned by the server.
Web Servers
◆ It would be a Bad Thing if anyone on
the internet could retrieve any file on
the web server.

◆ The files are kept in a special directory;
requests for files are relative to that
directory.
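A minimal Python sketch of the idea of serving files relative to one special directory. The root `/myDir` is borrowed from the earlier figure, and the function name `resolve_request` is a hypothetical helper, not part of any particular web server. A real server would also have to deal with symbolic links and URL decoding, which this sketch ignores.

```python
import posixpath
from typing import Optional

DOCUMENT_ROOT = "/myDir"  # assumed root directory, taken from the figure


def resolve_request(url_path: str, root: str = DOCUMENT_ROOT) -> Optional[str]:
    """Map a requested path to a file under root, or None if it escapes root."""
    # Join the request onto the root and collapse "." and ".." segments.
    candidate = posixpath.normpath(posixpath.join(root, url_path.lstrip("/")))
    # Accept the result only if it is still inside the document root.
    if candidate == root or candidate.startswith(root.rstrip("/") + "/"):
        return candidate
    return None


print(resolve_request("/foo.html"))
print(resolve_request("/../etc/passwd"))
```

The second call returns `None` because the normalised path no longer lies under the document root.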
Static Web Pages

Request file

Retrieve file

Send file
Dynamic Web Pages

Request service

Do Computation

Generate HTML
page with results
of computation

Return dynamically
generated HTML file
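The three steps of a dynamic page (do a computation, generate HTML, return it) can be sketched in a few lines of Python. The function name `render_result_page` and the "sum of numbers" computation are illustrative assumptions only; any server-side language could play the same role.

```python
import html


def render_result_page(name, numbers):
    """Do a small computation and wrap the result in a freshly built HTML page."""
    total = sum(numbers)  # the "do computation" step
    safe_name = html.escape(name)  # escape user input before placing it in HTML
    return (
        "<html><body>"
        f"<p>Hello, {safe_name}.</p>"
        f"<p>The sum of your numbers is {total}.</p>"
        "</body></html>"
    )


print(render_result_page("Jo", [1, 2, 3]))
```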
CGI and Web Forms
◆ How to write the HTML that sends data to
the server?

◆ What does the server have to do to process
this information?

◆ The most common method to handle this
is CGI (Common Gateway Interface)
CGI Programming

HTTP
SERVER

CLIENT CGI Program


CGI Programming
[Figure: the HTTP server passes data to the CGI program through environment
variables and stdin; the CGI program sends its output back through stdout.]
Important CGI
Environment Variables
REQUEST_METHOD

QUERY_STRING

CONTENT_LENGTH
Request Method: Get

◆ GET requests can include a query string as
part of the URL:

GET /cgi-bin/finger?hollingd HTTP/1.0

Request method: GET
Resource name: /cgi-bin/finger
Delimiter: ?
Query string: hollingd
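The breakdown of the request line can be reproduced with a short Python sketch. `parse_request_line` is a hypothetical helper written for this example; it assumes the line has exactly three space-separated parts, as in the slide.

```python
def parse_request_line(line):
    """Split an HTTP request line into method, resource name and query string."""
    method, target, version = line.strip().split(" ")
    # Everything after the first "?" delimiter is the query string.
    resource, _, query = target.partition("?")
    return {"method": method, "resource": resource,
            "query": query, "version": version}


print(parse_request_line("GET /cgi-bin/finger?hollingd HTTP/1.0"))
```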
CGI URLs
◆ There is a mapping between URLs and CGI
programs provided by a web server. The
exact mapping is not standardized (web
server admin can set it up)

◆ Typically:
➢ requests that start with /CGI-BIN/ , /cgi-bin/
or /cgi/, etc. refer to CGI programs (not to
static documents).
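One way to picture the URL-to-CGI mapping is a simple prefix test. The prefixes below are the ones listed on this slide; since the mapping is not standardized, a real server's configuration may differ, and `is_cgi_request` is only an illustrative name.

```python
# Prefixes taken from the slide; an administrator could configure others.
CGI_PREFIXES = ("/CGI-BIN/", "/cgi-bin/", "/cgi/")


def is_cgi_request(resource):
    """Return True if the resource name should be handled by a CGI program."""
    return resource.startswith(CGI_PREFIXES)


print(is_cgi_request("/cgi-bin/finger"))
print(is_cgi_request("/index.html"))
```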
CGI Programs
◆ When the user hits the “submit” button
the data is sent to the web server
◆ The CGI program that handles it on the
web server is specified in the HTML
Form tag

<FORM method=post action="http://unix.aml.yorku.ca/cgi-bin/formProcessor.pl">


CGI Programs

◆ Anything special about the program?
➢ The web server has to have permissions set to
allow the program to be executed. Typically
this is only turned on in a few directories, e.g.
/cgi-bin
➢ It has to comply with the usual security
requirements for that system.
CGI Programs
◆ What kind of program does it need to be?
➢ Can be written in any language—C++, C,
perl, etc. Just has to be able to process the
attribute-value pairs.

◆ Perl is excellent for its pattern matching
and text processing capabilities.
CGI Programs
The data is sent to the CGI program in a specific
format of attribute-value pairs. The attribute is the
name of the field in the HTML tag; the value is
what the user inputs.

First name: <input type="text" name="firstName">


Middle name: <input type="text" name="middleName"><br>
Last name: <input type="text" name="lastName"><br>

firstName=lee
middleName=harvey
lastName=oswald
CGI Programs
◆ Strengths:
➢ A simple method to send data to the server.
➢ Dynamically generates HTML pages.

◆ Weaknesses
➢ All the processing happens on the server.
➢ Takes time to launch the CGI process on the
server.
➢ Uses a separate process for each request, instead of a thread.
Web Forms
◆ Overview of Web forms
◆ HTML form components
◆ GET & POST methods
◆ Server-side processing with forms
CGI-based Web Application
[Figure: the Web browser sends an HTTP request to the Web server and receives
an HTTP document. HTML forms invoke CGI scripts/applications, which get data
from a database, receive the returned data, and pass their output (HTML) back
through the Web server.]
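Form Interaction with CGI
Participants: Web Browser, Web Server, CGI Program
1. User requests form (Web browser to Web server)
2. Returns form to client (Web server to Web browser)
3. User submits form (Web browser to Web server)
4. Forwards to CGI program (Web server to CGI program)
5. Returns results to server (CGI program to Web server)
6. Returns results to client (Web server to Web browser)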

Network Server
Forms
◆ Forms work in a different and slightly more
complex way than standard HTML pages.
◆ Forms consist of a number of separate data entry
components such as menus and text areas.
◆ The user can select different options from the menus
and enter text in the text entry fields.
◆ A single form can contain many text entry fields
and/or many menus.
◆ To differentiate the menus and text areas from each
other each one is given a unique name, selected by
the Web form designer.
HTML Forms

◆ Each form includes a METHOD that
determines what HTTP method is used
to submit the request.

◆ Each form includes an ACTION that
determines where the request is
made.
HTML Forms
HTML includes elements or tags for creating forms on Web pages.
There are three stages to creating a form:
➢ define the form data [a set of variables]
➢ design the form itself
➢ define the method for processing the form’s data on the
server-side
When the Web page containing the form is loaded, the user can:
➢ enter data into the form
➢ then submit that data to the Web server

[usually by clicking a submit button on the form]


HTML Form Variables
A variable has:
➢ a name
➢ a value
A form contains one or more variables. When the user
fills in the form, values are assigned to these variables.

When the user clicks the submit button, the set of variable
names & corresponding values is sent to the Web server
in an HTTP request.

The Web server can extract the set of variables & values
from the HTTP request, and can do something with them...
Example for HTML Form
<html>
<head>
<title>Query Form</title>
</head>

<body>
<h2>Query Form</h2>
<form method="GET" action="doquery.php">
<p>Your name: <input name="name" type="text" size=30></p>
<p>Your ID: <input name="id" type="text" size=15></p>
<p><input type="submit" value="Submit your query"></p>
<p><input type="reset" value="Clear your query"></p>
</form>
</body>
</html>

Note that this form contains two variables: name & id
Example for HTML Form

<input name="name" type="text" size=30>


<input name="id" type="text" size=15>

<input type="submit" value="Submit your query">


<input type="reset" value="Clear your query">
Forms
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title>Web Engineering - Feedback Form</title>
</head>
<body><h1>Feedback Form</h1>
<p>Any comments please.</p>
<form method = "post" action = "/cgi-bin/feedbackform">
<p>
<input type = "hidden" name = "recipient" value = "webeng@xhtmllecture.com" />
<input type = "hidden" name = "subject" value = "Feedback Form" />
<input type = "hidden" name = "redirect" value = "main.html" />
</p>
<p><label>Name:
<input name = "name" type = "text" size = "25" maxlength = "30" />
</label></p>
<p>
<input type = "submit" value = "Submit comments" />
<input type = "reset" value = "Clear comments" />
</p>
</form></body></html>

◆ Each form must begin and end with form tags.
◆ The method attribute specifies how the form’s data is sent to the Web
server. The post method appends form data to the browser request.
◆ The value of the action attribute specifies the URL of a script on the
Web server.
◆ Input elements are used to send data to the script that processes the form.
◆ A hidden value for the type attribute sends data that is not entered by
the user.
◆ The label element describes the data the user needs to enter in the text box.
◆ The size attribute gives the number of characters visible in the text box.
◆ The maxlength attribute gives the maximum number of characters the user
can input.
◆ The value attribute displays a name on the buttons created.
Forms

Text box created using


input element.

Submit button created


using input element.

Reset button created


using input element.
Table & Form

<TABLE FRAME = none>
<TR><TD ALIGN = right>
Name:<BR>
Card number:<BR>
Expires:<BR>
Telephone:<BR>
</TD>
<TD ALIGN=left><BR>
<FORM method="POST" action="/cgi-bin/myscript.cgi">
<INPUT NAME="name" SIZE=18><BR>
<INPUT NAME="cardnum" SIZE=18><BR>
<INPUT NAME="expires-month" SIZE=2>/
<INPUT NAME="expires-year" SIZE=2><BR>
<INPUT NAME="phone" SIZE=18>
</FORM>
</TD></TR>
</TABLE>
Form Methods
◆ The method attribute on the form tag specifies how the Web
Browser should send the data to the Web server.

◆ Two options:
➢ GET: pass the data in an HTTP GET request
➢ POST: pass the data in an HTTP POST request

◆ In an HTTP GET request, the browser appends the form data to a
URL. For example:

◆ http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234

◆ Note how the variable names & values are appended to the URL.
Any spaces in a value are converted to +.
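The encoding a browser applies to GET form data can be imitated with Python's standard `urllib.parse.urlencode`. The helper name `build_get_url` is an assumption made for this sketch; the URL and field values are the ones from the example above.

```python
from urllib.parse import urlencode


def build_get_url(action, fields):
    """Append URL-encoded form fields to the action URL, as a GET form does."""
    # urlencode joins name=value pairs with "&" and turns spaces into "+".
    return action + "?" + urlencode(fields)


url = build_get_url("http://www.yorku.ca/jhuang/doquery.cgi",
                    {"name": "joe bloggs", "id": "1234"})
print(url)
```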
Form Actions
◆ The action attribute on the form tag specifies what the
Web server should do with the form data.

◆ Common options:
➢ email the data to someone [the mailto action]
➢ pass the data to a script or program

◆ The script will be passed the variables & values, and
can then process them.

◆ For example, the CGI script could use the name & id to
look up student info in a database.
Form Actions
◆ <form method="GET" action="mailto:jhuang@yorku.ca">

◆ Until you can actually use scripts on the server, use the
mailto action. It operates in the same way as the mailto that
you have used in the HTML document.

◆ When used in a form, the mailto action will send an email
to the email address of the person specified. The mailto
action is of limited use for complicated forms but works
adequately for simple forms.

◆ The email received contains all of the names and values in
one long list.
What a CGI will get

◆ The query (from the environment
variable QUERY_STRING) will be a
URL-encoded string containing the
name, value pairs of all form fields.

◆ The CGI must decode the query and
separate the individual fields.
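A hedged sketch of the decoding step, using Python's standard `urllib.parse.parse_qs`. The function `decode_query` is a hypothetical name; it keeps only the first value of each field, which is a simplification, since a form may send the same name more than once.

```python
from urllib.parse import parse_qs


def decode_query(query_string):
    """Decode a URL-encoded query into a dict of field name -> first value."""
    # parse_qs splits on "&" and "=", and converts "+" and %XX escapes back.
    parsed = parse_qs(query_string, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


print(decode_query("name=joe+bloggs&id=1234"))
```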
GET vs. POST
◆ The GET method delivers data (query) as
part of the URL

◆ When using forms, it’s generally better to
use POST:
➢ there are limits on the maximum size of a GET
query string (it is passed in an environment variable)
➢ a POST query string doesn’t show up in the
browser as part of the current URL
CGI reading POST

◆ If REQUEST_METHOD is POST, the
query arrives on STDIN.

◆ The environment variable
CONTENT_LENGTH tells us how much
data to read.
CGI Method Summary
◆ GET:
➢ REQUEST_METHOD is “GET”
➢ QUERY_STRING is the query

◆ POST:
➢ REQUEST_METHOD is “POST”
➢ CONTENT_LENGTH is the size of the query
(in bytes)
➢ query can be read from STDIN
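The summary above can be sketched as one Python function that picks the right source for the query. `read_form_data` is an illustrative name, and the sketch assumes the caller passes in the environment as a dict and standard input as a binary stream (for a real CGI script these would be `os.environ` and `sys.stdin.buffer`).

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return decoded form fields for a GET or POST CGI request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: read exactly CONTENT_LENGTH bytes of query from standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("utf-8")
    else:
        # GET: the query is in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=joe+bloggs&id=1234"}
print(read_form_data(get_env, io.BytesIO(b"")))

body = b"name=joe+bloggs&id=1234"
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(read_form_data(post_env, io.BytesIO(body)))
```

Both calls yield the same fields; only the place the query is read from differs.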
HTTP Form Processing
1. User fills in form & clicks submit
2. Browser sends GET
http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234
3. Server runs the script doquery.cgi, passing form data to it
4. Server sends script results to the browser
5. Browser displays the script results*

*The script results will usually be HTML text


A More Complex Form Example
Text field

Password field

Radio buttons

Drop-down list

Check boxes

Text area

Buttons
Form Processing & Results
◆ The easiest way to deal with form data is to simply email it to an
email address using a mailto form action:

◆ <form method="POST" action="mailto:name@where.com">

◆ More often, we want to process the data on the server-side, using
a program or script.

◆ The old way is to use a so-called CGI Script, usually with a URL
something like:

◆ <form method="POST" action="/cgi-bin/myscript.cgi">

◆ The newer way is to use a server-side technology such as Java
Servlets, or an HTML-embedded scripting language such as JSP or ASP.
We’ll look at how to use Servlets later in the course...
Alternatives for Generating
Dynamic Pages
Can we generate pages dynamically in other ways?
◆ Java Servlets

◆ Java Server Pages

◆ Active Server Pages (ASP)


Dynamic Web Pages
[Figure: techniques for dynamic Web pages, divided into server side and
client side, around a WWW server and a WWW (HTTP) client. Labels: SSI,
CGI program, API, script (embedded in HTML), Java servlet, other program
(application), Java applet.]

SGML

Standard Generalized Markup Language
Some History

◆ SGML

◆ HTML

◆ XML and XHTML


SGML

◆ Standard Generalized Markup Language
◆ Developed by a committee!
◆ Led by Charles Goldfarb, 1978-1986
◆ A grammar to define the structure of documents
◆ Rules define the construct or structure
◆ Terminals are <tags> and strings
HTML & XML
➢ HTML is an application of SGML with a single shared
DTD

➢ HTMLDOC::=(<html> HEAD BODY </html>)

➢ XML is a subset of SGML with many DTDs
allowed
XML
Uses tags to identify semantics of data
◆ looks like HTML, but isn’t
<slide><title>Introduction</title>
<author><first>Jimmy</first>
<last>Huang</last>
</author>
<content>XML this and that</content>
</slide>
◆ is license free, platform-independent and
well-supported
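Because the tags carry the meaning of the data, a program can pull fields out by name. The sketch below parses the slide example with Python's standard `xml.etree.ElementTree`; `summarize_slide` is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET

SLIDE_XML = """<slide><title>Introduction</title>
<author><first>Jimmy</first>
<last>Huang</last>
</author>
<content>XML this and that</content>
</slide>"""


def summarize_slide(xml_text):
    """Extract title, author and content from a <slide> document."""
    slide = ET.fromstring(xml_text)
    first = slide.findtext("author/first")
    last = slide.findtext("author/last")
    return {
        "title": slide.findtext("title"),
        "author": f"{first} {last}",
        "content": slide.findtext("content"),
    }


print(summarize_slide(SLIDE_XML))
```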
HTML

Hypertext Markup Language


◆ Hypertext Markup Language
◆ Presents documents via WWW browsers
◆ Specifies document layout and hyperlinks
◆ Predefines a set of tags (i.e. a common DTD)
HTML: An Example
<HTML>
<TITLE>Statistics Canada</TITLE>
<BODY>
<H3>Welcome to Stats Canada</H3>
Statistics Canada ……. . <p> We like numbers…..
<img src="mapleleaf.gif">
<ul>What we do
<li><a href="census.html">Census</a>
<li><a href="special.html">Special surveys</a>
<li><a href="online.html">Online data</a>
</ul>
</BODY>
</HTML>
HTML
HTML - Advantages
◆ Simple - fixed set of tags
◆ Portable - used with all browsers
◆ Linking - within and to external documents

HTML - Disadvantages
◆ Limited tag set
◆ Can’t separate the presentation from content
◆ Can’t define structure of contents
XHTML

EXtensible Hyper-Text Markup Language
XHTML Basics
◆ Very few real changes from HTML
◆ But more strict

◆ All tags are in lowercase


◆ All tags must be closed
➢ Empty tags
➢ Paired tags
XHTML Document Structure
Overlap versus Nesting
XHTML tags
◆ Start tags and end tags
◆ Start tags - delimited by < and >
◆ End tags - delimited by </ and >
➢ <h1>This is a Large Heading</h1>
➢ <br />This text starts on a new line.

◆ Some start tags also include attributes which
further define information about the element.
!DOCTYPE
◆ HTML 3.2
➢ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">
◆ Netscape’s HTML standard
➢ <!DOCTYPE HTML PUBLIC "-//WebTechs//DTD Mozilla HTML 2.0//EN">
◆ Not strictly necessary for HTML, but highly recommended
◆ Future browsers can still attempt to display your older documents
(written to previous HTML standards) in the way that was
originally intended, even though the HTML language may have
evolved
◆ XHTML
➢ <?xml version = "1.0"?>
➢ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
!DOCTYPE
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Comments: name_of_webpage.html -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title> Web Engineering: XHTML I </title>
</head>
<body>
<p>Welcome to XHTML!</p>
</body>
</html>

Callouts: the !DOCTYPE declaration, the title tags, the body tags.
Images
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Pictures with XHTML -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title>Web Engineering - pictures</title>
</head>
<body>
<p><img src = "angelheart.jpg" height = "251" width = "367"
alt = "An angel" />
<img src = "grail.jpg" height = "180" width = "130"
alt = "A chalice" /></p>
</body>
</html>

◆ The value of the src attribute of the image element is the
location of the image file.
◆ The height and width attributes of the image element give the
height and width of the image.
◆ The value of the alt attribute gives a description of the image.
This description is displayed if the image cannot be displayed.
Colours
◆ <BODY TEXT="aqua">
aqua black blue fuchsia
gray green lime maroon
navy olive purple red
silver teal white yellow
◆ <BODY TEXT="#00FF00">
◆ <FONT COLOR = "#rrggbb" | "colour name">text</FONT>

000000 BLACK
00FF00 BRIGHT-GREEN
FFFFFF WHITE
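The #rrggbb notation packs three two-digit hexadecimal numbers (red, green, blue), each from 00 to FF. A small Python sketch of the conversion follows; `hex_to_rgb` is a name chosen for this illustration.

```python
def hex_to_rgb(colour):
    """Convert "#rrggbb" (or "rrggbb") into a (red, green, blue) tuple."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hex digits is one colour component in the range 0-255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("#00FF00"))
```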
Inline Styles
<h1 style="color:blue; font-style: italic">First Stylesheet Example</h1>

<p>The first example of stylesheets uses an inline style.</p>

<h1>Second Stylesheet Example</h1>

<p>The second example of stylesheets uses a document-level style.</p>

<h1>Third Stylesheet Example</h1>

<p> The third example of stylesheets uses an external stylesheet.</p>
Demonstration:
inline_css.html
XML

EXtensible Markup Language
XML Introduction
◆ The Extensible Markup Language (XML) is a document
processing standard proposed by the World Wide Web
Consortium (W3C), which is related to Standard
Generalised Markup Language (SGML).
◆ Possible to search, sort, manipulate and render XML
using Extensible Stylesheet Language (XSL).
◆ Highly portable
◆ Files end in the .xml extension.
XML & W3C
• XML has been in development since the 1960s through its parent, SGML
(Standard Generalized Markup Language), which is also the parent of
HTML

• XML is a streamlined version of SGML, designed for transmission of structured
data over the Web by a working group of the World Wide Web Consortium
(W3C) in 1996

• Became a W3C Recommendation in February 1998

- www.w3.org/xml
- www.xml.com/axml/axml.html (annotated version)
XML-related Technologies
◆ DTD (Document Type Definition) and XML Schemas are
used to define legal XML tags and their attributes for
particular purposes

◆ CSS (Cascading Style Sheets) describe how to display
HTML or XML in a browser

◆ XSLT (eXtensible Stylesheet Language Transformations)
and XPath are used to translate from one form of XML to
another

◆ DOM (Document Object Model), SAX (Simple API for
XML), and JAXP (Java API for XML Processing) are all
APIs for XML parsing
From HTML to XML..
• HTML’s major drawback: information loses its
structure when translated into HTML
• HTML is a presentation-oriented markup language,
so information embodied in it is difficult to process
• Information and knowledge servers are overloaded
since we have to search information and perform
format processing
• Servers often answer the same request many times
if users request several views on the same data
From HTML to XML..
• HTML:
- Lacks extensibility – can’t create tags or attributes
to parameterise or semantically qualify data
- Lacks structure – does not support the
specification of deep structures needed to represent
database schemas or object-oriented hierarchies
- Lacks validation – does not support language
specification that lets applications check imported
data’s structural validity
XML Goals
As a portable, platform-independent data storage format, XML has these goals:

• support a wide variety of applications,
• be easy to use across the Internet,
• be compatible with SGML,
• make it easy to create programs that process XML,
• be clear and legible (self-describing),
• XML documents should be easy to create,
• XML designs should be quickly prepared, formal & concise etc.
XML..
• XML is not for displaying information but for managing
information.
• A working group of the World Wide Web Consortium (W3C) created
XML as a standard for creating markup languages.
• Designed it for distributing structured documents over the web
• A kind of “light” SGML (Standard Generalized Markup Language)
simplified to meet Web requirements
• Unlike HTML, XML lets users:
➢ Extract data from a document
➢ Define their own tags and attributes
➢ Define data structures and nest document structures to any
complexity level
➢ Make applications that validate a document’s structure. Any XML
document can contain an optional description of its grammar for use by
applications that perform structural validation
XML..
◆ The problem that XML helps us to solve is how to transfer data
between servers, or between the client and the server.
◆ It is a Markup language for describing structured data – content is
separated from presentation.
◆ XML documents contain only data, not formatting instructions, so
applications that process XML documents must decide how to display
the document’s data.
◆ Language for creating markup languages
➢ Can create new tags
◆ For example, a PDA (personal digital assistant) may render an XML
document differently than a wireless phone or desktop computer would
render that document.
HTML and XML
XML stands for eXtensible Markup Language

HTML | XML
HTML is used to mark up text so it can be displayed to users | XML is used to mark up data so it can be processed by computers
HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>) | XML describes only content, or “meaning”
HTML uses a fixed, unchangeable set of tags | In XML, you make up your own tags
XML..
◆ XML is a meta-language
◆ With HTML, existing markup is static: <HEAD> and <BODY>,
for example, are tightly integrated into the HTML standard and
cannot be changed, and are extremely difficult to extend.
◆ XML, on the other hand, allows you to create your own markup
tags and configure each to your liking: for example
➢ <WebEngHeading>
➢ <WebEngSummary>
➢ <WebEngReallyWildFont>
◆ Each of these elements can be defined through user-defined
document type definitions (DTDs), and stylesheets can be applied to
one or more XML documents.
◆ There are no ‘correct’ tags for an XML document, except those
defined by the author
Some Code
◆ Schema
◆ Entity
➢ Passport Details
◆ SubEntities
➢ Last Name
➢ First Name
➢ Address

◆ Entity
➢ Address
◆ SubEntities
➢ Street
➢ City
➢ Town
➢ State
➢ Province
➢ ……..
DTD
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address
(street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
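A content model such as (last_name,first_name+,address) reads like a regular expression over child elements: "," means sequence, "+" one or more, "?" optional, "*" zero or more, "|" a choice. Python's standard library parsers do not validate against a DTD, so the sketch below is not a DTD validator; it only imitates the top-level rule with a hand-written regular expression. The name `matches_passport_model` is hypothetical, and the children of address are not checked.

```python
import re
import xml.etree.ElementTree as ET

# (last_name,first_name+,address) written as a regex over child tag names,
# each name followed by a single space.
PASSPORT_MODEL = re.compile(r"last_name (first_name )+address ")


def matches_passport_model(xml_text):
    """Check only the top-level child sequence of <passport_details>."""
    root = ET.fromstring(xml_text)
    if root.tag != "passport_details":
        return False
    children = "".join(child.tag + " " for child in root)
    return PASSPORT_MODEL.fullmatch(children) is not None


good = ("<passport_details><last_name>Smith</last_name>"
        "<first_name>Jo</first_name><first_name>Stephen</first_name>"
        "<address/></passport_details>")
bad = "<passport_details><last_name>Smith</last_name><address/></passport_details>"
print(matches_passport_model(good))
print(matches_passport_model(bad))
```

The second document fails because first_name+ requires at least one first_name element.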
Internal DTD and Instance
<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address
(street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jhuang@yorku.ca</email>
</address>
</passport_details>
Shared DTD
XML Document specifies the DTD
<?xml version='1.0'?>

<!DOCTYPE passport_details SYSTEM "PassportExt.dtd">

<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jo@theworldaccordingtojo.com</email>
</address>
</passport_details>
XML Examples
◆ XML Source File
➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml

◆ XML Style language
➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xsl

◆ Parsing and rendering XML with IE5+
➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics_xsl.xml
XML Applications
◆ XML permits document authors to create markup for
virtually any type of information.
◆ Authors can create entirely new markup languages for
describing specific types of data, including mathematical
formulas, chemical molecular structures, music, recipes etc.
- XHTML
- VoiceXML (for speech)
- MathML (for mathematics)
- SMIL (the Synchronized Multimedia Integration Language, for
multimedia presentations)
- CML (Chemical Markup Language, for chemistry)
- XBRL (Extensible Business Reporting Language, for financial
data exchange)
XML Parsers
◆ Processing an XML document requires a software program
called an XML parser (or processor). These are available at
no charge in many languages (Java, Python, C++ etc.).

http://www.xml.com/programming/

◆ Parsers check an XML document’s syntax and enable software
programs to process marked-up data. XML parsers can
support the Document Object Model (DOM) or the Simple API
for XML (SAX).
➢ DOM: Build a tree structure containing the XML
document’s data
➢ SAX: Process the document and generate events
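The two parsing styles can be compared side by side with Python's standard library, which ships a DOM implementation (`xml.dom.minidom`) and a SAX interface (`xml.sax`). The sample document and the names `city_via_dom`, `TagCollector` and `tags_via_sax` are assumptions made for this sketch.

```python
import xml.sax
from xml.dom import minidom

DOC = "<address><street>1 Great Street</street><city>GreatCity</city></address>"


def city_via_dom(xml_text):
    """DOM: build the whole tree first, then navigate it."""
    tree = minidom.parseString(xml_text)
    node = tree.getElementsByTagName("city")[0]
    return node.firstChild.data


class TagCollector(xml.sax.ContentHandler):
    """SAX: receive an event for every start tag as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def tags_via_sax(xml_text):
    handler = TagCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.tags


print(city_via_dom(DOC))
print(tags_via_sax(DOC))
```

DOM keeps the whole document in memory, which makes navigation easy; SAX holds nothing unless the handler stores it, which suits large documents.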
XML-related Vocabulary
◆ SGML: Standard Generalized Markup Language
◆ XML : Extensible Markup Language
◆ DTD: Document Type Definition
◆ element: a start and end tag, along with their contents
◆ attribute: a value given in the start tag of an element
◆ entity: a representation of a particular character or string
◆ PI: a Processing Instruction, to possibly be used by a program
that processes this XML
◆ namespace: a unique string (typically a URI) that qualifies element and
attribute names so that different vocabularies do not clash
◆ well-formed XML: XML that follows the basic syntax rules
◆ valid XML: well-formed XML that conforms to a DTD
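Well-formedness can be checked by any XML parser, since a parser must reject input that breaks the basic syntax rules. A minimal Python sketch using the standard `xml.etree.ElementTree` follows; `is_well_formed` is an illustrative name. Checking validity against a DTD needs a validating parser, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b>nested</b></a>"))
print(is_well_formed("<a><b>overlapping</a></b>"))
```

The second input is rejected because its tags overlap instead of nesting.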
