
Q.1) Relation between XML, HTML, SGML.

Relation between XML and HTML

1. XML is defined as a syntax

2. HTML is defined as a vocabulary
3. A given markup language might be either or both (or neither)
4. HTML to Well-formed XML


XML is a close relative of HTML, the language of the World Wide Web. Both use what is known
as markup: symbols that describe the contents of a file (XML) or a page (HTML). While HTML
describes the contents of a web page, its text, and its graphics, XML carries the data itself together
with a description of its structure, so the data can be displayed (like HTML), stored (like a file), or
immediately processed and used by an application (like data).
Both XML and HTML enable people to send messages over the Internet. The essential
difference is that HTML lets a sender tell a receiver's browser how to display a message, while
XML enables a sender to tell a receiver how to handle the content of a message. This
difference is precisely what is needed at this juncture in the Internet revolution. XML will enable banks
to create knowledge systems and to exchange financial information with customers and with such
third-party service providers as stockbrokers, mortgage and insurance companies, and mutual funds.
Probably the most significant departure from HTML is that XML specifies the structure of a
document via its accompanying DTD and is robust enough to describe a wide range of abstract structures.
Since the writer of the document can define this DTD, any customized tags may be defined and used
within the document. XML can therefore be used to describe data objects, structured records and many
other types of structured data. The industry has been quick to appreciate the potential of this new
standard for sharing and exchanging documents on the Internet and intranets, and also for sharing a huge
variety of structured data using established Internet standards such as the Hypertext Transfer Protocol
(HTTP) as the mediating protocol.

1. XML is not a replacement for HTML.

2. XML and HTML were designed with different goals.
3. XML was designed to describe data and to focus on what data is.
4. HTML was designed to display data and to focus on how data looks.
5. HTML is about displaying information, XML is about describing information
6. HTML is static and XML is dynamic.

XML Relationship with SGML

1. Standardization effort of the W3C.

2. Conforms to ISO 8879.
3. Simplifies SGML constructs for electronic delivery of documents.
4. Allows structured documents without the full complexity of SGML.
5. Allows user-defined tags.
6. Can be viewed in an XML browser without requiring a DTD (a well-formed document is enough).
7. Allows use of entities.
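
The points about user-defined tags, DTDs and entities can be illustrated with a small sketch. The document below, including its tag names, the entity name "bank" and all values, is made up for illustration. It uses Python's standard xml.etree.ElementTree module, which reads the DTD in the internal subset and expands the entity but does not validate the document against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag names, the entity name and the values are
# assumptions made up for this example.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE account [
  <!ELEMENT account (owner, balance)>
  <!ELEMENT owner (#PCDATA)>
  <!ELEMENT balance (#PCDATA)>
  <!ATTLIST balance currency CDATA #REQUIRED>
  <!ENTITY bank "First Example Bank">
]>
<account>
  <owner>A. Customer, &bank;</owner>
  <balance currency="USD">1250.00</balance>
</account>
"""

# The parser checks well-formedness and expands the internal entity &bank;
# it does NOT check the document against the element declarations.
root = ET.fromstring(DOCUMENT)
owner = root.find("owner").text
balance = root.find("balance")

print(root.tag)                               # user-defined root tag
print(owner)                                  # entity reference replaced by its value
print(balance.get("currency"), balance.text)  # attribute and element content
```

A validating parser would additionally reject a document whose elements do not match the declarations in the DTD.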

Difference between SGML and XML

The Extensible Markup Language (XML) is designed to be a subset of SGML. To achieve this, XML
was defined by removing features and options from the SGML standard.

The following list describes features that are available in SGML but not in XML.


2. Short references
3. The AND connector "&" in content models
4. Inclusion and exclusion exceptions in content models
5. #CURRENT and #CONREF default values for attributes
6. Attribute declared values of: NAME, NAMES, NUMBER, NUMBERS, NUTOKEN, and
7. Abstract syntax
8. Capacities and quantities
9. Comments embedded in other markup declarations
10. Multiple comments in a single comment declaration
11. Omission of quotes on attribute default values consisting of name characters in attribute

Q.2) Benefit of XML.
► In the past, desktop systems did not provide much memory, disk space, or CPU power, so it was
not feasible to store complex document information in a memory-inefficient ASCII or text format. In
addition, parsing text data is very CPU-intensive. Older file formats therefore kept as ASCII only the
data that did not need to be converted into an internal binary representation. This meant that some
content was kept as ASCII data while layout coordinates or style details were stored as binary data.
Since applications wrote their in-memory representation of a document more or less directly into the
corresponding document file, file formats were very closely tied to the applications that created them.
As a consequence, a text document file from vendor A was incompatible with the format used by
vendor B, and vice versa.

Another problem was that every new application version introduced features that required new data
fields in the corresponding file formats. Because of this, even applications from the same vendor used
different and incompatible file formats for different versions of the same application, which made it
difficult to exchange files among people and companies — unless everybody was using the same
applications and application versions.

Because there was no guaranteed compatibility between different versions of the same application, being
able to open old files required the availability of all corresponding application versions, unless a new
application version supported all past file format versions. This was especially problematic in areas
where documents had to be archived for a long time. Therefore, many document management systems
chose a different file format as a workaround. Most of the time, only the graphical representations of
documents were stored in more accessible formats, including Tagged Image File Format (TIFF) or
Portable Document Format (PDF).

Storing document information in a binary file format also had another disadvantage. If files became
corrupted, it was almost impossible to access the document data. Since many companies store important
business data in text and spreadsheet documents, losing the data from just one file could be a significant

A few years ago, most people did not care that much about interoperability. Today, however, the
integration of heterogeneous business systems is a very important topic. Most enterprises want to
streamline their processes to cut down operating costs. In the past, integration was often achieved by
using macro-based scripting capabilities and exchange file formats that more or less represented the
lowest common denominator between different applications. Recently, the demand for interoperability
among applications has increased dramatically. This becomes obvious when one looks at the rate of
adoption of XML and Web services by most software vendors.

There are different definitions and opinions about what XML is, and what its real purpose and usage
should be. Many vendors introduced XML as an application-independent format of data for exchange,
import, and export purposes. Similar to the Hypertext Markup Language (HTML), which represents
layout information in an application-agnostic format, XML focused on the application-independent
representation of data fields and records.

Since the focus of XML is on data, this implies separating data from its representation. However, in the
case of office documents, even layout information is important document data, and thus can and should
be stored using XML together with the core document data, including its content. In addition, formatting
information should be as open as the document content: it should be easily readable and
well-documented so that it can be processed by other applications.

Information coded in XML is easy to read and understand, plus it can be processed easily by computers.

XML is a W3C standard, endorsed by software industry market leaders.

There is no fixed set of tags. New tags can be created as they are needed.

Self-description

In traditional databases, data records require schemas set up by the database administrator. XML
documents can be stored without such definitions, because they contain metadata in the form of tags
and attributes. XML provides a basis for author identification and versioning at the element level. Any
XML tag can possess an unlimited number of attributes such as author or version.

Contains machine-readable context information

Tags, attributes and element structure provide context information that can be used to interpret the
meaning of content, opening up new possibilities for highly efficient search engines, intelligent data
mining, agents, etc. This is a major advantage over HTML or plain text, where context information is
difficult or impossible to evaluate.
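
As a minimal sketch of element-level metadata, the following Python example reads author and version attributes from every element of a small document. The document, its tag names and the attribute values are hypothetical; only the idea of attaching author and version to individual elements comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical report: every element carries its own author and version.
REPORT = """<report author="J. Smith" version="3">
  <title author="J. Smith" version="1">Quarterly results</title>
  <summary author="R. Lee" version="2">Revenue rose in the third quarter.</summary>
</report>"""

root = ET.fromstring(REPORT)


def describe(element):
    """Return the metadata stored in the element's attributes as one line."""
    return (f"{element.tag}: author={element.get('author')}, "
            f"version={element.get('version')}")


# iter() walks the root element and all of its descendants in document order.
descriptions = [describe(element) for element in root.iter()]
for line in descriptions:
    print(line)
```

No separate schema is needed to read this information, because the names of the fields travel with the data.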

Separates content from presentation

XML tags describe meaning not presentation. The motto of HTML is: “I know how it looks”, whereas
the motto of XML is: “I know what it means, and you tell me how it should look.” The look and feel of
an XML document can be controlled by XSL style sheets, allowing the look of a document (or of a
complete Web site) to be changed without touching the content of the document. Multiple views or
presentations of the same content are easily rendered.
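
The idea of several presentations of one piece of content can be sketched in Python. This is not XSL: two plain Python functions stand in for two style sheets, and the product document with its tag names is made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content: the tags say what the data means, not how it looks.
CONTENT = ET.fromstring(
    "<product><name>Olive oil</name>"
    "<price currency='EUR'>7.50</price></product>"
)


def as_text(product):
    """Plain-text view, for example for a printed report."""
    price = product.find("price")
    return f"{product.findtext('name')}: {price.text} {price.get('currency')}"


def as_html(product):
    """HTML list-item view, for example for a web page."""
    price = product.find("price")
    name = escape(product.findtext("name"))
    amount = escape(price.text)
    currency = escape(price.get("currency"))
    return f"<li><b>{name}</b> ({amount} {currency})</li>"


print(as_text(CONTENT))
print(as_html(CONTENT))
```

Changing the look means changing one of the view functions (or, with XSL, one style sheet); the content itself stays untouched.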

Supports multilingual documents and Unicode

This is important for the internationalization of applications.

Facilitates the comparison and aggregation of data

The tree structure of XML documents allows documents to be compared and aggregated efficiently
element by element.
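
A minimal sketch of an element-by-element comparison, assuming two small hypothetical order documents. The function name diff and the report format are illustrative choices, not part of any standard.

```python
import xml.etree.ElementTree as ET


def diff(a, b, path=""):
    """Walk two element trees in parallel and list the differences found."""
    if a.tag != b.tag:
        return [f"{path or '/'}: tag {a.tag!r} != {b.tag!r}"]
    here = f"{path}/{a.tag}"
    found = []
    if a.attrib != b.attrib:
        found.append(f"{here}: attributes differ")
    if (a.text or "").strip() != (b.text or "").strip():
        found.append(f"{here}: text {a.text!r} != {b.text!r}")
    if len(a) != len(b):
        found.append(f"{here}: {len(a)} children != {len(b)} children")
    # Compare children pairwise, in document order.
    for child_a, child_b in zip(a, b):
        found.extend(diff(child_a, child_b, here))
    return found


OLD = ET.fromstring("<order><item>oil</item><qty>2</qty></order>")
NEW = ET.fromstring("<order><item>oil</item><qty>3</qty></order>")

changes = diff(OLD, NEW)
print(changes)
```

Because both documents are trees, the comparison only has to descend into matching branches and can report the exact path of each change.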

Can embed multiple data types

XML documents can contain any possible data type – from multimedia data (image, sound, video) to
active components (Java applets, ActiveX).

Can embed existing data

Mapping existing data structures like file systems or relational databases to XML is simple. XML
supports multiple data formats and can cover all existing data structures.
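
One way such a mapping could look is sketched below with Python's standard sqlite3 and xml.etree.ElementTree modules. The table, its columns and the helper name table_to_xml are assumptions made up for the example; real mappings also have to decide how to represent NULL values, keys and data types.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Linus", "Helsinki")],
)


def table_to_xml(connection, table):
    """Map each row to an element and each column to a child element.

    The table name is inserted into the SQL text directly, so it must come
    from trusted code, never from user input.
    """
    cursor = connection.execute(f"SELECT * FROM {table} ORDER BY rowid")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table + "s")
    for row in cursor:
        record = ET.SubElement(root, table)
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root


xml_text = ET.tostring(table_to_xml(conn, "customer"), encoding="unicode")
print(xml_text)
```

The column names become tags, so the resulting document describes itself without the database schema.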

Provides a ‘one-server view’ for distributed data
XML documents can consist of nested elements that are distributed over multiple remote servers. XML
is currently the most sophisticated format for distributed data – the World Wide Web can be seen as one
huge XML database.

Rapid adoption by industry

Software AG, IBM, Sun, Microsoft, Netscape, Data Channel, SAP and many others have already
announced support for XML. Microsoft will use XML as the exchange format for its Office product line,
while both Microsoft’s and Netscape’s Web browsers support XML. SAP has announced support of
XML through the SAP Business Connector with R/3. Software AG supports XML in its Bolero and
Natural product lines and provides Tamino, a native XML database.

Platform Independent: Uses plain-text format which is understandable on all platforms.

It allows data interchange between different computers

It enables smart searches.

It provides user-selected view of data.

Problems that XML can help solve include:

• Internet searches that return hundreds and hundreds of hits for a simple topic. XML-coded data
in a search engine’s database would allow users to clearly specify they’re searching for information
about olive oil, not a petroleum based product and not Popeye’s girlfriend.
• The immense task of integrating one application with another.
• The distribution of documents in a variety of standards that may conflict with the tools you use.
If different companies all use XML, competing products will still be able to talk to each other.
• The need for multiple copies of the same information: one for your printed reports, one for your
Web site and another for CD-ROMs. Today, it takes time-consuming manual effort to turn
information in a printed report into a usable format for a digital medium, for instance. However,
XML would automatically code the instructions for each format, saving time and reducing the
human-error factor.

But the biggest advantage is that XML is an open standard, not a product of one vendor. It does not hold
users hostage to one software company. The more companies agree to a standard series of codes for a
given application, such as banking, the more powerful XML becomes. And since everyone's product is
better with XML features, there are incentives to cooperate.
• Uses human, not computer, language. XML is readable (and understandable, even by novices)
and no more difficult to code than HTML.
• Completely compatible with Java and 100% portable. Any application that can process XML (on
any platform) can use your information.
• Extendable. Create your own tags (or use tags created by others) that use the native language of
your domain, have the attributes you need, and make sense to you and your users.

Q.3) Future of XML.

► The future of XML is expansion. It is going to spread into many areas of web publishing technology.
XSL is a perfect example of this expansion. As developers began taking the flexible language of XML
to heart, the need to fill the gaps became evident. This led to a set of tools that began with XML
schemas, moved on to XPath, and then advanced further to XSLT. It is difficult to determine the
intersections ahead, but many experts believe the future lies in Ajax and advanced parsers, such as VTD-

Technology is always trying to build a better mousetrap. That is true whether you are discussing
electronics, computers or web applications. The future of XML lies in its stability. XML is a platform
for innovation and growth that will continue to improve the Internet and web page design.

Immediate future

At this level, the messaging, discovery, and trading partner agreements are standardized and can be
automated.
Lowest level: email ad-hoc XML, conducting all other aspects manually using email or phone calls.
• This would usually be a one-sided relationship.
• But it allows the smallest businesses to participate, and very large players can engage at this
level when it suits them.
• Repository supplies messaging specification and commonly used industry tag sets.
• This first specification of this layer will be available from ebXML at the end of this week in

Near future

• Standardized XML messages defined by humans within some standardized semantic framework
• Enables off-the-shelf software solutions
• Enables a transition from existing business practices by replacing traditional business documents
with human-readable XML equivalents
• Many options for defining the semantic framework, such as prose or prose + DTD
• Repository includes business models and standard XML business document schemas

Distant future

Standardized messages automatically generated from a single specific standardized semantic framework
• Modeling done from the top down
• XML schemas are generated automatically
• The whole system operates without human intervention
• Repository includes business models designed for automation

You don't need a crystal ball or Tarot cards to see the future of XML, just examine some of the current
and emerging technologies.

XHTML - Extensible Hypertext Markup Language

As the world grew, so did the Internet. With the expansion of Internet capabilities came new devices,
software and alternative languages. Browser designers began having trouble integrating the changing
technology with standard HTML. Part of the problem fell in the coding of HTML documents. Poorly
formed HTML was wreaking havoc in the Internet world. To combat this problem, programmers
combined the reigning king, HTML, with the precise syntax of XML and named the result XHTML.
XHTML allows developers to expand into smaller devices unable to interpret poorly formed code, such
as cell phones.
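
The difference between poorly formed HTML and the strict syntax of XHTML can be sketched with a well-formedness check. The two markup strings and the helper name is_well_formed are made up for illustration; the check uses the XML parser in Python's standard library and tests well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Typical sloppy HTML: <br> and <p> are never closed.
SLOPPY_HTML = "<p>First line<br>Second line<p>Unclosed paragraph"

# The same content written the XHTML way: every element is closed.
XHTML = "<div><p>First line<br/>Second line</p><p>Closed paragraph</p></div>"

print(is_well_formed(SLOPPY_HTML))
print(is_well_formed(XHTML))
```

A forgiving desktop browser may still display the first string, but a small device with a strict XML parser has no way to recover from it.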

XSLT - XSL Transformation

XSLT is the tool that transforms an XML document into another document, such as XHTML. By itself,
XML does not do much but contain data. There needs to be a bridge between the processing page and the
information contained in XML. To select the parts of the document it works on, XSLT turns to another
tool, XPath.

XPath - XML Path Language

XPath is a language for addressing the parts of an XML document so that it can be navigated properly.
XML has one major feature: there are no predefined elements. While this appears, on the outside, to be
a plus, from a developer's standpoint it presents a problem. How does a program know whether <table>
is an HTML table, a piece of furniture or something else? XPath treats an XML document as a tree of
nodes (elements, attributes, text) and provides expressions for selecting them. This allows programs
such as XSLT processors to map out XML documents in order to transform them into XHTML.
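
A small sketch of node selection with path expressions, using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The catalog document and its tag and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog in which <table> means a piece of furniture.
CATALOG = ET.fromstring("""<catalog>
  <table material="oak"><price>120</price></table>
  <table material="pine"><price>80</price></table>
  <chair material="oak"><price>45</price></chair>
</catalog>""")

# Select every child element whose material attribute equals 'oak'.
oak_items = [element.tag for element in CATALOG.findall("./*[@material='oak']")]

# Select the price element inside each table element.
table_prices = [price.text for price in CATALOG.findall("./table/price")]

print(oak_items)
print(table_prices)
```

Each expression describes a path through the tree of nodes, which is the same mechanism an XSLT style sheet uses to pick the parts of a document it transforms.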

Ajax - Asynchronous JavaScript and XML

Ajax programming takes existing web components, JavaScript and XML, and marries them to create
efficient web pages. As web technology grew, pages slowed down. Client-side scripting and richer user
interfaces moved web design to a new level. Unfortunately, this had a side effect: in order to refresh
data for updates, the entire page needed to reload. Ajax allows updates and information
exchanges to happen behind the scenes. With Ajax, small amounts of data can update without interrupting
the flow or the observer's eye. Ajax is a way to create dynamic web interfaces that act seamlessly. Some
examples of modern Ajax-based applications include Google, YouTube and Facebook.