A markup language is a system for annotating a document in a way that is syntactically
distinguishable from the text,[1] meaning that when the document is processed for display, the
markup is not shown and is used only to format the text.[2] The idea and
terminology evolved from the "marking up" of paper manuscripts (i.e., the revision instructions by
editors), which are traditionally written with a red or blue pencil on authors' manuscripts.[3] Such
"markup" typically includes both content corrections (such as spelling, punctuation, or movement
of content) and typographic instructions, such as to make a heading larger or boldface.
In digital media, this "blue pencil instruction text" was replaced by tags which ideally indicate
what the parts of the document are, rather than details of how they might be shown on some
display. This lets authors avoid formatting every instance of the same kind of thing redundantly
(and possibly inconsistently). It also avoids the specification of fonts and dimensions which may
not apply to many users (such as those with different-size displays, impaired vision and screen-
reading software).
Early markup systems typically included typesetting instructions, as troff, TeX and LaTeX do,
while Scribe and most modern markup systems name components, and later process those
names to apply formatting or other processing, as in the case of XML.
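The "name components, then process the names" approach can be sketched as follows. This is a minimal illustration in Python using the standard-library xml.etree.ElementTree module; the document, the tag names (doc, title, para) and the formatting rules are all hypothetical and not taken from any particular markup system.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tags say what each part is, not how it looks.
SOURCE = "<doc><title>Markup</title><para>First.</para><para>Second.</para></doc>"

# Hypothetical formatting rules, kept separate from the document itself.
RULES = {
    "title": lambda text: text.upper(),   # render titles in capitals
    "para": lambda text: "  " + text,     # indent paragraphs by two spaces
}


def render(source):
    """Apply the rule registered for each component name, in document order."""
    root = ET.fromstring(source)
    return [RULES[child.tag](child.text or "") for child in root]


if __name__ == "__main__":
    print("\n".join(render(SOURCE)))
```

Changing how every paragraph looks means editing one entry in RULES; the document itself is untouched.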
Some markup languages, such as the widely used HTML, have pre-defined presentation
semantics—meaning that their specification prescribes some aspects of how to present
the structured data on particular media. HTML, like DocBook, Open eBook, JATS and countless
others, is a specific application of the markup meta-languages SGML and XML. That is, SGML
and XML enable users to specify particular schemas, which determine just what elements,
attributes, and other features are permitted, and where.
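To make the role of a schema concrete, the sketch below checks a document against a toy content model. It is a deliberately simplified stand-in written in Python, not DTD or XML Schema validation: the SCHEMA table, the element names and the violations function are hypothetical, and real schema languages also constrain order, repetition and data types.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy schema:
# element name -> (permitted child elements, permitted attributes)
SCHEMA = {
    "book": ({"title", "chapter"}, {"lang"}),
    "title": (set(), set()),
    "chapter": ({"title", "para"}, {"id"}),
    "para": (set(), set()),
}


def violations(elem, schema=SCHEMA):
    """Return a list of messages for everything the toy schema does not permit."""
    if elem.tag not in schema:
        return [f"unknown element <{elem.tag}>"]
    children, attributes = schema[elem.tag]
    problems = []
    for name in elem.attrib:
        if name not in attributes:
            problems.append(f"attribute '{name}' not allowed on <{elem.tag}>")
    for child in elem:
        # A known element may still be in the wrong place.
        if child.tag in schema and child.tag not in children:
            problems.append(f"<{child.tag}> not allowed inside <{elem.tag}>")
        problems.extend(violations(child, schema))
    return problems


if __name__ == "__main__":
    bad = ET.fromstring("<book><para>x</para></book>")
    print(violations(bad))
```

The same para element is accepted inside chapter but rejected directly inside book, which is the "what is permitted, and where" part of the description above.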
One extremely important characteristic of most markup languages is that they allow mixing
markup directly into text streams. This happens all the time in documents: A few words in a
sentence must be emphasized, or identified as a proper name, defined term, or other special
item. This is quite different structurally from traditional databases, where it is by definition
impossible to have data that is (for example) within a record, but not within any field. Likewise,
markup for natural language texts must maintain ordering: it would not suffice to make each
paragraph of a book into a "paragraph" record, where those records do not maintain order.
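Mixed content and ordering can be seen directly when such a sentence is parsed. The following is a small sketch in Python with the standard-library xml.etree.ElementTree module; the sample sentence and the tag names (p, em, term) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sentence with markup mixed directly into the text stream.
SENTENCE = "<p>A few <em>words</em> in a <term>sentence</term> stand out.</p>"


def flatten(elem):
    """Return (label, text) pairs in document order; label is None for plain text."""
    parts = []
    if elem.text:                      # text before the first child element
        parts.append((None, elem.text))
    for child in elem:
        parts.append((child.tag, "".join(child.itertext())))
        if child.tail:                 # text between this child and the next
            parts.append((None, child.tail))
    return parts


if __name__ == "__main__":
    for label, text in flatten(ET.fromstring(SENTENCE)):
        print(label, repr(text))
```

Some of the text belongs to the p element but to none of its child elements, and the sentence is only recoverable if the pieces are kept in order; a record-and-field layout has no natural place for either property.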
Contents
1 Etymology
2 Types of markup language
3 History of markup languages
  3.1 GenCode
  3.2 troff and nroff
  3.3 TeX for formulas
  3.4 Scribe, GML and SGML
    3.4.1 HTML
  3.5 XML
    3.5.1 XHTML
    3.5.2 Other XML-based applications
4 Features of markup languages
5 Alternative usages
6 See also
7 References
8 External links
Etymology
The noun markup is derived from the traditional publishing practice called "marking
up" a manuscript,[4] which involves adding handwritten annotations, in the form of conventional
symbolic printer's instructions, in the margins and the text of a paper or printed manuscript. The
term is jargon used in coding proof. For centuries, this task was done primarily by skilled
typographers known as "markup men"[5] or "d markers"[6] who marked up text to indicate
what typeface, style, and size should be applied to each part, and then passed the manuscript to
others for typesetting by hand or machine. Markup was also commonly applied by editors,
proofreaders, publishers, and graphic designers, and indeed by document authors, all of whom
might also mark other things, such as corrections, changes, etc.
XML
Main article: XML
XML (Extensible Markup Language) is a meta markup language that is very widely
used. XML was developed by the World Wide Web Consortium, in a committee
created and chaired by Jon Bosak. The main purpose of XML was to simplify SGML
by focusing on a particular problem: documents on the Internet.[24] XML remains a
meta-language like SGML, allowing users to create any tags needed (hence
"extensible") and then to describe those tags and their permitted uses.
XML adoption was helped because every XML document can be written in such a
way that it is also an SGML document, and existing SGML users and software could
switch to XML fairly easily. However, XML eliminated many of the more complex
features of SGML to simplify implementation environments such as documents and
publications. It appeared to strike a happy medium between simplicity and flexibility,
as well as supporting very robust schema definition and validation tools, and was
rapidly adopted for many other uses. XML is now widely used for
communicating data between applications, for serializing program data, for hardware
communications protocols, and for vector graphics, among many other uses besides
documents.
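As an illustration of XML used for serializing program data, here is a minimal round-trip sketch in Python with the standard-library xml.etree.ElementTree module. The helper names to_xml and from_xml and the record layout are hypothetical; the sketch assumes a flat dictionary whose keys are valid XML element names and whose values are non-empty strings.

```python
import xml.etree.ElementTree as ET


def to_xml(record, tag="record"):
    """Serialize a flat dict of strings as one element per key (assumed layout)."""
    root = ET.Element(tag)
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the dict from the child elements of the root."""
    return {child.tag: child.text or "" for child in ET.fromstring(text)}


if __name__ == "__main__":
    wire = to_xml({"name": "Ada", "lang": "XML"})
    print(wire)
    print(from_xml(wire))
```

Because the tag names are chosen by the application, the receiving program needs only an XML parser and agreement on those names, not a purpose-built file format.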
XHTML
Main article: XHTML
Alternative usages
While the idea of markup language originated with text documents, there is
increasing use of markup languages in the presentation of other types of
information, including playlists, vector graphics, web services, content syndication,
and user interfaces. Most of these are XML applications, because XML is a well-
defined and extensible language.
The use of XML has also led to the possibility of combining multiple markup
languages into a single profile, like XHTML+SMIL and XHTML+MathML+SVG.[26]
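Combined documents of this kind rely on XML namespaces to keep each vocabulary's element names apart. The sketch below, in Python with the standard-library xml.etree.ElementTree module, parses a made-up fragment mixing XHTML and MathML and reports which vocabulary each element belongs to. The fragment and the vocabularies function are illustrative assumptions; the two namespace URIs are the ones published for XHTML and MathML.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Hypothetical fragment: an XHTML paragraph containing a MathML expression.
DOC = (
    f'<p xmlns="{XHTML}">Radius: '
    f'<math xmlns="{MATHML}"><mi>r</mi></math></p>'
)


def vocabularies(source):
    """Return (local name, namespace URI) for every element, in document order."""
    result = []
    for elem in ET.fromstring(source).iter():
        # ElementTree stores namespaced tags as "{uri}local".
        uri, _, local = elem.tag[1:].partition("}")
        result.append((local, uri))
    return result


if __name__ == "__main__":
    for local, uri in vocabularies(DOC):
        print(local, uri)
```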