Professional Documents
Culture Documents
Ciancarini 1998
Ciancarini 1998
Abstract
XML has been proposed in order to bring to the Web a markup language free of the shortcomings of HTML. in
particular the inextensibility of the set of valid elements (tags). Stylesheet languages have been proposed for XML, in
order to provide precise and sophisticated typographical control over the appearance of text-based data. We have developed
a rendering engine for HTML and XML documents, providing rudimentary support for typography, but allowing easy
extensions (displets) for any kind of data, including non-textual ones, such as math, charts, graphs, etc. Some extensions
have already been developed: here we present the one for supporting Z, a notation for formal specifications of software
systems. 0 1998 Published by Elsevier Science B.V. All rights reserved.
asked to develop a new markup meta-language, the are not well supported by the X*L language family
extended Markup Language (XML), that could take except for links and text.
the place of SGML on the Web. XML documents The stylesheet languages available within the
would have to be straightforwardly usable over the XSL framework focus on typographical control: they
Internet. compatible with SGML, and easy to create. have precise instructions on the type of font to use,
Of course, XML gets flexibility by giving up the margins and alignment to provide to paragraphs,
specificity: contrarily to HTML, the elements used how to format tables, etc. In short, they are thought
in XML documents do not have a pre-defined mean- essentially for textual data.
ing that is understood and used by XML software, This has not deterred many special communities
and they do not have a given graphical aspect or from using XML for standardising their notation
typographical style. That is, XML only specifies the and markup needs, but they are planning special
correct abstract syntax of documents, but not how applications and applets for the correct display of
each element should be displayed as, nor what be- their documents.
haviour should be associated with them. For instance. the HTML-Math working group has
Two ancillary mechanisms were designed to take changed its charter from extending HTML to speci-
care of some of these issues: hypertext linking was fying a XML application [ 111. The display problem
considered a behaviour important and widespread is clear for mathematical markup: using images of
enough among the document types to be directly equations, that have been created statically and inde-
supported in the language, and the XLL specification pendently of the surrounding text, is limiting in sev-
[2] is meant to take care of the linking paradigm eral ways: the point size and type, the background,
of XML. The typographical rendering of markup the line alignment may match the surrounding text
elements, that is part of the definition of HTML, on the system where it was produced. but not the
is outside of the concern of XML, since the ele- system where it is displayed. Furthermore, while the
ment set is open and the emphasis is on structure resolution of images and text may be the same on the
markup rather than formatting. The XSL (Extensi- screen, print resolution usually is much improved for
ble Stylesheet Language. [I]) specification, currently text, but would remain exactly the same for a GIF or
just being proposed for discussion, is meant to pro- a JPEG image.
vide a general way to attach a set of rendering Also the problem of rendering software is clear
instructions to the elements of an XML document. to the authors of the document: at present, only spe-
Although at this time XSL is in a too early stage cialised Java applets and plug-ins can be foreseen to
of development to be reliably discussed, it is clear take advantage of the new notation. Further, more
that the proposed mechanism will provide XML syn- general and flexible solutions can be hypothesised
tax for selecting and using graphical constructs from using some ways to extend the stylesheet mechanism
either one of the two major stylesheet languages of XML and relying on the document model that
currently available for markup languages: CSS and is being proposed for Web-based data 171. Currently
DSSSL. there are some applications and browser that do or
It is worth noting that, in spite of the rel- plan to support MathML (WehEQ ‘. IBM techex-
evant differences that can be found between the plorer j, MathType’, etc.).
two stylesheet languages, neither of them supports Among the browsers that can display MathML
or consider the display of non-textual data, which it is important to mention Amaya5, which is a
clearly is an important need of many of the above general-purpose extensible WWW browser. Among
mentioned communities. the things that can be extended is the management
of the HTML language itself. It is thus possible to
extend both the tag set that is recognized by the tations require special software to take advantage of
HTML parser, and the display objects that constitute them: we must be able to display these notations, and
the display of the tags themselves. Amaya uses to activate behaviours on them: to provide a general
the Thot languages to specify the extensions: the S markup meta-language is not enough. One class of
language is used for the specification of the new behaviours (links), and one class of display elements
structural element, while the P language is used to (text) have been considered important enough to be
specify the appearance of the new elements. singled out in the development of the XML standard,
The chemistry community is also using XML to and will have their own specifications.
provide support for the myriad of different nota- For other types of behaviours (for instance, those
tion needs of their science. The CML language [ 131 required by forms elements), and for the display of
is an application of XML meant for chemical data, non-textual data, the attitude is towards creating spe-
another kind of information that cannot be easily rep- cialised applications that can understand and enact
resented by text. JUMBO’ is a specialised browser the behaviours and the rendering option implicit in
for CML documents: it is an autonomous application the special notation. Thus, while applications such
(an applet is being developed currently), and pro- as JUMBO or MathEQ are commendable for their
vides several independent windows where each type generality and usefulness, there is the need for an
of chemical data is displayed. integrated environment that provides a general ren-
Displets were proposed in [ 191 as a way to extend dering and behaviour engine for arbitrary markup
HTML documents using Java. The HTML language elements.
was extended on a per-document basis by defin-
ing new tags as needed, and providing Java classes
to take care of their graphical display. While not 4. The DispletManager applet
providing all the functionality and flexibility of a
full meta-markup language such as XML, HTML Our first experiment with rendering arbitrary, non-
extended with displets could allow all kinds of spe- text-based markup extensions [ 191 was to modify an
cialised notations and graphical effects while at the existing browser to allow the parsing and the vi-
same time leveraging over the existing and well- sualisation of new HTML-like elements. To do so,
known set of elements defined by HTML. we took an early version of the HotJava browser,
And still we are not convinced that this is a bad whose source code was freely available and modified
idea when striving beyond HTML. Admittedly, in it so that it could accept on-the-fly extensions of the
some instances documents simply cannot be forced HTML DTD and load the appropriate classes (called
into the existing elements of the HTML language, displets) whenever the newly defined tags were to be
and require a complete and radical re-thinking of displayed. That experiment was extremely limited.
the elements that need to be used. In these cases a in that we used an old version of the Java language,
full meta-language is needed such as XML, so that and worked only on a specific version of a specific
one can define the markup elements from scratch browser. Furthermore, we heavily relied on the ex-
and without legacy influences. In many other cases, isting rendering architecture of the browser and just
on the other hand, there are one or two special provided a minimal effort implementation (basically
requirements that cannot be satisfactorily covered by a displet was just a sequence of drawing instructions
HTML, and that require the definition and use of a for the visualisation of the elements).
few more tags than those available in the standard In this paper, on the contrary, we report about the
language. In these cases, it would be useful to be DispletManager applet, a general, extensible render-
able to extend the existing HTML tag set with those ing and behaviour architecture we have been work-
few new ones, rather than starting from scratch with ing on, which can be used for both extensions to
all the tags both special and trivial. HTML and straight XML documents. This architec-
The key issue in our opinion is that special no- ture is embodied in a Java applet that can be run
within any Java-enabled browser such as Netscape
’ http://www.venus.co.uWomf/cml/download.html Communicator or MS Internet Explorer.
a chart element may have a mathematical formula
Host browser as one of the labels, and some staff notation as an-
other, where some notes may act as hypertext links.
l the rendering engine must work both for extended
4. I. Extending HTML
Fig. I. The general structure of the DispletManagerapplet.
As mentioned, we believe that extending HTML
Fundamental design requirements for the render- still has a function in case one needs to provide
ing engine were as follows: graphical support for some notation, within a docu-
l it must be possible to create special code for ment that can be easily mapped onto plain HTML.
rendering arbitrarily odd data types, in particular The following is an example of a simple HTML
non-textual data (displets). document with a simple extension for displaying text
l all displets must easily integrate with each other: in reverse video:
<HTML>
<HEAD>
</HEAD>
<BODY BGCOLOR=white>
<applet archive="displets.zip" code="DispletManager.class" width=450
height=460>
<param name = "def" value = I'
-c/body>
” >
</applet>
</body>
Figure 2 shows how the document results on to the DispletManager class. The standard HTML el-
screen. ements are all pre-defined, including links and form
The document, or just the part that contains the elements, and have their own displet classes. The
new element, is given as the parameter “HTMLcode” “def” parameter of the applet allows the definition of
the new elements, providing some simple syntactical
support and the URL of the displet code in charge of
providing its rendering.
Upon loading the applet, the displet manager will
start the HTML parser and patch it with the ele-
ments defined in the “def” parameter. It will then
parse the “HTMLcode” parameter, verifying that the
new elements are being used according to the given
grammar. The “def” parameter acts as a simplified
DTD for the new tags: it allows to specify the names,
!I
731s document contam tlonof : :.
some common HTML HTML is ,+
,I’ attributes, inclusion rules and minimisation features
defined for text in reverse colors >.“_
i, I2
II‘_Z
allowed for the new elements, as well as the URL of
the Java code containing the rendering applet.
Forms are also allowed, The resulting tree then is examined and the proper
Yonrname 1 J
displet for each node is activated. Each displet is re-
Your e-ml 71 quired to produce a (list of) bitmaps of its content.
Cormxgfrom 0 Afrtca 0 America Each displet may set parameters and wait for the dis-
0 Asia 0 Australra @ Europa plet of its sub-elements to return their bitmaps, before
producing its own. For instance, the P displet sets
some values (such as margins, line spacing, font and
size) that may affect its sub-elements, waits for all
of them to return their bitmaps (one for each word.
Fig 2. Rendering an HTML displet with an extension. because P combines whole words into lines) and then
P Ciuncurini rt al. /Cottyxtter Networks and lSDN Swvms 30 f lY9Xi 225-237 7.31
creates its own list of bitmaps (one for each line). 4.2. Displaying XML documents with XSL styles
We are using a modification and update of a
beta version of an old Sun HTML parser. Currently, The same displet may be used for an XML docu-
the HTML displet package contains displets for all ment chunk. Since XML comes with no pre-defined
HTML 3.2 tags, except tables. Anchors and form el- rendering for any elements that is used, it is neces-
ements are included and work correctly. This allows sary to provide a map that connects the elements of
authors to use standard HTML elements within the the document to the displets that must be used for
special elements they define: for instance, it is pos- the display. We are using XSL for this.
sible to have styled text, form elements or hypertext The following is similar to the previous example
links within a chart or a timeline. of the reverse colour, but in XML:
<HTML><HEAD>
<TITLE>Displet test - XML</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF">
<xsl>
<rule>
<target-element value='para'/>
<css.div font-size='lZ'>
<children/>
</css.div>
<./rule>
<rule>
<target-element value='rev'/>
iexample.reverse>
<children/>
<example.reverse>
</rule>
i/xsl> ">
‘( >
</applet>
</BODY></HTML>
The displet manager knows we are dealing with provide support for at least one more implementa-
XML because of the presence of the “XMLcode” tion. In order to associate the correct displet class
parameter, and activates the correct parser. We are
using the msxml parser’ by Microsoft, but plan to ’ http://www.microsoft.comlstandardslxmlixmlparse.htn~
232 I? Cimcarini et al. /Co117puter Networks and lSDN S~.QPIIIS 30 (1998) 225-227
with the relevant XML element, we are developing classes, such as those for rendering non-textual data,
our own XSL processor. to be used in an XSL specification.
According to the very early specification of XSL To the time of this writing. only a few pattern
113, an XSL stylesheet is a XML document con- selectors are available. As for the action part, we are
taining a list of style rules. Each rule is composed currently limiting to the main CSS flow objects.
of a pattern selector and an associated action. For
every element of the document to be displayed, the 4.3. The rendering engine
XSL processor should find the best matching pat-
tern, and activate the associated action to provide The rendering engine used by the DispletManager
the rendering. In our example, the selector allows to applet is a set of Java classes that provide the rendering
specify the actual name of the XML element to be for the appropriate document elements. These classes
processed. A default rule is provided that matches are all subclasses of the DocElement class, which
all patterns. so that every element can have an action provides the framework of the rendering procedure.
to determine its display. The actions of an XSL rule All classes provide a createBitmap ( ) method,
are a representation of the structure of the rendering whose purpose is to create and return the bitmap of
objects (in stylesheet parlance, the flow objects) that the llow object of the considered markup element
have to be created. The flow objects that are planned on the basis of the bitmaps of its sub-elements. The
to be available in XSL rules derive either from CSS createBitmap ( ) method is usually not seen by
or from DSSSL. The ambiguity is due to the fact the implementer of new classes, and provides the
that there is a partial overlap in the meaning of CSS following functionalities:
and DSSSL flow objects, and therefore the need may l an active drawing environment is managed. The
arise for using either model. Flow object elements drawing environment is a set of parameters that
may be combined with literal text, and elements are used by the rendering methods of the classes in
generated procedurally by the use of ECMAscript order to decide how to create the bitmaps. For in-
scripts [S]. The <children/> element specifies that stance, a paragraph-like class may set some param-
the sub-elements of the element being considered eters that will be used by itself, such as margins,
need to be recursively processed for determining the line spacing, alignment, etc., and some that will
correct rendering of the element. be used by its sub-elements, such as font name,
Our implementation of the XSL processor is pro- font size, font colour, etc. The createBi tmap ( )
totypical and rudimentary. but features and function- method allows a displet to set its own attributes
alities are being added daily. In our interpretation of with the set~arams ( ) method, and restores the
the proposed specification, the actions are names of previous situation when the displet is finished.
displet classes. We plan to create both a CSS and a Since createBitmap ( ) methods are activated
DSSSL package, that will properly subclass and tai- recursively, this creates a stack that provides the
lor the appropriate internal classes of the rendering proper parameters at any level of recursion.
engine. Differently from the proposed specification, l the rendering of sub-elements is managed. The
therefore, it is necessary to preface the name of the presence/absence of the <children/> element
flow object with the name of the style mode: for in the XSL rule, or some internal decisions for the
instance, css.div or dsssl.paragraph instead of DIV HTML displet. may cause or prevent the render-
in an exclusive CSS setting and paragraph in an ing of the sub-elements of the current element.
exclusive DSSSL setting. l the rendering of the element is managed. Af-
Furthermore. it is possible to create one’s own ter the bitmaps of the sub-elements of the el-
package of classes, and use it liberally within an ement have been created (if appropriate), the
XSL specification, by appropriately specifying the createBitmap () method calls the render()
full name of the classes. In our example. the “re- method, which creates the final bitmap (or set of
verse” action comes from neither CSS nor DSSSL, bitmaps) that will be returned. Different classes
but from the “example” package, and is thus ap- will implement render ( ) differently: for in-
propriately specified. This allows special rendering stance, the render ( ) method of a block element
will collect the bitmaps of its sub-elements in a ver-
tical stack (one above the other), and provide a sin-
gle bitmap of the whole element, while the ren-
der ( ) method of a paragraph wiI1 collect its sub-
elements side-by-side in lines of the given width,
and provide a bitmap for every line it has created:
this allow the element containing the paragraph to
decide how much of the paragraph to display at a
time (for instance, in case of scrolling).
l active elements are specified and created. Active
elements. EntityElements are provided for the man- by side with their siblings. Inline elements are
agement of XML and HTML entities such as & used within block elements and may be text-
or the definition of new ones. TagElements are used based, images or something else. The StyledText
for the creation of the structure flow objects of the class allows the specification of text runs of ar-
document: they are either flow objects, block objects, bitrary styles. Inline elements specify the places
inline elements or special elements. where they can be broken by creating as many
l A block element is a single object that stands alone bitmaps as break points. This allows the con-
in the vertical layout of the document. Paragraphs taining paragraph or block element to determine
or tables are block elements. A flow element is a where the line should be broken.
block element that is built piecemeal: while plain l Special elements are completely tailorable. While
block elements are built from start to end be- in the previous classes displet programmers can
fore the createBi tmap ( ) returns, flow elements only overload the setparams ( ) and render ()
build each of their sub-element and return, and are methods, here all methods are overloadable, and
called as many times as there are sub-elements. can be customised.
This allows long and complex elements to be ren- As an example, this is the complete source code
dered only for the possibly small section that is ac- of the “reverse” displet:
package example ;
import displet.* ;
As a proof of concept for our rendering engine, Fig. 3. A displet for 2 schemata
we have developed a full set of displets for the
management of Z specifications. Z [18] is special 5 years, we have decided that the proof of concept
notation that uses mathematical and logical elements for our rendering engine would be the implemen-
for the specification of software systems. Z also uses tation of Z specifications both by extending HTML
a special set of symbols and graphical boxes that and by specifying a DTD chunk for XML. Z specih-
makes it impossible to specify using existing markup cations are documents of mixed free text and formal
systems or even the foreseen MathML [ 111. specifications. Schemas are the most common graph-
Software designers needing to share Z specihca- ical objects in Z formal specifications. A schema is a
tions have used so far special LaTeX extensions, but graphical box composed of a declaration part. where
could not till recently use it on the Web, except by variables are declared, and an axiom part. contain-
creating GIF images. The Z browser [ 141 is a plug-in ing predicates that must hold for those variables. An
for Netscape designed so that it is possible to show Z example schema is shown in Fig. 4.
specifications in an HTML document, although with The preceding schema has been created with dis-
the clear limitations that plug-ins imply. plets extending HTML. The following is the markup
Since an SGML DTD for Z has existed for the last for the schema in Fig. 4:
<HTML><HEAD>
<TITLE>Birthday example</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF">
<body>
<givendef>
<a name='name'>NAME</a>,
<a name='date'>DATE</a>
</givendef>
<Pa
<schemadef>
BirthdayBook
<decpart>
<declaration> known: &pset;
<a href='#name'>NAME</a>
</declaration>
<declaration> birthday:
<a href='#name'>NAME</a> &pfun;
236 P Ciuncarini YI al. /Computer Networks und ISDN S~strms 30 (199X) 225-237
<a href='#date'>DATEc/a>
</declaration>
</decpart>
<axpart>
<predicate>known = &dom; birthday</predicate>
</axpart>
</schemadef>
</body>
">
c/applet>
</BODY>
</HTML>
We have created a complete package of Z displets XML, and a list of all the displets we have created so
providing support for the whole notation set: enti- far.
ties and tags closely reflect those of the Z DTD as Work on the displet manager is continuing. Our
published in [lo]. A conversion utility for Z specifi- future aims are at creating a complete XSL engine,
cations in LaTeX to the corresponding HTML ones and to extend the displet manager to handle position-
has also been created. able and scriptable elements. Several examples of
displets are being produced as well, many of which
are not typographical in nature, and are thus beyond
5. Conclusions the capabilities of current stylesheet languages.
www.textuality.com/sgml-erb/WD-xml-lang.html 2nd World Wide Web Conference ‘94: Mosarc cmd the
M. Bieber and F. Vitali, Toward support for hypermedia Web, http://www.ncsa.uiuc.edu/SDGIIT94/Proceedings/Aut
on the World Wide Web. IEEE Compute6 30(l): 62-70. ools/sperberg-mcqueen/sperberg.html
January 1997. [ 181 J.M. Spivey. The Z Noration. a Reference Manual, 2nd
M. Bieber. F. Vitali. H. Ashman, V. Balasubramanian and edition, Prentice Hall, Englewood-Cliffs. NJ, 1992
H. Oinas-Kukkonen. Fourth generation hypertext: some [I91 F. Vitali. C. Chiu and M. Bieber. Extending HTML in a
rnissing links for the World Wide Web. fnflrrmationat Jortr- principled way with dispirts. in: Pmt. of the 6/h WWW
nal /?f‘HumLln-Conlputrr Studies 41: 3 l-65, 1997. International Conjerence; also Comput. Netxwrks lSDN
Document Object Model Specification, W3C Working Draft Systems 29 (8-13): 1 115-l 128. 1997.
09-Oct- 1997, http://www.w3,orgffR/WD-DOM
Standard ECMA-262 ECMAScript: A general purpose, Paolo Ciancarini is Associate Pro-
cross-platform programmin g language. June 1997. http://w fessor of Computer Science at the
ww.ecma.chlstand/ecma-262.htm University of Bologna. His research
P. Flynn (Ed.). Frequently Asked Questions about the Ex- interests include: coordination lan-
tensihle Markup Language (the XML FAQ). http://www.uc guages and systems, programming
c.irlxmil systems based on distributed objects.
J. Hammond. The Z interchange format, in: J. Nicholls formal methods in software engineer-
(Ed.). The Z Not&on. 199.5. Appendix E. http://www.coml ing, and markup languages.
ab.ox.ac.uk/archive/z.html
P. Ion and R. Miner (Eds.), Mathematical Markup Lan-
guage. W3C Working Draft, IO July 1997, http://www.w3.
orglpub/WWWlTRAVD-math-9707 IO
Alfred0 Rizzi has recently graduated
H. Lie and B. Bos. Cascading Style Sheets, level I. W3C
in Computer Science at the Univer-
Recommendation 17 Dee 1996. http://www.w3.orglpublW
sity of Bologna and currently works
WWITRIREC-CSS I
as software developer in a private
P. Murray-Rust. Chemical Markup Language. http://www.v
tirm. He is interested in object ori-
enus.co.uk/omf/cml/
ented programming languages and
L. Mikusiak. V. Vojtek. J. Hasaralejko and J. Hanzelova.
WWW technologies.
Z browser: a tool for visualization of Z specifications, in:
J.P. Bowen and M.G. Hinchey. (Eds.), ZUM ‘95: The Z
Fonnol Sprcification Notutian Proceedings. Lecture Notes
in Computer Science 967: 5 10-526. 1995.
lnternalional Standards Organisation. IS0 8879, Informa-
tion processing - Text and oftice systems - Standard Gen- Fabio Vitali is Assistant Professor of
cralized Markup Language (SGML) Computer Science at the University
D. Siegel. The balkanization of the Web, http://www.dsiege of Bologna. His research interests
I.com/balkanization/ include: hypertext systems, version-
C.M. Sperberg-McQueen and R.F. Goldstein, HTML to ing systems, collaborative systems,
the max: a manifesto for adding SGML intelligence to markup languages, digital typogra-
the World Wide Web. in: Electronic Proc. qf the phy. and coordination systems.