Kay L. O'Halloran - Multimodal Discourse Analysis - Systemic Functional Perspectives (Open Linguistics) (2004)
Systemic-Functional Perspectives
Open Linguistics Series
Series Editor
Robin Fawcett, Cardiff University
The series is 'open' in two related ways. First, it is not confined to works associated with
any one school of linguistics. For almost two decades the series has played a significant
role in establishing and maintaining the present climate of 'openness' in linguistics, and
we intend to maintain this tradition. However, we particularly welcome works which
explore the nature and use of language through modelling its potential for use in social
contexts, or through a cognitive model of language - or indeed a combination of the two.
The series is also 'open' in the sense that it welcomes works that open out 'core'
linguistics in various ways: to give a central place to the description of natural texts and the
use of corpora; to encompass discourse 'above the sentence'; to relate language to other
semiotic systems; to apply linguistics in fields such as education, language pathology and
law; and to explore the areas that lie between linguistics and its neighbouring disciplines
such as semiotics, psychology, sociology, philosophy, and cultural and literary studies.
Continuum also publishes a series that offers a forum for primarily functional
descriptions of languages or parts of languages — Functional Descriptions of Language.
Relations between linguistics and computing are covered in the Communication in Artificial
Intelligence series. Two series, Advances in Applied Linguistics and Communication in Public Life,
publish books in applied linguistics, and the series Modern Pragmatics in Theory and Practice
publishes both social and cognitive perspectives on the making of meaning in language
use. We also publish a range of introductory textbooks on topics in linguistics, semiotics
and deaf studies.
Recent titles in this series
Classroom Discourse Analysis: A Functional Perspective, Frances Christie
Construing Experience through Meaning: A Language-based Approach to Cognition,
M. A. K. Halliday and Christian M. I. M. Matthiessen
Culturally Speaking: Managing Rapport through Talk across Cultures, Helen Spencer-Oatey (ed.)
Educating Eve: The 'Language Instinct' Debate, Geoffrey Sampson
Empirical Linguistics, Geoffrey Sampson
Genre and Institutions: Social Processes in the Workplace and School, Frances Christie and
J. R. Martin (eds)
The Intonation Systems of English, Paul Tench
Language Policy in Britain and France: The Processes of Policy, Dennis Ager
Language Relations across Bering Strait: Reappraising the Archaeological and Linguistic Evidence,
Michael Fortescue
Learning through Language in Early Childhood, Clare Painter
Pedagogy and the Shaping of Consciousness: Linguistic and Social Processes, Frances Christie (ed.)
Register Analysis: Theory and Practice, Mohsen Ghadessy (ed.)
Relations and Functions within and around Language, Peter H. Fries, Michael Cummings,
David Lockwood and William Spruiell (eds)
Researching Language in Schools and Communities: Functional Linguistic Perspectives,
Len Unsworth (ed.)
Summary Justice: Judges Address Juries, Paul Robertshaw
Syntactic Analysis and Description: A Constructional Approach, David G. Lockwood
Thematic Developments in English Texts, Mohsen Ghadessy (ed.)
Ways of Saying: Ways of Meaning. Selected Papers of Ruqaiya Hasan. Carmen Cloran, David
Butt and Geoffrey Williams (eds)
Words, Meaning and Vocabulary: An Introduction to Modern English Lexicology, Howard Jackson
and Etienne Zé Amvela
Working with Discourse: Meaning beyond the Clause, J. R. Martin and David Rose
Multimodal Discourse Analysis
Systemic-Functional Perspectives
continuum
LONDON NEW YORK
Continuum
The Tower Building, 11 York Road, London SE1 7NX
15 East 26th Street, New York, NY 10010
ISBN: 0-8264-7256-7
Introduction
Kay L. O'Halloran
Part I
Three-dimensional material objects in space
1 Opera Ludentes: the Sydney Opera House at work and play
Michael O'Toole
Part II
Electronic media and film
4 Phase and transition, type and instance: patterns in media texts
as seen through a multimodal concordancer
Anthony P. Baldry
Part III
Print media
7 The construal of Ideational meaning in print advertisements
Cheong Tin Yuen
Index
This book is dedicated to my mother, Janet O'Halloran
Introduction
Kay L. O'Halloran
to the 'social semiotic' of both Sydney in the 1960s and to the international
community of its users today.
The museum is located as the next site for semiotic study in Alfred Pang's
'Making history in From Colony to Nation: a multimodal analysis of a museum
exhibition in Singapore'. Pang discusses how systemic-functional theory is
productive in fashioning an interpretative framework that facilitates a multi-
modal analysis of a museum exhibition. The usefulness of this framework
is exemplified in the critical analyses of particular displays in From Colony to
Nation, an exhibition at the Singapore History Museum (SHM) that displays
Singapore's political constitutional history. From this analysis, Pang explains
how the museum as a discursive site powerfully constitutes and maintains
particular social structures through the primary composite medium of an
exhibition. Of interest is the relationship between the museum, nation and
history and how the multimodal representation of history in From Colony to
Nation ideologically positions the visitor to a particular style of imagining a
'nation' (Anderson, 1991).
Safeyaton Alias investigates the semiotic makeup of the city in 'A semiotic
study of Singapore's Orchard Road and Marriott Hotel'. Like a written text,
the city stores information and 'presents particular transformations and
embeddings of a culture's knowledge of itself and of the world' (Preziosi,
1984: 50-51). In this paper, a rank-scale framework for the functions and
systems in the three-dimensional multi-semiotic city is proposed. The focus in
this paper, however, is the analysis of the built forms of Orchard Road and
the Marriott Hotel. Safeyaton discusses how these built forms transmit mes-
sages which are articulated through choices in a range of metafunctionally
based systems. This paper discusses the intertextuality and the discourses that
construct Singapore as a city that survives on consumerism and capitalism.
In Part II on electronic media and film, Anthony Baldry's opening paper,
'Phase and transition, type and instance: patterns in media texts as seen
through a multimodal concordancer', explores the use of computer tech-
nology for capturing 'the slippery eel-like' (to quote Baldry) dynamics of
semiosis. Baldry demonstrates that the online multimodal concordancer, the
Multimodal Corpus Authoring (MCA) system, provides new possibilities for
the analysis and comparison of film and videotexts. This type of concord-
ancing transcends in vitro approaches by preserving the dynamic text, insofar
as this is ever possible, in its original form. The relational properties of the
multimodal concordancer also allow a researcher to embark on a quest for
patterns and types. Taking the crucial semiotic units of phase and transition
as its starting point, Baldry shows that, when examining the semiotic and
structural units that make up a video, a multimodal concordancer far out-
strips multimodal transcription in the quest for typical patterns.
Kay O'Halloran further explores the use of computer technology for
the semiotic analysis of dynamic images in 'Visual semiosis in film'. A sys-
temic-functional model which incorporates the visual imagery and the
soundtrack for the analysis of film is introduced. Inspired by O'Toole's
(1999) representation of systemic choices in paintings in the interactive
Note
Regrettably it has not been possible to reproduce coloured plates in this
publication. However, as will become evident in what follows, the contribu-
tors in this volume recognize that colour is a significant resource for mean-
ing (see also Kress and van Leeuwen, 2002). While the papers have been
somewhat compromised by the black and white reproductions, every possible
effort has been made to ensure that the analysis refers to the original colour
of the texts.
References
Alberts, B., Bray, D., Johnson, A., Lewis, J., Raff, M., Roberts, K. and Walter,
P. (1998) Essential Cell Biology: An Introduction to the Molecular Biology of the Cell.
New York: Garland.
Anderson, B. (1991) Imagined Communities: Reflections on the Origin and Spread of National-
ism (revised edn). London: Verso.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. and Thibault, P. (forthcoming) Multimodal Transcription and Text.
London: Equinox.
Callaghan, J. and McDonald, E. (2002) Expression, content and meaning in language
and music: an integrated semiotic analysis. In P. McKevitt, S. O'Nuallain
and C. Mulvihill (eds), Language, Vision and Music. Selected papers from the 8th Inter-
national Workshop on the Cognitive Science of Natural Language Processing, Galway, Ireland,
1999. Advances in Consciousness Research, Volume 35. Amsterdam: Benjamins,
205-220.
Acknowledgements
The research presented here is only made possible through the foundational
work of Michael Halliday and Michael O'Toole. I am also indebted to Jay
Lemke for originally pointing me in this direction many years ago, and for
his continued support since that time. I also thank Joe Foley, Eija Ventola,
Frances Christie and Anthony Baldry for their friendship, advice and active
support over the years.
My special thanks also to Michael O'Toole for his invaluable reading of
the first draft of the manuscript. His comments, corrections and suggestions
have contributed to the final form of this volume, although of course any
errors of interpretation are mine. I am also most grateful to Guo Libo for his
careful proof-reading and corrections to the manuscript.
My sincere thanks to my talented group of postgraduate research stu-
dents for their enthusiasm, dedication and commitment to push the bound-
aries of multimodal analysis. This volume would not be possible without
their contributions. And special thanks to my past and present colleagues in
the Department of English Language and Literature at the National Uni-
versity of Singapore (NUS), especially Linda Thompson, Chris Stroud, Ed
McDonald and Desmond Allison for their continued friendship and
support.
I would also like to thank Anne Pakir and the Faculty Research Commit-
tee (FRC) in the Faculty of Arts and Social Sciences at NUS for providing
the research grant (R-103-000-014-107/112) in 2000 to establish the Labora-
tory for Research in Semiotics (LRS) in the Department of English Language
and Literature. The research grant has directly supported the research
presented in this publication.
Part I
Three-dimensional material objects in space
1 Opera Ludentes: the Sydney Opera House at work
and play
Michael O'Toole
Murdoch University, Western Australia
Here the trick was to get people up. When you go up the steps you see no
buildings. You see the sky and you get separated from being between houses. I
like procession very much: sky — foyer — windows — sea. It takes you to another
world. That's what you want for an audience: to separate themselves from their
daily life.
(Jørn Utzon, 1998)1
Clearly, for the architect of Sydney Opera House (Plate 1.1) 'Interpersonal'
meanings are very important: the building's height and orientation to its
visitors; the play of vistas as one approaches the entrance; the stress on
architecture as theatre; constructing an audience; a working building at play.
In a systemic-functional semiotic model of architecture2 (O'Toole, 1994;
Table 1.1) these kinds of meaning are analogous to the Interpersonal
semantic functions in language: Mood constructing the roles to be played in
a verbal interaction; Modality constructing a hinge between the real and the
hypothetical; Attitudinal Modifiers and Intensifiers expressing the speaker's
position and influencing the response of the hearer.
If you look out here [at Utzon's home in Helebek, Denmark], you see a field
with flowers and a small bush and small trees and big trees. They all consist of
small elements. And if you take them up and put them on the table it's a
number of elements. Together they make this. In architecture you have a floor,
your walls, you have windows, doors, and you have a lot of materials. And you
select them. You must have in mind that they make a whole or an expression of
some kind.
(Jørn Utzon, 1998)3
It's a curious fact that in all the drama of constructing the building, not much
detailed thought had gone into its specific uses. The competition entrants had
been asked to provide large and small halls, the larger to accommodate orchestral
concerts and opera as its chief forms of entertainment. At this point, seven years
after construction began, the Australian Broadcasting Commission decided that a
multi-purpose venue wouldn't be good enough as the permanent home of the
Sydney Symphony Orchestra.
A great deal of the political controversy surrounding the design and con-
struction of the Opera House focused on the 'Experiential' use-functions of
the building and the competing claims of its corporate users. The brief for
any commissioned architect or entrant to an architectural competition
necessarily starts from the uses proposed for the building.
Like a clause in language, a building incorporates Types of Process and
their Participants; its specific functions are Modified in terms of material,
size, colour and texture; and its component elements are organized taxo-
nomically like lexical items in the vocabulary of our language.
We clearly need to take account of the Experiential function of archi-
tecture. Otherwise, our roof will leak, our rooms will be full of draughts, our
cupboards and desk will face the wall, and we will find ourselves cooking or
worshipping or taking baths in the bedroom. But the obsession with 'Functionalism'
in architecture by both its modernist proponents and its Postmodernist
critics has taken it for granted either that the Experiential function
is the only function and that the design and evaluation of a building stands
or falls by this criterion alone, or that the form of the building primarily
expresses its practical use, which confuses functions, or modes of meaning,
which should be kept distinct. A systemic-functional approach corrects such
blinkered approaches by proposing that there are three functions creating
meaning in all buildings: an Experiential, an Interpersonal and a Textual
function, and that these are all equally valid and equally necessary for a
building to be meaningful and socially usable.
Jørn Utzon was probably naive in the early phase of designing and constructing
the Opera House in that his revolutionary designs foregrounded
the public image (Interpersonal) and sculptural coherence (Textual function)
of the building, leaving many features of its use (Experiential) insufficiently
resolved. Given the political partisanship, the conflicting client requirements
and the media hype surrounding his design from the outset, this bias is
understandable, but it meant that his successors had to focus in the first
instance on the Experiential function:
When Utzon resigned in 1966, the construction of the roof and its tile cladding
was well under way. But plans for Stage III were scarcely defined, and they
involved the elements which would turn the building from a magnificent sculp-
ture to a working centre for the performing arts: the walls that would enclose the
roof area, the performing venues within it, the stage equipment and the furnish-
ing of foyer, backstage and administrative areas throughout.
The newly appointed triumvirate of architects (Peter Hall, Lionel Todd and
David Littlemore) declared their intention to complete the building as closely as
possible to Utzon's intentions. But in the drawings that Utzon left behind, there
were no precise dimensions worked out for what would be more than a thousand
rooms within the structure. [. . .]
The key to finalising the internal designs was to establish what their users
wanted. Incredibly, in the Alice in Wonderland development of the construction,
there had been no formal compilation of user organisations' expectations
in terms of performance characteristics and capacities, dressing room and
rehearsal area backup, box office, administration, air conditioning and catering
requirements.
(Sykes, 1993: 61-62)
I asked him in his office, 'Why do you want to cover a building like that with tiles?
A curved surface, it could be sprayed.' And he looked surprised and said, 'But
tiles are the best.' And he'd looked all over the world at them, and he'd seen them
in the Middle East and elsewhere, mosques covered in gleaming tiles. And he'd
been to Japan and China, and he was very concerned with the quality that made
them up: what material they used, where they got the clay from and what mixes
they used in the clay, till it eventually satisfied him that it gave a slightly rough
surface. And this was the natural colour, the white, and over that surface was a
very clear glaze, a very shiny glaze.5
The material quality and the rough surface, the texture of the built surface
are primarily Interpersonal considerations. Like the shine and the gleam
they are part of the impact the Opera House shells have on spectators. And
the intertextual references to mosques and Oriental architecture, visual simi-
larities which may jog our cultural memory, are Interpersonal issues. The
impact on the spectator is crucial to Utzon. For him his Opera House is
almost more than a sculpture; it has a human personality:
It tells a story, it's not a calm building, it's awake all the time. You cannot make a
sculpture better than something that's white or off-white. If you look at bronze
sculptures in nature, they're difficult to read. If you had put a copper roof on this
house, you wouldn't have benefitted from the light. You would have seen a green
marvellous colour. So this was my first and only idea for the roof. And Saarinen
said to me, 'Keep it white. Sydney harbour is dark.' And at that time the buildings
were dark. So it's the right answer.6
The older Finnish architect is as alert as his Danish colleague to the effect on
the spectator of the chiaroscuro of a white building against a dark ground
and the quality of light in a city on the water like Sydney, Helsinki or
Copenhagen.
Of course, the Interpersonal function at the rank of Element is not con-
fined to the roof tiles. The concrete ribs of the shells have a primary
Experiential function of binding and supporting the roof, but as soon as one
steps inside, one becomes aware of the contrast between the raw, matt and
unpatterned grey concrete of the ribs and the warm brown satin grain of
internal balustrades and doors. In terms of its textures, the building (apart
from the tiled surfaces) seems to start as rough, raw, grey and abrasive in its
outer layers and become progressively more smooth, polished, colourful and
comforting as we move to the core of the personal artistic experience in our
seat in any of the auditoria: it speaks to us Interpersonally through its shine,
colours, textures, the very warmth or coldness of the materials used. This
play of material qualities has even more impact on the spectator at those
points outside the building where the shells meet the metal struts and sheets
of glass of the windows in an exciting geometry of tiles, raw concrete, metal
and glass (Plate 1.2). As we shall see, this involves an important interplay of
the Interpersonal and Textual functions.
Interpersonal relevance is obviously a key criterion inside a theatrical
building. Audience seats and lighting and sound booths face performers'
spaces; conductors' rostra face orchestras; prompt boxes face actors; bartenders
face customers across bars, counters and tables (as Erving Goffman
showed in the 1950s7 - and Fawlty Towers hyperbolized in the 1970s - restaurants
and hotels are highly dramatistic spaces). The public relations
mechanisms of display boards, information desks, ticket offices, media
interview spaces and Opera House guide routes all have their structure as
mini-theatres. And where 'projection' in the home may be confined to one
or two TV sets, in theatres it covers the gamut of possibilities from staging,
rostra, lighting, sound projection, security video and telephones (fixed and
mobile) and even the projection of performances to overflow audiences on
closed-circuit television. In all these aspects of a theatre or concert hall you
might say that the Interpersonal is Experiential - but we will argue that there
is still real heuristic value in keeping them separate.
Less obviously 'theatrical' choices at Room rank are involved in the Inter-
personal systems of Comfort, Modernity, Opulence and Style. Patrons of
concerts and operas are enveloped in a cocoon of almost perfect acoustics
and seated on luxuriously upholstered seats (Plate 1.3). These seats in
moulded birch ply and contrasting scarlet upholstery carry a message of
Scandinavian 'functionalism' of the 1960s and 1970s: like so much of the
architecture here, they put their working functions on display. The steel
entrant (systems in the top central box of the Chart): the viewer is induced to
look up, beyond the steps, beyond the shells to the sky and to imagine
themselves into another world of the imagination, even before the official
performance starts. Chthonicity is a particularly interesting system in this
case, because the Opera House deliberately plays with conflicting options:
on the one hand, there seem to be no solid walls embedded in the base. The
shells rear up skywards (anti-chthonically, away from the earth), to such a
degree that their corners hardly seem to touch their footings, seeming to
balance on pinpoints. The smooth spherical curves induce a touch of ver-
tigo and it is no wonder that so many of the photographs of the Opera
House, whether by official agencies or casual tourists, accentuate the
upward thrust of the shells. On the other hand, the podium is highly
chthonic: it has turned Bennelong Point into a rock-like headland and, as we
know, incorporates many of the key functions of the working building. The
light, dynamic, mobile and poetic structures above are embedded in the
solid and prosaic podium.
A building's orientation to its neighbours and the road by which it is
approached are important aspects of its Interpersonal function. Utzon and
Saarinen were keen for the white curves of the sails to stand out against the
predominantly dark water of the harbour and the high-rise buildings of
Sydney's rigidly rectangular central business district at that time. (Since
1973 more of the neighbouring buildings have been constructed in lighter
concrete, marble or glass - perhaps in deference to Utzon's building as well
as in harmony with changing architectural fashions.) The multiple curves,
however, offer visual echoes of Sydney Harbour Bridge (Plate 1.4), Circular
Quay and the bays and headlands of the harbour. Of course, good archi-
tectural as well as human relations can be spoiled when bad neighbours
move in. The Opera House's visual relationship with Circular Quay has
been obstructed and, more importantly, the easy natural pedestrian route
from the ferry terminals to the entrance steps has been interrupted by the
rectangular complex of shops and apartments erected in 1997-8,
unpopularly known as 'the East Circular Quay toaster'.
The final heading in the Interpersonal box at the rank of Building on the
chart is 'Intertextuality'. This was a term coined by Mikhail Bakhtin, the
Russian literary theorist and philosopher, to account for the deliberate refer-
ences, allusions or echoes that a writer makes to other widely known texts.
As with language texts, this would seem to carry primarily an Interpersonal
function in architecture: the writer/architect is saying to the viewer 'Nudge-
nudge . . . look at my clever reference here to Stonehenge, or Palladian
villas, or St Peter's in Rome, or the Pompidou Centre in Paris . . . It is up to
you to enrich the meaning further here by your knowledge of that building,
its uses, its tradition, its local cultural significance, etc'. And to some extent
we as viewers interpret the allusion according to our range of references and
our cultural preoccupations at the time. Virtually everyone seeing the Opera
House sees the visual metaphor of sails; many see sharks' jaws or clam
shells; Barry Humphries saw a drowning nun. Utzon claims that the curves
of the sails were inspired by the segments of an orange; the relation between
the outer shell and the inner roof of the auditoria - by the snug fit between
shell and kernel in a walnut; the structural relations between the construction
units and the whole building - variously by the leaves on a palm tree or
by Meccano toy construction sets. But in terms of other built texts, we have
Utzon's word for it that he had in mind a relationship between water and
built forms at Kronborg Castle, Helsingør; the soaring vaults of Gothic
cathedrals; and the shining segmentation of tiles on a mosque.
Plate 1.4 Visual echoes
The tiles bring us at last to the Textual function (which does not have to be
the last function examined: the three functions are all equally meaningful
and may be considered in any order). At the lowest rank of Element the
finish of the tiles and the chevron patterning create the surface texture of
the Opera House shells. This is texture as such - Textual meaning - as
opposed to their practical (Experiential) function of keeping out the rain and
their decorative or dramatic (Interpersonal) functions.
At the rank of Room, each auditorium, or foyer, or office, or restaurant
has its own scale and proportions, it is lit or in shadow, and has its own
acoustic properties in contrast to other spaces around it. Its relation to
outside carries Textual meaning, so that our response to the isolated and
insulated worlds of the concert hall, opera theatre, drama theatre or cinema
is quite different from how we feel in the foyers, where our gaze is delib-
erately projected out to the harbour and city views - where we are no longer
fully enclosed in the built text. At this rank we experience a Textual focus as
well as the power relation (Interpersonal) between the rostrum and the
orchestra and the audience. This is facilitated by aisles and stairways within
the auditoria, and all such 'connectors' as corridors, stairs, lifts, escalators,
hatches and interconnecting windows throughout the building are primarily
Textual in function: they work like the cohesive devices of conjunction in
language.
Like cohesive devices in language, these connectors work across several
ranks, since they also work to relate floors and the various auditoria and
other internal spaces to each other. Doors and windows, of course, relate the
internal spaces to exterior parts of the built text: walkways, entrance steps
and terraces, and thence to the Broadwalk and approach road.
The most striking Textual systems of the Opera House at the rank of
Building are listed in the top right-hand box of the Chart. We will consider
them from the bottom up - as if we were moving from near the building to
vantage points further away. Opacity/Reflectivity/Transparency is a system
of options that tends to have meaning when we are near a building. The
shells of the Opera House are opaque, but, being shiny and white or off-
white, reflect the light, whereas the podium is opaque and comparatively
matt, giving a denser, less light-responsive texture. The windows, of demi-
topaz coloured laminated glass, are highly transparent for the viewer from
inside and for those outside when the interior is lit - after dark, when most
of the building's theatrical functions are at play. Unlike most glass facing
water and sky, they do not reflect much of their environment, except from
to solve the design problems, at first working through trial proposals and then
tackling tricky situations as they arose under construction [. . .]
Linking the curves of the sails to the rectangular lines of the podium required a
concept that combined the aesthetic with the pragmatic. Without a mathematical
relationship between the shape of the shell and that of the podium to use as a
starting point for a geometrical solution to devising the structure of the two
largest glass walls overlooking the Harbour, a new design element had to be
introduced. The result was a combination of three surface planes: vertical at the
top, coming down to a half-circle leaning outwards from the vertical, then pulling
back in a cone shape [Plate 1.5].
Notes
1 Jørn Utzon in an interview for the film The Edge of the Possible: Jørn Utzon and the
Sydney Opera House, director: Daniel Dellora, ABC Television, 20.10.98.
2 Michael O'Toole, The Language of Displayed Art (1994), Chap. 3 'A Semiotics of
Architecture', pp. 85-144.
3 Jørn Utzon in an interview for the film The Edge of the Possible.
4 My versions of Halliday's model for the systemic-functional analysis of painting
and sculpture (O'Toole, 1994) use the term 'Compositional function' for this kind
of meaning in those arts which are primarily for display. In the case of archi-
tecture, which, like language, is of practical use as well as display, it seems
appropriate to retain Halliday's notion of the 'Textual function'.
5 Harry Seidler in an interview for the film The Edge of the Possible.
6 Jørn Utzon in an interview for the film The Edge of the Possible.
7 Erving Goffman, The Presentation of Self in Everyday Life (1965).
References
Dellora, D. (1998) The Edge of the Possible: Jørn Utzon and the Sydney Opera House. ABC
Television, 20.10.98.
Goffman, E. (1965) The Presentation of Self in Everyday Life. London: Penguin Books.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Sykes, J. (1993) Sydney Opera House from the Outside In. Sydney: Playbill Proprietary
Ltd/Sydney Opera House Trust.
2 Making history in From Colony to Nation: a multimodal
analysis of a museum exhibition in Singapore
Introduction
This paper explores how systemic-functional (SF) theory may be extended
to a social semiotic analysis of the museum exhibition as a multimodal site.
The museum exhibition is obviously multimodal in that different semiotic
resources, such as photographs, three-dimensional physical objects, space
and language, are co-deployed in complex ways to construct meaning. I
sketch here a preliminary SF framework for the multimodal analysis of a
museum exhibition and exemplify its usefulness in articulating the critical
construction of historical meaning by particular displays in From Colony to
Nation, an exhibition at the Singapore History Museum (SHM) that repre-
sents the national history of Singapore. By critical, I mean understanding
how the communicative complexity of the exhibition connects with the
discursive institution of the museum as 'a dynamic power-play of compet-
ing knowledges, intentions and interests' (Macdonald, 1998: 3). In particu-
lar, I reflect on how the making of Singapore's national history in From
Colony to Nation serves to (re)produce particular dominant imaginings of
Singapore as a 'nation'. The general point here is that making history is
never value-free; it is, rather, imbued with power-knowledge relations1
invested in the site of historical production.
artifacts do not exist in a space of their own, transmitting meaning to the specta-
tor, but on the contrary, are susceptible to a multiform construction of meaning
which is dependent on the design, the context of other objects, the visual and
historical representation, the whole environment.
emotive awareness initiates the dialectical process through which the self and its
world 'make' each other so that the former may begin to 'mean' and 'do' — both
cognize and act upon the latter.
THREE-DIMENSIONAL MATERIAL OBJECTS IN SPACE 33
The total experience (in living history or interactive exhibits), the total immersion
(in gallery workshops and events), can have the function, in the apparently dem-
ocratized environment of the museum marketplace, of soothing, of silencing, of
quieting questions, of closing minds.
In other words, the current popular paradigm that pushes for the democra-
tization of the museum does not equal the dissolution of power. Instead, it
indexes the powerful capacity of the museum in strategically negotiating its
institutional authority to position the subjectivities of its audience in particu-
lar ways. This ideological motivation of meanings construed and constru-
able in an exhibition is taken seriously in an SF framework that emphasizes
a dialectical relationship between social context and semiotic system(s).
[Table fragment; column layout not recoverable. Entries in reading order:]
Surface/Item: Topics (Sub-topics); Interactivity; Gaze and other sensory modes of attention; Display Style; Classification; Arrangement
Relationship Map: Intra-relationship of elements in an item; Inter-relationship of elements across items
Interpretive Path; Directional Path; Focus (CVI)
Interplay of modal and compositional elements (e.g. Colour, Light, Shape, Size, Lines)
Visual Salience
Balance: Flank/Spiral; Alignment
Image-Word-Object: Extra-Vocalization; Semiotic Metaphor; Objectification; Metonymy
Perspective; Viewing height
Information Composition; Relative Prominence of Surface/Item
Visual semiotic: O'Toole (1994)/Kress and van Leeuwen (1996)
Linguistic semiotic: Halliday (1994)
Deputy Prime Minister Lee Hsien Loong reiterated this position at the
formal launch of NE on 19 May 1997, saying that it is 'a concerted effort to
imbue the right values and instincts in the psyche of our young' through
teaching 'the Singapore Story - how Singapore succeeded against all odds
to become a nation'. Thus, From Colony to Nation, which is also referred to as
'The story of Singapore' in the exhibition guide (see Plate 2.1),⁵ has a strong
pedagogic purpose that is tightly circumscribed by the ideals of NE, namely
to underscore the constraints and vulnerabilities of Singapore. I discuss now
how the intent of NE motivates a selective remembering of Singapore's
recent political past, with particular focus on an Area - the 'Communist
United Front' — that displays the Communist movement in Singapore after
the Japanese Occupation.
It is worthwhile first to contextualize this Area concerning the Communist
movement in terms of the Narrative Design at the rank of Gallery. Typically
referred to as the 'storyline' among exhibition makers, the Narrative Design
is abstracted as that overall thematic content of an exhibition that binds the
particular selection and arrangement of multiple semiotic systems. As Vergo
(1989: 46) puts it:
in the case of most exhibitions at least, objects are brought together not simply for
the sake of their physical manifestation or juxtaposition, but because they are
part of a story one is trying to tell . . . Through being incorporated into an
exhibition, they [objects] become not merely works of art or tokens of a certain
culture or society, but elements of a narrative, forming part of a thread of
discourse which is itself one element in a more complex web of meanings.
The Narrative Design is, then, an 'interpretative strategy' (Dean, 1994: 103),
within which the subject matter of an exhibition is formulated at several
levels of complexity. An aspect of this complexity lies in the Interplay of
Genres, which is worked through the social experience of a museum visit.
An instance of this would be the experience of picking up and glancing
through an exhibition/gallery guide before viewing the actual three-
dimensional display. In From Colony to Nation, where no main introductory
panel is installed, the exhibition guide plays a marked role in providing
visitors with an overview of the content of the display. More significantly,
the exhibition guide, in orientating the visitor to '[t]ake a walk through
history and understand why Singapore must prize her independence above
all else', inflects the historical recount displayed as an exemplum. An
exemplum, according to Martin (2000b: 8), 'relate[s] a sequence of events in
order to make a moral point'. The moral point here is the obligation for
Singaporeans to value positively and not take for granted the country's
independence.
Plate 2.1 Exhibition guide to From Colony to Nation (layout plan)
the Communists. Any act of violence which might have been committed by
the police then is from the start tolerated and legitimized as control.
Figure 2.1 System of Circulation Path (adapted from Royal Ontario Museum
1999)
Internal time is deployed to build up an explanation about the past and this
is linguistically construed in the text panel via logical links of Cause. Now, the
spatial semiotic also affords the capacity to realize external and internal
time, but perhaps in ways less differentiated than language.
The three-dimensional spatialization of external time can be seen to
involve parallel semiotic metaphor. The events dynamically recounted along
a chronological timeline of marked Circumstances in the linguistic text are
physically bounded in a more or less rectangular enclosure with exhibits
displayed along the two longer walls (see Plate 2.2). The left wall consists of
items that relate to the May 13th Incident in 1954, while the right wall
exhibits items associated with the Hock Lee Bus Riots in 1955. There
appears to be a shift from the linguistic construal of time as Circumstance to
its spatial experience in the exhibition as a physical material Thing. It is this
semantic shift that enables the further compression of these events into a
period, negatively appraised in its sub-thematic classification as 'Colony in
Chaos'.
The semantic shift is parallel in the sense that no new functional entities
are introduced in this reconstrual although there is an overlay of meaning
enabled by the system of Circulation Path in the spatial semiotic. The sight
of space simultaneously invites its traversal. The continuous material process
of 'organized walking' (Bennett, 1995: 6) now topologically enacts the
dynamic unfolding of time (external and internal) in space. The system thus
activated is that of the Circulation Path. From the perspective of Traffic Flow,
this display on the Communists is situated relatively early in an Arterial
pattern (see Plate 2.1) from left to right. This left-right directional flow is
explicitly insisted upon by the instruction on the Exit Door: 'Please enter
exhibition via door on the left'. Interpersonally, the Arterial pattern pro-
motes a didactic stance in that the visitor is given little choice in choosing
her/his pathway through an exhibition. This textures the importance of this
display since a visitor is made to walk through it anyhow.
Now, I focus on the Flow Rate, which is affected by the arrangement of
walls. In this Area, the two longer walls run parallel to each other and are
conjoined by a straight path through. Movement through this pathway
enacts a conjunctive relation in the Interplay of Walls. This conjunction is
not merely an additive of two external timeframes (referenced as 1954 and
1955 from the linguistic text panel), but also expresses their internal relation
as examples of Communist-instigated violence. More significantly, this spatial
design, by its relatively low Degree of Partition, affects a Flow Rate that
tends not to be crowd stopping.
Furthermore, following Arnheim (1982: 61), the two longer walls of the
rectangular enclosure tend to emphasize an axial symmetry, which propels
the Ideal Visitor to move forward and ahead of the Area, towards the portrait
painting of Lee Kuan Yew being sworn in as Prime Minister in 1959. This
coloured oil painting, enshrined in a gold frame, stands out in contrast to the
black walls and the black-and-white photographs used. According to Bal
(1999: 176), the portrait is
[a] genre that bestows authority upon its subject. Its history is bound up with that
of capitalism, individualism, bourgeois culture . . . portraits are made to honor
power.
Thus, apart from visual contrast in the display design, the intertextual allu-
sion to such generic conventions about the portrait marks the painting as a
focal point, which indexes the starting-point within the Narrative Design of
how the elected PAP Government (represented metonymically and authori-
tatively by the figure of Lee) would overcome all odds to build Singapore
into what it is today. From the perspective of Flow Rate, then, the relative
prominence of the Topic 'Communist United Front' is downplayed. It is not
that the Topic has become less important or significant. Rather, what seems
to be enacted by the continuous Flow Rate is perhaps a channelling of that
significance to an appropriateness of distancing oneself from Communism
towards the promise of social prosperity that the PAP Government has
come to stand for. This gesture of distancing is furthermore directed to
reinforce the negative desirability of Communist activism in general.
Photographic images
I examine the collection of thirteen photographs placed immediately after
the text panel along the left Wall (see Plate 2.3). What probably arrests a
visitor's attention to this collection of photographs is the wired fence. The
significance of this wired fence, other than its role as a focal point that draws
a visitor's Gaze to the photographs, is discussed later in this section. For now,
I concentrate my analysis on some of the photographic images displayed.
For the specific analysis of the meanings constructed in each photograph, I
apply eclectically the SF interpretative frameworks formulated in O'Toole
(1994) and Kress and van Leeuwen (1996). The analyst's situation is, how-
ever, further complicated in the medium of a museum exhibition, where
how any single photographic image can mean is as much mediated by its
dissemination alongside other photographs through display practices, two
of which are discussed here: museum labelling and setting.
In relation to the exposition set out by the linguistic text panel, these
photographs serve as artefactual evidence that testifies to the 'truth' of the May
13th Incident recounted in clauses 12-20. Following O'Toole (1994), the
Representational content expressed (at the rank of Work) in the thirteen
photographs consists of Scenes of police control and arrest, crowd dispersion
and injury, all of which illustrate the non-productive consequences of the
May 13th Incident. In addition, photographs in black-and-white and par-
ticularly sepia not only evoke a sense of the past, but also hark back to the
traditional genre of documentary. As Price (2000: 75) writes of documentary
photography, one implicit claim that underlies its historical development is
that 'it offers us a disinterested and true picture of the world'. It is precisely
this naturalistic coding orientation (Kress and van Leeuwen, 1996; Thibault,
2000) that underpins the evidential value of each photographic image.
Plate 2.3 Display of photographs on the May 13th Incident (left Wall)
There is, in other words, the social assumption that photography real-
istically captures 'an immediate and transparent identity between image and
referent' (Phillips, 1998: 155). However, as Ryan (1993, cited in Price, 2000:
69) argues:
Despite claims for its accuracy and trustworthiness, however, photography did
not so much record the real as signify and construct it.
[t]he photographer turns his or her camera on a world of objects already con-
structed as a world of uses, values and meanings, though in the perceptual pro-
cess these may not appear as such but only as qualities discerned in a 'natural'
recognition of 'what is there'.
not only objectifies the negative appraisal of the Communists and their
activities, but also layers it with the delicate complexity of race.
What probably arrests a visitor's attention to this collection of photographs
is the hapticity⁷ of the wired fence. The wired fence is used here as
an object prop to 'fabricate' the Setting of a prison. It is within this Setting of
imprisonment that the photographs come to be interpreted as ideational
tokens of negative Judgement on riotous behaviour. The visitor walking
through the floor of this area is simultaneously locked in and out from the
Scenes captured in the photographs. This physical barrier serves as a 'safety
net' that 'protects' the visitor from acts of violence. In preventing the visitor
from having any direct tactile contact with the photographs, the wired fence
enacts a form of metaphorical distancing from a riotous past. Even one's
visual interactivity with these images is 'intervened' by the criss-cross of
wire, as if dictating that these riots in the past should not be allowed to
repeat themselves in present time. What may be implied in this construction
is the importance in preserving the 'safety net' that the PAP Government
has thus far spun for the peaceful progress of Singapore as a nation-state.
The wired fence thus amplifies the scale of the undesirability of the
Communist movement. In addition, the perceived risk of physical pain
evoked by the barbed wiring at the top disciplines the visitor into accepting
police control as a necessary and legitimate deterrent against Communism
lest Singapore becomes a totalitarian state. For some, there may seem to be a
dash of irony here since police surveillance is as instrumental in enforcing a
sense of totalitarianism. Yet, any force wielded by police power remains
hidden and naturalized behind a legalistic frame of social order presently
articulated to criminalize the Communist movement during the 1950s.
Ideological motivation
The exhibition, which displays a dominant 'progressivist national narrative'
that stages 'a transition from a colonial society to a modern capitalist one'
(Wee, 1999: 169, 172, emphasis original), suppresses any formative role the
Communists played in the 'nation-ising' of Singapore. The collective multi-
modal definition of the Communists as a dangerous riotous Other is filtered
through the dominant lens of communitarian ideology (Chua, 1995) pres-
ently held by the PAP Government. Communitarian ideology is most
recently articulated and instituted in the Government's 1991 White Paper
on Shared Values.⁸
Two of these Shared Values are transmitted through this display. First, the
non-legitimate place of revolutionary violence emphasizes PAP's order of
politics, which is one founded on constitutional consensus rather than con-
flict; this echoes the Shared Value Consensus instead of contention. Second, down-
playing the racial script in this display also aligns the exhibition with the
Shared Value of Racial harmony. As Wee (1999: 170) writes of the delicate
racial communal tension that underlay the mobilization of Communism in
Singapore's 'stage of nationalist politics':
A national trauma involves sufficient damage to the social system that discourse
throughout the nation is directed toward the repair work that needs to be done.
This 'repair work' to recover from the trauma of Communist and com-
munal violence allows the reinstatement of the communitarian ideology
espoused by the ruling PAP Government. Herein lies the second evaluative
level, where through the Narrative Design, the national 'self' of Singapore is
positioned as vulnerable; this vulnerability includes especially the delicate
problem of difference posed by race. In this light, communitarianism is
posed as a form of social discipline cultivated to prevent a relapse into a
traumatic past. This social discipline is hardly resisted primarily because of
its pragmatic effectiveness in sustaining Singapore's material progress. The
body politic of Singapore thus risks trauma if there should be a lapse from
this progress. Underscored in all these is also the discursive positioning of
the museum (SHM) as a State apparatus that plays a political role in repro-
ducing PAP's ideals of a Singapore citizenry. Such politicization resides
precisely in their capacity to structure knowledge.
Finally, on the third level, evaluation of the Narrative Design engages the
researcher's subjectivity in her/his analysis. That is, the interpretative analy-
sis I present in this paper is as evaluative, positioning you to view the
exhibition in a particular light. The interpretative stance I adopt towards the
analysis undertaken here aims to trace how From Colony to Nation naturalizes
dominant conceptions of social 'reality'. It is necessary, though, to add the
qualification that the point here is not to denounce the credibility of the past
represented in the exhibition. Indeed, the emphasis on history as an ideo-
logical (re)construction throughout this paper does not mean that those
past events recounted did not happen. Nor should it be easily conflated with
a claim of historical falsity. In fact, if one takes the social constructivist view
of history seriously, notions of 'truth' and 'falsity' appear to be in flux since
the crux of the matter now is how any single interpretation of the past
becomes (de)legitimized, by whom and for what purposes. Further, it is the act
of evaluating that is directive of one's sensibilities to the past. Herein lies the
disciplining act of history, whose representation in the museum is a form of
directed remembering. The flipside of this selective remembering is, of
course, a disciplined forgetting motivated by the ideologies of the dominant
in society.
Museums are then strategically placed in history making. The SF
framework formulated here endeavours to be useful as some form of
'meta-language' that enables visitors to 'talk' systematically about how the
exhibition as a primary composite medium construes ideology. Yet, not just
'talking' about, but also potentially 'talking' back to particular unequal repre-
sentations displayed in exhibitions. In the final analysis, the museum repre-
sents a heterogeneous zone that differentially engages multiple social players
in negotiating (or mutually disciplining) the discursive forces of social
change. It is perhaps for this reason that the museum continues to stand as a
site worth (re)visiting.
Notes
1 Foucault (1977, 1980) conceptualizes the mutual constitution of power and
knowledge in social practices. As Foucault (1977: 194) has argued in Discipline and
Punish: 'In fact, power . . . produces domains of objects and rituals of truth'.
2 For a more detailed consideration of the theoretical basis for extending SF
theory into the domain of multimodality, see Pang (2001: 38-54).
3 Harris (2000) first conceived the term integrational semiology to understand the
multimodal character of writing. Under integrational semiology, Harris (2000:
69, emphasis original) explains that 'signs . . . are not invariants: their semiological
value depends on the circumstances and activities in which, in any particular
instance, they fulfil an integrational function'. Though insightful, Harris remains
vague on the what and how of this integrational function. This paper suggests
that: (1) the metafunctional hypothesis and (2) the realizational dialectic between text and
social context in SF theory help elucidate more concretely the shape of this inte-
grational semiology.
4 Results of the survey are also reported in The Straits Times, 16 September 1996.
For a sample of some of the questions asked in this survey, see The Straits Times,
15 September 1996.
5 I refer to the guide here, not for an exhaustive multimodal analysis of it, but to
distil the exhibition's classificatory scheme (see Table 2.2).
6 According to Fairclough (2001): 'An (inter)action may involve a "chain" of
different, interconnected texts which manifest a chain of different genres'.
7 Following O'Toole (1994: 35), hapticity refers to that three-dimensional quality
in sculpture which 'engages our whole body in an identification with [its] mass
and rhythms'.
8 For a detailed discussion on the promulgation of Shared Values as a National
Ideology, see Hill and Lian (1995: 210-219). There are principally five
components in this National Ideology: (1) nation before community and society above
self; (2) family as the basic unit of society; (3) regard and community support for
the individual; (4) consensus instead of contention; and (5) racial and religious
harmony.
Acknowledgements
Plates 2.1, 2.2 and 2.3 are reproduced by courtesy of the Singapore History
Museum, National Heritage Board, Singapore.
References
Antze, P. and Lambek, M. (eds) (1996) Tense Past: Cultural Essays in Trauma and
Memory. London: Routledge.
Arnheim, R. (1982) The Power of the Center: A Study of Composition in the Visual Arts.
Berkeley: University of California Press.
Bal, M. (1999) Memories in the museum: preposterous histories for today. In M. Bal,
J. Crewe and L. Spitzer (eds), Acts of Memory: Cultural Recall in the Present. London:
University Press of New England, 171-190.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Safeyaton Alias
National University of Singapore
Introduction
Cities are more than a place to live, to work or to play in. As people observe
the city while they move through it (Lynch, 1996), the city serves as a
political and social statement, and in some cases, symbolizes and
encompasses the achievement and political prowess of the country's ruling
elite. This is especially true in the case of Singapore where the city becomes
a showcase of what has been politically and economically achieved by the
People's Action Party (PAP) over the years since independence in 1965.
Within a span of thirty-five years, for instance, the country has achieved one
of the highest living standards in Asia, which has led some economists to
proclaim it a modern miracle. Lacking in natural resources and having to
rely on its human resources, it was suggested that for Singapore 'the capital-
ist road was [perhaps] the only one open' (Chua, 1995: 59). The number of
buildings and shops in Orchard Road stands as testimony to the realization
of Raffles's vision of a 'bustling emporium' (Jayapal, 1992: 67). A city is
therefore 'man's single most impressive and visible achievement' (Pike,
1996: 243) while remaining nonetheless a 'social institution' (Mumford,
1996: 184).
A city or a 'built world, like a written text, stores information' and 'pres-
ents particular transformations and embeddings of a culture's knowledge of
itself and of the world' (Preziosi, 1984: 50-51). The built world is an exhibit
of the culture of a given society, which in some ways reflects the ideologies
that operate within that society. Buildings, for example, 'are not just func-
tional machines; they have signs of their practical functions written all over
them: they signify their function as use' (O'Toole, 1994: 85); that is, 'buildings
are designed to mean something' (Stern, 1994: 47). Architecture is part of a
society's culture which affirms and re-establishes its values and ideals; it is
the representation of power (Betsky, 1994; Stern, 1994) and, whether posi-
tive or negative, the city or the built world is the image of the community
(Pike, 1996).
This paper therefore sets out to investigate the nature and manifestation
of the prevailing ideologies within the society of Singapore. To achieve this
purpose Singapore is treated as a text and indeed, it is a discourse worth
plan was reviewed and renamed the Concept Plan in 1971 by an appointed
local authority. The Central Area Plans came to fruition between 1974 and 1989
only to be renamed the Revised Concept Plan in 1991. In 1998 the latest
revision of the Concept Plan was presented (Dale, 1993; Fong, 1973; Master
Plan Written Statement, 1993; Tan, J. H., 1972; Tan, S., 1999; URA Annual
Report, 1997/98).
Part of the Concept Plan's objectives is to meet what the authorities per-
ceive as 'the new wants and needs of [the] people' (Dale, 1993: 42). This is
accomplished by improving the living environment and by offering, or
rather prescribing, a better quality of life. This includes a policy of
decentralizing commercial activities to avoid overcrowding in any one area,
specifically the Central Area of which Orchard Road is a part (Dale, 1993).
But the ideas and the benefits outlined in the Concept Plan can only be
successfully implemented with a healthy economic growth. This becomes a
platform through which the authorities can justify their actions and
decisions both politically and in terms of the development practices. On the
business front, for example, one aim of the Concept Plan is to provide 12,000
hectares of land for industrial needs (Keung, 1991). In addition, 'judicious'
investments in the leisure industry will be welcomed because such invest-
ments mean 'good business' and will 'enhance [the Singaporeans'] quality
of life' (Liu, 1991: 4). In an effort to add 'life and character' to the streets as
well as making them 'more exciting and lively' (URA Annual Report, 1997/98:
28), the regulations for the setting up of outdoor refreshment areas and
outdoor kiosks along the pedestrian malls in the city were relaxed in July
1996. Previously, these outlets could occupy only 10 per cent of the total
building length but this is now 25 per cent, resulting in more outlets being
set up along pedestrian malls, especially along Orchard Road. These outlets
bring in additional income for the authority in the form of 'payment of
development charges or different premiums' (ibid.). Hence, every metre of
unoccupied space in, around, below and above Singapore has potential for
extra revenue. This provides a boost to the economy with the Singaporeans
themselves helping to sustain that economy; the system and the people
depend on each other.
A visitor to Singapore, however, is likely to have little knowledge of how
the country has been transformed historically although he or she may have
seen where the locals live, how they travel, where they eat, work, play, shop
or seek medical attention. The visitor sees how the country 'operates' but
he or she may not be able to explain how this is possible because, more
likely than not, the visitor is not equipped with the knowledge or the tools
to explain what he or she sees or feels. For the uninitiated, Singapore
'explains' its operations very well because every part of the country, be it a
designated area, its roads, the open spaces or the buildings, transmits
explicit messages. Each of these 'speaks' to or 'addresses' the visitor dir-
ectly. While part of the Singapore city has its specific functions or pur-
poses, linguistically there is also a physical and Textual representation
which transmits messages.
Table 3.1 Functions and systems in Singapore

[Continuation of a preceding row; unit and column headings not shown]
Orientation to buildings
Characterization: MRT stations, bus-stops, street lights, road names, road signs, signboards
Lighting
Openness
Soft/hard texture: concrete, asphalt, dirt track

Unit: Open Space (Rank 2.2)
Experiential: Specific functions: Road dividers, islands; Road shoulders; Pavements/Footpaths; Parking space; Grass verge/Green belt; Open field: recreational, business; Burial grounds; Public space; Private space
Interpersonal: Spaciousness; Openness; Orientation to entrant: accessibility; View; Relevance; Comfort: sheltered/unsheltered walk-ways, shades, benches; Lighting: natural, artificial; Hard/soft textures: concrete, asphalt, grass; Colour
Textual: Relation to bus-stops, taxi stands, roads, MRT stations; Relation to area/theme; Relation to buildings; Relation to safety; Relation to power and prestige; Degree of visibility; Degree of partition; External cohesion: relation to connectors, stairs, overhead bridges, pedestrian crossings, underground passage; Permanence of open space; Permanence of partition
To help 'explain' Singapore, that is, to analyse and interpret the city,
which is three-dimensional and multi-semiotic, a framework featuring a
rank-scale for the functions and systems is proposed (Table 3.1). The multi-
plicity of the framework means that the city can be read 'backwards or
forwards, upwards or downwards, and inside to outside' (Preziosi, 1984: 55).
The framework may be used to analyse from the whole to the smallest unit
in the city. This means that the semiotic analysis of the city of Singapore
begins with the unit Area at Rank 1, followed by the units Roads/MRT at
Rank 2.1 and the unit Open Space at Rank 2.2 (Table 3.1). The analysis of
the smallest unit in a city, that is, Elements contained in a room or on a floor
in a Building at Rank 2.3 in Table 3.2, completes the semiotic analysis.
Alternatively, because the city is three-dimensional and multimodal, it is
possible to perform the analysis from the lowest rank to the highest, that is,
from the unit Element in a Building (Rank 2.3) upwards to the unit Area
(Rank 1). Although beyond the scope of this paper, Singapore could be
conceived as the total sum of these Areas.
As buildings constitute an essential part of a city, O'Toole's (1994: 86)
chart for architecture has been incorporated into Table 3.2. Although the
chart has been amended to suit the Singapore context because 'the existence
of built form is not universal in all cultures' (Preziosi, 1984: 52), the change
is minimal. Elements such as the characterization of a building, that is,
whether it is Occidental or Oriental, for example, or how it is oriented
towards the MRT station, have been incorporated into the framework. As
most buildings in Singapore are designed to be either self-contained (for
example, a hotel) or interdependent (for example, a market), they are treated
as individual episodes that help to contribute to or to complement the design
of the whole area. In other words, there is interaction or 'interplay' between
these Episodes.
ing one, and to its general surroundings or environment. At the same time in
the built world, the 'built forms' (Preziosi, 1984) will directly or indirectly
command the people's involvement and interaction; our senses respond in
specific ways to our natural environment (Kress, 2000). The framework in
Table 3.1 lists the systems which function Interpersonally to engage us with
our environment. In what follows, I analyse and interpret the built forms of
Orchard Road and the Marriott Hotel. These analyses reveal how
specific ideologies are manifested in the city of Singapore.
Units/Functions: Experiential; Interpersonal; Textual

Unit: Building (Rank 2.3)
Experiential: Practical Function: Business, Cultural, Educational, Entertainment, Governmental, Medical, Private/Public, Recreational, Religious, Residential; Orientation to light; Orientation to wind; Orientation to earth; Orientation to service (water, power); Episode: self-contained, interdependent; Interplay of episodes
Interpersonal: Size (relation to area and setting); Orientation to neighbours, adjacent buildings; Orientation to road, MRT tracks and stations; Orientation to entrant; Facade; Modernity; Colour; Cladding; Characterization; Colour; Intertextuality: reference, mimicry, colour; Exoticism
Textual: Proportion (height/breadth/length); Relation to external area; Relation to road/MRT station; Relation to adjacent buildings; Relation to permanence: old, new, preservation, conservation; Rhythms: contrasting shapes, angles, colours; Textures: rough/smooth; Roof/wall relation; Opacity; Reflectivity; Cohesion: interplay of episodes

Unit: Floor
Experiential: Sub-functions: Access, Working, Selling, Administration, Storing, Waking, Sleeping, Parking
Interpersonal: Height; Spaciousness; Accessibility; Openness; View; Hard/soft texture; Colour; Sites of power; Separation of groups
Textual: Relation to other floors; Relation to outer world; Relation to connectors, stairs, lifts, escalators (external cohesion); Relation of landing/corridor/room/foyer/room (internal cohesion); Degree of partition; Permanence of partition
area appears to rely on the concept of 'supply creates demand' (ibid.); that is,
the restricted nature of parking spaces and the open spaces create the
demand for public transport.
The built forms in Orchard Road place great emphasis on the Interpersonal
metafunction. Commercial developments along the pedestrian routes are
encouraged to 'have activity-generating uses on the [ground floor]' (Orchard
Planning Area, 1994: 20) and as a result, shops and restaurants open directly
to the mall, beckoning pedestrians as well as supporting street activities. The
open spaces are planned so that people instinctively walk into the air-
conditioned interiors of the shopping malls to escape the humidity of the
outdoors. The 'progression from street to interior is critical' (Whyte, 1996:
117) and Orchard Road has been planned such that it is hard to tell when
one transition ends and when the other begins. Pedestrians also have visual
access to the products on sale at the ground floor shops, which are encased
behind glass panels. Window displays are usually used to attract the attention
of the female pedestrians and cater first to what are perceived as the primary
needs of women: cosmetics, fine jewellery, clothing as well as their coordin-
ated accessories while condoms at the Lucky Plaza shops are arranged to
resemble a bouquet of flowers. A major part of the business strategy is to
capture the female eye first. Seen textually, sex implicitly becomes the selling
point in Orchard Road in what largely remains a patriarchal society.
The presence of overseas investors in Orchard Road is ubiquitous and
thus there is a reinforcement of the culture of consumerism. For example, at
the time of writing, twenty-five outdoor refreshment outlets (OROs) are
located along Orchard Road. Located on both sides of the road, these
outlets serve coffee and tea and food such as burgers and fries; that is,
foreign imports from the West. It is common to see several outlets promoting
the same items but under different trade names. Patronizing these outlets
has become a way of life. These OROs have built a 'new constituency'
(Whyte, 1996: 111) where people are subconsciously trained to adopt new
habits such as having alfresco lunches. These outlets also act as an avenue
for the people to see and be seen and this has given rise to a new street
culture that is readily embraced. As competition among the various investors
intensifies, 'campaigns' are launched to remind consumers, particularly
the young, that the products exist and are readily accessible and available
to them. Hence, these OROs are located a few hundred metres away
from one another. While these outlets operate textually because they con-
tribute to the thematic 'consistency' of Orchard Road, they have what
O'Toole describes as 'powerful [and serious] Interpersonal implications'
(O'Toole, 1994: 103). The 'repetition of themes' ensures that people would
not miss or forget these products. To invest in the young and impressionable
is therefore to invest in the future of Singapore. Such investments guarantee
the survival of these products and the continuous Western presence. Equally
important, these OROs continue to draw revenues for the authorities.
Ironically though, while these OROs are located at strategic and prime
locations, that is, they are in the Open Space and visible from the road, outlets
serving local Singaporean fare are usually confined within a building, often
at the basement or the back of the building or at a side road and away from
the main road. Although the nature of Asian cooking is highly suitable for
the outdoors, it does not or rather is not allowed to fit into the context
of Orchard Road. Textually, a conscious effort has been made to ensure
Orchard Road projects and reinforces the sophisticatedly developed clean
and green image that has become synonymous with the image of Singapore.
The fact that ideas for these outlets were imported from overseas (URA
Annual Report, 1998/99) and are expected to 'make our streets more exciting
and lively' while 'adding life and character to our streetscape' (URA Annual
Report, 1997/98: 28) suggests the relative value of Asian culture. The hotels
may not necessarily cater solely to European tourists, but nevertheless, a
foreign culture is foregrounded while its Asian counterpart is backgrounded.
The message is clear: anything foreign, imported and specifically Western
excites and sells readily.
Unlike Geylang or Serangoon Road in Singapore, there is a conspicuous
absence of religious symbols along Orchard Road even though a prayer hall
for the Muslims is located off the main street in Bideford Road. The vicinity
is thus constructed to be secular, but not necessarily apolitical. California
Fitness Centres and Planet Hollywood have made their presence felt in
Orchard Road along with Singtel and the Safra Town Club. Unlike these
vibrant institutions whose open concept invites pedestrians to browse, the
Thai Embassy appears inaccessible behind its iron gates and thick foliage. As
the lowest building there, the embassy does not fit into the concept of
Orchard Road because it does not generate sales or draw in the crowds. It
mars the overall outlook and thematic concept of Orchard Road and we
interpret it as 'failing' textually. In contrast, the Singaporean Presidential
Palace or the Istana, situated at the end of Orchard Road and not visible
from the main road, is designed to attract attention. The changing-of-the-
guards ceremony has found favour with both the locals and the visitors. This
appeal can be translated as a desirable Interpersonal relation. Although
officially closed to the public for most of the year, the grounds are opened
on designated public holidays.
Streetscapes such as road signs, street lights and bus shelters appear to be
neutral, but closer inspection reveals a different scenario. While the street
signs are in English, the ethnic group whose presence is strongly represented
is Chinese. The architecture of the Marriott Hotel is an example of how
that presence is reinforced and preserved. Such buildings serve to remind
Singaporeans of their cultural heritage. The one reminder of a multi-racial
society is the mural wall located next to the entrance to Orchard MRT
station where foreigners, especially the Filipinos, congregate on Sundays.
This mural wall depicts the cultural activities and the various landmarks
associated with the four main ethnic groups in Singapore. Discotheques and
pubs are discreetly placed in various corners of buildings and roads, away
from the public eye during the day. However, at night, these entertainment
centres spring to life, while out in the street the action continues. Orchard
Road has fulfilled the expectations and has realized the vision of the author-
ities to create 'a modern and vibrant commercial corridor alive with day and
night activities' (Orchard Planning Area, 1994: 14).
Hence, it is immaterial whether the people are indoors or outdoors. In
Orchard Road, people are constantly on the move and wherever they may
be, there are ATM machines for them to withdraw their money and at the
same time, an outlet for them to spend it. Every visitor to Orchard Road is a
potential customer. Regardless of the time of day, one can be assured that
there are cash transactions in Orchard Road. Textually then, the retail
industry has been successfully turned into one of the cultures of Orchard
Road. Except for Ngee Ann City, no other shopping mall has been promin-
ently featured in postcards, the one form of communication that 'person-
ally' connects Singapore and its visitors to the other parts of the world.
Examination of the postcards available in the local shops reveals that a
postcard of Orchard Road often includes Tang Plaza and the Singapore
Marriott Hotel, usually photographed from various angles and at different
times of day. This inevitably enhances the hotel's status but most import-
antly, it transforms the hotel into the landmark of Orchard Road.
What make the complex more significant are its colours and its pagoda-
like architecture. Using the framework for architecture in Table 3.2, choices
from systems for Interpersonal meanings feature strongly in the design of
the hotel. For example, as an illustration of the functions of the units Mod-
ernity and Colour, its former owners had deliberately chosen its present
design to reflect their racial and cultural heritage and, although only eight-
een years old, its design is representative of the days of ancient China. As far
as Colour is concerned, the dominant colours in the vicinity of Orchard
Road are blue and brown, as in the Forum the Shopping Mall, Wisma Atria
and Ngee Ann City, but, at the complex, the traditional Chinese colours,
green and red, dominate both the roof tiles and columns of the building.
Unlike the other hotels, which were designed to resemble vertical rect-
angular blocks that occupy extra lateral space, the Marriott Hotel is
octagonal in shape and is a tall and lean building with a distinctive Façade,
which is a conical top and upturned roof-ends that point towards the sky (see
Figure 3.2). The contrast in building shape and colour is grouped under the
category of the unit Rhythms that operates textually. In addition, the
appearance of the complex has been likened to 'a decent Oriental gentleman'
and deemed 'a trustworthy place' (Gwee, 1991: 62-63). We note
that the number 'eight' and the colour 'red' are considered lucky and sym-
bolize prosperity within the Chinese community. Such beliefs or practices
are related to a community's social semiotic, which operates Interpersonally.
However, because of its pagoda-like structure and octagonal shape, the
design of Tang Plaza and Marriott Hotel is not consistent with the overall
environmental and architectural structure of Orchard Road. In other
words, the complex does not 'exhibit some kind of "fit" with their neigh-
bours and neighbourhood' (O'Toole, 1994: 87). Although this is apparently
deliberate, textually the inconsistency could be said to 'fail' or be 'undesir-
able'. This Textual failing means, of course, that Interpersonally the building
attracts attention. The shape of the Marriott Hotel is only prominent from
an aerial view (see Figure 3.2). At eye-level, due to its orientation, distance to
and accessibility from the main road, the Tang Plaza is more distinct (see
Figure 3.1). This disparity may be partly due to proportionality in Size, a
system that operates interpersonally. The hotel seems to be sitting on a base
that is too broad for it (see Figure 3.2) and unlike the Tang Plaza, the
Marriott Hotel is backgrounded. The hotel proper is built in the centre of
the Plaza, which means that it is actually distanced from the main road.
From the environmental point of view, and both interpersonally and text-
ually, this location acts as a buffer to the noise generated by the traffic.
Nevertheless, the hotel draws attention to itself due to its unique roof
design. One needs to raise one's head to view the hotel, and what is first seen
at ground level is the red and green roof (see Figure 3.1). In sum, the
Intertextuality or the difference in overall design, mismatch in size and the
colour scheme gives the building its Oriental character, one that provides
that significant 'contrast with the background' (Lynch, 1996: 102) which is
Orchard Road. These differences have naturally proven to be advantageous
because these are the features that are constantly highlighted in various
postcards and travelling brochures.
The complex is a building with a hotel and the four-storey Tangs Super-
store built as a whole unit. Experientially, there are two Episodes operating
simultaneously at the Tang Plaza. One Episode is that of a hotel and the
other, a shopping centre. Each is a different entity but one which has been
integrated and superimposed over the other. Each Episode serves its own
function: the hotel provides lodging, food and entertainment, while
the shopping centre is part of an industry that is responsible for shaping
Singapore into the commonly perceived shoppers' paradise. Both cater to
the needs of the foreigners as well as the locals and fit into the concept of
'under one roof'; that is, shopping, dining, entertainment and lodging within
the same building. This provides the Textual Cohesion in the Episodes. This
Cohesion is also responsible for the great interplay and interaction between
the two Episodes because, seen experientially, for the uninitiated at least, it is
hard to predict where the shopping centre ends and where the hotel begins.
Textually, the foregrounding and the prominence given to Tangs Superstore
ensure that the complex fits into the overall thematic concept of Orchard
Road. Unlike the thematic Malay Village in Singapore, which was designed
to promote the Malay culture as a form of tourist attraction, the Tang
complex has proven to be a successful social, cultural and economic venture.
Even though the complex is located at the junction of Scotts and Orchard
Roads, vehicle access or the Interpersonally salient Orientation to entrant to
the complex is only from Scotts Road. A slip road branching out from Scotts
Road leads to the hotel main entrance and subsequently to the main
entrance of Tangs Superstore. For those using public transport, a bus stop
and an underpass to Orchard MRT station are conveniently located oppos-
ite the entrance of Tangs Superstore giving commuters, who are also pro-
spective customers, direct access to the shopping centre. The whole complex
is slightly elevated from the main road, which metaphorically puts it in a
position of power or superiority. The protruding roof of the Tang complex
provides a much-needed shelter from both sun and rain while its red col-
umns act as advertisement boards. The width of the pedestrian walkway
skirting the complex indicates that a heavy human traffic flow is anticipated.
Therefore, the open spaces around the complex are put to efficient use.
Benches are provided while OROs, such as Mrs Fields' and Juice & Java,
provide quick snacks and drinks. Textually, unlike in most parts of Singapore,
there is a sloping ramp that caters to the needs of the physically handi-
capped or those who are wheelchair-bound. And in case pedestrians forget
that the hotel is an octagonal-shaped building, this has been permanently
imprinted on the non-slip tiles of the walkway skirting the complex, while an
octagon circumscribes each column of the complex on the roof. Like the
built form of Orchard Road, there is an overwhelming emphasis on the
Interpersonal function at the complex.
Textually, in keeping with the green image of the area, low-lying shrubs
and palm trees signifying 'a tropical island' line the perimeter of the com-
plex. The hotel entrance, however, has the thickest shrubs. Interpersonally,
other than complementing the colours of the hotel and enhancing its land-
scape, these plants shield the hotel guests from the main road, providing a
little privacy. The names of the complex's main tenants, that is, 'Marriott'
flanked by 'Tang' on either side, are mounted on the wall facing Scotts Road,
giving the impression that each is vying for the attention of the onlookers. If
one were to miss the hotel's name, the situation has been rectified through a
concrete signboard. This signboard 'announces' its presence in the vicinity
as it is erected directly opposite the hotel entrance and thus faces towards
the junction of Scotts/Orchard/Paterson Roads. Such a signboard, one
that is not part of a hotel proper and located in an open space, is the only
one found in the area. Others, if available, are usually located within the
hotel's premises.
Interpersonal features or strategies to provide the guests with what are per-
ceived as the necessary comforts. Guests are expected to respond visually,
auditorily, mentally as well as emotionally to their immediate surroundings
(Kress, 2000), and, in this case, to the soft lighting, to the orchids and to the
soothing sounds of the flowing water from the fountains.
The layout of the Foyer of the Marriott Hotel is unique because it does
not conform to the standards adopted by the various hotels in the vicinity. At
the rank of Element and for the Experiential function, for instance, the Foyer
opens to the sky and, as described in its promotional brochure, is 'illumin-
ated by a three-storey skylight' thereby reducing the reliance on artificial
lighting while at the same time giving the air-conditioned lobby an airy
atmosphere and good ventilation. No chandeliers are needed, just wall-
mounted lamps and table lamps placed at strategic locations. The warm and
soft lighting is easy on the eyes. The walls and floor are bare as carpets
and ornaments or decorations such as paintings are kept to a minimum.
Instead both the floor and the walls are fully tiled and of similar shades.
Though this means easy maintenance, as the cleaning and mopping process
is easier, the Foyer exudes coldness and appears businesslike. Interpersonal
functions or considerations such as Warmth and Comfort appear to have
been backgrounded.
At the Foyer, guests are not greeted by the traditional Sites of power
that function interpersonally at the rank of Floor. This is usually the reception
counter where the initial scrutiny of a guest takes place. Instead guests
are 'greeted' by an escalator or a 'connector' leading to the second floor of
the hotel where the banquet rooms and restaurants are located. Sign-
boards displaying the names of the banquet rooms and restaurants are
placed at the foot of the escalator. Thus, guests need not seek directions,
thereby alleviating labour costs. Inevitably this reduces the human inter-
action between guests and hotel staff. For an establishment that deals with
the service industry, Interpersonally, this is interpreted as another setback.
The distance between guests and hotel is further widened by the location
of the reception counter, which is located at the far end of the lobby and
sandwiched between its side entrance and its emergency exit. Guests either
approach the counter by walking across the Foyer (Path 'A' in Figure 3.3)
or by passing the jewellery, pastry and cigar shops along the passageway
on the right (Path 'B' in Figure 3.3). Initially the location of the counter,
which is part of the hotel's welcoming team and the human face of the
establishment, appears inconvenient to the guests but this apparent
inconvenience is negated because of the close proximity of the lifts that
would eventually lead the guests to their rooms. What can be deduced
here is that at the Foyer, the foregrounding of Textual functions such as the
Relation of the lifts to the reception counter and the Relation of escalator to
signboards, far outweighs the Interpersonal functions such as Comfort, Wel-
come, and human contact with hotel staff. This also functions to make
surveillance and the official scrutiny of the guests implicit rather than
explicit.
rooms but also to the lifts on that level. In the same manner too, the cigar
shop and the underground pub are sites for discreet soliciting. These are,
however, located away from public viewing. What is further implied is that
seeking pleasures and entertainment is the prerogative of the male. The
open concept in the hotel, which symbolizes one's public image, reflects a
closure to reality or to one's private life; it demands discretion because there
still remains the Asian obsession with the subject of 'face'.
Conclusion
While the analysis of Orchard Road entails the construction of a framework
that features a rank-scale with the functions and systems through which
Singapore is constructed, the analysis of the Marriott Hotel requires the
application of O'Toole's (1994: 86) framework for architecture. Through
the integration of both frameworks, the analyses of both Orchard Road and
Marriott Hotel reveal how spaces in and around Singapore are carefully
organized to meet the sociopolitical and socio-economic demands of the
authorities. Every available space is found to be potentially economically
viable. In general, the analyses reveal how Singapore is constructed as a
shopper's paradise, a tropical island and a food haven. What is presented,
however, is a constructed image of a country and a hotel that both the
authorities and the management want the public and the world to see and to
believe. How this is done requires, to a certain degree, the use of women as
commodities. The general perception in Orchard Road and the Marriott
Hotel is that sex sells, thus reflecting the values of a patriarchal society.
Orchard Road demonstrates how foreign cultures, specifically those from
the West, are foregrounded and how the cultures of Singapore's multi-racial
societies are backgrounded. This is perhaps part of a strategy to cater to the
influx of 'foreign talents' and tourists to the country. Business concepts such
as the outdoor refreshment areas, for example, are imported from overseas
in an effort to make the streets 'more exciting and lively' (URA Annual Report,
1997/98: 28). The concepts of excitement and liveliness are therefore
defined by the authorities and Singaporeans are socially engineered to subscribe
to these prescribed concepts. These outlets as well as the abundance
of shopping centres in the vicinity are in reality revenue-generating
machines. Profit-making is the key word; the culture of consumerism dom-
inates the area and capitalism is seen as the answer for a land reliant on
human resources. The presence of the Marriott Hotel, however, serves to
remind the people, whether locals or foreigners, of Singapore's cultural
heritage. The building is a potent social and cultural symbol and a reminder
of the prominence of the Chinese community in the country. Amidst the
chaotic cultural scene in Orchard Road, Singaporeans must be reminded of
their cultural heritage and to meet those expectations, a system is imple-
mented and a lifestyle prescribed. The system and the people depend on
one another.
Acknowledgements
The map (Plate 3.1) is provided courtesy of This Week Singapore.
References
Betsky, A. (1994) James Gamble Rogers and the pragmatics of architectural
representation. In W. J. Lillyman, M. F. Moriarty and D. J. Neuman (eds), Critical
Architecture and Contemporary Culture. New York: Oxford University Press, 64-84.
Chua, B. H. (1995) Communitarian Ideology and Democracy in Singapore. London:
Routledge.
Dale, O. J. (1993) The Singapore Concept Plan: historical context/current assess-
ment. PLANEWS. Journal of the Singapore Institute of Planners 14(1): 41-46.
Singapore: Straits Printers Pte. Ltd.
Fong, T. W. (1973) Industrial complexes and the garden city - can they co-exist? In
Chua Peng Chye (ed.), Planning In Singapore - Selected Aspects and Issues. Singapore:
Chopmen Enterprises, 16-21.
Gwee, P. K. W. (1991) Fengshui: The Geomancy and Economy of Singapore. Singapore:
Shing Lee Publishers Pte Ltd.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edition). London:
Arnold.
Jayapal, M. (1992) Old Singapore. New York: Oxford University Press.
Keung, J. (1991) Overview on the Concept Plan. Living the Next Lap - Blueprints for Business.
Singapore: Urban Redevelopment Authority.
Kress, G. (2000) Multimodality. In B. Cope and M. Kalantzis (eds), Multiliteracies:
Literacy Learning and the Design of Social Futures. South Yarra: Macmillan Publishers
Australia Pty Ltd, 182-202.
Liu, T. K. (1991) Press Release on Living the Next Lap - Blueprints for Business. Singapore:
Urban Redevelopment Authority, 1-5.
Lynch, K. (1996) The city image and its elements (first published 1960). In R. T.
LeGates and F. Stout (eds), The City Reader. London: Routledge, 98-102.
Master Plan. Report of Survey Volume 1. (1955) Singapore: F. S. Horslin, Government
Printer.
Master Plan Written Statement 1993. The Planning Act (Cap 232, revised edn 1990).
Republic of Singapore. Singapore: Ministry of National Development.
Mumford, L. (1996) What is a city? (first published 1937). In R. T. LeGates and
F. Stout (eds), The City Reader. London: Routledge, 183-188.
Noble, S. (1994) Feng Shui in Singapore. Singapore: Graham Brash (Pte) Ltd.
Orchard Planning Area: Planning Report 1994. Singapore: Urban Redevelopment
Authority.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Pike, B. (1996) The city as image (first published 1981). In R. T. LeGates and
F. Stout (eds), The City Reader. London: Routledge, 242-249.
Preziosi, D. (1984) Relations between environmental and linguistic structure. In
R. P. Fawcett, M. A. K. Halliday, S. M. Lamb and A. Makkai (eds), The Semiotics of
Culture and Language Volume 2. Language and Other Semiotic Systems of Culture. Dover,
New Hampshire: Frances Pinter, 47-67.
Safeyaton, A. (2001) The Lion City as a text - a semiotic study of Singapore's
Orchard Road and Marriott Hotel. Unpublished MA dissertation. The National
University of Singapore.
Anthony P. Baldry
University of Pavia
Introduction
How can we go about analysing a TV advertisement? Despite the long
tradition of analysis of printed advertisements, the prevailing view, until
quite recently, has been that it is impossible, for technical reasons, to analyse
TV adverts in such a way that the interplay of visual and verbal resources
can be reconstructed. Cook (1992: 37-38, see also 2001: 42-44), for
example, states that:
Any analysis of the language of adverts immediately encounters the paradox that
it both must and cannot take the musical and pictorial modes into account as well
[. . .] This problem is more serious with tv than with printed ads, for on paper
pictures stand still (and can even be reproduced), and there is no sound [. . .] In
considering tv ads, where pictures move, music plays, and language comes in
changing combinations of speech, song and writing, reproduction is virtually
impossible, and a video, to be watched while reading, would transform a written
analysis even more than companion illustrations. Many analyses of advertising
solve this problem by ignoring it.
Cook's statement is, in fact, a testimony to the revolution that has taken
place in a decade vis-a-vis film texts and their analysis, often providing
solutions to the concerns he raises, particularly those relating to reproduc-
tion: the videocassette can be easily digitalized using an appropriate PC card
and the resulting digital film can be manipulated in many ways, including,
for example, the addition of explanatory captions; the Web, unknown ten
years ago, has spawned new forms of advertising which increasingly include
streaming video capturable through special software programs such as
Camtasia; postproduction software such as Adobe Premiere has made it
possible to convert a film into a sequence of stills and hence into a printable
format.
These technological innovations have given rise to new descriptive prac-
tices including: (a) the multimodal transcription (Baldry, 2000b: 81-85;
Thibault, 2000: 374-385) and (b) the construction of PC-based multimodal
of this paper is, however, to introduce the new field of multimodal con-
cordancing as a means of examining text and text types in relation to their
context of situation and context of culture (Halliday, 1978; Halliday and
Hasan, 1985). Multimodal concordancing thus builds on the foundations
laid by the multimodal transcription and on systemic-functional
approaches to language-only concordancing such as the Systemics Coder
developed by O'Donnell (2002). In so doing it raises questions about how
the study of multimodal discourse might be undertaken in the language-
learning classroom (Baldry, 1999, in press; Pavesi and Baldry, 2000) and
more generally how multimodal concordancing might develop in the
future.
towards the constant shifts in the selection of options in keeping with Gregory's
principle that phase and transition can 'be used to capture the dynamic
instantiation of micro-registerial choices in a particular discourse' (Gregory,
2002: 323); the Bottom Row (with its focus on the content of each shot)
describes, on the other hand, the film's unfolding in time, and, though not
excluding the principle of selection from options, is thus oriented more
towards sequential development and specific realizations. Though not
shown here, a multimodal transcription of this type also allows Textual
elements from various texts to be aligned in such a way as to compare their
phasal organization (see Baldry, 2000b: 68-69 for the development of the
comparative multimodal transcription).
Though unusually involving two drivers and two car-drive phases, in
many other ways the text in question illustrates many typical features of
car adverts, in particular the expression of the very strong relationship
between the driver's and the car's identity. As Figure 4.2 indicates,
although other criteria might have been invoked, a good starting point
when defining the division into phases in this text (and we may add the
60 adverts in the current car advert corpus) relates not to the human
participants but instead to the type of representation of the car: in this
case (and in many other cases) whether the car is present and, if so,
whether it is moving or stationary. At the start of the first phase of this
text, there is a typical car-drive phase [+CD], in the second, an essen-
tially car-stationary phase [+CS] (though the second subphase contains
the idea of a car stopping and starting - hence the [+CS, +CD] tag); the
third phase is again a car-drive phase [+CD], while the fourth phase, the
end phase, typically relates to the car abstractly in terms of its make and
manufacturer, and presents all the typical ingredients of one type of end
phase where the car itself is (physically) excluded [-CD, -CS] and where
instead the focus is on oral and written slogans and the manufacturer's
logo.
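
The phase tagging just described lends itself to a simple machine-readable encoding of the kind a multimodal concordancer might query. The following Python fragment is offered only as an illustrative sketch and is not the encoding used in the car advert corpus: the names Phase and concordance, and the decision to model the second phase as exactly two subphases, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One phase or subphase of an advert (hypothetical structure)."""
    label: str
    car_drive: bool        # True corresponds to +CD, False to -CD
    car_stationary: bool   # True corresponds to +CS, False to -CS

    def tag(self) -> str:
        """Render the tag in the bracketed style used in the text."""
        positives = []
        if self.car_stationary:
            positives.append("+CS")
        if self.car_drive:
            positives.append("+CD")
        if positives:
            return "[" + ", ".join(positives) + "]"
        # Neither feature present: the car is excluded altogether.
        return "[-CD, -CS]"


# The four phases of the advert as described in the text; splitting
# phase 2 into exactly two subphases is an assumption for illustration.
ADVERT = [
    Phase("phase 1", car_drive=True, car_stationary=False),
    Phase("phase 2, subphase 1", car_drive=False, car_stationary=True),
    Phase("phase 2, subphase 2", car_drive=True, car_stationary=True),
    Phase("phase 3", car_drive=True, car_stationary=False),
    Phase("phase 4 (end phase)", car_drive=False, car_stationary=False),
]


def concordance(phases, *, car_drive=None, car_stationary=None):
    """Return the phases matching the requested feature values.

    A value of None means the feature is not used as a filter.
    """
    return [
        p for p in phases
        if (car_drive is None or p.car_drive == car_drive)
        and (car_stationary is None or p.car_stationary == car_stationary)
    ]


if __name__ == "__main__":
    for phase in ADVERT:
        print(phase.label, phase.tag())
    # All phases or subphases in which the car is shown moving.
    print([p.label for p in concordance(ADVERT, car_drive=True)])
```

Storing the two features as separate booleans, rather than as a single label, is what allows a query such as 'every phase in which the car is moving' to be run across many adverts at once.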
The correlation between driver and car is, of course, a major goal of
the car advert genre, reflected in the genre's phasal organization, which
characterizes the way the car advert unfolds in time. The car is very
much a Participant — by definition, at least an equal partner in the
human/non-human participant relationship (and more often than not a
superior). This emerges quite clearly in the type-oriented multimodal tran-
scription of Figure 4.2, which explicitly defines the constant shifts in local
foregrounding in the Top Row, e.g. whether it is the car, the driver, the
mascot or the countryside that is the salient Participant in a particular
phase or subphase.
In fact, the most salient phases in this text are the first and third, with
the first phase being conjunctive-disjunctive in nature and the third, con-
versely, of the disjunctive-conjunctive type. This reflects the text's fore-
grounding of potentially conflictual Interpersonal relationships between
the two drivers: the first, an outlandish jive-as-you-drive dude, the second,
a suave, sophisticated female. 'Conjunctive' and 'disjunctive' are here
These two phases are separated by a very brief second phase in which
both cars are essentially motionless and where, vis-a-vis the metafunctions,
rather than Interpersonal elements, Textual and Experiential elements are
prominent: in this short phase, the New is introduced in the form of a new
car, a new driver and an Internet address (note that, in fact, the Internet
address has been 'carried across' from the previous phase, illustrating the
significance of extended transitions and overlaps in phasal organization, a
matter discussed in detail below). The main meaning created is that the
male driver successfully hitches a lift, so that it is important to glimpse one
of the cars stopping - hence the [+CS +CD] tag for this phase. The absence
of salient Interpersonal elements in this phase is striking: the song, for
example, ceases, in contrast to the previous and subsequent phases,
indirectly underscoring the fact that, in many car ads, song is a crucial source
of meaning, often acting as the functional equivalent of a narrator, linking
the viewer to the events at hand and, in part, defining the viewer's expected
response to actions and events.
This advert is no exception in this respect: the final refrain 'you own my
heart', cements the identity between the viewer, the car drivers and the car.
It also coincides with the written slogan - multitronic©: Il cambio automatico
a variazione continua da Audi [i.e. Audi's gearbox with continuously
variable automatic transmission] further building on the text's basic the-
matics, namely that the discordant contrast between the smooth, sophisti-
cated lady and the dude's jive-as-you-drive lifestyle will be resolved in a
harmonious fashion by a relaxing ride in the right car, namely an Audi
automatic.
Significantly, many contemporary car adverts present the car as a
space where social conflicts, potential or real, may be resolved, whether
within the family or, for example, between loving couples. The car can thus
be a sexual space as in the Citroen car advert in the author's corpus
where, in the hyperreal coding orientation (for coding orientation see
Bernstein, 1971; for a multimodal perspective of coding orientation
see Kress and van Leeuwen, 1996: 168—171), the car rolls over and
over as the couple make love; alternatively, it may be a place of protection
where a kissing couple in a lonely lane can successfully fend off an
attack from Zombies. More mundanely, it can also be a space where
children and animals can be safely transported and sometimes even a
space in which members of a sports team can start throwing a
ball around, de facto transcending the narrow confines of the car's
interior.
In this advert, song contributes significantly to this particular meaning of
conflict resolution, built up gradually and multimodally, throughout the
advert with its highly sensual suggestion that the relationship between the
man and the woman will outlive the lift and that it will be the woman
who will provide the initiative in this respect: a prod, as it were, is as
good as a wink. Notice, indeed, how, rather than stopping at the end of
the third phase, the song extends beyond into the final phase, with the
result that the latter, as well as giving the usual information about the
particular model and the manufacturer, also underscores the entire text's
meaning, by suggesting that the conflict between the man and the woman
can be, and indeed, has been resolved by virtue of the car's design
characteristics.
This meaning-making is the result of the artful juxtaposition and over-
lapping of different types of phases that carry out different functions. The
very notion of phase presupposes that there is some transition between one
phase and another and, to a lesser extent, between the various subphases
that constitute a phase. Moreover, following on from what has been stated
above, as well as phase types, we can also expect various types of transition to be
present in film texts. For example, Thibault (2000: 320-321) suggests that
the points of transition between phases have their own special features that
play an important role in the ways in which observers or viewers recognize
the shift from one phase to the next and that, generally speaking, transition
points are perceptually more salient in relation to the phases themselves.
Thus viewers of texts have no difficulty in perceiving particular Textual
phases thanks to their ability to recognize the transition points or the
boundaries between phases.
However, the notion of transition should not necessarily be associated
with the idea that there is a precise boundary or point at which a transition
occurs. In many cases, this vision of boundaries in the organization of
phases and transitions will work very successfully. But this is not always the
case. As Thibault (2000: 326-327) points out:
In this text, for example, each of the two main phases (the first and the third)
contains a series of pivotal transitional points between the various sub-
phases that mark the step-like progression from conjunctive to disjunctive (as
defined above) and vice versa: these are movements relating to the gearbox,
the cassette, the mascot, the drivers and the cars (e.g. braking). In the first
phase, the malfunctioning of the gearbox (we hear an ominous crunching
noise) and sudden braking and cessation of the mascot's movements signal
that disruption is to follow. The driver's jiving also comes to a stop. In the
third phase, the reverse is true: for the second time the camera carefully
focuses on the gearbox, which, in keeping with the demands of the targeted
audience, and quite unlike the first gearbox, is an automatic gearbox for
drivers who like a smooth ride. Significantly, the transition points in this, and
many other adverts, are linked in a chain to form a crescendo which
contributes to the overall coherence of the text. One way in which this
salience is achieved is by changing the camera focus: thus the out-of-focus
mascot suddenly comes into focus. Another is the type of shot used: two
major subphasal transition points in the first phase and a third in the third
phase coincide with the only three shots in which we view the mascot by
looking out of the car through the windscreen: in each case this selection of the
mascot contributes to the underlying conjunctive/disjunctive 'stop-start'
flow of the text: the mascot is shown, in an alternating way, as either static,
carrying with it a negative connotation (a 'stop') that things are wrong, or,
when it sways in all directions, with a positive connotation (a 'continuation'
or a 'restart' after a 'stop'). A similar chain is described in Thibault (2000:
328-329), in terms of:
covariate semantic ties in the visual thematics [. . .] that are progressively defined
in the unfolding text as cohesive chains extending over the entire text. For
example, the foregrounded co-patternings of items deriving from the interacting
cohesive chains of 'smiling', 'rolling the sleeves', and 'moving forward' function
to create global coherence in the text.
the same thing as characterizing the typical ways in which transitions come to
be the salient element in phasal organization.
A multimodal transcription is limited in the amount of information it can
give about types of semiotic units that are found in film texts and cannot
provide anything like the information we need in order to provide motivated
answers to these questions. If we are to pursue our understanding of the co-
deployment of semiotic resources more thoroughly we need to understand
how a large number of dynamic texts typically unfold in time.
And in order to be able to identify characteristic patterns, the research
process requires us to build corpora that can be analysed in terms of various
Textual phenomena, including, in particular, a study of the typical phasal
organization of a specific genre which ensures that a film's unfolding in
time, in which the transition, as we have seen, is so significant, can be
captured by in vivo multimodal analysis. Such a requirement dictates the
need to build software programs that are capable of analyzing corpora and
not just individual texts.
What then are the characteristics of an online XML-based multimodal
concordancer such as the Multimodal Corpus Authoring (MCA) system,
which has been designed by the author specifically to identify recurrent
patterns in films?
First, as an authoring tool, it enables researchers, however imperfectly, to
view short pieces of film and simultaneously to write multimodal descrip-
tions of them in terms of various parameters, for example, those relating to
a text's metafunctional and phasal organization. Using MCA's editing tool,
researchers can segment a particular film into functional units and, while
viewing these units, type out detailed annotations relating both to the semi-
otic resources they deploy and the functions they perform within that film.
Indeed, MCA approximates to the researcher's dream of simultaneously
viewing and writing a description of a film in real time (see Baldry and
Taylor, in press).
Second, like a linguistic concordancer, a multimodal concordancer can
also establish patterns that relate to a series of texts, rather than to specific
instances, to a much greater degree than is possible with a multimodal
transcription, even where the latter is oriented towards type rather than
instance. For example, it is possible, using MCA, to determine the ratio of
female to male drivers, or to identify those texts relating to cars that are not
being driven, and hence have no drivers, and those relating to cars which are
instead being driven but where the driver is 'implied' and not actually seen.
It is also possible to identify special cases that involve two drivers, typically
one male and one female, or non-human drivers, typically robots. As with
any corpus approach using information technology, this information can be
obtained within a few seconds. However, unlike many lemma-based
approaches, the researcher must first carry out the work of description-cum-
transcription of the texts in the corpus. Not surprisingly, the software design
is such as to incorporate an analytical framework that simplifies this task as
much as possible.
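A single-parameter query of the kind just described might be sketched as follows. This is a minimal illustration in Python and not MCA's actual implementation: the record layout, the field name `driver`, its values and the sample data are all hypothetical, standing in for descriptions that the researcher has already entered.

```python
from collections import Counter

# Hypothetical annotations, one record per advert, created beforehand by
# the researcher. Field names and values are illustrative, not MCA's schema.
adverts = [
    {"id": 1, "driver": "female"},
    {"id": 2, "driver": "male"},
    {"id": 3, "driver": "implied"},  # car is driven but the driver is not seen
    {"id": 4, "driver": "none"},     # car is not being driven
    {"id": 5, "driver": "male"},
    {"id": 6, "driver": "robot"},    # non-human driver
]

def count_by(records, parameter):
    """Count how often each value of one annotated parameter occurs."""
    return Counter(record[parameter] for record in records)

def select(records, parameter, value):
    """Return the ids of all records whose parameter has the given value."""
    return [record["id"] for record in records if record[parameter] == value]

counts = count_by(adverts, "driver")
female_to_male = counts["female"] / counts["male"]
undriven = select(adverts, "driver", "none")
implied = select(adverts, "driver", "implied")
```

The point of the sketch is that the search operates on the descriptions, not on the film itself, so its results are only as good as the prior annotation.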
voiceover predominates. As Table 4.3 shows, the search query in this case is
no longer formed by a single parameter (driver) but is a relational search that
links two disparate parameters: driver and storyteller.
Thus, unlike many lemma-based linguistic concordancers such as OCP
or WordSmith, but in keeping with the approach adopted by O'Halloran
and Judd (2002), a multimodal concordancer needs to be built around the
notion of the relationship between resources, events and participants. In this
respect, any form of transcription is a hard task, often undertaken by a
researcher without knowing whether the effort will be worth the candle. In
theory, the results described in Table 4.2 could be acquired by watching a
videocassette and marking down the various features using pen and paper.
Though in principle feasible, it would be a time-consuming process. Even
using MCA, which greatly reduces the time taken to provide a description, it
is still a time-consuming process. A much harder task, however, is to relate the
parameter DRIVER with other parameters such as STORYTELLER and ORAL
SLOGAN. This is virtually impossible to achieve using traditional pen-and-
paper and cassette methods. A multimodal concordancer, such as MCA,
which is based on these relational principles, can easily identify such pat-
terns through relational searches as Table 4.3 indicates.
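A relational search of this kind can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration and does not reproduce MCA's query engine or the data in Table 4.3: the parameter values and the four sample records are invented for the purpose.

```python
from collections import Counter

# Hypothetical annotations with several parameters per advert.
adverts = [
    {"id": 1, "DRIVER": "male", "STORYTELLER": "voiceover", "ORAL_SLOGAN": True},
    {"id": 2, "DRIVER": "female", "STORYTELLER": "none", "ORAL_SLOGAN": False},
    {"id": 3, "DRIVER": "implied", "STORYTELLER": "voiceover", "ORAL_SLOGAN": True},
    {"id": 4, "DRIVER": "male", "STORYTELLER": "on-screen", "ORAL_SLOGAN": True},
]

def relational_search(records, **conditions):
    """Return ids of records that satisfy every parameter=value condition."""
    return [
        record["id"]
        for record in records
        if all(record.get(param) == value for param, value in conditions.items())
    ]

def cross_tabulate(records, first, second):
    """Count co-occurrences of the values of two parameters."""
    return Counter((record[first], record[second]) for record in records)

hits = relational_search(adverts, DRIVER="male", STORYTELLER="voiceover")
table = cross_tabulate(adverts, "DRIVER", "STORYTELLER")
```

Linking DRIVER with STORYTELLER in this way is trivial once the descriptions exist in machine-readable form, which is precisely what pen-and-paper methods cannot offer.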
Third, a multimodal concordancer, even more than a linguistic concord-
ancer, needs to be built around functional parameters such as those we have
mentioned above, namely Halliday's notion of metafunctions (Halliday,
1994) and Gregory's notion of phase and transition (Gregory, 1995, 2002).
In this respect, one significant step in the development of a corpus relates to
the work of tagging. In their paper on the development of a tagging system,
Baldry and Thibault (2001: 94-98) proposed the use of an annotational
system that defined gesture and language in terms of Halliday's notion of
example, when a driver opens the door and puts an object or person in the
car rather than himself/herself). Equally, it is possible, with a single search, to
identify all the cases where we see the car being driven and the driver getting
into and out of the car, in this case a query of the type: SP3: contains YES +
SP5: contains YES + SP8: contains YES. Thus 60 adverts were 'tagged' in
terms of the subphases of the car-drive phase (the first subphase has been
excluded on the grounds that it is only partly a material process), in such a
way that the corpus could be searched for the absence or presence of a
particular subphase.
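The query SP3 + SP5 + SP8 might be sketched as below. This is an illustration only: the advert names and tag values are hypothetical, and the treatment of 'contains' as a substring test is an assumption about how such a query could behave, not a description of MCA. Under that assumption a doubtful YES/NO tag also satisfies 'contains YES', so the sketch offers an exact-match option as well.

```python
# Hypothetical subphase tags for three adverts (not the data of Table 4.5).
corpus = {
    "advert_a": {"SP3": "YES", "SP5": "YES", "SP8": "YES"},
    "advert_b": {"SP3": "YES/NO", "SP5": "YES", "SP8": "YES"},  # doubtful SP3
    "advert_c": {"SP3": "NO", "SP5": "YES", "SP8": "NO"},
}

def search(tagged, subphases, value="YES", exact=False):
    """Return adverts in which every listed subphase carries the value.

    With exact=False the test is 'contains' (substring), so a doubtful
    'YES/NO' tag matches; with exact=True only a plain 'YES' matches.
    """
    hits = []
    for name, tags in tagged.items():
        values = [tags.get(subphase, "NO") for subphase in subphases]
        if exact:
            matched = all(v == value for v in values)
        else:
            matched = all(value in v for v in values)
        if matched:
            hits.append(name)
    return hits

loose = search(corpus, ["SP3", "SP5", "SP8"])
strict = search(corpus, ["SP3", "SP5", "SP8"], exact=True)
```

The difference between `loose` and `strict` shows why doubtful cases need to be tagged explicitly if they are to be included in, or excluded from, a search.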
As Table 4.5 shows, there is in fact only one advert (n. 21) which comes
anywhere close to instantiating all the possible subphases and even in this
case one subphase is missing and another is doubtful - hence the YES/NO tag
represented as a bracketed tick: this is a case where the driver is seen getting
into the car but only to put his young son in the back seat (see Figure 4.3
below). In all these adverts, visual/verbal ellipsis is constantly at work vis-a-
vis the instantiation of the driving experience: there is normally no need to
see all the phases at work, since our own experience of driving allows us to
'fill in the gaps'. With the exception of advert n. 21, in 60 adverts we never
see the driver getting into and out of a car.
Table 4.5 suggests that car adverts do, in fact, fall into three types, which
may be tabulated as follows:
1 Car-drive adverts: The car is seen moving in a glorified way that attempts to
go beyond the daily grind of the ordinary world. The car is in an ideal
world. More often than not the number of participants is limited to one or two people
and in many cases no human participant is foregrounded; the participants never talk
about the car and never talk to each other and only exceptionally to the audience. In
these adverts only subphase 5 is apparent (17 cases);
2 Car-stationary adverts: The car is motionless, a statue to be 'worshipped'
and is typically related to some inconsistency or oddity in the behaviour
of the people surrounding the car who typically talk about the car. In these
adverts, none of the subphases listed in Table 4.4 is present (11 cases
represented in Table 4.5 as grey-shaded columns) or alternatively sub-
phases in which the car is seen moving are absent (a further 6 cases);
3 Hybrid storytelling adverts: where both car-drive and car-stationary elements
are present and where either other genres are exploited to meet the
advert's own ends (e.g. spoofs on cinema and TV genres) or some attempt
is made to define the car in relation to daily activities and (usually) its
enhancement of these. These types include talk but never in the car-drive phase or
subphase. A good example of this is where the car-drive element is not
shown - hence the bracketed tick notation - but is instead realized,
through talk, as a mental and oral fantasy (projection) about the car's
drive potential by the car driver while the car is actually stopped (say at
the traffic lights). This is by far the largest category (26 cases), although it
should be noted that the majority (15) instantiate CD subphases before CS
subphases (The Fan being a rather special case).
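The three-way division above can be expressed, very roughly, as a rule over the subphase tags. The Python sketch below is a simplification under stated assumptions: it treats subphase 5 as the only subphase in which the car is seen moving, and it ignores the cases where the car-drive element is realized only through talk. The function name and the constant `MOVING` are hypothetical. The counts are those reported in the list above.

```python
# Assumption: SP5 is the only subphase in which the car is seen moving.
MOVING = frozenset({"SP5"})

def classify(present, moving=MOVING):
    """Assign an advert to one of the three types from its subphases."""
    present = set(present)
    if not present & moving:
        # no subphases at all, or none in which the car is seen moving
        return "car-stationary"
    if present <= moving:
        # only the moving subphase is apparent
        return "car-drive"
    # moving and non-moving subphases are both instantiated
    return "hybrid"

# Counts reported in the list above: 17, 11 + 6 and 26 cases.
reported = {"car-drive": 17, "car-stationary": 11 + 6, "hybrid": 26}
total = sum(reported.values())
```

The reported counts account for all 60 adverts in the corpus, which is a useful consistency check on any such classification.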
Figure 4.3 Unusual events dictate the need for an extended pre-drive subphase
This phasal organization seems to fit The Fan and many other adverts in the
corpus very well. However, more work using MCA is required to establish
the validity of this suggested typical phasal organization and the division
of advert types into three types. The [+/-CD] and [+/-CS] tagging
system will not, of course, always be distributed as in the current case as:
+CD(P1)^+CS^+CS/+CD(P2)^+CD(P3)^-CS/-CD(P4). There are cases,
for example, in which the distribution is essentially the reverse, with the car's
physical presence being confined exclusively to the end phase. But this does
not affect the hypothesis that three basic subtypes exist.
If they do exist, then it may well be that the predominating human figure
in the car advert will turn out to be generically correlated with one of the
specific subtypes mentioned above: the DRIVER (the car-drive only advert),
the INSPECTOR (the car-stationary advert) and the RACONTEUR/STORYTELLER
(the hybrid type alternating car-drive and car-stationary phases and includ-
ing the subtype which includes an off-screen narrator). A further prediction
is that other roles will be involved, definable, however, in relation to the car (as
opposed to other participants, whether family, colleagues or strangers). That
is, it may prove to be the case that (despite many overlaps between the
categories) the car may be defined in terms of first, second and third person
relationships. The general distribution might well be: (a) car-drive adverts:
driver with his/her car [first person: mine: car and me, driver are the same thing];
(b) car-stationary adverts: inspector with somebody else's car, not mine [third person:
otherness: not mine/not yours]; (c) storytelling adverts: raconteur and his/her
dream car for you [second person: yours, likely to include some kind of appeal
of the type: You should be driving it. . .].
Table 4.5 reconstructs the Experiential metafunction of 60 car adverts
analytically and systematically as subphases in the material process of driving,
thereby suggesting the validity of multimodal concordancing as an ana-
lytical and teaching approach. But, however systematic this may be, this is
only a provisional finding, for if we are to honour the definition of phases in
terms of Gregory's already mentioned concept of consistency and congruity
echoed in Thibault's definition of phases as 'co-patterned semiotic selec-
tions that are co-deployed in a consistent way over a given stretch of text'
(Thibault, 2000: 325-326) and if we are to characterize their consequent
close identification with specific metafunctional configurations, we need,
at the very least, to complete the picture by describing patterns that
emerge vis-a-vis the Interpersonal metafunction (many of which are likely
to be stereotypical) and even more crucially the types of configurations
that emerge in relation to Interpersonal meanings when they are mapped
onto the Experiential structure we have sketched out. This is a complex
descriptive operation. Thus, although the previous paragraph gives broad
suggestions as to how this mapping might take place in car adverts, a com-
plete picture of the organization of car adverts into typical patterns of
phases and transitions still needs to be worked out. Such a picture needs to
be ascertained with more robust corpus description than the one currently
available. But the important point to note is that both the type of corpus
description and the corpus querying that this operation requires seem to be
quite in keeping with MCA's capabilities, given that its core feature is
its capacity to relate a wide array of disparate features over a wide range
of texts. But even if the phasal patterns sketched out above prove to be
valid over a still larger corpus, they will not be a point of arrival. Rather they
will still be a point of departure into a more precise understanding of
transitions and transition types, whose careful description, as this paper has
attempted to suggest, is crucial to the success of the multimodal analysis
of film texts.
Conclusion
What is a multimodal transcription and what is a multimodal concordancer?
What is the relation between them and how can they promote English
studies, both from the standpoint of the researcher carrying out detailed
comparisons of texts and, more generally, from the standpoint of teachers
and students of English? Why should we be looking at type as opposed to
instance? Most answers to these questions will, hopefully, have been provided
in what has been stated above. A characterization of phase and transition
types would seem to lead to a better understanding of the features of
dynamic genres of which TV ads are just one exponent, one that at the very
least provides a guiding framework for students taking their first steps in the
analysis of dynamic texts.
A few concluding notes are, however, in order. While the multimodal tran-
scription can be a useful starting point for an understanding of the ways in
which resources such as gaze, gesture and language combine in typical phasal
patterns, it has its limitations, some of which have been noted above. In the
early stages of this work, Baldry and Thibault developed a dynamic version of
the static multimodal transcription, a forerunner of MCA, which allowed the
user to generate the individual rows of a transcription through a query mech-
anism, and which facilitated understanding of how visual objects and their
movements could be analysed in terms of Halliday's metafunctions.
Unlike a lemma-based linguistic concordancer such as OCP or
WordSmith, MCA does not search through Textual data directly in the
search for patterns but does so indirectly: it searches the corpus for patterns in
descriptions which have been previously created by the researcher using
MCA's annotational tool. The annotational patterns so far used in the con-
struction of a corpus of car adverts relate mainly to the metafunctional and
phasal organization of the texts. As we have seen, in the analysis of The Fan car
advert, driving a car is not just a question of driving: rather a car advert can be
defined in terms of the relationship between the car driver and the car itself,
with car-drive (CD) phases intertwining with car-stationary (CS) phases.
Above all, though, MCA is the result of efforts to create transcription and
annotational tools that meet functional criteria in a way that was not
achieved by the first generations of lemma-based concordancers. In this
respect, it has to be stressed that the needs of the research community have
changed in recent years in such a way as to privilege specialized corpora,
including the analysis, whether comparative or otherwise, of specific texts,
all of which are clearly reflected in the design characteristics of MCA. MCA
has been specifically designed as an online tool so that the research and
teaching community can easily access it. In this respect, work is currently in
progress to establish what integrations can be achieved with other systems,
for example, with HyperContext Web which uses techniques born in arti-
ficial intelligence that keep track of the user's progress and which are fun-
damental in teaching applications of corpora (see Pavesi and Baldry, 2000;
Piastra and Lombardi, 2000).
Acknowledgements
This paper is part of research within the Linguatel Project, an Italian inter-
University project, co-financed by MURST/MIUR and co-ordinated by
Carol Taylor Torsello, University of Padua and its successor the Didactas
Project, co-ordinated by Chris Taylor, University of Trieste, which is
similarly financed. Michele Beltrami has developed MCA to the author's
design requirements as part of this project. Now in its second release,
MCA is viewable through the Pavia pages of the Linguatel Website:
claweb.cla.unipd.it/Linguatel/Pavia/MCA.htm or directly at: mca.unipv.it
[default User name: guest and default login: iamguest; see also New Regis-
tration] using Microsoft Explorer.
I thank Vauxhall Motors for the inclusion of five frames from their
advertisement, and I also wish to thank Antonio Cerlenizza and Oliver
Bartholomay, respectively Direttore Audi Italia and Responsabile MKT-
Audi of Autogerma, Divisione Audi S.p.A, Verona and Roberta Mottino of
Verba s.r.l. Milan for their kind permission to reproduce parts of The Fan
advert for the Audi A4 model. However appreciative and supportive of the
advert's organization and goals, the interpretation given above remains, of
course, entirely mine.
References
Baldry, A. P. (1999) Multimodality and multimediality. In M. Karagevrekis (ed.),
Compelling Learning Techniques in ESP/EAP, Proceedings of the 3rd ESP Conference., 25th
September 1998. Thessaloniki: Zefyros, 5-32.
Baldry, A. P. (ed.) (2000a) Multimodality and Multimediality in the Distance Learning Age.
Campobasso: Palladino Editore.
Kay L. O'Halloran
National University of Singapore
Introduction
The aim of this paper is to investigate a method for capturing and interpret-
ing the spatial and temporal dynamics of visual semiosis. This is achieved
through the description of an analysis of a short segment from the dynamic
medium of film. The analysis is based on a systemic-functional framework
for film, and the use of software which allows the editing of digital video
images in order to display visually the nature of different semiotic choices
across a range of systems. From this point, the problematic nature of such
an enterprise becomes apparent and possible directions for future research
are suggested.
The film medium parallels a significant dimension of our experience of
the world: it involves sequences of change and repetition in the visual and
auditory realm. Film, however, involves playing with time sequences in a
two-dimensional frame to represent our three-dimensional lived-in material
experience of the world where the faculties of hearing, sight, smell, taste
and touch are sources for sensory, and therefore semiotic, input. Thus while
limited in the sense that the discussion presented here only incorporates the
visual aspect of semiotic exchange, this paper is nonetheless a further tenta-
tive step towards incorporating the meaning of the dynamic in systemic-
functional theory. For it is not only the culmination of choices made across
semiotic resources in their interaction with other resources that makes
meaning, but also the temporal and spatial unfolding of those choices.
Although images of instances frozen in time may become lodged within our
consciousness, generally we do not make meaning from a series of snapshot
images of the world, but rather our daily experience of the world is based on
patterns of change; that is, meanings derived from systems in flux. Our
perceptual apparatus is oriented towards detecting and assimilating change
and contrast, rather than relying on the stability and continuity which, in the
normal course of events, we learn to layer on top of that experience. An
adequate model which accounts for our social construction of the world,
therefore, necessarily needs to account for changing states which have trad-
itionally been the concern of other domains, which include film theory,
mathematics, physics and studies of perception in cognitive science.
MCA does not search through Textual data directly in the search for patterns but
does so indirectly: it searches the corpus for patterns in descriptions which have
been previously created by the researcher using MCA's annotational tool. The
annotational patterns so far used in the construction of a corpus of car adverts
relate mainly to the metafunctional and phasal organization of the texts.
One aim of this paper is to suggest ways in which the user can directly search
for patterns in visual Textual data. In other words, I explain how com-
mercially available software can be used in conjunction with a visual
grammar to capture changing patterns in dynamic text. This exploratory
stage is viewed as a first step towards a new methodology afforded by the
electronic medium which could eventually be included in a system such as
Baldry's MCA. In addition, there is the potential to incorporate software
such as Systemics 1.0 (O'Halloran and Judd, 2002) in such applications
in order to analyse the linguistic choices as they unfold in time. The chal-
lenge remains for us to capture and analyse choices across all semiotic
resources in such a way that the dynamics of meaning-making can truly be
investigated.
Thus, transitions are not necessarily equated with the cutting from one shot to
another, nor indeed with what is happening in the visual. While transitions will
often be related to what is happening in the visual, this will not always be the case
[. . .] transitions, as Thibault (2000: 320) and Gregory (2002: 323) have pointed
out, essentially relate to changes in the metafunctional organization of the text and
as such may very well be related to changes in the soundtrack and not just to what
happens in the visual.
Video-editing tools, therefore, allow the user to highlight the different semi-
otic choices visually and view the impact of such choices when they com-
bine in the text in real time. The method which was adopted for this paper
involved the use of Adobe Premiere 6.0 to explore how salient semiotic
choices may be highlighted in a short extract from the film Chinatown. How-
ever, as previously noted, unfortunately it has not been possible to reproduce
still frames from this analysis in this publication due to Paramount Studio's
refusal to give copyright permission. Nonetheless, the results of the visual
analysis are described in some detail.
Genre
There are no rigid criteria to define the different genres of film (Bordwell and
Thompson, 2001). Some classifications are based on subject/theme (for
example, crime for gangster movies), while others are defined by emotional
effect (for example, amusement for comedy). Genre conventions are also
based on plot, thematic development, film techniques and iconography. Fur-
ther to this, genres change and new hybrid types are continually emerging.
However, despite this fluidity the audience generally recognizes genre con-
ventions. Genres are seen to be institutionalized and ritualized dramas 'which
are satisfying because they reaffirm cultural values . . . [such as] self sacri-
ficing heroism, the desirability of romantic love' (Bordwell and Thompson,
2001: 99). Bordwell and Thompson (2001) further explain that these reaf-
firmations distance the viewer from real social problems and the more finite
and anxiety-ridden aspects of life such as death, disease, breakdown and
insecurity. Genres may also be seen to 'exploit ambivalent social values and
attitudes' which 'arouse emotion by touching upon deep social uncertainties
but then channel those emotions into approved attitudes' (ibid.: 99).
Chinatown is a detective story with an investigative structure (Eaton, 1997).
'As Poe so clearly put it, the detective exists "to play the Oedipus"' (ibid.:
17), the truth seeker. Chinatown is a story where 'wrongs can ultimately be
uncovered but the seeker after truth is not only completely incapable of right-
ing them but his very search will only make matters worse' (ibid.: 21).
Chinatown is also recognized as film noir and, more specifically, reflects the
origins of the neo-noir. The subject of much study (for example, Christopher,
1997; Hirsch, 1981; Kaplan, 1998; Krutnik, 1991; Palmer, 1994; Tuska,
1984; Voytilla, 1999), film noir is a descriptive term for American crime films
from the early 1940s to the late 1950s in which doomed men are obsessed with seductive
women, as exemplified by Double Indemnity (1944) and Scarlet Street (1945).
In the 1960s and 1970s films with noir flourishes include Klute (1971), Play
Misty for Me (1971), Taxi Driver (1976) and Chinatown (1974).
Definitions of film noir vary but there seems to be general agreement that
the term designates films with a low-key visual style which contrasts to the
bright balanced studio look of the 1930s. There are noir movies of different
genres, for example, mystery, suspense thriller, psychological drama, and
gangster films (Krutnik, 1991). Critics generally agree that the narrative plot
is also oblique and often temporally confused. There is usually a
general mood of dislocation and bleakness, and the noir world is deceptive
and uncertain. ' "The world is a dangerous place" is one of the axioms of
noir' (Hirsch, 1981: 13).
Chinatown, however, is filmed in the non-expressionistic 'classical' style of
Panavision and Technicolor with a straightforward narrative style. Even
so, 'the cynicism and despair which permeates the social vision of the film
noir . . . is present . . . in the final act of this Polish exile's [Roman Polanski's]
film' (Eaton, 1997: 57-58). According to Eaton (1997: 58), the
depiction of Evelyn Cross Mulwray is where the noir-ish influence is most
obvious. '"The dark lady, the spider woman, the evil seductress who tempts
man and brings about his destruction" [Place, 1998: 47] is how the "female
archetype" of film noir has been characterized and this is the image of the
female lead which is now consciously evoked [in Chinatown]' (ibid.: 58).
The figure of the woman in film noir has been the focus of feminist film
theory since Chinatown was produced. The emergent new femme fatale in films
in the 1990s, for example, Basic Instinct (1992), is 'redefined as a sexual
performer within a visual system which owes as much to soft-core porn-
ography as it does to mainstream Hollywood' (Stables, 1998: 172-173). The
new woman takes an active role in initiating sexual practices which are
perceived as deviant, marginal or transgressive to the dominant culture. In
the analysis below, we shall investigate the semiotic construction of Evelyn
Cross in the role of 'spider woman' which has subsequently led to such
constructions of women in contemporary cinema.
[Figure residue: framework for the MISE-EN-SCENE. The Temporal-Spatial Frame; Complex Relation: The Shot. Headings: Visual Imagery; Soundtrack; Visual Imagery + Soundtrack. Labels recoverable from the flattened layout include Temporal Relation, Episode, Movement-Action-Event, Sequence of Sub-Actions, Side Sequences and Events, Interplay of Actions, Contribution to Narrative, Cause-Effect Relations, Relative Scale, Depth, Centrality, Relative Prominence, Duration, Clarity, Focus, Light, Parallelism and Opposition, Gestalt, Subframing, Relative On-Screen/Off-Screen Space, Camera Angle, Camera Level and Camera Distance; their original grouping is not recoverable.]
Gaze towards Jake (which may also be marked visually through vectors) is
oblique and so the viewer can openly scrutinize her face, Makeup and Cos-
tume throughout the extended Duration of the Image. After her husband's
funeral, Evelyn is wearing a black dress and a hat with a netted black veil
which covers the top half of her face. Her Gaze in effect is veiled. Jake
comments in the next Mise-en-Scene, 'And I still think you are hiding some-
thing'. Here the motif of distorted vision is reinforced. In this case, Jake is
not gazing through a camera or car mirror, rather he is trying to penetrate
the protective veil through which Evelyn views the world.
The use of Colour in the restaurant scene is significant for several reasons.
Digital colour matching (which can be displayed) reveals that Evelyn's red
lipstick exactly matches the colour of the couch upon which she is seated.
The motif of sexuality is represented through this use of the colour red in
Evelyn's makeup which coheres with the intimate setting. The characteriza-
tion of Evelyn as the 'spider woman' is thus created; she is veiled, oblique,
sexual and potentially dangerous. This portrayal of Evelyn largely remains
in place until the final scenes in the movie.
his lack of understanding of the situation. The intense gaze between Jake
and Evelyn, which accompanies her refusal of his offer to drive her home,
may be indicated visually by vectors. The On-Screen Space dominated by
Evelyn and Jake continues to remain perfectly balanced, and the analyst can
begin to appreciate how effectively the camera work and background setting
function in this Mise-en-Scene. In addition, there is a lightly coloured ban-
dage on Jake's nose which is marked with visual prominence despite its
cohesiveness with the background colours. This visual prominence of the
bandage is matched by the linguistic choices in the dialogue which takes
place as we shall see in a moment.
The triangle of social relationships between Jake, Evelyn and the car
attendant is construed visually as well as linguistically. The attendant is a
minor participant as indicated by his backgrounded physical position in the
Movement-Action-Event when Jake and Evelyn walk out of the restaurant.
Jake's use of the vocative 'sonny' in the command 'Wait a minute sonny'
reinforces this position. Jake's attempts at exercising power over Evelyn,
however, do not succeed.
Jake fails in his bid to drive Evelyn home, and there is a pause before he
turns to confront her. Evelyn remains detached and supposedly nonchalant
by focusing her Gaze on her gloves, which may be indicated visually by line
vectors. Evelyn's hand movements may also be highlighted visually to indi-
cate Gesture. After a short silence, the Interpersonal relations between Jake
and Evelyn intensify. The Gaze becomes direct and focused as the Proxemics,
which may be displayed by visual vectors, decrease. The Mobile Frame has
been brought into play so that the Camera Distance is decreased. This com-
positional strategy further draws the viewer into the exchange between Jake
and Evelyn. The Interpersonal intensity of Jake's delivery continues as he
explains that Evelyn's husband was murdered. Evelyn's Gaze, which again
may be marked by visual vectors, shifts downwards as Jake refers to her late
husband. Jake, however, continues regardless of Evelyn's silent response.
When Jake refers to a situation where he was physically attacked and his
nose sliced by a knife [hence the bandage], 'but Mrs Mulwray I goddamn
near lost my nose', the Interpersonal intensity of the exchange increases.
The use of vectors may explicitly demonstrate how distance in the Proxemics
has again decreased with a resulting increase in the intensity of gaze. In
addition, Jake's use of 'goddamn near' reinforces the affect of his speech to
Evelyn, which is somewhat mocking given that he addresses her as
'Mrs Mulwray'.
The climax in this Mise-en-Scene is reached when Jake accuses Evelyn of
'hiding something'. Here the motif of the truth seeker looking through a veil
of deception is reinforced. While he is correct that Evelyn is withholding
information, it is not exactly the sort that Jake envisages. However, in the
remainder of the street scene, Roman Polanski allows the viewer to gain
some insight into Evelyn's situation.
The final frames of the Mise-en-Scene capture one of the rare moments
in Chinatown where the Point of View switches from Jake to Evelyn. The
Conclusion
This necessarily incomplete description of the analysis of two Mises-en-
Scene from Chinatown seeks to describe how a visual grammar may be
applied to the dynamic visual image. In the discourse analysis of a linguistic
text, the analyst directly engages with the linguistic choices which have been
made in order to interpret the text. In a similar manner, the description of
this analysis seeks to demonstrate the effectiveness of directly engaging visu-
ally with a Mise-en-Scene to make salient the choices which have been made.
Through such an analysis, we start to appreciate the reasons why director
Roman Polanski favoured this particular scene in Chinatown.
The bright public street setting marks a stark transition from the intimate
restaurant scene where Evelyn's sexuality is marked. The compositional
aspects of the narrow street setting are perfect; the actors are framed
through perspective, on-screen space, colour cohesion and contrast. The
yellow tones of the background setting with light and shadows provided by
the sun, the buildings and other lighting effects further enhance the visual
salience of the two actors in the setting. The camera moves in to record the
growing intensity of the exchange between Jake and Evelyn against a back-
drop of day-to-day life which continues despite the drama being played out
before the viewer's eyes. Through the use of gaze, gesture and proxemics the
visual aspects of the interaction effectively construct Jake's growing frustra-
tion and anger with Evelyn in his search for truth. The camera later lingers
to capture a subtle shift in the point of view where the unenviable position
of Evelyn is signalled to the viewer. Jake's arrogance transforms her per-
ceived strength into a web of deceit and corruption which rightfully should
be attributed to her father.
Roman Polanski ensured that the usual generic conventions were not fol-
lowed in the movie Chinatown. In Robert Towne's original script, Evelyn is
saved and her father exposed. Thus the usual generic tropes such as 'love
triumphs' and 'youth defeats old age' and 'corruption resulting in a new
Notes
1 Despite repeated written requests to Paul Hrisko, the Manager for the Film Clip
Licensing Division for Paramount Studios, copyright permission to reproduce
still frames from the movie containing the analysis of Chinatown was not given. I
am, however, most grateful to Roman Polanski who kindly wrote in support of
my requests for copyright permission.
2 See Visual Communication (Sage Publications), a journal devoted to the theory and
analysis of visual images and multimodal texts.
3 See also Baldry (this volume) for the analysis of car advertisements.
4 See Iedema's (2001) social semiotic framework and analysis of a television
documentary. His framework consists of six levels: Frame, Shot, Scene, Sequence,
Generic Stage and Work as a whole.
Acknowledgements
I would like to thank Michael O'Toole for his kind permission to
reproduce Plate 5.1 from the CD-ROM Engaging with Art (Perth: Murdoch
University, 1999) [copyright Michael O'Toole] with acknowledgement to
the Rijksmuseum of Amsterdam for the original image of Rembrandt's The
Night Watch.
References
Baldry, A. P. (this volume) Phase and transition, type and instance: patterns in media
texts as seen through a multimodal concordancer, 83-108.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Bordwell, D., and Thompson, K. (2001) Film Art: An Introduction (6th edn). New York:
McGraw Hill.
Callaghan, J. and McDonald, E. (2002) Expression, content and meaning in language
and music: an integrated semiotic analysis. In P. McKevitt, S. O'Nuallain
and C. Mulvihill (eds), Language, Vision and Music. Selected papers from the 8th International
Workshop on the Cognitive Science of Natural Language Processing, Galway, Ireland,
1999. Advances in Consciousness Research, Volume 35. Amsterdam: Benjamins, 205-220.
Christopher, N. (1997) Somewhere in the Night: Film Noir and the American City. New York:
The Free Press.
Eaton, M. (1997) Chinatown. London: British Film Institute.
Gregory, M. (1995) Generic expectancies and discoursal surprises: John Donne's
The Good Morrow. In P. H. Fries and M. Gregory (eds), Discourse in Society: Systemic-
Functional Perspectives. Meaning and Choice in Language: Studies for Michael Halliday.
Norwood, NJ: Ablex, 67-84.
Gregory, M. (2002) Phasal analysis within communication linguistics: two contrast-
ive discourses. In P. Fries, M. Cummings, D. Lockwood and W. Spruiell (eds),
Relations and Functions within and around Language. London and New York: Con-
tinuum, 316-345.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Arnold.
Heisner, B. (1997) Production Design in the Contemporary American Film. Jefferson:
McFarland.
Hirsch, F. (1981) The Dark Side of the Screen: Film Noir. New York: Da Capo Press.
Iedema, R. (2001) Analysing film and television: a social semiotic account of Hospital:
An Unhealthy Business. In T. van Leeuwen and C. Jewitt (eds), Handbook of
Visual Analysis. London: Sage, 183-204.
Kaplan, E. A. (ed.) (1998) Women in Film Noir (rev. edn). London: British Film
Institute.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Krutnik, F. (1991) In a Lonely Street: Film Noir, Genre and Masculinity. London: Routledge.
Lemke, J. L. (1998a) Metamedia literacy: transforming meanings and media. In D.
Reinking, L. Labbo, M. McKenna and R. Kiefer (eds), Handbook of Literacy
and Technology: Transformations in a Post-Typographic World. Hillsdale, NJ: Erlbaum,
283-301.
Lemke, J. L. (1998b) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives
on Discourses of Science. London: Routledge, 87-113.
Lemke, J. L. (2000) Multimedia demands of the scientific curriculum. Linguistics and
Education, 10(3): 247-271.
Lemke, J. L. (2003) Mathematics in the middle: measure, picture, gesture, sign and
word. In M. Anderson, A. Saenz-Ludlow, S. Zellweger and V. V. Cifarelli (eds),
Educational Perspectives on Mathematics as Semiosis: From Thinking to Interpreting to Know-
ing. Ottawa: Legas Publishing, 215-234.
McInnes, D. (1998) Attending to the instance: towards a systemic-based dynamic
and responsive analysis of composite performance text. Unpublished Ph.D.
thesis. University of Sydney.
Martinec, R. (2000) Construction of identity in Michael Jackson's 'Jam'. Social
Semiotics, 10(3): 313-329.
O'Halloran, K. L. (2003a) Educational implications of mathematics as a multi-
semiotic discourse. In M. Anderson, A. Saenz-Ludlow, S. Zellweger, and V. V.
Cifarelli (eds), Educational Perspectives on Mathematics as Semiosis: From Thinking to
Interpreting to Knowing. Ottawa: Legas Publishing, 185-214.
O'Halloran, K. L. (2003b) Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In A.-M. Simon-Vandenbergen, M. Taverni-
ers, and L. Ravelli (eds), Grammatical Metaphor: Views from Systemic Functional
Linguistics. Amsterdam: John Benjamins, 337-365.
O'Halloran, K. L. and Judd, K. (2002) Systemics 1.0. [CD-ROM]. Singapore:
Singapore University Press.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
O'Toole, M. (1995) A systemic-functional semiotics of art. In P. H. Fries and M.
Gregory (eds), Discourse in Society: Systemic-Functional Perspectives: Meaning and Choice
in Language: Studies for Michael Halliday. Norwood, NJ: Ablex, 159-179.
O'Toole, M. (1999) Engaging with Art. [CD-ROM]. Perth: Murdoch University.
Palmer, R. B. (1994) Hollywood's Dark Cinema: The American Film Noir. New York:
Twayne Publishers.
Place, J. (1998) Women in film noir. In E. A. Kaplan (ed.), Women in Film Noir (rev.
edn). London: British Film Institute, 47-68.
Stables, K. (1998) The postmodern always rings twice: constructing the femme
fatale in 1990s cinema. In E. A. Kaplan (ed.), Women in Film Noir (rev. edn).
London: British Film Institute, 164-201.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso, Italy: Palladino Editore, 311-385.
Introduction
This paper is an attempt to understand how an institution and its objectives
become translated, transmitted and received through the hypertext
medium. The notion of hypertext is first clarified with the aim of abstract-
ing methodological categories which may be used for a semiotic analysis.
Following this, systemic functional models (Halliday, 1994; Kress and van
Leeuwen, 1996; O'Toole, 1994) are employed to examine the semiotic
choices made within a selected webpage, the Singaporean Ministry of Edu-
cation (MOE) site,1 in order to examine the meanings produced by these
choices and the context circumscribing this choice-making and meaning
production. The interaction of meanings across different semiotic instanti-
ations also features in this analysis.
Genesis of hypertext
The precedence of verbal over written language in human groups is firmly
acknowledged in conventional histories of writing, with only certain cultures
developing a recording-writing system for reasons of trade, religion or polit-
ics (Kress and van Leeuwen, 1996: 18-19). In Euro-American history, the
advent of print technology made recordable texts not only vastly replicable
but also more readily available compared to the past. In this sea of data,
however, information retrieval posed a serious difficulty because texts
remained in an unchangeable linear format.
Early theorists concerned with presenting and retrieving information
envisaged a system for providing complete access to the 'endlessly expansive
world of texts' (Tuman, 1992: 55). The term 'hypertext', coined by Ted
Nelson in the 1960s, was used to refer to a form of electronic text where the
mode of publication was characterized by 'non-sequential writing'; that is,
'text that branches and allows choices to the reader' in the form of 'a series
of text chunks connected by links which offer the reader different pathways'
through an interactive screen interface (Landow, 1997: 3). In the late 1960s,
theory moved towards reality when the Advanced Research Projects Agency
(ARPA) of the Department of Defence in the United States of America set
The crucial qualification 'makes possible' arises for two reasons: first, multi-
semiotic texts can be assembled by technology other than the hypertext;
second, a whole host of factors can curtail what hypertext affords; for
example, 'secure' websites that can only be accessed by certain knowledge-
able people (whether one possesses the password or is an expert hacker),
incompatible or missing software, lack of technical savoir-faire, and so on. On
another note, my definition excludes CD-ROM programs for standalone
computer workstations. These CD-ROMs, while possessing certain hyper-
text features (such as connected scrollable pages and multimedia), are not
related or potentially relatable to other webpages or software in a larger
connected network of workstations. This exclusion holds until a website is
created for supporting the said CD-ROM program in a web-browser win-
dow, in effect, making it relatable to other webpages. One is forced to admit
that technological innovation continues to problematize the notion of
hypertext.
soon see, these orders of abstraction are not necessarily related to each other
by constituency. Indeed, the orders of abstraction are different in nature to
the aforementioned semiotic ranks because hypertext is not a semiotic
resource, but a platform for the codeployment of different semiotic
resources. The orders of abstraction proposed for hypertext are ITEM, LEXIA,
CLUSTER and WEB. As these terms require theorization, I start with the lowest
order of abstraction and develop these concepts to the highest or most
inclusive category of hypertext.
Item
An ITEM is any instantiation from any meaning-making system that is sup-
portable by hypertext technology, and to date, these semiotic resources
include the linguistic, visual, music and phonic. The question of what
instantiation(s) count as an ITEM is necessarily preceded with a brief discus-
sion of ranks (in italic font below) in semiotic systems.
A linguistic instantiation such as 'I could fly' is easily identified as a
Clause. In contrast, the instantiation 'Move!' is simultaneously a Clause, a
Verbal Group and a Word. O'Toole (1994: 12) observes the same phenom-
enon in certain paintings where a Work may simultaneously be an Episode,
a Figure or simply a Member. Ostensibly, ranks within any one semiotic
system are not impermeable to each other. In any one semiotic, an ITEM may
therefore be a number of instantiations of different ranks of the one semi-
otic combining together as a discernible whole. In multisemiotic texts, an
ITEM could be an instantiation of one semiotic resource, or a combination of
instantiations of different ranks of different semiotic resources joining
together as a methodologically justifiable whole. In this light, ITEM encapsu-
lates this permeability of the ranks within and across semiotic resources.
What are the semiotic choices that contribute to a sign or a complex of
signs being designated as an ITEM? For either linguistic or visual semiosis,
they are the choices made in the Textual or Compositional metafunction
respectively. For a combination between the two resources, factors that
separate one ITEM from another crucially rest on the choices made in the
Compositional metafunction. These Compositional choices include those
from the system of Colour Cohesion, the system of Alignment and the
system of Gestalt: Framing (see Table 6.2). This is not meant, however, to
play down the fact that choices made in the other metafunctions in both
semiotic resources also contribute to the discreteness of a sign or complex of
signs, but that the justification for ITEM rests primarily on choices made in
the Compositional metafunction with regards to the Textual organization
of the typographical/graphical instantiation of the linguistic/visual semi-
otic choices.
As displayed in Plate 6.1, the order of ITEM could apply to a Word, a
boxed-up Clause(s), an Element of a stylized gust of wind, an Episode of a
man swatting a fly, the Work of an evening skyline serving as a background
graphic, or even a complex of signs.
Lexia
The word lexia derives from Roland Barthes (1974: 13—14) and stands for
the scrollable webpage; that is, the 'text composed of blocks of texts' that an
ergodist sees on the computer screen (Landow, 1997: 3-4). ITEMS, which
include hypertext links, become the constituents that make up a LEXIA. In
practice, LEXIAS can be 'short' or 'long' depending on how many ITEMS are
included and how they are organized. It is at this order of abstraction where
(multi)semiotic realizations are organized in some meaningful way in rela-
tion to others. 'Reality' is represented (multi)semiotically, and the ergodist
engages with, and is placed in a particular relation to, what is displayed and
the producers of that display. The relation between LEXIA and ITEM is one of
composition where a LEXIA is made up of ITEMS. Instances of LEXIAS and
ITEMS are in turn realized from choices made in the metafunctional systems
for different semiotic resources.
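
The compositional relation between LEXIA and ITEM described above can be sketched as a small data structure. This is only an illustration in Python under stated assumptions: the class names `Item` and `Lexia`, the field names and the sample values are all hypothetical and are not part of the framework itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """One ITEM: an instantiation (or complex of instantiations) on a LEXIA."""
    label: str                          # hypothetical descriptive name
    resources: List[str]                # e.g. ["linguistic"], ["visual"], or both
    link_target: Optional[str] = None   # set only when the ITEM is a hypertext link


@dataclass
class Lexia:
    """One LEXIA: a scrollable webpage composed of ITEMS."""
    name: str
    items: List[Item] = field(default_factory=list)

    def links(self) -> List[str]:
        """Return the targets of the ITEMS that function as hypertext links."""
        return [i.link_target for i in self.items if i.link_target is not None]


# Hypothetical sample: a homepage LEXIA made up of three ITEMS.
home = Lexia("home", [
    Item("masthead", ["linguistic", "visual"]),
    Item("background graphic", ["visual"]),
    Item("link to highlights", ["linguistic"], link_target="highlights"),
])
```

In this sketch a 'short' or 'long' LEXIA is simply one with fewer or more entries in `items`, and an ITEM that combines semiotic resources lists more than one entry in `resources`.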
Cluster
CLUSTER refers to a number of connected LEXIAS due to associations created
via hypertext links. These hypertext links are classified as 'LEXIA internal' as
they are located within the LEXIA itself and serve to 'call-up' another LEXIA
should the ergodist click on it. With hypertext links, one agency (institution,
company, collective or individual) can link its many LEXIAS in such a way as
to suggest (and so limit) the multidirectionality of traversing the LEXIAS that
make up one CLUSTER. The notion of CLUSTER thus overlaps with the notion
of a producer-created path, because it is the producers of particular LEXIAS
who place hypertext links that in turn suggest or determine a pathway or
pathways through the CLUSTER. A CLUSTER can appear discrete from others
by means such as strategic placing of 'Back', 'Forward', 'Back to Homepage'
buttons or even a sidebar with hypertext links to other LEXIAS within the
CLUSTER.
A complication to this order of abstraction may be the fact that a num-
ber of LEXIAS associated by one agency via hypertext links can join with or
overlap with others as a result of hypertext links put up by the same agency
or some other. This is not only a remote possibility, but an avenue exploited
by agencies who insert a hypertext link on their own LEXIA that links to a
larger number of associated LEXIAS. Pushed to its logical extreme, this
notion breaks down what is authoritatively the CLUSTER belonging to a par-
ticular agency. For example, in December 1999, a hypertext link on the
MOE homepage linked directly to a webpage belonging to the Housing
Development Board of Singapore (HDB), which was in turn linked with a
vast series of LEXIAS that the HDB produced. One asks where the MOE
CLUSTER ends and the HDB counterpart begins? This is precisely the prob-
lem of designating CLUSTERS based on agency. The notion of CLUSTER is
thus not concerned with agency per se, but with associations formed via hypertext
links. These links are finite, and a CLUSTER 'rounds off', or starts becoming a
more discrete entity from other CLUSTERS with the termination of links.
While the CLUSTER is constituted by LEXIAS based on internal hypertext
links, these are temporal and changeable, thus making the associations
between LEXIAS transient and mutable. CLUSTER is as such 'virtual' and an
observable disjunction occurs between this order and those of LEXIA
and ITEM.
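
One way to make this link-based notion of CLUSTER concrete is to treat LEXIAS as nodes and LEXIA-internal hypertext links as directed edges, and to collect everything reachable from a starting LEXIA. The following is a minimal sketch under that assumption; the function name `cluster_from` and the page names are hypothetical placeholders, not a record of the actual MOE or HDB sites.

```python
from collections import deque
from typing import Dict, List, Set


def cluster_from(start: str, links: Dict[str, List[str]]) -> Set[str]:
    """Collect every LEXIA reachable from `start` by following internal links."""
    seen: Set[str] = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:      # the search 'rounds off' when links run out
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link structure: one agency's homepage links to another agency's.
links = {
    "moe_home": ["moe_highlights", "moe_corporate", "hdb_home"],
    "moe_highlights": ["moe_home"],
    "moe_corporate": ["moe_home"],
    "hdb_home": ["hdb_services"],
    "hdb_services": [],
}

cluster = cluster_from("moe_home", links)

# Removing the single cross-agency link changes the CLUSTER, which illustrates
# why the associations are described as transient and mutable.
links_without_hdb = dict(links, moe_home=["moe_highlights", "moe_corporate"])
smaller_cluster = cluster_from("moe_home", links_without_hdb)
```

Under these assumptions the first CLUSTER spans the pages of both agencies, while the second contains only the three pages beginning with `moe_`, so the boundary follows the links rather than the agency.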
Web
WEB is the number of LEXIAS associable through hypertext links and other
facilities internal and external to a LEXIA. Facilities that are LEXIA internal
(but are not hypertext links) include search engines situated within a LEXIA,
while LEXIA external facilities are those provided, for example, by the web-
browser software. These appear on the web-browser window and include
the 'Forward', 'Back' and 'Home' buttons among other options. LEXIA
external facilities also include the hardware, or the cable connections
between computers. This notion of WEB thus includes LEXIAS potentially
relatable to each other by Local Area Networks (LANs), such as Ethernet,
that join sets of machines within an institution or a part of one and also
Wide Area Networks (WANs) that join multiple organizations in widely
Figure 6.1 Relation of culture, situation, semiotic resources and lexia (adapted
from Halliday, 1991)
seriously the Government's call for workers to upgrade their skills to find a place
in the new knowledge economy.
As the educational arm of the PAP, the MOE works with such an end in
mind. In a public release, in the section entitled 'Cornerstone of education
policy', the MOE reveals that one of its chief foci is 'the development of
human resources to meet Singapore's need for an educated and skilled
workforce' (Ministry of Education, Singapore 2000). Out of this context
construed by the PAP and, more specifically, the MOE, the homepage
under consideration is erected.
Another configuration of context, comprising the production norms for
webpages, forms a necessary second step to contextualize the MOE
homepage. LEXIAS can be constructed for a range of purposes. One such
purpose is the display of information. Webpages that only serve this purpose
emerge as 'content heavy'. Other webpages are used for administrative
purposes such as gathering feedback and so possess features whereby the
ergodist can 'enter' whatever he or she wishes. A particular type of webpage
serves the function of welcoming and introducing the ergodist to a series of
linked webpages. Such a webpage is commonly referred to as the
'homepage', since it is held to be the locus point to all the other linked
webpages. Apart from welcoming and introducing the ergodist, homepages
may also serve as an index of varying degrees by having visible hypertext
links to the linked webpages.
The norms associated with a homepage provide an insight into one
aspect of the context that produces it. Most homepages have the generic
layout of masthead in the topmost position with various texts and hyper-
text links beneath. This layout is generally adopted by commercial and
institutional organizations perhaps because apart from welcoming and
introducing, it foregrounds the corporate identity behind the website. With
the identity of the 'seller' disclosed, the ergodist as consumer may in 'good
faith' accept the material goods, services or information proffered by the
website. Nonetheless, some websites do play with the rigid style of presentation
or depart from it altogether to increase their engagement with the
ergodist. This is done either by experimenting with the different semiotic
resources in the hypertext environment or communicating in novel
ways through uniquely hypertext facilities to create a greater sense of
dynamism and unpredictability. For example, homepages may flout conven-
tion by duplicating and relocating the masthead vertically at the sides of the
webpage, and such columns of words may flash alternative colours
sequentially.
Whatever the case may be, the purposes served by a homepage are cir-
cumscribed by situational and cultural demands of context. Context thus
stands as a necessary preface to any semiotic analysis. With this in mind, one
may enter into an exploration of the semiotic choices and hypertext facilities
employed by the MOE homepage.
Table 6.1 Halliday's functional systems for language (adapted from O'Toole,
1999)
Table 6.2 Ranked functional systems for the visual semiotic (adapted from
O'Toole, 1999)
Bold (such as the ITEMS under 'Web Sites of Interest' and 'Corporate Infor-
mation'). Because these sections are rectilinear and stacked vertically, the
Gestalt is one that positively suggests stability or negatively an absence of
dynamism (O'Toole, 1994).
The organization of linguistic and visual instantiation of this webpage
reflects a certain trend. If one were to consider the linguistic texts on the
webpage, the selection is Relative Position In Gestalt: Formatting: Left Justified,
meaning strings of words are aligned from the same vertical point of
departure starting from the left. This left justification relates to the reading
practice associated with English texts, which runs left to right and then down to the row below.
Additionally, each of the hypertext links under 'Highlights' and 'Corporate
Information' has a graphic bullet that indicates the start of a 'new point' as
well as a distinct hypertext link. These bullets therefore function to draw the
eye to the right and to signal the intended discreteness of linguistic instanti-
ations. In much the same way, the MOE Shield at the top left corner of the
webpage calls attention to itself while bulleting the 'main point' of the
homepage: the Ministry of Education, Singapore.
More so than in other multisemiotic texts, the 'putting together' or con-
struction of a hypertext involves a heightened awareness of bringing separ-
ate elements together in spatial relation to each other. This construction is
fundamentally achieved through Hypertext Mark-Up Language (HTML)
that is used to 'write' computer commands which execute the webpage as
seen on-screen. A source code thus details a particular webpage's HTML
consisting of commands enclosed in pointed brackets, from simple ones such as
'<P align=centre>' to more complex ones such as '<TABLE border=0 cellPadding=5
cellSpacing=5 width='101 per cent'>'.
In addition, sequentiality in the source code usually translates to the
actual webpage displayed, as evinced by a simple comparison between the
given source code and the MOE homepage. The HTML of the source code
thus implicates a deliberate writer who is conscious of the spatial ordering of
texts as they appear on a webpage.
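
The claim that sequentiality in the source code carries over to the displayed page can be illustrated by listing the opening tags of a fragment in the order in which they occur. The sketch below uses Python's standard `html.parser` module; the class name `TagOrder` and the HTML fragment are hypothetical and only loosely modelled on a masthead-then-links layout, not taken from the MOE source code.

```python
from html.parser import HTMLParser
from typing import List


class TagOrder(HTMLParser):
    """Record opening tags in the order they occur in the source code."""

    def __init__(self) -> None:
        super().__init__()
        self.order: List[str] = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lower case.
        self.order.append(tag)


# Hypothetical fragment: a shield graphic first, then text, then a link.
source = """
<table border="0" cellpadding="5" cellspacing="5">
  <tr><td><img src="shield.gif" alt="Shield"></td></tr>
  <tr><td><p align="center">Highlights</p></td></tr>
  <tr><td><a href="corporate.html">Corporate Information</a></td></tr>
</table>
"""

parser = TagOrder()
parser.feed(source)
print(parser.order)
```

In this fragment the image precedes the paragraph and the paragraph precedes the link in the recorded order, which matches the top-to-bottom order in which a browser would normally lay out the three table rows.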
Representational choices
The above choices not only underscore the MOE as most salient (and this is
a matter of course since it is the MOE homepage) but they also work in tandem
with Representational choices to construe the MOE's institutional 'face'.
Contextualized with other homepages, the MOE homepage does 'reassure'
with its 'generic' layout of masthead at the top with various texts and hyper-
text links beneath it. As mentioned in the discussion on context, this layout is
adhered to through a choice in Portrayal to foreground the corporate agency,
and this functions to increase credibility to the end of encouraging the
ergodist to 'buy' what is offered on-screen. In the case of the MOE, it is
information on local education-related issues that is being 'sold'.
Nonetheless, there are websites that play with the rigid style of presenta-
tion, or depart from it altogether, to create a greater sense of dynamism and
Scrollability
Before finishing the analysis at the order of LEXIA, one particular hypertext
feature gives cause for further thought. Due to several factors, such as a non-
maximized web-browser window or a small monitor display, a LEXIA may
only be presented in part. One facility hypertext opens up is what I call
'scrollability' which determines how the semiotic choices ultimately contact
the ergodist. A deliberately lengthy or wide webpage exploits scrollability
while simultaneously marking it as a feature for the ergodist.
The feature of scrollability has two types: vertical and lateral. As the
default display of webpages is always the topmost and leftmost portion first,
this means that for small displays, the option to scroll laterally arises, in
which case one must always start from the left. The more common case is
the vertical scrolling option, starting always from the top. Noting this default
top-left display, it is not surprising that webpage designers usually situate
what they deem as more important in these 'guaranteed viewing areas'. In
the light of scrollability, the preceding discussion needs re-examination
because even with the largest monitor display presently available and maxi-
mization of the web-browser window, the MOE homepage is only fully
'read' by scrolling downwards. The downward scrolling process is repro-
duced in Plate 6.4.
The initial window rules out all those ITEMS under the heading 'web sites
of interest' and below, ensuring that the already prominent masthead is even
more salient. Ostensibly, the convention of locating the most important
information (in this case the MOE masthead) at the top is a recognition of
the default top-left display.
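The notion of a 'guaranteed viewing area' can be made concrete with a small sketch. The layout below is hypothetical: the ITEM names echo the discussion, but the pixel values and the window size are invented for illustration and are not measurements of the MOE homepage.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    top: int     # distance from the top edge of the page, in pixels
    left: int    # distance from the left edge of the page, in pixels
    height: int
    width: int


def visible_without_scrolling(item, viewport_width, viewport_height):
    """True if the whole item lies inside the initial top-left window."""
    return (item.top + item.height <= viewport_height
            and item.left + item.width <= viewport_width)


# Hypothetical layout; all pixel values are assumptions.
items = [
    Item("masthead", top=0, left=0, height=120, width=760),
    Item("news links", top=140, left=0, height=200, width=760),
    Item("web sites of interest", top=520, left=0, height=180, width=760),
    Item("contact information", top=900, left=0, height=60, width=760),
]

guaranteed = [i.name for i in items if visible_without_scrolling(i, 800, 480)]
print(guaranteed)
```

Under these assumed values only the masthead and the first block of links are seen without scrolling; everything from 'web sites of interest' downwards requires vertical scrolling.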
What is deemed most significant is situated at the said guaranteed viewing
areas with the rest arranged in a descending sequence according to import.
This overall arrangement contributes significantly to how the ITEMS
are read. The questions of how the MOE homepage acts on the reader and
what assumptions are embedded in this representation of the MOE may
now be more fully explored in an analysis at the order of ITEM.
Plate 6.4 Scrolling sequence of the MOE homepage
The hypertext links under 'Web Sites of Interest' can thus be seen as primar-
ily expansions of what are institutional-governmental goals rather than
what may be of some interest to the ergodist.
While one may examine any one of these ITEMS for any length of time,
the leftmost ITEM draws back the eye through the hypertext facility of
animation. For the hypertext link 'Teacher: Create a sense of wonder. Offer
new perspectives' the image of the magnifying glass over a flower morphs
into a girl in mid-jump and then back again in perpetual recursion. As a
complex extension of the visual semiotic, animation necessitates further
research which, however, is beyond the scope of this enterprise. Nonetheless,
this conscious use of animation implies that the MOE made a decision to
foreground this particular ITEM. This link's power of attraction is also
enhanced by the possession of two of the only three Mood: Imperative
clauses which, in effect, level a 'direct' address at the ergodist. Both anima-
tion and the rhetorical stance carried by this ITEM function to attract and
situate the ergodist as someone who can 'Create a sense of wonder' and
'Offer new perspectives'. In the recent context of a nationwide campaign to
enlarge the teaching workforce, the relative magnetism of this ITEM
becomes meaningful when one recognizes the fact that it serves as a link to
another webpage that encourages individuals to join the teaching profes-
sion. Regardless, the general paucity of direct address may be due to an
aspiration towards a formal, objective register which interacts with the
'headline' convention of hypertext links as discussed above.
Steps J and K comprise an ordered bi-column arrangement of linguistic
hypertext links as in Steps C and D. Notably, under 'Corporate Informa-
tion', a choice from Gestalt: Framing tabulates the hypertext links. Represen-
tationally the rectilinear framing is a choice which projects stability and
immutability which is meant to accord with the corporate, definitive nature
of the information. In this light, perhaps among other reasons, the linguistic
hypertext links in Steps C and D are not framed because they are by nature
time sensitive. For example, in January 2000 the hypertext link for the '2nd
AEMM Education Ministerial Meeting' appeared while simultaneously the
hypertext link to 'ThinkQuest-Singapore' was dropped.
In Step L, a border marks off the final portion of the webpage which
contains the MOE's contact information and the 'Last Updated' date in
small fonts. This contact information is obligatory insofar as authenticating
the website and providing an avenue for dialoguing with the MOE. How-
ever, this information may perhaps be obscured because it is deemed rela-
tively less newsworthy to the purposes of the website which acts as a media
arm for the MOE. As evidenced by choices in Relative Prominence, contact-
ing the MOE through any of the channels laid out in the contact informa-
tion is downplayed as an option for the reader. What is instead deliberately
highlighted are the definitive statements found in the LEXIAS that the MOE
has already scripted for the MOE CLUSTER.
Backtracking the steps of the reading path, one finds an increasing sig-
nificance associated with the ITEMS, with Step A housing the 'main subject'
from which the rest of the webpage is understood: the MOE. As the tour at
the orders of LEXIA and ITEM shows, different texts on a webpage stand out
differently due to various Modal, Compositional and Representational
choices, pulling in the ergodist's gaze at every step of the reading path. In
turn, these choices as a whole reflect an image of the MOE as construed by
the MOE homepage for the ergodist: as the authoritative voice on local
education in service of the governmental goal of economic viability through
an educated workforce in the global marketplace.
So far, the semiotic choices explored at the orders of LEXIA and ITEM
have been rather brief owing to the limitations of space. Nonetheless, this
sketch sets the backdrop against which a more delicate account of semiotic
activity may be further explored and detailed. This activity is exemplified
with a focused examination of Step A in the MOE homepage.
An account of intersemiosis
Visual semiosis
The ITEM that stands out as the CVI in the MOE homepage is that com-
plex of signs constituting the masthead, reproduced in Plate 6.6.
chosen at the programming stage from a larger range of font styles. Any
other font style apart from 'Arial' thus implies a certain degree of delib-
erateness. The masthead with its non-conventional font is thus a deliberate
choice to make it stand out from the rest.
The 'effect' of Modal choices is thus intimately tied to how the texts are
arranged in meaningful relation to each other, that is, the Compositional
choices made. Gestalt: Framing is selected for the masthead via a border with
equidistant light and dark intensities of colour, suggesting both variation
and regularity. The strong rectangular frame at once mirrors the rectilinear
frame of the web-browser window and is echoed by the grid-like pattern
within itself. Although the criss-crossing lines segment and may thus fracture
the surface of the masthead, the continuity of the words 'Ministry of Edu-
cation, Singapore' over the surface evokes at the very least a closely pieced
together surface without chinks. What remains is a Parallelism connecting
these geometric Forms which relate 'to the horizontal axis and the vertical
axis [. . .] [and] contribute to stability and harmony' (O'Toole, 1994: 23).
What is crucially conveyed by the Modal and Compositional choices are
the discreteness, centrality and stability of the masthead. Representationally,
the masthead with its tiled texture and patterned border suggests among
many things some flat human-worked surface. Two important observations
can be made here: first, the range of visual meanings is suggested by the
actual choices instantiated; and second, while the meanings are, according
to Barthes (1977: 38-39), 'polysemous', they are nonetheless finite (Kress
and van Leeuwen, 1996: 16).
Visual-linguistic intersemiosis
Barthes's (1977) attempt to 'fix' visual meanings has been criticized because
it makes visual meanings dependent on linguistic choices, a phenomenon he
called 'anchorage'. Nonetheless, perhaps Barthes observes part of a more
complex process. Analyzing the masthead once again, the uncertainty of the
visual meanings is clarified somewhat as it interacts with the linguistic mean-
ings it frames. The meanings of the Nominal Group may be uncovered by
examining the choices realized in its structure as we may see in Table 6.3.
At the rank of Word, the Lexical 'Content' of the noun head 'Ministry'
allows for these taxonomic meanings:
At the rank of Nominal Group, the options 2 a, b, c, 3[b] and 4 are excluded
by the following choices in Modification: the premodifying definite article
Head
(a) From 1a: The Ministry of Education headed by the Minister of Education.
(b) From 1b: The building of this government department.
(c) From 3 [a]: The body of ministers of this government department.
(d) From 5: The administration of this government department.
Table 6.4 Experiential relations between the masthead and the mission
statement
a sentient Participant Actor. This is complemented with the option Font: Font
Style: Italics which also imbues the mission statement with a sense of dyna-
mism, implicating an animate Actor. These options act to specify the
linguistic meaning of the masthead as (a) and (c) (see above). I call this
'Specification 2'. As can be observed, a disjunction arises between Specifica-
tions 1 and 2. The MOE as imaged by Specification 1 is solid, concrete,
immovable and non-living. In contradistinction, Specification 2 suggests the
MOE as the animate agent shaping Singapore's future. In this sustained
ambiguity, the MOE is (re)presented as a human agency who is at the same
time 'faceless', impenetrable and incontestable. This depiction of the MOE
derives perhaps from the premises of its uncontestable authority with
respect to educational matters and its existence as an arm of the PAP.
Abstraction of intersemiosis
This discussion has been concerned with the way meanings across instanti-
ations of various semiotic resources interact with one another to give a new
meaning or set of meanings. This complex interaction and production of
meanings between instantiations of different semiotic resources is called
'intersemiosis'. Though the prior analysis of Step A is sequenced as Specifi-
cation 1 followed by 2 in keeping with the suggested reading path,
intersemiosis does not in fact depend on any one sequence, but upon the
meanings first conveyed by each instantiation. In other words, for multi-
semiotic texts, there is no binding unidirectionality or sequentiality for
meaning interaction. Rather, one instantiation comes into relation with
another, and each simultaneously specifies the other.
Intersemiosis as discussed so far has been circumscribed by Compositional
choices such as Gestalt: Framing and Relative Position: Proximate that relate
instantiations that are spatially 'grouped'. A more complete notion of
intersemiosis recognizes that choices from the Modal and Representational
systems can also bring instantiations that are spatially distant or ungrouped
into significant relations for the interaction and production of meanings.
However, these non-Compositional factors for intersemiosis can only be
pursued outside the confines of this paper.
An abstraction of the stages of visual-linguistic intersemiosis may be
offered at this point as Relation, Intersection and Manifestation (collectively
RIM):
Conclusion
This undertaking has been an exercise in increasing specificity. That is,
against an expansive range of discourse on hypertext, four abstract orders
of hypertext are posited, out of which the two lower orders of LEXIA and
ITEM are identified as sites for semiosis. At these lower orders of abstraction,
a multisemiotic analysis was applied to the MOE homepage to uncover the
meaning-making choices which construe the MOE. A further particulariza-
tion occurs when intersemiosis is demonstrated at the level of delicacy of
two ITEMS. Finally, this exploratory attempt culminated in an abstraction
of the process of intersemiosis, namely Relation, Intersection and
Manifestation (RIM), which approaches the problem of how to illuminate
this complex phenomenon.
The issue of whether non-linguistic semiotic resources are systemic raises
the question of the validity of extending the notion of the systemic
metafunctions beyond language. The contention that there may not exist a
stratum of 'grammar' for a non-linguistic semiotic resource and that, even if
there is, this stratum is of a comparable nature to that of language becomes
an issue. These theoretical questions remain still very much questions in
themselves and there is no reason to date to reject the notion that non-
linguistic semiotic resources are systemic and tri-metafunctional. This is
not to say that the metafunctional systems between semiotic resources are
identical. That is patently untrue for the simple reason that different semi-
otic resources have different ways of meaning, and so have in themselves
different meaning-making systems. The systems proposed for non-linguistic
semiotic resources are markedly different from the linguistic. One crucial
question may be whether non-linguistic semiotic resources serve non-social
functions. The notion that semiosis is necessarily social seems to secure the
notion of the three metafunctions (see Kok, 2001).
While exploring the systemic choices in the MOE homepage, my analysis
has worked with a suggested reading path. This does not, however, rule out
the fact that an ergodist can focus initially on an ITEM other than the CVI,
or in a similar fashion, can work through a different sequence of engaging
with the ITEMS on a LEXIA depending upon the immediate contextual
factors such as the number of times the website has been viewed. Further-
more, what is immediately demanding of attention for one particular cul-
ture may not be so for another, although acculturation across cultures is
becoming more frequent with the spread of mass media, of which hyper-
text is a part. Further to this, it appears that various meaning-making
choices and facilities in hypertext, as demonstrated, function to secure certain
sites of immediate visual engagement so that a CVI becomes visually
prominent.
This enterprise has been unwilling to divorce hypertext from contextual
use because as a means of communication, hypertext only acquires its
richness and definition from its use in the social realm. The functions of
hypertext are not wholly determined either by technology or society, but
by technology used in society. As future innovations in communicative
technology surface, new ways of meaning-making will be introduced.
What has been suggested in the course of this undertaking are some of
the new systems of meaning-making enabled by hypertext. However, fur-
ther work is needed to account for the many other systems opened up in
this new platform. Nonetheless, the value of this work lies in its potential
to explicate the process through which semiotic choices are made, how
they are made, for what purposes and to what effect. It is hoped that this
has provided some answers to enquiries concerning the shifting ways of
communication and works towards a fuller disclosure of multisemiotic
activity.
Notes
1 Due to publishing constraints, the MOE homepage could not be reproduced in
colour. As colour is an important resource for meaning, these constraints some-
what compromise the reader's interpretation of the webpage and the analysis
presented here. However, every effort has been made to overcome this deficiency.
2 Although it seems counterintuitive to say that means of writing prior to the
printing press or the typewriter are technologies, 'the papyrus roll and the vel-
lum manuscript also exemplify technologies of writing . . . [as] . . . both required
devices: the reed pen and papyrus in ancient Egypt, and the quill and parchment
in the Middle Ages' (Snyder, 1997: 1).
Acknowledgements
Plates 6.2, 6.3, 6.4, 6.5 and 6.6 are reproduced by courtesy of the Ministry
of Education (MOE), Singapore. The screenshots of the MOE homepage
were captured on 7 January 2000.
References
Aarseth, E. J. (1997) Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns
Hopkins University Press. Available from: http://www.hf.uib.no/cybertext/
Ergodic.html.
Barthes, R. (1974) S/Z (R. Miller, trans.). New York: Hill and Wang.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. London: Fontana, 32-51.
Bohle, R. (1990) Publication Design for Editors. New Jersey: Prentice-Hall.
Halliday, M. A. K. (1991) The notion of context in language education. In
T. Le and M. McCausland (eds), Language Education: Interaction and Development:
Proceedings of the International Conference, Ho Chi Minh City, Vietnam 30 March-1 April
1991.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Hasan, R. (1996) Ways of Saying: Ways of Meaning. Selected Papers of Ruqaiya Hasan.
London: Cassell.
Heng, G. and Devan, J. (1995) State Fatherhood: the politics of nationalism, sexual-
ity and race in Singapore. In A. Ong and M. G. Peletz (eds), Bewitching Women,
Pious Men: Gender and Body Politics in Southeast Asia. Berkeley: University of California
Press, 195-215.
Kok, K. G. A. (2001) What is material about hypertext? Unpublished masters thesis.
National University of Singapore.
Kress, G. (2003) Literacy in the New Media Age. London: Routledge.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication Discourse. London: Arnold.
Landow, G. P. (1997) Hypertext 2.0: The Convergence of Contemporary Critical Theory and
Technology. Baltimore: Johns Hopkins University Press.
Lemke, J. L. (1998) Metamedia literacy: transforming meanings and media. In
D. Reinking, L. Labbo, M. McKenna and R. Kiefer (eds), Handbook of Literacy and
Technology: Transformations in a Post-Typographic World. Hillsdale, NJ: Erlbaum,
283-301.
Lim, B. L. L. (1998) Hypertext fiction: a narrative analysis. Unpublished honours
thesis. National University of Singapore.
Ministry of Education, Singapore. (2000) Education in Singapore; available from
http://wwwl.moe.edu.sg/educatio.htm.
Moore, M. (1994) Introducing the internet. In Wired Magazine: The Internet Unleashed.
Indianapolis: Sams Publishing, 4-19.
O'Halloran, K. L. (1999) Interdependence, interaction and metaphor in multi-
semiotic texts. Social Semiotics 9(3): 317-354.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University
Press.
O'Toole, M. (1999) Functions and Systems in Verbal and Visual Texts. Paper presented at
the 26th International Systemic-Functional Congress. Regional Language
Centre (RELC), Singapore, 26-30 July 1999.
Reader's Digest Oxford Complete Wordfinder, The (1994) London: The Reader's Digest
Association Limited.
Snyder, I. (1997) Hypertext: The Electronic Labyrinth. New York: New York University
Press.
Straits Times, The (9 February 2000) Internet a driving force.
Tuman, M. C. (1992) WordPerfect: Literacy in the Computer Age. London: Falmer Press.
Unsworth, L. (2001) Teaching Multiliteracies across the Curriculum: Changing Contexts of
Text and Image in Classroom Practice. Buckingham, UK: Open University Press.
Wee, C. K. A. (1999) Multi-semiotic analysis of advertisements. Unpublished hon-
ours thesis. National University of Singapore.
Part III
Print media
7 The construal of Ideational meaning in print
advertisements
Introduction
The investigation of the intricacies, complexities and nuances of multi-
semiotic texts has been the focus of recent research. This arises from the
observation that 'language, and typological modes of semiosis generally,
have evolved to work in partnership with other, often more topologically
grounded, semiotic systems' (Lemke, 1998: 111). O'Toole (1994), Kress and
van Leeuwen (1996, 2001), Lemke (1998), Wee (1999), O'Halloran (1999)
and Baldry (2000) have made significant strides within this area of multi-
semiotic text analyses from a systemic-functional perspective.
This paper aims to contribute to the development of a theoretical frame-
work and vocabulary for the articulation of meaning in multi-semiotic texts
as research in this realm has not been as extensive as the examination of
purely linguistic texts. To limit text analyses to only the linguistic aspect and
disregard the non-linguistic features such as graphs and diagrams is tanta-
mount to annihilating the efflorescence of meaning that can emerge from a
multi-semiotic analysis. As aptly stated by Wee (1999: vi):
Compared to text with a single semiotic code, the meaning potential of multi-
semiotic texts is greatly expanded. Hence, meaning creation becomes an inter-
active, dynamic and symbiotic process.
to attract attention . . . [and] realized in the written mode through the manage-
ment of the visual layout, the typeface patterns and/or the presence of pictures.
According to Hasan (1996: 41), the Focus 'single[s] out that which is being
advertised'. However, while stating that the Focus can be visually realized,
Hasan (1996) does not clarify whether the Focus has a linguistic realization as
well. Hasan (1996) also establishes the presence of a visual aspect to the
Justification, but in a similar manner does not include the component to give
a 'detailed account of other elements of structure for an advertisement'
(Hasan, 1996: 42). Suffice it to say that Hasan's (1996) generic structure
captures to some extent the multi-semiotic nature of advertisements.
Following Hasan's proposal, there is a need to provide a more detailed
account of generic structure for advertisements. Hasan's (1996) model does
not make explicit the complexities involved in the interaction between visual
images and linguistic text in advertisements. It is the aim of this paper to
provide a model that best captures the multi-semiotic interaction between
visual images and linguistic text in print advertisements.
Based on this limited study of print advertisements, the Generic Structure
Potential or GSP which '[expresses] the total range of optional and obligatory
elements' (Halliday and Hasan, 1985: 64) for advertisements may be
captured as:
Five advertisements are analysed in this paper: the Golf, the Epson, the
M1, the Beetle and the Guess? advertisements, which are displayed in
Plates 7.1-7.5. I discuss in the next section why the Lead and the Emblem
are designated obligatory elements while the others are optional.
The Lead
The discussion that follows details the characteristics and function/s of the
various components that constitute the Generic Structure Potential of a
print advertisement. I will begin with the Lead.
The Lead is thus termed as it is Interpersonally most Salient (Kress and
van Leeuwen, 1996) through choices in size, position and/or colour. The
Lead is illustrated in Plate 7.1. On its own, the Lead has a wide spectrum in
terms of meaning potential, that is, many possible meanings emanate from
the Lead. Interpreted independently of the Announcement, Enhancer, Dis-
play and Emblem, the Lead is figuratively an efflorescence of meaning. For
example, the sensual looking female who is the Lead in the Golf advertise-
ment (Plate 7.1) could be calling to attention the new millennium look or she
could be an ambassador for women's rights. Therefore, on its own, the Lead
has a bounty or a kaleidoscope of possible meanings.
As I explain below, the Lead consists of the Locus of Attention (LoA) and
Complements to the Locus of Attention (Comp.LoA). There is an element
in the Lead that by its very Salience, be it an unusual quality that challenges
reality or outstanding size, colour and so forth, arrests the attention of the
viewers. In Plate 7.2 depicting the Epson advertisement, it is the splash of
the water outside the boundaries of the photograph. This attention-
arresting element is termed the 'Locus of Attention' (LoA). The LoA
embeds the central idea of the advertisement, that Epson produces lifelike
quality prints. The three-fold functions of the LoA include Interpersonally
attracting attention, and Ideationally construing reality in a way intended by
the advertisers, where the viewer's perception of reality is manipulated.
Textually, it is a springboard for further development of the central idea, for
example, that Epson produces lifelike quality prints in the linguistic text that
follows. 'The text serves to elaborate' the visuals (Kress and van Leeuwen,
1996: 194). But by what specific strategies/systems, we are left uninformed.
The following discussion serves to explain this.
Visually, the LoA encapsulates the central idea that Epson produces life-
like prints. This central idea is reiterated in the linguistic text. That is, there
is a linguistic equivalence (be it in the form of sentences or particular lexis)
that coheres ideationally with this central idea conveyed in the LoA. Idea-
tionally, the following linguistic items, including clauses and nominal groups,
encapsulate tightly and parallel the idea embedded in the LoA, that is,
Epson produces lifelike prints:
(1) Comp.LoA1: the two boys beside the LoA, who are reduced in size and
are lacking in affective appeal, and therefore visually less inviting than
the smiling LoA. Their Stylization (O'Toole, 1994) differs from the
model's. They do not hold the M1 placard, and are not smiling, which
implies also that they would never be able to say 'Everything they offer is
brighter, nicer and more fun'. The Comp.LoA1 subordinates itself to bring
the LoA and the Emblem into focus. The LoA and the Emblem become
the confluence of all attention.
(2) Comp.LoA2: the background, which remains in dark hues and fails to
be illuminated, despite the spotlights. The backgrounded stalls and the
goods the stalls are selling are generally obliterated and unobservable.
This may be contrasted with the LoA who is illuminated by the Emblem
of the 'sun' (that is, M1) she is holding, while the spotlighted
background, ironically, fails to brighten up. Thus the Comp.LoA2
underscores the prominence of the LoA and by extension, the prominence of
the product (that is, M1). The Comp.LoA2 thrusts the LoA and the
product (that is, M1) into viewers' attention.
The Display
//In its own confident and quiet style [[that have won endless admirations the world over]], the New Golf has come of age with a sophistication beyond comparison//
α //[[Setting itself apart and in a blistering pace]], is a new and awesome 1.8 litre turbo engine//
×β //to take its performance to a higher level//
//Not only that, the beauty and luxury of the New Golf is also graced with equally exciting refinement both inside and outside//
1 //Truly the New Golf hasn't changed in spirit and valour//
+2 α //but has gotten better//
×β //to assert itself as the ultimate hatchback//
//No wonder it has been hailed as '. . . a triumph of execution' by UK's Car's Magazine (January '99)//
//And termed by others as the 'Rolls-Royce of hatchbacks'//
Table 7.2 is a brief survey of the advertisements analyzed in this paper and
reveals which elements are optional, and which obligatory.
Evident in Table 7.2 is the diversity of choice as to which elements
are included or excluded from the advertisements. This study indicates
that only the Lead and the Emblem occur in all the advertisements
which have been analyzed. Thus the Lead and the Emblem appear to be
obligatory elements, while the rest are optional in the GSP of print
advertisements.
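The reasoning behind this classification is simple set logic: an element is obligatory if it occurs in every advertisement surveyed, and optional otherwise. A minimal sketch follows, assuming hypothetical inventories; the advertisement labels and their contents are invented for illustration and do not reproduce Table 7.2.

```python
# Hypothetical element inventories; these are NOT the contents of Table 7.2.
ads = {
    "ad_a": {"Lead", "Emblem", "Announcement", "Display"},
    "ad_b": {"Lead", "Emblem", "Enhancer"},
    "ad_c": {"Lead", "Emblem", "Announcement", "Enhancer", "Display"},
}

# Every element attested in at least one advertisement.
all_elements = set().union(*ads.values())

# Elements attested in all advertisements are treated as obligatory.
obligatory = set.intersection(*ads.values())

# Everything else is optional.
optional = all_elements - obligatory

print(sorted(obligatory))
print(sorted(optional))
```

Because the result depends entirely on the sample, a larger or different set of advertisements could move an element from one class to the other, which is why the classification is offered tentatively.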
Cook (1992: 216), quoting Barthes, states that advertisements represent a
'restless' discourse type. He explains (ibid. 217):
Deriving a GSP for advertisements is thus made difficult by this
'restlessness' of advertisements. The GSP for advertisements which I have
derived is at best tentative, insofar as advertisements metamorphose along
with 'changes in society . . . constantly transmuting and re-combining'
(ibid.). I venture further to say that the GSP for advertisements is
chameleon-like, slippery to define and ever-evolving. Further research into
the GSP of advertisements, which needs to be conducted in greater breadth
and depth, may produce a different GSP. For example, Hasan (1996) establishes
'Capture^Focus^Justification' as the generic structure for advertisements but
my research has produced a GSP which differs in terms of the degree of
detail and the ability to capture the complexity of intersemiosis in
advertisements.
Table 7.2 Tabulation of Elements in the five advertisements
Ideational meaning (figure labels, in sequence):
Effervescence of meaning; a kaleidoscope of meaning. Low CP, wide IS, high SE.
Contextualization of meaning; options of meaning unintended by advertisers closed off.
Meaning in advertisement funnelled towards a preferred direction intended by advertisers.
Stability in meaning; X number of meanings intended by advertisers communicated to and received by viewer. High CP, narrow IS, low SE.
Stage 1
The Lead in the Golf advertisement is the most interpersonally Salient, as
seen in Plate 7.1, and thus this element is first approached by viewers. The
gaze of the LoA locks with the viewer, and the latter is led into the adver-
tisement. If interpreted independently of the Announcement, Enhancer,
Display and Emblem, the Lead is figuratively an effervescence of meaning.
As mentioned above, the Lead could represent an ambassador for women's
rights, or a call to attention to the new millennium look. On its own, a
kaleidoscope of possible meanings characterizes the Lead. There is a wide
scope in terms of meaning potential in the Lead. At this stage, there is low
CP, wide IS and high SE.
Stage 2
In a print advertisement, the next most Salient item is the Announcement,
thus the Primary Announcement is second in the reading path. There is
Bidirectional Investment of meaning between the Announcement (the
linguistic code) and the Lead (the visual code) as illustrated in Figure 7.3.
The term Investment refers to the Bidirectional Investment of meaning
from the lexicogrammatical choices in the Announcement to the visual
in the Lead and vice-versa. For example, should the Announcement in the
Golf advertisement 'It does not make a statement. It's for people who
already have one' occur elsewhere, for instance on the back of a T-shirt,
it would connote different meanings. Similarly if the Lead of the Golf
advertisement appears in a different context, for instance in a Playboy
magazine, it would have different connotations from what it has here. So
how are the viewers constructed by the advertisers to read the meaning in
the Golf advertisement that the LoA represents someone with a statement-
making personality? After all, the Announcement is a linguistic code while
the Lead is a visual one. How does the juxtaposition of two different codes
result in meaning that can be unambiguously conveyed by the advertisers
and unambiguously decoded by the viewers? I propose that the juxta-
position of the linguistic texts and visuals sets up Transitivity processes that
invest meaning from the linguistic code to the visual code and vice versa.
The discussion that follows unravels and explicates the mechanics of this
Bidirectional Investment of meaning from the Announcement to the Lead
and vice-versa.
Stage 2a
In the Golf advertisement (Plate 7.1), there is a Relational:Attributive:
Intensive process between the Primary Announcement and the Lead. The
Attribute 'statement-making personality' is invested from the Primary
Announcement into the Carrier (that is, the LoA in the Lead) by virtue of
their proximity, thus causing viewers to see the LoA as a person with a
statement-making personality (Figure 7.4):
[Figure 7.4: Primary Announcement ('It does not make a statement. It's for people who already have one') and Lead ('Visual of LoA')]
Stage 2b
Figure 7.5 illustrates how the Lead enriches the Ideational meaning
carried in the Primary Announcement. The sophisticated, sensuous, coy-
looking LoA in Plate 7.1 is a visual exemplification of the statement
'people who already have (a statement to make)'. A Relational:Identifying:
Intensive process occurs in the Investment of meaning from Lead to
Announcement.
[Figure 7.5: Primary Announcement ('It does not make a statement. It's for people who already have one') and Lead ('Visual of LoA'). Investment: a Relational:Identifying:Intensive process occurs between Announcement and Lead. The sophisticated LoA in the Lead is the Value exemplifying the Token 'It's for people who already have one'.]
(Plate 7.3) enter into Relational, Verbal, Mental and Material processes with
the Lead (Figure 7.6).
Primary Announcement: defined as the most Interpersonally Salient
Announcement among other Announcements in the same advertisement
(e.g. (1) '118 Off-peak hours every week').
At the most obvious level, the LoA in the Lead is the Sayer, with 'I get
the feeling that M1 wants me to enjoy value - and enjoy life. Everything they offer is
brighter, nicer and more fun!' as the Locution and the viewers as the Receiver.
Underscoring this is an ideology of persuasion particularly through the
Attributive:Intensive process in the Locution:
The juxtaposition of the Lead and the Primary Announcement 1 gives rise to
the following possible meanings:
Stage 3
The advertisers are able to convey, and viewers are able to receive, the
meaning of a satisfied M1 customer unambiguously because the juxtaposition
of Announcements and Lead has resulted in the above Transitivity
processes. Should the Lead be paradigmatically replaced by a visual of
uncongested roads in the city, the meanings conveyed would definitely be
different. Again, although the Announcements are linguistic codes and
the Lead a visual one, their juxtaposition still creates meaning due to the
Transitivity processes between them. These meanings that result from the
interaction between the Announcements and the Lead are built on or modi-
fied (depending on the advertisers' intention) by the Enhancer.
An analysis of the Transitivity processes in the Enhancer reveals how it
builds on the meanings generated by the Announcements and the Lead. In
Stage 2b one meaning generated between the Lead and the Primary
Announcement in the M1 advertisement is:
In the Relational clauses below, the 'Free talk time' is accorded a Value and an
Attribute, to attract viewers to these benefits that they can enjoy. Herein,
again, lies the ideology of persuasion.
//Free talk time (is) worth $40 every month with the M1 PrimePlan//
The meaning of (A) is further enhanced: the LoA is smiling because of the '118 off-peak hours
every week'.
(C) Secondary Announcement 1: Third in Salience, therefore its reading follows Primary
Announcement 1. Meaning of Primary Announcement 1 further enhanced, that is, '118
off-peak hours every week' is not only of good value, it enables one to enjoy life. The '118
off-peak hours every week' is a means to a 'brighter, nicer and more fun' lifestyle.
(A) Lead is visually most Salient, therefore viewers interact with it first.
(D) Secondary Announcement 2: Next in Salience after (C) and enhances the meaning of (C).
(E) Enhancer: Read last since it is at the bottom. The Enhancer further builds on what it
means to have a 'brighter, nicer and more fun' lifestyle, by describing the benefits.
Lead
X Primary Announcement 1
X Secondary Announcement 1
X Secondary Announcement 2
X Enhancer
Stage 2a mentions the interaction between the Announcement and the Lead
in the Golf advertisement which sets up Relational processes resulting in the
following meanings:
Lead
= Primary Announcement
X Enhancer
(A) The Lead is visually most Salient therefore read first. The LoA carries some
meaning but we are not sure yet what meanings advertisers intend her to
have till she interacts with the Announcement.
Stage 4
The total meaning derived from the interaction between the Lead,
Announcement/s and the Enhancer needs to be read in the socio-cultural
context within which it is placed. The meaning of the entire advertisement,
according to Wernick (1991: 42):
delivers back to the people the culture and values that are their own . . . [it is] a
reinforcement of whatever ideological codes and conditions [that have] come to
prevail.
the goals and values that are consistent with and conducive to the consumer
economy and [socialize] us into thinking that we can buy a way of life as well as
goods.
Linguistic items (a)-(e) provide the context within which the meanings of the
LoA may be negotiated and established. As the LoA is more contextualized
by linguistic items (a)-(e), the meaning of the LoA becomes more strait-
jacketed. Such a scenario defines a high CP in an advertisement. With a
high CP, the viewers' interpretation of the LoA is constricted, with a low-
ered freedom to read other meanings in the LoA given the semantic input by
linguistic items (a)-(e).
The CP, therefore, has ideological implications. A greater Propensity for
Contextualization implies greater effort by the advertisers (through the
linguistic items) to introduce specific strands of meanings. One is discouraged
to read alternative meanings in the LoA given the context by (a)-(e). The
viewers thus have limited IS, that is, space to create, invent and author
meaning. This of course does not mean that alternative readings do not
occur. A critical reader can interpret the intended meanings and offer fur-
ther perspectives other than those intended by the advertiser.
As illustrated in Plate 7.5, the CP is low in the Guess? advertisement as
there is only one lexical item, namely 'Guess?' to contextualize the meaning
of the entire Lead, made up of the LoA, that is, the model whose limbs shine
with metallic sheen, and the Comp.LoA, that is, the background. Apart from
the possible reading that the LoA is in some way related to Guess?, which is
the brandname of a fashion product known for its watches and clothes, and
that there is the underlying message that Guess? fashion is trendy, chic and in
vogue, the entire Lead is an effervescence of meaning as there is a lack of
contextualizing function by linguistic items. Arising from this lack of
contextualization, that is, a low CP, a myriad of interpretations of the LoA is
possible: the LoA with the metallic sheen-like complexion is a probable
personification of the futuristic stance Guess? adopts towards fashion; the
current Guess? trend is the minimalist look, as exemplified by the generous
show of legs and body swathed with a minimum of cloth; the Guess? con-
sumer is bohemian in outlook, as is the LoA whose cascading hair is caressed
by the wind and throws a cold, removed glance at the viewer and the world;
the Guess? consumer looks down at the world in nonchalance, articulating
the superiority of the product and hence the consumer who chooses to use
Guess? products. Guess? is thus selling an attitude, a certain style of living;
Guess? products hint of sexual attractiveness and availability (as signified
through the high-split in the skirt and slightly parted legs), which can be
extended to imply the non-conformist nature of Guess? products, which
challenge the conservative mould of society; Guess? applauds the flat-
chested female as opposed to society's fascination with and celebration of the
amply endowed female, again a hint of Guess?'s non-conformist ideology;
dark skies and seas fail to intimidate Guess? consumers, who are able to put
their best foot forward in style and confidence, the Stance (O'Toole, 1994)
adopted by the LoA; Guess? is beyond definition, there is no single aspect to
its fashion statement. Guess? products, it seems in this particular advertise-
ment, have limitless possible interpretations within the semantic realm of
'the desirability' of this label, and that is likely to be the message intended by
advertisers wish the consumers to purchase the illusion that consumers are
empowered to create meanings for themselves in an advertisement. The
ideology of manipulation is no less evident, for by thinking they have free-
dom to interpret, the viewers have played themselves into the hands of the
advertisers. They have bought the ideology of Guess?, that is, there is no
single definition of the Guess? fashion statement, so dress the Guess? way
and be open to interpretation by the (admiring?) eye of the public.
[the Given is defined as] something the viewer already knows, as a familiar and
agreed-upon point of departure for the message. For something to be New means
that it is presented as something which is not yet known, or perhaps not yet
agreed upon by the viewer, hence as something to which the viewer must pay
special attention.
However, my proposal is that Given and New information need not be Com-
positionally determined in this manner of left to right organization. The
Given-New information value may be derived in any print advertisement, in
any layout, whether with left-right or top-down Composition. The Guess?
advertisement is a case in point.
[Figure panels (a)-(c); axis labels: Narrower IS, Wider IS]
Acknowledgements
Plates 7.1 and 7.4 are reproduced with kind permission of Volkswagen.
Plates 7.2, 7.3 and 7.5 are reproduced with kind permission of Epson,
MobileOne Ltd and Guess? Inc, respectively. The credits for the photo-
graph in Plate 7.5 are due to creative director Paul Marciano and photo-
grapher Dah Len.
References
Baldry, A. P. (ed.) (2000) Multi-modality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Barthes, R. (1977) (S. Heath, ed. and trans.) Image-Music-Text. London: Fontana.
Bohle, R. (1990) Publication Design For Editors. New Jersey: Prentice Hall.
Bruns, A. (1998) Major Terms in Structuralism: Text, Reading, Author, Intertextuality,
Discourse, (http://www.uq.au/~zzabruns/uni/en22l-ass05.html).
Cheong, Y. Y. (1999) Construing meaning in multi-semiotic texts - a systemic-
linguistics perspective. Unpublished masters thesis. National University of
Singapore.
Cook, G. (1992) The Discourse of Advertising (2nd edn 2001). London: Routledge.
Dyer, G. (1982) Advertising as Communication. London: Routledge.
Goldman, R. (1992) Reading Ads Socially. London: Routledge.
Libo Guo
National University of Singapore
Introduction
Introductory biology textbooks in current use in educational institutions
invariably contain words and visual images, for example, schematic draw-
ings, photographs, and mathematical and statistical graphs. Further, it is not
only recently that biology texts have been multimodal; drawings of animals
and plants have been used as an aid to the study of living organisms for
agricultural, medicinal and biological purposes since ancient civilizations
(Ford, 1992).
Sociologists or ethnomethodological researchers, notably Lynch (1990)
and Myers (1990, 1995), have attempted to theorize about the deployment
of visual displays in biology texts. Lynch (1990: 153-154), for instance,
believes that 'visual displays are more than a simple matter of supplying
pictorial illustrations for scientific texts. They are essential to how scientific
objects and orderly relationships are revealed and made analyzable'. In a
similar vein historians and philosophers of science have turned their atten-
tion to the evolution and philosophical aspects of scientific (including
biological) illustrations (see, for example, Baigrie, 1996). Although these
investigations have made significant contributions to our knowledge and
understanding, they often seem to lack a coherent framework to explain how
the various visual displays make meaning in their natural and social settings.
These approaches have shown us what is happening through videotape
recordings, verbal accounts and historical documents, but they have not
been explicit enough about the systems and functions that underlie the use
of visual images.
This paper explores the potential of an alternative approach to the study
of meaning-making practices in scientific discourses. This is the social semi-
otic approach developed by M. A. K. Halliday (1978, 1994) as systemic-
functional linguistics (henceforth SFL) and the emerging SFL-informed the-
ory of multimodality (Baldry, 2000a, 2000b; Kress, 2000, 2003; Kress and
van Leeuwen, 1996; Lemke, 1998; O'Halloran, 1999a, 1999b, 2003). Due
to the main purpose of my study, that is, helping non-native university
learners of English cope with English for Specific Purposes (ESP) or English
for Academic Purposes (EAP), I confine myself to the study of textbook
articles in biology, sharing Myers's conviction that textbooks are the type of
writing that university students are 'most likely to face' (Myers, 1992: 3). The
excerpts analysed here are from Chapter 17 Cell Division of Essential Cell
Biology: An Introduction to the Molecular Biology of the Cell (henceforth ECB) by
Alberts et al. (1998). This textbook is used as required reading material for
second-year biology majors for Bachelor of Science degrees at the National
University of Singapore for the module of Cell Biology.
This paper is organized as follows. I first discuss the semantics of biology
and what biologists do that characterizes them as biologists. Second,
following O'Toole (1994), Lemke (1998), and O'Halloran (1996, 1999a, 1999b) I
propose frameworks for the analysis of visual images in the textbook, and,
following this, I analyse two multimodal composites and discuss how each
type of resource contributes to meaning-making. This paper concludes by
outlining some of the implications of a multimodal approach for teaching
ESP/EAP to non-native speakers of English.
Episode
  Representational: Shape; Colour; Size; Spatial relation to each other, and to the structure; Actions, events
  Modal: Relative Prominence: Colour, amount of detail; Centrality; Lettering (for label and caption): type size, style (serif or sans serif), Weight; Line and arrow width; Numerical sequence
  Compositional: Relative position in the structure or process; Colour contrast between components
Figure
  Representational: Components; Acts
  Modal: Contrast: Scale, Line, Light, Colour; Omission of detail
  Compositional: Relative position in the component or phase; Colour contrast or similarity; Subframing
Member
  Representational: Natural form: Shape, colour, etc. and spatial relationship to other components
  Modal: Stylization; Conventionalization
  Compositional: Cohesion: Parallel/Contrast in Shape and Colour; Reference through language
Graph
  Representational: Statistical reality: topological meanings, such as trends, continuous co-variations, correlation and frequency distribution; Comparisons of patterns of variation
  Modal: Accompanying text in the form of Caption, Title and Labelling which are emphasized by Size, Positioning, Underlining and Font; Colour, Line width, Shading, Line Solidarity, Arrows; Curvature; Perspective; Framing; Scale; Style of production; Directionality
  Compositional: Gestalt: Framing, Horizontals, Verticals and Diagonals; Positioning; Use of Lines, Curves and Bars; Interconnections established through symbolism and language for the labelling of Participants and Processes; Cohesion: links to the running text
These two processes together constitute the M phase of the cell cycle
(Figure 17.3).
However, he or she may not wait until being instructed to view Figure 17.3.
Since Figure 17.3 is a full-colour drawing, a picture more attractive than the
largely black and white verbal text, a reader's attention is more likely to be
drawn to the drawing than to the written description. Thus one plausible
reading session may be that a reader, at some point in his or her reading,
turns his or her attention to the figure, and then back to the verbal text for
careful study and then back to the figure again, following a back-and-forth
type of reading path as explained above.
The reading path within Figure 17.3 is marked in Figure 8.1 by the
capitalized and italicized Roman letters A to G. As is clear from Figure 8.1,
the reading path is not linear, from left to right, from top to bottom, but is
determined ideationally by what is in focus in the running text (the M phase
of the cell cycle), and interpersonally by the visual means of directing the
reader's attention (for example, the bright yellow Shading and Capitaliza-
tion of MITOSIS and CYTOKINESIS and light green Shading of M phase and the
large square bracket embracing MITOSIS and CYTOKINESIS in the original text).
This is, in verbal and common parlance, equivalent to saying 'Hey, look at
what is highlighted first!'. Indeed, in this part of the reading, Steps C and D
are all an experienced reader needs to attend to. The highlighting devices
such as arrows are equivalent to a lecturer's cursor in an actual classroom,
where he or she, while talking to the students, points to relevant parts of the
figures. Although in viewing Figure 17.3, one's gaze, especially that of a
novice, may work from Step G down to Step D due to the Interpersonal
impact of the downward-pointing arrows and the reading habit of a normal
reader, it is nonetheless arguable that the reading path suggested above is
most economical for the experienced reader, that is, one that has followed
the Textual explication up to this point.
At the rank of Work, interpersonally, this figure thus employs an array of
visual means to emphasize various parts of the cell structure and stages of
cell division. Ideationally, the figure is designed to tell a story about what
happens in a cell cycle, in particular the M phase of the cell cycle. The
ideational meanings include: (a) material processes realized by changes in
the shapes at different stages, the arrows and the nominal groups in the
linguistic text; (b) intensive identifying processes realized by the labels, lead-
ers and the pictorial elements, and, in the absence of leaders by the labels,
the spatial proximity between the pictorial element and the labels, and the
pictorial elements; and (c) possessive identifying relational processes realized
by the labels, square bracket, the pictorial elements and the linguistic text.
The overriding Experiential content seems to be concerned with material
processes, although the intensive and possessive relational processes con-
tribute significantly to the construction of biological knowledge. And text-
ually, the drawing is not isolated from other parts of the text. It is related to
the main text and the caption and is placed in a specific position on the
page. The drawing is vertically positioned, with the Arrows connecting one
stage with another. Other resources employed for the textual meaning
include Geometry (e.g. circles), Colour Contrast or Similarity, Labelling
(with or without leaders) and Framing. In what follows, I analyse selected
steps in terms of the Interpersonal (Modal) meaning, Ideational (Represen-
tational) meaning and Textual (Compositional) meaning, by reference to the
functions and systems chart in Table 8.1.
Step C: mitosis
This step can be broken into three sub-stages: Step C-1 the word
'MITOSIS', Step C-2 the arrow and Step C-3 the circle and the two
overlapping circles which contain the semiotic depiction of the cell.
mammalian cell of a 24-hour cell cycle requires only about one hour for the
M phase to complete. This misrepresentation of the temporal dimension
functions to highlight the M phase of the cell cycle.
Figure Q17.1
Ideationally, at the rank of Graph, the graph shows visually the Result (or
part of the Result) of an experiment, the frequency distribution of cells with
different DNA contents in a population of growing cells. The x values refer
to the DNA content per cell, as the label indicates, and the y values the
number of cells with a given DNA content. In other words, cells in the
population are divided into various types according to the amount of DNA
the cell contains: the type of cells on the right of the x-axis contains more
DNA than a type on the left. The value on the y-axis records the number or
frequency of occurrence of each type of cells in the population. A higher
point on the graph means that the number of cells of a particular type is
greater. Thus Ideationally the graph is a visual equivalent to a group of
linguistic relational processes through its Curvature. In addition, this graph
shows the 'conceptual relations, and not actual data' (Lemke, 1998: 102).
For instance, we are not told how many cells there are in the population, the
exact number of cells with different DNA contents, nor how much DNA
each cell contains, as there is no indication of the unit of measurement on
either x- or y-axis. We are provided with the theoretical relation between the
two variables: the type of cell defined by its DNA content and its frequency
of occurrence in the population.
It is worth noting that this Ideational meaning resides uniquely in a graph
and that it cannot be expressed as effectively by a verbal text or a mathemat-
ical equation. For Figure Q17.1 visually expresses the general abstract
pattern, or spatializes the quantitative relationship. It is a document with
visual impact, one that enables the viewer or reader to 'take in' the pattern
at a glance. However well a verbal clause or clause complex or a mathematical
equation may express the trend or relationship, a graph always does so with
a strong visual impact.
I would also like to note that just as the move from concrete data record-
ing to the abstract relationship between the values of two variables may
involve grammatical metaphor (Halliday, 1998), the visualization of the
abstract relationships may involve semiotic metaphor as formulated by
O'Halloran (1999a, 2003, forthcoming). By semiotic metaphor, O'Halloran
(2003: 357) refers to the phenomenon in which 'when a functional element
is reconstrued using another semiotic code' there may occur 'a shift in the
function and the grammatical class of [the] element, or the introduction of
new functional elements'. The formulation of semiotic metaphors involved
in the movements between natural language, mathematical symbolisms and
visual displays is crucial for the ultimate solution to mathematical problems,
as demonstrated by O'Halloran (1999a, 1999b, 2003, forthcoming).
Here I analyse the movements between the verbal text and the visual text
in Question 17.1, which involves instances of semiotic metaphor. 'The
number of cells with a given DNA content' in the verbal section of Ques-
tion 17.1 functions as one participant, the Goal, with 'The number of cells'
as the head and 'with a given DNA content' as the embedded Postmodifier
(Halliday, 1994: 191-192). Experientially the 'cells' functions as the Thing
and 'with a given DNA content' the Qualifier. But the elements 'The num-
ber of cells' and 'with a given DNA content' do not mean only within
language; they are also to mean intersemiotically, that is, in relation to the
visual text. In other words, the Head and Postmodifier composite in the
linguistic text is transformed into two separate participants in the visual
text, the two variables represented by the x-axis and the y-axis perpendicular
to each other. This shift from one linguistic participant to two visual
participants of equal status may be considered an example of 'parallel
semiotic metaphor' (O'Halloran, 1999a: 348) in that the two participants in
the second semiotic derive from the Goal in the first. This movement from
the linguistic to the visual code permits, however, the exploitation of the
meaning potential of the visual semiotic. Once this shift has taken place, it
is possible to represent the relationship between the number of cells and the
amount of DNA content per cell in terms of the height of the points or
lines in the coordinate system and to make visual comparisons and even
hypothesize some mathematical relationship between the two variables.4 The
precise shape of the curve in the visual text did not exist in the linguistic
text and thus may be considered as a case of 'divergent semiotic metaphor'
(ibid.) because a new participant is introduced with the movement from the
language to the visual image. In this case the divergent semiotic metaphor
(the curve) occurs as a consequence of the parallel semiotic metaphor (the
introduction of two participants). As will be clear shortly, the solution of the
problem depends to a large extent on how much sense the student can
make of the two instances of semiotic metaphor together with the informa-
tion contained in the main text. In what follows I discuss two questions: (a)
how do the Question and the graph relate to the main text? and (b) how do
the main text and Question (including the graph) contribute to the solution
of the problem?
The relationship between the question, the graph and the main text
The relevant main text reads:
During S phase (S = synthesis), the cell replicates its nuclear DNA, [. . .] S phase
is flanked by two phases where the cell continues to grow. The G1 phase (G =
Gap) is the interval between the completion of M phase and the beginning of S
phase (DNA synthesis). The G2 phase is the interval between the end of S phase
and the beginning of M phase. (Figure 17.4).
(ECB, p. 550)
This means that if a cell in G1 phase has 2n units of DNA content, then by
the end of S phase ('replicates its nuclear DNA'), it has doubled the amount
of nuclear DNA content and in the G2 and M phases, it has 4n units of
DNA content. That is, the amount of DNA per cell in G2 and mitosis is
twice the amount in G1, and a cell in S phase is in transition from 2n to 4n units.
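The arithmetic in this paragraph can be made concrete with a minimal Python sketch. The function name `dna_content` and the default n = 1 are illustrative assumptions and do not come from ECB. S phase is given as a range because the cell is in transition between the two values.

```python
def dna_content(phase, n=1.0):
    """Return (lowest, highest) DNA units per cell for a cell-cycle phase.

    Illustrative only: n is an arbitrary unit, as in the 2n/4n notation.
    """
    before_replication = 2 * n   # G1: DNA not yet replicated
    after_replication = 4 * n    # G2 and M: DNA fully replicated
    table = {
        "G1": (before_replication, before_replication),
        "S": (before_replication, after_replication),  # in transition
        "G2": (after_replication, after_replication),
        "M": (after_replication, after_replication),
    }
    return table[phase]


for phase in ("G1", "S", "G2", "M"):
    low, high = dna_content(phase)
    print(phase, low, high)
```

On this reading, a cell in G2 or M always carries exactly twice the DNA of a cell in G1, whatever unit n stands for.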
Then how do Question 17.1 and the graph relate to such information
contained in the main text? The main text reveals the general facts, the
'laws' in biology, the conclusion, and/or the theory, which scientists arrive at
from numerous experiments (as can be seen in the use of simple present
tense in the quotation above). Question 17.1 (including the graph), on the
other hand, reports just one experiment, complete with Method and Results
of an experimental report (the verb tense in some of the first few clauses in
the Question is the simple past, for example, 'were stained' and 'were then
passed'). That is, the main text presents the conclusion and the Question
presents one of the experiments leading to such a general conclusion.
Question 17.1 is not, however, a real experimental report, but rather it is a
textbook question. In a real experimental report, the conclusion is presented
in the final part while in the textbook question the conclusion is the point of
departure and the student is expected to apply this general rule to solve a
practical problem.
The contribution of the main text, question text and the graph in solving
the problem
There are two parts to the Question. The first part reads: 'Indicate on the
graph where you would expect to find cells that are in the following stages:
G1, S, G2, and mitosis'. To answer this question, the student must understand
the change in the amount of DNA content at different stages of the cell
cycle. That is, he or she must understand the relevant part of the main text
quoted above. Then in relation to the question he or she must also know
how to interpret the x-axis and know that at point b, the point on the x-axis
corresponding to Peak B (which is not displayed in Figure 8.2) the amount
of DNA per cell is twice that at point a, the point on the x-axis correspond-
ing to Peak A (again not displayed in Figure 8.2), and that Peak B is therefore
the place where one would expect to find cells in G2 and mitosis phases
(chromosomes replicated, doubled) and Peak A the place to find cells in G1
phase (chromosomes not yet replicated). Here the ability to deduce b = 2a
on the x-axis is crucial to the solution of the problem. To know where to find
the cells that are in the S phase, the student must again understand the
relevant main text. He or she must also be able to translate such main text
information into the line segment ab on the x-axis and know that cells that
are in the S phase can be found between Peaks A and B.
The second sub-question reads: 'Which is the longest phase of the cell
cycle in this population of cells?'. To answer this question, the student needs
to interpret the divergent semiotic metaphor, that is, he or she needs to know
how to interpret the frequency graph. That is, Peak A is the highest,
indicating that the number of cells with this DNA content, that is, cells at G1 phase,
is the largest. This further suggests that G1 is the longest phase of the cell
cycle, assuming that the cells were selected on a random basis.
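The inference from peak height to phase length can be illustrated with a small Python sketch. It rests on loudly stated assumptions: the phase durations below are hypothetical (only the 24-hour total and the roughly one-hour M phase echo figures mentioned earlier in this chapter), and the population is modelled as cells spread evenly across the cycle, which stands in for random selection. The function names are likewise illustrative.

```python
# Hypothetical phase durations in hours (assumed for illustration only).
DURATIONS = {"G1": 11, "S": 8, "G2": 4, "M": 1}


def phase_at(hour):
    """Return the phase of a cell that is `hour` hours into its cycle."""
    elapsed = 0
    for phase, length in DURATIONS.items():
        elapsed += length
        if hour < elapsed:
            return phase
    raise ValueError("hour lies outside the cycle")


def count_cells_by_phase(cells_per_hour=10):
    """Count cells per phase in an evenly spread, unsynchronized population."""
    total_hours = sum(DURATIONS.values())
    counts = {phase: 0 for phase in DURATIONS}
    for i in range(total_hours * cells_per_hour):
        counts[phase_at(i / cells_per_hour)] += 1
    return counts


counts = count_cells_by_phase()
peak_a = counts["G1"]                # cells with 2n DNA, at point a
peak_b = counts["G2"] + counts["M"]  # cells with 4n DNA, at point b = 2a
between = counts["S"]                # cells spread between a and b
print(peak_a, between, peak_b)
print(max(counts, key=counts.get))   # phase containing the most cells
```

Under these assumptions the number of cells found in a phase is proportional to the time a cell spends in it, which is why the highest peak points to the longest phase.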
In this subsection I have discussed the essential role that a knowledge of
the linguistic and visual resources and how they interact with each other
plays in the solution of an in-text problem in ECB. In the final section of the
paper I discuss the implications of the preceding analyses.
Notes
1 Unfortunately, it has not been possible to reproduce these Figures in colour as they
appear in the textbook. However, the following glosses are provided on the
colours used in the original text.
Figure 17-3, reproduced in Figure 8.1
The words 'Figure 17-3' are green. The nucleus of the cell is shaded in pink. In
the top circle the chromosome on the left is black, the one on the right red, and
this scheme is retained throughout. 'MITOSIS' and 'CYTOKINESIS' are
shaded in bright yellow, and 'M phase' light green.
Figure Q17-1, reproduced in Figure 8.2
The Question is framed by a box which is marked by its yellow background. In a
similar manner, the actual graph is framed inside the yellow box by a white
background. The curve, the A and B, and the area underneath are green.
2 On the basis of how a signifier relates to the signified, Peirce (1985) classifies
signs into an icon, an index and a symbol. In simple terms, an icon is a sign that
relates to its object in terms of their resemblance. This resemblance can be
similarity in 'simple qualities', as in images or photographs, or in 'relations', as in
diagrams and algebraic formulae, or it can be 'a parallelism' as in metaphors
(Peirce 1985: 10-11). An index is 'a sign which refers to the Object that it
denotes by virtue of being really affected by that Object' (ibid.: 8), for example,
smoke as an indication of fire. A symbol is a sign that derives its meaning by
conventions, by agreement between people (ibid.), for example, the phonological
or graphological feature of the word 'man' and its meaning. The reason for
applying Peirce's trichotomy to the present analysis is that it brings to light the
fact that the relationship between the signified and the signifier is not always
identical or straightforward. Thus signs vary in the degree of the potential
semiotic load they pose for students. An iconic photograph of some familiar
object is easy to decipher, less so the schematic drawing, and even less so the
symbolic signs such as 5'-UGC-3'.
3 For lack of space I do not analyse the graph in a step-by-step manner, as I did
with Figure 17-3 above. However, one can always explore the visual resources
the graph exploits by reference to Table 8.2.
4 According to Tilling (1975: 200-211), quantitative graphs were used by
scientists such as J. H. Lambert (1728-1777) not only to present experimental data
graphically but also to help analyse them, for example to derive a mathematical
relationship between the variables (e.g. the rate of water evaporation as a function
of temperature, as reported in one of Lambert's papers (Tilling, 1975: 201)).
Apparently, the student reader in this question is not required to derive an
equation from the graph but just to interpret the results displayed in the graph
and draw some conclusions.
Acknowledgements
Figures 8.1 and 8.2: Copyright © 1998 from Essential Cell Biology: An Introduction
to the Molecular Biology of the Cell by Alberts, B., Bray, D., Johnson, A., Lewis,
J., Raff, M., Roberts, K. and Walter, P. Reproduced by permission of
Routledge/Taylor & Francis Books, Inc.
References
Alberts, B., Bray, D., Johnson, A., Lewis, J., Raff, M., Roberts, K. and Walter,
P. (1998) Essential Cell Biology: An Introduction to the Molecular Biology of the Cell. New
York: Garland.
Arnheim, R. (1974) Art and Visual Perception: A Psychology of the Creative Eye. Berkeley:
University of California Press.
Baigrie, B. S. (ed.) (1996) Picturing Knowledge: Historical and Philosophical Problems
Concerning the Use of Art in Science. Toronto: University of Toronto Press.
Baldry, A. P. (ed.) (2000a) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. (2000b) English in a visual society: comparative and historical dimen-
sions in multimodality and multimediality. In Baldry (ed.), 2000a: 41-89.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. New York: Hill and Wang, 32-51. (Originally published in
1964.)
Bastide, F. (1990) The iconography of scientific texts: principles of analysis. In
Lynch and Woolgar (eds), 187-229.
Ford, B. J. (1992) Images of Science: A History of Scientific Illustration. New York: Oxford
University Press.
Haas, C. (1994) Learning to read biology: one student's rhetorical development in
college. Written Communication 11(1): 43-84.
Introduction
In this age of the multimedia, there is an increasing awareness that meaning
is rarely made with language alone. As Baldry (2000), Kress (2003) and
Kress and van Leeuwen (2001) note, we live in a multimodal society which
makes meaning through the co-deployment of a combination of semiotic
resources. Visual images, gestures and sounds often accompany the lin-
guistic semiotic resource in semiosis. As such, there is a pressing need to
understand the dynamics of meaning-making, or semiosis, in multimodal
discourse. Academic disciplines that focus on mono-modality, such as that
of linguistics, must come into dialogue with other fields of research, for
instance, visual communication studies and media studies, to facilitate the
interdisciplinary nature of multimodal research.
In this paper, the Integrative Multi-Semiotic Model (IMM) (Lim, 2002) is
proposed as a 'meta-model' for the analysis of a page or frame which
involves the use of both language and pictures as semiotic resources. The
term 'meta-model' is used to describe the IMM as a model which brings
together and incorporates the systemic-functional matrices and frameworks
currently available in the field of multimodal studies. This is undertaken
with the aim of unifying these contributions for the expression, content and
communicative planes of language and visual images in the IMM. There is
a need, however, to further develop the model into one that can account for
meaning arising from other semiotic resources in dynamic environments
such as video texts and hypertext.
Systemic-functional linguistics (SFL), developed by Michael Halliday
(1978, 1994) and extended by Martin (2002) and Martin and Rose (2003),
provides the theory for this investigation into semiosis involving language
and visual images. Although originally conceived for the semiotic resource
of language, the application of SFL to other semiotic resources has been
productive. Pioneering work in the application of systemic-functional the-
ory to visual images, architecture and sculpture includes O'Toole's (1994)
The Language of Displayed Art and Kress and van Leeuwen's (1996) Reading
Images. Following this, further applications of SFL to other semiotic
resources for the analysis of multimodal discourses in mathematics, science,
Proposing an IMM
Despite the advances made in recent research, there remains a lack of
understanding of how meanings arise in multimodal texts. Apart from
Thibault's (2000; forthcoming) comprehensive framework for the analysis
of television advertisements and Baldry and Thibault's (2001) conception of
phase in dynamic video texts, an overarching model and a meta-language to
describe the processes involved in semiosis and intersemiosis in multimodal
texts is lacking. As such, the IMM and the related concepts introduced in
this paper are proposed as a tentative step to account for the different
aspects of meaning arising from the use of multiple semiotic resources. The
IMM, which may be used for the analysis of a printed text involving the two
semiotic resources of language and visual images, is a modest step. None-
theless, the necessity of developing a 'meta-model' with an accompanying
'meta-language' to describe semiotic processes in multimodal discourse is
demonstrated through the discussion of the IMM and the issues raised by
such a model.
The IMM, as displayed in Figure 9.1, demonstrates topologically the
complex multifaceted nature of meaning made in a multi-semiotic text.
The rectangular blocks are used metaphorically to represent the strata,
planes and dimensions of meaning within and across language and visual
images. Following Martin (1992), three planes are conceptualized for these
two semiotic resources. That is, language and visual images each have
an Expression plane and a Content plane (which is further
divided into grammar and discourse semantics strata), together with a Context
plane which consists of register, genre and ideology, as displayed in
Figure 9.1.
The top view of the model appropriately displays the Expression plane
which is referred to as 'Typography' for language and 'Graphics' for visual
images. This is significant as the Expression plane is the interface between
the text and the reader. As seen in Figure 9.1, this interface is mediated by
the medium and materiality of the text, which also mediates the other
planes. This mediation may be seen in operation in the simple case of a
wedding invitation card which is usually printed on certain types of paper.
This demonstrates that the Content, register and genre of the text (the wed-
ding invitation) are related to the materiality options of the medium (the
type of paper and print). Together, these choices carry ideological implica-
tions, which in this case concern the elevated status of weddings in Western
society.
An elevated platform between the linguistic and pictorial modalities can
be seen from the top of the IMM. This is called the Space of Integration
(SoI), which is the theoretical platform where intersemiosis occurs through
contextualizing relations. The elevation of the SoI signifies topologically the
semantic expansions that result from the interaction and negotiation
between semiotic resources in what Lemke (1998) terms 'the multiplication
of meaning'. Below the Expression plane is the Content plane which
consists of the lexicogrammatical and discourse semantics strata for lan-
guage, and the visual grammar and discourse semantics strata for visual
images. As seen in Figure 9.1, the SoI also operates on the Content plane.
The lexicogrammatical and discourse systems for language are organized
according to the three metafunctions proposed by Halliday (1994): the
Ideational, Interpersonal and Textual metafunctions. The theory of
metafunctionality has been extended to the systems which constitute the grammar of
other semiotic resources. For example, Kress and van Leeuwen (1996) and
O'Toole (1994) extend the metafunctional hypothesis to the systems of a
visual grammar. O'Toole (1994) proposes a detailed metafunctionally based
matrix for the analysis of paintings. In addition to the lexicogrammatical
and grammatical systems, a discourse semantics stratum is also recognized
for the pictorial modality as well as for the linguistic in the IMM. Although
not developed here, this extension follows from Martin's (1992) metafunc-
tionally based discourse systems for language. The discourse semantics
stratum for language and visual images is useful for analysing children's
picturebooks, for example, which consist of a sequence of pictures and text
(Lim, 2002).
The systems of meaning in the Expression and Content plane for lan-
guage and visual images are seen to be organized metafunctionally in the
IMM. The metafunctional distinctions within the systems on the grammar
and discourse strata in the IMM are indicated through the three rectangular
boxes of different Tone in Figure 9.1. Thibault (2000: 362) proposes that
'metafunctions are best seen as a principle of integration for approaching the
Experiential, Interpersonal, logical and Textual dimension of the text as a
whole'. The commonalities of metafunctional organization across semiotic
resources are drawn upon and metafunctional distinction is used as a means
of conceptualizing meaning across the different strata in the IMM.
The term system-metafunction fidelity is used to signify the degree of
dedication of a system towards a specific metafunction. Although meaning
is organized around the metafunctional classifications, the system-
metafunction fidelity of the visual grammar is less rigid compared to the
lexicogrammar in language. In other words, the metafunctional categories
by which the systems for visual images on the grammar stratum are organized
may be more fluid than depicted by the three rectangles in Figure 9.1. For
example, the system of Rhythm in the grammar for visual images (O'Toole,
The Expression plane of the Figure involves, for example, the systems
of Colour and Form used to make meaning. This refers to choices in the form
of the black thin line, the two small black circles and the larger circle in
Figure 9.3. Should any of the choices on the Expression plane be altered,
for example, should the eyes become green or the thin black
line become a red brushstroke, the meaning of the picture would change.
The choices from systems in the Expression plane (see Figure 9.5) are signifi-
cant in terms of the meaning of the picture. This illustrates that choices
made from systems on the Expression plane contribute or feed through to
the meanings made through systems operating on the Content plane. This
point is further discussed below.
The grammar stratum, as extensively theorized by O'Toole (1994, 1995)
and Kress and van Leeuwen (1996), relates one disparate element to
another and explains how the whole functions cohesively to make meaning.
Just as the grammar of language concerns itself with the chains of words to
form coherent sentences, the grammar of visual images is about the piecing
of one item with another to construct a coherent message. The relations of
the parts to a whole, for instance, how the various shapes form the iconic
face in Figure 9.3, operate on the grammar stratum. This grammar is cul-
turally dependent and governs the way a reader 'reads' and understands
images such as the iconic face in Figure 9.3.
Following O'Toole (1994: 24), a hierarchy of different ranks analogous to
Halliday's (1978) rank scale for language, is proposed for the visual gram-
mar. In this way, it is possible to examine the meaning made on each of the
rank units, which are Member, Figure, Episode and Work. This adoption of a
rank scale operating within the principle of constituency, where one rank is
constitutive of the next higher rank in the hierarchy, facilitates a more
systematic analysis of the meaning made in the different units on the visual
grammar stratum.
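The principle of constituency described above can be sketched as a small data structure. This is only an illustrative sketch: the four rank names come from the text, but the `Unit` class, the `units_at` helper, the rule that a unit holds only units of the next lower rank, and the labels given to the face's parts and to the Episode are assumptions made for the example, not part of O'Toole's (1994) framework.

```python
from dataclasses import dataclass, field
from typing import List

# Rank units named in the text, ordered from lowest to highest.
RANKS = ["Member", "Figure", "Episode", "Work"]


@dataclass
class Unit:
    rank: str
    label: str
    parts: List["Unit"] = field(default_factory=list)

    def add(self, part: "Unit") -> "Unit":
        # Assumed strict constituency: a unit may only contain
        # units of the next lower rank.
        if RANKS.index(part.rank) != RANKS.index(self.rank) - 1:
            raise ValueError(f"{part.rank} cannot be a constituent of {self.rank}")
        self.parts.append(part)
        return self


def units_at(unit: Unit, rank: str) -> List[str]:
    """Collect the labels of all units of the given rank, top-down."""
    found = [unit.label] if unit.rank == rank else []
    for part in unit.parts:
        found.extend(units_at(part, rank))
    return found


# Hypothetical analysis of a simple iconic face (labels are illustrative).
face = Unit("Figure", "iconic face")
for name in ("thin black line", "small circle (left)",
             "small circle (right)", "larger circle"):
    face.add(Unit("Member", name))

work = Unit("Work", "picture").add(Unit("Episode", "greeting").add(face))
```

Calling `units_at(work, "Member")` then lists the four shapes, so that the meaning made at each rank can be examined in turn.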
In a sense, the delicate distinction between the Expression plane and
grammar stratum can be made with the Expression plane being largely
concerned with the surface instantial features of the text and the Content
plane with the interaction and negotiation between the different elements in
the text. In the same way that Context mediates the meaning of a text, the
Expression plane mediates the choices made from the grammatical and
discourse systems operating on the Content plane. The notion here is one of
'mutual engendering' which has been used to describe the relationship
between language and social Context (Martin, 1992). In this case, the
mutual engendering encompasses the Expression plane and the Content
plane, the materiality and medium of the text, and the social and cultural
Context within which the text was produced.
Reading path
The assumption of perceptual equity on the Expression plane has pro-
found implications for our approach to the analysis of the multimodal text.
The Expression plane is the interface the reader experiences upon reading
the text. In this paper, the term 'reading', despite being a term derived
from the study of language, is taken to include visual perception or viewing.
Following Sardar and van Loon's (2000: 44) work in media studies, reading
is defined as 'the process of interaction when a text is analysed as well as the
final result of that process, the interpretation'. Hence, in any multimodal
text, it is useful to chart a typical reading path that the hypothetical reader may
follow in the reading of different episodes on a page. In a sense, the reading
path is the order by which the reader may process different episodes in a
multimodal text.
As previously mentioned, Thibault (2000, forthcoming) and Baldry and
Thibault (2001, 2004) use phasal analysis in their deconstruction of a film
segment, where salience or the 'use of foregrounding strategies' allows for
certain modalities to be thrust into prominence. Analysis is therefore guided
by the contrastive salience of a specific semiotic resource in each particular
instance. This presupposes and builds upon the theory of a 'reading path'
where the viewer reads according to the contrastive salience of the semiotic
resources at each instantiation. O'Halloran (1999: 323) proposes that a prac-
tical approach to analyzing a multi-semiotic text can be through a progres-
sive analysis following the 'reading path determined by the choices within
different semiotic codes'.
The notion of a linear or uni-directional reading path, however, deserves
to be more closely scrutinized. This conception seems to be appropriate for
a reader reading a book or magazine, navigating across the pages or frames
in a linear reading pattern, governed by literacy conventions. Following
Pang (2000), however, this would more suitably be termed as a directional path
rather than a reading path. The usefulness of a restrictive and regulated
reading path breaks down when analyzing the multimodal text on a page or
frame. The reading path on a multimodal frame is seldom only uni-
directional, as the hypothetical reader's eyes are led through contrastive
salience, possibly even in a back and forth fashion between two items or
Episodes (O'Toole, 1994) on a page. In other words, the path, although
sequential due to constraints of human visual perception, may not be uni-
directional but is free to be bidirectional (Pang, 2000) or multidirectional as
displayed in Plate 9.1. Following the assumption of perceptual equity, the
reading path may disregard the distinction between linguistic and pictorial
semiotic resources as the reader is drawn by the contrastive salience of a
section or Episode.
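The distinction between uni-directional, bidirectional and multidirectional paths can be made concrete with a short sketch. The operational definitions used here are assumptions made purely for illustration and are not Pang's (2000) own: a path counts as uni-directional if it always moves forward in the conventional layout order, as bidirectional if it moves back and forth between exactly two Episodes, and as multidirectional otherwise. The function name `classify_path` and the Episode labels are hypothetical.

```python
from typing import Dict, List


def classify_path(path: List[str], layout_order: Dict[str, int]) -> str:
    """Classify a reading path given the conventional order of Episodes.

    path         -- Episode labels in the order the reader visits them
    layout_order -- position of each Episode in a left-to-right,
                    top-to-bottom reading of the page (assumed known)
    """
    positions = [layout_order[episode] for episode in path]
    steps = [b - a for a, b in zip(positions, positions[1:])]
    if all(step > 0 for step in steps):
        return "unidirectional"
    if len(set(path)) == 2:
        return "bidirectional"
    return "multidirectional"


# Hypothetical page with three Episodes.
layout = {"title": 0, "picture": 1, "caption": 2}

linear = classify_path(["title", "picture", "caption"], layout)
back_and_forth = classify_path(["picture", "caption", "picture"], layout)
free = classify_path(["picture", "title", "caption", "picture"], layout)
```

In each case the path remains a sequence, reflecting the constraints of visual perception, while its direction across the page is free to vary.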
Kress and van Leeuwen (1998) introduce the notion of scanning which
clarifies their earlier claim that readers tend to read in a left to right and up
to down pattern. They describe scanning as a process that occurs before
reading. The 'scanning process sets up connections between the different
on the Expression plane. This includes developing the notion that a viewer
is drawn towards interpersonally salient components in a multimodal text.
While system networks for some of the more prominent systems of the
linguistic and pictorial modalities on the Expression stratum are proposed in
the next section of this paper, these are not exhaustive and remain very
much at a preliminary stage.
Although the meanings made through the systems in the grammar stra-
tum are organized metafunctionally, the tri-metafunctional distinction
appears to be more uncertain on the Expression plane as previously dis-
cussed. These systems with a low system-metafunction fidelity can be more
appropriately described as functioning on a cline and, as such, the classifica-
tion of the systems is not based on metafunctionally based discrete categor-
ies in Figure 9.1. Instead, systems operating on the Expression plane can
contribute to the Ideational, Interpersonal and Textual meanings in a text. It
is therefore useful to examine the critical impetus, or the necessary conditions
and circumstances which reveal which particular metafunctional meaning is
likely to emerge from choices within systems on the Expression plane.
The critical impetus for a dominant Interpersonal meaning on the Expression
plane is Salience, and this can be achieved through contrast of Colour,
Shape, Size, and so forth. The critical impetus for Textual meaning on the
Expression plane is the presence of Textual unity and cohesiveness. But first,
what is the nature of the Ideational meaning made on the Expression
plane? Visual semioticians Floch (1986) and Thürlemann (1990) have
observed a double layer of signification in pictures. They term the first level
as 'iconic' and the second as 'plastic'. Sonesson (1993: 325) explains that 'on
the iconic level, the picture is supposed to stand for some object recognizable
from the ordinary perceptual lifeworld, while concurrently on the plastic
level, simple qualities of the pictorial Expression serve to convey abstract
concepts' within the lifeworld as well. Lifeworld, according to Husserl, is the
'world taken for granted'. To extend this rather crudely into SFL terms,
lifeworld can be compared to the Context of Situation and Context of
Culture, the social reality in which the individual operates.
Doonan (1993: 15), working on picture books from a literary perspective,
also recognizes the 'two modes of referring' in pictorial images. She simpli-
fies 'Denotation' as the representation of an object in a particular Context of
culture. 'Exemplification', on the other hand, is the mode by which
'abstracted notions, conditions and ideas' (1993: 15) are represented within
that culture. This approach to the representation and composition of pictorial
semiotics is congruent with the formulations proposed in this paper,
which draw expediently upon some of these ideas. Modifying the original
sense of denotation and connotation as proposed by Barthes (1977), the
terms Denotative Value and Connotative Value are used to describe the two types
of Ideational meanings made on the Expression and Content strata.
The Denotative Value is understood as the literal or iconic meaning. For
instance, the denotative value of the colour red is confined to the perception
and reference of the reddish hue. Saint-Martin (1990) observes that two
persons can look at one colour and yet see it differently. Hence, it must be
added that the use of denotative value is qualified by an acknowledgement
of the reader's culturally based subjectivities. This contrasts with Barthes's
(1977) use of denotation as a rather non-Context-dependent Platonic ideal.
In other words, the denotative value is understood in this paper as the literal
but Context-dependent meaning. Like Floch's (1986) and Thürlemann's
(1990) conception of the 'plastic' and Doonan's proposal of 'Exemplification',
the Connotative Value is the ideas and abstractions evoked from the
literal image. For instance, the connotative value of the colour red refers to
the abstract concepts which the colour evokes in the reader. Dependent on
the Context of culture, situation and co-text, the red hue could connote
antithetical ideas ranging from danger in a European Context to good for-
tune in Chinese culture.
The Interpersonal meaning dominates when system choices on the
Expression stratum generate Salience, in other words, when salience acts as the
critical impetus. This salience can sometimes be achieved through contrasts
in, for example, Size, Shape and/or Colour as mentioned above. The critical
impetus of salience can be linked to the notion of 'markedness' in Halliday's
(1994) conceptions. The notion of 'markedness' could be helpful to account
for the meaning expansion on the grammar stratum as well as on the Expression
plane. Markedness in Halliday's (1994) original usage means to
'stand out' as an atypical choice. The choices made in Typography for most
texts, for example, are usually stereotypical options according to their genre.
For instance, in the Context of a piece of formal academic writing such as a
dissertation, a particular selection of appropriate Typography is expected. In
addition, because of the association of certain Typography with particular
genres, any departure from the convention or mismatch between Typography
and genre would render those typographical choices as 'marked'.
This is consistent with Halliday's (1994) observation that there is an order in
a clause which is usually expected in a particular clause type, for example,
the nominal group functioning as Subject is usually the first item in a clause
which has a declarative mood. When this order is not adhered to, the clause
is marked. A marked selection in Typography is similarly meaningful.
The notion of critical impetus is thus useful when included in systemic
analysis of both linguistic and multimodal discourse. The critical impetus is
used to identify the environment whereby certain Interpersonal meanings
may dominate through the notion of marked choices. In the same manner,
Textual meanings are usually observed when the critical impetuses of
Unity and Cohesiveness in a text are in operation. For example, in a tapestry
design, the system of Saturation and Hue in Colour and the geometric
forms through the system of Shape operate to create unity and cohesion in
the text. As may be seen from this very preliminary discussion, however,
further research is needed to understand the conditions under which certain
metafunctional orientations are realized through choices from the systems
operating on the Expression strata. Provisional networks for these systems
are given in the next section of this paper.
of converging vectors gives a sense of the illusionary depth and adds a sense
of three-dimensionality into the picture world. Finally, Chiaroscuro is the
application of light and shadows to create DS in Picture C. The example of
the Singaporean Merlion statue shows how shading can suggest a sense of
three-dimensionality on a two-dimensional plane.
PoV is the viewpoint through which the reader is presented with a scene
in the picture. Following cinematography theory, Bordwell and Thompson
(1997: 241) explain that there are systems available in a cinematic shot
which determine the reader's entry into the story world. Two main systems
are Angle and Distance. Angle is the tilt at which the visual image is pre-
sented. A high tilt may place the viewer in a somewhat voyeuristic position.
This can be seen in the frame shown in Picture E in Plate 9.2, where the
reader is 'situated' in the position of an intrusive outsider. A sense of alien-
ation and detachment or feelings of superiority could result from a skilful
use of the high tilt. Correspondingly, a low tilt may lead the reader to feel
overwhelmed, usually with the character positioned to be 'towering' over the
reader. An example can be seen in Picture D, where the pile of toys is
emphasized and the children are portrayed above the clutter. Finally, the
system of Distance has the categories of Long Shot, Medium Shot and Close-
up. Although these categories are relative, they are typically discernible, as
displayed in Picture F, and have a powerful effect.
The system of Form displayed in Figure 9.5 contains four sub-systems,
those of Colour, Shape, Line and Strokes. Colour, following Doonan (1993),
operates through three sub-systems. Hue or pigment distinguishes the col-
our across the spectrum, making it possible to discriminate, for instance,
blue from purple. Tone 'is a measure of light and dark of an area regardless
of its colour, and its quality of a surface as measured purely by its position in
the scale between black and white' (Doonan, 1993: 30). Tone or shading can
render the effects of texture and lighting. Saturation refers to the purity of a
colour. The primary colours such as red, yellow and blue are hues with the
highest level of intensity or saturation.
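The three sub-systems of Colour can be approximated computationally. The sketch below uses Python's standard `colorsys` module and treats the lightness component of the HLS colour model as a stand-in for Tone, and HLS saturation as a stand-in for Saturation. This mapping is an assumption made for illustration: Doonan's (1993) categories are descriptive rather than numerical, and the function name `describe_colour` is hypothetical.

```python
import colorsys


def describe_colour(r: float, g: float, b: float) -> dict:
    """Describe an RGB colour (channels in the range 0..1) in terms of
    Hue, Tone and Saturation, using the HLS model as an approximation."""
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    return {
        "hue_degrees": round(hue * 360),      # position on the colour wheel
        "tone": round(lightness, 2),          # 0 = black, 1 = white
        "saturation": round(saturation, 2),   # 0 = grey, 1 = pure colour
    }


pure_red = describe_colour(1.0, 0.0, 0.0)
mid_grey = describe_colour(0.5, 0.5, 0.5)
dark_blue = describe_colour(0.0, 0.0, 0.5)
```

Under this approximation pure red has full saturation and a mid-scale tone, mid grey has the same tone but no saturation, and dark blue differs from red in both hue and tone, which shows the three sub-systems varying independently.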
The system of Shape includes the options Geometric and regular or Non-
Geometric and irregular. The selection of shapes adds to the multifarious
meaning made in the text. For instance, a picture composed of largely
regular shapes positioned horizontally or vertically could suggest stability
and even a sense of rigidity. The system of Line 'creates contour, modelling,
shading and a sign for movement. A contour puts a Line round objects and
figures and gives them individuality and character' (Doonan, 1993: 23).
Lines such as those used to create varying tone could render the effect of
lighting conditions. Finally, the system of Strokes in Graphics refers to the
way in which colour is applied. Some common options available are Brush,
Pencil, Paint and Crayon. Once again, these systems are not exhaustive, but
rather they are presented to illustrate how systems on the Expression plane
contribute to the overall meaning of the text.
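The options named in this section can be gathered into a simple lookup structure against which an analyst's description of a picture may be checked. This is a minimal sketch rather than a reproduction of the system network in Figure 9.5: it lists only the options mentioned in the running text, and the names `GRAPHICS_SYSTEMS` and `invalid_choices` are hypothetical.

```python
from typing import Dict, List, Tuple

# Systems and options as named in the text; not exhaustive.
GRAPHICS_SYSTEMS: Dict[str, List[str]] = {
    "Angle": ["High tilt", "Low tilt"],
    "Distance": ["Long Shot", "Medium Shot", "Close-up"],
    "Shape": ["Geometric", "Non-Geometric"],
    "Strokes": ["Brush", "Pencil", "Paint", "Crayon"],
}


def invalid_choices(selection: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the (system, option) pairs that the listed systems do not offer."""
    return [
        (system, option)
        for system, option in selection.items()
        if option not in GRAPHICS_SYSTEMS.get(system, [])
    ]


# Hypothetical descriptions of two pictures.
ok = invalid_choices({"Angle": "High tilt", "Strokes": "Crayon"})
unknown = invalid_choices({"Distance": "Extreme Close-up"})
```

An option that falls outside the listed systems signals either an error in the description or a point at which the provisional network needs to be extended.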
Plate 9.3 Homospatiality (Picture A and Picture B): Picture A reproduced from Sallustio (1999b: 4)
Homospatiality, an intensified sense of heat and smoke from the fire is repre-
sented. These extensions of the meaning stem from the intersemiosis on the
Expression plane of the multimodal text, which engenders the meaning
arising from choices in the Content plane.
also one which is situated within the old' (1999a: 348). Although there could
be redundant meanings due to overlaps, 'new layers of meaning are [essen-
tially] simultaneously added to the original representation'. The reconstrual
of elements in a divergent Semiotic Metaphor, however, is more far-reaching.
Here 'the functional element is reconstrued into a new semantic field'
(ibid.). The metaphorical shift in meaning accompanying such divergent
reconstruals is substantial as the functional element is literally relocated in
a semantic field which is not typically intertextually related to the first.
O'Halloran (1999a) explains that the types of semantic shifts involved
in divergent semiotic metaphors, however, gradually become naturalized
over time.
A possible by-product of the meaning made through parallel semiotic
metaphors is semantic redundancies. These redundancies are realized when
there is a duplication of the meaning made by the semiotic resources. These
meanings, though actualized when the modalities are independent, serve a
reinforcing function when the two systems combine in the SoI. A by-product
of divergent semiotic metaphor, on the other hand, could be the surfacing
of conflicting meanings. These conflicts or examples of 'ideological disjunc-
tion' are a possible result 'of the complex, often intricate, relations of inter-
functional solidarity among the various semiotic resource systems that are
co-deployed' (Thibault, 2000: 321). However, the SoI usually brings about a
harmonization of these disjunctions and conflicts 'in the service of the
semiotic project of this particular text' (ibid.). In a multimodal text where the
modalities share co-contextualizing relations, there is a stronger likelihood
for parallel semiotic metaphors to arise, where the new meaning made
remains situated within the old. Divergent semiotic metaphors where
new, previously unrealized meanings are being made through the process
are more likely to emerge from a text where its modalities share re-
contextualizing relations.
Conclusion
As a meta-model, the IMM attempts to synthesize various research efforts
by situating them on the strata, planes and metafunctional dimensions of
the IMM where there is greater centrality and focus. For instance, the field
of materiality and medium of resource is located within this larger theor-
etical multi-semiotic model, in this case, across the communication planes.
The IMM is designed to help unify diverse research efforts in the field by
locating their contributions into a single model, which takes into account the
complexities of multimodal meaning-making.
However, some qualifications exist with respect to the IMM. The problem
of addressing a dynamic phenomenon with a typological description and
framework is a perennial quandary. Hence, the IMM may bear the criti-
cism, like other frameworks, of being reductionist and even rigid in the
categorization of systems according to the metafunctions, despite the use-
fulness of the metafunction as a principle of theoretical integration. The
severity of this criticism, however, will be somewhat alleviated in the IMM
with the construction of a model that can reflect more effectively topological
meaning in dynamic environments such as those afforded by film and hyper-
text. In addition, at this stage the categories in actuality are more fluid than
can be represented by clearly delineated and neat classifications of systems
in the model.
Apart from recognizing the fluidity of the classifications, it is useful to
note that each of the metafunctions may not be equally dominant on a
multimodal page. O'Toole (1994) discusses the monofunctional tendencies
of certain schools of paintings, where a single metafunction may tend to
dominate in a certain work. Similarly, not all metafunctions are equally
salient in a multimodal text, despite the appearance of the equal topological
space allocated to each metafunction in the abstract theoretical construction
of the IMM. Hence, it is not surprising to find a particular metafunction
having a greater role in certain multimodal texts.
O'Toole (1999) also comments that since only some options within the
systems in the matrix are selected in the construction of any one text, it is
not necessary to account for every system in the analysis of a text. Likewise,
in the IMM, there are many systems used to describe and analyse a multi-
modal text. However, not every single system needs to be accounted for in
an analysis; rather, the model is to serve our purpose of understanding how
meaning is made in a multimodal text through the choices which have been
made in the text.
Despite these possible weaknesses, a categorical framework for the
analysis of a multimodal text that pays attention to the meaning made on
the Expression plane as well as on the Space of Integration is helpful.
The IMM may be likened to a neat (although at this stage underequipped)
toolbox. The toolbox contains concepts and a theoretical meta-language to
describe and account for phenomena which arise in the multimodal con-
struction of meaning. Just as one does not use all the equipment in a
toolbox in any one instance, the analyst selects the tools most useful for the
analysis of the text. However, it is also realized that the IMM and the
accompanying conceptual apparatus are provisional and exploratory.
There remains much work to be done in the theory and practice of multi-
modal analysis.
Acknowledgements
Plate 9.1 and Picture A in Plate 9.2 are reproduced with kind permission
from SNP Panpac Pte Ltd, Singapore from the children's picturebook
Dominic Duck Goes to School (2000) written by Maeli Wong and illustrated by
Don Low. I thank Zuraidah Jaffar for generously waiving the copyright fees
for reproducing these pictures.
Picture B and Picture E in Plate 9.2 are reproduced from When Sheep
Cannot Sleep written and illustrated by Satoshi Kitamura with kind permis-
sion from Andersen Press. Thanks also to Red Fox who currently publish the
paperback version of the book.
Picture C in Plate 9.2 is reproduced from the book Rhyming Round Singapore
(Yee, 1998), written by Patrick Yee, Kathleen Chia and Linda Gan, Girls'
Brigade, Singapore. Thanks to Linda Gan for kindly granting permission to
reproduce the picture.
Picture D in Plate 9.2 is reproduced from The Tidy-Up Race, and Picture F
in Plate 9.2 and Plate 9.3 are reproduced from Lightning and Thunder by
E. Sallustio, with kind permission from the Educational Publishing House
(Singapore). Special thanks to Margaret Tan for her assistance.
Plate 9.4 is reproduced from the website http://www.hearts-on-fire.com
with kind permission from 'Hearts on Fire - The World's Most Perfectly Cut
Diamond'.
References
Baldry, A. P. (2000) (ed.) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. (this volume). Phase and transition type and instance: patterns in media
texts as seen through a multimodal concordancer, 83-108.
Baldry, A. P. and Thibault, P. J. (2001) Towards Multimodal Corpora. In G. Aston
and L. Burnard (eds), Corpora in the Description and Teaching of English, Bologna:
GLUEB, 87-102.
Baldry, A. P. and Thibault, P. (forthcoming) Multimodal Transcription and Text.
London: Equinox.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. London: Fontana, 32-51.
Bohle, R. (1990) Publication Design for Editors. New Jersey: Prentice-Hall.
Bordwell, D. and Thompson, K. (1997) Film Art: An Introduction. New York:
McGraw-Hill.
Carroll, N. (2001) Beyond Aesthetics: Philosophical Essays. Cambridge: Cambridge Uni-
versity Press.