Information Design Journal 14(3), 207–230
© 2006 John Benjamins Publishing Company

Kenneth Kong
A taxonomy of the discourse relations between words and visuals
Keywords: multimodality, visual analysis, word-image interface, systemic-functional linguistics, rhetorical structure

A document is mainly composed of words and images, but the complex relationship that binds these two completely different semiotic resources is usually taken for granted as transparent. The simple relations between words and images – ‘anchorage’ and ‘relay’, identified by Barthes almost 30 years ago – are unable to deal with the complexity of their bond, made even more complex by current printing and computer technology. This paper aims to identify the potential relations that bind texts and images together by arguing for a multilevel description of their logico-semantic relationships. The multiple, evaluative and metaphorical functions of the relations will also be discussed. The data generated from the proposed framework can form an empirical corpus for quantitative analysis. Examples from a variety of sources will be used to show how the framework can be operationalized.

‘What is the use of a book’, thought Alice, ‘without pictures and conversation?’
Alice’s Adventures in Wonderland (Lewis Carroll)

‘Of course the abstract idea must be occasionally explained – paraphrased, as it were – by the aid of pictures; but discreetly, cum grano Salis’
(Arthur Schopenhauer)

Introduction

Schopenhauer’s cautionary statement about the use of pictures is rarely heeded these days, particularly because the use of visuals has almost been stretched to its limit through current technology and the popularity of the Internet. Visuals are seen as essential to the communication of messages, at least from the perspective of web-based communication. Of course, visuals are not new; they were used long before the invention of words in any language. But it has been through the Internet that the use of visuals has reached a new stage of development in which our past concepts and notions of communication seem outdated or even irrelevant. It is a mistake, however, to assume that visual language has replaced or will replace verbal language as the prime means of communication. The history of verbal languages has proved their importance and adaptability, despite threats posed by other semiotic means, including visuals. While it is a prime concern to explore the many possibilities of visuals as semiotic tools in new media, it is equally important to understand how visuals can work with other semiotic elements, particularly verbal elements, in communicating messages. As Barthes (1978, p. 195) argues:

the two structures (words and images) of the message each occupy their own defined spaces, these being contiguous but not ‘homogenized’, as they are for example in the rebus which fuses words and images in a single line of reading.

Although there has been an increasing amount of work done on the use of visuals in a general sense (cf. Kress and van Leeuwen, 1996; Oyama, 1999; van Leeuwen & Jewitt, 2000), there has been a lack of systematic attention paid to the links between visual and linguistic elements. Scholars have advocated that visual literacy should not be seen as an obstacle to written literacy; instead, they are complementary to each other (Kress and van Leeuwen, 1996; Still, 2001; Stroupe, 2000; Walker, 2001). Nevertheless, recent studies of verbal-visual connections have been slanted towards linguistics or visual communication, without fully integrating the insights of both, and there remains an unsolved question of how the two important codes connect and relate to each other.

To trace the study of verbal-visual connections, we need to revisit Roland Barthes’ Elements of Semiology (1967), which argues that verbal text is the central anchorage of human meanings and perceptions. Barthes identifies the relationship between verbal and visual codes along two dimensions – elaboration and extension. Verbal codes can elaborate or extend visual elements. Elaboration can come in two forms – either language coming first and visuals coming afterwards to become what is traditionally labelled ‘illustration’, or visuals coming first and language afterwards, which is known as ‘anchorage’. A verbal code can also extend or add new information to the visual code, which Barthes labels ‘relay’.

Barthes’ notions of ‘anchorage’ and ‘relay’ are important because they touch on the heart of the issue, i.e. how the two important semiotic tools converge. However, Barthes presupposes linearity and does not allow the reality that any code, whether visual or verbal, can anchor or relay the other (van Leeuwen, 2005). This is not surprising because Barthes reveals that his focus is on images by suggesting a historical reversal in which ‘the image no longer illustrates the words; it is now the words which, structurally, are parasitic on the image’ (Barthes, 1978, p. 204). In his terms, images are polysemous – unstable and subject to interpretation – and words are used to fix the ambiguous meaning of images.

The three relations (‘anchorage’, ‘relay’ and the traditional illustrative function of visuals to verbal language) are not sufficient to deal with the complex relationship between words and images, particularly in web-based texts. Although there has been a revolutionary shift from the prime focus on written
language to the mixed mode of written and visual language, we cannot assume that the complex phenomenon can be understood by the existing theories of language or visuals alone. As Kress (1998, p. 65) points out, ‘we seem to have a new code of writing and image [author’s italics], in which information is carried differentially by the two modes… simpler syntax does not mean that the text – the verbal and visual elements together – is less complex’. In fact, Bonsiepe made a similar point more than 30 years back by arguing that a visual/verbal rhetorical figure is a combination of two types of signs whose effectiveness in communication depends on the tension between their semantic characteristics. The signs no longer add up, but rather operate in cumulative reciprocal relations. (1966, p. 171)

While it is possible to borrow insights from linguistics and visual design, it is too risky to take these for granted and apply them to either mode without careful consideration. Visuals have taken up some of the functions that written language used to perform, and written language is now mainly used for reporting and narrating (Kress, 1998); the ‘display’ function of visuals therefore needs to be explored, especially in relation to the narrating and reporting function of words. The most important issue in the research of multimodal documents is whether they can communicate meanings that traditional documents cannot (see Kaltenbacher, 2004). In other words, we need a more systematic and sophisticated network that can allow practitioners and analysts to understand the increasingly complicated verbal-visual connections.

Word-image connections

Visuals have been under close examination in the diverse discipline traditionally known as semiotics, which is ‘concerned with everything that can be taken as a sign’ (Eco, 1976, p. 6). A sign can include an image, gesture, sound and of course a word. One of the most important concepts in semiotics is the distinction between signifier (the sign itself) and signified (the referent) (Saussure, 1974), which implies that signs are arbitrary entities that vary from culture to culture. Another important distinction is between iconic and symbolic signs (Peirce, 1931–58). Iconic signs are the direct representations of the referents whereas symbolic signs make use of another image to make reference to the referent. Iconic signs are less subject to convention but may have more constraints in terms of use; on the other hand, symbolic signs are more subject to convention but have fewer constraints in terms of use and manipulation. One of the most salient manipulative functions of signs is to persuade people to take action, or, in brief, signs can have a persuasive function. Although rhetoric has been mainly directed towards the study of words, it has been argued that visuals can perform similar persuasive functions. The focus on the rhetorical/persuasive dimension of signs has been labeled ‘visual rhetoric’ (Bonsiepe, 1966; Marcus, 1979). For example, Marcus (1979) argued that many classical rhetorical devices (such as devices building in climax and devices of intensification) can be applied to images. Verbal and visual signs cannot be isolated when studying their persuasive functions, because, as Bonsiepe (1966) noted decades ago, there is a strong correlation of words and images in (for example) advertisements. This correlation has intensified in the age of the Internet.

As mentioned earlier, recent studies of the connections between words and visuals have focused on the supporting functions of language for visuals and vice versa, without fully considering the interactive interplay between the two. A good example of studies that focus on the supportive functions of visuals to body text would be Pegg’s (2002) classification of images into ancillary, correlative or substantive types:

[Ancillary images are] usually placed in the neighbourhood of relevant text (except in the case of title-page illustrations, front-pieces, and thematic colophons). The exact relationship of the image to the text however is left for the reader to determine. (p. 170)

Correlative illustrations are usually associated with technical and scientific discourse and are characterised by keying illustrations to the text in a variety of ways. (p. 172)

In […] substantive (illustrations), there is no need for the reader to build bridges between image and text, as with ancillary illustrations, nor for a writer to construct elaborate and misunderstanding-prone bridges out of callout lines, numbering, or labels, as with correlative illustrations. If words (or numbers) used with substantive illustrations are laid out two-dimensionally on a page so as to display structurally the relationships between them, then text and image are one. (p. 174)

By focusing on how images can ‘illustrate’ the verbal content, Pegg’s study can only capture the degree of integration between words and images. While this is an important index of connection, it does not explain the semantic relations that bind the two. In the same vein, Eckkrammer (2004) identified four relations in word-image connections: transmedial relationships (such as photos in novels), multimedial discourse (such as book illustrations), mixed discourse (such as comics) and syncretic discourse (such as visual poetry). Again with a focus on the degree of connections, this does not touch on how the connections are made possible. Similarly, Royce’s study (2002) of the intersemiotic systems of words and visuals in academic texts can only partly identify the semantic relationships between words and images. Drawing on Halliday and Hasan’s ideas of lexical cohesion (1976), Royce argued that visual and linguistic elements can be realized through similarity relations (such as synonymy in verbal codes), opposition relations (such as antonymy in verbal codes), class:sub-class relations (such as hyponymy in verbal codes) and expectancy relations (such as collocation in verbal codes). While Royce is able to identify some of the important word-image relations, the four types of cohesion are not adequate in capturing the complex relationship. Indeed, the four ‘cohesive devices’ identified by Halliday and Hasan are superficial links that connect words and clauses in verbal texts. There are more ‘subtle’ or ‘external’ links that combine units in a text, which are known as coherence in discourse analysis. For example, relations such as ‘explanation’, ‘sequence’ and ‘cause’ cannot be adequately dealt with by cohesive devices even though they may contribute to the overall coherence of a text.

Through a review of 24 previous studies, Marsh and White (2003) could identify 46 relations between image and text, which they then classified according to their degree of integration or closeness as (a) functions that have little relation to the text, such as those that decorate or elicit emotion, (b) functions that have a close relation to text, such as those that reiterate or exemplify, and (c) functions that go beyond the text, such as those that interpret, develop and contrast. Although this taxonomy was based on extensive review, a large number of labels are either overlapping or difficult to apply in analysis. For example, relations such as ‘control’, ‘relate’, ‘sample’, ‘motivate’ and ‘translate’ are ambiguous terms that need to be elaborated and re-defined. However, Marsh and White highlight the importance of close integration between words and images. An image merely placed beside text cannot make the connection clear; what is needed is a framework that can help analysts or document designers to understand the complex relationship between the two.

Amongst recent studies, Horn’s idea of ‘visual language’ (1998) is closer to providing an understanding of this relationship. Although visual language can be analyzed from linguistic perspectives, it ‘has
distinct properties that make it different from natural languages of words and from purely artistic languages. It has a more complex syntax and requires more diverse and complex analysis’ (p. 13). Horn classifies the relations between words and images along three main dimensions: semantic categories, classical rhetorical devices and temporal relations. Semantic categories refer to the linking of two modes through their potential meanings and include relations such as substitution, disambiguation, labelling, example, reinforcement and completion. Classical rhetorical devices are very similar to the cohesive devices of Halliday and Hasan (1976) and include synecdoche (part-whole relation), metonymy (association relation) and metaphor (suggestion of analogy or likeness). Temporal relations are ‘different sets of relationships between verbal and visual elements as seen in … process communications [rather than] in static displays’. Although Horn’s framework offers important insights into the ways in which words and images work together to create meanings, its classifications can be refined and expanded. For example, the ‘metaphor’ relation can be an inherent characteristic of any relation and is not necessarily an individual relation. In other words, a multi-layered framework that can consider the multiple functions of relations is a better solution.

Delin et al. (2003), in an attempt to provide a multi-layered framework for studying multimodal documents, argue that a detailed analysis of a document should take into consideration at least five levels: content structure, rhetorical structure, layout structure, navigation structure and linguistic structure. The rhetorical structure, i.e. ‘how the content is argued’ (p. 56), is most relevant to the issue of verbal-visual connections, but their framework is based on the linguistic model of Mann and Thompson (1988) without modifications. Besides, the linguistic structure should ideally be incorporated into the rhetorical analysis because no information can be without rhetoric; language without rhetoric is ‘a pipe-dream’ (Bonsiepe, 1966). Multimodality and its relevance in different disciplines such as speech synthesis and automation, medicine, hypertext design, education and the entertainment industry has been comprehensively reviewed by Kaltenbacher (2004), who also argued that many such studies ‘lack empirical evidence to support many of the claims made’ (p. 202) and that a corpus (a larger compilation of data) is an ideal solution to this problem. This paper can be regarded as an important step towards a more systematic and quantitative approach to the issue.

The proposed taxonomy

Logico-semantic relations

The two text-image relations identified by Barthes, ‘anchorage’ and ‘relay’, which come very close to what Halliday (1994) calls ‘elaboration’ and ‘extension’, are two of the many ‘logico-semantic relations ... which may hold between a primary and a secondary member of a clause nexus’ (p. 219). Although the idea is based on language clauses, the framework is equally applicable to verbal-visual connections, although with modifications. There are two categories of relations in Halliday’s framework: expansion and projection. Expansion refers to how a unit expands the other by extension and elaboration, and projection refers to how a unit projects another unit, which can be a locution or an idea. The relation of projection is more straightforward, and is usually found where the drawing of a character (a visual element) may have a projected speech or thought (a linguistic element). The relationship is usually signaled by a connecting line between two units and the placing of the projected speech or thought inside a balloon. The expanded meaning is more complex and will receive more treatment below. A third category – decoration – will also be added to the framework.

The idea of expansion is useful when applied to word-image connections. The three types of expansion
can be compared to elaborating an existing building, extending a building and enhancing the environment.

In the case of elaboration, one unit (a clause, in Halliday’s sense) ‘elaborates on the meaning of another by further specifying or describing it’ (p. 225). The new unit ‘does not introduce a new element into the picture but rather provides a further characterization of one that is already there, restating it, clarifying it, or adding a descriptive attribute or comment’ (p. 225). There are three sub-categories under elaboration: exposition, exemplification and clarification. ‘Exposition’ is equivalent to ‘in other words’ in linguistic terms. ‘Exemplification’ is equivalent to ‘for example’ and ‘Clarification’ is equivalent to ‘to be precise’. I tend to use terms that are user-friendly, and replace ‘exposition’ and ‘clarification’ with ‘explanation’ and ‘specification’, following van Leeuwen’s terminology (2005). Moreover, there is a relation that Halliday neglects but which is extremely important in multimodal discourse. A word or text can be used to identify a particular image and vice versa. ‘Identification’ is an important function in many genres, such as academic textbooks, technical manuals and travel guidebooks. Hence, one unit can elaborate on another by explaining, exemplifying, specifying or identifying.

The second category of expansion is extension. Like an extension of a building, a unit with new meaning can be added to the original unit. Based on language data, there are two main ways of extending a unit – addition and variation (Halliday, 1994). In the case of addition, one unit is ‘adjoined to another; there is no implication of causal or temporal relationship between them’ (p. 230). In linguistic terms, this is like the use of the conjunction ‘and’. In the framework proposed here, the term ‘collection’ will be used instead, to underscore the constitutive function of semiotic resources. In the case of variation, one unit is regarded as ‘being in total or partial replacement of another’, which is similar to the function of ‘comparing and contrasting’. Although collection and variation are important in verbal texts, there are other equally important meanings present in word-image extension but missing in Halliday’s framework. The first of these is the idea of sequentiality. In Halliday’s model, sequence is classified as a relation of enhancement (which will be discussed below) of a main unit. In other words, the unit with the meaning of sequence is subordinated to another unit by qualifying it. In the word-image world, both pictures and words can convey ideas in chronological ways, without one unit being subordinated to another. A good example is comic strips, which are composed of spatially arranged panels that may contain words or pictures only. These units are self-sufficient and constitute the comic as a whole, and cannot be regarded as subordinate in any sense.

Another important meaning of extension in word-image relationships is alternation, which is different from variation in the sense that one of the entities can completely replace another without losing any meaning. Response, as proposed by Grimes (1975), is embedded in the mode of question and answer. Although derived from another unit, the ‘response’ unit also has its own information.

The third category of expansion is enhancement (Halliday, 1994), which tries to expand a main idea unit by specifying circumstances. This is based on the idea of symmetry of relations, which will be further elaborated below. Basically the enhancement unit qualifies another unit by specifying time, purpose, condition, goal etc.

These three categories of expansion are significant in linking words and images, as is a fourth connection, which is more diffuse but less constrained by convention and more subject to interpretation. This is the decorative function of images. Although words can also do the same to images, this is less likely because we seldom relate words to decorative functions when we see words and pictures together. Although this function is subjective, it is important enough to be labeled as another relation because while pictures can

Types of relation
    Expansion
        Elaboration (no new information)
            Explanation (in other words)
            Exemplification (for example)
            Specification (to be precise)
            Identification (namely)
        Extension (new information)
            Collection
            Variation
            Sequence
            Response
            Alternation
        Enhancement (new information by specifying circumstances)
            Spatio-temporal
            Manner
            Cause
            Effect
            Condition
            Means
            Purpose
            Justification
            Concession
            Motivation
            Enablement
            Restatement
            Summary
    Projection (new information usually in linguistic forms)
        Speech
        Thought
    Decoration (new but omissible information)
        Various forms and types

Figure 1.  Logico-semantic relations between words and images
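
For readers who wish to use the hierarchy in Figure 1 as a coding scheme when building the kind of corpus mentioned in the abstract, the taxonomy can be written down as a nested data structure. The Python sketch below is illustrative only and is not part of the original framework: the names TAXONOMY, classify and tally are hypothetical, and the annotations in the usage lines are invented for demonstration.

```python
from collections import Counter

# Figure 1 as a nested mapping: category -> subcategory -> leaf relations.
# Projection and Decoration have no intermediate level in the figure, so
# each is given a single subcategory that repeats its own name.
TAXONOMY = {
    "Expansion": {
        "Elaboration": [
            "Explanation", "Exemplification", "Specification", "Identification",
        ],
        "Extension": [
            "Collection", "Variation", "Sequence", "Response", "Alternation",
        ],
        "Enhancement": [
            "Spatio-temporal", "Manner", "Cause", "Effect", "Condition",
            "Means", "Purpose", "Justification", "Concession",
            "Motivation", "Enablement", "Restatement", "Summary",
        ],
    },
    "Projection": {"Projection": ["Speech", "Thought"]},
    "Decoration": {"Decoration": ["Decoration"]},
}


def classify(relation):
    """Return (category, subcategory) for a leaf relation label."""
    for category, subcategories in TAXONOMY.items():
        for subcategory, relations in subcategories.items():
            if relation in relations:
                return category, subcategory
    raise ValueError(f"unknown relation: {relation}")


def tally(annotations):
    """Count a list of annotated leaf relations by subcategory."""
    return Counter(classify(relation)[1] for relation in annotations)


# Invented annotations for one hypothetical document.
sample = ["Identification", "Sequence", "Identification", "Speech"]
print(classify("Identification"))  # ('Expansion', 'Elaboration')
print(tally(sample)["Elaboration"])  # 2
```

Keeping the leaf labels as plain strings means an annotator's spreadsheet column can be tallied directly; a stricter scheme might use enumerations instead.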


simply decorate accompanying messages, they can also ‘elicit emotion-laden reactions that may precede cognitive awareness and influence interpretation of messages’ (Richards and David, 2005, p. 31). Consequently, decoration, together with our existing two main relationship binders – expansion and projection – constitutes what I mean by the three logico-semantic relations of words and images.

Symmetry/Asymmetry of relations

The symmetrical relations of verbal language have attracted enormous attention in linguistics (cf. Mann and Thompson, 1988). Dealing with the schematic structure of language, studies in textual relations have always had strong implications for pedagogy (Bhatia, 1993) and computer-based text generation (Mann and Thompson, 1988). One very useful notion of textual linguistics that can be applied to word-image relations is that of unit hierarchy, namely that relations which bind units together may or may not be of equal status. There can be at least three possibilities. First, a unit can be subordinate to the main unit. This is what linguists call a hypotactic relationship or hypotaxis¹:

Although he woke up late, he could still catch the bus.

A unit subordinate to another unit is found in the logico-semantic relations having supporting functions, such as elaboration, enhancement and decoration.

Second, two units can be of equal status and linked in coordinate fashion:

Mary was a teacher and her husband was a nurse.

This is known as a paratactic relationship or parataxis, and is found in logico-semantic relations which are of equal standing, without one modifying the other. The logico-semantic relations of extension and projection are good examples of this type of relationship. Although this is a useful classification, there are occasions in which these relationships do not work. This is why Grimes (1975) proposed a third type of relationship that can take either form. Grimes argued that this relationship, known as the ‘neutral predicate’, was most common in discourse. The best examples of neutral relations are collection and sequence, in which a unit may take a dominant role (with a bigger size or a more central position). In fact, I would like to go a step further by arguing that all relations can be neutral, depending on the layout, positioning (to be discussed below) and intended purpose. Symmetrical and asymmetrical relations can be illustrated by the diagrammatic analogy of nucleus (main unit) and satellite (subordinate unit) (Mann and Thompson, 1988). The two basic relations also give rise to other possible combinations (see Figure 2).

More examples of how these can be applied to word-image relations can be found in Section 3.

Spatial arrangement of relations

Kress and van Leeuwen’s concept of ‘integrated text’ is particularly useful in understanding how the arrangement of relations can be linked to their discourse functions. As they argue, verbal and visual codes should not be seen as separate; instead they should be ‘looked upon as interacting with [and] affecting one another’ (1996, p. 183). This view of integrated text allows us to see how the ‘representational’ and ‘interactive’ meanings are interwoven into three interrelated systems:

Information value: The placement of elements … endows them with the specific informational values attached to the various zones of the image: left and right, top and bottom, center and margin.

Salience: The elements are made to attract the viewer’s attention to different degrees, as realized by such factors as placement in the foreground or background, relative size, contrasts in tonal value (or colour), difference in sharpness, etc.

Figure 2.  Diagrammatic illustrations of various symmetrical and asymmetrical patterns


Framing: The presence or absence of framing devices (realized by elements which create dividing lines, or by actual frame lines) disconnects or connects elements of the image, signifying that they belong or do not belong together in some sense. (p. 183)

In other words, information value is linked to the relationship between how information is spatially arranged and its inherent meanings and implications. Salience and framing are related to how information is highlighted and how it is divided. Since these three elements are important in creating text coherence through combining and textualising semiotic resources, they are treated under the ‘textual’ meta-function in Halliday’s systemic-functional framework. Two other meta-functions are ‘ideational’ and ‘interpersonal’, which will be dealt with in detail below.

Particularly relevant to our discussion here is the idea of information value, which pays attention to how different verbal and visual elements can be arranged in such a way that specific meanings can be created. As all document designers know, the arrangement of elements in a document is more than a random choice, and reflects rather complicated cultural and individual expectations. Kress and van Leeuwen touch on the issue by identifying what is regarded as ‘given’ on the left-hand side and what is regarded as ‘new’ on the right-hand side, following how people read English sentences: ‘for something to be Given means it is presented as something the viewer already knows, as a familiar and agreed-upon … For something to be New means that it is presented as something the viewer must pay special attention’ (1996, p. 187). Other specific meanings may be created by other spatial arrangements. Centre position usually denotes a more important role of the unit compared with the marginal position, the top position means ‘ideal’, and the bottom position is usually related to more real and down-to-earth particularities. This is why attractive models (which take up most of the space) are always put in the central or ‘ideal’ position in an advertisement and disclaimers (which aim at reducing legal responsibilities) are placed at the very bottom.

Figure 3.  Example of a prototypical left-right and given-new arrangement
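
The zone meanings described above (left as Given, right as New, top as ideal, bottom as real, centre as more important than margin) can be made explicit for annotation purposes. The Python sketch below is a rough illustration under stated assumptions and is not Kress and van Leeuwen's procedure: the function name information_value, the use of normalised page coordinates, the midpoint split at 0.5 and the centre_radius default of 0.25 are all hypothetical choices, and the left-right reading assumes a script read from left to right.

```python
def information_value(x, y, centre_radius=0.25):
    """Label an element by the position of its centre on a page.

    x runs from 0.0 (left edge) to 1.0 (right edge) and y runs from
    0.0 (top edge) to 1.0 (bottom edge). The cut-off values are
    illustrative assumptions, not figures taken from the literature.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("coordinates must lie between 0.0 and 1.0")
    horizontal = "Given" if x < 0.5 else "New"
    vertical = "Ideal" if y < 0.5 else "Real"
    # An element counts as central when it lies within centre_radius
    # of the page midpoint on both axes.
    central = abs(x - 0.5) <= centre_radius and abs(y - 0.5) <= centre_radius
    return {
        "horizontal": horizontal,
        "vertical": vertical,
        "prominence": "Centre" if central else "Margin",
    }


# A disclaimer near the bottom-left corner of an advertisement.
print(information_value(0.2, 0.9))
# {'horizontal': 'Given', 'vertical': 'Real', 'prominence': 'Margin'}
```

Such labels only record where an element sits; whether a given layout actually follows or violates the conventional pattern still has to be judged by the analyst.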


It is certainly useful to apply these concepts in analyzing the different logico-semantic and symmetrical relations in a document. For example, the ‘identification’ relation, as a subordinate unit, is usually put on the right hand side (as NEW information²), and what is being identified is put on the left hand side (as GIVEN information).

In other words, what is designated as GIVEN on the left hand side usually coincides with the NUCLEUS element (the key element), whereas what is presented as NEW is closely related to the SATELLITE unit (the subordinate unit) on the right hand side. This is basically the same as linguistic information.

He arrived in Hong Kong      when he was 12.
Nucleus                      Satellite (Enhancement: Time)
Given/Familiar               New/Unfamiliar

There are, in fact, empirical findings that support these observations. Biber et al. (2002) argue, based on a large computerized language corpus, that most satellite clauses are placed after nucleus clauses and that only about 30% of satellite clauses are positioned at the beginning of a sentence or in the medial position. There are at least two reasons for this deviation. Firstly, satellite clauses can function as the ‘bridge’ between previous and subsequent discourse (Givon, 2000). Interestingly, this principle can be equally applicable to the relationship between verbal and visual elements. In the diagram below, the identification satellite in the middle is put on the left hand side of the diagram being illustrated, instead of the usual right position. This is mainly because the function of the identification satellite is to serve as a link between the two panels (those of the recorder and the remote control) to show the corresponding buttons. Secondly, from a linguistic point of view, in addition to their coherence function,

Satellite: Identification (New/Unfamiliar Information)

Figure 4.  An example in which the given-new pattern is violated


pre-posed satellite clauses have been shown to have significant interpersonal functions of mitigating or softening requests that are put after the satellite.

Another important factor in word-image integration in terms of spatial arrangement is page layout. As Bateman et al. (2004) point out, the relevant unit of analysis may not be the page in its entirety. The example they use is newspapers, in which spaces have been allocated as so-called ‘newsholes’ (Lie, 1991), the areas for advertising and other fixed elements, and where another important issue is whether the combination of visuals and words can give a balanced layout from an aesthetic point of view. In other words, the GIVEN-NEW sequence or the NUCLEUS-SATELLITE sequence may not apply. In some cases the opposite may even be true. Hence, the interplay of conventionality and other factors such as coherence, interpersonal, production and aesthetic considerations should be further explored in the relationship between visuals and words, as visual and verbal elements may clash and contradict each other (Kress and van Leeuwen, 1998). Because directionality is ‘now a variable, a matter of choice in many genres’ (Kress and van Leeuwen, 2001, p. 125), it is important to examine how and why document designers violate conventional patterns to create new meanings.

Meta-functions of relations

From systemic functional perspectives, language has three simultaneous functions: its ideational, interpersonal and textual meta-functions. The ideational meta-function is its role in giving and presenting information; the interpersonal meta-function is its role in negotiating and regulating human relationships; and the textual meta-function of language is its power to interweave ideational and interpersonal resources into a coherent piece. These functions can be applied to texts

Ideational
    Expansion:
        Elaboration (explanation, exemplification, specification, identification)
        Extension (collection, variation, alternation, response)
        Enhancement (spatio-temporal, manner)
    Projection
Interpersonal
    Expansion:
        Enhancement (justification, motivation, concession, enablement, cause, effect, condition, means, purpose)
    Decoration
Textual
    Expansion:
        Enhancement (restatement, summary)

Figure 5.  Meta-functions of word-image relations

verbal texts only, the relations between words and images can be classified into these three important functions.

Projection and most of the expansion relations are ideational because their main functions are to present, clarify or emphasize information. However, some of the expansion relations, such as justification and motivation, are mainly important in negotiating stance with readers or an audience. Decoration is an obvious example that has no explicit ideational value and is there mainly to catch audience attention or for aesthetic purposes. Some of the expansion relations, such as summary, are related to organizing messages and making a text more coherent. In other words, all relations can be assigned a value in terms of their semantic and pragmatic functions. However, these three meta-functions are not meant to be clear-cut: on many occasions more than one function is at stake. What is
and images individually, as well as their relationships. important is our awareness of the functional value that
Following Mann and Thompson (1988), who deal with may exist between images and words.

Evaluative nature of relations

Some relations may have evaluative functions in addition to the logico-semantic ones outlined above. Evaluation is closely related to the ideational and interpersonal meta-functions mentioned previously. It is related to the ideational meta-function because evaluation is used to give additional information about something in question,3 and to the interpersonal meta-function because firstly evaluation can mitigate the degree of certainty in a proposition, as in the case of a hedge (the epistemic nature of evaluation), and secondly it can influence a reader’s attitude towards ideational content (the attitudinal nature of evaluation). Evaluation, hence, is closely related to the function of persuasion, which has been argued to be an important issue in effective document design (Fogg et al., 2001). The following shows how a visual can be anchored or elaborated by words through identification (‘Beloselsky-Belozersky Palace facade’) and at the same time its embedded evaluation (‘distinctive’).

Figure 6. An example of an evaluative relation between a picture and words
[Source: Eastern Europe (2001). Lonely Planet Press]

Evaluation is a focus in linguistic studies (see Hunston and Thompson, 2000) and recently in the study of visuals in political cartoons (Lemke, in progress), webpages (Lemke, 2002) and in a more general manner (Kress and van Leeuwen, 2001). Appraisal theory (Martin, 1992, 2000; Martin and White, 2005; White, 2001) is one of the recent attempts to study evaluation. The theory identifies three main elements of evaluation: affect, judgment and appreciation. The framework avoids a simplistic treatment of evaluation and considers a large range of evaluative resources. Another strength is its consideration of the interface of discourse and finer linguistic resources. However, the framework has some ambiguous and overlapping classifications which should be subject to fine-tuning in future. Another framework was developed by Greenbaum (1969) and Lemke (1998) and has been recently modified to study evaluation more holistically (Kong, forthcoming). The framework provides a tidier treatment of evaluation and is more suitable for studying the connections between words and images. In the model, a total of 7 evaluative categories are identified and divided into two large groups: epistemic and attitudinal (see Figure 7 below). Epistemic evaluative categories express doubt or certainty about a proposition or an idea, whereas attitudinal evaluative items tend to express more personal judgment about what is said or shown in a picture.4 Warrantability and comprehensibility can be regarded as belonging to the first group because they are both related to the degree of conviction or doubt. For example, warrantability items in linguistic mode such as ‘definitely’ and ‘undoubtedly’ express conviction, and warrantability items such as ‘possibly’ and ‘allegedly’ express a certain degree of doubt. While comprehensibility is also related to the expression of epistemic elements, it has an additional element of ‘observation or perception of a state of affairs’ (Greenbaum 1969, p. 204). Again comprehensibility items can be used to express conviction or some degree of doubt. For example: expressions such as ‘evidently’
and ‘obviously’ tend to express the writer’s conviction, whereas words such as ‘seemingly’ and ‘apparently’ express some degree of doubt.

Evaluative type
    Epistemic
        Warrantability (Doubt or certainty)
        Comprehensibility (Observation or Perception in addition to doubt or certainty)
    Attitudinal
        Desirability
        Normativity
        Usuality
        Importance
        Humorousness

Figure 7. Classification of evaluative types

Other categories are expressions of more personal judgment about a statement or visual. Desirability assesses the extent to which something causes satisfaction or otherwise. Usuality expresses an attitude that something is expected or unexpected. Normativity assesses the appropriateness of an entity. Lastly, humorousness assesses the extent to which something can cause entertainment or has a humorous effect. The above figure summarizes the semantic relationships of the evaluative categories.

Metaphorical nature of relations

Like evaluation, metaphor may not be an inherent feature of all word-image relations but is important in various genres. To define it briefly, metaphor is a figurative expression that is transferred from one semantic domain to another. ‘Metaphor’ here includes all kinds of figures of speech such as similes, metonymy, synecdoche, hyperbole, and apostrophe, since the term ‘metaphor’ has been so widespread that it can be used as an umbrella term for all related terms (Chandler, 2004).

Metaphor has been studied mainly from cognitive and pragmatic perspectives, focusing on verbal data alone. The most notable exception is Marcus (1998) who studied the use of visual metaphors in user-interface design in computer documentation. The cognitive approach to metaphors argues that metaphors reflect not only superficial linguistic patterns, but also the existence of certain conceptual patterns in our minds (Lakoff and Johnson, 1980; Lakoff and Turner, 1989). This can be illustrated by the way cognitive linguists study metaphors. They identify the linguistic resources that are used to express a metaphor with small letters. For example, the metaphorical expression ‘I am quite rusty in Spanish’ is represented by small letters. This is known as a linguistic metaphor. When cognitive linguists refer to the conceptual frame that underlies the linguistic metaphor, which is known as a conceptual metaphor, they will use capital letters such as ‘MIND IS MACHINE’ (A = B). ‘A’ denotes the target domain (mind) and ‘B’ is the source domain (machine).

Metaphors have been frequently used to convince and persuade others. As Gibbs and Gerring (1989, p. 156) argue, metaphors play a crucial role in maintaining social and interpersonal relationships through the common ground of the speaker and the listener. The listener has
to work out the reasons why certain metaphors are used and the conceptual framework underlying them. However, this cannot explain why certain metaphors are used more frequently than others in certain situations, which is why a pragmatic approach has been advocated. As Charteris-Black (2004, p. 11) argues,

“the only experiment of metaphor motivation [from a cognitive perspective] is with reference to an underlying experiential basis. This assumes that metaphor is an unconscious reflex, whereas a pragmatic view argues that speakers use metaphors to persuade by combining the cognitive and linguistic resources at their disposal”.

Again, previous studies on linguistic metaphors can be fruitfully applied to the study of visual metaphors. By focusing on printed advertisements, Forceville (1996) identified four types of pictorial metaphors: those with one pictorially present term, those with two pictorially present terms, pictorial similes and verbo-pictorial metaphors. Obviously, the last category is most relevant to the discussion here. According to Forceville, in verbo-pictorial synthesis, one of the domains (target or source) is realised verbally, and the other is realised pictorially. The removal of either element may result in the disappearance of a metaphorical relationship. Focusing on the use of metaphors in computer documentation, Marcus (1998) argues that metaphors are a significant component of user-interface designs (for example, the use of a trash bin to represent a file folder for discarded computer files) because metaphors can increase the level of familiarity of new concepts to readers and can also increase ease of learning, memorization and use. Horn (1998) points out that a metaphorical relationship between words and images can be elaborated by words at a higher level. For example, a picture that shows a plane taking off with the label ‘the Bradford project’ is a metaphorical relationship, which is anchored by a speaker’s sentence with a similar metaphorical meaning: ‘this project is really taking off’.

Figure 8. An example of metaphorical relation (Adapted from Horn, 1998, p. 119) ©Robert Horn. Reproduced with permission.

This figure can be illustrated diagrammatically as follows, together with the concepts of logico-semantic relations, symmetry/asymmetry and nucleus-satellite pair.

Specification
    Picture: Plane and the label
        Identification (Metaphorical)
            The Bradford Project (Words)
            Plane (Visual)
    Sentence: ‘This project is really taking off.’ (Inherently metaphorical)

In this diagram, the plane is the nucleus (the most important proposition in a message to which other more peripheral propositions are referred), not only
because it takes a central position in the picture but also because the sentence ‘this project is really taking off’ clarifies or specifies the metaphorical relationship. It should be noted that words and images may not have any metaphorical relationship binding them, but as with other classification values, metaphor can overlap with other values. In the example above, the relationship between the visual plane and the label (the Bradford Project) is metaphorical and linked by the logico-semantic relation of identification. The following chart summarises the different possible combinations of text-image relations or ‘network’ of relations in Halliday’s terms (1994):

Text-image relations
    Logico-semantic relations
        Expansion
        Projection
        Decoration
    Hierarchy
        Coordination (Nucleus-Nucleus)
        Subordination (Nucleus-Satellite)
    Information value
        Neutral
        Right
        Left
        Centre
        Top
        Bottom
    Meta-functions
        Ideational
        Interpersonal
        Textual
    Evaluative
        Evaluative
        Non-evaluative
    Metaphor
        Metaphorical
        Non-metaphorical

Figure 9. Network of relations

An example of analysis (full text in appendix)

This classification can be put into practice by examining an extract from a travel guide. Travel guides exhibit a range of styles, from content that is extremely packed with words without any pictures to content that is fully illustrated with images and pictures. This reflects the history of the genre – texts produced more than 10 years ago tend to be more word-based, whereas those produced in the last 5 years are usually illustrated with more pictures and images, although the degree of integration between words and images can vary considerably. The example selected puts words and images in a rather integrated fashion and is what Horn (1998) calls ‘visual language’.

The text is about a scenic spot in central India called Fatehpur Sikri. The text, excluding the page numbers and label at the top, can be divided into six blocks of information. The name of the place ‘Fatehpur Sikri’, as the smallest piece of information, is the first block (Block 1). The second block serves as an introduction and is located at the top left-hand corner, immediately underneath the title (Block 2). It consists of an image of ‘Fretwork jali’ and a paragraph that introduces readers to the history of the place. The most prominent aspect of this extract is obviously the drawing of the place together with a synthesis of words and photographs that branch out from the drawing (Block 3). The fourth block, squared into a box, is located at the top right-hand corner and is entitled ‘Visitors’ Checklist’ (Block 4). The last two blocks are located at the bottom on the right-hand side. One is about the sights that visitors should not miss (Block 5) and the other is a plan of Fatehpur Sikri (Block 6). For the sake of better organization, I will initially focus on the logico-semantic relations, the symmetrical status and the meta-functional distribution of these relations in the extract. I will then move on to the spatial arrangement of these relations, and finally the evaluative and metaphorical nature of a number of the relations.
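Because the framework is meant to yield data for an empirical corpus, it may help to see how a single word-image relation could be recorded along the six dimensions of Figure 9. The following is only a minimal sketch in Python: the class and field names (`RelationAnnotation`, `info_value`, `subtype` and so on) are assumptions made for illustration and are not part of the framework, while the category values are taken from the network of relations. The sample record encodes the Bradford example discussed earlier; its information value is a placeholder, since the position of the label is not described in the text.

```python
from dataclasses import dataclass
from enum import Enum


class LogicoSemantic(Enum):
    EXPANSION = "expansion"
    PROJECTION = "projection"
    DECORATION = "decoration"


class Hierarchy(Enum):
    COORDINATION = "nucleus-nucleus"
    SUBORDINATION = "nucleus-satellite"


class InfoValue(Enum):
    NEUTRAL = "neutral"
    RIGHT = "right"
    LEFT = "left"
    CENTRE = "centre"
    TOP = "top"
    BOTTOM = "bottom"


class MetaFunction(Enum):
    IDEATIONAL = "ideational"
    INTERPERSONAL = "interpersonal"
    TEXTUAL = "textual"


@dataclass(frozen=True)
class RelationAnnotation:
    """One annotated word-image relation (hypothetical record layout)."""
    nucleus: str
    satellite: str
    relation: LogicoSemantic
    subtype: str                  # e.g. "elaboration: identification"
    hierarchy: Hierarchy
    info_value: InfoValue         # position of the satellite
    meta_function: MetaFunction
    evaluative: bool = False
    metaphorical: bool = False


# The plane picture and its label 'the Bradford project':
# an identification relation that is also metaphorical.
bradford = RelationAnnotation(
    nucleus="plane (visual)",
    satellite="'the Bradford project' (words)",
    relation=LogicoSemantic.EXPANSION,
    subtype="elaboration: identification",
    hierarchy=Hierarchy.SUBORDINATION,
    info_value=InfoValue.NEUTRAL,  # placeholder: position not given in the text
    meta_function=MetaFunction.IDEATIONAL,
    metaphorical=True,
)

print(bradford.subtype, bradford.metaphorical)
```

Keeping the evaluative and metaphorical values as separate flags reflects the point that they are not inherent features of a relation but may overlap with any logico-semantic type.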

Logico-semantic relations and their hierarchical representation and meta-functional distribution

At the macro-level, the three-dimensional picture with its elaboration can be regarded as the nucleus of information. All other blocks are satellites to it. But what are their binding relationships? The first block is the IDENTIFICATION of the nucleus by a satellite. The second block, giving background historical information to the main block, MOTIVATES the readers to read the rest of the page by conveying the importance of the scenic spot. Of course, readers can skip this information and jump to any part of the extract, but that is the intended function because it is put in the top left-hand corner, orienting readers to it as the first piece of information by following the normal reading sequence from left to right in English. This also qualifies as an enhancement relation because it does not elaborate but adds new information to the nucleus. Similar to the MOTIVATION block, the fourth block, the ‘Visitors’ Checklist’, adds further information about the site (instead of simply elaborating), such as exact location, address and telephone number of the information centre, and other important information about the site. Hence, this qualifies as a MEANS relation, also under the category of enhancement. The last two blocks do not give new information, and mainly elaborate what is already there in the nucleus. The plan of Fatehpur Sikri EXPLAINS the spatiality of the site, whereas the ‘Star Sights’ block reinforces what is also highlighted in the nucleus by SPECIFYING which spot should not be missed. This also has an IDENTIFYING function by telling the reader exactly what a star sight is. The following diagram shows the logico-semantic relations found in the extract:

Figure 10. Logico-semantic relations in the extract



The foregoing analysis has only identified the nucleus and satellites of the larger blocks of information, and the logico-semantic relations that bind them together. Obviously each block of information has its own internal structure that is at the same time marked by a hierarchical layer (or layers) of relations. To explain how this internal hierarchical structure works, I will refer to the central block (Block 3) as an example. The three-dimensional map with a centrifugal elaboration of words and images can be seen as a ‘composite’ of different sub-blocks linked by COLLECTION, as a type of extension that adds new information. Each element in this composite is also supported by a number of IDENTIFICATION satellites, which label the individual spots in the place. What about these satellites in relation to the photographs and short paragraphs that go along with them? The photographs can be regarded as nuclei that invite readers to COMPARE (i.e. the VARIATION relation) them with the corresponding section in the map. The short paragraph that is usually placed underneath each photograph is its anchor or EXPLANATION. The complicated relationship of the main block (Block 3) can be illustrated as follows.

Central Block
    Extension: Collection
        Elaboration: Identification
            Elaboration: Explanation (between Text and Photograph)
            Extension: Compare (between Photograph and Corresponding part in the 3-D map)
        ……

Figure 11. Complicated relationship of the main block

The following table summarizes the distribution of all the relations found in the extract.

Meta-functions   Logico-semantic relations      Number   Percentage
Ideational       Elaboration: Identification    18       32.7%
                 Extension: Collection          12       21.8%
                 Elaboration: Explanation       9        16.4%
                 Extension: Variation           7        12.7%
                 Elaboration: Specification     5        9.1%
                 Sub-total                      51       92.7%
Interpersonal    Decoration                     2        3.6%
                 Enhancement: Motivation        1        1.8%
                 Enhancement: Means             1        1.8%
                 Sub-total                      4        7.3%
Textual                                         0
Total                                           55       100%

Figure 12. Distribution and percentage of logico-semantic relations
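Since the framework is intended to feed quantitative analysis, the percentages in the distribution table (Figure 12) can be recomputed directly from the raw counts as a consistency check. The sketch below is illustrative only: the counts are those reported in the table, while the names `COUNTS`, `percentage` and `distribution` are assumptions introduced here.

```python
# Counts per logico-semantic relation, grouped by meta-function,
# as reported in the distribution table.
COUNTS = {
    "Ideational": {
        "Elaboration: Identification": 18,
        "Extension: Collection": 12,
        "Elaboration: Explanation": 9,
        "Extension: Variation": 7,
        "Elaboration: Specification": 5,
    },
    "Interpersonal": {
        "Decoration": 2,
        "Enhancement: Motivation": 1,
        "Enhancement: Means": 1,
    },
    "Textual": {},
}


def percentage(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


def distribution(counts):
    """Return the grand total, per-relation percentages and per-meta-function subtotals."""
    total = sum(sum(relations.values()) for relations in counts.values())
    per_relation = {
        name: percentage(n, total)
        for relations in counts.values()
        for name, n in relations.items()
    }
    subtotals = {mf: sum(relations.values()) for mf, relations in counts.items()}
    return total, per_relation, subtotals


total, per_relation, subtotals = distribution(COUNTS)

for name, pct in per_relation.items():
    print(f"{name:30s} {pct:5.1f}%")
for mf, n in subtotals.items():
    print(f"{mf + ' (sub-total)':30s} {n:3d} {percentage(n, total):5.1f}%")
```

Storing the counts per meta-function keeps the sub-totals derivable from the data instead of being entered by hand, which is where transcription slips tend to occur.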

It is clear from the table above that most of the relations are elaborative, particularly that of IDENTIFICATION. The second most frequent relation is COLLECTION, under the category of extension. EXPLANATION is the third most frequent relation. This is not surprising at all because the main communicative purpose of travel guides is to identify where to find certain places and then explain why they are worth visiting. COLLECTION is frequently used to organize aspects of the explanation. COMPARISON is used in this extract because photographs are used to show images of the spot to increase the readability of the document. The less frequent relations found are SPECIFICATION, DECORATION, MOTIVATION and MEANS. In other words, the types and distribution of relations are consistent with the overall purpose of the document in question. In terms of the meta-functions of the relations, most of the relations are ideational, that is, to convey core information about the place. Although interpersonal meta-functions constitute only 7.3% of the total relations, they are important in positioning readers and constructing the image of a professional writer who understands the needs of a traveller.

Spatial arrangement, and evaluative and metaphorical functions

In terms of the information value of spatial arrangement, the nucleus is in the central position. To put this another way, the picture is regarded as the nucleus of all relations mainly because it is in the central position and takes up most of the space. What I am trying to argue is that the information value of a relation is both its constructive and constitutive feature. As I explained above, the NUCLEUS-SATELLITE pattern usually coincides with the GIVEN-NEW pattern, but there are always other considerations – regardless of whether they are ideational, interpersonal or textual – when a document designer makes a decision. This extract can illustrate some of these considerations. The most obvious example is the picture. The specific spots in the picture are identified by text and photographs on every side of it, instead of its right hand side, which is supposed to be the position for a satellite. However, the document designer still follows what is regarded as the usual way of an identification relation in the block about the floor plan (Block 6). What is being labelled is put on the left hand side and the labels are put on the right hand side. Also note the placement of the background information (as a MOTIVATION satellite relation) at the top left hand corner, while the ‘Visitors’ Checklist’ (as a satellite relation of MEANS) is in the top right hand corner. In sum, the information value of a document is rather complex and can only be understood in terms of the specific layout in question, in addition to the conventional understanding of what we mean by information value.

Evaluation is not an inherent feature of all relations and is usually embedded in a logico-semantic relation. In the extract, evaluation is frequently found in the verbal elaboration of the real pictures. Some of them are related to how beautiful or desirable a certain feature is (desirability):

lie within this lavishly decorated ‘Chamber of Dreams.’
Akbar’s queens and their attendants savoured the cool evening breezes.
The fine dado panels and delicately sculpted walls of this ornate sandstone pavilion make the stone seem like wood.

Some linguistic expressions are used to anchor the visual with highlights of its uniqueness (Usuality/unusuality, Normativity, which usually refers to the idea of appropriateness):

This hall for private audience and debate is a unique fusion of different architectural styles and religious motifs.

It is topped with an unusual stone roof of imitation clay tiles.

The evaluative category of warrantability is also found in the verbal elaboration:

Its decorative screens were probably stolen after the city was abandoned.
Sometimes identified as the treasury…

These are the main evaluative categories in the extract, and are consistent with the communicative functions of a travel guide. Readers expect a travel guide to be informative, elaborate and accurate. At the same time, writers of the text also have to convince readers that a certain place is worth visiting, which explains why desirability and usuality/unusuality are most frequently used. This also highlights the persuasive function of travel guides, in addition to the commonly acknowledged function of giving factual information. To summarise, evaluative meanings that are usually embedded in a larger segment of a logico-semantic relationship can be used to anchor pictures or words (words anchoring pictures in the extract) by supporting the broader communicative function of a text. This is subtle, but can be argued to be more powerful.

The last characteristic to be discussed here is metaphor. Similar to evaluation, metaphor is NOT an inherent feature of a relation. It is possible to classify a relation into a logico-semantic type or call it a nucleus or satellite, but it may not be possible to identify a word-image relation of any metaphorical nature. In the sample extract, a relation that is related to metaphorical meaning can be found in the information block about the star sights. In fact, the heading ‘Star Sights’ is metaphorical in itself. Star belongs to the domain of ‘universe’ and does not have any direct connection to scenic spots. The associated meaning of stars is always attached to importance in everyday discourse, and the same meaning is invoked here. Star sights are places that tourists should not miss. This inherently metaphorical heading is used to identify the two important sights: the Turkish Sultana’s House and Panch Mahal, which are preceded by a star ‘*’. The relationship between the ‘star’ and the names of the places is one of identification, and is also metaphorical.

Conclusion

Although the importance of turning visual information into empirical data has been highlighted in the recent literature on multimodality, there remains a gap between what has been said and what can be done to study multi-modal discourse. Kaltenbacher (2004, p. 203) succinctly argues for a need “to develop systematic methods for theoretically developing, systematically analyzing and empirically testing the semiotic unfolding of resources and modes and their combinations in all aspects of our daily lives.” Some recent attempts (Bateman et al., 2004; Taylor, 2003) have been made to transcribe or annotate multi-modal discourses so they can be stored and further studied. However, a systemic classification of the schematic or functional relationship between text and images is still unexplored, as language has its verbal, pictorial and schematic modes (Walker, 2001, p. 177).

Current studies either take an over-simplified view of the issue or use existing text-based models (such as that of Mann and Thompson, 1988) as their framework without any modification, ignoring the potential differences between words and images. Further studies in word-image relations will not only be important in testing assumptions about these two modes (such as ‘only words are used to anchor pictures’), but will also be important in two other respects. First, they will offer practical guidelines to document designers on how to use words and visuals in a more effective way. Second, they will offer an understanding of how new meanings
are created in this word-picture fusion. Although increasing attention is being paid to meaning making and negotiation in verbal language or conversational discourse, this important – subtle, but more powerful – meaning-making device has not received enough consideration.

Hence, this paper identifies the potential relations between words and images. Most importantly, it argues for a multiple level of relations and the need for empirical analysis. The framework, shown using different types of data, can be applied to other types of discourse including web-based genres and visual-based genres. Owing to space limitations, many concepts, particularly the various logico-semantic relations, have not been examined here in great detail, and only a limited set of examples has been used as illustrations. The potentially subjective and ambiguous nature of some verbal-visual relationships will need to be followed up in future studies.

Acknowledgements

*  I would like to thank the Research Grant Committee of Hong Kong SAR for their generous support to the project on which this paper is based (RGC research project # HKBU2162/03H). Every attempt has been made to obtain permission to reproduce copyright material. If any proper acknowledgement has not been made, or permission not received, I would be glad to hear from the copyright holders.

Notes

1.  It should be highlighted that the idea of subordination and coordination or hypotaxis and parataxis is also related to the structural hierarchy of clauses. One of the clauses may have a higher standing and can stand alone, and the other is lower and cannot exist alone. Although this may be applicable to some of the relations identified in this study, the hierarchical units examined in this paper are based on meaning as the sole criterion for distinction, i.e. whether a unit is subordinate to another in terms of their meaning. This is more useful since visuals do not have some of the structural characteristics of verbal language. Moreover, in the case of verbal examples, the units of analysis can be a word, a clause, a sentence or even a paragraph although clauses are used as examples.
2.  It should be noted, however, that the ‘new’ element in the GIVEN-NEW pair is not the same as new information in the classification of logico-semantic relations in which new information is related to the presence of information new to the nucleus. New information in terms of information value concerns the degree of familiarity to the readers/viewers. This is the reason why alternative phrasing such as ‘Given-Unfamiliar’ or ‘Familiar-Unfamiliar’ might be more suitable in order to distinguish the two classification systems.
3.  In systemic-functional linguistics, evaluation is usually regarded as an interpersonal resource only, through which an evaluator is used to consciously influence a reader towards some ideational content. I have to thank an anonymous reviewer for pointing this out.
4.  The epistemic function of evaluation is closely related to modality in systemic-functional linguistics and is for adjusting a writer’s stance towards a proposition. Modal verbs such as ‘may’ fall into this category.

References

Barthes, R. (1967). Elements of Semiology. London: Cape.
Barthes, R. (1978). Image-Music-Text. New York: Hill and Wang. Reprinted in S. Sontag (Ed.) (1993). A Barthes Reader. London: Vintage.
Bateman, J., Delin, J., & Henschel, R. (2004). Multimodality and empiricism: Preparing for a corpus-based approach to the study of multimodal meaning-making. In E. Ventola, C. Charles & M. Kaltenbacher (Eds.), Perspectives on Multimodality (pp. 65–87). Amsterdam: John Benjamins.
Bhatia, V. K. (1993). Analysing Genre: Language Use in Professional Settings. London: Longman.
Biber, D., Conrad, S., & Leech, G. (2002). Longman Student Grammar of Spoken and Written English. Harlow, Essex: Longman.
Bonsieppe, G. (1966). Visual-verbal rhetoric. Dot Zero, 2, 37–38. forms of text. In I. Snyder (Ed.), Page to Screen: Taking Lit-
eracy into the Electronic Era. London, New York: Routledge.
Chandler, D. (2004). Semiotics: The Basics. London: Routledge.
Charteris-Black, J. (2004). Corpus Approaches to Critical Metaphor Analysis. New York: Palgrave Macmillan.
Delin, J., Bateman, J., & Allen, P. (2003). A model of genre in document layout. Information Design Journal, 11(1), 54–66.
Eckkrammer, E. M. (2004). Drawing on the theories of inter-semiotic layering to analyse multimodality in medical self-counselling texts and hypertexts. In E. Ventola, C. Charles & M. Kaltenbacher (Eds.), Perspectives on Multimodality (pp. 211–226). Amsterdam: John Benjamins.
Eco, U. (1976). A Theory of Semiotics. Bloomington: Indiana University Press.
Fogg, B. J., Marshall, J., Laraki, O., Osipovich, A., Varma, C. et al. (2001). What makes websites credible? A report on a large quantitative study. SIGCHI, March 31–April 4, 2001, Seattle, WA, USA.
Forceville, C. (1996). Pictorial Metaphors in Advertising. London: Routledge.
Gibbs, R. W., & Gerring, R. J. (1989). How context makes metaphor comprehension seem ‘special’. Metaphor and Symbolic Activity, 4(3), 145–158.
Givon, T. (2000). Syntax Vol. 2. Amsterdam: John Benjamins.
Greenbaum, S. (1969). Studies in English Adverbial Usage. London: Longman.
Grimes, J. (1975). The Thread of Discourse. The Hague: Mouton.
Halliday, M. A. K. (1994). An Introduction to Functional Grammar. London: Arnold.
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London and New York: Longman.
Horn, R. E. (1998). Visual Language: Global Communication for the 21st Century. Bainbridge Island, WA: MacroVU, Inc.
Hunston, S., & Thompson, G. (2000). Evaluation in Text: Authorial Stance and the Construction of Discourse. Oxford: Oxford University Press.
Kaltenbacher, M. (2004). Perspectives on multimodality: From the early beginnings to the state of art. Information Design + Document Design, 12(3), 190–207.
Kong, K. C. C. (forthcoming). Linguistic resources as evaluators in English and Chinese research articles. Multilingua.
Kress, G. (1998). Visual and verbal modes of representation in electronically mediated communication: the potentials of new
Kress, G., & Van Leeuwen, T. (1996). Reading Images: The Grammar of Visual Design. London: Routledge.
Kress, G., & Van Leeuwen, T. (1998). Analysis of newspaper layout. In A. Bell & P. Garret (Eds.), Approaches to Media Discourse. Oxford: Blackwell.
Kress, G., & Van Leeuwen, T. (2001). Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Lakoff, G., & Johnson, M. (1980). Metaphors We Live By. Chicago: Chicago University Press.
Lakoff, G., & Turner, M. (1989). More than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Lemke, J. C. (1998). Resources for attitudinal meaning: evaluative orientations in text semantics. Functions of Language, 5(1), 33–56.
Lemke, J. C. (2002). Travels in hypermodality. Visual Communication, 1(3), 299–325.
Lemke, J. C. (work in progress). Visual and Verbal Resources for Evaluative Meaning in Political Cartoons.
Lie, H. K. (1991). The Electronic Broadsheet: All the News that Fits the Display. Master’s Thesis, Boston, School of Architecture and Planning, MIT. http://www.bilkent.edu.tr/pub/WWW/People/howcome/TEB/www/hwl_th_1.html.
Mann, W., & Thompson, S. A. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3), 242–281.
Marcus, A. (1979). Visual rhetoric in a pictographic-ideographic narrative. In T. Borbe (Ed.), Semiotics Unfolding: Proceedings of the Second Congress of the International Association for Semiotic Studies, Vienna, July 1979.
Marcus, A. (1998). Metaphor design in user interfaces. Journal of Computer Documentation, 22(2), 43–57.
Marsh, E. E., & White, M. D. (2003). A taxonomy of relationships between images and text. Journal of Documentation, 59(6), 647–672.
Martin, J. R. (1992). English Text: System and Structure. Amsterdam: John Benjamins.
Martin, J. R. (2000). Beyond exchange: appraisal systems in English. In S. Hunston & G. Thompson (Eds.), Evaluation in Text: Authorial Stance and the Construction of Discourse (pp. 142–175). Oxford: Oxford University Press.

Martin, J. R., & White, P. R. R. (2005). The Language of Evaluation: Appraisal in English. New York: Palgrave Macmillan.
Oyama, R. (1999). Visual semiotics in a cross-cultural perspective: a study of visual images in Japanese and selected British advertisements. Unpublished PhD dissertation, University of London.
Pegg, B. (2002). Two dimensional features in the history of text format: how print technology has preserved linearity. In N. Allen (Ed.), Working with Words and Images. Westport, CT: Ablex Publishing.
Peirce, C. S. (1931–58). Collected Writings (Vols. 1–8). In C. Hartshorne, P. Weiss & A. W. Burks (Eds.). Cambridge, MA: Harvard University Press.
Richards, A. R., & David, C. (2005). Decorative color as a rhetorical enhancement on the world wide web. Technical Communication Quarterly, 14(1), 31–48.
Royce, T. (2002). Multimodality in the TESOL Classroom: Exploring Visual-Verbal Synergy. TESOL Quarterly, 36(2), 191–205.
Saussure, F. de ([1916] 1974). Course in General Linguistics (trans. Wade Baskin). London: Fontana/Collins.
Still, J. M. (2001). A content analysis of university library Web sites in English speaking countries. Online Information Review, 25(3), 160–164.
Stroupe, C. (2000). Visualizing English: Recognizing the hybrid literacy of visual and verbal authorship on the Web. College English, 62(5), 607–632.
Taylor, Ch. (2003). Multimodal Transcription in the Analysis, Translation and Subtitling of Italian Films. The Translator, 9(2), 191–205.
Van Leeuwen, T. (2005). Introducing Social Semiotics. New York: Routledge.
Van Leeuwen, T., & Jewitt, C. (Eds.) (2000). Handbook of Visual Analysis. London: Sage.
Walker, S. (2001). Typography and Language in Everyday Life. London: Longman.
White, P. (2001). Appraisal website. Retrieved October 10, 2001 from http://www.grammatics.com/appraisal/Appraisal Guide

Sources of examples

Eastern Europe (2001). Lonely Planet Press, 6th edition.
Horn, Robert E. (1998). Visual Language: Global Communication for the 21st Century. Bainbridge Island, WA: MacroVU, Inc.
India Travel Guide (2002). London: Dorling Kindersley (Mark Warner © Dorling Kindersley).

About the author

Kenneth Kong is an associate professor of linguistics in the English department of Hong Kong Baptist University. His academic interests include discourse analysis, multimodal analysis, intercultural pragmatics and language for specific purposes. He has published extensively in the areas of discourse analysis and pragmatics.

Contact

Kenneth Kong
Department of English Language and Literature
Hong Kong Baptist University
Waterloo Road
Kowloon
Hong Kong
e-mail: kkong@hkbu.edu.hk

Appendix 1.  Sample analysis of a multimodal document

Block 1: Identification
Block 2: Motivation
Block 3: Nucleus
Block 4: Means
Block 5: Specification/Identification
Block 6: Explanation

Mark Warner © Dorling Kindersley. Reproduced with permission, except for the top left photo on the left page (© Ashok Dilwaki) and the top right and top left photos on the right page (© DN Dube).
