
Corpora: Multimodal

SVENJA ADOLPHS

Natural language collected and stored in multimillion-word corpora forms the basis of
inquiry in a diverse range of disciplines. Advances in the field of corpus linguistics over
the past two decades have contributed to pioneering research in many areas of communi-
cation studies and language description. However, while the analysis of large-scale text
corpora can provide insights into language patterning, and can help establish linguistic
profiles of particular social contexts, it is limited to the textual dimension of communication.
Yet, communication processes are multimodal in nature, and there is now an increasing
body of research concerned with the development and analysis of corpora that enable the
user to study the speech and gestures of the participants in social interaction, and ways
in which the verbal and nonverbal are inextricably linked. The impetus towards multimodal
corpora recognizes that natural language is an embodied phenomenon and that a deeper
understanding of the relationship between talk and bodily actions, in particular gestures,
is required if we are to develop a more coherent understanding of the collaborative organ-
ization of communication (see also Saferstein, 2004).
Work in multimodal communication has seen advances in both theory and practice.
The theoretical starting point for much significant work has been in systemic-functional
linguistics and in different subfields of psychology (see, for example, McNeill, 1985).
Foundational work in multimodal communication, such as Kress and van Leeuwen (1996),
has illustrated how choices of image can align with verbal choices. This work has been
extended in recent years to embrace the multimodal analyses of word, image, and sound
within different language varieties, including cartoons, comics, film, information leaflets,
maps, advertisements, Web pages, and classroom textbooks (e.g., Baldry & Thibault, 2004,
2006). The emphasis has been on how choices of one image or camera angle or color tone
can cumulatively encode particular meanings. The almost exclusive focus has been on written
text. A particular challenge for current research is therefore to integrate the computer-
enabled power of corpus linguistic methods with the theories and practices of multimodal
linguistic research, with particular reference to the analysis of spoken discourse. In other
words, one key aim of multimodal corpus linguistics is to provide computerized analyses
of patterns of verbal and nonverbal meaning in ways that allow new understandings of
discourse to emerge.
Human communication functions within a variety of direct and indirect semiotic
channels (Brown, 1986, p. 409). The occurrence of such channels is affected by modes of
communication that differ widely according to their form, function, and context of use
(see foundational work by Argyle, 1988, and Ekman & Friesen, 1969, and more recent
studies by Wilcox, 2004, and Gu, 2006). Previous studies on the relationship between
language and gesture have traditionally focused on smaller data samples (see, for example,
Streeck, 1994; Goodwin, 2000; Kita, 2003; and Olsher, 2004), and were often carried out in
experimental settings. With the development of large-scale collections of video-recorded
and transcribed interactions, we can extend the potential for research into behavioral,
gestural, and linguistic features. Current multimodal corpus studies focus mainly on
multimodal and multimedia studies of discourse, and on speech engineering and corpus annotation
(Gu, 2006, p. 132).

The Encyclopedia of Applied Linguistics. Edited by Carol A. Chapelle.
© 2013 Blackwell Publishing Ltd. Published 2013 by Blackwell Publishing Ltd.
DOI: 10.1002/9781405198431.wbeal0233

Whatever the approach and focus of multimodal corpus research, there are a number
of common challenges which relate to three key stages in data preparation and collection.
These can be summarized under the headings record, represent, and replay (Knight,
Evans, Carter, & Adolphs, 2009). Recording multimodal corpus data poses a number of
issues, ranging from finding the right camera angle to capture sufficient detail of the interaction,
to more basic questions of equipment, to challenges relating to informed consent. A
related priority for future research in this area is the development of tools and methods
to address ethical issues; for example, to anonymize video data while still being able to
extract the salient features that are the focus of the analysis. Pixelating faces or using
shadow representations of heads and bodies can blur the very distinctions between gestures and
language forms that are under analysis; taken to its logical conclusion, anonymization should also
include replacing voices with voice-overs from other speakers. Ethical considerations of
reusing and sharing contextually sensitive video data as part of a multimodal corpus
resource need to be addressed further in consultation with end users, informants, researchers,
and ethics advisors. The issues are especially acute when tools and resources are shared
and Web-enabled.
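The pixelation technique mentioned above can be sketched, in much simplified form, as block-averaging over a frame. The following Python fragment is an illustrative toy, not part of any tool named in this entry: a frame is modeled as a 2D list of grayscale values, whereas working corpus tools would operate on decoded video frames (e.g., via a library such as OpenCV), typically only within automatically detected face regions.

```python
# Toy sketch of block-pixelation for video-frame anonymization.
# A "frame" here is a 2D list of grayscale values (an invented
# stand-in for a real decoded video frame).

def pixelate(frame, block=2):
    """Replace each block x block region with its mean value."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out

frame = [[0, 100], [200, 100]]
print(pixelate(frame, block=2))  # every pixel becomes the 2x2 mean: [[100, 100], [100, 100]]
```

The trade-off discussed above is visible even in this toy: the larger the block size, the stronger the anonymization, and the less gestural detail survives for analysis.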
Once the data is recorded, it needs to be represented in a way that allows a certain
degree of alignment between the video/audio stream and the transcript of the interaction.
Video annotation software, such as ELAN and Transana, can facilitate this process. The
development of new interfaces is required when the aim is a corpus-based analysis of
multimodal discourse representations. It is important that such interfaces include a
concordance facility and allow the analyst to move between the concordance data and
transcript, and the relevant point in the video. An example of such an interface is the
Digital Replay System (DRS), which is a research prototype developed at the University
of Nottingham (Brundell et al., 2008). The replay phase of the process requires careful
consideration as to the extent to which the user may determine replay features (e.g., through
linking search functions to predetermined coding tracks in the analysis).
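The alignment that interfaces of this kind provide can be sketched as a concordance over time-stamped transcript segments, so that each hit carries the timecode at which the video can be replayed. The data structure and function below are hypothetical illustrations under that assumption, not the API of DRS, ELAN, or Transana:

```python
# Sketch of a time-aligned concordance: each transcript segment is a
# (start_time, end_time, utterance) tuple, so a concordance hit can be
# linked back to the matching point in the video.

def concordance(segments, keyword, span=3):
    """Return (timecode, left context, keyword, right context) tuples."""
    hits = []
    for start, end, utterance in segments:
        tokens = utterance.split()
        for i, tok in enumerate(tokens):
            if tok.lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - span):i])
                right = " ".join(tokens[i + 1:i + 1 + span])
                hits.append((start, left, tok, right))
    return hits

segments = [
    (0.0, 2.1, "yeah I see what you mean"),
    (2.1, 4.8, "you mean the second clip"),
]
for hit in concordance(segments, "mean"):
    print(hit)
```

Returning the segment's start time alongside each concordance line is what allows an interface to move from the concordance view to the relevant point in the video, as described above.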
Research in multimodal corpus linguistics has contributed significantly to the develop-
ment of new interfaces of the kind described above and to the articulation of key challenges
associated with each of the research phases (Carter & Adolphs, 2008). Most importantly,
it has challenged the applicability of some of the key methods and visualization tools used
within the corpus linguistics tradition, such as the analysis of concordance data, when we
move from monomodal to multimodal representations.
Multimodal corpus linguistics is a relatively new area of research, and there are numer-
ous challenges that require further investigation. They include technical issues such as the
development of gesture-recognition systems, and issues of scope, such as the analysis of
more than one gesture region at a time. An example of the latter would be the integration
of hand and head gestures, as well as gaze, in alignment with language use. They further
include theoretical questions of how gesture and language integrate and whether they can
be described within a single framework. Early work in this area has generated some inter-
esting findings as outlined above. It has also shown that some of the functional categories
of language use that we have developed purely on the basis of transcribed records of
discourse may have to be reconsidered in the light of new insights gleaned from the
deployment of a multimodal approach. As our abilities develop in recording, storing, and
analyzing ever larger multimodal corpora in an integrated manner, we may expect to see
new patterns of meaning, which allow us to assess issues of scalability in relation to multi-
modal corpus research. Finally, advances in adding video and audio streams to more
traditional corpus linguistics resources, mainly based on textual records, are likely to pave
the way for adding and exploring further data streams. This may include global position-
ing system (GPS) data or time reference to a discourse event, and could potentially include
any type of data that is recorded in relation to our everyday social interactions. In this
regard, multimodal corpus linguistics is likely to have an impact on language description
that goes beyond the mere link-up between language and gesture, to include a vast number
of other data streams that will ultimately lead to better descriptions of language in context,
and thus to better applications based on those descriptions.
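The integration of more than one gesture region with language use, discussed above, can be illustrated schematically as interval overlap between time-aligned annotation tiers of the kind used in tools such as ELAN. The tier contents and function name below are invented for illustration:

```python
# Sketch of cross-tier alignment: each tier is a list of
# (start, end, label) intervals; overlaps() finds which annotations
# on one tier co-occur in time with annotations on another.

def overlaps(tier_a, tier_b):
    """Pairs of labels from two tiers whose time spans intersect."""
    pairs = []
    for s1, e1, a in tier_a:
        for s2, e2, b in tier_b:
            if max(s1, s2) < min(e1, e2):  # non-empty intersection
                pairs.append((a, b))
    return pairs

speech = [(0.0, 0.4, "this"), (0.4, 0.9, "one")]
gaze = [(0.3, 1.0, "gaze:listener")]
print(overlaps(speech, gaze))  # [('this', 'gaze:listener'), ('one', 'gaze:listener')]
```

Extending the same pairwise comparison across hand, head, and gaze tiers is one simple way to operationalize the multi-region analysis described above, although scaling it to large corpora raises exactly the technical challenges the entry identifies.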

SEE ALSO: Analyzing Spoken Corpora; Multimodal Corpus-Based Approaches

References

Argyle, M. (1988). Bodily communication (2nd ed.). London, England: Methuen.


Baldry, A., & Thibault, P. (2004). Multimodal transcription and text analysis. London, England:
Equinox.
Baldry, A., & Thibault, P. (2006). Multimodal corpus linguistics. In P. Thompson & S. Hunston
(Eds.), System and corpus (pp. 164–83). London, England: Equinox.
Brown, R. (1986). Social psychology (2nd ed.). New York, NY: Free Press.
Brundell, P., Knight, D., Tennent, P., Naeem, A., Adolphs, S., Ainsworth, S., Carter, R., Clarke,
D., Crabtree, A., Greenhalgh, C., O'Malley, C., Pridmore, T., & Rodden, T. (2008). The
experience of using Digital Replay System for social science research. Proceedings of the 4th
International Conference on e-Social Science, Manchester, England, June 18–20, 2008.
Carter, R., & Adolphs, S. (2008). Linking the verbal and visual: New directions for corpus
linguistics. Language and Computers [Special issue] Language, People, Numbers, 64, 275–91.
Ekman, P., & Friesen, W. (1969). The repertoire of nonverbal behaviour: Categories, origins,
usage, and coding. Semiotica, 1, 49–98.
Goodwin, C. (2000). Action and embodiment within situated human interaction. Journal of
Pragmatics, 32(10), 1489–522.
Gu, Y. (2006). Multimodal text analysis: A corpus linguistic approach to situated discourse.
Text and Talk, 26(2), 127–67.
Kita, S. (Ed.). (2003). Pointing: Where language, culture and cognition meet. Mahwah, NJ:
Erlbaum.
Knight, D., Evans, D., Carter, R., & Adolphs, S. (2009). Redrafting corpus development
methodologies: Blueprints for 3rd generation multimodal, multimedia corpora. Corpora,
4(1), 1–32.
Kress, G., & van Leeuwen, T. (1996). Reading images. London, England: Routledge.
McNeill, D. (1985). So you think gestures are nonverbal? Psychological Review, 92(3), 350–71.
Olsher, D. (2004). Talk and gesture: The embodied completion of sequential actions in spoken
interaction. In R. Gardner & J. Wagner (Eds.), Second language conversations (pp. 221–45).
London, England: Continuum.
Saferstein, B. (2004). Digital technology and methodological adaptation: Text on video as a
resource for analytical reflexivity. Journal of Applied Linguistics, 1(2), 197–223.
Streeck, J. (1994). Gestures as communication 2: The audience as co-author. Research on Language
and Social Interaction, 27(3), 239–67.
Wilcox, S. (2004). Language from gesture. Behavioral and Brain Sciences, 27(4), 525–26.

Suggested Readings

Adolphs, S. (2008). Corpus and context: Investigating pragmatic functions in spoken discourse.
Amsterdam, Netherlands: John Benjamins.
Baldry, A., & Thibault, P. J. (2004). Multimodal transcription and text analysis. London, England:
Equinox.
Carletta, J. (2007). Unleashing the killer corpus: Experiences in creating the multi-everything
AMI Meeting Corpus. Language Resources and Evaluation, 41(2), 181–90.

Gu, Y. (2006). Multimodal text analysis: A corpus linguistic approach to situated discourse. Text
and Talk, 26(2), 127–67.
Knight, D., & Adolphs, S. (2008). Multi-modal corpus pragmatics: The case of active listenership.
In J. Romeo (Ed.), Corpus and pragmatics. Berlin, Germany: De Gruyter.
Knight, D., Evans, D., Carter, R., & Adolphs, S. (2009). Redrafting corpus development
methodologies: Blueprints for 3rd generation multimodal, multimedia corpora. Corpora,
4(1), 1–32.
Kress, G. R., & van Leeuwen, T. (2001). Multimodal discourse: The modes and media of contemporary
communication. London, England: Edward Arnold.
