
68. Transcription systems for gestures, speech, prosody, postures, and gaze

1. Introduction
2. Transcription systems for gesture
3. Transcription systems for speech
4. Transcription of prosody
5. Transcription of body posture
6. Transcription systems for gaze
7. Conclusion
8. References


Abstract
The chapter presents a concise overview of existing transcription systems for gestures, speech, prosody, postures, and gaze, aiming at a presentation of their theoretical backgrounds and methodological approaches. After a short introduction discussing the understanding of the term “transcription”, the article first focuses on transcription systems in modern gesture research and discusses systems from the field of gesture research, non-
verbal communication, conversation analysis, and artificial agents (e.g., Birdwhistell
1970; Sager 2001; Martell 2002; Bressem this volume; Gut et al. 2002; Kipp, Neff, and Al-
brecht 2007; Lausberg and Sloetjes 2009; HIAT 1, GAT). Afterwards, the paper presents
well-known transcription systems for speech from the field of linguistics, conversation,
and discourse analysis (e.g., IPA, HIAT, CHAT, DT, GAT). Apart from systems
for describing speech, the article also focuses on systems for the transcription of prosody.
In doing so, the paper discusses prosodic descriptions within the field of conversation
analysis, discourse analysis, and linguistics (HIAT 2, GAT 2, TSM, ToBI, PROLAB,
INTSINT, SAMPROSA). The last sections of the paper focus on the transcription of
body posture and gaze (e.g., Birdwhistell 1970; Ehlich and Rehbein 1982; Goodwin
1981; Schöps in preparation; Wallbott 1998).

1. Introduction
The term transcription goes back to the Latin word trānsscrībere meaning ‘to overwrite’
or ‘to rewrite’ (Bußmann 1990: 187). In a linguistic understanding, transcription refers
to the notation of spoken language in written form. More specifically, it is understood as
the reproduction of communicative events using alphabetic resources and specific sym-
bols while capturing the characteristics and specifics of spoken language (Dittmar
2004). Transcription must be understood as a scientific working method directed to
the analytical needs of a scientist by freezing oral communication and making it acces-
sible to thorough inspection (Redder 2001: 1038; see also Bohle this volume for a gen-
eral discussion). However, today, transcription is not only restricted to linguistics.
Investigating interaction and communication is part of a number of scientific disciplines, such as ethnography, sociology, psychology, neurology, and biology, which face comparable problems and obstacles in making communicative behavior analyz-
able. Accordingly, the term transcription is no longer restricted to the notation of spo-
ken language, but also includes the notation of bodily behavior, such as gesture, posture,
and gaze.

2. Transcription systems for gesture


Research on gestures is characterized by a variety of transcription systems developed within a range of differing disciplines. The foci of the systems vary immensely in their theoretical and methodological perspectives, ranging from a pure form description, through a combination of form and possible meanings, to a rudimentary description of gestures. The following section will discuss transcription systems from the field of gesture research, nonverbal communication, conversation analysis, and artificial agents along the lines of this threefold distinction.


2.1. Systems focusing on gestures’ form


2.1.1. Birdwhistell: A kinesic notation of bodily motion
Starting from the assumption that the kinesic structure of bodily motion is set up in par-
allel to the linguistic structure of spoken language, Birdwhistell developed a notation
scheme, which accounts for the hierarchical set up of bodily motion by distinguishing
a microkinesic and macrokinesic level of notation (Birdwhistell 1970). While the micro-
kinesic level accounts for all forms of bodily motion, the macrokinesic level focuses on
the “meaningful” variants of bodily motions within a particular culture (Birdwhistell
1970: 290). Based on this theoretical assumption, the notational system is divided into eight major sections:

(i) total head,
(ii) face,
(iii) trunk,
(iv) shoulder/arm/wrist,
(v) hand and finger activity,
(vi) hip, leg, ankle,
(vii) foot activity, walking, and
(viii) neck.

For each of the sections, the notational system includes a “basic notational logic” (Birdwhistell 1970: 258), which is combined with indicators to capture the differing variants producible by the eight sections of the body. The description is based on articulatory aspects, such as muscular tension as well as the joints of articulation. Altogether, the system
offers 400 signs for the description of bodily motion. It includes descriptions for hand
and finger activities, and bi-manual gestures, yet only rarely includes the notation of
movement.

2.1.2. Bernese coding system: The alphabet of body language


Criticizing the functional orientation in the notation and analysis of nonverbal
behavior, Frey et al. (1981) and Frey, Hirsbrunner, and Jorns (1982) developed an
Alphabet of Body Language. By using a time course notation and frame-by-frame
procedure, the coding scheme breaks down the bodily movement into temporal
and spatial components. Altogether, Frey et al. (1981) distinguish 104 movement di-
mensions for the following body parts: head, face, shoulders, trunk, upper arms,
hands, pelvis, thigh, and feet. The body parts are coded according to various dimen-
sions, e.g., head sagittal, relational or lateral movement and the degree of twisting.
For the coding of hand shapes, the scheme distinguishes the position and orientation
of the palm of the hand. The notation of movement is included in the time course
notation, but not separately. The system pursues a taxonomic perspective and
thus only includes as many dimensions as can be distinguished by the coders “in an
objective and reliable way” (Frey et al. 1981: 219). Articulatory variants are not
included.
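
As an illustration of the time-series logic, the following Python sketch codes a few movement dimensions frame by frame; the dimension names, codes, and frame values are hypothetical placeholders rather than the actual Bernese coding categories:

# Illustrative sketch of time-series (frame-by-frame) coding in the spirit of
# the Bernese system: every video frame receives one code per movement
# dimension. Dimension names and code values here are hypothetical.

# one dict per frame; keys are coded dimensions, values are category codes
frames = [
    {"head_sagittal": 0, "head_lateral": 1, "trunk": 0, "hand_left_position": 3},
    {"head_sagittal": 1, "head_lateral": 1, "trunk": 0, "hand_left_position": 3},
    {"head_sagittal": 1, "head_lateral": 0, "trunk": 2, "hand_left_position": 4},
]

def changes(dimension):
    """Return the frame indices at which the code of one dimension changes."""
    return [
        i for i in range(1, len(frames))
        if frames[i][dimension] != frames[i - 1][dimension]
    ]

if __name__ == "__main__":
    for dim in frames[0]:
        print(dim, "changes at frames:", changes(dim))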


2.1.3. Sager: A notational system for gestures’ form


Similar to the Facial Action Coding System (FACS) (Ekman and Friesen 1978; Hager,
Ekman, and Friesen 2002), Sager aims at a standardized, detailed, and objective system
for describing gestures (Sager 2001; Sager and Bührig 2005). The notational system
includes 3 major aspects:

(i) temporal structure of gestures,
(ii) quality of movement, and
(iii) description of “Signifikanzpunkte”, that is, semiotically significant points of move-
ment (Sager 2001: 26).

The temporal structure of gestures is described according to the beginnings and endings
of movement. Assuming only a restricted number of possible movements for the pro-
duction of gestures, the direction of movement is described as being horizontal, vertical
or diagonal, for instance. Moreover, the system mentions differences in the quality of
movements and differentiates movements as slow, fast, or discontinuous (Sager 2001:
28–29). Signifikanzpunkte are described on the basis of two principles. The principle “center of rotation” records the position of arms and hands through the centers of rotation allowing for the movement (shoulder, upper arm, elbow, wrist). The principle “body levels” registers movements in the various centers of rotation relative to three body levels (ver-
tical axis, sagittal axis, transversal axis), allowing for different degrees of freedom for
movement (e.g., pronation or supination of the hand). Apart from the position and
movement of hands and arms, the system includes a description of hand shapes.
Seven communicatively relevant types of hand configurations, derivable from two
types of movements of the hand, are differentiated (e.g., cupped hand) (Sager 2001:
41–42).

2.1.4. FORM: An automated form-based description of gestures


FORM (Martell 2002; Martell and Kroll n.d.), an annotation scheme designed to
describe the kinematic information in gesture, aims at the development of “something like a ‘phonetics’ of gesture that will be useful for both building better HCI [human computer interaction] systems and doing fundamental scientific research into the communicative process.” (Martell 2005: v) The description of gestures in FORM is based
on anatomical criteria, which are represented as a series of tracks capturing different
aspects of the gestural space. All in all, FORM distinguishes five tracks:

(i) excursion duration for the beginning and end of gestures,
(ii) upper arm and
(iii) lower arm recording the position, lift and direction of the arm (location track) as
well as the planes, types and efforts of movements and particular aspects of the
temporal structure of gestures (movement track),
(iv) hand and wrist for its shape, movement and the differentiation between one and
two handed gestures, and
(v) torso for recording its movement and orientation.


FORM pursues a strictly hierarchical and technical set-up due to its focus on computational processing and its applicability in research on artificial agents. The system is designed for use with the annotation program ANVIL (Kipp 2004).
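
As a rough illustration of the five-track organization, the following Python sketch bundles one gesture annotation; the attribute names and values are hypothetical and do not reproduce FORM’s or ANVIL’s actual annotation format:

# Hypothetical sketch of the five-track organisation described for FORM.
# Track names follow the list above; the attribute values are invented
# placeholders and not FORM's actual coding vocabulary.

gesture_annotation = {
    "excursion_duration": {"start": 12.40, "end": 14.05},  # seconds
    "upper_arm":  {"location": "chest height", "movement": "arc, moderate effort"},
    "lower_arm":  {"location": "forward", "movement": "straight plane"},
    "hand_wrist": {"shape": "open flat hand", "handedness": "right"},
    "torso":      {"movement": "slight lean forward", "orientation": "towards addressee"},
}

def duration(annotation):
    """Length of the gestural excursion in seconds."""
    track = annotation["excursion_duration"]
    return track["end"] - track["start"]

print(f"excursion lasts {duration(gesture_annotation):.2f} s")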

2.1.5. Bressem: A linguistic perspective on the notation of gestures’ forms


Bressem’s system is developed within a linguistic approach to gestures, assuming a sep-
aration of form and function in the analytical process (Müller 1998, 2010; Fricke 2007,
2012; Bressem and Ladewig 2011; Ladewig and Bressem forthcoming; Müller, Bressem
and Ladewig this volume). Against the background of the four feature scheme (Becker
2004), the system grounds the description of gestures on the four parameters of sign lan-
guage (Battison 1974; Klima and Bellugi 1979; Stokoe 1960), assuming a potential sig-
nificance of all four parameters for the creation of gestural meaning. This theoretical
focus is reflected in the set up of the system, as gestures are described in their hand
shape, position, movement (Stokoe 1960), and orientation (Battison 1974). The system
focuses on a description of the hand, leaving anatomical descriptions of arms aside. De-
veloped within the context of a corpus linguistic study investigating forms of gestures
(Ladewig and Bressem forthcoming), the system advocates a differentiation between an articulatory and a taxonomic description of gestures’ forms.
The system draws on existing notational conventions from the field of sign language
and gestures studies. The description of hand configurations is based on HamNoSys
(Prillwitz et al. 1989), yet with an articulatory rather than a taxonomic focus, as it aims at a description of all possible hand shapes by describing the configuration of the hand and fingers separately. The notation of the hand’s orientation is based on the distinction
made by McNeill (1992) with slight adaptations to incorporate form variants. Movement
is split into:

(i) type of movement,
(ii) direction of movement, and
(iii) quality of movement.

For the positions of the hand, the notational system draws on the gesture space intro-
duced by McNeill, which divides the gesture space “into sectors using a system of
concentric squares.” (McNeill 1992: 86) (see Bressem this volume for the notational system).
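
A minimal Python sketch of such a form-based description, with invented values standing in for the actual notation symbols, might look as follows:

# Minimal sketch of a form-based gesture description along the four
# parameters named above (hand shape, orientation, movement, position).
# The concrete values are invented examples, not Bressem's notation symbols.

from dataclasses import dataclass

@dataclass
class GestureForm:
    hand_shape: str    # configuration of hand and fingers
    orientation: str   # e.g. palm orientation
    movement: dict     # split into type, direction, and quality
    position: str      # sector of McNeill's gesture space

example = GestureForm(
    hand_shape="flat hand, fingers stretched",   # hypothetical label
    orientation="palm lateral towards centre",   # hypothetical label
    movement={"type": "straight", "direction": "away from body", "quality": "accentuated"},
    position="centre-centre",                    # hypothetical sector label
)

print(example)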

2.2. Systems focusing on gestures’ form and function


2.2.1. McNeill: A psycho-linguistic perspective on the transcription of gestures
McNeill’s coding scheme is designed as a “guide to gesture classification, transcription,
and distribution.” (McNeill 1992: 75) It is developed in the light of a psycholinguistic
perspective on gestures in which gestures’ forms are perceived as an expression and man-
ifestation of cognitive processes through reproducing underlying “imagery” (McNeill 2005), which give direct insight into cognitive processes as a “window onto thinking” (McNeill and Duncan 2000: 14). The form of the gesture is viewed as inseparable
from its imagery through which meaning is materialized.


McNeill’s system uses the written speech transcription as the basis for coding ges-
tures. Gestures are annotated into the speech transcription by inserting brackets for
the beginning and end of a gesture. The scheme includes the description of hand con-
figuration, orientation, position in gesture space, and movement. The gesture space con-
sists of a system of concentric squares dividing the space in front of the speaker into
three basic areas (center, periphery, and extreme periphery) (see McNeill 1992: 89).
Hand configurations are based on the labeling of hand shapes in American Sign Lan-
guage (ASL) using the “ASL shape that the gesture mostly resembles.” (McNeill
1992: 86) Orientation of the hand is coded according to the gesture space and palm ori-
entation. Gestural movements, such as shape, direction, and trajectory, are accounted for in a descriptive fashion without providing strict guidelines.
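
The bracketing procedure and the coding of gesture space can be illustrated with a small Python sketch; the example utterance, word indices, and labels are invented for illustration and are not taken from McNeill’s data:

# Sketch of McNeill-style coding: the speech transcript is the basis, and the
# stretch of speech co-occurring with a gesture is marked with brackets.
# Words, indices, and the space label are invented for illustration.

def bracket_gesture(words, start, end):
    """Insert [ and ] around the words co-occurring with a gesture."""
    out = list(words)
    out[start] = "[" + out[start]
    out[end] = out[end] + "]"
    return " ".join(out)

transcript = "and he climbs up the pipe".split()
annotated = bracket_gesture(transcript, start=2, end=4)   # brackets "climbs up the"

gesture = {
    "speech": annotated,
    "hand_shape": "ASL G-like shape",     # "ASL shape that the gesture mostly resembles"
    "space": "periphery upper right",     # one of centre / periphery / extreme periphery
    "movement": "upward trajectory",
}

print(gesture["speech"])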

2.2.2. CoGest: A linguistic perspective on the transcription of form, meaning, and function of gestures
The Conversational Gesture Transcription system (CoGest) pursues the goal of providing
a transcription “system of linguistically motivated categories for gestures and a practical
machine and human readable annotation scheme with simple and complex symbols for
simple and complex categories.” (Gut et al. 2002: 3) The system focuses on “linguistically
relevant gestural forms motivated by the functions of gestures within multimodal conver-
sations.” (Gut et al. 2002: 3) The Conversational Gesture Transcription system is based on
the theoretical assumption that the patterning of gestures is organized to a large extent in
similar ways as speech. The system thus distinguishes a form-based and functional descrip-
tion while assuming morphological and syntactical rules for structural and sequential
combinations of gestures (Gut et al. 2002: 4). Distinguishing different levels of transcrip-
tion (“compulsory basic” and “additional optional categories”), the Conversational Ges-
ture Transcription system offers different depths of transcription and coding.
The Conversational Gesture Transcription system includes the coding of:

(i) gesture phases (Kendon 1980),
(ii) hand configuration, movement, position,
(iii) combination of gestures into complex units, and
(iv) a functional classification of gestures.

Configurations of the hand are described with the taxonomic notation systems of
HamNoSys (Prillwitz et al. 1989) and FORM (Martell 2002). Movements are coded
for shape, direction, and modifiers, that is, size, speed, and number of repetitions. In addition, the symmetry of the hands is coded. For the combination of gestures into complex
units, the Conversational Gesture Transcription system distinguishes sequences of pre-
cedence and overlap (Gut et al. 2002: 6). The functional classification of gestures is
based on a four-part classification distinguishing various degrees of overlap between
gestural and verbal meaning.

2.2.3. Kipp, Neff, and Albrecht (2007): A transcription and coding scheme for the
automatic generation and animation of character-specific hand/arm gestures
Offering a scheme developed “for the specific purpose of automatically generating and
animating character-specific hand/arm gestures, but with potential general value” (Kipp, Neff, and Albrecht 2007: 1), the scheme operates on the concept of a gesture lexicon made up of lexemes, that is, “prototypes of recurring gesture patterns where certain
formational features remain constant over instances and need not be annotated for
every single occurrence.” (Kipp, Neff, and Albrecht 2007: 4)
The scheme is implemented and used within the ANVIL annotation tool (Kipp
2004) and consists of adding annotation elements to a track in which each element
is described with a pre-assigned set of attributes, which capture the most essential
parts of a gesture. The annotation scheme includes the spatial form of gestures in
which gesture phases, phrases, and units (Kendon 1980, 2004) are described along
with handedness, path of movement, position, as well as hand shape, and distance of
hands. Hand shapes are coded using a taxonomic classification of 9 types of configura-
tions. Gestures’ membership in a lexical category is determined by the lexeme that de-
fines the hand shape, palm orientation, and exact trajectory. Typical lexemes include:
raised index finger, cup (open hand), finger ring or progressive (circular movement)
(Kipp, Neff, and Albrecht 2007: 14). In a final step, the relation between speech and gesture
is captured.
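
The lexeme-based logic can be sketched in a few lines of Python; the attribute names, lexeme features, and time values below are hypothetical and do not reproduce the actual ANVIL annotation format:

# Sketch of a lexeme-based annotation element in the spirit of Kipp, Neff, and
# Albrecht (2007): each element on the gesture track carries a small set of
# attributes, while constant formational features are stored once per lexeme.
# Attribute names and values are hypothetical placeholders.

LEXICON = {
    # lexeme -> features that stay constant across instances
    "raised_index_finger": {"hand_shape": "index stretched", "palm": "towards addressee"},
    "cup": {"hand_shape": "open cupped hand", "palm": "up"},
    "progressive": {"hand_shape": "lax hand", "movement": "circular"},
}

annotation_element = {
    "lexeme": "cup",
    "start": 3.2, "end": 4.1,       # seconds on the gesture track
    "handedness": "both hands",
    "position": "centre",
    "phase": "stroke",              # gesture phase after Kendon
}

def expand(element):
    """Combine instance attributes with the constant features of its lexeme."""
    return {**LEXICON[element["lexeme"]], **element}

print(expand(annotation_element))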

2.2.4. Zwitserlood, Özyürek, and Perniss (2008): A cross-linguistic annotation scheme for signs and gestures
The system by Zwitserlood, Özyürek, and Perniss (2008) is a cross-modal and cross-linguistic annotation scheme for the coding of sign language and gesture. It is based
on a number of existing coding schemes for sign language and gestures and pursues a
twofold description and analysis: a descriptive level (description of phonetic and phono-
logical form) and an analytic level (interpretation and analysis) (Zwitserlood, Özyürek,
and Perniss 2008: 186–187). At the descriptive level, manual elements, such as position,
action, and shape of each hand are annotated. The description of hand configurations
is based on HamNoSys (Prillwitz et al. 1989) with some additions and combined with
the description of the orientation of the hand. Furthermore, non-manual elements
such as body position, eye gaze, and facial expression are described. At the analytic
level, an interpretation of the sign or gesture and other non-verbal information is
given (Zwitserlood, Özyürek, and Perniss 2008: 187–189).

2.2.5. NEUROGES: A neurological annotation scheme for gestures


NEUROGES (Lausberg and Sloetjes 2009), an annotation scheme developed for use with the annotation software ELAN, pursues a neurological perspective. NEUROGES
assumes that “main kinetic and functional gesture categories are differentially asso-
ciated with specific cognitive (spatial cognition, language, praxis), emotional, and inter-
active functions.” (Lausberg and Sloetjes 2009: 1) The scheme implies that different
gesture categories may be generated in different brain areas. NEUROGES is composed
of 3 modules:

(i) kinetic gesture coding,
(ii) bimanual relation coding, and
(iii) functional gesture coding.


Module (i) refers to the kinetic features of a hand movement, i.e., execution of movement vs. no movement, trajectory and dynamics of movements, location of the action, as well as the presence or absence of body contact. For the characterization of the dynamic aspects of movements, NEUROGES uses Laban notation (Laban 1950). Module (ii) allows for the cod-
ing of bimanual relation (for instance in touch vs. separate, symmetrical vs. complemen-
tary, independent vs. dominance). Module (iii) brings in the functional aspects and
determines the meaning of gestures based on a specific combination of kinetic features
(hand shape, orientation, path of movement, effort and others), which define the
various gesture types.
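
A minimal Python sketch of the three-module structure might look as follows; the tier and value labels are simplified placeholders, not the official NEUROGES categories or the ELAN file format:

# Hypothetical sketch of the three-module structure described for NEUROGES.
# Labels are simplified stand-ins, not the system's actual coding values.

unit = {
    "kinetic": {                       # Module (i): kinetic features of the movement
        "movement": True,
        "trajectory": "arc",
        "location": "in front of trunk",
        "body_contact": False,
    },
    "bimanual_relation": "symmetrical",            # Module (ii), e.g. in touch vs. separate
    "function": "spatial relation presentation",   # Module (iii), hypothetical label
}

def is_gestural(u):
    """Only units with actual movement receive bimanual and functional coding."""
    return u["kinetic"]["movement"]

if is_gestural(unit):
    print(unit["bimanual_relation"], "/", unit["function"])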

2.3. Systems including a rudimentary gesture coding


A rather large portion of transcription systems, mainly from the field of nonverbal com-
munication and conversation analysis, shares a perspective on the transcription and
coding of gestures that is characterized by the primacy of the verbal modality (Sager
and Bührig 2005). Accordingly, gestures and even nonverbal behavior in general are de-
scribed only rudimentarily in form as well as function, and only in functional relevance
to the meaning expressed in the verbal modality. This section discusses three systems
as examples for this kind of notational tradition. For further examples see for instance
Brinker and Sager (1989), Gumperz and Berenz (1993), Kallmeyer and Schmitt (1996),
Schmitt (2007), Schönherr (1997), and Weinrich (1992).

2.3.1. HIAT 1: A discourse analytic perspective on gestures


The discourse analytic transcription system HalbInterpretative ArbeitsTranskriptionen
(HIAT, Ehlich and Rehbein 1976, 1979a, 1979b, 1982) approaches the transcription of
gestures with a focus on the expressional repertoire (Ehlich and Rehbein 1979a: 313).
Gestures are transcribed symbolically through words or predications using everyday
speech. Designations can refer

(i) only partly to specific elements of the movement potential (e.g., raising the hand),
(ii) to the expressional quality including the movement potential, and
(iii) to the summary of complex movements and actions (e.g., waving).

The transcription furthermore includes a rough record of the on- and offsets as well as
the length of movements and only a rudimentary description of form or function. The
relevance of the gestural component and its transcription is thereby always dependent
on its relevance for the verbal communication (Ehlich and Rehbein 1979a: 315). The
verbal modality is the constitutive background for the transcription of gestures, so
that gestures are transcribed on commentary lines dependent on the verbal utterance.

2.3.2. Jefferson and Gesprächsanalytisches Transkriptionssystem (GAT): A conversational analytic perspective on gestures
The transcription system proposed by Jefferson (1984) offers a perspective quite similar to that of HIAT. It also uses symbolization for the transcription of gestures and other kinds of
kinesic behavior. Bodily behavior is noted in commentary lines dependent on the verbal
utterance and inclusive of a rudimentary coding of on- and offsets of movement se-
quences. Yet bodily behavior is only of interest if it obviously influences the verbal and communicative orientations of the speakers and addressees.
The Gesprächsanalytisches Transkriptionssystem (GAT, Selting, Auer, Barden, et al. 1998; Selting, Auer, Barth-Weingarten, et al. 2009) also only includes behavioral aspects, such as proxemics, kinesics, gesture, and gaze, in the transcription of face-to-face interaction if they contribute to the “(un)ambiguousness of other predominantly verbal levels of activities.” (Selting, Auer, Barth-Weingarten, et al. 2009: 26) Regarding ges-
tures, the Gesprächsanalytisches Transkriptionssystem (Selting, Auer, Barden, et al.
1998) lists deictic gestures, illustrators, and emblems and includes a rough description
of on- and offsets as well as the apex, that is, the peak, of gestural movement sequences.
The description is behavior-oriented and tries to be as little interpretative as possible.
The Gesprächsanalytisches Transkriptionssystem offers differing degrees of detail in the transcription, as it distinguishes basic from fine-grained transcripts. Basic tran-
scripts usually include an interpretive characterization of the gestures within the line
containing the verbal transcription. Fine-grained transcripts list gestures in a separate
line under the simultaneously occurring verbal activity. For illustrative purposes and in
cases of special importance of the nonverbal activities, the Gesprächsanalytisches Trans-
kriptionssystem also mentions the inclusion of pictures in the transcript (Selting, Auer,
Barden, et al. 1998: 28). Its newest revision, the Gesprächsanalytisches Transkriptions-
system 2, mentions that new conventions for the transcription of visual components of
communication are being designed (Selting, Auer, Barth-Weingarten, et al. 2009: 356)
due to the growing interest and importance of visual aspects of communication within
the field of interaction analysis.
This section has presented notation and transcription systems for gestures, which
range from a focus on:

(i) form, to
(ii) form and function, to
(iii) rudimentary descriptions.

The presented systems primarily differ in whether a) gestures’ form can and should be separated from possible meanings and functions (e.g., Birdwhistell 1970; Martell 2002; Bressem this volume) or b) a separation of form, meaning, and function is not considered useful for a transcription of gestures (e.g., Gut et al. 2002; McNeill 1992, 2005). These diverging foci go along with differing theoretical assumptions about whether gestures can be broken down into separate components that may or may not combine with other features. Furthermore, the role of speech in the process of notation or tran-
scription is different in the systems presented above. While the verbal utterance is of
particular importance for some of the systems (e.g., Ehlich and Rehbein 1979b; Selting,
Auer, Barden, et al. 1998; Selting, Auer, Barth-Weingarten, et al. 2009), others exclude the verbal modality partly or completely from the notational process (e.g., Bressem this
volume). A further difference in the presented systems is the integration of annotation
software. While especially recent systems use the advantages of annotation software for
the process of notation and transcription (e.g., Kipp, Neff, and Albrecht 2007; Lausberg
and Sloetjes 2009; Martell 2002), others rely on conventional and longstanding methods of transcribing gesture with the use of word-processing documents. Yet the most important difference is the degree to which the systems are clarified and integrated within a theoretical and methodological framework, along with their implications, which, for most systems, are not presented as articulately as necessary.

3. Transcription systems for speech


Transcription systems for speech, regardless of their theoretical and methodological ori-
entation, aim at a representation of spoken language that is easily readable, clear and
distinctive, easily understandable and learnable, economically expandable, and adapt-
able. In doing so, transcription systems commonly use deviations from standard orthog-
raphy to capture a variety of specific characteristics (dialectal or social characteristics),
morphological aspects, syntactical aspects (e.g., reduplication, syntactic malposition-
ing), as well as phonological and prosodic aspects (e.g., volume, suprasegmentalia,
lengthening, pauses, and tempo).
The following section will present a concise overview of well-known transcription
systems from the field of linguistics, conversation, and discourse analysis. The overview
will focus on the following aspects: basic theoretical assumptions, aim of transcription,
basic units of analysis (turn vs. utterance), segmentation of verbal units, transcription of
prosodic aspects as well as nonverbal aspects. It will not focus on the setup of transcription files, such as transcription headers and other technical aspects. For a more detailed overview of the discussed systems and the transcription of speech in general see Dittmar
(2004).

3.1. IPA: The International Phonetic Alphabet


Developed at the end of the 19th century, the International Phonetic Alphabet (IPA) aims at a one-to-one representation of the phonetic characteristics of a sound in a graphical
system (International Phonetic Association 2005). Accordingly, the International
Phonetic Alphabet bases its representation of spoken language on the phonetic quality
of sounds, by providing one letter for each distinctive sound, thereby assuming two
major categories, i.e., vowels and consonants (International Phonetic Association
2005). The notation symbols are based on the Latin alphabet, using as few non-
Latin forms as possible. If necessary, additional letters are created by using capital or
cursive forms, diacritics, and rotation of letters. The symbols are further supplemented
by a) diacritics indicating information about their articulation, co-articulation and pho-
nation, and b) suprasegmentalia for the representation of prosody, tone, length, and
stress. Using these symbols, the International Phonetic Alphabet includes 107 letters
to represent consonants and vowels, 31 diacritics, and 19 additional signs to indicate
suprasegmental qualities of spoken language.
At the end of the 1980s, the Speech Assessment Methods Phonetic Alphabet
(SAMPA), an electronic representation and coding of parts of the International Phonetic
Alphabet notation, was developed. A complete representation of the International
Phonetic Alphabet notation, allowing for a machine-readable phonetic transcription
for every known human language, was put forward in the 1990s with an extended version
of the Speech Assessment Methods Phonetic Alphabet (X-SAMPA). While the Speech
Assessment Methods Phonetic Alphabet was essentially designed for segmental transcription, particularly of a traditional phonemic or near-phonemic kind, the electronic
representation of prosodic aspects was later included in the Speech Assessment Methods
Prosodic Alphabet (SAMPROSA), a system of prosodic notation (Wells et al. 1992).

3.2. HIAT: HalbInterpretative ArbeitsTranskriptionen


HalbInterpretative ArbeitsTranskriptionen (HIAT), the “semi-interpretative working
transcription” (Ehlich and Rehbein 1976) is a literary transcription system based on
the concept of a score notation, i.e., an endless line. The HalbInterpretative Arbeit-
sTranskriptionen segments words by using principles of standard orthography. Bound-
aries of utterances, overlaps, and fast attachment of utterances along with listener
feedback are included in the transcription. Apart from conventions for the transcription
of verbal segments and units, the HalbInterpretative ArbeitsTranskriptionen also in-
cludes guidelines for the representation of prosody. Prosody is transcribed in an extra
commentary line and includes changes in the intonation contour (falling, rising intona-
tion), accented words, lengthening, changes in the volume (crescendo, decrescendo),
and pauses (Ehlich and Rehbein 1976). Nonverbal aspects are transcribed either interlinearly or in commentary lines by using a stylistic interpretation of the utterances (see
section 2.3). Apart from its convention for the representation of spoken language phe-
nomena, the HalbInterpretative ArbeitsTranskriptionen tries to account for changing
perspectives on the data and thus distinguishes various levels of transcription. A primary
transcription using a minimal set of signs can be changed into an individual analytic per-
spective by subsequently adding aspects to the transcription. Furthermore, the system
includes technical methods of transcribing and analyzing (HIAT-DOS; Schneider 2001).

3.3. CHAT: Codes for Human Analysis of Transcripts


Codes for Human Analysis of Transcripts, developed within the Child Language Data
Exchange System (CHILDES), aims at an international database for first language
acquisition with a uniform transcription system (MacWhinney 2000). Codes for
Human Analysis of Transcripts represents verbal utterances not in turns, but in single
segmented utterances, which are represented using literal transcription. Boundaries
of turns are only noted if necessary. Particular functional signs, that is, declarative, ques-
tion, and request, mark ends of utterances. In addition, overlaps, quick uptake, interrup-
tions (self and other), and listener feedback are transcribed on the level of word
segmentation. Regarding prosodic aspects, Codes for Human Analysis of Transcripts
marks the tonal structure of words not separately but includes it in a commentary
line, representing aspects such as primary accent, three types of accents (standard, espe-
cially strong, contrastive, deviant), lengthening, pauses, volume, and tempo. If neces-
sary, the phonetic structure of the utterance can be represented by using the
International Phonetic Alphabet system. Nonverbal aspects are included within lines or in more detail in a commentary line.

3.4. Jefferson: The conversation analytical system


The transcription system by Jefferson (1984) aims at the reproduction of the sequential aspects of naturally occurring everyday interaction by using a neutral design of the
transcript as an observational datum. A rich inventory for the reproduction of turns and
their sequential progression is thus characteristic for the system. The system uses the
“eye dialect,” a standard orthography onomasiologically adapted to the phonetic real-
ization of the expression. The system’s inventory of signs is based on the Latin alphabet.
The format of transcription is sequentially organized, and the turns of the speakers are ordered chronologically, analogous to their linear progression. The system represents
simultaneous utterances of more than one speaker by using brackets at the time the
overlap occurs. The end of verbal units/turns is marked by standard orthography for
interrogative sentences. The system also includes prosodic aspects of utterances, such
as remarkable changes in pitch, changes in the intonation contour, lengthening, emphasis, changes in tempo, and pauses. Nonverbal events, such as gestures, facial expressions, breathing, and coughing, are represented in commentary lines in double parentheses (e.g., ((coughing))). In its newest revision (Jefferson 2002), the system also
includes guidelines for a computer-aided transcription.

3.5. DT: Discourse Transcription system


The discourse transcription system proposed by Du Bois (1991) and Du Bois et al. (1992) offers an improved version of a conversation analytic system for the transcription of spoken language. The Discourse Transcription System distinguishes a basic (observa-
tional description of spoken language) from a fine-grained transcription (research
specific coding). Contrary to the segmentation in the HalbInterpretative ArbeitsTran-
skriptionen and the Conversation Analytical System by Jefferson, the Discourse Tran-
scription System segments the verbal utterance in intonation units. Intonation units are
functionally classified in three types (final, continuing, appeal) and represented on sin-
gle lines by using a specific phonetic contour or set of contours (Du Bois 1991: 53) as
well as their terminal pitch direction. If necessary, intonation units can be phonetically
realized in fine-grained transcripts. Further prosodic aspects captured by the system
include primary word accents, lengthening, pauses, volume, tempo, and voice quality.
The system also captures nonverbal aspects, such as smiling, coughing, yawning, inhala-
tion and exhalation as well as a rough coding of gestures, gaze, body, and co-action (Du
Bois et al. 1992).

3.6. GAT: Gesprächsanalytisches Transkriptionssystem


The Gesprächsanalytisches Transkriptionssystem (GAT, Selting, Auer, Barden, et al.
1998; Selting, Auer, Barth-Weingarten, et al. 2009) is the most widespread transcription
system in the German-speaking area within the field of conversation analysis. It proposes a standardization of existing systems and, similar to the one by Jefferson (1984), aims at an analysis of talk in interaction and a representation of its sequential aspects. For ease of handling, spoken language is represented using principles of standard orthography. In the Gesprächsanalytisches Transkriptionssystem, turns, that is, minimal units of utterances, are composed of phrasing units that are constituted by a primary accent.
Each new turn is represented in a new line, thus offering a different set up of the tran-
script than the HalbInterpretative ArbeitsTranskriptionen, for instance. Prosodic as-
pects are represented intralinearly by capturing changes in pitch, primary and
secondary accents (falling, rising, rise-fall, fall-rise, constant), noticeable changes in tone, lengthening, pauses, and volume. Laughing is represented in a syllabic manner
(haha hehe hihi) or interlinearly in double parentheses ((smiles)) (see section 2.3.).
Stylistic interpretations of utterances, such as irony for instance, are represented in
commentary lines. Similar to the HalbInterpretative ArbeitsTranskriptionen, the Ge-
sprächsanalytisches Transkriptionssystem offers a distinction between a basic and a
fine-grained transcription. The basic transcription includes

(i) sequential structure,
(ii) pauses,
(iii) specific segmental conventions,
(iv) laughing,
(v) listener feedback,
(vi) accentuation, and
(vii) changes in pitch.

A fine-grained transcription focuses more closely on the representation of prosody.


In its newest revision (Selting, Auer, Barth-Weingarten, et al. 2009), the Gespräch-
sanalytisches Transkriptionssystem 2 offers an even easier start of transcription by in-
troducing the concept of the minimal transcript. Overall, the Gesprächsanalytisches
Transkriptionssystem 2 has a stronger phonological and prosodic bias, such that the con-
cept of intonation unit replaced the “phrase unit” (Selting, Auer, Barth-Weingarten,
et al. 2009: 355). Primary and secondary accents are mostly defined phonologically rather
than phonetically. In addition, a tutorial (GAT-TO) was developed, introducing the prac-
tical process of transcription along with a discussion of some problematic aspects. (For a
more detailed discussion of the Gesprächsanalytisches Transkriptionssystem 2 and the
transcription of prosody see section 4.2.)

While all of the presented systems aim at a representation of spoken language, the
systems differ from each other in a range of aspects. The most obvious difference is the
diverging format of representation for spoken language. Systems may use notation scores,
that is, an endless line (Ehlich and Rehbein 1979b) or single lines for single speakers and
turns (e.g., Selting, Auer, Barden, et al. 1998; Selting, Auer, Barth-Weingarten, et al. 2009).
Furthermore, the systems differ in their basic unit of analysis: turn (e.g., Jefferson 1984;
Selting, Auer, Barden, et al. 1998; Selting, Auer, Barth-Weingarten, et al. 2009) vs. utter-
ance as a whole (MacWhinney 2000). Going along with this is a differentiation in the
segmentation of verbal units, varying from sounds (International Phonetic Association
2005) to intonation units, for instance (Du Bois 1991). In addition, the systems include
prosodic aspects as well as other forms of bodily behavior to varying degrees.

4. Transcription of prosody
Transcription systems for prosody generally capture two main types of phenomena:
a) the division of utterances into prosodically-marked chunks, units or phrases and
b) the representation of prominence along with aspects such as pitch movement,
reset or rhythmic change, for instance. However, the size and type of prosodic units vary considerably in the different systems, thus resulting in different prosodic transcriptions.

In general, it is common practice for manual prosodic annotation to be carried out via auditory analysis, accompanied by analyses of waveforms and fundamental frequency (F0). More recently, however, a growing number of systems address
the question of automated prosodic annotation and transcription (see for instance
Avanzi, Lacheret-Dujour, and Victorri 2008; Campione and Véronis 2001; Garcia,
Gut, and Galves 2002; and Mertens 2004). The following section will present a concise
discussion of prosodic descriptions within the field of conversation analysis, discourse
analysis, and linguistics.

4.1. HIAT 2: Erweiterte HalbInterpretative ArbeitsTranskriptionen


While the HalbInterpretative ArbeitsTranskriptionen 1 (Ehlich and Rehbein 1976) pri-
marily focused on the verbal level, thus allowing for a relatively accurate representation
of “normal” intonation (Ehlich and Rehbein 1976: 59), the HalbInterpretative Arbeit-
sTranskriptionen 2 (Ehlich and Rehbein 1979a) focuses on a fine-grained transcription
of intonational phenomena. To do so, the HalbInterpretative ArbeitsTranskriptio-
nen 2 assumes a system with 4 (up to 6) levels of pitch range. By marking the pitch
with the sign “o” and the use of vertical lines to represent the different levels of
pitch range, the HalbInterpretative ArbeitsTranskriptionen 2 is able to capture changes
in pitch. Accents are only transcribed when differing immensely from the normal
accentuation pattern of the utterance. In such cases, the HalbInterpretative ArbeitsTranskriptionen 2 assumes contrastive accents and underlines the syllable or word to indicate marked accentuation patterns. Changes in volume and the transcription
of pauses follow the same conventions as in the HalbInterpretative ArbeitsTranskrip-
tionen 1. Finally, the HalbInterpretative ArbeitsTranskriptionen 2 marks tempo in
commentary lines.

4.2. GAT 2: Gesprächsanalytisches Transkriptionssystem


Decisive for the prosodic transcription in the Gesprächsanalytisches Transkriptionssys-
tem 2 is the central concept of the intonation unit with at least one accented syllable, that is, the focus accent. The Gesprächsanalytisches Transkriptionssystem 2 demarcates
intonation units by more or less phonetically strong features, such as creaky voice, final
lengthening, pauses, and jumps in pitch at the beginning and ending of units. All in
all, the system includes a rich inventory for the intralinear prosodic transcription of
utterances, as it offers numerous extensions to the conventions offered in the Gespräch-
sanalytisches Transkriptionssystem (Selting, Auer, Barden, et al. 1998). The revised
conventions for rhythm, following the transcription of isochronous rhythmical units
(Couper-Kuhlen 1993 and Auer, Couper-Kuhlen, and Müller 1999), allow for its more
fine-grained representation. In addition, breathing, pauses, and lengthening can be re-
presented more precisely by offering more detailed conventions. Moreover, by suggest-
ing an autosegmental transcription of prosody using German Tones and Break Indices
(GToBI, Grice, Baumann, and Benzmüller 2005) or Dutch Tones and Break Indices (ToDI, Gussenhoven, Rietveld, and Terken 2005) (see below) for particularly detailed
analyses going beyond the possibilities of an intralinear transcription, the Gesprächsa-
nalytisches Transkriptionssystem 2 captures the range of prosodic phenomena needed
for a conversational analysis.

Unauthenticated
Download Date | 5/17/16 9:42 PM
68. Transcription systems for gestures, speech, prosody, postures, and gaze 1051

4.3. TSM: Tonic stress marks system


The tonic stress marks system (TSM, Roach et al. 1993; Knowles, Williams, and Taylor
1996) is based on the British school of auditory intonation analysis (e.g., Crystal 1969;
O’Connor and Arnold 1973). The tonic stress marks system therefore assumes a tran-
scription of intonation by means of a cohesive contour represented in tone units,
which are made up of one nucleus and show their nuclear syllable as the last accented
syllable of the unit. Tone units consist of maximally four components, that is pre-head,
head, nucleus and tail (Crystal 1969; O’Connor and Arnold 1973), and the main ac-
cented syllable receives particular importance as it determines the tone unit as a
whole. Based on these assumptions, the tonic stress marks system assumes two levels
of intonation phrasing (major tone group and minor tone group). The tonic stress
marks system indicates the presence and tonal characteristics of every accent by
means of a diacritic before the accented syllable. The tonic stress marks system includes
5 different tones ((1) level, (2) fall, (3) rise, (4) fall-rise, (5) rise-fall), which can be
either high or low.
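
A tone unit in this sense can be sketched as a simple record in Python; the example words and the chosen tone are invented for illustration:

# Sketch of a tone unit in the sense of the British-school analysis underlying
# TSM: at most four components (pre-head, head, nucleus, tail) and one of five
# nuclear tones, each of which can be high or low. Example words are invented.

TONES = ["level", "fall", "rise", "fall-rise", "rise-fall"]

tone_unit = {
    "pre_head": "and then she",
    "head": "suddenly de-",
    "nucleus": "CI",          # the last accented (nuclear) syllable
    "tail": "ded to leave",
    "tone": "fall",
    "pitch_level": "high",
    "group": "major",         # major vs. minor tone group
}

assert tone_unit["tone"] in TONES
print(f"nuclear syllable '{tone_unit['nucleus']}' carries a "
      f"{tone_unit['pitch_level']} {tone_unit['tone']}")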

4.4. ToBI: Tones and Break Indices


Tones and Break Indices (ToBI, Beckman and Elam 1997; Silverman et al. 1992) is cur-
rently probably the best-known system for the prosodic representation of American
English. Although originally designed for a transcription of American English, Tones
and Break Indices has been successfully adapted to a number of languages, e.g., German (GToBI), Dutch (ToDI), varieties of English (IViE), the Glasgow dialect of English (Gla-ToBI), Spanish, Japanese, and Korean.
The basic principles of Tones and Break Indices are taken from the phonological
model of English intonation by Pierrehumbert (1980). The transcription of prosody
using Tones and Break Indices thus follows two steps: 1) transcription of tones and
2) transcription of break indices. For the transcription of tones, Tones and Break Indices
assumes tones to be part of accents or to indicate a boundary. They can be either high
(H) or low (L) and tones signaling the boundaries of prosodically defined phrases may
occur at their right or left edge. Altogether, Tones and Break Indices distinguishes five basic pitch accents, assuming that accents may contain more than one tone. The
transcription of tones consists of a speech signal and a fundamental frequency record
(F0) along with the time-aligned symbolic labels relating to four types of events (Beck-
man and Elam 1997).
The transcription of break indices is based on auditory and visual information and
distinguishes five levels of perceived juncture, which are transcribed between words
on the orthographic tier. They are numbered from 0 to 4, with 0 indicating the lowest
degree (words are grouped together into a “clitic group”), 1 marking the default bound-
ary between two words in the absence of another prosodic boundary, and 3 and 4
specifying intermediate and intonation phrase boundaries (Beckman and Elam 1997).
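
The two annotation steps can be illustrated with a short Python sketch; the example utterance and the particular tone labels are chosen for illustration only and are not taken from a ToBI-labelled corpus:

# Sketch of the two-step ToBI transcription described above: a tone tier with
# H/L pitch accents and boundary tones, and a break-index tier with values
# 0-4 between words. The utterance and label placement are illustrative only.

words = ["Marianna", "made", "the", "marmalade"]

tones = {           # word index -> tone label (simplified, not time-aligned here)
    0: "H*",
    3: "L* L-L%",   # nuclear accent plus phrase accent and boundary tone
}

break_indices = [1, 1, 1, 4]   # index after each word; 4 = intonation phrase boundary

BREAK_MEANING = {
    0: "clitic group (words grouped together)",
    1: "default word boundary",
    3: "intermediate phrase boundary",
    4: "intonation phrase boundary",
}

for i, word in enumerate(words):
    print(f"{word:<10} tone={tones.get(i, ''):<10} break={break_indices[i]}"
          f" ({BREAK_MEANING[break_indices[i]]})")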

4.5. PROLAB: Prosodic labeling


PROLAB, a method for prosodic labeling developed in the project “Verbmobil”
(Kohler, Pätzold, and Simpson 1995), is based on the Kiel Intonation Model (KIM) (Kohler 1991). Accordingly, PROLAB defines all categories perceptually, involving
bundles of acoustic properties including for instance fundamental frequency (F0), dura-
tion, intensity for sentence stresses, and segmental lengthening (Kohler 1995). Contrary
to Tones and Break Indices (ToBI), PROLAB does not represent pitch patterns as lin-
ear sequences of elementary tones but rather recognizes whole pitch peak and valley
contours. Furthermore, it marks sentence stress separately from intonation and sepa-
rates phrasing and intonation. Phrasing markers are assigned with reference to all pro-
sodic features on perceptual grounds. Contrary to other systems, PROLAB does not
offer separate labeling tiers for different speech phenomena. Rather PROLAB aims
at an integration of segmental and prosodic labeling (Kohler 1995), such that prosodic
labels can be integrated into a complete segmental label file or an orthographic file.

4.6. INTSINT: International Transcription System for Intonation


The International Transcription System for Intonation (INTSINT) is a system for cross-linguistic comparison of prosodic systems, allowing for transcription at different levels
of detail. It is based on the postulate that “the surface phonological representations of a
pitch curve can be assumed to consist of phonetically interpretable symbols which can
in turn be derived from a more abstract phonological representation.” (Hirst 1991: 307)
Transcriptions are thus closely linked to the phonetic realization of the intonation
contour, but at the same time allow for a phonological symbolization. In the Interna-
tional Transcription System for Intonation, prosodic target points are aligned with
an orthographic or phonetic transcription, defined in relation to earlier pitch, and are
transcribed by means of arrows corresponding to the different pitch levels (higher,
up-stepped, lower, down-stepped or same) (Hirst 1991; Hirst and Di Cristo 1998).
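
The relative labelling of pitch targets can be illustrated with a small Python sketch; the thresholds and the derivation from raw F0 values are simplifications for illustration, not Hirst’s actual coding procedure:

# Simplified sketch of INTSINT-style labelling of pitch target points relative
# to the preceding target. Thresholds and F0 values are invented; the real
# procedure derives targets from a modelled pitch curve.

LABELS = ("higher", "up-stepped", "lower", "down-stepped", "same")

def label_target(previous_hz, current_hz, small=5, large=20):
    """Classify a pitch target relative to the previous one (values in Hz)."""
    diff = current_hz - previous_hz
    if abs(diff) <= small:
        return "same"
    if diff > large:
        return "higher"
    if diff > 0:
        return "up-stepped"
    if diff < -large:
        return "lower"
    return "down-stepped"

targets = [180, 210, 215, 190, 188]   # hypothetical F0 target points in Hz
labels = [label_target(a, b) for a, b in zip(targets, targets[1:])]
print(labels)   # ['higher', 'same', 'lower', 'same']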

4.7. SAMPROSA: SAM Prosodic Transcription


SAM Prosodic Transcription (SAMPROSA) offers a prosodic transcription for linguis-
tic purposes and for prosodic labeling in speech technology and experimental phonetic
research (Wells et al. 1992). In SAM Prosodic Transcription, intonational transcriptions
need to be transcribed independently from other transcriptions or representations of
the signal on a separate tier. Accordingly, SAM Prosodic Transcription sets up parallel
symbolic representations of utterances using different segmental or prosodic criteria
(Gibbon 1988). The parallel symbolic representations may, thereby, be related either
through a) association, in which phonological rules are defined which relate prosodic
and segmental units or b) synchronisation, in which the symbols may be assigned to
the signal as tags or annotations. In general, SAM Prosodic Transcription allows for
the transcription of global, local, terminal and nuclear tones as well as length, stress,
pauses, and prosodic boundaries. However, SAM Prosodic Transcription is not a tran-
scription system for prosody in a strict sense but rather “computer-compatible codes
for use in formatting transcriptions for interchange purposes, once a model has been
selected” (Gibbon, Mertins, and Moore 2000: 53).

The preceding overview has shown that the systems not only vary in their theoretical
and methodological tradition, but also in their focus on a transcription of prosody. In
general, the proposed systems can be classified according to common and differing
parameters (Llisterri 1994). Regarding the representation of prosodic events, the sys-
tems can be classified into multi-tiered (the Tones and Break Indices, the International
Transcription System for Intonation) or one-tiered systems (e.g., the International Pho-
netic Alphabet, the Gesprächsanalytisches Transkriptionssystem 2, the HalbInter-
pretative ArbeitsTranskriptionen 2). The systems further differ regarding their aspects
of machine readable symbols (e.g., the Speech Assessment Method Phonetic Alphabet,
SAMSINT or SAM Prosodic Transcription) vs. non-machine readable symbols (e.g., the
Gesprächsanalytisches Transkriptionssystem 2, the Tones and Break Indices, the Ge-
sprächsanalytisches Transkriptionssystem, the HalbInterpretative ArbeitsTranskriptio-
nen, PROLAB). In addition, the systems differ in whether they are theory-driven
systems, that is a) based on a conception of the phonetics-phonology interface or b) data-
driven systems, i.e., defined by the needs and the practices which are known to be
relevant in order to explain the discursive or the interactional behavior of the speakers (Llis-
terri 1994).

5. Transcription of body posture


While the majority of research is interested in bodily behavior and body posture as in-
dicators for personal attitudes, personal criteria or emotions (e.g., Argyle 1975; Scherer
1970, 1979), or changes of body postures for discursive functions (e.g., Bohle 2007; Ken-
don 1972; Scheflen 1965), approaches focusing on a close description and transcription of body posture are rare. Existing transcription systems usually base their
transcription either on an anatomical or an environmental reference system. While ana-
tomic systems identify the location of a body part or the body as a whole with respect to
the bodily axes (e.g., Birdwhistell 1970; Frey et al. 1981; Wallbott 1998), environmental
systems define the body in relation to objects in the external world (e.g., Hall 1963).
Birdwhistell’s kinesic notation system for bodily motion (see also section 2.1.1.) in-
cludes the trunk, shoulder/arm/wrist, hip, leg, ankle, and neck for a description of body
posture. The different body parts are thereby described in their different anatomical po-
sitions. The trunk may be “leaning back” or “leaning forward,” the shoulders might be
“hunched” or “drooped lateral”. Seated positions may be described as “close double l
(seated, feet square on the floor)” or “reverse x (lower legs crossed)” (Birdwhistell
1970: 261–281). As with the notation of hand positions, shapes, and finger activities, Birdwhistell distinguishes a micro- and a macrokinesic level for the transcription of body
postures, thus allowing for a form-based as well as functional transcription.
A further anatomical transcription system including conventions for body posture is
the Bernese Coding System for Time-Series Notation of Body Movement (Frey et al.
1981; Hirsbrunner, Frey, and Crawford 1987) (see also section 2.1.2.). Similar to Bird-
whistell, the Bernese Coding Scheme differentiates the body according to the shoulders,
trunk, pelvis, thigh, and feet. The body parts are coded according to their potential to
engage in complex movement variations, which are defined for the most part as “displacements” or “flexions” from a standard “upright” sitting position. The feet are thus described, for instance, as “strongly tilted to the left” or “straight” (Hirsbrunner, Frey, and Crawford 1987: 110).
Based on Birdwhistell (1970) and Frey et al. (1981), but also on functional classifica-
tion systems such as Ekman and Friesen (1969), Wallbott’s system combines a form-
based and functional perspective. By distinguishing 5 categories (upper body, shoulders,
head, arms, hands), which are described according to their movement abilities (up, down,
back for the shoulders for instance), Wallbott’s system allows for a rough anatomical
description of various body postures.
Although primarily developed for the notation of dance, aspects of the Laban nota-
tion (Laban 1950) are nowadays used in a range of transcription systems (e.g., Davis
1979; Greenbaum and Rosenfeld 1980; Kendon 2004; Lausberg and Sloetjes 2009).
This system includes a basic segmentation of the skeletal system, basic kinesiological
terms (e.g., rotations), spatial terms (e.g., straight vs. circular paths) and object relations
(e.g., touch), which allow for a detailed notation of body posture and bodily movement.
More recently, transcription systems aiming at a combined description of the form and function of body postures have been proposed. Schöps (in preparation), for instance, presents a system comprising basic postures (standing, lying down, sitting) as well as body parts and movement categories, along with different predicates for transcribing the body configurations used (e.g., spread for arms and legs). The Body Action and Posture (BAP) coding system by Dael and Scherer (2012) approaches the transcription and coding of body posture on an "anatomical level (different articulations of body parts), a form level (direction and orientation of movement), and a functional level (communicative and self-regulatory functions)" (Dael and Scherer 2012).
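A record combining the three levels named by Dael and Scherer might be organized as in the following sketch (a hypothetical Python illustration; the field names and values are invented for exposition and do not reproduce the actual BAP categories):

    # Sketch of a three-level posture/action annotation, loosely modelled on
    # the anatomical / form / functional distinction; all labels are invented.
    annotation = {
        "anatomical": {"articulator": "trunk"},
        "form":       {"direction": "forward", "orientation": "towards addressee"},
        "functional": {"category": "self-regulatory",
                       "note": "posture shift at turn beginning"},
    }

    for level, features in annotation.items():
        print(level, features)
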

6. Transcription systems for gaze


The role of gaze has so far been investigated mostly in two main areas: 1) the organization of talk in interaction, greetings, and in particular turn-taking (e.g., Eibl-Eibesfeldt 1971; Goodwin 1981; Kendon 1967; Kendon and Ferber 1973), and 2) the regulation of the relationship and intimacy between interactants (e.g., Argyle and Cook 1976). Descriptions and transcriptions of gaze are usually rough, noting down aspects such as looking at each other, eyes closed, or eyes wide open. Birdwhistell (1970), for instance, includes a few conventions for the transcription of gaze, such as shut eyes, sideways look, or stare. Ehlich and Rehbein (1982), working with eight different categories of gaze, distinguish between gaze exchanged a) between the eyes and b) between the eyes and the face region of speakers, senders, and other interactants. The differentiation is based on the presence or absence of gaze and on its movement and duration (Ehlich and Rehbein 1982: 55–56), allowing for transcriptions of gaze such as "sender looks at recipient", "sender turns gaze towards recipient", or "recipient looks away from sender".
Goodwin's (1981) is probably the most explicit transcription system, providing conventions not only for different types of eye gaze but also for the setup of the transcript itself. The speaker's gaze is marked above the utterance, while the recipient's gaze is noted down underneath it. Goodwin includes conventions for marking the beginning and end of gaze, its preparation and retraction phases, and its directionality (Goodwin 1981: viii, 52–53).
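Schematically, such a transcript might be laid out as in the constructed example below (the symbols are placeholders chosen for illustration and do not reproduce Goodwin's actual conventions): the speaker's gaze is written above the utterance line and the recipient's gaze below it, dots mark the movement of gaze towards the other, the bracketed solid line marks sustained gaze at the other, and commas mark its withdrawal:

    speaker gaze:     . . [__________________]
    utterance:        and then we drove up to the lake
    recipient gaze:       [______________] , , ,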

7. Conclusion
This overview of notation and transcription systems for speech and bodily behavior has shown that a range of proposals exists, all of which try to account for the reproduction of verbal and bodily behavior in written form. It became apparent that the individual systems differ immensely in their theoretical background and methodological approach.


On the one hand, the large number of proposed systems, in particular for the transcription of gestures, speech, and prosody, results in a range of different forms of transcription and thus hinders the comparability of research results. On the other hand, it provides the necessary grounds for transcribing speech or bodily behavior according to particular research perspectives and theoretical foci, thus broadening the scope of research within the respective fields.

8. References
Argyle, Michael 1975. Bodily Communication. London: Methuen.
Argyle, Michael and Mark Cook 1976. Gaze and Mutual Gaze. Cambridge: Cambridge University
Press.
Auer, Peter, Elizabeth Couper-Kuhlen and Frank Müller 1999. Language in Time: The Rhythm
and Tempo of Spoken Interaction. New York: Oxford University Press.
Avanzi, Mathieu, Anne Lacheret-Dujour and Bernard Victorri 2008. ANALOR. A tool for semi-
automatic annotation of French prosodic structure. Paper presented at Speech Prosody 2008,
Campinas, Brazil, May 6–9.
Battison, Robin 1974. Phonological deletion in American Sign Language. Sign Language Studies
5: 1–19.
Becker, Karin 2004. Zur Morphologie redebegleitender Gesten. MA thesis, Department of Philos-
ophy and Humanities, Free University Berlin.
Beckman, Mary E. and Gayle Ayers Elam 1997. Guidelines for ToBI labelling. Retrieved
15.07.2011, from http://www.ling.ohio-state.edu/research/phonetics/E_ToBI/
Birdwhistell, Ray 1970. Kinesics and Context. Essays on Body, Motion, Communication. Philadel-
phia: University of Pennsylvania Press.
Bohle, Ulrike 2007. Das Wort Ergreifen – das Wort Übergeben: Explorative Studie zur Rolle Re-
debegleitender Gesten in der Organisation des Sprecherwechsels. Berlin: Weidler.
Bohle, Ulrike this volume. Approaching notation, coding, and analysis from a conversational ana-
lysis point of view. In: Cornelia Müller, Alan Cienki, Ellen Fricke, Silva H. Ladewig, David
McNeill and Sedinha Teßendorf (eds.), Body – Language – Communication: An International
Handbook on Multimodality in Human Interaction. (Handbooks of Linguistics and Communi-
cation Science 38.1.) Berlin: De Gruyter Mouton.
Bressem, Jana this volume. A linguistic perspective on the notation of form features in gestures. In:
Cornelia Müller, Alan Cienki, Ellen Fricke, Silva H. Ladewig, David McNeill and Sedinha
Teßendorf (eds.), Body – Language – Communication: An International Handbook on Multi-
modality in Human Interaction. (Handbooks of Linguistics and Communication Science 38.1.)
Berlin: De Gruyter Mouton.
Bressem, Jana and Silva H. Ladewig 2011. Rethinking gesture phases: Articulatory features of ges-
tural movement? Semiotica 184(1/4): 53–91.
Brinker, Klaus and Sven F. Sager 1989. Linguistische Gesprächsanalyse. Berlin: Erich Schmidt.
Bußmann, Hadumod 1990. Lexikon der Sprachwissenschaft. 2nd revised edition. Stuttgart,
Germany: Alfred Kröner.
Campione, Estelle and Jean Véronis 2001. Semi-automatic tagging of intonation in French spoken
corpora. In: Paul Rayson, Andrew Wilson, Tony McEnery, Andrew Hardie and Shereen Khoja
(eds.), Proceedings of the Corpus Linguistics 2001 Conference, 90–99. Lancaster, U.K.: Lancas-
ter University, UCREL.
Couper-Kuhlen, Elizabeth 1993. English Speech Rhythm: Form and Function in Everyday Verbal
Interaction. Amsterdam: John Benjamins.
Crystal, David 1969. Prosodic Systems and Intonation in English. Cambridge: Cambridge Univer-
sity Press.
Dael, Nele and Klaus R. Scherer 2012. The Body Action and Posture coding system (BAP):
Development and reliability. Journal of Nonverbal Behavior 36: 97–121.
Davis, Martha 1979. Laban analysis of nonverbal communication. In: Shirley Weitz (ed.), Nonver-
bal Communication: Readings with Commentary, 182–206. New York: Oxford University Press.
Dittmar, Norbert 2004. Transkription: Ein Leitfaden mit Aufgaben für Studenten, Forscher und
Laien. Heidelberg: VS Verlag für Sozialwissenschaften.
Du Bois, John W. 1991. Transcription design principles for spoken discourse research. Pragmatics
1(1): 71–106.
Du Bois, John W., Susanna Cumming, Stephan Schuetze-Coburn and Danae Paolino 1992. Dis-
course transcription. Santa Barbara Papers in Linguistics 4. University of California, Santa Bar-
bara, Department of Linguistics.
Ehlich, Konrad and Jochen Rehbein 1976. Halbinterpretative Arbeitstranskriptionen (HIAT 1).
Linguistische Berichte 25: 21–41.
Ehlich, Konrad and Jochen Rehbein 1979a. Erweiterte halbinterpretative Arbeitstranskriptionen
(HIAT2). Linguistische Berichte 59: 51–75.
Ehlich, Konrad and Jochen Rehbein 1979b. Zur Notierung nonverbaler Kommunikation für dis-
kursanalytische Zwecke (HIAT2). In: Peter Winkler (ed.), Methoden der Analyse von Face-to-
Face-Situationen, 302–329. Stuttgart: Metzler.
Ehlich, Konrad and Jochen Rehbein 1982. Augenkommunikation. Methodenreflexion und Beis-
pielanalyse. Amsterdam: John Benjamins.
Eibl-Eibesfeldt, Irenäus 1971. Transcultural patterns of ritualized contact behavior. In: Aristide H.
Esser (ed.), Behavior and Environment. The Use of Space by Animals and Men, 238–246. New
York: Plenum.
Ekman, Paul and Wallace V. Friesen 1969. The repertoire of nonverbal behavior: Categories,
origins, usage, and coding. Semiotica 1: 49–98.
Ekman, Paul and Wallace V. Friesen 1978. Facial Action Coding System (FACS): A Technique for
the Measurement of Facial Action. Palo Alto, CA: Consulting Psychologists Press.
Frey, Siegfried, Hans Peter Hirsbrunner, Jonathan Pool and William Daw 1981. Das Berner Sys-
tem zur Untersuchung nonverbaler Interaktion: I. Die Erhebung des Rohdatenprotokolls; II.
Die Auswertung von Zeitreihen visuell-auditiver Information. In: Peter Winkler (ed.), Metho-
den der Analyse von Face-to-Face-Situationen, 203–268. Stuttgart: Metzler.
Frey, Siegfried, Hans Peter Hirsbrunner and Ulrich Jorns 1982. Time-series notation: A coding
principle for the unified assessment of speech and movement in communication research. In:
Ernest W. B. Hess-Lüttich (ed.), Multimodal Communication: Vol. I Semiotic Problems of Its
Notation, 30–58. Tübingen: Narr.
Fricke, Ellen 2007. Origo, Geste und Raum: Lokaldeixis im Deutschen. Berlin: Walter de Gruyter.
Fricke, Ellen 2012. Grammatik multimodal: Wie Wörter und Gesten zusammenwirken. Berlin:
De Gruyter.
Garcia, Jesus, Ulrike Gut and Antonio Galves 2002. Vocale – a semi-automatic annotation tool for
prosodic research. Proceedings of Speech Prosody 2002.
Gibbon, Dafydd 1988. Intonation and discourse. In: Janos S. Petöfi (ed.), Text and Discourse Con-
stitution, 3–25. Berlin: De Gruyter.
Gibbon, Dafydd, Inge Mertins and Roger K. Moore 2000. Handbook of Multimodal and Spoken
Dialogue Systems: Resources, Terminology, and Product Evaluation. Norwell, Massachusetts,
USA: Kluwer Academic.
Goodwin, Charles 1981. Conversational Organization: Interaction between Speakers and Hearers.
New York: Academic Press.
Greenbaum, Paul E. and Howard Rosenfeld 1980. Varieties of touching in greetings: Sequential
structure and sex-related differences. Journal of Nonverbal Behavior 5(1): 13–25.
Grice, Martine, Stefan Baumann and Ralf Benzmüller 2005. German intonation in autosegmental-
metrical phonology. In: Sun-Ah Jun (ed.), Prosodic Typology: The Phonology of Intonation
and Phrasing, 55–83. Oxford: Oxford University Press.
Gumperz, John and Norine Berenz 1993. Transcribing conversational exchanges. In: Jane A. Ed-
wards and Martin D. Lampert (eds.), Talking Data: Transcription and Coding in Discourse
Research, 91–121. Hillsdale, NJ: Lawrence Erlbaum.
Gussenhoven, Carlos, Toni Rietveld and Jacques Terken 2005. Transcription of Dutch intonation.
In: Sun-Ah Jun (ed.), Prosodic Typology: The Phonology of Intonation and Phrasing, 118–145.
Oxford: Oxford University Press.
Gut, Ulrike, Karin Looks, Alexandra Thies and Dafydd Gibbon 2002. Cogest: Conversational ges-
ture transcription system version 1.0. Fakultät für Linguistik und Literaturwissenschaft, Uni-
versität Bielefeld, ModeLex Tech. Rep 1.
Hager, Joseph C., Paul Ekman and Wallace V. Friesen 2002. Facial action coding system. Re-
trieved 15.07.2011, from http://face-and-emotion.com/dataface/facs/guide/InvGuideTOC.html
Hall, Edward T. 1963. A system for the notation of proxemic behavior. American Anthropologist
65(5): 1003–1026.
Hirsbrunner, Hans-Peter, Siegfried Frey and Robert Crawford 1987. Movement in human inter-
action: Description, parameter formation and analysis. In: Aron W. Siegman and Stanley Feld-
stein (eds.), Nonverbal Behavior and Communication, 99–140. Hillsdale, NJ: Lawrence
Erlbaum.
Hirst, Daniel 1991. Intonation models: Towards a third generation. In: Actes du XIIème Congrès
International des Sciences Phonétiques, 305–310. Aix-en-Provence, France: Université de
Provence, Service des Publications.
Hirst, Daniel and Albert Di Cristo 1998. Intonation Systems: A Survey of Twenty Languages. Cam-
bridge: Cambridge University Press.
International Phonetic Association 2005. Handbook of the International Phonetic Association: A
Guide to the Use of the International Phonetic Alphabet. Cambridge: Cambridge University
Press.
Jefferson, Gail 1984. On stepwise transition from talk about a trouble to inappropriately next-
positioned matters. In: Maxwell J. Atkinson and John Heritage (eds.), Structures of Social
Action: Studies in Conversation Analysis, 191–222. Cambridge: Cambridge University Press.
Jefferson, Gail 2002. Is “no” an acknowledgment token? Comparing American and British uses of
(+)/(-) tokens. Journal of Pragmatics 34(10/11): 1345–1383.
Kallmeyer, Werner and Reinhold Schmitt 1996. Forcieren oder: Die verschärfte Gangart. Zur
Analyse von Kooperationsformen im Gespräch. In: Werner Kallmeyer (ed.), Gesprächsrhe-
torik: Rhetorische Verfahren im Gesprächsprozeß, 19–118. Tübingen: Narr.
Kendon, Adam 1967. Some functions of gaze-direction in social interaction. Acta Psychologica 26:
22–63.
Kendon, Adam 1972. Some relationships between body motion and speech. In: Aron W. Siegman
and Benjamin Pope (eds.), Studies in Dyadic Communication, 177–216. Elmsford, NY: Perga-
mon Press.
Kendon, Adam 1980. Gesture and speech: Two aspects of the process of utterance. In: Mary
Ritchie Key (ed.), Nonverbal Communication and Language, 207–288. The Hague: Mouton.
Kendon, Adam 2004. Gesture. Visible Action as Utterance. Cambridge: Cambridge University
Press.
Kendon, Adam and A. Ferber 1973. A description of some human greetings. In: R. P. Michael
and J. H. Crook (eds.), Comparative Ecology and Behavior of Primates, 591–668. New York:
Academic Press.
Kipp, Michael 2004. Gesture Generation by Imitation: From Human Behavior to Computer Char-
acter Animation. Boca Raton, FL: Dissertation.com.
Kipp, Michael, Michael Neff and Irene Albrecht 2007. An annotation scheme for conversational
gestures: How to economically capture timing and form. Journal on Language Resources and
Evaluation – Special Issue on Multimodal Corpora 41(3/4): 325–339.
Klima, Edward and Ursula Bellugi 1979. The Signs of Language. Cambridge, MA: Harvard
University Press.
Knowles, Gerry, Briony Williams and L. Taylor 1996. A Corpus of Formal British English Speech.
London: Longman.
Kohler, Klaus J. 1991. A model of German intonation. Arbeitsberichte des Instituts für Phonetik
und digitale Sprachverarbeitung der Universität Kiel 25: 295–360.
Kohler, Klaus J. 1995. ToBIG and PROLAB: Two prosodic transcription systems for German
compared. Paper presented at the Conference ICPhS Stockholm, 13 August 1995.
Kohler, Klaus J., Matthias Pätzold and Adrian P. Simpson 1995. From Scenario to Segment: The
Controlled Elicitation, Transcription, Segmentation and Labelling of Spontaneous Speech.
Kiel, Germany: Institut für Phonetik und Digitale Sprachverarbeitung, IPDS, Universität Kiel.
Laban, Rudolph von 1950. The Mastery of Movement on the Stage. London: Macdonald and
Evans.
Ladewig, Silva H. and Jana Bressem forthcoming. New insights into the medium ‘hand’: Discover-
ing recurrent structures in gestures. Semiotica.
Lausberg, Hedda and Han Sloetjes 2009. Coding gestural behavior with the NEUROGES –
ELAN system. Behavioral Research Methods 41(3): 841–849.
Llisterri, Joaquim 1994. Prosody encoding survey. MULTEXT-LRE Project 62-050.
MacWhinney, Brian 2000. The CHILDES Project: Tools for Analyzing Talk, Volume 2: The Da-
tabase. Hillsdale, NJ: Lawrence Erlbaum.
Martell, Craig 2002. Form: An extensible, kinematically-based gesture annotation scheme. Paper
presented at International Conference on Language Resources and Evaluation. European
Language Resources Association.
Martell, Craig 2005. FORM: An experiment in the annotation of the kinematics of gesture. Ph.D.
dissertation, Department of Computer and Information Sciences, University of Pennsylvania.
Martell, Craig and Joshua Kroll no date. Corpus-based gesture analysis: An extension of the
FORM dataset for the automatic detection of phases in a gesture. Unpublished manuscript.
McNeill, David 1992. Hand and Mind. What Gestures Reveal about Thought. Chicago: University
of Chicago Press.
McNeill, David 2005. Gesture and Thought. Chicago: University of Chicago Press.
McNeill, David and Susan D. Duncan 2000. Growth points in thinking-for-speaking. In: David
McNeill (ed.), Language and Gesture, 141–161. Cambridge: Cambridge University Press.
Mertens, Pier 2004. The prosogram: Semi-automatic transcription of prosody based on a tonal per-
ception model. Paper presented at Speech Prosody 2004, March 23–26, 2004, Nara, Japan.
Müller, Cornelia 1998. Redebegleitende Gesten: Kulturgeschichte – Theorie – Sprachvergleich. Ber-
lin: Arno Spitz.
Müller, Cornelia 2010. Wie Gesten bedeuten. Eine kognitiv-linguistische und sequenzanalytische
Perspektive. Sprache und Literatur 41(1): 37–68.
Müller, Cornelia, Hedda Lausberg, Ellen Fricke and Katja Liebal 2005. Towards a grammar of
gesture: evolution, brain, and linguistic structures. Berlin: Antrag im Rahmen der Förderinitia-
tive “Schlüsselthemen der Geisteswissenschaften Programm zur Förderung fachübergreifender
und internationaler Zusammenarbeit”.
Müller, Cornelia, Jana Bressem and Silva H. Ladewig this volume. Towards a grammar of ges-
tures: A form-based view. In: Cornelia Müller, Alan Cienki, Ellen Fricke, Silva H. Ladewig
and David McNeill (eds.), Body – Language – Communication: An International Handbook
on Multimodality in Human Interaction. Handbooks of Linguistics and Communication
Science (38.1). Berlin and Boston: De Gruyter Mouton.
O’Connor, Joseph Desmond and Gordon Frederick Arnold 1973. Intonation of Colloquial
English. London: Longman.
Pierrehumbert, Janet B. 1980. The Phonology and Phonetics of English Intonation. Boston: Mas-
sachusetts Institute of Technology Press.
Prillwitz, Siegmund, Regina Leven, Heiko Zienert, Thomas Hanke and Jan Henning 1989. Ham-
NoSys Version 2.0 Hamburger Notationssystem für Gebärdensprachen: Eine Einführung. Ham-
burg: Signum Verlag.
Redder, Angelika 2001. Aufbau und Gestaltung von Transkriptionssystemen. In: Klaus Brinker,
Gerd Antos, Wolfgang Heinemann and Sven F. Sager (eds.), Text und Gesprächslinguistik.
Ein Internationales Handbuch Zeitgenössischer Forschung, 1038–1059. (Handbücher zur
Sprach- und Kommunikationswissenschaft 16.2.) Berlin: De Gruyter.
Roach, Peter, Gerry Knowles, Tamas Varadi and Simon Arnfield 1993. Marsec: A machine-read-
able spoken English corpus. Journal of the International Phonetic Association 23(2): 47–54.
Sager, Sven F. 2001. Probleme der Transkription nonverbalen Verhaltens. In: Klaus Brinker, Gerd
Antos, Wolfgang Heinemann and Sven F. Sager (eds.), Text und Gesprächslinguistik. Ein In-
ternationales Handbuch Zeitgenössischer Forschung, 1069–1085. (Handbücher zur Sprach- und
Kommunikationswissenschaft 16.2.) Berlin: De Gruyter.
Sager, Sven F. and Kristin Bührig 2005. Nonverbale Kommunikation im Gespräch – Editorial. In:
Kristin Bührig and Sven F. Sager (eds.), Osnabrücker Beiträge zur Sprachtheorie 70: Nonver-
bale Kommunikation im Gespräch, 5–17. Oldenburg: Redaktion Obst.
Scheflen, Albert 1965. The significance of posture in communication systems. Psychiatry 27: 316–331.
Scherer, Klaus R. 1970. Non-Verbale Kommunikation: Ansätze zur Beobachtung und Analyse der
Aussersprachlichen Aspekte von Interaktionsverhalten. Hamburg: Buske.
Scherer, Klaus R. 1979. Die Funktionen des nonverbalen Verhaltens im Gespräch. In: Klaus R.
Scherer and Harald G. Wallbott (eds.), Nonverbale Kommunikation: Forschungsberichte zum
Interaktionsverhalten, 25–32. Weinheim: Beltz.
Schmitt, Reinhold (ed.) 2007. Koordination: Analysen zur Multimodalen Interaktion. Tübingen:
Narr.
Schneider, Wolfgang 2001. Der Transkriptionseditor HIAT-DOS. Gesprächsforschung-Online
Zeitschrift zur Verbalen Interaktion 2: 29–33.
Schönherr, Beatrix 1997. Syntax – Prosodie – Nonverbale Kommunikation. Empirische Untersu-
chungen zur Interaktion Sprachlicher und Parasprachlicher Ausdrucksmittel im Gespräch. Tü-
bingen, Germany: Niemeyer.
Schöps, Doris in preparation. Körperhaltung als Zeichen am Beispiel des DEFA-Films. Disserta-
tion, Technische Universität Berlin.
Selting, Margret, Peter Auer, Birgit Barden, Jörg R. Bergmann, Elizabeth Couper-Kuhlen, Sus-
anne Günthner, Christoph Meier, Uta Quasthoff, Peter Schlobinski and Susanne Uhmann
1998. Gesprächsanalytisches Transkriptionssystem (GAT). Linguistische Berichte 173: 91–122.
Selting, Margret, Peter Auer, Dagmar Barth-Weingarten, Jörg Bergmann, Pia Bergmann, Karin
Birkner, Elizabeth Couper-Kuhlen, Arnulf Deppermann, Peter Gilles, Susanne Günthner,
et al. 2009. Gesprächsanalytisches Transkriptionssystem 2 (GAT 2). Gesprächsforschung –
Online Zeitschrift zur Verbalen Interaktion 10: 353–402.
Silverman, Kim, Mary Beckman, John Pitrelli, Mari Ostendorf, Colin Wightman, Patti Price, Janet
Pierrehumbert and Julia Hirschberg 1992. ToBI: A standard for labeling English prosody.
Proceedings of ICSLP-1992, 867–870.
Stokoe, William 1960. Sign Language Structure. Buffalo, NY: Buffalo University Press.
Wallbott, Harald 1998. Ausdruck von Emotionen in Körperbewegungen und Körperhaltungen. In:
Caroline Schmauser and Thomas Noll (eds.), Körperbewegungen und ihre Bedeutung, 121–136.
Berlin: Arno Spitz.
Weinrich, Lotte 1992. Verbale und Nonverbale Strategien in Fernsehgesprächen: Eine Explorative
Studie. Tübingen: Niemeyer.
Wells, John, William Barry, Martine Grice, Adrian Fourcin and Dafydd Gibbon 1992. Standard
computer-compatible transcription. Technical Report No. SAM Stage Report Sen.3 SAM
UCL-037. London: University College London.
Zwitserlood, Ingeborg, Asli Özyürek and Pamela Perniss 2008. Annotation of sign and gesture
cross-linguistically. Paper presented at the 3rd Workshop on the Representation and Proces-
sing of Sign Languages, Marrakesh.

Jana Bressem, Chemnitz (Germany)