Disfluency and Proficiency in Second Language Speech Production, 1st Edition

Simon Williams
School of Media, Arts and Humanities
University of Sussex
Falmer, UK

ISBN 978-3-031-12487-7    ISBN 978-3-031-12488-4 (eBook)


https://doi.org/10.1007/978-3-031-12488-4

© The Editor(s) (if applicable) and The Author(s), under exclusive licence to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

Cover Illustration: Marina Lohrbach_shutterstock.com

This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
For Cynthia, Janet and Maureen
Acknowledgements

I owe a huge debt of gratitude to Cathy Scott, Senior Editor at Palgrave
Macmillan, for suggesting I write this monograph and for reading
through first drafts. I’m also grateful to Asma Azeezullah, Project
Coordinator, and the production team for their care and efficiency.
I would like to thank Simon Cassar, Anne-Meike Fechter, Sebastian
Loew, Roberta Piazza, Kevin Porthouse, and Robin Williams for their
support and encouragement; Nina Studer and Charlotte Taylor for
invaluable practical advice; and Janet Lacey, Cynthia Leverton, and
Maureen Porthouse for their kindness throughout.

Transcription Symbols

Below are the main symbols used within the book. Additional symbols
are described in examples in certain chapters.

=            Short (untimed) pause
(0.5)        Silent pauses are represented in seconds and tenths of a second
             within round brackets.
:            A colon indicates a prolongation of the preceding sound: the
             more colons, the longer the prolongation.
?            A question mark indicates a rising inflection.
↑            An upward arrow indicates a marked rise in intonation.
£            The pound sterling sign indicates a smiley voice.
>example<    Words that occur between the greater than sign and the less
             than sign indicate talk delivered at a faster speed than the
             surrounding talk.
<example>    Words that occur between the less than sign and the greater
             than sign indicate talk delivered at a slower speed than the
             surrounding talk.
( )          Single round brackets containing no transcription indicate
             talk that is too indistinct to transcribe.
(( ))        Talk between double round brackets indicates some sort of
             activity within the interaction.

Contents

1 Introduction
1.1 Disfluency as Fluency
1.2 Introduction to the Six Disfluencies: Formal Descriptions
1.3 Effects on Listeners
1.4 Main Types of Disfluency Data
1.5 Disfluency Types and Proficiency Level
1.6 Disfluencies in Public Language Test Descriptors
1.7 Comment
1.8 Overview
References

2 Silent Pauses
2.1 Introduction
2.2 Formal Description of Silent Pauses
2.3 Effect of Silent Pauses on Listeners
2.4 Silent Pauses in Classroom Interaction
2.5 Silent Pauses and Proficiency Level
2.6 Silent Pauses in Public Language Tests
2.7 Comment
References

3 Filled Pauses
3.1 Introduction
3.2 Formal Description of Filled Pauses
3.2.1 Location/Position
3.2.2 Frequency
3.2.3 Length/Duration
3.2.4 Pitch
3.3 Effect of Filled Pauses on Listeners
3.3.1 uh and um Studies
3.3.2 Second Language Rater Studies
3.4 Speech Environment of Filled Pauses
3.5 Filled Pauses and Speaker Proficiency
3.6 Filled Pauses in Public Language Tests
3.7 Comment
References

4 Prolongations
4.1 Introduction
4.2 Formal Description of Prolongations
4.3 Effect of Prolongations on Listeners
4.4 Main Types of Prolongation Data
4.4.1 Corpora Studies
4.4.2 Natural Language Processing
4.4.3 Prolongations in Classroom Interaction
4.4.4 Prolongations in Elicited Data
4.5 Prolongations and Proficiency Level
4.6 Prolongations in Public Language Tests
4.7 Comment
References

5 Repetitions
5.1 Introduction
5.2 Formal Description of Repetitions
5.3 Effect of Repetitions on Listeners
5.4 Repetitions Outside the Classroom
5.5 Repetitions in Elicited Data
5.6 Task Effect on Repetitions
5.7 Repetitions and Proficiency Level
5.8 Repetitions in Public Language Tests
5.9 Comment
References

6 Self-Corrections
6.1 Introduction
6.2 Formal Description of Self-Corrections
6.3 Effect of Self-Corrections on Listeners
6.4 Main Types of Self-Correction Data
6.4.1 Self-Corrections Outside the Classroom
6.4.2 Self-Corrections in Classroom Interaction
6.4.3 Task Effect on Self-Corrections in Elicited Data
6.5 Self-Corrections and Proficiency Level
6.6 Self-Corrections in Public Language Tests
6.7 Comment
References

7 False Starts
7.1 Introduction
7.2 Formal Description of False Starts
7.3 Effect of False Starts on Listeners
7.4 Main Types of False Starts Data
7.4.1 False Starts Outside the Classroom
7.4.2 False Starts in Classroom Interaction
7.4.3 Task Effect on False Starts in Elicited Data
7.5 False Starts and Proficiency Level
7.6 False Starts in Public Language Tests
7.7 Comment
References

8 Conclusion
8.1 Introduction
8.2 Formal Description of Disfluencies
8.3 Effects of Disfluencies on Listeners
8.4 Disfluencies in Classroom Interaction
8.5 Disfluencies and Proficiency Level
8.6 Disfluencies in Public Language Tests
8.7 Comment
Appendix
A
B
C
References

Index
Acronyms

A2 (CEFR) post-beginner language level
ASU syntax-based Analysis of Speech Unit
B1 (CEFR) pre-intermediate language level
B2 (CEFR) post-intermediate language level
C1 (CEFR) advanced language level
CEFR Common European Framework of Reference for Languages
CSR Continuous Speech Recogniser – software capable of detecting
and processing meaningful natural language, e.g. based on
vocabulary items, that allows a suitable response to be
automatically generated and enables human-computer
interaction to take place based on this principle
DELE Diploma de Español como Lengua Extranjera
EAIS Equal Appearing Interval Scale - a form of attitude measurement
that allows quantitative findings to be reported on qualitative
variables such as attitude
EEG Electrical Geodesics here refers to a non-invasive proprietary
head net containing electrodes sensitive to small changes in
voltage fields generated within the brain that is widely used in
neuroscience studies
EIT Elicited imitation test—a test in which listeners repeat samples
of language, e.g. sentences, as accurately as possible
ERP Event-related potential is a measurable electrical response in the
brain to a cognitive, motor, or sensory stimulus


ETS Educational Testing Service, a non-profit company
specialising in educational testing and assessment
FP Filled pause
FS False start
GECO Ghent Eye-Tracking Corpus
IELTS International English Language Testing System
L1 First language
L2 Second language
LFP Lengthening, Fragment, Pause (filled)
MLS Mean length of syllable
MOS Mean opinion score
N400 an electrical signal detectable in the brain in response to stimuli
that include signs in the form of words, sounds and images. It
is a form of ERP that peaks at 400 ms after onset.
NS Native speaker, i.e. speaker of an L1
NNS Non-native speaker, i.e. speaker of an L2
OET Occupational English Test
OISR Other-initiated self-repair
PRL Prolongation
PTE Pearson English Language Tests
RPT Repetition
SC Self-correction
SISR Self-initiated self-repair
SITAF Spécificités des Interactions verbales dans le cadre de Tandems
linguistiques Anglais-Français: verbal and non-verbal corpus
of conversational French
SLA Second language acquisition
SP Silent pause
SPEAK Test the Speaking Proficiency English Assessment Kit developed by
the Educational Testing Service (ETS)
TOEFL iBT Test of English as a Foreign Language Internet-based Test
TOEIC Test of English for International Communication
TRP Transition relevance place in conversation
UCLES University of Cambridge Local Examinations Syndicate
WISP What is Speaking Proficiency corpus
List of Figures

Fig. 1.1 Framework for L2 (dis)fluency and proficiency (based on
Segalowitz, 2016)
Fig. 2.1 Rating scale for Experiment 2 (Bosker et al., 2013, p. 175)
Fig. 3.1 The nine-level Equal Appearing Interval Scale completed by
participants in Experiment 1 (Bosker et al., 2013, p. 175)
Fig. 5.1 Incidence of repetition forms (based on Maclay and Osgood,
1959)
Fig. 6.1 Overview of the speech-processing monitor and repair as
comprehensible output (based on Levelt, 1989, p. 9)
List of Tables

Table 2.1 References to SPs in rating scale for individuals (based on
Sato, 2014, p. 85)
Table 2.2 References to SPs in rating scale for interaction (based on
Sato, 2014, p. 86)
Table 2.3 L2 studies reporting effect of SPs on listeners’ judgements of
fluency (studies inviting raters’ additional qualitative
comments on the speech samples are italicised)
Table 2.4 Number of SPs per 60 seconds and total silent pausing time
(based on Iwashita et al., 2008, p. 40)
Table 2.5 Proficiency test results (based on García-Amaya, 2015, p. 24)
Table 2.6 SPs by gender and level in Mandarin Chinese (L2) speakers
(based on Yuan et al., 2016, n.p.)
Table 2.7 Utterance and clause boundaries with SPs, language, and
English (L2) proficiency (percentages) (adapted from Rose,
2017, p. 50)
Table 3.1 Duration of FPs vs silent pauses within speaker
reformulations (based on Williams & Korko, 2019)
Table 4.1 Ten most frequent disfluent word prolongations in Betz
et al. (2016)
Table 4.2 Participants’ gender and age in Bosker et al. (2013)
Table 4.3 Percentage of pauses and fillers preceding and following thee
and thi:y (Fox Tree & Clark, 1997, p. 158)
Table 4.4 Filled pause tokens in four corpora in Clark & Fox Tree (2002)
Table 4.5 Key differences in filled pauses vs prolongations in Moniz
et al. (2007)
Table 5.1 Three subclasses of disfluent repetition in Plauché &
Shriberg (1999)
Table 5.2 Repetitions in post-semester interview home and abroad
participants (extract adapted from Table 8. Comparisons of
repair phenomena for at home and abroad groups, Freed,
1995, p. 141)
Table 6.1 CEFR level comparability with OET and Aptis tests
Table 7.1 False starts (T-test Mean totals) in four picture narratives
(Tavakoli & Foster, 2011; Foster & Tavakoli, 2009)
Table 7.2 Repairs in Van Hest (1996)
Table 8.1 Disfluencies and errors in three semi-spontaneous
monologues
Table 8.2 Measures of speaker speed and breakdown fluency to
achieve consistency (but not statistical significance) across
proficiency levels in the Aptis speaking test
1 Introduction

1.1 Disfluency as Fluency


Disfluency is a relatively recent construct. Not until the last century did
fluency with the specific meaning of ease of speaking become a subject of
widespread interest; and only in the mid-twentieth century did its corollary,
the absence of such a facility, become the focus of systematic empirical
study. Although the concepts of fluency and disfluency have been applied in a
metaphorical sense to a number of human, and occasionally non-human,
activities, the locus of much investigation has been the linguistically
productive areas of speech and writing. While disciplines such as medicine
and neuroscience have taken an interest in the pathology of dysfluent speech
in physiological conditions such as aphasia, apraxia, and dysarthria,
linguists and psycholinguists have sought to understand and analyse the
mundane occurrences of the non-pathological disorders of articulation. Six of
those resulting from social-psychological conditions have become standard
psycholinguistic typologies of speech disfluency: repetitions, filled and
silent pauses, prolongations, self-corrections and false starts.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. Williams, Disfluency and Proficiency in Second Language Speech Production,
https://doi.org/10.1007/978-3-031-12488-4_1

Disfluency remains a relatively specialised area of investigation. Perhaps
this is not surprising, given that the study of effortless behaviour has
always attracted more interest than its effortful equivalent. Nevertheless,
an examination of the components signalling breakdown can reveal a great deal
about the economy and detail of elements that are in more optimal relation to
each other. In human behaviour, those associations may offer insights into
the mechanics of a competence like language, into the workings of the brain,
and into human relationships.
If fluency has its own markers, such as fast speech, long turns, accurate
syntax, and syntactic complexity, it also demands a repertory of strategies
for achieving them, and for dealing with their consequences: the complex
syntax that loses its way, the speed that produces breakdowns in
articulation, the accuracy that becomes pedantic, and the turn that never
ends. As social interactants, speakers have to consider the effect on the
listener and beware of interruptions, non-sequiturs, mistaken inferences, and
other hazards, and intervene in the stream of speech to change course when
necessary. Although (dis)fluency studies frequently cite Lennon’s (1990)
distinction between the ‘broad’ and ‘narrow’ senses of the construct
(p. 388), broad meaning speaking proficiency in general and narrow meaning
aspects such as ‘correctness, idiomaticity, relevance, appropriateness,
pronunciation, lexical range’ (p. 389), few are careful to define where
speaking proficiency ends and more general language proficiency begins. In
other words, while the narrow sense of speaking fluency might refer to (lack
of) silent and filled pauses, repetitions, self-corrections and so on, and
the broader sense to these absences plus pronunciation, grammatical accuracy,
and vocabulary range, the proficiency here still refers to speech and not to
facility in other language activities such as listening, reading, and
writing. So while ‘fluent in English’ (p. 389) might mean proficient in
speaking the language, it could also be generalised in one direction to mean
proficiency in all uses of the language, or in another as speaking the
language in a particularly flawless, non-disfluent way.
Research in this area has followed a similar course to other objects of
linguistic attention, from early identification of disfluency as a novel
candidate for investigation to the increasing discrimination of exemplars, an
appreciation of their mutual relationship and functions, and the extension of
empirical studies to new populations. An initial focus on first language
speakers expanded to second language speakers and thence to natural language
processing for machine learning. Early work on one kind of disfluency
(Goldman-Eisler, 1968: pauses) moved to investigation of other kinds (Fox
Tree, 1995: false starts) and the development of taxonomies (Levelt, 1983);
and this approach was extended to research on second language disfluency
(Kormos, 1998) in speakers of first languages other than English (Crible &
Pascual, 2020) and to comparative studies of the disfluencies of bilingual
speakers in both languages (Riazantseva, 2001). In addition, the influence of
contextual variables has been studied, including task type (Ahmadian et al.,
2012; Gilabert, 2007; Foster & Skehan, 1996), cognitive load (Skehan, 2009),
gender (Bortfeld et al., 2001), culture (Swerts, 1998), and listener (Brennan
& Schober, 2001; MacGregor et al., 2009).
The general direction of study has been from highly controlled elicitation of
data within laboratory settings with first language participants being taken
as a given to more naturalistic, less controlled data and settings, and
thence to human-machine interfaces. Within L2 research, the classroom has
served as the equivalent to the laboratory and study has not generally
gravitated beyond. The focus reflects the overriding concerns of L2
researchers with pedagogic environments to the extent of the relative neglect
of naturalistic settings with their inferred dangers for L2 learners such as
fossilization (e.g. Schmidt, 1983).
Disfluency research began by studying monologic data elicited by read-aloud
prompts (Goldman-Eisler, various) and static visual networks (Levelt, 1983).
Only later did investigators turn their attention to the analysis of spoken
interaction and acknowledge the role of the addressee in the production of
disfluent speech. For example, Clark (1994), following Schegloff et al.
(1977), noted that disfluencies were joint problems for interactants and had
three purposes: prevention, warning and repair. Forms of hesitation have also
been found helpful to conversationalists by alerting the listener to expect a
next word or utterance from the speaker that is either unusually complex or
marked in some other way. Certain forms of hesitation, such as the filled
pause, may be culturally defined, e.g. the function of the filled pause has
been found to differ according to the speakers’ first language (Schmid &
Fägersten, 2010; Crible & Pascual, 2020); and to vary with gender, e.g. men
make more use of the filled pause than women when they are searching for a
word and want to hold the floor (Shriberg, 2001). The two types of
interactional approach, i.e. the language production approach and the social
interaction approach (Faerch & Kasper, 1983, 1984; and see Chiang & Mi, 2011
for a discussion) represent a long-standing, cognitive-social debate on
second-language acquisition (SLA) (cf. Firth & Wagner, 1997, 2007; Gass,
2004).
Perhaps because of the high demands of research time and, apart from test
providers and certain government agencies, the paucity of funding for most
spoken language research, questions regarding individual differences, noted
by Engelhardt et al. (2019), and the speech style and stock of disfluencies
produced by the same speakers in different conditions and in response to
social judgements, remain under-researched. The range of fluencies demanded
and codified by cultures was early recognised in first language studies: the
fluency of the DJ is different from that of the diplomat, which is different
again from the after-dinner speaker, though all three would be acknowledged
as notably fluent in their professional lives. Influenced in this way by a
mix of speech style and cultural norms, disfluencies as conversational
markers become rhetorical strategies, associated both with speech genres and
with personal identity.
The six kinds of disfluency discussed in these chapters have all been
classed as hesitation (Maclay & Osgood, 1959), with silent or unfilled
pauses occurring most frequently.
Perhaps the most recognizable feature associated with the absence of fluency
is pausing. It is present to a greater or lesser extent on its own and as
part of other forms of disfluency. Many speakers interrupt the flow of
speech, creating a temporary hiatus and a moment of silence. Silent breaks
occur often in conversation and, if shorter than 0.25 seconds, are not
normally noticed by an interlocutor. Pauses of longer duration, however, are
marked and are likely to cause the interlocutor to search for an explanation
and make a judgement about the quality of the exchange. For example, speaker
pauses to conduct a word search have been shown to alert the listener to the
less ordinary nature of the following word, aiding mutual understanding and
contributing to the success of the discourse.
As another form of codified description of (dis)fluency, the level
descriptors of well-known English language tests are useful in illustrating
this point. With first and second language development, disfluencies never
wholly disappear but rather become invisible through cultural
standardisation. Thus, the remarkable mid-juncture silences of the L2
beginner migrate towards end-juncture and become longer though less
noticeable. Filled pauses become lexicalised, e.g. like and yeah, and
function words may be imperceptibly prolonged as the speaker searches for a
low frequency content word. Function words and phrases may also be repeated
in hardly noticeable close succession for the same reason. The L2 learner
becomes adept at self-correcting from the previously articulated function
word, so contextualising the repair and minimising the mental work of the
listener. The same principle applies to false starts, with the added
advantage to the discourse that the repair is information more obviously
intended for the benefit of the listener at a here-and-now level of
interaction rather than an attempt to conform to more general social norms of
language accuracy. Eventually, the L2 (dis)fluency patterns are
indistinguishable from those of an L1 speaker, save that wide individual
variation in the use of filled and silent pauses and repetitions, all
delaying tactics, is well-recognised and accepted.

1.2 Introduction to the Six Disfluencies: Formal Descriptions

The nature of disfluent production and its relation to general proficiency in
speaking is the subject of the next six chapters. The object of enquiry is
first one of definition, description, and discovery, and then its application
to the wider spoken situation. The survey that occupies most of these pages
collects a range of studies from sub-areas of spoken fluency, each with its
own agenda, terminology, and concerns, and responds to the ‘So what?’
question by understanding (dis)fluency as a social project in which speakers
are constantly adapting to the norms that surround them, and in their turn
contributing to those norms. An illustration might be the previously
mentioned lexical filler like, a disfluency that functions to give the
speaker more time in the planning process and is also a discourse marker
conveying a particular speaker identity and contributing to fluent speech by
maintaining the flow of talk.
If a central notion associated with fluency is keeping the talk going, then
disfluency is the opposite, a temporary interruption to the flow of speech.
The speaker is faced with an ongoing tension between planning and production:
keep going and hesitate from time to time to plan the next stretch of talk or
carry on regardless and repair the inevitable errors and breakdowns. Thus,
fluency has become synonymous with speed, and disfluency with forms of
hesitation and repair. Four recognised forms of hesitation are silent pauses,
filled pauses, prolongations, and repetitions. Two forms of repair are
self-corrections and false starts. These six disfluencies are the subject
matter of the central chapters. The modern history of their investigation and
the nature and amount of interest in each vary considerably. Common to each
chapter is a general introduction to the disfluency, its formal description,
effect on listeners, relation to speaker proficiency, significance in
international language tests, and a final comment.
Interest in silent and filled pauses goes back to Goldman-Eisler’s work in
the early 1950s, and analysis of what they define as hesitation types (silent
and filled pauses, repetitions and false starts) to the seminal study of
Maclay and Osgood (1959). Prolongations in non-pathological speech are the
most youthful of the research areas in disfluencies (e.g. Eklund, 2001),
although in relation to L1 stutterers prolongations have a much longer
history. Similarly, studies in repair usually cite Schegloff, Jefferson and
Sacks’ (1977) paper, and disfluency studies in particular cite Levelt (1983),
who distinguishes five subtypes in Dutch (L1) data that include
self-corrections and other forms of repair recognisable as false starts.
Notable so far is the focus on L1 speech. It was van Hest (1996) in her
empirical study who applied Levelt’s (1983) taxonomy to the L1 and L2 speech
of her participants.
Common to all these disfluency specialisms is the interest of the enquirers
in solving linguistic problems in a specialist area. For example, the
interest in non-pathological prolongations as disfluencies is shared by a
group of specialists in human-computer communication such as Eklund
(human-computer travel booking systems) and Betz (spoken dialogue systems).
The purpose of their enquiries is to improve the increasingly common
human-machine communication experience by introducing disfluencies like
fillers and prolongations into machine-simulated natural language to make it
sound more natural and increase the time available for processing responses.
If any evidence were needed that the status of disfluencies has been revised,
one example is their inclusion in speech synthesis, e.g. filled pauses (Adell
et al., 2010), in order to simulate spontaneous rather than read speech.
The interest in second language self-corrections is concentrated particularly
in language teaching practitioners and specialists keen to maximise the
acquisition of the L2 by learners. Evidence exists that at an early stage,
language learners prioritise acquiring the formal rules of the target
language and devote considerable energy to eliciting models from expert
others, gradually internalising them.
Silent pauses (SPs) as stretches of silence within the speech stream occur
naturally at clause and sentence boundaries but are heard as disfluencies
when they interrupt a word, phrase or clause. Pause duration is less
significant in this regard, though longer boundary clauses are associated
with longer following sentences (Goldman-Eisler, 1972). In this way, pauses
signal an impending information increase, and words following pauses are
found to be less predictable. Words that are not preceded by pauses are
therefore more predictable, and fluency as absence of mid-clause pausing is
associated with predictability or word frequency, providing some
justification for encouraging learners to include formulaic sequences in
their speech as a means to increase fluency. Pauses can be lengthy at
boundaries without sounding marked. Thus, the distribution of pauses is more
significant than their duration.
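
The point that location matters more than length can be illustrated with a
minimal Python sketch. It is not a published annotation scheme: the Pause
record, the function names, and the 0.25-second floor below which silences
are ignored are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed floor (in seconds) below which a silence is not counted as a pause.
# The value is illustrative only.
MIN_PAUSE_S = 0.25


@dataclass
class Pause:
    duration_s: float   # length of the silence in seconds
    at_boundary: bool   # True if it falls at a clause or sentence boundary


def classify(pause: Pause) -> str:
    """Label a silence by where it falls rather than by how long it lasts."""
    if pause.duration_s < MIN_PAUSE_S:
        return "ignored"
    return "unmarked" if pause.at_boundary else "disfluent"


def mid_clause_ratio(pauses: list[Pause]) -> float:
    """Share of counted pauses that interrupt a word, phrase, or clause."""
    counted = [p for p in pauses if p.duration_s >= MIN_PAUSE_S]
    if not counted:
        return 0.0
    return sum(not p.at_boundary for p in counted) / len(counted)
```

On this sketch a 1.2-second pause at a sentence boundary is labelled
unmarked, while a 0.4-second pause inside a clause is labelled disfluent,
which mirrors the claim that distribution outweighs duration.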
Filled pauses (FPs) can be defined as voiced hesitation such as er and erm in
English and include lexical fillers such as yeah, like and you know that
carry no additional information. Speakers may produce filled pauses for at
least three reasons, some of which overlap, depending on the situation. Of
interest to psycholinguists is the filled pause as marker for a word search.
Interactionists are concerned with the filled pause as back-channelling. And
discourse analysts will note the turn-taking function of filled pauses. As
with silent pauses, work has centred on the distribution, duration, and
frequency of filled pauses. And like silent pauses, the unmarked location of
filled pauses is at major boundaries, where they might indicate either a
desire to continue speaking or an invitation to the listener to take the
floor, depending on pitch and intonation. Again, like silent pauses, filled
pauses are more likely to occur before a low-frequency content word and
before longer clauses; therefore, like silent pauses in this regard, they
indicate syntactic uncertainty. FPs are markedly shorter in duration than SPs
and are produced at a lower pitch than the surrounding words.
Prolongation is the continuation of the immediately prior phoneme and is
classed as disfluent when it continues for longer than usual relative to
similar phones produced by the same speaker. The prolonged sound and the
usual length of production can be measured digitally for comparison.
Prolongations are one way for the speaker to gain extra time for planning and
information processing without resorting to a more noticeable repetition,
filled pause, or silent pause. In this sense, prolongations may be the first
option for speakers who need extra time. Three-quarters of disfluent
prolongations occur in function words rather than content words. Speakers’
preferred location for prolongations is the long vowel in the nuclei of
syllables but the location is capable of migrating, e.g. to a sonorant in the
coda, when options are reduced.
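
The comparison of a sound with the same speaker’s usual production of similar
phones can be sketched as follows. This is a hedged illustration, not a
method reported in the studies cited: the function name, the use of a
leave-one-out z-score, and the threshold of 2.0 are all assumptions.

```python
from statistics import mean, stdev


def prolonged_tokens(durations_ms, z_threshold=2.0):
    """Return indices of unusually long tokens of one phone by one speaker.

    Each token is compared with the speaker's remaining tokens of the same
    phone (leave-one-out), so that a single long token does not inflate the
    baseline it is judged against. The threshold is an assumed value.
    """
    if len(durations_ms) < 3:
        return []  # too few tokens to estimate a usual duration
    flagged = []
    for i, duration in enumerate(durations_ms):
        others = durations_ms[:i] + durations_ms[i + 1:]
        mu, sigma = mean(others), stdev(others)
        if sigma > 0 and (duration - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged
```

Given six hypothetical durations for one vowel, [80, 90, 85, 95, 88, 310],
only the last token stands out against the speaker’s own baseline and would
be flagged as a candidate prolongation.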
Repetitions are contiguous repeats of the same sound, syllable, word, or
phrase that convey no additional meaning for the listener and are more likely
to be heard as disfluent than discontinuous repeats, i.e. repeats separated
by some intervening language. Like the location of prolongations, repetitions
are more likely to be function words that occur before content words,
presumably allowing the speaker time to search for the lexical item.
Repetitions are most often of single words, occasionally more, and rarely
less. Like filled pauses, repetitions are often found near to, or at the
start of, an utterance (Schegloff, 1987), possibly to repair speaker
overlaps. By analysing intonation and pause patterns surrounding repetitions,
Plauché and Shriberg (1999) offer a model of three repetition types:
canonical, covert self-repair, and stalling, illustrated with the definite
article representing a function word. As the term suggests, canonical
repetition is the most numerous of the three subtypes, and the repetition,
retrospective in nature, provides some continuity for the listener. Covert
repetition is rather prospective, its rising intonation expressive of the
speaker’s planning uncertainty. The function of stalling repetitions may be
to keep the floor during planning problems and they seem to be simultaneously
prospective and retrospective.
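
The definition of a repetition as a contiguous repeat of a word or phrase
can be made concrete with a short Python sketch over a tokenised transcript.
The function name and the limit of three words per repeated phrase are
assumptions; the sketch works on words only and says nothing about sounds,
syllables, intonation, or the three subtypes described above.

```python
def find_repetitions(tokens, max_len=3):
    """Return (start, length) pairs for contiguous word or phrase repeats.

    A repeat is a stretch of `length` tokens followed immediately by an
    identical stretch. Longer phrases are tried first. `max_len` is an
    assumed cap on the size of the repeated phrase.
    """
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        for size in range(max_len, 0, -1):
            first = tokens[i:i + size]
            second = tokens[i + size:i + 2 * size]
            if len(second) == size and first == second:
                hits.append((i, size))
                i += size  # move to the second copy, which may repeat again
                matched = True
                break
        if not matched:
            i += 1
    return hits
```

For the hypothetical utterance "I went I went to the the shop", split on
spaces, the sketch reports a two-word repeat starting at position 0 and a
one-word repeat of the function word at position 5.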
1 Introduction 9

A self-correction is the speaker’s replacement of non-standard output


with an alternative supposed to be standard. The concern with standardi-
sation makes self-corrections different from false starts, the other type of
self-repair. Answering Schegloff et al.’s call for ‘an account of the organi-
zation of repair’ (Schegloff et al., 1977, p. 381), Levelt (1983) describes a
three-part model, which applies equally to self-corrections and false
starts. It consists of a problem source, an editing phase characterised by a
silent pause and which may include an editing term such as a filler, and a
revision or repair of the trouble source. To be well-formed and not impose
an additional burden on the listener, the speaker needs to repeat in the
revision any words that followed the problem source that were articulated
before the interruption of the editing phase, and optionally any words
immediately in front of the problem source, especially if they are func-
tion words and the problem is a content word. In contrast to false starts,
most self-corrections repair only the problem element. The repetition of
language around the problem source, mandated where words follow the
problem source, provides evidence that the syntactic structure is held in
the speaker’s working memory and is available for the revision.
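The three-part structure and the retracing condition described above can be sketched in code. The following is a minimal illustration only, not Levelt's own formalisation: the class name, field names, the `is_well_formed` check, and the example utterance are all invented here, and the check simplifies the condition to "any words spoken after the problem source must recur, in order and adjacent to one another, somewhere in the revision".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Repair:
    """One repair, split into the three parts named in the text (hypothetical layout)."""
    problem_source: List[str]  # word(s) the speaker goes back to replace
    trailing_words: List[str]  # words spoken after the problem source, before the interruption
    editing_phase: List[str]   # pause markers or fillers, e.g. "(.)" or "uh"
    revision: List[str]        # the words offered as the repair


def contains_run(words: List[str], run: List[str]) -> bool:
    """Return True if `run` occurs as a contiguous stretch inside `words`."""
    span = len(run)
    return any(words[i:i + span] == run for i in range(len(words) - span + 1))


def is_well_formed(repair: Repair) -> bool:
    """Assumed simplification: trailing words, if any, must recur in the revision."""
    return contains_run(repair.revision, repair.trailing_words)


# Invented utterance: "turn left at the, uh, turn right at the ..."
example = Repair(
    problem_source=["left"],
    trailing_words=["at", "the"],
    editing_phase=["uh"],
    revision=["turn", "right", "at", "the"],
)
print(is_well_formed(example))
```

A repair with no trailing words passes trivially, which matches the observation that most self-corrections repair only the problem element.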
False starts may be defined as the speaker’s abandonment of an appar-
ently standard utterance to start afresh or continue in a different direc-
tion. The three-part structure of false starts is as described for
self-corrections, consisting of problem source, editing phase, and revi-
sion. An important difference is that if the speaker had not commenced
the editing phase, the listener would not have recognised any need for a
revision. This is particularly the case with so-called ‘covert repairs’ (Levelt,
1983, p. 45), where a word is simply repeated after an intervening editing
term such as a silent or filled pause, or the utterance continues following
an editing phase, apparently in the same direction. Because no standard
form exists for the speaker’s false start to orient to, the editing phase and
revision are likely to be longer and more complex than a self-correction
(Kormos, 2000). On the other hand, while this complexity and the inter-
ruption to the speech flow it causes is what makes false starts disfluent,
the speaker’s orientation may be wider than finding the optimum expres-
sion for a thought. It may signal an adjustment to the discourse for the
mutual benefit of speaker and listener. Certain other factors, such as the
well-formedness of the repair, are associated with proficiency.

1.3 Effects on Listeners


SPs are invariably included in quantitative acoustic measures in rater
studies and their location at non-clause boundaries is significant. SP
mean length is less important to raters’ fluency assessments than their
frequency (Cucchiarini et al., 2002). Their importance in fluency judge-
ments is underscored by their accounting for the largest percentage of
any temporal variable in listeners’ negative impressions of fluency
(Derwing et al., 2004). In monologic speech production, raters seem to
interpret SPs as L2 speaker processing problems. In dialogic speech pro-
duction, in contrast, raters seem to interpret SPs as a turn-taking phe-
nomenon rather than a disfluency. Speakers who produce fewer SPs have
been found to pause for longer, and longer SPs may imply content plan-
ning. Shorter SPs, on the other hand, may imply breakdowns (Préfontaine
et al., 2016). Slow speech may be judged fluent in itself, but the addition
of SPs results in its being perceived as unduly slow and being awarded a
lower fluency rating (van Os et al., 2020). In a meta-analysis of rater
studies, Suzuki et al. (2021) found no association between length of
pause and perceived disfluency. The negative effect of clause-internal
pauses and pause frequency, which may signal syntax building and word
retrieval problems, on the other hand, was strong.
FPs have been shown to influence listeners’ expectations of a following
speaker referent, and to be heard as disfluencies when not conforming to
target language placement, i.e. end clause, and lexicalisation, e.g. well,
like, you know (English L1). Discussion on the signalling effect of FPs
concerning the identification of approaching referents as low- rather than
high-frequency centres on speaker intentionality and listener inference.
Experiments by Bosker et al. (2014), Fraundorf and Watson (2011),
Watanabe et al. (2008) and others indicate that FPs are signals rather than
signs, and listeners must learn to recognise the associations of speaker dif-
ficulty with cognitive or environmental challenges. In this respect, Bosker
et al. (2014) report that when listeners heard disfluent L2 instructions,
their anticipatory effect was neutralised and they no longer inferred the
approach of a low-frequency referent. Rather, they reassessed the speaker’s
overall competence instead of assuming a local and temporary challenge.
The results of rater studies indicate the significance to listeners of FPs.
Together with SPs, FPs were responsible for 59% of listener judgements
of temporal disfluency in Rossiter (2009); but Cucchiarini et al. (2002)
conclude from their findings that FPs are a poor indicator of fluency and
may instead be a strategy reflecting sentence planning. Perhaps FPs rather
reflect individual differences or broader cultural norms.
Prolongations in the form of longer syllables have been found to occur
before the vast majority of syntactic junctures and are also typical of gram-
matical junctures, where length of syllable is generally longer, as it is before
rather than after pauses (Martin, 1970). Listeners identify prolongations
as disfluencies yet tolerate them well. In rater studies, listeners prefer pro-
longations to compound disfluencies (Betz et al., 2015) and, consistent
with their effect at grammatical junctures, rate prolongations that occur
within conjunctions highly. Prolongations are thus a means of preserv-
ing fluent speech while planning more complex language (Moniz et al.,
2007). Disfluent prolongations, i.e. certain extended marked forms, espe-
cially in function words such as thee for the, have been shown to prime
listeners for a following referent that is new or unfamiliar (Arnold et al.,
2003). Listeners’ response seems to be inferentially learned rather than
associative because when prompts are given by an L2-accented speaker,
the attenuation effect disappears, i.e. listeners show no preference for old
or new referents following a prolongation. It may be that the wide varia-
tion in length of prolongations is what makes them notably acceptable to
listeners. Furthermore, because speakers seem to plan ahead during longer
prolongations, overall speed is not reduced (Betz et al., 2017).
Like FPs, repetitions may serve as planning devices for speakers
(Lennon, 1990). Similarly, when speakers repeat a function word such as
an article, they may be conducting a word search for a following noun
(Temple, 1992). The fact that repetitions have been reported to increase
in number in study abroad groups, presumably as a result of pressure on
speakers to keep pace in interaction with L1 speakers (Freed, 1995),
reveals the consequences of the additional real-time planning necessary in
the new speech environment. Although listeners are often oblivious to
repetitions in interaction and repetitions impose no additional burden on
listener comprehension, their Event Related Potential (ERP) effect,
recorded by Electrical Geodesics (EEG), is similar to prolongations and
consistent with semantically incongruous input (McAllister et al., 2001).
The incongruity may explain why listeners in rater studies see them as a
key indicator of disfluency (Rossiter, 2009).
In general, self-corrections seem not to influence listeners’ judgements
of speaker fluency: although listeners are sensitive to repairs, their judge-
ment is barely affected (Bosker et al., 2013). An exception is reported in
McRobie (1993), whose raters awarded higher fluency scores to
non-corrected vs corrected speech. Expert L1 and L2 raters’ written comments
as they audited L2 speech showed that they consistently reacted more
negatively to self-corrections than false starts (Rossiter, 2009). To sum
up, where self-correction is a named fluency measure in listener rating
studies, it hardly figures in raters’ perceptions.
False starts are another disfluency that holds meaning for listeners, in
this case by recycling material to create ‘discourse coherence’ (Levelt,
1983, p. 42). ERP data shows that false starts also involve a processing
cost (McAllister et al., 2001) and slow monitoring times (Hindle, 1983),
which becomes evident from the more numerous false starts of beginner
and intermediate-level speakers compared to advanced-level speakers.
Lower level speaker FSs also tend to be ill-formed (van Hest, 1996), pos-
sibly reflecting working memory capacity. Although advanced learners
produce fewer false starts, for that reason their false starts may be more
distracting. Comprehension is also retarded by the occurrence of false
starts in the middle of an utterance, and near the start but introduced by
a conjunction, rather than without any marker at the beginning. In gen-
eral, however, rater studies reveal that although listeners notice false starts,
they are one of the disfluencies least likely to elicit a negative reaction
(Rossiter, 2009).

1.4 Main Types of Disfluency Data


For the purposes of this survey, the data for disfluency studies is catego-
rised into five types: classroom interaction, corpora, elicited data, natural
language processing, and natural data outside the classroom. The division
cannot be watertight and anomalies remain, with ambiguity, overlap, and
contradiction. Some categories are confined to a single chapter, and no
category is common to all. However, even when a data type is absent
from the central section, it may feature within one of the other sections.
For example, no separate section exists for elicited data in the chapter on
silent pauses. Yet elicited data is central to the rater studies reported
within the ‘effect on listeners’ section in the same chapter.
In fact, all six disfluencies feature in the core group of elicited rater
studies, most of which follow a similar design. Listeners rate speech
recordings of L2 learners for fluency. The recordings are subsequently
analysed for acoustic measures such as speed and pauses and the results
compared with the ratings. Inferences are made about the features that
listeners responded to in making their judgements on fluency. These
core studies are given a separate section, Effects of [the disfluency] on lis-
teners, in central chapters. Data from classroom interaction is included
in Chap. 2: Silent pauses for a number of reasons. Walsh and Li (2013)
illustrate the problem of assigning between-turn pauses in interaction
to one or other speaker and discuss the implications when one of the
speakers is a teacher eliciting responses to questions and the other a
student who is socially reticent or requires time to think or both. The
shorter pauses observable in learner pair work contribute to fluency in
dialogic speech (Tavakoli, 2016). Longitudinal study of learners shows
a steady decline in pause time with proficiency gains, though quantity
and pause distribution mid-clause still convey disfluency (Mora &
Valls-Ferrer, 2012).
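The rater-study design outlined earlier in this section (listeners rate recordings, acoustic measures are extracted, and the two are compared) can be illustrated with a small sketch. It assumes that the comparison is a simple Pearson correlation between pause frequency and mean fluency rating. The function names and every figure in the data are invented for illustration and do not come from any of the studies cited.

```python
from math import sqrt
from typing import List


def pauses_per_minute(pause_count: int, duration_seconds: float) -> float:
    """Normalise a raw pause count to a per-minute rate."""
    return pause_count * 60.0 / duration_seconds


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; assumes equal lengths and non-zero variance in both lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented recordings: (silent pause count, duration in seconds, mean rater score)
recordings = [
    (6, 60.0, 8.0),
    (12, 60.0, 6.5),
    (18, 60.0, 5.0),
    (30, 120.0, 4.5),
]

rates = [pauses_per_minute(count, secs) for count, secs, _ in recordings]
ratings = [score for _, _, score in recordings]

# A negative value would suggest that more frequent pausing goes with lower ratings.
print(pearson_r(rates, ratings))
```

Published rater studies typically use more elaborate models with several acoustic predictors; a single correlation is shown only to make the logic of the design concrete.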
Apart from appearing in other sections such as effects on listeners and
proficiency, filled pause data in Chap. 3 is restricted to the uh and um
studies in the elicited data section. Prolongations are the only disfluency
whose study data is largely based on corpora and natural language pro-
cessing: they do not enjoy autonomous sections in any other chapter.
Although Chap. 4: Prolongations includes short sections on classroom
and elicited data, prolongations feature only in transcripts of classroom
interaction and are rarely the subject of analysis; they also figure in elic-
ited data collected in a classroom and gathered into a corpus in Moniz
et al. (2007). There is no classroom data section in Chap. 3: Filled pauses
or Chap. 5: Repetitions. Corpora and natural language processing sec-
tions are included only in Chap. 4. Natural language data collected out-
side the classroom is confined to Chaps. 5, 6 and 7 on repetitions,
self-corrections, and false starts.

1.5 Disfluency Types and Proficiency Level


SPs were the only fluency measure to discriminate between high- and
low- proficiency learners in Riggenbach (1991). Low-proficiency learners
have been found to produce the great majority (69%) of non-juncture
SPs, most SPs occurring in compound disfluencies, e.g. false starts,
self-corrections, and repetitions (Cenoz, 1998). Lower-proficiency learners
produce more SPs than higher-proficiency learners; and pause duration
decreases within speech units in line with higher L2 proficiency
(Riazantseva, 2001). Thus, in rater studies, absence of pauses is associated
with higher proficiency (Bosker et al., 2013), and fluency ratings are
found to fall when SP duration and frequency are artificially increased
(Bosker et al., 2014). SP length increases and the frequency of mid-clause
SPs declines with higher levels of proficiency (Tavakoli et al., 2017).
Lower-intermediate speakers are reported to produce more SPs per min-
ute and nearly twice the number of SPs within self-corrections compared
to advanced speakers (Williams & Korko, 2019).
Rater studies disagree on whether FPs signal proficiency. FPs alone
were the poorest indicator of raters’ fluency judgements in Bosker et al.
(2013) and Pinget et al. (2014) but contrariwise best explained the raters’
predictions of communicative adequacy, or proficiency, in Révész et al.
(2016). Riggenbach (1991) reports that advanced speakers produced
more lexicalised FPs such as you see than lower fluency speakers, who
overwhelmingly stuck to non-lexical fillers. Her finding is consistent with
raters’ positive response to speakers’ production of L1 lexical FPs in
Préfontaine & Kormos (2016).
Based on evidence that length of pause is disregarded by raters in flu-
ency studies such as Bosker et al. (2014) and therefore unrelated to pro-
ficiency level, the same premise might reasonably be applied to
prolongations.
Repetitions have been found to decline by a factor of three with
increased fluency level (Révész et al., 2016). Low-fluency speakers are
reported to make more discontinuous repetitions of the kind sometimes
found in false starts than the concatenated, i.e. immediately adjacent,
ones found outside a repair. Concatenated repetitions demand little pro-
cessing from listeners (Olynyk et al., 1987).

Self-corrections have also been found to decrease in number with
increasing proficiency (Simard et al., 2017), and are fewer than false starts
at all fluency levels (van Hest, 1996; Williams & Korko, 2019).
Self-correction may be a necessary disfluency for optimum learner progress.
The learner focus on target language structures implied by self-corrections
fosters L2 development and may be an efficient means to acquire
proficiency at the low-intermediate (CEFR B1) stage. The majority of
self-corrections in speakers judged to be low proficiency have been reported
as lexical in focus (Matea Kovač & Milatović, 2013). In contrast, while
lower-intermediate (B1) learners are the most prolific self-correctors,
post-beginner (A2) learners have been found to produce no
self-corrections and upper-intermediate (B2) level fewer (Tavakoli et al., 2017).
False starts have been reported to show a negative correlation with L2
proficiency and have been described as ‘productive devices’ (Verhoeven,
1989, p. 151). With greater automaticity, higher level learners may moni-
tor less overtly, resulting in the production of fewer false starts. Similarly,
raters have been found to associate fewer false starts with higher profi-
ciency learners (Révész et al., 2016), a finding confirmed by Williams and
Korko (2019). The fact that advanced learners produce four times as
many false starts as self-corrections (van Hest, 1996) reflects the increas-
ing accuracy of their language. Improved automaticity at this level may
also be responsible for better-formed repairs (Kormos, 2006) with an
orientation to shared discourse (Gilabert, 2007), perhaps because
advanced learners are better able to execute timely conceptual revisions.

1.6 Disfluencies in Public Language Test Descriptors

In their test descriptors, English language rating scales reflect
socio-cultural (dis)fluency norms associated with levels of speaker proficiency.
Each chapter maps a disfluency measure against five of the publicly avail-
able descriptors (for Aptis, IELTS, OET, PTE Academic, and TOEFL
iBT) and the influential Common European Framework of Reference for
Languages (CEFR) (Council of Europe, 2001), to which all five are
referenced.

Pauses are cited in at least one band level descriptor for all five tests but
none except OET, which refers to ‘fillers’, specifies whether SP or
FP. Tavakoli et al. (2017) recommend that Aptis rating scales and training
materials should specify pauses as silent, filled, or mid-clause as necessary.
Prolongations are not referred to in any of the descriptors. Like pauses,
repetitions are included in one or more descriptors from all five scales.
(Self-)corrections are referred to in the CEFR descriptors, the IELTS and
OET tests, and as the more generic ‘reformulations’ in the Aptis test.
False starts are named in the CEFR descriptors, and those for the Aptis,
PTE Academic and TOEFL iBT rating scales. Anomalously, Aptis there-
fore refers to both reformulations and false starts.
The disfluencies that are mentioned most often are pauses and repeti-
tions. Presumably, these forms are most salient to listeners and exam-
iner raters.

1.7 Comment
The preceding outline has been largely descriptive, reflecting the early
state of the disfluency field. It seems that writers share a common pur-
pose: to better understand the causes, workings, and consequences of
disfluencies in spoken language and, in the case of L2 investigators, the
special considerations that might apply to speakers of a second language.
Yet secondary interests, including language acquisition, task-based
performance, human-machine interfacing, test standardisation, and listener
responses, to name just a few, mean that differences in working
definitions, methodologies, and conclusions have resulted in a centrifugal
push of knowledge concentration into a number of different specialisms
rather than a centripetal pull towards a common understanding.
The constructs of fluency and disfluency can best be understood as
interactive, socially-produced, and jointly-constructed phenomena (cf.
Jaspers, 2016; Segalowitz, 2016) that have significance for the iterative
reproduction of social relations. In this respect, Segalowitz’s (2016)
framework offers more global insights into fluency and therefore disflu-
ency, and is helpful in foregrounding the crucial role of the social context
and its intimate relation with speaker motivation. The essential
components of the framework are (1) the cognitive processing systems
familiar from the upper components (conceptualiser and formulator) of
speech models like Levelt’s (1989) and relevant to psycholinguistic inves-
tigations, e.g. in inhibitory control; (2) the lower, articulator component
of speech models that makes audible (dis)fluency features such as speed,
pausing, and repetition; and (3) the individual motivation to produce
speech of a more or less (dis)fluent nature. The other two elements in
Segalowitz’s framework represent the communicative context in which
the speaker and the other components operate, namely the social context,
and (past) perceptual and cognitive experiences.
In adapting the framework to L2 disfluencies (Fig. 1.1), the last two
elements above are combined on the basis that the social context creates
the perceptual and cognitive experiences, whose lasting trace is opera-
tionalised in the motivation element.
There seems no advantage to adding an additional affective long-term
memory factor. The interactive communicative context explains how and
why disfluencies develop in L2 learners, and their trajectory from bald
mid-clause silent pauses to covert repairs, which mirror developments in
automaticity and consequent improvements in the availability of work-
ing memory. The communicative context also explains the type of disflu-
ency the speaker learns to produce, its position in the utterance, and the
social interactions in which it is acceptable, e.g. the present vogue for like
as a filler in English. The adapted model is sufficient to account for
(dis)fluency and its gradual orientation towards a social standard regardless of
first or second language. It contextualises disfluencies and provides an
explanation for their relationship with proficiency inasmuch as proficiency
necessarily includes (dis)fluency.

Fig. 1.1 Framework for L2 (dis)fluency and proficiency (based on Segalowitz, 2016).
Components shown: L2 speech production; cognitive-perceptual processing; motivation.

Just as Fasold and Connor-Linton (2014) note language variation in
spoken communication according to situation and speaker role and pur-
pose, so are different (dis)fluencies acquired according to the same situa-
tions, roles and purposes. Thus, in relation to L1 development,

When children learn to use the socially approved variety of spoken lan-
guage in school, it is not from what their teachers explicitly teach in class,
but rather from adjusting their speech to match the speech of other chil-
dren, in halls, on the playground, and outside of school, and thus gain their
approval.
(Fasold & Connor-Linton, 2014, p. 10).

A similar social process might apply to L2 learners and their learning
environment. For example, Hellermann’s (2009) data, all collected in an
ESL classroom but, thanks to permanently sited video recorders and
microphones, containing both task-oriented exchanges and conversation,
is a reminder that the L2 learner’s further project is to develop language
variation in the L2. At first, however, the L2 learner’s overarching con-
cern is to join a community of practice (Wenger, 1998), specifically a
language learning classroom of English language users. Hellermann’s
focus in the data is on the self-initiated self-repairs (SISRs) of the
participant, Inez. In this context and at this stage of her L2 language development,
term 5, little difference can be seen between Inez’s (dis)fluency in task
and conversation. Rather, Hellermann noticed an increase in the number
of SISRs over Inez’s five terms of study, implying more active monitoring
of her language production as a learner. In the first extract, Inez finds the
space to make a self-repair during a task.

Inez: teacher? uh, we have another question.
eh ea- eat is food. Drink is (.) water. (.) a:nd
have is have the (.) asp- have aspirin? I
Te: take aspirin.
Inez: take aspirin.
(Hellermann, 2009, p. 123)

And in conversation with A, a visitor to the classroom, she is equally
successful.

A: I didn’t know that.
Inez: yes Mayan people. She uh they uh speak Maya?
A: uh huh,
I: and Spanish.
(Hellermann, 2009, p. 126)

Although both self-repairs are noticeably disfluent, the first with silent
pauses, filled pauses, prolongations, and repetitions, and the second with
filled pauses, Inez accomplishes her communicative objective. Which
intervention strategies speakers adopt to achieve a temporary break in
fluent speech will thus vary according to their self-ascribed identity, the values
they align with, and the interactant. In fact, the disfluencies speakers
produce in the service of repair bear only an indirect relationship to
metadiscoursal social judgements; it is rather the speech style adopted by
the speaker as a consequence of their social judgement that shapes the
speech process and defines its points of vulnerability. That style, whether
impetuous or measured, familiar or formal, confident or hesitant—hav-
ing regard to topic, social context, size of audience, and other environ-
mental factors—determines the speaker’s pre-planning, their lexical
range, syntactic complexity and so on, and predicts where breakdowns
and disfluencies are likely to occur.

1.8 Overview
Apart from introduction and conclusion, the inner six chapters (Chaps.
2–7) are each devoted to a recognised disfluency. Most of the chapters are
organised in a similar way, with some variation reflecting the sort of
methodology and data characteristic of work in that disfluency area.
Chapters often begin by reporting L1 groundwork in the disfluency, as
first language research in these areas invariably predates L2 studies.
The next chapter (Chap. 2) defines and gives examples of silent pauses
and reports some of the influential studies of Goldman-Eisler, who stud-
ied silent pause duration and frequency and later their distribution in
utterances. The chapter then interprets a group of L2 rating studies, com-
mon to most of the disfluency chapters, to suggest the effect of silent
pauses on listeners. A section follows that reports studies in classroom
interaction that reference silent pauses. The chapter then revisits some of
the rater studies and cites others to examine the relation of silent pause
production to L2 oral proficiency level. The chapter ends with a section
on references to silent pauses in the descriptors of public language tests
and a closing comment.
Chapter 3 on filled pauses follows a similar structure, though it notes
the prominence given by (dis)fluency researchers to filled pauses com-
pared to silent pauses, with many more publications on this disfluency
form. After a short introduction, a formal description of filled pauses
describes their distribution, frequency, duration and pitch. The next sec-
tion is in two parts, the first reporting studies on the effects on listeners
of the so-called uh and um forms, i.e. binary filled pause forms claimed
to have distinct functions; and the second interpreting the rater studies
introduced in the previous chapter with reference to the effects of filled
pauses. Then, three kinds of speech environment for FPs are considered:
cognitive, interactive, and turn-taking. The section on FPs and proficiency
includes a discussion of the effects of speakers’ L1 FPs on the FPs the
same speakers produce in their L2: L2 learners gradually adopt FPs
appropriate to the target language and allow the L1 FPs to attrite. Some
difference is observed between studies that report advanced speakers pro-
ducing more FPs than lower-level learners, and those that report the
opposite. The explanation for the difference probably lies in the pause
distribution, mid- or end-juncture. The published descriptors for the
Occupational English Test (OET) are the only ones of the English lan-
guage tests to mention FPs, here in the form of fillers.
Chapter 4 concerns prolongations, in pathological speech associated
with stuttering but in this case a form of disfluency evidenced as unusual
lengthening of phonemes and very common as a form of hesitation to
gain processing time. Precisely for this reason, prolongations are of inter-
est to automatic speech recognition researchers, whose concern is to
establish the maximum tolerable length of prolongation for the listener.
Prolongations have long been recognised and reproduced in transcrip-
tions by conversation analysts, conventionally by a colon following the
extended syllable. The second section offers a formal description, noting
the tendency of prolongations to fall on function words such as and, the
and so. The largest section reports the effects of prolongations on listen-
ers, including extended forms of function words such as a and the occur-
ring before uh and um, the FPs discussed in the previous chapter. Apart
from studies in human-machine interaction, the majority of prolonga-
tion data is found in corpora, e.g. the corpus in Fox Tree and Clark
(1997), which examines the incidence of thi:y as an extension of the
strong form thee (ordinarily thuh). Three more sources of data exemplified
in this section are natural language processing, interaction, and elicited
data. Prolongations and proficiency level are discussed only in the context
of language test descriptors and then by inference from mention of hesi-
tations. The functional association of prolongations with filled pauses is
noted, along with their appearance in contrasting environments.
Chapter 5 distinguishes immediate repeats from language repeated
after some delay and suggests that immediate or ‘concatenated’ repetition
(Maclay & Osgood, 1959, p. 148) is heard as disfluent. The second sec-
tion offers a formal description of repetitions and includes the repetition
often found in repairs and a description of Plauché and Shriberg’s (1999)
taxonomy of repetitions by intonation. The following section examines
the effect of repetitions on listeners and reports several of the rater studies
with reference to findings on repetitions. After noting L2 study outside
the classroom, and in a set of empirical studies, and the effect of task on
L2 learners’ repetitions, a section on repetition and proficiency level
revisits some of the rater studies for conclusions drawn on the association
of repetitions with fluency. The penultimate section surveys references to
repetitions in the public descriptors of English language speaking tests.
The chapter ends by noting the joint occurrence of repetitions with
pauses, after which the speaker restarts the utterance with a function
word preceding the lexical word causing trouble, and the fact that listen-
ers often fail to notice speakers’ repetitions of this kind, which are most
likely sites of speakers’ covert repair.
Chapter 6 concerns self-corrections, one of the two compound refor-
mulation disfluencies. Self-corrections are distinct from false starts, the
other kind of reformulation, for a number of reasons. In a self-correction,
the speaker attempts to repair the utterance towards a form compliant
with a social norm. For the same reason, because the repair is usually
towards a standard in pronunciation, grammar, or lexis, the trouble
source is marked as non-standard and the whole repair heard as a disflu-
ency. A formal description of self-corrections in the second section refers
to the seminal paper of Schegloff, Jefferson and Sacks (1977) and their
work on L1 repairs. Levelt (1983), also working with L1 (Dutch) data,
defines five categories of repair, one of which is self-correction, and three
of the others forms of false start. It is Levelt that L2 researchers such as
Kormos build on to describe features of L2 learner self-corrections and
other repairs. The next section reports studies on the effects of
self-corrections on listeners. The following three sections discuss studies of
self-corrections whose data has been collected outside the classroom,
studies of self-corrections in classroom interaction, and the effect of task
on self-corrections in elicited data. Van Hest (1996) and Williams and
Korko (2019) report that self-corrections decline with proficiency gains;
and Tavakoli et al. (2017) note that lower-intermediate learners make
abundant self-corrections as they actively experiment with achieving tar-
get language forms. Only two English language speaking tests, IELTS
and the Occupational English Test (OET), refer to self-correction in their
public descriptors, but the references may be to the more generic ‘repair’,
i.e. including false starts.
Chapter 7, the last chapter on disfluency types, describes false starts,
as distinct from L2 self-corrections, the other kind of repair. Of more
interest than self-corrections as a linguistic opportunity for intake and
acquisition, false starts represent the speaker’s revision of a standard
form that would not necessarily be recognised as a trouble source by the
listener. Because they represent the speaker’s choice of linguistic revi-


sion, false starts are of great interest to language acquisition theorists as
places in the utterance where the monitoring process is thought to
involve the speaker’s selection of a repair from one of a number of alternative
revisions. This is the case because, unlike a self-correction, no
particular standard form exists for the repair since the trouble source
already adhered to target language norms. The second section offers a
formal description of both types of repair, self-corrections and false
starts, as comprising three parts: a trouble source, editing phase, and
repair (Levelt, 1983). The effects of false starts on listeners are reported
in the following section, which notes that false starts are not simply
speaker problem-solving devices. They also benefit listeners by building
‘discourse coherence’ (Levelt, 1983, p. 42). At the same time, as ‘regressive
speech markers’ (Olynyk et al., 1987), false starts impose a processing
cost on listeners, forcing them to re-evaluate speech already heard.
This cost is mitigated if the false start occurs at the start of the
utterance rather than in a middle position or at a beginning prefixed by
‘and’ (Fox Tree, 2010). The next section on false start data types
includes data outside the classroom, within classroom interaction, and
task effect in elicited data. A section on false starts and speaker profi-
ciency level notes that while lower-proficiency learners produce greater
numbers of false starts than advanced learners, those of advanced learn-
ers tend to be more discourse-related. Some evidence exists that false
start production reflects individual speaker style rather than proficiency
level (Zuniga & Simard, 2019). The penultimate section on false starts
and public language tests notes that only two tests, the TOEFL iBT
Test and the Pearson Test of English Academic, reference false starts in
their descriptors, the former within Level A2, and the latter for the level
extremes of Native-like fluency and Disfluent.
The book will be of interest to postgraduate students of linguistics,
communication studies, media, and other social sciences, and to special-
ists from other disciplines who want an overview of the range of research
in disfluency with a particular emphasis on second language speakers and
proficiency.

References
Adell, J., Bonafonte, A., & Escudero-Mancebo, D. (2010). Modelling filled
pauses prosody to synthesise disfluent speech. In Speech Prosody 2010-Fifth
International Conference.
Ahmadian, M. J., Abdolrezapour, P., & Ketabi, S. (2012). Task difficulty and
self-repair behavior in second language oral production. International Journal
of Applied Linguistics, 22(3), 310–330.
Arnold, J. E., Fagnano, M., & Tanenhaus, M. K. (2003). Disfluencies signal
theee, um, new information. Journal of Psycholinguistic Research, 32(1), 25–36.
Betz, S., Wagner, P., & Schlangen, D. (2015). Modular synthesis of disfluencies
for conversational speech systems. Studientexte zur Sprachkommunikation:
Elektronische Sprachsignalverarbeitung, 2015, 128–134.
Betz, S., Zarrieß, S., & Wagner, P. (2017). Synthesized lengthening of function
words - The fuzzy boundary between fluency and disfluency. In Proceedings of
the International Conference Fluency and Disfluency.
Bortfeld, H., Leon, S. D., Bloom, J. E., Schober, M. F., & Brennan, S. E. (2001).
Disfluency rates in spontaneous speech: Effects of age, relationship, topic,
role, and gender. Language and Speech, 44, 123–147.
Bosker, H. R., Pinget, A. F., Quené, H., Sanders, T., & De Jong, N. H. (2013).
What makes speech sound fluent? The contributions of pauses, speed and
repairs. Language Testing, 30(2), 159–175.
Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2014). The percep-
tion of fluency in native and nonnative speech. Language Learning,
64(3), 579–614.
Brennan, S. E., & Schober, M. F. (2001). How listeners compensate for disflu-
encies in spontaneous speech. Journal of Memory and Language,
44(2), 274–296.
Cenoz, J. (1998). Pauses and communication strategies in second language speech.
Chiang, S. Y., & Mi, H. F. (2011). Reformulation: A verbal display of interlan-
guage awareness in instructional interactions. Language Awareness,
20(2), 135–149.
Clark, H. H. (1994). Managing problems in speaking. Speech Communication,
15(3–4), 243–250.
Council of Europe. (2001). Common European framework of reference for lan-
guages: Learning, teaching, assessment. Cambridge University Press.
Crible, L., & Pascual, E. (2020). Combinations of discourse markers with
repairs and repetitions in English, French and Spanish. Journal of Pragmatics,
156, 54–67.
Cucchiarini, C., Strik, H., & Boves, L. (2002). Quantitative assessment of sec-
ond language learners’ fluency: Comparisons between read and spontaneous
speech. The Journal of the Acoustical Society of America, 111(6), 2862–2873.
Derwing, T. M., Rossiter, M. J., Munro, M. J., & Thomson, R. I. (2004).
Second language fluency: Judgments on different tasks. Language Learning,
54(4), 655–679.
Eklund, R. (2001). Prolongations: A dark horse in the disfluency stable. In ISCA
Tutorial and Research Workshop (ITRW) on Disfluency in Spontaneous Speech.
Engelhardt, P. E., McMullon, M. E., & Corley, M. (2019). Individual differ-
ences in the production of disfluency: A latent variable analysis of memory
ability and verbal intelligence. Quarterly Journal of Experimental Psychology,
72(5), 1084–1101.
Faerch, C., & Kasper, G. (1983). Procedural knowledge as a component of
foreign language learners’ communicative competence. In Kommunikation im
(Sprach-)Unterricht. Rijksuniversiteit.
Faerch, C., & Kasper, G. (1984). Two ways of defining communication strate-
gies. Language Learning, 34(1), 45–63.
Fasold, R. W., & Connor-Linton, J. (Eds.). (2014). An introduction to language
and linguistics. Cambridge University Press.
Firth, A., & Wagner, J. (1997). On discourse, communication, and (some) fun-
damental concepts in SLA research. The Modern Language Journal,
81(3), 285–300.
Firth, A., & Wagner, J. (2007). On discourse, communication, and (some) fun-
damental concepts in SLA research. The Modern Language Journal,
91, 757–772.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on sec-
ond language performance. Studies in Second Language Acquisition,
18(3), 299–323.
Fox Tree, J. E. (1995). The effects of false starts and repetitions on the
processing of subsequent words in spontaneous speech. Journal of Memory and
Language, 34(6), 709–738.
Fox Tree, J. E. (2010). Discourse markers across speakers and settings. Language
and Linguistics Compass, 4(5), 269–281.
Fox Tree, J. E., & Clark, H. H. (1997). Pronouncing “the” as “thee” to signal
problems in speaking. Cognition, 62, 151–167.
Fraundorf, S. H., & Watson, D. G. (2011). The disfluent discourse: Effects of
FPs on recall. Journal of Memory and Language, 65(2), 161–175.
Freed, B. F. (1995). What makes us think that students who study abroad
become fluent? In B. F. Freed (Ed.), Second language acquisition in a study-abroad
context (pp. 123–148). John Benjamins.
Gass, S. (2004). Conversation analysis and input-interaction. The Modern
Language Journal, 88(4), 597–602.
Gilabert, R. (2007). Effects of manipulating task complexity on self-repairs dur-
ing L2 oral production. International Review of Applied Linguistics in Language
Teaching, 45(3), 215–240.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in spontaneous speech.
Goldman-Eisler, F. (1972). Pauses, clauses, sentences. Language and Speech,
15(2), 103–113.
Hellermann, J. (2009). Looking for evidence of language learning in practices
for repair: A case study of self-initiated self-repair by an adult learner of
English. Scandinavian Journal of Educational Research, 53(2), 113–132.
Hindle, D. (1983, June). Deterministic parsing of syntactic non-fluencies. In
21st Annual Meeting of the Association for Computational Linguistics
(pp. 123–128).
Jaspers, J. (2016). (Dis)fluency. Annual Review of Anthropology, 45, 147–162.
Kormos, J. (1998). A new psycholinguistic taxonomy of self-repairs in L2: A
qualitative analysis with retrospection. Even Yearbook, ELITE SEAS Working
Papers in Linguistics, 3, 43–68.
Kormos, J. (2000). The timing of self-repairs in second language speech produc-
tion. Studies in Second Language Acquisition, 22(2), 145–167.
Kormos, J. (2006). The structure of self-repairs in the speech of Hungarian
learners of English. Acta Linguistica Hungarica, 53(1), 53–76.
Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach.
Language Learning, 40(3), 387–417.
Levelt, W. J. M. (1983). Monitoring and self-repair in speech. Cognition,
14, 41–103.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. MIT Press.
MacGregor, L. J., Corley, M., & Donaldson, D. I. (2009). Not all disfluencies
are equal: The effects of disfluent repetitions on language comprehension.
Brain and Language, 111(1), 36–45.
Maclay, H., & Osgood, C. E. (1959). Hesitation phenomena in spontaneous
English speech. Word, 15(1), 19–44.
Martin, J. G. (1970). On judging pauses in spontaneous speech. Journal of
Verbal Learning and Verbal Behavior, 9, 75–78.
Matea Kovač, M., & Milatović, B. (2013). Analysis of repair distribution, error
correction rates and repair successfulness in L2. Studia Linguistica,
67(2), 225–255.
McAllister, J., Cato-Symonds, S., & Johnson, B. (2001). Listeners’ ERP
responses to false starts and repetitions in spontaneous speech. In ISCA
Tutorial and Research Workshop (ITRW) on Disfluency in Spontaneous Speech.
Edinburgh, 29–31 August.
McRobie, K. F. (1993). Perceived fluency: A study of self-correction, speech rate
and three foreign accents as components of fluency in English as a second
language (Doctoral dissertation, University of San Francisco).
Moniz, H., Mata, A. I., & Viana, M. C. (2007). On filled-pauses and prolonga-
tions in European Portuguese. In Eighth Annual Conference of the International
Speech Communication Association.
Mora, J. C., & Valls-Ferrer, M. (2012). Oral fluency, accuracy, and complexity
in formal instruction and study abroad learning contexts. TESOL Quarterly,
46(4), 610–641.
Olynyk, M., d'Anglejan, A., & Sankoff, D. (1987). A quantitative and qualita-
tive analysis of speech markers in the native and second language speech of
bilinguals. Applied Psycholinguistics, 8(2), 121–136.
Pinget, A. F., Bosker, H. R., Quené, H., & De Jong, N. H. (2014). Native
speakers’ perceptions of fluency and accent in L2 speech. Language Testing,
31(3), 349–365.
Plauché, M., & Shriberg, E. (1999, August). Data-driven subclassification of
disfluent repetitions based on prosodic features. In Proceedings of International
Congress of Phonetic Sciences (Vol. 2, pp. 1513–1516).
Préfontaine, Y., & Kormos, J. (2016). A qualitative analysis of perceptions of
fluency in second language French. International Review of Applied Linguistics
in Language Teaching, 54(2), 151–169.
Préfontaine, Y., Kormos, J., & Johnson, D. E. (2016). How do utterance mea-
sures predict raters’ perceptions of fluency in French as a second language?
Language Testing, 33(1), 53–73.
Révész, A., Ekiert, M., & Torgersen, E. N. (2016). The effects of complexity,
accuracy, and fluency on communicative adequacy in oral task performance.
Applied Linguistics, 37(6), 828–848.
Riazantseva, A. (2001). Second language proficiency and pausing: A study of
Russian speakers of English. Studies in Second Language Acquisition,
23(4), 497–526.
Riggenbach, H. (1991). Toward an understanding of fluency: A microanalysis of
nonnative speaker conversations. Discourse Processes, 14(4), 423–441.
Rossiter, M. J. (2009). Perceptions of L2 fluency by native and non-native
speakers of English. Canadian Modern Language Review, 65(3), 395–412.
Schegloff, E. A. (1987). Recycled turn beginnings: A precise repair mechanism
in conversation’s turn-taking organization. Talk and Social Organisation,
1(1), 70–85.
Schegloff, E., Jefferson, G., & Sacks, H. (1977). The preference for self-correction
in the organization of repair in conversation. Language,
53, 361–382.
Schmid, M. S., & Fägersten, K. B. (2010). Disfluency markers in L1 attrition.
Language Learning, 60(4), 753–791.
Schmidt, R. (1983). Interaction, acculturation, and the acquisition of commu-
nicative competence: A case study of an adult. Sociolinguistics and Language
Acquisition, 137, 174.
Segalowitz, N. (2016). Second language fluency and its underlying cognitive and
social determinants. International Review of Applied Linguistics in Language
Teaching, 54(2), 79–95. Available from: De Gruyter. https://doi.org/10.1515/iral-2016-9991.
Accessed 06 April 2020.
Shriberg, E. (2001). To ‘errrr’ is human: Ecology and acoustics of speech disflu-
encies. Journal of the International Phonetic Association, 31(1), 153–169.
Simard, D., French, L., & Zuniga, M. (2017). Evolution of self-repair behav-
iour in narration among adult learners of French as a second language.
Canadian Journal of Applied Linguistics/Revue canadienne de linguistique
appliquée, 20(2), 71–89.
Skehan, P. (2009). Modelling second language performance: Integrating com-
plexity, accuracy, fluency, and lexis. Applied Linguistics, 30(4), 510–532.
Suzuki, S., Kormos, J., & Uchihara, T. (2021). The relationship between utter-
ance and perceived fluency: A meta-analysis of correlational studies. The
Modern Language Journal.
Swerts, M. (1998). FPs as markers of discourse structure. Journal of Pragmatics,
30(4), 485–496.
Tavakoli, P. (2016). Fluency in monologic and dialogic task performance:
Challenges in defining and measuring L2 fluency. International Review of
Applied Linguistics in Language Teaching, 54(2), 133–150.
Tavakoli, P., Nakatsuhara, F., & Hunter, A. M. (2017). Scoring validity of the
Aptis Speaking Test: Investigating fluency across tasks and levels of proficiency.
ARAGs Research Reports Online. British Council.
Temple, L. (1992). Disfluencies in learner speech. Australian Review of Applied
Linguistics, 15(2), 29–44.
Van Hest, E. (1996). Self-repair in L1 and L2 production. Tilburg University Press.
Van Os, M., De Jong, N. H., & Bosker, H. R. (2020). Fluency in dialogue:
Turn-taking behavior shapes perceived fluency in native and nonnative
speech. Language Learning, 70(4), 1183–1217.
Verhoeven, L. T. (1989). Monitoring in children’s second language speech.
Interlanguage Studies Bulletin (Utrecht), 5(2), 141–155.
Walsh, S., & Li, L. (2013). Conversations as space for learning. International
Journal of Applied Linguistics, 23(2), 247–266.
Watanabe, M., Hirose, K., Den, Y., & Minematsu, N. (2008). FPs as cues to the
complexity of forming phrases for native and non-native listeners. Speech
Communication, 50(2), 81–94.
Wenger, E. (1998). Communities of practice: Learning as a social system. Systems
thinker, 9(5), 2–3.
Williams, S. A., & Korko, M. (2019). Pause behavior within reformulations and
the proficiency level of second language learners of English. Applied
Psycholinguistics, 40(3), 723–742.
Zuniga, M., & Simard, D. (2019). Factors influencing L2 self-repair behavior:
The role of L2 proficiency, attentional control and L1 self-repair behavior.
Journal of Psycholinguistic Research, 48(1), 43–59.
There are six of these strange compositions, upon the stories of
David and Goliath, of David and Saul, of Jacob and Leah, and
others. Some years later they undoubtedly suggested to Sebastian
Bach the delicate little capriccio which he wrote upon the departure
of his brother for the wars. Apart from this they are of slight
importance except as indications of the experimental frame of mind
of their composer. Indeed, beyond imitation and to a small extent
description, neither harpsichord nor pianoforte music has been able
to make much progress in the direction of program music.

Kuhnau’s musical narratives were published in August, 1700. Earlier
than this he had published his famous Sonata aus dem B. The work
so named was appended to Kuhnau’s second series of suites or
Partien. It has little to recommend it to posterity save its name, which
here appears in the history of clavier music for the first time. Nor
does this name designate a form of music akin to the sonatas of the
age of Mozart and Beethoven, a form most particularly associated
with the pianoforte. Kuhnau merely appropriated it from music for
string instruments. There it stood in the main for a work which was
made up of several movements like the suite, but which differed from
the suite in depending less upon rhythm and in having a style more
dignified than that which had grown out of experiments with dance
tunes. In addition, the various movements which constituted a
sonata were not necessarily in the same key. Here alone it
possessed a possible advantage over the suite. Yet though in other
respects it cannot compare favorably to our ears with the suite,
Kuhnau cherished the dignity of style and name with which tradition
had endowed it. These he attempted to bestow upon music for the
clavier.[6]

The various movements lack definite form and balance. The first is in
rather heavy chord style, the chords being supported by a dignified
counterpoint in eighth notes. This leads without pause into a fugue
on a figure of lively sixteenth notes. The key is B-flat major. There
follows a short adagio in E-flat major, modulating to end in C minor,
in which key the last movement, a short allegro in triple time, is taken
up. The whole is rounded off by a return to the opening movement,
signified by the sign Da Capo.

Evidently pleased with this innovation, Kuhnau published in 1696 a
set of seven more sonatas called Frische Clavier Früchte. These
show no advance over the Sonata aus dem B in mastery of musical
structure. Still they are evidence of the efforts of one man among
many to give clavier music a life of its own and to bring it in
seriousness and dignity into line with the best instrumental music of
the day, namely, with the works of such men as Corelli, Purcell, and
Vivaldi. That he was unable to do this the verdict of future years
seems to show. The attempt was none the less genuine and
influential.

In the matter of structure, then, the seventeenth century worked out
and tested but a few principles which were to serve as foundation for
the masterpieces of keyboard music in the years to come. But these,
though few, were of vast importance. Chief among them was the
new principle of harmony. This we now, in the year 1700, find at the
basis of fugue, of prelude and toccata, and of dance form, not
always perfectly grasped but always in evidence. Musical form now
and henceforth is founded upon the relation and contrast of keys.

Consistently to hold to one thematic subject throughout a piece in
polyphonic style, skillfully to contrast or weave with that secondary
subjects, mark another stage of development passed. The fugue is
the result, now articulate, though awaiting its final glory from the
hand of J. S. Bach. To write little dance pieces in neat and precise
form is an art likewise well mastered; and to combine several of
these, written in the same key, in an order which, by affording
contrast of rhythms, can stir the listener’s interest and hold his
attention, is the established rule for the first of the so-called cyclic
forms, prototype of the symphony and sonata of later days. Such
were the great accomplishments of the musicians of the seventeenth
century in the matter of form.

V

In the matter of style, likewise, much was accomplished. We have
had occasion frequently to point out that in the main the harpsichord
remained throughout the first half of the seventeenth century under
the influence of the organ. For this instrument a conjunct or legato
style has proved to be most fitting. Sudden wide stretches,
capricious leaps, and detached runs seldom find a place in the
texture of great organ music. The organist strives for a smoothness
of style compatible with the dignity of the instrument, and this
smoothness may be taken as corollary to the fundamental
relationship between organ music and the vocal polyphony of the
sixteenth century.

On the other hand, by comparison with the vocal style, the organ
style is free. Where the composer of masses was restricted by the
limited ability of the human voice to sing wide intervals accurately,
the organist was limited only by the span of the hand. Where
Palestrina could count only upon the ear of his singers to assure
accurate intonation, the organist wrote for a keyboard which,
supposing the organ to be in tune, was a mechanism that of itself
could not go wrong. Given, as it were, a physical guarantee of
accuracy as a basis for experiment, the organist was free to devise
effects of sheer speed or velocity of which voices would be utterly
incapable. He had a huge gamut of sounds equally at his command,
a power that could be mechanically bridled or let loose. His
instrument could not be fatigued while boys could be hired to pump
the bellows. So long as his finger held down a key, or his foot a
pedal, so long would the answering note resound, diminishing,
increasing, increasing, diminishing, according to his desire, never
exhausted.

Therefore we find in organ music, rapid scales, arpeggios rising from
depths, falling from heights, new figures especially suited to the
organ, such as the ‘rocking’ figure upon which Bach built his well-
known organ fugue in D minor; deep pedal notes, which endure
immutably while above them the artist builds a castle of sounds;
interlinked chords marching up and down the keyboard, strong with
dissonance. There are trills and ornamental turns, rapid thirds and
sixths. And in all these things organ music displays what is its own,
not what it has inherited from choral music.

Yet, notwithstanding the magnificent chord passages so in keeping
with the spirit of the instrument, in which only the beauty of harmonic
sequence is considered, the treatment of musical material by the
organists is prevailingly polyphonic. The sound of a given piece is
the sound of many quasi-independent parts moving along together,
in which definite phrases or motives constantly reappear. The
harmony on which the whole rests is not supplied by an
accompaniment, but by the movement of the several voice-parts
themselves in their appointed courses. And it may be said as a
generality that these parts progress by steps not wider than that
distance the hand can stretch upon the keyboard.

During the first half of the seventeenth century the harpsichord was
but the echo of the organ. Even the collections of early English
virginal music, which in some ways seem to offer a brilliant
exception, are the work of men who as instrumentalists were
primarily organists. In so far as they achieved an instrumental style
at all it was usually a style fitting to a small organ. The few cases
where John Bull’s cleverness displayed itself in almost a true
virtuoso style are exceptions which prove the rule. Not until the time
of Chambonnières and Froberger do we enter upon a second stage.

About the middle of the seventeenth century Chambonnières was
famous over Europe as a performer upon the harpsichord. As first
clavicinist at the court of France, his manner of playing may be taken
to represent the standard of excellence at that time. Constantine
Huygens, a Dutch amateur exceedingly well-known in his day,
mentions him many times in his letters with unqualified admiration,
always as a player of the harpsichord, or as a composer for that
instrument. Whatever skill he may have had as an organist did not
contribute to his fame; and his two sets of pieces for harpsichord,
published after his death in 1670, show the beginnings of a distinct
differentiation between harpsichord and organ style.

Title page of Kuhnau's "Neue Clavier-Übung".

The harpsichord possesses in common with the organ its keyboard
or keyboards, which render the playing of solid chords possible. The
lighter action of the harpsichord gives it the advantage over the
organ in the playing of rapid passages, particularly of those light
ornamental figures used as graces or embellishments, such as trills,
mordents, and turns. A further comparison with the organ, however,
reveals in the harpsichord only negative qualities. It has no volume
of sound, no power to sustain tones, no deep pedal notes.
Consequently the smooth polyphonic style which sounds rich and
flowing on the organ, sounds dry and thin upon the weaker
instrument. The composer who would utilize to advantage what little
sonority there is in the harpsichord must be free to scatter notes here
and there which have no name or place in the logic of polyphony, but
which make his music sound well. Voice parts must be interrupted,
notes taken from nowhere and added to chords. The polyphonic web
becomes disrupted, but the harpsichord profits by the change. It is
Chambonnières who probably first wrote in such a style for the
harpsichord.

He learned little of it from what had been written for the organ, but
much from music for the lute, which, quite as late as the middle of
the century, was interchangeable with the harpsichord in
accompaniments, and was held to be equal if not superior as a solo
instrument. It was vastly more difficult to play, and largely for this
reason fell into disuse. The harpsichord is by nature far nearer akin
to it than to the organ. The free style which lutenists were driven to
invent by the almost insuperable difficulties of their instrument, is
nearly as suitable to the harpsichord as it is to the lute. Without
doubt the little pieces of Denis Gaultier were played upon the
harpsichord by many an amateur who had not been able to master
the lute. The skilled lutenist would find little to give him pause in the
harpsichord music of Chambonnières. The quality of tone of both
instruments is very similar. For neither is the strict polyphony of
organ music appropriate; for the lute it is impossible. Therefore it fell
to the lutenists first to invent the peculiar instrumental style in which
lie the germs of the pianoforte style; and to point to their cousins,
players of the harpsichord, the way towards independence from
organ music.

Froberger came under the influence of Denis Gaultier and
Chambonnières during the years he spent in Paris, and he adopted
their style and made it his own. He wrote, it is true, several sets of
ricercars, capriccios, canzonas, etc., for organ or harpsichord, and in
these the strict polyphonic style prevails, according to the
conventionally more serious nature of the compositions. But his fame
rests upon the twenty-eight suites and fragments of suites which he
wrote expressly for the harpsichord. These are closely akin to lute
music, and from the point of view of style are quite as effective as
the music of Chambonnières. In harmony they are surprisingly rich.
Be it noted, too, in passing, that they are not lacking in emotional
warmth. Here is perhaps the first harpsichord music which demands
beyond the player’s nimble fingers his quick sympathy and
imagination—qualities which charmed in Froberger’s own playing.

Kuhnau as a stylist is far less interesting than Froberger, upon
whose style, however, his clavier suites are founded. His importance
rests in the attempts he made to adapt the sonata to the clavier, in
his experiments with descriptive music, and in the influence he had
upon his contemporaries and successors, notably Bach and
Handel. Froberger is the real founder of pianoforte music in
Germany, and beyond him there is but slight advance either in style
or matter until the time of Sebastian Bach.

What we may now call the harpsichord style, as exemplified in the
suites of Chambonnières and Froberger, is relatively free. Both
composers had a fondness for writing in four parts, but these parts
are not related to each other, nor woven together unbrokenly as in
the polyphonic style of the organ. They cannot often be clearly
followed throughout a given piece. The upper voice carries the music
along, the others accompany. The arrangement is not wholly an
inheritance from the lute, but is in keeping with the general tendency
in all music, even at times in organ music, toward the monodic style,
of which the growing opera daily set the model.

But the harpsichord style of this time is by no means a simple
system of melody and accompaniment. Though the three voice parts
which support the fourth dwell together often in chords, they are not
without considerable independent movement. They constitute the
harmonic background, as it were, which, though serving as
background, does not lack animation and character in itself. In other
words, we have a contrapuntal, not a polyphonic, style.

A marked feature of the music is the profuse number of graces and
embellishments. These rapid little figures may be akin to the vocal
embellishments which even at the beginning of the seventeenth
century were discussed in theoretical books; but they seem to flower
from the very nature of the harpsichord, the light tone and action of
which made them at once desirable and possible. They are but
vaguely indicated in the manuscripts, and there can be no certainty
as to what was the composer’s intention or his manner of
performance. Doubtless they were left to the discretion of the player.
At any rate for a century more the player took upon himself the
liberty of ornamenting any composer’s music to suit his own whim.
These agrémens[7] were held to be and doubtless were of great
importance. Kuhnau, in the preface to his Frische Clavier Früchte,
speaks of them as the sugar to sweeten the fruit, even though he left
them much to the taste of players; and Emanuel Bach in the second
half of the eighteenth century devoted a large part of his famous
book on playing the clavier to an analysis and minute explanation of
the host of them that had by then become stereotyped. They have
not, however, come down into pianoforte music. It is questionable if
they can be reproduced on the pianoforte, the heavy tone of which
obscures the delicacy which was their charm. They must ever
present difficulty to the pianist who attempts to make harpsichord
music sound again on the instrument which has inherited it.

The freedom from polyphonic restraint, inherited from the lute, and
the profusion of graces which have sprouted from the nature of the
harpsichord, mark the diversion between music for the harpsichord
and music for the organ. In other respects they are still much the
same; that is to say, the texture of harpsichord music is still close—
restricted by the span of the hand. This is not necessarily a sign of
dependence on the organ, but points rather to the young condition of
the art. It is not to be expected that the full possibilities of an
instrument will be revealed to the first composers who write for it
expressly. They lie hidden along the way which time has to travel.
But Chambonnières, in France, and Froberger, in Germany, opened
up the special road for harpsichord music, took the first step which
others had but to follow.

Neither in France nor in Germany did the next generation penetrate
beyond. Le Gallois, a contemporary of Chambonnières, has
remarked that of the great player’s pupils only one, Hardelle, was
able to approach his master’s skill. Among those who carried on his
style, however, must be mentioned d’Anglebert,[8] Le Begue,[9] and
Louis and François Couperin, relatives of the great Couperin to
come.

In Germany Georg and Gottlieb Muffat stand nearly alone with
Kuhnau in the progress of harpsichord music between Froberger and
Sebastian Bach. Georg Muffat spent six years in Paris and came
under French influence as Froberger had come, but his chief
keyboard works (Apparatus Musico Organisticus, 1690) are twelve
toccatas more suited to organ than to harpsichord. In 1727 his son
Gottlieb had printed in Vienna Componimenti musicali per il
cembalo, which show distinctly the French influence. Kuhnau looms
up large chiefly on account of his sonatas, which are in form and
extent the biggest works yet attempted for clavier. By these he
pointed toward a great expansion of the art; but as a matter of fact
little came of it. In France, Italy, and Germany the small forms were
destined to remain the most popular in harpsichord music; and the
sonatas and concertos of Bach are immediately influenced by study
of the Italian masters, Corelli and Vivaldi.

In Italy, the birthplace of organ music and so of a part of harpsichord
music, interest in keyboard music of any kind declined after the
death of Frescobaldi in 1644, and was replaced by interest in opera
and in music for the violin. Only one name stands out in the second
half of the century, Bernardo Pasquini, of whose work, unhappily,
little remains. He was famous over the world as an organist, and the
epitaph on his tombstone gives him the proud title of organist to the
Senate and People of Rome. Also he was a skillful performer on the
harpsichord; but he is more nearly allied to the old polyphonic school
than to the new. A number of works for one and for two harpsichords
are preserved in manuscript in the British Museum, and these are
named sonatas. Some are actually suites, but those for two
harpsichords have little trace of dance music or form and may be
considered as much sonatas as those works which Kuhnau
published under the same title. All of Kuhnau’s sonatas appeared
before 1700 and the date on the manuscript in the British Museum is
1704. Pasquini was then an old man, and it is very probable that
these sonatas were written some years earlier; in which case he and
not Kuhnau may claim the distinction of first having written music for
the harpsichord on the larger plan of the violin concerto and the
sonatas of Corelli.[10]

Two books of toccatas by Alessandro Scarlatti give that facile
composer the right to be numbered among the great pioneers in the
history of harpsichord music. These toccatas are in distinct
movements, usually in the same key, but sharply contrasted in
content. The seventh is a theme and variations, in which Scarlatti
shows an appreciation of tonal effects and an inventiveness which
are astonishingly in advance of the time. He foreshadows
unmistakably the brilliant style of his son Domenico; indeed, he
accounts in part for what has seemed the marvellous instinct of
Domenico. If, as is most natural, Domenico approached the
mysteries of the harpsichord through his father, he began his career
with advantages denied to all others contemporary with him, save
those who, like Grieco, received that father’s training. Alessandro
Scarlatti was one of the most greatly endowed of all musicians. The
trend of the Italian opera during the eighteenth century toward utter
senselessness has been often laid partly to his influence; but in the
history of harpsichord music that influence makes a brilliant showing
in the work of his son, who contributed perhaps more than any other
one man to the technique of writing not only for harpsichord but for
pianoforte.

Little of the harpsichord and clavichord music of the seventeenth
century is heard today. It has in the main only an historical interest.
The student who looks into it will be amazed at some of its beauties;
but as a whole it lacks the variety and emotional strength which
claim a general attention. Nevertheless it is owing to the labor and
talent of the composers of these years that the splendid
masterpieces of a succeeding era were possible. They helped
establish the harmonic foundation of music; they molded the fugue,
the prelude, the toccata, and the suite; they developed a general
keyboard style. After the middle of the century such men as
Froberger and Kuhnau in Germany, Chambonnières, d’Anglebert,
and Louis and François Couperin in France, and Alessandro
Scarlatti in Italy, finally gave to harpsichord music a special style of
its own, and to the instrument an independent and brilliant place
among the solo instruments of that day. Out of all the confusion and
uncertainty attendant upon the breaking up of the old art of vocal
polyphony, the enthusiasm of the new opera, the creation of a new
harmonic system, the rise of an instrumental music independent of
words, these men slowly and steadily secured for the harpsichord a
kingdom peculiarly its own.

FOOTNOTES:
[1] It should be noted in passing that during the early stages of the growth of
polyphonic music, roughly from the eleventh to the fifteenth century, composers
had brought over into their vocal music a great deal of instrumental technique or
style, which had been developed on the crude organs, and on the accompanying
instruments of the troubadours. In the period which we are about to treat the
reverse is very plainly the case.

[2] At the head of Sebastian Bach’s Musikalisches Opfer stands the Latin
superscription: Regis Iussu Cantio et Reliqua Canonica Arte Resoluta. The initial
letters form the word ricercar.

[3] Cf. Vol. VI, Chap. XV.

[4] Suites were known in England as ‘lessons,’ in France as ordres, in Germany as
Partien, and in Italy as sonate da camera.

[5] There was a form of suite akin to the variation form. In this the same melody or
theme served for the various dance movements, being treated in the style of the
allemande, courante, or other dances chosen. Cf. Peurl’s Pavan, Intrada, Dantz,
and Gaillarde (1611); and Schein’s Pavan, Gailliarde, Courante, Allemande, and
Tripla (1617). This variation suite is rare in harpsichord music. Froberger’s suite on
the old air, Die Mayerin, is a conspicuous exception.

[6] ‘Denn warum sollte man auf dem Clavier nicht eben wie auf anderen
Instrumenten dergleichen Sachen tractieren können?’ he writes in his preface to
the ‘Seven New Partien,’ 1692.

[7] So they were called in France, which until the time of Beethoven set the model
for harpsichord style. In Germany they were called Manieren.

[8] D’Anglebert published in 1689 a set of pieces, for the harpsichord, containing
twenty variations on a melody known as Folies d’Espagne, later immortalized by
Corelli.

[9] Le Begue (1630-1702) published Pièces de clavecin in 1677.

[10] See J. S. Shedlock: ‘The Pianoforte Sonata,’ London, 1895.


CHAPTER II
THE GOLDEN AGE OF HARPSICHORD MUSIC

The period and the masters of the ‘Golden Age’—Domenico
Scarlatti; his virtuosity; Scarlatti’s ‘sonatas’; Scarlatti’s technical
effects; his style and form; æsthetic value of his music; his
contemporaries—François Couperin, le Grand; Couperin’s
clavecin compositions; the ‘musical portraits’; ‘program music’—
The quality and style of his music; his contemporaries, Daquin
and Rameau—John Sebastian Bach; Bach as virtuoso; as
teacher; his technical reform; his style—Bach’s fugues and their
structure—The suites of Bach: the French suites, the English
suites, the Partitas—The preludes, toccatas and fantasies;
concertos; the ‘Goldberg Variations’—Bach’s importance; his
contemporary Handel.

In round figures the years between 1700 and 1750 are the Golden
Age of harpsichord music. In that half century not only did the
technique, both of writing for and performing on the harpsichord,
expand to its uttermost possibilities, but there was written for it music
of such beauty and such emotional warmth as to challenge the best
efforts of the modern pianist and to call forth the finest and deepest
qualities of the modern pianoforte.

It was an age primarily of opera, of the Italian opera with its
senseless, threadbare plots, its artificial singers idolized in every
court, its incredible, extravagant splendor. The number of operas
written is astonishing, the wild enthusiasm of their reception hardly
paralleled elsewhere in the history of music. Yet of these many works
but an air or two has lived in the public ear down to the present day;
whereas the harpsichord music still is heard, though the instrument
for which it was written has long since vanished from our general
musical life.

Practically the whole seventeenth century has been required to lay
down a firm foundation for the development of instrumental music in
all its branches. This being well done, the music of the next epoch is
not unaccountably surprising. As soon as principles of form had
become established, composers trod, so to speak, upon solid
ground; and, sure of their foothold, were free to make rapid progress
in all directions. In harpsichord music few new forms appeared. The
toccata, prelude, fugue, and suite offered room enough for all the
expansion which even great genius might need. Within these limits
the growth was twofold: in the way of virtuosity and refinement of
style, and in the way of emotional expression. That music which
expands at once in both directions, or in which, rather, the two
growths are one and the same, is truly great music. Such we shall
now find written for the harpsichord.

Each of the three men whose work is the chief subject of this chapter
is conspicuous in the history of music by a particular feature.
Domenico Scarlatti is first and foremost a great virtuoso, Couperin
an artist unequalled in a very special refinement of style, Sebastian
Bach the instrument of profound emotion. In these features they
stand sharply differentiated one from the other. These are the
essential marks of their genius. None, of course, can be
comprehended in such a simple characterization. Many of Scarlatti’s
short pieces have the warmth of genuine emotion, and Couperin’s
little works are almost invariably the repository of tender and naïve
sentiment. Bach is perhaps the supreme master in music and should
not be characterized at all except to remind that his vast skill is but
the tool of his deeply-feeling poetic soul.

I

It will be noticed that each of these great men speaks of a different
race. We may consider Scarlatti first as spokesman in harpsichord
music of the Italians, who at that time had made their mark so deep
upon music that even now it has not been effaced, nor is likely to be.
His father, Alessandro, was the most famous and the most gifted
musician in Europe. From Naples he set the standard for the opera
of the world, and in Naples his son Domenico was born on October
26, 1685, a few months only after the birth of Sebastian Bach in
Eisenach. Domenico lived with his father and under his father’s
guidance until 1705, when he set forth to try his fame. He lived a few
years in Venice and there met Handel in 1708, with whom he came
back to Rome. Here in Rome, at the residence of Corelli’s patron,
Cardinal Ottoboni, took place the famous contest on organ and
harpsichord between him and Handel. For Handel he ever professed
a warm friendship and the most profound admiration.

He remained for some years in Rome, at first in the service of Marie
Casimire, queen of Poland, later as maestro di cappella at St. Peter’s.
In 1719 came a journey to London in order to superintend
performances of his operas. From 1721 to 1725 he seems to have
been installed at the court of Lisbon; and then, after four years in
Naples, he accepted a position at the Spanish court in Madrid. Just
how long he stayed there is not known. In 1754 he was back again in
Naples, and in Naples he died in 1757, seven years after the death
of Bach.

Scarlatti wrote many operas in the style of his father, and these were
frequently performed, with success, in Italy, England, Spain, and
elsewhere. During his years at St. Peter’s he also wrote sacred
music; but his fame now rests wholly upon his compositions for the
harpsichord and upon the memory of the extraordinary skill with
which he played them.

We have dwelt thus briefly upon a few events of his life to show how
widely he had travelled and in how many places his skill as a player
must have been admired. That in the matter of virtuosity he was
unexcelled can hardly be doubted. It is true that in the famous
contest with Handel he came off the loser on the organ, and even his
harpsichord playing was doubted to excel that of his Saxon friend.
But these contests were a test of wits more than of fingers, a trial of
extempore skill in improvising fugues and double fugues, not of
virtuosity in playing.

Two famous German musicians, J. J. Quantz and J. A. Hasse, both
heard him and both marvelled at his skill. Monsieur L’Augier, a gifted
amateur whom Dr. Burney visited in Vienna, told a story of Scarlatti
and Thomas Roseingrave,[11] in which he related that when
Roseingrave first heard Scarlatti play, he was so astonished that he
would have cut off his own fingers then and there, had there been an
instrument at hand wherewith to perform the operation; and, as it
was, he went months without touching the harpsichord again.

Whom he had to thank for instruction is not known. There is nothing
in his music to suggest that he was ever a pupil of Bernardo
Pasquini, who, however, was long held to have been his master. J.
S. Shedlock, in his ‘History of the Pianoforte Sonata,’ suggests that
he learned from Gaëtano Greco or Grieco, a man a few years his
senior and a student under his father; but it would seem far more
likely that Domenico profited immediately from his father, who, we
may see from a letter to Ferdinand de’ Medici, dated May 30, 1705,
had watched over his son’s development with great care. It must not
be forgotten that Alessandro Scarlatti’s harpsichord toccatas,
described in the previous chapter, are, in spite of a general
heaviness, often enlivened by astonishing devices of virtuosity.

Scarlatti wrote between three and four hundred pieces for the
harpsichord. The Abbé Santini[12] possessed three hundred and
forty-nine. Scarlatti himself published in his lifetime only one set of
thirty pieces. These he called exercises (esercizii) for the
harpsichord. The title is significant. Before 1733 two volumes, Pièces
pour le clavecin, were published in Paris; and some time between
1730 and 1737 forty-two ‘Suites of Lessons’ were published in
London under the supervision of Roseingrave. More were printed in
London in 1752. Then came Czerny’s edition, which includes two
hundred pieces; and throughout the nineteenth century various
selections and arrangements have appeared from time to time, von
Bülow having arranged several pieces in the order of suites, Tausig
having elaborated several in accordance with the modern pianoforte.
A complete and authoritative edition has at last been prepared by
Sig. Alessandro Longo and has been printed in Italy by Ricordi and
Company.

By far the greater part of these many pieces are independent of each
other. Except in a few cases where Scarlatti, probably in his youth,
followed the model of his father’s toccatas, he keeps quite clear of
the suite cycle. The pieces have been called sonatas, but they are
not for the most part in the form called the sonata form. This form
(which is the form in which one piece or movement may be cast and
is not to be confused with the sequence or arrangement of
movements in the classical sonata) is, as we shall later have ample
opportunity to observe, a tri-partite or ternary form; whereas the so-
called sonatas of Scarlatti are in the two-part or binary form, which
is, as we have seen, the form of the separate dance movements in
the suite. Each ‘sonata’ is, like the dance movements, divided into
two sections, usually of about equal length, both of which are to be
repeated in their turn. In general, too, the harmonic plan is the same
or nearly the same as that which underlies the suite movement, the
first section modulating from tonic to dominant, the second back from
dominant to tonic. But within these limits Scarlatti allows himself
great freedom of modulation. It is, in fact, this harmonic expansion
within the binary form which makes one pause to give Scarlatti an
important place in the development of the sonata form proper.

The harmonic variety of the Scarlatti sonatas is closely related to the
virtuosity of their composer. He spins a piece out of, usually, but not
always, two or three striking figures, by repeating them over and
over again in different places of the scale or in different keys. His
very evident fondness for technical formulæ is thus gratified and the
piece is saved from monotony by its shifting harmonies.

A favorite and simple shift is from major to minor. This he employs
very frequently. For example, in a sonata in G major, No. 2 of the
Breitkopf and Härtel collection of twenty sonatas,[13] measures 13,
14, 15, and 16, in D major, are repeated immediately in A major. In
20, 21, 22, and 23, the same style of figure and rhythm appears in D
major and is at once answered in D minor. Toward the end of the
second part of the piece the process is duplicated in the tonic key. In
the following sonata at the top of page seven occurs another similar
instance. It is one of the most frequent of his mannerisms.

The repetition of favorite figures is by no means always
accompanied by a change of key. The two-measure phrase
beginning in the fifteenth measure of the third sonata is repeated
three times note for note; a few measures later another figure is
treated in the same fashion; and in yet a third place, all in the first
section of this sonata, the trick is turned again. Indeed, there are
very few of Scarlatti’s sonatas in which he does not play with his
figures in this manner.

We have said that often he varies his key when thus repeating
himself, and that such variety saves from monotony. But it must be
added that even where there is no change of key he escapes being
tedious to the listener. The reason must be sought in the sprightly
nature of the figures he chooses, and in the extremely rapid speed at
which they are intended to fly before our ears. He is oftenest a
dazzling virtuoso whose music appeals to our bump of wonder, and,
when well played, leaves us breathless and excited.

The pieces are for the most part extremely difficult; and this, together
with his ever-present reiteration of special harpsichord figures, may
well incline us to look upon them as fledgling études. The thirty
which Scarlatti himself chose to publish he called esercizii, or
exercises. We may not take the title too literally, bearing in mind that
Bach’s ‘Well-Tempered Clavichord’ was intended for practice, as
were many of Kuhnau’s suites. But that Scarlatti’s sonatas are
almost invariably built up upon a few striking, difficult and oft-
repeated figures, makes their possible use as technical practice
