Implementing Linguistic Plans

Under this heading, we consider the last two stages of production:
articulating and self-monitoring.

Articulating


Once we have organized our thoughts into a linguistic plan, this
information must be sent from the brain to the muscles in the speech
system so that they can execute the required movements and produce
the desired sounds.

Fluent articulation of speech requires the coordinated use of a large
number of muscles. These muscles are distributed over three systems: the
respiratory, the laryngeal, and the supralaryngeal (vocal tract) systems.

Motor Control of Speech

Motor control of speech begins with motor commands from the brain.
As we assemble a linguistic plan for our utterance, the brain structures
responsible for speech production send messages to the muscles in the
respiratory, laryngeal and supralaryngeal systems.

It is generally believed that these motor commands to speech muscles
take the form of commands for the articulators to move to a particular
location. For example, if the next phonetic segment is [b], the muscles
controlling the lips must be brought into action.

Planning and Production Cycles

Here, we consider the relationship between these articulatory processes
and the planning processes. We plan our utterances in cycles. We express
a portion of our intended message, pause to plan the next portion,
articulate that portion, pause again, and so on.

One underlying reason that we tend to hesitate during speech
production is that linguistic planning is cognitively demanding, and
it is difficult to plan an entire utterance at once. As a consequence, we
typically plan only a portion of an utterance at a time. Lounsbury
hypothesized that we pause at points of high uncertainty. This
hypothesis has been generally supported by studies concluding that
variables that influence lexical retrieval also influence hesitation
pauses. For instance, Levelt found that pauses occurred more often
before low-frequency words than before high-frequency words.

Another variable that influences lexical retrieval is the sheer number of
words from which we choose. Schachter, Christenfeld, Ravina and Bilous
found that during lectures, humanists used more filled pauses, such as
"uh", than social scientists or natural scientists. The proposed
explanation is that the humanities have a far richer vocabulary than the
sciences, so humanists have more options during speech production, which
leads to more filled pauses.

A different kind of variable that influences lexical retrieval during
speech production is the use of gestures. Krauss, Rauscher and colleagues
demonstrated that gestures that accompany speech may help speakers
formulate coherent speech by facilitating the retrieval of elusive words
from the internal lexicon. Krauss conjectures that words are represented
in permanent memory in a number of different formats and that gestures
are linked to spatial properties of words and thereby help retrieval. In
addition to word frequency and size of vocabulary, such variables as
morphological complexity, lexical ambiguity, age of acquisition, and
recency of usage also influence retrieval.

Planning and production cycles sometimes overlap. Griffin explored
the circumstances under which we articulate the beginning of a sentence
while planning later parts. Griffin's study suggests that speakers begin
sentences without knowing how they will finish them. The implication
is that speakers do not always hesitate during the production of a
sentence. Sometimes, we are able to be fluent even when we have not
fully prepared the sentence in advance.

A later study extended this line of thought. Speakers were presented
with drawings of two objects and were asked to name the objects without
pausing between the two names. Griffin found that speakers took
longer to begin to speak when the first noun was one syllable, such as
"wig", rather than multisyllabic, such as "windmill". A longer first
word takes more time to articulate, which gives the speaker time to
prepare the second name while saying the first. Thus, speakers can
maintain fluent speech by preparing later portions of their sentences on
the fly.

Self-Monitoring


From time to time, we spontaneously interrupt our speech and correct
ourselves. These corrections are referred to as self-repairs. According to
Levelt, self-repairs have a characteristic structure that consists of three
parts. First, we interrupt ourselves after we have detected an error in our
speech. Second, we usually utter one of various editing expressions, such
as "uh" and "I mean". Finally, we repair the utterance.

Nooteboom suggests that the timing of self-interruption after detection
of an error is based on two competing forces. On the one hand, we have an
urge to correct the error immediately. On the other hand, we want to
complete the word we are speaking. As a consequence, interruptions are
predominantly made at the first word boundary after the error.

Levelt used a somewhat different procedure. Students were shown
colour patterns and were then asked to describe the patterns so that
another person, hearing a tape-recorded version of the description,
would be able to draw them. The main advantage of this approach is the
greater degree of experimental control. Levelt found that 18% of the
corrections were within a word. Another 51% occurred immediately after
the error. The remaining 31% of corrections were delayed by one or more
words.

Editing Expressions

The editing expression conveys to the listener the kind of trouble that
the speaker is correcting. The different editing expressions are not fully
interchangeable, and the expression that is used conveys the type of
editing that the speaker is doing. James analyzed utterances containing
expressions such as "uh" and "oh", suggesting that these convey different
meanings. For instance, in the first of the following sentences, the "uh"
suggests that the speaker paused to try to remember the exact number of
people. In contrast, the second sentence would be used when the speaker
did not know the precise number but was trying to choose a number that
was approximately correct.

I saw ... uh ... 12 people at the party.

I saw ... oh ... 12 people at the party.

In the following sentence, "rather" is used as an example of nuance
editing, in which a word is substituted that is similar in meaning to the
original, but slightly closer to the speaker's meaning.

I am trying to lease, or rather, sublease, my apartment.


Levelt distinguishes among three types of repairs. Instant repairs
consist of a speaker's retracing back to a single troublesome word, which
is then replaced with the correct word, as in the following sentence:

Again left to the same blank crossing point - white crossing point.

In anticipatory retracings, the speaker retraces back to some point
prior to the error, as in the following sentence:

And left to the purple crossing point - to the red crossing point.

In fresh starts, the speaker drops the original syntactic structure and
just starts over, as in the following example:

From yellow down to brown - no - that's red.

Levelt argues that repairs are systematically different when there is an
out-and-out error as opposed to an utterance that is merely inappropriate.
Repairs based on social or contextual inappropriateness are those in
which the speaker says what was intended but perhaps not in the way
intended. Levelt found that error repairs consisted primarily of instant
repairs and anticipatory retracings, with very few fresh starts. Error
repairs are conservative in that the speaker leaves most of the original
utterance unaffected but alters the erroneous element. In such a case, the
erroneous and revised utterances have a parallel structure with only one
difference. In contrast, fresh starts are more likely when the original
utterance is contextually inappropriate. When what we have said is
technically correct but awkward, we tend to rephrase.

In general, speakers repair their utterances in a way that maximizes
listeners' comprehension. The listener's problem when a speaker errs is
not only to understand the correction, but also to fit the correction
into the ongoing discourse. Several aspects of speaker self-repairs are
helpful in this regard: speakers interrupt themselves quickly, their
editing expressions indicate the type of error, and the repair itself
is ...

Both the use of editing expressions and the linguistic structure of the
repair itself appear to facilitate listener comprehension. Brennan and
Schober suggest that long editing intervals enable the listener to
confirm that the speaker is having difficulty and then to cancel the
erroneous material.