The Speaker: Producing Speech

Part 2
Ahmad Nadhif
Producing Speech After It Is Planned
• The abstract phonetic representation of the
speaker’s sentence is sent to the central motor
areas of the brain, where it is converted into
instructions to the vocal tract to produce the
required sounds.
• Speaking is an incredibly complex motor activity,
involving over 100 muscles moving in precise
synchrony to produce speech at a rate of 10 to 15
phonetic units per second (Liberman et al. 1967).
• During silence, the amount of time needed for
inhaling is about the same as for exhaling.
Respiration during speech is very different: the
time for inhaling is drastically reduced,
sometimes to less than half a second, and
much more time is spent exhaling, sometimes
up to several seconds.
• During speech, air from the lungs must be
released with exactly the correct pressure. The
respiratory system works with the muscles of
the larynx to control the rate of vibration of
the vocal folds, providing the necessary
variations in pitch, loudness, and duration for
the segmental (consonants and vowels) and
suprasegmental (prosody) content of the
utterance.
• Muscles of the lips, the tongue, and other
articulators must be carefully coordinated, which
requires very precise planning. For example,
to make the vowel sound [u], different sets of
nerves lower the larynx and round the lips.
• Impulses travel at different rates down those
two sets of nerves, so timing must be carefully
orchestrated: one impulse must be sent a
fraction of a millisecond sooner than the other.
• In this section, we examine how vowels and consonants
are produced, with a focus on how the articulation of
speech converts a sequence of discrete mental units (a
phonological representation) into a continuous acoustic
signal.
• The signal, as the end product for the speaker and the
starting point for the hearer, must contain sufficient
information for successful decoding. Our objective, then,
is to identify some of the characteristics of the signal
which carry information that will be used by the hearer.
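The speaking rate quoted earlier in this section (10 to 15 phonetic units per second) can be turned into an average time budget per unit. The sketch below only does that arithmetic; the function name unit_duration_ms is a hypothetical helper, not something from the source.

```python
# Rates quoted in the text (Liberman et al. 1967): 10 to 15 phonetic units per second.
SLOW_RATE = 10
FAST_RATE = 15


def unit_duration_ms(units_per_second):
    """Average time available per phonetic unit, in milliseconds."""
    return 1000.0 / units_per_second


for rate in (SLOW_RATE, FAST_RATE):
    print(f"{rate} units/s -> {unit_duration_ms(rate):.1f} ms per unit")
```

At these rates each unit gets, on average, somewhere between roughly 67 and 100 milliseconds, which gives a sense of how tightly the muscles involved must be coordinated.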
The source-filter model of vowel production

• Speech consists of sounds generated at the
vocal folds being filtered as they travel
through the vocal tract. The source–filter
model of vowel production breaks down the
process of producing vowels into two
component parts: a source and a filter.
• We will illustrate how the source–filter model
works by considering the vowels [i], [a], and
[u]. To articulate these vowels, you open your
mouth and force air from your lungs through
your larynx, where the vocal folds reside. This
causes the vocal folds to vibrate – that is, to
open and close in rapid sequence.
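• The frequency of this vibration is called the
fundamental frequency (or F0). The vibration also
produces harmonics, which are components at
whole-number multiples of F0. The sound generated
at the vocal folds is, in essence, the source in the
source–filter model of speech production.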
• Key to understanding how the vocal tract acts as
a filter is the concept of resonance. The vocal
tract changes shape when different sounds are
articulated. For example, when the vowels [i] and
[u] are articulated, the tongue body is relatively
high, compared to when [a] is articulated; the
tongue body is farthest in the front of the mouth
when articulating [i], slightly farther back for [a],
and farthest in the back for [u].
• For these three vowels, then, the oral and
pharyngeal cavities are shaped slightly differently,
relative to each other. Consequently, for a sound
generated at the vocal folds traveling through these
differently shaped cavities, some harmonics will be
reinforced, and other harmonics will be cancelled.
• In other words, energy at some frequencies will
increase, and energy at other frequencies will be
eliminated. This is resonance.
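The idea that one source is shaped differently by differently shaped cavities can be sketched numerically. The code below is a toy illustration, not a speech synthesizer: the F0 of 120 Hz, the 1/k fall-off of the source harmonics, the bandwidth, the shape of the resonance curve, and the resonance frequencies listed for [i], [a], and [u] are all assumptions chosen for illustration (rough textbook-style values for an adult male voice), and none of them come from the source.

```python
# All numbers below are illustrative assumptions, not values from the text.
F0 = 120.0  # assumed fundamental frequency, in Hz
FORMANTS = {  # assumed first two resonance frequencies of the vocal tract, in Hz
    "i": (270.0, 2290.0),
    "a": (730.0, 1090.0),
    "u": (300.0, 870.0),
}
BANDWIDTH = 100.0  # assumed width of each resonance, in Hz


def source_amplitude(k):
    """Amplitude of the k-th harmonic at the source; assumed to fall off as 1/k."""
    return 1.0 / k


def resonance_gain(freq, centre):
    """Gain of one resonance: 1.0 at its centre frequency, smaller farther away."""
    return 1.0 / (1.0 + ((freq - centre) / (BANDWIDTH / 2.0)) ** 2)


def filter_gain(freq, vowel):
    """Vocal-tract gain at freq: the strongest of the vowel's resonances."""
    return max(resonance_gain(freq, centre) for centre in FORMANTS[vowel])


def output_spectrum(vowel, n_harmonics=25):
    """(frequency, amplitude) pairs after the source passes through the filter."""
    return [
        (k * F0, source_amplitude(k) * filter_gain(k * F0, vowel))
        for k in range(1, n_harmonics + 1)
    ]


def strongest_harmonic(vowel):
    """Frequency of the harmonic with the most energy in the output."""
    freq, _ = max(output_spectrum(vowel), key=lambda pair: pair[1])
    return freq


for vowel in ("i", "a", "u"):
    print(vowel, strongest_harmonic(vowel))
```

The source harmonics are identical for all three vowels; only the filter differs. In this toy model the harmonics closest to a vowel's resonances are reinforced and the rest are strongly damped, which is the pattern the text describes as resonance.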
Acoustic characteristics of consonants

• A complete description of the acoustic
characteristics of speech sounds is beyond
the scope of this book, but there are some
general properties of certain classes of
consonants that are worth pointing out.
• A feature distinguishing between many
consonants is voicing. For voiced sounds, like
[z], the vocal folds are engaged during the
articulation of the consonant. For voiceless
sounds, like [s], voicing will not begin until the
vowel that follows is articulated.
• For further details, read the textbook. You
might also review the materials from the
Phonology course you have taken.
Coarticulation
• Probably the most important psycholinguistic
aspect of speech production is the
phenomenon of coarticulation. Coarticulation
simply means that the articulators are always
performing motions for more than one speech
sound at a time. The articulators do not
perform all the work for one speech sound,
then another, then another.
• The genius of speech production is that phonological
segments overlap, so the articulators work at
maximum efficiency, in order to be able to produce
10 to 15 phonetic segments per second – more in
rapid speech.
• This transmission speed would be close to
impossible to achieve if each phonological unit were
produced individually. As it is, speech is produced
more slowly than necessary for the speech
perception system.
• People can actually understand speech that
has been sped up (compressed) to several
times the normal rate (Foulke and Sticht
1969). But coarticulation is not just a matter of
convenience for the speaker: if speech were
not coarticulated – that is, if phonological
units did not overlap – speech would actually
be too slow and disconnected for the hearer
to process it efficiently.
• A simple example of coarticulation is the
articulation of [k] in key and coo. When
uttering key, while the back of the tongue is
making closure with the top of the mouth for
the [k], the lips – not ordinarily involved in
articulating [k] – begin to spread in
anticipation of the following vowel [i].
• Similarly, when uttering coo, the lips round
during the articulation of [k], in anticipation of
the upcoming [u]. One aspect of
coarticulation, then, is that the actual
articulation of a phonological segment can be
influenced by upcoming sounds. This is
sometimes referred to as regressive
assimilation.
The linguist Charles Hockett offered an apt metaphor
for coarticulation (1955: 210):
• Coarticulation can also be influenced by a
phonological segment that has just been
produced, a phenomenon sometimes called
progressive assimilation. The [t] in seat is
pronounced slightly more forward in the
mouth than the [t] in suit. This is because the
tongue position for the [t] is influenced by the
preceding vowel ([i] is a front vowel and [u] is
a back vowel).
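The two directions of influence described in this section can be sketched as a toy rule system. This is a deliberate simplification under stated assumptions: the function coarticulate and the feature labels are hypothetical, only the vowels [i] and [u] are modeled, and the rules are generalized from the key/coo and seat/suit examples to every consonant next to a vowel. Real coarticulation is gradient and far richer than this.

```python
# Vowel properties taken from the examples in the text:
# [i] is a front vowel with spread lips, [u] is a back vowel with rounded lips.
VOWELS = {
    "i": {"lips": "spread", "tongue": "front"},
    "u": {"lips": "rounded", "tongue": "back"},
}


def coarticulate(segments):
    """Annotate each consonant with the vowel features it picks up from its neighbours."""
    result = []
    for index, segment in enumerate(segments):
        influences = {}
        if segment not in VOWELS:
            previous = segments[index - 1] if index > 0 else None
            following = segments[index + 1] if index + 1 < len(segments) else None
            # Regressive (anticipatory): lips prepare for the upcoming vowel.
            if following in VOWELS:
                influences["lips"] = VOWELS[following]["lips"]
            # Progressive (carryover): tongue position reflects the preceding vowel.
            if previous in VOWELS:
                influences["tongue"] = VOWELS[previous]["tongue"]
        result.append((segment, influences))
    return result


# Rough segment sequences for key, coo, seat, and suit.
for word in (["k", "i"], ["k", "u"], ["s", "i", "t"], ["s", "u", "t"]):
    print(coarticulate(word))
```

In this sketch the [k] of key is annotated with spread lips and the [k] of coo with rounded lips, while the final [t] of seat is annotated as front and the final [t] of suit as back, mirroring the examples above.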
