
Coarticulatory effects of consonants on vowels and their reflection in perception
Hartmut Traunmüller
Dept. of Linguistics
Stockholm University

Abstract
The coarticulatory effects of consonants on vowels may be restricted by the need to keep
vowel phonemes distinct. Such restrictions are minimal for the schwa-like vowels in the
NW-Caucasian languages and in Northern Chinese, which is shown to display a wide range of
such effects. Co-occurrence restrictions and preferences motivated by coarticulatory ease
exist in many languages. Consonant confusions observed in CV- and VC-syllables under
various forms of distortion are discussed in order to see the perceptual effects of
coarticulation. It is found that listeners tend to ascribe to the consonant some of the properties
of the vowel when the consonantal segment is impoverished in information. Such perceptual
“reattribution” is likely to be quite important for the perception of spontaneous speech.

Coarticulation
Speakers might restrict the coarticulatory impact of consonants on vowels in order to prevent
confusions between different vowel phonemes. Therefore, it may be instructive to analyze the
C-V coarticulatory effects in languages in which this need imposes minimal restrictions. In
some languages, vowels are distinguished on the basis of openness (jaw angle) while the
positioning of tongue and lips is given by the consonants of the syllable. This is the case in the
NW-Caucasian languages, all of which have just one phonological opposition among their
vowels, /α/ vs. /‹/ (see e.g. Kuipers, 1968, Anderson, 1991). Northern Chinese (see e.g.
Kratochvil, 1968) is close to having such a linear system with three distinctive degrees of
openness and a more backward neutral setting than in NW-Caucasian. Table 1 shows how the
half-open vowel is realized in Chinese. Among its close (high) vowels, rounding and palatality
are distinctive in certain contexts in which no conflict with consonants arises.

The “unmarked” vowel in the NW-Caucasian languages differs slightly from the Chinese
schwa-like vowel in that it is a central, not a back vowel. In a bilabial or labiodental context,
it is realized as a half-rounded central vowel, and as a back rounded vowel only in a labiovelar
context (Kuipers, 1968; Choi, 1991; Wood, 1991, 1994). Otherwise, it is affected by consonants
in the same way as its Chinese equivalent.

Table 2 shows the coarticulatory effects of consonants on schwa-like vowels by place of
articulation. The laryngeal place of articulation (not listed) is the only one that has no such
effects. Consonants articulated with the tongue dorsum (palatals, velars, uvulars, and
pharyngeals) have essentially inescapable coarticulatory effects on adjacent vowels. Thereby, the
openness of vowels is also affected, alveolo-palatals and pharyngeals representing the extreme
cases. In yielding to these effects, many languages have developed phonotactic restrictions
that disallow CV- and VC-sequences in which the vowel and the consonant would require
incompatible settings of the tongue. The Chinese alveolopalatals (see Table 1) represent a
case in point. These are always followed by a close front vowel, [i] or [y]. Coarticulation at
the uvular place, as in Arabic /qaf/ [S H], gives rise to a vowel that sounds as if it were
rounded, even if it is not. It has the Jakobsonian feature of “flatness” (Jakobson, Fant and
Halle, 1952). For some reason, Arabic pharyngeals tend to be accompanied by nasalization.

Table 1. Phonotactic restrictions and coarticulatory effects of syllable initial and final consonants
on the Chinese schwa-like vowel, which in the absence of any consonants is realized as a close-mid
unrounded centralized back vowel. Numbers in parentheses refer to the notes below the table.

Place            Initial C     Effect on V              Final C
Labiovelar       w             rounding (1, 2)          w
Bilabial         p pʰ m        rounding (1, 2)
Labiodental      f             rounding (1, 2)
Dental, apical   t tʰ n l      centralizing             n
Dental, laminal  s ts tsʰ      dentalizing
Retroflex        ʂ tʂ tʂʰ ʐ    rhoticizing (1)          ɻ
Palatal          j ɥ           fronting                 j
Alveolopalatal   ɕ tɕ tɕʰ      need close front V (3)
Velar            x k kʰ        backing (opening) (1)    ŋ

1 Slightly more open in closed syllables.
2 No rounding unless the final segment of the syllable is also rounded. In general, the influence of a
final consonant is stronger than that of an initial. Thus, although we have [wo] and [je], not *[we]
and *[jo], we find [wej] and [jow], not *[woj] and *[jew].
3 The alveolo-palatals occur only immediately before [i] and [y] (also as part of diphthongs), while
the dentals [s ts tsʰ] and the velars [x k kʰ] only occur before other vowels.
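
The rule stated in note 2 can be written out as a small decision procedure. The following is only a minimal sketch, assuming the four attested syllables are [wo], [je], [wej] and [jow] and restricting itself to the two glides w and j; the function names `mid_vowel` and `syllable` are hypothetical and are not taken from any of the cited descriptions.

```python
# Illustrative sketch of note 2 to Table 1 (hypothetical helper names).
# Assumption: only the glides "w" (rounding) and "j" (fronting) are modelled.

def mid_vowel(initial, final=None):
    """Return the realization of the half-open vowel next to glides.

    A final glide, when present, overrides the initial one, because the
    influence of a final consonant is stronger than that of an initial.
    """
    governing = final if final is not None else initial
    if governing == "w":
        return "o"  # rounded realization
    if governing == "j":
        return "e"  # fronted realization
    raise ValueError("this sketch covers only the glides 'w' and 'j'")


def syllable(initial, final=None):
    """Assemble initial glide + vowel + optional final glide."""
    return initial + mid_vowel(initial, final) + (final or "")


if __name__ == "__main__":
    for args in [("w",), ("j",), ("w", "j"), ("j", "w")]:
        print(syllable(*args))
```

Letting the final glide take precedence is what excludes the starred forms *[woj] and *[jew] in this sketch.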

Table 2. Vowel preferences and unrestrained coarticulatory effects of consonants, by place of
articulation, on schwa-like vowels like [ ↔] (NW-Caucasian) and [Φ] (Chinese).

Place of articulation  Tongue configuration  Jaw position (openness)  Rounded  ↔  Φ
Labiovelar back raised close(-mid) yes W
[Ë] central close-mid yes Π
Bilabial - close-mid (yes) Π Q
Labiodental - close-mid (half) ↔³Π Φ³Q
Dental non-back apical close-mid - G ↔
Postalveolar non-back close-mid (yes) Ο Π
Retroflex non-back rhotic open-mid ambivalent Θ ³ ∈ ³ ³∈³

Alveolo-palatal front raised close - K
Palatal front raised close(-mid) - K-G
Velar back raised (quite free) - ℘³Φ
Uvular back retracted open(-mid) quasi (flat) ³
Pharyngeal back lowered open - C

Reattribution
Coarticulation implies that some of the properties of a phonetic segment are also present
within an adjacent segment. In particular when a consonantal segment is deficient in
information, it may, then, be rational for listeners to ascribe to the consonant some of the
properties that are present within an adjacent vowel. Such perceptual reattribution of the
properties of speech signals may be quite important in all languages. It would explain how
listeners succeed in perceiving the almost totally fused or deleted segments that are common in
spontaneous styles of speech. This reasoning presupposes a segmental view of speech, just as any
reasoning about “coarticulation” does. If segments were of no importance to speakers and
listeners, both “coarticulation” and “reattribution” should be seen as theoretical constructs.

In NW-Caucasian and Chinese (see Table 1), it is evident that listeners can exploit the fact
that the vowels contain reliable information about the consonants. A listener can, e.g., be sure
that a syllable has an initial labial consonant, when he has perceived its coda as an [o]. Should
we, then, in general expect the vowels listed in the last column of Table 2 to increase the
likelihood for the place of articulation of a consonant to be perceived as listed on the same line
in the first column? The results of several investigations of consonant confusions under various
forms of noise, filtering and clipping point in this direction.

Winitz, Scheib and Reeds (1972) presented to native listeners of English a number of
stimulus segments that consisted of a consonantal burst or the burst plus 100 ms of an adjacent
vowel. They found /ki/ to be misperceived as /ti/ more often than the reverse. Similar results
were later obtained with Italian and Spanish listeners (cf. Plauche, Delogu and Ohala, 1997).
Such asymmetries might arise when two acoustically similar sounds are differentiated by a cue
that one of them possesses and the other lacks. Listeners may be more likely to miss
such a differentiating cue than to introduce it spuriously. Before an [i], [k] and [t] have
similar formant transitions, but [k] has a sharp mid-frequency peak that [t] lacks. Listeners would
then be more likely to miss the spectral peak for the [k] than to introduce it in the burst of the
[t]. Plauche et al. (1997) tested this hypothesis by also using [ki] stimuli in which the spectral
maximum had been leveled out by filtering. They obtained support for the hypothesis in the
case of [ki] vs. [ti]. However, if we reason in an analogous way concerning the distinction of
[ti] vs. [pi], which was also tested by Plauche et al., we would predict a bias in favor of /p/
responses, which was not observed.

The reattribution hypothesis offers another explanation and makes different predictions. In the
context of an [i], the formants of [k] and [t] and their transitions are atypical for a velar place
of articulation. This would disfavor /k/-responses, even if there were no asymmetry in the
cues that differentiate [ki] from [ti]. As for the pair [pi] vs. [ti], the reattribution hypothesis
predicts a preponderance of /t/-responses, which was actually observed by Plauche et al.

In an investigation of the perceptual effects of clipping speech, Traunmüller (1979) used
Swedish CV syllables with all possible consonants but only the two vowels [eː] and [oː]. The
initial portions of the syllables had been clipped and replaced by either noise or silence. With
noise, the [eː] favored dental and the [oː] velar responses. A different pattern emerged when
the initial portion was left silent, which gave rise to an audible “technical stop”. While these
results tell us that reattribution occurs, they do not give us a complete picture. Another study
that suggests reattribution is that by Zee (1981), who investigated the effect of vowel quality
on the perception of the post-vocalic nasals [m], [n] and [ŋ] in syllables with the vowels
[i], [e], [a], [o] and [u] using several S/N ratios with white noise. It was observed that both
[m] and [ŋ] tend to be identified as /n/ after the front vowel [i], and as [n] also after [e], which
is to be expected on the basis of the reattribution hypothesis.

Krull (1988) investigated the perceptual confusions of the Swedish voiced stops [b], [d], [ɖ],
[g] in fragments of VCV-utterances, where all the 25 combinations of the five vowels
[ɪ], [ɛ], [a], [ɔ] and [ʊ] were used. Krull showed that the asymmetries in the confusion
patterns were largely predictable on the basis of the following vowel. Table 3 shows a selection
of her results, obtained when just the consonantal burst was presented and when the
preceding vowel was identical with the following. The hypothesis tested by Plauche et al. (1997)
may explain the general deficit in velar responses that can be seen in these data, but for each
place, the vowel-specific bias is nicely compatible with the reattribution hypothesis. However,
since all the properties were actually present within the consonantal bursts, these results
provide direct support only for “co-perception” rather than for “reattribution”. Anyway, evidence
for reattribution can also be found in Krull’s (1988) investigation, but only in the opposite
direction: Listeners were quite successful in identifying the vowel that followed the consonant
when they just heard the consonantal burst.

Table 3. Percentage of total (and incorrect) consonant responses obtained for bursts excised from
V_V utterances with the same vowel in both places. (Swedish data from Krull, 1988). Shaded:
positive bias in number of responses.

Vowel   b           d           ɖ           g
ɪ       93 (0)      183 (86)    73 (33)     51 (4)
ɛ       107 (17)    135 (57)    118 (38)    40 (5)
a       105 (7)     105 (30)    108 (25)    82 (4)
ɔ       132 (41)    77 (22)     128 (48)    63 (5)
ʊ       156 (71)    65 (28)     91 (35)     85 (48)
Σ       593 (136)   565 (223)   518 (179)   321 (66)
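
The response bias in Table 3 can be recomputed directly from the tabulated figures. The sketch below is illustrative only: it transcribes the five vowel rows and four stop columns, checks them against the Σ row, and lists the cells whose total exceeds a baseline of 100 responses. That baseline is an assumption made here (each stop drawing 100 responses per vowel context if listeners were unbiased) and is not stated by Krull (1988). The names `TABLE3`, `column_sums` and `positive_bias` are hypothetical.

```python
# Totals (and incorrect responses) per vowel context, transcribed from Table 3.
STOPS = ("b", "d", "ɖ", "g")
TABLE3 = {
    "ɪ": ((93, 0), (183, 86), (73, 33), (51, 4)),
    "ɛ": ((107, 17), (135, 57), (118, 38), (40, 5)),
    "a": ((105, 7), (105, 30), (108, 25), (82, 4)),
    "ɔ": ((132, 41), (77, 22), (128, 48), (63, 5)),
    "ʊ": ((156, 71), (65, 28), (91, 35), (85, 48)),
}
BASELINE = 100  # assumption: unbiased share of responses per stop and vowel


def column_sums(table):
    """Sum totals and incorrect responses per stop (the Σ row)."""
    sums = {}
    for i, stop in enumerate(STOPS):
        total = sum(row[i][0] for row in table.values())
        wrong = sum(row[i][1] for row in table.values())
        sums[stop] = (total, wrong)
    return sums


def positive_bias(table, baseline=BASELINE):
    """List, per vowel, the stops that drew more responses than the baseline."""
    return {
        vowel: [stop for stop, (total, _) in zip(STOPS, row) if total > baseline]
        for vowel, row in table.items()
    }


if __name__ == "__main__":
    print(column_sums(TABLE3))
    print(positive_bias(TABLE3))
```

Under this assumed baseline, the [ɪ] context shows a surplus only for [d] and the [ʊ] context only for [b], while [g] never exceeds the baseline.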

Livijn (1999) investigated the perception of a number of Swedish consonants in CV and VC
contexts under masking with white and speech-like noise. This investigation did provide
support for reattribution of some kind, since the responses were vowel dependent, but the pattern
of confusions was dominated by a bias in favor of letter names. Evidently, the task of
identifying consonants had sensitized subjects also for hearing consonant names. A bias towards
letter names, such as [koː], can also be seen in Traunmüller (1979) and it is unclear to what
extent this factor may have contributed to (or distorted) the results obtained by Winitz et al.
(1972), Zee (1981), and Plauche et al. (1997). We have to conclude that reattribution may be
based on other factors, in addition to coarticulation, and that these may dominate.

References
Anderson J. 1991. Kabardian disemvowelled, again. Studia Linguistica 45, 18–48.
Choi J.D. 1991. An acoustic study of Kabardian vowels. Journal of the International
Phonetic Association 21, 4–12.
Jakobson R., Fant C.G.M., and Halle M. 1952. Preliminaries to Speech Analysis. MIT
Acoust. Lab. Techn. Rep. 13.
Kratochvil P. 1968. The Chinese Language Today. London: Hutchinson.
Krull D. 1988. Acoustic Properties as Predictors of Perceptual Responses: A Study of
Swedish Voiced Stops. PERILUS VII, Inst. Linguist., Stockholm University.
Kuipers A.H. 1968. Phoneme and morpheme in Kabardian (Eastern Adyghe). The Hague:
Mouton.
Livijn P. 1999. Ett perceptionsexperiment med konsonanter i olika vokalkontext vid
maskering med brus. (C-uppsats, Inst. för lingvistik, Stockholms universitet.)
Plauche M.C., Delogu C., and Ohala J.J. 1997. Asymmetries in consonant confusion. Proc. of
EuroSpeech'97, vol. 4, 2187–2190.
Traunmüller H. 1979. Artificially clipped syllables and the role of formant transitions in
consonant perception. PERILUS I, 105–122 (Inst. Linguist., Stockholm University).
Winitz H., Scheib M.E., and Reeds J.A. 1972. Identification of stops and vowels for the
burst portion of /p,t,k/ isolated from conversational speech. J. Acoust. Soc. Am. 51,
1309–1317.
Wood S.A.J. 1991. Vertical, Monovocalic and Other ‘Impossible’ Vowel Systems: A review
of the articulation of the Kabardian vowels. Studia Linguistica 45, 49–70.
Wood S.A.J. 1994. A spectrographic analysis of vowel allophones in Kabardian. Working
Papers 42, 241–50 (Inst. Linguist., Lund University).
Zee E. 1981. Effect of vowel quality on perception of post-vocalic nasal consonants.
J. Phonet. 9, 35–48.
