Redução e Apócope PB
Abstract
This is a study of final poststressed vowel devoicing following /s/ in Brazilian
Portuguese. We contradict the literature describing it as deletion by arguing, first,
that the vowel is not deleted, but overlapped and devoiced by the /s/, and, second,
that gradient reduction with devoicing may lead to apocope diachronically. The
following results support our view: (1) partially devoiced vowels are centralized;
(2) centralization is inversely proportional to duration; (3) total devoicing is accom-
panied by lowering of the /s/ centroid; (4) the /s/ noise seems to be lengthened
when the vowel is totally devoiced; (5) aerodynamic tests reveal that lengthened
/s/ has a final vowel-like portion, too short to be voiced; (6) lengthened /s/ favors
vowel recovery in perceptual tests. This seems to be a likely path from reduction
to devoicing to listener-based apocope.
© 2015 S. Karger AG, Basel
1 Introduction
This paper analyzes an allophonic process that has been commonly described as
deletion of poststressed vowels in Brazilian Portuguese (henceforth BP).
Portuguese went through several sound changes throughout its history. Vowel apo-
cope is among the most recurrent, e.g. amare > amar; legale > leal, mense > mês. It
is in fact pervasive in Romance languages, given its Vulgar Latin origins. Deletion
tends to apply primarily to poststressed syllables in languages that have maintained the
asymmetric stress contour of Vulgar Latin (a reinterpretation of the Latin stress rule in
which intensity falls abruptly after stress). Later it spread to prestressed syllables – as
syncope (Taylor, 1994; Wheeler, 2007; Cunha, 2015, this volume) – in languages tend-
ing to stress timing and left-headedness, such as Catalan and European Portuguese
(Ramus et al., 2003). However, the conservative stress contours of present-day BP,
Galician and Spanish are more likely to favor apocope.
Recently, several studies have claimed that final vowels tend to be deleted in
weak positions in BP (Pagel, 1993; Viegas and Oliveira, 2008; Rolo and Mota, 2012).
However, studies of similar cases in other languages such as English, Turkish, French
Siriraj Medical Library, Mahidol University
198.143.39.97 - 3/11/2016 7:14:25 PM
2 Background
¹ This is different from simple in-phase coordination, where the vowel reaches its target after the consonant
offset. In blending, the vowel reaches its target before the consonant offset.
3 Experiment 1
3.2 Measurements
Several acoustic properties were measured in target vowels. These measurements
aimed at relating reduction to devoicing. The measurements used in experiment 1
were: /s/ noise centroid, and the durations of the target syllable, noise and vowel. In
order to compare contexts favoring and disfavoring devoicing, VSAs were calculated
as an index of vowel dispersion. The correlation between VSA and vowel duration was
also computed.
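The two derived measures named above can be illustrated with a minimal sketch. It assumes that the VSA is the area of the triangle whose corners are the mean (F1, F2) values of /a/, /i/ and /u/, and that the centroid is the power-weighted mean frequency of the noise spectrum. The text does not spell out either formula, so the function names, the input layout and the resulting units (Hz² for the area) are illustrative assumptions, not the procedure actually used in experiment 1.

```python
from typing import Dict, Sequence, Tuple


def vowel_space_area(formants: Dict[str, Tuple[float, float]]) -> float:
    """Area of the /a/-/i/-/u/ triangle in the F1 x F2 plane.

    `formants` maps each corner vowel to its mean (F1, F2) pair.
    The triangle area is obtained with the shoelace formula
    (assumed definition of VSA, not taken from the paper).
    """
    (x1, y1), (x2, y2), (x3, y3) = (formants[v] for v in ("a", "i", "u"))
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def spectral_centroid(freqs_hz: Sequence[float], power: Sequence[float]) -> float:
    """Power-weighted mean frequency of a spectrum (first spectral moment)."""
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, power)) / total
```

Under these assumptions, a smaller area corresponds to a more centralized vowel system, and a lower centroid corresponds to noise energy concentrated at lower frequencies.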
3.5 Results
3.5.1 General Findings
In general, our results confirm the acoustic patterns described by the literature
(Jannedy, 1995; Tsuchida, 1997). High vowels lose voicing (sometimes incompletely),
periodicity and formant structure, as shown in figure 1. Note below that, as pointed out
by Han (1962) and Kondo (2005), the formant pattern is not clear in the case of devoic-
ing (fig. 1c) as compared with partial devoicing (fig. 1b) and voicing (fig. 1a), which
exhibits well-defined boundaries between the vowel and the fricative.
Fig. 1. Examples of voiced (a), partially devoiced (b) and totally devoiced (c) syllables /si/, respectively, extracted from the word ‘lance’ [ˈlã.si]. Axes: time (s) by frequency (Hz).
Fig. 2. Vowel space areas of vowels followed by voiceless (a) and voiced consonants (b). Axes: F2 by F1; vowels plotted: [a], [i], [u].
The proportions of devoiced vowels of different heights are also in line with the
literature (Tsuchida, 1997). Percentages of totally devoiced vowels are 46% for /u/,
42% for /i/ and 12% for /a/. Percentages of partial devoicing are also higher for /u/
(45%) and /i/ (38%) compared to /a/ (17%). As mentioned above, this kind of variation,
recurrent across speakers, is the raw material for sound change in progress.
Fig. 3. Correlation between speaker VSAs and their vowel duration in voiceless contexts. Axes: VSA by vowel duration (ms).
Fig. 4. Occurrence of devoiced vowels, partially devoiced vowels and voiced vowels for speakers 1 (a) and 5 (b), respectively. Vertical axis: occurrences (%); vowels plotted: /u/, /a/, /i/.
voiceless consonants is more compressed than the space defined by vowels preceding
voiced consonants. This strongly suggests that the vowel reduction is more pronounced
in poststressed syllables followed by voiceless consonants, as seen in figure 2a.
Figure 3 shows the correlation between the speakers’ VSAs and the duration of
their voiced and partially devoiced vowels pooled together. There is a clear trend for
more centralized vowels to be shorter than less centralized vowels. The positive cor-
relation [r(24) = 0.53, p = 0.005] indicates that centralization tends to be directly pro-
portional to vowel shortening.
Figure 4 displays differences in devoicing percentage between two extreme speak-
ers, S11 and S51, to help clarify the relationship between centralization and shortening
in figure 3. S11 (left) devoices practically all high vowels, whereas S51 (right) has a low
percentage of devoicing, producing more voiced or partially devoiced vowels. The
vowels of S11 tend to be shorter (69 ms average) and more centralized (VSA = 17.02), while
the vowels of S51 tend to be longer (91 ms average) and less centralized (VSA = 103.21).
Fig. 5. Spectral moments of fricative in syllables ‘without’ vowel (devoiced /i/) and with vowel (voiced /i/). Vertical axis: centroid (Hz).
Thus, we take S11 and S51 as representative of the kind of variation involved in the
steps of sound change under investigation.
It is clear that final unstressed vowels followed by voiceless consonants undergo
even more radical reduction in speakers such as S11.
Fig. 6. a, b Duration of /s/ noise with voiced, partially devoiced and totally devoiced vowels. Panels: /u/ (a) and /i/ (b); vertical axis: duration (Z score).
Fig. 7. a, b Syllable duration with voiced, partially devoiced and devoiced vowel. Panels: /si/ (a) and /su/ (b); vertical axis: duration (Z score).
preliminary study, we (Albano and Meneses, 2015) tested for devoicing in vowels
preceded by stops. Preliminary results showed the presence of a longer burst in the
absence of a visible vowel. Thus, duration patterns are apparently similar to those
following fricatives⁴.
Figures 6 and 7 illustrate the tendency for vowels and consonants to lengthen and
shorten depending on the amount of voicing. This is confirmed in figure 8, which plots
consonant durations against vowel durations in cases where devoicing was not complete.
The correlation is negative and significant [Pearson r(45) = –0.46, p = 0.001],
confirming that vowel and consonant durations are inversely related.
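The kind of computation behind such a correlation can be sketched as follows. The sketch assumes that each duration is first converted to a Z score and that a plain Pearson coefficient is then computed over the paired values. The grouping used for normalization (for instance, per speaker) is not stated here, so `z_scores` simply normalizes whatever list it receives, and both function names are hypothetical.

```python
import math
from typing import List, Sequence


def z_scores(values: Sequence[float]) -> List[float]:
    """Normalize values to zero mean and unit (sample) standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("values are constant")
    return [(v - mean) / sd for v in values]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least two items")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the samples is constant")
    return sxy / math.sqrt(sxx * syy)
```

A negative coefficient means that longer consonant noise goes with shorter vowels. If the bracketed 45 is read as degrees of freedom, as in the usual r(df) notation, it would correspond to 47 vowel-consonant pairs.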
So far, the centroid and duration results point to the existence of some vowel por-
tion in the /s/ noise. Furthermore, the lower centroid and the longer noise seem to result
from the extreme overlap of the vowel and consonant gestures. Such an overlap pre-
cludes voicing, due to the dominance of the fricative gesture. As seen in the waveform
of figure 1, the consonant noise prevails. This is in accordance with the predictions of
⁴ The analysis of these data is still under way.
Fig. 8. Correlation between normalized vowel and consonantal duration. Axes: consonant duration (Z score) by vowel duration (Z score).
AP, which attributes more deformability to the least constricted gesture (Browman and
Goldstein, 1987). In an attempt to further clarify these findings, the next experiment
investigates devoiced vowels through aerodynamic data.
4 Experiment 2
Fig. 9. Waveform (a, b) and oral airflow (c, d) of /s/ with voiced vowel (a, c) and coda /s/ (b, d) for the target words passe [ˈpa.si] and paz [ˈpas]. Axes: time (ms) by acoustic amplitude (a, b) and by airflow in dm³/s (c, d).
are available to measure oral and nasal airflow along with pharyngeal pressure. Here,
we used only one oral airflow channel.
The simultaneous recording of sound and airflow requires the use of a ‘mouth-
piece’ attached to the microphone’s mechanical stand. A flexible silicone mask is used
to seal off the mouth so as to obtain reliable oral airflow measurements. To this end, the
pressure sensors were calibrated and checked for each speaker. The Phonedit software
was used for data recording and processing.
4.3 Results
Examination of the synchronized airflow and waveform signals shows that the oral
flow of /s + V/ presents two peaks, as shown in figure 9a and c for ‘Digo paz baixinho’.
The first peak refers to the constriction at the beginning of the friction. The valley that
follows refers to the continuous release of the flow during the production of the frica-
tive. The second peak refers to the relaxation of the constriction to produce the vowel.
In coda /s/ (fig. 9b, d), there is only one oral airflow peak related to fricative
production, followed by the constriction release, since the /s/ is in coda (i.e. no vowel
follows). If we compare figures 9 and 10, we see that there is a clear similarity between
voiced vowels and devoiced vowels: the noise occurring with devoiced vowels also
displays, first, a fricative constriction peak and, second, a peak analogous to that of
voiced vowels. Note the contrast with coda /s/ above.
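The one-peak versus two-peak distinction described above can be made concrete with a small sketch. It counts local maxima in a sampled oral airflow trace and treats two maxima as separate peaks only when the flow dips between them by a minimum amount. The function name and both thresholds are hypothetical; the results above rest on inspection of the synchronized signals, not on this procedure.

```python
from typing import List, Sequence


def find_airflow_peaks(
    flow: Sequence[float], min_height: float, min_drop: float
) -> List[int]:
    """Return indices of airflow peaks separated by a real valley.

    A sample is a candidate peak if it is a local maximum at or above
    `min_height`. Two candidates count as distinct peaks only if the
    flow between them falls at least `min_drop` below the lower of the
    two; otherwise only the taller candidate is kept.
    Both thresholds are illustrative assumptions (same unit as `flow`).
    """
    peaks: List[int] = []
    for i in range(1, len(flow) - 1):
        is_local_max = flow[i] > flow[i - 1] and flow[i] >= flow[i + 1]
        if not is_local_max or flow[i] < min_height:
            continue
        if peaks:
            last = peaks[-1]
            valley = min(flow[last : i + 1])
            lower_peak = min(flow[last], flow[i])
            if lower_peak - valley < min_drop:
                # Not separated by a clear valley: keep the taller one.
                if flow[i] > flow[last]:
                    peaks[-1] = i
                continue
        peaks.append(i)
    return peaks
```

On this reading, a trace yielding two peaks would pattern with /s + V/, whether the vowel is voiced or devoiced, and a trace yielding a single peak would pattern with coda /s/.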
Figure 10 suggests that neither time nor airflow is sufficient for the production of
a canonical voiced vowel when a voiceless consonant follows. Apparently, voicing is
precluded by the dominance of the fricative gesture, as predicted by the extreme over-
lap analysis.
5 Experiment 3
Fig. 10. Waveform (a) and oral airflow (b) of devoiced vowel for the target word passe [ˈpa.si] in the carrier sentence ‘Digo passe paciente’. Axes: time (ms) by acoustic amplitude (a) and by airflow in dm³/s (b).
5.3 Procedure
The perception test consisted in presenting 35 stimuli to each listener. Listeners
heard the stimuli 5 times and chose the matching word among 3 possibilities. For
instance, after hearing the word [ˈfa.si] with stretched /s/, they were asked to choose
Fig. 11. Percentages of hits and errors in the identification of lengthened /s/, mean length /s/ and final /s/. Vertical axis: occurrences (%).
among [ˈfas], [ˈfa.si] or [ˈfa.su]. Incorrect vowel identification (e.g. [ˈpa.su] for
[ˈpa.si]) was counted as a hit, since the major issue was vowel detection. After
identification, listeners rated the confidence level of their answer on a 5-point scale
(from 1 to 5) ranging from maximal to minimal confidence, and this rating was used to
weight the scores.
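The hit criterion just described can be sketched in a few lines: a response counts as a hit when it agrees with the stimulus on the presence or absence of a final vowel, regardless of which vowel was chosen. Transcriptions are represented here as plain strings without brackets, and the helper names are hypothetical. The confidence weighting is left out because its formula is not given in this section.

```python
from typing import Iterable, Tuple

FINAL_VOWELS = ("i", "u")  # the two final vowels offered as response options


def has_final_vowel(transcription: str) -> bool:
    """True if the transcription ends in one of the final vowels."""
    return transcription.endswith(FINAL_VOWELS)


def is_hit(stimulus: str, response: str) -> bool:
    """Hit = listener and stimulus agree on vowel presence or absence."""
    return has_final_vowel(stimulus) == has_final_vowel(response)


def hit_rate(trials: Iterable[Tuple[str, str]]) -> float:
    """Percentage of hits over (stimulus, response) pairs."""
    pairs = list(trials)
    if not pairs:
        raise ValueError("no trials given")
    hits = sum(is_hit(stimulus, response) for stimulus, response in pairs)
    return 100.0 * hits / len(pairs)
```

With this criterion, answering [ˈpa.su] to [ˈpa.si] is a hit, whereas answering [ˈpas] to [ˈpa.si] is an error.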
5.5 Results
Figure 11 shows the rates of hits and errors for all test stimuli. Lengthened noise
presented more hits, namely 76%. The error rate was lower for lengthened noise
(24%) than for average-length noise (42%).
In spite of the confusion caused by forced choice, coda /s/ has a high hit rate,
as expected. Interspeaker variability analogous to that of experiment 1 can also be
observed in our identification experiment. Here, the extreme cases are listeners L13
and L33: while L13 has only 9% errors for lengthened /s/, L33 has as much as 37%.
Even though L33 presents fewer errors than hits, the difference between L13 and L33
suggests that there are people who are more likely to misperceive extremely over-
lapped /s + V/.
When we consider the weighted scores in figure 12, i.e. those resulting from
weighting hits and errors by confidence level, the Kruskal-Wallis test indicates that
lengthened noise is a decisive cue to vowel detection (H = 6.33, d.f. = 2, p < 0.04).
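For readers unfamiliar with the test, the following is a minimal sketch of how a Kruskal-Wallis H statistic is obtained from three groups of scores. It uses the textbook rank-sum formula with the standard tie correction, and for two degrees of freedom it uses the closed-form chi-square tail probability exp(-H/2). The function names and any input data are hypothetical; the statistical package actually used for the analysis above is not stated.

```python
import math
from typing import List, Sequence, Tuple


def kruskal_wallis(groups: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Return (H, degrees of freedom) for the given groups of scores."""
    pooled = sorted((v, g) for g, grp in enumerate(groups) for v in grp)
    n = len(pooled)
    if n < 2 or any(len(grp) == 0 for grp in groups):
        raise ValueError("every group needs at least one score")
    rank_sums: List[float] = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(grp) for r, grp in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical")
    return h / correction, len(groups) - 1


def chi2_tail_df2(h: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 d.f."""
    return math.exp(-h / 2.0)
```

The chi-square approximation is conventional for this test; with small groups, exact tables may give somewhat different probabilities.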
Fig. 12. Mean score weighted by confidence for lengthened /s/, mean length /s/ and final /s/. Vertical axis: weighted score; plotted: mean, mean ± SE, mean ± SD, outliers and extremes.
Average /s/ and coda /s/ yielded more errors and less listener confidence in the
forced identification test. In contrast, lengthened /s/ yielded more hits and more listener
confidence. Therefore, lengthened noise seems to carry important cues for the recovery
of the devoiced vowel. As we shall see in the next section, our results are explainable
in terms of relative timing with various degrees of gestural overlap with total devoicing
as the extreme case.
6 Discussion
The results of this work show that the acoustics, the aerodynamics and the perception
of radically reduced final vowels in BP contradict the phonological literature
claiming that there is deletion in that position. Following up on Meneses’s findings
(2012), the above data show that reduction of final poststressed vowels in BP varies
from moderate to extreme: devoicing sets in as the vowel approaches critical shortness,
leading to total devoicing in a voiceless environment.
In summary, final vowel reduction is expressed in five ways in our data, namely:
(1) partially devoiced vowels are highly centralized and shorter if followed by a con-
sonant; (2) when the vowel is apparently absent, the average centroid differentiates
between regular and extreme CV overlap; (3) the /s/ noise lengthens when the vowel is
fully devoiced; (4) lengthened /s/ exhibits a final vowel-like rise in airflow revealed by
aerodynamic tests; (5) lengthened /s/ noise facilitates vowel recovery.
Let us now try to integrate the interpretation of the above facts.
When preceded by /s/ and followed by a voiceless consonant, voiced vowels are
very short and highly centralized, as indicated by the decrease in both vowel duration and
VSA. When the vowel is fully devoiced, the consonant gesture dominates. Vowels tend
to shorten gradually and slide under the preceding /s/. When overlap is so extreme that
it produces a fricative vowel, two acoustic cues point to the presence of a vowel gesture.
First, a lowered /s/ centroid signals the overlapped vowel. Second, lengthened /s/ points
to the extremeness of such overlap in full devoicing (as shown in experiment 1). This
7 Conclusion
Our results indicate that the reduction of final unstressed vowels in BP is per-
vasive but gradual, ranging from shortening with partial devoicing to full devoicing.
This variability appears to result from differences in gestural coordination and different
degrees of overlap between consonant and vowel gestures.
Such gradience is more consistent with extreme C and V overlap than with vowel
deletion. However, as the fricative vowel is extremely short, listeners can be misled by
the weak vowel cues in the acoustic signal so that they fail to recover such an overlap.
If, eventually, overlap with total devoicing spreads diachronically in the population,
apocope can be gradually assumed by more and more listeners, since they have vari-
able degrees of sensitivity to the vowel cues in the noise. Thus, devoiced vowels, i.e.
completely overlapped fricative vowels, can trigger listener-based apocope in the long
run, along the lines advocated by Ohala (1981).
Acknowledgments
This research was supported by the Fundação de Amparo à Pesquisa do Estado de São Paulo,
grant No. 2010/04902-0. Support from the Conselho Nacional de Desenvolvimento Científico e
Tecnológico (CNPq), grant No. 311154/2009-3, is also acknowledged. We also thank Didier Demolin,
Khalil Iskarous, the audience of PAPI 2013, Marina Vigário, Rachel Walker, two anonymous review-
ers and our colleagues from Unicamp’s Laboratório de Fonética e Psicolinguística (LAFAPE) for use-
ful comments and suggestions. Special thanks are due to our participants.
References
Albano E, Meneses F (2015): Novas luzes sobre a Dinâmica Sincrônica e Diacrônica do Desvozeamento Vocálico
(oral presentation). Cadernos de resumos do X Congresso Internacional da Abralin. Belém, Pará, pp 59–60.
Aquino P (1997): O papel das vogais reduzidas pós-tônicas na construção de um sistema de síntese concatenativa
para o português do Brasil; master’s thesis, Unicamp, Campinas.