Acoustic Correlates

Linguistic Portfolios
Volume 3 Article 7
2014
The Acoustic Correlates of Stress-Shifting Suffixes
in Native and Nonnative English: An Overview
Paul Keyworth
St. Cloud State University
Follow this and additional works at: https://repository.stcloudstate.edu/stcloud_ling

Part of the Applied Linguistics Commons
Recommended Citation
Keyworth, Paul (2014) "The Acoustic Correlates of Stress-Shifting Suffixes in Native and Nonnative English: An Overview," Linguistic
Portfolios: Vol. 3 , Article 7.
Available at: https://repository.stcloudstate.edu/stcloud_ling/vol3/iss1/7
This Article is brought to you for free and open access by theRepository at St. Cloud State. It has been accepted for inclusion in Linguistic Portfolios by
an authorized editor of theRepository at St. Cloud State. For more information, please contact rswexelbaum@stcloudstate.edu.
THE ACOUSTIC CORRELATES OF STRESS-SHIFTING SUFFIXES IN
NATIVE AND NONNATIVE ENGLISH: AN OVERVIEW
PAUL KEYWORTH
ABSTRACT
Although laboratory phonology techniques have been widely employed to discover the
interplay between the acoustic correlates of English Lexical Stress (ELS), namely fundamental
frequency, duration, and intensity, studies on ELS in polysyllabic words are rare, and
cross-linguistic acoustic studies in this area are even rarer. Consequently, the effects of
language experience on L2 lexical stress acquisition are not clear. This investigation of
adult Arabic (Saudi Arabian) and Mandarin (Mainland Chinese) speakers analyzes their
ELS production in tokens with seven different stress-shifting suffixes; i.e., Level 1
[+cyclic] derivations to phonologists. Stress productions are then systematically
analyzed and compared with those of speakers of Midwest American English using the
acoustic phonetic software, Praat. In total, one hundred subjects participated in the
study, spread evenly across the three language groups, and 2,125 vowels in 800
spectrograms were analyzed (excluding stress placement and pronunciation errors).
Nonnative speakers completed a sociometric survey prior to recording so that statistical
sampling techniques could be used to evaluate acquisition of accurate ELS production.
The speech samples of native speakers were analyzed to provide norm values for cross-
reference and to provide insights into the proposed Salience Hierarchy of the Acoustic
Correlates of Stress (SHACS). The results support the notion that a SHACS does exist in
the L1 sound system, and that native-like command of this system through accurate ELS
production can be acquired by proficient L2 learners via increased L2 input. Other
findings raise questions as to the accuracy of standard American English dictionary
pronunciations as well as the generalizability of claims made about the acoustic
properties of tonic accent shift.
1.0 Introduction
It is widely accepted that certain suffixes in English cause a shift in stress in the
root morpheme to the syllable directly preceding the suffix (Celce-Murcia, Brinton, &
Goodwin, 1996; Kreidler, 2004). These stress-shifting suffixes have been labeled Level 1
[+cyclic] suffixes by generative phonologists (Kiparsky, 1982; Halle and Kenstowicz,
1991). Pronunciation experts, including Celce-Murcia et al. (1996), have claimed that the
resultant shift in stress in turn causes a change in the neutralization or vowel reduction in
the unstressed syllable. Koffi (personal communication, September 11, 2012) has
affirmed that these claims about lexical stress shifts have not yet been supported
quantitatively by the subfield of laboratory phonology.
In addition to this concern about validity, although various studies on the acoustic
properties of English word stress do exist, there is a lack of consensus in the literature as
to the relative importance of the acoustic correlates of stress: fundamental frequency
(F0) (i.e., pitch), duration, intensity, and spectral reduction. Indeed, various contrasting
versions of what the author here terms the Salience Hierarchy of the Acoustic
Correlates of Stress (SHACS) have been proposed: F0 > duration > intensity (e.g., Fry,
1955, 1958; Ladefoged, 2003), duration > F0 > intensity (e.g., Adams & Munro, 1978),
and duration > intensity > F0 (e.g., Beckman & Edwards, 1994). The latter have reasoned
that F0 is only a relevant acoustic correlate of stress with regard to sentential pitch
accent.
Furthermore, most studies have not explored the acoustic properties of the full range of
Level 1 [+cyclic] suffixes in the lexicon. In fact, English Lexical Stress (ELS) in
polysyllabic words in general has largely been ignored in favor of disyllabic minimal
stress pairs, as in Fry’s original studies (1955, 1958).
Moreover, there is a dearth of cross-linguistic acoustic data on comparisons of
productions of Level 1 [+cyclic] derivations by native speakers of English (NES) and
nonnative speakers of English (NNES) of different proficiencies and first language (L1)
backgrounds. Although the extent of L2 accentedness is related to many determinants,
including language environment and age of speakers, the main mediator of individual
differences in L2 accents is the “sound system” of their L1 (Zhang, Nissen, & Francis,
2008, p. 4498). For example, there is growing evidence to suggest that Mandarin L1
speakers have problems pronouncing L2 English stress contrasts because of “strong
interference from the Mandarin tonal system” (Zhang et al., 2008, p. 4500). As Zhang et
al. have stated, even when syllabic stress is placed appropriately by Mandarin NNES,
they have problems manipulating the acoustic correlates of stress in a native-like manner.
Conversely, various phonetic studies on rhythmic typology strongly indicate that Arabic
is a stress-timed language that is “a very likely language to exhibit the same correlates to
stress as does English” (de Jong & Zawaydeh, 1999, p. 5).
This paper aims to validate the widely held impressionistic assertions in the literature
about the morphophonemic properties of Level 1 [+cyclic] suffixes by providing
quantifiable data. Therefore, the current study is based on quantitative acoustic analyses
of the data using laboratory phonology techniques, which have the advantage of
“replicability and robustness” (Post & Nolan, 2012, p. 544) if suitable sampling and
statistical methods are employed. In addition, this project investigates the conflicting
claims made by acoustic phonetics experts about SHACS. To do this, syllabic F0,
duration, and intensity productions are analyzed in Level 1 [+cyclic] derivations by
native speakers of the Midwestern American English (MWAE) dialect. Due to limitations of
time, the researcher does not measure the acoustic correlate of vowel quality (i.e., the first
and second formants, F1 and F2), which is in accordance with Lieberman’s study (1960).
From a second language acquisition (SLA) research perspective, the other purpose of this
study is to observe whether there is a correlation between exposure to the L2 and/or L1
background and production accuracy of Level 1 [+cyclic] suffix derivations. As Zhang et
al. (2008) have succinctly noted, most research in the area of English lexical stress
“confound the phonological issue of stress placement with the phonetic problem of
native-like stress production” (p. 4498). Thus, production accuracy here refers to a
twofold distinction: 1) L2 knowledge of where to place the stress in derived words, and
2) native-like production of the acoustic correlates of stress. More specifically, this study
examines the acoustic correlates of productions of English Level 1 [+cyclic] derivations
by Arabic L1 and Mandarin L1 NNES.
Ancillary findings also raise questions as to the accuracy of IPA pronunciations of certain
Level 1 [+cyclic] derivations provided by Standard American English (SAE) dictionaries
as well as the generalizability of claims made about the acoustic properties of tonic
accent shift. The latter is a theory proposed by Ladefoged and Johnson (2010, p. 119) that
suggests primary-stressed vowels only differ from secondary-stressed vowels with
regards to an increase in F0, making the syllable [+tonic].
2.0 Data Collection

The researcher digitally recorded productions of Level 1 [+cyclic] tokens by L1
speakers of Midwest American English, Saudi Arabian Arabic, and Mainland Chinese
Mandarin. Participants read aloud eight tokens inside the carrier phrase “Say __________
again”. The stimuli included the stem word <HIStory> in addition to seven derived
words formed by the addition of seven different stress-shifting suffixes: <hisTORic>,
<hisTORical>, <histoRICity>, <hisTORian>, <hisTORify>, <hisTORial>, and
<hisTORious>. Each of these suffixes has a corpus frequency of approximately 1%
(Carroll, Davies, & Richman, 1971). The suffix <-tion>, which has a corpus frequency of
around 4% (Carroll, Davies, & Richman, 1971), was purposefully omitted to keep this
variable constant. The carrier phrase was designed so that the words to be studied did not
carry an onset rise or a pitch accent (Maeda, 1976; Ladefoged & Johnson, 2010; Post &
Nolan, 2012). To limit ordering effects (see Cowart, 1997), tokens were presented
in random order by shuffling the cards.
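The presentation procedure described above can be illustrated with a short sketch. It is only an illustration: the study shuffled physical cards, and the function name `build_prompts` and its `seed` parameter are hypothetical conveniences introduced here, not part of the original protocol.

```python
import random

# The eight tokens from the study: the stem plus seven derived forms.
TOKENS = ["history", "historic", "historical", "historicity",
          "historian", "historify", "historial", "historious"]

# Carrier phrase from the study, with a slot for the token.
CARRIER = "Say {} again"


def build_prompts(tokens, seed=None):
    """Return carrier-phrase prompts in a random presentation order.

    `seed` is a hypothetical option that makes the order reproducible.
    """
    rng = random.Random(seed)
    order = list(tokens)      # copy so the caller's list is left untouched
    rng.shuffle(order)        # software analogue of shuffling the cards
    return [CARRIER.format(token) for token in order]


prompts = build_prompts(TOKENS, seed=1)
for prompt in prompts:
    print(prompt)
```

Each participant would receive a fresh shuffle (a different seed, or none), so that no token is systematically read first or last.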
Prior to recording, subjects completed a short sociometric survey so that statistical
sampling techniques could be used to evaluate acquisition of accurate ELS production.
The speech samples of native speakers were analyzed to provide norm values for cross-
reference and to provide insights into the proposed SHACS. In total, 100 subjects
participated in the study, spread evenly across the three language groups (Figure 1).
Figure 1: Participants by L1 and English Proficiency Level

3.0 Data Analysis

The researcher analyzed the productions in Praat, delineated the
vowels, and measured the relative primary-stressed to non-primary-stressed vocalic
ratios¹ for mean F0 (Hz), mean intensity (dB), and duration (ms). This methodology is
similar to that of Flege and Bohn (1989), Zhang et al. (2008), and Lee and Cho (2011), and used
Praat Version 5.3.31 (Boersma & Weenink, 2013). However, to the best of the author’s
knowledge, the proposed method of comparing vowels was novel in that the acoustic
correlates in vowels with primary stress were examined in relation to those of all the
other vowels in the utterance, as opposed to just one of the unstressed vowels (e.g.,
Flege & Bohn, 1989; Lee & Cho, 2011). In other words, the [+tonic acc.] vowel in each
token was acoustically compared to the [-tonic acc.] vowels (Ladefoged, 2001). The
rationale is that if a vowel has primary stress, its acoustic cues should be prominent
relative to those of all the other vowels in the word. Scripts were used to semi-automate the
delineation of vowels (Ryan, 2005) and to retrieve stress analysis data (Yoon, 2008), which greatly reduced the
potential for human error. In total, 2,125 vowels in 800 spectrograms were analyzed, as in
Figure 2. This total excludes productions with stress placement errors and pronunciation
errors, which were entered into a separate data pool. The latter error type could also involve
misplaced stress but was deemed more serious with regard to intelligibility because it could
include the deletion, addition, and/or substitution of a segment or cluster of segments.
Naturally, any productions containing errors had to be excluded from the main dataset, as
it would not be a fair test to compare the vocalic relative stress values in words which had
been pronounced differently.
Figure 2: <Historical> as spoken by a Midwest American female

P = primary-stressed vowel S = secondary-stressed vowel US = unstressed vowel
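As a rough illustration of the two relative stress ratios (P vs. All and P vs. S), the sketch below divides each measurement of the primary-stressed vowel by the mean of the corresponding measurements in the comparison vowels. This is one plausible reading of the procedure rather than the study's actual Praat scripts; the function name `stress_ratios`, the dictionary layout, and all measurement values are invented for illustration.

```python
from statistics import mean

CUES = ("f0", "intensity", "duration")


def stress_ratios(vowels):
    """Compute P vs. All and P vs. S ratios for one token.

    `vowels` is a list of dicts with a 'stress' label ('P', 'S', or 'US')
    and measurements 'f0' (Hz), 'intensity' (dB), and 'duration' (ms).
    """
    primary = [v for v in vowels if v["stress"] == "P"]
    if len(primary) != 1:
        raise ValueError("expected exactly one primary-stressed vowel")
    p = primary[0]
    others = [v for v in vowels if v["stress"] != "P"]      # all [-tonic] vowels
    secondary = [v for v in vowels if v["stress"] == "S"]   # [+stress, -tonic] only
    if not others:
        raise ValueError("need at least one non-primary vowel")

    ratios = {}
    for cue in CUES:
        p_vs_all = p[cue] / mean(v[cue] for v in others)
        p_vs_s = p[cue] / mean(v[cue] for v in secondary) if secondary else None
        ratios[cue] = {"P_vs_All": p_vs_all, "P_vs_S": p_vs_s}
    return ratios


# Invented measurements for a four-vowel token such as <historical>.
example = [
    {"stress": "S",  "f0": 200, "intensity": 70, "duration": 60},
    {"stress": "P",  "f0": 220, "intensity": 75, "duration": 120},
    {"stress": "US", "f0": 190, "intensity": 68, "duration": 50},
    {"stress": "US", "f0": 180, "intensity": 66, "duration": 70},
]
ratios = stress_ratios(example)
print(ratios["duration"])
```

A ratio above 1 means the primary-stressed vowel exceeds the comparison vowels on that cue; a ratio below 1 corresponds to the "negative" contrasts reported later for duration.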
3.1 Statistical Analyses

- Paired sample t-tests were used to identify significant differences (p < .05)
between the primary [+tonic] and non-primary stressed [-tonic] vowels (P vs. All)
in each token.
- Paired sample t-tests were used to identify significant differences (p < .05)
between the primary [+tonic, +stress] and secondary stressed [-tonic, +stress]
vowels (P vs. S) in each token.
- Mean vocalic relative stress ratios of each factor were submitted to ANOVAs, as
well as Tukey HSD post hoc tests, to see if there were significant differences in
stress production among the language groups. Productions of the stem token,
<history>, were omitted from comparisons.
- Pearson Product-Moment Correlations were used to determine strength of relations
and effect sizes for the operationalized variables of: L1 background, L2 exposure,
L2 input, and L2 proficiency. Productions of the stem token, <history>, were
omitted from comparisons.

1 Two ratios used: a) [+tonic]/[-tonic] (P vs. All); b) [+stress, +tonic]/[+stress, -tonic] (P vs. S)
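The paired-sample comparison in the first two items can be sketched as follows. This is a minimal standard-library illustration of the paired t statistic, not the statistical package used in the study; the function name `paired_t` and the intensity values are invented for illustration.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired-sample t-test on equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)                      # sample SD of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))  # mean difference over its standard error
    return t, n - 1


# Invented intensities (dB) for five speakers producing one token:
primary = [72, 75, 70, 74, 73]   # primary-stressed vowel (P)
others = [68, 70, 69, 70, 68]    # mean of the remaining vowels (All)

t, df = paired_t(primary, others)
print(f"t({df}) = {t:.2f}")
```

With 4 degrees of freedom, the two-tailed critical value at p < .05 is 2.776, so a statistic of this size would count as a significant P vs. All contrast in this toy sample.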
4.0 Results and Discussion
4.1 Quantitative Observation of Stress-Shifts in NES Level 1 [+cyclic] Word Productions
NES produced primary-stressed vowels with higher F0, greater intensity, and
longer duration than the average of the rest of the vowels combined, as indicated by the
ratios greater than 1 in Table 1.
Language Group   F0     SD     Intensity   SD     Duration   SD
Arabic           1.07   0.30   1.05        0.06   1.29       0.70
English          1.13   0.34   1.06        0.04   1.61       0.63
Mandarin         1.08   0.18   1.04        0.05   1.57       0.58
Table 1: Mean Relative Stress Ratios of Primary [+tonic acc.] to Non-Primary [-tonic acc.] Vowels
for the Three Acoustic Correlates by Language Group
Paired sample t-tests revealed that, for each token, at least one correlate had a
significantly different value in the primary-stressed vowel than in the mean of all the
other vowels combined. Notwithstanding, the pentasyllabic token <historicity> was
idiosyncratic as it was only significantly different (negatively) with regards to duration.
Since primary vowels were prominent due to the contrast in one or more acoustic
features, we can conclude that stress-shifts in Level 1 [+cyclic] derivations can be
observed quantitatively, at least in words with fewer than five syllables. Table 2
summarizes the significant findings from all the P vs. All paired-sample t-tests for all
three language groups, so that the reader may have a better overall picture of the ways in
which lexical stress manifests itself in the speech of the different L1 groups.

Token           Language Group   F0   Intensity   Duration
<history>       English          ✗    ✓           ✓ (negative)
                Arabic           ✗    ✗           ✗
                Mandarin         ✓    ✓           ✗
<historial>     English          ✓    ✓           ✓
                Arabic           ✓    ✓           ✓
                Mandarin         ?    ✓           ✓
<historian>     English          ✗    ✓           ✓
                Arabic           ✓    ✓           ✓
                Mandarin         ✓    ✓           ✓
<historic>      English          ✗    ✓           ✓
                Arabic           ✗    ✓           ✗
                Mandarin         ✗    ✗           ✓
<historical>    English          ✓    ✓           ✓
                Arabic           ✗    ✓           ✗
                Mandarin         ✗    ✓           ✓
<historicity>   English          ✗    ✗           ✓ (negative)
                Arabic           ✗    ✓           ✓ (negative)
                Mandarin         ✗    ✓           ✗
<historify>     English          ✗    ✓           ✓ (negative)
                Arabic           ✗    ✗           ✓ (negative)
                Mandarin         ✓    ✓           ✗
<historious>    English          ✗    ✓           ✓
                Arabic           ✗    ✓           ✓
                Mandarin         ✓    ✓           ✓
Table 2: Most Salient Acoustic Correlates of Primary Stressed [+tonic] Vowels in Level 1 [+cyclic]
Derivations per Language Group
✓ = salient acoustic correlate   ✗ = non-salient acoustic correlate
? = borderline saliency based on a non-significant difference between P and All but a large effect size
4.2 Salience Hierarchy of the Acoustic Correlates of Stress (SHACS)

Paired sample t-tests on the P vs. All analyses of MWAE NES productions (Table
2) provide a more convincing description of SHACS than the mean vocalic relative stress
ratios alone. Contrary to many of the leading theories on
SHACS (e.g., Fry, 1955, 1958; Adams & Munro, 1978), the results from this study
suggest that intensity is at least the most reliable, if not the most salient, cue to ELS.
Intensity of the primary-stressed [+tonic, +stress] vowel was statistically greater than the
mean intensity of the non-primary [-tonic] vowels in all eight tokens. Based on
frequency of significant differences, duration was the next most salient cue to ELS as it was also
statistically different in P vs. All in all the tokens, albeit with a shorter duration in three of
the tokens. In sum, SHACS in Level 1 [+cyclic] derivations in MWAE appears to be:
intensity > duration > F0

4.3 Nonnative Ordering of the Acoustic Correlates of Stress

On average, both Arabic L1 NNES and Mandarin L1 NNES produced primary-stressed
vowels with higher F0, greater intensity, and longer duration, albeit with smaller
vocalic relative stress ratios for each of the acoustic correlates than NES (Table 1). These
ratios were in roughly the same proportions as for NES; i.e., duration > F0 > intensity.
However, since the ratios for each acoustic cue are derived from measurements in different
units (i.e., Hertz, decibels, and milliseconds) which are not calibrated to be perceptually equivalent
or directly comparable, one cannot assert that this is the SHACS for these speakers. Thus,
further statistical tests were conducted for both NNES groups.
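The ordering duration > F0 > intensity can be read directly off Table 1, as the sketch below shows. The dictionary simply re-enters the Table 1 means; the helper name `order_by_ratio` is introduced here for illustration. As the paragraph above cautions, sorting raw ratios across cues measured on different scales does not by itself establish a salience hierarchy.

```python
# Mean relative stress ratios (primary vs. non-primary vowels) from Table 1.
TABLE_1 = {
    "Arabic":   {"F0": 1.07, "intensity": 1.05, "duration": 1.29},
    "English":  {"F0": 1.13, "intensity": 1.06, "duration": 1.61},
    "Mandarin": {"F0": 1.08, "intensity": 1.04, "duration": 1.57},
}


def order_by_ratio(ratios):
    """Return cue names sorted from largest to smallest mean ratio."""
    return sorted(ratios, key=ratios.get, reverse=True)


orderings = {group: order_by_ratio(r) for group, r in TABLE_1.items()}
for group, cues in orderings.items():
    print(f"{group}: {' > '.join(cues)}")
```

All three groups come out in the same order, which is why the per-token significance tests in Table 2 are needed to tell the groups apart.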
4.31 Mandarin L2 English

One-way ANOVA tests showed that there were significant differences between
the language groups with regards to intensity as a prosodic cue, F(2, 481) = 5.25, p < .01.
The Tukey HSD post hoc comparison revealed that Mandarin L1 NNES (M = 1.04, SD =
.05) were statistically different from NES (M = 1.06, SD = .04) as they used a
significantly lower ratio of intensity in stress contrasts (Figure 3). Notwithstanding,
intensity was still the most reliable cue to stress according to the P vs. All t-tests (Table
2).
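For readers less familiar with the F(2, 481) notation: in a standard one-way ANOVA the first value is the number of groups minus one and the second is the number of observations minus the number of groups, so these degrees of freedom correspond to three groups and 484 ratio observations. The sketch below computes the F statistic from scratch on invented toy ratios; it is a generic textbook formulation with a hypothetical function name, not the software used in the study, and it does not reproduce the Tukey HSD step.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented intensity ratios for three small groups (not the study's data).
toy_groups = [
    [1.1, 1.2, 1.3],
    [1.2, 1.3, 1.4],
    [1.5, 1.6, 1.7],
]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```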
Figure 3: Comparative Usage of Intensity as an Acoustic Cue to ELS in Level 1 [+cyclic] Derivations
Paired sample t-tests also revealed that Mandarin English L2 speakers deviated more
from the NES norm in that they tended to use F0 as a salient acoustic cue to ELS more
often (Table 2). Thus, the results seem to concur with earlier studies (e.g., Zhang, Nissen,
& Francis, 2008; Keating and Kuo). In general, SHACS for Mandarin English L2
speakers appears to be:
intensity > F0 ≥ duration

4.32 Arabic L2 English

One-way ANOVA tests showed that there were significant differences between
the language groups with regards to duration as a prosodic cue, F(2, 481) = 11.90, p <
.01. The Tukey HSD post hoc comparison revealed that Arabic L1 NNES (M = 1.29, SD
= .70) were statistically different from both NES (M = 1.61, SD = .63) and Mandarin L1
NNES (M = 1.57, SD = .58) in their usage of durational stress contrasts (Figure 4); i.e.,
their relative vocalic stress ratios were much smaller.
Figure 4: Comparative Usage of Duration as an Acoustic Cue to ELS in Level 1 [+cyclic] Derivations
Paired sample t-tests revealed that durational contrasts between P and All were not as
numerous as they were for NES although Arabic English L2 speakers also produced
negative duration ratios for <historicity> and <historify> (Table 2). In sum, SHACS for
Arabic L1 NNES appears to be:
intensity >> duration > F0
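One simple way to inspect the "frequency of significance" reasoning behind these orderings is to tally, for each language group, how many of the eight tokens show each cue as salient in Table 2. The sketch below re-encodes Table 2 by hand and counts only contrasts in the expected direction; that counting rule, which treats negative duration contrasts and the borderline "?" cell as not salient, is an assumption made here for illustration and may not match the weighting the author applied.

```python
# Table 2 re-encoded by hand. Each string gives F0, intensity, duration in order:
# "+" salient, "-" non-salient, "n" salient but negative, "?" borderline.
TABLE_2 = {
    "history":     {"English": "-+n", "Arabic": "---", "Mandarin": "++-"},
    "historial":   {"English": "+++", "Arabic": "+++", "Mandarin": "?++"},
    "historian":   {"English": "-++", "Arabic": "+++", "Mandarin": "+++"},
    "historic":    {"English": "-++", "Arabic": "-+-", "Mandarin": "--+"},
    "historical":  {"English": "+++", "Arabic": "-+-", "Mandarin": "-++"},
    "historicity": {"English": "--n", "Arabic": "-+n", "Mandarin": "-+-"},
    "historify":   {"English": "-+n", "Arabic": "--n", "Mandarin": "++-"},
    "historious":  {"English": "-++", "Arabic": "-++", "Mandarin": "+++"},
}
CUES = ("F0", "intensity", "duration")


def tally_salient(group):
    """Count tokens in which each cue is salient in the expected direction."""
    counts = {cue: 0 for cue in CUES}
    for marks_by_group in TABLE_2.values():
        for cue, mark in zip(CUES, marks_by_group[group]):
            if mark == "+":          # assumption: "n" and "?" are not counted
                counts[cue] += 1
    return counts


tallies = {g: tally_salient(g) for g in ("English", "Arabic", "Mandarin")}
for group, counts in tallies.items():
    print(group, counts)
```

Under this rule, intensity is the most frequently salient cue in all three groups. The tally is only one input, however: the orderings proposed in the text also draw on the ANOVA comparisons and on the borderline case.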
4.4 Problematic Acoustic Correlates for Arabic L1 and Mandarin L1 Speakers
4.41 Fundamental frequency

Paired sample t-tests of P vs. All indicated that Chinese subjects tend to use more
pitch contrasts than NES to differentiate ELS, which almost certainly leads to
accentedness (Table 2). However, one-way ANOVA found that neither Arabic L1 NNES
(M = 1.07, SD = .30) nor Mandarin L1 NNES (M = 1.08, SD = .18) were statistically
different from NES (M = 1.13, SD = .34) with regard to F0 usage, although the p value
was close to being significant, F(2, 481) = 2.80, p = .06. Thus, differences in
relative vocalic stress ratios of F0 production in P vs. All were very small; that is,
Mandarin L1 speakers use a slightly smaller F0 range than NES, which seems to confirm
the findings of Li and Shuai (2011).
4.42 Intensity
As reported in 4.31, one-way ANOVAs revealed that Mandarin L1 NNES
significantly under-use intensity contrasts between P and All to emphasize primary stress.
This was somewhat unexpected based on previous studies of Mandarin English L2. Still,
paired sample t-tests revealed that intensity was significantly different in P vs. All in all
but one of the tokens (Table 2). Thus, it seems that although intensity is an important
acoustic cue for Mandarin L1 NNES, they still do not employ it in a native-like manner.
4.43 Duration
As reported in 4.32, one-way ANOVAs revealed that Arabic L1 NNES
significantly under-use durational contrasts between P and All to emphasize primary
stress. This may be due to the fact that Arabic English L2 speakers tend not to reduce
vowels as suggested by Zuraiq and Sereno (2005). Not reducing the unstressed vowels in
a token would certainly result in smaller durational contrasts between the primary
[+tonic] and non-primary [-tonic] vowels. The ANOVA results also support
Bouchhioua’s (2008) study which found that duration is not an important correlate of
lexical stress in Tunisian Arabic as it is in English and negative transfer may lead to non-
native accentedness, if not unintelligibility.
4.44 Summary of distinct features of nonnative stress production

To recapitulate the findings in 4.4, Arabic L1 NNES use significantly smaller
durational contrasts to denote primary stress, perhaps caused by a tendency to not reduce
unstressed vowels in Level 1 [+cyclic] derivations. Conversely, Mandarin L1 NNES use
F0 more often than NES as a salient marker of primary stress while significantly under-
using intensity contrasts. The author hypothesizes that these features of Arabic and
Mandarin accented English are direct results of transfer from the predictable stress and
tonal L1 sound systems, respectively.
4.5 Correlations between Amount of L2 Exposure,
Amount of L2 Input and L2 Proficiency
4.51 Accurate placement of stress in Level 1 [+cyclic] derivations

Figure 5 shows the percentage of correct responses by token and language group.
Also, as one might expect, <historicity> caused the most problems.

[Bar chart: % of correct responses (0% to 100%) for each token, by language group (Arabic, English, Mandarin)]
Figure 5: Proportion of Correct Responses by Token and Language Group
In fact, NES were no better at accurately producing this word than Mandarin speakers.
However, it is important to note that for the two nonsense words (i.e., <historious> and
<historial>), the NES performed much better. It is the researcher’s contention that
although these are not real words, NES were able to use the stress-shifting rules that are
stored in the lexicon. Nevertheless, the smooth curve in Figure 6 shows that a significant
correlation was found (r = -.26, p < .05) between years of English language study and
frequency of errors. Therefore, the longer learners of English have spent studying the
language (i.e., increased L2 input), the fewer pronunciation and stress-placement errors
they make in stress-suffixed words.
Figure 6: Frequency of Pronunciation and Stress-Placement Errors vs. Years of L2 Study
From a psychological viewpoint, this correlation may relate to how many years of
English L2 education subjects feel they have received. When L2 exposure (i.e., years of
residence in L2 country) was plotted against number of errors, it did not yield significant
correlations. However, the author posits that the study did not have a large enough range
for this variable. As expected, Arabic and Mandarin L1 NNES produce fewer errors the
higher the level of their English L2 proficiency with a significance of p < 0.00 and a
medium effect size of r = -.05 (Figure 7).
Figure 7: Frequency of Pronunciation and Stress-Placement Errors vs. English Proficiency Level
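The Pearson product-moment correlations reported in this section can be illustrated with a small from-scratch computation. The function name `pearson_r` and the data are invented for illustration and are much cleaner than real survey data would be, so the resulting coefficient is far stronger than the values reported above.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Invented data: years of L2 English study vs. number of errors per speaker.
years_of_study = [2, 4, 6, 8, 10]
error_counts = [5, 4, 4, 2, 1]

r = pearson_r(years_of_study, error_counts)
print(f"r = {r:.2f}")
```

A negative coefficient, as here, means that errors decrease as years of study increase, which is the direction of the relationships reported in Figures 6 and 7.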
4.52 Native-like production of the acoustic correlates
of stress in Level 1 [+cyclic] derivations
Correlations did not find any significant relationship between any of the
independent variables and native-like production of intensity. However, based on the
paired sample t-tests, it is suggested that intensity is already employed as a prosodic
correlate in almost every token by both groups of NNES. Perhaps this is why previous
studies have claimed that intensity is the least important cue to ELS; i.e., because it is not
hard for NNES to manipulate, it is not a determiner of accentedness, comprehensibility,
or intelligibility. Nevertheless, F0 and duration did yield interesting distinctions.
4.53 Mandarin L2 English

Figure 8 represents a statistically significant correlation (p < 0.05) and a moderate
effect size (r = -.5) between amount of L2 input (i.e., years of English L2 study) and
Mandarin L1 NNES’ ability to produce native-like F0 contours in lexical stress contrasts.

Figure 8: Difference of Mean Mandarin L1 Speaker Ratio of F0 from Mean Native Speaker Ratio of F0
vs. Years of L2 English Study
Figure 9 shows significant correlations were also found when native-like production of
F0 was correlated with English L2 proficiency level, albeit with a very small effect size (r
= -0.0, p < 0.05).
Figure 9: Difference of Mean Mandarin L1 Speaker Ratio of F0 from Mean Native Speaker Ratio of
F0 vs. English Proficiency Level
Encouragingly, the results suggest that Chinese learners of English are able to overcome
their innate difficulties when producing F0 as an acoustic cue to ELS through increased
L2 study.
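The quantity plotted in Figures 8 and 9, the difference between a speaker's mean ratio and the native-speaker mean ratio, can be sketched as a simple subtraction. The sketch assumes a signed difference taken against the English means in Table 1; whether the study used signed or absolute differences is not stated, and the function name is hypothetical.

```python
# Native-speaker (MWAE) mean ratios from the English row of Table 1.
NES_NORM = {"F0": 1.13, "intensity": 1.06, "duration": 1.61}


def deviation_from_norm(speaker_mean_ratio, cue):
    """Signed difference between a speaker's mean ratio and the NES norm.

    Values closer to zero indicate more native-like use of the cue.
    """
    return speaker_mean_ratio - NES_NORM[cue]


# Using the Mandarin group mean for F0 from Table 1 as an example input.
mandarin_f0_deviation = deviation_from_norm(1.08, "F0")
print(f"{mandarin_f0_deviation:+.2f}")
```

Correlating such per-speaker deviations with years of study or proficiency level is what yields the coefficients reported in this section.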
4.54 Arabic L2 English

Correlations of production accuracy of duration with amount of L2 study and with
proficiency level both reached significance, at approximately r = -.3, p < 0.05 in each case
(Figures 10-11).

Figure 10: Difference of Mean Arabic L1 Speaker Ratio of Duration from Mean Native Speaker Ratio of
Duration vs. Years of L2 English Study
Figure 11: Difference of Mean Arabic L1 speaker Ratio of Duration from Mean Native Speaker Ratio of
Duration vs. English Proficiency Level
Comparing these results with those from the ANOVA (Figure 4), it is proposed that
through increased acquisition of the English language, Saudi learners are also able to
overcome the detrimental effects of negative transfer from their L1 sound system.
4.6 Palatal glide epenthesis in the stress-shifting suffixes: <ian>, <ious>, and <ial>
The researcher identified a surprisingly high incidence of /j/ epenthesis in the
tokens <historian>, <historious>, and <historial> (Figure 12).

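[Bar chart: number of productions of the dictionary pronunciation vs. the variant with an epenthetic palatal glide for each token: historial [hɪstɔ́riəl] vs. [hɪstɔ́rijəl]; historian [hɪstɔ́riən] vs. [hɪstɔ́rijɛn]; historious [hɪstɔ́riəs] vs. [hɪstɔ́rijəs]]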
Figure 12: Proportion of Standard American English Dictionary Pronunciations
vs. Proportion of Alternative Pronunciations with Epenthetic Palatal Glide by MWAE NES
The results suggest that [hɪˈstɔr i jən] be included as an alternative pronunciation to
[hɪˈstɔr i ən] in standard American English dictionaries since almost half of the MWAE
NES subjects pronounced it this way (Figures 13-14). The author concludes that American
English dictionary pronunciation entries for words containing the Level 1 [+cyclic] suffixes
<-ian>, <-ial>, and <-ious> may need to be revised.
Figure 13: <Historian> Produced by a Midwest American Male
in Accordance with the Standard Dictionary Pronunciation, [hɪstɔ́riən], without an Epenthetic /j/
Figure 14: <Historian> Produced by a Midwest American Male as [hɪstɔ́rijən]
with an Epenthetic /j/ in the Onset of the Ultima
4.7 Lack of evidence for “tonic accent shift” in
polysyllabic Level 1 [+cyclic] derivations
Paired sample t-tests failed to support Ladefoged and Johnson’s (2010) notion
that tonic accent shift (i.e., the differentiator between primary (P) and secondary-stressed
(S) syllables) is caused by a “major pitch change” (p. 119) in the primary-stressed vowel,
at least with respect to polysyllabic Level 1 [+cyclic] derivations. On the contrary, the results
reveal that intensity and duration are the relevant prosodic correlates in assigning primary
status to a vowel. Even in the disyllabic stem word, <history>, these two correlates were
the most salient. Table 3 provides an overview of the relevant acoustic correlates for each
token with respect to tonic accent shift (i.e., primary vs. secondary stress). Clearly, more
studies on the features of tonic accent shift in polysyllabic words should be conducted to
determine whether the theory is tenable.
Token           F0   Intensity   Duration
<history>       ✗    ✓           ✓
<historial>     ✗    ✓           ✓
<historian>     ✗    ✓           ✓
<historic>      ✗    ✓           ✓
<historical>    ✓    ✓           ✓
<historicity>   ✗    ✗           ✗
<historify>     ✗    ✓           ✓
<historious>    ✗    ✓           ✓
Table 3: Most Salient Acoustic Correlates of Tonic Accent Shift
in Primary-Stressed Vowels [+tonic, +stress]
✓ = salient acoustic correlate   ✗ = non-salient acoustic correlate
5.0 Conclusion
The results presented in this paper yield insight into several interdependent issues
related to lexical stress in polysyllabic English words containing stress-shifting suffixes.
First and foremost, this study provides support to the view that the acoustic correlates of
stress do indeed have a hierarchy of relative salience, here named SHACS. The
SHACS proposed here is intensity > duration > fundamental frequency (F0). While this
does not exactly match any of the schemes described in the literature, it most closely
resembles the SHACS postulated by Beckman and Edwards (1994): duration > intensity
> F0. Most likely, SHACS is context dependent. For instance, Fry’s (1955, 1958) notion
of SHACS (i.e., F0 > duration > intensity) may only be relevant to disyllabic
homographs, while intensity may only be the most salient acoustic cue in
three- and four-syllable words. Clearly, more studies on English lexical stress (ELS) in a wide range of
polysyllabic words are needed to validate this hypothesis. What is certain though is that
relative vocalic stress ratios of the three acoustic cues play an important role in
differentiating lexical stress patterns, and there does appear to be a native-norm for
ordering these acoustic signals. Indeed, the various significant correlations described in
this paper support this notion.
From a second language acquisition perspective, there is good evidence to suggest that
native-like command of the acoustic correlates is attainable for English language learners.
Although speakers with different inherent L1 sound systems encounter different problems
when trying to acquire native-like stress production, it appears that they can
overcome these difficulties through increased input of the L2. Not only do experienced
English language learners produce fewer pronunciation errors, they also produce prosodic
contrasts in a more native-like manner. For instance, although Saudi speakers inherently
under-use duration as an acoustic cue to ELS, perhaps by not fully reducing vowels as a
result of L1 transfer from the predictable stress system as suggested by other researchers
(Zuraiq & Sereno, 2005; Altmann, 2006; Bouchhioua, 2008), they are able to use this
acoustic correlate more accurately as their language skills progress. Similarly, Chinese
learners of English are able to overcome the negative transfer of their tonal system by
producing pitch in a more native-like manner as they advance in their studies.
Furthermore, the present results do not support Ladefoged and Johnson’s (2010) theory
of tonic accent shift. Instead, the results actually suggest the opposite; that is, intensity
and duration appear to be responsible for contrasts between primary and secondary
stressed vowels. It will be interesting to observe whether these findings can be replicated,
and whether the variables of token length (disyllabic vs. polysyllabic words) and/or token
delivery method (read utterances vs. natural speech) have any effect.
Finally, with regard to pronunciation, this investigation provides convincing evidence
for a possible revision of standard American English dictionary IPA transcriptions. At
least for the words in this study, the suffixes <-ial>, <-ian>, and <-ious> were more
commonly pronounced by native and nonnative speakers with the epenthetic insertion
of a palatal glide, [j]. Thus, future studies should explore the presence or absence of this
phenomenon in other dialects of English, both native and nonnative.
ABOUT THE AUTHOR

Paul Keyworth graduated from the MA TESL/Applied Linguistics program at SCSU in
March 2014. This paper is an abridged version of his Master’s thesis project. Paul has
spent the past 12 years teaching EFL/ESL to children and adults of all proficiencies in
South Korea, Singapore, the UK, and the US. He earned a B.Sc. degree from the
University of Kent at Canterbury in Molecular and Cellular Biology in 2001 and attained
the University of Cambridge Certificate in English Language Teaching to Adults
(CELTA) in 2008. His area of research interest is the interface between laboratory
phonology and the sociophonetic aspects of second language acquisition. He intends to
pursue a Ph.D. in the field of acoustic phonetics and speech communication.
E-mail: keyworth.paul@gmail.com LinkedIn: www.linkedin.com/in/keyworth/
Thesis Committee: Dr. Ettien Koffi (Chair), Dr. Michael Schwartz, and Dr. Monica
Devers.
Recommendation: This MA Thesis summary was recommended for publication by
Professor Ettien Koffi, Ph.D., Linguistics Department, St. Cloud State University, St.
Cloud, MN. Email: enkoffi@stcloudstate.edu
Acknowledgements
The study reported here was supported in part by the Student Research Fund, an internal
grant awarded by the Office of Sponsored Programs at Saint Cloud State University. All
copyright belongs to the Acoustical Society of America (2014).
References
Adams, C., & Munro, R. (1978). In search of the acoustic correlates of stress: fundamental
frequency, amplitude, and duration in the connected utterance of some native and
non-native speakers of English. Phonetica, 35, 125-156.
Altmann, H. (2006). The Perception and Production of Second Language Stress: A Cross-
Linguistic Experimental Study. Dissertation Abstracts International, Section A: The
Humanities And Social Sciences, 67(6), 2133-2134. (2006753642)
Beckman, M. E., & Edwards, J. (1994). Articulatory evidence for differentiating stress
categories. In P. Keating (Ed.), Phonological structure and phonetic form. Papers in
Laboratory Phonology III (pp. 7-33). Cambridge: Cambridge University Press.
Boersma, P., & Weenink, D. (2013). Praat: Doing Phonetics by Computer (Version 5.3.32)
[computer program]. http://www.fon.hum.uva.nl/praat/. Amsterdam: University
of Amsterdam.
Bouchhioua, N. (2008). Duration as a cue to stress and accent in Tunisian Arabic, native
English, and L2 English. Proceedings from Speech Prosody 2008: The Fourth
International Conference on Speech Prosody. (pp. 535-538). Campinas: Brazil.
Carroll, J.B., Davies, P., & Richman, B. (1971). The American heritage word frequency
book. New York: Houghton Mifflin.
Celce-Murcia, M., Brinton, D., & Goodwin, J. M. (1996). Teaching pronunciation: A
reference for teachers of English to speakers of other languages. Cambridge:
Cambridge University Press.
Cowart, W. (1997). Experimental syntax. Thousand Oaks, CA: Sage.
de Jong, K., & Zawaydeh, B. A. (1999). Stress, duration, and intonation in Arabic word-
level prosody. Journal of Phonetics, 27, 3-22.
Flege, J. E., & Bohn, O. S. (1989). An instrumental study of vowel reduction and stress
placement in Spanish-accented English. Studies in Second Language Acquisition,
11, 35-62.
Fry, D. B. (1955). Duration and intensity as physical correlates of linguistic stress. Journal
of the Acoustical Society of America, 27, 765-768.
Fry, D. B. (1958). Experiments in the perception of stress. Language and Speech, 1(2),
126-152.
Halle, M., & Kenstowicz, M. (1991). The free element condition and cyclic versus
noncyclic stress. Linguistic Inquiry, 22, 457-501.
Keating, P., & Kuo, G. (2012). Comparison of speaking fundamental frequency in English
and Mandarin. Journal of the Acoustical Society of America, 132, 1050-1060.
Kiparsky, P. (1982). From cyclic phonology to lexical phonology. In H. van der Hulst & N.
Smith (Eds.), The Structure of Phonological Representations I (pp. 131-175).
Dordrecht: Foris.
Kreidler, C. W. (2004). The pronunciation of English: A course book, 2nd edition. Malden,
MA: Blackwell.
Ladefoged, P. (2001). A course in phonetics, 4th edition. New York: Thomson-Wadsworth.
Ladefoged, P. (2003). Phonetic data analysis: An introduction to fieldwork and
instrumental techniques. Malden, MA: Blackwell.
Ladefoged, P., & Johnson, K. (2010). A course in phonetics, 6th edition. New York:
Thomson-Wadsworth.
Lee, S., & Cho, M. (2011). An acoustic analysis of stressed and unstressed vowels in
English nouns: A cross-language case study. Korean Journal of English Language
and Linguistics, 17, 61-88.
Li, B., & Shuai, L. (2011). Suprasegmental features of Chinese-accented English. Journal
of the Acoustical Society of America, 129, 2453.
Lieberman, P. (1960). Some acoustic correlates of word stress in American English.
Journal of the Acoustical Society of America, 32, 451-454.
Maeda, S. (1976). A characterization of American English intonation (Doctoral
dissertation). MIT: Cambridge, MA. Retrieved from
http://hdl.handle.net/1721.1/29189. (edsoai.654895205)
Post, B., & Nolan, F. (2012). Data collection for prosodic analysis of continuous speech
and dialectal variation. In A. C. Cohn, C. Fougeron, & M. K. Huffman (Eds.),
The Oxford handbook of laboratory phonology (pp. 538-547). New York: Oxford
University Press.
Ryan, K. (2005). Grid-maker.praat [Praat script]. Available from
http://www.linguistics.ucla.edu/faciliti/facilities/acoustic/praat.html
Yoon, T. (2008). Stress analysis script [Praat script]. Available from
http://web.uvic.ca/~tyoon/resource/vq.praat
Zhang, Y., Nissen, S. L., & Francis, A. L. (2008). Acoustic characteristics of English lexical
stress produced by Mandarin speakers. Journal of the Acoustical Society of
America, 123(6), 4498-4513.
Zuraiq, W. (2006). The Production of Lexical Stress by Native Speakers of Arabic and
English and by Arab Learners of English. Dissertation Abstracts International,
Section A: The Humanities and Social Sciences, 66(12), 4375. (2006751445)

