This article appeared in a journal published by Elsevier.

Author's personal copy
The Persian pitch accent and its retention after the focus
Vahideh Abolhasanizadeh a,b, Mahmood Bijankhan c, Carlos Gussenhoven b,d,*
a Department of English Language and Literature, Shahid Bahonar University of Kerman, Iran
b Department of Linguistics, Radboud University Nijmegen, The Netherlands
c Department of Linguistics, University of Tehran, Iran
d School of Languages, Linguistics and Film, Queen Mary University of London, UK
Received 3 February 2012; received in revised form 4 June 2012; accepted 5 June 2012
Available online 6 July 2012
Abstract
Persian words have prominence on the last syllable. Right-edge clitics fall outside this word domain, and segmentally identical words
and word-plus-clitic combinations therefore contrast for the location of the prominence. Two experiments were conducted to answer two
questions. A production experiment addressed the question whether any phonetic cues other than f0 signal this prominence contrast. We
found small phonetic differences between members of minimal pairs outside the more evident f0 differences, but attribute these to side
effects of pitch accent placement. The second question was whether post-focal words undergo deaccentuation, as evidenced by
neutralization of the contrast between post-focal words and word-plus-clitic combinations. Both the production experiment and a
perception experiment showed that there is Post Focus Compression, since pitch excursions in the post-focal speech were considerably
reduced, both in interrogative and in declarative utterances, as compared to other positions in the sentence. However, no neutralization
occurred. We tentatively conclude that Persian word prominences are pitch accents and that words are not deaccented when the pitch
range is reduced after the focus.
© 2012 Elsevier B.V. All rights reserved.
Keywords: Clitic group; Phonological word; Prosodic hierarchy; Focus; Pitch range
1. Introduction
Persian sentence prosody has been described as involving accentual phrases which have a single intonational pitch
accent on a stressed syllable (Mahjani, 2003). After the focus constituent, deaccentuation has been claimed to occur
(Sadat Tehrani, 2007). In this contribution, we address two issues in the word and sentence prosody of Persian. The first is
the phonological and phonetic status of the Persian word prominence. The question here is whether the prominence is
typologically like West Germanic or Catalan stress, with multiple phonetic parameters conspiring to create it, or a pitch
accent that is signaled only through fundamental frequency (f0). Second, we are interested in knowing whether the word
prominence disappears after the focus constituent, to the extent that minimal stress pairs become homophonous.
Lingua 122 (2012) 1380--1394
* Corresponding author at: Afdeling Taalwetenschap, Radboud Universiteit Nijmegen, Postbus 9103, 6500 HD Nijmegen, The Netherlands.
Tel.: +31 0243612839/237240; fax: +31 0627205464.
E-mail address: c.gussenhoven@let.ru.nl (C. Gussenhoven).
0024-3841/$ -- see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.lingua.2012.06.002
1.1. Persian stress
Persian word prominence has generally been described as the assignment of stress to the final syllables of nouns,
adjectives, most adverbs and unprefixed verbs (Ferguson, 1957; Lazard, 1957; Samei, 1996). Prefixed verbs take stress
on the prefix. Kahnemuyipour (2003) argued that the uniformity in stress placement in nouns and its variability in verbs
follows from a morphological difference between these word types and the resulting difference in the way they map onto
prosodic structures. Specifically, prefixes are separate phonological words in his analysis, and a phrase-level stress rule
puts the stress on the final syllable of the initial phonological word in a phonological phrase.
While the assignment of stress thus follows transparently from the morphological (or prosodic) structure, the issue
addressed here is the interpretation of the term stress in these and other descriptions of Persian word prosody. In
general, word-level prominent syllables have variably been characterized as having accent or stress. The distinction
between these has been sought in the extent to which the prominent syllable is phonetically cued by pitch features alone
or, alternatively, whether other phonetic cues like duration and spectral properties are consistently present. Beckman
(1986) termed these prominences non-stress accent and stress accent, respectively. A different perspective is obtained
when distributional and phonological properties are taken into account. Stress has been characterized as being
obligatory, meaning that, not counting function words that cannot be citation utterances, every word has a stressed
syllable, and culminative, meaning that there is one most prominent syllable in the word (Hyman, 2006). By contrast, an
accented syllable need not be present on every word, allowing the existence of unaccented words, either due to
deaccentuation of lexically accented words or to the fact that words can be lexically unaccented and not acquire accents in
the sentence prosody. Hyman (2006) discusses cases of obligatoriness and culminativity where there is no evidence of
phonetic stress, i.e. when the prominence is not a stress accent, in Beckman's (1986) sense. Some of these are ruled out
as stress systems on the basis of the location of the accent. If that is a mora, as in Somali, the prominence is not stress,
since stress is a property of syllables (Hayes, 1995; Hyman, 2006). Nubi represents a case of a culminative and obligatory
system where the prominent element is the syllable and prominent syllables are not systematically differentiated by
durational or spectral properties from non-prominent syllables (Hyman, 2006; Gussenhoven, 2006). The historical
explanation here is that Nubi is a creolized form of Arabic in which the Arabic stress locations have been interpreted as
H-toned syllables by speakers of East African tone languages (Wellens, 2005). However, there are likely to be more cases of
phonological stress that are not signaled by phonetic stress, but by f0 only. Levi (2005) presents phonetic data on Turkish
which make her conclude that this language has a pitch accent, not (phonetic) stress.
In line with the recent emphasis on language diversity, we present evidence that the word prominence of Persian is
both obligatory and culminative in the sense of Hyman (2006), while also being a non-stress accent in the sense of
Beckman (1986). In current usage, it will be argued to be pitch accent, a concept and term that was introduced by
Bolinger (1958) in reference to the tonal component in accented syllables in English. In autosegmental phonology, it is the
term for any tonal melody that is associated with an accented syllable, whether that syllable is stressed, as it is in English
(Bolinger, 1958; Pierrehumbert, 1980) and Jordanian Arabic (De Jong and Zawaydeh, 1999), or lexically determined, as it
is in Japanese (Pierrehumbert and Beckman, 1988; Kubozono, 1993) and which is not analyzable as a boundary tone.¹
This point we will try to make on the basis of Experiment I.
¹ We use the term pitch accent in this meaning only. In particular, we do not mean to refer to any distributional or other criterion that might be
assumed to allow a meaningful classification of a pitch accent language (Hyman, 2009). In the meaning in which we use the term, that of tones that are
systematically present in some syllable or mora and which cannot be analyzed as boundary tones, English, Japanese, Jordanian Arabic, Nubi,
Turkish and Somali all have pitch accents. While making clear which meaning we intend, we use the term stress both in the sense of phonetic
stress, i.e. phonetically enhanced duration and spectral measures as occurring in, e.g. English, and in the sense of culminative obligatory word
prominence as occurring in English, Nubi, Turkish and, as we will argue, Persian. An issue that is not always given the credit it deserves is whether
an accentual analysis is to be preferred over an analysis with underlyingly linked tones, which will depend on the existence of generalizations
about the location of the word prominence that abstract away from the tones that are found there (Goldsmith, 1975; Gussenhoven, 2004:37). As
Hyman (2006) stresses, a tonal analysis can in principle always replace a word prosodic accentual analysis, but a tonal analysis can be
cumbersome when there are many generalizations about their permitted locations and the pitch accent consists of more than one tone, as in
Japanese, or when there are more options for the pitch accent, as in Barasana (Gomez-Imbert and Kisseberth, 2000).
1.2. Post-Focus Compression
The second issue addressed by our investigation concerns the phonological status of Post-Focus Compression (PFC)
in Persian. Xu et al. (2012) suggest that the reduction of the pitch range after the focus constituent, as found for instance in
Germanic languages, may in fact be an areal feature covering Europe as well as a northern and central swathe of Asia.
Thus, Beijing Mandarin is a PFC language (as are Japanese, Bengali and Mongolian), but Taiwanese and Taiwanese
Mandarin are not. We will show that Persian is a PFC language, in line with Xu et al.'s hypothesis. The question at issue in
our investigation, however, one that is not considered by Xu et al., is whether PFC involves the removal of the tonal
structure in the post-focal words. This we believe is the case in English. While the noun phrase a Spánish téacher is
distinct from the compound a Spánish teacher in isolation, in a sentence like I've already HEARD that story about the
Spanish teacher, it is no longer possible to tell which structure is used, because after focal heard no pitch accents occur,
leaving Spanish and teacher with unaccented stressed syllables in both sentences (James Sledd, cited from Hill [1962:36]
by Schmerling, 1976:27). By contrast, in Beijing Mandarin, PFC reduces the pitch range without deleting the tones of the
words. Tonal minimal pairs like ma1 'mother' and ma3 'horse' thus remain phonologically distinct if they are used post-focally
(Chen and Guion-Anderson, 2012). Similarly, Bizkaian Basque retains the distinction between accented and
unaccented words under PFC (Elordieta and Hualde, 2003). If Persian has pitch accents without phonetic stress, the
issue arises whether these accents are retained under PFC in a phonetically reduced form or whether instead they are
deleted. If they are deleted, contrasts that rely on a difference in the location of the pitch accent will be neutralized. Our
Experiment II was run to address this issue from a perceptual viewpoint. The results converge so as to allow the
conclusion that Persian does not deaccent after the focus, but retains phonetically reduced pitch accents in post-focal
speech that allow accentual minimal pairs to be disambiguated to a certain extent.
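The difference between compression and deletion can be made concrete by expressing pitch excursions on a logarithmic scale. The following is a minimal sketch, assuming excursions are measured in semitones; the f0 values and the function names are invented for illustration and are not taken from the experiments reported here.

```python
import math


def excursion_semitones(f0_min_hz: float, f0_max_hz: float) -> float:
    """Size of a pitch excursion in semitones (12 semitones per octave)."""
    return 12.0 * math.log2(f0_max_hz / f0_min_hz)


def compression_ratio(post_focal: float, reference: float) -> float:
    """Post-focal excursion expressed as a fraction of a reference excursion."""
    return post_focal / reference


# Hypothetical f0 minima and maxima (Hz) around one accented syllable.
neutral = excursion_semitones(180.0, 240.0)
post_focal = excursion_semitones(180.0, 195.0)
ratio = compression_ratio(post_focal, neutral)
```

On this way of measuring, a ratio well below 1 combined with a post-focal excursion that is still above zero would correspond to a reduced but retained pitch accent, whereas a post-focal excursion of zero would correspond to deaccentuation.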
1.3. Intonation
Persian has been described as having three levels of prosodic hierarchy that are relevant to the intonational structure,
the accentual phrase, the intermediate phrase and the intonational phrase (Mahjani, 2003; Sadat Tehrani, 2007:36). The
word-final syllable has been claimed to be associated with a pitch accent (Eslami and Bijankhan, 2002), but there are
conflicting analyses of its tonal structure. Eslami (2000) posits four pitch accents, H*, L*, L*+H and L+H*, in addition to two
tones marking intermediate phrases, L- and H-, as well as two boundary tones of the intonational phrase, L% and H%. The
meanings of the tonal morphemes given by Eslami (2000), inspired by Hirschberg and Pierrehumbert (1986) and
Pierrehumbert and Hirschberg (1990), are reproduced in (1).
(1) H* new information
L* given information
L+H* contrast
L*+H doubt
H- incompleteness
L- completeness
L% statement
H% question
In contrast to (1), Sadat Tehrani (2007) posits a single pitch accent, L+H*, which has two morpheme alternants, L+H* in
polysyllabic accentual phrases and H* in monosyllabic ones. Another claim by Sadat Tehrani (2007) is that post-focal
words are deaccented, while any internal boundary tones are deleted after the focus. We will evaluate some of the claims
in the literature in section 5.
1.4. The clitic group
Our investigation relies on a contrast between plain words and cliticized words. Combinations of words and clitics have
been described as clitic groups. The exclusion of right-edge clitics from stress or accent assignment was noted by Lazard
(1957:48) and Shaqaqi (1993:46). Bijankhan and Nourbakhsh (2009) make stress the main defining feature of the
phonological word, pointing out that since clitics remain unstressed, they must lie outside the domain of the phonological
word. As an alternative, stress assignment can be described as being morphologically determined. This would mean that
a pitch accent is assigned to the last syllable of lexical category words, and that cliticized words and non-cliticized words
are both phonological words.² Because the surface segmental structures of words and word-clitic combinations are not
systematically different, many examples of minimal pairs can be given, like gol 'flower', which gives [go li] 'one flower', with a
clitic [i], and [gol] (proper name), which has a suffix. We illustrate the systematic nature of accent assignment in (2a, b, c, d),
where (2a) provides two isolated words, (2b) two suffixed words, (2c) two words with a clitic, and (2d) a compound. As these
data show, words and suffixed words have final accented syllables, compounds fail to have an accent on their first
constituent, while clitics are not assigned accent, causing the accent in cliticized words to be on the final syllable of the host.
This latter generalization remains true if a word has two clitics, as in [ket b-i-je] 'of one book'. For convenience, we will refer to
word+clitic combinations as clitic groups, without committing ourselves to the inclusion of this constituent in the prosodic
hierarchy of Persian.
² The fact that the assignment of a pitch accent to final syllables of words skips right-edge clitics does not form the sole motivation for assuming
the existence of a clitic group for Bijankhan and Nourbakhsh (2009). A second motivation is provided by syncope, the deletion of a word-final
vowel before a clitic-initial vowel.
(2) a. ket b book xun house
b. ketb-h books xune-h houses
c. ket b-i one book xun-j-i one house
d. ketbxun library
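The generalizations illustrated in (2) can be restated as a small procedure. This is only a sketch under stated assumptions: the `Form` class, the function names and the romanized syllables (`ke`, `tab`, `xu`, `ne`, and so on) are hypothetical stand-ins for the transcriptions above, and morphemes are kept as separate chunks without modelling resyllabification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Form:
    """A word form: host syllables (stem plus suffixes) and right-edge clitics."""
    host: List[str]
    clitics: List[str] = field(default_factory=list)


def accented_syllable(form: Form) -> int:
    """0-based index of the accented syllable, counted over host + clitics.

    The accent falls on the final syllable of the host; clitics are skipped.
    """
    if not form.host:
        raise ValueError("a form needs at least one host syllable")
    return len(form.host) - 1


def compound(*members: Form) -> Form:
    """Treat a compound as a single host, so only its last syllable is accented."""
    return Form(host=[syll for member in members for syll in member.host])


# Hypothetical romanizations of the items in (2).
book = Form(["ke", "tab"])                  # plain word: final accent
books = Form(["ke", "tab", "ha"])           # suffixed: accent on the suffix
one_book = Form(["ke", "tab"], ["i"])       # cliticized: accent stays on the host
library = compound(Form(["ke", "tab"]), Form(["xu", "ne"]))
```

Adding a second clitic to `one_book` leaves the result unchanged, which mirrors the observation that the generalization holds for words with two clitics.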
1.5. Addressing the research questions
We here report the results of two experiments. Experiment I was a production experiment, the first acoustic
investigation of Persian word prominence, which was undertaken to answer two questions. First, we wanted to determine
whether the word prominent syllable of Persian has phonetic stress in addition to being pitch accented. Second, we
wanted to establish whether Persian has Post-Focus Compression in the words after the focus constituent. Experiment II
was a perception experiment. It was undertaken to investigate the question whether PFC involves the neutralization of the
difference between plain words and cliticized words.
2. Experiment I
The aim of Experiment I was to collect detailed phonetic information about the realization of words and word+clitic
combinations that is representative of the speech of Tehran, so as to enable us to establish the phonetic differences
between them. Since the presence of an H-tone or an L-tone may be accompanied by small and partly systematic
phonetic differences as compared to a toneless syllable (Beckman, 1986; Levi, 2005), we decided to place the
investigation in a wider perspective. Specifically, we expected small and partly systematic phonetic differences that
accompany other structural differences, like segmental distinctions or focus differences. We would like to be able to
evaluate the status of any differences between our stressed and unstressed syllables either as side effects of other
structural options, in this case the presence of a pitch accent, or as intrinsically due to differences in the location of
phonological stress.
For this purpose, in addition to the difference in the location of the word prominence (PW vs CG), we included a
segmental difference in the intervocalic consonant separating the two potential accent positions ([p] vs. [b]), the focus
condition of the target words, and sentence mode (declarative vs. interrogative). The phonetic measures that are
potentially affected by these structural differences include f0, duration, intensity and spectral properties. All of these were
included in our investigation.
2.1. Materials
We composed a corpus of sentences featuring two minimal pairs contrasting a noun (henceforth the word or PW
condition) and a noun+clitic combination (henceforth the clitic group or CG condition). These two pairs of minimal pairs
contrasted only in the voicing of the obstruent in the onset of the second syllable, which in the CG was the last consonant
of the lexical word. These materials were part of a larger corpus testing more conditions. Since no obvious quadruplets
were available in the segmental condition we report here, one of the four target words was a nonsense word. The target
words were [tb] light vs. [t b-e] swing+his/her and [tp] (nonsense word) vs. [t p-e] tank-top+his/her. They were
embedded in declarative and interrogative carrier sentences which varied across three focus conditions, referred to as
neutral (3a), post-focal (3b) and focal (3c). In (3), we show the voiced minimal pair in its declarative embedding
sentences. The total number of sentences was thus 3 (focus conditions) × 2 (word structures) × 2 (voicing conditions)
× 2 (sentence modes) = 24. For the neutral and post-focal carrier sentence we used Un X-e 'That is X', where -e is a clitic.
This makes all target words part of trisyllabic clitic groups that contrast in having the H* on the antepenultimate syllable
(the CG condition) or on the penultimate syllable (the PW condition). By having an accentual phrase-final unaccented
syllable in all cases, we abstract away from local phrase-finality effects on the duration and f0 of the two target syllables.
Condition (3c) differs from (3a,b) in having un that in final position, which allows X to be focused and X-e to be in first
position in the sentence, the focus position.
(3) a. un tb-e un t b-e-e
that light-is that swing-his/her-is
That is light That is his/her swing
b. un tb-e un t b-e-e
THAT is light THAT is his/her swing
c. tb-e un t b-e-e un
That is LIGHT That is his/her SWING
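The 24 sentence types follow from fully crossing the four factors. A minimal sketch of that crossing is given below; the factor labels are taken from the text, while the dictionary keys and the function name are illustrative choices.

```python
from itertools import product

FOCUS = ("neutral", "post-focal", "focal")
WORD_STRUCTURE = ("PW", "CG")
VOICING = ("voiced", "voiceless")
SENTENCE_MODE = ("declarative", "interrogative")


def build_design():
    """Return every combination of the four experimental factors."""
    return [
        {"focus": f, "structure": w, "voicing": v, "mode": m}
        for f, w, v, m in product(FOCUS, WORD_STRUCTURE, VOICING, SENTENCE_MODE)
    ]


design = build_design()  # 3 x 2 x 2 x 2 = 24 sentence types
```

Each focus condition accounts for eight of the 24 cells, and each sentence mode for twelve.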
The sentences were presented to subjects in standard Persian orthography, which uses Arabic letters. Conditions (3a)
and (3b) were distinguished by having bold print for the target word in (3a) and bold print for un in (3b), reproduced here in
the transcription. These twelve sentences were given twice, once with a question mark (؟) and once with a full stop (.) at the
end, in order to elicit both declarative and interrogative intonation contours. Subjects read each sentence twice in a
professional recording studio at the University of Tehran.
2.2. Speakers and recordings
Twelve speakers took part in the experiment, six male and six female. Their ages ranged from 26 to 37 and they were
all educated native speakers recruited from students and staff in the Linguistics Department of the University of Tehran.
Speakers were freely allowed to repeat themselves if they thought they hadn't read a sentence correctly. The two best
versions were selected from the utterances of each sentence by each speaker. In the majority of cases, these were the
only readings produced for the sentence. After inspecting these 576 utterances, we decided to discard 31 of them
because of disfluencies or technical problems, which left us with 545 utterances for analysis. We supplied the means over
speakers for the 5.4% missing utterances.
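The utterance counts can be checked directly, and the treatment of missing data can be sketched as mean substitution. The helper below is one possible reading of "means over speakers" (the mean of the available speakers within the same cell); the speaker labels and duration values are invented for illustration.

```python
from statistics import mean
from typing import Dict, Optional

SPEAKERS = 12
SENTENCES = 24
READINGS_PER_SENTENCE = 2

total = SPEAKERS * SENTENCES * READINGS_PER_SENTENCE   # 576
discarded = 31
analysed = total - discarded                            # 545
missing_pct = round(100 * discarded / total, 1)         # 5.4


def fill_with_speaker_mean(values: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Replace missing (None) entries by the mean of the available speakers."""
    available = [v for v in values.values() if v is not None]
    if not available:
        raise ValueError("no data to average")
    fill = mean(available)
    return {spk: (fill if v is None else v) for spk, v in values.items()}


# Hypothetical durations (ms) of one segment in one condition.
cell = {"s01": 100.0, "s02": 110.0, "s03": None, "s04": 120.0}
filled = fill_with_speaker_mean(cell)
```

Because a substituted value sits exactly at the cell mean, this kind of replacement can only shrink the spread within a cell.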
2.3. Procedure
Utterances were segmented with the help of Praat (Boersma and Weenink, 1992--2009). Instead of establishing only
the start of the closure duration and the end of the stop burst of plosives, the boundary between closure and burst was
included as a segmental boundary, for both voiced and voiceless plosives. In the case of voiced plosives, this meant that
we had burst intervals of zero duration in a number of cases. Initial plosives were only measured for their bursts, since no
reliable indication of the beginning of the closure is available. An example of a TextGrid with wave form is shown in Fig. 1.
We included separate tiers for segments, words and clitic [e].
Subsequently, we averaged all values over the repetitions. Because of the way we supplied averaged values for the
missing data, we have potentially reduced the variation. We adopted a 1% significance level for all analyses, but include
results at 5%, which may be seen as trends.
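The reporting convention just described, which is also the one behind the asterisks in the tables, can be summarized as a small classification rule. The function name is illustrative.

```python
def significance_label(p: float) -> str:
    """Classify a p-value using a 1% criterion, with 5% reported as a trend."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < 0.01:
        return "**"   # significant at the adopted 1% level
    if p < 0.05:
        return "*"    # reported, but treated as a trend only
    return "ns"
```

Under this rule a result such as p = 0.052 is labelled "ns".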
3. Experiment I: results
We report the results for duration, intensity, spectral measures and f0. For duration, we first present the results of
overall analyses of variance in which SEGMENT is included as a 7-level variable in order to identify interactions between
segment durations and any of the four experimental variables. The same procedure is followed for intensity and the
spectral formants (F1, F2 and F3) for the two vowels in the potentially accented syllables, as well as for Centre of Gravity
(COG), with three levels for segment ([t]-burst, [p/b]-burst and [ʃ]). The COG is a measure indicating the mean spectral
frequency over some time span. The measure is particularly useful for segments without well-defined formant structure,
like those with voiceless friction (van Son and Pols, 1999).
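As an illustration of what a mean spectral frequency is, the sketch below computes a power-weighted mean frequency from a short signal. It is not the Praat routine used for the measurements; weighting by power is an assumption, and the naive transform is chosen only to keep the example free of external libraries.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(samples: Sequence[float]) -> List[float]:
    """Power at each non-negative DFT bin (naive O(n^2) transform)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def centre_of_gravity(samples: Sequence[float], sample_rate: float) -> float:
    """Power-weighted mean frequency (Hz) of the samples."""
    power = power_spectrum(samples)
    total = sum(power)
    if total == 0.0:
        raise ValueError("a silent signal has no centre of gravity")
    n = len(samples)
    freqs = [k * sample_rate / n for k in range(len(power))]
    return sum(f * p for f, p in zip(freqs, power)) / total


# Synthetic check: a pure 1000 Hz tone sampled at 8000 Hz.
rate = 8000.0
tone = [math.sin(2 * math.pi * 1000.0 * i / rate) for i in range(256)]
cog = centre_of_gravity(tone, rate)
```

For the synthetic tone, all the energy sits in one frequency bin, so the result lies at approximately 1000 Hz; for a noisy burst or fricative the same quantity summarizes where the energy is concentrated.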
Fig. 1. Praat TextGrid for a declarative neutral utterance of [un tb-e] 'That is light' (time axis: 0 to 2.112 s).
3.1. Duration
An analysis of variance (repeated measures) was performed on the durations of the segmented sections of the target
words, with SEGMENT ([t]-burst, [], [p/b]-closure, [p/b]-burst, [e], [], clitic [e]), WORD STRUCTURE (PW VS CG), SENTENCE MODE
(declarative vs interrogative), FOCUS (neutral, post-focal, focal) and VOICE (voiced vs voiceless) as factors. Mauchly's test for
sphericity was significant only for SEGMENT; we adopted the Greenhouse-Geisser correction in all cases. There were
interactions between SEGMENT and WORD STRUCTURE (F[6,66] = 6.755; p < 0.001), SEGMENT and FOCUS (F[12,132] = 72.543;
p < 0.001), SEGMENT and SENTENCE MODE (F[6,66] = 100.667; p < 0.001) and SEGMENT and VOICING (F[6,66] = 56.165;
p < 0.001) as well as main effects for SEGMENT (F[6,22] = 123.31; p < 0.001), FOCUS (F[2,22] = 51.01; p < 0.001) and
SENTENCE MODE (F[1,11] = 35.76; p < 0.001). This means that, unsurprisingly, segments have unequal durations, but more
importantly that some or all of our seven segment durations vary systematically with the word type of the target word, with
the focus condition, with the sentence mode and with whether [p] or [b] occurs in the target words. To establish which
segments vary under which condition we carried out repeated measures analyses of variance for each of the segmental
durations separately. The results are presented in Table 1.
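The degrees of freedom reported above can be related to the design with a simplified calculation. The sketch below implements a one-way repeated-measures analysis of variance only, without the additional factors and without the Greenhouse-Geisser correction used in the actual analysis; the function name and the data are hypothetical.

```python
from typing import List, Tuple


def rm_anova_oneway(data: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated-measures ANOVA.

    data[s][c] is the score of subject s in condition c.
    Returns (F, df_effect, df_error), uncorrected for sphericity.
    """
    n_subj = len(data)
    n_cond = len(data[0])
    if n_subj < 2 or n_cond < 2 or any(len(row) != n_cond for row in data):
        raise ValueError("need a complete subjects x conditions table")

    grand = sum(sum(row) for row in data) / (n_subj * n_cond)
    cond_means = [sum(row[c] for row in data) / n_subj for c in range(n_cond)]
    subj_means = [sum(row) / n_cond for row in data]

    ss_cond = n_subj * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = n_cond * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_effect = n_cond - 1
    df_error = (n_cond - 1) * (n_subj - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Hypothetical table: 12 speakers x 7 segments (arbitrary values).
table = [[50.0 + s + 10.0 * c + (s * c) % 3 for c in range(7)] for s in range(12)]
f_value, df1, df2 = rm_anova_oneway(table)
```

With 12 speakers and a 7-level factor the sketch yields 6 and 66 degrees of freedom, the same pair that appears in the F[6,66] values above.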
Table 1 shows that the voicing of the labial closure affects the duration of the closure and the burst. The closure phase
of [p] is 12 ms longer than that of [b], and the burst is 39 ms longer (see Fig. 2). The segment [p] is 105 ms, [b] 54 ms in
total. The preceding vowel is 27 ms longer before [b] (149 ms) than before [p] (122 ms). This result follows widespread
tendencies for voiceless plosives to be longer and preceding vowels to be shorter compared to the situation for voiced
plosives (Luce and Charles-Luce, 1985; Kluender et al., 1988). Unexpectedly, the effect of the voicing of the plosive was
also found on the following vowel, [e], which is 11 ms longer after [b] (97 ms) than after [p] (84 ms).
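The reported plosive and preceding-vowel durations can be cross-checked with simple arithmetic. The sketch below uses only figures quoted in the paragraph above; the variable names are illustrative.

```python
# Mean durations (ms) quoted in the text.
plosive_total = {"p": 105, "b": 54}     # closure plus burst
vowel_before = {"p": 122, "b": 149}     # vowel preceding the plosive

closure_diff = 12   # [p] closure minus [b] closure
burst_diff = 39     # [p] burst minus [b] burst

plosive_diff = plosive_total["p"] - plosive_total["b"]   # 51
vowel_diff = vowel_before["b"] - vowel_before["p"]       # 27

# Vowel plus plosive taken together, per voicing condition.
vowel_plus_plosive = {c: vowel_before[c] + plosive_total[c] for c in ("p", "b")}
```

The closure and burst differences add up to the difference between the two plosive totals, and the longer vowel before [b] offsets roughly half of that difference.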
Fig. 2. Mean segment durations (ms) for the target words pooled over 12 speakers for voiced (---) and voiceless (- - -) labial plosives separately.
Table 1
Effects of Voicing of labial plosive, Focus condition, Sentence mode and Word structure on durations of seven phonetic segments in the target
words [tp-e], [t p-e-e], [tb-e], [t b-e-e].
Segment          Voicing (df 1,11)   Focus (df 2,22)   Sentence mode (df 1,11)   Word structure (df 1,11)
[t]-burst        ns                  F = 5.646*        ns                        F = 20.446**
[]               F = 188.34**        ns                F = 8.71*                 ns
[p/b]-closure    F = 20.92**         F = 4.12*         ns                        F = 15.156**
[p/b]-burst      F = 170.19**        ns                F = 6.81*                 F = 6.13*
[e]              F = 27.071**        F = 15.491**      ns                        F = 5.189*
[]               ns                  F = 61.506**      F = 14.872**              F = 7.74*
[e] (clitic)     ns                  F = 51.225**      F = 117.3**               ns
* p < 0.05.
** p < 0.01.
The effect of the focus condition is due to two quite different factors. Focus condition is partly confounded with position
in the sentence, because the focal target words are sentence-initial rather than sentence-final, as in the other two
conditions. The effects on [e], [] and the final clitic [e] are due to final lengthening in the neutral and post-focal conditions.
A post hoc test (Sidak) shows that in all three cases, the focal condition differs from the other two conditions (p < 0.01),
which however do not differ between themselves. In the focal condition, these three segments are respectively 8 ms,
10 ms and 16 ms shorter than in the other two conditions (see Fig. 3). The second explanatory factor is only present as a
trend, and concerns the lesser articulatory care taken over the post-focal target words. However, the effects here are very
small, as seen in Fig. 3, with [p/b] being 4 ms shorter and the [t]-burst 10 ms shorter in the post-focal condition than in the
focal condition.
Third, the effect of sentence mode is located in the final syllable, as indicated in Table 1 and depicted in Fig. 4. The onset []
is 7 ms longer and the final [e] is 95 ms longer in the interrogative condition than in the declarative condition. Increased final
lengthening in questions would appear to be a general tendency (e.g. Smith, 2002), which has been phonologized in varieties
of West Greenlandic (Rischel, 1974:79; see also Fortescue, 1984:4). By contrast, the effect on [] is a reduction of 8 ms in the
interrogative condition. In fact, overall, non-final syllables tend to be longer in declaratives than in interrogatives, suggesting
that the lengthening of the final syllable is heralded by an accelerando in the pre-final syllables.³ Van Heuven and van Zanten
(2005) in fact propose faster speech rate as a near-universal characteristic of questions.
Fig. 3. Mean segment durations (ms) for the target words pooled over 12 speakers for neutral focus (---), post-focal (- - -) and focal ( ) pronunciations
separately.
Fig. 4. Mean segment durations (ms) for the target words pooled over 12 speakers for declarative (---) and interrogative (- - -) sentences separately.
³ A pattern of shorter non-final syllables and a longer final syllable in interrogatives compared to declaratives was earlier reported by Stoel
(2007) for the East Timorese language Fataluku.
Finally, is there evidence that the location of the accent is accompanied by inherent differences in duration of the
syllable rime? The answer must be negative, even though we did find interpretable effects of word structure. In the CG-
condition, in which [t] has the pitch accent, the [t]-burst is 9 ms longer than in the PW-condition (see Fig. 5; the 6 ms
longer [] just failed to reach significance (F = 4.735; p = 0.052)). Conversely, in the PW-condition, in which [be] has the
accent, the labial closure is 7 ms and the [e] 6 ms longer than in the CG-condition. The following [] compensates partly for
this lengthening by being 3 ms shorter in the PW-condition.
3.2. Spectral measures
Spectral measures have been used to detect differences in articulator shape or position. We report Centre of Gravity
measurements and formant measurements. Centre of Gravity measures for the [t]-burst, [p/b]-burst and [] were subjected to
a repeated measures analysis of variance with SEGMENT ([t]-burst, [p/b]-burst, []), VOICE (voiced vs voiceless), FOCUS (neutral,
post-focal, focal), SENTENCE MODE (declarative vs interrogative) and WORD STRUCTURE (PW VS CG) as factors. Apart from the
obvious effect of SEGMENT, we found an interaction between FOCUS and SEGMENT (F[2,22] = 6.851; p < 0.01), which appeared to
be due to a 330 Hz lower COG for [t]-burst in the focal condition. Since the focal condition has the target word in sentence-
initial position, this effect must be due to the occurrence of [t] at the beginning of the utterance. The same procedure was
followed for F1, F2, and F3, but with [] and [e] as the levels for SEGMENT. (We excluded the final [e], as it was never accented.)
In the case of F1, there was a main effect for VOICE (F = 5.339, p < 0.05) and a significant interaction between SEGMENT and
WORD STRUCTURE (F[1,11] = 15.904; p < 0.01). In the case of F2, there was a main effect for VOICE (F[1,11] = 13.811, p < 0.01)
and significant interactions between SEGMENT and VOICE (F[1,11] = 14.531; p < 0.01) and between SEGMENT and FOCUS (F[2,22]
= 6.268; p < 0.01). In the case of F3, there was a main effect for FOCUS only (F[2,22] = 6.148; p < 0.01). The results of the
separate analyses of variance of the three formants for the individual vowels are given in Tables 1 and 2.
Fig. 5. Mean segment durations (ms) for the target words pooled over 12 speakers for the CG (---) and PW (- - -) word structures separately.
Table 2
Effects of voicing of labial plosive, focus condition, sentence mode and word structure on the F1, F2 and F3 of [] and [e].
Segment   Dependent variable   Voicing (df 1,11)   Focus (df 2,22)   Sentence mode (df 1,11)   Word structure (df 1,11)
[]        F1                   ns                  F = 3.650*        ns                        F = 7.078*
          F2                   ns                  F = 4.854*        ns                        ns
          F3                   ns                  F = 4.277*        ns                        ns
[e]       F1                   ns                  ns                ns                        F = 10.917**
          F2                   F = 25.361**        F = 3.802*        ns                        ns
          F3                   F = 6.34*           F = 4.077*        ns                        ns
* p < 0.05.
** p < 0.01.
The effect of the voicing of the labial plosive is confined to [e], whose F2 is 92 Hz higher and whose F3 is 33 Hz higher
after the voiceless consonant than after the voiced consonant. This means that the vowel is slightly more centralized after [b]
than after [p]. As for the focus condition, we found that the F1 of focal [] is marginally higher than that of post-focal []
(25 Hz) and the neutral [] (14 Hz). The F2 of [] is higher in the post-focal condition than in the neutral condition (34 Hz)
and the focal condition (52 Hz), while its F3 is 60 Hz lower in the post-focal condition than in the neutral condition and
46 Hz lower than in the focal condition. That is, [] is slightly more centralized in the post-focal condition than in the neutral
and focal conditions. The F2 of [e] was 48 Hz lower in the post-focal condition than in the neutral condition, and F3 was
46 Hz lower in the post-focal condition than in the neutral condition, which, again, means that in the post-focal condition [e]
was marginally more central. Finally, the effects of word type can be summarized as follows: when [t] has the pitch
accent (CG), it has a marginally higher F1 (18 Hz) than when it does not (PW). Conversely, accented [e] (PW) has a
marginally higher F1 (19 Hz) than unaccented [e] (CG). That is, vowels in accented syllables are fractionally, and
negligibly, more open than in the unaccented case.
3.3. Intensity
We report the results for intensity (dB) of the separate analyses of variance for the two target vowels in Table 3.
Table 3
Effects of voicing of labial plosive, focus condition, sentence mode and word structure on the intensity of [] and [e].
Segment  Voicing (df 1,11)  Focus (df 2,22)  Sentence mode (df 1,11)  Word structure (df 1,11)
[]       ns                 F = 42.46**      ns                       F = 5.89*
[e]      F = 15.92**        F = 42.47**      F = 7.20*                F = 5.04**
* p < 0.05. ** p < 0.01.

[Figure 6 plot area: six panels, each with y-axis F0 (Hz), 50 to 350.]
Fig. 6. Mean declarative F0 contours for un and [t[b/p]ee] on normalized time scale for PW (---) and CG (- - -) word structures separately, with
target words in a neutral focus sentence (top), in post-focal position (middle) and focus position (bottom). Pooled over 4 speakers.
The voiced labial plosive causes the intensity of the following [e] to be 1.47 dB higher compared to the voiceless
consonant. In interrogatives, it is 2 dB higher than in declaratives, a statistical trend. We have no interpretation of these
effects. As for Focus, [] is 3.03 dB higher in the neutral condition than in the post-focal condition, and 1.26 dB higher
in the focal condition than in the neutral condition. Similarly, the intensity of [e] is 1.36 dB higher in the focal condition than
in the neutral condition and 3.98 dB higher in the neutral condition than in the post-focal condition. This result matches the
communicative nature of these conditions for both vowels, with more intense pronunciations in more emphatic
conditions. As for the effect of word structure, we found that accented [] is 2.06 dB higher than unaccented [], and
accented [e] is 1.96 dB higher than unaccented [e]. Again, this result is in the expected direction for both vowels, but the
effects are only statistical trends.
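To give a sense of scale for the intensity differences reported above, the following sketch converts them into amplitude and power ratios using the standard decibel definitions (20 log10 of an amplitude ratio, 10 log10 of a power ratio). The dB values are taken from the text; the function names and the dictionary labels are ours and purely illustrative, and the conversion is not part of the original analysis.

```python
def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power ratio corresponding to a level difference in dB."""
    return 10 ** (db / 10)


# Level differences (dB) as reported in Section 3.3; labels are illustrative.
differences_db = {
    "first target vowel, neutral vs post-focal": 3.03,
    "[e], neutral vs post-focal": 3.98,
    "first target vowel, accented vs unaccented": 2.06,
    "[e], accented vs unaccented": 1.96,
}

for label, db in differences_db.items():
    amp = db_to_amplitude_ratio(db)
    power = db_to_power_ratio(db)
    print(f"{label}: {db} dB, amplitude x{amp:.2f}, power x{power:.2f}")
```

On this reading, a difference of about 3 dB corresponds to roughly a doubling of power, while the accent-related differences of about 2 dB are smaller than the focus-related ones.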
3.4. Fundamental frequency
We report mean f0 for the PW and CG target words in the neutral, post-focal and focal conditions with declarative and
interrogative intonation separately. Fig. 6 shows averaged contours on normalized time scales for the declarative
condition, while Fig. 7 does the same for the interrogative condition.
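Averaging contours on a normalized time scale can be sketched as follows. This is a minimal illustration and not the procedure actually used for Figs. 6 and 7: the function names, the number of sample points and the f0 values are hypothetical, and the sketch assumes that each contour is a list of f0 samples (Hz) spaced evenly over one utterance.

```python
def resample(contour, n_points):
    """Linearly interpolate a contour onto n_points equally spaced positions."""
    if len(contour) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(contour) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)  # fractional index into the contour
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def mean_contour(contours, n_points=5):
    """Resample every contour to the same length, then average point by point."""
    resampled = [resample(c, n_points) for c in contours]
    return [sum(column) / len(column) for column in zip(*resampled)]


# Hypothetical contours of the same shape but different durations.
slow = [200.0, 210.0, 220.0, 230.0, 240.0]  # five samples
fast = [200.0, 220.0, 240.0]                # three samples
average = mean_contour([slow, fast], n_points=5)
```

Because both hypothetical utterances are mapped onto the same five positions before averaging, differences in speaking rate do not smear the shape of the mean contour.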
In comparison to the duration, intensity and spectral measurements, the f0 measurements show substantial differences
between the two word types. In the top panels of Figs. 6 and 7, which show the neutral condition in declaratives and
interrogatives respectively, accented [t] is approximately 50 Hz (declarative) and 70 Hz (interrogative) higher than its
unaccented counterpart. In the bottom panels, which give the focal condition, comparable differences are observed both for
[t] and [be]. To turn to the post-focal condition, a comparison of the neutral contrasts (Fig. 6, top panels) with the post-focal
(middle panels) contrasts between the PW and CG pronunciations suggests that post-focal forms are not deaccented. With

[Figure 7 plot area: six panels, each with y-axis F0 (Hz), 50 to 350.]
Fig. 7. Mean interrogative F0 contours for un and [t[b/p]ee] on normalized time scale for PW (---) and CG (- - -) word structures separately, with
target words in a neutral focus sentence (top), in post-focal position (middle) and focus position (bottom). Pooled over 4 speakers.
neutral focus (top), the first syllable [t] of the CG (solid line) has high pitch and the following clitic has low pitch, and this
pattern is reversed for the PW (dashed line). In the post-focal condition, there is evident Post-Focus Compression, and the
differences between the two word types are reduced considerably as a result, but there are indications that the general
pattern may be preserved. In the CG condition, there is a regular downtrend across the last four syllables, but in the PW
condition, there is no lowering from [t] to [be], which is consistent with the assumption that [be] has a range-compressed
H-tone target. The interrogative contours (Fig. 7) confirm this conclusion. A comparison of the contrasts in neutral and post-focal
positions shows that the post-focal pronunciations of the target words (middle panels) are reduced versions of the contrast in
neutral position (top panels). A further indication that post-focal words are not deaccented is the relatively high f0 of un in the
focal condition, where un is post-focal (bottom panels in Figs. 6 and 7). In the CG condition in particular (solid line), the third
syllable in the target words has lower pitch than the following syllable un, which suggests there is an H-tone on un in both the
declarative and the interrogative. Since the declarative ends in L%, that H-tone must be H*.
A final observation concerns the utterance-final syllables in the interrogative contours. All of these remain quite level till the
end. This is different from what is seen in many other languages, where a final boundary H% causes a local rise in pitch.
4. Experiment I: discussion
Experiment I was run to be able to answer the question whether Persian has phonetic stress, in the sense of features
other than f0 that mark prominent syllables, or alternatively only tone, to be described as a pitch accent. It was addressed
by means of a detailed investigation of the phonetic differences between nouns and segmentally identical, but
prosodically different noun+clitic combinations in a variety of conditions. The choice of these conditions was motivated by
two considerations. The first was to spread the word accent contrast exemplified by the two structures across an array of
contexts that might have an impact on the realization of the contrast. The second was to create a baseline for gauging the
effect size of any phonetic differences we might find between the two word structures, so as to be able to assign them to
the existence of stress, as opposed to regarding them as side effects of the existence of a pitch accent, i.e. of tone. The
reasoning here is that phonological contrasts rarely confine their effect to just a single or primary phonetic parameter, with
tone only having an effect on f0 or [voice] only having an effect on the state of the glottis. Side effects are ubiquitous, and
are often conventionalized in the phonetic implementation (Stevens and Keyser, 1989).
The results showed that a number of structural contrasts are accompanied by differences in phonetic parameters that
are not the primary phonetic exponents of these structural contrasts. The largest of these occurred as a function of
sentence mode. Excessive final lengthening and some non-final shortening occurred in utterances with interrogative
intonation as compared with the same utterances with the (tonally different) declarative intonation (102 ms for the final
syllable). Next in importance were the durational effects of the value for [voice] of the intervocalic plosive on its closure and
burst durations, which are longer for [p] than for [b] (by 51 ms), and on the preceding and following vowels, which are
shorter in the case of [p] (by 27 ms and 11 ms, respectively). While the specific finding that vowels are longer after onset
[b] than they are after [p] is new, as far as we are aware, both the interrogative durational pattern and the durational effects
of the laryngeal specification of plosives are in line with many earlier findings (e.g. Rischel, 1974; Ryalls et al., 1994; Smith,
2002; van Heuven and van Zanten, 2005; Stoel, 2007; Luce and Charles-Luce, 1985; Kluender et al., 1988). If we ignore
the effect of position in the sentence, an inevitable confound of focus in our data, the durational effect of focus is very
small, with a 10 ms longer [t]-burst duration in post-focal pronunciation relative to neutral pronunciation. Similarly small
effects are found for the structural difference at issue, that of the prosodic difference between first and second syllable
accentuation in nouns and noun+clitic combinations, respectively. Adding significant as well as near-significant effects
over the consonant and vowel in each syllable, we found a 15 ms longer duration of the first accented syllable and a 13 ms
longer duration of the second accented syllable than in their unaccented counterparts.
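The baseline argument in this paragraph can be made explicit by ranking the durational effects it reports. The sketch below only restates the millisecond values given above; the labels and variable names are ours, and the ranking is an illustration rather than a statistical test.

```python
# Durational effects (ms) as reported in the discussion above; labels are illustrative.
effects_ms = [
    ("sentence mode: final syllable, interrogative vs declarative", 102),
    ("[voice]: closure and burst, [p] vs [b]", 51),
    ("[voice]: preceding vowel, [b] vs [p]", 27),
    ("accent: first syllable, accented vs unaccented", 15),
    ("accent: second syllable, accented vs unaccented", 13),
    ("[voice]: following vowel, [b] vs [p]", 11),
    ("focus: [t]-burst, post-focal vs neutral", 10),
]

ranked = sorted(effects_ms, key=lambda effect: effect[1], reverse=True)
largest = ranked[0][1]
accent_effects = [ms for label, ms in effects_ms if label.startswith("accent")]

# Print each effect with its size relative to the largest one.
for label, ms in ranked:
    print(f"{ms:4d} ms  {ms / largest:5.0%}  {label}")
```

Seen this way, the accent-related differences sit alongside the smallest side effects of [voice] and focus, and far below the effects of sentence mode and plosive voicing.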
The findings for intensity and the spectral measures lead to the same conclusion. The intensity of the vowels responded
most clearly to the variation in focus, with vowels in post-focal words having lower intensity than under neutral focus and
having less intensity under neutral focus than under focus. A similar effect was found for the difference in the position of the
word prominence, but it was statistically less robust. The spectral differences between accented and unaccented versions of
the two vowels are extremely small, and smaller than those that resulted from the different focus conditions. In the case of [e],
we found a difference in F2 of the vowel after the labial plosive which is comparable in size to the difference in F1 we found
between the accented and unaccented versions of this vowel. In short, the durational and spectral differences between
accented and unaccented vowels stay well below the baseline for a phonological status of stress.
5. Experiment II
The results of Experiment I appeared to indicate that, while there is Post-Focus Compression in the declarative and
interrogative data, the tonal distinctions between the two word types remain intact after the focus. Experiment II was
conducted to see whether Post-Focus Compression merely compresses the pitch range or alternatively causes the
deletion of the tones of the pitch accent. Lack of deaccentuation under Post-Focus Compression predicts that the salience
of the contrast between the PW and CG conditions may well be reduced, because the distinction between the high pitch
of the prominent syllables and the low pitch of the non-prominent syllables is reduced, but that it is nevertheless
categorically present. We used a word identification task to test this prediction.
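The two hypotheses can be contrasted in a toy model. The sketch below is our own simplification, not a model proposed in the article: it assumes that compression scales f0 linearly towards a reference value, and all numbers (the reference, the compression factor and the two pitch targets) are hypothetical, apart from the 50 Hz distance between them, which echoes the neutral declarative difference reported in Section 3.4.

```python
def compress(f0_hz, reference_hz, factor):
    """Scale an f0 target towards a reference value (0 < factor < 1 compresses)."""
    return reference_hz + factor * (f0_hz - reference_hz)


high, low = 250.0, 200.0  # hypothetical accented and unaccented targets, 50 Hz apart
reference = 180.0         # hypothetical bottom of the speaker's range
factor = 0.25             # hypothetical degree of compression

# Hypothesis 1: range compression only. Both targets survive, closer together.
compressed_high = compress(high, reference, factor)
compressed_low = compress(low, reference, factor)
compressed_difference = compressed_high - compressed_low

# Hypothesis 2: deaccentuation. The H-tone is deleted, so both syllables
# receive the same (low) target and the contrast is neutralized.
deaccented_difference = compressed_low - compressed_low
```

Under the first hypothesis the difference shrinks in proportion to the compression factor but keeps its sign; under the second it is zero, which is the situation in which listeners would be expected to perform at chance.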
5.1. Experiment II: materials
Twelve utterances were selected from the recordings by each of four of the twelve speakers who contributed to the
corpus used for Experiment I, two randomly chosen female speakers and two randomly chosen male speakers. The
utterances contained equal numbers of nouns (PW) and noun+clitic (CG) versions of the same segmental strings. In order
to see if interruption of f0 might aggravate the difficulty of perceiving the post-focal contrast, half of the utterances had the
target words with the voiceless plosives and half those with the voiced plosive. The focus condition was represented by
including the three sentence frames representing the neutral, post-focal and focal conditions. This yielded 4 (speakers) ×
2 (plosives) × 2 (word structures) × 3 (focus conditions), or 48 stimuli. We only used the declarative sentences in this
experiment, as there seemed to be a tendency to preserve the contrast better in the interrogative sentences (cf. the middle
panels in Figs. 6 and 7). Inclusion of interrogative sentences might have caused a positive bias in the results, something
we wanted to avoid.
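The stimulus set described above is a full crossing of four factors, which can be enumerated as in the sketch below. The speaker labels are hypothetical placeholders; the factor levels follow the description in this section.

```python
from itertools import product

speakers = ["female_1", "female_2", "male_1", "male_2"]  # hypothetical labels
plosives = ["b", "p"]
word_structures = ["PW", "CG"]
focus_conditions = ["neutral", "post-focal", "focal"]

# Every combination of speaker, plosive, word structure and focus condition.
stimuli = list(product(speakers, plosives, word_structures, focus_conditions))

# Number of stimuli per focus condition.
per_focus = {
    focus: sum(1 for stimulus in stimuli if stimulus[3] == focus)
    for focus in focus_conditions
}
```

The crossing gives 48 stimuli in total, that is, twelve per speaker and sixteen per focus condition.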
5.2. Experiment II: procedure
Twenty subjects, 8 female and 12 male, were recruited from the student population of Tehran University. They were
tested individually in the phonetics laboratory of the University of Tehran with the help of a Praat Multiple Forced Choice
experiment run on a laptop (Boersma and Weenink, 1992--2009). They were instructed to listen to each stimulus and to
select one of four structures displayed on the screen, where the words [tb] 'light', [t b-e] 'swing+his/her', [tp]
(nonsense word) and [t p-e] 'tank-top+his/her' appeared in Arabic spelling in four clickable buttons, in this order. The
order of the stimuli was randomized per listener. Before the test proper, subjects did six practice items to familiarize
themselves with the task. They could listen to each stimulus as often as they wished, but once they made their choice, the
next screen appeared automatically.
5.3. Experiment II: results
Correct scores were pooled over the stimuli spoken by the different speakers, and an analysis of variance (repeated
measures) was performed on them with WORD STRUCTURE (CG vs. PW), VOICE ([b], [p]) and FOCUS (neutral, post-focal, focal)
as factors. There was a main effect for FOCUS (F[2] = 54.125, p < 0.001). A post hoc test (Sidak) showed that the
post-focal condition was significantly different from the neutral and focal conditions (p < 0.001). The lower recognition scores
in the post-focal condition are due to Post-Focus Compression, which as we have seen reduces the phonetic difference

Fig. 8. Correct scores in a word identification task for noun (PW) vs noun+clitic combinations (CG) as obtained in the neutral condition, the post-
focal condition and the focal condition.
between the F0 of accented and unaccented syllables. Inspection of the errors showed that there were no confusions
between the voiced and voiceless target words. Thus, while the chance level in this four-choice task is technically 25%, in
practice it is 50% for the difference between noun and noun+clitic combinations, given that there is no variation in the
scores for the voicing distinction. The score of 73% is clearly considerably better than a chance level of 50% (see Fig. 8).
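The comparison with chance can be illustrated with an exact binomial tail probability. This is a rough sketch under assumptions that go beyond the article: the number of post-focal responses (20 listeners × 16 post-focal stimuli) and the number of correct responses are reconstructed from the design and the reported 73%, and responses are treated as independent, which ignores the repeated-measures structure of the data. It is not the analysis the authors report.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_responses = 20 * 16                  # assumed: listeners x post-focal stimuli
n_correct = round(0.73 * n_responses)  # reconstructed from the reported 73%

nominal_chance = 1 / 4    # four response buttons
effective_chance = 1 / 2  # voicing was never confused, so two live options

p_effective = binomial_upper_tail(n_responses, n_correct, effective_chance)
p_nominal = binomial_upper_tail(n_responses, n_correct, nominal_chance)
```

Under these assumptions the tail probability is vanishingly small even against the stricter 50% chance level, in line with the conclusion drawn in the text.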
6. Conclusion
There were two questions we intended to address in our investigation. One concerned the presence of phonetic
differences other than f0 in word prominent syllables and the other was whether Post-Focus Compression involved the
deletion of the tonal structure that is responsible for the word prominence. The results of Experiment I showed extremely
small durational, intensity and spectral differences between prominent and non-prominent syllables. They were smaller than, or
at least as small as, those found between different focus conditions, between vowels before and after voiced and voiceless
labial plosives and between declarative and interrogative sentences. While the differences were in the direction to be
expected from a difference in phonetic stress, with slightly longer, slightly more intense and slightly more open mid vowels in
the prominent condition, their effect size fell well below what would be expected from a difference in stress. By contrast, the
f0 differences were substantial. The Persian word prosodic prominence contrast, therefore, is that between the presence
vs. absence of a pitch accent. The literature on Persian (Sadat Tehrani, 2007) as well as our data suggest that this pitch
accent is (L)+H*, with the H-tone going to the last syllable of the lexical word domain, whereby L is overt in polysyllabic
words. Because clitics fall outside this domain, but are syllabified with it, minimal pairs that differ in the location of the
prominent syllable arise whenever a clitic has the same segmental composition as the last segments of a word.
Specifically, the cliticized words (also clitic group or CG) have non-final prominence where the word (also phonological
word or PW) has final prominence. Our experiment did not aim to elucidate the prosodic status of these constituents. The
data we collected are consistent with an interpretation of all these structures as phonological words and with predictable
prominence assignment taking place in the lexicon.
Experiment II confirmed an impression that could be gained from the production data in Experiment I. The phonetic
difference between CG [t [b/p]ee] and PW [t[b/p]e] we observed in the neutral focus condition appeared to be
preserved after the focus, where the pitch range was compressed. That is, a higher first syllable in the CG condition than in
the PW condition was observable in the post-focal condition, even if the F0 difference was less than in the other
conditions. A word identification task showed that the contrast, which reached a 96% correct score in the data without
Post-Focus Compression, still reached a 73% correct score in the post-focal condition, where Post-Focus Compression
applies. This means that there was no neutralization between CG and PW, and that the prosodic difference between them
is intact.
Persian thus differs from English in two respects. First, there is no comparable difference between stressed and
unstressed syllables independently of the presence of the pitch accent, and second, unlike the pitch accents of English,
the Persian pitch accent is not deleted after the focus. While Persian does have Post-Focus Compression (Xu et al.,
2012), the reduction in pitch range is phonetic and leaves the tonal structure intact. These features make Persian more like
Northern Bizkaian Basque (e.g. Hualde et al., 2007) and Tokyo Japanese (Pierrehumbert and Beckman, 1988;
Kubozono, 1993) than English (e.g. Beckman, 1986), Dutch (Rietveld et al., 2004; van Heuven and de Jonge, 2010),
Spanish (Ortega-Llebaria and Prieto, 2010) or Catalan (Ortega-Llebaria et al., 2010), where stressed and unstressed
syllables differ in duration and often also in vowel quality. However, unlike Japanese and Northern Bizkaian Basque,
Persian has obligatory accent, making it similar to Nubi (Gussenhoven, 2006) and Turkish (Levi, 2005). A preliminary
conclusion therefore is that this kind of system, which is both culminative and obligatory and as such counts as a stress
system in the sense of Hyman (2006), may be more common than is suggested by its relative sparseness in the
typological literature.
We have not addressed all the relevant issues. One of these concerns the question whether there is only a single pitch
accent, (L+)H*, or more, as in English. Neither have we investigated the question whether deaccentuation of the word-
based pitch accent might be systematic in other contexts. If the pitch accent is routinely deleted in other contexts, that
would evidently compromise its culminative status. Observe that there is no post-lexical process in English that affects the
location or presence of stressed syllables (Gussenhoven, 2011), and thus all stress-changing processes take place during
word derivation (satan -- satanic, explain -- explanation, etc.). That is, culminativity in English is absolute. It remains to be
seen whether the same is true for Persian.[4]
[4] Nima Sadat-Tehrani ran an informal small-scale replication of the perception experiment and found that responses in the post-focal condition
were in the vicinity of chance level. To the third author, who does not speak Persian, the post-focal stimuli in Experiment II sound deaccented and
neutralized. A formal replication of the experiment would be welcome. Instead of a reading task, as used in Experiment I and which yielded the
stimuli we used, a more realistic elicitation task would be desirable.
Our data are too limited to argue for any particular tonal analysis of Persian. However, some aspects of the data would
seem to conflict with claims in the literature. Interrogatives end in level pitch, rather than a final rise. Since a sequence of
H* H% (or H* H-H%, as in Hirschberg and Pierrehumbert, 1986) has generally been used to describe upstepped contours,
like the high rise of English (e.g. Gussenhoven, 2004:302) or the rise to high of French (Post, 2000), the Persian contour
may need to be analyzed with the absence of a boundary tone (cf. Grabe, 1998:49). This would mean that Persian
contrasts a declarative L% with the absence of a boundary tone () for interrogatives. H% might be reserved for non-final
IPs, as suggested by the examples in Sadat Tehrani (2007).
Acknowledgements
Experiment I was conducted by the first author under the supervision of the second author. The data have been
reanalyzed and interpreted in collaboration with the third author. We thank the participants of Experiment I and Experiment
II at the University of Tehran and Joop Kerkhoff for technical assistance. We are grateful for the comments by Hamed
Rahmani, Nima Sadat-Tehrani and three anonymous reviewers, which have helped greatly to improve the final text. The
first author acknowledges the ITRC Grant awarded by the Iranian Ministry of Information and Communication Technology,
which enabled her to carry out research at Radboud University Nijmegen. The Ministry has in no way influenced the
contents of this report.
References
Beckman, M.E., 1986. Stress and Non-stress Accent. Foris, Dordrecht.
Bijankhan, M., Nourbakhsh, M., 2009. Voice onset time in Persian initial and intervocalic stop production. Journal of the International Phonetic
Association 39, 335--364.
Boersma, P., Weenink, D., 1992--2009. Praat: Doing Phonetics by Computer. Version 5.1.04. www.praat.org.
Bolinger, D., 1958. A theory of pitch accent in English. Word 14, 109--149.
Chen, Y., Guion-Anderson, S., 2012. Prosodic realization of focus in Mandarin by advanced American learners of Chinese. Journal of the
Acoustical Society of America 131, 3200--3234.
De Jong, K., Zawaydeh, B.A., 1999. Stress, duration, and intonation in Arabic word-level prosody. Journal of Phonetics 27, 3--22.
Elordieta, G., Hualde, J.I., 2003. Tonal and durational correlates of accent in contexts of downstep in Lekeitio Basque. Journal of the International
Phonetic Association 33, 195--209.
Eslami, M., 2000. Šenaxt-e nvay-e goftar-e zban-e farsi v karbord-e an dar bazsazi v bazsenas-ye rayani-ye goftar [The prosody of the
Persian language and its application in computer-aided speech recognition]. Ph.D. Dissertation, University of Tehran.
Eslami, M., Bijankhan, M., 2002. Nezam-e ahng-e zban-e Farsi: [Persian intonation system]. Iranian Journal of Linguistics 34, 36--61.
Ferguson, C., 1957. Word stress in Persian. Language 33, 123--135.
Fortescue, M., 1984. West Greenlandic. Croom Helm, London.
Goldsmith, J., 1975. Autosegmental phonology. Ph.D. Dissertation, MIT.
Gomez-Imbert, E., Kisseberth, M., 2000. Barasana tone and accent. International Journal of American Linguistics 66, 419--463.
Grabe, E., 1998. Comparative intonational phonology: English and German. PhD dissertation, Radboud University Nijmegen. Published in MPI
Series in Psycholinguistics.
Gussenhoven, C., 2004. The Phonology of Tone and Intonation. Cambridge University Press, Cambridge, UK.
Gussenhoven, C., 2006. The word prosody of Nubi: between stress and tone. Phonology 23, 193--223.
Gussenhoven, C., 2011. Sentential prominence in English. In: van Oostendorp, M., Ewen, C.J., Hume, E., Rice, K. (Eds.), The Blackwell
Companion to Phonology, vol. 5. Wiley-Blackwell, Malden, MA/Oxford, pp. 2780--2806.
Hayes, B., 1995. Metrical Stress Theory: Principles and Case Studies. Chicago University Press, Chicago.
Hill, A.A., 1962. First Texas Conference on Problems of Linguistic Analysis. University of Texas, Austin.
Hirschberg, J., Pierrehumbert, J., 1986. Intonational structuring of discourse. In: Proceedings of the 24th Meeting of the Association for
Computational Linguistics. pp. 136--144.
Hualde, J., Elordieta, G., Gaminde, I., Smiljanic, R., 2007. From pitch accent to stress accent in Basque. In: Gussenhoven, C., Warner, N. (Eds.),
Laboratory Phonology, vol. 7. Mouton de Gruyter, Berlin/New York, pp. 547--584.
Hyman, L.M., 2006. Word-prosodic typology. Phonology 23, 225--257.
Hyman, L.M., 2009. How (not) to do phonological typology: the case of pitch-accent. Language Sciences 31, 213--238.
Kahnemuyipour, A., 2003. Syntactic categories and Persian stress. Natural Language and Linguistic Theory 21, 333--379.
Kluender, K.R., Diehl, R.L., Wright, B.A., 1988. Vowel-length differences before voiced and voiceless consonants: an auditory explanation.
Journal of Phonetics 16, 153--169.
Kubozono, H., 1993. The organization of Japanese prosody. Studies in Japanese Linguistics, vol. 2. Kurosio Publishers, Tokyo.
Lazard, G., 1957. Grammaire du Persan Contemporain. Klincksieck, Paris, New Edition published by Peeters, Paris, 2006.
Levi, S., 2005. Acoustic correlates of lexical accent in Turkish. Journal of the International Phonetic Association 35, 73--97.
Luce, P.A., Charles-Luce, J., 1985. Contextual effects on vowel duration, closure duration and the vowel consonant ratio in speech production.
Journal of the Acoustical Society of America 78, 1949--1957.
Mahjani, B., 2003. An instrumental study of prosodic features and intonation in modern Farsi (Persian). MA Thesis, University of Edinburgh.
Ortega-Llebaria, M., Prieto, P., 2010. Acoustic correlates of stress in Central Catalan and Castilian Spanish. Language and Speech 54, 73--97.
Ortega-Llebaria, M., Vanrell, M.M., Prieto, P., 2010. Catalan speakers' perception of word stress in unaccented contexts. Journal of the Acoustical
Society of America 127, 462--471.
Pierrehumbert, J., 1980. The phonology and phonetics of English intonation. Ph.D. Dissertation, MIT. Distributed 1988, Indiana University
Linguistics Club.
Pierrehumbert, J., Beckman, M.E., 1988. Japanese Tone Structure. MIT Press, Cambridge, MA.
Pierrehumbert, J., Hirschberg, J., 1990. The meaning of intonational contours in the interpretation of discourse. In: Cohen, P., Morgan, J., Pollack,
M. (Eds.), Intentions in Communication. MIT Press, Cambridge, MA, pp. 271--311.
Post, B., 2000. Tonal and Phrasal Structures in French Intonation. LOT Publications, Utrecht.
Rietveld, T., Kerkhoff, J., Gussenhoven, C., 2004. Word prosodic structure and vowel duration in Dutch. Journal of Phonetics 32, 349--371.
Rischel, J., 1974. Topics in West Greenlandic Phonology. Akademisk, Copenhagen.
Ryalls, J., Le Dorze, G., Lever, N., Ouellet, L., Larfeuil, C., 1994. The effects of age and sex on speech intonation and duration for matched
statements and questions in French. Journal of the Acoustical Society of America 95, 2274--2276.
Sadat Tehrani, N., 2007. The intonational grammar of Persian. Ph.D. Dissertation, University of Manitoba.
Samei, H., 1996. Tekye-ye fe,l dr zban-e farsi: Yek bresi-ye mojdd [Verb stress in Persian: a re-examination]. Nameye Frhngestan 1, 6--21.
Schmerling, S., 1976. Aspects of English Sentence Stress. University of Texas Press, Austin.
Shaqaqi, V., 1993. Clitics in Persian. Ph.D. Dissertation, University of Tehran.
Smith, C., 2002. Prosodic finality and sentence type in French. Language and Speech 45, 141--178.
Stevens, K.N., Keyser, S.J., 1989. Primary features and their enhancement in consonants. Language 65, 81--106.
Stoel, R., 2007. Question intonation in Fataluku. Presented at the Fifth East Nusantara Conference, Kupang, Indonesia. www.fataluku.com.
van Heuven, V.J.J.P., de Jonge, M., 2010. Spectral and temporal reduction as stress cues in Dutch. Phonetica 68, 120--132.
van Heuven, V., van Zanten, E., 2005. Speech rate as a secondary prosodic characteristic of polarity questions in three languages. Speech
Communication 47, 87--99.
van Son, R.J.J.H., Pols, L.C.W., 1999. An acoustic description of consonant reduction. Speech Communication 28, 125.
Wellens, I., 2005. The Nubi Language of Uganda: An Arabic Creole in Africa. Brill, Leiden.
Xu, Y., Chen, S.-w., Wang, B., 2012. Prosodic focus with and without post-focus compression (PFC): a typological divide within the same
language family? The Linguistic Review 29, 131--147.