An 1 Sem 1 A Short Introduction To Phonetics and Phonology

MARA VAN SCHAIK RĂDULESCU
A SHORT INTRODUCTION
TO PHONETICS AND PHONOLOGY
Second edition
Universitatea SPIRU HARET

CIP description of the National Library of Romania
VAN SCHAIK RĂDULESCU, MARA
A short Introduction to phonetics and phonology / Van
Schaik Rădulescu, Mara. - Bucureşti, Editura Fundaţiei România
de Mâine, 2005
152 p.; 20,5 cm
Bibliogr.
ISBN 973-725-437-6
811.111.’342’344(075.8)
© Editura Fundaţiei România de Mâine, 2005
Editor: Andreea DINU
Typesetter: Alexandru OANĂ
Cover: Stan BARON
Approved for printing: 26.01.2006; Printed sheets: 9.5
Format: 16/61×86
Editura şi Tipografia Fundaţiei România de Mâine
Splaiul Independenţei nr.313, Bucureşti, s. 6, O P. 83
Tel./ Fax 3169790; www. SpiruHaret.ro
e-mail: contact@edituraromaniademaine

UNIVERSITATEA SPIRU HARET
FACULTATEA DE LIMBI ŞI LITERATURI STRĂINE
EDITURA FUNDAŢIEI ROMÂNIA DE MÂINE

Bucureşti, 2006

CONTENTS
FOREWORD
I. INTRODUCTION
1. Phonetics and phonology as branches of linguistics
1.1. Disciplines of linguistics
2. Speech sounds
3. The International Phonetic Alphabet
4. On varieties of English
5. Questions
II. BRANCHES OF PHONETICS
1. Acoustic phonetics
2. Auditory phonetics
3. Questions
III. ARTICULATORY PHONETICS
1. Airstream mechanisms
2. The vocal cords
3. Resonance
4. Oral and nasal sounds
5. Active and passive articulators
6. Manners of articulation
7. Fortis and lenis
8. Places of articulation
9. Questions
IV. CONSONANTS
1. Obstruents
1.1. Plosives
1.1.1. Aspiration
1.2. Fricatives
1.2.1. On the distribution of fricatives
1.3. Affricates
2. Sonorant consonants
2.1. Nasals
2.2. Liquids
2.2.1. Laterals
2.2.2. Rhotics
3. Glides
3.1. Distribution and variation of glides
4. Summary
5. Questions and exercises
V. VOWELS
1. Criteria for classifying vowels
2. The Cardinal Vowels
3. Other criteria for classifying vowels
4. English vowel sounds
4.1. RP front vowels
4.2. RP back vowels
4.3. RP central vowels
4.4. RP centring diphthongs
4.5. RP diphthongs falling to [ɪ] and to [ʊ]
VI. PHONOLOGY
1. Phonetics vs. phonology
2. Segmental vs. suprasegmental phonology
3. Segmental phonology
3.1. Phonemes and their variants
3.2. Distribution
4. Questions
VII. PHONOLOGICAL FEATURES
1. Major class features
2. Consonantal features
2.1. Voice
2.2. Manner features
2.3. Place features
3. Vowel features
4. Summing up
VIII. PHONOLOGICAL RULES
1. Rule writing
2. Selecting the underlying form
3. Phonological alternations
3.1. Phonetically conditioned alternations
3.2. Phonetically and morphologically conditioned alternations
3.3. Phonetically, morphologically and lexically conditioned alternations
4. More on rule writing
5. Derivations
5.1. Rule ordering
IX. PHONOLOGICAL PROCESSES
1. Feature changing rules
1.1. Assimilation
1.2. Dissimilation
1.3. Lenition
1.4. Flapping
1.5. Glottalisation
2. Other types of changes
2.1. Deletion
2.2. Insertion
2.3. Metathesis
2.4. Reduplication
2.5. Haplology
3. Questions and exercises
X. SUPRASEGMENTAL PHONOLOGY: THE SYLLABLE
1. Syllable structure
1.1. Sonority and the syllable
1.2. The onset-rhyme theory
1.3. The timing tier
2. Syllabification
2.1. Principles of syllabification
3. Syllable weight
3.1. Latin stress assignment rule
XI. SUPRASYLLABIC STRUCTURE
1. Stress and accent
2. The metrical foot
3. Intonation and tone
4. Questions and exercises
SAMPLE TESTS
APPENDIX 1: English consonantal clusters
APPENDIX 2: English weak forms
SUGGESTED ANSWERS TO SAMPLE TEST A
RECOMMENDED FURTHER READING
BIBLIOGRAPHY
FOREWORD
The general purpose of this course of lectures is to introduce
first-year students of English to the study of sounds. The emphasis
falls of course on the English sound system, but some examples from
other languages are also brought up, so as to increase the explanatory
power of the presentation.
Preparing for this course will first of all enable the students to
recognize, transcribe and describe the English sounds in general
phonetic terms and to master the basic phonetic characteristics of the
English language. At the same time, they will have the possibility to
improve their knowledge of English pronunciation in relationship with
the English spelling, thus increasing their speaking and writing
proficiency.
In the second part of the course, the students will become
familiar with the object of phonology, its basic concepts, and the
phonological description and classification of sounds. They will be
introduced as well to the main phonological processes and their
representation, with practical application to English-specific
phenomena.
The third aim of the course is to present the main features of
English suprasegmental phonology, starting with English phonotactics
(phonological restrictions), and continuing with syllable structure and
syllabification rules of English. Other categories that will fall under
scrutiny are: stress, rhythm, intonation, the relationship between
English weak and strong syllables, etc.
I. INTRODUCTION
1. Phonetics and phonology as branches of linguistics
Phonetics and phonology are two closely related branches of
linguistics, the science which studies human language in all its
aspects.
The study of language is one of the oldest and dearest
preoccupations of philosophers and scientists. Ever since ancient
times, linguists and other scholars have understood that the
phenomena of language are much too complex to be studied globally.
There are, in fact, different levels at which the linguistic analysis can
apply, including, for instance, the level of sounds, that of words and
that of sentences. Of course, sounds, words and sentences cannot be
separated in practice, as they are simultaneously included in the
utterances that we use to communicate. However, a close examination
will reveal that both the substance and the rules by which these
elements of language are organized are quite specific and different
from one another. This is the reason why each level of linguistic
analysis has come to be studied by a different branch of linguistics,
with its own principles and methods. Especially in the past century,
the study of language has become such a complex and diverse
enterprise that it has split up into various relatively independent
branches – the linguistic disciplines of today.
1.1. Disciplines of linguistics
In a philological approach, students are first to become familiar
with the theoretical bases of the most important branches of
linguistics, depending on the various levels of linguistic analysis, and
then learn how to apply their newly acquired knowledge to the
languages they are studying. Consequently, the full (four-year)
curriculum of a department of foreign languages has come to contain
courses covering the disciplines of phonetics, phonology, morphology,
syntax, semantics, pragmatics, as well as other areas of linguistics,
such as discourse analysis, sociolinguistics, psycholinguistics,
computational linguistics, historical linguistics, etc. Below follows a
short presentation of these branches.
Phonetics deals with the physical aspect of speech sounds (or
phones): their production, transmission, and reception (hence the
three corresponding branches of phonetics: articulatory, acoustic and
auditory phonetics).
Phonology is the study of the distinctive sounds of a language,
the so-called phonemes. Phonology examines the functions of
sounds within a language, as well as the way they combine in
syllables and other stretches of speech.
Morphology is the study of morphemes, the smallest
meaningful elements of a language. Morphemes may be whole words
(e.g., thin, cat, wait) or parts of words (e.g., the plural marker -s in
cats, the past tense marker -ed in waited, the comparative marker -er
in thinner, etc.).
Syntax is the study of sentence structure. There are several
ways of defining and examining sentences, according to various
grammars. Syntax may look at the inner structure of clauses or at the
way clauses combine into complex sentences.
Semantics examines the meaning of linguistic signs (words)
and strings of signs. This meaning may result from the relationship of
a sign with the concept it corresponds to in our minds, with the object
it represents in the real world or with another sign in the same natural
language.
Pragmatics studies the use of language and the relationship
between language and its users. It is interested in what we do with
utterances, the way we use them to a certain effect.
Discourse analysis studies the various linguistic features of
different types of text: e.g., the detective story, the political discourse,
the medical scientific reports, etc.
Sociolinguistics is the study of the interaction of language and
social organization. Language has specific social functions, which
make it change accordingly.
Psycholinguistics studies the processes of language acquisition,
language comprehension, language production, language
memorization, etc., which have to do with the cognitive aspect of
language.
Computational linguistics is an interdisciplinary area of
research between linguistics and information science. Some computational
linguists model language structures in computer programs. Others
use the computer as a tool for the analysis of language (e.g., by
using text corpus analysis).
Historical linguistics studies the historical development of
languages. Apart from the diachronic analysis (along time), it also
deals with the synchronic analysis of certain states of language (e.g.,
Old English, the language of Shakespeare, that of eighteenth-century
England, etc.). The evolution of the sound pattern in a
language is studied by a subfield of historical linguistics: historical
(or diachronic) phonetics and phonology.
2. Speech sounds
As can be seen from their definitions, both phonetics and
phonology deal with human speech sounds. Speech sounds are the
sounds we produce when we want to communicate, that is, the sounds
that build up our words and sentences. Unlike animals, which use limited sets
of signals to transmit brief uncomplicated messages (e.g., a
honey-bee dancing in front of its hive), human beings can combine
their sounds in a precise order so as to form larger units and to convey
much ampler and more abstract meaning. This double structuring of
natural languages – both at the ‘lower’ level of sounds and at the
‘higher’ levels of grammar and meaning – has been referred to by
linguists as double articulation. Owing to this special ability, human
languages are (as good as) infinitely creative. In other words, human
speakers can produce an indefinite number of words and sentences,
while using a limited number of sound units and a restricted set of
rules according to which these sounds are organized.
When speaking a language, we are intuitively aware that in order to
pronounce it correctly (or accurately) we have to follow a certain
pattern and pick those sounds that characterize it. This is because, as
already stated, each language uses a closed set of sounds, and native
speakers have the built-in ability to identify those sounds and
combinations of sounds that normally occur in their language and
distinguish them from ‘alien’ ones. It is usually when we try to learn a
foreign language that we start to realize what is typical of it (i.e., what
rules are there to observe) and where it differs from our native
language. For example, a Romanian will have difficulties when
learning how to master the difference between the initial sound in the
word there [ð] and the corresponding sound in dare [d] because the
former sound does not belong to the inventory of sounds of his own
language. A similar lack of correspondence between the Romanian
and the English sound systems stands behind the way the English
vowel [æ] is rendered in Romanian in neologisms, e.g. in the way the
name Lassie is pronounced – Romanian [lesi]. Since there is no [æ]
sound in Romanian, our language replaces it with the sound [e], which
is the most similar to [æ] in our sound repertoire.
Although each language can only make use of a finite set of
sounds, each set is different, so there is no natural language that
employs, has employed or probably will ever employ the same sounds
as another one. Moreover, the sound system of any language changes
in time. This is due to the fact that the vocal tract of a human being is
sophisticated enough to produce an amazingly large variety of speech
sounds (see Figure 1.1), so that when the generations of speakers
change, the sounds they use will also change, even if only
imperceptibly, under various conditioning factors. Small changes turn
over centuries into big shifts. This explains, for instance, why the sets
of sounds of related languages, e.g., Romanian, Italian, French, etc.
are not identical among themselves and with the sounds of the mother-
language they all emerged from – in our example: Latin.
3. The International Phonetic Alphabet
As a means of communication, language is fundamentally oral.
However old writing might seem to be, as compared to speech it is a
far younger development in the history of humanity. Writing is
subordinate to speech and thinking, as its role is that of fixing ideas in
a more or less durable material by means of symbols.
The oldest systems of writing placed great emphasis on the
iconic representation of words; thus, for each word corresponding to a
referent in the real world or to a concept, a suggestive image was
carved or painted. This led to the creation of a long list of symbols
(ideograms), which had little to do with the actual pronunciation of
words. Later on, the sounds contained in words came to be
individualized in writing, first grouped in syllables, then separately.
Thus the first alphabet was invented, marking a major breakthrough in
people’s conception about language.
An alphabet is a much more economical system of writing, as
it starts from the idea that every sound should be represented by one
symbol, a letter. Since, as already stated, there is only a small set of
sounds employed in a language at a certain stage in its existence, the
number of corresponding letters in an alphabet is also small, and thus
easy to master and use. Nowadays, the most frequently employed
alphabet is the Latin one, which has been adapted by many languages
according to their phonetic system.
Natural languages tend to change in their historical evolution,
which makes the relationship between their spelling and their sounds
imperfect. In fact, the older the alphabet, the more irregular the
correspondence between letters and sounds, owing to the phonetic
transformations which have taken place in the history of the respective
language. In English spelling, for instance, the relationship
between the pronunciation and the spelling of words has become
apparently so lax that learners have to memorize strings of letters
whose value is different in different contexts: think, e.g., of the
English ghost, laugh and thought. In the first word, the graphic
sequence gh is pronounced [g], and in the second, [f], but in the third
it is not pronounced at all.
Faced with the imperfections and irregularities characterizing
the alphabets of natural languages, in order to be able to refer
unambiguously and rigorously to speech sounds, linguists have come
to design special phonetic alphabets. Nowadays, the best known in the
scientific world is the alphabet of the International Phonetic
Association (in short: IPA – see Figure 1.1), which can be used for
the notation of speech sounds from all natural languages.
The IPA was first devised at the end of the 19th century, and
ever since it has been regularly revised and updated, so as to
accommodate sounds and features from languages that are still being
studied. Nevertheless, many American linguists prefer to use simpler
symbols and diacritics available on typewriters. For instance, instead
of IPA [ʃ] and [ʒ], they use [š] and [ž] to note the initial sounds in
ship and genre, respectively.
Like any alphabet, IPA makes use of letters and other small
symbols attached to them (diacritics), which can express the tiniest
nuances of pronunciation. For instance, there are numerous shades of
[t] listed in the IPA alphabet: aspirated [tʰ] (as in top), labialised [tʷ]
(as in twitter), palatalized [tʲ] (as in tune), etc. (see Figure 1.1). Such
detailed notations are necessary in the ‘narrow’ phonetic
transcription, which tends to be exhaustive in its description, that is,
to capture all the details in the articulation of the respective sound.
The narrow transcription is useful when we wish to give an accurate
and unitary rendering of the pronunciation of a sound in a certain
language and/or in a specific phonetic environment. If, on the
contrary, we need to be economical, we may only note the sound as a
simple symbol, without any detail (i.e., in ‘broad’ phonetic
transcription) – in our example as [t]. By convention, the symbols
used in the phonetic transcription are placed within square brackets,
e.g., the cat is on the mat: [ðə ˈkæt ɪz ɒn ðə ˈmæt].
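The step from a narrow to a broad transcription can be pictured as removing the diacritics and keeping the base symbol. The following is only a minimal sketch: the set of diacritics is limited to the three mentioned above, and the function name to_broad is a hypothetical one chosen for illustration, not part of any standard tool.

```python
# Diacritics discussed in the text: aspiration, labialisation, palatalisation.
# This set is an illustrative assumption, not a complete list of IPA diacritics.
NARROW_DIACRITICS = {"ʰ", "ʷ", "ʲ"}


def to_broad(narrow: str) -> str:
    """Return a broad transcription by dropping the listed diacritics."""
    return "".join(ch for ch in narrow if ch not in NARROW_DIACRITICS)


# The three 'shades' of [t] all reduce to the same broad symbol.
for variant in ("tʰ", "tʷ", "tʲ"):
    print(f"[{variant}] -> [{to_broad(variant)}]")
```

All three narrow variants are reduced to the same broad symbol [t], which is what makes the broad transcription more economical and less detailed.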
As can be seen in Figure 1.1, apart from various types of
sounds, the International Phonetic Alphabet also contains symbols for
suprasegmental phonological phenomena like stress, tone, intonation,
etc.
Figure 1.1a The International Phonetic Alphabet
Figure 1.1b The International Phonetic Alphabet
4. On varieties of English
Being spoken on all continents, English is the most widely
spread language on earth. It is used by hundreds of millions of people,
as a mother tongue, but also as a second language (e.g., in India,
where it is an official language), or as a language of international
communication (a lingua franca).
The immense geographical spread of English makes it very
different in various places. There are traditional dialectal differences,
such as those between standard British English and the English dialects
spoken in the United Kingdom and Ireland (e.g., Scottish English,
Irish English, etc.), but there are also differences due to the separate
evolution of the language in various parts of the world (e.g., in the
United States of America or Canada), or to the contact between
English and the language of a colonized territory (e.g., in Hong Kong
or South Africa).
The Standard British English pronunciation, also known as
Received Pronunciation (in short, RP), is based on the southern
dialects of England and it is the type of language used by the upper
middle classes, in schools and in the media. In the United States a
corresponding standard variety is called General American
(abbreviated GenAm or GA).
5. Questions
1. What characterizes linguistics?
2. Which branches of linguistics do you know?
3. What do phonetics and phonology share?
4. What are speech sounds?
5. What is the double articulation of language?
6. Why is it difficult to learn the sounds of a foreign language?
7. Does writing depend on speech?
8. Is the English spelling phonetic?
9. What is IPA and what does it contain?
10. How many kinds of phonetic transcription do you know?
11. Which are the most important varieties of English?
12. What are RP and GenAm?
II. BRANCHES OF PHONETICS
A phonetician may be interested in studying the speech sounds
of the languages of the world in general (general phonetics) or he
may apply himself to the study of the phonetic system of one given
language. His approach may be synchronic (focusing on the state of a
phonetic system at a certain moment in its historical development), or
diachronic (following the historical evolution of the respective
system). He may wish to compare or contrast two systems that are
related or not (comparative phonetics). In his investigation, he can
make use of various techniques and devices to probe the nature of
speech sounds (experimental phonetics). If he makes use of
instruments that allow him to perform exact measurements, then he
is a practitioner of instrumental phonetics.
Phonetics, as practiced today, is an independent science, with its
own methods of investigation and experiment, but importing data
from the fields of anatomy, physiology and physics. As already stated,
phonetics deals with speech sounds, focusing on how they are
produced and perceived and on their physical features.
Speech sounds can be described in three different ways: in
terms of (a) the manner of their production; (b) the acoustic properties
of the sound waves traveling between speaker and hearer; and (c) their
physical effects upon the ear. Hence a threefold division of this
science into: articulatory, acoustic and auditory phonetics. We will
start with a short presentation of the last two branches.
1. Acoustic phonetics
Acoustic phonetics is the most technical branch of phonetics, as
the data and the methods it operates with are mostly borrowed from
physics.
Analyzed from the physical point of view, speech sounds are
waves, originated by the vibration of the source (the vocal cords in the
human larynx) and transmitted through the air. Waves can be
represented graphically in sinusoidal shape (see Figure 2.1). Apart
from duration (= how long they last) they have two important
characteristics. One of them is frequency, measured in Hertz (Hz).
Frequency shows how close together the waves are and corresponds to
the pitch (= the shrillness) of the sound. It is calculated by the number
of sinusoidal cycles completed per second (cps). (A complete cycle is
illustrated in Figure 2.1 as the movement between the rest points A
and B.)
Figure 2.1 Periodic wave (one complete cycle between the rest points A and B; labels: frequency, amplitude, peak, trough)
The second important aspect of sounds is amplitude (=
intensity), measured in decibels (dB). Amplitude is the maximum
distance between the highest point of the wave – the peak – and the
lowest point – the trough (often divided by 2) and corresponds to the
loudness of the sound. This is related to the amount of energy that is
transmitted through the air by means of the respective sound wave.
As to the measurement of amplitude, the reference point for the
decibel scale is the standard intensity of a sound, which has a fixed
value close to the audible limit of sound. The sound intensity at the
threshold of human hearing (= 0 dB) is conventionally taken to be one
picowatt per square meter (1 pW/m²), roughly the sound of a mosquito
flying 3 m away, which corresponds to a sound pressure of 20 micropascals
(20 μPa).
The reason for using the decibel is that the ear is capable of
hearing a very large range of sound pressures. The ratio of the sound
pressure that causes permanent damage from short exposure to the
limit that (undamaged) ears can hear is more than a million.
Psychologists have found that our perception of loudness is roughly
logarithmic. In other words, you have to multiply the sound intensity
by the same factor to have the same increase in loudness. This is why
the numbers around the volume control dial on a typical audio
amplifier are related not to the absolute power amplification, but to its
logarithm.
Because the power in a sound wave is proportional to the square
of the pressure, the ratio of the maximum power to the minimum
power is more than one trillion. To deal with such a range, logarithmic
units are useful: the log of a thousand is 3 (since 1000 = 10³), so an intensity
ratio of a thousand represents a difference of 30 dB from the audible limit. Similarly, a
sound of 60 dB is a million times more intense than the standard
value, while one of 120 dB is a trillion times more intense.
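The arithmetic behind these figures can be checked with a short sketch. It assumes the usual definition of the intensity level, 10 times the base-10 logarithm of the ratio between an intensity and the reference intensity, and takes the reference to be the 1 pW/m² mentioned above. The function names are hypothetical and chosen for illustration only.

```python
import math

# Reference intensity at the threshold of hearing: 1 pW/m² (value given in the text).
REFERENCE_INTENSITY = 1e-12  # W/m²


def intensity_level_db(intensity: float, reference: float = REFERENCE_INTENSITY) -> float:
    """Intensity level in dB relative to the reference intensity."""
    return 10.0 * math.log10(intensity / reference)


def intensity_ratio(level_db: float) -> float:
    """How many times more intense than the reference a sound of level_db is."""
    return 10.0 ** (level_db / 10.0)


# A sound a thousand times more intense than the reference lies about 30 dB above it.
print(round(intensity_level_db(1000 * REFERENCE_INTENSITY), 6))

# 60 dB and 120 dB correspond to ratios of a million and a trillion.
print(f"{intensity_ratio(60):.0e}", f"{intensity_ratio(120):.0e}")
```

Because the scale is logarithmic, every additional 10 dB multiplies the intensity by ten, so equal steps in decibels correspond to equal ratios rather than equal differences.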
The time it takes for a cycle to be completed is called the
period of the vibration. Some sounds have constant regular periodic
vibrations (= tones = musical sounds, including, of the speech sounds,
vowels and sonorants), some others have irregular aperiodic vibrations
(= noise sounds, including voiceless consonants), while still others
have mixed vibrations (= tones and noises, including voiced
consonants) (see also Chapter III).
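For periodic vibrations, frequency and period are two views of the same quantity: the frequency is the number of cycles completed per second, and the period is the time taken by one cycle. A minimal sketch of this relation follows; the function names are hypothetical.

```python
def frequency_hz(cycles: float, duration_s: float) -> float:
    """Frequency as the number of complete cycles per second."""
    return cycles / duration_s


def period_s(frequency: float) -> float:
    """Period (duration of one cycle, in seconds) of a periodic vibration."""
    return 1.0 / frequency


# 200 complete cycles in 2 seconds give a frequency of 100 Hz ...
f = frequency_hz(cycles=200, duration_s=2.0)
print(f)

# ... and each cycle then lasts one hundredth of a second.
print(period_s(f))
```

The relation only makes sense for periodic vibrations; aperiodic noise has no single period to measure.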
Vowels consist of bunches of periodic waves with various
frequencies. The wave with the lowest frequency is called the
fundamental (frequency), whereas the others are called the
harmonics of the respective sound. The higher harmonics are whole
number multiples of the fundamental (= the lowest harmonic). For
instance, if a sound has as its fundamental frequency 100 Hz and one
of its higher harmonics is, for instance, of 400 Hz, then we may say
that this is its fourth harmonic, since it is four times higher than the
fundamental.
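The relation between the fundamental and its harmonics is simple whole-number multiplication, as the sketch below shows for the 100 Hz example. The function names are hypothetical, and the small tolerance used to decide whether a frequency counts as a whole-number multiple is an assumption of this sketch.

```python
def harmonics(fundamental_hz: float, count: int) -> list[float]:
    """The first `count` harmonics; the first harmonic is the fundamental itself."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def harmonic_number(frequency_hz: float, fundamental_hz: float) -> int:
    """Which harmonic of the fundamental a given frequency is."""
    ratio = frequency_hz / fundamental_hz
    n = round(ratio)
    if n < 1 or abs(ratio - n) > 1e-9:
        raise ValueError("not a whole-number multiple of the fundamental")
    return n


print(harmonics(100.0, 4))            # fundamental and the next three harmonics
print(harmonic_number(400.0, 100.0))  # 400 Hz is the fourth harmonic of 100 Hz
```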
The fundamental frequency is produced by the vibration of
the vocal cords in the larynx (hence the name laryngeal or glottal
tone), whereas the harmonics are due to the resonating qualities of
the vocal tract above the larynx (in the supraglottal cavities: the
pharynx, the mouth and the nose), whose shapes can be modified
during the articulation. Only some of the harmonics of a sound are
emphasized by the shapes and materials of the resonating cavities,
thus giving the sound a certain quality. That is why, when describing
sounds, phoneticians speak of their characteristic energy bands
(formants), namely the bands of strongly reinforced harmonics,
corresponding to a specific shape of the resonating chamber. The
complex range of formants of a sound makes up its acoustic spectrum.
For example, the spectrum of the vowel /ɑː/ has one band of strong
components in the 800 Hz range and another one in the 1100 Hz
range, while the formants of /iː/ are in the 280 and 2500 Hz range,
respectively (see Figure 2.2).
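Since each vowel quality has its own pair of strong formant bands, a measured pair of formants can be compared with such reference values. The toy sketch below uses only the two vowels and the figures quoted above, and picks the reference vowel whose formants are nearest. The function name and the use of plain distance in Hz are assumptions made for illustration; real analyses use more vowels and normalise for the speaker.

```python
import math

# Reference formant bands (first and second formant, in Hz) quoted in the text.
FORMANTS_HZ = {
    "ɑː": (800.0, 1100.0),
    "iː": (280.0, 2500.0),
}


def closest_vowel(f1: float, f2: float) -> str:
    """Return the reference vowel whose two formants lie nearest to (f1, f2)."""
    return min(FORMANTS_HZ, key=lambda vowel: math.dist(FORMANTS_HZ[vowel], (f1, f2)))


print(closest_vowel(300.0, 2300.0))  # low first formant, high second formant
print(closest_vowel(750.0, 1200.0))  # both formants close together
```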
Figure 2.2 Spectrograms of /iː/, /ɑː/, /aɪ/ (after Ladefoged 1971; Chiţoran 1978: 49)
The fundamental frequency of a sound corresponds to its pitch.
While the fundamental frequency involves acoustic measurement
expressed in Hz, pitch is used as a perceptual term, relating to
listeners’ judgements as to whether a sound is ‘high’ or ‘low’, whether
one sound is ‘higher’ or ‘lower’ than another and by how much, and
whether the voice is going ‘up’ or ‘down’. Such judgements are not
linearly related to fundamental frequency. For listeners to judge that
one tone is twice as high as another, the frequency difference between
the two tones is much larger at higher absolute frequencies, e.g., 1000
Hz is judged to be double 400 Hz, but 4000 Hz is judged to be double
1000 Hz. However, fundamental frequency values in speech are all
relatively low (i.e., usually less than 500 Hz), and for most practical
purposes pitch can be equated with fundamental frequency.
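The non-linear relation between frequency and perceived pitch is often approximated by the mel scale. The sketch below uses one widely quoted conversion formula, mel = 2595 · log10(1 + f/700). The formula is an assumption brought in for illustration and does not come from this course, and the function name is hypothetical.

```python
import math


def hz_to_mel(frequency_hz: float) -> float:
    """Approximate perceived pitch in mels (a common textbook approximation)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


# Perceived pitch ratios for the two pairs of tones mentioned in the text.
low_pair = hz_to_mel(1000.0) / hz_to_mel(400.0)
high_pair = hz_to_mel(4000.0) / hz_to_mel(1000.0)
print(round(low_pair, 2), round(high_pair, 2))
```

Under this approximation both pairs come out at a perceived ratio of roughly two, although the frequency ratio is 2.5 for the lower pair and 4 for the higher one.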
Different persons have different pitches (women have shriller
voices than men, though not as shrill as those of children; the average
values for the fundamental frequency with men, women and children
are 120 Hz, 225 Hz and 265 Hz, respectively). However, we can still
recognize, e.g., an /iː/ or an /aɪ/ even if the type of voice which utters
them is different from the point of view of pitch. What stays the same
is the shape of the spectrum: e.g., in the /iː/ pronounced by a woman
and the /iː/ of a man the harmonics with the greatest amplitude are of
similar frequency (even if the lower pitch will involve a lower number
of harmonics in the man’s sound).
The graphic representation of the frequencies (the formants) of
a sound is called a spectrogram and it can be obtained by means of a
device called an acoustic spectrograph. Nowadays the functions of such
devices have been taken over by specially programmed computers.
A recent field of activity, which involves knowledge of
phonetics and much more, is speech processing, the study of speech
signals and the processing methods of these signals. The signals are
usually processed in a digital representation whereby speech
processing can be seen as the intersection of digital signal processing
and natural language processing.
Speech processing can be divided into the following categories:
(a) speech recognition (analysis of the linguistic content of a
speech signal);
(b) speaker recognition (where the aim is to recognize the
identity of the speaker);
(c) speech signal enhancement (e.g., noise reduction);
(d) speech coding for compression and transmission of speech
(in telecommunications);
(e) voice analysis for medical purposes (e.g., analysis of vocal
loading and dysfunction of the vocal cords);
(f) artificial speech synthesis (by means of a speech
synthesizer, a software or hardware device capable of rendering text
into speech).
2. Auditory phonetics
Auditory phonetics focuses on the perception of sounds (the
way in which sounds are heard and interpreted). It is a field of study
where the scientist has to rely heavily on notions of anatomy and
physiology, involving the functions of the ear, but also of the brain,
where the acoustic message is decoded.
The ear receives auditory stimuli and transmits them further to
the brain. The outer ear is made up of the pinna (auricle), which
collects and focuses sound waves. From the pinna, the sound moves
into the ear canal, a simple tube running to the middle ear. This
includes the eardrum (tympanum or tympanic membrane) and the
ossicles, three tiny bones (called hammer, anvil, and stirrup) which
form the linkage between the tympanic membrane and the oval
window that leads to the inner ear. The tympanum turns vibrations of
air in the ear canal into vibrations of the ossicles.
The inner ear contains the organ of hearing (the cochlea) and
the labyrinth (vestibular apparatus), the organ of balance. The
cochlea is a hollow organ filled with a fluid (endolymph) and lined on
the inside with hair cells (sensory cells topped with hair-like
structures, the stereocilia). All vibrations passing through the middle
ear enter the endolymph. Hair cells are varied in length, so that they
resonate with sounds of various frequencies. Whenever a hair cell
resonates, it sends a nerve impulse to the brain, which is perceived as
a sound of whatever pitch the hair cell is associated with. A very
strong movement of the endolymph due to very loud noise may cause
hair cells to die. This is a common cause of partial hearing loss, and
the reason why anyone near guns or heavy machinery should wear
earmuffs or earplugs.
Our hearing mechanism is limited to an auditory field
ranging from the frequency of roughly 20 Hz to that of 20000 or
22000 Hz. With age, the range decreases, especially at the upper limit.
Above and below this range are ultrasound and infrasound,
respectively. Lower frequencies cannot be heard but loud sounds can
be felt on the skin. The optimum range of sensitivity is between 600
Hz and 4200 Hz.
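The limits of the auditory field can be expressed as a simple classification. The following is a minimal Python sketch; the function name is illustrative, and the two constants are the approximate limits quoted above (the upper one is assumed to be 20000 Hz and is lower for older listeners).

```python
AUDIBLE_MIN_HZ = 20.0     # approximate lower limit of hearing (assumed)
AUDIBLE_MAX_HZ = 20000.0  # approximate upper limit of hearing (assumed)

def classify_frequency(freq_hz):
    """Label a frequency as infrasound, audible or ultrasound."""
    if freq_hz < AUDIBLE_MIN_HZ:
        return "infrasound"
    if freq_hz > AUDIBLE_MAX_HZ:
        return "ultrasound"
    return "audible"

print(classify_frequency(10))     # infrasound
print(classify_frequency(1000))   # audible
print(classify_frequency(40000))  # ultrasound
```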
Frequency resolution of the ear is, in the middle range, about 2
Hz. That is, changes in pitch larger than 2 Hz can be perceived.
However, even smaller pitch differences can be perceived through
other means. For example, the interference of two pitches can often be
heard as a (low-)frequency difference pitch. This effect is called
beating.
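The beat rate follows from simple arithmetic: two pure tones sounded together produce a loudness fluctuation whose rate equals the difference between their frequencies. A minimal Python sketch (the function name is illustrative, not taken from the text):

```python
def beat_frequency(f1_hz, f2_hz):
    """Return the beat rate (in Hz) heard when two pure tones sound together."""
    return abs(f1_hz - f2_hz)

# Two tones only 1 Hz apart are hard to tell apart by pitch alone,
# but together they produce one loudness fluctuation per second.
print(beat_frequency(440.0, 441.0))  # 1.0
```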
The intensity range of audible sounds is enormous. The lower
limit of audibility is defined as 0 dB (we cannot hear sounds weaker
than this), but the upper limit is not as clearly defined. It
is rather the level at which the sensation of pain occurs
(because of too much pressure on the eardrums) and the ear is
physically harmed. This limit also depends on the duration of
exposure. The ear can sometimes be exposed to short periods of sound
at 120 dB without harm, whereas long periods of exposure to 80 dB
sounds will harm it. Sounds of 150 dB cause physical damage to the
human body.
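The decibel scale is logarithmic, which is why such an enormous range fits into a few hundred units. The sketch below assumes the usual definition of sound pressure level, 20 times the base-10 logarithm of the ratio between the measured pressure and a reference pressure of 20 micropascals; this definition and the function names are assumptions added for illustration and are not given in the text.

```python
import math

P_REF_PA = 20e-6  # assumed reference pressure: 20 micropascals (0 dB)

def spl_db(pressure_pa):
    """Sound pressure level in dB relative to P_REF_PA."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)

def pressure_ratio(db_a, db_b):
    """How many times larger the sound pressure at db_a is than at db_b."""
    return 10.0 ** ((db_a - db_b) / 20.0)

print(spl_db(P_REF_PA))        # 0.0, the threshold of audibility
print(pressure_ratio(120, 0))  # roughly a million-fold pressure difference
```

On this definition every increase of 20 dB corresponds to a tenfold increase in sound pressure.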
Human hearing is basically a spectral analyzer, that is, the
ear resolves the spectral content of the pressure wave without respect
to the phase or the waveform of the signal. In practice, though, some
phase information can be perceived. Inter-aural (i.e., between ears)
phase difference is a notable exception: it provides a significant part
of the directional sensation of sound.
In some situations an otherwise clearly audible sound can be
masked by another sound. For example, conversation at a bus stop
can be completely impossible if a loud bus is driving past. This
phenomenon is called intensity masking. A loud sound will mask a
weaker sound so that the weaker sound is inaudible in the presence of
the louder sound.
In fact, masking depends on two more parameters: the
frequency and the temporal separation of the sounds. A weak sound close
in frequency to the louder sound is more easily masked than one that
is far from it in frequency. This effect is called pitch masking. Similarly,
a weak sound emitted soon after the end of a louder sound is masked
by the louder sound. In fact, even a weak sound just before a louder
sound can be masked by the louder sound. These two effects are called
forward and backward temporal masking, respectively.
The act of audition has objective as well as subjective
characteristics when it comes to language. Most often we give a
subjective interpretation to what we hear, selecting only those sound
features that are relevant for the language we communicate in. For
example, when listening to spoken standard English, untrained
Romanians may have difficulty in recognizing (and reproducing) the
difference between the aspirated and non-aspirated variants of
voiceless stops (e.g., the difference between [tʰ] in top and [t] in
stop), because they do not use aspiration in their own language. So in
order to become able to perceive sounds correctly, speakers must also
learn how to pronounce them and how to use them in the system of the
respective language, and thus develop an awareness of auditory
sensations corresponding to various sound qualities.
3. Questions
1. Which branches of phonetics do you know?
2. What do articulatory, acoustic and auditory phonetics study?
3. Which are the physical characteristics of sounds?
4. What is frequency and what is its unit of measure?
5. What is amplitude and how is it measured?
6. What is the difference between periodic, aperiodic and mixed
vibrations?
7. What is the fundamental frequency and how is it produced?
8. What is the relationship between pitch and fundamental
frequency?
9. What are the harmonics and where are they produced?
10. What is an acoustic spectrum and what does it consist of?
11. What is a spectrograph and what is it used for?
12. What is speech processing?
13. How is sound transmitted to the brain?
14. Which are the limits of the human auditory field?
15. Which is the intensity range of audible sounds?
16. How can a sound be masked?
17. Can an untrained ear easily discern the sounds of a foreign
language?
III. ARTICULATORY PHONETICS
The physical processes involved in the production of speech
sounds are the domain of articulatory phonetics, which draws heavily on
data from human anatomy and physiology in its descriptions. This is
so because the same organs that are involved in breathing
also participate in the production of speech. Speech sounds
result from the modification of the volume and direction of the airflow
originating in the lungs, a modification carried out in the vocal tract
(see Figure 3.1 for a schematic illustration of the anatomic parts
involved in the process).
1. Airstream mechanisms
The airflow initiated in the lungs passes through the
trachea (windpipe), the larynx (behind the Adam’s apple) and the vocal
tract (mouth and nose). This type of airstream mechanism, known as
pulmonic egressive (‘from the lungs outwards’), is involved in all
human languages, and for many languages it is the only mechanism
employed to produce speech sounds (e.g., English, Romanian, etc.).
For a small number of articulations, the airstream does not
originate in the lungs, but rather comes from outside. The ingressive
airstream mechanism produces sound through inhalation, as when
uttering a gasp of astonishment by breathing in air: aa! A speech
sound can also be generated from a difference in pressure between the
air inside and outside a resonator. In the case of the oral cavity, this
pressure difference can be created without using the lungs at all
(producing clicks, for example).
In the following discussion it will be assumed that the airstream
mechanism is pulmonic egressive.
Figure 3.1 The vocal tract and articulatory organs (labelled parts:
upper lip, lower lip, teeth, alveolar ridge, (hard) palate, velum (= soft
palate), uvula, nasal cavity, oral cavity, tongue, pharynx, epiglottis,
larynx (with vocal cords), trachea, lungs)
2. The vocal cords
In the larynx (voice box), the air pushed out from the lungs meets the
vocal cords. These are two flaps of muscle placed across the windpipe,
attached at the back to the arytenoid cartilages and at the front to
the thyroid cartilage (which forms the protrusion called the Adam’s
apple in males’ throats). The vocal cords can modify their position
and thus allow the air to flow upwards in certain ways.
When they are wide apart, the air passes through without any
obstacle. This results in a so-called voiceless sound, such as the initial
and final sounds in the word case [keɪs]. If, on the contrary, the vocal
cords are close together, with a narrow gap in between, then the
pressure of the air moving through will cause them to vibrate, which
will result in a voiced sound (as in all the sounds in the word gaze
[ɡeɪz]). The vibration of the vocal cords can be heard (when we
cover our ears during the articulation) as well as felt (by placing a
finger on the larynx during the pronunciation of voiced sounds). To
practice, try to articulate the voiced fricative consonants [z] or [v] in a
prolonged manner, contrasting them with their voiceless counterparts
[s] and [f].
Apart from these two most common positions of the vocal cords
(open and narrowed), languages can also exploit a number of other
configurations, such as complete closure. If the glottis (= the opening
inside the larynx, in between the vocal cords) is completely closed
(glottal stop), the air accumulates below the vocal cords; when they
are opened, the pressure is released with a cough-like puff of air. The
glottal stop is important in the study of many kinds of British English,
as it can be found in the dialects spoken in London (Cockney),
Glasgow and Manchester, as well as in some varieties of North American
English (in New England). Take for instance the regional
pronunciation of the final sounds in wha[ʔ] (e.g., in what rain), shu[ʔ]
(e.g., in shut up), the “dropped t or k” pronunciation of, e.g., butter
and crackle, the vowel reinforcement in a hiatus, etc. (see also
Section III.3).
If the vocal cords are wide apart, as if for the pronunciation of
voiceless consonants, but the air still causes some vibrations while
passing through the glottis, we are dealing with the so-called
murmured sounds or breathy voice. A similar quality can be heard
when we speak in a hushed, breathy voice so as not to disturb the people
around us.
3. Resonance
As the air moves out of the larynx, the movement of
the articulators (the tongue, lips, etc.) modifies the shape of the vocal
tract above it, so that the vibrations of the air inside the oral
and nasal cavities also change, by a phenomenon called
resonance, similar to the resonance inside a guitar box or a flute.
Some sounds (the sonorants = vowels, glides, liquids and
nasals) involve a relatively high degree of resonance (= sonorance or
sonority). Other sounds (the obstruents) involve much less
sonorance. Obstruents are ‘noisy’ consonants produced by air
disturbances: a sudden burst of air or air friction, whereas sonorants
are more like pure, musical sounds. The most sonorous sounds are the
vowels. In English all sonorants are voiced, while obstruents can be
either voiced or voiceless.
4. Oral and nasal sounds
The choice between the oral and nasal articulations depends on
the position of the soft palate (or velum), a muscular flap placed at
the back end of the palate (= the roof of the mouth) (see Figure 3.1). If
the velum is raised and the nasal port closed, the air flows only into
the oral tract (the mouth), so that oral sounds are produced (most
speech sounds are oral). If the velum is lowered, the air can flow both
through the oral and the nasal cavities, which leads to the articulation
of nasal sounds. Nasals are sonorant consonants (see Section III.3).
5. Active and passive articulators
In the oral tract, the tongue and the lips, which move during the
articulation of sounds, are considered to be active articulators,
whereas the upper non-mobile surfaces of the mouth are usually
referred to as passive articulators. Of the active articulators, the
tongue is usually described in great detail: the tip, blade,
front, body, back and root. That is because the smallest alteration in its
position can determine a perceptible change in the pronunciation of
the sound. Passive articulators can be the upper lip, the upper teeth,
the palate and the pharynx wall. By convention, the roof of the mouth is
further subdivided into the alveolar ridge (= the gum ridge), the hard
palate, the soft palate (often called velum) and the uvula (= the
fleshy tip of the soft palate, used, e.g., in the articulation of French
uvular ‘r’ [ʁ]) (see Section III.8).
6. Manners of articulation
The manner in which a sound is articulated depends on the
channel opening (the distance between the active and passive
articulators). This distance can vary from complete closure (or
stricture) (a blockage in the mouth which prevents the air from
escaping) to complete aperture (through which the air flows out
unhindered).
In the case of complete stricture, the air which has built up
behind the blockage (the ‘closure phase’) is released with a small
outburst when the blockage is removed (the ‘release phase’). This is
the way in which stops are produced. Oral stops (also known as
plosives if they are pulmonic egressive) are obstruent sounds
articulated with a raised velum (e.g., the consonants in the word bide:
[b] and [d]). Nasal stops involve a lowered velum (e.g., the initial and
final consonantal sounds of mine [m] and [n]); they are sonorant
sounds (in their production the nasal cavity acts as a resonator for the
airflow vibrations).
When the articulators are close together, but the stricture remains
incomplete, the air escapes through a very narrow passageway with
some friction (turbulence noise). This is the manner of articulation
specific to fricatives (e.g., the first and last sounds in fuss: [f] and [s]).
Since in the articulation of fricatives the air can pass continuously
through the vocal tract, they are described as continuant sounds.
The articulation of another type of obstruent involves
complete closure, followed by a release phase which is prolonged.
The air is slowly released through a narrow gap between the
articulators, in a way that resembles the articulation of fricatives. The
sounds produced in this manner are called affricates (e.g., the initial
sounds [tʃ] and [dʒ] in cheat and gesture). Affricates, however, behave
not as a sequence made up of two sounds, but rather as one
single segment. Compare, e.g., the following pair: catch it
(containing the sound [tʃ]) and cat shit (containing the sequence [t+ʃ],
noticeably longer than the previous one).
Apart from fricatives, there are some other sounds which can
be characterized as continuant: the frictionless continuants or
approximants, which are divided into two groups: glides and liquids.
The glides are closely related to the corresponding high vowels (e.g.,
the glide [j] in yet resembles the short vowel [ɪ] in sing). The liquids
are laterals and rhotics (i.e., ‘l’ and ‘r’ sounds, respectively), which
are often, but not always, articulated as approximants.
In the articulation of vowels (e.g., the middle sounds in fish
[ɪ], bad [æ] or boot [uː]), the air flows out unhindered because the
articulators are more or less wide apart. Just like glides and liquids,
vowels are continuant sounds.
7. Fortis and lenis
Fortis consonants are produced with greater articulatory effort
and more air pressure required by stronger resistance at the place of
articulation. Lenis consonants are more lax: they require less intensity
and tension. The duration of articulation is also longer in the case of
fortis consonants than in the case of lenis ones. In a voiced/voiceless
pair (e.g., [d]/[t]), the voiced consonant is lenis and the voiceless
consonant fortis.
8. Places of articulation
The production of a sound involves the movement of an active
articulator towards a passive one. The articulators give their name to
the place of articulation of the respective sound (see Table 3.1).
Table 3.1 Places of articulation
• Bilabial – sound produced with both lips (e.g., [p], [b], [m], etc.).
• Labiodental – the lower lip and the upper teeth (e.g., [f], [v], etc.).
• Interdental – the teeth and the tongue tip/blade (e.g., [θ], [ð], etc.).
• Alveolar – the alveolar ridge and the tongue tip/blade (e.g., [t], [d],
[s], [z], [n], [r], [l], etc.).
• Alveo-palatal – the alveolar ridge/hard palate and the tongue blade
(e.g., [ʃ], [ʒ], [tʃ], [dʒ]).
• Retroflex – the hard palate and the tongue tip curled backwards
(e.g., [ɻ], etc.).
• Palatal – the hard palate and the tongue blade (e.g., [j], etc.).
• Velar – the soft palate (velum) and the tongue body (dorsum) (e.g.,
[k], [g], etc.).
• Uvular – the uvula and the tongue body (dorsum) (e.g., [ʁ] in Fr.
raison ‘reason’, etc.).
• Pharyngeal – the pharynx wall and the tongue root (e.g., [ʕ] in
Arabic [ʕamm] ‘uncle’, etc.).
• Glottal – the vocal cords in the larynx (e.g., [h], [ʔ] (the glottal
stop), etc.).
Bilabial and labiodental sounds are included in the general
class of labials, since both sets involve at least one of the lips. The
class of coronals (sounds produced by raising the front part of the
tongue – the tongue tip or blade, but not the body of the tongue)
comprises the dentals, alveolars, alveo-palatals (= palato-alveolars
or postalveolars), retroflex and palatal sounds. Velars and the
uvulars have as an active articulator the body or dorsum of the
tongue, so they are both referred to as dorsals. The class of gutturals
contains pharyngeal and glottal sounds, which tend to behave as one
group (see Table 3.2).
Some consonants have two simultaneous places of articulation.
Secondary articulation occurs when an additional vowel-like
articulation is overlaid on the basic sound. In this case the consonant is
articulated with a simultaneous glide, i.e., palatalized (e.g., [tʲ] in
Romanian peşti ‘fish (pl.)’), labialized (e.g., [kʷ] in English quick),
etc. In the production of sounds with double articulation both places
of articulation are equally important (e.g., the labio-velar glide [w] in
wife).
Table 3.2 Groups of place of articulation
LABIAL         CORONAL         DORSAL    GUTTURAL
Bilabial       Dental          Velar     Pharyngeal
Labio-dental   Alveolar        Uvular    Glottal
               Alveo-palatal
               Retroflex
               Palatal
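The grouping in Table 3.2 can be written down as a small lookup structure. The following Python sketch is only an illustration (the names PLACE_CLASSES and major_class are hypothetical); the labels are spelled as in the table.

```python
# Major place classes, copied from Table 3.2.
PLACE_CLASSES = {
    "labial": ["bilabial", "labio-dental"],
    "coronal": ["dental", "alveolar", "alveo-palatal", "retroflex", "palatal"],
    "dorsal": ["velar", "uvular"],
    "guttural": ["pharyngeal", "glottal"],
}

def major_class(place):
    """Return the major class (labial, coronal, dorsal, guttural) of a place."""
    for cls, places in PLACE_CLASSES.items():
        if place.lower() in places:
            return cls
    raise ValueError(f"unknown place of articulation: {place!r}")

print(major_class("velar"))     # dorsal
print(major_class("Alveolar"))  # coronal
```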
9. Questions and exercises
1. What do you know about the pulmonic egressive airstream
mechanism?
2. Are there any other types of airstream mechanisms?
3. Which positions of the vocal cords do you know?
4. What is the difference between voiced and voiceless sounds?
5. How is a glottal stop articulated?
6. What is resonance?
7. Which sonorant sounds do you know?
8. What is the difference between oral and nasal sounds?
9. Which active and passive articulators do you know?
10. Which types of manner of articulation do you know?
11. How are plosives / fricatives / affricates articulated?
12. How are nasals articulated?
13. How are liquids / glides / vowels articulated?
14. What is the difference between fortis and lenis consonants?
15. Which places of articulation do you know and how can they
be grouped?
16. Which sounds correspond to each place of articulation?
17. What is the difference between secondary articulation and
double articulation?
18. In each of the following words one sound is underlined.
Describe it in terms of voicing, nasality (if necessary), place of
articulation and manner of articulation:
a) more b) bar c) assist d) lazy e) joy
f) peach g) thin h) fast i) season j) north
19. Which are the active and passive articulators in the
production of the following underlined sounds?
a) choke b) very c) yet d) happy e) singing
f) then g) cherry h) dear i) bridge j) shoe
20. For each of the pairs of words below identify the difference
between the underlined sounds.
Example: The difference between the [t] in pat and the [d] in
pad is a matter of voicing ([d] is voiced, while [t] is voiceless).
a) pit/bit b) sent/tent c) vest/zest d) mob/bob
e) core/gore f) deck/neck g) soap/soak h) force/source
i) lag/lad j) measure/mesher
IV. CONSONANTS
1. Obstruents
1.1. Plosives (= oral stops)
Table 4.1 English plosives
Plosive (IPA)   Place of articulation   Voice   Examples
[p]             bilabial                -       pear, drop
[b]             bilabial                +       bit, sob
[t]             alveolar                -       tooth, pat
[d]             alveolar                +       dash, cod
[k]             velar                   -       kitchen, thick
[g]             velar                   +       gong, lag
[ʔ]             glottal                 -       rat, buckle (in some varieties
                                                of Br. Engl. and Am. Engl.)
Some languages may have other oral stops, produced in other
places of articulation. For instance, in the pronunciation of Romanian
[t] and [d] the passive articulators are the upper teeth rather than the
alveolar ridge, as in English (dental stops are usually symbolised by
[t̪] and [d̪], with a little tooth-like diacritic under the main symbol).
The glottal stop [ʔ] has been compared with a slight cough. It
has no voiced counterpart because the vocal cords cannot vibrate
when they are in contact (see also Section III.2). Under some
circumstances, voiceless stops may be reinforced or completely
replaced by glottal stops: e.g., in bu[ʔn̩] (button) (where the
diacritic [ ̩ ] under [n] marks the syllabic nasal); li[ʔ]or (liquor);
si[ʔ g]uy (sick guy); cu[ʔ s]lice (cut slice), etc. If vowels occur
(emphatically) at the beginning of a word or in a hiatus (two vowels
juxtaposed in consecutive syllables), they may also undergo glottal
reinforcement, as, e.g., in it’s [ʔ]eight!; re[ʔ]act.
1.1.1. Aspiration
In most English varieties, when a voiceless stop is placed at the
beginning of a stressed syllable, its release is followed by a
perceptible puff of air, called ‘aspiration’ and marked by a [ʰ]
diacritic, e.g. in [pʰ]ot, [tʰ]op, [kʰ]an. On the other hand, when the stop
follows the fricative [s] in the same initial position, its release stage is
devoid of such an audible outrush of air (it is ‘non-aspirated’), e.g. in
spot, stop, scan.
In connected speech, aspiration may help us distinguish
between otherwise ambiguous sentences, such as in the pair peace
talks [piːstʰɔːks] and pea stalks [piːstɔːks]. A weaker sort of
aspiration may also be present in the articulation of stops at the
beginning of unstressed English syllables, as well as in word-final
position.
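The main aspiration rule can be stated as a short procedure. The Python sketch below is a simplification under stated assumptions: a syllable onset is given as a list of segments, the function name mark_aspiration is hypothetical, and the weaker aspiration found in unstressed and word-final positions is ignored.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def mark_aspiration(onset, stressed):
    """Transcribe a syllable onset (list of segments), adding aspiration.

    A voiceless stop is marked as aspirated only when it is the first
    segment of a stressed syllable; after [s] it is left plain.
    """
    if stressed and onset and onset[0] in VOICELESS_STOPS:
        return [onset[0] + "ʰ"] + onset[1:]
    return list(onset)

print(mark_aspiration(["p"], stressed=True))       # ['pʰ'], as in pot
print(mark_aspiration(["s", "p"], stressed=True))  # ['s', 'p'], as in spot
```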
1.2 Fricatives
In many varieties of English, there is no voiced glottal fricative
corresponding to the voiceless [h]. However, if the sound begins a
stressed syllable, following a non-stressed syllable ending in a vowel,
some English speakers make use of a breathy voice [ɦ], as in behead
or rehearsal (see Section III.2). Some other English variants (e.g.,
Cockney) hardly make use of any [h] sound, which leads to
ambiguities of pronunciation (e.g., in the pair hall – all).
In the so-called ‘Celtic’ varieties of English (Irish, Scottish and
Welsh) another type of fricative occurs: the voiceless velar [x] (e.g., in
Scottish loch / Irish lough ‘lake’, as well as in German acht ‘eight’ or
Dutch nog ‘still, more’). Other languages use different places of
articulation for the pronunciation of their fricatives, e.g., the Japanese
voiceless bilabial [ɸ], as in Fuji, the Spanish voiced bilabial [β], as in
deber ‘owe’, the German voiceless palatal [ç], as in sich ‘self’, the
Greek voiced velar [ɣ], as in [ɣ]ata ‘cat’ (see also the IPA chart =
Figure 1.1).
Table 4.2 English fricatives
Fricative (IPA)   Place of articulation   Voice   Examples
[f]               labio-dental            -       fine, puff
[v]               labio-dental            +       vat, move
[θ]               (inter)dental           -       thick, path
[ð]               (inter)dental           +       that, bathe
[s]               alveolar                -       sink, kiss
[z]               alveolar                +       zero, buzz
[ʃ]               alveo-palatal           -       shake, dash
[ʒ]               alveo-palatal           +       pleasure, beige
[h]               glottal                 -       hat, inherit
1.2.1. On the distribution of fricatives
Most of the English fricatives occur in all positions (word-initial,
word-medial and word-final). Words beginning with the voiced
interdental [ð] belong to a small set of articles and adverbs, such as
the, that, there, etc.
Other fricatives with only limited distribution in English are
[ʒ], [h] and [x]. The voiced alveo-palatal [ʒ] never occurs word-initially
(except in a couple of loanwords, e.g. gigolo and genre) and
for the rest it can only be identified in relatively few words, e.g.,
pleasure, casual; beige, rouge. In word-final position [ʒ] may vary
with the affricate [dʒ] (e.g., garage, etc.).
The voiceless glottal fricative [h] can never be found in final
position; it is restricted to the word-initial or word-medial position,
but even then it must belong to the onset of a stressed syllable, e.g., in
horse or ahead. [h] is regularly dropped from the initial position of
several function words – unstressed pronouns and auxiliaries (e.g., his,
her, has, etc.) and it is often absent in other words in many varieties of
English characterised as sub-standard. In those cases where the first
orthographic sequence of a word is ‘hu’, the initial sound is sometimes
pronounced as the palatal fricative [ç] followed by the glide [j]. In
some North American varieties, these words actually begin with the
glide [j], without any [h] sound (e.g., in huge, humid, etc.).
The voiceless velar [x] never occurs word-initially in the
‘Celtic’ English varieties.
1.3 Affricates
There are only two affricates commonly used in English, both
alveo-palatal: voiceless [tʃ] (as in charity, teacher, catch) and voiced
[dʒ] (as in generous, pledger, rage). Speakers of other languages make
use of more affricates, such as the German voiceless labio-dental [pf],
as in Pfeffer ‘pepper’, the Romanian voiceless dental [ts], as in ţară
‘country’, or the Italian voiced dental [dz] in zio ‘uncle’.
2. Sonorant consonants
2.1 Nasals
English nasals are stops. They correspond to the English
plosives in terms of their place of articulation: there is a bilabial [m],
as in money, an alveolar [n], as in nutty, and a velar [ŋ], as in sing.
The English velar nasal [ŋ] cannot occur at the beginning of a
syllable. In other languages we find different types of nasals (e.g.,
dental [n̪], as in Romanian numai ‘only’, palatal [ɲ], as in French
ga[ɲ]er (gagner) ‘to win’, Spanish ni[ɲ]o (niño) ‘child’, Italian o[ɲ]i
(ogni) ‘every’, etc.).
2.2 Liquids
2.2.1. Laterals
Laterals are those sonorants whose articulation involves a free
flow of air over the lowered sides of the tongue. The central part of
the tongue (the active articulator) touches the palate (the passive
articulator) (in a so-called mid-sagittal contact), but both (or at least
one of) its lateral parts are free in the process. Characteristic of many
languages including English is the alveolar lateral [l], as in lamb (in
this case, the tongue blade is in contact with the alveolar ridge).
Another type of lateral is the Spanish and Italian palatal [ʎ] (as in Sp.
caballo ‘horse’, It. figlio ‘son’), etc.
In English, a lateral liquid may occur in all positions in a
word, but its articulation varies accordingly. An important distinction
results from contrasting the articulation of (a) an [l] in initial position,
or word-medially before a vowel, to (b) a lateral placed at the end of
the word, before a consonant or in syllabic position. The lateral variant
produced in the environments under (a) (e.g., in lake, ludicrous,
follow, inland), which only has alveolar contact, is known as clear ‘l’
and is symbolised as [l]. For the articulation of the other variant, in
addition to the alveolar contact, the back of the tongue is
simultaneously raised towards the soft palate (e.g., in pi[ɫ], ki[ɫ]t,
ratt[ɫ̩]). This secondary velar articulation has given the alveolar
sound the description dark ‘l’.
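The distribution of the two variants depends only on what follows the lateral, so it can be sketched as a simple rule. The Python sketch below is a rough illustration: the function name, the small vowel inventory and the way the environment is passed in are all assumptions made for the example.

```python
# A small, assumed inventory of vowel symbols for the example.
VOWELS = set("aeiouæɑɒɔəɛɪʊʌɜ")

def lateral_allophone(next_segment, syllabic=False):
    """Pick clear [l] or dark [ɫ] from the segment that follows the lateral.

    next_segment is the following sound in the word, or None at word end.
    """
    if syllabic or next_segment is None:
        return "ɫ"  # syllabic or word-final: dark
    if next_segment[0] in VOWELS:
        return "l"  # before a vowel (word-initial or word-medial): clear
    return "ɫ"      # before a consonant: dark

print(lateral_allophone("eɪ"))                 # l, as in lake
print(lateral_allophone(None))                 # ɫ, as in pill
print(lateral_allophone("t"))                  # ɫ, as in kilt
print(lateral_allophone(None, syllabic=True))  # ɫ, as in rattle
```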
2.2.2 Rhotics
Under the name ‘rhotics’ a large variety of sounds are usually
grouped, and a good ear will notice the differences in the articulation
of the ‘r’ sounds used, for instance, in RP English, Scottish English,
North American English, or other languages, such as Spanish, French
and High German (see Table 4.3). In fact, as we will see, the general
heading of rhotic covers sounds that either involve contact between
the active and passive articulators, or friction, or neither contact nor
friction (in the case of continuants). What all these ‘r’ sounds share is
that they tend to function as sonorants, even if they are not so
phonetically.
In the articulation of the alveolar trill (or roll) [r], which also
happens to be the ‘r’ sound characteristic of Romanian, the tongue
blade vibrates against the alveolar ridge, touching it repeatedly (in
intermittent closure). For the alveolar tap (or flap) [ɾ] (a stop of very
short duration), a single tap of the tongue blade against the alveolar
ridge is enough. Both the trill and the tap are met in the Scottish
varieties of English, especially the latter. The tap (or flap) [ɾ] is also
the intervocalic sound in North American English pattern, etc.
Table 4.3 Various types of rhotics
Rhotic (IPA)   Place and manner of articulation   Examples
[r]            alveolar trill/roll                Sp. perro ‘dog’, Rom. raţă ‘duck’,
                                                  Russ. roza ‘rose’
[ɾ]            alveolar tap/flap                  Sp. pero ‘but’, Scott. Eng. red, North
                                                  Am. Eng. cutter
[ɹ]            (post-)alveolar approximant        Br. Eng. right
[ɻ]            retroflex approximant              North Am. Eng. rabbit
[ʀ]            uvular trill/roll                  Somewhat older (e.g., Edith Piaf’s)
                                                  Fr. regrette ‘regret’
[ʁ]            uvular fricative                   French mari ‘husband’, High German
                                                  richtig
The characteristic RP rhotic is the (post-)alveolar continuant
(approximant) [ɹ]. It is produced by raising the tongue blade
towards the alveolar ridge, but in this case the sides of the tongue
come into contact with the molars, which creates a narrow channel for
the air to flow down the middle of the tongue. The retroflex
approximant [ɻ] is articulated in a similar way (characteristic, e.g.,
of many North American varieties of English), but this time the
tongue blade is curled backwards, to the post-alveolar position.
The uvular roll (or trill) [ʀ] and the voiced uvular fricative
[ʁ] both involve the back of the tongue and the uvula: in the trill the
uvula vibrates against the back of the tongue, while in the fricative
the two are held in close approximation. The former reminds an
English speaker of gargling and it occurs in some older dialects of
French and in Lisbon Portuguese. The latter is the sound often heard
in French and High German.
The distribution of the ‘r’ sounds lies at the basis of one of the
major English dialect divisions. Thus, varieties with pre- and post-
vocalic ‘r’ are called rhotic accents (i.e., accents where both the
rhotics in e.g. rose or marry are pronounced, as well as in, e.g., fair
and sort), whereas those with only pre-vocalic ‘r’ are named non-
rhotic accents. Most types of English are non-rhotic. The rhotic ones
include the majority of North American English, Scottish and Irish
English, etc.
This dialectal difference rests on a historical sound change,
which led to the post-vocalic loss of the rhotic in some types of
English. The evidence comes from the spelling of English words, as
well as from the presence at the end of a word like fair in non-rhotic
accents of an ‘r’ sound if the word is followed by another word which
starts with a vowel, e.g., in fair answer (this rhotic is called linking
‘r’). This phenomenon occurs also within morphologically complex
words, as for instance in boring (cf. bore): the rhotic always precedes
a vowel-initial ending.
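The difference between rhotic and non-rhotic accents, including linking ‘r’, reduces to one condition on the sound that follows the ‘r’. The Python sketch below illustrates this under stated assumptions: the function name and the small vowel inventory are hypothetical, and the following sound may belong to the same word or to the next one.

```python
# A small, assumed inventory of vowel symbols for the example.
VOWELS = set("aeiouæɑɒɔəɛɪʊʌɜ")

def r_is_pronounced(next_sound, rhotic_accent):
    """Decide whether an orthographic 'r' is pronounced.

    next_sound is the sound that follows the 'r' (possibly the first
    sound of the next word), or None before a pause.
    """
    if rhotic_accent:
        return True  # rhotic accents keep 'r' in every position
    # Non-rhotic accents keep only pre-vocalic 'r' (this covers linking 'r').
    return next_sound is not None and next_sound[0] in VOWELS

print(r_is_pronounced(None, rhotic_accent=False))  # False: fair (before a pause)
print(r_is_pronounced("ɑː", rhotic_accent=False))  # True: fair answer
print(r_is_pronounced("t", rhotic_accent=False))   # False: sort
print(r_is_pronounced("t", rhotic_accent=True))    # True: sort in a rhotic accent
```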
Another phenomenon connected to the one illustrated above is
intrusive ‘r’: the insertion of a word-final rhotic sound between two
vowels in non-rhotic accents, e.g., in the idea [ɹ] of it. Intrusive ‘r’ is
most often heard word-finally after the vowel [ə] and it is also
sometimes heard word-internally for some speakers (e.g., compare
soaring and saw[ɹ]ing (sawing)).
Some adult speakers use a so-called ‘defective r’ [ʋ], a labio-dental
approximant quite similar to the glide [w]. This type of
pronunciation is often considered affected, and was typically a feature
of upper class English English, but nowadays it is characteristic of the
language spoken, for instance, by the working-class and lower
middle-class in South Eastern England.
3. Glides
In the articulation of glides, no contact is produced between the
articulatory organs, which groups them together with the vowels. For
this reason glides are also called semi-vowels. In fact, their
articulation is slightly different from that of the corresponding vowels:
when a glide is produced, the articulators are prepared for a vowel-like
sound, but then they immediately change their position (get closer) to
produce another sound. It is to this ‘gliding’ that the sounds owe their
name. Besides, glides are shorter and their articulation is more
forceful than that of vowels.
Glides are also called semi-consonants because they behave
like consonants: unlike vowels, they cannot occur at the end of a
syllable or preceding a consonant and they are always followed by a
vowel. Together with some of the liquids with similar characteristics
they build the class of approximants (frictionless continuant sounds).
There are only two glides in English, as in the majority of
languages: the palatal [j] (e.g., in yet) and the labio-velar [w] (e.g., in
water). The articulation of the palatal [j] is similar to that of the vowel
[i] (the front of the tongue is raised close to the palate). The labio-
velar [w] shares the articulation features of [u] (the lips are rounded
and the back of the tongue raised towards the soft palate). Apart from
these two most common glides, there are also others, such as the
French labio-palatal [ɥ] (similar to French [y], the front round
vowel) (e.g., in lui [lɥi] ‘him’).
3.1. Distribution and variation of glides
In many North American types of English, as well as in some
English English varieties, [j] cannot follow the alveolar consonants [t],
[d], [s], [z], [n] and [l], or the dental fricative [θ], e.g., in tune, dupe,
suit, presume, new, lure, Lithuania, but it will follow [n] and [l] if
they are placed in unaccented syllables, e.g., in ven[j]ue and val[j]ue.
In those varieties of English where [j] can follow an alveolar
sound, the sequences [t] + [j] and [d] + [j] frequently coalesce to form
the alveo-palatal affricates [tʃ] and [dʒ]. This happens inside words or
across word boundaries, e.g., in [tʃ]une, [dʒ]uring, as well as in bet you
[bɛtʃə], bid you [bɪdʒə], etc. Similarly, the sequences [s] + [j] and [z]
+ [j] often combine into the corresponding alveo-palatal fricatives [ʃ]
and [ʒ], e.g., in ti[ʃ]ue (tissue), ca[ʒ]ual (casual), as well as in
ki[ʃ] you (kiss you), ama[ʒ] you (amaze you).
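The coalescence pattern described above can be sketched as a small rewrite rule. The function name `coalesce` and the representation of a transcription as a list of symbols are assumptions made for illustration only; the sketch applies the rule wherever it can, whereas in actual speech coalescence is optional and depends on the variety.

```python
# Alveolar obstruent + [j] merges into the corresponding alveo-palatal sound.
COALESCENCE = {
    ("t", "j"): "tʃ",
    ("d", "j"): "dʒ",
    ("s", "j"): "ʃ",
    ("z", "j"): "ʒ",
}

def coalesce(segments):
    """Return a new list of segments with every alveolar + [j] pair merged."""
    result = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in COALESCENCE:
            result.append(COALESCENCE[pair])
            i += 2  # both input segments are consumed
        else:
            result.append(segments[i])
            i += 1
    return result

# "bet you" and "kiss you", with the word boundary ignored
print(coalesce(["b", "ɛ", "t", "j", "ə"]))
print(coalesce(["k", "ɪ", "s", "j", "u"]))
```

A word without such a sequence, e.g. `["w", "ɛ", "t"]`, is returned unchanged.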
In Scottish, Irish and North American types of English, a sound
which is very similar to the labio-velar glide, the voiceless labio-velar
fricative [ʍ], spelled ‘wh’, functions as a distinct sound. Thus, in
these types of English there is a clear contrast between the words:
witch (with initial [w]) vs. which (with [ʍ]), Wales vs. whales,
weather vs. whether, etc. The other English varieties treat these words
as pairs of homophones, both having the glide [w].
4. Summary
Table 4.4 summarizes the typical English consonantal sounds
introduced in this chapter.
CLASS          Stops         Fricatives            Affricates   Nasals   Approximants
bilabial       [p] [b]                                          [m]
labio-dental                 [f] [v]
(inter)dental                [θ] [ð]
alveolar       [t] [d]       [s] [z]                            [n]      [l] [ɹ] (liquids)
alveo-palatal                [ʃ] [ʒ]               [tʃ] [dʒ]
palatal                                                                  [j] (glide)
velar          [k] [g]       [x] (in Celt. var.)                [ŋ]      [w] (labio-velar glide)
glottal        [ʔ] (dial.)   [h]
Table 4.4 Consonants typically used in English
5. Questions
1. Which English plosives do you know?
2. What characterizes the glottal stop?
3. What is aspiration and which sounds are affected by it?
4. Which English fricatives do you know?
5. What is particular in the distribution and variation of English
fricatives?
6. Which affricates do you know?
7. Which sonorant consonants do you know?
8. What is characteristic of the English nasals?
9. Which English liquids do you know?
10. What is the difference between ‘clear l’ and ‘dark l’?
11. What is the difference between rhotic and non-rhotic
varieties of English?
12. What are ‘linking / intrusive / defective r’?
13. What are glides?
14. What characterizes the distribution and variation of English
glides?
15. Indicate the symbols representing the sounds described
below:
a) voiceless dental fricative; b) voiceless bilabial stop; c) voiced
velar nasal; d) voiced palatal glide; e) voiceless alveolar fricative;
f) voiced alveo-palatal fricative; g) voiced alveolar lateral; h) voiceless
glottal stop; i) voiced alveo-palatal affricate; j) voiced labio-velar glide;
k) voiced labio-dental fricative; l) voiced bilabial nasal.
16. For each of the following symbols, find an adequate
description in words.
Example: [b] = voiced bilabial stop
a) [d] b) [z] c) [n] d) [p] e) [h] f) [tʃ] g) [/]
h) [β] i) [ç] j) [ð] k) [ʃ] l) [ɹ] m) [w] n) [x]
o) [×] p) [r] q) [ŋ] r) [f] s) [®] t) [⎠] u) [k]
v) [ʒ]
17. Identify the difference in articulation between the following
sounds, grouped in two sets.
Example: [θ s v ʒ] differ from [d p g k] in manner of
articulation – the sounds in the first set are all fricatives and the
sounds in the second set are all stops.
a) [k g x ɣ ŋ] vs. [t d s z n] b) [n r l] vs. [d s z] c) [b d ʒ] vs. [p t ʃ]
d) [ɸ β θ] vs. [b d k] e) [j w] vs. [z ɹ] f) [m n ŋ] vs. [b d g]
g) [pf ts tʃ] vs. [f s ʃ]
18. Identify which of the following sounds does not share all the
features of the rest of the sounds and specify what the difference
consists in (sometimes there is more than one solution).
Example: in the set [p, n, s, dʒ], [n] is nasal and the rest are oral
sounds.
a) [w j t] b) [k x g s] c) [r l m n] d) [m p b ɸ] e) [v z ʒ ɣ h]
V. VOWELS
The description of vowels is quite different from that of
consonants. First of all, voicing is irrelevant in this case, since vowels
are usually voiced in the majority of languages, so this feature is rarely
mentioned. Secondly, the manner of articulation as such is equally
irrelevant, since all vowels are by definition produced with the
articulators wide apart. Thirdly, vowels are restricted to the palatal and
velar places of articulation.
1. Criteria for classifying vowels
Vowels are usually described according to their ‘quality’ within
a three-term system: vowel height, vowel backness, and vowel
roundness.
Vowel height is a ‘vertical’ parameter, corresponding more or
less to the consonantal criterion of manner, based on the distance
between the articulators. Vowels vary from high (that position in
which the tongue body is as near the palate as it can be without
causing audible friction) to high-mid, mid, low-mid and low (where
the tongue body is as far from the palate as possible) (older texts may
also use close and open instead of high and low, respectively).
Vowel backness is a ‘horizontal’ criterion, parallel to
consonantal place. It refers to the part of the tongue which is raised
highest in the articulation of the vowel, varying from front (equivalent
to palatal) (through central) to back (equivalent to velar).
Vowel roundness: a vowel may be either rounded – articulated
with the corners of the lips brought towards each other and the lips
pushed forwards, e.g., [u] – or unrounded. Some phoneticians make a
further distinction within unrounded vowels, between spread vowels
– produced with the corners of the lips moved away from each other,
as for a smile, e.g., [i], and neutral vowels – where the lips are not
noticeably rounded or spread, e.g., [ə].
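The three-term system can be modelled as a simple record with one field per criterion. The class name `Vowel`, its field names and the `describe` helper are illustrative assumptions, not established notation; the three sample vowels follow the descriptions given in this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vowel:
    symbol: str
    height: str     # "high", "high-mid", "mid", "low-mid" or "low"
    backness: str   # "front", "central" or "back"
    rounded: bool

    def describe(self):
        """Spell out the three quality criteria in words."""
        rounding = "rounded" if self.rounded else "unrounded"
        return f"[{self.symbol}] = {self.height} {self.backness} {rounding} vowel"

VOWELS = [
    Vowel("i", "high", "front", False),
    Vowel("u", "high", "back", True),
    Vowel("ə", "mid", "central", False),
]

for vowel in VOWELS:
    print(vowel.describe())
```

The finer distinction between spread and neutral lips is left out here; it could be added by replacing the boolean with a three-valued field.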
2. The Cardinal Vowels
Applying the three major criteria presented above, we can
delimit the vowel articulation from the articulation of other sounds,
calculating the so-called ‘vowel space’. This is the space within the
oral cavity available for the production of vowels. For the sake of
simplicity, the most common representation of the vowel space takes
the stylized arbitrary shape of a quadrilateral (a trapezoid), as first
proposed by Daniel Jones in the 1920s, under the name of Cardinal
Vowel chart (see Figure 5.1).
In Figure 5.1, the upper left corner represents the tongue
position for the (ideally) highest and furthest forward vowel ([i]),
while the lower right corner shows the tongue position for the lowest
and furthest back vowel [ɑ]. Six other sounds, approximately placed
equidistantly from each other, are also indicated, thus giving a series
of eight cardinal vowels, of which 1 to 5 are unrounded, and 6 to 8
rounded. These are known as the primary cardinal vowels.
By reversing the rounding value, we obtain eight more secondary
cardinal vowels, of which 9 to 13 are rounded, and 14 to 16 unrounded.
Two more vowels are numbered in the chart: the high central unrounded 17
[ɨ] and the high central rounded 18 [ʉ]. There are also other central vowels
which do not belong to the inventory of cardinal vowels, but are included in
the IPA chart: the central low unrounded vowel [ɐ], the central low-mid
unrounded vowel [ɜ], the central mid unrounded vowel [ə], etc. [ə] is
shaped like an inverted ‘e’ and is usually called ‘schwa’ (pronounced
[ʃwɑ]), which is the old Hebrew term for a diacritic indicating a missing
vowel (Hebrew writing usually only includes consonants).
1 i          u 8
2 e          o 7
3 ɛ          ɔ 6
4 a          ɑ 5
Figure 5.1 The primary cardinal vowels
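The relation between the primary and the secondary cardinal vowels (same tongue position, reversed lip rounding, number increased by eight) can be checked with a short sketch. The tuple layout and the function name `secondary` are assumptions chosen for illustration; the symbols of the secondary vowels themselves are not derived here, only their numbers and rounding.

```python
# Primary cardinal vowels 1-8 as (number, symbol, rounded)
PRIMARY = [
    (1, "i", False), (2, "e", False), (3, "ɛ", False), (4, "a", False),
    (5, "ɑ", False), (6, "ɔ", True), (7, "o", True), (8, "u", True),
]

def secondary(primary):
    """Give each primary vowel's secondary counterpart: number + 8, rounding reversed."""
    return [(number + 8, symbol, not rounded) for number, symbol, rounded in primary]

for number, base, rounded in secondary(PRIMARY):
    label = "rounded" if rounded else "unrounded"
    print(f"{number}: {label} counterpart of [{base}]")
```

Under these assumptions the rounded secondary vowels come out as numbers 9 to 13 and the unrounded ones as 14 to 16.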
A few other IPA vowels are important in the description of the
English vocalic system. One of them is [æ] (found in conservative RP
and in most American English varieties). This vowel is somewhat
higher and fronter than [a], but also a little lower than [ɛ]. IPA [ɪ] and
[ʏ] are the lower, more central, short, and lax counterparts of [i] and
[y], respectively, while [ʊ] similarly corresponds to [u] (see Figure
5.2).
              front          central        back
high          i  y           ɨ  ʉ           ɯ  u
high-mid        ɪ  ʏ                      ʊ
mid           e  ø                          ɤ  o
                               ə
low-mid       ɛ  œ            ɜ             ʌ  ɔ
low           æ               ɐ
              a  ɶ                          ɒ  ɑ
Figure 5.2 IPA vowels (selective)
The Cardinal Vowel chart is a schematic representation of the
vowel space and its limits. It establishes reference points (hence the
label ‘cardinal’) to which vowels in specific languages can be
compared and described as, for instance, ‘higher than the cardinal
vowel X’, ‘further back than the cardinal vowel Y’, or ‘more rounded
than the cardinal vowel Z’. In this sense, the vowels in the words sea
and shoe are said to illustrate the high cardinal vowels [i] and [u],
respectively. The same is said of the French vowels in the words si
‘yes’ and chou ‘cabbage’, and yet there is a perceptible difference
between the English and the French pronunciations. This is because
the French vowels are closer to the corresponding cardinal vowels than
are the English vowels.
A special mention needs to be made of the symbol [a] being
commonly used to represent a low central vowel rather than a low
front vowel (as specified in the Cardinal Vowel chart). This sound is
typical, for instance, of Romanian (e.g., in are ‘(he) has’).
3. Other criteria for classifying vowels
Traditionally, in describing English vowels we use the ‘quantity’
distinction ‘long’ vs. ‘short’. Consonants can also be long (e.g.,
fricatives take longer to be articulated than plosives; plosives can be
long if they are ‘doubled’ or geminated – as, e.g., in Italian). Long
vowels can be 50 to 100 percent longer than short vowels. For
example, there is an obvious difference in length between the vowel
in feet [i:] (the colon indicates a long vowel) and the one in fit [ɪ]. At
the same time, the two vowels also differ in ‘quality’: [ɪ]
is lower and more central than [i:]. That is because length in most
English varieties is never the only feature which distinguishes two
vowels. This is not the case in other languages (e.g., Danish) or even
in a number of Scottish and Northern Irish English varieties, where
length is sometimes the only criterion of distinction between pairs of
words such as daze [dez] and days [de:z].
Long vowels are always associated with a higher degree of
muscular tension in the articulatory organs. Consequently, they are
described as tense. Short vowels are produced with less tension, in a
more relaxed manner – hence their description as lax.
The more advanced or retracted position of the tongue root
can differentiate among vowels. Vowels articulated with the root of
the tongue pushed forward of its normal position are described as
advanced tongue root (ATR) vowels. Non-ATR vowels are
articulated with the tongue root in its resting position. The former type
of vowels are also tenser and higher than the latter.
Another important way of distinguishing vowel sounds depends
on whether the tongue stays in the same position or is shifted during
the articulation. Some vowel sounds are relatively steady
(monophthongs, also called ‘pure’ vowels), e.g. in feet, some others
involve tongue movement after the beginning of the articulation
(diphthongs), e.g., in fight. Monophthongs are represented by a single
vowel symbol, such as [i:] in feet, while diphthongs are represented by
two symbols (indicating the starting and the finishing positions of the
tongue, respectively), such as [aɪ] in fight. Both monophthongs and
diphthongs belong to one single syllable. The duration of a diphthong
is usually equal to the duration of a long vowel, but there are
languages which make use of short diphthongs (e.g., Icelandic).
One of the members of the diphthong sequence dominates over
the other. If the dominant member comes first in the sequence, we are
dealing with a falling diphthong. English only has falling diphthongs,
of two kinds: opening – in fact, centring (ending in [ə], e.g., [ɪə]
in beard) and closing (ending in [ɪ] or in [ʊ], e.g., [ɔɪ] in voice and
[aʊ] in loud). In other languages, e.g., Romanian, there are also rising
diphthongs, where the dominant member comes second, e.g., in iarnă
‘winter’, iute ‘spicy, quickly’, ies ‘I go out’, coadă ‘tail’, ceas
‘clock, watch, hour’, etc. However, some linguists (especially
Americans) describe diphthongs (and even long monophthongs) as
sequences of glide + vowel (e.g., [ja], [wa]) or vowel + glide (e.g.,
[aj], [aw]).
In some non-rhotic English varieties, closing diphthongs may
be followed by [ə] (in those environments where rhotic varieties have
an ‘r’ sound), e.g., in RP sour [saʊə], sayer [seɪə], fire [faɪə],
lawyer [lɔɪə], and slower [sləʊə]. Thus triphthongs result, which
by nature are very unstable and subject to reduction. Their reduction
usually implies the loss of the intermediary vowel, which
automatically determines the compensatory lengthening of the initial
vowel. The RP words enumerated above are now pronounced [sa:ə]
(sour), [se:ə] (sayer), [fa:ə] (fire), [lɔ:ə] (lawyer), and [slɜ:]
(slower), with a further tendency towards monophthongisation of the
resulting centring diphthong. Thus the pairs slower and slur [ɜ:], fire
and far [ɑ:], and even layer and lair (if the [e:ə] is further reduced to
[ɛ:]) tend to become homophonous.
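The reductions listed above can be collected in a small lookup table. This is only a sketch: the table reproduces the five forms quoted in the text, and the names `SMOOTHING` and `smooth` are hypothetical. A lookup is used rather than a general rule because the last case ([əʊə] to [ɜ:]) does not follow the simple "drop the middle vowel and lengthen the first" pattern.

```python
# Full triphthong -> reduced form, as quoted in the text for RP
SMOOTHING = {
    "aʊə": "a:ə",   # sour
    "eɪə": "e:ə",   # sayer
    "aɪə": "a:ə",   # fire
    "ɔɪə": "ɔ:ə",   # lawyer
    "əʊə": "ɜ:",    # slower
}

def smooth(transcription):
    """Replace every listed triphthong in a transcription with its reduced form."""
    for triphthong, reduced in SMOOTHING.items():
        transcription = transcription.replace(triphthong, reduced)
    return transcription

for word in ["saʊə", "faɪə", "sləʊə"]:
    print(word, "->", smooth(word))
```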
The position of the velum can also be used as a criterion in
distinguishing vowels. In most of the situations the soft palate is raised,
so that oral vowels are produced, but if it is lowered, the change results
in the articulation of nasal vowels. In some languages oral vowels
contrast with nasal vowels – as in French, e.g., in the pair lait [lɛ]
‘milk’ vs. lin [lɛ̃] ‘flax’ (the nasal sound is marked by the ‘tilde’
symbol [~]). In English, nasalised vowels are always positional
variants: if a vowel precedes a nasal stop it will be produced with
lowered velum so as to anticipate the following consonant, as in seen
[sĩ:n].
4. English vowel sounds
Vowels have a tendency to move about in the articulatory space
much more than consonants. This variation depends both on the
regional origin of the speaker and on his social class and age group.
The number of vowels and their positions on the vowel chart differ
considerably from one English variety to another. Of the English
varieties, the RP vowel system is particularly rich (see Figure 5.3),
though the diphthongs have tended towards simplification.
Conservative RP is thus said to have 21 vowel sounds (12
monophthongs and 9 diphthongs). In more recent RP, speakers tend to
reduce the diphthongs [ɔə] and [ʊə] to [ɔ:] and [eə] to [ɛ:], so
that the newer form of RP only has 19 vowel sounds.
i:                      u:
    ɪ               ʊ
            ə            ɔ:
  ɛ         ɜ:
              ʌ
æ                      ɒ
                       ɑ:
Figure 5.3 RP pure vowels
4.1. RP front vowels
[i:] – high, long, tense, unrounded (e.g., in see).
[ɪ] – high, more central and lower than [i:]; short, lax,
unrounded (e.g., in bit).
[ɛ] – low-mid, short, lax, unrounded (e.g., in check).
[æ] – low, short, lax, unrounded (e.g., in cat).
4.2. RP back vowels
[u:] – high, long, tense, rounded (e.g., in boot).
[ʊ] – high, more central and lower than [u:]; short, lax, rounded
(e.g., in put).
[ɔ:] – low-mid, long, tense, rounded (e.g., in taught).
[ɒ] – low, short, lax, rounded (e.g., in got).
[ɑ:] – low, long, tense, unrounded (e.g., in father).
4.3. RP central vowels
[ʌ] – low-mid, short, lax, unrounded (e.g., in cut); it is closer to
the IPA vowel [ɐ] than to the cardinal [ʌ].
[ə] – mid, short, lax, unrounded (e.g., in about, verandah –
always in unstressed syllables).
[ɜ:] – mid, long, tense, unrounded (e.g., in fur, bird, in non-
rhotic varieties of English); in North American English (which is a
rhotic variety of English) a [ə] is often used followed by an ‘r’ sound,
represented as [ɚ].
4.4. RP centring diphthongs
[ɪə] – e.g., in fear.
[eə] – traditional RP (e.g., in fair); nowadays reduced to [ɛ:].
[ɔə] – traditional RP (e.g., in oar); nowadays reduced to [ɔ:].
[ʊə] – traditional RP (e.g., in poor or tour); nowadays reduced
to [ɔ:].
4.5. RP diphthongs falling to [I] and to [U]
[aɪ] – e.g., in pie.            [aʊ] – e.g., in cow.
[ɔɪ] – e.g., in coin.           [əʊ] – e.g., in know.
[eɪ] – e.g., in play.
5. Questions
1. Which are the main criteria used to classify vowels?
2. What is the difference between high and low vowels?
3. What is the difference between front and back vowels?
4. What is the difference between rounded and unrounded
vowels?
5. What is the cardinal vowel chart?
6. Which cardinal vowels do you know?
7. Which are the other criteria used to classify vowels?
8. How can vowels be classified according to length?
9. How is a tense vowel articulated?
10. What is Advanced Tongue Root?
11. What is the difference between a monophthong and a
diphthong?
12. Are there any triphthongs in English?
13. What kind of diphthongs do you know?
14. How is a nasalised vowel articulated?
15. Are there nasalised vowels in English?
16. Which are the vowel sounds of RP English?
17. Indicate the symbols representing the vowel sounds
described below:
a) low back round vowel; b) mid central unstressed short vowel;
c) high back short vowel; d) high front long vowel; e) mid back round
vowel; f) high central unround vowel; g) mid front unround vowel; h)
low front unround vowel; i) low-mid central stressed vowel; j) central
to high back diphthong; k) mid back to central diphthong; l) low front
to high front diphthong.
18. For each of the following symbols, find an adequate
description in words.
Example: [e] = high-mid front unround vowel
a) [æ] b) [ɛ] c) [o] d) [ ʊ] e) [↵] f) [←] g) [ ] h) [u:]
i) [ɪ] j) [aɪ] k) [ɑ:] l) [ɚ] m) [∝] n) [ ] o) [ ] p) [ɜ:]
q) [ɪə] r) [ʌ] s) [y]
19. Identify the difference in articulation between the following
sounds, grouped in two sets.
Example: The vowels in the set [e ɛ o ɔ] are mid non-central,
while the vowels in [ə ʌ ɜ] are mid central.
a) [y ↵ ʊ ←] vs. [ə a ɛ ɪ] b) [æ ɑ ] vs. [i ɛ ʊ]
c) [ɪ ʊ ɑ] vs. [i: u: ɑ:]
20. Transform the following transcriptions into orthographic
forms.
a) [pli:zd], b) [tʃɑ:ns], c) [tʰaɪmɪŋ], d) [θɹu:], e) [ældʒɪbɹə],
f) [kʰəʊm], g) [skɛəd], h) [fʌnɪ], i) [jɛstədeɪ], j) [dæʃt],
k) [ʒɑ:nɹə], l) [kʰ :t], m) [ðeə], n) [əkʌstəmd], o) [fl :ɹə].
21. Transcribe phonetically the following words in RP.
a) question b) threaten c) this d) yelling e) blurry f) congress
g) generosity h) phantom i) shiver j) jester k) chopper
l) casualties m) womb n) central o) thought p) social.
VI. PHONOLOGY
1. Phonetics vs. phonology
Unlike phonetics, which deals with the more or less universal
features of sounds, phonology studies the relationships and functions
of sounds, the way they are organized into patterns and systems and
the way they interact with each other. However, there is no clear-cut
boundary between the two disciplines of linguistics: in fact, one could
not separate the phonetic features of a sound from its phonological
environment, nor could one analyze a phonological process without
taking into account its phonetic characteristics.
2. Segmental vs. suprasegmental phonology
Sounds are not always seen as independent segments, since they
are usually organized in higher, more complex structures. If a
phonologist regards sounds as individual units (phonological
segments), he places his approach within the framework of segmental
phonology. If, on the contrary, he looks at sounds as parts of higher
units of organization, he does it from the perspective of
suprasegmental phonology (also known as prosody).
Suprasegmental phonology studies units of speech larger than sounds,
e.g., syllables, metrical feet, phonological words, phrases and
sentences, and phenomena which characterize them, such as pitch,
stress, tone, intonation, rhythm, etc.
3. Segmental phonology
3.1. Phonemes and their variants
If a speaker of English is asked to produce the word cup several
times, he will articulate the three sounds [k, ʌ, p] with slight, almost
imperceptible differences every time he utters the word (this can
easily be proven by means of a simple phonographic recording).
However, he will tend to ignore such differences and consider the
sounds identical. This is because the speaker will compare, e.g., the
types of [k] he articulates with a mental representation of [k] stored in
his mind (a common denominator of all the [k] sounds he has ever
produced or heard in his language) and decide that they should be
treated as the same thing.
Indeed, in the mind of the speaker of a certain language there
are abstract representations of the sounds used in the respective
language, listed in a sort of catalogue he consults on every occasion
a sound is produced. All the possible sounds of a language are referred
to such phonological categories, which are not palpable entities, like
the speech sounds we ourselves hear or articulate, but rather exist only
in our minds. These categories are described by phonologists as
invariants or phonemes, as opposed to all their possible concrete
phonetic realizations or materializations in the actual speech, which
are called variants or phones. By convention, phonemes are
transcribed within slashes (in broad transcription) and their variants
within square brackets (in narrow transcription).
We always strictly refer to the phonemes of one language and
not of languages in general, because each language has a different
grouping of the sounds into phonemes. A phonological category in a
language may be larger than the corresponding category in another
language. For instance, the English phoneme /p/ is the category to
which we refer both aspirated and non-aspirated [p] variants (e.g., the
[pʰ] in pan and the [p] in span). On the contrary, in a language like
Thai, [pʰ] and [p] belong to two different phonemes, one aspirated and
the other plain (non-aspirated) (/pʰ/ and /p/), as, for instance, in /pʰàa/
‘to split’ and /pàa/ ‘forest’. We know they are different because they
contrast: when one is replaced by the other in a word (= the
substitution or commutation test) there results a different word with
another meaning.
Such two words are said to make up a minimal pair, that is, a
pair of words that differ in just one respect (e.g., English /pæn/ pan vs.
/bæn/ ban, where /p/ and /b/ are different phonemes – they contrast in
an opposition of voicing). In some cases, certain sounds may have
limited occurrence, so there might be no minimal pairs to evince the
difference between these sounds. Instead, we could content ourselves
with near minimal pairs, where only the immediate phonetic
environment of the sounds concerned is identical. For instance, in
pressure [ˈprɛʃə] vs. pleasure [ˈplɛʒə] we can see the contrast
between /ʃ/ and /ʒ/, though the two words also differ by another
opposition (between /r/ and /l/). In this case, the immediate phonetic
environment is [ˈɛ__ə] for both /ʃ/ and /ʒ/.
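The substitution test behind minimal pairs can be expressed as a short check on two transcriptions. The sketch below assumes that each word is given as a list of segments of equal length and ignores stress; the function names are illustrative only.

```python
def differing_positions(a, b):
    """Indices at which two equal-length segment lists differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def is_minimal_pair(a, b):
    """True if the words have the same length and differ in exactly one segment."""
    return len(a) == len(b) and len(differing_positions(a, b)) == 1

pan = ["p", "æ", "n"]
ban = ["b", "æ", "n"]
pressure = ["p", "r", "ɛ", "ʃ", "ə"]
pleasure = ["p", "l", "ɛ", "ʒ", "ə"]

print(is_minimal_pair(pan, ban))                 # True
print(is_minimal_pair(pressure, pleasure))       # False: two segments differ
print(differing_positions(pressure, pleasure))   # [1, 3]
```

For pressure and pleasure the check fails because two positions differ, which is why the pair only counts as a near minimal pair.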
A phoneme, therefore, is an abstract representation of a class of
sounds whose members (variants) are highly similar phonetically and
never contrast functionally (i.e., never occur in the same
environment). Only sounds with a high degree of phonetic similarity
qualify as members of the same phoneme (e.g., aspirated and plain [p],
which only differ in one phonetic feature: aspiration). If two sounds
always occur in different contexts, but do not share enough phonetic
features, they cannot be the realizations of the same phoneme. For
instance, English /h/ is always syllable-initial, while English /ŋ/ is
only syllable-final, but physically they are completely different: one is
a voiceless glottal fricative and the other a voiced velar nasal stop, so
they could not be the variants of the same phoneme.
The difference between the English [p] and [pʰ] and the Thai
[p] and [pʰ] does not lie in the phonetic characteristics of these sounds,
i.e., in their physical traits. Both English and Thai use more or less the
same plain and aspirated types of voiceless bilabial plosive. We are
rather dealing with a difference in the two language systems, in the
way the speakers of the two languages group these phones in their
minds in one or two categories, i.e., one or two phonemes: /p/ and /pʰ/.
Graphically, this can be illustrated as in Figure 6.1:
   English         Thai
     /p/         /p/   /pʰ/      phonological level (phonemes)
  [p]  [pʰ]      [p]   [pʰ]      phonetic level (phones)
Figure 6.1 The phonological and phonetic levels
The phonetic and the phonological level coexist, i.e., speakers
use concrete sounds in accordance with the abstract role played by
these sounds in their language system. The concrete level of
representation has been conventionally called by linguists the ‘surface
level’ (= the level of phones, i.e., of sounds as they are actually
pronounced), while the abstract level has become known as the
‘underlying level’ (= the level of phonemes, i.e., of sounds as they are
systematically organized in the respective language).
3.2. Distribution
Variants (or phones) can be of different types, depending on
their distribution (= their occurrence in different environments or
contexts). For example, the aspirated and the non-aspirated [p] in
English never appear in the same environment: [pʰ] only shows up
when it is not preceded by [s], whereas [p] is always preceded by [s].
Such conditioned variants (or allophones) are in complementary
distribution. The occurrence of allophones is said to be predictable,
because in a certain environment only one variant of the phoneme is
expected to appear (they are context-bound). On the contrary, the
occurrence of phonemes is described as unpredictable (phonemes
have contrastive distribution in the same context: e.g., /p b k r m/,
etc. in initial position before /æn/ – in pan, ban, can, ran, man, etc.).
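Complementary distribution can be tested mechanically on a small word list: two phones are in complementary distribution if the sets of environments in which they occur do not overlap. The sketch below is a simplification resting on two assumptions: the environment is reduced to the preceding segment (with '#' standing for the beginning of a word), and the tiny word list is invented for the demonstration.

```python
def environments(phone, words):
    """Collect the preceding segment ('#' if word-initial) of every occurrence of phone."""
    found = set()
    for word in words:
        for i, segment in enumerate(word):
            if segment == phone:
                found.add(word[i - 1] if i > 0 else "#")
    return found

def in_complementary_distribution(a, b, words):
    """True if the two phones never share an environment in the word list."""
    return environments(a, words).isdisjoint(environments(b, words))

WORDS = [
    ["pʰ", "æ", "n"],        # pan
    ["s", "p", "æ", "n"],    # span
    ["b", "æ", "n"],         # ban
    ["pʰ", "ɪ", "t"],        # pit
    ["s", "p", "ɪ", "t"],    # spit
]

print(in_complementary_distribution("pʰ", "p", WORDS))   # True
print(in_complementary_distribution("pʰ", "b", WORDS))   # False
```

In this toy data [pʰ] and [p] never share an environment, while [pʰ] and [b] both occur word-initially, as expected for sounds that belong to different phonemes.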
Sometimes, variation is not related to positioning, being rather
unpredictable, yet not phonemic: this is the case of free variants.
Free variation is the different realization of one phoneme in various
dialects of the same language or in one person’s speech, in different
situations. Free variants are context-free and are not supposed to lead
to meaning contrasts: e.g., Northern English English [mʊd] mud vs.
Southern English English [mʌd] (regional variants); [pli:ð] please (as
uttered by a lisping person) vs. [pli:z].
4. Questions
1. What is the difference between segmental and
suprasegmental phonology?
2. What are phonemes?
3. What are (allo)phones / variants?
4. What is the relationship between two phonemes that can
occur in the same environment?
5. What is a minimal pair?
6. What is a near minimal pair?
7. What is the surface level of representation?
8. What is the underlying level of representation?
9. Which types of speech sound distribution do you know?
10. When are two sounds in contrastive distribution?
11. When are two sounds in complementary distribution?
12. When are two sounds in free variation?
13. When is the occurrence of a sound predictable?
VII. PHONOLOGICAL FEATURES
When they contrast in a minimal pair, phonemes oppose each
other in terms of one or more distinctive features (= phonological
properties): e.g., in /bɪt/ vs. /pɪt/, /b/ is voiced and /p/ is not; in /mi:t/
vs. /bi:t/, /m/ is nasal and /b/ is not; in /væn/ vs. /bæn/, /v/ is
continuant, while /b/ is not; in /bɛt/ vs. /wɛt/, /b/ is a consonant, while
/w/ is not, etc.
Thus, by contrasting /b/ with other sounds we can learn more
about what /b/ is and what it is not. In fact, we can arrive at a list of
inherent features which characterize this sound, which we might
consider equal to the phoneme /b/. This means that we can regard /b/
as a unit (a phoneme) decomposable into smaller constitutive elements
(its distinctive features).
Based on their constitutive features, phonemes are more or less
alike, i.e., they share more or fewer properties. The more properties two
phonemes share, the higher the chance for them to belong to the same
class of sounds.
Thus, /b/, /p/, /m/ and /v/ are all consonants, therefore they can
all be represented as [+consonantal]; /w/, however, is a glide (it only
resembles consonants in its behavior) and like vowels it can be
described as [–consonantal].
Secondly, /b/, /p/ and /m/ are all non-continuant sounds (they
are stops), so they can all be characterised as [–continuant]; /v/ and
/w/, on the other hand, are [+continuant] (in the articulation of
fricatives, glides, and a few other sounds the air is released
continuously, without complete obstruction).
Thirdly, /m/ is [+nasal] because it is articulated with a lowered
velum, while /b/, /p/, /v/ and /w/ are oral sounds, therefore [–nasal].
Similarly, in the articulation of /p/ the vocal cords do not vibrate
(so /p/ is to be described as [–voice]), but /b/, /m/, /v/ and /w/ are
voiced, therefore [+voice].
Finally, /b/, /p/, /m/, /v/ and /w/ are articulated by means of lip
movement, so they all belong to the class of [labial] sounds. At the
same time, /w/ also belongs to the [dorsal] class (it is a labio-velar).
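The feature values assigned to these five sounds can be stored in a small table and queried for classes of sounds. The dictionary layout and the function name `natural_class` are assumptions made for illustration; the values themselves are those given in the preceding paragraphs.

```python
# Binary features as True/False; place features as a set of labels
FEATURES = {
    "b": {"consonantal": True,  "continuant": False, "nasal": False, "voice": True,  "place": {"labial"}},
    "p": {"consonantal": True,  "continuant": False, "nasal": False, "voice": False, "place": {"labial"}},
    "m": {"consonantal": True,  "continuant": False, "nasal": True,  "voice": True,  "place": {"labial"}},
    "v": {"consonantal": True,  "continuant": True,  "nasal": False, "voice": True,  "place": {"labial"}},
    "w": {"consonantal": False, "continuant": True,  "nasal": False, "voice": True,  "place": {"labial", "dorsal"}},
}

def natural_class(**wanted):
    """Return the sounds whose binary features match every requested value."""
    return sorted(
        sound for sound, features in FEATURES.items()
        if all(features[name] == value for name, value in wanted.items())
    )

print(natural_class(continuant=False))          # ['b', 'm', 'p']
print(natural_class(voice=True, nasal=False))   # ['b', 'v', 'w']
print(natural_class(consonantal=False))         # ['w']
```

The fewer features a query needs, the larger and more general the class it picks out.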
Sounds, therefore, can be grouped in several ways according
to their features. Phonologists, starting from the discoveries of
phoneticians, have tried not simply to list sound features at
random, but rather to associate them in categories (clusters) that are
relevant for the hierarchy in which the phonological system of a
human language is organized. Thus they have come to rank features
according to the role they play in the system.
Since one of the most important oppositions in the phonological
system is that of vowels vs. consonants, the feature [consonantal], for
instance, which distinguishes between the two classes of sounds, has
been given pride of place. Another feature illustrated above, [nasal], is
hierarchically subordinated to [consonantal], since it is used to
subdivide some consonants (or vowels) into nasal and oral. The same
is true about the features [voice] or [continuant]. Features like [labial]
and [dorsal], which strictly refer to the place of articulation of a
consonant, are commonly subordinated to other features characteristic
of consonants.
The feature hierarchy depends on the natural grouping of
sounds into classes, which make up the (segmental) phonological
system: e.g., obstruents, sonorants, stops, nasals, etc. Sounds are
grouped according to their articulatory characteristics, but also
depending on the way they behave in phonological processes. For
example, alveolar and dental sounds can undergo a phenomenon called
‘palatalization’ (a type of assimilation) by which they turn into alveo-
palatals or palatals (e.g., in the pronunciation ki[ʃ] you of kiss you –
see also Section IV.3.1). Besides, these places of articulation are, of
course, contiguous and the position of the tongue is not very dissimilar
in the articulation of these sounds. For these reasons they are grouped
together under the label [coronal].
The most widely known system of phonological properties is
the one proposed in Chomsky and Halle’s work (1968) The Sound
Pattern of English (in short SPE), taken over and amended by
numerous phonologists who followed in their footsteps. For example,
in the SPE model segments were viewed as consisting simply of a list
of binary features (= with two possible values: + or –), as illustrated
above by [+nasal]/[–nasal], [+voice]/[–voice], etc. Later on, as already
emphasized, linguists understood that phonological features are
hierarchically ordered in the system.
Phonologists have also insisted on the avoidance of redundancy
in feature specification, stating that some features are simply implied
by others and should not be mentioned. For instance, since all
sonorant sounds are voiced, it would usually be superfluous to
describe a sound as [+voice] once it has already been described as
[+sonorant]. However, there are situations where the sonorant is
devoiced (e.g., if followed by a voiceless sound), and in such cases the
[voice] specification will indeed be necessary.
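The idea of leaving out redundant features and filling them in by rule can be sketched as follows. The rule list contains only the single implication mentioned in the text ([+sonorant] implies [+voice]); the data layout and the function name `fill_in` are hypothetical.

```python
# Each rule: (condition, implied feature, implied value)
REDUNDANCY_RULES = [
    ({"sonorant": True}, "voice", True),
]

def fill_in(spec):
    """Return a copy of spec with implied features added unless already specified."""
    full = dict(spec)
    for condition, feature, value in REDUNDANCY_RULES:
        applies = all(full.get(name) == wanted for name, wanted in condition.items())
        if applies and feature not in full:
            full[feature] = value
    return full

# /m/: [voice] is left unspecified and supplied by the rule
print(fill_in({"sonorant": True, "nasal": True}))
# a devoiced sonorant: the explicit [-voice] is kept, the rule does not override it
print(fill_in({"sonorant": True, "voice": False}))
```

An explicit specification always wins over the default, which mirrors the case of devoiced sonorants described above.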
In the SPE model, the features characterizing a segment were
organized into a feature matrix representation in which they were
listed along with their value (either + or –) for the respective segment.
For example, in the spirit but not exactly the letter of the SPE, the
feature matrix for the English consonant /b/ could be described as
containing the following properties:
/b/   –syllabic
      +consonantal
      –sonorant
      –continuant
      –del. release
      LABIAL
      +voice
1. Major class features
Already in the SPE approach features were grouped according
to their higher or lower degree of general applicability. Those features
which apply to all sounds are those which distinguish the so-called
major classes of speech sounds: obstruents, sonorant consonants,
glides, and vowels.
Vowels can be described as [+syllabic], because they
characteristically occur in syllable nuclei (= centers). Other
sounds also become [+syll] when they behave in the same way as
vowels. They are mainly syllabic sonorant consonants, like those in
button [bʌtn̩] or bottle [bɒtl̩] (in English, generally in word-final
unstressed syllables).
In order to distinguish obstruents, liquids and nasals from
vowels and glides, the feature [consonantal] was introduced: [+cons]
sounds are articulated with a high degree of stricture.
The third major class feature, [sonorant], is the one which
allows us to distinguish vowels, glides, liquids and nasals [+son] from
obstruents (oral stops, fricatives and affricates: [–son]). Sonorants are
produced with a higher degree of sonority and they display a clear
formant pattern in the acoustic spectrum – they have relatively more
periodic acoustic energy.
By combining the three features we can characterize each major
class of segments in a particular way.
A feature which has also been introduced as a major class
feature is [approximant] (= frictionless continuant), used to
set liquids and glides ([+approximant]) apart from nasals.
            vowels   glides   sonorant consonants   obstruents
                              liquids    nasals
[syll]        +        –         –          –           –
[cons]        –        –         +          +           +
[son]         +        +         +          +           –
[approx]      +        +         +          –           –
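The table above can also be read as a lookup from feature values to major classes. The following Python sketch is only an illustration of that reading: the dictionary layout and the function name `major_class` are assumptions made for this example and are not part of the SPE notation.

```python
# Major class features, following the table above.
# Key: (syll, cons, son, approx), each True for '+' and False for '-'.
MAJOR_CLASSES = {
    (True, False, True, True): "vowel",
    (False, False, True, True): "glide",
    (False, True, True, True): "liquid",
    (False, True, True, False): "nasal",
    (False, True, False, False): "obstruent",
}


def major_class(syll, cons, son, approx):
    """Return the major class for a feature combination, or None if the
    combination does not appear in the table."""
    return MAJOR_CLASSES.get((syll, cons, son, approx))


if __name__ == "__main__":
    # [-syll, +cons, +son, -approx] picks out the nasals.
    print(major_class(syll=False, cons=True, son=True, approx=False))
```

A combination that is absent from the table, such as [+syll, +cons, –son, –approx], simply returns `None` in this sketch.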
2. Consonantal features
Because of the numerous differences between the articulation of
consonants and that of vowels, their features are usually presented in
separate lists. We will start with consonants.
2.1. Voice
Although it is a general feature which applies to all classes of
sounds, [voice] is mostly used to distinguish between voiceless and
voiced obstruents. As already stated, [+voice] sounds are produced
with vocal cord vibration. They typically include the vowels, the
glides, the sonorant consonants and the voiced obstruents. However,
some languages make use of voiceless vowels or voiceless sonorants.
2.2. Manner features
There are five manner features to be discussed here:
[continuant], [delayed release], [strident], [nasal], and [lateral].
To account for manner of articulation differences between
sounds, e.g., in the obstruent series /t/, /s/, and /ts/, new features were
introduced instead of the phonetic labels [stop], [fricative] and
[affricate], namely [continuant] and [delayed release], which refer to
the degree of aperture in the oral tract and to the duration of the sound,
respectively.
Thus, a stop, which is pronounced with a complete obstruction
of the airflow, can be described as [–cont, –del rel], a fricative (which
is articulated with incomplete stricture) as [+cont, –del rel], and an
affricate (which starts as a stop and ends as a fricative and takes
longer than the other obstruents) as [–cont, +del rel].
The feature [delayed release] is strictly relevant in describing
the difference between the articulation of a stop and that of an
affricate, whereas [continuant] applies to all sounds: [+cont] sounds
are those in the articulation of which there is a free airflow through the
oral tract: vowels, glides, liquids and fricative obstruents.
One more feature, [strident], was introduced in the list of
manner features to pinpoint the difference between the relatively
turbulent [+strid] sounds (those fricatives and affricates whose
articulation involves a complex kind of constriction, resulting in
continuous noisy or hissing airflow), e.g., /f v s z ʃ ʒ ts dz tʃ dʒ/,
and the [–strid] sounds (fricatives only), which have less
high-frequency noise, e.g., /ɸ β θ ð x ɣ h/.
The following two features are mainly used to distinguish
sonorants. Above we mentioned the feature [nasal]. [+nas] sounds are
those articulated with lowered velum, so that the airflow can pass both
through the oral cavity and through the nose. In English and
Romanian, for instance, [nasal] is only distinctive for consonants, but
there are other languages in which it can also distinguish vowels, e.g.,
French.
The feature [lateral] is used to separate ‘l’-sounds from other
liquids (and also from the rest of the sounds). It refers to the lateral
release of the airflow – i.e., by the sides of the tongue.
2.3. Place features
The numerous articulatory labels used by phoneticians were
replaced in the SPE model by only two binary features, [anterior] and
[coronal]. Chomsky and Halle described as [+ant] those sounds which
are produced no further back in the oral tract than the alveolar ridge
(labials, alveolars and dentals), while [+cor] was introduced to refer to
sounds produced in the area delimited by the teeth and the hard palate
(alveolars, dentals and alveo-palatals). This caused palatals, velars,
uvulars, pharyngeals and glottals to be characterized together as [–ant,
–cor]. Later on, due to the similarities noticed in the phonological
behavior of alveolars and palatals (see above), the latter were also
included in the group of [+coronal] sounds. They were distinguished
by means of the vowel-specific features [high], [low] and [back].
Instead of the two binary features of the SPE approach, later
accounts have assumed that it is more adequate and more
economical to base the classification on the active articulators.
Thus the features [labial] (= with the lips), [coronal] (= with the crown
/ blade of the tongue), [dorsal] (= with the body (dorsum) of the tongue) and
[guttural] (= with the tongue root) came to be employed as
unary (= single-value) features. Place features are now unary because
phonologists have come to the conclusion that there is no point in
specifying a sound for anything but its own place of articulation (e.g.,
in the old system of notation, /b/ would have been [+anterior] but also
[–coronal]). Unary place of articulation features can also co-occur:
e.g., /w/, which has double articulation, can be described as both
[labial] and [dorsal].
The feature [anterior] has not been altogether abandoned,
however, but now it is used exclusively to subcategorize the class of
coronals.
        dental   alveolar   alveo-palatal   retroflex   palatal
[ant]     +         +             –             –          –
Another feature originally proposed in SPE which has proved to
be useful in distinguishing coronals is [distributed]. Tongue-blade
(laminal) sounds and non-retroflex sounds are thus considered to be
[+distr], whereas tongue-tip (apical) sounds and retroflex sounds are
described as [–distr]. This feature is particularly useful for stops, since
for fricatives [strident] is already sufficient to characterize the
oppositions found in languages.
3. Vowel features
The following features are mainly relevant in the description of
vowels (in terms of height, backness, roundness and length – see
Chapter V), but they have also been used to distinguish consonants.
The feature [+high] applies to those sounds which involve
raising the body of the tongue above the so-called ‘neutral’ position
(roughly the position characterizing the articulation of the schwa),
e.g., the high vowels, the glides, the velar consonants, etc.
[+low] applies to sounds in the articulation of which the body of
the tongue is lowered from the neutral position, e.g., the low vowels
and the pharyngeal and glottal consonants.
We use [+back] to refer to sounds produced by retracting the
body of the tongue from the neutral position, e.g., the back vowels, the
velar, uvular and pharyngeal consonants.
The feature [+front] describes those sounds which involve the
fronting of the body of the tongue from the neutral position, e.g., the
front vowels. This feature is not accepted by all accounts (including
the SPE), but it is useful in characterizing central vowels, in
combination with the feature [back] (central vowels can thus be
defined as [–back, –front]).
[+round] sounds are articulated with rounded protruding lips,
e.g., the rounded vowels and the labial-velar glide /w/.
In order to distinguish long vowels from short ones, we may use
the feature [tense], first proposed in SPE: [+tense] sounds are
produced with a lot of muscular effort – a considerable tensing of the
body of the tongue – in comparison to the so-called ‘lax’ vowels
([–tense]), and they imply a greater deviation from the neutral relaxed
state of the tongue. This increased muscular effort allows for a longer
and more peripheral sound to be articulated (e.g., the vowel [uː] in
boom [buːm]) rather than a shorter and more centralized lax vowel
(e.g., [ʊ] in put [pʊt]) (see Figure 5.3).
The feature [tense] seems to apply well in RP: the [–tense]
vowels of RP form a class (including [ɪ ɛ æ ə ʌ ɒ ʊ]), which is
proven by the fact that they cannot occur in final position in a stressed
syllable, while the [+tense] vowels of RP can (e.g., [fiː] vs. *[fɪ]).
Similarly, [–tense] vowels occur before the velar nasal [ŋ], but
[+tense] vowels do not (e.g., [sʌŋ] vs. *[suːŋ]).
An idealized ten-vowel system based on the distinction of
tenseness will contain a set of [–tense] ‘central’ vowels ([ɪ ɛ ə ɔ ʊ])
and one of [+tense] ‘peripheral’ vowels ([i e ɑ o u]), as in the
following representation:
[Figure: vowel chart of the ten-vowel system, with the [+tense] vowels on the periphery and the [–tense] vowels nearer the centre]
There are other vowel systems, however, with a different type
of organization of the very same vowels. Many languages do not
divide the set of vowels into a tense and a lax subset. Instead, they
oppose two subsets according to the position of the tongue root (and
the feature [Advanced Tongue Root]). Some vowels are [+ATR]
([i e ɜ o u]), whereas others are [–ATR] ([ɪ ɛ a ɔ ʊ]) (see below).
[Figure: vowel chart of the same system, divided into [+ATR] and [–ATR] vowels]
The feature [ATR] is sometimes used nowadays to describe
English vowels instead of the feature [tense], since, as already stated,
the advanced position of the tongue root determines the simultaneous
raising of the tongue body (which, by definition, characterizes tense
vowels).
4. Summing up
The features presented in this chapter are phonologically
relevant. They can be successfully used to identify natural classes of
sounds. For instance, the set of English nasal consonants [m n ŋ] shares
the features [+cons, +son, –approx, +nasal] and constitutes a natural
class because there are no other sounds in this language to fit this
description. Similarly, [tʃ] and [dʒ] are the typical English sounds
describable as [+cons, –son, –cont, +del rel].
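The idea of a natural class can be made concrete with a small lookup. The sketch below is illustrative only: it covers a handful of English consonants with values copied from Table 7.2, and the names `SEGMENTS` and `natural_class` are assumptions introduced for this example.

```python
# Feature values for a few English consonants, copied from Table 7.2.
# The choice of segments and features is a simplification for this sketch.
FEATURE_NAMES = ("cons", "son", "approx", "cont", "del rel", "nasal")
SEGMENTS = {
    "p":  "+-----",
    "s":  "+--+--",
    "tʃ": "+---+-",
    "dʒ": "+---+-",
    "m":  "++---+",
    "n":  "++---+",
    "ŋ":  "++---+",
    "l":  "++++--",
    "w":  "-+++--",
}


def natural_class(spec):
    """Return the set of segments whose features match every value in spec.

    spec maps feature names to '+' or '-', e.g. {"son": "+", "nasal": "+"}.
    """
    matches = set()
    for segment, values in SEGMENTS.items():
        features = dict(zip(FEATURE_NAMES, values))
        if all(features[name] == value for name, value in spec.items()):
            matches.add(segment)
    return matches


if __name__ == "__main__":
    # [+cons, +son, -approx, +nasal] should pick out the nasals.
    print(natural_class({"cons": "+", "son": "+", "approx": "-", "nasal": "+"}))
```

Note that a shorter specification generally selects a larger class: the fewer features are mentioned, the more segments fit the description.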
From now on, instead of enumerating sounds, we will often
refer to them via their feature specifications. As will soon become
obvious, this approach is considerably more economical and allows us
to capture interesting generalizations on whole classes of sounds.
Tables 7.1 and 7.2 present a summary of the features of various
kinds of English sounds. Further on, the main features introduced so
far are presented in the shape of a tree.
Table 7.1 Features of English RP vowels
Features  iː  ɪ   ɛ   æ   ɑː  ɒ   ɔː  uː  ʊ   ʌ   ə   ɜː
high      +   +   –   –   –   –   –   +   +   –   –   –
low       –   –   –   +   +   +   –   –   –   +   –   –
back      –   –   –   –   +   +   +   +   +   –   –   –
front     +   +   +   +   –   –   –   –   –   –   –   –
round     –   –   –   –   –   +   +   +   +   –   –   –
tense     +   –   –   –   +   –   +   +   –   –   –   +
Table 7.2 Features of English RP consonants
Feature  p  b  f  v  t  d  s  z  θ  ð  ʃ  ʒ  tʃ dʒ k  g  h  m  n  ŋ  ɹ  l  w  j
cons     +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  –  –
son      –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  +  +  +  +  +  +
approx   –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  +  +  +
voice    –  +  –  +  –  +  –  +  –  +  –  +  –  +  –  +  –  +  +  +  +  +  +  +
cont     –  –  +  +  –  –  +  +  +  +  +  +  –  –  –  –  +  –  –  –  +  +  +  +
del rel  –  –  –  –  –  –  –  –  –  –  –  –  +  +  –  –  –  –  –  –  –  –  –  –
stri     –  –  +  +  –  –  +  +  –  –  +  +  +  +  –  –  –  –  –  –  –  –  –  –
nas      –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  +  +  –  –  –  –
lat      –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  –  –
labial   ✓  ✓  ✓  ✓                                         ✓              ✓
cor                  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓              ✓     ✓  ✓     ✓
ant                  +  +  +  +  +  +  –  –  –  –              +     –  +     –
distrib              –  –  –  –  +  +  –  –  +  +              –     –  –     –
dorsal                                             ✓  ✓           ✓        ✓
high     –  –  –  –  –  –  –  –  –  –  +  +  +  +  +  +  –  –  –  +  –  –  +  +
low      –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  –  –  –  –  –  –  –
back     –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  +  –  –  –  +  –  –  –  –
round    –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  –  +  –
[–syll]
    [+cons]
        [–son] (obstruents)
            [–cont]
                [–del rel] (stops)
                [+del rel] (affricates)
            [+cont] (fricatives)
        [+son] (sonorant consonants)
            [+nas] [–approx] (nasals)
            [–nas] [+approx] (liquids)
                [+lat] (laterals)
                [–lat] (rhotics)
    [–cons]
        [+son] [+approx] (glides)
[+syll]
    [–cons]
        [+son] (vowels)
5. Questions
1. What is a distinctive feature? What is a binary feature? What
is a unary feature?
2. What is a matrix representation?
3. How many types of features do you know?
4. Which are the major class features?
5. Which manner features do you know? What role does each of
them play?
6. Why are unary features preferable in place descriptions?
7. What role does the feature [anterior] play in the recent
approach?
8. Which vocalic features do you know? Do they only apply to
vowels?
9. What is the difference between the features [tense] and
[ATR]?
10. Decide whether the following sets form natural classes or
not. Which features would you use to describe them?
a) /p t k tʃ/; b) /l m ŋ r n/; c) /b f ʃ h/; d) /u ʊ ∝ o ɑ /;
e) /b t m l j ʒ/; f) /u ʊ w/; g) /f s ts ʃ tʃ/; h) /i ɪ y j ⎞ ←/;
i) /l n ɛ w/; j) /æ ɑ/; k) /β v z ð ʒ ɣ/; l) /b d g/.
11. Decide which sounds are represented by the following
feature matrices:
a) cor      b) labial      c) –cont       d) dorsal   e) –syll
   +ant        –del rel       –del rel       –cont       –cons
   +cont       +voi           –voi           +son        +son
12. Identify the features which distinguish the following
sounds:
a) /t/ and /tʃ/; b) /β/ and /v/; c) /i/ and /j/; d) /k/ and /x/;
e) /s/ and /ʃ/; f) /θ/ and /ð/; g) /h/ and /ʔ/; h) /l/ and /r/;
i) /g/ and /ŋ/; j) /w/ and /j/.
13. Provide feature matrices for the following sounds:
a) /p/; b) /ŋ/; c) /z/; d) /dʒ/; e) /l/; f) /h/; g) /m/;
h) /k/; i) /ʃ/; j) /ð/; k) /v/; l) /ɹ/; m) /w/; n) /ʔ/;
o) /d/; p) /j/; q) /ɒ/; r) /ʌ/; s) /iː/; t) /ɛ/; u) /ɔː/;
v) /ə/; w) /ɑː/; x) /uː/; y) /æ/; z) /ɜː/.
VIII. PHONOLOGICAL RULES
1. Rule writing
Sounds used in spoken communication may be more or less
similar to the corresponding phonemes. Sometimes speech may be
hard to understand because of the numerous phonetic ‘accidents’
which can occur in various environments, changing or even
completely deleting sounds. Think for instance of the following
sentence: Did you arrive safely? In very careful pronunciation, this
sentence could be transcribed as [dɪd juː əɹaɪv seɪflɪ], but in fast
coarticulated speech it will sound more like [dɪdʒəɹaɪfseɪflɪ]
(including several instances of assimilation and deletion).
Despite the differences between the first and the second
pronunciation of this sentence, a speaker of English will be able to
interpret them in a similar way. This is because in the mind of a
speaker, apart from the set of phonemes characteristic of his language
(the underlying structure), there is also a set of rules which he can
apply in order to generate the spoken sounds (the surface structure).
These rules also help the speaker ‘reconstruct’ the phonemes and
interpret the message attached to them.
The two levels of representation introduced in Section VI.3.1
(the underlying level and the surface level) are thus linked by a set of
rules characteristic of a certain language, that is, a set of explicit
statements (predictions) about the way particular (allo)phones
represent particular phonemes in the respective language. By means of
these rules, speakers are able to use the sounds of their language in an
appropriate way. We can also say that they derive the phonetic
representation from the phonological representation by applying the
rules (see also Section VIII.5).
Underlying representation
↓
Rule(s)
↓
Surface representation
Essentially, rules state that some (input) item (e.g., A) becomes
(→) some other (output) item (e.g., B) in some specific environment
(e.g., X__Y). Such a statement can formally be represented in the
following way:
A → B / X __Y
Here the slash (/) precedes (marks) the environment. X and Y
stand for two variables (the left-hand and the right-hand environment),
and the underscore ( __ ) represents the position of the item which
undergoes the rule – in this case: A.
An illustration is offered by the regional nasalization of
English vowels before nasal stops. The underlying form
/pɛn/ may be realized as [pɛ̃n], for instance in the Southern United
States. We may therefore write the following rule:
a. /ɛ/ is nasalized when followed by /n/.
b. /ɛ/ → [ɛ̃] / __ /n/
Moreover, taking into account that this phenomenon affects all
vowels of English preceding all kinds of nasals, we can raise our rule
to a higher degree of generalization, using the phonological features
introduced in Chapter VII.
a. A vowel is nasalized when followed by a nasal.
b. [+syll] → [+nas] / __ [+nas]
Rules are usually written in terms of the relevant features, not of
the whole feature matrices represented by sounds (in order to avoid
redundancy and to increase the explanatory power). Thus, to represent
vowels we only picked [+syllabic] because vowels are the only speech
sounds which typically form a syllable nucleus. For nasals we picked
the feature [+nasal], which distinguishes them from the rest of the
sounds; besides, in English there are only nasal stops, so any
additional features describing stops in particular would have been
redundant.
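The general nasalization rule, [+syll] → [+nas] / __ [+nas], can be simulated on a list of segments. The following is a minimal sketch under stated assumptions: the vowel inventory is deliberately small, each segment is a single symbol, and the function name `nasalize` is invented for this illustration.

```python
# A small, incomplete set of vowel symbols, assumed for this sketch.
VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "ə", "ʌ", "ɑ", "ɒ", "ɔ", "ʊ", "u"}
NASALS = {"m", "n", "ŋ"}
TILDE = "\u0303"  # combining tilde, the nasalization diacritic


def nasalize(segments):
    """Apply [+syll] -> [+nas] / __ [+nas] to a list of segments."""
    output = []
    for index, segment in enumerate(segments):
        # The right-hand environment: the segment that follows, if any.
        following = segments[index + 1] if index + 1 < len(segments) else None
        if segment in VOWELS and following in NASALS:
            output.append(segment + TILDE)
        else:
            output.append(segment)
    return output


if __name__ == "__main__":
    print("".join(nasalize(["p", "ɛ", "n"])))
```

Because the rule mentions only [+syll] and [+nas], the same function covers every vowel and every nasal in the inventory, which is the generalization the feature notation is meant to capture.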
2. Selecting the underlying form
Now the question arises: on which criteria did we select the
underlying form? In other words: why did we pick the non-nasalized
vowel to be the underlying form and the nasalized one to be the
surface form and not the other way round?
Although no formula has been found yet to work without fail, as
sometimes there might be more than a single right answer, several
guiding principles have been suggested so far, on the basis of which we may
identify the best candidate for the underlying item.
1. First of all we have to make sure we are dealing with the
allophones of one single phoneme. For this we need to see if the
sounds are in complementary distribution and also share a great
number of features (i.e., if the sounds are phonetically similar).
2. Then we have to apply the principle of phonetic naturalness
(which refers to what is likely to be found or frequently found across
languages). According to this principle, the symbol chosen to
represent the phoneme must have as much in common with the surface
forms as possible. For instance, if we want to represent the underlying
form realized on the surface level as oral [ɛ] or nasalized [ɛ̃], we
should not pick a random symbol, such as ‘2’ or ‘*’, but a symbol that
represents the largest number of the features of the two allophones,
i.e., a symbol which usually stands for a low-mid front short vowel,
which cannot be far from ‘ɛ’ itself.
3. It follows that the symbol representing the phoneme should in
fact be the same as one of the symbols representing the allophones.
This way, we can explain the other allophones and their distribution in
opposition to this basic form and its own distribution. In the example
above, we would have to pick either the oral or the nasalized vowel
symbol to represent the phoneme.
4. Of several allophone symbols, the simplest is usually
preferred for the underlying representation, i.e., the one that has
nothing added to its basic shape. From this point of view, in the case
of the two vowels, ‘ɛ’ would be more appropriate to stand for the
phoneme, as it lacks the additional tilde diacritic.
5. It is usually the form with the widest distribution (the
allophone which occurs in the largest number of environments) that is
selected to also represent the phoneme. In our example, [ɛ] can be
followed by any kind of consonant except a nasal, while [ɛ̃] is
found only before a nasal consonant. According to this criterion,
we come to the conclusion that the unnasalized allophone [ɛ] must be
chosen to also represent the phoneme, since the number of
environments of [ɛ] is far larger than the number of environments of
its nasalized counterpart.
6. The principle of process naturalness is also applicable
whenever we need confirmation for the underlying form already
identified by using the other criteria. A cross-linguistic analysis will
confirm the supposition that there is indeed a natural tendency for
unnasalized vowels to be nasalized when followed by a nasal.
7. The same phenomenon (nasalization) applies to all English
vowels. This regularity is usually referred to as pattern congruity and
is itself often worth adopting as a general guiding principle in the
phonemic analysis.
3. Phonological alternations
There are many kinds of phonological alternations, as there are
various kinds of phonological processes. Some of the alternations are
purely phonetically conditioned, others are phonetically and
morphologically conditioned, while a third type is
phonetically, morphologically and lexically conditioned.
3.1. Phonetically conditioned alternations
An example of this category is the alternation between
unnasalized and nasalized vowels in English (see above). The only
cause which determines the nasalization is the presence of a nasal
consonant immediately after the vowel. This means that the
phenomenon of nasalization occurs irrespective of the morphological
structure of the word: it is simply conditioned by the phonetic
environment.
Other examples of alternations of the same type include
aspirated vs. non-aspirated voiceless stops, the lateral and nasal
release of stops (e.g., in battle or rotten), the phenomenon of
‘flapping’ characteristic of North American English, Northern Irish
and Australian English (e.g., in wa[ɾ]er (water)), the assimilation of
the English alveolar nasal /n/ to the place of articulation of the
following labial or velar consonant (e.g., i[m+p]eace (in peace)),
‘clear’ vs. ‘dark l’, etc.
3.2. Phonetically and morphologically conditioned alternations
A word is made up of one or several morphemes (units
contained in the word with identifiable meanings), e.g., in the word
input, the prefix in- is a morpheme with one meaning, while
the root -put is another morpheme, with a meaning of its own;
therefore, morpho-phonologically we can represent the word as
/ɪn+pʊt/ (where the symbol ‘+’, called juncture, is used to mark the
morpheme boundary). A conditioned variant of a morpheme is called
an allomorph.
The English noun plural morpheme (orthographic ‘(e)s’) has
three allomorphs: [s], [z], and [ɪz], depending on the nature of the
preceding segment. If the noun ends in a sibilant (i.e. [s], [z], [ʃ], [ʒ],
[tʃ], or [dʒ]), the plural takes the form [ɪz]; if it ends in a voiceless
non-sibilant, the plural is [s]; and if the final segment is a voiced
non-sibilant, the form of the plural is [z]. This means that a word ending,
for instance, in a nasal or a vowel will automatically take the plural
allomorph [z].
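The choice among the three plural allomorphs can be expressed as a short decision procedure. The sketch below is a hedged illustration: it assumes that the final segment of the noun is passed in as an IPA symbol, and the function name `plural_allomorph` and the two segment sets are introduced here only for the example.

```python
# Final segments that trigger each allomorph, as described above.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
# Voiceless non-sibilants of English (assumed inventory for this sketch).
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_segment):
    """Return the regular plural allomorph for a noun ending in final_segment."""
    if final_segment in SIBILANTS:
        return "ɪz"
    if final_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    # Everything else is a voiced non-sibilant: voiced obstruents,
    # nasals, liquids, glides and vowels.
    return "z"


if __name__ == "__main__":
    for noun, final in [("cat", "t"), ("dog", "g"), ("judge", "dʒ"), ("bee", "iː")]:
        print(noun, plural_allomorph(final))
```

The order of the tests matters: the sibilant check must come first, since [s] and [ʃ] are themselves voiceless and would otherwise wrongly receive [s].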
On the other hand, when we hear the English words [dɑːns] and
[keɪs] we do not have to dismiss them as ill-formed plural forms,
because we can interpret them as the mono-morphemic (= made up of
one morpheme) singular forms dance and case. Therefore, the
phonetic alternation introduced at the beginning of this section, though
perfectly motivated by the environment, is exclusively valid in the
case of the plural marker allomorphs.
Other alternations of a similar kind in English are, e.g., the
third person singular present tense markers [s/z/ɪz] and the regular
past tense markers [t/d/ɪd].
3.3. Phonetically, morphologically and lexically conditioned alternations
Consider the following English singular and plural noun forms:
wolf [wʊlf] – wolves [wʊlvz]    gulf [gʌlf] – gulfs [gʌlfs]
wife [waɪf] – wives [waɪvz]     still-life [ˈstɪllaɪf] – still-lifes [ˈstɪllaɪfs]
leaf [liːf] – leaves [liːvz]    belief [bɪˈliːf] – beliefs [bɪˈliːfs]
Apparently, there is no phonetic or morphological difference
between the words in the left-hand column and those in the right-hand
column that would motivate this erratic behaviour. And yet native
speakers of English do know that in the cases exemplified in the
left-hand column they have to apply voicing to the labio-dental fricative
when they add the plural suffix. This means that there is a list of items
in the lexicon (= the set of words) contained in the speakers’ minds
which are specified for this irregular type of plural marking, a list
which is transmitted from parent to child as a pre-established
convention. The assimilatory voicing phenomenon is not restricted to
the voiceless labio-dental fricative, as it also applies to its alveolar and
dental counterparts.
path [pɑːθ] – paths [pɑːðz]        moth [mɒθ] – moths [mɒθs]
house [haʊs] – houses [ˈhaʊzɪz]    boss [bɒs] – bosses [ˈbɒsɪz]
The explanation lies in the diachronic evolution of English.
These plural forms are exceptions to the general plural-forming rule
which have been inherited from earlier stages of English, when a rule
applied according to which intervocalic voicing was obligatory. That
this is so is proven by the fact that this type of plural formation is no
longer productive (i.e., it cannot apply to newly-formed nouns, which
automatically build their plural according to the common present-day
plural rule presented in Section VIII.3.2).
Other alternations of this type in English are velar softening
(the process by which the velar stop [k] is fronted and fricativized to
the alveolar fricative [s] before a high front (palatal) vowel sound),
e.g., in ethnic [ˈɛθnɪk] / ethnicity [ɛθˈnɪsɪtɪ], and trisyllabic
shortening, e.g., in nature [ˈneɪtʃə] / natural [ˈnætʃəɹəl], docile
[ˈdəʊsaɪl] / docility [dəʊˈsɪlɪtɪ], serene [sɪˈɹiːn] / serenity
[sɪˈɹɛnɪtɪ], etc. All these alternations are so-called ‘fossilized’
remnants of phonological processes once productive in the history of
English.
There are also other irregularities among the plural noun forms
in contemporary English, e.g., goose [guːs] / geese [giːs], mouse
[maʊs] / mice [maɪs], etc. Alternations of this kind are not
phonetically conditioned at all, as there are no phonological
processes to be recognized by speakers of contemporary English, who
have to learn them and use them as such. The phenomenon also occurs
for instance in irregular verbal and adjectival forms, e.g., can / could,
sing / sang, far / farther, etc. If the two forms are etymologically
unrelated, their association within one paradigm is called suppletion:
e.g., is / was, go / went, good / better, etc.
4. More on rule writing
In Section VIII.1 we showed that rules can be written in words or
with sound symbols, but quite often they are written in terms of
features, preferably the most relevant ones.
In English there are alternations between the alveolar fricatives
[s] and [z] and the alveo-palatal fricatives [ʃ] and [ʒ], respectively.
The latter appear before the palatal glide. Consider the following
examples:
i.  kiss [kɪs]          ii. kiss you [ˈkɪʃju]
    please [pliːz]          please you [ˈpliːʒju]
In order to account for these alternations we may write two
rules using sound symbols:
a. /s/ → [ʃ] / __ [j]
b. /z/ → [ʒ] / __ [j]
However accurate, this kind of notation does not reveal
anything about the phonological processes at stake here. Let us now
transcribe these rules in feature notation (as introduced in Chapter
VII), trying to avoid redundancies.
a.  +cont        +cont          –syll
    +stri        +stri          –cons
    cor      →   cor     / __   cor
    +ant         –ant           –ant
    –voice       –voice
b.  +cont        +cont          –syll
    +stri        +stri          –cons
    cor      →   cor     / __   cor
    +ant         –ant           –ant
    +voice       +voice
The first observation we can make is that we could write one
single rule, ignoring the feature [voice], as the rest of the
specifications are identical. Secondly, we notice that instead of
describing the alveo-palatal fricatives in so many features, we might
simply pick [–ant] to capture the essence of the transformation. Thus,
we arrive at the following generalization:
+cont                    –syll
+stri                    –cons
cor     → [–ant] / __    cor
+ant                     –ant
Apart from basic rules such as the one illustrated above, there
are also more complex relationships and operations, for which we
need additional notation devices and conventions. For instance,
optional elements are noted in linear rule writing by means of regular
parentheses (brackets). They may occur to the right or left of both the
left-hand-side and the right-hand-side environment.
A → B / X(Y) __ Z or A → B / X __ (Y)Z etc.
An example is provided by the rule of l-velarization in English.
Most English varieties have two lateral allophones, a ‘clear l’,
represented as [l], in words like [leɪt] and [əˈlaɪn], and a ‘dark
(velarised) l’ – [ɫ], as in [bʊɫ] and [fɪɫm]. Velarised ‘l’ (be it
consonantal or syllabic ‘dark l’) occurs at the end of a monosyllabic
word (followed or not by another consonant), but it also occurs at the
end of a non-final syllable in a polysyllabic word: e.g., in
[.ˈɔː.fʊɫ.], [.ˈɹæ.tɫ̩.], [.mæɫ.ˈpɹæk.tɪs.] etc. (Dots indicate
syllable boundaries in the IPA transcription.)
In word notation, the l-velarization rule can be formulated in the
following way: Alveolar l is velarized whenever it occurs in syllable-
final position (followed or not by another consonant), i.e., when it
belongs to the syllable coda. This generalization can also be expressed
in more formal phonological notation (where the bracket and the ‘σ’
mark the syllable boundary):
/l/ → [ɫ] / __ (C)]σ
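The role of the optional element (C) and of the syllable boundary ]σ can be seen in a small simulation. This is only a sketch under stated assumptions: a word is represented as a list of syllables, each syllable as a list of segments, the vowel set is a minimal one chosen for the examples, and the name `velarize_l` is hypothetical.

```python
# A small, incomplete vowel inventory, enough for the examples below.
VOWELS = {"ɪ", "ʊ", "ə", "æ", "eɪ", "aɪ"}


def velarize_l(syllables):
    """Apply /l/ -> [ɫ] / __ (C)]σ to a word given as a list of syllables."""
    result = []
    for syllable in syllables:
        new_syllable = list(syllable)
        last = len(new_syllable) - 1
        for index, segment in enumerate(new_syllable):
            if segment != "l":
                continue
            # Environment without the optional consonant: __ ]σ
            syllable_final = index == last
            # Environment with the optional consonant: __ C]σ
            before_final_consonant = (
                index == last - 1 and new_syllable[last] not in VOWELS
            )
            if syllable_final or before_final_consonant:
                new_syllable[index] = "ɫ"
        result.append(new_syllable)
    return result


if __name__ == "__main__":
    print(velarize_l([["f", "ɪ", "l", "m"]]))        # film
    print(velarize_l([["ə"], ["l", "aɪ", "n"]]))     # align
```

The parentheses in the rule thus abbreviate two environments, one with and one without the consonant, which the code checks separately.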
Brace notation (within curly brackets) is used when we want to
show that the same rule applies in more than one environment (i.e.
that it applies either in one environment or the other). The extra
environment may occur either to the left or to the right of the segment
that undergoes the transformation.
A → B / {X, Z} __ Y   or   A → B / X __ {Y, Z}
For instance, voiced fricatives in English undergo devoicing
whenever they are placed in word-final position or before a voiceless
sound. Thus we may write (using the hash sign (#) to mark the word
boundary):
–son
+cont    → [–voice] / __ { #, [–voice] }
+voice
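The either/or reading of the braces corresponds to a logical "or" between two environments. The sketch below illustrates this for the devoicing rule; the segment sets are small assumed inventories, devoicing is marked with the IPA voiceless diacritic rather than by switching to another symbol, and the name `devoice_fricatives` is invented for the example.

```python
VOICED_FRICATIVES = {"v", "ð", "z", "ʒ"}          # [-son, +cont, +voice]
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}
VOICELESS_MARK = "\u0325"  # combining ring below, the IPA voiceless diacritic


def devoice_fricatives(word):
    """Apply the devoicing rule to one word, given as a list of segments."""
    output = []
    for index, segment in enumerate(word):
        # First environment in the braces: __ #
        word_final = index == len(word) - 1
        # Second environment in the braces: __ [-voice]
        before_voiceless = not word_final and word[index + 1] in VOICELESS
        if segment in VOICED_FRICATIVES and (word_final or before_voiceless):
            output.append(segment + VOICELESS_MARK)
        else:
            output.append(segment)
    return output


if __name__ == "__main__":
    print("".join(devoice_fricatives(["l", "iː", "v"])))   # leave
```

A voiced fricative between voiced sounds, as in living, matches neither environment and is left unchanged.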
Sometimes we need to express the upper and lower limits on the
number of similar segments possibly contained in the environment
variables. The maximum number of segments is conventionally
noted as a superscript number attached to the upper right side of the
segment symbol, while the minimum number is noted as a subscript
attached to the lower right side of the symbol. Consider the following
(imaginary) examples:
/u/ → [ʊ] / __ C₂
This rule states that in order for the /u/ vowel to turn into [ʊ] it
needs to be followed by at least two consonants.
/i/ → [ɪ] / __ C¹
According to this rule, /i/ will become [ɪ] if followed by no
more than one consonant.
Let us discuss another example, that of the assimilation of the
alveolar nasal [n] to the place of articulation of the following stop. In
order to write a rule that would capture the whole phenomenon in its
generalization, we would have to solve the problem of how to
represent the two transformations simultaneously in feature notation.
We know that whenever /n/ precedes a velar stop it often turns
into its velar counterpart [ŋ], e.g., in [ɪŋk] (ink), and when it occurs
before a labial stop, it is realized as labial [m], e.g. in [ɪm piːs] (in peace).
We also know that the three nasals can be described as follows:
          [m]   [n]   [ŋ]
[lab]      ✓
[cor]            ✓
[dors]                 ✓
If we start by saying that the [coronal] [n] becomes [labial] [m],
how can we add, in the same rule – and in the same type of
environment – that it can also become [dorsal] [ŋ]? One possible
solution lies in dropping the detailed notation and replacing it with a
variable, conventionally taken from the letters of the Greek alphabet,
hence the name alpha-notation. Thus, instead of writing [lab], [cor]
and [dors], which are all place features, we can simply write α[place].
                            –son
[+nasal] → α[place] / __    –cont
                            –del rel
                            α[place]
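In computational terms the alpha variable behaves like a value that is read from the following stop and copied onto the nasal. The sketch below is an illustration under stated assumptions: only the six English oral stops and the three nasals are listed, and the names `PLACE`, `NASAL_BY_PLACE` and `assimilate_nasals` are introduced here for the example.

```python
# Place of articulation of the oral stops [-son, -cont, -del rel].
PLACE = {"p": "lab", "b": "lab", "t": "cor", "d": "cor", "k": "dors", "g": "dors"}
# The nasal that corresponds to each place value.
NASAL_BY_PLACE = {"lab": "m", "cor": "n", "dors": "ŋ"}
NASALS = set(NASAL_BY_PLACE.values())


def assimilate_nasals(segments):
    """Apply [+nasal] -> α[place] / __ [stop, α[place]] to a list of segments."""
    output = list(segments)
    for index in range(len(output) - 1):
        following = output[index + 1]
        if output[index] in NASALS and following in PLACE:
            alpha = PLACE[following]            # the value of the variable α
            output[index] = NASAL_BY_PLACE[alpha]
    return output


if __name__ == "__main__":
    print("".join(assimilate_nasals(["ɪ", "n", "p", "iː", "s"])))   # in peace
    print("".join(assimilate_nasals(["ɪ", "n", "k"])))              # ink
```

A single rule with one variable thus replaces three separate statements, one for each place of articulation; before a coronal stop the rule applies vacuously and [n] stays [n].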
5. Derivations
As suggested in Section VIII.1, phonological rules apply on the
underlying representations (URs) (the phonemes) and determine their
surface representations (SRs) (the phonetic forms). In other words,
speakers derive the phonetic forms from the phonemes by means of
language-specific rules. Underlying forms may be affected by one or
more rules; the series of steps taken from UR to SR is known as a
derivation.
Let us have another look at the rule of nasal assimilation to the
place of articulation of the following obstruent. We will apply this
rule to sample forms and establish the derivation. The derivation is a means
of checking whether the rule has been formulated correctly. If the rule
applies to the appropriate segments in the appropriate environments,
the derivation will necessarily end with viable phonetic forms.
UR            /ɪn piːs/    /ɪnkənteɪʃən/    /ɪndɛt/    /ɪnæpt/
Assim. rule   m            ŋ                n          –
SR            [ɪm piːs]    [ɪŋkənteɪʃən]    [ɪndɛt]    [ɪnæpt]
              (in peace)   (incantation)    (indebt)   (inapt)
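The derivation can also be checked mechanically. The following Python sketch is only an illustration: the names `assimilate_n`, `PLACE_OF_STOP` and `NASAL_AT_PLACE`, and the choice of plain strings of IPA symbols, are assumptions made here and are not part of the rule formalism. The sketch applies the rule to /n/ only, as in the sample forms, and lets it look across a word space.

```python
# Place of articulation of the oral stops ([-son, -cont, -del rel]).
PLACE_OF_STOP = {"p": "lab", "b": "lab", "t": "cor", "d": "cor", "k": "dors", "g": "dors"}
# Nasal stop for each place value (the value that alpha stands for).
NASAL_AT_PLACE = {"lab": "m", "cor": "n", "dors": "ŋ"}


def assimilate_n(ur: str) -> str:
    """Return the surface form after /n/ copies the place of a following stop."""
    segments = list(ur)
    for i, seg in enumerate(segments):
        if seg != "n":
            continue
        # Find the next segment, skipping a word space (as in "in peace").
        j = i + 1
        while j < len(segments) and segments[j] == " ":
            j += 1
        if j < len(segments) and segments[j] in PLACE_OF_STOP:
            segments[i] = NASAL_AT_PLACE[PLACE_OF_STOP[segments[j]]]
    return "".join(segments)


for ur in ["ɪn piːs", "ɪnkənteɪʃən", "ɪndɛt", "ɪnæpt"]:
    print(f"/{ur}/ -> [{assimilate_n(ur)}]")
```

Before a coronal stop the rule applies vacuously (/n/ stays [n]); before a vowel it does not apply at all.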
5.1. Rule ordering

Let us take another look at the regular noun plural forms in
English. Consider the following forms:
a. caps [kæps], staffs [stɑːfs], cats [kæts], months [mʌnθs], ticks
[tɪks]
b. cabs [kæbz], doves [dʌvz], pads [pædz], clothes [kləʊðz], dogs
[dɒgz], bins [bɪnz], bells [bɛlz], spas [spɑːz], cows [kaʊz]
c. bosses [bɒsɪz], buzzes [bʌzɪz], leashes [liːʃɪz], rouges [ɹuːʒɪz],
benches [bɛntʃɪz], judges [dʒʌdʒɪz]
On close examination, we notice that the singular nouns that
take [s] end in a voiceless stop or a voiceless non-sibilant fricative (a
sibilant is a hissing sound made with the air flowing down the center
of the tongue: [s z ʃ ʒ]). Secondly, those that take [z] may end in a
voiced stop, in a voiced non-sibilant fricative, a nasal or a liquid, a
vowel or a diphthong. Finally, those that take [ɪz] (or [əz], depending
on the dialect) end in one of the voiced or voiceless sibilants or in
affricates whose release stage is similar to a sibilant fricative.
Hence, we may draw the conclusion that the regular noun plural
suffix in English is a [coronal] [+anterior] sibilant fricative which
agrees in voicing with the preceding segment, except for those cases
in which the root-final segment is also a sibilant – then a vowel is
inserted between the two consonants.
According to the principles established in Section VIII.2, the
allophone which is selected to play the role of underlying form must
have the widest distribution of the three. The form which qualifies
best is [z], as it occurs after voiced obstruents, sonorants, vowels and
diphthongs, while [s] is restricted to positions following voiceless
obstruents, and [ɪz] only occurs after sibilants. If we pick /z/ as the
underlying form, we have to decide what rules apply to change it into
[s] and [ɪz] and in what order. Since [s] is always preceded by a
voiceless non-sibilant obstruent, we should write a rule of voicing
assimilation.
[+strid, cor, +ant, +voice] → [–voice] / [–voice] __
At the same time, considering that the only difference between
the UR form /z/ and the SR form [ɪz] is the presence of the vowel [ɪ],
we should postulate an insertion (also called epenthesis) rule to
account for it.
Ø → [+syll, +high, –back, –tense] / [+strid, cor] + __ [+strid, cor]
The problem is to decide which rule applies first. Let us assume
that the first to apply is the voicing assimilation rule, followed by the
ɪ-epenthesis rule. The derivation of the UR forms /kæt+z/, /dɒg+z/ and
/liːʃ+z/ would then be the following.
UR                    /kæt+z/    /dɒg+z/    /liːʃ+z/
voicing assim. rule   kæt+s      ––         liːʃ+s
ɪ-epenthesis rule     ––         ––         liːʃ+ɪs
SR                    [kæts]     [dɒgz]     [liːʃɪs]
The first two forms resulting from the derivation are correct, but
the last one is wrong. For this reason we have to reorder the
application of the two rules.
UR                    /kæt+z/    /dɒg+z/    /liːʃ+z/
ɪ-epenthesis rule     ––         ––         liːʃ+ɪz
voicing assim. rule   kæt+s      ––         ––
SR                    [kæts]     [dɒgz]     [liːʃɪz]
The SR forms resulting from this derivation are all correct, so

this must be the order in which the two rules are to apply.
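The effect of the two orderings can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names (`voicing_assimilation`, `i_epenthesis`, `derive`), the segment sets, and the use of "+" as a morpheme boundary inside a plain string are choices made here, not part of the formal rules.

```python
VOICELESS = set("ptkfθsʃ")
SIBILANTS = set("szʃʒ")  # affricates end in ʃ or ʒ, so they are covered too


def voicing_assimilation(form: str) -> str:
    """/z/ -> [s] when the segment before it (ignoring "+") is voiceless."""
    if not form.endswith("z"):
        return form
    before = form[:-1].rstrip("+")
    if before and before[-1] in VOICELESS:
        return form[:-1] + "s"
    return form


def i_epenthesis(form: str) -> str:
    """Insert [ɪ] after "+" when sibilants stand on both sides of the boundary."""
    k = form.find("+")
    if 0 < k < len(form) - 1 and form[k - 1] in SIBILANTS and form[k + 1] in SIBILANTS:
        return form[: k + 1] + "ɪ" + form[k + 1 :]
    return form


def derive(ur: str, rules) -> str:
    """Apply the rules in the given order and erase the boundary symbol."""
    form = ur
    for rule in rules:
        form = rule(form)
    return form.replace("+", "")


for ur in ["kæt+z", "dɒg+z", "liːʃ+z"]:
    first = derive(ur, [voicing_assimilation, i_epenthesis])
    second = derive(ur, [i_epenthesis, voicing_assimilation])
    print(f"/{ur}/  voicing first: [{first}]  epenthesis first: [{second}]")
```

With epenthesis ordered first, the inserted vowel separates the suffix from the voiceless sibilant, so voicing assimilation no longer finds its environment.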
1. What level of representation is characteristic of a) phonemes;

b) allophones?
2. How are the underlying and the surface structure related?
3. What are phonological rules? What is their role?
4. What does a rule contain?
5. How can the underlying representation be selected?
6. How many kinds of alternations do you know?
7. Are there any alternations which are not phonetically
conditioned?
8. In how many ways can rules be written?
9. How are optional elements noted in a rule?
10. How are multiple environments noted in a rule?
11. When is alpha-notation used?
12. What is a derivation?
13. How are rules ordered?
14. Write the following rules in feature notation:
a) A consonant is deleted at the end of a word when it follows
another consonant.
b) A voiceless fricative is voiced between two vowels.
c) An alveolar stop becomes a palato-alveolar affricate before
[i] or [j].
d) An alveolar stop is inserted between an alveolar fricative and
[r].
e) A stop is devoiced at the end of a word.
15. Consider the following series of words in English.
a) last [lɑːst], clasp [klɑːsp], draft [dɹɑːft], synapse
[ˈsaɪnæps], inept [ɪnˈɛpt], works [wɜːks], worked [wɜːkt]
b) lagged [lægd], lazed [leɪzd], receives [ɹɪˈsiːvz]
Can you identify any pattern congruity? If so, comment on the
acceptability of the following four transcriptions in English: [pɑːsd],
[lʌvt], [æpt], [stæbd].
16. In Japanese, the phoneme /t/ has at least the allophones [t],
[ts], and [tʃ]. Consider the following words.
a) [tatsɯ] ‘stand’        d) [tetsɯ] ‘iron’
b) [toɾɯ] ‘take’          e) [tʃiba] ‘Chiba’
c) [tsɯtʃi] ‘dirt’        f) [dʒaɾimitʃi] ‘gravel road’
What are the underlying representations of the forms for ‘iron’,
‘Chiba’, and ‘dirt’? Write derivations for these three words. Write a
rule in prose and then in feature notation to account for the realisation
of the allophones of /t/.
17. In French there is voice agreement between the non-sonorant
members of a consonant cluster. The first segment may
sometimes assimilate to the second to comply with this rule, as follows:
/bs/ becomes [ps] as in absolu [apsɔly] ‘absolute’, /kd/ becomes [gd]
as in anecdote [anɛgdɔt] ‘anecdote’, /bt/ becomes [pt] as in obtus
[ɔpty] ‘obtuse’, /gs/ → [ks], /kb/ → [gb], /tz/ → [dz]. As you can
see, the assimilation sometimes implies voicing and sometimes
devoicing. Write two rules to illustrate the two types of regressive
assimilation. Then write one rule to generalise over the first two.
IX. PHONOLOGICAL PROCESSES
1. Feature changing rules
Feature-changing rules are those rules which affect one feature
or a small group of features. They include assimilation and
dissimilation, as well as lenition, flapping, glottalisation, etc.
1.1. Assimilation
Assimilation is the process by which (non-)adjacent segments

(belonging to the same word or to two successive words) change so as
to become more like each other. It is the result of the speaker’s
tendency to reduce his articulatory effort.
Assimilation can be classified according to the direction in
which the feature spreads. Thus assimilation can be progressive,
regressive or reciprocal. In progressive assimilation (which happens
to be the least common) one or several features are copied/spread from
the item on the left to the one on the right: e.g., in happen the alveolar
nasal may be influenced by the preceding labial sound, hence the
pronunciation [hæpm̩].
Regressive assimilation applies from right to left, in
anticipation of the sound that is just about to be articulated: e.g., in
dismantle, the feature [voice] of the nasal may be copied onto the
preceding sound, which is voiceless, hence the pronunciation
[dɪzmæntl̩].
In reciprocal assimilation the two sounds influence each other
and may even coalesce (= become fused): e.g., in schedule [ʃɛdjuːl],
[d] and [j] may coalesce into the new affricate sound [dʒ].
In terms of the degree of similarity achieved, assimilation may
be partial or total. In partial (allophonic) assimilation the two
neighboring sounds become only partly similar: e.g., in inclination the
nasal is often pronounced as a velar [ŋ] because of the following velar
[k]; however, the two sounds remain different. Total (phonemic)
assimilation occurs when the two sounds come to be perceived as
one: e.g., in this ship [ðɪʃɪp], where [s] is no longer heard in fast
speech.
Assimilation may affect the voicing, the manner or the place
of articulation. For example, in English liquids and glides following a
voiceless obstruent are devoiced, as in f[l̥]y, s[l̥]ope, c[j̊]ute,
t[w̥]in, s[w̥]ine, etc. (devoicing is indicated by a little circle-like
diacritic written under or over the phonetic symbol). Devoicing also
takes place when voiced fricatives or affricates in word-final position
are followed by another word beginning with a voiceless consonant: e.g.,
with ten [wɪθ tɛn], of course [əf kɔːs], those seven [ðəʊs sɛvn̩],
etc.
The coalescence of the stops [t] and [d] with the glide [j]
produces the affricates [tʃ] and [dʒ] (= affrication), e.g., in don’t you
[dəʊntʃə] and could you [kʊdʒə]. A vowel or a consonant may be
nasalised under the influence of the following nasal sound
(= nasalization); e.g., /æ/ in pan or /d/ in good night [gʊn naɪt], etc.
Place assimilation is present, for instance, in the articulation
of the initial consonantal cluster [tɹ], as in tray [tɹeɪ], where [t]
acquires a post-alveolar articulation under the influence of [ɹ] (and
can even be pronounced as the alveo-palatal affricate [tʃ]). The alveolar
fricatives [s] and [z] may have alveo-palatal articulation before [j], [ʃ]
or [ʒ]: e.g., this year [ðɪʃ jɜː], please you [pliːʒ jə], etc. The last two
changes are cases of palatalization (= the transformation in which a
sound becomes (more) palatal). As shown in Sections VIII.4 and
VIII.5, nasal stops can assimilate to the place of articulation of the
following sound.
1.2. Dissimilation
The process in which two (usually adjacent) segments that share

some feature(s) change so as to become less similar is known as
dissimilation. Like assimilation, it can be progressive or regressive,
partial or total, etc. An example from English is the substandard
pronunciation [tʃɪm(b)lɪ] of the word chimney. Dissimilation also
occurred in the history of the word pilgrim (from Old French pelegrin,
itself from Latin peregrinus).
In Romanian dissimilation is illustrated, for instance, by the
historical evolution of the word mormânt [morˈmɨnt] ‘grave’, from
Latin monumentum ‘(funerary) monument’, as well as by the current
substandard pronunciation [koliˈdor] of the word coridor ‘corridor’.
1.3. Lenition
The term ‘lenition’ (or ‘weakening’) refers to various changes

in which the resulting sound is somehow weaker in the articulation
than the original sound. Lenitions can be changes of stops or affricates
into fricatives, of two consonants to one, of full consonants to glides,
of voiceless consonants to voiced in some environments (especially in
intervocalic position), etc. In some cases lenition can also refer to the
complete loss of sounds.
An example of double-staged lenition is the evolution of Latin
voiceless stops [p, t, k] to Spanish voiced [b, d, g] and then to the
fricatives [β, ð, ɣ] in intervocalic position: e.g., Latin scopa >
Spanish escoba [ɛskoβa] ‘broom’, Lat. natare > Sp. nadar [naðaɾ]
‘to swim’, Lat. amica > Sp. amiga [amiɣa] ‘female friend’.
1.4. Flapping
Flapping is a phenomenon characteristic of North American

English and a few other English varieties. In these accents, when a /t/
occurs between two vowels, it is pronounced as a flap [ɾ], provided
that the second vowel is not stressed: e.g., in water.
1.5. Glottalization
Glottalization applies to English /t/, which either becomes a

glottal stop after a vowel at the end of a word or is only partially
glottalized, irrespective of the preceding sound. The second
phenomenon may also characterize other voiceless stops (/p/ and /k/)
(see Sections III.3 and IV.1.1).
2. Other types of changes
2.1. Deletion
Deletion (or elision) is the process by which a whole segment (e.g.,

A) is eliminated. In technical terms, the segment becomes Ø (zero).
A → Ø / X __ Y
Deletion can affect vowels or consonants and it can occur at the
beginning, inside or at the end of a word. An example of initial vowel
deletion comes from Spanish: the Spanish word bodega ‘wine cellar,
storeroom’ derives from Latin apotheca (on the voicing undergone by the
consonants see IX.1.3). English words like family or memory tend to
be pronounced without the unstressed vowel [ə]. If the following
syllable starts with a sonorant, the sonorant may become syllabic, as in
tonight [tn̩aɪt], police [pl̩iːs], correct [kɹ̩ɛkt], etc. Old English
final (unaccented) vowels were reduced to [ə] and then lost:
e.g., OE sunu > PDE (= Present-day English) son, OE mona > PDE
moon, etc.
In the history of English, [g] and [k] were lost in word-initial
position preceding a nasal. Even if they are still used in spelling, they
are no longer pronounced: e.g., in knight [naɪt], gnaw [nɔː]. In present-day
English, elision also applies to (mostly alveolar) consonants
occurring within consonant clusters, e.g., in handsome [ˈhænsəm],
mostly [məʊslɪ], prompts [pɹɒmps], friendship [ˈfɹɛnʃɪp], fifths
[fɪfs], etc. The final [v] in the preposition of is often lost before
consonants, e.g., in lots of them [ˈlɒts ə ðəm], while the conjunction
and is reduced to [ən], e.g., in bread and breakfast [ˈbɹɛd ən
ˈbɹɛkfəst].
2.2. Insertion
The process of insertion (or epenthesis) consists in the

introduction of a new segment (e.g., A) between two previously extant
sounds (in this case, we may say that Ø becomes A).
Ø → A / X __ Y
Insertion can occur in word-initial position or inside a word. An
example of initial vowel insertion is offered by Spanish escuela, from
Latin scola. English film is regionally pronounced [ˈfɪləm], with [ə]
epenthesis, and a similar phenomenon occurs in words of foreign
origin with consonantal clusters unknown to English: e.g., in
Tbilisi, pronounced [təbɪlɪsɪ].
A plosive may be inserted between two sonorants so as to ease
their pronunciation. Some examples come from the history of English:
e.g., OE þymel [θyːmel] > PDE thimble, OE þunrian [θʊnrɪɑn] >
PDE thunder, etc. Similarly, the English word chamber comes from
the French chambre, itself from Latin camera.
2.3. Metathesis
By metathesis (= transposition of sounds) the order of a
sequence of sounds (or longer segments) is reversed. Examples of
historically recognizable metathesis in English are contained in words
like clasp, from Middle English clapse, burn, from ME brennen, bird,
from OE brid, horse, from OE hros, etc.
In Romanian we find palavră, from Latin parabola, castravete
from Bulgarian krastavitza ‘cucumber’, întreg from Latin integrum
‘whole’, as well as present-day substandard forms, such as potrocală
for portocală ‘orange’ and scluptură for sculptură ‘sculpture’.
2.4. Reduplication
Reduplication is the process in which a part of a word is copied
and attached to the beginning of the original word. In English,
reduplication has exclusively lexical functions: it is often used in child
language (e.g., in words like mama, papa, gee-gee, wee-wee).
In some languages spoken in Samoa (Samoan), the Philippines
(Tagalog), North America (Dakota), etc. reduplication is used to mark
grammatical categories, e.g., tense and number. A similar device was
used at some time in the old Indo-European languages (e.g., in the
paradigm of some of the perfect forms), as can still be seen in
Sanskrit, Ancient Greek, Latin, etc.
2.5. Haplology
Haplology is a change in which a repeated sequence of sounds is
simplified to a single occurrence. In some varieties of English, a word like
library is pronounced [laɪbɹɪ], and probably as [pɹɒblɪ]. There are also
examples where the haplologized form has become the standard, e.g.,
pacifism (instead of pacificism, from pacific), humbly (instead of ME
humblely).
1. What feature changing rules do you know?
2. What is the difference between regressive and progressive
assimilation?
3. What is reciprocal assimilation?
4. What is total assimilation?
5. What is nasalization?
6. What is voicing / devoicing?
7. What is palatalization?
8. What is dissimilation?
9. What is lenition?
10. What is flapping?
11. What is glottalization?
12. What do deletion and insertion have in common?
13. What do metathesis and reduplication have in common?
14. What do reduplication and haplology have in common?
15. Identify the changes in the following words:
a) athlete [ˈæθəliːt], b) good morning [gʊm ˈmɔːnɪŋ], c) soften
[ˈsɒfn̩], d) dodo [ˈdəʊdəʊ], e) OE æfre [ˈævrə] ‘ever’, f) increase
[ɪŋˈkɹiːs], g) open [ˈəʊpm̩], h) education [ɛdʒʊˈkeɪʃn̩], i) buckle
[ˈbʌʔl̩], j) fatter [ˈfæɾɚ], k) ban [bæ̃n], l) February [ˈfɛbɹɪ], m) Sp.
arbol < Lat. arbor, n) jewelry [ˈdʒuːləɹɪ], o) handbag [ˈhæmbæg], p)
average [ˈævɹɪdʒ].
X. SUPRASEGMENTAL PHONOLOGY:
THE SYLLABLE
Syllables are clusters of segments grouped around a sonority peak

(usually a vowel). The most widespread syllable structure in the
languages of the world consists of a CV sequence (i.e., a consonant
followed by a vowel), e.g., Rom. masă ‘table, meal’, syllabified as
[mɑ]σ[sə]σ, where the Greek letter ‘σ’ stands for ‘syllable’. This is also
the first type of syllable used in early child speech, as it demands the least
articulatory effort (e.g., in words like mama or papa). For these two
reasons, the CV syllable has been known as the core or basic syllable. It
is an open syllable (it ends in a vowel; a syllable ending in one or more
consonants is referred to as a closed syllable). Closed syllables
predominate in English, while in Romanian open syllables are preferred.
Other types of syllables have a higher degree of complexity: V
and CVC structures differ by one segment from the core syllable,
whereas VC differs by two segments, which makes it the most
complex syllable structure of the four and thus the least likely to occur
in human languages. It has been noticed in fact that a language which
allows for (C)VC structures also accepts syllables with a lower degree
of complexity, but when a language has CV syllables it does not
necessarily use other syllable structures.
Native speakers are able to recognise syllables as phonological
units in their own language according to the characteristic well-
formedness restrictions (phonotactic constraints). Some languages
may use more than one consonant (i.e., consonantal clusters) in
syllable initial or final position or in both. In such a language there is
a series of acceptable consonantal clusters (see Appendix 1 for
English consonantal clusters). These clusters are not independent of
their position in the syllable, i.e., the clusters allowed in syllable-
initial position are often unacceptable in syllable-final position and
vice versa – e.g., the Romanian consonantal sequence [pl] can occur in
syllable-initial but not in syllable-final position. Thus the
syllabification of a word like Rom. suplini ‘replace’ implies cutting
the consonantal group [pl] off the first syllable and including it in the
second syllable: [su]σ[pli]σ[ni]σ. A similar phenomenon takes place in
the syllabification of Rom. complace ‘indulge’, where the medial
cluster [mpl] needs to be split, since it is unacceptable both as a
syllable-final cluster and as a syllable-initial one: [kom]σ[pla]σ[Íe]σ.
1. Syllable structure
1.1. Sonority and the syllable
What makes speakers of a language able to identify the number

of syllables within a word is their perception of the fact that some of
the sounds contained in the word are more sonorous than any of the
others (hence the name syllable peaks or nucleuses). Practically,
what speakers count are syllable peaks, not syllables. Since vowels are
inherently more sonorous than consonants, they tend to be syllable
peaks. However, in syllables which do not contain a vowel the most
sonorous consonant will be the syllable peak. For instance, when
English speakers recognize four syllables in the word refundable, they
perceive four syllable peaks, as in the following graphic
representation, where the sonority profile follows a rugged line.
The final [l̩] in refundable is a sonorant consonant which is
neither preceded nor followed by a more sonorous segment (the
previous consonant [b] is less sonorous, and there is no following
segment). This is why [l̩] forms a syllable peak (is ‘syllabic’), just
as the vowels [ɪ], [ʌ], and [ə], which are more sonorous than their
neighbours. Other English sonorant consonants can also be syllabic,
being marked with the same diacritic sign under the phonetic symbol,
e.g., mechanism [ˈmɛkənɪzm̩], button [ˈbʌtn̩], etc. Even fricatives
may be syllabic (though only in fast speech) – e.g., the pronunciation
[s̩pɪʃs̩] for suspicious, or the interjections psst! [ps̩t] and ssh! [ʃ̩].
[Figure: sonority profile of refundable, ɹ ɪ f ʌ n d ə b l̩, with peaks on ɪ, ʌ, ə and l̩]
In articulatory terms, the degree of sonority is closely linked

with two things: one of them is the blockage of the airstream (the
degree of stricture). Vowels are the least constricted segments (in
their articulation, the mouth is relatively open). Furthermore, the
lower a vowel, the more open the vocal tract, and the less constriction
there is. Low vowels are therefore the least constricted, and thus the
most sonorous and the most prone to belong to the nucleus of a
stressed syllable.
Voicing too plays a role in sonority, since it is required to
produce it: voiced segments are always more sonorous than their
voiceless counterparts. Given the two factors, voicing and degree of
stricture, phonologists have postulated a sonority hierarchy (scale)
among segment types, of the following sort:
Vowels (6) > Glides (5) > Liquids (4) > Nasals (3) >
> Fricatives/Affricates (2) > Plosives (1)
According to this scale, plosives are the least likely to be the
nucleus of a syllable. On the contrary, they usually occur at syllable
edges, either preceding the nucleus or following it.
If a consonant precedes the nucleus (N), it is said to belong to the
onset (O); if it follows the nucleus, it is known to be contained in the coda
(Co). Each of the three syllable components may be either simple or complex
(depending on the phonotactic restrictions in the respective language). In
English only the nucleus is an obligatory constituent of the syllable.
The degree of sonority (graphically represented as the sonority
profile – see above) is supposed to be low at the beginning of the onset,
to gradually increase up to its peak in the nucleus, and then to decrease
to the end of the coda. This is regulated by a universal principle known
as the sonority sequencing generalisation: the sonority profile of the
syllable must rise until it peaks, and then fall. An example which obeys
this principle is that of the monosyllabic word trust [tɹʌst]. Indeed, in
this case a stop precedes the liquid sonorant in the onset, the peak is a
vowel, and the coda starts with a fricative and ends with a stop:
[Figure: sonority profile of t ɹ ʌ s t, rising to the peak ʌ and then falling]
As we will see, not all cases are as easy to account for as this one.
Syllables like skips [skɪps] or streets [stɹiːts] obey the sonority scale but for
the fricative [s], whose sonority is higher than that of the adjacent stops [k],
[t] and [p], although it is placed at the extremities of these syllables:
[Figure: sonority profile of s k ɪ p s, with [s] rising above the adjacent stops at both edges]
This is a feature of English phonotactics, which allows for
consonantal groups such as [spɹ], [stɹ], [skɹ], [sp], [st], [sk], etc. in
syllable-initial position and [ps], [ts], [ks], etc. in final position.
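The sonority sequencing generalisation can be tested on a single syllable with a few lines of Python. This is a minimal sketch under stated assumptions: the numeric values are those of the scale given above, while the names `SONORITY` and `obeys_ssg`, the particular symbols listed for each class, and the treatment of a syllable as a plain string of one-character segments (so affricates and long vowels are left out) are simplifications introduced here.

```python
SCALE = [
    ("aeiouɪʊʌəɛæɒ", 6),  # vowels
    ("jw", 5),            # glides
    ("lɹ", 4),            # liquids
    ("mnŋ", 3),           # nasals
    ("fvszʃʒθð", 2),      # fricatives
    ("pbtdkg", 1),        # plosives
]
SONORITY = {seg: value for segs, value in SCALE for seg in segs}


def obeys_ssg(syllable: str) -> bool:
    """True if sonority rises strictly up to the peak and falls strictly after it."""
    profile = [SONORITY[seg] for seg in syllable]
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1 : peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1 :]))
    return rising and falling


for syllable in ["tɹʌst", "skɪps"]:
    print(syllable, [SONORITY[s] for s in syllable], obeys_ssg(syllable))
```

The check fails for skips precisely because of the peripheral [s], which is the exception that English phonotactics tolerates.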
A phonotactic rule which applies to English onsets is the
minimal sonority distance. According to this rule, the distance in
sonority between the first and second element in the onset must be of
at least two degrees. Therefore, sequences like plosive (1) + liquid (4)
(e.g., [kl]) and fricative (2) + glide (5) (e.g., [sw]) are allowed, but
combinations like nasal (3) + liquid (4) (e.g., *[mr]) are ruled out (the
asterisk ‘*’ marks an unacceptable form).
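The arithmetic behind the minimal sonority distance is simple enough to sketch in Python. The function name `satisfies_msd`, the default threshold parameter, and the selection of symbols per class are assumptions made for this illustration; the sonority values are those of the scale in Section 1.1.

```python
SONORITY = {}
for segments, value in [("jw", 5), ("lɹr", 4), ("mnŋ", 3), ("fvszʃʒθð", 2), ("pbtdkg", 1)]:
    for seg in segments:
        SONORITY[seg] = value


def satisfies_msd(onset: str, minimum: int = 2) -> bool:
    """True if a two-consonant onset rises in sonority by at least `minimum` degrees."""
    first, second = onset
    return SONORITY[second] - SONORITY[first] >= minimum


for onset in ["kl", "sw", "mr"]:
    distance = SONORITY[onset[1]] - SONORITY[onset[0]]
    print(onset, distance, satisfies_msd(onset))
```

The sketch covers two-consonant onsets only.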
Sequences made up of nasal and liquid, which do not obey the
minimal sonority distance, tend to be uncomfortable for speakers even
if the nasal and the liquid belong to different adjacent syllables. For
instance, in IX.2.2 several examples are provided where a stop was
inserted in between two sonorants: OE þymel [θyːmel] > PDE
thimble, OE þunrian [θʊnrɪɑn] > PDE thunder, etc. Engl. chamber <
Fr. chambre < Lat. camera.
Like many other languages, English also disfavours segments
with an identical place of articulation in the same onset or coda. This
principle (called the obligatory contour principle) applies to [labial]
or [coronal] clusters such as *[pw], *[bw], *[tl], *[dl], *[θl], *[ðl],
etc., which are disallowed.
1.2. The onset-rhyme theory
Proponents of the onset-rhyme theory analyse the syllable as
consisting of two immediate constituents: the onset (O), containing the
consonants preceding the vowel (or another syllabic element), and the
rhyme (R), containing the vowel and the segments that follow it. The
name of the phonological constituent ‘rhyme’ derives from the term
traditionally used in analyzing verse – e.g., think of the segments
shared by the mono-syllabic words ash [æʃ], dash [dæʃ] and clash
[klæʃ].
Various arguments have been advanced in favour of dividing
the syllable into onset and rhyme, which are apparently independent
units, each with its own constraints on its internal structure. That
speakers have an awareness of this is proved by the phenomena of
alliteration and spoonerism, which emphasise the individuality of the
onset, and poetic rhyme, which evinces the phonological rhyme (see
above).
Alliteration (the rhetorical repetition of consonants or
consonantal clusters in the onset of successive stressed syllables) can
be traced in the following example: Laughing and leaping they left the
lodge, where the consonant [l] appears in initial position (i.e., in the
onset) in all stressed syllables.
Spoonerism is a type of speech error, in which the first segment
or cluster of a syllable (the onset) is swapped for the first segment of
another syllable in a phrase, e.g., in hush my brat replacing brush my
hat, or a well-boiled icicle for a well-oiled bicycle.
Another important argument for positing the rhyme as a separate unit
involves stress assignment. In many languages (including English), the
location of stress in a word depends on the syllable structure; however,
the onset has no role to play here – in stress assignment, it is entirely
irrelevant whether there is an onset at all or how many consonants it is
made up of. What matters is the composition of the rhyme. It has been
noticed that in English a syllable can only receive stress in one of the
following cases: if its rhyme contains at least a long vowel or a
diphthong (VV), or a short vowel and one or more consonants (VC). In
other words, if the rhyme of an English syllable contains nothing more
than a short vowel it cannot be assigned stress, because it is
light (see below). The first three cases, however, exemplify heavy
syllables, which are capable of carrying stress. (Syllables with long
nucleuses as well as (long) codas are called superheavy.)
          a. heavy   b. heavy   c. heavy   d. superheavy   e. light
Rhyme     aɪ         iː         ɛn         aɪnd            ɪ
Word      angina     arena      agenda     behind          America
English is therefore known as a rhyme-weight language

because it is the rhyme, not necessarily the nucleus that has to be
heavy to receive stress in this language. (There are also nucleus-
weight languages (where only syllables with heavy nucleus receive
stress) and coda languages (where only syllables ending in codas can
be stressed).)
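The weight classes above can be summarised in a small Python sketch. It is an illustration only: the CV-skeleton encoding of the rhyme ("V", "VV", "VC", "VVCC") and the names `rhyme_weight` and `can_bear_stress` are assumptions introduced here, and a long vowel followed by at least one coda consonant is taken as the criterion for "superheavy".

```python
def rhyme_weight(rhyme: str) -> str:
    """Classify a rhyme given as a CV skeleton, e.g. "V", "VV", "VC", "VVCC"."""
    nucleus = len(rhyme) - len(rhyme.lstrip("V"))  # number of V positions
    coda = len(rhyme) - nucleus                    # number of C positions
    if nucleus == 2 and coda >= 1:
        return "superheavy"
    if nucleus == 2 or coda >= 1:
        return "heavy"
    return "light"


def can_bear_stress(rhyme: str) -> bool:
    """In a rhyme-weight language, only non-light rhymes can be stressed."""
    return rhyme_weight(rhyme) != "light"


# Rhymes of the syllables illustrated above.
examples = {"angina": "VV", "arena": "VV", "agenda": "VC", "behind": "VVCC", "America": "V"}
for word, rhyme in examples.items():
    print(word, rhyme, rhyme_weight(rhyme), can_bear_stress(rhyme))
```

The onset is deliberately absent from the input, since it plays no role in stress assignment.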
A rhyme consists of a nucleus (N) (usually a vowel) and a
coda (Co) (one or more consonants). This accounts for the following
syllable representation to which we can associate segments, as in the
example below.
      σ
  O       R
        N   Co
Consider, for instance, the onset-rhyme representation of the
monosyllabic word [keɪdʒ]:
      σ
  O       R
        N   Co
  k     eɪ  dʒ
1.3. The timing tier
Syllables are sequences of segments, each with its own set of

features. Take for instance the monosyllabic word bat [bæt], which is made
up of three segments: a stop, a short vowel and another stop. Each of them
is associated in the English speakers’ minds with an abstract timing unit (or
timing slot), which we may represent conventionally by the symbol X.
X X X   timing tier
| | |
b æ t   melody tier
In point of segment length, this syllable raises no problems. Each of its

timing units in the timing tier is associated to one segment represented by
one symbol (i.e., one melody – a unit of phonetic quality) in the melody tier.
Consider however a monosyllabic word like [liːd] lead, which contains a
long vowel. Since the long vowel is perceived phonologically as one single
segment and yet it is considered, at least theoretically, to last twice as long as
the short vowel, it will be associated with two timing units (i.e., two Xs).
X  X   X  X
|   \ /   |
l    iː   d
Similarly, a geminate consonant, like double ‘ll’ in the Italian word
stella, will also be represented as two timing units associated with one melody:
X  X  X  X   X  X
|  |  |   \ /   |
s  t  ɛ    l    ɑ
As to diphthongs, which have two melodies, a distinction has to
be made between long and short ones. Long diphthongs, such as those
in English (e.g., in boy), are associated with two timing slots, whereas
short diphthongs, like those in Icelandic (e.g., in [laɪstɪ] ‘lock’), are
represented as being linked to only one slot.
a.
X X X
| | |
b o ɪ
b.
X   X   X  X  X
|  / \  |  |  |
l a   ɪ s  t  ɪ
The same principle applies in the timing tier representation of

an affricate or of a prenasalized stop. Such sounds are simultaneously
monosegmental, with a single X slot, and bisegmental, since they
involve a dual sequential articulation (i.e., two melodies). See, e.g.,
the representations of the English word job and of the Sinhala word
for ‘blind’ [lɑⁿdɑ].
a.
 X   X  X
/ \  |  |
d ʒ  ɒ  b
b.
X  X   X   X
|  |  / \  |
l  ɑ  n d  ɑ
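The associations between the two tiers can be written down as a simple data structure. The Python sketch below is only one possible encoding, assumed here for illustration: each link pairs the melody material with the number of timing slots it occupies, and the names `WORDS`, `slot_count` and `timing_tier` are hypothetical.

```python
# Each link is (melody material, number of X slots it is associated with).
WORDS = {
    "bat":    [("b", 1), ("æ", 1), ("t", 1)],
    "lead":   [("l", 1), ("iː", 2), ("d", 1)],                    # long vowel: one melody, two slots
    "stella": [("s", 1), ("t", 1), ("ɛ", 1), ("l", 2), ("ɑ", 1)],  # geminate: one melody, two slots
    "boy":    [("b", 1), ("o", 1), ("ɪ", 1)],                     # long diphthong: two melodies, two slots
    "job":    [("dʒ", 1), ("ɒ", 1), ("b", 1)],                    # affricate: two melodies, one slot
}


def slot_count(links) -> int:
    """Total number of timing units on the timing tier."""
    return sum(slots for _, slots in links)


def timing_tier(links) -> str:
    """Render the timing tier as a row of X symbols."""
    return " ".join("X" for _ in range(slot_count(links)))


for word, links in WORDS.items():
    print(f"{word:7} {timing_tier(links)}")
```

The slot counts match the number of Xs in the diagrams above.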
By associating the timing tier representation with the onset-rhyme
representation, we obtain the following syllabic structure for
(a) cage and (b) shriek:
a.     σ
   O       R
         N    Co
   X    X X   X
   k    e ɪ   dʒ
b.      σ
    O        R
          N     Co
   X X   X X    X
   ʃ ɹ    iː    k
2. Syllabification
2.1. Principles of syllabification
Nucleuses are the most important components of syllables, so

they are to be granted the role of syllable heads. Syllabification (= the
parsing of segments into syllables) begins by marking the nucleuses
(the peaks) (see, e.g., the syllabification of the words [ˈkɹɪtɪk] and
[ɹɪˈflɛkʃn̩] below) and continues by selecting the onsets.
a.
    N   N
k ɹ ɪ t ɪ k
b.
  N     N     N
ɹ ɪ f l ɛ k ʃ n̩
The intervocalic [t] in [ˈkɹɪtɪk] qualifies in principle as either

an onset or a coda. There is a general tendency however in natural
languages to assign an intervocalic consonant to the onset, according
to what has been named the principle of minimal onset satisfaction:
minimal satisfaction of onsets takes priority over satisfaction of codas
– see below. (There are however syllables without any onset: e.g. [ə]
in a-bout [ə]σˈ[baʊt]σ.)
a.
  O R O R
    N   N
k ɹ ɪ t ɪ k
b.
O R   O R   O R
  N     N     N
ɹ ɪ f l ɛ k ʃ n̩
The second principle which applies in onset fulfillment is that

of onset maximization: maximal formation of onsets takes priority
over formation of codas. According to this principle, with a given
string of segments in which the consonants may in principle be
syllabified in more than one way, syllabification will take place such
that consonants which may occupy either coda or onset position will
occur in the onset rather than in the coda. The two cases in which the
onset maximization principle applies in the examples above are the
clusters [kɹ] and [fl]. These two clusters obey the English phonotactic
constraints on syllable well-formedness, so they can be selected in the
onset. This is not the case of *[kʃ], which is not a well-formed
English consonantal cluster; it has to be split in the syllabification, so
that [k] is assigned to the coda and [ʃ] to the onset.
a.
  σ       σ
 O  R   O  R
    N     N Co
k ɹ ɪ   t ɪ k
b.
 σ       σ       σ
O R    O   R    O R
  N       N Co    N
ɹ ɪ   f l ɛ k   ʃ n̩
Phonotactic constraints (those rules which restrict the set of
permissible combinations of segments in a certain language) are thus
essential in syllabification (see Appendix 1). A syllable may only
include in its onset and coda, respectively, consonantal clusters
allowed in that particular language. Not every consonantal sequence
which occurs in a language is a well-formed consonantal cluster: e.g.,
in the English words cobweb ["kQb∩wEb] and knapweed
["n&p∩wi:d] the sequences *[bw] and *[pw] are not good clusters,
because they can never occur in the onset of a word-initial syllable –
there is no word starting with [bw] or [pw] in English. (The symbol ‘"’
is used to indicate the presence of primary stress on the following
syllable, whereas ‘∩’ indicates secondary stress.)
Consequently, we have to ignore the sonority sequencing
generalization and the onset maximization principle in these cases and
split these sequences in syllabification: the first consonant should
belong to the coda of the initial syllable, while the second consonant
should be part of the onset of the final syllable:
       σ            σ
     O   R        O   R
         N  Co        N  Co
     k   Q  b     w   E  b
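The two onset principles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LEGAL_ONSETS` is a tiny hand-picked inventory (not the full set of clusters in Appendix 1), the function name `syllabify` is hypothetical, sounds are written in the transcription used in this chapter, and "n=" stands in for syllabic [n⎯]. The nuclei are supplied by hand, as in the first step of syllabification.

```python
# Hypothetical, deliberately tiny inventory of well-formed English onsets.
LEGAL_ONSETS = {
    ("k",), ("R",), ("t",), ("f",), ("l",), ("S",), ("b",), ("w",),
    ("k", "R"), ("f", "l"),
}


def syllabify(segments, peaks):
    """Return a list of (onset, nucleus, coda) triples.

    segments: list of sound symbols; peaks: indices of the syllable nuclei.
    """
    peaks = sorted(peaks)
    onset_starts = []
    for n, peak in enumerate(peaks):
        left = 0 if n == 0 else peaks[n - 1] + 1
        if n == 0:
            # Word-initial consonants are all taken into the first onset here.
            start = left
        else:
            start = peak
            # Extend the onset leftwards one consonant at a time (minimal
            # satisfaction first, then maximization) while it stays well-formed.
            for k in range(peak - 1, left - 1, -1):
                if tuple(segments[k:peak]) in LEGAL_ONSETS:
                    start = k
                else:
                    break
        onset_starts.append(start)

    syllables = []
    for n, peak in enumerate(peaks):
        # Whatever is left before the next onset becomes the coda.
        end = onset_starts[n + 1] if n + 1 < len(peaks) else len(segments)
        syllables.append(
            (segments[onset_starts[n]:peak], segments[peak], segments[peak + 1:end])
        )
    return syllables


print(syllabify(["k", "R", "I", "t", "I", "k"], [2, 4]))
print(syllabify(["k", "Q", "b", "w", "E", "b"], [1, 4]))
```

Because *[bw] is not in the inventory, the sketch places [b] in the coda of the first syllable of cobweb, just as *[kS] is split in reflection.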
3. Syllable weight
An alternative model to the onset-rhyme theory is the mora
theory – based on syllable weight. In such an approach, syllables are
no longer divided into the immediate constituents onset and rhyme,
but into ‘weight units’ – also called moras or morae (a term
originally used in classical poetic prosody) – represented by the Greek
letter mu, μ. By convention, a light syllable only contains one mora
(hence the name monomoraic), while a heavy syllable contains at
least two moras (bimoraic syllable), or even three moras (trimoraic
superheavy syllable).
a. light    b. heavy    c. heavy    d. heavy    e. superheavy
     σ          σ           σ           σ            σ
     μ         μ μ         μ μ         μ μ         μ μ μ
   C V        C V        C V V       C V C       C  V  C
The single V symbol in (a) above stands for a short vowel, in
(b) and (e) – for a long vowel, while the two V symbols in (c) stand
for a diphthong. In Section X.1.1.2 we stated that English is a rhyme-
weight language, so English CVC syllables are heavy. In fact, they
acquire weight by position, i.e., when followed by consonants. For
instance, the final consonant in a word like imagine [I"m&ÙIn] is not
considered to project (i.e., to be linked to) any mora because it is not
followed by another consonant, unlike [n] in agenda [@"ÙEnd@],
which is followed by [d] and does project a mora of its own.
a.  σ     σ      σ           b.  σ     σ       σ
    μ     μ      μ               μ    μ μ      μ
    I   m &    Ù I n             @  Ù E n    d @
As already stated in Section X.1.1.2, initial (onset) consonants
do not contribute to the weight of a syllable, so for the weight of a
syllable it does not matter whether there are two consonants in the
onset or one, or three, or none. Such consonants, therefore, have a
special (‘extramoraic’) representation: they are usually
associated directly with the syllable node.
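The mora count described above can be made concrete with a short Python sketch. The function names `mora_count` and `weight` are hypothetical, and the sketch assumes that a coda contributes at most one mora and only "by position", i.e. when another consonant follows; onset consonants are simply ignored.

```python
def mora_count(vowel, has_coda, followed_by_consonant):
    """Count the moras of one syllable.

    vowel: "short", "long" or "diphthong".
    has_coda: True if the syllable ends in at least one consonant.
    followed_by_consonant: True if another consonant follows that coda.
    Onset consonants are ignored because they are extramoraic.
    """
    moras = 1 if vowel == "short" else 2
    if has_coda and followed_by_consonant:
        moras += 1  # weight by position
    return moras


def weight(moras):
    """Map a mora count to the labels used in the text."""
    if moras == 1:
        return "light"
    return "heavy" if moras == 2 else "superheavy"


# The last syllable of "imagine": its final [n] is not followed by a consonant.
print(weight(mora_count("short", True, False)))
# The second syllable of "agenda": [n] is followed by [d].
print(weight(mora_count("short", True, True)))
```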
3.1. Latin stress assignment rule
In many languages (e.g. Latin), stress is sensitive to syllable
weight. The Latin stress assignment rule states that the stress
in this language will always fall on the syllable containing the third
mora counting from right to left, i.e., from the end of the word to its
beginning. Take the
following examples: uidere [.wi."dee.re.] ‘to see’ vs. capere
[."ka.pe.re.] ‘to take’. Both words are trisyllabic, but the rhyme of the
second syllable in the former word consists of a long vowel (which
counts as two morae), whereas the rhyme of the second syllable in the
latter word has only a short vowel (which only counts as one single
mora). Consequently, the stress in uidere will fall on the penultimate
syllable, because it is this syllable which contains the third mora
counting from right to left, whereas capere will be stressed on the
antepenultimate syllable for the same reason.
a.  σ      σ      σ          b.  σ     σ     σ
    μ     μ μ     μ              μ     μ     μ
  w i   " d e e   r e          " k a   p e   r e
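The rule can be checked mechanically. The sketch below implements it exactly as worded in the text (count three moras from the right); the function name `latin_stress` and the fallback for words with fewer than three moras are assumptions, and no special treatment of the final syllable is modelled.

```python
def latin_stress(mora_counts):
    """Return the index of the stressed syllable.

    mora_counts: number of moras per syllable, from left to right.
    The stressed syllable is the one containing the third mora
    counted from the end of the word.
    """
    remaining = 3
    for index in range(len(mora_counts) - 1, -1, -1):
        remaining -= mora_counts[index]
        if remaining <= 0:
            return index
    return 0  # assumption: shorter words are stressed on the first syllable


# uidere: wi (1 mora), dee (2 moras), re (1 mora)
print(latin_stress([1, 2, 1]))
# capere: ka, pe, re (1 mora each)
print(latin_stress([1, 1, 1]))
```

With these inputs the sketch returns the penultimate syllable for uidere and the antepenultimate syllable for capere, in line with the examples above.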
1. What is a syllable?
2. Which is the core syllable type and what is special about it?
3. What other types of syllable do you know?
4. What are phonotactic constraints?
5. What is a syllable peak? Which sounds can form syllable
peaks?
6. What factors determine the sonority level of a sound?
7. What does the sonority hierarchy refer to?
8. What is a sonority profile?
9. What does the sonority sequencing generalisation postulate?
10. What is the minimal sonority distance?
11. What is the obligatory contour principle?
12. How is a syllable analyzed in the Onset-Rhyme theory?
13. What is the composition of a syllable in the perspective of
the onset-rhyme theory?
14. What arguments have been advanced in support of the
onset-rhyme theory?
15. What are alliteration and spoonerism?
16. Which kinds of syllables are considered ‘heavy’ in English?
How does this affect stress assignment?
17. What is the timing tier?
18. Which are the principles of syllabification?
19. What do minimal onset satisfaction and onset maximization
postulate?
20. What is the mora theory based on?
21. What does Latin stress assignment depend on?
22. Arrange the following sounds according to their relative
sonority:
[k] [n] [s] [e] [r] [Z] [i] [b] [m] [l] [a] [w]
23. Draw sonority profiles for the following words:
content verandah tripper hysteria Krakatoa improbable
24. Which of the following hypothetical words are syllabifiable
in English? Explain.
[tnEfIn] [p&hmIl] [dRUlÍIN] [lbEksIt] [NImlQp] [kEskIl] [VdRIkREz]
25. Syllabify the following words using onset-rhyme and timing
tier representations. What kinds of syllables can you identify? Which
principles do you apply during syllabification?
combustion industrialization spectacle flower hairy knuckles pneumonia
26. Syllabify the following words using moraic representations:
accusation cartoon dandelion structural boosting depressive Jeremiah
XI. SUPRASYLLABIC STRUCTURE
1. Stress and accent
In several examples so far we have used a symbol consisting of
a small vertical line placed before a syllable at top level in order to
indicate the location of (main) stress in a word, e.g. ["glQRI]. The
syllable placed after the little vertical line is stressed, i.e., it is
pronounced perceptually more ‘prominently’ (or ‘saliently’) than the
other syllables. This prominence is achieved in English, Romanian
and many other languages by increasing (1) the duration and (2) the
amplitude of the syllable and by modifying (3) its pitch. The most
important element is pitch (see also Section II.1).
This means (1) that stressed syllables often last longer, (2) that
more energy is spent in their articulation so as to make them louder
than unstressed syllables, and (3) that a stressed vowel is articulated
with a different frequency from the others (its pitch changes). In
addition, stressed syllables tend to contain low-vowel nucleuses, with
a high level of (4) sonority, whereas unstressed syllables usually have
high vowels, characterized by less sonority, or a reduced vowel.
The terms ‘stress’ and ‘accent’ are often used as synonyms, but
stress is the prominence given to a syllable (without reference to
pitch), whereas accent is usually associated with pitch. Besides, the
word accent in English is also understood to mean the pronunciation
and speech patterns that are typical of a speech community (as, e.g., in
he speaks with a French accent).
In ‘stress’ languages, including English, intonation also plays
an important role. Thus, in a sentence like It’s Mary, the first syllable
of Mary is likely to be stressed and given some sort of pitch
prominence, but the type of pitch prominence may be, e.g., high, as in:
[interlinear pitch diagram: It’s Mary, with a high pitch on Ma-]
or low, as in:
[interlinear pitch diagram: It’s Mary, with a low pitch on Ma-]
(In this type of transcription the top and bottom lines represent
the top and bottom of the speaker’s speech range and each dot
corresponds to a syllable, the larger dots indicating stressed/accented
syllables.)
Stress and intonation languages, like English, are often
contrasted with pitch accent languages, like Japanese. In Japanese,
words realise their accent by a high pitch on the accented syllable,
followed by a low pitch on the following syllable (unless the accented
syllable is the last in the word) (e.g., óngaku ‘music’, pronounced
high-low-low), a situation that can also be encountered in English.
However, the
Japanese accents cannot be reversed by intonation as English accents
can. In some situations such a reversal would in fact lead to confusion,
because pitch variation is distinctive in Japanese (e.g., háshi
‘chopsticks’ vs. hashí ‘bridge’, tábi ‘socks’ vs. tabí ‘trip’, etc.). The
use of intonation in Japanese is highly limited in comparison to
English.
Stress may have a demarcative function: in many languages,
between any two stresses there must be a word boundary. If the
location of stress is predictable, i.e., if it falls on a fixed syllable in
the word (e.g., the first one – as in Hungarian and Czech, or the last
one – as in French and Turkish), the exact boundary between words
can be determined according to the position of the stress. However, in
connected speech, stressed words alternate with unstressed words
(e.g., weak forms of pronouns, articles, prepositions, etc.) and thus in
French, for instance, stress will delimit a word group rather than a
single word.
There are also languages in which the placement of stress is
unpredictable. English and Romanian, for instance, have no fixed
word-stress and their rules of stress assignment are quite complex. In
such languages word-stress can be used with a distinctive function:
e.g., Romanian urcă ["urk@] ‘(he) climbs’ vs. urcă [ur"k@] ‘(he)
climbed’, pasă ["pas@] ‘pass (noun)’ vs. pasă [pa"s@] ‘(he) passed’;
English convict ["kQnvIkt] (noun) vs. convict [k@n"vIkt] (verb),
perfect ["p3:fEkt] (adjective) vs. perfect [p@"fEkt] (verb), etc.
Every word has at least one stress in its dictionary entry form,
but some types of words most commonly occur in a weak (unstressed)
form in connected speech, e.g., the articles the and a are usually
pronounced [D@] and [@], not ["Di] and ["eI] (strong forms).
English unstressed syllables are pronounced in a lax manner, which
leads to vowel reduction – often to schwa, the most reduced vowel.
Other types of words most commonly occurring without a stress (and
with reduced vowels) are all grammatical words (auxiliary verbs,
personal pronouns and shorter prepositions and conjunctions) (see
Appendix 2 for strong and weak forms of such words in English),
whereas the majority of lexical words (nouns, main verbs, adjectives
and adverbs) commonly occur with a stress.
Stresses in connected speech (in an intonation group) occur
with varying degrees of prominence: (1) primary stress (involving
the principal pitch prominence), (2) secondary stress (involving a
subsidiary pitch prominence), (3) tertiary stress (involving a
prominence produced mainly by length and/or loudness), or (4)
unstressed. Both tertiary stress and lack of stress can be described as
unaccented. Sometimes a polysyllabic word may be characterised by
both a primary and a secondary stress: e.g., in telephone
["tElI∩f@Un]. The secondary stress is usually represented as a small
vertical line placed before a stressed syllable at bottom level.
Any utterance is made up of a sequence of stressed and
unstressed syllables. The way in which the pitch changes during the
utterance following the stressed and unstressed syllables creates the
intonational melody (or contour) of an utterance. Intonational
contours provide information on the syntactic and semantic structure
of utterances and play an important discourse role.
2. The metrical foot
The organizing structure for combining syllables is commonly
called the metrical foot (or rhythm group). The term is familiar from
the study of the metre of traditional verse-forms. In prosody, a foot is
the association of an accented syllable with one or several unaccented
syllables. The accented syllable, being the most prominent, plays the
role of head (= peak) of the foot.
According to the position of the accented syllable, feet are either
right-headed [σ σ́] or left-headed [σ́ σ]. According to their size,
there are binary (bounded) feet (made up of two
syllables only) – e.g. the iamb [σ σ́] and the trochee [σ́ σ] – and
unbounded feet (consisting of all the syllables in a morpheme or
word). If the foot only has one syllable, it is called degenerate
because it lacks internal opposition: when a syllable occurs in
isolation it is neither strong, nor weak in relation to another one.
Below see representations for (a) an unbounded left-headed foot
and (b) a degenerate foot.
a.      F              b.  F
   [σ́ σ σ σ]              [σ]
One current notation uses the symbols ‘s’ or ‘´’ and ‘w’ or ‘˘’
to mark strong and weak syllables, respectively.
  s   w       s    w       s    w       s    w
(Úp gŏes)  (Hár-ry̆)  (crée-py̆)  (cráw-ly̆)
The example above displays what is known as eurhythmicity –
the accented syllables are neither too close together nor too far apart –
in fact in this case they are equally spaced, as often happens in verse
patterns. In English, however, eurhythmicity can be achieved even
though the number of syllables placed between two accents varies: in
time, the distance between the two accents remains fairly constant
(they are isochronous, i.e., they last the same amount of time). In this
respect, the foot is interpreted as a unit of timing, just like the bar or
measure in music. Consider the following example:
  w    w      s   w   w      s   w   w      s   w      s
’Tĭs thĕ  (míd-dlĕ ŏf)  (dáy by̆ thĕ)  (wóo-dĕn)  (clóck)
In this case, the first two syllables do not form a foot. They
belong to the so-called anacrusis, which in principle can be attached
to the final foot of the previous line. The general tendency in English
is to produce syllables in an anacrusis with greater speed than any
unstressed syllables within following feet, so such syllables are
extremely liable to be reduced.
As to the number of unaccented syllables in the four complete
feet in the example above – they vary from zero to two, and yet the
accents are more or less equally spaced (the duration of the feet is
almost the same). This is in fact a feature of the English language,
which allows for any of the following foot structures with little
difference in the time necessary to pronounce them – see below the
representations for the
words: a. cozy, b. carnival, c. palatable, d. characterize.
a. (× ·)    b. (× · ·)    c. (× · · ·)    d. (× · · · ·)
    σ σ         σ σ σ         σ σ σ σ         σ σ σ σ σ
(The accented syllables are indicated here with a cross ‘×’, and
the unaccented syllables with a dot ‘·’. This is known as ‘the bracketed
grid notation’.)
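As a rough illustration, the grouping of syllables into left-headed feet and its rendering in bracketed grid notation can be sketched in Python. The function name `bracketed_grid` and the input format (a string of ‘s’ and ‘w’ marks, as in the scansions above) are assumptions made for this example; unaccented syllables before the first accent are left outside any foot, as an anacrusis.

```python
def bracketed_grid(pattern):
    """Render a string of 's' (accented) and 'w' (unaccented) marks
    as a bracketed grid, one left-headed foot per accented syllable."""
    anacrusis = []
    feet = []
    for mark in pattern:
        if mark == "s":
            feet.append(["×"])      # an accent opens a new foot
        elif feet:
            feet[-1].append("·")    # unaccented syllable joins the current foot
        else:
            anacrusis.append("·")   # no foot yet: part of the anacrusis
    parts = []
    if anacrusis:
        parts.append(" ".join(anacrusis))
    parts.extend("(" + " ".join(foot) + ")" for foot in feet)
    return " ".join(parts)


print(bracketed_grid("sw"))           # e.g. cozy
print(bracketed_grid("wwswwswwsws"))  # the verse line scanned above
```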
This is possible because English is a ‘stress-timed’ language, in
which the rhythmic pulse (or beat) of the speech is determined by the
timing relationship between accented syllables. Each accented syllable
in English coincides with a beat and the distance between them is
approximately the same.
French and Romanian, on the other hand, are characterised by
a different rhythmical pattern. In French, which is a ‘syllable-timed’
language, each syllable corresponds to a beat, except for reduced
syllables which contain a schwa, as, e.g., in mon petit garçon ‘my
little boy’ (see below). The only foot structures possible in French are
thus unary and binary.
(×   ·)   (×)   (×)   (×)
 σ   σ     σ     σ     σ
mɔ̃  pə    ti    gaʁ   sɔ̃
There are cases where foot structure plays a role in
phonological processes. Consider the following data:
[INk] ink                     [∩INklI"neISn] inclination
["In∩klaIn] incline (noun)    [∩In"klaIn] incline (verb)
In ink and inclination the /n/ obligatorily appears as [N],
whereas in incline (noun and verb) the /n/ may occur as [N], but it
does not have to; it may as well occur as [n]. Why is this so? Leaving
aside ink, the syllabification is the same: both in inclination and in
incline the /n/s are syllable-final.
a.  σ     σ      σ     σ         b.  σ      σ
   IN    klI    neI   Sn⎯           In    klaIn
But the foot structure of these words is different. If we assume
that each accented syllable (primary or secondary accent) heads a foot
(be it even degenerate), the difference between the two words depends
on whether the /n/ and the /k/ are both in the same foot or they are
separated by a foot boundary. When the two segments are in the same
foot /n/ obligatorily surfaces as [N] (is obligatorily assimilated to the
velar); when they are in different feet the /n/ may appear as [n].
a.  F         b.     F              F          c.  F       F
    σ             σ     σ        σ     σ           σ       σ
   INk           IN    klI     neI    Sn⎯          In    klaIn
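The dependence of this assimilation on foot structure can be expressed as a small check. The sketch below is only illustrative: the function name `n_before_k`, the nested-list encoding of feet, syllables and segments, and the use of "n=" for syllabic [n⎯] are all assumptions of the example.

```python
def n_before_k(feet):
    """For every /n/ immediately followed by /k/, report whether velar
    assimilation is obligatory (same foot) or optional (different feet).

    feet: list of feet; each foot is a list of syllables;
    each syllable is a list of segment symbols.
    """
    # Flatten the word, remembering which foot each segment belongs to.
    flat = [
        (segment, foot_index)
        for foot_index, foot in enumerate(feet)
        for syllable in foot
        for segment in syllable
    ]
    results = []
    for (segment, foot_a), (following, foot_b) in zip(flat, flat[1:]):
        if segment == "n" and following == "k":
            results.append("obligatory" if foot_a == foot_b else "optional")
    return results


inclination = [[["I", "n"], ["k", "l", "I"]], [["n", "eI"], ["S", "n="]]]
incline = [[["I", "n"]], [["k", "l", "aI", "n"]]]
print(n_before_k(inclination))
print(n_before_k(incline))
```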
3. Intonation and tone
An English utterance can be pronounced in various ways, with
various types of intonation. The very words yes and no or utterances
containing these words – e.g. He said “yes”. or Did he say “yes”? or
Say “yes”! – can be pronounced with a diversity of pitch variation,
such as rising or falling or simply level, signaling all sorts of attitudes
and meanings. Therefore, intonation can be defined as modulated
pitch. Different patterns of modulation correspond to different
intonational contours (= melodies or tunes). Each intonational
contour can be associated with a set of meanings; it applies to
utterances, which are made up of groups of words of varying length
(from isolated words to whole sentences) called intonation groups or
intonational phrases. Intonation groups generally correspond to
constituents of sentences in a somewhat loose way.
In each intonation group there can be several pitches, but there
is one syllable within one word which bears the main stress or
nucleus of that intonation group. This syllable is also called the ‘tonic
syllable’. The nucleus is thus the pitch accent which stands out as the
most prominent in an intonation group.
The nucleus may be preceded by a head and pre-head, and
may be followed by a tail. The pre-head is made up of all the
unaccented syllables preceding the first accented syllable of the
intonation group. The head begins with the accented syllable of the
first stressed word and ends with the syllable immediately preceding
the nucleus. All the syllables that follow the nucleus make up the tail.
For instance, in the utterance He has come to dinner, the pre-head is
He has, the head come to, the nucleus di- and the tail -nner:
[interlinear pitch diagram: He has come to dinner]
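The division of an intonation group into its four parts follows directly from the definitions above, as the following Python sketch shows. The function name `split_intonation_group` and its input format (a list of syllables, the indices of the accented syllables, and the index of the nucleus) are assumptions made for illustration.

```python
def split_intonation_group(syllables, accented, nucleus):
    """Split an intonation group into pre-head, head, nucleus and tail.

    syllables: list of syllables in order.
    accented: indices of all accented syllables (the nucleus included).
    nucleus: index of the tonic syllable.
    """
    first_accent = min(set(accented) | {nucleus})
    return {
        "pre-head": syllables[:first_accent],   # unaccented syllables before the first accent
        "head": syllables[first_accent:nucleus],  # from the first accent up to the nucleus
        "nucleus": syllables[nucleus],
        "tail": syllables[nucleus + 1:],        # everything after the nucleus
    }


parts = split_intonation_group(
    ["He", "has", "come", "to", "di", "nner"], accented=[2, 4], nucleus=4
)
print(parts)
```

If the nucleus is the first accented syllable of the group, the head is simply empty.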
The rises and falls in the pitch taking place on the nucleus or
starting from it are called nuclear patterns (or tones). There are six
types of nuclear tones in English: low fall, high fall, low rise, high
rise, rise-fall and fall-rise. The unaccented syllables preceding the
head can be pronounced on a low pitch level or on a high pitch level.
Heads may be low or stepping (gradually falling to the nucleus). Tails
can take various patterns, depending on the nuclear tone.
Intonation has various functions: (1) grammatical,
(2) attitudinal and (3) accentual. (1) Intonation distinguishes
between questions, statements and exclamations and it also marks
sentence, clause, phrase, or word boundaries. (2) Intonation usually
signals personal attitude: e.g., surprise, joy, anger, irony, etc. (3)
Intonation marks the most important word (and syllable) in the
intonation group, by a change of pitch on the prominent syllable (it
also attaches emphasis to a certain word in the sentence).
English and most of the other European languages are grouped
in the category of stress and intonation languages because in their
case a change in the pitch variation pattern of a certain utterance does
not trigger a change in the meaning of the words contained in the
utterance, but a change in the discourse function of the respective
utterance. For example, rising intonation may turn the declarative
sentence You have succeeded. into the interrogative sentence You have
succeeded? without any alteration in the propositional (semantic)
content of either the utterance or the words included in it.
There are other functions of tone, beside intonation, which do
not occur in English or most European languages but are still very
common in other languages of the world. In tone languages, for
instance, tones are used to differentiate lexical items or to express
morphological functions. One of the most widely known tone languages is
Mandarin Chinese, in which words which share identical segments are
only differentiated lexically by their tonal structure.
mō ‘feel’      mó ‘plan’       mò ‘end’     mǒ ‘smear’
xī ‘sunset’    xí ‘exercise’   xì ‘play’    xǐ ‘wash’
This is possible because in tone languages each syllable is a
tone unit (i.e., each syllable functions as an independently variable
item). In stress languages the syllable may be the smallest tone unit
(e.g., in the monosyllabic utterance school), but there are also larger
tone units which comprise several syllables, of which only one carries
the tone (the tonic syllable) (e.g., in the intonation group to the
school).
In pitch accent languages, as already shown, a single fixed
tonal melody is associated to each word and pitch variation can have a
distinctive function at the lexical level (change the meaning of the
word). The same type of pitch variation would change an order into an
interrogation in English (e.g., Johnny! vs. Johnny?).
Some languages which are predominantly intonation languages
may also make a limited use of tone. This means that in such
languages tone has a distinctive function for some words. This is the
case, e.g., in Norwegian and Swedish, as well as Serbian and
Lithuanian.
In Swedish about five hundred minimal pairs distinguished by
tone alone can be found (see, e.g., the following examples).
búren ‘the cage’ vs. bùren ‘carried’
tánken ‘the tank’ vs. tànken ‘the thought’
ánden ‘the duck’ vs. ànden ‘the spirit’
pánter ‘panther’ vs. pànter ‘pledges’, etc.
Words signaled here by ´ are associated with a single-peaked
falling tone (high-low), while words with ` are commonly double-
peaked (high-low-high-low). The first pattern is in fact the common
accentual pattern for words in Swedish and is not limited to words
where the accent is on the first syllable, whereas the second pattern is
the ‘marked’ pattern and limited to word-initial accent. In connected
speech some indication of the different accents is regularly
maintained.
1. What characterizes stress?
2. What is the difference between stress and accent?
3. What is the difference between a stress and intonation
language and a pitch accent language?
4. Can stress be predictable?
5. Is the position of stress predictable in English?
6. What is the demarcative function of stress?
7. Can stress have a distinctive function?
8. What is the difference between lexical forms and
grammatical forms in English in terms of stress?
9. How many degrees of prominence can stress have?
10. What is the intonational melody (or contour) of an
utterance?
11. What is a metrical foot?
12. How many kinds of metrical feet do you know?
13. What is eurhythmicity?
14. What are isochronous accents?
15. What is the difference between a stress-timed and a
syllable-timed language?
16. Can the foot structure play a role in phonological processes?
17. What is intonation?
18. What are intonation groups?
19. What is the structure of an intonation group?
20. What is a nucleus?
21. What is a nuclear tone (or pattern)?
22. Which are the functions of intonation?
23. What differentiates stress and intonation languages from
tone languages?
24. How can an intonation language make use of tone?
25. Identify the position of the (primary) stress in the following
English words. What effect does placing the stress on one of the
syllables have upon the other syllable nuclei?
correspondent definitely courageous declaration inference infer comfortable
26. Identify the rhythm patterns in the following lines and
isolate the metrical feet within brackets.
“To be, or not to be: that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And by opposing end them. To die: to sleep;
No more; and by a sleep to say we end
The heart-ache, and the thousand natural shocks
That flesh is heir to, ’tis a consummation
Devoutly to be wish’d. To die, to sleep;
To sleep: perchance to dream: ay, there’s the rub;
For in that sleep of death what dreams may come,
When we have shuffled off this mortal coil,
Must give us pause: there’s the respect
That makes calamity of so long life;
For who would bear the whips and scorns of time.”
(W. Shakespeare: Hamlet, III, i)
27. Using the information in Appendix 2, as well as the
knowledge acquired throughout this course, make a phonetic
transcription of the following sentences:
a) Could you show me your book, please?
b) Did he help you as he was supposed to?
c) Must I always tell him what to do?
d) We were wondering if he really was the one.
e) Should she wait for the bus or shall I drive her home?
f) Can it be so that they are from here?
g) But for Mary and John the house would not have been
finished.
h) I am your father and you had better not forget it!
i) I would rather work in a pub than at the station.
j) There are some cookies in a box if you want any.
SAMPLE TESTS
Test A (You will find the answers at the end of Appendix 2.)
1. Define: a) articulatory phonetics; b) active articulators; c) liquids.
2. Indicate the symbols representing the sounds described below
and give examples of words containing them:
a) voiced labio-dental fricative; b) voiced alveo-palatal
affricate; c) voiced velar nasal; d) low front unrounded lax vowel; e)
low back unrounded tense vowel.
3. Give the description in words of the following sounds: a) Í, g, l;
b) u:, E.
Examples: a) p = voiceless bilabial stop; b) U = high back
rounded lax vowel.
4. Transcribe phonetically the following English words:
a) concrete, b) equip, c) divergence.
5. Define: a) contrastive distribution, b) allophone, c)
progressive assimilation.
6. Give two examples of minimal pairs illustrating that the
sound /T/ is a phoneme in English.
7. a) Syllabify the following words. b) Give the common
spelling of these words in English orthography.
a. st&nd@daIzeISn⎯; b. m&njUskrIpt.
Test B
1. Define: a) acoustic phonetics; b) phonetic transcription; c) obstruents.
2. Indicate the symbols representing the sounds described below
and give examples of words containing them:
a) voiceless interdental fricative; b) voiced labio-velar glide; c)
voiced bilabial nasal; d) high front unrounded tense vowel; e) low-mid
central unrounded lax vowel.
3. Give the description in words of the following sounds: a) Z, R, n;
b) O, A:.
Examples: a) k = voiceless velar stop; b) U = high back rounded
lax vowel.
4. Transcribe phonetically the following English words:
a) manufacture, b) honour, c) though.
5. Define: a) complementary distribution, b) phoneme, c)
metathesis.
6. Give two examples of minimal pairs illustrating that the
sound /S/ is a phoneme in English.
7. a) Syllabify the following words. b) Give the common
spelling of these words in English orthography.
a. IlEktr@Um&gn@t; b. fVnd@mEntl⎯.
APPENDIX 1: English consonantal clusters
The first syllable of an English word may begin with a vowel
(zero onset) or with a consonant. If the syllable begins with one
consonant (the initial consonant), this may be any English consonant
except [N] (though [Z] rarely occurs in this position).
There are two sorts of two-consonant initial clusters in English:
one sort composed of [s] (pre-initial) + another consonant (initial)
and the other made up of a consonant (initial) + one of the following:
[l], [R], [w] or [j] (post-initial). Tables 1 and 2 present two-
consonant initial clusters. The clusters corresponding to a question
mark in Tables 1 and 2 are rare in English. Thus, [sR] only occurs in
foreign place names (e.g., Sri Lanka), [gw] is a characteristic Welsh
cluster occurring in a few names in English (e.g., Gwen and Gwent),
[Sw] is to be found in the words schwa and Schweppes, while [gj] and
[Tj] are only used in the archaic words gules and thew.
Table 1 Two-consonant initial clusters with pre-initial [s]
Pre-initial   Initial
s +           p     t     k     f     m     n     l     R     w     j
              spIn  stIk  skIn  sfI@  smEl  sn@U  slIp  ?     swIm  sju:
Table 2 Two-consonant initial clusters with post-initial [l], [R], [w], [j]
Initial   + l     + R     + w     + j
p         plEI    pREI    -       pjU@
t         -       tREI    twIn    tju:n
k         klEI    kRaI    kwIk    kju:
b         blIs    bRIk    -       bju:tI
d         -       dRIp    dwEl    dju:
g         glu:    gRIn    ?       ?
f         flaI    fRaI    -       fju:
T         -       TR@U    TwO:t   ?
S         -       SRu:    ?       -
h         -       -       -       hju:
v         -       -       -       vju:
m         -       -       -       mju:z
n         -       -       -       nju:z
l         -       -       -       lju:d
Three-consonant initial clusters are related to the two-
consonant clusters in Tables 1 and 2. Their number is restricted – see
Table 3.
Table 3 Three-consonant initial clusters
Pre-initial   Initial   Post-initial
                        l             R          w          j
s             p         ‘splay’       ‘spray’    -          ‘spew’
s             t         -             ‘string’   -          ‘stew’
s             k         ‘sclerosis’   ‘screen’   ‘squeak’   ‘skewer’
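Table 3 is small enough to be encoded directly. The following Python sketch is a hedged illustration of how the pre-initial + initial + post-initial pattern restricts three-consonant onsets; the names `THREE_CONSONANT_ONSETS` and `is_three_consonant_onset` are hypothetical, and the symbols follow the transcription used in this book.

```python
# Post-initial consonants attested after [s] + initial consonant (Table 3).
THREE_CONSONANT_ONSETS = {
    "p": {"l", "R", "j"},       # splay, spray, spew
    "t": {"R", "j"},            # string, stew
    "k": {"l", "R", "w", "j"},  # sclerosis, screen, squeak, skewer
}


def is_three_consonant_onset(cluster):
    """Check a sequence of three consonant symbols against Table 3."""
    if len(cluster) != 3:
        return False
    pre_initial, initial, post_initial = cluster
    return (
        pre_initial == "s"
        and post_initial in THREE_CONSONANT_ONSETS.get(initial, set())
    )


print(is_three_consonant_onset(["s", "k", "w"]))  # as in squeak
print(is_three_consonant_onset(["s", "t", "l"]))  # a gap in Table 3
```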
At the end of a word we may find no consonant at all (zero
coda), or we may find up to four consonants. When there is one
consonant only, it is called the final consonant. Any consonant may
play this role, except [h], [R] (unless we are dealing with a rhotic
dialect), [w] and [j].
As in the case of initial clusters, there are two types of two-
consonant final clusters: the final consonant is either preceded by a
pre-final consonant (selected from [m], [n], [N], [l], [s]) or followed
by a post-final consonant. The post-final consonants can often be
identified as separate morphemes (though not always): [s], [z], [t],
[d], [T]. More than one post-final consonant can occur in a final
cluster – see Table 4 (after Roach 1993: 72).
Table 4 Final clusters
                     Pre-final   Final   Post-final 1   Post-final 2   Post-final 3
‘helped’     hE      l           p       t              -              -
‘banks’      b&      N           k       s              -              -
‘bonds’      bQ      n           d       z              -              -
‘twelfth’    twE     l           f       T              -              -
‘fifths’     fI      -           f       T              s              -
‘next’       nE      -           k       s              t              -
‘lapsed’     l&      -           p       s              t              -
‘twelfths’   twE     l           f       T              s              -
‘prompts’    pRQ     m           p       t              s              -
‘sixths’     sI      -           k       s              T              s
‘texts’      tE      -           k       s              t              s
APPENDIX 2: English weak forms
In the English language there are certain highly frequent words
which can be pronounced in two different ways: strong or weak.
Almost all the words which have both a strong and a weak form
belong to the category of function words, which contains auxiliary
verbs, pronouns, prepositions, conjunctions, determiners etc. All of
these are more frequently pronounced in their weak forms.
The most common weak forms will be introduced in Table 5
(after Roach: 102-9).
Table 5 English weak forms
Function Word    Weak Forms
1. ‘THE’         D@ (before consonants); Di (before vowels)
2. ‘A’, ‘AN’     @ (before consonants); @n (before vowels)
3. ‘AND’         @n (sometimes n⎯ after t, d, s, z, S)
4. ‘BUT’         b@t
5. ‘THAT’        D@t (when it introduces a relative clause)
6. ‘THAN’        D@n
7. ‘HE’          i; hi (in initial position)
8. ‘HER’         @ (before consonants); @R (before vowels); h@ (in initial position)
9. ‘HIM’         Im
10. ‘HIS’        Iz
11. ‘SHE’        Si
12. ‘THEM’       D@m
13. ‘US’         @s
14. ‘WE’         wi
15. ‘YOU’        ju
16. ‘YOUR’       j@ (before consonants); j@R (before vowels)
17. ‘AT’         @t
18. ‘FOR’        f@ (before consonants); f@R (before vowels)
19. ‘FROM’       fR@m
20. ‘OF’         @v
21. ‘TO’         t@ (before consonants); tu (before vowels)
22. ‘AS’         @z
23. ‘SOME’       s@m (before uncountable nouns and other nouns in the plural)
24. ‘THERE’      D@ (before consonants); D@R (before vowels)
25. ‘CAN’        k@n
26. ‘COULD’      k@d
27. ‘HAD’        @d; h@d (in initial position)
28. ‘HAS’        @z; h@z (in initial position)
29. ‘HAVE’       @v; h@v (in initial position)
30. ‘SHALL’      S@l or Sl⎯
31. ‘SHOULD’     S@d
32. ‘MUST’       m@s (before consonants); m@st (before vowels)
33. ‘DO’         d@ (before consonants); du (before vowels)
34. ‘DOES’       d@z
35. ‘AM’         @m
36. ‘ARE’        @ (before consonants); @R (before vowels)
37. ‘WAS’        w@z
38. ‘WERE’       w@ (before consonants); w@R (before vowels)
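For the entries of Table 5 whose weak form depends on the following sound, the choice can be sketched as a simple lookup in Python. Only a few rows are encoded; the names `WEAK_FORMS`, `VOWEL_SYMBOLS` and `weak_form` are hypothetical, the vowel symbol set is an assumption about this book's transcription, and the following word must be given without stress marks.

```python
# A few rows of Table 5: (form before consonants, form before vowels).
WEAK_FORMS = {
    "the": ("D@", "Di"),
    "to": ("t@", "tu"),
    "for": ("f@", "f@R"),
    "do": ("d@", "du"),
    "must": ("m@s", "m@st"),
    "are": ("@", "@R"),
}

# Assumed set of symbols that can begin a vowel in this transcription.
VOWEL_SYMBOLS = set("@IEQUVAO3&aeiou")


def weak_form(word, next_transcription):
    """Pick the weak form of `word` according to the first sound
    of the following word (given in transcription)."""
    before_consonant, before_vowel = WEAK_FORMS[word.lower()]
    starts_with_vowel = next_transcription[:1] in VOWEL_SYMBOLS
    return before_vowel if starts_with_vowel else before_consonant


print(weak_form("the", "bUk"))  # the book
print(weak_form("the", "&pl"))  # the apple
```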
SUGGESTED ANSWERS TO SAMPLE TEST A
1.a) Articulatory phonetics is a branch of phonetics which deals
with the production of speech sounds;
b) Active articulators are parts of the vocal tract which actively
participate in the production of sounds: the lips and the tongue.
c) Liquids are sonorant consonants produced with
approximation. There are two types of liquids: laterals (l-sounds) and
rhotics (r-sounds).
2. a) voiced labio-dental fricative: [v] – e.g., in voice;
b) voiced alveo-palatal affricate: [dʒ] – e.g., in George;
c) voiced velar nasal: [ŋ] – e.g., in bring;
d) low front unrounded lax vowel: [æ] – e.g., in ash;
e) low back unrounded tense vowel: [ɑː] – e.g., in father.
3. a) [tʃ] : voiceless alveo-palatal affricate
[g] : voiced velar plosive (oral stop)
[l] : voiced lateral alveolar liquid
b) [uː] : high back long tense rounded vowel
[ɛ] : low-mid front short lax unrounded vowel
4. a) concrete: [kənˈkriːt]
b) equip: [ɪˈkwɪp]
c) divergence: [daɪˈvɜːdʒn̩s]
5. a) Contrastive distribution is the type of distribution which
characterizes phonemes. Two sounds are in contrastive distribution if
replacing one with the other (in a minimal pair) yields
another word with a different meaning: e.g., [bæn] vs. [pæn].
b) An allophone is a contextual variant of a phoneme: e.g., /p/
in English is realized as the allophone [pʰ] at the beginning of a
stressed syllable unless preceded by [s].
c) Progressive assimilation is a type of assimilation in which
a phonological feature spreads from one sound to the following sound:
e.g., in open [ˈəʊpm̩].
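The allophone rule in answer 5 b) above can be stated as a small function: /p/ is replaced by [pʰ] only when it is the first segment of a stressed syllable. The sketch below is only an illustration and is not part of the original course. The name `realize_onset` and the representation of an onset as a list of symbols are assumptions.

```python
def realize_onset(onset, stressed):
    """Apply the aspiration rule of answer 5 b) to a syllable onset.

    `onset` is a list of phoneme symbols and `stressed` says whether the
    syllable carries stress. /p/ becomes [pʰ] only when it is the first
    segment of a stressed syllable, so /p/ after /s/ (as in /sp/) is
    left unchanged.
    """
    if stressed and onset and onset[0] == "p":
        return ["pʰ"] + list(onset[1:])
    return list(onset)


print(realize_onset(["p"], stressed=True))       # as in 'pin'
print(realize_onset(["s", "p"], stressed=True))  # as in 'spin'
print(realize_onset(["p"], stressed=False))      # unstressed syllable
```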
6. thick [θɪk] vs. kick [kɪk]; moth [mɒθ] vs. mob [mɒb].
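Answers 5 a) and 6 both rely on the notion of a minimal pair: two words whose transcriptions differ in exactly one segment. The sketch below checks this mechanically and is not part of the original course. The function name `is_minimal_pair` is hypothetical, and each transcription is assumed to be already split into a list of segments.

```python
def is_minimal_pair(a, b):
    """True if two segment lists have equal length and differ in one position."""
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


thick = ["θ", "ɪ", "k"]
kick = ["k", "ɪ", "k"]
moth = ["m", "ɒ", "θ"]
mob = ["m", "ɒ", "b"]

print(is_minimal_pair(thick, kick))  # differ only in the first segment
print(is_minimal_pair(moth, mob))    # differ only in the last segment
print(is_minimal_pair(thick, moth))  # differ in every segment
```

The check only compares transcriptions. Whether the two forms are real words with different meanings still has to be established separately.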
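7. (σ = syllable, O = onset, R = rhyme, N = nucleus, Co = coda; ∅ marks an empty coda)
a)
σ1: O = st; R: N = æ, Co = n
σ2: O = d; R: N = ə, Co = ∅
σ3: O = d; R: N = aɪ, Co = ∅
σ4: O = z; R: N = eɪ, Co = ∅
σ5: O = ʃ; R: N = n̩, Co = ∅
b)
σ1: O = m; R: N = æ, Co = ∅
σ2: O = nj; R: N = ʊ, Co = ∅
σ3: O = skr; R: N = ɪ, Co = pt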
RECOMMENDED FURTHER READING
Mateescu, Dan (2002). A Course in English Phonetics and Phonology.
Bucureşti: Editura Universităţii Bucureşti.
Pârlog, Hortensia (1997). English Phonetics and Phonology.
Bucureşti: ALL.
Rădulescu, Mara-Octavia (2001). An Introduction to Phonetics and
Phonology. Bucureşti: CREDIS.
BIBLIOGRAPHY
Carr, Philip (1993). Phonology. London: Macmillan.

Chiţoran, Dumitru (1978). English Phonetics and Phonology.
Bucureşti: Editura didactică şi pedagogică.
Chiţoran, Dumitru & Lucreţia Petri (1977). Workbook in English
Phonetics and Phonology. Bucureşti: Editura didactică şi
pedagogică.
Chomsky, Noam & Morris Halle (1968). The Sound Pattern of English. New York:
Harper & Row. In Davenport & Hannahs (1998).
Clements, George N. & Samuel Jay Keyser (1983, 1990). CV
Phonology: A Generative Theory of the Syllable. Cambridge,
Mass.: MIT Press.
Davenport, Mike & S. J. Hannahs (1998). Introducing Phonetics and
Phonology. London: Arnold.
Ewen, Colin J. & Harry van der Hulst (1999). Phonological
Representation. An Introduction to the Structure of Words.
Cambridge: Cambridge University Press.
Harris, J. (1994). English Sound Structure. Oxford & Cambridge,
Mass.: Blackwell.
Kenstowicz, Michael (1994). Phonology in Generative Grammar.
Oxford: Blackwell.
Ladefoged, Peter (1993). A Course in Phonetics. 3rd edn. New York:
Harcourt Brace.
Lyons, John (1969). Introduction to Theoretical Linguistics.
Mateescu, Dan (2002). A Course in English Phonetics and Phonology.
Bucureşti: Editura Universităţii Bucureşti.
Pârlog, Hortensia (1997). English Phonetics and Phonology.
Bucureşti: ALL.
Rădulescu, Mara-Octavia (2001). An Introduction to Phonetics and
Phonology. Bucureşti: CREDIS.
Roach, Peter (1993). English Phonetics and Phonology, 2nd edn.
Roca, Iggy & Wyn Johnson (1999). A Course in Phonology. Oxford:
Blackwell.
Spencer, Andrew (1996). Phonology: Theory and Description.
Oxford: Blackwell.
Trudgill, Peter & Jean Hannah (1994). International English. A Guide
to the Varieties of Standard English. London: Arnold.
Wells, John C. (1982). Accents of English. 3 vols. Cambridge:
Cambridge University Press.