“Ovidius” University, Constanța

Faculty of Letters and Theology
Department of English

Semester I, Year I



Assoc. Prof. Dr. Eduard Vlad

Phonemic symbols used in the course

Long vowels
/i:/ in fee, sea, see /fi:, si:, si:/
/ɑ:/ in smart, art, far /smɑ:t, ɑ:t, fɑ:/
/ɔ:/ in cause, ought, more /kɔ:z, ɔ:t, mɔ:/
/u:/ in blue, flew, moo /blu:, flu:, mu:/
/ɜ:/ in sir, dirt, earn /sɜ:, dɜ:t, ɜ:n/

Short vowels
/ɪ/ in tip, in, very /tɪp, ɪn, ˈverɪ/
/e/ in ten, pet, pleasure /ten, pet, ˈpleʒə/
/æ/ in cat, Ann, happy /kæt, æn, ˈhæpɪ/
/ʌ/ in up, but, young /ʌp, bʌt, jʌŋ/
/ɒ/ in on, cop, Ross /ɒn, kɒp, rɒs/
/ʊ/ in good, put, pull /gʊd, pʊt, pʊl/
/ə/ in painter, about, perhaps /ˈpeɪntə, əˈbaut, pəˈhæps/

Consonants
/p/ in pin, supper, up /pɪn, ˈsʌpə, ʌp/
/b/ in bet, about, hub /bet, əˈbaut, hʌb/
/t/ in tick, apt, attic /tɪk, æpt, ˈætɪk/
/d/ in deed, add, addict /di:d, æd, ˈædɪkt/
/k/ in can, act, talk /kæn, ækt, tɔ:k/
/g/ in game, again, log /geɪm, əˈgen, lɒg/
/f/ in fine, off, often /faɪn, ɒf, ˈɒfn/
/v/ in vain, over, of /veɪn, ˈəʊvə, əv/
/s/ in seen, asset, cross /si:n, ˈæset, krɒs/
/z/ in zeal, easy, buzz /zi:l, ˈi:zɪ, bʌz/
/θ/ in three, Kathy, path /θri:, ˈkæθɪ, pɑ:θ/
/ð/ in they, other, breathe /ðeɪ, ˈʌðə, bri:ð/
/ʃ/ in ship, usher, finish /ʃɪp, ˈʌʃə, ˈfɪnɪʃ/
/ʒ/ in pleasure, usually /ˈpleʒə, ˈjuʒuəlɪ/
/h/ in house, hen /haus, hen/
/tʃ/ in chin, each, itchy /tʃɪn, i:tʃ, ˈɪtʃɪ/
/dʒ/ in gin, urge /dʒɪn, ɜ:dʒ/
/m/ in mine, hymn /maɪn, hɪm/
/n/ in name, sunny, one /neɪm, ˈsʌnɪ, wʌn/
/ŋ/ in sing, think /sɪŋ, θɪŋk/
/l/ in lane, alone, eel /leɪn, əˈləʊn, i:l/
/r/ in read, red, arrive /ri:d, red, əˈraɪv/
/w/ in we, suave, persuade /wi:, swɑ:v, pəˈsweɪd/
/j/ in yes, queue, issue /jes, kju:, ˈɪsju:/

Slants / / are used for phonemic transcription, while brackets [ ] are used for phonetic transcription.


Communication by means of sound signals is not the exclusive province of human
beings. All creatures are said to communicate with each other to attract each other’s
attention, to warn of danger, to give information about the availability of food, directions,
etc. Nevertheless, we humans have managed to pattern the sound continuum we can
produce to a remarkable extent and to come up with an unrivalled, extremely efficient and
articulate system of communication.
Phonetics and phonology are the branches of linguistic investigation that concern
themselves with the description and functioning of the speech sounds of languages. One has
to distinguish between the practically unlimited number of different sounds a human being
can produce and the sounds that have acquired a functional status in a particular language.
The latter are called phonemes.
Although the boundary between phonetics and phonology is hard to draw, the two
may be said to operate at various, distinct levels.
Phonetics has to do with the concrete characteristics of the phonemes in terms of
their articulation, transmission and perception. Accordingly it is further subdivided into
articulatory, acoustic and auditory phonetics.
Articulatory phonetics deals with the way in which speech sounds are produced.
Sounds are usually classified according to the position of the lips, tongue, soft palate,
according to whether the air flow coming from the lungs is obstructed or not, whether the
vocal cords vibrate or not, etc.
Acoustic phonetics studies the transmission of speech sounds through the air.
When a speech sound is articulated it produces sound waves, which are investigated by
means of various instruments.
Auditory phonetics deals with the hearing mechanism, describing how sounds are
perceived by the listener.
Phonology analyses sound structure in language, including the functional, phonemic
behaviour of the speech sounds, their combinatory possibilities, as well as such prosodic
features as rhythm, stress, intonation.
Phonetic and phonological investigations may be applied to specific languages or to
general linguistic phenomena. They may be conducted comparatively, with a view to
establishing what the speech sound systems of two or several languages have in common or
contrastively, to disclose differences and similarities that may prove useful in foreign
language teaching and learning. The study may be synchronic (an investigation of the
sound system of a language at a specific time) or diachronic (the system seen throughout
its historical development).
Apart from phonetics and phonology, a description of a language includes
information about the lexemes (vocabulary items), their meanings and relations. This area is
covered by lexicology.
Morphology provides information about the structure or forms of words, primarily
through the use of the morpheme construct. It is further divided into inflectional
morphology (the study of inflections) and derivational morphology (the study of word formation).
While morphology studies word structure, syntax covers the rules governing the
way words are combined to form sentences in a language. It studies the interrelationships
between elements of sentence structure, and the arrangement of sentences in sequences.
Semantics investigates the way in which meaning is structured in a language at
various linguistic levels. In functional grammars, the boundary between semantics and
grammar 'proper' (traditionally, morphology and syntax) is blurred, grammar being pushed
in the direction of semantics.
Pragmatics is the study of the use of language in communication, particularly the
relationships between utterances and the contexts and situations in which they are used. It
may be contrasted with semantics, which deals with meaning without reference to the
interlocutors and communicative functions of sentences.
Starting from Saussure's terminology, most linguists have viewed language as being
analysable at the level of expression, corresponding to the signifiant, and the level of
content, corresponding to the signifié. At each level a further distinction is operated
between substance and form. A comprehensive description of a language will thus integrate
a phonetic (substance) and phonological (form) analysis of the expression level, a
semantic (substance) and syntactic (form) analysis of the content level, while also taking
into account contextual features of actual communicative events.
Associating phonetics and phonology with the expression level may give the
impression that they deal with surface (superficial) phenomena only. In fact, they have to do
as much with the 'skin' of language as with its 'flesh and bones' and even its 'soul', as will be
seen later in the course.
Form and substance, as well as expression and content, are difficult to separate
completely, if at all, as post-Saussurean linguistics has found. Although it is sometimes
useful to distinguish between phonetics (substance) and phonology (form), we will mainly
see them as closely interrelated. Therefore, more often than not, we will use the term
phonetics as the inclusive term for both phonetics proper and phonology.


The way in which we acquire our native language and the manner we adopt to learn
a foreign language are usually very different. The difficulty many foreign language learners
experience is mainly due to that.
A child is exposed to the sound system of his native language for many months
before he actually starts using it properly. Expression is mastered before content, to use the
terms mentioned previously. A learner of a foreign language has already acquired the
phonological system of his own language, as well as a great number of words and linguistic
patterns. He will have the tendency to concentrate on lexis and grammar first, considering
them essential, and taking pronunciation more or less for granted. As a result, accuracy and
fluency in speech are likely to suffer, many learners feeling apprehensive about freely
communicating in the foreign language.
One of the lessons to be derived is that the sound system of the foreign language
should be thoroughly focused on from the very beginning and that constant oral practice is a
must at all levels of study. Foreign language learners do not usually benefit from the almost
round-the-clock exposure that young children acquiring their native language do; moreover,
they are hampered and misled by the peculiarities of their own phonological system. Their
success in the learning of another language depends on their consciously developing a fine
ear for the foreign sounds and the articulatory skills needed for their imitation. This can be
achieved by the familiarisation with a number of sound features, contrasts, phenomena that
phonetics and phonology make explicit.
More than the foreign language learner, the teacher must have sufficient information
about the way speech sounds are produced by the organs of speech, about the differences
between the phonological systems of the foreign and native language of the students he
works with. He will have to use elements of articulatory phonetics to teach learners how to
pronounce ‘difficult sounds’; he will do that contrastively, concentrating on the differences
between the native and the foreign phonological systems.
Apart from foreign language teaching and learning, phonetics is also used in the
treatment of certain speech defects, in teaching delivery to actors and singers, in the
elaboration of alphabets for languages lacking one, etc.



Although we sometimes claim to speak a common language, be it English or Romanian,
each of us speaks a distinct variety, a certain idiolect. This does not prevent us from
understanding each other, as variation is kept within reasonable limits. For purposes of
analysis, we will leave out individual variation and will concentrate on varieties whose
features are shared by a large number of speakers.
Any use of language involves variation due to a large number of factors, the most
important of which pertain to region, social group, field of discourse, medium and
attitude. Before dealing with the first two types, we will only touch on the last three.
Variation according to field of discourse is conspicuous when we distinguish
between the language of journalism and the language of cooking recipes, between political
language and poetic language, etc. Members of a certain profession develop their own
jargon, a term which is typically not used by the members of the profession but by those
unfamiliar with that particular type of language, and/or by those who dislike it.
The varieties according to medium may be restricted to those conditioned by writing
and speaking. Writing normally presumes the absence of the addressee and of a common
context of situation. This imposes the necessity of a far greater explicitness. Writing also
lacks many of the devices we use to transmit language by speech and that we will study
later in the course: stress, rhythm, intonation, tempo. As a consequence writers often have
to reformulate their sentences to successfully convey what they want to express within the
orthographic system.
Varieties according to attitude are due to the choice of linguistic form that proceeds
from our attitude to the interlocutor, to the topic, and to the purpose of our communication.
A gradient between formal (polite, impersonal) and informal (warm, friendly) language can
be established.
Dialects are varieties of a language spoken in certain parts of a country (regional
dialects) or by people belonging to particular social classes (social dialects or sociolects).
They are different from each other in terms of lexis, grammar and pronunciation. If
somebody says I be ready and somebody else says I am ready, the two persons may be said
to speak two different dialects, as there is a difference in grammar between the two.
Geographical dispersion is in fact the classic basis for linguistic variation, and in the
course of time, with poor communication and relative remoteness, such dispersion results in
dialects becoming so distinct that we regard them as different languages. This latter stage
was long ago reached with the original Germanic dialects that are now Dutch, English,
German, Swedish, but it has not been reached with the dialects of English that have resulted
from the regional dispersion and separation of communities within the British Isles and
elsewhere in the world. Within the English-speaking world, on the contrary, we are
witnessing now a narrowing of the differences between the national varieties of English.
A dialect is often associated with a particular accent. Accents are varieties based on
pronunciation peculiarities only. Although region is what they are mainly dependent on,
social class, age and educational background also leave their mark.
There are many different accents in England, and the range becomes much wider if
the accents of Scotland, Wales and Northern Ireland (Scotland and Wales are included in
Britain and with Northern Ireland form the United Kingdom) are also considered. Within
the accents of England, the main distinction is made between northern and southern ones.
This is a very broad division, and where the boundary between them lies is hard to settle
with any certainty. Nevertheless, on hearing a pronunciation typical of someone from
Yorkshire, for example, most English people would identify it as ‘Northern’.
Some people are very good at identifying accents. There have been radio shows in
which experts have tried to identify the regional background of the people who phoned in,
just from their voices. However, Professor Higgins’s claim to place any man ‘within six
miles’, sometimes ‘within two streets’, is untenable today. This is due to the mobility of the
Britons nowadays: it is very unlikely for people to live their whole lives in one place, and
mixed accents have become widespread.
Attitudes that native speakers adopt to certain accents of English vary considerably.
For us Romanians it might seem strange that rural accents should be viewed more
favourably in England than those of large urban areas such as Birmingham or Liverpool.


RP (Received Pronunciation) is the name given to the regionally neutral accent in
British English, historically deriving from the prestige speech of the Court in the sixteenth
century and of the public schools in the nineteenth. In this phrase, ‘received’ is said to come
either from ‘received at court’ or from the now old-fashioned sense of ‘generally accepted’.
The term RP indicates that its prestige is the result of social factors, not linguistic
ones. RP is in no sense linguistically superior or inferior to other accents. However, it is the
accent which tends to be associated with the better-educated parts of society, and is the one
most often invoked as a norm for the description of British English, or for its teaching to
foreigners. It is connected to, or equated with, ‘The Queen’s English’.
The BBC originally adopted RP for its announcers because it was the form of
pronunciation most likely to be nationally understood, and to attract least regional criticism
- hence the association of RP with the phrase ‘BBC English’. RP was considered to apply,
at the beginning of the twentieth century (the term was first used by Daniel Jones in 1926), to
speakers of ‘Oxford’ or ‘BBC’ English and implied not only certain vowel and consonant
qualities, but also a noticeable upper class voice quality. According to Gillian Brown
(1990), the situation is different today. People whose vowel qualities differ slightly from
those favoured by Daniel Jones may be considered to speak with an RP accent if their
vowels are distributed like those of RP - if they pronounce a given vowel that is quite like
an RP vowel in the same set of words in which other RP speakers use this vowel.
About 5% of the Britons are said to speak RP, which is more or less the percentage
of the people that have attended public schools. The accent of a rather exclusive group,
although it serves as a model of educated British English throughout the world for foreign
learners of English, RP is by no means ‘pure’, homogeneous or stable. Katie Wales (1994)
identifies three varieties: ‘general’ or ‘mainstream’, used in teaching; ‘conservative’ RP
spoken by the older generation, and ‘advanced’ RP used by the young. The Queen and the
older royals will use features of ‘conservative’ RP: house is pronounced /hais/, off is
pronounced /ɔ:f/, the triphthong in powerless is monophthongised: /pɑ:lis/ (see
RP is a non-rhotic variety of English, unlike American or Scottish English. In a
rhotic variety, /r/ is pronounced after a vowel in word-final and pre-consonantal positions,
in words like tar, curt. English English is not predominantly non-rhotic, as one might think:
according to Crystal (1988), half of England, including parts of Northumberland, Lancashire,
most of the ‘west country’ and some areas of the south-west, is rhotic.
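
The rhotic/non-rhotic contrast described above can be sketched as a simple rewriting rule. The Python fragment below is only an illustration under stated assumptions: the function name, the vowel inventory and the 'rhotic' input transcriptions (e.g. /tɑ:r/) are hypothetical simplifications, not transcriptions of any particular accent, and phenomena such as linking r are ignored.

```python
# Illustrative sketch only: derive a non-rhotic form from a rhotic-style
# broad transcription by deleting /r/ wherever no vowel follows it.
# The vowel set and the input forms are simplifying assumptions.

VOWELS = set("iɪeæɑɒɔʊuʌəɜa")   # assumed vowel symbols (first letters of diphthongs included)
STRESS_MARKS = "ˈˌ"              # stress marks are skipped when looking ahead


def drop_nonprevocalic_r(rhotic: str) -> str:
    """Delete every /r/ that is not immediately followed by a vowel."""
    out = []
    n = len(rhotic)
    for i, ch in enumerate(rhotic):
        if ch == "r":
            j = i + 1
            # Look past any stress mark to find the next real segment.
            while j < n and rhotic[j] in STRESS_MARKS:
                j += 1
            # Word-final or pre-consonantal /r/ is dropped.
            if j == n or rhotic[j] not in VOWELS:
                continue
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for word in ["tɑ:r", "kɜ:rt", "red", "əˈraɪv"]:
        print(word, "->", drop_nonprevocalic_r(word))
```

In this sketch tar and curt lose their /r/, while red and arrive keep it, because in the latter the /r/ stands before a vowel.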
While RP may be regarded as the standard British accent, Standard British
English is the national variety of English accepted by educated speakers throughout Britain,
irrespective of the accents they may use. Therefore, a Scot may speak Standard British
English without using RP, a standard variety being allowed to show some variation in
pronunciation according to the part of the country where it is spoken.


Standard English is sometimes used as a cover term for all the national standard varieties
of English. These national standard varieties have differences in spelling, lexis, grammar,
and particularly pronunciation, but there is a common core of the language. This makes it
possible for educated native speakers of the various national standard varieties of English to
communicate with one another.
The possibility for people speaking national standard varieties of English to
understand each other satisfactorily has increased tremendously over the last 60 years or so.
Before the Second World War, the gap between British English and American English was
very wide. Since then the influence of television and of films has smoothed
the differences to a remarkable extent.
Speakers of Standard American English will probably have less difficulty getting
themselves understood by speakers of Standard Australian English than speakers of
different isolated dialects within the same country will have communicating with each other.
The present course will focus on RP, while also giving information about distinctive
features of varieties of English within the United Kingdom and elsewhere in the world.



Cockney. This accent has long been socially marked as the variety of English
spoken mainly by the working class of Greater London. To get rid of this accent was
considered desirable if one wanted to succeed professionally and socially, and Professor
Higgins’ assiduous attempts to teach Cockney-born Eliza Doolittle how to speak properly
in Shaw’s Pygmalion are well-known.
Cockney speakers are likely to pronounce bad as /bed/ or /beid/, flood as /flæid/,
pity as /piti:/, Katie as /kaiti:/ or /kæiti:/, home as /hæum/. Glottalization (see appendix) is
widely spread in Cockney, accompanying or even replacing such consonants as /p/, /k/ and
/t/ in sack, Gatwick, chapter. On the other hand, in positions where /p, t, k/ are subject to
aspiration (see appendix for aspiration), the phenomenon is more pronounced than in
RP. The sound /h/ is absent in he, hospital, horrible. While the contrast between /ð/ and /v/
is usually lost, weather being pronounced /wevə/, the one between /θ/ and /f/ is completely
lost: three is pronounced /fri:/. The Cockney equivalent of the RP dark l in postvocalic,
preconsonantal and syllable-final positions is realized as a vowel: milk is therefore
pronounced as /mi:wk/, table as /tæibw/. -Ing in raining, sleeping will be pronounced as
/in/: /ræinin/ and /sli:pin/.
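
Some of the consonant features listed above lend themselves to a rule-based sketch. The Python fragment below is a rough illustration, not a model of Cockney: the function name and the broad RP-style input transcriptions are assumptions made for the example, only three features are covered (the loss of /θ/ and /ð/, h-dropping, and -ing pronounced with /n/), and the /ð/ rule is applied without exception although the text describes that contrast as only 'usually' lost.

```python
# Illustrative sketch only: apply three Cockney consonant features
# to a simplified RP-style transcription. Vowel changes, glottalization
# and l-vocalization are deliberately left out.


def cockney_sketch(rp: str) -> str:
    """Return a rough Cockney-style variant of a broad RP transcription."""
    # /θ/ merges with /f/; /ð/ is treated here as always merging with /v/.
    s = rp.replace("θ", "f").replace("ð", "v")
    # /h/ is dropped wherever it occurs.
    s = s.replace("h", "")
    # Word-final -ing is pronounced with /n/ instead of /ŋ/.
    if s.endswith("ɪŋ"):
        s = s[:-1] + "n"
    return s


if __name__ == "__main__":
    for word in ["θri:", "weðə", "hi:", "sli:pɪŋ", "θɪŋk"]:
        print(word, "->", cockney_sketch(word))
```

The last example, think, shows that the /ŋ/ rule is restricted to final -ing: only the initial /θ/ changes.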

Estuary English. Estuary English is a variety of modified regional speech. It is a
mixture of non-regional and local south-eastern English pronunciation and intonation.
According to David Rosewarne (English Today, Oct.1994) it is a form of pronunciation
between Cockney, whose descendant it is, and RP. This accent is to be found in suburban
areas of Greater London and the counties of Essex and Kent lying to the north and south of
the Thames Estuary, which gives the accent its name.
Estuary English inherited from Cockney glottalization, vocalic l and the lengthening
of short i in final position. RP /i:/ becomes a diphthong /ei:/ in sea. Diphthongization of
RP /u:/ is also to be noted: blue is pronounced as /bleu:/. The /æ/ of Estuary English is less
open than that of RP and may be represented as a diphthong too: bad is pronounced /bæed/
or /bæid/. The RP diphthong in choice is replaced by a triphthong in Estuary English: /
The intonation is characterised by frequent prominence given to prepositions and
auxiliary verbs which are not normally stressed in RP, while the vocabulary of EE speakers
evinces a strong influence of American English.
Since the term was coined in 1984, Estuary English has spread very quickly, with MPs,
the Archbishop of Canterbury and some of the Royals using it either systematically or
occasionally. At the moment it is the strongest influence on RP. Though Rosewarne thinks
that Estuary English may eventually replace RP as the most influential accent in the
British Isles, it is hard to see it taking on an international role with anything like the current
prestige of RP.

The Northern Accents. The most noticeable differences from RP are to be noted in
the north, including the Midlands. The line between north and south is hard to draw with
any accuracy. Some trace it relatively low, just north of London. The phrase
‘south/north of Watford’ is used to refer to such a linguistic line (Watford is a small town
north of London). However, a more rigorous scrutiny of the way The Great Vowel Shift
affected various areas of England would indicate as a more adequate boundary between the
north and the south a line running from the river Ribble in the west to the river Humber in
the east.
Some of the counties of northern England are not far from Scotland, whose
influence is therefore noticeable. However, there are a lot of pronunciation features which
are typical only of northern English regions.
The vowel /ʌ/ does not usually occur in the northern accents (North-east, Central
north, Central Lancashire, Merseyside, Humberside, North-west Midlands, East Midlands,
West Midlands): blood, gloves, lovely are pronounced /blud/, /gluvz/, /luvli/. The RP /ʌ/ of
come is usually replaced by a rounded vowel in the area of /o/ in some northern areas. As
for RP /ɑ:/, it is replaced by /æ/ before the voiceless fricatives /f, θ, s/: dance, path, past are
pronounced /dæns, pæθ, pæst/ (as a matter of fact, only three of the thirteen English accents
mentioned above observe the RP /ɑ:/ in that position).
The closing diphthongs of crate, boat have a tendency to be turned into
monophthongs (/kre:t/, /bo:t/) or even opening diphthongs: /kriet, buot/. In some northern
areas (e.g. Newcastle), /l/ is clear in all positions, /ai/ is /ei/ in right, sight, and RP /au/
turns into /u:/: out is pronounced /u:t/.
Scottish English. This variety is either seen as a dialect of English or as a national
variant, the latter view being much in keeping with the latest tendencies to achieve greater
autonomy and self-determination for Scotland.
The short ‘quantum leap’ into the past in the lines below may shed some light on the
position and development of this important variety of English. Going as far back as the Old
English period (7th – 11th centuries), we will find Scotland as Celtic-speaking (the variety
known as Gaelic). At the end of this period, however, as a result of the Norman Conquest,
many English noblemen took refuge in the southern part of Scotland. The migration was
encouraged by several Scots kings, who granted land to the refugees in the new royal estates
known as burhs (e.g. Edinburgh). These areas were predominantly English-speaking, the
language of the new settlers gradually spreading through the whole lowlands region, while
Gaelic remained strong in the Highlands. Scottish English became more and more different
from the varieties spoken in England, mainly in lexis and pronunciation, the differences
being very marked today.
Contemporary Scottish English is still rhotic, preserving post-vocalic /r/, which is a
flap resembling Romanian /r/; as a result, /ɪə/, /eə/, /ʊə/, /ɜ:/ do not occur: peer, pear,
moor, hurt are pronounced /pir/, /per/, /mur/, /hʌrt/. The difference between long and short
vowels is not important in this variety of English, so there is no difference between cot and
caught, pull and pool. Initial /p/, /t/, /k/ are usually non-aspirated, while /l/ is clear in all
positions. A specific Scottish characteristic is the pronunciation of /θr/ as /ʃr/: through /ʃru:/.
The diphthong /au/ is usually replaced by /u:/ in house /hu:s/, or rather Scottish English has
preserved a feature which English lost during the Great Vowel Shift (i.e., through the
diphthongisation of /u:/). Similarly, the diphthong in RP road, note is replaced by a
monophthong: /ro:d/, /no:t/ (or, again, it is a feature which was preserved in Scottish
English and lost in English English). One more Scottish feature is the pronunciation of
father as /fæðer/.
Scottish English uses unstressed syllables with vowel qualities that are much more
distinct than in RP and therefore has noticeably longer unstressed vowels. This,
combined with the effect of the stressed vowels that are much shorter than in RP, leads to
the impression of Scottish English being a ‘syllable-timed’ variety (see stress and rhythm).

Northern Ireland English. This variety bears a considerable resemblance to
Scottish English, as a large number of settlers have come from Scotland to Northern Ireland
since the 17th century. Areas of the far north are more heavily influenced by Scottish
This is, like American English and Scottish English, a rhotic variety, post-vocalic
retroflex frictionless sonorant /r/ being used as in America. In all positions /l/ tends to be
clear, and intervocalic /t/ is usually a voiced flap, as in American English: city /sidi:/. Between
vowels /ð/ may be lost: mother is pronounced /mo:er/. In final position, RP diphthong /ei/ is
replaced in this variety by a long monophthong: say, may /se:, me:/. In preconsonant
position, /ei/ may be replaced by the diphthong /ie/: state /stiet/. On the other hand, the
diphthong /ei/, which is reduced to a monophthong in the examples above, is used instead
of the RP /i:/ in speak, weak. RP /e/ is replaced by a more open vowel sound close to /æ/ in
Derry and merry. In words like hop, off, the vowel sound is very close to RP /æ/: /hæp/,
As in Scottish English and in northern British varieties, the diphthong /əʊ/ is
monophthongised, words like boat, soul being pronounced /bo:t/ and /so:l/.


American English, General American. It is obvious that in a huge country like the
U.S.A. pronunciation cannot be homogeneous. Three main varieties have emerged, gaining
a prominent status: the Eastern type, the Southern type and General American. The last is
considered to be the pronunciation standard of the country. It is mainly spoken in the central
Atlantic states: New York, New Jersey, Wisconsin, and it is widely used in scientific,
cultural and business circles, on the radio and TV.
General American (GA) is a rhotic variety of English. In final word position, in
words like more, car, /r/ is consonantal and non-syllabic: /mor, kar/. After a vowel and
before a consonant, /r/ is vocalic and syllabic: bird may be rendered as /brd/. The RP
centring diphthongs /ɪə, eə, ʊə/ have as American counterparts /ir, er, ur/ in beer, bear,
boor /bir, ber, bur/, which is reminiscent of Scottish English. The final /r/ in these words is
retroflex and slightly lateralized, the air stream escaping along the sides of the tongue in a
channel formed between the back teeth and the tongue, like in the articulation of /l/.
One special feature of the vowels is their nasalization, when they occur after or
before nasal consonants (in words like now, smile). This nasalization is commonly called
American twang. The vowel /e/ is more open in GA than in RP, and final /i/ is realized as
/i:/ in pity, sunny, etc. The short vowel in hot, not is noticeably more open in GA, being
pronounced as /a/: /hat, nat/. In words like soap, home, the diphthong is pronounced as
/ou/. As in the northern British accents, words like path, dance, half are pronounced /pæθ,
dæns, hæf/. In all positions /l/ is dark, as in Russian. Intervocalic /t/ is usually voiced,
leading to the disappearance of the distinction between the pronunciation of words like
rider and writer. In words like twenty, bottle, /t/ is usually dropped altogether. Thus, the
distinction between winner and winter is also cancelled. The approximant /j/ is usually
omitted between a consonant and /u:/: suit, New York, stupid are pronounced /su:t, nu:
jork, stu:pid/.
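
Two of the consonant features just described, the voicing of intervocalic /t/ and the omission of /j/ before /u:/, can be sketched as rewriting rules. The Python fragment below is only an illustration: the function names, the vowel inventory and the input transcriptions are assumptions made for the example. In particular, the sketch restricts /j/-dropping to a following /t, d, n, s/ (the consonants found in the examples above, plus /d/), which is an assumption of this example rather than a statement from the text; words such as few keep their /j/ in General American.

```python
# Illustrative sketch only: two General American consonant features
# applied to simplified broad transcriptions.

VOWELS = set("iɪeæɑɒɔʊuʌəɜa")     # assumed vowel symbols
YOD_DROP_AFTER = set("tdns")       # assumption: consonants after which /j/ is dropped


def voice_intervocalic_t(s: str) -> str:
    """Replace /t/ with /d/ when it stands between two vowels."""
    chars = list(s)
    for i in range(1, len(chars) - 1):
        # A length mark ':' counts as part of the preceding vowel.
        prev_is_vowel = chars[i - 1] in VOWELS or chars[i - 1] == ":"
        if chars[i] == "t" and prev_is_vowel and chars[i + 1] in VOWELS:
            chars[i] = "d"
    return "".join(chars)


def drop_yod(s: str) -> str:
    """Drop /j/ between one of the assumed consonants and /u:/."""
    out = []
    for i, ch in enumerate(s):
        if (ch == "j" and i > 0 and s[i - 1] in YOD_DROP_AFTER
                and s.startswith("u:", i + 1)):
            continue
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # writer and rider come out identical once /t/ is voiced.
    print(voice_intervocalic_t("raɪtər"), voice_intervocalic_t("raɪdər"))
    for word in ["sju:t", "nju:", "stju:pɪd", "fju:"]:
        print(word, "->", drop_yod(word))
```

The first line of output illustrates the merger of writer and rider mentioned above; the remaining lines show suit, new and stupid losing /j/ while few retains it.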

Canadian English. For anyone who is at all knowledgeable about Canada’s
position, history and economic relations, it is no wonder that this variety combines features
of both American and British English. One should not be surprised, for example, to see a
garage displaying signs with the word “tire” spelled the American way and “centre” spelled
the British way.
Unlike American varieties, Canadian English has a long vowel in been, and words
like missile, hostile rhyme with Nile. The prefixes anti-, multi-, semi- are pronounced with
/i:/, unlike in American English, where the pronunciation is with /ai/.
On the other hand, a lot of features are shared with the American accents: the
sounding of /r/ in all positions, the voicing of /t/ in intervocalic position and its frequent
dropping in words like Toronto (often pronounced /’tranou/).
One of the most distinctive features of Canadian English is the so-called Canadian
Raising – the RP diphthongs /au/ and /ai/ are replaced by /əu/ and /oi/ in cloud and fight.

Australian English. According to Chitoran (1978), two main tendencies can be
found in this variety, as far as the vowel system is concerned:
a) the diphthongization of all monophthongs, mainly by adding a glide reaching towards
b) the fronting of back vowels and the closing of open vowels.
Apart from these tendencies, RP diphthongs /ei/ and /ai/ are replaced by /ai/ and /oi/: day is
pronounced /dai/ (a feature inherited from Cockney) and height is pronounced /hoit/ (see
Canadian English above).
In terms of intonation, a high rising tone (HRT) is probably the most noticeable
trait. What is special about it is the fact that in Australian English HRT occurs at the end of
a declarative utterance, not necessarily at the end of a yes/no question. Guy
and Vonwiller (quoted in David Graddol et al. 1996) argue that this intonation peculiarity is
used in spoken texts that are complex in structure when the speaker wants to monitor the
listener. In order to check whether the listener is following what s/he is saying (in spoken
narrative texts and descriptions), the speaker resorts to HRT. In a conversational setting,
this tone may also have what the above-quoted authors call an ‘interactional’ meaning. A
speaker is likely to use HRT to indicate to his/her interlocutor that s/he is not ready to give
up his/her turn in the conversation, but that s/he is continuing with what s/he is saying. This
special use is one illustration of the relevance of intonational features to conversation
analysis (see glossary).


In the production of speech sounds we make use of what are usually called the
speech organs. It goes without saying that the basic function of these organs is not speech,
but the physiological functions of breathing, eating and drinking. Parts of two organ
systems, the respiratory system and the upper part of the food tract are in charge of speech
sound production. Particularly closely linked are speech production and breathing, so much
so that the former is considered to be a supplementary activity to the latter, the whole
respiration apparatus being active in speaking.
The speech tract has three distinct sections according to the role these parts play in
speech production: I. the lungs, II. the larynx, and III. the resonating cavities.
I. The lungs supply the source of energy needed for the vocal activity. They are
situated within the thoracic cavity.
The thorax consists of the barrel-shaped rib structure which forms the sides of the
thoracic cage, of the associated muscles, and of the lung structure contained within it. There
are twelve paired ribs, attached at the back to the vertebral column (the backbone), and at
the front to the sternum (the breast-bone). The upper limit of the thoracic cage is formed at
the back by the scapulae (the shoulder blades) and by the clavicles (the collar bones) at the
front. The floor of the cage is made up of the diaphragm muscle.
By the raising or lowering of the diaphragm, which forms the base of the chest cavity,
and by the contraction of the
intercostal muscles, the lungs are expanded or contracted, taking in air and then expelling
it. It is while we breathe out that the air-stream necessary for the production of speech
sounds is initiated. The rate of respiration ranges from 10 to 20 cycles (inspiration-
expiration) per minute.
Our speech activity is largely determined by the physiological constraints imposed
by the limited capacity of our lungs and by the muscles controlling their action. We have to
pause in articulation to be able to refill our lungs with air and the number of energetic peaks
of exhalation which we make will determine the length of any breath group. Syllabic pulses
and dynamic stress, both typical of English, are directly linked to the activity of the muscles
activating the lungs.
II. The trachea (or windpipe) carries the air-stream from the lungs to the larynx, a
cartilage and muscle casing situated in the neck.


It is made up of the thyroid cartilage and the cricoid cartilage attached to the top of the
trachea. The front of the larynx comes to a point that is situated at the front of one’s neck,
prominent in males, commonly called the Adam’s apple. The glottis is the opening between
the two lip-like folds of ligament and muscle called the vocal cords. The glottis is opened
when the vocal cords are brought apart and closed when they are brought together. These
movements are controlled by the arytenoid cartilages. This sounds fairly simple, but
complex changes in the position of the vocal cords are involved in breathing, whispering
and normal speaking.
Acting as a valve, the vocal cords prevent the admission of foreign bodies into the
trachea; they can also obstruct the passage of the air-stream, enclosing it within the lungs in
order to assist in muscular effort on the part of the abdomen or of the arms. Apart from that,
they have such a decisive role in the production of speech sounds that they are said to be the
most important organ of the speech mechanism.

The inside of the larynx seen from above

In connection with breathing and speaking, the vocal cords (or vocal folds) can assume four
basic positions:
a) They may be kept wide apart for normal breathing and while voiceless consonants like
/p/, /f/, /s/ are produced.
b) They are brought close to each other so that there is a narrow passage between them for
the air-stream to escape. The result is a voiceless glottal fricative sound, /h/, not very
different from a whispered vowel. When laughing, our vocal cords successively assume
positions b) and c) (hahahaha).
c) When the edges of the vocal cords touch each other, air passing through the glottis will
cause vibration. This vibration occurs in the articulation of the vowels and of the voiced
consonants, as opposed to the articulation of the voiceless consonants, which is not
accompanied by vocal-cord vibration. The vibratory movement is not at all like the
vibration of the string of a musical instrument, nor are the vocal cords at all similar to
such strings. What actually happens is that air is pressed
from the lungs and pushes the vocal cords slightly apart, so that some air escapes. As the
air flows quickly past the edges of the vocal cords, the cords are made to vibrate, tending to
return to their closed position on the one hand, being pushed open on the other. This
opening and closing happens very rapidly, around two hundred times per second or more.
The higher the frequency of vibration of the vocal cords is, the higher the pitch of voice.
d) The vocal cords can be tightly pressed together so that air is prevented from escaping
through the glottis. As a result of the compression of the air-stream behind this closure, a
glottal stop or glottal plosive is produced, for which the symbol /ʔ/ is used. When coughing,
we usually produce a succession of glottal stops.
The glottal stop often occurs in English when it reinforces or even replaces the
voiceless stops /p/, /t/, /k/. Glottalisation of this latter kind is particularly marked with the
younger generations in contemporary British English.
A glottal stop also precedes the energetic articulation of a vowel: /æ/ in /ri’ækt/ is
usually pronounced /ri’ʔækt/. Foreign learners tend to insert a glottal stop before each
word beginning with a vowel: /ʔit ʔiz nau/, a tendency to be resisted.

The four basic positions of the glottis

III. The resonating cavities are located above the larynx, which is why they are also called the
supraglottal cavities. Having passed through the larynx, the air goes through what is also
called the vocal tract, which ends at the mouth and nostrils. These cavities of the vocal tract
function as the resonators of the laryngeal tone produced by the vocal cords. This tone is
subject to changes produced as a result of the shape assumed by the supraglottal cavities of
the pharynx and mouth and of the possible role of the nasal cavity.
The pharynx is a tube that extends from the top of the larynx and oesophagus to the
region at the back of the soft palate. At its upper end it splits in two, one part being the back
of the mouth and the other being the beginning of the way through the nasal cavity. The
shape and volume of this resonator may be considerably altered by the constrictive action of
the muscles enclosing the pharynx, by the movements of the back of the tongue, by the up
or down movements of the soft palate and by the raising of the larynx itself. It is a
characteristic of certain varieties of English pronunciation to have certain vowels articulated
with a strong pharyngeal contraction.
On its way out, the air-stream may be released either through the mouth only (the
soft palate is raised, thus blocking the passage of the air towards the nasal cavity), or
through the nose, after some obstruction has been made in the oral cavity (the soft palate is
lowered, allowing the air to go through the nasal cavity), or through both the nose and
mouth, when the intermediate position of the soft palate enables the air-stream to go
through the two cavities at the same time. A ‘purely’ oral escape is associated with most
English speech sounds, while the initial consonant sounds in my, near, and the final one in
sing are released nasally.
Although it does not normally appear in ‘educated’ British English, nasalisation of
certain vowels occurs in certain non-RP varieties of English (in the Birmingham area, for
instance) when the air escapes both through the mouth and through the nose. This
phenomenon is particularly strong in French in the vowels of such phrases as un bon vin
blanc. It has been seen that the mobile soft palate acts as a sort of traffic agent, regulating
the circulation of the air through the oral and/or nasal cavities.
The mouth cavity has traditionally been given a lot of attention by phoneticians not
only because it is the most easily observed section of the vocal tract but also because it is
highly instrumental in speech production and modification. The shape of the mouth and the
position of the different articulators within it determine the quality of most of the speech
sounds. Far more finely controlled movements are possible in the oral cavity than in any
other segment of the speech mechanism.
The boundaries of the mouth which are relatively fixed are the teeth, the hard palate
and the pharyngeal wall. The other parts are mobile: the lips, the tongue, the soft palate with
its appendix, the uvula. The lower jaw is also very mobile. Its movement modifies the
distance between the upper and lower teeth and the posture of the lips. The mobile parts
found in the oral cavity are usually called articulators, while the fixed ones are the places of
articulation. It is difficult to classify the lower jaw in these terms, as it is neither a place of
articulation nor an articulator proper.
For purposes of description the upper boundary of the mouth is traditionally divided
into the velum (or soft palate, the mobile part of the palate at the back of the mouth), the
hard palate (‘the roof of the mouth’ proper) and the alveolar ridge.
The ‘traffic-lights’ job of the velum regulating the passage of the air-flow through
the mouth and nose has already been briefly discussed. Another important thing about the
soft palate is that, an articulator itself, it can be touched by another articulator, the tongue,
when /k/ or /g/ are produced. Such sounds are called velar.
The hard palate is the fixed, bony arch, the roof of the mouth to which the soft
palate is attached at its back-of-the-mouth end. Sounds whose place of articulation is the
hard palate are called palatal.
The alveolar ridge is between the top front teeth and the hard palate. The sounds
articulated with the front of the tongue against the alveolar ridge are called alveolar.
The teeth (upper and lower) are shown in diagrams immediately behind the lips. The
tongue touches the upper teeth for the sounds which are called dental.
The lips may affect the shape of the oral cavity by the position they assume, and they
can also determine the place of articulation. The lower lip gets close to the upper teeth in
the articulation of the fricatives /f, v/. These sounds have labio-dental articulation. If the
lips are tightly shut they may form a momentary obstruction which, when released,
produces plosive sounds (the bilabial consonants /p/, /b/); also, if the closure is maintained,
the air may be redirected by the lowering of the soft palate through the nose for the
articulation of the nasal consonant /m/. If the lips are held open, they may assume five basic
positions in English:
1. the spread lip position - the lips remain sufficiently far apart for no friction to be heard,
yet remaining fairly close together and energetically spread, as in the articulation of the long
vowel in see;
2. the neutral position - the lips are held in a relaxed position with a medium lowering of the
lower jaw, as in the articulation of /e/ in /get/;
3. the open position - the lips are held relatively wide apart, without any marked rounding,
as for the vowel in far;
4. the close rounded position - the lips are tightly pursed, so that the aperture is small and
rounded; although this position occasionally exists in English (in blue, for example), it
occurs much more frequently in French and Romanian, so that speakers of these languages
are tempted to assume this position for articulations which would not normally involve too
much lip-rounding in English;
5. the open rounded position - the lips are held wide apart, but with slight projection and
rounding, as in the vowel of cop.
Naturally, there will be intermediate lip positions as well. Thus, one can say that for the
vowel in book there is a position between close rounded and open rounded.
The tongue is undoubtedly the most important articulator; it may assume a wide
range of positions and shapes in the articulation of vowels and consonants alike. It acts as
an articulator in the production of consonant sounds and as a modifier of the shape of the
mouth for the production of the vowel sounds.
Although the tongue does not show obvious distinct parts, for the sake of phonetic
descriptions it is usually divided into the tip, blade, front, back and root. The tip and blade
of the tongue form together the apex, and the consonants produced with the apex placed
against a place of articulation are called apical. Thus, /ð/ is an apico-dental consonant,
apical describing the part of the articulator involved in its production, dental showing the
place of articulation. Part of the middle section of the tongue is rather misleadingly called
the front. It is that segment which lies opposite the hard palate when the tongue is at rest.
Many sounds are articulated with this part of the tongue. Next, in between the front and the
root comes the back of the tongue; /k/ and /g/ are called dorso-velar because they are
articulated with the back of the tongue against the velum, or soft palate. Apart from these,
the edges of the tongue are called its rims.



We have briefly surveyed the modifications which are made to the original air-stream by the
speech mechanism extending from the lungs to the oral and nasal cavity. In order to
describe an English speech sound, some basic information is required on:
1. the action of the vocal cords, i.e. whether the glottis is closed, wide apart, or vibrating;
2. the position of the soft palate, according to which sounds will be oral, nasal, or nasalised;
3. the disposition of the movable mouth organs, i.e. the shape of the lips and tongue, as well
as the manner and place of articulation of the consonant sounds.
Thus, English /d/ will be described as a voiced (the vocal cords vibrating during its
production), oral (released through the mouth, with the soft palate raised), apico-alveolar
(the articulator being the apex of the tongue, the alveolar ridge being the place of
articulation), plosive consonant. These criteria of description and classification of speech
sounds will become more detailed when we deal with more specific classes, such as vowels
and consonants.
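The three kinds of information listed above can be pictured as a small record with one field per criterion. The following is only an illustrative sketch in Python; the class name `Consonant` and its field names are assumptions made for this example, not an established notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    """Hypothetical record holding the descriptive criteria for one consonant."""
    symbol: str
    voiced: bool       # criterion 1: do the vocal cords vibrate?
    nasal: bool        # criterion 2: soft palate lowered (nasal) or raised (oral)
    articulator: str   # criterion 3: active articulator, e.g. "apico", "dorso"
    place: str         # criterion 3: place of articulation
    manner: str        # criterion 3: manner of articulation

    def describe(self) -> str:
        voicing = "voiced" if self.voiced else "voiceless"
        resonance = "nasal" if self.nasal else "oral"
        return (f"/{self.symbol}/: {voicing}, {resonance}, "
                f"{self.articulator}-{self.place} {self.manner}")


# The description of English /d/ given in the text.
d = Consonant("d", voiced=True, nasal=False,
              articulator="apico", place="alveolar", manner="plosive")
print(d.describe())
```

The same record, filled in with other values, would describe /k/ as a voiceless, oral, dorso-velar plosive.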


Etymologically, a vowel is a voiced sound, while a consonant is a sound that goes
with another sound to form a syllable (Chitoran, 1978). Daniel Jones (1957) gives the
following definition: ‘A vowel is a voiced sound in forming which the air issues in a
continuous stream through the pharynx and mouth, there being no obstruction and no
narrowing such as would cause audible friction. All other sounds are called consonants.’
Baudouin de Courtenay discovered a physiological distinction between vowels and
consonants; according to him, in consonant articulation, the muscular tension is
concentrated at one point, which is the place of articulation; in vowel articulation, the
muscular tension is spread over all the speech organs.
We can say that the most important difference between vowels and consonants does
not lie in the way they are made, but in their different distributions. Difficulties arise, for
example, when one has to classify sounds such as /j/ , /w/ or /r/. They are articulated very
much like vowels but function as consonants in English (incidentally, a sound similar to
English /r/ functions as a vowel in Chinese).
The American linguist Pike (1943) therefore suggested that the terms vowel and
consonant should be confined to phonology, where the distinction is based on the linguistic
function of sounds, and that new terms should be provided for an articulatory classification.
The terms he came up with were contoid and vocoid. Thus, for example, the above-
mentioned /w/ would be described as a consonant in phonological (functional) terms and as
a vocoid in phonetic (articulatory) terms.
Sounds of the consonant type will be classified, resuming some of the criteria
previously mentioned, as voiced or voiceless, oral or nasal. As far as place of articulation is
concerned, they are bilabial (articulated with the lips pressed together), labio-dental (lower
lip against upper teeth), apico-dental (apex against the teeth), apico-alveolar (apex against
the alveolar ridge), post-alveolar, palatal, velar and glottal. In terms of manner of
articulation, consonants are classified according to the type of obstruction met by the air-
stream in their production and the way the air-stream is released. If the air is released
suddenly, it produces plosives or stops. If released slowly, affricates are produced.
Fricatives are produced as a result of a narrowing of the air passage that causes friction.
When a partial closure is produced and the air is released laterally, a lateral sound is
produced. Certain sounds, the ‘double agents’ mentioned a little earlier (/w/, /r/, /j/), will
be classed as approximants; in terms of production, one can say that the articulator and the
place of articulation only ‘approximate’ a closure, without fully achieving it (/w/ and /j/
used to be traditionally classified as semi-vowels). On the other hand, in their production,
the tongue may be said to act more as a modifier of the shape of the oral cavity than as an
articulator proper (mention should be made that we are concerned here with the English /r/,
not any kind of /r/). Finally, consonantal sounds are classified according to the force of
articulation. Some of the speech sounds are produced with an attendant tenseness of the
speech organs, while the others are not. The former are called fortis, the latter lenis.
Sounds of the vowel type will be described according to:
- the part of the tongue which is raised and the degree of raising;
- the position of the lips: the degrees of spreading and/or rounding mentioned previously;
- the duration and degree of muscular tension; vowel sounds are not so much long or short
as tense and lax, the muscles of the tongue being tense in the articulation of the so-called
long vowels and lax in the articulation of the short ones;
- the constancy of articulation; we distinguish between monophthongs or simple vowels,
which remain relatively constant, and diphthongs which imply a distinct change; the
situation is not as simple as that, though, as long vowels have a tendency towards
diphthongisation with some speakers, while diphthongs tend to be monophthongised in
rapid speech.


Writing, with its discrete units, the letters, gives us the misleading impression that,
when speaking, we produce a series of distinct, isolated sounds. Actually we produce a
continuum, a stream of sound. In studying speech, we divide this stream into segments.
Each of these segments can be articulated in slightly different ways. Not only are there
differences between the realization of the ‘same’ sound by different speakers; the same
person is likely to pronounce the ‘same’ sound slightly differently each time he produces it.
The range of sounds we produce is practically unlimited. Yet we say that, for example,
there are 24 consonant phonemes and 20 vowel phonemes in the sound system of English.
To arrive at the total phonemic inventory of a language one does not have to pay
attention to all the possible differences among the practically unlimited number of sounds
one can articulate. It is sufficient to take into account those variations between different
sounds that are linguistically relevant, that are associated with changes of meaning in a
particular language.
If we ask for /bære/ instead of /bere/ in a Romanian pub, we are bound to be
understood, which means that the bartender takes /æ/ for a realization of the phoneme /e/.
The pronunciation /ðor/ will probably be ‘deciphered’ as /zor/. We say that /æ/ and /ð/ are
not phonemes in Romanian: if we hear these sounds in our language, we are bound to
ascribe them, as special realisations, to the phonemes /e/ and /z/. Yet in English the sounds
/æ/ and /e/, /ð/ and /z/ are distinct phonemes, since the substitution of /æ/ for /e/ and of /ð/
for /z/ may lead to different words: bad - bed, breathe - breeze. The pairs of words which are told
apart by one phoneme only are called minimal pairs: part - cart, tip - tin, sin - thin, and the
operation through which we find out the differing phonemes is called the commutation test.
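The commutation test lends itself to a mechanical check. The sketch below, in Python, treats each word as a list of phoneme symbols (so that a long vowel counts as one segment) and reports whether two words differ in exactly one position. The function names and the transcriptions chosen are assumptions made for this illustration.

```python
def differing_positions(a, b):
    """Return the positions at which two phoneme sequences differ,
    or None when the sequences have different lengths."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(a, b):
    """True if the two words are told apart by one phoneme only."""
    diffs = differing_positions(a, b)
    return diffs is not None and len(diffs) == 1


bad = ["b", "æ", "d"]
bed = ["b", "e", "d"]
sin = ["s", "ɪ", "n"]
thin = ["θ", "ɪ", "n"]

print(is_minimal_pair(bad, bed))    # the words differ only in the vowel
print(is_minimal_pair(sin, thin))   # the words differ only in the first consonant
print(is_minimal_pair(bad, thin))   # all three segments differ
```

A pair passes the check only when a single substitution is enough to turn one word into the other, which is what the commutation test looks for.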
The final /l/ sound in wheel is definitely different from the initial /l/ sound in line,
yet we perceive the two sounds as variants of the same phoneme. We say that the ‘dark l’ in
wheel and the ‘clear l’ in line are the allophones of a class, the phoneme /l/. Therefore, a
phoneme is not an actual sound, but a class of phonetically similar (not identical) sounds,
called its allophones. A phoneme has the function of telling words apart. As for its
allophones, or variants, they may be due to free variation or to complementary distribution.
Free variation refers to the slight changes in the production of the ‘same’ sound in a
certain position by the same speaker or by different speakers. It has already been said that
we can’t produce exactly the same sound twice, and other speakers will also vary in their
realizations, although our articulations may be considered to be the same speech sound in a
certain language: ðare and zare as extreme cases of free variation are perceived as
‘identical’ in the phonological system of our language, because they do not affect meaning.
In wheel and line, the allophones of /l/ are said to be in complementary distribution:
a ‘dark l’ will not be found in the position of ‘clear l’, they are complementary in their
distribution in words. So are the aspirated allophone of a voiceless stop in initial position (/t/ in time)
and the almost unreleased allophone in final position (/p/ at the end of cup; here, the unreleased /p/ may even be replaced by a glottal stop).
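Complementary distribution can be stated as a simple condition: no environment is shared by two different allophones. The Python sketch below checks that condition on a list of observations. The environment labels ("before a vowel", "word-final" and so on) are simplifying assumptions introduced for the example, not a full statement of the distribution of English /l/.

```python
def in_complementary_distribution(observations):
    """observations: iterable of (allophone, environment) pairs.
    Returns True if no environment occurs with two different allophones."""
    seen = {}
    for allophone, environment in observations:
        # setdefault stores the first allophone seen in this environment
        # and returns whichever allophone is already stored there.
        if seen.setdefault(environment, allophone) != allophone:
            return False
    return True


# Illustrative observations for /l/: 'clear l' as in line, 'dark l' as in wheel.
l_observations = [
    ("clear l", "before a vowel"),
    ("dark l", "word-final"),
    ("dark l", "before a consonant"),
    ("clear l", "before a vowel"),
]

# Two variants in the same environment: free variation, not complementary distribution.
free_variation = [
    ("released t", "word-final"),
    ("unreleased t", "word-final"),
]

print(in_complementary_distribution(l_observations))
print(in_complementary_distribution(free_variation))
```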
We can now complete the definition of the phoneme, saying that it is an abstraction,
the minimal unit of the phonological system of a language. It is a class of phonetically
similar sounds, found in complementary distribution or in free variation. A phoneme tells
words apart in a certain language.
In rendering pronunciation either phonemic or phonetic transcription can be used.
In a phonemic transcription (please note that in current usage you will hear and read
about phonetic transcription when the meaning is phonemic transcription), only the
phonemes are given symbols. The pronunciation of cat will be rendered as /kæt/ in
phonemic transcription, no specific information being given about its allophones.
Different degrees of allophonic detail can be introduced in phonetic transcription.
Thus, the above-mentioned cat will appear as [kʰæt]. This more detailed transcription
shows that the allophone of /k/ in this word is an aspirated allophone, [kʰ]. Note that in
phonemic or broad transcription we use oblique brackets, or slants, / /, while square
brackets, [ ], are employed in phonetic (or narrow) transcription. In narrow (that is,
detailed) phonetic transcription one can use different small symbols and
diacritic signs to show aspiration, palatalization, devoicing, nasalization, etc. In
dictionaries, phonemic transcription is commonly used, although we sometimes improperly
call it phonetic. We sometimes use a combination of phonemic transcription, writing the
phonemes of a word, and phonetic, adding information about one single allophone, as in the
above-mentioned example, [kʰæt].


In order to describe vowels in articulatory and auditory terms, Daniel Jones devised a
standard reference system, called the Cardinal Vowel Scale, consisting in setting up a
number of vowel qualities.
When we learn the cardinal vowels, we do not learn to articulate actual English or
Romanian sounds, but we do learn about the range of vowels that the human speech
mechanism can make.
The system is based on physiological criteria: the eight primary cardinal vowels
were arrived at by taking into account how high or how low the tongue can get in the
articulation of vowel sounds, as well as which part of the tongue is raised or lowered.
The primary cardinal vowels are 1, [i]; 2, [e]; 3, [ε]; 4, [a]; 5, [ɑ]; 6, [ɔ]; 7, [o]; 8,
[u]. It has become traditional to locate these abstractions of vowel quality on a four-sided
figure, a chart that approximates the oral cavity of a person seen in profile (looking
to the left).

              Front     Central     Back
Close           1                     8

Half-close      2                     7

Half-open       3                     6

Open            4                     5

The Cardinal Vowel Chart

Cardinal vowel 1 shows an ideal position in which the front of the tongue is raised as much
as possible, while cardinal vowel 5 would be articulated with the extreme back of the
tongue lowered as much as possible in the mouth. Cardinal vowel 1 is the most close and
front, whereas number 5 is the most open and back. In between open and close, we get the
intermediate positions half-close - [e] and [o], and half-open - [ε] and [ɔ].
Once again, we must remember that these constructs are not actual sounds, but
extremes of vowel quality that provide us with a way of describing, classifying and
comparing vowels. For example, we can say that the English vowel [æ] is not as open as
cardinal vowel 4, [a], and more open than cardinal vowel 3, [ε].
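The way the cardinal vowels serve as reference points can be sketched as a small lookup table. In the Python example below, the numeric openness scale (0 for close up to 3 for open) and the value 2.5 given to English /æ/ are assumptions chosen only to illustrate the comparison; they are not measurements.

```python
# Primary cardinal vowels: number -> (symbol, openness, position).
# Openness scale (an assumption of this sketch): 0 = close ... 3 = open.
CARDINAL_VOWELS = {
    1: ("i", 0, "front"),
    2: ("e", 1, "front"),
    3: ("ɛ", 2, "front"),
    4: ("a", 3, "front"),
    5: ("ɑ", 3, "back"),
    6: ("ɔ", 2, "back"),
    7: ("o", 1, "back"),
    8: ("u", 0, "back"),
}
HEIGHT_LABELS = ["close", "half-close", "half-open", "open"]


def describe(number):
    """Describe a cardinal vowel by its height and the part of the tongue involved."""
    symbol, openness, position = CARDINAL_VOWELS[number]
    return f"cardinal vowel {number} [{symbol}]: {HEIGHT_LABELS[openness]} {position}"


def lies_between(openness, cv_a, cv_b):
    """True if an openness value falls strictly between two cardinal vowels."""
    low, high = sorted((CARDINAL_VOWELS[cv_a][1], CARDINAL_VOWELS[cv_b][1]))
    return low < openness < high


AE_OPENNESS = 2.5  # illustrative value for English /æ/, not a measurement

print(describe(1))
print(describe(5))
print(lies_between(AE_OPENNESS, 3, 4))
```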
Besides the variables of vowel quality based on which part of the tongue is raised
and the degree of raising, another important factor in the description of vowels is lip-
rounding. We are going to consider three basic positions:
a) rounded lips - for the pronunciation of primary cardinal vowel [u];
b) spread lips - in the articulation of primary cardinal vowel [i];
c) the neutral position of the lips assumed to pronounce [e].
Now, using these factors, we can proceed to describe the English vowel phonemes,
placing them against the background of the Cardinal Vowel Chart.

THE ENGLISH SHORT VOWELS: /ɪ/, /e/, /æ/, /ʌ/, /ɒ/, /ʊ/

Short vowels are only relatively short; as we shall see later, sometimes what we call a
‘short’ vowel can have the same length as a long vowel, or it can even be longer. We may
say that English vowels are roughly divided into short and long, but in certain contexts
short vowels are lengthened and long ones are shortened, the difference between them being
not so much one of length as of tenseness or laxness, concepts introduced a little earlier.

ɪ   e   ʊ

æ   ʌ   ɒ

The English short vowels described in relation to the primary cardinal vowels

/ɪ/ in ‘pit’, ‘sin’, ‘fish’.
Although this vowel is in the close front area, compared with cardinal vowel 1 it is more
open and nearer to the centre. Romanian speakers should tend towards an articulation
approaching /e/ rather than /i/. We should remember that this sound is short and that we, as
well as speakers of other Romance languages, feel like lengthening it (pronouncing ‘eat’
instead of ‘it’). However, we should not go to the opposite extreme and make it too
short and the final voiced consonant too marked in ‘big’, for example. In unstressed
positions, /ɪ/ is often replaced by /ə/: /rə'si:v/ instead of /rɪ'si:v/.

/e/ in ‘bed’, ‘set’, ‘measure’.
This is a front vowel between half-close and half-open, that is, between cardinal vowels 2 and 3.
The lips are slightly spread. Out of hypercorrectness, Romanians may have the tendency to ‘anglicise’ it,
pronouncing it /æ/. This leads to confusing /bed/ and /bæd/, for example. We can safely say
that the English /e/ is very similar to the Romanian /e/. Again, this short vowel, like the
preceding one, should not be too short and the final consonant too marked in ‘get’, for
example.
/æ/ in ‘bat’, ‘reveille’, ‘plait’.
This vowel is front, less open than cardinal vowel 4, more open than number 3. The lips are
slightly spread. Since it does not exist in Romanian, we might reduce it to the neighbouring
vowels /e/ or /a/. The English /æ/, as the chart and the very symbol show, is in between /e/
and /a/. Do not produce it as a diphthong going from /e/ to /a/, either. When followed by a
voiced consonant, it is as long as any of the long vowels and it also involves a considerable
degree of tenseness: ‘man’, ‘cab’, ‘Sam’.

/ʌ/ in ‘upper’, ‘but’, ‘young’, ‘rush’.
This is a central vowel, and the diagram shows that it is more open than the half-open
tongue height. The lip position is neutral. In northern regional speech of the York area, a
half-close back vowel is used, a sort of /u/.

/ɒ/ in ‘pot’, ‘gone’, ‘cross’.
This vowel is not quite fully back, between half-open and open in tongue height. It is
noticeably more open than the Romanian [o] and the lip rounding is less marked.

/ʊ/ in ‘put’, ‘push’, ‘pull’.
The nearest cardinal vowel is number 8, [u], but it can be seen that /ʊ/ is more open and
nearer to central. Romanian learners are advised to aim at a sound
closer to /ă/ and /î/ than to long and fully rounded /u/. Practise with ‘good’, ‘put’.

/ə/ in ‘perhaps’, ‘about’, ‘supper’ is a central short vowel, the most frequent vowel
sound in English. Since it is different from the other vowels in several important ways, it
will be dealt with in another section of this course (the structure of weak syllables and weak
form words).

To sum up and add to the considerations on the relative length of vowels, we can
say that this length depends on the position of the vowel in a word or phrase. Unstressed /ɪ/
and /ə/ are usually very short when they occur immediately before a stressed syllable in
‘begin’, ‘eleven’, ‘today’. Apart from this, the ‘short’ vowels are not extremely short. It is a
common mistake to make the vowel too short and the consonant too long in words like
‘that’, ‘not’, ‘back’, ‘yes’, ‘off’.


Variations of length are also noticed in the long vowels and diphthongs, as will be shown
below.
a) Long vowels and diphthongs are shorter when unstressed than when stressed:
/i:/ is shorter in 'concrete than in dis'creet
/ɔ:/ is shorter in 'record than in re'cord
/au/ is shorter in 'Cracow than in 'how
b) They are shorter in stressed syllables immediately followed by unstressed syllables than
in those which are not so followed:
/ɑ:/ is shorter in father than in far
/u:/ is shorter in 'do it than in final do.
c) They are shorter before voiceless consonants than before voiced ones:
/i:/ is shorter in seat than in seed
/ɔ:/ is shorter in court than in cord
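Rules a) to c) can be read as three independent shortening conditions. The Python sketch below lists which of them apply to a long vowel in a given context. The function name and its three boolean parameters are assumptions introduced for the illustration, and the sketch says nothing about how much shortening each condition produces.

```python
def shortening_factors(stressed, followed_by_unstressed, before_voiceless):
    """Return the rules (a, b, c) that shorten a long vowel in the given context."""
    factors = []
    if not stressed:
        factors.append("a) unstressed syllable")
    if stressed and followed_by_unstressed:
        factors.append("b) stressed syllable followed by an unstressed one")
    if before_voiceless:
        factors.append("c) before a voiceless consonant")
    return factors


# 'seat' versus 'seed': only the voicing of the final consonant differs.
seat = shortening_factors(stressed=True, followed_by_unstressed=False, before_voiceless=True)
seed = shortening_factors(stressed=True, followed_by_unstressed=False, before_voiceless=False)

# 'father' versus 'far': the stressed vowel of 'father' is followed by an unstressed syllable.
father = shortening_factors(stressed=True, followed_by_unstressed=True, before_voiceless=False)
far = shortening_factors(stressed=True, followed_by_unstressed=False, before_voiceless=False)

print(seat)
print(seed)
print(father)
print(far)
```

An empty list corresponds to the context in which the vowel keeps its full length, as in seed or far.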
i:        u:

ɜ:        ɔ:

ɑ:

The English long vowels on the Cardinal Vowel Chart

THE ENGLISH LONG VOWELS: /i:/, /ɜ:/, /ɑ:/, /ɔ:/, /u:/
/i:/ in ‘beat’, ‘mean’, ‘peace’, ‘police’, ‘see’.
This vowel is nearer to cardinal vowel 1 than the short vowel of ‘fit’, already described. The
front of the tongue is raised slightly below and behind the close front position. The fact that
the lips are only slightly spread also contributes to the production of a rather different vowel
quality. This sound is often diphthongised by RP speakers, marking a slight glide from a
vowel quality closer to /ɪ/. The use of a pure vowel in a final position may be typical of an
over-cultivated pronunciation.

/ɜ:/ in ‘bird’, ‘fern’, ‘journey’, ‘purse’, ‘worm’, ‘world’.
This is a central vowel Romanians can easily produce. It is articulated with the centre of the
tongue raised between the half-close and half-open positions, the lips being neutrally
spread. Lip-spreading is particularly important after /w/ in ‘word’, ‘world’, ‘work’.

/ɔ:/ in ‘board’, ‘torn’, ‘horse’, ‘warm’, ‘war’.
The vowel is almost fully back, at a height in between cardinal vowels 6 and 7. Bear in
mind that the Romanian /o/ is closer and more rounded, accompanied by a protrusion of the
lips resulting in a sort of preceding /w/ sound. Make sure there is no such thing in your
pronunciation of the English sound.

/u:/ in ‘food’, ‘soon’, ‘loose’.
The quality is that of a relaxed, slightly lowered and centralised cardinal vowel 8. It is
noticeably longer than Romanian /u/, a little more open and less rounded. Just as RP /i:/ is
rarely pure, so RP /u:/ is usually diphthongised, especially in final positions, marking a
glide from a quality close to that of short /u/ to that of pure /u:/.

It has become apparent from the description above that the long vowels differ from the
short ones not only in length but also in quality. Perhaps the only case where a long and a
short vowel are closely similar in quality is that of /ə/ and /ɜ:/.


Diphthongs consist of a glide from one vowel to another. The first part (the nucleus) of
English diphthongs is much longer and stronger than the second (they are mainly falling
diphthongs). We must remember that in English diphthongs the last part (the glide) is not to be articulated
too strongly. The easiest way to remember the eight English diphthongs is in terms of the
three groups divided as in this diagram:



Gliding to /ɪ/: eɪ, aɪ, ɔɪ
Gliding to /ʊ/: əʊ, aʊ
Gliding to /ə/: ɪə, eə, ʊə

[Figure: the centring diphthongs /ɪə/, /eə/ and /ʊə/]

The centring diphthongs glide towards the /ə/ vowel.

/ɪə/ in ‘beard’, ‘fierce’.
It has been pointed out by Daniel Jones that this sound may not always
constitute the falling diphthong described, with the first element longer and more
prominent than the second. In unstressed syllables the /I/ element may be the weaker of the
two (‘period’, ‘serious’).
Since there is no similar diphthong in Romanian, we might have the tendency to reduce the
glide of ‘serious’ or ‘series’. Romanian learners should insist on an ‘English’ pronunciation
of the diphthong in such situations.

/eə/ in ‘hair’, ‘scarce’, ‘aired’.
According to Chitoran (1978) the diphthong starts with a fairly open /e/ closer to /æ/ than to
RP /e/. According to Peter Roach, it begins with the same vowel sound as the /e/ of ‘get’,
‘men’. Therefore we can use either the symbols /eə/ or /ɛə/ to describe the sound.
Romanians should avoid pronouncing this diphthong as a monophthong in ‘parents’ or
‘Mary’. When in final position, the /ə/ glide approaches /ʌ/ in ‘hair’, ‘layer’.

/ʊə/ in ‘moored’, ‘tour’, ‘truant’.
This diphthong moves from an initial tongue position close to that used for /u/ towards a
more open type of /ə/ which constitutes the target that all the centring diphthongs aim at.
The lips are slightly rounded at the beginning of the glide, becoming neutrally spread as the
glide progresses. A common mistake Romanians are liable to make is to monophthongise
the glide, especially when the diphthong is followed by /r/, as in ‘during’ or ‘tourist’.

[Figure: the closing diphthongs /əʊ/, /eɪ/, /ɔɪ/ and /aɪ/]

The closing diphthongs end with a glide towards a closer vowel.

/eɪ/ in ‘paid’, ‘ache’, ‘say’, ‘gaol’.
The movement begins from slightly below the half-close front position and moves in the
direction of /I/, the lips being spread. The nucleus is longer when the diphthong is in final
position (in ‘day’), almost as long when followed by a voiced consonant (‘spade’), and
shortened when followed by a voiceless consonant (‘Kate’). The phenomenon is also
encountered with the other closing diphthongs. In some dialectal varieties, including
Cockney, the diphthong is pronounced /aI/. One possible mistake is the tendency of some
Romanians, especially Moldavians (the French and the Greeks are also liable to make it) to
palatalise plosive consonants before this diphthong, namely to pronounce /keɪk/ as /kʲeɪk/
or /geɪt/ as /gʲeɪt/. This mistake can be corrected by stressing the aspiration of the initial
plosive consonant, a phenomenon that will be dealt with when we discuss the plosives.

/aɪ/ in ‘tide’, ‘blind’, ‘high’.
The nucleus of the Romanian diphthong is a central vowel, while that of the English one is
only front retracted, open, and longer. To acquire the right pronunciation, one should aim at
a sort of /a:e/.

/ɔɪ/ in ‘void’, ‘loin’, ‘voice’.
The nucleus has the same quality as /o:/ in ‘ought’ and ‘born’. The /oi/ in Romanian has a
much closer and more rounded nucleus and a closer and more distinct final element. One
should not round the English nucleus /o/ too much and should aim at /e/ as the target of the
glide, approximating a kind of /o:e/.

Two English diphthongs glide towards /u/, with a rounding movement of the lips.

/əʊ/ in ‘no’, ‘soap’, ‘roast’.
The lips may be slightly rounded in anticipation of the glide towards /u/, for which there is
quite noticeable lip-rounding. In unstressed syllables, it is often replaced by /ə/, as in
‘phonetics’ and ‘phonology’.

/au/ in ‘owl’, ‘gown’, ‘doubt’.
It begins with a vowel similar to /ɑ:/ but a little more front. The glide towards /u/ begins but
is not completed. You should try to aim at a sort of /a:o/ and avoid adding too much
lip-rounding.


Triphthongs are the most complex English sounds of the vowel type. They are glides from
one vowel to another and then to a third, all produced rapidly and without interruption.
The triphthongs can be looked on as being composed of the five closing diphthongs
described in the last section, with /ə/ added at the end. They are:

/eɪ/ + /ə/ = /eɪə/ (layer, prayer)
/aɪ/ + /ə/ = /aɪə/ (higher, fire)
/ɔɪ/ + /ə/ = /ɔɪə/ (lawyer, Sawyer)
/əʊ/ + /ə/ = /əʊə/ (lower, slower)
/aʊ/ + /ə/ = /aʊə/ (shower, tower)

The middle of the three vowel qualities of the triphthong can hardly be heard and
the resulting sound is difficult to distinguish from some of the diphthongs and long vowels.
Thus ‘layer’ may be pronounced /leə/, ‘fire’ /fa:ə/, ‘slower’ /slɜ:/.
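
The way the five triphthongs are built from the closing diphthongs can be sketched in a few lines of Python. This is only an illustration, not part of the course: the names SCHWA, CLOSING_DIPHTHONGS and triphthong are hypothetical, and the symbols are the standard IPA forms of the sounds listed above.

```python
# Illustrative sketch (hypothetical names): each closing diphthong is mapped
# to the example words the course gives for the corresponding triphthong.
SCHWA = "ə"
CLOSING_DIPHTHONGS = {
    "eɪ": ["layer", "prayer"],
    "aɪ": ["higher", "fire"],
    "ɔɪ": ["lawyer", "Sawyer"],
    "əʊ": ["lower", "slower"],
    "aʊ": ["shower", "tower"],
}


def triphthong(diphthong: str) -> str:
    """Return the triphthong formed by adding schwa to a closing diphthong."""
    if diphthong not in CLOSING_DIPHTHONGS:
        raise ValueError(f"not a closing diphthong: {diphthong}")
    return diphthong + SCHWA


# Print one line per triphthong, in the same layout as the list above.
for d, words in CLOSING_DIPHTHONGS.items():
    print(f"/{d}/ + /{SCHWA}/ = /{triphthong(d)}/ ({', '.join(words)})")
```

The sketch only models the full forms; the reduced pronunciations mentioned above (a weakened middle element) are not represented.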


According to their manner of articulation, consonants may be classified as plosives (stops),
affricates, fricatives, nasals, approximants and lateral /l/.
Plosives have the following characteristics:
- one articulator is moved against a point of articulation so as to form a stricture that allows
no air to escape from the vocal tract;
- after air has been stopped and compressed behind the stricture, it is allowed to escape;
- if the air is still under pressure when the plosive is released, it will produce audible noise;
- there may be voicing during the plosive articulation.
The four phases in the production of a plosive are called the closing phase, the hold
phase, the release or explosion phase and the post-release phase.


English has six plosive consonant phonemes, to which can be added the glottal stop /ʔ/,
which can replace or accompany other sounds. The six plosives come in three sets,
according to their place of articulation and to their being voiced or voiceless.
/p/ and /b/ are bilabial, made with the lips pressed together.

/t/ and /d/ are alveolar, made with the tip of the tongue pressed against the alveolar
ridge (the ridge of the gum), just behind the upper teeth.
/k/ and /g/ are velar, articulated with the back of the tongue pressed against the soft
palate.
When /p, t, k/ occur at the beginning of a stressed syllable, a slight puff of breath
(called aspiration) is heard immediately after the release of the consonant. Therefore,
‘Tom’, ‘Pete’, ‘Kate’ are pronounced as /tʰɒm/, /pʰi:t/, /kʰeɪt/. Aspiration does not usually
occur when the voiceless plosives are preceded by /s/, as in ‘span’. When a voiceless
plosive precedes a vowel in an unstressed syllable, the aspiration that may occur is
relatively weak. In final position, /p,t,k/ may have no audible release or may be replaced by
a glottal stop. They have a shortening effect on the preceding vowel sound, which is most
noticeable after long vowels and diphthongs. Therefore, /aI/ is shorter in ‘height’ than in
‘hide’, /eI/ is shorter in ‘late’ than in ‘day’ or ‘spade’.
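
The aspiration conditions just described can be summarised as a small decision function. The sketch below is a simplification under stated assumptions: it only covers a plosive standing before a vowel, the function name aspiration and its parameters are hypothetical, and the "none" result for /b, d, g/ reflects the general observation that only the voiceless plosives are aspirated.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}


def aspiration(plosive: str, stressed: bool, after_s: bool = False) -> str:
    """Return 'strong', 'weak' or 'none' for a plosive standing before a vowel.

    stressed: the plosive begins a stressed syllable.
    after_s:  the plosive is preceded by /s/, as in 'span'.
    """
    if plosive not in VOICELESS_PLOSIVES:
        return "none"    # assumption: /b, d, g/ are treated as unaspirated
    if after_s:
        return "none"    # aspiration does not usually occur after /s/
    # Stressed syllable: clear aspiration; unstressed: relatively weak.
    return "strong" if stressed else "weak"


print(aspiration("t", stressed=True))                # 'Tom'
print(aspiration("p", stressed=True, after_s=True))  # 'span'
```

Final position (no audible release, or replacement by a glottal stop) is deliberately left out of this sketch.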
/b,d,g/ are voiced sounds corresponding to the voiceless /p,t,k/. One should bear in
mind that /t/ and /d/ are alveolar, not dental, in English. It may be noted, however, that the
articulation of these sounds can be affected by an adjoining sound; so that one may use a
forward /t/ in ‘eighth’ (because of /θ/), and a retracted, post-alveolar /t/ in ‘tray’ (because of /r/).
/b,d,g/ may have full voicing when they occur in positions between voiced sounds,
e.g. in ‘idiot’, ‘ago’, ‘labour’. In initial and especially final positions they may remain
partially voiced or completely voiceless, e.g. ‘ghost’, ‘done’, ‘bore’, ‘lid’, ‘lab’.
Incomplete plosion. When two plosives occur together in the same word or two adjoining
words, there is no audible release (or plosion) of the first sound. In ‘soup plate’, the lips are
not parted at all between the first and the second p. In ‘sit down’ the tip of the tongue is not
removed from the teeth ridge between t and d. In ‘background’ the back of the tongue is not
removed from the soft palate between k and g. A similar effect is produced when the two
adjacent plosives have different points of articulation, a phenomenon that will be discussed
when assimilation is dealt with.
Fortis and lenis
Although /b,d,g/ are called voiced plosives, one can notice that in initial and final positions
they are scarcely voiced at all. As some phoneticians say that /p,t,k/ are produced with more
force than /b,d,g/ , it is better to give the two sets of plosives (and some other consonants)
names that indicate that fact; so the voiceless plosives /p,t,k/ will sometimes be referred to
as fortis (strong), while /b,d,g/ will be called lenis (weak).
The glottal stop /ʔ/ is produced as a result of the closure of the vocal folds, which
can interrupt the passage of the air stream into the supra-glottal cavities. The air pressure
behind the stricture is released by the sudden opening of the vocal cords. The plosive sound
is voiceless and fortis because of the strong air pressure involved.
Any initial stressed vowel may be reinforced by a preceding glottal stop when
particular prominence is given to the word: /ɪts ʔæn/. Word final /p, t, k/ and the affricate
/tʃ/ may be strengthened by a glottal stop which may coincide with the mouth closure or
slightly precede it, especially in the last word of an utterance. Some speakers go farther than
that, altogether replacing /p, t, k/ with /ʔ/. Apart from initial, prevocalic position, the
glottal stop is very frequently found in final position, replacing /t, p, k/ before a word
beginning with a consonant: cut grass /kʌʔ grɑ:s/.
It is widespread in the speech of urban working-class people, and is found in most
regions, with the exception of certain parts of Wales.


Fricatives are produced while air escapes through a small passage and makes a hissing
sound. Fricatives are continuant sounds, which means that one can continue making them
without interruption until the whole air stream is spent.
The fortis fricatives /f, θ, s, ∫ / are articulated with greater force, and their friction
noise is louder. They have the effect of shortening a preceding vowel, as do fortis plosives.
Their voiced counterparts are /v, ð, z, ʒ/.
/f,v /
/f/ in word initial father, feather, fine, fortress, philosophy, pheasant
word medial after, paraffin, offer, office, muffin, suffer
word final cough, laugh, shelf, tough, trough
/v/ in word initial van, vice, Victor, voice, void
word medial covert, even, nephew, oven, sovereign
word final believe, have, give, love, of, receive, sieve
/f, v/ are labio-dental, the inner part of the lower lip lightly touching the edge of the
upper teeth; the narrowing of the air passage produces friction. For /f/, the friction is
voiceless, while for /v/ there may be some vocal cord vibration, according to the
environment.
Word final /v/ may be assimilated to /f/ (see assimilation) before a voiceless
consonant at the beginning of the following word, regularly in have to /hæftə/, or may be
subject to elision in the case of the unstressed forms of of, have in a lot of money, a cup of
tea, I could have guessed.


/θ, ð/
/θ/ in word initial therapy, thin, three, thorough
word medial author, Athens, athletic, worthless
word final oath, wrath, myth, length, month
/ ð / in word initial the, this, that, they, though
word medial breathing, either, other, northern
word final seethe, soothe, mouth, with, writhe
/ θ, ð / are dental, the tip of the tongue making a light contact with the edge and
inner surface of the upper teeth; again, the narrowing of the air stream causes friction. As
with the preceding pair, the first element is produced with voiceless friction, whereas there
may be some vocal fold vibration for the second.
In popular London speech, the difficulty of the dental articulation leads to the
replacement of / θ, ð / with the labio-dental / f, v / in ‘mother’ , ‘Smith’, ‘throw’.
In Romanian, these phonemes do not exist, which may cause difficulty to the beginner, who
might assimilate them to other consonants. However, the two dental sounds are not difficult
to pronounce in themselves, since young children acquire them before /s,z/ as variants. A
teacher need only remind the learners of this, getting them to produce ‘lisped’ variants of /s, z/.

/s, z /
/s/ in word initial sample, sing, sigh, sign, sorrow, stupid
word medial aspect, bossy, concert, inspiration, rascal, task
word final dismiss, distress, hiss, grass, piece, kiss, miss, moss
/z/ in word initial zero, zeal, zombie, czar, zoo
word medial easy, dozing, bosom, husband, scissors, exam
word final eyes, does, gaze, maze, noise, ooze, seize, tease
/ s,z / are alveolar in English. The tip and the blade of the tongue make a light
contact with the upper alveolar ridge, and the side rims of the tongue a close contact with
the upper side teeth. The air-stream escapes through a narrow groove in the centre of the
tongue and causes friction between the tongue and the alveolar ridge. A lisp, i.e. the
replacement of / s, z / with /θ, ð/, is a common speech defect.

/ʃ, ʒ/
/ʃ/ in word initial Sean, Seamus, shame, shadow, sugar, sure
word medial Ishmael, assurance, fashion, luscious, machine
word final anguish, dish, lash, posh, swish, wish
/ʒ/ in word initial (in French loan words) genre, jabot
word medial fusion, allusion, leisure, pleasure, treasure
word final (only in French loan words; an alternative pronunciation with /dʒ/ is also
possible) - barrage, beige, prestige, rouge
/ʃ, ʒ/ are palato-alveolar or post-alveolar in English. The tip and blade of the tongue make
a light contact with the alveolar ridge, the front of the tongue being raised at the same time
in the direction of the hard palate and the side rims of the tongue being in contact with the
upper side teeth. The escape of air is rather diffuse, the friction occurring between a more
extensive area of the tongue and the roof of the mouth. As /ʒ/ is rare in initial position and
can be replaced by /dʒ/ in final position, it has a restricted ‘functional load’ as a phoneme in
English.

/h/
in initial position - hair, hen, harmony, hostage, heritage, Hebrew, host, hiss, horse
in medial position - behead, behold, behind, childhood, ahead
no final position
/h/ is a glottal fricative sound. It is distributed only in syllable initial, pre-vocalic positions.
The air is expelled from the lungs with considerable pressure, causing friction along the
vocal tract, especially in the glottis. Phonetically /h/ is a voiceless vowel with the quality of
the voiced vowel that follows it; phonologically, it functions as a consonant.
In certain types of regional speech, /h/ is dropped. Sometimes a trace of the function
of /h/ will be seen in the insertion of a glottal stop in its stead, e.g. /ðə ˈʔɒspɪtl/. Many
English speakers tend to judge as sub-standard a pronunciation in which /h/ is missing,
though in fact virtually all native English speakers omit the /h / in unstressed pronunciations
of the personal pronouns ‘him’, ‘her’, the auxiliary ‘have’, though few of them are aware of
doing it. Consider the following riddle and note the pronunciation: When a girl slips on the
ice, why can’t her brother help her up? Answer - Because he can’t be a brother and assist
her too (a sister too ).
Some RP speakers treat an unstressed h syllable, as in ‘historical’, as if it belonged
to the special group hour, honest, heir: an historical novel. Also, in the case of hotel, a
pronunciation without /h/ is quite widespread.


Affricates are plosives whose release stage is performed in such a way that considerable
friction appears at the point where the plosive stop is made. Affricates begin as plosives
and end as fricatives, the final friction being shorter than that of the fricatives proper.
/tʃ/, /dʒ/
/tʃ/ in initial position cherry, chip, churn, chop, cheer, charter
in medial position urchin, researcher, Richard, lecture, treacherous
in final position lurch, search, much, touch
/dʒ/ in initial position join, John, jam, genius, joke, general
in medial position adjoining, ajar, agitator, grandeur, midget, ingenious
in final position edge, large, budge, change, surge, urge
/tʃ/ and /dʒ/ are post-alveolar sounds. In their articulation, the obstacle to the
air-stream is formed by a closure effected by the tip, blade, and rims of the tongue against the
upper alveolar ridge and side teeth. Simultaneously, the front of the tongue is raised
towards the hard palate in anticipation of the fricative release. The closure is released
slowly, the air escaping diffusely over the central surface of the tongue with accompanying
friction. During the stop and fricative stages, the vocal cords are wide apart for /tʃ/, but may
be vibrating for /dʒ/ according to the environment.
/ t∫ / is slightly aspirated in the positions where / p, t, k / are. Like them, when it assumes
final position, it has a shortening effect on the preceding vowel sound, most noticeably
when the latter is a long vowel or a diphthong (church, lurch, search). Like /p,t,k/ again, /t∫/
can be glottalised, i.e. preceded by a glottal stop in certain contexts. The most widespread
glottalisation is that at the end of a stressed syllable, for example in /ˈneɪʔtʃə/ or /ˈkæʔtʃɪŋ/.

Some phoneticians (Gimson 1962, 1989) treat the combinations /t/ + /r/ and /d/ +
/r/ as established affricate units, /tr/ and /dr/. As they are more than the sum of the two
elements involved, involving assimilation, they need some consideration here.
/tr/ in initial position tray, tree, triptych, trap, train, true
medial position attraction, poetry, mistress, sultry
/dr/ in initial position drain, draw, drip, dream, drag, drill, drama
medial position adrift, address, Andrew, adroit, Audrey
/tr/ and /dr/ do not occur in final position. Because of the following /r/ sound, alveolar /t/
and /d/ are made to be articulated in a post-alveolar place. As a result, these combinations
may sound very much like /tʃ/ and /dʒ/.


It is agreed that English has at least two contrasting nasal phonemes, /m/ and /n/. However,
there is disagreement about whether there is a third nasal phoneme in RP, /ŋ/. Some say
that the latter is only an allophone of /n/, occurring in certain set positions.
For the production of the nasals, the soft palate must be lowered, thus directing the
air-stream towards the nasal cavity. There is also a complete closure in the mouth (bilabial
for /m/, alveolar for /n/ and velar for /ŋ/).
While /m/ and /n/ are simple to articulate, /ŋ/ is different. It gives considerable
problems to foreign learners, including Romanians. The place of articulation of /ŋ/ is the
same as that of /k/ and /g/; it is a useful exercise to practise making a continuous /ŋ/
sound. If you do this, do not produce a /k/ or /g/ at the end - pronounce the /ŋ/ just like /m/
or /n/.
/ŋ/ is the only English consonant that cannot occur initially. Medially, /ŋ/ occurs
quite frequently, but there is the question whether or not to pronounce a following
orthographic g, when it is not final. For example, in RP one finds two distinct sets of words:
a) fishmonger, anger, longest, longer;
b) singer, hanger, ringing, clinging.
The words in b) are verbs or derived from verbs. In such words the g is not
pronounced. Suppose we derived the noun ‘longer’ from the verb ‘to long’, in the sentence,
‘this longer longs longer than that longer,’ only the adverb longer allows the pronunciation
of g.
/ŋ/ never occurs after a diphthong or a long vowel, and there are only five vowels
ever found preceding it: /ɪ, e, æ, ʌ, ɒ/. To sum up, the velar consonant /ŋ/ is phonetically
relatively simple, but phonologically complex (i.e., it is not easy to describe the contexts in
which it occurs).

A lateral consonant is made by means of a partial closure, on both sides of which the air is
allowed to escape. Only one alveolar, lateral phoneme occurs in English. It has three main
allophones: a) ‘clear’ [l], b) ‘dark’ [ɫ], and c) ‘voiceless’ [l̥].
Romanians have no difficulty articulating clear [l] or devoiced [l̥]. Dark [ɫ] does not
exist in our language, though. It is similar to the Russian ‘hard’ [l]. For its articulation, the
tongue assumes the position for clear l, the tip of the tongue touching the alveolar ridge, but
with a simultaneous raising of the back of the tongue. The tongue does not break contact
with the roof of the mouth for dark l so that an alveolar lateral sound is maintained. In the
figures above, you can see the comparative articulations of clear [l] and dark [ɫ].
a) clear l occurs before vowels and /j/ - leave, below, foolish;
b) dark l appears finally after a vowel, before a consonant, and as a syllabic sound
following another consonant - well, silk, settle ;
c) voiceless l mainly follows aspirated /p/ and /k/ - play, clean.
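
The distribution in a) to c) can be written out as a simple selection rule. The following Python sketch is an illustration under assumptions: the function name l_allophone, its parameters and the VOWELS set are hypothetical, and sounds are passed in as plain IPA strings.

```python
# Vowel symbols that can begin the sound following /l/ (illustrative set).
VOWELS = set("iɪeæʌɑɒɔʊuəɜa")


def l_allophone(next_sound: str = "", after_aspirated_plosive: bool = False) -> str:
    """Pick the allophone of /l/ from its context.

    next_sound: the sound after /l/ ('' if /l/ is final or syllabic).
    after_aspirated_plosive: /l/ follows aspirated /p/ or /k/.
    """
    if after_aspirated_plosive:
        return "voiceless l"                      # c) play, clean
    if next_sound and (next_sound[0] in VOWELS or next_sound[0] == "j"):
        return "clear l"                          # a) leave, below, foolish
    return "dark l"                               # b) well, silk, settle


print(l_allophone("i:"))                                 # leave
print(l_allophone("k"))                                  # silk
print(l_allophone("eɪ", after_aspirated_plosive=True))   # play
```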

APPROXIMANTS: /r/, /j/, /w/

An approximant, as a type of consonant approximating the qualities of a vocoid, is a
sound in which the articulator approaches the place of articulation but does not get
sufficiently close to produce a proper consonant such as a plosive, nasal or fricative.
/r/ is a post-alveolar approximant or frictionless continuant. Its most common
allophone is the kind of /r/ sound heard in ray, rag, wry. It is the type of /r/ we usually
associate with English, whose symbol is an inverted r, /ɹ/. In its articulation, the tip of the
tongue is held in a position near to, but not touching, the rear part of the upper teeth ridge;
the back rims of the tongue are touching the upper molars; the central part of the tongue is
lowered, with a general contraction of the tongue, so that the effect is one of hollowing and
slight retroflexion of the tip. The air-stream is allowed to escape freely, without friction, over the
central part of the tongue. This frictionless continuant variety of /r/ has a quality which is
not encountered in Romanian, having much in common with a vowel (it has been said that
/ɹ/ is a vocoid).
This phoneme only occurs before vowels in RP, as well as between words, for
linking purposes (linking /r/ and intrusive /r/, phenomena which will be dealt with in
another section of the course).

In words like car, horse, arm, sort, the r-letter may be taken to indicate the length of the
preceding vowel. However, many accents of English do pronounce /r/ in words like car,
hard, ever, here (American, Scottish, and West of England accents). It has already been said
that accents which have /r/ in final position or before a consonant are rhotic accents, unlike
RP, which is non-rhotic.
/j/ and /w/ are traditionally called semivowels; the term approximant will be used
here, though. These phonemes are phonetically like vowels but phonologically like
consonants. They only occur before vowels.
The allophones of palatal /j/ are articulated by the tongue adopting the position for a
vowel like /i/ and moving away immediately to the position of the following sound. When
/j/ follows a fortis consonant, devoicing takes place; when the consonant is one of /p, t, k, h/,
the devoicing is complete, a fortis voiceless palatal fricative [ç] being produced - pure,
tune, cute, human.
The allophones of bilabial /w/ are articulated by the tongue assuming the position
for a vowel like /u/ and moving away quickly to the position of the next sound. The lip-
rounding for /w/ is closer and more energetic than that for /u:/, allowing distinctions for
such pairs as woos - ooze.
Words beginning with /w/ or /j/ are considered to be initiated by a consonant: the
preceding indefinite article is a, and the definite article is unstressed the /ðə/, e.g., a worker,
a war, a university. This is further evidence that /j/ and /w/ function as consonants.
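
The point that the article depends on the first sound rather than the first letter can be illustrated with a short Python sketch. It is a sketch under assumptions, not a full rule: the function name indefinite_article and the VOWEL_SYMBOLS set are hypothetical, and the input is a phonemic transcription in which /j/ and /w/ count as consonants.

```python
# Symbols that can begin a vowel or diphthong in the transcription (illustrative).
VOWEL_SYMBOLS = set("iɪeæʌɑɒɔʊuəɜa")


def indefinite_article(transcription: str) -> str:
    """Choose 'a' or 'an' from the first sound of a phonemic transcription."""
    sounds = transcription.lstrip("ˈˌ'")   # ignore leading stress marks
    if not sounds:
        raise ValueError("empty transcription")
    return "an" if sounds[0] in VOWEL_SYMBOLS else "a"


print(indefinite_article("ju:nɪˈvɜ:sɪtɪ"))  # university: begins with /j/
print(indefinite_article("ˈaʊə"))           # hour: begins with a vowel sound
```

Working from the transcription rather than the spelling is what lets ‘university’ and ‘hour’ come out differently from what their first letters suggest.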

             bilabial  labio-dental  dental  alveolar  post-alveolar  palatal  velar  glottal
plosive      p, b                            t, d                              k, g   ʔ
fricative              f, v          θ, ð    s, z      ʃ, ʒ                           h
affricate                                              tʃ, dʒ
nasal        m                               n                                 ŋ
lateral                                      l
approximant  w                                         r              j

The English consonant phonemes



The identification of individual sounds and the segmentation of the speech chain are the
concern of segmental phonology. Actually, English and many other languages use sounds
that involve combinations of articulatory coordinates and values. Pike (1943) speaks of
secondary articulations (such as vowel nasalisation) and other forms of articulation
involving more than one place of articulatory activity in the vocal tract. Clark and Yallop
(1995) do not make a clearcut distinction between secondary and complex articulations,
using the latter term to describe two types of complex articulation: simultaneous (separate
but co-occurring articulatory activities which result in the production of a sound which is
viewed as a single unit) and transitional (separate and successive articulatory activities
which together can be identified as a single segment). The distinction between simultaneous
and transitional, while important, will not be of much use here. For the time being, I will
only use the term complex articulation, whose most important illustrations in English are:
a) nasalisation (often found in Northern accents): kennel may be pronounced with
nasalised /e/ and /l/, the two sounds being affected by the articulation of the nasal
consonant /n/ between them;
b) labialisation: the addition of lip rounding or lip protrusion: /l/ in aloof is pronounced
with lip rounding in anticipation of /u:/;
c) palatalisation: the raising of the blade of the tongue to a high front position, as for an /i/
sound; in the articulation of plosives followed by an /i/ or a /j/, the air turbulence
which is produced may lead to their affrication: teens /tʃi:nz/;
d) velarisation: the movement of the tongue body and root from their normal vocal tract
position backwards, curling the back of the tongue, is important in the production of
‘dark’ l and in the articulation of American /r/.
Other instances of complex articulation are diphthongisation of monophthongs (which
has already been mentioned when dealing with the vowels) and syllabicity – the use of
syllabic consonants (an aspect that will be discussed in the section on weak syllables).


The syllable is the lowest phonological unit into which phonemes are combined.
Phonetically speaking (in terms of perception), a syllable consists of a sonorous centre
sounding comparatively loud, having little or no obstruction to the air-flow; at the
beginning and at the end of the syllable there will be greater obstruction and less loud
sound. From an articulatory point of view, it is based on one chest pulse resulting from the
movement of the intercostal muscles, a single unit of movement of the lungs in which there
is only one crest of speed, the sonorous centre mentioned above.
A minimum syllable would be one made up of a single isolated vowel, as in are, or,
err. More complex ones have an onset (a beginning), as in tea, far, me, do, and/or a
termination (sit, cup, eat, art, smart, charm).
The onset of English and Romanian syllables may consist of single consonants or
consonant clusters of two or three. Unlike Romanian, English may have long consonant
clusters as termination (four-consonant clusters in words like twelfths or glimpsed). What is
also special about English is that some two-consonant and three-consonant clusters may
form syllables without a vowel centre: apples, rhythms, dozens, bacon. The centres of the
final syllables in these words are the so-called sonorous consonants /l, m, n, ŋ/.
Syllables of the type V, CV are called free / unchecked syllables (not ending in an
obstruction of the air-flow), while combinations of the type VC, CVC make up checked
syllables. It is worth noting that free syllables are predominant in Romanian, and checked
ones are typical of English.
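
The free/checked distinction depends only on how the syllable ends, which the following Python sketch makes explicit. It is a minimal illustration: the function name syllable_type is hypothetical, syllables are given as consonant-vowel patterns such as "CV" or "CVC", and syllables with a syllabic consonant as centre are left out.

```python
def syllable_type(cv_pattern: str) -> str:
    """Classify a pattern such as 'V', 'CV', 'VC' or 'CVC' as free or checked."""
    pattern = cv_pattern.upper()
    # Only patterns built from C and V with a vowel centre are handled here.
    if not pattern or set(pattern) - {"C", "V"} or "V" not in pattern:
        raise ValueError(f"unsupported syllable pattern: {cv_pattern!r}")
    # A syllable ending in a vowel is free (unchecked); otherwise it is checked.
    return "free" if pattern.endswith("V") else "checked"


for p in ("V", "CV", "VC", "CVC", "CCVCC"):
    print(p, syllable_type(p))
```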

One of the most noticeable features of English is that many syllables are weak (in this
sentence, for instance, are is a weak syllable). Weak syllables can only have four types of
centre:
1. the vowel /ə/;
2. a close front unrounded vowel between /i:/ and /I/;
3. a close back rounded vowel between /u/ and /u:/;
4. a syllabic consonant, a centre unusual for Romanian.

1. The /ə/ vowel (‘schwa’) is the most frequently occurring vowel in English,
always associated with weak syllables. Vowel letters in unstressed syllables are usually
pronounced as /ə/, as in kimono, about, become, forget, survive.
2. The / i / sound occurring in weak syllables is sometimes rather different from
both short /I/ and long /i:/, as in happy, valley, happier, hurrying, react, preoccupied,
appreciate, in the unstressed forms of he, she, me, be, as well as in the article the when it
precedes a vowel.
3. The close back vowel sound /u/ functions as centre of weak syllables such as you,
to, into, do, when they are unstressed and not immediately followed by a consonant, and
through and who in all positions when they are unstressed.
4. A syllabic consonant can assume the role the vowel sounds in 1-3 play as centres
of weak syllables. As already said, in weak syllables in which no vowel is found, one of the
consonants /l, r, m, n, ŋ/ can function as the syllabic centre. It is usual to indicate that a
consonant is syllabic by means of a small vertical mark below it.
Syllabic [l] is found in cases where we have a word ending with one or more
consonant letters followed by orthographic le(s), as in bottle, muddle, wrestle, couple,
trouble, knuckle, struggle. We also find syllabic l in words spelt with, at the end, one or
several consonant letters followed by al or el, as in panel, papal, petal, parcel, kernel,
pedal, Babel. Make sure you do not insert an /ə/ sound before final /l/.
Syllabic [n] is the most frequently found and the most important of the syllabic
nasals. It is found medially and finally in words like threaten, Edinburgh, Cheltenham,
Tottenham. To pronounce a vowel before it would sound uncommon or overcareful in RP.
Syllabic n is most common after alveolar plosives and fricatives. One does not find syllabic
[n] after /l, tʃ, dʒ/ so that words like sullen, Christian, pigeon do not contain it. After /f/ or
/v/, though, syllabic n is more common than /ən/ in non-initial syllables, such as in eleven,
enliven, often. Syllabic [n] can be preceded by two consonants, but not by nt, mt, and nd,
md; thus, London, Framton, Clinton do not have syllabic [n].
Clusters of syllabic consonants
Sometimes we can find two syllabic consonants together, as in international,
visceral, visionary, veteran.


There are four factors that are important in making a syllable recognisably stressed:
loudness, length, pitch, and quality. Pitch is thought to produce the strongest effect, and
length is also a powerful factor. A syllable is therefore prominent in a certain context if it is
uttered on a higher pitch than the surrounding syllables, if it is longer or louder, or if it contains
a vowel that is different in quality from the neighbouring vowels.
The vowel quality as a factor influencing stress has to do with the contrast between
the degrees of explicitness of articulation of the stressed and unstressed syllables. In a
stressed syllable the initial consonant(s) and the vowel will be comparatively clearly uttered
while in an unstressed syllable the consonants are likely to be weakly uttered and the vowel
is usually obscure. Compare, for example, the /p/ in the first and second syllable of paper,
as well as the vowels in the two syllables.
To the four factors contributing to stress formation, Underhill (1994) adds a fifth,
which provides a visual cue: stressed syllables are accompanied by larger jaw, lip and other
facial movements by the speaker.
The five factors are interrelated, being ways of increasing or decreasing the amount
of articulatory energy at any point. The stress pattern of a word can thus be seen as its
energy profile.


Words have stress patterns that are quite constant when the word is uttered in isolation or in
set constructions. For polysyllabic words, the stress pattern is a decisively important
identifying feature in rapid or casual speech. It should not be considered as a superficial
addition to a correctly pronounced string of speech sounds but as the essential framework
within which the consonants and vowels interact to produce meaning. Muttering almost
incomprehensibly Tottenham Court Road while riding the London tube, but using the right
stress pattern, will probably be more successful in a conversation than articulating more
carefully the name of the tube station with the wrong stress pattern.


Between stressed and unstressed syllables we have to distinguish one or more intermediate
levels. In the word accommodation we can notice five different stress levels in decreasing
order: 1. /deI/ 2. /kɒ/ 3. /mə/ 4. /ə/ 5. /∫n/. However, English makes effective use of
only three such levels: tonic strong stress or primary stress, non-tonic strong stress or
secondary stress and weak or unstressed level.


As far as stress is concerned, English words are to a large extent unpredictable. It has been
said that the best policy is to treat stress placement as a property of the individual word, to
be learnt when the word itself is learnt. Some people claim that rules of stress placement do
exist, although they abound in exceptions.
In order to decide on stress placement, it is necessary to make use of some or all of
the following information:
a) whether the word is morphologically simple or complex (affixes, compounds);
b) the grammatical category to which the word belongs (noun, adjective, verb, etc.);
c) the number of syllables in a word;
d) the phonological structure of the inherent syllables.


Complex words are of two major types: words made from a basic stem word with the
addition of an affix, and compound words, which are made of two or more independent
words.
Affixes will have one of three possible effects on word stress:
1. the affix receives the primary stress;
2. the affix does not affect the stress in the stem;
3. the stress remains within the stem, but is shifted to a different syllable.

refugee, mountaineer, Siamese, cigarette, picturesque, unique.

comfortable, anchorage, refusal, widen, wonderful, amazing, devilish, birdlike, powerless,
hurriedly, punishment, dangerous, funny.

courageous, photography, colonial, organic, reflexive, economical.

Primary stress may be placed on the first word of the compound or on the second. A simple
rule of thumb can be used although it is not infallible.
1. If the first part is (in a broad sense) adjectival, the stress goes on the second
element, with a secondary stress on the first: ,loud’speaker, ,bad-’tempered, ,three-’wheeler.
2. If the first element is (in a broad sense) nominal, the stress goes on the first
element: ‘sunrise, ‘suitcase, ‘teacup.
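The rule of thumb above can be written out as a short function. This is only a sketch: the name compound_stress and the flag first_is_adjectival are illustrative assumptions, the decision whether the first element is "adjectival" is left to the user, and, as the text says, the rule is not infallible.

```python
def compound_stress(first, second, first_is_adjectival):
    """Mark stress on a two-part compound following the rule of thumb.

    A comma marks secondary stress and an apostrophe marks primary stress,
    each placed before the element that carries it.
    """
    if first_is_adjectival:
        # Adjectival first element: secondary stress first, primary on the second.
        return "," + first + "'" + second
    # Nominal first element: primary stress on the first element.
    return "'" + first + second


print(compound_stress("loud", "speaker", True))
print(compound_stress("sun", "rise", False))
```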

When a pair of words exists, members of which are spelt the same, one of them being a
verb and the other either a noun or an adjective, the stress will be placed on the second
syllable of the verb but on the first syllable for the noun or adjective:
verb            noun or adjective
com’pound       ‘compound
ex’port         ‘export
ex’ploit        ‘exploit
re’cord         ‘record
in’sult         ‘insult
pro’duce        ‘produce
pro’test        ‘protest
re’bel          ‘rebel
sus’pect        ‘suspect
An exception to the above rule is to be noted with to comment (on) and the noun comment,
which both receive stress on the first syllable.
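The alternation in the table, together with the exception of comment, can be sketched as a small lookup. This is a minimal illustration covering only the words listed above; the function name mark_stress and the dictionary names are assumptions made for the example.

```python
# Two-syllable pairs from the table, split into their syllables.
PAIRS = {
    "compound": ("com", "pound"),
    "export": ("ex", "port"),
    "exploit": ("ex", "ploit"),
    "record": ("re", "cord"),
    "insult": ("in", "sult"),
    "produce": ("pro", "duce"),
    "protest": ("pro", "test"),
    "rebel": ("re", "bel"),
    "suspect": ("sus", "pect"),
}

# Exception noted in the text: first-syllable stress for verb and noun alike.
FIRST_SYLLABLE_ONLY = {"comment": ("com", "ment")}


def mark_stress(word, category):
    """Return the word with an apostrophe before its stressed syllable."""
    if category not in ("verb", "noun", "adjective"):
        raise ValueError("category must be 'verb', 'noun' or 'adjective'")
    if word in FIRST_SYLLABLE_ONLY:
        first, second = FIRST_SYLLABLE_ONLY[word]
        return "'" + first + second
    first, second = PAIRS[word]
    if category == "verb":
        # Verbs in these pairs are stressed on the second syllable.
        return first + "'" + second
    # Nouns and adjectives are stressed on the first syllable.
    return "'" + first + second


print(mark_stress("record", "verb"))
print(mark_stress("record", "noun"))
```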


Almost all the words which have both a strong and a weak form - there are roughly 40 such
items in English - belong to the category of grammatical words, such as primary auxiliaries,
modals, prepositions, conjunctions.
In certain circumstances, these words are pronounced in their strong forms, but their
weak forms are more frequently produced and sound more natural in casual, ordinary
conversation. The weak forms are the result of a process of sound simplification that
depends on the tempo and context of the utterance. A slower and more careful delivery will
stick closely to dictionary pronunciation with its strong forms, whereas a faster and less
careful delivery will contain variants of weak forms.
There are contexts where the weak form is the normal pronunciation and others
where only the strong form is acceptable.
The strong form is used in the following cases:
1. For prepositions and auxiliaries, when they occur at the end of the utterance:
She dislikes being looked at /æt/.
What are you waiting for? /fɔ:/
He says he will come but I don’t know if he can /kæn/.
2. When a weak-form word is given prominence in conversation, contrasted with another
word or ‘quoted’:
He has brought his wife and /ænd/ his mother-in-law.
You must /mʌst/ face up to it, whether you like it or not.
The message is not from /frɒm/ them, we’re planning to send it to /tu:/ them.
The preposition of /ɒv/ occurs too often in this text, I’m afraid.
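The two cases above amount to a simple decision procedure, sketched here for a handful of words. The names FORMS and choose_form are illustrative assumptions; the weak forms given for at, for, from and can are standard RP values supplied for the example rather than taken from the list above.

```python
# word -> (category, strong form, weak form)
FORMS = {
    "at": ("preposition", "æt", "ət"),
    "for": ("preposition", "fɔ:", "fə"),
    "from": ("preposition", "frɒm", "frəm"),
    "of": ("preposition", "ɒv", "əv"),
    "can": ("auxiliary", "kæn", "kən"),
    "must": ("auxiliary", "mʌst", "məst"),
    "and": ("conjunction", "ænd", "ən"),
}


def choose_form(word, utterance_final=False, prominent=False):
    """Pick the strong or weak form according to the two cases in the text."""
    category, strong, weak = FORMS[word]
    # Case 2: the word is given prominence, contrasted or quoted.
    if prominent:
        return strong
    # Case 1: prepositions and auxiliaries at the end of the utterance.
    if utterance_final and category in ("preposition", "auxiliary"):
        return strong
    return weak


print(choose_form("at", utterance_final=True))
print(choose_form("of"))
```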

Weak forms in context (although italicised, it goes without saying that the words should
not be stressed; it is the words around them that are stressed)
OF That’s very nice of /əv/ you.
Have a cup of tea (cuppa tea).
He’s a friend of mine (frienda mine).
AND Come and /ən/ see us tomorrow.
Gobble and /ŋ/ go (name of a small fast food place in Edinburgh).
Rock ‘n’ roll (the weak pronunciation in this case has all but institutionalised this
BUT It is sad but /bət/ true.
THAN Better safe than /ðən/ sorry.
THAT used as a relative pronoun has a weak form; as a demonstrative, it always has its
strong form, / ðæt /. Compare the two ‘thats’ in:
That /ðæt/ man that /ðət/ applied for the job.
THEM Tell them to get a move on, it is rather late (/ðəm/ or /əm/).
HIS is pronounced in its strong form when it assumes initial or final position in an
utterance; otherwise, in unstressed positions, it is commonly realised as /Iz/:
His / hIz / preoccupation with language is normal in a person like him.
It is not mine, it is his /hIz /
What’s his / Iz / opinion on this issue?
He should take his /Iz / time, there’s no rush.
HER, like HIS, is pronounced in its strong form in initial position; in other contexts it is
commonly pronounced in its weak form:
Take her /ə/ to the manager.
They seem to like her /ə/.
Her /hə/ proposal that we should postpone making a decision was turned down.
HE will be pronounced as /hI/ or /hi:/ in initial position and when it is given special
prominence; otherwise, it will be realised as /I /:
He /hI / was wrong, wasn’t he / I /?
Why didn’t he /I / turn up on Monday?
HIM, when occurring in initial position (very seldom, actually), will be stressed; in other
contexts, the weak form is preferred:
Him / hIm / I don’t like, but I like her.
Tell him /Im / we are looking forward to hearing from him /Im/.
US is usually pronounced /əs/:
Let us /əs/ know as soon as you have made up your mind.
Don’t forget us /əs/!
YOU You should do as you are told (/ju/ or /jə/).
What did you do then? (/jə/; did you may be pronounced /dIdʒə/)
YOUR Thank you, but I don’t need your help (/jə/; need your may be pronounced
SOME in final position and before singular count nouns is pronounced in its strong form;
otherwise, it takes on its weak form:
You don’t have to give me any, I’ve got some /sʌm/.
I hope to meet you again some /sʌm/ time next month.
I need some /səm/ help with this essay.
Some /səm/ people may think he’s nuts.

PRIMARY AUXILIARY VERBS and MODALS are always stressed when used in the
negative; as already said, their strong forms are used in final position in a tone group (when
DO, BE, HAVE are used as lexical verbs, they will have their strong forms):
He wasn’t /wɒznt/ paying attention to what you were saying.
He can’t /kɑ:nt/ be telling the truth, he never does /dʌz/.
DO Where do /də/ they intend to put him up for the night?
Do /də/ they think they will be able to keep their promise this time?
In front of a vowel, it will be pronounced /du/:
How do /du/ I go about it?
DOES Why does /dəz/ he /I/ keep telling us /əs/ the same old stories?
I don’t know why he does /dʌz/ so much work as long as he hardly gets any credit
for it (does is used as a lexical verb here).
COULD You could /kəd/ do that if you wanted.
I know I could /kʊd/.
Could you tell me the time, please? /kədjə/ or /kədʒə/.
HAVE Sorry, I must have /əv/ made a mistake (must has a strong form here; see why
when MUST is discussed below)
Yes, you have /hæv/ (final position).
HAS How much has /əz/ he /I/ done so far?
Where’s /z/ he /I/ left his /Iz/ luggage?
MUST is strong when it shows deduction or in final position; otherwise, it is either
/məst/ or /məs/:
They must /mʌst/ have /əv/ completed their project by now.
Yes, I must /mʌst/.
You must /məst/ eat it up.
You must /məs/ drink it up.
SHALL We shall have to accept their offer (/∫l/ or /∫əl/).
Yes, we shall /∫æl/ (compare with No, we shan’t /∫ɑ:nt/).
SHOULD I should /∫əd/ have done it long ago, I’m aware of that.
Yes, you should /∫ʊd/. No, you shouldn’t /∫ʊdnt/.
ARE Here are /ə/ the reports you asked me to type.
There are /ə/ a couple of things I’ve got to tell you right away.
Yes, you are /ɑ:/. No, you aren’t /ɑ:nt/.
WAS Was /wəz/ he /I/ pleased with his /Iz/ work?
Yes, he /I/ was /wɒz/. No, he /hI/ wasn’t /wɒznt/.
He was /wəz/ given this opportunity and he made the most of it.


Connected speech is not the juxtaposition of individual words with their dictionary
pronunciation. In the process of actual speech, depending on the tempo or speed of
utterance, the flow of sounds is affected by a system of simplifications by means of which
phonemes are connected, clustered, reduced, changed or deleted altogether.
The changes are quite systematic and include assimilation, elision and linking; they are
conditioned by tempo, rhythm, intonation and other factors.


The speed and context of the utterance are the key factors in the process of sound
simplification. A slower and more careful way of pronouncing sounds may stick more
closely to the dictionary pronunciation to which foreign learners are usually accustomed.
A faster and more casual manner of articulation will lead to a greater degree of sound
simplification, which does not usually simplify the task of the foreign speaker. The two
speeds of delivery are called by Underhill (1994) careful colloquial speech and rapid
colloquial speech, concepts he goes on to use in order to describe two models or landmarks
to aim for in the study of connected speech pronunciation.
Careful colloquial speech is said to contain all types of modifications to a moderate
degree. Words remain closer to their dictionary pronunciation than with rapid colloquial
speech. This style is used in formal situations. The careful colloquial RP of newscasters and
announcers on the BBC World Service is given as an internationally available example of it
by the above-mentioned author. This type of pronunciation is useful as a goal to be aimed at
for learners when they speak. It is clear, easy to listen to and widely understood. A native
speaker is likely to resort to it when communicating with a foreigner.
Rapid colloquial speech contains modifications to a larger extent. This type of
pronunciation is resorted to in less formal situations, when native speakers are talking
casually to one another. Underhill suggests that this style should be used as a target for
learners to aim at, especially in their listening activities.


Rhythm is the pattern formed by the stressed syllables being perceived as peaks of
prominence. Pronouns, determiners, auxiliaries, prepositions, conjunctions are generally
unstressed, unlike lexical verbs, adverbs, adjectives and nouns, which are usually stressed.
The utterances
He ‘said to them that he was ‘sorry
He ‘said he was ‘sorry
are of different length, yet they have two stresses each, occurring at approximately the same
interval of time in ordinary speech. Between the first and the second stress in the first
example there are five syllables, while in the second utterance the interval is made up of
two syllables only. The five syllables in the first utterance and the two in the second are
pronounced in roughly the same interval of time.
Two broad kinds of rhythm can be found in natural languages. One kind may be
typical of a particular language, while with some languages there is a mixture of the two.
In syllable-timing, the tempo depends on the syllable, so that all the syllables are of
about the same length. French, Italian and Romanian are languages that display this kind of
rhythm.
Stress-timing depends on more unequal and irregular units, rhythm groups or feet,
which contain various numbers of syllables, yet tend to be pronounced in roughly equal
intervals of time. We say that English has a tendency for stress-timed rhythm, that is,
what matters in establishing the rhythm of an utterance is not its total number of syllables,
but the number of its stresses. Therefore, an utterance with two stresses and three syllables
will be pronounced in about the same interval of time as one with two stresses and seven
syllables.
The tendency towards a regular beat is much more conspicuous in spontaneous
speech than in monitored speech such as lecturing or reading aloud; it is more marked in
British and Australian than in American and Canadian English.
When a number of syllables are squeezed between two stresses in a short space of
time we have to use their weak forms to pronounce them very quickly and without much
effort. The more unstressed syllables there are between two stresses, the quicker they must
be pronounced ( ...‘quickertheymustbepro ‘nounced ).

Every word, when uttered alone, has at least one stress. In continuous, rhythmic speech, we
do not place stresses on all the words, though. Not only the grammatical words mentioned
above, but also content words like nouns may lose their stress or have their stress pattern
altered when we speak quickly. From individual word stress we thus move to utterance
stress, tone unit stress, rhythm for short.
The principle of stress-timed rhythm stipulates that the strongly stressed syllables of
an utterance should be separated by roughly equal intervals of time. The relatively equal (in
terms of time, not of number of syllables) stretches of utterance beginning with a stressed
syllable and extending up to the next are called rhythm groups. As an illustration, let us
divide the previous sentence (if we read it aloud, it becomes an utterance) into rhythm groups:
the‘relatively ‘equal ‘stretchesof ‘utterancebe ‘ginningwitha
‘stressed ‘syllableandex ‘tending ‘uptothe ‘nextare ‘called
‘rhythm ‘groups
The initial unstressed syllables will be attached to the first rhythm group, as in
the’relatively. Here are some more illustrations:
Ishould’thinkitwouldbe ‘betterto ‘waittillto ‘morrow
She’boughtusa ‘loafof ‘bread
‘Whatade ‘licious ‘cakeyouhave ‘made
When you practise you should say each group separately, with a pause after it, disregarding
word boundaries, in order to get the feel of the rhythm. Of course, one should not make
such distinct pauses in actual speech, where the groups should be joined smoothly to make
a continuous whole. The average tempo (speed) of an utterance is usually established as
soon as the first two rhythm groups have been spoken. They will set the rhythm of our
speech, and the remaining groups will fit in as nearly as possible. It will be noticed that the
rate of speaking constantly varies.
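The division into rhythm groups illustrated above follows a mechanical procedure, which can be sketched as follows. The sketch assumes the utterance has already been split into syllables and marked for stress by hand; the function name rhythm_groups and the input format are illustrative assumptions.

```python
def rhythm_groups(syllables):
    """Split an utterance into rhythm groups.

    syllables is a list of (text, stressed) pairs in spoken order.
    Each group starts at a stressed syllable and runs up to the next one;
    unstressed syllables before the first stress join the first group.
    """
    groups = []          # each group is a list of syllable strings
    seen_stress = False  # True once the first stressed syllable is reached
    for text, stressed in syllables:
        if not groups:
            groups.append([text])
        elif stressed and seen_stress:
            groups.append([text])
        else:
            groups[-1].append(text)
        seen_stress = seen_stress or stressed
    # Word boundaries are disregarded inside a group, as in the examples.
    return ["".join(group) for group in groups]


utterance = [
    ("She", False), ("bought", True), ("us", False), ("a", False),
    ("loaf", True), ("of", False), ("bread", True),
]
print(rhythm_groups(utterance))
```

For the utterance She bought us a loaf of bread this yields the three groups Sheboughtusa, loafof and bread, matching the division given in the text.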
When stresses occur close together ( ‘when ‘stresseso ‘cur ‘closeto ‘gether), the
speed decreases. In a poem, the idea of painful, slow progress may thus be suggested as in
‘cut ‘grass ‘lies ‘frail
four short rhythm groups that considerably affect the tempo of the utterance. If the rhythm
groups contain a large number of syllables, speed is bound to increase noticeably, as in
There’wasa ‘young ‘ladyfrom ‘Niger / who’rodeonthe ‘backofa
In the example above it is not so much the idea of speed that is important as the irregularity
produced by the contrast between the number of syllables of the rhythm groups, as well as
the idea of regularity introduced by the funny rhyme. Also consider:
He’realisedthatthe ‘bus ‘wasntgoingto ‘stopforhim.
Between rea and bus there are three syllables; and between was and stop there are also
three. But between bus and stop there are none. Therefore we reduce speed on bus and rest
upon that syllable for a considerable while. Then we pick up our former speed between was
and stop. The pause on bus will not be enough to make the second group exactly as long as
the first and the third. But it is enough to let us feel the rhythmic beat on those four
syllables as the backbone, as it were, of the sentence.
Successful results in speaking fluently and naturally largely depend upon patient
analysis and much observation of native speakers, in order to acquire a sense of sentence (or
utterance) rhythm. And the most important single factors are knowing where to place the
stresses marking the rhythm groups and which syllables not to stress, which syllables to
pronounce in their weak forms.


Assimilation occurs when we find a phoneme realised differently as a result of being near
some other phoneme. It usually happens at word boundary, one or several sounds being
changed under the influence of adjoining sounds. The phenomenon is more marked in
rapid, casual speech.
In I have to /hæftə/ go now, voiceless /t/ in /tə/ has devoiced the
preceding /v/ sound, turning it into voiceless /f/. Assimilation of voice has become
institutionalised with the suffix -s with nouns and verbs. Thus, if a voiced sound precedes
the suffix, the latter will be realised as /z/: rag - rags /rægz/, clean - cleans /kli:nz/.
Assimilation of voice (alveolar voiced fricative /z/ turning into its voiceless counterpart,
/s/) is also apparent in the following examples:
/hæz/ turns into /s/ - What’s happened?
/Iz/ turns into /s/ - It’s easy.
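The voicing agreement of the -s suffix can be sketched as a small function. This is a minimal illustration: the name s_suffix and the sound sets are assumptions made for the example, and the sibilant case (/Iz/, as in buses) is a standard addition that the passage above does not discuss.

```python
VOICELESS = {"p", "t", "k", "f", "θ"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}


def s_suffix(final_sound):
    """Return the realisation of the -s suffix after the stem-final sound."""
    if final_sound in SIBILANTS:
        # Not covered in the text: after sibilants the suffix is /Iz/.
        return "Iz"
    if final_sound in VOICELESS:
        # A voiceless sound is followed by voiceless /s/.
        return "s"
    # Any other sound (voiced consonant or vowel) is followed by voiced /z/.
    return "z"


print("rag +", s_suffix("g"))
print("cat +", s_suffix("t"))
```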
Assimilation involving manner and place of articulation can be noticed in the following illustrations:
/s, z/ change to /∫, ʒ/ (alveolar fricatives assimilated by post-alveolar fricatives or
palatal approximant):
This shape - /ði∫ ∫eip/; has she come - /hæ∫ ∫i kʌm/; this year - /ði∫ jiə/;
those years - /ðəuʒ jiəz/.
/n/ changes to /ŋ/ (alveolar nasal assimilated by velar nasal):
in case - /iŋ keis/.
/n/ changes to /m/ (alveolar nasal assimilated by bilabial nasal):
ten monkeys - /tem mʌŋkiz/, ten pencils - /tem penslz/
/t, d/ change to /p, b/ (alveolar plosives assimilated by bilabial plosives):
hit man - /hIp mæn/; don’t be rude - /dəump bi ru:d/; good boy - /gub boi/.
/t, d/ change to /k, g/ (alveolar plosives assimilated by velar plosives):
I don’t care - /ai dəuŋk keə/; good game - /gug geim/.
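The place assimilations listed above can be summarised as a lookup on the word-final consonant and the sound that follows it. This is only a sketch: the name assimilate and the tables are illustrative assumptions, the rules are applied to any bilabial or velar trigger as a slight generalisation of the examples, and real speech applies them optionally.

```python
BILABIAL = {"p", "b", "m"}
VELAR = {"k", "g"}
TO_BILABIAL = {"t": "p", "d": "b", "n": "m"}
TO_VELAR = {"t": "k", "d": "g", "n": "ŋ"}
TO_POSTALVEOLAR = {"s": "ʃ", "z": "ʒ"}


def assimilate(final, following):
    """Return the assimilated form of a word-final alveolar consonant.

    final is the last sound of the first word, following is the first
    sound of the next word. Sounds outside the tables are left unchanged.
    """
    if following in BILABIAL and final in TO_BILABIAL:
        return TO_BILABIAL[final]
    if following in VELAR and final in TO_VELAR:
        return TO_VELAR[final]
    if following in {"ʃ", "j"} and final in TO_POSTALVEOLAR:
        return TO_POSTALVEOLAR[final]
    return final


print(assimilate("n", "m"))  # ten monkeys
print(assimilate("d", "g"))  # good game
```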
Assimilation may also occur within words themselves. In this case, they may - or
have - become established as such: literature /litrit∫ə/, treasure /treʒə/, issue /i∫ju:/.
Assimilation may also be described in terms of the direction in which it works.
There are three possibilities:
(1) regressive (or anticipatory) assimilation: the sound changes because of the influence of
the following sound, e.g. ten boys / tem boiz/; this is particularly common in English in
alveolar consonants in word-final position;
(2) progressive assimilation: the sound changes because of the influence of the preceding
sound, e.g. staunch supporter articulated with the s of supporter turning into /∫ /, under the
influence of the preceding /t∫ /; such assimilations are less common, though;
(3) reciprocal assimilation: there is mutual influence, or fusion, of the sounds upon each
other, as in don’t you, did you, could you pronounced dontcha, didja, couldja. The /t/
and the /j/ have fused to produce an affricate.


Elision means the dropping of a sound or sounds, either within the body of a word or at
word boundary.
Within a word, an unstressed vowel between two consonants is often elided: several
/sevrl/, national /næ∫nl/, correct /krekt/, ordinary /ɔ:dnrI/, strawberry /strɔ:brI/, police
/pli:s /.
Final / v/ is often lost before consonants: lots of them (lotsa them), pint of beer
(pInta beer).
In the section of the course dealing with strong and weak forms we met instances of
elision in such combinations as
He must have /əv/ done it before
Don’t hit him /Im/
Does he /I/ like her /ə/?
As with assimilation, elision is typical of casual speech. Producing elisions is
something we Romanians as foreign learners needn’t learn, but it is important to be aware
of the fact that when native speakers of English talk to each other, quite a number of
phonemes that we expect to hear are not actually pronounced.


Linking of words by means of an /r/ sound is frequent in English. Although RP English does
not have the sound /r/ at the end of words, it does use such a sound as a linking element
when the following word begins with a vowel: for instance /frInstns/, sore eyes /sɔ:raIz/,
four aces /fɔ:reIsiz/.
Many RP speakers use /r/ in a similar way to link a word ending with a vowel (so,
with no final orthographic r) with another word beginning with a vowel (what is called an
intrusive /r/). Usually, the final vowel after which the intrusive /r/ is inserted is /ə/: vodka
and tonic /vɒdkərən 'tɒnIk/, formula A /fɔ:mjələr'eI/, the idea of it /ði aI'dIərəvIt/,
media event /mi:diəri'vent/.


‘Intonation is the soul of a language while the pronunciation of its sounds is its body, and
the recording of it in writing and printing gives a very imperfect picture of the body and
hardly hints at the existence of a soul.’ (R. Kingdon 1958) If we take this statement
seriously, it follows that the issue is of tremendous phonological importance.
Intonation, stress, and rhythm are all concerned with the perception of relative
prominence.
We speak of stress when we consider the prominence with which one part of a word
is distinguished from other parts.
Rhythm is the pattern formed by the stresses of an utterance.
We speak of intonation when we associate relative prominence with pitch, the
aspect of sound which we perceive in terms of ‘high’ or ‘low’, and with pitch change; thus,
we will say that the intonation nucleus in the following sentence has a falling tone (the
pitch goes down): The man has \ gone /ðə mæn əz \ gɒn/
and that it would have the discourse function of a question if the nucleus had a rising tone:
The man has / gone? /ðə mæn əz / gɒn/.
We may say that intonation is the music of speech. The changes in pitch that
intonation is associated with create different tunes. The use of an inappropriate tune may
lead to misunderstandings, mainly as far as the attitude of the speaker is concerned.


Intonation is normally realised in tone units consisting of a sequence of stressed and
unstressed syllables. On occasion the tone-unit may be made up of a single pitch-prominent
syllable. The peak of greatest prominence is called the nucleus or the tonic syllable. This
syllable is likely to be found in the last content word in the tone unit (noun, lexical verb,
adverb or adverbial particle).
Identifying the tonic syllable in a tone unit is very important in speech. It is equally
important to decide which syllables are not to be stressed. It has been said that learning to
unstress is more important than learning to stress. Stressing too many syllables in speech is apt to give
foreigners away or to make them sound affected.
The part of the tone-unit from the first stressed syllable up to (but not including) the
tonic syllable is called the head. The following utterance may be divided into tone units,
further divided as follows:
She phoned me at seven and told me she was coming =
She phoned me at seven (tone unit 1)+ and told me she was coming (tone unit 2)
Tone unit 1: She = pre-head
             ‘phoned me at = head
             /se = nucleus
             ven = tail
Tone unit 2: and = pre-head
             ‘told me she was = head
             \ com = nucleus
             ing = tail
The syllable or syllables (if any) between the nucleus and the end of the tone unit is/are
called the tail. When necessary to mark stress in a tail, a raised dot may be used, as in:
\ Seldom does he . do that.
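The division of a tone unit into its parts can be sketched as a short function. It assumes the syllables have been marked for stress and the tonic syllable has been identified by hand; the name split_tone_unit and the input format are illustrative assumptions, and the pre-head is taken to be whatever precedes the first stressed syllable, as in the example above.

```python
def split_tone_unit(syllables, nucleus_index):
    """Divide a tone unit into pre-head, head, nucleus and tail.

    syllables is a list of (text, stressed) pairs; nucleus_index is the
    position of the tonic syllable in that list.
    """
    if not 0 <= nucleus_index < len(syllables):
        raise ValueError("nucleus_index is outside the tone unit")
    texts = [text for text, _ in syllables]
    # The head starts at the first stressed syllable, but never after the nucleus.
    first_stress = next(
        (i for i, (_, stressed) in enumerate(syllables) if stressed),
        nucleus_index,
    )
    head_start = min(first_stress, nucleus_index)
    return {
        "pre-head": texts[:head_start],
        "head": texts[head_start:nucleus_index],
        "nucleus": texts[nucleus_index],
        "tail": texts[nucleus_index + 1:],
    }


unit_1 = [("She", False), ("phoned", True), ("me", False),
          ("at", False), ("se", True), ("ven", False)]
print(split_tone_unit(unit_1, 4))
```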


While one speaks, the pitch of the voice may remain at a constant level, or it may move
from one level to another. The word used for the behaviour of the pitch is tone; a one-
syllable word can be uttered with either a level tone or a moving tone. English speakers do
not use level tones very often. Moving tones are more frequent; if English speakers want to
say yes or no in a definite, final manner, they will probably use a falling tone. If they want
something else they are likely to use a rising tone.
The English tone system is based on an opposition between falling and rising tone,
in which falling pitch conveys certainty and rising pitch uncertainty. The falling/rising
opposition is the most important one, and it probably plays some part in the system
of every language. In English, where it plays a very significant role, the distinction has to do
with polarity, the positive/negative opposition. Thus, falling pitch means ‘polarity known’,
while rising pitch means ‘polarity unknown’.
It should be mentioned that no particular tone has the exclusive privilege of
occurring in a specific context, one tone being able to replace another one in certain
situations; intonation is more a matter of tendencies and principles than of cut-and-dried
rules.

It consists of a fall of the pitch from a high note (high frequency of vibration of the vocal
cords) to a low note. If someone is asked a question and replies \ Yes or \ No, one will
gather that the question is now answered and that there is nothing more to be said. The fall
indicates that the information is complete and final.
In longer stretches of speech, this tone is used with wh- questions, straightforward
orders and exclamations:
Where did you \buy it?
Come \in!
Isn’t she \lovely?
In the last example, word order and a possible question mark might give the impression that
this is a yes/no question associated with a rising tone. Actually, such structures are nothing
more than special kinds of statement, reinforced by some sort of emotional colouring.

This tone conveys the impression that there is more to come:
A: Excuse me -
B: / Yes (meaning ‘I am inviting you to continue’)
A: Do you know Laura?
One possible reply from B would be / Yes?, inviting A to continue with what she intends to
say about Laura after establishing that B knows her. To reply instead \ Yes or \ No would
give a feeling of ‘finality’ or ‘end of the conversation’. We can see similar ‘invitations to
continue’ in someone’s response to a series of instructions or directions. For example:
You take three / eggs -
/ Yes?
/ break them
/ Yes?
and put them in a \ frying pan.
With no, a similar function can be seen, for instance, if B is asked by A: Have you
seen Ann?; the reply with \ No implies quite clearly that the interlocutor has no interest in
continuing with that topic of conversation. However, a reply with /No? would be an
invitation for A to explain why he is looking for Ann.
Similarly, someone may ask a question that implies readiness to present some new
information, for instance:
Do you know who managed the longest balloon flight?
If B replies / No? he is inviting A to go on, while a response of \ No could be taken as lack
of interest.
Greetings such as good morning, good afternoon, good evening sound more friendly
if uttered with a rising tone. With the falling tone, they may sound aggressive, cold or
simply business-like. With goodbye, however, only the rising tone sounds polite or
The rising tone is also used in longer stretches of speech that function as statements
which, apart from giving information, are supposed to sound reassuring and encouraging. It
is used with the imperative when one wants to tone down the imposition made on the
Come /here.

This tone or tune is effected by a fall of the voice from a high note to a low one and after
that by a rise from the low note to a high one again.
It is commonly used to express ‘response with reservations’ or ‘limited agreement’.
Something seems certain, but turns out not to be. There is a but about the statement:
A: They say the curry in this restaurant is not very hot.
B: \/ Yes.
B’s reply is supposed to indicate that he or she does not fully agree with what A has just
said, and one expects A to ask B why s/he is unwilling to agree in a definite manner.
The fall-rise is also used to correct something a person has just said. It is used on the
word which one particularly wants to correct:
A: So tomorrow you will be \ eighteen.
B: \ Nine / teen.
A no uttered with a falling-rising tone accompanied by a ‘negative’ face will
naturally express uncertainty, doubt, reservation. However, if it is accompanied by a
‘positive’ face, it can be seen as a variant of the rising tone, thus expressing encouragement,
interest, urgency.


A rising-falling tone is meant to express strong feelings of surprise, approval or
disapproval. It is used on strong, especially contradicting, assertions, and amounts to a
high variety of the falling tone. It means ‘this seems uncertain, but turns out to be certain’,
often carrying the implication of ‘you ought to know that’. It is not usually considered
useful for foreign learners to acquire it; nevertheless, it is worth getting used to recognising
it in the speech of native speakers:
A: You wouldn’t call your friend a jerk, would you?
B: /\ No.


This tone is used in rather restricted contexts. Therefore it is not one of the basic tones
briefly described above. More often than not it conveys the feeling of saying something
routine or boring. A teacher calling the names of pupils from a register often does so using
a level tone on each name, and the pupils are likely to respond with a level tone on yes
when their name is called.

Authors like Hortensia Pârlog (1997:144-145) describe compound tunes or tones in
utterances with more than one nucleus.
Thus, additional emphasis, considerable insistence can be conveyed by a series of
falling nuclei: \ Don’t be ri \ diculous! \That was \absolutely \fabulous!
By contrast, a series of high rising nuclei may convey surprise, excitement,
incredulity: Will you / do that a /gain to / morrow?
A tune that may suggest apology or reassurance is one featuring a falling nucleus
followed by a falling-rising one: I \ won’t do that a \/ gain.


The functions that intonation can perform can be linked to the general functions of
language. Roman Jakobson identified six ‘factors’ in human communication: the speaker,
the addressee, the code (the conventions of the language common to speaker and
addressee), the message, the context (the entities the speaker talks about) and the contact
(the relations between speaker and addressee). Related to these factors are six language
functions:
1 Emotive: the speaker expresses his/her feelings;
2 Conative: the speaker seeks the achievement of a goal by influencing the addressee;
3 Metalingual: the speaker is talking about the items of the code;
4 Poetic: the message of the speaker draws attention to itself, to its patterns, organization,
sound symbolism, etc.;
5 Referential: the primary concern is with the communication of factual information;
6 Phatic: the speaker focuses on achieving a relationship with the addressee.
M.A.K. Halliday boils down the various uses and functions of language to three
macrofunctions that have to do with describing reality, interacting with people to get things
done and constructing more or less extended messages, written and oral. These can be
related to the three basic functions of intonation postulated by authors like Chitoran (1978)
and Hortensia Pârlog (1997): grammatical (the way in which intonation can give clues as to the
linguistic structure of utterances), attitudinal (how intonation can convey attitude) and
accentual (intonation can give clues about how information is organized in terms of given
and new).


Intonation is not a superficial process that has to do only with the ‘outer layer’ of
communication; it contributes to the clarification of linguistic communication in terms of
syntax, semantics and pragmatics.
It can convey grammatical information, by indicating phrase, clause and sentence
boundaries, by distinguishing between a statement and a question.
By clarifying the interplay between syntactic form and semantic content in specific
situations of conversation it serves an important pragmatic purpose.
By the use of contrastive stress (i.e. picking out a marked tonic syllable, not one you
expect in a certain context) placed on a particular syllable/word in an utterance it affects
information organization and expresses different presuppositions in discourse. Intonation
expresses the attitude of the speaker: a slow tempo associated with a level tone or little
variation in pitch may indicate that someone is bored; a preponderance of falling tones
might indicate an authoritarian attitude; rising tones may be encouraging or reassuring,
downtoning the effect of a directive, etc.
A good example of the interplay of syntax, semantics and pragmatics by means of
intonation is seen in the use of the vocative.
A vocative is an optional nominal element added to show to whom the utterance is
addressed. It may be a call, drawing the attention of the person addressed, or a form of
address, showing the speaker’s attitude to the person spoken to:
Your Excellency, the guests have arrived
Mike, you’re wanted on the phone
It’s not me, sir, it’s somebody else
Intonationally, the vocative is set off from the rest of the clause either by constituting a
separate tone unit or by forming the tail of a tone unit. The most characteristic intonations
are falling-rising for an initial vocative functioning as a call and rising for a vocative
functioning as an address. Vocatives can be:
1) Names, with or without a title: Pete, Mr Peter Jones, General Schwartzkopf, Dr James.
2) Terms showing family relationships (sometimes capitalized): Mom, Dad, auntie, granny,
3) Titles of respect: my Lord, your Ladyship, your Excellency, your Hono(u)r, your Majesty,
madam, sir.
4) Markers of status: Mr President, Father [for priest], Sister [for a nun], professor, colonel,
5) Terms for occupations: waiter, cabbie, barmaid, bartender, nurse, officer [for a
policeman, not an army officer].
6) Epithets showing a favourable evaluation - darling, honey, love, dearest, sweetie-pie
<esp AmE>, gorgeous, handsome, or an unfavourable one - bastard, creep, coward, idiot.
7) General nouns often used in more specialized senses: buddy, guys, ladies and gentlemen,
mate <BrE>, partner <AmE>, son. Except for ladies and gentlemen, these are usually
familiar and considered impolite when addressed to a person one is not familiar with.
8) Nominal clauses: Whoever said that, come out here!
To address somebody you know by last name preceded by title (Mr Smith, Dr
Jekyll) is a formal manner of address. It is now much easier to be ‘on Christian name
terms’ <BrE> or ‘on a first name basis’ <esp AmE> than previously; address by family name
alone, which used to indicate male comradeship, is sometimes used today in the armed
forces and at school.
Vocatives addressed to strangers express a relationship or attitude. It is worth noting
that these forms of address are rather limited in contemporary English, particularly in
British English.
(phonetics and phonology, linguistics, linguistic sciences, applied linguistics)


absolute participle clauses
Participle clauses whose subject is not that of the main clause are acceptable constructions
called absolute clauses: His eyes staring blankly into the sky, he looked like
a statue. Absolute clauses can function as sentence adverbials, usually taking initial
position: God willing, the expedition will succeed. Weather permitting, the finals will be
played according to schedule. Everything considered, it was an interesting meeting.

accent
An a. is the auditory effect of those features of pronunciation which indicate where a person
is from, both regionally and socially.
There are many different accents in England, and the range becomes much wider if the
accents of Scotland, Wales and Northern Ireland are also considered. Within the accents of
England, the main distinction is made between northern and southern ones. The regionally
neutral accent associated with public-school education is called RP. A ‘broad’ accent is one
which differs widely from RP.

act
In the analysis of spoken discourse based on classroom activity data, Sinclair and Coulthard
(1975) put forth a descriptive framework consisting of acts, moves, exchanges,
transactions, lessons. These elements are arranged hierarchically, such that acts combine to
form moves, moves combine to form exchanges, and so on. It is worth noting that an
utterance may contain more than one move, being intermediate between move and

adjacency pair
In conversational analysis, an a.p. is made up of a sequence of two utterances (the first part
and the second part) that are: 1) adjacent 2) produced by different speakers 3) ordered as
first part and second part 4) typed, i.e. a particular first part requires a particular second part
(or range of second parts) - an offer expects an acceptance or a decline, a greeting requires
another greeting, and each of the latter forms a pair type with the former. Other examples
are complaint-apology, complaint - denial/ admission, invitation - acceptance/ non-

adjunct
In grammar, the term is used to describe an optional element in a construction that can be
removed without the construction being affected, the clearest example being the optional
adverbials. In Quirk et al. 1972, etc., the term more specifically refers to a type of clause
constituent integrated to some extent in clause structure. Semantically, a.s are further
subdivided into focusing, intensifying, formulaic, process, place, time, viewpoint, others.

adverb and adverbial
An adverb is a part of speech, while an adverbial is a syntactic, functional element, along
with S, P, O, C.

affricate
In phonetics, in terms of manner of articulation, the term refers to a sound made when the
completely obstructed air-stream is released gradually, not suddenly. Its beginning is like a
plosive, its ending like a fricative. The initial sounds in Jim and chap are a.s.

agreement or concord
It describes a relationship between linguistic elements, by means of which a form of one
item asks for a corresponding form of another. The term concord has been more widely
used in linguistic studies lately.

allophone
An a. is a variant of a phoneme: in well, the final sound is a ‘dark l’, in ‘Luke’, the initial
sound is a ‘clear l’. The two sounds are allophones of the phoneme / l /.

anaphoric reference
It denotes backward-looking reference, a pronoun referring back to an antecedent that has
been expressed: Kevin stayed a whole week; before he left, he tried to get in touch with you.
He has a.r., referring back to its antecedent, Kevin.

anticipatory it
This type of it is found in extraposition, where it anticipates an item in the sentence: It’s
great to have another such opportunity. This anticipatory subject is also called extrapositive
or preparatory it, and it should be distinguished from the prop or dummy it found in It’s
snowing/cold/two hours/twenty miles.

antonymy
In semantics, a term that refers to contrasts of word meaning, mainly in adjectives, but also
nouns, and sometimes verbs. A. is a set of sense relations, along with synonymy and
hyponymy. It covers all types of semantic oppositeness. Some antonyms are gradable:
‘good’ vs ‘bad’. Others are ungraded: ‘dead’ vs ‘alive’. Another type of antonyms includes
converse pairs or relational opposites: ‘to borrow’ is the converse of ‘to lend’. As it is a
matter of controversy how many types of opposites are to be considered in semantic
analysis, the use of the term a. should be viewed with caution.

applied linguistics
A branch of linguistics that applies linguistic theories, methods and findings to the
clarification of problems that appear in other fields. Its most important application so far
has been in the study of second and foreign language learning and teaching. A.l. relates the
study of language to such fields as lexicography, translation, speech pathology. It uses
information from sociology, psychology, anthropology, information theory, as well as from
linguistics in order to develop its own theoretical models, and then uses this theory in
practical areas such as syllabus design, speech therapy, stylistics. There is a fuzzy boundary
between a.l. and such interdisciplinary branches of linguistics as sociolinguistics and
psycholinguistics, as the latter’s preoccupations include practical outcomes of a plainly
applied kind.

apposition
A term used in grammar to refer to a sequence of units, usually NPs but also clauses (that-
clauses, wh-clauses, to-infinitive clauses) with identical reference and grammatical
function. The appositional phrase is usually marked off by commas in writing, or a separate
tone unit in speech: Nick Carraway, the narrator of The Great Gatsby, ... . Other examples
of appositive clauses: His pledge to look after the children... The suggestion that they
should retire...

appositional and non-appositional coordination
A distinction is made between a. and non-a. coordination. Under non-a. coordination we
include cases that can be treated as an implied reduction of two clauses. These have a verb
form in the plural : Tom and Jerry are now ready (Tom is and Jerry is...)
What he says and what he thinks are none of their business. Conjoinings expressing a
mutual relationship also take a plural verb form: Her dilemma and yours are alike. With the
less common a. coordination, however, no such reduction is possible at all, for the
coordinated structures refer to the same thing. Hence a singular verb form is used: The old
friend and the subsequent editor of his poetic works was by his deathbed (compare with *
Filosoful si scriitorul Mihail Sora au împlinit 80 ani, a ‘howler’ in a local literary
periodical, Paradigma). Some latitude is allowed in the interpretation of abstract nouns:
Winifred’s cheerfulness and buoyancy was the cause of the unexpected coup de foudre.
There may be doubt whether cheerfulness and buoyancy represent two distinct qualities or
only one. Invoking the principle of notional concord, we may use either singular or plural,
depending on whether unity or separateness is implied. To make things clearer, use Both...
and... when separateness is to be stressed: Both your frankness and your honesty are
highly thought of.

approximant
In phonetics, a term used to describe consonants on the basis of their manner of
articulation. In the production of /w/, /j/, /r/, /l/, one articulator approaches a place of
articulation, but the degree of narrowing does not cause audible friction.

article
In grammar, the term refers to a subclass of determiners which plays an essential role in
differentiating the uses of nouns. They may be definite (the) or indefinite (a/an).

aspect
A grammatical category of the verb that mainly refers to how a language encodes the
duration or type of temporal activity denoted by the verb. A. is concerned with the internal
character of the event as it is presented by the speaker. It focuses on such contrasts as
durative - non-durative, perfective - imperfective, progressive - non-progressive.
Progressiveness is one type of imperfectivity. The progressive - non-progressive aspectual
contrast is well marked in English.

aspiration
When /p, t, k/ occur at the beginning of a stressed syllable, a slight puff of breath (called
aspiration) is heard immediately after the release of the consonant. Therefore, ‘Tom’,
‘Pete’, ‘Kate’ are pronounced as /tʰom/, /pʰi:t/, /kʰeIt/. Aspiration does not usually occur
when the voiceless plosives are preceded by /s/, as in ‘span’. When a voiceless plosive
precedes a vowel in an unstressed syllable, the aspiration that may occur is relatively weak.
In final position, /p,t,k/ may have no audible release or may be replaced by a glottal stop.
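
The rule for stressed syllables can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each syllable is given as a list of phoneme symbols, stress is supplied by the caller, and the function name mark_aspiration is hypothetical. The weak aspiration of unstressed syllables and the unreleased final plosives are not modelled.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}

def mark_aspiration(syllable, stressed):
    """Return a copy of the syllable with ʰ added to an aspirated plosive.

    syllable: list of phoneme symbols, e.g. ["t", "o", "m"]
    stressed: True if the syllable carries stress
    """
    marked = list(syllable)
    # Only a syllable-initial /p, t, k/ in a stressed syllable is marked.
    # In 'span' the syllable begins with /s/, so the plosive is not initial
    # and stays unmarked.
    if stressed and marked and marked[0] in VOICELESS_PLOSIVES:
        marked[0] += "ʰ"
    return marked

print("".join(mark_aspiration(["t", "o", "m"], True)))       # tʰom
print("".join(mark_aspiration(["s", "p", "a", "n"], True)))  # span
```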

assimilation
A term in phonology that refers to the influence exerted by one sound upon the articulation
of another, so that the sounds become similar, or identical. Assimilation is partial in tin bin
/ tim bin /, where the n of tin has adopted the bilabiality of b, turning into m; it may be
total, as in thin mouse /θim maus /, where the m of mouse has completely assimilated the n
of thin. In terms of the direction in which a. works there are three subtypes: regressive or
anticipatory a. - a sound assimilates its preceding sound, as in thin mouse above;
progressive a.: the sound changes because of the influence of the preceding sound, e.g.
staunch supporter articulated with the s of supporter turning into /∫ /, under the influence of
the preceding /t∫ /; reciprocal a.: there is mutual influence, or fusion, of the sounds upon
each other, as in don’t you pronounced dontcha. The /t / and the /j / have fused to produce
an affricate.
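
The tin bin and thin mouse examples can be sketched as a small Python function. It is a minimal illustration, not a general model of assimilation: it covers only word-final /n/ before a bilabial, words are given as lists of phoneme symbols, and the name assimilate_final_n is hypothetical.

```python
BILABIALS = {"p", "b", "m"}

def assimilate_final_n(first, second):
    """Regressive assimilation of word-final /n/ before a bilabial.

    first, second: lists of phoneme symbols for two adjacent words.
    Returns (new_first, kind), where kind is "none", "partial" or "total".
    """
    if not first or not second:
        return list(first), "none"
    if first[-1] != "n" or second[0] not in BILABIALS:
        return list(first), "none"
    # /n/ takes over the bilabial place of the following consonant.
    new_first = first[:-1] + ["m"]
    # The result is identical to the following sound only if that sound is /m/.
    kind = "total" if second[0] == "m" else "partial"
    return new_first, kind

print(assimilate_final_n(["t", "i", "n"], ["b", "i", "n"]))
print(assimilate_final_n(["θ", "i", "n"], ["m", "a", "u", "s"]))
```

The first call returns the partially assimilated form of tin, the second the totally assimilated form of thin.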

attributable silence
In conversation analysis, a.s. describes what happens when one speaker gives the floor to
another speaker and the latter chooses to be silent, his/her absence of contribution to the
conversation becoming significant, the silence being attributable to him/her. The less the
two interlocutors are familiar with each other, the more significant a.s. becomes.

auxiliary
A term referring to a class of verbs that help to make distinctions of mood, aspect, voice,
etc. in the VP. It may be a primary auxiliary (do, be, have) or a modal auxiliary (can/could,
may/might, shall/should, will/would, must).


backchannel signal
Such signals are instances of feedback given by listeners to speakers engaged in extended
turns, showing that they (the listeners) are in touch. In face-to-face and telephone conversation
these signals may include ‘I see’, ‘yeah’, ‘uh-huh’, ‘mm’.

bilabial
In phonetics, in terms of place of articulation, a consonant articulated with both lips, such
as /p/, /b/, /m/.

bound morpheme
A morpheme which is structurally dependent on the word to which it is added, e.g., the
plural morpheme.

boundary markers
In conversation analysis, b.m.s are linguistic signals like Well, Now, Right, Good, OK,
Listen, usually spoken with falling intonation, whose function is to mark boundaries in
conversation. They may mark both the beginning and ending of a topic, or either its
beginning or ending.


calque
A c. is a word or phrase which is translated directly into a language from another, retaining
the structural properties of the first language. Sometimes a c. is institutionalized (Governor
general), sometimes it may be part of a learner’s interlanguage or a matter of lexical play
(“a battle of game” would be a c. from Romanian into English).

cardinal vowel chart
A set of standard reference points, devised by Daniel Jones to describe the articulation of
the vowels of a language. The chart takes into account which part of the tongue is involved
in the process, and how high or low in the mouth that part is. Its primary cardinal vowels
are not actual sounds, but extremes of vowel quality taken as points of reference for the
description of real vowels.

case
Case is a grammatical category that can express a number of different syntactic
relationships between nominal elements. The traditional case system, such as is found in
Latin grammar, is based on variations in the morphological paradigms of the words.
In languages like English, in which morphological variations are very limited, the
term case, as traditionally used, does not apply. English nouns have a two-case system, the
unmarked common case (girl) and the marked genitive case (girl’s); six pronouns (I, he,
she, we, they, who) have a three-case system, where common case is replaced by subjective
and objective case.

cataphoric reference
A form of reference in which the proform occurs before its text referent, and has to be
interpreted by means of the following co-text: Unless you book it in advance, you won’t
find any accommodation.

cleft structure
A sentence whose normal SVO pattern is rearranged to highlight a certain part of the
structure. Paul is giving a house warming party tomorrow may be ‘cleft’ into two separate
sections, each with its own verb: It is Paul who is giving a house warming party or It is a
house warming party that Paul is giving tomorrow or It is tomorrow that Paul is giving a
house warming party. The different variants affect the distribution of emphasis within the
sentence, and correlate closely with patterns of intonation prominence.

coherence
The property of a text to ‘hang together’ rather than being a set of unrelated sentences or
utterances. Coherence is sometimes seen as referring to the underlying development of
propositions in terms of speech acts, in contrast to cohesion, concerned with surface
features of connectivity.

cohesion
The formal links that mark various kinds of intersentential relations in discourse. Cohesive ties
have to do with a) explicit lexical repetition b) co-reference c) ellipsis as an implicit device.

collective noun
Nouns like committee, board, crew, family, government, army, jury, public. They can occur
in the singular with either a singular or a plural verb form, this correlating with a difference in
interpretation - the noun being seen as a single collective entity, or as a collection of
individual entities: The family are watching TV vs His family is famous.

collocation
Words collocate when they occur next to each other in texts. Although rancid, stale, fetid
all express the meaning ‘unsuitable for human consumption’, they collocate differently:
rancid butter, stale bread, fetid water. Lean collocates with meat, but thin does not. C. is
one kind of lexical cohesion.

communicative competence
The ability to use language effectively in specific contexts and for specific purposes. It has
four components: grammatical competence, sociolinguistic competence, discourse
competence, and strategic competence.

communicative dynamism
A fundamental concept of the Prague school theory of linguistics, whereby an utterance is
seen as a process of gradually unfolding meaning, each part contributing differently to the
total communicative effect. The theme, for example, has a lower degree of c.d. than the
rheme, which contains the new information that furthers the communicative activity.

complement (Cs, Co, etc)
In grammar, c. refers in its broadest sense to all obligatory features of the predicate other
than the predicator: objects and adverbials of all kinds. In a more restricted sense, it denotes
the ‘completing’ function of structures following linking verbs. In I was wrong, the last
word is the complement of the subject or subject complement (Cs), while the same word is
the complement of the object or object complement (Co) in He proved me wrong. The term
complementation may be used to refer to structures completing other items: apart from verb
complementation, there is talk of adjective complementation, for example.

concord (see agreement)

conditional clauses and conditional sentences
Conditional clauses are the subordinate clauses in conditional sentences, combinations of
main clauses and c.c.s. Syntactically, conditional sentences in English are different from
those in Romanian, as the tenses of the verbs in the subordinate clauses of condition are
usually backshifted (If I see vs Dacă voi vedea). Semantically speaking, c. sentences are of
two kinds: open (real possibilities) and hypothetical (unreal conditions). Pragmatically,
conditional situations and the sentences that express them offer a complex picture, form and
function being likely to differ widely at times.

conjunct
Like adjuncts and disjuncts, conjuncts are a subclass of the adverbial. Like disjuncts, they are
less integrated into the structure of the clause, being peripheral to it. C.s have an important
function, binding texts together. They show a large number of textual connections, being
enumerative, reinforcing, equative, transitional, summative, appositive, resultative,
inferential, reformulatory, replacive, antithetic, concessive, temporal transitive.

conjunction
A part of speech whose members join together clauses and other sentence elements.
Coordinating c.s (and, but, or) join equal elements, while subordinating ones (although, as,
because, if, once, since, unless, whereas) join elements of unequal ranks. Many
conjunctions are closely related to adverbial meaning: after, where, as long as, supposing.

consonant
Phonetically, a c. is a sound made by a closure or narrowing in the vocal tract so that the
airstream is either completely blocked, or so restricted that audible friction is effected. C.s
are described in terms of place and manner of articulation, presence or absence of vocal
cord vibration, amount of energy spent in their production. Phonologically, c.s are those
units that function at the edges of syllables, either individually or in clusters.

context and co-text
Most narrowly defined, context refers to ‘something which precedes or follows something
else’. This is also known as the verbal context or co-text. Context, more precisely
situational context, is a term used to indicate the features of the non-linguistic world in
relation to which linguistic units are systematically used. Certain linguists, particularly
systemic-functional linguists, claim that context and purpose determine the grammar and
structure of the discourse.

contoid
A term introduced by the American phonetician Kenneth Pike to help distinguish between
the phonetic and phonological notions of consonant (see consonant above). Contoid refers
to the phonetic characterization of a consonant; consonant is reserved for the phonological
sense. Its opposite is vocoid.

conversation analysis
C.a. tries to describe and explain the ways in which conversations work. This type of
analysis is rather different from that undertaken by discourse analysis, although both c.a.
and d.a. give an account of how coherence and sequential organisation in discourse are
produced and understood. C.a., as practised by Sacks, Schegloff, Jefferson, is a rigorously
empirical approach which avoids premature theory constructions. It was developed within a
sociological rather than linguistic tradition, the school itself being known by the term
ethnomethodology. Ethnomethodologists insist that data should be derived from naturally
occurring instances of everyday interaction. In particular, they reject the data obtained
through formal experiments, interviews and other forms of elicitation. Questions that
conversation analysts have investigated include the following: How do topics get
nominated, accepted, maintained and changed? How is speaker selection and change
organised? How are conversational ambiguities and indeterminacies resolved? How are
non-verbal and verbal aspects of conversation organised and integrated?

conversational implicature
An additional unformulated meaning that can be arrived at by the application of the maxims
of the cooperative principle.

conversational maxim
One of the components of the cooperative principle, described below.

cooperative principle
A term introduced by the philosopher H. P. Grice, now used in the study of conversational
structure. At its simplest it is the assumption in conversation that each participant will
attempt to cooperate with the others by being as informative as necessary, truthful, relevant
and clear (maxims of quantity, quality, relation and manner respectively).

coordination, asyndetic and syndetic
C. is a process of linkage which does not differentiate between the two elements linked.
When it is signalled by the presence of the conjunctions and, but, or it is syndetic. When
commas are used instead, it is asyndetic. C. can link NPs or whole clauses. When it links
clauses, it contrasts with subordination, where there is some implication that the
propositions of the clauses that are linked are not of equal importance. In most texts,
clauses are linked by both coordination and subordination, but the relative ratios can change
dramatically, being an indication of a particular style.

copular verb or linking verb
Such verbs have little independent meaning, their main function being to relate other
elements of clause structure, especially subject and complement. Here are some examples
of such verbs: to be tired, to look good, to feel great, to sound interesting, to grow pale, to
fall silent.

countable/count vs mass/ uncountable
The ‘countability’ distinction count vs non-count/mass/uncountable nouns is a focus of
attention in analyses of the noun phrase, because of the way it can explain the distribution
of nouns in relation to the use of such items as articles and quantifiers. Count nouns are
viewed as separate entities, used with such forms as a, many, two, etc.; mass nouns are
treated as continuous entities, having no natural bounds, and are used with much, some.
Some nouns can have dual membership, i.e. they can be at times either count or mass, with
a difference in meaning. It is worth noting that some nouns can be mass in English and
count in Romanian: plural, count informatii, progrese, bagaje, dovezi, spaghete, confetti are
to be translated as singular, mass information, progress, luggage, evidence, spaghetti,
confetti.

cratylism
C. is also known as sound symbolism, i.e. the theory that there is a natural harmony
between linguistic form and semantic content, thus contradicting the Saussurean theory of
the arbitrariness of the linguistic sign. We would all probably agree that what clinks is
smaller than what clunks, for instance. C. was powerful in the symbolist poetry of Mallarmé
and Rimbaud, certain sounds being said to evoke certain colours.


dangling participle - unrelated participle
When a participle clause has no stated subject, we assume it is the same as the main clause
subject. But in Coming round the corner, a tile fell on my head, the participle coming
cannot be related to the subject of the main clause (a tile). This incorrect use describes a d.
or u. participle. The u. p. is, however, acceptable in absolute clauses: The island looked
impregnable, massive cliffs rising out of the sea.

deference strategy
Characteristic of conversation that stresses negative politeness, the non-personal, and
freedom from imposition.

deixis
In linguistics, d. refers generally to those features of language which relate utterances to a
time, place, or the speaker’s viewpoint: now/then, here/there, I/you, this/that. The notion is
useful in several areas of linguistics, especially in pragmatics.

demonstrative
A pronoun or determiner which refers to something in terms of whether it is near to or
distant from the speaker. The demonstratives are this, that, these, those.

derivation
A term used in morphology to refer to one of the two main types of word formation, the other being
inflection. The result of a derivation is a new word: friend -> friendship.

determinative and determiner
Determinatives include a) predeterminers (all, both, double), b) central determiners (the
articles, this, some), c) postdeterminers, which follow central determiners but precede
premodifiers (e.g. adjectives).

descriptive grammar
It aims to show the facts of language usage synchronically and as they are, not how they
ought to be, with reference to some imagined ideal state. The emphasis on a given time
places it in contrast with historical linguistics, which aims to describe linguistic change.

dialect
A regional or social variety of language, marked by a particular lexis and grammatical
features. The systematic study of all forms of dialect, especially regional, is called linguistic
geography or dialectology.

diphthongization
The term describes a process in which a monophthong turns into a diphthong.

discourse analysis
Both CA and DA give an account of how coherence and sequential organization in
discourse are produced and understood.
DA employs the methodology, principles, concepts of linguistics. It is an attempt to
extend the techniques used in linguistics beyond the level of the sentence. There is a
tendency to take texts, which are often constructed by the analyst, and to give an in-depth
analysis of all the interesting features of this limited domain. Discourse analysts include text
grammarians and speech act (interactional) theorists.

dispreferred second
Alternative second parts to first parts of adjacency pairs are not of equal status; some
second turns are preferred and others dispreferred. Preference has little to do with the
interlocutors’ desires; it indicates what is commonly expected in a certain context. While
preferred (unmarked) second parts may be said to be irregular, dispreferred seconds
have much in common, notably components of delay and a higher level of complexity.
Using these terms, we will need a rule for speech production, which can be stated roughly
as follows: ‘try to avoid the dispreferred second - the action that generally occurs in
dispreferred or marked format’. The dispreferred second of an adjacency pair whose first
part is an offer will be a refusal.

dynamic vs stative verbs
The aspectual distinction d. vs s. is mainly syntactic: dynamic verbs may occur in the
progressive form and in the imperative. In terms of their meaning, dynamic verbs may be
durative or punctual, agentive or non-agentive, bounded or unbounded. Stative verbs are
durative, non-agentive, unbounded (= not bounded by an end-point).


elision
In phonology, the omission of sounds in connected speech: February may turn into /febri/,
twelfths may be elided and pronounced as /twelfs/.

ellipsis
A process whereby, for reasons of economy or emphasis, a part of a language structure has
been left unexpressed, but is easily recoverable from the context: (It) sounds good.

endophoric
E. relationships of cohesion help to define the structure of texts. They are contrasted with
exophoric relationships, whose interpretation depends on the extralinguistic situation.
Endophoric relationships are anaphoric or cataphoric.

entailment
Something that logically follows from what is asserted. If knowing that one sentence
(strictly speaking proposition) is true makes us certain about the truth of the second
sentence, then the first sentence entails the second. Entailment does not depend upon the
context in which a sentence is used.

ergative verb
Some verbs are used transitively and intransitively with different kinds of subject; the
intransitive use has a meaning rather like a passive or reflexive:
They sell cakes.
The book is selling like hot cakes.

ethnomethodology
A branch of sociology that deals with the description and interpretation of everyday spoken

exchange
One of the units of spoken discourse introduced by Sinclair and Coulthard (1975) to
describe typical classroom discourse. According to the two authors the typical exchange
consists of three elements, initiation, response, follow-up. The initiating and the follow-up
moves are performed by the teacher, the second by the pupils/students. An exchange which
consists of only two parts is perceived by S. and C. as the ‘marked form’ in which the third
part is omitted for strategic reasons. An example would be when a student gives the wrong
answer and the teacher withholds the evaluation and goes on to provide clues in order to
help the student reach the right answer.

existential there
A term used in the grammatical description of clause or sentence types, referring to a type
of structure beginning with the unstressed word there followed by the verb be.

exophoric
The term, popularized by Halliday and Hasan (1976), refers to contextual or situational
reference (exophoric), as distinct from textual reference (endophoric). E. reference may be
specific, pointing to the immediate situational context or generalized or homophoric,
referring to the larger cultural context or shared knowledge. Both types of e. reference are
very important in fiction and drama for the creation of the situational dimensions of a
certain universe of discourse.

extraposition
A process whereby an element is moved from its normal position towards the
end of the sentence. An anticipatory it substitutes for the extraposed element. E. operates
mainly on subordinate nominal clauses.


face
A term used in the sociology of language and in pragmatics to denote a person’s public self-
image. In social interaction, the interlocutors try to convey a positive message about
themselves by means of face-saving acts. The utterances or actions that threaten a person’s
public self-image are face-threatening acts.

felicity conditions
In speech act theory, the conditions that must be met for a speech act to be satisfactorily
performed. I bet you $1,000,000 that I am right, even if I am right, is not a bet if I am
broke. I pronounce you man and wife does not work if the speaker is neither a priest nor a

finite vs nonfinite
A finite verb is a form that can occur on its own in an independent sentence or a main
clause. It displays tense and mood contrasts. Non-finite forms of the verb occur on their
own only in dependent clauses, and lack tense and mood contrasts. Infinitives and
participles are non-finite.

first topic slot
In conversational analysis, an element of the overall organization of a
telephone conversation. It contains an announcement by the caller of the reason for the call.
The first topic slot, immediately following the opening section, is a privileged point: it is
almost entirely free from the topical constraints coming from prior turns. The main body of
the call is thus structured by topical constraints: the content of the first slot is likely to be
understood as the main reason for the call (whether this is true or not).

floor
The ongoing permission to speak while a conversation is going on.

foreign plurals
Some nouns borrowed from other languages have preserved their original plural form. Here
are some examples in the singular and plural: stimulus - stimuli, phenomenon -
phenomena, alga - algae, analysis - analyses.

fortis vs lenis
Fortis consonants are articulated with a strong degree of muscular effort and breath force;
the other consonants are lenis. Voiceless consonants, such as /p/, /t/, /s/, etc., are said to be
produced with fortis articulation, while voiced ones are lenis sounds.

fricative
Sounds made when an articulator and a place of articulation get so close together that the
air moving between them causes audible friction. There are pairs of voiced and voiceless
fricatives in English: /v/ and /f/, /ð/ and /θ/, /z/ and /s/, /ʒ/ and /ʃ/, and the voiceless glottal fricative /h/.

functional grammar
A linguistic theory put forth in the 1970s as an alternative to formal, abstract
transformational grammar, relying on a pragmatic view of language as social interaction. It
focuses on the rules of verbal interaction, seen as cooperation, and on the linguistic rules
and expressions that are used as instruments of this interaction.

functional sentence perspective
Associated with the Prague School, f.s.p. describes how information is organized in
sentences. It mainly focuses on the effect of the distribution of given and new information
in discourse. The known information (theme) refers to the information that is not new to the
reader or listener. The rheme refers to information that is new. F.s.p. differs from the
traditional grammatical analysis of sentences because the distinction subject - predicate is
not always the same as the theme-rheme contrast.

functions
A synonym for speech acts - the things we do with words (apologizing, threatening,
promising, flattering, instructing).

generative grammar
A set of formal rules that projects a limited set of sentences upon the practically unlimited
set of sentences that constitute the language as a whole. It does so in an explicit manner,
assigning to each a set of structural descriptions. This type of grammar is said to generate,
or produce, grammatical sentences.

generative phonology
It describes the competence a native speaker must have to be able to produce and
understand the sound system of his/her language. The phonemes are viewed as bundles of
distinctive features. Each sound displays a different set of such features.

generative semantics
It grew as a response and reaction to Chomsky’s syntactic-based transformational
generative grammar. It states that all sentences stem from (are generated by) an underlying
semantic structure. The latter is frequently in the form of a proposition that resembles
logical propositions in philosophy.

generic reference
A kind of reference that indicates a class of entities, rather than a specific member of a
class. The English is a phrase that has g.r., while two Englishmen has specific reference.

gender
A grammatical category used to describe parts of speech, particularly nouns, displaying
distinctions such as masculine/ feminine/ neuter, animate/ inanimate, personal/ non-personal.
One can distinguish between natural gender, where items refer to the sex of real-world
entities, and grammatical gender, which has next to nothing to do with sex, but which has an
important part to play in signalling grammatical relationships between words in a sentence.
English gender contrasts are on the whole natural, i.e. he usually refers to male people and
animals. We say that English gender is also covert, i.e. it makes very few distinctions.
Where they are made, the correspondence between sex and gender is very close. Gender
contrasts are commonly made by the correlation between nouns and personal and relative
pronouns, as well as, less usually, by means of suffixes and gender markers. According to
some grammarians (Schibsbye 1965), there are four genders (or gender distinctions) in
English: masculine, feminine, common (parent, for example) and neuter. Quirk et al. (1973)
describe ten gender classes as being the product of the combinations of gender-sensitive
pronouns (and, in addition, they) and relative pronouns who, which, that substitute for
singular nouns.

glottal stop and glottalization
The vocal cords can be tightly pressed together so that air is prevented from escaping
through the glottis. As a result of the compression of the air-stream behind this closure, a
glottal stop or glottal plosive is produced, for which the symbol /ʔ/ is used. When coughing,
we usually produce a succession of glottal stops. The glottal stop often occurs in English
when it reinforces or even replaces the voiceless stops /p/, /t/, /k/. Glottalisation of this
latter kind is particularly marked with the younger generations in contemporary British

government-binding theory
A model of grammar that assumes that sentences have three main levels of structure: D-
structure, S-structure and logical form. The main sub-theories are X-bar theory, theta
theory, case theory, binding theory, bounding theory, control theory and government theory.
Government-binding is often referred to as the ‘principles and parameters’ theory.

grammatical metaphor
The transformation process whereby functions that usually feature as verbs are turned into
entities represented by nouns.

hedge
A qualifying, mitigating note conveyed about how an utterance is to be interpreted, such as
‘to a certain extent’, ‘if I am not mistaken’, ‘as far as I know’.

homorganic
A term in the phonetic classification of speech sounds, referring to sounds that are produced
at the same place of articulation, such as /p/, /b/, /m/.


ideational function of language
The function that relates to information about concrete and abstract entities and states of
affairs. It contrasts with the interpersonal function that shows the attitudes or feelings of the
speakers towards each other and towards the topics of the utterances.

idiolect
The linguistic system of an individual speaker - one’s personal variety of a language.

idiom
In grammar and lexicology, a term that refers to a sequence of words which functions as a
single unit. The meaning of its individual words cannot be put together to produce the
meaning of the ‘idiomatic’ expression as a whole.

illocutionary vs locutionary, perlocutionary
Illocutionary is a term used in the theory of speech acts to refer to an act which is
performed by the speaker by virtue of the utterance having been made. Such acts include those of
promising, commanding, arresting, baptising. The term is contrasted with locutionary (the
act of saying) and perlocutionary (the act is defined by reference to the effect it has on the

indefinite
A term used in grammar and semantics to refer to an entity which is not capable of specific
identification; contrasted with definite. Indefiniteness in English is expressed by means of
the indefinite article a or an indefinite pronoun (one, some, each, etc.).

indexical
Features of speech or writing showing the personal characteristics of a language user, as in
voice quality or handwriting. More generally, the term refers to the features that identify
membership in a group, such as regional, social or occupational ‘indices’. Indexical
expressions, usually called deictic features, sometimes refer to those features of language
that indicate directly the situation within which an utterance takes place; their meaning is
thus relative to that situation.

inflection
A term used in morphology to refer to one of the two main processes of word formation, the
other being derivation. Inflectional affixes signal grammatical relationships, such as plural,
past tense and possession, and do not change the grammatical class of the stems to which
they are attached.

information focus
Speech can be seen as displaying an information structure, with formally identifiable units
of information. Intonation provides the main signal for such units. The tone unit represents
an information unit, and the nuclear tone marks the information focus.

inherent/ non-inherent
Terms used in the semantic analysis of adjectives. Rather than using dichotomies and clear-
cut distinctions, we can discuss the meaning of adjectives by using three gradients:
stative/dynamic, inherent/ non-inherent, gradable/ non-gradable. Inherent adjectives
characterise the referent of the noun directly (a good man), unlike non-inherent ones (the
referent of a good hunter is not necessarily a good man).

insertion sequence
In conversational analysis, a two-part sequence that interrupts an adjacency pair, getting
embedded between the two parts:
A: Can I go out tonight? (B: Do you promise to come back before midnight? A: Sure.) B:

interpersonal
A term used in semantics and functional grammar as part of a classification of types of
meaning. It refers to those aspects of meaning that relate to the establishing and keeping of
social relations.

intonation
In suprasegmental phonology, a term referring to the distinctive use of patterns of pitch. In
some approaches, the pitch patterns are described as contours and analysed in terms of
levels of pitch as pitch phonemes and morphemes; in others, the patterns are described as
tone units or tone groups, analysed further as contrasts of nuclear tone, tonicity, etc. The
three variables of pitch range, height and direction are generally distinguished.

intrusive r
In RP, the introduction of /r/ as a linking form after a vowel, when the following word
begins with a vowel, where there is no r in spelling: law(r) and order, media(r) event


labio-dental
The term refers to a sound in the articulation of which the lower lip acts as the articulator
that gets in contact with the upper teeth, as in /f/ and /v/.

langue vs parole
According to the distinction made by Ferdinand de Saussure, langue refers to the language
system shared by a community of speakers, and is contrasted with parole, which is the
concrete act of speaking in actual situations by an individual.

lexical cohesion
In discourse analysis, the term describes the situation when two or more words are related
in terms of their meaning. The two main kinds of l.c. are reiteration and collocation.

lexical verbs
One of the three main classes of verbs, along with primary verbs and modals. The lexical
verbs can act only as main verbs, the modals can act only as auxiliaries, the primary verbs
can act either as main verbs or auxiliaries.

lexicography
The art and science of compiling dictionaries. It can be seen as a branch of ‘applied

lexicology
The overall study of a language’s vocabulary, including its history. It can be seen as a
branch of semantics.

limited generic reference
It occurs with NP heads having of-phrase postmodification: the music of the spheres. This
type of postmodification should be compared to the alternative with adjectival
premodification. Unlike other languages, English also uses the zero article where the
reference of the NP head is restricted by premodification. This is another type of limited
generic reference: Australian sheep, Restoration comedy, American literature, Korean
industry, Chinese porcelain.

linguistics
Different branches may be identified according to the linguist’s focus and range of interest.
A major distinction is between diachronic and synchronic l., the former referring to the
study of language change (=historical l.), the latter to the study of the state of language at a
certain time. If the aim is to establish general principles for all languages, to determine the
characteristics of human language as a phenomenon, it is called general linguistics. When it
attempts to establish the facts of a particular language, it is called descriptive. When it
focuses on the differences between languages within a language-teaching context, it is
contrastive. When its aim is to identify the common characteristics of different languages or
language families, it is comparative. The term linguistic sciences has lately come to be
used by some as a cover term for both linguistics and phonetics - the latter being seen as a
pre-language study. When the subject’s findings, methods, or theoretical principles are
applied to the study of problems from other areas of experience, one talks of applied
linguistics; the term applied l. is often restricted to the study of the theory and method.

linking verb (see copular verb)


minimal pair
One of the discovery procedures used in phonology to determine which sounds belong to
the same phoneme. Two words which differ in meaning when only one sound is changed
are referred to as a ‘minimal pair’, e.g. tap vs cap.
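The minimal-pair test above can be sketched as a small program. This is only an illustration under stated assumptions: the toy lexicon, its transcriptions and the function names are hypothetical and are not part of the course material; each word is assumed to be stored as a sequence of phoneme symbols.

```python
from itertools import combinations

# Hypothetical toy lexicon (an assumption for illustration only):
# orthographic word -> phonemic transcription as a tuple of phoneme symbols.
LEXICON = {
    "tap": ("t", "æ", "p"),
    "cap": ("k", "æ", "p"),
    "tip": ("t", "ɪ", "p"),
    "cat": ("k", "æ", "t"),
}

def is_minimal_pair(a, b):
    """True if two transcriptions have equal length and differ in exactly one segment."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(lexicon):
    """Return every pair of words in the lexicon whose transcriptions form a minimal pair."""
    return [
        (w1, w2)
        for (w1, t1), (w2, t2) in combinations(lexicon.items(), 2)
        if is_minimal_pair(t1, t2)
    ]

if __name__ == "__main__":
    for w1, w2 in minimal_pairs(LEXICON):
        print(w1, "vs", w2)
```

The sketch only checks the formal side of the procedure (one differing sound); that the two words also differ in meaning is assumed here because they are separate entries in the lexicon.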

modality
The dimension of an utterance that allows the speaker or writer to show his/her attitude
towards a) the propositional content or b) the illocutionary force of an utterance. It may
display a range of attitudes: certainty, possibility, permission, ability, likelihood, obligation
and hypothesis.
By means of modality speakers can intervene in a speech event, by laying obligations,
giving permission, assessing the probability of a situation or event. Modality thus expresses
a relation to reality, whereas a declarative syntactic form treats the situation as reality. Two
broad kinds of modal meaning can be identified: intrinsic, in which some kind of human
control is present; extrinsic, in which the events are beyond human control, although they
are not beyond human judgement. Consider: I won’t do it again (volition - intrinsic); I’m
sure it won’t last (prediction - extrinsic).

monophthongization
A process showing a change in vowel quality whereby a diphthong or triphthong is
phonetically realized as a monophthong.

mood
The term refers to a number of syntactic and semantic contrasts achieved by alternative
forms of the verb (indicative, subjunctive, imperative). Semantically, a wide range of
meanings is involved. Syntactically, these contrasts may be shown by alternative
inflectional forms of a verb, or by the use of modal auxiliaries.

morpheme
The minimal unit of grammar, the central concept of morphology. The m. may be free (able
to occur as a single word) or bound (mainly affixes). Another distinction can be made
between lexical and grammatical m.s; the former are used for the construction of new
words, the latter are used to express grammatical relationships between a word and its
context, such as plurality or tense. Grammatical m.s which are separate words are called
function words.

morphology
The branch of grammar which studies the structure or forms of words, through the use of
the morpheme construct. It is traditionally distinguished from syntax, which studies the
rules applying to the combination of words in sentences. It is generally divided into
inflectional m. (the study of inflections) and lexical or derivational m. (the study of word

mutation
A term from historical linguistics that describes the change in a sound’s quality because of
the influence of sounds in neighbouring morphemes or words. The plurals of such words as
foot, tooth, goose, mouse show m.


negative face
The need to be independent, not imposed on by others; the opposite of positive face.

negative politeness
Awareness of another’s right not to be imposed upon.

negotiation of meaning
The joint work done by interlocutors to make sure that they have a common understanding
of the current meanings in a conversation. Conversational strategies that are of use are
comprehension checks, confirmation checks and clarification requests.

new information
The part of the message a speaker assumes is new to the addressee. The division theme/
rheme shows the organisation of information in terms of given information (theme) and
new information (rheme).

nominalization
The process of forming a noun from some other part of speech or the derivation of a NP
from a clause (His painting of the portrait from He painted the portrait).

non-finite
A verb, verb phrase or clause without tense or a modal, usually an infinitive or a participle
form / construction.

nucleus
The syllable in a tone unit which carries maximal prominence, usually due to a major pitch
change. It is also called the tonic syllable.

nuclear tone
The most prominent pitch movement in a tone unit. In English, such contrasts as falling,
rising, rising-falling and falling-rising are important.


object
A term used in the analysis of grammatical functions, to refer to a major constituent of
clause structure. Traditional analysis distinguishes a direct vs an indirect o. In the study of
inflected languages, objective may be used as an alternative to accusative. Some linguists
talk about ‘the object of a preposition’ to refer to the NP that follows a preposition in a

operator
The first auxiliary in a VP. Together with the predication it makes up the predicate. The
division subject - operator - predication is important in the description of the relationship
between different clause types or mood structures.


palatal
Sounds articulated in the area of the hard palate.

paradigm
A set of grammatically conditioned forms all derived from a single root or stem.

paradigmatic vs syntagmatic relationships
P. refers to a set of substitutional relationships a linguistic unit has with other units in a
specific context. P.r.s, together with syntagmatic r.s, constitute the statement of a linguistic
unit’s identity within the language system. S.r.s refer to the sequential characteristics of
speech, usually seen as a string of constituents in linear order. Sets of syntagmatically
related constituents are often referred to as structures.

pedagogical grammar
A grammar organised for learning and teaching the language structures of a foreign

perlocutionary act
In speech act theory, the effect achieved by the utterance on the addressee through the
illocutionary act.

phoneme
A minimal, distinctive functional element in the sound system of a language. Actually, it is
a set of similar sounds, found in free variation and complementary distribution that are
perceived as the ‘same’ sound. The positional variants of a p. are called its allophones.

phrasal verb
A sequence of a lexical verb plus one or several particles e.g. to take in, to go off, to turn

phrase
A single element of grammatical structure typically containing more than one word, lacking
the subject-predicate structure typical of clauses. It is seen as part of a structural hierarchy,
between the level of the word and that of the clause. Several types can be distinguished, all
of them but PrepPs (prepositional phrases) having a central element as head, which gives
the name to the particular construction: NPs, VPs, AdjPs, AdvPs.

pitch
Voice level produced by varying tension in the vocal cords and leading to sounds of lower
and higher frequency.

plosive
A sound articulated when a complete obstruction in the vocal tract is suddenly released. The
compressed air rushes out with an explosive sound, plosion. Plosive consonants are also called
stops. /p, b, t, d, k, g/ are plosives or stops.

politeness
Discourse strategies that enable a speaker to maintain face in an interaction.

positive face
The need to belong to a group, to relate to the others; the opposite of negative face.

positive politeness
Showing solidarity with another.

postmodifier (see qualifier)

pragmatics
P. deals with intentional meaning, with what people mean by their utterances, rather than
with the meaning of the utterances themselves. It also involves the evaluation of the context
and its contribution to meaning. P. also studies how there is more to meaning than what is
explicitly said.

predicate, predication, predicator
A sentence is traditionally divided into subject and predicate, a division coming from logic.
The latter may be further divided into operator (the first auxiliary in a VP) and predication.
In semantics, in the description of propositional content, a predicate may be realised by
verbs, as well as certain noun phrases, adjectives and prepositional phrases. The predicator
is a functional element of the clause, along with subject, object, complement, adverbial.

preference structure
In conversation analysis, the principle according to which one type of utterance will be
more typically found in response to another in a conversational sequence, e.g. after an
accusation, a denial is more typical than an admission; after an invitation, an acceptance is
more likely than a refusal.

prefix
An affix which is attached initially to the root or stem of a word, as in untouchable (-able in
the example is a suffix). Prefixation is an important word-formation process in English, by
means of which new lexical items can be coined.

premodifier
The items occurring before the head of a phrase and after the determinatives. Adjectives
prototypically function attributively as premodifiers.

preposition
A part of speech whose members typically precede NPs to form a single constituent of
structure, the PrepP.

presupposition, semantic and pragmatic
Taken from logic, the term is used in semantics to describe a condition that has to be met if
a particular state of affairs is to obtain. In pragmatics, p. is both what the speaker assumes
to be true in saying something and what s/he assumes to be shared knowledge between
himself/herself and the interlocutor. A child, for instance, wrongly presupposes that the
adult listening to him/her understands his/her We were there and she told me to do that, without
previously specifying the people and the circumstances involved. In Who killed the
mosquito? I presuppose that the mosquito is dead, that I do not know who did it, and that
somebody else is likely to be able to tell me who did it. One of the qualities of
presupposition is constancy under negation, i.e. even if a statement is negated, its
presupposition remains the same.

progressive
In the grammatical description of verb forms, it refers to a contrast of a temporal or durative
kind, sometimes dealt with under the heading of tense, sometimes under aspect. The usual
important contrast is between progressive or continuous and non-progressive or simple.
Linguists prefer an aspectual analysis here, because of the complex interaction of
durational, completive and temporal features of meaning involved.

proposition
P.s are the basic units of meaning that sentences and clauses express. A p. consists of an
entity that is named and an expression of a state or action associated with that entity. In The
young man in the street staring at you is a friend of mine there are several p.s: ‘the man is
young’, ‘the man is in the street’, ‘the man is staring at you’, ‘the man is a friend of mine’.
The propositional content of a sentence is its context-free meaning.

psycholinguistics
A branch of linguistics that investigates the relationships between linguistic behaviour and
the psychological processes thought to underlie that behaviour. The best developed branch
of the subject is the study of language acquisition in children.


qualifier
A structure following the head of a NP, another term for postmodifier.

quantifier
A term used in semantics or logic to refer to items that show contrasts in quantity, such as
all, some, each. In some grammars, q.s refer to a class of items expressing contrasts in
quantity occurring in the NP: much/many, several, a lot of.

question
A term used in the classification of sentence functions, typically used to elicit information
or a response. Syntactically, a question is a sentence with inversion of subject and operator,
beginning with a wh-word, or ending with a question tag (isn’t it? ..., doesn’t he?).
Semantically, questions express a desire for more information. The term is contrasted with
three other major sentence functions: statement, command, exclamation. In syntax,
questions are referred to as interrogatives.


reiteration
A form of lexical cohesion in which the two cohesive items refer to the same entity or
event. Reiteration includes: repetition, synonym or near synonym, superordinate and
general word.

relative clause
Type of embedded clause, functioning as postmodifier in a NP. It may be defining
(restrictive) or non-defining. The presence of commas that mark the boundaries of a r.c.
also distinguishes non-defining from defining r.c.s: The girl who appeared to know you is here
(defining) vs The girl, who appeared to know you, is here (non-defining).

rheme
One of the pair of terms, along with theme, developed by the post-war Prague school of
linguists. The rheme coincides with new information and occurs in focus position towards
the end of the utterance.

rhotic and non-rhotic
A term used in phonology to refer to dialects or accents where /r/ is pronounced following a
vowel, as in far and smart. RP is a non-rhotic variety of English.

rhythm
The perceived regularity of prominent units in speech. It is the pattern formed by the
stressed syllables being perceived as peaks of prominence. Pronouns, determiners,
auxiliaries, prepositions, conjunctions are generally unstressed, unlike lexical verbs,
adverbs, adjectives and nouns, which are usually stressed. Two broad kinds of rhythm can
be found in natural languages. One kind may be typical of a particular language, while with
some languages there is a mixture of the two. In syllable-timing, the tempo depends on the
syllable, so that all the syllables are of about the same length. French, Italian and Romanian
are languages that display this kind of rhythmicity. Stress-timing depends on more unequal
and irregular units, rhythm groups or feet, which contain various numbers of syllables, yet
tend to be pronounced in roughly equal intervals of time. We say that English has a
tendency for stress-timed rhythm, that is, what matters in establishing the rhythm of an
utterance is not its total number of syllables, but the number of its stresses. Therefore, an
utterance with two stresses and three syllables will be pronounced in about the same
interval of time as one with two stresses and seven syllables.

root (radical)
A term from historical linguistics used as part of a classification of the kinds of element
operating within the structure of the word. It is the base form of a word which cannot be
further analysed without total loss of identity. It is that part of the word which remains
when all the affixes have been removed.

RP
Received Pronunciation is the name given to the regionally neutral accent in British
English, previously also called ‘BBC English’.


schema (pl. schemata)
The mental structure we have that conveys the expected structure of
things, e.g. a bike schema has two wheels, handlebars, a saddle, a chain, etc.

script
A pre-existing knowledge structure for interpreting event sequences; a witness summoned
to court will go through a certain sequence, and so will someone who sees a doctor.

semantics
A branch of linguistics dealing with the study of context-free meaning in language. The
emphasis is on the study of the semantic properties of natural languages. Areas of semantics
include etymology (the diachronic study of word meanings), lexicology (the synchronic
analysis of word usage), and lexicography (the compilation of dictionaries).

sense relations
Lexical items contract with each other systems of linguistic relationships: antonymy,
synonymy, homonymy, hyponymy; these are dealt with in lexicology/ lexical semantics.

sociolinguistics
The study of language in relation to social factors (social class, educational level and type
of education, age, sex, ethnic origin, etc.). Some linguists include in s. the study of
interpersonal communication, sometimes called micro-s., e.g. speech acts, speech events,
sequencing of utterances.

speech act
The functional intention of an utterance. It may be locutionary, illocutionary or
perlocutionary in terms of its reference, force or effect.

speech event
A particular instance when people exchange speech. The components of a speech event are
its setting, the participants and their role relationships, the message, the key and the

standard variety
The variety of a language which has the highest status in a community or nation and which
is usually based on the speech and writing of educated native speakers of the language. A
standard variety may show some variation in pronunciation according to the part of the
country where it is spoken.

stem
In the classification of the kinds of elements operating within the structure of the word, it
may consist of a single root morpheme, of two root morphemes, or of a root morpheme plus a
derivational affix. The s. is the unit to which inflectional affixes are attached.

stress
There are four factors that are important in making a syllable recognisably stressed:
loudness, length, pitch, and quality. Pitch is thought to produce the strongest effect, and
length is also a powerful factor. A syllable is therefore prominent in a certain context if it is
uttered on a higher pitch than the surrounding syllables, if it is longer or louder, or if it contains
a vowel that is different in quality from the neighbouring vowels.

stylistics
A branch of linguistics that studies the features of varieties of language, and tries to
establish principles capable of accounting for the particular choices made by individual and
social groups in their use of language. Literary stylistics uses linguistics to show how
literary effects can be related to linguistic features.

subjunctive
The subjunctive has a very restricted use as a mood in contemporary English. It has to do
with both syntax and morphology. Syntactically, the term is used in the classification of
sentence types, having to do with the mood structures of the clause. Morphologically, two
subjunctives are identified, the present and the past subjunctives.

supraglottal
In phonetics, a general term referring to the segments of the vocal tract above the glottis:
the pharynx, the mouth and nasal cavities.

syllable
The lowest phonological unit into which phonemes are combined. Phonetically speaking (in
terms of perception), a syllable consists of a sonorous centre sounding comparatively loud,
having little or no obstruction to the air-flow; at the beginning and at the end of the syllable
there will be greater obstruction and less loud sound.

syntax
The study of the rules governing the way words are combined to form sentences. As such, it
is opposed to morphology, the study of word structure. An alternative definition that avoids
the concept of ‘word’ is the study of the interrelationships between the elements of sentence
structure, and of the rules governing the arrangement of sentences into sequences.


that-clause
A dependent declarative clause, introduced by that. The main types function as subject,
object, apposition, subject complement and postmodifying that-relative clauses. The that
may be omitted in certain circumstances.

thematization
The process of giving prominence to certain elements of a sentence or utterance by placing
them at the beginning of the sentence or utterance.

theme
The initial element in a sentence or utterance which forms the point of departure. The rest
of the sentence or utterance is called its rheme.

tone group
A term used in intonation analysis to refer to a distinctive sequence of pitches, or tones, in
an utterance. The most important feature of a t.g. is the nuclear tone, the most prominent
tone in the sequence. It may be accompanied, depending on the length of the t.g., by a pre-
head, a head and a tail.

Transition Relevance Place (TRP)
One of the points where a change of speaker during a conversation can occur.

transitive verb
In traditional grammar, in the categorization of verbs and clauses, transitive describes
structures in which a verb is followed by an object that is affected by the action of the
verb. The term comes from Lat. ‘going through’: the influence of the verb extends to the
object as ‘goal’.

triphthong
Combination of a diphthong with /ə/, e.g. au + ə in flowers. Sometimes the middle element
is lost in rapid speech or in certain varieties of English; sometimes the triphthong is
monophthongised: flowers may be pronounced /fla:z/.

turn
In sociolinguistics and conversational analysis, one of the opportunities to hold the floor
during a conversation.


turn-taking
The change of speaker during conversation. Some of its rules are obvious (e.g. that only one
person should speak at a time); others attempt to state who should speak next in a
group discussion.


usage
The collective term for the speech and writing habits of a community, with information
about preferences for alternative linguistic forms. The study of u. can reveal, for instance,
that passive constructions are very frequent in scientific writing, or that double negations like
I ain’t got no time are typical of substandard varieties of English.

usage and use
A distinction introduced by H.G.Widdowson (1978) between the function of a linguistic
item in a linguistic system (usage) and its function as part of a system of communication
(use). The meaning a linguistic item has as an example of usage is called signification,
while the meaning it has as an example of use is called its value.

utterance
A term used in phonology and pragmatics to refer to a stretch of speech preceded and
followed by silence or a change of speaker. Used in pragmatics, u. explicitly involves its
contextualisation, i.e. it is a contextualised stretch of speech.


velar
In phonetics, a term used to describe the place of articulation of certain consonants: it refers
to a sound made by the back of the tongue (the articulator) against the velum (place of
articulation). /k/ and /g/ are examples of velar consonants.

vocoid
A term introduced by the American phonetician Pike to account for situations when a sound
that functions as a consonant displays vocalic qualities. English /l/, /r/, /w/, /j/ are
phonetically vowel-like (lacking, in their articulation, any closure or narrowing sufficient to
produce audible friction), but their function is consonantal, except for their behaviour in
syllables where they act as syllabic centre.

vowel
Vowels can be defined in terms of both phonetics and phonology. Phonetically, they are
sounds articulated without a complete closure in the mouth or a degree of narrowing which
would cause audible friction. Phonologically, vowels function as syllabic centres. In
English, some vocoid consonants can also function as syllabic centres.

vowel quality
The features, other than length, that distinguish one vowel from another. V.q. is determined
by the shape of the mouth when the particular vowel is produced. The shape of the mouth
varies according to the position of the tongue and the degree of lip rounding.


weak form
One of the possible pronunciations for a word in connected speech. The weak form is that
which is the result of the word being unstressed, e.g. are in These are my friends. Almost all the
words which have both a strong and a weak form - there are roughly 40 such items in
English - belong to the category of grammatical words, such as primary auxiliaries, modals,
prepositions, conjunctions. In certain circumstances, these words are pronounced in their
strong forms, but their weak forms are more frequently produced and sound more natural in
casual, ordinary conversation. There are contexts where the weak form is the normal
pronunciation and others where only the strong form is acceptable.

wh-question
Syntactically, a class of structures initiated by wh-words, associated with falling intonation.
They are information-seeking, the expected answer replacing the wh-word in the question.
When the wh-word functions as the subject of the question, there is no operator insertion in
the structure; compare Who do you know? and Who knows you?


Yes/no question
Syntactically, interrogative structures formed by placing the operator before the subject and
using rising or falling-rising intonation. When they have neutral polarity (leaving open
whether the answer will be positive or negative), they include non-assertive items (any,
ever, at all). They may also have positive (Did someone leave a message for me?) or
negative orientation (Didn’t anyone leave any message for me?).
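
The placement of the operator before the subject can be sketched in a few lines of Python. This is only an illustration: the function name yes_no_question and its parameters are hypothetical, the operator is assumed to be supplied already (do-insertion is not modelled), and intonation is of course not represented.

```python
def yes_no_question(subject: str, operator: str, rest: str,
                    proper_subject: bool = False) -> str:
    """Form a yes/no question by placing the operator before the subject."""
    # Keep the capital on "I" and on proper nouns; lower-case other subjects,
    # since they no longer stand at the beginning of the sentence.
    if not (proper_subject or subject == "I"):
        subject = subject[0].lower() + subject[1:]
    # The operator now opens the sentence, so it takes the capital letter.
    return f"{operator.capitalize()} {subject} {rest}?"


# Positive orientation (assertive item "someone")
print(yes_no_question("Someone", "did", "leave a message for me"))
# Negative orientation (negative operator plus non-assertive items)
print(yes_no_question("Anyone", "didn't", "leave any message for me"))
```

The two calls rebuild the examples given in the entry; the choice between assertive and non-assertive items is left to the caller.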


zero anaphora
The absence of an expression where one is expected, as a device for maintaining reference:
He bought the book and ----- sold it right away.

zero-plural nouns
Some nouns have the same spoken and written form in both singular and plural. Examples
of zero-plural nouns: aircraft, crossroads, headquarters, means, gallows, salmon.
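
As a rough illustration, zero-plural nouns can be treated as a list of exceptions to the regular plural rule. The sketch below is a simplification: the names ZERO_PLURAL_NOUNS and plural are hypothetical, the set contains only the examples listed above, and every other noun is naively given the ending -s.

```python
# Only the examples given in the entry; a real list would be longer.
ZERO_PLURAL_NOUNS = {
    "aircraft", "crossroads", "headquarters", "means", "gallows", "salmon",
}


def plural(noun: str) -> str:
    """Return the plural of a noun, leaving zero-plural nouns unchanged."""
    if noun.lower() in ZERO_PLURAL_NOUNS:
        return noun          # same form in singular and plural
    return noun + "s"        # naive regular rule, for contrast only


print(plural("salmon"))  # salmon
print(plural("book"))    # books
```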


Abercrombie, D., Elements of General Phonetics, EUP, Edinburgh, 1967.
Bowen, T., Marks, J., The Pronunciation Book, Longman, 1992.
Brazil, D., Coulthard, M., Johns, C., Discourse Intonation and Language
Teaching, Longman, Harlow, 1980.
Brown, A., Teaching English Pronunciation, Routledge, 1991.
Brown, G., Listening to Spoken English, 2nd ed., Longman, 1990.
Catford, J., Fundamental problems in phonetics, Edinburgh University Press, 1977.
Catford, J., A Practical Introduction to Phonetics, OUP, 1988.
Chitoran, D., English Phonetics and Phonology, EDP, 1978.
Chitoran, D., Pârlog, H., Ghid de pronuntie a limbii engleze, ESE, Bucuresti, 1989.
Clark, J., and Yallop, C., An Introduction to Phonetics and Phonology, Blackwell, 2
Cruttenden, A., Intonation, CUP, Cambridge, 1986.
Crystal, D., Prosodic systems and intonation in English, CUP, Cambridge, 1969.
Crystal, D., The Cambridge Encyclopedia of Language, 2nd ed., CUP, 1977.
Crystal, D., The English Language, Penguin, 1988.
Crystal, D., A Dictionary of Linguistics and Phonetics, Blackwell, 1991.
Crystal, D., Quirk, R., Systems of prosodic and paralinguistic features in English, Mouton,
The Hague, 1964.
Cutler, A., Ladd, D.R. (eds.) Prosody: Models and Measurements, Springer Verlag,
Berlin, 1983.
Davies Roberts, P., Plain English: A User’s Guide, Penguin Books, 1987.
Dickson, R.D., Dickson, W.M., Anatomical and Physiological Bases of Speech, College
Hill Press, Boston, 1982.
Fudge, E.C., (ed.), Phonology, Penguin, Harmondsworth, 1973.
Fudge, E., English Word-stress, Allen and Unwin, London, 1984.
Gimson, A.C., An Introduction to the Pronunciation of English, 4th ed., Edward Arnold,
London, 1989.
Gogălniceanu, C., The English Phonetics and Phonology, Chemarea, Iasi, 1993.
Graddol, D., Leith, D., Swann, J. (eds.), English: history, diversity and change, Routledge,
Halliday, M.A.K., An Introduction to Functional Grammar, Edward Arnold, London, 1985.
Harmer, J., The Practice of English Language Teaching, 1991.
Hawkins, P.R., Introducing Phonology, Hutchinson, London, 1984.
Hughes, A., Trudgill, P., English Accents and Dialects: An Introduction to Social and
Regional Varieties of British English, Edward Arnold, London, 1979.
Ladefoged, P., Preliminaries to Linguistic Phonetics, Chicago University Press, 1971.
Ladefoged, P., A Course in Phonetics, 2nd edn, Harcourt Brace Jovanovich, London, 1982.
Kenworthy, J., Teaching English Pronunciation, Longman, 1987.
Kingdon, R., English Intonation Practice, Longman, 1972.
Lass, R., Phonology: An Introduction to Basic Concepts, CUP, Cambridge, 1984.
Laver, J., The Phonetic Description of Voice Quality, CUP, Cambridge, 1980.
Malmberg, B. (ed.), Manual of Phonetics, North-Holland, Amsterdam, 1968.
O’Connor, J.D., Phonetics, Penguin Books, 1973.
Palmer, F.R.(ed.), Prosodic Analysis, OUP, Oxford, 1970.
Pârlog, H., English Phonetics and Phonology, ALL, Bucuresti, 1997.
Pârlog, H., The Sound of Sounds, Hestia Publishing House, Timisoara, 1995.
Pike, K.L., Phonetics, University of Michigan Press, Ann Arbor, 1943.
Potter, S., Our Language, Penguin, 1990.
Quirk, R. et al, A Comprehensive Grammar of the English Language, Longman, 1985.
Quirk, R. and Stein, G., English in Use, Longman, 1990.
Roach, P., English Phonetics and Phonology, CUP, 1983.
Townson, N., Language and Languages in Contemporary Britain, Clusium, Cluj, 1995.
Trudgill, P. (ed.), Sociolinguistic Patterns in British English, Edward Arnold, 1978.
Trudgill, P., On Dialect: Social and Geographic Perspectives, Basil Blackwell, Oxford,
Trudgill, P., The Dialects of England, Blackwell, 1990.
Underhill, A., Sound Foundations, Heinemann, 1994.
Veres, G., Cehan, A., Andriescu, I., A Student’s Companion to English Grammar, Editura
Universității “Al.I. Cuza”, Iasi, 1996.
Wells, J., Longman Pronunciation Dictionary, Longman, 1990.
