DISCOVERING FACTORS THAT INFLUENCE ENGLISH PRONUNCIATION OF
NATIVE VIETNAMESE SPEAKERS
by
Leah Tweedy
A Capstone submitted in partial fulfillment of the requirements
for the degree of Master of Arts in English as a Second Language
Hamline University
St. Paul, Minnesota
August 2012
Committee:
Primary Advisor: Julia Reimer
Secondary Advisor: Carol Mayer
To Brent and Mom who supported and encouraged me through this process. To my
students, who continually inspire me to be the best teacher possible.
TABLE OF CONTENTS
CHAPTER ONE: INTRODUCTION
What are Intelligibility and Comprehensibility?
Professional Experience
Suprasegmental Features and Intelligibility
Capstone Overview
CHAPTER TWO: LITERATURE REVIEW
Perceptions of Pronunciation
Factors in Achieving Pronunciation Goals
Pronunciation Features
Vietnamese Phonology Overview and Comparative Analysis to English
CHAPTER THREE: METHODS
Research Paradigm
Ethics
Context and Participant
Data Collection
Data Analysis
CHAPTER FOUR: RESULTS
Participant Information
Speech Samples
Discussion of Data
CHAPTER FIVE: CONCLUSION
Study Limitations
Implications
Future Research
Reflection
APPENDICES
Appendix A Part A: Personal Information Survey
Appendix A Part B: Pronunciation Attitude Inventory
Appendix A Part C: Pronunciation Experience and Perception
Appendix B Part A: Diagnostic Passage
Appendix B Part B: Informal Speech Topics
Appendix C: Vietnamese Alphabet
REFERENCES
LIST OF TABLES
Table 2.1 Vietnamese Tones
Table 2.2 Vowel Stress
Table 3.1 SPS
Table 4.1 SPS
Table 4.2 Data Summary
CHAPTER ONE: INTRODUCTION
Pronunciation is becoming increasingly recognized as a crucial area for
language learners (Celce-Murcia, Brinton, & Goodwin, 1996; Rossiter, Derwing,
Manimtim, & Thomson, 2010). Comprehensibility from speaker to listener has the
power to build confidence and can influence others' perceptions of the speaker's
credibility. A lack of intelligible speech can be of great detriment to a person's
professional, social, and educational life. Professionally, an inability to be clearly
understood has the potential to severely limit job opportunities, especially where
communication is of utmost importance. Academically, students might not feel
comfortable asking questions and collaborating with fellow students. Socially,
individuals might feel overall dismay and apprehension about using their voice to
communicate, putting them at a clear disadvantage for maneuvering through daily life.
Collectively, inability to communicate effectively can ostracize a person in multiple
facets and settings, and can lead to poor self-confidence (Morley, 1998).
Although research supports the teaching of pronunciation in ESL classrooms,
ESL teachers are often not trained in teaching pronunciation. A survey of ESL
programs in Canada discovered only 30% of ESL teachers had received formal
training in pronunciation teaching, and that stand-alone pronunciation classes in
public school ESL programs were very limited (Breitkreutz, Derwing, & Rossiter,
2001). The amount of training teachers receive in teaching pronunciation can also
vary considerably. From informally asking fellow ESL teachers, I found that many
have had only one general linguistics class, and that the intensity of that class
varied. The instruction provided at the college I attend seems to be on the more
rigorous end, with a four-credit general linguistics class followed
by courses in advanced linguistic analysis and a class specifically focused on
phonetics and phonology for teachers of English to speakers of other languages
(TESOL). However, some classes for English language learners are taught by
speech-language pathologists (SLPs) with an excellent knowledge of phonetics and
phonology but often without training in their application to speakers of other languages.
As a secondary ESL teacher, my fundamental goal is to provide students with the tools to
be successful in their professions and in society beyond high school. Why,
then, is quality pronunciation instruction so often marginalized in schools?
Attaining native-like speech was once the widely held goal under the audio-
lingual method of pronunciation instruction. Students repeated phrases in an effort to
sound like the teacher, following what is referred to as the nativeness principle
(Levis, 2005). In the 1970s and beyond, this goal shifted for many as communicative
language teaching became more popular, with the effect that pronunciation
instruction was relegated to the back burner (Celce-Murcia, Brinton, & Goodwin,
1996). Fluency became the goal at the cost of accuracy. The goal has now shifted
back and falls somewhere in between the two extremes, with intelligibility as the goal
of pronunciation teaching (Levis, 2005). Yet teachers of ESL remain largely
untrained to teach pronunciation, as it is often not emphasized in TESOL programs
(Breitkreutz, Derwing, & Rossiter, 2001).

Another probable factor in the perceived lack of importance of pronunciation
teaching at the secondary level of K-12 education is that it is a non-assessed
skill. Every year in Iowa, ESL teachers' teaching effectiveness is measured by their
students' language growth in reading, writing, listening, and speaking. The speaking
assessment simply measures how well students can use the words of English to
express themselves, and the ESL teacher, who is typically very familiar with the
student's speech style, scores the assessment. No measure of intelligibility or
comprehensibility is included, and the high-stakes assessments revolve around
reading proficiency. With some teachers now being rewarded based on test results,
there seems to be little motivation to teach pronunciation. The fact is this: Intelligible
and comprehensible speech is crucial for success in most careers (Morley, 1998;
Zielinski, 2003). Beyond that, lack of comprehensibility can negatively affect self-
confidence, restrict social interactions, and negatively influence estimations of a
speaker's credibility and abilities (Morley, 1998). Even though school districts
seldom measure teaching ability by quality of student speech output, overlooking this
area can be a great disservice to English language learners.
What are Intelligibility and Comprehensibility?
Two areas often focused on in pronunciation instruction are intelligibility and
comprehensibility, with intelligibility targeting how well a listener understands an
utterance, and comprehensibility targeting the ease of understanding that same
utterance. More specifically, Zielinski (2003) defines intelligibility as the degree to
which the speech produced by a speaker can be identified by a listener as the words
the speaker intended to produce. Given that, heavily accented speech does not equal
low intelligibility or comprehensibility (Zielinski, 2003). A person with a heavy
accent might be unintelligible, intelligible but with a high interlocutor load, or even
perfectly intelligible with no burden on the hearer, especially if the accent is familiar
(American Migrant Education Program, 2002). Zielinski (2008) points out that
intelligibility, therefore, is a two-way process, which can also depend on such factors
as topic familiarity based on background and prior experiences for all parties involved
in the communication. Derwing and Munro (1997) recommend prioritizing
pronunciation goals in this order: intelligibility is of greatest concern, followed by
comprehensibility, with accentedness being of lowest priority.
Professional Experience
In my mixed-first-language (L1) high school ESL classroom, the significance
of pronunciation instruction can sometimes be diminished depending on the L1
student makeup. I am the sole high school ESL teacher in my district, and I also teach at
the 5th-6th grade intermediate building, so my time at the high school is limited. I
generally have time for a few classes grouped by ability level, with two for beginners
and one for intermediate-advanced learners. Many of the Spanish-, Kirundi-, and
Bosnian-speaking students, while often having noticeable accents, are still relatively
comprehensible. They don't meet the goal of native-like pronunciation, but they
do achieve the goal of intelligibility. Also, there is a significant group of students in
the classes who could be classified as long-term ELLs, having been in school in the
United States for most of their lives. They began learning English in the elementary
years and have no accent distinguishable from that of their native-English-speaking peers.
However, the Vietnamese students who began learning English as teenagers,
after the purported 'critical period,' have struggled to be intelligible. I've observed
their interactions with other students and teachers and seen them struggle to be
understood, while the listeners struggle and attempt to make meaning from the
context. Teachers have approached me with their own difficulties understanding the
Vietnamese students when the students have asked questions, and the students have
talked to me about being frustrated while trying to communicate with their peers and
teachers. Because the ESL class has been primarily communicative-based
and only a small fraction of students need help with pronunciation, I have
found it difficult to find a time when the students and I could focus on
pronunciation.
The need was greatest three years ago when I had two students who were
especially struggling with pronunciation. Because my schedule would not allow any
extra time to work solely on pronunciation with these students, I enlisted the
assistance of the speech-language pathologist (SLP) and our district's ELL consultant,
a former SLP herself, who met with the students and worked on the segmental
difficulties the students were having. To my knowledge, no data were kept, and it
was difficult to tell whether much improvement in intelligibility was made, but the
students did report they felt greater awareness of how to make some of the sounds
with which they'd previously struggled. However, by the time the students were in
my class and this intervention took place, they had been in the country learning
English for almost two years and it is possible that fossilization might have already
occurred. In addition, and arguably even more significant, the focus was solely on
segmentals, leaving the students with very little explicit teaching in an area perhaps
even more important to intelligibility: suprasegmental features of English.
Suprasegmental Features and Intelligibility
The Vietnamese students' work with the SLPs helped them gain a better
understanding of the physiology behind producing some of the new sounds of
English. Yet little was done with the elements of American English beyond
segmentals. In the field of pronunciation, suprasegmentals have been gaining
recognition as being as important to intelligibility as segmental features, if not more so
(Celce-Murcia et al., 1996; Cunningham Florez, 1998; Hahn, 2004). Suprasegmental
features include stress, rhythm, intonation, connected speech, and prominence.
Vietnamese and English differ a great deal, with Vietnamese being syllable-timed and
tonal, and English being stress-timed and non-tonal. For example, whereas English
speakers change intonation to signal a question, Vietnamese speakers change tone to
distinguish between two completely different words. Also, Vietnamese speakers tend
to give each syllable the same amount of time, while English speakers follow a
rhythm where stressed syllables happen at regular intervals. This disparity indicates
that native Vietnamese speakers can be expected to have difficulty acquiring
appropriate English stress and rhythm patterns (Byrne, Butcher, & McCormack,
1996). According to Nguyen (2006), the stress pattern is the usual culprit when
listeners mishear a single word (e.g. IM.por.tant vs. im.POR.tant), and the overall
sentential stress is very important for listeners to reconstruct the whole message. In
English, the important words in a sentence are typically stressed; hence, if those words
are not stressed and other, less important words are, the message can be difficult to
comprehend.
As ELLs practice their new language, pronunciation features can be improved
upon and taught (Derwing, 2003), with the level of success relying on a variety of
factors. But for a young Vietnamese woman deemed fluent by numerous ELL
exams, what factors have led to the continued existence of intelligibility and
comprehensibility issues? The present study seeks to answer this question and
discover how factors in one's educational, professional, and personal life
influence pronunciation intelligibility.
Capstone Overview
Chapter Two, the literature review, begins by providing more details on the
perceptions of intelligibility issues, along with the factors that influence intelligibility
and comprehensibility. Next, segmental and suprasegmental pronunciation features
are described, with a particular focus on suprasegmentals in North American
English (NAE). A contrastive examination of English and Vietnamese phonology is also
presented to further illustrate the differences and pronunciation difficulties a Vietnamese
speaker of English might encounter. Chapter Three describes the tools employed in
the gathering of the background information and speech samples, and the methods
used to analyze the student-generated speech samples.

The results of the analysis comparing and contrasting the personal information
to the intelligibility data will be discussed and analyzed in Chapter Four. The present
study seeks to examine the personal and environmental factors influencing the
English intelligibility of an adult native Vietnamese speaker who is fluent in English. These
factors will be evaluated in detail in Chapter Four. Implications for future research
will be discussed in Chapter Five.

CHAPTER TWO: LITERATURE REVIEW
Introduction
The following literature review examines the research regarding the impact
pronunciation issues can have in the workplace and daily life. Next, the factors that
influence a non-native speaker's intelligibility are investigated, followed by an
overview of the features of pronunciation. Finally, an analysis of Vietnamese
phonology in comparison to English phonology provides a basic understanding of
difficulties Vietnamese students might encounter when learning English
pronunciation. This review of the research lays the groundwork for exploring the
essential question: What factors influence the English intelligibility of a native
Vietnamese student tested as proficient in English?
Perceptions of Pronunciation
Research by Derwing and Rossiter (2002) indicates a large number of non-
native speakers of English may speak in a way that affects intelligibility. Derwing and
Rossiter surveyed 100 non-native English speakers, ranging in age from 19 to 64, and
found more than half believed pronunciation contributed to their communication
problems, with 42 listing it as the primary cause of their communication problems. In
addition, 37 of those surveyed stated that they often had to repeat themselves.
Baier (2008) wrote about impacts of pronunciation issues when discussing the
example of Sourav Bhunia. A native of India, Mr. Bhunia started studying English in
grade school, and considers himself fluent in the language. He attended MIT, and
works as a scientist at a leading biomedical company. Yet there's still one thing he
would like to change: "I realized [sic] long time ago, [that] my English is not at the
point that I can really participate in life here fully," said Bhunia. Bhunia is among the
many non-native English speakers who strive to be better understood. "We [...] tend
to make some judgments and assumptions about people's credibility, expertise and
education based on how we perceive them communicating," according to Linda
Halliburton, Director of the Continuing Education Program at the University of
Minnesota. (Baier, 2008, p. 1)
Negative effects on esteem might not be limited to the self: Lack of
intelligibility can also negatively influence estimations of a speaker's credibility and
abilities (Morley, 1998). Vitanova and Miller (2002) found that lack of confidence,
frustration, and even depression are often the emotions that drive learners to the
pronunciation classroom. Through gathering reflections of graduate-level ESL
students, Vitanova and Miller (2002) discovered that most students reflected on the
socio-psychological factors of pronunciation. The writings shed light on how poor
self-confidence and frustration with being misunderstood led to silencing the students
in their academics, with the potential outcome of reduced success in their graduate
work.
Scales, Wennerstrom, Richard, and Wu (2006) analyzed the perceptions of
accents by 37 English language learners and 10 native-English speaking Americans.
A one-minute passage was read by speakers with four different accents: British
English, Chinese English, General American English, and Mexican English. The
participants listened to the passages and rated each in such areas as quality of
pronunciation, ease of understanding, and preferred accent. There was an almost
perfect correlation between the speaker a participant liked the most and the one
that participant rated easiest to understand.
Factors in Achieving Pronunciation Goals
Age
The first, rather controversial, factor is age. The debate over whether there is a
critical period for language learning has been an arduous one. Celce-Murcia et al.
(1996) and Cunningham Florez (1998) examine the debate over the impact of age on
pronunciation. Some researchers insist that after a learner goes through puberty,
lateralization of the brain occurs. Lateralization, or the assigning of linguistic
functions to specific brain hemispheres, results in learners' difficulty in acquiring and
being able to produce new sounds to the extent possible for a child. Other researchers
argue that various sensitive periods for language learning exist and that “adults need
to re-adjust existing neural networks to accommodate new sounds” (Cunningham
Florez, 1998, p.1). However, in general, research has shown that adults have more
difficulty with pronunciation than children when learning a new language. This
phenomenon has also been apparent in my own teaching: none of my pre-middle
school students struggle with intelligibility or even have an accent, while most of
my high school students who arrived in the country in late middle school or high
school have varying degrees of noticeably accented speech. Yet many of those
with accents, some of which could be described as heavy, are still relatively easy to
understand. In my classroom, the difference between an accent that interferes with
intelligibility and comprehensibility and one that doesn't seems to be split based on
our next factor: native language.
First Language Transfer
A learner's first language (L1) can have a significant influence on the level of
accentedness and intelligibility of the new language. Negative transfer, also called
interference, means that the features of the L1 are carried into the second language
(L2). Where the two languages differ, negative transfer can lead to
erroneous production of aspiration, rhythm, and intonation in the new language
(Cunningham Florez, 1998). According to Meng, Tseng, Kondo, Harrison, and
Visceglia (2009), language transfer occurs at both the segmental and suprasegmental
levels, and these interference effects can become fossilized with age, creating
challenges for adult L2 learners.
Prior Instruction
The issue of fossilization arises again with the next factor: amount and type of
prior pronunciation instruction. When a learner has achieved a high level of
proficiency in the new language, she may also have developed systematic speech
errors that are now complicated to unlearn (Cunningham Florez, 1998; Celce-Murcia
et al., 1996). Ideally, quality pronunciation practice would coincide with the start of
learning the new language, but when this isn't the case, habits can form and
fossilization can occur. This fossilization could also be due to the L1, a lack of
intervention early on when learning the language as mentioned above, or a
combination of the two, and can be a considerable hindrance to attaining
pronunciation goals.
The issue of quality pronunciation practice is one that has gotten considerable
attention in recent years. With the advent of communicative-based approaches to
language teaching in the 1980s, teachers and educational publishers started to look for
more suitable ways to teach pronunciation skills compared to the direct or audio-
lingual methods of the past. The communicative approach is based on the
premise that using language to communicate should be the core of classroom
teaching, since communication is the primary purpose of language. This approach
continues to be the dominant one in language teaching today and has caused a shift
in curricular materials over the last couple of decades.
Teachers and material developers determined that refocusing their efforts
toward suprasegmental features, such as rhythm, stress, and intonation, in a
discourse-rich context was the best structure for a short-term pronunciation class
(Celce-Murcia et al., 1996). Also, with the structure of classes shifting to being
communicative in nature, many of the old methods of teaching segmental features
were incompatible with the new communicative pedagogy with its emphasis on
fluency in lieu of accuracy. For example, the audio-lingual and direct methods placed
great emphasis on correct output. As the communicative approach gained popularity,
the emphasis shifted away from precision of output (Levis, 2005). Many teachers today,
trained in a communicative-based teaching method, do little in the way of teaching
pronunciation, and might also know little about suprasegmentals. The survey of ESL
programs mentioned in Chapter One (Breitkreutz et al., 2001) showed that
while 76% of ESL instructors felt capable of teaching pronunciation, only 30% had any
pedagogical training. Of this 30%, 79% had received it through conference presentations,
69% through in-house seminars, and only 12% through university or college
courses (Breitkreutz et al., 2001). This lack of comprehensive training can lead to
overlooking the impact of suprasegmental features of language.
Both empirical and anecdotal evidence point to a threshold level of
pronunciation for nonnative speakers of English (Morley, 1991). Even with a strong
knowledge of grammar and vocabulary, if nonnative speakers fall below this
threshold, they will have difficulties being understood while speaking (Celce-Murcia
et al., 1996). Otlowski (1998) points out that some research suggests little difference
can be made by pronunciation training; yet contrasting research indicates that if
certain criteria are met, such as suprasegmental training, noticeable differences are
possible from pronunciation training (Derwing & Munro, 1997). In addition, Celce-Murcia et
al. (1996) found that giving priority to suprasegmentals leads to improvements in
comprehension and creates an environment that is less frustrating for students
because students are better able to achieve significant results in a short period of time.
Language Exposure
As with the amount and type of prior instruction, a learner's exposure to the
language is also a salient component. The length of time a person has spent
interacting with the language could have an impact, and, perhaps more importantly,
the quantity and quality of English interaction in day-to-day activities can be critical
(Singer, 2006). Languages are acquired by receiving large amounts of
comprehensible input. This comprehensible input can be easily seen when visiting a
quality elementary school where students interact freely in a comfortable setting.
Adults, on the other hand, often spend their days working in an environment without
a rich source of comprehensible input. Socializing often occurs with people from their
linguistic group. Among the Vietnamese population in my area, many of the families are
employed together in a factory, at nail salons, and at a family-owned restaurant. Some
English interaction occurs with customers, and Vietnamese tends to be used with
co-workers. In Singer's (2006) case study of factors influencing English pronunciation of
native Somali speakers, the most definite factor determining accurate pronunciation
was the learners' daily exposure to English.
Aptitude
Another factor that has garnered much debate is aptitude, with some
researchers asserting, “the ability to recognize and internalize foreign sounds may be
unequally developed in different learners” (Cunningham Florez, 1998, p. 1). Still
others argue that as long as learners have a developed first language, their ability to
learn an additional language and its sounds is the same (Cunningham Florez, 1998).
Given this debate and the lack of tools to measure aptitude, this study does not attempt
to assess the participant's aptitude in regard to pronunciation.
Attitude and Motivation
An additional factor is one that is nonlinguistic and unrelated to aptitude:
learner attitude and motivation. The difference between a positive and motivated
student and a negative and unmotivated student is, as any teacher would attest,
enormous. While a number of students might take English in college to reach their
academic and future professional goals, others might feel forced to learn
English, for example, because they are unwilling refugees from their own countries. In
addition, some might feel conflicted about learning a new language if they feel it will
result in the eventual loss of their L1.
The development of pronunciation intelligibility can be positively or
negatively influenced by one's attitude toward the new culture, its language, and
speakers. Concurrently, personal identity issues and motivation for learning also can
support or impede pronunciation (Cunningham Florez, 1998). Elliott (1995)
examined the pronunciation accuracy of a group of university students studying
intermediate Spanish. He measured the students' attitude toward acquiring near-
native-like pronunciation using the Pronunciation Attitude Inventory (PAI). The
results showed that a student's motivation toward achieving the target language's
pronunciation was the principal variable in their accuracy of actual pronunciation
output.
Cunningham Florez (1998) summarizes the impact these factors can have on
setting and achieving pronunciation goals:
Native-like pronunciation is not likely to be a realistic goal for older learners;
a learner who is a native speaker of a tonal language, such as Vietnamese, will
need assistance with different pronunciation features than will a native
Spanish speaker; and a twenty-three-year-old engineer who knows he will be
more respected and possibly promoted if his pronunciation improves is likely
to be responsive to direct pronunciation instruction. (p. 3)
Pronunciation Features
Pronunciation features are typically grouped in two categories: segmentals
and suprasegmentals. Segmental features are the sound inventory of a language. The
North American English sound inventory consists of fifteen vowels and twenty-five
consonant sounds, for a total of forty distinct sounds that enable speakers to
distinguish one word from another (Cunningham Florez, 1998). Suprasegmentals, in
contrast, encode rich information structure, giving the listener the ability to detect
emphasized words, speech acts (e.g. statements vs. questions), phrasal boundaries,
and attitudes and emotions. This suprasegmental information presents itself in the
following sound features:
• stress – the length, volume, and pitch applied to syllables in words and sentences
• rhythm – the beat pattern of stressed and unstressed syllables (tied with sentence stress)
• adjustments in connected speech – changes in sounds when words blend together in speech
• prominence – vocally highlighting words in speech to express meaning, new vs. old information, or intent, by use of loudness, length, pitch, and vowel quality
• intonation – the rise and fall of voice pitch in sentences and phrases
According to Meng et al. (2009), while perceptual studies indicate that both
segmental and suprasegmental features impact expert judgments on speaking
proficiency, suprasegmentals have a greater effect for the overall comprehension of
the message. Meng et al. (2009) state, “suprasegmental features encode rich information
structure that helps the listener locate emphasized words, phrase boundaries, speech
acts (e.g. statements, questions, continuations, etc.) as well as the speaker's attitudes
and emotions” (p. 1). In Derwing and Rossiter's (2002) survey of 100 ESL students,
the participants perceived segmental issues to be the crux of their pronunciation
difficulties despite the fact that these features have a low functional load. The
participants' awareness of suprasegmental features was limited. Negative
transfer of suprasegmental features can therefore disrupt clear comprehensibility of the
intended message, and the detrimental effects would naturally be greater when the L1
is more markedly different from the L2, as is the case when juxtaposing English with
Vietnamese (discussed further in Comparative Analysis below).
Vietnamese Phonology Overview and Comparative Analysis to English
It is beneficial to understand the phonology of the students’ L1 in order to
gain a broader view of the challenges likely to be encountered. The difference
between English and Vietnamese is vast: Vietnamese is part of the Austro-Asiatic
language family, while English belongs to the Indo-European family. These
language families are very distinct, and those distinctions carry into the differences
between English and Vietnamese. For a segmental inventory of the Vietnamese
alphabet and approximations of the sounds in English, see Appendix C. At the
segmental level, the differences result in some common segmental errors: dropping of
final consonant sounds; difficulty in pronouncing some consonant sounds such as /ð/,
/θ/, /z/, /dʒ/, /ʃ/, and /tʃ/, as well as some initial consonant clusters such as sp-, dr-, br-,
fr-, pl-, and str- (Nguyen, 2002). Consonant clusters in the word-final position can be
especially problematic (Nguyen, 2008). Yet perhaps even more detrimental to
intelligibility are the immense Vietnamese-English differences in suprasegmentals.
One of the most prevalent differences is that Vietnamese is a tonal language
while English is not. English uses pitch changes to emphasize or express emotion,
while Vietnamese speakers hear those changes as signaling different words. Because of this, it is
common for native speakers of tonal languages to have strong accents when speaking
English (Shoebottom, 2011). Six pitch tones exist in Vietnamese: level, breathing
rising, breathing falling, falling-rising, creaky rising, and low falling (“Vietnamese
Language,” 2010). Six tones are used in Hanoi/Northern Vietnamese, with the South
using five tones:

Table 2.1
Tone name        Features                              Diacritic       Example
ngang "level"    level, laxness                        (no mark)       ba 'three'
huyền "curve"    falling, laxness, breathiness         `               bà 'lady'
sắc "rising"     mid rising, tenseness                 ´               bá 'governor'
nặng "drop"      mid falling, glottalized, tenseness   (dot below)     bạ 'at random'
hỏi "falling"    mid falling, tenseness                (hook above)    bả 'poison'
ngã "broken"     mid rising, glottalized               ˜               bã 'residue'
Thompson, 1987
This means, at the word level, that pitch contours and intonation are crucial in
signifying different words in Vietnamese, whereas in English these linguistic features
signal old versus new information, sentence function, and various pragmatic
elements.
Another aspect of Vietnamese that is strikingly contrastive to English is that
Vietnamese is predominantly monosyllabic, with only a small fraction of words being
di- or polysyllabic. Each Vietnamese syllable consists of three elements: the initial
consonant singleton, the rhyme, and the tone. The syllable then also falls into one of
four types: open, partially open, partially closed, and closed (“Vietnamese
Language,” 2010). The monosyllabic nature of Vietnamese presents obstacles when
learning English with regard to the stressed syllables English exhibits in its
polysyllabic words. Moreover, as a tonal language, Vietnamese has no system of
word stress (Byrne et al., 1996). However, this claim has been disputed in the case of
reduplication. Nguyen and Ingram (2006) suggest that the second syllable is more
prominent in reduplicated forms, which means that in disyllabic reduplications, the
word stress falls on the last syllable. Furthermore, while Vietnamese resembles
English in that it exhibits various degrees of stress (medium loudness with
heavy stress, louder than medium, and less than medium loudness), the majority of
syllables carry medium stress. Byrne et al. (1996) go on to say that “weak stress is
accompanied by very short syllables, with the overall effect that conversational
Vietnamese is syncopated” (p. 1). Nevertheless, according to Pittam and Ingram
(1992), phrasal stress in Vietnamese appears to perform basically the same function as
phrasal stress in English, which is highlighting new information or emphasizing
certain aspects of importance.
In summary, Vietnamese contrasts greatly with English with regard to many
segmental elements, tone, and pitch contours as they relate to word stress; however,
some similarities, though not exact, are evident at the phrasal level. Although
considerable research has been done on analyzing the differences between the
languages, less attention has been focused on how probable it is for a non-newcomer,
with possible fossilization, to overcome these differences and achieve consistently
intelligible speech.
Stress and Rhythm
Having given an overview of Vietnamese phonology, we now turn to a
fundamental component of suprasegmental features: stress-timed versus syllable-timed
languages. However, before examining the effect this timing has on English and
Vietnamese, I should first acknowledge the controversy surrounding the concept.
According to Byrne et al. (1996), although studies haven’t found evidence of absolute
isochrony (the rhythmic division of time in language, i.e., stress-timed vs.
syllable-timed), it is clear that languages do differ in their rhythmic qualities. Despite the
controversy, the overwhelming majority of research supports a syllable-timed versus
stress-timed difference between Vietnamese and English, and these terms will
therefore be examined and used in this research.
English is a stress-timed language, meaning that it contains stressed and
unstressed syllables with quasi-uniform durations between successive stressed
syllables (Meng et al., 2009). Cohen (2007) demonstrates an example of this in the
following related sentences:
HORSES EAT GRASS
The HORSES EAT GRASS
The HORSES EAT the GRASS
The HORSES will EAT the GRASS
The HORSES will have EATen the GRASS
The HORSES might have been EATing the GRASS
The duration of each sentence above is dependent upon the number of stressed
syllables, not the total number of syllables. Vietnamese, along with many other
languages including French, Spanish, and Hindi, is syllable-timed, with the
implication being that “there is less temporal variability in the duration of individual
syllables as a function of varying levels of stress or emphasis” (Pittam & Ingram,
1992, p. 2). In other words, every syllable takes up approximately the same amount of
time and creates a rhythm uncharacteristic of native-like English.
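
The contrast between the two timing patterns can be made concrete with a short sketch. The Python code below is only an idealized illustration, not a model taken from Cohen (2007) or Pittam and Ingram (1992): the syllable divisions, the marking of one stressed syllable per content word, and the half-second beat are all assumptions made for the example. It estimates how long each of Cohen's sentences would last if timing depended only on stressed syllables, and how long it would last if every syllable took the same amount of time.

```python
# Assumed syllable-level annotation of Cohen's sentences (not from the source):
# upper-case syllables are treated as stressed, lower-case as unstressed.
SENTENCES = [
    ["HOR", "ses", "EAT", "GRASS"],
    ["the", "HOR", "ses", "EAT", "GRASS"],
    ["the", "HOR", "ses", "EAT", "the", "GRASS"],
    ["the", "HOR", "ses", "will", "EAT", "the", "GRASS"],
    ["the", "HOR", "ses", "will", "have", "EAT", "en", "the", "GRASS"],
    ["the", "HOR", "ses", "might", "have", "been", "EAT", "ing", "the", "GRASS"],
]

BEAT = 0.5  # assumed seconds per beat; chosen only for illustration


def stress_timed_duration(syllables, beat=BEAT):
    """Idealized stress timing: one beat per stressed syllable."""
    return beat * sum(1 for s in syllables if s.isupper())


def syllable_timed_duration(syllables, beat=BEAT):
    """Idealized syllable timing: one beat per syllable, stressed or not."""
    return beat * len(syllables)


for syllables in SENTENCES:
    print(
        " ".join(syllables),
        "| stress-timed:", stress_timed_duration(syllables),
        "| syllable-timed:", syllable_timed_duration(syllables),
    )
```

Under these assumptions, all six sentences receive the same stress-timed estimate because each contains three stressed syllables, while the syllable-timed estimate grows with every added function word. Real speech is far less regular, as the isochrony controversy noted earlier makes clear.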
In Cohen’s example above, the words marked as stressed and unstressed can
be placed into one of two categories: content words and function words. Horses, eat,
and grass are the principal words that express the meaning of the sentence. Such content
words are typically nouns, pronouns, main verbs, adjectives, question words,
demonstratives, and adverbs. Function words are those intervening words that have
little or no meaning, but help express grammatical relationships (Cohen, 2007;
Shoebottom, 2011). Typical function words are articles, auxiliaries, some pronouns,
and prepositions. These words are sometimes described as ‘swallowed’ due to their
being shortened and weakened, with the result being that several words together may
take the same duration of time as the single content words around them (Shoebottom,
2011).
Yet what exactly sets a stressed syllable apart from an unstressed one? A
stressed syllable is sometimes referred to as having prominence. The key
characteristics that give a syllable prominence are loudness, length, pitch and vowel
quality (Nguyen, 2006). These four characteristics can be arranged in order
from strongest effect to weakest, with pitch and length creating the most detectable
difference, followed by loudness and quality (Roach, 1982, as cited in Nguyen, 2006).
Similarly, Meng et al. (2009) describe a stressed syllable as being acoustically of
higher intensity, longer duration, and with a stronger, unreduced articulation reflected
in the spectral quality. Pitch movements near or around the syllable can further
increase the stress.
According to Nguyen (2006), negative transfer is likely the culprit for much of
the difficulty in word stress for Vietnamese speakers of English, resulting in a
tendency of Vietnamese students to pronounce all syllables with the same
prominence. In a study by Nguyen (2006), native Vietnamese speakers of English
read a passage that included over one hundred polysyllabic words. While the
participants read the passage aloud, researchers counted the number of stress errors.
Students were placed into three groups based on their length of time learning English,
from first-year to third-year students. The first-year learners incorrectly stressed
syllables approximately 65% of the time. The third-year students, who had been
taught the rules for stress, did improve but still made errors over a quarter of the time.
The errors occurred either because the primary or secondary stress fell on the wrong
syllable or because the students stressed all syllables equally. Errors such as these can cause
serious issues with intelligibility. Lexical stress can provide information such as part
of speech: RE-cord (noun), re-CORD (verb); CON-struct (noun), con-STRUCT
(verb). Even in words where an alteration in the stressed syllable doesn’t change the
part of speech, the incorrect stress pattern greatly impedes intelligibility (Celce-Murcia
et al., 1996; Field, 2005; Hahn, 2004; Nguyen, 2006; Pittam & Ingram,
1992).
As previously discussed, Vietnamese students of English often do have
difficulties with segmental features. Yet, according to Gilbert (2008), a frequent issue
in many ESL classrooms is the common method of teaching individual sounds before
establishing a basic knowledge of the new language’s rhythm and melody. When this
is done, students then attempt to apply the new sounds of the language to their native
language’s prosody. By learning the new language’s rhythmic structure first, in contrast,
students are provided with a timing context. The result is that the prosody knowledge
makes it easier for the students to produce the new individual sounds. Gilbert (2008)
clearly states, “rhythm training is a precondition to good, clear target sounds” (p. 30).
As illustrated in Table 2.2, stress affects the pronunciation of a vowel:
a stressed vowel and its reduced counterpart are entirely
different vowel sounds represented by the same letter. Table 2.2 demonstrates the
tenseness of the stressed vowels in English as opposed to the unstressed lax vowels,
with the schwa receiving so little stress that it can be difficult for NNS to
recognize. Without this knowledge of how stress changes sounds in words, non-native
English speakers could be terribly confused about when to produce different sounds for
the same graphical vowel representation, the letter.

Table 2.2
Gilbert 2008
Summary
To summarize, as Vietnamese teenagers and adults begin to learn to speak
English, the differences in the two languages are likely to present challenges. In order
to diminish chances of fossilization, it is ideal for quality pronunciation instruction to
coincide with language learning from the start. This will work to prevent the long-term
effects of negative transfer, which are likely with regard to stress and rhythm
(Nguyen, 2006). For those proficient students who are still struggling with
intelligibility, a careful course design is necessary to maximize outcomes. Developing
a solid understanding of pronunciation research and of how students’ language
backgrounds affect pronunciation provides a solid base on which to build a more
effective pronunciation course. Given this overview of factors affecting intelligibility,
to what extent does each of these factors affect the participant in this case? While
research exists regarding factors influencing overall pronunciation, why specifically
is a young Vietnamese woman fluent in English still having difficulties with
intelligibility? In the next chapter I describe the methods used to answer this question.
CHAPTER THREE: RESEARCH METHODS
Throughout the last two chapters, I have researched the factors that potentially
lead to intelligibility and comprehensibility difficulties for non-native speakers of
English. I have established the detrimental effects of poor intelligibility and/or
comprehensibility in the social, academic, and professional realms. I also examined the
difficulties of acquiring intelligible pronunciation, especially in regard to students whose
L1 is quite different from English and who began learning English as teenagers or later,
after the purported critical period. In order to better understand the needs of these
students, particularly those with Vietnamese as their first language, I will examine
multiple influencing factors through the use of a case study. I will collect and analyze
discourse through the use of speech samples in order to create an inventory of problem
areas for Vietnamese speakers of English. In addition, I will conduct an interview to
construct a better idea of the subject’s background, exposure, attitude, and experience
with the English language, particularly with how it pertains to pronunciation.
This chapter describes the research plan I will use to investigate the factors
affecting the intelligibility of a young Vietnamese woman. I describe the participant and her
cultural, educational, and language background, including previous pronunciation
instruction. Then I describe the resources and methods used to gather the survey and
interview information and speech samples. Finally, the processes used to analyze the data
are described.
Research Paradigm
The catalyst for this capstone was my continued interest in a former student. Despite
what I saw to be hard work, desire, and strong English literacy skills, she continued to report
intelligibility issues when talking with others. As she advanced to the college level and
continued ESL classes, including courses focused on speaking, I found myself often
reflecting on the possible reasons for her self-reported issues. My interest in exploring her
situation further lent itself perfectly to a qualitative research paradigm, in particular a case
study. The study of an individual language learner is often a case study, and, according to
Nunan (1992), a case study generally incorporates a variety of data collection and analysis
methods. As I explain in this chapter, to allow more in-depth exploration of the participant, I
incorporate a variety of interview questions, along with speech samples. The researcher in a
case study examines the characteristics of an individual unit with the aim of a deeper
investigation (Nunan, 1992). A detailed description and analysis of an individual subject is a
prevalent type of case study, and this level of detail seemed the best fit as I attempted to form
a complete picture of my participant’s personal and educational background that could
influence her pronunciation.
Ethics 
To secure my participant’s anonymity and ensure her safety with regard to this study,
the following measures were taken: 1) a Human Subject Research Form was submitted,
reviewed, and accepted by Hamline faculty, 2) the participant was given an informed
consent letter and, although tested as proficient in English in both high school and
college, was allowed time to read, ask questions, and translate any unclear words (not
needed), and was given the choice to sign under no pressure, 3) the research objectives were
explained to the participant, 4) the participant’s identity was protected throughout the
research by use of a pseudonym, 5) no materials were kept with the participant’s real
name on them, 6) research materials and notes were kept in a secure location, and 7) there
were no negative or positive consequences from participating in this study.
Context and Participant
The participant I selected is a 22-year-old female whom I will call Anh* for the
purpose of anonymity. Anh was raised in Ho Chi Minh City (formerly Saigon), Vietnam
and attended a year of high school in that area. There are three dialects of Vietnamese
based on the northern, mid-coastal and southern regions (“Vietnamese Language,” 2010).
The dialect most studied and described by linguists in the U.S., Cochinchinese, is the
southern regional dialect and is the dialect of the subject in this study. Anh had not
learned English prior to coming to the U.S. at the age of 16, when her family moved to
Iowa to be near an uncle.
Upon arrival in the U.S., Anh attended an urban school for 4 months before
moving into my district on the outskirts of the same city the following year as a 9th
grader. By the time I started as an ESL teacher at the school, she was beginning 10th
grade and was taking mostly regular 10th grade classes with native English speakers. Her
motivation was - and continues to be - high, and Anh excelled in school, as she was
elected into National Honor Society as a senior. Also as a senior, Anh scored ‘proficient’
on the Iowa English Language Development Assessment. Proficiency is based on the
combination of four scores in reading, listening, speaking, and writing.
Despite this success, she continually worried about her pronunciation. We
continued to work on her pronunciation when possible in class, although the majority of
the other ELL students (Spanish-, Bosnian-, and Arabic-speaking, and often ones who had
arrived in elementary school) were highly intelligible. Anh, her brother, and her cousin
were the students who consistently struggled the most with pronunciation based on my
observations, their anecdotes, and feedback from their content-area teachers. I also
struggled to find time to meet their pronunciation needs with a schedule that was already
filled during the day and the students unable to stay after school due to work
commitments. When only a small fraction of students are struggling with pronunciation
and the time during class shared with the other modalities is proving insufficient to create
the needed change, what other effective options exist for teachers and students?
Although I continued to embed pronunciation in the curriculum, the main focus
was on reading, writing, listening, and speaking, with the speaking practice being
insufficient to produce enough detectable improvement for my liking. I enlisted the help
of our Area Education Association’s (AEA) ELL consultant, who happened to have been
a speech-language pathologist (SLP) in her previous position. I also enlisted the
assistance of our district’s SLP, who was interested in working with the students outside
of her normal hours and applying her expertise to some of the students’
challenges with segmental features of English. Both SLP-trained professionals worked
with the students outside of class time so the students wouldn’t get behind in their
schoolwork. The instruction focused on segmental features, and the students were pleased
to be getting practice specifically in such areas as articulation and voicing.
When asked to say a certain word containing the target sounds, they could stop,
think, and produce a more accurate utterance. However, in spontaneous conversation, I
typically heard the students slip back to their original manner. These interventions took
place after the students had already been exposed to the language for almost two years, and I
worried that fossilization might have occurred. Still, their metalinguistic awareness of
segmental features did seem to be strengthened, and the students now had the knowledge and
tools to think about their speech and independently make some of the
segmental adjustments. Anh didn’t always produce all segmentals correctly in context,
but she did develop awareness of these features and could correctly produce them when
allowed to stop and reflect before attempting words. A key challenge she continued to
have was vowel pronunciation: she could use correct voicing and place
and manner of articulation, but she still struggled with when to use which vowel
sounds. As Gilbert (2008) explains, this could be due to a lack of training in stress (see
Table 2.2).
Despite her pronunciation struggles, Anh continued to be successful, graduated
with honors, and enrolled in the local college, where she has been taking numerous ESL
classes concurrently as she prepares for a career in the medical field. On one of her
frequent visits to the high school, she recently reported that the college-level speech
classes were more similar to a public speaking class than a pronunciation class, and she
still reports people frequently not understanding or incorrectly understanding her speech.
The issues described above seem to remain noticeable in our conversations and interfere
with my comprehension of her speech. While I have my own hypotheses from our past
experience together, what are Anh’s perceptions of her pronunciation, awareness of her
strengths and weaknesses, and her more recent experiences with the English language?
The quantity of questions illustrated here and the depth in which I wanted to explore Anh’s
situation led to the conclusion that a case study was the most appropriate paradigm in
which to conduct my research.

Data Collection
In order to gather as much information as possible, I employed two different
methods of data collection: an interview and speech samples.
The interview
The interview consisted of three parts. First, the Personal Information Survey (PIS)
(Appendix A) was given to collect specific information about the participant’s linguistic
development. I administered the survey orally, and I wrote and e-recorded the subject’s
answers. I chose to write and record in an effort to encourage the subject to respond
freely with as much detail as necessary, without being encumbered with writing lengthy
answers. This also allowed me to ask for clarification or further explanation as needed.
The PIS (Singer, 2006, adapted from Elliott, 1995) elicits information regarding length of
time in country and learning the target language, along with educational background. The
PIS also gathers details on exposure to the target language at work, at home, and at school.
In addition, I added a final section to the PIS asking specifically about the subject’s
background experience with pronunciation instruction, her perception of its effectiveness,
and her perception of her own pronunciation strengths and difficulties.
The second step in collecting data was the Pronunciation Attitude Inventory (PAI)
(Elliott, 1995) (Appendix A). This inventory was completed in writing rather than as an
interview, with the participant rating statements on a scale of 1-5. Elliott devised the PAI
in an effort to gauge language learners’ attitudes toward pronunciation. The PAI was
modified for this study, using the modifications in questions and scoring done by Singer
(2006). Elliott used the PAI to examine the pronunciation acquisition of a group of
Spanish speakers, while Singer altered it to be used with students learning NAE. Each
statement is followed by the numbers one through five, and the subject decides among
5=strongly agree, 4=somewhat agree, 3=neither agree nor disagree, 2=somewhat
disagree, or 1=strongly disagree. Some of the statements include, ‘Sounding like a native
speaker is important to me,’ and ‘I believe I can improve my pronunciation in English.’
Before tallying the scores, some statements’ ratings had to be reversed so that a high
score always corresponded with a favorable attitude. Next, scores were added up, with a
maximum possible total of 45. The idea of the PAI is that the higher the score, the more positive
the subject’s attitude toward English pronunciation. A very low score would indicate a
person with a very negative attitude, with a score in the middle range being somewhat
neutral.
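
The reverse-scoring and tallying procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than Singer's (2006) or Elliott's (1995) actual scoring sheet: the function name, the sample ratings, and the choice of which statements are reversed are hypothetical. The use of nine statements is an inference from the stated maximum of 45 on a five-point scale.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # rating scale described in the text


def score_pai(ratings, reversed_items):
    """Total a set of PAI ratings after reverse-scoring selected statements.

    ratings: dict mapping statement number -> rating on the 1-5 scale
    reversed_items: statement numbers (hypothetical here) whose ratings are
        flipped (5 -> 1, 4 -> 2, ...) so that a high score always reflects
        a favorable attitude
    """
    total = 0
    for item, rating in ratings.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating for statement {item} is outside 1-5")
        if item in reversed_items:
            rating = SCALE_MIN + SCALE_MAX - rating
        total += rating
    return total


# Hypothetical responses to nine statements (9 x 5 = 45 maximum).
sample_ratings = {1: 5, 2: 4, 3: 2, 4: 5, 5: 1, 6: 4, 7: 5, 8: 3, 9: 4}
sample_reversed = {3, 5}  # assumed negatively worded statements

print(score_pai(sample_ratings, sample_reversed), "out of", 5 * len(sample_ratings))
```

Reversing a rating by subtracting it from six keeps the neutral midpoint (3) unchanged, which is why a middle-range total can still be read as a roughly neutral attitude.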
Data about the subject’s overall perception of her accent and English
pronunciation training background were collected in the third part of the interview
(Participant Interview – Part C). This section of the interview was compiled from original
questions of my own and others adapted from Derwing and Rossiter’s (2002) study of
ESL learners’ perceptions of pronunciation needs and strategies (Appendix A).
All of the previously described interview and survey data helped to form a clear
picture of the subject, which was compared to the second major part of the data: the speech
samples.
Speech Samples
The first large chunk of data previously described focused on the learner’s
experiences and attitudes. The second part captured actual pronunciation through the
collection of recorded speech samples. The speech samples include two parts: one
practiced reading of a diagnostic passage and another of the subject speaking freely on
her choice of a topic. There was a significant intentional difference between the two
samples: the diagnostic passage is standardized and purposely includes a wide
assortment of segmental and suprasegmental features in order to assess the full range of
English language pronunciation. Conversely, the free speech sample offers the subject a
range of subjects on which to speak, allowing for a conscious choice of a topic with
which the speaker is most familiar. Furthermore, the speaker chooses her own words,
allowing for avoidance of troublesome features. For example, a speaker might struggle
with the voiced and voiceless ‘th’ and ‘r’ sounds, so rather than saying ‘three’ they
substitute ‘a couple’ to avoid the problem word. Therefore, the combination of the two
speech samples provides a more informative view of the speaker’s pronunciation
abilities.
The subject was first given the diagnostic passage to read through independently.
When she was ready, she listened to the passage read to her by a native-English speaker.
She was then given time to reread and practice the passage prior to being recorded.
Giving the subject the chance to practice the passage beforehand helps ease any
nervousness that could impede her reading. The goal of the speech sample was to get as
accurate a depiction as possible of her true pronunciation abilities. In authentic
situations, people sometimes speak spontaneously, yet at other times they deliver practiced
discourses. The diagnostic speech sample most closely mirrored the latter
and was therefore no less authentic.
A second recording was next taken with the subject speaking freely on a chosen
topic, more closely reflecting the authentic situation of an unplanned conversation. The
subject was given a range of topics from which to choose (Singer, 2006). Topics such as
discussing your favorite day, most embarrassing moment, future plans, and describing
your family led the subject to speak freely at some length on a topic comfortable and
familiar to her. The contrast of the two speech samples most accurately showed the
learner’s abilities in a variety of situations.
Data Analysis
To analyze the speech data, I used the Analysis of Problems (AP) by Dauer
(1993). The AP lists a full inventory of vowels, voiced and voiceless consonants, nasals,
and other segmental features, along with issues such as dropped endings (e.g., “-ed”)
and voicing of final consonants when they should be voiceless. I particularly appreciated the
AP for its full range of suprasegmental features as well. The AP guides the assessor to
listen for a variety of stress errors, rhythm, vowel reductions, problems with intonation,
and many others. I listened to the samples numerous times while focusing specifically on
the different pronunciation features.
After analyzing the two speech samples using the AP, I used the Speaking Performance
Scale for UCLA Oral Proficiency for Nonnative TAs (SPS) (Celce-Murcia et al., 1996)
to assign a score. A careful review of the AP results guided my choice of a
point value from the SPS. The SPS examines different aspects of language and assigns a
score of 0 through 4 based on the accuracy of production. A high score of four indicates
the learner is very intelligible, while a score of zero indicates the learner is likely to be
impossible to understand. The SPS contains seven areas to assess, including grammar and
listening. Therefore, for the purpose of this study and in alignment with Singer’s study of
Somali learners’ pronunciation factors (2006), the SPS was limited to the two pertinent
areas: pronunciation and speech flow. Table 3.1 shows the correspondence of the scores
with the quality of segmental and suprasegmental features from the speech samples.
Table 3.1
SPS
Rating System used for assessment of segmental and suprasegmental features
Rating   Segmental                                                           Suprasegmental
4        Rarely mispronounces                                                High degree of fluency; effortless, smooth
3        Accent may be foreign; never interferes                             Speaks with facility, rarely has to grope…
2        Often faulty but intelligible with effort                           Speaks with confidence but not facility, hesitant
1        Errors frequent; only intelligible to NS used to dealing with NNS   Slow, strained except for routine expressions
0        Unintelligible                                                      So halting that conversation is impossible
Singer, 2006, adapted from Celce-Murcia, Brinton, & Goodwin (1996).
Once the scores were collected measuring attitude and pronunciation, I compared
the results of all the data to see if relationships were suggested. I examined the history
gathered through the interview and survey to gain an enhanced understanding of the
factors affecting intelligibility and comprehensibility. Subsequently, taking all data into
consideration, I postulated reasons why the subject received the pronunciation
intelligibility score she did.
Conclusion
This chapter outlined the context of the research and the subject’s linguistic and
educational background. A rationale was provided for the use of a case study to
best answer the research questions in a manageable setting. In addition, the methods used
to gather and analyze data were presented and explained. In Chapter Four, the results of
the data collection are presented, analyzed, and interpreted to determine connections
between learner attitude and background and intelligibility.

CHAPTER FOUR: RESULTS
The purpose of this study is to uncover the factors that lead to intelligibility issues
for a native Vietnamese woman tested as proficient in English. In this chapter I will first
detail the participant’s background information gathered from the Personal Information
Survey. Next, I will describe the participant’s attitude toward learning North American
English as estimated by the Pronunciation Attitude Inventory. I will also discuss the results
from the final part of the interview: the specific pronunciation background and perception
questions. I will then share the results of the speech sample analysis, describing the most
noticeable inaccuracies from both the formal diagnostic and the free speech portions.
Finally, I will analyze the interview information in conjunction with the speech data in
order to hypothesize how and why different factors influence the participant’s
pronunciation. Past and current research will continue to be cited as I seek to answer the
research question: What factors have led to intelligibility issues for the participant in this
case study?
Participant Information
In this section, I will describe the participant of this study using the PIS and PAI
(see Appendix A). The participant will continue to be referred to as Anh in order to
maintain her anonymity.
Anh was born in Ho Chi Minh City in Vietnam. She grew up with her mother,
father, and brother, who is approximately two years older than she. While in Vietnam, her
language was limited to Vietnamese, with the exception of one yearlong English course at her
school. Her recollection was that the class was oriented more toward British English than
NAE, and she was limited to using English only within the confines of the class, so she
retained little.
In February 2006, Anh’s family moved to Iowa to join extended family. Her uncle
had moved to the U.S. years before, married a native Iowan woman, had children, and
begun the process of applying for paperwork to bring some of his family over to join him.
When Anh’s family first arrived, they lived with the uncle and his family for
approximately a year. Anh’s aunt and cousins spoke little to no Vietnamese, so she was
immersed in a bilingual environment at home for some time: hearing English from her
cousins and aunt, communicating in Vietnamese with her immediate family and uncle.
Anh and her brother were immediately enrolled at a nearby school with a relatively large
ELL population, including other Vietnamese students, taking all ELL-based classes. At
the end of that year, there was a change in her district. The district in which the family
resided had not had an ESL program, so the children had been attending school in a different one.
However, the home district began an ESL program for the next year, so Anh began her first full year as a
freshman at a new school, this time attending with her cousin, but with a much smaller
ELL population; her brother was the only other Vietnamese-speaking student at that
time. Anh began taking more mainstream classes with native speakers of English, so she
had to use English throughout the day for both academic and socializing purposes.
Anh graduated in 2010 and has now completed two years of college at a local
community college, working toward becoming a pharmaceutical technician. She took
many ESL classes her first year of college and so is on schedule to finish her program
next year. She also works at a nail salon, a job she has held for approximately three years.
Anh’s family moved out of their relatives’ house within the first year, and she continues
to live with just her immediate family members: her mother, father, and older brother.
The amount of exposure to English for Anh depends greatly on the setting. When
at school as a full-time student, she almost exclusively uses English to communicate with
her classmates and professors. English is also her predominant language for daily errands
around the community. At work, Anh uses English to interact with most of her customers
and Vietnamese to interact with her native-Vietnamese co-workers. At home with her
immediate family, Vietnamese is the natural choice to interact with her parents and
brother in their native tongue. Anh’s uncle has been successful in supporting additional
members of his extended family in their efforts to immigrate to the U.S., so Anh now
has more extended family members in the near vicinity of her home, and she mostly speaks
Vietnamese with them. When asked to assign a percentage to the amount of time she uses
English during the day, Anh said she splits her time 50/50 between English and Vietnamese
while at work. During the school year, she estimated that she uses English more: 75% of
the time. During the summer and breaks, she surmised this percentage would be much
lower but would vary considerably from day to day depending on her plans.
Anh progressed through the advanced ESL classes at her college and is
considered proficient now by those standards. To get her perspective on her English
skills, I asked her as a part of the PIS to rate them on a scale of one to five,
with five being the best and one being the worst, in four modalities: reading, writing,
listening, and speaking. She rated listening, speaking, and reading the same, with a score
of four, meaning she felt confident in her abilities. She assigned writing a three because
she said she sometimes makes grammatical mistakes.

On the PAI, which gauges her attitude toward learning English pronunciation, Anh
scored 39 out of a possible 45. This means that Anh has an overall positive
attitude toward improving English pronunciation. She assigned the top score that shows
the strongest desire for intelligible English pronunciation on many of the items: good
pronunciation is important to me; I believe I can improve my pronunciation skills; I try to
imitate native speakers of English; I believe my teacher should teach pronunciation more.
She strongly agreed with all of the above statements, illustrating her desire for
intelligibility, and her interest in continuing to work toward that goal.
Speech Samples
The speech sample portion began with the formal diagnostic reading. First, Anh
had time to listen to me read the diagnostic sample at a normal rate of speed as she
followed along. Then she had time to review, ask questions, and practice pronunciation of
some words before the recorded reading. When she was ready, she read and recorded the
sample. The next portion was the informal free speech topic. I gave her the list of options
to review, and she selected the topic she felt most comfortable discussing. If she got stuck
and needed help elaborating, I prompted her with additional questions in order to acquire
the most complete sample of her natural conversational pronunciation.
Considering both samples, I determined the most markedly noticeable segmental
and suprasegmental feature errors and detailed the findings in this section. I then gave an
overall score for segmental features and suprasegmental features using the abbreviated
Speaking Performance Scale (SPS) (Singer, 2006, adapted from Celce-Murcia, et al.,
1996), condensing the original SPS to focus solely on the pronunciation areas of
segmental and suprasegmental features (see Table 4.1).

Table 4.1
SPS
Rating system used for assessment of segmental and suprasegmental features

Rating  Segmental                        Suprasegmental
4       Rarely mispronounces             High degree of fluency; effortless, smooth
3       Accent may be foreign;           Speaks with facility, rarely
        never interferes                 has to grope…
2       Often faulty but intelligible    Speaks with confidence
        with effort                      but not facility, hesitant
1       Errors frequent, only            Slow, strained except for
        intelligible to NS used to       routine expressions
        dealing with NNS
0       Unintelligible                   So halting that
                                         conversation is impossible

Singer, 2006, adapted from Celce-Murcia, Brinton, & Goodwin (1996).
On a scale of 0 to 4, the participant has one score for segmental features and a second
score for suprasegmental features for each speech sample: eight points total for each
speech sample. These scores are combined for a total of sixteen points possible.
Segmental Features
The segmental feature most immediately noticeable to me was the frequently
dropped word endings. While NAE speakers drop endings when connecting speech, the
dropped endings in the sample were noticeably different. ‘Have’ became ‘ha,’ and
‘speakers’ (plural) was reduced to the singular. Sometimes one ending was dropped
and replaced with a different one: ‘mastered’ became ‘masters’. The dropped endings
occurred frequently and consistently throughout both speech samples. However, they didn’t
seem to be the cause of intelligibility issues most of the time. In examples like the
singular ‘speaker’ and the past tense ‘mastered’ becoming present, I still understood the
words she was saying and the overall message. Had I been a listener unaware of
her background, these pronunciation errors would likely have led me to believe she made
grammatical errors rather than pronunciation ones. Still, there were a few instances, such
as the ‘have’ example, where the ease of comprehensibility was affected.
Other segmental errors involved specific phonemes. In her informal sample, she
talked about wearing a ‘hoodie’, but instead of using the /ʊ/ phoneme, she used /uw/. I
also noticed a tendency to make /iy/ phonemes as in beat sound closer to /ɪ/ as in bit. The
/θ/ and /ð/ (voiceless and voiced th) were spoken as /d/ throughout; these were phonemes
I recall her working on with the speech pathologist years ago when she was my
student. While I recall her physically being able to make the phonemes, she had almost
two years of pronouncing them as /d/ before she was taught how to bring her tongue
against her top teeth. I can’t be certain whether my awareness of this habit made it easier
for me to comprehend, but the lack of /θ/ and /ð/ did not interfere with
intelligibility for me. Overall, these phonemic errors seemed to impair intelligibility very little
given the surrounding context. However, confusion could be caused by words with a
minimal pair that might also fit the context (e.g., She beat/bit her little brother.).
Her segmental performance from the two speech samples was measured to be
somewhere between a 2: often faulty but intelligible with effort, and a 3: accent may be
foreign but never interferes. There were segmental features that were often faulty, but a
lot of effort wasn’t needed for them to be intelligible. The informal speech sample was much
more comprehensible, possibly because she was using words with which she was more
comfortable. Therefore, the formal diagnostic passage was rated a two and the informal
sample a three for segmental features, for a total of five out of eight points possible.
Suprasegmental Features
One of the most perceptible suprasegmental errors in the diagnostic sample was
misplaced word and sentence stress. For example, the stressed syllable for accent is
ACcent, but Anh repeatedly stressed the second syllable, acCENT, in the diagnostic
sample. I noted that this error was more prevalent, especially in regard to
sentence stress, in the diagnostic sample; it did not seem to be as much of an issue in the
informal free speaking sample. It is difficult to isolate the exact cause, but the more accurate stress is likely to
have contributed to my ease of comprehensibility for the informal speech sample
compared to the formal diagnostic one.
There were also instances of epenthesis in the diagnostic sample. In the word
‘native’, which is a word the participant knows well, she inserted a syllable in the middle,
with the result that it sounded like ‘negative’.
Overall, the diagnostic sample took much greater effort to be comprehensible.
The suprasegmental scoring options on the SPS didn’t seem to fit well with the
observations: she spoke with confidence and her speech flow wasn’t hesitant, but it also
was very difficult to understand at times. Conversely, the informal sample was easily
comprehensible, and, while not flawless, clearly earned a three. Anh’s overall score for
suprasegmental features is a five out of eight.
Summary of Speech Data
Segmental and suprasegmental features were each rated five out of eight
possible points. Therefore, Anh attained a total of ten out of 16 possible points.
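The score arithmetic in this chapter can be checked with a short sketch. The Python below is illustrative only and is not part of the study’s method; the dictionary layout and function names are assumptions made for this example. The ratings are those reported in this chapter, except that the diagnostic sample’s suprasegmental rating of two is inferred from the informal rating of three and the reported suprasegmental total of five.

```python
# Abbreviated SPS: each feature is rated 0-4 for each speech sample.
MAX_RATING = 4

# Ratings reported in this chapter. The suprasegmental rating for the
# formal diagnostic sample (2) is an inference: reported total 5 minus
# the informal rating of 3.
ratings = {
    "formal diagnostic": {"segmental": 2, "suprasegmental": 2},
    "informal free speech": {"segmental": 3, "suprasegmental": 3},
}


def feature_total(all_ratings, feature):
    """Sum one feature's ratings across both speech samples (max 8)."""
    return sum(sample[feature] for sample in all_ratings.values())


def sample_total(all_ratings, sample_name):
    """Sum both feature ratings within one speech sample (max 8)."""
    return sum(all_ratings[sample_name].values())


def overall(all_ratings):
    """Return (points earned, points possible) across all ratings."""
    earned = sum(sum(sample.values()) for sample in all_ratings.values())
    possible = MAX_RATING * sum(len(sample) for sample in all_ratings.values())
    return earned, possible


segmental = feature_total(ratings, "segmental")
suprasegmental = feature_total(ratings, "suprasegmental")
earned, possible = overall(ratings)

print(f"Segmental: {segmental} of 8")
print(f"Suprasegmental: {suprasegmental} of 8")
print(f"Overall: {earned} of {possible} ({earned / possible})")
for name in ratings:
    print(f"{name}: {sample_total(ratings, name)} of 8")
```

Grouping the same ratings by sample rather than by feature shows the gap between the formal diagnostic reading and the informal free speech sample.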
Discussion of Data
Table 4.2
Data Summary
Anh: 6 ½ years in U.S.

% of day in English: 75% during school year; ~25% during summer and breaks
Attitude Score (PAI): 39 out of 45
Speech Sample Score: 10 out of 16
To review, Anh has been in the country learning NAE for 6 ½ years. She arrived
at age 16 and is now 22, attending college, working at a nail salon, and living at home
with her parents and brother. Anh’s attitude toward English pronunciation was rated as
very positive by the PAI, and her overall level of English is proficient. Anh’s total speech
score was low-average: 10 out of 16, or 0.625 out of 1.00.
Interestingly, separating the formal diagnostic sample from the informal sample
reveals that overall comprehensibility was measurably better in the informal sample. This
goes against my original hypothesis that the formal diagnostic sample would either rate
the same or higher since Anh listened to it read first and then had time to practice. I
assumed a planned reading would be easier to apply previous pronunciation knowledge to
than spontaneous unplanned discourse.
In the following sections, I will use the information gathered in the interview to
examine the factors that I believe positively and/or negatively contributed to the
participant’s pronunciation intelligibility. The research presented in Chapter Two:
Literature Review will be revisited as I elaborate further on my conclusions, using the
research to support or refute each factor’s significance in this particular case.

Age
Although Anh had a limited English class in Vietnam, she did not truly become
immersed in the sounds of NAE in an authentic setting until she moved to the U.S. at
age 16. Following the argument of a critical period for pronunciation, Anh’s age was
likely to make it more difficult for her to acquire and produce the new sounds of NAE.
Given that intelligibility issues remain despite Anh’s positive attitude and motivation as shown with the PAI and her immediate
interaction upon arrival with native English speakers both at school and at home with her
cousins, there does appear to be validity to the critical period with regard to
pronunciation in this case. Still, it is worth repeating that a native NAE accent is not
necessary for intelligibility (Scales, et al., 2006).
First Language
The contrast between English and Vietnamese is great, and this contrast often is
displayed when native speakers of one language attempt to learn the other. Nguyen
(2002) outlined how the differences at the segmental level often result in errors such as
dropped final consonant sounds and trouble pronouncing /ð/ and /θ/, the voiced and
voiceless ‘th’. These issues anticipated by Nguyen were exhibited in the speech samples,
leading to the determination that first language is a definite factor in her current
pronunciation. The segmental errors occurred despite the fact that Anh recounted
working on these areas with ESL teachers and speech pathologists in the past. When
asked if she thought this work was effective, she said that it was. However, as Meng, et al.
(2009) described, segmental and suprasegmental features can be affected by language
transfer, and the interference effects can become fossilized with age. The targeted work
on pronunciation at my school district didn’t begin until almost two years after Anh’s
arrival in the U.S. The type of pronunciation instruction prior to that time was not clearly
recollected, posing a challenge as we examine the next factor.
Prior Instruction
As previously stated, Anh’s earlier recollections of pronunciation instruction were
somewhat vague. She recalled working on saying the ‘be’ verbs, and the phonemes /ð/,
/θ/, /r/, and /z/. She commented that most of her pronunciation instruction was embedded
in the class, where the teacher would provide correction as an issue arose. When asked
how she knew what her main pronunciation problems were, she replied that she learns from
speaking with NAE speakers and the feedback they give to her. I found this interesting
because my own recollection was that, in what little time we had, we examined
suprasegmentals, reviewing how NAE speakers stress certain words in a sentence,
teaching her to listen for which syllable is stressed, and pointing out intonation. I had hoped
some of these lessons were retained since correctly placed sentence stress has been
shown to improve listener comprehension. Perhaps more importantly, it has also been
found to be both teachable and learnable (Hahn, 2004; Levis & Levis, 2010). If the lessons did
have an effect, the effect was limited to the informal speech sample. Still, our time
outside of the larger class to do these lessons was very limited, and these aren’t the
pronunciation areas most linguistically untrained people identify as problem areas.
Anh’s recollection about the college-level ESL classes she has taken was that the
teacher would point out problems as they arose, similar to her other past
experiences. She has taken an ESL speaking class, but she described it as more of a
speech-type course where giving speeches in front of the class was the goal. As a whole,
focused instruction on pronunciation has been lacking for Anh, and such instruction could have made a
significant impact.
Language Exposure
Two facets of language exposure were examined for this factor: length of time in
the country learning English and daily interaction with the language. It is worth
distinguishing between these two, as some people might live in the country, but carry on
few interactions in the language. Other language learners live in communities surrounded
by people who share their first language. Depending on their work or educational setting,
language exposure can vary a great deal. For example, many parents of students in my
district have lived in the U.S. over ten years. Yet they work in a noisy factory where
communication is impossible, and then go home to their families where their L1 is,
understandably, the language spoken. Lack of comprehensible input could make
language exposure very low despite having lived in the country for ten years. Alternately,
children who move to the country and attend school may have much greater language
exposure through rich comprehensible input in one year than the adult who has been in
the country for ten.
Anh has now lived in the U.S. for six and a half years and has attended high
school or college for all of those years. Her percentage of language exposure during the
school year is approximately 75% due to her interaction with predominately English-
speaking peers and professors at college. During the summer, this rate drops significantly
as she works with fellow native Vietnamese women during the day and speaks
Vietnamese at home with her family. Although Anh’s exposure could be higher if she
lived with NAE speakers and interacted more in English at work, this rate seems to be
sufficient to allow her much comprehensible input. Based on her college exams, she has
had enough exposure to test as proficient in the language. Even though she has had a high
rate of language exposure with close to seven years in the country, this doesn’t seem to
be preventing intelligibility issues.
Aptitude
While I noted in the Literature Review that this study would not measure aptitude,
I do want to note Anh’s general progress with English. Anh was quite successful with her
overall English language acquisition. Having started her American education around the
early high school years, she attained satisfactory grades, qualifying her for National
Honor Society. Whether or not an aptitude for pronunciation is distinct from an aptitude
for language learning overall, a question still under debate, Anh has shown a satisfactory
aptitude for overall language learning.
Attitude and Motivation
The Pronunciation Attitude Inventory measured Anh to have a positive attitude
and motivation toward acquiring English pronunciation, with Anh receiving 39 out of 45
possible points. In the study by Elliott (1995), a student’s motivation toward achieving
the target language’s pronunciation was the principal variable in their accuracy of actual
pronunciation output. While Anh’s score cannot be compared with those of other speakers with
different PAI scores, it can be assumed that Anh’s positive attitude has been a boon to
her pronunciation rather than a hindrance. She strongly agreed that she listens to NAE
speakers to hone her skills, showing that her efforts are ongoing. Yet one item to which I
was happy to see she didn’t assign a ‘strongly agree’ was, ‘I would like to sound like a
native English speaker when I speak English.’ She instead assigned a ‘somewhat agree.’
Throughout this study, I have wanted to make clear my belief that acquiring a native
accent is not the goal. Rather, intelligibility and comprehensibility should be the intent
for speakers of a new language. As stated by Derwing and Munro (1997), accentedness
should be the lowest priority in pronunciation teaching, with priority given to
intelligibility and comprehensibility. Therefore, out of the 45 possible points, a score of
39, especially given the items Anh didn’t answer with the top score, demonstrated
that Anh was highly motivated, but also is possibly aware that a native accent doesn’t
need to be her goal.
Conclusion
In the concluding chapter, I will discuss the implications and limitations of this
study. I will review the major findings, how these findings fit with the scholarly literature
reviewed in this study, and how these findings might be used in an educational setting or
in future research. Lastly, considering the original catalyst that drove me to this case
study, I will reflect on how this experience will impact my future interactions both with
similar students and with colleagues as I endeavor to spread the message of the
importance of incorporating early quality pronunciation instruction for students with a
background similar to Anh’s.

CHAPTER FIVE: CONCLUSION
The goal of this study was to examine the factors influencing the English
pronunciation intelligibility of a young native Vietnamese woman. In the previous
chapter, the results pertaining to each factor were analyzed and discussed. In this final
chapter, I will summarize the major findings of the study, and discuss the limitations and
implications of the results. Lastly, I reflect on the experience of conducting this study and how
it applies to my years of teaching. These reflections are intertwined with thoughts for
future research, as I continue to question how best to meet the pronunciation needs of
students such as Anh.
This study uncovered factors that influence the pronunciation intelligibility of an
English-proficient Vietnamese woman. The participant’s age at arrival, first language
influence, and lack of directed pronunciation instruction appear to have all contributed to
intelligibility issues. On the other hand, the participant’s positive attitude and motivation
toward acquiring clear pronunciation and language exposure have likely been positive
influences. The participant does exhibit pronunciation errors that interfere with
intelligibility, but intelligibility improves when the participant is allowed to speak freely
about familiar topics and select her own words. The results do appear to support the idea
discussed in Chapter Two of a critical period, specifically for pronunciation. Meng et al.
(2009) stated first language interference occurs at both the segmental and suprasegmental
level, and can become fossilized with age. The data from this study also seems to support
this interference effect. Conversely, Singer (2006) found the most definitive factor in
Somali participants’ pronunciation was language exposure. Anh’s intelligibility issues
remained despite a high level of language exposure. Anh’s results also deviated from
Elliott (1995) in that Anh’s high level of motivation did not prevent her intelligibility
issues. Rather than disagreeing with research on language exposure and motivation, this
study demonstrated that, while essential, these factors are not always enough to guarantee
intelligibility.
Limitations of this Study
The participant and I have a history, as I am her former teacher; in many ways,
this is a great strength. Reciprocity is greatly enhanced by the relationship with the
participant. However, this established relationship could have altered the data in a few
ways. She has stated that I am the main teacher who spent time working on
pronunciation, and I wonder if that might have had an effect on her answers. She may
have felt unduly influenced to answer more positively about her attitude toward
pronunciation, or about her past experience in my classroom. Because of this, conducting
a study with a proctor unknown to participants could be a consideration for further
studies. Furthermore, the original SPS (Celce-Murcia, et al., 1996) specifically
acknowledges the fact that ESL teachers tend to be better at understanding non-native
accents. The description tied with a score of ‘1’ is written “Errors frequent, only
intelligible to NS (native speaker) used to dealing with NNS (non-native speaker).” I
scored Anh as a two in some areas. However, perhaps as a NS used to dealing with NNS,
what I found intelligible might have actually been unintelligible to someone else. A
different rater might therefore have assigned a lower score.
An additional limitation could be the variety of ESL programs and teachers the
participant encountered and her inability to recall details about each. Information on prior
instruction was based entirely on the participant’s best recollection at the time of the interview.
Gathering detailed information on the type of prior pronunciation instruction in every
program was nearly impossible. As I posit that more effective pronunciation instruction
would have resulted in higher intelligibility scores, I also need to recognize that
recollections are not always thorough and sometimes lack accuracy. Therefore, her
memory is a legitimate limitation.
Lastly, while factors affecting pronunciation have been well studied, could there
be others that we haven’t taken into consideration yet? As the wealth of research in this
area continues to grow, additional factors affecting pronunciation may be brought to the
forefront. Those yet-to-be-discovered factors could play a role in Anh’s situation and
were not analyzed in this study.
Implications
The results suggest that Anh’s age of arrival, first language influence, and lack of
global pronunciation instruction have all contributed to intelligibility issues. For the most
part, nothing can be done about age of arrival and first language influence. The variable
that educators do have control over is the efficacy of their pronunciation instruction.
While this might sound like a straightforward solution, the problem and solutions are
much more complex, as I will discuss in the reflection. Still, the implication of this study
is clear: Neglecting pronunciation can be a great detriment to a language learner.
For ELLs, especially those who are no longer young children, English
pronunciation needs should be determined early upon their arrival to the country.
However, starting early doesn’t negate the need to continue with advanced or even proficient
students. Global instruction that includes the teaching of prosody has been shown to
significantly improve comprehensibility of adult learners (Derwing & Rossiter, 2003).
Unfortunately, educators are not able to rely on textbooks to guide them as very few texts
include a full range of oral fluency practice (Rossiter, et al., 2010). Developing a concrete
understanding of pronunciation research and the factors that affect pronunciation
provides a solid base on which to become an effective teacher: one who addresses every
possible aspect of the student’s language to ensure future success.
Future Research
The focus of this case study on one participant precluded the ability to draw
comparisons amongst participants. A larger-scale study of Vietnamese learners could
allow for more generalizations to be made. I would like to see comparisons of distinct
factors, such as attitude, and any correlation they have with the speech sample scores of a
large group of participants. Similarly, additional research could be done attempting to
control for specific factors. Gathering a large group of Vietnamese learners of English of
similar backgrounds in all areas but one could provide quantifiable data on the impact of
each separate factor.
In addition, a comparison across first languages to determine which L1 group
struggles most with intelligibility could help educators anticipate learners’ challenges and
be prepared with interventions. However, educators might still encounter the problem of
an inflexible schedule that doesn’t allow time for pronunciation instruction, particularly
for just one or two students. This is why my interests are perhaps most piqued by future
research in computer-assisted pronunciation technology (CAPT).
According to Tanner and Landon (2009), “lack of qualified teachers results in a
lack of quality pronunciation instruction, suggesting a need for materials that enable
learners to direct their own pronunciation learning outside the classroom” (p. 53). Their
study showed language learners made significant gains in
prosody areas impacting intelligibility by use of a self-directed computer-assisted
pronunciation program. This gives me hope for a workable solution. With the paucity of
comprehensive teacher training in pronunciation and the successful results of the Tanner
and Landon study, could computer technology help make pronunciation instruction more
effective? CAPT has the potential to maximize the effectiveness of pronunciation
instruction even more given that it allows more motivated students to work
autonomously. Since this particular case study showed many variables working positively
for the participant, this is an area where more research could be done, particularly with
Vietnamese students whose language exposure began in their teens or later. Further
research could show if this is a viable solution for educators working in a variety of
settings.
Reflection
While I spent a great deal of time reflecting on this study’s question as I was
going through the process, I’ve actually been reflecting on the overall question of
pronunciation practice since I became a teacher six years ago. As I gathered the data, I
continued to believe that while other factors, such as age and
first language influence, definitely had an impact, prior instruction was the biggest variable within our control. So
then what is the solution? This might appear to be an easy question at first – i.e., improve
teacher training – but the more I consider the question, the less simple it seems. As I’ve
supported throughout this capstone, research shows that pronunciation instruction is
needed for a number of students. Yet most teacher training programs do not educate
teachers on effective pronunciation teaching strategies (Breitkreutz et al., 2001). So,
naturally, one of the first steps might be to mandate teaching research-based
pronunciation instruction methods in all post-secondary institutions offering ESL
teaching licenses and endorsements.
I also have spent a lot of time considering what a quality pronunciation class
would look like. From my coursework at Hamline and research gathered to improve my
craft and for this study, a clearer picture has emerged of an effective pronunciation
course. The goals for a class should be aimed at four key components: functional
intelligibility – the ability to make oneself relatively easily understood; functional
communicability – the ability to meet the communication needs one faces; increased self-
confidence in speaking; and the development of speech monitoring abilities and speech
modification strategies for use outside the classroom (Morley, 1998; Cunningham Florez,
1998). Again, functional intelligibility and communicability are the key pronunciation
goals: the attainment of native-like speech – which varies greatly when considering the
variety of Englishes globally – should not be the goal (Otlowski, 1998). Meaningful
pronunciation practice as part of the larger communication class, with an emphasis on
suprasegmentals, and a clear link between listening comprehension and pronunciation
help create an opportune learning environment.
Supposing teacher training programs improved, that still leaves me thinking of the
old adage, ‘practice makes perfect.’ Many teachers will teach elementary students and the
need for pronunciation intervention may never arise. From my experience at the
secondary level, many students’ accented speech does not interfere with intelligibility,
and little work might be needed. My student population this year is without any language
learner whose accent interferes with their intelligibility, and I know that if I have a new
student next year needing intelligibility interventions, I’ll need to relearn what I’ve
forgotten from my college coursework.
Even with these two pieces in place, the solution still isn’t always an easy one,
especially where time is limited. Many districts have a relatively small number of ELLs,
and teachers are spread between multiple classrooms or even buildings, as I am. Time
with students is very limited, and what time is available must serve a classroom with a
variety of backgrounds and needs. As mentioned previously, the majority of my students
have not needed much pronunciation work. This could be due to a variety of factors as
this case study demonstrated, but I believe the two that have had the greatest impact have
been age of arrival and first language influence.
Given constraints such as these, which numerous teachers face, I’m left
wondering: what is a workable solution? As I reflect, I am heartened and excited to see
the rising prevalence of computer-assisted pronunciation teaching. Could this provide a
solution where student learning can be individualized and students hold the key? If this
becomes the case, then the deciding variable shifts away from pronunciation
instruction and toward learner attitude and motivation. Equipping students with all the
tools to be successful and letting them determine what they do with those tools is the
essence of education. The principal of the high school where I’m employed has a saying
that he is known for: “Teachers open the door; the choice to enter is yours.” I look
forward to technology that allows attitude and motivation to be the main variables in
intelligibility. However, with the ongoing marginalization of pronunciation and paucity
of promising research and technologies circulating amongst ESL teachers, I’m afraid too
many students are arriving at closed doors.
Summary
My motivation for this study began with one student whose needs differed from
those of the rest of her class. Despite her effort, which seemed greater than that of many
other students, she consistently reported that listeners had trouble understanding her speech.
In an effort to help her be successful, I’ve examined the factors that have led to her
intelligibility concerns. This interested me a great deal as I wanted to allow others to see
her full potential without being distracted by intelligibility issues or misled to believe
accented speech reflects English proficiency. The solution to the latter misperception
rests on the general public being educated about second language learning; the solution to
the former rests on quality ESL programs cultivating quality teachers, who in turn
approach each student with the awareness of the tools he or she will need to be
successful. For many older language learners especially, the toolset must include
effective pronunciation instruction.

APPENDIX A
Interview Questions and Prompts

Participant Interview – Part A
Personal Information Survey
Personal:
Date: ______________
Age: _____________ Sex: Male Female
Native Country: _______________________ 
How long have you lived in the United States?
___________________________________
Please list other countries that you have lived in and time you were there:
_________________________________________________________
Language
What is your first language? ____________________________________
Do you speak other languages (not English)?
________________________________________________________________________
If yes, please answer the next questions:
• Where did you learn it?
____________________________________________________
• How long have you studied it?
______________________________________________
• Do you still use it today? When?
_____________________________________________  
English  
• Did you study English before you came to the United States? Yes No
If yes please answer the next questions:
• Where did you learn it?
____________________________________________________
• How long did you study it?
_________________________________________________
• How often did you use it?
__________________________________________________
What level of English classes have you taken at this point?
______________________________________________________________________
How would you rate your English from 1 to 5 (with 5 = best)?
Listening 1 2 3 4 5 Speaking 1 2 3 4 5 Writing 1 2 3 4 5 Reading 1 2 3 4 5
Education
Please describe the education that you completed in your native country or other
countries before you came to the United States.
1. Primary education: yes no -How many years? ________
2. Secondary education: yes no -How many years? ________

3. Post-Secondary education? Yes no -How many years? ________
If you have post-secondary education, please describe where you studied, what you
studied and for how long.
Where? _______________________________________________________
What? ________________________________________________________
How long? ____________________________________________________
Experience
Do you have a job now? Yes No  What is your position? _____________________
How long have you worked there? _______
Are there other Vietnamese people who work there?
______________________________________
Do you speak Vietnamese with them?
__________________________________________________
How often do you speak English at work? a little sometimes often
What percentage of your day do you speak English at work? (e.g., 50%, 75%)
______________
How often do you speak English when you are out in the community?
a little sometimes often
How many hours do you think you speak English each day? _______________________
Where do you speak English (e.g., doctor’s office, grocery store, bank)?
________________________________________________________________________
Family 
Who do you live with? Please put a number after each family member:
Husband/Wife ( ) Children ( ) Parents ( ) Sister ( ) Brother ( ) Grandparents ( )
Others ( ): ______________________
What level is the English of your spouse, better, worse or the same as yours?
_______________
Do you have children? Yes No How many? _____________________
What are their ages?
________________________________________________________________________
Do they speak English fluently? Yes No Do they speak English at home? Yes No
Do you speak English with them at home? Explain:
________________________________________________________________________
What percentage of each day do you think that you speak English with your family?
________________________________________________________________________
*Adapted from Singer, 2006

Participant Interview – Part B
Pronunciation Attitude Inventory (PAI)
Pronunciation Attitude Inventory
English Attitude
Please answer the questions using the numbers below, circle the number that fits your
feelings best: 
5 = Strongly agree 
4 = Somewhat agree
3 = Neither disagree nor agree
2 = Somewhat disagree 
1 = Strongly disagree
1. I would like to sound like a native English speaker when I speak English. 5 4 3 2 1
2. Good pronunciation in English is important to me. 5 4 3 2 1
3. I will never be able to speak English with a good accent. 5 4 3 2 1
4. I believe I can improve my pronunciation skills in English. 5 4 3 2 1
5. I believe my teacher should teach pronunciation more. 5 4 3 2 1
6. I try to imitate native speakers of English as much as possible. 5 4 3 2 1
7. For me, communicating is much more important than sounding like a native English
speaker. 5 4 3 2 1
8. Learning good pronunciation is NOT as important as learning grammar and
vocabulary. 5 4 3 2 1
9. Sounding like a native English speaker is VERY important to me. 5 4 3 2 1
*Adapted from Elliott, 1995

Participant Interview – Part C
Pronunciation Experiences and Perception (PEP)
Pronunciation Interview Portion

Pronunciation
1. Have you studied pronunciation during any of your time learning English? Yes No
2. What did you work on?
_______________________________________________________________________
_______________________________________________________________________
3. How long did you study it?
_______________________________________________________________________
_______________________________________________________________________
4. Did you think it was effective? Explain.
_______________________________________________________________________
5. Do you know what your main pronunciation problem areas are? How can you tell?
6. When you have problems communicating in English, is it more likely because of a
language problem or a pronunciation problem?
7. If there were an effective online pronunciation program, I would dedicate time to
practice pronunciation. 5 4 3 2 1
8. How much time would you estimate you’d spend independently practicing
pronunciation if there were an effective available program? (e.g., 20 min./day, 20
min./week)
_______________________________________________________________
*Parts adapted from Derwing and Rossiter, 2002

APPENDIX B
Speech Samples
Speech Sample – Part A
Diagnostic Passage
If English is not your native language, people may have
noticed that you come from another country because of your
“foreign accent.” Why do people usually have an accent when they
speak a second language? Several theories address this issue. Many
people believe that only young children can learn a second
language without an accent, but applied linguists have reported
cases of older individuals who have mastered a second language
without an accent.
Another common belief is that your first language influences
your pronunciation in a second language. Most native speakers of
English can, for example, recognize people from France by their
French accents. They may also be able to identify Spanish or
Arabic speakers over the telephone, just by listening to their
pronunciation.
Does this mean that accents can’t be changed? Not at all! But
old habits won’t change without a lot of hard work, will they? In
the end, the path to learning to speak a second language without an
accent appears to be a combination of hard work, a good ear, and a
strong desire to sound like a native speaker. You also need
accurate information about the English sound system and lots of
exposure to the spoken language. Will you manage to make
progress, or will you just give up? Only time will tell, I’m afraid.
Good luck, and don’t forget to work hard!
*From Celce-Murcia, Brinton, & Goodwin, 1996

Speech Sample – Part B
Informal Speech Topics

1. Describe your family.
2. Tell me about the city where you were born.
3. Tell me about your favorite city/place.
4. Tell me about your best day when you were a child.
5. Tell me about your job.
6. Tell me what you would like to do in the future.
7. Tell me about the most important day in your life.
8. Tell me about your first day in the United States.
9. Tell me about your most embarrassing day.
10. Tell me about your favorite thing to do in your free time.
*From Singer, 2006
APPENDIX C
Vietnamese Alphabet
From Ngo and Gainty, 2004

REFERENCES
Baier, E. (2008, October 28). Speaking without an accent. Minnesota Public Radio News.
Retrieved from http://minnesota.publicradio.org/display/web/2008/10/20/
accent_reduction/
Breitkreutz, J. A., Derwing, T. M., & Rossiter, M. J. (2001, Winter). Pronunciation teaching
practices in Canada. TESL Canada Journal, 19(1), 51-61.
Byrne, B., Butcher, A., & McCormack, P. (1996). The Speech Rhythm of Vietnamese Speakers
of English. Australasian Speech Science and Technology Association, 427-432.
Celce-Murcia, M., Brinton, D., & Goodwin, J. (1996). Teaching pronunciation: Reference for
teachers of English to speakers of other languages. Cambridge: Cambridge University
Press.
Cohen, J. (2007, November). Suprasegmentals: Pronunciation practice for your EFL classroom.
The Internet TESL Journal, 13(11). Retrieved from http://iteslj.org/Techniques/Cohen-
Suprasegmentals.html
Cunningham Florez, M. (1998, December). Improving Adult ESL Learners’ Pronunciation
Skills.
Derwing, T. M. (2003). What do ESL students say about their accents? Canadian Modern
Language Review, 59(4).
Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility and comprehensibility: evidence
from four L1s. Studies in Second Language Acquisition, 1, 1-16.
Derwing, T. M., & Munro, M. J. (2005, September). Second language accent and pronunciation
teaching: a research-based approach. TESOL Quarterly, 39(3), 379-397.
Derwing, T. M., Munro, M. J., Thomson, R. I., & Rossiter, M. J. (2009). The relationship
between L1 fluency and L2 fluency development. Studies in Second Language
Acquisition, 31, 533-557. doi:10.1017/S0272263109990015
Derwing, T. M., & Rossiter, M. J. (2002). ESL learners’ perceptions of their pronunciation needs
and strategies. System, 30, 155-166.
Derwing, T. M., & Rossiter, M. J. (2003). The effects of pronunciation instruction on the
accuracy, fluency, and complexity of L2 accented speech. Applied Language Learning,
13(1), 1-17.
Elliott, A. R. (1995, Fall). Field independence/dependence, hemispheric specialization, and
attitude in relation to pronunciation accuracy in Spanish as a foreign language. Modern
Language Journal, 79(3), 356-371.
Foote, J. A., Holtby, A. K., & Derwing, T. M. (2011, Winter). Survey of pronunciation teaching
in adult ESL programs in Canada, 2010. TESL Canada Journal, 29.
Gilbert, J. B. (2008). Teaching Pronunciation: Using the Prosody Pyramid. New York:
Cambridge University Press.
Hahn, L. (2004). Primary stress and intelligibility: Research to motivate the teaching of
suprasegmentals. TESOL Quarterly, 38(2), 201-223.
Levis, J. (2005, September). Changing contexts and shifting paradigms in pronunciation
teaching. TESOL Quarterly, 39(3), 369-377.
Levis, J., & Levis, G. M. (2010, September). Authentic speech and teaching sentence focus.
Ames, IA: Iowa State University.
Meng, H., Tseng, C.-Y., Kondo, M., Harrison, A., & Visceglia, T. (2009). Studying L2
suprasegmental features in Asian Englishes: A position paper. Retrieved from
Morley, J. (1991, Fall). The pronunciation component in teaching English to speakers of other
languages. TESOL Quarterly, 25(3), 481-520.
Morley, J. (1998, January/February). Trippingly on the tongue: putting serious speech/
pronunciation instruction back in the TESOL equation. ESL Magazine, 1(1), 20-23.
Nguyen, N. (2008). Interlanguage phonology and the pronunciation of English final consonant
clusters by native speakers of Vietnamese (Unpublished master’s thesis). Ohio
University.
Nguyen, Q. V. (2006, May 15). Stress Errors Analysis in Vietnamese Students’ Reading Aloud
[Online forum message]. Retrieved from Research Forum, English Department, Hanoi
University: http://web.hanu.vn/en/mod/forum/discuss.php?d=104
Nguyen, T., & Ingram, J. (2006). Reduplication and word stress in Vietnamese. Queensland,
Australia: University of Queensland, Linguistic Program, E.M.S.A.H.
Nguyen, T. H. (2002). Vietnam: Cultural Background for ESL/EFL Teachers (Doctoral
dissertation, Boston University, Boston, MA).
Nunan, D. (1992). Research Methods in Language Learning. Cambridge: Cambridge University
Press.
Otlowski, M. (1998, January). Pronunciation: What are the Expectations. The Internet TESL
Journal, IV(1).
Pittam, J., & Ingram, J. (1992). Accuracy of perception and production of compound and phrasal
stress by Vietnamese-Americans. Applied Psycholinguistics, 13.
Rossiter, M. J., Derwing, T. M., Manimtim, L. G., & Thomson, R. I. (2010, June). Oral Fluency:
The Neglected Component in the Communicative Classroom. The Canadian Modern
Language Review, 66(4), 583-606.
Scales, J., Wennerstrom, A., Richard, D., & Wu, S. H. (2006). Language learners’ perceptions of
accent. TESOL Quarterly, 40(4), 715-738.
Shoebottom, P. (2011). A Guide to Learning English. Retrieved April 10, 2011, from Frankfurt
International School website: http://esl.fis.edu
Singer, J. (2006). Uncovering factors that influence English pronunciation of native Somali
speakers (Unpublished master’s thesis). Hamline University, St. Paul, MN.
Tanner, M. W., & Landon, M. M. (2009). The effects of computer-assisted pronunciation
readings on ESL learners’ use of pausing, stress, intonation, and overall
comprehensibility. Language Learning & Technology, 13(3), 51-65.
Thompson, L. (1987). A Vietnamese Reference Grammar. Hawaii: University of Hawaii.
Vietnamese Language: Phonology. (2010). Vietnam. Retrieved from Multicultural Topics in
Communication Sciences & Disorders website: http://www.multicsd.org/
doku.php?do=show&id=vietnam
Vitanova, G., & Miller, A. (2002, January). Reflective Practice in Pronunciation Learning. The
Internet TESL Journal, 8(1). Retrieved from http://iteslj.org/Articles/Vitanova-
Pronunciation.html
Zielinski, B. (2003). Intelligibility in speakers of English as a second language. Reading
presented at 16th Educational Conference, Melbourne.
Zielinski, B. W. (2008). The listener: No longer the silent partner in reduced intelligibility.
System, 36, 69-84.

