
Contrastive analysis theory

1 Introduction

Narrowly defined, contrastive analysis investigates the differences between pairs (or small sets) of languages against the background of their similarities, with the purpose of providing input to applied disciplines such as foreign language teaching and translation studies. With its largely descriptive focus, contrastive linguistics provides an interface between theory and application. It makes use of theoretical findings and models of language description but is driven by the objective of applicability. Contrastive studies mostly deal with the comparison of languages that are socio-culturally linked, i.e. languages whose speech communities overlap in some way, typically through (natural or instructed) bilingualism.

Contrastive analysis and foreign language teaching

Different approaches to the phenomenon of language, and different linguistic theories and schools of thought, influence our methods of teaching. Structural linguistics and the behaviourist movement in psychology resulted in the audio-lingual method. The transformational approach, with its stress on the analytical element in language learning, reintroduced rational, cognitive methods. Regardless of our view of language, however, we must solve a whole series of problems in the process of teaching a foreign language. One of these problems is the relationship between the L1 (the learner's native language) and the L2 (the language to be learned).

Contrastive analysis is the systematic study of a pair of languages with a view to identifying their structural differences and similarities. It was used extensively in the 1960s and early 1970s as a method of explaining why some features of a target language were more difficult to acquire than others. According to behaviourist theories, language learning was a question of habit formation, and this could be reinforced or impeded by existing habits. Therefore, the difficulty in mastering certain structures in a second language depended on the difference between the learners' mother tongue and the language they were trying to learn. The theoretical foundations for what became known as the Contrastive Analysis Hypothesis were formulated in Lado's Linguistics across Cultures (1957). In this book, Lado claimed that "those elements which are similar to the learner's native language will be simple for him, and those elements that are different will be difficult". While this was not a novel suggestion, Lado was the first to provide a comprehensive theoretical treatment and to suggest a systematic set of technical procedures for the contrastive study of languages. This involved describing the languages (using structuralist linguistics), comparing them, and predicting learning difficulties (acquisition, January 25th 2011). Thus, the comparison of languages is aimed at assisting language learning and teaching.

The goals of contrastive analysis can be stated as follows: to make foreign language teaching more effective and to find the differences between the first language and the target language, based on the assumptions that (1) foreign language learning is based on the mother tongue, (2) similarities facilitate learning (positive transfer), (3) differences cause problems (negative transfer, or interference), and (4) via contrastive analysis, problems can be predicted and considered in the curriculum.

However, not all problems predicted by contrastive analysis turn out to be difficult for students, and many errors that do turn up are not predicted by it. Larsen, et al (1992: 55) state that predictions arising from contrastive analysis were subjected to empirical tests. Some errors it did predict failed to materialize, i.e. it overpredicted. This prediction failure led to the criticism that the Contrastive Analysis Hypothesis could not be sustained by empirical evidence. It was soon pointed out that many errors predicted by contrastive analysis were inexplicably not observed in learners' language. Even more confusingly, some uniform errors were made by learners irrespective of their L1. It thus became clear that contrastive analysis could not predict learning difficulties and was only useful in the retrospective explanation of errors. These developments, along with the decline of the behaviourist and structuralist paradigms, considerably weakened the appeal of contrastive analysis. Fisiak (1981: 7) claims that contrastive analysis needs to be carried out in spite of some shortcomings because not all contrastive analysis hypotheses are wrong.
To overcome the shortcomings of contrastive analysis, it is suggested that teachers accompany contrastive analysis with error analysis, which is carried out by identifying the errors actually made by students in the classroom. Contrastive analysis has a useful explanatory role: it can still be said to explain certain errors and mistakes. Fisiak further explains that error analysis, as part of applied linguistics, cannot replace contrastive analysis but only supplement it. Schackne (2002) states that research shows contrastive analysis may be most predictive at the level of phonology and least predictive at the syntactic level.
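At the level of phonology, the comparison step of contrastive analysis can be pictured as a comparison of sound inventories. The sketch below is only an illustration: the two inventories are made-up toy subsets (not complete descriptions of any language), and the names `l1_inventory`, `l2_inventory` and `contrast` are hypothetical.

```python
# Toy inventories: hypothetical subsets chosen only to illustrate the idea.
l1_inventory = {"p", "b", "t", "d", "k", "g", "f", "v", "s", "z", "m", "n", "l", "r"}
l2_inventory = {"p", "b", "t", "d", "k", "g", "f", "v", "s", "z", "m", "n", "l", "r", "θ", "ð"}


def contrast(l1, l2):
    """Split the L2 sounds into those shared with L1 and those absent from L1."""
    shared = l1 & l2            # candidates for positive transfer
    absent_in_l1 = l2 - l1      # candidates for interference
    return shared, absent_in_l1


shared, predicted_difficult = contrast(l1_inventory, l2_inventory)
print(sorted(predicted_difficult))
```

A prediction of this kind is only a starting point. As the criticism above notes, it has to be checked against the errors learners actually make.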

A counter-theory was error analysis, which treated second language errors as similar to errors encountered in first language acquisition, or what the linguists referred to as "developmental errors." By the early 1970s, this contrastive analysis theory had been to an extent supplanted by error analysis, which examined not only the impact of transfer errors but also those related to the target language, including overgeneralization (Schackne, 2002).

Conclusion

We may conclude that the aim of contrastive studies is not only a better understanding of linguistic structure but also applied deductions, meant to raise teaching above empirical and occasional practice and to outline fundamental teaching programmes based on scientific knowledge of the language. Contrastive analysis has laid emphasis on error analysis as a way to study the difficulties encountered by foreign language learners, and the findings of such studies can be very helpful in designing teaching materials. Contrastive analysis and error analysis are complementary, in the sense that the results obtained and the predictions made by contrastive studies are to be checked and corrected by the results obtained in error analysis.

LEXICON ADAPTATION FOR LVCSR: SPEAKER IDIOSYNCRACIES, NON-NATIVE SPEAKERS, AND PRONUNCIATION CHOICE

Wayne Ward, Holly Krech, Xiuyang Yu, Keith Herold, George Figgs, Ayako Ikeno, Dan Jurafsky
Center for Spoken Language Research, University of Colorado, Boulder

William Byrne
Center for Language and Speech Research, The Johns Hopkins University

ABSTRACT

We report on our preliminary experiments on building dynamic lexicons for native-speaker conversational speech and for foreign-accented conversational speech. Our goal is to build a lexicon with a set of pronunciations for each word, in which the probability distribution over pronunciation is dynamically computed. The set of pronunciations are derived from hand-written rules (for foreign accent) or clustering (for phonetically-transcribed Switchboard data). The dynamic pronunciation-probability will take into account specific characteristics of the speaker as well as factors such as language-model probability, disfluencies, sentence position, and phonetic context. This work is still in progress; we hope to be further along by the time of the workshop.

1. INTRODUCTION

Many ASR researchers have suggested the idea of a dynamic lexicon: a lexicon with a large number of pronunciation variants whose probability is set dynamically according to various factors ([1] inter alia). This paper is the preliminary description of our project to apply this idea to two domains: Switchboard (human-human native American English telephone conversations) and Hispanic English (conversations in English between native Spanish speakers with varying levels of accent). Both of these domains are known to have high error rates, and pronunciation variation is known to contribute to the difficulty of these tasks [2, 3, 4, 5].

The goal of this work-in-progress is to build a lexicon with a set of pronunciations for each word, in which the probability distribution over pronunciation is dynamically computed. The set of pronunciations are derived from hand-written rules (for foreign accent) or clustering (for phonetically-transcribed Switchboard data). The dynamic pronunciation-probability will take into account specific characteristics of the speaker as well as factors such as language-model probability, disfluencies, sentence position, and phonetic context. Section 2 describes a preliminary experiment suggesting that a dynamic lexicon is only useful if words have many pronunciations. Section 3 describes our preliminary work on automatically creating pronunciations. Section 4 reports on preliminary work on the foreign-accented data.

Thanks to the NSF for partial support of this research via award #IIS-9978025.

2. PILOT EXPERIMENT: DYNAMIC LEXICON WITH TWO PRONUNCIATIONS

Our first experiment was an oracle experiment designed to show whether having exactly two pronunciations for each of the 50 most frequent words in Switchboard, a very full pronunciation and a very reduced pronunciation, would improve recognition.

Our experiments were conducted using Sonic [6], a large vocabulary continuous speech recognition system with Viterbi decoding, continuous density hidden Markov models and trigram language models. Sonic's acoustic models are decision-tree state-clustered HMMs with associated gamma probability density functions to model state-durations. Our experiments used only the first-pass of the decoder, which consists of a time-synchronous, beam-pruned Viterbi token-passing search. Cross-word acoustic models and trigram language models are applied in this pass. This first experiment was run with an early version of Sonic, which had a WER of 42.9% on the 888-sentence Switchboard WS97 test set. (By comparison, WER on this test set in our current version of Sonic is 32.9%).

We used SRI's Hub-5 language model, generously made available by Andreas Stolcke. We built our 39,198-word lexicon from the Mississippi State ISIP Switchboard lexicon. Since this dictionary did not have every word in the LM, we used the CMU dictionary as a resource for any words that were in the LM but were not in the ISIP lexicon. We also included 1658 compound words (multiwords), of which 1393 were not in the ISIP or CMU lexicons. So for

these 1393 we included two pronunciations, full (by concatenating the pronunciations of the constituent words) and reduced (hand-written). The average number of pronunciations per word is 1.13.

We built 2 versions of this lexicon, which differed only in the pronunciations of the top 50 words. In the single-pron lexicon, we allowed only one pronunciation for the most frequent 50 words. In the two-pron lexicon, we included two pronunciations for each of these words, a canonical pronunciation and a very reduced pronunciation, with equal probabilities.

Finally, we created a test set from 4237 Switchboard utterances which had been phonetically labeled [?, 7]. This allowed us to know, for each test utterance, whether the correct pronunciation of each word was canonical or reduced. From this we built a third dynamic lexicon, a cheating or oracle lexicon, which for each test set sentence only used the pronunciation that was present in the test set. We then tested the three lexicons with and without retraining the acoustic models. Table 1 shows the results.

Models             Lexicon       WER
Baseline Model     single-pron   43.7
Baseline Model     oracle        41.8
Retrained Models   oracle        41.5
Retrained Models   two-pron      41.7

Table 1. Comparing lexicon performance on a 4237-utterance SWBD test set

Table 1 suggests that having two pronunciations rather than one for the 50 most-frequent words does in fact reduce WER (by 2%, from 43.7% to 41.8%). But an oracle telling us which pronunciation to use (41.5% WER) was not significantly better than just putting in both pronunciations (41.7% WER). This suggests that two pronunciations is an insufficient number for any kind of dynamic lexicon to be useful. In essence, with only two pronunciations, the recognizer was able to choose the correct pronunciation, even without a pronunciation probability.

As a result of this pilot, we determined that a dynamic lexicon would need to have large numbers of pronunciations, more than we thought was possible to correctly write by hand. In the next two sections, we discuss how we are building pronunciations by clustering and rule-writing.

3. SWITCHBOARD EXPERIMENT: BUILDING MORE PRONUNCIATIONS AND MAPS

3.1. Baselines

Before describing our clustering work, we describe our intended baseline for the SWBD experiments. This is a 5-step extract-align-count-prune-retrain algorithm generalized from [8]:

1. Extract observed alternate word pronunciations from the ICSI labeled data.
2. Align pronunciations with training data
3. Count number of times each pronunciation occurs
4. Prune pronunciations with low counts
5. Retrain acoustic models with alignments to new dictionary
6. (Evaluate WER on test set)

We will then build a slightly more advanced clustered version of the algorithm, in which pronunciations are clustered into broad classes (Vowel Front, Vowel Back, Vowel Reduced, Consonant Labial, Consonant Dorsal, Silence) before accumulating counts. Then we keep at least one example of each broad class with sufficient count, before the align, prune, re-train and evaluate steps. For example, the word "that" has 36 phone-level variant pronunciations; [dh ae] and [dh ae t] are the most frequent. It has 19 broad class variants, with [CC VF] and [CC VF CC] being the most frequent. We have already aligned and counted pronunciations, both for phones and broad classes, and are currently working on pruning and then retraining acoustic models.

3.2. Building broad-class maps

In addition to building pronunciations, we are creating a new kind of pronunciation feature based on canonical-to-surface mappings, relying on a database originally produced by Eric Fosler-Lussier that aligns canonical pronunciations with surface pronunciations from the ICSI phonetically labeled data. A mapping is a change or transduction from the canonical phone sequence to the surface phone sequence, containing a sequence of differing labels (of whatever length) anchored on each end by labels that are the same in both sequences. For the maps, in addition to the 7 broad classes, 3 word positions, b(eginning), m(iddle) and e(nd) were used. For example, in the following map pattern the sequence to the left of the arrow is the canonical sequence, the sequence to the right is the surface sequence, and vb:e represents a back vowel at the end of a word:

sil cc:b vb:e cc:b → sil null vf cc

This algorithm has 4 steps:

1. Accumulate counts for all canonical-to-surface mappings in the training data: with and without word boundary info, with phones and with broad classes
2. Prune low frequency maps
3. Cluster maps by co-occurrence into classes which will define speaker types
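The counting and pruning steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the canonical and surface sequences are already aligned to equal length with "null" marking a deletion, it ignores word positions, it leaves out the clustering step, and the function names and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def extract_maps(canonical, surface):
    """Return one (canonical_span, surface_span) pair per run of differing labels.

    Each span includes one matching label on either side as an anchor,
    where such a label exists.
    """
    assert len(canonical) == len(surface)
    maps = []
    i, n = 0, len(canonical)
    while i < n:
        if canonical[i] == surface[i]:
            i += 1
            continue
        start = i
        while i < n and canonical[i] != surface[i]:
            i += 1
        lo = max(start - 1, 0)   # left anchor, if any
        hi = min(i + 1, n)       # right anchor, if any
        maps.append((tuple(canonical[lo:hi]), tuple(surface[lo:hi])))
    return maps


def count_and_prune(aligned_pairs, min_count=2):
    """Count every map in the data and drop those seen fewer than min_count times."""
    counts = Counter()
    for canonical, surface in aligned_pairs:
        counts.update(extract_maps(canonical, surface))
    return {m: c for m, c in counts.items() if c >= min_count}


# Invented toy data: one map occurs twice, another only once.
pairs = [
    (["sil", "cc", "vb", "cc"], ["sil", "null", "vf", "cc"]),
    (["sil", "cc", "vb", "cc"], ["sil", "null", "vf", "cc"]),
    (["sil", "vf", "cc"], ["sil", "vr", "cc"]),
]
kept = count_and_prune(pairs, min_count=2)
```

With this toy data only the map that occurs twice survives pruning. A real threshold would depend on the size of the training set.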

After computing counts from the training data, low frequency patterns were pruned to give the final set of map patterns. For each session side, the frequency of each of the patterns in the set was computed, including the frequency of each canonical string mapping onto itself. The patterns are currently being clustered to produce a set of classes with correlated pattern probabilities. These will define a set of speaker classes on the basis of the observed frequency of patterns. It is generally the case that relatively few patterns account for much of the data. For example, 19 broad class patterns account for about 50% of the sequence differences in the training data. These derived speaker classes and their probability estimates will be used as features in the decision trees determining the probabilities for alternate pronunciations of words.

4. DYNAMIC LEXICONS FOR SPANISH ACCENTED ENGLISH

4.1. The Hispanic-English corpus and test sets

We are using the conversational Hispanic-English corpus developed at Johns Hopkins University [9]. This database contains about 20 hours of telephone conversations in English from 18 native Spanish speakers, 9 male and 9 female. All speakers were adults from South or Central America who had lived in the United States at least one year and had a basic ability to understand, speak and read English.

During the telephone conversations, the speakers completed four tasks: picture sequencing, story completion, and two conversational games. For the picture sequencing task, participants received half of a randomly shuffled set of cartoon drawings and were asked to reconstruct the original narrative with their partner. For the story completion, participants were given two identical copies of a set of drawings depicting unrelated scenes from a larger narrative context and were asked to answer three questions: "What is going on here?", "What happened before?" and "What is going to happen next?" The first conversational game, Scruples, involved reading a description of a hypothetical situation and trying to resolve the conflict or dilemma. For the second game, the speaker pairs were asked to agree on five professionals to take along on a mission to Mars from a list of ten professions.

These data were divided into development, training and test sets according to speaker proficiency and gender. The development and test sets both include about 30,000 words, from four speakers in the test set and two in the dev set, while the training set contains about 70,000 words from the remaining ten speakers, five male and five female (see Table 2). Speakers had been judged on proficiency scores based on a telephone-based, automated English proficiency test [10]. We also listened to each speaker and rated their accent as heavy, mid and light. We then combined the proficiency scores with our accent ratings to distribute speakers with heavy, mid and light accents evenly into the different data sets. A range of the degree of accentedness is thus represented in each data set.

Set        Gender             Minutes   Words
Training   5 male, 5 female   546       69,926
Dev        1 male, 1 female   176       29,474
Test       2 male, 2 female   282       30,104

Table 2. Hispanic-English training and test set statistics

4.2. Baseline recognizer performance

We used the Sonic speech recognizer with our SWBD lexicon and acoustic models to establish a baseline from a system trained on native American English on Hispanic-English speech. Our SWBD system, as described earlier, consists of a 39,000 word lexicon, the SRI Hub-5 language model, and SWBD acoustic models. On the development test set of 176 minutes of speech and 29,974 words, we achieved a baseline word error rate of 62%.

4.3. Pronunciation rules for Hispanic-English

We next created lexical variants on the basis of seven phonological rules (see list below). These rules represent common characteristics of Spanish accented English, and they were determined by comparing literature about Spanish accents [11] to the Hispanic-English database and selecting the most appropriate characteristics. The seven rules are:

1. epenthetic schwa added before words beginning in /s/, as in speak [ax s p iy k];
2. past tense morpheme -ed pronounced /ax d/ following voiced consonants, as in planned [p l ae n ax d];
3. reduced schwa vowels pronounced as they are spelled, the full vowel represented by the orthography, as in minimum [m iy n iy m uw m];
4. the mid-high vowels /ih/ and /uh/ become the high vowels /iy/ and /uw/;
5. /s/ and /z/ in word final position are deleted;
6. the fricative /sh/ becomes the affricate /ch/ in word initial position, and
7. the fricative /dh/ becomes the stop /d/.
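The purely phone-level rules in the list above (1, 4, 5, 6 and 7) can be sketched as optional rewrites over a pronunciation. This is a hedged illustration only: rules 2 and 3 need morphological and spelling information and are left out, treating every rule as independently optional is an assumption, and all function names are hypothetical.

```python
from itertools import product


def rule_epenthesis(p):   # rule 1: schwa before word-initial /s/
    return ["ax"] + p if p and p[0] == "s" else p


def rule_raise(p):        # rule 4: ih -> iy, uh -> uw
    raised = {"ih": "iy", "uh": "uw"}
    return [raised.get(x, x) for x in p]


def rule_final_sz(p):     # rule 5: delete word-final /s/ or /z/
    return p[:-1] if len(p) > 1 and p[-1] in ("s", "z") else p


def rule_sh(p):           # rule 6: word-initial sh -> ch
    return ["ch"] + p[1:] if p and p[0] == "sh" else p


def rule_dh(p):           # rule 7: dh -> d
    return ["d" if x == "dh" else x for x in p]


RULES = [rule_epenthesis, rule_raise, rule_final_sz, rule_sh, rule_dh]


def accented_variants(pron):
    """Apply every subset of the rules, in list order, and collect distinct results."""
    variants = set()
    for mask in product([False, True], repeat=len(RULES)):
        p = list(pron)
        for use, rule in zip(mask, RULES):
            if use:
                p = rule(p)
        variants.add(tuple(p))
    return variants


speak = accented_variants(["s", "p", "iy", "k"])
this = accented_variants(["dh", "ih", "s"])
```

Here `speak` contains the canonical form plus the epenthesized form from rule 1, while `this` is affected by rules 4, 5 and 7 and so produces more variants. Applying every subset of rules is what makes the lexicon grow quickly, which is why a later pruning step is needed.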

Table 3 gives formal versions of the rules. While we have not yet tested whether these rules help in improving recognition performance, we have analyzed some of the errors when the Switchboard recognizer is applied to the Hispanic English dev set, yielding some anecdotal observations that relate to the rule set. First, final consonants tend to be deleted, especially /s/, /z/, /v/ and /t/, causing substitutions of words with no final consonants, such as "know" for "not" and "how" for "have". Our phonological rules account only for the deletion of /s/ and /z/. Second, the /dh/ fricative is pronounced as both /d/ and /s/, not just as the /d/ we indicate in our rules. Another fricative that is problematic is /f/, which is pronounced and recognized as /p/. Third, the softening of /b/ to a bilabial fricative causes substitution of words that have no stop consonant where the /b/ occurs, as in "busy" substituted with "easy". Fourth, many of the reduced vowels are pronounced and recognized as full vowels, which we expected based on the third phonological rule. Finally, hesitations seem to be nasalized, with "nn" for "uh", which causes the recognizer to substitute a short word beginning with a nasal, such as "no" or "not", for these hesitations.

1. s → ax s / # __
2. d → ax d / voiced C __ #
3. ax → aa / orthographic a
   ax → eh / orthographic e
   ax → iy / orthographic i
   ax → ow / orthographic o
   ax → uw / orthographic u
   axr → er / orthographic er
4. ih → iy
   uh → uw
5. s → 0 / __ #
   z → 0 / __ #
6. sh → ch / # __
7. dh → d

Table 3. Phonological Rules for Hispanic English

4.4. Applying pronunciation count-prune-retraining

We next use the phonological rules discussed above to attempt to build a better baseline system for Hispanic English. We use the 3-step algorithm first proposed by [12]:

1. apply phonological rules to the base lexicon, generating a large number of pronunciations,
2. forced-align against the training set to get pronunciation counts
3. prune low-probability pronunciations

Our base lexicon was the Switchboard lexicon described above, consisting of 39204 word tokens with 1.13 pronunciations per word type. We applied the 7 phonological rules in Section 4.3 to produce accented pronunciations, which were then merged with the base lexicon, and redundant forms were removed. The resulting augmented lexicon consisted of 96954 word tokens with 2.8 pronunciations per word type. Next, this augmented dictionary was aligned with the reference corpus data, giving us counts of the number of times a particular pronunciation was chosen for a given word.

We are currently working on the pruning step. Once that is complete, we will proceed to retraining the acoustic models with the resulting dictionary. That will provide a static lexicon baseline which we can then use to see the performance of our dynamic lexicon approach on the Hispanic-English data.
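The count and prune steps of Section 4.4 can be sketched as below. The paper does not state its pruning criterion, so the relative-frequency threshold `min_prob`, the choice to always keep each word's most frequent pronunciation, and the function name are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def prune_lexicon(aligned, min_prob=0.15):
    """Build a pruned lexicon from (word, pronunciation) pairs chosen by forced alignment.

    Pronunciations whose relative frequency for a word falls below min_prob
    are dropped (the most frequent one is always kept), and the remaining
    probabilities are renormalized to sum to 1.
    """
    counts = defaultdict(Counter)
    for word, pron in aligned:
        counts[word][tuple(pron)] += 1

    lexicon = {}
    for word, c in counts.items():
        total = sum(c.values())
        best = c.most_common(1)[0][0]
        kept = {p: n for p, n in c.items() if n / total >= min_prob or p == best}
        norm = sum(kept.values())
        lexicon[word] = {p: n / norm for p, n in kept.items()}
    return lexicon


# Invented alignment counts for one word: 8, 2 and 1 occurrences.
aligned = (
    [("the", ["dh", "ax"])] * 8
    + [("the", ["d", "ax"])] * 2
    + [("the", ["dh", "iy"])] * 1
)
lexicon = prune_lexicon(aligned, min_prob=0.15)
```

In this toy case the pronunciation seen once out of eleven alignments is pruned and the other two are renormalized. The resulting static probabilities are the kind of baseline that a dynamic lexicon would then be compared against.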

5. CONCLUSION

Our main result so far is that hand-writing very-reduced pronunciations for 50 frequent function words reduces word error rate even after using a lexicon with 1600 reduced-pronunciation multi-words, usually based on these same function words. Our other results are still too preliminary to admit of much conclusion, but we hope to have more results by September.

6. REFERENCES

[1] Eric Fosler-Lussier, Dynamic Pronunciation Models for Automatic Speech Recognition, Ph.D. thesis, University of California, Berkeley, 1999, Reprinted as ICSI technical report TR-99-015.
[2] Don McAllaster, Larry Gillick, Francesco Scattone, and Mike Newman, Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch, in ICSLP-98, Sydney, 1998, vol. 5, pp. 1847-1850.
[3] Mitch Weintraub, Kelsey Taussig, Kate Hunicke-Smith, and Amy Snodgras, Effect of speaking style on LVCSR performance, in ICSLP-96, Philadelphia, PA, 1996, pp. 16-19.
[4] Murat Saraclar, Harriet Nock, and Sanjeev Khudanpur, Pronunciation modeling by sharing gaussian densities across phonetic models, Computer Speech and Language, vol. 14, no. 2, pp. 137-160, 2000.
[5] Dan Jurafsky, Wayne Ward, Zhang Jianping, Keith Herold, Yu Xiuyang, and Zhang Sen, What kind of pronunciation variation is hard for triphones to model?, in IEEE ICASSP-01, Salt Lake City, Utah, 2001, pp. I.577-580.
[6] Bryan Pellom, Sonic: The university of colorado continuous speech recognizer, Tech. Rep. TR-CSLR-2001-01, Center for Spoken Language Research, University of Colorado, Boulder, 2001, Revised April 2002.
[7] Steven Greenberg, Speaking in shorthand - a syllable-centric perspective for understanding pronunciation variation, Speech Communication, vol. 29, pp. 159-176, 1999.
[8] Michael D. Riley, William Byrne, Michael Finke, Sanjeev Khudanpur, Andrei Ljolje, John McDonough, Harriet Nock, Murat Saraclar, Chuck Wooters, and George Zavaliagkos, Stochastic pronunciation modeling from hand-labelled phonetic corpora, Speech Communication, vol. 29, pp. 209-224, 1999.
[9] W. Byrne, E. Knodt, S. Khudanpur, and J. Bernstein, Is automatic speech recognition ready for non-native speech? a data collection effort and initial experiments in modeling conversational hispanic english, in ESCA Workshop, 1998.
[10] Ordinate Corporation, The phonepass test, 1998.
[11] H. S. Magen, The perception of foreign-accented speech, Journal of Phonetics, vol. 26, pp. 381-400, 1998.
[12] Michael H. Cohen, Phonological Structures for Speech Recognition, Ph.D. thesis, University of California, Berkeley, 1989.

Data Findings
For this section, I recorded my subject's conversation with her friend. After recording, I studied her pronunciation and transcribed her sentences, identifying the errors she made and giving the correct transcription for each.

A: Have you watched the latest movie of Twilight: Breaking Dawn?
B: Nope, I haven't watched it yet. I've been busy with some work.
A: What are you busy with?
B: I have a lot of assignment to do first.
A: Cehh... Since when you became a studious nerd?
B: Since I have to submit it by next week?
A: Alright then, how's your assignment going?
B: Pretty good I guess? Oh, by the way, was the Twilight good?
A: It was awesome! Taylor's so hot I tell you!
B: Well I think Edward is more better than Taylor.
A: Whatever, if you insist. You can have your white face Edward.
B: Hello! He can shine under the sunlight!
A: Okay... So, do you want to know how the story ends?
B: No you idiot! You'll ruin the surprise!
A: Okay fine! I won't tell you. Anyway it's better to watch it yourself.
B: Okay lahh... I better finish my assignment first.
A: Okay, bye.
B: Bye.

My subject is a 19-year-old Kedahan Malay who is in semester one of a Diploma in Early Childhood Education.

Words | Transcription (subject's / correct) | Comment

Have you watched the latest movie Twilight Breaking Dawn?
Have you watched the latest movie Twilight Breaking Dawn | dan twlat d hv ju wtt letst muvi twalat brek dn | The subject confused the word "dawn" with "down". The subject pronounces the word as it is spelled.

What are you busy with?
What are you busy with | wf wt ju bzi w | The /θ/ sound is pronounced as /f/; there is no /θ/ sound in the Malay language.

Cehh... Since when you become a studious nerd?
Cehh Since when you | [Cehh] sns wen ju | Influenced by typical Malay conversation.
became a studious | bkem e stjuds |
nerd | nrd / nd | The subject stresses the /r/ sound since the word is spelled that way.

Alright then, how's your assignment going?
Alright | lrat |
then | den / en | The /ð/ sound is pronounced as a /d/ sound, and the /en/ sound is pronounced long when it is supposed to be short.
how's your assignment going | has haz j sanmnt | The /z/ sound is pronounced as a /s/ sound.

It was awesome! Taylor's so hot I tell you!
It was awesome | ws t wz sm | The subject pronounces the /z/ sound like a /s/ sound.
so | s / s | The subject pronounces the word as it is spelled, the // sound like a // sound.
hot I tell you | ht a tel ju |

Whatever, if you insist. You can have your white face Edward.
Whatever | wtev |
if you insist You can have your white face | f ju nsst ju kn hv j wat fes |

Okay... So, do you want to know how the story ends?
Okay | ke / ke | The / / sound is pronounced as a // sound.
So | s | The subject pronounces the word as it is spelled, the // sound like a // sound.
do you want to know how the story | du ju wnt tu n ha stri |
ends | endz | The /z/ sound is pronounced like a /s/ sound.

Okay fine! I won't tell you. Anyway it's better to watch it yourself.
Okay fine I won't | ke fan a wnt |
tell you Anyway it's better to watch it yourself | tel ju eniwe ts bet tu wt t jslf |

Okay, bye.
Okay bye | ke ba |

Analysis and Discussion of Data Findings


Pronunciation refers to the way words are uttered in spoken language. Correct pronunciation refers to uttering words with the right sounds of the target language rather than those of the mother tongue, and it is usually judged by how native speakers of the target language say the words. With good enunciation and pronunciation, a listener can understand what the other person is saying.

There are some common pronunciation problems in an ESL classroom, and one of them is mother tongue interference. When learning a second language (L2), a person sometimes applies the rules of their first language (L1) to the second language. This is common among Malaysians. Malaysia is home to three main races and cultures: Malay, Chinese, and Indian. Sometimes even people who come from an English-speaking background cannot pronounce English words correctly because they are used to Malaysian English (Manglish), in which the rules of Bahasa Melayu are applied to English. When speaking English among themselves, Malaysians tend to put "lah" at the end of an utterance, for example "okaylah", "no lah" and "see lah". Malaysians also tend to use repetition in their English, carrying over a rule of Malay, as in "don't play-play" (jangan main-main) and "together-gather" (bersama-sama).

Pronunciation problems also occur because a specific sound in English does not exist in the mother tongue. Students need help to hear the sound and to understand how it is produced, and they need a lot of practice to pronounce the words well. This becomes a lot easier if the students know the phonemic alphabet, because in English some words do not sound the way they are spelt. For example, "cough", "although", "through", "bough" and "rough" all end in "ough", yet the ending is pronounced differently in each word.

Another piece of evidence that the mother tongue is the biggest source of interference in pronunciation comes from Chinese speakers, whose language system does not have the "r" sound. When learning English, as well as other languages, the most common problem they face is that they cannot pronounce words that contain the "r" sound. For example, they tend to pronounce the Bahasa word "rokok" as "lokok", and the English word "rabbit" turns into "labbit". For Indian speakers, the "r" sound in their language system is pronounced very strongly, which is why they tend to stress it when they pronounce an English word that contains it. This pronunciation is not right, although we might still understand the word they are saying.

The next pronunciation problem in an ESL classroom is low self-esteem when speaking English. Since English is not their first language, second language learners who are not used to speaking it feel shy because they are afraid of making mistakes. This is due to improper training and a lack of exposure to English. Furthermore, tutors in Malaysia are themselves not native speakers; sometimes they mispronounce words and students follow them. Some second language students are also passive, which makes it hard for them to achieve perfection when speaking English.

Low self-esteem also arises when second language learners make mistakes while speaking. A common mistake is stressing individual words incorrectly. This can be fixed by listening carefully and using the pronunciation guide in a dictionary. Second language students also tend to stress the wrong words in a sentence. In English, stressing different words in a sentence can change its meaning, so if you stress the wrong word, the listener might get the wrong message. Pronunciation covers word stress, sentence stress and intonation, and by learning all of these, second language learners can pronounce correctly. There are eight common pronunciation features that second language learners must master in order to pronounce correctly: voicing, aspiration, mouth position, intonation, linking, vowel length, syllables and specific sounds. By mastering all of these, learners can boost their self-esteem in speaking English.

References
Gast, V., Contrastive analysis. Retrieved January 14, 2013 from http://www.personal.unijena.de/~mu65qev/papdf/CA.pdf

Mihalache, R., Contrastive Analysis and Error Analysis - Implications for the Teaching of English. Retrieved January 14, 2013 from http://www.academia.edu/422410/CONTRASTIVE_ANALYSIS_AND_ERROR_ANALYSIS-IMPLICATIONS_FOR_THE_TEACHING_OF_ENGLISH

Rustipa, K., Contrastive Analysis, Error Analysis, Interlanguage and the Implication to Language Teaching. Retrieved January 14, 2013 from http://www.polines.ac.id/ragam/index_files/jurnalragam/paper_3%20apr_2011.pdf

Lexicon Adaptation for LVCSR: Speaker Idiosyncracies, Non-Native Speakers, and Pronunciation Choice. Retrieved February 1, 2013 from http://www.stanford.edu/~jurafsky/pmla.pmod.pdf