This chapter reviews the literature related to the research topic. It covers the basic concepts of phonology, English pronunciation, the drilling technique, the purpose of teaching pronunciation, the factors influencing students’ pronunciation ability, Text to Speech (TTS) software in language learning, teaching pronunciation by using “Talk Any” software, the importance of educational software in the teaching and learning process, and the action hypothesis. Each topic is presented in turn.

2.1 The Nature of Pronunciation

2.1.1 The Definition of Pronunciation
Pronunciation is one of the important aspects of English, especially in oral communication. Every sound, stress pattern, and intonation may convey meaning. Non-native speakers of English have to be very careful in pronouncing utterances, or they may create misunderstanding. So, having an intelligible pronunciation is more necessary than having a native-like pronunciation. According to Lado (1964: 70), pronunciation is the use of a sound system in speaking and listening. Here, pronunciation is merely treated as the act that happens in speaking and listening. Pronunciation is the act or manner of pronouncing words; the utterance of speech. In other words, it is a way of speaking a word, especially a way that is accepted or generally understood. In this sense, pronunciation entails the production and reception of speech sounds and the achievement of meaning (Kristina, Diah, et al., 2006: 1). This second definition is briefer. It implies that the words, phrases, and sentences being pronounced should be intelligible.


2.1.2 The Definition of Phonology
According to Yule (2001:54), phonology is the study of the systems, patterns, and use of sounds that occur in the world's languages. In addition, Kusuma (1990:7) adds that phonology deals with phonemes and sequences of phonemes. A phoneme is a class of sounds. It is an abstract alphabetic unit that can be used for writing a language in a systematic and unambiguous way (Yusuf, 1998:19). For example, the phonemes /p/ and /t/ distinguish the words pie and tie. However, to acquire a full understanding of the use of speech sounds in English, we should study both phonology and phonetics.

2.1.3 The Definition of Phonetics
Based on Kusuma (1990:1), phonetics is the study of speech sounds: their production, transmission, and reception. Knowing phonetics, we understand the way in which the sounds of languages are formed. Further, mistakes that occur in someone's utterances may be detected and corrected. In sum, to complete an act of communication, our speech mechanism should function in such a way as to produce sounds. In turn, the sounds must be received and understood by the listener.

2.1.4 Speech Sounds Production
According to Giegerich (1995:1), the most usual source of energy for our vocal activity is an air stream expelled from the lungs. In short, the air stream is responsible for producing speech sounds. The production of speech sounds covers four processes: initiation, phonation, oro-nasal, and articulation.

1. Initiation Process

The initiation process, or airstream mechanism, is the process in which air is expelled from the lungs and passes through the trachea to the oral or nasal cavity. In English, speech sounds are the result of “a pulmonic egressive air stream” (Giegerich, 1992). Abercrombie, in Dr. Photini Coutsougera’s journal (2004), says that the airstream is provided by the action of some organs of speech and makes audible the movements of other organs. In the same journal, another expert, Catford (1994: 4), states that “the movements of organs during the organic phase act upon the air contained within the vocal tract. They compress the air, or dilate it, and they set it moving in various ways – in rapid puffs, in sudden bursts, in a smooth flow, in a rough, eddying, turbulent stream, and so on.” There are three airstream mechanisms used in the world's languages: pulmonic (involving the lungs), velaric (involving the velum and tongue), and glottalic (involving the glottis and larynx).

2. The Phonation Process
The phonation process occurs at the larynx. The larynx has two horizontal folds of tissue in the passage of air; they are the vocal folds. The gap between these folds is called the glottis. When the glottis is closed, no air can pass. Alternatively, it can have a narrow opening, which makes the vocal folds vibrate, producing “voiced sounds”. Examples of voiced sounds are [b], [g], and all vowels. Finally, the glottis can be wide open, as in normal breathing, so that the vibration of the vocal folds is reduced, producing “voiceless sounds”, for example plosives such as [p], [t], and [k].

3. Oro-Nasal Process
According to Giegerich (1995:3-5), after the air stream has passed the larynx and the back of the throat (pharynx), it can go into either the nasal or the oral cavity. This process is called the oro-nasal process. The velum is the part responsible for that selection. Through the oro-nasal process, nasal consonants (/m/, /n/, /ŋ/) can be differentiated from other sounds.

4. The Articulation Process
The final process is articulation. The articulation process takes place in the mouth, and it is the process through which speech sounds are distinguished from one another in terms of the place where and the manner in which they are articulated. In other words, one can distinguish the oral cavity, which acts as a resonator, and the articulators, which can be active or passive: the upper and lower lips, upper and lower teeth, tongue (tip, blade, front, back), and roof of the mouth (alveolar ridge, palate, and velum). Based on the explanation above, the research focuses on the articulation or pronunciation of speech sounds, because the articulation of speech sounds appears in the pronunciation of English words.

2.1.5 The Classification of Speech Sounds
Jones (1995:12) states that there are many kinds of speech sounds that can be articulated. The three kinds of speech sounds are vowels, diphthongs, and consonants. According to Kenworthy (1987:46), these three kinds of speech sounds cause perception problems for students. Vowels are sounds produced when the air stream can pass freely through and out of the mouth (Kusuma, 1990:14). These sounds are made with no hindrance to the flow of air as it passes from the larynx to the lips. Based on Kusuma (1990:28-38), English has twelve pure vowels, which are:

1. [i:] – tree / tri: /
2. [I] – milk / mIlk /
3. [e] – bed / bed /
4. [æ] – sat / sæt /
5. [ɜ:] – word / wɜ:d /
6. [ə] – along / ə’lɔŋ /
7. [ɑ:] – pass / pɑ:s /
8. [ʌ] – sun / sʌn /
9. [u:] – blue / blu: /
10. [ʊ] – put / pʊt /
11. [ɔ:] – four / fɔ:(r) /
12. [ɔ] – dog / dɔg /

According to the tongue position (Kusuma, 1990:18), the English vowels can be put in a chart as follows:

[Vowel chart: columns FRONT, CENTRAL, BACK; rows Close (HIGH), Half Close, Half Open, Open (LOW). Symbols recoverable from the chart: i:, i, ɔ:, ə, ɔ, ʌ, ɑ:]

(Picture 2.1 Adopted from Kusuma, 1990:18)

The second category of speech sounds is diphthongs. Kusuma (1990:15) asserts that a diphthong is a sound made by gliding from one vowel position into another. It is a union of two vowels, as in fine / faIn / and goes / gəʊz /. There are nine diphthongs (Kusuma, 1990:39-42):
1. [aI] – lie / laI /
2. [eI] – day / deI /
3. [ɔI] – toy / tɔI /
4. [Iə] – deer / dIə(r) /
5. [ɔə] – door / dɔə(r) /
6. [eə] – bear / beə(r) /
7. [ʊə] – poor / pʊə(r) /
8. [aʊ] – down / daʊn /
9. [əʊ] – ago / ə’gəʊ /

The third category of speech sounds is consonants. According to Kusuma (1990:14), consonants are speech sounds in which the air stream, after having passed the larynx, is either stopped for a moment and then released, or driven through such a narrow opening that we hear friction. Each of them is either voiced or voiceless (Haycraft, 1980:160-161). They are:
1. [p] – pet / pet /
2. [b] – bad / bæd /
3. [t] – tea / ti: /
4. [d] – day / deI /
5. [k] – key / ki: /
6. [g] – go / gəʊ /
7. [f] – fish / fIʃ /
8. [v] – vase / vɑ:z /
9. [s] – sea / si: /
10. [z] – zoo / zu: /
11. [ʃ] – shoe / ʃu: /
12. [ʒ] – leisure / ‘leʒə(r) /
13. [tʃ] – chin / tʃIn /
14. [dʒ] – jump / dʒʌmp /
15. [θ] – thin / θIn /
16. [ð] – this / ðIs /
17. [m] – man / mæn /
18. [n] – night / naIt /
19. [ŋ] – sing / sIŋ /
20. [h] – hot / hɔt /
21. [l] – like / laIk /
22. [r] – right / raIt /
23. [w] – wait / weIt /
24. [j] – you / ju: /

Based on the types of speech sounds described above, this research focuses on investigating English word pairs which differ only in the two sounds being focused on. The contrast may be between vowels, consonants, diphthongs, or a vowel and a diphthong. For example, ‘leave’ and ‘live’ form an English word pair that contrasts the vowel sound / i: / with the vowel sound / I /.

2.2 Teaching English Pronunciation

Murcia et al. in Hermansyah (2011) state that the goal of teaching pronunciation is for students to be easily intelligible. As speakers, they need to be able to pronounce words correctly. This is intended to avoid misunderstanding on the listener’s part and to create a comfortable situation in which the listener can catch the meaning of the speaker’s utterances. As listeners, they need to be able to listen comfortably, understanding what the speaker says without unnecessary repetitions that may bother both the speaker and the listener. In other words, having correct pronunciation is a key to effective communication. According to Hornby (1974), pronunciation is the way in which a language is spoken or the way in which a word is pronounced. In speaking English, students face some difficulties in the matter of pronunciation (Kusuma, 1990:1). They have to learn to memorize the foreign sounds with their organs of speech. According to Haycraft (1971:56-57), there are several ways to overcome the difficulties of pronunciation. First, students can work on their own difficulties by training their ears: to speak successfully, they must recognize the foreign sounds that they hear and must be able to imitate and produce good speech. They also have to be sensitive to errors made by themselves and others. The second way is pronunciation practice helped by the teacher and by appropriate media that support pronunciation ability. Nowadays, computer software can help students practice and imitate what they hear, followed by some drills, which are the main means of improving young students’ pronunciation. Here, the students need to watch and listen carefully to how the teacher or the computer software pronounces the sounds like native speakers so that they can produce the sounds correctly.
Kenworthy (1987:1-3) suggests that teaching and learning pronunciation needs support from both the teacher and the students. In line with this idea, the participation of the English teacher and the students was required in applying this technique in the teaching and learning process. The teacher provided ways for the students to produce sounds, gave related media for effective teaching, and gave feedback on the students’ performance. In this research, the goal of teaching pronunciation in the first grade of Junior High School was to enable students to pronounce English words, phrases, and sentences correctly. The teacher should make an effort to enable the students to communicate effectively. According to Samantray (2009), the approach relied resolutely on listening to good human models, and electronic media (records, tape recorders, language labs, audio-video, cassettes, and CDs) prevailed as the requirements. In this case, the teacher applied “Talk Any” as the media software to train the students to pronounce English words, phrases, and sentences so as to produce effective communication in which the listener can understand what the speaker says without unnecessary repetition. Based on the description above, pronouncing the English sounds within English words, phrases, and sentences is essential for good communication.


2.2.1 Pronouncing Words
Pronunciation activities involve careful listening and then articulation of words and word pairs (Mora, 1999:1). Many young learners are able to produce new sounds just by imitating what they hear, but if students seem unable to imitate, the teacher can give guidance that may help them attain the sounds. Useful guidance is guidance that can be followed, responded to, understood, carried out, and controlled (Kenworthy, 1987:69). Below are examples of instructions that are easy enough for students to notice and control (Kenworthy, 1987:69-70):
a) Lip position: whether the lips are pursed (as in whistling), spread (as in a smile), or wide apart (as when yawning).
b) Contact between the tongue and teeth: whether the sides of the tongue are touching the upper back teeth or whether the tip of the tongue is touching the top or bottom front teeth.
c) Contact between the tongue and the roof of the mouth: whether the tip of the tongue or the back of the tongue is touching a part of the roof of the mouth.
Apart from the kind of instructions given to the students, the teacher should start with words where the sound is at the beginning of the word, then move on to words where it occurs at the end, and then to the middle position (Kenworthy, 1987:70-71). It is usually easier for students to produce a new sound in initial position. This kind of exercise should also involve self-correction, showing individual learners where their problems lie and how to repair them. Besides that, along with recognition practice, such activities may be an essential part of any language teaching, as they make pronunciation an active element of the learning process and focus learners on the language they are producing (Dalton, 1997:2).


2.3 The Text To Speech Software (Talk Any) in Language Learning
Computers are widely used by most people in the world. In language teaching or in a classroom, teachers and students are helped by the computer as a medium in their teaching and learning process to deliver important information. Unwin and McAleese (1978:149) state that the capability of today’s minicomputer will become usual in the classrooms, laboratories, equipment, and library aids of schools and colleges. In the same view, Rivers (1987:177) says that anyone who has used a computer for whatever purpose realizes that the computer is an essentially interactive device. It means that the computer provides the information that its users ask for. Furthermore, the teaching and learning process, in which communication between the teacher and students occurs, needs certain media.

Goodwyn (1992:28) says that the media which can be used include television, film, video, radio, photography, popular music, printed material, books, comics, magazines, and also computer software. Davies (1996:8) adds that English teachers should try to vary their teaching to make students active in learning. It means that English teachers should use various teaching techniques, in this research text to speech or natural sounding software. Computer software can also be used as a teaching medium to optimize the teacher’s performance in presenting the material, and it is a key area for media education that helps to simulate specific media situations (Goodwyn, 1992:64). There are many kinds of software; software used specifically in teaching and learning activities is usually called educational software. According to Elliot et al. (2000:361), educational software is one popular way to increase students’ level of motivation and enthusiasm in learning. Computer assisted language learning is an approach to language teaching in which computer technology is used as an aid to mastering the material to be learned. One kind of software that can be used as a medium for delivering information is text to speech or natural sounding software. This kind of software transforms typed text into sound resembling a human voice. So, everyone who wants to master pronouncing English words can use this software as a means of learning basic pronunciation. With the growth of technology, there are many kinds of text to speech or natural sounding software available on the internet, both free and paid.

2.3.1 The Role of Text To Speech (Talk Any) Software as a Teaching Medium
According to Khalifa et al. (2000:2), the appropriate way to use the computer is as a tool for learning, and the software as a support for the educational programs.
As a medium for teaching pronunciation, this software has several roles in the classroom. First, media can reduce the time needed to deliver material (Wiharjo, 2007:5). As a result, this medium can save the teacher a lot of time in the teaching and learning process in the classroom. Second, software as a medium can make the classroom more interesting and motivating. Furthermore, Wiharjo (2007:5) states that software can improve not only students’ attention and concentration but also their motivation and interest. This software can imitate the sound of real foreign speakers in several character styles, and the speed of the voices can be adjusted. This variety can help the teacher drill the students’ pronunciation and also motivate them to build their interest in learning pronunciation. The roles described above indicate that such software is very useful for both the teacher and the students in the teaching and learning process in the classroom. By using this software, the students are not only motivated but can also be more active in learning. The “Talk Any” interface is displayed as follows:

(Picture 2.2 The screenshot of “Talk Any”)

2.3.2 The Procedure of the Use of “Talk Any” Software in Teaching
The simple way to use “Talk Any” software is to know the function of each menu and button. The software has the following menus:
1) The File menu is used to open a file in *.txt format (for example, a Notepad file) so that a long or saved text can be pronounced by the software without retyping it.
2) The Edit menu is used to copy or cut and paste text from an unsaved file on the computer.
3) The Help menu shows the software’s description.
The buttons have different functions to maximize the speech output:
1) The Personality button offers twenty different voices, including male, female, woman, child, and robot voices, and so on.
2) The Pitch button adjusts the level of the voice; it is adjusted automatically when a personality voice is chosen.
3) The Speed button adjusts the tempo of the voice; increasing the speed makes the voice faster, and decreasing it makes the voice slower.
4) The Pitch quality button selects a natural human voice, a flat voice, or a singing voice.
5) The Vocal effort button selects voice effects such as a normal human voice, a breathy voice, or a whispered voice.
6) The Talk It! and Stop Talking buttons start and stop the speech output.
The simple procedure for teaching pronunciation by using “Talk Any” can be arranged as follows:
a) Open saved files based on the exercises and/or type some words in the space provided.
b) Press the Talk It! button to start the machine speaking.
c) The students listen to what the software says.
d) Finally, the students are drilled many times by repeating the difficult words that they are not good at pronouncing. To get the best out of the voice, the speed can be decreased or increased as needed.


2.4 The “Talk Any” Software as a Medium in Pronunciation Class
Text to speech (TTS) is a system that converts normal language text into speech. There are many types of natural sounding or text to speech software that can be used to assist our work and for teaching and learning. TTS software includes PistonSoft, TextAloud, Natural Voice Text to Speech Reader, Talk Any, Pronunciation Power, and many others. In education, such software can also be used for listening or pronunciation drilling. In teaching pronunciation, it leads learners to imitate words that they usually find difficult to pronounce by listening to the speech machine repeatedly.

Based on Goodwyn (1992:105), one of the richest areas of development is the increasingly dynamic interface of media technology for the English teacher. One text to speech program with a friendly and fun interface that can be used in teaching pronunciation is “Talk Any”. Friendly and fun mean that it lets us listen to the words or texts we type on its clipboard, and with just one click it reads them back in natural-sounding speech.

2.4.1 The Advantages and Disadvantages of Using “Talk Any” Software
As we know, everything has its own strengths and weaknesses. “Talk Any” has some strengths that make it useful for teaching and learning:
1. It does not need a high computer specification, and the file size is small, only 380kb, so it does not need much space on the system.
2. We can open *.txt files from the File menu or just paste text to the clipboard so that the machine will read it for us.
3. It is free downloadable software; we can get it at Http://
4. An unlimited number of vocal qualities/personalities can be created by the user, who can also control the speed and quality of speech so that the listener can imitate what the machine says.
5. The last advantage is that it is user friendly; it is easy to use and learn, especially for pronunciation practice.
Besides those strengths, there are some weaknesses of this software that we should know:
1. There is no choice of English accent for speech sounds, such as American or British English. The default accent is American English.
2. There is no option for sound recording to differentiate the machine and user voices.
3. There is no phonetic transcription or translation of words.
Based on the strengths and weaknesses above, the software made it easy for the teacher to run the program. It assisted the teacher’s role in modelling pronunciation and gave clearer examples of voices with speed control. It was clear that this software can be used as a medium to assist the teacher in teaching pronunciation, and it could also be used to teach listening. So “Talk Any” software has the capability to present material attractively, and the students could improve their pronunciation ability in English class.

2.4.2 The Comparison with Other Related Software
There are many kinds of pronunciation software, and some are integrated with a dictionary. Examples are the Encarta, Longman, and Cambridge Learner’s dictionaries. These three programs have several features: pronunciation practice (record and listen to your own pronunciation), sound recordings (both British and American native English speakers), smart thesaurus (find the right word for any situation), quick find (look up words quickly while using other programs), study pages (explore different areas of English), and practice (what have you learned). The three dictionary programs focus on the meaning and description of words; they have a feature that can record our voice, but the recording cannot be saved as a file. In this research, the researcher prefers to use Talk Any rather than the three dictionary programs above, although it does not have a recording feature. The strength of Talk Any over the other software is that it can pronounce several words at a time, so it suits word-pair drilling.

2.6 The Effect of “Talk Any” as the Media on the Students’ Pronunciation
Pronunciation is one of the English components that should be taught seriously. To speak or communicate fluently, students should have a good mastery of pronunciation. To ease the teaching of pronunciation, software is implemented as the medium in the teaching. Sigafoos (2007:115) states that computer technology, which grows ever more sophisticated, also has very important roles in the teaching and learning process at various levels.
It means that the educational software can help


Based on the statements above, the use of computer technology and software, in this case "Talk Any", as the medium is expected to help the teacher and the students master pronunciation more easily, encourage students to imitate the real voice of a native speaker, and motivate them, so that, hopefully, the students enjoy and take more interest in practicing pronunciation for good communication in English.

2.7 Alternative Hypothesis
Based on the theory above, the hypothesis of this research can be formulated as follows: there was a significant effect of the use of "Talk Any" software as a medium in drilling the seventh grade students' pronunciation ability at SMPN 5 Tanggul Jember in the 2010/2011 academic year.

The important keys in pronunciation are the act, speaking, and the production and reception of sound. It means that the words being pronounced should be understandable (intelligible).